diff --git a/src/UserGuide/Master/Table/AI-capability/AINode_Upgrade_timecho.md b/src/UserGuide/Master/Table/AI-capability/AINode_Upgrade_timecho.md deleted file mode 100644 index a3167dc86..000000000 --- a/src/UserGuide/Master/Table/AI-capability/AINode_Upgrade_timecho.md +++ /dev/null @@ -1,900 +0,0 @@ - - -# AINode - -AINode is a native IoTDB node that supports the registration, management, and invocation of time series related models. It includes industry-leading self-developed time series large models, such as the Timer series models developed by Tsinghua University. Models can be invoked using standard SQL statements, enabling millisecond-level real-time inference on time series data, and supporting application scenarios such as time series trend prediction, missing value filling, and anomaly value detection. - -The system architecture is shown in the following figure: - -![](/img/AINode-0-en.png) - -The responsibilities of the three nodes are as follows: - -* **ConfigNode**: Responsible for distributed node management and load balancing. -* **DataNode**: Responsible for receiving and parsing user SQL requests; responsible for storing time series data; responsible for data preprocessing calculations. -* **AINode**: Responsible for managing and using time series models. - -## 1. Advantages and Features - -Compared to building a machine learning service separately, it has the following advantages: - -* **Simple and Easy to Use**: No need to use Python or Java programming, you can complete the entire process of machine learning model management and inference using SQL statements. For example, creating a model can be done using the CREATE MODEL statement, and using a model for inference can be done using the `SELECT * FROM FORECAST (...)` statement, making it more simple and convenient. 
-* **Avoid Data Migration**: IoTDB-native machine learning applies data stored in IoTDB directly to model inference, without moving it to a separate machine learning service platform, which accelerates data processing, improves security, and reduces costs. - -![](/img/h1.png) - -* **Built-in Advanced Algorithms**: Supports industry-leading machine learning analysis algorithms, covering typical time series analysis tasks, and empowering time series databases with native data analysis capabilities. For example: - * **Time Series Forecasting**: Learning change patterns from past time series data; outputting the most likely predictions for future sequences based on given past observations. - * **Time Series Anomaly Detection**: Detecting and identifying abnormal values in given time series data to help discover abnormal behavior in time series. - -## 2. Basic Concepts - -* **Model**: A machine learning model that takes time series data as input and outputs the results or decisions of the analysis task. The model is the basic management unit of AINode, supporting the creation (registration), deletion, query, modification (fine-tuning), and use (inference) of models. -* **Create**: Load an externally designed or trained model file or algorithm into AINode, where it is managed and used uniformly by IoTDB. -* **Inference**: Use the created model to complete the time series analysis task on the specified time series data. -* **Built-in**: AINode ships with machine learning algorithms and self-developed models for common time series analysis scenarios (e.g., prediction and anomaly detection). - -![](/img/AINode-en.png) - -## 3. Installation and Deployment - -For AINode deployment, refer to [AINode Deployment](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md). - -## 4. 
Usage Guide - -TimechoDB-AINode supports three major functions: model inference, model fine-tuning, and model management (registration, viewing, deletion, loading, unloading, etc.). The following sections will provide detailed explanations. - -### 4.1 Model Inference - -The AINode table model supports two major inference capabilities: time series prediction and time series classification. - -#### 4.1.1 Time Series Prediction - -The time series prediction capability provided by the AINode table model includes: - -* **Univariate Prediction**: Supports prediction of a single target variable. -* **Covariate Prediction**: Can simultaneously predict multiple target variables and supports introducing covariates in prediction to improve accuracy. - -The following sections will detail the syntax definition, parameter descriptions, and usage examples of the prediction inference function. - -1. **SQL Syntax** - -```SQL -SELECT * FROM FORECAST( - MODEL_ID, - TARGETS, -- SQL to get target variables - [HISTORY_COVS, -- String, SQL to get historical covariates - FUTURE_COVS, -- String, SQL to get future covariates - OUTPUT_START_TIME, - OUTPUT_LENGTH, - OUTPUT_INTERVAL, - TIMECOL, - PRESERVE_INPUT, - AUTO_ADAPT, -- Boolean type, indicating whether adaptive mode is enabled. - MODEL_OPTIONS]? -) -``` - -* Built-in model inference does not require a registration process. By using the forecast function and specifying model_id, you can use the inference function of the model. 
-* Parameter description - -| Parameter Name | Parameter Type | Parameter Attributes | Description | Required | Notes | -|----------------|----------------|----------------------|-------------|----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| model_id | Scalar parameter | String type | Unique identifier of the prediction model | Yes | | -| targets | Table parameter | SET SEMANTIC | Input data for the target variables to be predicted. IoTDB will automatically sort the data in ascending order of time before passing it to AINode. | Yes | Use SQL to describe the input data with target variables. If the input SQL is invalid, corresponding query errors will be reported. | -| history_covs | Scalar parameter | String type (valid table model query SQL), default: none | Specifies historical data of covariates for this prediction task, which are used to assist in predicting target variables. AINode will not output prediction results for historical covariates. Before passing data to the model, AINode will automatically sort the data in ascending order of time. | No | 1. Query results can only contain FIELD columns;
2. Other: Different models may have specific requirements, and errors will be thrown if not met. | -| future_covs | Scalar parameter | String type (valid table model query SQL), default: none | Specifies future data of some covariates for this prediction task, which are used to assist in predicting target variables. Before passing data to the model, AINode will automatically sort the data in ascending order of time. | No | 1. Can only be specified when history_covs is set;
2. The covariate names involved must be a subset of history_covs;
3. Query results can only contain FIELD columns;
4. Other: Different models may have specific requirements, and errors will be thrown if not met. | -| auto_adapt | Scalar parameter | Boolean type, default value: true | Whether to enable adaptive processing for covariate inference.(Support from V2.0.8.2) | No | When adaptive mode is enabled:
1. If the set of future covariates (`future_covs`) is not a subset of the historical covariates (`history_covs`), any future covariates not present in the historical set will be automatically discarded.
2. If the length of any historical covariate does not match the length of the input target variable: a. If shorter, pad zeros at the beginning; b. If longer, discard the earliest data points.
 3. If the length of any future covariate does not match the prediction length (`output_length`): a. If shorter, pad zeros at the end; b. If longer, discard the most recent data points. | -| output_start_time | Scalar parameter | Timestamp type. Default value: last timestamp of target variable + output_interval | Starting timestamp of output prediction points [i.e., forecast start time] | No | Must be greater than the maximum timestamp of the target variable | -| output_length | Scalar parameter | INT32 type. Default value: 96 | Output window size | No | Must be greater than 0 | -| output_interval | Scalar parameter | Time interval type. Default value: (last timestamp - first timestamp of input data) / (n - 1), where n is the number of input rows | Time interval between output prediction points. Supported units: ns, us, ms, s, m, h, d, w | No | Must be greater than 0 | -| timecol | Scalar parameter | String type. Default value: time | Name of time column | No | Must be a TIMESTAMP column existing in targets | -| preserve_input | Scalar parameter | Boolean type. Default value: false | Whether to retain all original rows of target variable input in the output result set | No | | -| model_options | Scalar parameter | String type. Default value: empty string | Key-value pairs related to the model, such as whether to normalize the input. Different key-value pairs are separated by ';'. | No | | - -Notes: -* **Default behavior**: Predict all columns of targets. Currently, only INT32, INT64, FLOAT, and DOUBLE types are supported. -* **Input data requirements**: - * Must contain a time column. - * Row count requirements: If insufficient, an error will be reported; if exceeding the maximum, the last data will be automatically truncated. - * Column count requirements: Univariate models support only a single column and report an error for multiple columns; covariate models usually have no restrictions unless the model itself has clear constraints. 
-* **Output results**: - * Includes all target variable columns, with data types consistent with the original table. - * If `preserve_input=true` is specified, an additional `is_input` column will be added to identify original data rows. -* **Timestamp generation**: - * Uses `OUTPUT_START_TIME` (optional) as the starting time point for prediction, which separates historical data from future data. - * Uses `OUTPUT_INTERVAL` (optional, default is the sampling interval of input data) as the output time interval. The timestamp of the Nth row is calculated as: `OUTPUT_START_TIME + (N - 1) * OUTPUT_INTERVAL`. - -2. **Usage Examples** - -**Example 1: Univariate Prediction** - -Create database etth and table eg in advance - -```SQL -create database etth; -create table eg (hufl FLOAT FIELD, hull FLOAT FIELD, mufl FLOAT FIELD, mull FLOAT FIELD, lufl FLOAT FIELD, lull FLOAT FIELD, ot FLOAT FIELD); -``` - -Prepare original data [ETTh1-tab](/img/ETTh1-tab.csv). - -You can import the raw data using the [import-data](../Tools-System/Data-Import-Tool_timecho.md#_2-2-csv-format) script. For example: - -```bash -./tools/import-data.sh -ft csv -sql_dialect table -db etth -table eg -s ~/Desktop/model-compare-html/ETTh1-tab.csv -``` - -Use 1440 rows of data from column ot in table eg (starting at 2016-08-07T18:00:00.000+08:00) to predict its next 96 rows of data. - -```SQL -IoTDB:etth> select Time, HUFL,HULL,MUFL,MULL,LUFL,LULL,OT from eg LIMIT 1440 -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -| Time| HUFL| HULL| MUFL| MULL| LUFL| LULL| OT| -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -|2016-07-01T00:00:00.000+08:00| 5.827|2.009|1.599|0.462|4.203| 1.34|30.531| -|2016-07-01T01:00:00.000+08:00| 5.693|2.076|1.492|0.426|4.142|1.371|27.787| -|2016-07-01T02:00:00.000+08:00| 5.157|1.741|1.279|0.355|3.777|1.218|27.787| -|2016-07-01T03:00:00.000+08:00| 5.09|1.942|1.279|0.391|3.807|1.279|25.044| -...... 
-Total line number = 1440 -It costs 0.119s - -IoTDB:etth> select * from forecast( - model_id => 'sundial', - targets => (select Time, ot from etth.eg where time >= 2016-08-07T18:00:00.000+08:00 limit 1440) order BY time, - output_length => 96 -) -+-----------------------------+---------+ -| time| ot| -+-----------------------------+---------+ -|2016-10-06T18:00:00.000+08:00|20.733124| -|2016-10-06T19:00:00.000+08:00|20.258146| -|2016-10-06T20:00:00.000+08:00|20.022043| -|2016-10-06T21:00:00.000+08:00|19.789446| -...... -Total line number = 96 -It costs 1.615s -``` - -**Example 2: Covariate Prediction** - -Create table tab_real (to store original real data) in advance - -```SQL -create table tab_real (target1 DOUBLE FIELD, target2 DOUBLE FIELD, cov1 DOUBLE FIELD, cov2 DOUBLE FIELD, cov3 DOUBLE FIELD); -``` - -Prepare original data - -```SQL --- Insert statement -IoTDB:etth> INSERT INTO tab_real (time, target1, target2, cov1, cov2, cov3) VALUES -(1, 1.0, 1.0, 1.0, 1.0, 1.0), -(2, 2.0, 2.0, 2.0, 2.0, 2.0), -(3, 3.0, 3.0, 3.0, 3.0, 3.0), -(4, 4.0, 4.0, 4.0, 4.0, 4.0), -(5, 5.0, 5.0, 5.0, 5.0, 5.0), -(6, 6.0, 6.0, 6.0, 6.0, 6.0), -(7, NULL, NULL, NULL, NULL, 7.0), -(8, NULL, NULL, NULL, NULL, 8.0); - -IoTDB:etth> SELECT * FROM tab_real -+-----------------------------+-------+-------+----+----+----+ -| time|target1|target2|cov1|cov2|cov3| -+-----------------------------+-------+-------+----+----+----+ -|1970-01-01T08:00:00.001+08:00| 1.0| 1.0| 1.0| 1.0| 1.0| -|1970-01-01T08:00:00.002+08:00| 2.0| 2.0| 2.0| 2.0| 2.0| -|1970-01-01T08:00:00.003+08:00| 3.0| 3.0| 3.0| 3.0| 3.0| -|1970-01-01T08:00:00.004+08:00| 4.0| 4.0| 4.0| 4.0| 4.0| -|1970-01-01T08:00:00.005+08:00| 5.0| 5.0| 5.0| 5.0| 5.0| -|1970-01-01T08:00:00.006+08:00| 6.0| 6.0| 6.0| 6.0| 6.0| -|1970-01-01T08:00:00.007+08:00| null| null|null|null| 7.0| -|1970-01-01T08:00:00.008+08:00| null| null|null|null| 8.0| -+-----------------------------+-------+-------+----+----+----+ -``` - -* Prediction task 1: Use historical 
covariates cov1, cov2, and cov3 to assist in predicting target variables target1 and target2. - - ![](/img/ainode-upgrade-table-forecast-timecho-1-en.png) - - * Use the first 6 rows of historical data from cov1, cov2, cov3, target1, target2 in table tab_real to predict the next 2 rows of target variables target1 and target2. - ```SQL - IoTDB:etth> SELECT * FROM FORECAST ( - MODEL_ID => 'chronos2', - TARGETS => ( - SELECT TIME, target1, target2 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6) ORDER BY TIME, - HISTORY_COVS => ' - SELECT TIME, cov1, cov2, cov3 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6', - OUTPUT_LENGTH => 2 - ) - +-----------------------------+-----------------+-----------------+ - | time| target1| target2| - +-----------------------------+-----------------+-----------------+ - |1970-01-01T08:00:00.007+08:00|7.338330268859863|7.338330268859863| - |1970-01-01T08:00:00.008+08:00| 8.02529525756836| 8.02529525756836| - +-----------------------------+-----------------+-----------------+ - Total line number = 2 - It costs 0.315s - ``` -* Prediction task 2: Use historical covariates cov1, cov2 and known covariates cov3 in the same table to assist in predicting target variables target1 and target2. - - ![](/img/ainode-upgrade-table-forecast-timecho-2-en.png) - - * Use the first 6 rows of historical data from cov1, cov2, cov3, target1, target2 in table tab_real, and known covariate cov3 in the future 2 rows of the same table to predict the next 2 rows of target variables target1 and target2. 
- ```SQL - IoTDB:etth> SELECT * FROM FORECAST ( - MODEL_ID => 'chronos2', - TARGETS => ( - SELECT TIME, target1, target2 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6) ORDER BY TIME, - HISTORY_COVS => ' - SELECT TIME, cov1, cov2, cov3 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6', - FUTURE_COVS => ' - SELECT TIME, cov3 - FROM etth.tab_real - WHERE TIME >= 7 - LIMIT 2', - OUTPUT_LENGTH => 2 - ) - +-----------------------------+-----------------+-----------------+ - | time| target1| target2| - +-----------------------------+-----------------+-----------------+ - |1970-01-01T08:00:00.007+08:00|7.244050025939941|7.244050025939941| - |1970-01-01T08:00:00.008+08:00|7.907227516174316|7.907227516174316| - +-----------------------------+-----------------+-----------------+ - Total line number = 2 - It costs 0.291s - ``` -* Prediction task 3: Use historical covariates cov1, cov2 from different tables and known covariates cov3 to assist in predicting target variables target1 and target2. - - ![](/img/ainode-upgrade-table-forecast-timecho-3-en.png) - - * Create table tab_cov_forecast (to store known covariate cov3 prediction values) in advance, and prepare related data. - ```SQL - create table tab_cov_forecast (cov3 DOUBLE FIELD); - - -- Insert statement - INSERT INTO tab_cov_forecast (time, cov3) VALUES (7, 7.0),(8, 8.0); - - IoTDB:etth> SELECT * FROM tab_cov_forecast - +----+----+ - |time|cov3| - +----+----+ - | 7| 7.0| - | 8| 8.0| - +----+----+ - ``` - * Use the first 6 rows of known data from cov1, cov2, cov3, target1, target2 in table tab_real, and known covariate cov3 in the future 2 rows from table tab_cov_forecast to predict the next 2 rows of target variables target1 and target2. 
- ```SQL - IoTDB:etth> SELECT * FROM FORECAST ( - MODEL_ID => 'chronos2', - TARGETS => ( - SELECT TIME, target1, target2 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6) ORDER BY TIME, - HISTORY_COVS => ' - SELECT TIME, cov1, cov2, cov3 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6', - FUTURE_COVS => ' - SELECT TIME, cov3 - FROM etth.tab_cov_forecast - WHERE TIME >= 7 - LIMIT 2', - OUTPUT_LENGTH => 2 - ) - +-----------------------------+-----------------+-----------------+ - | time| target1| target2| - +-----------------------------+-----------------+-----------------+ - |1970-01-01T08:00:00.007+08:00|7.244050025939941|7.244050025939941| - |1970-01-01T08:00:00.008+08:00|7.907227516174316|7.907227516174316| - +-----------------------------+-----------------+-----------------+ - Total line number = 2 - It costs 0.351s - ``` - - -#### 4.1.2 Time Series Classification - -Time series classification is a critical capability beyond time series prediction, with extensive applications in industrial scenarios. Its typical paradigm is to input the recent sampling values of multiple measuring points, comprehensively judge the overall operating status of the equipment, and output a classification label for the current status. For example, it can be used for operating status classification of new energy battery pack equipment and other scenarios. - -The AINode table model supports executing time series classification tasks by calling covariate classification models. - -> Note: This feature is available starting from version V2.0.9.1. - -1. **SQL Syntax** -```sql -SELECT * FROM CLASSIFY( - MODEL_ID, - INPUTS -- SQL to retrieve input variables - [TIMECOL, - MODEL_OPTIONS]? 
-) -``` - -* Parameter Description - -| Parameter Name | Parameter Type | Parameter Attribute | Description | Required | Remarks | -|----------------|----------------|---------------------|-------------|----------|---------| -| model_id | Scalar Parameter | String | Unique identifier of the model used for classification | Yes | - | -| inputs | Table Parameter | SET SEMANTIC | Input data to be classified. IoTDB will automatically sort the data in ascending chronological order before passing it to AINode. | Yes | Describes the input data to be classified via SQL; corresponding query errors will be thrown if the input SQL is invalid. | -| timecol | Scalar Parameter | String, Default: `time` | Name of the time column | No | Must be a column of TIMESTAMP type present in the `inputs` result set; otherwise, an error will be thrown. | -| model_options | Scalar Parameter | String, Default: Empty string | Model-related key-value pairs (e.g., whether input normalization is required). Different key-value pairs are separated by `;`. | No | Unsupported parameters for a specific model will be ignored without throwing errors. | - -**Specifications** - -* **Input Data Requirements** - - Type Constraint: Only INT32, INT64, FLOAT, and DOUBLE data types are supported currently. - - Row Count Constraint: Varies by model. Errors will be thrown if the row count is below the minimum or above the maximum required by the model. - - Column Count Constraint: Must include a time column. Univariate classification models support only one data column and will throw an error for multiple columns; multivariate classification models generally have no restrictions unless explicitly specified by the model itself. - - Order Constraint: Multivariate zero-shot classification models generally have no order restrictions unless explicitly specified by the model itself. 
- -* **Output Result** - The returned result is a table composed of time series data classification results, and its schema depends on the specific implementation of the model. - -2. **Usage Example** - -Suppose a project has 10 time series variables with an input length of 192. The custom `mantis_custom` model is used as an example for time series classification inference. - -* Model Registration -```sql -CREATE MODEL mantis_custom USING URI 'file:///path/to/mantis' -``` -For detailed steps to register a custom model, refer to Section 4.3. - -* Execute SQL -```sql -IoTDB:etth> SELECT * FROM CLASSIFY ( - MODEL_ID => 'mantis_custom', - INPUTS => ( - SELECT Time, HUFL,HULL,MUFL,MULL,LUFL,LULL,OT,UT,MT,LT - FROM eg - WHERE TIME < 2016-07-09 00:00:00 - ORDER BY TIME DESC - LIMIT 192) ORDER BY TIME -) -``` - -* Execution Result -```sql -+--------+ -|category| -+--------+ -| 4| -+--------+ -``` - - -### 4.2 Model Fine-Tuning - -AINode supports model fine-tuning through SQL. - -**SQL Syntax** - -```SQL -createModelStatement - | CREATE MODEL modelId=identifier (WITH HYPERPARAMETERS '(' hparamPair (',' hparamPair)* ')')? FROM MODEL existingModelId=identifier ON DATASET '(' targetData=string ')' - ; -hparamPair - : hparamKey=identifier '=' hyparamValue=primaryExpression - ; -``` - -**Parameter Description** - -| Name | Description | -|------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| modelId | Unique identifier of the fine-tuned model | -| hparamPair | Hyperparameter key-value pairs used for fine-tuning, currently supports the following:
`train_epochs`: int type, number of fine-tuning epochs
`iter_per_epoch`: int type, number of iterations per epoch
`learning_rate`: double type, learning rate | -| existingModelId | Base model used for fine-tuning | -| targetData | SQL to get the dataset used for fine-tuning | - -**Example** - -1. Select data from the ot field in the specified time range as the fine-tuning dataset, and create the model sundialv3 based on sundial. - -```SQL -IoTDB> set sql_dialect=table -Msg: The statement is executed successfully. -IoTDB> CREATE MODEL sundialv3 FROM MODEL sundial ON DATASET ('SELECT time, ot from etth.eg where 1467302400000 <= time and time < 1517468400001') -Msg: The statement is executed successfully. -IoTDB> show models -+---------------------+---------+-----------+---------+ -| ModelId|ModelType| Category| State| -+---------------------+---------+-----------+---------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -| sundialv2| sundial| fine_tuned| active| -| sundialv3| sundial| fine_tuned| training| -+---------------------+---------+-----------+---------+ -``` - -2. Fine-tuning tasks are started asynchronously in the background, which can be seen in the AINode process log; after fine-tuning is completed, query and use the new model. 
- -```SQL -IoTDB> show models -+---------------------+---------+-----------+---------+ -| ModelId|ModelType| Category| State| -+---------------------+---------+-----------+---------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -| sundialv2| sundial| fine_tuned| active| -| sundialv3| sundial| fine_tuned| active| -+---------------------+---------+-----------+---------+ -``` - -### 4.3 Register Custom Models - -**Transformers models that meet the following requirements can be registered to AINode:** - -1. AINode currently uses transformers version 4.56.2, so when building the model, avoid inheriting interfaces from lower versions (<4.50); -2. The model must inherit a pipeline for inference tasks of AINode (currently supports prediction pipeline): - * iotdb-core/ainode/iotdb/ainode/core/inference/pipeline/basic_pipeline.py - - **Before V2.0.9.3** - ```Python - class BasicPipeline(ABC): - def __init__(self, model_info, **model_kwargs): - self.model_info = model_info - self.device = model_kwargs.get("device", "cpu") - self.model = load_model(model_info, device_map=self.device, **model_kwargs) - - @abstractmethod - def preprocess(self, inputs, **infer_kwargs): - """ - Preprocess the input data before the inference task starts, including shape validation and numerical conversion. - """ - pass - - @abstractmethod - def postprocess(self, output, **infer_kwargs): - """ - Postprocess the output results after the inference task is completed. 
- """ - pass - - - class ForecastPipeline(BasicPipeline): - def __init__(self, model_info, **model_kwargs): - super().__init__(model_info, **model_kwargs) - - def preprocess(self, inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], **infer_kwargs): - """ - Preprocess the input data before passing it to the model for inference, validating the shape and type of the input data. - - Args: - inputs (list[dict]): - Input data, a list of dictionaries, each dictionary contains: - - 'targets': Tensor with shape (input_length,) or (target_count, input_length). - - 'past_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - 'future_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - infer_kwargs (dict, optional): Additional keyword arguments for inference, such as: - - `output_length`(int): Used to validate the validity of 'future_covariates' if provided. - - Raises: - ValueError: If the input format is invalid (e.g., missing keys, invalid tensor shapes). - - Returns: - Preprocessed and validated input data that can be directly used for model inference. - """ - pass - - def forecast(self, inputs, **infer_kwargs): - """ - Perform forecasting on the given inputs. - - Parameters: - inputs: Input data for forecasting. The type and structure depend on the specific model implementation. - **infer_kwargs: Additional inference parameters, e.g.: - - `output_length`(int): The number of time points the model should generate. - - Returns: - Forecast output, the specific form depends on the specific model implementation. - """ - pass - - def postprocess(self, outputs: list[torch.Tensor], **infer_kwargs) -> list[torch.Tensor]: - """ - Postprocess the model outputs after inference, validating the shape of the output data and ensuring it meets the expected dimensions. - - Args: - outputs: - Model outputs, a list of 2D tensors, each tensor with shape `[target_count, output_length]`. 
- - Raises: - InferenceModelInternalException: If the output tensor shape is invalid (e.g., incorrect dimensions). - ValueError: If the output format is incorrect. - - Returns: - list[torch.Tensor]: - Postprocessed outputs, which will be a list of 2D tensors. - """ - pass - ``` - - **From V2.0.9.3 onwards** - ```Python - class BasicPipeline(ABC): - def __init__(self, model_info, **model_kwargs): - self.model_info = model_info - self.device = model_kwargs.get("device", "cpu") - self.model = load_model(model_info, device_map=self.device, **model_kwargs) - - @abstractmethod - def preprocess(self, inputs, **infer_kwargs): - """ - Preprocess the input data before the inference task starts, including shape validation and numerical conversion. - """ - pass - - @abstractmethod - def postprocess(self, output, **infer_kwargs): - """ - Postprocess the output results after the inference task is completed. - """ - pass - - - class ForecastPipeline(BasicPipeline): - def __init__(self, model_info, **model_kwargs): - super().__init__(model_info, **model_kwargs) - - def _preprocess( - self, - inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], - **infer_kwargs, - ): - """ - Preprocess the input data before passing it to the model for inference, validating the shape and type of the input data. - - Args: - inputs (list[dict[str, dict[str, torch.Tensor] | torch.Tensor]]): - Input data, a list of dictionaries, each dictionary contains: - - 'targets': Tensor with shape (input_length,) or (target_count, input_length). - - 'past_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - 'future_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - infer_kwargs (dict, optional): Additional keyword arguments for inference, such as: - - `output_length`(int): Used to validate the validity of 'future_covariates' if provided. 
- - Raises: - ValueError: If the input format is invalid (e.g., missing keys, invalid tensor shapes). - - Returns: - Preprocessed and validated input data that can be directly used for model inference. - """ - pass - - def forecast(self, inputs, **infer_kwargs): - """ - Perform forecasting on the given inputs. - - Parameters: - inputs: Input data for forecasting. The type and structure depend on the specific model implementation. - **infer_kwargs: Additional inference parameters, e.g.: - - `output_length`(int): The number of time points the model should generate. - - Returns: - Forecast output, the specific form depends on the specific model implementation. - """ - pass - - def _postprocess(self, outputs, **infer_kwargs) -> list[torch.Tensor]: - """ - Postprocess the model outputs after inference, validating the shape of the output data and ensuring it meets the expected dimensions. - - Args: - outputs: - Model outputs, a list of 2D tensors, each tensor with shape `[target_count, output_length]`. - - Raises: - InferenceModelInternalException: If the output tensor shape is invalid (e.g., incorrect dimensions). - ValueError: If the output format is incorrect. - - Returns: - list[torch.Tensor]: - Postprocessed outputs, which will be a list of 2D tensors. - """ - pass - ``` - -3. 
Modify the model configuration file `config.json` to ensure it contains the following fields: - - **Before V2.0.9.3** - ```JSON - { - "auto_map": { - "AutoConfig": "config.Chronos2CoreConfig", // Specify the model Config class - "AutoModelForCausalLM": "model.Chronos2Model" // Specify the model class - }, - "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // Specify the inference pipeline for the model - "model_type": "custom_t5", // Specify the model type - } - ``` - * The model Config class and model class **must** be specified via `auto_map`; - * The inference pipeline class **must** be inherited and specified; - * For built-in and user-defined models managed by AINode, `model_type` also serves as a unique non-duplicable identifier. That is, the model type to be registered must not duplicate any existing model types; models created via fine-tuning will inherit the model type of the original model. - - **From V2.0.9.3 onwards** - > The `model_type` parameter is **not required** - ```JSON - { - "auto_map": { - "AutoConfig": "config.Chronos2CoreConfig", // Specify the model Config class - "AutoModelForCausalLM": "model.Chronos2Model" // Specify the model class - }, - "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // Specify the inference pipeline for the model - } - ``` - * The model Config class and model class **must** be specified via `auto_map`; - * The inference pipeline class **must** be inherited and specified; - -4. Ensure the model directory to be registered contains the following files, and the model configuration file name and weight file name are not customizable: - * Model configuration file: config.json; - * Model weight file: model.safetensors; - * Model code: other .py files. 
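
The directory requirements above can be checked locally before running `CREATE MODEL`. The following is a minimal sketch, not part of AINode: the helper name `check_model_dir` and its messages are hypothetical, and it only verifies the file names and the two `config.json` fields (`auto_map`, `pipeline_cls`) listed in this section.

```python
import json
from pathlib import Path

# File names fixed by the registration requirements in this section.
REQUIRED_FILES = ("config.json", "model.safetensors")
# Fields the section says config.json must contain.
REQUIRED_KEYS = ("auto_map", "pipeline_cls")


def check_model_dir(model_dir):
    """Return a list of problems found in model_dir (empty if none).

    Hypothetical pre-flight helper; AINode performs its own validation.
    """
    root = Path(model_dir)
    problems = [
        f"missing file: {name}"
        for name in REQUIRED_FILES
        if not (root / name).is_file()
    ]
    # Model code is expected as additional .py files next to the weights.
    if not any(root.glob("*.py")):
        problems.append("no model code (*.py) found")
    config_path = root / "config.json"
    if config_path.is_file():
        try:
            config = json.loads(config_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"config.json is not valid JSON: {exc.msg}")
        else:
            problems += [
                f"config.json lacks field: {key}"
                for key in REQUIRED_KEYS
                if key not in config
            ]
    return problems
```

Calling `check_model_dir` on the directory that the registration URI points to returns an empty list when the layout matches the requirements. Note that this sketch parses `config.json` with a strict JSON parser, so the `//` comments and trailing commas used for illustration in the snippets above would be reported as invalid JSON.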
-
-**The SQL syntax for registering custom models is as follows:**
-
-```SQL
-CREATE MODEL <model_id> USING URI <uri>
-```
-
-**Parameter Description:**
-
-* **model_id**: Unique identifier for the custom model; cannot be duplicated, with the following constraints:
-  * Allowed characters: [0-9 a-z A-Z \_ ] (letters, digits, and underscores; digits and underscores cannot be the first character)
-  * Length limit: 2-64 characters
-  * Case-sensitive
-* **uri**: Local URI address containing the model code and weights.
-
-**Registration Example:**
-
-Upload a custom Transformers model from a local path; AINode copies the folder to the user_defined directory.
-
-```SQL
-CREATE MODEL chronos2 USING URI 'file:///path/to/chronos2'
-```
-
-After the SQL is executed, registration proceeds asynchronously. The registration status can be checked in the model list (see the "Viewing Models" section). Once registration succeeds, the model can be invoked for inference through normal queries.
-
-### 4.4 Viewing Models
-
-Registered models can be queried using the view command.
-
-```SQL
-SHOW MODELS
-```
-
-In addition to displaying all model information directly, you can specify `model_id` to view the information of a specific model.
-
-```SQL
-SHOW MODELS <model_id> -- Only show specific model
-```
-
-The result of the model display contains the following:
-
-| **ModelId** | **ModelType** | **Category** | **State** |
-| ------------------- | --------------------- | -------------------- | ----------------- |
-| Model ID | Model Type | Model Category | Model State |
-
-The state machine for the State column is shown in the following flowchart:
-
-![](/img/ainode-upgrade-state-timecho-en.png)
-
-State machine flow description:
-
-1. After AINode starts, the `show models` command lists only **system built-in (BUILTIN)** models.
-2. 
Users can import their own models, which are identified as **user-defined (USER_DEFINED)**; AINode will try to parse the model type (ModelType) from the model configuration file; if parsing fails, this field will be displayed as empty. -3. Time series large models (built-in models) do not have weight files packaged with AINode, and AINode automatically downloads them when starting. - 1. During download, it is ACTIVATING, and after successful download, it becomes ACTIVE, and if failed, it becomes INACTIVE. -4. After users start a model fine-tuning task, the model state during training is TRAINING, and after successful training, it becomes ACTIVE, and if failed, it becomes FAILED. -5. If the fine-tuning task is successful, after fine-tuning, all ckpt (training files) will be statistically analyzed to find the best file and automatically renamed to the user-specified model_id. - -**Viewing Example** - -```SQL -IoTDB> show models -+---------------------+--------------+--------------+-------------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------+--------------+-------------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| custom| | user_defined| active| -| timer_xl| timer| builtin| activating| -| sundial| sundial| builtin| active| -| sundialx_1| sundial| fine_tuned| active| -| sundialx_4| sundial| fine_tuned| training| -| sundialx_5| sundial| fine_tuned| failed| -| chronos2| t5| builtin| inactive| -+---------------------+--------------+--------------+-------------+ -``` - -**Built-in Traditional Time Series Models:** - -| Model Name | Core Concept | Applicable Scenarios | Key Features | 
-|---|---|---|---|
-| **ARIMA** (Autoregressive Integrated Moving Average) | Combines AR, differencing (I), and MA for stationary or differenced series | Univariate forecasting (stock prices, sales, economics) | 1. For linear trends with weak seasonality<br>2. Requires (p,d,q) tuning<br>3. Sensitive to missing values |
-| **Holt-Winters** (Triple Exponential Smoothing) | Exponential smoothing with level, trend, and seasonal components | Data with clear trend & seasonality (monthly sales, power demand) | 1. Handles additive/multiplicative seasonality<br>2. Weights recent data higher<br>3. Simple implementation |
-| **Exponential Smoothing** | Weighted average of history with exponentially decaying weights | Trending but non-seasonal data (short-term demand) | 1. Few parameters, simple computation<br>2. Suitable for stable/slow-changing series<br>3. Extensible to double/triple smoothing |
-| **Naive Forecaster** | Uses last observation as next prediction (simplest baseline) | Benchmarking or data with no clear pattern | 1. No training needed<br>2. Sensitive to sudden changes<br>3. Seasonal variant uses prior season value |
-| **STL Forecaster** | Decomposes series into trend, seasonal, residual; forecasts components | Complex seasonality/trends (climate, traffic) | 1. Handles non-fixed seasonality<br>2. Robust to outliers<br>3. Components can use other models |
-| **Gaussian HMM** | Hidden states generate observations; each state follows Gaussian distribution | State sequence prediction/classification (speech, finance) | 1. Models temporal state transitions<br>2. Observations independent per state<br>3. Requires state count |
-| **GMM HMM** | Extends Gaussian HMM; each state uses Gaussian Mixture Model | Multi-modal observation scenarios (motion recognition, biosignals) | 1. More flexible than single Gaussian<br>2. Higher complexity<br>3. Requires GMM component count |
-| **STRAY** (Search and TRace AnomalY) | Uses nearest-neighbor distances and extreme value theory to detect anomalies in high-dimensional data | High-dimensional anomaly detection (sensor networks, IT monitoring) | 1. No distribution assumption<br>2. Handles high dimensions<br>3. Sensitive to global anomalies |
-
-**Built-in Time Series Large Models:**
-
-| Model Name | Core Concept | Applicable Scenarios | Key Features |
-|---|---|---|---|
-| **Timer-XL** | Long-context time series large model pretrained on massive industrial data | Complex industrial forecasting requiring ultra-long history (energy, aerospace, transport) | 1. Supports input of tens of thousands of time points<br>2. Covers non-stationary, multivariate, and covariate scenarios<br>3. Pretrained on trillion-scale high-quality industrial IoT data |
-| **Timer-Sundial** | Generative foundation model with "Transformer + TimeFlow" architecture | Zero-shot forecasting requiring uncertainty quantification (finance, supply chain, renewable energy) | 1. Strong zero-shot generalization; supports point & probabilistic forecasting<br>2. Flexible analysis of any prediction distribution statistic<br>3. Innovative flow-matching architecture for efficient non-deterministic sample generation |
-| **Chronos-2** | Universal time series foundation model based on discrete tokenization | Rapid zero-shot univariate forecasting; scenarios enhanced by covariates (promotions, weather) | 1. Powerful zero-shot probabilistic forecasting<br>2. Unified multi-variable & covariate modeling (strict input requirements):<br>  a. Future covariate names ⊆ historical covariate names<br>  b. Each historical covariate length = target length<br>  c. 
Each future covariate length = prediction length<br>3. Efficient encoder-only structure balancing performance and speed |
-
-### 4.5 Model Deletion
-
-Registered models can be deleted via SQL. AINode removes the corresponding model folder under `user_defined`. Syntax:
-```SQL
-DROP MODEL <model_id>
-```
-- Requires specifying an existing `model_id`.
-- Deletion is asynchronous (status: `DROPPING`), during which the model cannot be used for inference.
-- **Built-in models cannot be deleted.**
-
-### 4.6 Loading/Unloading Models
-
-AINode supports two loading strategies:
-* **On-Demand Loading**: Load model temporarily during inference, then release resources. Suitable for testing or low-load scenarios.
-* **Persistent Loading**: Keep model instances resident in CPU memory or GPU VRAM to support high-concurrency inference. Users specify load/unload targets via SQL; AINode auto-manages instance counts. Current loaded status is queryable.
-
-Details below:
-
-1. **Configuration Parameters**
-   Edit these settings to control persistent loading behavior:
-   ```properties
-   # Ratio of total device memory/VRAM usable by AINode for inference
-   # Datatype: Float
-   ain_inference_memory_usage_ratio=0.4
-
-   # Memory overhead ratio per loaded model instance (model_size * this_value)
-   # Datatype: Float
-   ain_inference_extra_memory_ratio=1.2
-   ```
-
-2. **List Available Devices**
-   ```SQL
-   SHOW AI_DEVICES
-   ```
-   Example:
-   ```SQL
-   IoTDB> SHOW AI_DEVICES
-   +-------------+
-   |     DeviceId|
-   +-------------+
-   |          cpu|
-   |            0|
-   |            1|
-   +-------------+
-   ```
-
-3. **Load Model**
-   Manually load a model; the system auto-balances the instance count based on available resources:
-   ```SQL
-   LOAD MODEL <existing_model_id> TO DEVICES <device_id>(, <device_id>)*
-   ```
-   Parameters:
-   * `existing_model_id`: Model ID (current version supports `timer_xl` and `sundial` only)
-   * `device_id`:
-     * `cpu`: Load into server memory
-     * `gpu_id`: Load into specified GPU(s), e.g., `'0,1'` for GPUs 0 and 1
-
-   Example:
-   ```SQL
-   LOAD MODEL sundial TO DEVICES 'cpu,0,1'
-   ```
-
-4. 
**Unload Model**
-   Unload all instances of a model; the system reallocates the freed resources:
-   ```SQL
-   UNLOAD MODEL <existing_model_id> FROM DEVICES <device_id>(, <device_id>)*
-   ```
-   Parameters are the same as for `LOAD MODEL`.
-   Example:
-   ```SQL
-   UNLOAD MODEL sundial FROM DEVICES 'cpu,0,1'
-   ```
-
-5. **View Loaded Models**
-   ```SQL
-   SHOW LOADED MODELS
-   SHOW LOADED MODELS <device_id>(, <device_id>)* -- Filter by device
-   ```
-   Example (sundial loaded on CPU, GPU 0, GPU 1):
-   ```SQL
-   IoTDB> SHOW LOADED MODELS
-   +-------------+--------------+------------------+
-   |     DeviceId|       ModelId|  Count(instances)|
-   +-------------+--------------+------------------+
-   |          cpu|       sundial|                 4|
-   |            0|       sundial|                 6|
-   |            1|       sundial|                 6|
-   +-------------+--------------+------------------+
-   ```
-   * `DeviceId`: Device identifier
-   * `ModelId`: Loaded model ID
-   * `Count(instances)`: Number of model instances per device (auto-assigned by system)
-
-### 4.7 Large Time Series Models
-
-AINode supports multiple large time series models. For deployment details, refer to [Time Series Large Model](../AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md).
-
-## 5. Permission Management
-
-AINode relies on IoTDB's built-in authentication for permission management. Users need the `USE_MODEL` permission to manage models, and the `READ_SCHEMA` and `READ_DATA` permissions to access input data for inference. 
- -| **Permission** | **Scope** | **Administrator (default ROOT)** | **Normal User** | -|---------------------|--------------------------------|----------------------------------|-----------------| -| USE_MODEL | create model / show models / drop model | √ | √ | -| READ_SCHEMA&READ_DATA | forecast | √ | √ | \ No newline at end of file diff --git a/src/UserGuide/Master/Table/AI-capability/AINode_timecho.md b/src/UserGuide/Master/Table/AI-capability/AINode_timecho.md deleted file mode 100644 index 3a1570f60..000000000 --- a/src/UserGuide/Master/Table/AI-capability/AINode_timecho.md +++ /dev/null @@ -1,482 +0,0 @@ - - -# AINode - -AINode is a native IoTDB node that supports the registration, management, and invocation of time-series-related models. It comes with built-in industry-leading self-developed time-series large models, such as the Timer series developed by Tsinghua University. These models can be invoked through standard SQL statements, enabling real-time inference of time series data at the millisecond level, and supporting application scenarios such as trend forecasting, missing value imputation, and anomaly detection for time series data. - -> Available since V2.0.5.1 - -The system architecture is shown below: -::: center - -::: - -The responsibilities of the three nodes are as follows: - -- **ConfigNode:** - - Manages distributed nodes and handles load balancing across the system. -- **DataNode:** - - Receives and parses user SQL queries. - - Stores time-series data. - - Performs preprocessing computations on raw data. -- **AINode:** - - Manages and utilizes time-series models (including training/inference). - - Supports deep learning and machine learning workflows. - -## 1. 
Advantageous features - -Compared with building a machine learning service alone, it has the following advantages: - -- **Simple and easy to use**: no need to use Python or Java programming, the complete process of machine learning model management and inference can be completed using SQL statements. Creating a model can be done using the CREATE MODEL statement, and using a model for inference can be done using the SELECT * FROM FORECAST (...) statement, making it simpler and more convenient to use. - -- **Avoid Data Migration**: With IoTDB native machine learning, data stored in IoTDB can be directly applied to the inference of machine learning models without having to move the data to a separate machine learning service platform, which accelerates data processing, improves security, and reduces costs. - -![](/img/AInode1.png) - -- **Built-in Advanced Algorithms**: supports industry-leading machine learning analytics algorithms covering typical timing analysis tasks, empowering the timing database with native data analysis capabilities. Such as: - - **Time Series Forecasting**: learns patterns of change from past time series; thus outputs the most likely prediction of future series based on observations at a given past time. - - **Anomaly Detection for Time Series**: detects and identifies outliers in a given time series data, helping to discover anomalous behaviour in the time series. - - **Annotation for Time Series (Time Series Annotation)**: Adds additional information or markers, such as event occurrence, outliers, trend changes, etc., to each data point or specific time period to better understand and analyse the data. - - -## 2. Basic Concepts - -- **Model**: A machine learning model takes time series data as input and outputs analysis task results or decisions. Models are the basic management units of AINode, supporting model operations such as creation (registration), deletion, query, modification (fine-tuning), and usage (inference). 
-- **Create**: Load externally designed or trained model files/algorithms into AINode for unified management and usage by IoTDB.
-- **Inference**: Use the created model to complete time series analysis tasks applicable to the model on specified time series data.
-- **Built-in Capabilities**: AINode comes with machine learning algorithms or self-developed models for common time series analysis scenarios (e.g., forecasting and anomaly detection).
-
-![](/img/AINode-en.png)
-
-## 3. Installation and Deployment
-
-For AINode deployment, see [AINode Deployment](../Deployment-and-Maintenance/AINode_Deployment_timecho.md).
-
-
-## 4. Usage Guide
-
-AINode provides model creation and deletion functions for time series models. Built-in models do not require creation and can be used directly.
-
-### 4.1 Registering Models
-
-Trained deep learning models can be registered by specifying their input and output vector dimensions for inference.
-
-Models that meet the following criteria can be registered with AINode:
-
-1. AINode currently supports models trained with PyTorch 2.4.0. Features introduced after version 2.4.0 should be avoided.
-2. AINode supports models stored using PyTorch JIT (`model.pt`), which must include both the model structure and weights.
-3. The model input sequence can include single or multiple columns. If multi-column, it must match the model capabilities and configuration file.
-4. Model configuration parameters must be clearly defined in the `config.yaml` file. When using the model, the input and output dimensions defined in `config.yaml` must be strictly followed. Mismatches with the configuration file will cause errors.
-
-The SQL syntax for model registration is defined as follows:
-
-```SQL
-create model <model_id> using uri <uri>
-```
-
-Detailed meanings of SQL parameters:
-
-- **model_id**: The global unique identifier for the model, non-repeating. 
Model names have the following constraints: - - Allowed characters: [0-9 a-z A-Z _] (letters, digits (not at the beginning), underscores (not at the beginning)) - - Length: 2-64 characters - - Case-sensitive -- **uri**: The resource path of the model registration files, which should include the **model structure and weight file `model.pt` and the model configuration file `config.yaml`** - - - **Model structure and weight file**: The weight file generated after model training, currently supporting `.pt` files from PyTorch training. - - - **Model configuration file**: Parameters related to the model structure provided during registration, which must include input and output dimensions for inference: - - | **Parameter Name** | **Description** | **Example** | - | ------------ | ---------------------------- | -------- | - | input_shape | Rows and columns of model input | [96,2] | - | output_shape | Rows and columns of model output | [48,2] | - - In addition to inference, data types of input and output can also be specified: - - | **Parameter Name** | **Description** | **Example** | - | ------------------ | ------------------------- | ---------------------- | - | input_type | Data type of model input | ['float32', 'float32'] | - | output_type | Data type of model output | ['float32', 'float32'] | - - Additional notes can be specified for model management display: - - | **Parameter Name** | **Description** | **Example** | - | ------------------ | --------------------------------------------- | -------------------------------------------- | - | attributes | Optional notes set by users for model display | 'model_type': 'dlinear', 'kernel_size': '25' | - -In addition to registering local model files, remote resource paths can be specified via URIs for registration, using open-source model repositories (e.g., HuggingFace). 
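
The `model_id` naming rules listed above can be checked on the client side before a `create model` statement is sent. This is a minimal sketch under one assumed reading of the rules: the first character must be a letter (digits and underscores are not allowed at the beginning) and the total length is 2 to 64 characters. The helper name `is_valid_model_id` is hypothetical and not part of IoTDB.

```python
import re

# Assumed reading of the constraints: one leading letter, followed by
# 1 to 63 letters, digits, or underscores (2-64 characters in total).
MODEL_ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{1,63}")


def is_valid_model_id(model_id: str) -> bool:
    """Check a candidate model_id against the documented naming rules."""
    return MODEL_ID_PATTERN.fullmatch(model_id) is not None
```

Because model names are case-sensitive, the check deliberately does not normalize case; `Dlinear` and `dlinear` would be two distinct identifiers.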
-
-#### Example
-
-The [example folder](https://github.com/apache/iotdb/tree/master/integration-test/src/test/resources/ainode-example) contains model.pt (trained model) and config.yaml with the following content:
-
-```YAML
-configs:
-    # Required
-    input_shape: [96, 2]      # Model accepts 96 rows x 2 columns of data
-    output_shape: [48, 2]     # Model outputs 48 rows x 2 columns of data
-
-    # Optional (default to all float32, column count matches shape)
-    input_type: ["int64", "int64"]   # Data types of inputs, must match input column count
-    output_type: ["text", "int64"]   # Data types of outputs, must match output column count
-
-attributes:  # Optional user-defined notes
-    'model_type': 'dlinear'
-    'kernel_size': '25'
-```
-
-Register the model by specifying this folder as the loading path:
-
-```SQL
-IoTDB> create model dlinear_example using uri "file://./example"
-```
-
-Models can also be downloaded from HuggingFace for registration:
-
-```SQL
-IoTDB> create model dlinear_example using uri "https://huggingface.co/google/timesfm-2.0-500m-pytorch"
-```
-
-After SQL execution, registration proceeds asynchronously. The registration status can be checked in the model list (see the "Viewing Models" section). The time until registration completes depends mainly on the model file size.
-
-Once registered, the model can be invoked for inference through normal query syntax.
-
-### 4.2 Viewing Models
-
-Registered models can be queried using the `show models` command. The SQL definitions are:
-
-```SQL
-show models
-
-show models <model_id>
-```
-
-In addition to displaying all models, specifying a `model_id` shows details of a specific model. The display includes:
-
-| **ModelId** | **ModelType** | **Category** | **State** |
-|-------------|---------------|----------------|-------------|
-| Model ID | Model Type | Model Category | Model State |
-
-- Model State Transition Diagram
-
-![](/img/AINode-State-en.png)
-
-**Instructions:**
-
-1. 
Initialization:
-   - When AINode starts, `show models` only displays BUILT-IN models.
-2. Custom Model Import:
-   - Users can import custom models (marked as USER-DEFINED).
-   - The system attempts to parse the `ModelType` from the config file.
-   - If parsing fails, the field remains empty.
-3. Foundation Model Weights:
-   - Time-series foundation model weights are not bundled with AINode.
-   - AINode automatically downloads them during startup.
-   - Download state: LOADING.
-4. Download Outcomes:
-   - Success → State changes to ACTIVE.
-   - Failure → State changes to INACTIVE.
-5. Fine-Tuning Process:
-   - When fine-tuning starts: State becomes TRAINING.
-   - Successful training → State transitions to ACTIVE.
-   - Training failure → State changes to FAILED.
-
-**Example**
-
-```SQL
-IoTDB> show models
-+---------------------+--------------------+--------------+---------+
-|              ModelId|           ModelType|      Category|    State|
-+---------------------+--------------------+--------------+---------+
-|                arima|               Arima|      BUILT-IN|   ACTIVE|
-|          holtwinters|         HoltWinters|      BUILT-IN|   ACTIVE|
-|exponential_smoothing|ExponentialSmoothing|      BUILT-IN|   ACTIVE|
-|     naive_forecaster|     NaiveForecaster|      BUILT-IN|   ACTIVE|
-|       stl_forecaster|       StlForecaster|      BUILT-IN|   ACTIVE|
-|         gaussian_hmm|         GaussianHmm|      BUILT-IN|   ACTIVE|
-|              gmm_hmm|              GmmHmm|      BUILT-IN|   ACTIVE|
-|                stray|               Stray|      BUILT-IN|   ACTIVE|
-|               custom|                    |  USER-DEFINED|   ACTIVE|
-|              timerxl|            Timer-XL|      BUILT-IN|  LOADING|
-|              sundial|       Timer-Sundial|      BUILT-IN|   ACTIVE|
-|           sundialx_1|       Timer-Sundial|    FINE-TUNED|   ACTIVE|
-|           sundialx_2|       Timer-Sundial|    FINE-TUNED|   ACTIVE|
-|             sundialx|       Timer-Sundial|    FINE-TUNED|   ACTIVE|
-|           sundialx_4|       Timer-Sundial|    FINE-TUNED| TRAINING|
-|           sundialx_5|       Timer-Sundial|    FINE-TUNED|   FAILED|
-+---------------------+--------------------+--------------+---------+
-```
-
-
-### 4.3 Deleting Models
-
-Registered models can be deleted via SQL, which removes all related files under AINode:
-
-```SQL
-drop model <model_id>
-```
-
-Specify the registered `model_id` to delete the 
model. Since deletion involves data cleanup, the operation is not immediate, and the model state becomes `DROPPING`, during which it cannot be used for inference. **Note:** Built-in models cannot be deleted. - -### 4.4 Inference with Built-in Models - -SQL syntax: - - -```SQL -SELECT * FROM forecast( - input, - model_id, - [output_length, - output_start_time, - output_interval, - timecol, - preserve_input, - model_options]? -) -``` - - -Built-in models do not require prior registration for inference. Simply use the `forecast` function and specify the `model_id` to invoke the model's inference capabilities. - - - **Note**: Inference with built-in time series large models requires local availability of model weights in the directory `/IOTDB_AINODE_HOME/data/ainode/models/weights/model_id/`. If weights are missing, they will be automatically downloaded from HuggingFace. Ensure direct network access to HuggingFace. - -Parameter descriptions: - -| Parameter | Type | Attribute | Description | Required | Notes | -| ----------------- | ------------- | -------------------------------------------------------- | ------------------------------------------------------------ | -------- | ------------------------------------------------------------ | -| input | Table | SET SEMANTIC | Input data for forecasting | Yes | | -| model_id | String | Scalar | Name of the model to use | Yes | Must be non-empty and a built-in model; otherwise, errors like "MODEL_ID cannot be null" occur. | -| output_length | INT32 | Scalar (default: 96) | Size of the output forecast window | No | Must be > 0. | -| output_start_time | Timestamp | Scalar (default: last input timestamp + output_interval) | Start timestamp of the forecast results | No | Can be negative (before 1970-01-01). | -| output_interval | Time interval | Scalar (default: inferred from input) | Time interval between forecast points (supports ns, us, ms, s, m, h, d, w) | No | If > 0, uses user-specified interval; else, infers from input. 
| -| timecol | String | Scalar (default: "time") | Name of the timestamp column | No | Must exist in `input` and be of TIMESTAMP type; otherwise, errors occur. | -| preserve_input | Boolean | Scalar (default: false) | Retain all input rows in the output | No | | -| model_options | String | Scalar (default: empty) | Model-specific key-value pairs (e.g., normalization) | No | Unsupported parameters are ignored. See appendix for built-in model parameters. | - -**Notes:** - -1. The `forecast` function predicts all columns in the input table by default (excluding the time column and columns specified in `partition by`). -2. The `forecast` function does not require the input data to be in any specific order. It sorts the input data in ascending order by the timestamp (specified by the `TIMECOL` parameter) before invoking the model for prediction. -3. Different models have varying requirements for the number of input data rows. If the input data has fewer rows than the minimum requirement, an error will be reported. - - Among the current built-in models in AINode: - - Timer-XL requires at least 96 rows of input data. - - Timer-Sundial requires at least 16 rows of input data. -4. The result columns of the `forecast` function include all input columns from the input table, with their original data types preserved. If `preserve_input = true`, an additional `is_input` column will be included to indicate whether a row is from the input data. - - Currently, only columns of type INT32, INT64, FLOAT, or DOUBLE are supported for prediction. Otherwise, an error will occur: "The type of the column [%s] is [%s], only INT32, INT64, FLOAT, DOUBLE is allowed." -5. `output_start_time` and `output_interval` only affect the generation of the timestamp column in the output results. Both are optional parameters: - - `output_start_time` defaults to the last timestamp of the input data plus `output_interval`. 
- - `output_interval` defaults to the sampling interval of the input data, calculated as: (last timestamp - first timestamp) / (number of rows - 1). - - The timestamp of the Nth output row is calculated as: `output_start_time + (N - 1) * output_interval`. - -**Example: Database and table must be pre-created** - -```sql -create database etth -create table eg (hufl FLOAT FIELD, hull FLOAT FIELD, mufl FLOAT FIELD, mull FLOAT FIELD, lufl FLOAT FIELD, lull FLOAT FIELD, ot FLOAT FIELD) -``` - -Using the ETTh1-tab dataset:[ETTh1-tab](/img/ETTh1-tab.csv)。 - -**View supported models** - -```Bash -IoTDB:etth> show models -+---------------------+--------------------+--------+------+ -| ModelId| ModelType|Category| State| -+---------------------+--------------------+--------+------+ -| arima| Arima|BUILT-IN|ACTIVE| -| holtwinters| HoltWinters|BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing|BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster|BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster|BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm|BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm|BUILT-IN|ACTIVE| -| stray| Stray|BUILT-IN|ACTIVE| -| sundial| Timer-Sundial|BUILT-IN|ACTIVE| -| timer_xl| Timer-XL|BUILT-IN|ACTIVE| -+---------------------+--------------------+--------+------+ -Total line number = 10 -It costs 0.004s -``` - -**Inference with sundial model:** - -```Bash -IoTDB:etth> select Time, HUFL,HULL,MUFL,MULL,LUFL,LULL,OT from eg LIMIT 96 -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -| Time| HUFL| HULL| MUFL| MULL| LUFL| LULL| OT| -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -|2016-07-01T00:00:00.000+08:00| 5.827|2.009|1.599|0.462|4.203| 1.34|30.531| -|2016-07-01T01:00:00.000+08:00| 5.693|2.076|1.492|0.426|4.142|1.371|27.787| -|2016-07-01T02:00:00.000+08:00| 5.157|1.741|1.279|0.355|3.777|1.218|27.787| -|2016-07-01T03:00:00.000+08:00| 5.09|1.942|1.279|0.391|3.807|1.279|25.044| -...... 
-Total line number = 96
-It costs 0.119s
-
-IoTDB:etth> select * from forecast(
-    model_id => 'sundial',
-    input => (select Time, ot from etth.eg where time >= 2016-08-07T18:00:00.000+08:00 limit 1440) order BY time,
-    output_length => 96
-)
-+-----------------------------+---------+
-|                         time|       ot|
-+-----------------------------+---------+
-|2016-10-06T18:00:00.000+08:00|20.781654|
-|2016-10-06T19:00:00.000+08:00|20.252121|
-|2016-10-06T20:00:00.000+08:00|19.960138|
-|2016-10-06T21:00:00.000+08:00|19.662334|
-......
-Total line number = 96
-It costs 1.615s
-```
-
-### 4.5 Fine-tuning Built-in Models
-> Only Timer-XL and Timer-Sundial support fine-tuning.
-
-
-The SQL syntax is as follows:
-
-
-```SQL
-create model <model_id> (with hyperparameters
-(<hyperparameter_name>=<value>(, <hyperparameter_name>=<value>)*))?
-from model <existing_model_id>
-on dataset (inputSql)
-```
-
-#### Example
-
-1. Select the first 80% of data from the measurement `ot` as the fine-tuning dataset, and create the model `sundialv3` based on `sundial`.
-
-```SQL
-IoTDB> set sql_dialect=table
-Msg: The statement is executed successfully.
-IoTDB> CREATE MODEL sundialv3 FROM MODEL sundial ON DATASET ('SELECT time, ot from etth.eg where 1467302400000 <= time and time < 1517468400001')
-Msg: The statement is executed successfully. 
-IoTDB> show models -+---------------------+--------------------+----------+--------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+--------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| timer_xl| Timer-XL| BUILT-IN| ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED| ACTIVE| -| sundialv3| Timer-Sundial|FINE-TUNED|TRAINING| -+---------------------+--------------------+----------+--------+ -``` - -2. The fine-tuning task starts asynchronously in the background, and logs can be viewed in the AINode process. After fine-tuning is complete, query and use the new model - -```SQL -IoTDB> show models -+---------------------+--------------------+----------+------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+------+ -| arima| Arima| BUILT-IN|ACTIVE| -| holtwinters| HoltWinters| BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN|ACTIVE| -| stray| Stray| BUILT-IN|ACTIVE| -| sundial| Timer-Sundial| BUILT-IN|ACTIVE| -| timer_xl| Timer-XL| BUILT-IN|ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED|ACTIVE| -| sundialv3| Timer-Sundial|FINE-TUNED|ACTIVE| -+---------------------+--------------------+----------+------+ -``` - -### 4.6 Time Series Large Model Import Steps - -AINode supports multiple time series large models. 
For deployment, refer to [Time Series Large Model](../AI-capability/TimeSeries-Large-Model.md) - - -## 5 Permission Management - -AINode uses IoTDB's authentication for permission management. Users need `USE_MODEL` permission for model management and `READ_DATA` permission for inference (to access input data sources). - -| **Permission** | **Scope** | **Admin (ROOT)** | **Regular User** | **Path-Related** | -| -------------- | ------------------------ | ---------------- | ---------------- | ---------------- | -| USE_MODEL | Create/Show/Drop models | ✔️ | ✔️ | ❌ | -| READ_DATA | Call inference functions | ✔️ | ✔️ | ✔️ | - - -## 6 Appendix - -**Arima** - -| Parameter | Description | Default | -| ----------------------- | ------------------------------------------------------------ | --------- | -| order | ARIMA order `(p, d, q)`: p=autoregressive, d=differencing, q=moving average. | (1,0,0) | -| seasonal_order | Seasonal ARIMA order `(P, D, Q, s)`: seasonal AR, differencing, MA orders, and season length (e.g., 12 for monthly data). | (0,0,0,0) | -| method | Optimizer: 'newton', 'nm', 'bfgs', 'lbfgs', 'powell', 'cg', 'ncg', 'basinhopping'. | 'lbfgs' | -| maxiter | Maximum iterations/function evaluations. | 50 | -| out_of_sample_size | Number of tail samples for validation (not used in fitting). | 0 | -| scoring | Scoring function for validation (sklearn metric or custom). | 'mse' | -| trend | Trend term configuration. If `with_intercept=True` and None, defaults to 'c' (constant). | None | -| with_intercept | Include intercept term. | True | -| time_varying_regression | Allow regression coefficients to vary over time. | False | -| enforce_stationarity | Enforce stationarity of AR components. | True | -| enforce_invertibility | Enforce invertibility of MA components. | True | -| simple_differencing | Use differenced data for estimation (sacrifices first rows). | False | -| measurement_error | Assume observation errors. 
| False | -| mle_regression | Use maximum likelihood for regression (must be False if `time_varying_regression=True`). | True | -| hamilton_representation | Use Hamilton representation (default is Harvey). | False | -| concentrate_scale | Exclude scale parameter from likelihood (reduces parameters). | False | - -**NaiveForecaster** - -| Parameter | Description | Default | -| --------- | ------------------------------------------------------------ | ------- | -| strategy | Forecasting strategy: - `"last"`: Use last training value (seasonal if `sp`>1). - `"mean"`: Use mean of last window (seasonal if `sp`>1). - `"drift"`: Fit line through last window and extrapolate (non-robust to NaN). | "last" | -| sp | Seasonal period. `None` or 1 means no seasonality; 12 means monthly. | 1 | - -**STLForecaster** - -| Parameter | Description | Default | -| ------------- | ---------------------------------------------------------- | ------- | -| sp | Seasonal period (units). Passed to statsmodels' STL. | 2 | -| seasonal | Seasonal smoothing window (odd ≥3, typically ≥7). | 7 | -| seasonal_deg | LOESS polynomial degree for season (0=constant, 1=linear). | 1 | -| trend_deg | LOESS polynomial degree for trend (0 or 1). | 1 | -| low_pass_deg | LOESS polynomial degree for low-pass (0 or 1). | 1 | -| seasonal_jump | Interpolation step for season LOESS (larger = faster). | 1 | -| trend_jump | Interpolation step for trend LOESS (larger = faster). | 1 | -| low_pass_jump | Interpolation step for low-pass LOESS. | 1 | - -**ExponentialSmoothing (HoltWinters)** - -| Parameter | Description | Default | -| --------------------- | ------------------------------------------------------------ | ----------- | -| damped_trend | Use damped trend (trend flattens instead of growing infinitely). 
| True |
-| initialization_method | Initialization method: `"estimated"` (fit to estimate initial states), `"heuristic"` (use a heuristic for initial level/trend/season), `"known"` (user-provided initial values), `"legacy-heuristic"` (legacy compatibility). | "estimated" |
-| optimized | Optimize parameters via maximum likelihood. | True |
-| remove_bias | Remove bias to make residuals' mean zero. | False |
-| use_brute | Use brute-force grid search for initial parameters. | |
\ No newline at end of file
diff --git a/src/UserGuide/Master/Table/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md b/src/UserGuide/Master/Table/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md
deleted file mode 100644
index c7fde0bf2..000000000
--- a/src/UserGuide/Master/Table/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md
+++ /dev/null
@@ -1,157 +0,0 @@
-
-# Time Series Large Models
-
-## 1. Introduction
-
-Time Series Large Models are foundational models designed specifically for time series data analysis. The IoTDB team has developed Timer, its own foundational time series model, which is based on the Transformer architecture and pre-trained on massive multi-domain time series data. Timer supports downstream tasks such as time series forecasting, anomaly detection, and time series imputation. The AINode platform developed by the team also supports integrating cutting-edge time series foundational models from the industry, giving users a range of model options. Unlike traditional time series analysis techniques, these large models have general-purpose feature extraction capabilities and can serve a wide range of analytical tasks through zero-shot analysis, fine-tuning, and other services.
-
-The technical work behind the time series large models described in this document (both the team's own models and leading models from the industry) has been published at top international machine learning conferences; see the appendix for details.
-
-## 2. Application Scenarios
-
-* **Time Series Forecasting**: Provides forecasting services for time series data in industrial production, natural environments, and other fields, helping users anticipate future trends.
-* **Data Imputation**: Fills missing segments in a time series based on their context, improving the continuity and integrity of the dataset.
-* **Anomaly Detection**: Uses autoregressive analysis to monitor time series data in real time and promptly flag potential anomalies.
-
-![](/img/LargeModel10.png)
-
-## 3. Timer-1 Model
-
-The Timer model (not a built-in model) demonstrates strong few-shot generalization and multi-task adaptability, and acquires a rich knowledge base through pre-training, giving it general-purpose capabilities for diverse downstream tasks. Its characteristics are as follows:
-
-* **Generalizability**: Fine-tuned with only a small number of samples, the model can achieve industry-leading prediction results among deep models.
-* **Universality**: The model design is flexible. It adapts to different task requirements and supports variable input and output lengths, so it works effectively across application scenarios.
-* **Scalability**: The model's performance continues to improve as the number of parameters or the scale of the pre-training data grows.
-
-![](/img/model01.png)
-
-## 4.
Timer-XL Model
-
-Timer-XL extends and upgrades the network structure of Timer, with advances along several dimensions:
-
-* **Ultra-Long Context Support**: The model overcomes the context-length bottleneck of traditional time series forecasting models and can process inputs of thousands of tokens (equivalent to tens of thousands of time points).
-* **Coverage of Multi-Variable Forecasting Scenarios**: Supports a variety of forecasting scenarios, including non-stationary time series, multi-variable prediction tasks, and predictions involving covariates, to meet diverse business needs.
-* **Large-Scale Industrial Time Series Dataset**: Pre-trained on a trillion-scale time series dataset from the industrial IoT field. The dataset is massive, high in quality, and broad in domain coverage, spanning energy, aerospace, steel, transportation, and other fields.
-
-![](/img/model02.png)
-
-## 5. Timer-Sundial Model
-
-Timer-Sundial is a series of generative foundational models focused on time series forecasting. The base version has 128 million parameters and has been pre-trained on 1 trillion time points. Its core characteristics are:
-
-* **Strong Generalization Performance**: Offers zero-shot forecasting and supports both point forecasting and probabilistic forecasting.
-* **Flexible Prediction Distribution Analysis**: Beyond predicting means or quantiles, it can evaluate any statistical property of the prediction distribution from the raw samples generated by the model.
-* **Innovative Generative Architecture**: Combines a Transformer with a TimeFlow module. The Transformer learns autoregressive representations of time segments, while TimeFlow transforms random noise into diverse prediction trajectories based on the flow-matching framework, efficiently generating non-deterministic samples.
-
-![](/img/model03.png)
-
-## 6. Chronos-2 Model
-
-Chronos-2 is a universal time series foundational model developed by the Amazon Web Services (AWS) research team, evolved from the Chronos discrete token modeling paradigm. It is suitable for both zero-shot univariate forecasting and covariate forecasting. Its main characteristics include:
-
-* **Probabilistic Forecasting Capability**: The model outputs multi-step prediction results in a generative manner and supports quantile or distribution-level forecasting to characterize future uncertainty.
-* **Zero-Shot General Forecasting**: Using the in-context learning ability acquired through pre-training, it can forecast directly on unseen datasets without retraining or parameter updates.
-* **Unified Modeling of Multi-Variable and Covariates**: Supports joint modeling of multiple related time series and their covariates under the same architecture to improve prediction performance on complex tasks. It has strict input requirements:
-    * The set of names of future covariates must be a subset of the set of names of historical covariates.
-    * The length of each historical covariate must equal the length of the target variable.
-    * The length of each future covariate must equal the prediction length.
-* **Efficient Inference and Deployment**: The model adopts a compact encoder-only structure, maintaining strong generalization while keeping inference efficient.
-
-![](/img/timeseries-large-model-chronos2.png)
-
-## 7.
Performance Showcase - -Time Series Large Models can adapt to real time series data from various different domains and scenarios, demonstrating excellent processing capabilities across various tasks. The following shows the actual performance on different datasets: - -**Time Series Forecasting:** - -Leveraging the forecasting capabilities of Time Series Large Models, future trends of time series can be accurately predicted. The blue curve in the following figure represents the predicted trend, while the red curve represents the actual trend, with both curves highly consistent. - -![](/img/LargeModel03.png) - -**Data Imputation:** - -Using Time Series Large Models to fill missing data segments through predictive imputation. - -![](/img/timeseries-large-model-data-imputation.png) - -**Anomaly Detection:** - -Using Time Series Large Models to accurately identify outliers that deviate significantly from the normal trend. - -![](/img/LargeModel05.png) - -## 8. Deployment and Usage - -1. Open the IoTDB CLI console and check that the ConfigNode, DataNode, and AINode nodes are all Running. - -```Plain -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| -+------+----------+-------+---------------+------------+--------------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| 2.0.5.1| 069354f| -| 1| DataNode|Running| 127.0.0.1| 10730| 2.0.5.1| 069354f| -| 2| AINode|Running| 127.0.0.1| 10810| 2.0.5.1|069354f-dev| -+------+----------+-------+---------------+------------+--------------+-----------+ -Total line number = 3 -It costs 0.140s -``` - -2. In an online environment, the first startup of the AINode node will automatically pull the Timer-XL, Sundial, and Chronos2 models. - - > Note: - > - > * The AINode installation package does not include model weight files. 
- > * The automatic pull feature depends on the deployment environment having HuggingFace network access capability. - > * AINode supports manual upload of model weight files. For specific operation methods, refer to [Importing Weight Files](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md#_3-3-importing-built-in-weight-files). - -3. Check if the models are available. - -```Bash -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### Appendix - -**[1]** Timer: Generative Pre-trained Transformers Are Large Time Series Models, Yong Liu, Haoran Zhang, Chenyu Li, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ Back]() - -**[2]** TIMER-XL: LONG-CONTEXT TRANSFORMERS FOR UNIFIED TIME SERIES FORECASTING, Yong Liu, Guo Qin, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ Back]() - -**[3]** Sundial: A Family of Highly Capable Time Series Foundation Models, Yong Liu, Guo Qin, Zhiyuan Shi, Zhi Chen, Caiyin Yang, Xiangdong Huang, Jianmin Wang, Mingsheng Long, **ICML 2025 spotlight**. [↩ Back]() - -**[4]** Chronos-2: From Univariate to Universal Forecasting, Abdul Fatir Ansari, Oleksandr Shchur, Jaris Küken, Andreas Auer, Boran Han, Pedro Mercado, Syama Sundar Rangapuram, Huibin Shen, Lorenzo Stella, Xiyuan Zhang, Mononito Goswami, Shubham Kapoor, Danielle C. 
Maddix, Pablo Guerron, Tony Hu, Junming Yin, Nick Erickson, Prateek Mutalik Desai, Hao Wang, Huzefa Rangwala, George Karypis, Yuyang Wang, Michael Bohlke-Schneider, **arXiv:2510.15821**. [↩ Back]()
\ No newline at end of file
diff --git a/src/UserGuide/Master/Table/API/Programming-CSharp-Native-API_timecho.md b/src/UserGuide/Master/Table/API/Programming-CSharp-Native-API_timecho.md
deleted file mode 100644
index 7c515df99..000000000
--- a/src/UserGuide/Master/Table/API/Programming-CSharp-Native-API_timecho.md
+++ /dev/null
@@ -1,403 +0,0 @@
-
-# C# Native API
-
-## 1. Feature Overview
-
-IoTDB provides a native C# client driver and a corresponding connection pool. Their object-oriented interfaces let you assemble time-series objects and write them directly, without constructing SQL. Use the connection pool when multiple threads operate on the database in parallel.
-
-## 2. Usage Instructions
-
-**Environment Requirements:**
-
-* .NET SDK >= 5.0 or .NET Framework 4.x
-* Thrift >= 0.14.1
-* NLog >= 4.7.9
-
-**Dependency Installation:**
-
-The client can be installed with tools such as the NuGet Package Manager or the .NET CLI. Using the .NET CLI as an example:
-
-If you are using .NET SDK 5.0 or later, run the following command to install the latest NuGet package:
-
-```Plain
-dotnet add package Apache.IoTDB
-```
-Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors.
-
-## 3. Read/Write Operations
-
-### 3.1 TableSessionPool
-
-#### 3.1.1 Functional Description
-
-The `TableSessionPool` defines the basic operations for interacting with IoTDB: data insertion, query execution, and session closure. It also serves as a connection pool that reuses connections efficiently and releases resources that are no longer in use. This interface defines how to acquire sessions from the pool and how to close the pool.
- -#### 3.1.2 Method List - -Below are the methods defined in `TableSessionPool` with detailed descriptions: - -| Method | Description | Parameters | Return Type | -| ---------------------------------------------------------------- | -------------------------------------------------------------------------------- |-----------------------------------------------------------------------------------------------------------| ---------------------------- | -| `Open(bool enableRpcCompression)` | Opens a session connection with custom `enableRpcCompression` | `enableRpcCompression`: Whether to enable `RpcCompression` (requires server-side configuration alignment) | `Task` | -| `Open()` | Opens a session connection without enabling `RpcCompression` | None | `Task` | -| `InsertAsync(Tablet tablet)` | Inserts a `Tablet` object containing time-series data into the database | `tablet`: The Tablet object to insert | `Task` | -| `ExecuteNonQueryStatementAsync(string sql)` | Executes a non-query SQL statement (e.g., DDL/DML commands) | `sql`: The SQL statement to execute | `Task` | -| `ExecuteQueryStatementAsync(string sql)` | Executes a query SQL statement and returns a `SessionDataSet` with results | `sql`: The SQL query to execute | `Task` | -| `ExecuteQueryStatementAsync(string sql, long timeoutInMs)` | Executes a query SQL statement with a timeout (milliseconds) | `sql`: The SQL query to execute
`timeoutInMs`: Query timeout in milliseconds | `Task` | -| `Close()` | Closes the session and releases held resources | None | `Task` | - -#### 3.1.3 Interface Examples - -```C# -public async Task Open(bool enableRpcCompression, CancellationToken cancellationToken = default) - - public async Task Open(CancellationToken cancellationToken = default) - - public async Task InsertAsync(Tablet tablet) - - public async Task ExecuteNonQueryStatementAsync(string sql) - - public async Task ExecuteQueryStatementAsync(string sql) - - public async Task ExecuteQueryStatementAsync(string sql, long timeoutInMs) - - public async Task Close() -``` - -### 3.2 TableSessionPool.Builder - -#### 3.2.1 Functional Description - -The `TableSessionPool.Builder` class configures and creates instances of `TableSessionPool`, allowing developers to set connection parameters, session settings, and pooling behaviors. - -#### 3.2.2 Configuration Options - -Below are the available configuration options for `TableSessionPool.Builder` and their defaults: - -| ​**Configuration Method** | ​**Description** | ​**Default Value** | -| --------------------------------------------- | -------------------------------------------------------------------------------- |---------------------------------------------------| -| `SetHost(string host)` | Sets the IoTDB node host | `localhost` | -| `SetPort(int port)` | Sets the IoTDB node port | `6667` | -| `SetNodeUrls(List nodeUrls)` | Sets IoTDB cluster node URLs (overrides `SetHost`/`SetPort` when used) | Not set | -| `SetUsername(string username)` | Sets the connection username | `"root"` | -| `SetPassword(string password)` | Sets the connection password | `"TimechoDB@2021"` //before V2.0.6 it is root | -| `SetFetchSize(int fetchSize)` | Sets the fetch size for query results | `1024` | -| `SetZoneId(string zoneId)` | Sets the timezone ZoneID | `UTC+08:00` | -| `SetPoolSize(int poolSize)` | Sets the maximum number of sessions in the connection pool | `8` | -| 
`SetEnableRpcCompression(bool enable)` | Enables/disables RPC compression | `false` | -| `SetConnectionTimeoutInMs(int timeout)` | Sets the connection timeout in milliseconds | `500` | -| `SetDatabase(string database)` | Sets the target database name | `""` | - -#### 3.2.3 Interface Examples - -```c# -public Builder SetHost(string host) - { - _host = host; - return this; - } - - public Builder SetPort(int port) - { - _port = port; - return this; - } - - public Builder SetUsername(string username) - { - _username = username; - return this; - } - - public Builder SetPassword(string password) - { - _password = password; - return this; - } - - public Builder SetFetchSize(int fetchSize) - { - _fetchSize = fetchSize; - return this; - } - - public Builder SetZoneId(string zoneId) - { - _zoneId = zoneId; - return this; - } - - public Builder SetPoolSize(int poolSize) - { - _poolSize = poolSize; - return this; - } - - public Builder SetEnableRpcCompression(bool enableRpcCompression) - { - _enableRpcCompression = enableRpcCompression; - return this; - } - - public Builder SetConnectionTimeoutInMs(int timeout) - { - _connectionTimeoutInMs = timeout; - return this; - } - - public Builder SetNodeUrls(List nodeUrls) - { - _nodeUrls = nodeUrls; - return this; - } - - protected internal Builder SetSqlDialect(string sqlDialect) - { - _sqlDialect = sqlDialect; - return this; - } - - public Builder SetDatabase(string database) - { - _database = database; - return this; - } - - public Builder() - { - _host = "localhost"; - _port = 6667; - _username = "root"; - _password = "TimechoDB@2021"; //before V2.0.6 it is root - _fetchSize = 1024; - _zoneId = "UTC+08:00"; - _poolSize = 8; - _enableRpcCompression = false; - _connectionTimeoutInMs = 500; - _sqlDialect = IoTDBConstant.TABLE_SQL_DIALECT; - _database = ""; - } - - public TableSessionPool Build() - { - SessionPool sessionPool; - // if nodeUrls is not empty, use nodeUrls to create session pool - if (_nodeUrls.Count > 0) - { - 
sessionPool = new SessionPool(_nodeUrls, _username, _password, _fetchSize, _zoneId, _poolSize, _enableRpcCompression, _connectionTimeoutInMs, _sqlDialect, _database); - } - else - { - sessionPool = new SessionPool(_host, _port, _username, _password, _fetchSize, _zoneId, _poolSize, _enableRpcCompression, _connectionTimeoutInMs, _sqlDialect, _database); - } - return new TableSessionPool(sessionPool); - } -``` - -## 4. Example - -Complete example : [samples/Apache.IoTDB.Samples/TableSessionPoolTest.cs](https://github.com/apache/iotdb-client-csharp/blob/main/samples/Apache.IoTDB.Samples/TableSessionPoolTest.cs) - -```c# -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. 
- */ - -using System; -using System.Collections.Generic; -using System.Threading.Tasks; -using Apache.IoTDB.DataStructure; - -namespace Apache.IoTDB.Samples; - -public class TableSessionPoolTest -{ - private readonly SessionPoolTest sessionPoolTest; - - public TableSessionPoolTest(SessionPoolTest sessionPoolTest) - { - this.sessionPoolTest = sessionPoolTest; - } - - public async Task Test() - { - await TestCleanup(); - - await TestSelectAndInsert(); - await TestUseDatabase(); - // await TestCleanup(); - } - - - public async Task TestSelectAndInsert() - { - var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(sessionPoolTest.nodeUrls) - .SetUsername(sessionPoolTest.username) - .SetPassword(sessionPoolTest.password) - .SetFetchSize(1024) - .Build(); - - await tableSessionPool.Open(false); - - if (sessionPoolTest.debug) tableSessionPool.OpenDebugMode(); - - - await tableSessionPool.ExecuteNonQueryStatementAsync("CREATE DATABASE test1"); - await tableSessionPool.ExecuteNonQueryStatementAsync("CREATE DATABASE test2"); - - await tableSessionPool.ExecuteNonQueryStatementAsync("use test2"); - - // or use full qualified table name - await tableSessionPool.ExecuteNonQueryStatementAsync( - "create table test1.table1(region_id STRING TAG, plant_id STRING TAG, device_id STRING TAG, model STRING ATTRIBUTE, temperature FLOAT FIELD, humidity DOUBLE FIELD) with (TTL=3600000)"); - - await tableSessionPool.ExecuteNonQueryStatementAsync( - "create table table2(region_id STRING TAG, plant_id STRING TAG, color STRING ATTRIBUTE, temperature FLOAT FIELD, speed DOUBLE FIELD) with (TTL=6600000)"); - - // show tables from current database - var res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - // show tables by specifying another database - // using SHOW tables FROM - res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES FROM test1"); - 
res.ShowTableNames();
-        while (res.HasNext()) Console.WriteLine(res.Next());
-        await res.Close();
-
-        var tableName = "testTable1";
-        List<string> columnNames =
-            new List<string> {
-            "region_id",
-            "plant_id",
-            "device_id",
-            "model",
-            "temperature",
-            "humidity" };
-        List<TSDataType> dataTypes =
-            new List<TSDataType>{
-            TSDataType.STRING,
-            TSDataType.STRING,
-            TSDataType.STRING,
-            TSDataType.STRING,
-            TSDataType.FLOAT,
-            TSDataType.DOUBLE};
-        List<ColumnCategory> columnCategories =
-            new List<ColumnCategory>{
-            ColumnCategory.TAG,
-            ColumnCategory.TAG,
-            ColumnCategory.TAG,
-            ColumnCategory.ATTRIBUTE,
-            ColumnCategory.FIELD,
-            ColumnCategory.FIELD};
-        var values = new List<List<object>> { };
-        var timestamps = new List<long> { };
-        for (long timestamp = 0; timestamp < 100; timestamp++)
-        {
-            timestamps.Add(timestamp);
-            values.Add(new List<object> { "1", "5", "3", "A", 1.23F + timestamp, 111.1 + timestamp });
-        }
-        var tablet = new Tablet(tableName, columnNames, columnCategories, dataTypes, values, timestamps);
-
-        await tableSessionPool.InsertAsync(tablet);
-
-
-        res = await tableSessionPool.ExecuteQueryStatementAsync("select * from testTable1 "
-            + "where region_id = '1' and plant_id in ('3', '5') and device_id = '3'");
-        res.ShowTableNames();
-        while (res.HasNext()) Console.WriteLine(res.Next());
-        await res.Close();
-
-        await tableSessionPool.Close();
-    }
-
-
-    public async Task TestUseDatabase()
-    {
-        var tableSessionPool = new TableSessionPool.Builder()
-            .SetNodeUrls(sessionPoolTest.nodeUrls)
-            .SetUsername(sessionPoolTest.username)
-            .SetPassword(sessionPoolTest.password)
-            .SetDatabase("test1")
-            .SetFetchSize(1024)
-            .Build();
-
-        await tableSessionPool.Open(false);
-
-        if (sessionPoolTest.debug) tableSessionPool.OpenDebugMode();
-
-
-        // show tables from current database
-        var res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES");
-        res.ShowTableNames();
-        while (res.HasNext()) Console.WriteLine(res.Next());
-        await res.Close();
-
-        await tableSessionPool.ExecuteNonQueryStatementAsync("use test2");
-
-        // show tables from
current database - res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - await tableSessionPool.Close(); - } - - public async Task TestCleanup() - { - var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(sessionPoolTest.nodeUrls) - .SetUsername(sessionPoolTest.username) - .SetPassword(sessionPoolTest.password) - .SetFetchSize(1024) - .Build(); - - await tableSessionPool.Open(false); - - if (sessionPoolTest.debug) tableSessionPool.OpenDebugMode(); - - await tableSessionPool.ExecuteNonQueryStatementAsync("drop database test1"); - await tableSessionPool.ExecuteNonQueryStatementAsync("drop database test2"); - - await tableSessionPool.Close(); - } -} -``` diff --git a/src/UserGuide/Master/Table/API/Programming-Cpp-Native-API_timecho.md b/src/UserGuide/Master/Table/API/Programming-Cpp-Native-API_timecho.md deleted file mode 100644 index 9be6f1da9..000000000 --- a/src/UserGuide/Master/Table/API/Programming-Cpp-Native-API_timecho.md +++ /dev/null @@ -1,454 +0,0 @@ - - -# C++ Native API - -## 1. Dependencies - -- Java 8+ -- Flex -- Bison 2.7+ -- Boost 1.56+ -- OpenSSL 1.0+ -- GCC 5.5.0+ - -## 2. Installation - -### 2.1 Install Required Dependencies - -- **MAC** - 1. Install Bison: - - Use the following brew command to install the Bison version: - ```shell - brew install bison - ``` - - 2. Install Boost: Make sure to install the latest version of Boost. - - ```shell - brew install boost - ``` - - 3. Check OpenSSL: Make sure the OpenSSL library is installed. The default OpenSSL header file path is "/usr/local/opt/openssl/include". - - If you encounter errors related to OpenSSL not being found during compilation, try adding `-Dopenssl.include.dir=""`. 
- -- **Ubuntu 16.04+ or Other Debian-based Systems** - - Use the following commands to install dependencies: - - ```shell - sudo apt-get update - sudo apt-get install gcc g++ bison flex libboost-all-dev libssl-dev - ``` - -- **CentOS 7.7+/Fedora/Rocky Linux or Other Red Hat-based Systems** - - Use the yum command to install dependencies: - - ```shell - sudo yum update - sudo yum install gcc gcc-c++ boost-devel bison flex openssl-devel - ``` - -- **Windows** - - 1. Set Up the Build Environment - - Install MS Visual Studio (version 2019+ recommended): Make sure to select Visual Studio C/C++ IDE and compiler (supporting CMake, Clang, MinGW) during installation. - - Download and install [CMake](https://cmake.org/download/). - - 2. Download and Install Flex, Bison - - Download [Win_Flex_Bison](https://sourceforge.net/projects/winflexbison/). - - After downloading, rename the executables to flex.exe and bison.exe to ensure they can be found during compilation, and add the directory of these executables to the PATH environment variable. - - 3. Install Boost Library - - Download [Boost](https://www.boost.org/users/download/). - - Compile Boost locally: Run `bootstrap.bat` and `b2.exe` in sequence. - - Add the Boost installation directory to the PATH environment variable, e.g., `C:\Program Files (x86)\boost_1_78_0`. - - 4. Install OpenSSL - - Download and install [OpenSSL](http://slproweb.com/products/Win32OpenSSL.html). - - Add the include directory under the installation directory to the PATH environment variable. - -### 2.2 Compilation - -Clone the source code from git: -```shell -git clone https://github.com/apache/iotdb.git -``` - -The default main branch is the master branch. If you want to use a specific release version, switch to that branch (e.g., version 2.0.6): -```shell -git checkout rc/2.0.6 -``` -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. 
- -Run Maven to compile in the IoTDB root directory: - -- Mac or Linux with glibc version >= 2.32 - ```shell - ./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp - ``` - -- Linux with glibc version >= 2.31 - ```shell - ./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Diotdb-tools-thrift.version=0.14.1.1-old-glibc-SNAPSHOT - ``` - -- Linux with glibc version >= 2.17 - ```shell - ./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Diotdb-tools-thrift.version=0.14.1.1-glibc223-SNAPSHOT - ``` - -- Windows using Visual Studio 2022 - ```batch - .\mvnw.cmd clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp - ``` - -- Windows using Visual Studio 2019 - ```batch - .\mvnw.cmd clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Dcmake.generator="Visual Studio 16 2019" -Diotdb-tools-thrift.version=0.14.1.1-msvc142-SNAPSHOT - ``` - - If you haven't added the Boost library path to the PATH environment variable, you need to add the relevant parameters to the compile command, e.g., `-DboostIncludeDir="C:\Program Files (x86)\boost_1_78_0" -DboostLibraryDir="C:\Program Files (x86)\boost_1_78_0\stage\lib"`. - -After successful compilation, the packaged library files will be located in `iotdb-client/client-cpp/target`, and you can find the compiled example program under `example/client-cpp-example/target`. - -### 2.3 Compilation Q&A - -Q: What are the requirements for the environment on Linux? - -A: -- The known minimum version requirement for glibc (x86_64 version) is 2.17, and the minimum version for GCC is 5.5. -- The known minimum version requirement for glibc (ARM version) is 2.31, and the minimum version for GCC is 10.2. -- If the above requirements are not met, you can try compiling Thrift locally: - - Download the code from https://github.com/apache/iotdb-bin-resources/tree/iotdb-tools-thrift-v0.14.1.0/iotdb-tools-thrift. 
-    - Run `./mvnw clean install`.
-    - Go back to the IoTDB code directory and run `./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp`.
-
-Q: How to resolve the `undefined reference to '__libc_single_threaded'` error during Linux compilation?
-
-A:
-- This issue occurs because the precompiled Thrift dependencies require a newer version of glibc.
-- You can try adding `-Diotdb-tools-thrift.version=0.14.1.1-glibc223-SNAPSHOT` or `-Diotdb-tools-thrift.version=0.14.1.1-old-glibc-SNAPSHOT` to the Maven compile command.
-
-Q: What if I need to compile using Visual Studio 2017 or earlier on Windows?
-
-A:
-- You can try compiling Thrift locally before compiling the client:
-    - Download the code from https://github.com/apache/iotdb-bin-resources/tree/iotdb-tools-thrift-v0.14.1.0/iotdb-tools-thrift.
-    - Run `.\mvnw.cmd clean install`.
-    - Go back to the IoTDB code directory and run `.\mvnw.cmd clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Dcmake.generator="Visual Studio 15 2017"`.
-
-## 3. Usage
-
-### 3.1 TableSession Class
-
-All operations in the C++ client are performed through the TableSession class. The methods defined in the TableSession interface are described below.
-
-#### 3.1.1 Method List
-
-1. `insert(Tablet& tablet, bool sorted = false)`: Inserts a Tablet object containing time series data into the database. The `sorted` parameter indicates whether the rows in the tablet are already sorted by time.
-2. `executeNonQueryStatement(string& sql)`: Executes non-query SQL statements, such as DDL (Data Definition Language) or DML (Data Manipulation Language) commands.
-3. `executeQueryStatement(string& sql)`: Executes query SQL statements and returns a SessionDataSet object containing the query results. An overload takes an additional `timeoutInMs` parameter that sets the query timeout in milliseconds.
- * Note: When retrieving rows of query results by calling `SessionDataSet::next()`, you must store the returned `std::shared_ptr` object in a local variable (e.g., `auto row = dataSet->next();`) so that the row data stays alive while you use it. If you instead keep only a raw pointer taken from the temporary (e.g., `dataSet->next().get()`), the temporary smart pointer is destroyed at the end of that statement, its reference count drops to zero, the data is released immediately, and any later access is undefined behavior. -4. `open(bool enableRPCCompression = false)`: Opens the connection and determines whether to enable RPC compression (the client setting must match the server setting; disabled by default). -5. `close()`: Closes the connection. - -#### 3.1.2 Interface Display - -```cpp -class TableSession { -private: - Session* session; -public: - TableSession(Session* session) { - this->session = session; - } - void insert(Tablet& tablet, bool sorted = false); - void executeNonQueryStatement(const std::string& sql); - unique_ptr<SessionDataSet> executeQueryStatement(const std::string& sql); - unique_ptr<SessionDataSet> executeQueryStatement(const std::string& sql, int64_t timeoutInMs); - string getDatabase(); //Get the currently selected database, can be replaced by executeNonQueryStatement - void open(bool enableRPCCompression = false); - void close(); -}; -``` - -### 3.2 TableSessionBuilder Class - -The TableSessionBuilder class is a builder used to configure and create instances of the TableSession class. Through it, you can conveniently set connection parameters, query parameters, and other settings when creating instances. 
- -#### 3.2.1 Usage Example - -```cpp -// Set the connection IP, port, username, and password. -// The setters may be called in any order as long as build() is called last; the instance returned by build() is already connected (open() is called by default). -session = (new TableSessionBuilder()) - ->host("127.0.0.1") - ->rpcPort(6667) - ->username("root") - ->password("TimechoDB@2021") // before V2.0.6 the default password is root - ->build(); -``` - -#### 3.2.2 Configurable Parameter List - -| **Parameter Name** | **Description** | **Default Value** | -| :---: | :---: |:-------------------------------------------:| -| host | Set the connected node IP | "127.0.0.1" ("localhost") | -| rpcPort | Set the connected node port | 6667 | -| username | Set the connection username | "root" | -| password | Set the connection password | "TimechoDB@2021" (before V2.0.6: "root") | -| zoneId | Set the ZoneId related to timezone | "" | -| fetchSize | Set the query result fetch size | 10000 | -| database | Set the target database name | "" | - -## 4. Examples - -Sample code using these interfaces is in: - -- `example/client-cpp-example/src/TableModelSessionExample.cpp`: [TableModelSessionExample](https://github.com/apache/iotdb/blob/master/example/client-cpp-example/src/TableModelSessionExample.cpp) - -If the compilation finishes successfully, the example project will be placed under `example/client-cpp-example/target`. - - -```cpp -#include "TableSession.h" -#include "TableSessionBuilder.h" - -using namespace std; - -shared_ptr<TableSession> session; - -void insertRelationalTablet() { - - vector<pair<string, TSDataType::TSDataType>> schemaList { - make_pair("region_id", TSDataType::TEXT), - make_pair("plant_id", TSDataType::TEXT), - make_pair("device_id", TSDataType::TEXT), - make_pair("model", TSDataType::TEXT), - make_pair("temperature", TSDataType::FLOAT), - make_pair("humidity", TSDataType::DOUBLE) - }; - - vector<ColumnCategory> columnTypes = { - ColumnCategory::TAG, - ColumnCategory::TAG, - ColumnCategory::TAG, - ColumnCategory::ATTRIBUTE, - ColumnCategory::FIELD, - ColumnCategory::FIELD 
- }; - - Tablet tablet("table1", schemaList, columnTypes, 100); - - for (int row = 0; row < 100; row++) { - int rowIndex = tablet.rowSize++; - tablet.timestamps[rowIndex] = row; - - // Using index-based API is more efficient than column name lookup - // Prefer: tablet.addValue(0, rowIndex, "1"); - // Avoid: tablet.addValue("region_id", rowIndex, "1"); - tablet.addValue(0, rowIndex, "1"); // region_id - tablet.addValue(1, rowIndex, "5"); // plant_id - tablet.addValue(2, rowIndex, "3"); // device_id - tablet.addValue(3, rowIndex, "A"); // model - tablet.addValue(4, rowIndex, 37.6F); // temperature - tablet.addValue(5, rowIndex, 111.1); // humidity - if (tablet.rowSize == tablet.maxRowNumber) { - session->insert(tablet); - tablet.reset(); - } - } - - if (tablet.rowSize != 0) { - session->insert(tablet); - tablet.reset(); - } -} - -void Output(unique_ptr<SessionDataSet> &dataSet) { - for (const string &name: dataSet->getColumnNames()) { - cout << name << " "; - } - cout << endl; - while (dataSet->hasNext()) { - cout << dataSet->next()->toString(); - } - cout << endl; -} - -void OutputWithType(unique_ptr<SessionDataSet> &dataSet) { - for (const string &name: dataSet->getColumnNames()) { - cout << name << " "; - } - cout << endl; - for (const string &type: dataSet->getColumnTypeList()) { - cout << type << " "; - } - cout << endl; - while (dataSet->hasNext()) { - cout << dataSet->next()->toString(); - } - cout << endl; -} - -int main() { - try { - session = (new TableSessionBuilder()) - ->host("127.0.0.1") - ->rpcPort(6667) - ->username("root") - ->password("root") - ->build(); - - cout << "[Create Database db1,db2]\n" << endl; - try { - session->executeNonQueryStatement("CREATE DATABASE IF NOT EXISTS db1"); - session->executeNonQueryStatement("CREATE DATABASE IF NOT EXISTS db2"); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Use db1 as database]\n" << endl; - try { - session->executeNonQueryStatement("USE db1"); - } catch (IoTDBException &e) { - cout << e.what() << endl; 
- } - - cout << "[Create Table table1,table2]\n" << endl; - try { - session->executeNonQueryStatement("create table db1.table1(region_id STRING TAG, plant_id STRING TAG, device_id STRING TAG, model STRING ATTRIBUTE, temperature FLOAT FIELD, humidity DOUBLE FIELD) with (TTL=3600000)"); - session->executeNonQueryStatement("create table db2.table2(region_id STRING TAG, plant_id STRING TAG, color STRING ATTRIBUTE, temperature FLOAT FIELD, speed DOUBLE FIELD) with (TTL=6600000)"); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Show Tables]\n" << endl; - try { - unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SHOW TABLES"); - Output(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Show tables from specific database]\n" << endl; - try { - unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SHOW TABLES FROM db1"); - Output(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[InsertTablet]\n" << endl; - try { - insertRelationalTablet(); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Query Table Data]\n" << endl; - try { - unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SELECT * FROM table1" - " where region_id = '1' and plant_id in ('3', '5') and device_id = '3'"); - OutputWithType(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - session->close(); - - // specify database in constructor - session = (new TableSessionBuilder()) - ->host("127.0.0.1") - ->rpcPort(6667) - ->username("root") - ->password("root") - ->database("db1") - ->build(); - - cout << "[Show tables from current database(db1)]\n" << endl; - try { - unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SHOW TABLES"); - Output(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Change database to db2]\n" << endl; - try { - session->executeNonQueryStatement("USE db2"); - } catch (IoTDBException &e) { - 
cout << e.what() << endl; - } - - cout << "[Show tables from current database(db2)]\n" << endl; - try { - unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SHOW TABLES"); - Output(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Drop Database db1,db2]\n" << endl; - try { - session->executeNonQueryStatement("DROP DATABASE db1"); - session->executeNonQueryStatement("DROP DATABASE db2"); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "session close\n" << endl; - session->close(); - - cout << "finished!\n" << endl; - } catch (IoTDBConnectionException &e) { - cout << e.what() << endl; - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - return 0; -} -``` - - -## 5. FAQ - -### 5.1 on Mac - -If errors occur when compiling the Thrift source code, try downgrading the Xcode command-line tools from 12 to 11.5. - -See https://stackoverflow.com/questions/63592445/ld-unsupported-tapi-file-type-tapi-tbd-in-yaml-file/65518087#65518087 - - -### 5.2 on Windows - -When building Thrift and downloading packages via "wget", the build may fail with an error message like: -```shell -Failed to delete cached file C:\Users\Administrator\.m2\repository\.cache\download-maven-plugin\index.ser -``` -Possible fixes: -- Delete the ".m2\repository\\.cache\" directory and try again. -- Add the "<skipCache>true</skipCache>" configuration to the download-maven-plugin Maven phase that reports this error. - diff --git a/src/UserGuide/Master/Table/API/Programming-Go-Native-API_timecho.md b/src/UserGuide/Master/Table/API/Programming-Go-Native-API_timecho.md deleted file mode 100644 index 41ab86372..000000000 --- a/src/UserGuide/Master/Table/API/Programming-Go-Native-API_timecho.md +++ /dev/null @@ -1,576 +0,0 @@ - - -# Go Native API - -The Git repository for the Go Native API client is located [here](https://github.com/apache/iotdb-client-go/). - -## 1. 
Usage -### 1.1 Dependencies - -* golang >= 1.13 -* make >= 3.0 -* curl >= 7.1.1 -* thrift 0.15.0 -* Linux, macOS, or other Unix-like systems -* Windows + bash (WSL, Cygwin, Git Bash) - -### 1.2 Installation - -* go mod - -```sh -export GO111MODULE=on -export GOPROXY=https://goproxy.io - -mkdir session_example && cd session_example - -curl -o session_example.go -L https://github.com/apache/iotdb-client-go/raw/main/example/session_example.go - -go mod init session_example -go run session_example.go -``` - -* GOPATH - -```sh -# get thrift 0.15.0 -go get github.com/apache/thrift -cd $GOPATH/src/github.com/apache/thrift -git checkout 0.15.0 - -mkdir -p $GOPATH/src/iotdb-client-go-example/session_example -cd $GOPATH/src/iotdb-client-go-example/session_example -curl -o session_example.go -L https://github.com/apache/iotdb-client-go/raw/main/example/session_example.go -go run session_example.go -``` -* Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. - -## 2. ITableSession Interface -### 2.1 Description - -`ITableSession` defines the core operations for interacting with IoTDB tables, including data insertion, query execution, and session closure. It is not thread-safe. - -### 2.2 Method List - -| **Method Name** | **Description** | **Parameters** | **Return Value** | **Return Error** | -| -------------------------------------------------------------- | -------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- | ----------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- | -| `Insert(tablet *Tablet)` | Inserts a `Tablet` containing time-series data into the database. | `tablet`: A pointer to a Tablet containing time-series data to be inserted. | A pointer to TSStatus indicating the execution result. 
| An error if an issue occurs during the operation, such as a connection error or execution failure. | -| `ExecuteNonQueryStatement(sql string)` | Executes non-query SQL statements such as DDL or DML commands. | `sql`: The SQL statement to execute. | A pointer to TSStatus indicating the execution result. | An error if an issue occurs during the operation, such as a connection error or execution failure. | -| `ExecuteQueryStatement(sql string, timeoutInMs *int64)` | Executes a query SQL statement with a specified timeout in milliseconds. | `sql`: The SQL query statement. `timeoutInMs`: A pointer to the query timeout in milliseconds. | A pointer to SessionDataSet containing the query results. | An error if an issue occurs during the operation, such as a connection error or execution failure. | -| `Close()` | Closes the session and releases resources. | None | None | An error if there is an issue with closing the IoTDB connection. | - -### 2.3 Interface Definition -1. ITableSession - -```go -// ITableSession defines an interface for interacting with IoTDB tables. -// It supports operations such as data insertion, executing queries, and closing the session. -// Implementations of this interface are expected to manage connections and ensure -// proper resource cleanup. -// -// Each method may return an error to indicate issues such as connection errors -// or execution failures. -// -// Since this interface includes a Close method, it is recommended to use -// defer to ensure the session is properly closed. -type ITableSession interface { - - // Insert inserts a Tablet into the database. - // - // Parameters: - // - tablet: A pointer to a Tablet containing time-series data to be inserted. - // - // Returns: - // - r: A pointer to TSStatus indicating the execution result. - // - err: An error if an issue occurs during the operation, such as a connection error or execution failure. 
- Insert(tablet *Tablet) (r *common.TSStatus, err error) - - // ExecuteNonQueryStatement executes a non-query SQL statement, such as a DDL or DML command. - // - // Parameters: - // - sql: The SQL statement to execute. - // - // Returns: - // - r: A pointer to TSStatus indicating the execution result. - // - err: An error if an issue occurs during the operation, such as a connection error or execution failure. - ExecuteNonQueryStatement(sql string) (r *common.TSStatus, err error) - - // ExecuteQueryStatement executes a query SQL statement and returns the result set. - // - // Parameters: - // - sql: The SQL query statement to execute. - // - timeoutInMs: A pointer to the timeout duration in milliseconds for the query execution. - // - // Returns: - // - result: A pointer to SessionDataSet containing the query results. - // - err: An error if an issue occurs during the operation, such as a connection error or execution failure. - ExecuteQueryStatement(sql string, timeoutInMs *int64) (*SessionDataSet, error) - - // Close closes the session, releasing any held resources. - // - // Returns: - // - err: An error if there is an issue with closing the IoTDB connection. - Close() (err error) -} -``` - -2. Constructing a TableSession - -* There’s no need to manually set the `sqlDialect` field in the `Config`structs. This is automatically handled by the corresponding `NewSession` function during initialization. Simply use the appropriate constructor based on your use case (single-node or cluster). 
- -```Go -type Config struct { - Host string - Port string - UserName string - Password string - FetchSize int32 - TimeZone string - ConnectRetryMax int - sqlDialect string - Version Version - Database string -} - -type ClusterConfig struct { - NodeUrls []string //ip:port - UserName string - Password string - FetchSize int32 - TimeZone string - ConnectRetryMax int - sqlDialect string - Database string -} - -// NewTableSession creates a new TableSession instance using the provided configuration. -// -// Parameters: -// - config: The configuration for the session. -// - enableRPCCompression: A boolean indicating whether RPC compression is enabled. -// - connectionTimeoutInMs: The timeout in milliseconds for establishing a connection. -// -// Returns: -// - An ITableSession instance if the session is successfully created. -// - An error if there is an issue during session initialization. -func NewTableSession(config *Config, enableRPCCompression bool, connectionTimeoutInMs int) (ITableSession, error) - -// NewClusterTableSession creates a new TableSession instance for a cluster setup. -// -// Parameters: -// - clusterConfig: The configuration for the cluster session. -// - enableRPCCompression: A boolean indicating whether RPC compression is enabled. -// -// Returns: -// - An ITableSession instance if the session is successfully created. -// - An error if there is an issue during session initialization. -func NewClusterTableSession(clusterConfig *ClusterConfig, enableRPCCompression bool) (ITableSession, error) -``` - -> Note: -> -> When creating a `TableSession` via `NewTableSession` or `NewClusterTableSession`, the connection is already established; no additional `Open` operation is required. 
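The interface comment in 2.3 recommends `defer` for closing the session. One detail worth keeping in mind is that Go's `log.Fatal` exits the process without running deferred calls, so a helper that calls `log.Fatal` on error will skip a deferred `Close`. The following is a minimal sketch of an alternative that returns errors to the caller instead. It does not use the real client: `statementRunner`, `runAll`, `setup`, and `fakeSession` are hypothetical names introduced only for illustration, and the single-`error` return form of `ExecuteNonQueryStatement` is an assumption made for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// statementRunner is a hypothetical, narrowed stand-in for the session type:
// only the two methods this sketch needs, with an assumed single-error return.
type statementRunner interface {
	ExecuteNonQueryStatement(sql string) error
	Close() error
}

// runAll executes the statements in order and stops at the first failure,
// returning the error to the caller instead of exiting the process.
func runAll(s statementRunner, stmts ...string) error {
	for _, sql := range stmts {
		if err := s.ExecuteNonQueryStatement(sql); err != nil {
			return fmt.Errorf("executing %q: %w", sql, err)
		}
	}
	return nil
}

// setup closes the session on every path, including the error path.
func setup(s statementRunner) (err error) {
	defer func() {
		// Keep the first error; report the Close error only if nothing else failed.
		if cerr := s.Close(); err == nil {
			err = cerr
		}
	}()
	return runAll(s,
		"create database test_db",
		"use test_db",
	)
}

// fakeSession is an in-memory stand-in used only to make the sketch runnable.
type fakeSession struct {
	failOn string   // statement that should fail
	ran    []string // statements that succeeded
	closed bool
}

func (f *fakeSession) ExecuteNonQueryStatement(sql string) error {
	if sql == f.failOn {
		return errors.New("simulated failure")
	}
	f.ran = append(f.ran, sql)
	return nil
}

func (f *fakeSession) Close() error {
	f.closed = true
	return nil
}

func main() {
	s := &fakeSession{failOn: "use test_db"}
	err := setup(s)
	fmt.Println("error:", err)
	fmt.Println("closed:", s.closed)
}
```

With the real client, the same shape would apply to a session obtained from `NewTableSession` or `NewClusterTableSession`; only the stand-in interface and the fake would be dropped.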
- -### 2.4 Example - -```go -package main - -import ( - "flag" - "log" - "math/rand" - "strconv" - "time" - - "github.com/apache/iotdb-client-go/v2/client" -) - -func main() { - flag.Parse() - config := &client.Config{ - Host: "127.0.0.1", - Port: "6667", - UserName: "root", - Password: "root", - Database: "test_session", - } - session, err := client.NewTableSession(config, false, 0) - if err != nil { - log.Fatal(err) - } - defer session.Close() - - checkError(session.ExecuteNonQueryStatement("create database test_db")) - checkError(session.ExecuteNonQueryStatement("use test_db")) - checkError(session.ExecuteNonQueryStatement("create table t1 (tag1 string tag, tag2 string tag, s1 text field, s2 text field)")) - insertRelationalTablet(session) - showTables(session) - query(session) -} - -func getTextValueFromDataSet(dataSet *client.SessionDataSet, columnName string) string { - if isNull, err := dataSet.IsNull(columnName); err != nil { - log.Fatal(err) - } else if isNull { - return "null" - } - v, err := dataSet.GetString(columnName) - if err != nil { - log.Fatal(err) - } - return v -} - -func insertRelationalTablet(session client.ITableSession) { - tablet, err := client.NewRelationalTablet("t1", []*client.MeasurementSchema{ - { - Measurement: "tag1", - DataType: client.STRING, - }, - { - Measurement: "tag2", - DataType: client.STRING, - }, - { - Measurement: "s1", - DataType: client.TEXT, - }, - { - Measurement: "s2", - DataType: client.TEXT, - }, - }, []client.ColumnCategory{client.TAG, client.TAG, client.FIELD, client.FIELD}, 1024) - if err != nil { - log.Fatalf("Failed to create relational tablet: %v", err) - } - ts := time.Now().UTC().UnixNano() / 1000000 - for row := 0; row < 16; row++ { - ts++ - tablet.SetTimestamp(ts, row) - tablet.SetValueAt("tag1_value_"+strconv.Itoa(row), 0, row) - tablet.SetValueAt("tag2_value_"+strconv.Itoa(row), 1, row) - tablet.SetValueAt("s1_value_"+strconv.Itoa(row), 2, row) - tablet.SetValueAt("s2_value_"+strconv.Itoa(row), 3, row) - 
tablet.RowSize++ - } - checkError(session.Insert(tablet)) - - tablet.Reset() - - for row := 0; row < 16; row++ { - ts++ - tablet.SetTimestamp(ts, row) - tablet.SetValueAt("tag1_value_1", 0, row) - tablet.SetValueAt("tag2_value_1", 1, row) - tablet.SetValueAt("s1_value_"+strconv.Itoa(row), 2, row) - tablet.SetValueAt("s2_value_"+strconv.Itoa(row), 3, row) - - // overwrite one randomly chosen column with a null value - nullValueColumn := rand.Intn(4) - tablet.SetValueAt(nil, nullValueColumn, row) - tablet.RowSize++ - } - checkError(session.Insert(tablet)) -} - -func showTables(session client.ITableSession) { - timeout := int64(2000) - dataSet, err := session.ExecuteQueryStatement("show tables", &timeout) - if err != nil { - log.Fatal(err) - } - // defer Close only after the error check, when dataSet is known to be valid - defer dataSet.Close() - for { - hasNext, err := dataSet.Next() - if err != nil { - log.Fatal(err) - } - if !hasNext { - break - } - value, err := dataSet.GetString("TableName") - if err != nil { - log.Fatal(err) - } - log.Printf("tableName is %v", value) - } -} - -func query(session client.ITableSession) { - timeout := int64(2000) - dataSet, err := session.ExecuteQueryStatement("select * from t1", &timeout) - if err != nil { - log.Fatal(err) - } - // defer Close only after the error check, when dataSet is known to be valid - defer dataSet.Close() - for { - hasNext, err := dataSet.Next() - if err != nil { - log.Fatal(err) - } - if !hasNext { - break - } - log.Printf("%v %v %v %v", getTextValueFromDataSet(dataSet, "tag1"), getTextValueFromDataSet(dataSet, "tag2"), getTextValueFromDataSet(dataSet, "s1"), getTextValueFromDataSet(dataSet, "s2")) - } -} - -func checkError(err error) { - if err != nil { - log.Fatal(err) - } -} -``` - -## 3. TableSessionPool Interface -### 3.1 Description - -`TableSessionPool` manages a pool of `ITableSession` instances for efficient connection reuse and resource cleanup. 
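The borrow-and-return cycle that a session pool implies can be sketched as a small helper that acquires a session, runs a callback, and always hands the session back. This is an illustration under stated assumptions rather than the client's own API: `pooledSession`, `sessionSource`, `withSession`, `fakePool`, and `fakePooledSession` are hypothetical names, the fake pool fails immediately when empty instead of waiting for a timeout, and it is not safe for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
)

// pooledSession and sessionSource are hypothetical, narrowed stand-ins for the
// client's session and pool types; only the methods used below are listed.
type pooledSession interface {
	ExecuteNonQueryStatement(sql string) error
	Close() error // for a pooled session, Close hands it back to the pool
}

type sessionSource interface {
	GetSession() (pooledSession, error)
}

// withSession borrows a session, runs fn, and always returns the session to
// the pool, whether fn succeeds or fails.
func withSession(src sessionSource, fn func(pooledSession) error) error {
	s, err := src.GetSession()
	if err != nil {
		return fmt.Errorf("acquiring session: %w", err)
	}
	defer s.Close()
	return fn(s)
}

// fakePool is an in-memory stand-in with a fixed capacity, used only to make
// the sketch runnable without a server.
type fakePool struct {
	size  int // maximum number of sessions that may be borrowed at once
	inUse int // sessions currently borrowed
}

type fakePooledSession struct{ pool *fakePool }

func (p *fakePool) GetSession() (pooledSession, error) {
	if p.inUse >= p.size {
		return nil, errors.New("no session available")
	}
	p.inUse++
	return &fakePooledSession{pool: p}, nil
}

func (s *fakePooledSession) ExecuteNonQueryStatement(sql string) error { return nil }

func (s *fakePooledSession) Close() error {
	s.pool.inUse--
	return nil
}

func main() {
	pool := &fakePool{size: 1}
	for i := 0; i < 3; i++ {
		err := withSession(pool, func(s pooledSession) error {
			return s.ExecuteNonQueryStatement(fmt.Sprintf("create table t%d (tag1 string tag, s1 text field)", i))
		})
		if err != nil {
			fmt.Println("statement failed:", err)
		}
	}
	fmt.Println("sessions still borrowed:", pool.inUse)
}
```

Because the helper returns early when no session can be acquired, the callback never sees a nil session, and a pool of size one can serve any number of sequential callers.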
- -### 3.2 Method List - -| **Method Name** | **Description** | **Return Value** | **Return Error** | -| ----------------------- | ------------------------------------------------------------ | ------------------------------------------------------------- | ------------------------------------------- | -| `GetSession()` | Acquires a session from the pool for database interaction. | A usable ITableSession instance for interacting with IoTDB. | An error if a session cannot be acquired. | -| `Close()` | Closes the session pool and releases resources. | None | None | - -### 3.3 Interface Definition -1. TableSessionPool - -```Go -// TableSessionPool manages a pool of ITableSession instances, enabling efficient -// reuse and management of resources. It provides methods to acquire a session -// from the pool and to close the pool, releasing all held resources. -// -// This implementation ensures proper lifecycle management of sessions, -// including efficient reuse and cleanup of resources. - -// GetSession acquires an ITableSession instance from the pool. -// -// Returns: -// - A usable ITableSession instance for interacting with IoTDB. -// - An error if a session cannot be acquired. -func (spool *TableSessionPool) GetSession() (ITableSession, error) { - return spool.sessionPool.getTableSession() -} - -// Close closes the TableSessionPool, releasing all held resources. -// Once closed, no further sessions can be acquired from the pool. -func (spool *TableSessionPool) Close() -``` - -2. Constructing a TableSessionPool - -```Go -type PoolConfig struct { - Host string - Port string - NodeUrls []string - UserName string - Password string - FetchSize int32 - TimeZone string - ConnectRetryMax int - Database string - sqlDialect string -} - -// NewTableSessionPool creates a new TableSessionPool with the specified configuration. -// -// Parameters: -// - conf: PoolConfig defining the configuration for the pool. -// - maxSize: The maximum number of sessions the pool can hold. 
-// - connectionTimeoutInMs: Timeout for establishing a connection in milliseconds. -// - waitToGetSessionTimeoutInMs: Timeout for waiting to acquire a session in milliseconds. -// - enableCompression: A boolean indicating whether to enable compression. -// -// Returns: -// - A TableSessionPool instance. -func NewTableSessionPool(conf *PoolConfig, maxSize, connectionTimeoutInMs, waitToGetSessionTimeoutInMs int, - enableCompression bool) TableSessionPool -``` - -> Note: -> -> * If a `Database` is specified when creating the `TableSessionPool`, all sessions acquired from the pool will automatically use this database. There is no need to explicitly set the database during operations. -> * Automatic State Reset: If a session temporarily switches to another database using `USE DATABASE` during usage, the session will automatically revert to the original database specified in the pool when closed and returned to the pool. - -### 3.4 Example - -```go -package main - -import ( - "log" - "strconv" - "sync" - "sync/atomic" - "time" - - "github.com/apache/iotdb-client-go/v2/client" -) - -func main() { - sessionPoolWithSpecificDatabaseExample() - sessionPoolWithoutSpecificDatabaseExample() - putBackToSessionPoolExample() -} - -func putBackToSessionPoolExample() { - // should create database test_db before executing - config := &client.PoolConfig{ - Host: "127.0.0.1", - Port: "6667", - UserName: "root", - Password: "root", - Database: "test_db", - } - sessionPool := client.NewTableSessionPool(config, 3, 60000, 4000, false) - defer sessionPool.Close() - - num := 4 - successGetSessionNum := int32(0) - var wg sync.WaitGroup - wg.Add(num) - for i := 0; i < num; i++ { - dbName := "db" + strconv.Itoa(i) - go func() { - defer wg.Done() - session, err := sessionPool.GetSession() - if err != nil { - log.Println("Failed to create database "+dbName+"because ", err) - return - } - atomic.AddInt32(&successGetSessionNum, 1) - defer func() { - time.Sleep(6 * time.Second) - // put back to 
session pool - session.Close() - }() - checkError(session.ExecuteNonQueryStatement("create database " + dbName)) - checkError(session.ExecuteNonQueryStatement("use " + dbName)) - checkError(session.ExecuteNonQueryStatement("create table table_of_" + dbName + " (tag1 string tag, tag2 string tag, s1 text field, s2 text field)")) - }() - } - wg.Wait() - log.Println("success num is", successGetSessionNum) - - log.Println("All sessions' databases have been reset.") - // the database in use automatically resets to the session pool's database after the session is closed - wg.Add(5) - for i := 0; i < 5; i++ { - go func() { - defer wg.Done() - session, err := sessionPool.GetSession() - if err != nil { - log.Println("Failed to get session because ", err) - // without a session there is nothing to query or close - return - } - defer session.Close() - timeout := int64(3000) - dataSet, err := session.ExecuteQueryStatement("show tables", &timeout) - if err != nil { - log.Println("Failed to show tables because ", err) - return - } - for { - hasNext, err := dataSet.Next() - if err != nil { - log.Fatal(err) - } - if !hasNext { - break - } - value, err := dataSet.GetString("TableName") - if err != nil { - log.Fatal(err) - } - log.Println("table is", value) - } - dataSet.Close() - }() - } - wg.Wait() -} - -func sessionPoolWithSpecificDatabaseExample() { - // should create database test_db before executing - config := &client.PoolConfig{ - Host: "127.0.0.1", - Port: "6667", - UserName: "root", - Password: "root", - Database: "test_db", - } - sessionPool := client.NewTableSessionPool(config, 3, 60000, 8000, false) - defer sessionPool.Close() - num := 10 - var wg sync.WaitGroup - wg.Add(num) - for i := 0; i < num; i++ { - tableName := "t" + strconv.Itoa(i) - go func() { - defer wg.Done() - session, err := sessionPool.GetSession() - if err != nil { - log.Println("Failed to create table "+tableName+" because ", err) - return - } - defer session.Close() - checkError(session.ExecuteNonQueryStatement("create table " + tableName + " (tag1 string tag, tag2 string tag, s1 text field, s2 text field)")) - }() - } - wg.Wait() -} - -func 
sessionPoolWithoutSpecificDatabaseExample() { - config := &client.PoolConfig{ - Host: "127.0.0.1", - Port: "6667", - UserName: "root", - Password: "root", - } - sessionPool := client.NewTableSessionPool(config, 3, 60000, 8000, false) - defer sessionPool.Close() - num := 10 - var wg sync.WaitGroup - wg.Add(num) - for i := 0; i < num; i++ { - dbName := "db" + strconv.Itoa(i) - go func() { - defer wg.Done() - session, err := sessionPool.GetSession() - if err != nil { - log.Println("Failed to create database ", dbName, err) - return - } - defer session.Close() - checkError(session.ExecuteNonQueryStatement("create database " + dbName)) - checkError(session.ExecuteNonQueryStatement("use " + dbName)) - checkError(session.ExecuteNonQueryStatement("create table t1 (tag1 string tag, tag2 string tag, s1 text field, s2 text field)")) - }() - } - wg.Wait() -} - -func checkError(err error) { - if err != nil { - log.Fatal(err) - } -} -``` - diff --git a/src/UserGuide/Master/Table/API/Programming-JDBC_timecho.md b/src/UserGuide/Master/Table/API/Programming-JDBC_timecho.md deleted file mode 100644 index 6b85ccab2..000000000 --- a/src/UserGuide/Master/Table/API/Programming-JDBC_timecho.md +++ /dev/null @@ -1,189 +0,0 @@ - -# JDBC - -The IoTDB JDBC provides a standardized way to interact with the IoTDB database, allowing users to execute SQL statements from Java programs for managing databases and time-series data. It supports operations such as connecting to the database, creating, querying, updating, and deleting data, as well as batch insertion and querying of time-series data. - -**Note:** The current JDBC implementation is designed primarily for integration with third-party tools. High-performance writing **may not be achieved** when using JDBC for insert operations. For Java applications, it is recommended to use the **JAVA Native API** for optimal performance. - -## 1. 
Prerequisites - -### 1.1 **Environment Requirements** - -- **JDK:** Version 1.8 or higher -- **Maven:** Version 3.6 or higher - -### 1.2 **Adding Maven Dependencies** - -Add the following dependency to your Maven `pom.xml` file: - -```XML -<dependencies> - <dependency> - <groupId>com.timecho.iotdb</groupId> - <artifactId>iotdb-session</artifactId> - <version>2.0.1.1</version> - </dependency> -</dependencies> -``` -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. - -## 2. Read and Write Operations - -**Write Operations:** Perform database operations such as inserting data, creating databases, and creating time-series using the `execute` method. - -**Read Operations:** Execute queries using the `executeQuery` method and retrieve results via the `ResultSet` object. - -### 2.1 Method Overview - -| **Method Name** | **Description** | **Parameters** | **Return Value** | -| ------------------------------------------------------------ | ----------------------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------- | -| Class.forName(String driver) | Loads the JDBC driver class | `driver`: Name of the JDBC driver class | `Class`: Loaded class object | -| DriverManager.getConnection(String url, String username, String password) | Establishes a database connection | `url`: Database URL; `username`: Username; `password`: Password | `Connection`: Database connection object | -| Connection.createStatement() | Creates a `Statement` object for executing SQL statements | None | `Statement`: SQL execution object | -| Statement.execute(String sql) | Executes a non-query SQL statement | `sql`: SQL statement to execute | `boolean`: Indicates if a `ResultSet` is returned | -| Statement.executeQuery(String sql) | Executes a query SQL statement and retrieves the result set | `sql`: SQL query statement | `ResultSet`: Query result set | -| ResultSet.getMetaData() | Retrieves metadata of the result set | None | `ResultSetMetaData`: Metadata 
object | -| ResultSet.next() | Moves to the next row in the result set | None | `boolean`: Whether the move was successful | -| ResultSet.getString(int columnIndex) | Retrieves the string value of a specified column | `columnIndex`: Column index (starting from 1) | `String`: Column value | - -## 3. Sample Code - -**Note:** When using the Table Mode, you must specify the `sql_dialect` parameter as `table` in the URL. Example: - -```Java -String url = "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table"; -``` - -You can find the full example code at [GitHub Repository](https://github.com/apache/iotdb/blob/rc/2.0.1/example/jdbc/src/main/java/org/apache/iotdb/TableModelJDBCExample.java). - -Here is an excerpt of the sample code: - -```Java -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. 
- */ - -package org.apache.iotdb; - -import org.apache.iotdb.jdbc.IoTDBSQLException; - -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; - -import java.sql.Connection; -import java.sql.DriverManager; -import java.sql.ResultSet; -import java.sql.ResultSetMetaData; -import java.sql.SQLException; -import java.sql.Statement; - -public class TableModelJDBCExample { - - private static final Logger LOGGER = LoggerFactory.getLogger(TableModelJDBCExample.class); - - public static void main(String[] args) throws ClassNotFoundException, SQLException { - Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); - - // don't specify database in url - try (Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table", "root", "TimechoDB@2021"); //before V2.0.6 it is root - Statement statement = connection.createStatement()) { - - statement.execute("CREATE DATABASE test1"); - statement.execute("CREATE DATABASE test2"); - - statement.execute("use test2"); - - // or use full qualified table name - statement.execute( - "create table test1.table1(region_id STRING TAG, plant_id STRING TAG, device_id STRING TAG, model STRING ATTRIBUTE, temperature FLOAT FIELD, humidity DOUBLE FIELD) with (TTL=3600000)"); - - statement.execute( - "create table table2(region_id STRING TAG, plant_id STRING TAG, color STRING ATTRIBUTE, temperature FLOAT FIELD, speed DOUBLE FIELD) with (TTL=6600000)"); - - // show tables from current database - try (ResultSet resultSet = statement.executeQuery("SHOW TABLES")) { - ResultSetMetaData metaData = resultSet.getMetaData(); - System.out.println(metaData.getColumnCount()); - while (resultSet.next()) { - System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2)); - } - } - - // show tables by specifying another database - // using SHOW tables FROM - try (ResultSet resultSet = statement.executeQuery("SHOW TABLES FROM test1")) { - ResultSetMetaData metaData = resultSet.getMetaData(); - 
System.out.println(metaData.getColumnCount()); - while (resultSet.next()) { - System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2)); - } - } - - } catch (IoTDBSQLException e) { - LOGGER.error("IoTDB Jdbc example error", e); - } - - // specify database in url - try (Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667/test1?sql_dialect=table", "root", "TimechoDB@2021"); //before V2.0.6 it is root - Statement statement = connection.createStatement()) { - // show tables from current database test1 - try (ResultSet resultSet = statement.executeQuery("SHOW TABLES")) { - ResultSetMetaData metaData = resultSet.getMetaData(); - System.out.println(metaData.getColumnCount()); - while (resultSet.next()) { - System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2)); - } - } - - // change database to test2 - statement.execute("use test2"); - - try (ResultSet resultSet = statement.executeQuery("SHOW TABLES")) { - ResultSetMetaData metaData = resultSet.getMetaData(); - System.out.println(metaData.getColumnCount()); - while (resultSet.next()) { - System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2)); - } - } - } - } -} -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Table/API/Programming-Java-Native-API_timecho.md b/src/UserGuide/Master/Table/API/Programming-Java-Native-API_timecho.md deleted file mode 100644 index ec19bf235..000000000 --- a/src/UserGuide/Master/Table/API/Programming-Java-Native-API_timecho.md +++ /dev/null @@ -1,851 +0,0 @@ - -# Java Native API - -## 1. Function Introduction - -IoTDB provides a Java native client driver and a session pool management mechanism. These tools enable developers to interact with IoTDB using object-oriented APIs, allowing time-series objects to be directly assembled and inserted into the database without constructing SQL statements. 
It is recommended to use the `ITableSessionPool` for multi-threaded database operations to maximize efficiency. - -## 2. Usage Instructions - -**Environment Requirements** - -- **JDK**: Version 1.8 or higher -- **Maven**: Version 3.6 or higher - -**Adding Maven Dependencies** - -```XML -<dependencies> - <dependency> - <groupId>com.timecho.iotdb</groupId> - <artifactId>iotdb-session</artifactId> - <version>${project.version}</version> - </dependency> -</dependencies> -``` -* The latest version of `iotdb-session` can be viewed [here](https://repo1.maven.org/maven2/com/timecho/iotdb/iotdb-session/) -* Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. - -## 3. Read and Write Operations - -### 3.1 ITableSession Interface - -#### 3.1.1 Feature Description - -The `ITableSession` interface defines basic operations for interacting with IoTDB, including data insertion, query execution, and session closure. Note that this interface is **not thread-safe**. - -#### 3.1.2 Method Overview - -| **Method Name** | **Description** | **Parameters** | **Return Value** | **Exceptions** | -| --------------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ---------------- | --------------------------------------------------------- | -| insert(Tablet tablet) | Inserts a `Tablet` containing time-series data into the database. | `tablet`: The `Tablet` object to be inserted. | None | `StatementExecutionException`, `IoTDBConnectionException` | -| executeNonQueryStatement(String sql) | Executes non-query SQL statements such as DDL or DML commands. | `sql`: The SQL statement to execute. | None | `StatementExecutionException`, `IoTDBConnectionException` | -| executeQueryStatement(String sql) | Executes a query SQL statement and returns a `SessionDataSet` containing the query results. | `sql`: The SQL query statement to execute.
| `SessionDataSet` | `StatementExecutionException`, `IoTDBConnectionException` | -| executeQueryStatement(String sql, long timeoutInMs) | Executes a query SQL statement with a specified timeout in milliseconds. | `sql`: The SQL query statement. `timeoutInMs`: Query timeout in milliseconds. | `SessionDataSet` | `StatementExecutionException`, `IoTDBConnectionException` | -| close() | Closes the session and releases resources. | None | None | `IoTDBConnectionException` | - -**Description of Object Data Type:** - -Since V2.0.8, the `ITableSession.insert(Tablet tablet)` interface supports splitting a single Object-class file into multiple segments and writing them in order. When the column data type in the Tablet data structure is **`TSDataType.Object`**, use the following method to populate the Tablet: - -```Java -/* -rowIndex: row position in the tablet -columnIndex: column position in the tablet -isEOF: whether the current write operation contains the last segment of the Object file -offset: starting offset of the current write content within the Object file -content: byte array of the current write content -Note: When writing, ensure the total length of all segmented byte[] arrays equals the original Object size, -otherwise the stored data size will be incorrect. -*/ -void addValue(int rowIndex, int columnIndex, boolean isEOF, long offset, byte[] content) -``` - -During queries, the following four methods are supported to retrieve values: -`Field.getStringValue`, `Field.getObjectValue`, `SessionDataSet.DataIterator.getObject`, and `SessionDataSet.DataIterator.getString`. -All these methods return a String containing metadata in the format: -`(Object) XX.XX KB` (where XX.XX KB represents the actual object size). - -#### 3.1.3 Sample Code - -```java -/** - * This interface defines a session for interacting with IoTDB tables. - * It supports operations such as data insertion, executing queries, and closing the session.
- * Implementations of this interface are expected to manage connections and ensure - * proper resource cleanup. - * - *

Each method may throw exceptions to indicate issues such as connection errors or - * execution failures. - * - *

Since this interface extends {@link AutoCloseable}, it is recommended to use - * try-with-resources to ensure the session is properly closed. - */ -public interface ITableSession extends AutoCloseable { - - /** - * Inserts a {@link Tablet} into the database. - * - * @param tablet the tablet containing time-series data to be inserted. - * @throws StatementExecutionException if an error occurs while executing the statement. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. - */ - void insert(Tablet tablet) throws StatementExecutionException, IoTDBConnectionException; - - /** - * Executes a non-query SQL statement, such as a DDL or DML command. - * - * @param sql the SQL statement to execute. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. - * @throws StatementExecutionException if an error occurs while executing the statement. - */ - void executeNonQueryStatement(String sql) throws IoTDBConnectionException, StatementExecutionException; - - /** - * Executes a query SQL statement and returns the result set. - * - * @param sql the SQL query statement to execute. - * @return a {@link SessionDataSet} containing the query results. - * @throws StatementExecutionException if an error occurs while executing the statement. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. - */ - SessionDataSet executeQueryStatement(String sql) - throws StatementExecutionException, IoTDBConnectionException; - - /** - * Executes a query SQL statement with a specified timeout and returns the result set. - * - * @param sql the SQL query statement to execute. - * @param timeoutInMs the timeout duration in milliseconds for the query execution. - * @return a {@link SessionDataSet} containing the query results. - * @throws StatementExecutionException if an error occurs while executing the statement. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. 
- */ - SessionDataSet executeQueryStatement(String sql, long timeoutInMs) - throws StatementExecutionException, IoTDBConnectionException; - - /** - * Closes the session, releasing any held resources. - * - * @throws IoTDBConnectionException if there is an issue with closing the IoTDB connection. - */ - @Override - void close() throws IoTDBConnectionException; -} -``` - -### 3.2 TableSessionBuilder Class - -#### 3.2.1 Feature Description - -The `TableSessionBuilder` class is a builder for configuring and creating instances of the `ITableSession` interface. It allows developers to set connection parameters, query parameters, and security features. - -#### 3.2.2 Parameter Configuration - -| **Parameter** | **Description** | **Default Value** | -|-----------------------------------------------------| ------------------------------------------------------------ |---------------------------------------------------| -| nodeUrls(List\<String\> nodeUrls) | Sets the list of IoTDB cluster node URLs. | `Collections.singletonList("localhost:6667")` | -| username(String username) | Sets the username for the connection. | `"root"` | -| password(String password) | Sets the password for the connection. | `"TimechoDB@2021"` (`"root"` before V2.0.6) | -| database(String database) | Sets the target database name. | `null` | -| queryTimeoutInMs(long queryTimeoutInMs) | Sets the query timeout in milliseconds. | `60000` (1 minute) | -| fetchSize(int fetchSize) | Sets the fetch size for query results. | `5000` | -| zoneId(ZoneId zoneId) | Sets the timezone-related `ZoneId`. | `ZoneId.systemDefault()` | -| thriftDefaultBufferSize(int thriftDefaultBufferSize) | Sets the default buffer size for the Thrift client (in bytes). | `1024` (1 KB) | -| thriftMaxFrameSize(int thriftMaxFrameSize) | Sets the maximum frame size for the Thrift client (in bytes). | `64 * 1024 * 1024` (64 MB) | -| enableRedirection(boolean enableRedirection) | Enables or disables redirection for cluster nodes.
| `true` | -| enableAutoFetch(boolean enableAutoFetch) | Enables or disables automatic fetching of available DataNodes. | `true` | -| maxRetryCount(int maxRetryCount) | Sets the maximum number of connection retry attempts. | `60` | -| retryIntervalInMs(long retryIntervalInMs) | Sets the interval between retry attempts (in milliseconds). | `500` (500 milliseconds) | -| useSSL(boolean useSSL) | Enables or disables SSL for secure connections. | `false` | -| trustStore(String keyStore) | Sets the path to the trust store for SSL connections. | `null` | -| trustStorePwd(String keyStorePwd) | Sets the password for the SSL trust store. | `null` | -| enableCompression(boolean enableCompression) | Enables or disables RPC compression for the connection. | `false` | -| connectionTimeoutInMs(int connectionTimeoutInMs) | Sets the connection timeout in milliseconds. | `0` (no timeout) | - -#### 3.2.3 Sample Code - -```java -/** - * A builder class for constructing instances of {@link ITableSession}. - * - *

This builder provides a fluent API for configuring various options such as connection - * settings, query parameters, and security features. - * - *

All configurations have reasonable default values, which can be overridden as needed. - */ -public class TableSessionBuilder { - - /** - * Builds and returns a configured {@link ITableSession} instance. - * - * @return a fully configured {@link ITableSession}. - * @throws IoTDBConnectionException if an error occurs while establishing the connection. - */ - public ITableSession build() throws IoTDBConnectionException; - - /** - * Sets the list of node URLs for the IoTDB cluster. - * - * @param nodeUrls a list of node URLs. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue Collections.singletonList("localhost:6667") - */ - public TableSessionBuilder nodeUrls(List<String> nodeUrls); - - /** - * Sets the username for the connection. - * - * @param username the username. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue "root" - */ - public TableSessionBuilder username(String username); - - /** - * Sets the password for the connection. - * - * @param password the password. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue "TimechoDB@2021" ("root" before V2.0.6) - */ - public TableSessionBuilder password(String password); - - /** - * Sets the target database name. - * - * @param database the database name. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue null - */ - public TableSessionBuilder database(String database); - - /** - * Sets the query timeout in milliseconds. - * - * @param queryTimeoutInMs the query timeout in milliseconds. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 60000 (1 minute) - */ - public TableSessionBuilder queryTimeoutInMs(long queryTimeoutInMs); - - /** - * Sets the fetch size for query results. - * - * @param fetchSize the fetch size. - * @return the current {@link TableSessionBuilder} instance.
- * @defaultValue 5000 - */ - public TableSessionBuilder fetchSize(int fetchSize); - - /** - * Sets the {@link ZoneId} for timezone-related operations. - * - * @param zoneId the {@link ZoneId}. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue ZoneId.systemDefault() - */ - public TableSessionBuilder zoneId(ZoneId zoneId); - - /** - * Sets the default init buffer size for the Thrift client. - * - * @param thriftDefaultBufferSize the buffer size in bytes. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 1024 (1 KB) - */ - public TableSessionBuilder thriftDefaultBufferSize(int thriftDefaultBufferSize); - - /** - * Sets the maximum frame size for the Thrift client. - * - * @param thriftMaxFrameSize the maximum frame size in bytes. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 64 * 1024 * 1024 (64 MB) - */ - public TableSessionBuilder thriftMaxFrameSize(int thriftMaxFrameSize); - - /** - * Enables or disables redirection for cluster nodes. - * - * @param enableRedirection whether to enable redirection. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue true - */ - public TableSessionBuilder enableRedirection(boolean enableRedirection); - - /** - * Enables or disables automatic fetching of available DataNodes. - * - * @param enableAutoFetch whether to enable automatic fetching. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue true - */ - public TableSessionBuilder enableAutoFetch(boolean enableAutoFetch); - - /** - * Sets the maximum number of retries for connection attempts. - * - * @param maxRetryCount the maximum retry count. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 60 - */ - public TableSessionBuilder maxRetryCount(int maxRetryCount); - - /** - * Sets the interval between retries in milliseconds. - * - * @param retryIntervalInMs the interval in milliseconds. 
- * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 500 milliseconds - */ - public TableSessionBuilder retryIntervalInMs(long retryIntervalInMs); - - /** - * Enables or disables SSL for secure connections. - * - * @param useSSL whether to enable SSL. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue false - */ - public TableSessionBuilder useSSL(boolean useSSL); - - /** - * Sets the trust store path for SSL connections. - * - * @param keyStore the trust store path. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue null - */ - public TableSessionBuilder trustStore(String keyStore); - - /** - * Sets the trust store password for SSL connections. - * - * @param keyStorePwd the trust store password. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue null - */ - public TableSessionBuilder trustStorePwd(String keyStorePwd); - - /** - * Enables or disables rpc compression for the connection. - * - * @param enableCompression whether to enable compression. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue false - */ - public TableSessionBuilder enableCompression(boolean enableCompression); - - /** - * Sets the connection timeout in milliseconds. - * - * @param connectionTimeoutInMs the connection timeout in milliseconds. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 0 (no timeout) - */ - public TableSessionBuilder connectionTimeoutInMs(int connectionTimeoutInMs); -} -``` - -> Note: When creating tables using the native API, if table or column names contain special characters or Chinese characters, do not add extra double quotes around them. Otherwise, the quotation marks will become part of the name itself. - -## 4. 
Session Pool - -### 4.1 ITableSessionPool Interface - -#### 4.1.1 Feature Description - -The `ITableSessionPool` interface manages a pool of `ITableSession` instances, enabling efficient reuse of connections and proper cleanup of resources. - -#### 4.1.2 Method Overview - -| **Method Name** | **Description** | **Return Value** | **Exceptions** | -| --------------- | ---------------------------------------------------------- | ---------------- | -------------------------- | -| getSession() | Acquires a session from the pool for database interaction. | `ITableSession` | `IoTDBConnectionException` | -| close() | Closes the session pool and releases resources. | None | None | - -#### 4.1.3 Sample Code - -```Java -/** - * This interface defines a pool for managing {@link ITableSession} instances. - * It provides methods to acquire a session from the pool and to close the pool. - * - *

The implementation should handle the lifecycle of sessions, ensuring efficient - * reuse and proper cleanup of resources. - */ -public interface ITableSessionPool { - - /** - * Acquires an {@link ITableSession} instance from the pool. - * - * @return an {@link ITableSession} instance for interacting with the IoTDB. - * @throws IoTDBConnectionException if there is an issue obtaining a session from the pool. - */ - ITableSession getSession() throws IoTDBConnectionException; - - /** - * Closes the session pool, releasing any held resources. - * - *

Once the pool is closed, no further sessions can be acquired. - */ - void close(); -} -``` - -### 4.2 TableSessionPoolBuilder Class - -#### 4.2.1 Feature Description - -The `TableSessionPoolBuilder` class is a builder for configuring and creating `ITableSessionPool` instances, supporting options like connection settings and pooling behavior. - -#### 4.2.2 Parameter Configuration - -| **Parameter** | **Description** | **Default Value** | -|---------------------------------------------------------------| ------------------------------------------------------------ |------------------------------------------------| -| nodeUrls(List\<String\> nodeUrls) | Sets the list of IoTDB cluster node URLs. | `Collections.singletonList("localhost:6667")` | -| maxSize(int maxSize) | Sets the maximum size of the session pool, i.e., the maximum number of sessions allowed in the pool. | `5` | -| user(String user) | Sets the username for the connection. | `"root"` | -| password(String password) | Sets the password for the connection. | `"TimechoDB@2021"` (`"root"` before V2.0.6) | -| database(String database) | Sets the target database name. | `"root"` | -| queryTimeoutInMs(long queryTimeoutInMs) | Sets the query timeout in milliseconds. | `60000` (1 minute) | -| fetchSize(int fetchSize) | Sets the fetch size for query results. | `5000` | -| zoneId(ZoneId zoneId) | Sets the timezone-related `ZoneId`. | `ZoneId.systemDefault()` | -| waitToGetSessionTimeoutInMs(long waitToGetSessionTimeoutInMs) | Sets the timeout duration (in milliseconds) for acquiring a session from the pool. | `30000` (30 seconds) | -| thriftDefaultBufferSize(int thriftDefaultBufferSize) | Sets the default buffer size for the Thrift client (in bytes). | `1024` (1 KB) | -| thriftMaxFrameSize(int thriftMaxFrameSize) | Sets the maximum frame size for the Thrift client (in bytes). | `64 * 1024 * 1024` (64 MB) | -| enableCompression(boolean enableCompression) | Enables or disables compression for the connection.
| `false` | -| enableRedirection(boolean enableRedirection) | Enables or disables redirection for cluster nodes. | `true` | -| connectionTimeoutInMs(int connectionTimeoutInMs) | Sets the connection timeout in milliseconds. | `10000` (10 seconds) | -| enableAutoFetch(boolean enableAutoFetch) | Enables or disables automatic fetching of available DataNodes. | `true` | -| maxRetryCount(int maxRetryCount) | Sets the maximum number of connection retry attempts. | `60` | -| retryIntervalInMs(long retryIntervalInMs) | Sets the interval between retry attempts (in milliseconds). | `500` (500 milliseconds) | -| useSSL(boolean useSSL) | Enables or disables SSL for secure connections. | `false` | -| trustStore(String keyStore) | Sets the path to the trust store for SSL connections. | `null` | -| trustStorePwd(String keyStorePwd) | Sets the password for the SSL trust store. | `null` | - -#### 4.2.3 Sample Code - -```Java -/** - * A builder class for constructing instances of {@link ITableSessionPool}. - * - *

This builder provides a fluent API for configuring a session pool, including - * connection settings, session parameters, and pool behavior. - * - *

All configurations have reasonable default values, which can be overridden as needed. - */ -public class TableSessionPoolBuilder { - - /** - * Builds and returns a configured {@link ITableSessionPool} instance. - * - * @return a fully configured {@link ITableSessionPool}. - */ - public ITableSessionPool build(); - - /** - * Sets the list of node URLs for the IoTDB cluster. - * - * @param nodeUrls a list of node URLs. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue Collections.singletonList("localhost:6667") - */ - public TableSessionPoolBuilder nodeUrls(List<String> nodeUrls); - - /** - * Sets the maximum size of the session pool. - * - * @param maxSize the maximum number of sessions allowed in the pool. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 5 - */ - public TableSessionPoolBuilder maxSize(int maxSize); - - /** - * Sets the username for the connection. - * - * @param user the username. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue "root" - */ - public TableSessionPoolBuilder user(String user); - - /** - * Sets the password for the connection. - * - * @param password the password. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue "TimechoDB@2021" ("root" before V2.0.6) - */ - public TableSessionPoolBuilder password(String password); - - /** - * Sets the target database name. - * - * @param database the database name. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue "root" - */ - public TableSessionPoolBuilder database(String database); - - /** - * Sets the query timeout in milliseconds. - * - * @param queryTimeoutInMs the query timeout in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 60000 (1 minute) - */ - public TableSessionPoolBuilder queryTimeoutInMs(long queryTimeoutInMs); - - /** - * Sets the fetch size for query results.
- * - * @param fetchSize the fetch size. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 5000 - */ - public TableSessionPoolBuilder fetchSize(int fetchSize); - - /** - * Sets the {@link ZoneId} for timezone-related operations. - * - * @param zoneId the {@link ZoneId}. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue ZoneId.systemDefault() - */ - public TableSessionPoolBuilder zoneId(ZoneId zoneId); - - /** - * Sets the timeout for waiting to acquire a session from the pool. - * - * @param waitToGetSessionTimeoutInMs the timeout duration in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 30000 (30 seconds) - */ - public TableSessionPoolBuilder waitToGetSessionTimeoutInMs(long waitToGetSessionTimeoutInMs); - - /** - * Sets the default buffer size for the Thrift client. - * - * @param thriftDefaultBufferSize the buffer size in bytes. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 1024 (1 KB) - */ - public TableSessionPoolBuilder thriftDefaultBufferSize(int thriftDefaultBufferSize); - - /** - * Sets the maximum frame size for the Thrift client. - * - * @param thriftMaxFrameSize the maximum frame size in bytes. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 64 * 1024 * 1024 (64 MB) - */ - public TableSessionPoolBuilder thriftMaxFrameSize(int thriftMaxFrameSize); - - /** - * Enables or disables compression for the connection. - * - * @param enableCompression whether to enable compression. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue false - */ - public TableSessionPoolBuilder enableCompression(boolean enableCompression); - - /** - * Enables or disables redirection for cluster nodes. - * - * @param enableRedirection whether to enable redirection. - * @return the current {@link TableSessionPoolBuilder} instance. 
- * @defaultValue true - */ - public TableSessionPoolBuilder enableRedirection(boolean enableRedirection); - - /** - * Sets the connection timeout in milliseconds. - * - * @param connectionTimeoutInMs the connection timeout in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 10000 (10 seconds) - */ - public TableSessionPoolBuilder connectionTimeoutInMs(int connectionTimeoutInMs); - - /** - * Enables or disables automatic fetching of available DataNodes. - * - * @param enableAutoFetch whether to enable automatic fetching. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue true - */ - public TableSessionPoolBuilder enableAutoFetch(boolean enableAutoFetch); - - /** - * Sets the maximum number of retries for connection attempts. - * - * @param maxRetryCount the maximum retry count. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 60 - */ - public TableSessionPoolBuilder maxRetryCount(int maxRetryCount); - - /** - * Sets the interval between retries in milliseconds. - * - * @param retryIntervalInMs the interval in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 500 milliseconds - */ - public TableSessionPoolBuilder retryIntervalInMs(long retryIntervalInMs); - - /** - * Enables or disables SSL for secure connections. - * - * @param useSSL whether to enable SSL. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue false - */ - public TableSessionPoolBuilder useSSL(boolean useSSL); - - /** - * Sets the trust store path for SSL connections. - * - * @param keyStore the trust store path. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue null - */ - public TableSessionPoolBuilder trustStore(String keyStore); - - /** - * Sets the trust store password for SSL connections. - * - * @param keyStorePwd the trust store password. 
- * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue null - */ - public TableSessionPoolBuilder trustStorePwd(String keyStorePwd); -} -``` - -## 5. Example Code - -Session: [src/main/java/org/apache/iotdb/TableModelSessionExample.java](https://github.com/apache/iotdb/blob/master/example/session/src/main/java/org/apache/iotdb/TableModelSessionExample.java) - -SessionPool: [src/main/java/org/apache/iotdb/TableModelSessionPoolExample.java](https://github.com/apache/iotdb/blob/master/example/session/src/main/java/org/apache/iotdb/TableModelSessionPoolExample.java) - -```Java -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. 
- */ - -package org.apache.iotdb; - -import org.apache.iotdb.isession.ITableSession; -import org.apache.iotdb.isession.SessionDataSet; -import org.apache.iotdb.isession.pool.ITableSessionPool; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.TableSessionPoolBuilder; - -import org.apache.tsfile.enums.ColumnCategory; -import org.apache.tsfile.enums.TSDataType; -import org.apache.tsfile.write.record.Tablet; - -import java.util.ArrayList; -import java.util.Arrays; -import java.util.Collections; -import java.util.List; - -import static org.apache.iotdb.SessionExample.printDataSet; - -public class TableModelSessionPoolExample { - - private static final String LOCAL_URL = "127.0.0.1:6667"; - - public static void main(String[] args) { - - // don't specify database in constructor - ITableSessionPool tableSessionPool = - new TableSessionPoolBuilder() - .nodeUrls(Collections.singletonList(LOCAL_URL)) - .user("root") - .password("TimechoDB@2021") //before V2.0.6 it is root - .maxSize(1) - .build(); - - try (ITableSession session = tableSessionPool.getSession()) { - - session.executeNonQueryStatement("CREATE DATABASE test1"); - session.executeNonQueryStatement("CREATE DATABASE test2"); - - session.executeNonQueryStatement("use test2"); - - // or use full qualified table name - session.executeNonQueryStatement( - "create table test1.table1(" - + "region_id STRING TAG, " - + "plant_id STRING TAG, " - + "device_id STRING TAG, " - + "model STRING ATTRIBUTE, " - + "temperature FLOAT FIELD, " - + "humidity DOUBLE FIELD) with (TTL=3600000)"); - - session.executeNonQueryStatement( - "create table table2(" - + "region_id STRING TAG, " - + "plant_id STRING TAG, " - + "color STRING ATTRIBUTE, " - + "temperature FLOAT FIELD, " - + "speed DOUBLE FIELD) with (TTL=6600000)"); - - // show tables from current database - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) 
{ - printDataSet(dataSet); - } - - // show tables by specifying another database - // using SHOW tables FROM - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES FROM test1")) { - printDataSet(dataSet); - } - - // insert table data by tablet - List<String> columnNameList = - Arrays.asList("region_id", "plant_id", "device_id", "model", "temperature", "humidity"); - List<TSDataType> dataTypeList = - Arrays.asList( - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.FLOAT, - TSDataType.DOUBLE); - List<ColumnCategory> columnTypeList = - new ArrayList<>( - Arrays.asList( - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.ATTRIBUTE, - ColumnCategory.FIELD, - ColumnCategory.FIELD)); - Tablet tablet = new Tablet("test1", columnNameList, dataTypeList, columnTypeList, 100); - for (long timestamp = 0; timestamp < 100; timestamp++) { - int rowIndex = tablet.getRowSize(); - tablet.addTimestamp(rowIndex, timestamp); - tablet.addValue("region_id", rowIndex, "1"); - tablet.addValue("plant_id", rowIndex, "5"); - tablet.addValue("device_id", rowIndex, "3"); - tablet.addValue("model", rowIndex, "A"); - tablet.addValue("temperature", rowIndex, 37.6F); - tablet.addValue("humidity", rowIndex, 111.1); - if (tablet.getRowSize() == tablet.getMaxRowNumber()) { - session.insert(tablet); - tablet.reset(); - } - } - if (tablet.getRowSize() != 0) { - session.insert(tablet); - tablet.reset(); - } - - // query table data - try (SessionDataSet dataSet = - session.executeQueryStatement( - "select * from test1 " - + "where region_id = '1' and plant_id in ('3', '5') and device_id = '3'")) { - printDataSet(dataSet); - } - - } catch (IoTDBConnectionException e) { - e.printStackTrace(); - } catch (StatementExecutionException e) { - e.printStackTrace(); - } finally { - tableSessionPool.close(); - } - - // specify database in constructor - tableSessionPool = - new TableSessionPoolBuilder() - .nodeUrls(Collections.singletonList(LOCAL_URL)) -
.user("root") - .password("TimechoDB@2021")//before V2.0.6 it is root - .maxSize(1) - .database("test1") - .build(); - - try (ITableSession session = tableSessionPool.getSession()) { - - // show tables from current database - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) { - printDataSet(dataSet); - } - - // change database to test2 - session.executeNonQueryStatement("use test2"); - - // show tables by specifying another database - // using SHOW tables FROM - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) { - printDataSet(dataSet); - } - - } catch (IoTDBConnectionException e) { - e.printStackTrace(); - } catch (StatementExecutionException e) { - e.printStackTrace(); - } - - try (ITableSession session = tableSessionPool.getSession()) { - - // show tables from default database test1 - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) { - printDataSet(dataSet); - } - - } catch (IoTDBConnectionException e) { - e.printStackTrace(); - } catch (StatementExecutionException e) { - e.printStackTrace(); - } finally { - tableSessionPool.close(); - } - } -} -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Table/API/Programming-MQTT_timecho.md b/src/UserGuide/Master/Table/API/Programming-MQTT_timecho.md deleted file mode 100644 index 39184f479..000000000 --- a/src/UserGuide/Master/Table/API/Programming-MQTT_timecho.md +++ /dev/null @@ -1,261 +0,0 @@ - -# MQTT Protocol - -## 1. Overview - -MQTT (Message Queuing Telemetry Transport) is a lightweight messaging protocol designed for IoT and low-bandwidth environments. It operates on a Publish/Subscribe (Pub/Sub) model, enabling efficient and reliable bidirectional communication between devices. Its core objectives are low power consumption, minimal bandwidth usage, and high real-time performance, making it ideal for unstable networks or resource-constrained scenarios (e.g., sensors, mobile devices). 
- -IoTDB provides deep integration with the MQTT protocol, fully compliant with MQTT v3.1 (OASIS International Standard). The IoTDB server includes a built-in high-performance MQTT Broker module, eliminating the need for third-party middleware. Devices can directly write time-series data into the IoTDB storage engine via MQTT messages. - -![](/img/mqtt-table-en-1.png) - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the MQTT service JAR file by default. Please contact the Timecho team to obtain the JAR file before using this service, and place it in the `timechodb_home/lib` or `timechodb_home/ext/external_service` directory. - - -## 2. Configuration - -By default, the IoTDB MQTT service loads its configuration from `${IOTDB_HOME}/${IOTDB_CONF}/iotdb-system.properties`. - -| **Property** | **Description** | **Default** | -| ------------------------ | -------------------------------------------------------------------------------------------------------------------- | ------------------- | -| `enable_mqtt_service` | Enable/disable the MQTT service. | FALSE | -| `mqtt_host` | Host address bound to the MQTT service. | 127.0.0.1 | -| `mqtt_port` | Port bound to the MQTT service. | 1883 | -| `mqtt_handler_pool_size` | Thread pool size for processing MQTT messages. | 1 | -| **`mqtt_payload_formatter`** | **Formatting method for MQTT message payloads. Options: `json` (tree mode), `line` (table mode).** | **json** | -| `mqtt_max_message_size` | Maximum allowed MQTT message size (bytes). | 1048576 | - -## 3. Write Protocol - -* Line Protocol Syntax - -```JavaScript -<measurement>[,<tag_key>=<tag_value>[,<tag_key>=<tag_value>]][ <attribute_key>=<attribute_value>[,<attribute_key>=<attribute_value>]] <field_key>=<field_value>[,<field_key>=<field_value>] [<timestamp>] -``` - -* Example - -```JavaScript -myMeasurement,tag1=value1,tag2=value2 attr1=value1,attr2=value2 fieldKey="fieldValue" 1556813561098000000 -``` - -![](/img/mqtt-table-en-2.png) - -## 4. Naming Conventions - -* Database Name - -The first segment of the MQTT topic (split by `/`) is used as the database name. 
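
The rule above can be sketched in a few lines of Java. This is only an illustration of the naming convention, not IoTDB's actual implementation; the class `TopicDatabaseName` and the method `databaseNameOf` are hypothetical names introduced here.

```java
// Illustrative sketch only: mirrors the documented rule that the first
// '/'-separated segment of the MQTT topic is used as the database name.
// The class and method names are hypothetical, not part of IoTDB.
final class TopicDatabaseName {

  // Returns the text before the first '/', or the whole topic if it has no '/'.
  static String databaseNameOf(String topic) {
    int slash = topic.indexOf('/');
    return slash < 0 ? topic : topic.substring(0, slash);
  }

  public static void main(String[] args) {
    System.out.println(databaseNameOf("stock/Legacy"));
    System.out.println(databaseNameOf("stock/Legacy/#"));
  }
}
```

The same mapping, written as topic and database name pairs: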
- -```Properties -topic: stock/Legacy -databaseName: stock - - -topic: stock/Legacy/# -databaseName: stock -``` - -* Table Name - -The table name is derived from the `<measurement>` in the line protocol. - -* Type Identifiers - -| Field Value | IoTDB Data Type | -|--------------------------------------------------------------------| ----------------- | -| 1
1.12 | DOUBLE | -| 1`f`
1.12`f` | FLOAT | -| 1`i`
123`i` | INT64 | -| 1`u`
123`u` | INT64 | -| 1`i32`
123`i32` | INT32 | -| `"xxx"` | TEXT | -| `t`,`T`,`true`,`True`,`TRUE`
`f`,`F`,`false`,`False`,`FALSE` | BOOLEAN | - - -## 5. Coding Examples -The following example shows an MQTT client sending messages to the IoTDB server. - - ```java -MQTT mqtt = new MQTT(); -mqtt.setHost("127.0.0.1", 1883); -mqtt.setUserName("root"); -mqtt.setPassword("root"); - -BlockingConnection connection = mqtt.blockingConnection(); -String DATABASE = "myMqttTest"; -connection.connect(); - -// single-row write: table, tags, attributes, fields, then the timestamp -String payload = - "test1,tag1=t1,tag2=t2 attr3=a5,attr4=a4 field1=\"fieldValue1\",field2=1i,field3=1u 1"; -connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -// the topic may consist of the database name alone -payload = "test1,tag1=t1,tag2=t2 field4=2,field5=2i32,field6=2f 2"; -connection.publish(DATABASE, payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -// a payload whose first line is a remark starting with '#' -payload = "# It's a remark\n " + "test1,tag1=t1,tag2=t2 field4=2,field5=2i32,field6=2f 6"; -connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -//batch write example -payload = - "test1,tag1=t1,tag2=t2 field7=t,field8=T,field9=true 3 \n " - + "test1,tag1=t1,tag2=t2 field7=f,field8=F,field9=FALSE 4"; -connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -//batch write example -payload = - "test1,tag1=t1,tag2=t2 attr1=a1,attr2=a2 field1=\"fieldValue1\",field2=1i,field3=1u 4 \n " - + "test1,tag1=t1,tag2=t2 field4=2,field5=2i32,field6=2f 5"; -connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -connection.disconnect(); - ``` - - - -## 6. Customize your MQTT Message Format - -If the line format above does not suit your needs, you can customize the MQTT message format with a few lines of code. An example can be found in the [example/mqtt-customize](https://github.com/apache/iotdb/tree/master/example/mqtt-customize) project. - -Steps: -1. 
Create a Java project and add the dependency: -```xml -<dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-server</artifactId> - <version>2.0.4-SNAPSHOT</version> -</dependency> -``` -2. Define a class that implements `org.apache.iotdb.db.protocol.mqtt.PayloadFormatter` - e.g., - -```java -package org.apache.iotdb.mqtt.server; - -import io.netty.buffer.ByteBuf; -import org.apache.iotdb.db.protocol.mqtt.Message; -import org.apache.iotdb.db.protocol.mqtt.PayloadFormatter; -import org.apache.iotdb.db.protocol.mqtt.TableMessage; -import org.apache.tsfile.enums.TSDataType; - -import java.nio.charset.StandardCharsets; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.List; - -public class CustomizedLinePayloadFormatter implements PayloadFormatter { - - @Override - public List<Message> format(String topic, ByteBuf payload) { - // Suppose the payload is a line format - if (payload == null) { - return null; - } - - String line = payload.toString(StandardCharsets.UTF_8); - // parse data from the line and generate Messages and put them into List<Message> ret - List<Message> ret = new ArrayList<>(); - // this is just an example, so we just generate some Messages directly - for (int i = 0; i < 3; i++) { - long ts = i; - TableMessage message = new TableMessage(); - - // Parsing Database Name - message.setDatabase("db" + i); - - // Parsing Table Names - message.setTable("t" + i); - - // Parsing Tags - List<String> tagKeys = new ArrayList<>(); - tagKeys.add("tag1" + i); - tagKeys.add("tag2" + i); - List<Object> tagValues = new ArrayList<>(); - tagValues.add("t_value1" + i); - tagValues.add("t_value2" + i); - message.setTagKeys(tagKeys); - message.setTagValues(tagValues); - - // Parsing Attributes - List<String> attributeKeys = new ArrayList<>(); - List<Object> attributeValues = new ArrayList<>(); - attributeKeys.add("attr1" + i); - attributeKeys.add("attr2" + i); - attributeValues.add("a_value1" + i); - attributeValues.add("a_value2" + i); - message.setAttributeKeys(attributeKeys); - message.setAttributeValues(attributeValues); - - // Parsing Fields - List<String> fields = Arrays.asList("field1" + i, "field2" + i); - List<TSDataType> dataTypes = Arrays.asList(TSDataType.FLOAT, 
TSDataType.FLOAT); - List<Object> values = Arrays.asList("4.0" + i, "5.0" + i); - message.setFields(fields); - message.setDataTypes(dataTypes); - message.setValues(values); - - // Parsing timestamp - message.setTimestamp(ts); - ret.add(message); - } - return ret; - } - - @Override - public String getName() { - // set the value of mqtt_payload_formatter in iotdb-system.properties as the following string: - return "CustomizedLine"; - } -} -``` -3. Modify the file `src/main/resources/META-INF/services/org.apache.iotdb.db.protocol.mqtt.PayloadFormatter`: - clear the file and put the fully qualified name of your implementation class into it. - In this example, the content is: `org.apache.iotdb.mqtt.server.CustomizedLinePayloadFormatter` -4. Compile your implementation as a jar file: `mvn package -DskipTests` - - -Then, on your server: -1. Create the `${IOTDB_HOME}/ext/mqtt/` folder and put the jar into it. -2. Update the configuration to enable the MQTT service (`enable_mqtt_service=true` in `conf/iotdb-system.properties`). -3. Set `mqtt_payload_formatter` in `conf/iotdb-system.properties` to the value returned by `getName()` in your implementation; - in this example, the value is `CustomizedLine`. -4. Launch the IoTDB server. -5. IoTDB will now use your implementation to parse MQTT messages. - -More: the message format can be anything you want. For example, if it is a binary format, -just use `payload.forEachByte()` or `payload.array()` to get the byte content. - -## 7. Caution - -To avoid compatibility issues caused by a default client_id, always explicitly supply a unique, non-empty client_id in every MQTT client. -Behavior varies when the client_id is missing or empty. Common examples: -1. Explicitly sending an empty string -• MQTTX: When client_id="", IoTDB silently discards the message. -• mosquitto_pub: When client_id="", IoTDB receives the message normally. -2. Omitting client_id entirely -• MQTTX: IoTDB accepts the message. -• mosquitto_pub: IoTDB rejects the connection. 
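
One way to guarantee a unique, non-empty client_id is to generate it in code. The sketch below is a minimal illustration, assuming a short application prefix plus random characters is acceptable; `MqttClientIds` and `uniqueClientId` are hypothetical names, and the 23-character cap follows the client identifier length limit in the MQTT 3.1 specification.

```java
import java.util.UUID;

// Illustrative sketch only: builds a unique, non-empty MQTT client id.
// The class and method names are hypothetical and not part of IoTDB.
final class MqttClientIds {

  // MQTT 3.1 restricts the client identifier to 1..23 characters.
  static final int MAX_CLIENT_ID_LENGTH = 23;
  // Keep the prefix short so that enough random characters remain.
  static final int MAX_PREFIX_LENGTH = 8;

  static String uniqueClientId(String prefix) {
    String p = prefix == null ? "" : prefix;
    if (p.length() > MAX_PREFIX_LENGTH) {
      p = p.substring(0, MAX_PREFIX_LENGTH);
    }
    // 32 hex characters of randomness, truncated below to fit the limit.
    String random = UUID.randomUUID().toString().replace("-", "");
    String id = p + "-" + random;
    return id.substring(0, Math.min(id.length(), MAX_CLIENT_ID_LENGTH));
  }

  public static void main(String[] args) {
    System.out.println(uniqueClientId("sensor"));
  }
}
```

The generated string would then be passed to whatever client-id setting your MQTT client exposes, for example the `-i` option of mosquitto_pub or the Client ID field in MQTTX.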
-Therefore, explicitly assigning a unique, non-empty client_id is the simplest way to eliminate these discrepancies and ensure reliable message delivery. \ No newline at end of file diff --git a/src/UserGuide/Master/Table/API/Programming-ODBC_timecho.md b/src/UserGuide/Master/Table/API/Programming-ODBC_timecho.md deleted file mode 100644 index fe40f1f8f..000000000 --- a/src/UserGuide/Master/Table/API/Programming-ODBC_timecho.md +++ /dev/null @@ -1,1051 +0,0 @@ - - -# ODBC - -## 1. Feature Introduction -The IoTDB ODBC driver provides the ability to interact with the database via the standard ODBC interface, supporting data management in time-series databases through ODBC connections. It currently supports database connection, data query, data insertion, data modification, and data deletion operations, and is compatible with various applications and toolchains that support the ODBC protocol. - -> Note: This feature is supported starting from V2.0.8.2. - -## 2. Usage Method -It is recommended to install using the pre-compiled binary package. There is no need to compile it yourself; simply use the script to complete the driver installation and system registration. Currently, only Windows systems are supported. - -### 2.1 Environment Requirements -Only the ODBC Driver Manager dependency at the operating system level is required; no compilation environment configuration is needed: - -| **Operating System** | **Requirements and Installation Method** | -| :--- |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Windows | 1. **Windows 10/11, Server 2016/2019/2022**: Comes with ODBC Driver Manager version 17/18 built-in; no extra installation needed.
2. **Windows 8.1/Server 2012 R2**: Requires manual installation of the corresponding version of the ODBC Driver Manager. | - -### 2.2 Installation Steps -1. Contact the Timecho team to obtain the pre-compiled binary package. - Binary package directory structure: - ```Plain - ├── bin/ - │ ├── apache_iotdb_odbc.dll - │ └── install_driver.exe - ├── install.bat - └── registry.bat - ``` -2. Open a command line tool (CMD/PowerShell) with **Administrator privileges** and run the following command (you can replace the path with any absolute path): - ```Bash - install.bat "C:\Program Files\Apache IoTDB ODBC Driver" - ``` - The script automatically completes the following operations: - * Creates the installation directory (if it does not exist). - * Copies `bin\apache_iotdb_odbc.dll` to the specified installation directory. - * Calls `install_driver.exe` to register the driver with the system via the ODBC standard API (`SQLInstallDriverEx`). -3. Verify installation: Open "ODBC Data Source Administrator". If you can see `Apache IoTDB ODBC Driver` in the "Drivers" tab, the registration was successful. - ![](/img/odbc-1-en.png) - -### 2.3 Uninstallation Steps -1. Open Command Prompt as Administrator and `cd` into the project root directory. -2. Run the uninstallation script: - ```Bash - uninstall.bat - ``` - The script will call `install_driver.exe` to unregister the driver from the system via the ODBC standard API (`SQLRemoveDriver`). The DLL files in the installation directory will not be automatically deleted; please delete them manually if cleanup is required. - -### 2.4 Connection Configuration -After installing the driver, you need to configure a Data Source Name (DSN) so that applications can connect to the database by referring to the DSN. The IoTDB ODBC driver supports two methods for configuring connection parameters: via Data Source and via Connection String. - -#### 2.4.1 Configuring Data Source -**Configure via ODBC Data Source Administrator** -1. 
Open "ODBC Data Source Administrator", switch to the "User DSN" tab, and click the "Add" button. - ![](/img/odbc-2-en.png) -2. Select "Apache IoTDB ODBC Driver" from the pop-up driver list and click "Finish". - ![](/img/odbc-3-en.png) -3. The data source configuration dialog will appear. Fill in the connection parameters and click OK: - ![](/img/odbc-4-en.png) - The meaning of each field in the dialog box is as follows: - - | **Area** | **Field** | **Description** | - | :--- | :--- | :--- | - | Data Source | DSN Name | Data Source Name; applications refer to this data source by this name. | - | Data Source | Description | Data Source description (optional). | - | Connection | Server | IoTDB server IP address, default 127.0.0.1. | - | Connection | Port | IoTDB Session API port, default 6667. | - | Connection | User | Username, default root. | - | Connection | Password | Password, default root. | - | Options | Table Model | Check to use Table Model; uncheck to use Tree Model. | - | Options | Database | Database name. Only available in Table Model mode; grayed out in Tree Model. | - | Options | Log Level | Log level (0-4): 0=OFF, 1=ERROR, 2=WARN, 3=INFO, 4=TRACE. | - | Options | Session Timeout | Session timeout time (milliseconds); 0 means no timeout. Note: The server-side `queryTimeoutThreshold` defaults to 60000ms; exceeding this value requires modifying server configuration. | - | Options | Batch Size | Number of rows fetched per batch, default 1000. Setting to 0 resets to the default value. | - -4. After filling in the details, you can click the "Test Connection" button to test the connection. Testing will attempt to connect to the IoTDB server using the current parameters and execute a `SHOW VERSION` query. If successful, the server version information will be displayed; if failed, the specific error reason will be shown. -5. Once parameters are confirmed correct, click "OK" to save. 
The data source will appear in the "User DSN" list, as shown in the example below with the name "123". - ![](/img/odbc-5-en.png) - To modify the configuration of an existing data source, select it in the list and click the "Configure" button to edit again. - -#### 2.4.2 Connection String -The connection string format is **semicolon-separated key-value pairs**, for example: -```Bash -Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;database=testdb;isTableModel=true;loglevel=2 -``` -Specific field attributes are introduced in the table below: - -| **Field Name** | **Description** | **Optional Values** | **Default Value** | -| :--- | :--- |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :--- | -| DSN | Data Source Name | Custom data source name | - | -| uid | Database username | Any string | root | -| pwd | Database password | Any string | root | -| server | IoTDB server address | IP address | 127.0.0.1 | -| port | IoTDB server port | Port number | 6667 | -| database | Database name (only effective in Table Model mode) | Any string | Empty string | -| loglevel | Log level | Integer value (0-4) | 4 (LOG_LEVEL_TRACE) | -| isTableModel / tablemodel | Whether to enable Table Model mode | Boolean type, supports multiple representations:
1. 0, false, no, off: set to false;
2. 1, true, yes, on: set to true;
3. Other values default to true. | true | -| sessiontimeoutms | Session timeout time (milliseconds) | 64-bit integer, defaults to `LLONG_MAX`; setting to `0` will be replaced with `LLONG_MAX`. Note: The server has a timeout setting: `private long queryTimeoutThreshold = 60000;` this item needs to be modified to get a timeout time exceeding 60 seconds. | LLONG_MAX | -| batchsize | Batch size for fetching data each time | 64-bit integer, defaults to `1000`; setting to `0` will be replaced with `1000` | 1000 | - -Notes: -* Field names are case-insensitive (automatically converted to lowercase for comparison). -* Connection string format is semicolon-separated key-value pairs, e.g., `Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;database=testdb;isTableModel=true;loglevel=2`. -* For boolean fields (`isTableModel`), multiple representation methods are supported. -* All fields are optional; if not specified, default values are used. -* Unsupported fields will be ignored and a warning logged, but will not affect the connection. -* The default server interface port 6667 is the default port used by IoTDB's C++ Session interface. This ODBC driver uses the C++ Session interface to transfer data with IoTDB. If the C++ Session interface on the IoTDB server uses a non-default port, corresponding changes must be made in the ODBC connection string. - -#### 2.4.3 Relationship between Data Source Configuration and Connection String -Configurations saved in the ODBC Data Source Administrator are written into the system's ODBC data source configuration as key-value pairs (corresponding to the registry `HKEY_CURRENT_USER\SOFTWARE\ODBC\ODBC.INI` under Windows). When an application uses `SQLConnect` or specifies `DSN=DataSourceName` in the connection string, the driver reads these parameters from the system configuration. - -**The priority of the connection string is higher than the configuration saved in the DSN.** Specific rules are as follows: -1. 
If the connection string contains `DSN=xxx` and does not contain `DRIVER=...`, the driver first loads all parameters of that DSN from the system configuration as base values. -2. Then, parameters explicitly specified in the connection string will override parameters with the same name in the DSN. -3. If the connection string contains `DRIVER=...`, no DSN parameters will be read from the system configuration; it will rely entirely on the connection string. - -For example: If the DSN is configured with `Server=192.168.1.100` and `Port=6667`, but the connection string is `DSN=MyDSN;Server=127.0.0.1`, then the actual connection will use `Server=127.0.0.1` (overridden by connection string) and `Port=6667` (from DSN). - -### 2.5 Logging -Log output during driver runtime is divided into "Driver Self-Logs" and "ODBC Manager Tracing Logs". Note the impact of log levels on performance. - -#### 2.5.1 Driver Self-Logs -* Output location: `apache_iotdb_odbc.log` in the user's home directory. -* Log level: Configured via the `loglevel` parameter in the connection string (0-4; higher levels produce more detailed output). -* Performance impact: High log levels will significantly reduce driver performance; recommended for debugging only. - -#### 2.5.2 ODBC Manager Tracing Logs -* How to enable: Open "ODBC Data Source Administrator" → "Tracing" → "Start Tracing Now". -* Precautions: Enabling this will greatly reduce driver performance; use only for troubleshooting. - -## 3. 
Interface Support - -### 3.1 Method List -The driver's support status for standard ODBC APIs is as follows: - -| ODBC/Setup API | Function | Parameter List | Parameter Description | -|:------------------|:---------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :--- | -| SQLAllocHandle | Allocate ODBC Handle | (SQLSMALLINT HandleType, SQLHANDLE InputHandle, SQLHANDLE *OutputHandle) | HandleType: Type of handle to allocate (ENV/DBC/STMT/DESC);
InputHandle: Parent context handle;
OutputHandle: Pointer to the returned new handle. | -| SQLBindCol | Bind column to result buffer | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN *StrLen_or_Ind) | StatementHandle: Statement handle;
ColumnNumber: Column number;
TargetType: C data type;
TargetValue: Data buffer;
BufferLength: Buffer length;
StrLen_or_Ind: Returns data length or NULL indicator. | -| SQLColAttribute | Get column attribute information | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLUSMALLINT FieldIdentifier, SQLPOINTER CharacterAttribute, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength, SQLLEN *NumericAttribute) | StatementHandle: Statement handle;
ColumnNumber: Column number;
FieldIdentifier: Attribute ID;
CharacterAttribute: Character attribute output;
BufferLength: Buffer length;
StringLength: Returned length;
NumericAttribute: Numeric attribute output. | -| SQLColumns | Query table column information | (SQLHSTMT StatementHandle, SQLCHAR *CatalogName, SQLSMALLINT NameLength1, SQLCHAR *SchemaName, SQLSMALLINT NameLength2, SQLCHAR *TableName, SQLSMALLINT NameLength3, SQLCHAR *ColumnName, SQLSMALLINT NameLength4) | StatementHandle: Statement handle;
Catalog/Schema/Table/ColumnName: Query object names;

NameLength*: Corresponding name lengths. | -| SQLConnect | Establish database connection | (SQLHDBC ConnectionHandle, SQLCHAR *ServerName, SQLSMALLINT NameLength1, SQLCHAR *UserName, SQLSMALLINT NameLength2, SQLCHAR *Authentication, SQLSMALLINT NameLength3) | ConnectionHandle: Connection handle;
ServerName: Data source name;
UserName: Username;
Authentication: Password;
NameLength*: String lengths. | -| SQLDescribeCol | Describe columns in result set | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLCHAR *ColumnName, SQLSMALLINT BufferLength, SQLSMALLINT *NameLength, SQLSMALLINT *DataType, SQLULEN *ColumnSize, SQLSMALLINT *DecimalDigits, SQLSMALLINT *Nullable) | StatementHandle: Statement handle;
ColumnNumber: Column number;
ColumnName: Column name output;
BufferLength: Buffer length;
NameLength: Returned column name length;
DataType: SQL type;
ColumnSize: Column size;
DecimalDigits: Decimal digits;
Nullable: Whether nullable. | -| SQLDisconnect | Disconnect database connection | (SQLHDBC ConnectionHandle) | ConnectionHandle: Connection handle. | -| SQLDriverConnect | Establish connection using connection string | (SQLHDBC ConnectionHandle, SQLHWND WindowHandle, SQLCHAR *InConnectionString, SQLSMALLINT StringLength1, SQLCHAR *OutConnectionString, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength2, SQLUSMALLINT DriverCompletion) | ConnectionHandle: Connection handle;
WindowHandle: Window handle;
InConnectionString: Input connection string;
StringLength1: Input length;
OutConnectionString: Output connection string;
BufferLength: Output buffer;
StringLength2: Returned length;
DriverCompletion: Connection prompt method. | -| SQLEndTran | Commit or rollback transaction | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT CompletionType) | HandleType: Handle type;
Handle: Connection or environment handle;
CompletionType: Commit or rollback transaction. | -| SQLExecDirect | Execute SQL statement directly | (SQLHSTMT StatementHandle, SQLCHAR *StatementText, SQLINTEGER TextLength) | StatementHandle: Statement handle;
StatementText: SQL text;
TextLength: SQL length. | -| SQLFetch | Fetch next row in result set | (SQLHSTMT StatementHandle) | StatementHandle: Statement handle. | -| SQLFreeHandle | Free ODBC handle | (SQLSMALLINT HandleType, SQLHANDLE Handle) | HandleType: Handle type;
Handle: Handle to free. | -| SQLFreeStmt | Free statement-related resources | (SQLHSTMT StatementHandle, SQLUSMALLINT Option) | StatementHandle: Statement handle;
Option: Free option (close cursor/reset parameters, etc.). | -| SQLGetConnectAttr | Get connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER *StringLength) | ConnectionHandle: Connection handle;
Attribute: Attribute ID;
Value: Returned attribute value;
BufferLength: Buffer length;
StringLength: Returned length. | -| SQLGetData | Get result data | (SQLHSTMT StatementHandle, SQLUSMALLINT Col_or_Param_Num, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN *StrLen_or_Ind) | StatementHandle: Statement handle;
Col_or_Param_Num: Column number;
TargetType: C type;
TargetValue: Data buffer;
BufferLength: Buffer size;
StrLen_or_Ind: Returned length or NULL flag. | -| SQLGetDiagField | Get diagnostic field | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLSMALLINT DiagIdentifier, SQLPOINTER DiagInfo, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength) | HandleType: Handle type;
Handle: Handle;
RecNumber: Record number;
DiagIdentifier: Diagnostic field ID;
DiagInfo: Output info;
BufferLength: Buffer;
StringLength: Returned length. | -| SQLGetDiagRec | Get diagnostic record | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLCHAR *Sqlstate, SQLINTEGER *NativeError, SQLCHAR *MessageText, SQLSMALLINT BufferLength, SQLSMALLINT *TextLength) | HandleType: Handle type;
Handle: Handle;
RecNumber: Record number;
Sqlstate: SQL state code;
NativeError: Native error code;
MessageText: Error message;
BufferLength: Buffer;
TextLength: Returned length. | -| SQLGetInfo | Get database information | (SQLHDBC ConnectionHandle, SQLUSMALLINT InfoType, SQLPOINTER InfoValue, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength) | ConnectionHandle: Connection handle;
InfoType: Information type;
InfoValue: Return value;
BufferLength: Buffer length;
StringLength: Returned length. | -| SQLGetStmtAttr | Get statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER *StringLength) | StatementHandle: Statement handle;
Attribute: Attribute ID;
Value: Return value;
BufferLength: Buffer;
StringLength: Returned length. | -| SQLGetTypeInfo | Get data type information | (SQLHSTMT StatementHandle, SQLSMALLINT DataType) | StatementHandle: Statement handle;
DataType: SQL data type. | -| SQLMoreResults | Get more result sets | (SQLHSTMT StatementHandle) | StatementHandle: Statement handle. | -| SQLNumResultCols | Get number of columns in result set | (SQLHSTMT StatementHandle, SQLSMALLINT *ColumnCount) | StatementHandle: Statement handle;
ColumnCount: Returned column count. | -| SQLRowCount | Get number of affected rows | (SQLHSTMT StatementHandle, SQLLEN *RowCount) | StatementHandle: Statement handle;
RowCount: Returned number of affected rows. | -| SQLSetConnectAttr | Set connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | ConnectionHandle: Connection handle;
Attribute: Attribute ID;
Value: Attribute value;
StringLength: Attribute value length. | -| SQLSetEnvAttr | Set environment attribute | (SQLHENV EnvironmentHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | EnvironmentHandle: Environment handle;
Attribute: Attribute ID;
Value: Attribute value;
StringLength: Length. | -| SQLSetStmtAttr | Set statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | StatementHandle: Statement handle;
Attribute: Attribute ID;
Value: Attribute value;
StringLength: Length. | -| SQLTables | Query table information | (SQLHSTMT StatementHandle, SQLCHAR *CatalogName, SQLSMALLINT NameLength1, SQLCHAR *SchemaName, SQLSMALLINT NameLength2, SQLCHAR *TableName, SQLSMALLINT NameLength3, SQLCHAR *TableType, SQLSMALLINT NameLength4) | StatementHandle: Statement handle;
Catalog/Schema/TableName: Table names;
TableType: Table type;
NameLength*: Corresponding lengths. | - -### 3.2 Data Type Conversion -The mapping relationship between IoTDB data types and standard ODBC data types is as follows: - -| **IoTDB Data Type** | **ODBC Data Type** | -| :--- | :--- | -| BOOLEAN | SQL_BIT | -| INT32 | SQL_INTEGER | -| INT64 | SQL_BIGINT | -| FLOAT | SQL_REAL | -| DOUBLE | SQL_DOUBLE | -| TEXT | SQL_VARCHAR | -| STRING | SQL_VARCHAR | -| BLOB | SQL_LONGVARBINARY | -| TIMESTAMP | SQL_BIGINT | -| DATE | SQL_DATE | - -## 4. Operation Examples -This chapter mainly introduces full-type operation examples for **C#**, **Python**, **C++**, **PowerBI**, and **Excel**, covering core operations such as data query, insertion, and deletion. - -### 4.1 C# Example - -```C# -/******* -Note: When the output contains Chinese characters, it may cause garbled text. -This is because the table.Write() function cannot output strings in UTF-8 encoding -and can only output using GB2312 (or another system default encoding). This issue -may not occur in software like Power BI; it also does not occur when using the Console.WriteLine function. -This is an issue with the ConsoleTable package. 
-*****/ - -using System.Data.Common; -using System.Data.Odbc; -using System.Reflection.PortableExecutable; -using ConsoleTables; -using System; - -/// Executes a SELECT query and outputs the results of fulltable in table format -void Query(OdbcConnection dbConnection) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - dbCommand.CommandText = "select * from fulltable"; - using (OdbcDataReader dbReader = dbCommand.ExecuteReader()) - { - var fCount = dbReader.FieldCount; - Console.WriteLine($"fCount = {fCount}"); - - // Output header row - var columns = new string[fCount]; - for (var i = 0; i < fCount; i++) - { - var fName = dbReader.GetName(i); - if (fName.Contains('.')) - { - fName = fName.Substring(fName.LastIndexOf('.') + 1); - } - columns[i] = fName; - } - - // Output content rows - var table = new ConsoleTable(columns); - while (dbReader.Read()) - { - var row = new object[fCount]; - for (var i = 0; i < fCount; i++) - { - if (dbReader.IsDBNull(i)) - { - row[i] = null; - continue; - } - row[i] = dbReader.GetValue(i); - } - table.AddRow(row); - } - table.Write(); - Console.WriteLine(); - } - } - } - catch (Exception ex) - { - Console.WriteLine(ex.ToString()); - } -} - -/// Executes non-query SQL statements (such as CREATE DATABASE, CREATE TABLE, INSERT, etc.) 
-void Execute(OdbcConnection dbConnection, string command) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - try - { - dbCommand.CommandText = command; - Console.WriteLine($"Execute command: {command}"); - dbCommand.ExecuteNonQuery(); - } - catch (Exception ex) - { - Console.WriteLine($"CommandText error: {ex.Message}"); - } - } - } - catch (OdbcException ex) - { - Console.WriteLine($"Database error: {ex.Message}"); - } - catch (Exception ex) - { - Console.WriteLine($"Unknown error occurred: {ex.Message}"); - } -} - -var dsn = "Apache IoTDB DSN"; -var user = "root"; -var password = "root"; -var server = "127.0.0.1"; -var database = "test"; -var connectionString = $"DSN={dsn};Server={server};UID={user};PWD={password};Database={database};loglevel=4"; - -using (OdbcConnection dbConnection = new OdbcConnection(connectionString)) -{ - Console.WriteLine("Start"); - try - { - dbConnection.Open(); - } - catch (Exception ex) - { - Console.WriteLine($"Login failed: {ex.Message}"); - Console.WriteLine($"Stack Trace: {ex.StackTrace}"); - dbConnection.Dispose(); - return; - } - // OdbcConnection.Driver returns the ODBC driver name, not the database name - Console.WriteLine($"Successfully opened connection. driver = {dbConnection.Driver}"); - - Execute(dbConnection, "CREATE DATABASE IF NOT EXISTS test"); - Execute(dbConnection, "use test"); - Console.WriteLine("use test Execute complete.
Begin to setup fulltable."); - - Execute(dbConnection, "CREATE TABLE IF NOT EXISTS fullTable (time TIMESTAMP TIME, bool_col BOOLEAN FIELD, int32_col INT32 FIELD, int64_col INT64 FIELD, float_col FLOAT FIELD, double_col DOUBLE FIELD, text_col TEXT FIELD, string_col STRING FIELD, blob_col BLOB FIELD, timestamp_col TIMESTAMP FIELD, date_col DATE FIELD) WITH (TTL=315360000000)"); - - string[] insertStatements = new string[] - { - "INSERT INTO fulltable VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689600000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689660000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689720000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, 'Device temperature high alarm', 'DeviceA-Room1', '0x506C616E7444617462', 1735689780000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, 'Device status returned to normal', 'DeviceA-Room1', '0x506C616E7444617461', 1735689840000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, 'Device operating normally', 'DeviceB-Room2', '0x506C616E7444617463', 1735689900000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, 'Device operating normally', 'DeviceB-Room2', '0x506C616E7444617463', 1735689960000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, 'Device humidity low alarm', 'DeviceB-Room2', '0x506C616E7444617464', 1735690020000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690080000, 
true, 108, 10000000008, 37.3, 129.489, 'Device status returned to normal', 'DeviceB-Room2', '0x506C616E7444617463', 1735690080000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 'Device operating normally', 'DeviceC-Room3', '0x506C616E7444617465', 1735690140000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, 'Device operating normally', 'DeviceC-Room3', '0x506C616E7444617465', 1735690200000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, 'Device voltage unstable alarm', 'DeviceC-Room3', '0x506C616E7444617466', 1735690260000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, 'Device status returned to normal', 'DeviceC-Room3', '0x506C616E7444617465', 1735690320000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690380000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690440000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690500000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, 'Device signal interrupted alarm', 'DeviceD-Room4', '0x506C616E7444617468', 1735690560000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690620000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, 'Device operating normally', 'DeviceE-Room5', 
'0x506C616E7444617469', 1735690680000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690740000, '2026-01-04')" - }; - - foreach (var insert in insertStatements) - { - Execute(dbConnection, insert); - } - Console.WriteLine("fulltable setup complete. Begin to query."); - - Query(dbConnection); // Execute query and output results -} -``` - -### 4.2 Python Example -1. To access the ODBC driver from Python, install the `pyodbc` package: - ```Plain - pip install pyodbc - ``` -2. Full code: - - -```Python -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -""" -Apache IoTDB ODBC Python example -Use pyodbc to connect to the IoTDB ODBC driver and perform operations such as query and insert. -For reference, see examples/cpp-example/test.cpp and examples/BasicTest/BasicTest/Program.cs -""" - -import pyodbc - -def execute(conn: pyodbc.Connection, command: str) -> None: - """Executes non-query SQL statements (such as USE, CREATE, INSERT, DELETE, etc.)""" - try: - with conn.cursor() as cursor: - cursor.execute(command) - # INSERT/UPDATE/DELETE require commit; session commands such as USE do not.
- cmd_upper = command.strip().upper() - if cmd_upper.startswith(("INSERT", "UPDATE", "DELETE")): - conn.commit() - print(f"Execute command: {command}") - except pyodbc.Error as ex: - print(f"CommandText error: {ex}") - -def query(conn: pyodbc.Connection, sql: str) -> None: - """Executes a SELECT query and outputs the results in table format""" - try: - with conn.cursor() as cursor: - cursor.execute(sql) - col_count = len(cursor.description) - print(f"fCount = {col_count}") - - if col_count <= 0: - return - - # Get column names (if the name contains '.', take the last segment, consistent with C++/C# samples). - columns = [] - for i in range(col_count): - col_name = cursor.description[i][0] or f"Column{i}" - if "." in str(col_name): - col_name = str(col_name).split(".")[-1] - columns.append(str(col_name)) - - # Fetch data rows - rows = cursor.fetchall() - - # Simple table output - col_widths = [max(len(str(col)), 4) for col in columns] - for i, row in enumerate(rows): - for j, val in enumerate(row): - if j < len(col_widths): - col_widths[j] = max(col_widths[j], len(str(val) if val is not None else "NULL")) - - # Print header - header = " | ".join(str(c).ljust(col_widths[i]) for i, c in enumerate(columns)) - print(header) - print("-" * len(header)) - - # Print data rows - for row in rows: - values = [] - for i, val in enumerate(row): - if val is None: - cell = "NULL" - else: - cell = str(val) - values.append(cell.ljust(col_widths[i]) if i < len(col_widths) else cell) - print(" | ".join(values)) - - print() - - except pyodbc.Error as ex: - print(f"Query error: {ex}") - -def main() -> None: - dsn = "Apache IoTDB DSN" - user = "root" - password = "root" - server = "127.0.0.1" - database = "test" - connection_string = ( - f"DSN={dsn};Server={server};UID={user};PWD={password};" - f"Database={database};loglevel=4" - ) - - print("Start") - - try: - conn = pyodbc.connect(connection_string) - except pyodbc.Error as ex: - print(f"Login failed: {ex}") - return - - try: - 
driver_name = conn.getinfo(6) # SQL_DRIVER_NAME - print(f"Successfully opened connection. driver = {driver_name}") - except Exception: - print("Successfully opened connection.") - - try: - execute(conn, "CREATE DATABASE IF NOT EXISTS test") - execute(conn, "use test") - print("use test Execute complete. Begin to setup fulltable.") - - # Create the fulltable table and insert test data - execute( - conn, - "CREATE TABLE IF NOT EXISTS fullTable (time TIMESTAMP TIME, bool_col BOOLEAN FIELD, " - "int32_col INT32 FIELD, int64_col INT64 FIELD, float_col FLOAT FIELD, " - "double_col DOUBLE FIELD, text_col TEXT FIELD, string_col STRING FIELD, " - "blob_col BLOB FIELD, timestamp_col TIMESTAMP FIELD, date_col DATE FIELD) " - "WITH (TTL=315360000000)", - ) - insert_statements = [ - "INSERT INTO fulltable VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689600000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689660000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689720000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, 'Device temperature high alarm', 'DeviceA-Room1', '0x506C616E7444617462', 1735689780000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, 'Device status returned to normal', 'DeviceA-Room1', '0x506C616E7444617461', 1735689840000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, 'Device operating normally', 'DeviceB-Room2', '0x506C616E7444617463', 1735689900000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689960000, true, 106, 10000000006, 37.1, 
129.289, 'Device operating normally', 'DeviceB-Room2', '0x506C616E7444617463', 1735689960000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, 'Device humidity low alarm', 'DeviceB-Room2', '0x506C616E7444617464', 1735690020000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, 'Device status returned to normal', 'DeviceB-Room2', '0x506C616E7444617463', 1735690080000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 'Device operating normally', 'DeviceC-Room3', '0x506C616E7444617465', 1735690140000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, 'Device operating normally', 'DeviceC-Room3', '0x506C616E7444617465', 1735690200000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, 'Device voltage unstable alarm', 'DeviceC-Room3', '0x506C616E7444617466', 1735690260000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, 'Device status returned to normal', 'DeviceC-Room3', '0x506C616E7444617465', 1735690320000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690380000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690440000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690500000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, 'Device signal interrupted alarm', 'DeviceD-Room4', '0x506C616E7444617468', 
1735690560000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690620000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690680000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - ] - for insert_sql in insert_statements: - execute(conn, insert_sql) - print("fulltable setup complete. Begin to query.") - query(conn, "select * from fulltable") - print("Query ok") - finally: - conn.close() - -if __name__ == "__main__": - main() -``` - - -### 4.3 C++ Example - - -```C++ -#define WIN32_LEAN_AND_MEAN -#include <windows.h> - -// Headers required by the ODBC calls and standard-library facilities used below -#include <sql.h> -#include <sqlext.h> -#include <cstdio> -#include <cstdlib> -#include <iostream> -#include <string> -#include <vector> - -#ifndef SQL_DIAG_COLUMN_SIZE -#define SQL_DIAG_COLUMN_SIZE 33L -#endif - -// Error handling function (core functionality preserved) -void CheckOdbcError(SQLRETURN retCode, SQLSMALLINT handleType, SQLHANDLE handle, const char* functionName) { - if (retCode == SQL_SUCCESS || retCode == SQL_SUCCESS_WITH_INFO) { - return; - } - - SQLCHAR sqlState[6]; - SQLCHAR message[SQL_MAX_MESSAGE_LENGTH]; - SQLINTEGER nativeError; - SQLSMALLINT textLength; - SQLRETURN errRet; - errRet = SQLGetDiagRec(handleType, handle, 1, sqlState, &nativeError, message, sizeof(message), &textLength); - - std::cerr << "ODBC Error in " << functionName << ":\n"; - std::cerr << " SQL State: " << sqlState << "\n"; - std::cerr << "
Native Error: " << nativeError << "\n"; - std::cerr << " Message: " << message << "\n"; - std::cerr << " SQLGetDiagRec Return: " << errRet << "\n"; - - if (retCode == SQL_ERROR || retCode == SQL_INVALID_HANDLE) { - exit(1); - } -} - -// Simplified table output - displays basic data only -void PrintSimpleTable(const std::vector<std::string>& headers, - const std::vector<std::vector<std::string>>& rows) { - // Print header row - for (size_t i = 0; i < headers.size(); i++) { - std::cout << headers[i]; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - // Print separator line - for (size_t i = 0; i < headers.size(); i++) { - std::cout << "----------------"; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - // Print data rows - for (const auto& row : rows) { - for (size_t i = 0; i < row.size(); i++) { - std::cout << row[i]; - if (i < row.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - } - std::cout << std::endl; -} - -/// Executes a SELECT query and outputs the results of fulltable in table format -void Query(SQLHDBC hDbc) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN ret = SQL_SUCCESS; - - try { - // Allocate statement handle - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - return; - } - - // Execute query - const std::string sqlQuery = "select * from fulltable"; - std::cout << "Execute query: " << sqlQuery << std::endl; - - ret = SQLExecDirect(hStmt, reinterpret_cast<SQLCHAR*>(const_cast<char*>(sqlQuery.c_str())), SQL_NTS); - if (!SQL_SUCCEEDED(ret)) { - if (ret != SQL_NO_DATA) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect(SELECT)"); - } - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - // Get column count - SQLSMALLINT colCount = 0; - ret = SQLNumResultCols(hStmt, &colCount); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLNumResultCols"); -
SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::cout << "Column count = " << colCount << std::endl; - - // If no columns, return directly - if (colCount <= 0) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - // Get column names and type information - std::vector<std::string> columnNames; - std::vector<SQLSMALLINT> columnTypes(colCount); - std::vector<SQLULEN> columnSizes(colCount); - std::vector<SQLSMALLINT> decimalDigits(colCount); - std::vector<SQLSMALLINT> nullable(colCount); - - // Get basic column information - for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLSMALLINT nameLength = 0; - ret = SQLDescribeCol(hStmt, i, NULL, 0, &nameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get length)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::vector<SQLCHAR> colNameBuffer(nameLength + 1); - SQLSMALLINT actualNameLength = 0; - - ret = SQLDescribeCol(hStmt, i, colNameBuffer.data(), nameLength + 1, - &actualNameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get name)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::string fullName(reinterpret_cast<char*>(colNameBuffer.data())); - - size_t pos = fullName.find_last_of('.'); - if (pos != std::string::npos) { - columnNames.push_back(fullName.substr(pos + 1)); - } else { - columnNames.push_back(fullName); - } - - ret = SQLDescribeCol(hStmt, i, NULL, 0, NULL, &columnTypes[i-1], - &columnSizes[i-1], &decimalDigits[i-1], &nullable[i-1]); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get type info)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - } - - std::vector<std::vector<std::string>> tableRows; - - int rowCount = 0; - // Fetch data for every row - while (true) { - ret = SQLFetch(hStmt); - if (ret == SQL_NO_DATA) { - break; - } - - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLFetch"); - break; - } - - std::vector<std::string> row; -
- for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLLEN indicator = 0; - std::string valueStr; - - SQLSMALLINT cType; - size_t bufferSize; - bool isCharacterType = false; - const int maxBufferSize = 32768; - - switch (columnTypes[i-1]) { - case SQL_CHAR: - case SQL_VARCHAR: - case SQL_LONGVARCHAR: - case SQL_WCHAR: - case SQL_WVARCHAR: - case SQL_WLONGVARCHAR: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast<int>(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_DECIMAL: - case SQL_NUMERIC: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast<int>(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_INTEGER: - case SQL_SMALLINT: - case SQL_TINYINT: - case SQL_BIGINT: - cType = SQL_C_SBIGINT; - bufferSize = sizeof(SQLBIGINT); - break; - - case SQL_REAL: - case SQL_FLOAT: - case SQL_DOUBLE: - cType = SQL_C_DOUBLE; - bufferSize = sizeof(double); - break; - - case SQL_BIT: - cType = SQL_C_BIT; - bufferSize = sizeof(SQLCHAR); - break; - - case SQL_DATE: - case SQL_TYPE_DATE: - cType = SQL_C_DATE; - bufferSize = sizeof(SQL_DATE_STRUCT); - break; - - case SQL_TIME: - case SQL_TYPE_TIME: - cType = SQL_C_TIME; - bufferSize = sizeof(SQL_TIME_STRUCT); - break; - - case SQL_TIMESTAMP: - case SQL_TYPE_TIMESTAMP: - cType = SQL_C_TIMESTAMP; - bufferSize = sizeof(SQL_TIMESTAMP_STRUCT); - break; - - default: - cType = SQL_C_CHAR; - bufferSize = 256; - isCharacterType = true; - break; - } - - std::vector<SQLCHAR> buffer(bufferSize); - - ret = SQLGetData(hStmt, i, cType, buffer.data(), bufferSize, &indicator); - - if (indicator == SQL_NULL_DATA) { - valueStr = "NULL"; - } - else if (!SQL_SUCCEEDED(ret)) { - valueStr = "ERR_CONV"; - } - else { - if (cType == SQL_C_CHAR) { - valueStr = reinterpret_cast<char*>(buffer.data()); - } - else if (cType == SQL_C_SBIGINT) { - SQLBIGINT
intVal = *reinterpret_cast<SQLBIGINT*>(buffer.data()); - valueStr = std::to_string(intVal); - } - else if (cType == SQL_C_DOUBLE) { - double doubleVal = *reinterpret_cast<double*>(buffer.data()); - valueStr = std::to_string(doubleVal); - } - else if (cType == SQL_C_BIT) { - valueStr = (*buffer.data() != 0) ? "TRUE" : "FALSE"; - } - else if (cType == SQL_C_DATE) { - SQL_DATE_STRUCT* date = reinterpret_cast<SQL_DATE_STRUCT*>(buffer.data()); - char dateStr[20]; - snprintf(dateStr, sizeof(dateStr), "%04d-%02d-%02d", - date->year, date->month, date->day); - valueStr = dateStr; - } - else if (cType == SQL_C_TIME) { - SQL_TIME_STRUCT* time = reinterpret_cast<SQL_TIME_STRUCT*>(buffer.data()); - char timeStr[15]; - snprintf(timeStr, sizeof(timeStr), "%02d:%02d:%02d", - time->hour, time->minute, time->second); - valueStr = timeStr; - } - else if (cType == SQL_C_TIMESTAMP) { - SQL_TIMESTAMP_STRUCT* ts = reinterpret_cast<SQL_TIMESTAMP_STRUCT*>(buffer.data()); - char tsStr[30]; - snprintf(tsStr, sizeof(tsStr), "%04d-%02d-%02d %02d:%02d:%02d.%06d", - ts->year, ts->month, ts->day, - ts->hour, ts->minute, ts->second, - ts->fraction / 1000); - valueStr = tsStr; - } - else { - valueStr = "UNKNOWN_TYPE"; - } - - if (isCharacterType && ret == SQL_SUCCESS_WITH_INFO) { - SQLLEN actualSize = 0; - SQLGetDiagField(SQL_HANDLE_STMT, hStmt, 0, SQL_DIAG_COLUMN_SIZE, - &actualSize, SQL_IS_INTEGER, NULL); - - if (indicator > 0 && static_cast<size_t>(indicator) > bufferSize - 1) { - valueStr += "..."; - } - } - - } - - row.push_back(valueStr); - } - - tableRows.push_back(row); - } - - if (!tableRows.empty()) { - PrintSimpleTable(columnNames, tableRows); - } - - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (const std::exception& ex) { - std::cerr << "Exception: " << ex.what() << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } - catch (...)
{ - std::cerr << "Unknown exception occurred" << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -/// Executes non-query SQL statements (such as CREATE DATABASE, CREATE TABLE, INSERT, etc.) -void Execute(SQLHDBC hDbc, const std::string& command) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN ret; - - try { - // Allocate statement handle - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - - // Execute command - ret = SQLExecDirect(hStmt, (SQLCHAR*)command.c_str(), SQL_NTS); - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect"); - } - - // Free statement handle - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (...) { - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -int main() { - SQLHENV hEnv = SQL_NULL_HENV; - SQLHDBC hDbc = SQL_NULL_HDBC; - SQLRETURN ret; - - try { - std::cout << "Start" << std::endl; - - // 1. Initialize ODBC environment - ret = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &hEnv); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_ENV)"); - - ret = SQLSetEnvAttr(hEnv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLSetEnvAttr"); - - // 2. 
Establish connection - ret = SQLAllocHandle(SQL_HANDLE_DBC, hEnv, &hDbc); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_DBC)"); - - // Connection string - std::string dsn = "Apache IoTDB DSN"; - std::string user = "root"; - std::string password = "root"; - std::string server = "127.0.0.1"; - std::string database = "test"; - - std::string connectionString = "DSN=" + dsn + ";Server=" + server + - ";UID=" + user + ";PWD=" + password + - ";Database=" + database + ";loglevel=4"; - std::cout << "Using connection string: " << connectionString << std::endl; - - SQLCHAR outConnStr[1024]; - SQLSMALLINT outConnStrLen; - - ret = SQLDriverConnect(hDbc, NULL, - (SQLCHAR*)connectionString.c_str(), SQL_NTS, - outConnStr, sizeof(outConnStr), - &outConnStrLen, SQL_DRIVER_COMPLETE); - - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - std::cerr << "Login failed" << std::endl; - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLDriverConnect"); - return 1; - } - - // Get driver name - SQLCHAR driverName[256]; - SQLSMALLINT nameLength; - ret = SQLGetInfo(hDbc, SQL_DRIVER_NAME, driverName, sizeof(driverName), &nameLength); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLGetInfo"); - - // SQL_DRIVER_NAME yields the driver name, not the database name - std::cout << "Successfully opened connection. driver = " << driverName << std::endl; - - // 3. Execute operations - Execute(hDbc, "CREATE DATABASE IF NOT EXISTS test"); - Execute(hDbc, "use test"); - std::cout << "use test Execute complete. Begin to setup fulltable."
<< std::endl; - - // Create fulltable table and insert test data - Execute(hDbc, "CREATE TABLE IF NOT EXISTS fullTable (time TIMESTAMP TIME, bool_col BOOLEAN FIELD, int32_col INT32 FIELD, int64_col INT64 FIELD, float_col FLOAT FIELD, double_col DOUBLE FIELD, text_col TEXT FIELD, string_col STRING FIELD, blob_col BLOB FIELD, timestamp_col TIMESTAMP FIELD, date_col DATE FIELD) WITH (TTL=315360000000)"); - const char* insertStatements[] = { - "INSERT INTO fulltable VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689600000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689660000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689720000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, 'Device temperature high alarm', 'DeviceA-Room1', '0x506C616E7444617462', 1735689780000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, 'Device status returned to normal', 'DeviceA-Room1', '0x506C616E7444617461', 1735689840000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, 'Device operating normally', 'DeviceB-Room2', '0x506C616E7444617463', 1735689900000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, 'Device operating normally', 'DeviceB-Room2', '0x506C616E7444617463', 1735689960000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, 'Device humidity low alarm', 'DeviceB-Room2', '0x506C616E7444617464', 1735690020000, '2026-01-04')", - "INSERT INTO fulltable VALUES 
(1735690080000, true, 108, 10000000008, 37.3, 129.489, 'Device status returned to normal', 'DeviceB-Room2', '0x506C616E7444617463', 1735690080000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 'Device operating normally', 'DeviceC-Room3', '0x506C616E7444617465', 1735690140000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, 'Device operating normally', 'DeviceC-Room3', '0x506C616E7444617465', 1735690200000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, 'Device voltage unstable alarm', 'DeviceC-Room3', '0x506C616E7444617466', 1735690260000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, 'Device status returned to normal', 'DeviceC-Room3', '0x506C616E7444617465', 1735690320000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690380000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690440000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690500000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, 'Device signal interrupted alarm', 'DeviceD-Room4', '0x506C616E7444617468', 1735690560000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690620000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, 'Device operating normally', 
'DeviceE-Room5', '0x506C616E7444617469', 1735690680000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690740000, '2026-01-04')" - }; - for (const char* sql : insertStatements) { - Execute(hDbc, sql); - } - std::cout << "fulltable setup complete. Begin to query." << std::endl; - Query(hDbc); - std::cout << "Query ok" << std::endl; - - // 4. Clean up resources - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - - return 0; - } - catch (...) { - // Exception cleanup - if (hDbc != SQL_NULL_HDBC) { - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - } - if (hEnv != SQL_NULL_HENV) { - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - } - - std::cerr << "Unexpected error!" << std::endl; - return 1; - } -} -``` - -### 4.4 PowerBI Example -1. Open PowerBI Desktop and create a new project. -2. Click "Home" → "Get Data" → "More..." → "ODBC", then click the "Connect" button. -3. Data Source Selection: In the pop-up window, select "Data Source Name (DSN)" and choose `Apache IoTDB DSN` from the dropdown. -4. Advanced Configuration: - * Click "Advanced options" and enter the configuration in the "Connection string" input box, for example: - ```Plain - server=127.0.0.1;port=6667;database=test;isTableModel=true;loglevel=4 - ``` - * Notes: - * The `dsn` item is optional; omitting it does not affect the connection. - * `loglevel` ranges from 0 to 4: level 0 (ERROR) produces the fewest logs and level 4 (TRACE) the most detailed; set it as needed. - * The keys `server`, `database`, `dsn`, and `loglevel` are case-insensitive (e.g., they can be written as `Server` or `DATABASE`).
- * If the DSN already contains these settings, you can leave the connection string empty; the Driver Manager automatically uses the values configured in the DSN. -5. Authentication: Enter the username (default `root`) and password (default `root`), then click "Connect". -6. Data Loading: Select the table to load (e.g., fulltable/table1), then click "Load" to view the data. - -### 4.5 Excel Example -1. Open Excel and create a blank workbook. -2. Click the "Data" tab → "From Other Sources" → "From Data Connection Wizard". -3. Data Source Selection: Select "ODBC DSN" → Next → Select `Apache IoTDB DSN` → Next. -4. Connection Configuration: - * Enter the connection string, username, and password exactly as in the PowerBI example. Reference format for the connection string: - ```Plain - server=127.0.0.1;port=6667;database=test;isTableModel=true;loglevel=4 - ``` - * If the DSN already contains these settings, you can leave the connection string empty; the Driver Manager automatically uses the values configured in the DSN. -5. Table Selection: Choose the database and table you wish to access (e.g., fulltable), then click Next. -6. Save Connection: Customize the data connection file name, connection description, and other settings, then click "Finish". -7. Import Data: Select where to place the data in the worksheet (e.g., cell A1 of "Existing Worksheet"), then click "OK" to load the data. \ No newline at end of file diff --git a/src/UserGuide/Master/Table/API/Programming-Python-Native-API_timecho.md b/src/UserGuide/Master/Table/API/Programming-Python-Native-API_timecho.md deleted file mode 100644 index 4dd1fe4ff..000000000 --- a/src/UserGuide/Master/Table/API/Programming-Python-Native-API_timecho.md +++ /dev/null @@ -1,752 +0,0 @@ - -# Python Native API - -IoTDB provides a Python native client driver and a session pool management mechanism.
These tools allow developers to interact with IoTDB in a programmatic and efficient manner. Using the Python API, developers can encapsulate time-series data into objects (e.g., `Tablet`, `NumpyTablet`) and insert them into the database directly, without the need to manually construct SQL statements. For multi-threaded operations, the `TableSessionPool` is recommended to optimize resource utilization and enhance performance. - -## 1. Prerequisites - -To use the IoTDB Python API, install the required package using pip: - -```shell -pip3 install "apache-iotdb>=2.0" -``` -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. - -## 2. Read and Write Operations - -### 2.1 TableSession - -`TableSession` is a core class in IoTDB, enabling users to interact with the IoTDB database. It provides methods to execute SQL statements, insert data, and manage database sessions. - -#### Method Overview - -| **Method Name** | **Description** | **Parameter Type** | **Return Type** | -| --------------------------- | ----------------------------------------------------- | ------------------------------------ | ---------------- | -| insert | Inserts data into the database. | tablet: `Union[Tablet, NumpyTablet]` | None | -| execute_non_query_statement | Executes non-query SQL statements like DDL/DML. | sql: `str` | None | -| execute_query_statement | Executes a query SQL statement and retrieves results. | sql: `str` | `SessionDataSet` | -| close | Closes the session and releases resources. | None | None | - -**Since V2.0.8.2**, `SessionDataSet` provides methods for batch DataFrame retrieval to efficiently handle large-volume queries: - -```python -# Batch DataFrame retrieval -has_next = result.has_next_df() -if has_next: - df = result.next_df() - # Process DataFrame -``` - -**Method Details:** -- `has_next_df()`: Returns `True`/`False` indicating whether more data exists. -- `next_df()`: Returns a `DataFrame` or `None`. 
Each call returns `fetchSize` rows (default: 5000 rows, controlled by Session's `fetch_size` parameter): - - If remaining data ≥ `fetchSize`: returns `fetchSize` rows - - If remaining data < `fetchSize`: returns all remaining rows - - If traversal completes: returns `None` -- Session validates `fetchSize` at initialization: if ≤0, resets to 5000 and logs warning: `fetch_size xxx is illegal, use default fetch_size 5000` - -**Note:** Avoid mixing different traversal methods (e.g., combining `todf()` with `next_df()`), which may cause unexpected errors. - -**Since V2.0.8.3**, the Python client has supported `TSDataType.OBJECT` for Tablet batch write and Session value serialization. Query results are read via the `Field` object. The related interfaces are defined as follows: - -| Function Name | Description | Parameters | Return Value | -|---------------|-------------|------------|--------------| -| `encode_object_cell` | Encodes a single OBJECT cell into wire-format bytes | `is_eof: bool`,
`offset: int`,
`content: bytes` | `bytes`: `|[eof 1B]|[offset 8B BE]|[payload]|` | -| `decode_object_cell` | Parses a wire-format cell back into `eof`, `offset`, and `payload` | `cell: bytes` (length ≥ 9) | `Tuple[bool, int, bytes]`: `(is_eof, offset, payload)` | -| `Tablet.add_value_object` | Writes an OBJECT cell at the specified row and column (internally calls `encode_object_cell`) | `row_index: int`,
`column_index: int`,
`is_eof: bool`,
`offset: int`,
`content: bytes` | `None` | -| `Tablet.add_value_object_by_name` | Same as above, locates column by name | `column_name: str`,
`row_index: int`,
`is_eof: bool`,
`offset: int`,
`content: bytes` | `None` | -| `NumpyTablet.add_value_object` | Same semantics as `Tablet.add_value_object`, column data is stored as `ndarray` | Same as above (`row_index`, `column_index`, ...) | `None` | -| `Field.get_object_value` | Converts the value to a Python value based on the **target type** | `data_type: TSDataType` | Depends on type:
For OBJECT: `str` decoded from the entire `self.value` in UTF-8 (see Field.py) | -| `Field.get_string_value` | Returns a string representation | None | `str`;
For OBJECT: `self.value.decode("utf-8")` | -| `Field.get_binary_value` | Gets the binary data of TEXT/STRING/BLOB | None | `bytes` or `None`;
**Throws an error for OBJECT columns and should not be called** | - - -#### Sample Code - -```Python -class TableSession(object): -def insert(self, tablet: Union[Tablet, NumpyTablet]): - """ - Insert data into the database. - - Parameters: - tablet (Tablet | NumpyTablet): The tablet containing the data to be inserted. - Accepts either a `Tablet` or `NumpyTablet`. - - Raises: - IoTDBConnectionException: If there is an issue with the database connection. - """ - pass - -def execute_non_query_statement(self, sql: str): - """ - Execute a non-query SQL statement. - - Parameters: - sql (str): The SQL statement to execute. Typically used for commands - such as INSERT, DELETE, or UPDATE. - - Raises: - IoTDBConnectionException: If there is an issue with the database connection. - """ - pass - -def execute_query_statement(self, sql: str, timeout_in_ms: int = 0) -> "SessionDataSet": - """ - Execute a query SQL statement and return the result set. - - Parameters: - sql (str): The SQL query to execute. - timeout_in_ms (int, optional): Timeout for the query in milliseconds. Defaults to 0, - which means no timeout. - - Returns: - SessionDataSet: The result set of the query. - - Raises: - IoTDBConnectionException: If there is an issue with the database connection. - """ - pass - -def close(self): - """ - Close the session and release resources. - - Raises: - IoTDBConnectionException: If there is an issue closing the connection. - """ - pass -``` - -### 2.2 TableSessionConfig - -`TableSessionConfig` is a configuration class that sets parameters for creating a `TableSession` instance, defining essential settings for connecting to the IoTDB database. - -#### Parameter Configuration - -| **Parameter** | **Description** | **Type** | **Default Value** | -| ------------------ | ------------------------------------- | -------- |-----------------------------------------------| -| node_urls | List of database node URLs. 
| `list` | `["localhost:6667"]` | -| username | Username for the database connection. | `str` | `"root"` | -| password | Password for the database connection. | `str` | `"TimechoDB@2021"` (`"root"` before V2.0.6) | -| database | Target database to connect to. | `str` | `None` | -| fetch_size | Number of rows to fetch per query. | `int` | `5000` | -| time_zone | Default session time zone. | `str` | `Session.DEFAULT_ZONE_ID` | -| enable_compression | Enable data compression. | `bool` | `False` | - -#### Sample Code - -```Python -class TableSessionConfig(object): - """ - Configuration class for a TableSession. - - This class defines various parameters for connecting to and interacting - with the IoTDB tables. - """ - - def __init__( - self, - node_urls: list = None, - username: str = Session.DEFAULT_USER, - password: str = Session.DEFAULT_PASSWORD, - database: str = None, - fetch_size: int = 5000, - time_zone: str = Session.DEFAULT_ZONE_ID, - enable_compression: bool = False, - ): - """ - Initialize a TableSessionConfig object with the provided parameters. - - Parameters: - node_urls (list, optional): A list of node URLs for the database connection. - Defaults to ["localhost:6667"]. - username (str, optional): The username for the database connection. - Defaults to "root". - password (str, optional): The password for the database connection. - Defaults to "TimechoDB@2021" ("root" before V2.0.6). - database (str, optional): The target database to connect to. Defaults to None. - fetch_size (int, optional): The number of rows to fetch per query. Defaults to 5000. - time_zone (str, optional): The default time zone for the session. - Defaults to Session.DEFAULT_ZONE_ID. - enable_compression (bool, optional): Whether to enable data compression. - Defaults to False. - """ -``` - -**Note:** After using a `TableSession`, make sure to call the `close` method to release resources. - -## 3. 
Session Pool - -### 3.1 TableSessionPool - -`TableSessionPool` is a session pool management class designed for creating and managing `TableSession` instances. It provides functionality to retrieve sessions from the pool and close the pool when it is no longer needed. - -#### Method Overview - -| **Method Name** | **Description** | **Return Type** | **Exceptions** | -| --------------- | ------------------------------------------------------ | --------------- | -------------- | -| get_session | Retrieves a new `TableSession` instance from the pool. | `TableSession` | None | -| close | Closes the session pool and releases all resources. | None | None | - -#### Sample Code - -```Python -def get_session(self) -> TableSession: - """ - Retrieve a new TableSession instance. - - Returns: - TableSession: A new session object configured with the session pool. - - Notes: - The session is initialized with the underlying session pool for managing - connections. Ensure proper usage of the session's lifecycle. - """ - -def close(self): - """ - Close the session pool and release all resources. - - This method closes the underlying session pool, ensuring that all - resources associated with it are properly released. - - Notes: - After calling this method, the session pool cannot be used to retrieve - new sessions, and any attempt to do so may raise an exception. - """ -``` - -### 3.2 TableSessionPoolConfig - -`TableSessionPoolConfig` is a configuration class used to define parameters for initializing and managing a `TableSessionPool` instance. It specifies the settings needed for efficient session pool management in IoTDB. - -#### Parameter Configuration - -| **Parameter** | **Description** | **Type** | **Default Value** | -| ------------------ | ------------------------------------------------------------ | -------- | -------------------------- | -| node_urls | List of IoTDB cluster node URLs. 
| `list` | None | -| max_pool_size | Maximum size of the session pool, i.e., the maximum number of sessions allowed in the pool. | `int` | `5` | -| username | Username for the connection. | `str` | `Session.DEFAULT_USER` | -| password | Password for the connection. | `str` | `Session.DEFAULT_PASSWORD` | -| database | Target database to connect to. | `str` | None | -| fetch_size | Fetch size for query results | `int` | `5000` | -| time_zone | Timezone-related `ZoneId` | `str` | `Session.DEFAULT_ZONE_ID` | -| enable_redirection | Whether to enable redirection. | `bool` | `False` | -| enable_compression | Whether to enable data compression. | `bool` | `False` | -| wait_timeout_in_ms | Sets the connection timeout in milliseconds. | `int` | `10000` | -| max_retry | Maximum number of connection retry attempts. | `int` | `3` | - -#### Sample Code - -```Python -class TableSessionPoolConfig(object): - """ - Configuration class for a TableSessionPool. - - This class defines the parameters required to initialize and manage - a session pool for interacting with the IoTDB database. - """ - def __init__( - self, - node_urls: list = None, - max_pool_size: int = 5, - username: str = Session.DEFAULT_USER, - password: str = Session.DEFAULT_PASSWORD, - database: str = None, - fetch_size: int = 5000, - time_zone: str = Session.DEFAULT_ZONE_ID, - enable_redirection: bool = False, - enable_compression: bool = False, - wait_timeout_in_ms: int = 10000, - max_retry: int = 3, - ): - """ - Initialize a TableSessionPoolConfig object with the provided parameters. - - Parameters: - node_urls (list, optional): A list of node URLs for the database connection. - Defaults to None. - max_pool_size (int, optional): The maximum number of sessions in the pool. - Defaults to 5. - username (str, optional): The username for the database connection. - Defaults to Session.DEFAULT_USER. - password (str, optional): The password for the database connection. - Defaults to Session.DEFAULT_PASSWORD. 
- database (str, optional): The target database to connect to. Defaults to None. - fetch_size (int, optional): The number of rows to fetch per query. Defaults to 5000. - time_zone (str, optional): The default time zone for the session pool. - Defaults to Session.DEFAULT_ZONE_ID. - enable_redirection (bool, optional): Whether to enable redirection. - Defaults to False. - enable_compression (bool, optional): Whether to enable data compression. - Defaults to False. - wait_timeout_in_ms (int, optional): The maximum time (in milliseconds) to wait for a session - to become available. Defaults to 10000. - max_retry (int, optional): The maximum number of retry attempts for operations. Defaults to 3. - - """ -``` - -**Notes:** - -- Ensure that `TableSession` instances retrieved from the `TableSessionPool` are properly closed after use. -- After closing the `TableSessionPool`, it will no longer be possible to retrieve new sessions. - -### 3.3 SSL Connection - -#### 3.3.1 Server Certificate Configuration - -In the `conf/iotdb-system.properties` configuration file, locate or add the following configuration items: - -``` -enable_thrift_ssl=true -key_store_path=/path/to/your/server_keystore.jks -key_store_pwd=your_keystore_password -``` - -#### 3.3.2 Configure Python Client Certificate - -- Set `use_ssl` to `True` to enable SSL. -- Specify the certificate path (the server certificate or a CA certificate) using the `ca_certs` parameter. - -``` -use_ssl = True -ca_certs = "/path/to/your/server.crt" # or ca_certs = "/path/to/your/ca_cert.pem" -``` -**Example Code: Using SSL to Connect to IoTDB** - -```Python -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. 
You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, -# software distributed under the License is distributed on an -# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -# KIND, either express or implied. See the License for the -# specific language governing permissions and limitations -# under the License. -# - -from iotdb.SessionPool import PoolConfig, SessionPool -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021" # Default password; use "root" before V2.0.6 -# Enable SSL -use_ssl = True -# Configure certificate path -ca_certs = "/path/server.crt" - - -def get_data(): - session = Session( - ip, port_, username_, password_, use_ssl=use_ssl, ca_certs=ca_certs - ) - session.open(False) - with session.execute_query_statement("SHOW DATABASES") as session_data_set: - print(session_data_set.get_column_names()) - while session_data_set.has_next(): - print(session_data_set.next()) - - session.close() - - -def get_data2(): - pool_config = PoolConfig( - host=ip, - port=port_, - user_name=username_, - password=password_, - fetch_size=1024, - time_zone="UTC+8", - max_retry=3, - use_ssl=use_ssl, - ca_certs=ca_certs, - ) - max_pool_size = 5 - wait_timeout_in_ms = 3000 - session_pool = SessionPool(pool_config, max_pool_size, wait_timeout_in_ms) - session = session_pool.get_session() - with session.execute_query_statement("SHOW DATABASES") as session_data_set: - print(session_data_set.get_column_names()) - while session_data_set.has_next(): - print(session_data_set.next()) - session_pool.put_back(session) - session_pool.close() - - -if __name__ == "__main__": - get_data() -``` - -## 4. Sample Code - -**Session** Example: You can find the full example code at [Session Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/table_model_session_example.py). 
- -**Session Pool** Example: You can find the full example code at [SessionPool Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/table_model_session_pool_example.py). - -Here is an excerpt of the sample code: - -```Python -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, -# software distributed under the License is distributed on an -# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -# KIND, either express or implied. See the License for the -# specific language governing permissions and limitations -# under the License. 
-# -import threading - -import numpy as np - -from iotdb.table_session_pool import TableSessionPool, TableSessionPoolConfig -from iotdb.utils.IoTDBConstants import TSDataType -from iotdb.utils.NumpyTablet import NumpyTablet -from iotdb.utils.Tablet import ColumnType, Tablet - - -def prepare_data(): - print("create database") - # Get a session from the pool - session = session_pool.get_session() - session.execute_non_query_statement("CREATE DATABASE IF NOT EXISTS db1") - session.execute_non_query_statement('USE "db1"') - session.execute_non_query_statement( - "CREATE TABLE table0 (id1 string id, attr1 string attribute, " - + "m1 double " - + "field)" - ) - session.execute_non_query_statement( - "CREATE TABLE table1 (id1 string tag, attr1 string attribute, " - + "m1 double " - + "field)" - ) - - print("now the tables are:") - # show result - with session.execute_query_statement("SHOW TABLES") as res: - while res.has_next(): - print(res.next()) - - session.close() - - -def insert_data(num: int): - print("insert data for table" + str(num)) - # Get a session from the pool - session = session_pool.get_session() - column_names = [ - "id1", - "attr1", - "m1", - ] - data_types = [ - TSDataType.STRING, - TSDataType.STRING, - TSDataType.DOUBLE, - ] - column_types = [ColumnType.TAG, ColumnType.ATTRIBUTE, ColumnType.FIELD] - timestamps = [] - values = [] - for row in range(15): - timestamps.append(row) - values.append(["id:" + str(row), "attr:" + str(row), row * 1.0]) - tablet = Tablet( - "table" + str(num), column_names, data_types, values, timestamps, column_types - ) - session.insert(tablet) - session.execute_non_query_statement("FLush") - - np_timestamps = np.arange(15, 30, dtype=np.dtype(">i8")) - np_values = [ - np.array(["id:{}".format(i) for i in range(15, 30)]), - np.array(["attr:{}".format(i) for i in range(15, 30)]), - np.linspace(15.0, 29.0, num=15, dtype=TSDataType.DOUBLE.np_dtype()), - ] - - np_tablet = NumpyTablet( - "table" + str(num), - column_names, - 
data_types, - np_values, - np_timestamps, - column_types=column_types, - ) - session.insert(np_tablet) - session.close() - - -def query_data(): - # Get a session from the pool - session = session_pool.get_session() - - print("get data from table0") - with session.execute_query_statement("select * from table0") as res: - while res.has_next(): - print(res.next()) - - print("get data from table1") - with session.execute_query_statement("select * from table1") as res: - while res.has_next(): - print(res.next()) - - # Query table data in DataFrame batches (recommended for large result sets) - print("get data from table0 using batch DataFrame") - with session.execute_query_statement("select * from table0") as res: - while res.has_next_df(): - print(res.next_df()) - - session.close() - - -def delete_data(): - session = session_pool.get_session() - session.execute_non_query_statement("drop database db1") - print("data has been deleted. now the databases are:") - with session.execute_query_statement("show databases") as res: - while res.has_next(): - print(res.next()) - session.close() - - -# Create a session pool -username = "root" -password = "TimechoDB@2021" # Default password; use "root" before V2.0.6 -node_urls = ["127.0.0.1:6667", "127.0.0.1:6668", "127.0.0.1:6669"] -fetch_size = 1024 -database = "db1" -max_pool_size = 5 -wait_timeout_in_ms = 3000 -config = TableSessionPoolConfig( - node_urls=node_urls, - username=username, - password=password, - database=database, - max_pool_size=max_pool_size, - fetch_size=fetch_size, - wait_timeout_in_ms=wait_timeout_in_ms, -) -session_pool = TableSessionPool(config) - -prepare_data() - -insert_thread1 = threading.Thread(target=insert_data, args=(0,)) -insert_thread2 = threading.Thread(target=insert_data, args=(1,)) - -insert_thread1.start() -insert_thread2.start() - -insert_thread1.join() -insert_thread2.join() - -query_data() -delete_data() -session_pool.close() -print("example is finished!") -``` - -**Object Type Usage Example** - -```python 
-import os - -import numpy as np -import pytest - -from iotdb.utils.IoTDBConstants import TSDataType -from iotdb.utils.NumpyTablet import NumpyTablet -from iotdb.utils.Tablet import Tablet, ColumnType -from iotdb.utils.object_column import decode_object_cell - - -def _require_thrift(): - pytest.importorskip("iotdb.thrift.common.ttypes") - - -def _session_endpoint(): - host = os.environ.get("IOTDB_HOST", "127.0.0.1") - port = int(os.environ.get("IOTDB_PORT", "6667")) - return host, port - - -@pytest.fixture(scope="module") -def table_session(): - _require_thrift() - from iotdb.Session import Session - from iotdb.table_session import TableSession, TableSessionConfig - - host, port = _session_endpoint() - cfg = TableSessionConfig( - node_urls=[f"{host}:{port}"], - username=os.environ.get("IOTDB_USER", Session.DEFAULT_USER), - password=os.environ.get("IOTDB_PASSWORD", Session.DEFAULT_PASSWORD), - ) - ts = TableSession(cfg) - yield ts - ts.close() - - -def test_table_numpy_tablet_object_columns(table_session): - """ - Table model: Tablet.add_value_object / add_value_object_by_name, - NumpyTablet.add_value_object, insert + query Field + decode_object_cell; - Also includes writing OBJECT in two segments at the same timestamp - (first with is_eof=False/offset=0, then with is_eof=True/offset=length of the first segment), - and verifies the complete concatenated bytes using read_object(f1). 
- """ - db = "test_py_object_e2e" - table = "obj_tbl" - table_session.execute_non_query_statement(f"CREATE DATABASE IF NOT EXISTS {db}") - table_session.execute_non_query_statement(f"USE {db}") - table_session.execute_non_query_statement(f"DROP TABLE IF EXISTS {table}") - table_session.execute_non_query_statement( - f"CREATE TABLE {table}(" - "device STRING TAG, temp FLOAT FIELD, f1 OBJECT FIELD, f2 OBJECT FIELD)" - ) - - column_names = ["device", "temp", "f1", "f2"] - data_types = [ - TSDataType.STRING, - TSDataType.FLOAT, - TSDataType.OBJECT, - TSDataType.OBJECT, - ] - column_types = [ - ColumnType.TAG, - ColumnType.FIELD, - ColumnType.FIELD, - ColumnType.FIELD, - ] - timestamps = [100, 200] - values = [ - ["d1", 1.5, None, None], - ["d1", 2.5, None, None], - ] - - tablet = Tablet( - table, column_names, data_types, values, timestamps, column_types - ) - tablet.add_value_object(0, 2, True, 0, b"first-row-obj") - # Single-segment write for the entire object: is_eof=True and offset=0; - # Segmented sequential writes must pass server-side offset/length validation - tablet.add_value_object_by_name("f2", 0, True, 0, b"seg") - tablet.add_value_object(1, 2, True, 0, b"second-f1") - tablet.add_value_object(1, 3, True, 0, b"second-f2") - table_session.insert(tablet) - - ts_arr = np.array([300, 400], dtype=TSDataType.INT64.np_dtype()) - np_vals = [ - np.array(["d1", "d1"]), - np.array([1.0, 2.0], dtype=np.float32), - np.array([None, None], dtype=object), - np.array([None, None], dtype=object), - ] - np_tab = NumpyTablet( - table, column_names, data_types, np_vals, ts_arr, column_types=column_types - ) - np_tab.add_value_object(0, 2, True, 0, b"np-r0-f1") - np_tab.add_value_object(0, 3, True, 0, b"np-r0-f2") - np_tab.add_value_object(1, 2, True, 0, b"np-r1-f1") - np_tab.add_value_object(1, 3, True, 0, b"np-r1-f2") - table_session.insert(np_tab) - - # Segmented OBJECT: first with is_eof=False (continue transmission), - # then with is_eof=True (last segment); offset is the 
length of written bytes - chunk0 = bytes((i % 256) for i in range(512)) - chunk1 = b"\xab" * 64 - expected_segmented = chunk0 + chunk1 - seg1 = Tablet( - table, - column_names, - data_types, - [["d1", 3.0, None, None]], - [500], - column_types, - ) - seg1.add_value_object(0, 2, False, 0, chunk0) - seg1.add_value_object(0, 3, True, 0, b"f2-seg") - table_session.insert(seg1) - seg2 = Tablet( - table, - column_names, - data_types, - [["d1", 3.0, None, None]], - [500], - column_types, - ) - seg2.add_value_object(0, 2, True, 512, chunk1) - seg2.add_value_object(0, 3, True, 0, b"f2-seg") - table_session.insert(seg2) - - with table_session.execute_query_statement( - f"SELECT read_object(f1) FROM {table} WHERE time = 500" - ) as ds: - assert ds.has_next() - row = ds.next() - blob = row.get_fields()[0].get_binary_value() - assert blob == expected_segmented - assert not ds.has_next() - - seen = 0 - with table_session.execute_query_statement( - f"SELECT device, temp, f1, f2 FROM {table} ORDER BY time" - ) as ds: - while ds.has_next(): - row = ds.next() - fields = row.get_fields() - assert fields[0].get_object_value(TSDataType.STRING) == "d1" - assert fields[1].get_object_value(TSDataType.FLOAT) is not None - for j in (2, 3): - raw = fields[j].value - assert isinstance(raw, (bytes, bytearray)) - eof, off, body = decode_object_cell(bytes(raw)) - assert isinstance(eof, bool) and isinstance(off, int) - assert isinstance(body, bytes) - fields[j].get_string_value() - fields[j].get_object_value(TSDataType.OBJECT) - seen += 1 - assert seen == 5 - - -if __name__ == "__main__": - pytest.main([__file__, "-v", "-rs"]) -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Table/API/RestAPI-V1_timecho.md b/src/UserGuide/Master/Table/API/RestAPI-V1_timecho.md deleted file mode 100644 index 21ce0768c..000000000 --- a/src/UserGuide/Master/Table/API/RestAPI-V1_timecho.md +++ /dev/null @@ -1,363 +0,0 @@ - -# RestAPI V1 - -IoTDB's RESTful service can be used for querying, writing, 
and management operations. It uses the OpenAPI standard to define interfaces and generate frameworks. - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the REST service JAR file by default. Please contact the Timecho team to obtain the corresponding JAR file before using this service, and place it in the `timechodb_home/lib` or `timechodb_home/ext/external_service` directory. - -## 1. Enabling RESTful Service - -The RESTful service is disabled by default. To enable it, locate the `conf/iotdb-system.properties` file in the IoTDB installation directory, set `enable_rest_service` to `true`, and then restart the DataNode process. - -```Properties -enable_rest_service=true -``` - -## 2. Authentication - -All RESTful APIs use **Basic Authentication**, except the health check interface `/ping`. -All other requests must carry the `Authorization` information in the request header. - -1. Authentication Format -``` -Authorization: Basic -``` -The value that follows `Basic` is generated by Base64-encoding the string `username:password`. -Quick generation methods are shown below: - -* Linux / macOS -```bash -echo -n "your_username:your_password" | base64 -# e.g.: echo -n "root:TimechoDB@2021" | base64 -``` - -* Windows -```powershell -# PowerShell -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("username:password")) -# e.g.: [Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("root:TimechoDB@2021")) -``` -```cmd -REM CMD -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"username:password\"))" -REM e.g.: powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"root:TimechoDB@2021\"))" -``` - -2. Authentication Example - -Default username: `root`, password: `TimechoDB@2021` - -- Concatenated string: `root:TimechoDB@2021` -- Base64 encoded result: `cm9vdDpUaW1lY2hvREJAMjAyMQ==` -- Final Header: -``` -Authorization: Basic cm9vdDpUaW1lY2hvREJAMjAyMQ== -``` - -3. 
Error Description -- Incorrect username or password: returns HTTP status code `801` -```json -{"code":801,"message":"WRONG_LOGIN_PASSWORD"} -``` - -- Missing `Authorization` configuration: returns HTTP status code `800` -```json -{"code":800,"message":"INIT_AUTH_ERROR"} -``` - - -## 3. Interface Definitions - -### 3.1 Ping - -The `/ping` endpoint can be used for online service health checks. - -- Request Method: GET - -- Request Path: `http://ip:port/ping` - -- Example Request: - - ```Bash - curl http://127.0.0.1:18080/ping - ``` - -- HTTP Status Codes: - - - `200`: The service is working normally and can accept external requests. - - - `503`: The service is experiencing issues and cannot accept external requests. - - | Parameter Name | Type | Description | - | :------------- | :------ | :--------------- | - | code | integer | Status Code | - | message | string | Code Information | - -- Response Example: - - - When the HTTP status code is `200`: - - ```JSON - { "code": 200, "message": "SUCCESS_STATUS"} - ``` - - - When the HTTP status code is `503`: - - ```JSON - { "code": 500, "message": "thrift service is unavailable"} - ``` - -**Note**: The `/ping` endpoint does not require authentication. - -### 3.2 Query Interface - -- Request Path: `/rest/table/v1/query` - -- Request Method: POST - -- Request Format: - - - Header: `application/json` - - - Request Parameters: - - | Parameter Name | Type | Required | Description | - | :------------- | :----- | :------- | :----------------------------------------------------------- | - | `database` | string | Yes | Database name | - | `sql` | string | Yes | SQL query | - | `row_limit` | int | No | Maximum number of rows to return in a single query. If not set, the default value from the configuration file (`rest_query_default_row_size_limit`) is used. If the result set exceeds this limit, status code `411` is returned. 
| - -- Response Format: - - | Parameter Name | Type | Description | - | :------------- | :---- | :----------------------------------------------------------- | - | `column_names` | array | Column names | - | `data_types` | array | Data types of each column | - | `values` | array | A 2D array where the first dimension represents rows, and the second dimension represents columns. Each element corresponds to a column, with the same length as `column_names`. | - -- Example Request: - - ```Bash - curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"database":"test","sql":"select s1,s2,s3 from test_table"}' http://127.0.0.1:18080/rest/table/v1/query - ``` - -- Example Response: - - ```JSON - { - "column_names": [ - "s1", - "s2", - "s3" - ], - "data_types": [ - "STRING", - "BOOLEAN", - "INT32" - ], - "values": [ - [ - "a11", - true, - 2024 - ], - [ - "a11", - false, - 2025 - ] - ] - } - ``` - -### 3.3 Non-Query Interface - -- Request Path: `/rest/table/v1/nonQuery` - -- Request Method: POST - -- Request Format: - - - Header: `application/json` - - - Request Parameters: - - | Parameter Name | Type | Required | Description | - | :------------- | :----- | :------- | :------------ | - | `sql` | string | Yes | SQL statement | - | `database` | string | No | Database name | - -- Response Format: - - | Parameter Name | Type | Description | - | :------------- | :------ | :---------- | - | `code` | integer | Status code | - | `message` | string | Message | - -- Example Requests: - - - Create a database: - - ```Bash - curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"create database test","database":""}' http://127.0.0.1:18080/rest/table/v1/nonQuery - ``` - - - Create a table `test_table` in the `test` database: - - ```Bash - curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"CREATE TABLE table1 (time TIMESTAMP TIME,region STRING 
TAG,plant_id STRING TAG,device_id STRING TAG,model_id STRING ATTRIBUTE,maintenance STRING ATTRIBUTE,temperature FLOAT FIELD,humidity FLOAT FIELD,status Boolean FIELD,arrival_time TIMESTAMP FIELD) WITH (TTL=31536000000)","database":"test"}' http://127.0.0.1:18080/rest/table/v1/nonQuery - ``` - -- Example Response: - - ```JSON - { - "code": 200, - "message": "SUCCESS_STATUS" - } - ``` - -### 3.4 Batch Write Interface - -- Request Path: `/rest/table/v1/insertTablet` - -- Request Method: POST - -- Request Format: - - - Header: `application/json` - - - Request Parameters: - - | Parameter Name | Type | Required | Description | - | :------------------ | :----- | :------- | :----------------------------------------------------------- | - | `database` | string | Yes | Database name | - | `table` | string | Yes | Table name | - | `column_names` | array | Yes | Column names | - | `column_categories` | array | Yes | Column categories (`TAG`, `FIELD`, `ATTRIBUTE`) | - | `data_types` | array | Yes | Data types | - | `timestamps` | array | Yes | Timestamp column | - | `values` | array | Yes | Value columns. Each column's values can be `null`. A 2D array where the first dimension corresponds to timestamps, and the second dimension corresponds to columns. | - -- Response Format: - - | Parameter Name | Type | Description | - | :------------- | :------ | :---------- | - | `code` | integer | Status code | - | `message` | string | Message | - -- Example Request: - - ```Bash - curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"database":"test","column_categories":["TAG","FIELD","FIELD"],"timestamps":[1739702535000,1739789055000],"column_names":["s1","s2","s3"],"data_types":["STRING","BOOLEAN","INT32"],"values":[["a11",true,2024],["a11",false,2025]],"table":"test_table"}' http://127.0.0.1:18080/rest/table/v1/insertTablet - ``` - -- Example Response: - - ```JSON - { - "code": 200, - "message": "SUCCESS_STATUS" - } - ``` - -## 4. 
Configuration - -The configuration file is located in `iotdb-system.properties`. - -- Set `enable_rest_service` to `true` to enable the module, or `false` to disable it. The default value is `false`. - - ```Properties - enable_rest_service=true - ``` - -- Only effective when `enable_rest_service=true`. Set `rest_service_port` to a number (1025~65535) to customize the REST service socket port. The default value is `18080`. - - ```Properties - rest_service_port=18080 - ``` - -- Set `enable_swagger` to `true` to enable Swagger for displaying REST interface information, or `false` to disable it. The default value is `false`. - - ```Properties - enable_swagger=false - ``` - -- The maximum number of rows that can be returned in a single query. If the result set exceeds this limit, only the rows within the limit will be returned, and status code `411` will be returned. - - ```Properties - rest_query_default_row_size_limit=10000 - ``` - -- Expiration time for caching client login information (used to speed up user authentication, in seconds, default is 8 hours). - - ```Properties - cache_expire_in_seconds=28800 - ``` - -- Maximum number of users stored in the cache (default is 100). - - ```Properties - cache_max_num=100 - ``` - -- Initial cache capacity (default is 10). - - ```Properties - cache_init_num=10 - ``` - -- Whether to enable SSL configuration for the REST service. Set `enable_https` to `true` to enable it, or `false` to disable it. The default value is `false`. - - ```Properties - enable_https=false - ``` - -- Path to the `keyStore` (optional). - - ```Properties - key_store_path= - ``` - -- Password for the `keyStore` (optional). - - ```Properties - key_store_pwd= - ``` - -- Path to the `trustStore` (optional). - - ```Properties - trust_store_path="" - ``` - -- Password for the `trustStore` (optional). - - ```Properties - trust_store_pwd="" - ``` - -- SSL timeout time, in seconds. 
-
-  ```Properties
-  idle_timeout_in_seconds=5000
-  ```
\ No newline at end of file
diff --git a/src/UserGuide/Master/Table/Background-knowledge/Cluster-Concept_timecho.md b/src/UserGuide/Master/Table/Background-knowledge/Cluster-Concept_timecho.md
deleted file mode 100644
index 7ebd77297..000000000
--- a/src/UserGuide/Master/Table/Background-knowledge/Cluster-Concept_timecho.md
+++ /dev/null
@@ -1,140 +0,0 @@
-
-
-# Common Concepts
-
-## 1. SQL Dialect Related Concepts
-
-### 1.1 sql_dialect
-
-IoTDB supports two time-series data modes (SQL dialects), both of which manage devices and measurement points:
-
-- **Tree Mode**: Organizes data in a hierarchical path structure, where each path represents a measurement point of a device.
-- **Table Mode**: Organizes data in a relational table format, where each table corresponds to a type of device.
-
-Each dialect comes with its own SQL syntax and query patterns tailored to its data mode.
-
-### 1.2 Schema
-
-Schema refers to the metadata structure of the database, which can follow either a tree or table format. It includes definitions such as measurement point names, data types, and storage configurations.
-
-### 1.3 Device
-
-A device corresponds to a physical device in a real-world scenario, typically associated with multiple measurement points.
-
-### 1.4 Timeseries
-
-Also referred to as: physical quantity, time series, timeline, point, signal, metric, measurement value, etc.
-A measurement point is a time series consisting of multiple data points arranged in ascending timestamp order. It typically represents a collection point that periodically gathers physical quantities from its environment.
-
-### 1.5 Encoding
-
-Encoding is a compression technique that represents data in binary form, improving storage efficiency. IoTDB supports multiple encoding methods for different types of data. For details, refer to: [Compression and Encoding](../Technical-Insider/Encoding-and-Compression.md).
-
-### 1.6 Compression
-
-After encoding, IoTDB applies additional compression techniques to further reduce data size and improve storage efficiency. Various compression algorithms are supported. For details, refer to: [Compression and Encoding](../Technical-Insider/Encoding-and-Compression.md).
-
-## 2. Distributed System Related Concepts
-
-IoTDB supports distributed deployments, typically in a 3C3D cluster model (3 ConfigNodes, 3 DataNodes), as illustrated below:
-
-
-
-### 2.1 Key Concepts
-
-- **Nodes** (*ConfigNode, DataNode, AINode*)
-- **Regions** (*SchemaRegion, DataRegion*)
-- **Replica Groups**
-
-Below is an introduction to these concepts.
-
-
-### 2.2 Nodes
-
-An IoTDB cluster consists of three types of nodes, each with distinct responsibilities:
-
-- **ConfigNode (Management Node)**: Manages cluster metadata, configuration, user permissions, schema, and partitioning. It also handles distributed scheduling and load balancing. All ConfigNodes are replicated for high availability.
-- **DataNode (Storage and Computation Node)**: Handles client requests, stores data, and executes computations.
-- **AINode (Analytics Node)**: Provides machine learning capabilities, allowing users to register pre-trained models and perform inference via SQL. It includes built-in time-series models and common ML algorithms for tasks like prediction and anomaly detection.
-
-### 2.3 Data Partitioning
-
-IoTDB divides schema and data into **Regions**, which are managed by DataNodes.
-
-- **SchemaRegion**: Stores schema information (devices and measurement points). Regions with the same RegionID across different DataNodes serve as replicas.
-- **DataRegion**: Stores time-series data for a subset of devices over a specified time period. Regions with the same RegionID across different DataNodes act as replicas.
- -For more details, see [Cluster Data Partitioning](../Technical-Insider/Cluster-data-partitioning.md) - -### 2.4 Replica Groups - -Replica groups ensure high availability by maintaining multiple copies of schema and data. The recommended replication configurations are: - -| **Category** | **Configuration Item** | **Standalone Recommended** | **Cluster Recommended** | -| ------------ | ------------------------- | -------------------------- | ----------------------- | -| Metadata | schema_replication_factor | 1 | 3 | -| Data | data_replication_factor | 1 | 2 | - - -## 3. Deployment Related Concepts - -IoTDB has two operation modes: standalone mode and cluster mode. - -### 3.1 Standalone Mode - -An IoTDB standalone instance includes 1 ConfigNode and 1 DataNode, i.e., 1C1D. - -- **Features**: Easy for developers to install and deploy, with low deployment and maintenance costs and convenient operations. -- **Use Cases**: Scenarios with limited resources or low high-availability requirements, such as edge servers. -- **Deployment Method**: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - - -### 3.2 Dual-Active Mode - -Dual-Active Deployment is a feature of TimechoDB, where two independent instances synchronize bidirectionally and can provide services simultaneously. If one instance stops and restarts, the other instance will resume data transfer from the breakpoint. - -> An IoTDB Dual-Active instance typically consists of 2 standalone nodes, i.e., 2 sets of 1C1D. Each instance can also be a cluster. - -- **Features**: The high-availability solution with the lowest resource consumption. -- **Use Cases**: Scenarios with limited resources (only two servers) but requiring high availability. 
-- **Deployment Method**: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -### 3.3 Cluster Mode - -An IoTDB cluster instance consists of 3 ConfigNodes and no fewer than 3 DataNodes, typically 3 DataNodes, i.e., 3C3D. If some nodes fail, the remaining nodes can still provide services, ensuring high availability of the database. Performance can be improved by adding DataNodes. - -- **Features**: High availability, high scalability, and improved system performance by adding DataNodes. -- **Use Cases**: Enterprise-level application scenarios requiring high availability and reliability. -- **Deployment Method**: [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - - -### 3.4 Feature Summary - -| **Dimension** | **Stand-Alone Mode** | **Dual-Active Mode** | **Cluster Mode** | -| :-------------------------- | :------------------------------------------------------- | :------------------------------------------------------ | :------------------------------------------------------ | -| Use Cases | Edge-side deployment, low high-availability requirements | High-availability services, disaster recovery scenarios | High-availability services, disaster recovery scenarios | -| Number of Machines Required | 1 | 2 | ≥3 | -| Security and Reliability | Cannot tolerate single-point failure | High, can tolerate single-point failure | High, can tolerate single-point failure | -| Scalability | Can expand DataNodes to improve performance | Each instance can be scaled as needed | Can expand DataNodes to improve performance | -| Performance | Can scale with the number of DataNodes | Same as one of the instances | Can scale with the number of DataNodes | - -- Notes: The deployment steps for Stand-Alone Mode and Cluster Mode are similar (adding ConfigNodes and DataNodes one by one), with differences only in the number of replicas and the minimum number of nodes required to provide services. 
\ No newline at end of file diff --git a/src/UserGuide/Master/Table/Background-knowledge/Data-Model-and-Terminology_timecho.md b/src/UserGuide/Master/Table/Background-knowledge/Data-Model-and-Terminology_timecho.md deleted file mode 100644 index 88537b320..000000000 --- a/src/UserGuide/Master/Table/Background-knowledge/Data-Model-and-Terminology_timecho.md +++ /dev/null @@ -1,388 +0,0 @@ - - -# Modeling Scheme Design - -This section introduces how to transform time series data application scenarios into IoTDB time series mode. - -## 1. Time Series Data Mode - -Before designing an IoTDB data mode, it's essential to understand time series data and its underlying structure. For more details, refer to: [Time Series Data Mode](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - -## 2. Tree-Table Twin Mode in IoTDB - -IoTDB offers tree-table twin mode, each with its distinct characteristics as follows: - -**Tree Mode**: It manages data points as objects, with each data point corresponding to a time series. The data point names, segmented by dots, form a tree-like directory structure that corresponds one-to-one with the physical world, making the read and write operations on data points straightforward and intuitive. - -**Table Mode**: It is recommended to create a table for each type of device. The collection of physical quantities from devices of the same type shares certain commonalities (such as the collection of temperature and humidity physical quantities), allowing for flexible and rich data analysis. - -### 2.1 Mode Characteristics - -Tree-table twin mode syntaxes have their own applicable scenarios. - -The following table compares the tree mode and the table mode from various dimensions, including applicable scenarios and typical operations. Users can choose the appropriate mode based on their specific usage requirements to achieve efficient data storage and management. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-| Dimension | Tree Mode | Table Mode |
-| :------------------------- | :------------------------------------------------------- | :------------------------------------------------------ |
-| Applicable Scenarios | Measurements management, monitoring scenarios | Device management, analysis scenarios |
-| Typical Operations | Read and write operations by specifying data point paths | Data filtering and analysis through tags |
-| Structural Characteristics | Flexible addition and deletion, similar to a file system | Template-based management, facilitating data governance |
-| Syntax Characteristics | Concise and flexible | Rich analysis |
-| Performance Comparison | Similar | Similar |
- -**Notes:** - -- Both mode spaces can coexist within the same cluster instance. Each mode follows distinct syntax and database naming conventions, and they remain isolated by default. - -## 2.2 Model Selection - -IoTDB supports model selection through various client tools. The configuration methods for different clients are as follows: - -1. [Command-Line Interface (CLI)](../Tools-System/CLI_timecho.md) - -When connecting via CLI, specify the model using the `sql_dialect` parameter (default: tree model). - -```bash -# Tree model -start-cli.sh(bat) -start-cli.sh(bat) -sql_dialect tree - -# Table model -start-cli.sh(bat) -sql_dialect table -``` - -2. [SQL](../User-Manual/Maintenance-commands_timecho.md#_2-1-setting-the-connected-model) - -Use the `SET` statement to switch models in SQL: - -```sql --- Tree model -IoTDB> SET SQL_DIALECT=TREE - --- Table model -IoTDB> SET SQL_DIALECT=TABLE -``` - -3. Application Programming Interfaces (APIs) - -For multi-language APIs, create connections via model-specific session/session pool classes. 
Examples: - -* [Java Native API](../API/Programming-Java-Native-API_timecho.md) - -```java -// Tree model -SessionPool sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(3) - .build(); - -// Table model -ITableSessionPool tableSessionPool = - new TableSessionPoolBuilder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(1) - .build(); -``` - -* [Python Native API](../API/Programming-Python-Native-API_timecho.md) - -```python -# Tree model -session = Session( - ip=ip, - port=port, - user=username, - password=password, - fetch_size=1024, - zone_id="UTC+8", - enable_redirection=True -) - -# Table model -config = TableSessionPoolConfig( - node_urls=node_urls, - username=username, - password=password, - database=database, - max_pool_size=max_pool_size, - fetch_size=fetch_size, - wait_timeout_in_ms=wait_timeout_in_ms, -) -session_pool = TableSessionPool(config) -``` - -* [C++ Native API](../API/Programming-Cpp-Native-API_timecho.md) - -```cpp -// Tree model -session = new Session(hostip, port, username, password); - -// Table model -session = (new TableSessionBuilder()) - ->host(ip) - ->rpcPort(port) - ->username(username) - ->password(password) - ->build(); -``` - -* [Go Native API](../API/Programming-Go-Native-API_timecho.md) - -```go -// Tree model -config := &client.PoolConfig{ - Host: host, - Port: port, - UserName: user, - Password: password, -} -sessionPool = client.NewSessionPool(config, 3, 60000, 60000, false) -defer sessionPool.Close() - -// Table model -config := &client.PoolConfig{ - Host: host, - Port: port, - UserName: user, - Password: password, - Database: dbname, -} -sessionPool := client.NewTableSessionPool(config, 3, 60000, 4000, false) -defer sessionPool.Close() -``` - -* [C# Native API](../API/Programming-CSharp-Native-API_timecho.md) - -```csharp -// Tree model -var session_pool = new SessionPool(host, port, pool_size); - -// Table model -var tableSessionPool = 
new TableSessionPool.Builder() - .SetNodeUrls(nodeUrls) - .SetUsername(username) - .SetPassword(password) - .SetFetchSize(1024) - .Build(); -``` - -* [JDBC](../API/Programming-JDBC_timecho.md) - -For the table model, include `sql_dialect=table` in the JDBC URL: - -```java -// Tree model -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection connection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667/", username, password); - -// Table model -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection connection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table", username, password); -``` - -## 2.3 Tree-to-Table Conversion - -IoTDB supports **tree-to-table conversion**, as shown in the figure below: - -![](/img/tree-to-table-en-1.png) - -This feature allows existing tree-model data to be transformed into table views. Users can then query the same dataset using either model. Detailed instructions are available in [Tree-to-Table View](../User-Manual/Tree-to-Table_timecho.md). **Note**: SQL statements for creating tree-to-table views **must be executed in table mode**. - - -## 3. Application Scenarios - -The application scenarios mainly include three categories: - -- Scenario 1: Using the tree mode for data reading and writing. - -- Scenario 2: Using the table mode for data reading and writing. - -- Scenario 3: Sharing the same dataset, using the tree mode for data reading and writing, and the table mode for data analysis. - -### 3.1 Scenario 1: Tree Mode - -#### 3.1.1 Characteristics - -- Simple and intuitive, corresponding one-to-one with monitoring points in the physical world. - -- Flexible like a file system, allowing the design of any branch structure. - -- Suitable for industrial monitoring scenarios such as DCS and SCADA. 
-
-#### 3.1.2 Basic Concepts
-
-| **Concept** | **Definition** |
-| ---------------------------- | ------------------------------------------------------------ |
-| **Database** | **Definition**: A path prefixed with `root.`.<br>**Naming Recommendation**: Include only the node directly under `root`, such as `root.db`.<br>**Quantity Recommendation**: The upper limit depends on memory. A single database can fully utilize machine resources, so there is no need to create multiple databases for performance reasons.<br>**Creation Method**: Creating the database manually is recommended, but it can also be created automatically when a time series is created (by default, the node directly under `root`). |
-| **Time Series (Data Point)** | **Definition**:<br>A path prefixed with the database path and segmented by `.`; it can contain any number of levels, such as `root.db.turbine.device1.metric1`.<br>Each time series can have a different data type.<br>**Naming Recommendation**:<br>Include only unique identifiers (similar to a composite primary key) in the path, generally no more than 10 levels.<br>Place tags with low cardinality (fewer distinct values) toward the front so that the system can compress common prefixes.<br>**Quantity Recommendation**:<br>The total number of time series the cluster can manage depends on total memory; refer to the resource recommendation section.<br>There is no limit on the number of child nodes at any level.<br>**Creation Method**: Can be created manually or automatically during data writing. |
-| **Device** | **Definition**: The second-to-last level is the device, such as `device1` in `root.db.turbine.device1.metric1`.<br>**Creation Method**: A device cannot be created on its own; it comes into existence when its time series are created. |
-
-#### 3.1.3 Modeling Examples
-
-##### 3.1.3.1 How to model when managing multiple types of devices?
-
-- If different types of devices in the scenario have different hierarchical paths and data point sets, create branches under the database node by device type. Each device type can have a different data point structure.
-
-
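The path conventions in the Basic Concepts table can be made concrete with a small sketch. The Python below is illustrative only and is not part of any IoTDB client: `split_series_path` and `group_by_branch` are hypothetical helpers, the database is assumed to be the first two levels (`root.db`, following the naming recommendation), and node names quoted with backticks are not handled. The `root.db.motor.*` paths are invented to stand in for a second device type.

```python
from collections import defaultdict


def split_series_path(path: str, database_levels: int = 2) -> dict:
    """Split a tree-mode time series path into its logical parts.

    Assumption: the database is the first `database_levels` nodes and the
    path has at least a device level and a data point below the database.
    """
    nodes = path.split(".")
    if nodes[0] != "root" or len(nodes) < database_levels + 2:
        raise ValueError(f"not a full time series path: {path!r}")
    return {
        "database": ".".join(nodes[:database_levels]),
        "branch": nodes[database_levels],  # first level under the database
        "device": nodes[-2],               # second-to-last level
        "data_point": nodes[-1],           # last level
    }


def group_by_branch(paths):
    """Collect the data point names found under each branch (device type)."""
    groups = defaultdict(set)
    for p in paths:
        parts = split_series_path(p)
        groups[parts["branch"]].add(parts["data_point"])
    return dict(groups)


paths = [
    "root.db.turbine.device1.metric1",  # example path from the table above
    "root.db.motor.m1.voltage",         # hypothetical second device type
    "root.db.motor.m1.current",
]
print(split_series_path(paths[0]))
print(group_by_branch(paths))
```

Grouping by the first level under the database mirrors the advice to branch by device type, so each branch can carry its own set of data points.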
- -
-
-##### 3.1.3.2 How to model when there are no devices, only data points?
-
-- For example, in a monitoring system for a station, each data point has a unique number but does not correspond to any specific device.
-
-
- -
-
-##### 3.1.3.3 How to model when a device has both sub-devices and data points?
-
-- For example, in an energy storage scenario, each layer of the structure monitors its voltage and current. The following modeling approach can be used.
-
-
- -
-
-
-### 3.2 Scenario 2: Table Mode
-
-#### 3.2.1 Characteristics
-
-- Models and manages device time series data using time series tables, facilitating analysis with standard SQL.
-
-- Suitable for device data analysis or migrating data from other databases to IoTDB.
-
-#### 3.2.2 Basic Concepts
-
-- Database: Can manage multiple types of devices.
-
-- Time Series Table: Corresponds to a type of device.
-
-| **Category** | **Definition** |
-| -------------------------------- | ------------------------------------------------------------ |
-| **Time Column (TIME)** | Each time series table must have a time column named `time`, with the data type `TIMESTAMP`. |
-| **Tag Column (TAG)** | Unique identifiers (composite primary key) for devices; a table can have zero or more tag columns.<br>Tag information cannot be modified or deleted, but it can be added.<br>Arranging tags from coarse to fine granularity is recommended. |
-| **Data Point Column (FIELD)** | A device can collect one or more data points, whose values change over time.<br>There is no limit on the number of data point columns; it can reach hundreds of thousands. |
-| **Attribute Column (ATTRIBUTE)** | Supplementary descriptions of devices that do not change over time.<br>A device can have zero or more attributes, which can be updated or added.<br>A small number of static attributes that may need modification can be stored here. |
-
-**Data Filtering Efficiency**: Time Column = Tag Column > Attribute Column > Data Point Column.
-
-#### 3.2.3 Modeling Examples
-
-##### 3.2.3.1 How to model when managing multiple types of devices?
-
-- It is recommended to create a table for each type of device, with each table having its own tags and data point sets.
-
-- Even if devices are related or have hierarchical relationships, it is recommended to create a table for each type of device.
-
-
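As a rough illustration of "one table per device type", the sketch below assembles `CREATE TABLE` statements in the same shape as the statement used in the REST non-query example of this guide (`name TYPE CATEGORY`, with a leading `time TIMESTAMP TIME` column). `build_create_table` is a hypothetical helper rather than an IoTDB API, the table and column names are invented, and optional clauses such as `WITH (TTL=...)` are left out.

```python
from typing import Iterable, Tuple

Column = Tuple[str, str, str]  # (column name, data type, category)
ALLOWED_CATEGORIES = ("TAG", "ATTRIBUTE", "FIELD")


def build_create_table(table: str, columns: Iterable[Column]) -> str:
    """Build a CREATE TABLE statement for one device type.

    The time column is always emitted first; every other column must be
    declared as TAG, ATTRIBUTE, or FIELD.
    """
    definitions = ["time TIMESTAMP TIME"]
    for name, data_type, category in columns:
        if category not in ALLOWED_CATEGORIES:
            raise ValueError(f"unknown column category: {category!r}")
        definitions.append(f"{name} {data_type} {category}")
    return f"CREATE TABLE {table} ({', '.join(definitions)})"


# Two hypothetical device types, each with its own tags and data points.
# Tags are listed from coarse to fine granularity, as recommended above.
device_types = {
    "wind_turbine": [
        ("region", "STRING", "TAG"),
        ("plant_id", "STRING", "TAG"),
        ("device_id", "STRING", "TAG"),
        ("model_id", "STRING", "ATTRIBUTE"),
        ("temperature", "FLOAT", "FIELD"),
        ("humidity", "FLOAT", "FIELD"),
    ],
    "motor": [
        ("plant_id", "STRING", "TAG"),
        ("device_id", "STRING", "TAG"),
        ("voltage", "FLOAT", "FIELD"),
        ("current", "FLOAT", "FIELD"),
    ],
}

for table_name, cols in device_types.items():
    print(build_create_table(table_name, cols))
```

Keeping each device type in its own column list makes it clear that tables do not need to share tags or data points, even when the devices are related.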
- -
-
-##### 3.2.3.2 How to model when there are no device identifier columns or attribute columns?
-
-- There is no limit on the number of columns; it can reach hundreds of thousands.
-
-
- -
-
-##### 3.2.3.3 How to model when a device has both sub-devices and data points?
-
-- Each device has multiple sub-devices and data point information. It is recommended to create a table for each type of device for management.
-
-
- -
-
-### 3.3 Scenario 3: Dual-Mode Integration
-
-#### 3.3.1 Characteristics
-
-- Combines the advantages of the tree mode and the table mode: both share the same dataset, with flexible writing and rich querying.
-
-- During the data writing phase, the tree mode syntax is used, supporting flexible data access and expansion.
-
-- During the data analysis phase, the table mode syntax is used, allowing users to perform complex data analysis using standard SQL queries.
-
-#### 3.3.2 Modeling Examples
-
-##### 3.3.2.1 How to model when managing multiple types of devices?
-
-- Different types of devices in the scenario have different hierarchical paths and data point sets.
-
-- **Tree Mode**: Create branches under the database node by device type, with each device type having a different data point structure.
-
-- **Table View**: Create a table view for each type of device, with each table view having different tags and data point sets.
-
-
- -
-
-##### 3.3.2.2 How to model when there are no device identifier columns or attribute columns?
-
-- **Tree Mode**: Each data point has a unique number but does not correspond to any specific device.
-- **Table View**: Place all data points into a single table. There is no limit on the number of data point columns; it can reach hundreds of thousands. If data points have the same data type, they can be treated as the same type of device.
-
-
- -
-
-##### 3.3.2.3 How to model when a device has both sub-devices and data points?
-
-- **Tree Mode**: Model each layer of the structure according to the monitoring points in the physical world.
-- **Table View**: Create multiple tables to manage each layer of structural information according to device classification.
-
-
- -
diff --git a/src/UserGuide/Master/Table/Background-knowledge/Data-Type_timecho.md b/src/UserGuide/Master/Table/Background-knowledge/Data-Type_timecho.md
deleted file mode 100644
index d35af9d07..000000000
--- a/src/UserGuide/Master/Table/Background-knowledge/Data-Type_timecho.md
+++ /dev/null
@@ -1,201 +0,0 @@
-
-
-# Data Type
-
-## 1. Basic Data Types
-
-IoTDB supports the following data types:
-
-- **BOOLEAN** (Boolean value)
-- **INT32** (32-bit integer)
-- **INT64** (64-bit integer)
-- **FLOAT** (Single-precision floating-point number)
-- **DOUBLE** (Double-precision floating-point number)
-- **TEXT** (Text data, suitable for long strings; not recommended)
-- **STRING** (String data with additional statistical information for optimized queries)
-- **BLOB** (Large binary object)
-- **OBJECT** (Large binary object)
-  > Supported since V2.0.8
-- **TIMESTAMP** (Timestamp, representing precise moments in time)
-- **DATE** (Date, storing only calendar date information)
-
-The difference between **STRING** and **TEXT**:
-
-- **STRING** stores text data and includes additional statistical information to optimize value-filtering queries.
-- **TEXT** is suitable for storing long text strings without additional query optimization.
-
-The differences between **OBJECT** and **BLOB** types are as follows:
-
-| | **OBJECT** | **BLOB** |
-|----------------------|----------------------|----------------------|
-| **Write Amplification** (lower is better) | Low (write amplification factor is always 1) | High (write amplification factor = 2 + number of merges) |
-| **Space Amplification** (lower is better) | Low (merge and release on write) | High (merge on read and release on compact) |
-| **Query Results** | By default, querying an OBJECT column returns metadata such as `(Object) XX.XX KB`. The actual OBJECT data is stored under `${data_dir}/object_data`.<br>Use the `READ_OBJECT` function to retrieve the raw content. | Directly returns raw binary content |
-
-
-### 1.1 Data Type Compatibility
-
-If the written data type does not match the registered data type of a series:
-
-- **Incompatible types** → The system will issue an error.
-- **Compatible types** → The system will automatically convert the written data type to match the registered type.
-
-The compatibility of data types is shown in the table below:
-
-| Registered Data Type | Compatible Write Data Types |
-|:---------------------|:---------------------------------------|
-| BOOLEAN | BOOLEAN |
-| INT32 | INT32 |
-| INT64 | INT32, INT64, TIMESTAMP |
-| FLOAT | INT32, FLOAT |
-| DOUBLE | INT32, INT64, FLOAT, DOUBLE, TIMESTAMP |
-| TEXT | TEXT, STRING |
-| STRING | TEXT, STRING |
-| BLOB | TEXT, STRING, BLOB |
-| OBJECT | OBJECT |
-| TIMESTAMP | INT32, INT64, TIMESTAMP |
-| DATE | DATE |
-
-## 2. Timestamp Types
-
-A timestamp represents the moment when data is recorded. IoTDB supports two types:
-
-- **Absolute timestamps**: Directly specify a point in time.
-- **Relative timestamps**: Define time offsets from a reference point (e.g., `now()`).
-
-### 2.1 Absolute Timestamp
-
-IoTDB supports timestamps in two formats:
-
-1. **LONG**: Milliseconds since the Unix epoch (1970-01-01 00:00:00 UTC).
-2. **DATETIME**: Human-readable date-time strings (including the **DATETIME-INPUT** and **DATETIME-DISPLAY** subcategories).
-
-When entering a timestamp, users can use either a LONG value or a DATETIME string. Supported input formats include:
-
-
- -**DATETIME-INPUT Type Supports Format** - - -| format | -| :--------------------------- | -| yyyy-MM-dd HH:mm:ss | -| yyyy/MM/dd HH:mm:ss | -| yyyy.MM.dd HH:mm:ss | -| yyyy-MM-dd HH:mm:ssZZ | -| yyyy/MM/dd HH:mm:ssZZ | -| yyyy.MM.dd HH:mm:ssZZ | -| yyyy/MM/dd HH:mm:ss.SSS | -| yyyy-MM-dd HH:mm:ss.SSS | -| yyyy.MM.dd HH:mm:ss.SSS | -| yyyy-MM-dd HH:mm:ss.SSSZZ | -| yyyy/MM/dd HH:mm:ss.SSSZZ | -| yyyy.MM.dd HH:mm:ss.SSSZZ | -| ISO8601 standard time format | - - -
- -> **Note:** `ZZ` represents a time zone offset (e.g., `+0800` for Beijing Time, `-0500` for Eastern Standard Time). - -IoTDB supports timestamp display in **LONG** format or **DATETIME-DISPLAY** format, allowing users to customize time output. - -
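The relationship between the LONG and DATETIME forms can be sketched in plain Python. This is only an illustration of the arithmetic, not IoTDB's own parser: it handles just the `yyyy-MM-dd HH:mm:ssZZ` input shape through Python's `strptime` directives, and the helper names `datetime_to_long` and `long_to_datetime` are made up for this example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def datetime_to_long(text: str, fmt: str = "%Y-%m-%d %H:%M:%S%z") -> int:
    """Convert a date-time string with a zone offset to epoch milliseconds."""
    dt = datetime.strptime(text, fmt)
    if dt.tzinfo is None:
        raise ValueError("this sketch only accepts strings with a zone offset")
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def long_to_datetime(ms: int, offset_hours: int = 0) -> str:
    """Render epoch milliseconds as a date-time string in a fixed offset."""
    tz = timezone(timedelta(hours=offset_hours))
    dt = (EPOCH + timedelta(milliseconds=ms)).astimezone(tz)
    return dt.strftime("%Y-%m-%d %H:%M:%S%z")


ms = datetime_to_long("2024-01-01 08:00:00+0800")
print(ms)                       # 1704067200000
print(long_to_datetime(ms, 8))  # 2024-01-01 08:00:00+0800
print(long_to_datetime(ms, 0))  # 2024-01-01 00:00:00+0000
```

The same instant maps to a single LONG value regardless of the zone offset used to write or display it, which is why the `ZZ` suffix matters when the input string is not in the server's time zone.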
- -**Syntax for Custom Time Formats in DATETIME-DISPLAY** - - -| Symbol | Meaning | Presentation | Examples | -| :----: | :-------------------------: | :----------: | :--------------------------------: | -| G | era | era | era | -| C | century of era (>=0) | number | 20 | -| Y | year of era (>=0) | year | 1996 | -| | | | | -| x | weekyear | year | 1996 | -| w | week of weekyear | number | 27 | -| e | day of week | number | 2 | -| E | day of week | text | Tuesday; Tue | -| | | | | -| y | year | year | 1996 | -| D | day of year | number | 189 | -| M | month of year | month | July; Jul; 07 | -| d | day of month | number | 10 | -| | | | | -| a | halfday of day | text | PM | -| K | hour of halfday (0~11) | number | 0 | -| h | clockhour of halfday (1~12) | number | 12 | -| | | | | -| H | hour of day (0~23) | number | 0 | -| k | clockhour of day (1~24) | number | 24 | -| m | minute of hour | number | 30 | -| s | second of minute | number | 55 | -| S | fraction of second | millis | 978 | -| | | | | -| z | time zone | text | Pacific Standard Time; PST | -| Z | time zone offset/id | zone | -0800; -08:00; America/Los_Angeles | -| | | | | -| ' | escape for text | delimiter | | -| '' | single quote | literal | ' | - -
- -### 2.2 Relative Timestamp - -Relative timestamps allow specifying time offsets from **now()** or a **DATETIME** reference. - -The formal definition is: - -```Plain -Duration = (Digit+ ('Y'|'MO'|'W'|'D'|'H'|'M'|'S'|'MS'|'US'|'NS'))+ -RelativeTime = (now() | DATETIME) ((+|-) Duration)+ -``` - -
- - **The syntax of the duration unit** - - - | Symbol | Meaning | Presentation | Examples | - | :----: | :---------: | :----------------------: | :------: | - | y | year | 1y=365 days | 1y | - | mo | month | 1mo=30 days | 1mo | - | w | week | 1w=7 days | 1w | - | d | day | 1d=1 day | 1d | - | | | | | - | h | hour | 1h=3600 seconds | 1h | - | m | minute | 1m=60 seconds | 1m | - | s | second | 1s=1 second | 1s | - | | | | | - | ms | millisecond | 1ms=1000_000 nanoseconds | 1ms | - | us | microsecond | 1us=1000 nanoseconds | 1us | - | ns | nanosecond | 1ns=1 nanosecond | 1ns | - -
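To make the duration grammar and the unit table concrete, here is a minimal sketch that converts a duration string into nanoseconds using the conversions listed above (including the fixed `1y=365 days` and `1mo=30 days`). It is written in Python for illustration and is an assumption about how such a string could be evaluated, not IoTDB's implementation; `duration_to_ns` is a hypothetical name.

```python
import re

# Nanoseconds per unit, taken from the duration unit table.
NS_PER_UNIT = {
    "ns": 1,
    "us": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "m": 60 * 1_000_000_000,
    "h": 3_600 * 1_000_000_000,
    "d": 86_400 * 1_000_000_000,
    "w": 7 * 86_400 * 1_000_000_000,
    "mo": 30 * 86_400 * 1_000_000_000,
    "y": 365 * 86_400 * 1_000_000_000,
}

# Two-letter units come first so that "mo" and "ms" are not read as "m".
_UNITS = r"(?:mo|ms|us|ns|y|w|d|h|m|s)"
_TOKEN = re.compile(rf"(\d+)({_UNITS})", re.IGNORECASE)
_DURATION = re.compile(rf"(?:\d+{_UNITS})+", re.IGNORECASE)


def duration_to_ns(text: str) -> int:
    """Sum every `<digits><unit>` pair in a duration such as '1d2h'."""
    if not _DURATION.fullmatch(text):
        raise ValueError(f"not a valid duration: {text!r}")
    return sum(int(n) * NS_PER_UNIT[u.lower()] for n, u in _TOKEN.findall(text))


now_ms = 1_739_702_535_000  # an arbitrary "server time" in epoch milliseconds
earlier_ms = now_ms - duration_to_ns("1d2h") // 1_000_000
print(duration_to_ns("1d2h"))
print(earlier_ms)
```

Working in integer nanoseconds keeps the smallest unit exact; the result is divided down only when it is applied to a millisecond timestamp.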
- -**Examples:** - -```Plain -now() - 1d2h // A time 1 day and 2 hours earlier than the server time -now() - 1w // A time 1 week earlier than the server time -``` - -> **Note:** There must be spaces on both sides of `+` and `-` operators. \ No newline at end of file diff --git a/src/UserGuide/Master/Table/Background-knowledge/Navigating_Time_Series_Data_timecho.md b/src/UserGuide/Master/Table/Background-knowledge/Navigating_Time_Series_Data_timecho.md deleted file mode 100644 index c60e112d5..000000000 --- a/src/UserGuide/Master/Table/Background-knowledge/Navigating_Time_Series_Data_timecho.md +++ /dev/null @@ -1,51 +0,0 @@ - -# Timeseries Data Model - -## 1. What is Time Series Data? - -In today's interconnected world, industries such as the Internet of Things (IoT) and manufacturing are undergoing rapid digital transformation. Sensors are widely deployed on various devices to collect real-time operational data. For example: - -- **Motors** record voltage and current. -- **Wind Turbines** track blade speed, angular velocity, and power output. -- **Vehicles** capture GPS coordinates, speed, and fuel consumption. -- **Bridges** monitor vibration frequency, deflection, and displacement. - -Sensor data collection has permeated almost every industry, generating vast amounts of **time series data**. - -![](/img/time-series-data-en-01.png) - -Each data collection point is referred to as a **measurement point** (also known as a physical quantity, time series, signal, metric, or measurement value). As time progresses, new data is continuously recorded for each measurement point, forming a **time series**. In tabular form, a time series consists of two columns: **timestamp** and **value**. When visualized, a time series appears as a trend chart over time, resembling an "electrocardiogram" of a device. 
- -![](/img/time-series-data-en-02.png) - -Given the vast amount of time-series data generated by sensors, structuring this data effectively is essential for digital transformation across industries. Therefore, time-series data modeling is primarily centered around **devices** and **sensors**. - -## 2. Key Concepts in Time Series Data - -Several fundamental concepts define time-series data: - -| **Device** | Also known as an entity or equipment, a device is a real-world object that generates time-series data. In IoTDB, a device serves as a logical grouping of multiple time series. A device could be a physical machine, a measuring instrument, or a collection of sensors. Examples include:
- Energy sector: A wind turbine, identified by parameters such as region, power station, line, model, and instance.
- Manufacturing sector: A robotic arm, uniquely identified by an IoT platform-assigned ID.
- Connected vehicles: A car, identified by its Vehicle Identification Number (VIN).
- Monitoring systems: A CPU, identified by attributes such as data center, rack, hostname, and device type. | -| ------------------------------- |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| **FIELD** | Also referred to as a physical quantity, signal, metric, or status point, a field represents a specific measurable property recorded by a sensor. Each field corresponds to a measurement point that periodically captures environmental data. Examples include:
- Energy and power: Current, voltage, wind speed, rotational speed.
- Connected vehicles: Fuel level, vehicle speed, latitude, longitude.
- Manufacturing: Temperature, humidity.
Under the **table model**, the total number of **measurement points** is the sum of the measurement points of all tables (measurement points per table = number of devices × number of field columns). For details on how these are counted, see [Metadata Query](../Basic-Concept/Table-Management_timecho.md#_1-7-metadata-query). | -| **Data Point** | A data point consists of a timestamp and a value. The timestamp is typically stored as a long integer, while the value can be of various data types such as BOOLEAN, FLOAT, or INT32.
In tabular format, a data point corresponds to a single row in a time-series dataset, while in graphical representation, it is a single point on a time-series chart.
| -| **Frequency** | The sampling frequency determines how often a sensor records data within a given timeframe.
For example, if a temperature sensor records data once per second, its sampling frequency is 1Hz (1 sample per second). | -| **TTL** | TTL (Time-to-Live) defines the retention period of stored data. Once the TTL expires, the data is automatically deleted.
IoTDB allows different TTL values for different datasets, enabling automated, periodic data deletion. Proper TTL configuration helps:
- Manage disk space efficiently, preventing storage overflow.
- Maintain high query performance.
- Reduce memory resource consumption. | \ No newline at end of file diff --git a/src/UserGuide/Master/Table/Basic-Concept/Database-Management_timecho.md b/src/UserGuide/Master/Table/Basic-Concept/Database-Management_timecho.md deleted file mode 100644 index 2ac3e52fd..000000000 --- a/src/UserGuide/Master/Table/Basic-Concept/Database-Management_timecho.md +++ /dev/null @@ -1,175 +0,0 @@ - - -# Database Management - -## 1. Database Management - -### 1.1 Create a Database - -Creates a database. - -**Syntax:** - -```SQL - CREATE DATABASE (IF NOT EXISTS)? <DATABASE_NAME> (WITH properties)? -``` - -**Note:** - -1. `<DATABASE_NAME>`: The name of the database, with the following characteristics: - - Case-insensitive. After creation, it is displayed uniformly in lowercase. - - Can include commas (`,`), underscores (`_`), numbers, letters, and Chinese characters. - - Maximum length is 64 characters. - - Names with special characters or Chinese characters must be enclosed in double quotes (`""`). - -2. `WITH properties`: Property names are case-insensitive. For details, see the [case sensitivity rules](../SQL-Manual/Identifier.md#2-case-sensitivity). Configurable properties include: - -| Property | Description | Default Value | -| ----------------------- | ------------------------------------------------------------ | -------------------- | -| TTL | Automatic data expiration time, in milliseconds | `INF` | -| TIME_PARTITION_INTERVAL | Time partition interval for the database, in milliseconds | `604800000` (7 days) | -| SCHEMA_REGION_GROUP_NUM | Number of metadata replica groups; generally does not require modification | `1` | -| DATA_REGION_GROUP_NUM | Number of data replica groups; generally does not require modification | `2` | - -**Examples:** - -```SQL -CREATE DATABASE IF NOT EXISTS database1 with(TTL=31536000000); -``` - -### 1.2 Use a Database - -Specify the current database as the namespace for table operations.
- -**Syntax:** - -```SQL -USE <DATABASE_NAME> -``` - -**Example:** - -```SQL -USE database1; -``` - -### 1.3 View the Current Database - -Displays the name of the currently connected database. If no USE statement has been executed, the result is `null`. - -**Syntax:** - -```SQL -SHOW CURRENT_DATABASE -``` - -**Example:** - -```SQL -USE database1; -SHOW CURRENT_DATABASE; -``` -```shell -+---------------+ -|CurrentDatabase| -+---------------+ -| database1| -+---------------+ -``` - - -### 1.4 View All Databases - -Displays all databases and their properties. - -**Syntax:** - -```SQL -SHOW DATABASES (DETAILS)? -``` - -**Columns Explained:** - - -| Column Name | Description | -| ----------------------- | ----------- | -| database | Name of the database. | -| TTL | Data retention period. If TTL is specified when creating a database, it applies to all tables within the database. You can also set or update the TTL of individual tables using [create table](../Basic-Concept/Table-Management_timecho.md#11-create-a-table) or [alter table](../Basic-Concept/Table-Management_timecho.md#14-update-tables). | -| SchemaReplicationFactor | Number of metadata replicas, ensuring metadata high availability. This can be configured in the `iotdb-system.properties` file under the `schema_replication_factor` property. | -| DataReplicationFactor | Number of data replicas, ensuring data high availability. This can be configured in the `iotdb-system.properties` file under the `data_replication_factor` property. | -| TimePartitionInterval | Time partition interval, determining how often data is grouped into directories on disk. The default is typically one week.
| -| Model | Returned when using the `DETAILS` option, showing the data model corresponding to each database (e.g., timeseries tree model or device table model). | - -**Examples:** - -```SQL -SHOW DATABASES DETAILS; -``` -```shell -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|DataRegionGroupNum| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| database1| INF| 1| 1| 604800000| 1| 2| -|information_schema| INF| null| null| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -``` - -### 1.5 Update a Database - -Modifies selected properties of a database. - -**Syntax:** - -```SQL -ALTER DATABASE (IF EXISTS)? database=identifier SET PROPERTIES propertyAssignments -``` - -**Note:** - -1. The `ALTER DATABASE` operation currently supports modifying only the database's `SCHEMA_REGION_GROUP_NUM`, `DATA_REGION_GROUP_NUM`, and `TTL` properties. - -**Example:** - -```SQL -ALTER DATABASE database1 SET PROPERTIES TTL=31536000000; -``` - -### 1.6 Delete a Database - -Deletes the specified database and all associated tables and data. - -**Syntax:** - -```SQL -DROP DATABASE (IF EXISTS)? <DATABASE_NAME> -``` - -**Note:** - -1. A database currently in use can still be dropped. -2. Deleting a database removes all its tables and stored data.
- -**Example:** - -```SQL -DROP DATABASE IF EXISTS database1; -``` diff --git a/src/UserGuide/Master/Table/Basic-Concept/Query-Data_timecho.md b/src/UserGuide/Master/Table/Basic-Concept/Query-Data_timecho.md deleted file mode 100644 index 47976fb45..000000000 --- a/src/UserGuide/Master/Table/Basic-Concept/Query-Data_timecho.md +++ /dev/null @@ -1,590 +0,0 @@ - - -# Query Data - -## 1. Syntax Overview - -```SQL -SELECT ⟨select_list⟩ - FROM ⟨tables⟩ | patternRecognition - [WHERE ⟨condition⟩] - [GROUP BY ⟨groups⟩] - [HAVING ⟨group_filter⟩] - [WINDOW windowDefinition (',' windowDefinition)*] - [FILL ⟨fill_methods⟩] - [ORDER BY ⟨order_expression⟩] - [OFFSET ⟨n⟩] - [LIMIT ⟨n⟩]; -``` - -The IoTDB table model query syntax supports the following clauses: - -- **SELECT Clause**: Specifies the columns to be included in the result. Details: [SELECT Clause](../SQL-Manual/Select-Clause_timecho.md) -- **FROM Clause**: Indicates the data source for the query, which can be a single table, multiple tables joined using the `JOIN` clause, or a subquery. Details: [FROM & JOIN Clause](../SQL-Manual/From-Join-Clause.md) -- **WHERE Clause**: Filters rows based on specific conditions. Logically executed immediately after the `FROM` clause. Details: [WHERE Clause](../SQL-Manual/Where-Clause.md) -- **GROUP BY Clause**: Used for aggregating data, specifying the columns for grouping. Details: [GROUP BY Clause](../SQL-Manual/GroupBy-Clause.md) -- **HAVING Clause**: Applied after the `GROUP BY` clause to filter grouped data, similar to `WHERE` but operates after grouping. Details: [HAVING Clause](../SQL-Manual/Having-Clause.md) -- **FILL Clause**: Handles missing values in query results by specifying fill methods (e.g., previous non-null value or linear interpolation) for better visualization and analysis.
Details: [FILL Clause](../SQL-Manual/Fill-Clause.md) -- **ORDER BY Clause**: Sorts query results in ascending (`ASC`) or descending (`DESC`) order, with optional handling for null values (`NULLS FIRST` or `NULLS LAST`). Details: [ORDER BY Clause](../SQL-Manual/OrderBy-Clause.md) -- **OFFSET Clause**: Specifies the starting position for the query result, skipping the first `OFFSET` rows. Often used with the `LIMIT` clause. Details: [LIMIT and OFFSET Clause](../SQL-Manual/Limit-Offset-Clause.md) -- **LIMIT Clause**: Limits the number of rows in the query result. Typically used in conjunction with the `OFFSET` clause for pagination. Details: [LIMIT and OFFSET Clause](../SQL-Manual/Limit-Offset-Clause.md) - -## 2. Clause Execution Order - -![](/img/data-query-1.png) - - -## 3. Common Query Examples - -### 3.1 Sample Dataset - -The [Example Data page](../Reference/Sample-Data.md) provides SQL statements to construct table schemas and insert data. By downloading and executing these statements in the IoTDB CLI, you can import the data into IoTDB. This data can be used to test and run the example SQL queries included in this documentation, allowing you to reproduce the described results.
- -### 3.2 Basic Data Query - -**Example 1: Filter by Time** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE time >= 2024-11-27 00:00:00 and time <= 2024-11-29 00:00:00; -``` - -**Result**: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-28T08:00:00.000+08:00| 85.0| null| -|2024-11-28T09:00:00.000+08:00| null| 40.9| -|2024-11-28T10:00:00.000+08:00| 85.0| 35.2| -|2024-11-28T11:00:00.000+08:00| 88.0| 45.1| -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:40:00.000+08:00| 85.0| null| -|2024-11-27T16:41:00.000+08:00| 85.0| null| -|2024-11-27T16:42:00.000+08:00| null| 35.2| -|2024-11-27T16:43:00.000+08:00| null| null| -|2024-11-27T16:44:00.000+08:00| null| null| -+-----------------------------+-----------+--------+ -Total line number = 11 -It costs 0.075s -``` - -**Example 2: Filter by** **Value** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE temperature > 89.0; -``` - -**Result**: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-29T18:30:00.000+08:00| 90.0| 35.4| -|2024-11-26T13:37:00.000+08:00| 90.0| 35.1| -|2024-11-26T13:38:00.000+08:00| 90.0| 35.1| -|2024-11-30T09:30:00.000+08:00| 90.0| 35.2| -|2024-11-30T14:30:00.000+08:00| 90.0| 34.8| -+-----------------------------+-----------+--------+ -Total line number = 5 -It costs 0.156s -``` - -**Example 3: Filter by Attribute** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE model_id ='B'; -``` - -**Result**: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| 
-|2024-11-27T16:40:00.000+08:00| 85.0| null| -|2024-11-27T16:41:00.000+08:00| 85.0| null| -|2024-11-27T16:42:00.000+08:00| null| 35.2| -|2024-11-27T16:43:00.000+08:00| null| null| -|2024-11-27T16:44:00.000+08:00| null| null| -+-----------------------------+-----------+--------+ -Total line number = 7 -It costs 0.106s -``` - -**Example 4: Multi-device time-aligned query** - -```SQL -IoTDB> SELECT date_bin_gapfill(1d, TIME) AS a_time, - device_id, - AVG(temperature) AS avg_temp - FROM table1 - WHERE TIME >= 2024-11-26 13:00:00 - AND TIME <= 2024-11-27 17:00:00 - GROUP BY 1, device_id FILL METHOD PREVIOUS; -``` - -**Result**: - -```SQL -+-----------------------------+---------+--------+ -| a_time|device_id|avg_temp| -+-----------------------------+---------+--------+ -|2024-11-26T08:00:00.000+08:00| 100| 90.0| -|2024-11-27T08:00:00.000+08:00| 100| 90.0| -|2024-11-26T08:00:00.000+08:00| 101| 90.0| -|2024-11-27T08:00:00.000+08:00| 101| 85.0| -+-----------------------------+---------+--------+ -Total line number = 4 -It costs 0.048s -``` - -### 3.3 Aggregation Query - -**Example**: Calculate the average, maximum, and minimum temperature for each `device_id` within a specific time range. - -```SQL -IoTDB> SELECT device_id, AVG(temperature) as avg_temp, MAX(temperature) as max_temp, MIN(temperature) as min_temp - FROM table1 - WHERE time >= 2024-11-26 00:00:00 AND time <= 2024-11-29 00:00:00 - GROUP BY device_id; -``` - -**Result**: - -```SQL -+---------+--------+--------+--------+ -|device_id|avg_temp|max_temp|min_temp| -+---------+--------+--------+--------+ -| 100| 87.6| 90.0| 85.0| -| 101| 85.0| 85.0| 85.0| -+---------+--------+--------+--------+ -Total line number = 2 -It costs 0.278s -``` - -### 3.4 Latest Point Query - -**Example**: Retrieve the latest record for each `device_id`, including the temperature value and the timestamp of the last record.
- -```SQL -IoTDB> SELECT device_id,last(time),last_by(temperature,time) - FROM table1 - GROUP BY device_id; -``` - -**Result**: - -```SQL -+---------+-----------------------------+-----+ -|device_id| _col1|_col2| -+---------+-----------------------------+-----+ -| 100|2024-11-29T18:30:00.000+08:00| 90.0| -| 101|2024-11-30T14:30:00.000+08:00| 90.0| -+---------+-----------------------------+-----+ -Total line number = 2 -It costs 0.090s -``` - -### 3.5 Downsampling Query - -**Example**: Group data by day and calculate the average temperature using the `date_bin` function. - -```SQL -IoTDB> SELECT device_id,date_bin(1d ,time) as day_time, AVG(temperature) as avg_temp - FROM table1 - WHERE time >= 2024-11-26 00:00:00 AND time <= 2024-11-30 00:00:00 - GROUP BY device_id,date_bin(1d ,time); -``` - -**Result**: - -```SQL -+---------+-----------------------------+--------+ -|device_id| day_time|avg_temp| -+---------+-----------------------------+--------+ -| 100|2024-11-29T08:00:00.000+08:00| 90.0| -| 100|2024-11-28T08:00:00.000+08:00| 86.0| -| 100|2024-11-26T08:00:00.000+08:00| 90.0| -| 101|2024-11-29T08:00:00.000+08:00| 85.0| -| 101|2024-11-27T08:00:00.000+08:00| 85.0| -+---------+-----------------------------+--------+ -Total line number = 5 -It costs 0.066s -``` -### 3.6 Multi device downsampling alignment query - -#### 3.6.1 Sampling Frequency is the Same, but Time is Different - -**Table 1: Sampling Frequency: 1s** - -| Time | device_id | temperature | -| ------------ | --------- | ----------- | -| 00:00:00.001 | d1 | 90.0 | -| 00:00:01.002 | d1 | 85.0 | -| 00:00:02.101 | d1 | 85.0 | -| 00:00:03.201 | d1 | null | -| 00:00:04.105 | d1 | 90.0 | -| 00:00:05.023 | d1 | 85.0 | -| 00:00:06.129 | d1 | 90.0 | - -**Table 2: Sampling Frequency: 1s** - -| Time | device_id | humidity | -| ------------ | --------- | -------- | -| 00:00:00.003 | d1 | 35.1 | -| 00:00:01.012 | d1 | 37.2 | -| 00:00:02.031 | d1 | null | -| 00:00:03.134 | d1 | 35.2 | -| 00:00:04.201 | d1 | 38.2 |
-| 00:00:05.091 | d1 | 35.4 | -| 00:00:06.231 | d1 | 35.1 | - -**Example: Querying the downsampled data of table1:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS a_time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**Result:** - -```SQL -+-----------------------------+-------+ -| a_time|a_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| -|2025-05-13T00:00:01.000+08:00| 85.0| -|2025-05-13T00:00:02.000+08:00| 85.0| -|2025-05-13T00:00:03.000+08:00| 85.0| -|2025-05-13T00:00:04.000+08:00| 90.0| -|2025-05-13T00:00:05.000+08:00| 85.0| -|2025-05-13T00:00:06.000+08:00| 90.0| -+-----------------------------+-------+ -``` - -**Example: Querying the downsampled data of table2:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS b_time, - first(humidity) AS b_value - FROM table2 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**Result:** - -```SQL -+-----------------------------+-------+ -| b_time|b_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 35.1| -|2025-05-13T00:00:01.000+08:00| 37.2| -|2025-05-13T00:00:02.000+08:00| 37.2| -|2025-05-13T00:00:03.000+08:00| 35.2| -|2025-05-13T00:00:04.000+08:00| 38.2| -|2025-05-13T00:00:05.000+08:00| 35.4| -|2025-05-13T00:00:06.000+08:00| 35.1| -+-----------------------------+-------+ -``` - -**Example: Aligning multiple sequences by integer time:** - -```SQL -IoTDB> SELECT time, - a_value, - b_value - FROM - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) A - JOIN - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(humidity) AS b_value - FROM 
table2 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) B - USING (time) -``` - -**Result:** - -```SQL -+-----------------------------+-------+-------+ -| time|a_value|b_value| -+-----------------------------+-------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| 35.1| -|2025-05-13T00:00:01.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:02.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:03.000+08:00| 85.0| 35.2| -|2025-05-13T00:00:04.000+08:00| 90.0| 38.2| -|2025-05-13T00:00:05.000+08:00| 85.0| 35.4| -|2025-05-13T00:00:06.000+08:00| 90.0| 35.1| -+-----------------------------+-------+-------+ -``` - -- **Retaining NULL Values**: When NULL values have special significance or when you wish to preserve the null values in the data, you can choose to omit FILL METHOD PREVIOUS to avoid filling in the gaps. -**Example:** - -```SQL -IoTDB> SELECT time, - a_value, - b_value - FROM - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1) A - JOIN - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(humidity) AS b_value - FROM table2 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1) B - USING (time) -``` - -**Result:** - -```SQL -+-----------------------------+-------+-------+ -| time|a_value|b_value| -+-----------------------------+-------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| 35.1| -|2025-05-13T00:00:01.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:02.000+08:00| 85.0| null| -|2025-05-13T00:00:03.000+08:00| null| 35.2| -|2025-05-13T00:00:04.000+08:00| 90.0| 38.2| -|2025-05-13T00:00:05.000+08:00| 85.0| 35.4| -|2025-05-13T00:00:06.000+08:00| 90.0| 35.1| -+-----------------------------+-------+-------+ -``` -#### 3.6.2 Different Sampling Frequencies, Different Times - -**Table 1: 
Sampling Frequency: 1s** - -| Time | device_id | temperature | -| ------------ | --------- | ----------- | -| 00:00:00.001 | d1 | 90.0 | -| 00:00:01.002 | d1 | 85.0 | -| 00:00:02.101 | d1 | 85.0 | -| 00:00:03.201 | d1 | null | -| 00:00:04.105 | d1 | 90.0 | -| 00:00:05.023 | d1 | 85.0 | -| 00:00:06.129 | d1 | 90.0 | - -**Table 3: Sampling Frequency: 2s** - -| Time | device_id | humidity | -| ------------ | --------- | -------- | -| 00:00:00.005 | d1 | 35.1 | -| 00:00:02.106 | d1 | 37.2 | -| 00:00:04.187 | d1 | null | -| 00:00:06.156 | d1 | 35.1 | - -**Example: Querying the downsampled data of table1:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS a_time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**Result:** - -```SQL -+-----------------------------+-------+ -| a_time|a_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| -|2025-05-13T00:00:01.000+08:00| 85.0| -|2025-05-13T00:00:02.000+08:00| 85.0| -|2025-05-13T00:00:03.000+08:00| 85.0| -|2025-05-13T00:00:04.000+08:00| 90.0| -|2025-05-13T00:00:05.000+08:00| 85.0| -|2025-05-13T00:00:06.000+08:00| 90.0| -+-----------------------------+-------+ -``` -**Example: Querying the downsampled data of table3:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS c_time, - first(humidity) AS c_value - FROM table3 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**Result:** - -```SQL -+-----------------------------+-------+ -| c_time|c_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 35.1| -|2025-05-13T00:00:01.000+08:00| 35.1| -|2025-05-13T00:00:02.000+08:00| 37.2| -|2025-05-13T00:00:03.000+08:00| 37.2| -|2025-05-13T00:00:04.000+08:00| 37.2| -|2025-05-13T00:00:05.000+08:00| 37.2| -|2025-05-13T00:00:06.000+08:00| 
35.1| -+-----------------------------+-------+ -``` - -**Example: Aligning multiple sequences by the higher sampling frequency:** - -```SQL -IoTDB> SELECT time, - a_value, - c_value - FROM - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) A - JOIN - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(humidity) AS c_value - FROM table3 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) C - USING (time) -``` - -**Result:** - -```SQL -+-----------------------------+-------+-------+ -| time|a_value|c_value| -+-----------------------------+-------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| 35.1| -|2025-05-13T00:00:01.000+08:00| 85.0| 35.1| -|2025-05-13T00:00:02.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:03.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:04.000+08:00| 90.0| 37.2| -|2025-05-13T00:00:05.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:06.000+08:00| 90.0| 35.1| -+-----------------------------+-------+-------+ -``` - -### 3.7 Missing Data Filling - -**Example**: Query the records within a specified time range where `device_id` is '101'. If there are missing data points, fill them using the previous non-null value.
- -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE time >= 2024-11-26 00:00:00 and time <= 2024-11-30 11:00:00 - AND region='East' AND plant_id='1001' AND device_id='101' - FILL METHOD PREVIOUS; -``` - -**Result**: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:40:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:41:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:42:00.000+08:00| 85.0| 35.2| -|2024-11-27T16:43:00.000+08:00| 85.0| 35.2| -|2024-11-27T16:44:00.000+08:00| 85.0| 35.2| -+-----------------------------+-----------+--------+ -Total line number = 7 -It costs 0.101s -``` - -### 3.8 Sorting & Pagination - -**Example**: Query records from the table, sorting by `humidity` in descending order and placing null values (NULL) at the end. Skip the first 2 rows and return the next 10 rows.
- -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - ORDER BY humidity desc NULLS LAST - OFFSET 2 - LIMIT 10; -``` - -**Result**: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-28T09:00:00.000+08:00| null| 40.9| -|2024-11-29T18:30:00.000+08:00| 90.0| 35.4| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| -|2024-11-28T10:00:00.000+08:00| 85.0| 35.2| -|2024-11-30T09:30:00.000+08:00| 90.0| 35.2| -|2024-11-27T16:42:00.000+08:00| null| 35.2| -|2024-11-26T13:38:00.000+08:00| 90.0| 35.1| -|2024-11-26T13:37:00.000+08:00| 90.0| 35.1| -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-30T14:30:00.000+08:00| 90.0| 34.8| -+-----------------------------+-----------+--------+ -Total line number = 10 -It costs 0.093s -``` diff --git a/src/UserGuide/Master/Table/Basic-Concept/TTL-Delete-Data_timecho.md b/src/UserGuide/Master/Table/Basic-Concept/TTL-Delete-Data_timecho.md deleted file mode 100644 index de696fa1d..000000000 --- a/src/UserGuide/Master/Table/Basic-Concept/TTL-Delete-Data_timecho.md +++ /dev/null @@ -1,145 +0,0 @@ - - -# TTL Delete Data - -## 1. Overview - -Time-to-Live (TTL) is a mechanism for defining the lifespan of data in a database. In IoTDB, TTL allows setting table-level expiration policies, enabling the system to automatically delete outdated data periodically. This helps manage disk space efficiently, maintain high query performance, and reduce memory usage. - -TTL values are specified in milliseconds, and once data exceeds its defined lifespan, it becomes unavailable for queries and cannot be written to. However, the physical deletion of expired data occurs later during the compaction process. Note that changes to TTL settings can briefly impact the accessibility of data. - -**Notes:** - -1. TTL defines the expiration time of data in milliseconds, independent of the time precision configuration file. -2. 
Modifying TTL settings can cause temporary variations in data accessibility. -3. The system eventually removes expired data, though this process may involve some delay. -4. The TTL expiration check is based on the data point timestamp, not the write time. - -## 2. Set TTL - -In the table model, IoTDB’s TTL operates at the granularity of individual tables. You can set TTL directly on a table or at the database level. When TTL is configured at the database level, it serves as the default for new tables created within the database. However, each table can still have its own independent TTL settings. - -**Note:** Modifying the database-level TTL does not retroactively affect the TTL settings of existing tables. - -### 2.1 Set TTL for Tables - -If TTL is specified when creating a table using SQL, the table’s TTL takes precedence. Refer to [Table-Management](../Basic-Concept/Table-Management_timecho.md) for details. - -Example 1: Setting TTL during table creation: - -```SQL -CREATE TABLE test3 ("site" string id, "temperature" int32) with (TTL=3600); -``` - -Example 2: Changing TTL for an existing table: - -```SQL -ALTER TABLE tableB SET PROPERTIES TTL=3600; -``` - -**Example 3:** If TTL is not specified or is set to the default value, the table inherits the database's TTL. By default, the database TTL is `'INF'` (infinite): - -```SQL -CREATE TABLE test3 ("site" string id, "temperature" int32) with (TTL=DEFAULT); -CREATE TABLE test3 ("site" string id, "temperature" int32); -ALTER TABLE tableB set properties TTL=DEFAULT; -``` - -### 2.2 Set TTL for Databases - -Tables without explicit TTL settings inherit the TTL of their database. Refer to [Database-Management](../Basic-Concept/Database-Management_timecho.md) for details.
- -Example 4: Tables created in a database with TTL=3600000 inherit that TTL: - -```SQL -CREATE DATABASE db WITH (ttl=3600000); -use db; -CREATE TABLE test3 ("site" string id, "temperature" int32); -``` - -Example 5: Tables created in a database without a TTL setting have no TTL: - -```SQL -CREATE DATABASE db; -use db; -CREATE TABLE test3 ("site" string id, "temperature" int32); -``` - -Example 6: Explicitly creating a table with no TTL (TTL='INF') in a database that has a configured TTL: - -```SQL -CREATE DATABASE db WITH (ttl=3600000); -use db; -CREATE TABLE test3 ("site" string id, "temperature" int32) with (ttl='INF'); -``` - -## 3. Remove TTL - -To cancel a TTL setting, set the table's TTL to 'INF'. Note that IoTDB does not currently support modifying the TTL of a database. - -```SQL -ALTER TABLE tableB set properties TTL='INF'; -``` - -## 4. View TTL Information - -Use the SHOW DATABASES and SHOW TABLES commands to view TTL details for databases and tables. Refer to [Database-Management](../Basic-Concept/Database-Management_timecho.md) and [Table-Management](../Basic-Concept/Table-Management_timecho.md) for details. - -> Note: TTL settings of tree-model databases are also shown.
- -Example Output: - -```SQL -IoTDB> show databases; -+---------+-------+-----------------------+---------------------+---------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval| -+---------+-------+-----------------------+---------------------+---------------------+ -|test_prop| 300| 1| 3| 100000| -| test2| 300| 1| 1| 604800000| -+---------+-------+-----------------------+---------------------+---------------------+ - -IoTDB> show databases details; -+---------+-------+-----------------------+---------------------+---------------------+-----+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|Model| -+---------+-------+-----------------------+---------------------+---------------------+-----+ -|test_prop| 300| 1| 3| 100000|TABLE| -| test2| 300| 1| 1| 604800000| TREE| -+---------+-------+-----------------------+---------------------+---------------------+-----+ -IoTDB> show tables; -+---------+-------+ -|TableName|TTL(ms)| -+---------+-------+ -| grass| 1000| -| bamboo| 300| -| flower| INF| -+---------+-------+ - -IoTDB> show tables details; -+---------+-------+----------+ -|TableName|TTL(ms)| Status| -+---------+-------+----------+ -| bean| 300|PRE_CREATE| -| grass| 1000| USING| -| bamboo| 300| USING| -| flower| INF| USING| -+---------+-------+----------+ -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Table/Basic-Concept/Table-Management_timecho.md b/src/UserGuide/Master/Table/Basic-Concept/Table-Management_timecho.md deleted file mode 100644 index f2e1bb2cd..000000000 --- a/src/UserGuide/Master/Table/Basic-Concept/Table-Management_timecho.md +++ /dev/null @@ -1,321 +0,0 @@ - - -# Table Management - -Before starting to use the table management functionality, we recommend familiarizing yourself with the following related background knowledge for a better understanding and application of the table management features: -* [Timeseries Data 
Model](../Background-knowledge/Navigating_Time_Series_Data_timecho.md): Understand the basic concepts and characteristics of time series data to establish a foundation for data modeling. -* [Modeling Scheme Design](../Background-knowledge/Data-Model-and-Terminology_timecho.md): Master the IoTDB time series model and its applicable scenarios to provide a design basis for table management. - -## 1. Table Management - -### 1.1 Create a Table - -#### 1.1.1 Manually create a table with CREATE - -Manually create a table within the current or specified database.The format is "database name. table name". - -**Syntax:** - -```SQL -createTableStatement - : CREATE TABLE (IF NOT EXISTS)? qualifiedName - '(' (columnDefinition (',' columnDefinition)*)? ')' - charsetDesc? - comment? - (WITH properties)? - ; - -charsetDesc - : DEFAULT? (CHAR SET | CHARSET | CHARACTER SET) EQ? identifierOrString - ; - -columnDefinition - : identifier columnCategory=(TAG | ATTRIBUTE | TIME) charsetName? comment? - | identifier type (columnCategory=(TAG | ATTRIBUTE | TIME | FIELD))? charsetName? comment? - ; - -charsetName - : CHAR SET identifier - | CHARSET identifier - | CHARACTER SET identifier - ; - -comment - : COMMENT string - ; -``` - -**Note:** - -1. When creating a table, you do not need to specify a time column. IoTDB automatically adds a column named "time" and places it as the first column. All other columns can be added by enabling the `enable_auto_create_schema` option in the database configuration, or through the session interface for automatic creation or by using table modification statements. -2. Since version V2.0.8.2, tables support custom naming of the time column during creation. The order of the custom time column in the table is determined by the order in the creation SQL. The related constraints are as follows: - - - When the column category is set to TIME, the data type must be TIMESTAMP. - - Each table allows at most one time column (columnCategory = TIME). 
- - If no time column is explicitly defined, no other column can use "time" as its name to avoid conflicts with the system's default time column naming. -3. The column category can be omitted and defaults to FIELD. When the column category is TAG or ATTRIBUTE, the data type must be STRING (can be omitted). -4. The TTL of a table defaults to the TTL of its database. If the default value is used, this attribute can be omitted or set to default. -5. table name has the following characteristics: - - - It is case-insensitive and, upon successful creation, is uniformly displayed in lowercase. - - The name can include special characters, such as `~!`"%`, etc. - - Table names containing special characters or Chinese characters must be enclosed in double quotation marks ("") during creation. - - - Note: In SQL, special characters or Chinese table names must be enclosed in double quotes. In the native API, no additional quotes are needed; otherwise, the table name will include the quote characters. - - When naming a table, the outermost double quotation marks (`""`) will not appear in the actual table name. - - ```sql - -- In SQL - "a""b" --> a"b - """""" --> "" - -- In API - "a""b" --> "a""b" - ``` -6. columnDefinition column names have the same characteristics as table names and can include the special character `.`. -7. COMMENT adds a comment to the table. - -**Examples:** - -```SQL -CREATE TABLE table1 ( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE COMMENT 'maintenance', - temperature FLOAT FIELD COMMENT 'temperature', - humidity FLOAT FIELD COMMENT 'humidity', - status Boolean FIELD COMMENT 'status', - arrival_time TIMESTAMP FIELD COMMENT 'arrival_time' -) COMMENT 'table1' WITH (TTL=31536000000); -``` - -Note: If your terminal does not support multi-line paste (e.g., Windows CMD), please reformat the SQL statement into a single line before execution. 
- -### 1.2 View Tables - -Used to view all tables and their properties in the current or a specified database. - -**Syntax:** - -```SQL -SHOW TABLES (DETAILS)? ((FROM | IN) database_name)? -``` - -**Note:** - -1. If the `FROM` or `IN` clause is specified, the command lists all tables in the specified database. -2. If neither `FROM` nor `IN` is specified, the command lists all tables in the currently selected database. If no database is selected (`USE` statement not executed), an error is returned. -3. When the `DETAILS` option is used, the command shows the current state of each table: - 1. `USING`: The table is available and operational. - 2. `PRE_CREATE`: The table is in the process of being created or the creation has failed; the table is not available. - 3. `PRE_DELETE`: The table is in the process of being deleted or the deletion has failed; the table will remain permanently unavailable. - -**Examples:** - -```sql -show tables details from database1; -``` -```shell -+---------------+-----------+------+-------+ -| TableName| TTL(ms)|Status|Comment| -+---------------+-----------+------+-------+ -| table1|31536000000| USING| table1| -+---------------+-----------+------+-------+ -``` - -### 1.3 View Table Columns - -Used to view column names, data types, categories, and states of a table. - -**Syntax:** - -```SQL -(DESC | DESCRIBE) (DETAILS)? -``` - -**Note:** If the `DETAILS` option is specified, detailed state information of the columns is displayed: - -- `USING`: The column is in normal use. -- `PRE_DELETE`: The column is being deleted or the deletion has failed; it is permanently unavailable. 
- - - -**Examples:** - -```sql -desc table1 details; -``` -```shell -+------------+---------+---------+------+------------+ -| ColumnName| DataType| Category|Status| Comment| -+------------+---------+---------+------+------------+ -| time|TIMESTAMP| TIME| USING| null| -| region| STRING| TAG| USING| null| -| plant_id| STRING| TAG| USING| null| -| device_id| STRING| TAG| USING| null| -| model_id| STRING|ATTRIBUTE| USING| null| -| maintenance| STRING|ATTRIBUTE| USING| maintenance| -| temperature| FLOAT| FIELD| USING| temperature| -| humidity| FLOAT| FIELD| USING| humidity| -| status| BOOLEAN| FIELD| USING| status| -|arrival_time|TIMESTAMP| FIELD| USING|arrival_time| -+------------+---------+---------+------+------------+ -``` - -### 1.4 View Table Creation Statement - -Retrieves the complete definition statement of a table or view under the table model. This feature automatically fills in all default values that were omitted during creation, so the displayed statement may differ from the original CREATE statement. - -> This feature is supported starting from v2.0.5. - -**Syntax:** - -```SQL -SHOW CREATE TABLE -``` - -**Note:** - -1. This statement does not support queries on system tables. 
- -**Example:** - -```SQL -show create table table1; -``` -```shell -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| Table| Create Table| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -|table1|CREATE TABLE "table1" ("region" STRING TAG,"plant_id" STRING TAG,"device_id" STRING TAG,"model_id" STRING ATTRIBUTE,"maintenance" STRING ATTRIBUTE,"temperature" FLOAT FIELD,"humidity" FLOAT FIELD,"status" BOOLEAN FIELD,"arrival_time" TIMESTAMP FIELD) WITH (ttl=31536000000)| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 1 -``` - - -### 1.5 Update Tables - -Used to update a table, including adding or deleting columns, modify column type (V2.0.8.2) and configuring table properties. - -**Syntax:** - -```SQL -#addColumn; -ALTER TABLE (IF EXISTS)? tableName=qualifiedName ADD COLUMN (IF NOT EXISTS)? column=columnDefinition; -#dropColumn; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName DROP COLUMN (IF EXISTS)? column=identifier; -#setTableProperties; -// set TTL can use this; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName SET PROPERTIES propertyAssignments; -| COMMENT ON TABLE tableName=qualifiedName IS 'table_comment'; -| COMMENT ON COLUMN tableName.column IS 'column_comment'; -#changeColumndatatype; -| ALTER TABLE (IF EXISTS)? 
tableName=qualifiedName ALTER COLUMN (IF EXISTS)? column=identifier SET DATA TYPE new_type=type; -``` - -**Note:** - -1. The `SET PROPERTIES` operation currently only supports configuring the `TTL` property of a table. -2. Column deletion is supported only for ATTRIBUTE and FIELD columns; TAG columns cannot be deleted. -3. The modified comment will overwrite the original comment. If null is specified, the previous comment will be erased. -4. Since version V2.0.8.2, modifying the data type of a column is supported. Currently, only columns with Category type FIELD can be modified. - - * If the time series is concurrently deleted during the modification process, an error will be reported. - * The new data type must be compatible with the original type. The specific compatibility is shown in the following table: - -**Example:** - -add column -```SQL -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS a TAG COMMENT 'a'; -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS b FLOAT FIELD COMMENT 'b'; -``` -set TTL -```SQL -ALTER TABLE table1 set properties TTL=3600; -``` -set comment -```SQL -COMMENT ON TABLE table1 IS 'table1'; -COMMENT ON COLUMN table1.a IS null; -``` -alter column datatype -```SQL -ALTER TABLE table1 ALTER COLUMN IF EXISTS b SET DATA TYPE DOUBLE; -``` - -### 1.6 Delete Tables - -Used to delete a table. - -**Syntax:** - -```SQL -DROP TABLE (IF EXISTS)? -``` - -**Examples:** - -```SQL -DROP TABLE table1; -DROP TABLE database1.table1; -``` - -### 1.7 Metadata Query -Under the table model, the **total number of measurement points** equals the sum of measurement points of all tables. Currently, the number of measurement points in a single table can be calculated with the formula: -**Measurement points per single table = Number of devices × Number of field columns**. -Support for directly querying measurement points under the table model via SQL statements will be available in future updates. 
- -Take `table1` in the [sample data](../Reference/Sample-Data.md) as an example. - -In the organizational structure of this sample, there are three tag columns (`region`, `plant_id`, `device_id`) and four field columns (`temperature`, `humidity`, `status`, `arrival_time`). - -A unique device is identified by the combination of all tag columns. Each unique combination of `region` + `plant_id` + `device_id` represents an independent device. - -The sample data defines 2 regions: Beijing and Shanghai. Details are as follows: -- **Beijing**: 1 factory with ID 1001 - - 2 devices under this factory: IDs 100 and 101 -- **Shanghai**: 2 factories with IDs 3001 and 3002 - - Factory 3001: 2 devices (IDs 100, 101) - - Factory 3002: 2 devices (IDs 100, 101) - -In total, there are 6 unique tag combinations in the table, corresponding to 6 independent devices. - -### Complete Calculation Example for Single-Table Measurement Points -1. Query the number of devices -```sql -IoTDB:database1> count devices from table1; -+--------------+ -|count(devices)| -+--------------+ -| 6| -+--------------+ -Total line number = 1 -It costs 0.019s -``` - -2. Calculate the total measurement points of the single table -- Number of devices: 6 -- Number of field columns: 4 -- Total measurement points of the table: **6 × 4 = 24** - diff --git a/src/UserGuide/Master/Table/Basic-Concept/Write-Updata-Data_timecho.md b/src/UserGuide/Master/Table/Basic-Concept/Write-Updata-Data_timecho.md deleted file mode 100644 index 38436f17d..000000000 --- a/src/UserGuide/Master/Table/Basic-Concept/Write-Updata-Data_timecho.md +++ /dev/null @@ -1,398 +0,0 @@ - - -# Write & Update Data - -## 1. SQL Insertion - -### 1.1 Syntax - -In IoTDB, data insertion follows the general syntax: - -```SQL -INSERT INTO [(COLUMN_NAME[, COLUMN_NAME]*)]? VALUES (COLUMN_VALUE[, COLUMN_VALUE]*) -``` - -**Basic Constraints**: - -1. Tables cannot be automatically created using `INSERT` statements. -2. 
Columns not specified in the `INSERT` statement will automatically be filled with `null`. -3. If no timestamp is provided, the system will use the current time (`now()`). -4. If a column value does not exist for the identified device, the insertion will overwrite any existing `null` values with the new data. -5. If a column value already exists for the identified device, a new insertion will update the column with the new value. -6. Writing duplicate timestamps will update the values in the columns corresponding to the original timestamps. -7. When an `INSERT` statement does not specify column names (e.g., `INSERT INTO table VALUES (...)`), the values in `VALUES` must strictly follow the physical order of columns in the table (this order can be checked via the `DESC table` command). - -Since attributes generally do not change over time, it is recommended to update attribute values using the `UPDATE` statement; see [Data Update](#2-data-updates). - -
- -
- - -### 1.2 Specified Column Insertion - -It is possible to insert data for specific columns. Columns not specified will remain `null`. - -**Example:** - -```SQL -INSERT INTO table1(region, plant_id, device_id, time, temperature, humidity) VALUES ('Hamburg', '1001', '100', '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, time, temperature) VALUES ('Hamburg', '1001', '100', '2025-11-26 13:38:00', 91.0); -``` - -### 1.3 Null Value Insertion - -You can explicitly set `null` values for tag columns, attribute columns, and field columns. - -**Example**: - -Equivalent to the above partial column insertion. - -```SQL -# Equivalent to the example above; -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('Hamburg', '1001', '100', null, null, '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('Hamburg', '1001', '100', null, null, '2025-11-26 13:38:00', 91.0, null); -``` - -If no tag columns are included, the system will automatically create a device with all tag column values set to `null`. - -> **Note:** This operation will not only automatically populate existing tag columns in the table with `null` values but will also populate any newly added tag columns with `null` values in the future. - -### 1.4 Multi-Row Insertion - -IoTDB supports inserting multiple rows of data in a single statement to improve efficiency. 
- -**Example**: - -```SQL -INSERT INTO table1 -VALUES -('2025-11-26 13:37:00', 'Frankfurt', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('2025-11-26 13:38:00', 'Frankfurt', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:38:25'); - -INSERT INTO table1 -(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) -VALUES -('Frankfurt', '1001', '100', 'A', '180', '2025-11-26 13:37:00', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('Frankfurt', '1001', '100', 'A', '180', '2025-11-26 13:38:00', 90.0, 35.1, true, '2025-11-26 13:38:25'); -``` - -#### Notes - -- Referencing non-existent columns in SQL will result in an error code `COLUMN_NOT_EXIST(616)`. -- Data type mismatches between the insertion data and the column's data type will result in an error code `DATA_TYPE_MISMATCH(614)`. - - -### 1.5 Query Write-back - -The IoTDB table model supports the **append-only query write-back** feature, implemented via the `INSERT INTO QUERY` statement. This feature allows writing the results of a query into an **existing** table. - -> **Note**: This feature is available starting from version V2.0.6. - -#### 1.5.1 Syntax Definition - -```sql -INSERT INTO table_name [ ( column [, ... ] ) ] query -``` - -The **query** component supports three formats, which are illustrated with examples below. - -Using the [sample data](../Reference/Sample-Data.md) as the data source, first create the target table: - -```sql -CREATE TABLE target_table ( time TIMESTAMP TIME, region STRING TAG, device_id STRING TAG, temperature FLOAT FIELD ); -Msg: The statement is executed successfully. -``` - -1. Write-back via Standard Query Statement - -The `query` part is a direct `select ... from ...` query. - -**Example**: Use a standard query statement to write the `time`, `region`, `device_id`, and `temperature` data of the Beijing region from `table1` into `target_table`. 
- -```sql -insert into target_table select time,region,device_id,temperature from table1 where region = 'Beijing'; -Msg: The statement is executed successfully. -``` -```sql -select * from target_table where region='Beijing'; -``` -```shell -+-----------------------------+--------+-----------+-------------+ -| time| region| device_id| temperature| -+-----------------------------+--------+-----------+-------------+ -|2024-11-26T13:37:00.000+08:00| Beijing| 100| 90.0| -|2024-11-26T13:38:00.000+08:00| Beijing| 100| 90.0| -|2024-11-27T16:38:00.000+08:00| Beijing| 101| null| -|2024-11-27T16:39:00.000+08:00| Beijing| 101| 85.0| -|2024-11-27T16:40:00.000+08:00| Beijing| 101| 85.0| -|2024-11-27T16:41:00.000+08:00| Beijing| 101| 85.0| -|2024-11-27T16:42:00.000+08:00| Beijing| 101| null| -|2024-11-27T16:43:00.000+08:00| Beijing| 101| null| -|2024-11-27T16:44:00.000+08:00| Beijing| 101| null| -+-----------------------------+--------+-----------+-------------+ -Total line number = 9 -It costs 0.029s -``` - -2. Write-back via Table Reference Query - -The `query` part uses the table reference syntax `table source_table`. - -**Example**: Use a table reference query to write data from `table3` into `target_table`. - -```sql -insert into target_table(time,device_id,temperature) table table3; -Msg: The statement is executed successfully. 
-``` -```sql -select * from target_table where region is null; -``` -```shell -+-----------------------------+------+-----------+-------------+ -| time|region| device_id| temperature| -+-----------------------------+------+-----------+-------------+ -|2025-05-13T00:00:00.001+08:00| null| d1| 90.0| -|2025-05-13T00:00:01.002+08:00| null| d1| 85.0| -|2025-05-13T00:00:02.101+08:00| null| d1| 85.0| -|2025-05-13T00:00:03.201+08:00| null| d1| null| -|2025-05-13T00:00:04.105+08:00| null| d1| 90.0| -|2025-05-13T00:00:05.023+08:00| null| d1| 85.0| -|2025-05-13T00:00:06.129+08:00| null| d1| 90.0| -+-----------------------------+------+-----------+-------------+ -Total line number = 7 -It costs 0.015s -``` - -3. Write-back via Subquery - -The `query` part is a parenthesized subquery. - -**Example**: Use a subquery to write the `time`, `region`, `device_id`, and `temperature` data from `table1` whose timestamps match the records of the Shanghai region in `table2` into `target_table`. - -```sql -insert into target_table (select t1.time, t1.region as region, t1.device_id as device_id, t1.temperature as temperature from table1 t1 where t1.time in (select t2.time from table2 t2 where t2.region = 'Shanghai')); -Msg: The statement is executed successfully. -``` -```sql -select * from target_table where region = 'Shanghai'; -``` -```shell -+-----------------------------+---------+-----------+-------------+ -| time| region| device_id| temperature| -+-----------------------------+---------+-----------+-------------+ -|2024-11-28T08:00:00.000+08:00| Shanghai| 100| 85.0| -|2024-11-29T11:00:00.000+08:00| Shanghai| 100| null| -+-----------------------------+---------+-----------+-------------+ -Total line number = 2 -It costs 0.014s -``` - -#### 1.5.2 Notes - -* The source table in the `query` and the target table `table_name` are allowed to be the same table, e.g., `INSERT INTO testtb SELECT * FROM testtb`. 
-* The target table ​**must already exist**​; otherwise, the error message `550: Table 'xxx.xxx' does not exist` will be thrown. -* The number and types of query result columns must exactly match those of the target table. Object type is currently not supported, and no implicit type conversion is supported. If types mismatch, the error `701: Insert query has mismatched column types` will be raised. -* You can specify a subset of columns in the target table, provided the following rules are met: - * The timestamp column must be included; otherwise, the error message `701: time column can not be null` will be thrown. - * At least one **FIELD** column must be included; otherwise, the error message `701: No Field column present` will be thrown. - * **TAG** columns are optional. - * The number of specified columns can be less than that of the target table. Missing columns will be automatically filled with `NULL` values. -* For Java applications, the `INSERT INTO QUERY` statement can be executed using the [executeNonQueryStatement](../API/Programming-Java-Native-API_timecho.md#_3-1-itablesession-interface) method. -* For REST API access, the `INSERT INTO QUERY` statement can be executed via the [/rest/table/v1/nonQuery](../API/RestAPI-V1_timecho.md#_3-3-non-query-interface) endpoint. -* `INSERT INTO QUERY` does **not** support the `EXPLAIN` and `EXPLAIN ANALYZE` commands. -* To execute the query write-back statement successfully, users must have the following permissions: - * The `SELECT` permission on the source tables involved in the query. - * The `WRITE` permission on the target table. - * For more details about user permissions, refer to [Authority Management](../User-Manual/Authority-Management_timecho.md). - - -### 1.6 Writing Object Type - -To avoid oversized Object write requests, values of **Object** type can be split into segments and written sequentially. In SQL, the `to_object(isEOF, offset, content)` function must be used for value insertion. 
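The segmenting rule can be sketched as a small shell helper that prints one `INSERT` statement per segment. This is only an illustration under stated assumptions: the function name `gen_object_inserts` is hypothetical, the tag column `device_id` and Object column `s1` are placeholder names, and offsets are counted in bytes (two hex digits per byte), which is how the segmented-write example in this section uses them.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of IoTDB): split a hex string into segments of
# SEGMENT_BYTES bytes and print one INSERT per segment.
# Usage: gen_object_inserts TABLE TIME TAG_VALUE HEX SEGMENT_BYTES
gen_object_inserts() {
  table="$1"; ts="$2"; tag="$3"; hex="$4"; seg_bytes="$5"
  total=$(( ${#hex} / 2 ))   # total Object size in bytes
  offset=0
  while [ "$offset" -lt "$total" ]; do
    len=$(( total - offset ))
    if [ "$len" -gt "$seg_bytes" ]; then len="$seg_bytes"; fi
    # Hex characters for this segment (cut uses 1-based character positions)
    chunk="$(printf '%s' "$hex" | cut -c "$(( offset * 2 + 1 ))-$(( (offset + len) * 2 ))")"
    # isEOF is TRUE only for the segment that reaches the end of the Object
    if [ $(( offset + len )) -ge "$total" ]; then eof=TRUE; else eof=FALSE; fi
    printf "INSERT INTO %s(time, device_id, s1) VALUES(%s, '%s', TO_OBJECT(%s, %d, X'%s'));\n" \
      "$table" "$ts" "$tag" "$eof" "$offset" "$chunk"
    offset=$(( offset + len ))
  done
}

# Example: a 5-byte Object written in 2-byte segments
gen_object_inserts table1 1 tag1 696F746462 2
```

Each printed statement starts at the offset where the previous segment ended, so the generated statements should be executed in order.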
- -> Supported since V2.0.8 - -**Syntax:** - -```SQL -INSERT INTO tableName(time, columnName) VALUES(timeValue, TO_OBJECT(isEOF, offset, content)); -``` - -**Parameters:** - -| Name | Data Type | Description | -|---------|--------------------|-----------------------------------------------------------------------------| -| isEOF | BOOLEAN | Whether the current write contains the last segment of the Object | -| offset | INT64 | Starting offset of the current segment within the Object | -| content | Hexadecimal (HEX) | Content of the current segment | - -**Examples:** - -Add an Object-type column `s1` to table `table1`: - -```SQL -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS s1 OBJECT FIELD COMMENT 'object type'; -``` - -1. **Non-segmented write** - -```SQL -INSERT INTO table1(time, device_id, s1) VALUES(NOW(), 'tag1', TO_OBJECT(TRUE, 0, X'696F746462')); -``` - -2. **Segmented write** - -```SQL --- First write: TO_OBJECT(FALSE, 0, X'696F'); -INSERT INTO table1(time, device_id, s1) VALUES(1, 'tag1', TO_OBJECT(FALSE, 0, X'696F')); - --- Second write: TO_OBJECT(FALSE, 2, X'7464'); -INSERT INTO table1(time, device_id, s1) VALUES(1, 'tag1', TO_OBJECT(FALSE, 2, X'7464')); - --- Third write: TO_OBJECT(TRUE, 4, X'62'); -INSERT INTO table1(time, device_id, s1) VALUES(1, 'tag1', TO_OBJECT(TRUE, 4, X'62')); -``` - -**Notes:** - -1. If only partial segments of an Object are written, querying the column will return `NULL`. Data becomes accessible only after all segments are successfully written. -2. During segmented writes, if the `offset` of the current write does not match the current size of the Object, the write operation will fail. -3. If `offset=0` is used after partial writes, the existing content will be overwritten with new data. - - -## 2. Schema-less Writing - -When performing data writing through Session, IoTDB supports schema-less writing: there is no need to manually create tables beforehand. 
The system automatically constructs the table structure based on the information in the write request, and then directly executes the data writing operation. - -**Example:** - -```Java -try (ITableSession session = - new TableSessionBuilder() - .nodeUrls(Collections.singletonList("127.0.0.1:6667")) - .username("root") - .password("root") - .build()) { - - session.executeNonQueryStatement("CREATE DATABASE db1"); - session.executeNonQueryStatement("use db1"); - - // Insert data without manually creating the table - List<String> columnNameList = - Arrays.asList("region_id", "plant_id", "device_id", "model", "temperature", "humidity"); - List<TSDataType> dataTypeList = - Arrays.asList( - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.FLOAT, - TSDataType.DOUBLE); - List<ColumnCategory> columnTypeList = - new ArrayList<>( - Arrays.asList( - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.ATTRIBUTE, - ColumnCategory.FIELD, - ColumnCategory.FIELD)); - Tablet tablet = new Tablet("table1", columnNameList, dataTypeList, columnTypeList, 100); - for (long timestamp = 0; timestamp < 100; timestamp++) { - int rowIndex = tablet.getRowSize(); - tablet.addTimestamp(rowIndex, timestamp); - tablet.addValue("region_id", rowIndex, "1"); - tablet.addValue("plant_id", rowIndex, "5"); - tablet.addValue("device_id", rowIndex, "3"); - tablet.addValue("model", rowIndex, "A"); - tablet.addValue("temperature", rowIndex, 37.6F); - tablet.addValue("humidity", rowIndex, 111.1); - if (tablet.getRowSize() == tablet.getMaxRowNumber()) { - session.insert(tablet); - tablet.reset(); - } - } - if (tablet.getRowSize() != 0) { - session.insert(tablet); - tablet.reset(); - } -} -``` - -After execution, you can verify the table creation using the following command: - -```SQL -desc table1; -``` -```shell -+-----------+---------+-----------+ -| ColumnName| DataType| Category| -+-----------+---------+-----------+ -| time|TIMESTAMP| TIME| -| region_id| STRING| TAG| -| 
 plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model| STRING| ATTRIBUTE| -|temperature| FLOAT| FIELD| -| humidity| DOUBLE| FIELD| -+-----------+---------+-----------+ -``` - - -## 3. Data Updates - -### 3.1 Syntax - -```SQL -UPDATE SET updateAssignment (',' updateAssignment)* (WHERE where=booleanExpression)? - -updateAssignment - : identifier EQ expression - ; -``` - -**Note:** - -- Updates are allowed only on `ATTRIBUTE` columns. -- `WHERE` conditions: - - Can only include `TAG` and `ATTRIBUTE` columns; `FIELD` and `TIME` columns are not allowed. - - Aggregation functions are not supported. -- The result of the `SET` assignment expression must be of `STRING` type and follow the same constraints as the `WHERE` clause. - -**Example**: - -```SQL -update table1 set b = a where substring(a, 1, 1) like '%'; -``` diff --git a/src/UserGuide/Master/Table/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md b/src/UserGuide/Master/Table/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md deleted file mode 100644 index 31a27cfb9..000000000 --- a/src/UserGuide/Master/Table/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md +++ /dev/null @@ -1,290 +0,0 @@ - -# AINode Deployment - -## 1. AINode Introduction - -### 1.1 Capability Introduction - -AINode is the third type of native node provided by TimechoDB, after ConfigNode and DataNode. By interacting with the DataNodes and ConfigNodes of a TimechoDB cluster, this node extends the capability for machine learning analysis on time series. AINode integrates model management, training, and inference within the database engine. It supports performing time series analysis tasks on specified time series data using registered models through simple SQL statements and also supports registering and using custom machine learning models. AINode currently integrates machine learning algorithms and self-developed models for common time series analysis scenarios (e.g., forecasting). 
- -### 1.2 Deployment Modes - -AINode is an additional component outside the TimechoDB cluster and is deployed using a separate installation package. - -
- - -
- -## 2. Installation Preparation - -### 2.1 Installation Package Acquisition - -The key directory structure after extracting the AINode installation package (`timechodb--ainode-bin.zip`) is as follows: - -| Directory | Type | Description | -| :--- | :--- | :--- | -| lib | Folder | Executable programs and dependencies for AINode | -| sbin | Folder | Operation scripts for AINode, used to start or stop AINode | -| conf | Folder | Configuration files and version declaration file for AINode | - -### 2.2 Pre-installation Verification - -To ensure the AINode installation package you obtained is complete and correct, it is recommended to perform an SHA512 verification before installation and deployment. - -**Preparation:** - -- Obtain the official SHA512 checksum: Please contact Timecho staff. - -**Verification Steps (using Linux as an example):** - -1. Open a terminal and navigate to the directory containing the installation package (e.g., `/data/ainode`): - -```bash -cd /data/ainode -``` - -2. Execute the following command to calculate the hash value: - -```bash -sha512sum timechodb-{version}-ainode-bin.zip -``` - -3. The terminal will output the result (left side is the SHA512 checksum, right side is the filename): - -```bash -(base) root@hadoop@1:/data/ainode (0.664s) -sha512sum timechodb-2.0.6.1-ainode-bin.zip -4d5a6a64935b4f0459bc9ed214c4563aa7a6a5941024336e9416212424707f27bdfdfc70f4c528b51b812687d660014adc1b8add699498ea67ff17c7e619a6f0 timechodb-2.0.6.1-ainode-bin.zip -``` - -4. Compare the output with the official SHA512 checksum. If they match, you can proceed with the AINode installation and deployment steps below. - -**Notes:** - -- If the verification results do not match, please contact Timecho staff to obtain a new installation package. -- If you encounter a "file not found" prompt during verification, check if the file path is correct or if the installation package was downloaded completely. 
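The manual comparison in the verification steps can also be scripted. The following is a minimal sketch, assuming a Linux host where `sha512sum` is available (on macOS, `shasum -a 512` is the usual equivalent). The function name `verify_sha512` is illustrative and is not shipped with the AINode package.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare the SHA512 of a file with an expected checksum.
# Usage: verify_sha512 FILE EXPECTED_HEX
# Returns 0 on match, 1 on mismatch, 2 if the file does not exist.
verify_sha512() {
  file="$1"
  expected="$2"
  if [ ! -f "$file" ]; then
    echo "file not found: $file" >&2
    return 2
  fi
  # sha512sum prints "<checksum>  <filename>"; keep only the checksum
  actual="$(sha512sum "$file" | awk '{print $1}')"
  # Compare case-insensitively, since checksums may be published in upper case
  expected="$(printf '%s' "$expected" | tr 'A-F' 'a-f')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: checksum matches"
    return 0
  fi
  echo "MISMATCH: expected $expected but got $actual" >&2
  return 1
}

# Example call (replace the arguments with the real package name and official checksum):
# verify_sha512 timechodb-{version}-ainode-bin.zip "<official checksum from Timecho staff>"
```

A non-zero return code corresponds to the two cases listed in the notes above: a mismatch (request a new package) or a missing file (check the path or the download).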
- -### 2.3 Environment Requirements - -- Recommended operating environment: Linux, macOS. -- TimechoDB Version: >= V2.0.8. - -#### 2.3.1 Resource Configuration Recommendations - -> Note: The resource configuration recommendations in this section apply only to **model inference tasks**. Guidelines for model training tasks will be provided in subsequent releases. - -The following are baseline resource configurations for model inference running on a single NVIDIA RTX 4090 (24 GB VRAM). For model inference on AINode, overall throughput can be improved by horizontally scaling the number of GPUs. It is generally recommended to deploy servers with 1, 2, 4 or 8 GPUs. - -Specifications of inference tasks used in benchmark tests: -- **Univariate inference**: Historical sequence length: 2880, prediction length: 720 -- **Covariate inference**: Historical sequence length: 2880, prediction length: 720, with 20 known covariates - -| Number of GPUs (NVIDIA 4090, 24 GB VRAM) | Recommended CPU Cores | Recommended Memory (GB) | Supported QPS for Univariate Inference | Supported QPS for Covariate Inference | -|------------------------------------------|-----------------------|-------------------------|-----------------------------------------|---------------------------------------| -| 1 GPU | 16 cores | 24 GB | 100 | 10 | -| 2 GPUs | 32 cores | 48 GB | 200 | 20 | -| 4 GPUs | 64 cores | 96 GB | 400 | 40 | -| 8 GPUs | 128 cores | 192 GB | 800 | 80 | - -**Notes**: -- The CPU and memory configurations above follow this general rule: allocate 16 CPU cores per GPU, and set system memory equal to GPU VRAM at a ratio of 1:1. -- The throughput figures are benchmark references. Actual performance may vary depending on model type, data complexity and deployment environment. -- The throughput of univariate and covariate inference shall be evaluated separately as required, and the two values cannot be summed directly. - - -## 3. 
Installation, Deployment, and Usage - -### 3.1 Installing AINode - -Download the AINode installation package, import it into a dedicated folder, switch to that folder, and extract the package. - -```bash -unzip timechodb--ainode-bin.zip -``` - -### 3.2 Modifying Configuration Items - -AINode supports modifying some necessary parameters. You can find the following parameters in the `/TIMECHODB_AINODE_HOME/conf/iotdb-ainode.properties` file and make persistent modifications: - -| Name | Description | Type | Default Value | -| :--- | :--- | :--- | :--- | -| `cluster_name` | The cluster identifier the AINode is to join | String | `defaultCluster` | -| `ain_seed_config_node` | The ConfigNode address for AINode registration upon startup | String | `127.0.0.1:10710` | -| `ain_cluster_ingress_address` | The rpc address of the DataNode from which AINode pulls data | String | `127.0.0.1` | -| `ain_cluster_ingress_port` | The rpc port of the DataNode from which AINode pulls data | Integer | `6667` | -| `ain_cluster_ingress_username` | The client username for the DataNode from which AINode pulls data | String | `root` | -| `ain_cluster_ingress_password` | The client password for the DataNode from which AINode pulls data | String | `root` | -| `ain_rpc_address` | The address for AINode service provision and communication (internal service communication interface) | String | `127.0.0.1` | -| `ain_rpc_port` | The port for AINode service provision and communication | String | `10810` | -| `ain_system_dir` | AINode metadata storage path. The starting directory for relative paths is OS-dependent; using an absolute path is recommended. | String | `data/AINode/system` | -| `ain_models_dir` | AINode model file storage path. The starting directory for relative paths is OS-dependent; using an absolute path is recommended. | String | `data/AINode/models` | -| `ain_thrift_compression_enabled` | Whether to enable Thrift compression mechanism for AINode. 0-disable, 1-enable. 
| Boolean | `0` | - -### 3.3 Importing Built-in Weight Files - -*If the deployment environment has network connectivity and can access HuggingFace, the system will automatically pull the built-in model weight files. This step can be skipped.* -*For offline environments, contact Timecho staff to obtain the model weight folder and place it under the `/TIMECHODB_AINODE_HOME/data/ainode/models/builtin` directory.* -**NOTE:** Pay attention to the directory hierarchy. The parent directory for all built-in model weights should be `builtin`. - -### 3.4 Starting AINode - -After completing the deployment of ConfigNodes, you can add an AINode to support time series model management and inference functionality. After specifying the TimechoDB cluster information in the configuration items, you can execute the corresponding command to start the AINode and join the TimechoDB cluster. - -```bash -# Startup command -# Linux and macOS systems -bash sbin/start-ainode.sh - -# Windows system -sbin\start-ainode.bat - -# Background startup command (recommended for long-term operation) -# Linux and macOS systems -bash sbin/start-ainode.sh -d - -# Windows system -sbin\start-ainode.bat -d -``` - -### 3.5 Activating AINode - -1. Refer to TimechoDB Activation: [Activation Method](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md#_2-6-activate-database) - -2. You can verify AINode activation as follows. When the status shows `ACTIVATED`, it indicates successful activation. 
- -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -Total line number = 3 -It costs 0.002s -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2025-07-16T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| AiNodeLimit| 1| 1| -| CpuLimit| 11| Unlimited| -| DeviceLimit| 0| Unlimited| -|TimeSeriesLimit| 0| 9,999| -+---------------+---------+-----------------------------+ -Total line number = 7 -It costs 0.013s -``` - -### 3.6 Checking AINode Node Status - -During startup, AINode automatically joins the TimechoDB cluster. After starting AINode, you can enter an SQL query in the command line. Seeing the AINode node in the cluster with a `Running` status (as shown below) indicates a successful join. 
- -```sql -TimechoDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -Additionally, you can check the model status using the `show models` command. If the model status is incorrect, please verify the weight file path. - -```sql -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### 3.7 Stopping AINode - -If you need to stop a running AINode node, execute the corresponding shutdown script. It supports specifying the port via the `-p` parameter, which corresponds to the `ain_rpc_port` configuration item. - -```bash -# Linux / macOS -bash sbin/stop-ainode.sh -bash sbin/stop-ainode.sh -p # Specify port - -# Windows -sbin\stop-ainode.bat -sbin\stop-ainode.bat -p # Specify port -``` - -After stopping AINode, you can still see the AINode node in the cluster, but its status will be `UNKNOWN` (as shown below). 
AINode functionality will be unavailable at this time. - -```sql -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|UNKNOWN| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -If you need to restart the node, re-execute the startup script. - -### 3.8 Upgrading AINode -If you need to upgrade the version of the current AINode, follow these steps: - -1. Stop the current AINode service - - Run the stop command and ensure the service has completely exited before proceeding with subsequent operations. - - ```bash - # Linux / MacOS - bash sbin/stop-ainode.sh - bash sbin/stop-ainode.sh -p # Specify port - - # Windows - sbin\stop-ainode.bat - sbin\stop-ainode.bat -p # Specify port - ``` - -2. Replace core files - - Delete the `lib` and `sbin` directories of the current version, then copy the `lib` and `sbin` directories from the new version to the corresponding locations. - - Back up the modified configuration files in the `conf` directory, then replace the `conf` folder and synchronize your modified configurations to the corresponding files. - -3. Update built-in model weights (optional) - - If the new version includes updates to built-in models, relevant information will be announced in the [Release History](../IoTDB-Introduction/Release-history_timecho.md). You may contact Timecho staff to obtain the latest weight package, and replace it in the `data/ainode/models/builtin` directory. - -4. After the upgrade is complete, start the AINode service and check the node status. 
For detailed commands, refer to Sections 3.4 and 3.6. \ No newline at end of file diff --git a/src/UserGuide/Master/Table/Deployment-and-Maintenance/AINode_Deployment_timecho.md b/src/UserGuide/Master/Table/Deployment-and-Maintenance/AINode_Deployment_timecho.md deleted file mode 100644 index 08882b31d..000000000 --- a/src/UserGuide/Master/Table/Deployment-and-Maintenance/AINode_Deployment_timecho.md +++ /dev/null @@ -1,319 +0,0 @@ - -# AINode Deployment - -## 1. AINode Introduction - -### 1.1 Capability Introduction - -AINode is the third type of native node provided by IoTDB, after the ConfigNode and the DataNode. It extends IoTDB with machine learning analysis of time series by interacting with the DataNodes and ConfigNodes of the IoTDB cluster. It supports registering existing machine learning models from external sources and using registered models to complete time series analysis tasks on specified time series data through simple SQL statements. Model creation, management, and inference are integrated into the database engine. Currently, machine learning algorithms and self-developed models are available for common time series analysis scenarios, such as prediction and anomaly detection. - -### 1.2 Delivery Method -AINode is delivered as an additional package, separate from the IoTDB cluster package, and is installed independently. - -### 1.3 Deployment mode -
- -## 2. Installation preparation - -### 2.1 Get installation package - -Unzip the installation package (`timechodb--ainode-bin.zip`). The directory structure after unpacking is as follows: - -| **Directory** | **Type** | **Description** | -| ----------- | -------- |-----------------------------------------------------------------------| -| lib | folder | Python package files for AINode | -| sbin | folder | Scripts for running AINode (start, remove, and stop) | -| conf | folder | Configuration files for AINode, and runtime environment setup scripts | -| LICENSE | file | License | -| NOTICE | file | Notice | -| README_ZH.md | file | Chinese version of the README, in Markdown format | -| README.md | file | Usage instructions | - -### 2.2 Pre-installation Check - -To ensure the AINode installation package you obtained is complete and valid, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: please contact the Timecho team. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/ainode): - ```Bash - cd /data/ainode - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-ainode-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-06.png) - -4. Compare the output with the official SHA512 checksum. If they match, you can proceed with the installation and deployment of AINode as described below. - -#### Notes: - -- If the verification results do not match, please contact the Timecho team to obtain the installation package again. 
-- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - -### 2.3 Environmental Preparation - -1. Recommended operating systems: Ubuntu, MacOS -2. IoTDB version: >= V 2.0.5.1 -3. Runtime environment - - Python version between 3.9 and 3.12, with pip and venv tools installed; - -## 3. Installation steps - -### 3.1 Install AINode - -1. Ensure Python version is between 3.9 and 3.12: -```shell -python --version -# or -python3 --version -``` - -2. Download and import AINode into a dedicated folder, switch to the folder, and unzip the package: -```shell - unzip timechodb--ainode-bin.zip - ``` -3. Activate AINode: - -- Enter the IoTDB CLI - -```sql -# For Linux or macOS -./start-cli.sh -sql_dialect table - -# For Windows -./start-cli.bat -sql_dialect table -``` - -- Run the following command to retrieve the machine code required for activation: - -```sql -show system info -``` - -- Copy the returned machine code and send it to the Timecho team: - -```sql -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -``` - -- Enter the activation code provided by the Timecho team in the CLI using the following format. Wrap the activation code in single quotes ('): - -```sql -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZK' -``` - -- You can verify the activation using the following method: when the status shows ACTIVATED, it indicates successful activation. 
- -```sql -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ - -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2025-07-16T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| AiNodeLimit| 1| 1| -| CpuLimit| 11| Unlimited| -| DeviceLimit| 0| Unlimited| -|TimeSeriesLimit| 0| 9,999| -+---------------+---------+-----------------------------+ - -``` - -### 3.2 Configuration item modification - -AINode supports modifying some necessary parameters. 
You can find the following parameters in the `conf/iotdb-ainode.properties` file and make persistent modifications to them: - -| **Name** | **Description** | **Type** | **Default Value** | -| ------------------------------ | ------------------------------------------------------------ | -------- | ------------------ | -| cluster_name | Identifier of the cluster AINode joins | string | defaultCluster | -| ain_seed_config_node | Address of the ConfigNode registered when AINode starts | String | 127.0.0.1:10710 | -| ain_cluster_ingress_address | RPC address of the DataNode for AINode to pull data | String | 127.0.0.1 | -| ain_cluster_ingress_port | RPC port of the DataNode for AINode to pull data | Integer | 6667 | -| ain_cluster_ingress_username | Client username for AINode to pull data from the DataNode | String | root | -| ain_cluster_ingress_password | Client password for AINode to pull data from the DataNode | String | root | -| ain_cluster_ingress_time_zone | Client time zone for AINode to pull data from the DataNode | String | UTC+8 | -| ain_inference_rpc_address | Address for AINode to provide services and communication (internal interface) | String | 127.0.0.1 | -| ain_inference_rpc_port | Port for AINode to provide services and communication | String | 10810 | -| ain_system_dir | Metadata storage path for AINode (relative path starts from OS-dependent directory; absolute path is recommended) | String | data/AINode/system | -| ain_models_dir | Path to store model files for AINode (relative path starts from OS-dependent directory; absolute path is recommended) | String | data/AINode/models | -| ain_thrift_compression_enabled | Whether to enable Thrift compression for AINode (0=disabled, 1=enabled) | Boolean | 0 | - -### 3.3 Importing Weight Files - -> Offline environment only (Online environments can skip this step) -> -Contact Timecho team to obtain the model weight files, then place them in the /IOTDB_AINODE_HOME/data/ainode/models/weights/ directory. 
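For offline environments, placing the weight files can be sketched as below. This is a minimal illustration under stated assumptions: `install_weights` is a hypothetical helper, the source folder is whatever the Timecho team delivers, and the internal layout of that folder is not specified here, so it is copied unchanged.

```bash
#!/usr/bin/env bash
# Sketch only: install_weights is a hypothetical helper, not part of AINode.
# Usage: install_weights <weights-source-dir> <ainode-home>
install_weights() {
  local src="$1" dest="$2/data/ainode/models/weights"
  if [ ! -d "$src" ]; then
    echo "weight folder not found: $src" >&2
    return 1
  fi
  mkdir -p "$dest"
  # Copy the contents of the source folder, keeping its sub-directories.
  cp -R "$src"/. "$dest"/
  # Print the target directory so the caller can inspect it.
  echo "$dest"
}
```

For example, `install_weights /tmp/weights /IOTDB_AINODE_HOME` copies everything under `/tmp/weights` (an assumed download location) into the weights directory named above.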
- - -### 3.4 Start AINode - -After the seed ConfigNode has been deployed, you can add AINode nodes to support model registration and inference. After specifying the IoTDB cluster information in the configuration file, run the corresponding command to start AINode and join the IoTDB cluster. - -- Networking environment startup - -Start command - -```shell - # Start command - # Linux and macOS systems - bash sbin/start-ainode.sh - - # Windows systems - sbin\start-ainode.bat - - # Background startup command (recommended for long-term running) - # Linux and macOS systems - nohup bash sbin/start-ainode.sh > myout.file 2>&1 & - - # Windows systems - nohup bash sbin\start-ainode.bat > myout.file 2>&1 & - ``` - -### 3.5 Detecting the status of AINode nodes - -During startup, the new AINode is automatically added to the IoTDB cluster. After starting AINode, you can run an SQL query in the command line. If the AINode node appears in the cluster with the status Running (as shown below), it has joined successfully. - - -```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|Running| 127.0.0.1| 10810|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` - -### 3.6 Stop AINode - -If you need to stop a running AINode node, execute the corresponding shutdown script. 
- -Stop command - -```shell - # Linux / macOS - bash sbin/stop-ainode.sh - - # Windows - sbin\stop-ainode.bat - ``` - -After stopping AINode, the AINode node is still listed in the cluster, but its status is UNKNOWN (as shown below) and AINode functionality is unavailable. - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|UNKNOWN| 127.0.0.1| 10790|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -If you need to restart the node, execute the startup script again. - -## 4. Common Problems - -### 4.1 An error occurs when starting AINode stating that the venv module cannot be found - -When AINode is started in the default way, a Python virtual environment is created in the installation package directory and dependencies are installed into it, so the venv module must be available. Python 3.10 and later generally ship with venv, but the Python environment bundled with some systems does not include it. When this error occurs, choose one of the following two solutions: - -Install the venv module locally. On Ubuntu, for example, run the following command; alternatively, install a Python version that includes venv from the official Python website. - - ```shell -apt-get install python3.10-venv -``` -Create a virtual environment in the AINode path using Python 3.10.0: 
- - ```shell -../Python-3.10.0/python -m venv venv # the last argument is the folder name -``` -Alternatively, when running the startup script, use `-i` to specify the path of an existing Python interpreter as the runtime environment for AINode, which removes the need to create a new virtual environment. - -### 4.2 The SSL module in Python is not properly installed and configured to handle HTTPS resources -WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available. -You can install OpenSSL and then rebuild Python to solve this problem. -> Currently Python versions 3.6 to 3.9 are compatible with OpenSSL 1.0.2, 1.1.0, and 1.1.1. - -Python requires OpenSSL to be installed on the system. The specific installation method can be found in [link](https://stackoverflow.com/questions/56552390/how-to-fix-ssl-module-in-python-is-not-available-in-centos) - - ```shell -sudo apt-get install build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev uuid-dev lzma-dev liblzma-dev -sudo -E ./configure --with-ssl -make -sudo make install -``` - -### 4.3 The pip version is too low - -On Windows, a compilation error similar to "error: Microsoft Visual C++14.0 or greater is required..." appears. - -This error occurs during installation and compilation, usually because the C++ build tools or the setuptools version is outdated. 
You can try upgrading pip and setuptools: - - ```shell -./python -m pip install --upgrade pip -./python -m pip install --upgrade setuptools -``` - - -### 4.4 Install and compile Python - -Use the following commands to download the source package from the official website and extract it: - ```shell -wget https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tar.xz -tar Jxf Python-3.10.0.tar.xz -``` -Compile and install Python: - ```shell -cd Python-3.10.0 -./configure --prefix=/usr/local/python3 -make -sudo make install -/usr/local/python3/bin/python3 --version -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md b/src/UserGuide/Master/Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md deleted file mode 100644 index 04f6f2342..000000000 --- a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md +++ /dev/null @@ -1,624 +0,0 @@ - -# Cluster Deployment - -This guide describes how to manually deploy a cluster instance consisting of 3 ConfigNodes and 3 DataNodes (commonly referred to as a 3C3D cluster). -
- - -## 1. Prerequisites - -1. **System Preparation**: Ensure the system has been configured according to the [System Requirements](../Deployment-and-Maintenance/Environment-Requirements.md). - -2. **IP Configuration**: It is recommended to use hostnames rather than raw IP addresses in the configuration to prevent issues caused by IP address changes. Map the hostname by editing the `/etc/hosts` file. For example, if the local IP is `192.168.1.3` and the hostname is `iotdb-1`, run: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -``` - -Use the hostname for `cn_internal_address` and `dn_internal_address` in IoTDB configuration. - -3. **Unmodifiable Parameters**: Some parameters cannot be changed after the first startup. Refer to the [Parameter Configuration](#22-parameters-configuration) section. - - -4. **Installation Path**: Ensure the installation path contains no spaces or non-ASCII characters to prevent runtime issues. -5. **User Permissions**: Choose one of the following permissions during installation and deployment: - - **Root User (Recommended)**: This avoids permission-related issues. - - **Non-Root User**: - - Use the same user for all operations, including starting, activating, and stopping services. - - Avoid using `sudo`, which can cause permission conflicts. - -6. **Monitoring Panel**: Deploy a monitoring panel to track key performance metrics. Contact the Timecho team for access and refer to the [Monitoring Board Install and Deploy](../Deployment-and-Maintenance/Monitoring-panel-deployment.md). - -7. **Health Check Tool**: Before installation, the health check tool can help inspect the operating environment of IoTDB nodes and obtain detailed inspection results. For usage instructions, see [Health Check Tool](../Tools-System/Health-Check-Tool.md). - - -## 2. Preparation - -1. Obtain the TimechoDB installation package `timechodb-{version}-bin.zip` by following [IoTDB-Package](../Deployment-and-Maintenance/IoTDB-Package_timecho.md). - -2. 
Configure the operating system environment according to [Environment Requirement](../Deployment-and-Maintenance/Environment-Requirements.md)) - -### 2.1 Pre-installation Check - -To ensure the IoTDB Enterprise Edition installation package you obtained is complete and authentic, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Find the "SHA512 Checksum" corresponding to each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/iotdb): - ```Bash - cd /data/iotdb - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-02.png) - -4. Compare the output result with the official SHA512 checksum. Once confirmed that they match, you can proceed with the installation and deployment operations in accordance with the procedures below. - -#### Notes: - -- If the verification results do not match, please contact Timecho Team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - -## 3. 
Installation Steps - -Taking a cluster with three Linux servers with the following information as example: - -| Node IP | Hostname | Services | -| ------------- | -------- | -------------------- | -| 11.101.17.224 | iotdb-1 | ConfigNode, DataNode | -| 11.101.17.225 | iotdb-2 | ConfigNode, DataNode | -| 11.101.17.226 | iotdb-3 | ConfigNode, DataNode | - -### 3.1 Configure Hostnames - -On all three servers, configure the hostnames by editing the `/etc/hosts` file. Use the following commands: - -```Bash -echo "11.101.17.224 iotdb-1" >> /etc/hosts -echo "11.101.17.225 iotdb-2" >> /etc/hosts -echo "11.101.17.226 iotdb-3" >> /etc/hosts -``` - -### 3.2 Extract Installation Package - -Unzip the installation package and navigate to the directory: - -```Plain -unzip timechodb-{version}-bin.zip -cd timechodb-{version}-bin -``` - -### 3.3 Parameters Configuration - -#### 3.3.1 Memory Configuration - -Edit the following files for memory allocation: - -- **ConfigNode**: `./conf/confignode-env.sh` (or `.bat` for Windows) - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------ | :--------------------------------- | :---------- | :-------------- | :-------------------------------------- | -| MEMORY_SIZE | Total memory allocated to the node | Automatically calculated based on system memory, defaulting to 30% of the system memory. | As needed | Save changes without immediate execution; modifications take effect after service restart. | - - -- **DataNode**: `./conf/datanode-env.sh` (or `.bat` for Windows) - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------ | :--------------------------------- |:-----------------------------------------------------------------------------------------| :-------------- | :-------------------------------------- | -| MEMORY_SIZE | Total memory allocated to the node | Automatically calculated based on system memory, defaulting to 50% of the system memory. 
| As needed | Save changes without immediate execution; modifications take effect after service restart. | - - -#### 3.3.2 General Configuration - -Set the following parameters in `./conf/iotdb-system.properties`. Refer to `./conf/iotdb-system.properties.template` for a complete list. - -**Cluster-Level Parameters**: - -| **Parameter** | **Description** | **11.101.17.224** | **11.101.17.225** | **11.101.17.226** | -| :------------------------ | :----------------------------------------------------------- | :---------------- | :---------------- | :---------------- | -| cluster_name | Name of the cluster | defaultCluster | defaultCluster | defaultCluster | -| schema_replication_factor | Metadata replication factor; DataNode count shall not be fewer than this value | 3 | 3 | 3 | -| data_replication_factor | Data replication factor; DataNode count shall not be fewer than this value | 2 | 2 | 2 | - -**ConfigNode Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **11.101.17.224** | **11.101.17.225** | **11.101.17.226** | **Notes** | -| :------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------------------- | :---------------- | :---------------- | :---------------- | :--------------------------------------------------------- | -| cn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. | iotdb-1 | iotdb-2 | iotdb-3 | This parameter cannot be modified after the first startup. | -| cn_internal_port | Port used for internal communication within the cluster | 10710 | 10710 | 10710 | 10710 | 10710 | This parameter cannot be modified after the first startup. 
| -| cn_consensus_port | Port used for consensus protocol communication among ConfigNode replicas | 10720 | 10720 | 10720 | 10720 | 10720 | This parameter cannot be modified after the first startup. | -| cn_seed_config_node | Address of the ConfigNode for registering and joining the cluster. (e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Address and port of the seed ConfigNode (e.g., `cn_internal_address:cn_internal_port`) | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | This parameter cannot be modified after the first startup. | - -**DataNode Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **11.101.17.224** | **11.101.17.225** | **11.101.17.226** | **Notes** | -| :------------------------------ | :----------------------------------------------------------- |:----------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :---------------- | :---------------- | :---------------- | :--------------------------------------------------------- | -| dn_rpc_address | Address for the client RPC service | 127.0.0.1 | By default, the local machine can directly access it. For non-local access, please modify this configuration item to the IPv4 address or hostname of the server where it is located. It is recommended to use the IPv4 address of the server where it is located. | iotdb-1 | iotdb-2 | iotdb-3 | Effective after restarting the service. | -| dn_rpc_port | Port for the client RPC service | 6667 | 6667 | 6667 | 6667 | 6667 | Effective after restarting the service. | -| dn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. 
| iotdb-1 | iotdb-2 | iotdb-3 | This parameter cannot be modified after the first startup. | -| dn_internal_port | Port used for internal communication within the cluster | 10730 | 10730 | 10730 | 10730 | 10730 | This parameter cannot be modified after the first startup. | -| dn_mpp_data_exchange_port | Port used for receiving data streams | 10740 | 10740 | 10740 | 10740 | 10740 | This parameter cannot be modified after the first startup. | -| dn_data_region_consensus_port | Port used for data replica consensus protocol communication | 10750 | 10750 | 10750 | 10750 | 10750 | This parameter cannot be modified after the first startup. | -| dn_schema_region_consensus_port | Port used for metadata replica consensus protocol communication | 10760 | 10760 | 10760 | 10760 | 10760 | This parameter cannot be modified after the first startup. | -| dn_seed_config_node | Address of the ConfigNode for registering and joining the cluster.(e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Address of the first ConfigNode | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | This parameter cannot be modified after the first startup. | - -**Note:** Ensure files are saved after editing. Tools like VSCode Remote do not save changes automatically. - -### 3.4 Start ConfigNode Instances - -1. Start the first ConfigNode (`iotdb-1`) as the seed node - -```Bash - # Unix/OS X - cd sbin - ./start-confignode.sh -d # The "-d" flag starts the process in the background. - - # Windows - # Before version V2.0.4.x - .\start-confignode.bat - - # V2.0.4.x and later versions - .\windows\start-confignode.bat - ``` - -2. Start the remaining ConfigNodes (`iotdb-2` and `iotdb-3`) in sequence. - -If the startup fails, refer to the [Common Issues](#5-common-issues) section below for troubleshooting. 
- -### 3.5 Start DataNode Instances - -On each server, navigate to the `sbin` directory and start the DataNode: - -```Bash - # Unix/OS X - cd sbin - ./start-datanode.sh -d # The "-d" flag starts the process in the background. - - # Windows - # Before version V2.0.4.x - .\start-datanode.bat - - # V2.0.4.x and later versions - .\windows\start-datanode.bat - ``` - -### 3.6 Activate the Database - -#### Option 1: Command-Based Activation - -1. Enter the IoTDB CLI on any node of the cluster: - -**Linux** or **MacOS** - -```Bash -# Before version V2.0.6.x -Shell> bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x and later versions -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` -**Windows** - -```Bash -# Before version V2.0.4.x -Shell> sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.4.x and later versions, before version V2.0.6.x -Shell> sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x and later versions -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` - -2. Execute the following command to obtain the machine code required for activation: - -```SQL -IoTDB> show system info -``` -```shell -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -|01-TE5NLES4-UDDWCMYE,01-GG5NLES4-XXDWCMYE,01-FF5NLES4-WWWWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -3. Execute the following statement to obtain the version number of the database to be activated: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.9.2| 5ea21bc| -+-------+---------+ -Total line number = 1 -``` - -4. 
Provide the obtained machine code and version number to the Timecho team. - -5. Enter the activation codes provided by the Timecho team in the CLI in sequence using the following format. Wrap the activation code in single quotes ('): - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - -- Note : The activation operation only needs to be performed once on any machine in the cluster. - -#### Option 2: File-Based Activation - -1. Start all ConfigNodes and DataNodes. -2. Copy the `system_info` file from the `activation` directory on each server and send them to the Timecho team. -3. Place the license files provided by the Timecho team into the corresponding `activation` folder for each node. - - -### 3.7 Verify Activation - -In the CLI, you can check the activation status by running the `show activation` command; the example below shows a status of ACTIVATED, indicating successful activation. 
- -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - - -### 3.8 One-click Cluster Start and Stop - -#### 3.8.1 Overview - -In the IoTDB root directory, the `sbin` subdirectory contains the `start-all.sh` and `stop-all.sh` scripts, which work together with the `iotdb-cluster.properties` configuration file in the `conf` subdirectory. Together they let you start or stop all nodes in the cluster from a single node with one command, which simplifies deployment and routine operations. - -The following section describes the configuration items in the `iotdb-cluster.properties` file. - -#### 3.8.2 Configuration Items - -> Note: -> -> * When the cluster changes, this configuration file needs to be manually updated. -> * If the `iotdb-cluster.properties` configuration file is not set up and the `start-all.sh` or `stop-all.sh` scripts are executed, the scripts will, by default, start or stop the ConfigNode and DataNode nodes located in the IOTDB\_HOME directory where the scripts reside. -> * Configuring passwordless SSH login is recommended. If it is not configured, the script prompts for the server password after it is launched so that it can perform the subsequent start, stop, or destroy operations. If it is already configured, no password is needed during script execution.
- -* confignode\_address\_list - -| **Name** | **confignode\_address\_list** | -| :----------------: |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | A comma-separated list of the IP addresses or hostnames of the hosts where the ConfigNodes to be started/stopped are located. | -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* datanode\_address\_list - -| **Name** | **datanode\_address\_list** | -| :----------------: |:------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | A comma-separated list of the IP addresses or hostnames of the hosts where the DataNodes to be started/stopped are located. | -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* ssh\_account - -| **Name** | **ssh\_account** | -| :----------------: | :------------------------------------------------------------------------------------------------- | -| Description | The username used to log in to the target hosts via SSH. All hosts must have the same username. | -| Type | String | -| Default | root | -| Effective | After restarting the system | - -* ssh\_port - -| **Name** | **ssh\_port** | -| :----------------: | :---------------------------------------------------------------------------------- | -| Description | The SSH port exposed by the target hosts. All hosts must have the same SSH port. 
| -| Type | int | -| Default | 22 | -| Effective | After restarting the system | - -* confignode\_deploy\_path - -| **Name** | **confignode\_deploy\_path** | -| :----------------: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Description | The path on the target hosts where all ConfigNodes to be started/stopped are located. All ConfigNodes must be in the same directory on their respective hosts. Example: `/data/demo/apache-iotdb-1.3.1-all-bin` | -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* datanode\_deploy\_path - -| **Name** | **datanode\_deploy\_path** | -| :----------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| Description | The path on the target hosts where all DataNodes to be started/stopped are located. All DataNodes must be in the same directory on their respective hosts. Example: `/data/demo/apache-iotdb-1.3.1-all-bin` | -| Type | String | -| Default | None | -| Effective | After restarting the system | - - -#### 3.8.3 Quick Example - -1. Configuration File: `iotdb-cluster.properties` -```properties -# ConfigNode addresses, separated by commas -confignode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# DataNode addresses, separated by commas -datanode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# SSH login username for target deployment servers -ssh_account=root - -# SSH service port number -ssh_port=22 - -# IoTDB installation directory (the program will be deployed into this path on remote nodes) -confignode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -datanode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -``` - -2.
Run `./start-all.sh` to launch cluster and verify status - Connect to IoTDB CLI and execute `show cluster`. A successful output is shown below: -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -| 0|ConfigNode|Running| 172.xx.xx.16| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 1|ConfigNode|Running| 172.xx.xx.18| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 2|ConfigNode|Running| 172.xx.xx.17| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 3| DataNode|Running| 172.xx.xx.18| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 4| DataNode|Running| 172.xx.xx.17| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 5| DataNode|Running| 172.xx.xx.16| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -``` - - -## 4. Maintenance - -### 4.1 ConfigNode Maintenance - -ConfigNode maintenance includes adding and removing ConfigNodes. Common use cases include: - -- **Cluster Expansion:** If the cluster contains only 1 ConfigNode, adding 2 more ConfigNodes enhances high availability, resulting in a total of 3 ConfigNodes. -- **Cluster Fault Recovery:** If a ConfigNode's machine fails and it cannot function normally, remove the faulty ConfigNode and add a new one to the cluster. - -**Note:** After completing ConfigNode maintenance, ensure that the cluster contains either 1 or 3 active ConfigNodes. Two ConfigNodes do not provide high availability, and more than three ConfigNodes can degrade performance. 
- -#### 4.1.1 Adding a ConfigNode - -**Linux / MacOS :** - -```Bash -sbin/start-confignode.sh -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\start-confignode.bat - -# V2.0.4.x and later versions -sbin\windows\start-confignode.bat -``` - -#### 4.1.2 Removing a ConfigNode - -1. Connect to the cluster using the CLI and confirm the internal address and port of the ConfigNode to be removed: - -```Plain -show confignodes; -``` - -Example output: - -```Plain -IoTDB> show confignodes -+------+-------+---------------+------------+--------+ -|NodeID| Status|InternalAddress|InternalPort| Role| -+------+-------+---------------+------------+--------+ -| 0|Running| 127.0.0.1| 10710| Leader| -| 1|Running| 127.0.0.1| 10711|Follower| -| 2|Running| 127.0.0.1| 10712|Follower| -+------+-------+---------------+------------+--------+ -Total line number = 3 -It costs 0.030s -``` - -2. Remove the ConfigNode using the script: - -**Linux / MacOS:** - -```Bash -sbin/remove-confignode.sh [confignode_id] -# Or: -sbin/remove-confignode.sh [cn_internal_address:cn_internal_port] -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\remove-confignode.bat [confignode_id] -# Or: -sbin\remove-confignode.bat [cn_internal_address:cn_internal_port] - -# V2.0.4.x and later versions -sbin\windows\remove-confignode.bat [confignode_id] -# Or: -sbin\windows\remove-confignode.bat [cn_internal_address:cn_internal_port] -``` - -### 4.2 DataNode Maintenance - -DataNode maintenance includes adding and removing DataNodes. Common use cases include: - -- **Cluster Expansion:** Add new DataNodes to increase cluster capacity. -- **Cluster Fault Recovery:** If a DataNode's machine fails and it cannot function normally, remove the faulty DataNode and add a new one to the cluster. - -**Note:** During and after DataNode maintenance, ensure that the number of active DataNodes is not fewer than the data replication factor (usually 2) or the schema replication factor (usually 3). 
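
The note above can be turned into a quick pre-check before removing a DataNode. The following is a minimal sketch, not part of IoTDB: the function name `can_remove_datanode` and its parameters are hypothetical, and the counts would have to be read manually from `show datanodes` and from your configured replication factors.

```python
def can_remove_datanode(active_datanodes: int,
                        data_replication_factor: int = 2,
                        schema_replication_factor: int = 3) -> bool:
    """Return True if one DataNode can be removed without dropping
    below either replication factor (hypothetical helper)."""
    remaining = active_datanodes - 1
    # The cluster needs at least as many DataNodes as the larger factor.
    required = max(data_replication_factor, schema_replication_factor)
    return remaining >= required


# With the usual factors (data = 2, schema = 3), a 3-DataNode cluster
# cannot lose a node, while a 4-DataNode cluster can.
print(can_remove_datanode(3))  # False
print(can_remove_datanode(4))  # True
```

If the check fails, add the replacement DataNode first and remove the faulty one afterwards.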
- -#### 4.2.1 Adding a DataNode - -**Linux / MacOS:** - -```Bash -sbin/start-datanode.sh -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\start-datanode.bat - -# V2.0.4.x and later versions -sbin\windows\start-datanode.bat -``` - -**Note:** After adding a DataNode, the cluster load will gradually balance across all nodes as new writes arrive and old data expires (if TTL is set). - -#### 4.2.2 Removing a DataNode - -1. Connect to the cluster using the CLI and confirm the RPC address and port of the DataNode to be removed: - -```sql -show datanodes; -``` - -Example output: - -```sql -IoTDB> show datanodes -+------+-------+----------+-------+-------------+---------------+ -|NodeID| Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum| -+------+-------+----------+-------+-------------+---------------+ -| 1|Running| 0.0.0.0| 6667| 0| 0| -| 2|Running| 0.0.0.0| 6668| 1| 1| -| 3|Running| 0.0.0.0| 6669| 1| 0| -+------+-------+----------+-------+-------------+---------------+ -Total line number = 3 -It costs 0.110s -``` - -2. Remove the DataNode using the script: - -**Linux / MacOS:** - -```Bash -sbin/remove-datanode.sh [dn_rpc_address:dn_rpc_port] -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\remove-datanode.bat [dn_rpc_address:dn_rpc_port] - -# V2.0.4.x and later versions -sbin\windows\remove-datanode.bat [dn_rpc_address:dn_rpc_port] -``` - -### 4.3 Cluster Maintenance - -For more details on cluster maintenance, please refer to: [Cluster Maintenance](../User-Manual/Load-Balance.md) - -## 5. Common Issues - -1. Activation Fails Repeatedly - - Use the `ls -al` command to verify that the ownership of the installation directory matches the current user. - - Check the ownership of all files in the `./activation` directory to ensure they belong to the current user. - -2. ConfigNode Fails to Start - 1. Review the startup logs to check if any parameters, which cannot be modified after the first startup, were changed. - 2. 
Check the logs for any other errors. If unresolved, contact technical support for assistance. - 3. If the deployment is fresh or data can be discarded, clean the environment and redeploy using the following steps: - - **Clean the Environment** - - 1. Stop all ConfigNode and DataNode processes: - ```Bash - sbin/stop-standalone.sh - ``` - - 2. Check for any remaining processes: - ```Bash - jps - # or - ps -ef | grep iotdb - ``` - - 3. If processes remain, terminate them manually: - ```Bash - kill -9 <pid> # Replace <pid> with a process ID found in the previous step - - # For systems with a single IoTDB instance, you can clean up residual processes with: - ps -ef | grep iotdb | grep -v grep | tr -s ' ' ' ' | cut -d ' ' -f2 | xargs kill -9 - ``` - - 4. Delete the `data` and `logs` directories: - ```Bash - cd /data/iotdb - rm -rf data logs - ``` - -## 6. Appendix - -### 6.1 ConfigNode Parameters - -| Parameter | Description | Required | -| :-------- | :---------------------------------------------------------- | :------- | -| -d | Starts the process in daemon mode (runs in the background). | No | - -### 6.2 DataNode Parameters - -| Parameter | Description | Required | -| :-------- | :----------------------------------------------------------- | :------- | -| -v | Displays version information. | No | -| -f | Runs the script in the foreground without backgrounding it. | No | -| -d | Starts the process in daemon mode (runs in the background). | No | -| -p | Specifies a file to store the process ID for process management. | No | -| -c | Specifies the path to the configuration folder; the script loads configuration files from this location. | No | -| -g | Prints detailed garbage collection (GC) information. | No | -| -H | Specifies the path for the Java heap dump file, written when the JVM runs out of memory. | No | -| -E | Specifies the file for JVM error logs. | No | -| -D | Defines system properties in the format `key=value`. | No | -| -X | Passes `-XX` options directly to the JVM. | No | -| -h | Displays the help instructions. 
| No | diff --git a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Database-Resources_timecho.md b/src/UserGuide/Master/Table/Deployment-and-Maintenance/Database-Resources_timecho.md deleted file mode 100644 index bb15f8a36..000000000 --- a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Database-Resources_timecho.md +++ /dev/null @@ -1,222 +0,0 @@ - -# Database Resources -## 1. CPU - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
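
| Number of timeseries (frequency <= 1 Hz) | CPU | Nodes (Standalone) | Nodes (Dual-Active) | Nodes (Distributed) |
| :--------------------------------------- | :-- | :----------------- | :------------------ | :------------------ |
| Within 100,000 | 2-4 cores | 1 | 2 | 3 |
| Within 300,000 | 4-8 cores | 1 | 2 | 3 |
| Within 500,000 | 8-16 cores | 1 | 2 | 3 |
| Within 1,000,000 | 16-32 cores | 1 | 2 | 3 |
| Within 2,000,000 | 32-48 cores | 1 | 2 | 3 |
| Within 10,000,000 | 48 cores | 1 | 2 | Please contact Timecho Business for consultation |
| Over 10,000,000 | Please contact Timecho Business for consultation | | | |
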
- -> Supported CPU models: Kunpeng, Phytium, Sunway, Hygon, Zhaoxin, Loongson - -## 2. Memory - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
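
| Number of timeseries (frequency <= 1 Hz) | Memory | Nodes (Standalone) | Nodes (Dual-Active) | Nodes (Distributed) |
| :--------------------------------------- | :----- | :----------------- | :------------------ | :------------------ |
| Within 100,000 | 2-4 GB | 1 | 2 | 3 |
| Within 300,000 | 6-12 GB | 1 | 2 | 3 |
| Within 500,000 | 12-24 GB | 1 | 2 | 3 |
| Within 1,000,000 | 24-48 GB | 1 | 2 | 3 |
| Within 2,000,000 | 48-96 GB | 1 | 2 | 3 |
| Within 10,000,000 | 128 GB | 1 | 2 | Please contact Timecho Business for consultation |
| Over 10,000,000 | Please contact Timecho Business for consultation | | | |
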
- -> Flexible memory configuration options are provided. Users can adjust them in the datanode-env file. For details and configuration guidelines, please refer to [datanode-env](../Reference/System-Config-Manual.md#_3-2-datanode-env-sh-bat) - -**Note**: For dedicated hardware allocation and throughput references for AI model inference scenarios, refer to Section **[2.3.1 Resource Configuration Recommendations](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md#_2-3-1-resource-configuration-recommendations)** in the AINode deployment documentation. - -## 3. Storage (Disk) -### 3.1 Storage space -Calculation Formula: - -```Plain -Storage Space = Number of Measurement Points * Sampling Frequency (Hz) * Size of Each Data Point (Bytes, see the table below) * Storage Duration * Replication Factor / Compression Ratio -``` - -Data Point Size Calculation Table: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
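
| Data Type | Timestamp (Bytes) | Value (Bytes) | Total Data Point Size (Bytes) |
| :-------- | :---------------- | :------------ | :---------------------------- |
| Boolean | 8 | 1 | 9 |
| INT32 / FLOAT (Single Precision) | 8 | 4 | 12 |
| INT64 / DOUBLE (Double Precision) | 8 | 8 | 16 |
| TEXT (String) | 8 | Average = a | 8+a |
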
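
The formula and the data point sizes above can be combined into a small estimator. This is a minimal sketch for illustration only: the function name `estimate_storage_bytes` and the `POINT_SIZE_BYTES` mapping are hypothetical, and the compression ratio is an input you have to assume for your own data (the worked example in this section uses 10).

```python
# Bytes per data point (8-byte timestamp + value), from the table above.
# TEXT is omitted because its size depends on the average string length.
POINT_SIZE_BYTES = {
    "BOOLEAN": 9,
    "INT32": 12,
    "FLOAT": 12,
    "INT64": 16,
    "DOUBLE": 16,
}


def estimate_storage_bytes(series_count: int,
                           frequency_hz: float,
                           data_type: str,
                           retention_days: int,
                           replication_factor: int,
                           compression_ratio: float) -> float:
    """Apply the storage formula: points * frequency * point size
    * duration * replicas / compression ratio."""
    seconds = retention_days * 86_400
    raw_bytes = (series_count * frequency_hz * POINT_SIZE_BYTES[data_type]
                 * seconds * replication_factor)
    return raw_bytes / compression_ratio


# 100,000 INT32 series at 1 Hz, kept for one year with 3 replicas,
# assuming a compression ratio of 10.
estimate = estimate_storage_bytes(100_000, 1, "INT32", 365, 3, 10)
print(f"{estimate / 1e12:.2f} TB")  # 11.35 TB
```

The result is expressed in decimal terabytes (10^12 bytes); leave headroom on top of this figure for logs and temporary files.
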
-Example: - -- Scenario: 1,000 devices, 100 measurement points per device, i.e., 100,000 time series in total. Data type is INT32. Sampling frequency is 1 Hz (once per second). Storage duration is 1 year. Replication factor is 3. Compression ratio is assumed to be 10. -- Full Calculation: - ```Plain - 1,000 devices * 100 measurement points * 12 bytes per data point * 86,400 seconds per day * 365 days per year * 3 replicas / 10 compression ratio ≈ 11 TB - ``` -- Simplified Calculation: - ```Plain - 1,000 * 100 * 12 * 86,400 * 365 * 3 / 10 ≈ 11 TB - ``` -### 3.2 Storage Configuration - -- For systems with more than 10 million measurement points or high query loads, SSD is recommended. - -## 4. Network (NIC) -When the write throughput is below 10 million points per second, a gigabit network card is sufficient. When the write throughput reaches or exceeds 10 million points per second, a 10-gigabit network card is required. - -| **Write Throughput (Data Points/Second)** | **NIC Speed** | -| ------------------------------------------------- | -------------------- | -| < 10 million | 1 Gbps (Gigabit) | -| ≥ 10 million | 10 Gbps (10 Gigabit) | - -## 5. Additional Notes - -- IoTDB can scale a cluster within seconds. Data migration is not required when adding new nodes, so cluster capacity does not have to be fixed by current data estimates. You can add new nodes to the cluster when scaling is needed in the future. \ No newline at end of file diff --git a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Deployment-form_timecho.md b/src/UserGuide/Master/Table/Deployment-and-Maintenance/Deployment-form_timecho.md deleted file mode 100644 index b2daee47f..000000000 --- a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Deployment-form_timecho.md +++ /dev/null @@ -1,63 +0,0 @@ - -# Deployment form - -IoTDB has three deployment forms: standalone mode, dual-active mode, and cluster mode. - -## 1. Standalone Mode - -An IoTDB standalone instance includes 1 ConfigNode and 1 DataNode, i.e., 1C1D. 
- -- **Features**: Easy for developers to install and deploy, with low deployment and maintenance costs and convenient operations. -- **Use Cases**: Scenarios with limited resources or low high-availability requirements, such as edge servers. -- **Deployment Method**: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -## 2. Dual-Active Mode - -Dual-Active Deployment is a feature of TimechoDB, where two independent instances synchronize bidirectionally and can provide services simultaneously. If one instance stops and restarts, the other instance will resume data transfer from the breakpoint. - -> An IoTDB Dual-Active instance typically consists of 2 standalone nodes, i.e., 2 sets of 1C1D. Each instance can also be a cluster. - -- **Features**: The high-availability solution with the lowest resource consumption. -- **Use Cases**: Scenarios with limited resources (only two servers) but requiring high availability. -- **Deployment Method**: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -## 3. Cluster Mode - -An IoTDB cluster instance consists of 3 ConfigNodes and no fewer than 3 DataNodes, typically 3 DataNodes, i.e., 3C3D. If some nodes fail, the remaining nodes can still provide services, ensuring high availability of the database. Performance can be improved by adding DataNodes. - -- **Features**: High availability, high scalability, and improved system performance by adding DataNodes. -- **Use Cases**: Enterprise-level application scenarios requiring high availability and reliability. -- **Deployment Method**: [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - - - -## 4. 
Feature Summary - -| **Dimension** | **Stand-Alone Mode** | **Dual-Active Mode** | **Cluster Mode** | -| :-------------------------- | :------------------------------------------------------- | :------------------------------------------------------ | :------------------------------------------------------ | -| Use Cases | Edge-side deployment, low high-availability requirements | High-availability services, disaster recovery scenarios | High-availability services, disaster recovery scenarios | -| Number of Machines Required | 1 | 2 | ≥3 | -| Security and Reliability | Cannot tolerate single-point failure | High, can tolerate single-point failure | High, can tolerate single-point failure | -| Scalability | Can expand DataNodes to improve performance | Each instance can be scaled as needed | Can expand DataNodes to improve performance | -| Performance | Can scale with the number of DataNodes | Same as one of the instances | Can scale with the number of DataNodes | - -- The deployment steps for Stand-Alone Mode and Cluster Mode are similar (adding ConfigNodes and DataNodes one by one), with differences only in the number of replicas and the minimum number of nodes required to provide services. \ No newline at end of file diff --git a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Docker-Deployment_timecho.md b/src/UserGuide/Master/Table/Deployment-and-Maintenance/Docker-Deployment_timecho.md deleted file mode 100644 index a0d4293d9..000000000 --- a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Docker-Deployment_timecho.md +++ /dev/null @@ -1,487 +0,0 @@ - -# Docker Deployment - -## 1. Environment Preparation - -### 1.1 Install Docker - -```Bash -#Taking Ubuntu as an example. For other operating systems, you can search for installation methods on your own. 
-#step1: Install necessary system tools -sudo apt-get update -sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common -#step2: Install GPG certificate -curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add - -#step3: Add the software source -sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" -#step4: Update and install Docker CE -sudo apt-get -y update -sudo apt-get -y install docker-ce -#step5: Set Docker to start automatically on boot -sudo systemctl enable docker -#step6: Verify if Docker is installed successfully -docker --version #Display version information, indicating successful installation. -``` - -### 1.2 Install Docker Compose - -```Bash -#Installation command -curl -L "https://github.com/docker/compose/releases/download/v2.20.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose -chmod +x /usr/local/bin/docker-compose -ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose -#Verify the installation -docker-compose --version #Display version information, indicating successful installation. -``` - -### 1.3 Install dmidecode - -By default, Linux servers should already have dmidecode. If not, you can use the following command to install it. - -```Bash -sudo apt-get install dmidecode -``` - -After installing `dmidecode`, you can locate its installation path by running:`whereis dmidecode`. Assuming the result is `/usr/sbin/dmidecode`, please remember this path as it will be used in the YML file of Docker Compose later. - -### 1.4 Obtain the Container Image - -For the TimechoDB container image, you can contact the Timecho team to acquire it. - -## 2. Stand-Alone Deployment - -This section demonstrates how to deploy a standalone Docker version of 1C1D. 
- -### 2.1 Load the Image File - -For example, if the IoTDB container image file you obtained is named `iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz`, use the following command to load the image: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -To view the loaded image, use the following command: - -```Bash -docker images -``` - -![](/img/%E5%8D%95%E6%9C%BA-%E6%9F%A5%E7%9C%8B%E9%95%9C%E5%83%8F.png) - -### 2.2 Create a Docker Bridge Network - -```Bash -docker network create --driver=bridge --subnet=172.18.0.0/16 --gateway=172.18.0.1 iotdb -``` - -### 2.3 Write the Docker-Compose YML File - -Assume the IoTDB installation directory and the YML file are placed under the `/docker-iotdb` folder. The resulting paths are `/docker-iotdb/iotdb` and `/docker-iotdb/docker-compose-standalone.yml`: - -```Bash -docker-iotdb: -├── iotdb #IoTDB installation directory -└── docker-compose-standalone.yml #YML file for standalone Docker Compose -``` - -The complete content of `docker-compose-standalone.yml` is as follows: - -```Bash -version: "3" -services: - iotdb-service: - image: timecho/timechodb:2.0.2.1-standalone #The image used - hostname: iotdb - container_name: iotdb - restart: always - ports: - - "6667:6667" - environment: - - cn_internal_address=iotdb - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb:10710 - - dn_rpc_address=iotdb - - dn_internal_address=iotdb - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - dn_seed_config_node=iotdb:10710 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - networks: - iotdb: - ipv4_address: 172.18.0.6 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 
1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -networks: - iotdb: - external: true -``` - -### 2.4 First Startup - -Use the following command to start: - -```Bash -cd /docker-iotdb -docker-compose -f docker-compose-standalone.yml up -``` - -Because the system is not activated yet, it exits immediately after the first startup, which is normal. The purpose of the first startup is to generate the machine code file for the activation process. - -![](/img/%E5%8D%95%E6%9C%BA-%E6%BF%80%E6%B4%BB.png) - -### 2.5 Apply for Activation - -- After the first startup, a `system_info` file is generated in the host directory `/docker-iotdb/iotdb/activation`. Copy this file and send it to the Timecho team. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- Once you receive the `license` file, copy it to the `/docker-iotdb/iotdb/activation` folder. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -### 2.6 Start IoTDB Again - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -![](/img/%E5%90%AF%E5%8A%A8iotdb.png) - -### 2.7 Verify the Deployment - -- Check the logs: If you see the following message, the startup is successful. - - ```Bash - docker logs -f iotdb #View the logs of the container named in docker-compose-standalone.yml - 2024-07-19 12:02:32,608 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
- ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B21.png) - -- Enter the container and check the service status: - - View the launched container - - ```Bash - docker ps - ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B22.png) - - Enter the container, log in to the database through CLI, and use the show cluster command to view the service status and activation status - - ```Bash - docker exec -it iotdb /bin/bash #Enter the container - ./start-cli.sh -h iotdb #Log in to the database - IoTDB> show cluster #Check the service status - ``` - - If all services are in the `running` state, the IoTDB deployment is successful. - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B23.png) - -### 2.8 Map the `/conf` Directory (Optional) - -If you want to modify configuration files directly on the physical machine, you can map the `/conf` folder from the container. Follow these steps: - -**Step 1**: Copy the `/conf` directory from the container to `/docker-iotdb/iotdb/conf`: - -```Bash -docker cp iotdb:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -**Step 2**: Add the mapping in `docker-compose-standalone.yml`: - -```Bash - volumes: - - ./iotdb/conf:/iotdb/conf # Add this mapping for the /conf folder - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /dev/mem:/dev/mem:ro -``` - -**Step 3**: Restart IoTDB: - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -## 3. Cluster Deployment - -This section describes how to manually deploy a cluster consisting of 3 ConfigNodes and 3 DataNodes, commonly referred to as a 3C3D cluster. - -
- -
- -**Note: The cluster version currently only supports host and overlay networks, and does not support bridge networks.** - -Below, we demonstrate how to deploy a 3C3D cluster using the host network as an example. - -### 3.1 Set Hostnames - -Assume there are 3 Linux servers with the following IP addresses and service roles: - -| Node IP | Hostname | Services | -| :---------- | :------- | :------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode, DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode, DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode, DataNode | - -On each of the 3 machines, configure the hostnames by editing the `/etc/hosts` file. Use the following commands: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### 3.2 Load the Image File - -For example, if the IoTDB container image file is named `iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz`, execute the following command on all 3 servers to load the image: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -To view the loaded images, run: - -```Bash -docker images -``` - -![](/img/%E9%95%9C%E5%83%8F%E5%8A%A0%E8%BD%BD.png) - -### 3.3 Write the Docker-Compose YML Files - -Here, we assume the IoTDB installation directory and YML files are placed under the `/docker-iotdb` folder. The directory structure is as follows: - -```Bash -docker-iotdb: -├── confignode.yml #ConfigNode YML file -├── datanode.yml #DataNode YML file -└── iotdb #IoTDB installation directory -``` - -On each server, create two YML files: `confignode.yml` and `datanode.yml`. 
Examples are provided below: - -**confignode.yml:** - -```Bash -#confignode.yml -version: "3" -services: - iotdb-confignode: - image: iotdb-enterprise:2.0.x.x-standalone #The image used - hostname: iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - container_name: iotdb-confignode - command: ["bash", "-c", "entrypoint.sh confignode"] - restart: always - environment: - - cn_internal_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb-1:10710 #The default first node is the seed node - - schema_replication_factor=3 #Number of metadata copies - - data_replication_factor=2 #Number of data replicas - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #Using the host network - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -**datanode.yml:** - -```Bash -#datanode.yml -version: "3" -services: - iotdb-datanode: - image: iotdb-enterprise:2.0.x.x-standalone #The image used - hostname: iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - container_name: iotdb-datanode - command: ["bash", "-c", "entrypoint.sh datanode"] - restart: always - ports: - - "6667:6667" - privileged: true - environment: - - dn_rpc_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - dn_internal_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - dn_seed_config_node=iotdb-1:10710 #The default first node is the seed node - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - schema_replication_factor=3 #Number of metadata copies - - data_replication_factor=2 #Number of data replicas - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #Using the host network - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -### 3.4 Start ConfigNode for the First Time - -Start the ConfigNode on all 3 servers. **Note the startup order**: Start `iotdb-1` first, followed by `iotdb-2` and `iotdb-3`. 
- -Run the following command on each server: - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d #Background startup -``` - -### 3.5 Apply for Activation - -- After starting the 3 ConfigNodes for the first time, a `system_info` file will be generated in the `/docker-iotdb/iotdb/activation` directory on each physical machine. Copy the `system_info` files from all 3 servers and send them to the Timecho team. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- Place the 3 `license` files into the corresponding `/docker-iotdb/iotdb/activation` folders on each ConfigNode server. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -- Once the `license` files are placed in the `activation` folders, the ConfigNodes will automatically activate. **No restart is required for the ConfigNodes.** - -### 3.6 Start DataNode - -Start the DataNode on all 3 servers: - -```Bash -cd /docker-iotdb -docker-compose -f datanode.yml up -d #Background startup -``` - -![](/img/%E9%9B%86%E7%BE%A4%E7%89%88-dn%E5%90%AF%E5%8A%A8.png) - -### 3.7 Verify Deployment - -- Check the logs: If you see the following message, the DataNode has started successfully. - - ```Bash - docker logs -f iotdb-datanode #View log command - 2024-07-20 16:50:48,937 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
- ``` - - ![](/img/dn%E5%90%AF%E5%8A%A8.png) - -- Enter the container and check the service status: - - View the launched container - - ```Bash - docker ps - ``` - - ![](/img/%E6%9F%A5%E7%9C%8B%E5%AE%B9%E5%99%A8.png) - - Enter any container, log in to the database via CLI, and use the `show cluster` command to check the service status: - -```Bash -docker exec -it iotdb-datanode /bin/bash #Entering the container -./start-cli.sh -h iotdb-1 #Log in to the database -IoTDB> show cluster #View status -``` - -If all services are in the `running` state, the IoTDB deployment is successful. - - ![](/img/%E9%9B%86%E7%BE%A4-%E6%BF%80%E6%B4%BB.png) - -### 3.8 Map the `/conf` Directory (Optional) - -If you want to modify configuration files directly on the physical machine, you can map the `/conf` folder from the container. Follow these steps: - -**Step 1**: Copy the `/conf` directory from the container to `/docker-iotdb/iotdb/conf` on all 3 servers: - -```Bash -docker cp iotdb-confignode:/iotdb/conf /docker-iotdb/iotdb/conf -or -docker cp iotdb-datanode:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -**Step 2**: Add the `/conf` directory mapping in both `confignode.yml` and `datanode.yml` on all 3 servers: - -```Bash -#confignode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - -#datanode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -**Step 3**: Restart IoTDB on all 3 servers: - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d -docker-compose -f datanode.yml up -d -``` \ No newline at end of file diff --git 
a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md b/src/UserGuide/Master/Table/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md deleted file mode 100644 index ac07865bd..000000000 --- a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md +++ /dev/null @@ -1,208 +0,0 @@ - -# Dual Active Deployment - -## 1. What Is Dual-Active Mode? - -Dual-active mode refers to two independent instances (either standalone or clusters) with completely independent configurations. These instances can simultaneously handle external read and write operations, with real-time bi-directional synchronization and breakpoint recovery capabilities. - -Key features include: - -- **Mutual Backup of Instances**: If one instance stops service, the other remains unaffected. When the stopped instance resumes, the other instance synchronizes the data written in the meantime to it. Applications can bind both instances for read and write operations, achieving high availability. -- **Cost-Effective Deployment**: The dual-active deployment solution achieves high availability with only two physical nodes, offering cost advantages. Additionally, physical resource isolation for the two instances can be ensured by leveraging dual-ring power and network designs, enhancing operational stability. - -**Note:** The dual-active functionality is exclusively available in enterprise-grade TimechoDB. - -![](/img/20240731104336.png) - -## 2. Prerequisites - -1. **Hostname Configuration**: Prefer hostnames over IP addresses during deployment, so that the database can still start if the host IP changes later. For instance, if the local IP is `192.168.1.3` and the hostname is `iotdb-1`, configure it in `/etc/hosts` using: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -``` - -Use the hostname to configure IoTDB’s `cn_internal_address` and `dn_internal_address`. - -2. 
**Immutable Parameters**: Some parameters cannot be changed after the initial startup. Follow the steps in the "Installation Steps" section to configure them correctly. - -3. **Monitoring Panel**: Deploying a monitoring panel is recommended to monitor key performance indicators and stay informed about the database’s operational status. Contact the Timecho team to obtain the monitoring panel and refer to the corresponding [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) for deployment steps. - -## 3. Installation Steps - -This guide uses two standalone nodes, A and B, to deploy the dual-active version of TimechoDB. The IP addresses and hostnames for the nodes are as follows: - -| Machine | IP Address | Hostname | -| ------- | ----------- | -------- | -| A | 192.168.1.3 | iotdb-1 | -| B | 192.168.1.4 | iotdb-2 | - -### 3.1 Install Two Independent TimechoDB Instances - -Install TimechoDB on both machines (A and B) independently. For detailed instructions, refer to the standalone [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md)or cluster [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md)deployment guides. - -Ensure that configurations for A and B are consistent for optimal dual-active performance. - -### 3.2 Configure Data Synchronization from Machine A to Machine B - -- Connect to the database on Machine A using the CLI tool from the `sbin` directory: - -```Bash -# Unix/OS X -./sbin/start-cli.sh -h iotdb-1 - -# Windows -# Before version V2.0.4.x -.\sbin\start-cli.bat -h iotdb-1 - -# V2.0.4.x and later versions -.\sbin\windows\start-cli.bat -h iotdb-1 -``` - -- Then create and start a data synchronization process. 
Use the following SQL command: - -```Bash -create pipe AB -with source ( - 'source.mode.double-living' ='true' -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-2', - 'sink.port'='6667' -) -``` - -- **Note:** To avoid infinite data loops, ensure the parameter `source.mode.double-living` is set to `true` on both A and B. This prevents retransmission of data received through the other instance's pipe. - -### 3.3 Configure Data Synchronization from Machine B to Machine A - -- Connect to the database on Machine B: - -```Bash -# Unix/OS X -./sbin/start-cli.sh -h iotdb-2 - -# Windows -# Before version V2.0.4.x -.\sbin\start-cli.bat -h iotdb-2 - -# V2.0.4.x and later versions -.\sbin\windows\start-cli.bat -h iotdb-2 -``` - -- Then create and start the synchronization process with the following SQL command: - -```Bash -create pipe BA -with source ( - 'source.mode.double-living' ='true' -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-1', - 'sink.port'='6667' -) -``` - -- **Note:** To avoid infinite data loops, ensure the parameter `source.mode.double-living` is set to `true` on both A and B. This prevents retransmission of data received through the other instance's pipe. - -### 3.4 Verify Deployment - -#### Check Cluster Status - -Run the `show cluster` command on both nodes to verify the status of the TimechoDB services: - -```Bash -show cluster -``` - -**Machine A**: - -![](/img/%E5%8F%8C%E6%B4%BB-A.png) - -**Machine B**: - -![](/img/%E5%8F%8C%E6%B4%BB-B.png) - -Ensure all `ConfigNode` and `DataNode` processes are in the `Running` state. 
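
The `Running` check above can also be scripted. The following is a minimal sketch, not part of TimechoDB: `all_nodes_running` is a hypothetical helper that reads `show cluster` output on standard input and assumes only that each node row contains its node type (`ConfigNode` or `DataNode`) and its status.

```bash
# Sketch: fail if any ConfigNode or DataNode row in `show cluster` output
# is not in the Running state. all_nodes_running is a hypothetical helper name.
all_nodes_running() {
  local line seen=0
  # The "|| [ -n "$line" ]" part also processes a last line without a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      *ConfigNode*|*DataNode*)
        seen=1
        case "$line" in
          *Running*) ;;  # this node is healthy, keep scanning
          *) echo "not running: $line" >&2; return 1 ;;
        esac
        ;;
    esac
  done
  # Fail when no node rows were seen at all (for example, the CLI could not connect).
  [ "$seen" -eq 1 ]
}
```

To use it, pipe the output of `show cluster` into the function on each machine. How that output is captured non-interactively depends on your CLI version, so check the CLI help before scripting it.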
- -#### Check Synchronization Status - -Use the `show pipes` command on both nodes: - -```Bash -show pipes -``` - -Confirm that all pipes are in the `RUNNING` state: - -On machine A: - -![](/img/show%20pipes-A.png) - -On machine B: - -![](/img/show%20pipes-B.png) - -### 3.5 Stop the Dual-Active Instances - -To stop the dual-active instances: - -On machine A: - -```SQL -# Unix/OS X -./sbin/start-cli.sh -h iotdb-1 # Log in to CLI -IoTDB> stop pipe AB # Stop data synchronization -./sbin/stop-standalone.sh # Stop database service - -# Windows -# Before version V2.0.4.x -.\sbin\start-cli.bat -h iotdb-1 -IoTDB> stop pipe AB -.\sbin\stop-standalone.bat - -# V2.0.4.x and later versions -.\sbin\windows\start-cli.bat -h iotdb-1 -IoTDB> stop pipe AB -.\sbin\windows\stop-standalone.bat -``` - -On machine B: - -```SQL -# Unix/OS X -./sbin/start-cli.sh -h iotdb-2 # Log in to CLI -IoTDB> stop pipe BA # Stop data synchronization -./sbin/stop-standalone.sh # Stop database service - -# Windows -# Before version V2.0.4.x -.\sbin\start-cli.bat -h iotdb-2 -IoTDB> stop pipe BA -.\sbin\stop-standalone.bat - -# V2.0.4.x and later versions -.\sbin\windows\start-cli.bat -h iotdb-2 -IoTDB> stop pipe BA -.\sbin\windows\stop-standalone.bat -``` diff --git a/src/UserGuide/Master/Table/Deployment-and-Maintenance/IoTDB-Package_timecho.md b/src/UserGuide/Master/Table/Deployment-and-Maintenance/IoTDB-Package_timecho.md deleted file mode 100644 index c2bffcf22..000000000 --- a/src/UserGuide/Master/Table/Deployment-and-Maintenance/IoTDB-Package_timecho.md +++ /dev/null @@ -1,48 +0,0 @@ - -# Obtain TimechoDB - -## 1. How to obtain TimechoDB - -The TimechoDB installation package can be obtained through product trial application or by directly contacting the Timecho team. - -## 2. 
Installation Package Structure - -After unpacking the installation package (`iotdb-enterprise-{version}-bin.zip`), you will see the following directory structure: - -| **Catalogue** | **Type** | **Description** | -| :--------------- | :------- | :----------------------------------------------------------- | -| activation | Folder | Directory for activation files, including the generated machine code and the TimechoDB activation code obtained from Timecho staff. *(This directory is generated after starting the ConfigNode, enabling you to obtain the activation code.)* | -| conf | Folder | Configuration files directory, containing ConfigNode, DataNode, JMX, and logback configuration files. | -| data | Folder | Default data file directory, containing data files for ConfigNode and DataNode. *(This directory is generated after starting the program.)* | -| lib | Folder | Library files directory. | -| licenses | Folder | Directory for open-source license certificates. | -| logs | Folder | Default log file directory, containing log files for ConfigNode and DataNode. *(This directory is generated after starting the program.)* | -| sbin | Folder | Main scripts directory, containing scripts for starting, stopping, and managing the database. | -| tools | Folder | Tools directory. | -| ext | Folder | Directory for pipe, trigger, and UDF plugin-related files. | -| LICENSE | File | Open-source license file. | -| NOTICE | File | Open-source notice file. | -| README_ZH.md | File | User manual (Chinese version). | -| README.md | File | User manual (English version). | -| RELEASE_NOTES.md | File | Release notes. | - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the MQTT service and REST service JAR files by default. If you need to use them, please contact the Timecho team to obtain them. 
\ No newline at end of file diff --git a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md b/src/UserGuide/Master/Table/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md deleted file mode 100644 index b2e601dd2..000000000 --- a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md +++ /dev/null @@ -1,319 +0,0 @@ - -# Stand-Alone Deployment - -This guide introduces how to set up a standalone TimechoDB instance, which includes one ConfigNode and one DataNode (commonly referred to as 1C1D). - -## 1. Prerequisites - -1. **System Preparation**: Ensure the system has been configured according to the [System Requirements](../Deployment-and-Maintenance/Environment-Requirements.md). - -2. **IP Configuration**: It is recommended to use hostnames for IP configuration to prevent issues caused by IP address changes. Set the hostname by editing the `/etc/hosts` file. For example, if the local IP is `192.168.1.3` and the hostname is `iotdb-1`, run: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -``` - -Use the hostname for `cn_internal_address` and `dn_internal_address` in IoTDB configuration. - -3. **Unmodifiable Parameters**: Some parameters cannot be changed after the first startup. Refer to the [Parameters Configuration](#23-parameters-configuration) section. - -4. **Installation Path**: Ensure the installation path contains no spaces or non-ASCII characters to prevent runtime issues. - -5. **User Permissions**: Choose one of the following permissions during installation and deployment: - - **Root User (Recommended)**: This avoids permission-related issues. - - **Non-Root User**: - - Use the same user for all operations, including starting, activating, and stopping services. - - Avoid using `sudo`, which can cause permission conflicts. - -6. **Monitoring Panel**: Deploy a monitoring panel to track key performance metrics. 
Contact the Timecho team for access and refer to the [Monitoring Board Install and Deploy](../Deployment-and-Maintenance/Monitoring-panel-deployment.md). - -7. **Health Check Tool**: Before installation, the health check tool can help inspect the operating environment of IoTDB nodes and obtain detailed inspection results. The usage method of the IoTDB health check tool can be found in:[Health Check Tool](../Tools-System/Health-Check-Tool.md). - - -## 2. Installation Steps - -### 2.1 Pre-installation Check - -To ensure the IoTDB Enterprise Edition installation package you obtained is complete and authentic, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Find the "SHA512 Checksum" corresponding to each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/iotdb): - ```Bash - cd /data/iotdb - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-02.png) - -4. Compare the output result with the official SHA512 checksum. Once confirmed that they match, you can proceed with the installation and deployment operations in accordance with the procedures below. - -#### Notes: - -- If the verification results do not match, please contact Timecho Team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. 
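
The comparison in step 4 can be done by the shell instead of by eye. The following is a minimal sketch, assuming a Linux host with `sha512sum` available; `verify_sha512` is a hypothetical helper name, not a TimechoDB tool, and the expected checksum must be copied from the Release History document.

```bash
# Sketch: compare a file's SHA512 digest with an expected value.
# verify_sha512 is a hypothetical helper; pass the file path and the published checksum.
verify_sha512() {
  local file="$1" expected="$2" actual
  if [ ! -f "$file" ]; then
    echo "file not found: $file" >&2
    return 2
  fi
  # sha512sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(sha512sum "$file" | awk '{print $1}')"
  # Normalize hex digits to lower case so the comparison is case-insensitive.
  if [ "$(printf '%s' "$actual" | tr 'A-F' 'a-f')" = "$(printf '%s' "$expected" | tr 'A-F' 'a-f')" ]; then
    echo "OK"
  else
    echo "MISMATCH" >&2
    return 1
  fi
}
```

For example, `verify_sha512 timechodb-{version}-bin.zip "<checksum from Release History>"` prints `OK` and returns 0 when the digests match. It returns 1 on a mismatch and 2 when the file is missing, which correspond to the two cases in the notes above.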
- -### 2.2 Extract Installation Package - -Unzip the installation package and navigate to the directory: - -```Bash -unzip timechodb-{version}-bin.zip -cd timechodb-{version}-bin -``` - -### 2.3 Parameters Configuration - -#### 2.3.1 Memory Configuration - -Edit the following files for memory allocation: - -- **ConfigNode**: `./conf/confignode-env.sh` (or `.bat` for Windows) - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------ | :--------------------------------- | :---------- | :-------------- | :-------------------------------------- | -| MEMORY_SIZE | Total memory allocated to the node | Automatically calculated based on system memory, defaulting to 30% of the system memory. | As needed | Save changes without immediate execution; modifications take effect after service restart. | - - -- **DataNode**: `./conf/datanode-env.sh` (or `.bat` for Windows) - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------ | :--------------------------------- |:-----------------------------------------------------------------------------------------| :-------------- | :-------------------------------------- | -| MEMORY_SIZE | Total memory allocated to the node | Automatically calculated based on system memory, defaulting to 50% of the system memory. | As needed | Save changes without immediate execution; modifications take effect after service restart. | - - -#### 2.3.2 General Configuration - -Set the following parameters in `conf/iotdb-system.properties`. Refer to `conf/iotdb-system.properties.template` for a complete list. 
- -**Cluster-Level Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------------------ | :-------------------------- | :------------- | :-------------- | :----------------------------------------------------------- | -| cluster_name | Name of the cluster | defaultCluster | Customizable | Support hot loading, but it is not recommended to change the cluster name by manually modifying the configuration file. | -| schema_replication_factor | Number of metadata replicas | 1 | 1 | In standalone mode, set this to 1. This value cannot be modified after the first startup. | -| data_replication_factor | Number of data replicas | 1 | 1 | In standalone mode, set this to 1. This value cannot be modified after the first startup. | - -**ConfigNode Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------------------- | :--------------------------------------------------------- | -| cn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. | This parameter cannot be modified after the first startup. | -| cn_internal_port | Port used for internal communication within the cluster | 10710 | 10710 | This parameter cannot be modified after the first startup. | -| cn_consensus_port | Port used for consensus protocol communication among ConfigNode replicas | 10720 | 10720 | This parameter cannot be modified after the first startup. | -| cn_seed_config_node | Address of the ConfigNode for registering and joining the cluster. (e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Use `cn_internal_address:cn_internal_port` | This parameter cannot be modified after the first startup. 
| - -**DataNode Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------------------------ | :----------------------------------------------------------- | :-------------- |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :--------------------------------------------------------- | -| dn_rpc_address | Address for the client RPC service | 127.0.0.1 | By default, the local machine can directly access it. For non-local access, please modify this configuration item to the IPv4 address or hostname of the server where it is located. It is recommended to use the IPv4 address of the server where it is located. | Effective after restarting the service. | -| dn_rpc_port | Port for the client RPC service | 6667 | 6667 | Effective after restarting the service. | -| dn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. | This parameter cannot be modified after the first startup. | -| dn_internal_port | Port used for internal communication within the cluster | 10730 | 10730 | This parameter cannot be modified after the first startup. | -| dn_mpp_data_exchange_port | Port used for receiving data streams | 10740 | 10740 | This parameter cannot be modified after the first startup. | -| dn_data_region_consensus_port | Port used for data replica consensus protocol communication | 10750 | 10750 | This parameter cannot be modified after the first startup. | -| dn_schema_region_consensus_port | Port used for metadata replica consensus protocol communication | 10760 | 10760 | This parameter cannot be modified after the first startup. 
| -| dn_seed_config_node | Address of the ConfigNode for registering and joining the cluster. (e.g., `cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Use `cn_internal_address:cn_internal_port` | This parameter cannot be modified after the first startup. | - -### 2.4 Start ConfigNode - -Navigate to the `sbin` directory and start ConfigNode: - -```Bash -# Unix/OS X -./sbin/start-confignode.sh -d # The "-d" flag starts the process in the background. - -# Windows -# Before version V2.0.4.x -.\sbin\start-confignode.bat - -# V2.0.4.x and later versions -.\sbin\windows\start-confignode.bat -``` - -If the startup fails, refer to the [Common Issues](#3-common-issues) section below for troubleshooting. - - - -### 2.5 Start DataNode - -Navigate to the `sbin` directory of IoTDB and start the DataNode: - -```Bash -# Unix/OS X -./sbin/start-datanode.sh -d # The "-d" flag starts the process in the background. - -# Windows -# Before version V2.0.4.x -.\sbin\start-datanode.bat - -# V2.0.4.x and later versions -.\sbin\windows\start-datanode.bat -``` - -### 2.6 Activate the Database - -#### Option 1: Command-Based Activation - -1. Enter the IoTDB CLI. - -**Linux** or **MacOS** - -```Bash -# Before version V2.0.6.x -Shell> bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x and later versions -Shell> bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` -**Windows** - -```Bash -# Before version V2.0.4.x -Shell> sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.4.x and later versions, before version V2.0.6.x -Shell> sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x and later versions -Shell> sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` - - -2. 
Execute the following command to obtain the machine code required for activation: - -```SQL -show system info -``` -```Bash -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -3. Execute the following statement to obtain the version number of the database to be activated: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.9.2| 5ea21bc| -+-------+---------+ -Total line number = 1 -``` - -4. Provide the obtained machine code and version number to the Timecho team. - -5. Enter the activation codes provided by the Timecho team in the CLI in sequence using the following format. Wrap the activation code in single quotes ('): - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - -#### Option 2: File-Based Activation - -1. After starting the Confignode and Datanode nodes, enter the `activation` folder and send the `system_info` file to the Timecho team. -2. Receive the `license` file returned by the staff. -3. Place the `license` file into the `activation` folder of the corresponding node. - - -### 2.7 Verify Activation - -In the CLI, you can check the activation status by running the `show activation` command. Check the `ClusterActivationStatus` field. If it shows `ACTIVATED`, the database has been successfully activated. - -![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81.png) - -## 3. Common Issues -1. Activation Fails Repeatedly - - 1. Use the `ls -al` command to verify that the ownership of the installation directory matches the current user. - 2. 
Check the ownership of all files in the `./activation` directory to ensure they belong to the current user. - -2. ConfigNode Fails to Start - - 1. Review the startup logs to check if any parameters, which cannot be modified after the first startup, were changed. - 2. Check the logs for any other errors. If unresolved, contact technical support for assistance. - 3. If the deployment is fresh or data can be discarded, clean the environment and redeploy using the following steps: - - **Clean the Environment** - - 1. Stop all ConfigNode and DataNode processes: - ```Bash - sbin/stop-standalone.sh - ``` - - 2. Check for any remaining processes: - ```Bash - jps - # or - ps -ef | grep iotdb - ``` - - 3. If processes remain, terminate them manually: - ```Bash - kill -9 - - #For systems with a single IoTDB instance, you can clean up residual processes with: - ps -ef | grep iotdb | grep -v grep | tr -s ' ' ' ' | cut -d ' ' -f2 | xargs kill -9 - ``` - - 4. Delete the `data` and `logs` directories: - ```Bash - cd /data/iotdb - rm -rf data logs - ``` - -## 4. Appendix - -### 4.1 ConfigNode Parameters - -| Parameter | Description | Required | -| :-------- | :---------------------------------------------------------- | :------- | -| -d | Starts the process in daemon mode (runs in the background). | No | - -### 4.2 DataNode Parameters - -| Parameter | Description | Required | -| :-------- | :----------------------------------------------------------- | :------- | -| -v | Displays version information. | No | -| -f | Runs the script in the foreground without backgrounding it. | No | -| -d | Starts the process in daemon mode (runs in the background). | No | -| -p | Specifies a file to store the process ID for process management. | No | -| -c | Specifies the path to the configuration folder; the script loads configuration files from this location. | No | -| -g | Prints detailed garbage collection (GC) information. 
| No | -| -H | Specifies the path for the Java heap dump file, used during JVM memory overflow. | No | -| -E | Specifies the file for JVM error logs. | No | -| -D | Defines system properties in the format `key=value`. | No | -| -X | Passes `-XX` options directly to the JVM. | No | -| -h | Displays the help instructions. | No | diff --git a/src/UserGuide/Master/Table/Ecosystem-Integration/Ecosystem-Overview_timecho.md b/src/UserGuide/Master/Table/Ecosystem-Integration/Ecosystem-Overview_timecho.md deleted file mode 100644 index 96d73ffc1..000000000 --- a/src/UserGuide/Master/Table/Ecosystem-Integration/Ecosystem-Overview_timecho.md +++ /dev/null @@ -1,44 +0,0 @@ - - -# Overview - -IoTDB Ecosystem Integration Bridges the Full Pipeline of Time-Series Data: -- Through data collection, it enables second-level device connectivity. -- Via data integration, it constructs cross-cloud pipelines. -- Leveraging programming frameworks, it accelerates business logic development. -- With computing engines, it accomplishes distributed processing. -- Through visualization and SQL development, it implements analytical strategies. -- Finally, by interfacing with IoT platforms, it achieves edge-cloud synergy—building a complete intelligent closed loop from the physical world to digital decision-making. 
- -![](/img/eco-overview-n-en.png) - -The following documentation will help you quickly and comprehensively understand the usage of various integration tools at each stage: - -- Computing Engine - - Spark [Spark](./Spark-IoTDB.md) -- SQL Development - - DBeaver [DBeaver](./DBeaver.md) - - DataGrip [DataGrip ](./DataGrip.md) -- Programming Framework - - Spring Boot Starter [Spring Boot Starter](./Spring-Boot-Starter.md) - - Mybatis Generator [Mybatis Generator](./Mybatis-Generator.md) - - MyBatisPlus Generator [MyBatisPlus Generator](./MyBatisPlus-Generator.md) \ No newline at end of file diff --git a/src/UserGuide/Master/Table/Ecosystem-Integration/SeaTunnel_timecho.md b/src/UserGuide/Master/Table/Ecosystem-Integration/SeaTunnel_timecho.md deleted file mode 100644 index 115a94e1d..000000000 --- a/src/UserGuide/Master/Table/Ecosystem-Integration/SeaTunnel_timecho.md +++ /dev/null @@ -1,193 +0,0 @@ - - - -# Apache SeaTunnel - -## 1. Overview - -SeaTunnel is a distributed integration platform designed for massive data. Leveraging its high performance and elastic scaling capabilities, it connects multi-source heterogeneous data links through standardized Connectors (composed of Source and Sink). The platform uniformly abstracts various data sources into the SeaTunnelRow format via Source. After dynamic resource scheduling and batch processing optimization, it efficiently writes data to different storage systems through Sink. Through the deep integration of the IoTDB Connector with SeaTunnel, it not only addresses core challenges in time-series data scenarios such as **high-throughput writing, multi-source governance, and complex analysis**, but also helps enterprises quickly build **low-cost, highly reliable, and easily scalable** data infrastructure in fields like the Internet of Things and industrial internet, leveraging the out-of-the-box connector ecosystem and automated operation and maintenance capabilities. - -## 2. 
Usage Steps - -### 2.1 Environment Preparation - -#### 2.1.1 Software Requirements - -| Software | Version | Installation Reference | -| ------------- | ------------- |-----------------------------------------------------------| -| IoTDB | >= 2.0.5 | [Quick Start](../QuickStart/QuickStart_timecho.md) | -| SeaTunnel | 2.3.12 | [Official Website](https://seatunnel.apache.org/download) | - -* Thrift Version Conflict Resolution (Only required for Spark engine): - -```Bash -# Remove older Thrift from Spark -rm -f $SPARK_HOME/jars/libthrift* -# Copy IoTDB's Thrift library to Spark classpath -cp $IOTDB_HOME/lib/libthrift* $SPARK_HOME/jars/ -``` - -#### 2.1.2 Dependency Configuration - -1. JDBC - -* Spark/Flink Engine: Place the [JDBC driver JAR](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) into the `${SEATUNNEL_HOME}/plugins/` directory. -* SeaTunnel Zeta Engine: Place the [JDBC driver JAR](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) into the `${SEATUNNEL_HOME}/lib/` directory. - -2. Connector - -Place the corresponding version of the [SeaTunnel Connector](https://mvnrepository.com/artifact/org.apache.seatunnel/connector-iotdb) into the `${SEATUNNEL_HOME}/plugins/` directory. 
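The placement rules above can be sketched as a small shell helper. This is a minimal sketch, not an official installer: the function name `install_iotdb_jars` and the JAR file names in the usage line are hypothetical, and only the target directories (`plugins/` and `lib/` under `${SEATUNNEL_HOME}`) come from the text above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of SeaTunnel or IoTDB): copies the IoTDB JDBC
# driver and the IoTDB connector JAR into the directories described above.
# Usage: install_iotdb_jars <spark|flink|zeta> <jdbc-jar> <connector-jar>
install_iotdb_jars() {
  local engine="$1" jdbc_jar="$2" connector_jar="$3"
  local home="${SEATUNNEL_HOME:?SEATUNNEL_HOME must be set}"
  local jdbc_dir

  case "$engine" in
    spark|flink) jdbc_dir="$home/plugins" ;;  # Spark/Flink engine: JDBC driver goes to plugins/
    zeta)        jdbc_dir="$home/lib" ;;      # Zeta engine: JDBC driver goes to lib/
    *) echo "unknown engine: $engine" >&2; return 1 ;;
  esac

  # The connector JAR always goes to plugins/, whatever the engine.
  mkdir -p "$jdbc_dir" "$home/plugins"
  cp "$jdbc_jar" "$jdbc_dir/" && cp "$connector_jar" "$home/plugins/"
}
```

For example, `install_iotdb_jars zeta ./iotdb-jdbc.jar ./connector-iotdb.jar` (placeholder file names) would copy the driver into `lib/` and the connector into `plugins/`.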
- -### 2.2 Reading Data (IoTDB Source Connector) - -#### 2.2.1 Configuration Parameters - -| **Parameter** | **Type** | **Required** | **Default** | **Description** | -| -------------------------- | -------- | ------------ | ----------- | --------------------------------------------------------------------------------------------------------------------------------------- | -| `node_urls` | string | yes | - | IoTDB cluster address, format: `"host1:port"` or `"host1:port,host2:port"` | -| `username` | string | yes | - | IoTDB username | -| `password` | string | yes | - | IoTDB password | -| `sql_dialect` | string | no | tree | IoTDB model: `tree` for tree model; `table` for table model | -| `sql` | string | yes | - | SQL query statement to execute | -| `database` | string | no | - | Database name, only effective in table model | -| `schema` | config | yes | - | Data schema definition | -| `fetch_size` | int | no | - | Number of data rows fetched per request from IoTDB during query execution | -| `lower_bound` | long | no | - | Lower bound of time range (used for data partitioning by time column) | -| `upper_bound` | long | no | - | Upper bound of time range (used for data partitioning by time column) | -| `num_partitions` | int | no | - | Number of partitions (used when partitioning by time column):
1 partition: uses the full time range
If partitions < (upper_bound - lower_bound), the difference is used as actual partitions | -| `thrift_default_buffer_size`| int | no | - | Thrift protocol buffer size | -| `thrift_max_frame_size` | int | no | - | Thrift maximum frame size | -| `enable_cache_leader` | boolean | no | - | Whether to enable leader node caching | -| `version` | string | no | - | Client SQL semantic version (`V_0_12` / `V_0_13`) | - -#### 2.2.2 Configuration Example - -1. Create a new file `iotdb_source_example.conf` in the `${SEATUNNEL_HOME}/config/` directory: - -```bash -env { - parallelism = 2 # Parallelism set to 2 - job.mode = "BATCH" # Batch mode -} - -source { - IoTDB { - node_urls = "localhost:6667" - username = "root" - password = "root" - sql_dialect = "table" - sql = "SELECT time,device_id,city,s1,s2,s3,s4 FROM tcollector.table1" - schema { - fields { - time = timestamp - device_id = string - city= string - s1= int - s2= bigint - s3= float - s4= double - } - } - } -} - -sink { - Console { - } # Output to console -} -``` - -2. Run SeaTunnel with the following command: - -```Bash -./bin/seatunnel.sh --config config/iotdb_source_example.conf -e local -``` - -3. For more details, please refer to the official Apache SeaTunnel documentation on [IoTDB Source Connector](https://seatunnel.apache.org/docs/2.3.12/connector-v2/source/IoTDB). 
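To make the idea behind `lower_bound`, `upper_bound`, and `num_partitions` more concrete, the sketch below splits a time range into contiguous sub-ranges, each of which could be read by one parallel reader. It is illustrative only: `split_time_range` is a hypothetical function, and the even split it performs is an assumption, not the connector's actual partitioning algorithm.

```bash
#!/usr/bin/env bash
# Illustrative only: split the inclusive range [lower, upper] into at most n
# contiguous half-open sub-ranges and print one time predicate per sub-range.
# Assumes n is a positive integer and lower <= upper.
split_time_range() {
  local lower="$1" upper="$2" n="$3"
  local total=$(( upper - lower + 1 ))    # number of timestamps in the range

  # Never ask for more partitions than there are timestamps.
  if (( n > total )); then
    n="$total"
  fi

  local size=$(( (total + n - 1) / n ))   # ceiling division: sub-ranges cover everything
  local start="$lower" end
  while (( start <= upper )); do
    end=$(( start + size ))
    if (( end > upper + 1 )); then
      end=$(( upper + 1 ))                # clamp the last sub-range to the upper bound
    fi
    echo "time >= $start and time < $end"
    start="$end"
  done
}
```

Calling `split_time_range 0 99 4` prints four predicates, from `time >= 0 and time < 25` through `time >= 75 and time < 100`.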
- -### 2.3 Writing Data (IoTDB Sink Connector) - -#### 2.3.1 Configuration Parameters - -| **Parameter** | **Type** | **Required** | **Default** | **Description** | -| ----------------------------- | --------- | ------------ | ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `node_urls` | Array | yes | - | IoTDB cluster address, format: `["host1:port"]` or `["host1:port","host2:port"]` | -| `username` | String | yes | - | IoTDB username | -| `password` | String | yes | - | IoTDB password | -| `sql_dialect` | String | no | tree | IoTDB model: `tree` for tree model; `table` for table model | -| `storage_group` | String | yes | - | IoTDB tree model: specifies the storage group for devices (path prefix) e.g., deviceId = \${storage_group} + "." 
+ \${key_device}; IoTDB table model: specifies the database | -| `key_device` | String | yes | - | IoTDB tree model: field name in SeaTunnelRow that specifies the IoTDB device ID; IoTDB table model: field name in SeaTunnelRow that specifies the IoTDB table name | -| `key_timestamp` | String | no | processing time | IoTDB tree model: field name in SeaTunnelRow that specifies the IoTDB timestamp (if not specified, processing time is used as timestamp); IoTDB table model: field name in SeaTunnelRow that specifies the IoTDB time column (if not specified, processing time is used as timestamp) | -| `key_measurement_fields` | Array | no | See description | IoTDB tree model: field names in SeaTunnelRow that specify the list of IoTDB measurements (if not specified, includes all fields except `key_device` and `key_timestamp`); IoTDB table model: field names in SeaTunnelRow that specify the IoTDB field columns (if not specified, includes all fields except `key_device`, `key_timestamp`, `key_tag_fields`, `key_attribute_fields`) | -| `key_tag_fields` | Array | no | - | IoTDB tree model: not applicable; IoTDB table model: field names in SeaTunnelRow that specify the IoTDB tag columns | -| `key_attribute_fields` | Array | no | - | IoTDB tree model: not applicable; IoTDB table model: field names in SeaTunnelRow that specify the IoTDB attribute columns | -| `batch_size` | Integer | no | 1024 | For batch writing, data is flushed to IoTDB when the buffer reaches `batch_size` or when the time reaches `batch_interval_ms` | -| `max_retries` | Integer | no | - | Number of retries on failed flush | -| `retry_backoff_multiplier_ms` | Integer | no | - | Multiplier used to generate the next backoff delay | -| `max_retry_backoff_ms` | Integer | no | - | Maximum wait time before retrying a request to IoTDB | -| `default_thrift_buffer_size` | Integer | no | - | Initial buffer size for Thrift client in IoTDB | -| `max_thrift_frame_size` | Integer | no | - | Maximum frame size for Thrift client 
in IoTDB | -| `zone_id` | string | no | - | IoTDB client `java.time.ZoneId` | -| `enable_rpc_compression` | Boolean | no | - | Enable RPC compression in IoTDB client | -| `connection_timeout_in_ms` | Integer | no | - | Maximum time (in milliseconds) to wait when connecting to IoTDB | - -#### 2.3.2 Configuration Example - -1. Create a new file `iotdb_sink_example.conf` in the `${SEATUNNEL_HOME}/config/` directory: - -```bash -# Define runtime environment -env { - parallelism = 4 - job.mode = "BATCH" -} - -source{ - Jdbc { - url = "jdbc:mysql://localhost:3306/demo_db?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true" - driver = "com.mysql.cj.jdbc.Driver" - connection_check_timeout_sec = 100 - user = "root" - password = "IoTDB@2024" - query = "select * from device" - } -} -sink { - IoTDB { - node_urls = ["localhost:6667"] - username = "root" - password = "root" - sql_dialect = "table" - storage_group = "seatunnel" - key_device = "id" - key_timestamp = "intime" - } -} -``` - -2. Run SeaTunnel with the following command: - -```Bash -./bin/seatunnel.sh --config config/iotdb_sink_example.conf -e local -``` - -3. For more configuration parameters and examples, please refer to the official Apache SeaTunnel documentation on [IoTDB Sink Connector](https://seatunnel.apache.org/docs/2.3.12/connector-v2/sink/IoTDB). diff --git a/src/UserGuide/Master/Table/IoTDB-Introduction/IoTDB-Introduction_timecho.md b/src/UserGuide/Master/Table/IoTDB-Introduction/IoTDB-Introduction_timecho.md deleted file mode 100644 index f74257551..000000000 --- a/src/UserGuide/Master/Table/IoTDB-Introduction/IoTDB-Introduction_timecho.md +++ /dev/null @@ -1,299 +0,0 @@ - -# IoTDB Introduction - -TimechoDB is a high-performance, cost-efficient, and IoT-native time-series database developed by Timecho. As an enterprise-grade extension of Apache IoTDB, it is designed to tackle the complexities of managing large-scale time-series data in IoT environments. 
These challenges include high-frequency data sampling, massive data volumes, out-of-order data, extended processing times, diverse analytical demands, and high storage and maintenance costs. - -TimechoDB enhances Apache IoTDB with superior functionality, optimized performance, enterprise-grade reliability, and an intuitive toolset, enabling industrial users to streamline data operations and unlock deeper insights. - -- [Quick Start](../QuickStart/QuickStart_timecho.md): Download, Deploy, and Use - -## 1. TimechoDB Data Management Solution - -The Timecho ecosystem provides an integrated **collect-store-use** solution, covering the complete lifecycle of time-series data, from acquisition to analysis. - -![](/img/Introduction-en-timecho-new.png) - -Key components include: - -1. **Time-Series Database (TimechoDB)**: - 1. The primary storage and processing engine for time-series data, based on Apache IoTDB. - 2. Offers **high compression, advanced query capabilities, real-time stream processing, high availability, and scalability**. - 3. Provides **security features, multi-language APIs, and seamless integration with external systems**. -2. **Time-Series Standard File Format (Apache TsFile)**: - 1. A high-performance storage format originally developed by Timecho’s core contributors. - 2. Enables **efficient compression and fast querying**. - 3. Powers TimechoDB’s **data collection, storage, and analysis pipeline**, ensuring unified data management. -3. **Time-Series AI Engine (AINode)**: - 1. Integrates **machine learning and deep learning** for time-series analytics. - 2. Extracts actionable insights directly from TimechoDB-stored data. -4. **Data Collection Framework**: - 1. Supports **various industrial protocols, resumable transfers, and network barrier penetration**. - 2. Facilitates **reliable data acquisition in challenging industrial environments**. - -## 2. 
TimechoDB Architecture - -The diagram below illustrates a common cluster deployment (3 ConfigNodes, 3 DataNodes) of TimechoDB: - -![](/img/Cluster-Concept03N.png) - -## 3. Key Features - -TimechoDB offers the following advantages: - -**Flexible Deployment:** - -- Supports one-click cloud deployment, on-premise installation, and seamless terminal-cloud synchronization. -- Adapts to hybrid, edge, and cloud-native architectures. - -**Cost-Efficient Storage:** - -- Utilizes high compression ratio storage, eliminating the need for separate real-time and historical databases. -- Supports unified data management across different time horizons. - -**Hierarchical Data Organization:** - -- Mirrors real-world industrial structures through hierarchical measurement point modeling. -- Enables directory-based navigation, search, and retrieval. - -**High-Throughput Read & Write:** - -- Optimized for millions of concurrent device connections. -- Handles multi-frequency and out-of-order data ingestion with high efficiency. - -**Advanced Time-Series Query Semantics:** - -- Features a native time-series computation engine with built-in timestamp alignment. -- Provides nearly 100 aggregation and analytical functions, enabling AI-powered time-series insights. - -**Enterprise-Grade High Availability:** - -- Distributed HA architecture ensures 24/7 real-time database services. -- Automated resource balancing when nodes are added, removed, or overheated. -- Supports heterogeneous clusters with varying hardware configurations. - -**Operational Simplicity:** - -- Standard SQL query syntax for ease of use. -- Multi-language APIs for flexible development. -- Comes with a comprehensive toolset, including an intuitive management console. - -**Robust Ecosystem Integration:** - -- Seamlessly integrates with big data frameworks (Hadoop, Spark) and visualization tools (Grafana, ThingsBoard, DataEase). -- Supports device management for industrial IoT environments. - -## 4. 
Enterprise-level Enhancements - -TimechoDB extends Apache IoTDB with advanced industrial-grade capabilities, including tiered storage, cloud-edge collaboration, visualization tools, and security upgrades. - -**Dual-Active Deployment:** - -- Implements active-active high availability, ensuring continuous operations. -- Two independent clusters perform real-time bidirectional synchronization. -- Both systems accept external writes and maintain eventual consistency. - -**Seamless Data Synchronization** **:** - -- Built-in synchronization module supports real-time and batch data aggregation from field devices to central hubs. -- Supports full, partial, and cascading aggregation. -- Includes enterprise-ready plugins for cross air-gap transmission, encrypted transmission, and compression. - -**Tiered** **Storage:** - -- Dynamically categorizes data into hot, warm, and cold tiers. -- Efficiently balances SSD, HDD, and cloud storage utilization. -- Automatically optimizes data access speed and storage costs. - -**Enhanced Security** **:** - -- Implements whitelist-based access control and audit logging. -- Strengthens data governance and risk mitigation. - -**Feature Comparison**: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
FunctionApache IoTDBTimechoDB
Deployment ModeStand-Alone Deployment
Distributed Deployment
Dual Active Deployment-
Container DeploymentPartial support
Database FunctionalitySensor Management
Write Data
Query Data
Continuous Query
Trigger
User Defined Function
Permission Management
Data SynchronisationOnly file synchronization, no built-in pluginsReal time synchronization+file synchronization, enriched with built-in plugins
Stream ProcessingOnly framework, no built-in pluginsFramework+rich built-in plugins
Tiered Storage-
View-
White List-
Audit Log-
Supporting ToolsWorkbench-
Cluster Management Tool-
System Monitor Tool-
LocalizationLocalization Compatibility Certification-
Technical SupportExpert Support-
Use Training-
- -### 4.1 Higher Efficiency and Stability - -TimechoDB achieves up to 10x performance improvements over Apache IoTDB in mission-critical workloads, and provides rapid fault recovery for industrial environments. - -### 4.2 Comprehensive Management Tools - -TimechoDB simplifies deployment, monitoring, and maintenance through an intuitive toolset: - -- **Cluster Monitoring Dashboard:** - - Real-time insights into IoTDB and underlying OS health. - - 100+ performance metrics for in-depth monitoring and optimization. - - - - ![](/img/Introduction01.png) - - - - ![](/img/Introduction02.png) - - - - ![](/img/Introduction03.png) - - -- **Database Console:** - - Simplifies interaction with an intuitive GUI for metadata management, SQL execution, user permissions, and system configuration. -- **Cluster Management Tool:** - - Provides **one-click operations** for cluster deployment, scaling, start/stop, and configuration updates. - - ![](/img/introduction-opskit-en.png) - -### 4.3 Professional Enterprise Technical Services - -TimechoDB offers **vendor-backed enterprise services** to support industrial-scale deployments: - -- **On-Site Installation & Training**: Hands-on guidance for fast adoption. -- **Expert Consulting & Advisory**: Performance tuning and expert support. -- **Emergency Support & Remote Assistance**: Minimized downtime for mission-critical operations. -- **Custom Development & Optimization**: Tailored solutions for unique industrial use cases. - -Compared to Apache IoTDB’s 2-3 month release cycle, TimechoDB delivers faster updates and same-day critical issue resolutions, ensuring production stability. - -### 4.4 Ecosystem Compatibility & Compliance - -TimechoDB is self-developed, supports mainstream CPUs and operating systems, and meets industry compliance standards, making it a reliable choice for enterprise IoT deployments. 
\ No newline at end of file diff --git a/src/UserGuide/Master/Table/IoTDB-Introduction/Release-history_timecho.md b/src/UserGuide/Master/Table/IoTDB-Introduction/Release-history_timecho.md deleted file mode 100644 index 77764aaef..000000000 --- a/src/UserGuide/Master/Table/IoTDB-Introduction/Release-history_timecho.md +++ /dev/null @@ -1,703 +0,0 @@ - -# Release History - -## 1. TimechoDB (Database Core) - - -### V2.0.9.4 -> Release Date: 2026.06.10
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.4-bin.zip
-> SHA512 Checksum: 040ebdd9e45d93535e9628cf377003d560be83cec9737f5a5fbd0c3a93a12810814094752eac3eacdfec5cddcf433fa83e76edc14be34c73c1a54d9b937ea1b5 - -Version 2.0.9.4 primarily optimizes table model AINode inference, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- AINode: Table model covariate inference models adaptively support filling null values - - -### V2.0.9.3 -> Release Date: 2026.05.14
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.3-bin.zip
-> SHA512 Checksum: f6c5d50cbf8902503289884f073593c650ffdc8edbebfabf27f6ab4499630749331aa4ed09dd34627a39fa8dee27b4d7e2689d0ed1cf23c76dd9c7270f9fae2a - -In version 2.0.9.3, AINode adds support for registering multiple models that share the same model code but use different model weights. The release also includes enhancements and bug fixes for previous versions, with comprehensive improvements to database monitoring, performance, and stability. Details are as follows: - -- AINode: [Supports registering custom models with the same model code and different model weights](../AI-capability/AINode_Upgrade_timecho.md#_4-3-register-custom-models) - - -### V2.0.9.2 -> Release Date: 2026.05.11
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.2-bin.zip
-> SHA512 Checksum: 10d3f34b6e65ad5c09b1cf3538ee27e181cc38c5fedf6acfd7d7053797ca23c76245683536275b69bd478aa1e43364351eceef1948832ab663a7398665af9eff - -Version 2.0.9.2 adds import and export capabilities for the Object data type, and introduces the new `tsfile-backup` script (currently supported only for table model scenarios). It also brings optimizations and bug fixes for earlier versions, with overall upgrades to database monitoring, performance, and stability. Details are as follows: - -- Scripts & Tools: [The `import-data` script for TsFile format](../Tools-System/Data-Import-Tool_timecho.md#_2-4-tsfile-format) supports Object type data import for table models -- Scripts & Tools: New [`tsfile-backup` script](../Tools-System/Data-Export-Tool_timecho.md#_3-tsfilebackup-based-on-pipe-framework) added for table models -- Stream Processing Module: PIPE for table models supports [local export and remote transmission of Object type data](../User-Manual/Data-Sync_timecho.md#_3-9-object-type-data-export) -- System Module: [Audit logs](../User-Manual/Audit-Log_timecho.md) support statistics on the number of slow requests - -### V2.0.9.1 -> Release Date: 2026.05.11
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.1-bin.zip
-> SHA512 Checksum: 18ff3801ba58550e06ef0aa4bf4465e8ce1b31d1aecb9c6899eb843f5d9187d3cc575e930ee38d96b87b17067e2b21f1852ab5127eac7480cf5051c20a68894b - -Version 2.0.9.1 endows AINode with covariate classification inference capability, supports schema-level and table-level storage space statistics. It adds set operations, CTE and multiple built-in functions for data query, enables SQL debugging via DEBUG statements, and supports configuring auto-start on boot. This version also contains legacy version improvements, bug fixes, and comprehensive enhancements to database monitoring, performance and stability. Details are as follows: - -- AINode: Table models support [time series data classification inference](../AI-capability/AINode_Upgrade_timecho.md#_4-1-model-inference) -- Query Module: Table models support [set operations (UNION/INTERSECT/EXCEPT)](../SQL-Manual/Set-Operations_timecho.md) and [Common Table Expressions (CTE)](../SQL-Manual/Common-Table-Expression_timecho.md) -- Query Module: Newly added [IF scalar function](../SQL-Manual/Basis-Function_timecho.md#_8-3-if-expression), [binary functions](../SQL-Manual/Basis-Function_timecho.md#_7-binary-functions) and [APPROX_PERCENTILE aggregate function](../SQL-Manual/Basis-Function_timecho.md#_2-aggregate-functions) for table models -- Query Module: Supports [DEBUG SQL](../User-Manual/Maintenance-commands_timecho.md#_6-query-debugging) for query debugging and optimizes the result set of [Explain Analyze](../User-Manual/Query-Performance-Analysis.md) -- Query Module: Supports [schema-level](../../latest/User-Manual/Maintenance-commands_timecho.md#_1-10-view-disk-space-usage) and [table-level](../Reference/System-Tables_timecho.md#_2-22-table-disk-usage) storage space occupancy statistics; the[ `SHOW CONFIGURATION` statement](../User-Manual/Maintenance-commands_timecho.md#_1-13-view-node-configuration) is available to view cluster configuration information -- Scripts & Tools: Data and metadata import/export tools 
support the SSL protocol -- Scripts & Tools: Command-line tool adds access [history display](../Tools-System/CLI_timecho.md#_4-access-history-feature) capability -- System Module: Supports [system auto-start](../User-Manual/Auto-Start-On-Boot_timecho.md) configuration -- Others: Fixed security vulnerability CVE-2026-28564 - - -### V2.0.8.3 -> Release Date: 2026.04.21
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.8.3-bin.zip
-> SHA512 Checksum: 4b95bea87cc375bc455897dcf4cec80692421fa5c3eee746e1095b94288611d4afdd94aa8dad70340757d041757758924701cbdb2b73b49fb8730c4caac2a126 - -Version 2.0.8.3 enables reading and writing Object type data via Python. It also includes optimizations and bug fixes for previous versions, with comprehensive upgrades to database monitoring, performance and stability. Details are as follows: - -- Interface Module: [Python Native API](../API/Programming-Python-Native-API_timecho.md) supports reading and writing Object type data for table models - - -### V2.0.8.2 - -> Release Date: 2026.03.31
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.8.2-bin.zip
-> SHA512 Checksum:02ab10e3e94786dd5676e0a69609eef192afd90d87f4d8d7bd44e7e9cbc8a18d61ba5668bae56cb8e4416ac71a877f760963b72ca7838d7c39ae10f1ed321d89 - -Version 2.0.8.2 adds support for modifying the full path of time series in the tree model, customizing the Time column name in the table model, changing data types in both tree and table models, and includes the ODBC Driver, among other features. It also introduces improvements and bug fixes for earlier versions, with comprehensive enhancements to database monitoring, performance, and stability. The detailed release notes are as follows: - -- Storage Module: The tree model supports [modifying the full name of time series](../../latest/Basic-Concept/Operate-Metadata_timecho.md#_2-4-修改时间序列名称) and [changing the data type of time series](../../latest/Basic-Concept/Operate-Metadata_timecho.md#_2-3-修改时间序列数据类型). -- Storage Module: The table model supports [modifying column data types](../Basic-Concept/Table-Management_timecho.md#_1-5-修改表) and [customizing the Time column name](../Basic-Concept/Table-Management_timecho.md#_1-1-创建表). -- Interface Module: Adds support for the [ODBC Driver](../API/Programming-ODBC_timecho.md); the Python SessionDataset supports fetching DataFrames in batches; the MQTT service is externalized, and a new system table named Services is added for service queries. -- AI Node: The table model supports adaptive [covariate inference](../AI-capability/AINode_Upgrade_timecho.md#_4-1-模型推理). -- Stream Processing Module: The tree model data synchronization PIPE statement supports specifying multiple precise paths. - - -### V2.0.8.1 - -> Release Date: 2026.02.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.8.1-bin.zip
-> SHA512 Checksum: 49d97cbf488443f8e8e73cc39f6f320b3bc84b194aed90af695ebd5771650b5e5b6a3abb0fb68059bd01827260485b903c035657b337442f4fdd32c877f2aca3 - -V2.0.8.1 introduces the **Object data type** to table models, significantly enhances audit logging capabilities, optimizes the tree model’s **OPC UA protocol**, adds **covariate-based forecasting** support in AINode, and enables **concurrent inference** in AINode. Additionally, comprehensive improvements have been made to database monitoring, performance, and stability. The detailed release notes are as follows: - -- **Query Module**: Added a list view of available DataNode instances, allowing users to [view each node's RPC address and port](../User-Manual/Maintenance-commands_timecho.md#_1-7-viewing-available-nodes). -- **Query Module**: Introduced a new system table for [statistical query latency analysis](../Reference/System-Tables_timecho.md#_2-20-queries-costs-histogram). -- **Storage Module**: Added SQL support to retrieve the full definition statements for [tables](../Basic-Concept/Table-Management_timecho.md#_1-4-view-table-creation-statement) and [views](../User-Manual/Tree-to-Table_timecho.md#_2-4-viewing-table-views). -- **Storage Module**: Optimized the tree model’s [OPC UA protocol](../../latest/API/Programming-OPC-UA_timecho.md). -- **System Module**: Added support for the [Object data type](../Background-knowledge/Data-Type_timecho.md) in table models. -- **System Module**: Significantly enhanced and upgraded the [audit log](../User-Manual/Audit-Log_timecho.md) functionality. -- **System Module**: Added a new system table to monitor [DataNode connection status](../Reference/System-Tables_timecho.md#_2-18-connections). -- **AINode**: Integrated the built-in **Chronos-2** model, supporting [covariate-based forecasting](../AI-capability/AINode_Upgrade_timecho.md). -- **AINode**: Built-in models **Timer-XL** and **Sundial** now support [concurrent inference](../AI-capability/AINode_Upgrade_timecho.md). 
-- **Stream Processing Module**: When creating a full-data synchronization pipe, it will be [automatically split](../User-Manual/Data-Sync_timecho.md#_2-1-create-a-task) into two independent pipes—one for real-time data and one for historical data—whose remaining event counts can be monitored separately via the `SHOW PIPES` statement. -- **Others**: Fixed security vulnerabilities **CVE-2025-12183**, **CVE-2025-66566**, and **CVE-2025-11226**. - -### V2.0.6.6 - -> Release Date: 2026.01.20
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.6.6-bin.zip
-> SHA512 Checksum: d12e60b8119690d63c501d0c2afcd527e39df8a8786198e35b53338e21939e1a9244805e710d81cbb62d02c2739909d7e8227c029660a0cd9ea7ca718cf9bdf6 - -V2.0.6.6 primarily optimizes query performance for time series in the tree model, while delivering comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Query Module**: Improved query performance for `SHOW/COUNT TIMESERIES/DEVICES` statements. -* **Others**: Fixed security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226. - - -### V2.0.6.4 - -> Release Date: 2025.11.17
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.6.4-bin.zip
-> SHA512 Checksum: 57b9998cc14632862c32b6781c70db1c52caf8172b5d45d27cc214cab50d3afd4230ed0754e1c1a4ed825666bf971dc81fbb7d3b93261e57e9dabc20e794a2b8 - -V2.0.6.4 focuses on enhancements to the storage and AINode modules, resolves several product defects, and provides comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Storage Module**: Added support for modifying the encoding and compression methods of time series in the tree model. -* **AINode**: Introduced one-click deployment and optimized model inference capabilities. - -### V2.0.6.1 - -> Release Date: 2025.09.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.6.1-bin.zip
-> SHA512 Checksum: c88e3e2c0dbd06578bd0697ca9992880b300baee2c4906ba1f952134e37ae2fa803a6af236f4541d318b75f43a498b5d5bfbbc7c445783271076c36e696e4dd0 - -V2.0.6.1 introduces the new table model query write-back function, access control blacklist/whitelist function, bitwise operation functions (built-in scalar functions), and push-downable time functions. Comprehensive enhancements to database monitoring, performance, and stability are also included. Key updates: - -* **Query Module:** - * Supports the table model query write-back function - * The table model row pattern recognition supports the use of aggregate functions to capture continuous data for analytical calculation - * The table model adds built-in scalar functions - bitwise operation functions - * The table model adds push-downable EXTRACT time functions -* **System Module:** - * Adds access control, supporting users to customize and configure blacklist/whitelist functions -* **Others:** - * The default user password is updated to "TimechoDB@2021" with higher security strength - -### V2.0.5.2 - -> Release Date: 2025.08.08
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.5.2-bin.zip
-> SHA512 Checksum: a00a4075c9937b7749c454f71d2480fea5e9ff9659c0628b132e30e2f256c7c537cd91dca4f6be924db0274bb180946a1b88e460c025bf82fdb994a3c2c7b91e - -V2.0.5.2 addresses certain product defects, optimizes the data synchronization function, and includes comprehensive enhancements to database monitoring, performance, and stability. - - -### V2.0.5.1 - -> Release Date: 2025.07.14
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.5.1-bin.zip
-> SHA512 Checksum: aa724755b659bf89a60da6f2123dfa91fe469d2e330ed9bd029e8f36dd49212f3d83b1025e9da26cb69315e02f65c7e9a93922e40df4f2aa4c7f8da8da2a4cea - -V2.0.5.1 introduces **tree-to-table view**, **window functions** and the **approx\_most\_frequent** aggregate function for the table model, along with support for **LEFT & RIGHT JOIN** and **ASOF LEFT JOIN**. AINode adds two built-in models: **Timer-XL** and **Timer-Sundial**, supporting inference and fine-tuning for tree and table models. Comprehensive enhancements to database monitoring, performance, and stability are also included. Key updates: - -* **Query Module:** - * Supports manually creating tree-to-table views - * Adds window functions for table model - * Adds approx\_most\_frequent aggregate function - * Extends JOIN support: LEFT/RIGHT JOIN, ASOF LEFT JOIN - * Enables row pattern recognition (captures continuous data for analysis) - * New system tables: VIEWS (view metadata), MODELS (model info), etc. -* **System Module:** - * Adds TsFile data encryption -* **AI Module:** - * New built-in models: Timer-XL and Timer-Sundial - * Supports inference/fine-tuning for tree and table models -* **Others:** - * Enables data publishing via OPC DA protocol - -### 2.x Other historical versions - -#### V2.0.4.2 - -> Release Date: 2025.06.21
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.4.2-bin.zip
-> SHA512 Checksum: 31f26473ac90988ce970dac8d0950671bde918f9af6f2f6a6c2bf99a53aa1c0a459c53a137b18ff0b28e70952e9c4b6acb50029e0b2e38837b969eb8f78f2939 - -V2.0.4.2 adds support for passing TOPIC to custom MQTT plugins. Includes comprehensive improvements to monitoring, performance, and stability. - -#### V2.0.4.1 - -> Release Date: 2025.06.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.4.1-bin.zip
-> SHA512 Checksum: 93ac08bfae06aff6db04849f474458433026f66778f4f5c402eb22f1a7cb14d8096daf0a9e9cc365ddfefd4f8ca4443b2a9fb6461906f056b1e6a344990beb3a - -V2.0.4.1 introduces **User-Defined Table Functions (UDTF)** and multiple built-in table functions for the table model, adds the **approx\_count\_distinct** aggregate function, and enables **ASOF INNER JOIN on timestamp columns**. Script tools are categorized, with Windows-specific scripts separated out. Key updates: - -* **Query Module:** - * Adds UDTFs and built-in table functions - * Supports ASOF INNER JOIN on timestamps - * Adds approx\_count\_distinct aggregate function -* **Stream Processing:** - * Supports asynchronous TsFile loading via SQL -* **System Module:** - * Disaster-aware load balancing strategy for replica selection during downsizing - * Compatibility with Windows Server 2025 -* **Scripts & Tools:** - * Categorized scripts; isolated Windows-specific tools - -#### V2.0.3.4 - -> Release Date: 2025.06.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.3.4-bin.zip
-> SHA512 Checksum: d80d34b7d3890def75b17c491fc4c13efc36153a5950a9b23744755d04d6adb5d6ab9ec970101183fef7bfeb8a559ef92fce90d2d22f7b7fd5795cd5589461bb - -V2.0.3.4 upgrades the user password encryption algorithm to **SHA-256**. Includes comprehensive monitoring, performance, and stability improvements. - -#### V2.0.3.3 - -> Release Date: 2025.05.16
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.3.3-bin.zip
-> SHA512 Checksum: f47e3fb45f869dbe690e7cfaa93f95e5e08a462b362aa9d7ccac7ee5b55022dc8f62db12009dfde055f278f3003ff9ea7c22849d52a3ef2c25822f01ade78591 - -V2.0.3.3 introduces **metadata import/export scripts for table models**, **Spark ecosystem integration**, and adds **timestamps to AINode results**. New aggregate/scalar functions are added. Key updates: - -* **Query Module:** - * New aggregate function: count\_if; scalar functions: greatest/least - * Significant optimization for full-table count(\*) queries -* **AI Module:** - * Timestamps added to AINode results -* **System Module:** - * Optimized metadata performance for table model - * Active monitoring & loading of TsFiles - * New metrics: TsFile parsing time, Tablet conversion count -* **Ecosystem Integration:** - * Spark integration for table model -* **Scripts & Tools:** - * import-schema/export-schema scripts support table model metadata - -#### V2.0.3.2 - -> Release Date: 2025.05.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.3.2-bin.zip
-> SHA512 Checksum: 76bd294de4b01782e5dd621a996aeb448e4581f98c70fb5b72b17dc392c2e1227c0d26bd3df5533669a80f217a83a566bc6ec926b7efd21ce7a89b894cd33e19 - -V2.0.3.2 resolves product defects, optimizes node removal, and enhances monitoring, performance, and stability. - -#### V2.0.2.1 - -> Release Date: 2025.04.07
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.2.1-bin.zip
-> SHA512 Checksum: a41be3f8c57e6a39ac165f1d6ab92c9ed790b0712528f31662c58617f4c94e6bfc9392a9c1ef2fc5bdd8c7ca79901389f368cbdbec3e5b1d5c1ce155b2f1a457 - -V2.0.2.1 adds **table model permission management**, **user management**, and **operation authentication**, alongside UDFs, system tables, and nested queries. Data subscription mechanisms are optimized. Key updates: - -* **Query Module:** - * Added UDF management: User-Defined Scalar Functions (UDSF) & Aggregate Functions (UDAF) - * Configurable URI-based loading for UDF/PipePlugin/Trigger/AINode JARs - * Permission/user management with operation authentication - * New system tables and maintenance statements -* **System Module:** - * CSharp client supports table model - * New C++ Session write APIs for table model - * Multi-tier storage supports S3-compliant non-AWS object storage - * New pattern\_match function -* **Data Sync:** - * Table model metadata sync and delete propagation - -#### V2.0.1.2 - -> Release Date: 2025.01.25
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.1.2-bin.zip
-> SHA512 Checksum: 51c2fa5da2974a8a3c8871dec1c49bd98e5d193a13ef33ac7801adb833a1e360d74f0160bcdf33c7ffb23a5c5e0f376e26a4315cf877f1459483356285b85349 - -V2.0.1.2 officially implements **dual-model configuration (tree + table)**. The table model supports **standard SQL queries**, diverse functions/operators, stream processing, and Benchmarking. Python client adds four new data types, and script tools support TsFile/CSV/SQL import/export. Key updates: - -* **Time-Series Table Model:** - * Standard SQL: SELECT, WHERE, JOIN, GROUP BY, ORDER BY, LIMIT, nested queries -* **Query Module:** - * Logical operators, math functions, time-series functions (e.g., DIFF) - * Configurable URI-based JAR loading -* **Storage Module:** - * Session API writes with auto-metadata creation - * Python client supports: String, Blob, Date, Timestamp - * Optimized compaction task priority -* **Stream Processing:** - * Auth info specification on sender side - * TsFile Load for table model - * Plugin adaptation for table model -* **System Module:** - * Enhanced DataNode downsizing stability - * Supports DROP DATABASE in read-only mode -* **Scripts & Tools:** - * Benchmark adapted for table model - * Support for String/Blob/Date/Timestamp in Benchmark - * import-data/export-data: Universal support for TsFile/CSV/SQL -* **Ecosystem Integration:** - * Kubernetes Operator support - - -### V1.3.7.3 - -> Release Date: 2026.06.02
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.7.3-bin.zip
-> SHA512 Checksum: 8e6cde061421a552b9855f39f9cccd4838c820dc15ef0ad2a7c23a54cd6cc4f06c35190c1f428784e6a4d5463dd1b794f58ff5cdf891f27f6d0be4d3ab00bf6f - -V1.3.7.3 primarily optimizes query module and data synchronization capabilities, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- Query Module: Optimized `Last` queries, aligned series queries, reverse-order time filter queries, and other scenarios. -- Metadata Module: Optimized device creation validation for activated series and their child paths. -- Data Synchronization: Optimized the retry mechanism after synchronization failures. -- Data Synchronization: Cross-network-gateway synchronization plugin supports configuring the real-time write transmission timeout. -- Interface Module: Added error code validation to the Go client write interface. -- Interface Module: Optimized C# client connection pool management. - - -### V1.3.7.2 - -> Release Date: 2026.04.07
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.7.2-bin.zip
-> SHA512 Checksum: 787766af64992069f0db0ac8b250b461d799307b3ce06b0782fc25752c8c5307fa2205c9e3a38a41685b81bb6b4b5c1ec9f71a395bfad285caf90de7b8224783 - -V1.3.7.2 primarily optimizes data synchronization and query module capabilities, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- Data Synchronization: Optimized distribution performance for Pipe complex path matching scenarios. -- Query Module: The `SHOW QUERIES` statement now includes client IP, query timeout, server wait time, and other information. -- Ecosystem Integration: Supports IoTDB pushing data to an external OPC Server in OPC Client mode. - - -### V1.3.6.6 - -> Release Date: 2026.01.20
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.6-bin.zip
-> SHA512 Checksum: 590d3ead053298c6df0ede637572ba598b9b684f8b35ab874bd4452f765e1421938f4cca2cf0423af2e806592aa8b15bdd25b41df7de809435a4d0239fc04790 - -V1.3.6.6 enhances data read/write capabilities, resolves several product defects, and delivers comprehensive improvements in database monitoring, performance, and stability. - - -### V1.3.6.3 - -> Release Date: 2026.01.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.3-bin.zip
-> SHA512 Checksum: 43719a1384f59f63cb0029cdda0aba433383cd1a0f5ebc142e54f8aa6623cc30a7efb3e3aef7f3d485d5e07bec91be215c92ed21b5201613d5cc44044251c978 - -V1.3.6.3 focuses on deep optimizations in two core areas—query performance and memory management—while comprehensively enhancing database monitoring, performance, and stability. Specific release contents are as follows: - -* **Query Module**: Optimized query performance across multiple scenarios, including multi-series `Last` queries. -* **Query Module**: Added a new `FastLastQuery` interface in the Java SDK for more efficient `Last` query operations. -* **Query Module**: Modified the tree model’s `fetchSchema` to return results in segmented streaming mode, improving response speed under large-data-volume conditions. -* **Storage Module**: Enhanced memory management to mitigate memory leak risks and ensure long-term system stability. -* **Storage Module**: Optimized the file compaction mechanism to improve compaction efficiency and reduce storage resource consumption. -* **Others**: Fixed security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226. - -### V1.3.6.1 - -> Release Date: 2025.12.09
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.1-bin.zip
-> SHA512 Checksum: 9fb6a6870aa2133bfc40508324a7d97ee078d0d44895beef7b0a331edd203419119fb02b933f585b6c4a6fe9b59708a053d7cf65206b22b1a4f01a5fe518424c - -V1.3.6.1 focuses on deep optimization of data synchronization stability, while delivering comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Data Synchronization**: Enhanced Pipe SQL parameter configuration to support specifying asynchronous loading methods. -* **Data Synchronization**: Introduced syntactic sugar that automatically splits full-data Pipe creation SQL into real-time and historical synchronization components. -* **System Module**: Added a global configuration option for data-type-specific compression strategies, enabling on-demand adjustment of storage compression policies. - - -### V1.3.5.11 - -> Release Date: 2025.09.24
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.11-bin.zip
-> SHA512 Checksum: f18419e20c0d7e9316febee5a053306a97268cb07e18e6933716c2ef98520fbbe051dfa1da02a9c83e8481a839ce35525ce6c50f890f821e3d760f550c75f804 - -V1.3.5.11 primarily optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -### V1.3.5.10 - -> Release Date: 2025.08.27
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.10-bin.zip
-> SHA512 Checksum: 3aea6d2318f52b39bfb86dae9ff06fe1b719fdeceaabb39278c9a73544e1ceaf0660339f9342abb888c8281a0fb6144179dac9bb0c40ba0ecc66bac4dd7cbe80 - -V1.3.5.10 fixes certain product defects and includes comprehensive enhancements to database monitoring, performance, and stability. - -### V1.3.5.9 - -> Release Date: 2025.08.25
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.9-bin.zip
-> SHA512 Checksum: 95b7a6790e94dc88e355a81e5a54b10ee87bdadae69ba0b215273967b3422178d5ee81fa5adf1c5380a67dbb30cf9782eaa3cbfd6ec744b0fd9a91c983ee8f70 - -V1.3.5.9 optimizes memory control, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -### 1.x Other historical versions - -#### V1.3.5.8 - -> Release Date: 2025.08.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.8-bin.zip
-> SHA512 Checksum: aa9802301614e20294a7f2fc4c149ba20d58213d9b74e8f8c607e0f4860949bad164bce2851b63c1d39b7568d62975ab257c269b3a9c168a29ea3945b6d28982 - -V1.3.5.8 optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -#### V1.3.5.7 - -> Release Date: 2025.08.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.7-bin.zip
-> SHA512 Checksum: 17374a440267aed3507dcc8cf4dc8703f8136d5af30d16206a6e1101e378cbbc50eda340b1598a12df35fe87d96db20f7802f0e64033a013d4b81499198663d4 - -V1.3.5.7 optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -#### V1.3.5.6 - -> Release Date: 2025.07.16
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.6-bin.zip
-> SHA512 Checksum: 05b9fda4d98ba8a1c9313c0831362ed3d667ce07cb00acaeabcf6441a6d67dff7da27f3fda2a5e1b3c3b85d1e5c730a534f3aa2f0c731b8c03ef447203b32493 - -V1.3.5.6 introduces a new configuration switch to disable the data subscription feature. It optimizes the C++ high-availability client and addresses PIPE synchronization latency issues in normal operation, restart, and deletion scenarios, along with query performance for large TEXT objects. Comprehensive enhancements to database monitoring, performance, and stability are also included. - -#### V1.3.5.4 - -> Release Date: 2025.06.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.4-bin.zip
-> SHA512 Checksum: edac5f8b70dd67b3f84d3e693dc025a10b41565143afa15fc0c4937f8207479ffe2da787cc9384440262b1b05748c23411373c08606c6e354ea3dcdba0371778 - -V1.3.5.4 fixes several product defects and optimizes the node removal functionality. It also delivers comprehensive improvements to database monitoring, performance, and stability. - -#### V1.3.5.3 - -> Release Date: 2025.06.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.3-bin.zip
-> SHA512 Checksum: 5f807322ceec9e63a6be86108cc57e7ad4251b99a6c28baf11256ab65b2145768e9110409f89834d5f4256094a8ad995775c0e59a17224ff2627cd9354e09d82 - -V1.3.5.3 focuses on optimizing data synchronization capabilities, including persisting PIPE transmission progress and adding monitoring metrics for PIPE event transfer time. Related defects have been resolved. Additionally, the encryption algorithm for user passwords has been upgraded to SHA-256. Comprehensive enhancements to database monitoring, performance, and stability are included. - -#### V1.3.5.2 - -> Release Date: 2025.06.10
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.2-bin.zip
-> SHA512 Checksum: 4c0a5db76c6045dfd27cce303546155cdb402318024dae5f999f596000d7b038b13bbeac39068331b5c6e2c80bc1d89cd346dd0be566fe2fe865007d441d9d05 - -V1.3.5.2 primarily optimizes data synchronization features, adding support for cascading configurations via parameters and ensuring fully consistent ordering between synchronized and real-time writes. It also enables partitioned sending of historical and real-time data after system restarts. Comprehensive enhancements to database monitoring, performance, and stability are included. - -#### V1.3.5.1 - -> Release Date: 2025.05.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.1-bin.zip
-> SHA512 Checksum: 91f22bafbdd4d580126ed59ba1ba99d14209f10ce4a0a4bd7d731943ac99fdb6ebfab6e3a1e294a7cb7f46367e9fd4252b0d9ac4d4240ddedf6d85658e48f212 - -V1.3.5.1 resolves several product defects and delivers comprehensive improvements to database monitoring, performance, and stability. - -#### V1.3.4.2 - -> Release Date: 2025.04.14
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.4.2-bin.zip
-> SHA512 Checksum: 52fbd79f5e7256e7d04edc8f640bb8d918e837fedd1e64642beb2b2b25e3525b5f5a4c92235f88f6f7b59bfcdf096e4ea52ab85bfef0b69274334470017a2c5b - -V1.3.4.2 enhances the data synchronization function by supporting bi-directional active-active synchronization of data forwarded through external PIPE sources. - -#### V1.3.4.1 - -> Release Date: 2025.01.08
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.4.1-bin.zip
-> SHA512 Checksum: e9d46516f1f25732a93cc915041a8e59bca77cf8a1018c89d18ed29598540c9f2bdf1ffae9029c87425cecd9ecb5ebebea0334c7e23af11e28d78621d4a78148 - -V1.3.4.1 introduces pattern matching functions, continuously optimizes the data subscription mechanism, improves stability, and extends import-data/export-data scripts to support new data types while unifying TsFile, CSV and SQL import/export formats. Comprehensive improvements have been made to database monitoring, performance and stability. Key updates: - -* Query Module: Configurable URI-based JAR loading for UDFs, PipePlugins, Triggers and AINodes -* System Module: Extended UDF functionality with new pattern\_match function -* Data Sync: Supports specifying authentication info at sender -* Ecosystem: Kubernetes Operator support -* Scripts: import-data/export-data now supports strings, BLOBs, dates and timestamps -* Scripts: Unified import/export support for TsFile, CSV and SQL formats - -#### V1.3.3.3 - -> Release Date: 2024.10.31
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.3-bin.zip
-> SHA512 Checksum: 4a3eceda479db3980e9c8058628e71ba5a16fbfccf70894e8181aea5e014c7b89988d0093f6d42df29d478340a33878602a3924bec13f442a48611cec4e0e961 - -V1.3.3.3 improves restart recovery performance, enables DataNodes to actively monitor/load TsFiles with observability metrics, supports automatic loading at receivers when senders transfer files to specified directories, and adds Alter Source capability for Pipes. Comprehensive improvements to monitoring, performance and stability include: - -* Data Sync: Automatic type conversion for inconsistent data at receivers -* Data Sync: Enhanced observability with ops/latency metrics for internal APIs -* Data Sync: OPC-UA sink plugin supports CS mode and non-anonymous access -* Subscription: SDK supports create\_if\_not\_exists and drop\_if\_exists APIs -* Stream Processing: Alter Pipe supports Alter Source -* System: Added latency monitoring for REST module -* Scripts: Auto-loading TsFiles from specified directories -* Scripts: import-tsfile supports remote server execution -* Scripts: Kubernetes Helm support -* Scripts: Python client supports new data types (string, BLOB, date, timestamp) - -#### V1.3.3.2 - -> Release Date: 2024.08.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.2-bin.zip
-> SHA512 Checksum: 32733610da40aa965e5e9263a869d6e315c5673feaefad43b61749afcf534926398209d9ca7fff866c09deb92c09d950c583cea84be5a6aa2c315e1c7e8cfb74 - -V1.3.3.2 adds metrics for mods file reading time, merge sort memory usage and dispatch latency, supports configurable time partition origin adjustment, enables automatic subscription termination based on pipe completion markers, and improves merge memory control. Key updates: - -* Query: Explain Analyze shows mods file read time -* Query: Explain Analyze shows merge sort memory and dispatch latency -* Storage: Added configurable file splitting during compaction -* System: Configurable time partition origin -* Stream Processing: Auto-terminate subscriptions on pipe completion markers -* Data Sync: Configurable RPC compression levels -* Scripts: Export filters only root.\_\_system paths - -#### V1.3.3.1 - -> Release Date: 2024.07.12
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.1-bin.zip
-> SHA512 Checksum: 1fdffbc1f18bfabfa3463a5a6fbc4f6ba6ab686942f9e85e7e6be1840fb8700e0147e5e73fd52201656ae6adb572cc2e5ecc61bcad6fa4c5a4048c4207e3c6c0 - -V1.3.3.1 adds tiered storage throttling, supports username/password auth specification at sync senders, optimizes ambiguous WARN logs at receivers, improves restart performance, and merges configuration files. Key updates: - -* Query: Optimized Filter performance for faster aggregation/WHERE queries -* Query: Java Session evenly distributes SQL requests across nodes -* System: Merged config files into iotdb-system.properties -* Storage: Added tiered storage throttling -* Data Sync: Username/password auth specification at senders -* System: Optimized restart recovery time - -#### V1.3.2.2 - -> Release Date: 2024.06.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.2.2-bin.zip
-> SHA512 Checksum: ad73212a0b5025d18d2481163f6b2d4f604e06eb5e391cc6cba7bf4e42792e115b527ed8bfb5cd95d20a150645c8b4d56a531889dac229ce0f63139a27267322 - -V1.3.2.2 introduces EXPLAIN ANALYZE for SQL profiling, UDAF framework, automatic data deletion at disk thresholds, metadata sync, path-specific data point counting, and SQL import/export scripts. Supports rolling cluster upgrades and cluster-wide plugin distribution with comprehensive monitoring/performance improvements. Key updates: - -* Storage: Improved insertRecords performance -* Storage: SpaceTL feature for auto-deletion at disk thresholds -* Query: EXPLAIN ANALYZE for SQL stage-level profiling -* Query: New UDAF framework -* Query: New envelope demodulation analysis in UDFs -* Query: MaxBy/MinBy functions returning timestamps with values -* Query: Faster value-filtered queries -* Data Sync: Wildcard path matching -* Data Sync: Metadata synchronization (including attributes/permissions) -* Stream Processing: ALTER PIPE for hot plugin updates -* System: TsFile load statistics in data point counting -* Scripts: Local upgrade/backup via hard links -* Scripts: New export-data/import-data for CSV/TsFile/SQL formats -* Scripts: Windows window title differentiation for ConfigNode/DataNode/Cli - -#### V1.3.1.4 - -> Release Date: 2024.04.23
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.1.4-bin.zip
-> SHA512 Checksum: 8547702061d52e2707c750a624730eb2d9b605b60661efa3c8f11611ca1685aeb51b6f8a93f94c1b30bf2e8764139489c9fbb76cf598cfa8bf9c874b2a7c57eb - -V1.3.1.4 adds cluster activation status viewing, variance/stddev aggregation functions, FILL timeout settings, TsFile repair command, one-click info collection scripts, and cluster control scripts while optimizing views and stream processing. Key updates: - -* Query: FILL clause timeout threshold -* Query: REST V2 returns column types -* Data Sync: Simplified time range specification -* Data Sync: SSL support (iotdb-thrift-ssl-sink) -* System: SQL query for cluster activation status -* System: Tiered storage transfer rate control -* System: Enhanced observability (node divergence, task scheduling) -* System: Optimized default logging -* Scripts: One-click cluster control scripts (start-all/stop-all) -* Scripts: One-click info collection scripts (collect-info) - -#### V1.3.0.4 - -> Release Date: 2024.01.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.0.4-bin.zip
-> SHA512 Checksum: 3c07798f37c07e776e5cd24f758e8aaa563a2aae0fb820dad5ebf565ad8a76c765b896d44e7fdb7dad2e46ffd4262af901c765f9bf6af926bc62103118e38951 - -V1.3.0.4 introduces the AINode machine learning framework, upgrades permission granularity to time-series level, and optimizes views/stream processing for better usability and stability. Key updates: - -* Query: New AINode ML framework -* Query: Fixed slow SHOW PATH responses -* Security: Time-series granular permissions -* Security: SSL client-server encryption -* Stream Processing: New metrics monitoring -* Query: LAST queries on non-writable views -* System: Improved data point counting accuracy - -#### V1.2.0.1 - -> Release Date: 2023.06.30
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.2.0.1-bin.zip
-> SHA512 Checksum: dcf910d0c047d148a6c52fa9ee03a4d6bc3ff2a102dc31c0864695a25268ae933a274b093e5f3121689063544d7c6b3b635e5e87ae6408072e8705b3c4e20bf0 - -V1.2.0.1 introduces stream processing framework, dynamic templates, substring/replace/round functions, enhances SHOW REGION/TIMESERIES/VARIABLE statements and Session APIs while optimizing monitoring metrics. Key updates: - -* Stream Processing: New framework -* Metadata: Dynamic template expansion -* Storage: New SPRINTZ/RLBE encoding and LZMA2 compression -* Query: New CAST, ROUND, SUBSTR, REPLACE functions -* Query: New TIME\_DURATION, MODE aggregation -* Query: CASE WHEN syntax support -* Query: ORDER BY expression support -* Interface: Python API multi-node connection -* Interface: Python client write redirection -* Interface: Batch sequence creation via templates - -#### V1.1.0.1 - -> Release Date: 2023.04.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.1.0.1.zip
-> SHA512 Checksum: 58df58fc8b11afeec8436678842210ec092ac32f6308656d5356b7819acc199f1aec4b531635976b091b61d6736f0d9706badcabeaa5de50939e5c331c1dc804 - -V1.1.0.1 introduces GROUP BY VARIATION/CONDITION, DIFF/COUNT\_IF functions, and pipeline execution engine while fixing issues including: - -* Aligned sequence LAST queries with ORDER BY TIMESERIES -* LIMIT & OFFSET failures -* Post-restart metadata template errors -* Sequence creation after database deletion - -Key updates: - -* Query: ALIGN BY DEVICE supports ORDER BY TIME -* Query: SHOW QUERIES/KILL QUERY commands -* System: SHOW REGIONS per database -* System: SHOW VARIABLES for cluster parameters -* Query: GROUP BY VARIATION/CONDITION -* Query: SELECT INTO type casting -* Query: New DIFF (scalar), COUNT\_IF (aggregate) -* System: SHOW REGIONS creation time -* System: Configurable dn\_rpc\_port/address - -## 2. Workbench (Console Tool) - - -| **Version** | **Description** | **Supported IoTDB Versions** | **SHA512 checksum** | -| ----------- | ------------------------------------------------------------ | ----------------------------------- | ------------------------------------------------------------ | -| V2.1.1 | Optimize the measuring point selection on the trend interface to support scenarios without devices | V2.0 and above | aa05fd4d9f33f07c0949bc2d6546bb4b9791ed5ea94bcef27e2bf51ea141ec0206f1c12466aced7bf3449e11ad68d65378d697f3d10cb4881024a83746029a65 | -| V2.0.1-beta | The first version of the V2.x series, supporting dual models of tree and table | V2.0 and above | 0ca0d5029874ed8ada9c7d1cb562370b3a46913eed66d39c08759287ccc8bf332cf80bb8861e788614b61ae5d53a9f5605f553e1a607e856f395eb5102e7cc4d | -| V1.5.7 | Optimize the point list by splitting point names into device names and points, ensure the point selection area supports horizontal scrolling, and align the export file column order with the page display. 
| All 1.x versions from V1.3.4 onward | d3cd4a63372ca5d6217b67dddf661980c6a442b3b1564235e9ad34fc254d681febd58c2cc59c6273ffbfd8a1b003b9adb130ecfaaebe1942003b0d07427b1fcc | -| V1.5.6 | Enhanced CSV import/export: optional tags/aliases on import; support for measurement descriptions with backtick-quoted quotes on export. | All 1.x versions from V1.3.4 onward | 276ac1ea341f468bf6d29489c9109e9aa61afe2d1caaab577bc40603c6f4120efccc36b65a58a29ce6a266c21b46837aad6128f84ba5e676231ea9e6284a35e5 | -| V1.5.5 | Added server clock functionality and support for activating Enterprise Edition license databases | All 1.x versions from V1.3.4 onward | b18d01b70908d503a25866d1cc69d14e024d5b10ca6fcc536932fdbef8257c66e53204663ce3be5548479911aca238645be79dfd7ee7e65a07ab3c0f68c497f6 | -| V1.5.4 | Added authentication for Prometheus settings in Instance Management | All 1.x versions from V1.3.4 onward | adc7e13576913f9e43a9671fed02911983888da57be98ec8fbbb2593600d310f69619d32b22b569520c88e29f100d7ccae995b20eba757dbb1b2825655719335 | -| V1.5.1 | Added AI analysis and pattern matching | All 1.x versions from V1.3.2 onward | 4f2053a2a3b2b255ce195268d6cd245278f3be32ba4cf68be1552c386d78ed4424f7bdc9d8e68c6b8260b3e398c8fd23ff342439c4e88e1e777c62640d2279f9 | -| V1.4.0 | Added tree model display and English UI | All 1.x versions from V1.3.2 onward | 734077f3bb5e1719d20b319d8b554ce30718c935cb0451e02b2c9267ff770e9c2d63b958222f314f16c2e6e62bf78b643255249b574ee6f37d00e123433981e8 | -| V1.3.1 | Enhanced analysis methods and import templates | All 1.x versions from V1.3.2 onward | 134f87101cc7f159f8a22ac976ad2a3a295c5435058ee0a15160892aac46ac61dd3cfb0633b4aea9cc7415bf904d0ae65aaf77d663f027d864204d81fb34768b | -| V1.3.0 | Added DB configuration and UI refinements | All 1.x versions from V1.3.2 onward | 94a137fc5c681b211f3e076472a9c5875d59e7f0cd6d7409cb8f66bb9e4f87577a0f12dd500e2bcb99a435860c82183e4a6514b638bcb4aecfb48f184730f3f1 | -| V1.2.6 | Optimized permission controls | All 1.x versions from V1.3.1 
onward | f345b7edcbe245a561cb94ec2e4f4d40731fe205f134acadf5e391e5874c5c2477d9f75f15dbaf36c3a7cb6506823ac6fbc2a0ccce484b7c4cc71ec0fbdd9901 | -| V1.2.5 | Added "Common Templates" and caching | All 1.x versions from V1.3.0 onward | 37376b6cfbef7df8496e255fc33627de01bd68f636e50b573ed3940906b6f3da1e8e8b25260262293b8589718f5a72180fa15e5823437bf6dc51ed7da0c583f7 | -| V1.2.4 | Added import/export for calculations, time alignment field | All 1.x versions from V1.2.2 onward | 061ad1add38c109c1a90b06f1ddb7797bd45e84a34a4f77154ee48b90bdc7ecccc1e25eaa53fbbc98170d99facca93e3536192dd8d10a50ce505f59923ce6186 | -| V1.2.3 | Added activation details and analysis features | All 1.x versions from V1.2.2 onward | 254f5b7451300f6f99937d27fd7a5b20847d5293f53e0eaf045ac9235c7ea011785716b800014645ed5d2161078b37e1d04f3c59589c976614fb801c4da982e1 | -| V1.2.2 | Optimized point description display | All 1.x versions from V1.2.2 onward | 062e520d010082be852d6db0e2a3aa6de594eb26aeb608da28a212726e378cd4ea30fca5e1d2c3231ebd8de29e94ca9641f1fabc1cea46acfb650c37b7681b4e | -| V1.2.1 | Added sync monitoring panel, Prometheus hints | All 1.x versions from V1.2.2 onward | 8a3bcf87982ad5004528829b121f2d3945429deb77069917a42a8c8d2e2e2a2c24a398aaa87003920eeacc0c692f1ed39eac52a696887aa085cce011f0ddd745 | -| V1.2.0 | Major Workbench upgrade | All 1.x versions from V1.2.0 onward | ea1f7d3a4c0c6476a195479e69bbd3b3a2da08b5b2bb70b0a4aba988a28b5db5a209d4e2c697eb8095dfdf130e29f61f2ddf58c5b51d002c8d4c65cfc13106b3 | diff --git a/src/UserGuide/Master/Table/QuickStart/QuickStart_timecho.md b/src/UserGuide/Master/Table/QuickStart/QuickStart_timecho.md deleted file mode 100644 index 918214f34..000000000 --- a/src/UserGuide/Master/Table/QuickStart/QuickStart_timecho.md +++ /dev/null @@ -1,94 +0,0 @@ - - -# Quick Start - -This document will guide you through methods to get started quickly with IoTDB. - -## 1. How to Install and Deploy? - -This guide will assist you in quickly installing and deploying IoTDB. 
You can quickly navigate to the content you need to review through the following document links: - -1. Prepare the necessary machine resources: The deployment and operation of IoTDB require consideration of various aspects of machine resource configuration. For specific resource configurations, please refer to [Database Resource](../Deployment-and-Maintenance/Database-Resources_timecho.md) - -2. Complete system configuration preparations: IoTDB's system configuration involves multiple aspects. For an introduction to key system configurations, please see [System Requirements](../Deployment-and-Maintenance/Environment-Requirements.md) - -3. Obtain the installation package: You can contact the Timecho Team to get the IoTDB installation package to ensure you download the latest and most stable version. For the specific structure of the installation package, please refer to [Obtain TimechoDB](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) - -4. Install the database and activate it: Depending on your actual deployment architecture, you can choose from the following tutorials for installation and deployment: - - - Stand-Alone Deployment: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - - - Distributed (Cluster) Deployment: [Distributed (Cluster) Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - - - Dual-Active Deployment: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -> ❗️Note: We currently still recommend direct installation and deployment on physical/virtual machines. For Docker deployment, please refer to [Docker Deployment](../Deployment-and-Maintenance/Docker-Deployment_timecho.md) - -5.
Install supporting tools: TimechoDB provides supporting tools such as a monitoring panel. Installing them when you deploy TimechoDB is recommended, as they make the database easier to operate: - - - Monitoring Panel: Provides hundreds of database monitoring metrics for fine-grained monitoring of IoTDB and its operating system, assisting in system optimization, performance tuning, and bottleneck detection. For installation steps, please see [Monitoring-panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - -## 2. How to Use IoTDB? - -1. Database Modeling Design: Database modeling is a crucial step in creating a database system, involving the design of data structures and relationships to ensure that the organization of data meets the needs of specific applications. The following documents will help you quickly understand IoTDB's modeling design: - - - Introduction to Time Series Concepts: [Navigating Time Series Data](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - - - Introduction to Modeling Design: [Data Model and Terminology](../Background-knowledge/Data-Model-and-Terminology_timecho.md) - - - Introduction to Database: [Database Management](../Basic-Concept/Database-Management_timecho.md) - - - Introduction to Tables: [Table Management](../Basic-Concept/Table-Management_timecho.md) - -2. Data Insertion & Updates: IoTDB provides multiple methods for inserting real-time data, and supports data write-back. For basic data insertion and updating operations, please see [Write & Update Data](../Basic-Concept/Write-Updata-Data_timecho.md) - -3. Data Querying: IoTDB offers a rich set of data querying capabilities. For a basic introduction to data querying, please see [Query Data](../Basic-Concept/Query-Data.md). IoTDB also provides pattern queries and window functions for time-series feature analysis.
For detailed introductions, please refer to [Pattern Query](../User-Manual/Pattern-Query_timecho.md) and [Window Function](../User-Manual/Window-Function_timecho.md). - -4. Data Deletion: IoTDB supports two deletion methods: SQL-based deletion and automatic expiration deletion (TTL). - - - SQL-Based Deletion: For a basic introduction, please refer to [Delete Data](../Basic-Concept/Delete-Data.md) - - - Automatic Expiration Deletion (TTL): For a basic introduction, please see [TTL Delete Data](../Basic-Concept/TTL-Delete-Data_timecho.md) - -5. Advanced Features: In addition to common database functions such as insertion and querying, IoTDB also supports features such as data synchronization. For specific usage methods, please refer to the respective documents: - - - Data Synchronization: [Data Sync](../User-Manual/Data-Sync_timecho.md) - -6. Application Programming Interfaces (APIs): IoTDB provides various application programming interfaces (APIs) to facilitate developers' interaction with IoTDB in applications. Currently supported interfaces include [Java Native API](../API/Programming-Java-Native-API_timecho.md), [Python Native API](../API/Programming-Python-Native-API_timecho.md), [JDBC](../API/Programming-JDBC_timecho.md), and more. For more programming interfaces, please refer to the [Application Programming Interfaces] section on the official website. - -## 3. What other convenient tools are available? - -In addition to its own rich features, IoTDB is complemented by a comprehensive suite of peripheral tools. This guide will help you quickly understand and use these tools: - - - Monitoring Panel: A tool for detailed monitoring of IoTDB and its operating system, covering hundreds of database monitoring indicators such as database performance and system resources, to support system optimization and bottleneck identification. For a detailed usage introduction, please refer to [Monitor Tool](../Tools-System/Monitor-Tool_timecho.md) - - -## 4.
Want to learn more technical details? - -If you want to explore IoTDB’s internal mechanisms further, refer to the following documents: - - - Data Partitioning and Load Balancing: IoTDB is designed with a partitioning strategy and load balancing algorithm to enhance cluster availability and performance. For more details, please see [Cluster data partitioning](../Technical-Insider/Cluster-data-partitioning.md) - - - Compression & Encoding: IoTDB employs various encoding and compression techniques optimized for different data types to improve storage efficiency. For more details, please see [Encoding and Compression](../Technical-Insider/Encoding-and-Compression.md) - - diff --git a/src/UserGuide/Master/Table/Reference/System-Config-Manual_timecho.md b/src/UserGuide/Master/Table/Reference/System-Config-Manual_timecho.md deleted file mode 100644 index f96a33713..000000000 --- a/src/UserGuide/Master/Table/Reference/System-Config-Manual_timecho.md +++ /dev/null @@ -1,3374 +0,0 @@ - - -# Config Manual - -## 1. IoTDB Configuration Files - -The configuration files for IoTDB are located in the `conf` folder under the IoTDB installation directory. Key configuration files include: - -1. `confignode-env.sh` **/** `confignode-env.bat`: - 1. Environment configuration file for ConfigNode. - 2. Used to configure memory size and other environment settings for ConfigNode. -2. `datanode-env.sh` **/** `datanode-env.bat`: - 1. Environment configuration file for DataNode. - 2. Used to configure memory size and other environment settings for DataNode. -3. `iotdb-system.properties`: - 1. Main configuration file for IoTDB. - 2. Contains configurable parameters for IoTDB. -4. `iotdb-system.properties.template`: - 1. Template for the `iotdb-system.properties` file. - 2. Provides a reference for all available configuration parameters. - -## 2. 
Modify Configurations - -### 2.1 **Modify Existing Parameters**: - -- Parameters already present in the `iotdb-system.properties` file can be directly modified. - -### 2.2 **Adding New Parameters**: - -- For parameters not listed in `iotdb-system.properties`, you can find them in the `iotdb-system.properties.template` file. -- Copy the desired parameter from the template file to `iotdb-system.properties` and modify its value. - -### 2.3 Configuration Update Methods - -Different configuration parameters have different update methods, categorized as follows: - -1. **Modify before the first startup.**: - 1. These parameters can only be modified before the first startup of ConfigNode/DataNode. - 2. Modifying them after the first startup will prevent ConfigNode/DataNode from starting. -2. **Restart Required for Changes to Take Effect**: - 1. These parameters can be modified after ConfigNode/DataNode has started. - 2. However, a restart of ConfigNode/DataNode is required for the changes to take effect. -3. **Hot Reload**: - 1. These parameters can be modified while ConfigNode/DataNode is running. - 2. After modification, use the following SQL commands to apply the changes: - - `load configuration`: Reloads the configuration. - - `set configuration key1 = 'value1'`: Updates specific configuration parameters. - -## 3. Environment Parameters - -The environment configuration files (`confignode-env.sh/bat` and `datanode-env.sh/bat`) are used to configure Java environment parameters for ConfigNode and DataNode, such as JVM settings. These configurations are passed to the JVM when ConfigNode or DataNode starts. - -### 3.1 **confignode-env.sh/bat** - -- MEMORY_SIZE - -| Name | MEMORY_SIZE | -| ----------- | ------------------------------------------------------------ | -| Description | Memory size allocated when IoTDB ConfigNode starts. | -| Type | String | -| Default | Depends on the operating system and machine configuration. 
Defaults to 3/10 of the machine's memory, capped at 16G. | -| Effective | Restart required | - -- ON_HEAP_MEMORY - -| Name | ON_HEAP_MEMORY | -| ----------- | ------------------------------------------------------------ | -| Description | On-heap memory size available for IoTDB ConfigNode. Previously named `MAX_HEAP_SIZE`. | -| Type | String | -| Default | Depends on the `MEMORY_SIZE` configuration. | -| Effective | Restart required | - -- OFF_HEAP_MEMORY - -| Name | OFF_HEAP_MEMORY | -| ----------- | ------------------------------------------------------------ | -| Description | Off-heap memory size available for IoTDB ConfigNode. Previously named `MAX_DIRECT_MEMORY_SIZE`. | -| Type | String | -| Default | Depends on the `MEMORY_SIZE` configuration. | -| Effective | Restart required | - -### 3.2 **datanode-env.sh/bat** - -- MEMORY_SIZE - -| Name | MEMORY_SIZE | -| ----------- | ------------------------------------------------------------ | -| Description | Memory size allocated when IoTDB DataNode starts. | -| Type | String | -| Default | Depends on the operating system and machine configuration. Defaults to 1/2 of the machine's memory. | -| Effective | Restart required | - -- ON_HEAP_MEMORY - -| Name | ON_HEAP_MEMORY | -| ----------- | ------------------------------------------------------------ | -| Description | On-heap memory size available for IoTDB DataNode. Previously named `MAX_HEAP_SIZE`. | -| Type | String | -| Default | Depends on the `MEMORY_SIZE` configuration. | -| Effective | Restart required | - -- OFF_HEAP_MEMORY - -| Name | OFF_HEAP_MEMORY | -| ----------- | ------------------------------------------------------------ | -| Description | Off-heap memory size available for IoTDB DataNode. Previously named `MAX_DIRECT_MEMORY_SIZE`. | -| Type | String | -| Default | Depends on the `MEMORY_SIZE` configuration. | -| Effective | Restart required | - -## 4. 
System Parameters (`iotdb-system.properties.template`) - -The `iotdb-system.properties` file contains various configurations for managing IoTDB clusters, nodes, replication, directories, monitoring, SSL, connections, object storage, tier management, and REST services. Below is a detailed breakdown of the parameters: - -### 4.1 Cluster Configuration - -- cluster_name - -| Name | cluster_name | -| ----------- | --------------------------------------------------------- | -| Description | Name of the cluster. | -| Type | String | -| Default | default_cluster | -| Effective | Use CLI: `set configuration cluster_name='xxx'`. | -| Note | Changes are distributed across nodes. Changes may not propagate to all nodes in case of network issues or node failures. Nodes that fail to update must manually modify `cluster_name` in their configuration files and restart. Under normal circumstances, it is not recommended to modify `cluster_name` by manually modifying configuration files or to perform hot-loading via `load configuration` method. | - -### 4.2 Seed ConfigNode - -- cn_seed_config_node - -| Name | cn_seed_config_node | -| ----------- | ------------------------------------------------------------ | -| Description | Address of the seed ConfigNode for Confignode to join the cluster. | -| Type | String | -| Default | 127.0.0.1:10710 | -| Effective | Modify before the first startup. | - -- dn_seed_config_node - -| Name | dn_seed_config_node | -| ----------- | ------------------------------------------------------------ | -| Description | Address of the seed ConfigNode for Datanode to join the cluster. | -| Type | String | -| Default | 127.0.0.1:10710 | -| Effective | Modify before the first startup. | - -### 4.3 Node RPC Configuration - -- cn_internal_address - -| Name | cn_internal_address | -| ----------- | ---------------------------------------------- | -| Description | Internal address for ConfigNode communication. 
| -| Type | String | -| Default | 127.0.0.1 | -| Effective | Modify before the first startup. | - -- cn_internal_port - -| Name | cn_internal_port | -| ----------- | ------------------------------------------- | -| Description | Port for ConfigNode internal communication. | -| Type | Short Int : [0,65535] | -| Default | 10710 | -| Effective | Modify before the first startup. | - -- cn_consensus_port - -| Name | cn_consensus_port | -| ----------- | ----------------------------------------------------- | -| Description | Port for ConfigNode consensus protocol communication. | -| Type | Short Int : [0,65535] | -| Default | 10720 | -| Effective | Modify before the first startup. | - -- dn_rpc_address - -| Name | dn_rpc_address | -| ----------- |---------------------------------| -| Description | Address for client RPC service. | -| Type | String | -| Default | 127.0.0.1 | -| Effective | Restart required. | - -- dn_rpc_port - -| Name | dn_rpc_port | -| ----------- | ---------------------------- | -| Description | Port for client RPC service. | -| Type | Short Int : [0,65535] | -| Default | 6667 | -| Effective | Restart required. | - -- dn_internal_address - -| Name | dn_internal_address | -| ----------- | -------------------------------------------- | -| Description | Internal address for DataNode communication. | -| Type | string | -| Default | 127.0.0.1 | -| Effective | Modify before the first startup. | - -- dn_internal_port - -| Name | dn_internal_port | -| ----------- | ----------------------------------------- | -| Description | Port for DataNode internal communication. | -| Type | int | -| Default | 10730 | -| Effective | Modify before the first startup. | - -- dn_mpp_data_exchange_port - -| Name | dn_mpp_data_exchange_port | -| ----------- | -------------------------------- | -| Description | Port for MPP data exchange. | -| Type | int | -| Default | 10740 | -| Effective | Modify before the first startup. 
| - -- dn_schema_region_consensus_port - -| Name | dn_schema_region_consensus_port | -| ----------- | ------------------------------------------------------------ | -| Description | Port for Datanode SchemaRegion consensus protocol communication. | -| Type | int | -| Default | 10750 | -| Effective | Modify before the first startup. | - -- dn_data_region_consensus_port - -| Name | dn_data_region_consensus_port | -| ----------- | ------------------------------------------------------------ | -| Description | Port for Datanode DataRegion consensus protocol communication. | -| Type | int | -| Default | 10760 | -| Effective | Modify before the first startup. | - -- dn_join_cluster_retry_interval_ms - -| Name | dn_join_cluster_retry_interval_ms | -| ----------- | --------------------------------------------------- | -| Description | Interval for DataNode to retry joining the cluster. | -| Type | long | -| Default | 5000 | -| Effective | Restart required. | - -### 4.4 Replication configuration - -- config_node_consensus_protocol_class - -| Name | config_node_consensus_protocol_class | -| ----------- | ------------------------------------------------------------ | -| Description | Consensus protocol for ConfigNode replication, only supports RatisConsensus | -| Type | String | -| Default | org.apache.iotdb.consensus.ratis.RatisConsensus | -| Effective | Modify before the first startup. | - -- schema_replication_factor - -| Name | schema_replication_factor | -| ----------- | ------------------------------------------------------------ | -| Description | Default schema replication factor for databases. | -| Type | int32 | -| Default | 1 | -| Effective | Restart required. Takes effect on the new database after restarting. | - -- schema_region_consensus_protocol_class - -| Name | schema_region_consensus_protocol_class | -| ----------- | ------------------------------------------------------------ | -| Description | Consensus protocol for schema region replication. 
Only RatisConsensus is supported when multiple replicas are used. | -| Type | String | -| Default | org.apache.iotdb.consensus.ratis.RatisConsensus | -| Effective | Modify before the first startup. | - -- data_replication_factor - -| Name | data_replication_factor | -| ----------- | ------------------------------------------------------------ | -| Description | Default data replication factor for databases. | -| Type | int32 | -| Default | 1 | -| Effective | Restart required. Takes effect on the new database after restarting. | - -- data_region_consensus_protocol_class - -| Name | data_region_consensus_protocol_class | -| ----------- | ------------------------------------------------------------ | -| Description | Consensus protocol for data region replication. IoTConsensus or RatisConsensus is supported when multiple replicas are used. | -| Type | String | -| Default | org.apache.iotdb.consensus.iot.IoTConsensus | -| Effective | Modify before the first startup. | - -### 4.5 Directory configuration - -- cn_system_dir - -| Name | cn_system_dir | -| ----------- | ----------------------------------------------------------- | -| Description | System data storage path for ConfigNode. | -| Type | String | -| Default | data/confignode/system(Windows:data\\confignode\\system) | -| Effective | Restart required | - -- cn_consensus_dir - -| Name | cn_consensus_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Consensus protocol data storage path for ConfigNode. | -| Type | String | -| Default | data/confignode/consensus(Windows:data\\confignode\\consensus) | -| Effective | Restart required | - -- cn_pipe_receiver_file_dir - -| Name | cn_pipe_receiver_file_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Directory for pipe receiver files in ConfigNode.
| -| Type | String | -| Default | data/confignode/system/pipe/receiver(Windows:data\\confignode\\system\\pipe\\receiver) | -| Effective | Restart required | - -- dn_system_dir - -| Name | dn_system_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Schema storage path for DataNode. By default, it is stored in the data directory at the same level as the sbin directory. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/system(Windows:data\\datanode\\system) | -| Effective | Restart required | - -- dn_data_dirs - -| Name | dn_data_dirs | -| ----------- | ------------------------------------------------------------ | -| Description | Data storage path for DataNode. By default, it is stored in the data directory at the same level as the sbin directory. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/data(Windows:data\\datanode\\data) | -| Effective | Restart required | - -- dn_multi_dir_strategy - -| Name | dn_multi_dir_strategy | -| ----------- | ------------------------------------------------------------ | -| Description | The strategy used by IoTDB to select directories in `data_dirs` for TsFiles. You can use either the simple class name or the fully qualified class name. The system provides the following two strategies: 1. SequenceStrategy: IoTDB selects directories sequentially, iterating through all directories in `data_dirs` in a round-robin manner. 2. MaxDiskUsableSpaceFirstStrategy IoTDB prioritizes the directory in `data_dirs` with the largest disk free space. To implement a custom strategy: 1. Inherit the `org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy `class and implement your own strategy method. 2. 
Fill in the configuration item with the fully qualified class name of your implementation (package name + class name, e.g., `UserDefineStrategyPackage`). 3. Add the JAR file containing your custom class to the project. | -| Type | String | -| Default | SequenceStrategy | -| Effective | Hot reload. | - -- dn_consensus_dir - -| Name | dn_consensus_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Consensus log storage path for DataNode. By default, it is stored in the data directory at the same level as the sbin directory. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/consensus(Windows:data\\datanode\\consensus) | -| Effective | Restart required | - -- dn_wal_dirs - -| Name | dn_wal_dirs | -| ----------- | ------------------------------------------------------------ | -| Description | Write-ahead log (WAL) storage path for DataNode. By default, it is stored in the data directory at the same level as the sbin directory. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/wal(Windows:data\\datanode\\wal) | -| Effective | Restart required | - -- dn_tracing_dir - -| Name | dn_tracing_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Tracing root directory for DataNode. By default, it is stored in the data directory at the same level as the sbin directory. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. 
| -| Type | String | -| Default | datanode/tracing(Windows:datanode\\tracing) | -| Effective | Restart required | - -- dn_sync_dir - -| Name | dn_sync_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Sync storage path for DataNode.By default, it is stored in the data directory at the same level as the sbin directory. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/sync(Windows:data\\datanode\\sync) | -| Effective | Restart required | - -- sort_tmp_dir - -| Name | sort_tmp_dir | -| ----------- | ------------------------------------------------- | -| Description | Temporary directory for sorting operations. | -| Type | String | -| Default | data/datanode/tmp(Windows:data\\datanode\\tmp) | -| Effective | Restart required | - -- dn_pipe_receiver_file_dirs - -| Name | dn_pipe_receiver_file_dirs | -| ----------- | ------------------------------------------------------------ | -| Description | Directory for pipe receiver files in DataNode. | -| Type | String | -| Default | data/datanode/system/pipe/receiver(Windows:data\\datanode\\system\\pipe\\receiver) | -| Effective | Restart required | - -- iot_consensus_v2_receiver_file_dirs - -| Name | iot_consensus_v2_receiver_file_dirs | -| ----------- | ------------------------------------------------------------ | -| Description | Directory for IoTConsensus V2 receiver files. | -| Type | String | -| Default | data/datanode/system/pipe/consensus/receiver(Windows:data\\datanode\\system\\pipe\\consensus\\receiver) | -| Effective | Restart required | - -- iot_consensus_v2_deletion_file_dir - -| Name | iot_consensus_v2_deletion_file_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Directory for IoTConsensus V2 deletion files. 
| -| Type | String | -| Default | data/datanode/system/pipe/consensus/deletion(Windows:data\\datanode\\system\\pipe\\consensus\\deletion) | -| Effective | Restart required | - -### 4.6 Metric Configuration - -- cn_metric_reporter_list - -| Name | cn_metric_reporter_list | -| ----------- | ----------------------------------------- | -| Description | Systems for reporting ConfigNode metrics. | -| Type | String | -| Default | None | -| Effective | Restart required. | - -- cn_metric_level - -| Name | cn_metric_level | -| ----------- | --------------------------------------- | -| Description | Level of detail for ConfigNode metrics. | -| Type | String | -| Default | IMPORTANT | -| Effective | Restart required. | - -- cn_metric_async_collect_period - -| Name | cn_metric_async_collect_period | -| ----------- | ------------------------------------------------------------ | -| Description | Period for asynchronous metric collection in ConfigNode (in seconds). | -| Type | int | -| Default | 5 | -| Effective | Restart required. | - -- cn_metric_prometheus_reporter_port - -| Name | cn_metric_prometheus_reporter_port | -| ----------- | --------------------------------------------------- | -| Description | Port for Prometheus metric reporting in ConfigNode. | -| Type | int | -| Default | 9091 | -| Effective | Restart required. | - -- dn_metric_reporter_list - -| Name | dn_metric_reporter_list | -| ----------- | --------------------------------------- | -| Description | Systems for reporting DataNode metrics. | -| Type | String | -| Default | None | -| Effective | Restart required. | - -- dn_metric_level - -| Name | dn_metric_level | -| ----------- | ------------------------------------- | -| Description | Level of detail for DataNode metrics. | -| Type | String | -| Default | IMPORTANT | -| Effective | Restart required. 
| - -- dn_metric_async_collect_period - -| Name | dn_metric_async_collect_period | -| ----------- | ------------------------------------------------------------ | -| Description | Period for asynchronous metric collection in DataNode (in seconds). | -| Type | int | -| Default | 5 | -| Effective | Restart required. | - -- dn_metric_prometheus_reporter_port - -| Name | dn_metric_prometheus_reporter_port | -| ----------- | ------------------------------------------------- | -| Description | Port for Prometheus metric reporting in DataNode. | -| Type | int | -| Default | 9092 | -| Effective | Restart required. | - -- dn_metric_internal_reporter_type - -| Name | dn_metric_internal_reporter_type | -| ----------- | ------------------------------------------------------------ | -| Description | Internal reporter types for DataNode metrics. For internal monitoring and checking that the data has been successfully written and refreshed. | -| Type | String | -| Default | IOTDB | -| Effective | Restart required. | - -### 4.7 SSL Configuration - -- enable_thrift_ssl - -| Name | enable_thrift_ssl | -| ----------- | --------------------------------------------- | -| Description | Enables SSL encryption for RPC communication. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- enable_https - -| Name | enable_https | -| ----------- | ------------------------------ | -| Description | Enables SSL for REST services. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- key_store_path - -| Name | key_store_path | -| ----------- | ---------------------------- | -| Description | Path to the SSL certificate. | -| Type | String | -| Default | None | -| Effective | Restart required. | - -- key_store_pwd - -| Name | key_store_pwd | -| ----------- | --------------------------------- | -| Description | Password for the SSL certificate. | -| Type | String | -| Default | None | -| Effective | Restart required. 
| - -### 4.8 Connection Configuration - -- cn_rpc_thrift_compression_enable - -| Name | cn_rpc_thrift_compression_enable | -| ----------- | ----------------------------------- | -| Description | Enables Thrift compression for RPC. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- cn_rpc_max_concurrent_client_num - -| Name | cn_rpc_max_concurrent_client_num | -| ----------- |-------------------------------------------| -| Description | Maximum number of concurrent RPC clients. | -| Type | int | -| Default | 3000 | -| Effective | Restart required. | - -- cn_connection_timeout_ms - -| Name | cn_connection_timeout_ms | -| ----------- | ---------------------------------------------------- | -| Description | Connection timeout for ConfigNode (in milliseconds). | -| Type | int | -| Default | 60000 | -| Effective | Restart required. | - -- cn_selector_thread_nums_of_client_manager - -| Name | cn_selector_thread_nums_of_client_manager | -| ----------- | ------------------------------------------------------------ | -| Description | Number of selector threads for client management in ConfigNode. | -| Type | int | -| Default | 1 | -| Effective | Restart required. | - -- cn_max_client_count_for_each_node_in_client_manager - -| Name | cn_max_client_count_for_each_node_in_client_manager | -| ----------- | ------------------------------------------------------ | -| Description | Maximum clients per node in ConfigNode client manager. | -| Type | int | -| Default | 300 | -| Effective | Restart required. | - -- dn_session_timeout_threshold - -| Name | dn_session_timeout_threshold | -| ----------- | ---------------------------------------- | -| Description | Maximum idle time for DataNode sessions. | -| Type | int | -| Default | 0 | -| Effective | Restart required.
| - -- dn_rpc_thrift_compression_enable - -| Name | dn_rpc_thrift_compression_enable | -| ----------- | -------------------------------------------- | -| Description | Enables Thrift compression for DataNode RPC. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- dn_rpc_advanced_compression_enable - -| Name | dn_rpc_advanced_compression_enable | -| ----------- | ----------------------------------------------------- | -| Description | Enables advanced Thrift compression for DataNode RPC. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- dn_rpc_selector_thread_count - -| Name | dn_rpc_selector_thread_count | -| ----------- | -------------------------------------------- | -| Description | Number of selector threads for DataNode RPC. | -| Type | int | -| Default | 1 | -| Effective | Restart required. | - -- dn_rpc_min_concurrent_client_num - -| Name | dn_rpc_min_concurrent_client_num | -| ----------- | ------------------------------------------------------ | -| Description | Minimum number of concurrent RPC clients for DataNode. | -| Type | Short Int : [0,65535] | -| Default | 1 | -| Effective | Restart required. | - -- dn_rpc_max_concurrent_client_num - -| Name | dn_rpc_max_concurrent_client_num | -| ----------- |--------------------------------------------------------| -| Description | Maximum number of concurrent RPC clients for DataNode. | -| Type | Short Int : [0,65535] | -| Default | 1000 | -| Effective | Restart required. | - -- dn_thrift_max_frame_size - -| Name | dn_thrift_max_frame_size | -| ----------- |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | Maximum frame size for RPC requests/responses.
| -| Type | int | -| Default | Defaults to 0, which means the value is automatically calculated based on the DN JVM configuration parameters at startup:
a. min(64MB, dn_alloc_memory/64)
b. If the user manually configures `dn_thrift_max_frame_size`, the user-specified value will be used instead. | -| Effective | Restart required. | - -- dn_thrift_init_buffer_size - -| Name | dn_thrift_init_buffer_size | -| ----------- | ----------------------------------- | -| Description | Initial buffer size for Thrift RPC. | -| Type | long | -| Default | 1024 | -| Effective | Restart required. | - -- dn_connection_timeout_ms - -| Name | dn_connection_timeout_ms | -| ----------- | -------------------------------------------------- | -| Description | Connection timeout for DataNode (in milliseconds). | -| Type | int | -| Default | 60000 | -| Effective | Restart required. | - -- dn_selector_thread_count_of_client_manager - -| Name | dn_selector_thread_count_of_client_manager | -| ----------- | ------------------------------------------------------------ | -| Description | selector thread (TAsyncClientManager) nums for async thread in a clientManager | -| Type | int | -| Default | 1 | -| Effective | Restart required.t required. | - -- dn_max_client_count_for_each_node_in_client_manager - -| Name | dn_max_client_count_for_each_node_in_client_manager | -| ----------- | --------------------------------------------------- | -| Description | Maximum clients per node in DataNode clientmanager. | -| Type | int | -| Default | 300 | -| Effective | Restart required. | - -### 4.9 Object storage management - -- remote_tsfile_cache_dirs - -| Name | remote_tsfile_cache_dirs | -| ----------- | ---------------------------------------- | -| Description | Local cache directory for cloud storage. | -| Type | String | -| Default | data/datanode/data/cache | -| Effective | Restart required. | - -- remote_tsfile_cache_page_size_in_kb - -| Name | remote_tsfile_cache_page_size_in_kb | -| ----------- | --------------------------------------------- | -| Description | Block size for cached files in cloud storage. | -| Type | int | -| Default | 20480 | -| Effective | Restart required. 
| - -- remote_tsfile_cache_max_disk_usage_in_mb - -| Name | remote_tsfile_cache_max_disk_usage_in_mb | -| ----------- | ------------------------------------------- | -| Description | Maximum disk usage for cloud storage cache. | -| Type | long | -| Default | 51200 | -| Effective | Restart required. | - -- object_storage_type - -| Name | object_storage_type | -| ----------- | ---------------------- | -| Description | Type of cloud storage. | -| Type | String | -| Default | AWS_S3 | -| Effective | Restart required. | - -- object_storage_endpoint - -| Name | object_storage_endpoint | -| ----------- | --------------------------- | -| Description | Endpoint for cloud storage. | -| Type | String | -| Default | None | -| Effective | Restart required. | - -- object_storage_bucket - -| Name | object_storage_bucket | -| ----------- | ------------------------------ | -| Description | Bucket name for cloud storage. | -| Type | String | -| Default | iotdb_data | -| Effective | Restart required. | - -- object_storage_access_key - -| Name | object_storage_access_key | -| ----------- | ----------------------------- | -| Description | Access key for cloud storage. | -| Type | String | -| Default | None | -| Effective | Restart required. | - -- object_storage_access_secret - -| Name | object_storage_access_secret | -| ----------- | -------------------------------- | -| Description | Access secret for cloud storage. | -| Type | String | -| Default | None | -| Effective | Restart required. | - -### 4.10 Tier management - -- dn_default_space_usage_thresholds - -| Name | dn_default_space_usage_thresholds | -| ----------- | ------------------------------------------------------------ | -| Description | Disk usage threshold, data will be moved to the next tier when the usage of the tier is higher than this threshold.If tiered storage is enabled, please separate thresholds of different tiers by semicolons ";". | -| Type | double | -| Default | 0.85 | -| Effective | Hot reload. 
| - -- dn_tier_full_policy - -| Name | dn_tier_full_policy | -| ----------- | ------------------------------------------------------------ | -| Description | How to handle the last tier's data when its used space exceeds its dn_default_space_usage_thresholds value. | -| Type | String | -| Default | NULL | -| Effective | Hot reload. | - -- migrate_thread_count - -| Name | migrate_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | Thread pool size for migration operations across the DataNode's data directories. | -| Type | int | -| Default | 1 | -| Effective | Hot reload. | - -- tiered_storage_migrate_speed_limit_bytes_per_sec - -| Name | tiered_storage_migrate_speed_limit_bytes_per_sec | -| ----------- | ------------------------------------------------------------ | -| Description | Maximum migration speed between tiers, in bytes per second. | -| Type | int | -| Default | 10485760 | -| Effective | Hot reload. | - -### 4.11 REST Service Configuration - -- enable_rest_service - -| Name | enable_rest_service | -| ----------- | --------------------------- | -| Description | Whether the REST service is enabled. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- rest_service_port - -| Name | rest_service_port | -| ----------- | ------------------------------------ | -| Description | The port the REST service binds to. | -| Type | int32 | -| Default | 18080 | -| Effective | Restart required. | - -- enable_swagger - -| Name | enable_swagger | -| ----------- | ------------------------------------------------------------ | -| Description | Whether to expose REST service interface information through Swagger, e.g. http://ip:port/swagger.json | -| Type | Boolean | -| Default | false | -| Effective | Restart required.
| - -- rest_query_default_row_size_limit - -| Name | rest_query_default_row_size_limit | -| ----------- | ------------------------------------------------------------ | -| Description | the default row limit to a REST query response when the rowSize parameter is not given in request | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required. | - -- cache_expire_in_seconds - -| Name | cache_expire_in_seconds | -| ----------- | ------------------------------------------------------------ | -| Description | The expiration time of the user login information cache (in seconds) | -| Type | int32 | -| Default | 28800 | -| Effective | Restart required. | - -- cache_max_num - -| Name | cache_max_num | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of users can be stored in the user login cache. | -| Type | int32 | -| Default | 100 | -| Effective | Restart required. | - -- cache_init_num - -| Name | cache_init_num | -| ----------- | ------------------------------------------------------------ | -| Description | The initial capacity of users can be stored in the user login cache. | -| Type | int32 | -| Default | 10 | -| Effective | Restart required. | - -- client_auth - -| Name | client_auth | -| ----------- | --------------------------------- | -| Description | Is client authentication required | -| Type | boolean | -| Default | false | -| Effective | Restart required. | - -- trust_store_path - -| Name | trust_store_path | -| ----------- | -------------------- | -| Description | SSL trust store path | -| Type | String | -| Default | "" | -| Effective | Restart required. | - -- trust_store_pwd - -| Name | trust_store_pwd | -| ----------- | ------------------------- | -| Description | SSL trust store password. | -| Type | String | -| Default | "" | -| Effective | Restart required. 
| - -- idle_timeout_in_seconds - -| Name | idle_timeout_in_seconds | -| ----------- | ------------------------ | -| Description | SSL timeout (in seconds) | -| Type | int32 | -| Default | 5000 | -| Effective | Restart required. | - -### 4.12 Load balancing configuration - -- series_slot_num - -| Name | series_slot_num | -| ----------- | ------------------------------------------- | -| Description | Number of SeriesPartitionSlots per Database | -| Type | int32 | -| Default | 10000 | -| Effective | Modify before the first startup. | - -- series_partition_executor_class - -| Name | series_partition_executor_class | -| ----------- | ------------------------------------------------------------ | -| Description | SeriesPartitionSlot executor class | -| Type | String | -| Default | org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor | -| Effective | Modify before the first startup. | - -- schema_region_group_extension_policy - -| Name | schema_region_group_extension_policy | -| ----------- | ------------------------------------------------------------ | -| Description | The policy of extension SchemaRegionGroup for each Database. | -| Type | string | -| Default | AUTO | -| Effective | Restart required. | - -- default_schema_region_group_num_per_database - -| Name | default_schema_region_group_num_per_database | -| ----------- | ------------------------------------------------------------ | -| Description | When set schema_region_group_extension_policy=CUSTOM, this parameter is the default number of SchemaRegionGroups for each Database.When set schema_region_group_extension_policy=AUTO, this parameter is the default minimal number of SchemaRegionGroups for each Database. | -| Type | int | -| Default | 1 | -| Effective | Restart required. 
| - -- schema_region_per_data_node - -| Name | schema_region_per_data_node | -| ----------- | ------------------------------------------------------------ | -| Description | Takes effect only when schema_region_group_extension_policy=AUTO. This parameter is the maximum number of SchemaRegions expected to be managed by each DataNode. | -| Type | double | -| Default | 1.0 | -| Effective | Restart required. | - -- data_region_group_extension_policy - -| Name | data_region_group_extension_policy | -| ----------- | ---------------------------------------------------------- | -| Description | The extension policy of DataRegionGroups for each Database. | -| Type | string | -| Default | AUTO | -| Effective | Restart required. | - -- default_data_region_group_num_per_database - -| Name | default_data_region_group_num_per_database | -| ----------- | ------------------------------------------------------------ | -| Description | When data_region_group_extension_policy=CUSTOM, this parameter is the default number of DataRegionGroups for each Database. When data_region_group_extension_policy=AUTO, this parameter is the default minimum number of DataRegionGroups for each Database. | -| Type | int | -| Default | 2 | -| Effective | Restart required. | - -- data_region_per_data_node - -| Name | data_region_per_data_node | -| ----------- | ------------------------------------------------------------ | -| Description | Takes effect only when data_region_group_extension_policy=AUTO. This parameter is the maximum number of DataRegions expected to be managed by each DataNode. | -| Type | double | -| Default | 5.0 | -| Effective | Restart required. | - -- enable_auto_leader_balance_for_ratis_consensus - -| Name | enable_auto_leader_balance_for_ratis_consensus | -| ----------- | ------------------------------------------------------------ | -| Description | Whether to enable auto leader balance for Ratis consensus protocol.
| -| Type | Boolean | -| Default | true | -| Effective | Restart required. | - -- enable_auto_leader_balance_for_iot_consensus - -| Name | enable_auto_leader_balance_for_iot_consensus | -| ----------- | ------------------------------------------------------------ | -| Description | Whether to enable auto leader balance for IoTConsensus protocol. | -| Type | Boolean | -| Default | true | -| Effective | Restart required. | - -### 4.13 Cluster management - -- time_partition_origin - -| Name | time_partition_origin | -| ----------- | ------------------------------------------------------------ | -| Description | Time partition origin in milliseconds, default is equal to zero. | -| Type | Long | -| Unit | ms | -| Default | 0 | -| Effective | Modify before the first startup. | - -- time_partition_interval - -| Name | time_partition_interval | -| ----------- | ------------------------------------------------------------ | -| Description | Time partition interval in milliseconds, and partitioning data inside each data region, default is equal to one week | -| Type | Long | -| Unit | ms | -| Default | 604800000 | -| Effective | Modify before the first startup. | - -- heartbeat_interval_in_ms - -| Name | heartbeat_interval_in_ms | -| ----------- | -------------------------------------- | -| Description | The heartbeat interval in milliseconds | -| Type | Long | -| Unit | ms | -| Default | 1000 | -| Effective | Restart required. | - -- disk_space_warning_threshold - -| Name | disk_space_warning_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | Disk remaining threshold at which DataNode is set to ReadOnly status | -| Type | double(percentage) | -| Default | 0.05 | -| Effective | Restart required. 
| - -### 4.14 Memory Control Configuration - -- datanode_memory_proportion - -| Name | datanode_memory_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Memory Allocation Ratio: StorageEngine, QueryEngine, SchemaEngine, Consensus, StreamingEngine and Free Memory. | -| Type | Ratio | -| Default | 3:3:1:1:1:1 | -| Effective | Restart required. | - -- schema_memory_proportion - -| Name | schema_memory_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Schema Memory Allocation Ratio: SchemaRegion, SchemaCache, and PartitionCache. | -| Type | Ratio | -| Default | 5:4:1 | -| Effective | Restart required. | - -- storage_engine_memory_proportion - -| Name | storage_engine_memory_proportion | -| ----------- | ----------------------------------------------------------- | -| Description | Memory allocation ratio in StorageEngine: Write, Compaction | -| Type | Ratio | -| Default | 8:2 | -| Effective | Restart required. | - -- write_memory_proportion - -| Name | write_memory_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Memory allocation ratio in writing: Memtable, TimePartitionInfo | -| Type | Ratio | -| Default | 19:1 | -| Effective | Restart required. | - -- primitive_array_size - -| Name | primitive_array_size | -| ----------- | --------------------------------------------------------- | -| Description | primitive array size (length of each array) in array pool | -| Type | int32 | -| Default | 64 | -| Effective | Restart required. | - -- chunk_metadata_size_proportion - -| Name | chunk_metadata_size_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Ratio of compaction memory for chunk metadata maintains in memory when doing compaction | -| Type | Double | -| Default | 0.1 | -| Effective | Restart required. 
| - -- flush_proportion - -| Name | flush_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Ratio of memtable memory that triggers a flush to disk, 0.4 by default. If you have an extremely high write load (like batch=1000), it can be set lower than the default value, such as 0.2. | -| Type | Double | -| Default | 0.4 | -| Effective | Restart required. | - -- buffered_arrays_memory_proportion - -| Name | buffered_arrays_memory_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Ratio of memtable memory allocated for buffered arrays, 0.6 by default. | -| Type | Double | -| Default | 0.6 | -| Effective | Restart required. | - -- reject_proportion - -| Name | reject_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Ratio of memtable memory at which insertions are rejected, 0.8 by default. If you have an extremely high write load (like batch=1000) and the physical memory size is large enough, it can be set higher than the default value, such as 0.9. | -| Type | Double | -| Default | 0.8 | -| Effective | Restart required. | - -- device_path_cache_proportion - -| Name | device_path_cache_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Ratio of memtable memory for the DevicePathCache. DevicePathCache is the deviceId cache, keeping only one copy of the same deviceId in memory. | -| Type | Double | -| Default | 0.05 | -| Effective | Restart required. | - -- write_memory_variation_report_proportion - -| Name | write_memory_variation_report_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | If the memory cost of a data region increases by more than this proportion of the memory allocated for writing, it is reported to the system. The default value is 0.001. | -| Type | Double | -| Default | 0.001 | -| Effective | Restart required.
| - -- check_period_when_insert_blocked - -| Name | check_period_when_insert_blocked | -| ----------- | ------------------------------------------------------------ | -| Description | When an insertion is rejected, the waiting period (in ms) before checking the system again, 50 by default. If the insertion has been rejected and the read load is low, it can be set larger. | -| Type | int32 | -| Default | 50 | -| Effective | Restart required. | - -- io_task_queue_size_for_flushing - -| Name | io_task_queue_size_for_flushing | -| ----------- | -------------------------------------------- | -| Description | Size of ioTaskQueue. The default value is 10. | -| Type | int32 | -| Default | 10 | -| Effective | Restart required. | - -- enable_query_memory_estimation - -| Name | enable_query_memory_estimation | -| ----------- | ------------------------------------------------------------ | -| Description | If true, each query's possible memory footprint is estimated before execution, and the query is denied if the estimate exceeds the current free memory. | -| Type | bool | -| Default | true | -| Effective | Hot reload. | - -### 4.15 Schema Engine Configuration - -- schema_engine_mode - -| Name | schema_engine_mode | -| ----------- | ------------------------------------------------------------ | -| Description | The schema management mode of the schema engine. Memory and PBTree are currently supported. This setting must be the same on all DataNodes in a cluster. | -| Type | string | -| Default | Memory | -| Effective | Modify before the first startup. | - -- partition_cache_size - -| Name | partition_cache_size | -| ----------- | ------------------------- | -| Description | Cache size for partitions. | -| Type | Int32 | -| Default | 1000 | -| Effective | Restart required.
| - -- sync_mlog_period_in_ms - -| Name | sync_mlog_period_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | The period (in milliseconds) at which the metadata log is forced to disk. If sync_mlog_period_in_ms=0, the metadata log is forced to disk after each update. Setting this parameter to 0 may slow down operations on slow disks. | -| Type | Int64 | -| Default | 100 | -| Effective | Restart required. | - -- tag_attribute_flush_interval - -| Name | tag_attribute_flush_interval | -| ----------- | ------------------------------------------------------------ | -| Description | Number of tag and attribute records between forced flushes to disk. | -| Type | int32 | -| Default | 1000 | -| Effective | Modify before the first startup. | - -- tag_attribute_total_size - -| Name | tag_attribute_total_size | -| ----------- | ------------------------------------------------------------ | -| Description | Maximum size of the storage block for the tags and attributes of one time series. | -| Type | int32 | -| Default | 700 | -| Effective | Modify before the first startup. | - -- max_measurement_num_of_internal_request - -| Name | max_measurement_num_of_internal_request | -| ----------- | ------------------------------------------------------------ | -| Description | Maximum number of measurements in an internal request. When creating time series with Session.createMultiTimeseries, a user plan whose number of time series exceeds this value is split into several plans, each with no more than this number of time series. | -| Type | Int32 | -| Default | 10000 | -| Effective | Restart required. | - -- datanode_schema_cache_eviction_policy - -| Name | datanode_schema_cache_eviction_policy | -| ----------- | --------------------------------------- | -| Description | Policy of DataNodeSchemaCache eviction. | -| Type | String | -| Default | FIFO | -| Effective | Restart required.
| - -- cluster_timeseries_limit_threshold - -| Name | cluster_timeseries_limit_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of time series allowed in the cluster. | -| Type | Int32 | -| Default | -1 | -| Effective | Restart required. | - -- cluster_device_limit_threshold - -| Name | cluster_device_limit_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of devices allowed in the cluster. | -| Type | Int32 | -| Default | -1 | -| Effective | Restart required. | - -- database_limit_threshold - -| Name | database_limit_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of databases allowed in the cluster. | -| Type | Int32 | -| Default | -1 | -| Effective | Restart required. | - -### 4.16 Configurations for creating schema automatically - -- enable_auto_create_schema - -| Name | enable_auto_create_schema | -| ----------- | ------------------------------------------------ | -| Description | Whether automatic schema creation is enabled. | -| Value | true or false | -| Default | true | -| Effective | Restart required. | - -- default_storage_group_level - -| Name | default_storage_group_level | -| ----------- | ------------------------------------------------------------ | -| Description | Database level used when automatic schema creation is enabled. For example, given the path root.sg0.d1.s2, root.sg0 is set as the database if the level is 1. If the incoming path is shorter than this value, the creation/insertion fails. | -| Value | int32 | -| Default | 1 | -| Effective | Restart required.
| - -- boolean_string_infer_type - -| Name | boolean_string_infer_type | -| ----------- |------------------------------------------------------------------------------------| -| Description | register time series as which type when receiving boolean string "true" or "false" | -| Value | BOOLEAN or TEXT | -| Default | BOOLEAN | -| Effective | Hot_reload | - -- integer_string_infer_type - -| Name | integer_string_infer_type | -| ----------- |------------------------------------------------------------------------------------------------------------------| -| Description | register time series as which type when receiving an integer string and using float or double may lose precision | -| Value | INT32, INT64, FLOAT, DOUBLE, TEXT | -| Default | DOUBLE | -| Effective | Hot_reload | - -- floating_string_infer_type - -| Name | floating_string_infer_type | -| ----------- |----------------------------------------------------------------------------------| -| Description | register time series as which type when receiving a floating number string "6.7" | -| Value | DOUBLE, FLOAT or TEXT | -| Default | DOUBLE | -| Effective | Hot_reload | - -- nan_string_infer_type - -| Name | nan_string_infer_type | -| ----------- |--------------------------------------------------------------------| -| Description | register time series as which type when receiving the Literal NaN. 
| -| Value | DOUBLE, FLOAT or TEXT | -| Default | DOUBLE | -| Effective | Hot_reload | - -- default_boolean_encoding - -| Name | default_boolean_encoding | -| ----------- |----------------------------------------------------------------| -| Description | BOOLEAN encoding when creating schema automatically is enabled | -| Value | PLAIN, RLE | -| Default | RLE | -| Effective | Hot_reload | - -- default_int32_encoding - -| Name | default_int32_encoding | -| ----------- |--------------------------------------------------------------| -| Description | INT32 encoding when creating schema automatically is enabled | -| Value | PLAIN, RLE, TS_2DIFF, REGULAR, GORILLA | -| Default | TS_2DIFF | -| Effective | Hot_reload | - -- default_int64_encoding - -| Name | default_int64_encoding | -| ----------- |--------------------------------------------------------------| -| Description | INT64 encoding when creating schema automatically is enabled | -| Value | PLAIN, RLE, TS_2DIFF, REGULAR, GORILLA | -| Default | TS_2DIFF | -| Effective | Hot_reload | - -- default_float_encoding - -| Name | default_float_encoding | -| ----------- |--------------------------------------------------------------| -| Description | FLOAT encoding when creating schema automatically is enabled | -| Value | PLAIN, RLE, TS_2DIFF, GORILLA | -| Default | GORILLA | -| Effective | Hot_reload | - -- default_double_encoding - -| Name | default_double_encoding | -| ----------- |---------------------------------------------------------------| -| Description | DOUBLE encoding when creating schema automatically is enabled | -| Value | PLAIN, RLE, TS_2DIFF, GORILLA | -| Default | GORILLA | -| Effective | Hot_reload | - -- default_text_encoding - -| Name | default_text_encoding | -| ----------- |-------------------------------------------------------------| -| Description | TEXT encoding when creating schema automatically is enabled | -| Value | PLAIN | -| Default | PLAIN | -| Effective | Hot_reload | - - -* 
boolean_compressor - -| Name | boolean_compressor | -|------------------|-----------------------------------------------------------------------------------------| -| Description | BOOLEAN compression when creating schema automatically is enabled (Supports from V2.0.6) | -| Type | String | -| Default | LZ4 | -| Effective | Hot_reload | - -* int32_compressor - -| Name | int32_compressor | -|----------------------|--------------------------------------------------------------------------------------------| -| Description | INT32/DATE compression when creating schema automatically is enabled(Supports from V2.0.6) | -| Type | String | -| Default | LZ4 | -| Effective | Hot_reload | - -* int64_compressor - -| Name | int64_compressor | -|--------------------|-------------------------------------------------------------------------------------------------| -| Description | INT64/TIMESTAMP compression when creating schema automatically is enabled (Supports from V2.0.6) | -| Type | String | -| Default | LZ4 | -| Effective | Hot_reload | - -* float_compressor - -| Name | float_compressor | -|-----------------------|---------------------------------------------------------------------------------------| -| Description | FLOAT compression when creating schema automatically is enabled (Supports from V2.0.6) | -| Type | String | -| Default | LZ4 | -| Effective | Hot_reload | - -* double_compressor - -| Name | double_compressor | -|-------------------|----------------------------------------------------------------------------------------| -| Description | DOUBLE compression when creating schema automatically is enabled (Supports from V2.0.6) | -| Type | String | -| Default | LZ4 | -| Effective | Hot_reload | - -* text_compressor - -| Name | text_compressor | -|--------------------|--------------------------------------------------------------------------------------------------| -| Description | TEXT/BINARY/BLOB compression when creating schema automatically is enabled (Supports 
from V2.0.6) | -| Type | String | -| Default | LZ4 | -| Effective | Hot_reload | - - -### 4.17 Query Configurations - -- read_consistency_level - -| Name | read_consistency_level | -| ----------- | ------------------------------------------------------------ | -| Description | The read consistency level. Currently supported levels: strong (default, read from the leader replica) and weak (read from a random replica). | -| Type | String | -| Default | strong | -| Effective | Restart required. | - -- meta_data_cache_enable - -| Name | meta_data_cache_enable | -| ----------- | ------------------------------------------------------------ | -| Description | Whether to cache metadata (BloomFilter, ChunkMetadata and TimeSeriesMetadata). | -| Type | Boolean | -| Default | true | -| Effective | Restart required. | - -- chunk_timeseriesmeta_free_memory_proportion - -| Name | chunk_timeseriesmeta_free_memory_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Read memory allocation ratio: BloomFilterCache : ChunkCache : TimeSeriesMetadataCache : Coordinator : Operators : DataExchange : timeIndex in TsFileResourceList : others. The parameter form is a:b:c:d:e:f:g:h, where a through h are integers, for example 1:1:1:1:1:1:1:1 or 1:100:200:50:200:200:200:50. | -| Type | String | -| Default | 1 : 100 : 200 : 300 : 400 | -| Effective | Restart required. | - -- enable_last_cache - -| Name | enable_last_cache | -| ----------- | ---------------------------- | -| Description | Whether to enable the LAST cache. | -| Type | Boolean | -| Default | true | -| Effective | Restart required. | - -- mpp_data_exchange_core_pool_size - -| Name | mpp_data_exchange_core_pool_size | -| ----------- | -------------------------------------------- | -| Description | Core size of the thread pool for MPP data exchange. | -| Type | int32 | -| Default | 10 | -| Effective | Restart required.
| - -- mpp_data_exchange_max_pool_size - -| Name | mpp_data_exchange_max_pool_size | -| ----------- | ------------------------------------------- | -| Description | Max size of ThreadPool of MPP data exchange | -| Type | int32 | -| Default | 10 | -| Effective | Restart required. | - -- mpp_data_exchange_keep_alive_time_in_ms - -| Name | mpp_data_exchange_keep_alive_time_in_ms | -| ----------- | --------------------------------------- | -| Description | Max waiting time for MPP data exchange | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- driver_task_execution_time_slice_in_ms - -| Name | driver_task_execution_time_slice_in_ms | -| ----------- | -------------------------------------- | -| Description | The max execution time of a DriverTask | -| Type | int32 | -| Default | 200 | -| Effective | Restart required. | - -- max_tsblock_size_in_bytes - -| Name | max_tsblock_size_in_bytes | -| ----------- | ----------------------------- | -| Description | The max capacity of a TsBlock | -| Type | int32 | -| Default | 131072 | -| Effective | Restart required. | - -- max_tsblock_line_numbers - -| Name | max_tsblock_line_numbers | -| ----------- | ------------------------------------------- | -| Description | The max number of lines in a single TsBlock | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- slow_query_threshold - -| Name | slow_query_threshold | -| ----------- |----------------------------------------| -| Description | Time cost(ms) threshold for slow query | -| Type | long | -| Default | 3000 | -| Effective | Hot reload | - -- query_cost_stat_window - -| Name | query_cost_stat_window | -|-------------|--------------------| -| Description | Time window threshold(min) for record of history queries. 
| -| Type | Int32 | -| Default | 0 | -| Effective | Hot reload | - -- query_timeout_threshold - -| Name | query_timeout_threshold | -| ----------- | ----------------------------------------- | -| Description | The max executing time of query. unit: ms | -| Type | Int32 | -| Default | 60000 | -| Effective | Restart required. | - -- max_allowed_concurrent_queries - -| Name | max_allowed_concurrent_queries | -| ----------- | -------------------------------------------------- | -| Description | The maximum allowed concurrently executing queries | -| Type | Int32 | -| Default | 1000 | -| Effective | Restart required. | - -- query_thread_count - -| Name | query_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | How many threads can concurrently execute query statement. When <= 0, use CPU core number. | -| Type | Int32 | -| Default | 0 | -| Effective | Restart required. | - -- degree_of_query_parallelism - -| Name | degree_of_query_parallelism | -| ----------- | ------------------------------------------------------------ | -| Description | How many pipeline drivers will be created for one fragment instance. When <= 0, use CPU core number / 2. | -| Type | Int32 | -| Default | 0 | -| Effective | Restart required. | - -- mode_map_size_threshold - -| Name | mode_map_size_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | The threshold of count map size when calculating the MODE aggregation function | -| Type | Int32 | -| Default | 10000 | -| Effective | Restart required. | - -- batch_size - -| Name | batch_size | -| ----------- | ------------------------------------------------------------ | -| Description | The amount of data iterate each time in server (the number of data strips, that is, the number of different timestamps.) | -| Type | Int32 | -| Default | 100000 | -| Effective | Restart required. 
| - -- sort_buffer_size_in_bytes - -| Name | sort_buffer_size_in_bytes | -| ----------- |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | The memory for external sort in sort operator, when the data size is smaller than sort_buffer_size_in_bytes, the sort operator will use in-memory sort. | -| Type | long | -| Default | 1048576(Before V2.0.6)
0 (supported since V2.0.6). If `sort_buffer_size_in_bytes <= 0`, the default value is used: `default value = min(32MB, memory for query operators / query_thread_count / 2)`. If `sort_buffer_size_in_bytes > 0`, the specified value is used. | -| Effective | Hot_reload | - -- merge_threshold_of_explain_analyze - -| Name | merge_threshold_of_explain_analyze | -| ----------- | ------------------------------------------------------------ | -| Description | The threshold of operator count in the result set of EXPLAIN ANALYZE. If the number of operators in the result set is larger than this threshold, operators will be merged. | -| Type | int | -| Default | 10 | -| Effective | Hot reload | - -### 4.18 TTL Configuration - -- ttl_check_interval - -| Name | ttl_check_interval | -| ----------- | ------------------------------------------------------------ | -| Description | The interval of the TTL check task in each database. The TTL check task inspects files and selects those with a higher volume of expired data for compaction. Default is 2 hours (7200000 ms). | -| Type | int | -| Default | 7200000 | -| Effective | Restart required. | - -- max_expired_time - -| Name | max_expired_time | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum expiration time of a device that has a TTL. Default is 1 month (30 days, 2592000000 ms). If the data elapsed time (current timestamp minus the maximum data timestamp of the device in the file) of such a device exceeds this value, the file will be cleaned by compaction. | -| Type | int | -| Default | 2592000000 | -| Effective | Restart required. | - -- expired_data_ratio - -| Name | expired_data_ratio | -| ----------- | ------------------------------------------------------------ | -| Description | The expired device ratio. If the ratio of expired devices in one file exceeds this value, the expired data of this file will be cleaned by compaction. | -| Type | float | -| Default | 0.3 | -| Effective | Restart required. 
| - -### 4.19 Storage Engine Configuration - -- timestamp_precision - -| Name | timestamp_precision | -| ----------- | ------------------------------------------------------------ | -| Description | Use this value to set timestamp precision as "ms", "us" or "ns". | -| Type | String | -| Default | ms | -| Effective | Modify before the first startup. | - -- timestamp_precision_check_enabled - -| Name | timestamp_precision_check_enabled | -| ----------- | ------------------------------------------------------------ | -| Description | When the timestamp precision check is enabled, timestamps with more than 13 digits for ms precision, or more than 16 digits for us precision, cannot be inserted. | -| Type | Boolean | -| Default | true | -| Effective | Modify before the first startup. | - -- max_waiting_time_when_insert_blocked - -| Name | max_waiting_time_when_insert_blocked | -| ----------- | ------------------------------------------------------------ | -| Description | When the waiting time (in ms) of an insertion exceeds this value, an exception is thrown. 10000 by default. | -| Type | Int32 | -| Default | 10000 | -| Effective | Restart required. | - -- handle_system_error - -| Name | handle_system_error | -| ----------- | -------------------------------------------------------- | -| Description | What the system does when an unrecoverable error occurs. | -| Type | String | -| Default | CHANGE_TO_READ_ONLY | -| Effective | Restart required. | - -- enable_timed_flush_seq_memtable - -| Name | enable_timed_flush_seq_memtable | -| ----------- | --------------------------------------------------- | -| Description | Whether to flush the memtables of sequence tsfiles on a timer. 
| -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- seq_memtable_flush_interval_in_ms - -| Name | seq_memtable_flush_interval_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | If a memTable's last update time is older than current time minus this, the memtable will be flushed to disk. | -| Type | long | -| Default | 600000 | -| Effective | Hot reload | - -- seq_memtable_flush_check_interval_in_ms - -| Name | seq_memtable_flush_check_interval_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | The interval to check whether sequence memtables need flushing. | -| Type | long | -| Default | 30000 | -| Effective | Hot reload | - -- enable_timed_flush_unseq_memtable - -| Name | enable_timed_flush_unseq_memtable | -| ----------- | ----------------------------------------------------- | -| Description | Whether to timed flush unsequence tsfiles' memtables. | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- unseq_memtable_flush_interval_in_ms - -| Name | unseq_memtable_flush_interval_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | If a memTable's last update time is older than current time minus this, the memtable will be flushed to disk. | -| Type | long | -| Default | 600000 | -| Effective | Hot reload | - -- unseq_memtable_flush_check_interval_in_ms - -| Name | unseq_memtable_flush_check_interval_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | The interval to check whether unsequence memtables need flushing. 
| -| Type | long | -| Default | 30000 | -| Effective | Hot reload | - -- tvlist_sort_algorithm - -| Name | tvlist_sort_algorithm | -| ----------- | ------------------------------------------------- | -| Description | The sort algorithms used in the memtable's TVList | -| Type | String | -| Default | TIM | -| Effective | Restart required. | - -- avg_series_point_number_threshold - -| Name | avg_series_point_number_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | When the average point number of timeseries in memtable exceeds this, the memtable is flushed to disk. | -| Type | int32 | -| Default | 100000 | -| Effective | Restart required. | - -- flush_thread_count - -| Name | flush_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | How many threads can concurrently flush. When <= 0, use CPU core number. | -| Type | int32 | -| Default | 0 | -| Effective | Restart required. | - -- enable_partial_insert - -| Name | enable_partial_insert | -| ----------- | ------------------------------------------------------------ | -| Description | In one insert (one device, one timestamp, multiple measurements), if enable partial insert, one measurement failure will not impact other measurements | -| Type | Boolean | -| Default | true | -| Effective | Restart required. | - -- recovery_log_interval_in_ms - -| Name | recovery_log_interval_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | the interval to log recover progress of each vsg when starting iotdb | -| Type | Int32 | -| Default | 5000 | -| Effective | Restart required. | - -- 0.13_data_insert_adapt - -| Name | 0.13_data_insert_adapt | -| ----------- | ------------------------------------------------------------ | -| Description | If using a v0.13 client to insert data, please set this configuration to true. 
| -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- enable_tsfile_validation - -| Name | enable_tsfile_validation | -| ----------- | ------------------------------------------------------------ | -| Description | Verify that TSfiles generated by Flush, Load, and Compaction are correct. | -| Type | boolean | -| Default | false | -| Effective | Hot reload | - -- tier_ttl_in_ms - -| Name | tier_ttl_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | Default tier TTL. When the survival time of the data exceeds the threshold, it will be migrated to the next tier. | -| Type | long | -| Default | -1 | -| Effective | Restart required. | - -- max_object_file_size_in_byte - -| Name | max_object_file_size_in_byte | -|-------------|-----------------------------------------------------------------------| -| Description | Maximum size limit for a single object file (supported since V2.0.8). | -| Type | long | -| Default | 4294967296 (4 GB in bytes) | -| Effective | Hot reload | - -- restrict_object_limit - -| Name | restrict_object_limit | -|-------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | No 
special restrictions on table names, column names, or device identifiers for `OBJECT` type (supported since V2.0.8). When set to `true` and the table contains `OBJECT` columns, the following restrictions apply:
1. Naming Rules: Values in TAG columns, table names, and field names must not use `.` or `..` and must not contain `./` or `.\`; otherwise, metadata creation will fail. Names containing characters that the filesystem does not support will cause write errors.
2. Case Sensitivity: If the underlying filesystem is case-insensitive, device identifiers like `'d1'` and `'D1'` are treated as identical. Creating such similar identifiers may overwrite `OBJECT` data files, leading to data corruption.
3. Storage Path: Actual storage path format: `${dataregionid}/${tablename}/${tag1}/${tag2}/.../${field}/${timestamp}.bin` | -| Type | boolean | -| Default | false | -| Effective | Can only be modified before the first service startup. | - -### 4.20 Compaction Configurations - -- enable_seq_space_compaction - -| Name | enable_seq_space_compaction | -| ----------- | ---------------------------------------------------------- | -| Description | sequence space compaction: only compact the sequence files | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- enable_unseq_space_compaction - -| Name | enable_unseq_space_compaction | -| ----------- | ------------------------------------------------------------ | -| Description | unsequence space compaction: only compact the unsequence files | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- enable_cross_space_compaction - -| Name | enable_cross_space_compaction | -| ----------- | ------------------------------------------------------------ | -| Description | cross space compaction: compact the unsequence files into the overlapped sequence files | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- enable_auto_repair_compaction - -| Name | enable_auto_repair_compaction | -| ----------- | ---------------------------------------------- | -| Description | enable auto repair unsorted file by compaction | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- cross_selector - -| Name | cross_selector | -| ----------- | ------------------------------------------- | -| Description | the selector of cross space compaction task | -| Type | String | -| Default | rewrite | -| Effective | Restart required. 
| - -- cross_performer - -| Name | cross_performer | -| ----------- |-----------------------------------------------------------| -| Description | The compaction performer of cross space compaction task. Options: read_point, fast | -| Type | String | -| Default | fast | -| Effective | Hot reload | - -- inner_seq_selector - -| Name | inner_seq_selector | -| ----------- |--------------------------------------------------------| -| Description | The selector of inner sequence space compaction task. Options: size_tiered_single_target, size_tiered_multi_target | -| Type | String | -| Default | size_tiered_multi_target | -| Effective | Hot reload | - -- inner_seq_performer - -| Name | inner_seq_performer | -| ----------- |---------------------------------------------------------| -| Description | The performer of inner sequence space compaction task. Options: read_chunk, fast | -| Type | String | -| Default | read_chunk | -| Effective | Hot reload | - -- inner_unseq_selector - -| Name | inner_unseq_selector | -| ----------- |----------------------------------------------------------| -| Description | The selector of inner unsequence space compaction task. Options: size_tiered_single_target, size_tiered_multi_target | -| Type | String | -| Default | size_tiered_multi_target | -| Effective | Hot reload | - -- inner_unseq_performer - -| Name | inner_unseq_performer | -| ----------- |-----------------------------------------------------------| -| Description | The performer of inner unsequence space compaction task. Options: read_point, fast | -| Type | String | -| Default | fast | -| Effective | Hot reload | - -- compaction_priority - -| Name | compaction_priority | -| ----------- | ------------------------------------------------------------ | -| Description | The priority of compaction execution. INNER_CROSS: prioritize inner space compaction, reduce the number of files first. CROSS_INNER: prioritize cross space compaction, eliminate the unsequence files first. BALANCE: 
alternate two compaction types. | -| Type | String | -| Default | INNER_CROSS | -| Effective | Restart required. | - -- candidate_compaction_task_queue_size - -| Name | candidate_compaction_task_queue_size | -| ----------- | -------------------------------------------- | -| Description | The size of candidate compaction task queue. | -| Type | int32 | -| Default | 50 | -| Effective | Restart required. | - -- target_compaction_file_size - -| Name | target_compaction_file_size | -| ----------- | ------------------------------------------------------------ | -| Description | This parameter is used in two places: (1) The target tsfile size of inner space compaction. (2) The candidate size of seq tsfile in cross space compaction will be smaller than target_compaction_file_size * 1.5. In most cases, the target file size of cross compaction won't exceed this threshold, and if it does, it will not be much larger than it. | -| Type | Int64 | -| Default | 2147483648 | -| Effective | Hot reload | - -- inner_compaction_total_file_size_threshold - -| Name | inner_compaction_total_file_size_threshold | -| ----------- | ---------------------------------------------------- | -| Description | The total file size limit in inner space compaction. | -| Type | int64 | -| Default | 10737418240 | -| Effective | Hot reload | - -- inner_compaction_total_file_num_threshold - -| Name | inner_compaction_total_file_num_threshold | -| ----------- | --------------------------------------------------- | -| Description | The total file num limit in inner space compaction. 
| -| Type | int32 | -| Default | 100 | -| Effective | Hot reload | - -- max_level_gap_in_inner_compaction - -| Name | max_level_gap_in_inner_compaction | -| ----------- | ----------------------------------------------- | -| Description | The max level gap in inner compaction selection | -| Type | int32 | -| Default | 2 | -| Effective | Hot reload | - -- target_chunk_size - -| Name | target_chunk_size | -| ----------- | ------------------------------------------------------------ | -| Description | The target chunk size in flushing and compaction. If the size of a timeseries in memtable exceeds this, the data will be flushed to multiple chunks.| -| Type | Int64 | -| Default | 1600000 | -| Effective | Restart required. | - -- target_chunk_point_num - -| Name | target_chunk_point_num | -| ----------- |-----------------------------------------------------------------| -| Description | The target point nums in one chunk in flushing and compaction. If the point number of a timeseries in memtable exceeds this, the data will be flushed to multiple chunks. | -| Type | Int64 | -| Default | 100000 | -| Effective | Restart required. | - -- chunk_size_lower_bound_in_compaction - -| Name | chunk_size_lower_bound_in_compaction | -| ----------- | ------------------------------------------------------------ | -| Description | If the chunk size is lower than this threshold, it will be deserialized into points | -| Type | Int64 | -| Default | 128 | -| Effective | Restart required. | - -- chunk_point_num_lower_bound_in_compaction - -| Name | chunk_point_num_lower_bound_in_compaction | -| ----------- |------------------------------------------------------------------------------------------| -| Description | If the chunk point num is lower than this threshold, it will be deserialized into points | -| Type | Int64 | -| Default | 100 | -| Effective | Restart required. 
| - -- inner_compaction_candidate_file_num - -| Name | inner_compaction_candidate_file_num | -| ----------- | ------------------------------------------------------------ | -| Description | The file num requirement when selecting inner space compaction candidate files | -| Type | int32 | -| Default | 30 | -| Effective | Hot reload | - -- max_cross_compaction_candidate_file_num - -| Name | max_cross_compaction_candidate_file_num | -| ----------- | ------------------------------------------------------------ | -| Description | The max file when selecting cross space compaction candidate files | -| Type | int32 | -| Default | 500 | -| Effective | Hot reload | - -- max_cross_compaction_candidate_file_size - -| Name | max_cross_compaction_candidate_file_size | -| ----------- | ------------------------------------------------------------ | -| Description | The max total size when selecting cross space compaction candidate files | -| Type | Int64 | -| Default | 5368709120 | -| Effective | Hot reload | - -- min_cross_compaction_unseq_file_level - -| Name | min_cross_compaction_unseq_file_level | -| ----------- | ------------------------------------------------------------ | -| Description | The min inner compaction level of unsequence file which can be selected as candidate | -| Type | int32 | -| Default | 1 | -| Effective | Hot reload | - -- compaction_thread_count - -| Name | compaction_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | How many threads will be set up to perform compaction, 10 by default. | -| Type | int32 | -| Default | 10 | -| Effective | Hot reload | - -- compaction_max_aligned_series_num_in_one_batch - -| Name | compaction_max_aligned_series_num_in_one_batch | -| ----------- | ------------------------------------------------------------ | -| Description | How many chunk will be compacted in aligned series compaction, 10 by default. 
| -| Type | int32 | -| Default | 10 | -| Effective | Hot reload | - -- compaction_schedule_interval_in_ms - -| Name | compaction_schedule_interval_in_ms | -| ----------- | ---------------------------------------- | -| Description | The interval of compaction task schedule | -| Type | Int64 | -| Default | 60000 | -| Effective | Restart required. | - -- compaction_write_throughput_mb_per_sec - -| Name | compaction_write_throughput_mb_per_sec | -| ----------- | -------------------------------------------------------- | -| Description | The limit of write throughput merge can reach per second | -| Type | int32 | -| Default | 16 | -| Effective | Restart required. | - -- compaction_read_throughput_mb_per_sec - -| Name | compaction_read_throughput_mb_per_sec | -| ----------- | ------------------------------------------------------- | -| Description | The limit of read throughput merge can reach per second | -| Type | int32 | -| Default | 0 | -| Effective | Hot reload | - -- compaction_read_operation_per_sec - -| Name | compaction_read_operation_per_sec | -| ----------- | ------------------------------------------------------ | -| Description | The limit of read operation merge can reach per second | -| Type | int32 | -| Default | 0 | -| Effective | Hot reload | - -- sub_compaction_thread_count - -| Name | sub_compaction_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | The number of sub compaction threads to be set up to perform compaction. | -| Type | int32 | -| Default | 4 | -| Effective | Hot reload | - -- inner_compaction_task_selection_disk_redundancy - -| Name | inner_compaction_task_selection_disk_redundancy | -| ----------- | ------------------------------------------------------------ | -| Description | Redundancy value of disk availability, only use for inner compaction. 
| -| Type | double | -| Default | 0.05 | -| Effective | Hot reload | - -- inner_compaction_task_selection_mods_file_threshold - -| Name | inner_compaction_task_selection_mods_file_threshold | -| ----------- | -------------------------------------------------------- | -| Description | Mods file size threshold, only used for inner compaction. | -| Type | long | -| Default | 131072 | -| Effective | Hot reload | - -- compaction_schedule_thread_num - -| Name | compaction_schedule_thread_num | -| ----------- | ------------------------------------------------------------ | -| Description | The number of threads to be set up to select compaction task. | -| Type | int32 | -| Default | 4 | -| Effective | Hot reload | - -### 4.21 Write Ahead Log Configuration - -- wal_mode - -| Name | wal_mode | -| ----------- | ------------------------------------------------------------ | -| Description | The details of these three modes are as follows. DISABLE: the system will disable wal. SYNC: the system will submit wal synchronously; a write request will not return until its wal is fsynced to the disk successfully. ASYNC: the system will submit wal asynchronously; a write request will return immediately, whether or not its wal has been fsynced to the disk successfully. | -| Type | String | -| Default | ASYNC | -| Effective | Restart required. | - -- max_wal_nodes_num - -| Name | max_wal_nodes_num | -| ----------- | ------------------------------------------------------------ | -| Description | Each node corresponds to one wal directory. The default value 0 means the number is determined by the system and is in the range of [data region num / 2, data region num]. | -| Type | int32 | -| Default | 0 | -| Effective | Restart required. 
| - -- wal_async_mode_fsync_delay_in_ms - -| Name | wal_async_mode_fsync_delay_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | Duration a wal flush operation will wait before calling fsync in the async mode | -| Type | int32 | -| Default | 1000 | -| Effective | Hot reload | - -- wal_sync_mode_fsync_delay_in_ms - -| Name | wal_sync_mode_fsync_delay_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | Duration a wal flush operation will wait before calling fsync in the sync mode | -| Type | int32 | -| Default | 3 | -| Effective | Hot reload | - -- wal_buffer_size_in_byte - -| Name | wal_buffer_size_in_byte | -| ----------- | ---------------------------- | -| Description | Buffer size of each wal node | -| Type | int32 | -| Default | 33554432 | -| Effective | Restart required. | - -- wal_buffer_queue_capacity - -| Name | wal_buffer_queue_capacity | -| ----------- | --------------------------------- | -| Description | Buffer capacity of each wal queue | -| Type | int32 | -| Default | 500 | -| Effective | Restart required. 
| - -- wal_file_size_threshold_in_byte - -| Name | wal_file_size_threshold_in_byte | -| ----------- | ------------------------------- | -| Description | Size threshold of each wal file | -| Type | int32 | -| Default | 31457280 | -| Effective | Hot reload | - -- wal_min_effective_info_ratio - -| Name | wal_min_effective_info_ratio | -| ----------- | --------------------------------------------------- | -| Description | Minimum ratio of effective information in wal files | -| Type | double | -| Default | 0.1 | -| Effective | Hot reload | - -- wal_memtable_snapshot_threshold_in_byte - -| Name | wal_memtable_snapshot_threshold_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | MemTable size threshold for triggering MemTable snapshot in wal | -| Type | int64 | -| Default | 8388608 | -| Effective | Hot reload | - -- max_wal_memtable_snapshot_num - -| Name | max_wal_memtable_snapshot_num | -| ----------- | ------------------------------------- | -| Description | MemTable's max snapshot number in wal | -| Type | int32 | -| Default | 1 | -| Effective | Hot reload | - -- delete_wal_files_period_in_ms - -| Name | delete_wal_files_period_in_ms | -| ----------- | ----------------------------------------------------------- | -| Description | The period when outdated wal files are periodically deleted | -| Type | int64 | -| Default | 20000 | -| Effective | Hot reload | - -- wal_throttle_threshold_in_byte - -| Name | wal_throttle_threshold_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | The minimum size of wal files when throttle down in IoTConsensus | -| Type | long | -| Default | 53687091200 | -| Effective | Hot reload | - -- iot_consensus_cache_window_time_in_ms - -| Name | iot_consensus_cache_window_time_in_ms | -| ----------- | ------------------------------------------------ | -| Description | Maximum wait time of write cache in IoTConsensus | -| Type | long | 
-| Default | -1 | -| Effective | Hot reload | - -- enable_wal_compression - -| Name | enable_wal_compression | -| ----------- | ------------------------------------- | -| Description | Enable Write Ahead Log compression. | -| Type | boolean | -| Default | true | -| Effective | Hot reload | - -### 4.22 IoTConsensus Configuration - -- data_region_iot_max_log_entries_num_per_batch - -| Name | data_region_iot_max_log_entries_num_per_batch | -| ----------- | ------------------------------------------------- | -| Description | The maximum log entries num in IoTConsensus Batch | -| Type | int32 | -| Default | 1024 | -| Effective | Restart required. | - -- data_region_iot_max_size_per_batch - -| Name | data_region_iot_max_size_per_batch | -| ----------- | -------------------------------------- | -| Description | The maximum size in IoTConsensus Batch | -| Type | int32 | -| Default | 16777216 | -| Effective | Restart required. | - -- data_region_iot_max_pending_batches_num - -| Name | data_region_iot_max_pending_batches_num | -| ----------- | ----------------------------------------------- | -| Description | The maximum pending batches num in IoTConsensus | -| Type | int32 | -| Default | 5 | -| Effective | Restart required. | - -- data_region_iot_max_memory_ratio_for_queue - -| Name | data_region_iot_max_memory_ratio_for_queue | -| ----------- | -------------------------------------------------- | -| Description | The maximum memory ratio for queue in IoTConsensus | -| Type | double | -| Default | 0.6 | -| Effective | Restart required. | - -- region_migration_speed_limit_bytes_per_second - -| Name | region_migration_speed_limit_bytes_per_second | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum transfer rate in bytes per second for region migration | -| Type | long | -| Default | 33554432 | -| Effective | Restart required. 
| - -### 4.23 TsFile Configurations - -- group_size_in_byte - -| Name | group_size_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of bytes written to disk each time the data in memory is written to disk | -| Type | int32 | -| Default | 134217728 | -| Effective | Hot reload | - -- page_size_in_byte - -| Name | page_size_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | The memory size for each series writer to pack page, default value is 64KB | -| Type | int32 | -| Default | 65536 | -| Effective | Hot reload | - -- max_number_of_points_in_page - -| Name | max_number_of_points_in_page | -| ----------- | ------------------------------------------- | -| Description | The maximum number of data points in a page | -| Type | int32 | -| Default | 10000 | -| Effective | Hot reload | - -- pattern_matching_threshold - -| Name | pattern_matching_threshold | -| ----------- | ------------------------------------------- | -| Description | The threshold for pattern matching in regex | -| Type | int32 | -| Default | 1000000 | -| Effective | Hot reload | - -- float_precision - -| Name | float_precision | -| ----------- | ------------------------------------------------------------ | -| Description | Floating-point precision of query results.Only effective for RLE and TS_2DIFF encodings.Due to the limitation of machine precision, some values may not be interpreted strictly. | -| Type | int32 | -| Default | 2 | -| Effective | Hot reload | - -- value_encoder - -| Name | value_encoder | -| ----------- | ------------------------------------------------------------ | -| Description | Encoder of value series. default value is PLAIN. | -| Type | For int, long data type, also supports TS_2DIFF and RLE(run-length encoding), GORILLA and ZIGZAG. 
| -| Default | PLAIN | -| Effective | Hot reload | - -- compressor - -| Name | compressor | -| ----------- | ------------------------------------------------------------ | -| Description | Data compression method. It is also used as the default compressor of the time column in aligned timeseries. | -| Type | Supports UNCOMPRESSED, SNAPPY, ZSTD, LZMA2, or LZ4. | -| Default | LZ4 | -| Effective | Hot reload | - -- encrypt_flag - -| Name | encrypt_flag | -| ----------- | ---------------------- | -| Description | Enable data encryption | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- encrypt_type - -| Name | encrypt_type | -| ----------- |---------------------------------------| -| Description | The method of data encryption | -| Type | String | -| Default | org.apache.tsfile.encrypt.UNENCRYPTED | -| Effective | Restart required. | - -- encrypt_key_path - -| Name | encrypt_key_path | -| ----------- | ----------------------------------- | -| Description | The path of the key for data encryption | -| Type | String | -| Default | None | -| Effective | Restart required. | - -### 4.24 Authorization Configuration - -- authorizer_provider_class - -| Name | authorizer_provider_class | -| ----------- | ------------------------------------------------------------ | -| Description | The class that provides the authorization service. | -| Type | String | -| Default | org.apache.iotdb.commons.auth.authorizer.LocalFileAuthorizer | -| Effective | Restart required. | - -- iotdb_server_encrypt_decrypt_provider - -| Name | iotdb_server_encrypt_decrypt_provider | -| ----------- | ------------------------------------------------------------ | -| Description | The encryption provider class | -| Type | String | -| Default | org.apache.iotdb.commons.security.encrypt.MessageDigestEncrypt | -| Effective | Modify before the first startup. 
| - -- iotdb_server_encrypt_decrypt_provider_parameter - -| Name | iotdb_server_encrypt_decrypt_provider_parameter | -| ----------- | ----------------------------------------------- | -| Description | encryption provided class parameter | -| Type | String | -| Default | None | -| Effective | Modify before the first startup. | - -- author_cache_size - -| Name | author_cache_size | -| ----------- | --------------------------- | -| Description | Cache size of user and role | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- author_cache_expire_time - -| Name | author_cache_expire_time | -| ----------- | ---------------------------------- | -| Description | Cache expire time of user and role | -| Type | int32 | -| Default | 30 | -| Effective | Restart required. | - -### 4.25 UDF Configuration - -- udf_initial_byte_array_length_for_memory_control - -| Name | udf_initial_byte_array_length_for_memory_control | -| ----------- | ------------------------------------------------------------ | -| Description | Used to estimate the memory usage of text fields in a UDF query.It is recommended to set this value to be slightly larger than the average length of all text records. | -| Type | int32 | -| Default | 48 | -| Effective | Restart required. | - -- udf_memory_budget_in_mb - -| Name | udf_memory_budget_in_mb | -| ----------- | ------------------------------------------------------------ | -| Description | How much memory may be used in ONE UDF query (in MB). The upper limit is 20% of allocated memory for read. | -| Type | Float | -| Default | 30.0 | -| Effective | Restart required. | - -- udf_reader_transformer_collector_memory_proportion - -| Name | udf_reader_transformer_collector_memory_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | UDF memory allocation ratio.The parameter form is a:b:c, where a, b, and c are integers. 
| -| Type | String | -| Default | 1:1:1 | -| Effective | Restart required. | - -- udf_lib_dir - -| Name | udf_lib_dir | -| ----------- | ---------------------------- | -| Description | The UDF lib directory | -| Type | String | -| Default | ext/udf (Windows: ext\\udf) | -| Effective | Restart required. | - -### 4.26 Trigger Configuration - -- trigger_lib_dir - -| Name | trigger_lib_dir | -| ----------- | ------------------------- | -| Description | The trigger lib directory | -| Type | String | -| Default | ext/trigger | -| Effective | Restart required. | - -- stateful_trigger_retry_num_when_not_found - -| Name | stateful_trigger_retry_num_when_not_found | -| ----------- | ------------------------------------------------------------ | -| Description | The number of retries when an instance of a stateful trigger cannot be found on DataNodes | -| Type | Int32 | -| Default | 3 | -| Effective | Restart required. | - -### 4.27 Select-Into Configuration - -- into_operation_buffer_size_in_byte - -| Name | into_operation_buffer_size_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum memory occupied by the data to be written when executing select-into statements. | -| Type | long | -| Default | 104857600 | -| Effective | Hot reload | - -- select_into_insert_tablet_plan_row_limit - -| Name | select_into_insert_tablet_plan_row_limit | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of rows that can be processed in one insert-tablet-plan when executing select-into statements. 
| -| Type | int32 | -| Default | 10000 | -| Effective | Hot reload | - -- into_operation_execution_thread_count - -| Name | into_operation_execution_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | The number of threads in the thread pool that execute insert-tablet tasks | -| Type | int32 | -| Default | 2 | -| Effective | Restart required. | - -### 4.28 Continuous Query Configuration - -- continuous_query_submit_thread_count - -| Name | continuous_query_submit_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | The number of threads in the scheduled thread pool that submit continuous query tasks periodically | -| Type | int32 | -| Default | 2 | -| Effective | Restart required. | - -- continuous_query_min_every_interval_in_ms - -| Name | continuous_query_min_every_interval_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | The minimum value of the continuous query execution time interval | -| Type | long (duration) | -| Default | 1000 | -| Effective | Restart required. | - -### 4.29 Pipe Configuration - -- pipe_lib_dir - -| Name | pipe_lib_dir | -| ----------- | ----------------------- | -| Description | The pipe lib directory. | -| Type | string | -| Default | ext/pipe | -| Effective | Cannot be modified | - -- pipe_subtask_executor_max_thread_num - -| Name | pipe_subtask_executor_max_thread_num | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). | -| Type | int | -| Default | 5 | -| Effective | Restart required. 
| - -- pipe_sink_timeout_ms - -| Name | pipe_sink_timeout_ms | -| ----------- | ------------------------------------------------------------ | -| Description | The connection timeout (in milliseconds) for the thrift client. | -| Type | int | -| Default | 900000 | -| Effective | Restart required. | - -- pipe_sink_selector_number - -| Name | pipe_sink_selector_number | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of selectors that can be used in the sink.Recommend to set this value to less than or equal to pipe_sink_max_client_number. | -| Type | int | -| Default | 4 | -| Effective | Restart required. | - -- pipe_sink_max_client_number - -| Name | pipe_sink_max_client_number | -| ----------- | ----------------------------------------------------------- | -| Description | The maximum number of clients that can be used in the sink. | -| Type | int | -| Default | 16 | -| Effective | Restart required. | - -- pipe_air_gap_receiver_enabled - -| Name | pipe_air_gap_receiver_enabled | -| ----------- | ------------------------------------------------------------ | -| Description | Whether to enable receiving pipe data through air gap.The receiver can only return 0 or 1 in TCP mode to indicate whether the data is received successfully. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- pipe_air_gap_receiver_port - -| Name | pipe_air_gap_receiver_port | -| ----------- | ------------------------------------------------------------ | -| Description | The port for the server to receive pipe data through air gap. | -| Type | int | -| Default | 9780 | -| Effective | Restart required. 
| - -- pipe_all_sinks_rate_limit_bytes_per_second - -| Name | pipe_all_sinks_rate_limit_bytes_per_second | -| ----------- | ------------------------------------------------------------ | -| Description | The total bytes that all pipe sinks can transfer per second.When given a value less than or equal to 0, it means no limit. default value is -1, which means no limit. | -| Type | double | -| Default | -1 | -| Effective | Hot reload | - -### 4.30 RatisConsensus Configuration - -- config_node_ratis_log_appender_buffer_size_max - -| Name | config_node_ratis_log_appender_buffer_size_max | -| ----------- | ------------------------------------------------------------ | -| Description | max payload size for a single log-sync-RPC from leader to follower of ConfigNode (in byte, by default 16MB) | -| Type | int32 | -| Default | 16777216 | -| Effective | Restart required. | - -- schema_region_ratis_log_appender_buffer_size_max - -| Name | schema_region_ratis_log_appender_buffer_size_max | -| ----------- | ------------------------------------------------------------ | -| Description | max payload size for a single log-sync-RPC from leader to follower of SchemaRegion (in byte, by default 16MB) | -| Type | int32 | -| Default | 16777216 | -| Effective | Restart required. | - -- data_region_ratis_log_appender_buffer_size_max - -| Name | data_region_ratis_log_appender_buffer_size_max | -| ----------- | ------------------------------------------------------------ | -| Description | max payload size for a single log-sync-RPC from leader to follower of DataRegion (in byte, by default 16MB) | -| Type | int32 | -| Default | 16777216 | -| Effective | Restart required. 
| - -- config_node_ratis_snapshot_trigger_threshold - -| Name | config_node_ratis_snapshot_trigger_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | max numbers of snapshot_trigger_threshold logs to trigger a snapshot of Confignode | -| Type | int32 | -| Default | 400,000 | -| Effective | Restart required. | - -- schema_region_ratis_snapshot_trigger_threshold - -| Name | schema_region_ratis_snapshot_trigger_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | max numbers of snapshot_trigger_threshold logs to trigger a snapshot of SchemaRegion | -| Type | int32 | -| Default | 400,000 | -| Effective | Restart required. | - -- data_region_ratis_snapshot_trigger_threshold - -| Name | data_region_ratis_snapshot_trigger_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | max numbers of snapshot_trigger_threshold logs to trigger a snapshot of DataRegion | -| Type | int32 | -| Default | 400,000 | -| Effective | Restart required. | - -- config_node_ratis_log_unsafe_flush_enable - -| Name | config_node_ratis_log_unsafe_flush_enable | -| ----------- | ------------------------------------------------------ | -| Description | Is confignode allowed flushing Raft Log asynchronously | -| Type | boolean | -| Default | false | -| Effective | Restart required. | - -- schema_region_ratis_log_unsafe_flush_enable - -| Name | schema_region_ratis_log_unsafe_flush_enable | -| ----------- | -------------------------------------------------------- | -| Description | Is schemaregion allowed flushing Raft Log asynchronously | -| Type | boolean | -| Default | false | -| Effective | Restart required. 
| - -- data_region_ratis_log_unsafe_flush_enable - -| Name | data_region_ratis_log_unsafe_flush_enable | -| ----------- | ------------------------------------------------------ | -| Description | Whether the DataRegion is allowed to flush the Raft log asynchronously | -| Type | boolean | -| Default | false | -| Effective | Restart required. | - -- config_node_ratis_log_segment_size_max_in_byte - -| Name | config_node_ratis_log_segment_size_max_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | Maximum capacity of a RaftLog segment file of the ConfigNode (in bytes, 24MB by default) | -| Type | int32 | -| Default | 25165824 | -| Effective | Restart required. | - -- schema_region_ratis_log_segment_size_max_in_byte - -| Name | schema_region_ratis_log_segment_size_max_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | Maximum capacity of a RaftLog segment file of the SchemaRegion (in bytes, 24MB by default) | -| Type | int32 | -| Default | 25165824 | -| Effective | Restart required. | - -- data_region_ratis_log_segment_size_max_in_byte - -| Name | data_region_ratis_log_segment_size_max_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | Maximum capacity of a RaftLog segment file of the DataRegion (in bytes, 24MB by default) | -| Type | int32 | -| Default | 25165824 | -| Effective | Restart required. | - -- config_node_simple_consensus_log_segment_size_max_in_byte - -| Name | config_node_simple_consensus_log_segment_size_max_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | Maximum capacity of a simple log segment file of the ConfigNode (in bytes, 24MB by default) | -| Type | int32 | -| Default | 25165824 | -| Effective | Restart required. 
| - -- config_node_ratis_grpc_flow_control_window - -| Name | config_node_ratis_grpc_flow_control_window | -| ----------- | ---------------------------------------------------------- | -| Description | confignode flow control window for ratis grpc log appender | -| Type | int32 | -| Default | 4194304 | -| Effective | Restart required. | - -- schema_region_ratis_grpc_flow_control_window - -| Name | schema_region_ratis_grpc_flow_control_window | -| ----------- | ------------------------------------------------------------ | -| Description | schema region flow control window for ratis grpc log appender | -| Type | int32 | -| Default | 4194304 | -| Effective | Restart required. | - -- data_region_ratis_grpc_flow_control_window - -| Name | data_region_ratis_grpc_flow_control_window | -| ----------- | ----------------------------------------------------------- | -| Description | data region flow control window for ratis grpc log appender | -| Type | int32 | -| Default | 4194304 | -| Effective | Restart required. | - -- config_node_ratis_grpc_leader_outstanding_appends_max - -| Name | config_node_ratis_grpc_leader_outstanding_appends_max | -| ----------- | ----------------------------------------------------- | -| Description | config node grpc line concurrency threshold | -| Type | int32 | -| Default | 128 | -| Effective | Restart required. | - -- schema_region_ratis_grpc_leader_outstanding_appends_max - -| Name | schema_region_ratis_grpc_leader_outstanding_appends_max | -| ----------- | ------------------------------------------------------- | -| Description | schema region grpc line concurrency threshold | -| Type | int32 | -| Default | 128 | -| Effective | Restart required. 
| - -- data_region_ratis_grpc_leader_outstanding_appends_max - -| Name | data_region_ratis_grpc_leader_outstanding_appends_max | -| ----------- | ----------------------------------------------------- | -| Description | data region grpc line concurrency threshold | -| Type | int32 | -| Default | 128 | -| Effective | Restart required. | - -- config_node_ratis_log_force_sync_num - -| Name | config_node_ratis_log_force_sync_num | -| ----------- | ------------------------------------ | -| Description | config node fsync threshold | -| Type | int32 | -| Default | 128 | -| Effective | Restart required. | - -- schema_region_ratis_log_force_sync_num - -| Name | schema_region_ratis_log_force_sync_num | -| ----------- | -------------------------------------- | -| Description | schema region fsync threshold | -| Type | int32 | -| Default | 128 | -| Effective | Restart required. | - -- data_region_ratis_log_force_sync_num - -| Name | data_region_ratis_log_force_sync_num | -| ----------- | ------------------------------------ | -| Description | data region fsync threshold | -| Type | int32 | -| Default | 128 | -| Effective | Restart required. | - -- config_node_ratis_rpc_leader_election_timeout_min_ms - -| Name | config_node_ratis_rpc_leader_election_timeout_min_ms | -| ----------- | ---------------------------------------------------- | -| Description | confignode leader min election timeout | -| Type | int32 | -| Default | 2000ms | -| Effective | Restart required. | - -- schema_region_ratis_rpc_leader_election_timeout_min_ms - -| Name | schema_region_ratis_rpc_leader_election_timeout_min_ms | -| ----------- | ------------------------------------------------------ | -| Description | schema region leader min election timeout | -| Type | int32 | -| Default | 2000ms | -| Effective | Restart required. 
| - -- data_region_ratis_rpc_leader_election_timeout_min_ms - -| Name | data_region_ratis_rpc_leader_election_timeout_min_ms | -| ----------- | ---------------------------------------------------- | -| Description | data region leader min election timeout | -| Type | int32 | -| Default | 2000ms | -| Effective | Restart required. | - -- config_node_ratis_rpc_leader_election_timeout_max_ms - -| Name | config_node_ratis_rpc_leader_election_timeout_max_ms | -| ----------- | ---------------------------------------------------- | -| Description | confignode leader max election timeout | -| Type | int32 | -| Default | 4000ms | -| Effective | Restart required. | - -- schema_region_ratis_rpc_leader_election_timeout_max_ms - -| Name | schema_region_ratis_rpc_leader_election_timeout_max_ms | -| ----------- | ------------------------------------------------------ | -| Description | schema region leader max election timeout | -| Type | int32 | -| Default | 4000ms | -| Effective | Restart required. | - -- data_region_ratis_rpc_leader_election_timeout_max_ms - -| Name | data_region_ratis_rpc_leader_election_timeout_max_ms | -| ----------- | ---------------------------------------------------- | -| Description | data region leader max election timeout | -| Type | int32 | -| Default | 4000ms | -| Effective | Restart required. | - -- config_node_ratis_request_timeout_ms - -| Name | config_node_ratis_request_timeout_ms | -| ----------- | --------------------------------------- | -| Description | confignode ratis client retry threshold | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required. | - -- schema_region_ratis_request_timeout_ms - -| Name | schema_region_ratis_request_timeout_ms | -| ----------- | ------------------------------------------ | -| Description | schema region ratis client retry threshold | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required. 
| - -- data_region_ratis_request_timeout_ms - -| Name | data_region_ratis_request_timeout_ms | -| ----------- | ---------------------------------------- | -| Description | data region ratis client retry threshold | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required. | - -- config_node_ratis_max_retry_attempts - -| Name | config_node_ratis_max_retry_attempts | -| ----------- | ------------------------------------ | -| Description | confignode ratis client retry times | -| Type | int32 | -| Default | 10 | -| Effective | Restart required. | - -- config_node_ratis_initial_sleep_time_ms - -| Name | config_node_ratis_initial_sleep_time_ms | -| ----------- | ------------------------------------------ | -| Description | confignode ratis client initial sleep time | -| Type | int32 | -| Default | 100ms | -| Effective | Restart required. | - -- config_node_ratis_max_sleep_time_ms - -| Name | config_node_ratis_max_sleep_time_ms | -| ----------- | -------------------------------------------- | -| Description | confignode ratis client max retry sleep time | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required. | - -- schema_region_ratis_max_retry_attempts - -| Name | schema_region_ratis_max_retry_attempts | -| ----------- | ------------------------------------------ | -| Description | schema region ratis client max retry times | -| Type | int32 | -| Default | 10 | -| Effective | Restart required. | - -- schema_region_ratis_initial_sleep_time_ms - -| Name | schema_region_ratis_initial_sleep_time_ms | -| ----------- | ------------------------------------------ | -| Description | schema region ratis client init sleep time | -| Type | int32 | -| Default | 100ms | -| Effective | Restart required. 
| - -- schema_region_ratis_max_sleep_time_ms - -| Name | schema_region_ratis_max_sleep_time_ms | -| ----------- | ----------------------------------------- | -| Description | schema region ratis client max sleep time | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- data_region_ratis_max_retry_attempts - -| Name | data_region_ratis_max_retry_attempts | -| ----------- | --------------------------------------------- | -| Description | data region ratis client max retry times | -| Type | int32 | -| Default | 10 | -| Effective | Restart required. | - -- data_region_ratis_initial_sleep_time_ms - -| Name | data_region_ratis_initial_sleep_time_ms | -| ----------- | ---------------------------------------- | -| Description | data region ratis client init sleep time | -| Type | int32 | -| Default | 100ms | -| Effective | Restart required. | - -- data_region_ratis_max_sleep_time_ms - -| Name | data_region_ratis_max_sleep_time_ms | -| ----------- | --------------------------------------------- | -| Description | data region ratis client max retry sleep time | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- ratis_first_election_timeout_min_ms - -| Name | ratis_first_election_timeout_min_ms | -| ----------- | ----------------------------------- | -| Description | Ratis first election min timeout | -| Type | int64 | -| Default | 50 (ms) | -| Effective | Restart required. | - -- ratis_first_election_timeout_max_ms - -| Name | ratis_first_election_timeout_max_ms | -| ----------- | ----------------------------------- | -| Description | Ratis first election max timeout | -| Type | int64 | -| Default | 150 (ms) | -| Effective | Restart required. 
| - -- config_node_ratis_preserve_logs_num_when_purge - -| Name | config_node_ratis_preserve_logs_num_when_purge | -| ----------- | ------------------------------------------------------------ | -| Description | confignode snapshot preserves certain logs when taking snapshot and purge | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- schema_region_ratis_preserve_logs_num_when_purge - -| Name | schema_region_ratis_preserve_logs_num_when_purge | -| ----------- | ------------------------------------------------------------ | -| Description | schema region snapshot preserves certain logs when taking snapshot and purge | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- data_region_ratis_preserve_logs_num_when_purge - -| Name | data_region_ratis_preserve_logs_num_when_purge | -| ----------- | ------------------------------------------------------------ | -| Description | data region snapshot preserves certain logs when taking snapshot and purge | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- config_node_ratis_log_max_size - -| Name | config_node_ratis_log_max_size | -| ----------- | -------------------------------------- | -| Description | config node Raft Log disk size control | -| Type | int64 | -| Default | 2147483648 (2GB) | -| Effective | Restart required. | - -- schema_region_ratis_log_max_size - -| Name | schema_region_ratis_log_max_size | -| ----------- | ---------------------------------------- | -| Description | schema region Raft Log disk size control | -| Type | int64 | -| Default | 2147483648 (2GB) | -| Effective | Restart required. | - -- data_region_ratis_log_max_size - -| Name | data_region_ratis_log_max_size | -| ----------- | -------------------------------------- | -| Description | data region Raft Log disk size control | -| Type | int64 | -| Default | 21474836480 (20GB) | -| Effective | Restart required. 
| - -- config_node_ratis_periodic_snapshot_interval - -| Name | config_node_ratis_periodic_snapshot_interval | -| ----------- | -------------------------------------------- | -| Description | config node Raft periodic snapshot interval | -| Type | int64 | -| Default | 86400 (s) | -| Effective | Restart required. | - -- schema_region_ratis_periodic_snapshot_interval - -| Name | schema_region_ratis_periodic_snapshot_interval | -| ----------- | ------------------------------------------------ | -| Description | schema region Raft periodic snapshot interval | -| Type | int64 | -| Default | 86400 (s) | -| Effective | Restart required. | - -- data_region_ratis_periodic_snapshot_interval - -| Name | data_region_ratis_periodic_snapshot_interval | -| ----------- | ---------------------------------------------- | -| Description | data region Raft periodic snapshot interval | -| Type | int64 | -| Default | 86400 (s) | -| Effective | Restart required. | - -### 4.31 IoTConsensusV2 Configuration - -- iot_consensus_v2_pipeline_size - -| Name | iot_consensus_v2_pipeline_size | -| ----------- | ------------------------------------------------------------ | -| Description | Default event buffer size for connector and receiver in IoTConsensusV2 | -| Type | int | -| Default | 5 | -| Effective | Restart required. | - -- iot_consensus_v2_mode - -| Name | iot_consensus_v2_mode | -| ----------- | ------------------------------ | -| Description | IoTConsensusV2 mode. | -| Type | String | -| Default | batch | -| Effective | Restart required. | - -### 4.32 Procedure Configuration - -- procedure_core_worker_thread_count - -| Name | procedure_core_worker_thread_count | -| ----------- | ------------------------------------- | -| Description | Default number of worker threads | -| Type | int32 | -| Default | 4 | -| Effective | Restart required. 
| - -- procedure_completed_clean_interval - -| Name | procedure_completed_clean_interval | -| ----------- | ------------------------------------------------------------ | -| Description | Default interval at which the completed-procedure cleaner runs, in seconds | -| Type | int32 | -| Default | 30(s) | -| Effective | Restart required. | - -- procedure_completed_evict_ttl - -| Name | procedure_completed_evict_ttl | -| ----------- | ------------------------------------------------------- | -| Description | Default TTL of completed procedures, in seconds | -| Type | int32 | -| Default | 60(s) | -| Effective | Restart required. | - -### 4.33 MQTT Broker Configuration - -- enable_mqtt_service - -| Name | enable_mqtt_service | -| ----------- | ----------------------------------- | -| Description | Whether to enable the MQTT service. | -| Type | Boolean | -| Default | false | -| Effective | Hot reload | - -- mqtt_host - -| Name | mqtt_host | -| ----------- | ------------------------------ | -| Description | The MQTT service binding host. | -| Type | String | -| Default | 127.0.0.1 | -| Effective | Hot reload | - -- mqtt_port - -| Name | mqtt_port | -| ----------- | ------------------------------ | -| Description | The MQTT service binding port. | -| Type | int32 | -| Default | 1883 | -| Effective | Hot reload | - -- mqtt_handler_pool_size - -| Name | mqtt_handler_pool_size | -| ----------- | ---------------------------------------------------- | -| Description | The handler pool size for handling MQTT messages. | -| Type | int32 | -| Default | 1 | -| Effective | Hot reload | - -- mqtt_payload_formatter - -| Name | mqtt_payload_formatter | -| ----------- | ----------------------------------- | -| Description | The MQTT message payload formatter. 
| -| Type | String | -| Default | json | -| Effective | Hot reload | - -- mqtt_max_message_size - -| Name | mqtt_max_message_size | -| ----------- | ---------------------------------- | -| Description | Maximum length of an MQTT message, in bytes | -| Type | int32 | -| Default | 1048576 | -| Effective | Hot reload | - -### 4.34 Audit log Configuration - -- enable_audit_log - -| Name | enable_audit_log | -| ----------- | -------------------------------- | -| Description | Whether to enable the audit log. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- audit_log_storage - -| Name | audit_log_storage | -| ----------- | ----------------------------- | -| Description | Output location of audit logs | -| Type | String | -| Default | IOTDB,LOGGER | -| Effective | Restart required. | - -- audit_log_operation - -| Name | audit_log_operation | -| ----------- | ------------------------------------------------------------ | -| Description | Whether to enable the audit log for DML operations on data, DDL operations on schema, and QUERY operations on data and schema. | -| Type | String | -| Default | DML,DDL,QUERY | -| Effective | Restart required. | - -- enable_audit_log_for_native_insert_api - -| Name | enable_audit_log_for_native_insert_api | -| ----------- | ---------------------------------------------- | -| Description | Whether the native write API records audit logs | -| Type | Boolean | -| Default | true | -| Effective | Restart required. 
| - -### 4.35 White List Configuration - -- enable_white_list - -| Name | enable_white_list | -| ----------- | ------------------------- | -| Description | whether enable white list | -| Type | Boolean | -| Default | false | -| Effective | Hot reload | - -### 4.36 IoTDB-AI Configuration - -- model_inference_execution_thread_count - -| Name | model_inference_execution_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | The thread count which can be used for model inference operation. | -| Type | int | -| Default | 5 | -| Effective | Restart required. | - -### 4.37 Load TsFile Configuration - -- load_clean_up_task_execution_delay_time_seconds - -| Name | load_clean_up_task_execution_delay_time_seconds | -| ----------- | ------------------------------------------------------------ | -| Description | Load clean up task is used to clean up the unsuccessful loaded tsfile after a certain period of time. | -| Type | int | -| Default | 1800 | -| Effective | Hot reload | - -- load_write_throughput_bytes_per_second - -| Name | load_write_throughput_bytes_per_second | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum bytes per second of disk write throughput when loading tsfile. | -| Type | int | -| Default | -1 | -| Effective | Hot reload | - -- load_active_listening_enable - -| Name | load_active_listening_enable | -| ----------- | ------------------------------------------------------------ | -| Description | Whether to enable the active listening mode for tsfile loading. | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- load_active_listening_dirs - -| Name | load_active_listening_dirs | -| ----------- | ------------------------------------------------------------ | -| Description | The directory to be actively listened for tsfile loading.Multiple directories should be separated by a ','. 
| -| Type | String | -| Default | ext/load/pending | -| Effective | Hot reload | - -- load_active_listening_fail_dir - -| Name | load_active_listening_fail_dir | -| ----------- | ------------------------------------------------------------ | -| Description | The directory to which TsFiles are moved when the active listening mode fails to load them. | -| Type | String | -| Default | ext/load/failed | -| Effective | Hot reload | - -- load_active_listening_max_thread_num - -| Name | load_active_listening_max_thread_num | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of threads that can be used to load TsFiles actively. When this parameter is commented out or <= 0, the number of CPU cores is used. | -| Type | Long | -| Default | 0 | -| Effective | Restart required. | - -- load_active_listening_check_interval_seconds - -| Name | load_active_listening_check_interval_seconds | -| ----------- | ------------------------------------------------------------ | -| Description | The interval, in seconds, at which the active listening mode checks the directories specified in `load_active_listening_dirs`. | -| Type | Long | -| Default | 5 | -| Effective | Restart required. | - -* last_cache_operation_on_load - -|Name| last_cache_operation_on_load | -|:---:|:---| -|Description| The operation performed on LastCache when a TsFile is successfully loaded. 
`UPDATE`: use the data in the TsFile to update LastCache; `UPDATE_NO_BLOB`: similar to UPDATE, but will invalidate LastCache for blob series; `CLEAN_DEVICE`: invalidate LastCache of devices contained in the TsFile; `CLEAN_ALL`: clean the whole LastCache. | -|Type| String | -|Default| UPDATE_NO_BLOB | -|Effective| Effective after restart | - -* cache_last_values_for_load - -|Name| cache_last_values_for_load | -|:---:|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -|Description| Whether to cache last values before loading a TsFile. Only effective when `last_cache_operation_on_load=UPDATE_NO_BLOB` or `last_cache_operation_on_load=UPDATE`. When set to true, blob series will be ignored even with `last_cache_operation_on_load=UPDATE`. Enabling this will increase the memory footprint during loading TsFiles. | -|Type| Boolean | -|Default| true | -|Effective| Effective after restart | - -* cache_last_values_memory_budget_in_byte - -|Name| cache_last_values_memory_budget_in_byte | -|:---:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -|Description| When `cache_last_values_for_load=true`, the maximum memory that can be used to cache last values. If this value is exceeded, the cached values will be abandoned and last values will be read from the TsFile in a streaming manner. 
| -|Type| int32 | -|Default| 4194304 | -|Effective| Effective after restart | - - -### 4.38 Dispatch Retry Configuration - -- write_request_remote_dispatch_max_retry_duration_in_ms - -| Name | write_request_remote_dispatch_max_retry_duration_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum retry duration for remotely dispatched write requests, in milliseconds. | -| Type | Long | -| Default | 60000 | -| Effective | Hot reload | - -- enable_retry_for_unknown_error - -| Name | enable_retry_for_unknown_error | -| ----------- | ------------------------------------ | -| Description | Whether to retry on unknown errors. | -| Type | boolean | -| Default | false | -| Effective | Hot reload | \ No newline at end of file diff --git a/src/UserGuide/Master/Table/Reference/System-Tables_timecho.md b/src/UserGuide/Master/Table/Reference/System-Tables_timecho.md deleted file mode 100644 index 8338d998d..000000000 --- a/src/UserGuide/Master/Table/Reference/System-Tables_timecho.md +++ /dev/null @@ -1,801 +0,0 @@ - - -# System Tables - -IoTDB has a built-in system database called `INFORMATION_SCHEMA`, which contains a series of system tables that store IoTDB runtime information (such as currently executing SQL statements). Currently, the `INFORMATION_SCHEMA` database only supports read operations. - -> 💡 **[V2.0.9.1 Version Update]**
-> 👉 Added one system table: **[TABLE_DISK_USAGE](#_2-22-table-disk-usage)** (Table-level Storage Space Statistics), enhancing cluster maintenance and performance analysis. - - -## 1. System Database - -* **Name**: `INFORMATION_SCHEMA` -* **Commands**: Read-only, only supports `Show databases (DETAILS)` / `Show Tables (DETAILS)` / `Use`. Other operations will result in an error: `"The database 'information_schema' can only be queried."` -* **Attributes**: `TTL=INF`, other attributes default to `null` -* **SQL Example**: - -```sql -IoTDB> show databases -+------------------+-------+-----------------------+---------------------+---------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval| -+------------------+-------+-----------------------+---------------------+---------------------+ -|information_schema| INF| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+ - -IoTDB> show tables from information_schema -+-----------------------+-------+ -| TableName|TTL(ms)| -+-----------------------+-------+ -| columns| INF| -| config_nodes| INF| -| configurations| INF| -| connections| INF| -| current_queries| INF| -| data_nodes| INF| -| databases| INF| -| functions| INF| -| keywords| INF| -| nodes| INF| -| pipe_plugins| INF| -| pipes| INF| -| queries| INF| -|queries_costs_histogram| INF| -| regions| INF| -| services| INF| -| subscriptions| INF| -| table_disk_usage| INF| -| tables| INF| -| topics| INF| -| views| INF| -+-----------------------+-------+ -``` - -## 2. 
System Tables - -* **Names**: `DATABASES`, `TABLES`, `REGIONS`, `QUERIES`, `COLUMNS`, `PIPES`, `PIPE_PLUGINS`, `SUBSCRIPTIONS`, `TOPICS`, `VIEWS`, `MODELS`, `FUNCTIONS`, `CONFIGURATIONS`, `KEYWORDS`, `NODES`, `CONFIG_NODES`, `DATA_NODES`, `CONNECTIONS`, `CURRENT_QUERIES`, `QUERIES_COSTS_HISTOGRAM`, `SERVICES`, `TABLE_DISK_USAGE` (detailed descriptions in later sections) -* **Operations**: Read-only, only supports `SELECT`, `COUNT/SHOW DEVICES`, `DESC`. Any modifications to table structure or content are not allowed and will result in an error: `"The database 'information_schema' can only be queried."` -* **Column Names**: System table column names are all lowercase by default and separated by underscores (`_`). - -### 2.1 DATABASES - -* Contains information about all databases in the cluster. -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| --------------------------------- | ----------- | ------------- | -------------------------------- | -| `database` | STRING | TAG | Database name | -| `ttl(ms)` | STRING | ATTRIBUTE | Data retention time | -| `schema_replication_factor` | INT32 | ATTRIBUTE | Schema replica count | -| `data_replication_factor` | INT32 | ATTRIBUTE | Data replica count | -| `time_partition_interval` | INT64 | ATTRIBUTE | Time partition interval | -| `schema_region_group_num` | INT32 | ATTRIBUTE | Number of schema region groups | -| `data_region_group_num` | INT32 | ATTRIBUTE | Number of data region groups | - -* The query results only include databases for which you hold a permission on the database itself or on any table within it. 
-* Query Example: - -```sql -IoTDB> select * from information_schema.databases -+------------------+-------+-------------------------+-----------------------+-----------------------+-----------------------+---------------------+ -| database|ttl(ms)|schema_replication_factor|data_replication_factor|time_partition_interval|schema_region_group_num|data_region_group_num| -+------------------+-------+-------------------------+-----------------------+-----------------------+-----------------------+---------------------+ -|information_schema| INF| null| null| null| null| null| -| database1| INF| 1| 1| 604800000| 0| 0| -+------------------+-------+-------------------------+-----------------------+-----------------------+-----------------------+---------------------+ -``` - -### 2.2 TABLES - -* Contains information about all tables in the cluster. -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ------------------ | ----------- | ------------- | --------------------- | -| `database` | STRING | TAG | Database name | -| `table_name` | STRING | TAG | Table name | -| `ttl(ms)` | STRING | ATTRIBUTE | Data retention time | -| `status` | STRING | ATTRIBUTE | Status | -| `comment` | STRING | ATTRIBUTE | Description/comment | - -* Note: Possible values for `status`: `USING`, `PRE_CREATE`, `PRE_DELETE`. For details, refer to the [View Tables](../Basic-Concept/Table-Management_timecho.md#12-view-tables) in Table Management documentation -* The query results only display the collection of tables for which you have any permission. 
-* Query Example: - -```sql -IoTDB> select * from information_schema.tables -+------------------+--------------+-----------+------+-------+-----------+ -| database| table_name| ttl(ms)|status|comment| table_type| -+------------------+--------------+-----------+------+-------+-----------+ -|information_schema| databases| INF| USING| null|SYSTEM VIEW| -|information_schema| models| INF| USING| null|SYSTEM VIEW| -|information_schema| subscriptions| INF| USING| null|SYSTEM VIEW| -|information_schema| regions| INF| USING| null|SYSTEM VIEW| -|information_schema| functions| INF| USING| null|SYSTEM VIEW| -|information_schema| keywords| INF| USING| null|SYSTEM VIEW| -|information_schema| columns| INF| USING| null|SYSTEM VIEW| -|information_schema| topics| INF| USING| null|SYSTEM VIEW| -|information_schema|configurations| INF| USING| null|SYSTEM VIEW| -|information_schema| queries| INF| USING| null|SYSTEM VIEW| -|information_schema| tables| INF| USING| null|SYSTEM VIEW| -|information_schema| pipe_plugins| INF| USING| null|SYSTEM VIEW| -|information_schema| nodes| INF| USING| null|SYSTEM VIEW| -|information_schema| data_nodes| INF| USING| null|SYSTEM VIEW| -|information_schema| pipes| INF| USING| null|SYSTEM VIEW| -|information_schema| views| INF| USING| null|SYSTEM VIEW| -|information_schema| config_nodes| INF| USING| null|SYSTEM VIEW| -| database1| table1|31536000000| USING| null| BASE TABLE| -+------------------+--------------+-----------+------+-------+-----------+ -``` - -### 2.3 REGIONS - -* Contains information about all regions in the cluster. 
-* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ------------------------- | ----------- | ------------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `region_id` | INT32 | TAG | Region ID | -| `datanode_id` | INT32 | TAG | DataNode ID | -| `type` | STRING | ATTRIBUTE | Type (`SchemaRegion`/`DataRegion`) | -| `status` | STRING | ATTRIBUTE | Status (`Running`,`Unknown`, etc.) | -| `database` | STRING | ATTRIBUTE | Database name | -| `series_slot_num` | INT32 | ATTRIBUTE | Number of series slots | -| `time_slot_num` | INT64 | ATTRIBUTE | Number of time slots | -| `rpc_address` | STRING | ATTRIBUTE | RPC address | -| `rpc_port` | INT32 | ATTRIBUTE | RPC port | -| `internal_address` | STRING | ATTRIBUTE | Internal communication address | -| `role` | STRING | ATTRIBUTE | Role (`Leader`/`Follower`) | -| `create_time` | TIMESTAMP | ATTRIBUTE | Creation time | -| `tsfile_size_bytes` | INT64 | ATTRIBUTE | - For​**DataRegion with statistics ​**​: Total file size of TsFiles.
- For **DataRegion without statistics** (Unknown): `-1`.
- For​**SchemaRegion**​:`null`. | - -* Only administrators are allowed to perform query operations. -* Query Example: - -```SQL -IoTDB> select * from information_schema.regions -+---------+-----------+------------+-------+---------+---------------+-------------+-----------+--------+----------------+------+-----------------------------+-----------------+ -|region_id|datanode_id| type| status| database|series_slot_num|time_slot_num|rpc_address|rpc_port|internal_address| role| create_time|tsfile_size_bytes| -+---------+-----------+------------+-------+---------+---------------+-------------+-----------+--------+----------------+------+-----------------------------+-----------------+ -| 0| 1|SchemaRegion|Running|database1| 12| 0| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-03-31T11:19:08.485+08:00| null| -| 1| 1| DataRegion|Running|database1| 6| 6| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-03-31T11:19:09.156+08:00| 3985| -| 2| 1| DataRegion|Running|database1| 6| 6| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-03-31T11:19:09.156+08:00| 3841| -+---------+-----------+------------+-------+---------+---------------+-------------+-----------+--------+----------------+------+-----------------------------+-----------------+ -``` - -### 2.4 QUERIES - -* Contains information about all currently executing queries in the cluster. Can also be queried using the `SHOW QUERIES` syntax. 
-* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| -------------------- | ----------- | ------------- | ------------------------------------------------------------ | -| `query_id` | STRING | TAG | Query ID | -| `start_time` | TIMESTAMP | ATTRIBUTE | Query start timestamp (precision matches system precision) | -| `datanode_id` | INT32 | ATTRIBUTE | DataNode ID that initiated the query | -| `elapsed_time` | FLOAT | ATTRIBUTE | Query execution duration (in seconds) | -| `statement` | STRING | ATTRIBUTE | SQL statement of the query | -| `user` | STRING | ATTRIBUTE | User who initiated the query | - -* For regular users, the query results only display the queries executed by themselves; for administrators, all queries are displayed. -* Query Example: - -```SQL -IoTDB> select * from information_schema.queries -+-----------------------+-----------------------------+-----------+------------+----------------------------------------+----+ -| query_id| start_time|datanode_id|elapsed_time| statement|user| -+-----------------------+-----------------------------+-----------+------------+----------------------------------------+----+ -|20250331_023242_00011_1|2025-03-31T10:32:42.360+08:00| 1| 0.025|select * from information_schema.queries|root| -+-----------------------+-----------------------------+-----------+------------+----------------------------------------+----+ -``` - -### 2.5 COLUMNS - -* Contains information about all columns in tables across the cluster -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ------------------- | ----------- | ------------- | -------------------- | -| `database` | STRING | TAG | Database name | -| `table_name` | STRING | TAG | Table name | -| `column_name` | STRING | TAG | Column name | -| `datatype` | STRING | ATTRIBUTE | Column data type | -| `category` | STRING | ATTRIBUTE | Column category | -| `status` | STRING | ATTRIBUTE | Column status | -| 
`comment` | STRING | ATTRIBUTE | Column description | - -Notes: -* Possible values for `status`: `USING`, `PRE_DELETE`. For details, refer to [Viewing Table Columns](../Basic-Concept/Table-Management_timecho.html#13-view-table-columns) in Table Management documentation. -* The query results only display the column information of tables for which you have any permission. - -* Query Example: - -```SQL -IoTDB> select * from information_schema.columns where database = 'database1' -+---------+----------+------------+---------+---------+------+-------+ -| database|table_name| column_name| datatype| category|status|comment| -+---------+----------+------------+---------+---------+------+-------+ -|database1| table1| time|TIMESTAMP| TIME| USING| null| -|database1| table1| region| STRING| TAG| USING| null| -|database1| table1| plant_id| STRING| TAG| USING| null| -|database1| table1| device_id| STRING| TAG| USING| null| -|database1| table1| model_id| STRING|ATTRIBUTE| USING| null| -|database1| table1| maintenance| STRING|ATTRIBUTE| USING| null| -|database1| table1| temperature| FLOAT| FIELD| USING| null| -|database1| table1| humidity| FLOAT| FIELD| USING| null| -|database1| table1| status| BOOLEAN| FIELD| USING| null| -|database1| table1|arrival_time|TIMESTAMP| FIELD| USING| null| -+---------+----------+------------+---------+---------+------+-------+ -``` - -### 2.6 PIPES - -* Contains information about all pipes in the cluster -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ----------------------------------- | ----------- | ------------- | ---------------------------------------------------------- | -| `id` | STRING | TAG | Pipe name | -| `creation_time` | TIMESTAMP | ATTRIBUTE | Creation time | -| `state` | STRING | ATTRIBUTE | Pipe status (`RUNNING`/`STOPPED`) | -| `pipe_source` | STRING | ATTRIBUTE | Source plugin parameters | -| `pipe_processor` | STRING | ATTRIBUTE | Processor plugin parameters | -| `pipe_sink` | STRING 
| ATTRIBUTE | Sink plugin parameters | -| `exception_message` | STRING | ATTRIBUTE | Exception message | -| `remaining_event_count` | INT64 | ATTRIBUTE | Remaining event count (`-1`if Unknown) | -| `estimated_remaining_seconds` | DOUBLE | ATTRIBUTE | Estimated remaining time in seconds (`-1`if Unknown) | - -* Only administrators are allowed to perform operations. -* Query Example: - -```SQL -select * from information_schema.pipes -+----------+-----------------------------+-------+--------------------------------------------------------------------------+--------------+-----------------------------------------------------------------------+-----------------+---------------------+---------------------------+ -| id| creation_time| state| pipe_source|pipe_processor| pipe_sink|exception_message|remaining_event_count|estimated_remaining_seconds| -+----------+-----------------------------+-------+--------------------------------------------------------------------------+--------------+-----------------------------------------------------------------------+-----------------+---------------------+---------------------------+ -|tablepipe1|2025-03-31T12:25:24.040+08:00|RUNNING|{__system.sql-dialect=table, source.password=******, source.username=root}| {}|{format=hybrid, node-urls=192.168.xxx.xxx:6667, sink=iotdb-thrift-sink}| | 0| 0.0| -+----------+-----------------------------+-------+--------------------------------------------------------------------------+--------------+-----------------------------------------------------------------------+-----------------+---------------------+---------------------------+ -``` - -### 2.7 PIPE\_PLUGINS - -* Contains information about all PIPE plugins in the cluster -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ------------------- | ----------- | ------------- | ----------------------------------------------------- | -| `plugin_name` | STRING | TAG | Plugin name | -| `plugin_type` | 
STRING | ATTRIBUTE | Plugin type (`Builtin`/`External`) | -| `class_name` | STRING | ATTRIBUTE | Plugin's main class name | -| `plugin_jar` | STRING | ATTRIBUTE | Plugin's JAR file name (`null`for builtin type) | - -* Query Example: - -```SQL -IoTDB> select * from information_schema.pipe_plugins -+---------------------+-----------+-------------------------------------------------------------------------------------------------+----------+ -| plugin_name|plugin_type| class_name|plugin_jar| -+---------------------+-----------+-------------------------------------------------------------------------------------------------+----------+ -|IOTDB-THRIFT-SSL-SINK| Builtin|org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector| null| -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector| null| -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.donothing.DoNothingConnector| null| -| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.processor.donothing.DoNothingProcessor| null| -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector| null| -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.extractor.iotdb.IoTDBExtractor| null| -+---------------------+-----------+-------------------------------------------------------------------------------------------------+----------+ -``` - -### 2.8 SUBSCRIPTIONS - -* Contains information about all data subscriptions in the cluster -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ---------------------------- | ----------- | ------------- | ------------------------- | -| `topic_name` | STRING | TAG | Subscription topic name | -| `consumer_group_name` | STRING | TAG | Consumer group name | -| `subscribed_consumers` | 
STRING | ATTRIBUTE | Subscribed consumers | - -* Only administrators are allowed to perform operations. -* Query Example: - -```SQL -IoTDB> select * from information_schema.subscriptions where topic_name = 'topic_1' -+----------+-------------------+--------------------------------+ -|topic_name|consumer_group_name| subscribed_consumers| -+----------+-------------------+--------------------------------+ -| topic_1| cg1|[c3, c4, c5, c6, c7, c0, c1, c2]| -+----------+-------------------+--------------------------------+ -``` - -### 2.9 TOPICS - -* Contains information about all data subscription topics in the cluster -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| --------------------- | ----------- | ------------- | -------------------------------- | -| `topic_name` | STRING | TAG | Subscription topic name | -| `topic_configs` | STRING | ATTRIBUTE | Topic configuration parameters | - -* Only administrators are allowed to perform operations. -* Query Example: - -```SQL -IoTDB> select * from information_schema.topics -+----------+----------------------------------------------------------------+ -|topic_name| topic_configs| -+----------+----------------------------------------------------------------+ -| topic|{__system.sql-dialect=table, start-time=2025-01-10T17:05:38.282}| -+----------+----------------------------------------------------------------+ -``` - -### 2.10 VIEWS - -> This system table is available starting from version V2.0.5. - -* Contains information about all table views in the database. 
-* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------------ | ----------- | ----------------- | --------------------------------- | -| database | STRING | TAG | Database name | -| table\_name | STRING | TAG | View name | -| view\_definition | STRING | ATTRIBUTE | SQL statement for view creation | - -* The query results only display the collection of views for which you have any permission. -* Query example: - -```SQL -IoTDB> select * from information_schema.views -+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------+ -| database|table_name| view_definition| -+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------+ -|database1| ln|CREATE VIEW "ln" ("device" STRING TAG,"model" STRING TAG,"status" BOOLEAN FIELD,"hardware" STRING FIELD) WITH (ttl='INF') AS root.ln.**| -+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------+ -``` - - -### 2.11 MODELS - -> This system table is available starting from version V 2.0.5 and has been discontinued since version V 2.0.8. - -* Contains information about all models in the database. 
-* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------- | ----------- | ----------------- | ------------------------------------------------------------------------------------------------ | -| model\_id | STRING | TAG | Model name | -| model\_type | STRING | ATTRIBUTE | Model type (Forecast, Anomaly Detection, Custom) | -| state | STRING | ATTRIBUTE | Model status (Available/Unavailable) | -| configs | STRING | ATTRIBUTE | String format of model hyperparameters, consistent with the output of the `show` command | -| notes | STRING | ATTRIBUTE | Model description. Built-in model: `Built-in model in IoTDB`; user-defined model: `Custom model` | - -* Query example: - -```SQL --- Find all built-in forecast models -IoTDB> select * from information_schema.models where model_type = 'BUILT_IN_FORECAST' -+---------------------+-----------------+------+-------+-----------------------+ -| model_id| model_type| state|configs| notes| -+---------------------+-----------------+------+-------+-----------------------+ -| _STLForecaster|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -| _NaiveForecaster|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -| _ARIMA|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -|_ExponentialSmoothing|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -| _HoltWinters|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -| _sundial|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -+---------------------+-----------------+------+-------+-----------------------+ -``` - - -### 2.12 FUNCTIONS - -> This system table is available starting from version V2.0.5. - -* Contains information about all functions in the database. 
-* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------------ | ----------- | ----------------- | -------------------------------------------------------------------------- | -| function\_name | STRING | TAG | Function name | -| function\_type | STRING | ATTRIBUTE | Function type (Built-in/User-defined, Scalar/Aggregation/Table Function) | -| class\_name(udf) | STRING | ATTRIBUTE | Class name if it is a UDF, otherwise null (tentative) | -| state | STRING | ATTRIBUTE | Availability status | - -* Query example: - -```SQL -IoTDB> select * from information_schema.functions where function_type='built-in table function' -+--------------+-----------------------+---------------+---------+ -|function_name | function_type|class_name(udf)| state| -+--------------+-----------------------+---------------+---------+ -| CUMULATE|built-in table function| null|AVAILABLE| -| SESSION|built-in table function| null|AVAILABLE| -| HOP|built-in table function| null|AVAILABLE| -| TUMBLE|built-in table function| null|AVAILABLE| -| FORECAST|built-in table function| null|AVAILABLE| -| VARIATION|built-in table function| null|AVAILABLE| -| CAPACITY|built-in table function| null|AVAILABLE| -+--------------+-----------------------+---------------+---------+ -``` - - -### 2.13 CONFIGURATIONS - -> This system table is available starting from version V2.0.5. - -* Contains all configuration properties of the database. -* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------- | ----------- | ----------------- | ------------------------------ | -| variable | STRING | TAG | Configuration property name | -| value | STRING | ATTRIBUTE | Configuration property value | - -* Only administrators are allowed to perform operations on this table. 
-* Query example: - -```SQL -IoTDB> select * from information_schema.configurations -+----------------------------------+-----------------------------------------------------------------+ -| variable| value| -+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - - -### 2.14 KEYWORDS - -> This system table is available starting from version V2.0.5. - -* Contains all keywords in the database. -* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------- | ----------- | ----------------- | ------------------------------------------------- | -| word | STRING | TAG | Keyword | -| reserved | INT32 | ATTRIBUTE | Whether it is a reserved word (1 = Yes, 0 = No) | - -* Query example: - -```SQL -IoTDB> select * from information_schema.keywords limit 10 -+----------+--------+ -| word|reserved| -+----------+--------+ -| ABSENT| 0| -|ACTIVATION| 1| -| ACTIVATE| 1| -| ADD| 0| -| ADMIN| 0| -| AFTER| 0| -| AINODES| 1| -| ALL| 0| -| ALTER| 1| -| ANALYZE| 0| -+----------+--------+ -``` - - -### 2.15 NODES - -> This system table is available starting from version V2.0.5. 
- -* Contains information about all nodes in the database cluster. -* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| -------------------------------------------- | ----------- | ----------------- | ---------------------- | -| node\_id | INT32 | TAG | Node ID | -| node\_type | STRING | ATTRIBUTE | Node type | -| status | STRING | ATTRIBUTE | Node status | -| internal\_address | STRING | ATTRIBUTE | Internal RPC address | -| internal\_port | INT32 | ATTRIBUTE | Internal port | -| version | STRING | ATTRIBUTE | Version number | -| build\_info | STRING | ATTRIBUTE | Commit ID | -| activate\_status (Enterprise Edition only) | STRING | ATTRIBUTE | Activation status | - -* Only administrators are allowed to perform operations on this table. -* Query example: - -```SQL -IoTDB> select * from information_schema.nodes -+-------+----------+-------+----------------+-------------+-------+----------+ -|node_id| node_type| status|internal_address|internal_port|version|build_info| -+-------+----------+-------+----------------+-------------+-------+----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|2.0.5.1| 58d685e| -| 1| DataNode|Running| 127.0.0.1| 10730|2.0.5.1| 58d685e| -+-------+----------+-------+----------------+-------------+-------+----------+ -``` - - -### 2.16 CONFIG\_NODES - -> This system table is available starting from version V2.0.5. - -* Contains information about all ConfigNodes in the cluster. -* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------------------- | ----------- | ----------------- | --------------------------- | -| node\_id | INT32 | TAG | Node ID | -| config\_consensus\_port | INT32 | ATTRIBUTE | ConfigNode consensus port | -| role | STRING | ATTRIBUTE | ConfigNode role | - -* Only administrators are allowed to perform operations on this table. 
-* Query example: - -```SQL -IoTDB> select * from information_schema.config_nodes -+-------+---------------------+------+ -|node_id|config_consensus_port| role| -+-------+---------------------+------+ -| 0| 10720|Leader| -+-------+---------------------+------+ -``` - - -### 2.17 DATA\_NODES - -> This system table is available starting from version V2.0.5. - -* Contains information about all DataNodes in the cluster. -* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------------------- | ----------- | ----------------- | ----------------------------- | -| node\_id | INT32 | TAG | Node ID | -| data\_region\_num | INT32 | ATTRIBUTE | Number of DataRegions | -| schema\_region\_num | INT32 | ATTRIBUTE | Number of SchemaRegions | -| rpc\_address | STRING | ATTRIBUTE | RPC address | -| rpc\_port | INT32 | ATTRIBUTE | RPC port | -| mpp\_port | INT32 | ATTRIBUTE | MPP communication port | -| data\_consensus\_port | INT32 | ATTRIBUTE | DataRegion consensus port | -| schema\_consensus\_port | INT32 | ATTRIBUTE | SchemaRegion consensus port | - -* Only administrators are allowed to perform operations on this table. -* Query example: - -```SQL -IoTDB> select * from information_schema.data_nodes -+-------+---------------+-----------------+-----------+--------+--------+-------------------+---------------------+ -|node_id|data_region_num|schema_region_num|rpc_address|rpc_port|mpp_port|data_consensus_port|schema_consensus_port| -+-------+---------------+-----------------+-----------+--------+--------+-------------------+---------------------+ -| 1| 4| 4| 0.0.0.0| 6667| 10740| 10760| 10750| -+-------+---------------+-----------------+-----------+--------+--------+-------------------+---------------------+ -``` - -### 2.18 CONNECTIONS - -> This system table is available starting from version V 2.0.8 - -* Contains all connections in the cluster. 
-* The table structure is as follows: - -| **Column Name** | **Data Type** | **Column Type** | **Description** | -|-----------------|---------------|-----------------|------------------------| -| datanode_id | STRING | TAG | DataNode ID | -| user_id | STRING | TAG | User ID | -| session_id | STRING | TAG | Session ID | -| user_name | STRING | ATTRIBUTE | Username | -| last_active_time| TIMESTAMP | ATTRIBUTE | Last active time | -| client_ip | STRING | ATTRIBUTE | Client IP address | - -* Query example: - -```SQL -IoTDB> select * from information_schema.connections; -+-----------+-------+----------+---------+-----------------------------+---------+ -|datanode_id|user_id|session_id|user_name| last_active_time|client_ip| -+-----------+-------+----------+---------+-----------------------------+---------+ -| 1| 0| 2| root|2026-01-21T16:28:54.704+08:00|127.0.0.1| -+-----------+-------+----------+---------+-----------------------------+---------+ -``` - -### 2.19 CURRENT_QUERIES - -> This system table is available starting from version V 2.0.8 - -* Contains all queries whose execution end time falls within the range `[now() - query_cost_stat_window, now())`, including currently executing queries. The `query_cost_stat_window` parameter represents the query cost statistics window. Its default value is 0 and can be configured via the `iotdb-system.properties` configuration file. -* The table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -|--------------|-----------|-------------|-----------------------------------------------------------------------------| -| query_id | STRING | TAG | Query statement ID | -| state | STRING | FIELD | Query state: RUNNING indicates executing, FINISHED indicates completed | -| start_time | TIMESTAMP | FIELD | Query start timestamp (precision matches system timestamp precision) | -| end_time | TIMESTAMP | FIELD | Query end timestamp (precision matches system timestamp precision). 
NULL if query is not yet finished | -| datanode_id | INT32 | FIELD | DataNode from which the query was initiated | -| cost_time | FLOAT | FIELD | Query execution time in seconds. If query is not finished, shows elapsed time | -| statement | STRING | FIELD | Query SQL / concatenated query request SQL | -| user | STRING | FIELD | User who initiated the query | -| client_ip | STRING | FIELD | Client IP address that initiated the query | - -* Regular users can only view their own queries; administrators can view all queries. -* Query example: - -```SQL -IoTDB> select * from information_schema.current_queries; -+-----------------------+-------+-----------------------------+--------+-----------+---------+------------------------------------------------+----+---------+ -| query_id| state| start_time|end_time|datanode_id|cost_time| statement|user|client_ip| -+-----------------------+-------+-----------------------------+--------+-----------+---------+------------------------------------------------+----+---------+ -|20260121_085427_00013_1|RUNNING|2026-01-21T16:54:27.019+08:00| null| 1| 0.0|select * from information_schema.current_queries|root|127.0.0.1| -+-----------------------+-------+-----------------------------+--------+-----------+---------+------------------------------------------------+----+---------+ -``` - -### 2.20 QUERIES_COSTS_HISTOGRAM - -> This system table is available starting from version V 2.0.8 - -* Contains a histogram of query execution times within the past `query_cost_stat_window` period (only statistics for completed SQL queries). The `query_cost_stat_window` parameter represents the query cost statistics window. Its default value is 0 and can be configured via the `iotdb-system.properties` configuration file. 
-* The table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -|--------------|-----------|-------------|-----------------------------------------------------------------------------| -| bin | STRING | TAG | Bucket name: 61 buckets total - [0, 1), [1, 2), [2, 3), ..., [59, 60), 60+ | -| nums | INT32 | FIELD | Number of SQL queries in the bucket | -| datanode_id | INT32 | FIELD | DataNode to which this bucket belongs | - -* Only administrators can execute operations on this table. -* Query example: - -```SQL -IoTDB> select * from information_schema.queries_costs_histogram limit 10 -+------+----+-----------+ -| bin|nums|datanode_id| -+------+----+-----------+ -| [0,1)| 0| 1| -| [1,2)| 0| 1| -| [2,3)| 0| 1| -| [3,4)| 0| 1| -| [4,5)| 0| 1| -| [5,6)| 0| 1| -| [6,7)| 0| 1| -| [7,8)| 0| 1| -| [8,9)| 0| 1| -|[9,10)| 0| 1| -+------+----+-----------+ -``` - -### 2.21 SERVICES - -> This system table is available starting from version V 2.0.8.2 - -* Displays services (MQTT service, REST service) on all active DataNodes (with RUNNING or READ-ONLY status). -* Table structure: - -| Column Name | Data Type | Column Type | Description | -|---------------|-----------|-------------|---------------------------------| -| service_name | STRING | TAG | Service Name | -| datanode_id | INT32 | ATTRIBUTE | DataNode ID where service runs | -| state | STRING | ATTRIBUTE | Service status: RUNNING/STOPPED | - - -* Query example: - -```sql -IoTDB> SELECT * FROM information_schema.services -+------------+-----------+---------+ -|service_name|datanode_id|state | -+------------+-----------+---------+ -|MQTT |1 |STOPPED | -|REST |1 |RUNNING | -+------------+-----------+---------+ -``` - -### 2.22 TABLE_DISK_USAGE -> This system table is available since version V2.0.9.1 - -Used to display the disk space usage of specified tables (excluding views), including the size of ChunkGroups and the size of Metadata. 
- -Note: Statistics are based on the actual size of data in TsFiles; therefore, deletions made via mods are not considered. - -The table structure is shown below: - -| Column Name | Data Type | Column Type | Description | -|-----------------|-----------|-------------|----------------------------------| -| database | string | Field | Database name | -| table_name | string | Field | Table name | -| datanode_id | int32 | Field | DataNode node ID | -| region_id | int32 | Field | Region ID | -| time_partition | int64 | Field | Time partition ID | -| size_in_bytes | int64 | Field | Disk space occupied (in bytes) | - -**Query Examples**: - -```SQL --- Query all data; -select * from information_schema.table_disk_usage; -``` - -```Bash -+---------+-------------------+-----------+---------+--------------+-------------+ -| database| table_name|datanode_id|region_id|time_partition|size_in_bytes| -+---------+-------------------+-----------+---------+--------------+-------------+ -|database1| table1| 1| 3| 2864| 867| -|database1| table11| 1| 3| 2864| 0| -|database1| table3| 1| 3| 2864| 0| -|database1| table1| 1| 3| 2865| 1411| -|database1| table11| 1| 3| 2865| 0| -|database1| table3| 1| 3| 2865| 0| -|database1| table1| 1| 3| 2925| 590| -|database1| table11| 1| 3| 2925| 0| -|database1| table3| 1| 3| 2925| 0| -|database1| table1| 1| 4| 2864| 883| -|database1| table11| 1| 4| 2864| 0| -|database1| table3| 1| 4| 2864| 0| -|database1| table1| 1| 4| 2865| 1224| -|database1| table11| 1| 4| 2865| 0| -|database1| table3| 1| 4| 2865| 0| -|database1| table1| 1| 4| 2888| 0| -|database1| table11| 1| 4| 2888| 0| -|database1| table3| 1| 4| 2888| 205| -| etth| tab_cov_forecast| 1| 8| 0| 0| -| etth| tab_real| 1| 8| 0| 963| -| etth|tab_target_forecast| 1| 8| 0| 0| -| etth| tab_cov_forecast| 1| 9| 0| 448| -| etth| tab_real| 1| 9| 0| 0| -| etth|tab_target_forecast| 1| 9| 0| 0| -+---------+-------------------+-----------+---------+--------------+-------------+ -``` - -```SQL --- Specify query 
conditions; -select * from information_schema.table_disk_usage where region_id = 4 and table_name like '%1'; -``` - -```Bash -+---------+----------+-----------+---------+--------------+-------------+ -| database|table_name|datanode_id|region_id|time_partition|size_in_bytes| -+---------+----------+-----------+---------+--------------+-------------+ -|database1| table1| 1| 4| 2864| 883| -|database1| table11| 1| 4| 2864| 0| -|database1| table1| 1| 4| 2865| 1224| -|database1| table11| 1| 4| 2865| 0| -|database1| table1| 1| 4| 2888| 0| -|database1| table11| 1| 4| 2888| 0| -+---------+----------+-----------+---------+--------------+-------------+ -``` - - -## 3. Permission Description - -* GRANT/REVOKE operations are not supported for the `information_schema` database or any of its tables. -* All users can view `information_schema` database details via the `SHOW DATABASES` statement. -* All users can list system tables via `SHOW TABLES FROM information_schema`. -* All users can inspect system table structures using the `DESC` statement. diff --git a/src/UserGuide/Master/Table/SQL-Manual/Basis-Function_timecho.md b/src/UserGuide/Master/Table/SQL-Manual/Basis-Function_timecho.md deleted file mode 100644 index 0010eebda..000000000 --- a/src/UserGuide/Master/Table/SQL-Manual/Basis-Function_timecho.md +++ /dev/null @@ -1,2392 +0,0 @@ - - - -# Basic Functions - -## 1. Comparison Functions and Operators - -### 1.1 Basic Comparison Operators - -Comparison operators are used to compare two values and return the comparison result (`true` or `false`). - -| Operators | Description | -| :-------- | :----------------------- | -| < | Less than | -| > | Greater than | -| <= | Less than or equal to | -| >= | Greater than or equal to | -| = | Equal to | -| <> | Not equal to | -| != | Not equal to | - -#### 1.1.1 Comparison rules: - -1. All types can be compared with themselves. -2. Numeric types (INT32, INT64, FLOAT, DOUBLE, TIMESTAMP) can be compared with each other. -3. 
Character types (STRING, TEXT) can also be compared with each other. -4. Comparisons between types other than those mentioned above will result in an error. - -### 1.2 BETWEEN Operator - -1. The `BETWEEN` operator is used to determine whether a value falls within a specified range. -2. The `NOT BETWEEN` operator is used to determine whether a value does not fall within a specified range. -3. The `BETWEEN` and `NOT BETWEEN` operators can be used to evaluate any sortable type. -4. The value, minimum, and maximum parameters for `BETWEEN` and `NOT BETWEEN` must be of the same type, otherwise an error will occur. - -Syntax: - -```SQL - value BETWEEN min AND max - value NOT BETWEEN min AND max -``` - -Example 1: BETWEEN - -```SQL --- Query records where temperature is between 85.0 and 90.0 -SELECT * FROM table1 WHERE temperature BETWEEN 85.0 AND 90.0; -``` - -Example 2: NOT BETWEEN - -```SQL --- Query records where humidity is not between 35.0 and 40.0 -SELECT * FROM table1 WHERE humidity NOT BETWEEN 35.0 AND 40.0; -``` - -### 1.3 IS NULL Operator - -1. The `IS NULL` and `IS NOT NULL` operators apply to all data types. - -Example 1: Query records where temperature is NULL - -```SQL -SELECT * FROM table1 WHERE temperature IS NULL; -``` - -Example 2: Query records where humidity is not NULL - -```SQL -SELECT * FROM table1 WHERE humidity IS NOT NULL; -``` - -### 1.4 IN Operator - -1. The `IN` operator can be used in the `WHERE` clause to compare a column with a list of values. -2. These values can be provided by a static array or scalar expressions. - -Syntax: - -```SQL -... 
WHERE column [NOT] IN ('value1','value2', expression1) -``` - -Example 1: Static array: Query records where region is 'Beijing' or 'Shanghai' - -```SQL -SELECT * FROM table1 WHERE region IN ('Beijing', 'Shanghai'); --- Equivalent to -SELECT * FROM table1 WHERE region = 'Beijing' OR region = 'Shanghai'; -``` - -Example 2: Scalar expression: Query records where temperature is among specific values - -```SQL -SELECT * FROM table1 WHERE temperature IN (85.0, 90.0); -``` - -Example 3: Query records where region is not 'Beijing' or 'Shanghai' - -```SQL -SELECT * FROM table1 WHERE region NOT IN ('Beijing', 'Shanghai'); -``` - -### 1.5 GREATEST and LEAST - -The `GREATEST` function returns the maximum value from a list of arguments, while the `LEAST` function returns the minimum value. The return type matches the input data type. - -Key Behaviors: -1. NULL Handling: Returns NULL if all arguments are NULL. -2. Parameter Requirements: Requires at least 2 arguments. -3. Type Constraints: All arguments must have the same data type. -4. Supported Types: `BOOLEAN`, `FLOAT`, `DOUBLE`, `INT32`, `INT64`, `STRING`, `TEXT`, `TIMESTAMP`, `DATE` - -**Syntax:** - -```sql - greatest(value1, value2, ..., valueN) - least(value1, value2, ..., valueN) -``` - -**Examples:** - -```sql --- Retrieve the larger of `temperature` and `humidity` for each row in `table2` -SELECT GREATEST(temperature,humidity) FROM table2; - --- Retrieve the smaller of `temperature` and `humidity` for each row in `table2` -SELECT LEAST(temperature,humidity) FROM table2; -``` - -## 2. Aggregate functions - -### 2.1 Overview - -1. Aggregate functions are many-to-one functions. They perform aggregate calculations on a set of values to obtain a single aggregate result. - -2. Except for `COUNT()`, aggregate functions ignore null values and return null when there are no input rows or all values are null. For example, `SUM()` returns null instead of zero, and `AVG()` does not include null values in the count. 
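
The null-handling rule above can be checked with a short query. The following is only a minimal sketch: it assumes the sample `table1` used elsewhere in this manual has been loaded and that its `temperature` column contains some null values. No output is shown because the result depends on the data.

```SQL
-- count(*) counts every row; count(temperature) counts only non-null values
SELECT count(*), count(temperature) FROM table1;

-- Every input value is null here, so by the rule above sum() and avg()
-- are expected to return null rather than 0
SELECT sum(temperature), avg(temperature) FROM table1 WHERE temperature IS NULL;
```
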
- -### 2.2 Supported Aggregate Functions - -| Function Name | Description | Allowed Input Types | Output Type | -|:-----------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------| -| COUNT | Counts the number of data points. | All types | INT64 | -| COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | `exp` must be a boolean expression,(e.g. `count_if(temperature>20)`) | INT64 | -| APPROX_COUNT_DISTINCT | The APPROX_COUNT_DISTINCT(x[, maxStandardError]) function provides an approximation of COUNT(DISTINCT x), returning the estimated number of distinct input values. | `x`: The target column to be calculated, supports all data types.
`maxStandardError` (optional): Specifies the maximum standard error allowed for the function's result. Valid range is [0.0040625, 0.26]. Defaults to 0.023 if not specified. | INT64 | -| APPROX_MOST_FREQUENT | The APPROX_MOST_FREQUENT(x, k, capacity) function is used to approximately calculate the top k most frequent elements in a dataset. It returns a JSON-formatted string where the keys are the element values and the values are their corresponding approximate frequencies. (Available since V2.0.5.1) | `x` : The column to be calculated, supporting all existing data types in IoTDB;
`k`: The number of top-k most frequent values to return;
`capacity`: The number of buckets used for computation, which relates to memory usage: a larger value reduces error but consumes more memory, while a smaller value increases error but uses less memory. | STRING | -| APPROX_PERCENTILE | The APPROX_PERCENTILE function calculates the value at a specified percentile in a dataset, which helps to quickly understand the data distribution (e.g., median, quartiles). It supports weighted percentile calculation. If the percentile does not point to an exact position, it returns a linear interpolation of adjacent values at that position. Memory usage depends on the number of centroids, and the maximum number of centroids can be limited using the compression parameter. Error can be estimated using empirical formulas. Note: This function is supported since V2.0.9.1. | Unweighted Version: APPROX_PERCENTILE(x, percentage)
x: Column to compute. Supports all numeric types: INT32, INT64, FLOAT, DOUBLE, TIMESTAMP.
percentage: Target percentile, DOUBLE type.
Weighted Version: APPROX_PERCENTILE(x, w, percentage)
x: Column to compute. Supports all numeric types: INT32, INT64, FLOAT, DOUBLE, TIMESTAMP.
w: Weight column, integer type (must align with the length of x; NULL or 0 means the row is ignored).
percentage: Target percentile, DOUBLE type. | Same as the input column x. | -| SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| MAX | Finds the maximum value. | All types | Same as input type | -| MIN | Finds the minimum value. | All types | Same as input type | -| FIRST | Finds the value with the smallest timestamp that is not NULL. | All types | Same as input type | -| LAST | Finds the value with the largest timestamp that is not NULL. | All types | Same as input type | -| STDDEV | Alias for STDDEV_SAMP, calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| STDDEV_POP | Calculates the population standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| STDDEV_SAMP | Calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VARIANCE | Alias for VAR_SAMP, calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VAR_POP | Calculates the population variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VAR_SAMP | Calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| EXTREME | Finds the value with the largest absolute value. If the largest absolute values of positive and negative values are equal, returns the positive value. | INT32 INT64 FLOAT DOUBLE | Same as input type | -| MODE | Finds the mode. Note: 1. There is a risk of memory exception when the number of distinct values in the input sequence is too large; 2. If all elements have the same frequency, i.e., there is no mode, a random element is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL values are also counted in frequency, so even if not all values in the input sequence are NULL, the final result may still be NULL. | All types | Same as input type | -| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to the maximum y in the binary input x and y. 
MAX_BY(time, x) returns the timestamp when x is at its maximum. | x and y can be of any type | Same as the data type of the first input x | -| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to the minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp when x is at its minimum. | x and y can be of any type | Same as the data type of the first input x | -| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row when y is the first non-null value. | x and y can be of any type | Same as the data type of the first input x | -| LAST_BY | LAST_BY(x, y) finds the value of x in the same row when y is the last non-null value. | x and y can be of any type | Same as the data type of the first input x | - - -### 2.3 Examples - -#### 2.3.1 Example Data - -The [Example Data page](../Reference/Sample-Data.md) contains SQL statements for building table structures and inserting data. Download and execute these statements in the IoTDB CLI to import the data into IoTDB. You can use this data to test and execute the SQL statements in the examples and obtain the corresponding results. - -#### 2.3.2 Count - -Counts the number of rows in the entire table and the number of non-null values in the `temperature` column. - -```SQL -IoTDB> select count(*), count(temperature) from table1; -``` - -The execution result is as follows: - -> Note: Only the COUNT function can be used with *, otherwise an error will occur. - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 18| 12| -+-----+-----+ -Total line number = 1 -It costs 0.834s -``` - - -#### 2.3.3 Count_if - -Count `Non-Null` `arrival_time` Records in `table2` - -```sql -select count_if(arrival_time is not null) from table2; -``` - -The execution result is as follows: - -```sql -+-----+ -|_col0| -+-----+ -| 4| -+-----+ -Total line number = 1 -It costs 0.047s -``` - -#### 2.3.4 Approx_count_distinct - -Retrieve the number of distinct values in the `temperature` column from `table1`. 
- -```sql -IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature) as approx FROM table1; -IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature,0.006) as approx FROM table1; -``` - -The execution result is as follows: - -```sql -+------+------+ -|origin|approx| -+------+------+ -| 3| 3| -+------+------+ -Total line number = 1 -It costs 0.022s -``` - -#### 2.3.5 Approx_most_frequent - -Query the ​​top 2 most frequent values​​ in the `temperature` column of `table1`. - -```sql -IoTDB> select approx_most_frequent(temperature,2,100) as topk from table1; -``` - -The execution result is as follows: - -```sql -+-------------------+ -| topk| -+-------------------+ -|{"85.0":6,"90.0":5}| -+-------------------+ -Total line number = 1 -It costs 0.064s -``` - -#### 2.3.6 Approx_Percentile - -Calculate the 90th percentile of the `temperature` column and the 50th percentile (median) of the `humidity` column from `table1` respectively, and return these two approximate percentile values. - -```SQL -SELECT APPROX_PERCENTILE(temperature,0.9), APPROX_PERCENTILE(humidity,0.5) FROM table1; -``` - -**Execution Result:** - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 90.0| 35.2| -+-----+-----+ -Total line number = 1 -It costs 0.206s -``` - -#### 2.3.7 First - -Finds the values with the smallest timestamp that are not NULL in the `temperature` and `humidity` columns. - -```SQL -IoTDB> select first(temperature), first(humidity) from table1; -``` - -The execution result is as follows: - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 90.0| 35.1| -+-----+-----+ -Total line number = 1 -It costs 0.170s -``` - -#### 2.3.8 Last - -Finds the values with the largest timestamp that are not NULL in the `temperature` and `humidity` columns. 
- -```SQL -IoTDB> select last(temperature), last(humidity) from table1; -``` - -The execution result is as follows: - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 90.0| 34.8| -+-----+-----+ -Total line number = 1 -It costs 0.211s -``` - -#### 2.3.9 First_by - -Finds the `time` value of the row with the smallest timestamp that is not NULL in the `temperature` column, and the `humidity` value of the row with the smallest timestamp that is not NULL in the `temperature` column. - -```SQL -IoTDB> select first_by(time, temperature), first_by(humidity, temperature) from table1; -``` - -The execution result is as follows: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-26T13:37:00.000+08:00| 35.1| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.269s -``` - -#### 2.3.10 Last_by - -Queries the `time` value of the row with the largest timestamp that is not NULL in the `temperature` column, and the `humidity` value of the row with the largest timestamp that is not NULL in the `temperature` column. - -```SQL -IoTDB> select last_by(time, temperature), last_by(humidity, temperature) from table1; -``` - -The execution result is as follows: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-30T14:30:00.000+08:00| 34.8| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.070s -``` - -#### 2.3.11 Max_by - -Queries the `time` value of the row where the `temperature` column is at its maximum, and the `humidity` value of the row where the `temperature` column is at its maximum. 
- -```SQL -IoTDB> select max_by(time, temperature), max_by(humidity, temperature) from table1; -``` - -The execution result is as follows: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-30T09:30:00.000+08:00| 35.2| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.172s -``` - -#### 2.3.12 Min_by - -Queries the `time` value of the row where the `temperature` column is at its minimum, and the `humidity` value of the row where the `temperature` column is at its minimum. - -```SQL -select min_by(time, temperature), min_by(humidity, temperature) from table1; -``` - -The execution result is as follows: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-29T10:00:00.000+08:00| null| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.244s -``` - - -## 3. Logical operators - -### 3.1 Overview - -Logical operators are used to combine conditions or negate conditions, returning a Boolean result (`true` or `false`). - -Below are the commonly used logical operators along with their descriptions: - -| Operator | Description | Example | -| :------- | :-------------------------------- | :------ | -| AND | True only if both values are true | a AND b | -| OR | True if either value is true | a OR b | -| NOT | True when the value is false | NOT a | - -### 3.2 Impact of NULL on Logical Operators - -#### 3.2.1 AND Operator - -- If one or both sides of the expression are `NULL`, the result may be `NULL`. -- If one side of the `AND` operator is `FALSE`, the expression result is `FALSE`. - -Examples: - -```SQL -NULL AND true -- null -NULL AND false -- false -NULL AND NULL -- null -``` - -#### 3.2.2 OR Operator - -- If one or both sides of the expression are `NULL`, the result may be `NULL`. -- If one side of the `OR` operator is `TRUE`, the expression result is `TRUE`. 
- -Examples: - -```SQL -NULL OR NULL -- null -NULL OR false -- null -NULL OR true -- true -``` - -##### 3.2.2.1 Truth Table - -The following truth table illustrates how `NULL` is handled in `AND` and `OR` operators: - -| a | b | a AND b | a OR b | -| :---- | :---- | :------ | :----- | -| TRUE | TRUE | TRUE | TRUE | -| TRUE | FALSE | FALSE | TRUE | -| TRUE | NULL | NULL | TRUE | -| FALSE | TRUE | FALSE | TRUE | -| FALSE | FALSE | FALSE | FALSE | -| FALSE | NULL | FALSE | NULL | -| NULL | TRUE | NULL | TRUE | -| NULL | FALSE | FALSE | NULL | -| NULL | NULL | NULL | NULL | - -#### 3.2.3 NOT Operator - -The logical negation of `NULL` remains `NULL`. - -Example: - -```SQL -NOT NULL -- null -``` - -##### 3.2.3.1 Truth Table - -The following truth table illustrates how `NULL` is handled in the `NOT` operator: - -| a | NOT a | -| :---- | :---- | -| TRUE | FALSE | -| FALSE | TRUE | -| NULL | NULL | - -## 4. Date and Time Functions and Operators - -### 4.1 now() -> Timestamp - -Returns the current timestamp. - -### 4.2 date_bin(interval, Timestamp[, Timestamp]) -> Timestamp - -The `date_bin` function rounds a timestamp (`Timestamp`) down to the boundary of a specified time interval (`interval`). - -#### **Syntax:** - -```SQL --- Divides time into intervals starting from timestamp 0 and returns the start of the interval that contains the specified timestamp. -date_bin(interval,source) - --- Divides time into intervals starting from the origin timestamp and returns the start of the interval that contains the specified timestamp. -date_bin(interval,source,origin) - --- Supported time units for interval: --- Years (y), months (mo), weeks (week), days (d), hours (h), minutes (M), seconds (s), milliseconds (ms), microseconds (µs), nanoseconds (ns). --- source: Must be of timestamp type. -``` - -#### **Parameters**: - -| Parameter | Description | -| :-------- | :----------------------------------------------------------- | -| interval | 1. Time interval 2. 
Supported units: `y`, `mo`, `week`, `d`, `h`, `M`, `s`, `ms`, `µs`, `ns`. | -| source | 1. The timestamp column or expression to be calculated. 2. Must be of timestamp type. | -| origin | The reference timestamp. | - -#### 4.2.1 Syntax Rules - -1. If `origin` is not specified, the default reference timestamp is `1970-01-01T00:00:00Z` (Beijing time: `1970-01-01 08:00:00`). -2. `interval` must be a non-negative number with a time unit. If `interval` is `0ms`, the function returns `source` directly without calculation. -3. If `origin` or `source` is negative, it represents a time point before the epoch. `date_bin` still calculates and returns the corresponding interval boundary. -4. If `source` is `null`, the function returns `null`. -5. Mixing month and non-month time units (e.g., `1 MONTH 1 DAY`) is not supported because the result would be ambiguous. - -> For example, if the starting point is **April 30, 2000**, calculating `1 DAY` first and then `1 MONTH` results in **June 1, 2000**, whereas calculating `1 MONTH` first and then `1 DAY` results in **May 31, 2000**. The resulting dates are different. - -#### 4.2.2 Examples - -##### Example Data - -The [Example Data page](../Reference/Sample-Data.md) contains SQL statements for creating the tables and inserting data. Download and execute these statements in the IoTDB CLI to import the data into IoTDB. You can then use this data to run the SQL statements in the examples and reproduce the corresponding results. 
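The syntax rules above can be illustrated with a small model of the binning arithmetic. This is a hedged Python sketch, not IoTDB's implementation: the name `date_bin_sketch` is hypothetical, it handles only fixed-length intervals (no months or years), and it assumes that timestamps are rounded down to the interval boundary, which is what the documented example results suggest.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Default origin from rule 1: 1970-01-01T00:00:00Z.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_bin_sketch(
    interval: timedelta,
    source: Optional[datetime],
    origin: datetime = EPOCH,
) -> Optional[datetime]:
    """Model of date_bin for fixed-length intervals (assumption: floor rounding)."""
    if source is None:
        return None                      # rule 4: NULL in, NULL out
    if interval < timedelta(0):
        raise ValueError("interval must be non-negative")  # rule 2
    if interval == timedelta(0):
        return source                    # rule 2: a zero interval returns source unchanged
    # Floor division also rounds down for sources before the origin (rule 3).
    steps = (source - origin) // interval
    return origin + steps * interval
```

For instance, with a one-hour interval and an origin of `18:30`, a source of `10:00` on the same day falls into the bin that starts at `09:30`, because bins are aligned to the origin rather than to the top of the hour.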
- -#### Example 1: Without Specifying the Origin Timestamp - -```SQL -SELECT - time, - date_bin(1h,time) as time_bin -FROM - table1; -``` - -Result**:** - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:00:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:00:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T11:00:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:00:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T08:00:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T09:00:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:00:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:00:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.683s -``` - -#### Example 2: Specifying the Origin Timestamp - -```SQL -SELECT - time, - date_bin(1h, time, 2024-11-29T18:30:00.000) as time_bin -FROM - table1; -``` - -Result: - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 
-|2024-11-30T14:30:00.000+08:00|2024-11-30T14:30:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T09:30:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T10:30:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:30:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T07:30:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T08:30:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T09:30:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T10:30:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:30:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:30:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.056s -``` - -#### Example 3: Negative Origin - -```SQL -SELECT - time, - date_bin(1h, time, 1969-12-31 00:00:00.000) as time_bin -FROM - table1; -``` - -Result: - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:00:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:00:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:00:00.000+08:00| 
-|2024-11-27T16:43:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T11:00:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:00:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T08:00:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T09:00:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:00:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:00:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.203s -``` - -#### Example 4: Interval of 0 - -```SQL -SELECT - time, - date_bin(0ms, time) as time_bin -FROM - table1; -``` - -Result**:** - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:30:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:38:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:39:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:40:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:41:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:42:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:43:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:44:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T11:00:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:30:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T08:00:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T09:00:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T10:00:00.000+08:00| 
-|2024-11-28T11:00:00.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:37:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:38:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.107s -``` - -#### Example 5: Source is NULL - -```SQL -SELECT - arrival_time, - date_bin(1h,arrival_time) as time_bin -FROM - table1; -``` - -Result: - -```Plain -+-----------------------------+-----------------------------+ -| arrival_time| time_bin| -+-----------------------------+-----------------------------+ -| null| null| -|2024-11-30T14:30:17.000+08:00|2024-11-30T14:00:00.000+08:00| -|2024-11-29T10:00:13.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:37:01.000+08:00|2024-11-27T16:00:00.000+08:00| -| null| null| -|2024-11-27T16:37:03.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:37:04.000+08:00|2024-11-27T16:00:00.000+08:00| -| null| null| -| null| null| -|2024-11-27T16:37:08.000+08:00|2024-11-27T16:00:00.000+08:00| -| null| null| -|2024-11-29T18:30:15.000+08:00|2024-11-29T18:00:00.000+08:00| -|2024-11-28T08:00:09.000+08:00|2024-11-28T08:00:00.000+08:00| -| null| null| -|2024-11-28T10:00:11.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:12.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:34.000+08:00|2024-11-26T13:00:00.000+08:00| -|2024-11-26T13:38:25.000+08:00|2024-11-26T13:00:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.319s -``` - -### 4.3 Extract Function - -This function is used to extract the value of a specific part of a date. (Supported from version V2.0.6) - -#### 4.3.1 Syntax Definition - -```SQL -EXTRACT (identifier FROM expression) -``` - -* Parameter Description - * **expression**: `TIMESTAMP` type or a time constant - * **identifier**: The valid ranges and corresponding return value types are shown in the table below. 
- - | Valid Range | Return Type | Return Range | - |----------------------|---------------|--------------------| - | `YEAR` | `INT64` | `/` | - | `QUARTER` | `INT64` | `1-4` | - | `MONTH` | `INT64` | `1-12` | - | `WEEK` | `INT64` | `1-53` | - | `DAY_OF_MONTH (DAY)` | `INT64` | `1-31` | - | `DAY_OF_WEEK (DOW)` | `INT64` | `1-7` | - | `DAY_OF_YEAR (DOY)` | `INT64` | `1-366` | - | `HOUR` | `INT64` | `0-23` | - | `MINUTE` | `INT64` | `0-59` | - | `SECOND` | `INT64` | `0-59` | - | `MS` | `INT64` | `0-999` | - | `US` | `INT64` | `0-999` | - | `NS` | `INT64` | `0-999` | - - -#### 4.3.2 Usage Example - -Using table1 from the [Sample Data](../Reference/Sample-Data.md) as the source data, query the average temperature for the first 12 hours of each day within a certain period. - -```SQL -IoTDB:database1> select format('%1$tY-%1$tm-%1$td',date_bin(1d,time)) as fmtdate,avg(temperature) as avgtp from table1 where time >= 2024-11-26T00:00:00 and time <= 2024-11-30T23:59:59 and extract(hour from time) <= 12 group by date_bin(1d,time) order by date_bin(1d,time) -+----------+-----+ -| fmtdate|avgtp| -+----------+-----+ -|2024-11-28| 86.0| -|2024-11-29| 85.0| -|2024-11-30| 90.0| -+----------+-----+ -Total line number = 3 -It costs 0.041s -``` - -Introduction to the `Format` function: [Format Function](../SQL-Manual/Basis-Function_timecho.md#_7-2-format-function) - -Introduction to the `Date_bin` function: [Date_bin Function](../SQL-Manual/Basis-Function_timecho.md#_4-2-date-bin-interval-timestamp-timestamp-timestamp) - - -## 5. 
Mathematical Functions and Operators - -### 5.1 Mathematical Operators - -| **Operator** | **Description** | -| :----------- | :---------------------------------------------- | -| + | Addition | -| - | Subtraction | -| * | Multiplication | -| / | Division (integer division performs truncation) | -| % | Modulus (remainder) | -| - | Negation | - -### 5.2 Mathematical functions - -| Function Name | Description | Input | Output | Usage | -|:--------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------|:-------------------| :--------- | -| sin | Sine | double, float, INT64, INT32 | double | sin(x) | -| cos | Cosine | double, float, INT64, INT32 | double | cos(x) | -| tan | Tangent | double, float, INT64, INT32 | double | tan(x) | -| asin | Inverse Sine | double, float, INT64, INT32 | double | asin(x) | -| acos | Inverse Cosine | double, float, INT64, INT32 | double | acos(x) | -| atan | Inverse Tangent | double, float, INT64, INT32 | double | atan(x) | -| sinh | Hyperbolic Sine | double, float, INT64, INT32 | double | sinh(x) | -| cosh | Hyperbolic Cosine | double, float, INT64, INT32 | double | cosh(x) | -| tanh | Hyperbolic Tangent | double, float, INT64, INT32 | double | tanh(x) | -| degrees | Converts angle `x` in radians to degrees | double, float, INT64, INT32 | double | degrees(x) | -| radians | Radian Conversion from Degrees | double, float, INT64, INT32 | double | radians(x) | -| abs | Absolute Value | double, float, INT64, INT32 | Same as input type | abs(x) | -| sign | Returns the sign of `x`: - If `x = 0`, returns `0` - If `x > 0`, returns `1` - If `x < 0`, returns `-1` For `double/float` inputs: - If `x = NaN`, returns `NaN` - If `x = +Infinity`, returns `1.0` - If `x = -Infinity`, returns `-1.0` | double, float, INT64, INT32 | Same as 
input type | sign(x) | -| ceil | Rounds `x` up to the nearest integer | double, float, INT64, INT32 | double | ceil(x) | -| floor | Rounds `x` down to the nearest integer | double, float, INT64, INT32 | double | floor(x) | -| exp | Returns `e^x` (Euler's number raised to the power of `x`) | double, float, INT64, INT32 | double | exp(x) | -| ln | Returns the natural logarithm of `x` | double, float, INT64, INT32 | double | ln(x) | -| log10 | Returns the base 10 logarithm of `x` | double, float, INT64, INT32 | double | log10(x) | -| round | Rounds `x` to the nearest integer | double, float, INT64, INT32 | double | round(x) | -| round | Rounds `x` to `d` decimal places | double, float, INT64, INT32 | double | round(x, d) | -| sqrt | Returns the square root of `x`. | double, float, INT64, INT32 | double | sqrt(x) | -| e | Returns Euler’s number `e`. | | double | e() | -| pi | Pi (π) | | double | pi() | - -## 6. Bitwise Functions - -> Supported from version V2.0.6 - -Example raw data is as follows: - -``` -IoTDB:database1> select * from bit_table -+-----------------------------+---------+------+-----+ -| time|device_id|length|width| -+-----------------------------+---------+------+-----+ -|2025-10-29T15:59:42.957+08:00| d1| 14| 12| -|2025-10-29T15:58:59.399+08:00| d3| 15| 10| -|2025-10-29T15:59:32.769+08:00| d2| 13| 12| -+-----------------------------+---------+------+-----+ - --- Table creation statement -CREATE TABLE bit_table(time TIMESTAMP TIME, device_id STRING TAG, length INT32 FIELD, width INT32 FIELD); - --- Write data -INSERT INTO bit_table values(2025-10-29 15:59:42.957, 'd1', 14, 12),(2025-10-29 15:58:59.399, 'd3', 15, 10),(2025-10-29 15:59:32.769, 'd2', 13, 12); -``` - -### 6.1 bit\_count(num, bits) - -The `bit_count(num, bits)`function is used to count the number of 1s in the binary representation of the integer `num`under the specified bit width `bits`. 
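The counting rule described above can be modeled in a few lines. This is a minimal Python sketch rather than IoTDB's implementation; the name `bit_count_sketch` is hypothetical, and the `bits` range (2 to 64) and the representability check follow the constraints documented for this function.

```python
def bit_count_sketch(num: int, bits: int) -> int:
    """Count the 1 bits of num in a bits-wide two's complement representation."""
    if not 2 <= bits <= 64:
        raise ValueError("bits must be in the range 2..64")
    lowest = -(1 << (bits - 1))          # smallest signed value for this width
    highest = (1 << (bits - 1)) - 1      # largest signed value for this width
    if not lowest <= num <= highest:
        raise ValueError(f"{num} cannot be represented with {bits} bits")
    # Masking maps a negative number to its two's complement bit pattern.
    return bin(num & ((1 << bits) - 1)).count("1")
```

Under these assumptions, `-5` in 8 bits is the pattern `11111011`, so the sketch returns 7, while a call with `num = 13` and `bits = 2` raises an error because a 2-bit signed value only covers -2 to 1.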
- -#### 6.1.1 Syntax Definition - -``` -bit_count(num, bits) -> INT64 -- The return type is Int64 -``` - -* Parameter Description - - * **​num:​**​ Any integer value (int32 or int64) - * **​bits:​**​ Integer value, with a valid range of 2\~64 - -Note: An error will be raised if the number of `bits`is insufficient to represent `num`(using ​**two's complement signed representation**​): `Argument exception, the scalar function num must be representable with the bits specified. [num] cannot be represented with [bits] bits.` - -* Usage Methods - - * Two specific numbers: `bit_count(9, 64)` - * Column and a number: `bit_count(column1, 64)` - * Between two columns: `bit_count(column1, column2)` - -#### 6.1.2 Usage Examples - -``` --- Two specific numbers -IoTDB:database1> select distinct bit_count(2,8) from bit_table -+-----+ -|_col0| -+-----+ -| 1| -+-----+ --- Two specific numbers -IoTDB:database1> select distinct bit_count(-5,8) from bit_table -+-----+ -|_col0| -+-----+ -| 7| -+-----+ --- Column and a number -IoTDB:database1> select length,bit_count(length,8) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 3| -| 15| 4| -| 13| 3| -+------+-----+ --- Insufficient bits -IoTDB:database1> select length,bit_count(length,2) from bit_table -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Argument exception, the scalar function num must be representable with the bits specified. 13 cannot be represented with 2 bits. -``` - -### 6.2 bitwise\_and(x, y) - -The `bitwise_and(x, y)`function performs a logical AND operation on each bit of two integers x and y based on their two's complement representation, and returns the bitwise AND operation result. 
- -#### 6.2.1 Syntax Definition - -``` -bitwise_and(x, y) -> INT64 -- The return type is Int64 -``` - -* Parameter Description - - * ​**x, y**​: Must be integer values of data type Int32 or Int64 -* Usage Methods - - * Two specific numbers: `bitwise_and(19, 25)` - * Column and a number: `bitwise_and(column1, 25)` - * Between two columns: `bitwise_and(column1, column2)` - -#### 6.2.2 Usage Examples - -``` ---Two specific numbers -IoTDB:database1> select distinct bitwise_and(19,25) from bit_table -+-----+ -|_col0| -+-----+ -| 17| -+-----+ ---Column and a number -IoTDB:database1> select length, bitwise_and(length,25) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 8| -| 15| 9| -| 13| 9| -+------+-----+ ---Between two columns -IoTDB:database1> select length, width, bitwise_and(length, width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 12| -| 15| 10| 10| -| 13| 12| 12| -+------+-----+-----+ -``` - -### 6.3 bitwise\_not(x) - -The `bitwise_not(x)`function performs a logical NOT operation on each bit of the integer x based on its two's complement representation, and returns the bitwise NOT operation result. 
- -#### 6.3.1 Syntax Definition - -``` -bitwise_not(x) -> INT64 -- The return type is Int64 -``` - -* Parameter Description - - * **x**: Must be an integer value of data type Int32 or Int64 -* Usage Methods - - * Specific number: `bitwise_not(5)` - * Single column operation: `bitwise_not(column1)` - -#### 6.3.2 Usage Examples - -``` --- Specific number -IoTDB:database1> select distinct bitwise_not(5) from bit_table -+-----+ -|_col0| -+-----+ -| -6| -+-----+ --- Single column -IoTDB:database1> select length, bitwise_not(length) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| -15| -| 15| -16| -| 13| -14| -+------+-----+ -``` - -### 6.4 bitwise\_or(x, y) - -The `bitwise_or(x, y)` function performs a logical OR operation on each bit of two integers x and y based on their two's complement representation, and returns the bitwise OR operation result. - -#### 6.4.1 Syntax Definition - -``` -bitwise_or(x, y) -> INT64 -- The return type is Int64 -``` - -* Parameter Description - - * **x, y**: Must be integer values of data type Int32 or Int64 -* Usage Methods - - * Two specific numbers: `bitwise_or(19, 25)` - * Column and a number: `bitwise_or(column1, 25)` - * Between two columns: `bitwise_or(column1, column2)` - -#### 6.4.2 Usage Examples - -``` --- Two specific numbers -IoTDB:database1> select distinct bitwise_or(19,25) from bit_table -+-----+ -|_col0| -+-----+ -| 27| -+-----+ --- Column and a number -IoTDB:database1> select length,bitwise_or(length,25) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 31| -| 15| 31| -| 13| 29| -+------+-----+ --- Between two columns -IoTDB:database1> select length, width, bitwise_or(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 14| -| 15| 10| 15| -| 13| 12| 13| -+------+-----+-----+ -``` - -### 6.5 bitwise\_xor(x, y) - -The `bitwise_xor(x, y)` function performs a logical XOR (exclusive OR) operation on each bit of two integers x and y 
based on their two's complement representation, and returns the bitwise XOR operation result. XOR rule: same bits result in 0, different bits result in 1. - -#### 6.5.1 Syntax Definition - -``` -bitwise_xor(x, y) -> INT64 -- The return type is Int64 -``` - -* Parameter Description - - * ​**x, y**​: Must be integer values of data type Int32 or Int64 -* Usage Methods - - * Two specific numbers: `bitwise_xor(19, 25)` - * Column and a number: `bitwise_xor(column1, 25)` - * Between two columns: `bitwise_xor(column1, column2)` - -#### 6.5.2 Usage Examples - -``` --- Two specific numbers -IoTDB:database1> select distinct bitwise_xor(19,25) from bit_table -+-----+ -|_col0| -+-----+ -| 10| -+-----+ --- Column and a number -IoTDB:database1> select length,bitwise_xor(length,25) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 23| -| 15| 22| -| 13| 20| -+------+-----+ --- Between two columns -IoTDB:database1> select length, width, bitwise_xor(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 2| -| 15| 10| 5| -| 13| 12| 1| -+------+-----+-----+ -``` - -### 6.6 bitwise\_left\_shift(value, shift) - -The `bitwise_left_shift(value, shift)`function returns the result of shifting the binary representation of integer `value`left by `shift`bits. The left shift operation moves bits towards the higher-order direction, filling the vacated lower-order bits with 0s, and discarding the higher-order bits that overflow. Equivalent to: `value << shift`. - -#### 6.6.1 Syntax Definition - -``` -bitwise_left_shift(value, shift) -> [same as value] -- The return type is the same as the data type of value -``` - -* Parameter Description - - * ​**value**​: The integer value to shift left. Must be of data type Int32 or Int64. - * ​**shift**​: The number of bits to shift. Must be of data type Int32 or Int64. 
-* Usage Methods - - * Two specific numbers: `bitwise_left_shift(1, 2)` - * Column and a number: `bitwise_left_shift(column1, 2)` - * Between two columns: `bitwise_left_shift(column1, column2)` - -#### 6.6.2 Usage Examples - -``` ---Two specific numbers -IoTDB:database1> select distinct bitwise_left_shift(1,2) from bit_table -+-----+ -|_col0| -+-----+ -| 4| -+-----+ --- Column and a number -IoTDB:database1> select length, bitwise_left_shift(length,2) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 56| -| 15| 60| -| 13| 52| -+------+-----+ --- Between two columns -IoTDB:database1> select length, width, bitwise_left_shift(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 0| -| 15| 10| 0| -| 13| 12| 0| -+------+-----+-----+ -``` - -### 6.7 bitwise\_right\_shift(value, shift) - -The `bitwise_right_shift(value, shift)`function returns the result of logically (unsigned) right shifting the binary representation of integer `value`by `shift`bits. The logical right shift operation moves bits towards the lower-order direction, filling the vacated higher-order bits with 0s, and discarding the lower-order bits that overflow. - -#### 6.7.1 Syntax Definition - -``` -bitwise_right_shift(value, shift) -> [same as value] -- The return type is the same as the data type of value -``` - -* Parameter Description - - * ​**value**​: The integer value to shift right. Must be of data type Int32 or Int64. - * ​**shift**​: The number of bits to shift. Must be of data type Int32 or Int64. 
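The logical (unsigned) right shift described above can be sketched as follows. This is an illustrative Python model, not IoTDB's implementation: the name `logical_right_shift` and the `width` parameter are hypothetical, and returning 0 for shifts of `width` or more is an assumption. The results for negative inputs follow from the zero-fill definition and have not been checked against IoTDB.

```python
def logical_right_shift(value: int, shift: int, width: int = 32) -> int:
    """Zero-fill right shift of value, viewed as a width-bit two's complement integer."""
    if shift < 0:
        raise ValueError("shift must be non-negative")
    if shift >= width:
        return 0                              # assumption: every bit is shifted out
    mask = (1 << width) - 1
    pattern = value & mask                    # two's complement bit pattern of value
    result = pattern >> shift                 # vacated high-order bits become 0
    if result >= 1 << (width - 1):            # only reachable when shift == 0
        result -= 1 << width                  # reinterpret the pattern as signed
    return result
```

Python's built-in `>>` is an arithmetic shift, so `-8 >> 1` gives `-4`, whereas `logical_right_shift(-8, 1)` yields `2147483644` for a 32-bit value because the sign bit is replaced by 0.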
-* Usage Methods - - * Two specific numbers: `bitwise_right_shift(8, 3)` - * Column and a number: `bitwise_right_shift(column1, 3)` - * Between two columns: `bitwise_right_shift(column1, column2)` - -#### 6.7.2 Usage Examples - -``` ---Two specific numbers -IoTDB:database1> select distinct bitwise_right_shift(8,3) from bit_table -+-----+ -|_col0| -+-----+ -| 1| -+-----+ ---Column and a number -IoTDB:database1> select length, bitwise_right_shift(length,3) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 1| -| 15| 1| -| 13| 1| -+------+-----+ ---Between two columns -IoTDB:database1> select length, width, bitwise_right_shift(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 0| -| 15| 10| 0| -| 13| 12| 0| -+------+-----+-----+ -``` - -### 6.8 bitwise\_right\_shift\_arithmetic(value, shift) - -The `bitwise_right_shift_arithmetic(value, shift)` function returns the result of arithmetically right shifting the binary representation of integer `value` by `shift` bits. The arithmetic right shift operation moves bits towards the lower-order direction, discarding the lower-order bits that overflow, and filling the vacated higher-order bits with the sign bit (0 for positive numbers, 1 for negative numbers) to preserve the sign of the number. - -#### 6.8.1 Syntax Definition - -``` -bitwise_right_shift_arithmetic(value, shift) -> [same as value] -- The return type is the same as the data type of value -``` - -* Parameter Description - - * **value**: The integer value to shift right. Must be of data type Int32 or Int64. - * **shift**: The number of bits to shift. Must be of data type Int32 or Int64. 
-* Usage Methods: - - * Two specific numbers: `bitwise_right_shift_arithmetic(12, 2)` - * Column and a number: `bitwise_right_shift_arithmetic(column1, 64)` - * Between two columns: `bitwise_right_shift_arithmetic(column1, column2)` - -#### 6.8.2 Usage Examples - -``` ---Two specific numbers -IoTDB:database1> select distinct bitwise_right_shift_arithmetic(12,2) from bit_table -+-----+ -|_col0| -+-----+ -| 3| -+-----+ --- Column and a number -IoTDB:database1> select length, bitwise_right_shift_arithmetic(length,3) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 1| -| 15| 1| -| 13| 1| -+------+-----+ ---Between two columns -IoTDB:database1> select length, width, bitwise_right_shift_arithmetic(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 0| -| 15| 10| 0| -| 13| 12| 0| -+------+-----+-----+ -``` - -## 7. Binary Functions - -> Supported since V2.0.9.1 - -### 7.1 Base64 Encoding Functions -| Function Name | Description | Input Type | Output Type | -| ----------------------------- | ----------------------------------------------------------------------------- | --------------------- | ------------- | -| `to_base64(input)` | Encode input data to standard Base64 string for binary data transmission/storage | STRING/TEXT/BLOB | STRING | -| `from_base64(input)` | Decode standard Base64 string to raw binary data (inverse of to_base64) | STRING/TEXT | BLOB | -| `to_base64url(input)` | Encode input to URL-safe Base64URL string (replace +/_, omit padding) | STRING/TEXT/BLOB | STRING | -| `from_base64url(input)` | Decode Base64URL string to raw binary data (inverse of to_base64url) | STRING/TEXT | BLOB | -| `to_base32(input)` | Encode input to Base32 string (case-insensitive, high readability) | STRING/TEXT/BLOB | STRING | -| `from_base32(input)` | Decode Base32 string to raw binary data (inverse of to_base32) | STRING/TEXT | BLOB | - -**Examples** -1. 
to_base64: Encode string to standard Base64 -```SQL -SELECT DISTINCT to_base64('IoTDB Binary Test') FROM table1; -``` -``` -+----------------------------+ -| _col0| -+----------------------------+ -|SW9URELkuozov5vliLbmtYvor5U=| -+----------------------------+ -``` - -2. from_base64: Decode Base64 to binary -```SQL -SELECT DISTINCT from_base64('SW9URELkuozov5vliLbmtYvor5U=') FROM table1; -``` -``` -+------------------------------------------+ -| _col0| -+------------------------------------------+ -|0x496f544442e4ba8ce8bf9be588b6e6b58be8af95| -+------------------------------------------+ -``` - -3. to_base64url: Encode to URL-safe Base64URL -```SQL -SELECT DISTINCT to_base64url('https://iotdb.apache.org') FROM table1; -``` -``` -+--------------------------------+ -| _col0| -+--------------------------------+ -|aHR0cHM6Ly9pb3RkYi5hcGFjaGUub3Jn| -+--------------------------------+ -``` - -4. from_base64url: Decode Base64URL -```SQL -SELECT DISTINCT from_base64url('aHR0cHM6Ly9pb3RkYi5hcGFjaGUub3Jn') FROM table1; -``` -``` -+--------------------------------------------------+ -| _col0| -+--------------------------------------------------+ -|0x68747470733a2f2f696f7464622e6170616368652e6f7267| -+--------------------------------------------------+ -``` - -5. to_base32: Encode to Base32 -```SQL -SELECT DISTINCT to_base32('123456') FROM table1; -``` -``` -+----------------+ -| _col0| -+----------------+ -|GEZDGNBVGY======| -+----------------+ -``` - -6. 
from_base32: Decode Base32 -```SQL -SELECT DISTINCT from_base32('GEZDGNBVGY======') FROM table1; -``` -``` -+--------------+ -| _col0| -+--------------+ -|0x313233343536| -+--------------+ -``` - -### 7.2 Hex Encoding Functions -| Function Name | Description | Input Type | Output Type | -| ------------------------ | -------------------------------------------------- | --------------------- | ------------- | -| `TO_HEX(input)` | Convert input to hex string (raw byte view) | STRING/TEXT/BLOB | STRING | -| `FROM_HEX(input)` | Decode hex string to raw binary (inverse of TO_HEX) | STRING/TEXT | BLOB | - -**Examples** -1. TO_HEX: Convert string/binary to hex -```SQL -SELECT DISTINCT TO_HEX('test') FROM table1; -``` -``` -+--------+ -| _col0| -+--------+ -|74657374| -+--------+ -``` - -2. FROM_HEX: Decode hex to binary -```SQL -SELECT DISTINCT FROM_HEX('74657374') FROM table1; -``` -``` -+----------+ -| _col0| -+----------+ -|0x74657374| -+----------+ -``` - -### 7.3 Basic Binary Functions -| Function Name | Description | Input Type | Output Type | -| ----------------------------------- | ------------------------------------------------------------------------------------------ | -------------------------- | -------------- | -| `length(input)` | Return data length: chars for TEXT, bytes for BLOB/OBJECT | STRING/TEXT/BLOB/OBJECT | INT32 | -| `REVERSE(input)` | Reverse input: chars for TEXT, bytes for BLOB | STRING/TEXT/BLOB | Same as input | -| `LPAD(input, length, pad_bytes)` | Left-pad/truncate BLOB to target byte length | BLOB, INT32/INT64, BLOB | BLOB | -| `RPAD(input, length, pad_bytes)` | Right-pad/truncate BLOB to target byte length | BLOB, INT32/INT64, BLOB | BLOB | - -**Examples** -1. length: Get data length -```SQL -SELECT DISTINCT length('IoTDB') FROM table1; -``` -``` -+-----+ -|_col0| -+-----+ -| 5| -+-----+ -``` - -2. REVERSE: Reverse data -```SQL -SELECT DISTINCT REVERSE('12345') FROM table1; -``` -``` -+-----+ -|_col0| -+-----+ -|54321| -+-----+ -``` - -3. 
LPAD: Left-pad BLOB -```SQL -SELECT DISTINCT LPAD(FROM_HEX('74657374'),5, FROM_HEX('74657374')) FROM table1; -``` -``` -+------------+ -| _col0| -+------------+ -|0x7474657374| -+------------+ -``` - -4. RPAD: Right-pad BLOB -```SQL -SELECT DISTINCT RPAD(FROM_HEX('74657374'),5, FROM_HEX('74657374')) FROM table1; -``` -``` -+------------+ -| _col0| -+------------+ -|0x7465737474| -+------------+ -``` - -### 7.4 Integer Encoding Functions -| Function Name | Description | Input Type | Output Type | -| ------------------------------------- | ------------------------------------------------------------------------- | ------------ | ------------ | -| `to_big_endian_32(input)` | Convert INT32 to 4-byte big-endian BLOB (network byte order) | INT32 | BLOB | -| `to_big_endian_64(input)` | Convert INT64 to 8-byte big-endian BLOB | INT64 | BLOB | -| `from_big_endian_32(input)` | Decode 4-byte big-endian BLOB to INT32 | BLOB | INT32 | -| `from_big_endian_64(input)` | Decode 8-byte big-endian BLOB to INT64 | BLOB | INT64 | -| `to_little_endian_32(input)` | Convert INT32 to 4-byte little-endian BLOB (x86 architecture) | INT32 | BLOB | -| `to_little_endian_64(input)` | Convert INT64 to 8-byte little-endian BLOB | INT64 | BLOB | -| `from_little_endian_32(input)` | Decode 4-byte little-endian BLOB to INT32 | BLOB | INT32 | -| `from_little_endian_64(input)` | Decode 8-byte little-endian BLOB to INT64 | BLOB | INT64 | - -**Examples** -1. Big-endian encode/decode -```SQL -SELECT DISTINCT TO_HEX(to_big_endian_32(12345)) FROM table1; -``` -``` -+--------+ -| _col0| -+--------+ -|00003039| -+--------+ -``` - -2. 
Little-endian encode/decode -```SQL -SELECT DISTINCT TO_HEX(to_little_endian_32(12345)) FROM table1; -``` -``` -+--------+ -| _col0| -+--------+ -|39300000| -+--------+ -``` - -### 7.5 Floating-Point Encoding Functions -| Function Name | Description | Input Type | Output Type | -| ------------------------------- | ------------------------------------------------------------------------- | ------------ | ------------ | -| `to_ieee754_32(input)` | Convert FLOAT to 4-byte big-endian IEEE754 BLOB | FLOAT | BLOB | -| `to_ieee754_64(input)` | Convert DOUBLE to 8-byte big-endian IEEE754 BLOB | DOUBLE | BLOB | -| `from_ieee754_32(input)` | Decode 4-byte IEEE754 BLOB to FLOAT | BLOB | FLOAT | -| `from_ieee754_64(input)` | Decode 8-byte IEEE754 BLOB to DOUBLE | BLOB | DOUBLE | - -**Examples** -1. FLOAT encode/decode -```SQL -SELECT DISTINCT from_ieee754_32(FROM_HEX('42b40000')) FROM table1; -``` -``` -+-----+ -|_col0| -+-----+ -| 90.0| -+-----+ -``` - -2. DOUBLE encode/decode -```SQL -SELECT DISTINCT from_ieee754_64(FROM_HEX('400921fb54411744')) FROM table1; -``` -``` -+------------+ -| _col0| -+------------+ -|3.1415926535| -+------------+ -``` - -### 7.6 Hash Functions -| Function Name | Description | Input Type | Output Type | -| --------------------------------- | ------------------------------------------------------------------------- | ------------------ | ------------- | -| `sha256(input)` | SHA-256 cryptographic hash (collision-resistant) | STRING/TEXT/BLOB | BLOB(32B) | -| `SHA512(input)` | SHA-512 cryptographic hash (higher security) | STRING/TEXT/BLOB | BLOB(64B) | -| `SHA1(input)` | SHA-1 hash (not secure for cryptography) | STRING/TEXT/BLOB | BLOB(20B) | -| `MD5(input)` | MD5 hash (non-cryptographic checksum) | STRING/TEXT/BLOB | BLOB(16B) | -| `CRC32(input)` | CRC32 checksum (fast error detection) | STRING/TEXT/BLOB | INT64 | -| `spooky_hash_v2_32(input)` | 32-bit SpookyHashV2 (high-performance non-crypto) | STRING/TEXT/BLOB | BLOB(4B) | -| 
`spooky_hash_v2_64(input)` | 64-bit SpookyHashV2 | STRING/TEXT/BLOB | BLOB(8B) | -| `xxhash64(input)` | 64-bit xxHash (ultra-fast) | STRING/TEXT/BLOB | BLOB(8B) | -| `murmur3(input)` | 128-bit MurmurHash3 (uniform distribution) | STRING/TEXT/BLOB | BLOB(16B) | - -**Examples** -```SQL -SELECT DISTINCT TO_HEX(sha256('test')) FROM table1; -``` -``` -+----------------------------------------------------------------+ -| _col0| -+----------------------------------------------------------------+ -|9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08| -+----------------------------------------------------------------+ -``` - -### 7.7 HMAC Functions -| Function Name | Description | Input Type | Output Type | -| ------------------------------- | ------------------------------------------------------------------------------ | ------------------------------------- | ------------- | -| `hmac_md5(data, key)` | HMAC-MD5 message authentication code | data: STRING/TEXT/BLOB key: STRING/TEXT | BLOB(16B) | -| `hmac_sha1(data, key)` | HMAC-SHA1 authentication code | data: STRING/TEXT/BLOB key: STRING/TEXT | BLOB(20B) | -| `hmac_sha256(data, key)` | HMAC-SHA256 (industry-recommended, high security) | data: STRING/TEXT/BLOB key: STRING/TEXT | BLOB(32B) | -| `hmac_sha512(data, key)` | HMAC-SHA512 (maximum security) | data: STRING/TEXT/BLOB key: STRING/TEXT | BLOB(64B) | - -**Examples** -```SQL -SELECT DISTINCT TO_HEX(hmac_sha256('user_data_123', 'iotdb_secret_key')) FROM table1; -``` -``` -+----------------------------------------------------------------+ -| _col0| -+----------------------------------------------------------------+ -|73b6f26bbcb5192dbe2cb83745b0fc48c63418fa674b0bf62fabe7f8747f3afd| -+----------------------------------------------------------------+ -``` - - -## 8. Conditional Expressions - -### 8.1 CASE - -CASE expressions come in two forms: **Simple CASE** and **Searched CASE**. 
- -#### 8.1.1 Simple CASE - -The simple form evaluates each value expression from left to right until it finds a match with the given expression: - -```SQL -CASE expression - WHEN value THEN result - [ WHEN ... ] - [ ELSE result ] -END -``` - -If a matching value is found, the corresponding result is returned. If no match is found, the result from the `ELSE` clause (if provided) is returned; otherwise, `NULL` is returned. - -Example: - -```SQL -SELECT a, - CASE a - WHEN 1 THEN 'one' - WHEN 2 THEN 'two' - ELSE 'many' - END -``` - -#### 8.1.2 Searched CASE - -The searched form evaluates each Boolean condition from left to right until a `TRUE` condition is found, then returns the corresponding result: - -```SQL -CASE - WHEN condition THEN result - [ WHEN ... ] - [ ELSE result ] -END -``` - -If no condition evaluates to `TRUE`, the `ELSE` clause result (if provided) is returned; otherwise, `NULL` is returned. - -Example: - -```SQL -SELECT a, b, - CASE - WHEN a = 1 THEN 'aaa' - WHEN b = 2 THEN 'bbb' - ELSE 'ccc' - END -``` - -### 8.2 COALESCE - -Returns the first non-null value from the given list of parameters. - -```SQL -coalesce(value1, value2[, ...]) -``` - -### 8.3 IF Expression - -The IF expression has two forms: one that specifies only the true value, and another that specifies both the true value and the false value. - -| Form | Description | Output Type Restrictions | -| ---------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ | ------------------------ | -| `IF(condition, true_value)` | If the condition evaluates to true, `true_value` is computed and returned; otherwise, `null` is returned and `true_value` is not evaluated. | — | -| `IF(condition, true_value, false_value)` | If the condition evaluates to true, `true_value` is computed and returned; otherwise, `false_value` is computed and returned. 
| The data types of `true_value` and `false_value` **must be exactly the same**. Implicit type conversion is not supported. | - -> Supported since V2.0.9.1 - -**Examples:** - -1. Equivalent examples of IF and CASE expressions: -```SQL --- IF syntax -SELECT - device_id, - temperature, - IF(temperature > 85, 'High Value', 'Low Value') -FROM table1; - --- Equivalent CASE syntax -SELECT - device_id, - temperature, - CASE - WHEN temperature > 85 THEN 'High Value' - ELSE 'Low Value' - END -FROM table1; -``` - -2. Output type restriction examples: -```SQL --- Succeeds --- temperature (float) and humidity (float) have matching types -SELECT IF(temperature > 85, temperature, humidity) FROM table1; - --- Fails --- temperature (float) and status (boolean) have mismatched types -SELECT IF(temperature > 85, temperature, status) FROM table1; -``` - - -## 9. Conversion Functions - -### 9.1 Conversion Functions - -#### 9.1.1 cast(value AS type) → type - -Explicitly converts a value to the specified type. This can be used to convert strings (`VARCHAR`) to numeric types or numeric values to string types. Starting from V2.0.8, OBJECT type can be explicitly cast to STRING type. - -If the conversion fails, a runtime error is thrown. - -Example: - -```SQL -SELECT * - FROM table1 - WHERE CAST(time AS DATE) - IN (CAST('2024-11-27' AS DATE), CAST('2024-11-28' AS DATE)); -``` - -#### 9.1.2 try_cast(value AS type) → type - -Similar to `CAST()`. If the conversion fails, returns `NULL` instead of throwing an error. - -Example: - -```SQL -SELECT * - FROM table1 - WHERE try_cast(time AS DATE) - IN (try_cast('2024-11-27' AS DATE), try_cast('2024-11-28' AS DATE)); -``` - -### 9.2 Format Function - -This function generates and returns a formatted string based on a specified format string and input arguments. Similar to Java’s `String.format` or C’s `printf`, it allows developers to construct dynamic string templates using placeholder syntax. 
Predefined format specifiers in the template are replaced precisely with corresponding argument values, producing a complete string that adheres to specific formatting requirements. - -#### 9.2.1 Syntax - -```SQL -format(pattern, ...args) -> STRING -``` - -**Parameters** - -* `pattern`: A format string containing static text and one or more format specifiers (e.g., `%s`, `%d`), or any expression returning a `STRING`/`TEXT` type. -* `args`: Input arguments to replace format specifiers. Constraints: - * Number of arguments ≥ 1. - * Multiple arguments must be comma-separated (e.g., `arg1, arg2`). - * Total arguments can exceed the number of specifiers in `pattern` but cannot be fewer, otherwise an exception is triggered. - -**Return Value** - -* Formatted result string of type `STRING`. - -#### 9.2.2 Usage Examples - -1. Format Floating-Point Numbers - ```SQL - IoTDB:database1> SELECT format('%.5f', humidity) FROM table1 WHERE humidity = 35.4; - +--------+ - | _col0| - +--------+ - |35.40000| - +--------+ - ``` -2. Format Integers - ```SQL - IoTDB:database1> SELECT format('%03d', 8) FROM table1 LIMIT 1; - +-----+ - |_col0| - +-----+ - | 008| - +-----+ - ``` -3. 
Format Dates and Timestamps - -* Locale-Specific Date - -```SQL -IoTDB:database1> SELECT format('%1$tA, %1$tB %1$te, %1$tY', 2024-01-01) FROM table1 LIMIT 1; -+-----------------------+ -| _col0| -+-----------------------+ -|Monday, January 1, 2024| -+-----------------------+ -``` - -* Remove Timezone Information - -```SQL -IoTDB:database1> SELECT format('%1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS.%1$tL', 2024-01-01T00:00:00.000+08:00) FROM table1 LIMIT 1; -+-----------------------+ -| _col0| -+-----------------------+ -|2024-01-01 00:00:00.000| -+-----------------------+ -``` - -* Second-Level Timestamp Precision - -```SQL -IoTDB:database1> SELECT format('%1$tF %1$tT', 2024-01-01T00:00:00.000+08:00) FROM table1 LIMIT 1; -+-------------------+ -| _col0| -+-------------------+ -|2024-01-01 00:00:00| -+-------------------+ -``` - -* Date/Time Format Symbols - -| **Symbol** | **Description** | -| ------------------ | ------------------------------------------------------------------------ | -| 'H' | 24-hour format (two digits, zero-padded), i.e. 00 - 23 | -| 'I' | 12-hour format (two digits, zero-padded), i.e. 01 - 12 | -| 'k' | 24-hour format (no padding), i.e. 0 - 23 | -| 'l' | 12-hour format (no padding), i.e. 1 - 12 | -| 'M' | Minute (two digits, zero-padded), i.e. 00 - 59 | -| 'S' | Second (two digits, zero-padded; supports leap seconds), i.e. 00 - 60 | -| 'L' | Millisecond (three digits, zero-padded), i.e. 000 - 999 | -| 'N' | Nanosecond (nine digits, zero-padded), i.e. 000000000 - 999999999. | -| 'p' | Locale-specific lowercase AM/PM marker (e.g., "am", "pm"). Prefix with `T` to force uppercase (e.g., "AM"). | -| 'z' | RFC 822 timezone offset from GMT (e.g., `-0800`). Adjusts for daylight saving. Uses the JVM's default timezone for `long`/`Long`/`Date`. | -| 'Z' | Timezone abbreviation (e.g., "PST"). Adjusts for daylight saving. 
Uses the JVM's default timezone; Formatter's timezone overrides the argument's timezone if specified. | -| 's' | Seconds since Unix epoch (1970-01-01 00:00:00 UTC), i.e. Long.MIN\_VALUE/1000 to Long.MAX\_VALUE/1000. | -| 'Q' | Milliseconds since Unix epoch, i.e. Long.MIN\_VALUE to Long.MAX\_VALUE. | - -* Common Date/Time Conversion Characters - -| **Symbol** | **Description** | -| ---------------- | -------------------------------------------------------------------- | -| 'B' | Locale-specific full month name, for example "January", "February" | -| 'b' | Locale-specific abbreviated month name, for example "Jan", "Feb" | -| 'h' | Same as `b` | -| 'A' | Locale-specific full weekday name, for example "Sunday", "Monday" | -| 'a' | Locale-specific short weekday name, for example "Sun", "Mon" | -| 'C' | Year divided by 100 (two digits, zero-padded) | -| 'Y' | Year (minimum 4 digits, zero-padded) | -| 'y' | Last two digits of year (zero-padded) | -| 'j' | Day of year (three digits, zero-padded) | -| 'm' | Month (two digits, zero-padded) | -| 'd' | Day of month (two digits, zero-padded) | -| 'e' | Day of month (no padding) | - -4. Format Strings - ```SQL - IoTDB:database1> SELECT format('The measurement status is: %s', status) FROM table2 LIMIT 1; - +-------------------------------+ - | _col0| - +-------------------------------+ - |The measurement status is: true| - +-------------------------------+ - ``` -5. Format Percentage Sign - ```SQL - IoTDB:database1> SELECT format('%s%%', 99.9) FROM table1 LIMIT 1; - +-----+ - |_col0| - +-----+ - |99.9%| - +-----+ - ``` - -#### 9.2.3 Format Conversion Failure Scenarios - -1. Type Mismatch Errors - -* Timestamp Type Conflict - - If the format specifier includes time-related tokens (e.g., `%Y-%m-%d`) but the argument: - - * Is a non-`DATE`/`TIMESTAMP` type value. - * Requires sub-day precision (e.g., `%H`, `%M`) but the argument is not `TIMESTAMP`. 
- -```SQL --- Example 1 -IoTDB:database1> SELECT format('%1$tA, %1$tB %1$te, %1$tY', humidity) from table2 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %1$tA, %1$tB %1$te, %1$tY (IllegalFormatConversion: A != java.lang.Float) - --- Example 2 -IoTDB:database1> SELECT format('%1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS.%1$tL', humidity) from table1 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS.%1$tL (IllegalFormatConversion: Y != java.lang.Float) -``` - -* Floating-Point Type Conflict - - Using `%f` with non-numeric arguments (e.g., strings or booleans): - -```SQL -IoTDB:database1> select format('%.5f',status) from table1 where humidity = 35.4 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %.5f (IllegalFormatConversion: f != java.lang.Boolean) -``` - -2. Argument Count Mismatch - The number of arguments must equal or exceed the number of format specifiers. - - ```SQL - IoTDB:database1> SELECT format('%.5f %03d', humidity) FROM table1 WHERE humidity = 35.4; - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %.5f %03d (MissingFormatArgument: Format specifier '%03d') - ``` -3. Invalid Invocation Errors - - Triggered if: - - * Total arguments < 2 (must include `pattern` and at least one argument). - * `pattern` is not of type `STRING`/`TEXT`. - -```SQL --- Example 1 -IoTDB:database1> select format('%s') from table1 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Scalar function format must have at least two arguments, and first argument pattern must be TEXT or STRING type. - --- Example 2 -IoTDB:database1> select format(123, humidity) from table1 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Scalar function format must have at least two arguments, and first argument pattern must be TEXT or STRING type. -``` - - -## 10. 
String Functions and Operators - -### 10.1 String operators - -#### 10.1.1 || Operator - -The `||` operator is used for string concatenation and functions the same as the `concat` function. - -#### 10.1.2 LIKE Statement - - The `LIKE` statement is used for pattern matching. For detailed usage, refer to Pattern Matching:[LIKE](#1-like-运算符). - -### 10.2 String Functions - -| Function Name | Description | Input | Output | Usage | -| :------------ |:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------| :------ | :----------------------------------------------------------- | -| `length` | Returns the number of characters in a string (not byte length). | `string` (the string whose length is to be calculated) | INT32 | length(string) | -| `upper` | Converts all letters in a string to uppercase. | string | String | upper(string) | -| `lower` | Converts all letters in a string to lowercase. | string | String | lower(string) | -| `trim` | Removes specified leading and/or trailing characters from a string. **Parameters:** - `specification` (optional): Specifies which side to trim: - `BOTH`: Removes characters from both sides (default). - `LEADING`: Removes characters from the beginning. - `TRAILING`: Removes characters from the end. - `trimcharacter` (optional): Character to be removed (default is whitespace). - `string`: The target string. | string | String | trim([ [ specification ] [ trimcharacter ] FROM ] string) Example:`trim('!' 
FROM '!foo!');` —— `'foo'` | -| `strpos` | Returns the position of the first occurrence of `subStr` in `sourceStr`. **Notes:** - Position starts at `1`. - Returns `0` if `subStr` is not found. - Positioning is based on characters, not byte arrays. | `sourceStr` (string to be searched), `subStr` (substring to find) | INT32 | strpos(sourceStr, subStr) | -| `starts_with` | Checks if `sourceStr` starts with the specified `prefix`. | `sourceStr`, `prefix` | Boolean | starts_with(sourceStr, prefix) | -| `ends_with` | Checks if `sourceStr` ends with the specified `suffix`. | `sourceStr`, `suffix` | Boolean | ends_with(sourceStr, suffix) | -| `concat` | Concatenates `string1, string2, ..., stringN`. Equivalent to the `\|\|` operator. | `string`, `text` | String | concat(str1, str2, ...) or str1 \|\| str2 ... | -| `strcmp` | Compares two strings lexicographically. **Returns:** - `-1` if `str1 < str2` - `0` if `str1 = str2` - `1` if `str1 > str2` - `NULL` if either `str1` or `str2` is `NULL` | `string1`, `string2` | INT32 | strcmp(str1, str2) | -| `replace` | Removes all occurrences of `search` in `string`. | `string`, `search` | String | replace(string, search) | -| `replace` | Replaces all occurrences of `search` in `string` with `replace`. | `string`, `search`, `replace` | String | replace(string, search, replace) | -| `substring` | Extracts a substring from `start_index` to the end of the string. **Notes:** - `start_index` starts at `1`. - Returns `NULL` if input is `NULL`. - Throws an error if `start_index` is greater than string length. | `string`, `start_index` | String | substring(string from start_index)or substring(string, start_index) | -| `substring` | Extracts a substring of `length` characters starting from `start_index`. **Notes:** - `start_index` starts at `1`. - Returns `NULL` if input is `NULL`. - Throws an error if `start_index` is greater than string length. - Throws an error if `length` is negative. 
- If `start_index + length` exceeds `int.MAX`, an overflow error may occur. | `string`, `start_index`, `length` | String | substring(string from start_index for length) or substring(string, start_index, length) | - -## 11. Pattern Matching Functions - -### 11.1 LIKE - -#### 11.1.1 Usage - -The `LIKE` operator compares a value with a pattern. It is commonly used in the `WHERE` clause to match specific patterns within strings. - -#### 11.1.2 Syntax - -```SQL -... column [NOT] LIKE 'pattern' [ESCAPE 'character']; -``` - -#### 11.1.3 Match rules - -- Matching is case-sensitive. -- The pattern supports two wildcard characters: - - `_` matches any single character - - `%` matches zero or more characters - -#### 11.1.4 Notes - -- `LIKE` pattern matching applies to the entire string by default. To match a sequence anywhere within a string, the pattern must therefore start and end with a percent sign. -- To match the escape character itself, double it (e.g., with `\` as the escape character, use `\\` to match a literal `\`). - -#### 11.1.5 Examples - -#### **Example 1: Match Strings Starting with a Specific Character** - -- **Description:** Find all names that start with the letter `E` (e.g., `Europe`). - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'E%'; -``` - -#### **Example 2: Exclude a Specific Pattern** - -- **Description:** Find all names that do **not** start with the letter `E`. - -```SQL -SELECT * FROM table1 WHERE continent NOT LIKE 'E%'; -``` - -#### **Example 3: Match Strings of a Specific Length** - -- **Description:** Find all names that start with `A`, end with `a`, and have exactly two characters in between (e.g., `Asia`). - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'A__a'; -``` - -#### **Example 4: Escape Special Characters** - -- **Description:** Find all names that start with `South_` (e.g., `South_America`). The underscore (`_`) is a wildcard character, so it needs to be escaped using `\`. 
- -```SQL -SELECT * FROM table1 WHERE continent LIKE 'South\_%' ESCAPE '\'; -``` - -#### **Example 5: Match the Escape Character Itself** - -- **Description:** Find all names that start with `South\`. Since `\` is the escape character, it must be escaped using `\\`. - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'South\\%' ESCAPE '\'; -``` - -### 11.2 regexp_like - -#### 11.2.1 Usage - -Evaluates whether the regular expression pattern is present within the given string. - -#### 11.2.2 Syntax - -```SQL -regexp_like(string, pattern); -``` - -#### 11.2.3 Notes - -- The pattern for `regexp_like` only needs to be contained within the string, and does not need to match the entire string. -- To match the entire string, use the `^` and `$` anchors. -- `^` signifies the "start of the string," and `$` signifies the "end of the string." -- Regular expressions use Java regular expression syntax, with the following exceptions: - - Multiline mode - 1. Enabled by: `(?m)`. - 2. Recognizes only `\n` as the line terminator. - 3. Does not support the `(?d)` flag, and its use is prohibited. - - Case-insensitive matching - 1. Enabled by: `(?i)`. - 2. Based on Unicode rules, it does not support context-dependent and localized matching. - 3. Does not support the `(?u)` flag, and its use is prohibited. - - Character classes - 1. Within character classes (e.g., `[A-Z123]`), `\Q` and `\E` are not supported and are treated as literals. - - Unicode character classes (`\p{prop}`) - 1. Underscores in names: All underscores in names must be removed (e.g., `OldItalic` instead of `Old_Italic`). - 2. Scripts: Specify directly, without the need for `Is`, `script=`, or `sc=` prefixes (e.g., `\p{Hiragana}`). - 3. Blocks: Must use the `In` prefix; the `block=` and `blk=` prefixes are not supported (e.g., `\p{InMongolian}`). - 4. Categories: Specify directly, without the need for `Is`, `general_category=`, or `gc=` prefixes (e.g., `\p{L}`). - 5. 
Binary properties: Specify directly, without `Is` (e.g., `\p{NoncharacterCodePoint}`). - -#### 11.2.4 Examples - -#### **Example 1: Matching strings containing a specific pattern** - -```SQL -SELECT regexp_like('1a 2b 14m', '\\d+b'); -- true -``` - -- **Explanation**: Determines whether the string `'1a 2b 14m'` contains a substring that matches the pattern `\d+b`. - - `\d+` means "one or more digits". - - `b` represents the letter b. - - In `'1a 2b 14m'`, the substring `'2b'` matches this pattern, so it returns `true`. - - -#### **Example 2: Matching the entire string** - -```SQL -SELECT regexp_like('1a 2b 14m', '^\\d+b$'); -- false -``` - -- **Explanation**: Checks if the string `'1a 2b 14m'` matches the pattern `^\d+b$` exactly. - - `\d+` means "one or more digits". - - `b` represents the letter b. - - `'1a 2b 14m'` does not match this pattern because the whole string is not simply one or more digits followed by `b` (it contains other characters and ends with `m`), so it returns `false`. - -## 12. Timeseries Windowing Functions - -The sample data is as follows: - -```SQL -IoTDB> SELECT * FROM bid; -+-----------------------------+--------+-----+ -| time|stock_id|price| -+-----------------------------+--------+-----+ -|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+--------+-----+ - --- Create table statement -CREATE TABLE bid(time TIMESTAMP TIME, stock_id STRING TAG, price FLOAT FIELD); --- Insert data -INSERT INTO bid(time, stock_id, price) VALUES('2021-01-01T09:05:00','AAPL',100.0),('2021-01-01T09:06:00','TESL',200.0),('2021-01-01T09:07:00','AAPL',103.0),('2021-01-01T09:07:00','TESL',202.0),('2021-01-01T09:09:00','AAPL',102.0),('2021-01-01T09:15:00','TESL',195.0); -``` - -### 12.1 HOP - -#### 12.1.1 Function Description - -The HOP function segments data into overlapping 
time windows for analysis, assigning each row to all windows that overlap with its timestamp. If windows overlap (when SLIDE < SIZE), data will be duplicated across multiple windows. - -#### 12.1.2 Function Definition - -```SQL -HOP(data, timecol, size, slide[, origin]) -``` - -#### 12.1.3 Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | --------------------------------- | ------------------------- | -| DATA | Table | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar | String (default: 'time') | Time column | -| SIZE | Scalar | Long integer | Window size | -| SLIDE | Scalar | Long integer | Sliding step | -| ORIGIN | Scalar | Timestamp (default: Unix epoch) | First window start time | - - -#### 12.1.4 Returned Results - -The HOP function returns: - -* `window_start`: Window start time (inclusive) -* `window_end`: Window end time (exclusive) -* Pass-through columns: All input columns from DATA - -#### 12.1.5 Usage Example - -```SQL -IoTDB> SELECT * FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| 
-|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree mode's GROUP BY TIME when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 12.2 SESSION - -#### 12.2.1 Function Description - -The SESSION function groups data into sessions based on time intervals. 
It checks the time gap between consecutive rows—rows with gaps smaller than the threshold (GAP) are grouped into the current window, while larger gaps trigger a new window. - -#### 12.2.2 Function Definition - -```SQL -SESSION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], timecol, gap) -``` -#### 12.2.3 Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | ---------------------------- | -------------------------------------- | -| DATA | Table | SET SEMANTIC, PASS THROUGH | Input table with partition/sort keys | -| TIMECOL | Scalar | String (default: 'time') | Time column name | -| GAP | Scalar | Long integer | Session gap threshold | - -#### 12.2.4 Returned Results - -The SESSION function returns: - -* `window_start`: Time of the first row in the session -* `window_end`: Time of the last row in the session -* Pass-through columns: All input columns from DATA - -#### 12.2.5 Usage Example - -```SQL -IoTDB> SELECT * FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| 
-+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree mode's GROUP BY SESSION when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL| 201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 12.3 VARIATION - -#### 12.3.1 Function Description - -The VARIATION function groups data based on value differences. The first row becomes the baseline for the first window. Subsequent rows are compared to the baseline—if the difference is within the threshold (DELTA), they join the current window; otherwise, a new window starts with that row as the new baseline. 
- -#### 12.3.2 Function Definition - -```sql -VARIATION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], col, delta) -``` - -#### 12.3.3 Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | ---------------------------- | -------------------------------------- | -| DATA | Table | SET SEMANTIC, PASS THROUGH | Input table with partition/sort keys | -| COL | Scalar | String | Column for difference calculation | -| DELTA | Scalar | Float | Difference threshold | - -#### 12.3.4 Returned Results - -The VARIATION function returns: - -* `window_index`: Window identifier -* Pass-through columns: All input columns from DATA - -#### 12.3.5 Usage Example - -```sql -IoTDB> SELECT * FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price',DELTA => 2.0); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 1|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- Equivalent to tree mode's GROUP BY VARIATION when combined with GROUP BY -IoTDB> SELECT first(time) as window_start, last(time) as window_end, stock_id, avg(price) as avg FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price', DELTA => 2.0) GROUP BY window_index, stock_id; -+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| 
-|2021-01-01T09:05:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:07:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.5| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 12.4 CAPACITY - -#### 12.4.1 Function Description - -The CAPACITY function groups data into fixed-size windows, where each window contains up to SIZE rows. - -#### 12.4.2 Function Definition - -```sql -CAPACITY(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], size) -``` - -#### 12.4.3 Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | ---------------------------- | -------------------------------------- | -| DATA | Table | SET SEMANTIC, PASS THROUGH | Input table with partition/sort keys | -| SIZE | Scalar | Long integer | Window size (row count) | - -#### 12.4.4 Returned Results - -The CAPACITY function returns: - -* `window_index`: Window identifier -* Pass-through columns: All input columns from DATA - -#### 12.4.5 Usage Example - -```sql -IoTDB> SELECT * FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 0|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- Equivalent to tree mode's GROUP BY COUNT when combined with GROUP BY -IoTDB> SELECT first(time) as start_time, last(time) as end_time, stock_id, avg(price) as avg FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2) GROUP BY window_index, stock_id; 
-+-----------------------------+-----------------------------+--------+-----+ -| start_time| end_time|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|101.5| -|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 12.5 TUMBLE - -#### 12.5.1 Function Description - -The TUMBLE function assigns each row to a non-overlapping, fixed-size time window based on a timestamp attribute. - -#### 12.5.2 Function Definition - -```sql -TUMBLE(data, timecol, size[, origin]) -``` -#### 12.5.3 Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | --------------------------------- | ------------------------- | -| DATA | Table | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar | String (default: 'time') | Time column | -| SIZE | Scalar | Long integer (positive) | Window size | -| ORIGIN | Scalar | Timestamp (default: Unix epoch) | First window start time | - -#### 12.5.4 Returned Results - -The TUMBLE function returns: - -* `window_start`: Window start time (inclusive) -* `window_end`: Window end time (exclusive) -* Pass-through columns: All input columns from DATA - -#### 12.5.5 Usage Example - -```SQL -IoTDB> SELECT * FROM TUMBLE( DATA => bid, TIMECOL => 'time', SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| 
-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree mode's GROUP BY TIME when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM TUMBLE(DATA => bid, TIMECOL => 'time', SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 12.6 CUMULATE - -#### 12.6.1 Function Description - -The CUMULATE function creates expanding windows from an initial window, maintaining the same start time while incrementally extending the end time by STEP until reaching SIZE. Each window contains all elements within its range. For example, with a 1-hour STEP and 24-hour SIZE, daily windows would be: `[00:00, 01:00)`, `[00:00, 02:00)`, ..., `[00:00, 24:00)`. 
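The window arithmetic described above can be sketched in a few lines of Python. This is only an illustrative model, not IoTDB's implementation: the function name `cumulate_windows` is hypothetical, and timestamps are plain integers (for example, minutes since midnight) rather than IoTDB timestamps.

```python
def cumulate_windows(t, size, step, origin=0):
    """Return the (window_start, window_end) pairs that contain timestamp t.

    Illustrative model only; all arguments are integers in the same time unit.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    if size % step != 0:
        raise ValueError("size must be an integral multiple of step")
    # Every expanding window in one cycle shares the same start.
    start = t - (t - origin) % size
    # Window ends grow by `step` until they reach start + size. The end is
    # exclusive, so keep only the ends that lie strictly after t.
    return [(start, end)
            for end in range(start + step, start + size + 1, step)
            if end > t]


# A row at minute 546 (09:06) with SIZE = 10 and STEP = 2.
print(cumulate_windows(546, size=10, step=2))  # [(540, 548), (540, 550)]
```

Under these assumptions, a row appears once for every expanding window whose exclusive end lies after its timestamp, which is why rows early in a cycle are duplicated more often than rows late in it.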
- -#### 12.6.2 Function Definition - -```sql -CUMULATE(data, timecol, size, step[, origin]) -``` - -#### 12.6.3 Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | --------------------------------- | --------------------------------------------------- | -| DATA | Table | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar | String (default: 'time') | Time column | -| SIZE | Scalar | Long integer (positive) | Window size (must be an integer multiple of STEP) | -| STEP | Scalar | Long integer (positive) | Expansion step | -| ORIGIN | Scalar | Timestamp (default: Unix epoch) | First window start time | - -> Note: An error `Cumulative table function requires size must be an integral multiple of step` occurs if SIZE is not divisible by STEP. - -#### 12.6.4 Returned Results - -The CUMULATE function returns: - -* `window_start`: Window start time (inclusive) -* `window_end`: Window end time (exclusive) -* Pass-through columns: All input columns from DATA - -#### 12.6.5 Usage Example - -```sql -IoTDB> SELECT * FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| 
-|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree mode's GROUP BY TIME when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m, SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| TESL| 201.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00| AAPL| 100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| AAPL| 101.5| 
-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` diff --git a/src/UserGuide/Master/Table/SQL-Manual/Common-Table-Expression_timecho.md b/src/UserGuide/Master/Table/SQL-Manual/Common-Table-Expression_timecho.md deleted file mode 100644 index e2ae2ef06..000000000 --- a/src/UserGuide/Master/Table/SQL-Manual/Common-Table-Expression_timecho.md +++ /dev/null @@ -1,233 +0,0 @@ - -# Common Table Expressions (CTE) - -## 1. Overview - -CTE (Common Table Expressions) supports defining one or more temporary result sets (called common tables) using the `WITH` clause. These result sets can be referenced multiple times in subsequent parts of the same query. CTE provides a clean way to construct complex queries, making SQL code more readable and maintainable. - -> Note: This feature is available since version 2.0.9.1. - -## 2. Syntax Definition - -The simplified SQL syntax for CTE is as follows: - -```sql -with_clause: - WITH cte_name [(col_name [, col_name] ...)] AS (subquery) - [, cte_name [(col_name [, col_name] ...)] AS (subquery)] ... -``` - -- Supports simple and nested CTEs: One or more CTEs can be defined in a `WITH` clause, and CTEs can reference each other in a nested way (forward references are **not** allowed, meaning a CTE cannot reference another CTE that has not yet been defined). -- Name conflict between CTE and source table: If a CTE has the same name as a source table, only the CTE is visible in the outer scope, and the source table is shadowed. -- Multiple references to CTE: The same CTE can be referenced multiple times in the outer query. -- EXPLAIN / EXPLAIN ANALYZE support: `EXPLAIN` or `EXPLAIN ANALYZE` can be used on the entire query, but **not** on the `subquery` inside a CTE definition. 
-- Column count constraint: The number of column names specified in a CTE definition must match the number of output columns from the `subquery`, otherwise an error will be thrown. -- Unused CTE: A query can still execute normally if a defined CTE is not referenced in the main query body. - -## 3. Examples - -Using tables `table1` and `table2` from the [Sample Data](../Reference/Sample-Data.md) as source tables: - -### 3.1 Simple CTE - -```sql -WITH cte1 AS (SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL), - cte2 AS (SELECT device_id, humidity FROM table2 WHERE humidity IS NOT NULL) -SELECT * FROM cte1 JOIN cte2 ON cte1.device_id = cte2.device_id LIMIT 10; -``` - -Result - -``` -+---------+-----------+---------+--------+ -|device_id|temperature|device_id|humidity| -+---------+-----------+---------+--------+ -| 100| 90.0| 100| 45.1| -| 100| 90.0| 100| 35.2| -| 100| 90.0| 100| 35.1| -| 100| 85.0| 100| 45.1| -| 100| 85.0| 100| 35.2| -| 100| 85.0| 100| 35.1| -| 100| 85.0| 100| 45.1| -| 100| 85.0| 100| 35.2| -| 100| 85.0| 100| 35.1| -| 100| 88.0| 100| 45.1| -+---------+-----------+---------+--------+ -Total line number = 10 -It costs 0.075s -``` - -### 3.2 CTE with the Same Name as Source Table - -```sql -WITH table1 AS (SELECT time, device_id, temperature FROM table1 WHERE temperature IS NOT NULL) -SELECT * FROM table1 LIMIT 5; -``` - -Result - -``` -+-----------------------------+---------+-----------+ -| time|device_id|temperature| -+-----------------------------+---------+-----------+ -|2024-11-30T09:30:00.000+08:00| 101| 90.0| -|2024-11-30T14:30:00.000+08:00| 101| 90.0| -|2024-11-29T10:00:00.000+08:00| 101| 85.0| -|2024-11-27T16:39:00.000+08:00| 101| 85.0| -|2024-11-27T16:40:00.000+08:00| 101| 85.0| -+-----------------------------+---------+-----------+ -Total line number = 5 -It costs 0.103s -``` - -### 3.3 Nested CTE - -```sql -WITH - table1 AS (SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL), - cte1 AS (SELECT 
device_id, temperature FROM table2 WHERE temperature IS NOT NULL), - table2 AS (SELECT temperature FROM table1), - cte2 AS (SELECT temperature FROM table1) -SELECT * FROM table2; -``` - -Result - -``` -+-----------+ -|temperature| -+-----------+ -| 90.0| -| 90.0| -| 85.0| -| 85.0| -| 85.0| -| 85.0| -| 90.0| -| 85.0| -| 85.0| -| 88.0| -| 90.0| -| 90.0| -+-----------+ -Total line number = 12 -It costs 0.050s -``` - -- Forward references are **not** supported - -```sql -WITH - cte2 AS (SELECT temperature FROM cte1), - cte1 AS (SELECT device_id, temperature FROM table1) -SELECT * FROM cte2; -``` - -Error message - -``` -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 550: Table 'database1.cte1' does not exist. -``` - -### 3.4 Multiple References to CTE - -```sql -WITH cte AS (SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL) -SELECT * FROM cte WHERE temperature > (SELECT avg(temperature) FROM cte); -``` - -Result - -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| 90.0| -| 101| 90.0| -| 100| 90.0| -| 100| 88.0| -| 100| 90.0| -| 100| 90.0| -+---------+-----------+ -Total line number = 6 -It costs 0.203s -``` - -### 3.5 EXPLAIN Support - -- Supported on the entire query - -```sql -EXPLAIN WITH cte AS (SELECT * FROM table1) SELECT * FROM cte; -``` - -Result - -``` -+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| distribution plan| -+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ | -| │OutputNode-7 │ | -| 
│OutputColumns-[time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time] │ | -| │OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│ | -| └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ | -| │ | -| │ | -| ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ | -| │Collect-42 │ | -| │OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│ | -| └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ | -| ┌───────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────┐ | -| │ │ | -| ┌───────────┐ ┌───────────┐ | -| │Exchange-49│ │Exchange-50│ | -| └───────────┘ └───────────┘ | -| │ │ | -| │ │ | -|┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐| -|│DeviceTableScanNode-41 │ │DeviceTableScanNode-40 │| -|│QualifiedTableName: database1.table1 │ │QualifiedTableName: database1.table1 │| -|│OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│ │OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│| -|│DeviceNumber: 3 │ │DeviceNumber: 3 │| -|│ScanOrder: ASC │ │ScanOrder: ASC │| -|│PushDownOffset: 0 │ │PushDownOffset: 0 │| -|│PushDownLimit: 0 │ │PushDownLimit: 0 │| -|│PushDownLimitToEachDevice: false │ │PushDownLimitToEachDevice: false │| -|│RegionId: 2 │ │RegionId: 1 │| 
-|└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘| -+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 29 -It costs 0.065s -``` - -- Not supported for internal queries of CTE - -```sql -WITH cte AS (EXPLAIN SELECT * FROM table1) SELECT * FROM cte; -``` - -Error message - -``` -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 700: line 1:14: mismatched input 'EXPLAIN'. Expecting: -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Table/SQL-Manual/Featured-Functions_timecho.md b/src/UserGuide/Master/Table/SQL-Manual/Featured-Functions_timecho.md deleted file mode 100644 index 97cefa7a3..000000000 --- a/src/UserGuide/Master/Table/SQL-Manual/Featured-Functions_timecho.md +++ /dev/null @@ -1,861 +0,0 @@ - -# Featured Functions - -## 1. Downsampling Functions - -### 1.1 `date_bin` Function - -#### **Description** - -The `date_bin` function is a scalar function that aligns timestamps to the start of specified time intervals. It is commonly used with the `GROUP BY` clause for downsampling. - -- **Partial Intervals May Be Empty:** Only timestamps that meet the conditions are aligned; missing intervals are not filled. -- **All Intervals Return Empty:** If no data exists within the query range, the downsampling result is an empty set. - -#### **Usage Examples** - -[Sample Dataset](../Reference/Sample-Data.md): The example data page contains SQL statements for building table structures and inserting data. Download and execute these statements in the IoTDB CLI to import the data into IoTDB. 
You can use this data to test and execute the SQL statements in the examples and obtain the corresponding results. - -**Example 1: Hourly Average Temperature for Device 100** - -```SQL -SELECT date_bin(1h, time) AS hour_time, avg(temperature) AS avg_temp -FROM table1 -WHERE (time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00) - AND device_id = '100' -GROUP BY 1; -``` - -**Result** - -```Plain -+-----------------------------+--------+ -| hour_time|avg_temp| -+-----------------------------+--------+ -|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:00:00.000+08:00| 90.0| -|2024-11-28T08:00:00.000+08:00| 85.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 85.0| -|2024-11-28T11:00:00.000+08:00| 88.0| -+-----------------------------+--------+ -``` - -**Example 2: Hourly Average Temperature for Each Device** - -```SQL -SELECT date_bin(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp -FROM table1 -WHERE time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00 -GROUP BY 1, device_id; -``` - -**Result** - -```Plain -+-----------------------------+---------+--------+ -| hour_time|device_id|avg_temp| -+-----------------------------+---------+--------+ -|2024-11-29T11:00:00.000+08:00| 100| null| -|2024-11-29T18:00:00.000+08:00| 100| 90.0| -|2024-11-28T08:00:00.000+08:00| 100| 85.0| -|2024-11-28T09:00:00.000+08:00| 100| null| -|2024-11-28T10:00:00.000+08:00| 100| 85.0| -|2024-11-28T11:00:00.000+08:00| 100| 88.0| -|2024-11-29T10:00:00.000+08:00| 101| 85.0| -|2024-11-27T16:00:00.000+08:00| 101| 85.0| -+-----------------------------+---------+--------+ -``` - -**Example 3: Hourly Average Temperature for All Devices** - -```SQL -SELECT date_bin(1h, time) AS hour_time, avg(temperature) AS avg_temp - FROM table1 - WHERE time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00 - group by 1; -``` - -**Result** - -```Plain -+-----------------------------+--------+ -| hour_time|avg_temp| 
-+-----------------------------+--------+ -|2024-11-29T10:00:00.000+08:00| 85.0| -|2024-11-27T16:00:00.000+08:00| 85.0| -|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:00:00.000+08:00| 90.0| -|2024-11-28T08:00:00.000+08:00| 85.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 85.0| -|2024-11-28T11:00:00.000+08:00| 88.0| -+-----------------------------+--------+ -``` - -### 1.2 `date_bin_gapfill` Function - -#### **Description:** - -The `date_bin_gapfill` function is an extension of `date_bin` that fills in missing time intervals, returning a complete time series. - -- **Partial Intervals May Be Empty**: Aligns timestamps for data that meets the conditions and fills in missing intervals. -- **All Intervals Return Empty**: If no data exists within the query range, the result is an empty set. - -#### **Limitations:** - -1. The function must always be used with the `GROUP BY` clause. If used elsewhere, it behaves like `date_bin` without gap-filling. -2. A `GROUP BY` clause can contain only one instance of date_bin_gapfill. Multiple calls will result in an error. -3. The `GAPFILL` operation occurs after the `HAVING` clause and before the `FILL` clause. -4. The `WHERE` clause must include time filters in one of the following forms: - 1. `time >= XXX AND time <= XXX` - 2. `time > XXX AND time < XXX` - 3. `time BETWEEN XXX AND XXX` -5. If additional time filters or conditions are used, an error is raised. Time conditions and other value filters must be connected using the `AND` operator. -6. If `startTime` and `endTime` cannot be inferred from the `WHERE` clause, an error is raised. 
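The gap-filling behavior can be modeled roughly as follows. This is a sketch under stated assumptions, not the engine's algorithm: `date_bin` and `gapfill_avg` are hypothetical helper names, timestamps are integers (hours here), the time filter is the inclusive form `time >= start AND time <= end`, and the first bucket is assumed to be the aligned bucket containing `start`.

```python
def date_bin(interval, t, origin=0):
    """Align t down to the start of its interval (illustrative model)."""
    return t - (t - origin) % interval


def gapfill_avg(rows, interval, start, end):
    """Average values per bucket and fill buckets that have no rows with None.

    rows is a list of (timestamp, value) pairs; value may be None.
    """
    buckets = {}
    for t, value in rows:
        if start <= t <= end:
            buckets.setdefault(date_bin(interval, t), []).append(value)
    # No data at all in the range: the result is empty, not a list of gaps.
    if not buckets:
        return []
    result = []
    bucket = date_bin(interval, start)
    while bucket <= end:
        values = [v for v in buckets.get(bucket, []) if v is not None]
        result.append((bucket, sum(values) / len(values) if values else None))
        bucket += interval
    return result


rows = [(8, 85.0), (9, None), (10, 85.0), (11, 88.0)]
for bucket, avg in gapfill_avg(rows, interval=1, start=7, end=16):
    print(bucket, avg)
```

The model makes the role of the `WHERE` time bounds visible: without a known `start` and `end` there is no way to enumerate the buckets that need filling.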
- -**Usage Examples** - -**Example 1: Fill Missing Intervals** - -```SQL -SELECT date_bin_gapfill(1h, time) AS hour_time, avg(temperature) AS avg_temp -FROM table1 -WHERE (time >= 2024-11-28 07:00:00 AND time <= 2024-11-28 16:00:00) - AND device_id = '100' -GROUP BY 1; -``` - -**Result** - -```Plain -+-----------------------------+--------+ -| hour_time|avg_temp| -+-----------------------------+--------+ -|2024-11-28T07:00:00.000+08:00| null| -|2024-11-28T08:00:00.000+08:00| 85.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 85.0| -|2024-11-28T11:00:00.000+08:00| 88.0| -|2024-11-28T12:00:00.000+08:00| null| -|2024-11-28T13:00:00.000+08:00| null| -|2024-11-28T14:00:00.000+08:00| null| -|2024-11-28T15:00:00.000+08:00| null| -|2024-11-28T16:00:00.000+08:00| null| -+-----------------------------+--------+ -``` - -**Example 2: Fill Missing Intervals with Device Grouping** - -```SQL -SELECT date_bin_gapfill(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp -FROM table1 -WHERE time >= 2024-11-28 07:00:00 AND time <= 2024-11-28 16:00:00 -GROUP BY 1, device_id; -``` - -**Result** - -```Plain -+-----------------------------+---------+--------+ -| hour_time|device_id|avg_temp| -+-----------------------------+---------+--------+ -|2024-11-28T07:00:00.000+08:00| 100| null| -|2024-11-28T08:00:00.000+08:00| 100| 85.0| -|2024-11-28T09:00:00.000+08:00| 100| null| -|2024-11-28T10:00:00.000+08:00| 100| 85.0| -|2024-11-28T11:00:00.000+08:00| 100| 88.0| -|2024-11-28T12:00:00.000+08:00| 100| null| -|2024-11-28T13:00:00.000+08:00| 100| null| -|2024-11-28T14:00:00.000+08:00| 100| null| -|2024-11-28T15:00:00.000+08:00| 100| null| -|2024-11-28T16:00:00.000+08:00| 100| null| -+-----------------------------+---------+--------+ -``` - -**Example 3: Empty Result Set for No Data in Range** - -```SQL -SELECT date_bin_gapfill(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp -FROM table1 -WHERE time >= 2024-11-27 09:00:00 AND time <= 
2024-11-27 14:00:00 -GROUP BY 1, device_id; -``` - -**Result** - -```Plain -+---------+---------+--------+ -|hour_time|device_id|avg_temp| -+---------+---------+--------+ -+---------+---------+--------+ -``` - -## 2. `DIFF` Function - -### 2.1 **Description:** - -- The `DIFF` function calculates the difference between the current row and the previous row. For the first row, it returns `NULL` since there is no previous row. - -### 2.2 **Function Definition:** - -``` -DIFF(numeric[, boolean]) -> Double -``` - -### 2.3 **Parameters:** - -#### **First Parameter (numeric):** - -- **Type**: Must be numeric (`INT32`, `INT64`, `FLOAT`, `DOUBLE`). -- **Purpose**: Specifies the column for which to calculate the difference. - -#### **Second Parameter (boolean, optional):** - -- **Type**: Boolean (`true` or `false`). -- **Default**: `true`. -- **Purpose**: - - `true`: Ignores `NULL` values and uses the nearest preceding non-`NULL` value as the previous value. If no preceding non-`NULL` value exists, returns `NULL`. - - `false`: Does not ignore `NULL` values. If the previous row is `NULL`, the result is `NULL`. - -### 2.4 **Notes:** - -- In **tree models**, the second parameter must be specified as `'ignoreNull'='true'` or `'ignoreNull'='false'`. -- In **table models**, simply use `true` or `false`. Using `'ignoreNull'='true'` or `'ignoreNull'='false'` in table models results in a string comparison and always evaluates to `false`. 
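The effect of the second parameter is easiest to see in a small model. The following Python sketch is an assumption-laden illustration, not IoTDB code: the function name `diff` and the `ignore_null` keyword are hypothetical, and `None` stands in for SQL `NULL`.

```python
def diff(values, ignore_null=True):
    """Model of DIFF over an ordered column; None stands in for NULL."""
    result = []
    previous = None        # value of the immediately preceding row
    last_non_null = None   # nearest preceding value that is not None
    for value in values:
        reference = last_non_null if ignore_null else previous
        if value is None or reference is None:
            result.append(None)
        else:
            result.append(float(value - reference))
        previous = value
        if value is not None:
            last_non_null = value
    return result


temps = [None, 90.0, 85.0, None, 85.0, 88.0, 90.0, 90.0]
print(diff(temps))                     # ignore NULLs (the default)
print(diff(temps, ignore_null=False))  # a NULL previous row yields NULL
```

With the sample column above, the two modes differ only on the row that follows a `NULL`: ignoring `NULL` values yields `0.0` there (85.0 minus the earlier 85.0), while the strict mode yields `None`.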
- -### 2.5 **Usage Examples** - -#### **Example 1: Ignore NULL Values** - -```SQL -SELECT time, DIFF(temperature) AS diff_temp -FROM table1 -WHERE device_id = '100'; -``` - -**Result** - -```Plain -+-----------------------------+---------+ -| time|diff_temp| -+-----------------------------+---------+ -|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:30:00.000+08:00| null| -|2024-11-28T08:00:00.000+08:00| -5.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 0.0| -|2024-11-28T11:00:00.000+08:00| 3.0| -|2024-11-26T13:37:00.000+08:00| 2.0| -|2024-11-26T13:38:00.000+08:00| 0.0| -+-----------------------------+---------+ -``` - -#### **Example 2: Do Not Ignore NULL Values** - -```SQL -SELECT time, DIFF(temperature, false) AS diff_temp -FROM table1 -WHERE device_id = '100'; -``` - -**Result** - -```Plain -+-----------------------------+---------+ -| time|diff_temp| -+-----------------------------+---------+ -|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:30:00.000+08:00| null| -|2024-11-28T08:00:00.000+08:00| -5.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| null| -|2024-11-28T11:00:00.000+08:00| 3.0| -|2024-11-26T13:37:00.000+08:00| 2.0| -|2024-11-26T13:38:00.000+08:00| 0.0| -+-----------------------------+---------+ -``` - -#### **Example 3: Full Example** - -```SQL -SELECT time, temperature, - DIFF(temperature) AS diff_temp_1, - DIFF(temperature, false) AS diff_temp_2 -FROM table1 -WHERE device_id = '100'; -``` - -**Result** - -```Plain -+-----------------------------+-----------+-----------+-----------+ -| time|temperature|diff_temp_1|diff_temp_2| -+-----------------------------+-----------+-----------+-----------+ -|2024-11-29T11:00:00.000+08:00| null| null| null| -|2024-11-29T18:30:00.000+08:00| 90.0| null| null| -|2024-11-28T08:00:00.000+08:00| 85.0| -5.0| -5.0| -|2024-11-28T09:00:00.000+08:00| null| null| null| -|2024-11-28T10:00:00.000+08:00| 85.0| 0.0| null| -|2024-11-28T11:00:00.000+08:00| 88.0| 
3.0| 3.0| -|2024-11-26T13:37:00.000+08:00| 90.0| 2.0| 2.0| -|2024-11-26T13:38:00.000+08:00| 90.0| 0.0| 0.0| -+-----------------------------+-----------+-----------+-----------+ -``` - -## 3 Timeseries Windowing Functions - -The sample data is as follows: - -```SQL -IoTDB> SELECT * FROM bid; -+-----------------------------+--------+-----+ -| time|stock_id|price| -+-----------------------------+--------+-----+ -|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+--------+-----+ - --- Create table statement -CREATE TABLE bid(time TIMESTAMP TIME, stock_id STRING TAG, price FLOAT FIELD); --- Insert data -INSERT INTO bid(time, stock_id, price) VALUES('2021-01-01T09:05:00','AAPL',100.0),('2021-01-01T09:06:00','TESL',200.0),('2021-01-01T09:07:00','AAPL',103.0),('2021-01-01T09:07:00','TESL',202.0),('2021-01-01T09:09:00','AAPL',102.0),('2021-01-01T09:15:00','TESL',195.0); -``` - -### 3.1 HOP - -#### Function Description - -The HOP function segments data into overlapping time windows for analysis, assigning each row to all windows that overlap with its timestamp. If windows overlap (when SLIDE < SIZE), data will be duplicated across multiple windows. 
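To make the duplication concrete, the assignment of one row to its windows can be modeled as below. This is a minimal sketch, not the IoTDB implementation: `hop_windows` is a hypothetical name, and timestamps, SIZE, SLIDE and ORIGIN are all plain integers in the same unit (for example, minutes since midnight).

```python
def hop_windows(t, size, slide, origin=0):
    """Return every (window_start, window_end) pair that contains timestamp t.

    Illustrative model only; window starts lie at origin + k * slide.
    """
    if size <= 0 or slide <= 0:
        raise ValueError("size and slide must be positive")
    # Latest window start that is not after t.
    start = t - (t - origin) % slide
    windows = []
    # Walk backwards while the window [start, start + size) still contains t.
    while start + size > t:
        windows.append((start, start + size))
        start -= slide
    return windows[::-1]


# A row at minute 546 (09:06) with SIZE = 10 and SLIDE = 5.
print(hop_windows(546, size=10, slide=5))   # [(540, 550), (545, 555)]
# With SLIDE equal to SIZE the windows no longer overlap.
print(hop_windows(546, size=10, slide=10))  # [(540, 550)]
```

In this model each row is emitted roughly SIZE / SLIDE times, so halving SLIDE doubles the number of output rows.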
- -#### Function Definition - -```SQL -HOP(data, timecol, size, slide[, origin]) -``` - -#### Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | --------------------------------- | ------------------------- | -| DATA | Table | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar | String (default: 'time') | Time column | -| SIZE | Scalar | Long integer | Window size | -| SLIDE | Scalar | Long integer | Sliding step | -| ORIGIN | Scalar | Timestamp (default: Unix epoch) | First window start time | - - -#### Returned Results - -The HOP function returns: - -* `window_start`: Window start time (inclusive) -* `window_end`: Window end time (exclusive) -* Pass-through columns: All input columns from DATA - -#### Usage Example - -```SQL -IoTDB> SELECT * FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| 
-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree model's GROUP BY TIME when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 3.2 SESSION - -#### Function Description - -The SESSION function groups data into sessions based on the time gap between consecutive rows. A row whose gap from the previous row does not exceed the threshold (GAP) joins the current window, while a larger gap starts a new window. 
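The session rule can be modeled for a single partition as follows. This is a hedged sketch rather than the actual implementation: `session_windows` is a hypothetical name, timestamps are integers, and a gap exactly equal to GAP is treated as staying in the current session, which is what the usage example's output suggests (rows two minutes apart share a session when GAP is 2m).

```python
def session_windows(times, gap):
    """Group the timestamps of one partition into sessions.

    Returns (window_start, window_end, member_times) tuples, where the
    start and end are the first and last timestamps in the session.
    """
    sessions = []
    for t in sorted(times):
        # Compare with the last timestamp of the current session.
        if sessions and t - sessions[-1][-1] <= gap:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return [(s[0], s[-1], s) for s in sessions]


# One partition with rows at 09:06, 09:07 and 09:15 (minutes since midnight).
print(session_windows([546, 547, 555], gap=2))
```

Because the comparison is always made against the most recent row, a session can span far more than GAP in total as long as no single step exceeds it.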
- -#### Function Definition - -```SQL -SESSION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], timecol, gap) -``` -#### Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | ---------------------------- | -------------------------------------- | -| DATA | Table | SET SEMANTIC, PASS THROUGH | Input table with partition/sort keys | -| TIMECOL | Scalar | String (default: 'time') | Time column name | -| GAP | Scalar | Long integer | Session gap threshold | - -#### Returned Results - -The SESSION function returns: - -* `window_start`: Time of the first row in the session -* `window_end`: Time of the last row in the session -* Pass-through columns: All input columns from DATA - -#### Usage Example - -```SQL -IoTDB> SELECT * FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree model's GROUP BY SESSION when combined with GROUP BY -IoTDB> SELECT window_start, window_end, 
stock_id, avg(price) as avg FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL| 201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 3.3 VARIATION - -#### Function Description - -The VARIATION function groups data based on value differences. The first row becomes the baseline for the first window. Subsequent rows are compared to the baseline—if the difference is within the threshold (DELTA), they join the current window; otherwise, a new window starts with that row as the new baseline. 
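The baseline rule described above can be sketched in Python as follows. This is an illustration rather than IoTDB's code: `variation_windows` is a hypothetical helper, and comparing the absolute difference inclusively (`<= delta`) is an assumption inferred from the sample output, where 200.0 and 202.0 share a window at DELTA 2.0.

```python
def variation_windows(values, delta):
    """Return the window index of each value, in input order.

    The first value is the baseline of window 0. A value whose absolute
    difference from the current baseline is at most `delta` stays in the
    current window; otherwise it starts the next window and becomes the
    new baseline.
    """
    indexes = []
    baseline = None
    index = -1
    for v in values:
        if baseline is None or abs(v - baseline) > delta:
            baseline = v        # this row is the baseline of a new window
            index += 1
        indexes.append(index)
    return indexes


# Prices of each partition of the sample table `bid`, ordered by time.
tesl_index = variation_windows([200.0, 202.0, 195.0], delta=2.0)
aapl_index = variation_windows([100.0, 103.0, 102.0], delta=2.0)
```

Note that rows are compared with the window's baseline, not with the immediately preceding row, so a slow drift can still end a window once it moves more than DELTA away from the first row.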
- -#### Function Definition - -```sql -VARIATION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], col, delta) -``` - -#### Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | ---------------------------- | -------------------------------------- | -| DATA | Table | SET SEMANTIC, PASS THROUGH | Input table with partition/sort keys | -| COL | Scalar | String | Column for difference calculation | -| DELTA | Scalar | Float | Difference threshold | - -#### Returned Results - -The VARIATION function returns: - -* `window_index`: Window identifier -* Pass-through columns: All input columns from DATA - -#### Usage Example - -```sql -IoTDB> SELECT * FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price',DELTA => 2.0); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 1|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- Equivalent to tree model's GROUP BY VARIATION when combined with GROUP BY -IoTDB> SELECT first(time) as window_start, last(time) as window_end, stock_id, avg(price) as avg FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price', DELTA => 2.0) GROUP BY window_index, stock_id; -+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| 
-|2021-01-01T09:05:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:07:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.5| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 3.4 CAPACITY - -#### Function Description - -The CAPACITY function groups data into fixed-size windows, where each window contains up to SIZE rows. - -#### Function Definition - -```sql -CAPACITY(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], size) -``` - -#### Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | ---------------------------- | -------------------------------------- | -| DATA | Table | SET SEMANTIC, PASS THROUGH | Input table with partition/sort keys | -| SIZE | Scalar | Long integer | Window size (row count) | - -#### Returned Results - -The CAPACITY function returns: - -* `window_index`: Window identifier -* Pass-through columns: All input columns from DATA - -#### Usage Example - -```sql -IoTDB> SELECT * FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 0|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- Equivalent to tree model's GROUP BY COUNT when combined with GROUP BY -IoTDB> SELECT first(time) as start_time, last(time) as end_time, stock_id, avg(price) as avg FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2) GROUP BY window_index, stock_id; -+-----------------------------+-----------------------------+--------+-----+ -| start_time| 
end_time|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|101.5| -|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 3.5 TUMBLE - -#### Function Description - -The TUMBLE function assigns each row to a non-overlapping, fixed-size time window based on a timestamp attribute. - -#### Function Definition - -```sql -TUMBLE(data, timecol, size[, origin]) -``` -#### Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | --------------------------------- | ------------------------- | -| DATA | Table | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar | String (default: 'time') | Time column | -| SIZE | Scalar | Long integer (positive) | Window size | -| ORIGIN | Scalar | Timestamp (default: Unix epoch) | First window start time | - -#### Returned Results - -The TUMBLE function returns: - -* `window_start`: Window start time (inclusive) -* `window_end`: Window end time (exclusive) -* Pass-through columns: All input columns from DATA - -#### Usage Example - -```SQL -IoTDB> SELECT * FROM TUMBLE( DATA => bid, TIMECOL => 'time', SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| 
-|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree model's GROUP BY TIME when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM TUMBLE(DATA => bid, TIMECOL => 'time', SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 3.6 CUMULATE - -#### Function Description - -The CUMULATE function creates expanding windows from an initial window, maintaining the same start time while incrementally extending the end time by STEP until reaching SIZE. Each window contains all elements within its range. For example, with a 1-hour STEP and 24-hour SIZE, daily windows would be: `[00:00, 01:00)`, `[00:00, 02:00)`, ..., `[00:00, 24:00)`. 
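As a rough illustration of the expanding-window rule above, the Python sketch below lists the windows that contain a single timestamp. It is not IoTDB's implementation: `cumulate_windows` is a hypothetical helper, times are integers in minutes, and the origin is taken as 0 for simplicity.

```python
def cumulate_windows(t, size, step, origin=0):
    """Return the (window_start, window_end) pairs that contain time `t`.

    All windows of one cycle share the same start. Their ends grow by
    `step` until the full `size` is reached. The end is exclusive.
    """
    if size % step != 0:
        raise ValueError("size must be an integral multiple of step")
    # Start of the size-wide cycle that contains t.
    start = origin + (t - origin) // size * size
    windows = []
    end = start + step
    while end <= start + size:
        if t < end:                     # t lies inside [start, end)
            windows.append((start, end))
        end += step
    return windows


# Minutes after 09:00, with STEP = 2 and SIZE = 10 as in the usage example.
row_0906 = cumulate_windows(6, size=10, step=2)
row_0915 = cumulate_windows(15, size=10, step=2)
```

A row at 09:06 falls into the windows ending at 09:08 and 09:10 but not the one ending at 09:06, because the window end is exclusive.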
- -#### Function Definition - -```sql -CUMULATE(data, timecol, size, step[, origin]) -``` - -#### Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | --------------------------------- | --------------------------------------------------- | -| DATA | Table | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar | String (default: 'time') | Time column | -| SIZE | Scalar | Long integer (positive) | Window size (must be an integer multiple of STEP) | -| STEP | Scalar | Long integer (positive) | Expansion step | -| ORIGIN | Scalar | Timestamp (default: Unix epoch) | First window start time | - -> Note: An error `Cumulative table function requires size must be an integral multiple of step` occurs if SIZE is not divisible by STEP. - -#### Returned Results - -The CUMULATE function returns: - -* `window_start`: Window start time (inclusive) -* `window_end`: Window end time (exclusive) -* Pass-through columns: All input columns from DATA - -#### Usage Example - -```sql -IoTDB> SELECT * FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| 
-|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree model's GROUP BY TIME when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m, SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| TESL| 201.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00| AAPL| 100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| AAPL| 101.5| 
-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - - -## 4. Window Functions - -### 4.1 SQL Definition - -```SQL -windowDefinition - : name=identifier AS '(' windowSpecification ')' - ; - -windowSpecification - : (existingWindowName=identifier)? - (PARTITION BY partition+=expression (',' partition+=expression)*)? - (ORDER BY sortItem (',' sortItem)*)? - windowFrame? - ; - -windowFrame - : frameExtent - ; - -frameExtent - : frameType=RANGE start=frameBound - | frameType=ROWS start=frameBound - | frameType=GROUPS start=frameBound - | frameType=RANGE BETWEEN start=frameBound AND end=frameBound - | frameType=ROWS BETWEEN start=frameBound AND end=frameBound - | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound - ; - -frameBound - : UNBOUNDED boundType=PRECEDING #unboundedFrame - | UNBOUNDED boundType=FOLLOWING #unboundedFrame - | CURRENT ROW #currentRowBound - | expression boundType=(PRECEDING | FOLLOWING) #boundedFrame - ; -``` - -For more detailed introductions to the features, please refer to: [Window Function](../User-Manual/Window-Function_timecho.md) - -### 4.2 Usage Examples - -The original data of the device_flow table is as follows: - -```sql -+-----------------------------+------+-----+ -| time|device| flow| -+-----------------------------+------+-----+ -|1970-01-01T08:00:00.000+08:00| d0| 3| -|1970-01-01T08:00:00.001+08:00| d0| 5| -|1970-01-01T08:00:00.002+08:00| d0| 3| -|1970-01-01T08:00:00.003+08:00| d0| 1| -|1970-01-01T08:00:00.004+08:00| d1| 2| -|1970-01-01T08:00:00.005+08:00| d1| 4| -+-----------------------------+------+-----+ -``` - -1. Query all columns from device_flow, group the data by the device dimension, sort the records within each device group by the value of the flow field, calculate the cumulative sum of the flow field, and finally return the cumulative sum as a column named sum. 
- -SQL: - -```SQL -IoTDB> SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) as sum FROM device_flow; -``` - -Result: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` -2. Query all original columns from the device_flow table, group the data by the device dimension (device), sort the records within each device group by the value of the flow field, count the number of rows within the range of "the flow group of the current row + the previous 1 flow group", and finally return the count result as a column named count. - -SQL: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ORDER BY flow GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -Result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -3. Query all original columns from the device_flow table, group the data by device, sort the records in ascending order by the value of the flow field within each group, count the number of all rows falling within the numeric range of "the flow value of the current row minus 2" to "the flow value of the current row", and finally return the count result as a column named count. 
- -SQL: - -```SQL -IoTDB> SELECT *,count(flow) OVER(PARTITION BY device ORDER BY flow RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -Result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -## 5. Object Type Read Function - -**Description**: -Reads binary content from an `OBJECT` type column and returns a `BLOB` type (raw binary data of the object). -> Supported since V2.0.8 - -**Syntax:** -```SQL -READ_OBJECT(object [, offset, length]) -``` - -**Parameters:** -- **Required**: - `object` (OBJECT type) -- **Optional**: - - `offset` (long/INT64): Start position for reading. Default: `0`. - - Throws error if `offset < 0` or `offset >= full file length`. - - `length` (long/INT64): Number of bytes to read. Default: full file length. - - Error if `length > 2^31 - 1`. - - If `length` exceeds remaining bytes from `offset`, reads until end of file. - - If `length < 0`, reads all remaining data from `offset`. 
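The `offset` and `length` rules listed above can be modeled on an in-memory byte string. The Python sketch below is a hypothetical helper for reasoning about edge cases, not the server-side implementation; using `-1` as the default `length` to mean "read everything" is a modeling choice of this sketch.

```python
MAX_LENGTH = 2**31 - 1


def read_object(data, offset=0, length=-1):
    """Emulate the documented READ_OBJECT slicing rules on `data` (bytes)."""
    if offset < 0 or offset >= len(data):
        raise ValueError("offset out of range")
    if length > MAX_LENGTH:
        raise ValueError("length exceeds 2^31 - 1")
    if length < 0:
        return data[offset:]                # read all remaining bytes
    return data[offset:offset + length]     # slicing stops at end of data


# The object content b"iotdb" is 0x696f746462 in hexadecimal.
full = read_object(b"iotdb")
head = read_object(b"iotdb", 0, 3)
tail = read_object(b"iotdb", 3, 100)
```

Here `tail` shows the "reads until end of file" rule: asking for 100 bytes from offset 3 returns only the 2 bytes that remain.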
- -**Examples:** -```sql -IoTDB:database1> SELECT READ_OBJECT(s1) FROM table1 WHERE device_id = 'tag1' -+------------+ -| _col0| -+------------+ -|0x696f746462| -+------------+ -Total line number = 1 - -IoTDB:database1> SELECT READ_OBJECT(s1, 0, 3) FROM table1 WHERE device_id = 'tag1' -+--------+ -| _col0| -+--------+ -|0x696f74| -+--------+ -Total line number = 1 -``` diff --git a/src/UserGuide/Master/Table/SQL-Manual/QuickStart-Only-Sql_timecho.md b/src/UserGuide/Master/Table/SQL-Manual/QuickStart-Only-Sql_timecho.md deleted file mode 100644 index 97569de90..000000000 --- a/src/UserGuide/Master/Table/SQL-Manual/QuickStart-Only-Sql_timecho.md +++ /dev/null @@ -1,126 +0,0 @@ - -# QuickStart Only SQL - -> **Before executing the following SQL statements, please ensure** -> -> * **IoTDB service has been successfully started** -> * **Connected to IoTDB via the CLI client** -> -> Note: If your terminal does not support multi-line pasting (e.g., Windows CMD), please adjust the SQL statements to single-line format before execution. - -## 1. Database Management - -```SQL --- Create database database1 if it does not already exist; -CREATE DATABASE IF NOT EXISTS database1; - --- Use database database1; -USE database1; - --- Modify the database TTL time to 1 week; -ALTER DATABASE database1 SET PROPERTIES TTL=604800000; - --- Delete database database1; -DROP DATABASE IF EXISTS database1; -``` - -For detailed syntax description, please refer to: [Database Management](../Basic-Concept/Database-Management_timecho.md) - -## 2. 
Table Management - -```SQL --- Create table table1; -CREATE TABLE table1 ( - time TIMESTAMP TIME, - device_id STRING TAG, - maintenance STRING ATTRIBUTE COMMENT 'maintenance', - temperature FLOAT FIELD COMMENT 'temperature', - status Boolean FIELD COMMENT 'status' -); - --- View column information of table table1; -DESC table1 DETAILS; - --- Modify table; --- Add column to table table1; -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS humidity FLOAT FIELD COMMENT 'humidity'; --- Set table table1 TTL to 1 week; -ALTER TABLE table1 set properties TTL=604800000; - --- Delete table table1; -DROP TABLE table1; -``` - -For detailed syntax description, please refer to: [Table Management](../Basic-Concept/Table-Management_timecho.md) - -## 3. Data Writing - -```SQL --- Single row writing; -INSERT INTO table1(device_id, time, temperature) VALUES ('100', '2025-11-26 13:37:00', 90.0); - --- Multi-row writing; -INSERT INTO table1(device_id, maintenance, time, temperature) VALUES - ('101', '180', '2024-11-26 13:37:00', 88.0), - ('100', '180', '2024-11-26 13:38:00', 85.0), - ('101', '180', '2024-11-27 16:38:00', 80.0); -``` - -For detailed syntax description, please refer to: [Data Writing](../Basic-Concept/Write-Updata-Data_timecho.md#_1-data-insertion) - -## 4. Data Query - -```SQL --- Full table query; -SELECT * FROM table1; - --- Function query; -SELECT count(*), sum(temperature) FROM table1; - --- Query data for specified device and time period; -SELECT * -FROM table1 -WHERE time >= 2024-11-26 00:00:00 and time <= 2024-11-27 00:00:00 and device_id='101'; -``` - -For detailed syntax description, please refer to: [Data Query](../Basic-Concept/Query-Data_timecho.md) - -## 5. Data Update - -```SQL --- Update the maintenance attribute value for data where device_id is 100; -UPDATE table1 SET maintenance='45' WHERE device_id='100'; -``` - -For detailed syntax description, please refer to: [Data Update](../Basic-Concept/Write-Updata-Data_timecho.md#_2-data-updates) - -## 6. 
Data Deletion - -```SQL --- Delete data for specified device and time period; -DELETE FROM table1 WHERE time >= 2024-11-26 23:39:00 and time <= 2024-11-27 20:42:00 AND device_id='101'; - --- Full table deletion; -DELETE FROM table1; -``` - -For detailed syntax description, please refer to: [Data Deletion](../Basic-Concept/Delete-Data.md) \ No newline at end of file diff --git a/src/UserGuide/Master/Table/SQL-Manual/Row-Pattern-Recognition_timecho.md b/src/UserGuide/Master/Table/SQL-Manual/Row-Pattern-Recognition_timecho.md deleted file mode 100644 index b5136f77a..000000000 --- a/src/UserGuide/Master/Table/SQL-Manual/Row-Pattern-Recognition_timecho.md +++ /dev/null @@ -1,167 +0,0 @@ - - -# Pattern Query - -## 1. Syntax Definition - -```SQL -MATCH_RECOGNIZE ( - [ PARTITION BY column [, ...] ] - [ ORDER BY column [, ...] ] - [ MEASURES measure_definition [, ...] ] - [ ROWS PER MATCH ] - [ AFTER MATCH skip_to ] - PATTERN ( row_pattern ) - [ SUBSET subset_definition [, ...] ] - DEFINE variable_definition [, ...] -) -``` - -**Note:** - -* PARTITION BY: Optional. Used to group the input table, and each group can perform pattern matching independently. If this clause is not specified, the entire input table will be processed as a single unit. -* ORDER BY: Optional. Used to ensure that input data is processed in a specific order during matching. -* MEASURES: Optional. Used to specify which information to extract from the matched segment of data. -* ROWS PER MATCH: Optional. Used to specify the output method of the result set after successful pattern matching. -* AFTER MATCH SKIP: Optional. Used to specify which row to resume from for the next pattern match after identifying a non-empty match. -* PATTERN: Used to define the row pattern to be matched. -* SUBSET: Optional. Used to merge rows matched by multiple basic pattern variables into a single logical set. -* DEFINE: Used to define the basic pattern variables for the row pattern. 
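To show how PATTERN and DEFINE from the clause list above interact, the Python sketch below emulates the row pattern `A B*`, which the segment queries in this manual rely on. It is a simplified model under stated assumptions, not IoTDB's matcher: `match_a_b_star` and the sample values are hypothetical, `A` has no DEFINE condition and therefore matches any row, matching is greedy, and the next match is assumed to start right after the last row of the previous match.

```python
def match_a_b_star(rows, b_condition):
    """Split ordered rows into consecutive matches of the pattern `A B*`.

    `b_condition(previous_row, row)` plays the role of the DEFINE clause
    for B, where `previous_row` corresponds to PREV(...).
    """
    matches = []
    for row in rows:
        if matches and b_condition(matches[-1][-1], row):
            matches[-1].append(row)     # row extends the match as another B
        else:
            matches.append([row])       # row starts a new match as A
    return matches


# Hypothetical timestamps in hours; B requires a gap of at most 24 hours.
hours = [0, 1, 27, 40, 70]
segments = match_a_b_star(hours, lambda prev, cur: cur - prev <= 24)

# One summary row per match, similar to MEASURES with first, last and count.
summary = [(seg[0], seg[-1], len(seg)) for seg in segments]
```

Each tuple in `summary` mirrors the kind of row a MEASURES clause with a first value, a last value and a count would produce for one match.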
- -For more detailed introductions to the features, please refer to: [Pattern Query](../User-Manual/Pattern-Query_timecho.md) - -## 2. Usage Examples - -Using [Sample Data](../Reference/Sample-Data.md) as the source data - -1. Time Segment Query - -Segment the data in table1 by time intervals less than or equal to 24 hours, and query the total number of data entries in each segment, as well as the start and end times. - -* Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table1 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (cast(B.time as INT64) - cast(PREV(B.time) as INT64)) <= 86400000 -) AS m -``` - -* Query Results - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:38:00.000+08:00| 2| -|2024-11-27T16:38:00.000+08:00|2024-11-30T14:30:00.000+08:00| 16| -+-----------------------------+-----------------------------+---+ -Total line number = 2 -``` - -2. Difference Segment Query - -Segment the data in table2 by humidity differences (current row minus previous row) less than or equal to 0.1, and query the total number of data entries in each segment, as well as the start and end times. 
- -* Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table2 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (B.humidity - PREV(B.humidity)) <= 0.1 -) AS m; -``` - -* Query Results - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-27T00:00:00.000+08:00| 2| -|2024-11-28T08:00:00.000+08:00|2024-11-29T00:00:00.000+08:00| 2| -|2024-11-29T11:00:00.000+08:00|2024-11-30T00:00:00.000+08:00| 2| -+-----------------------------+-----------------------------+---+ -Total line number = 3 -``` - -3. Event Statistics Query - -Group the data in table1 by device ID, and for each run of consecutive rows where the region is Shanghai and the humidity is greater than 35, return the start time, end time, and maximum humidity. - -* Query SQL - -```SQL -SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity -FROM table1 -MATCH_RECOGNIZE ( - PARTITION BY device_id - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RPR_FIRST(A.time) AS event_start, - RPR_LAST(A.time) AS event_end, - MAX(A.humidity) AS max_humidity - ONE ROW PER MATCH - PATTERN (A+) - DEFINE - A AS A.region = 'Shanghai' AND A.humidity > 35 -) AS m -``` - -* Query Results - -```SQL -+---------+-----+-----------------------------+-----------------------------+------------+ -|device_id|match| event_start| event_end|max_humidity| -+---------+-----+-----------------------------+-----------------------------+------------+ -| 100| 1|2024-11-28T09:00:00.000+08:00|2024-11-29T18:30:00.000+08:00| 45.1| -| 101| 1|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 35.2| -+---------+-----+-----------------------------+-----------------------------+------------+ -Total line number = 2 -``` diff --git 
a/src/UserGuide/Master/Table/SQL-Manual/SQL-Authority-Management_timecho.md b/src/UserGuide/Master/Table/SQL-Manual/SQL-Authority-Management_timecho.md deleted file mode 100644 index 3528a5f31..000000000 --- a/src/UserGuide/Master/Table/SQL-Manual/SQL-Authority-Management_timecho.md +++ /dev/null @@ -1,378 +0,0 @@ - - -# Authority Management - -This document is the SQL manual for authority management starting from version V2.0.7. For detailed function usage, see [Authority Management](../User-Manual/Authority-Management-Upgrade_timecho.md). For an introduction to authority management functions before version V2.0.7, refer to [Authority Management](../User-Manual/Authority-Management_timecho.md) - -## 1. Privilege List -
-| Privilege Type | Privilege Name | Scope of Effect | Description |
-| ----------------- | -------------- | --------------- | ----------- |
-| Global Privileges | SYSTEM | Global | Allows users to create, modify, and delete databases. Allows users to create, modify, and delete tables and table views. Allows users to create, delete, and view user-defined functions. Allows users to create, start, stop, delete, and view PIPEs. Allows users to create, delete, and view PIPEPLUGINS. Allows users to query and cancel queries. Allows users to view variables. Allows users to view cluster status. Allows users to create, delete, and view deep learning models. |
-| Global Privileges | SECURITY | Global | Allows users to create users. Allows users to delete users. Allows users to modify user passwords. Allows users to view user privilege information. Allows users to list all users. Allows users to create roles. Allows users to delete roles. Allows users to view role privilege information. Allows users to grant a role to a user or revoke it. Allows users to list all roles. |
-| Global Privileges | AUDIT | Global | Allows users to maintain audit log rules and view audit logs. |
-| Data Privileges | CREATE | ANY | Allows creating any table and any database. |
-| Data Privileges | CREATE | Database | Allows users to create tables under this database; allows users to create a database with this name. |
-| Data Privileges | CREATE | Table | Allows users to create a table with this name. |
-| Data Privileges | ALTER | ANY | Allows modifying the definition of any table and any database. |
-| Data Privileges | ALTER | Database | Allows users to modify the definition of a database and the definitions of tables under that database. |
-| Data Privileges | ALTER | Table | Allows users to modify the definition of a table. |
-| Data Privileges | SELECT | ANY | Allows querying data from any table in any database in the system. |
-| Data Privileges | SELECT | Database | Allows users to query data from any table in this database. |
-| Data Privileges | SELECT | Table | Allows users to query data in this table. When executing multi-table queries, the database only displays data that the user has permission to access. |
-| Data Privileges | INSERT | ANY | Allows inserting/updating data into any table in any database. |
-| Data Privileges | INSERT | Database | Allows users to insert/update data into any table within the scope of this database. |
-| Data Privileges | INSERT | Table | Allows users to insert/update data into this table. |
-| Data Privileges | DELETE | ANY | Allows deleting data from any table. |
-| Data Privileges | DELETE | Database | Allows users to delete data within the scope of this database. |
-| Data Privileges | DELETE | Table | Allows users to delete data from this table. |
- -## 2. SQL Statements - -### 2.1 User and Role Management - -1. Create User (Requires SECURITY privilege) - -```SQL -CREATE USER <USERNAME> <PASSWORD> -eg: CREATE USER user1 'Passwd@202604'; -``` - -2. Change Password - -Users can change their own passwords, but changing other users' passwords requires the SECURITY privilege. - -```SQL -ALTER USER <USERNAME> SET PASSWORD <PASSWORD> -eg: ALTER USER tempuser SET PASSWORD 'Newpwd@202604'; -``` - -3. Drop User (Requires SECURITY privilege) - -```SQL -DROP USER <USERNAME> -eg: DROP USER user1; -``` - -4. Create Role (Requires SECURITY privilege) - -```SQL -CREATE ROLE <ROLENAME> -eg: CREATE ROLE role1; -``` - -5. Drop Role (Requires SECURITY privilege) - -```SQL -DROP ROLE <ROLENAME> -eg: DROP ROLE role1; -``` - -6. Grant Role to User (Requires SECURITY privilege) - -```SQL -GRANT ROLE <ROLENAME> TO <USERNAME> -eg: GRANT ROLE admin TO user1; -``` - -7. Revoke Role from User (Requires SECURITY privilege) - -```SQL -REVOKE ROLE <ROLENAME> FROM <USERNAME> -eg: REVOKE ROLE admin FROM user1; -``` - -8. List All Users (Requires SECURITY privilege) - -```SQL -LIST USER; -``` - -9. List All Roles (Requires SECURITY privilege) - -```SQL -LIST ROLE; -``` - -10. List All Users Under a Specified Role (Requires SECURITY privilege) - -```SQL -LIST USER OF ROLE <ROLENAME> -eg: LIST USER OF ROLE roleuser; -``` - -11. List All Roles of a Specified User - -Users can list their own roles, but listing other users' roles requires the SECURITY privilege. - -```SQL -LIST ROLE OF USER <USERNAME> -eg: LIST ROLE OF USER tempuser; -``` - -12. List All Privileges of a User - -Users can list their own privilege information, but listing other users' privileges requires the SECURITY privilege. - -```SQL -LIST PRIVILEGES OF USER <USERNAME> -eg: LIST PRIVILEGES OF USER tempuser; -``` - -13. List All Privileges of a Role - -Users can list the privilege information of roles they possess, but listing other roles' privileges requires the SECURITY privilege. - -```SQL -LIST PRIVILEGES OF ROLE <ROLENAME> -eg: LIST PRIVILEGES OF ROLE actor; -``` - -### 2.2 Privilege Management - -#### 2.2.1 Grant Privileges - -1. 
Grant user management privileges to a user - -```SQL -GRANT SECURITY TO USER <USERNAME> -eg: GRANT SECURITY TO USER TEST_USER; -``` - -2. Grant a user the privilege to create databases and create tables within the database scope, and allow the user to manage privileges within that scope - -```SQL -GRANT CREATE ON DATABASE <DATABASE> TO USER <USERNAME> WITH GRANT OPTION -eg: GRANT CREATE ON DATABASE TESTDB TO USER TEST_USER WITH GRANT OPTION; -``` - -3. Grant a role the privilege to query a database - -```SQL -GRANT SELECT ON DATABASE <DATABASE> TO ROLE <ROLENAME> -eg: GRANT SELECT ON DATABASE TESTDB TO ROLE TEST_ROLE; -``` - -4. Grant a user the privilege to query a table - -```SQL -GRANT SELECT ON <DATABASE>.<TABLE> TO USER <USERNAME> -eg: GRANT SELECT ON TESTDB.TESTTABLE TO USER TEST_USER; -``` - -5. Grant a role the privilege to query all databases and tables - -```SQL -GRANT SELECT ON ANY TO ROLE <ROLENAME> -eg: GRANT SELECT ON ANY TO ROLE TEST_ROLE; -``` - -6. ALL Syntactic Sugar: ALL represents all privileges within the object scope. You can use the ALL keyword to flexibly grant privileges. - -```SQL -GRANT ALL TO USER TESTUSER; --- Grants all privileges available to the user, including global privileges and all data privileges in the ANY scope - -GRANT ALL ON ANY TO USER TESTUSER; --- Grants all data privileges available in the ANY scope to the user. After executing this statement, the user will have all data privileges on all databases - -GRANT ALL ON DATABASE TESTDB TO USER TESTUSER; --- Grants all data privileges available in the DB scope to the user. After executing this statement, the user will have all data privileges on this database - -GRANT ALL ON TABLE TESTTABLE TO USER TESTUSER; --- Grants all data privileges available in the TABLE scope to the user. After executing this statement, the user will have all data privileges on this table -``` - -#### 2.2.2 Revoke Privileges - -1. Revoke user management privileges from a user - -```SQL -REVOKE SECURITY FROM USER <USERNAME> -eg: REVOKE SECURITY FROM USER TEST_USER; -``` - -2. 
Revoke a user's privilege to create databases and create tables within the database scope - -```SQL -REVOKE CREATE ON DATABASE FROM USER -eg: REVOKE CREATE ON DATABASE TESTDB FROM USER TEST_USER; -``` - -3. Revoke a user's privilege to query a table - -```SQL -REVOKE SELECT ON . FROM USER -eg: REVOKE SELECT ON TESTDB.TESTTABLE FROM USER TEST_USER; -``` - -4. Revoke a user's privilege to query all databases and tables - -```SQL -REVOKE SELECT ON ANY FROM USER -eg: REVOKE SELECT ON ANY FROM USER TEST_USER; -``` - -5. ALL Syntax Sugar: ALL represents all privileges within the object scope. You can use the ALL field to flexibly revoke privileges. - -```SQL -REVOKE ALL FROM USER TESTUSER; --- Revokes all global privileges and all data privileges in the ANY scope from the user - -REVOKE ALL ON ANY FROM USER TESTUSER; --- Revokes all data privileges in the ANY scope from the user, and does not affect DB-scope and TABLE-scope privileges - -REVOKE ALL ON DATABASE TESTDB FROM USER TESTUSER; --- Revokes all data privileges on the DB from the user, and does not affect TABLE privileges - -REVOKE ALL ON TABLE TESTTABLE FROM USER TESTUSER; --- Revokes all data privileges on the TABLE from the user -``` - -#### 2.2.3 View User Privileges - -```SQL -LIST PRIVILEGES OF USER -eg: LIST PRIVILEGES OF USER tempuser; -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Table/SQL-Manual/SQL-Data-Addition-Deletion_timecho.md b/src/UserGuide/Master/Table/SQL-Manual/SQL-Data-Addition-Deletion_timecho.md deleted file mode 100644 index fc9593436..000000000 --- a/src/UserGuide/Master/Table/SQL-Manual/SQL-Data-Addition-Deletion_timecho.md +++ /dev/null @@ -1,172 +0,0 @@ - - -# Data Addition & Deletion - -## 1. Data Insertion - -**Syntax:** - -```SQL -INSERT INTO [(COLUMN_NAME[, COLUMN_NAME]*)]? 
VALUES (COLUMN_VALUE[, COLUMN_VALUE]*) -``` - -[Detailed syntax reference](../Basic-Concept/Write-Updata-Data_timecho.md#_1-1-syntax) - -**Example 1: Specified Columns Insertion** - -```SQL -INSERT INTO table1(region, plant_id, device_id, time, temperature, humidity) VALUES ('Hamburg', '1001', '100', '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, time, temperature) VALUES ('Hamburg', '1001', '100', '2025-11-26 13:38:00', 91.0); -``` - -**Example 2: NULL Value Insertion** - -Equivalent to the example above -```SQL -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('Hamburg', '1001', '100', null, null, '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('Hamburg', '1001', '100', null, null, '2025-11-26 13:38:00', 91.0, null); -``` - -**Example 3: Multi-row Insertion** - -```SQL -INSERT INTO table1 -VALUES -('2025-11-26 13:37:00', 'Frankfurt', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('2025-11-26 13:38:00', 'Frankfurt', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:38:25'); - -INSERT INTO table1 -(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) -VALUES -('Frankfurt', '1001', '100', 'A', '180', '2025-11-26 13:37:00', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('Frankfurt', '1001', '100', 'A', '180', '2025-11-26 13:38:00', 90.0, 35.1, true, '2025-11-26 13:38:25'); -``` - -**Example 4: Query Write-back** - -```SQL -insert into target_table select time,region,device_id,temperature from table1 where region = 'bj'; - -insert into target_table(time,device_id,temperature) table table3; - -insert into target_table (select t1.time, t1.region as region, t1.device_id as device_id, t1.temperature as temperature from table1 t1 where t1.time in (select t2.time from table2 t2 where t2.region = 
'shh')); -``` - - -## 2. Data Update - -**Syntax:** - -```SQL -UPDATE SET updateAssignment (',' updateAssignment)* (WHERE where=booleanExpression)? - -updateAssignment - : identifier EQ expression - ; -``` - -[Detailed syntax reference](../Basic-Concept/Write-Updata-Data_timecho.md#_2-1-syntax) - -**Example:** - -```SQL -update table1 set b = a where substring(a, 1, 1) like '%'; -``` - -## 3. Data Deletion - -**Syntax:** - -```SQL -DELETE FROM [WHERE_CLAUSE]? - -WHERE_CLAUSE: - WHERE DELETE_CONDITION - -DELETE_CONDITION: - SINGLE_CONDITION - | DELETE_CONDITION AND DELETE_CONDITION - | DELETE_CONDITION OR DELETE_CONDITION - -SINGLE_CONDITION: - TIME_CONDITION | ID_CONDITION - -TIME_CONDITION: - time TIME_OPERATOR LONG_LITERAL - -TIME_OPERATOR: - < | > | <= | >= | = - -ID_CONDITION: - identifier = STRING_LITERAL -``` - -**Example 1: Full Table Deletion** - -```SQL -DELETE FROM table1; -``` - -**Example 2: Time-range Deletion** - -Single time range -```SQL -DELETE FROM table1 WHERE time <= 2024-11-29 00:00:00; -``` -Multiple time ranges -```sql -DELETE FROM table1 WHERE time >= 2024-11-27 00:00:00 and time <= 2024-11-29 00:00:00; -``` - -**Example 3: Device-Specific Deletion** - -Delete data for a specific device -```SQL -DELETE FROM table1 -WHERE device_id='101' AND model_id = 'B'; -``` -Delete data for a device within a time range -```sql -DELETE FROM table1 -WHERE time >= '2024-11-27 16:39:00' AND time <= '2024-11-29 16:42:00' - AND device_id='101' AND model_id = 'B'; -``` -Delete data for a specific device model -```sql -DELETE FROM table1 WHERE model_id = 'B'; -``` - -## 4. Device Deletion - -**Syntax:** - -```SQL -DELETE DEVICES FROM tableName=qualifiedName (WHERE booleanExpression)? 
-``` - -**Example: Delete specified device and all associated data** - -```SQL -DELETE DEVICES FROM table1 WHERE device_id = '101'; -``` diff --git a/src/UserGuide/Master/Table/SQL-Manual/SQL-Data-Sync_timecho.md b/src/UserGuide/Master/Table/SQL-Manual/SQL-Data-Sync_timecho.md deleted file mode 100644 index 41eff7eeb..000000000 --- a/src/UserGuide/Master/Table/SQL-Manual/SQL-Data-Sync_timecho.md +++ /dev/null @@ -1,321 +0,0 @@ - - -# Data Sync - -This document mainly contains the SQL statements for the data synchronization function. For detailed function introduction and usage instructions, see [Data Sync](../User-Manual/Data-Sync_timecho.md) - -## 1. Create Task - -**Syntax:** - -```SQL -CREATE PIPE [IF NOT EXISTS] -- PipeId is the name that uniquely identifies the task --- Data extraction plugin, optional -WITH SOURCE ( - [ = ,], -) --- Data processing plugin, optional -WITH PROCESSOR ( - [ = ,], -) --- Data connection plugin, required -WITH SINK ( - [ = ,], -) -``` - -**Example 1: Full Data Synchronization** - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -**Example 2: Partial Data Synchronization** - -```SQL -create pipe A2B -WITH SOURCE ( - 'source'= 'iotdb-source', - 'mode.streaming' = 'true', - 'database-name'='db_b.*', - 'start-time' = '2023.08.23T08:00:00+00:00', - 'end-time' = '2023.10.23T08:00:00+00:00' -) -with SINK ( - 'sink'='iotdb-thrift-async-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -**Example 3: Bidirectional Data Transmission** - -* Execute the following statement on IoTDB A - -```SQL -create pipe AB -with source ( - 'source.mode.double-living' ='true' -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -* Execute the following statement on IoTDB B - -```SQL -create pipe BA -with source ( - 'source.mode.double-living' ='true' -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -) -``` - -**Example 4: 
Edge-Cloud Data Transmission** - -* Execute the following statement on IoTDB B to synchronize data from B to A - -```SQL -create pipe BA -with source ( - 'database-name'='db_b.*', - 'table-name'='.*', -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -) -``` - -* Execute the following statement on IoTDB C to synchronize data from C to A - -```SQL -create pipe CA -with source ( - 'database-name'='db_c.*', - 'table-name'='.*', -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -* Execute the following statement on IoTDB D to synchronize data from D to A - -```SQL -create pipe DA -with source ( - 'database-name'='db_d.*', - 'table-name'='.*', -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -) -``` - -**Example 5: Cascaded Data Transmission** - -* Execute the following statement on IoTDB A to synchronize data from A to B - -```SQL -create pipe AB -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -* Execute the following statement on IoTDB B to synchronize data from B to C - -```SQL -create pipe BC -with source ( -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -) -``` - -**Example 6: Cross-Gap Data Transmission** - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-air-gap-sink', - 'node-urls' = '10.53.53.53:9780', -) -``` - -**Example 7: Compressed Synchronization** - -```SQL -create pipe A2B -with sink ( - 'node-urls' = '127.0.0.1:6668', - 'compressor' = 'snappy,lz4', - 'rate-limit-bytes-per-second'='1048576' -) -``` - -**Example 8: Encrypted Synchronization** - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-ssl-sink', - 'node-urls'='127.0.0.1:6667', - 'ssl.trust-store-path'='pki/trusted', - 'ssl.trust-store-pwd'='root' -) -``` - -**Example 9: Local Export of Object Type Data** - -```SQL -CREATE PIPE tsfile_export_local -WITH SOURCE ( - 'source' = 'iotdb-source', - 'table-name' = 
'test_table' -) -WITH PROCESSOR ( - 'processor' = 'do-nothing-processor' -) -WITH SINK ( - 'sink' = 'tsfile-local-sink', - 'sink.local.target-path' = '/data/backup/export_2024', - 'sink.rate-limit-bytes-per-second' = '10485760' -); -``` - -**Example 10: Remote Transmission of Object Type Data** - -* This method requires pre-registration of the `tsfile_remote_sink` plugin - -```SQL -CREATE PIPE tsfile_export_scp -WITH SOURCE ( - 'source' = 'iotdb-source', - 'table-name' = 'test_table' -) -WITH PROCESSOR ( - 'processor' = 'do-nothing-processor' -) -WITH SINK ( - 'sink' = 'tsfile_remote_sink', - 'sink.file-mode' = 'scp', - 'sink.scp.host' = '192.168.1.100', - 'sink.scp.port' = '22', - 'sink.scp.user' = 'backup_user', - 'sink.scp.password' = 'ComplexPass123!', - 'sink.scp.remote-path' = '/remote/archive/', - 'sink.rate-limit-bytes-per-second' = '10485760' -); -``` - -## 2. Start Task - -**Syntax:** - -```SQL -START PIPE -``` - -**Example:** - -```SQL -START PIPE A2B -``` - -## 3. Stop Task - -**Syntax:** - -```SQL -STOP PIPE -``` - -**Example:** - -```SQL -STOP PIPE A2B -``` - -## 4. Drop Task - -**Syntax:** - -```SQL -DROP PIPE [IF EXISTS] -``` - -**Example:** - -```SQL -DROP PIPE IF EXISTS A2B -``` - -## 5. Show Tasks - -**Syntax:** - -```SQL --- Show all tasks -SHOW PIPES --- Show a specific task -SHOW PIPE -``` - -**Example:** - -```SQL -SHOW PIPES - -SHOW PIPE A2B -``` - -## 6. Alter Task - -**Syntax:** - -```SQL -ALTER PIPE [IF EXISTS] - MODIFY/REPLACE SOURCE(...) - MODIFY/REPLACE PROCESSOR(...) - MODIFY/REPLACE SINK(...) 
-``` - -**Example:** - -```SQL -ALTER PIPE A2B REPLACE SINK ('sink'='iotdb-thrift-sink', 'node-urls' = '127.0.0.1:6668'); -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Table/SQL-Manual/SQL-Maintenance-Statements_timecho.md b/src/UserGuide/Master/Table/SQL-Manual/SQL-Maintenance-Statements_timecho.md deleted file mode 100644 index 7b27761cc..000000000 --- a/src/UserGuide/Master/Table/SQL-Manual/SQL-Maintenance-Statements_timecho.md +++ /dev/null @@ -1,686 +0,0 @@ - - -# Management Statements - -## 1. Status Inspection - -### 1.1 View Current Tree/Table Mode - -**Syntax:** - -```SQL -showCurrentSqlDialectStatement - : SHOW CURRENT_SQL_DIALECT - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW CURRENT_SQL_DIALECT -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 1.2 View Current User - -**Syntax:** - -```SQL -showCurrentUserStatement - : SHOW CURRENT_USER - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW CURRENT_USER -+-----------+ -|CurrentUser| -+-----------+ -| root| -+-----------+ -``` - -### 1.3 View Connected Database - -**Syntax:** - -```SQL -showCurrentDatabaseStatement - : SHOW CURRENT_DATABASE - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW CURRENT_DATABASE; -+---------------+ -|CurrentDatabase| -+---------------+ -| null| -+---------------+ - -IoTDB> USE test; - -IoTDB> SHOW CURRENT_DATABASE; -+---------------+ -|CurrentDatabase| -+---------------+ -| test| -+---------------+ -``` - -### 1.4 View Cluster Version - -**Syntax:** - -```SQL -showVersionStatement - : SHOW VERSION - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW VERSION -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.1.2| 1ca4008| -+-------+---------+ -``` - -### 1.5 View Key Cluster Parameters - -**Syntax:** - -```SQL -showVariablesStatement - : SHOW VARIABLES - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW VARIABLES 
-+----------------------------------+-----------------------------------------------------------------+ -| Variable| Value| -+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - -### 1.6 View Cluster ID - -**Syntax:** - -```SQL -showClusterIdStatement - : SHOW (CLUSTERID | CLUSTER_ID) - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW CLUSTER_ID -+------------------------------------+ -| ClusterId| -+------------------------------------+ -|40163007-9ec1-4455-aa36-8055d740fcda| -+------------------------------------+ -``` - -### 1.7 View Server Time - -Shows the time of the DataNode server that the client is directly connected to. - -**Syntax:** - -```SQL -showCurrentTimestampStatement - : SHOW CURRENT_TIMESTAMP - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW CURRENT_TIMESTAMP -+-----------------------------+ -| CurrentTimestamp| -+-----------------------------+ -|2025-02-17T11:11:52.987+08:00| -+-----------------------------+ -``` - - -### 1.8 View Region Information - -**Description**: Displays information about the regions in the current cluster. 
- -**Syntax**: - -```SQL -showRegionsStatement - : SHOW REGIONS - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW REGIONS -``` - -**Result**: - -```SQL -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -|RegionId| Type| Status| Database|SeriesSlotNum|TimeSlotNum|DataNodeId|RpcAddress|RpcPort|InternalAddress| Role| CreateTime|TsFileSize| -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -| 6|SchemaRegion|Running|tcollector| 670| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.194| | -| 7| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.196| 169.85 KB| -| 8| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.198| 161.63 KB| -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -``` - -### 1.9 View Available Nodes - -**Description**: Returns the RPC addresses and ports of all available DataNodes in the current cluster. Note: A DataNode is considered "available" if it is not in the REMOVING state. - -> This feature is supported starting from v2.0.8. 
- -**Syntax**: - -```SQL -showAvailableUrlsStatement - : SHOW AVAILABLE URLS - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW AVAILABLE URLS -``` - -**Result**: - -```SQL -+----------+-------+ -|RpcAddress|RpcPort| -+----------+-------+ -| 0.0.0.0| 6667| -+----------+-------+ -``` - -### 1.10 View Service Information - -> Supported since V2.0.8.2 - -**Syntax**: - -```sql -showServicesStatement - : SHOW SERVICES - ; -``` - -**Example**: - -```sql -IoTDB> SHOW SERVICES -IoTDB> SHOW SERVICES ON 1 -``` - -**Result**: - -```sql -+--------------+-------------+---------+ -| Service Name | DataNode ID | State | -+--------------+-------------+---------+ -| MQTT | 1 | STOPPED | -| REST | 1 | RUNNING | -+--------------+-------------+---------+ -``` - -### 1.11 View Cluster Activation Status - -**Syntax**: - -```SQL -showActivationStatement - : SHOW ACTIVATION - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW ACTIVATION -``` - -**Result**: - -```SQL -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -## 2. Status Configuration - -### 2.1 Set Connection Tree/Table Mode - -**Syntax:** - -```SQL -SET SQL_DIALECT EQ (TABLE | TREE) -``` - -**Example:** - -```SQL -IoTDB> SET SQL_DIALECT=TABLE -IoTDB> SHOW CURRENT_SQL_DIALECT -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 2.2 Update Configuration Items - -**Syntax:** - -```SQL -setConfigurationStatement - : SET CONFIGURATION propertyAssignments (ON INTEGER_VALUE)? 
- ; - -propertyAssignments - : property (',' property)* - ; - -property - : identifier EQ propertyValue - ; - -propertyValue - : DEFAULT - | expression - ; -``` - -**Example:** - -```SQL -IoTDB> SET CONFIGURATION disk_space_warning_threshold='0.05',heartbeat_interval_in_ms='1000' ON 1; -``` - -### 2.3 Load Manually Modified Configuration - -**Syntax:** - -```SQL -loadConfigurationStatement - : LOAD CONFIGURATION localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Example:** - -```SQL -IoTDB> LOAD CONFIGURATION ON LOCAL; -``` - -### 2.4 Set System Status - -**Syntax:** - -```SQL -setSystemStatusStatement - : SET SYSTEM TO (READONLY | RUNNING) localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Example:** - -```SQL -IoTDB> SET SYSTEM TO READONLY ON CLUSTER; -``` - -## 3. Data Management - -### 3.1 Flush Memory Table to Disk - -**Syntax:** - -```SQL -flushStatement - : FLUSH identifier? (',' identifier)* booleanValue? localOrClusterMode? - ; - -booleanValue - : TRUE | FALSE - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Example:** - -```SQL -IoTDB> FLUSH test_db TRUE ON LOCAL; -``` - -## 4. Data Repair - -### 4.1 Start Background TsFile Repair - -**Syntax:** - -```SQL -startRepairDataStatement - : START REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Example:** - -```SQL -IoTDB> START REPAIR DATA ON CLUSTER; -``` - -### 4.2 Pause TsFile Repair - -**Syntax:** - -```SQL -stopRepairDataStatement - : STOP REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Example:** - -```SQL -IoTDB> STOP REPAIR DATA ON CLUSTER; -``` - -## 5. Query Operations - -### 5.1 View Active Queries - -**Syntax:** - -```SQL -showQueriesStatement - : SHOW (QUERIES | QUERY PROCESSLIST) - (WHERE where=booleanExpression)? - (ORDER BY sortItem (',' sortItem)*)? 
- limitOffsetClause - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW QUERIES WHERE elapsed_time > 30 -+-----------------------+-----------------------------+-----------+------------+------------+----+ -| query_id| start_time|datanode_id|elapsed_time| statement|user| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -|20250108_101015_00000_1|2025-01-08T18:10:15.935+08:00| 1| 32.283|show queries|root| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -``` - -### 5.2 Terminate Queries - -**Syntax:** - -```SQL -killQueryStatement - : KILL (QUERY queryId=string | ALL QUERIES) - ; -``` - -**Example:** - -```SQL -IoTDB> KILL QUERY 20250108_101015_00000_1; -- terminate a specific query -IoTDB> KILL ALL QUERIES; -- terminate all queries -``` - -### 5.3 Query Performance Analysis - -#### 5.3.1 View Execution Plan - -**Syntax:** - -```SQL -EXPLAIN -``` - -Detailed syntax reference: [EXPLAIN STATEMENT](../User-Manual/Query-Performance-Analysis.md#_1-explain-statement) - -**Example:** - -```SQL -IoTDB> explain select * from t1 -+-----------------------------------------------------------------------------------------------+ -| distribution plan| -+-----------------------------------------------------------------------------------------------+ -| ┌─────────────────────────────────────────────┐ | -| │OutputNode-4 │ | -| │OutputColumns-[time, device_id, type, speed] │ | -| │OutputSymbols: [time, device_id, type, speed]│ | -| └─────────────────────────────────────────────┘ | -| │ | -| │ | -| ┌─────────────────────────────────────────────┐ | -| │Collect-21 │ | -| │OutputSymbols: [time, device_id, type, speed]│ | -| └─────────────────────────────────────────────┘ | -| ┌───────────────────────┴───────────────────────┐ | -| │ │ | -|┌─────────────────────────────────────────────┐ ┌───────────┐ | -|│TableScan-19 │ │Exchange-28│ | -|│QualifiedTableName: test.t1 │ └───────────┘ | -|│OutputSymbols: 
[time, device_id, type, speed]│ │ | -|│DeviceNumber: 1 │ │ | -|│ScanOrder: ASC │ ┌─────────────────────────────────────────────┐| -|│PushDownOffset: 0 │ │TableScan-20 │| -|│PushDownLimit: 0 │ │QualifiedTableName: test.t1 │| -|│PushDownLimitToEachDevice: false │ │OutputSymbols: [time, device_id, type, speed]│| -|│RegionId: 2 │ │DeviceNumber: 1 │| -|└─────────────────────────────────────────────┘ │ScanOrder: ASC │| -| │PushDownOffset: 0 │| -| │PushDownLimit: 0 │| -| │PushDownLimitToEachDevice: false │| -| │RegionId: 1 │| -| └─────────────────────────────────────────────┘| -+-----------------------------------------------------------------------------------------------+ -``` - -#### 5.3.2 Analyze Query Performance - -**Syntax:** - -```SQL -EXPLAIN ANALYZE [VERBOSE] -``` - -Detailed syntax reference: [EXPLAIN ANALYZE STATEMENT](../User-Manual/Query-Performance-Analysis.md#_2-explain-analyze-statement) - -**Example:** - -```SQL -IoTDB> explain analyze verbose select * from t1 -+-----------------------------------------------------------------------------------------------+ -| Explain Analyze| -+-----------------------------------------------------------------------------------------------+ -|Analyze Cost: 38.860 ms | -|Fetch Partition Cost: 9.888 ms | -|Fetch Schema Cost: 54.046 ms | -|Logical Plan Cost: 10.102 ms | -|Logical Optimization Cost: 17.396 ms | -|Distribution Plan Cost: 2.508 ms | -|Dispatch Cost: 22.126 ms | -|Fragment Instances Count: 2 | -| | -|FRAGMENT-INSTANCE[Id: 20241127_090849_00009_1.2.0][IP: 0.0.0.0][DataRegion: 2][State: FINISHED]| -| Total Wall Time: 18 ms | -| Cost of initDataQuerySource: 6.153 ms | -| Seq File(unclosed): 1, Seq File(closed): 0 | -| UnSeq File(unclosed): 0, UnSeq File(closed): 0 | -| ready queued time: 0.164 ms, blocked queued time: 0.342 ms | -| Query Statistics: | -| loadBloomFilterFromCacheCount: 0 | -| loadBloomFilterFromDiskCount: 0 | -| loadBloomFilterActualIOSize: 0 | -| loadBloomFilterTime: 0.000 | -| 
loadTimeSeriesMetadataAlignedMemSeqCount: 1 | -| loadTimeSeriesMetadataAlignedMemSeqTime: 0.246 | -| loadTimeSeriesMetadataFromCacheCount: 0 | -| loadTimeSeriesMetadataFromDiskCount: 0 | -| loadTimeSeriesMetadataActualIOSize: 0 | -| constructAlignedChunkReadersMemCount: 1 | -| constructAlignedChunkReadersMemTime: 0.294 | -| loadChunkFromCacheCount: 0 | -| loadChunkFromDiskCount: 0 | -| loadChunkActualIOSize: 0 | -| pageReadersDecodeAlignedMemCount: 1 | -| pageReadersDecodeAlignedMemTime: 0.047 | -| [PlanNodeId 43]: IdentitySinkNode(IdentitySinkOperator) | -| CPU Time: 5.523 ms | -| output: 2 rows | -| HasNext() Called Count: 6 | -| Next() Called Count: 5 | -| Estimated Memory Size: : 327680 | -| [PlanNodeId 31]: CollectNode(CollectOperator) | -| CPU Time: 5.512 ms | -| output: 2 rows | -| HasNext() Called Count: 6 | -| Next() Called Count: 5 | -| Estimated Memory Size: : 327680 | -| [PlanNodeId 29]: TableScanNode(TableScanOperator) | -| CPU Time: 5.439 ms | -| output: 1 rows | -| HasNext() Called Count: 3 -| Next() Called Count: 2 | -| Estimated Memory Size: : 327680 | -| DeviceNumber: 1 | -| CurrentDeviceIndex: 0 | -| [PlanNodeId 40]: ExchangeNode(ExchangeOperator) | -| CPU Time: 0.053 ms | -| output: 1 rows | -| HasNext() Called Count: 2 | -| Next() Called Count: 1 | -| Estimated Memory Size: : 131072 | -| | -|FRAGMENT-INSTANCE[Id: 20241127_090849_00009_1.3.0][IP: 0.0.0.0][DataRegion: 1][State: FINISHED]| -| Total Wall Time: 13 ms | -| Cost of initDataQuerySource: 5.725 ms | -| Seq File(unclosed): 1, Seq File(closed): 0 | -| UnSeq File(unclosed): 0, UnSeq File(closed): 0 | -| ready queued time: 0.118 ms, blocked queued time: 5.844 ms | -| Query Statistics: | -| loadBloomFilterFromCacheCount: 0 | -| loadBloomFilterFromDiskCount: 0 | -| loadBloomFilterActualIOSize: 0 | -| loadBloomFilterTime: 0.000 | -| loadTimeSeriesMetadataAlignedMemSeqCount: 1 | -| loadTimeSeriesMetadataAlignedMemSeqTime: 0.004 | -| loadTimeSeriesMetadataFromCacheCount: 0 | -| 
loadTimeSeriesMetadataFromDiskCount: 0 | -| loadTimeSeriesMetadataActualIOSize: 0 | -| constructAlignedChunkReadersMemCount: 1 | -| constructAlignedChunkReadersMemTime: 0.001 | -| loadChunkFromCacheCount: 0 | -| loadChunkFromDiskCount: 0 | -| loadChunkActualIOSize: 0 | -| pageReadersDecodeAlignedMemCount: 1 | -| pageReadersDecodeAlignedMemTime: 0.007 | -| [PlanNodeId 42]: IdentitySinkNode(IdentitySinkOperator) | -| CPU Time: 0.270 ms | -| output: 1 rows | -| HasNext() Called Count: 3 | -| Next() Called Count: 2 | -| Estimated Memory Size: : 327680 | -| [PlanNodeId 30]: TableScanNode(TableScanOperator) | -| CPU Time: 0.250 ms | -| output: 1 rows | -| HasNext() Called Count: 3 | -| Next() Called Count: 2 | -| Estimated Memory Size: : 327680 | -| DeviceNumber: 1 | -| CurrentDeviceIndex: 0 | -+-----------------------------------------------------------------------------------------------+ -``` diff --git a/src/UserGuide/Master/Table/SQL-Manual/SQL-Metadata-Operations_timecho.md b/src/UserGuide/Master/Table/SQL-Manual/SQL-Metadata-Operations_timecho.md deleted file mode 100644 index 7b6107b4e..000000000 --- a/src/UserGuide/Master/Table/SQL-Manual/SQL-Metadata-Operations_timecho.md +++ /dev/null @@ -1,385 +0,0 @@ - - -# Metadata Operations - -## 1. Database Management - -### 1.1 Create Database - -**Syntax:** - -```SQL -CREATE DATABASE (IF NOT EXISTS)? (WITH properties)? 
-``` - -[Detailed syntax reference](../Basic-Concept/Database-Management_timecho.md#_1-1-create-a-database) - -**Examples:** - -```SQL -CREATE DATABASE database1; -CREATE DATABASE IF NOT EXISTS database1; - --- Create database with 1-year TTL; -CREATE DATABASE IF NOT EXISTS database1 with(TTL=31536000000); -``` - -### 1.2 Use Database - -**Syntax:** - -```SQL -USE -``` - -**Examples:** - -```SQL -USE database1; -``` - -### 1.3 View Current Database - -**Syntax:** - -```SQL -SHOW CURRENT_DATABASE; -``` - -**Examples:** - -```SQL -SHOW CURRENT_DATABASE; -``` -```shell -+---------------+ -|CurrentDatabase| -+---------------+ -| null| -+---------------+ -``` -```sql -USE database1; -SHOW CURRENT_DATABASE; -``` -```shell -+---------------+ -|CurrentDatabase| -+---------------+ -| database1| -+---------------+ -``` - -### 1.4 List All Databases - -**Syntax:** - -```SQL -SHOW DATABASES (DETAILS)? -``` - -[Detailed syntax reference](../Basic-Concept/Database-Management_timecho.md#_1-4-view-all-databases) - -**Examples:** - -```SQL -show databases; -``` -```shell -+------------------+-------+-----------------------+---------------------+---------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval| -+------------------+-------+-----------------------+---------------------+---------------------+ -| database1| INF| 1| 1| 604800000| -|information_schema| INF| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+ -``` -```sql -show databases details; -``` -```shell -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|DataRegionGroupNum| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| 
database1| INF| 1| 1| 604800000| 1| 2| -|information_schema| INF| null| null| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -``` - -### 1.5 Modify Database - -**Syntax:** - -```SQL -ALTER DATABASE (IF EXISTS)? database=identifier SET PROPERTIES propertyAssignments -``` - -**Examples:** - -```SQL -ALTER DATABASE database1 SET PROPERTIES TTL=31536000000; -``` - -### 1.6 Drop Database - -**Syntax:** - -```SQL -DROP DATABASE (IF EXISTS)? -``` - -**Examples:** - -```SQL -DROP DATABASE IF EXISTS database1; -``` - -## 2. Table Management - -### 2.1 Create Table - -**Syntax:** - -```SQL -createTableStatement - : CREATE TABLE (IF NOT EXISTS)? qualifiedName - '(' (columnDefinition (',' columnDefinition)*)? ')' - charsetDesc? - comment? - (WITH properties)? - ; - -charsetDesc - : DEFAULT? (CHAR SET | CHARSET | CHARACTER SET) EQ? identifierOrString - ; - -columnDefinition - : identifier columnCategory=(TAG | ATTRIBUTE | TIME) charsetName? comment? - | identifier type (columnCategory=(TAG | ATTRIBUTE | TIME | FIELD))? charsetName? comment? 
- ; - -charsetName - : CHAR SET identifier - | CHARSET identifier - | CHARACTER SET identifier - ; - -comment - : COMMENT string - ; -``` - -[Detailed syntax reference](../Basic-Concept/Table-Management_timecho.md#_1-1-create-a-table) - -**Examples:** - -```SQL -CREATE TABLE table1 ( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE COMMENT 'maintenance', - temperature FLOAT FIELD COMMENT 'temperature', - humidity FLOAT FIELD COMMENT 'humidity', - status BOOLEAN FIELD COMMENT 'status', - arrival_time TIMESTAMP FIELD COMMENT 'arrival_time' -) COMMENT 'table1' WITH (TTL=31536000000); - -CREATE TABLE if not exists tableB (); - -CREATE TABLE tableC ( - "Site" STRING TAG, - "Temperature" int32 FIELD COMMENT 'temperature' - ) with (TTL=DEFAULT); - ``` - -Custom time column: named time_user_defined, located in the second column of the table. (Supported since V2.0.8.2) - ```sql - CREATE TABLE table1 ( - region STRING TAG, - time_user_defined TIMESTAMP TIME, - temperature FLOAT FIELD - ); -``` - -Note: If your terminal does not support multi-line paste (e.g., Windows CMD), please reformat the SQL statement into a single line before execution. - - -### 2.2 List Tables - -**Syntax:** - -```SQL -SHOW TABLES (DETAILS)? ((FROM | IN) database_name)? -``` - -**Examples:** - -```SQL -show tables from database1; -``` -```shell -+---------+---------------+ -|TableName| TTL(ms)| -+---------+---------------+ -| table1| 31536000000| -+---------+---------------+ -``` -```sql -show tables details from database1; -``` -```shell -+---------------+-----------+------+-------+ -| TableName| TTL(ms)|Status|Comment| -+---------------+-----------+------+-------+ -| table1|31536000000| USING| table1| -+---------------+-----------+------+-------+ -``` - -### 2.3 Describe Table Columns - -**Syntax:** - -```SQL -(DESC | DESCRIBE) (DETAILS)? 
-``` - -**Examples:** - -```SQL -desc table1; -``` -```shell -+------------+---------+---------+ -| ColumnName| DataType| Category| -+------------+---------+---------+ -| time|TIMESTAMP| TIME| -| region| STRING| TAG| -| plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model_id| STRING|ATTRIBUTE| -| maintenance| STRING|ATTRIBUTE| -| temperature| FLOAT| FIELD| -| humidity| FLOAT| FIELD| -| status| BOOLEAN| FIELD| -|arrival_time|TIMESTAMP| FIELD| -+------------+---------+---------+ -``` -```sql -desc table1 details; -``` -```shell -+------------+---------+---------+------+------------+ -| ColumnName| DataType| Category|Status| Comment| -+------------+---------+---------+------+------------+ -| time|TIMESTAMP| TIME| USING| null| -| region| STRING| TAG| USING| null| -| plant_id| STRING| TAG| USING| null| -| device_id| STRING| TAG| USING| null| -| model_id| STRING|ATTRIBUTE| USING| null| -| maintenance| STRING|ATTRIBUTE| USING| maintenance| -| temperature| FLOAT| FIELD| USING| temperature| -| humidity| FLOAT| FIELD| USING| humidity| -| status| BOOLEAN| FIELD| USING| status| -|arrival_time|TIMESTAMP| FIELD| USING|arrival_time| -+------------+---------+---------+------+------------+ -``` - - -### 2.4 View Table Creation Statement - -**Syntax:** - -```SQL -SHOW CREATE TABLE -``` - -**Examples:** - -```SQL -show create table table1; -``` -```shell -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| Table| Create Table| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -|table1|CREATE TABLE "table1" ("region" 
STRING TAG,"plant_id" STRING TAG,"device_id" STRING TAG,"model_id" STRING ATTRIBUTE,"maintenance" STRING ATTRIBUTE,"temperature" FLOAT FIELD,"humidity" FLOAT FIELD,"status" BOOLEAN FIELD,"arrival_time" TIMESTAMP FIELD) WITH (ttl=31536000000)| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -``` - - -### 2.5 Modify Table - -**Syntax:** - -```SQL -#addColumn; -ALTER TABLE (IF EXISTS)? tableName=qualifiedName ADD COLUMN (IF NOT EXISTS)? column=columnDefinition COMMENT 'column_comment'; -#dropColumn; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName DROP COLUMN (IF EXISTS)? column=identifier; -#setTableProperties; -// set TTL can use this; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName SET PROPERTIES propertyAssignments; -| COMMENT ON TABLE tableName=qualifiedName IS 'table_comment'; -| COMMENT ON COLUMN tableName.column IS 'column_comment'; -#changeColumndatatype; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName ALTER COLUMN (IF EXISTS)? column=identifier SET DATA TYPE new_type=type; -``` - -**Examples:** - -add column -```SQL -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS a TAG COMMENT 'a'; -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS b FLOAT FIELD COMMENT 'b'; -``` -set TTL -```SQL -ALTER TABLE table1 set properties TTL=3600; -``` -set comment -```SQL -COMMENT ON TABLE table1 IS 'table1'; -COMMENT ON COLUMN table1.a IS null; -``` -alter column datatype -```SQL -ALTER TABLE table1 ALTER COLUMN IF EXISTS b SET DATA TYPE DOUBLE; -``` - -### 2.6 Drop Table - -**Syntax:** - -```SQL -DROP TABLE (IF EXISTS)? 
-``` - -**Examples:** - -```SQL -DROP TABLE table1; -DROP TABLE database1.table1; -``` - - diff --git a/src/UserGuide/Master/Table/SQL-Manual/Select-Clause_timecho.md b/src/UserGuide/Master/Table/SQL-Manual/Select-Clause_timecho.md deleted file mode 100644 index a489f8a90..000000000 --- a/src/UserGuide/Master/Table/SQL-Manual/Select-Clause_timecho.md +++ /dev/null @@ -1,491 +0,0 @@ - - -# SELECT Clauses - -**SELECT Clause** specifies the columns included in the query results. - -## 1. Syntax Overview - -```sql -SELECT setQuantifier? selectItem (',' selectItem)* - -selectItem - : expression (AS? identifier)? #selectSingle - | tableName '.' ASTERISK (AS columnAliases)? #selectAll - | ASTERISK #selectAll - ; -setQuantifier - : DISTINCT - | ALL - ; -``` - -- It supports aggregate functions (e.g., `SUM`, `AVG`, `COUNT`) and window functions, logically executed last in the query process. -- DISTINCT Keyword: `SELECT DISTINCT column_name` ensures that the values in the query results are unique, removing duplicates. -- COLUMNS Function: The COLUMNS function is supported in the SELECT clause for column filtering. It can be combined with expressions, allowing the expression's logic to apply to all columns selected by the function. - -## 2. Detailed Syntax: - -Each `selectItem` can take one of the following forms: - -1. **Expression**: `expression [[AS] column_alias]` defines a single output column and optionally assigns an alias. -2. **All Columns from a Relation**: `relation.*` selects all columns from a specified relation. Column aliases are not allowed in this case. -3. **All Columns in the Result Set**: `*` selects all columns returned by the query. Column aliases are not allowed. - -Usage scenarios for DISTINCT: - -1. **SELECT Statement**: Use DISTINCT in the SELECT statement to remove duplicate items from the query results. - -2. **Aggregate Functions**: When used with aggregate functions, DISTINCT only processes non-duplicate rows in the input dataset. - -3. 
**GROUP BY Clause**: Use ALL and DISTINCT quantifiers in the GROUP BY clause to determine whether each duplicate grouping set produces distinct output rows. - -`COLUMNS` Function: - -1. **`COLUMNS(*)`**: Matches all columns and supports combining with expressions. -2. **`COLUMNS(regexStr) ? AS identifier`**: Regular expression matching - - Selects columns whose names match the specified regular expression `(regexStr)` and supports combining with expressions. - - Allows renaming columns by referencing groups captured by the regular expression. If `AS` is omitted, the original column name is displayed in the format `_coln_original_name` (where `n` is the column’s position in the result table). - - Renaming Syntax: - - Use parentheses () in regexStr to define capture groups. - - Reference captured groups in identifier using `'$index'`. - - Note: The identifier must be enclosed in double quotes if it contains special characters like `$`. - -## 3. Example Data - - -The [Example Data page](../Reference/Sample-Data.md) provides SQL statements to construct table schemas and insert data. By downloading and executing these statements in the IoTDB CLI, you can import the data into IoTDB. This data can be used to test and run the example SQL queries included in this documentation, allowing you to reproduce the described results. - -### 3.1 Selection List - -#### 3.1.1 Star Expression - -The asterisk (`*`) selects all columns in a table. Note that it cannot be used with most functions, except for cases like `COUNT(*)`. - -**Example**: Selecting all columns from a table. 
- - -```sql -SELECT * FROM table1; -``` - -Results: - -```sql -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| modifytime| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-29T18:30:00.000+08:00| 上海| 3002| 100| E| 180| 90.0| 35.4| true|2024-11-29T18:30:15.000+08:00| -|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| null| null|2024-11-28T08:00:09.000+08:00| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T10:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| 35.2| null|2024-11-28T10:00:11.000+08:00| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| -|2024-11-26T13:38:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:38:25.000+08:00| -|2024-11-30T09:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 35.2| true| null| -|2024-11-30T14:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 34.8| true|2024-11-30T14:30:17.000+08:00| -|2024-11-29T10:00:00.000+08:00| 上海| 3001| 101| D| 360| 85.0| null| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T16:38:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.1| true|2024-11-26T16:37:01.000+08:00| -|2024-11-27T16:39:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| 35.3| null| null| -|2024-11-27T16:40:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:03.000+08:00| -|2024-11-27T16:41:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:04.000+08:00| -|2024-11-27T16:42:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.2| false| null| 
-|2024-11-27T16:43:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false| null| -|2024-11-27T16:44:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false|2024-11-26T16:37:08.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -Total line number = 18 -It costs 0.653s -``` - -#### 3.1.2 Aggregate Functions - -Aggregate functions summarize multiple rows into a single value. When aggregate functions are present in the `SELECT` clause, the query is treated as an **aggregate query**. All expressions in the query must either be part of an aggregate function or specified in the [GROUP BY clause](../SQL-Manual/GroupBy-Clause.md). - -**Example 1**: Total number of rows in a table. - -```sql -SELECT count(*) FROM table1; -``` - -Results: - -```sql -+-----+ -|_col0| -+-----+ -| 18| -+-----+ -Total line number = 1 -It costs 0.091s -``` - -**Example 2**: Total rows grouped by region. - -```sql -SELECT region, count(*) - FROM table1 - GROUP BY region; -``` - -Results: - -```sql -+------+-----+ -|region|_col1| -+------+-----+ -| 上海| 9| -| 北京| 9| -+------+-----+ -Total line number = 2 -It costs 0.071s -``` - -#### 3.1.3 Aliases - -The `AS` keyword assigns an alias to selected columns, improving readability by overriding existing column names. - -**Example 1**: Original table. 
- - -```sql -IoTDB> SELECT * FROM table1; -``` - -Results: - -```sql -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| modifytime| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-29T18:30:00.000+08:00| 上海| 3002| 100| E| 180| 90.0| 35.4| true|2024-11-29T18:30:15.000+08:00| -|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| null| null|2024-11-28T08:00:09.000+08:00| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T10:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| 35.2| null|2024-11-28T10:00:11.000+08:00| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| -|2024-11-26T13:38:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:38:25.000+08:00| -|2024-11-30T09:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 35.2| true| null| -|2024-11-30T14:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 34.8| true|2024-11-30T14:30:17.000+08:00| -|2024-11-29T10:00:00.000+08:00| 上海| 3001| 101| D| 360| 85.0| null| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T16:38:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.1| true|2024-11-26T16:37:01.000+08:00| -|2024-11-27T16:39:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| 35.3| null| null| -|2024-11-27T16:40:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:03.000+08:00| -|2024-11-27T16:41:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:04.000+08:00| -|2024-11-27T16:42:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.2| false| null| 
-|2024-11-27T16:43:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false| null| -|2024-11-27T16:44:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false|2024-11-26T16:37:08.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -Total line number = 18 -It costs 0.653s -``` - -**Example 2**: Assigning an alias to a single column. - -```sql -IoTDB> SELECT device_id - AS device - FROM table1; -``` - -Results: - -```sql -+------+ -|device| -+------+ -| 100| -| 100| -| 100| -| 100| -| 100| -| 100| -| 100| -| 100| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -+------+ -Total line number = 18 -It costs 0.053s -``` - -**Example 3:** Assigning aliases to all columns. - -```sql -IoTDB> SELECT table1.* - AS (timestamp, Reg, Pl, DevID, Mod, Mnt, Temp, Hum, Stat,MTime) - FROM table1; -``` - -Results: - -```sql -+-----------------------------+----+----+-----+---+---+----+----+-----+-----------------------------+ -| TIMESTAMP| REG| PL|DEVID|MOD|MNT|TEMP| HUM| STAT| MTIME| -+-----------------------------+----+----+-----+---+---+----+----+-----+-----------------------------+ -|2024-11-29T11:00:00.000+08:00|上海|3002| 100| E|180|null|45.1| true| null| -|2024-11-29T18:30:00.000+08:00|上海|3002| 100| E|180|90.0|35.4| true|2024-11-29T18:30:15.000+08:00| -|2024-11-28T08:00:00.000+08:00|上海|3001| 100| C| 90|85.0|null| null|2024-11-28T08:00:09.000+08:00| -|2024-11-28T09:00:00.000+08:00|上海|3001| 100| C| 90|null|40.9| true| null| -|2024-11-28T10:00:00.000+08:00|上海|3001| 100| C| 90|85.0|35.2| null|2024-11-28T10:00:11.000+08:00| -|2024-11-28T11:00:00.000+08:00|上海|3001| 100| C| 90|88.0|45.1| true|2024-11-28T11:00:12.000+08:00| -|2024-11-26T13:37:00.000+08:00|北京|1001| 100| A|180|90.0|35.1| true|2024-11-26T13:37:34.000+08:00| -|2024-11-26T13:38:00.000+08:00|北京|1001| 100| A|180|90.0|35.1| true|2024-11-26T13:38:25.000+08:00| -|2024-11-30T09:30:00.000+08:00|上海|3002| 101| 
F|360|90.0|35.2| true| null| -|2024-11-30T14:30:00.000+08:00|上海|3002| 101| F|360|90.0|34.8| true|2024-11-30T14:30:17.000+08:00| -|2024-11-29T10:00:00.000+08:00|上海|3001| 101| D|360|85.0|null| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T16:38:00.000+08:00|北京|1001| 101| B|180|null|35.1| true|2024-11-26T16:37:01.000+08:00| -|2024-11-27T16:39:00.000+08:00|北京|1001| 101| B|180|85.0|35.3| null| null| -|2024-11-27T16:40:00.000+08:00|北京|1001| 101| B|180|85.0|null| null|2024-11-26T16:37:03.000+08:00| -|2024-11-27T16:41:00.000+08:00|北京|1001| 101| B|180|85.0|null| null|2024-11-26T16:37:04.000+08:00| -|2024-11-27T16:42:00.000+08:00|北京|1001| 101| B|180|null|35.2|false| null| -|2024-11-27T16:43:00.000+08:00|北京|1001| 101| B|180|null|null|false| null| -|2024-11-27T16:44:00.000+08:00|北京|1001| 101| B|180|null|null|false|2024-11-26T16:37:08.000+08:00| -+-----------------------------+----+----+-----+---+---+----+----+-----+-----------------------------+ -Total line number = 18 -It costs 0.189s -``` - -#### 3.1.4 Object Type Query - -> Supported since V2.0.8 - -**Example 1: Directly querying Object type data** - -```sql -IoTDB:database1> SELECT s1 FROM table1 WHERE device_id = 'tag1'; -``` - -Results: - -```sql -+------------+ -| s1| -+------------+ -|(Object) 5 B| -+------------+ -Total line number = 1 -It costs 0.428s -``` - -**Example 2: Retrieving raw content of Object type data using `read_object` function** - -```sql -IoTDB:database1> SELECT read_object(s1) FROM table1 WHERE device_id = 'tag1' -``` - -Results: - -```sql -+------------+ -| _col0| -+------------+ -|0x696f746462| -+------------+ -Total line number = 1 -It costs 0.188s -``` - - -### 3.2 Columns Function - -1. 
Without combining expressions - -Query data from columns whose names start with 'm' -```sql -IoTDB:database1> select columns('^m.*') from table1 limit 5; -``` - -Results: - -```sql -+--------+-----------+ -|model_id|maintenance| -+--------+-----------+ -| E| 180| -| E| 180| -| C| 90| -| C| 90| -| C| 90| -+--------+-----------+ -``` - -Query columns whose names start with 'o' - throw an exception if no columns match -```sql -IoTDB:database1> select columns('^o.*') from table1 limit 5; -``` - -Results: - -```sql -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: No matching columns found that match regex '^o.*' -``` - -Query data from columns whose names start with 'm' and rename them with 'series_' prefix -```sql -IoTDB:database1> select columns('^m(.*)') AS "series_$0" from table1 limit 5; -``` - -Results: - -```sql -+---------------+------------------+ -|series_model_id|series_maintenance| -+---------------+------------------+ -| E| 180| -| E| 180| -| C| 90| -| C| 90| -| C| 90| -+---------------+------------------+ -``` - -2. 
With Expression Combination - -- Single COLUMNS Function - -Query the minimum value of all columns -```sql -IoTDB:database1> select min(columns(*)) from table1 -``` - -Results: - -```sql -+-----------------------------+------------+--------------+---------------+--------------+-----------------+-----------------+--------------+------------+-----------------------------+ -| _col0_time|_col1_region|_col2_plant_id|_col3_device_id|_col4_model_id|_col5_maintenance|_col6_temperature|_col7_humidity|_col8_status| _col9_arrival_time| -+-----------------------------+------------+--------------+---------------+--------------+-----------------+-----------------+--------------+------------+-----------------------------+ -|2024-11-26T13:37:00.000+08:00| 上海| 1001| 100| A| 180| 85.0| 34.8| false|2024-11-26T13:37:34.000+08:00| -+-----------------------------+------------+--------------+---------------+--------------+-----------------+-----------------+--------------+------------+-----------------------------+ -``` - -- Multiple COLUMNS Functions in Same Expression - -> Usage Restriction: When multiple COLUMNS functions appear in the same expression, their parameters must be identical. 
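
One way to picture this restriction is as a per-column expansion: the expression is copied once for each matched column, so every `COLUMNS` call inside it has to resolve to the same column in each copy. The Python sketch below is only a model of that reading, not IoTDB's implementation. The function name `expand_expression`, the `{col}` placeholder, and the use of `re.search` for matching are assumptions made for illustration.

```python
import re

# Column names of the sample table as listed by `desc table1` on this page.
TABLE1_COLUMNS = [
    "time", "region", "plant_id", "device_id", "model_id",
    "maintenance", "temperature", "humidity", "status", "arrival_time",
]


def expand_expression(columns, patterns, template):
    """Model one expression that contains one or more COLUMNS(...) calls.

    `patterns` holds the regex of each COLUMNS call in the expression and
    `template` is the expression with `{col}` where each call appears.
    This helper is hypothetical and only mirrors the documented behaviour.
    """
    # All COLUMNS calls in a single expression must share the same parameter.
    if len(set(patterns)) > 1:
        raise ValueError(
            "Multiple different COLUMNS in the same expression are not supported"
        )
    regex = re.compile(patterns[0])
    matched = [name for name in columns if regex.search(name)]
    if not matched:
        raise ValueError(f"No matching columns found that match regex '{patterns[0]}'")
    # One copy of the expression per matched column.
    return [template.format(col=name) for name in matched]


same = expand_expression(TABLE1_COLUMNS, ["^h.*", "^h.*"], "min({col}) + max({col})")
print(same)
```

Under these assumptions, passing `["^h.*", "^t.*"]` raises the error instead of guessing which column each call should bind to.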
- -Query the sum of minimum and maximum values for columns starting with 'h' -```sql -IoTDB:database1> select min(columns('^h.*')) + max(columns('^h.*')) from table1 -``` - -Results: - -```sql -+--------------+ -|_col0_humidity| -+--------------+ -| 79.899994| -+--------------+ -``` - -Error Case: Non-Identical COLUMNS Functions -```sql -IoTDB:database1> select min(columns('^h.*')) + max(columns('^t.*')) from table1 -``` - -Results: - -```sql -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Multiple different COLUMNS in the same expression are not supported -``` - -- Multiple COLUMNS Functions in Different Expressions - -Query minimum of 'h'-columns and maximum of 'h'-columns separately -```sql -IoTDB:database1> select min(columns('^h.*')) , max(columns('^h.*')) from table1 -``` - -Results: - -```sql -+--------------+--------------+ -|_col0_humidity|_col1_humidity| -+--------------+--------------+ -| 34.8| 45.1| -+--------------+--------------+ -``` - -Query minimum of 'h'-columns and maximum of 'te'-columns -```sql -IoTDB:database1> select min(columns('^h.*')) , max(columns('^te.*')) from table1 -``` - -Results: - -```sql -+--------------+-----------------+ -|_col0_humidity|_col1_temperature| -+--------------+-----------------+ -| 34.8| 90.0| -+--------------+-----------------+ -``` - -3. 
 In Where Clause - -Query rows in which every column matching '^h.*' is greater than 40 (only `humidity` matches here, so this is equivalent to the `humidity > 40` query that follows) -```sql -IoTDB:database1> select * from table1 where columns('^h.*') > 40 -``` - -Results: - -```sql -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| arrival_time| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -``` - -Equivalent query that names the column directly -```sql -IoTDB:database1> select * from table1 where humidity > 40 -``` - -Results: - -```sql -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| arrival_time| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -``` - -## 4. 
 Column Order in the Result Set - -- **Column Order**: The order of columns in the result set matches the order specified in the `SELECT` clause. -- **Multi-column Expressions**: If a selection expression produces multiple columns, their order follows the order in the source relation. \ No newline at end of file diff --git a/src/UserGuide/Master/Table/SQL-Manual/Set-Operations_timecho.md b/src/UserGuide/Master/Table/SQL-Manual/Set-Operations_timecho.md deleted file mode 100644 index 3628b15ec..000000000 --- a/src/UserGuide/Master/Table/SQL-Manual/Set-Operations_timecho.md +++ /dev/null @@ -1,295 +0,0 @@ - -# Set Operations - -IoTDB natively supports standard SQL set operations, including three core operators: **UNION**, **INTERSECT**, and **EXCEPT**. These operations merge, compare, and filter query results from multiple time-series data sources, which makes time-series analysis across several tables more flexible. - -> Note: This feature is available since version 2.0.9.1. - -## 1. UNION -### 1.1 Overview -The UNION operator combines all rows from two result sets (order not guaranteed), supporting both duplicate elimination (default) and duplicate retention modes. - -### 1.2 Syntax -```sql -query UNION [ALL | DISTINCT] query -``` - -**Description** -1. **Duplicate Handling** - - Default (`UNION` or `UNION DISTINCT`): Automatically removes duplicate rows. - - `UNION ALL`: Preserves all rows (including duplicates) with higher performance. - -2. **Input Requirements** - - The two queries must return the same number of columns. - - Corresponding columns must have compatible data types: - - Numeric compatibility: `INT32`, `INT64`, `FLOAT`, and `DOUBLE` are fully compatible with each other. - - String compatibility: `TEXT` and `STRING` are fully compatible. - - Special rule: `INT64` is compatible with `TIMESTAMP`. - -3. **Result Set Rules** - - Column names and order are inherited from the first query. 
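
The two duplicate-handling modes described above can be modelled in a few lines of Python. This is a minimal sketch of the semantics only, under the assumption that a row is a plain tuple. The helper names `union_all` and `union_distinct` and the sample rows are made up for illustration and are not taken from the real sample data.

```python
def union_all(left, right):
    """UNION ALL: concatenate both row lists and keep every duplicate."""
    return list(left) + list(right)


def union_distinct(left, right):
    """UNION / UNION DISTINCT: keep a single copy of each distinct row."""
    seen = set()
    result = []
    for row in union_all(left, right):
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result


# Hypothetical (device_id, temperature) rows.
rows_a = [("100", 85.0), ("100", 90.0), ("100", 85.0)]
rows_b = [("100", 90.0), ("101", 85.0)]

all_rows = union_all(rows_a, rows_b)
distinct_rows = union_distinct(rows_a, rows_b)
print(len(all_rows), len(distinct_rows))
```

The sketch happens to preserve input order, but that is an artifact of the Python lists: the real operator does not guarantee any order. The extra pass that tracks already-seen rows is also a plausible reason why `UNION ALL` is described as the faster mode.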
- -### 1.3 Examples -Using the [sample data](../Reference/Sample-Data.md): - -1. Get distinct non-null device and temperature records from `table1` and `table2` -```sql -SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL -UNION -SELECT device_id, temperature FROM table2 WHERE temperature IS NOT NULL; - --- Equivalent to: -SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL -UNION DISTINCT -SELECT device_id, temperature FROM table2 WHERE temperature IS NOT NULL; -``` - -Result: -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| 90.0| -| 101| 85.0| -| 100| 90.0| -| 100| 85.0| -| 100| 88.0| -+---------+-----------+ -Total line number = 5 -It costs 0.074s -``` - -2. Get all non-null device and temperature records from `table1` and `table2` (including duplicates) -```sql -SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL -UNION ALL -SELECT device_id, temperature FROM table2 WHERE temperature IS NOT NULL; -``` - -Result: -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| 90.0| -| 101| 90.0| -| 101| 85.0| -| 101| 85.0| -| 101| 85.0| -| 101| 85.0| -| 100| 90.0| -| 100| 85.0| -| 100| 85.0| -| 100| 88.0| -| 100| 90.0| -| 100| 90.0| -| 101| 90.0| -| 101| 85.0| -| 101| 85.0| -| 100| 85.0| -| 100| 90.0| -+---------+-----------+ -Total line number = 17 -It costs 0.108s -``` - -> **Notes** -> - Set operations **do not guarantee result order**; actual output may differ from examples. - - -## 2. INTERSECT -### 2.1 Overview -The INTERSECT operator returns rows that exist in both result sets (order not guaranteed), supporting both duplicate elimination (default) and duplicate retention modes. - -### 2.2 Syntax -```sql -query1 INTERSECT [ALL | DISTINCT] query2 -``` - -**Description** -1. **Duplicate Handling** - - Default (`INTERSECT` or `INTERSECT DISTINCT`): Automatically removes duplicate rows. 
- - `INTERSECT ALL`: Preserves duplicate rows, with slightly lower performance. - -2. **Precedence Rules** - - `INTERSECT` has higher precedence than `UNION` and `EXCEPT` - (e.g., `A UNION B INTERSECT C` is equivalent to `A UNION (B INTERSECT C)`). - - Evaluation is left-to-right - (e.g., `A INTERSECT B INTERSECT C` is equivalent to `(A INTERSECT B) INTERSECT C`). - -3. **Input Requirements** - - The two queries must return the same number of columns. - - Corresponding columns must have compatible data types (same rules as UNION). - - NULL values are treated as equal (`NULL IS NOT DISTINCT FROM NULL`). - - If the `time` column is not included in `SELECT`, it does not participate in comparison and will not appear in the result. - -4. **Result Set Rules** - - Column names and order are inherited from the first query. - -### 2.3 Examples -Using the [sample data](../Reference/Sample-Data.md): - -1. Get distinct common device and temperature records from `table1` and `table2` -```sql -SELECT device_id, temperature FROM table1 -INTERSECT -SELECT device_id, temperature FROM table2; - --- Equivalent to: -SELECT device_id, temperature FROM table1 -INTERSECT DISTINCT -SELECT device_id, temperature FROM table2; -``` - -Result: -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| 90.0| -| 101| 85.0| -| 100| null| -| 100| 90.0| -| 100| 85.0| -+---------+-----------+ -Total line number = 5 -It costs 0.087s -``` - -2. Get all common device and temperature records from `table1` and `table2` (including duplicates) -```sql -SELECT device_id, temperature FROM table1 -INTERSECT ALL -SELECT device_id, temperature FROM table2; -``` - -Result: -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 100| 85.0| -| 100| 90.0| -| 100| null| -| 101| 85.0| -| 101| 85.0| -| 101| 90.0| -+---------+-----------+ -Total line number = 6 -It costs 0.139s -``` - -> **Notes** -> - Set operations **do not guarantee result order**. 
-> - When mixed with `UNION`/`EXCEPT`, use parentheses to explicitly specify precedence - > (e.g., `A INTERSECT (B UNION C)`). - - -## 3. EXCEPT -### 3.1 Overview -The EXCEPT operator returns rows that exist in the first result set but **not** in the second (order not guaranteed), supporting both duplicate elimination (default) and duplicate retention modes. - -### 3.2 Syntax -```sql -query1 EXCEPT [ALL | DISTINCT] query2 -``` - -**Description** -1. **Duplicate Handling** - - Default (`EXCEPT` or `EXCEPT DISTINCT`): Automatically removes duplicate rows. - - `EXCEPT ALL`: Preserves duplicate rows, with slightly lower performance. - -2. **Precedence Rules** - - `EXCEPT` has the same precedence as `UNION`, and lower precedence than `INTERSECT` - (e.g., `A INTERSECT B EXCEPT C` is equivalent to `(A INTERSECT B) EXCEPT C`). - - Evaluation is left-to-right - (e.g., `A EXCEPT B EXCEPT C` is equivalent to `(A EXCEPT B) EXCEPT C`). - -3. **Input Requirements** - - The two queries must return the same number of columns. - - Corresponding columns must have compatible data types (same rules as UNION). - - NULL values are treated as equal (`NULL IS NOT DISTINCT FROM NULL`). - - If the `time` column is not included in `SELECT`, it does not participate in comparison and will not appear in the result. - -4. **Result Set Rules** - - Column names and order are inherited from the first query. - -### 3.3 Examples -Using the [sample data](../Reference/Sample-Data.md): - -1. Get distinct records from `table1` that do not exist in `table2` -```sql -SELECT device_id, temperature FROM table1 -EXCEPT -SELECT device_id, temperature FROM table2; - --- Equivalent to: -SELECT device_id, temperature FROM table1 -EXCEPT DISTINCT -SELECT device_id, temperature FROM table2; -``` - -Result: -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| null| -| 100| 88.0| -+---------+-----------+ -Total line number = 2 -It costs 0.173s -``` - -2. 
Get all records from `table1` that do not exist in `table2` (including duplicates) -```sql -SELECT device_id, temperature FROM table1 -EXCEPT ALL -SELECT device_id, temperature FROM table2; -``` - -Result: -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 100| 85.0| -| 100| 88.0| -| 100| 90.0| -| 100| 90.0| -| 100| null| -| 101| 85.0| -| 101| 85.0| -| 101| 90.0| -| 101| null| -| 101| null| -| 101| null| -| 101| null| -+---------+-----------+ -Total line number = 12 -It costs 0.155s -``` - -> **Notes** -> - Set operations **do not guarantee result order**. -> - When mixed with `UNION`/`INTERSECT`, use parentheses to explicitly specify precedence - > (e.g., `A EXCEPT (B INTERSECT C)`). \ No newline at end of file diff --git a/src/UserGuide/Master/Table/SQL-Manual/overview_timecho.md b/src/UserGuide/Master/Table/SQL-Manual/overview_timecho.md deleted file mode 100644 index d564f44c6..000000000 --- a/src/UserGuide/Master/Table/SQL-Manual/overview_timecho.md +++ /dev/null @@ -1,53 +0,0 @@ - - -# Overview - -## 1. Syntax Overview - -```SQL -SELECT ⟨select_list⟩ - FROM ⟨tables⟩ | patternRecognition - [WHERE ⟨condition⟩] - [GROUP BY ⟨groups⟩] - [HAVING ⟨group_filter⟩] - [WINDOW windowDefinition (',' windowDefinition)*)] - [FILL ⟨fill_methods⟩] - [ORDER BY ⟨order_expression⟩] - [OFFSET ⟨n⟩] - [LIMIT ⟨n⟩]; -``` - -The IoTDB table model query syntax supports the following clauses: - -- **SELECT Clause**: Specifies the columns to be included in the result. Details: [SELECT Clause](../SQL-Manual/Select-Clause_timecho.md) -- **FROM Clause**: Indicates the data source for the query, which can be a single table, multiple tables joined using the `JOIN` clause, or a subquery. Details: [FROM & JOIN Clause](../SQL-Manual/From-Join-Clause.md) -- **WHERE Clause**: Filters rows based on specific conditions. Logically executed immediately after the `FROM` clause. 
 Details: [WHERE Clause](../SQL-Manual/Where-Clause.md) -- **GROUP BY Clause**: Used for aggregating data, specifying the columns for grouping. Details: [GROUP BY Clause](../SQL-Manual/GroupBy-Clause.md) -- **HAVING Clause**: Applied after the `GROUP BY` clause to filter grouped data, similar to `WHERE` but operates after grouping. Details: [HAVING Clause](../SQL-Manual/Having-Clause.md) -- **FILL Clause**: Handles missing values in query results by specifying fill methods (e.g., previous non-null value or linear interpolation) for better visualization and analysis. Details: [FILL Clause](../SQL-Manual/Fill-Clause.md) -- **ORDER BY Clause**: Sorts query results in ascending (`ASC`) or descending (`DESC`) order, with optional handling for null values (`NULLS FIRST` or `NULLS LAST`). Details: [ORDER BY Clause](../SQL-Manual/OrderBy-Clause.md) -- **OFFSET Clause**: Specifies the starting position for the query result, skipping the first `OFFSET` rows. Often used with the `LIMIT` clause. Details: [LIMIT and OFFSET Clause](../SQL-Manual/Limit-Offset-Clause.md) -- **LIMIT Clause**: Limits the number of rows in the query result. Typically used in conjunction with the `OFFSET` clause for pagination. Details: [LIMIT and OFFSET Clause](../SQL-Manual/Limit-Offset-Clause.md) - -## 2. Clause Execution Order - -![](/img/data-query-1.png) \ No newline at end of file diff --git a/src/UserGuide/Master/Table/Tools-System/CLI_timecho.md b/src/UserGuide/Master/Table/Tools-System/CLI_timecho.md deleted file mode 100644 index 484d5b32f..000000000 --- a/src/UserGuide/Master/Table/Tools-System/CLI_timecho.md +++ /dev/null @@ -1,178 +0,0 @@ - -# CLI - -The IoTDB Command Line Interface (CLI) tool allows users to interact with the IoTDB server. Before using the CLI tool to connect to IoTDB, ensure that the IoTDB service is running correctly. This document explains how to launch the CLI and its related parameters. - -In this manual, `$IOTDB_HOME` represents the installation directory of IoTDB. 
- -## 1. CLI Launch - -The CLI client script is located in the `$IOTDB_HOME/sbin` directory. The common commands to start the CLI tool are as follows: - -#### **Linux** **MacOS** - -```Bash -Shell> bash sbin/start-cli.sh -sql_dialect table -#or -# Before version V2.0.6.x -Shell> bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x and later versions -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` - -#### **Windows** - -```Bash -# Before version V2.0.4.x -Shell> sbin\start-cli.bat -sql_dialect table -#or -Shell> sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table - -# V2.0.4.x and later versions -Shell> sbin\windows\start-cli.bat -sql_dialect table -#or -# V2.0.4.x and later versions, before version V2.0.6.x -Shell> sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x and later versions -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` - -**Parameter Explanation** - -| **Parameter** | **Type** | **Required** | **Description** | **Example** | -| -------------------------- | -------- | ------------ |---------------------------------------------------------------------------------------------------| ------------------- | -| -h `` | string | No | The IP address of the IoTDB server. (Default: 127.0.0.1) | -h 127.0.0.1 | -| -p `` | int | No | The RPC port of the IoTDB server. (Default: 6667) | -p 6667 | -| -u `` | string | No | The username to connect to the IoTDB server. (Default: root) | -u root | -| -pw `` | string | No | The password to connect to the IoTDB server. (Default: `TimechoDB@2021`,before V2.0.6 it is root) | -pw root | -| -sql_dialect `` | string | No | The data model type: tree or table. (Default: tree) | -sql_dialect table | -| -e `` | string | No | Batch operations in non-interactive mode. 
| -e "show databases" | -| -c | Flag | No | Required if rpc_thrift_compression_enable=true on the server. | -c | -| -disableISO8601 | Flag | No | If set, timestamps will be displayed as numeric values instead of ISO8601 format. | -disableISO8601 | -| -usessl `` | Boolean | No | Enable SSL connection | -usessl true | -| -ts `` | string | No | SSL certificate store path | -ts /path/to/truststore | -| -tpw `` | string | No | SSL certificate store password | -tpw myTrustPassword | -| -timeout `` | int | No | Query timeout (seconds). If not set, the server's configuration will be used. | -timeout 30 | -| -help | Flag | No | Displays help information for the CLI tool. | -help | - -The figure below indicates a successful startup: - -![](/img/Cli-01.png) - - -## 2. Example Commands - -### 2.1 **Create a Database** - -```Java -create database test -``` - -![](/img/Cli-02.png) - - -### 2.2 **Show Databases** -```Java -show databases -``` - -![](/img/Cli-03.png) - - -## 3. CLI Exit - -To exit the CLI and terminate the session, type`quit`or`exit`. - -### 3.1 Additional Notes and Shortcuts - -1. **Navigate Command History:** Use the up and down arrow keys. -2. **Auto-Complete Commands:** Use the right arrow key. -3. **Interrupt Command Execution:** Press `CTRL+C`. - -## 4. Access History Feature - -Since IoTDB **V2.0.9.1**, the access history feature is available. After a client logs in successfully, key historical access information is displayed, and the feature supports distributed deployments. Both administrators and regular users can only view their own access history. The core displayed information includes: - -- Last successful session: displays date, time, access application, IP address, and access method (not shown for first login or when no history exists). -- Most recent failed attempt: displays the date, time, access application, IP address, and access method of the latest failed login attempt immediately before the current successful login. 
-- Cumulative failed attempts: total number of failed session attempts since the last successful session was established. - -### 4.1 Enabling Access History - -You can enable or disable the access history feature by modifying the corresponding parameter in the `iotdb-system.properties` file. A restart is required for changes to take effect. For example: - -```Plain -# Controls whether the audit log feature is enabled -enable_audit_log=false -``` - -- When enabled: login information is recorded and expired data is cleaned periodically. -- When disabled: no data is recorded, displayed, or cleaned up. -- If disabled and then re-enabled, the displayed history will be the last record before the feature was disabled, which may not reflect the actual latest login. - -Usage example: - -```Bash ---------------------- -Starting IoTDB Cli ---------------------- - _____ _________ ______ ______ -|_ _| | _ _ ||_ _ `.|_ _ \ - | | .--.|_/ | | \_| | | `. \ | |_) | - | | / .'`\ \ | | | | | | | __'. - _| |_| \__. | _| |_ _| |_.' /_| |__) | -|_____|'.__.' |_____| |______.'|_______/ Enterprise version 2.0.9.1 (Build: xxxxxxx) - - ----Last Successful Session------------------ -Time: 2026-03-24T10:25:47.759+08:00 -IP Address: 127.0.0.1 ----Last Failed Session---------------------- -Time: 2026-03-24T10:27:26.314+08:00 -IP Address: 127.0.0.1 -Cumulative Failed Attempts: 1 -Successfully logged in at 127.0.0.1:6667 -IoTDB> -``` - -### 4.2 Viewing Access History - -The `root` user and users with the `AUDIT` privilege can view login history records using SQL statements. 
- -Syntax: - -```SQL -SELECT * FROM __audit.login_history; -``` - -Example: - -```SQL -IoTDB> SELECT * FROM __audit.login_history -+-----------------------------+-------+-------+--------+---------+------+ -| time|user_id|node_id|username| ip|result| -+-----------------------------+-------+-------+--------+---------+------+ -|2026-03-25T10:55:58.240+08:00| u_0| node_1| root|127.0.0.1| true| -+-----------------------------+-------+-------+--------+---------+------+ -Total line number = 1 -It costs 0.213s -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Table/Tools-System/Data-Export-Tool_timecho.md b/src/UserGuide/Master/Table/Tools-System/Data-Export-Tool_timecho.md deleted file mode 100644 index f3f031d17..000000000 --- a/src/UserGuide/Master/Table/Tools-System/Data-Export-Tool_timecho.md +++ /dev/null @@ -1,252 +0,0 @@ -# Data Export - -## 1. Function Overview - -IoTDB supports two methods for data export: - -* Data Export Tool: `export-data.sh/bat` is located in the `tools` directory. It can export the query results of specified SQL statements into CSV, SQL, and TsFile (open-source time-series file format) files. -* PIPE Framework-based TsFileBackup: `tsfile-backup.sh/bat` is located in the `tools` directory. It can export specified data files into TsFile format using the PIPE framework. - - - - - - - - - - - - - - - - - - - - - - - - - -
| File Format | IoTDB Tool | Description |
| ----------- | ---------- | ----------- |
| CSV | `export-data.sh/bat` | Plain text format for storing structured data. Must follow the CSV format specified below. |
| SQL | `export-data.sh/bat` | File containing custom SQL statements. |
| TsFile | `export-data.sh/bat` | Open-source time-series file format. |
| TsFile | `tsfile-backup.sh/bat` | An open-source time-series data file format; this script also supports the Object data type. |
- - -## 2. Data Export Tool -### 2.1 Common Parameters -| Short | Full Parameter | Description | Required | Default | -|----------------|--------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ----------------- |-----------------------------------------------| -| `-ft` | `--file_type` | Export file type: `csv`, `sql`, `tsfile`. | ​**Yes** | - | -| `-h` | `--host` | Hostname of the IoTDB server. | No | `127.0.0.1` | -| `-p` | `--port` | Port number of the IoTDB server. | No | `6667` | -| `-u` | `--username` | Username for authentication. | No | `root` | -| `-pw` | `--password` | Password for authentication. Supported for hidden input since V2.0.9.1 | No | `TimechoDB@2021`(Before V2.0.6 it is root) | -| `-sql_dialect` | `--sql_dialect` | Select server model : tree or table | No | tree | -| `-db ` | `--database` | The target database to be exported only takes effect when `-sql_dialect` is of the table type. | Yes when `-sql_dialect = table`| - | -| `-table` | `--table` | The target table to be exported only takes effect when `-sql_dialect` is of the table type. If the `-q` parameter is specified, this parameter will not take effect. If the export type is tsfile/sql, this parameter is mandatory. | ​ No | - | -| `-start_time` | `--start_time` | The start time of the data to be exported only takes effect when `-sql_dialect` is of the table type. If `-q` is specified, this parameter will not take effect. The supported time formats are the same as those for the `-tf` parameter. |No | - | -| `-end_time` | `--end_time` | The end time of the data to be exported only takes effect when `-sql_dialect` is set to the table type. If `-q` is specified, this parameter will not take effect. | No | - | -| `-t` | `--target` | Target directory for the output files. 
If the path does not exist, it will be created. | ​**Yes** | - | -| `-pfn` | `--prefix_file_name` | Prefix for the exported file names. For example, `abc` will generate files like `abc_0.tsfile`, `abc_1.tsfile`. | No | `dump_0.tsfile` | -| `-q` | `--query` | SQL query command to execute. Starting from v2.0.8, semicolons in SQL statements are automatically removed, and query execution proceeds normally. | No | - | -| `-timeout` | `--query_timeout` | Query timeout in milliseconds (ms). | No | `-1` (before v2.0.8)
`Long.MAX_VALUE` (v2.0.8 and later)
(Range: `-1~Long.MAX_VALUE`) | -| `-help` | `--help` | Display help information. | No | - | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - -### 2.2 CSV Format -#### 2.2.1 Command - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-sql_dialect] -db -table - [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -# Windows -# Before version V2.0.4.x -> tools\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] - -# V2.0.4.x and later versions -> tools\windows\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -``` -#### 2.2.2 CSV-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ------------ | ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- |------------------------------------------| -| `-dt` | `--datatype` | Whether to include data types in the CSV file header (`true` or `false`). | No | `false` | -| `-lpf` | `--lines_per_file` | Number of rows per exported file. | No | `10000` (Range:0~Integer.Max=2147483647) | -| `-tf` | `--time_format` | Time format for the CSV file. Options: 1) Timestamp (numeric, long), 2) ISO8601 (default), 3) Custom pattern (e.g., `yyyy-MM-dd HH:mm:ss`). SQL file timestamps are unaffected by this setting. | No | `ISO8601` | -| `-tz` | `--timezone` | Timezone setting (e.g., `+08:00`, `-01:00`). | No | System default | - -#### 2.2.3 Examples - -```Shell -# Valid Example -> export-data.sh -ft csv -sql_dialect table -t /path/export/dir -db database1 -q "select * from table1" - -# Error Example -> export-data.sh -ft csv -sql_dialect table -t /path/export/dir -q "select * from table1" -Parse error: Missing required option: db -``` -### 2.3 SQL Format -#### 2.3.1 Command -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-aligned ] - -lpf - [-tf ] [-tz ] [-q ] [-timeout ] - -# Windows -# Before version V2.0.4.x -> tools\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] - -# V2.0.4.x and later versions -> tools\windows\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] -``` -#### 2.3.2 SQL-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ---------------- | ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ---------------- | -| `-aligned` | `--use_aligned` | Whether to export as aligned SQL format (`true` or `false`). | No | `true` | -| `-lpf` | `--lines_per_file` | Number of rows per exported file. | No | `10000` (Range:0~Integer.Max=2147483647) | -| `-tf` | `--time_format` | Time format for the CSV file. Options: 1) Timestamp (numeric, long), 2) ISO8601 (default), 3) Custom pattern (e.g., `yyyy-MM-dd HH:mm:ss`). SQL file timestamps are unaffected by this setting. | No | `ISO8601` | -| `-tz` | `--timezone` | Timezone setting (e.g., `+08:00`, `-01:00`). | No | System default | - -#### 2.3.3 Examples -```Shell -# Valid Example -> export-data.sh -ft sql -sql_dialect table -t /path/export/dir -db database1 -start_time 1 - -# Error Example -> export-data.sh -ft sql -sql_dialect table -t /path/export/dir -start_time 1 -Parse error: Missing required option: db -``` - -### 2.4 TsFile Format - -#### 2.4.1 Command - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# Windows -# Before version V2.0.4.x -> tools\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# V2.0.4.x and later versions -> tools\windows\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] -``` - -#### 2.4.2 TsFile-Specific Parameters - -* None - -#### 2.4.3 Examples - -```Shell -# Valid Example -> /tools/export-data.sh -ft tsfile -sql_dialect table -t /path/export/dir -db database1 -start_time 0 - -# Error Example -> /tools/export-data.sh -ft tsfile -sql_dialect table -t /path/export/dir -start_time 0 -Parse error: Missing required option: db -``` - - -## 3. TsFileBackup Based on PIPE Framework -Since **V2.0.9.2**, IoTDB supports the `tsfile-backup.sh/bat` script. This script can automatically generate and send the `CREATE PIPE` SQL command to the server, exporting specified data files to TsFile format. - -**Notes:** -1. **To use this script, contact the Timecho Team to obtain the JAR package(`tsfile-remote-sink--jar-with-dependencies.jar`), and place it in a path accessible to IoTDB (e.g., all Data Node hosts).** -2. **This script supports exporting Object-type data to TsFile files.** - - -### 3.1 Execution Commands -```Shell -# Unix/OS X -> tools/tsfile-backup.sh [-sql_dialect ] [-h ] [-p ] - [-u ] [-pw ] [-path ] [-db ] [-table -
] [-s ] [-e ] [-t ] - [-th ] [-tu ] [-tp ] - [--rate_limit] [--plugin_jar] [-help] - -# Windows -> tools\windows\tsfile-backup.bat [-sql_dialect ] [-h ] [-p ] - [-u ] [-pw ] [-path ] [-db ] [-table -
] [-s ] [-e ] [-t ] - [-th ] [-tu ] [-tp ] - [--rate_limit] [--plugin_jar] [-help] -``` - - -### 3.2 Script Parameters -| Abbreviation | Full Name | Description | Required | Default | -|-------------------------|--------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------| -------- | --------------- | -| `-sql_dialect` | `--sql_dialect` | Specifies the data model type. Valid values: `tree` (Tree Model) or `table` (Table Model). | Yes | - | -| `-h` | `--host` | Local host address (IP of the IoTDB instance where the data resides). | No | `127.0.0.1` | -| `-p` | `--port` | Port number for the IoTDB RPC service. | No | `6667` | -| `-u` | `--user` | Username for IoTDB authentication. | No | `root` | -| `-pw` | `--password` | Password for IoTDB authentication (hidden input supported). | No | `root` | -| `-t` | `--target` | Export target directory. In SCP mode, this is an absolute physical path on the remote server. TsFile and associated Object directories will be exported here. | Yes | - | -| `-db` | `--database` | Database name (optional for Table Model). | No | `.*` | -| `-table` | `--table` | Table name (optional for Table Model). | No | `.*` | -| `-s` | `--start_time` | Start time (ISO8601 format e.g. `2026-01-01T00:00:00` or millisecond timestamp). Only data from this time onwards is exported. | No | - | -| `-e` | `--end_time` | End time (same format as above). Only data before this time is exported. | No | - | -| `-th` | `--target_host` | Remote target host IP. If specified, the script automatically configures Pipe to use SCP for data transfer. | No | - | -| `-tu` | `--target_host_user` | Username for SSH/SCP login to the remote server. | No | - | -| `-tpw` | `--target_host_pw` | Password for remote authentication (hidden input supported). | No | - | -| `-tp` | `--target_host_port` | Remote SSH port. 
| No | `22` | -| `--rate_limit` | `--rate_limit` | Transfer rate limit (unit: Bytes/s) to prevent excessive bandwidth usage. | No | - | -| `--plugin_jar` | `--plugin_jar` | Path to the Pipe plugin JAR file. | No | - | -| `--object-parallelism` | `--object-parallelism` | Specifies the maximum parallelism for object file transmission. | No | - | -| `--object-batch-size` | `--object-batch-size` | Limits the total byte size of each object file upload batch, used to control memory usage and single SCP transfer size. | No | - | -| `-help` | `--help` | Show help information. | No | - | - - -### 3.3 Execution Examples - -Example 1: SCP Remote Export (Send Data to Another Server) - -```Bash -./tsfile-backup.sh -sql_dialect table -db test_db -t /remote/archive/ -th 192.168.1.100 -tu backup_user -tpw ComplexPass123! -``` - -Example 2: Remote Object Data Export with Rate Limiting - -```Bash -./tsfile-backup.sh -sql_dialect table -t /mnt/backup/ -th 10.0.0.5 -tu iot_admin -tpw Admin@2026 --rate_limit 5242880 -``` - -Example 3: Specify Pipe Plugin JAR Directory - -```Bash -./tsfile-backup.sh -sql_dialect table -db test -table .* -tu luoluoyuyu -tpw -t /tmp/backup --plugin_jar /local/lib/tsfile-remote-sink-2.0.8-SNAPSHOT-jar-with-dependencies.jar -``` - -**Note**: When exporting Object-type data in SCP mode, to avoid handshake exceptions, connection failures, or frequent Pipe restarts, it is recommended to take any of the following measures: -* Appropriately lower the configuration parameter `object-parallelism` -* Increase the `MaxStartups` value on the target machine as needed. After modification, execute `sshd reload` or `sshd restart` for the configuration to take effect. 
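
The first measure above can be sketched as a dry-run helper that assembles a `tsfile-backup.sh` command line with a reduced `--object-parallelism` value. This is only an illustration: the function name `build_backup_cmd`, the host, the user, the paths, and the numeric values are assumptions, and the helper prints the command rather than running it. `-tpw` is left out on the assumption that the hidden-input prompt is used instead.

```bash
#!/usr/bin/env bash
# Dry-run sketch: assemble a tsfile-backup.sh command line with reduced
# Object-file parallelism. All concrete values are illustrative assumptions.
set -euo pipefail

build_backup_cmd() {
  local target_host="$1"   # remote host that receives the files over SCP
  local parallelism="$2"   # value passed to --object-parallelism
  local cmd=(
    tools/tsfile-backup.sh
    -sql_dialect table
    -db test_db
    -t /remote/archive/
    -th "$target_host"
    -tu backup_user
    --object-parallelism "$parallelism"
    --rate_limit 5242880
  )
  # Print the assembled command instead of executing it.
  printf '%s ' "${cmd[@]}"
  printf '\n'
}

build_backup_cmd 192.168.1.100 2
```

For the second measure, `MaxStartups` is set in the target machine's `sshd_config`. On systemd-based hosts the SSH daemon is usually reloaded with `systemctl reload sshd`, though the unit name can differ between distributions.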
\ No newline at end of file diff --git a/src/UserGuide/Master/Table/Tools-System/Data-Import-Tool_timecho.md b/src/UserGuide/Master/Table/Tools-System/Data-Import-Tool_timecho.md deleted file mode 100644 index ca7cfa5b8..000000000 --- a/src/UserGuide/Master/Table/Tools-System/Data-Import-Tool_timecho.md +++ /dev/null @@ -1,374 +0,0 @@ -# Data Import - -## 1. Functional Overview - -IoTDB supports three methods for data import: -- Data Import Tool: Use the `import-data.sh/bat` script in the `tools` directory to manually import CSV, SQL, or TsFile (open-source time-series file format) data into IoTDB. -- `TsFile` Auto-Loading Feature -- Load `TsFile` SQL - -
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
| File Format | IoTDB Tool | Description |
| ----------- | ---------- | ----------- |
| CSV | `import-data.sh/bat` | Can be used for single or batch import of CSV files into IoTDB |
| SQL | `import-data.sh/bat` | Can be used for single or batch import of SQL files into IoTDB |
| TsFile | `import-data.sh/bat` | Can be used for single or batch import of TsFile files into IoTDB |
| TsFile | TsFile Auto-Loading Feature | Can automatically monitor a specified directory for newly generated TsFiles and load them into IoTDB |
| TsFile | Load SQL | Can be used for single or batch import of TsFile files into IoTDB |
- -- The table model TsFile import currently supports local import only. -- Since version V2.0.9.2, the `import-data.sh/bat` script supports the Object data type when importing TsFile files. - -## 2. Data Import Tool -### 2.1 Common Parameters - -| Short | Full Parameter | Description | Required | Default | -|-----------------|---------------------------|------------------|-----------------|-----------------| -| `-ft` | `--file_type` | File type: `csv`, `sql`, `tsfile`. | **Yes** | - | -| `-h` | `--host` | IoTDB server hostname. | No | `127.0.0.1` | -| `-p` | `--port` | IoTDB server port. | No | `6667` | -| `-u` | `--username` | Username. | No | `root` | -| `-pw` | `--password` | Password. Hidden input is supported since V2.0.9.1. | No | `TimechoDB@2021` (`root` before V2.0.6) | -| `-sql_dialect` | `--sql_dialect` | Server data model: `tree` or `table`. | No | `tree` | -| `-db` | `--database` | Target database; applies only when `-sql_dialect` is `table`. | Yes when `-sql_dialect = table`;
Starting from version V2.0.9.2, this parameter is optional when the file format is SQL. A prompt will be issued if the target database is not explicitly specified in either the parameter or the SQL statement. | - | -| `-table` | `--table ` | Target table , required for CSV imports in table model | No | - | -| `-s` | `--source` | Local path to the file/directory to import. ​​**Supported formats**​: CSV, SQL, TsFile. Unsupported formats trigger error: `The file name must end with "csv", "sql", or "tsfile"!` | ​**Yes** | - | -| `-tn` | `--thread_num` | Maximum parallel threads | No | `8`
Range: 0 to Integer.Max(2147483647). | -| `-tz` | `--timezone` | Timezone (e.g., `+08:00`, `-01:00`). | No | System default | -| `-help` | `--help` | Display help (general or format-specific: `-help csv`). | No | - | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - -### 2.2 CSV Format - -#### 2.2.1 Command -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-sql_dialect] -db -table - [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] - -# Windows -# Before version V2.0.4.x -> tools\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] - -# V2.0.4.x and later versions -> tools\windows\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] -``` - -#### 2.2.2 CSV-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ---------------- | ------------------------------- |----------------------------------------------------------| ---------- |-----------------| -| `-fd` | `--fail_dir` | Directory to save failed files. | No | YOUR_CSV_FILE_PATH | -| `-lpf` | `--lines_per_failed_file` | Max lines per failed file. | No | `100000`
Range: 0 to `Integer.MAX_VALUE` (2147483647). | -| `-aligned` | `--use_aligned` | Import as aligned time series. | No | `false` | -| `-batch` | `--batch_size` | Rows processed per API call. | No | `100000`
Range: 0 to Integer.Max(2147483647). | -| `-ti` | `--type_infer` | Type mapping (e.g., `BOOLEAN=text,INT=long`). | No | - | -| `-tp` | `--timestamp_precision` | Timestamp precision: `ms`, `us`, `ns`. | No | `ms` | - -#### 2.2.3 Examples - -```Shell -# Valid Example -> tools/import-data.sh -ft csv -sql_dialect table -s ./csv/dump0_0.csv -db database1 -table table1 - -# Error Example -> tools/import-data.sh -ft csv -sql_dialect table -s ./csv/dump0_1.csv -table table1 -Parse error: Missing required option: db - -> tools/import-data.sh -ft csv -sql_dialect table -s ./csv/dump0_1.csv -db database1 -table table5 -There are no tables or the target table table5 does not exist -``` - -#### 2.2.4 Import Notes - -1. CSV Import Specifications - -- Special Character Escaping Rules: If a text-type field contains special characters (e.g., commas `,`), they must be escaped using a backslash (`\`). -- Supported Time Formats: `yyyy-MM-dd'T'HH:mm:ss`, `yyyy-MM-dd HH:mm:ss`, or `yyyy-MM-dd'T'HH:mm:ss.SSSZ`. -- Timestamp Column Requirement: The timestamp column must be the first column in the data file. - -2. CSV File Example - -```sql -time,region,device,model,temperature,humidity -1970-01-01T08:00:00.001+08:00,"SH","101","F",90.0,35.2 -1970-01-01T08:00:00.002+08:00,"SH","101","F",90.0,34.8 -``` - - -### 2.3 SQL Format - -#### 2.3.1 Command - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] - -# Windows -# Before version V2.0.4.x -> tools\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] - -# V2.0.4.x and later versions -> tools\windows\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] -``` - -#### 2.3.2 SQL-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| -------------- | ------------------------------- | -------------------------------------------------------------------- | ---------- | ------------------ | -| `-fd` | `--fail_dir` | Directory to save failed files. | No |YOUR_CSV_FILE_PATH| -| `-lpf` | `--lines_per_failed_file` | Max lines per failed file. | No | `100000`
Range: 0 to `Integer.MAX_VALUE` (2147483647). | -| `-batch` | `--batch_size` | Rows processed per API call. | No | `100000`
Range: 0 to Integer.Max(2147483647). | - -#### 2.3.3 Examples - -```Shell -# Valid Example -> tools/import-data.sh -ft sql -sql_dialect table -s ./sql/dump0_0.sql -db database1 - -# Error Example -> tools/import-data.sh -ft sql -sql_dialect table -s ./sql/dump1_1.sql -db database1 -Source file or directory ./sql/dump1_1.sql does not exist - -# When the ​target table exists but metadata is incompatible or ​data is malformed, the system will generate a .failed file and log error details. -# Log Example -Fail to insert measurements '[column.name]' caused by [data type is not consistent, input '[column.value]', registered '[column.DataType]'] -``` -### 2.4 TsFile Format - -#### 2.4.1 Command - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-o ] -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] - -# Windows -# Before version V2.0.4.x -> tools\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] - -# V2.0.4.x and later versions -> tools\windows\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-o ] -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] -``` -#### 2.4.2 TsFile-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -|---------|---------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------|-----------------------| -| `-os` | `--on_success` | Action for successful files:
`none`: Do not delete the file.
`mv`: Move the successful file to the target directory.
`cp`: Create a hard link (copy) of the successful file to the target directory.
`delete`: Delete the file. | **Yes** | - | -| `-sd` | `--success_dir` | Target directory for `mv`/`cp` actions on success. Required if `-os` is `mv`/`cp`. The file name will be flattened and concatenated with the original file name. | Conditional | `${EXEC_DIR}/success` | -| `-of` | `--on_fail` | Action for failed files:
`none`: Skip the file.
`mv`: Move the failed file to the target directory.
`cp`: Create a hard link (copy) of the failed file to the target directory.
`delete`: Delete the file. | **Yes** | - | -| `-fd` | `--fail_dir` | Target directory for `mv`/`cp` actions on failure. Required if `-of` is `mv`/`cp`. The file name will be flattened and concatenated with the original file name. | Conditional | `${EXEC_DIR}/fail` | -| `-tp` | `--timestamp_precision` | TsFile timestamp precision: `ms`, `us`, `ns`.
For non-remote TsFile imports: Use -tp to specify the timestamp precision of the TsFile. The system will manually verify if the timestamp precision matches the server. If it does not match, an error will be returned.
​For remote TsFile imports: Use -tp to specify the timestamp precision of the TsFile. The Pipe system will automatically verify if the timestamp precision matches. If it does not match, a Pipe error will be returned. | No | `ms` | -| `-o` | `--object-file-paths` | Storage path for Object files.
Default mode: If this parameter is not specified, the script automatically identifies and imports Object files located in the subdirectory with the same name as the TsFile.
Absolute path mode: Explicitly specifies the external storage root directory for Object files; the tool creates an associated data index based on this path.
Note: This parameter is supported since V2.0.9.2 | No | | - -#### 2.4.3 Examples - -```Shell -# Valid Example -> tools/import-data.sh -ft tsfile -sql_dialect table -s ./tsfile -db database1 -os none -of none - -# Error Example -> tools/import-data.sh -ft tsfile -sql_dialect table -s ./tsfile -db database1 -Parse error: Missing required options: os, of -``` - - -**Object Type Import** - -1. Import Directory Structure - -* Default Mode - -```Bash -target_dir - ├── tsfile.tsfile - └── tsfile/ (matches the TsFile name) - ├── regionID/tableName/tag1/tag2/field/timestamp1.bin - ├── regionID/tableName/tag1/tag2/field/timestamp2.bin - └── regionID/tableName1/tag3/tag4/field/timestamp1.bin -``` - -* Specified Object Directory - -```Bash -target_dir - ├── tsfile.tsfile -object_dir - ├── regionID/tableName/tag1/tag2/field/timestamp1.bin - ├── regionID/tableName/tag1/tag2/field/timestamp2.bin - └── regionID/tableName1/tag3/tag4/field/timestamp1.bin -``` - -2. Command Line Examples - -* Basic Import (automatically identifies Object files in the TsFile-named directory) - -```Bash -./import-data.sh -sql_dialect table -ft tsfile -s /data/import/sensor_v1.tsfile -db database1 -os none -of none -``` - -* Batch Directory Import (specify concurrent threads and post-success action) - -```Bash -./import-data.sh -sql_dialect table -ft tsfile -s /data/raw_data/ -tn 16 -os mv -sd /data/archive/ -``` - -* Table Model Associated Import (specify external Object storage path and target database) - -```Bash -./import-data.sh -sql_dialect table -ft tsfile -s /data/import/ -db factory_db -o /mnt/object_storage/ -of mv -fd /data/error_log/ -``` - - -## 3. TsFile Auto-Loading - -This feature enables IoTDB to automatically monitor a specified directory for new TsFiles and load them into the database without manual intervention. 
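
In practice, using this feature comes down to placing files in a monitored directory. The sketch below is a hypothetical helper, `stage_tsfile`, that moves a TsFile into `<monitored_dir>/<database>/`, following the table-model rule in section 3.1 that the containing directory name is used as the database. It moves an accompanying modification file first, as the notes in section 3.3 require. The function name, the arguments, and the assumption that the modification file is named `<tsfile>.mods` are illustrative and not defined by IoTDB.

```bash
#!/usr/bin/env bash
# Sketch: stage a TsFile (and its optional .mods file) into a monitored directory.
# Paths and naming conventions here are illustrative assumptions.
set -euo pipefail

stage_tsfile() {
  local tsfile="$1"        # path to the TsFile to load
  local pending_dir="$2"   # a directory listed in load_active_listening_dirs
  local database="$3"      # table model: the parent directory name is used as the database
  local dest="$pending_dir/$database"

  mkdir -p "$dest"
  # Move the .mods file first so it is already in place when the TsFile appears.
  if [[ -f "$tsfile.mods" ]]; then
    mv "$tsfile.mods" "$dest/"
  fi
  mv "$tsfile" "$dest/"
  # Report where the TsFile now lives.
  echo "$dest/$(basename "$tsfile")"
}
```

A call such as `stage_tsfile ./temperature-table.tsfile ext/load/pending temperature` would place the file under `ext/load/pending/temperature/`, where the next polling cycle can pick it up.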
- -![](/img/Data-import2.png) - -### 3.1 Configuration - -Add the following parameters to `iotdb-system.properties` (template: `iotdb-system.properties.template`): - -| Parameter | Description | Value Range | Required | Default | Hot-Load? | -| ---------------------------------------------------- |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------| ---------- | ----------------------------- | ----------------------- | -| `load_active_listening_enable` | Enable auto-loading. | `true`/`false` | Optional | `true` | Yes | -| `load_active_listening_dirs` | Directories to monitor (subdirectories included). Multiple paths separated by commas.
Note: In the table model, the directory name where the file is located will be used as the database. | String | Optional | `ext/load/pending` | Yes | -| `load_active_listening_fail_dir` | Directory to store failed TsFiles. Only one directory can be set. | String | Optional | `ext/load/failed` | Yes | -| `load_active_listening_max_thread_num` | Maximum Threads for TsFile Loading Tasks: The default value for this parameter, when commented out, is max(1, CPU cores / 2). If the value set by the user falls outside the range [1, CPU cores / 2], it will be reset to the default value of max(1, CPU cores / 2). | `1` to `Long.MAX_VALUE` | Optional | `max(1, CPU_CORES / 2)` | No (restart required) | -| `load_active_listening_check_interval_seconds` | Active Listening Polling Interval (in seconds): The active listening feature for TsFiles is implemented by polling the target directory. This configuration specifies the time interval between two consecutive checks of `load_active_listening_dirs`. After each check, the next check is performed after `load_active_listening_check_interval_seconds` seconds. If the polling interval set by the user is less than 1, it will be reset to the default value of 5 seconds. | `1` to `Long.MAX_VALUE` | Optional | `5` | No (restart required) | - -### 3.2 Examples - -```bash -load_active_listening_dir/ -├─sensors/ -│ ├─temperature/ -│ │ └─temperature-table.TSFILE - -``` - -- Table model TsFile - - `temperature-table.TSFILE`: will be imported into the `temperature` database (because it is located in the `sensors/temperature/` directory) - - -### 3.3 Notes - -1. **Mods Files**: If TsFiles have associated `.mods` files, move the `.mods` files to the monitored directory **before** their corresponding TsFiles. Ensure `.mods` files and TsFiles are in the same directory. -2. **Restricted Directories**: Do NOT set Pipe receiver directories, data directories, or other system paths as monitored directories. -3.
​​**Directory Conflicts**​: Ensure `load_active_listening_fail_dir` does not overlap with `load_active_listening_dirs` or its subdirectories. -4. ​​**Permissions**​: The monitored directory must have write permissions. Files are deleted after successful loading; insufficient permissions may cause duplicate loading. - - -## 4. Load SQL - -IoTDB supports importing one or multiple TsFile files containing time series into another running IoTDB instance directly via SQL execution through the CLI. - -### 4.1 Command - -```SQL -load '' with ( - 'attribute-key1'='attribute-value1', - 'attribute-key2'='attribute-value2', -) -``` - -* `` : The path to a TsFile or a folder containing multiple TsFiles. -* ``: Optional parameters, as described below. - -| Key | Key Description | Value Type | Value Range | Value is Required | Default Value | -|--------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------|--------------------------------|-------------------|----------------------------| -| `database-level` | When the database corresponding to the TsFile does not exist, the database hierarchy level can be specified via the ` database-level` parameter. The default is the level set in `iotdb-common.properties`. For example, setting level=1 means the prefix path of level 1 in all time series in the TsFile will be used as the database. | Integer | `[1: Integer.MAX_VALUE]` | No | 1 | -| `on-success` | Action for successfully loaded TsFiles: `delete` (delete the TsFile after successful import) or `none` (retain the TsFile in the source folder). | String | `delete / none` | No | delete | -| `model` | Specifies whether the TsFile uses the `table` model or `tree` model. 
This parameter becomes invalid starting from V2.0.2.1. The system automatically identifies whether the data model is tree-based or table-based. | String | `tree / table` | No | Aligns with `-sql_dialect` | -| `database-name` | Table model only: Target database for import. Automatically created if it does not exist. The database-name must not include the `root.` prefix (an error will occur if included). | String | `-` | No | null | -| `convert-on-type-mismatch` | Whether to perform type conversion during loading if data types in the TsFile mismatch the target schema. | Boolean | `true / false` | No | true | -| `verify` | Whether to validate the schema before loading the TsFile. | Boolean | `true / false` | No | true | -| `tablet-conversion-threshold` | Size threshold (in bytes) for converting TsFiles into tablet format during loading. Default: `-1` (no conversion for any TsFile). | Integer | `[-1,0 : Integer.MAX_VALUE]` | No | -1 | -| `async` | Whether to enable asynchronous loading. If enabled, TsFiles are moved to an active-load directory and loaded into the `database-name` asynchronously. | Boolean | `true / false` | No | false | - -### 4.2 Example - -```SQL --- Create target database: database2 -IoTDB> create database database2 -Msg: The statement is executed successfully. - -IoTDB> use database2 -Msg: The statement is executed successfully. - -IoTDB:database2> show tables details -+---------+-------+------+-------+ -|TableName|TTL(ms)|Status|Comment| -+---------+-------+------+-------+ -+---------+-------+------+-------+ -Empty set. - --- Import the TsFile by executing the load SQL -IoTDB:database2> load '/home/dump0.tsfile' with ( 'on-success'='none', 'database-name'='database2') -Msg: The statement is executed successfully.
- --- Verify whether the import was successful -IoTDB:database2> select * from table2 -+-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|temperature|humidity|status| arrival_time| -+-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ -|2024-11-30T00:00:00.000+08:00| 上海| 3002| 101| 90.0| 35.2| true| null| -|2024-11-29T00:00:00.000+08:00| 上海| 3001| 101| 85.0| 35.1| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T00:00:00.000+08:00| 北京| 1001| 101| 85.0| 35.1| true|2024-11-27T16:37:01.000+08:00| -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| null| 45.1| true| null| -|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| 85.0| 35.2| false|2024-11-28T08:00:09.000+08:00| -|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| -+-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ -``` diff --git a/src/UserGuide/Master/Table/Tools-System/Maintenance-Tool_timecho.md b/src/UserGuide/Master/Table/Tools-System/Maintenance-Tool_timecho.md deleted file mode 100644 index 725b758a4..000000000 --- a/src/UserGuide/Master/Table/Tools-System/Maintenance-Tool_timecho.md +++ /dev/null @@ -1,1151 +0,0 @@ - -# Cluster Management Tool - -## 1. IoTDB-OpsKit - -The IoTDB OpsKit is an easy-to-use operation and maintenance tool designed for TimechoDB (Enterprise-grade product based on Apache IoTDB). It helps address the operational and maintenance challenges of multi-node distributed IoTDB deployments by providing functionalities such as cluster deployment, start/stop management, elastic scaling, configuration updates, and data export. With one-click command execution, it simplifies the management of complex database clusters and significantly reduces operational complexity. 
- -This document provides guidance on remotely deploying, configuring, starting, and stopping IoTDB cluster instances using the cluster management tool. - -### 1.1 Prerequisites - -The IoTDB OpsKit requires GLIBC 2.17 or later, which means the minimum supported operating system version is CentOS 7. The target machines for IoTDB deployment must have the following dependencies installed: - -- JDK 8 or later -- lsof -- netstat -- unzip - -If any of these dependencies are missing, please install them manually. The last section of this document provides installation commands for reference. - -> **Note:** The IoTDB cluster management tool requires **root privileges** to execute. - -### 1.2 Deployment - -#### Download and Installation - -The IoTDB OpsKit is an auxiliary tool for TimechoDB. Please contact Timecho team to obtain the download instructions. - -To install: - -1. Navigate to the `iotdb-opskit` directory and execute: - -```Bash -bash install-iotdbctl.sh -``` - -This will activate the `iotdbctl` command in the current shell session. You can verify the installation by checking the deployment prerequisites: - -```Bash -iotdbctl cluster check example -``` - -1. Alternatively, if you prefer not to activate `iotdbctl`, you can execute commands directly using the absolute path: - -```Bash -/sbin/iotdbctl cluster check example -``` - -### 1.3 Cluster Configuration Files - -The cluster configuration files are stored in the `iotdbctl/config` directory as YAML files. - -- Each YAML file name corresponds to a cluster name. Multiple YAML files can coexist. -- A sample configuration file (`default_cluster.yaml`) is provided in the `iotdbctl/config` directory to assist users in setting up their configurations. - -#### **Structure of YAML Configuration** - -The YAML file consists of the following five sections: - -1. `global` – General settings, such as SSH credentials, installation paths, and JDK configurations. -2. 
`confignode_servers` – Configuration settings for ConfigNodes. -3. `datanode_servers` – Configuration settings for DataNodes. -4. `grafana_server` – Configuration settings for Grafana monitoring. -5. `prometheus_server` – Configuration settings for Prometheus monitoring. - -A sample YAML file (`default_cluster.yaml`) is included in the `iotdbctl/config` directory. - -- You can copy and rename it based on your cluster setup. -- All uncommented fields are mandatory. -- Commented fields are optional. - -**Example:** Checking `default_cluster.yaml` - -To validate a cluster configuration, execute: - -```Bash -iotdbctl cluster check default_cluster -``` - -For a complete list of available commands, refer to the command reference section below. - -#### Parameter Reference - -| **Parameter** | **Description** | **Mandatory** | -| ----------------------- | ------------------------------------------------------------ | ------------- | -| iotdb_zip_dir | IoTDB distribution directory. If empty, the package will be downloaded from `iotdb_download_url`. | NO | -| iotdb_download_url | IoTDB download URL. If `iotdb_zip_dir` is empty, the package will be retrieved from this address. | NO | -| jdk_tar_dir | Local path to the JDK package for uploading and deployment. | NO | -| jdk_deploy_dir | Remote deployment directory for the JDK. | NO | -| jdk_dir_name | JDK decompression directory name. Default: `jdk_iotdb`. | NO | -| iotdb_lib_dir | IoTDB library directory (or `.zip` package for upgrades). Default: commented out. | NO | -| user | SSH login username for deployment. | YES | -| password | SSH password (if omitted, key-based authentication will be used). | NO | -| pkey | SSH private key (used if `password` is not provided). | NO | -| ssh_port | SSH port number. | YES | -| deploy_dir | IoTDB deployment directory. | YES | -| iotdb_dir_name | IoTDB decompression directory name. Default: `iotdb`. | NO | -| datanode-env.sh | Corresponds to `iotdb/config/datanode-env.sh`.
If both `global` and `datanode_servers` are configured, `datanode_servers` takes precedence. | NO | -| confignode-env.sh | Corresponds to `iotdb/config/confignode-env.sh`. If both `global` and `confignode_servers` are configured, `confignode_servers` takes precedence. | NO | -| iotdb-system.properties | Corresponds to `/config/iotdb-system.properties`. | NO | -| cn_internal_address | The inter-node communication address for ConfigNodes. This parameter defines the address of the surviving ConfigNode, which defaults to `confignode_x`. If both `global` and `confignode_servers` are configured, the value in `confignode_servers` takes precedence. Corresponds to `cn_internal_address` in `iotdb/config/iotdb-system.properties`. | YES | -| dn_internal_address | The inter-node communication address for DataNodes. This address defaults to `confignode_x`. If both `global` and `datanode_servers` are configured, the value in `datanode_servers` takes precedence. Corresponds to `dn_internal_address` in `iotdb/config/iotdb-system.properties`. | YES | - -Both `datanode-env.sh` and `confignode-env.sh` allow **extra parameters** to be appended. These parameters can be configured using the `extra_opts` field. Example from `default_cluster.yaml`: - -```YAML -datanode-env.sh: - extra_opts: | - IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC" - IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200" -``` - -#### ConfigNode Configuration - -ConfigNodes can be configured in `confignode_servers`. Multiple ConfigNodes can be deployed, with the first started ConfigNode (`node1`) serving as the Seed ConfigNode by default. - -| **Parameter** | **Description** | **Mandatory** | -| ----------------------- | ------------------------------------------------------------ | ------------- | -| name | ConfigNode name. | YES | -| deploy_dir | ConfigNode deployment directory.
| YES | -| cn_internal_address | Inter-node communication address for ConfigNodes, corresponding to `cn_internal_address` in `iotdb/config/iotdb-system.properties`. | YES | -| cn_seed_config_node | The cluster configuration address, which points to the surviving ConfigNode. This address defaults to `confignode_x`. If both `global` and `confignode_servers` are configured, the value in `confignode_servers` takes precedence. Corresponds to `cn_seed_config_node` in `iotdb/config/iotdb-system.properties`. | YES | -| cn_internal_port | Internal communication port, corresponding to `cn_internal_port` in `iotdb/config/iotdb-system.properties`. | YES | -| cn_consensus_port | Consensus communication port, corresponding to `cn_consensus_port` in `iotdb/config/iotdb-system.properties`. | NO | -| cn_data_dir | Data directory for ConfigNodes, corresponding to `cn_data_dir` in `iotdb/config/iotdb-system.properties`. | YES | -| iotdb-system.properties | ConfigNode properties file. If `global` and `confignode_servers` are both configured, values from `confignode_servers` take precedence. | NO | - -#### DataNode Configuration - -DataNodes can be configured in `datanode_servers`. Multiple DataNodes can be deployed, each requiring unique configuration. - -| **Parameter** | **Description** | **Mandatory** | -| ----------------------- | ------------------------------------------------------------ | ------------- | -| name | DataNode name. | YES | -| deploy_dir | DataNode deployment directory. | YES | -| dn_rpc_address | RPC communication address, corresponding to `dn_rpc_address` in `iotdb/config/iotdb-system.properties`. | YES | -| dn_internal_address | Internal communication address, corresponding to `dn_internal_address` in `iotdb/config/iotdb-system.properties`. | YES | -| dn_seed_config_node | Points to the active ConfigNode. Defaults to `confignode_x`. If `global` and `datanode_servers` are both configured, values from `datanode_servers` take precedence.
Corresponds to `dn_seed_config_node` in `iotdb/config/iotdb-system.properties`. | YES | -| dn_rpc_port | RPC port for DataNodes, corresponding to `dn_rpc_port` in `iotdb/config/iotdb-system.properties`. | YES | -| dn_internal_port | Internal communication port, corresponding to `dn_internal_port` in `iotdb/config/iotdb-system.properties`. | YES | -| iotdb-system.properties | DataNode properties file. If `global` and `datanode_servers` are both configured, values from `datanode_servers` take precedence. | NO | - -#### Grafana Configuration - -Grafana can be configured in `grafana_server`, which defines the settings for deploying Grafana as a monitoring solution for IoTDB. - -| **Parameter** | **Description** | **Mandatory** | -| ---------------- | ------------------------------------------------------------ | ------------- | -| grafana_dir_name | Name of the Grafana decompression directory. Default: `grafana_iotdb`. | NO | -| host | The IP address of the machine hosting Grafana. | YES | -| grafana_port | The port Grafana listens on. Default: `3000`. | NO | -| deploy_dir | Deployment directory for Grafana. | YES | -| grafana_tar_dir | Path to the Grafana compressed package. | YES | -| dashboards | Path to pre-configured Grafana dashboards. | NO | - -#### Prometheus Configuration - -Prometheus can be configured in `prometheus_server`, which defines the settings for deploying Prometheus as a monitoring solution for IoTDB. - -| **Parameter** | **Description** | **Mandatory** | -| --------------------------- | ------------------------------------------------------------ | ------------- | -| prometheus_dir_name | Name of the Prometheus decompression directory. Default: `prometheus_iotdb`. | NO | -| host | The IP address of the machine hosting Prometheus. | YES | -| prometheus_port | The port Prometheus listens on. Default: `9090`. | NO | -| deploy_dir | Deployment directory for Prometheus. | YES | -| prometheus_tar_dir | Path to the Prometheus compressed package.
| YES | -| storage_tsdb_retention_time | Number of days data is retained. Default: `15 days`. | NO | -| storage_tsdb_retention_size | Maximum data storage size per block. Default: `512M`. Units: KB, MB, GB, TB, PB, EB. | NO | - -If metrics are enabled in `iotdb-system.properties` (in `config/xxx.yaml`), the configurations will be automatically applied to Prometheus without manual modification. - -**Special Configuration Notes** - -- **Handling Special Characters in YAML Keys**: If a YAML key value contains special characters (such as `:`), it is recommended to enclose the entire value in double quotes (`""`). -- **Avoid Spaces in File Paths**: Paths containing spaces may cause parsing errors in some configurations. - -### 1.4 Usage Scenarios - -#### Data Cleanup - -This operation deletes cluster data directories, including: - -- IoTDB data directories, -- ConfigNode directories (`cn_system_dir`, `cn_consensus_dir`), -- DataNode directories (`dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`), -- Log directories and ext directories specified in the YAML configuration. - -To clean cluster data, perform the following steps: - -```Bash -# Step 1: Stop the cluster -iotdbctl cluster stop default_cluster - -# Step 2: Clean the cluster data -iotdbctl cluster clean default_cluster -``` - -#### Cluster Destruction - -The cluster destruction process completely removes the following resources: - -- Data directories, -- ConfigNode directories (`cn_system_dir`, `cn_consensus_dir`), -- DataNode directories (`dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`), -- Log and ext directories, -- IoTDB deployment directory, -- Grafana and Prometheus deployment directories. - -To destroy a cluster, follow these steps: - -```Bash -# Step 1: Stop the cluster -iotdbctl cluster stop default_cluster - -# Step 2: Destroy the cluster -iotdbctl cluster destroy default_cluster -``` - -#### Cluster Upgrade - -To upgrade the cluster, follow these steps: - -1. 
In `config/xxx.yaml`, set **`iotdb_lib_dir`** to the path of the JAR files to be uploaded. Example: `iotdb/lib` -2. If uploading a compressed package, compress the `iotdb/lib` directory: - -```Bash -zip -r lib.zip apache-iotdb-1.2.0/lib/* -``` - -1. Execute the following commands to distribute the library and restart the cluster: - -```Bash -iotdbctl cluster dist-lib default_cluster -iotdbctl cluster restart default_cluster -``` - -#### Hot Deployment - -Hot deployment allows real-time configuration updates without restarting the cluster. - -Steps: - -1. Modify the configuration in `config/xxx.yaml`. -2. Distribute the updated configuration and reload it: - -```Bash -iotdbctl cluster dist-conf default_cluster -iotdbctl cluster reload default_cluster -``` - -#### Cluster Expansion - -To expand the cluster by adding new nodes: - -1. Add a new DataNode or ConfigNode in `config/xxx.yaml`. -2. Execute the cluster expansion command: - -```Bash -iotdbctl cluster scaleout default_cluster -``` - -#### Cluster Shrinking - -To remove a node from the cluster: - -1. Identify the node name or IP:port in `config/xxx.yaml`: - 1. ConfigNode port: `cn_internal_port` - 2. DataNode port: `rpc_port` -2. Execute the following command: - -```Bash -iotdbctl cluster scalein default_cluster -``` - -#### Managing Existing IoTDB Clusters - -To manage an existing IoTDB cluster with the OpsKit tool: - -1. Configure SSH credentials: - 1. Set `user`, `password` (or `pkey`), and `ssh_port` in `config/xxx.yaml`. -2. Modify IoTDB deployment paths: For example, if IoTDB is deployed at `/home/data/apache-iotdb-1.1.1`: - -```YAML -deploy_dir: /home/data/ -iotdb_dir_name: apache-iotdb-1.1.1 -``` - -1. Configure JDK paths: If `JAVA_HOME` is not used, set the JDK deployment path: - -```YAML -jdk_deploy_dir: /home/data/ -jdk_dir_name: jdk_1.8.2 -``` - -1. 
Set cluster addresses: - -- `cn_internal_address` and `dn_internal_address` -- In `confignode_servers` → `iotdb-system.properties`, configure: - - `cn_internal_address`, `cn_internal_port`, `cn_consensus_port`, `cn_system_dir`, `cn_consensus_dir` -- In `datanode_servers` → `iotdb-system.properties`, configure: - - `dn_rpc_address`, `dn_internal_address`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir` - -1. Execute the initialization command: - -```Bash -iotdbctl cluster init default_cluster -``` - -#### Deploying IoTDB, Grafana, and Prometheus - -To deploy an IoTDB cluster along with Grafana and Prometheus: - -1. Enable metrics: In `iotdb-system.properties`, enable the metrics interface. -2. Configure Grafana: - -- If deploying multiple dashboards, separate names with commas. -- Ensure dashboard names are unique to prevent overwriting. - -3. Configure Prometheus: - -- If the IoTDB cluster has metrics enabled, Prometheus automatically adapts without manual configuration. - -4. Start the cluster: - -```Bash -iotdbctl cluster start default_cluster -``` - -For detailed parameters, refer to the **Cluster Configuration Files** section above. - -### 1.5 Command Reference - -The basic command structure is: - -```Bash -iotdbctl cluster <key> <cluster_name> [params (Optional)] -``` - -- `key` – The specific command to execute. -- `cluster_name` – The name of the cluster (matches the YAML file name in `iotdbctl/config`). -- `params` – Optional parameters for the command. - -Example: Deploying the `default_cluster` cluster - -```Bash -iotdbctl cluster deploy default_cluster -``` - -#### Command Overview - -| **Command** | **Description** | **Parameters** | -| ---------- | ---------------------------------- | ---------------------------------------------------- | -| check | Check cluster readiness for deployment. | Cluster name | -| clean | Clean up cluster data. | Cluster name | -| deploy/dist-all | Deploy the cluster.
| Cluster name, -N module (optional: iotdb, grafana, prometheus), -op force (optional) | -| list | List cluster status. | None | -| start | Start the cluster. | Cluster name, -N node name (optional: iotdb, grafana, prometheus) | -| stop | Stop the cluster. | Cluster name, -N node name (optional), -op force (optional) | -| restart | Restart the cluster. | Cluster name, -N node name (optional), -op force (optional) | -| show | View cluster details. | Cluster name, details (optional) | -| destroy | Destroy the cluster. | Cluster name, -N module (optional: iotdb, grafana, prometheus) | -| scaleout | Expand the cluster. | Cluster name | -| scalein | Shrink the cluster. | Cluster name, -N node name or IP:port | -| reload | Hot reload cluster configuration. | Cluster name | -| dist-conf | Distribute cluster configuration. | Cluster name | -| dumplog | Back up cluster logs. | Cluster name, -N node name, -h target IP, -pw target password, -p target port, -path backup path, -startdate, -enddate, -loglevel, -l transfer speed | -| dumpdata | Back up cluster data. | Cluster name, -h target IP, -pw target password, -p target port, -path backup path, -startdate, -enddate, -l transfer speed | -| dist-lib | Upgrade the IoTDB lib package. | Cluster name | -| init | Initialize the cluster configuration. | Cluster name | -| status | View process status. | Cluster name | -| activate | Activate the cluster. | Cluster name | -| health_check | Perform a health check. | Cluster name, -N node name (optional) | -| backup | Back up the cluster. | Cluster name, -N node name (optional) | -| importschema | Import metadata. | Cluster name, -N node name, -param parameters | -| exportschema | Export metadata. | Cluster name, -N node name, -param parameters | - -### 1.6 Detailed Command Execution Process - -The following examples use `default_cluster.yaml` as a reference. Users can modify the commands according to their specific cluster configuration files.
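As orientation before the per-command details, a first-time rollout typically chains several commands from the overview table: `check`, `deploy`, `start`, then `show`. The sketch below is illustrative only. The `run_step` and `rollout` helpers and the `DRY_RUN` switch are hypothetical names, not part of `iotdbctl`, and the script assumes `iotdbctl` is on `PATH` and exits with a non-zero status on failure, which this guide does not document.

```Bash
#!/usr/bin/env bash
# Sketch: run check -> deploy -> start -> show for one cluster.
# DRY_RUN=1 (the default here) only prints the commands.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

# Print an iotdbctl sub-command and execute it unless in dry-run mode.
run_step() {
  echo "+ iotdbctl cluster $*"
  if [ "$DRY_RUN" != "1" ]; then
    iotdbctl cluster "$@"
  fi
}

# rollout CLUSTER_NAME: stop at the first failing step (via set -e).
rollout() {
  local cluster="$1"
  run_step check "$cluster"
  run_step deploy "$cluster"
  run_step start "$cluster"
  run_step show "$cluster" details
}

rollout "${1:-default_cluster}"
```

Running it with `DRY_RUN=0` executes the commands for real; keeping the default lets you review the sequence first.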
- -#### Check Cluster Deployment Environment - -The following command checks whether the cluster environment meets the deployment requirements: - -```Bash -iotdbctl cluster check default_cluster -``` - -**Execution Steps:** - -1. Locate the corresponding YAML file (`default_cluster.yaml`) based on the cluster name. -2. Retrieve configuration information for ConfigNode and DataNode (`confignode_servers` and `datanode_servers`). -3. Verify the following conditions on the target node: - 1. SSH connectivity - 2. JDK version (must be 1.8 or above) - 3. Required system tools: unzip, lsof, netstat - -**Expected Output:** - -- If successful: `Info: example check successfully!` -- If failed: `Error: example check fail!` - -**Troubleshooting Tips:** - -- JDK version not satisfied: Specify a valid `jdk1.8+` path in the YAML file for deployment. -- Missing system tools: Install unzip, lsof, and netstat on the server. -- Port conflict: Check the error log, e.g., `Error: Server (ip:172.20.31.76) iotdb port (10713) is listening.` - -#### Deploy Cluster - -Deploy the entire cluster using the following command: - -```Bash -iotdbctl cluster deploy default_cluster -``` - -**Execution Steps:** - -1. Locate the corresponding `YAML` file based on the cluster name. -2. Upload the IoTDB and JDK compressed packages (if `jdk_tar_dir` and `jdk_deploy_dir` are configured). -3. Generate and upload the iotdb-system.properties file based on the YAML configuration. 
- -**Force Deployment:** To overwrite existing deployment directories and redeploy: - -```Bash -iotdbctl cluster deploy default_cluster -op force -``` - -**Deploying Individual Modules:** - -You can deploy specific components individually: - -```Bash -# Deploy Grafana module -iotdbctl cluster deploy default_cluster -N grafana - -# Deploy Prometheus module -iotdbctl cluster deploy default_cluster -N prometheus - -# Deploy IoTDB module -iotdbctl cluster deploy default_cluster -N iotdb -``` - -#### Start Cluster - -Start the cluster using the following command: - -```Bash -iotdbctl cluster start default_cluster -``` - -**Execution Steps:** - -1. Locate the `YAML` file based on the cluster name. -2. Start ConfigNodes sequentially according to the YAML order. - 1. The first ConfigNode is treated as the Seed ConfigNode. - 2. Verify startup by checking process IDs. -3. Start DataNodes sequentially and verify their process IDs. -4. After process verification, check the cluster's service health via CLI. - 1. If the CLI connection fails, retry every 10 seconds, up to 5 times. - -**Start a Single Node:** Start specific nodes by name or IP: - -```Bash -# By node name -iotdbctl cluster start default_cluster -N datanode_1 - -# By IP and port (ConfigNode uses `cn_internal_port`, DataNode uses `rpc_port`) -iotdbctl cluster start default_cluster -N 192.168.1.5:6667 - -# Start Grafana -iotdbctl cluster start default_cluster -N grafana - -# Start Prometheus -iotdbctl cluster start default_cluster -N prometheus -``` - -**Note:** The `iotdbctl` tool relies on `start-confignode.sh` and `start-datanode.sh` scripts. If startup fails, check the cluster status using the following command: - -```Bash -iotdbctl cluster status default_cluster -``` - -#### View Cluster Status - -To view the current cluster status: - -```Bash -iotdbctl cluster show default_cluster -``` - -To view detailed information: - -```Bash -iotdbctl cluster show default_cluster details -``` - -**Execution Steps:** - -1. 
Locate the YAML file and retrieve `confignode_servers` and `datanode_servers` configuration. -2. Execute `show cluster details` via CLI. -3. If one node returns successfully, the process skips checking remaining nodes. - -#### Stop Cluster - -To stop the entire cluster: - -```Bash -iotdbctl cluster stop default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file and retrieve `confignode_servers` and `datanode_servers` configuration. -2. Stop DataNodes sequentially based on the YAML configuration. -3. Stop ConfigNodes in sequence. - -**Force Stop:** To forcibly stop the cluster using `kill -9`: - -```Bash -iotdbctl cluster stop default_cluster -op force -``` - -**Stop a Single Node:** Stop nodes by name or IP: - -```Bash -# By node name -iotdbctl cluster stop default_cluster -N datanode_1 - -# By IP and port -iotdbctl cluster stop default_cluster -N 192.168.1.5:6667 - -# Stop Grafana -iotdbctl cluster stop default_cluster -N grafana - -# Stop Prometheus -iotdbctl cluster stop default_cluster -N prometheus -``` - -**Note:** If the IoTDB cluster is not fully stopped, verify its status using: - -```Bash -iotdbctl cluster status default_cluster -``` - -#### Clean Cluster Data - -To clean up cluster data, execute: - -```Bash -iotdbctl cluster clean default_cluster -``` - -**Execution Steps:** - -1. Locate the `YAML` file and retrieve `confignode_servers` and `datanode_servers` configuration. -2. Verify that no services are running. If any are active, the cleanup will not proceed. -3. Delete the following directories: - 1. IoTDB data directories, - 2. ConfigNode and DataNode system directories (`cn_system_dir`, `dn_system_dir`), - 3. Consensus directories (`cn_consensus_dir`, `dn_consensus_dir`), - 4. Logs and ext directories. - -#### Restart Cluster - -Restart the cluster using the following command: - -```Bash -iotdbctl cluster restart default_cluster -``` - -**Execution Steps:** - -1. 
Locate the YAML file and retrieve configurations for ConfigNodes, DataNodes, Grafana, and Prometheus. -2. Perform a cluster stop followed by a cluster start. - -**Force Restart:** To forcibly restart the cluster: - -```Bash -iotdbctl cluster restart default_cluster -op force -``` - -**Restart a Single Node:** Restart specific nodes by name: - -```Bash -# Restart DataNode -iotdbctl cluster restart default_cluster -N datanode_1 - -# Restart ConfigNode -iotdbctl cluster restart default_cluster -N confignode_1 - -# Restart Grafana -iotdbctl cluster restart default_cluster -N grafana - -# Restart Prometheus -iotdbctl cluster restart default_cluster -N prometheus -``` - -#### Cluster Expansion - -To add a node to the cluster: - -1. Edit `config/xxx.yaml` to add a new DataNode or ConfigNode. -2. Execute the following command: - -```Bash -iotdbctl cluster scaleout default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file and retrieve node configuration. -2. Upload the IoTDB and JDK packages (if `jdk_tar_dir` and `jdk_deploy_dir` are configured). -3. Generate and upload iotdb-system.properties. -4. Start the new node and verify success. - -Tip: Only one node expansion is supported per execution. - -#### Cluster Shrinking - -To remove a node from the cluster: - -1. Identify the node name or IP:port in `config/xxx.yaml`. -2. Execute the following command: - -```Bash -#Scale down by node name -iotdbctl cluster scalein default_cluster -N nodename - -#Scale down according to ip+port (ip+port obtains the only node according to ip+dn_rpc_port in datanode, and obtains the only node according to ip+cn_internal_port in confignode) -iotdbctl cluster scalein default_cluster -N ip:port -``` - -**Execution Steps:** - -1. Locate the YAML file and retrieve node configuration. -2. Ensure at least one ConfigNode and one DataNode remain. -3. Identify the node to remove, execute the scale-in command, and delete the node directory. 
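When several nodes have to be removed, one way to script it is to call `scalein` once per node, as in the minimal sketch below. The `scalein_nodes` helper and the `DRY_RUN` switch are hypothetical, `datanode_3` is a placeholder node name, and checking `show` between removals is a suggested precaution rather than a documented requirement.

```Bash
#!/usr/bin/env bash
# Sketch: remove several nodes, one scalein call per node.
# DRY_RUN=1 (the default here) only prints the commands.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

# scalein_nodes CLUSTER_NAME NODE [NODE...]
# Each NODE is a node name or an ip:port pair, as accepted by -N.
scalein_nodes() {
  local cluster="$1"
  shift
  local node
  for node in "$@"; do
    echo "+ iotdbctl cluster scalein $cluster -N $node"
    if [ "$DRY_RUN" != "1" ]; then
      iotdbctl cluster scalein "$cluster" -N "$node"
      # Inspect the cluster before removing the next node.
      iotdbctl cluster show "$cluster"
    fi
  done
}

scalein_nodes default_cluster datanode_3 192.168.1.5:6667
```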
- -Tip: Only one node shrinking is supported per execution. - -#### Destroy Cluster - -To destroy the entire cluster: - -```Bash -iotdbctl cluster destroy default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file and retrieve node information. -2. Verify that all nodes are stopped. If any node is running, the destruction will not proceed. -3. Delete the following directories: - 1. IoTDB data directories, - 2. ConfigNode and DataNode system directories, - 3. Consensus directories, - 4. Logs, ext, and deployment directories, - 5. Grafana and Prometheus directories. - -**Destroy a Single Module:** To destroy individual modules: - -```Bash -# Destroy Grafana -iotdbctl cluster destroy default_cluster -N grafana - -# Destroy Prometheus -iotdbctl cluster destroy default_cluster -N prometheus - -# Destroy IoTDB -iotdbctl cluster destroy default_cluster -N iotdb -``` - -#### Distribute Cluster Configuration - -To distribute the cluster configuration files across nodes: - -```Bash -iotdbctl cluster dist-conf default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on `cluster-name`. -2. Retrieve configuration from `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus`. -3. Generate and upload `iotdb-system.properties` to the specified nodes. - -#### Hot Load Cluster Configuration - -To reload the cluster configuration without restarting: - -```Plain -iotdbctl cluster reload default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on `cluster-name`. -2. Execute the `load configuration` command through the CLI for each node. - -#### Cluster Node Log Backup - -To back up logs from specific nodes: - -```Bash -iotdbctl cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs' -``` - -**Execution Steps:** - -1. Locate the YAML file based on `cluster-name`. -2. 
Verify node existence (`datanode_1` and `confignode_1`). -3. Back up log data within the specified date range. -4. Save logs to `/iotdb/logs` or the default IoTDB installation path. - -| **Command** | **Description** | **Mandatory** | -| ----------- | ------------------------------------------------------------ | ------------- | -| -h | IP address of the backup server | NO | -| -u | Username for the backup server | NO | -| -pw | Password for the backup server | NO | -| -p | Backup server port (Default: `22`) | NO | -| -path | Path for backup data (Default: current path) | NO | -| -loglevel | Log level (`all`, `info`, `error`, or `warn`; Default: `all`) | NO | -| -l | Speed limit (Default: unlimited; Range: 0 to 104857601 Kbit/s) | NO | -| -N | Node names (comma-separated) | YES | -| -startdate | Start date (inclusive; Default: `1970-01-01`) | NO | -| -enddate | End date (inclusive) | NO | -| -logs | IoTDB log storage path (Default: `{iotdb}/logs`) | NO | - -#### Cluster Data Backup - -To back up data from the cluster: - -```Bash -iotdbctl cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas' -``` - -This command identifies the leader node from the YAML file and backs up data within the specified date range to the `/iotdb/datas` directory on the `192.168.9.48` server. 
- -| **Command** | **Description** | **Mandatory** | -| ------------ | ------------------------------------------------------------ | ------------- | -| -h | IP address of the backup server | NO | -| -u | Username for the backup server | NO | -| -pw | Password for the backup server | NO | -| -p | Backup server port (Default: `22`) | NO | -| -path | Path for storing backup data (Default: current path) | NO | -| -granularity | Data partition granularity | YES | -| -l | Speed limit (Default: unlimited; Range: 0 to 104857601 Kbit/s) | NO | -| -startdate | Start date (inclusive) | YES | -| -enddate | End date (inclusive) | YES | - -#### Cluster Upgrade - -To upgrade the cluster: - -```Bash -iotdbctl cluster dist-lib default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name. -2. Retrieve the configuration of `confignode_servers` and `datanode_servers`. -3. Upload the library package. - -**Note:** After the upgrade, restart all IoTDB nodes for the changes to take effect. - -#### Cluster Initialization - -To initialize the cluster: - -```Bash -iotdbctl cluster init default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name. -2. Retrieve configuration details for `confignode_servers`, `datanode_servers`, `Grafana`, and `Prometheus`. -3. Initialize the cluster configuration. - -#### View Cluster Process - -To check the cluster process status: - -```Bash -iotdbctl cluster status default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name. -2. Retrieve configuration details for `confignode_servers`, `datanode_servers`, `Grafana`, and `Prometheus`. -3. Display the operational status of each node in the cluster. - -#### Cluster Authorization Activation - -**Default Activation Method:** To activate the cluster using an activation code: - -```Bash -iotdbctl cluster activate default_cluster -``` - -**Execution Steps:** - -1. 
Locate the YAML file based on the cluster name. -2. Retrieve the `confignode_servers` configuration. -3. Obtain the machine code. -4. Enter the activation code when prompted. - -Example: - -```Bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=, lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful. -``` - -**Activate a Specific Node:** To activate a specific node: - -```Bash -iotdbctl cluster activate default_cluster -N confignode1 -``` - -**Activate via License Path:** To activate using a license file: - -```Bash -iotdbctl cluster activate default_cluster -op license_path -``` - -#### Cluster Health Check - -To perform a cluster health check: - -```Bash -iotdbctl cluster health_check default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name. -2. Retrieve configuration details for `confignode_servers` and `datanode_servers`. -3. Execute `health_check.sh` on each node. - -**Single Node Health Check:** To check a specific node: - -```Bash -iotdbctl cluster health_check default_cluster -N datanode_1 -``` - -#### Cluster Shutdown Backup - -To back up the cluster during shutdown: - -```Bash -iotdbctl cluster backup default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name. -2. Retrieve configuration details for `confignode_servers` and `datanode_servers`. -3. Execute `backup.sh` on each node. - -**Single Node Backup:** To back up a specific node: - -```Bash -iotdbctl cluster backup default_cluster -N datanode_1 -``` - -**Note:** Multi-node deployment on a single machine only supports quick mode. 
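The stop, backup, and start commands can be chained into one maintenance step. The sketch below is only an illustration under stated assumptions: it assumes the backup is meant to run while the cluster is stopped, that `iotdbctl cluster start` is the counterpart of the stop command, and that a non-zero exit status signals failure. The function name `offline_backup` and the `CTL` override variable are hypothetical.

```shell
#!/bin/sh
# Command to run; defaults to the iotdbctl tool described in this guide.
# CTL is a hypothetical override, handy for dry runs (e.g. CTL=echo).
CTL="${CTL:-iotdbctl}"

# offline_backup CLUSTER  (hypothetical helper)
# Stops the cluster, backs it up, then starts it again.
offline_backup() {
    cluster="$1"
    # Do not back up if the cluster could not be stopped.
    "$CTL" cluster stop "$cluster" || return 1
    "$CTL" cluster backup "$cluster"
    backup_status=$?
    # Try to bring the cluster back up even if the backup failed.
    "$CTL" cluster start "$cluster" || return 1
    # Report the backup result to the caller.
    return "$backup_status"
}
```

Running it with `CTL=echo` first shows the three commands in order, which is a cheap way to check the sequence before applying it to a real cluster.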
- -#### Cluster Metadata Import - -To import metadata: - -```Bash -iotdbctl cluster importschema default_cluster -N datanode1 -param "-s ./dump0.csv -fd ./failed/ -lpf 10000" -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name to retrieve `datanode_servers` configuration information. -2. Execute metadata import using `import-schema.sh` on `datanode1`. - -**Parameter Descriptions for `-param`:** - -| **Command** | **Description** | **Mandatory** | -| ----------- | ------------------------------------------------------------ | ------------- | -| -s | Specify the data file or directory to be imported. If a directory is specified, all files with a `.csv` extension will be imported in bulk. | YES | -| -fd | Specify a directory to store failed import files. If omitted, failed files will be saved in the source directory with the `.failed` suffix added to the original filename. | NO | -| -lpf | Specify the maximum number of lines per failed import file (Default: 10,000). | NO | - -#### Cluster Metadata Export - -To export metadata: - -```Bash -iotdbctl cluster exportschema default_cluster -N datanode1 -param "-t ./ -pf ./pattern.txt -lpf 10 -timeout 10000" -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name to retrieve `datanode_servers` configuration information. -2. Execute metadata export using `export-schema.sh` on `datanode1`. - -**Parameter Descriptions for `-param`:** - -| **Command** | **Description** | **Mandatory** | -| ----------- | ------------------------------------------------------------ | ------------- | -| -t | Specify the output path for the exported CSV file. | YES | -| -path | Specify the metadata path pattern for export. If this parameter is specified, the `-pf` parameter will be ignored. Example: `root.stock.**`. | NO | -| -pf | If `-path` is not specified, use this parameter to specify the file containing metadata paths to export. The file must be in `.txt` format, with one path per line. 
| NO | -| -lpf | Specify the maximum number of lines per exported file (Default: 10,000). | NO | -| -timeout | Specify the session query timeout in milliseconds. | NO | - -### 1.7 Introduction to Cluster Deployment Tool Samples - -In the cluster deployment tool installation directory (config/example), there are three YAML configuration examples. If needed, you can copy and modify them for your deployment. - -| **Name** | **Description** | -| -------------------- | -------------------------------------------------------- | -| default_1c1d.yaml | Example configuration for 1 ConfigNode and 1 DataNode. | -| default_3c3d.yaml | Example configuration for 3 ConfigNodes and 3 DataNodes. | -| default_3c3d_grafa_prome | Example configuration for 3 ConfigNodes, 3 DataNodes, Grafana, and Prometheus. | - -## 2. IoTDB Data Directory Overview Tool - -The IoTDB Data Directory Overview Tool provides an overview of the IoTDB data directory structure. It is located at `tools/tsfile/print-iotdb-data-dir`. - -### 2.1 Usage - -- For Windows: - -```Bash -.\print-iotdb-data-dir.bat () -``` - -- For Linux or MacOs: - -```Shell -./print-iotdb-data-dir.sh () -``` - -**Note:** If the output path is not specified, the default relative path `IoTDB_data_dir_overview.txt` will be used. - -### 2.2 Example - -Use Windows in this example: - -~~~Bash -.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data -```````````````````````` -Starting Printing the IoTDB Data Directory Overview -```````````````````````` -output save path:IoTDB_data_dir_overview.txt -data dir num:1 -143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
-|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -~~~ - -## 3. TsFile Sketch Tool - -The TsFile Sketch Tool provides a summarized view of the content within a TsFile. It is located at `tools/tsfile/print-tsfile`. - -### 3.1 Usage - -- For Windows: - -```Plain -.\print-tsfile-sketch.bat () -``` - -- For Linux or MacOs: - -```Plain -./print-tsfile-sketch.sh () -``` - -**Note:** If the output path is not specified, the default relative path `TsFile_sketch_view.txt` will be used. - -### 3.2 Example - -Use Windows in this example: - -~~~Bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
--------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] 
offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - └──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -~~~ - -Explanations: - -- The output is separated by the `|` symbol. The left side indicates the actual position within the TsFile, while the right side provides a summary of the content. -- The `"||||||||||||||||||||"` lines are added for readability and are not part of the actual TsFile data. -- The final `"IndexOfTimerseriesIndex Tree"` section reorganizes the metadata index tree at the end of the TsFile. This view aids understanding but does not represent actual stored data. - -## 4. TsFile Resource Sketch Tool - -The TsFile Resource Sketch Tool displays details about TsFile resource files. It is located at `tools/tsfile/print-tsfile-resource-files`. 
- -### 4.1 Usage - -- For Windows: - -```Bash -.\print-tsfile-resource-files.bat -``` - -- For Linux or MacOs: - -```Plain -./print-tsfile-resource-files.sh -``` - -### 4.2 Example - -Use Windows in this example: - -~~~Bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. 
-.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. -~~~ diff --git a/src/UserGuide/Master/Table/Tools-System/Monitor-Tool_timecho.md b/src/UserGuide/Master/Table/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index 418d62f82..000000000 --- a/src/UserGuide/Master/Table/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,194 +0,0 @@ - -# Monitor Tool - -## 1. **Prometheus** **Integration** - -### 1.1 **Prometheus Metric Mapping** - -The following table illustrates the mapping of IoTDB metrics to the Prometheus-compatible format. 
For a given metric with `Metric Name = name` and tags `K1=V1, ..., Kn=Vn`, the mapping follows this pattern, where `value` represents the actual measurement. - -| **Metric Type** | **Mapping** | -| ---------------- | ------------------------------------------------------------ | -| Counter | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value | -| AutoGauge, Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value | -| Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.5"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.99"} value | -| Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m1"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m5"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m15"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="mean"} value | -| Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.5"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.99"} value | -### 1.2 **Configuration File** - -To enable Prometheus metric collection in IoTDB: - -1. Taking DataNode as an example, set the following items in the `iotdb-system.properties` configuration file: - -```Properties -dn_metric_reporter_list=PROMETHEUS -dn_metric_level=CORE -dn_metric_prometheus_reporter_port=9091 -``` - -2. Start the IoTDB DataNodes. -3. Use a web browser or `curl` to access `http://server_ip:9091/metrics` and retrieve metric data, such as: - -```Plain -... -# HELP file_count -# TYPE file_count gauge -file_count{name="wal",} 0.0 -file_count{name="unseq",} 0.0 -file_count{name="seq",} 2.0 -... -``` - -### 1.3 **Prometheus + Grafana** **Integration** - -IoTDB exposes monitoring data in the standard Prometheus-compatible format. Prometheus collects and stores these metrics, while Grafana is used for visualization. - -**Integration Workflow** - -The following picture describes the relationships among IoTDB, Prometheus, and Grafana: - -![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png) - -IoTDB-Prometheus-Grafana Workflow - -1. IoTDB continuously collects monitoring metrics. -2. Prometheus collects metrics from IoTDB at a configurable interval. -3. Prometheus stores the collected metrics in its internal time-series database (TSDB). -4. Grafana queries Prometheus at a configurable interval and visualizes the metrics. 
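Before wiring up Prometheus, it can help to read a single sample straight from the `/metrics` endpoint and confirm the reporter is working. The sketch below is a hedged illustration, not part of IoTDB: the helper name `metric_value` is hypothetical, and it assumes sample lines shaped like the `file_count{name="seq",} 2.0` output shown in section 1.2, with `name` as the first label. Lines carrying additional leading labels would need a different match.

```shell
#!/bin/sh
# metric_value METRIC LABEL  (hypothetical helper)
# Reads Prometheus text-format output on stdin and prints the value of the
# sample whose line starts with METRIC{name="LABEL".
metric_value() {
    awk -v metric="$1" -v label="$2" '
        /^#/ { next }    # skip "# HELP" and "# TYPE" comment lines
        # The sample value is the last whitespace-separated field.
        index($0, metric "{name=\"" label "\"") == 1 { print $NF }
    '
}
```

A typical call would be `curl -s http://server_ip:9091/metrics | metric_value file_count seq`, where `server_ip` and port `9091` are the values configured for the DataNode.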
- -**Prometheus Configuration Example** - -To configure Prometheus to collect IoTDB metrics, modify the `prometheus.yml` file as follows: - -```YAML -job_name: pull-metrics -honor_labels: true -honor_timestamps: true -scrape_interval: 15s -scrape_timeout: 10s -metrics_path: /metrics -scheme: http -follow_redirects: true -static_configs: - - targets: - - localhost:9091 -``` - -For more details, refer to: - -- Prometheus Documentation: - - [Prometheus getting_started](https://prometheus.io/docs/prometheus/latest/getting_started/) - - [Prometheus scrape metrics](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) -- Grafana Documentation: - - [Grafana getting_started](https://grafana.com/docs/grafana/latest/getting-started/getting-started/) - - [Grafana query metrics from Prometheus](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus) - -## 2. **Apache IoTDB Dashboard** - -We introduce the Apache IoTDB Dashboard, designed for unified centralized operations and management, which enables monitoring multiple clusters through a single panel. - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png) - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png) - -You can access the Dashboard's Json file in TimechoDB. - -### 2.1 **Cluster Overview** - -Including but not limited to: - -- Total number of CPU cores, memory capacity, and disk space in the cluster. -- Number of ConfigNodes and DataNodes in the cluster. -- Cluster uptime. -- Cluster write throughput. -- Current CPU, memory, and disk utilization across all nodes. -- Detailed information for individual nodes. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png) - -### 2.2 **Data Writing** - -Including but not limited to: - -- Average write latency, median latency, and the 99% percentile latency. -- Number and size of WAL files. -- WAL flush SyncBuffer latency per node. 
- -![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png) - -### 2.3 **Data Querying** - -Including but not limited to: - -- Time series metadata query load time per node. -- Time series data read duration per node. -- Time series metadata modification duration per node. -- Chunk metadata list loading time per node. -- Chunk metadata modification duration per node. -- Chunk metadata-based filtering duration per node. -- Average time required to construct a Chunk Reader. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png) - -### 2.4 **Storage Engine** - -Including but not limited to: - -- File count and size by type. -- Number and size of TsFiles at different processing stages. -- Task count and execution duration for various operations. - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png) - -### 2.5 **System Monitoring** - -Including but not limited to: - -- System memory, swap memory, and process memory usage. -- Disk space, file count, and file size statistics. -- JVM garbage collection (GC) time percentage, GC events by type, GC data volume, and heap memory utilization across generations. -- Network throughput and packet transmission rate. 
- -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png) - -### 2.6 Data Synchronization - -Including but not limited to: - -- Pipe event commit queue size, number of unassigned Pipe events -- Number of unprocessed events in the Source queue, Source event feeding rate, Processor event processing rate -- Number of untransmitted events for all Pipe Sinks/Sources, transmission event rate of Pipe connectors -- Retry queue size and pending handler count of Pipe Sinks; total data size before and after compression and compression duration of Pipe Sinks; batch size and batch interval distribution of Pipe Sinks -- Pipe memory usage and capacity, number of Pipe phantom references, quantity and total size of linked TsFiles, disk bytes read for TsFile transmission via Pipe - -![](/img/monitor-tool-pipe-1-en.png) - -![](/img/monitor-tool-pipe-2-en.png) - -![](/img/monitor-tool-pipe-3-en.png) - -![](/img/monitor-tool-pipe-4-en.png) \ No newline at end of file diff --git a/src/UserGuide/Master/Table/Tools-System/Schema-Export-Tool_timecho.md b/src/UserGuide/Master/Table/Tools-System/Schema-Export-Tool_timecho.md deleted file mode 100644 index acb88afa2..000000000 --- a/src/UserGuide/Master/Table/Tools-System/Schema-Export-Tool_timecho.md +++ /dev/null @@ -1,111 +0,0 @@ - - -# Schema Export - -## 1. Overview - -The schema export tool `export-schema.sh/bat` is located in the `tools` directory. It can export schema from a specified database in IoTDB to a script file. - -## 2. 
Detailed Functionality - -### 2.1 Parameter - -| **Short Param** | **Full Param** | **Description** | **Required** | **Default** | -| --- | --- | --- | --- | --- | -| `-h` | `--host` | Hostname | No | 127.0.0.1 | -| `-p` | `--port` | Port number | No | 6667 | -| `-u` | `--username` | Username | No | root | -| `-pw` | `--password` | Password. Hidden input is supported since V2.0.9.1 | No | `TimechoDB@2021` (`root` before V2.0.6) | -| `-sql_dialect` | `--sql_dialect` | Specifies whether the server uses the `tree` model or the `table` model | No | tree | -| `-db` | `--database` | Target database to export (only applies when `-sql_dialect=table`) | Required if `-sql_dialect=table` | - | -| `-table` | `--table` | Target table to export (only applies when `-sql_dialect=table`) | No | - | -| `-t` | `--target` | Output directory (created if it doesn't exist) | Yes | | -| `-path` | `--path_pattern` | Path pattern for metadata export | Required if `-sql_dialect=tree` | | -| `-pfn` | `--prefix_file_name` | Output filename prefix | No | `dump_dbname.sql` | -| `-lpf` | `--lines_per_file` | Maximum lines per dump file (only applies when `-sql_dialect=tree`) | No | `10000` | -| `-timeout` | `--query_timeout` | Query timeout in milliseconds (`-1` = no timeout) | No | `-1` (Range: `-1` to `9223372036854775807`, the maximum long value) | -| `-help` | `--help` | Display help information | No | | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - -### 2.2 Command - -```Bash -# Unix/OS X -> tools/export-schema.sh [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -# Windows -# Before version V2.0.4.x -> tools\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] - -# V2.0.4.x and later versions -> tools\windows\schema\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -``` - -### 2.3 Examples - -Export schema from `database1` to `/home`: - -```Bash -./export-schema.sh -sql_dialect table -t /home/ -db database1 -``` - -Output `dump_database1.sql`: - -```sql -DROP TABLE IF EXISTS table1; -CREATE TABLE table1( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -DROP TABLE IF EXISTS table2; -CREATE TABLE table2( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -``` diff --git a/src/UserGuide/Master/Table/Tools-System/Schema-Import-Tool_timecho.md b/src/UserGuide/Master/Table/Tools-System/Schema-Import-Tool_timecho.md deleted file mode 100644 index 929cc3011..000000000 --- a/src/UserGuide/Master/Table/Tools-System/Schema-Import-Tool_timecho.md +++ /dev/null @@ -1,166 +0,0 @@ - - -# Schema Import - -## 1. Overview - -The schema import tool `import-schema.sh/bat` is located in `tools` directory. - -## 2. 
Detailed Functionality - -### 2.1 Parameter - -| **Short Param** | **Full Param** | **Description** | **Required** | **Default** | -| --- | --- | --- | --- | --- | -| `-h` | `--host` | Hostname | No | 127.0.0.1 | -| `-p` | `--port` | Port number | No | 6667 | -| `-u` | `--username` | Username | No | root | -| `-pw` | `--password` | Password. Hidden input is supported since V2.0.9.1 | No | `TimechoDB@2021` (`root` before V2.0.6) | -| `-sql_dialect` | `--sql_dialect` | Specifies whether the server uses the `tree` model or the `table` model | No | tree | -| `-db` | `--database` | Target database for import | Yes | - | -| `-table` | `--table` | Target table for import (only applies when `-sql_dialect=table`) | No | - | -| `-s` | `--source` | Local directory path containing script file(s) to import | Yes | | -| `-fd` | `--fail_dir` | Directory to save failed import files | No | | -| `-lpf` | `--lines_per_failed_file` | Maximum lines per failed file (only applies when `-sql_dialect=table`) | No | `100000` (Range: `0` to `2147483647`, the maximum int value) | -| `-help` | `--help` | Display help information | No | | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - -### 2.2 Command - -```Bash -# Unix/OS X -tools/import-schema.sh [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# Windows -# Before version V2.0.4.x -tools\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# V2.0.4.x and later versions -tools\windows\schema\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] -``` - -### 2.3 Examples - -Import `dump_database1.sql` from `/home` into `database2`: - -```sql --- File content (dump_database1.sql): -DROP TABLE IF EXISTS table1; -CREATE TABLE table1( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -DROP TABLE IF EXISTS table2; -CREATE TABLE table2( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -``` - -Run the command: - -```Bash -./import-schema.sh -sql_dialect table -s /home/dump_database1.sql -db database2 - -# If database2 doesn't exist -The target database database2 does not exist - -# If database2 exists -Import completely! -``` - -Verification: - -```Bash -# Before import -IoTDB:database2> show tables -+---------+-------+ -|TableName|TTL(ms)| -+---------+-------+ -+---------+-------+ -Empty set. 
- -# After import -IoTDB:database2> show tables details -+---------+-------+------+-------+ -|TableName|TTL(ms)|Status|Comment| -+---------+-------+------+-------+ -| table2| INF| USING| null| -| table1| INF| USING| null| -+---------+-------+------+-------+ - -IoTDB:database2> desc table1 -+------------+---------+---------+ -| ColumnName| DataType| Category| -+------------+---------+---------+ -| time|TIMESTAMP| TIME| -| region| STRING| TAG| -| plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model_id| STRING|ATTRIBUTE| -| maintenance| STRING|ATTRIBUTE| -| temperature| FLOAT| FIELD| -| humidity| FLOAT| FIELD| -| status| BOOLEAN| FIELD| -|arrival_time|TIMESTAMP| FIELD| -+------------+---------+---------+ - -IoTDB:database2> desc table2 -+------------+---------+---------+ -| ColumnName| DataType| Category| -+------------+---------+---------+ -| time|TIMESTAMP| TIME| -| region| STRING| TAG| -| plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model_id| STRING|ATTRIBUTE| -| maintenance| STRING|ATTRIBUTE| -| temperature| FLOAT| FIELD| -| humidity| FLOAT| FIELD| -| status| BOOLEAN| FIELD| -|arrival_time|TIMESTAMP| FIELD| -+------------+---------+---------+ -``` diff --git a/src/UserGuide/Master/Table/User-Manual/Audit-Log_timecho.md b/src/UserGuide/Master/Table/User-Manual/Audit-Log_timecho.md deleted file mode 100644 index c2b4fcaa2..000000000 --- a/src/UserGuide/Master/Table/User-Manual/Audit-Log_timecho.md +++ /dev/null @@ -1,165 +0,0 @@ - - - -# Security Audit - -## 1. Introduction - -Audit logs record database activity so that operations (e.g., create, read, update, delete) can be traced, helping to ensure information security. 
The audit log feature in IoTDB supports the following capabilities: - -* Supports enabling/disabling the audit log functionality through configuration -* Supports configuring operation types and privilege levels to be recorded via parameters -* Supports setting the storage duration of audit log files, including time-based rolling (via TTL) and space-based rolling (via SpaceTL) -* Supports configuring parameters to count slow requests (with write/query latency exceeding a threshold, default 3000 milliseconds) within any specified time period -* Audit log files are stored in encrypted format by default - -> Note: This feature is available from version V2.0.8 onwards. - -## 2. Configuration Parameters - -Edit the `iotdb-system.properties` file to enable audit logging using the following parameters: - - -* V2.0.8.1 - -| Parameter Name | Description | Data Type | Default Value | Activation Method | -|-------------------------------------------|------------------------------------------------------------------------------------------------------------|-----------|-------------------------------|-------------------| -| `enable_audit_log` | Whether to enable audit logging. true: enabled. false: disabled. | Boolean | false | Hot Reload | -| `auditable_operation_type` | Operation type selection. DML: all DML operations are logged; DDL: all DDL operations are logged; QUERY: all query operations are logged; CONTROL: all control statements are logged. | String | DML,DDL,QUERY,CONTROL | Hot Reload | -| `auditable_operation_level` | Permission level selection. global: log all audit events; object: only log events related to data instances. Containment relationship: object < global. For example: when set to global, all audit logs are recorded normally; when set to object, only operations on specific data instances are recorded. | String | global | Hot Reload | -| `auditable_operation_result` | Audit result selection. 
success: log only successful events; fail: log only failed events | String | success,fail | Hot Reload | -| `audit_log_ttl_in_days` | Audit log TTL (Time To Live). Logs older than this threshold will expire. | Double | -1.0 (never deleted) | Hot Reload | -| `audit_log_space_tl_in_GB` | Audit log SpaceTL. Logs will start rotating when total space reaches this threshold. | Double | 1.0 | Hot Reload | -| `audit_log_batch_interval_in_ms` | Batch write interval for audit logs | Long | 1000 | Hot Reload | -| `audit_log_batch_max_queue_bytes` | Maximum byte size of the queue for batch processing audit logs. Subsequent write operations will be blocked when this threshold is exceeded. | Long | 268435456 | Hot Reload | - -* V2.0.9.2 - -| Parameter Name | Description | Data Type | Default Value | Activation Method | -|-------------------------------------------|------------------------------------------------------------------------------------------------------------|-----------|-------------------------------|-------------------| -| `enable_audit_log` | Whether to enable audit logging. true: enabled. false: disabled. | Boolean | false | Hot Reload | -| `auditable_operation_type` | Operation type selection. DML: all DML operations are logged; DDL: all DDL operations are logged; QUERY: all query operations are logged; CONTROL: all control statements are logged. | String | DML,DDL,QUERY,CONTROL | Hot Reload | -| `auditable_dml_event_type` | Event types for auditing DML operations. `OBJECT_AUTHENTICATION`: object authentication, `SLOW_OPERATION`: slow operation | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | Hot Reload | -| `auditable_ddl_event_type` | Event types for auditing DDL operations. `OBJECT_AUTHENTICATION`: object authentication, `SLOW_OPERATION`: slow operation | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | Hot Reload | -| `auditable_query_event_type` | Event types for auditing query operations. 
`OBJECT_AUTHENTICATION`: object authentication, `SLOW_OPERATION`: slow operation | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | Hot Reload | -| `auditable_control_event_type` | Event types for auditing control operations. `CHANGE_AUDIT_OPTION`: audit option change, `OBJECT_AUTHENTICATION`: object authentication, `LOGIN`: login, `LOGOUT`: logout, `DN_SHUTDOWN`: data node shutdown, `SLOW_OPERATION`: slow operation | String | `CHANGE_AUDIT_OPTION`,`OBJECT_AUTHENTICATION`,`LOGIN`,`LOGOUT`,`DN_SHUTDOWN`,`SLOW_OPERATION` | Hot Reload | -| `auditable_operation_level` | Permission level selection. global: log all audit events; object: only log events related to data instances. Containment relationship: object < global. For example: when set to global, all audit logs are recorded normally; when set to object, only operations on specific data instances are recorded. | String | global | Hot Reload | -| `auditable_operation_result` | Audit result selection. success: log only successful events; fail: log only failed events | String | success,fail | Hot Reload | -| `audit_log_ttl_in_days` | Audit log TTL (Time To Live). Logs older than this threshold will expire. | Double | -1.0 (never deleted) | Hot Reload | -| `audit_log_space_tl_in_GB` | Audit log SpaceTL. Logs will start rotating when total space reaches this threshold. | Double | 1.0 | Hot Reload | -| `audit_log_batch_interval_in_ms` | Batch write interval for audit logs | Long | 1000 | Hot Reload | -| `audit_log_batch_max_queue_bytes` | Maximum byte size of the queue for batch processing audit logs. Subsequent write operations will be blocked when this threshold is exceeded. | Long | 268435456 | Hot Reload | - -**Instructions for Object Authentication and Slow Operations:** -- When the parameters `auditable_dml_event_type`, `auditable_ddl_event_type`, `auditable_query_event_type`, or `auditable_control_event_type` are set to `OBJECT_AUTHENTICATION`, the corresponding event types will be recorded in the audit log. 
-- When the parameters `auditable_dml_event_type`, `auditable_ddl_event_type`, `auditable_query_event_type`, or `auditable_control_event_type` are set to `SLOW_OPERATION`, only operations of the corresponding type whose execution time exceeds the value of the `slow_query_threshold` parameter (default: 3000 ms) will be recorded in the audit log. The value of the `slow_query_threshold` parameter can be configured in the `iotdb-system.properties` file. - - -## 3. Access Methods - -Audit logs can be read directly via SQL. - -### 3.1 SQL Syntax - -```SQL -SELECT (<audit_log_field>, )* log FROM <AUDIT_LOG_PATH> WHERE whereclause ORDER BY order_expression -``` - -Where: - -* `AUDIT_LOG_PATH`: The audit log storage location, `__audit.audit_log`. -* `audit_log_field`: The fields to query; see the metadata structure below. -* WHERE clause filtering and ORDER BY sorting are supported. - -### 3.2 Metadata Structure - -| Field | Description | Data Type | -|------------------------|--------------------------------------------------|----------------| -| `time` | The date and time when the event started | timestamp | -| `username` | User name | string | -| `cli_hostname` | Client hostname identifier | string | -| `audit_event_type` | Audit event type, e.g., WRITE_DATA, GENERATE_KEY| string | -| `operation_type` | Operation type, e.g., DML, DDL, QUERY, CONTROL | string | -| `privilege_type` | Privilege used, e.g., WRITE_DATA, MANAGE_USER | string | -| `privilege_level` | Event privilege level, global or object | string | -| `result` | Event result, success=1, fail=0 | boolean | -| `database` | Database name | string | -| `sql_string` | User's original SQL statement | string | -| `log` | Detailed event description | string | - -### 3.3 Usage Examples - -* Query times, usernames and host information for successfully executed DML operations: - -```SQL -IoTDB:__audit> select time,username,cli_hostname from audit_log where result = true and operation_type='DML' -+-----------------------------+--------+------------+ -| 
time|username|cli_hostname| -+-----------------------------+--------+------------+ -|2026-01-23T11:43:46.697+08:00| root| 127.0.0.1| -|2026-01-23T11:45:39.950+08:00| root| 127.0.0.1| -+-----------------------------+--------+------------+ -Total line number = 2 -It costs 0.284s -``` - -* Query latest operation details: - -```SQL -IoTDB:__audit> select time,username,cli_hostname,operation_type,sql_string from audit_log order by time desc limit 1 -+-----------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------+ -| time|username|cli_hostname|operation_type| sql_string| -+-----------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------+ -|2026-01-23T11:46:31.026+08:00| root| 127.0.0.1| QUERY|select time,username,cli_hostname,operation_type,sql_string from audit_log order by time desc limit 1| -+-----------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------+ -Total line number = 1 -It costs 0.053s -``` - -* Query failed operations: - -```SQL -IoTDB:__audit> select time,database,operation_type,log from audit_log where result=false -+-----------------------------+--------+--------------+----------------------------------------------------------------------+ -| time|database|operation_type| log| -+-----------------------------+--------+--------------+----------------------------------------------------------------------+ -|2026-01-23T11:47:42.136+08:00| | CONTROL|User user1 (ID=-1) login failed with code: 804, Authentication failed.| -+-----------------------------+--------+--------------+----------------------------------------------------------------------+ -Total line number = 1 -It costs 0.011s -``` - - -* Query audit event records with types 'slow 
operation' - -```SQL -IoTDB:__audit> select * from audit_log where audit_event_type='SLOW_OPERATION' limit 3 -+-----------------------------+-------+-------+--------+------------+----------------+--------------+--------------+---------------+------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------+ -| time|node_id|user_id|username|cli_hostname|audit_event_type|operation_type|privilege_type|privilege_level|result| database| sql_string| log| -+-----------------------------+-------+-------+--------+------------+----------------+--------------+--------------+---------------+------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------+ -|2026-05-06T14:57:57.468+08:00| node_1| u_0| root| 127.0.0.1| SLOW_OPERATION| QUERY| [SELECT]| OBJECT| true| | show databases| SLOW_QUERY: cost 10 ms, show databases| -|2026-05-06T14:58:38.149+08:00| node_1| u_0| root| 127.0.0.1| SLOW_OPERATION| DML| [INSERT]| OBJECT| true|database1|INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2024-11-26 13:37:00', 90.0, 35.1, true, '2024-11-26 13:37:34'), ('北京', '1001', '100', 'A', '180', '2024-11-26 13:38:00', 90.0, 35.1, true, '2024-11-26 13:38:25'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:38:00', NULL, 35.1, true, '2024-11-27 16:37:01'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:39:00', 85.0, 35.3, NULL, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:40:00', 85.0, NULL, NULL, '2024-11-27 16:37:03'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:41:00', 85.0, NULL, NULL, '2024-11-27 16:37:04'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:42:00', NULL, 35.2, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:43:00', NULL, Null, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:44:00', NULL, Null, false, '2024-11-27 16:37:08'), ('上海', '3001', '100', 'C', '90', '2024-11-28 08:00:00', 85.0, Null, NULL, '2024-11-28 08:00:09'), ('上海', '3001', '100', 'C', '90', '2024-11-28 09:00:00', NULL, 40.9, true, NULL), ('上海', '3001', '100', 'C', '90', '2024-11-28 10:00:00', 85.0, 35.2, NULL, '2024-11-28 10:00:11'), ('上海', '3001', '100', 'C', '90', '2024-11-28 11:00:00', 88.0, 45.1, true, '2024-11-28 11:00:12'), ('上海', '3001', '101', 'D', '360', '2024-11-29 10:00:00', 85.0, NULL, NULL, '2024-11-29 10:00:13'), ('上海', '3002', '100', 'E', '180', '2024-11-29 11:00:00', NULL, 45.1, true, NULL), ('上海', '3002', '100', 'E', '180', '2024-11-29 18:30:00', 90.0, 35.4, true, '2024-11-29 18:30:15'), ('上海', '3002', 
'101', 'F', '360', '2024-11-30 09:30:00', 90.0, 35.2, true, NULL), ('上海', '3002', '101', 'F', '360', '2024-11-30 14:30:00', 90.0, 34.8, true, '2024-11-30 14:30:17')|Execution: INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2024-11-26 13:37:00', 90.0, 35.1, true, '2024-11-26 13:37:34'), ('北京', '1001', '100', 'A', '180', '2024-11-26 13:38:00', 90.0, 35.1, true, '2024-11-26 13:38:25'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:38:00', NULL, 35.1, true, '2024-11-27 16:37:01'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:39:00', 85.0, 35.3, NULL, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:40:00', 85.0, NULL, NULL, '2024-11-27 16:37:03'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:41:00', 85.0, NULL, NULL, '2024-11-27 16:37:04'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:42:00', NULL, 35.2, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:43:00', NULL, Null, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:44:00', NULL, Null, false, '2024-11-27 16:37:08'), ('上海', '3001', '100', 'C', '90', '2024-11-28 08:00:00', 85.0, Null, NULL, '2024-11-28 08:00:09'), ('上海', '3001', '100', 'C', '90', '2024-11-28 09:00:00', NULL, 40.9, true, NULL), ('上海', '3001', '100', 'C', '90', '2024-11-28 10:00:00', 85.0, 35.2, NULL, '2024-11-28 10:00:11'), ('上海', '3001', '100', 'C', '90', '2024-11-28 11:00:00', 88.0, 45.1, true, '2024-11-28 11:00:12'), ('上海', '3001', '101', 'D', '360', '2024-11-29 10:00:00', 85.0, NULL, NULL, '2024-11-29 10:00:13'), ('上海', '3002', '100', 'E', '180', '2024-11-29 11:00:00', NULL, 45.1, true, NULL), ('上海', '3002', '100', 'E', '180', '2024-11-29 18:30:00', 90.0, 35.4, true, '2024-11-29 18:30:15'), ('上海', '3002', '101', 'F', '360', '2024-11-30 09:30:00', 90.0, 35.2, true, NULL), ('上海', '3002', '101', 'F', '360', '2024-11-30 14:30:00', 90.0, 34.8, true, '2024-11-30 14:30:17') cost 329 ms, 
with status code: TSStatus(code:200, message:)| -|2026-05-06T14:58:45.534+08:00| node_1| u_0| root| 127.0.0.1| SLOW_OPERATION| QUERY| [SELECT]| OBJECT| true|database1| select * from table1| SLOW_QUERY: cost 121 ms, select * from table1| -+-----------------------------+-------+-------+--------+------------+----------------+--------------+--------------+---------------+------+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 3 -It costs 0.026s -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Table/User-Manual/Authority-Management-Upgrade_timecho.md b/src/UserGuide/Master/Table/User-Manual/Authority-Management-Upgrade_timecho.md deleted file mode 100644 index 5056f75bc..000000000 --- a/src/UserGuide/Master/Table/User-Manual/Authority-Management-Upgrade_timecho.md +++ /dev/null @@ -1,495 +0,0 @@ - -# Authority Management - -IoTDB provides permission management capabilities to deliver fine-grained access control for data and cluster systems, ensuring data and system security. This document introduces the basic concepts of the permission module under the IoTDB Table Model, user specifications, permission governance, authentication logic, and practical application examples. - -## 1. Basic Concepts -### 1.1 User -A user is a legitimate database operator. Each user is identified by a unique username and authenticated with a password. To use the database, users must provide valid usernames and passwords stored in the system. - -### 1.2 Privilege -The database supports a variety of operations, but not all users are authorized to perform every action. A user is considered to have the corresponding privilege if permitted to execute a specific operation. - -### 1.3 Role -A role is a collection of privileges marked by a unique role identifier. Roles correspond to real-world job identities (e.g., traffic dispatchers), and one identity may cover multiple users. Users with identical job identities usually share the same set of permissions. Roles serve as an abstraction to realize unified permission management for such user groups. - -### 1.4 Default Users and Roles -After initialization, IoTDB provides a default user `root` with the default password `TimechoDB@2021`. 
As the administrator account, root owns all privileges by default. Its permissions cannot be granted or revoked, and the account cannot be deleted. There is only one administrator user in the database. -Newly created users and roles have no permissions by default. - -## 2. User Specifications -Users with the `SECURITY` privilege can create users and roles, and all creations must comply with the following constraints: - -### 2.1 Username Rules -- Length: 4 to 32 characters. Supports uppercase and lowercase letters, digits, and special symbols (`!@#$%^&*()_+-=`). Usernames identical to the administrator account are not allowed. -- If a username consists entirely of digits or contains special characters, it must be enclosed in double quotation marks `""` during creation. - -### 2.2 Password Rules -Length: 12 to 32 characters. A valid password must contain both uppercase and lowercase letters, at least one digit, and at least one special symbol (`!@#$%^&*()_+-=`). A password cannot be the same as the associated username. - -### 2.3 Role Name Rules -Length: 4 to 32 characters. Supports uppercase and lowercase letters, digits, and special symbols (`!@#$%^&*()_+-=`). Role names identical to the administrator account are prohibited. - -## 3. Permission Management -Under the IoTDB Table Model, permissions are divided into two major categories: global privileges and data privileges. - -### 3.1 Global Privileges -Global privileges include three types: `SYSTEM`, `SECURITY`, and `AUDIT`: -- **SYSTEM**: Covers privileges for O&M operations and Data Definition Language (DDL) tasks. -- **SECURITY**: Covers management of users and roles, as well as privilege assignment for other accounts. -- **AUDIT**: Covers audit rule maintenance and audit log viewing. - -Detailed descriptions of each global privilege are shown in the table below: - -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Privilege Category | Original Name | Description |
| --- | --- | --- |
| SYSTEM | N/A | Allows users to create, alter and drop databases. |
| SYSTEM | N/A | Allows users to create, alter and drop tables and table views. |
| SYSTEM | N/A | Allows users to create, drop and query user-defined functions. |
| SYSTEM | N/A | Allows users to create, start, stop, drop and view PIPE tasks; create, drop and view PIPEPLUGINS. |
| SYSTEM | N/A | Allows users to execute and cancel queries, view system variables, and check cluster status. |
| SYSTEM | N/A | Allows users to create, drop and query deep learning models. |
| SECURITY | MANAGE_USER | Allows users to create and drop users, modify user passwords, view user privilege information, and list all users. |
| SECURITY | MANAGE_ROLE | Allows users to create and drop roles, view role privilege information, grant roles to users, revoke roles from users, and list all roles. |
| AUDIT | N/A | Allows users to maintain audit log rules and view audit logs. |
- -### 3.2 Data Privileges -Data privileges consist of privilege types and effective scopes. - -- **Privilege Types**: `CREATE`, `DROP`, `ALTER`, `SELECT`, `INSERT`, `DELETE`. -- **Scopes**: `ANY` (system-wide), `DATABASE` (database-level), `TABLE` (single table). - - Privileges with the `ANY` scope apply to all databases and tables. - - Database-level privileges apply to the specified database and all tables under it. - - Table-level privileges only take effect on the target single table. -- **Scope Matching Logic**: When performing single-table operations, the system verifies permissions by scope priority. For example, when writing data to `DATABASE1.TABLE1`, the system checks write permissions in sequence for `ANY`, `DATABASE1` and `DATABASE1.TABLE1` until a matching privilege is found or the check fails. - -The logical relationship between privilege types, scopes and capabilities is shown below: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Privilege Type | Permission Scope (Level) | Capability Description |
| --- | --- | --- |
| CREATE | ANY | Allows creating any databases and tables. |
| CREATE | Database | Allows creating the specified database and creating tables under this database. |
| CREATE | Table | Allows creating the specified table. |
| DROP | ANY | Allows dropping any databases and tables. |
| DROP | Database | Allows dropping the specified database and all tables under it. |
| DROP | Table | Allows dropping the specified table. |
| ALTER | ANY | Allows modifying the definitions of any databases and tables. |
| ALTER | Database | Allows modifying database definitions and the structure of all tables under the database. |
| ALTER | Table | Allows modifying the table structure and definition. |
| SELECT | ANY | Allows querying data from any table across all databases. |
| SELECT | Database | Allows querying data from all tables in the specified database. |
| SELECT | Table | Allows querying data in the specified table. For multi-table queries, only accessible data with valid permissions will be displayed. |
| INSERT | ANY | Allows inserting and updating data in any tables of all databases. |
| INSERT | Database | Allows inserting and updating data in all tables under the specified database. |
| INSERT | Table | Allows inserting and updating data in the specified table. |
| DELETE | ANY | Allows deleting data from any tables. |
| DELETE | Database | Allows deleting data within the specified database. |
| DELETE | Table | Allows deleting data in the specified table. |
- -### 3.3 Privilege Granting and Revocation -IoTDB supports privilege assignment and revocation through three methods: -- Direct granting or revocation by the super administrator. -- Granting or revocation by users with the `GRANT OPTION` privilege. -- Granting or revocation via role configuration, operated by the super administrator or users with `SECURITY` privileges. - -The following rules apply to permission management in the IoTDB Table Model: -- No scope needs to be specified when granting or revoking global privileges. -- Data privilege operations require explicit privilege types and scopes. Revocation only takes effect on the designated scope and is not affected by hierarchical inclusion relationships. -- Pre-authorization is supported for databases and tables that have not been created yet. -- Repeated granting or revocation of privileges is permitted. -- **WITH GRANT OPTION**: Authorizes users to manage permissions within their granted scope, including granting and revoking privileges for other users. - -## 4. Syntax and Usage Examples -### 4.1 User and Role Management -1. **Create User** (Requires `SECURITY` privilege) -```SQL -CREATE USER --- Example -CREATE USER user1 'Passwd@202604' -``` - -2. **Modify Password** - Users can change their own passwords; modifying other users' passwords requires the `SECURITY` privilege. -```SQL -ALTER USER SET PASSWORD --- Example -ALTER USER tempuser SET PASSWORD 'Newpwd@202604' -``` - -3. **Drop User** (Requires `SECURITY` privilege) -```SQL -DROP USER --- Example -DROP USER user1 -``` - -4. **Create Role** (Requires `SECURITY` privilege) -```SQL -CREATE ROLE --- Example -CREATE ROLE role1 -``` - -5. **Drop Role** (Requires `SECURITY` privilege) -```SQL -DROP ROLE --- Example -DROP ROLE role1 -``` - -6. **Grant Role to User** (Requires `SECURITY` privilege) -```SQL -GRANT ROLE TO --- Example -GRANT ROLE admin TO user1 -``` - -7. 
**Revoke Role from User** (Requires `SECURITY` privilege) -```SQL -REVOKE ROLE FROM --- Example -REVOKE ROLE admin FROM user1 -``` - -8. **List All Users** (Requires `SECURITY` privilege) -```SQL -LIST USER -``` - -9. **List All Roles** (Requires `SECURITY` privilege) -```SQL -LIST ROLE -``` - -10. **List Users Under a Specified Role** (Requires `SECURITY` privilege) -```SQL -LIST USER OF ROLE --- Example -LIST USER OF ROLE roleuser -``` - -11. **List Roles of a Specified User** - Users can view their own roles; viewing roles of other users requires the `SECURITY` privilege. -```SQL -LIST ROLE OF USER --- Example -LIST ROLE OF USER tempuser -``` - -12. **List All Privileges of a User** - Users can view their own privileges; viewing privileges of other users requires the `SECURITY` privilege. -```SQL -LIST PRIVILEGES OF USER --- Example -LIST PRIVILEGES OF USER tempuser -``` - -13. **List All Privileges of a Role** - Users can view privileges of their bound roles; viewing privileges of other roles requires the `SECURITY` privilege. -```SQL -LIST PRIVILEGES OF ROLE --- Example -LIST PRIVILEGES OF ROLE actor -``` - -### 4.2 Granting and Revoking Privileges -#### 4.2.1 Grant Privileges -1. Grant user management privileges to a user -```SQL -GRANT SECURITY TO USER --- Example -GRANT SECURITY TO USER TEST_USER -``` - -2. Grant database and table creation privileges with independent permission management rights -```SQL -GRANT CREATE ON DATABASE TO USER WITH GRANT OPTION --- Example -GRANT CREATE ON DATABASE TESTDB TO USER TEST_USER WITH GRANT OPTION -``` - -3. Grant database query privileges to a role -```SQL -GRANT SELECT ON DATABASE TO ROLE --- Example -GRANT SELECT ON DATABASE TESTDB TO ROLE TEST_ROLE -``` - -4. Grant table query privileges to a user -```SQL -GRANT SELECT ON . TO USER --- Example -GRANT SELECT ON TESTDB.TESTTABLE TO USER TEST_USER -``` - -5. 
Grant global cross-database query privileges to a role -```SQL -GRANT SELECT ON ANY TO ROLE --- Example -GRANT SELECT ON ANY TO ROLE TEST_ROLE -``` - -6. **ALL Syntax Sugar**: `ALL` represents all available privileges within the target scope -```SQL --- Grant all global privileges and full data privileges under the ANY scope -GRANT ALL TO USER TESTUSER - --- Grant all data privileges covering the entire system scope -GRANT ALL ON ANY TO USER TESTUSER - --- Grant all data privileges for the specified database -GRANT ALL ON DATABASE TESTDB TO USER TESTUSER - --- Grant all data privileges for the specified single table -GRANT ALL ON TABLE TESTTABLE TO USER TESTUSER -``` - -#### 4.2.2 Revoke Privileges -1. Revoke user management privileges -```SQL -REVOKE SECURITY FROM USER --- Example -REVOKE SECURITY FROM USER TEST_USER -``` - -2. Revoke database and table creation privileges -```SQL -REVOKE CREATE ON DATABASE FROM USER --- Example -REVOKE CREATE ON DATABASE TESTDB FROM USER TEST_USER -``` - -3. Revoke table query privileges -```SQL -REVOKE SELECT ON . FROM USER --- Example -REVOKE SELECT ON TESTDB.TESTTABLE FROM USER TEST_USER -``` - -4. Revoke global cross-database query privileges -```SQL -REVOKE SELECT ON ANY FROM USER --- Example -REVOKE SELECT ON ANY FROM USER TEST_USER -``` - -5. **ALL Syntax Sugar** for privilege revocation -```SQL --- Revoke all global privileges and ANY-scoped data privileges -REVOKE ALL FROM USER TESTUSER - --- Only revoke data privileges under the ANY scope -REVOKE ALL ON ANY FROM USER TESTUSER - --- Only revoke all data privileges of the specified database -REVOKE ALL ON DATABASE TESTDB FROM USER TESTUSER - --- Only revoke all data privileges of the specified table -REVOKE ALL ON TABLE TESTTABLE FROM USER TESTUSER -``` - -### 4.3 View User Privileges -Each user has an independent privilege list that records all authorized permissions. -Use `LIST PRIVILEGES OF USER ` to query detailed privileges of a user or role. 
The output format is as follows: - -| ROLE | SCOPE | PRIVILEGE | WITH GRANT OPTION | -|-------|---------|-----------|------------------| -| | DB1.TB1 | SELECT | FALSE | -| | | SECURITY | TRUE | -| ROLE1 | DB2.TB2 | UPDATE | TRUE | -| ROLE1 | DB3.* | DELETE | FALSE | -| ROLE1 | *.* | UPDATE | TRUE | - -- **ROLE**: Blank for self-owned user privileges; displays the role name if the permission is inherited from a role. -- **SCOPE**: Table-level scope displayed as `DB.TABLE`, database-level as `DB.*`, and global ANY scope as `*.*`. -- **PRIVILEGE**: Lists specific authorized permission types. -- **WITH GRANT OPTION**: `TRUE` means the user can grant or revoke permissions within the corresponding scope. -- Users and roles can hold permissions for both the Tree Model and Table Model simultaneously. The system only displays permissions applicable to the currently connected model, while permissions for the other model will be hidden. - -## 5. Practical Scenario Example -Based on the [Sample Data](../Reference/Sample-Data.md), data in different tables belongs to two independent data centers (bj and sh). To prevent unauthorized cross-center data access, permission isolation needs to be configured at the data center level. - -### 5.1 Create Users -Use `CREATE USER` to create new users. For example, the root administrator creates two dedicated write users for the BJ and SH data centers: `bj_write_user` and `sh_write_user`, with a unified password `write_Pwd@2026`. 
- -```SQL -CREATE USER bj_write_user 'write_Pwd@2026'; -CREATE USER sh_write_user 'write_Pwd@2026'; -``` - -Execute the user query statement: -```SQL -LIST USER -``` - -Query result: -``` -+------+-------------+-----------------+-----------------+ -|UserId| User|MaxSessionPerUser|MinSessionPerUser| -+------+-------------+-----------------+-----------------+ -| 0| root| -1| 1| -| 10000|bj_write_user| -1| -1| -| 10001|sh_write_user| -1| -1| -+------+-------------+-----------------+-----------------+ -``` - -### 5.2 Grant Privileges -Newly created users have no permissions by default and cannot perform database operations. For example, an insertion executed by `bj_write_user` will fail: -```SQL -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('Beijing', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -``` - -Error prompt: -``` -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: database is not specified -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: DATABASE database1 -``` - -Grant table write privileges to `bj_write_user` via the root account: -```SQL -GRANT INSERT ON database1.table1 TO USER bj_write_user -``` - -Retry data insertion after switching to the target database: -```SQL -IoTDB> use database1 -Msg: The statement is executed successfully. -IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('Beijing', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: The statement is executed successfully. 
-``` - -### 5.3 Revoke Privileges -Use the `REVOKE` statement to reclaim granted permissions: -```SQL -REVOKE INSERT ON database1.table1 FROM USER bj_write_user -REVOKE INSERT ON database1.table2 FROM USER sh_write_user -``` - -After revocation, `bj_write_user` no longer has write access to table1: -``` -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: No permissions for this operation, please add privilege INSERT ON database1.table1 -``` - \ No newline at end of file diff --git a/src/UserGuide/Master/Table/User-Manual/Authority-Management_timecho.md b/src/UserGuide/Master/Table/User-Manual/Authority-Management_timecho.md deleted file mode 100644 index 3ac042003..000000000 --- a/src/UserGuide/Master/Table/User-Manual/Authority-Management_timecho.md +++ /dev/null @@ -1,494 +0,0 @@ - - -# Authority Management - -IoTDB provides permission management functionality to implement fine-grained access control for data and cluster systems, ensuring data and system security. This document introduces the basic concepts, user definitions, permission management, authentication logic, and functional use cases of the permission module in IoTDB's table model. - -## 1. Basic Concepts - -### 1.1 User - -A **user** is a legitimate database user. Each user is associated with a unique username and authenticated via a password. Before accessing the database, a user must provide valid credentials (a username and password that exist in the database). - -### 1.2 Permission - -A database supports multiple operations, but not all users can perform every operation. If a user is authorized to execute a specific operation, they are said to have the **permission** for that operation. - -### 1.3 Role - -A **role** is a collection of permissions, identified by a unique role name. Roles typically correspond to real-world identities (e.g., "traffic dispatcher"), where a single identity may encompass multiple users. 
Users sharing the same real-world identity often require the same set of permissions, and roles abstract this grouping for unified management. - -### 1.4 Default User and Role - -Upon initialization, IoTDB includes a default user: - -* **Username**: `root` -* **Default password**: `TimechoDB@2021` (before V2.0.6, the default password is `root`) - -The `root` user is the **administrator** and inherently possesses all permissions. Its permissions cannot be granted or revoked, and the account cannot be deleted. The database maintains only one administrator user. Newly created users and roles start with **no permissions** by default. - -## 2. Permission List - -In IoTDB's table model, there are two main types of permissions: Global Permissions and Data Permissions. - -### 2.1 Global Permissions - -Global permissions include user management and role management. - -The following table describes the types of global permissions: - -| Permission Name | Description | -| ----------------- |----------------------------------------------------------------------------------------------------------------------------------| -| MANAGE\_USER | - Create users
- Delete users
- Modify user passwords
- View user permission details
- List all users | -| MANAGE\_ROLE | - Create roles
- Delete roles
- View role permission details
- Grant/revoke roles to/from users
- List all roles | - -### 2.2 Data Permissions - -Data permissions consist of permission types and permission scopes. - -* Permission Types: - * CREATE: Permission to create resources - * DROP: Permission to delete resources - * ALTER: Permission to modify definitions - * SELECT: Permission to query data - * INSERT: Permission to insert/update data - * DELETE: Permission to delete data -* Permission Scopes: - * ANY: System-wide (affects all databases and tables) - * DATABASE: Database-wide (affects the specified database and its tables) - * TABLE: Table-specific (affects only the specified table) -* Scope Enforcement Logic: - -When performing table-level operations, the system matches user permissions with data permission scopes hierarchically. Example: If a user attempts to write data to `DATABASE1.TABLE1`, the system checks for write permissions in this order: 1. `ANY` scope → 2. `DATABASE1` scope → 3. `DATABASE1.TABLE1` scope. The check stops at the first successful match or fails if no permissions are found. - -* Permission Type-Scope-Effect Matrix - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Permission Type | Scope (Hierarchy) | Effect |
| --- | --- | --- |
| CREATE | ANY | Create any table/database |
| CREATE | DATABASE | Create tables in the specified database; create a database with the specified name |
| CREATE | TABLE | Create a table with the specified name |
| DROP | ANY | Delete any table/database |
| DROP | DATABASE | Delete the specified database or its tables |
| DROP | TABLE | Delete the specified table |
| ALTER | ANY | Modify definitions of any table/database |
| ALTER | DATABASE | Modify definitions of the specified database or its tables |
| ALTER | TABLE | Modify the definition of the specified table |
| SELECT | ANY | Query data from any table in any database |
| SELECT | DATABASE | Query data from any table in the specified database |
| SELECT | TABLE | Query data from the specified table |
| INSERT | ANY | Insert/update data in any table |
| INSERT | DATABASE | Insert/update data in any table within the specified database |
| INSERT | TABLE | Insert/update data in the specified table |
| DELETE | ANY | Delete data from any table |
| DELETE | DATABASE | Delete data from tables within the specified database |
| DELETE | TABLE | Delete data from the specified table |
- -## 3. User and Role Management - -1. Create User (Requires `MANAGE_USER` Permission) - -```SQL -CREATE USER -eg: CREATE USER user1 'passwd' -``` - -Constraints: - -* Username: 4-32 characters (letters, numbers, special chars: `!@#$%^&*()_+-=`). Cannot duplicate the admin (`root`) username. - - If the username consists entirely of numbers or contains special characters, you need to enclose it in double quotes `""` when creating it. -* Password: 4-32 characters (letters, numbers, special chars). Stored as SHA-256 hash by default. - -2. Modify Password - -Users can modify their own passwords. Modifying others' passwords requires `MANAGE_USER`. - -```SQL -ALTER USER SET PASSWORD -eg: ALTER USER tempuser SET PASSWORD 'newpwd' -``` - -3. Delete User (Requires `MANAGE_USER`) - -```SQL -DROP USER -eg: DROP USER user1 -``` - -4. Create Role (Requires `MANAGE_ROLE`) - -```SQL -CREATE ROLE -eg: CREATE ROLE role1 -``` - -Constraints: - -* Role Name: 4-32 characters (letters, numbers, special chars). Cannot duplicate the admin role name. - -5. Delete Role (Requires `MANAGE_ROLE`) - -```SQL -DROP ROLE -eg: DROP ROLE role1 -``` - -6. Assign Role to User (Requires `MANAGE_ROLE`) - -```SQL -GRANT ROLE TO -eg: GRANT ROLE admin TO user1 -``` - -7. Revoke Role from User (Requires `MANAGE_ROLE`) - -```SQL -REVOKE ROLE FROM -eg: REVOKE ROLE admin FROM user1 -``` - -8. List All Users (Requires `MANAGE_USER`) - -```SQL -LIST USER -``` - -9. List All Roles (Requires `MANAGE_ROLE`) - -```SQL -LIST ROLE -``` - -10. List Users in a Role (Requires `MANAGE_USER`) - -```SQL -LIST USER OF ROLE -eg: LIST USER OF ROLE roleuser -``` - -11. List Roles of a User - -* Users can list their own permissions. -* Listing others' permissions requires `MANAGE_USER`. - -```SQL -LIST ROLE OF USER -eg: LIST ROLE OF USER tempuser -``` - -12. List User Permissions - -* Users can list their own permissions. -* Listing others' permissions requires `MANAGE_USER`. 
- -```SQL -LIST PRIVILEGES OF USER -eg: LIST PRIVILEGES OF USER tempuser -``` - -13. List Role Permissions - -* Users can list permissions of roles they have. -* Listing other roles' permissions requires `MANAGE_ROLE`. - -```SQL -LIST PRIVILEGES OF ROLE -eg: LIST PRIVILEGES OF ROLE actor -``` - -## 4. Permission Management - -IoTDB supports granting and revoking permissions through the following three methods: - -* Direct assignment/revocation by a super administrator -* Assignment/revocation by users with the `GRANT OPTION` privilege -* Assignment/revocation via roles (managed by super administrators or users with `MANAGE_ROLE` permissions) - -In the IoTDB Table Model, the following principles apply when granting or revoking permissions: - -* **Global permissions** can be granted/revoked without specifying a scope. -* **Data permissions** require specifying both the permission type and permission scope. When revoking, only the explicitly defined scope is affected, regardless of hierarchical inclusion relationships. -* Preemptive permission planning is allowed—permissions can be granted for databases or tables that do not yet exist. -* Repeated granting/revoking of permissions is permitted. -* `WITH GRANT OPTION`: Allows users to manage permissions within the granted scope. Users with this option can grant or revoke permissions for other users in the same scope. - -### 4.1 Granting Permissions - -1. Grant a user the permission to manage users - -```SQL -GRANT MANAGE_USER TO USER -eg: GRANT MANAGE_USER TO USER TEST_USER -``` - -2. Grant a user the permission to create databases and tables within the database, and allow them to manage permissions in that scope - -```SQL -GRANT CREATE ON DATABASE TO USER WITH GRANT OPTION -eg: GRANT CREATE ON DATABASE TESTDB TO USER TEST_USER WITH GRANT OPTION -``` - -3. Grant a role the permission to query a database - -```SQL -GRANT SELECT ON DATABASE TO ROLE -eg: GRANT SELECT ON DATABASE TESTDB TO ROLE TEST_ROLE -``` - -4. 
Grant a user the permission to query a table - -```SQL -GRANT SELECT ON . TO USER -eg: GRANT SELECT ON TESTDB.TESTTABLE TO USER TEST_USER -``` - -5. Grant a role the permission to query all databases and tables - -```SQL -GRANT SELECT ON ANY TO ROLE -eg: GRANT SELECT ON ANY TO ROLE TEST_ROLE -``` - -6. ALL Syntax Sugar: ALL represents all permissions within a given scope, allowing flexible permission granting. - -```sql -GRANT ALL TO USER TESTUSER --- Grants all possible permissions to the user, including global permissions and all data permissions under ANY scope. - -GRANT ALL ON ANY TO USER TESTUSER --- Grants all data permissions under the ANY scope. After execution, the user will have all data permissions across all databases. - -GRANT ALL ON DATABASE TESTDB TO USER TESTUSER --- Grants all data permissions within the specified database. After execution, the user will have all data permissions on that database. - -GRANT ALL ON TABLE TESTTABLE TO USER TESTUSER --- Grants all data permissions on the specified table. After execution, the user will have all data permissions on that table. -``` - -### 4.2 Revoking Permissions - -1. Revoke a user's permission to manage users - -```SQL -REVOKE MANAGE_USER FROM USER -eg: REVOKE MANAGE_USER FROM USER TEST_USER -``` - -2. Revoke a user's permission to create databases and tables within the database - -```SQL -REVOKE CREATE ON DATABASE FROM USER -eg: REVOKE CREATE ON DATABASE TEST_DB FROM USER TEST_USER -``` - -3. Revoke a user's permission to query a table - -```SQL -REVOKE SELECT ON . FROM USER -eg: REVOKE SELECT ON TESTDB.TESTTABLE FROM USER TEST_USER -``` - -4. Revoke a user's permission to query all databases and tables - -```SQL -REVOKE SELECT ON ANY FROM USER -eg: REVOKE SELECT ON ANY FROM USER TEST_USER -``` - -5. ALL Syntax Sugar: ALL represents all permissions within a given scope, allowing flexible permission revocation. 
- -```sql -REVOKE ALL FROM USER TESTUSER --- Revokes all global permissions and all data permissions under ANY scope. - -REVOKE ALL ON ANY FROM USER TESTUSER --- Revokes all data permissions under the ANY scope, without affecting DB or TABLE-level permissions. - -REVOKE ALL ON DATABASE TESTDB FROM USER TESTUSER --- Revokes all data permissions on the specified database, without affecting TABLE-level permissions. - -REVOKE ALL ON TABLE TESTTABLE FROM USER TESTUSER --- Revokes all data permissions on the specified table. -``` - -### 4.3 Viewing User Permissions - -Each user has an access control list that identifies all the permissions they have been granted. You can use the `LIST PRIVILEGES OF USER ` statement to view the permissions of a specific user or role. The output format is as follows: - -| ROLE | SCOPE | PRIVILEGE | WITH GRANT OPTION | -|--------------|---------| -------------- |-------------------| -| | DB1.TB1 | SELECT | FALSE | -| | | MANAGE\_ROLE | TRUE | -| ROLE1 | DB2.TB2 | UPDATE | TRUE | -| ROLE1 | DB3.\* | DELETE | FALSE | -| ROLE1 | \*.\* | UPDATE | TRUE | - -* **ROLE column**: If empty, it indicates the user's own permissions. If not empty, it means the permission is derived from a granted role. -* **SCOPE column**: Represents the permission scope of the user/role. Table-level permissions are denoted as `DB.TABLE`, database-level permissions as `DB.*`, and ANY-level permissions as `*.*`. -* **PRIVILEGE column**: Lists the specific permission types. -* **WITH GRANT OPTION column**: If `TRUE`, it means the user can grant their own permissions to others. -* A user or role can have permissions in both the tree model and the table model, but the system will only display the permissions relevant to the currently connected model. Permissions under the other model will not be shown. - -## 5. 
Example - -Using the [Sample Data](../Reference/Sample-Data.md) as an example, the data in the two tables may belong to the **bj** and **sh** data centers, respectively. To prevent each center from accessing the other's data, permission isolation must be implemented at the data center level. - -### 5.1 Creating Users - -Use `CREATE USER ` to create users. For example, the **root** user, who holds all permissions, can create two users for the **bj** and **sh** data centers, named **bj\_write\_user** and **sh\_write\_user**, both with the password **write\_pwd**. The SQL statements are: - -```SQL -CREATE USER bj_write_user 'write_pwd' -CREATE USER sh_write_user 'write_pwd' -``` - -To display the users, use the following SQL statement: - -```SQL -LIST USER -``` - -The result will show the two newly created users, as follows: - -```Plain -+-------------+ -| User| -+-------------+ -|bj_write_user| -| root| -|sh_write_user| -+-------------+ -``` - -### 5.2 Granting User Permissions - -Although the two users have been created, they do not yet have any permissions and thus cannot perform database operations. 
For example, if the **bj\_write\_user** attempts to write data to ​**table1**​, the SQL statement would be: - -```sql -IoTDB> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -``` - -The system will deny the operation and display an error: - -```sql -IoTDB> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: database is not specified -IoTDB> use database1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: DATABASE database1 -``` - -The **root** user can grant **bj\_write\_user** write permissions for **table1** using the `GRANT ON TO USER ` statement, for example: - -```sql -GRANT INSERT ON database1.table1 TO USER bj_write_user -``` - -After granting permissions, **bj\_write\_user** can successfully write data: - -```SQL -IoTDB> use database1 -Msg: The statement is executed successfully. -IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: The statement is executed successfully. -``` - -### 5.3 Revoking User Permissions - -After granting permissions, the **root** user can revoke them using the `REVOKE ON FROM USER ` statement. 
For example: - -```sql -REVOKE INSERT ON database1.table1 FROM USER bj_write_user -REVOKE INSERT ON database1.table2 FROM USER sh_write_user -``` - -Once permissions are revoked, **bj\_write\_user** will no longer have write access to ​**table1**​: - -```sql -IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: No permissions for this operation, please add privilege INSERT ON database1.table1 -``` diff --git a/src/UserGuide/Master/Table/User-Manual/Auto-Start-On-Boot_timecho.md b/src/UserGuide/Master/Table/User-Manual/Auto-Start-On-Boot_timecho.md deleted file mode 100644 index 1fbcd4c6f..000000000 --- a/src/UserGuide/Master/Table/User-Manual/Auto-Start-On-Boot_timecho.md +++ /dev/null @@ -1,213 +0,0 @@ - - -# Auto-start on Boot -## 1. Overview -TimechoDB supports registering ConfigNode, DataNode, and AINode as Linux system services via the three scripts `daemon-confignode.sh`, `daemon-datanode.sh`, and `daemon-ainode.sh`. Combined with the system-built `systemctl` command, it manages the TimechoDB cluster in daemon mode, enabling more convenient startup, shutdown, restart, and auto-start on boot operations, and improving service stability. - -> Note: This feature is available starting from version 2.0.9.1. - -## 2. Environment Requirements -| Item | Specification | -|--------------|-------------------------------------------------------------------------------| -| OS | Linux (supports the `systemctl` command) | -| User Privilege | root user | -| Environment Variable | `JAVA_HOME` must be set before deploying ConfigNode and DataNode | - -## 3. 
Service Registration -Enter the TimechoDB installation directory and execute the corresponding daemon script: - -```bash -# Register ConfigNode service -./tools/ops/daemon-confignode.sh - -# Register DataNode service -./tools/ops/daemon-datanode.sh - -# Register AINode service -./tools/ops/daemon-ainode.sh -``` - -During script execution, you will be prompted with two options: -1. Whether to start the corresponding TimechoDB service immediately (timechodb-confignode / timechodb-datanode / timechodb-ainode); -2. Whether to register the corresponding service for auto-start on boot. - -After script execution, the corresponding service files will be generated in the `/etc/systemd/system/` directory: -- `timechodb-confignode.service` -- `timechodb-datanode.service` -- `timechodb-ainode.service` - -## 4. Service Management -After service registration, you can use `systemctl` commands to start, stop, restart, check status, and configure auto-start on boot for each TimechoDB node service. All commands below must be executed as the root user. - -### 4.1 Manual Service Startup -```bash -# Start ConfigNode service -systemctl start timechodb-confignode -# Start DataNode service -systemctl start timechodb-datanode -# Start AINode service -systemctl start timechodb-ainode -``` - -### 4.2 Manual Service Shutdown -```bash -# Stop ConfigNode service -systemctl stop timechodb-confignode -# Stop DataNode service -systemctl stop timechodb-datanode -# Stop AINode service -systemctl stop timechodb-ainode -``` - -After stopping the service, check the service status. If it shows `inactive (dead)`, the service has been shut down successfully. For other statuses, check TimechoDB logs to analyze exceptions. 
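
The post-stop check described above could be scripted. The following is a minimal sketch, not part of the TimechoDB tooling: the function names `classify_state` and `report_states` are hypothetical, and the sketch assumes the three services were registered under the names listed earlier. It relies on `systemctl is-active`, which prints a one-word state such as `inactive`, `active`, or `failed`.

```bash
#!/usr/bin/env bash

# Hypothetical helper: map the one-word state printed by
# `systemctl is-active` to a short human-readable verdict.
classify_state() {
  case "$1" in
    inactive) echo "stopped" ;;
    active)   echo "still running" ;;
    failed)   echo "failed: check the TimechoDB logs" ;;
    *)        echo "unexpected state '$1': check the TimechoDB logs" ;;
  esac
}

# Hypothetical helper: print the verdict for each TimechoDB service.
# `is-active` exits non-zero for anything other than "active",
# so the exit status is ignored and only the printed state is used.
report_states() {
  local svc state
  for svc in timechodb-confignode timechodb-datanode timechodb-ainode; do
    state="$(systemctl is-active "$svc" 2>/dev/null || true)"
    echo "$svc: $(classify_state "$state")"
  done
}
```

After running `systemctl stop` for the relevant services, calling `report_states` as root should print one line per service; any line other than `stopped` suggests looking at the logs.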
- -### 4.3 Check Service Status -```bash -# Check ConfigNode service status -systemctl status timechodb-confignode -# Check DataNode service status -systemctl status timechodb-datanode -# Check AINode service status -systemctl status timechodb-ainode -``` - -Status Description: -- `active (running)`: Service is running. If this status persists for 10 minutes, the service has started successfully. -- `failed`: Service startup failed. Check TimechoDB logs for troubleshooting. - -### 4.4 Restart Service -Restarting a service is equivalent to stopping and then starting it. Commands are as follows: -```bash -# Restart ConfigNode service -systemctl restart timechodb-confignode -# Restart DataNode service -systemctl restart timechodb-datanode -# Restart AINode service -systemctl restart timechodb-ainode -``` - -### 4.5 Enable Auto-start on Boot -```bash -# Enable ConfigNode auto-start on boot -systemctl enable timechodb-confignode -# Enable DataNode auto-start on boot -systemctl enable timechodb-datanode -# Enable AINode auto-start on boot -systemctl enable timechodb-ainode -``` - -### 4.6 Disable Auto-start on Boot -```bash -# Disable ConfigNode auto-start on boot -systemctl disable timechodb-confignode -# Disable DataNode auto-start on boot -systemctl disable timechodb-datanode -# Disable AINode auto-start on boot -systemctl disable timechodb-ainode -``` - -## 5. Custom Service Configuration -### 5.1 Customization Methods -#### 5.1.1 Method 1: Modify the Script -1. Modify the `[Unit]`, `[Service]`, and `[Install]` sections in the `daemon-xxx.sh` script. For details of configuration items, refer to the next section. -2. Execute the `daemon-xxx.sh` script. - -#### 5.1.2 Method 2: Modify the Service File -1. Modify the `xx.service` file in `/etc/systemd/system`. -2. Execute `systemctl daemon-reload`. 
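
Method 2 can be sketched as a small script. This is an illustrative sketch under stated assumptions, not the shipped procedure: `set_unit_option` is a hypothetical helper, it uses GNU `sed -i` (available on typical Linux systems), it only rewrites a `KEY=` line that already exists, and it assumes the new value contains no `|` or `&` characters.

```bash
#!/usr/bin/env bash

# Hypothetical helper: replace the value of an existing KEY=... line
# in a systemd unit file. It does not add the key if it is missing.
set_unit_option() {
  local file="$1" key="$2" value="$3"
  # Fail early if the key is absent, so a typo does not pass silently.
  if ! grep -q "^${key}=" "$file"; then
    echo "no ${key}= line in ${file}" >&2
    return 1
  fi
  sed -i "s|^${key}=.*|${key}=${value}|" "$file"
}

# Example (run as root): lengthen the restart interval, then reload systemd
# so that it re-reads the modified unit file.
# set_unit_option /etc/systemd/system/timechodb-datanode.service RestartSec 10
# systemctl daemon-reload
```

A service that is already running would presumably need a restart before the changed settings apply to it.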
- -### 5.2 `daemon-xxx.sh` Configuration Items -#### 5.2.1 `[Unit]` Section (Service Metadata) -| Item | Description | -|---------------|-----------------------------------------------------------------------------| -| Description | Service description | -| Documentation | Link to the official TimechoDB documentation | -| After | Ensures the service starts only after the network service has started | - -#### 5.2.2 `[Service]` Section (Service Runtime Configuration) -| Item | Meaning | -|-------------------------------------------|-----------------------------------------------------------------------------------------------------------| -| StandardOutput, StandardError | Specify storage paths for service standard output and error logs | -| LimitNOFILE=65536 | Set the maximum number of file descriptors, default value is 65536 | -| Type=simple | Service type is a simple foreground process; systemd tracks the main service process | -| User=root, Group=root | Run the service with root user and group permissions | -| ExecStart / ExecStop | Specify the paths of the service startup and shutdown scripts respectively | -| Restart=on-failure | Automatically restart the service only if it exits abnormally | -| SuccessExitStatus=143 | Treat exit code 143 (128+15, normal termination via SIGTERM) as a successful exit | -| RestartSec=5 | Interval between service restarts, default 5 seconds | -| StartLimitInterval=600s, StartLimitBurst=3 | Maximum 3 restarts within 10 minutes (600 seconds) to prevent excessive resource consumption from frequent restarts | -| RestartPreventExitStatus=SIGKILL | Do not auto-restart the service if killed by the SIGKILL signal, avoiding infinite restart of zombie processes | - -#### 5.2.3 `[Install]` Section (Installation Configuration) -| Item | Meaning | -|-----------------------|----------------------------------------------------------------------| -| WantedBy=multi-user.target | Start the service automatically when the system enters multi-user 
mode | - -### 5.3 Sample `.service` File Format -```bash -[Unit] -Description=timechodb-confignode -Documentation=https://www.timecho.com/ -After=network.target - -[Service] -StandardOutput=null -StandardError=null -LimitNOFILE=65536 -Type=simple -User=root -Group=root -Environment=JAVA_HOME=$JAVA_HOME -ExecStart=$TimechoDB_SBIN_HOME/start-confignode.sh -Restart=on-failure -SuccessExitStatus=143 -RestartSec=5 -StartLimitInterval=600s -StartLimitBurst=3 -RestartPreventExitStatus=SIGKILL - -[Install] -WantedBy=multi-user.target -``` - -Note: The above is the standard format of the `timechodb-confignode.service` file. The formats of `timechodb-datanode.service` and `timechodb-ainode.service` are similar. - -## 6. Notes -1. **Process Daemon Mechanism** - - **Auto-restart**: The system will auto-restart the service if it fails to start or exits abnormally during runtime (e.g., OOM). - - **No restart**: Normal exits (e.g., executing `kill`, `./sbin/stop-xxx.sh`, or `systemctl stop`) will not trigger auto-restart. - -2. **Log Location** - - All runtime logs are stored in the `logs` folder under the TimechoDB installation directory. Refer to this directory for troubleshooting. - -3. **Cluster Status Check** - - After service startup, execute `./sbin/start-cli.sh` and run the `show cluster` command to view the cluster status. - -4. **Fault Recovery Procedure** - - If the service status is `failed`, after fixing the issue, **you must first execute `systemctl daemon-reload`** before running `systemctl start`, otherwise startup will fail. - -5. **Configuration Activation** - - After modifying the `daemon-xxx.sh` script, execute `systemctl daemon-reload` to re-register the service for new configurations to take effect. - -6. **Startup Mode Compatibility** - - Services started via `systemctl start` can be stopped using `./sbin/stop` (no restart triggered). - - Processes started via `./sbin/start` cannot be monitored via `systemctl`. 
\ No newline at end of file diff --git a/src/UserGuide/Master/Table/User-Manual/Black-White-List_timecho.md b/src/UserGuide/Master/Table/User-Manual/Black-White-List_timecho.md deleted file mode 100644 index 3aa3cb94a..000000000 --- a/src/UserGuide/Master/Table/User-Manual/Black-White-List_timecho.md +++ /dev/null @@ -1,78 +0,0 @@ - - -# Black White List - -## 1. Introduction - -IoTDB is a time-series database designed for IoT scenarios, supporting efficient data storage, query, and analysis. With the widespread application of IoT technology, data security and access control have become critical. In open environments, ensuring secure data access for legitimate users presents a key challenge. The whitelist mechanism allows only trusted IPs or users to connect, reducing the attack surface at the source. The blacklist function can block malicious IPs in real time in edge-cloud collaborative scenarios, preventing unauthorized access, SQL injection, brute‑force attacks, DDoS, and other threats, thereby providing continuous and stable security for data transmission. - -> Note: This feature is available starting from version 2.0.6. - -## 2. Whitelist - -### 2.1 Function Description - -By enabling the whitelist function and configuring the whitelist, client addresses allowed to connect to IoTDB are specified. Only clients within the whitelist can access IoTDB, achieving security control. - -### 2.2 Configuration Parameters - -Administrators can enable/disable the whitelist function and add, modify, or delete whitelist IPs/IP segments in the following two ways: - -* Edit the configuration file `iotdb‑system.properties`. -* Use the `set configuration` statement. 
- * Table model reference: [set configuration](../SQL-Manual/SQL-Maintenance-Statements_timecho.md#_2-2-update-configuration-items) - -Related parameters are as follows: - -| Name | Description | Default Value | Effective Mode | Example | -| ----------------- | ----------------------------------------------------------------------------------------------------------------------------------- | --------------- | ---------------- | ------------------------------------------------------------------- | -| `enable_white_list` | Whether to enable the whitelist function. true: enable; false: disable. The value is case‑insensitive. | false | Hot reload | `set enable_white_list = 'true'` | -| `white_ip_list` | Add, modify, or delete whitelist IPs/IP segments. Supports exact match and the \* wildcard. Multiple IPs are separated by commas. | empty | Hot reload | `set white_ip_list='192.168.1.200,192.168.1.201,192.168.1.*'` | - -## 3. Blacklist - -### 3.1 Function Description - -By enabling the blacklist function and configuring the blacklist, certain specific IP addresses are prevented from accessing the database, guarding against unauthorized access, SQL injection, brute‑force attacks, DDoS attacks, and other security threats, thereby ensuring the security and stability of data transmission. - -### 3.2 Configuration Parameters - -Administrators can enable/disable the blacklist function and add, modify, or delete blacklist IPs/IP segments in the following two ways: - -* Edit the configuration file `iotdb‑system.properties`. -* Use the `set configuration`statement. 
- * Table model reference:[set configuration](../SQL-Manual/SQL-Maintenance-Statements_timecho.md#_2-2-update-configuration-items) - -Related parameters are as follows: - -| Name | Description | Default Value | Effective Mode | Example | -|---------------------| ----------------------------------------------------------------------------------------------------------------------------------- | --------------- | ---------------- | ------------------------------------------------------------------- | -| `enable_black_list` | Whether to enable the blacklist function. true: enable; false: disable. The value is case‑insensitive. | false | Hot reload | `set enable_black_list = 'true'` | -| `black_ip_list` | Add, modify, or delete blacklist IPs/IP segments. Supports exact match and the \* wildcard. Multiple IPs are separated by commas. | empty | Hot reload | `set black_ip_list='192.168.1.200,192.168.1.201,192.168.1.*'` | - -## 4. Notes - -1. After the whitelist is enabled, if the list is empty, all connections are denied. If the local IP is not included, local login is denied. -2. When the same IP appears in both the whitelist and blacklist, the blacklist takes precedence. -3. The system validates the IP format. Invalid entries will cause an error when the user connects and be skipped, without affecting the loading of other valid IPs. -4. Duplicate IPs in the configuration are supported; they are automatically deduplicated in memory without notification. For manual deduplication, edit the configuration accordingly. -5. Blacklist/whitelist rules only apply to new connections. Existing connections before enabling the function are not affected; they will be intercepted only upon subsequent reconnection. 
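
The notes above imply an ordering when enabling the lists: an enabled but empty whitelist denies every connection, so it is safer to fill the list before turning the switch on. The following is a minimal sketch of that sequence. It assumes the `SET CONFIGURATION key='value'` form described in the linked `set configuration` reference (the example column in the parameter tables uses a shorter `set key = 'value'` form), and every IP address is a placeholder.

```SQL
-- Assumption: statement form follows the linked "set configuration" reference.
-- 1. Populate the whitelist first, including the local address,
--    so that enabling it does not lock out local logins.
SET CONFIGURATION white_ip_list='127.0.0.1,192.168.1.*';
SET CONFIGURATION enable_white_list='true';

-- 2. Block a single address inside the whitelisted segment.
--    When an IP matches both lists, the blacklist takes precedence.
SET CONFIGURATION black_ip_list='192.168.1.200';
SET CONFIGURATION enable_black_list='true';
```

Both parameters are hot-reloaded, but the rules only apply to new connections, so a client at `192.168.1.200` that is already connected is rejected only when it reconnects.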
diff --git a/src/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md b/src/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md deleted file mode 100644 index f62a412b3..000000000 --- a/src/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md +++ /dev/null @@ -1,850 +0,0 @@ - -# Data Sync - -Data synchronization is a typical requirement in the Industrial Internet of Things (IIoT). Through data synchronization mechanisms, data sharing between IoTDB instances can be achieved, enabling the establishment of a complete data pipeline to meet needs such as internal and external network data exchange, edge-to-cloud synchronization, data migration, and data backup. - -## 1. Functional Overview - -### 1.1 Data Synchronization - -A data synchronization task consists of three stages: - -![](/img/data-sync-new.png) - -- Source Stage: This stage is used to extract data from the source IoTDB, defined in the `source` section of the SQL statement. -- Process Stage: This stage is used to process the data extracted from the source IoTDB, defined in the `processor` section of the SQL statement. -- Sink Stage: This stage is used to send data to the target IoTDB, defined in the `sink` section of the SQL statement. - -By declaratively configuring these three parts in an SQL statement, flexible data synchronization capabilities can be achieved.Currently, data synchronization supports the synchronization of the following information, and you can select the synchronization scope when creating a synchronization task (the default is data.insert, which means synchronizing newly written data): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

| Synchronization Scope | | Synchronization Content Description |
|-----------------------|----------|----------------------------------------------------------------------|
| all | | All scopes |
| data (Data) | insert | Synchronize newly written data |
| | delete | Synchronize deleted data |
| schema | database | Synchronize database creation, modification or deletion operations |
| | table | Synchronize table creation, modification or deletion operations |
| | TTL | Synchronize the data retention time |
| auth | - | Synchronize user permissions and access control |

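
The scopes in the table above are chosen through source parameters when the task is created. The sketch below reuses the `inclusion` and `exclusion` keys that appear in the `SHOW PIPE` example later in this document. The pipe name `scope_demo` is hypothetical, and the sink address is a placeholder.

```SQL
-- Synchronize every scope except user permissions (auth).
-- 'scope_demo' is an illustrative name, not one used elsewhere in this guide.
CREATE PIPE scope_demo
WITH SOURCE (
  'inclusion' = 'all',   -- all scopes listed in the table
  'exclusion' = 'auth'   -- ...minus the auth scope
)
WITH SINK (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668'
);
```

Omitting both keys falls back to the default scope, `data.insert`, which synchronizes newly written data only.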
- -### 1.2 Functional Limitations and Notes - -- Data synchronization between IoTDB of 1. x series version and IoTDB of 2. x and above series versions is not supported. -- When performing data synchronization tasks, avoid executing any deletion operations to prevent inconsistencies between the two ends. -- The `pipe` and `pipe plugins` for tree modes and table modes are designed to be isolated from each other. Before creating a `pipe`, it is recommended to first use the `show` command to query the built-in plugins available under the current `-sql_dialect` parameter configuration to ensure syntax compatibility and functional support. -- Object-type data export is supported since version V2.0.9.2. -- When Pipe fails to write data to the sink due to field type mismatches, IoTDB automatically converts the data to the field types defined in the existing sink schema and retries the write operation to improve synchronization success rate. This feature is controlled by the parameter `sink.exception.data.convert-on-type-mismatch`. Refer to the subsequent sink parameter table for detailed parameter descriptions. - - * The conversion rules for type mismatches are as follows: - - | Source Type | Target Type | Conversion Rule | - |---------------------|-------------|---------------------------------------------------------------------------------| - | Numeric Type | Numeric Type| Convert to the target numeric type. Truncation, precision loss or overflow may occur. | - | Numeric Type | BOOLEAN | `0` is converted to `false`; non-zero values are converted to `true`. | - | BOOLEAN | Numeric Type| `true` is converted to `1`; `false` is converted to `0`. | - | TEXT, STRING, BLOB | BOOLEAN | Parse the string into a BOOLEAN value. | - | TEXT, STRING, BLOB | Numeric Type| Parse the string into the target numeric type. If parsing fails, write the default value `0`, `0L` or `0.0`. | - | TEXT, STRING, BLOB | TIMESTAMP | Parse the string into a TIMESTAMP value. 
If parsing fails, write the default value `0L`. | - | TEXT, STRING, BLOB | DATE | Parse the string into a DATE value. If parsing fails, write the default date `1970-01-01`. | - | Invalid Numeric Value | DATE | If conversion to a valid DATE fails, write the default date `1970-01-01`. | - | DATE | TIMESTAMP | Convert to the timestamp of 00:00 (UTC) on the same day. | - | TIMESTAMP | DATE | Convert to the corresponding date in UTC. | - - > **Note**: Automatic conversion is performed based on the existing sink schema and will **not** modify the sink schema. This feature prioritizes continuous data synchronization, which may result in precision loss or writing of default values. - - - -## 2. Usage Instructions - -A data synchronization task can be in one of three states: RUNNING, STOPPED, and DROPPED. The state transitions of the task are illustrated in the diagram below: - -![](/img/Data-Sync02.png) - -After creation, the task will start directly. Additionally, if the task stops due to an exception, the system will automatically attempt to restart it. - -We provide the following SQL statements for managing the state of synchronization tasks. - -### 2.1 Create a Task - -Use the `CREATE PIPE` statement to create a data synchronization task. Among the following attributes, `PipeId` and `sink` are required, while `source` and `processor` are optional. Note that the order of the `SOURCE` and `SINK` plugins cannot be swapped when writing the SQL. - -SQL Example: - -```SQL -CREATE PIPE [IF NOT EXISTS] -- PipeId is a unique name identifying the task --- Data extraction plugin (optional) -WITH SOURCE ( - [ = ,], -) --- Data processing plugin (optional) -WITH PROCESSOR ( - [ = ,], -) --- Data transmission plugin (required) -WITH SINK ( - [ = ,], -) -``` - -**IF NOT EXISTS Semantics**: Ensures that the creation command is executed only if the specified Pipe does not exist, preventing errors caused by attempting to create an already existing Pipe. 
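
A minimal instance of the template above, shown as a sketch: the task name `A2B` and the sink address are placeholders, and the optional `PROCESSOR` clause is omitted. The `SOURCE` clause comes before the `SINK` clause, as the statement requires.

```SQL
-- Safe to run repeatedly: if a pipe named A2B already exists,
-- the statement does nothing instead of raising an error.
CREATE PIPE IF NOT EXISTS A2B
WITH SOURCE (
  'source' = 'iotdb-source'
)
WITH SINK (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668'
);
```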
 - -**Note**: - -Starting from V2.0.8, when creating a full data synchronization Pipe (e.g. PipeId: `alldatapipe`), the system automatically splits it into two independent Pipes: - -* History Pipe: The PipeId is the original name plus the suffix `_history` (e.g. `alldatapipe_history`). The source parameters carry the default configuration: `'realtime.enable'='false', 'inclusion'='data.insert', 'inclusion.exclusion'=''`. -* Realtime Pipe: The PipeId is the original name plus the suffix `_realtime` (e.g. `alldatapipe_realtime`). The source parameters carry the default configuration: `'history.enable'='false'`. If metadata synchronization is configured, the Realtime Pipe is responsible for sending the metadata. - -After successful creation, the original PipeId (e.g. `alldatapipe`) is no longer a valid identifier. When starting, stopping, deleting, or viewing the task, you must use one of the split PipeIds (i.e. `*_history` or `*_realtime`). For operation examples, see the [View Task](./Data-Sync_timecho.md#_2-5-view-task) section. - -### 2.2 Start a Task - -After creation, the task directly enters the RUNNING state and does not require manual startup. However, if the task is stopped using the `STOP PIPE` statement, you need to start it manually using the `START PIPE` statement. If the task stops due to an exception, it restarts automatically to resume data processing: - -```SQL -START PIPE <PipeId> -``` - -### 2.3 Stop a Task - -To stop data processing: - -```SQL -STOP PIPE <PipeId> -``` - -### 2.4 Delete a Task - -To delete a specified task: - -```SQL -DROP PIPE [IF EXISTS] <PipeId> -``` - -**IF EXISTS Semantics**: Ensures that the deletion command is executed only if the specified Pipe exists, preventing errors caused by attempting to delete a non-existent Pipe. - -**Note**: Deleting a task does not require stopping the synchronization task first.
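
Applied to the split task described in section 2.1, the statements above would look roughly as follows. This is a sketch that assumes a full-synchronization pipe was created under the name `alldatapipe` on V2.0.8 or later, so only the `_history` and `_realtime` identifiers are valid.

```SQL
-- Pause and resume the historical half of the split task.
STOP PIPE alldatapipe_history;
START PIPE alldatapipe_history;

-- Remove both halves. No prior STOP PIPE is needed, and
-- IF EXISTS avoids an error if one half is already gone.
DROP PIPE IF EXISTS alldatapipe_history;
DROP PIPE IF EXISTS alldatapipe_realtime;
```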
- -### 2.5 View Tasks - -To view all tasks: - -```SQL -SHOW PIPES -``` - -To view a specific task: - -```SQL -SHOW PIPE -``` - -Example Output of `SHOW PIPES`: - -```SQL -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State|PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -|59abf95db892428b9d01c5fa318014ea|2024-06-17T14:03:44.189|RUNNING| {}| {}|{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}| | 128| 1.03| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -``` - -**Column Descriptions**: - -- **ID**: Unique identifier of the synchronization task. -- **CreationTime**: Time when the task was created. -- **State**: Current state of the task. -- **PipeSource**: Source of the data stream. -- **PipeProcessor**: Processing logic applied during data transmission. -- **PipeSink**: Destination of the data stream. -- **ExceptionMessage**: Displays exception information for the task. -- **RemainingEventCount** (statistics may have delays): Number of remaining events, including data and metadata synchronization events, as well as system and user-defined events. -- **EstimatedRemainingSeconds** (statistics may have delays): Estimated remaining time to complete the transmission based on the current event count and pipe processing rate. - -Example: - -In V2.0.8 and later versions, create a full data synchronization task and view the task details. 
- -```sql -IoTDB> create pipe alldatapipe with source('inclusion'='all','exclusion'='auth') with sink('node-urls'='127.0.0.1:6668') - -IoTDB> show pipe alldatapipe_history -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_history|2025-12-18T15:06:16.697|RUNNING|{exclusion=auth, history.enable=true, inclusion=data.insert, inclusion.exclusion=, realtime.enable=false}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -IoTDB> show pipe alldatapipe_realtime -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ 
-|alldatapipe_realtime|2025-12-18T15:06:16.312|RUNNING|{exclusion=auth, history.enable=false, inclusion=all, realtime.enable=true}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -``` - - -### 2.6 Modify a Task - -The `ALTER PIPE` statement dynamically updates an existing PIPE and supports modifying or replacing the configuration of source, processor, and sink. - -```SQL -ALTER PIPE [IF EXISTS] - MODIFY/REPLACE SOURCE(...) - MODIFY/REPLACE PROCESSOR(...) - MODIFY/REPLACE SINK(...) -``` - -Description: - -* Executing this operation does not change the running state of the PIPE. It is equivalent to keeping the processing progress of the original PipeId and creating a new PIPE at the original progress position. -* The modify/replace parameters for source/processor/sink are all optional. If no modification parameter is specified, it is equivalent to deleting the current PIPE and recreating it with the original configuration and progress. -* For a plugin specified with modify, the plugin's other parameters are retained, and only the given parameters are replaced or added. -* For a plugin specified with replace, all parameters of the plugin are replaced directly. -* When the [IF EXISTS] keyword is used, execution succeeds even if no Pipe with the same name exists, but no operation is actually performed. - -Example: - -```SQL -ALTER PIPE A2B REPLACE SINK ('sink'='iotdb-thrift-sink', 'node-urls' = '127.0.0.1:6668'); -``` - -### 2.7 Synchronization Plugins - -To make the architecture more flexible and adaptable to different synchronization scenarios, IoTDB supports plugin assembly in the synchronization task framework. 
The system provides some common pre-installed plugins, and you can also customize `processor` and `sink` plugins and load them into the IoTDB system. - -To view the plugins available in the system (including custom and built-in plugins), use the following statement: - -```SQL -SHOW PIPEPLUGINS -``` - -Example Output: - -```SQL -IoTDB> SHOW PIPEPLUGINS -+---------------------+----------+-----------------------------------------------------------------------------------------+---------+----------------+ -| PluginName|PluginType| ClassName|PluginJar|ExceptionMessage| -+---------------------+----------+-----------------------------------------------------------------------------------------+---------+----------------+ -| DO-NOTHING-PROCESSOR| Builtin|org.apache.iotdb.commons.pipe.agent.plugin.builtin.processor.donothing.DoNothingProcessor| | | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.donothing.DoNothingSink| | | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.iotdb.airgap.IoTDBAirGapSink| | | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.source.iotdb.IoTDBSource| | | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.iotdb.thrift.IoTDBThriftSink| | | -|IOTDB-THRIFT-SSL-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.iotdb.thrift.IoTDBThriftSslSink| | | -| TSFILE-LOCAL-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.tsfile.PipeTsFileLocalSink| | | -| WRITE-BACK-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.writeback.WriteBackSink| | | -+---------------------+----------+-----------------------------------------------------------------------------------------+---------+----------------+ -``` - -Detailed introduction of pre-installed plugins is as follows (for detailed parameters of each plugin, please refer to the [Parameter 
 Description](#reference-parameter-description)):

| Type | Custom Plugin | Plugin Name | Description |
|------------------|---------------|-----------------------|-------------|
| Source Plugin | Not Supported | iotdb-source | Default extractor plugin for extracting historical or real-time data from IoTDB. |
| Processor Plugin | Supported | do-nothing-processor | Default processor plugin that does not process incoming data. |
| Sink Plugin | Supported | do-nothing-sink | Does not process outgoing data. |
| | | iotdb-thrift-sink | Default sink plugin for data transmission between IoTDB instances (V2.0.0+). Uses Thrift RPC framework with a multi-threaded async non-blocking IO model, ideal for distributed target scenarios. |
| | | iotdb-air-gap-sink | Used for cross-unidirectional data gate synchronization between IoTDB instances (V2.0.0+). Supports gate models like NARI Syskeeper 2000. |
| | | iotdb-thrift-ssl-sink | Used for data transmission between IoTDB instances (V2.0.0+). Uses Thrift RPC framework with a multi-threaded sync blocking IO model, suitable for high-security scenarios. |
| | | write-back-sink | A data write-back plugin for IoTDB (V2.0.2 and above) to achieve the effect of materialized views. |
| | | opc-ua-sink | An OPC UA protocol data transfer plugin for IoTDB (V2.0.2 and above), supporting both Client/Server and Pub/Sub communication modes. |
| | | tsfile-local-sink | Used in IoTDB (V2.0.9.2 and later) to support exporting Object data to the local file system where the IoTDB server resides. |
| | | tsfile-remote-sink | Used in IoTDB (V2.0.9.2 and later) to support sending Object data to a remote server via the SSH/SCP protocol. |

 - -## 3. Usage Examples - -### 3.1 Full Data Synchronization - -This example demonstrates synchronizing all data from one IoTDB to another. The data pipeline is shown below: - -![](/img/e1.png) - -In this example, we create a synchronization task named `A2B` to synchronize all data from IoTDB A to IoTDB B. The `iotdb-thrift-sink` plugin (built-in) is used, and the `node-urls` parameter is configured with the URL of the DataNode service port on the target IoTDB. - -SQL Example: - -```SQL -CREATE PIPE A2B -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668' -- URL of the DataNode service port on the target IoTDB -) -``` - -### 3.2 Partial Data Synchronization - -This example demonstrates synchronizing data within a specific historical time range (from August 23, 2023, 8:00 to October 23, 2023, 8:00) to another IoTDB. The data pipeline is shown below: - -![](/img/e2.png) - -In this example, we create a synchronization task named `A2B`. First, we define the data range in the `source` configuration. Since we are synchronizing historical data (data that existed before the task was created), we need to configure the start time (`start-time`), end time (`end-time`), and the streaming mode (`mode.streaming`). The `node-urls` parameter is configured with the URL of the DataNode service port on the target IoTDB. - -SQL Example: - -```SQL -CREATE PIPE A2B -WITH SOURCE ( - 'source' = 'iotdb-source', - 'mode.streaming' = 'true', -- Extraction mode for newly inserted data (after the pipe is created): - -- Whether to extract data in streaming mode (if set to false, batch mode is used). - 'database-name'='testdb.*', -- Scope of data synchronization - 'start-time' = '2023.08.23T08:00:00+00:00', -- The event time at which data synchronization starts (inclusive). - 'end-time' = '2023.10.23T08:00:00+00:00' -- The event time at which data synchronization ends (inclusive).
-) -WITH SINK ( - 'sink' = 'iotdb-thrift-async-sink', - 'node-urls' = '127.0.0.1:6668' -- The URL of the DataNode's data service port in the target IoTDB instance. -) -``` - -### 3.3 Bidirectional Data Transmission - -This example demonstrates a scenario where two IoTDB instances act as dual-active systems. The data pipeline is shown below: - -![](/img/e3.png) - -To avoid infinite data loops, the `source.mode.double-living` parameter must be set to `true` on both IoTDB A and B, indicating that data forwarded from another pipe will not be retransmitted. - -SQL Example: On IoTDB A: - -```SQL -CREATE PIPE AB -WITH SOURCE ( - 'source.mode.double-living' = 'true' -- Do not forward data from other pipes -) -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668' -- URL of the DataNode service port on the target IoTDB -) -``` - -On IoTDB B: - -```SQL -CREATE PIPE BA -WITH SOURCE ( - 'source.mode.double-living' = 'true' -- Do not forward data from other pipes -) -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667' -- URL of the DataNode service port on the target IoTDB -) -``` - -### 3.4 Edge-to-Cloud Data Transmission - -This example demonstrates synchronizing data from multiple IoTDB clusters (B, C, D) to a central IoTDB cluster (A). The data pipeline is shown below: - -![](/img/sync_en_03.png) - -To synchronize data from clusters B, C, and D to cluster A, the `database-name` and `table-name` parameters are used to restrict the data range. 
- -SQL Example: On IoTDB B: - -```SQL -CREATE PIPE BA -WITH SOURCE ( - 'database-name' = 'db_b.*', -- Restrict the database scope - 'table-name' = '.*' -- Match all tables -) -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667' -- URL of the DataNode service port on the target IoTDB -) -``` - -On IoTDB C : - -```SQL -CREATE PIPE CA -WITH SOURCE ( - 'database-name' = 'db_c.*', -- Restrict the database scope - 'table-name' = '.*' -- Match all tables -) -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668' -- URL of the DataNode service port on the target IoTDB -) -``` - -On IoTDB D: - -```SQL -CREATE PIPE DA -WITH SOURCE ( - 'database-name' = 'db_d.*', -- Restrict the database scope - 'table-name' = '.*' -- Match all tables -) -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669' -- URL of the DataNode service port on the target IoTDB -) -``` - -### 3.5 Cascaded Data Transmission - -This example demonstrates cascading data transmission from IoTDB A to IoTDB B and then to IoTDB C. The data pipeline is shown below: - -![](/img/sync_en_04.png) - - -SQL Example: On IoTDB A: - -```SQL -CREATE PIPE AB -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668' -- URL of the DataNode service port on the target IoTDB -) -``` - -On IoTDB B: - -```SQL -CREATE PIPE BC -WITH SOURCE ( -) -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669' -- URL of the DataNode service port on the target IoTDB -) -``` - -### 3.6 Air-Gapped Data Transmission - -This example demonstrates synchronizing data from one IoTDB to another through a unidirectional air gap. The data pipeline is shown below: - -![](/img/cross-network-gateway.png) - -In this example, the `iotdb-air-gap-sink` plugin is used (currently supports specific air gap models; contact Timecho team for details). 
After configuring the air gap, execute the following statement on IoTDB A, where `node-urls` is the URL of the DataNode service port on the target IoTDB. - -SQL Example: - -```SQL -CREATE PIPE A2B -WITH SINK ( - 'sink' = 'iotdb-air-gap-sink', - 'node-urls' = '10.53.53.53:9780' -- URL of the DataNode service port on the target IoTDB -) -``` - -**Note:** -* When creating a pipe for synchronization across a network gap (data diode), you must ensure that the target user on the receiving end already exists. If the receiving-end user is missing at the time of pipe creation, data prior to the subsequent creation of that user will not be synchronized. -* Currently supported network gap device models are listed in the table below. -> For other models of network gateway devices, Please contact timechodb staff to confirm compatibility. - -| Gateway Type | Model | Return Packet Limit | Send Limit | -| ---------------------- | ---------------------------------------------------------- | ------------------- | ---------------------- | -| Forward Gate | NARI Syskeeper-2000 Forward Gate | All 0 / All 1 bytes | No Limit | -| Forward Gate | XJ Self-developed Diaphragm | All 0 / All 1 bytes | No Limit | -| Unknown | WISGAP | No Limit | No Limit | -| Forward Gate | KEDONG StoneWall-2000 Network Security Isolation Device | No Limit | No Limit | -| Reverse Gate | NARI Syskeeper-2000 Reverse Direction | All 0 / All 1 bytes | Meet E Language Format | -| Unknown | DPtech ISG5000 | No Limit | No Limit | -| Unknown | GAP XL—GAP | No Limit | No Limit | - -### 3.7 Compressed Synchronization - -IoTDB supports specifying data compression methods during synchronization. The `compressor` parameter can be configured to enable real-time data compression and transmission. Supported algorithms include `snappy`, `gzip`, `lz4`, `zstd`, and `lzma2`. Multiple algorithms can be combined and applied in the configured order. 
The `rate-limit-bytes-per-second` parameter (supported in V1.3.3 and later) limits the maximum number of bytes transmitted per second (calculated after compression). If set to a value less than 0, there is no limit. - -**SQL Example**: - -```SQL -CREATE PIPE A2B -WITH SINK ( - 'node-urls' = '127.0.0.1:6668', -- URL of the DataNode service port on the target IoTDB - 'compressor' = 'snappy,lz4', -- Compression algorithms - 'rate-limit-bytes-per-second' = '1048576' -- Maximum bytes allowed per second -) -``` - -### 3.8 Encrypted Synchronization - -IoTDB supports SSL encryption during synchronization to securely transmit data between IoTDB instances. By configuring SSL-related parameters such as the certificate path (`ssl.trust-store-path`) and password (`ssl.trust-store-pwd`), data can be protected by SSL encryption during synchronization. - -**SQL Example**: - -```SQL -CREATE PIPE A2B -WITH SINK ( - 'sink' = 'iotdb-thrift-ssl-sink', - 'node-urls' = '127.0.0.1:6667', -- URL of the DataNode service port on the target IoTDB - 'ssl.trust-store-path' = 'pki/trusted', -- Path to the trust store certificate - 'ssl.trust-store-pwd' = 'root' -- Password for the trust store certificate -) -``` - -### 3.9 Object-Type Data Export -Since version V2.0.9.2, IoTDB supports exporting Object-type data. The following two methods are supported by configuring sink parameters: - -* **Local Mode**: Exports data to the local file system where the IoTDB server resides. -* **SCP Mode**: Sends data to a remote server via the SSH/SCP protocol. - -**Example 1: Local Export** - -You can directly use the built-in `tsfile-local-sink` plugin to create a PIPE statement for data export. 
For example: - -```SQL -CREATE PIPE tsfile_export_local -WITH SOURCE ( - 'source' = 'iotdb-source', - 'table-name' = 'test_table' -) -WITH PROCESSOR ( - 'processor' = 'do-nothing-processor' -) -WITH SINK ( - 'sink' = 'tsfile-local-sink', -- Required, specifies the Sink type - 'sink.local.target-path' = '/data/backup/export_2024', -- Target export path - 'sink.rate-limit-bytes-per-second' = '10485760' -- Rate limit: 10MB/s -); -``` - -**Example 2: Remote Transfer** - -1. Contact the Timecho Team to obtain the JAR package related to the `tsfile-remote-sink` plugin, such as `tsfile-remote-sink--jar-with-dependencies.jar`, and place it in a path accessible to IoTDB (e.g., all DataNode hosts). -2. Register the plugin using the following statement: - -```SQL -CREATE PIPEPLUGIN tsfile_remote_sink -AS 'org.apache.iotdb.pipe.plugin.sink.tsfile.PipeTsFileRemoteSink' -USING URI 'file:///path/to/tsfile-remote-sink--jar-with-dependencies.jar'; -``` - -3. Create the PIPE statement: - -```SQL -CREATE PIPE tsfile_export_scp -WITH SOURCE ( - 'source' = 'iotdb-source', - 'table-name' = 'test_table' -) -WITH PROCESSOR ( - 'processor' = 'do-nothing-processor' -) -WITH SINK ( - 'sink' = 'tsfile_remote_sink', - 'sink.file-mode' = 'scp', -- Specifies SCP mode - 'sink.scp.host' = '192.168.1.100', -- Remote host IP - 'sink.scp.port' = '22', -- SSH port - 'sink.scp.user' = 'backup_user', -- SSH username - 'sink.scp.password' = 'ComplexPass123!', -- SSH password - 'sink.scp.remote-path' = '/remote/archive/', -- Remote storage path - 'sink.rate-limit-bytes-per-second' = '10485760' -- Rate limit: 10MB/s -); -``` - -**Note**: When exporting Object-type data in SCP mode, to avoid handshake exceptions, connection failures, or frequent Pipe restarts, it is recommended to take any of the following measures: -* Appropriately lower the configuration parameter `sink.scp.object-parallelism` -* Increase the `MaxStartups` value on the target machine as needed. 
After modification, execute `sshd reload` or `sshd restart` for the configuration to take effect. - -**Sink Exported TSFile and Object Format:** - -```Bash -target_dir - ├── tsfile.tsfile - └── tsfile/ (matches the TSFile name) - ├── regionID/tableName/tag1/tag2/field/timestamp1.bin - ├── regionID/tableName/tag1/tag2/field/timestamp2.bin - └── regionID/tableName1/tag3/tag4/field/timestamp1.bin -``` - -## Reference: Notes - -You can adjust the parameters for data synchronization by modifying the IoTDB configuration file (`iotdb-system.properties`), such as the directory for storing synchronized data. The complete configuration is as follows: - -```Properties -# pipe_receiver_file_dir -# If this property is unset, system will save the data in the default relative path directory under the IoTDB folder(i.e., %IOTDB_HOME%/${cn_system_dir}/pipe/receiver). -# If it is absolute, system will save the data in the exact location it points to. -# If it is relative, system will save the data in the relative path directory it indicates under the IoTDB folder. -# Note: If pipe_receiver_file_dir is assigned an empty string(i.e.,zero-size), it will be handled as a relative path. -# effectiveMode: restart -# For windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is absolute. Otherwise, it is relative. -# pipe_receiver_file_dir=data\\confignode\\system\\pipe\\receiver -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_receiver_file_dir=data/confignode/system/pipe/receiver - -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# effectiveMode: first_start -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. 
-# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# effectiveMode: restart -# Datatype: int -pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# effectiveMode: restart -# Datatype: int -pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# effectiveMode: restart -# Datatype: int -pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# effectiveMode: restart -# Datatype: int -pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# effectiveMode: restart -# Datatype: Boolean -pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# Datatype: int -# effectiveMode: restart -pipe_air_gap_receiver_port=9780 - -# The total bytes that all pipe sinks can transfer per second. -# When given a value less than or equal to 0, it means no limit. -# default value is -1, which means no limit. 
-# effectiveMode: hot_reload -# Datatype: double -pipe_all_sinks_rate_limit_bytes_per_second=-1 -``` - -## Reference: Parameter Description - -### source parameter - -| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** | -| :----------------------- |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------| :----------- | :---------------------------------------------------------- | -| source | iotdb-source | String: iotdb-source | Yes | - | -| inclusion | Used to specify the range of data to be synchronized in the data synchronization task, including data,schema and auth | String:all, data(insert,delete), schema(database,table,ttl), auth | No | data.insert | -| inclusion.exclusion | Used to exclude specific operations from the range specified by inclusion, reducing the amount of data synchronized | String:all, data(insert,delete), schema(database,table,ttl), auth | No | - | -| 
mode.streaming | This parameter specifies the source of time-series data capture. It applies to scenarios where `mode.snapshot` is set to `false`, determining the capture source for `data.insert` in `inclusion`. Two capture strategies are available: - **true**: Dynamically selects the capture type. The system adapts to downstream processing speed, choosing between capturing each write request or only capturing TsFile file sealing requests. When downstream processing is fast, write requests are prioritized to reduce latency; when processing is slow, only file sealing requests are captured to prevent processing backlogs. This mode suits most scenarios, optimizing the balance between processing latency and throughput. - **false**: Uses a fixed batch capture approach, capturing only TsFile file sealing requests. This mode is suitable for resource-constrained applications, reducing system load. **Note**: Snapshot data captured when the pipe starts will only be provided for downstream processing as files. | Boolean: true / false | No | true | -| mode.strict | Determines whether to strictly filter data when using the `time`, `path`, `database-name`, or `table-name` parameters: - **true**: Strict filtering. The system will strictly filter captured data according to the specified conditions, ensuring that only matching data is selected. - **false**: Non-strict filtering. Some extra data may be included during the selection process to optimize performance and reduce CPU and I/O consumption. | Boolean: true / false | No | true | -| mode.snapshot | This parameter determines the data capture mode, affecting the `data` in `inclusion`. Two modes are available: - **true**: Static data capture. A one-time data snapshot is taken when the pipe starts. Once the snapshot data is fully consumed, the pipe automatically terminates (executing `DROP PIPE` SQL automatically). - **false**: Dynamic data capture. 
In addition to capturing snapshot data when the pipe starts, it continuously captures subsequent data changes. The pipe remains active to process the dynamic data stream. | Boolean: true / false | No | false | -| database-name | When the user connects with `sql_dialect` set to `table`, this parameter can be specified. Determines the scope of data capture, affecting the `data` in `inclusion`. Specifies the database name to filter. It can be a specific database name or a Java-style regular expression to match multiple databases. By default, all databases are matched. | String: Database name or database regular expression pattern string, which can match uncreated or non - existent databases. | No | ".*" | -| table-name | When the user connects with `sql_dialect` set to `table`, this parameter can be specified. Determines the scope of data capture, affecting the `data` in `inclusion`. Specifies the table name to filter. It can be a specific table name or a Java-style regular expression to match multiple tables. By default, all tables are matched. | String: Data table name or data table regular expression pattern string, which can be uncreated or non - existent tables. | No | ".*" | -| start-time | Determines the scope of data capture, affecting the `data` in `inclusion`. Data with an event time **greater than or equal to** this parameter will be selected for stream processing in the pipe. | Long: [Long.MIN_VALUE, Long.MAX_VALUE](Unix bare timestamp)orString: ISO format timestamp supported by IoTDB | No | Long: [Long.MIN_VALUE, Long.MAX_VALUE](Unix bare timestamp) | -| end-time | Determines the scope of data capture, affecting the `data` in `inclusion`. Data with an event time **less than or equal to** this parameter will be selected for stream processing in the pipe. 
| Long: [Long.MIN_VALUE, Long.MAX_VALUE](Unix bare timestamp)orString: ISO format timestamp supported by IoTDB | No | Long: [Long.MIN_VALUE, Long.MAX_VALUE](Unix bare timestamp) | -| mode.double-living | Whether to enable full dual-active mode. When enabled, the system will ignore the `-sql_dialect` connection method to capture all tree-table model data and not forward data synced from another pipe (to avoid circular synchronization). | Boolean: true / false | No | false | -| mods | Same as mods.enable, whether to send the MODS file for TSFile. | Boolean: true / false | No | false | -| skipIf | Which errors can be skipped? Currently only the insufficient privileges error. | String:no-privileges | No | no-privileges | - -> 💎 **Note:** The difference between the values of true and false for the data extraction mode `mode.streaming` -> -> - True (recommended): Under this value, the task will process and send the data in real-time. Its characteristics are high timeliness and low throughput. -> - False: Under this value, the task will process and send the data in batches (according to the underlying data files). Its characteristics are low timeliness and high throughput. 
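
The source parameters in the table above can be combined in a single `WITH SOURCE` clause. The following is a minimal sketch rather than a recommended configuration: the pipe name `history_export`, the database `db_a`, the table pattern `sensor_.*`, and the time range are hypothetical placeholders, and the timestamps assume an ISO format accepted by IoTDB. The target address reuses the sample value from the earlier examples.

```SQL
CREATE PIPE history_export
WITH SOURCE (
  'source' = 'iotdb-source',
  'inclusion' = 'data.insert',                 -- Default value, written out for clarity
  'mode.snapshot' = 'true',                    -- One-time snapshot; the pipe terminates once it is consumed
  'database-name' = 'db_a',                    -- Hypothetical database name
  'table-name' = 'sensor_.*',                  -- Hypothetical Java-style regular expression
  'start-time' = '2024-01-01T00:00:00+08:00',  -- Inclusive lower bound (assumed ISO timestamp)
  'end-time' = '2024-01-31T23:59:59+08:00'     -- Inclusive upper bound (assumed ISO timestamp)
)
WITH SINK (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668'               -- URL of the DataNode service port on the target IoTDB
)
```

If `mode.snapshot` is left at its default of `false`, the same pipe keeps running after the initial snapshot and continues to capture later changes that fall inside the configured time range.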
- -### sink parameter - -#### iotdb-thrift-sink - -| **Parameter** | **Description** | Value Range | Required | Default Value | -|:-------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:---------| :------------ | -| sink | iotdb-thrift-sink or iotdb-thrift-async-sink | String: iotdb-thrift-sink or iotdb-thrift-async-sink | Yes | - | -| node-urls | URLs of the DataNode service ports on the target IoTDB. (please note that the synchronization task does not support forwarding to its own service). | String. Example:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Yes | - | -| user/username | username for connecting to the target IoTDB. Must have appropriate permissions. | String | No | root | -| password | Password for the username. | String | No | root | -| batch.enable | Enables batch mode for log transmission to improve throughput and reduce IOPS. | Boolean: true, false | No | true | -| batch.max-delay-seconds | Maximum delay (in seconds) for batch transmission. | Integer | No | 1 | -| batch.max-delay-ms | Maximum delay (in ms) for batch transmission. (Available since v2.0.5) | Integer | No | 1 | -| batch.size-bytes | Maximum batch size (in bytes) for batch transmission. | Long | No | 16*1024*1024 | -| compressor | The selected RPC compression algorithm. Multiple algorithms can be configured and will be adopted in sequence for each request. 
| String: snappy / gzip / lz4 / zstd / lzma2 | No | "" | -| compressor.zstd.level | When the selected RPC compression algorithm is zstd, this parameter can be used to additionally configure the compression level of the zstd algorithm. | Int: [-131072, 22] | No | 3 | -| rate-limit-bytes-per-second | The maximum number of bytes allowed to be transmitted per second, measured after compression. A value less than 0 means no limit. | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | No | -1 | -| load-tsfile-strategy | When synchronizing file data, whether the receiver waits for the local load tsfile operation to complete before responding to the sender:
sync: Wait for the local load tsfile operation to complete before returning the response.
async: Do not wait for the local load tsfile operation to complete; return the response immediately. | String: sync / async | No | sync | -| format | The payload formats for data transmission include the following options:
- hybrid: The format depends on what is passed from the processor (either tsfile or tablet), and the sink performs no conversion.
- tsfile: Data is forcibly converted to tsfile format before transmission. This is suitable for scenarios like data file backup.
- tablet: Data is forcibly converted to tablet format before transmission. This is useful for data synchronization when the sender and receiver have incompatible data types (to minimize errors). | String: hybrid / tsfile / tablet | No | hybrid | -| mark-as-general-write-request | This parameter controls whether data forwarded by external pipes can be synchronized between dual-active pipes (configured on the sender side of dual-active external pipes). | Boolean: true / false. True: can synchronize; False: cannot synchronize; | No | False | -| exception.data.convert-on-type-mismatch | Whether to enable automatic conversion when data types mismatch on the sink side | Boolean: true / false | No | true | - -#### iotdb-air-gap-sink - -| **Parameter** | **Description** | Value Range | Required | Default Value | -|:--------------------------------------------| :----------------------------------------------------------- | :----------------------------------------------------------- | :------- |:---------------------------------------------| -| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | Yes | - | -| node-urls | URLs of the DataNode service ports on the target IoTDB. (please note that the synchronization task does not support forwarding to its own service). | String. Example:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Yes | - | -| user/username | Username for connecting to the target IoTDB. Must have appropriate permissions. | String | No | root | -| password | Password for the username. | String | No | TimechoDB@2021 (Before V2.0.6.x it is root) | -| compressor | The selected RPC compression algorithm. Multiple algorithms can be configured and will be adopted in sequence for each request. | String: snappy / gzip / lz4 / zstd / lzma2 | No | "" | -| compressor.zstd.level | When the selected RPC compression algorithm is zstd, this parameter can be used to additionally configure the compression level of the zstd algorithm. 
| Int: [-131072, 22] | No | 3 | -| rate-limit-bytes-per-second | The maximum number of bytes allowed to be transmitted per second, measured after compression. A value less than 0 means no limit. | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | No | -1 | -| load-tsfile-strategy | When synchronizing file data, whether the receiver waits for the local load tsfile operation to complete before responding to the sender:
sync: Wait for the local load tsfile operation to complete before returning the response.
async: Do not wait for the local load tsfile operation to complete; return the response immediately. | String: sync / async | No | sync | -| air-gap.handshake-timeout-ms | The timeout duration for the handshake requests when the sender and receiver attempt to establish a connection for the first time, in milliseconds. | Integer | No | 5000 | -| exception.data.convert-on-type-mismatch | Whether to enable automatic conversion when data types mismatch on the sink side | Boolean: true / false | No | true | - -#### iotdb-thrift-ssl-sink - -| **Parameter** | **Description** | Value Range | Required | Default Value | -|:--------------------------------------------|:------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:---------| :------------ | -| sink | iotdb-thrift-ssl-sink | String: iotdb-thrift-ssl-sink | Yes | - | -| node-urls | URLs of the DataNode service ports on the target IoTDB. (please note that the synchronization task does not support forwarding to its own service). | String. Example:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Yes | - | -| user/username | Username for connecting to the target IoTDB. Must have appropriate permissions. | String | No | root | -| password | Password for the username. | String | No | root | -| batch.enable | Enables batch mode for log transmission to improve throughput and reduce IOPS. 
| Boolean: true, false | No | true | -| batch.max-delay-seconds | Maximum delay (in seconds) for batch transmission. | Integer | No | 1 | -| batch.max-delay-ms | Maximum delay (in ms) for batch transmission. (Available since v2.0.5) | Integer | No | 1 | -| batch.size-bytes | Maximum batch size (in bytes) for batch transmission. | Long | No | 16*1024*1024 | -| compressor | The selected RPC compression algorithm. Multiple algorithms can be configured and will be adopted in sequence for each request. | String: snappy / gzip / lz4 / zstd / lzma2 | No | "" | -| compressor.zstd.level | When the selected RPC compression algorithm is zstd, this parameter can be used to additionally configure the compression level of the zstd algorithm. | Int: [-131072, 22] | No | 3 | -| rate-limit-bytes-per-second | Maximum bytes allowed per second for transmission (calculated after compression). Set to a value less than 0 for no limit. | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | No | -1 | -| load-tsfile-strategy | When synchronizing file data, ​​whether the receiver waits for the local load tsfile operation to complete before responding to the sender​​:
sync: Wait for the local load tsfile operation to complete before returning the response.
async: Do not wait for the local load tsfile operation to complete; return the response immediately. | String: sync / async | No | sync | -| ssl.trust-store-path | Path to the trust store certificate for SSL connection. | String. Example: 'pki/trusted' | Yes | - | -| ssl.trust-store-pwd | Password for the trust store certificate. | String | Yes | - | -| format | The payload formats for data transmission include the following options:
- hybrid: The format depends on what is passed from the processor (either tsfile or tablet), and the sink performs no conversion.
- tsfile: Data is forcibly converted to tsfile format before transmission. This is suitable for scenarios like data file backup.
- tablet: Data is forcibly converted to tablet format before transmission. This is useful for data synchronization when the sender and receiver have incompatible data types (to minimize errors). | String: hybrid / tsfile / tablet | No | hybrid | -| mark-as-general-write-request | This parameter controls whether data forwarded by external pipes can be synchronized between dual-active pipes (configured on the sender side of dual-active external pipes). (Available since v2.0.5) | Boolean: true / false. True: can synchronize; False: cannot synchronize; | No | False | -| exception.data.convert-on-type-mismatch | Whether to enable automatic conversion when data types mismatch on the sink side | Boolean: true / false | No | true | - - - -#### write-back-sink - -| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** | -| ---------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- |--------------| -| sink | write-back-sink | String: write-back-sink | Yes | - | -| user/username | User used for write-back | String: username | No | root | -| password | Password used for write-back | String: password | No | root123 | -| user-id | User ID corresponding to the user | String | No | root | -| cli-hostname | CLI hostname corresponding to the user | String | No | root | -| use-event-user-name | Whether to use another user's username if the event contains one (generally not needed now because there is no external source) | Boolean: true / false | No | false | - -#### opc-ua-sink - -| **Parameter** | **Description** | Value Range | Required | Default Value | -|:-------------------------------------|:-------------------------------------------------------------------------------------------------------| :------------------------------------- 
|:-------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| sink | opc-ua-sink | String: opc-ua-sink | Yes | - | -| sink.opcua.model | OPC UA model used | String: client-server / pub-sub | No | pub-sub | -| sink.opcua.tcp.port | OPC UA's TCP port | Integer: [0, 65536] | No | 12686 | -| sink.opcua.https.port | OPC UA's HTTPS port | Integer: [0, 65536] | No | 8443 | -| sink.opcua.security.dir | Directory for OPC UA's keys and certificates | String: Path, supports absolute and relative directories | No | Opc_security folder``in the conf directory of the DataNode related to iotdb
If there is no conf directory for iotdb (such as launching DataNode in IDEA), it will be the iotdb_opc_Security folder``in the user's home directory | -| sink.opcua.enable-anonymous-access | Whether OPC UA allows anonymous access | Boolean | No | true | -| sink.user | User for OPC UA, specified in the configuration | String | No | root | -| sink.password | Password for OPC UA, specified in the configuration | String | No | TimechoDB@2021 (Before V2.0.6.x it is root) | -| sink.opcua.placeholder | A placeholder string used to substitute for null mapping paths when the value of the ID column is null | String | No | "null" | - - -#### tsfile-local-sink -| Parameter | Description | Value Range | Required | Default | -|-----------------------------------|-----------------------------------------------------------------------------|------------------------|----------|---------| -| sink | Component name | String: tsfile-local-sink | Yes | - | -| sink.local.target-path | Local target directory | String | Yes | - | -| sink.rate-limit-bytes-per-second | Rate limit threshold (unit: bytes/second). Takes effect when enabled. No limit if rate-limit <= 0 | Long | No | 0 | - -#### tsfile-remote-sink -| Parameter | Description | Value Range | Required | Default | -|------------------------------------|----------------------------------------------------------------------------|-------------------------|----------|---------| -| sink | Component name | String: tsfile-remote-sink | Yes | - | -| sink.scp.host | Remote host IP | String | Yes | - | -| sink.scp.port | Remote SSH port | Long | No | 22 | -| sink.scp.user | Remote SSH user | String | Yes | - | -| sink.scp.password | Remote SSH password | String | Yes | - | -| sink.scp.remote-path | Remote target directory | String | Yes | - | -| sink.rate-limit-bytes-per-second | Unit: bytes/second. Takes effect when enabled. 
No limit if rate-limit <= 0 | Long | No | 0 | -| sink.scp.object-parallelism | Maximum parallelism for object file transmission | Long | No |` min(cpu/4,16)` | -| sink.scp.object-batch-size-bytes | Maximum size of Object files sent per asynchronous thread, unit: MB | Long | No | 200 | - diff --git a/src/UserGuide/Master/Table/User-Manual/Maintenance-commands_timecho.md b/src/UserGuide/Master/Table/User-Manual/Maintenance-commands_timecho.md deleted file mode 100644 index bdcb82e21..000000000 --- a/src/UserGuide/Master/Table/User-Manual/Maintenance-commands_timecho.md +++ /dev/null @@ -1,934 +0,0 @@ - -# Maintenance Statement - -## 1. Status Checking - -### 1.1 Viewing the Connected Model - -**Description**: Returns the current SQL dialect model (`Tree` or `Table`). - -**Syntax**: - -```SQL -showCurrentSqlDialectStatement - : SHOW CURRENT_SQL_DIALECT - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CURRENT_SQL_DIALECT; -``` - -**Result:** - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 1.2 Viewing the Logged-in Username - -**Description**: Returns the currently logged-in username. - -**Syntax**: - -```SQL -showCurrentUserStatement - : SHOW CURRENT_USER - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CURRENT_USER; -``` - -**Result**: - -```Plain -+-----------+ -|CurrentUser| -+-----------+ -| root| -+-----------+ -``` - -### 1.3 Viewing the Connected Database Name - -**Description**: Returns the name of the currently connected database. If no `USE` statement has been executed, it returns `null`. 
- -**Syntax**: - -```SQL -showCurrentDatabaseStatement - : SHOW CURRENT_DATABASE - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CURRENT_DATABASE; -IoTDB> USE test; -IoTDB> SHOW CURRENT_DATABASE; -``` - -**Result**: - -```Plain -+---------------+ -|CurrentDatabase| -+---------------+ -| null| -+---------------+ -+---------------+ -|CurrentDatabase| -+---------------+ -| test| -+---------------+ -``` - -### 1.4 Viewing the Cluster Version - -**Description**: Returns the current cluster version. - -**Syntax**: - -```SQL -showVersionStatement - : SHOW VERSION - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW VERSION; -``` - -**Result**: - -```Plain -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.1.2| 1ca4008| -+-------+---------+ -``` - -### 1.5 Viewing Cluster Key Parameters - -**Description**: Returns key parameters of the current cluster. - -**Syntax**: - -```SQL -showVariablesStatement - : SHOW VARIABLES - ; -``` - -Key Parameters: - -1. **ClusterName**: The name of the current cluster. -2. **DataReplicationFactor**: Number of data replicas per DataRegion. -3. **SchemaReplicationFactor**: Number of schema replicas per SchemaRegion. -4. **DataRegionConsensusProtocolClass**: Consensus protocol class for DataRegions. -5. **SchemaRegionConsensusProtocolClass**: Consensus protocol class for SchemaRegions. -6. **ConfigNodeConsensusProtocolClass**: Consensus protocol class for ConfigNodes. -7. **TimePartitionOrigin**: The starting timestamp of database time partitions. -8. **TimePartitionInterval**: The interval of database time partitions (in milliseconds). -9. **ReadConsistencyLevel**: The consistency level for read operations. -10. **SchemaRegionPerDataNode**: Number of SchemaRegions per DataNode. -11. **DataRegionPerDataNode**: Number of DataRegions per DataNode. -12. **SeriesSlotNum**: Number of SeriesSlots per DataRegion. -13. **SeriesSlotExecutorClass**: Implementation class for SeriesSlots. -14. 
**DiskSpaceWarningThreshold**: Disk space warning threshold (in percentage). -15. **TimestampPrecision**: Timestamp precision. - -**Example**: - -```SQL -IoTDB> SHOW VARIABLES; -``` - -**Result**: - -```Plain -+----------------------------------+-----------------------------------------------------------------+ -| Variable| Value| -+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - -### 1.6 Viewing the Cluster ID - -**Description**: Returns the ID of the current cluster. - -**Syntax**: - -```SQL -showClusterIdStatement - : SHOW (CLUSTERID | CLUSTER_ID) - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CLUSTER_ID; -``` - -**Result**: - -```Plain -+------------------------------------+ -| ClusterId| -+------------------------------------+ -|40163007-9ec1-4455-aa36-8055d740fcda| -+------------------------------------+ -``` - -### 1.7 Viewing the Timestamp of the Connected DataNode - -**Description**: Returns the current timestamp of the DataNode process directly connected to the client. 
- -**Syntax**: - -```SQL -showCurrentTimestampStatement - : SHOW CURRENT_TIMESTAMP - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CURRENT_TIMESTAMP; -``` - -**Result**: - -```Plain -+-----------------------------+ -| CurrentTimestamp| -+-----------------------------+ -|2025-02-17T11:11:52.987+08:00| -+-----------------------------+ -``` - -### 1.8 Viewing Executing Queries - -**Description**: Displays information about all currently executing queries. - -> For more details on how to use system tables, please refer to [System Tables](../Reference/System-Tables_timecho.md) - -**Syntax**: - -```SQL -showQueriesStatement - : SHOW (QUERIES | QUERY PROCESSLIST) - (WHERE where=booleanExpression)? - (ORDER BY sortItem (',' sortItem)*)? - limitOffsetClause - ; -``` - -**Parameters**: - -1. **WHERE Clause**: Filters the result set based on specified conditions. -2. **ORDER BY Clause**: Sorts the result set based on specified columns. -3. **limitOffsetClause**: Limits the number of rows returned. - 1. Format: `LIMIT <offset>, <row_count>`. - -**Columns in QUERIES Table**: - -- **query_id**: Unique ID of the query. -- **start_time**: Timestamp when the query started. -- **datanode_id**: ID of the DataNode executing the query. -- **elapsed_time**: Time elapsed since the query started (in seconds). -- **statement**: The SQL statement being executed. -- **user**: The user who initiated the query.
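
The optional clauses above can be combined in one statement. The following is a minimal sketch, assuming the `elapsed_time` column listed above and that `sortItem` accepts the usual `DESC` modifier; the threshold and row count are illustrative values only.

```SQL
-- Show the five longest-running queries that have been executing for more than 30 seconds.
-- The threshold (30) and the row count (5) are arbitrary values chosen for illustration.
SHOW QUERIES
  WHERE elapsed_time > 30
  ORDER BY elapsed_time DESC
  LIMIT 5;
```

Sorting by `elapsed_time` before limiting keeps the slowest queries at the top, which is usually the set of interest when deciding what to terminate.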
- -**Example**: - -```SQL -IoTDB> SHOW QUERIES WHERE elapsed_time > 30; -``` - -**Result**: - -```Plain -+-----------------------+-----------------------------+-----------+------------+------------+----+ -| query_id| start_time|datanode_id|elapsed_time| statement|user| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -|20250108_101015_00000_1|2025-01-08T18:10:15.935+08:00| 1| 32.283|show queries|root| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -``` - - -### 1.9 Viewing Region Information - -**Description**: Displays regions' information of the current cluster. - -**Syntax**: - -```SQL -showRegionsStatement - : SHOW REGIONS - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW REGIONS -``` - -**Result**: - -```SQL -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -|RegionId| Type| Status| Database|SeriesSlotNum|TimeSlotNum|DataNodeId|RpcAddress|RpcPort|InternalAddress| Role| CreateTime|TsFileSize| -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -| 6|SchemaRegion|Running|tcollector| 670| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.194| | -| 7| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.196| 169.85 KB| -| 8| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.198| 161.63 KB| -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -``` - -### 1.10 Viewing Available Nodes - -**Description**: Returns the RPC addresses and ports of all available DataNodes in the current cluster. 
Note: A DataNode is considered "available" if it is not in the REMOVING state. - -> This feature is supported starting from v2.0.8. - -**Syntax**: - -```SQL -showAvailableUrlsStatement - : SHOW AVAILABLE URLS - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW AVAILABLE URLS -``` - -**Result**: - -```SQL -+----------+-------+ -|RpcAddress|RpcPort| -+----------+-------+ -| 0.0.0.0| 6667| -+----------+-------+ -``` - -### 1.11 View Service Information - -**Description**: Returns service information (MQTT service, REST service) on all active DataNodes (in RUNNING or READ-ONLY state) in the current cluster. - -> Supported since V2.0.8.2 - -#### Syntax: -```sql -showServicesStatement - : SHOW SERVICES - ; -``` - -#### Examples: -```sql -IoTDB> SHOW SERVICES -IoTDB> SHOW SERVICES ON 1 -``` - -Execution result: -```sql -+--------------+-------------+---------+ -| Service Name | DataNode ID | State | -+--------------+-------------+---------+ -| MQTT | 1 | STOPPED | -| REST | 1 | RUNNING | -+--------------+-------------+---------+ -``` - -### 1.12 View Cluster Activation Status - -**Description**:Returns the activation status of the current cluster. - -#### Syntax: - -```SQL -showActivationStatement - : SHOW ACTIVATION - ; -``` - -#### Examples: - -```SQL -IoTDB> SHOW ACTIVATION -``` - -Execution result: - -```SQL -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -### 1.13 View Node Configuration - -**Description**: By default, returns the effective configuration items from the configuration file of the specified node (identified by `node_id`). 
If `node_id` is not specified, returns the configuration of the directly connected DataNode. -Adding the `all` parameter returns all configuration items (the `value` of unconfigured items is `null`). -Adding the `with desc` parameter returns configuration items with descriptions. - -> Supported since version 2.0.9.1 - -#### Syntax: -```SQL -showConfigurationStatement - : SHOW (ALL)? CONFIGURATION (ON nodeId=INTEGER_VALUE)? (WITH DESC)? - ; -``` - -#### Result Set Description -| Column Name | Column Type | Description | -|---------------|-------------|---------------------------------| -| name | string | Configuration name | -| value | string | Configuration value | -| default_value | string | Default value of the configuration | -| description | string | Configuration description (optional) | - -#### Examples: - -1. View configuration of the directly connected DataNode -```SQL -SHOW CONFIGURATION; -``` - -```Bash -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| name| value| default_value| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| cn_internal_address| 127.0.0.1| 127.0.0.1| -| cn_internal_port| 10710| 10710| -| cn_consensus_port| 10720| 10720| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| -| dn_rpc_port| 6667| 6667| -| dn_internal_address| 127.0.0.1| 127.0.0.1| -| dn_internal_port| 10730| 10730| -| dn_mpp_data_exchange_port| 10740| 10740| -| dn_schema_region_consensus_port| 10750| 10750| -| dn_data_region_consensus_port| 10760| 10760| -| schema_replication_factor| 1| 1| -|schema_region_consensus_protocol_class| 
org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| -| data_replication_factor| 1| 1| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| -| cn_metric_prometheus_reporter_port| 9091| 9091| -| dn_metric_prometheus_reporter_port| 9092| 9092| -| series_slot_num| 1000| 1000| -| series_partition_executor_class|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| time_partition_origin| 0| 0| -| time_partition_interval| 604800000| 604800000| -| disk_space_warning_threshold| 0.05| 0.05| -| schema_engine_mode| Memory| Memory| -| tag_attribute_total_size| 700| 700| -| read_consistency_level| strong| strong| -| timestamp_precision| ms| ms| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -Total line number = 28 -It costs 0.013s -``` - -2. 
View configuration of the node with a specific node ID -```SQL -SHOW CONFIGURATION ON 1; -``` - -```Bash -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| name| value| default_value| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| cn_internal_address| 127.0.0.1| 127.0.0.1| -| cn_internal_port| 10710| 10710| -| cn_consensus_port| 10720| 10720| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| -| dn_rpc_port| 6667| 6667| -| dn_internal_address| 127.0.0.1| 127.0.0.1| -| dn_internal_port| 10730| 10730| -| dn_mpp_data_exchange_port| 10740| 10740| -| dn_schema_region_consensus_port| 10750| 10750| -| dn_data_region_consensus_port| 10760| 10760| -| schema_replication_factor| 1| 1| -|schema_region_consensus_protocol_class| org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| -| data_replication_factor| 1| 1| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| -| cn_metric_prometheus_reporter_port| 9091| 9091| -| dn_metric_prometheus_reporter_port| 9092| 9092| -| series_slot_num| 1000| 1000| -| series_partition_executor_class|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| time_partition_origin| 0| 0| -| time_partition_interval| 604800000| 604800000| -| disk_space_warning_threshold| 0.05| 0.05| -| schema_engine_mode| Memory| Memory| -| tag_attribute_total_size| 700| 700| -| read_consistency_level| strong| strong| -| timestamp_precision| ms| ms| 
-+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -Total line number = 28 -It costs 0.004s -``` - -3. View all configurations -```SQL -SHOW ALL CONFIGURATION; -``` - -```Bash -+---------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| name| value| default_value| -+---------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| cn_internal_address| 127.0.0.1| 127.0.0.1| -| cn_internal_port| 10710| 10710| -| cn_consensus_port| 10720| 10720| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| -| dn_rpc_port| 6667| 6667| -| dn_internal_address| 127.0.0.1| 127.0.0.1| -| dn_internal_port| 10730| 10730| -| dn_mpp_data_exchange_port| 10740| 10740| -| dn_schema_region_consensus_port| 10750| 10750| -| dn_data_region_consensus_port| 10760| 10760| -| dn_join_cluster_retry_interval_ms| null| 5000| -| config_node_consensus_protocol_class| null| org.apache.iotdb.consensus.ratis.RatisConsensus| -| schema_replication_factor| 1| 1| -| schema_region_consensus_protocol_class| org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| -| data_replication_factor| 1| 1| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| -| cn_system_dir| null| data/confignode/system| -| cn_consensus_dir| null| data/confignode/consensus| -| cn_pipe_receiver_file_dir| null| data/confignode/system/pipe/receiver| -... 
-+---------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -Total line number = 412 -It costs 0.006s -``` - -4. View configuration items with descriptions -```SQL -SHOW CONFIGURATION ON 1 WITH DESC; -``` - -```Bash -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| name| value| default_value| description| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| Used for indicate cluster name and distinguish different cluster. 
If you need to modify the cluster name, it's recommended to use 'set configuration "cluster_name=xxx"' sql. Manually modifying configuration file is not recommended, which may cause node restart fail.effectiveMode: hot_reload.Datatype: string| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710|For the first ConfigNode to start, cn_seed_config_node points to its own cn_internal_address:cn_internal_port. For other ConfigNodes that to join the cluster, cn_seed_config_node points to any running ConfigNode's cn_internal_address:cn_internal_port. Note: After this ConfigNode successfully joins the cluster for the first time, this parameter is no longer used. Each node automatically maintains the list of ConfigNodes and traverses connections when restarting. Format: address:port e.g. 127.0.0.1:10710.effectiveMode: first_start.Datatype: String| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| dn_seed_config_node points to any running ConfigNode's cn_internal_address:cn_internal_port. Note: After this DataNode successfully joins the cluster for the first time, this parameter is no longer used. Each node automatically maintains the list of ConfigNodes and traverses connections when restarting. Format: address:port e.g. 127.0.0.1:10710.effectiveMode: first_start.Datatype: String| -| cn_internal_address| 127.0.0.1| 127.0.0.1| Used for RPC communication inside cluster. 
Could set 127.0.0.1(for local test) or ipv4 address.effectiveMode: first_start.Datatype: String| -| cn_internal_port| 10710| 10710| Used for RPC communication inside cluster.effectiveMode: first_start.Datatype: int| -| cn_consensus_port| 10720| 10720| Used for consensus communication among ConfigNodes inside cluster.effectiveMode: first_start.Datatype: int| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| Used for connection of IoTDB native clients(Session) Could set 127.0.0.1(for local test) or ipv4 address.effectiveMode: restart.Datatype: String| -| dn_rpc_port| 6667| 6667| Used for connection of IoTDB native clients(Session) Bind with dn_rpc_address.effectiveMode: restart.Datatype: int| -| dn_internal_address| 127.0.0.1| 127.0.0.1| Used for communication inside cluster. could set 127.0.0.1(for local test) or ipv4 address.effectiveMode: first_start.Datatype: String| -| dn_internal_port| 10730| 10730| Used for communication inside cluster. Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| dn_mpp_data_exchange_port| 10740| 10740| Port for data exchange among DataNodes inside cluster Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| dn_schema_region_consensus_port| 10750| 10750| port for consensus's communication for schema region inside cluster. Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| dn_data_region_consensus_port| 10760| 10760| port for consensus's communication for data region inside cluster. Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| schema_replication_factor| 1| 1| Default number of schema replicas.effectiveMode: first_start.Datatype: int| -|schema_region_consensus_protocol_class| org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| SchemaRegion consensus protocol type. This parameter is unmodifiable after ConfigNode starts for the first time. These consensus protocols are currently supported: 1. 
org.apache.iotdb.consensus.ratis.RatisConsensus 2. org.apache.iotdb.consensus.simple.SimpleConsensus (The schema_replication_factor can only be set to 1).effectiveMode: first_start.Datatype: string| -| data_replication_factor| 1| 1| Default number of data replicas.effectiveMode: first_start.Datatype: int| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| DataRegion consensus protocol type. This parameter is unmodifiable after ConfigNode starts for the first time. These consensus protocols are currently supported: 1. org.apache.iotdb.consensus.simple.SimpleConsensus (The data_replication_factor can only be set to 1) 2. org.apache.iotdb.consensus.iot.IoTConsensus 3. org.apache.iotdb.consensus.ratis.RatisConsensus 4. org.apache.iotdb.consensus.iot.IoTConsensusV2.effectiveMode: first_start.Datatype: string| -| cn_metric_prometheus_reporter_port| 9091| 9091| The port of prometheus reporter of metric module.effectiveMode: restart.Datatype: int| -| dn_metric_prometheus_reporter_port| 9092| 9092| The port of prometheus reporter of metric module.effectiveMode: restart.Datatype: int| -| series_slot_num| 1000| 1000| All parameters in Partition configuration is unmodifiable after ConfigNode starts for the first time. And these parameters should be consistent within the ConfigNodeGroup. Number of SeriesPartitionSlots per Database.effectiveMode: first_start.Datatype: Integer| -| series_partition_executor_class|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| SeriesPartitionSlot executor class These hashing algorithms are currently supported: 1. BKDRHashExecutor(Default) 2. APHashExecutor 3. JSHashExecutor 4. 
SDBMHashExecutor Also, if you want to implement your own SeriesPartition executor, you can inherit the SeriesPartitionExecutor class and modify this parameter to correspond to your Java class.effectiveMode: first_start.Datatype: String| -| time_partition_origin| 0| 0| Time partition origin in milliseconds, default is equal to zero. This origin is set by default to the beginning of Unix time, which is January 1, 1970, at 00:00 UTC (Coordinated Universal Time). This point is known as the Unix epoch, and its timestamp is 0. If you want to specify a different time partition origin, you can set this value to a specific Unix timestamp in milliseconds.effectiveMode: first_start.Datatype: long| -| time_partition_interval| 604800000| 604800000| Time partition interval in milliseconds, and partitioning data inside each data region, default is equal to one week.effectiveMode: first_start.Datatype: long| -| disk_space_warning_threshold| 0.05| 0.05| Disk remaining threshold at which DataNode is set to ReadOnly status.effectiveMode: restart.Datatype: double(percentage)| -| schema_engine_mode| Memory| Memory| The schema management mode of schema engine. Currently, support Memory and PBTree. This config of all DataNodes in one cluster must keep same.effectiveMode: first_start.Datatype: string| -| tag_attribute_total_size| 700| 700| max size for a storage block for tags and attributes of one time series. If the combined size of tags and attributes exceeds the tag_attribute_total_size, a new storage block will be allocated to continue storing the excess data. the unit is byte.effectiveMode: first_start.Datatype: int| -| read_consistency_level| strong| strong| The read consistency level These consistency levels are currently supported: 1. strong(Default, read from the leader replica) 2. weak(Read from a random replica).effectiveMode: restart.Datatype: string| -| timestamp_precision| ms| ms| Use this value to set timestamp precision as "ms", "us" or "ns". 
Once the precision has been set, it can not be changed.effectiveMode: first_start.Datatype: String| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 28 -It costs 0.010s -``` - -## 2. Status Setting - -### 2.1 Setting the Connected Model - -**Description**: Sets the current SQL dialect model to `Tree` or `Table` which can be used in both tree and table models. - -**Syntax**: - -```SQL -SET SQL_DIALECT = (TABLE | TREE); -``` - -**Example**: - -```SQL -IoTDB> SET SQL_DIALECT=TABLE; -IoTDB> SHOW CURRENT_SQL_DIALECT; -``` - -**Result**: - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 2.2 Updating Configuration Items - -**Description**: Updates configuration items. Changes take effect immediately without restarting if the items support hot modification. - -**Syntax**: - -```SQL -setConfigurationStatement - : SET CONFIGURATION propertyAssignments (ON INTEGER_VALUE)? - ; - -propertyAssignments - : property (',' property)* - ; - -property - : identifier EQ propertyValue - ; - -propertyValue - : DEFAULT - | expression - ; -``` - -**Parameters**: - -1. **propertyAssignments**: A list of properties to update. - 1. Format: `property (',' property)*`. - 2. Values: - - `DEFAULT`: Resets the configuration to its default value. 
- - `expression`: A specific value (must be a string). -2. **ON INTEGER_VALUE** **(Optional):** Specifies the node ID to update. - 1. If not specified or set to a negative value, updates all ConfigNodes and DataNodes. - -**Example**: - -```SQL -IoTDB> SET CONFIGURATION disk_space_warning_threshold='0.05',heartbeat_interval_in_ms='1000' ON 1; -``` - -### 2.3 Loading Manually Modified Configuration Files - -**Description**: Loads manually modified configuration files and hot-loads the changes. Configuration items that support hot modification take effect immediately. - -**Syntax**: - -```SQL -loadConfigurationStatement - : LOAD CONFIGURATION localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** **(Optional):** - 1. Specifies the scope of configuration loading. - 2. Default: `CLUSTER`. - 3. Values: - - `LOCAL`: Loads configuration only on the DataNode directly connected to the client. - - `CLUSTER`: Loads configuration on all DataNodes in the cluster. - -**Example**: - -```SQL -IoTDB> LOAD CONFIGURATION ON LOCAL; -``` - -### 2.4 Setting the System Status - -**Description**: Sets the system status to either `READONLY` or `RUNNING`. - -**Syntax**: - -```SQL -setSystemStatusStatement - : SET SYSTEM TO (READONLY | RUNNING) localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **RUNNING |** **READONLY**: - 1. **RUNNING**: Sets the system to running mode, allowing both read and write operations. - 2. **READONLY**: Sets the system to read-only mode, allowing only read operations and prohibiting writes. -2. **localOrClusterMode** **(Optional):** - 1. **LOCAL**: Applies the status change only to the DataNode directly connected to the client. - 2. **CLUSTER**: Applies the status change to all DataNodes in the cluster. - 3. **Default**: `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> SET SYSTEM TO READONLY ON CLUSTER; -``` - -## 3. 
Data Management - -### 3.1 Flushing Data from Memory to Disk - -**Description**: Flushes data from the memory table to disk. - -**Syntax**: - -```SQL -flushStatement - : FLUSH identifier? (',' identifier)* booleanValue? localOrClusterMode? - ; - -booleanValue - : TRUE | FALSE - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **identifier** **(Optional):** - 1. Specifies the name of the database to flush. - 2. If not specified, all databases are flushed. - 3. **Multiple Databases**: Multiple database names can be specified, separated by commas (e.g., `FLUSH test_db1, test_db2`). -2. **booleanValue** **(Optional):** - 1. Specifies the type of data to flush. - 2. **TRUE**: Flushes only the sequential memory table. - 3. **FALSE**: Flushes only the unsequential memory table. - 4. **Default**: Flushes both sequential and unsequential memory tables. -3. **localOrClusterMode** **(Optional):** - 1. **ON LOCAL**: Flushes only the memory tables on the DataNode directly connected to the client. - 2. **ON CLUSTER**: Flushes memory tables on all DataNodes in the cluster. - 3. **Default:** `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> FLUSH test_db TRUE ON LOCAL; -``` - - -## 4. Data Repair - -### 4.1 Starting Background Scan and Repair of TsFiles - -**Description**: Starts a background task to scan and repair TsFiles, fixing issues such as timestamp disorder within data files. - -**Syntax**: - -```SQL -startRepairDataStatement - : START REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** **(Optional):** - 1. **ON LOCAL**: Executes the repair task only on the DataNode directly connected to the client. - 2. **ON CLUSTER**: Executes the repair task on all DataNodes in the cluster. - 3. **Default:** `ON CLUSTER`.
- -**Example**: - -```SQL -IoTDB> START REPAIR DATA ON CLUSTER; -``` - -### 4.2 Pausing Background TsFile Repair Task - -**Description**: Pauses the background repair task. The paused task can be resumed by executing the `START REPAIR DATA` command again. - -**Syntax**: - -```SQL -stopRepairDataStatement - : STOP REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** **(Optional):** - 1. **ON LOCAL**: Executes the pause command only on the DataNode directly connected to the client. - 2. **ON CLUSTER**: Executes the pause command on all DataNodes in the cluster. - 3. **Default:** `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> STOP REPAIR DATA ON CLUSTER; -``` - -## 5. Query Termination - -### 5.1 Terminating Queries - -**Description**: Terminates one or more running queries. - -**Syntax**: - -```SQL -killQueryStatement - : KILL (QUERY queryId=string | ALL QUERIES) - ; -``` - -**Parameters**: - -1. **QUERY** **queryId:** Specifies the ID of the query to terminate. - -- To obtain the `queryId`, use the `SHOW QUERIES` command. - -2. **ALL QUERIES:** Terminates all currently running queries. - -**Example**: - -Terminate a specific query: - -```SQL -IoTDB> KILL QUERY 20250108_101015_00000_1; -``` - -Terminate all queries: - -```SQL -IoTDB> KILL ALL QUERIES; -``` - -## 6. Query Debugging -### 6.1 DEBUG SQL - -**Definition**: Add the `DEBUG` keyword at the beginning of an SQL query statement. During execution, debug logs will be output, including the underlying file scan information involved in the query. - -> Supported since V2.0.9.1 - -#### Syntax: -```sql -debugSQLStatement - : DEBUG ? query - ; -``` - -**Description**: -* Log output path: `logs/log_datanode_query_debug.log` - -#### Example: -1. Execute the following SQL for DEBUG query -```sql -DEBUG SELECT * FROM table3; -``` - -2. 
Check the log content in `log_datanode_query_debug.log` to view the file scan information involved in the query. - -```bash -2026-03-24 10:10:41,515 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.t.TsFileResource:1098 - Path: table3.d1 file /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2864/1769139940009-1-0-0.tsfile is not satisfied because of no device! -2026-03-24 10:10:41,515 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.t.TsFileResource:1098 - Path: table3.d1 file /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2865/1769139940010-1-0-0.tsfile is not satisfied because of no device! -2026-03-24 10:10:41,516 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:159 - Cache miss: table3.d1. in file: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile -2026-03-24 10:10:41,516 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:160 - Device: table3.d1, all sensors: [, temperature] -2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.BloomFilterCache:110 - get bloomFilter from cache where filePath is: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile -2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:227 - Get timeseries: table3.d1. 
metadata in file: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile from cache: TimeseriesMetadata{timeSeriesMetadataType=-128, chunkMetaDataListDataSize=8, measurementId='', dataType=VECTOR, statistics=startTime: 1747065600001 endTime: 1747065601002 count: 2, modified=false, isSeq=true, chunkMetadataList=[measurementId: , datatype: VECTOR, version: 0, Statistics: startTime: 1747065600001 endTime: 1747065601002 count: 2, deleteIntervalList: null]}. -2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:227 - Get timeseries: table3.d1.temperature metadata in file: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile from cache: TimeseriesMetadata{timeSeriesMetadataType=64, chunkMetaDataListDataSize=8, measurementId='temperature', dataType=FLOAT, statistics=startTime: 1747065600001 endTime: 1747065601002 count: 2 [minValue:85.0,maxValue:90.0,firstValue:90.0,lastValue:85.0,sumValue:175.0], modified=false, isSeq=true, chunkMetadataList=[measurementId: temperature, datatype: FLOAT, version: 0, Statistics: startTime: 1747065600001 endTime: 1747065601002 count: 2 [minValue:85.0,maxValue:90.0,firstValue:90.0,lastValue:85.0,sumValue:175.0], deleteIntervalList: null]}. 
-2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:110 - Modifications size is 1 for file Path: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:114 - [] -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:125 - After modification Chunk meta data list is: -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:126 - org.apache.tsfile.file.metadata.TableDeviceChunkMetadata@2e11291f -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.ChunkCache:167 - get chunk from cache whose key is: ChunkCacheKey{filePath='/home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile', regionId=4, timePartitionId=2888, tsFileVersion=1, compactionVersion=0, offsetOfChunkHeader=19} -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.ChunkCache:167 - get chunk from cache whose key is: ChunkCacheKey{filePath='/home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile', regionId=4, timePartitionId=2888, tsFileVersion=1, compactionVersion=0, offsetOfChunkHeader=46} -2026-03-24 10:10:41,519 [pool-69-IoTDB-ClientRPC-Processor-1$20260324_021041_00068_1] INFO o.a.i.d.q.p.Coordinator:902 - debug select * from table3 -``` diff --git a/src/UserGuide/Master/Table/User-Manual/Pattern-Query_timecho.md b/src/UserGuide/Master/Table/User-Manual/Pattern-Query_timecho.md deleted file mode 100644 index 2d5135657..000000000 --- a/src/UserGuide/Master/Table/User-Manual/Pattern-Query_timecho.md +++ /dev/null @@ -1,1138 +0,0 @@ - - -# 
Pattern Query - -For time-series data feature analysis scenarios, IoTDB provides the capability of pattern query, which delivers a flexible and efficient solution for in-depth mining and complex computation of time-series data. The following sections describe the feature in detail. - -## 1. Overview - -Pattern query enables capturing a segment of continuous data by defining the recognition logic of pattern variables and regular expressions, and performing analysis and calculation on each captured data segment. It is suitable for business scenarios such as identifying specific patterns in time-series data (as shown in the figure below) and detecting specific events. - -![](/img/timeseries-featured-analysis-1.png) - -> Note: This feature is available starting from version V2.0.5. - -## 2. Function Introduction -### 2.1 Syntax Format - -```SQL -MATCH_RECOGNIZE ( - [ PARTITION BY column [, ...] ] - [ ORDER BY column [, ...] ] - [ MEASURES measure_definition [, ...] ] - [ ROWS PER MATCH ] - [ AFTER MATCH skip_to ] - PATTERN ( row_pattern ) - [ SUBSET subset_definition [, ...] ] - DEFINE variable_definition [, ...] -) -``` - -**Note:** - -* PARTITION BY: Optional. Used to group the input table, and each group can perform pattern matching independently. If this clause is not specified, the entire input table will be processed as a single unit. -* ORDER BY: Optional. Used to ensure that input data is processed in a specific order during matching. -* MEASURES: Optional. Used to specify which information to extract from the matched segment of data. -* ROWS PER MATCH: Optional. Used to specify the output method of the result set after successful pattern matching. -* AFTER MATCH SKIP: Optional. Used to specify which row to resume from for the next pattern match after identifying a non-empty match. -* PATTERN: Used to define the row pattern to be matched. -* SUBSET: Optional. Used to merge rows matched by multiple basic pattern variables into a single logical set. 
-* DEFINE: Used to define the basic pattern variables for the row pattern. - -**Original Data for Syntax Examples:** - -```SQL -IoTDB:database3> select * from t -+-----------------------------+------+----------+ -| time|device|totalprice| -+-----------------------------+------+----------+ -|2025-01-01T00:01:00.000+08:00| d1| 90| -|2025-01-01T00:02:00.000+08:00| d1| 80| -|2025-01-01T00:03:00.000+08:00| d1| 70| -|2025-01-01T00:04:00.000+08:00| d1| 80| -|2025-01-01T00:05:00.000+08:00| d1| 70| -|2025-01-01T00:06:00.000+08:00| d1| 80| -+-----------------------------+------+----------+ - --- Creation Statement -create table t(device tag, totalprice int32 field) - -insert into t(time,device,totalprice) values(2025-01-01T00:01:00, 'd1', 90),(2025-01-01T00:02:00, 'd1', 80),(2025-01-01T00:03:00, 'd1', 70),(2025-01-01T00:04:00, 'd1', 80),(2025-01-01T00:05:00, 'd1', 70),(2025-01-01T00:06:00, 'd1', 80) -``` - -### 2.2 DEFINE Clause - -Used to specify the judgment condition for each basic pattern variable in pattern recognition. These variables are usually represented by identifiers (e.g., `A`, `B`), and the Boolean expressions in this clause precisely define which rows meet the requirements of the variable. - -* During pattern matching execution, a row is only marked as the variable (and thus included in the current matching group) if the Boolean expression returns TRUE. - -```SQL --- A row can only be identified as B if its totalprice value is less than the totalprice value of the previous row. -DEFINE B AS totalprice < PREV(totalprice) -``` - -* Variables not **explicitly** defined in this clause have an implicitly set condition of always true (TRUE), meaning they can be successfully matched on any input row. - -### 2.3 SUBSET Clause - -Used to merge rows matched by multiple basic pattern variables (e.g., `A`, `B`) into a combined pattern variable (e.g., `U`), allowing these rows to be treated as a single logical set for operations. 
It can be used in the `MEASURES`, `DEFINE`, and `AFTER MATCH SKIP` clauses. - -```SQL -SUBSET U = (A, B) -``` -For example, for the pattern `PATTERN ((A | B){5} C+)`, it is impossible to determine whether the 5th repetition matches the basic pattern variable A or B during matching. Therefore: - -1. In the `MEASURES` clause, if you need to reference the last row matched in this phase, you can do so by defining the combined pattern variable `SUBSET U = (A, B)`. At this point, the expression `RPR_LAST(U.totalprice)` will directly return the `totalprice` value of the target row. -2. In the `AFTER MATCH SKIP` clause, if the matching result does not include the basic pattern variable A or B, executing `AFTER MATCH SKIP TO LAST B` or `AFTER MATCH SKIP TO LAST A` will fail to jump due to missing anchors. However, by introducing the combined pattern variable `SUBSET U = (A, B)`, using `AFTER MATCH SKIP TO LAST U` is always valid. - -### 2.4 PATTERN Clause - -Used to define the row pattern to be matched, whose basic building block is a row pattern variable. - -```SQL -PATTERN ( row_pattern ) -``` - -#### 2.4.1 Pattern Types - -| Row Pattern | Syntax Format | Description | -|-----------------------|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Pattern Concatenation | `A B+ C+ D+` | Composed of subpatterns without any operators, matching all subpatterns in the declared order sequentially. | -| Pattern Alternation | `A \| B \| C` | Composed of multiple subpatterns separated by `\|`, matching only one of them. 
If multiple subpatterns can be matched, the leftmost one is selected. | -| Pattern Permutation | `PERMUTE(A, B, C)` | Equivalent to performing alternation matching on all different orders of the subpattern elements. It requires that A, B, and C must all be matched, but their order of appearance is not fixed. If multiple matching orders are possible, the priority is determined by the **lexicographical order** based on the definition sequence of elements in the PERMUTE list. For example, A B C has the highest priority, while C B A has the lowest. | -| Pattern Grouping | `(A B C)` | Encloses subpatterns in parentheses to treat them as a single unit, which can be used with other operators. For example, `(A B C)+` indicates a pattern where a group of `(A B C)` appears consecutively. | -| Empty Pattern | `()` | Represents an empty match that does not contain any rows. | -| Pattern Exclusion | `{- row_pattern -}` | Used to specify the matched part to be excluded from the output. Usually used with the `ALL ROWS PER MATCH` option to output rows of interest. For example, `PATTERN (A {- B+ C+ -} D+)` with ALL ROWS PER MATCH will only output the first row `(corresponding to A)` and the trailing rows `(corresponding to D+)` of the match. | - -#### 2.4.2 Partition Start/End Anchor - -* `^A` indicates matching a pattern that starts with A as the partition beginning - * When the value of the PATTERN clause is `^A`, the match must start from the first row of the partition, and this row must satisfy the definition of `A`. - * When the value of the PATTERN clause is `^A^` or `A^`, the output result is empty. -* `A$` indicates matching a pattern that ends with A as the partition end - * When the value of the PATTERN clause is `A$`, the match must end at the end of the partition, and this row must satisfy the definition of `A`. - * When the value of the PATTERN clause is `$A` or `$A$`, the output result is empty. 
- -**Examples** - -* Query sql - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - AFTER MATCH SKIP PAST LAST ROW - PATTERN %s -- PATTERN clause - DEFINE A AS true -) AS m; -``` - -* Results - * When the PATTERN clause is specified as PATTERN (^A) - - ![](/img/timeseries-featured-analysis-2.png) - - Actual Return - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * When the PATTERN clause is specified as PATTERN (^A^), the output result is empty. This is because it is impossible to match an A starting from the beginning of a partition and then return to the beginning of the partition again. - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. - ``` - - * When the PATTERN clause is specified as PATTERN (A\$) - - ![](/img/timeseries-featured-analysis-3.png) - - Actual Return - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:06:00.000+08:00| 1| 80| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * When the PATTERN clause is specified as PATTERN (\$A\$), the output result is empty. - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. - ``` - - -#### 2.4.3 Quantifiers - -Quantifiers are used to specify the number of times a subpattern repeats, placed after the corresponding subpattern (e.g., `(A | B)*`). 
- -Common quantifiers are as follows: - -| Quantifier | Description | -| -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `*` | Zero or more repetitions | -| `+` | One or more repetitions | -| `?` | Zero or one repetition | -| `{n}` | Exactly n repetitions | -| `{m, n}` | Repetitions between m and n times (m and n are non-negative integers). \* If the left bound is omitted, the default starts from 0; \* If the right bound is omitted, there is no upper limit on the number of repetitions (e.g., {5,} is equivalent to "at least five times"); \* If both left and right bounds are omitted (i.e., {,}), it is equivalent to `*`. | - -* The matching preference can be changed by adding `?` after the quantifier. - * `{3,5}`: Prefers 5 times, least prefers 3 times; `{3,5}?`: Prefers 3 times, least prefers 5 times. - * `?`: Prefers 1 time; `??`: Prefers 0 times. - -### 2.5 AFTER MATCH SKIP Clause - -Used to specify which row to start the next pattern match from after identifying a non-empty match. - -| Jump Strategy | Description | Allows Overlapping Matches? | -| ------------------------------------------------------------- | -------------------------------------------------------------------------------- | ----------------------------- | -| `AFTER MATCH SKIP PAST LAST ROW` | Default behavior. Starts from the row after the last row of the current match. | No | -| `AFTER MATCH SKIP TO NEXT ROW` | Starts from the second row in the current match. | Yes | -| `AFTER MATCH SKIP TO [ FIRST \| LAST ] pattern_variable` | Jumps to start from the [ first row | last row ] of a pattern variable. 
| Yes | - -* Among all possible configurations, only when `ALL ROWS PER MATCH WITH UNMATCHED ROWS` is used in combination with `AFTER MATCH SKIP PAST LAST ROW` can the system ensure that exactly one output record is generated for each input row. - -**Examples** - -* Query sql - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - %s -- AFTER MATCH SKIP clause - PATTERN (A B+ C+ D?) - SUBSET U = (C, D) - DEFINE - B AS B.totalprice < PREV (B.totalprice), - C AS C.totalprice > PREV (C.totalprice), - D AS false -- never matches -) AS m; -``` - -* Results - * When AFTER MATCH SKIP PAST LAST ROW is specified - - ![](/img/timeseries-featured-analysis-4-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: According to the semantics of `AFTER MATCH SKIP PAST LAST ROW`, starting from row 5, no valid match can be found - * This pattern will never have overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 4 - ``` - - * When AFTER MATCH SKIP TO NEXT ROW - - ![](/img/timeseries-featured-analysis-5-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: According to the semantics of `AFTER MATCH SKIP TO NEXT ROW`, starting from row 2, matches: Rows 2, 3, 4 - * Third match: Attempts to start from row 3, fails - * Fourth match: Attempts to start from row 4, succeeds, matches rows 4, 5, 6 - * This pattern allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - 
+-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:02:00.000+08:00| 2| 80| A| - |2025-01-01T00:03:00.000+08:00| 2| 70| B| - |2025-01-01T00:04:00.000+08:00| 2| 80| C| - |2025-01-01T00:04:00.000+08:00| 3| 80| A| - |2025-01-01T00:05:00.000+08:00| 3| 70| B| - |2025-01-01T00:06:00.000+08:00| 3| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 10 - ``` - - * When AFTER MATCH SKIP TO FIRST C - - ![](/img/timeseries-featured-analysis-6-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: Starts from the first C (i.e., row 4), matches rows 4, 5, 6 - * This pattern allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * When AFTER MATCH SKIP TO LAST B or AFTER MATCH SKIP TO B - - ![](/img/timeseries-featured-analysis-7-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: Attempts to start from the last B (i.e., row 3), fails - * Third match: Attempts to start from row 4, successfully matches rows 4, 5, 6 - * This pattern allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - 
|2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * When AFTER MATCH SKIP TO U - - ![](/img/timeseries-featured-analysis-8-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: `SKIP TO U` means jumping to the last C or D; D can never match successfully, so it jumps to the last C (i.e., row 4), successfully matching rows 4, 5, 6 - * This pattern allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * When AFTER MATCH SKIP TO A, you cannot jump to the first row of the match, otherwise it will cause an infinite loop - - ```SQL - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: cannot skip to first row of match - ``` - - * When AFTER MATCH SKIP TO D, you cannot jump to a pattern variable that does not exist in the match group (D never matches in this example) - - ```SQL - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: pattern variable is not present in match - ``` - - -### 2.6 ROWS PER MATCH Clause - -Used to specify the output method of the result set after a successful pattern match, including the following two main options: - -| Output Method | Rule Description | Output Result | Handling Logic for **Empty Matches/Unmatched Rows** | -| -------------------- | 
-------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| ONE ROW PER MATCH | Generates one output row for each successful match. | \* Columns in the PARTITION BY clause\* Expressions defined in the MEASURES clause. | Outputs empty matches; skips unmatched rows. | -| ALL ROWS PER MATCH | Each row in a match generates an output record, unless the row is excluded via exclusion syntax. | \* Columns in the PARTITION BY clause\* Columns in the ORDER BY clause\* Expressions defined in the MEASURES clause\* Remaining columns in the input table | \* Default: Outputs empty matches; skips unmatched rows.\* ALL ROWS PER MATCH​**SHOW EMPTY MATCHES**​: Outputs empty matches by default; skips unmatched rows.\* ALL ROWS PER MATCH​**OMIT EMPTY MATCHES**​: Does not output empty matches; skips unmatched rows.\* ALL ROWS PER MATCH​**WITH UNMATCHED ROWS**​: Outputs empty matches and generates an additional output record for each unmatched row. | - -### 2.7 MEASURES Clause - -Used to specify which information to extract from a matched set of data. This clause is optional; if not explicitly specified, some input columns will become the output results of pattern recognition based on the settings of the ROWS PER MATCH clause. - -SQL - -```SQL -MEASURES measure_expression AS measure_name [, ...] 
-``` - -* A `measure_expression` is a scalar value calculated from the matched set of data. - -| Usage Example | Description | -| ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `A.totalprice AS starting_price` | Returns the price from the first row in the matched group (i.e., the only row associated with variable A) as the starting price. | -| `RPR_LAST(B.totalprice) AS bottom_price` | Returns the price from the last row associated with variable B, representing the lowest price in the "V" shape pattern (corresponding to the end of the downward segment). | -| `RPR_LAST(U.totalprice) AS top_price` | Returns the highest price in the matched group, corresponding to the last row associated with variable C or D (i.e., the end of the entire matched group). [Assuming SUBSET U = (C, D)] | - -* Each `measure_expression` defines an output column, which can be referenced by its specified `measure_name`. - -### 2.8 Row Pattern Recognition Expressions - -Expressions used in the MEASURES and DEFINE clauses are ​**scalar expressions**​, evaluated in the row-level context of the input table. In addition to supporting standard SQL syntax, **scalar expressions** also support special extended functions for row pattern recognition. - -#### 2.8.1 Pattern Variable References - -```SQL -A.totalprice -U.orderdate -orderstatus -``` - -* When a column name is prefixed with a **basic pattern variable** or a ​**combined pattern variable**​, it refers to the corresponding column values of all rows matched by that variable. -* If a column name has no prefix, it is equivalent to using the "​**global combined pattern variable**​" (i.e., the union of all basic pattern variables) as the prefix, referring to the column values of all rows in the current match. 
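
The three reference forms above (basic variable prefix, combined variable prefix, and no prefix) can be seen together in one query. The following is a minimal sketch assembled from clauses already shown in this document, run against the sample table `t`; it has not been verified against a running instance, and the measure name `last_price` is a hypothetical name introduced only for illustration.

```SQL
-- Sketch only: combines the MEASURES examples from section 2.7 with the
-- pattern used in the AFTER MATCH SKIP examples.
SELECT m.starting_price, m.bottom_price, m.top_price, m.last_price
FROM t
MATCH_RECOGNIZE (
    ORDER BY time
    MEASURES
        A.totalprice           AS starting_price, -- prefixed with basic variable A
        RPR_LAST(B.totalprice) AS bottom_price,   -- last row mapped to basic variable B
        RPR_LAST(U.totalprice) AS top_price,      -- last row mapped to combined variable U
        RPR_LAST(totalprice)   AS last_price      -- no prefix: all rows of the match (hypothetical name)
    ONE ROW PER MATCH
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (A B+ C+ D?)
    SUBSET U = (C, D)
    DEFINE
        B AS B.totalprice < PREV(B.totalprice),
        C AS C.totalprice > PREV(C.totalprice),
        D AS false -- never matches
) AS m;
```

Because `ONE ROW PER MATCH` outputs only the `PARTITION BY` columns and the measures, the outer `SELECT` references measure columns only.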
- -> Using table names as column name prefixes in pattern recognition expressions is not allowed. - -#### 2.8.2 Extended Functions - -| Function Name | Function Syntax | Description | -| ------------------------------- | ----------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| `MATCH_NUMBER` Function | `MATCH_NUMBER()` | Returns the sequence number of the current match within the partition, starting from 1. Empty matches occupy match sequence numbers just like non-empty matches. | -| `CLASSIFIER` Function | `CLASSIFIER(option)` | 1. Returns the name of the basic pattern variable mapped by the current row. 2. `option` is an optional parameter: a basic pattern variable `CLASSIFIER(A)` or a combined pattern variable `CLASSIFIER(U)` can be passed in to limit the function's scope; for rows outside the scope, NULL is returned directly. When used with a combined pattern variable, it can be used to distinguish which basic pattern variable in the union the row is mapped to. | -| Logical Navigation Functions | `RPR_FIRST(expr, k)` | 1. Indicates locating the first row satisfying `expr` in the ​**current match group**​, then searching for the k-th occurrence of the row corresponding to the same pattern variable towards the end of the group, and returning the specified column value of that row. If the k-th matching row is not found in the specified direction, the function returns NULL. 2. 
`k` is an optional parameter, defaulting to 0 (only locating the first row satisfying the condition); if explicitly specified, it must be a non-negative integer. | -| Logical Navigation Functions | `RPR_LAST(expr, k)` | 1. Indicates locating the last row satisfying `expr` in the ​**current match group**​, then searching for the k-th occurrence of the row corresponding to the same pattern variable towards the start of the group, and returning the specified column value of that row. If the k-th matching row is not found in the specified direction, the function returns NULL. 2. `k` is an optional parameter, defaulting to 0 (only locating the last row satisfying the condition); if explicitly specified, it must be a non-negative integer. | -| Physical Navigation Functions | `PREV(expr, k)` | 1. Indicates offsetting k rows towards the start from the last row matched to the given pattern variable, and returning the corresponding column value. If navigation exceeds the ​**partition boundary**​, the function returns NULL. 2. `k` is an optional parameter, defaulting to 1; if explicitly specified, it must be a non-negative integer. | -| Physical Navigation Functions | `NEXT(expr, k)` | 1. Indicates offsetting k rows towards the end from the last row matched to the given pattern variable, and returning the corresponding column value. If navigation exceeds the ​**partition boundary**​, the function returns NULL. 2. `k` is an optional parameter, defaulting to 1; if explicitly specified, it must be a non-negative integer. | -| Aggregate Functions | COUNT, SUM, AVG, MAX, MIN Functions | Can be used to calculate data in the current match. Aggregate functions and navigation functions are not allowed to be nested within each other. (Supported from version V2.0.6) | -| Nested Functions | `PREV/NEXT(CLASSIFIER())` | Nesting of physical navigation functions and the CLASSIFIER function. 
Used to obtain the pattern variables corresponding to the previous and next matching rows of the current row. | -| Nested Functions | `PREV/NEXT(RPR_FIRST/RPR_LAST(expr, k))` | **Logical functions are allowed to be nested** inside physical functions; **physical functions are not allowed to be nested** inside logical functions. Used to perform logical offset first, then physical offset. | - -**Examples** - -1. CLASSIFIER Function - -* Query sql - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* Analysis - - ![](/img/timeseries-featured-analysis-9-en.png) - -* Result - -```SQL -+-----------------------------+-----+-----+---------------+-----+ -| time|match|price|lower_or_higher|label| -+-----------------------------+-----+-----+---------------+-----+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| -+-----------------------------+-----+-----+---------------+-----+ -Total line number = 6 -``` - -2. 
Logical Navigation Functions - -* Query sql - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- MEASURES clause - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` - -* Results - * When the value is totalprice, RPR\_LAST(totalprice), RUNNING RPR\_LAST(totalprice) - - ![](/img/timeseries-featured-analysis-10.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is FINAL RPR\_LAST(totalprice) - - ![](/img/timeseries-featured-analysis-11.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is RPR\_FIRST(totalprice), RUNNING RPR\_FIRST(totalprice), FINAL RPR\_FIRST(totalprice) - - ![](/img/timeseries-featured-analysis-12.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 90| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 90| - |2025-01-01T00:05:00.000+08:00| 90| - |2025-01-01T00:06:00.000+08:00| 90| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is RPR\_LAST(totalprice, 2) - - 
![](/img/timeseries-featured-analysis-13.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| null| - |2025-01-01T00:02:00.000+08:00| null| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is FINAL RPR\_LAST(totalprice, 2) - - ![](/img/timeseries-featured-analysis-14.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is RPR\_FIRST(totalprice, 2) and FINAL RPR\_FIRST(totalprice, 2) - - ![](/img/timeseries-featured-analysis-15.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 70| - |2025-01-01T00:02:00.000+08:00| 70| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 6 - ``` - -3. 
Physical Navigation Functions - -* Query sql - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- MEASURES clause - ALL ROWS PER MATCH - PATTERN (B) - DEFINE B AS B.totalprice >= PREV(B.totalprice) -) AS m; -``` - -* Results - * When the value is `PREV(totalprice)` - - ![](/img/timeseries-featured-analysis-16.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `PREV(B.totalprice, 2)` - - ![](/img/timeseries-featured-analysis-17.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `PREV(B.totalprice, 4)` - - ![](/img/timeseries-featured-analysis-18.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| null| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `NEXT(totalprice)` or `NEXT(B.totalprice, 1)` - - ![](/img/timeseries-featured-analysis-19.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| null| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `NEXT(B.totalprice, 2)` - - ![](/img/timeseries-featured-analysis-20.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - 
+-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| null| - +-----------------------------+-------+ - Total line number = 2 - ``` - -4. Aggregate Functions - -* Query sql - -```SQL -SELECT m.time, m.count, m.avg, m.sum, m.min, m.max -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - COUNT(*) AS count, - AVG(totalprice) AS avg, - SUM(totalprice) AS sum, - MIN(totalprice) AS min, - MAX(totalprice) AS max - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* Analysis (Taking MIN(totalprice) as an Example) - -![](/img/timeseries-featured-analysis-21.png) - -* Result - -```SQL -+-----------------------------+-----+-----------------+-----+---+---+ -| time|count| avg| sum|min|max| -+-----------------------------+-----+-----------------+-----+---+---+ -|2025-01-01T00:01:00.000+08:00| 1| 90.0| 90.0| 90| 90| -|2025-01-01T00:02:00.000+08:00| 2| 85.0|170.0| 80| 90| -|2025-01-01T00:03:00.000+08:00| 3| 80.0|240.0| 70| 90| -|2025-01-01T00:04:00.000+08:00| 4| 80.0|320.0| 70| 90| -|2025-01-01T00:05:00.000+08:00| 5| 78.0|390.0| 70| 90| -|2025-01-01T00:06:00.000+08:00| 6|78.33333333333333|470.0| 70| 90| -+-----------------------------+-----+-----------------+-----+---+---+ -Total line number = 6 -``` - -5. 
Nested Functions - -Example 1 - -* Query sql - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label, m.prev_label, m.next_label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label, - PREV(CLASSIFIER(W)) AS prev_label, - NEXT(CLASSIFIER(W)) AS next_label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* Analysis - -![](/img/timeseries-featured-analysis-22-en.png) - -* Result - -```SQL -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -| time|match|price|lower_or_higher|label|prev_label|next_label| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| null| A| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| H| null| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| null| A| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| L| null| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| null| A| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| L| null| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -Total line number = 6 -``` - -Example 2 - -* Query sql - -```SQL -SELECT m.time, m.prev_last_price, m.next_first_price -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - PREV(RPR_LAST(totalprice), 2) AS prev_last_price, - NEXT(RPR_FIRST(totalprice), 2) as next_first_price - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* Analysis - -![](/img/timeseries-featured-analysis-23.png) - -* Result - -```SQL -+-----------------------------+---------------+----------------+ -| time|prev_last_price|next_first_price| -+-----------------------------+---------------+----------------+ -|2025-01-01T00:01:00.000+08:00| null| 70| 
-|2025-01-01T00:02:00.000+08:00| null| 70| -|2025-01-01T00:03:00.000+08:00| 90| 70| -|2025-01-01T00:04:00.000+08:00| 80| 70| -|2025-01-01T00:05:00.000+08:00| 70| 70| -|2025-01-01T00:06:00.000+08:00| 80| 70| -+-----------------------------+---------------+----------------+ -Total line number = 6 -``` - -#### 2.8.3 RUNNING and FINAL Semantics - -1. Definition - -* `RUNNING`: Indicates the calculation scope is from the start row of the current match group to the row currently being processed (i.e., up to the current row). -* `FINAL`: Indicates the calculation scope is from the start row of the current match group to the final row of the group (i.e., the entire match group). - -2. Scope of Application - -* The DEFINE clause uses RUNNING semantics by default. -* The MEASURES clause uses RUNNING semantics by default and supports specifying FINAL semantics. When using the ONE ROW PER MATCH output mode, all expressions are calculated from the last row position of the match group, and at this time, RUNNING semantics are equivalent to FINAL semantics. - -3. Syntax Constraints - -* RUNNING and FINAL need to be written before **logical navigation functions** or aggregate functions, and cannot directly act on **column references.** - * Valid: `RUNNING RPR_LAST(A.totalprice)`, `FINAL RPR_LAST(A.totalprice)` - * Invalid: `RUNNING A.totalprice`, `FINAL A.totalprice`, `RUNNING PREV(A.totalprice)` - -## 3. Scenario Examples - -Using [Sample Data](../Reference/Sample-Data.md) as the source data - -### 3.1 Time Segment Query - -Segment the data in table1 by time intervals less than or equal to 24 hours, and query the total number of data entries in each segment, as well as the start and end times. 
- -Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table1 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (cast(B.time as INT64) - cast(PREV(B.time) as INT64)) <= 86400000 -) AS m -``` - -Results - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:38:00.000+08:00| 2| -|2024-11-27T16:38:00.000+08:00|2024-11-30T14:30:00.000+08:00| 16| -+-----------------------------+-----------------------------+---+ -Total line number = 2 -``` - -### 3.2 Difference Segment Query - -Segment the data in table2 by humidity value differences less than 0.1, and query the total number of data entries in each segment, as well as the start and end times. - -* Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table2 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (B.humidity - PREV(B.humidity )) <=0.1 -) AS m; -``` - -* Results - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-27T00:00:00.000+08:00| 2| -|2024-11-28T08:00:00.000+08:00|2024-11-29T00:00:00.000+08:00| 2| -|2024-11-29T11:00:00.000+08:00|2024-11-30T00:00:00.000+08:00| 2| -+-----------------------------+-----------------------------+---+ -Total line number = 3 -``` - -### 3.3 Event Statistics Query - -Group the data in table1 by device ID, and count the start and end times and maximum humidity value where the humidity in the Shanghai area is greater than 35. 
- -* Query SQL - -```SQL -SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity -FROM table1 -MATCH_RECOGNIZE ( - PARTITION BY device_id - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RPR_FIRST(A.time) AS event_start, - RPR_LAST(A.time) AS event_end, - MAX(A.humidity) AS max_humidity - ONE ROW PER MATCH - PATTERN (A+) - DEFINE - A AS A.region= '上海' AND A.humidity> 35 -) AS m -``` - -* Results - -```SQL -+---------+-----+-----------------------------+-----------------------------+------------+ -|device_id|match| event_start| event_end|max_humidity| -+---------+-----+-----------------------------+-----------------------------+------------+ -| 100| 1|2024-11-28T09:00:00.000+08:00|2024-11-29T18:30:00.000+08:00| 45.1| -| 101| 1|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 35.2| -+---------+-----+-----------------------------+-----------------------------+------------+ -Total line number = 2 -``` - - -## 4. Practical Cases - -### 4.1 Altitude Monitoring - -* **Business Background** - -During oil product transportation, environmental pressure is directly affected by altitude: higher altitude means lower atmospheric pressure, which increases oil evaporation risks. To accurately assess natural oil loss, BeiDou positioning data must identify altitude anomalies to support loss evaluation. - -* **Data Structure** - -Monitoring table contains these core fields: - -| **ColumnName** | DataType | Category | Comment | -| ---------------------- | ----------- | ---------- | ------------------------ | -| time | TIMESTAMP | TIME | Data collection timestamp | -| device\_id | STRING | TAG | Vehicle device ID (partition key) | -| department | STRING | FIELD | Affiliated department | -| altitude | DOUBLE | FIELD | Altitude (unit: meters) | - -* **Business Requirements** - -Identify altitude anomaly events: When vehicle altitude exceeds 500m and later drops below 500m, it constitutes a complete anomaly event. 
Calculate core metrics: - -* Event start time (first timestamp exceeding 500m) -* Event end time (last timestamp above 500m) -* Maximum altitude during event - -![](/img/pattern-query-altitude.png) - -* **Implementation Method** - -```SQL -SELECT * -FROM beidou -MATCH_RECOGNIZE ( - PARTITION BY device_id -- Partition by vehicle device ID - ORDER BY time -- Chronological ordering - MEASURES - FIRST(A.time) AS ts_s, -- Event start timestamp - LAST(A.time) AS ts_e, -- Event end timestamp - MAX(A.altitude) AS max_a -- Maximum altitude during event - PATTERN (A+) -- Match consecutive records above 500m - DEFINE - A AS A.altitude > 500 -- Define A as altitude > 500m -) -``` - -### 4.2 Safety Injection Operation Identification - -* **Business Background** - -Nuclear power plants require periodic safety tests (e.g., PT1RPA010 "Safety Injection Logic Test with 1 RPA 601KC") to verify equipment integrity. These tests cause characteristic flow pattern changes. The control system must identify these patterns to detect anomalies and ensure equipment safety. - -* **Data Structure** - -Sensor table contains these core fields: - -| **ColumnName** | DataType | Category | Comment | -| ---------------------- | ----------- | ---------- | ------------------------ | -| time | TIMESTAMP | TIME | Data collection timestamp | -| pipe\_id | STRING | TAG | Pipe ID (partition key) | -| pressure | DOUBLE | FIELD | Pipe pressure | -| flow\_rate | DOUBLE | FIELD | Pipe flow rate (key metric) | - -* **Business Requirements** - -Identify PT1RPA010 flow pattern: Normal flow → Continuous decline → Extremely low flow (<0.5) → Continuous recovery → Normal flow. 
Extract core metrics: - -* Pattern start time (initial normal flow timestamp) -* Pattern end time (recovered normal flow timestamp) -* Extremely low phase start/end times -* Minimum flow rate during extremely low phase - -![](/img/pattern-query-flow.png) - -* **Implementation Method** - -```SQL -SELECT * FROM sensor MATCH_RECOGNIZE( - PARTITION BY pipe_id -- Partition by pipe ID - ORDER BY time -- Chronological ordering - MEASURES - A.time AS start_ts, -- Pattern start timestamp - E.time AS end_ts, -- Pattern end timestamp - FIRST(C.time) AS low_start_ts, -- Extremely low phase start - LAST(C.time) AS low_end_ts, -- Extremely low phase end - MIN(C.flow_rate) AS min_low_flow -- Minimum flow during low phase - ONE ROW PER MATCH -- Output one row per match - PATTERN(A B+? C+ D+? E) -- Match normal→decline→extremely low→recovery→normal - DEFINE - A AS flow_rate BETWEEN 2 AND 2.5, -- Initial normal flow - B AS flow_rate < PREV(B.flow_rate), -- Continuous decline - C AS flow_rate < 0.5, -- Extremely low threshold - D AS flow_rate > PREV(D.flow_rate), -- Continuous recovery - E AS flow_rate BETWEEN 2 AND 2.5 -- Normal recovery -); -``` - -### 4.3 Extreme Operational Gust (Sombrero Wind) Identification - -* **Business Background** - -In wind power generation, "extreme operational gusts (sombrero wind)" are short-duration (≈10s) sinusoidal gusts with prominent peaks that can cause physical turbine damage. Identifying these gusts and calculating their frequency helps assess turbine damage risks and guide maintenance. 
- -* **Data Structure** - -Turbine sensor table contains: - -| **ColumnName** | DataType | Category | Comment | -| ---------------------- | ----------- | ---------- | ------------------------ | -| time | TIMESTAMP | TIME | Wind speed timestamp | -| speed | DOUBLE | FIELD | Wind speed (key metric) | - -* **Business Requirements** - -Identify sombrero wind pattern: Gradual speed decline → Sharp increase → Sharp decrease → Gradual recovery to initial value (≈10s total). Primary goal: count gust occurrences for risk assessment. - -![](/img/pattern-query-speed.png) - -* **Implementation Method** - -```SQL -SELECT COUNT(*) -- Count extreme gust occurrences -FROM sensor -MATCH_RECOGNIZE( - ORDER BY time -- Chronological ordering - MEASURES - FIRST(B.time) AS ts_s, -- Gust start timestamp - LAST(D.time) AS ts_e -- Gust end timestamp - PATTERN (B+ R+? F+? D+? E) -- Match sombrero wind pattern - DEFINE - -- Phase B: Gradual decline, initial speed>9, delta<2.5 - B AS speed <= AVG(B.speed) - AND FIRST(B.speed) > 9 - AND (FIRST(B.speed) - LAST(B.speed)) < 2.5, - -- Phase R: Sharp increase (above phase average) - R AS speed >= AVG(R.speed), - -- Phase F: Sharp decrease, peak>16 (crest threshold) - F AS speed <= AVG(F.speed) - AND MAX(F.speed) > 16, - -- Phase D: Gradual recovery, delta<2.5 - D AS speed >= AVG(D.speed) - AND (LAST(D.speed) - FIRST(D.speed)) < 2.5, - -- Phase E: Recovery to ±0.2 of initial value, total duration <11s - E AS speed - FIRST(B.speed) BETWEEN -0.2 AND 0.2 - AND time - FIRST(B.time) < 11 -); -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Table/User-Manual/Tiered-Storage_timecho.md b/src/UserGuide/Master/Table/User-Manual/Tiered-Storage_timecho.md deleted file mode 100644 index 0fe24c1f7..000000000 --- a/src/UserGuide/Master/Table/User-Manual/Tiered-Storage_timecho.md +++ /dev/null @@ -1,102 +0,0 @@ - -# Tiered Storage - -## 1. 
Overview - -The **tiered storage** feature enables users to manage multiple types of storage media efficiently. Users can configure different storage media types within IoTDB and classify them into distinct storage tiers. In IoTDB, tiered storage is implemented by managing multiple directories. Users can group multiple storage directories into the same category and designate them as a **storage tier**. Additionally, data can be classified based on its "hotness" or "coldness" and stored accordingly in designated tiers. - -Currently, IoTDB supports hot and cold data classification based on the **Time-To-Live (TTL)** parameter. When data in a tier no longer meets the defined TTL rules, it is automatically migrated to the next tier. - -## 2. **Parameter Definitions** - -To enable multi-level storage in IoTDB, the following configurations are required: - -1. Configure data directories and assign them to different tiers. -2. Set a TTL for each tier to distinguish the hot and cold data managed by different tiers. -3. Configure the minimum remaining storage space ratio for each tier (optional). If the available space in a tier falls below the defined threshold, data is migrated to the next tier automatically. - -The specific parameter definitions and their descriptions are as follows. 
- -| **Parameter** | **Default Value** | **Required** | **Description** | **Constraints** | -| :--- | :--- | --- | :--- | :--- | -| `dn_data_dirs` | `data/datanode/data` | Yes | Specifies storage directories grouped into tiers. | Tiers are separated by `;`, directories within the same tier are separated by `,`. Cloud storage (e.g., AWS S3) can only be the last tier. Use `OBJECT_STORAGE` to denote cloud storage. Only one cloud storage bucket is allowed. | -| `tier_ttl_in_ms` | `-1` | Yes | Defines the TTL (in milliseconds) for each tier to determine the data range it manages. | Tiers are separated by `;`. The number of tiers must match `dn_data_dirs`. `-1` means "no limit". | -| `dn_default_space_usage_thresholds` | `0.85` | Yes | Defines the maximum storage usage ratio for each tier of data directories. When the used space exceeds this ratio, data is automatically migrated to the next tier. If the storage usage of the last tier exceeds this threshold, the system is set to `READ_ONLY` mode. | Tiers are separated by `;`. The number of tiers must match `dn_data_dirs`. | -| `object_storage_type` | `AWS_S3` | Required when using remote storage | Cloud storage type. | Only `AWS_S3` is supported. | -| `object_storage_bucket` | `iotdb_data` | Required when using remote storage | Cloud storage bucket name. | | -| `object_storage_endpoint` | (Empty) | Required when using remote storage | Cloud storage endpoint. | | -| `object_storage_region` | (Empty) | Required when using remote storage | Cloud storage region. | | -| `object_storage_access_key` | (Empty) | Required when using remote storage | Cloud storage access key. | | -| `object_storage_access_secret` | (Empty) | Required when using remote storage | Cloud storage access secret. | | -| `enable_path_style_access` | `false` | No | Whether to enable path-style access for the object storage service. | | -| `remote_tsfile_cache_dirs` | `data/datanode/data/cache` | No | Local cache directory for cloud storage. 
| | -| `remote_tsfile_cache_page_size_in_kb` | `20480` | No | Page size (in KB) for cloud storage local cache. | | -| `remote_tsfile_cache_max_disk_usage_in_mb` | `51200` | No | Maximum disk space (in MB) allocated for cloud storage local cache. | | - -## 3. 
Local Tiered Storage Example - -The following is an example of a **two-tier local storage configuration**: - -```Properties -# Mandatory configurations -dn_data_dirs=/data1/data;/data2/data,/data3/data -tier_ttl_in_ms=86400000;-1 -dn_default_space_usage_thresholds=0.2;0.1 -``` - -**Tier Details:** - -| **Tier** | **Storage Directories** | **Data Range** | **Remaining Space Threshold** | -| :------- | :--------------------------- | :-------------------- | :---------------------------- | -| Tier 1 | `/data1/data` | Last 1 day of data | 20% | -| Tier 2 | `/data2/data`, `/data3/data` | Data older than 1 day | 10% | - -## 4. Cloud-based Tiered Storage Example - -The following is an example of a **three-tier configuration with cloud storage**: - -```Properties -# Mandatory configurations -dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE -tier_ttl_in_ms=86400000;864000000;-1 -dn_default_space_usage_thresholds=0.2;0.15;0.1 -object_storage_type=AWS_S3 -object_storage_bucket=iotdb -object_storage_region= -object_storage_endpoint= -object_storage_access_key= -object_storage_access_secret= - -# Optional configurations -enable_path_style_access=false -remote_tsfile_cache_dirs=data/datanode/data/cache -remote_tsfile_cache_page_size_in_kb=20971520 -remote_tsfile_cache_max_disk_usage_in_mb=53687091200 -``` - -**Tier Details:** - -| **Tier** | **Storage Directories** | **Data Range** | **Remaining Space Threshold** | -| :------- | :--------------------------- | :----------------------------- | :---------------------------- | -| Tier 1 | `/data1/data` | Last 1 day of data | 20% | -| Tier 2 | `/data2/data`, `/data3/data` | Data from 1 day to 10 days ago | 15% | -| Tier 3 | S3 Cloud Storage | Data older than 10 days | 10% | \ No newline at end of file diff --git a/src/UserGuide/Master/Table/User-Manual/Timeseries-Featured-Analysis_timecho.md b/src/UserGuide/Master/Table/User-Manual/Timeseries-Featured-Analysis_timecho.md deleted file mode 100644 index 
cb36c56f6..000000000 --- a/src/UserGuide/Master/Table/User-Manual/Timeseries-Featured-Analysis_timecho.md +++ /dev/null @@ -1,1728 +0,0 @@ - - -# Timeseries Featured Analysis - -For time-series data feature analysis scenarios, IoTDB provides two core capabilities: pattern query and window functions. These capabilities deliver a flexible and efficient solution for in-depth mining and complex computation of time-series data. The following sections will elaborate on the two features in detail. - -## 1. Pattern Query - -### 1.1 Overview - -Pattern query enables capturing a segment of continuous data by defining the recognition logic of pattern variables and regular expressions, and performing analysis and calculation on each captured data segment. It is suitable for business scenarios such as identifying specific patterns in time-series data (as shown in the figure below) and detecting specific events. - -![](/img/timeseries-featured-analysis-1.png) - -> Note: This feature is available starting from version V2.0.5. - -### 1.2 Function Introduction -#### 1.2.1 Syntax Format - -```SQL -MATCH_RECOGNIZE ( - [ PARTITION BY column [, ...] ] - [ ORDER BY column [, ...] ] - [ MEASURES measure_definition [, ...] ] - [ ROWS PER MATCH ] - [ AFTER MATCH skip_to ] - PATTERN ( row_pattern ) - [ SUBSET subset_definition [, ...] ] - DEFINE variable_definition [, ...] -) -``` - -**Note:** - -* PARTITION BY: Optional. Used to group the input table, and each group can perform pattern matching independently. If this clause is not specified, the entire input table will be processed as a single unit. -* ORDER BY: Optional. Used to ensure that input data is processed in a specific order during matching. -* MEASURES: Optional. Used to specify which information to extract from the matched segment of data. -* ROWS PER MATCH: Optional. Used to specify the output method of the result set after successful pattern matching. -* AFTER MATCH SKIP: Optional. 
Used to specify which row to resume from for the next pattern match after identifying a non-empty match. -* PATTERN: Used to define the row pattern to be matched. -* SUBSET: Optional. Used to merge rows matched by multiple basic pattern variables into a single logical set. -* DEFINE: Used to define the basic pattern variables for the row pattern. - -**Original Data for Syntax Examples:** - -```SQL -IoTDB:database3> select * from t -+-----------------------------+------+----------+ -| time|device|totalprice| -+-----------------------------+------+----------+ -|2025-01-01T00:01:00.000+08:00| d1| 90| -|2025-01-01T00:02:00.000+08:00| d1| 80| -|2025-01-01T00:03:00.000+08:00| d1| 70| -|2025-01-01T00:04:00.000+08:00| d1| 80| -|2025-01-01T00:05:00.000+08:00| d1| 70| -|2025-01-01T00:06:00.000+08:00| d1| 80| -+-----------------------------+------+----------+ - --- Creation Statement -create table t(device tag, totalprice int32 field) - -insert into t(time,device,totalprice) values(2025-01-01T00:01:00, 'd1', 90),(2025-01-01T00:02:00, 'd1', 80),(2025-01-01T00:03:00, 'd1', 70),(2025-01-01T00:04:00, 'd1', 80),(2025-01-01T00:05:00, 'd1', 70),(2025-01-01T00:06:00, 'd1', 80) -``` - -#### 1.2.2 DEFINE Clause - -Used to specify the judgment condition for each basic pattern variable in pattern recognition. These variables are usually represented by identifiers (e.g., `A`, `B`), and the Boolean expressions in this clause precisely define which rows meet the requirements of the variable. - -* During pattern matching execution, a row is only marked as the variable (and thus included in the current matching group) if the Boolean expression returns TRUE. - -```SQL --- A row can only be identified as B if its totalprice value is less than the totalprice value of the previous row. 
-DEFINE B AS totalprice < PREV(totalprice) -``` - -* Variables not **explicitly** defined in this clause have an implicitly set condition of always true (TRUE), meaning they can be successfully matched on any input row. - -#### 1.2.3 SUBSET Clause - -Used to merge rows matched by multiple basic pattern variables (e.g., `A`, `B`) into a combined pattern variable (e.g., `U`), allowing these rows to be treated as a single logical set for operations. It can be used in the `MEASURES`, `DEFINE`, and `AFTER MATCH SKIP` clauses. - -```SQL -SUBSET U = (A, B) -``` -For example, for the pattern `PATTERN ((A | B){5} C+)`, it is impossible to determine whether the 5th repetition matches the basic pattern variable A or B during matching. Therefore: - -1. In the `MEASURES` clause, if you need to reference the last row matched in this phase, you can do so by defining the combined pattern variable `SUBSET U = (A, B)`. At this point, the expression `RPR_LAST(U.totalprice)` will directly return the `totalprice` value of the target row. -2. In the `AFTER MATCH SKIP` clause, if the matching result does not include the basic pattern variable A or B, executing `AFTER MATCH SKIP TO LAST B` or `AFTER MATCH SKIP TO LAST A` will fail to jump due to missing anchors. However, by introducing the combined pattern variable `SUBSET U = (A, B)`, using `AFTER MATCH SKIP TO LAST U` is always valid. - -#### 1.2.4 PATTERN Clause - -Used to define the row pattern to be matched, whose basic building block is a row pattern variable. 
- -```SQL -PATTERN ( row_pattern ) -``` - -##### 1.2.4.1 Pattern Types - -| Row Pattern | Syntax Format | Description | -|-----------------------|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Pattern Concatenation | `A B+ C+ D+` | Composed of subpatterns without any operators, matching all subpatterns in the declared order sequentially. | -| Pattern Alternation | `A \| B \| C` | Composed of multiple subpatterns separated by `\|`, matching only one of them. If multiple subpatterns can be matched, the leftmost one is selected. | -| Pattern Permutation | `PERMUTE(A, B, C)` | Equivalent to performing alternation matching on all different orders of the subpattern elements. It requires that A, B, and C must all be matched, but their order of appearance is not fixed. If multiple matching orders are possible, the priority is determined by the **lexicographical order** based on the definition sequence of elements in the PERMUTE list. For example, A B C has the highest priority, while C B A has the lowest. | -| Pattern Grouping | `(A B C)` | Encloses subpatterns in parentheses to treat them as a single unit, which can be used with other operators. For example, `(A B C)+` indicates a pattern where a group of `(A B C)` appears consecutively. | -| Empty Pattern | `()` | Represents an empty match that does not contain any rows. | -| Pattern Exclusion | `{- row_pattern -}` | Used to specify the matched part to be excluded from the output. Usually used with the `ALL ROWS PER MATCH` option to output rows of interest. 
For example, `PATTERN (A {- B+ C+ -} D+)` with `ALL ROWS PER MATCH` will only output the first row (corresponding to `A`) and the trailing rows (corresponding to `D+`) of the match. | - -##### 1.2.4.2 Partition Start/End Anchor - -* `^A` indicates matching a pattern that starts with A as the partition beginning - * When the value of the PATTERN clause is `^A`, the match must start from the first row of the partition, and this row must satisfy the definition of `A`. - * When the value of the PATTERN clause is `^A^` or `A^`, the output result is empty. -* `A$` indicates matching a pattern that ends with A as the partition end - * When the value of the PATTERN clause is `A$`, the match must end at the end of the partition, and this row must satisfy the definition of `A`. - * When the value of the PATTERN clause is `$A` or `$A$`, the output result is empty. - -**Examples** - -* Query SQL - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - AFTER MATCH SKIP PAST LAST ROW - PATTERN %s -- PATTERN clause - DEFINE A AS true -) AS m; -``` - -* Results - * When the PATTERN clause is specified as PATTERN (^A) - - ![](/img/timeseries-featured-analysis-2.png) - - Actual Return - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * When the PATTERN clause is specified as PATTERN (^A^), the output result is empty. This is because it is impossible to match an A starting from the beginning of a partition and then return to the beginning of the partition again. - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. 
- ``` - - * When the PATTERN clause is specified as PATTERN (A\$) - - ![](/img/timeseries-featured-analysis-3.png) - - Actual Return - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:06:00.000+08:00| 1| 80| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * When the PATTERN clause is specified as PATTERN (\$A\$), the output result is empty. - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. - ``` - - -##### 1.2.4.3 Quantifiers - -Quantifiers are used to specify the number of times a subpattern repeats, placed after the corresponding subpattern (e.g., `(A | B)*`). - -Common quantifiers are as follows: - -| Quantifier | Description | -| -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `*` | Zero or more repetitions | -| `+` | One or more repetitions | -| `?` | Zero or one repetition | -| `{n}` | Exactly n repetitions | -| `{m, n}` | Repetitions between m and n times (m and n are non-negative integers). \* If the left bound is omitted, the default starts from 0; \* If the right bound is omitted, there is no upper limit on the number of repetitions (e.g., {5,} is equivalent to "at least five times"); \* If both left and right bounds are omitted (i.e., {,}), it is equivalent to `*`. | - -* The matching preference can be changed by adding `?` after the quantifier. - * `{3,5}`: Prefers 5 times, least prefers 3 times; `{3,5}?`: Prefers 3 times, least prefers 5 times. - * `?`: Prefers 1 time; `??`: Prefers 0 times. 
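
The preference rule above can be sketched with two queries that differ only in the quantifier. This is a minimal sketch, assuming the table `t` from the syntax examples and a pattern variable `A` that is always true. The match counts in the comments follow from the general greedy/reluctant semantics described above and were not taken from an actual run.

```SQL
-- Greedy: A{1,2} prefers 2 repetitions, so with 6 input rows one would
-- expect 3 matches of 2 rows each (rows 1-2, 3-4, 5-6).
SELECT m.time, m.match, m.label
FROM t
MATCH_RECOGNIZE (
    ORDER BY time
    MEASURES
        MATCH_NUMBER() AS match,
        CLASSIFIER() AS label
    ALL ROWS PER MATCH
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (A{1,2})
    DEFINE A AS true
) AS m;

-- Reluctant: A{1,2}? prefers 1 repetition, so one would expect
-- 6 matches of a single row each.
SELECT m.time, m.match, m.label
FROM t
MATCH_RECOGNIZE (
    ORDER BY time
    MEASURES
        MATCH_NUMBER() AS match,
        CLASSIFIER() AS label
    ALL ROWS PER MATCH
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (A{1,2}?)
    DEFINE A AS true
) AS m;
```

Comparing the `match` column of the two results is a quick way to check which preference a quantifier applies before using it in a larger pattern.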
- -#### 1.2.5 AFTER MATCH SKIP Clause - -Used to specify which row to start the next pattern match from after identifying a non-empty match. - -| Jump Strategy | Description | Allows Overlapping Matches? | -| --- | --- | --- | -| `AFTER MATCH SKIP PAST LAST ROW` | Default behavior. Starts from the row after the last row of the current match. | No | -| `AFTER MATCH SKIP TO NEXT ROW` | Starts from the second row in the current match. | Yes | -| `AFTER MATCH SKIP TO [ FIRST \| LAST ] pattern_variable` | Jumps to start from the [ first row \| last row ] of a pattern variable. | Yes | - -* Among all possible configurations, only when `ALL ROWS PER MATCH WITH UNMATCHED ROWS` is used in combination with `AFTER MATCH SKIP PAST LAST ROW` can the system ensure that exactly one output record is generated for each input row. - -**Examples** - -* Query SQL - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - %s -- AFTER MATCH SKIP clause - PATTERN (A B+ C+ D?) 
- SUBSET U = (C, D) - DEFINE - B AS B.totalprice < PREV (B.totalprice), - C AS C.totalprice > PREV (C.totalprice), - D AS false -- never matches -) AS m; -``` - -* Results - * When AFTER MATCH SKIP PAST LAST ROW is specified - - ![](/img/timeseries-featured-analysis-4-en.png) - - * First match: Rows 1, 2, 3, 4 - * Second match: According to the semantics of `AFTER MATCH SKIP PAST LAST ROW`, matching resumes from row 5, where no valid match can be found - * This strategy never produces overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 4 - ``` - - * When AFTER MATCH SKIP TO NEXT ROW is specified - - ![](/img/timeseries-featured-analysis-5-en.png) - - * First match: Rows 1, 2, 3, 4 - * Second match: According to the semantics of `AFTER MATCH SKIP TO NEXT ROW`, matching resumes from row 2 and matches rows 2, 3, 4 - * Third attempt: Starts from row 3 and fails - * Fourth attempt: Starts from row 4 and succeeds, matching rows 4, 5, 6 (reported as match 3) - * This strategy allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:02:00.000+08:00| 2| 80| A| - |2025-01-01T00:03:00.000+08:00| 2| 70| B| - |2025-01-01T00:04:00.000+08:00| 2| 80| C| - |2025-01-01T00:04:00.000+08:00| 3| 80| A| - |2025-01-01T00:05:00.000+08:00| 3| 70| B| - |2025-01-01T00:06:00.000+08:00| 3| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 10 - ``` - - * 
When AFTER MATCH SKIP TO FIRST C - - ![](/img/timeseries-featured-analysis-6-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: Starts from the first C (i.e., row 4), matches rows 4, 5, 6 - * This pattern allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * When AFTER MATCH SKIP TO LAST B or AFTER MATCH SKIP TO B - - ![](/img/timeseries-featured-analysis-7-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: Attempts to start from the last B (i.e., row 3), fails - * Third match: Attempts to start from row 4, successfully matches rows 4, 5, 6 - * This pattern allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * When AFTER MATCH SKIP TO U - - ![](/img/timeseries-featured-analysis-8-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: `SKIP TO U` means jumping to the last C or D; D can never match successfully, so it jumps to the last C (i.e., row 4), successfully matching rows 4, 5, 6 - * This pattern allows overlapping 
matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * When AFTER MATCH SKIP TO A, you cannot jump to the first row of the match, otherwise it will cause an infinite loop - - ```SQL - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: cannot skip to first row of match - ``` - - * When AFTER MATCH SKIP TO B, you cannot jump to a pattern variable that does not exist in the match group - - ```SQL - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: pattern variable is not present in match - ``` - - -#### 1.2.6 ROWS PER MATCH Clause - -Used to specify the output method of the result set after a successful pattern match, including the following two main options: - -| Output Method | Rule Description | Output Result | Handling Logic for **Empty Matches/Unmatched Rows** | -| -------------------- | -------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ | 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| ONE ROW PER MATCH | Generates one output row for each successful match. | \* Columns in the PARTITION BY clause\* Expressions defined in the MEASURES clause. | Outputs empty matches; skips unmatched rows. | -| ALL ROWS PER MATCH | Each row in a match generates an output record, unless the row is excluded via exclusion syntax. | \* Columns in the PARTITION BY clause\* Columns in the ORDER BY clause\* Expressions defined in the MEASURES clause\* Remaining columns in the input table | \* Default: Outputs empty matches; skips unmatched rows.\* ALL ROWS PER MATCH​**SHOW EMPTY MATCHES**​: Outputs empty matches by default; skips unmatched rows.\* ALL ROWS PER MATCH​**OMIT EMPTY MATCHES**​: Does not output empty matches; skips unmatched rows.\* ALL ROWS PER MATCH​**WITH UNMATCHED ROWS**​: Outputs empty matches and generates an additional output record for each unmatched row. | - -#### 1.2.7 MEASURES Clause - -Used to specify which information to extract from a matched set of data. This clause is optional; if not explicitly specified, some input columns will become the output results of pattern recognition based on the settings of the ROWS PER MATCH clause. - -SQL - -```SQL -MEASURES measure_expression AS measure_name [, ...] -``` - -* A `measure_expression` is a scalar value calculated from the matched set of data. 
- -| Usage Example | Description | -| ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `A.totalprice AS starting_price` | Returns the price from the first row in the matched group (i.e., the only row associated with variable A) as the starting price. | -| `RPR_LAST(B.totalprice) AS bottom_price` | Returns the price from the last row associated with variable B, representing the lowest price in the "V" shape pattern (corresponding to the end of the downward segment). | -| `RPR_LAST(U.totalprice) AS top_price` | Returns the highest price in the matched group, corresponding to the last row associated with variable C or D (i.e., the end of the entire matched group). [Assuming SUBSET U = (C, D)] | - -* Each `measure_expression` defines an output column, which can be referenced by its specified `measure_name`. - -#### 1.2.8 Row Pattern Recognition Expressions - -Expressions used in the MEASURES and DEFINE clauses are ​**scalar expressions**​, evaluated in the row-level context of the input table. In addition to supporting standard SQL syntax, **scalar expressions** also support special extended functions for row pattern recognition. - -##### 1.2.8.1 Pattern Variable References - -```SQL -A.totalprice -U.orderdate -orderstatus -``` - -* When a column name is prefixed with a **basic pattern variable** or a ​**combined pattern variable**​, it refers to the corresponding column values of all rows matched by that variable. -* If a column name has no prefix, it is equivalent to using the "​**global combined pattern variable**​" (i.e., the union of all basic pattern variables) as the prefix, referring to the column values of all rows in the current match. - -> Using table names as column name prefixes in pattern recognition expressions is not allowed. 
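How a prefix selects rows can be sketched in a few lines of Python. The match below reuses rows 1 to 4 of the earlier `A B+ C+ D?` example with `SUBSET U = (C, D)`; the names `match_rows`, `SUBSETS`, and `referenced_values` are hypothetical and only model the resolution rule described above, not IoTDB internals.

```python
# One match, as (basic pattern variable, totalprice) pairs in row order.
match_rows = [("A", 90), ("B", 80), ("B", 70), ("C", 80)]

# Combined pattern variables, mirroring SUBSET U = (C, D).
SUBSETS = {"U": {"C", "D"}}

def referenced_values(variable=None):
    """Return the totalprice values that `variable.totalprice` refers to.

    With no prefix, the reference covers every row of the current match.
    """
    if variable is None:
        return [price for _, price in match_rows]
    labels = SUBSETS.get(variable, {variable})
    return [price for label, price in match_rows if label in labels]

a_values = referenced_values("A")  # rows mapped to A
b_values = referenced_values("B")  # rows mapped to B
u_values = referenced_values("U")  # rows mapped to C or D
all_values = referenced_values()   # unprefixed column name

print(a_values, b_values, u_values, all_values)
```

A navigation function such as `RPR_LAST(B.totalprice)` then picks one value out of the selected rows, here the last element of `b_values`.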
- -##### 1.2.8.2 Extended Functions - -| Function Name | Function Syntax | Description | -| ------------------------------- | ----------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| `MATCH_NUMBER` Function | `MATCH_NUMBER()` | Returns the sequence number of the current match within the partition, starting from 1. Empty matches occupy match sequence numbers just like non-empty matches. | -| `CLASSIFIER` Function | `CLASSIFIER(option)` | 1. Returns the name of the basic pattern variable mapped by the current row. 2. `option` is an optional parameter: a basic pattern variable `CLASSIFIER(A)` or a combined pattern variable `CLASSIFIER(U)` can be passed in to limit the function's scope; for rows outside the scope, NULL is returned directly. When used with a combined pattern variable, it can be used to distinguish which basic pattern variable in the union the row is mapped to. | -| Logical Navigation Functions | `RPR_FIRST(expr, k)` | 1. Indicates locating the first row satisfying `expr` in the ​**current match group**​, then searching for the k-th occurrence of the row corresponding to the same pattern variable towards the end of the group, and returning the specified column value of that row. If the k-th matching row is not found in the specified direction, the function returns NULL. 2. 
`k` is an optional parameter, defaulting to 0 (only locating the first row satisfying the condition); if explicitly specified, it must be a non-negative integer. | -| Logical Navigation Functions | `RPR_LAST(expr, k)` | 1. Indicates locating the last row satisfying `expr` in the ​**current match group**​, then searching for the k-th occurrence of the row corresponding to the same pattern variable towards the start of the group, and returning the specified column value of that row. If the k-th matching row is not found in the specified direction, the function returns NULL. 2. `k` is an optional parameter, defaulting to 0 (only locating the last row satisfying the condition); if explicitly specified, it must be a non-negative integer. | -| Physical Navigation Functions | `PREV(expr, k)` | 1. Indicates offsetting k rows towards the start from the last row matched to the given pattern variable, and returning the corresponding column value. If navigation exceeds the ​**partition boundary**​, the function returns NULL. 2. `k` is an optional parameter, defaulting to 1; if explicitly specified, it must be a non-negative integer. | -| Physical Navigation Functions | `NEXT(expr, k)` | 1. Indicates offsetting k rows towards the end from the last row matched to the given pattern variable, and returning the corresponding column value. If navigation exceeds the ​**partition boundary**​, the function returns NULL. 2. `k` is an optional parameter, defaulting to 1; if explicitly specified, it must be a non-negative integer. | -| Aggregate Functions | COUNT, SUM, AVG, MAX, MIN Functions | Can be used to calculate data in the current match. Aggregate functions and navigation functions are not allowed to be nested within each other. (Supported from version V2.0.6) | -| Nested Functions | `PREV/NEXT(CLASSIFIER())` | Nesting of physical navigation functions and the CLASSIFIER function. 
Used to obtain the pattern variables corresponding to the previous and next matching rows of the current row. | -| Nested Functions | `PREV/NEXT(RPR_FIRST/RPR_LAST(expr, k)`) | **Logical functions are allowed to be nested** inside physical functions; **physical functions are not allowed to be nested** inside logical functions. Used to perform logical offset first, then physical offset. | - -**Examples** - -1. CLASSIFIER Function - -* Query sql - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* Analysis - - ![](/img/timeseries-featured-analysis-9-en.png) - -* Result - -```SQL -+-----------------------------+-----+-----+---------------+-----+ -| time|match|price|lower_or_higher|label| -+-----------------------------+-----+-----+---------------+-----+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| -+-----------------------------+-----+-----+---------------+-----+ -Total line number = 6 -``` - -2. 
Logical Navigation Functions - -* Query SQL - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- placeholder for the MEASURES expression - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` - -* Results - * When the value is totalprice, RPR\_LAST(totalprice), RUNNING RPR\_LAST(totalprice) - - ![](/img/timeseries-featured-analysis-10.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is FINAL RPR\_LAST(totalprice) - - ![](/img/timeseries-featured-analysis-11.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is RPR\_FIRST(totalprice), RUNNING RPR\_FIRST(totalprice), FINAL RPR\_FIRST(totalprice) - - ![](/img/timeseries-featured-analysis-12.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 90| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 90| - |2025-01-01T00:05:00.000+08:00| 90| - |2025-01-01T00:06:00.000+08:00| 90| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is RPR\_LAST(totalprice, 2) - - 
![](/img/timeseries-featured-analysis-13.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| null| - |2025-01-01T00:02:00.000+08:00| null| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is FINAL RPR\_LAST(totalprice, 2) - - ![](/img/timeseries-featured-analysis-14.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is RPR\_FIRST(totalprice, 2) or FINAL RPR\_FIRST(totalprice, 2) - - ![](/img/timeseries-featured-analysis-15.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 70| - |2025-01-01T00:02:00.000+08:00| 70| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 6 - ``` - -3. 
Physical Navigation Functions - -* Query SQL - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- placeholder for the MEASURES expression - ALL ROWS PER MATCH - PATTERN (B) - DEFINE B AS B.totalprice >= PREV(B.totalprice) -) AS m; -``` - -* Results - * When the value is `PREV(totalprice)` - - ![](/img/timeseries-featured-analysis-16.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `PREV(B.totalprice, 2)` - - ![](/img/timeseries-featured-analysis-17.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `PREV(B.totalprice, 4)` - - ![](/img/timeseries-featured-analysis-18.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| null| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `NEXT(totalprice)` or `NEXT(B.totalprice, 1)` - - ![](/img/timeseries-featured-analysis-19.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| null| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `NEXT(B.totalprice, 2)` - - ![](/img/timeseries-featured-analysis-20.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - 
+-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| null| - +-----------------------------+-------+ - Total line number = 2 - ``` - -4. Aggregate Functions - -* Query sql - -```SQL -SELECT m.time, m.count, m.avg, m.sum, m.min, m.max -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - COUNT(*) AS count, - AVG(totalprice) AS avg, - SUM(totalprice) AS sum, - MIN(totalprice) AS min, - MAX(totalprice) AS max - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* Analysis (Taking MIN(totalprice) as an Example) - -![](/img/timeseries-featured-analysis-21.png) - -* Result - -```SQL -+-----------------------------+-----+-----------------+-----+---+---+ -| time|count| avg| sum|min|max| -+-----------------------------+-----+-----------------+-----+---+---+ -|2025-01-01T00:01:00.000+08:00| 1| 90.0| 90.0| 90| 90| -|2025-01-01T00:02:00.000+08:00| 2| 85.0|170.0| 80| 90| -|2025-01-01T00:03:00.000+08:00| 3| 80.0|240.0| 70| 90| -|2025-01-01T00:04:00.000+08:00| 4| 80.0|320.0| 70| 90| -|2025-01-01T00:05:00.000+08:00| 5| 78.0|390.0| 70| 90| -|2025-01-01T00:06:00.000+08:00| 6|78.33333333333333|470.0| 70| 90| -+-----------------------------+-----+-----------------+-----+---+---+ -Total line number = 6 -``` - -5. 
Nested Functions - -Example 1 - -* Query sql - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label, m.prev_label, m.next_label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label, - PREV(CLASSIFIER(W)) AS prev_label, - NEXT(CLASSIFIER(W)) AS next_label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* Analysis - -![](/img/timeseries-featured-analysis-22-en.png) - -* Result - -```SQL -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -| time|match|price|lower_or_higher|label|prev_label|next_label| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| null| A| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| H| null| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| null| A| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| L| null| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| null| A| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| L| null| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -Total line number = 6 -``` - -Example 2 - -* Query sql - -```SQL -SELECT m.time, m.prev_last_price, m.next_first_price -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - PREV(RPR_LAST(totalprice), 2) AS prev_last_price, - NEXT(RPR_FIRST(totalprice), 2) as next_first_price - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* Analysis - -![](/img/timeseries-featured-analysis-23.png) - -* Result - -```SQL -+-----------------------------+---------------+----------------+ -| time|prev_last_price|next_first_price| -+-----------------------------+---------------+----------------+ -|2025-01-01T00:01:00.000+08:00| null| 70| 
-|2025-01-01T00:02:00.000+08:00| null| 70| -|2025-01-01T00:03:00.000+08:00| 90| 70| -|2025-01-01T00:04:00.000+08:00| 80| 70| -|2025-01-01T00:05:00.000+08:00| 70| 70| -|2025-01-01T00:06:00.000+08:00| 80| 70| -+-----------------------------+---------------+----------------+ -Total line number = 6 -``` - -##### 1.2.8.3 RUNNING and FINAL Semantics - -1. Definition - -* `RUNNING`: Indicates the calculation scope is from the start row of the current match group to the row currently being processed (i.e., up to the current row). -* `FINAL`: Indicates the calculation scope is from the start row of the current match group to the final row of the group (i.e., the entire match group). - -2. Scope of Application - -* The DEFINE clause uses RUNNING semantics by default. -* The MEASURES clause uses RUNNING semantics by default and supports specifying FINAL semantics. In the ONE ROW PER MATCH output mode, all expressions are evaluated at the last row of the match group, so RUNNING semantics are equivalent to FINAL semantics. - -3. Syntax Constraints - -* RUNNING and FINAL must be written before **logical navigation functions** or aggregate functions; they cannot be applied directly to **column references** or to physical navigation functions. - * Valid: `RUNNING RPR_LAST(A.totalprice)`, `FINAL RPR_LAST(A.totalprice)` - * Invalid: `RUNNING A.totalprice`, `FINAL A.totalprice`, `RUNNING PREV(A.totalprice)` - -### 1.3 Scenario Examples - -Using [Sample Data](../Reference/Sample-Data.md) as the source data - -#### 1.3.1 Time Segment Query - -Segment the data in table1 by time intervals less than or equal to 24 hours, and query the total number of data entries in each segment, as well as the start and end times. 
- -Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table1 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (cast(B.time as INT64) - cast(PREV(B.time) as INT64)) <= 86400000 -) AS m -``` - -Results - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:38:00.000+08:00| 2| -|2024-11-27T16:38:00.000+08:00|2024-11-30T14:30:00.000+08:00| 16| -+-----------------------------+-----------------------------+---+ -Total line number = 2 -``` - -#### 1.3.2 Difference Segment Query - -Segment the data in table2 by humidity value differences less than 0.1, and query the total number of data entries in each segment, as well as the start and end times. - -* Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table2 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (B.humidity - PREV(B.humidity )) <=0.1 -) AS m; -``` - -* Results - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-27T00:00:00.000+08:00| 2| -|2024-11-28T08:00:00.000+08:00|2024-11-29T00:00:00.000+08:00| 2| -|2024-11-29T11:00:00.000+08:00|2024-11-30T00:00:00.000+08:00| 2| -+-----------------------------+-----------------------------+---+ -Total line number = 3 -``` - -#### 1.3.3 Event Statistics Query - -Group the data in table1 by device ID, and count the start and end times and maximum humidity value where the humidity in the Shanghai area is greater than 35. 
- -* Query SQL - -```SQL -SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity -FROM table1 -MATCH_RECOGNIZE ( - PARTITION BY device_id - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RPR_FIRST(A.time) AS event_start, - RPR_LAST(A.time) AS event_end, - MAX(A.humidity) AS max_humidity - ONE ROW PER MATCH - PATTERN (A+) - DEFINE - A AS A.region= '上海' AND A.humidity> 35 -) AS m -``` - -* Results - -```SQL -+---------+-----+-----------------------------+-----------------------------+------------+ -|device_id|match| event_start| event_end|max_humidity| -+---------+-----+-----------------------------+-----------------------------+------------+ -| 100| 1|2024-11-28T09:00:00.000+08:00|2024-11-29T18:30:00.000+08:00| 45.1| -| 101| 1|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 35.2| -+---------+-----+-----------------------------+-----------------------------+------------+ -Total line number = 2 -``` - - -## 2. Window Functions - -### 2.1 Function Overview - -Window Functions perform calculations on each row based on a specific set of rows related to the current row (called a "window"). It combines grouping operations (`PARTITION BY`), sorting (`ORDER BY`), and definable calculation ranges (window frame `FRAME`), enabling complex cross-row calculations without collapsing the original data rows. It is commonly used in data analysis scenarios such as ranking, cumulative sums, moving averages, etc. - -> Note: This feature is available starting from version V 2.0.5. - -For example, in a scenario where you need to query the cumulative power consumption values of different devices, you can achieve this using window functions. 
- -```SQL --- Original data -+-----------------------------+------+-----+ -| time|device| flow| -+-----------------------------+------+-----+ -|1970-01-01T08:00:00.000+08:00| d0| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| -|1970-01-01T08:00:02.000+08:00| d0| 3| -|1970-01-01T08:00:03.000+08:00| d0| 1| -|1970-01-01T08:00:04.000+08:00| d1| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| -+-----------------------------+------+-----+ - --- Create table and insert data -CREATE TABLE device_flow(device String tag, flow INT32 FIELD); -insert into device_flow(time, device ,flow ) values ('1970-01-01T08:00:00.000+08:00','d0',3),('1970-01-01T08:00:01.000+08:00','d0',5),('1970-01-01T08:00:02.000+08:00','d0',3),('1970-01-01T08:00:03.000+08:00','d0',1),('1970-01-01T08:00:04.000+08:00','d1',2),('1970-01-01T08:00:05.000+08:00','d1',4); - - --- Execute window function query -SELECT *, sum(flow) OVER(PARTITION BY device ORDER BY flow) as sum FROM device_flow; -``` - -After grouping, sorting, and calculation (the steps are broken down in the figure below), - -![](/img/window-function-1.png) - -the expected results can be obtained: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -### 2.2 Function Definition - -#### 2.2.1 SQL Definition - -```SQL -windowDefinition - : name=identifier AS '(' windowSpecification ')' - ; - -windowSpecification - : (existingWindowName=identifier)? - (PARTITION BY partition+=expression (',' partition+=expression)*)? - (ORDER BY sortItem (',' sortItem)*)? - windowFrame? 
- ; - -windowFrame - : frameExtent - ; - -frameExtent - : frameType=RANGE start=frameBound - | frameType=ROWS start=frameBound - | frameType=GROUPS start=frameBound - | frameType=RANGE BETWEEN start=frameBound AND end=frameBound - | frameType=ROWS BETWEEN start=frameBound AND end=frameBound - | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound - ; - -frameBound - : UNBOUNDED boundType=PRECEDING #unboundedFrame - | UNBOUNDED boundType=FOLLOWING #unboundedFrame - | CURRENT ROW #currentRowBound - | expression boundType=(PRECEDING | FOLLOWING) #boundedFrame - ; -``` - -#### 2.2.2 Window Definition - -##### 2.2.2.1 Partition - -`PARTITION BY` is used to divide data into multiple independent, unrelated "groups". Window functions can only access and operate on data within their respective groups, and cannot access data from other groups. This clause is optional; if not explicitly specified, all data is divided into the same group by default. It is worth noting that unlike `GROUP BY` which aggregates a group of data into a single row, the window function with `PARTITION BY` **does not affect the number of rows within the group.** - -* Example - -Query statement: - -```SQL -IoTDB> SELECT *, count(flow) OVER (PARTITION BY device) as count FROM device_flow; -``` - -Disassembly steps: - -![](/img/window-function-2.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 4| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -|1970-01-01T08:00:02.000+08:00| d0| 3| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 4| -+-----------------------------+------+----+-----+ -``` - -##### 2.2.2.2 Ordering - -`ORDER BY` is used to sort data within a partition. After sorting, rows with equal values are called peers. 
Peers affect the behavior of window functions; for example, different rank functions handle peers differently, and different frame division methods also handle peers differently. This clause is optional. - -* Example - -Query statement: - -```SQL -IoTDB> SELECT *, rank() OVER (PARTITION BY device ORDER BY flow) as rank FROM device_flow; -``` - -Disassembly steps: - -![](/img/window-function-3.png) - -Query result: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -##### 2.2.2.3 Framing - -For each row in a partition, the window function evaluates on a corresponding set of rows called a Frame (i.e., the input domain of the Window Function on each row). The Frame can be specified manually, involving two attributes when specified, as detailed below. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Frame Attribute | Attribute Value | Value Description |
|------------------------|---------------------|-------------------|
| Type | ROWS | Divide the frame by row number |
| | GROUPS | Divide the frame by peers, i.e., rows with the same value are regarded as equivalent. All rows in peers are grouped into one group called a peer group |
| | RANGE | Divide the frame by value |
| Start and End Position | UNBOUNDED PRECEDING | The first row of the entire partition |
| | offset PRECEDING | Represents the row with an "offset" distance from the current row in the preceding direction |
| | CURRENT ROW | The current row |
| | offset FOLLOWING | Represents the row with an "offset" distance from the current row in the following direction |
| | UNBOUNDED FOLLOWING | The last row of the entire partition |
- -Among them, the meanings of `CURRENT ROW`, `PRECEDING N`, and `FOLLOWING N` vary with the type of frame, as shown in the following table: - -| | `ROWS` | `GROUPS` | `RANGE` | -|--------------------|------------|------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------| -| `CURRENT ROW` | Current row | Since a peer group contains multiple rows, this option differs depending on whether it acts on frame_start and frame_end: * frame_start: the first row of the peer group; * frame_end: the last row of the peer group. | Same as GROUPS, differing depending on whether it acts on frame_start and frame_end: * frame_start: the first row of the peer group; * frame_end: the last row of the peer group. | -| `offset PRECEDING` | The previous offset rows | The previous offset peer groups; | Rows whose value difference from the current row in the preceding direction is less than or equal to offset are grouped into one frame | -| `offset FOLLOWING` | The following offset rows | The following offset peer groups. | Rows whose value difference from the current row in the following direction is less than or equal to offset are grouped into one frame | - -The syntax format is as follows: - -```SQL --- Specify both frame_start and frame_end -{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end --- Specify only frame_start, frame_end is CURRENT ROW -{ RANGE | ROWS | GROUPS } frame_start -``` - -If the Frame is not specified manually, the default Frame division rules are as follows: - -* When the window function uses ORDER BY: The default Frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (i.e., from the first row of the window to the current row). For example: In RANK() OVER(PARTITION BY COL1 ORDER BY COL2), the Frame defaults to include the current row and all preceding rows in the partition. 
-* When the window function does not use ORDER BY: The default Frame is RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING (i.e., all rows in the entire window). For example: In AVG(COL2) OVER(PARTITION BY col1), the Frame defaults to include all rows in the partition, calculating the average of the entire partition. - -It should be noted that when the Frame type is GROUPS or RANGE, `ORDER BY` must be specified. The difference is that ORDER BY in GROUPS can involve multiple fields, while RANGE requires calculation and thus can only specify one field. - -* Example - -1. Frame type is ROWS - -Query statement: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ROWS 1 PRECEDING) as count FROM device_flow; -``` - -Disassembly steps: - -* Take the previous row and the current row as the Frame - * For the first row of the partition, since there is no previous row, the entire Frame has only this row, returning 1; - * For other rows of the partition, the entire Frame includes the current row and its previous row, returning 2: - -![](/img/window-function-4.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 2| -+-----------------------------+------+----+-----+ -``` - -2. Frame type is GROUPS - -Query statement: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ORDER BY flow GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -Disassembly steps: - -* Take the previous peer group and the current peer group as the Frame. 
Taking the partition with device d0 as an example (same for d1), for the count of rows: - * For the peer group with flow 1, since no peer group precedes it, the entire Frame has only this row, returning 1; - * For the peer group with flow 3, it itself contains 2 rows, and the previous peer group is the one with flow 1 (1 row), so the entire Frame has 3 rows, returning 3; - * For the peer group with flow 5, it itself contains 1 row, and the previous peer group is the one with flow 3 (2 rows), so the entire Frame has 3 rows, returning 3. - -![](/img/window-function-5.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -3. Frame type is RANGE - -Query statement: - -```SQL -IoTDB> SELECT *,count(flow) OVER(PARTITION BY device ORDER BY flow RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -Disassembly steps: - -* The Frame contains the rows whose flow value is **at most 2 less than** that of the current row, up to and including the current row's peers. Taking the partition with device d0 as an example (same for d1), for the count of rows: - * For the row with flow 1, since it is the smallest row, the entire Frame has only this row, returning 1; - * For the row with flow 3, note that CURRENT ROW acts as frame_end, so the Frame ends at the last row of the current row's peer group. 
There is 1 row smaller than it that meets the requirement, and the peer group has 2 rows, so the entire Frame has 3 rows, returning 3; - * For the row with flow 5, it itself contains 1 row, and there are 2 rows smaller than it that meet the requirement, so the entire Frame has 3 rows, returning 3. - -![](/img/window-function-6.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -### 2.3 Built-in Window Functions - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Window Function Category | Window Function Name | Function Definition | Supports FRAME Clause |
|--------------------------|----------------------|---------------------|-----------------------|
| Aggregate Function | All built-in aggregate functions | Aggregate a set of values to get a single aggregated result. | Yes |
| Value Function | first_value | Return the first value of the frame; if IGNORE NULLS is specified, skip leading NULLs | Yes |
| | last_value | Return the last value of the frame; if IGNORE NULLS is specified, skip trailing NULLs | Yes |
| | nth_value | Return the nth element of the frame (note that n starts from 1); if IGNORE NULLS is specified, skip NULLs | Yes |
| | lead | Return the element offset rows after the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default | No |
| | lag | Return the element offset rows before the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default | No |
| Rank Function | rank | Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there may be gaps between sequence numbers | No |
| | dense_rank | Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there are no gaps between sequence numbers | No |
| | row_number | Return the row number of the current row in the entire partition; note that the row number starts from 1 | No |
| | percent_rank | Return the sequence number of the current row's value in the entire partition as a percentage; i.e., (rank() - 1) / (n - 1), where n is the number of rows in the entire partition | No |
| | cume_dist | Return the sequence number of the current row's value in the entire partition as a percentage; i.e., (number of rows less than or equal to it) / n | No |
| | ntile | Specify n to number each row from 1 to n. | No |
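
As a rough illustration of the last column of the table, the sketch below places a function that honors a frame clause next to one that does not, using the `device_flow` table from the earlier examples. It assumes that IoTDB accepts several window functions in one `SELECT` list and that an inline `OVER (...)` specification can be mixed with a named window, as standard SQL allows; the aliases `moving_sum` and `rn` are illustrative only.

```SQL
-- sum() is an aggregate function, so the ROWS frame limits it to the
-- previous row and the current row within each device.
-- row_number() is a rank function: it takes no frame clause and always
-- numbers rows across the whole partition.
SELECT
  time,
  device,
  flow,
  sum(flow) OVER (PARTITION BY device ORDER BY time ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS moving_sum,
  row_number() OVER w AS rn
FROM device_flow
WINDOW w AS (PARTITION BY device ORDER BY time);
```
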
- -#### 2.3.1 Aggregate Function - -All built-in aggregate functions such as `sum()`, `avg()`, `min()`, `max()` can be used as Window Functions. - -> Note: Unlike GROUP BY, each row has a corresponding output in the Window Function - -Example: - -```SQL -IoTDB> SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) as sum FROM device_flow; -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -#### 2.3.2 Value Function - -1. `first_value` - -* Function name: `first_value(value) [IGNORE NULLS]` -* Definition: Return the first value of the frame; if IGNORE NULLS is specified, skip leading NULLs; -* Example: - -```SQL -IoTDB> SELECT *, first_value(flow) OVER w as first_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+-----------+ -| time|device|flow|first_value| -+-----------------------------+------+----+-----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----------+ -``` - -2. 
`last_value` - -* Function name: `last_value(value) [IGNORE NULLS]` -* Definition: Return the last value of the frame; if IGNORE NULLS is specified, skip trailing NULLs; -* Example: - -```SQL -IoTDB> SELECT *, last_value(flow) OVER w as last_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|last_value| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 5| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -3. `nth_value` - -* Function name: `nth_value(value, n) [IGNORE NULLS]` -* Definition: Return the nth element of the frame (note that n starts from 1); if IGNORE NULLS is specified, skip NULLs; -* Example: - -```SQL -IoTDB> SELECT *, nth_value(flow, 2) OVER w as nth_values FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|nth_values| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -4. 
lead - -* Function name: `lead(value[, offset[, default]]) [IGNORE NULLS]` -* Definition: Return the element offset rows after the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default; the default value of offset is 1, and the default value of default is NULL. -* The lead function requires an ORDER BY window clause -* Example: - -```SQL -IoTDB> SELECT *, lead(flow) OVER w as lead FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time); -+-----------------------------+------+----+----+ -| time|device|flow|lead| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4|null| -|1970-01-01T08:00:00.000+08:00| d0| 3| 5| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 1| -|1970-01-01T08:00:03.000+08:00| d0| 1|null| -+-----------------------------+------+----+----+ -``` - -5. lag - -* Function name: `lag(value[, offset[, default]]) [IGNORE NULLS]` -* Definition: Return the element offset rows before the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default; the default value of offset is 1, and the default value of default is NULL. 
-* The lag function requires an ORDER BY window clause -* Example: - -```SQL -IoTDB> SELECT *, lag(flow) OVER w as lag FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time); -+-----------------------------+------+----+----+ -| time|device|flow| lag| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2|null| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3|null| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 5| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -+-----------------------------+------+----+----+ -``` - -#### 2.3.3 Rank Function - -1. rank - -* Function name: `rank()` -* Definition: Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there may be gaps between sequence numbers; -* Example: - -```SQL -IoTDB> SELECT *, rank() OVER w as rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -2. dense_rank - -* Function name: `dense_rank()` -* Definition: Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there are no gaps between sequence numbers. 
-* Example: - -```SQL -IoTDB> SELECT *, dense_rank() OVER w as dense_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|dense_rank| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+----------+ -``` - -3. row_number - -* Function name: `row_number()` -* Definition: Return the row number of the current row in the entire partition; note that the row number starts from 1; -* Example: - -```SQL -IoTDB> SELECT *, row_number() OVER w as row_number FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|row_number| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----------+ -``` - -4. 
percent_rank - -* Function name: `percent_rank()` -* Definition: Return the sequence number of the current row's value in the entire partition as a percentage; i.e., **(rank() - 1) / (n - 1)**, where n is the number of rows in the entire partition; -* Example: - -```SQL -IoTDB> SELECT *, percent_rank() OVER w as percent_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+------------------+ -| time|device|flow| percent_rank| -+-----------------------------+------+----+------------------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.0| -|1970-01-01T08:00:00.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:02.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+------------------+ -``` - -5. cume_dist - -* Function name: `cume_dist` -* Definition: Return the sequence number of the current row's value in the entire partition as a percentage; i.e., **(number of rows less than or equal to it) / n**. -* Example: - -```SQL -IoTDB> SELECT *, cume_dist() OVER w as cume_dist FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+---------+ -| time|device|flow|cume_dist| -+-----------------------------+------+----+---------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.5| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.25| -|1970-01-01T08:00:00.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:02.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+---------+ -``` - -6. ntile - -* Function name: `ntile` -* Definition: Specify n to number each row from 1 to n. 
- * If the partition has fewer rows than n, each row's number is its row index; - * If the partition has at least n rows: - * If the number of rows is divisible by n, the rows are split evenly. For example, if the number of rows is 4 and n is 2, the numbers are 1, 1, 2, 2; - * If the number of rows is not divisible by n, the extra rows go to the first groups. For example, if the number of rows is 5 and n is 3, the numbers are 1, 1, 2, 2, 3; -* Example: - -```SQL -IoTDB> SELECT *, ntile(2) OVER w as ntile FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+-----+ -| time|device|flow|ntile| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -+-----------------------------+------+----+-----+ -``` - -### 2.4 Scenario Examples - -1. Multi-device diff function - -For each row of each device, calculate the difference from the previous row: - -```SQL -SELECT - *, - measurement - lag(measurement) OVER (PARTITION BY device ORDER BY time) -FROM data -WHERE timeCondition; -``` - -For each row of each device, calculate the difference from the next row: - -```SQL -SELECT - *, - measurement - lead(measurement) OVER (PARTITION BY device ORDER BY time) -FROM data -WHERE timeCondition; -``` - -For each row of a single device, calculate the difference from the previous row (same for the next row): - -```SQL -SELECT - *, - measurement - lag(measurement) OVER (ORDER BY time) -FROM data -WHERE device='d1' - AND timeCondition; -``` - -2. Multi-device TOP_K/BOTTOM_K - -Use rank to get the sequence number, then retain the desired order in the outer query. 
- -(Note: window functions are evaluated after the HAVING clause, so filtering on their result requires a subquery.) - -```SQL -SELECT * -FROM( - SELECT - *, - rank() OVER (PARTITION BY device ORDER BY time DESC) AS rank - FROM data - WHERE timeCondition -) -WHERE rank <= 3; -``` - -In addition to sorting by time, you can also sort by the value of the measurement point: - -```SQL -SELECT * -FROM( - SELECT - *, - rank() OVER (PARTITION BY device ORDER BY measurement DESC) AS rank - FROM data - WHERE timeCondition -) -WHERE rank <= 3; -``` - -3. Multi-device CHANGE_POINTS - -This query removes consecutive identical values from the input sequence, which can be achieved with lead and a subquery: - -```SQL -SELECT - time, - device, - measurement -FROM( - SELECT - time, - device, - measurement, - LEAD(measurement) OVER (PARTITION BY device ORDER BY time) AS next - FROM data - WHERE timeCondition -) -WHERE measurement != next OR next IS NULL; -``` diff --git a/src/UserGuide/Master/Table/User-Manual/Tree-to-Table_timecho.md b/src/UserGuide/Master/Table/User-Manual/Tree-to-Table_timecho.md deleted file mode 100644 index 729c7ada4..000000000 --- a/src/UserGuide/Master/Table/User-Manual/Tree-to-Table_timecho.md +++ /dev/null @@ -1,619 +0,0 @@ - - -# Tree-to-Table Mapping - -## 1. Functional Overview - -IoTDB introduces a tree-to-table function, which enables the creation of table views from existing tree-model data. This allows querying via table views, achieving collaborative processing of both tree and table models for the same dataset: - -* During the data writing phase, the tree-model syntax is used, supporting flexible data ingestion and expansion. -* During the data analysis phase, the table-model syntax is adopted, allowing complex data analysis through standard SQL queries. - -![](/img/tree-to-table-en-1.png) - -> * This feature is supported since version V2.0.5. -> * Table views are read-only, so data cannot be written through them. - - ## 2. 
Feature Description -### 2.1 Creating a Table View -#### 2.1.1 Syntax Definition - -```SQL --- create (or replace) view on tree -CREATE - [OR REPLACE] - VIEW view_name ([viewColumnDefinition (',' viewColumnDefinition)*]) - [comment] - [RESTRICT] - [WITH properties] - AS prefixPath - -viewColumnDefinition - : column_name [dataType] TAG [comment] # tagColumn - | column_name [dataType] TIME [comment] # timeColumn - | column_name [dataType] FIELD [FROM original_measurement] [comment] # fieldColumn - ; - -comment - : COMMENT string - ; -``` - -> Note: Columns only support tags, fields, or time; attributes are not supported. - -#### 2.1.2 Syntax Explanation -1. **`prefixPath`** - -Corresponds to the path in the tree model. The last level of the path must be `**`, and no other levels can contain `*`or `**`. This path determines the subtree corresponding to the VIEW. - -2. **`view_name`** - -The name of the view, which follows the same rules as a table name (for specific constraints, refer to [Create Table](../Basic-Concept/Table-Management_timecho.md#\_1-1-create-a-table)), e.g., `db.view`. - -3. **`viewColumnDefinition`** - -* `TAG`: Each TAG column corresponds, in order, to the path nodes at the levels following the `prefixPath`. -* `FIELD`: A FIELD column corresponds to a measurement (leaf node) in the tree model. - * If a FIELD column is specified, the column name uses the declared `column_name`. - * If `original_measurement`is declared, it maps directly to that measurement in the tree model. Otherwise, the lowercase `column_name`is used as the measurement name for mapping. - * Mapping multiple FIELD columns to the same measurement name in the tree model is not supported. - * If the `dataType`for a FIELD column is not specified, the system defaults to the data type of the mapped measurement in the tree model. 
- * If a device in the tree model does not contain certain declared FIELD columns, or if their data types are inconsistent with the declared FIELD columns, the value for that FIELD column will always be `NULL`when querying that device. - * If no FIELD columns are specified, the system automatically scans for all measurements under the `prefixPath`subtree (including all ordinary sequence measurements and measurements defined in any templates whose mounted paths overlap with the `prefixPath`) during creation. The column names will use the measurement names from the tree model. - * The tree model cannot have measurements with the same name (case-insensitive) but different data types. -* `TIME`: When creating a view, you do not need to specify a time column. IoTDB automatically adds a column named "time" and places it as the first column. Since version V2.0.8.2, views support **custom naming of the time column** during creation. The order of the custom time column in the view is determined by the order in the creation SQL. The related constraints are as follows: - * When the column category is set to `TIME`, the data type must be `TIMESTAMP`. - * Each view allows at most one time column (columnCategory = TIME). - * If no time column is explicitly defined, no other column can use `time` as its name to avoid conflicts with the system's default time column naming. - - -4. **`WITH properties`** - -Currently, only TTL is supported. It indicates that data older than TTL (in milliseconds) will not be displayed in query results, i.e., effectively `WHERE time > now() - TTL`. If a TTL is also set in the tree model, the query uses the smaller value of the two. - -> Note: The table view's TTL does not affect the actual TTL of the devices in the tree model. When data reaches the TTL set in the tree model, it will be physically deleted by the system. - -5. **`OR REPLACE`** - -A table and a view cannot have the same name. 
If a table with the same name already exists during creation, an error will be reported. If a view with the same name already exists, it will be replaced. - -6. **`RESTRICT`** - -This constrains the number of levels of the tree model devices that are matched (starting from the level below the `prefixPath`). If the `RESTRICT` keyword is present, only devices whose level count exactly equals the number of TAG columns are matched. Otherwise, devices whose level count is less than or equal to the number of TAG columns are matched. This non-RESTRICT behavior is the default. - -#### 2.1.3 Usage Example -1. Tree Model and Table View Schema - -![](/img/tree-to-table-en-2.png) - -2. Creating the Table View - -* Creation Statement: - -```SQL -CREATE OR REPLACE VIEW viewdb."wind_turbine" - (wind_turbine_group String TAG, - wind_turbine_number String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD - ) -with (ttl=604800000) -AS root.db.** -``` - -* Detailed Explanation - -This statement creates a view named `viewdb.wind_turbine` (an error will occur if `viewdb` does not exist). If the view already exists, it will be replaced. - -* It creates a table view for the time series mounted under the tree model path `root.db.**`. -* It has two `TAG` columns, `wind_turbine_group` and `wind_turbine_number`, so the table view will only include devices from the 3rd level of the original tree model. -* It has two `FIELD` columns, `voltage` and `current`. Here, these `FIELD` columns correspond to measurement names in the tree model that are also `voltage` and `current`, and only select time series of type `DOUBLE`. 
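
Once created, the view can be queried like an ordinary table. The following is a minimal sketch that assumes the view above exists; the tag value `'group_1'` is a hypothetical placeholder, not taken from the example schema.

```SQL
-- TAG columns are filled from the path levels below root.db,
-- FIELD columns from the measurements mapped to them.
SELECT time, wind_turbine_group, wind_turbine_number, voltage
FROM viewdb."wind_turbine"
WHERE wind_turbine_group = 'group_1';
```

Because table views are read-only, only queries are possible here; writes still go through the tree model.
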
- -**Renaming measurement requirement:** - -If the measurement name in the tree model is `current_new`, but you want the corresponding `FIELD` column name in the table view to be `current`, the SQL should be changed as follows: - -```SQL -CREATE OR REPLACE VIEW viewdb."wind_turbine" - (wind_turbine_group String TAG, - wind_turbine_number String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD FROM current_new - ) -with (ttl=604800000) -AS root.db.** -``` - -When customizing the time column (supported since V2.0.8.2), the SQL changes are as follows: - -```SQL -CREATE OR REPLACE VIEW viewdb."wind_turbine" - (wind_turbine_group String TAG, - wind_turbine_number String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD, - time_user_defined TIMESTAMP TIME - ) -with (ttl=604800000) -AS root.db.** -``` - - -### 2.2 Modifying a Table View -#### 2.2.1 Syntax Definition - -The ALTER VIEW function supports modifying the view name, adding columns, renaming columns, modifying FIELD column data type (supported since V2.0.8.2), deleting columns, setting the view's TTL property, and adding comments via COMMENT. 
- -```SQL --- Rename view -ALTER VIEW [IF EXISTS] viewName RENAME TO to=identifier - --- Add a column to the view -ALTER VIEW [IF EXISTS] viewName ADD COLUMN [IF NOT EXISTS] viewColumnDefinition -viewColumnDefinition - : column_name [dataType] TAG # tagColumn - | column_name [dataType] FIELD [FROM original_measurement] # fieldColumn - --- Rename a column in the view -ALTER VIEW [IF EXISTS] viewName RENAME COLUMN [IF EXISTS] oldName TO newName - --- Modify the data type of a FIELD column -ALTER VIEW [IF EXISTS] viewName ALTER COLUMN [IF EXISTS] columnName SET DATA TYPE new_type - --- Delete a column from the view -ALTER VIEW [IF EXISTS] viewName DROP COLUMN [IF EXISTS] columnName - --- Modify the view's TTL -ALTER VIEW [IF EXISTS] viewName SET PROPERTIES propertyAssignments - --- Add comments -COMMENT ON VIEW qualifiedName IS (string | NULL) #commentView -COMMENT ON COLUMN qualifiedName '.' column=identifier IS (string | NULL) #commentColumn -``` - -#### 2.2.2 Syntax Explanation -1. The `SET PROPERTIES`operation currently only supports configuring the TTL property for the table view. -2. The `DROP COLUMN`function only supports deleting FIELD columns; TAG columns cannot be deleted. -3. Modifying the comment will overwrite the original comment. If set to `null`, the previous comment will be erased. -4. When modifying the data type of a FIELD column, the new data type must be compatible with the original type. 
The specific compatibility is shown in the following table: - -| Original Type | Convertible To Type | -|---------------|----------------------------------------------| -| INT32 | INT64, FLOAT, DOUBLE, TIMESTAMP, STRING, TEXT | -| INT64 | TIMESTAMP, DOUBLE, STRING, TEXT | -| FLOAT | DOUBLE, STRING, TEXT | -| DOUBLE | STRING, TEXT | -| BOOLEAN | STRING, TEXT | -| TEXT | BLOB, STRING | -| STRING | TEXT, BLOB | -| BLOB | STRING, TEXT | -| DATE | STRING, TEXT | -| TIMESTAMP | INT64, DOUBLE, STRING, TEXT | - -#### 2.2.3 Usage Examples - -```SQL --- Rename view -ALTER VIEW IF EXISTS tableview1 RENAME TO tableview - --- Add a column to the view -ALTER VIEW IF EXISTS tableview ADD COLUMN IF NOT EXISTS temperature float field - --- Rename a column in the view -ALTER VIEW IF EXISTS tableview RENAME COLUMN IF EXISTS temperature TO temp - --- Modify the data type of a FIELD column -ALTER VIEW IF EXISTS tableview ALTER COLUMN IF EXISTS temperature SET DATA TYPE double - --- Delete a column from the view -ALTER VIEW IF EXISTS tableview DROP COLUMN IF EXISTS temp - --- Modify the view's TTL -ALTER VIEW IF EXISTS tableview SET PROPERTIES TTL=3600 - --- Add comments -COMMENT ON VIEW tableview IS 'Tree to Table' -COMMENT ON COLUMN tableview.status is Null -``` - -### 2.3 Deleting a Table View -#### 2.3.1 Syntax Definition - -```SQL -DROP VIEW [IF EXISTS] viewName -``` - -#### 2.3.2 Usage Example - -```SQL -DROP VIEW IF EXISTS tableview -``` - -### 2.4 Viewing Table Views -#### 2.4.1 **`Show Tables`** -1. Syntax Definition - -```SQL -SHOW TABLES (DETAILS)? ((FROM | IN) database_name)? -``` - -2. 
Syntax Explanation - -The `SHOW TABLES (DETAILS)`statement displays the type information of tables or views through the `TABLE_TYPE`field in the result set: - -| Type | `TABLE_TYPE`Field Value | -| -------------------------------------------- | ----------------------------- | -| Ordinary Table(Table) | `BASE TABLE` | -| Tree-to-Table View (Tree View) | `VIEW FROM TREE` | -| System Table(Iinformation\_schema.Tables) | `SYSTEM VIEW` | - -3. Usage Examples - -```SQL -IoTDB> show tables details from database1 -+-----------+-----------+------+---------------+--------------+ -| TableName| TTL(ms)|Status| Comment| TableType| -+-----------+-----------+------+---------------+--------------+ -| tableview| INF| USING| Tree to Table |VIEW FROM TREE| -| table1|31536000000| USING| null| BASE TABLE| -| table2|31536000000| USING| null| BASE TABLE| -+-----------+-----------+------+---------------+--------------+ - -IoTDB> show tables details from information_schema -+--------------+-------+------+-------+-----------+ -| TableName|TTL(ms)|Status|Comment| TableType| -+--------------+-------+------+-------+-----------+ -| columns| INF| USING| null|SYSTEM VIEW| -| config_nodes| INF| USING| null|SYSTEM VIEW| -|configurations| INF| USING| null|SYSTEM VIEW| -| data_nodes| INF| USING| null|SYSTEM VIEW| -| databases| INF| USING| null|SYSTEM VIEW| -| functions| INF| USING| null|SYSTEM VIEW| -| keywords| INF| USING| null|SYSTEM VIEW| -| models| INF| USING| null|SYSTEM VIEW| -| nodes| INF| USING| null|SYSTEM VIEW| -| pipe_plugins| INF| USING| null|SYSTEM VIEW| -| pipes| INF| USING| null|SYSTEM VIEW| -| queries| INF| USING| null|SYSTEM VIEW| -| regions| INF| USING| null|SYSTEM VIEW| -| subscriptions| INF| USING| null|SYSTEM VIEW| -| tables| INF| USING| null|SYSTEM VIEW| -| topics| INF| USING| null|SYSTEM VIEW| -| views| INF| USING| null|SYSTEM VIEW| -+--------------+-------+------+-------+-----------+ -``` - -#### 2.4.2 **`Show Create View`** -1. 
Syntax Definition - -```SQL -SHOW CREATE VIEW viewname; -``` - -2. Syntax Explanation - -* This statement retrieves the complete definition of a table or view. -* It automatically fills in all default values omitted during creation, so the statement shown in the result may differ from the original CREATE statement. -* This statement does not support system tables. - -3. Usage Examples - -```SQL -IoTDB> show create view tableview -+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| View| Create View| -+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+ -|tableview|CREATE VIEW "tableview" ("device" STRING TAG,"model" STRING TAG,"status" BOOLEAN FIELD,"hardware" STRING FIELD) COMMENT '表视图' WITH (ttl=INF) AS root.ln.**| -+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+ -``` - -> Additionally, you can also use the `SHOW CREATE TABLE` statement to view the creation information of table views. For more details, see [show create table](../Basic-Concept/Table-Management_timecho.md#_1-4-view-table-creation-statement) - - -### 2.5 Query Differences Between Non-aligned and Aligned Devices - -Queries on tree-to-table views may yield different results compared to equivalent tree model `ALIGN BY DEVICE`queries when dealing with null values in aligned and non-aligned devices. - -* Aligned Devices - * Tree Model Query Behavior:Rows where all selected time series have null values are not retained. - * Table View Query Behavior:Consistent with the table model, rows where all selected fields are null are retained. 
-* Non-aligned Devices - * Tree Model Query Behavior:Rows where all selected time series have null values are not retained. - * Table View Query Behavior:Consistent with the tree model, rows where all selected fields are null are not retained. -* Explanation Example - * Aligned - - ```SQL - -- Write data in tree model (aligned) - CREATE ALIGNED TIMESERIES root.db.battery.b1(voltage INT32, current FLOAT) - INSERT INTO root.db.battery.b1(time, voltage, current) aligned values (1, 1, 1) - INSERT INTO root.db.battery.b1(time, voltage, current) aligned values (2, null, 1) - - -- Create VIEW statement - CREATE VIEW view1 (battery_id TAG, voltage INT32 FIELD, current FLOAT FIELD) as root.db.battery.** - - -- Query - IoTDB> select voltage from view1 - +-------+ - |voltage| - +-------+ - | 1| - | null| - +-------+ - Total line number = 2 - ``` - - * Non-aligned - - ```SQL - -- Write data in tree model (non-aligned) - CREATE TIMESERIES root.db.battery.b1.voltage INT32 - CREATE TIMESERIES root.db.battery.b1.current FLOAT - INSERT INTO root.db.battery.b1(time, voltage, current) values (1, 1, 1) - INSERT INTO root.db.battery.b1(time, voltage, current) values (2, null, 1) - - -- Create VIEW statement - CREATE VIEW view1 (battery_id TAG, voltage INT32 FIELD, current FLOAT FIELD) as root.db.battery.** - - -- Query - IoTDB> select voltage from view1 - +-------+ - |voltage| - +-------+ - | 1| - +-------+ - Total line number = 1 - - -- Can only ensure all rows are retrieved if the query specifies all FIELD columns, or only non-FIELD columns - IoTDB> select voltage,current from view1 - +-------+-------+ - |voltage|current| - +-------+-------+ - | 1| 1.0| - | null| 1.0| - +-------+-------+ - Total line number = 2 - - IoTDB> select battery_id from view1 - +-----------+ - |battery_id| - +-----------+ - | b1| - | b1| - +-----------+ - Total line number = 2 - - -- If the query involves only some FIELD columns, the final number of rows depends on the number of rows after aligning the 
specified FIELD columns by timestamp. - IoTDB> select time,voltage from view1 - +-----------------------------+-------+ - | time|voltage| - +-----------------------------+-------+ - |1970-01-01T08:00:00.001+08:00| 1| - +-----------------------------+-------+ - Total line number = 1 - ``` - -## 3. Scenario Examples -### 3.1 Managing Multiple Device Types in the Original Tree Model - -* The scenario involves managing different types of devices, each with its own hierarchical path and set of measurements. -* During Data Writing: Create branches under the database node according to device type. Each device type can have a different measurement structure. -* During Querying: Create a separate table for each device type. Each table will have different tags and sets of measurements. - -![](/img/tree-to-table-en-3.png) - -**SQL for Creating a Table View:** - -```SQL --- Wind Turbine Table -CREATE VIEW viewdb.wind_turbine - (wind_turbine_group String TAG, - wind_turbine_number String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD - ) -AS root.db.wind_turbine.** - --- Motor Table -CREATE VIEW viewdb.motor - ( motor_group String TAG, - motor_number String TAG, - power FLOAT FIELD, - electricity FLOAT FIELD, - temperature FLOAT FIELD - ) -AS root.db.motor.** -``` - -### 3.2 Original Tree Model Contains Only Measurements, No Devices - -This scenario occurs in systems like station monitoring where each measurement has a unique identifier but cannot be mapped to specific physical devices. - -> Wide Table Form - -![](/img/tree-to-table-en-4.png) - -**SQL for Creating a Table View:** - -```SQL -CREATE VIEW viewdb.machine - (DCS_PIT_02105A DOUBLE FIELD, - DCS_PIT_02105B DOUBLE FIELD, - DCS_PIT_02105C DOUBLE FIELD, - ... 
- DCS_XI_02716A DOUBLE FIELD - ) -AS root.db.** -``` - -### 3.3 Original Tree Model Where a Device Has Both Sub-devices and Measurements - -This scenario is common in energy storage systems where each hierarchical level requires monitoring of parameters like voltage and current. - -* Writing Phase: Model according to physical monitoring points at each hierarchical level -* Querying Phase: Create multiple tables based on device categories to manage information at each structural level - -![](/img/tree-to-table-en-5.png) - -**SQL for Creating a Table View:** - -```SQL --- Battery Compartment -CREATE VIEW viewdb.battery_compartment - (station String TAG, - battery_compartment String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD - ) -RESTRICT -AS root.db.** - --- Battery Stack -CREATE VIEW viewdb.battery_stack - (station String TAG, - battery_compartment String TAG, - battery_stack String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD - ) -RESTRICT -AS root.db.** - --- Battery Cluster -CREATE VIEW viewdb.battery_cluster - (station String TAG, - battery_compartment String TAG, - battery_stack String TAG, - battery_cluster String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD - ) -RESTRICT -AS root.db.** - --- Battery Cell -CREATE VIEW viewdb.battery_cell - (station String TAG, - battery_compartment String TAG, - battery_stack String TAG, - battery_cluster String TAG, - battery_cell String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD - ) -RESTRICT -AS root.db.** -``` - -### 3.4 Original Tree Model Where a Device Has Only One Measurement Under It - -> Narrow Table Form - -#### 3.4.1 All Measurements Have the Same Data Type - -![](/img/tree-to-table-en-6.png) - -**SQL for Creating a Table View:** - -```SQL -CREATE VIEW viewdb.machine - ( - sensor_id STRING TAG, - value DOUBLE FIELD - ) -AS root.db.** -``` - -#### 3.4.2 Measurements Have Different Data Types -##### 3.4.2.1 Create a Narrow Table View for Each Data Type of Measurement - -**Advantage:
​**The number of table views is constant, only related to the data types in the system. - -**Disadvantage: ​**When querying the value of a specific measurement, its data type must be known in advance to determine which table view to query. - -![](/img/tree-to-table-en-7.png) - -**SQL for Creating a Table View:** - -```SQL -CREATE VIEW viewdb.machine_float - ( - sensor_id STRING TAG, - value FLOAT FIELD - ) -AS root.db.** - -CREATE VIEW viewdb.machine_double - ( - sensor_id STRING TAG, - value DOUBLE FIELD - ) -AS root.db.** - -CREATE VIEW viewdb.machine_int32 - ( - sensor_id STRING TAG, - value INT32 FIELD - ) -AS root.db.** - -CREATE VIEW viewdb.machine_int64 - ( - sensor_id STRING TAG, - value INT64 FIELD - ) -AS root.db.** - -... -``` - -##### 3.4.2.2 Create a Table for Each Measurement - -**Advantage: ​**When querying the value of a specific measurement, there's no need to first check its data type to determine which table to query, making the process simple and convenient. - -**Disadvantage: ​**When there are a large number of measurements, it will introduce too many table views, requiring the writing of a large number of view creation statements. - -![](/img/tree-to-table-en-8.png) - -**SQL for Creating a Table View:** - -```SQL -CREATE VIEW viewdb.DCS_PIT_02105A - ( - value FLOAT FIELD - ) -AS root.db.DCS_PIT_02105A.** - -CREATE VIEW viewdb.DCS_PIT_02105B - ( - value DOUBLE FIELD - ) -AS root.db.DCS_PIT_02105B.** - -CREATE VIEW viewdb.DCS_XI_02716A - ( - value INT64 FIELD - ) -AS root.db.DCS_XI_02716A.** - -...... 
-``` diff --git a/src/UserGuide/Master/Table/User-Manual/Window-Function_timecho.md b/src/UserGuide/Master/Table/User-Manual/Window-Function_timecho.md deleted file mode 100644 index 11675903e..000000000 --- a/src/UserGuide/Master/Table/User-Manual/Window-Function_timecho.md +++ /dev/null @@ -1,759 +0,0 @@ - - -# Window Functions - -For time-series data feature analysis scenarios, IoTDB provides the capability of window functions, which deliver a flexible and efficient solution for in-depth mining and complex computation of time-series data. The following sections will elaborate on the feature in detail. - -## 1. Function Overview - -Window Functions perform calculations on each row based on a specific set of rows related to the current row (called a "window"). It combines grouping operations (`PARTITION BY`), sorting (`ORDER BY`), and definable calculation ranges (window frame `FRAME`), enabling complex cross-row calculations without collapsing the original data rows. It is commonly used in data analysis scenarios such as ranking, cumulative sums, moving averages, etc. - -> Note: This feature is available starting from version V 2.0.5. - -For example, in a scenario where you need to query the cumulative power consumption values of different devices, you can achieve this using window functions. 
- -```SQL --- Original data -+-----------------------------+------+-----+ -| time|device| flow| -+-----------------------------+------+-----+ -|1970-01-01T08:00:00.000+08:00| d0| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| -|1970-01-01T08:00:02.000+08:00| d0| 3| -|1970-01-01T08:00:03.000+08:00| d0| 1| -|1970-01-01T08:00:04.000+08:00| d1| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| -+-----------------------------+------+-----+ - --- Create table and insert data -CREATE TABLE device_flow(device STRING TAG, flow INT32 FIELD); -INSERT INTO device_flow(time, device, flow) VALUES ('1970-01-01T08:00:00.000+08:00','d0',3),('1970-01-01T08:00:01.000+08:00','d0',5),('1970-01-01T08:00:02.000+08:00','d0',3),('1970-01-01T08:00:03.000+08:00','d0',1),('1970-01-01T08:00:04.000+08:00','d1',2),('1970-01-01T08:00:05.000+08:00','d1',4); - - --- Execute window function query -SELECT *, sum(flow) OVER(PARTITION BY device ORDER BY flow) as sum FROM device_flow; -``` - -After grouping, sorting, and calculation (the steps are broken down in the figure below), - -![](/img/window-function-1.png) - -the expected results are obtained: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -## 2. Function Definition - -### 2.1 SQL Definition - -```SQL -windowDefinition - : name=identifier AS '(' windowSpecification ')' - ; - -windowSpecification - : (existingWindowName=identifier)? - (PARTITION BY partition+=expression (',' partition+=expression)*)? - (ORDER BY sortItem (',' sortItem)*)? - windowFrame?
- ; - -windowFrame - : frameExtent - ; - -frameExtent - : frameType=RANGE start=frameBound - | frameType=ROWS start=frameBound - | frameType=GROUPS start=frameBound - | frameType=RANGE BETWEEN start=frameBound AND end=frameBound - | frameType=ROWS BETWEEN start=frameBound AND end=frameBound - | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound - ; - -frameBound - : UNBOUNDED boundType=PRECEDING #unboundedFrame - | UNBOUNDED boundType=FOLLOWING #unboundedFrame - | CURRENT ROW #currentRowBound - | expression boundType=(PRECEDING | FOLLOWING) #boundedFrame - ; -``` - -### 2.2 Window Definition - -#### 2.2.1 Partition - -`PARTITION BY` is used to divide data into multiple independent, unrelated "groups". Window functions can only access and operate on data within their respective groups, and cannot access data from other groups. This clause is optional; if not explicitly specified, all data is divided into the same group by default. It is worth noting that unlike `GROUP BY` which aggregates a group of data into a single row, the window function with `PARTITION BY` **does not affect the number of rows within the group.** - -* Example - -Query statement: - -```SQL -IoTDB> SELECT *, count(flow) OVER (PARTITION BY device) as count FROM device_flow; -``` - -Disassembly steps: - -![](/img/window-function-2.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 4| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -|1970-01-01T08:00:02.000+08:00| d0| 3| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 4| -+-----------------------------+------+----+-----+ -``` - -#### 2.2.2 Ordering - -`ORDER BY` is used to sort data within a partition. After sorting, rows with equal values are called peers. 
Peers affect the behavior of window functions; for example, different rank functions handle peers differently, and different frame division methods also handle peers differently. This clause is optional. - -* Example - -Query statement: - -```SQL -IoTDB> SELECT *, rank() OVER (PARTITION BY device ORDER BY flow) as rank FROM device_flow; -``` - -Disassembly steps: - -![](/img/window-function-3.png) - -Query result: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -#### 2.2.3 Framing - -For each row in a partition, the window function evaluates on a corresponding set of rows called a Frame (i.e., the input domain of the Window Function on each row). The Frame can be specified manually, involving two attributes when specified, as detailed below. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Frame Attribute | Attribute Value | Value Description |
|------------------------|-----------------------|-------------------|
| Type | `ROWS` | Divide the frame by row number |
| | `GROUPS` | Divide the frame by peers, i.e., rows with the same value are regarded as equivalent. All rows in peers are grouped into one group called a peer group |
| | `RANGE` | Divide the frame by value |
| Start and End Position | `UNBOUNDED PRECEDING` | The first row of the entire partition |
| | `offset PRECEDING` | Represents the row with an "offset" distance from the current row in the preceding direction |
| | `CURRENT ROW` | The current row |
| | `offset FOLLOWING` | Represents the row with an "offset" distance from the current row in the following direction |
| | `UNBOUNDED FOLLOWING` | The last row of the entire partition |
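
As a rough illustration of how the three frame types differ, the sketch below models the frame `<type> BETWEEN offset PRECEDING AND CURRENT ROW` over a single partition. It is plain Python, not IoTDB code; the function name `frame_counts` is hypothetical, and it assumes the partition is already sorted ascending by a single numeric ORDER BY key.

```python
def frame_counts(values, frame_type, offset):
    """Count the rows in `<frame_type> BETWEEN offset PRECEDING AND CURRENT ROW`.

    `values` is one partition, already sorted ascending by the ORDER BY key.
    This is an illustrative model only, not how IoTDB implements frames.
    """
    peer_groups = sorted(set(values))  # one entry per distinct value (peer group)
    counts = []
    for i, v in enumerate(values):
        if frame_type == "ROWS":
            # Physical rows: the current row plus up to `offset` rows before it.
            start = max(0, i - offset)
            counts.append(i - start + 1)
        elif frame_type == "GROUPS":
            # Whole peer groups: the current group plus up to `offset` groups before it.
            g = peer_groups.index(v)
            allowed = set(peer_groups[max(0, g - offset): g + 1])
            counts.append(sum(1 for x in values if x in allowed))
        elif frame_type == "RANGE":
            # Value distance: rows whose value lies in [v - offset, v];
            # all peers of the current row are included.
            counts.append(sum(1 for x in values if v - offset <= x <= v))
        else:
            raise ValueError(f"unknown frame type: {frame_type}")
    return counts


d0 = [1, 3, 3, 5]  # flow values of device d0, sorted
print(frame_counts(d0, "ROWS", 1))
print(frame_counts(d0, "GROUPS", 1))
print(frame_counts(d0, "RANGE", 2))
```

With `ROWS`, ties are counted as separate rows; with `GROUPS` and `RANGE`, the two rows with flow 3 always enter or leave the frame together.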
- -Among them, the meanings of `CURRENT ROW`, `PRECEDING N`, and `FOLLOWING N` vary with the type of frame, as shown in the following table: - -| | `ROWS` | `GROUPS` | `RANGE` | -|--------------------|------------|------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------| -| `CURRENT ROW` | Current row | Since a peer group contains multiple rows, this option differs depending on whether it acts on frame_start and frame_end: * frame_start: the first row of the peer group; * frame_end: the last row of the peer group. | Same as GROUPS, differing depending on whether it acts on frame_start and frame_end: * frame_start: the first row of the peer group; * frame_end: the last row of the peer group. | -| `offset PRECEDING` | The previous offset rows | The previous offset peer groups; | Rows whose value difference from the current row in the preceding direction is less than or equal to offset are grouped into one frame | -| `offset FOLLOWING` | The following offset rows | The following offset peer groups. | Rows whose value difference from the current row in the following direction is less than or equal to offset are grouped into one frame | - -The syntax format is as follows: - -```SQL --- Specify both frame_start and frame_end -{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end --- Specify only frame_start, frame_end is CURRENT ROW -{ RANGE | ROWS | GROUPS } frame_start -``` - -If the Frame is not specified manually, the default Frame division rules are as follows: - -* When the window function uses ORDER BY: The default Frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (i.e., from the first row of the window to the current row). For example: In RANK() OVER(PARTITION BY COL1 ORDER BY COL2), the Frame defaults to include the current row and all preceding rows in the partition. 
-* When the window function does not use ORDER BY: The default Frame is RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING (i.e., all rows in the entire window). For example: In AVG(COL2) OVER(PARTITION BY col1), the Frame defaults to include all rows in the partition, calculating the average of the entire partition. - -It should be noted that when the Frame type is GROUPS or RANGE, `ORDER BY` must be specified. The difference is that ORDER BY in GROUPS can involve multiple fields, while RANGE requires calculation and thus can only specify one field. - -* Example - -1. Frame type is ROWS - -Query statement: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ROWS 1 PRECEDING) as count FROM device_flow; -``` - -Disassembly steps: - -* Take the previous row and the current row as the Frame - * For the first row of the partition, since there is no previous row, the entire Frame has only this row, returning 1; - * For other rows of the partition, the entire Frame includes the current row and its previous row, returning 2: - -![](/img/window-function-4.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 2| -+-----------------------------+------+----+-----+ -``` - -2. Frame type is GROUPS - -Query statement: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ORDER BY flow GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -Disassembly steps: - -* Take the previous peer group and the current peer group as the Frame. 
Taking the partition with device d0 as an example (same for d1), for the count of rows: - * For the peer group with flow 1, since there are no peer groups smaller than it, the entire Frame has only this row, returning 1; - * For the peer group with flow 3, it itself contains 2 rows, and the previous peer group is the one with flow 1 (1 row), so the entire Frame has 3 rows, returning 3; - * For the peer group with flow 5, it itself contains 1 row, and the previous peer group is the one with flow 3 (2 rows), so the entire Frame has 3 rows, returning 3. - -![](/img/window-function-5.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -3. Frame type is RANGE - -Query statement: - -```SQL -IoTDB> SELECT *,count(flow) OVER(PARTITION BY device ORDER BY flow RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -Disassembly steps: - -* Group rows whose data is **less than or equal to 2** compared to the current row into the same Frame. Taking the partition with device d0 as an example (same for d1), for the count of rows: - * For the row with flow 1, since it is the smallest row, the entire Frame has only this row, returning 1; - * For the row with flow 3, note that CURRENT ROW exists as frame_end, so it is the last row of the entire peer group. 
There is 1 row smaller than it that meets the requirement, and the peer group has 2 rows, so the entire Frame has 3 rows, returning 3; - * For the row with flow 5, it itself contains 1 row, and there are 2 rows smaller than it that meet the requirement, so the entire Frame has 3 rows, returning 3. - -![](/img/window-function-6.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -## 3. Built-in Window Functions - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Window Function Category | Window Function Name | Function Definition | Supports FRAME Clause |
|--------------------------|----------------------|---------------------|-----------------------|
| Aggregate Function | All built-in aggregate functions | Aggregate a set of values to get a single aggregated result. | Yes |
| Value Function | `first_value` | Return the first value of the frame; if IGNORE NULLS is specified, skip leading NULLs | Yes |
| | `last_value` | Return the last value of the frame; if IGNORE NULLS is specified, skip trailing NULLs | Yes |
| | `nth_value` | Return the nth element of the frame (note that n starts from 1); if IGNORE NULLS is specified, skip NULLs | Yes |
| | `lead` | Return the element offset rows after the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default | No |
| | `lag` | Return the element offset rows before the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default | No |
| Rank Function | `rank` | Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there may be gaps between sequence numbers | No |
| | `dense_rank` | Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there are no gaps between sequence numbers | No |
| | `row_number` | Return the row number of the current row in the entire partition; note that the row number starts from 1 | No |
| | `percent_rank` | Return the relative rank of the current row's value in the entire partition as a fraction; i.e., (rank() - 1) / (n - 1), where n is the number of rows in the entire partition | No |
| | `cume_dist` | Return the cumulative distribution of the current row's value in the entire partition; i.e., (number of rows less than or equal to it) / n, where n is the number of rows in the entire partition | No |
| | `ntile` | Divide the ordered rows of the partition into n buckets as evenly as possible and return the bucket number (from 1 to n) of the current row | No |
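
The rank function definitions in the table can be checked against a small model. The sketch below is plain Python rather than IoTDB code; the helper names `rank_functions` and `ntile_buckets` are hypothetical, and it assumes a single partition that is already sorted ascending by the ORDER BY key. For a one-row partition it returns 0.0 for `percent_rank`, which is the usual SQL convention and an assumption here.

```python
from bisect import bisect_left, bisect_right


def rank_functions(values):
    """Model the rank functions for one partition sorted by the ORDER BY key."""
    n = len(values)
    distinct = sorted(set(values))
    rows = []
    for i, v in enumerate(values):
        rank = bisect_left(values, v) + 1  # rows strictly smaller, plus one
        rows.append({
            "row_number": i + 1,
            "rank": rank,                              # ties share a rank, gaps follow
            "dense_rank": distinct.index(v) + 1,       # ties share a rank, no gaps
            "percent_rank": (rank - 1) / (n - 1) if n > 1 else 0.0,
            "cume_dist": bisect_right(values, v) / n,  # rows <= v, divided by n
        })
    return rows


def ntile_buckets(row_count, n):
    """Assign bucket numbers 1..n; the first buckets absorb the remainder."""
    base, extra = divmod(row_count, n)
    buckets = []
    for b in range(1, n + 1):
        buckets.extend([b] * (base + (1 if b <= extra else 0)))
    return buckets


# Partition d0 of device_flow has flow values 1, 3, 3, 5.
for row in rank_functions([1, 3, 3, 5]):
    print(row)
print(ntile_buckets(4, 2))
print(ntile_buckets(5, 3))
```

The tie on flow 3 is what separates the functions: `rank` skips to 4 afterwards, `dense_rank` continues with 3, and `row_number` ignores the tie entirely.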
- -### 3.1 Aggregate Function - -All built-in aggregate functions such as `sum()`, `avg()`, `min()`, `max()` can be used as Window Functions. - -> Note: Unlike GROUP BY, each row has a corresponding output in the Window Function - -Example: - -```SQL -IoTDB> SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) as sum FROM device_flow; -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -### 3.2 Value Function - -1. `first_value` - -* Function name: `first_value(value) [IGNORE NULLS]` -* Definition: Return the first value of the frame; if IGNORE NULLS is specified, skip leading NULLs; -* Example: - -```SQL -IoTDB> SELECT *, first_value(flow) OVER w as first_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+-----------+ -| time|device|flow|first_value| -+-----------------------------+------+----+-----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----------+ -``` - -2. 
`last_value` - -* Function name: `last_value(value) [IGNORE NULLS]` -* Definition: Return the last value of the frame; if IGNORE NULLS is specified, skip trailing NULLs; -* Example: - -```SQL -IoTDB> SELECT *, last_value(flow) OVER w as last_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|last_value| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 5| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -3. `nth_value` - -* Function name: `nth_value(value, n) [IGNORE NULLS]` -* Definition: Return the nth element of the frame (note that n starts from 1); if IGNORE NULLS is specified, skip NULLs; -* Example: - -```SQL -IoTDB> SELECT *, nth_value(flow, 2) OVER w as nth_values FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|nth_values| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -4. 
`lead` - -* Function name: `lead(value[, offset[, default]]) [IGNORE NULLS]` -* Definition: Return the value located `offset` rows after the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such row exists (it would fall outside the partition), return `default`; `offset` defaults to 1 and `default` defaults to NULL. -* The lead function requires an ORDER BY window clause -* Example: - -```SQL -IoTDB> SELECT *, lead(flow) OVER w as lead FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time); -+-----------------------------+------+----+----+ -| time|device|flow|lead| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4|null| -|1970-01-01T08:00:00.000+08:00| d0| 3| 5| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 1| -|1970-01-01T08:00:03.000+08:00| d0| 1|null| -+-----------------------------+------+----+----+ -``` - -5. `lag` - -* Function name: `lag(value[, offset[, default]]) [IGNORE NULLS]` -* Definition: Return the value located `offset` rows before the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such row exists (it would fall outside the partition), return `default`; `offset` defaults to 1 and `default` defaults to NULL. -* The lag function requires an ORDER BY window clause -* Example: - -```SQL -IoTDB> SELECT *, lag(flow) OVER w as lag FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time); -+-----------------------------+------+----+----+ -| time|device|flow| lag| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2|null| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3|null| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 5| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -+-----------------------------+------+----+----+ -``` - -### 3.3 Rank Function - -1.
rank - -* Function name: `rank()` -* Definition: Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there may be gaps between sequence numbers; -* Example: - -```SQL -IoTDB> SELECT *, rank() OVER w as rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -2. dense_rank - -* Function name: `dense_rank()` -* Definition: Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there are no gaps between sequence numbers. -* Example: - -```SQL -IoTDB> SELECT *, dense_rank() OVER w as dense_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|dense_rank| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+----------+ -``` - -3. 
row_number - -* Function name: `row_number()` -* Definition: Return the row number of the current row in the entire partition; note that the row number starts from 1; -* Example: - -```SQL -IoTDB> SELECT *, row_number() OVER w as row_number FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|row_number| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----------+ -``` - -4. percent_rank - -* Function name: `percent_rank()` -* Definition: Return the sequence number of the current row's value in the entire partition as a percentage; i.e., **(rank() - 1) / (n - 1)**, where n is the number of rows in the entire partition; -* Example: - -```SQL -IoTDB> SELECT *, percent_rank() OVER w as percent_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+------------------+ -| time|device|flow| percent_rank| -+-----------------------------+------+----+------------------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.0| -|1970-01-01T08:00:00.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:02.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+------------------+ -``` - -5. cume_dist - -* Function name: `cume_dist` -* Definition: Return the sequence number of the current row's value in the entire partition as a percentage; i.e., **(number of rows less than or equal to it) / n**. 
-* Example: - -```SQL -IoTDB> SELECT *, cume_dist() OVER w as cume_dist FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+---------+ -| time|device|flow|cume_dist| -+-----------------------------+------+----+---------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.5| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.25| -|1970-01-01T08:00:00.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:02.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+---------+ -``` - -6. ntile - -* Function name: `ntile` -* Definition: Specify n to number each row from 1 to n. - * If the number of rows in the entire partition is less than n, the number is the row index; - * If the number of rows in the entire partition is greater than n: - * If the number of rows is divisible by n, it is perfect. For example, if the number of rows is 4 and n is 2, the numbers are 1, 1, 2, 2; - * If the number of rows is not divisible by n, distribute to the first few groups. For example, if the number of rows is 5 and n is 3, the numbers are 1, 1, 2, 2, 3; -* Example: - -```SQL -IoTDB> SELECT *, ntile(2) OVER w as ntile FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+-----+ -| time|device|flow|ntile| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -+-----------------------------+------+----+-----+ -``` - -## 4. Scenario Examples - -1. 
Multi-device diff function - -For each row of each device, calculate the difference from the previous row: - -```SQL -SELECT - *, - measurement - lag(measurement) OVER (PARTITION BY device ORDER BY time) -FROM data -WHERE timeCondition; -``` - -For each row of each device, calculate the difference from the next row: - -```SQL -SELECT - *, - measurement - lead(measurement) OVER (PARTITION BY device ORDER BY time) -FROM data -WHERE timeCondition; -``` - -For each row of a single device, calculate the difference from the previous row (same for the next row): - -```SQL -SELECT - *, - measurement - lag(measurement) OVER (ORDER BY time) -FROM data -WHERE device='d1' AND timeCondition; -``` - -2. Multi-device TOP_K/BOTTOM_K - -Use rank to number the rows of each device, then filter on that number in the outer query. - -(Note: Window functions are evaluated after the HAVING clause, so a subquery is needed here) - -```SQL -SELECT * -FROM( - SELECT - *, - rank() OVER (PARTITION BY device ORDER BY time DESC) AS rank - FROM data - WHERE timeCondition -) -WHERE rank <= 3; -``` - -In addition to sorting by time, you can also sort by the value of the measurement point: - -```SQL -SELECT * -FROM( - SELECT - *, - rank() OVER (PARTITION BY device ORDER BY measurement DESC) AS rank - FROM data - WHERE timeCondition -) -WHERE rank <= 3; -``` - -3. 
Multi-device CHANGE_POINTS - -This SQL is used to remove consecutive identical values in the input sequence, which can be achieved with lead + subquery: - -```SQL -SELECT - time, - device, - measurement -FROM( - SELECT - time, - device, - measurement, - LEAD(measurement) OVER (PARTITION BY device ORDER BY time) AS next - FROM data - WHERE timeCondition -) -WHERE measurement != next OR next IS NULL; -``` diff --git a/src/UserGuide/Master/Tree/AI-capability/AINode_Upgrade_timecho.md b/src/UserGuide/Master/Tree/AI-capability/AINode_Upgrade_timecho.md deleted file mode 100644 index 54f4d63b6..000000000 --- a/src/UserGuide/Master/Tree/AI-capability/AINode_Upgrade_timecho.md +++ /dev/null @@ -1,663 +0,0 @@ - - -# AINode - -AINode is a native IoTDB node that supports the registration, management, and invocation of time series related models, with built-in industry-leading self-developed time series large models such as the Tsinghua University's Timer series. It can be invoked through standard SQL statements to achieve millisecond-level real-time inference on time series data, supporting applications such as time series trend prediction, missing value imputation, and anomaly detection. - -The system architecture is shown in the following diagram: - -![](/img/AINode-0-en.png) - -The responsibilities of the three nodes are as follows: - -* **ConfigNode**: Responsible for distributed node management and load balancing. -* **DataNode**: Responsible for receiving and parsing user SQL requests; responsible for storing time series data; responsible for data preprocessing calculations. -* **AINode**: Responsible for managing and using time series models. - -## 1. Advantages - -Compared to building machine learning services separately, it has the following advantages: - -* **Simple and easy to use**: No need to use Python or Java programming, SQL statements can be used to complete the entire process of machine learning model management and inference. 
For example, creating a model can use the CREATE MODEL statement, and using a model for inference can use the CALL INFERENCE (...) statement, etc., which is simpler and more convenient to use. -* **Avoid data migration**: Using IoTDB native machine learning can directly apply time series data stored in IoTDB to machine learning model inference, without moving data to a separate machine learning service platform, thus accelerating data processing, improving security, and reducing costs. - -![](/img/AInode1.png) - -* **Built-in advanced algorithms**: Supports industry-leading machine learning analysis algorithms, covering typical time series analysis tasks, empowering time series databases with native data analysis capabilities. Such as: - * **Time Series Forecasting**: Learn change patterns from past time series to output the most likely prediction of future sequences based on given past observations. - * **Anomaly Detection for Time Series**: Detect and identify anomalies in given time series data to help discover abnormal behavior in time series. - -## 2. Basic Concepts - -* **Model**: Machine learning model, which takes time series data as input and outputs the results or decisions of the analysis task. The model is the basic management unit of AINode, supporting the addition (registration), deletion, query, modification (fine-tuning), and use (inference) of models. -* **Create**: Load external designed or trained model files or algorithms into AINode, managed and used by IoTDB. -* **Inference**: Use the created model to complete the time series analysis task applicable to the model on the specified time series data. -* **Built-in**: AINode comes with common time series analysis scenario (e.g., prediction and anomaly detection) machine learning algorithms or self-developed models. - -![](/img/AInode2.png) - -## 3. 
Installation and Deployment - -For AINode deployment, see [AINode Deployment](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md). - -## 4. Usage Guide - -TimechoDB-AINode supports three major functions: model inference, model fine-tuning, and model management (registration, viewing, deletion, loading, unloading, etc.). The following sections explain them in detail. - -### 4.1 Model Inference - -The SQL syntax is as follows: - -```SQL -call inference(<model_id>,inputSql,(<parameterName>=<parameterValue>)*) -``` - -After a model has been registered (built-in models do not require registration), it can be used for inference by invoking the inference function with the call keyword. The parameters are as follows: - -* **model\_id**: Corresponds to an already registered model -* **sql**: SQL query statement, the result of the query is used as the input for model inference. The dimensions of the rows and columns in the query result need to match the size specified in the specific model config. (It is not recommended to use the `SELECT *` clause in this sql, because in IoTDB, `*` does not sort columns, so the column order is undefined. It is recommended to use `SELECT ot` to ensure that the column order matches the expected input of the model.) -* **parameterName/parameterValue**: currently supported: - - | Parameter Name | Parameter Type | Parameter Description | Default Value | - | ---------------- | -------------- | ----------------------- | -------------- | - | **generateTime** | boolean | Whether to include a timestamp column in the result | false | - | **outputLength** | int | Specifies the output length of the result | 96 | - -Notes: -1. Inference with a built-in time series large model requires the corresponding model weights on the local machine, located at `/TIMECHODB_AINODE_HOME/data/ainode/models/builtin/model_id/`. 
If the local machine does not have model weights, it will automatically pull from HuggingFace. Please ensure that the local machine can directly access HuggingFace. -2. In deep learning applications, it is common to use time-derived features (the time column in the data) as covariates and input them into the model together with the data to improve model performance. However, the time column is generally not included in the model's output results. To ensure universality, the model inference result only corresponds to the model's true output. If the model does not output a time column, the result will not contain it. - -**Example** - -Sample data [ETTh-tree](/img/ETTh-tree.csv) - -Below is an example of using the sundial model for inference. The input is 96 rows, and the output is 48 rows. We use SQL to perform the inference. - -```SQL -IoTDB> select OT from root.db.** -+-----------------------------+---------------+ -| Time|root.db.etth.OT| -+-----------------------------+---------------+ -|2016-07-01T00:00:00.000+08:00| 30.531| -|2016-07-01T01:00:00.000+08:00| 27.787| -|2016-07-01T02:00:00.000+08:00| 27.787| -|2016-07-01T03:00:00.000+08:00| 25.044| -|2016-07-01T04:00:00.000+08:00| 21.948| -| ...... | ...... | -|2016-07-04T19:00:00.000+08:00| 29.546| -|2016-07-04T20:00:00.000+08:00| 29.475| -|2016-07-04T21:00:00.000+08:00| 29.264| -|2016-07-04T22:00:00.000+08:00| 30.953| -|2016-07-04T23:00:00.000+08:00| 31.726| -+-----------------------------+---------------+ -Total line number = 96 - -IoTDB> call inference(sundial,"select OT from root.db.**", generateTime=True, outputLength=48) -+-----------------------------+------------------+ -| Time| output| -+-----------------------------+------------------+ -|2016-07-04T23:00:00.000+08:00|30.537494659423828| -|2016-07-04T23:59:22.500+08:00|29.619892120361328| -|2016-07-05T00:58:45.000+08:00|28.815832138061523| -|2016-07-05T01:58:07.500+08:00| 27.91131019592285| -|2016-07-05T02:57:30.000+08:00|26.893848419189453| -| ...... 
| ...... | -|2016-07-06T17:33:07.500+08:00| 24.40607261657715| -|2016-07-06T18:32:30.000+08:00| 25.00441551208496| -|2016-07-06T19:31:52.500+08:00|24.907312393188477| -|2016-07-06T20:31:15.000+08:00|25.156436920166016| -|2016-07-06T21:30:37.500+08:00|25.335433959960938| -+-----------------------------+------------------+ -Total line number = 48 -``` - -### 4.2 Model Fine-Tuning - -AINode supports model fine-tuning through SQL. - -**SQL Syntax** - -```SQL -createModel - | CREATE MODEL modelId=identifier (WITH HYPERPARAMETERS LR_BRACKET hparamPair (COMMA hparamPair)* RR_BRACKET)? FROM MODEL existingModelId=identifier ON DATASET LR_BRACKET trainingData RR_BRACKET - ; - -trainingData - : dataElement(COMMA dataElement)* - ; - -dataElement - : pathPatternElement (LR_BRACKET timeRange RR_BRACKET)? - ; - -pathPatternElement - : PATH path=prefixPath - ; -``` - -**Parameter Description** - -| Name | Description | -| ------ |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| modelId | The unique identifier of the fine-tuned model | -| hparamPair | Key-value pairs of hyperparameters used for fine-tuning, currently supported:
`train_epochs`: int type, number of fine-tuning epochs
`iter_per_epoch`: int type, number of iterations per epoch
`learning_rate`: double type, learning rate | -| existingModelId | The base model used for fine-tuning | -| trainingData | The dataset used for fine-tuning | - -**Example** - -1. Select the data of the measurement point root.db.etth.ot within a specified time range as the fine-tuning dataset, and create the model sundialv2 based on sundial. - -```SQL -IoTDB> CREATE MODEL sundialv2 FROM MODEL sundial ON DATASET (PATH root.db.etth.OT([1467302400000, 1467644400000))) -Msg: The statement is executed successfully. -IoTDB> show models -+---------------------+---------+-----------+---------+ -| ModelId|ModelType| Category| State| -+---------------------+---------+-----------+---------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -| sundialv2| sundial| fine_tuned| training| -+---------------------+---------+-----------+---------+ -``` - -2. Fine-tuning tasks are started asynchronously in the background, and logs can be seen in the AINode process; after fine-tuning is completed, query and use the new model. 
- -```SQL -IoTDB> show models -+---------------------+---------+-----------+---------+ -| ModelId|ModelType| Category| State| -+---------------------+---------+-----------+---------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -| sundialv2| sundial| fine_tuned| active| -+---------------------+---------+-----------+---------+ -``` - -### 4.3 Register Custom Models - -**Transformers models that meet the following requirements can be registered to AINode:** - -1. AINode currently uses transformers version v4.56.2, so when building the model, it is necessary to **avoid inheriting low-version (<4.50) interfaces**; -2. The model needs to inherit a type of AINode inference task pipeline (currently supports the forecasting pipeline): - * iotdb-core/ainode/iotdb/ainode/core/inference/pipeline/basic\_pipeline.py - - **Before V2.0.9.3** - ```Python - class BasicPipeline(ABC): - def __init__(self, model_id, **model_kwargs): - self.model_info = model_info - self.device = model_kwargs.get("device", "cpu") - self.model = load_model(model_info, device_map=self.device, **model_kwargs) - - @abstractmethod - def preprocess(self, inputs, **infer_kwargs): - """ - Preprocess the input data before the inference task starts, including shape validation and numerical conversion. - """ - pass - - @abstractmethod - def postprocess(self, output, **infer_kwargs): - """ - Postprocess the output results after the inference task is completed. 
- """ - pass - - - class ForecastPipeline(BasicPipeline): - def __init__(self, model_info, **model_kwargs): - super().__init__(model_info, model_kwargs=model_kwargs) - - def preprocess(self, inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], **infer_kwargs): - """ - Preprocess the input data before passing it to the model for inference, validating the shape and type of the input data. - - Args: - inputs (list[dict]): - Input data, a list of dictionaries, each dictionary contains: - - 'targets': Tensor with shape (input_length,) or (target_count, input_length). - - 'past_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - 'future_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - infer_kwargs (dict, optional): Additional keyword arguments for inference, such as: - - `output_length`(int): Used to validate the validity of 'future_covariates' if provided. - - Raises: - ValueError: If the input format is invalid (e.g., missing keys, invalid tensor shapes). - - Returns: - Preprocessed and validated input data that can be directly used for model inference. - """ - pass - - def forecast(self, inputs, **infer_kwargs): - """ - Perform forecasting on the given inputs. - - Parameters: - inputs: Input data for forecasting. The type and structure depend on the specific model implementation. - **infer_kwargs: Additional inference parameters, e.g.: - - `output_length`(int): The number of time points the model should generate. - - Returns: - Forecast output, the specific form depends on the specific model implementation. - """ - pass - - def postprocess(self, outputs: list[torch.Tensor], **infer_kwargs) -> list[torch.Tensor]: - """ - Postprocess the model outputs after inference, validating the shape of the output data and ensuring it meets the expected dimensions. - - Args: - outputs: - Model outputs, a list of 2D tensors, each tensor with shape `[target_count, output_length]`. 
- - Raises: - InferenceModelInternalException: If the output tensor shape is invalid (e.g., incorrect dimensions). - ValueError: If the output format is incorrect. - - Returns: - list[torch.Tensor]: - Postprocessed outputs, which will be a list of 2D tensors. - """ - pass - ``` - - **From V2.0.9.3 onwards** - ```Python - class BasicPipeline(ABC): - def __init__(self, model_id, **model_kwargs): - self.model_info = model_info - self.device = model_kwargs.get("device", "cpu") - self.model = load_model(model_info, device_map=self.device, **model_kwargs) - - @abstractmethod - def preprocess(self, inputs, **infer_kwargs): - """ - Preprocess the input data before the inference task starts, including shape validation and numerical conversion. - """ - pass - - @abstractmethod - def postprocess(self, output, **infer_kwargs): - """ - Postprocess the output results after the inference task is completed. - """ - pass - - - class ForecastPipeline(BasicPipeline): - def __init__(self, model_info, **model_kwargs): - super().__init__(model_info, model_kwargs=model_kwargs) - - def _preprocess( - self, - inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], - **infer_kwargs, - ): - """ - Preprocess the input data before passing it to the model for inference, validating the shape and type of the input data. - - Args: - inputs (list[dict[str, dict[str, torch.Tensor] | torch.Tensor]]): - Input data, a list of dictionaries, each dictionary contains: - - 'targets': Tensor with shape (input_length,) or (target_count, input_length). - - 'past_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - 'future_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - infer_kwargs (dict, optional): Additional keyword arguments for inference, such as: - - `output_length`(int): Used to validate the validity of 'future_covariates' if provided. 
- - Raises: - ValueError: If the input format is invalid (e.g., missing keys, invalid tensor shapes). - - Returns: - Preprocessed and validated input data that can be directly used for model inference. - """ - pass - - def forecast(self, inputs, **infer_kwargs): - """ - Perform forecasting on the given inputs. - - Parameters: - inputs: Input data for forecasting. The type and structure depend on the specific model implementation. - **infer_kwargs: Additional inference parameters, e.g.: - - `output_length`(int): The number of time points the model should generate. - - Returns: - Forecast output, the specific form depends on the specific model implementation. - """ - pass - - def _postprocess(self, outputs, **infer_kwargs) -> list[torch.Tensor]: - """ - Postprocess the model outputs after inference, validating the shape of the output data and ensuring it meets the expected dimensions. - - Args: - outputs: - Model outputs, a list of 2D tensors, each tensor with shape `[target_count, output_length]`. - - Raises: - InferenceModelInternalException: If the output tensor shape is invalid (e.g., incorrect dimensions). - ValueError: If the output format is incorrect. - - Returns: - list[torch.Tensor]: - Postprocessed outputs, which will be a list of 2D tensors. - """ - pass - ``` - -3. 
Modify the model configuration file `config.json` to ensure it contains the following fields: - - **Before V2.0.9.3** - ```JSON - { - "auto_map": { - "AutoConfig": "config.Chronos2CoreConfig", // Specify the model Config class - "AutoModelForCausalLM": "model.Chronos2Model" // Specify the model class - }, - "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // Specify the inference pipeline for the model - "model_type": "custom_t5", // Specify the model type - } - ``` - * The model Config class and model class **must** be specified via `auto_map`; - * The inference pipeline class **must** be inherited and specified; - * For built-in and user-defined models managed by AINode, `model_type` also serves as a unique non-duplicable identifier. That is, the model type to be registered must not duplicate any existing model types; models created via fine-tuning will inherit the model type of the original model. - - **From V2.0.9.3 onwards** - > The `model_type` parameter is **not required** - ```JSON - { - "auto_map": { - "AutoConfig": "config.Chronos2CoreConfig", // Specify the model Config class - "AutoModelForCausalLM": "model.Chronos2Model" // Specify the model class - }, - "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // Specify the inference pipeline for the model - } - ``` - * The model Config class and model class **must** be specified via `auto_map`; - * The inference pipeline class **must** be inherited and specified; - -4. Ensure that the model directory to be registered contains the following files, and the model configuration file name and weight file name are not customizable: - * Model configuration file: config.json; - * Model weight file: model.safetensors; - * Model code: other .py files. 
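
The checklist in steps 3 and 4 can be verified before running the registration SQL. The following is a minimal sketch and not part of AINode: the helper name `check_model_dir` and its `require_model_type` flag are hypothetical, and it assumes `config.json` is plain JSON. Python's `json` module rejects comments and trailing commas, so the `//` annotations shown in the samples above would have to be removed first.

```python
import json
import tempfile
from pathlib import Path

# File names that the guide says cannot be customized.
REQUIRED_FILES = ("config.json", "model.safetensors")
# Keys the guide requires inside "auto_map".
REQUIRED_AUTO_MAP_KEYS = ("AutoConfig", "AutoModelForCausalLM")


def check_model_dir(model_dir, require_model_type=False):
    """Return a list of problems found in model_dir (hypothetical helper).

    An empty list means no problem was detected by this sketch; it does not
    guarantee that AINode will accept the model.
    """
    root = Path(model_dir)
    problems = []

    for name in REQUIRED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")

    # Model code is expected in the remaining .py files.
    if not any(root.glob("*.py")):
        problems.append("no .py model code found")

    config_path = root / "config.json"
    if not config_path.is_file():
        return problems

    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        problems.append(f"config.json is not valid JSON: {exc}")
        return problems

    if not isinstance(config, dict):
        problems.append("config.json must contain a JSON object")
        return problems

    auto_map = config.get("auto_map")
    if not isinstance(auto_map, dict):
        auto_map = {}
    for key in REQUIRED_AUTO_MAP_KEYS:
        if not auto_map.get(key):
            problems.append(f"auto_map.{key} is not set")

    if not config.get("pipeline_cls"):
        problems.append("pipeline_cls is not set")

    # Only the pre-V2.0.9.3 layout requires model_type.
    if require_model_type and not config.get("model_type"):
        problems.append("model_type is not set")

    return problems


if __name__ == "__main__":
    # Build a throwaway directory that satisfies the checklist.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        config = {
            "auto_map": {
                "AutoConfig": "config.Chronos2CoreConfig",
                "AutoModelForCausalLM": "model.Chronos2Model",
            },
            "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline",
        }
        (root / "config.json").write_text(json.dumps(config), encoding="utf-8")
        (root / "model.safetensors").write_bytes(b"")
        (root / "model.py").write_text("# model code\n", encoding="utf-8")
        print(check_model_dir(root))  # prints []
```

Passing `require_model_type=True` mirrors the pre-V2.0.9.3 requirement that `model_type` be present in `config.json`.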
- -**The SQL syntax for registering a custom model is as follows:** - -```SQL -CREATE MODEL <model_id> USING URI <uri> -``` - -**Parameter Description:** - -* **model\_id**: The unique identifier of the custom model; it must not duplicate an existing model ID, and has the following constraints: - * Allowed characters: [0-9 a-z A-Z \_ ] (letters, numbers (not at the beginning), underscore (not at the beginning)) - * Length limit: 2-64 characters - * Case-sensitive -* **uri**: The local URI address containing the model code and weights. - -**Registration Example:** - -Upload a custom Transformers model from a local path. AINode will copy the folder to the user\_defined directory. - -```SQL -CREATE MODEL chronos2 USING URI 'file:///path/to/chronos2' -``` - -After the SQL is executed, registration proceeds asynchronously. The registration status can be checked with `SHOW MODELS` (see the View Models section). Once the model is registered, it can be used for inference through normal queries. - -### 4.4 View Models - -Registered models can be viewed using the following command. - -```SQL -SHOW MODELS -``` - -In addition to directly displaying all model information, you can specify `model_id` to view the information of a specific model. - -```SQL -SHOW MODELS <model_id> -- Only display specific model -``` - -The results of model display include the following: - -| **ModelId** | **ModelType** | **Category** | **State** | -| ------------------- | --------------------- | -------------------- | ----------------- | -| Model ID | Model Type | Model Category | Model State | - -The transitions of the State field are shown in the following state machine diagram: - -![](/img/ainode-upgrade-state-timecho-en.png) - -State machine flow explanation: - -1. After AINode starts, executing the `show models` command shows only **system built-in (BUILTIN)** models. -2. 
Users can import their own models, which are identified as **user-defined (USER_DEFINED)**; AINode will attempt to parse the model type (ModelType) from the model configuration file; if parsing fails, this field will display as empty. -3. Time series large models (built-in models) weight files are not packaged with AINode, AINode automatically downloads them when starting. - 1. During download, it is ACTIVATING, and after successful download, it becomes ACTIVE, failure becomes INACTIVE. -4. After users start a model fine-tuning task, the model state is TRAINING, and after successful training, it becomes ACTIVE, failure becomes FAILED. -5. If the fine-tuning task is successful, after fine-tuning, the model will automatically rename the best checkpoint (training file) based on the best metric and become the user-specified model\_id. - -**View Example** - -```SQL -IoTDB> show models -+---------------------+--------------+--------------+-------------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------+--------------+-------------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| custom| | user_defined| active| -| timer_xl| timer| builtin| activating| -| sundial| sundial| builtin| active| -| sundialx_1| sundial| fine_tuned| active| -| sundialx_4| sundial| fine_tuned| training| -| sundialx_5| sundial| fine_tuned| failed| -| chronos2| t5| builtin| inactive| -+---------------------+--------------+--------------+-------------+ -``` - -Built-in traditional time series models are introduced as follows: - -| Model Name | Core Concept | Applicable Scenario | Main Features | -|------------|--------------|---------------------|---------------| -| **ARIMA** (Autoregressive 
Integrated Moving Average) | Combines autoregression (AR), differencing (I), and moving average (MA), used for predicting stationary time series or data that can be made stationary through differencing. | Univariate time series prediction, such as stock prices, sales, economic indicators. | 1. Suitable for linear trends and weak seasonality data. 2. Need to select parameters (p,d,q). 3. Sensitive to missing values. | -| **Holt-Winters** (Three-parameter exponential smoothing) | Based on exponential smoothing, introduces three components: level, trend, and seasonality, suitable for data with trend and seasonality. | Time series with clear seasonality and trend, such as monthly sales, power demand. | 1. Can handle additive or multiplicative seasonality. 2. Gives higher weight to recent data. 3. Simple to implement. | -| **Exponential Smoothing** | Uses weighted average of historical data, with weights decreasing exponentially over time, emphasizing the importance of recent observations. | Time series without obvious seasonality but with trend, such as short-term demand prediction. | 1. Few parameters, simple calculation. 2. Suitable for stationary or slowly changing sequences. 3. Can be extended to double or triple exponential smoothing. | -| **Naive Forecaster** | Uses the observation value from the previous period as the prediction for the next period, the simplest baseline model. | As a comparison baseline for other models, or simple prediction when data has no obvious pattern. | 1. No training required. 2. Sensitive to sudden changes. 3. Seasonal naive variant can use the value from the same period of the previous season for prediction. | -| **STL Forecaster** (Seasonal-Trend Decomposition) | Based on STL decomposition of time series, predict trend, seasonal, and residual components separately and combine. | Time series with complex seasonality, trend, and non-linear patterns, such as climate data, traffic flow. | 1. Can handle non-fixed seasonality. 2. 
Robust to outliers. 3. After decomposition, other models can be combined to predict each component. | -| **Gaussian HMM** (Gaussian Hidden Markov Model) | Assumes that observed data is generated by hidden states, with each state's observed probability following a Gaussian distribution. | State sequence prediction or classification, such as speech recognition, financial state recognition. | 1. Suitable for modeling time series state. 2. Assumes observed values are independent given the state. 3. Need to specify the number of hidden states. | -| **GMM HMM** (Gaussian Mixture Hidden Markov Model) | An extension of Gaussian HMM, where the observed probability of each state is described by a Gaussian Mixture Model, capturing more complex observed distributions. | Scenarios requiring multi-modal observed distributions, such as complex action recognition, bio-signal analysis. | 1. More flexible than single Gaussian. 2. More parameters, higher computational complexity. 3. Need to train the number of GMM components. | -| **STRAY** (Singular Value-based Anomaly Detection) | Detects anomalies in high-dimensional data through Singular Value Decomposition (SVD), commonly used for time series anomaly detection. | High-dimensional time series anomaly detection, such as sensor networks, IT system monitoring. | 1. No distribution assumption required. 2. Can handle high-dimensional data. 3. Sensitive to global anomalies, may miss local anomalies. | - -Built-in time series large models are introduced as follows: - -| Model Name | Core Concept | Applicable Scenario | Main Features | -|------------|--------------|---------------------|---------------| -| **Timer-XL** | Time series large model supporting ultra-long context, enhanced generalization ability through large-scale industrial data pre-training. | Complex industrial prediction requiring extremely long historical data, such as energy, aerospace, transportation. | 1. 
Ultra-long context support, can handle tens of thousands of time points as input. 2. Multi-scenario coverage, supports non-stationary, multi-variable, and covariate prediction. 3. Pre-trained on trillions of high-quality industrial time series data. | -| **Timer-Sundial** | A generative foundational model using "Transformer + TimeFlow" architecture, focused on probabilistic prediction. | Zero-shot prediction scenarios requiring quantification of uncertainty, such as finance, supply chain, new energy power generation. | 1. Strong zero-shot generalization ability, supports point prediction and probabilistic prediction. 2. Flexible analysis of any statistical characteristics of the prediction distribution. 3. Innovative generative architecture, achieving efficient non-deterministic sample generation. | -| **Chronos-2** | A universal time series foundational model based on discrete tokenization paradigm, transforming prediction into language modeling tasks. | Fast zero-shot univariate prediction, and scenarios that can leverage covariates (e.g., promotions, weather) to improve results. | 1. Strong zero-shot probabilistic prediction ability. 2. Supports covariate unified modeling, but has strict input requirements: a. The set of names of future covariates must be a subset of the set of names of historical covariates; b. The length of each historical covariate must equal the length of the target variable; c. The length of each future covariate must equal the prediction length; 3. Uses an efficient encoder-based structure, balancing performance and inference speed. | - -### 4.5 Delete Models - -For registered models, users can delete them through SQL. AINode will delete the corresponding model folder in the user\_defined directory. The SQL syntax is as follows: - -```SQL -DROP MODEL -``` - -The model id that has been successfully registered must be specified to delete the corresponding model. 
Because deletion involves cleaning up model data, the operation does not complete immediately: the model status first becomes DROPPING, and a model in this state cannot be used for inference. Note that built-in models cannot be deleted. - -### 4.6 Load/Unload Models - -To adapt to different scenarios, AINode provides the following two model loading strategies: - -* **On-demand loading**: Load the model temporarily when inference is performed, and release its resources after completion. Suitable for testing or low-load scenarios. -* **Persistent loading**: Keep the model loaded in memory (CPU) or GPU memory to support high-concurrency inference. Users only need to specify the model to load or unload through SQL, and AINode automatically manages the number of instances. The status of persistently loaded models can be viewed at any time. - -The following steps describe how to load and unload models: - -1. Configuration parameters - -The following configuration items control persistent loading. - -```Properties -# The proportion of device memory/GPU memory available for AINode inference -# Datatype: Float -ain_inference_memory_usage_ratio=0.4 - -# The memory reserved for each loaded model instance, as a multiple of the model size (model size * this value) -# Datatype: Float -ain_inference_extra_memory_ratio=1.2 -``` - -2. Show available devices - -All available device IDs can be viewed through the following SQL command. - -```SQL -SHOW AI_DEVICES -``` - -Example - -```SQL -IoTDB> show ai_devices -+-------------+ -| DeviceId| -+-------------+ -| cpu| -| 0| -| 1| -+-------------+ -``` - -3. Load model - -Models can be loaded through the following SQL command, and the system will **automatically balance** the number of model instances based on hardware resource usage.
- -```SQL -LOAD MODEL <existing_model_id> TO DEVICES <device_id>(, <device_id>)* -``` - -Parameter requirements - -* **existing\_model\_id:** The id of the model to load. The current version only supports timer\_xl and sundial. -* **device\_id:** The location where the model is loaded. - * **cpu:** Load into the memory of the AINode server. - * **gpu\_id:** Load onto the corresponding GPU of the AINode server, e.g., "0, 1" means load onto the two GPUs numbered 0 and 1. - -Example - -```SQL -LOAD MODEL sundial TO DEVICES 'cpu,0,1' -``` - -4. Unload model - -Specified models can be unloaded through the following SQL command, and the system will **reallocate** the freed resources to other models. - -```SQL -UNLOAD MODEL <existing_model_id> FROM DEVICES <device_id>(, <device_id>)* -``` - -Parameter requirements - -* **existing\_model\_id:** The id of the model to unload. The current version only supports timer\_xl and sundial. -* **device\_id:** The location from which the model is unloaded. - * **cpu:** Attempt to unload the specified model from the memory of the AINode server. - * **gpu\_id:** Attempt to unload the specified model from the corresponding GPU of the AINode server, e.g., "0, 1" means attempt to unload from the two GPUs numbered 0 and 1. - -Example - -```SQL -UNLOAD MODEL sundial FROM DEVICES 'cpu,0,1' -``` - -5. Show loaded models - -Models that have been manually loaded can be viewed through the following SQL command; the output can be restricted to specific devices via `device_id`.
- -```SQL -SHOW LOADED MODELS -SHOW LOADED MODELS <device_id>(, <device_id>)* # View models in specified devices -``` - -Example: the sundial model is loaded in memory (cpu), on GPU 0, and on GPU 1 - -```SQL -IoTDB> show loaded models -+-------------+--------------+------------------+ -| DeviceId| ModelId| Count(instances)| -+-------------+--------------+------------------+ -| cpu| sundial| 4| -| 0| sundial| 6| -| 1| sundial| 6| -+-------------+--------------+------------------+ -``` - -Explanation: -* DeviceId: Device ID -* ModelId: Loaded model ID -* Count(instances): Number of model instances on each device (automatically assigned by the system) - -### 4.7 Introduction to Time Series Large Models - -AINode currently supports multiple time series large models. For an introduction to these models and their deployment and usage, please refer to [Time Series Large Models](../AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md) - -## 5. Permission Management - -AINode features rely on IoTDB's own authentication for permission management. Users can use the model management features only if they have the USE_MODEL permission. To use the inference feature, users need permission to read the source series referenced by the SQL that supplies the model input.
- -| Permission Name | Permission Scope | Administrator User (Default ROOT) | Ordinary User | Path-related | -| --------------- | ----------------- | ------------------------------- | -------------- | ------------ | -| USE_MODEL | create model / show models / drop model | √ | √ | x | -| READ_DATA | call inference | √ | √ | √ | \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/AI-capability/AINode_timecho.md b/src/UserGuide/Master/Tree/AI-capability/AINode_timecho.md deleted file mode 100644 index 7ed02f520..000000000 --- a/src/UserGuide/Master/Tree/AI-capability/AINode_timecho.md +++ /dev/null @@ -1,693 +0,0 @@ - - -# AINode - -AINode is a native IoTDB node that supports the registration, management, and invocation of time-series-related models. It comes with built-in industry-leading self-developed time-series large models, such as the Timer series developed by Tsinghua University. These models can be invoked through standard SQL statements, enabling real-time inference of time series data at the millisecond level, and supporting application scenarios such as trend forecasting, missing value imputation, and anomaly detection for time series data. - -> Available since V2.0.5.1 - -The system architecture is shown below: -::: center - -::: - -The responsibilities of the three nodes are as follows: - -- **ConfigNode:** - - Manages distributed nodes and handles load balancing across the system. -- **DataNode:** - - Receives and parses user SQL queries. - - Stores time-series data. - - Performs preprocessing computations on raw data. -- **AINode:** - - Manages and utilizes time-series models (including training/inference). - - Supports deep learning and machine learning workflows. - -## 1. 
Advantageous features - -Compared with building a machine learning service alone, it has the following advantages: - -- **Simple and easy to use**: no need to use Python or Java programming, the complete process of machine learning model management and inference can be completed using SQL statements. Creating a model can be done using the CREATE MODEL statement, and using a model for inference can be done using the CALL INFERENCE (...) statement, making it simpler and more convenient to use. - -- **Avoid Data Migration**: With IoTDB native machine learning, data stored in IoTDB can be directly applied to the inference of machine learning models without having to move the data to a separate machine learning service platform, which accelerates data processing, improves security, and reduces costs. - -![](/img/AInode1.png) - -- **Built-in Advanced Algorithms**: supports industry-leading machine learning analytics algorithms covering typical timing analysis tasks, empowering the timing database with native data analysis capabilities. Such as: - - **Time Series Forecasting**: learns patterns of change from past time series; thus outputs the most likely prediction of future series based on observations at a given past time. - - **Anomaly Detection for Time Series**: detects and identifies outliers in a given time series data, helping to discover anomalous behaviour in the time series. - - **Annotation for Time Series (Time Series Annotation)**: Adds additional information or markers, such as event occurrence, outliers, trend changes, etc., to each data point or specific time period to better understand and analyse the data. - - - -## 2. Basic Concepts - -- **Model**: A machine learning model takes time series data as input and outputs analysis task results or decisions. Models are the basic management units of AINode, supporting model operations such as creation (registration), deletion, query, modification (fine-tuning), and usage (inference). 
-- **Create**: Load externally designed or trained model files or algorithms into AINode for unified management and use by IoTDB. -- **Inference**: Use a created model to complete the time series analysis task it supports on specified time series data. -- **Built-in Capabilities**: AINode comes with machine learning algorithms and self-developed models for common time series analysis scenarios (e.g., forecasting and anomaly detection). - -::: center - -::: - -## 3. Installation and Deployment - -For AINode deployment, see [AINode Deployment](../Deployment-and-Maintenance/AINode_Deployment_apache.md). - -## 4. Usage Guidelines - -AINode provides model creation and deletion functions for time series models. Built-in models do not require creation and can be used directly. - - -### 4.1 Registering Models - - -Trained deep learning models can be registered by specifying their input and output vector dimensions for inference. - -Models that meet the following criteria can be registered with AINode: - -1. AINode currently supports models trained with PyTorch 2.4.0. Features introduced after version 2.4.0 should be avoided. -2. AINode supports models stored using PyTorch JIT (`model.pt`), which must include both the model structure and weights. -3. The model input sequence can include single or multiple columns. If multi-column, it must match the model capabilities and configuration file. -4. Model configuration parameters must be clearly defined in the `config.yaml` file. When using the model, the input and output dimensions defined in `config.yaml` must be strictly followed. Mismatches with the configuration file will cause errors. - -The SQL syntax for model registration is defined as follows: - -```SQL -create model <model_id> using uri <uri> -``` - -Detailed meanings of SQL parameters: - -- **model_id**: The globally unique identifier of the model; it cannot be repeated.
Model names have the following constraints: - - Allowed characters: [0-9 a-z A-Z _] (letters, digits (not at the beginning), underscores (not at the beginning)) - - Length: 2-64 characters - - Case-sensitive -- **uri**: The resource path of the model registration files, which should include the **model structure and weight file `model.pt` and the model configuration file `config.yaml`** - - - **Model structure and weight file**: The weight file generated after model training, currently supporting `.pt` files from PyTorch training. - - - **Model configuration file**: Parameters related to the model structure provided during registration, which must include input and output dimensions for inference: - - | **Parameter Name** | **Description** | **Example** | - | ------------ | ---------------------------- | -------- | - | input_shape | Rows and columns of model input | [96,2] | - | output_shape | Rows and columns of model output | [48,2] | - - In addition to inference, data types of input and output can also be specified: - - | **Parameter Name** | **Description** | **Example** | - | ------------------ | ------------------------- | ---------------------- | - | input_type | Data type of model input | ['float32', 'float32'] | - | output_type | Data type of model output | ['float32', 'float32'] | - - Additional notes can be specified for model management display: - - | **Parameter Name** | **Description** | **Example** | - | ------------------ | --------------------------------------------- | -------------------------------------------- | - | attributes | Optional notes set by users for model display | 'model_type': 'dlinear', 'kernel_size': '25' | - -In addition to registering local model files, remote resource paths can be specified via URIs for registration, using open-source model repositories (e.g., HuggingFace). 
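
The naming and configuration rules above can be checked before a registration attempt. The following is a minimal Python sketch, not part of AINode: the helper names `is_valid_model_id` and `check_config` are hypothetical, and the configuration is assumed to be an already-parsed dictionary that mirrors the `configs` section of `config.yaml`.

```python
import re

# Letters, digits and underscores; 2-64 characters;
# the first character may not be a digit or an underscore.
MODEL_ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{1,63}")


def is_valid_model_id(model_id: str) -> bool:
    """Return True if model_id satisfies the documented naming constraints."""
    return MODEL_ID_PATTERN.fullmatch(model_id) is not None


def check_config(configs: dict) -> list:
    """Return a list of problems found in a parsed `configs` mapping."""
    problems = []
    # input_shape and output_shape are required: [rows, columns].
    for key in ("input_shape", "output_shape"):
        shape = configs.get(key)
        if not (isinstance(shape, list) and len(shape) == 2):
            problems.append(f"{key} is required and must be [rows, columns]")
    # The type lists are optional, but need one entry per column when present.
    for shape_key, type_key in (("input_shape", "input_type"),
                                ("output_shape", "output_type")):
        shape, types = configs.get(shape_key), configs.get(type_key)
        if types is not None and isinstance(shape, list) and len(shape) == 2:
            if len(types) != shape[1]:
                problems.append(f"{type_key} must list {shape[1]} types")
    return problems
```

A non-empty list from `check_config` points at the mismatch that would otherwise only surface as an error after registration.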
- - -#### Example - -The [example folder](https://github.com/apache/iotdb/tree/master/integration-test/src/test/resources/ainode-example) contains model.pt (trained model) and config.yaml with the following content: - -```YAML -configs: - # Required - input_shape: [96, 2] # Model accepts 96 rows x 2 columns of data - output_shape: [48, 2] # Model outputs 48 rows x 2 columns of data - - # Optional (default to all float32, column count matches shape) - input_type: ["int64", "int64"] # Data types of inputs, must match input column count - output_type: ["text", "int64"] # Data types of outputs, must match output column count - -attributes: # Optional user-defined notes - 'model_type': 'dlinear' - 'kernel_size': '25' -``` - -Register the model by specifying this folder as the loading path: - -```SQL -IoTDB> create model dlinear_example using uri "file://./example" -``` - -After SQL execution, registration proceeds asynchronously. The registration status can be checked via model display (see Model Display section). The registration success time mainly depends on the model file size. - -Once registered, the model can be invoked for inference through normal query syntax. - -### 4.2 Viewing Models - -Registered models can be queried using the `show models` command. The SQL definitions are: - -```SQL -show models - -show models -``` - -In addition to displaying all models, specifying a `model_id` shows details of a specific model. The display includes: - -| **ModelId** | **ModelType** | **Category** | **State** | -|-------------|---------------|----------------|-------------| -| Model ID | Model Type | Model Category | Model State | - -- Model State Transition Diagram - -![](/img/AINode-State-en.png) - -**Instructions:** - -1. Initialization: - - When AINode starts, show models only displays BUILT-IN models. -2. Custom Model Import: - - Users can import custom models (marked as USER-DEFINED). - - The system attempts to parse the ModelTypefrom the config file. 
- - If parsing fails, the field remains empty. -3. Foundation Model Weights: - - Time-series foundation model weights are not bundled with AINode. - - AINode automatically downloads them during startup. - - Download state: LOADING. -4. Download Outcomes: - - Success → State changes to ACTIVE. - - Failure → State changes to INACTIVE. -5. Fine-Tuning Process: - - When fine-tuning starts: State becomes TRAINING. - - Successful training → State transitions to ACTIVE. - - Training failure → State changes to FAILED. - -**Example** - -```SQL -IoTDB> show models -+---------------------+--------------------+--------------+---------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+--------------+---------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| custom| | USER-DEFINED| ACTIVE| -| timerxl| Timer-XL| BUILT-IN| LOADING| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| sundialx_1| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx_2| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx_4| Timer-Sundial| FINE-TUNED| TRAINING| -| sundialx_5| Timer-Sundial| FINE-TUNED| FAILED| -+---------------------+--------------------+--------------+---------+ -``` - -### 4.3 Deleting Models - -Registered models can be deleted via SQL, which removes all related files under AINode: - -```SQL -drop model -``` - -Specify the registered `model_id` to delete the model. Since deletion involves data cleanup, the operation is not immediate, and the model state becomes `DROPPING`, during which it cannot be used for inference. **Note:** Built-in models cannot be deleted. 
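
The state changes described in sections 4.2 and 4.3 can be summarized as a small lookup table. This Python sketch is illustrative only: the event names are made-up labels rather than AINode identifiers, it covers only the transitions spelled out in the text (the state diagram may define others), and it assumes that a drop request is accepted from any state and that only `ACTIVE` models serve inference.

```python
# Transitions stated in the text; event names are hypothetical labels.
TRANSITIONS = {
    ("LOADING", "download_succeeded"): "ACTIVE",
    ("LOADING", "download_failed"): "INACTIVE",
    ("TRAINING", "training_succeeded"): "ACTIVE",
    ("TRAINING", "training_failed"): "FAILED",
}


def next_state(state: str, event: str) -> str:
    """Return the state that follows `state` after `event`."""
    # Assumption: dropping is allowed from any state of a non-built-in model.
    if event == "drop_requested":
        return "DROPPING"
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no documented transition for {state!r} on {event!r}")


def usable_for_inference(state: str) -> bool:
    """Assumption: only ACTIVE models can be used for inference."""
    return state == "ACTIVE"
```

A client that polls `show models` could use `usable_for_inference` to decide when a newly registered or fine-tuned model is ready.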
- -### 4.4 Using Built-in Model Reasoning - -The SQL syntax is as follows: - - -```SQL -call inference(,inputSql,(=)*) - -window_function: - head(window_size) - tail(window_size) - count(window_size,sliding_step) -``` - -Built-in model inference does not require a registration process, the inference function can be used by calling the inference function through the call keyword, and its corresponding parameters are described as follows: - -- **built_in_model_name**: built-in model name -- **parameterName**: parameter name -- **parameterValue**: parameter value - -- **Note**: To use a built-in time series large model for inference, the corresponding model weights must be stored locally in the directory `/IOTDB_AINODE_HOME/data/ainode/models/weights/model_id/`. If the weights are not present locally, they will be automatically downloaded from HuggingFace. Ensure your environment has direct access to HuggingFace. - -#### Built-in Models and Parameter Descriptions - -The following machine learning models are currently built-in, please refer to the following links for detailed parameter descriptions. 
- -| Model | built_in_model_name | Task type | Parameter description | -| -------------------- | --------------------- | -------- |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Arima | _Arima | Forecast | [Arima Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.arima.ARIMA.html?highlight=Arima) | -| STLForecaster | _STLForecaster | Forecast | [STLForecaster Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.trend.STLForecaster.html#sktime.forecasting.trend.STLForecaster) | -| NaiveForecaster | _NaiveForecaster | Forecast | [NaiveForecaster Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.naive.NaiveForecaster.html#naiveforecaster) | -| ExponentialSmoothing | _ExponentialSmoothing | Forecast | [ExponentialSmoothing Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.exp_smoothing.ExponentialSmoothing.html) | -| GaussianHMM | _GaussianHMM | Annotation | [GaussianHMMParameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.detection.hmm_learn.gaussian.GaussianHMM.html) | -| GMMHMM | _GMMHMM | Annotation | [GMMHMM Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.detection.hmm_learn.gmm.GMMHMM.html) | -| Stray | _Stray | Anomaly detection | [Stray Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.detection.stray.STRAY.html) | - - -After completing the registration of the model, the inference function can be used by calling the inference function through the call keyword, and its corresponding parameters are described as follows: - -- **model_id**: corresponds to a registered model -- **sql**: sql query 
statement whose result is used as the input for model inference. The number of rows and columns in the query result must match the size specified in the model's config. (Using `SELECT *` here is not recommended: in IoTDB, `*` does not sort the columns, so their order is undefined. Use an explicit column list such as `SELECT s0,s1` to ensure that the column order matches what the model expects.) -- **window_function**: Window functions that can be used during inference. Three types are currently provided: - - **head(window_size)**: Takes the first window_size points of the data for model inference; this window can be used for data cropping. - ![](/img/AINode-call1.png) - - - **tail(window_size)**: Takes the last window_size points of the data for model inference; this window can be used for data cropping. - ![](/img/AINode-call2.png) - - - **count(window_size, sliding_step)**: A sliding window based on the number of points. The data in each window is passed through the model separately. In the example below, a window function with window_size 2 divides the input dataset into three windows, and inference is performed on each window to generate its own result. This window can be used for continuous inference. - ![](/img/AINode-call3.png) - -**Explanation 1**: Windows solve the problem of cropping rows when the number of rows returned by the sql query does not match the number of input rows the model requires. Note that if the number of columns does not match, or the number of rows is less than the model requires, inference cannot proceed and an error message is returned.
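
The three window functions can be pictured as plain list slicing. The sketch below is a Python illustration of the row selection described above, not AINode's implementation; in particular, dropping a trailing window that has fewer than `window_size` rows is an assumption.

```python
def head(rows: list, window_size: int) -> list:
    """Keep the first window_size rows."""
    if window_size <= 0 or len(rows) < window_size:
        raise ValueError("not enough rows for the requested window")
    return rows[:window_size]


def tail(rows: list, window_size: int) -> list:
    """Keep the last window_size rows."""
    if window_size <= 0 or len(rows) < window_size:
        raise ValueError("not enough rows for the requested window")
    return rows[len(rows) - window_size:]


def count(rows: list, window_size: int, sliding_step: int) -> list:
    """Split rows into windows of window_size rows, advancing by sliding_step."""
    if window_size <= 0 or sliding_step <= 0:
        raise ValueError("window_size and sliding_step must be positive")
    windows = []
    start = 0
    # Assumption: an incomplete trailing window is dropped.
    while start + window_size <= len(rows):
        windows.append(rows[start:start + window_size])
        start += sliding_step
    return windows
```

With 96 input rows, `count(rows, 24, 24)` yields four windows, so a model that emits one row per window returns four result rows.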
- -**Explanation 2**: In deep learning applications, timestamp-derived features (time columns in the data) are often used as covariates in generative tasks and are fed into the model together with the data to improve its results, but the time columns are generally not part of the model's output. To keep the implementation general, the inference results contain only the actual output of the model: if the model does not output a time column, the results do not include one. - - -#### Example - -The following example runs inference with a deep learning model through SQL, using the `dlinear` prediction model mentioned above, which has input `[96,2]` and output `[48,2]`. - -```Shell -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -......
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**", generateTime=True) -+-----------------------------+--------------------------------------------+-----------------------------+ -| Time| _result_0| _result_1| -+-----------------------------+--------------------------------------------+-----------------------------+ -|1990-04-06T00:00:00.000+08:00| 0.726302981376648| 1.6549958229064941| -|1990-04-08T00:00:00.000+08:00| 0.7354921698570251| 1.6482787370681763| -|1990-04-10T00:00:00.000+08:00| 0.7238251566886902| 1.6278168201446533| -...... -|1990-07-07T00:00:00.000+08:00| 0.7692174911499023| 1.654654049873352| -|1990-07-09T00:00:00.000+08:00| 0.7685555815696716| 1.6625318765640259| -|1990-07-11T00:00:00.000+08:00| 0.7856493592262268| 1.6508299350738525| -+-----------------------------+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### Example of using the tail/head window function - -When the amount of data is variable and you want to take the latest 96 rows of data for inference, you can use the corresponding window function tail. head function is used in a similar way, except that it takes the earliest 96 points. - -```Shell -IoTDB> select s1,s2 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1988-01-01T00:00:00.000+08:00| 0.7355| 1.211| -...... 
-|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... -|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 996 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**", generateTime=True, window=tail(96)) -+-----------------------------+--------------------------------------------+-----------------------------+ -| Time| _result_0| _result_1| -+-----------------------------+--------------------------------------------+-----------------------------+ -|1990-04-06T00:00:00.000+08:00| 0.726302981376648| 1.6549958229064941| -|1990-04-08T00:00:00.000+08:00| 0.7354921698570251| 1.6482787370681763| -|1990-04-10T00:00:00.000+08:00| 0.7238251566886902| 1.6278168201446533| -...... -|1990-07-07T00:00:00.000+08:00| 0.7692174911499023| 1.654654049873352| -|1990-07-09T00:00:00.000+08:00| 0.7685555815696716| 1.6625318765640259| -|1990-07-11T00:00:00.000+08:00| 0.7856493592262268| 1.6508299350738525| -+-----------------------------+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### Example of using the count window function - -This window is mainly used for computational tasks. 
When the model for a task can only handle a fixed number of rows at a time, but several sets of prediction results are wanted, this window function performs continuous inference over a sliding window of points. Suppose we have an anomaly detection model `anomaly_example(input: [24,2], output: [1,1])`, which generates a 0/1 label for every 24 rows of data. An example of its use is as follows: - -```Shell -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -......
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(anomaly_example,"select s0,s1 from root.**", generateTime=True, window=count(24,24)) -+-----------------------------+-------------------------+ -| Time| _result_0| -+-----------------------------+-------------------------+ -|1990-04-06T00:00:00.000+08:00| 0| -|1990-04-30T00:00:00.000+08:00| 1| -|1990-05-24T00:00:00.000+08:00| 1| -|1990-06-17T00:00:00.000+08:00| 0| -+-----------------------------+-------------------------+ -Total line number = 4 -``` - -In the result set, each row's label corresponds to the output of the anomaly detection model after inputting each group of 24 rows of data. - -### 4.5 Fine-tuning Built-in Models -> Only Timer-XL and Timer-Sundial support fine-tuning. - - -The SQL syntax is as follows: - - -```SQL -create model (with hyperparameters -(=(, =)*))? -from model -on dataset (PATH ([timeRange])?) -``` - -#### Examples - -1. Select the first 80% of data from the measurement point `root.db.etth.ot` as the fine-tuning dataset, and create the model `sundialv2` based on `sundial`. - -```SQL -IoTDB> CREATE MODEL sundialv2 FROM MODEL sundial ON DATASET (PATH root.db.etth.OT([1467302400000, 1517468400001))) -Msg: The statement is executed successfully. 
-IoTDB> show models -+---------------------+--------------------+----------+--------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+--------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| timer_xl| Timer-XL| BUILT-IN| ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED|TRAINING| -+---------------------+--------------------+----------+--------+ -``` - -2. The fine-tuning task starts asynchronously in the background, and logs can be viewed in the AINode process. After fine-tuning is complete, query and use the new model - -```SQL -IoTDB> show models -+---------------------+--------------------+----------+------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+------+ -| arima| Arima| BUILT-IN|ACTIVE| -| holtwinters| HoltWinters| BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN|ACTIVE| -| stray| Stray| BUILT-IN|ACTIVE| -| sundial| Timer-Sundial| BUILT-IN|ACTIVE| -| timer_xl| Timer-XL| BUILT-IN|ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED|ACTIVE| -+---------------------+--------------------+----------+------+ -``` - -### 4.6 TimeSeries Large Models Import Steps - -The deployment of AINode can be found in the document [AINode Deployment](../Deployment-and-Maintenance/AINode_Deployment_timecho.md) . - - -## 5. 
Privilege Management - -AINode functions rely on IoTDB's own authentication for permission management. Users can use the model management functions only if they have the USE_MODEL permission. To use the inference function, the user also needs permission to read the source series referenced by the SQL that supplies the model input. - -| Privilege Name | Privilege Scope | Administrator User (default ROOT) | Normal User | Path Related | -| --------- | --------------------------------- | ---------------------- | -------- | -------- | -| USE_MODEL | create model/show models/drop model | √ | √ | x | -| READ_DATA | call inference | √ | √ | √ | - -## 6. Practical Examples - -### 6.1 Power Load Prediction - -In some industrial scenarios, there is a need to predict power loads, which can be used to optimise power supply, conserve energy and resources, support planning and expansion, and enhance power system reliability. - -The test set we use is [ETTh1](/img/ETTh1.csv). - - -It contains power data collected at 1h intervals. Each record consists of load values and the oil temperature: High UseFul Load, High UseLess Load, Middle UseFul Load, Middle UseLess Load, Low UseFul Load, Low UseLess Load, and Oil Temperature. - -On this dataset, the model inference function of IoTDB-ML can predict the oil temperature over a future period from the past values of the high, middle, and low loads and the oil temperature at the corresponding timestamps, which supports automatic regulation and monitoring of grid transformers. - -#### Step 1: Data Import - -Users can import the ETT dataset into IoTDB using `import-data.sh` in the tools folder: - -```Bash -bash ./import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw root -s /path/ETTh1.csv -``` - -#### Step 2: Model Import - -Enter the following SQL in iotdb-cli to pull a trained model from Hugging Face and register it for subsequent inference: 
- -```SQL -create model dlinear using uri 'https://huggingface.co/hvlgo/dlinear/tree/main' -``` - -This model is trained on the lighter weight deep model DLinear, which is able to capture as many trends within a sequence and relationships between variables as possible with relatively fast inference, making it more suitable for fast real-time prediction than other deeper models. - -#### Step 3: Model inference - -```Shell -IoTDB> select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth LIMIT 96 -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -| Time|root.eg.etth.s0|root.eg.etth.s1|root.eg.etth.s2|root.eg.etth.s3|root.eg.etth.s4|root.eg.etth.s5|root.eg.etth.s6| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -|2017-10-20T00:00:00.000+08:00| 10.449| 3.885| 8.706| 2.025| 2.041| 0.944| 8.864| -|2017-10-20T01:00:00.000+08:00| 11.119| 3.952| 8.813| 2.31| 2.071| 1.005| 8.442| -|2017-10-20T02:00:00.000+08:00| 9.511| 2.88| 7.533| 1.564| 1.949| 0.883| 8.16| -|2017-10-20T03:00:00.000+08:00| 9.645| 2.21| 7.249| 1.066| 1.828| 0.914| 7.949| -...... 
-|2017-10-23T20:00:00.000+08:00| 8.105| 0.938| 4.371| -0.569| 3.533| 1.279| 9.708| -|2017-10-23T21:00:00.000+08:00| 7.167| 1.206| 4.087| -0.462| 3.107| 1.432| 8.723| -|2017-10-23T22:00:00.000+08:00| 7.1| 1.34| 4.015| -0.32| 2.772| 1.31| 8.864| -|2017-10-23T23:00:00.000+08:00| 9.176| 2.746| 7.107| 1.635| 2.65| 1.097| 9.004| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example, "select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth", generateTime=True, window=head(96)) -+-----------------------------+-----------+----------+----------+------------+---------+----------+----------+ -| Time| output0| output1| output2| output3| output4| output5| output6| -+-----------------------------+-----------+----------+----------+------------+---------+----------+----------+ -|2017-10-23T23:00:00.000+08:00| 10.319546| 3.1450553| 7.877341| 1.5723765|2.7303758| 1.1362307| 8.867775| -|2017-10-24T01:00:00.000+08:00| 10.443649| 3.3286757| 7.8593454| 1.7675098| 2.560634| 1.1177158| 8.920919| -|2017-10-24T03:00:00.000+08:00| 10.883752| 3.2341104| 8.47036| 1.6116762|2.4874182| 1.1760603| 8.798939| -...... -|2017-10-26T19:00:00.000+08:00| 8.0115595| 1.2995274| 6.9900327|-0.098746896| 3.04923| 1.176214| 9.548782| -|2017-10-26T21:00:00.000+08:00| 8.612427| 2.5036244| 5.6790237| 0.66474205|2.8870275| 1.2051733| 9.330128| -|2017-10-26T22:00:00.000+08:00| 10.096699| 3.399722| 6.9909| 1.7478468|2.7642853| 1.1119363| 9.541455| -+-----------------------------+-----------+----------+----------+------------+---------+----------+----------+ -Total line number = 48 -``` - -We compare the results of the prediction of the oil temperature with the real results, and we can get the following image. 
- -The data before 10/24 00:00 represents the past data input to the model, the blue line after 10/24 00:00 is the oil temperature forecast result given by the model, and the red line is the actual oil temperature data from the dataset (used for comparison). - -![](/img/AINode-analysis1.png) - -As can be seen, we have used the relationship between the six load information and the corresponding time oil temperatures for the past 96 hours (4 days) to model the possible changes in this data for the oil temperature for the next 48 hours (2 days) based on the inter-relationships between the sequences learned previously, and it can be seen that the predicted curves maintain a high degree of consistency in trend with the actual results after visualisation. - -### 6.2 Power Prediction - -Power monitoring of current, voltage and power data is required in substations for detecting potential grid problems, identifying faults in the power system, effectively managing grid loads and analysing power system performance and trends. - -We have used the current, voltage and power data in a substation to form a dataset in a real scenario. The dataset consists of data such as A-phase voltage, B-phase voltage, and C-phase voltage collected every 5 - 6s for a time span of nearly four months in the substation. - -The test set data content is [data](/img/data.csv). - -On this dataset, the model inference function of IoTDB-ML can predict the C-phase voltage in the future period through the previous values and corresponding timestamps of A-phase voltage, B-phase voltage and C-phase voltage, empowering the monitoring management of the substation. - -#### Step 1: Data Import - -Users can import the dataset using `import-data.sh` in the tools folder - -```Bash -bash ./import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw root -s /path/data.csv -``` - -#### Step 2: Model Import - -We can select built-in models or registered models in IoTDB CLI for subsequent inference. 
- -We use the built-in model STLForecaster for prediction. STLForecaster is a time series forecasting method based on the STL implementation in the statsmodels library. - -#### Step 3: Model Inference - -```Shell -IoTDB> select * from root.eg.voltage limit 96 -+-----------------------------+------------------+------------------+------------------+ -| Time|root.eg.voltage.s0|root.eg.voltage.s1|root.eg.voltage.s2| -+-----------------------------+------------------+------------------+------------------+ -|2023-02-14T20:38:32.000+08:00| 2038.0| 2028.0| 2041.0| -|2023-02-14T20:38:38.000+08:00| 2014.0| 2005.0| 2018.0| -|2023-02-14T20:38:44.000+08:00| 2014.0| 2005.0| 2018.0| -...... -|2023-02-14T20:47:52.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:47:57.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:48:03.000+08:00| 2024.0| 2016.0| 2027.0| -+-----------------------------+------------------+------------------+------------------+ -Total line number = 96 - -IoTDB> call inference(_STLForecaster, "select s0,s1,s2 from root.eg.voltage", generateTime=True, window=head(96),predict_length=48) -+-----------------------------+---------+---------+---------+ -| Time| output0| output1| output2| -+-----------------------------+---------+---------+---------+ -|2023-02-14T20:48:03.000+08:00|2026.3601|2018.2953|2029.4257| -|2023-02-14T20:48:09.000+08:00|2019.1538|2011.4361|2022.0888| -|2023-02-14T20:48:15.000+08:00|2025.5074|2017.4522|2028.5199| -...... - -|2023-02-14T20:52:15.000+08:00|2022.2336|2015.0290|2025.1023| -|2023-02-14T20:52:21.000+08:00|2015.7241|2008.8975|2018.5085| -|2023-02-14T20:52:27.000+08:00|2022.0777|2014.9136|2024.9396| -|2023-02-14T20:52:33.000+08:00|2015.5682|2008.7821|2018.3458| -+-----------------------------+---------+---------+---------+ -Total line number = 48 -``` - -Comparing the predicted results of the C-phase voltage with the real results, we can get the following image. 
- -The data before 02/14 20:48 represents the past data input to the model, the blue line after 02/14 20:48 is the predicted result of phase C voltage given by the model, while the red line is the actual phase C voltage data from the dataset (used for comparison). - -![](/img/AINode-analysis2.png) - -It can be seen that we used the voltage data from the past 10 minutes and, based on the previously learned inter-sequence relationships, modeled the possible changes in the phase C voltage data for the next 5 minutes. The visualized forecast curve shows a certain degree of synchronicity with the actual results in terms of trend. - -### 6.3 Anomaly Detection - -In the civil aviation and transport industry, there exists a need for anomaly detection of the number of passengers travelling on an aircraft. The results of anomaly detection can be used to guide the adjustment of flight scheduling to make the organisation more efficient. - -Airline Passengers is a time-series dataset that records the number of international air passengers between 1949 and 1960, sampled at one-month intervals. The dataset contains a total of one time series. The dataset is [airline](/img/airline.csv). -On this dataset, the model inference function of IoTDB-ML can empower the transport industry by capturing the changing patterns of the sequence in order to detect anomalies at the sequence time points. 
- -#### Step 1: Data Import - -Users can import the dataset using `import-data.sh` in the tools folder: - -```Bash -bash ./import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw root -s /path/data.csv -``` - -#### Step 2: Model Inference - -IoTDB has some built-in machine learning algorithms that can be used directly. A sample run using one of the anomaly detection algorithms is shown below: - -```Shell -IoTDB> select * from root.eg.airline -+-----------------------------+------------------+ -| Time|root.eg.airline.s0| -+-----------------------------+------------------+ -|1949-01-31T00:00:00.000+08:00| 224.0| -|1949-02-28T00:00:00.000+08:00| 118.0| -|1949-03-31T00:00:00.000+08:00| 132.0| -|1949-04-30T00:00:00.000+08:00| 129.0| -...... -|1960-09-30T00:00:00.000+08:00| 508.0| -|1960-10-31T00:00:00.000+08:00| 461.0| -|1960-11-30T00:00:00.000+08:00| 390.0| -|1960-12-31T00:00:00.000+08:00| 432.0| -+-----------------------------+------------------+ -Total line number = 144 - -IoTDB> call inference(_Stray, "select s0 from root.eg.airline", generateTime=True, k=2) -+-----------------------------+-------+ -| Time|output0| -+-----------------------------+-------+ -|1960-12-31T00:00:00.000+08:00| 0| -|1961-01-31T08:00:00.000+08:00| 0| -|1961-02-28T08:00:00.000+08:00| 0| -|1961-03-31T08:00:00.000+08:00| 0| -...... -|1972-06-30T08:00:00.000+08:00| 1| -|1972-07-31T08:00:00.000+08:00| 1| -|1972-08-31T08:00:00.000+08:00| 0| -|1972-09-30T08:00:00.000+08:00| 0| -|1972-10-31T08:00:00.000+08:00| 0| -|1972-11-30T08:00:00.000+08:00| 0| -+-----------------------------+-------+ -Total line number = 144 -``` - -We plot the detected anomalies to get the following image. The blue curve is the original time series, and the red dots mark the time points that the algorithm detects as anomalies. 
- -![](/img/s6.png) - -It can be seen that the Stray model has modelled the input sequence changes and successfully detected the time points where anomalies occur. \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md b/src/UserGuide/Master/Tree/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md deleted file mode 100644 index c7fde0bf2..000000000 --- a/src/UserGuide/Master/Tree/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md +++ /dev/null @@ -1,157 +0,0 @@ - -# Time Series Large Models - -## 1. Introduction - -Time Series Large Models are foundational models specifically designed for time series data analysis. The IoTDB team has been developing the Timer, a self-researched foundational time series model, which is based on the Transformer architecture and pre-trained on massive multi-domain time series data, supporting downstream tasks such as time series forecasting, anomaly detection, and time series imputation. The AINode platform developed by the team also supports the integration of cutting-edge time series foundational models from the industry, providing users with diverse model options. Unlike traditional time series analysis techniques, these large models possess universal feature extraction capabilities and can serve a wide range of analytical tasks through zero-shot analysis, fine-tuning, and other services. - -All technical achievements in the field of time series large models related to this paper (including both the team's self-researched models and industry-leading directions) have been published in top international machine learning conferences, with specific details in the appendix. - -## 2. Application Scenarios - -* **Time Series Forecasting**: Providing time series data forecasting services for industrial production, natural environments, and other fields to help users understand future trends in advance. 
-* **Data Imputation**: Performing context-based filling for missing segments in time series to enhance the continuity and integrity of the dataset. -* **Anomaly Detection**: Using autoregressive analysis technology to monitor time series data in real-time, promptly alerting potential anomalies. - -![](/img/LargeModel10.png) - -## 3. Timer-1 Model - -The Timer model (non-built-in model) not only demonstrates excellent few-shot generalization and multi-task adaptability, but also acquires a rich knowledge base through pre-training, endowing it with universal capabilities to handle diverse downstream tasks, with the following characteristics: - -* **Generalizability**: The model can achieve industry-leading deep model prediction results through fine-tuning with only a small number of samples. -* **Universality**: The model design is flexible, capable of adapting to various different task requirements, and supports variable input and output lengths, enabling it to function effectively in various application scenarios. -* **Scalability**: As the number of model parameters increases or the scale of pre-training data expands, the model's performance will continue to improve, ensuring that the model can continuously optimize its prediction effectiveness as time and data volume grow. - -![](/img/model01.png) - -## 4. Timer-XL Model - -Timer-XL further extends and upgrades the network structure based on Timer, achieving comprehensive breakthroughs in multiple dimensions: - -* **Ultra-Long Context Support**: This model breaks through the limitations of traditional time series forecasting models, supporting the processing of inputs with thousands of Tokens (equivalent to tens of thousands of time points), effectively solving the context length bottleneck problem. 
-* **Coverage of Multi-Variable Forecasting Scenarios**: Supports various forecasting scenarios, including the prediction of non-stationary time series, multi-variable prediction tasks, and predictions involving covariates, meeting diversified business needs. -* **Large-Scale Industrial Time Series Dataset**: Pre-trained on a trillion-scale time series dataset from the industrial IoT field, the dataset possesses important characteristics such as massive scale, excellent quality, and rich domain coverage, covering multiple fields including energy, aerospace, steel, and transportation. - -![](/img/model02.png) - -## 5. Timer-Sundial Model - -Timer-Sundial is a series of generative foundational models focused on time series forecasting. The base version has 128 million parameters and has been pre-trained on 1 trillion time points, with the following core characteristics: - -* **Strong Generalization Performance**: Possesses zero-shot forecasting capabilities and can support both point forecasting and probabilistic forecasting simultaneously. -* **Flexible Prediction Distribution Analysis**: Not only can it predict means or quantiles, but it can also evaluate any statistical properties of the prediction distribution through the raw samples generated by the model. -* **Innovative Generative Architecture**: Employs a "Transformer + TimeFlow" collaborative architecture - the Transformer learns the autoregressive representations of time segments, while the TimeFlow module transforms random noise into diverse prediction trajectories based on the flow-matching framework (Flow-Matching), achieving efficient generation of non-deterministic samples. - -![](/img/model03.png) - -## 6. Chronos-2 Model - -Chronos-2 is a universal time series foundational model developed by the Amazon Web Services (AWS) research team, evolved from the Chronos discrete token modeling paradigm. This model is suitable for both zero-shot univariate forecasting and covariate forecasting. 
Its main characteristics include: - -* **Probabilistic Forecasting Capability**: The model outputs multi-step prediction results in a generative manner, supporting quantile or distribution-level forecasting to characterize future uncertainty. -* **Zero-Shot General Forecasting**: Leveraging the contextual learning ability acquired through pre-training, it can directly execute forecasting on unseen datasets without retraining or parameter updates. -* **Unified Modeling of Multi-Variable and Covariates**: Supports joint modeling of multiple related time series and their covariates under the same architecture to improve prediction performance for complex tasks. However, it has strict input requirements: - * The set of names of future covariates must be a subset of the set of names of historical covariates; - * The length of each historical covariate must equal the length of the target variable; - * The length of each future covariate must equal the prediction length; -* **Efficient Inference and Deployment**: The model adopts a compact encoder-only structure, maintaining strong generalization capabilities while ensuring inference efficiency. - -![](/img/timeseries-large-model-chronos2.png) - -## 7. Performance Showcase - -Time Series Large Models can adapt to real time series data from various different domains and scenarios, demonstrating excellent processing capabilities across various tasks. The following shows the actual performance on different datasets: - -**Time Series Forecasting:** - -Leveraging the forecasting capabilities of Time Series Large Models, future trends of time series can be accurately predicted. The blue curve in the following figure represents the predicted trend, while the red curve represents the actual trend, with both curves highly consistent. - -![](/img/LargeModel03.png) - -**Data Imputation:** - -Using Time Series Large Models to fill missing data segments through predictive imputation. 
- -![](/img/timeseries-large-model-data-imputation.png) - -**Anomaly Detection:** - -Using Time Series Large Models to accurately identify outliers that deviate significantly from the normal trend. - -![](/img/LargeModel05.png) - -## 8. Deployment and Usage - -1. Open the IoTDB CLI console and check that the ConfigNode, DataNode, and AINode nodes are all Running. - -```Plain -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| -+------+----------+-------+---------------+------------+--------------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| 2.0.5.1| 069354f| -| 1| DataNode|Running| 127.0.0.1| 10730| 2.0.5.1| 069354f| -| 2| AINode|Running| 127.0.0.1| 10810| 2.0.5.1|069354f-dev| -+------+----------+-------+---------------+------------+--------------+-----------+ -Total line number = 3 -It costs 0.140s -``` - -2. In an online environment, the first startup of the AINode node will automatically pull the Timer-XL, Sundial, and Chronos2 models. - - > Note: - > - > * The AINode installation package does not include model weight files. - > * The automatic pull feature depends on the deployment environment having HuggingFace network access capability. - > * AINode supports manual upload of model weight files. For specific operation methods, refer to [Importing Weight Files](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md#_3-3-importing-built-in-weight-files). - -3. Check if the models are available. 
- -```Bash -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### Appendix - -**[1]** Timer: Generative Pre-trained Transformers Are Large Time Series Models, Yong Liu, Haoran Zhang, Chenyu Li, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ Back]() - -**[2]** TIMER-XL: LONG-CONTEXT TRANSFORMERS FOR UNIFIED TIME SERIES FORECASTING, Yong Liu, Guo Qin, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ Back]() - -**[3]** Sundial: A Family of Highly Capable Time Series Foundation Models, Yong Liu, Guo Qin, Zhiyuan Shi, Zhi Chen, Caiyin Yang, Xiangdong Huang, Jianmin Wang, Mingsheng Long, **ICML 2025 spotlight**. [↩ Back]() - -**[4]** Chronos-2: From Univariate to Universal Forecasting, Abdul Fatir Ansari, Oleksandr Shchur, Jaris Küken, Andreas Auer, Boran Han, Pedro Mercado, Syama Sundar Rangapuram, Huibin Shen, Lorenzo Stella, Xiyuan Zhang, Mononito Goswami, Shubham Kapoor, Danielle C. Maddix, Pablo Guerron, Tony Hu, Junming Yin, Nick Erickson, Prateek Mutalik Desai, Hao Wang, Huzefa Rangwala, George Karypis, Yuyang Wang, Michael Bohlke-Schneider, **arXiv:2510.15821**. 
[↩ Back]() \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/API/Programming-Data-Subscription_timecho.md b/src/UserGuide/Master/Tree/API/Programming-Data-Subscription_timecho.md deleted file mode 100644 index 730ff8dc3..000000000 --- a/src/UserGuide/Master/Tree/API/Programming-Data-Subscription_timecho.md +++ /dev/null @@ -1,260 +0,0 @@ - - - - -# Data Subscription API - -IoTDB provides powerful data subscription functionality, allowing users to access newly added data from IoTDB in real time through subscription APIs. For detailed functional definitions and introductions, see [Data subscription](../User-Manual/Data-subscription_timecho). - -## 1. Core Steps - -1. Create Topic: Create a Topic that includes the measurement points you wish to subscribe to. -2. Subscribe to Topic: Before a consumer subscribes to a topic, the topic must have been created, otherwise the subscription will fail. Consumers under the same consumer group will evenly distribute the data. -3. Consume Data: Only by explicitly subscribing to a specific topic will you receive data from that topic. -4. Unsubscribe: When a consumer is closed, it will exit the corresponding consumer group and cancel all existing subscriptions. - - -## 2. Detailed Steps - -This section illustrates the core development process and does not demonstrate all parameters and interfaces. For a comprehensive understanding of all features and parameters, please refer to: [Java Native API](../API/Programming-Java-Native-API_timecho#_3-native-interface-description) - - -### 2.1 Create a Maven project - -Create a Maven project and import the following dependencies (JDK >= 1.8, Maven >= 3.6): - -```xml -<dependencies> - <dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-session</artifactId> - <version>${project.version}</version> - </dependency> -</dependencies> -``` -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. 
- -### 2.2 Code Example - -#### 2.2.1 Topic operations - -```java -import java.util.Optional; -import java.util.Properties; -import java.util.Set; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.rpc.subscription.config.TopicConstant; -import org.apache.iotdb.session.subscription.SubscriptionSession; -import org.apache.iotdb.session.subscription.model.Topic; - -public class DataConsumerExample { - - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - try (SubscriptionSession session = new SubscriptionSession("127.0.0.1", 6667, "root", "TimechoDB@2021", 67108864)) { // Before V2.0.6.x the default password is root - // 1. open session - session.open(); - - // 2. create a topic of all data - Properties sessionConfig = new Properties(); - sessionConfig.put(TopicConstant.PATH_KEY, "root.**"); - - session.createTopic("allData", sessionConfig); - - // 3. show all topics - Set<Topic> topics = session.getTopics(); - System.out.println(topics); - - // 4. 
show a specific topic - Optional<Topic> allData = session.getTopic("allData"); - System.out.println(allData.get()); - } - } -} -``` - -#### 2.2.2 Data Consumption - -##### Scenario-1: Subscribing to newly added real-time data in IoTDB (for scenarios such as dashboard or configuration display) - -```java -import java.io.IOException; -import java.util.List; -import java.util.Properties; -import org.apache.iotdb.rpc.subscription.config.ConsumerConstant; -import org.apache.iotdb.rpc.subscription.config.TopicConstant; -import org.apache.iotdb.session.subscription.consumer.SubscriptionPullConsumer; -import org.apache.iotdb.session.subscription.payload.SubscriptionMessage; -import org.apache.iotdb.session.subscription.payload.SubscriptionMessageType; -import org.apache.iotdb.session.subscription.payload.SubscriptionSessionDataSet; -import org.apache.tsfile.read.common.RowRecord; - -public class DataConsumerExample { - - public static void main(String[] args) throws IOException { - - // 5. create a pull consumer, the subscription is automatically cancelled when the logic in the try resources is completed - Properties consumerConfig = new Properties(); - consumerConfig.put(ConsumerConstant.CONSUMER_ID_KEY, "c1"); - consumerConfig.put(ConsumerConstant.CONSUMER_GROUP_ID_KEY, "cg1"); - consumerConfig.put(ConsumerConstant.USERNAME_KEY, "root"); - consumerConfig.put(ConsumerConstant.PASSWORD_KEY, "TimechoDB@2021"); // Before V2.0.6.x the default password is root - try (SubscriptionPullConsumer pullConsumer = new SubscriptionPullConsumer(consumerConfig)) { - pullConsumer.open(); - pullConsumer.subscribe("topic_all"); - while (true) { - List<SubscriptionMessage> messages = pullConsumer.poll(10000); - for (final SubscriptionMessage message : messages) { - final short messageType = message.getMessageType(); - if (SubscriptionMessageType.isValidatedMessageType(messageType)) { - for (final SubscriptionSessionDataSet dataSet : message.getSessionDataSetsHandler()) { - while (dataSet.hasNext()) { - final RowRecord 
record = dataSet.next(); - System.out.println(record); - } - } - } - } - } - } - } -} - - -``` - -##### Scenario-2: Subscribing to newly added TsFiles (for scenarios such as regular data backup) - -Prerequisite: The format of the topic to be consumed must be of the TsFileHandler type. For example: `create topic topic_all_tsfile with ('path'='root.**','format'='TsFileHandler')` - -```java -import java.io.IOException; -import java.util.List; -import java.util.Properties; -import org.apache.iotdb.rpc.subscription.config.ConsumerConstant; -import org.apache.iotdb.rpc.subscription.config.TopicConstant; -import org.apache.iotdb.session.subscription.consumer.SubscriptionPullConsumer; -import org.apache.iotdb.session.subscription.payload.SubscriptionMessage; - - -public class DataConsumerExample { - - public static void main(String[] args) throws IOException { - // 1. create a pull consumer, the subscription is automatically cancelled when the logic in the try resources is completed - Properties consumerConfig = new Properties(); - consumerConfig.put(ConsumerConstant.CONSUMER_ID_KEY, "c1"); - consumerConfig.put(ConsumerConstant.CONSUMER_GROUP_ID_KEY, "cg1"); - // 2. Specify the consumption type as the tsfile type - consumerConfig.put(ConsumerConstant.USERNAME_KEY, "root"); - consumerConfig.put(ConsumerConstant.PASSWORD_KEY, "TimechoDB@2021"); // Before V2.0.6.x the default password is root - consumerConfig.put(ConsumerConstant.FILE_SAVE_DIR_KEY, "/Users/iotdb/Downloads"); - try (SubscriptionPullConsumer pullConsumer = new SubscriptionPullConsumer(consumerConfig)) { - pullConsumer.open(); - pullConsumer.subscribe("topic_all_tsfile"); - while (true) { - List<SubscriptionMessage> messages = pullConsumer.poll(10000); - for (final SubscriptionMessage message : messages) { - message.getTsFileHandler().copyFile("/Users/iotdb/Downloads/1.tsfile"); - } - } - } - } -} -``` - - - - -## 3. 
Java Native API Description - -### 3.1 Parameter List - -The consumer-related parameters can be set through the Properties parameter object. The specific parameters are as follows: - -#### SubscriptionConsumer - - -| **Parameter** | **required or optional with default** | **Parameter Meaning** | -| :---------------------- |:-------------------------------------------------------------------------------------| :----------------------------------------------------------- | -| host | optional: 127.0.0.1 | `String`: The RPC host of a DataNode in IoTDB | -| port | optional: 6667 | `Integer`: The RPC port of a DataNode in IoTDB | -| node-urls | optional: 127.0.0.1:6667 | `List<String>`: The RPC addresses of all DataNodes in IoTDB, which can be multiple; either host:port or node-urls can be filled. If both host:port and node-urls are filled, the **union** of host:port and node-urls will be taken to form a new node-urls for application | -| username | optional: root | `String`: The username of the DataNode in IoTDB | -| password | optional: TimechoDB@2021 (before V2.0.6.x the default password is root) | `String`: The password of the DataNode in IoTDB | -| groupId | optional | `String`: consumer group id. If not specified, it is randomly assigned (a new consumer group), ensuring that different consumer groups have different ids | -| consumerId | optional | `String`: consumer client id. If not specified, it is randomly assigned, ensuring that each consumer client id in the same consumer group is different | -| heartbeatIntervalMs | optional: 30000 (min: 1000) | `Long`: The interval at which the consumer sends periodic heartbeat requests to the IoTDB DataNode | -| endpointsSyncIntervalMs | optional: 120000 (min: 5000) | `Long`: The interval at which the consumer detects the expansion or contraction of IoTDB cluster nodes and adjusts the subscription connection | -| fileSaveDir | optional: Paths.get(System.getProperty("user.dir"), 
"iotdb-subscription").toString() | `String`: The temporary directory path where the consumer stores the subscribed TsFile files | -| fileSaveFsync | optional: false | `Boolean`: Whether the consumer actively calls fsync during the subscription of TsFiles | - -Special configurations in `SubscriptionPushConsumer` : - -| **Parameter** | **required or optional with default** | **Parameter Meaning** | -| :----------------- | :------------------------------------ | :----------------------------------------------------------- | -| ackStrategy | optional: `ACKStrategy.AFTER_CONSUME` | The acknowledgment mechanism for consumption progress includes the following options: `ACKStrategy.BEFORE_CONSUME`(the consumer submits the consumption progress immediately upon receiving the data, before `onReceive` )`ACKStrategy.AFTER_CONSUME`(the consumer submits the consumption progress after consuming the data, after `onReceive` ) | -| consumeListener | optional | The callback function for consuming data, which needs to implement the `ConsumeListener` interface, defining the processing logic for consuming `SessionDataSetsHandler` and `TsFileHandler` formatted data | -| autoPollIntervalMs | optional: 5000 (min: 500) | Long: The time interval at which the consumer automatically pulls data, in **ms** | -| autoPollTimeoutMs | optional: 10000 (min: 1000) | Long: The timeout duration for the consumer to pull data each time, in **ms** | - -Special configurations in `SubscriptionPullConsumer` : - -| **Parameter** | **required or optional with default** | **Parameter Meaning** | -| :----------------- | :------------------------------------ | :----------------------------------------------------------- | -| autoCommit | optional: true | Boolean: Whether to automatically commit the consumption progress. 
If this parameter is set to false, the `commit` method needs to be called manually to submit the consumption progress | -| autoCommitInterval | optional: 5000 (min: 500) | Long: The time interval for automatically committing the consumption progress, in **ms**. This parameter only takes effect when the `autoCommit` parameter is set to true | - - -### 3.2 Function List - -#### Data subscription - -##### SubscriptionPullConsumer - -| **Function name** | **Description** | **Parameter** | -| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -| `open()` | Opens the consumer connection and starts message consumption. If `autoCommit` is enabled, it will start the automatic commit worker. | None | -| `close()` | Closes the consumer connection. If `autoCommit` is enabled, it will commit all uncommitted messages before closing. | None | -| `poll(final Duration timeout)` | Pulls messages with a specified timeout. | `timeout` : The timeout duration. | -| `poll(final long timeoutMs)` | Pulls messages with a specified timeout in milliseconds. | `timeoutMs` : The timeout duration in milliseconds. | -| `poll(final Set<String> topicNames, final Duration timeout)` | Pulls messages from specified topics with a specified timeout. | `topicNames` : The set of topics to pull messages from. `timeout`: The timeout duration. | -| `poll(final Set<String> topicNames, final long timeoutMs)` | Pulls messages from specified topics with a specified timeout in milliseconds. | `topicNames` : The set of topics to pull messages from. `timeoutMs`: The timeout duration in milliseconds. | -| `commitSync(final SubscriptionMessage message)` | Synchronously commits a single message. | `message` : The message object to be committed. | -| `commitSync(final Iterable<SubscriptionMessage> messages)` | Synchronously commits multiple messages. | `messages` : The collection of message objects to be committed. 
| -| `commitAsync(final SubscriptionMessage message)` | Asynchronously commits a single message. | `message` : The message object to be committed. | -| `commitAsync(final Iterable<SubscriptionMessage> messages)` | Asynchronously commits multiple messages. | `messages` : The collection of message objects to be committed. | -| `commitAsync(final SubscriptionMessage message, final AsyncCommitCallback callback)` | Asynchronously commits a single message with a specified callback. | `message` : The message object to be committed. `callback` : The callback function to be executed after asynchronous commit. | -| `commitAsync(final Iterable<SubscriptionMessage> messages, final AsyncCommitCallback callback)` | Asynchronously commits multiple messages with a specified callback. | `messages` : The collection of message objects to be committed. `callback` : The callback function to be executed after asynchronous commit. | - -##### SubscriptionPushConsumer - -| **Function name** | **Description** | **Parameter** | -| -------------------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | -| `open()` | Opens the consumer connection, starts message consumption, and submits the automatic polling worker. | None | -| `close()` | Closes the consumer connection and stops message consumption. | None | -| `toString()` | Returns the core configuration information of the consumer object. | None | -| `coreReportMessage()` | Obtains the key-value representation of the consumer's core configuration. | None | -| `allReportMessage()` | Obtains the key-value representation of all the consumer's configurations. | None | -| `buildPushConsumer()` | Builds a `SubscriptionPushConsumer` instance through the `Builder` | None | -| `ackStrategy(final AckStrategy ackStrategy)` | Configures the message acknowledgment strategy for the consumer. | `ackStrategy`: The specified message acknowledgment strategy. 
| -| `consumeListener(final ConsumeListener consumeListener)` | Configures the message consumption logic for the consumer. | `consumeListener`: The processing logic when the consumer receives messages. | -| `autoPollIntervalMs(final long autoPollIntervalMs)` | Configures the interval for automatic polling. | `autoPollIntervalMs` : The interval for automatic polling, in milliseconds. | -| `autoPollTimeoutMs(final long autoPollTimeoutMs)` | Configures the timeout for automatic polling. | `autoPollTimeoutMs`: The timeout for automatic polling, in milliseconds. | \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/API/Programming-JDBC_timecho.md b/src/UserGuide/Master/Tree/API/Programming-JDBC_timecho.md deleted file mode 100644 index 5ed6af0b8..000000000 --- a/src/UserGuide/Master/Tree/API/Programming-JDBC_timecho.md +++ /dev/null @@ -1,296 +0,0 @@ - - -# JDBC - -**Note**: The current JDBC implementation is intended only for connecting with third-party tools. JDBC cannot provide high-performance writing, so we do not recommend it for executing insert statements. For queries, we recommend using JDBC. -For high-performance writing, please use the [Java Native API](./Programming-Java-Native-API_timecho) instead. - -## 1. Dependencies - -* JDK >= 1.8 -* Maven >= 3.9 - -## 2. Installation - -In the root directory: - -```shell -mvn clean install -pl iotdb-client/jdbc -am -DskipTests -``` - -## 3. Use IoTDB JDBC with Maven - -```xml -<dependencies> - <dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-jdbc</artifactId> - <version>${project.version}</version> - </dependency> -</dependencies> -``` -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. - -## 4. Coding Examples - -This chapter provides an example of how to open a database connection, execute an SQL query, and display the results. - -The example imports the packages that contain the JDBC classes needed for database programming. 
- -**NOTE: For faster insertion, the insertTablet() in Session is recommended.** - -```java -import java.sql.*; -import org.apache.iotdb.jdbc.IoTDBSQLException; - -public class JDBCExample { - /** - * Before executing a SQL statement with a Statement object, you need to create a Statement object using the createStatement() method of the Connection object. - * After creating a Statement object, you can use its execute() method to execute a SQL statement - * Finally, remember to close the 'statement' and 'connection' objects by using their close() method - * For statements with query results, we can use the getResultSet() method of the Statement object to get the result set. - */ - public static void main(String[] args) throws SQLException { - Connection connection = getConnection(); - if (connection == null) { - System.out.println("get connection defeat"); - return; - } - Statement statement = connection.createStatement(); - //Create database - try { - statement.execute("CREATE DATABASE root.demo"); - }catch (IoTDBSQLException e){ - System.out.println(e.getMessage()); - } - - - //SHOW DATABASES - statement.execute("SHOW DATABASES"); - outputResult(statement.getResultSet()); - - //Create time series - //Different data type has different encoding methods. 
Here use INT32 as an example - try { - statement.execute("CREATE TIMESERIES root.demo.s0 WITH DATATYPE=INT32,ENCODING=RLE;"); - }catch (IoTDBSQLException e){ - System.out.println(e.getMessage()); - } - //Show time series - statement.execute("SHOW TIMESERIES root.demo"); - outputResult(statement.getResultSet()); - //Show devices - statement.execute("SHOW DEVICES"); - outputResult(statement.getResultSet()); - //Count time series - statement.execute("COUNT TIMESERIES root"); - outputResult(statement.getResultSet()); - //Count nodes at the given level - statement.execute("COUNT NODES root LEVEL=3"); - outputResult(statement.getResultSet()); - //Count timeseries group by each node at the given level - statement.execute("COUNT TIMESERIES root GROUP BY LEVEL=3"); - outputResult(statement.getResultSet()); - - - //Execute insert statements in batch - statement.addBatch("INSERT INTO root.demo(timestamp,s0) VALUES(1,1);"); - statement.addBatch("INSERT INTO root.demo(timestamp,s0) VALUES(1,1);"); - statement.addBatch("INSERT INTO root.demo(timestamp,s0) VALUES(2,15);"); - statement.addBatch("INSERT INTO root.demo(timestamp,s0) VALUES(2,17);"); - statement.addBatch("INSERT INTO root.demo(timestamp,s0) values(4,12);"); - statement.executeBatch(); - statement.clearBatch(); - - //Full query statement - String sql = "SELECT * FROM root.demo"; - ResultSet resultSet = statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Exact query statement - sql = "SELECT s0 FROM root.demo WHERE time = 4;"; - resultSet= statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Time range query - sql = "SELECT s0 FROM root.demo WHERE time >= 2 AND time < 5;"; - resultSet = statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Aggregate query - sql = "SELECT COUNT(s0) FROM root.demo;"; - resultSet = statement.executeQuery(sql); - System.out.println("sql: " + sql); - 
outputResult(resultSet); - - //Delete time series - statement.execute("DELETE timeseries root.demo.s0"); - - //close connection - statement.close(); - connection.close(); - } - - public static Connection getConnection() { - // JDBC driver name and database URL - String driver = "org.apache.iotdb.jdbc.IoTDBDriver"; - String url = "jdbc:iotdb://127.0.0.1:6667/"; - // set rpc compress mode - // String url = "jdbc:iotdb://127.0.0.1:6667?rpc_compress=true"; - - // Database credentials - String username = "root"; - String password = "TimechoDB@2021"; //Before V2.0.6.x the default password is root - - Connection connection = null; - try { - Class.forName(driver); - connection = DriverManager.getConnection(url, username, password); - } catch (ClassNotFoundException e) { - e.printStackTrace(); - } catch (SQLException e) { - e.printStackTrace(); - } - return connection; - } - - /** - * This is an example of outputting the results in the ResultSet - */ - private static void outputResult(ResultSet resultSet) throws SQLException { - if (resultSet != null) { - System.out.println("--------------------------"); - final ResultSetMetaData metaData = resultSet.getMetaData(); - final int columnCount = metaData.getColumnCount(); - for (int i = 0; i < columnCount; i++) { - System.out.print(metaData.getColumnLabel(i + 1) + " "); - } - System.out.println(); - while (resultSet.next()) { - for (int i = 1; ; i++) { - System.out.print(resultSet.getString(i)); - if (i < columnCount) { - System.out.print(", "); - } else { - System.out.println(); - break; - } - } - } - System.out.println("--------------------------\n"); - } - } -} -``` - -The parameter `version` can be used in the url: -````java -String url = "jdbc:iotdb://127.0.0.1:6667?version=V_1_0"; -```` -The parameter `version` represents the SQL semantic version used by the client, which is used in order to be compatible with the SQL semantics of `0.12` when upgrading to `0.13`. -The possible values are: `V_0_12`, `V_0_13`, `V_1_0`. 
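As a rough illustration of the `version` parameter described above, the following sketch builds a connection URL and rejects any value outside the three listed ones. The class and method names (`VersionedUrlSketch`, `buildUrl`) are hypothetical helpers written for this example and are not part of the IoTDB JDBC driver; only the URL format and the allowed values are taken from the text above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the IoTDB JDBC driver.
public class VersionedUrlSketch {
  // The values listed above as valid for the `version` URL parameter.
  private static final Set<String> ALLOWED_VERSIONS =
      new HashSet<>(Arrays.asList("V_0_12", "V_0_13", "V_1_0"));

  // Builds a URL of the form jdbc:iotdb://host:port?version=VALUE.
  public static String buildUrl(String host, int port, String version) {
    if (!ALLOWED_VERSIONS.contains(version)) {
      throw new IllegalArgumentException("Unsupported version: " + version);
    }
    return "jdbc:iotdb://" + host + ":" + port + "?version=" + version;
  }

  public static void main(String[] args) {
    // prints jdbc:iotdb://127.0.0.1:6667?version=V_1_0
    System.out.println(buildUrl("127.0.0.1", 6667, "V_1_0"));
  }
}
```

The resulting string can be passed to `DriverManager.getConnection(url, username, password)` in the same way as the URL in the example above.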
- -In addition, IoTDB provides additional interfaces in JDBC for users to read and write the database using different character sets (e.g., GB18030) in the connection. -The default character set for IoTDB is UTF-8. When users want to use a character set other than UTF-8, they need to specify the charset property in the JDBC connection. For example: -1. Create a connection using the GB18030 charset: -```java -DriverManager.getConnection("jdbc:iotdb://127.0.0.1:6667?charset=GB18030", "root", "TimechoDB@2021"); //Before V2.0.6.x the default password is root -``` -2. When executing SQL with the `IoTDBStatement` interface, the SQL can be provided as a `byte[]` array, and it will be parsed into a string according to the specified charset. -```java -public boolean execute(byte[] sql) throws SQLException; -``` -3. When outputting query results, the `getBytes` method of `ResultSet` can be used to get `byte[]`, which will be encoded using the charset specified in the connection. -```java -System.out.print(resultSet.getString(i) + " (" + new String(resultSet.getBytes(i), charset) + ")"); -``` -Here is a complete example: -```java -public class JDBCCharsetExample { - - private static final Logger LOGGER = LoggerFactory.getLogger(JDBCCharsetExample.class); - - public static void main(String[] args) throws Exception { - Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); - - try (final Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?charset=GB18030", "root", "TimechoDB@2021"); //Before V2.0.6.x the default password is root - final IoTDBStatement statement = (IoTDBStatement) connection.createStatement()) { - - final String insertSQLWithGB18030 = - "insert into root.测试(timestamp, 维语, 彝语, 繁体, 蒙文, 简体, 标点符号, 藏语) values(1, 'ئۇيغۇر تىلى', 'ꆈꌠꉙ', \"繁體\", 'ᠮᠣᠩᠭᠣᠯ ᠬᠡᠯᠡ', '简体', '——?!', \"བོད་སྐད།\");"; - final byte[] insertSQLWithGB18030Bytes = insertSQLWithGB18030.getBytes("GB18030"); - statement.execute(insertSQLWithGB18030Bytes); - } catch 
(IoTDBSQLException e) { - LOGGER.error("IoTDB Jdbc example error", e); - } - - outputResult("GB18030"); - outputResult("UTF-8"); - outputResult("UTF-16"); - outputResult("GBK"); - outputResult("ISO-8859-1"); - } - - private static void outputResult(String charset) throws SQLException { - System.out.println("[Charset: " + charset + "]"); - try (final Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?charset=" + charset, "root", "TimechoDB@2021"); //Before V2.0.6.x the default password is root - final IoTDBStatement statement = (IoTDBStatement) connection.createStatement()) { - outputResult(statement.executeQuery("select ** from root"), Charset.forName(charset)); - } catch (IoTDBSQLException e) { - LOGGER.error("IoTDB Jdbc example error", e); - } - } - - private static void outputResult(ResultSet resultSet, Charset charset) throws SQLException { - if (resultSet != null) { - System.out.println("--------------------------"); - final ResultSetMetaData metaData = resultSet.getMetaData(); - final int columnCount = metaData.getColumnCount(); - for (int i = 0; i < columnCount; i++) { - System.out.print(metaData.getColumnLabel(i + 1) + " "); - } - System.out.println(); - - while (resultSet.next()) { - for (int i = 1; ; i++) { - System.out.print( - resultSet.getString(i) + " (" + new String(resultSet.getBytes(i), charset) + ")"); - if (i < columnCount) { - System.out.print(", "); - } else { - System.out.println(); - break; - } - } - } - System.out.println("--------------------------\n"); - } - } -} -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/API/Programming-Java-Native-API_timecho.md b/src/UserGuide/Master/Tree/API/Programming-Java-Native-API_timecho.md deleted file mode 100644 index a746bceaa..000000000 --- a/src/UserGuide/Master/Tree/API/Programming-Java-Native-API_timecho.md +++ /dev/null @@ -1,625 +0,0 @@ - - -# Java Native API - -In the native API of IoTDB, the `Session` is the core interface for interacting 
with the database. It integrates a rich set of methods that support data writing, querying, and metadata operations. By instantiating a `Session`, you can establish a connection to the IoTDB server and perform database operations over that connection. A `Session` is not thread-safe and must not be called by multiple threads at the same time. - -`SessionPool` is a connection pool for `Session`, and it is recommended to use `SessionPool` for programming. In scenarios with multi-threaded concurrency, `SessionPool` can manage and allocate connection resources effectively, thereby improving system performance and resource utilization efficiency. - -## 1. Overview of Steps - -1. Create a Connection Pool Instance: Initialize a SessionPool object to manage multiple Session instances. -2. Perform Operations: Directly obtain a Session instance from the SessionPool and execute database operations, without the need to open and close connections each time. -3. Close Connection Pool Resources: When database operations are no longer needed, close the SessionPool to release all related resources. - - -## 2. Detailed Steps - -This section provides an overview of the core development process and does not demonstrate all parameters and interfaces. For a complete list of functionalities and parameters, please refer to [Java Native API](./Programming-Java-Native-API_timecho#_3-native-interface-description) or check the [Source Code](https://github.com/apache/iotdb/tree/rc/2.0.1/example/session/src/main/java/org/apache/iotdb). - -### 2.1 Create a Maven Project - -Create a Maven project and add the following dependencies to the pom.xml file (JDK >= 1.8, Maven >= 3.6): - -```xml -<dependencies> - <dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-session</artifactId> - - <version>${project.version}</version> - </dependency> -</dependencies> -``` -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. 
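The compatibility note above amounts to checking that the client version is not newer than the server version. A minimal sketch of such a check, assuming plain dotted numeric version strings such as `2.0.4`; `VersionCheckSketch` and `clientNotNewer` are hypothetical names used only for this example and are not provided by IoTDB:

```java
// Hypothetical helper for illustration only; IoTDB does not ship this class.
public class VersionCheckSketch {
  // Compares dotted numeric versions segment by segment; missing segments count as 0.
  static int compare(String a, String b) {
    String[] x = a.split("\\.");
    String[] y = b.split("\\.");
    int n = Math.max(x.length, y.length);
    for (int i = 0; i < n; i++) {
      int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
      int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
      if (xi != yi) {
        return Integer.compare(xi, yi);
      }
    }
    return 0;
  }

  // True when the client version is the same as or older than the server version.
  public static boolean clientNotNewer(String clientVersion, String serverVersion) {
    return compare(clientVersion, serverVersion) <= 0;
  }

  public static void main(String[] args) {
    System.out.println(clientNotNewer("2.0.4", "2.0.6")); // true
    System.out.println(clientNotNewer("2.0.6", "2.0.4")); // false
  }
}
```

A check like this could run at startup, before the connection pool is created, so that a mismatched deployment fails early with a clear message.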
 - -### 2.2 Creating a Connection Pool Instance - - -```java -import java.util.ArrayList; -import java.util.List; -import org.apache.iotdb.session.pool.SessionPool; - -public class IoTDBSessionPoolExample { - private static SessionPool sessionPool; - - public static void main(String[] args) { - // With nodeUrls, if one node goes down the client automatically retries on the other nodes - List<String> nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") //Before V2.0.6.x the default password is root - .maxSize(3) - .build(); - } -} -``` - -### 2.3 Performing Database Operations - -#### 2.3.1 Data Insertion - -In industrial scenarios, data insertion falls into two categories: inserting multiple rows of data, and inserting multiple rows of data for a single device. Below, we introduce the insertion interfaces for each scenario. - -##### Multi-Row Data Insertion Interface - -Interface Description: Supports inserting multiple rows of data at once, where each row corresponds to multiple measurement values for a device at a specific timestamp. - - -Interface List: - -| **Interface Name** | **Function Description** | -| ------------------------------------------------------------ | ------------------------------------------------------------ | -| `insertRecords(List<String> deviceIds, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList)` | Inserts multiple rows of data, suitable for scenarios where measurements are independently collected. 
| - -Code Example: - -```java -import java.util.ArrayList; -import java.util.List; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; -import org.apache.tsfile.enums.TSDataType; - -public class SessionPoolExample { - private static SessionPool sessionPool; - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. execute insert data - insertRecordsExample(); - // 3. close SessionPool - closeSessionPool(); - } - - private static void constructSessionPool() { - // With nodeUrls, if one node goes down the client automatically retries on the other nodes - List<String> nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") //Before V2.0.6.x the default password is root - .maxSize(3) - .build(); - } - - public static void insertRecordsExample() throws IoTDBConnectionException, StatementExecutionException { - String deviceId = "root.sg1.d1"; - List<String> measurements = new ArrayList<>(); - measurements.add("s1"); - measurements.add("s2"); - measurements.add("s3"); - List<String> deviceIds = new ArrayList<>(); - List<List<String>> measurementsList = new ArrayList<>(); - List<List<Object>> valuesList = new ArrayList<>(); - List<Long> timestamps = new ArrayList<>(); - List<List<TSDataType>> typesList = new ArrayList<>(); - - for (long time = 0; time < 500; time++) { - List<Object> values = new ArrayList<>(); - List<TSDataType> types = new ArrayList<>(); - values.add(1L); - values.add(2L); - values.add(3L); - types.add(TSDataType.INT64); - types.add(TSDataType.INT64); - types.add(TSDataType.INT64); - - deviceIds.add(deviceId); - measurementsList.add(measurements); - valuesList.add(values); - typesList.add(types); - timestamps.add(time); - if (time != 0 && time % 100 == 0) { - try { - 
sessionPool.insertRecords(deviceIds, timestamps, measurementsList, typesList, valuesList); - } catch (IoTDBConnectionException | StatementExecutionException e) { - // solve exception - } - deviceIds.clear(); - measurementsList.clear(); - valuesList.clear(); - typesList.clear(); - timestamps.clear(); - } - } - try { - sessionPool.insertRecords(deviceIds, timestamps, measurementsList, typesList, valuesList); - } catch (IoTDBConnectionException | StatementExecutionException e) { - // solve exception - } - } - - public static void closeSessionPool(){ - sessionPool.close(); - } -} -``` - -##### Single-Device Multi-Row Data Insertion Interface - -Interface Description: Supports inserting multiple rows of data for a single device at once, where each row corresponds to multiple measurement values for a specific timestamp. - -Interface List: - -| **Interface Name** | **Function Description** | -| ----------------------------- | ------------------------------------------------------------ | -| `insertTablet(Tablet tablet)` | Inserts multiple rows of data for a single device, suitable for scenarios where measurements are independently collected. | - -Code Example: - -```java -import java.util.ArrayList; -import java.util.List; -import java.util.Random; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; -import org.apache.tsfile.enums.TSDataType; -import org.apache.tsfile.write.record.Tablet; -import org.apache.tsfile.write.schema.IMeasurementSchema; -import org.apache.tsfile.write.schema.MeasurementSchema; - -public class SessionPoolExample { - private static SessionPool sessionPool; - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. execute insert data - insertTabletExample(); - // 3. 
close SessionPool - closeSessionPool(); - } - - private static void constructSessionPool() { - // With nodeUrls, if one node goes down the client automatically retries on the other nodes - List<String> nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - //nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") //Before V2.0.6.x the default password is root - .maxSize(3) - .build(); - } - - private static void insertTabletExample() throws IoTDBConnectionException, StatementExecutionException { - /* - * A Tablet example: - * device1 - * time s1, s2, s3 - * 1, 1, 1, 1 - * 2, 2, 2, 2 - * 3, 3, 3, 3 - */ - // The schema of measurements of one device - // only the measurementId and data type in MeasurementSchema take effect in a Tablet - List<IMeasurementSchema> schemaList = new ArrayList<>(); - schemaList.add(new MeasurementSchema("s1", TSDataType.INT64)); - schemaList.add(new MeasurementSchema("s2", TSDataType.INT64)); - schemaList.add(new MeasurementSchema("s3", TSDataType.INT64)); - - Tablet tablet = new Tablet("root.sg.d1",schemaList,100); - - // Method 1 to add tablet data - long timestamp = System.currentTimeMillis(); - - Random random = new Random(); - for (long row = 0; row < 100; row++) { - int rowIndex = tablet.getRowSize(); - tablet.addTimestamp(rowIndex, timestamp); - for (int s = 0; s < 3; s++) { - long value = random.nextLong(); - tablet.addValue(schemaList.get(s).getMeasurementName(), rowIndex, value); - } - if (tablet.getRowSize() == tablet.getMaxRowNumber()) { - sessionPool.insertTablet(tablet); - tablet.reset(); - } - timestamp++; - } - if (tablet.getRowSize() != 0) { - sessionPool.insertTablet(tablet); - tablet.reset(); - } - } - - public static void closeSessionPool(){ - sessionPool.close(); - } -} -``` - -#### 2.3.2 SQL Operations - -SQL operations are divided into two categories: queries and non-queries. 
The corresponding interfaces are `executeQueryStatement` and `executeNonQueryStatement`. The former executes a query statement and returns a result set, while the latter executes statements such as inserts, deletes, and updates and does not return a result set. - -```java -import java.util.ArrayList; -import java.util.List; -import org.apache.iotdb.isession.pool.SessionDataSetWrapper; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; - -public class SessionPoolExample { - private static SessionPool sessionPool; - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. executes a query SQL statement and returns the result set. - executeQueryExample(); - // 3. executes a non-query SQL statement, such as a DDL or DML command. - executeNonQueryExample(); - // 4. close SessionPool - closeSessionPool(); - } - - private static void executeNonQueryExample() throws IoTDBConnectionException, StatementExecutionException { - // 1. create a nonAligned time series - sessionPool.executeNonQueryStatement("create timeseries root.test.d1.s1 with dataType = int32"); - // 2. set ttl - sessionPool.executeNonQueryStatement("set TTL to root.test.** 10000"); - // 3. delete time series - sessionPool.executeNonQueryStatement("delete timeseries root.test.d1.s1"); - } - - private static void executeQueryExample() throws IoTDBConnectionException, StatementExecutionException { - // 1. 
execute normal query - try(SessionDataSetWrapper wrapper = sessionPool.executeQueryStatement("select s1 from root.sg1.d1 limit 10")) { - // get DataIterator like JDBC - org.apache.iotdb.isession.SessionDataSet.DataIterator dataIterator = wrapper.iterator(); - System.out.println(wrapper.getColumnNames()); - System.out.println(wrapper.getColumnTypes()); - while (dataIterator.next()) { - StringBuilder builder = new StringBuilder(); - for (String columnName : wrapper.getColumnNames()) { - builder.append(dataIterator.getString(columnName) + " "); - } - System.out.println(builder); - } - } - // 2. execute aggregate query - try(SessionDataSetWrapper wrapper = sessionPool.executeQueryStatement("select count(s1) from root.sg1.d1 group by ([0, 40), 5ms) ")) { - // get DataIterator like JDBC - org.apache.iotdb.isession.SessionDataSet.DataIterator dataIterator = wrapper.iterator(); - System.out.println(wrapper.getColumnNames()); - System.out.println(wrapper.getColumnTypes()); - while (dataIterator.next()) { - StringBuilder builder = new StringBuilder(); - for (String columnName : wrapper.getColumnNames()) { - builder.append(dataIterator.getString(columnName) + " "); - } - System.out.println(builder); - } - } - } - - private static void constructSessionPool() { - // With nodeUrls, if one node goes down the client automatically retries on the other nodes - List<String> nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") //Before V2.0.6.x the default password is root - .maxSize(3) - .build(); - } - - public static void closeSessionPool(){ - sessionPool.close(); - } -} -``` - - -For more information on the use of result sets and their iterator `SessionDataSet.DataIterator`, please refer to the following example (note: the `getBlob` and `getDate` interfaces have been supported since version V2.0.4): - -```java -import org.apache.iotdb.isession.SessionDataSet; -import 
org.apache.iotdb.isession.pool.SessionDataSetWrapper; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; - -import org.apache.tsfile.enums.TSDataType; -import org.apache.tsfile.utils.Binary; -import org.apache.tsfile.utils.DateUtils; -import org.apache.tsfile.write.record.Tablet; -import org.apache.tsfile.write.schema.MeasurementSchema; -import org.junit.Assert; - -import java.sql.Timestamp; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.List; - -public class SessionExample { - private static SessionPool sessionPool; - - public static void main(String[] args) - throws IoTDBConnectionException, StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. insert one row, then execute a query SQL statement and check the result set - executeQueryExample(); - // 3. close SessionPool - closeSessionPool(); - } - - private static void executeQueryExample() - throws IoTDBConnectionException, StatementExecutionException { - Tablet tablet = - new Tablet( - "root.sg.d1", - Arrays.asList( - new MeasurementSchema("s1", TSDataType.INT32), - new MeasurementSchema("s2", TSDataType.INT64), - new MeasurementSchema("s3", TSDataType.FLOAT), - new MeasurementSchema("s4", TSDataType.DOUBLE), - new MeasurementSchema("s5", TSDataType.TEXT), - new MeasurementSchema("s6", TSDataType.BOOLEAN), - new MeasurementSchema("s7", TSDataType.TIMESTAMP), - new MeasurementSchema("s8", TSDataType.BLOB), - new MeasurementSchema("s9", TSDataType.STRING), - new MeasurementSchema("s10", TSDataType.DATE), - new MeasurementSchema("s11", TSDataType.TIMESTAMP)), - 10); - tablet.addTimestamp(0, 0L); - tablet.addValue("s1", 0, 1); - tablet.addValue("s2", 0, 1L); - tablet.addValue("s3", 0, 0f); - tablet.addValue("s4", 0, 0d); - tablet.addValue("s5", 0, "text_value"); - tablet.addValue("s6", 0, true); - tablet.addValue("s7", 0, 1L); - tablet.addValue("s8", 0, new 
Binary(new byte[] {1})); - tablet.addValue("s9", 0, "string_value"); - tablet.addValue("s10", 0, DateUtils.parseIntToLocalDate(20250403)); - tablet.initBitMaps(); - tablet.bitMaps[10].mark(0); - tablet.rowSize = 1; - sessionPool.insertAlignedTablet(tablet); - - try (SessionDataSetWrapper dataSet = - sessionPool.executeQueryStatement("select * from root.sg.d1")) { - SessionDataSet.DataIterator iterator = dataSet.iterator(); - int count = 0; - while (iterator.next()) { - count++; - Assert.assertFalse(iterator.isNull("root.sg.d1.s1")); - Assert.assertEquals(1, iterator.getInt("root.sg.d1.s1")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s2")); - Assert.assertEquals(1L, iterator.getLong("root.sg.d1.s2")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s3")); - Assert.assertEquals(0, iterator.getFloat("root.sg.d1.s3"), 0.01); - Assert.assertFalse(iterator.isNull("root.sg.d1.s4")); - Assert.assertEquals(0, iterator.getDouble("root.sg.d1.s4"), 0.01); - Assert.assertFalse(iterator.isNull("root.sg.d1.s5")); - Assert.assertEquals("text_value", iterator.getString("root.sg.d1.s5")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s6")); - Assert.assertTrue(iterator.getBoolean("root.sg.d1.s6")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s7")); - Assert.assertEquals(new Timestamp(1), iterator.getTimestamp("root.sg.d1.s7")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s8")); - Assert.assertEquals(new Binary(new byte[] {1}), iterator.getBlob("root.sg.d1.s8")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s9")); - Assert.assertEquals("string_value", iterator.getString("root.sg.d1.s9")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s10")); - Assert.assertEquals( - DateUtils.parseIntToLocalDate(20250403), iterator.getDate("root.sg.d1.s10")); - Assert.assertTrue(iterator.isNull("root.sg.d1.s11")); - Assert.assertNull(iterator.getTimestamp("root.sg.d1.s11")); - - Assert.assertEquals(new Timestamp(0), iterator.getTimestamp("Time")); - 
Assert.assertFalse(iterator.isNull("Time")); - } - Assert.assertEquals(tablet.rowSize, count); - } - } - - private static void constructSessionPool() { - List nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("root") - .maxSize(3) - .build(); - } - - public static void closeSessionPool() { - sessionPool.close(); - } -} -``` - - - -### 3. Native Interface Description - -#### 3.1 Parameter List - -The Session class has the following fields, which can be set through the constructor or the Session.Builder method: - -| **Field Name** | **Type** | **Description** | -| -------------------------------- | ----------------------------------- | ------------------------------------------------------------ | -| `nodeUrls` | `List` | List of URLs for database nodes, supporting multiple node connections | -| `username` | `String` | Username | -| `password` | `String` | Password | -| `fetchSize` | `int` | Default batch size for query results | -| `useSSL` | `boolean` | Whether to enable SSL | -| `trustStore` | `String` | Path to the trust store | -| `trustStorePwd` | `String` | Password for the trust store | -| `queryTimeoutInMs` | `long` | Query timeout in milliseconds. Default value: -1. A negative value means the server default configuration is used, and 0 disables query timeout. 
| -| `enableRPCCompression` | `boolean` | Whether to enable RPC compression | -| `connectionTimeoutInMs` | `int` | Connection timeout in milliseconds | -| `zoneId` | `ZoneId` | Time zone setting for the session | -| `thriftDefaultBufferSize` | `int` | Default buffer size for Thrift | -| `thriftMaxFrameSize` | `int` | Maximum frame size for Thrift | -| `defaultEndPoint` | `TEndPoint` | Default database endpoint information | -| `defaultSessionConnection` | `SessionConnection` | Default session connection object | -| `isClosed` | `boolean` | Whether the current session is closed | -| `enableRedirection` | `boolean` | Whether to enable redirection | -| `enableRecordsAutoConvertTablet` | `boolean` | Whether to automatically convert inserted records into Tablets | -| `deviceIdToEndpoint` | `Map` | Mapping of device IDs to database endpoints | -| `endPointToSessionConnection` | `Map` | Mapping of database endpoints to session connections | -| `executorService` | `ScheduledExecutorService` | Thread pool for periodically updating the node list | -| `availableNodes` | `INodeSupplier` | Supplier of available nodes | -| `enableQueryRedirection` | `boolean` | Whether to enable query redirection | -| `version` | `Version` | Client version number, used for compatibility checks with the server | -| `enableAutoFetch` | `boolean` | Whether to enable automatic fetching | -| `maxRetryCount` | `int` | Maximum number of retries | -| `retryIntervalInMs` | `long` | Retry interval in milliseconds | - - - -#### 3.2 Interface List - -##### 3.2.1 Metadata Management - -| **Method Name** | **Function Description** | **Parameter Explanation** | -| ------------------------------------------------------------ | ---------------------------------------------- | ------------------------------------------------------------ | -| `createDatabase(String database)` | Create a database | `database`: The name of the database to be created | -| `deleteDatabase(String 
database)` | Delete a specified database | `database`: The name of the database to be deleted | -| `deleteDatabases(List databases)` | Batch delete databases | `databases`: A list of database names to be deleted | -| `createTimeseries(String path, TSDataType dataType, TSEncoding encoding, CompressionType compressor)` | Create a single time series | `path`: The path of the time series,`dataType`: The data type,`encoding`: The encoding type,`compressor`: The compression type | -| `createAlignedTimeseries(...)` | Create aligned time series | Device ID, list of measurement points, list of data types, list of encodings, list of compression types | -| `createMultiTimeseries(...)` | Batch create time series | Multiple paths, data types, encodings, compression types, properties, tags, aliases, etc. | -| `deleteTimeseries(String path)` | Delete a time series | `path`: The path of the time series to be deleted | -| `deleteTimeseries(List paths)` | Batch delete time series | `paths`: A list of time series paths to be deleted | -| `setSchemaTemplate(String templateName, String prefixPath)` | Set a schema template | `templateName`: The name of template,`prefixPath`: The path where the template is applied | -| `createSchemaTemplate(Template template)` | Create a schema template | `template`: The template object | -| `dropSchemaTemplate(String templateName)` | Delete a schema template | `templateName`: The name of template to be deleted | -| `addAlignedMeasurementsInTemplate(...)` | Add aligned measurements to a template | Template name, list of measurement paths, data type, encoding type, compression type | -| `addUnalignedMeasurementsInTemplate(...)` | Add unaligned measurements to a template | Same as above | -| `deleteNodeInTemplate(String templateName, String path)` | Delete a node in a template | `templateName`: The name of template,`path`: The path to be deleted | -| `countMeasurementsInTemplate(String name)` | Count the number of measurements in a template | `name`: The 
name of the template | -| `isMeasurementInTemplate(String templateName, String path)` | Check if a measurement exists in a template | `templateName`: The name of the template, `path`: The path of the measurement | -| `isPathExistInTemplate(String templateName, String path)` | Check if a path exists in a template | Same as above | -| `showMeasurementsInTemplate(String templateName)` | Show measurements in a template | `templateName`: The name of the template | -| `showMeasurementsInTemplate(String templateName, String pattern)` | Show measurements in a template by pattern | `templateName`: The name of the template, `pattern`: The matching pattern | -| `showAllTemplates()` | Show all templates | No parameters | -| `showPathsTemplateSetOn(String templateName)` | Show paths where a template is set | `templateName`: The name of the template | -| `showPathsTemplateUsingOn(String templateName)` | Show actual paths using a template | Same as above | -| `unsetSchemaTemplate(String prefixPath, String templateName)` | Unset the template setting for a path | `prefixPath`: The path, `templateName`: The name of the template | - - -##### 3.2.2 Data Insertion - -| **Method Name** | **Function Description** | **Parameter Explanation** | -| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -| `insertRecord(String deviceId, long time, List measurements, List types, Object... 
values)` | Insert a single record | `deviceId`: Device ID, `time`: Timestamp, `measurements`: List of measurement points, `types`: List of data types, `values`: List of values | -| `insertRecord(String deviceId, long time, List measurements, List values)` | Insert a single record | `deviceId`: Device ID, `time`: Timestamp, `measurements`: List of measurement points, `values`: List of values | -| `insertRecords(List deviceIds, List times, List> measurementsList, List> valuesList)` | Insert multiple records | `deviceIds`: List of device IDs, `times`: List of timestamps, `measurementsList`: List of lists of measurement points, `valuesList`: List of lists of values | -| `insertRecords(List deviceIds, List times, List> measurementsList, List> typesList, List> valuesList)` | Insert multiple records | Same as above, plus `typesList`: List of lists of data types | -| `insertRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> typesList, List> valuesList)` | Insert multiple records for a single device | `deviceId`: Device ID, `times`: List of timestamps, `measurementsList`: List of lists of measurement points, `typesList`: List of lists of types, `valuesList`: List of lists of values | -| `insertRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> typesList, List> valuesList, boolean haveSorted)` | Insert sorted multiple records for a single device | Same as above, plus `haveSorted`: Whether the data is already sorted | -| `insertStringRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> valuesList)` | Insert string-formatted records for a single device | `deviceId`: Device ID, `times`: List of timestamps, `measurementsList`: List of lists of measurement points, `valuesList`: List of lists of values | -| `insertStringRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> valuesList, boolean haveSorted)` | Insert sorted string-formatted records for a single device | Same as above, plus `haveSorted`: Whether the 
data is already sorted | -| `insertAlignedRecord(String deviceId, long time, List measurements, List types, List values)` | Insert a single aligned record | `deviceId`: Device ID, `time`: Timestamp, `measurements`: List of measurement points, `types`: List of types, `values`: List of values | -| `insertAlignedRecord(String deviceId, long time, List measurements, List values)` | Insert a single string-formatted aligned record | `deviceId`: Device ID, `time`: Timestamp, `measurements`: List of measurement points, `values`: List of values | -| `insertAlignedRecords(List deviceIds, List times, List> measurementsList, List> valuesList)` | Insert multiple aligned records | `deviceIds`: List of device IDs, `times`: List of timestamps, `measurementsList`: List of lists of measurement points, `valuesList`: List of lists of values | -| `insertAlignedRecords(List deviceIds, List times, List> measurementsList, List> typesList, List> valuesList)` | Insert multiple aligned records | Same as above, plus `typesList`: List of lists of data types | -| `insertAlignedRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> typesList, List> valuesList)` | Insert multiple aligned records for a single device | `deviceId`: Device ID, `times`: List of timestamps, `measurementsList`: List of lists of measurement points, `typesList`: List of lists of data types, `valuesList`: List of lists of values | -| `insertAlignedRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> typesList, List> valuesList, boolean haveSorted)` | Insert sorted multiple aligned records for a single device | Same as above, plus `haveSorted`: Whether the data is already sorted | -| `insertAlignedStringRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> valuesList)` | Insert string-formatted aligned records for a single device | `deviceId`: Device ID, `times`: List of timestamps, `measurementsList`: List of lists of measurement points, `valuesList`: List of lists of values | -| `insertAlignedStringRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> valuesList, boolean haveSorted)` | Insert sorted 
string-formatted aligned records for a single device | Same as above, plus `haveSorted`: Whether the data is already sorted | -| `insertTablet(Tablet tablet)` | Insert a single Tablet | `tablet`: The Tablet to be inserted | -| `insertTablet(Tablet tablet, boolean sorted)` | Insert a sorted Tablet | Same as above, plus `sorted`: Whether the data is already sorted | -| `insertAlignedTablet(Tablet tablet)` | Insert an aligned Tablet | `tablet`: The Tablet to be inserted | -| `insertAlignedTablet(Tablet tablet, boolean sorted)` | Insert a sorted aligned Tablet | Same as above, plus `sorted`: Whether the data is already sorted | -| `insertTablets(Map tablets)` | Insert multiple Tablets in batch | `tablets`: Mapping from device IDs to Tablets | -| `insertTablets(Map tablets, boolean sorted)` | Insert multiple sorted Tablets in batch | Same as above, plus `sorted`: Whether the data is already sorted | -| `insertAlignedTablets(Map tablets)` | Insert multiple aligned Tablets in batch | `tablets`: Mapping from device IDs to Tablets | -| `insertAlignedTablets(Map tablets, boolean sorted)` | Insert multiple sorted aligned Tablets in batch | Same as above, plus `sorted`: Whether the data is already sorted | - -##### 3.2.3 Data Deletion - -| **Method Name** | **Function Description** | **Parameter Explanation** | -| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------- | -| `deleteTimeseries(String path)` | Delete a single time series | `path`: The path of the time series | -| `deleteTimeseries(List paths)` | Batch delete time series | `paths`: A list of time series paths | -| `deleteData(String path, long endTime)` | Delete historical data for a specified path | `path`: The path, `endTime`: The end timestamp | -| `deleteData(List paths, long endTime)` | Batch delete historical data for specified 
paths | `paths`: A list of paths,`endTime`: The end timestamp | -| `deleteData(List paths, long startTime, long endTime)` | Delete historical data within a time range for specified paths | Same as above, plus `startTime`: The start timestamp | - - -##### 3.2.4 Data Query - -| **Method Name** | **Function Description** | **Parameter Explanation** | -| ------------------------------------------------------------ | -------------------------------------------------------- | ------------------------------------------------------------ | -| `executeQueryStatement(String sql)` | Execute a query statement | `sql`: The query SQL statement | -| `executeQueryStatement(String sql, long timeoutInMs)` | Execute a query statement with timeout | `sql`: The query SQL statement, `timeoutInMs`: The query timeout (in milliseconds), default to the server configuration, which is 60s. | -| `executeRawDataQuery(List paths, long startTime, long endTime)` | Query raw data for specified paths | paths: A list of query paths, `startTime`: The start timestamp, `endTime`: The end timestamp | -| `executeRawDataQuery(List paths, long startTime, long endTime, long timeOut)` | Query raw data for specified paths (with timeout) | Same as above, plus `timeOut`: The timeout time | -| `executeLastDataQuery(List paths)` | Query the latest data | `paths`: A list of query paths | -| `executeLastDataQuery(List paths, long lastTime)` | Query the latest data at a specified time | `paths`: A list of query paths, `lastTime`: The specified timestamp | -| `executeLastDataQuery(List paths, long lastTime, long timeOut)` | Query the latest data at a specified time (with timeout) | Same as above, plus `timeOut`: The timeout time | -| `executeLastDataQueryForOneDevice(String db, String device, List sensors, boolean isLegalPathNodes)` | Query the latest data for a single device | `db`: The database name, `device`: The device name, `sensors`: A list of sensors, `isLegalPathNodes`: Whether the path nodes are legal | -| 
`executeAggregationQuery(List paths, List aggregations)` | Execute an aggregation query | `paths`: A list of query paths, `aggregations`: A list of aggregation types | -| `executeAggregationQuery(List paths, List aggregations, long startTime, long endTime)` | Execute an aggregation query with a time range | Same as above, plus `startTime`: The start timestamp, `endTime`: The end timestamp | -| `executeAggregationQuery(List paths, List aggregations, long startTime, long endTime, long interval)` | Execute an aggregation query with a time interval | Same as above, plus `interval`: The time interval | -| `executeAggregationQuery(List paths, List aggregations, long startTime, long endTime, long interval, long slidingStep)` | Execute a sliding window aggregation query | Same as above, plus `slidingStep`: The sliding step | -| `fetchAllConnections()` | Get information of all active connections | No parameters | - -##### 3.2.5 System Status and Backup - -| **Method Name** | **Function Description** | **Parameter Explanation** | -| -------------------------- | ----------------------------------------- | ------------------------------------------ | -| `getBackupConfiguration()` | Get backup configuration information | No parameters | -| `fetchAllConnections()` | Get information of all active connections | No parameters | -| `getSystemStatus()` | Get the system status | Deprecated, returns `SystemStatus.NORMAL` | diff --git a/src/UserGuide/Master/Tree/API/Programming-MQTT_timecho.md b/src/UserGuide/Master/Tree/API/Programming-MQTT_timecho.md deleted file mode 100644 index 0f2e6a14d..000000000 --- a/src/UserGuide/Master/Tree/API/Programming-MQTT_timecho.md +++ /dev/null @@ -1,294 +0,0 @@ - -# MQTT Protocol - -## 1. Overview - -MQTT (Message Queuing Telemetry Transport) is a lightweight messaging protocol designed for IoT and low-bandwidth environments. 
It operates on a Publish/Subscribe (Pub/Sub) model, enabling efficient and reliable bidirectional communication between devices. Its core objectives are low power consumption, minimal bandwidth usage, and high real-time performance, making it ideal for unstable networks or resource-constrained scenarios (e.g., sensors, mobile devices). - -IoTDB provides deep integration with the MQTT protocol, fully compliant with MQTT v3.1 (OASIS International Standard). The IoTDB server includes a built-in high-performance MQTT Broker module, eliminating the need for third-party middleware. Devices can directly write time-series data into the IoTDB storage engine via MQTT messages. - - - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the MQTT service JAR file by default. Please contact the Timecho team to obtain the JAR file before using this service, and place it in the `timechodb_home/lib` or `timechodb_home/ext/external_service` directory. - -## 2. Built-in MQTT Service -The built-in MQTT service lets clients connect to IoTDB directly over MQTT. It listens for publish messages from MQTT clients - and writes the data into storage immediately. -The MQTT topic corresponds to an IoTDB time series. -The message payload is converted into events by a `PayloadFormatter`, which is loaded via Java SPI; the default implementation is `JSONPayloadFormatter`. -The default `json` formatter supports two JSON formats, as well as JSON arrays of either. The following is an example MQTT message payload: - -```json - { - "device":"root.sg.d1", - "timestamp":1586076045524, - "measurements":["s1","s2"], - "values":[0.530635,0.530635] - } -``` -or -```json - { - "device":"root.sg.d1", - "timestamps":[1586076045524,1586076065526], - "measurements":["s1","s2"], - "values":[[0.530635,0.530635], [0.530655,0.530695]] - } -``` -or a JSON array of the above two. - - - -## 3. 
MQTT Configurations -The IoTDB MQTT service loads its configuration from `${IOTDB_HOME}/${IOTDB_CONF}/iotdb-system.properties` by default. - -Configurations are as follows: - -| **Property** | **Description** | **Default** | -| ------------------------ | -------------------------------------------------------------------------------------------------------------------- | ------------------- | -| `enable_mqtt_service` | Enable or disable the MQTT service. | FALSE | -| `mqtt_host` | Host address bound to the MQTT service. | 127.0.0.1 | -| `mqtt_port` | Port bound to the MQTT service. | 1883 | -| `mqtt_handler_pool_size` | Thread pool size for processing MQTT messages. | 1 | -| **`mqtt_payload_formatter`** | **Formatting method for MQTT message payloads. Options: `json` (tree mode), `line` (table mode).** | **json** | -| `mqtt_max_message_size` | Maximum allowed MQTT message size (bytes). | 1048576 | - -## 4. Coding Examples -The following example shows an MQTT client sending messages to the IoTDB server. - -```java -MQTT mqtt = new MQTT(); -mqtt.setHost("127.0.0.1", 1883); -mqtt.setUserName("root"); -mqtt.setPassword("root"); - -BlockingConnection connection = mqtt.blockingConnection(); -connection.connect(); - -Random random = new Random(); -for (int i = 0; i < 10; i++) { - String payload = String.format("{\n" + - "\"device\":\"root.sg.d1\",\n" + - "\"timestamp\":%d,\n" + - "\"measurements\":[\"s1\"],\n" + - "\"values\":[%f]\n" + - "}", System.currentTimeMillis(), random.nextDouble()); - - connection.publish("root.sg.d1.s1", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -} - -connection.disconnect(); - -``` - -## 5. Customize your MQTT Message Format - -In a production environment, each device typically has its own MQTT client, and the message formats of these clients have been pre-defined. 
Adopting the MQTT message format supported by IoTDB would require upgrading every existing client, which would incur significant costs. Instead, you can customize the MQTT message format with a small amount of code, without modifying the clients. -An example can be found in the [example/mqtt-customize](https://github.com/apache/iotdb/tree/rc/2.0.1/example/mqtt-customize) project. - -Assume the MQTT client sends messages in the following format: -```json - { - "time":1586076045523, - "deviceID":"car_1", - "deviceType":"Gasoline car", - "point":"Fuel level", - "value":10.0 -} -``` -Or as a JSON array: -```json -[ - { - "time":1586076045523, - "deviceID":"car_1", - "deviceType":"Gasoline car", - "point":"Fuel level", - "value":10.0 - }, - { - "time":1586076045524, - "deviceID":"car_2", - "deviceType":"NEV (new energy vehicle)", - "point":"Speed", - "value":80.0 - } -] -``` - -Then you can set up the custom MQTT message format through the following steps: - -1. Create a Java project and add the dependency: -```xml - <dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-server</artifactId> - <version>2.0.4-SNAPSHOT</version> - </dependency> -``` -2. 
Define your implementation which implements `org.apache.iotdb.db.protocol.mqtt.PayloadFormatter` -e.g., - -```java -package org.apache.iotdb.mqtt.server; - -import org.apache.iotdb.db.protocol.mqtt.Message; -import org.apache.iotdb.db.protocol.mqtt.PayloadFormatter; -import org.apache.iotdb.db.protocol.mqtt.TableMessage; - -import com.google.common.collect.Lists; -import com.google.gson.Gson; -import com.google.gson.GsonBuilder; -import com.google.gson.JsonArray; -import com.google.gson.JsonElement; -import com.google.gson.JsonObject; -import com.google.gson.JsonParseException; -import io.netty.buffer.ByteBuf; -import org.apache.commons.lang3.NotImplementedException; -import org.apache.tsfile.enums.TSDataType; - -import java.nio.charset.StandardCharsets; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.List; - -/** - * The Customized JSON payload formatter. one json format supported: { "time":1586076045523, - * "deviceID":"car_1", "deviceType":"NEV", "point":"Speed", "value":80.0 } - */ -public class CustomizedJsonPayloadFormatter implements PayloadFormatter { - private static final String JSON_KEY_TIME = "time"; - private static final String JSON_KEY_DEVICEID = "deviceID"; - private static final String JSON_KEY_DEVICETYPE = "deviceType"; - private static final String JSON_KEY_POINT = "point"; - private static final String JSON_KEY_VALUE = "value"; - private static final Gson GSON = new GsonBuilder().create(); - - @Override - public List format(String topic, ByteBuf payload) { - if (payload == null) { - return new ArrayList<>(); - } - String txt = payload.toString(StandardCharsets.UTF_8); - JsonElement jsonElement = GSON.fromJson(txt, JsonElement.class); - if (jsonElement.isJsonObject()) { - JsonObject jsonObject = jsonElement.getAsJsonObject(); - return formatTableRow(topic, jsonObject); - } else if (jsonElement.isJsonArray()) { - JsonArray jsonArray = jsonElement.getAsJsonArray(); - List messages = new ArrayList<>(); - for (JsonElement 
element : jsonArray) { - JsonObject jsonObject = element.getAsJsonObject(); - messages.addAll(formatTableRow(topic, jsonObject)); - } - return messages; - } - throw new JsonParseException("payload is invalid"); - } - - @Override - @Deprecated - public List format(ByteBuf payload) { - throw new NotImplementedException(); - } - - private List formatTableRow(String topic, JsonObject jsonObject) { - TableMessage message = new TableMessage(); - String database = !topic.contains("/") ? topic : topic.substring(0, topic.indexOf("/")); - String table = "test_table"; - - // Parsing Database Name - message.setDatabase((database)); - - // Parsing Table Name - message.setTable(table); - - // Parsing Tags - List tagKeys = new ArrayList<>(); - tagKeys.add(JSON_KEY_DEVICEID); - List tagValues = new ArrayList<>(); - tagValues.add(jsonObject.get(JSON_KEY_DEVICEID).getAsString()); - message.setTagKeys(tagKeys); - message.setTagValues(tagValues); - - // Parsing Attributes - List attributeKeys = new ArrayList<>(); - List attributeValues = new ArrayList<>(); - attributeKeys.add(JSON_KEY_DEVICETYPE); - attributeValues.add(jsonObject.get(JSON_KEY_DEVICETYPE).getAsString()); - message.setAttributeKeys(attributeKeys); - message.setAttributeValues(attributeValues); - - // Parsing Fields - List fields = Arrays.asList(JSON_KEY_POINT); - List dataTypes = Arrays.asList(TSDataType.FLOAT); - List values = Arrays.asList(jsonObject.get(JSON_KEY_VALUE).getAsFloat()); - message.setFields(fields); - message.setDataTypes(dataTypes); - message.setValues(values); - - // Parsing timestamp - message.setTimestamp(jsonObject.get(JSON_KEY_TIME).getAsLong()); - return Lists.newArrayList(message); - } - - @Override - public String getName() { - // set the value of mqtt_payload_formatter in iotdb-system.properties to the following string: - return "CustomizedJson2Table"; - } - - @Override - public String getType() { - return PayloadFormatter.TABLE_TYPE; - } -} -``` -3. 
Modify the file `src/main/resources/META-INF/services/org.apache.iotdb.db.protocol.mqtt.PayloadFormatter`: - clear the file and put the fully qualified name of your implementation class in it. - In this example, the content is: `org.apache.iotdb.mqtt.server.CustomizedJsonPayloadFormatter` -4. Compile your implementation into a JAR file: `mvn package -DskipTests` - - -Then, on your server: -1. Create the `${IOTDB_HOME}/ext/mqtt/` folder and put the JAR into it. -2. Update the configuration to enable the MQTT service (`enable_mqtt_service=true` in `conf/iotdb-system.properties`). -3. Set `mqtt_payload_formatter` in `conf/iotdb-system.properties` to the value returned by `getName()` in your implementation - ; in this example, the value is `CustomizedJson2Table`. -4. Launch the IoTDB server. -5. IoTDB will now use your implementation to parse MQTT messages. - -More: the message format can be anything you want. For example, if it is a binary format, -just use `payload.forEachByte()` or `payload.array()` to get the byte content. - - -## 6. Caution - -To avoid compatibility issues caused by a default client_id, always explicitly supply a unique, non-empty client_id in every MQTT client. -Behavior varies when the client_id is missing or empty. Common examples: -1. Explicitly sending an empty string - • MQTTX: When client_id="", IoTDB silently discards the message. - • mosquitto_pub: When client_id="", IoTDB receives the message normally. -2. Omitting client_id entirely - • MQTTX: IoTDB accepts the message. - • mosquitto_pub: IoTDB rejects the connection. - Therefore, explicitly assigning a unique, non-empty client_id is the simplest way to eliminate these discrepancies and ensure reliable message delivery. 
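
The advice above can be sketched in Java as a small helper that always yields a unique, non-empty client_id. This is only an illustration under stated assumptions: the class `MqttClientIds`, the method `newClientId`, and the `sensor` prefix are hypothetical names, and the 23-character cap follows the client identifier limit in the MQTT 3.1 specification rather than anything IoTDB itself is known to enforce.

```java
import java.util.UUID;

// Hypothetical helper (not part of IoTDB): builds a unique, non-empty MQTT client_id.
class MqttClientIds {
    // Assumption: stay within the 23-character limit that MQTT 3.1 places on client identifiers.
    static final int MAX_LEN = 23;

    static String newClientId(String prefix) {
        if (prefix == null || prefix.isEmpty()) {
            throw new IllegalArgumentException("prefix must be non-empty");
        }
        // 12 hex characters taken from a random UUID keep the ID short but practically unique.
        String random = UUID.randomUUID().toString().replace("-", "").substring(0, 12);
        String id = prefix + "-" + random;
        if (id.length() > MAX_LEN) {
            throw new IllegalArgumentException("client_id exceeds " + MAX_LEN + " characters: " + id);
        }
        return id;
    }

    public static void main(String[] args) {
        // Prints a value of the form sensor-<12 hex characters>.
        System.out.println(newClientId("sensor"));
    }
}
```

With the client used in the coding example, the generated value would be passed via `mqtt.setClientId(...)` before `blockingConnection()` is called; other MQTT clients expose an equivalent option.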
diff --git a/src/UserGuide/Master/Tree/API/Programming-ODBC_timecho.md b/src/UserGuide/Master/Tree/API/Programming-ODBC_timecho.md deleted file mode 100644 index 287211667..000000000 --- a/src/UserGuide/Master/Tree/API/Programming-ODBC_timecho.md +++ /dev/null @@ -1,996 +0,0 @@ - - -# ODBC - -## 1. Feature Introduction -The IoTDB ODBC driver lets applications interact with the database through the standard ODBC interface. It currently supports database connection, data query, data insertion, data modification, and data deletion operations, and is compatible with various applications and toolchains that support the ODBC protocol. - -> Note: This feature is supported starting from V2.0.8.2. - - - -## 2. Usage Method -It is recommended to install using the pre-compiled binary package. There is no need to compile it yourself; simply use the script to complete the driver installation and system registration. Currently, only Windows systems are supported. - -### 2.1 Environment Requirements -Only the ODBC Driver Manager dependency at the operating system level is required; no compilation environment configuration is needed: - -| **Operating System** | **Requirements and Installation Method** | -| :--- | :--- | -| Windows | 1. **Windows 10/11, Server 2016/2019/2022**: Comes with ODBC Driver Manager version 17/18 built-in; no extra installation needed. 2. **Windows 8.1/Server 2012 R2**: Requires manual installation of the corresponding version of the ODBC Driver Manager. | - -### 2.2 Installation Steps -1. Contact the Timecho team to obtain the pre-compiled binary package. - Binary package directory structure: - ```Plain - ├── bin/ - │ ├── apache_iotdb_odbc.dll - │ └── install_driver.exe - ├── install.bat - └── registry.bat - ``` -2. 
Open a command line tool (CMD/PowerShell) with **Administrator privileges** and run the following command: (You can replace the path with any absolute path) - ```Bash - install.bat "C:\Program Files\Apache IoTDB ODBC Driver" - ``` - The script automatically completes the following operations: - * Creates the installation directory (if it does not exist). - * Copies `bin\apache_iotdb_odbc.dll` to the specified installation directory. - * Calls `install_driver.exe` to register the driver to the system via the ODBC standard API (`SQLInstallDriverEx`). -3. Verify installation: Open "ODBC Data Source Administrator". If you can see `Apache IoTDB ODBC Driver` in the "Drivers" tab, the registration was successful. - ![](/img/odbc-1-en.png) - -### 2.3 Uninstallation Steps -1. Open Command Prompt as Administrator and `cd` into the project root directory. -2. Run the uninstallation script: - ```Bash - uninstall.bat - ``` - The script will call `install_driver.exe` to unregister the driver from the system via the ODBC standard API (`SQLRemoveDriver`). The DLL files in the installation directory will not be automatically deleted; please delete them manually if cleanup is required. - -### 2.4 Connection Configuration -After installing the driver, you need to configure a Data Source Name (DSN) to allow applications to connect to the database using the DSN name. The IoTDB ODBC driver supports two methods for configuring connection parameters: via Data Source and via Connection String. - -#### 2.4.1 Configuring Data Source -**Configure via ODBC Data Source Administrator** -1. Open "ODBC Data Source Administrator", switch to the "User DSN" tab, and click the "Add" button. - ![](/img/odbc-2-en.png) -2. Select "Apache IoTDB ODBC Driver" from the pop-up driver list and click "Finish". - ![](/img/odbc-3-en.png) -3. The data source configuration dialog will appear. 
Fill in the connection parameters and click OK: - ![](/img/odbc-4-en.png) - The meaning of each field in the dialog box is as follows: - - | **Area** | **Field** | **Description** | - | :--- | :--- | :--- | - | Data Source | DSN Name | Data Source Name; applications refer to this data source by this name. | - | Data Source | Description | Data Source description (optional). | - | Connection | Server | IoTDB server IP address, default 127.0.0.1. | - | Connection | Port | IoTDB Session API port, default 6667. | - | Connection | User | Username, default root. | - | Connection | Password | Password, default root. | - | Options | Table Model | Check to use Table Model; uncheck to use Tree Model. | - | Options | Database | Database name. Only available in Table Model mode; grayed out in Tree Model. | - | Options | Log Level | Log level (0-4): 0=OFF, 1=ERROR, 2=WARN, 3=INFO, 4=TRACE. | - | Options | Session Timeout | Session timeout time (milliseconds); 0 means no timeout. Note: The server-side `queryTimeoutThreshold` defaults to 60000ms; exceeding this value requires modifying server configuration. | - | Options | Batch Size | Number of rows fetched per batch, default 1000. Setting to 0 resets to the default value. | - -4. After filling in the details, you can click the "Test Connection" button to test the connection. Testing will attempt to connect to the IoTDB server using the current parameters and execute a `SHOW VERSION` query. If successful, the server version information will be displayed; if failed, the specific error reason will be shown. -5. Once parameters are confirmed correct, click "OK" to save. The data source will appear in the "User DSN" list, as shown in the example below with the name "123". - ![](/img/odbc-5-en.png) - To modify the configuration of an existing data source, select it in the list and click the "Configure" button to edit again. 
- -#### 2.4.2 Connection String -The connection string format is **semicolon-separated key-value pairs**, for example: -```Bash -Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;isTableModel=false;loglevel=2 -``` -Specific field attributes are introduced in the table below: - -| **Field Name** | **Description** | **Optional Values** | **Default Value** | -| :--- | :--- |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :--- | -| DSN | Data Source Name | Custom data source name | - | -| uid | Database username | Any string | root | -| pwd | Database password | Any string | root | -| server | IoTDB server address | IP address | 127.0.0.1 | -| port | IoTDB server port | Port number | 6667 | -| database | Database name (only effective in Table Model mode) | Any string | Empty string | -| loglevel | Log level | Integer value (0-4) | 4 (LOG_LEVEL_TRACE) | -| isTableModel / tablemodel | Whether to enable Table Model mode | Boolean type, supports multiple representations:
1. 0, false, no, off: set to false;
2. 1, true, yes, on: set to true;
3. Other values default to true. | true | -| sessiontimeoutms | Session timeout (milliseconds) | 64-bit integer, defaults to `LLONG_MAX`; a value of `0` is replaced with `LLONG_MAX`. Note: The server has its own timeout setting, `private long queryTimeoutThreshold = 60000;`, which must be raised on the server to allow timeouts longer than 60 seconds. | LLONG_MAX | -| batchsize | Number of rows fetched per batch | 64-bit integer, defaults to `1000`; a value of `0` is replaced with `1000` | 1000 | - -Notes: -* Field names are case-insensitive (automatically converted to lowercase for comparison). -* Connection string format is semicolon-separated key-value pairs, e.g., `Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;isTableModel=false;loglevel=2`. -* For boolean fields (`isTableModel`), multiple representations are supported. -* All fields are optional; if not specified, default values are used. -* Unsupported fields are ignored and a warning is logged; they do not affect the connection. -* The default port 6667 is the default port of IoTDB's C++ Session interface, which this ODBC driver uses to transfer data with IoTDB. If the C++ Session interface on the IoTDB server uses a non-default port, the port in the ODBC connection string must be changed accordingly. - -#### 2.4.3 Relationship between Data Source Configuration and Connection String -Configurations saved in the ODBC Data Source Administrator are written into the system's ODBC data source configuration as key-value pairs (corresponding to the registry `HKEY_CURRENT_USER\SOFTWARE\ODBC\ODBC.INI` under Windows). When an application uses `SQLConnect` or specifies `DSN=DataSourceName` in the connection string, the driver reads these parameters from the system configuration. - -**The priority of the connection string is higher than the configuration saved in the DSN.** Specific rules are as follows: -1. 
If the connection string contains `DSN=xxx` and does not contain `DRIVER=...`, the driver first loads all parameters of that DSN from the system configuration as base values. -2. Then, parameters explicitly specified in the connection string will override parameters with the same name in the DSN. -3. If the connection string contains `DRIVER=...`, no DSN parameters will be read from the system configuration; it will rely entirely on the connection string. - -For example: If the DSN is configured with `Server=192.168.1.100` and `Port=6667`, but the connection string is `DSN=MyDSN;Server=127.0.0.1`, then the actual connection will use `Server=127.0.0.1` (overridden by connection string) and `Port=6667` (from DSN). - -### 2.5 Logging -Log output during driver runtime is divided into "Driver Self-Logs" and "ODBC Manager Tracing Logs". Note the impact of log levels on performance. - -#### 2.5.1 Driver Self-Logs -* Output location: `apache_iotdb_odbc.log` in the user's home directory. -* Log level: Configured via the `loglevel` parameter in the connection string (0-4; higher levels produce more detailed output). -* Performance impact: High log levels will significantly reduce driver performance; recommended for debugging only. - -#### 2.5.2 ODBC Manager Tracing Logs -* How to enable: Open "ODBC Data Source Administrator" → "Tracing" → "Start Tracing Now". -* Precautions: Enabling this will greatly reduce driver performance; use only for troubleshooting. - -## 3. 
Interface Support - -### 3.1 Method List -The driver's support status for standard ODBC APIs is as follows: - -| ODBC/Setup API | Function | Parameter List | Parameter Description | -| :--- | :--- | :--- | :--- | -| SQLAllocHandle | Allocate ODBC handle | (SQLSMALLINT HandleType, SQLHANDLE InputHandle, SQLHANDLE *OutputHandle) | HandleType: Type of handle to allocate (ENV/DBC/STMT/DESC);
InputHandle: Parent context handle;
OutputHandle: Pointer to the returned new handle. | -| SQLBindCol | Bind column to result buffer | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN *StrLen_or_Ind) | StatementHandle: Statement handle;
ColumnNumber: Column number;
TargetType: C data type;
TargetValue: Data buffer;
BufferLength: Buffer length;
StrLen_or_Ind: Returns data length or NULL indicator. | -| SQLColAttribute | Get column attribute information | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLUSMALLINT FieldIdentifier, SQLPOINTER CharacterAttribute, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength, SQLLEN *NumericAttribute) | StatementHandle: Statement handle;
ColumnNumber: Column number;
FieldIdentifier: Attribute ID;
CharacterAttribute: Character attribute output;
BufferLength: Buffer length;
StringLength: Returned length;
NumericAttribute: Numeric attribute output. | -| SQLColumns | Query table column information | (SQLHSTMT StatementHandle, SQLCHAR *CatalogName, SQLSMALLINT NameLength1, SQLCHAR *SchemaName, SQLSMALLINT NameLength2, SQLCHAR *TableName, SQLSMALLINT NameLength3, SQLCHAR *ColumnName, SQLSMALLINT NameLength4) | StatementHandle: Statement handle;
Catalog/Schema/Table/ColumnName: Query object names;
NameLength*: Corresponding name lengths. | -| SQLConnect | Establish database connection | (SQLHDBC ConnectionHandle, SQLCHAR *ServerName, SQLSMALLINT NameLength1, SQLCHAR *UserName, SQLSMALLINT NameLength2, SQLCHAR *Authentication, SQLSMALLINT NameLength3) | ConnectionHandle: Connection handle;
ServerName: Data source name;
UserName: Username;
Authentication: Password;
NameLength*: String lengths. | -| SQLDescribeCol | Describe columns in result set | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLCHAR *ColumnName, SQLSMALLINT BufferLength, SQLSMALLINT *NameLength, SQLSMALLINT *DataType, SQLULEN *ColumnSize, SQLSMALLINT *DecimalDigits, SQLSMALLINT *Nullable) | StatementHandle: Statement handle;
ColumnNumber: Column number;
ColumnName: Column name output;
BufferLength: Buffer length;
NameLength: Returned column name length;
DataType: SQL type;
ColumnSize: Column size;
DecimalDigits: Decimal digits;
Nullable: Whether nullable. | -| SQLDisconnect | Disconnect database connection | (SQLHDBC ConnectionHandle) | ConnectionHandle: Connection handle. | -| SQLDriverConnect | Establish connection using connection string | (SQLHDBC ConnectionHandle, SQLHWND WindowHandle, SQLCHAR *InConnectionString, SQLSMALLINT StringLength1, SQLCHAR *OutConnectionString, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength2, SQLUSMALLINT DriverCompletion) | ConnectionHandle: Connection handle;
WindowHandle: Window handle;
InConnectionString: Input connection string;
StringLength1: Input length;
OutConnectionString: Output connection string;
BufferLength: Output buffer length;
StringLength2: Returned length;
DriverCompletion: Connection prompt method. | -| SQLEndTran | Commit or rollback transaction | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT CompletionType) | HandleType: Handle type;
Handle: Connection or environment handle;
CompletionType: Commit or rollback transaction. | -| SQLExecDirect | Execute SQL statement directly | (SQLHSTMT StatementHandle, SQLCHAR *StatementText, SQLINTEGER TextLength) | StatementHandle: Statement handle;
StatementText: SQL text;
TextLength: SQL length. | -| SQLFetch | Fetch next row in result set | (SQLHSTMT StatementHandle) | StatementHandle: Statement handle. | -| SQLFreeHandle | Free ODBC handle | (SQLSMALLINT HandleType, SQLHANDLE Handle) | HandleType: Handle type;
Handle: Handle to free. | -| SQLFreeStmt | Free statement-related resources | (SQLHSTMT StatementHandle, SQLUSMALLINT Option) | StatementHandle: Statement handle;
Option: Free option (close cursor/reset parameters, etc.). | -| SQLGetConnectAttr | Get connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER *StringLength) | ConnectionHandle: Connection handle;
Attribute: Attribute ID;
Value: Returned attribute value;
BufferLength: Buffer length;
StringLength: Returned length. | -| SQLGetData | Get result data | (SQLHSTMT StatementHandle, SQLUSMALLINT Col_or_Param_Num, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN *StrLen_or_Ind) | StatementHandle: Statement handle;
Col_or_Param_Num: Column number;
TargetType: C type;
TargetValue: Data buffer;
BufferLength: Buffer size;
StrLen_or_Ind: Returned length or NULL flag. | -| SQLGetDiagField | Get diagnostic field | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLSMALLINT DiagIdentifier, SQLPOINTER DiagInfo, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength) | HandleType: Handle type;
Handle: Handle;
RecNumber: Record number;
DiagIdentifier: Diagnostic field ID;
DiagInfo: Output info;
BufferLength: Buffer length;
StringLength: Returned length. | -| SQLGetDiagRec | Get diagnostic record | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLCHAR *Sqlstate, SQLINTEGER *NativeError, SQLCHAR *MessageText, SQLSMALLINT BufferLength, SQLSMALLINT *TextLength) | HandleType: Handle type;
Handle: Handle;
RecNumber: Record number;
Sqlstate: SQL state code;
NativeError: Native error code;
MessageText: Error message;
BufferLength: Buffer length;
TextLength: Returned length. | -| SQLGetInfo | Get database information | (SQLHDBC ConnectionHandle, SQLUSMALLINT InfoType, SQLPOINTER InfoValue, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength) | ConnectionHandle: Connection handle;
InfoType: Information type;
InfoValue: Return value;
BufferLength: Buffer length;
StringLength: Returned length. | -| SQLGetStmtAttr | Get statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER *StringLength) | StatementHandle: Statement handle;
Attribute: Attribute ID;
Value: Return value;
BufferLength: Buffer length;
StringLength: Returned length. | -| SQLGetTypeInfo | Get data type information | (SQLHSTMT StatementHandle, SQLSMALLINT DataType) | StatementHandle: Statement handle;
DataType: SQL data type. | -| SQLMoreResults | Get more result sets | (SQLHSTMT StatementHandle) | StatementHandle: Statement handle. | -| SQLNumResultCols | Get number of columns in result set | (SQLHSTMT StatementHandle, SQLSMALLINT *ColumnCount) | StatementHandle: Statement handle;
ColumnCount: Returned column count. | -| SQLRowCount | Get number of affected rows | (SQLHSTMT StatementHandle, SQLLEN *RowCount) | StatementHandle: Statement handle;
RowCount: Returned number of affected rows. | -| SQLSetConnectAttr | Set connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | ConnectionHandle: Connection handle;
Attribute: Attribute ID;
Value: Attribute value;
StringLength: Attribute value length. | -| SQLSetEnvAttr | Set environment attribute | (SQLHENV EnvironmentHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | EnvironmentHandle: Environment handle;
Attribute: Attribute ID;
Value: Attribute value;
StringLength: Length. | -| SQLSetStmtAttr | Set statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | StatementHandle: Statement handle;
Attribute: Attribute ID;
Value: Attribute value;
StringLength: Length. | -| SQLTables | Query table information | (SQLHSTMT StatementHandle, SQLCHAR *CatalogName, SQLSMALLINT NameLength1, SQLCHAR *SchemaName, SQLSMALLINT NameLength2, SQLCHAR *TableName, SQLSMALLINT NameLength3, SQLCHAR *TableType, SQLSMALLINT NameLength4) | StatementHandle: Statement handle;
Catalog/Schema/TableName: Table names;
TableType: Table type;
NameLength*: Corresponding lengths. | - -### 3.2 Data Type Conversion -The mapping relationship between IoTDB data types and standard ODBC data types is as follows: - -| **IoTDB Data Type** | **ODBC Data Type** | -| :--- | :--- | -| BOOLEAN | SQL_BIT | -| INT32 | SQL_INTEGER | -| INT64 | SQL_BIGINT | -| FLOAT | SQL_REAL | -| DOUBLE | SQL_DOUBLE | -| TEXT | SQL_VARCHAR | -| STRING | SQL_VARCHAR | -| BLOB | SQL_LONGVARBINARY | -| TIMESTAMP | SQL_BIGINT | -| DATE | SQL_DATE | - -## 4. Operation Examples -This chapter mainly introduces full-type operation examples for **C#**, **Python**, **C++**, **PowerBI**, and **Excel**, covering core operations such as data query, insertion, and deletion. - -### 4.1 C# Example - -```C# -/******* -Note: When the output contains Chinese characters, it may cause garbled text. -This is because the table.Write() function cannot output strings in UTF-8 encoding -and can only output using GB2312 (or another system default encoding). This issue -may not occur in software like Power BI; it also does not occur when using the Console.WriteLine function. -This is an issue with the ConsoleTable package. 
-*****/ - -using System.Data.Common; -using System.Data.Odbc; -using System.Reflection.PortableExecutable; -using ConsoleTables; -using System; - -/// Executes a SELECT query and outputs the results of root.full.fulldevice in table format -void Query(OdbcConnection dbConnection) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - dbCommand.CommandText = "SELECT * FROM root.full.fulldevice WHERE time >= 1735689600000 AND time <= 1735690790000"; - using (OdbcDataReader dbReader = dbCommand.ExecuteReader()) - { - var fCount = dbReader.FieldCount; - Console.WriteLine($"fCount = {fCount}"); - - // Output header row - var columns = new string[fCount]; - for (var i = 0; i < fCount; i++) - { - var fName = dbReader.GetName(i); - if (fName.Contains('.')) - { - fName = fName.Substring(fName.LastIndexOf('.') + 1); - } - columns[i] = fName; - } - - // Output content rows - var table = new ConsoleTable(columns); - while (dbReader.Read()) - { - var row = new object[fCount]; - for (var i = 0; i < fCount; i++) - { - if (dbReader.IsDBNull(i)) - { - row[i] = null; - continue; - } - row[i] = dbReader.GetValue(i); - } - table.AddRow(row); - } - table.Write(); - Console.WriteLine(); - } - } - } - catch (Exception ex) - { - Console.WriteLine(ex.ToString()); - } -} - -/// Executes non-query SQL statements (such as INSERT; Tree Model INSERT will auto-create paths) -void Execute(OdbcConnection dbConnection, string command) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - try - { - dbCommand.CommandText = command; - Console.WriteLine($"Execute command: {command}"); - dbCommand.ExecuteNonQuery(); - } - catch (Exception ex) - { - Console.WriteLine($"CommandText error: {ex.Message}"); - } - } - } - catch (OdbcException ex) - { - Console.WriteLine($"Database error: {ex.Message}"); - } - catch (Exception ex) - { - Console.WriteLine($"Unknown error occurred: {ex.Message}"); - } -} - -var dsn = "Apache IoTDB DSN"; -var user = "root"; 
-var password = "root"; -var server = "127.0.0.1"; -var connectionString = $"DSN={dsn};Server={server};UID={user};PWD={password};loglevel=4;istablemodel=0"; - -using (OdbcConnection dbConnection = new OdbcConnection(connectionString)) -{ - Console.WriteLine($"Start"); - try - { - dbConnection.Open(); - } - catch (Exception ex) - { - Console.WriteLine($"Login failed: {ex.Message}"); - Console.WriteLine($"Stack Trace: {ex.StackTrace}"); - dbConnection.Dispose(); - return; - } - - string[] insertStatements = new string[] - { - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, 'Device operating normally', 'DeviceA-Room1', 1735689600000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, 'Device operating normally', 'DeviceA-Room1', 1735689660000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, 'Device operating normally', 'DeviceA-Room1', 1735689720000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, 'Device temperature high alarm', 'DeviceA-Room1', 1735689780000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, 'Device status returned to normal', 'DeviceA-Room1', 1735689840000, '2026-01-04')", - "INSERT INTO 
root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, 'Device operating normally', 'DeviceB-Room2', 1735689900000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, 'Device operating normally', 'DeviceB-Room2', 1735689960000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, 'Device humidity low alarm', 'DeviceB-Room2', 1735690020000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, 'Device status returned to normal', 'DeviceB-Room2', 1735690080000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 'Device operating normally', 'DeviceC-Room3', 1735690140000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, 'Device operating normally', 'DeviceC-Room3', 1735690200000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, 'Device voltage unstable alarm', 'DeviceC-Room3', 1735690260000, 
'2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, 'Device status returned to normal', 'DeviceC-Room3', 1735690320000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, 'Device operating normally', 'DeviceD-Room4', 1735690380000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, 'Device operating normally', 'DeviceD-Room4', 1735690440000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, 'Device operating normally', 'DeviceD-Room4', 1735690500000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, 'Device signal interrupted alarm', 'DeviceD-Room4', 1735690560000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, 'Device operating normally', 'DeviceE-Room5', 1735690620000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, 'Device operating normally', 
'DeviceE-Room5', 1735690680000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', 1735690740000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', 1735690740000, '2026-01-04')" - }; - - foreach (var insert in insertStatements) - { - Execute(dbConnection, insert); - } - Console.WriteLine($"[DEBUG] Inserted {insertStatements.Length} rows. Begin to query."); - - Query(dbConnection); // Execute query and output results -} -``` - - -### 4.2 Python Example -1. To access ODBC via Python, install the `pyodbc` package: - ```Plain - pip install pyodbc - ``` -2. Full Code: - - -```Python -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -""" -Apache IoTDB ODBC Python Example - Tree Model -Uses pyodbc to connect to the IoTDB ODBC driver, using istablemodel=0 for Tree Model. -Functionality references examples/BasicTest/TreeTest/TreeTest.cs and examples/cpp-example/TreeTest.cpp. 
-""" - -import pyodbc - -def execute(conn: pyodbc.Connection, command: str) -> None: - """Executes non-query SQL statements (such as INSERT; Tree Model INSERT will auto-create paths)""" - try: - with conn.cursor() as cursor: - cursor.execute(command) - cmd_upper = command.strip().upper() - if cmd_upper.startswith(("INSERT", "UPDATE", "DELETE")): - conn.commit() - print(f"Execute command: {command}") - except pyodbc.Error as ex: - print(f"CommandText error: {ex}") - -def query(conn: pyodbc.Connection, sql: str) -> None: - """Executes a SELECT query and outputs the results of root.full.fulldevice in table format""" - try: - with conn.cursor() as cursor: - cursor.execute(sql) - col_count = len(cursor.description) - print(f"fCount = {col_count}") - - if col_count <= 0: - return - - columns = [] - for i in range(col_count): - col_name = cursor.description[i][0] or f"Column{i}" - if "." in str(col_name): - col_name = str(col_name).split(".")[-1] - columns.append(str(col_name)) - - rows = cursor.fetchall() - - # Calculate column widths - col_widths = [max(len(str(col)), 4) for col in columns] - for row in rows: - for j, val in enumerate(row): - if j < len(col_widths): - col_widths[j] = max(col_widths[j], len(str(val) if val is not None else "NULL")) - - # Print header - header = " | ".join(str(c).ljust(col_widths[i]) for i, c in enumerate(columns)) - print(header) - print("-" * len(header)) - - # Print rows - for row in rows: - values = [] - for i, val in enumerate(row): - if val is None: - cell = "NULL" - else: - cell = str(val) - values.append(cell.ljust(col_widths[i]) if i < len(col_widths) else cell) - print(" | ".join(values)) - - print() - except pyodbc.Error as ex: - print(f"Query error: {ex}") - -def main() -> None: - dsn = "Apache IoTDB DSN" - user = "root" - password = "root" - server = "127.0.0.1" - connection_string = ( - f"DSN={dsn};Server={server};UID={user};PWD={password};" - f"loglevel=4;istablemodel=0" - ) - - print("Start") - try: - conn = 
pyodbc.connect(connection_string) - except pyodbc.Error as ex: - print(f"Login failed: {ex}") - return - - try: - driver_name = conn.getinfo(6) # SQL_DRIVER_NAME - print(f"Successfully opened connection. driver = {driver_name}") - except Exception: - print("Successfully opened connection.") - - try: - insert_statements = [ - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689600001, true, 100, 10000000000, 36.5, 128.689, 'Device operating normally', 'DeviceA-Room1', 1735689600000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, 'Device operating normally', 'DeviceA-Room1', 1735689660000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, 'Device operating normally', 'DeviceA-Room1', 1735689720000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, 'Device temperature high alarm', 'DeviceA-Room1', 1735689780000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, 'Device status returned to normal', 'DeviceA-Room1', 1735689840000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, 'Device 
operating normally', 'DeviceB-Room2', 1735689900000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, 'Device operating normally', 'DeviceB-Room2', 1735689960000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, 'Device humidity low alarm', 'DeviceB-Room2', 1735690020000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, 'Device status returned to normal', 'DeviceB-Room2', 1735690080000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 'Device operating normally', 'DeviceC-Room3', 1735690140000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, 'Device operating normally', 'DeviceC-Room3', 1735690200000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, 'Device voltage unstable alarm', 'DeviceC-Room3', 1735690260000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690320000, true, 112, 
10000000012, 37.7, 129.889, 'Device status returned to normal', 'DeviceC-Room3', 1735690320000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, 'Device operating normally', 'DeviceD-Room4', 1735690380000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, 'Device operating normally', 'DeviceD-Room4', 1735690440000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, 'Device operating normally', 'DeviceD-Room4', 1735690500000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, 'Device signal interrupted alarm', 'DeviceD-Room4', 1735690560000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, 'Device operating normally', 'DeviceE-Room5', 1735690620000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, 'Device operating normally', 'DeviceE-Room5', 1735690680000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) 
VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', 1735690740000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', 1735690740000, '2026-01-04')", - ] - - for insert_sql in insert_statements: - execute(conn, insert_sql) - - print(f"[DEBUG] Inserted {len(insert_statements)} rows. Begin to query.") - - query_sql = "SELECT * FROM root.full.fulldevice WHERE time >= 1735689600000 AND time <= 1735690790000" - query(conn, query_sql) - print("Query ok") - finally: - conn.close() - -if __name__ == "__main__": - main() -``` - - -### 4.3 C++ Example - - -```C++ -#define WIN32_LEAN_AND_MEAN -#include <windows.h> - -#include <sql.h> -#include <sqlext.h> -#include <cstdlib> -#include <iostream> -#include <string> -#include <vector> - -#ifndef SQL_DIAG_COLUMN_SIZE -#define SQL_DIAG_COLUMN_SIZE 33L -#endif - -// Helper function to check ODBC errors -void CheckOdbcError(SQLRETURN retCode, SQLSMALLINT handleType, SQLHANDLE handle, const char* functionName) { - if (retCode == SQL_SUCCESS || retCode == SQL_SUCCESS_WITH_INFO) { - return; - } - - SQLCHAR sqlState[6]; - SQLCHAR message[SQL_MAX_MESSAGE_LENGTH]; - SQLINTEGER nativeError; - SQLSMALLINT textLength; - SQLRETURN errRet; - errRet = SQLGetDiagRec(handleType, handle, 1, sqlState, &nativeError, message, sizeof(message), &textLength); - - std::cerr << "ODBC Error in " << functionName << ":\n"; - std::cerr << " SQL State: " << sqlState << "\n"; - std::cerr << " Native Error: " << nativeError << "\n"; - std::cerr << " Message: " << message << "\n"; - std::cerr << " SQLGetDiagRec Return: " << errRet << "\n"; - - if (retCode == SQL_ERROR || retCode == SQL_INVALID_HANDLE) { - exit(1); - } -} - -// Helper function to print a simple table -void PrintSimpleTable(const std::vector<std::string>& headers, - const std::vector<std::vector<std::string>>& rows) { - 
for (size_t i = 0; i < headers.size(); i++) { - std::cout << headers[i]; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - for (size_t i = 0; i < headers.size(); i++) { - std::cout << "----------------"; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - for (const auto& row : rows) { - for (size_t i = 0; i < row.size(); i++) { - std::cout << row[i]; - if (i < row.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - } - std::cout << std::endl; -} - -/// Executes a SELECT query and outputs the results of root.full.fulldevice in table format -void Query(SQLHDBC hDbc) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN ret = SQL_SUCCESS; - - try { - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - return; - } - - const std::string sqlQuery = "SELECT * FROM root.full.fulldevice WHERE time >= 1735689600000 AND time <= 1735690790000"; - std::cout << "Execute query: " << sqlQuery << std::endl; - - ret = SQLExecDirect(hStmt, reinterpret_cast<SQLCHAR*>(const_cast<char*>(sqlQuery.c_str())), SQL_NTS); - if (!SQL_SUCCEEDED(ret)) { - if (ret != SQL_NO_DATA) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect(SELECT)"); - } - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - SQLSMALLINT colCount = 0; - ret = SQLNumResultCols(hStmt, &colCount); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLNumResultCols"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::cout << "Column count = " << colCount << std::endl; - - if (colCount <= 0) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::vector<std::string> columnNames; - std::vector<SQLSMALLINT> columnTypes(colCount); - std::vector<SQLULEN> columnSizes(colCount); - std::vector<SQLSMALLINT> decimalDigits(colCount); - std::vector<SQLSMALLINT> nullable(colCount); - - // Get basic column information - for (SQLSMALLINT i = 1; i <= colCount; i++) { -
SQLSMALLINT nameLength = 0; - ret = SQLDescribeCol(hStmt, i, NULL, 0, &nameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get length)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::vector<SQLCHAR> colNameBuffer(nameLength + 1); - SQLSMALLINT actualNameLength = 0; - - ret = SQLDescribeCol(hStmt, i, colNameBuffer.data(), nameLength + 1, - &actualNameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get name)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::string fullName(reinterpret_cast<char*>(colNameBuffer.data())); - - size_t pos = fullName.find_last_of('.'); - if (pos != std::string::npos) { - columnNames.push_back(fullName.substr(pos + 1)); - } else { - columnNames.push_back(fullName); - } - - ret = SQLDescribeCol(hStmt, i, NULL, 0, NULL, &columnTypes[i-1], - &columnSizes[i-1], &decimalDigits[i-1], &nullable[i-1]); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get type info)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - } - - std::vector<std::vector<std::string>> tableRows; - - int rowCount = 0; - // Fetch data for every row - while (true) { - ret = SQLFetch(hStmt); - if (ret == SQL_NO_DATA) { - break; - } - - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLFetch"); - break; - } - - std::vector<std::string> row; - - for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLLEN indicator = 0; - std::string valueStr; - - SQLSMALLINT cType; - size_t bufferSize; - bool isCharacterType = false; - const int maxBufferSize = 32768; - - switch (columnTypes[i-1]) { - case SQL_CHAR: - case SQL_VARCHAR: - case SQL_LONGVARCHAR: - case SQL_WCHAR: - case SQL_WVARCHAR: - case SQL_WLONGVARCHAR: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast<size_t>(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; -
} - isCharacterType = true; - break; - - case SQL_DECIMAL: - case SQL_NUMERIC: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast<size_t>(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_INTEGER: - case SQL_SMALLINT: - case SQL_TINYINT: - case SQL_BIGINT: - cType = SQL_C_SBIGINT; - bufferSize = sizeof(SQLBIGINT); - break; - - case SQL_REAL: - case SQL_FLOAT: - case SQL_DOUBLE: - cType = SQL_C_DOUBLE; - bufferSize = sizeof(double); - break; - - case SQL_BIT: - cType = SQL_C_BIT; - bufferSize = sizeof(SQLCHAR); - break; - - case SQL_DATE: - case SQL_TYPE_DATE: - cType = SQL_C_DATE; - bufferSize = sizeof(SQL_DATE_STRUCT); - break; - - case SQL_TIME: - case SQL_TYPE_TIME: - cType = SQL_C_TIME; - bufferSize = sizeof(SQL_TIME_STRUCT); - break; - - case SQL_TIMESTAMP: - case SQL_TYPE_TIMESTAMP: - cType = SQL_C_TIMESTAMP; - bufferSize = sizeof(SQL_TIMESTAMP_STRUCT); - break; - - default: - cType = SQL_C_CHAR; - bufferSize = 256; - isCharacterType = true; - break; - } - - std::vector<char> buffer(bufferSize); - - ret = SQLGetData(hStmt, i, cType, buffer.data(), bufferSize, &indicator); - - if (indicator == SQL_NULL_DATA) { - valueStr = "NULL"; - } - else if (ret != SQL_SUCCESS) { - valueStr = "ERR_CONV"; - } - else { - if (cType == SQL_C_CHAR) { - valueStr = reinterpret_cast<char*>(buffer.data()); - } - else if (cType == SQL_C_SBIGINT) { - SQLBIGINT intVal = *reinterpret_cast<SQLBIGINT*>(buffer.data()); - valueStr = std::to_string(intVal); - } - else if (cType == SQL_C_DOUBLE) { - double doubleVal = *reinterpret_cast<double*>(buffer.data()); - valueStr = std::to_string(doubleVal); - } - else if (cType == SQL_C_BIT) { - valueStr = (*buffer.data() != 0) ? 
"TRUE" : "FALSE"; - } - else if (cType == SQL_C_DATE) { - SQL_DATE_STRUCT* date = reinterpret_cast<SQL_DATE_STRUCT*>(buffer.data()); - char dateStr[20]; - snprintf(dateStr, sizeof(dateStr), "%04d-%02d-%02d", - date->year, date->month, date->day); - valueStr = dateStr; - } - else if (cType == SQL_C_TIME) { - SQL_TIME_STRUCT* time = reinterpret_cast<SQL_TIME_STRUCT*>(buffer.data()); - char timeStr[15]; - snprintf(timeStr, sizeof(timeStr), "%02d:%02d:%02d", - time->hour, time->minute, time->second); - valueStr = timeStr; - } - else if (cType == SQL_C_TIMESTAMP) { - SQL_TIMESTAMP_STRUCT* ts = reinterpret_cast<SQL_TIMESTAMP_STRUCT*>(buffer.data()); - char tsStr[30]; - snprintf(tsStr, sizeof(tsStr), "%04d-%02d-%02d %02d:%02d:%02d.%06d", - ts->year, ts->month, ts->day, - ts->hour, ts->minute, ts->second, - ts->fraction / 1000); - valueStr = tsStr; - } - else { - valueStr = "UNKNOWN_TYPE"; - } - - if (isCharacterType && ret == SQL_SUCCESS_WITH_INFO) { - SQLLEN actualSize = 0; - SQLGetDiagField(SQL_HANDLE_STMT, hStmt, 0, SQL_DIAG_COLUMN_SIZE, - &actualSize, SQL_IS_INTEGER, NULL); - - if (indicator > 0 && static_cast<size_t>(indicator) > bufferSize - 1) { - valueStr += "..."; - } - } - - } - - row.push_back(valueStr); - } - - tableRows.push_back(row); - } - - if (!tableRows.empty()) { - PrintSimpleTable(columnNames, tableRows); - } - - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (const std::exception& ex) { - std::cerr << "Exception: " << ex.what() << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } - catch (...) 
{ - std::cerr << "Unknown exception occurred" << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -/// Executes non-query SQL statements (such as INSERT; Tree Model INSERT will auto-create paths) -void Execute(SQLHDBC hDbc, const std::string& command) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN ret; - - try { - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - - ret = SQLExecDirect(hStmt, (SQLCHAR*)command.c_str(), SQL_NTS); - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect"); - } - - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (...) { - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -int main() { - SQLHENV hEnv = SQL_NULL_HENV; - SQLHDBC hDbc = SQL_NULL_HDBC; - SQLRETURN ret; - - try { - std::cout << "Start" << std::endl; - - ret = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &hEnv); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_ENV)"); - - ret = SQLSetEnvAttr(hEnv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLSetEnvAttr"); - - ret = SQLAllocHandle(SQL_HANDLE_DBC, hEnv, &hDbc); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_DBC)"); - - std::string dsn = "Apache IoTDB DSN"; - std::string user = "root"; - std::string password = "root"; - std::string server = "127.0.0.1"; - - std::string connectionString = "DSN=" + dsn + ";Server=" + server + - ";UID=" + user + ";PWD=" + password + - ";loglevel=4;istablemodel=0"; - std::cout << "Using connection string: " << connectionString << std::endl; - - SQLCHAR outConnStr[1024]; - SQLSMALLINT outConnStrLen; - - ret = SQLDriverConnect(hDbc, NULL, - (SQLCHAR*)connectionString.c_str(), SQL_NTS, - outConnStr, sizeof(outConnStr), - &outConnStrLen, 
SQL_DRIVER_COMPLETE); - - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - std::cerr << "Login failed" << std::endl; - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLDriverConnect"); - return 1; - } - - SQLCHAR driverName[256]; - SQLSMALLINT nameLength; - ret = SQLGetInfo(hDbc, SQL_DRIVER_NAME, driverName, sizeof(driverName), &nameLength); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLGetInfo"); - - std::cout << "Successfully opened connection. database name = " << driverName << std::endl; - - const char* insertStatements[] = { - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, 'Device operating normally', 'DeviceA-Room1', 1735689600000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, 'Device operating normally', 'DeviceA-Room1', 1735689660000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, 'Device operating normally', 'DeviceA-Room1', 1735689720000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, 'Device temperature high alarm', 'DeviceA-Room1', 1735689780000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, 'Device status returned to normal', 'DeviceA-Room1', 1735689840000, '2026-01-04')", 
- "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, 'Device operating normally', 'DeviceB-Room2', 1735689900000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, 'Device operating normally', 'DeviceB-Room2', 1735689960000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, 'Device humidity low alarm', 'DeviceB-Room2', 1735690020000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, 'Device status returned to normal', 'DeviceB-Room2', 1735690080000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 'Device operating normally', 'DeviceC-Room3', 1735690140000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, 'Device operating normally', 'DeviceC-Room3', 1735690200000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, 'Device voltage unstable alarm', 'DeviceC-Room3', 
1735690260000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, 'Device status returned to normal', 'DeviceC-Room3', 1735690320000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, 'Device operating normally', 'DeviceD-Room4', 1735690380000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, 'Device operating normally', 'DeviceD-Room4', 1735690440000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, 'Device operating normally', 'DeviceD-Room4', 1735690500000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, 'Device signal interrupted alarm', 'DeviceD-Room4', 1735690560000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, 'Device operating normally', 'DeviceE-Room5', 1735690620000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, 'Device 
operating normally', 'DeviceE-Room5', 1735690680000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', 1735690740000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', 1735690740000, '2026-01-04')" - }; - for (const char* sql : insertStatements) { - Execute(hDbc, sql); - } - std::cout << "[DEBUG] Inserted 21 rows. Begin to query." << std::endl; - Query(hDbc); - std::cout << "Query ok" << std::endl; - - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - - return 0; - } - catch (...) { - if (hDbc != SQL_NULL_HDBC) { - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - } - if (hEnv != SQL_NULL_HENV) { - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - } - - std::cerr << "Unexpected error!" << std::endl; - return 1; - } -} -``` - -### 4.4 PowerBI Example -1. Open PowerBI Desktop and create a new project. -2. Click "Home" → "Get Data" → "More..." → "ODBC", then click the "Connect" button. -3. Data Source Selection: In the pop-up window, select "Data Source Name (DSN)" and choose `Apache IoTDB DSN` from the dropdown. -4. Advanced Configuration: - * Click "Advanced options" and enter the configuration in the "Connection string" input box (example): - ```Plain - server=127.0.0.1;port=6667;isTableModel=false;loglevel=4 - ``` - * Notes: - * The `dsn` item is optional; the connection works whether or not it is filled in. - * `loglevel` ranges from 0 to 4: level 0 (ERROR) produces the fewest logs and level 4 (TRACE) the most detailed; set as needed.
- * `server`/`dsn`/`loglevel` are case-insensitive (e.g., can be written as `Server`). - * If relevant information is configured in the DSN, you do not need to fill in any configuration information; the Driver Manager will automatically use the configuration filled in the DSN. -5. Authentication: Enter the username (default `root`) and password (default `root`), then click "Connect". -6. Data Loading: Click "Load" to view the data. - -### 4.5 Excel Example -1. Open Excel and create a blank workbook. -2. Click the "Data" tab → "From Other Sources" → "From Data Connection Wizard". -3. Data Source Selection: Select "ODBC DSN" → Next → Select `Apache IoTDB DSN` → Next. -4. Connection Configuration: - * The input process for connection string, username, and password is exactly the same as in PowerBI. Reference format for connection string: - ```Plain - server=127.0.0.1;port=6667;isTableModel=false;loglevel=4 - ``` - * If relevant information is configured in the DSN, you do not need to fill in any configuration information; the Driver Manager will automatically use the configuration filled in the DSN. -5. Save Connection: Customize settings for the data connection file name, connection description, etc., then click "Finish". -6. Import Data: Select the location to import the data into the worksheet (e.g., cell A1 of "Existing Worksheet"), click "OK" to complete data loading. \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/API/Programming-OPC-DA_timecho.md b/src/UserGuide/Master/Tree/API/Programming-OPC-DA_timecho.md deleted file mode 100644 index 861d88e98..000000000 --- a/src/UserGuide/Master/Tree/API/Programming-OPC-DA_timecho.md +++ /dev/null @@ -1,209 +0,0 @@ - - -# OPC DA Protocol - -## 1. OPC DA - -OPC DA (OPC Data Access) is a communication protocol standard in the field of industrial automation and a core part of the classic OPC (OLE for Process Control) technology. 
Its primary goal is to enable real-time data exchange between industrial devices and software (such as SCADA, HMI, and databases) in a Windows environment. OPC DA is implemented based on COM/DCOM and is a lightweight protocol with two roles: server and client. - -* **Server:** Can be regarded as a pool of items, storing the latest data and status of each instance. All items can only be managed on the server side; clients can only read and write data and have no authority to manipulate metadata. - -![](/img/opc-da-1-1.png) - -* **Client:** After connecting to the server, the client needs to define a custom group (this group is only relevant to the client) and create items with the same names as those on the server. The client can then read and write the items it has created. - -![](/img/opc-da-1-2-en.png) - -## 2. OPC DA Sink - -IoTDB (available since V2.0.5.1 for V2.x) provides an OPC DA Sink that supports pushing tree-model data to a local COM server plugin. It encapsulates the OPC DA interface specifications and their inherent complexity, significantly simplifying the integration process. The data flow diagram for the OPC DA Sink is shown below. - -![](/img/opc-da-2-1-en.png) - -### 2.1 SQL Syntax - -```SQL ----- Note: The clsID here needs to be replaced with your own clsID -create pipe opc ( - 'sink'='opc-da-sink', - --- 'opcda.progid'='opcserversim.Instance.1' - 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C' -); -``` - -### 2.2 Parameter Description - -| ​**​Parameter​**​ | ​**​Description​**​ | ​**​Value Range​**​ | ​**​Required​**​ | -| ----------------------------- | ----------------------------------------------------------------------------------------------------------- | ------------------------------- | ----------------------------------------- | -| sink | OPC DA Sink | String: opc-da-sink | Yes | -| sink.opcda.clsid | The ClsID (unique identifier string) of the OPC Server. It is recommended to use clsID instead of progID. 
| String | Either clsID or progID must be provided | -| sink.opcda.progid | The ProgID of the OPC Server. If clsID is available, it is preferred over progID. | String | Either clsID or progID must be provided | - - -### 2.3 Mapping Specifications - -When used, IoTDB will push the latest data from its tree model to the server. The itemID for the data is the full path of the time series in the tree model, such as root.a.b.c.d. Note that, according to the OPC DA standard, clients cannot directly create items on the server side. Therefore, the server must pre-create items corresponding to IoTDB's time series with the itemID and the appropriate data type. - -* Data type correspondence is as follows: - -| IoTDB | OPC-DA Server | -| ----------- | ----------------------------------------------------------- | -| INT32 | VT\_I4 | -| INT64 | VT\_I8 | -| FLOAT | VT\_R4 | -| DOUBLE | VT\_R8 | -| TEXT | VT\_BSTR | -| BOOLEAN | VT\_BOOL | -| DATE | VT\_DATE | -| TIMESTAMP | VT\_DATE | -| BLOB | VT_BSTR (Variant does not support VT_BLOB, so VT_BSTR is used as a substitute) | -| STRING | VT\_BSTR | - -### 2.4 Common Error Codes - -| Symbol | Error Code | Description | -| ----------------------------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| OPC\_E\_BADTYPE | 0xC0040004 | The server cannot convert the data between the specified format/requested data type and the canonical data type. This means the server's data type does not match IoTDB's registered type. | -| OPC\_E\_UNKNOWNITEMID | 0xC0040007 | The item ID is not defined in the server's address space (when adding or validating), or the item ID no longer exists in the server's address space (when reading or writing). 
This means IoTDB's measurement point does not have a corresponding itemID on the server. | -| OPC\_E\_INVALIDITEMID | 0xC0040008 | The itemID does not conform to the server's syntax specifications. | -| REGDB\_E\_CLASSNOTREG | 0x80040154 | Class not registered | -| RPC\_S\_SERVER\_UNAVAILABLE | 0x800706BA | RPC service unavailable | -| DISP\_E\_OVERFLOW | 0x8002000A | Exceeds the maximum value of the type | -| DISP\_E\_TYPEMISMATCH | 0x80020005 | Type mismatch | - - -### 2.5 Usage Limitations - -* Only supports COM and can only be used on Windows. -* A small amount of old data may be pushed after restarting, but new data will eventually be pushed. -* Currently, only tree-model data is supported. - -## 3. Usage Steps -### 3.1 Prerequisites -1. Windows 8 or later. -2. IoTDB is installed and running normally. -3. OPC DA Server is installed. - -* Using Simple OPC Server Simulator as an example: - -![](/img/opc-da-3-1.png) - -* Double-click an item to modify its name (itemID), data, data type, and other information. -* Right-click an item to delete it, update its value, or create a new item. - -![](/img/opc-da-3-2.png) - -4. OPC DA Client is installed. -* Using KepwareServerEX's quickClient as an example: -* In Kepware, the OPC DA Client can be opened as follows: - -![](/img/opc-da-3-3-en.png) - -![](/img/opc-da-3-4-en.png) - - -### 3.2 Configuration Modifications - -Modify the server configuration so that IoTDB's write client and Kepware's read client do not connect to two different server instances, which would make debugging impossible. - -* First, press Win+R, type `dcomcnfg` in the Run dialog, and open the DCOM component configuration: - -![](/img/opc-da-3-5-en.png) - -* Navigate to Component Services -> Computers -> My Computer -> DCOM Config, find AGG Software Simple OPC Server Simulator, right-click, and select "Properties": - -![](/img/opc-da-3-6-en.png) - -* Under Identity, change User Account to Interactive User.
Note: Do not use Launching User, as this may cause the two clients to start different server instances. - -![](/img/opc-da-3-7-en.png) - -### 3.3 Obtaining clsID -1. Method 1: Obtain via DCOM Configuration -* Press Win+R, type `dcomcnfg` in the Run dialog, and open the DCOM component configuration. -* Navigate to Component Services -> Computers -> My Computer -> DCOM Config, find AGG Software Simple OPC Server Simulator, right-click, and select "Properties". -* Under General, you can obtain the application's clsID, which will be used for the opc-da-sink connection later. Note: Do not include the curly braces. - -![](/img/opc-da-3-8-en.png) - -2. Method 2: clsID and progID can also be obtained directly from the server. - -* Click `Help` > `Show OPC Server Info` - -![](/img/opc-da-3-9.png) - -* The pop-up window will display the information. - -![](/img/opc-da-3-10-en.png) - -### 3.4 Writing Data -#### 3.4.1 DA Server -1. Create a new item in the DA Server with the same name and type as the item to be written in IoTDB. - -![](/img/opc-da-3-11.png) - -2. Connect to the server in Kepware: - -![](/img/opc-da-3-12-en.png) - -3. Right-click the server to create a new group (the group name can be arbitrary): - -![](/img/opc-da-3-13-en.png) - -![](/img/opc-da-3-14-en.png) - -4. Right-click to create a new item with the same name as the one created earlier. - -![](/img/opc-da-3-15-en.png) - -![](/img/opc-da-3-16-en.png) - -![](/img/opc-da-3-17-en.png) - -#### 3.4.2 IoTDB - -1. Start IoTDB. -2. Create a Pipe. - -```SQL -create pipe opc ('sink'='opc-da-sink', 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C') -``` - -* Note: If the creation fails with the error `Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 1107: Failed to connect to server, error code: 0x80040154`, the COM class is not registered; refer to this solution: https://opcexpert.com/support/0x80040154-class-not-registered/. - -3. Create a time series (if automatic metadata creation is enabled, this step can be skipped).
- -```SQL -create timeseries root.a.b.c.r string; -``` - -4. Insert data. - -```SQL -insert into root.a.b.c (time, r) values(10000, "SomeString") -``` - -### 3.5 Verifying Data - -Check the data in Quick Client; it should have been updated. - -![](/img/opc-da-3-18-en.png) \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/API/Programming-OPC-UA_timecho.md b/src/UserGuide/Master/Tree/API/Programming-OPC-UA_timecho.md deleted file mode 100644 index 7d139aea2..000000000 --- a/src/UserGuide/Master/Tree/API/Programming-OPC-UA_timecho.md +++ /dev/null @@ -1,373 +0,0 @@ - - -# OPC UA Protocol - -## 1. Overview - -This document describes two independent operational modes for IoTDB's integration with the OPC UA protocol. Choose the mode based on your business scenario: - -* **Mode 1: Data Subscription Service (IoTDB as OPC UA Server)**: IoTDB starts an embedded OPC UA server to passively allow external clients (e.g., UAExpert) to connect and subscribe to its internal data. This is the traditional usage. -* **Mode 2: Data Push (IoTDB as OPC UA Client)**: IoTDB acts as a client to actively synchronize data and metadata to one or more independently deployed external OPC UA servers. - > Note: This mode is supported starting from V2.0.8. - -**Note: Modes are mutually exclusive** -When the Pipe configuration specifies the `node-urls` parameter (Mode 2), IoTDB will **not** start the embedded OPC UA server (Mode 1). These two modes **cannot be used simultaneously** within the same Pipe. - -## 2. Data Subscription - -This mode supports users subscribing to data from IoTDB using the OPC UA protocol, with communication modes supporting both Client/Server and Pub/Sub. - -Note: This feature does **not** involve collecting data from external OPC Servers into IoTDB. - -![](/img/opc-ua-new-1-en.png) - -### 2.1 OPC Service Startup - -#### 2.1.1 Syntax - -Syntax for starting OPC UA protocol: - -```SQL -CREATE PIPE p1 - WITH SOURCE (...) - WITH PROCESSOR (...) 
- WITH SINK ('sink' = 'opc-ua-sink', - 'sink.opcua.tcp.port' = '12686', - 'sink.opcua.https.port' = '8443', - 'sink.user' = 'root', - 'sink.password' = 'TimechoDB@2021', -- Default password was 'root' before V2.0.6.x - 'sink.opcua.security.dir' = '...' - ) -``` - -#### 2.1.2 Parameters - -| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** | -| ------------------------------------ | ------------------------------------ | ------------------------------------ | -------------------- | ------------------------------------ | -| sink | OPC UA SINK | String: opc-ua-sink | Required | | -| sink.opcua.model | OPC UA operational mode | String: client-server / pub-sub | Optional | client-server | -| sink.opcua.tcp.port | OPC UA TCP port | Integer: [0, 65535] | Optional | 12686 | -| sink.opcua.https.port | OPC UA HTTPS port | Integer: [0, 65535] | Optional | 8443 | -| sink.opcua.security.dir | OPC UA key and certificate directory | String: Path (supports absolute/relative paths) | Optional | 1. `opc_security` folder under IoTDB's DataNode conf directory `/`. 2.
User home directory's `iotdb_opc_security` folder `/` if no IoTDB conf directory exists (e.g., when starting DataNode in IDEA) | -| opcua.security-policy | Security policy used for OPC UA connections (case-insensitive). Multiple policies can be configured and separated by commas. After configuring one policy, clients can only connect using that policy. Default implementation supports `None` and `Basic256Sha256`. Should be set to a non-`None` policy by default. `None` policy is only for debugging (convenient but insecure; not recommended for production). Note: Supported since V2.0.8, only for client-server mode. | String (security level increases):`None`,`Basic128Rsa15`,`Basic256`,`Basic256Sha256`,`Aes128_Sha256_RsaOaep`,`Aes256_Sha256_RsaPss` | Optional | `Basic256Sha256,Aes128_Sha256_RsaOaep,Aes256_Sha256_RsaPss` | -| sink.opcua.enable-anonymous-access | Whether OPC UA allows anonymous access | Boolean | Optional | true | -| sink.user | User (OPC UA allowed user) | String | Optional | root | -| sink.password | Password (OPC UA allowed password) | String | Optional | TimechoDB@2021 (Default was 'root' before V2.0.6.x) | -| opcua.with-quality | Whether OPC UA publishes data in value + quality mode. When enabled, system processes data as follows:1. Both value and quality present → Push directly to OPC UA Server.2. Only value present → Quality automatically filled as UNCERTAIN (default, configurable).3. Only quality present → Ignore write (no processing).4. Non-value/quality fields present → Ignore data and log warning (configurable log frequency to avoid high-frequency interference).5. Quality type restriction: Only boolean type supported (true = GOOD, false = BAD).**Note**: Supported since V2.0.8, only for client-server mode | Boolean | Optional | false | -| opcua.value-name | Effective when `with-quality` = true, specifies the name of the value point. 
**Note**: Supported since V2.0.8, only for client-server mode | String | Optional | value | -| opcua.quality-name | Effective when `with-quality` = true, specifies the name of the quality point. **Note**: Supported since V2.0.8, only for client-server mode | String | Optional | quality | -| opcua.default-quality | When no quality is provided, specify `GOOD`/`UNCERTAIN`/`BAD` via SQL parameter. **Note**: Supported since V2.0.8, only for client-server mode | String: `GOOD`/`UNCERTAIN`/`BAD` | Optional | `UNCERTAIN` | -| opcua.timeout-seconds | Client connection timeout in seconds (effective only when IoTDB acts as client). **Note**: Supported since V2.0.8, only for client-server mode | Long | Optional | 10L | - -#### 2.1.3 Example - -```Bash -CREATE PIPE p1 - WITH SINK ('sink' = 'opc-ua-sink', - 'sink.user' = 'root', - 'sink.password' = 'TimechoDB@2021'); // Default password was 'root' before V2.0.6.x -START PIPE p1; -``` - -#### 2.1.4 Usage Restrictions - -1. Data must be written after protocol startup to establish connection. Only data written *after* connection can be subscribed. -2. Recommended for single-node mode. In distributed mode, each IoTDB DataNode acts as an independent OPC Server; separate subscriptions are required for each. - -### 2.2 Example of Two Communication Modes - -#### 2.2.1 Client/Server Mode - -In this mode, IoTDB's stream processing engine establishes a connection with the OPC UA Server (Server) via OPC UA Sink. The OPC UA Server maintains data in its address space (Address Space), and IoTDB can request and retrieve this data. Other OPC UA clients (Clients) can also access the server's data. - -* **Features**: - * OPC UA organizes device information received from Sink into folders under Objects folder in tree structure. - * Each point is recorded as a variable node with the latest value in the current database. - * OPC UA cannot delete data or change data type settings. - -##### 2.2.1.1 Preparation - -1. 
Example using UAExpert client: Download UAExpert client from https://www.unified-automation.com/downloads/opc-ua-clients.html -2. Install UAExpert and configure certificate information. - -##### 2.2.1.2 Quick Start -###### 2.2.1.2.1 Scenarios Supporting the None Security Policy -1. Start OPC UA service using SQL (detailed syntax see [IoTDB OPC Server Syntax](./Programming-OPC-UA_timecho.md#_2-1-语法)): - -```SQL -create pipe p1 with sink ('sink'='opc-ua-sink', 'opcua.security-policy'='AES128_SHA256_RSAOAEP, AES256_SHA256_RSAPSS, BASIC256SHA256, NONE'); -``` -Note: Since version V2.0.8.1, None is no longer supported by default. To use it, you must manually enable it via the security-policy parameter as shown above. - -2. Write some data: - -```SQL -INSERT INTO root.test.db(time, s2) VALUES(NOW(), 2); -``` - -3. Configure UAExpert to connect to IoTDB (password matches `sink.password` configured above, e.g., root/TimechoDB@2021): - - ::: center - - - - ::: - - ::: center - - - - ::: - -4. Trust the server certificate, then view written data under Objects folder on the left: - - ::: center - - - - ::: - - ::: center - - - - ::: - - Note: Since the SecurityPolicy is set to None, mutual certificate trust is not required. For production environments, it is recommended to use a non-None SecurityPolicy for connection, which requires mutual certificate trust. For operations, refer to the Pub/Sub mode section below. In the Client/Server certificate directory (search for the keyword keyStore in the printed logs), move the contents in reject to trusted/certs. Follow the sequence: connect → move server directory → connect → move client directory → connect. - - -5. Drag left nodes to the middle to display latest value: - - ::: center - - - - ::: - -###### 2.2.1.2.2 Scenarios Not Supporting the None Security Policy -1. Use the following SQL to create and start the OPC UA service. 
- ```SQL - create pipe p1 with sink ('sink'='opc-ua-sink'); - ``` - - Note: Since version V2.0.8.1, OpcUaSink no longer supports None mode by default for security considerations. - -2. Insert some test data. - ```SQL - insert into root.test.db(time, s2) values(now(), 2); - ``` - -3. Configure the IoTDB connection in UAExpert: - - - Do not access the URL directly; endpoints must be discovered using the Discover method - - The client first sends a GetEndpoints request with the None policy to retrieve the endpoint list - - It then selects the corresponding encrypted endpoint based on the configured Basic256Sha256 + SignAndEncrypt to establish an encrypted connection - - ![](/img/opc-ua-un-none-1.png) - -4. Use the same username and password configuration as above. After selecting the relevant connection mode (Sign / Sign & Encrypt), if the following prompt appears, click Ignore to connect directly. - - ![](/img/opc-ua-un-none-2.png) - - -#### 2.2.2 Pub/Sub Mode - -In this mode, IoTDB's stream processing engine sends data change events to the OPC UA Server (Server) via OPC UA Sink. These events are published to the server's message queue and managed via Event Nodes. Other OPC UA clients (Clients) can subscribe to these Event Nodes to receive notifications when data changes. - -* **Features**: - - * Each point is packaged as an Event Node (EventNode) by OPC UA. - * Related fields and meanings: - - | Field | Meaning | Type (Milo) | Example | - | ------------ | ------------------ | -------------- | ----------------------- | - | Time | Timestamp | DateTime | 1698907326198 | - | SourceName | Full path of point | String | root.test.opc.sensor0 | - | SourceNode | Data type of point | NodeId | Int32 | - | Message | Data | LocalizedText | 3.0 | - - - -- Events are sent only to currently subscribed clients. Unconnected clients ignore events. -- Deleted data cannot be pushed to clients. 
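The event fields in the table above can be modeled on the subscriber side roughly as follows. This is a minimal sketch in plain Python: `PointEvent` and `parse_event` are hypothetical names used only for illustration and are not part of IoTDB or of any OPC UA library. It assumes the subscriber has already extracted the four fields into a dictionary, and that `Time` is given in epoch milliseconds as in the table's example column.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PointEvent:
    """Hypothetical client-side view of one Pub/Sub event (illustration only)."""

    time: datetime    # Time: timestamp of the data point
    source_name: str  # SourceName: full path of the point
    source_node: str  # SourceNode: data type of the point
    message: str      # Message: the value, carried as text


def parse_event(fields: dict) -> PointEvent:
    """Build a PointEvent from already-extracted event fields.

    Assumes `Time` is epoch milliseconds, as in the example column above.
    """
    seconds, millis = divmod(int(fields["Time"]), 1000)
    timestamp = datetime.fromtimestamp(seconds, tz=timezone.utc).replace(
        microsecond=millis * 1000
    )
    return PointEvent(
        time=timestamp,
        source_name=str(fields["SourceName"]),
        source_node=str(fields["SourceNode"]),
        message=str(fields["Message"]),
    )


# Values taken from the example column of the table.
event = parse_event(
    {
        "Time": 1698907326198,
        "SourceName": "root.test.opc.sensor0",
        "SourceNode": "Int32",
        "Message": "3.0",
    }
)
print(event.source_name, event.message)
```

Keeping `Message` as text mirrors the `LocalizedText` type in the table; a real subscriber would convert it according to `SourceNode` if it needs a typed value.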
- -##### 2.2.2.1 Preparation - -The example code is located in `example/pipe-opc-ua-sink/src/main/java/org/apache/iotdb/opcua` of the iotdb-example package. - -It contains: - -- Main class (`ClientTest`) -- Client certificate logic (`IoTDBKeyStoreLoaderClient`) -- Client configuration and startup logic (`ClientExampleRunner`) -- Parent class for `ClientTest` (`ClientExample`) - -##### 2.2.2.2 Quick Start - -1. Open IoTDB and write some data: - -```SQL -INSERT INTO root.a.b(time, c, d) VALUES(NOW(), 1, 2); // Auto-creates metadata -``` - -2. Create and start Pub/Sub mode OPC UA Sink: - -```SQL -CREATE PIPE p1 WITH SINK ('sink'='opc-ua-sink', 'sink.opcua.model'='pub-sub'); -START PIPE p1; -``` - -3. Check that the server has created the OPC certificate directory under `conf`. -4. Run the Client to connect; the server initially rejects the Client's certificate. -5. On the server, open `sink.opcua.security.dir` → `pki` → `rejected` and locate the Client's certificate. -6. Move (do not copy) the Client's certificate to the `trusted/certs` directory. -7. Run the Client again; this time the Client rejects the server's certificate. -8. On the client, open `/client/security` → `pki` → `rejected` and move (do not copy) the server's certificate to `trusted`. -9. Run the Client once more; with trust established in both directions, the connection succeeds. -10. Write data to the server; the Client prints the data it receives. - -#### 2.2.3 Notes - -1. **Single-node vs Cluster**: Single-node deployment (1C1D) is recommended. In a cluster with multiple DataNodes, data may be distributed across nodes, so a single subscription may not receive all data. -2. **No root certificate operations**: There is no need to handle IoTDB's root security directory `iotdb-server.pfx` or the client security directory `example-client.pfx`. Root certificates are exchanged during the bidirectional connection. New certificates are placed in the `rejected` directory; certificates in `trusted/certs` are trusted. -3. **Recommended Java 17+**: JDK 8 may have key size restrictions causing "Illegal key size" errors. 
For specific versions (e.g., jdk.1.8u151+), add `Security.setProperty("crypto.policy", "unlimited");` in `ClientExampleRunner.createClient()`, or replace `JDK/jre/lib/security/local_policy.jar` and `US_export_policy.jar` with unlimited versions from https://www.oracle.com/java/technologies/javase-jce8-downloads.html. -4. **Connection issues**: If error is "Unknown host", modify `/etc/hosts` on IoTDB DataNode machine to add target machine's URL and hostname. - -## 3. Data Push - -In this mode, IoTDB acts as an OPC UA client via Pipe to actively push selected data (including quality code) to one or more external OPC UA servers. External servers automatically create directory trees and nodes based on IoTDB's metadata. - -![](/img/opc-ua-data-push-en.png) - -### 3.1 OPC Service Startup - -#### 3.1.1 Syntax - -Syntax for starting OPC UA protocol: - -```SQL -CREATE PIPE p1 - WITH SOURCE (...) - WITH PROCESSOR (...) - WITH SINK ('sink' = 'opc-ua-sink', - 'opcua.node-url' = '127.0.0.1:12686', - 'opcua.historizing' = 'true', - 'opcua.with-quality' = 'true' - ) -``` - -#### 3.1.2 Parameters - -| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** | -|-----------------------| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |-------------------------------------------------------------------------------------------------------------------------------------| -------------------- | -------------------- | -| sink | OPC UA SINK | String: opc-ua-sink | Required | | -| opcua.node-url | Comma-separated OPC UA TCP ports. 
When specified, IoTDB **does not** start local server but sends data to configured OPC UA Server. | String | Optional | `''` | -| opcua.historizing | When automatically creating directories and leaf nodes, whether to store historical data in new nodes. | Boolean | Optional | false | -| opcua.with-quality | Whether OPC UA publishes data in value + quality mode. When enabled, system processes data as follows:1. Both value and quality present → Push directly to OPC UA Server.2. Only value present → Quality automatically filled as UNCERTAIN (default, configurable).3. Only quality present → Ignore write (no processing).4. Non-value/quality fields present → Ignore data and log warning (configurable log frequency).5. Quality type restriction: Only boolean type supported (true = GOOD, false = BAD). | Boolean | Optional | false | -| opcua.value-name | Effective when `with-quality` = true, specifies the name of the value point. | String | Optional | value | -| opcua.quality-name | Effective when `with-quality` = true, specifies the name of the quality point. | String | Optional | quality | -| opcua.default-quality | When no quality is provided, specify `GOOD`/`UNCERTAIN`/`BAD` via SQL parameter. | String: `GOOD`/`UNCERTAIN`/`BAD` | Optional | `UNCERTAIN` | -| opcua.security-policy | OPC UA client security policy (case-insensitive), URL format: `http://opcfoundation.org/UA/SecurityPolicy#`, e.g., `http://opcfoundation.org/UA/SecurityPolicy#Aes128_Sha256_RsaOaep` | String (security level increases):`None`,`Basic128Rsa15`,`Basic256`,`Basic256Sha256`,`Aes128_Sha256_RsaOaep`,`Aes256_Sha256_RsaPss` | Optional | `Basic256Sha256` | -| opcua.timeout-seconds | Client connection timeout in seconds (effective only when IoTDB acts as client) | Long | Optional | 10L | - -> **Parameter Naming Note**: All parameters support omitting `opcua.` prefix (e.g., `node-urls` and `opcua.node-urls` are equivalent). 
-> -> **Support Note**: All `opcua.` parameters are supported starting from V2.0.8, and only for `client-server` mode. - -#### 3.1.3 Example - -```SQL -CREATE PIPE p1 - WITH SOURCE (...) - WITH PROCESSOR (...) - WITH SINK ('sink' = 'opc-ua-sink', - 'node-urls' = '127.0.0.1:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ) -``` - -#### 3.1.4 Usage Restrictions - -1. This mode **only supports `client-server` mode and tree model data**. -2. Do not configure multiple DataNodes on one machine, to avoid port conflicts. -3. Pushing `OBJECT` type data is **not supported**. -4. When a time series is renamed, OPC UA Sink automatically deletes the old path and pushes data to the new path. -5. In production, it is **strongly recommended** to use a non-`None` security policy (e.g., `Basic256Sha256`) with proper bidirectional certificate trust. - -### 3.2 External OPC UA Server Project - -IoTDB supports a standalone external Server project. This Server implements the same configuration as IoTDB's embedded Server, and additionally supports dynamically creating directories and leaf nodes based on IoTDB's metadata. - -Configuration is passed as command-line arguments when starting the Server (YAML/XML configuration is not supported). Parameter keys match the IoTDB OPC Server configuration items, with dots (`.`) and hyphens (`-`) replaced by underscores (`_`). - -Example: - -```Bash -./start-IoTDB-opc-server.sh -enable_anonymous_access true -u root -pw root -https_port 8443 -``` - -Here `user` and `password` can be abbreviated as `-u` and `-pw`. All other parameter keys match the configuration items. Note: `userName` is **not** a valid parameter key; only `user` is supported. - -### 3.3 Scenario Example - -**Goal**: Aggregate data from multiple sources to 3 external OPC Servers for unified monitoring center access. - -![](/img/opc-ua-data-push-example-en.png) - -1. **Preparation**: Start external OPC UA Server (port 12686) on three servers (`ip1`, `ip2`, `ip3`). -2. 
**Configure Pipes**: Create 3 Pipes in IoTDB, using `processor` or `source` path patterns to filter data by region and push to corresponding Servers. - ```SQL - -- Start IoTDB - ./start-standalone.sh - - -- Start three OPC UA Servers (on ip1, ip2, ip3) - ./start-IoTDB-external-opc-server.sh -enable_anonymous_access true -u root -pw root - - -- Create three Pipes - ./start-cli.sh - CREATE PIPE p1 - WITH SOURCE () - WITH PROCESSOR (...) - WITH SINK ('sink' = 'opc-ua-sink', - 'node-urls' = 'ip1:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ); - CREATE PIPE p2 - WITH SOURCE () - WITH PROCESSOR (...) - WITH SINK ('sink' = 'opc-ua-sink', - 'node-urls' = 'ip2:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ); - CREATE PIPE p3 - WITH SOURCE () - WITH PROCESSOR (...) - WITH SINK ('sink' = 'opc-ua-sink', - 'node-urls' = 'ip3:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ); - ``` -3. **Result**: The monitoring center only needs to connect to `ip1`, `ip2`, and `ip3` to access the complete data view from all regions, with quality information attached. diff --git a/src/UserGuide/Master/Tree/API/Programming-Python-Native-API_timecho.md b/src/UserGuide/Master/Tree/API/Programming-Python-Native-API_timecho.md deleted file mode 100644 index 9ece00588..000000000 --- a/src/UserGuide/Master/Tree/API/Programming-Python-Native-API_timecho.md +++ /dev/null @@ -1,875 +0,0 @@ - - -# Python Native API - -## 1. Requirements - -You have to install thrift (>=0.13) before using the package. - - - -## 2. How to use (Example) - -First, install the package: `pip3 install "apache-iotdb>=2.0"` (the quotes keep the shell from treating `>` as an output redirection). - -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. 
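Because client and server versions need to be compatible, it can help to check which client version is installed before connecting. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_client_version` is hypothetical, and it assumes the distribution name is `apache-iotdb`, as in the install command above.

```python
from importlib import metadata
from typing import Optional


def installed_client_version(dist_name: str = "apache-iotdb") -> Optional[str]:
    """Return the installed version of the given distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_client_version()
print(version if version is not None else "apache-iotdb is not installed")
```

The returned string can then be compared by hand against the server version before opening a session.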
- -You can find an example of using the package to read and write data here: [Session Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/session_example.py) - -An example of aligned timeseries: [Aligned Timeseries Session Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/session_aligned_timeseries_example.py) - -(you need to add `import iotdb` at the top of the file) - -Or: - -```python -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021"  # Before V2.0.6.x the default password is root -session = Session(ip, port_, username_, password_) -session.open(False) -zone = session.get_time_zone() -session.close() -``` - -## 3. Initialization - -* Initialize a Session - -```python -session = Session( -    host="127.0.0.1", -    port="6667", -    user="root", -    password="TimechoDB@2021",  # Before V2.0.6.x the default password is root -    fetch_size=1024, -    zone_id="UTC+8", -    enable_redirection=True -) -``` - -* Initialize a Session to connect multiple nodes - -```python -session = Session.init_from_node_urls( -    node_urls=["127.0.0.1:6667", "127.0.0.1:6668", "127.0.0.1:6669"], -    user="root", -    password="TimechoDB@2021",  # Before V2.0.6.x the default password is root -    fetch_size=1024, -    zone_id="UTC+8", -    enable_redirection=True -) -``` - -* Open a session, with a parameter to specify whether to enable RPC compression - -```python -session.open(enable_rpc_compression=False) -``` - -Notice: the client's RPC compression setting must match that of the IoTDB server. - -* Close a Session - -```python -session.close() -``` - -## 4. Managing Session through SessionPool - -Utilizing SessionPool to manage sessions eliminates the need to worry about session reuse. When the number of session connections reaches the maximum capacity of the pool, requests for acquiring a session will be blocked, and you can set the blocking wait time through parameters. 
After using a session, it should be returned to the SessionPool using the `put_back` method for proper management. - -### 4.1 Create SessionPool - -```python -pool_config = PoolConfig(host=ip, port=port, user_name=username, -                         password=password, fetch_size=1024, -                         time_zone="UTC+8", max_retry=3) -max_pool_size = 5 -wait_timeout_in_ms = 3000 - -# Create the connection pool -session_pool = SessionPool(pool_config, max_pool_size, wait_timeout_in_ms) -``` -### 4.2 Create a SessionPool using distributed nodes -```python -pool_config = PoolConfig(node_urls=["127.0.0.1:6667", "127.0.0.1:6668", "127.0.0.1:6669"], user_name=username, -                         password=password, fetch_size=1024, -                         time_zone="UTC+8", max_retry=3) -max_pool_size = 5 -wait_timeout_in_ms = 3000 -``` -### 4.3 Acquiring a session through SessionPool and manually calling `put_back` after use - -```python -session = session_pool.get_session() -session.set_storage_group(STORAGE_GROUP_NAME) -session.create_time_series( -    TIMESERIES_PATH, TSDataType.BOOLEAN, TSEncoding.PLAIN, Compressor.SNAPPY -) -# After usage, return the session using put_back -session_pool.put_back(session) -# When closing the SessionPool, all managed sessions will be closed as well -session_pool.close() -``` -### 4.4 SSL Connection - -#### 4.4.1 Server Certificate Configuration - -In the `conf/iotdb-system.properties` configuration file, locate or add the following configuration items: - -```properties -enable_thrift_ssl=true -key_store_path=/path/to/your/server_keystore.jks -key_store_pwd=your_keystore_password -``` - -#### 4.4.2 Configure Python Client Certificate - -- Set `use_ssl` to True to enable SSL. -- Specify the client certificate path using the `ca_certs` parameter. - -```python -use_ssl = True -ca_certs = "/path/to/your/server.crt"  # or ca_certs = "/path/to/your/ca_cert.pem" -``` -**Example Code: Using SSL to Connect to IoTDB** - -```python -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. 
See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, -# software distributed under the License is distributed on an -# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -# KIND, either express or implied. See the License for the -# specific language governing permissions and limitations -# under the License. -# - -from iotdb.SessionPool import PoolConfig, SessionPool -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021"  # Before V2.0.6.x the default password is root -# Enable SSL -use_ssl = True -# Path to the certificate -ca_certs = "/path/server.crt" - - -def get_data(): -    # Query through a single Session over SSL -    session = Session( -        ip, port_, username_, password_, use_ssl=use_ssl, ca_certs=ca_certs -    ) -    session.open(False) -    with session.execute_query_statement("select * from root.eg.etth") as result: -        df = result.todf() -        df.rename(columns={"Time": "date"}, inplace=True) -    session.close() -    return df - - -def get_data2(): -    # Query through a SessionPool over SSL -    pool_config = PoolConfig( -        host=ip, -        port=port_, -        user_name=username_, -        password=password_, -        fetch_size=1024, -        time_zone="UTC+8", -        max_retry=3, -        use_ssl=use_ssl, -        ca_certs=ca_certs, -    ) -    max_pool_size = 5 -    wait_timeout_in_ms = 3000 -    session_pool = SessionPool(pool_config, max_pool_size, wait_timeout_in_ms) -    session = session_pool.get_session() -    with session.execute_query_statement("select * from root.eg.etth") as result: -        df = result.todf() -        df.rename(columns={"Time": "date"}, inplace=True) -    session_pool.put_back(session) -    session_pool.close() -    return df - - -if __name__ == "__main__": -    df = get_data() -``` - -## 5. 
Data Definition Interface (DDL Interface) - -### 5.1 Database Management - -* Create a database - -```python -session.set_storage_group(group_name) -``` - -* Delete one or several databases - -```python -session.delete_storage_group(group_name) -session.delete_storage_groups(group_name_lst) -``` -### 5.2 Timeseries Management - -* Create one or multiple timeseries - -```python -session.create_time_series(ts_path, data_type, encoding, compressor, -    props=None, tags=None, attributes=None, alias=None) - -session.create_multi_time_series( -    ts_path_lst, data_type_lst, encoding_lst, compressor_lst, -    props_lst=None, tags_lst=None, attributes_lst=None, alias_lst=None -) -``` - -* Create aligned timeseries - -```python -session.create_aligned_time_series( -    device_id, measurements_lst, data_type_lst, encoding_lst, compressor_lst -) -``` - -Attention: Aliases for measurements are **not supported** currently. - -* Delete one or several timeseries - -```python -session.delete_time_series(paths_list) -``` - -* Check whether the specific timeseries exists - -```python -session.check_time_series_exists(path) -``` - -## 6. Data Manipulation Interface (DML Interface) - -### 6.1 Insert - -It is recommended to use `insert_tablet` to improve write efficiency. - -* Insert a Tablet, which is multiple rows of a device where each row has the same measurements - * **Better Write Performance** - * **Support null values**: fill the null value with any value, and then mark the null value via BitMap (from v0.13) - - -The Python API provides two implementations of Tablet. 
- -* Normal Tablet - -```python -values_ = [ - [False, 10, 11, 1.1, 10011.1, "test01"], - [True, 100, 11111, 1.25, 101.0, "test02"], - [False, 100, 1, 188.1, 688.25, "test03"], - [True, 0, 0, 0, 6.25, "test04"], -] -timestamps_ = [1, 2, 3, 4] -tablet_ = Tablet( - device_id, measurements_, data_types_, values_, timestamps_ -) -session.insert_tablet(tablet_) - -values_ = [ - [None, 10, 11, 1.1, 10011.1, "test01"], - [True, None, 11111, 1.25, 101.0, "test02"], - [False, 100, None, 188.1, 688.25, "test03"], - [True, 0, 0, 0, None, None], -] -timestamps_ = [16, 17, 18, 19] -tablet_ = Tablet( - device_id, measurements_, data_types_, values_, timestamps_ -) -session.insert_tablet(tablet_) -``` -* Numpy Tablet - -Comparing with Tablet, Numpy Tablet is using [numpy.ndarray](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html) to record data. -With less memory footprint and time cost of serialization, the insert performance will be better. - -**Notice** -1. time and numerical value columns in Tablet is ndarray -2. recommended to use the specific dtypes to each ndarray, see the example below - (if not, the default dtypes are also ok). 
- -```python -import numpy as np -data_types_ = [ - TSDataType.BOOLEAN, - TSDataType.INT32, - TSDataType.INT64, - TSDataType.FLOAT, - TSDataType.DOUBLE, - TSDataType.TEXT, -] -np_values_ = [ - np.array([False, True, False, True], TSDataType.BOOLEAN.np_dtype()), - np.array([10, 100, 100, 0], TSDataType.INT32.np_dtype()), - np.array([11, 11111, 1, 0], TSDataType.INT64.np_dtype()), - np.array([1.1, 1.25, 188.1, 0], TSDataType.FLOAT.np_dtype()), - np.array([10011.1, 101.0, 688.25, 6.25], TSDataType.DOUBLE.np_dtype()), - np.array(["test01", "test02", "test03", "test04"], TSDataType.TEXT.np_dtype()), -] -np_timestamps_ = np.array([1, 2, 3, 4], TSDataType.INT64.np_dtype()) -np_tablet_ = NumpyTablet( - device_id, measurements_, data_types_, np_values_, np_timestamps_ -) -session.insert_tablet(np_tablet_) - -# insert one numpy tablet with None into the database. -np_values_ = [ - np.array([False, True, False, True], TSDataType.BOOLEAN.np_dtype()), - np.array([10, 100, 100, 0], TSDataType.INT32.np_dtype()), - np.array([11, 11111, 1, 0], TSDataType.INT64.np_dtype()), - np.array([1.1, 1.25, 188.1, 0], TSDataType.FLOAT.np_dtype()), - np.array([10011.1, 101.0, 688.25, 6.25], TSDataType.DOUBLE.np_dtype()), - np.array(["test01", "test02", "test03", "test04"], TSDataType.TEXT.np_dtype()), -] -np_timestamps_ = np.array([98, 99, 100, 101], TSDataType.INT64.np_dtype()) -np_bitmaps_ = [] -for i in range(len(measurements_)): - np_bitmaps_.append(BitMap(len(np_timestamps_))) -np_bitmaps_[0].mark(0) -np_bitmaps_[1].mark(1) -np_bitmaps_[2].mark(2) -np_bitmaps_[4].mark(3) -np_bitmaps_[5].mark(3) -np_tablet_with_none = NumpyTablet( - device_id, measurements_, data_types_, np_values_, np_timestamps_, np_bitmaps_ -) -session.insert_tablet(np_tablet_with_none) -``` - -* Insert multiple Tablets - -```python -session.insert_tablets(tablet_lst) -``` - -* Insert a Record - -```python -session.insert_record(device_id, timestamp, measurements_, data_types_, values_) -``` - -* Insert multiple Records 
- -```python -session.insert_records( -    device_ids_, time_list_, measurements_list_, data_type_list_, values_list_ -) -``` - -* Insert multiple Records that belong to the same device. - With type information provided, the server does not need to do type inference, which leads to better performance. - - -```python -session.insert_records_of_one_device(device_id, time_list, measurements_list, data_types_list, values_list) -``` - -### 6.2 Insert with type inference - -When the data is of String type, the following interface can be used to infer the type from the value itself. For example, the value "true" is automatically inferred as a boolean type, and the value "3.2" is automatically inferred as a float type. Without type information, the server has to do type inference, which may cost some time. - -* Insert a Record, which contains multiple measurement values of a device at a timestamp - -```python -session.insert_str_record(device_id, timestamp, measurements, string_values) -``` - -### 6.3 Insert of Aligned Timeseries - -Inserting aligned timeseries uses interfaces named `insert_aligned_XXX`; otherwise they are similar to the interfaces above: - -* insert_aligned_record -* insert_aligned_records -* insert_aligned_records_of_one_device -* insert_aligned_tablet -* insert_aligned_tablets - - -## 7. IoTDB-SQL Interface - -* Execute query statement - -```python -session.execute_query_statement(sql) -``` - -* Execute non query statement - -```python -session.execute_non_query_statement(sql) -``` - -* Execute statement - -```python -session.execute_statement(sql) -``` - -## 8. Schema Template -### 8.1 Create Schema Template -The steps for creating a schema template are as follows: -1. Create the template class -2. Add MeasurementNode objects -3. 
Execute the create schema template function - -```python -template = Template(name=template_name, share_time=True) - -m_node_x = MeasurementNode("x", TSDataType.FLOAT, TSEncoding.RLE, Compressor.SNAPPY) -m_node_y = MeasurementNode("y", TSDataType.FLOAT, TSEncoding.RLE, Compressor.SNAPPY) -m_node_z = MeasurementNode("z", TSDataType.FLOAT, TSEncoding.RLE, Compressor.SNAPPY) - -template.add_template(m_node_x) -template.add_template(m_node_y) -template.add_template(m_node_z) - -session.create_schema_template(template) -``` -### 8.2 Modify Schema Template measurements -Modify the measurements in a template; the template must already exist. The following functions add or delete measurement nodes. -* Add nodes to a template -```python -session.add_measurements_in_template(template_name, measurements_path, data_types, encodings, compressors, is_aligned) -``` - -* Delete a node from a template -```python -session.delete_node_in_template(template_name, path) -``` - -### 8.3 Set Schema Template -```python -session.set_schema_template(template_name, prefix_path) -``` - -### 8.4 Unset Schema Template -```python -session.unset_schema_template(template_name, prefix_path) -``` - -### 8.5 Show Schema Template -* Show all schema templates -```python -session.show_all_templates() -``` -* Count all measurements in a template -```python -session.count_measurements_in_template(template_name) -``` - -* Check whether a path in the template is a measurement. The path must be in the template -```python -session.is_measurement_in_template(template_name, path) -``` - -* Check whether a path exists in the template. The path may not belong to the template -```python -session.is_path_exist_in_template(template_name, path) -``` - -* Show the nodes in a schema template -```python -session.show_measurements_in_template(template_name) -``` - -* Show the path prefix where a schema template is set -```python -session.show_paths_template_set_on(template_name) -``` - -* Show the path 
prefix where a schema template is used (i.e. the time series has been created) -```python -session.show_paths_template_using_on(template_name) -``` - -### 8.6 Drop Schema Template -Delete an existing schema template. Dropping a template that is already set is not supported. -```python -session.drop_schema_template("template_python") -``` - - -## 9. Pandas Support - -To easily transform a query result to a [Pandas Dataframe](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html) -the SessionDataSet has a method `.todf()` which consumes the dataset and transforms it to a pandas dataframe. - -Example: - -```python -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021"  # Before V2.0.6.x the default password is root -session = Session(ip, port_, username_, password_) -session.open(False) -with session.execute_query_statement("SELECT ** FROM root") as result: -    # Transform to a pandas DataFrame -    df = result.todf() - -session.close() - -# Now you can work with the dataframe -df = ... -``` - - -**Since V2.0.8.2**, `SessionDataSet` provides methods for batch DataFrame retrieval to efficiently handle large-volume queries: - -```python -# Batch DataFrame retrieval -has_next = result.has_next_df() -if has_next: -    df = result.next_df() -    # Process DataFrame -``` - -**Method Details:** -- `has_next_df()`: Returns `True`/`False` indicating whether more data exists -- `next_df()`: Returns a `DataFrame` or `None`. 
Each call returns `fetchSize` rows (default: 5000 rows, controlled by Session's `fetch_size` parameter): - - If remaining data ≥ `fetchSize`: returns `fetchSize` rows - - If remaining data < `fetchSize`: returns all remaining rows - - If traversal completes: returns `None` -- Session validates `fetchSize` at initialization: if ≤0, resets to 5000 and logs warning: `fetch_size xxx is illegal, use default fetch_size 5000` - -**Note:** Avoid mixing different traversal methods (e.g., combining `todf()` with `next_df()`), which may cause unexpected errors. - -**Usage Example:** - -```python -from iotdb.Session import Session - -# Initialize session with fetch_size=2 -session = Session( - host="127.0.0.1", port="6667", fetch_size=2 -) -session.open(False) -session.execute_non_query_statement("CREATE DATABASE root.device0") - -# Insert 3 records -session.insert_str_record("root.device0", 123, "pressure", "15.0") -session.insert_str_record("root.device0", 124, "pressure", "15.0") -session.insert_str_record("root.device0", 125, "pressure", "15.0") - -# Query and batch retrieve -with session.execute_query_statement("SELECT * FROM root.device0") as session_data_set: - while session_data_set.has_next_df(): - df = session_data_set.next_df() - # Outputs two DataFrames: first with 2 rows, second with 1 row - print(df) - -session.close() -``` - - -## 10. IoTDB Testcontainer - -The Test Support is based on the lib `testcontainers` (https://testcontainers-python.readthedocs.io/en/latest/index.html) which you need to install in your project if you want to use the feature. 
- -To start (and stop) an IoTDB Database in a Docker container simply do: -```python -class MyTestCase(unittest.TestCase): - - def test_something(self): - with IoTDBContainer() as c: - # Before V2.0.6.x the default password is root - session = Session("localhost", c.get_exposed_port(6667), "root", "TimechoDB@2021") - session.open(False) - with session.execute_query_statement("SHOW TIMESERIES") as result: - print(result) - session.close() -``` - -By default it will load the image `apache/iotdb:latest`. If you want a specific version, pass it explicitly, e.g. `IoTDBContainer("apache/iotdb:0.12.0")` to get version `0.12.0` running. - -## 11. IoTDB DBAPI - -IoTDB DBAPI implements the Python DB API 2.0 specification (https://peps.python.org/pep-0249/), which defines a common -interface for accessing databases in Python. - -### 11.1 Examples -+ Initialization - -The initialization parameters are consistent with the session part (except for `sqlalchemy_mode`). -```python -from iotdb.dbapi import connect - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021" # Before V2.0.6.x the default password is root -conn = connect(ip, port_, username_, password_,fetch_size=1024,zone_id="UTC+8",sqlalchemy_mode=False) -cursor = conn.cursor() -``` -+ simple SQL statement execution -```python -cursor.execute("SELECT ** FROM root") -for row in cursor.fetchall(): - print(row) -``` - -+ execute SQL with parameter - -IoTDB DBAPI supports pyformat style parameters -```python -cursor.execute("SELECT ** FROM root WHERE time < %(time)s",{"time":"2017-11-01T00:08:00.000"}) -for row in cursor.fetchall(): - print(row) -``` - -+ execute SQL with parameter sequences -```python -seq_of_parameters = [ - {"timestamp": 1, "temperature": 1}, - {"timestamp": 2, "temperature": 2}, - {"timestamp": 3, "temperature": 3}, - {"timestamp": 4, "temperature": 4}, - {"timestamp": 5, "temperature": 5}, -] -sql = "insert into root.cursor(timestamp,temperature)
values(%(timestamp)s,%(temperature)s)" -cursor.executemany(sql,seq_of_parameters) -``` - -+ close the connection and cursor -```python -cursor.close() -conn.close() -``` - -## 12. IoTDB SQLAlchemy Dialect (Experimental) -The SQLAlchemy dialect of IoTDB is written to adapt to Apache Superset. -This part is still being improved. -Please do not use it in the production environment! -### 12.1 Mapping of the metadata -The data model used by SQLAlchemy is a relational data model, which describes the relationships between different entities through tables. -While the data model of IoTDB is a hierarchical data model, which organizes the data through a tree structure. -In order to adapt IoTDB to the dialect of SQLAlchemy, the original data model in IoTDB needs to be reorganized. -Converting the data model of IoTDB into the data model of SQLAlchemy. - -The metadata in the IoTDB are: - -1. Database -2. Path -3. Entity -4. Measurement - -The metadata in the SQLAlchemy are: -1. Schema -2. Table -3. Column - -The mapping relationship between them is: - -| The metadata in the SQLAlchemy | The metadata in the IoTDB | -| -------------------- | -------------------------------------------- | -| Schema | Database | -| Table | Path ( from database to entity ) + Entity | -| Column | Measurement | - -The following figure shows the relationship between the two more intuitively: - -![sqlalchemy-to-iotdb](/img/UserGuide/API/IoTDB-SQLAlchemy/sqlalchemy-to-iotdb.png?raw=true) - -### 12.2 Data type mapping -| data type in IoTDB | data type in SQLAlchemy | -|--------------------|-------------------------| -| BOOLEAN | Boolean | -| INT32 | Integer | -| INT64 | BigInteger | -| FLOAT | Float | -| DOUBLE | Float | -| TEXT | Text | -| LONG | BigInteger | - -### 12.3 Example - -+ execute statement - -```python -from sqlalchemy import create_engine - -engine = create_engine("iotdb://root:root@127.0.0.1:6667") -connect = engine.connect() -result = connect.execute("SELECT ** FROM root") -for row in 
result.fetchall(): - print(row) -``` - -+ ORM (now only simple queries are supported) - -```python -from sqlalchemy import create_engine, Column, Float, BigInteger, MetaData -from sqlalchemy.ext.declarative import declarative_base -from sqlalchemy.orm import sessionmaker - -metadata = MetaData( - schema='root.factory' -) -Base = declarative_base(metadata=metadata) - - -class Device(Base): - __tablename__ = "room2.device1" - Time = Column(BigInteger, primary_key=True) - temperature = Column(Float) - status = Column(Float) - - -engine = create_engine("iotdb://root:TimechoDB@2021@127.0.0.1:6667") # Before V2.0.6.x the default password is root - -DbSession = sessionmaker(bind=engine) -session = DbSession() - -res = session.query(Device.status).filter(Device.temperature > 1) - -for row in res: - print(row) -``` - - -## 13. Developers - -### 13.1 Introduction - -This is an example of how to connect to IoTDB with Python, using the Thrift RPC interfaces. Things are almost the same on Windows and Linux, but pay attention to differences such as the path separator. - - - -### 13.2 Prerequisites - -Python 3.7 or later is preferred. - -You have to install Thrift (0.11.0 or later) to compile our thrift file into Python code. Below is the official installation tutorial; at the end, you should have a `thrift` executable. - -``` -http://thrift.apache.org/docs/install/ -``` - -Before starting you need to install `requirements_dev.txt` in your Python environment, e.g. by calling -```shell -pip install -r requirements_dev.txt -``` - - - -### 13.3 Compile the thrift library and Debug - -In the root of IoTDB's source code folder, run `mvn clean generate-sources -pl iotdb-client/client-py -am`. - -This will automatically delete and repopulate the folder `iotdb/thrift` with the generated thrift files. -This folder is ignored by git and should **never be pushed to git!** - -**Notice** Do not upload `iotdb/thrift` to the git repo.
- - - - -### 13.4 Session Client & Example - -We packed up the Thrift interface in `client-py/src/iotdb/Session.py` (similar with its Java counterpart), also provided an example file `client-py/src/SessionExample.py` of how to use the session module. please read it carefully. - - -Or, another simple example: - -```python -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021" //Before V2.0.6.x the default password is root -session = Session(ip, port_, username_, password_) -session.open(False) -zone = session.get_time_zone() -session.close() -``` - - - -### 13.5 Tests - -Please add your custom tests in `tests` folder. - -To run all defined tests just type `pytest .` in the root folder. - -**Notice** Some tests need docker to be started on your system as a test instance is started in a docker container using [testcontainers](https://testcontainers-python.readthedocs.io/en/latest/index.html). - - - -### 13.6 Futher Tools - -[black](https://pypi.org/project/black/) and [flake8](https://pypi.org/project/flake8/) are installed for autoformatting and linting. -Both can be run by `black .` or `flake8 .` respectively. - - - -## 14. Releasing - -To do a release just ensure that you have the right set of generated thrift files. -Then run linting and auto-formatting. -Then, ensure that all tests work (via `pytest .`). -Then you are good to go to do a release! - - - -### 14.1 Preparing your environment - -First, install all necessary dev dependencies via `pip install -r requirements_dev.txt`. - - - -### 14.2 Doing the Release - -There is a convenient script `release.sh` to do all steps for a release. 
-Namely, these are - -* Remove all transient directories from last release (if exists) -* (Re-)generate all generated sources via mvn -* Run Linting (flake8) -* Run Tests via pytest -* Build -* Release to pypi - diff --git a/src/UserGuide/Master/Tree/API/RestServiceV1_timecho.md b/src/UserGuide/Master/Tree/API/RestServiceV1_timecho.md deleted file mode 100644 index 376b21c6b..000000000 --- a/src/UserGuide/Master/Tree/API/RestServiceV1_timecho.md +++ /dev/null @@ -1,931 +0,0 @@ - - -# REST API V1(Not Recommend) -IoTDB's RESTful services can be used for query, write, and management operations, using the OpenAPI standard to define interfaces and generate frameworks. - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the REST service JAR file by default. Please contact the Timecho team to obtain the corresponding JAR file before using this service, and place it in the `timechodb_home/lib` or `timechodb_home/ext/external_service` directory. - -## 1. Enable RESTful Services - -All RESTful services require **Basic authentication** except the health check interface `/ping`. An `Authorization` header must be carried in all requests. - -1. Authentication Format -``` -Authorization: Basic -``` -Where `` is the Base64 encoding result of the string formatted as `username:password`. Quick generation methods are as follows: - -* Linux/macOS -```bash -echo -n "your_username:your_password" | base64 -Example: echo -n "root:TimechoDB@2021" | base64 -``` - -* Windows -```powershell -# PowerShell -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("username:password")) -Example: [Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("root:TimechoDB@2021")) -``` - -```cmd -# CMD -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"username:password\"))" -Example: powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"root:TimechoDB@2021\"))" -``` - -2. 
Authentication Example - -Default username: `root`, default password: `TimechoDB@2021`: -- Concatenated string: `root:TimechoDB@2021` -- Base64 encoded result: `cm9vdDpUaW1lY2hvREJAMjAyMQ==` -- Final Request Header: -``` -Authorization: Basic cm9vdDpUaW1lY2hvREJAMjAyMQ== -``` - -3. Error Description -- Incorrect username or password: Returns HTTP status code `600` with response content: -```json -{"code":600,"message":"WRONG_LOGIN_PASSWORD_ERROR"} -``` - -- Missing `Authorization` header: Returns HTTP status code `603` with response content: -```json -{"code":603,"message":"UNINITIALIZED_AUTH_ERROR"} -``` - -## 3. Interface - -### 3.1 ping - -The `/ping` API can be used for service liveness probing. - -Request method: `GET` - -Request path: `http://ip:port/ping` - -The user name used in the example is: root, password: root - -Example request: - -```shell -$ curl http://127.0.0.1:18080/ping -``` - -Response status codes: - -- `200`: The service is alive. -- `503`: The service cannot accept any requests now. - -Response parameters: - -|parameter name |parameter type |parameter describe| -|:--- | :--- | :---| -|code | integer | status code | -| message | string | message | - -Sample response: - -- With HTTP status code `200`: - - ```json - { - "code": 200, - "message": "SUCCESS_STATUS" - } - ``` - -- With HTTP status code `503`: - - ```json - { - "code": 500, - "message": "thrift service is unavailable" - } - ``` - -> `/ping` can be accessed without authorization. - -### 3.2 query - -The query interface can be used to handle data queries and metadata queries. 
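As a rough illustration of how a client might call this interface, the sketch below assembles a POST request for `/rest/v1/query` using only Python's standard library. The helper names `basic_auth_header` and `build_query_request`, the host, and the credentials are placeholders chosen for this example and are not part of IoTDB; actually sending the request (left commented out) assumes a reachable REST service.

```python
import base64
import json
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Return the Authorization header value for Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_query_request(base_url, sql, username, password, row_limit=None):
    """Assemble (but do not send) a POST request for the v1 query interface."""
    payload = {"sql": sql}
    if row_limit is not None:
        # Optional parameter; the server falls back to its configured default.
        payload["rowLimit"] = row_limit
    return urllib.request.Request(
        url=f"{base_url}/rest/v1/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": basic_auth_header(username, password),
        },
        method="POST",
    )


# Hypothetical host and credentials, used only for illustration.
request = build_query_request(
    "http://127.0.0.1:18080", "show child paths root", "root", "TimechoDB@2021"
)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))

# Sending the request requires a running REST service:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it keeps the header and body logic easy to check without a server.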
- -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v1/query` - -Parameter Description: - -| parameter name | parameter type | required | parameter description | -|----------------| -------------- | -------- | ------------------------------------------------------------ | -| sql | string | yes | | -| rowLimit | integer | no | The maximum number of rows in the result set that can be returned by a query.
If this parameter is not set, the `rest_query_default_row_size_limit` of the configuration file will be used as the default value.
When the number of rows in the returned result set exceeds the limit, the status code `411` will be returned. | - -Response parameters: - -| parameter name | parameter type | parameter description | -|----------------| -------------- | ------------------------------------------------------------ | -| expressions | array | Array of result set column names for data query, `null` for metadata query | -| columnNames | array | Array of column names for metadata query result set, `null` for data query | -| timestamps | array | Timestamp column, `null` for metadata query | -| values | array | A two-dimensional array, the first dimension has the same length as the result set column name array, and the second dimension array represents a column of the result set | - -**Examples:** - -Tip: Statements like `select * from root.xx.**` are not recommended because those statements may cause OOM. - -**Expression query** - - ```shell - curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select s3, s4, s3 + 1 from root.sg27 limit 2"}' http://127.0.0.1:18080/rest/v1/query - ```` -Response instance - ```json - { - "expressions": [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg27.s3 + 1" - ], - "columnNames": null, - "timestamps": [ - 1635232143960, - 1635232153960 - ], - "values": [ - [ - 11, - null - ], - [ - false, - true - ], - [ - 12.0, - null - ] - ] - } - ``` - -**Show child paths** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show child paths root"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "child paths" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ] - ] -} -``` - -**Show child nodes** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show child nodes root"}' http://127.0.0.1:18080/rest/v1/query -``` - 
-```json -{ - "expressions": null, - "columnNames": [ - "child nodes" - ], - "timestamps": null, - "values": [ - [ - "sg27", - "sg28" - ] - ] -} -``` - -**Show all ttl** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show all ttl"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - null, - null - ] - ] -} -``` - -**Show ttl** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show ttl on root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27" - ], - [ - null - ] - ] -} -``` - -**Show functions** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show functions"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "function name", - "function type", - "class name (UDF)" - ], - "timestamps": null, - "values": [ - [ - "ABS", - "ACOS", - "ASIN", - ... - ], - [ - "built-in UDTF", - "built-in UDTF", - "built-in UDTF", - ... - ], - [ - "org.apache.iotdb.db.query.udf.builtin.UDTFAbs", - "org.apache.iotdb.db.query.udf.builtin.UDTFAcos", - "org.apache.iotdb.db.query.udf.builtin.UDTFAsin", - ... 
- ] - ] -} -``` - -**Show timeseries** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show timeseries"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg28.s3", - "root.sg28.s4" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg27", - "root.sg27", - "root.sg28", - "root.sg28" - ], - [ - "INT32", - "BOOLEAN", - "INT32", - "BOOLEAN" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -**Show latest timeseries** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show latest timeseries"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg28.s4", - "root.sg27.s4", - "root.sg28.s3", - "root.sg27.s3" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg28", - "root.sg27", - "root.sg28", - "root.sg27" - ], - [ - "BOOLEAN", - "BOOLEAN", - "INT32", - "INT32" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -**Count timeseries** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"count timeseries root.**"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "count" - ], - 
"timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -**Count nodes** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"count nodes root.** level=2"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -**Show devices** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show devices"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "devices", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -**Show devices with database** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show devices with database"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "devices", - "database", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -**List user** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"list user"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "user" - ], - "timestamps": null, - "values": [ - [ - "root" - ] - ] -} -``` - -**Aggregation** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(*) from root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "columnNames": null, - "timestamps": [ - 0 - ], - "values": [ - [ - 1 - ], - [ - 2 - ] - ] -} -``` - -**Group 
by level** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(*) from root.** group by level = 1"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "count(root.sg27.*)", - "count(root.sg28.*)" - ], - "timestamps": null, - "values": [ - [ - 3 - ], - [ - 3 - ] - ] -} -``` - -**Group by** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(*) from root.sg27 group by([1635232143960,1635232153960),1s)"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "columnNames": null, - "timestamps": [ - 1635232143960, - 1635232144960, - 1635232145960, - 1635232146960, - 1635232147960, - 1635232148960, - 1635232149960, - 1635232150960, - 1635232151960, - 1635232152960 - ], - "values": [ - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ], - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ] - ] -} -``` - -**Last** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select last s3 from root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "timeseries", - "value", - "dataType" - ], - "timestamps": [ - 1635232143960 - ], - "values": [ - [ - "root.sg27.s3" - ], - [ - "11" - ], - [ - "INT32" - ] - ] -} -``` - -**Disable align** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select * from root.sg27 disable align"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "code": 407, - "message": "disable align clauses are not supported." 
-} -``` - -**Align by device** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(s3) from root.sg27 align by device"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "code": 407, - "message": "align by device clauses are not supported." -} -``` - -**Select into** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select s3, s4 into root.sg29.s1, root.sg29.s2 from root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "code": 407, - "message": "select into clauses are not supported." -} -``` - -### 3.3 nonQuery - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v1/nonQuery` - -Parameter Description: - -|parameter name |parameter type |parameter describe| -|:--- | :--- | :---| -| sql | string | query content | - -Example request: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"CREATE DATABASE root.ln"}' http://127.0.0.1:18080/rest/v1/nonQuery -``` - -Response parameters: - -|parameter name |parameter type |parameter describe| -|:--- | :--- | :---| -| code | integer | status code | -| message | string | message | - -Sample response: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - - -### 3.4 insertTablet - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v1/insertTablet` - -Parameter Description: - -| parameter name |parameter type |is required|parameter describe| -|:---------------| :--- | :---| :---| -| timestamps | array | yes | Time column | -| measurements | array | yes | The name of the measuring point | -| dataTypes | array | yes | The data type | -| values | array | yes | Value columns, the values in each column can be `null` | -| isAligned | boolean | yes | Whether to align the timeseries | -| deviceId | 
string | yes | Device name | - -Example request: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"timestamps":[1635232143960,1635232153960],"measurements":["s3","s4"],"dataTypes":["INT32","BOOLEAN"],"values":[[11,null],[false,true]],"isAligned":false,"deviceId":"root.sg27"}' http://127.0.0.1:18080/rest/v1/insertTablet -``` - -Response parameters: - -|parameter name |parameter type |parameter description| -|:--- | :--- | :---| -| code | integer | status code | -| message | string | message | - -Sample response: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - -## 4. Configuration - -The configuration is located in `iotdb-system.properties`. - -* Set `enable_rest_service` to `true` to enable the module, or `false` to disable it. The default is `false`. - -```properties -enable_rest_service=true -``` - -* This parameter takes effect only when `enable_rest_service=true`. Set `rest_service_port` to a number (1025 to 65535) to customize the REST service socket port. The default is 18080. - -```properties -rest_service_port=18080 -``` - -* Set `enable_swagger` to `true` to expose the REST service interface information through Swagger, or `false` to hide it. The default is `false`. - -```properties -enable_swagger=false -``` - -* The maximum number of rows in the result set that can be returned by a query. When the number of rows in the returned result set exceeds the limit, the status code `411` is returned.
- -````properties -rest_query_default_row_size_limit=10000 -```` - -* Expiration time for cached user login information (used to speed up user authentication), in seconds; the default is 8 hours - -```properties -cache_expire=28800 -``` - - -* Maximum number of users stored in the cache (default: 100) - -```properties -cache_max_num=100 -``` - -* Initial cache size (default: 10) - -```properties -cache_init_num=10 -``` - -* Whether the REST service enables SSL: set `enable_https` to `true` to enable it, or `false` to disable it. The default is `false`. - -```properties -enable_https=false -``` - -* keyStore location path (optional) - -```properties -key_store_path= -``` - - -* keyStore password (optional) - -```properties -key_store_pwd= -``` - - -* trustStore location path (optional) - -```properties -trust_store_path= -``` - -* trustStore password (optional) - -```properties -trust_store_pwd= -``` - - -* SSL timeout period, in seconds - -```properties -idle_timeout=5000 -``` diff --git a/src/UserGuide/Master/Tree/API/RestServiceV2_timecho.md b/src/UserGuide/Master/Tree/API/RestServiceV2_timecho.md deleted file mode 100644 index 6a852c489..000000000 --- a/src/UserGuide/Master/Tree/API/RestServiceV2_timecho.md +++ /dev/null @@ -1,983 +0,0 @@ - - -# REST API V2 -IoTDB's RESTful services can be used for query, write, and management operations, using the OpenAPI standard to define interfaces and generate frameworks. - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the REST service JAR file by default. Please contact the Timecho team to obtain the corresponding JAR file before using this service, and place it in the `timechodb_home/lib` or `timechodb_home/ext/external_service` directory. - -## 1. Enable RESTful Services - -RESTful services are disabled by default.
- -Find the `conf/iotdb-system.properties` file under the IoTDB installation directory and set `enable_rest_service` to `true` to enable the module. - - ```properties - enable_rest_service=true - ``` - -## 2. Authentication - -All RESTful services require **Basic authentication** except the health check interface `/ping`. An `Authorization` header must be carried in all requests. - -1. Authentication Format -``` -Authorization: Basic -``` -Where `` is the Base64 encoding result of the string formatted as `username:password`. Quick generation methods are as follows: - -* Linux/macOS -```bash -echo -n "your_username:your_password" | base64 -Example: echo -n "root:TimechoDB@2021" | base64 -``` - -* Windows -```powershell -# PowerShell -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("username:password")) -Example: [Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("root:TimechoDB@2021")) -``` - -```cmd -# CMD -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"username:password\"))" -Example: powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"root:TimechoDB@2021\"))" -``` - -2. Authentication Example - -Default username: `root`, default password: `TimechoDB@2021`: -- Concatenated string: `root:TimechoDB@2021` -- Base64 encoded result: `cm9vdDpUaW1lY2hvREJAMjAyMQ==` -- Final Request Header: -``` -Authorization: Basic cm9vdDpUaW1lY2hvREJAMjAyMQ== -``` - -3. Error Description -- Incorrect username or password: Returns HTTP status code `600` with response content: -```json -{"code":600,"message":"WRONG_LOGIN_PASSWORD_ERROR"} -``` - -- Missing `Authorization` header: Returns HTTP status code `603` with response content: -```json -{"code":603,"message":"UNINITIALIZED_AUTH_ERROR"} -``` - - - -## 3. Interface - -### 3.1 ping - -The `/ping` API can be used for service liveness probing. 
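A liveness probe built on this endpoint could look like the following sketch. It uses only Python's standard library; `interpret_ping_status` and `ping` are names made up for this example, and the host and port in the commented call are assumptions that match the default `rest_service_port`. The mapping treats HTTP `200` as alive and anything else, including `503` or an unreachable host, as not alive.

```python
import urllib.error
import urllib.request


def interpret_ping_status(status: int) -> bool:
    """Map the HTTP status returned by /ping to alive (True) or not (False)."""
    return status == 200


def ping(base_url: str, timeout: float = 3.0) -> bool:
    """Probe /ping; this endpoint does not need an Authorization header."""
    try:
        with urllib.request.urlopen(f"{base_url}/ping", timeout=timeout) as response:
            return interpret_ping_status(response.status)
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for non-2xx responses such as 503.
        return interpret_ping_status(error.code)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar problems.
        return False


# Example call against an assumed local deployment:
# print("alive" if ping("http://127.0.0.1:18080") else "not reachable")
```

Keeping the status interpretation in its own function makes the probe logic checkable without a running service.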
- -Request method: `GET` - -Request path: `http://ip:port/ping` - -The user name used in the example is: root, password: root - -Example request: - -```shell -$ curl http://127.0.0.1:18080/ping -``` - -Response status codes: - -- `200`: The service is alive. -- `503`: The service cannot accept any requests now. - -Response parameters: - -|parameter name |parameter type |parameter describe| -|:--- | :--- | :---| -|code | integer | status code | -| message | string | message | - -Sample response: - -- With HTTP status code `200`: - - ```json - { - "code": 200, - "message": "SUCCESS_STATUS" - } - ``` - -- With HTTP status code `503`: - - ```json - { - "code": 500, - "message": "thrift service is unavailable" - } - ``` - -> `/ping` can be accessed without authorization. - -### 3.2 query - -The query interface can be used to handle data queries and metadata queries. - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v2/query` - -Parameter Description: - -| parameter name | parameter type | required | parameter description | -|----------------| -------------- | -------- | ------------------------------------------------------------ | -| sql | string | yes | | -| row_limit | integer | no | The maximum number of rows in the result set that can be returned by a query.
If this parameter is not set, the `rest_query_default_row_size_limit` of the configuration file will be used as the default value.
When the number of rows in the returned result set exceeds the limit, the status code `411` will be returned. | - -Response parameters: - -| parameter name | parameter type | parameter description | -|----------------| -------------- | ------------------------------------------------------------ | -| expressions | array | Array of result set column names for data query, `null` for metadata query | -| column_names | array | Array of column names for metadata query result set, `null` for data query | -| timestamps | array | Timestamp column, `null` for metadata query | -| values | array | A two-dimensional array, the first dimension has the same length as the result set column name array, and the second dimension array represents a column of the result set | - -**Examples:** - -Tip: Statements like `select * from root.xx.**` are not recommended because those statements may cause OOM. - -**Expression query** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select s3, s4, s3 + 1 from root.sg27 limit 2"}' http://127.0.0.1:18080/rest/v2/query -```` - -```json -{ - "expressions": [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg27.s3 + 1" - ], - "column_names": null, - "timestamps": [ - 1635232143960, - 1635232153960 - ], - "values": [ - [ - 11, - null - ], - [ - false, - true - ], - [ - 12.0, - null - ] - ] -} -``` - -**Show child paths** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show child paths root"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "child paths" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ] - ] -} -``` - -**Show child nodes** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show child nodes root"}' 
http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "child nodes" - ], - "timestamps": null, - "values": [ - [ - "sg27", - "sg28" - ] - ] -} -``` - -**Show all ttl** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show all ttl"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - null, - null - ] - ] -} -``` - -**Show ttl** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show ttl on root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27" - ], - [ - null - ] - ] -} -``` - -**Show functions** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show functions"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "function name", - "function type", - "class name (UDF)" - ], - "timestamps": null, - "values": [ - [ - "ABS", - "ACOS", - "ASIN", - ... - ], - [ - "built-in UDTF", - "built-in UDTF", - "built-in UDTF", - ... - ], - [ - "org.apache.iotdb.db.query.udf.builtin.UDTFAbs", - "org.apache.iotdb.db.query.udf.builtin.UDTFAcos", - "org.apache.iotdb.db.query.udf.builtin.UDTFAsin", - ... 
- ] - ] -} -``` - -**Show timeseries** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show timeseries"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg28.s3", - "root.sg28.s4" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg27", - "root.sg27", - "root.sg28", - "root.sg28" - ], - [ - "INT32", - "BOOLEAN", - "INT32", - "BOOLEAN" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -**Show latest timeseries** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show latest timeseries"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg28.s4", - "root.sg27.s4", - "root.sg28.s3", - "root.sg27.s3" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg28", - "root.sg27", - "root.sg28", - "root.sg27" - ], - [ - "BOOLEAN", - "BOOLEAN", - "INT32", - "INT32" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -**Count timeseries** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"count timeseries root.**"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - 
"expressions": null, - "column_names": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -**Count nodes** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"count nodes root.** level=2"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -**Show devices** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show devices"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "devices", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -**Show devices with database** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show devices with database"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "devices", - "database", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -**List user** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"list user"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "user" - ], - "timestamps": null, - "values": [ - [ - "root" - ] - ] -} -``` - -**Aggregation** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(*) from root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": [ - "count(root.sg27.s3)", 
- "count(root.sg27.s4)" - ], - "column_names": null, - "timestamps": [ - 0 - ], - "values": [ - [ - 1 - ], - [ - 2 - ] - ] -} -``` - -**Group by level** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(*) from root.** group by level = 1"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "count(root.sg27.*)", - "count(root.sg28.*)" - ], - "timestamps": null, - "values": [ - [ - 3 - ], - [ - 3 - ] - ] -} -``` - -**Group by** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(*) from root.sg27 group by([1635232143960,1635232153960),1s)"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "column_names": null, - "timestamps": [ - 1635232143960, - 1635232144960, - 1635232145960, - 1635232146960, - 1635232147960, - 1635232148960, - 1635232149960, - 1635232150960, - 1635232151960, - 1635232152960 - ], - "values": [ - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ], - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ] - ] -} -``` - -**Last** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select last s3 from root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "timeseries", - "value", - "dataType" - ], - "timestamps": [ - 1635232143960 - ], - "values": [ - [ - "root.sg27.s3" - ], - [ - "11" - ], - [ - "INT32" - ] - ] -} -``` - -**Disable align** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select * from root.sg27 disable align"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "code": 407, - "message": "disable 
align clauses are not supported." -} -``` - -**Align by device** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(s3) from root.sg27 align by device"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "code": 407, - "message": "align by device clauses are not supported." -} -``` - -**Select into** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select s3, s4 into root.sg29.s1, root.sg29.s2 from root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "code": 407, - "message": "select into clauses are not supported." -} -``` - -### 3.3 nonQuery - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v2/nonQuery` - -Parameter Description: - -|parameter name |parameter type |parameter describe| -|:--- | :--- | :---| -| sql | string | query content | - -Example request: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"CREATE DATABASE root.ln"}' http://127.0.0.1:18080/rest/v2/nonQuery -``` - -Response parameters: - -|parameter name |parameter type |parameter describe| -|:--- | :--- | :---| -| code | integer | status code | -| message | string | message | - -Sample response: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - - -### 3.4 insertTablet - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v2/insertTablet` - -Parameter Description: - -| parameter name |parameter type |is required|parameter describe| -|:---------------| :--- | :---| :---| -| timestamps | array | yes | Time column | -| measurements | array | yes | The name of the measuring point | -| data_types | array | yes | The data type | -| values | array | yes | Value columns, the values in each column can be `null` | 
-| is_aligned | boolean | yes | Whether to align the timeseries | -| device | string | yes | Device name | - -Example request: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"timestamps":[1635232143960,1635232153960],"measurements":["s3","s4"],"data_types":["INT32","BOOLEAN"],"values":[[11,null],[false,true]],"is_aligned":false,"device":"root.sg27"}' http://127.0.0.1:18080/rest/v2/insertTablet -``` - -Response parameters: - -|parameter name |parameter type |parameter description| -|:--- | :--- | :---| -| code | integer | status code | -| message | string | message | - -Sample response: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - -### 3.5 insertRecords - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v2/insertRecords` - -Parameter Description: - -| parameter name |parameter type |is required|parameter description| -|:------------------| :--- | :---| :---| -| timestamps | array | yes | Time column, one timestamp per record | -| measurements_list | array | yes | Measurement names, one array per record | -| data_types_list | array | yes | Data types, one array per record | -| values_list | array | yes | Values, one array per record; individual values can be `null` | -| devices | array | yes | Device names, one per record | -| is_aligned | boolean | yes | Whether to align the timeseries | - -Example request: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"timestamps":[1635232113960,1635232151960,1635232143960,1635232143960],"measurements_list":[["s33","s44"],["s55","s66"],["s77","s88"],["s771","s881"]],"data_types_list":[["INT32","INT64"],["FLOAT","DOUBLE"],["FLOAT","DOUBLE"],["BOOLEAN","TEXT"]],"values_list":[[1,11],[2.1,2],[4,6],[false,"cccccc"]],"is_aligned":false,"devices":["root.s1","root.s1","root.s1","root.s3"]}' http://127.0.0.1:18080/rest/v2/insertRecords -``` - -Response parameters: - -|parameter name |parameter type 
|parameter description| -|:--- | :--- | :---| -| code | integer | status code | -| message | string | message | - -Sample response: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - -## 4. Configuration - -The configuration is located in `iotdb-system.properties`. - -* Set `enable_rest_service` to `true` to enable the module, or `false` to disable it. The default is `false`. - -```properties -enable_rest_service=true -``` - -* Takes effect only when `enable_rest_service=true`. Set `rest_service_port` to a number (1025 to 65535) to customize the REST service socket port. The default is 18080. - -```properties -rest_service_port=18080 -``` - -* Set `enable_swagger` to `true` to expose the REST service interface information through Swagger, or `false` to hide it. The default is `false`. - -```properties -enable_swagger=false -``` - -* The maximum number of rows in the result set that can be returned by a query. When the number of rows in the returned result set exceeds the limit, the status code `411` is returned. - -````properties -rest_query_default_row_size_limit=10000 -```` - -* Expiration time for cached user login information, in seconds (used to speed up user authentication; 8 hours by default) - -```properties -cache_expire=28800 -``` - - -* Maximum number of users stored in the cache (default: 100) - -```properties -cache_max_num=100 -``` - -* Initial cache size (default: 10) - -```properties -cache_init_num=10 -``` - -* Whether the REST service enables SSL. Set `enable_https` to `true` to enable it, or `false` to disable it. The default is `false`. 
- -```properties -enable_https=false -``` - -* keyStore location path (optional) - -```properties -key_store_path= -``` - - -* keyStore password (optional) - -```properties -key_store_pwd= -``` - - -* trustStore location path (optional) - -```properties -trust_store_path= -``` - -* trustStore password (optional) - -```properties -trust_store_pwd= -``` - - -* SSL timeout period, in seconds - -```properties -idle_timeout=5000 -``` diff --git a/src/UserGuide/Master/Tree/Background-knowledge/Cluster-Concept_timecho.md b/src/UserGuide/Master/Tree/Background-knowledge/Cluster-Concept_timecho.md deleted file mode 100644 index 8e93c5f4e..000000000 --- a/src/UserGuide/Master/Tree/Background-knowledge/Cluster-Concept_timecho.md +++ /dev/null @@ -1,118 +0,0 @@ - - -# Common Concepts - -## 1. Sql_dialect Related Concepts - -| Concept | Meaning | -| ----------------------- | ------------------------------------------------------------ | -| sql_dialect | IoTDB supports two time-series data models (SQL dialects), both managing devices and measurement points. Tree: Manages data in a hierarchical path manner, where one path corresponds to one measurement point of a device. Table: Manages data in a relational table manner, where one table corresponds to a category of devices. | -| Schema | Schema is the data model information of the database, i.e., tree structure or table structure. It includes definitions such as the names and data types of measurement points. | -| Device | Corresponds to a physical device in an actual scenario, usually containing multiple measurement points. | -| Timeseries | Also known as: physical quantity, time series, timeline, point location, semaphore, indicator, measurement value, etc. It is a time series formed by arranging multiple data points in ascending order of timestamps. Usually, a Timeseries represents a collection point that can periodically collect physical quantities of the environment it is in. 
| -| Encoding | Encoding is a compression technique that represents data in binary form to improve storage efficiency. IoTDB supports various encoding methods for different types of data. For more detailed information, please refer to:[Encoding-and-Compression](../Technical-Insider/Encoding-and-Compression.md) | -| Compression | After data encoding, IoTDB uses compression technology to further compress binary data to enhance storage efficiency. IoTDB supports multiple compression methods. For more detailed information, please refer to: [Encoding-and-Compression](../Technical-Insider/Encoding-and-Compression.md) | - -## 2. Distributed Related Concepts - -The following figure shows a common IoTDB 3C3D (3 ConfigNodes, 3 DataNodes) cluster deployment pattern: - - - -IoTDB's cluster includes the following common concepts: - -- Nodes(ConfigNode、DataNode、AINode) -- Region(SchemaRegion、DataRegion) -- Replica Groups - -The above concepts will be introduced in the following text. - - -### 2.1 Nodes - -IoTDB cluster includes three types of nodes (processes): ConfigNode (management node), DataNode (data node), and AINode (analysis node), as shown below: - -- ConfigNode: Manages cluster node information, configuration information, user permissions, metadata, partition information, etc., and is responsible for the scheduling of distributed operations and load balancing. All ConfigNodes are fully backed up with each other, as shown in ConfigNode-1, ConfigNode-2, and ConfigNode-3 in the figure above. -- DataNode: Serves client requests and is responsible for data storage and computation, as shown in DataNode-1, DataNode-2, and DataNode-3 in the figure above. -- AINode: Provides machine learning capabilities, supports the registration of trained machine learning models, and allows model inference through SQL calls. It has already built-in self-developed time-series large models and common machine learning algorithms (such as prediction and anomaly detection). 
- -### 2.2 Data Partitioning - -In IoTDB, both metadata and data are divided into small partitions, namely Regions, which are managed by various DataNodes in the cluster. - -- SchemaRegion: Metadata partition, managing the metadata of a part of devices and measurement points. SchemaRegions with the same RegionID on different DataNodes are mutual replicas, as shown in SchemaRegion-1 in the figure above, which has three replicas located on DataNode-1, DataNode-2, and DataNode-3. -- DataRegion: Data partition, managing the data of a part of devices for a certain period of time. DataRegions with the same RegionID on different DataNodes are mutual replicas, as shown in DataRegion-2 in the figure above, which has two replicas located on DataNode-1 and DataNode-2. -- For specific partitioning algorithms, please refer to: [Data Partitioning](../Technical-Insider/Cluster-data-partitioning.md) - -### 2.3 Replica Groups - -The number of replicas for data and metadata can be configured. The recommended configurations for different deployment modes are as follows, where multi-replication can provide high-availability services. - -| Category | Parameter | Stand-Alone Recommended Configuration | Cluster Recommended Configuration | -| :----- | :------------------------ | :----------- | :----------- | -| Schema | schema_replication_factor | 1 | 3 | -| Data | data_replication_factor | 1 | 2 | - - -## 3. Deployment Related Concepts - -IoTDB has three operating modes: Stand-Alone mode, Cluster mode, and Dual-Active mode. - -### 3.1 Stand-Alone Mode - -An IoTDB Stand-Alone instance includes 1 ConfigNode and 1 DataNode, i.e., 1C1D; - - -- **Features**:Easy for developers to install and deploy, with low deployment and maintenance costs and convenient operations. -- **Applicable Scenarios**:Scenarios with limited resources or low requirements for high availability, such as edge-side servers. 
-- **Deployment Method**:[Stand-Alone-Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -### 3.2 Dual-Active Mode - -Dual-active deployment is a feature of TimechoDB Enterprise Edition, which refers to two independent instances performing bidirectional synchronization and can provide services simultaneously. When one instance is restarted after a shutdown, the other instance will resume transmission of the missing data. - - -> An IoTDB dual-active instance usually consists of 2 single-machine nodes, i.e., 2 sets of 1C1D. Each instance can also be a cluster. - -- **Features**:The most resource-efficient high-availability solution. -- **Applicable Scenarios**:Scenarios with limited resources (only two servers) but requiring high-availability capabilities. -- **Deployment Method**:[Dual-Active-Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -### 3.3 Cluster Mode - -An IoTDB cluster instance consists of 3 ConfigNodes and no less than 3 DataNodes, usually 3 DataNodes, i.e., 3C3D; when some nodes fail, the remaining nodes can still provide services, ensuring the high availability of the database service, and the database performance can be improved with the addition of nodes. - -- **Features**:High availability and scalability, and the system performance can be improved by adding DataNodes. -- **Applicable Scenarios**:Enterprise-level application scenarios requiring high availability and reliability. -- **Deployment Method**:[Cluster-Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - -### 3.4 Summary of Features - -| Dimension | Stand-Alone Mode | Dual-Active Mode | Cluster Mode | -| ------------ | ---------------------------- | ------------------------ | ------------------------ | -| Applicable Scenarios | Edge-side deployment, scenarios with low requirements for high availability | High-availability business, disaster recovery scenarios, etc. 
| High-availability business, disaster recovery scenarios, etc. | -| Number of Machines Required | 1 | 2 | ≥3 | -| Security and Reliability | Cannot tolerate single-point failures | High, can tolerate single-point failures | High, can tolerate single-point failures | -| Scalability | Can expand DataNodes to improve performance | Each instance can be expanded as needed | Can expand DataNodes to improve performance | -| Performance | Can be expanded with the number of DataNodes | Same as the performance of one of the instances | Can be expanded with the number of DataNodes | - -- The deployment steps for single-machine mode and cluster mode are similar (adding ConfigNodes and DataNodes one by one), with only the number of replicas and the minimum number of nodes that can provide services being different. \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/Background-knowledge/Data-Model-and-Terminology_timecho.md b/src/UserGuide/Master/Tree/Background-knowledge/Data-Model-and-Terminology_timecho.md deleted file mode 100644 index 6b6e2018d..000000000 --- a/src/UserGuide/Master/Tree/Background-knowledge/Data-Model-and-Terminology_timecho.md +++ /dev/null @@ -1,393 +0,0 @@ - - -# Modeling Scheme Design - -This section introduces how to transform time series data application scenarios into IoTDB time series mode. - -## 1. Time Series Data Mode - -Before designing an IoTDB data mode, it's essential to understand time series data and its underlying structure. For more details, refer to: [Time Series Data Mode](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - -## 2. Tree-Table Twin Mode in IoTDB - -IoTDB offers Tree-table twin mode, each with its distinct characteristics as follows: - -**Tree Mode**: It manages data points as objects, with each data point corresponding to a time series. 
The data point names, segmented by dots, form a tree-like directory structure that corresponds one-to-one with the physical world, making the read and write operations on data points straightforward and intuitive. - -> 1. When performing data mode, to meet sufficient performance requirements, it is recommended that the penultimate layer node (corresponding to the number of devices) in the data path (Path) contains no fewer than 1,000 entries. The number of devices is linked to concurrent processing capability—a higher number of devices ensures more efficient concurrent read and write operations. - In scenarios where "the number of devices is small but each device contains a large number of data points" (e.g., only 3 devices, each with 10,000 data points), it is advisable to add a .value level at the end of the path. This increases the total number of nodes in the penultimate layer. Example: root.db.device01.metric.value. -> 2. When constructing tree mode [paths](../Basic-Concept/Operate-Metadata_timecho.md#4-path-query), if node naming may include non-standard characters or special symbols, it is recommended to implement a backtick encapsulation strategy for all hierarchical nodes. This approach effectively mitigates issues such as probe registration failures and data write interruptions caused by character parsing errors, ensuring the accuracy of path identifiers in syntax parsing. - -**Table Mode**: It is recommended to create a table for each type of device. The collection of physical quantities from devices of the same type shares certain commonalities (such as the collection of temperature and humidity physical quantities), allowing for flexible and rich data analysis. - -### 2.1 Mode Characteristics - -Tree-table twin mode syntaxes have their own applicable scenarios. - -The following table compares the tree mode and the table mode from various dimensions, including applicable scenarios and typical operations. 
Users can choose the appropriate mode based on their specific usage requirements to achieve efficient data storage and management. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Dimension | Tree Mode | Table Mode |
| --- | --- | --- |
| Applicable Scenarios | Measurements management, monitoring scenarios | Device management, analysis scenarios |
| Typical Operations | Read and write operations by specifying data point paths | Data filtering and analysis through tags |
| Structural Characteristics | Flexible addition and deletion, similar to a file system | Template-based management, facilitating data governance |
| Syntax Characteristics | Concise and flexible | Rich analysis |
| Performance Comparison | Similar | Similar |
 - -**Notes:** - -- Both modes can coexist within the same cluster instance. Each mode follows its own syntax and database naming conventions, and the two remain isolated by default. - - -### 2.2 Model Selection - -IoTDB supports model selection through various client tools. The configuration methods for different clients are as follows: - -1. [Command-Line Interface (CLI)](../Tools-System/CLI_timecho.md) - -When connecting via CLI, specify the model using the `sql_dialect` parameter (default: tree model). - -```bash -# Use start-cli.sh on Linux/macOS or start-cli.bat on Windows -# Tree model (the default when -sql_dialect is omitted) -start-cli.sh(bat) -start-cli.sh(bat) -sql_dialect tree - -# Table model -start-cli.sh(bat) -sql_dialect table -``` - -2. [SQL](../User-Manual/Maintenance-commands_timecho.md#_2-1-setting-the-connected-model) - -Use the `SET` statement to switch models in SQL: - -```sql --- Tree model -IoTDB> SET SQL_DIALECT=TREE - --- Table model -IoTDB> SET SQL_DIALECT=TABLE -``` - -3. Application Programming Interfaces (APIs) - -For multi-language APIs, create connections via model-specific session/session pool classes. 
Examples: - -* [Java Native API](../API/Programming-Java-Native-API_timecho.md) - -```java -// Tree model -SessionPool sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(3) - .build(); - -// Table model -ITableSessionPool tableSessionPool = - new TableSessionPoolBuilder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(1) - .build(); -``` - -* [Python Native API](../API/Programming-Python-Native-API_timecho.md) - -```python -# Tree model -session = Session( - ip=ip, - port=port, - user=username, - password=password, - fetch_size=1024, - zone_id="UTC+8", - enable_redirection=True -) - -# Table model -config = TableSessionPoolConfig( - node_urls=node_urls, - username=username, - password=password, - database=database, - max_pool_size=max_pool_size, - fetch_size=fetch_size, - wait_timeout_in_ms=wait_timeout_in_ms, -) -session_pool = TableSessionPool(config) -``` - -* [C++ Native API](../API/Programming-Cpp-Native-API.md) - -```cpp -// Tree model -session = new Session(hostip, port, username, password); - -// Table model -session = (new TableSessionBuilder()) - ->host(ip) - ->rpcPort(port) - ->username(username) - ->password(password) - ->build(); -``` - -* [Go Native API](../API/Programming-Go-Native-API.md) - -```go -// Tree model -config := &client.PoolConfig{ - Host: host, - Port: port, - UserName: user, - Password: password, -} -sessionPool = client.NewSessionPool(config, 3, 60000, 60000, false) -defer sessionPool.Close() - -// Table model -config := &client.PoolConfig{ - Host: host, - Port: port, - UserName: user, - Password: password, - Database: dbname, -} -sessionPool := client.NewTableSessionPool(config, 3, 60000, 4000, false) -defer sessionPool.Close() -``` - -* [C# Native API](../API/Programming-CSharp-Native-API.md) - -```csharp -// Tree model -var session_pool = new SessionPool(host, port, pool_size); - -// Table model -var tableSessionPool = new 
TableSessionPool.Builder() - .SetNodeUrls(nodeUrls) - .SetUsername(username) - .SetPassword(password) - .SetFetchSize(1024) - .Build(); -``` - -* [JDBC](../API/Programming-JDBC_timecho.md) - -For the table model, include `sql_dialect=table` in the JDBC URL: - -```java -// Tree model -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection connection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667/", username, password); - -// Table model (separate variable name so both snippets can share one scope) -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection tableConnection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table", username, password); -``` - -### 2.3 Tree-to-Table Conversion - -IoTDB supports **tree-to-table conversion**, as shown in the figure below: - -![](/img/tree-to-table-en-1.png) - -This feature allows existing tree-model data to be transformed into table views. Users can then query the same dataset using either model. Detailed instructions are available in [Tree-to-Table View](../../latest-Table/User-Manual/Tree-to-Table_timecho.md). **Note**: SQL statements for creating tree-to-table views **must be executed in table mode**. - - -## 3. Application Scenarios - -The application scenarios mainly include three categories: - -- Scenario 1: Using the tree mode for data reading and writing. - -- Scenario 2: Using the table mode for data reading and writing. - -- Scenario 3: Sharing the same dataset, using the tree mode for data reading and writing, and the table mode for data analysis. - -### 3.1 Scenario 1: Tree Mode - -#### 3.1.1 Characteristics - -- Simple and intuitive, corresponding one-to-one with monitoring points in the physical world. - -- Flexible like a file system, allowing the design of any branch structure. - -- Suitable for industrial monitoring scenarios such as DCS and SCADA. 
 - -#### 3.1.2 Basic Concepts - -| **Concept** | **Definition** | -| ---------------------------- | ------------------------------------------------------------ | -| **Database** | **Definition**: A path prefixed with `root.`.<br>**Naming Recommendation**: Only include the next level node under `root`, such as `root.db`.<br>**Quantity Recommendation**: The upper limit is related to memory. A single database can fully utilize machine resources; there is no need to create multiple databases for performance reasons.<br>**Creation Method**: Recommended to create manually, but can also be created automatically when a time series is created (defaults to the next level node under `root`). | -| **Time Series (Data Point)** | **Definition**: A path prefixed with the database path, segmented by `.`, which can contain any number of levels, such as `root.db.turbine.device1.metric1`.<br>Different time series can have different data types.<br>**Naming Recommendation**: Only include unique identifiers (similar to a composite primary key) in the path, generally not exceeding 10 levels.<br>Typically, place tags with low cardinality (fewer distinct values) at the front to facilitate system compression of common prefixes.<br>**Quantity Recommendation**: The total number of time series manageable by the cluster is related to total memory; refer to the resource recommendation section.<br>There is no limit to the number of child nodes at any level.<br>**Creation Method**: Can be created manually or automatically during data writing. | -| **Device** | **Definition**: The second-to-last level is the device, such as `device1` in `root.db.turbine.device1.metric1`.<br>**Creation Method**: A device cannot be created on its own; it comes into existence when its time series are created. | - -#### 3.1.3 Mode Examples - -##### 3.1.3.1 How to model when managing multiple types of devices? - -- If different types of devices in the scenario have different hierarchical paths and data point sets, create branches under the database node by device type. Each device type can have a different data point structure. - -
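
The branching rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the device types, device names, and metric names are hypothetical, and the naive split on `.` does not handle backtick-quoted nodes that contain dots.

```python
# Hypothetical catalogue of device types; none of these names come from IoTDB.
DEVICE_TYPES = {
    "turbine": {"devices": ["device1", "device2"], "metrics": ["speed", "temperature"]},
    "pump": {"devices": ["device1"], "metrics": ["pressure"]},
}


def build_paths(database, device_types):
    """Return one time series path per (device type, device, metric)."""
    paths = []
    for type_name, spec in device_types.items():
        for device in spec["devices"]:
            for metric in spec["metrics"]:
                # Each device type gets its own branch under the database node.
                paths.append(".".join([database, type_name, device, metric]))
    return paths


def device_of(path):
    """Return the second-to-last level, which this guide calls the device."""
    return path.split(".")[-2]


paths = build_paths("root.db", DEVICE_TYPES)
for p in paths:
    print(p, "-> device:", device_of(p))
```

Because each type is its own branch, the `pump` branch can carry a different metric set than `turbine` without affecting it.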
- -
 - -##### 3.1.3.2 How to model when there are no devices, only data points? - -- For example, in a monitoring system for a station, each data point has a unique number but does not correspond to any specific device. - -
- -
 - -##### 3.1.3.3 How to model when a device has both sub-devices and data points? - -- For example, in an energy storage scenario, each layer of the structure monitors its voltage and current. The following modeling approach can be used. - -
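
One way to picture a hierarchy in which every layer carries its own data points is the recursive sketch below. The station, stack, and cell names are hypothetical and stand in for whatever layers a real energy storage system has; the only point shown is that `voltage` and `current` are attached at each level of the path.

```python
def walk(prefix, node, metrics=("voltage", "current")):
    """Yield a path for every metric at this layer, then recurse into sub-devices."""
    for metric in metrics:
        yield f"{prefix}.{metric}"
    for child_name, child in node.items():
        yield from walk(f"{prefix}.{child_name}", child, metrics)


# Hypothetical layout: one station containing one stack with two cells.
layout = {"stack1": {"cell1": {}, "cell2": {}}}

paths = list(walk("root.db.station1", layout))
for p in paths:
    print(p)
```

With this layout, a node such as `stack1` is both a parent of sub-devices and the owner of its own `voltage` and `current` series.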
- -
- - -### 3.2 Scenario 2: Table Mode - -#### 3.2.1 Characteristics - -- Modes and manages device time series data using time series tables, facilitating analysis with standard SQL. - -- Suitable for device data analysis or migrating data from other databases to IoTDB. - -#### 3.2.2 Basic Concepts - -- Database: Can manage multiple types of devices. - -- Time Series Table: Corresponds to a type of device. - -| **Category** | **Definition** | -| -------------------------------- | ------------------------------------------------------------ | -| **Time Column (TIME)** | Each time series table must have a time column named `time`, with the data type `TIMESTAMP`. | -| **Tag Column (TAG)** \| | Unique identifiers (composite primary key) for devices, ranging from 0 to multiple.
Tag information cannot be modified or deleted but can be added.
Recommended to arrange from coarse to fine granularity. | -| **Data Point Column (FIELD)** | A device can collect one or more data points, with values changing over time.
There is no limit to the number of data point columns; it can reach hundreds of thousands. | -| **Attribute Column (ATTRIBUTE)** | Supplementary descriptions of devices, not changing over time.
Device attribute information can range from 0 to multiple and can be updated or added.
A small number of static attributes that may need modification can be stored here. | - -**Data Filtering Efficiency**: Time Column = Tag Column > Attribute Column > Data Point Column. - -#### 3.2.3 Modeling Examples - -##### 3.2.3.1 How to model when managing multiple types of devices? - -- It is recommended to create a table for each type of device, with each table having different tags and data point sets. - -- Even if devices are related or have hierarchical relationships, it is recommended to create a table for each type of device. - -
- -
- -##### 3.2.3.2 How to model when there are no device identifier columns or attribute columns? - -- There is no limit to the number of columns; it can reach hundreds of thousands. - -
- -
- -##### 3.2.3.3 How to model when a device has both sub-devices and data points? - -- Each device has multiple sub-devices and data point information. It is recommended to create a table for each type of device for management. - -
- -
- -### 3.3 Scenario 3: Dual-Mode Integration - -#### 3.3.1 Characteristics - -- Combines the advantages of the tree mode and table mode, sharing the same dataset, with flexible writing and rich querying. - -- During the data writing phase, the tree mode syntax is used, supporting flexible data access and expansion. - -- During the data analysis phase, the table mode syntax is used, allowing users to perform complex data analysis using standard SQL queries. - -#### 3.3.2 Modeling Examples - -##### 3.3.2.1 How to model when managing multiple types of devices? - -- Different types of devices in the scenario have different hierarchical paths and data point sets. - -- **Tree Mode**: Create branches under the database node by device type, with each device type having a different data point structure. - -- **Table View**: Create a table view for each type of device, with each table view having different tags and data point sets. - -
- -
- -##### 3.3.2.2 How to model when there are no device identifier columns or attribute columns? - -- **Tree Mode**: Each data point has a unique number but does not correspond to any specific device. -- **Table View**: Place all data points into a single table. There is no limit to the number of data point columns; it can reach hundreds of thousands. If data points have the same data type, they can be treated as the same type of device. - -
- -
- -##### 3.3.2.3 How to model when a device has both sub-devices and data points? - -- **Tree Mode**: Model each layer of the structure according to the monitoring points in the physical world. -- **Table View**: Create multiple tables to manage each layer of structural information according to device classification. - -
- -
diff --git a/src/UserGuide/Master/Tree/Background-knowledge/Navigating_Time_Series_Data_timecho.md b/src/UserGuide/Master/Tree/Background-knowledge/Navigating_Time_Series_Data_timecho.md deleted file mode 100644 index dc29b26d4..000000000 --- a/src/UserGuide/Master/Tree/Background-knowledge/Navigating_Time_Series_Data_timecho.md +++ /dev/null @@ -1,65 +0,0 @@ - -# Timeseries Data Model - -## 1. What Is Time Series Data? - -In the era of the Internet of Things, IoT and industrial scenarios are undergoing digital transformation. People install sensors on devices to collect their states: voltage and current for a motor; blade speed, angular velocity, and power generation for a wind turbine; latitude, longitude, speed, and fuel consumption for a vehicle; vibration frequency, deflection, and displacement for a bridge. Sensor data collection has penetrated every industry. - -![](/img/time-series-data-en-01.png) - -Generally speaking, we refer to each collection point as a measurement point (also known as a physical quantity, time series, timeline, signal quantity, indicator, measurement value, etc.). Each measurement point continuously collects new data over time, forming a time series. Viewed as a table, each time series has two columns, time and value; viewed as a graph, each time series is a trend chart over time, which can be vividly described as the device's electrocardiogram. - -![](/img/time-series-data-en-02.png) - -The massive time series data generated by sensors is the foundation of digital transformation in various industries, so our modeling of time series data mainly focuses on devices and sensors. - -## 2. Key Concepts of Time Series Data -From bottom to top, the main concepts of time series data are data points, measurement points, and devices. 
- -![](/img/time-series-data-en-04.png) - -### 2.1 Data Point - -- Definition: Consists of a timestamp and a value, where the timestamp is of type long and the value can be of various types such as BOOLEAN, FLOAT, INT32, etc. -- Example: A row of a time series in the form of a table in the above figure, or a point of a time series in the form of a graph, is a data point. - -![](/img/time-series-data-en-03.png) - -### 2.2 Measurement Points - -- Definition: A time series formed by multiple data points arranged in ascending timestamp order. Usually, a measurement point represents a collection point that regularly collects a physical quantity of its environment. -- Also known as: physical quantity, time series, timeline, signal quantity, indicator, measurement value, etc. -- Example: - - Electricity scenario: current, voltage - - Energy scenario: wind speed, rotational speed - - Vehicle networking scenarios: fuel consumption, vehicle speed, longitude, latitude - - Factory scenario: temperature, humidity -- In the tree model, the total number of measurement points equals the number of leaf nodes under the entire path pattern. 
For detailed statistics methods, refer to [Count Timeseries](../Basic-Concept/Operate-Metadata_timecho.md#_2-7-count-timeseries) - -### 2.3 Device - -- Definition: Corresponding to a physical device in an actual scene, usually a collection of measurement points, identified by one to multiple labels -- Example: - - Vehicle networking scenario: Vehicles identified by vehicle identification code (VIN) - - Factory scenario: robotic arm, unique ID identification generated by IoT platform - - Energy scenario: Wind turbines, identified by region, station, line, model, instance, etc - - Monitoring scenario: CPU, identified by machine room, rack, Hostname, device type, etc \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/Basic-Concept/Operate-Metadata_timecho.md b/src/UserGuide/Master/Tree/Basic-Concept/Operate-Metadata_timecho.md deleted file mode 100644 index 5fa49ec97..000000000 --- a/src/UserGuide/Master/Tree/Basic-Concept/Operate-Metadata_timecho.md +++ /dev/null @@ -1,1283 +0,0 @@ - - -# Data Modeling - -## 1. Database Management - -### 1.1 Create Database - -According to the storage model we can set up the corresponding database. Two SQL statements are supported for creating databases, as follows: - -```sql -create database root.ln; -create database root.sgcc; -``` - -We can thus create two databases using the above two SQL statements. - -It is worth noting that 1 database is recommended. - -When the path itself or the parent/child layer of the path is already created as database, the path is then not allowed to be created as database. - -For example, when the databases root.ln and root.sgcc already exist, creating the database root.ln.wf01 is not allowed. 
The system will return the corresponding error message as shown below: - -```sql -CREATE DATABASE root.ln.wf01; -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 501: root.ln has already been created as a database -``` - -Similarly, when the database root.db.test already exists, creating the database root.db is not allowed either. The system will return the corresponding error message as shown below: - -```sql -CREATE DATABASE root.db; -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 529: Some children of root.db have already been created as datab -``` - -Database Node Naming Rules: -1. Node names may contain: **Chinese/English letters, Digits (0-9), Underscore (\_), Period (.), Backtick (\`)** -2. The entire name must be enclosed in **backticks (\`)** if: - - It consists solely of digits (e.g., 12345) - - It contains special characters (. or \_) that may cause ambiguity (e.g., db.01, \_temp) -3. Escaping Backticks: - If the node name itself contains a backtick (\`), use **two consecutive backticks (\`\`)** to represent a single backtick. Example: To name a node as \`db123\`\` (containing one backtick), write it as \`db123\`\`\`. - -In addition, when IoTDB is deployed on Windows or macOS, the LayerName is case-insensitive, so the databases `root.ln` and `root.LN` cannot both be created. - -### 1.2 Show Databases - -After creating the database, we can use the [SHOW DATABASES](../SQL-Manual/SQL-Manual_timecho) statement and [SHOW DATABASES \<PathPattern\>](../SQL-Manual/SQL-Manual_timecho) to view the databases. 
The SQL statements are as follows: - -```sql -SHOW DATABASES; -SHOW DATABASES root.**; -``` - -The result is as follows: - -``` -+-------------+----+-------------------------+-----------------------+-----------------------+ -|database| ttl|schema_replication_factor|data_replication_factor|time_partition_interval| -+-------------+----+-------------------------+-----------------------+-----------------------+ -| root.sgcc|null| 2| 2| 604800| -| root.ln|null| 2| 2| 604800| -+-------------+----+-------------------------+-----------------------+-----------------------+ -Total line number = 2 -It costs 0.060s -``` - -### 1.3 Delete Database - -Users can use the `DELETE DATABASE <PathPattern>` statement to delete all databases matching the PathPattern. Please note that the data in those databases will also be deleted. - -```sql -DELETE DATABASE root.ln; -DELETE DATABASE root.sgcc; -// delete all data, all timeseries and all databases; -DELETE DATABASE root.**; -``` - -### 1.4 Count Databases - -Users can use the `COUNT DATABASE <PathPattern>` statement to count the number of databases. A `PathPattern` can be specified to count only the databases matching it. 
- -SQL statement is as follows: - -```sql -count databases; -count databases root.*; -count databases root.sgcc.*; -count databases root.sgcc; -``` - -The result is as follows: - -``` -+-------------+ -| database| -+-------------+ -| root.sgcc| -| root.turbine| -| root.ln| -+-------------+ -Total line number = 3 -It costs 0.003s - -+-------------+ -| database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.003s - -+-------------+ -| database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| database| -+-------------+ -| 0| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| database| -+-------------+ -| 1| -+-------------+ -Total line number = 1 -It costs 0.002s -``` - -### 1.5 Setting up heterogeneous databases (Advanced operations) - -Under the premise of familiar with IoTDB metadata modeling, -users can set up heterogeneous databases in IoTDB to cope with different production needs. - -Currently, the following database heterogeneous parameters are supported: - -| Parameter | Type | Description | -| ------------------------- | ------- | --------------------------------------------- | -| TTL | Long | TTL of the Database | -| SCHEMA_REPLICATION_FACTOR | Integer | The schema replication number of the Database | -| DATA_REPLICATION_FACTOR | Integer | The data replication number of the Database | -| SCHEMA_REGION_GROUP_NUM | Integer | The SchemaRegionGroup number of the Database | -| DATA_REGION_GROUP_NUM | Integer | The DataRegionGroup number of the Database | - -Note the following when configuring heterogeneous parameters: - -+ TTL and TIME_PARTITION_INTERVAL must be positive integers. -+ SCHEMA_REPLICATION_FACTOR and DATA_REPLICATION_FACTOR must be smaller than or equal to the number of deployed DataNodes. 
-+ The function of SCHEMA_REGION_GROUP_NUM and DATA_REGION_GROUP_NUM are related to the parameter `schema_region_group_extension_policy` and `data_region_group_extension_policy` in iotdb-common.properties configuration file. Take DATA_REGION_GROUP_NUM as an example: - If `data_region_group_extension_policy=CUSTOM` is set, DATA_REGION_GROUP_NUM serves as the number of DataRegionGroups owned by the Database. - If `data_region_group_extension_policy=AUTO`, DATA_REGION_GROUP_NUM is used as the lower bound of the DataRegionGroup quota owned by the Database. That is, when the Database starts writing data, it will have at least this number of DataRegionGroups. - -Users can set any heterogeneous parameters when creating a Database, or adjust some heterogeneous parameters during a stand-alone/distributed IoTDB run. - -#### Set heterogeneous parameters when creating a Database - -The user can set any of the above heterogeneous parameters when creating a Database. The SQL statement is as follows: - -```sql -CREATE DATABASE prefixPath (WITH databaseAttributeClause (COMMA? databaseAttributeClause)*)? -``` - -For example: - -```sql -CREATE DATABASE root.db WITH SCHEMA_REPLICATION_FACTOR=1, DATA_REPLICATION_FACTOR=3, SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -#### Adjust heterogeneous parameters at run time - -Users can adjust some heterogeneous parameters during the IoTDB runtime, as shown in the following SQL statement: - -```sql -ALTER DATABASE prefixPath WITH databaseAttributeClause (COMMA? 
databaseAttributeClause)*; -``` - -For example: - -```sql -ALTER DATABASE root.db WITH SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -Note that only the following heterogeneous parameters can be adjusted at runtime: - -+ SCHEMA_REGION_GROUP_NUM -+ DATA_REGION_GROUP_NUM - -#### Show heterogeneous databases - -The user can query the specific heterogeneous configuration of each Database, and the SQL statement is as follows: - -```sql -SHOW DATABASES DETAILS prefixPath? -``` - -For example: - -```sql -SHOW DATABASES DETAILS -``` -``` -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|Database| TTL|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|MinSchemaRegionGroupNum|MaxSchemaRegionGroupNum|DataRegionGroupNum|MinDataRegionGroupNum|MaxDataRegionGroupNum| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|root.db1| null| 1| 3| 604800000| 0| 1| 1| 0| 2| 2| -|root.db2|86400000| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -|root.db3| null| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -Total line number = 3 -It costs 0.058s -``` - -The query results in each column are as follows: - -+ The name of the Database -+ The TTL of the Database -+ The schema replication number of the Database -+ The data replication number of the Database -+ The time partition interval of the Database -+ The current SchemaRegionGroup number of the Database -+ The required minimum SchemaRegionGroup number of the 
Database -+ The permitted maximum SchemaRegionGroup number of the Database -+ The current DataRegionGroup number of the Database -+ The required minimum DataRegionGroup number of the Database -+ The permitted maximum DataRegionGroup number of the Database - -### 1.6 TTL - -IoTDB supports setting data retention time (TTL) at the device level, allowing the system to automatically and periodically delete old data to effectively control disk space and maintain high query performance and low memory usage. TTL is set in milliseconds by default. Once data expires, it cannot be queried or written, but physical deletion is delayed until compaction. Please note that changes to TTL may temporarily affect data queryability, and if TTL is reduced or removed, previously invisible data due to TTL may reappear. - -Important notes: -- TTL is set in milliseconds and is not affected by the time precision in the configuration file. -- Changes to TTL may affect data queryability. -- The system will eventually remove expired data, but there may be a delay. -- TTL determines data expiration based on the data point timestamp, not the ingestion time. -- The system supports setting up to 1000 TTL rules. When the limit is reached, existing rules must be removed before new ones can be added. - -#### TTL Path Rule -The path can only be prefix paths (i.e., the path cannot contain \* , except \*\* in the last level). -This path will match devices and also allows users to specify paths without asterisks as specific databases or devices. -When the path does not contain asterisks, the system will check if it matches a database; if it matches a database, both the path and path.\*\* will be set at the same time. Note: Device TTL settings do not verify the existence of metadata, i.e., it is allowed to set TTL for a non-existent device. 
-``` -qualified paths: -root.** -root.db.** -root.db.group1.** -root.db -root.db.group1.d1 - -unqualified paths: -root.*.db -root.**.db.* -root.db.* -``` -#### TTL Applicable Rules -When a device is subject to multiple TTL rules, the more precise (longer) rule takes priority. For example, for the device "root.bj.hd.dist001.turbine001", the rule "root.bj.hd.dist001.turbine001" takes precedence over "root.bj.hd.dist001.\*\*", and the rule "root.bj.hd.dist001.\*\*" takes precedence over "root.bj.hd.\*\*". -#### Set TTL -The set ttl operation can be understood as setting a TTL rule, for example, setting ttl to root.sg.group1.\*\* is equivalent to mounting ttl for all devices that can match this path pattern. -The unset ttl operation indicates unmounting TTL for the corresponding path pattern; if there is no corresponding TTL, nothing will be done. -If you want to set TTL to be infinitely large, you can use the INF keyword. -The SQL statement for setting TTL is as follows: - -```sql -set ttl to pathPattern 360000; -``` -This sets a TTL of 360,000 milliseconds on pathPattern. The pathPattern must not contain a wildcard (\*) in the middle; a wildcard is allowed only as a trailing double asterisk (\*\*). The pathPattern is used to match corresponding devices. -To maintain compatibility with older SQL syntax, if the user-provided pathPattern matches a database (db), the path pattern is automatically expanded to include all sub-paths denoted by path.\*\*. -For instance, writing "set ttl to root.sg 360000" will automatically be transformed into "set ttl to root.sg.\*\* 360000", which sets the TTL for all devices under root.sg. However, if the specified pathPattern does not match a database, the aforementioned logic will not apply. For example, writing "set ttl to root.sg.group 360000" will not be expanded to "root.sg.group.\*\*" since root.sg.group does not match a database. -It is also permissible to specify a particular device without a wildcard (\*). 
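The precedence rule and the database expansion described above can be sketched with a short sequence of statements. This is a minimal illustration, not output from a real cluster: the paths are hypothetical, and it assumes `root.bj` has already been created as a database.

```sql
-- Hypothetical paths; assumes root.bj already exists as a database.
-- root.bj matches a database, so the rule is expanded to root.bj.** (24 hours).
set ttl to root.bj 86400000;
-- A longer prefix rule overrides the database-wide rule for devices under dist001 (1 hour).
set ttl to root.bj.hd.dist001.** 3600000;
-- An exact device path is the most precise rule; INF keeps this device's data indefinitely.
set ttl to root.bj.hd.dist001.turbine001 INF;
-- List the TTL rules registered under the database.
show ttl on root.bj.**;
```

Under these assumptions, data for `root.bj.hd.dist001.turbine001` never expires, other devices under `root.bj.hd.dist001` keep one hour of data, and all remaining devices in `root.bj` keep 24 hours.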
-#### Unset TTL - -To unset TTL, we can use the following SQL statement: - -```sql -unset ttl from root.ln -``` - -After the TTL is unset, all data will be accepted in `root.ln`. - -```sql -unset ttl from root.sgcc.** -``` - -Unset the TTL in the `root.sgcc` path. - -New syntax - -```sql -unset ttl from root.** -``` - -Old syntax - -```sql -unset ttl to root.** -``` -There is no functional difference between the old and new syntax, and they are compatible with each other. -The new syntax is just more conventional in terms of wording. - -Unset the TTL setting for all path patterns. - -#### Show TTL - -To show TTL, we can use the following SQL statements: - -show all ttl - -```sql -SHOW ALL TTL; -``` -``` -+--------------+--------+ -| path| TTL| -| root.**|55555555| -| root.sg2.a.**|44440000| -+--------------+--------+ -``` - -show ttl on pathPattern -```sql -SHOW TTL ON root.db.**; -``` -``` -+--------------+--------+ -| path| TTL| -| root.db.**|55555555| -| root.db.a.**|44440000| -+--------------+--------+ -``` - -The SHOW ALL TTL example gives the TTL for all path patterns. -The SHOW TTL ON pathPattern shows the TTL for the path pattern specified. - -Display devices' ttl -```sql -show devices; -``` -``` -+---------------+---------+---------+ -| Device|IsAligned| TTL| -+---------------+---------+---------+ -|root.sg.device1| false| 36000000| -|root.sg.device2| true| INF| -+---------------+---------+---------+ -``` -Every device always has a TTL, so the value is never null. INF represents infinity. - - -## 2. Timeseries Management - -### 2.1 Create Timeseries - -According to the storage model selected before, we can create corresponding timeseries in the two databases respectively. 
The SQL statements for creating timeseries are as follows: - -```sql -create timeseries root.ln.wf01.wt01.status with datatype=BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature with datatype=FLOAT; -create timeseries root.ln.wf02.wt02.hardware with datatype=TEXT; -create timeseries root.ln.wf02.wt02.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature with datatype=FLOAT; -``` - -From v0.13, you can use a simplified version of the SQL statements to create timeseries: - -```sql -create timeseries root.ln.wf01.wt01.status BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature FLOAT; -create timeseries root.ln.wf02.wt02.hardware TEXT; -create timeseries root.ln.wf02.wt02.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature FLOAT; -``` - -When creating a timeseries, the system will automatically assign default encoding and compression methods, requiring no manual specification. 
If your business scenario requires custom adjustments, you may refer to the following example: - -```sql -create timeseries root.sgcc.wf03.wt01.temperature FLOAT encoding=PLAIN compressor=SNAPPY; -``` - -Note that if you manually specify an encoding method that is incompatible with the data type, the system will return an error message, as shown below: - -```sql -create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF; -error: encoding TS_2DIFF does not support BOOLEAN -``` - -For a full list of supported data types and corresponding encoding methods, please refer to [Compression & Encoding](../Technical-Insider/Encoding-and-Compression.md). - - -### 2.2 Create Aligned Timeseries - -The SQL statement for creating a group of aligned timeseries is as follows: - -```sql -CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT, longitude FLOAT); -``` - -You can set a different datatype, encoding, and compression for each timeseries in a group of aligned timeseries. - -Setting an alias, tags, and attributes for aligned timeseries is also supported. - - -### 2.3 Modifying Timeseries Data Types - -Starting from version V2.0.8.2, modifying the data type of a timeseries via SQL statements is supported. - -Syntax definition: - -```SQL -ALTER TIMESERIES fullPath SET DATA TYPE newType=type -``` - -Notes: - -* If the timeseries is concurrently deleted during the modification process, an error will be reported. - -* The new data type must be compatible with the original type. 
The specific compatibility is shown in the following table: - -| Original Type |Convertible To Type | -| ----------- | ----------------------------------------------- | -| INT32 | INT64, FLOAT, DOUBLE, TIMESTAMP, STRING, TEXT | -| INT64 | TIMESTAMP, DOUBLE, STRING, TEXT | -| FLOAT | DOUBLE, STRING, TEXT | -| DOUBLE | STRING, TEXT | -| BOOLEAN | STRING, TEXT | -| TEXT | BLOB, STRING | -| STRING | TEXT, BLOB | -| BLOB | STRING, TEXT | -| DATE | STRING, TEXT | -| TIMESTAMP | INT64, DOUBLE, STRING, TEXT | - -Usage example: - -```SQL -ALTER TIMESERIES root.ln.wf01.wt01.temperature set data type DOUBLE -``` - -### 2.4 Modifying Timeseries Name - -Since version V2.0.8.2, it has been supported to modify the full path name of a timeseries through SQL statements. After a successful modification, the original name becomes invalid but is still retained in the metadata storage. - -Syntax definition: - -```sql --- Supports modifying the full path of a certain sequence to another full path -ALTER TIMESERIES RENAME TO -``` - -Usage instructions: - -- This statement takes effect immediately upon successful execution, and the tags/attributes/alias of the original sequence will be migrated to the new sequence. -- The invalidated sequence (original sequence) no longer supports write, query, delete, or other operations. The name of the invalidated sequence will be retained by the system, and creating a new sequence with the same name is not allowed. This ensures the uniqueness and traceability of the original sequence name: it supports viewing the original sequence through the `SHOW INVALID TIMESERIES` statement, preventing the loss of original sequence information due to frequent modifications, significantly improving data traceability and problem localization efficiency. -- The new sequence supports creating views, but the original sequence does not support creating views. 
When modifying the encoding, compression, sequence type, tags, attributes, or alias of the new sequence, the original sequence will not be modified; deleting the new sequence will also modify the original sequence. -- If the new sequence path or the alias of the original sequence under the target device already exists (including real sequences, views, invalid sequences, and their aliases), the system will report an error. - -Usage example: - -```sql -ALTER TIMESERIES root.ln.wf01.wt01.temperature RENAME TO root.newln.newwf.newwt.temperature -``` - - -### 2.5 Delete Timeseries - -To delete the timeseries we created before, we can use the `(DELETE | DROP) TimeSeries <PathPattern>` statement. - -The usage is as follows: - -```sql -delete timeseries root.ln.wf01.wt01.status; -delete timeseries root.ln.wf01.wt01.temperature, root.ln.wf02.wt02.hardware; -delete timeseries root.ln.wf02.*; -drop timeseries root.ln.wf02.*; -``` - -### 2.6 Show Timeseries - -* SHOW LATEST? TIMESERIES pathPattern? whereClause? limitClause? - - SHOW TIMESERIES accepts four optional clauses and returns information about the matching time series. - -Timeseries information includes: timeseries path, alias of measurement, database it belongs to, data type, encoding type, compression type, tags and attributes. - -Examples: - -* SHOW TIMESERIES - - presents all timeseries information in JSON form - -* SHOW TIMESERIES <`PathPattern`> - - returns all timeseries information matching the given <`PathPattern`>. 
SQL statements are as follows: - -```sql -show timeseries root.**; -show timeseries root.ln.**; -``` - -The results are shown below respectively: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.016s - -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression|tags|attributes|deadband|deadband parameters| 
-+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -|root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY|null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -Total line number = 4 -It costs 0.004s -``` - -* SHOW TIMESERIES LIMIT INT OFFSET INT - - returns all the timeseries information start from the offset and limit the number of series returned. For example, - -```sql -show timeseries root.ln.** limit 10 offset 10 -``` - -* SHOW TIMESERIES WHERE TIMESERIES contains 'containStr' - - The query result set is filtered by string fuzzy matching based on the names of the timeseries. 
For example: - -```sql -show timeseries root.ln.** where timeseries contains 'wf01.wt' -``` - -The result is shown below: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 2 -It costs 0.016s -``` - -* SHOW TIMESERIES WHERE DataType=type - - The query result set is filtered by data type. 
For example: - -```sql -show timeseries root.ln.** where dataType=FLOAT -``` - -The result is shown below: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 3 -It costs 0.016s - -``` - - -* SHOW TIMESERIES WHERE TAGS(KEY) = VALUE -* SHOW TIMESERIES WHERE TAGS(KEY) CONTAINS VALUE - - The query result set is filtered by tags. 
For example: - -```sql -show timeseries root.ln.** where TAGS(unit)='c'; -show timeseries root.ln.** where TAGS(description) contains 'test1'; -``` - -The query results are as follows: - -``` -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s - -``` - - -* SHOW LATEST TIMESERIES - - all the returned timeseries information should be sorted in descending order of the last timestamp of timeseries - -It is worth noting that when the queried path does not exist, the system will return no timeseries. - -- SHOW INVALID TIMESERIES - - Since version V2.0.8.2, this SQL statement is supported to display the invalidated timeseries after a successful full path name modification. 
- -```sql -IoTDB> show invalid timeSeries -+-----------------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+----------------------------------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| NewPath| -+-----------------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+----------------------------------+ -|root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| GORILLA| LZ4|null| null| null| null| BASE|root.newln.newwf.newwt.temperature| -+-----------------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+----------------------------------+ -``` - -Explanation: The last column, "NewPath," in the returned result displays the new sequence corresponding to the invalidated sequence. This serves scenarios such as view construction and cluster migration (Load + rename). - - -### 2.7 Count Timeseries - -IoTDB is able to use `COUNT TIMESERIES ` to count the number of timeseries matching the path. SQL statements are as follows: - -* `WHERE` condition could be used to fuzzy match a time series name with the following syntax: `COUNT TIMESERIES WHERE TIMESERIES contains 'containStr'`. -* `WHERE` condition could be used to filter result by data type with the syntax: `COUNT TIMESERIES WHERE DataType='`. -* `WHERE` condition could be used to filter result by tags with the syntax: `COUNT TIMESERIES WHERE TAGS(key)='value'` or `COUNT TIMESERIES WHERE TAGS(key) contains 'value'`. -* `LEVEL` could be defined to show count the number of timeseries of each node at the given level in current Metadata Tree. This could be used to query the number of sensors under each device. The grammar is: `COUNT TIMESERIES GROUP BY LEVEL=`. 
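The `GROUP BY LEVEL` behaviour described in the last bullet can be sketched outside IoTDB. The snippet below is only an illustration of the counting rule, not IoTDB's implementation: `count_by_level` is a hypothetical helper, and the sample paths are the seven time series used as the running example in this section. It assumes `root` is level 0 and that every matched time series is counted under its ancestor at the requested level.

```python
from collections import Counter

# Sample paths: the seven time series used as the running example here.
PATHS = [
    "root.sgcc.wf03.wt01.temperature",
    "root.sgcc.wf03.wt01.status",
    "root.turbine.d1.s1",
    "root.ln.wf02.wt02.hardware",
    "root.ln.wf02.wt02.status",
    "root.ln.wf01.wt01.temperature",
    "root.ln.wf01.wt01.status",
]


def count_by_level(paths, level):
    """Count time series per ancestor node at `level` (root is level 0).

    Hypothetical helper: it only mimics the grouping rule, it does not
    talk to IoTDB.
    """
    counts = Counter()
    for path in paths:
        nodes = path.split(".")
        if len(nodes) > level:  # skip paths that are shallower than the level
            counts[".".join(nodes[: level + 1])] += 1
    return dict(counts)


# The path pattern only filters the input; the level is always counted from root.
ln_only = [p for p in PATHS if p.startswith("root.ln.")]
print(count_by_level(PATHS, 1))
print(count_by_level(ln_only, 2))
```

Filtering first and grouping second mirrors the rule that the path is only a filter condition and does not shift where level counting starts.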
- - -```sql -COUNT TIMESERIES root.**; -COUNT TIMESERIES root.ln.**; -COUNT TIMESERIES root.ln.*.*.status; -COUNT TIMESERIES root.ln.wf01.wt01.status; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' ; -COUNT TIMESERIES root.** WHERE DATATYPE = INT64; -COUNT TIMESERIES root.** WHERE TAGS(unit) contains 'c' ; -COUNT TIMESERIES root.** WHERE TAGS(unit) = 'c' ; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' group by level = 1; -``` - -For example, if there are several timeseries (use `show timeseries` to show all timeseries): - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| {"unit":"c"}| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| {"description":"test1"}| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| 
-+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.004s -``` - -Then the Metadata Tree will be as below: - -
- -As can be seen, `root` is treated as `LEVEL=0`. So when you enter statements such as: - -```sql -COUNT TIMESERIES root.** GROUP BY LEVEL=1; -COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2; -COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2; -``` - -You will get the following results: - -``` -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -| root.sgcc| 2| -|root.turbine| 1| -| root.ln| 4| -+------------+-----------------+ -Total line number = 3 -It costs 0.002s - -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -|root.ln.wf02| 2| -|root.ln.wf01| 2| -+------------+-----------------+ -Total line number = 2 -It costs 0.002s - -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -|root.ln.wf01| 2| -+------------+-----------------+ -Total line number = 1 -It costs 0.002s -``` - -> Note: The path of timeseries is just a filter condition, which has no relationship with the definition of level. - -### 2.8 Active Timeseries Query -By adding WHERE time filter conditions to the existing SHOW/COUNT TIMESERIES, we can obtain time series with data within the specified time range. - -It is important to note that in metadata queries with time filters, views are not considered; only the time series actually stored in the TsFile are taken into account. 
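As a rough mental model of the time filter, a time series is "active" in a range when at least one of its still-queryable data points falls inside that range. The sketch below is an assumption-laden illustration, not how IoTDB evaluates the filter internally: `active_series` and the `POINTS` mapping are hypothetical, and the range is written as `start <= t < end` to match a `time >= start and time < end` condition.

```python
def active_series(points, start, end):
    """Return the names of series with at least one point where start <= t < end.

    `points` is a hypothetical mapping from series name to the timestamps of
    its queryable (non-deleted) data points.
    """
    return sorted(
        name
        for name, stamps in points.items()
        if any(start <= t < end for t in stamps)
    )


# Illustrative data: three devices with two sensors each.
POINTS = {
    "root.sg.data.s1": [15000],
    "root.sg.data.s2": [15000],
    "root.sg.data2.s1": [15002],
    "root.sg.data2.s2": [15002],
    "root.sg.data3.s1": [16000],
    "root.sg.data3.s2": [16000],
}

active = active_series(POINTS, 15000, 16000)
print(active)       # names of the active series
print(len(active))  # what a COUNT with the same filter would report
```

A series whose only points were deleted would have an empty timestamp list in this model and would therefore never be reported as active.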
- -An example usage is as follows: -```sql -insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -show timeseries; -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -show timeseries where time >= 15000 and time < 16000; -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| 
-+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -count timeseries where time >= 15000 and time < 16000; -+-----------------+ -|count(timeseries)| -+-----------------+ -| 4| -+-----------------+ -``` -Regarding the definition of active time series, data that can be queried normally is considered active, meaning time series that have been inserted but deleted are not included. -### 2.9 Tag and Attribute Management - -We can also add an alias, extra tag and attribute information while creating one timeseries. - -The differences between tag and attribute are: - -* Tag could be used to query the path of timeseries, we will maintain an inverted index in memory on the tag: Tag -> Timeseries -* Attribute could only be queried by timeseries path : Timeseries -> Attribute - -The SQL statements for creating timeseries with extra tag and attribute information are extended as follows: - -```sql -create timeseries root.turbine.d1.s1(temprature) with datatype=FLOAT tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2); -``` - -The `temprature` in the brackets is an alias for the sensor `s1`. So we can use `temprature` to replace `s1` anywhere. - -> IoTDB also supports using AS function to set alias. The difference between the two is: the alias set by the AS function is used to replace the whole time series name, temporary and not bound with the time series; while the alias mentioned above is only used as the alias of the sensor, which is bound with it and can be used equivalent to the original sensor name. - -> Notice that the size of the extra tag and attribute information shouldn't exceed the `tag_attribute_total_size`. 
- -We can update the tag information after creating it as follows: - -* Rename the tag/attribute key - -```sql -ALTER timeseries root.turbine.d1.s1 RENAME tag1 TO newTag1 -``` - -* Reset the tag/attribute value - -```sql -ALTER timeseries root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1 -``` - -* Delete the existing tag/attribute - -```sql -ALTER timeseries root.turbine.d1.s1 DROP tag1, tag2 -``` - -* Add new tags - -```sql -ALTER timeseries root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4 -``` - -* Add new attributes - -```sql -ALTER timeseries root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4 -``` - -* Upsert alias, tags and attributes - -> Adds the alias or a new key-value pair if the alias or key doesn't exist; otherwise, updates the old one with the new value. - -```sql -ALTER timeseries root.turbine.d1.s1 UPSERT ALIAS=newAlias TAGS(tag3=v3, tag4=v4) ATTRIBUTES(attr3=v3, attr4=v4) -``` - -* Show timeseries using tags. Use TAGS(tagKey) to identify the tags used as filter key - -```sql -SHOW TIMESERIES (<`PathPattern`>)? timeseriesWhereClause -``` - -Returns all timeseries information that satisfies the where condition and matches the pathPattern. 
SQL statements are as follows: - -```sql -ALTER timeseries root.ln.wf02.wt02.hardware ADD TAGS unit=c; -ALTER timeseries root.ln.wf02.wt02.status ADD TAGS description=test1; -show timeseries root.ln.** where TAGS(unit)='c'; -show timeseries root.ln.** where TAGS(description) contains 'test1'; -``` - -The results are shown below, respectively: - -``` -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s -``` - -- count timeseries using tags - -```sql -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause; -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause GROUP BY LEVEL=; -``` - -Returns the number of timeseries that satisfy the where condition and match the pathPattern. 
SQL statements are as follows: - -```sql -count timeseries; -count timeseries root.** where TAGS(unit)='c'; -count timeseries root.** where TAGS(unit)='c' group by level = 2; -``` - -The results are shown below, respectively: - -```sql -count timeseries; -+-----------------+ -|count(timeseries)| -+-----------------+ -| 6| -+-----------------+ -Total line number = 1 -It costs 0.019s -count timeseries root.** where TAGS(unit)='c'; -+-----------------+ -|count(timeseries)| -+-----------------+ -| 2| -+-----------------+ -Total line number = 1 -It costs 0.020s -count timeseries root.** where TAGS(unit)='c' group by level = 2; -+--------------+-----------------+ -| column|count(timeseries)| -+--------------+-----------------+ -| root.ln.wf02| 2| -| root.ln.wf01| 0| -|root.sgcc.wf03| 0| -+--------------+-----------------+ -Total line number = 3 -It costs 0.011s -``` - -> Note that only one condition is supported in the where clause: either an equality filter or a `contains` filter. In both cases, the property in the where condition must be a tag. 
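The two tag filters can be pictured with a small in-memory inverted index of the form Tag -> Timeseries, which is how this section describes tag lookups. The following is a minimal sketch under stated assumptions, not IoTDB's actual data structure: `build_tag_index`, `where_tag_equals` and `where_tag_contains` are hypothetical names, and `contains` is assumed to mean a plain substring match on the tag value.

```python
from collections import defaultdict


def build_tag_index(series_tags):
    """Build an inverted index: (tag key, tag value) -> set of series paths."""
    index = defaultdict(set)
    for path, tags in series_tags.items():
        for key, value in tags.items():
            index[(key, value)].add(path)
    return index


def where_tag_equals(index, key, value):
    """Equality filter: a single direct lookup in the index."""
    return sorted(index.get((key, value), set()))


def where_tag_contains(index, key, fragment):
    """Contains filter: scan the values stored under `key` (assumed substring match)."""
    hits = set()
    for (k, v), paths in index.items():
        if k == key and fragment in v:
            hits |= paths
    return sorted(hits)


# Tags taken from the examples in this section.
SERIES_TAGS = {
    "root.ln.wf02.wt02.hardware": {"unit": "c"},
    "root.ln.wf02.wt02.status": {"description": "test1"},
    "root.turbine.d1.s1": {"newTag1": "newV1", "tag3": "v3", "tag4": "v4"},
}

index = build_tag_index(SERIES_TAGS)
print(where_tag_equals(index, "unit", "c"))
print(where_tag_contains(index, "description", "test"))
```

Attributes are deliberately left out of the index, since they can only be reached from the timeseries path and cannot be used as a filter.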
- -create aligned timeseries - -```sql -create aligned timeseries root.sg1.d1(s1 INT32 tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2), s2 DOUBLE tags(tag3=v3, tag4=v4) attributes(attr3=v3, attr4=v4)) -``` - -The execution result is as follows: - -```sql -show timeseries -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -|root.sg1.d1.s2| null| root.sg1| DOUBLE| GORILLA| SNAPPY|{"tag4":"v4","tag3":"v3"}|{"attr4":"v4","attr3":"v3"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -Support query: - -```sql -show timeseries where TAGS(tag1)='v1' -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -The above operations are supported for timeseries tag, attribute updates, etc. - -## 3. 
Path query - -### 3.1 Path - -A `path` is an expression that conforms to the following constraints: - -```sql -path - : nodeName ('.' nodeName)* - ; - -nodeName - : wildcard? identifier wildcard? - | wildcard - ; - -wildcard - : '*' - | '**' - ; -``` - -### 3.2 NodeName - -- The parts of a path separated by `.` are called node names (`nodeName`). -- For example, `root.a.b.c` is a path with a depth of 4 levels, where `root`, `a`, `b`, and `c` are all node names. - -#### Constraints - -- **Reserved keyword `root`**: `root` is a reserved keyword that is only allowed at the beginning of a path. If `root` appears in any other level, the system will fail to parse it and report an error. -- **Character support**: Except for the first level (`root`), other levels support the following characters: - - Letters (`a-z`, `A-Z`) - - Digits (`0-9`) - - Underscores (`_`) - - UNICODE Chinese characters (`\u2E80` to `\u9FFF`) -- **Case sensitivity**: On Windows systems, path node names in the database are case-insensitive. For example, `root.ln` and `root.LN` are considered the same path. - -### 3.3 Special Characters (Backquote) - -If special characters (such as spaces or punctuation marks) are needed in a `nodeName`, you can enclose the node name in Backquote (`). For more information on the use of backticks, please refer to [Backquote](../SQL-Manual/Syntax-Rule.md#reverse-quotation-marks). - -### 3.4 Path Pattern - -To make it more convenient and efficient to express multiple time series, IoTDB provides paths with wildcards `*` and `**`. Wildcards can appear in any level of a path. - -- **Single-level wildcard (`\*`)**: Represents one level in a path. - - For example, `root.vehicle.*.sensor1` represents paths with a depth of 4 levels, prefixed by `root.vehicle` and suffixed by `sensor1`. -- **Multi-level wildcard (`\**`)**: Represents one or more levels (`*`+). 
- - For example: - - `root.vehicle.device1.**` represents all paths with a depth of 4 or more levels, prefixed by `root.vehicle.device1`. - - `root.vehicle.**.sensor1` represents paths with a depth of 4 or more levels, prefixed by `root.vehicle` and suffixed by `sensor1`. - -**Note**: `*` and `**` cannot be placed at the beginning of a path. - -### 3.5 Show Child Paths - -```sql -SHOW CHILD PATHS pathPattern -``` - -Return all child paths and their node types of all the paths matching pathPattern. - -node types: ROOT -> DB INTERNAL -> DATABASE -> INTERNAL -> DEVICE -> TIMESERIES - - -Example: - -* return the child paths of root.ln:show child paths root.ln - -``` -+------------+----------+ -| child paths|node types| -+------------+----------+ -|root.ln.wf01| INTERNAL| -|root.ln.wf02| INTERNAL| -+------------+----------+ -Total line number = 2 -It costs 0.002s -``` - -> get all paths in form of root.xx.xx.xx:show child paths root.xx.xx - -### 3.6 Show Child Nodes - -```sql -SHOW CHILD NODES pathPattern -``` - -Return all child nodes of the pathPattern. - -Example: - -* return the child nodes of root:show child nodes root - -``` -+------------+ -| child nodes| -+------------+ -| ln| -+------------+ -``` - -* return the child nodes of root.ln:show child nodes root.ln - -``` -+------------+ -| child nodes| -+------------+ -| wf01| -| wf02| -+------------+ -``` - -### 3.7 Count Nodes - -IoTDB is able to use `COUNT NODES LEVEL=` to count the number of nodes at - the given level in current Metadata Tree considering a given pattern. IoTDB will find paths that - match the pattern and counts distinct nodes at the specified level among the matched paths. - This could be used to query the number of devices with specified measurements. 
The usage is as - follows: - -```sql -COUNT NODES root.** LEVEL=2; -COUNT NODES root.ln.** LEVEL=2; -COUNT NODES root.ln.wf01.** LEVEL=3; -COUNT NODES root.**.temperature LEVEL=3; -``` - -For the example and Metadata Tree above, you get the following results: - -``` -+------------+ -|count(nodes)| -+------------+ -| 4| -+------------+ -Total line number = 1 -It costs 0.003s - -+------------+ -|count(nodes)| -+------------+ -| 2| -+------------+ -Total line number = 1 -It costs 0.002s - -+------------+ -|count(nodes)| -+------------+ -| 1| -+------------+ -Total line number = 1 -It costs 0.002s - -+------------+ -|count(nodes)| -+------------+ -| 2| -+------------+ -Total line number = 1 -It costs 0.002s -``` - -> Note: The path of timeseries is just a filter condition, which has no relationship with the definition of level. - -### 3.8 Show Devices - -* SHOW DEVICES pathPattern? (WITH DATABASE)? devicesWhereClause? limitClause? - -Similar to `Show Timeseries`, IoTDB also supports two ways of viewing devices: - -* `SHOW DEVICES` statement presents all devices' information, which is equal to `SHOW DEVICES root.**`. -* `SHOW DEVICES <PathPattern>` statement specifies the `PathPattern` and returns the devices' information matching the pathPattern and under the given level. -* `WHERE` condition supports `DEVICE contains 'xxx'` to do a fuzzy query based on the device name. 
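How a path pattern with `*` and `**` selects devices, and how the `contains` condition narrows the result, can be sketched as follows. This is a simplified model and not IoTDB's matcher: `matches` and `show_devices` are hypothetical helpers, `*` is taken to stand for exactly one level and `**` for one or more levels, and wildcards embedded inside a node name are not handled.

```python
def matches(pattern, path):
    """Return True if `path` matches `pattern`.

    `*` matches exactly one level, `**` matches one or more levels.
    Partial wildcards inside a node name are not supported in this sketch.
    """
    p = pattern.split(".")
    n = path.split(".")

    def go(i, j):
        if i == len(p):
            return j == len(n)
        if p[i] == "**":
            # Let `**` consume one or more of the remaining levels.
            return any(go(i + 1, k) for k in range(j + 1, len(n) + 1))
        if j == len(n):
            return False
        if p[i] == "*" or p[i] == n[j]:
            return go(i + 1, j + 1)
        return False

    return go(0, 0)


def show_devices(devices, pattern="root.**", contains=None):
    """Filter device paths by pattern, then by an optional substring."""
    selected = [d for d in devices if matches(pattern, d)]
    if contains is not None:
        selected = [d for d in selected if contains in d]
    return sorted(selected)


# Device paths used in the examples of this section.
DEVICES = [
    "root.ln.wf01.wt01",
    "root.ln.wf02.wt02",
    "root.sgcc.wf03.wt01",
    "root.turbine.d1",
]

print(show_devices(DEVICES, "root.ln.**"))
print(show_devices(DEVICES, "root.**", contains="d1"))
```

The default pattern `root.**` reflects the statement that a bare `SHOW DEVICES` is equal to `SHOW DEVICES root.**`.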
- -SQL statement is as follows: - -```sql -show devices; -show devices root.ln.**; -show devices root.ln.** where device contains 't'; -``` - -You can get results below: - -``` -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.ln.wf01.wt01| false| -| root.ln.wf02.wt02| false| -|root.sgcc.wf03.wt01| false| -| root.turbine.d1| false| -+-------------------+---------+ -Total line number = 4 -It costs 0.002s - -+-----------------+---------+ -| devices|isAligned| -+-----------------+---------+ -|root.ln.wf01.wt01| false| -|root.ln.wf02.wt02| false| -+-----------------+---------+ -Total line number = 2 -It costs 0.001s -``` - -`isAligned` indicates whether the timeseries under the device are aligned. - -To view devices' information with database, we can use `SHOW DEVICES WITH DATABASE` statement. - -* `SHOW DEVICES WITH DATABASE` statement presents all devices' information with their database. -* `SHOW DEVICES WITH DATABASE` statement specifies the `PathPattern` and returns the - devices' information under the given level with their database information. 
- -SQL statement is as follows: - -```sql -show devices with database; -show devices root.ln.** with database; -``` - -You can get results below: - -``` -+-------------------+-------------+---------+ -| devices| database|isAligned| -+-------------------+-------------+---------+ -| root.ln.wf01.wt01| root.ln| false| -| root.ln.wf02.wt02| root.ln| false| -|root.sgcc.wf03.wt01| root.sgcc| false| -| root.turbine.d1| root.turbine| false| -+-------------------+-------------+---------+ -Total line number = 4 -It costs 0.003s - -+-----------------+-------------+---------+ -| devices| database|isAligned| -+-----------------+-------------+---------+ -|root.ln.wf01.wt01| root.ln| false| -|root.ln.wf02.wt02| root.ln| false| -+-----------------+-------------+---------+ -Total line number = 2 -It costs 0.001s -``` - -### 3.9 Count Devices - -* COUNT DEVICES `` - -The above statement is used to count the number of devices. At the same time, it is allowed to specify `PathPattern` to count the number of devices matching the `PathPattern`. - -SQL statement is as follows: - -```sql -show devices; -count devices; -count devices root.ln.**; -``` - -You can get results below: - -``` -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -|root.sgcc.wf03.wt03| false| -| root.turbine.d1| false| -| root.ln.wf02.wt02| false| -| root.ln.wf01.wt01| false| -+-------------------+---------+ -Total line number = 4 -It costs 0.024s - -+--------------+ -|count(devices)| -+--------------+ -| 4| -+--------------+ -Total line number = 1 -It costs 0.004s - -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -Total line number = 1 -It costs 0.004s -``` - -### 3.10 Active Device Query -Similar to active timeseries query, we can add time filter conditions to device viewing and statistics to query active devices that have data within a certain time range. The definition of active here is the same as for active time series. 
An example usage is as follows: -```sql -insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -show devices; -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -| root.sg.data3| false| -+-------------------+---------+ - -show devices where time >= 15000 and time < 16000; -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -+-------------------+---------+ - -count devices where time >= 15000 and time < 16000; -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/Basic-Concept/Query-Data_timecho.md b/src/UserGuide/Master/Tree/Basic-Concept/Query-Data_timecho.md deleted file mode 100644 index 3c9c40404..000000000 --- a/src/UserGuide/Master/Tree/Basic-Concept/Query-Data_timecho.md +++ /dev/null @@ -1,3054 +0,0 @@ - -# Query Data -## 1. OVERVIEW - -### 1.1 Syntax Definition - -In IoTDB, `SELECT` statement is used to retrieve data from one or more selected time series. Here is the syntax definition of `SELECT` statement: - -```sql -SELECT [LAST] selectExpr [, selectExpr] ... - [INTO intoItem [, intoItem] ...] - FROM prefixPath [, prefixPath] ... - [WHERE whereCondition] - [GROUP BY { - ([startTime, endTime), interval [, slidingStep]) | - LEVEL = levelNum [, levelNum] ... | - TAGS(tagKey [, tagKey] ... ) | - VARIATION(expression[,delta][,ignoreNull=true/false]) | - CONDITION(expression,[keep>/>=/=/ 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is "status" and "temperature". 
The SQL statement requires that the status and temperature sensor values between the time point of "2017-11-01T00:05:00.000" and "2017-11-01T00:12:00.000" be selected. - -The execution result of this SQL statement is as follows: - -``` -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -|2017-11-01T00:08:00.000+08:00| false| 22.58| -|2017-11-01T00:09:00.000+08:00| false| 20.98| -|2017-11-01T00:10:00.000+08:00| true| 25.52| -|2017-11-01T00:11:00.000+08:00| false| 22.91| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 6 -It costs 0.018s -``` - -#### Select Multiple Columns of Data for the Same Device According to Multiple Time Intervals - -IoTDB supports specifying multiple time interval conditions in a query. Users can combine time interval conditions at will according to their needs. For example, the SQL statement is: - -```sql -select status,temperature from root.ln.wf01.wt01 where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is "status" and "temperature"; the statement specifies two different time intervals, namely "2017-11-01T00:05:00.000 to 2017-11-01T00:12:00.000" and "2017-11-01T16:35:00.000 to 2017-11-01T16:37:00.000". The SQL statement requires that the values of selected timeseries satisfying any time interval be selected. 
- -The execution result of this SQL statement is as follows: - -``` -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -|2017-11-01T00:08:00.000+08:00| false| 22.58| -|2017-11-01T00:09:00.000+08:00| false| 20.98| -|2017-11-01T00:10:00.000+08:00| true| 25.52| -|2017-11-01T00:11:00.000+08:00| false| 22.91| -|2017-11-01T16:35:00.000+08:00| true| 23.44| -|2017-11-01T16:36:00.000+08:00| false| 21.98| -|2017-11-01T16:37:00.000+08:00| false| 21.93| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 9 -It costs 0.018s -``` - - -#### Choose Multiple Columns of Data for Different Devices According to Multiple Time Intervals - -The system supports the selection of data in any column in a query, i.e., the selected columns can come from different devices. For example, the SQL statement is: - -```sql -select wf01.wt01.status,wf02.wt02.hardware from root.ln where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` - -which means: - -The selected timeseries are "the power supply status of ln group wf01 plant wt01 device" and "the hardware version of ln group wf02 plant wt02 device"; the statement specifies two different time intervals, namely "2017-11-01T00:05:00.000 to 2017-11-01T00:12:00.000" and "2017-11-01T16:35:00.000 to 2017-11-01T16:37:00.000". The SQL statement requires that the values of selected timeseries satisfying any time interval be selected. 
- -The execution result of this SQL statement is as follows: - -``` -+-----------------------------+------------------------+--------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf02.wt02.hardware| -+-----------------------------+------------------------+--------------------------+ -|2017-11-01T00:06:00.000+08:00| false| v1| -|2017-11-01T00:07:00.000+08:00| false| v1| -|2017-11-01T00:08:00.000+08:00| false| v1| -|2017-11-01T00:09:00.000+08:00| false| v1| -|2017-11-01T00:10:00.000+08:00| true| v2| -|2017-11-01T00:11:00.000+08:00| false| v1| -|2017-11-01T16:35:00.000+08:00| true| v2| -|2017-11-01T16:36:00.000+08:00| false| v1| -|2017-11-01T16:37:00.000+08:00| false| v1| -+-----------------------------+------------------------+--------------------------+ -Total line number = 9 -It costs 0.014s -``` - -#### Order By Time Query - -IoTDB supports the 'order by time' statement since 0.11, it's used to display results in descending order by time. -For example, the SQL statement is: - -```sql -select * from root.ln.** where time > 1 order by time desc limit 10; -``` - -The execution result of this SQL statement is as follows: - -``` -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -| Time|root.ln.wf02.wt02.hardware|root.ln.wf02.wt02.status|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -|2017-11-07T23:59:00.000+08:00| v1| false| 21.07| false| -|2017-11-07T23:58:00.000+08:00| v1| false| 22.93| false| -|2017-11-07T23:57:00.000+08:00| v2| true| 24.39| true| -|2017-11-07T23:56:00.000+08:00| v2| true| 24.44| true| -|2017-11-07T23:55:00.000+08:00| v2| true| 25.9| true| -|2017-11-07T23:54:00.000+08:00| v1| false| 22.52| false| -|2017-11-07T23:53:00.000+08:00| v2| true| 24.58| true| -|2017-11-07T23:52:00.000+08:00| v1| false| 
20.18| false| -|2017-11-07T23:51:00.000+08:00| v1| false| 22.24| false| -|2017-11-07T23:50:00.000+08:00| v2| true| 23.7| true| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -Total line number = 10 -It costs 0.016s -``` - -### 1.4 Execution Interface - -In IoTDB, there are two ways to execute data queries: - -- Execute queries using IoTDB-SQL. -- Efficient execution interfaces for common queries, including time-series raw data query, last query, and aggregation query. - -#### Execute queries using IoTDB-SQL - -Data query statements can be used in SQL command-line terminals, JDBC, JAVA / C++ / Python / Go and other native APIs, and RESTful APIs. - -- Execute the query statement in the SQL command line terminal: start the SQL command line terminal, and directly enter the query statement to execute, see [SQL command line terminal](../Tools-System/CLI_timecho.md). - -- Execute query statements in JDBC, see [JDBC](../API/Programming-JDBC_timecho.md) for details. - -- Execute query statements in native APIs such as JAVA / C++ / Python / Go. For details, please refer to the relevant documentation in the Application Programming Interface chapter. The interface prototype is as follows: - - ````java - SessionDataSet executeQueryStatement(String sql) - ```` - -- Used in RESTful API, see [HTTP API V1](../API/RestServiceV1_timecho.md) or [HTTP API V2](../API/RestServiceV2_timecho.md) for details. - -#### Efficient execution interfaces - -The native APIs provide efficient execution interfaces for commonly used queries, which avoid time-consuming steps such as SQL parsing. They include: - -* Time-series raw data query with time range: - - The specified query time range is a left-closed right-open interval, including the start time but excluding the end time.
- -```java -SessionDataSet executeRawDataQuery(List<String> paths, long startTime, long endTime); -``` - -* Last query: - - Query the last data points whose timestamps are greater than or equal to `LastTime`. - -```java -SessionDataSet executeLastDataQuery(List<String> paths, long LastTime); -``` - -* Aggregation query: - - Support specified query time range: The specified query time range is a left-closed right-open interval, including the start time but not the end time. - - Support GROUP BY TIME. - -```java -SessionDataSet executeAggregationQuery(List<String> paths, List<TAggregationType> aggregations); - -SessionDataSet executeAggregationQuery( - List<String> paths, List<TAggregationType> aggregations, long startTime, long endTime); - -SessionDataSet executeAggregationQuery( - List<String> paths, - List<TAggregationType> aggregations, - long startTime, - long endTime, - long interval); - -SessionDataSet executeAggregationQuery( - List<String> paths, - List<TAggregationType> aggregations, - long startTime, - long endTime, - long interval, - long slidingStep); -``` - -## 2. `SELECT` CLAUSE -The `SELECT` clause specifies the output of the query, consisting of several `selectExpr`. Each `selectExpr` defines one or more columns in the query result. For select expression details, see document [Operator-and-Expression](../SQL-Manual/Operator-and-Expression.md). - -- Example 1: - -```sql -select temperature from root.ln.wf01.wt01 -``` - -- Example 2: - -```sql -select status, temperature from root.ln.wf01.wt01 -``` - -### 2.1 Last Query - -The last query is a special type of query in Apache IoTDB. It returns the data point with the largest timestamp of the specified time series. In other words, it returns the latest state of a time series. This feature is especially important in IoT data analysis scenarios. To meet the performance requirement of real-time device monitoring systems, Apache IoTDB caches the latest values of all time series to achieve microsecond read latency. - -The last query returns the most recent data point of each given timeseries in a four-column format.
- -The SQL syntax is defined as: - -```sql -select last <Path> [COMMA <Path>]* from < PrefixPath > [COMMA < PrefixPath >]* <WhereClause> [ORDER BY TIMESERIES (DESC | ASC)?] -``` - -which means: Query and return the last data points of timeseries prefixPath.path. - -- Only time filters are supported in \<WhereClause\>. Any other filter given in the \<WhereClause\> will cause an exception. When the cached most recent data point does not satisfy the criterion specified by the filter, IoTDB has to get the result from external storage, which may decrease performance. - -- The result will be returned in a four column table format. - - ``` - | Time | timeseries | value | dataType | - ``` - - **Note:** The `value` column always returns the value as a `string` and thus has type `TSDataType.TEXT`. Therefore, the `dataType` column is also returned; it contains the _real_ type that tells how the value should be interpreted. - -- We can use `TIME/TIMESERIES/VALUE/DATATYPE (DESC | ASC)` to specify that the result set is sorted in descending/ascending order based on a particular column. When the value column contains multiple types of data, the sorting is based on the string representation of the values.
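
Because the `value` column is always a string, client code typically converts it using the `dataType` column. The following is a minimal sketch of that conversion in plain Java. The class `LastValueSketch` and its `parse` method are hypothetical helpers for illustration and are not part of the IoTDB API; only the type names that appear in this document (`BOOLEAN`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `TEXT`) are handled, and anything else is left as a string.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of the IoTDB API. */
final class LastValueSketch {

  /** Converts the string from the `value` column using the name in the `dataType` column. */
  static Object parse(String value, String dataType) {
    switch (dataType.toUpperCase(Locale.ROOT)) {
      case "BOOLEAN":
        return Boolean.parseBoolean(value);
      case "INT32":
        return Integer.parseInt(value);
      case "INT64":
        return Long.parseLong(value);
      case "FLOAT":
        return Float.parseFloat(value);
      case "DOUBLE":
        return Double.parseDouble(value);
      default:
        // TEXT and any type not handled in this sketch stay as strings.
        return value;
    }
  }

  public static void main(String[] args) {
    // Values taken from the last-query examples in this section.
    System.out.println(parse("false", "BOOLEAN"));
    System.out.println(parse("21.067368", "DOUBLE"));
  }
}
```

Returning `Object` keeps the sketch short; real client code would more likely branch on the type once and call a typed handler.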
- -**Example 1:** get the last point of root.ln.wf01.wt01.status: - -```sql -select last status from root.ln.wf01.wt01; -``` -``` -+-----------------------------+------------------------+-----+--------+ -| Time| timeseries|value|dataType| -+-----------------------------+------------------------+-----+--------+ -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.status|false| BOOLEAN| -+-----------------------------+------------------------+-----+--------+ -Total line number = 1 -It costs 0.000s -``` - -**Example 2:** get the last status and temperature points of root.ln.wf01.wt01 whose timestamps are greater than or equal to 2017-11-07T23:50:00. - -```sql -select last status, temperature from root.ln.wf01.wt01 where time >= 2017-11-07T23:50:00; -``` -``` -+-----------------------------+-----------------------------+---------+--------+ -| Time| timeseries| value|dataType| -+-----------------------------+-----------------------------+---------+--------+ -|2017-11-07T23:59:00.000+08:00| root.ln.wf01.wt01.status| false| BOOLEAN| -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.temperature|21.067368| DOUBLE| -+-----------------------------+-----------------------------+---------+--------+ -Total line number = 2 -It costs 0.002s -``` - -**Example 3:** get the last points of all sensors in root.ln.wf01.wt01, and order the result by the timeseries column in descending order - -```sql -select last * from root.ln.wf01.wt01 order by timeseries desc; -``` -``` -+-----------------------------+-----------------------------+---------+--------+ -| Time| timeseries| value|dataType| -+-----------------------------+-----------------------------+---------+--------+ -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.temperature|21.067368| DOUBLE| -|2017-11-07T23:59:00.000+08:00| root.ln.wf01.wt01.status| false| BOOLEAN| -+-----------------------------+-----------------------------+---------+--------+ -Total line number = 2 -It costs 0.002s -``` - -**Example 4:** get the last points of all sensors in
root.ln.wf01.wt01, and order the result by the dataType column in descending order - -```sql -select last * from root.ln.wf01.wt01 order by dataType desc; -``` -``` -+-----------------------------+-----------------------------+---------+--------+ -| Time| timeseries| value|dataType| -+-----------------------------+-----------------------------+---------+--------+ -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.temperature|21.067368| DOUBLE| -|2017-11-07T23:59:00.000+08:00| root.ln.wf01.wt01.status| false| BOOLEAN| -+-----------------------------+-----------------------------+---------+--------+ -Total line number = 2 -It costs 0.002s -``` - -**Note:** The requirement to query the latest data point with other filtering conditions can be implemented through function composition. For example: -```sql -select max_time(*), last_value(*) from root.ln.wf01.wt01 where time >= 2017-11-07T23:50:00 and status = false align by device; -``` -``` -+-----------------+---------------------+----------------+-----------------------+------------------+ -| Device|max_time(temperature)|max_time(status)|last_value(temperature)|last_value(status)| -+-----------------+---------------------+----------------+-----------------------+------------------+ -|root.ln.wf01.wt01| 1510077540000| 1510077540000| 21.067368| false| -+-----------------+---------------------+----------------+-----------------------+------------------+ -Total line number = 1 -It costs 0.021s -``` - - -## 3. `WHERE` CLAUSE - -In IoTDB query statements, two filter conditions, **time filter** and **value filter**, are supported. - -The supported operators are as follows: - -- Comparison operators: greater than (`>`), greater than or equal ( `>=`), equal ( `=` or `==`), not equal ( `!=` or `<>`), less than or equal ( `<=`), less than ( `<`). -- Logical operators: and ( `AND` or `&` or `&&`), or ( `OR` or `|` or `||`), not ( `NOT` or `!`). -- Range contains operator: contains ( `IN` ). 
-- String matches operator: `LIKE`, `REGEXP`. - -### 3.1 Time Filter - -Use time filters to filter data for a specific time range. For supported formats of timestamps, please refer to [Timestamp](../Background-knowledge/Data-Type.md) . - -Examples are as follows: - -1. Select data with timestamp greater than 2022-01-01T00:05:00.000: - - ```sql - select s1 from root.sg1.d1 where time > 2022-01-01T00:05:00.000; - ``` - -2. Select data with timestamp equal to 2022-01-01T00:05:00.000: - - ```sql - select s1 from root.sg1.d1 where time = 2022-01-01T00:05:00.000; - ``` - -3. Select the data in the time interval [2017-11-01T00:05:00.000, 2017-11-01T00:12:00.000): - - ```sql - select s1 from root.sg1.d1 where time >= 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; - ``` - -Note: In the above examples, `time` can also be written as `timestamp`. - -### 3.2 Value Filter - -Use value filters to filter data whose values meet certain criteria. A time series that is not selected in the select clause **may** also be used as a value filter. - -Examples are as follows: - -1. Select data with a value greater than 36.5: - - ```sql - select temperature from root.sg1.d1 where temperature > 36.5; - ``` - -2. Select data with value equal to true: - - ```sql - select status from root.sg1.d1 where status = true; - ``` - -3. Select data inside or outside the interval [36.5,40]: - - ```sql - select temperature from root.sg1.d1 where temperature between 36.5 and 40; - ``` - - ```sql - select temperature from root.sg1.d1 where temperature not between 36.5 and 40; - ``` - -4. Select data with values in a given set: - - ```sql - select code from root.sg1.d1 where code in ('200', '300', '400', '500'); - ``` - -5. Select data with values not in a given set: - - ```sql - select code from root.sg1.d1 where code not in ('200', '300', '400', '500'); - ``` - -6. Select data where the value is null: - - ```sql - select code from root.sg1.d1 where temperature is null; - ``` - -7.
Select data where the value is not null: - - ```sql - select code from root.sg1.d1 where temperature is not null; - ```` - -### 3.3 Fuzzy Query - -Fuzzy query is divided into the Like statement and the Regexp statement, both of which support fuzzy matching of TEXT type and STRING type data. - -#### Fuzzy matching using `Like` - -In the value filter condition, for TEXT type data, use `Like` and `Regexp` operators to perform fuzzy matching on data. - -**Matching rules:** - -- The percentage (`%`) wildcard matches any string of zero or more characters. -- The underscore (`_`) wildcard matches any single character. - -**Example 1:** Query data containing `'cc'` in `value` under `root.sg.d1`. - -```sql -select * from root.sg.d1 where value like '%cc%' -``` -``` -+-----------------------------+----------------+ -| Time|root.sg.d1.value| -+-----------------------------+----------------+ -|2017-11-01T00:00:00.000+08:00| aabbccdd| -|2017-11-01T00:00:01.000+08:00| cc| -+-----------------------------+----------------+ -Total line number = 2 -It costs 0.002s -``` - -**Example 2:** Query data that consists of 3 characters and the second character is `'b'` in `value` under `root.sg.d1`. - -```sql -select * from root.sg.d1 where value like '_b_'; -``` -``` -+-----------------------------+----------------+ -| Time|root.sg.d1.value| -+-----------------------------+----------------+ -|2017-11-01T00:00:02.000+08:00| abc| -+-----------------------------+----------------+ -Total line number = 1 -It costs 0.002s -``` - -#### Fuzzy matching using `Regexp` - -The filter conditions that need to be passed in are regular expressions in the Java standard library style.
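
Since the patterns follow `java.util.regex` syntax, a candidate pattern can be tried locally before it is used in a query. The sketch below is a minimal illustration with a hypothetical helper class `RegexpSketch`, which is not part of IoTDB. This section does not state whether the server requires the whole string to match or only a part of it, so the sketch uses anchored patterns only, for which both interpretations agree.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for trying out patterns locally; not part of IoTDB. */
final class RegexpSketch {

  /** Returns true if the whole value matches the given Java regular expression. */
  static boolean matches(String regex, String value) {
    return Pattern.compile(regex).matcher(value).matches();
  }

  public static void main(String[] args) {
    // Letters only: accepts "aabbccdd", rejects "abc1".
    System.out.println(matches("^[A-Za-z]+$", "aabbccdd"));
    System.out.println(matches("^[A-Za-z]+$", "abc1"));
    // Length between 3 and 20: rejects the two-character string "cc".
    System.out.println(matches("^.{3,20}$", "cc"));
  }
}
```

Note that inside a Java string literal a backslash must be doubled (for example `"\\d+"`), whereas the SQL examples in this section pass the pattern in single quotes.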
- -**Examples of common regular matching:** - -``` -Any string with a length of 3-20: ^.{3,20}$ -Uppercase English characters: ^[A-Z]+$ -Numbers and English characters: ^[A-Za-z0-9]+$ -Beginning with a: ^a.* -``` - -**Example 1:** Query data under root.sg.d1 whose value is a string composed only of English letters - -```sql -select * from root.sg.d1 where value regexp '^[A-Za-z]+$' -``` -``` -+-----------------------------+----------------+ -| Time|root.sg.d1.value| -+-----------------------------+----------------+ -|2017-11-01T00:00:00.000+08:00| aabbccdd| -|2017-11-01T00:00:01.000+08:00| cc| -+-----------------------------+----------------+ -Total line number = 2 -It costs 0.002s -``` - -**Example 2:** Query data under root.sg.d1 whose value is a string composed only of lowercase English letters and whose time is greater than 100 - -```sql -select * from root.sg.d1 where value regexp '^[a-z]+$' and time > 100 -``` -``` -+-----------------------------+----------------+ -| Time|root.sg.d1.value| -+-----------------------------+----------------+ -|2017-11-01T00:00:00.000+08:00| aabbccdd| -|2017-11-01T00:00:01.000+08:00| cc| -+-----------------------------+----------------+ -Total line number = 2 -It costs 0.002s -``` - -## 4. `GROUP BY` CLAUSE - -IoTDB supports using the `GROUP BY` clause to aggregate time series by segment and by group. - -Segmented aggregation refers to segmenting data in the row direction according to the time dimension, aiming at the time relationship between different data points in the same time series, and obtaining an aggregated value for each segment. Currently only **group by time**, **group by variation**, **group by condition**, **group by session** and **group by count** are supported, and more segmentation methods will be supported in the future. - -Group aggregation refers to grouping the potential business attributes of time series for different time series. Each group contains several time series, and each group gets an aggregated value.
Two grouping methods are supported: **group by path level** and **group by tag**. - -### 4.1 Aggregate By Segment - -#### Aggregate By Time - -Aggregate by time is a typical query method for time series data. Data is collected at high frequency and needs to be aggregated and calculated at certain time intervals. For example, to calculate the daily average temperature, the temperature sequence needs to be segmented by day, and then the average value of each segment is calculated. - -Aggregate by time refers to a query method that uses a lower frequency than the time frequency of data collection, and is a special case of segmented aggregation. For example, the frequency of data collection is one second. If you want to display the data at one-minute granularity, you need to use time aggregation. - -This section mainly introduces the related examples of time aggregation, using the `GROUP BY` clause. IoTDB supports partitioning result sets according to time interval and customized sliding step. By default, results are sorted by time in ascending order. - -The GROUP BY statement provides users with three types of specified parameters: - -* Parameter 1: The display window on the time axis -* Parameter 2: Time interval for dividing the time axis (should be positive) -* Parameter 3: Time sliding step (optional and defaults to equal the time interval if not set) - -The actual meanings of the three types of parameters are shown in the figure below. -Among them, parameter 3 is optional. - -
-
- - -There are three typical examples of frequency reduction aggregation: - -##### Aggregate By Time without Specifying the Sliding Step Length - -The SQL statement is: - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d); -``` - -which means: - -Since the sliding step length is not specified, the `GROUP BY` statement by default sets the sliding step to the same value as the time interval, which is `1d`. - -The first parameter of the `GROUP BY` statement above is the display window parameter, which determines the final display range is [2017-11-01T00:00:00, 2017-11-07T23:00:00). - -The second parameter of the `GROUP BY` statement above is the time interval for dividing the time axis. Taking this parameter (1d) as time interval and startTime of the display window as the dividing origin, the time axis is divided into several continuous intervals, which are [0,1d), [1d, 2d), [2d, 3d), etc. - -Then the system will use the time and value filtering condition in the `WHERE` clause and the first parameter of the `GROUP BY` statement as the data filtering condition to obtain the data satisfying the filtering condition (which in this case is the data in the range of [2017-11-01T00:00:00, 2017-11-07T23:00:00)), and map these data to the previously segmented time axis (in this case there are mapped data in every 1-day period from 2017-11-01T00:00:00 to 2017-11-07T23:00:00).
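
The division of the time axis described above can be sketched in a few lines of Java. This is an illustration under stated assumptions, not IoTDB's implementation: the class `TimeWindowSketch` and its method are hypothetical, times are plain millisecond offsets, the interval and step are fixed-length durations (so natural-month units such as `1mo` are not covered), and the last window is clipped to the end of the display window, which is inferred from the smaller count reported for the final day of this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of GROUP BY time windows; not IoTDB's implementation. */
final class TimeWindowSketch {

  /**
   * Lists the left-closed, right-open windows [start, end) produced for a display
   * window [displayStart, displayEnd), a fixed interval, and a sliding step (all in ms).
   */
  static List<long[]> windows(long displayStart, long displayEnd, long interval, long step) {
    if (interval <= 0 || step <= 0) {
      throw new IllegalArgumentException("interval and step must be positive");
    }
    List<long[]> result = new ArrayList<>();
    for (long s = displayStart; s < displayEnd; s += step) {
      // Assumption: a window never extends past the end of the display window.
      result.add(new long[] {s, Math.min(s + interval, displayEnd)});
    }
    return result;
  }

  public static void main(String[] args) {
    long hour = 3_600_000L;
    long day = 24 * hour;
    // Display window of 6 days and 23 hours, interval 1d, step defaulting to the interval.
    for (long[] w : windows(0L, 6 * day + 23 * hour, day, day)) {
      System.out.println("[" + w[0] + ", " + w[1] + ")");
    }
  }
}
```

Passing a step different from the interval gives the sliding-step variants: a step larger than the interval leaves gaps between windows, and a smaller step makes them overlap.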
- -Since there is data for each time period in the result range to be displayed, the execution result of the SQL statement is shown below: - -``` -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-01T00:00:00.000+08:00| 1440| 26.0| -|2017-11-02T00:00:00.000+08:00| 1440| 26.0| -|2017-11-03T00:00:00.000+08:00| 1440| 25.99| -|2017-11-04T00:00:00.000+08:00| 1440| 26.0| -|2017-11-05T00:00:00.000+08:00| 1440| 26.0| -|2017-11-06T00:00:00.000+08:00| 1440| 25.99| -|2017-11-07T00:00:00.000+08:00| 1380| 26.0| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 7 -It costs 0.024s -``` - -##### Aggregate By Time Specifying the Sliding Step Length - -The SQL statement is: - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d); -``` - -which means: - -Since the user specifies the sliding step parameter as 1d, the `GROUP BY` statement will move the time interval `1 day` long instead of `3 hours` as default. - -That means we want to fetch all the data of 00:00:00 to 02:59:59 every day from 2017-11-01 to 2017-11-07. - -The first parameter of the `GROUP BY` statement above is the display window parameter, which determines the final display range is [2017-11-01T00:00:00, 2017-11-07T23:00:00). - -The second parameter of the `GROUP BY` statement above is the time interval for dividing the time axis. 
Taking this parameter (3h) as time interval and the startTime of the display window as the dividing origin, the time axis is divided into several intervals, which are [2017-11-01T00:00:00, 2017-11-01T03:00:00), [2017-11-02T00:00:00, 2017-11-02T03:00:00), [2017-11-03T00:00:00, 2017-11-03T03:00:00), etc. - -The third parameter of the `GROUP BY` statement above is the sliding step for each time interval moving. - -Then the system will use the time and value filtering condition in the `WHERE` clause and the first parameter of the `GROUP BY` statement as the data filtering condition to obtain the data satisfying the filtering condition (which in this case is the data in the range of [2017-11-01T00:00:00, 2017-11-07T23:00:00)), and map these data to the previously segmented time axis (in this case there are mapped data in every 3-hour period for each day from 2017-11-01T00:00:00 to 2017-11-07T23:00:00). - -Since there is data for each time period in the result range to be displayed, the execution result of the SQL statement is shown below: - -``` -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-01T00:00:00.000+08:00| 180| 25.98| -|2017-11-02T00:00:00.000+08:00| 180| 25.98| -|2017-11-03T00:00:00.000+08:00| 180| 25.96| -|2017-11-04T00:00:00.000+08:00| 180| 25.96| -|2017-11-05T00:00:00.000+08:00| 180| 26.0| -|2017-11-06T00:00:00.000+08:00| 180| 25.85| -|2017-11-07T00:00:00.000+08:00| 180| 25.99| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 7 -It costs 0.006s -``` - -The sliding step can be smaller than the interval, in which case there is overlapping time between the aggregation windows (similar to a sliding window).
- -The SQL statement is: - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-01 10:00:00), 4h, 2h); -``` - -The execution result of the SQL statement is shown below: - -``` -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-01T00:00:00.000+08:00| 180| 25.98| -|2017-11-01T02:00:00.000+08:00| 180| 25.98| -|2017-11-01T04:00:00.000+08:00| 180| 25.96| -|2017-11-01T06:00:00.000+08:00| 180| 25.96| -|2017-11-01T08:00:00.000+08:00| 180| 26.0| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 5 -It costs 0.006s -``` - -##### Aggregate by Natural Month - -The SQL statement is: - -```sql -select count(status) from root.ln.wf01.wt01 group by([2017-11-01T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo); -``` - -which means: - -Since the user specifies the sliding step parameter as `2mo`, the `GROUP BY` statement will move the time interval `2 months` long instead of `1 month` as default. - -The first parameter of the `GROUP BY` statement above is the display window parameter, which determines the final display range is [2017-11-01T00:00:00, 2019-11-07T23:00:00). - -The start time is 2017-11-01T00:00:00. The sliding step will increment monthly based on the start date, and the 1st day of the month will be used as the time interval's start time. - -The second parameter of the `GROUP BY` statement above is the time interval for dividing the time axis. 
Taking this parameter (1mo) as time interval and the startTime of the display window as the dividing origin, the time axis is divided into several intervals, which are [2017-11-01T00:00:00, 2017-12-01T00:00:00), [2018-01-01T00:00:00, 2018-02-01T00:00:00), [2018-03-01T00:00:00, 2018-04-01T00:00:00), etc. - -The third parameter of the `GROUP BY` statement above is the sliding step for each time interval moving. - -Then the system will use the time and value filtering condition in the `WHERE` clause and the first parameter of the `GROUP BY` statement as the data filtering condition to obtain the data satisfying the filtering condition (which in this case is the data in the range of [2017-11-01T00:00:00, 2019-11-07T23:00:00)), and map these data to the previously segmented time axis (in this case there are mapped data of the first month in every two month period from 2017-11-01T00:00:00 to 2019-11-07T23:00:00). - -The SQL execution result is: - -``` -+-----------------------------+-------------------------------+ -| Time|count(root.ln.wf01.wt01.status)| -+-----------------------------+-------------------------------+ -|2017-11-01T00:00:00.000+08:00| 259| -|2018-01-01T00:00:00.000+08:00| 250| -|2018-03-01T00:00:00.000+08:00| 259| -|2018-05-01T00:00:00.000+08:00| 251| -|2018-07-01T00:00:00.000+08:00| 242| -|2018-09-01T00:00:00.000+08:00| 225| -|2018-11-01T00:00:00.000+08:00| 216| -|2019-01-01T00:00:00.000+08:00| 207| -|2019-03-01T00:00:00.000+08:00| 216| -|2019-05-01T00:00:00.000+08:00| 207| -|2019-07-01T00:00:00.000+08:00| 199| -|2019-09-01T00:00:00.000+08:00| 181| -|2019-11-01T00:00:00.000+08:00| 60| -+-----------------------------+-------------------------------+ -``` - -The SQL statement is: - -```sql -select count(status) from root.ln.wf01.wt01 group by([2017-10-31T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo); -``` - -which means: - -Since the user specifies the sliding step parameter as `2mo`, the `GROUP BY` statement will move the time interval `2 months`
long instead of `1 month` as default. - -The first parameter of the `GROUP BY` statement above is the display window parameter, which determines the final display range is [2017-10-31T00:00:00, 2019-11-07T23:00:00). - -Different from the previous example, the start time is set to 2017-10-31T00:00:00. The sliding step will increment monthly based on the start date, and the 31st day of the month, meaning the last day of the month, will be used as the time interval's start time. If the start time is set to the 30th, the sliding step will use the 30th or the last day of the month. - -The second parameter of the `GROUP BY` statement above is the time interval for dividing the time axis. Taking this parameter (1mo) as time interval and the startTime of the display window as the dividing origin, the time axis is divided into several intervals, which are [2017-10-31T00:00:00, 2017-11-30T00:00:00), [2017-12-31T00:00:00, 2018-01-31T00:00:00), [2018-02-28T00:00:00, 2018-03-31T00:00:00), etc. - -The third parameter of the `GROUP BY` statement above is the sliding step for each time interval moving. - -Then the system will use the time and value filtering condition in the `WHERE` clause and the first parameter of the `GROUP BY` statement as the data filtering condition to obtain the data satisfying the filtering condition (which in this case is the data in the range of [2017-10-31T00:00:00, 2019-11-07T23:00:00)), and map these data to the previously segmented time axis (in this case there are mapped data of the first month in every two month period from 2017-10-31T00:00:00 to 2019-11-07T23:00:00).
- -The SQL execution result is: - -``` -+-----------------------------+-------------------------------+ -| Time|count(root.ln.wf01.wt01.status)| -+-----------------------------+-------------------------------+ -|2017-10-31T00:00:00.000+08:00| 251| -|2017-12-31T00:00:00.000+08:00| 250| -|2018-02-28T00:00:00.000+08:00| 259| -|2018-04-30T00:00:00.000+08:00| 250| -|2018-06-30T00:00:00.000+08:00| 242| -|2018-08-31T00:00:00.000+08:00| 225| -|2018-10-31T00:00:00.000+08:00| 216| -|2018-12-31T00:00:00.000+08:00| 208| -|2019-02-28T00:00:00.000+08:00| 216| -|2019-04-30T00:00:00.000+08:00| 208| -|2019-06-30T00:00:00.000+08:00| 199| -|2019-08-31T00:00:00.000+08:00| 181| -|2019-10-31T00:00:00.000+08:00| 69| -+-----------------------------+-------------------------------+ -``` - -##### Left Open And Right Close Range - -The SQL statement is: - -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d); -``` - -In this sql, the time interval is left open and right close, so we won't include the value of timestamp 2017-11-01T00:00:00 and instead we will include the value of timestamp 2017-11-07T23:00:00. - -We will get the result like following: - -``` -+-----------------------------+-------------------------------+ -| Time|count(root.ln.wf01.wt01.status)| -+-----------------------------+-------------------------------+ -|2017-11-02T00:00:00.000+08:00| 1440| -|2017-11-03T00:00:00.000+08:00| 1440| -|2017-11-04T00:00:00.000+08:00| 1440| -|2017-11-05T00:00:00.000+08:00| 1440| -|2017-11-06T00:00:00.000+08:00| 1440| -|2017-11-07T00:00:00.000+08:00| 1440| -|2017-11-07T23:00:00.000+08:00| 1380| -+-----------------------------+-------------------------------+ -Total line number = 7 -It costs 0.004s -``` - -#### Aggregation By Variation - -IoTDB supports grouping by continuous stable values through the `GROUP BY VARIATION` statement. 
- -Group-By-Variation will set the first point in a group as the base point, -then if the difference between a new data point and the base point is smaller than or equal to delta, -the data point will be added to the group and included in the aggregation (the calculation of the difference and the meaning of delta are introduced below). The groups won't overlap and there is no fixed start time and end time. -The syntax of the clause is as follows: - -```sql -group by variation(controlExpression[,delta][,ignoreNull=true/false]) -``` - -The different parameters mean: - -* controlExpression - -The value that is used to calculate the difference. It can be any column or an expression over columns. - -* delta - -The threshold that is used when grouping. The difference of controlExpression between the first data point and a new data point should be less than or equal to delta. -When delta is zero, all the continuous data with equal expression value will be grouped into the same group. - -* ignoreNull - -Used to specify how to deal with the data when the value of controlExpression is null. When ignoreNull is false, null will be treated as a new value and when ignoreNull is true, the data point will be directly skipped. - -The supported return types of controlExpression and how to deal with null value when ignoreNull is false are shown in the following table: - -| delta | Return Type Supported By controlExpression | The Handling of null when ignoreNull is False | -| -------- | ------------------------------------------ | ------------------------------------------------------------ | -| delta!=0 | INT32、INT64、FLOAT、DOUBLE | If the processing group doesn't contain null, null value should be treated as infinity/infinitesimal and will end current group.
Continuous null values are treated as stable values and assigned to the same group. | -| delta=0 | TEXT、BINARY、INT32、INT64、FLOAT、DOUBLE | Null is treated as a new value in a new group and continuous nulls belong to the same group. | - -groupByVariation - -##### Precautions for Use - -1. The result of controlExpression should be a unique value. If multiple columns appear after using wildcard stitching, an error will be reported. -2. For a group in resultSet, the time column outputs the start time of the group by default. __endTime can be used in the select clause to output the endTime of groups in resultSet. -3. Each device is grouped separately when used with `ALIGN BY DEVICE`. -4. Delta is zero and ignoreNull is true by default. -5. Currently `GROUP BY VARIATION` is not supported with `GROUP BY LEVEL`. - -Using the raw data below, several examples of `GROUP BY VARIATION` queries will be given. - -``` -+-----------------------------+-------+-------+-------+--------+-------+-------+ -| Time| s1| s2| s3| s4| s5| s6| -+-----------------------------+-------+-------+-------+--------+-------+-------+ -|1970-01-01T08:00:00.000+08:00| 4.5| 9.0| 0.0| 45.0| 9.0| 8.25| -|1970-01-01T08:00:00.010+08:00| null| 19.0| 10.0| 145.0| 19.0| 8.25| -|1970-01-01T08:00:00.020+08:00| 24.5| 29.0| null| 245.0| 29.0| null| -|1970-01-01T08:00:00.030+08:00| 34.5| null| 30.0| 345.0| null| null| -|1970-01-01T08:00:00.040+08:00| 44.5| 49.0| 40.0| 445.0| 49.0| 8.25| -|1970-01-01T08:00:00.050+08:00| null| 59.0| 50.0| 545.0| 59.0| 6.25| -|1970-01-01T08:00:00.060+08:00| 64.5| 69.0| 60.0| 645.0| 69.0| null| -|1970-01-01T08:00:00.070+08:00| 74.5| 79.0| null| null| 79.0| 3.25| -|1970-01-01T08:00:00.080+08:00| 84.5| 89.0| 80.0| 845.0| 89.0| 3.25| -|1970-01-01T08:00:00.090+08:00| 94.5| 99.0| 90.0| 945.0| 99.0| 3.25| -|1970-01-01T08:00:00.150+08:00| 66.5| 77.0| 90.0| 945.0| 99.0| 9.25| -+-----------------------------+-------+-------+-------+--------+-------+-------+ -``` - -##### delta = 0 - -The sql is shown
below: - -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6) -``` - -Get the result below which ignores the row with null value in `s6`. - -``` -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.040+08:00| 24.5| 3| 50.0| -|1970-01-01T08:00:00.050+08:00|1970-01-01T08:00:00.050+08:00| null| 1| 50.0| -|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.090+08:00| 84.5| 3| 170.0| -|1970-01-01T08:00:00.150+08:00|1970-01-01T08:00:00.150+08:00| 66.5| 1| 90.0| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -``` - -when ignoreNull is false, the row with null value in `s6` will be considered. - -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, ignoreNull=false) -``` - -Get the following result. 
- -``` -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.010+08:00| 4.5| 2| 10.0| -|1970-01-01T08:00:00.020+08:00|1970-01-01T08:00:00.030+08:00| 29.5| 1| 30.0| -|1970-01-01T08:00:00.040+08:00|1970-01-01T08:00:00.040+08:00| 44.5| 1| 40.0| -|1970-01-01T08:00:00.050+08:00|1970-01-01T08:00:00.050+08:00| null| 1| 50.0| -|1970-01-01T08:00:00.060+08:00|1970-01-01T08:00:00.060+08:00| 64.5| 1| 60.0| -|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.090+08:00| 84.5| 3| 170.0| -|1970-01-01T08:00:00.150+08:00|1970-01-01T08:00:00.150+08:00| 66.5| 1| 90.0| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -``` - -##### delta !=0 - -The sql is shown below: - -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, 4) -``` - -Get the result below: - -``` -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.050+08:00| 24.5| 4| 100.0| -|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.090+08:00| 84.5| 3| 170.0| -|1970-01-01T08:00:00.150+08:00|1970-01-01T08:00:00.150+08:00| 66.5| 1| 90.0| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -``` - -The sql is shown below: - -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6+s5, 10) -``` - -Get the result 
below: - -``` -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.010+08:00| 4.5| 2| 10.0| -|1970-01-01T08:00:00.040+08:00|1970-01-01T08:00:00.050+08:00| 44.5| 2| 90.0| -|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.080+08:00| 79.5| 2| 80.0| -|1970-01-01T08:00:00.090+08:00|1970-01-01T08:00:00.150+08:00| 80.5| 2| 180.0| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -``` - -#### Aggregation By Condition - -`GROUP BY CONDITION` is suitable when you need to filter the data according to a specific condition and group the continuous matching rows for an aggregation query. -The rows which don't meet the given condition are simply ignored because they don't belong to any group. -Its syntax is defined below: - -```sql -group by condition(predict,[keep>/>=/=/<=/<]threshold[,ignoreNull=true/false]) -``` - -* predict - -Any legal expression that returns a boolean value, used for filtering during grouping. - -* [keep>/>=/=/<=/<]threshold - -The keep expression specifies the number of continuous rows that must meet the `predict` condition to form a group. The result of a group is output only if the number of rows in the group satisfies the keep condition. -The keep expression consists of the 'keep' string, a comparison operator, and a threshold of type `long`, or of a single `long` value. - -* ignoreNull=true/false - -Specifies how to handle rows for which `predict` is null: the row is skipped when ignoreNull is true, and the current group is ended when it is false. - -##### Precautions for Use - -1. The keep condition is required in the query, but you can omit the 'keep' string and give only a `long` number, which defaults to the 'keep=long number' condition. -2.
IgnoreNull defaults to true. -3. For a group in resultSet, the time column output the start time of the group by default. __endTime can be used in select clause to output the endTime of groups in resultSet. -4. Each device is grouped separately when used with `ALIGN BY DEVICE`. -5. Currently `GROUP BY CONDITION` is not supported with `GROUP BY LEVEL`. - -For the following raw data, several query examples are given below: - -``` -+-----------------------------+-------------------------+-------------------------------------+------------------------------------+ -| Time|root.sg.beijing.car01.soc|root.sg.beijing.car01.charging_status|root.sg.beijing.car01.vehicle_status| -+-----------------------------+-------------------------+-------------------------------------+------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 14.0| 1| 1| -|1970-01-01T08:00:00.002+08:00| 16.0| 1| 1| -|1970-01-01T08:00:00.003+08:00| 16.0| 0| 1| -|1970-01-01T08:00:00.004+08:00| 16.0| 0| 1| -|1970-01-01T08:00:00.005+08:00| 18.0| 1| 1| -|1970-01-01T08:00:00.006+08:00| 24.0| 1| 1| -|1970-01-01T08:00:00.007+08:00| 36.0| 1| 1| -|1970-01-01T08:00:00.008+08:00| 36.0| null| 1| -|1970-01-01T08:00:00.009+08:00| 45.0| 1| 1| -|1970-01-01T08:00:00.010+08:00| 60.0| 1| 1| -+-----------------------------+-------------------------+-------------------------------------+------------------------------------+ -``` - -The sql statement to query data with at least two continuous row shown below: - -```sql -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoringNull=true) -``` - -Get the result below: - -``` -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -| Time|max_time(root.sg.beijing.car01.charging_status)|count(root.sg.beijing.car01.vehicle_status)|last_value(root.sg.beijing.car01.soc)| 
-+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 2| 2| 16.0| -|1970-01-01T08:00:00.005+08:00| 10| 5| 60.0| -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -``` - -When ignoreNull is false, the null value will be treated as a row that doesn't meet the condition. - -```sql -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoringNull=false) -``` - -Get the result below, the original group is split. - -``` -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -| Time|max_time(root.sg.beijing.car01.charging_status)|count(root.sg.beijing.car01.vehicle_status)|last_value(root.sg.beijing.car01.soc)| -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 2| 2| 16.0| -|1970-01-01T08:00:00.005+08:00| 7| 3| 36.0| -|1970-01-01T08:00:00.009+08:00| 10| 2| 60.0| -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -``` - -#### Aggregation By Session - -`GROUP BY SESSION` can be used to group data according to the interval of the time. Data with a time interval less than or equal to the given threshold will be assigned to the same group. -For example, in industrial scenarios, devices don't always run continuously, `GROUP BY SESSION` will group the data generated by each access session of the device. 
-Its syntax is defined as follows: - -```sql -group by session(timeInterval) -``` - -* timeInterval - -A given interval threshold to create a new group of data when the difference between the time of data is greater than the threshold. - -The figure below is a grouping diagram under `GROUP BY SESSION`. - -groupBySession - -##### Precautions for Use - -1. For a group in resultSet, the time column output the start time of the group by default. __endTime can be used in select clause to output the endTime of groups in resultSet. -2. Each device is grouped separately when used with `ALIGN BY DEVICE`. -3. Currently `GROUP BY SESSION` is not supported with `GROUP BY LEVEL`. - -For the raw data below, a few query examples are given: - -``` -+-----------------------------+-----------------+-----------+--------+------+ -| Time| Device|temperature|hardware|status| -+-----------------------------+-----------------+-----------+--------+------+ -|1970-01-01T08:00:01.000+08:00|root.ln.wf02.wt01| 35.7| 11| false| -|1970-01-01T08:00:02.000+08:00|root.ln.wf02.wt01| 35.8| 22| true| -|1970-01-01T08:00:03.000+08:00|root.ln.wf02.wt01| 35.4| 33| false| -|1970-01-01T08:00:04.000+08:00|root.ln.wf02.wt01| 36.4| 44| false| -|1970-01-01T08:00:05.000+08:00|root.ln.wf02.wt01| 36.8| 55| false| -|1970-01-01T08:00:10.000+08:00|root.ln.wf02.wt01| 36.8| 110| false| -|1970-01-01T08:00:20.000+08:00|root.ln.wf02.wt01| 37.8| 220| true| -|1970-01-01T08:00:30.000+08:00|root.ln.wf02.wt01| 37.5| 330| false| -|1970-01-01T08:00:40.000+08:00|root.ln.wf02.wt01| 37.4| 440| false| -|1970-01-01T08:00:50.000+08:00|root.ln.wf02.wt01| 37.9| 550| false| -|1970-01-01T08:01:40.000+08:00|root.ln.wf02.wt01| 38.0| 110| false| -|1970-01-01T08:02:30.000+08:00|root.ln.wf02.wt01| 38.8| 220| true| -|1970-01-01T08:03:20.000+08:00|root.ln.wf02.wt01| 38.6| 330| false| -|1970-01-01T08:04:20.000+08:00|root.ln.wf02.wt01| 38.4| 440| false| -|1970-01-01T08:05:20.000+08:00|root.ln.wf02.wt01| 38.3| 550| false| 
-|1970-01-01T08:06:40.000+08:00|root.ln.wf02.wt01| null| 0| null| -|1970-01-01T08:07:50.000+08:00|root.ln.wf02.wt01| null| 0| null| -|1970-01-01T08:08:00.000+08:00|root.ln.wf02.wt01| null| 0| null| -|1970-01-02T08:08:01.000+08:00|root.ln.wf02.wt01| 38.2| 110| false| -|1970-01-02T08:08:02.000+08:00|root.ln.wf02.wt01| 37.5| 220| true| -|1970-01-02T08:08:03.000+08:00|root.ln.wf02.wt01| 37.4| 330| false| -|1970-01-02T08:08:04.000+08:00|root.ln.wf02.wt01| 36.8| 440| false| -|1970-01-02T08:08:05.000+08:00|root.ln.wf02.wt01| 37.4| 550| false| -+-----------------------------+-----------------+-----------+--------+------+ -``` - -TimeInterval can be set by different time units, the sql is shown below: - -```sql -select __endTime,count(*) from root.** group by session(1d) -``` - -Get the result: - -``` -+-----------------------------+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -| Time| __endTime|count(root.ln.wf02.wt01.temperature)|count(root.ln.wf02.wt01.hardware)|count(root.ln.wf02.wt01.status)| -+-----------------------------+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -|1970-01-01T08:00:01.000+08:00|1970-01-01T08:08:00.000+08:00| 15| 18| 15| -|1970-01-02T08:08:01.000+08:00|1970-01-02T08:08:05.000+08:00| 5| 5| 5| -+-----------------------------+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -``` - -It can be also used with `HAVING` and `ALIGN BY DEVICE` clauses. 
- -```sql -select __endTime,sum(hardware) from root.ln.wf02.wt01 group by session(50s) having sum(hardware)>0 align by device -``` - -Get the result below: - -``` -+-----------------------------+-----------------+-----------------------------+-------------+ -| Time| Device| __endTime|sum(hardware)| -+-----------------------------+-----------------+-----------------------------+-------------+ -|1970-01-01T08:00:01.000+08:00|root.ln.wf02.wt01|1970-01-01T08:03:20.000+08:00| 2475.0| -|1970-01-01T08:04:20.000+08:00|root.ln.wf02.wt01|1970-01-01T08:04:20.000+08:00| 440.0| -|1970-01-01T08:05:20.000+08:00|root.ln.wf02.wt01|1970-01-01T08:05:20.000+08:00| 550.0| -|1970-01-02T08:08:01.000+08:00|root.ln.wf02.wt01|1970-01-02T08:08:05.000+08:00| 1650.0| -+-----------------------------+-----------------+-----------------------------+-------------+ -``` - -#### Aggregation By Count - -`GROUP BY COUNT` aggregates data points according to the number of points. It groups a fixed number of continuous data points together for an aggregation query. -Its syntax is defined as follows: - -```sql -group by count(controlExpression, size[,ignoreNull=true/false]) -``` - -* controlExpression - -The object to count during processing. It can be any column or an expression of columns. - -* size - -The number of data points in a group. Every `size` continuous points are assigned to the same group. - -* ignoreNull=true/false - -Whether to ignore the data points whose `controlExpression` is null. When ignoreNull is true, data points with a null `controlExpression` are skipped during counting. - -##### Precautions for Use - -1. For a group in resultSet, the time column outputs the start time of the group by default. __endTime can be used in the select clause to output the end time of each group in resultSet. -2. Each device is grouped separately when used with `ALIGN BY DEVICE`. -3. Currently `GROUP BY COUNT` is not supported with `GROUP BY LEVEL`. -4.
When the final number of data points in a group is less than `size`, the result of the group will not be output. - -For the data below, some examples will be given. - -``` -+-----------------------------+-----------+-----------------------+ -| Time|root.sg.soc|root.sg.charging_status| -+-----------------------------+-----------+-----------------------+ -|1970-01-01T08:00:00.001+08:00| 14.0| 1| -|1970-01-01T08:00:00.002+08:00| 16.0| 1| -|1970-01-01T08:00:00.003+08:00| 16.0| 0| -|1970-01-01T08:00:00.004+08:00| 16.0| 0| -|1970-01-01T08:00:00.005+08:00| 18.0| 1| -|1970-01-01T08:00:00.006+08:00| 24.0| 1| -|1970-01-01T08:00:00.007+08:00| 36.0| 1| -|1970-01-01T08:00:00.008+08:00| 36.0| null| -|1970-01-01T08:00:00.009+08:00| 45.0| 1| -|1970-01-01T08:00:00.010+08:00| 60.0| 1| -+-----------------------------+-----------+-----------------------+ -``` - -The sql is shown below: - -```sql -select count(charging_status), first_value(soc) from root.sg group by count(charging_status,5) -``` - -Get the result below. The second group, from 1970-01-01T08:00:00.006+08:00 to 1970-01-01T08:00:00.010+08:00, contains only four non-null points, which is less than `size`, so it is not output. - -``` -+-----------------------------+-----------------------------+--------------------------------------+ -| Time| __endTime|first_value(root.sg.beijing.car01.soc)| -+-----------------------------+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.001+08:00|1970-01-01T08:00:00.005+08:00| 14.0| -+-----------------------------+-----------------------------+--------------------------------------+ -``` - -When `ignoreNull=false` is used, null values are taken into account.
There will be two groups with 5 points in the resultSet, as shown below: - -```sql -select count(charging_status), first_value(soc) from root.sg group by count(charging_status,5,ignoreNull=false) -``` - -Get the results: - -``` -+-----------------------------+-----------------------------+--------------------------------------+ -| Time| __endTime|first_value(root.sg.beijing.car01.soc)| -+-----------------------------+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.001+08:00|1970-01-01T08:00:00.005+08:00| 14.0| -|1970-01-01T08:00:00.006+08:00|1970-01-01T08:00:00.010+08:00| 24.0| -+-----------------------------+-----------------------------+--------------------------------------+ -``` - -### 4.2 Aggregate By Group - -#### Aggregation By Level - -The aggregation by level statement groups the queried series whose names are the same at the given level. - -- Keyword `LEVEL` is used to specify the level that needs to be grouped. By convention, `level=0` represents *root* level. -- All aggregation functions are supported. When using the five aggregations sum, avg, min_value, max_value and extreme, please make sure all the aggregated series have exactly the same data type. Otherwise, it will generate a syntax error. - -**Example 1:** There are multiple series named `status` under different databases, like "root.ln.wf01.wt01.status", "root.ln.wf02.wt02.status", and "root.sgcc.wf03.wt01.status". 
If you need to count the number of data points of the `status` sequence under different databases, use the following query: - -```sql -select count(status) from root.** group by level = 1 -``` - -Result: - -``` -+-------------------------+---------------------------+ -|count(root.ln.*.*.status)|count(root.sgcc.*.*.status)| -+-------------------------+---------------------------+ -| 20160| 10080| -+-------------------------+---------------------------+ -Total line number = 1 -It costs 0.003s -``` - -**Example 2:** If you need to count the number of data points under different devices, you can specify level = 3: - -```sql -select count(status) from root.** group by level = 3 -``` - -Result: - -``` -+---------------------------+---------------------------+ -|count(root.*.*.wt01.status)|count(root.*.*.wt02.status)| -+---------------------------+---------------------------+ -| 20160| 10080| -+---------------------------+---------------------------+ -Total line number = 1 -It costs 0.003s -``` - -**Example 3:** Note that the devices named `wt01` under databases `ln` and `sgcc` are grouped together, since they are regarded as devices with the same name. 
If you need to further count the number of data points in different devices under different databases, you can use the following query: - -```sql -select count(status) from root.** group by level = 1, 3 -``` - -Result: - -``` -+----------------------------+----------------------------+------------------------------+ -|count(root.ln.*.wt01.status)|count(root.ln.*.wt02.status)|count(root.sgcc.*.wt01.status)| -+----------------------------+----------------------------+------------------------------+ -| 10080| 10080| 10080| -+----------------------------+----------------------------+------------------------------+ -Total line number = 1 -It costs 0.003s -``` - -**Example 4:** To query the maximum value of the temperature sensor across all time series, you can use the following query statement: - -```sql -select max_value(temperature) from root.** group by level = 0 -``` - -Result: - -``` -+---------------------------------+ -|max_value(root.*.*.*.temperature)| -+---------------------------------+ -| 26.0| -+---------------------------------+ -Total line number = 1 -It costs 0.013s -``` - -**Example 5:** The above queries are for a certain sensor. In particular, **if you want to query the total data points owned by all sensors at a certain level**, you need to explicitly select `*`. - -```sql -select count(*) from root.ln.** group by level = 2 -``` - -Result: - -``` -+----------------------+----------------------+ -|count(root.*.wf01.*.*)|count(root.*.wf02.*.*)| -+----------------------+----------------------+ -| 20160| 20160| -+----------------------+----------------------+ -Total line number = 1 -It costs 0.013s -``` - -##### Aggregate By Time with Level Clause - -A level can be specified to count the number of points of each node at the given level in the current Metadata Tree. - -This can be used to query the number of points under each device. - -The SQL statement is: - -Get time aggregation by level.
- -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d), level=1; -``` - -Result: - -``` -+-----------------------------+-------------------------+ -| Time|COUNT(root.ln.*.*.status)| -+-----------------------------+-------------------------+ -|2017-11-02T00:00:00.000+08:00| 1440| -|2017-11-03T00:00:00.000+08:00| 1440| -|2017-11-04T00:00:00.000+08:00| 1440| -|2017-11-05T00:00:00.000+08:00| 1440| -|2017-11-06T00:00:00.000+08:00| 1440| -|2017-11-07T00:00:00.000+08:00| 1440| -|2017-11-07T23:00:00.000+08:00| 1380| -+-----------------------------+-------------------------+ -Total line number = 7 -It costs 0.006s -``` - -Time aggregation with sliding step and by level. - -```sql -select count(status) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d), level=1; -``` - -Result: - -``` -+-----------------------------+-------------------------+ -| Time|COUNT(root.ln.*.*.status)| -+-----------------------------+-------------------------+ -|2017-11-01T00:00:00.000+08:00| 180| -|2017-11-02T00:00:00.000+08:00| 180| -|2017-11-03T00:00:00.000+08:00| 180| -|2017-11-04T00:00:00.000+08:00| 180| -|2017-11-05T00:00:00.000+08:00| 180| -|2017-11-06T00:00:00.000+08:00| 180| -|2017-11-07T00:00:00.000+08:00| 180| -+-----------------------------+-------------------------+ -Total line number = 7 -It costs 0.004s -``` - -#### Aggregation By Tags - -IoTDB also allows you to run aggregation queries over the tags defined on timeseries through the `GROUP BY TAGS` clause. - -First, insert the following example data into IoTDB. It is used in the feature introduction below. - -These are the temperature data of workshops that belong to the factory `factory1` and are located in different cities. The time range is `[1000, 10000)`. - -The device node of the timeseries path is the ID of the device. The city and workshop information is modelled in the tags `city` and `workshop`. 
-The devices `d1` and `d2` belong to the workshop `w1` in `Beijing`. -`d3` and `d4` belong to the workshop `w2` in `Beijing`. -`d5` and `d6` belong to the workshop `w1` in `Shanghai`. -`d7` belongs to the workshop `w2` in `Shanghai`. -`d8` and `d9` are under maintenance and don't belong to any workshop, so they have no tags. - - -```SQL -CREATE DATABASE root.factory1; -create timeseries root.factory1.d1.temperature with datatype=FLOAT tags(city=Beijing, workshop=w1); -create timeseries root.factory1.d2.temperature with datatype=FLOAT tags(city=Beijing, workshop=w1); -create timeseries root.factory1.d3.temperature with datatype=FLOAT tags(city=Beijing, workshop=w2); -create timeseries root.factory1.d4.temperature with datatype=FLOAT tags(city=Beijing, workshop=w2); -create timeseries root.factory1.d5.temperature with datatype=FLOAT tags(city=Shanghai, workshop=w1); -create timeseries root.factory1.d6.temperature with datatype=FLOAT tags(city=Shanghai, workshop=w1); -create timeseries root.factory1.d7.temperature with datatype=FLOAT tags(city=Shanghai, workshop=w2); -create timeseries root.factory1.d8.temperature with datatype=FLOAT; -create timeseries root.factory1.d9.temperature with datatype=FLOAT; - -insert into root.factory1.d1(time, temperature) values(1000, 104.0); -insert into root.factory1.d1(time, temperature) values(3000, 104.2); -insert into root.factory1.d1(time, temperature) values(5000, 103.3); -insert into root.factory1.d1(time, temperature) values(7000, 104.1); - -insert into root.factory1.d2(time, temperature) values(1000, 104.4); -insert into root.factory1.d2(time, temperature) values(3000, 103.7); -insert into root.factory1.d2(time, temperature) values(5000, 103.3); -insert into root.factory1.d2(time, temperature) values(7000, 102.9); - -insert into root.factory1.d3(time, temperature) values(1000, 103.9); -insert into root.factory1.d3(time, temperature) values(3000, 103.8); -insert into root.factory1.d3(time, temperature) values(5000, 102.7); 
-insert into root.factory1.d3(time, temperature) values(7000, 106.9); - -insert into root.factory1.d4(time, temperature) values(1000, 103.9); -insert into root.factory1.d4(time, temperature) values(5000, 102.7); -insert into root.factory1.d4(time, temperature) values(7000, 106.9); - -insert into root.factory1.d5(time, temperature) values(1000, 112.9); -insert into root.factory1.d5(time, temperature) values(7000, 113.0); - -insert into root.factory1.d6(time, temperature) values(1000, 113.9); -insert into root.factory1.d6(time, temperature) values(3000, 113.3); -insert into root.factory1.d6(time, temperature) values(5000, 112.7); -insert into root.factory1.d6(time, temperature) values(7000, 112.3); - -insert into root.factory1.d7(time, temperature) values(1000, 101.2); -insert into root.factory1.d7(time, temperature) values(3000, 99.3); -insert into root.factory1.d7(time, temperature) values(5000, 100.1); -insert into root.factory1.d7(time, temperature) values(7000, 99.8); - -insert into root.factory1.d8(time, temperature) values(1000, 50.0); -insert into root.factory1.d8(time, temperature) values(3000, 52.1); -insert into root.factory1.d8(time, temperature) values(5000, 50.1); -insert into root.factory1.d8(time, temperature) values(7000, 50.5); - -insert into root.factory1.d9(time, temperature) values(1000, 50.3); -insert into root.factory1.d9(time, temperature) values(3000, 52.1); -``` - -##### Aggregation query by one single tag - -To get the average temperature of each city, use the following query: - -```SQL -SELECT AVG(temperature) FROM root.factory1.** GROUP BY TAGS(city); -``` - -The query calculates the average temperature over those timeseries that have the same tag value for the key `city`. 
-The results are - -``` -+--------+------------------+ -| city| avg(temperature)| -+--------+------------------+ -| Beijing|104.04666697184244| -|Shanghai|107.85000076293946| -| NULL| 50.84999910990397| -+--------+------------------+ -Total line number = 3 -It costs 0.231s -``` - -From the results we can see that aggregation by tags differs from aggregation by time or by level in the following ways: - -1. An aggregation query by tags no longer expands wildcards into individual raw timeseries. Instead, it aggregates the data of all timeseries that have the same tag value. -2. In addition to the aggregate result column, the result set contains the key-value column of the grouped tag. The column name is the tag key, and the values in the column are the tag values present in the searched timeseries. - If some searched timeseries don't have the grouped tag, a `NULL` value is shown in the key-value column of the grouped tag, and its row holds the aggregation of all the timeseries lacking that tag key. - -##### Aggregation query by multiple tags - -Besides aggregation by one single tag, aggregation by multiple tags in a particular order is allowed as well. - -For example, a user wants to know the average temperature of the devices in each workshop. -As workshop names may be the same in different cities, it is not correct to aggregate by the tag `workshop` directly. -So the aggregation by the tag `city` should be done first, and then by the tag `workshop`. 
- -SQL - -```SQL -SELECT avg(temperature) FROM root.factory1.** GROUP BY TAGS(city, workshop); -``` - -The results - -``` -+--------+--------+------------------+ -| city|workshop| avg(temperature)| -+--------+--------+------------------+ -| NULL| NULL| 50.84999910990397| -|Shanghai| w1|113.01666768391927| -| Beijing| w2| 104.4000004359654| -|Shanghai| w2|100.10000038146973| -| Beijing| w1|103.73750019073486| -+--------+--------+------------------+ -Total line number = 5 -It costs 0.027s -``` - -We can see that in a multiple tags aggregation query, the result set outputs the key-value columns of all the grouped tag keys, in the same order as they appear in `GROUP BY TAGS`. - -##### Downsampling Aggregation by tags based on Time Window - -Downsampling aggregation by time window is one of the most popular features in a time series database. IoTDB supports aggregation queries by tags based on time windows. - -For example, a user wants to know the average temperature of the devices in each workshop, in every 5 seconds, in the range of time `[1000, 10000)`. 
- -SQL - -```SQL -SELECT avg(temperature) FROM root.factory1.** GROUP BY ([1000, 10000), 5s), TAGS(city, workshop); -``` - -The results - -``` -+-----------------------------+--------+--------+------------------+ -| Time| city|workshop| avg(temperature)| -+-----------------------------+--------+--------+------------------+ -|1970-01-01T08:00:01.000+08:00| NULL| NULL| 50.91999893188476| -|1970-01-01T08:00:01.000+08:00|Shanghai| w1|113.20000076293945| -|1970-01-01T08:00:01.000+08:00| Beijing| w2| 103.4| -|1970-01-01T08:00:01.000+08:00|Shanghai| w2| 100.1999994913737| -|1970-01-01T08:00:01.000+08:00| Beijing| w1|103.81666692097981| -|1970-01-01T08:00:06.000+08:00| NULL| NULL| 50.5| -|1970-01-01T08:00:06.000+08:00|Shanghai| w1| 112.6500015258789| -|1970-01-01T08:00:06.000+08:00| Beijing| w2| 106.9000015258789| -|1970-01-01T08:00:06.000+08:00|Shanghai| w2| 99.80000305175781| -|1970-01-01T08:00:06.000+08:00| Beijing| w1| 103.5| -+-----------------------------+--------+--------+------------------+ -``` - -Compared to pure tag aggregation, this kind of aggregation first divides the data according to the time window specification, and then aggregates by the multiple tags within each time window. -The result set also contains a time column, which has the same meaning as the time column in the result of a downsampling aggregation query by time window. - -##### Limitation of Aggregation by Tags - -As this feature is still under development, some queries are not supported yet and will be supported in the future. - -> 1. The `HAVING` clause for filtering the results is temporarily not supported. -> 2. Ordering by tag values is temporarily not supported. -> 3. `LIMIT`, `OFFSET`, `SLIMIT`, `SOFFSET` are temporarily not supported. -> 4. `ALIGN BY DEVICE` is temporarily not supported. -> 5. Expressions as aggregation function parameters, e.g. `count(s+1)`, are temporarily not supported. -> 6. Value filters are not supported, the same as for the `GROUP BY LEVEL` query. - -## 5.
`HAVING` CLAUSE - -If you want to filter the results of aggregate queries, -you can use the `HAVING` clause after the `GROUP BY` clause. - -> NOTE: -> -> 1.The expression in HAVING clause must consist of aggregate values; the original sequence cannot appear alone. -> The following usages are incorrect: -> -> ```sql -> select count(s1) from root.** group by ([1,3),1ms) having sum(s1) > s1; -> select count(s1) from root.** group by ([1,3),1ms) having s1 > 1; -> ``` -> -> 2.When filtering the `GROUP BY LEVEL` result, the PATH in `SELECT` and `HAVING` can only have one node. -> The following usages are incorrect: -> -> ```sql -> select count(s1) from root.** group by ([1,3),1ms), level=1 having sum(d1.s1) > 1; -> select count(d1.s1) from root.** group by ([1,3),1ms), level=1 having sum(s1) > 1; -> ``` - -Here are a few examples of using the 'HAVING' clause to filter aggregate results. - -Aggregation result 1: - -``` -+-----------------------------+---------------------+---------------------+ -| Time|count(root.test.*.s1)|count(root.test.*.s2)| -+-----------------------------+---------------------+---------------------+ -|1970-01-01T08:00:00.001+08:00| 4| 4| -|1970-01-01T08:00:00.003+08:00| 1| 0| -|1970-01-01T08:00:00.005+08:00| 2| 4| -|1970-01-01T08:00:00.007+08:00| 3| 2| -|1970-01-01T08:00:00.009+08:00| 4| 4| -+-----------------------------+---------------------+---------------------+ -``` - -Aggregation result filtering query 1: - -```sql - select count(s1) from root.** group by ([1,11),2ms), level=1 having count(s2) > 1; -``` - -Filtering result 1: - -``` -+-----------------------------+---------------------+ -| Time|count(root.test.*.s1)| -+-----------------------------+---------------------+ -|1970-01-01T08:00:00.001+08:00| 4| -|1970-01-01T08:00:00.005+08:00| 2| -|1970-01-01T08:00:00.009+08:00| 4| -+-----------------------------+---------------------+ -``` - -Aggregation result 2: - -``` -+-----------------------------+-------------+---------+---------+ -| Time| 
Device|count(s1)|count(s2)| -+-----------------------------+-------------+---------+---------+ -|1970-01-01T08:00:00.001+08:00|root.test.sg1| 1| 2| -|1970-01-01T08:00:00.003+08:00|root.test.sg1| 1| 0| -|1970-01-01T08:00:00.005+08:00|root.test.sg1| 1| 2| -|1970-01-01T08:00:00.007+08:00|root.test.sg1| 2| 1| -|1970-01-01T08:00:00.009+08:00|root.test.sg1| 2| 2| -|1970-01-01T08:00:00.001+08:00|root.test.sg2| 2| 2| -|1970-01-01T08:00:00.003+08:00|root.test.sg2| 0| 0| -|1970-01-01T08:00:00.005+08:00|root.test.sg2| 1| 2| -|1970-01-01T08:00:00.007+08:00|root.test.sg2| 1| 1| -|1970-01-01T08:00:00.009+08:00|root.test.sg2| 2| 2| -+-----------------------------+-------------+---------+---------+ -``` - -Aggregation result filtering query 2: - -```sql - select count(s1), count(s2) from root.** group by ([1,11),2ms) having count(s2) > 1 align by device; -``` - -Filtering result 2: - -``` -+-----------------------------+-------------+---------+---------+ -| Time| Device|count(s1)|count(s2)| -+-----------------------------+-------------+---------+---------+ -|1970-01-01T08:00:00.001+08:00|root.test.sg1| 1| 2| -|1970-01-01T08:00:00.005+08:00|root.test.sg1| 1| 2| -|1970-01-01T08:00:00.009+08:00|root.test.sg1| 2| 2| -|1970-01-01T08:00:00.001+08:00|root.test.sg2| 2| 2| -|1970-01-01T08:00:00.005+08:00|root.test.sg2| 1| 2| -|1970-01-01T08:00:00.009+08:00|root.test.sg2| 2| 2| -+-----------------------------+-------------+---------+---------+ -``` - -## 6. `FILL` CLAUSE - -### 6.1 Introduction - -When executing some queries, there may be no data for some columns in some rows, and data in these locations will be null, but this kind of null value is not conducive to data visualization and analysis, and the null value needs to be filled. - -In IoTDB, users can use the FILL clause to specify the fill mode when data is missing. 
Null value filling replaces the null values in any query result according to a specified method, such as taking the previous non-null value or applying linear interpolation. A filled result better reflects the data distribution, which makes data analysis easier. - -### 6.2 Syntax Definition - -**The following is the syntax definition of the `FILL` clause:** - -```sql -FILL '(' PREVIOUS | LINEAR | constant ')'; -``` - -**Note:** - -- Only one fill method can be specified in the `FILL` clause, and it applies to all columns of the result set. -- Null value fill is not compatible with version 0.13 and earlier: the previous syntax (`FILL((<data_type>[<fill_method>(, <before_range>, <after_range>)?])+)`) is no longer supported. - -### 6.3 Fill Methods - -**IoTDB supports the following three fill methods:** - -- `PREVIOUS`: Fill with the previous non-null value of the column. -- `LINEAR`: Fill the column with a linear interpolation of the previous non-null value and the next non-null value of the column. -- Constant: Fill with the specified constant. - -**The following table lists the data types and their supported fill methods (`value` denotes constant fill).** - -| Data Type | Supported Fill Methods | -| :-------- | :---------------------- | -| boolean | previous, value | -| int32 | previous, linear, value | -| int64 | previous, linear, value | -| float | previous, linear, value | -| double | previous, linear, value | -| text | previous, value | - -**Note:** If a column's data type does not support the specified fill method, IoTDB neither fills the column nor throws an exception; the column is left as it is. 
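
The `PREVIOUS` and `LINEAR` methods described above can be modeled outside the database. The following is a minimal Python sketch, not IoTDB's implementation: the function names `fill_previous` and `fill_linear` are hypothetical, and weighting the interpolation by timestamp distance is an assumption made for illustration.

```python
from typing import List, Optional, Sequence


def fill_previous(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Replace each null with the closest earlier non-null value."""
    filled: List[Optional[float]] = []
    last = None  # stays None until the first non-null value is seen
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def fill_linear(times: Sequence[int],
                values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Interpolate each null between its nearest non-null neighbours.

    Assumption: the interpolation is weighted by timestamp distance.
    Nulls without a non-null value on both sides are left as null.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    # Walk over each pair of consecutive non-null positions and fill the gap.
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            ratio = (times[i] - times[left]) / (times[right] - times[left])
            filled[i] = values[left] + ratio * (values[right] - values[left])
    return filled
```

Both functions only make sense for a single column; in a real query the chosen method is applied to every column of the result set whose data type supports it.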
- -**For examples:** - -If we don't use any fill methods: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000; -``` - -the original result will be like: - -``` -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| null| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| null| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| null| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -#### `PREVIOUS` Fill - -**For null values in the query result set, fill with the previous non-null value of the column.** - -**Note:** If the first value of this column is null, we will keep first value as null and won't fill it until we meet first non-null value - -For example, with `PREVIOUS` fill, the SQL is as follows: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(previous); -``` - -result will be like: - -``` -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ 
-|2017-11-01T16:38:00.000+08:00| 21.93| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| false| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -**While using `FILL(PREVIOUS)`, you can specify a time interval. If the interval between the timestamp of the current null value and the timestamp of the previous non-null value exceeds the specified time interval, no filling will be performed.** - -> 1. In the case of FILL(LINEAR) and FILL(CONSTANT), if the second parameter is specified, an exception will be thrown -> 2. The interval parameter only supports integers - For example, the raw data looks like this: - -```sql -select s1 from root.db.d1 -``` -``` -+-----------------------------+-------------+ -| Time|root.db.d1.s1| -+-----------------------------+-------------+ -|2023-11-08T16:41:50.008+08:00| 1.0| -+-----------------------------+-------------+ -|2023-11-08T16:46:50.011+08:00| 2.0| -+-----------------------------+-------------+ -|2023-11-08T16:48:50.011+08:00| 3.0| -+-----------------------------+-------------+ -``` - -We want to group the data by 1 min time interval: - -```sql -select avg(s1) - from root.db.d1 - group by([2023-11-08T16:40:00.008+08:00, 2023-11-08T16:50:00.008+08:00), 1m) -``` -``` -+-----------------------------+------------------+ -| Time|avg(root.db.d1.s1)| -+-----------------------------+------------------+ -|2023-11-08T16:40:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:41:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:42:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:43:00.008+08:00| null| 
-+-----------------------------+------------------+ -|2023-11-08T16:44:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:45:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:46:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:47:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:48:00.008+08:00| 3.0| -+-----------------------------+------------------+ -|2023-11-08T16:49:00.008+08:00| null| -+-----------------------------+------------------+ -``` - -After grouping, we want to fill the null value: - -```sql -select avg(s1) - from root.db.d1 - group by([2023-11-08T16:40:00.008+08:00, 2023-11-08T16:50:00.008+08:00), 1m) - FILL(PREVIOUS); -``` -``` -+-----------------------------+------------------+ -| Time|avg(root.db.d1.s1)| -+-----------------------------+------------------+ -|2023-11-08T16:40:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:41:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:42:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:43:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:44:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:45:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:46:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:47:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:48:00.008+08:00| 3.0| -+-----------------------------+------------------+ -|2023-11-08T16:49:00.008+08:00| 3.0| -+-----------------------------+------------------+ -``` - -we also don't want the null value to be filled if it keeps null for 2 min. 
- -```sql -select avg(s1) -from root.db.d1 -group by([2023-11-08T16:40:00.008+08:00, 2023-11-08T16:50:00.008+08:00), 1m) - FILL(PREVIOUS, 2m); -``` -``` -+-----------------------------+------------------+ -| Time|avg(root.db.d1.s1)| -+-----------------------------+------------------+ -|2023-11-08T16:40:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:41:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:42:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:43:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:44:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:45:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:46:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:47:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:48:00.008+08:00| 3.0| -+-----------------------------+------------------+ -|2023-11-08T16:49:00.008+08:00| 3.0| -+-----------------------------+------------------+ -``` - -#### `LINEAR` Fill - -**For null values in the query result set, fill the column with a linear interpolation of the previous non-null value and the next non-null value of the column.** - -**Note:** - -- If all the values before current value are null or all the values after current value are null, we will keep current value as null and won't fill it. -- If the column's data type is boolean/text, we neither fill it nor throw exception, just keep it as it is. - -Here we give an example of filling null values using the linear method. 
The SQL statement is as follows: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(linear); -``` - -The result will be like: - -``` -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| 22.08| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| null| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| null| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -#### Constant Fill - -**For null values in the query result set, fill with the specified constant.** - -**Note:** - -- With constant fill, if a column's data type is not compatible with the type of the input constant, IoTDB neither fills that column nor throws an exception; the column is left as it is. The compatible types are: - - | Constant Value Data Type | Supported Column Data Types | - | :----------------------- | :-------------------------------------- | - | `BOOLEAN` | `BOOLEAN` `TEXT` | - | `INT64` | `INT32` `INT64` `FLOAT` `DOUBLE` `TEXT` | - | `DOUBLE` | `FLOAT` `DOUBLE` `TEXT` | - | `TEXT` | `TEXT` | - -- If the constant value is larger than Integer.MAX_VALUE and the column's data type is int32, IoTDB neither fills the column nor throws an exception; the column is left as it is. 
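
The two notes above amount to a small compatibility check. The sketch below restates them in Python for illustration only; `COMPATIBLE`, `constant_type`, and `can_fill` are hypothetical names, and mapping Python literals to the constant type names is an assumption of this sketch rather than anything IoTDB exposes.

```python
INT32_MAX = 2**31 - 1  # Integer.MAX_VALUE

# Column data types each constant type can fill, copied from the table above.
COMPATIBLE = {
    "BOOLEAN": {"BOOLEAN", "TEXT"},
    "INT64": {"INT32", "INT64", "FLOAT", "DOUBLE", "TEXT"},
    "DOUBLE": {"FLOAT", "DOUBLE", "TEXT"},
    "TEXT": {"TEXT"},
}


def constant_type(constant) -> str:
    """Map a Python literal to the constant type names used in the table."""
    if isinstance(constant, bool):  # check bool first: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(constant, int):
        return "INT64"
    if isinstance(constant, float):
        return "DOUBLE"
    return "TEXT"


def can_fill(column_type: str, constant) -> bool:
    """Return True if nulls in a column of column_type would be filled."""
    column_type = column_type.upper()
    if column_type not in COMPATIBLE[constant_type(constant)]:
        return False
    # An integer constant above Integer.MAX_VALUE is not applied to INT32 columns.
    if column_type == "INT32" and constant > INT32_MAX:
        return False
    return True
```

Under this model, `can_fill("FLOAT", 2.0)` holds while `can_fill("BOOLEAN", 2.0)` does not, which is consistent with `fill(2.0)` filling only the `temperature` column in the example that follows.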
- -For example, with `FLOAT` constant fill, the SQL is as follows: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(2.0); -``` - -result will be like: - -``` -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| 2.0| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| null| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| null| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -For example, with `BOOLEAN` constant fill, the SQL is as follows: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(true); -``` - -result will be like: - -``` -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| null| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| true| 
-+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| true| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -## 7. `LIMIT` and `SLIMIT` CLAUSES (PAGINATION) - -When a query returns a large result set, displaying it on a single page is impractical. You can use the `LIMIT/SLIMIT` clause and the `OFFSET/SOFFSET` clause to control paging. - -- The `LIMIT` and `SLIMIT` clauses are used to control the number of rows and columns of query results. -- The `OFFSET` and `SOFFSET` clauses are used to control the starting position of the result display. - -### 7.1 Row Control over Query Results - -By using LIMIT and OFFSET clauses, users can control which rows of the query result are returned. We demonstrate how to use LIMIT and OFFSET clauses through the following examples. - -* Example 1: basic LIMIT clause - -The SQL statement is: - -```sql -select status, temperature from root.ln.wf01.wt01 limit 10 -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries are "status" and "temperature". The SQL statement requires the first 10 rows of the query result. 
- -The result is shown below: - -``` -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:00:00.000+08:00| true| 25.96| -|2017-11-01T00:01:00.000+08:00| true| 24.36| -|2017-11-01T00:02:00.000+08:00| false| 20.09| -|2017-11-01T00:03:00.000+08:00| false| 20.18| -|2017-11-01T00:04:00.000+08:00| false| 21.13| -|2017-11-01T00:05:00.000+08:00| false| 22.72| -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -|2017-11-01T00:08:00.000+08:00| false| 22.58| -|2017-11-01T00:09:00.000+08:00| false| 20.98| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 10 -It costs 0.000s -``` - -* Example 2: LIMIT clause with OFFSET - -The SQL statement is: - -```sql -select status, temperature from root.ln.wf01.wt01 limit 5 offset 3 -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is "status" and "temperature". The SQL statement requires rows 3 to 7 of the query result be returned (with the first row numbered as row 0). 
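
The row numbering described above (skip `OFFSET` rows, then return at most `LIMIT` rows, counting from row 0) can be modeled with a list slice. This is an illustrative Python sketch; the helper name `page` is hypothetical and is not part of IoTDB.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def page(rows: Sequence[T], limit: int, offset: int = 0) -> List[T]:
    """Return at most `limit` rows, skipping the first `offset` rows."""
    return list(rows[offset:offset + limit])
```

With this model, `limit 5 offset 3` selects rows 3 to 7, a limit larger than the result set returns every row, and an offset at or beyond the end of the result set returns nothing.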
- -The result is shown below: - -``` -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:03:00.000+08:00| false| 20.18| -|2017-11-01T00:04:00.000+08:00| false| 21.13| -|2017-11-01T00:05:00.000+08:00| false| 22.72| -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 5 -It costs 0.342s -``` - -* Example 3: LIMIT clause combined with WHERE clause - -The SQL statement is: - -```sql -select status,temperature from root.ln.wf01.wt01 where time > 2024-07-07T00:05:00.000 and time < 2024-07-12T00:12:00.000 limit 5 offset 3 -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries are "status" and "temperature". The SQL statement requires rows 3 to 7 of the status and temperature sensor values between "2024-07-07T00:05:00.000" and "2024-07-12T00:12:00.000" be returned (with the first row numbered as row 0). 
- -The result is shown below: - -``` -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2024-07-09T17:32:11.943+08:00| true| 24.941973| -|2024-07-09T17:32:12.944+08:00| true| 20.05108| -|2024-07-09T17:32:13.945+08:00| true| 20.541632| -|2024-07-09T17:32:14.945+08:00| null| 23.09016| -|2024-07-09T17:32:14.946+08:00| true| null| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 5 -It costs 0.070s -``` - -* Example 4: LIMIT clause combined with GROUP BY clause - -The SQL statement is: - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) limit 5 offset 3 -``` - -which means: - -The SQL statement clause requires rows 3 to 7 of the query result be returned (with the first row numbered as row 0). - -The result is shown below: - -``` -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-04T00:00:00.000+08:00| 1440| 26.0| -|2017-11-05T00:00:00.000+08:00| 1440| 26.0| -|2017-11-06T00:00:00.000+08:00| 1440| 25.99| -|2017-11-07T00:00:00.000+08:00| 1380| 26.0| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 4 -It costs 0.016s -``` - -### 7.2 Column Control over Query Results - -By using SLIMIT and SOFFSET clauses, users can control the query results in a column-related manner. We will demonstrate how to use SLIMIT and SOFFSET clauses through the following examples. 
- -* Example 1: basic SLIMIT clause - -The SQL statement is: - -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1 -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is the first column under this device, i.e., the temperature. The SQL statement requires the temperature sensor values between the time point of "2017-11-01T00:05:00.000" and "2017-11-01T00:12:00.000" be selected. - -The result is shown below: - -``` -+-----------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.temperature| -+-----------------------------+-----------------------------+ -|2017-11-01T00:06:00.000+08:00| 20.71| -|2017-11-01T00:07:00.000+08:00| 21.45| -|2017-11-01T00:08:00.000+08:00| 22.58| -|2017-11-01T00:09:00.000+08:00| 20.98| -|2017-11-01T00:10:00.000+08:00| 25.52| -|2017-11-01T00:11:00.000+08:00| 22.91| -+-----------------------------+-----------------------------+ -Total line number = 6 -It costs 0.000s -``` - -* Example 2: SLIMIT clause with SOFFSET - -The SQL statement is: - -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1 soffset 1 -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is the second column under this device, i.e., the power supply status. The SQL statement requires the status sensor values between the time point of "2017-11-01T00:05:00.000" and "2017-11-01T00:12:00.000" be selected. 
- -The result is shown below: - -``` -+-----------------------------+------------------------+ -| Time|root.ln.wf01.wt01.status| -+-----------------------------+------------------------+ -|2017-11-01T00:06:00.000+08:00| false| -|2017-11-01T00:07:00.000+08:00| false| -|2017-11-01T00:08:00.000+08:00| false| -|2017-11-01T00:09:00.000+08:00| false| -|2017-11-01T00:10:00.000+08:00| true| -|2017-11-01T00:11:00.000+08:00| false| -+-----------------------------+------------------------+ -Total line number = 6 -It costs 0.003s -``` - -* Example 3: SLIMIT clause combined with GROUP BY clause - -The SQL statement is: - -```sql -select max_value(*) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) slimit 1 soffset 1 -``` - -The result is shown below: - -``` -+-----------------------------+-----------------------------------+ -| Time|max_value(root.ln.wf01.wt01.status)| -+-----------------------------+-----------------------------------+ -|2017-11-01T00:00:00.000+08:00| true| -|2017-11-02T00:00:00.000+08:00| true| -|2017-11-03T00:00:00.000+08:00| true| -|2017-11-04T00:00:00.000+08:00| true| -|2017-11-05T00:00:00.000+08:00| true| -|2017-11-06T00:00:00.000+08:00| true| -|2017-11-07T00:00:00.000+08:00| true| -+-----------------------------+-----------------------------------+ -Total line number = 7 -It costs 0.000s -``` - -### 7.3 Row and Column Control over Query Results - -In addition to row or column control over query results, IoTDB allows users to control both rows and columns of query results. Here is a complete example with both LIMIT clauses and SLIMIT clauses. - -The SQL statement is: - -```sql -select * from root.ln.wf01.wt01 limit 10 offset 100 slimit 2 soffset 0 -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is columns 0 to 1 under this device (with the first column numbered as column 0). 
The SQL statement clause requires rows 100 to 109 of the query result be returned (with the first row numbered as row 0). - -The result is shown below: - -``` -+-----------------------------+-----------------------------+------------------------+ -| Time|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status| -+-----------------------------+-----------------------------+------------------------+ -|2017-11-01T01:40:00.000+08:00| 21.19| false| -|2017-11-01T01:41:00.000+08:00| 22.79| false| -|2017-11-01T01:42:00.000+08:00| 22.98| false| -|2017-11-01T01:43:00.000+08:00| 21.52| false| -|2017-11-01T01:44:00.000+08:00| 23.45| true| -|2017-11-01T01:45:00.000+08:00| 24.06| true| -|2017-11-01T01:46:00.000+08:00| 22.6| false| -|2017-11-01T01:47:00.000+08:00| 23.78| true| -|2017-11-01T01:48:00.000+08:00| 24.72| true| -|2017-11-01T01:49:00.000+08:00| 24.68| true| -+-----------------------------+-----------------------------+------------------------+ -Total line number = 10 -It costs 0.009s -``` - -### 7.4 Error Handling - -If the parameter N/SN of LIMIT/SLIMIT exceeds the size of the result set, IoTDB returns all the results as expected. 
For example, the query result of the original SQL statement consists of six rows, and we select the first 100 rows through the LIMIT clause: - -```sql -select status,temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 limit 100 -``` - -The result is shown below: - -``` -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -|2017-11-01T00:08:00.000+08:00| false| 22.58| -|2017-11-01T00:09:00.000+08:00| false| 20.98| -|2017-11-01T00:10:00.000+08:00| true| 25.52| -|2017-11-01T00:11:00.000+08:00| false| 22.91| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 6 -It costs 0.005s -``` - -If the parameter N/SN of the LIMIT/SLIMIT clause exceeds the allowable maximum value (N/SN is of type int64), the system reports an error. For example, executing the following SQL statement: - -```sql -select status,temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 limit 9223372036854775808; -``` - -The SQL statement will not be executed and the corresponding error prompt is given as follows: - -``` -Msg: 416: Out of range. LIMIT : N should be Int64. -``` - -If the parameter N/SN of the LIMIT/SLIMIT clause is not a positive integer, the system reports an error. For example, executing the following SQL statement: - -```sql -select status,temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 limit 13.1; -``` - -The SQL statement will not be executed and the corresponding error prompt is given as follows: - -``` -Msg: 401: line 1:129 mismatched input '.' 
expecting {, ';'} -``` - -If the parameter OFFSET of LIMIT clause exceeds the size of the result set, IoTDB will return an empty result set. For example, executing the following SQL statement: - -```sql -select status,temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 limit 2 offset 6; -``` - -The result is shown below: - -``` -+----+------------------------+-----------------------------+ -|Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+----+------------------------+-----------------------------+ -+----+------------------------+-----------------------------+ -Empty set. -It costs 0.005s -``` - -If the parameter SOFFSET of SLIMIT clause is not smaller than the number of available timeseries, the system prompts errors. For example, executing the following SQL statement: - -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1 soffset 2; -``` - -The SQL statement will not be executed and the corresponding error prompt is given as follows: - -``` -Msg: 411: Meet error in query process: The value of SOFFSET (2) is equal to or exceeds the number of sequences (2) that can actually be returned. -``` - -## 8. `ORDER BY` CLAUSE - -### 8.1 Order by in ALIGN BY TIME mode - -The result set of IoTDB is in ALIGN BY TIME mode by default and `ORDER BY TIME` clause can also be used to specify the ordering of timestamp. 
The SQL statement is: - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time desc; -``` - -Results: - -``` -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -| Time|root.ln.wf02.wt02.hardware|root.ln.wf02.wt02.status|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -|2017-11-01T00:01:00.000+08:00| v2| true| 24.36| true| -|2017-11-01T00:00:00.000+08:00| v2| true| 25.96| true| -|1970-01-01T08:00:00.002+08:00| v2| false| null| null| -|1970-01-01T08:00:00.001+08:00| v1| true| null| null| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -``` - -### 8.2 Order by in ALIGN BY DEVICE mode - -When querying in ALIGN BY DEVICE mode, `ORDER BY` clause can be used to specify the ordering of result set. - -ALIGN BY DEVICE mode supports four kinds of clauses with two sort keys which are `Device` and `Time`. - -1. ``ORDER BY DEVICE``: sort by the alphabetical order of the device name. The devices with the same column names will be clustered in a group view. - -2. ``ORDER BY TIME``: sort by the timestamp, the data points from different devices will be shuffled according to the timestamp. - -3. ``ORDER BY DEVICE,TIME``: sort by the alphabetical order of the device name. The data points with the same device name will be sorted by timestamp. - -4. ``ORDER BY TIME,DEVICE``: sort by timestamp. The data points with the same time will be sorted by the alphabetical order of the device name. - -> To make the result set more legible, when `ORDER BY` clause is not used, default settings will be provided. -> The default ordering clause is `ORDER BY DEVICE,TIME` and the default ordering is `ASC`. 
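
The four orderings listed above differ only in which key is the main sort key and in the direction of each key. The following Python sketch models that behaviour on `(time, device)` pairs; `order_rows` is a hypothetical helper written for illustration and says nothing about how IoTDB sorts internally.

```python
from typing import List, Tuple

Row = Tuple[int, str]  # (time, device); other columns omitted


def order_rows(rows: List[Row], keys: List[Tuple[str, bool]]) -> List[Row]:
    """Sort rows by a list of (key, descending) pairs, main sort key first.

    key is "time" or "device". Sorting is applied from the last key to the
    first; because Python's sort is stable, earlier keys take priority.
    """
    index = {"time": 0, "device": 1}
    result = list(rows)
    for key, descending in reversed(keys):
        result.sort(key=lambda row: row[index[key]], reverse=descending)
    return result
```

For example, `order_rows(rows, [("device", True), ("time", False)])` corresponds to `ORDER BY DEVICE DESC, TIME ASC`, and passing `[("device", False), ("time", False)]` corresponds to the default ordering.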
- -When `Device` is the main sort key, the result set is sorted by device name first, then by timestamp in the group with the same device name, the SQL statement is: - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by device desc,time asc align by device; -``` - -The result shows below: - -``` -+-----------------------------+-----------------+--------+------+-----------+ -| Time| Device|hardware|status|temperature| -+-----------------------------+-----------------+--------+------+-----------+ -|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02| v1| true| null| -|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02| v2| false| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01| null| true| 25.96| -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| true| 24.36| -+-----------------------------+-----------------+--------+------+-----------+ -``` - -When `Time` is the main sort key, the result set is sorted by timestamp first, then by device name in data points with the same timestamp. 
The SQL statement is: - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time asc,device desc align by device; -``` - -The result shows below: - -``` -+-----------------------------+-----------------+--------+------+-----------+ -| Time| Device|hardware|status|temperature| -+-----------------------------+-----------------+--------+------+-----------+ -|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02| v1| true| null| -|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02| v2| false| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01| null| true| 25.96| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| true| 24.36| -+-----------------------------+-----------------+--------+------+-----------+ -``` - -When `ORDER BY` clause is not used, sort in default way, the SQL statement is: - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -``` - -The result below indicates `ORDER BY DEVICE ASC,TIME ASC` is the clause in default situation. -`ASC` can be omitted because it's the default ordering. 
- -``` -+-----------------------------+-----------------+--------+------+-----------+ -| Time| Device|hardware|status|temperature| -+-----------------------------+-----------------+--------+------+-----------+ -|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01| null| true| 25.96| -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| true| 24.36| -|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02| v1| true| null| -|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02| v2| false| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -+-----------------------------+-----------------+--------+------+-----------+ -``` - -Besides,`ALIGN BY DEVICE` and `ORDER BY` clauses can be used with aggregate query,the SQL statement is: - -```sql -select count(*) from root.ln.** group by ((2017-11-01T00:00:00.000+08:00,2017-11-01T00:03:00.000+08:00],1m) order by device asc,time asc align by device; -``` - -The result shows below: - -``` -+-----------------------------+-----------------+---------------+-------------+------------------+ -| Time| Device|count(hardware)|count(status)|count(temperature)| -+-----------------------------+-----------------+---------------+-------------+------------------+ -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| 1| 1| -|2017-11-01T00:02:00.000+08:00|root.ln.wf01.wt01| null| 0| 0| -|2017-11-01T00:03:00.000+08:00|root.ln.wf01.wt01| null| 0| 0| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| 1| 1| null| -|2017-11-01T00:02:00.000+08:00|root.ln.wf02.wt02| 0| 0| null| -|2017-11-01T00:03:00.000+08:00|root.ln.wf02.wt02| 0| 0| null| -+-----------------------------+-----------------+---------------+-------------+------------------+ -``` - -### 8.3 Order by arbitrary expressions - -In addition to the predefined keywords "Time" and "Device" in IoTDB, `ORDER BY` can also be used to sort by any expressions. 
- -When sorting, `ASC` or `DESC` can be used to specify the sorting order, and the `NULLS` syntax is supported to specify the priority of NULL values in the sorting. `NULLS FIRST` places NULL values at the top of the result, and `NULLS LAST` ensures that NULL values appear at the end of the result. If not specified in the clause, the default order is `ASC` with `NULLS LAST`. - -Here are several examples of queries for sorting arbitrary expressions using the following data: - -``` -+-----------------------------+-------------+-------+-------+--------+-------+ -| Time| Device| base| score| bonus| total| -+-----------------------------+-------------+-------+-------+--------+-------+ -|1970-01-01T08:00:00.000+08:00| root.one| 12| 50.0| 45.0| 107.0| -|1970-01-02T08:00:00.000+08:00| root.one| 10| 50.0| 45.0| 105.0| -|1970-01-03T08:00:00.000+08:00| root.one| 8| 50.0| 45.0| 103.0| -|1970-01-01T08:00:00.010+08:00| root.two| 9| 50.0| 15.0| 74.0| -|1970-01-01T08:00:00.020+08:00| root.two| 8| 10.0| 15.0| 33.0| -|1970-01-01T08:00:00.010+08:00| root.three| 9| null| 24.0| 33.0| -|1970-01-01T08:00:00.020+08:00| root.three| 8| null| 22.5| 30.5| -|1970-01-01T08:00:00.030+08:00| root.three| 7| null| 23.5| 30.5| -|1970-01-01T08:00:00.010+08:00| root.four| 9| 32.0| 45.0| 86.0| -|1970-01-01T08:00:00.020+08:00| root.four| 8| 32.0| 45.0| 85.0| -|1970-01-01T08:00:00.030+08:00| root.five| 7| 53.0| 44.0| 104.0| -|1970-01-01T08:00:00.040+08:00| root.five| 6| 54.0| 42.0| 102.0| -+-----------------------------+-------------+-------+-------+--------+-------+ -``` - -When you need to sort the results based on the `score` column, you can use the following SQL: - -```Sql -select score from root.** order by score desc align by device; -``` - -This will give you the following results: - -``` -+-----------------------------+---------+-----+ -| Time| Device|score| -+-----------------------------+---------+-----+ -|1970-01-01T08:00:00.040+08:00|root.five| 54.0|
-|1970-01-01T08:00:00.030+08:00|root.five| 53.0| -|1970-01-01T08:00:00.000+08:00| root.one| 50.0| -|1970-01-02T08:00:00.000+08:00| root.one| 50.0| -|1970-01-03T08:00:00.000+08:00| root.one| 50.0| -|1970-01-01T08:00:00.000+08:00| root.two| 50.0| -|1970-01-01T08:00:00.010+08:00| root.two| 50.0| -|1970-01-01T08:00:00.010+08:00|root.four| 32.0| -|1970-01-01T08:00:00.020+08:00|root.four| 32.0| -|1970-01-01T08:00:00.020+08:00| root.two| 10.0| -+-----------------------------+---------+-----+ -``` - -If you want to sort the results based on the total score, you can use an expression in the `ORDER BY` clause to perform the calculation: - -```Sql -select score,total from root.one order by base+score+bonus desc -``` - -This SQL is equivalent to: - -```Sql -select score,total from root.one order by total desc -``` - -Here are the results: - -``` -+-----------------------------+--------------+--------------+ -| Time|root.one.score|root.one.total| -+-----------------------------+--------------+--------------+ -|1970-01-01T08:00:00.000+08:00| 50.0| 107.0| -|1970-01-02T08:00:00.000+08:00| 50.0| 105.0| -|1970-01-03T08:00:00.000+08:00| 50.0| 103.0| -+-----------------------------+--------------+--------------+ -``` - -If you want to sort the results based on the total score and, in case of tied scores, sort by score, base, bonus, and submission time in descending order, you can specify multiple layers of sorting using multiple expressions: - -```Sql -select base, score, bonus, total from root.** order by total desc NULLS Last, - score desc NULLS Last, - bonus desc NULLS Last, - time desc align by device; -``` - -Here are the results: - -``` -+-----------------------------+----------+----+-----+-----+-----+ -| Time| Device|base|score|bonus|total| -+-----------------------------+----------+----+-----+-----+-----+ -|1970-01-01T08:00:00.000+08:00| root.one| 12| 50.0| 45.0|107.0| -|1970-01-02T08:00:00.000+08:00| root.one| 10| 50.0| 45.0|105.0| -|1970-01-01T08:00:00.030+08:00| root.five| 
7| 53.0| 44.0|104.0| -|1970-01-03T08:00:00.000+08:00| root.one| 8| 50.0| 45.0|103.0| -|1970-01-01T08:00:00.040+08:00| root.five| 6| 54.0| 42.0|102.0| -|1970-01-01T08:00:00.010+08:00| root.four| 9| 32.0| 45.0| 86.0| -|1970-01-01T08:00:00.020+08:00| root.four| 8| 32.0| 45.0| 85.0| -|1970-01-01T08:00:00.010+08:00| root.two| 9| 50.0| 15.0| 74.0| -|1970-01-01T08:00:00.000+08:00| root.two| 9| 50.0| 15.0| 74.0| -|1970-01-01T08:00:00.020+08:00| root.two| 8| 10.0| 15.0| 33.0| -|1970-01-01T08:00:00.010+08:00|root.three| 9| null| 24.0| 33.0| -|1970-01-01T08:00:00.030+08:00|root.three| 7| null| 23.5| 30.5| -|1970-01-01T08:00:00.020+08:00|root.three| 8| null| 22.5| 30.5| -+-----------------------------+----------+----+-----+-----+-----+ -``` - -In the `ORDER BY` clause, you can also use aggregate query expressions. For example: - -```Sql -select min_value(total) from root.** order by min_value(total) asc align by device; -``` - -This will give you the following results: - -``` -+----------+----------------+ -| Device|min_value(total)| -+----------+----------------+ -|root.three| 30.5| -| root.two| 33.0| -| root.four| 85.0| -| root.five| 102.0| -| root.one| 103.0| -+----------+----------------+ -``` - -When specifying multiple columns in the query, the unsorted columns will change order along with the rows and sorted columns. The order of rows when the sorting columns are the same may vary depending on the specific implementation (no fixed order). 
For example: - -```Sql -select min_value(total),max_value(base) from root.** order by max_value(total) desc align by device; -``` - -This will give you the following results: - -``` -+----------+----------------+---------------+ -| Device|min_value(total)|max_value(base)| -+----------+----------------+---------------+ -| root.one| 103.0| 12| -| root.five| 102.0| 7| -| root.four| 85.0| 9| -| root.two| 33.0| 9| -|root.three| 30.5| 9| -+----------+----------------+---------------+ -``` - -You can use both `ORDER BY DEVICE,TIME` and `ORDER BY EXPRESSION` together. For example: - -```Sql -select score from root.** order by device asc, score desc, time asc align by device; -``` - -This will give you the following results: - -``` -+-----------------------------+---------+-----+ -| Time| Device|score| -+-----------------------------+---------+-----+ -|1970-01-01T08:00:00.040+08:00|root.five| 54.0| -|1970-01-01T08:00:00.030+08:00|root.five| 53.0| -|1970-01-01T08:00:00.010+08:00|root.four| 32.0| -|1970-01-01T08:00:00.020+08:00|root.four| 32.0| -|1970-01-01T08:00:00.000+08:00| root.one| 50.0| -|1970-01-02T08:00:00.000+08:00| root.one| 50.0| -|1970-01-03T08:00:00.000+08:00| root.one| 50.0| -|1970-01-01T08:00:00.000+08:00| root.two| 50.0| -|1970-01-01T08:00:00.010+08:00| root.two| 50.0| -|1970-01-01T08:00:00.020+08:00| root.two| 10.0| -+-----------------------------+---------+-----+ -``` - -## 9. `ALIGN BY` CLAUSE - -In addition, IoTDB supports another result set format: `ALIGN BY DEVICE`. - -### 9.1 Align by Device - -`ALIGN BY DEVICE` indicates that the deviceId is treated as a column. Therefore, the result set has a limited number of columns. - -> NOTE: -> -> 1. You can view the result of `align by device` as one relational table whose primary key is `Time + Device`. -> -> 2. The result is ordered by `Device` first, and then by `Time`.
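
The default ordering described in the note can be mimicked outside IoTDB with an ordinary multi-key sort. The following Python sketch only illustrates the semantics and is not IoTDB code: the `rows` list and the `order_rows` helper are hypothetical names, and the sample timestamps and devices are borrowed from the `root.ln` results elsewhere in this document.

```python
from operator import itemgetter

# (time, device) pairs borrowed from the root.ln example; value columns omitted.
rows = [
    ("2017-11-01T00:00:00", "root.ln.wf02.wt02"),
    ("1970-01-01T08:00:00.001", "root.ln.wf02.wt02"),
    ("2017-11-01T00:01:00", "root.ln.wf01.wt01"),
    ("2017-11-01T00:00:00", "root.ln.wf01.wt01"),
]

TIME, DEVICE = 0, 1


def order_rows(rows, keys):
    """Sort by several (column_index, descending) keys, highest priority first."""
    result = list(rows)
    # Python's sort is stable, so the lowest-priority key is applied first.
    for index, descending in reversed(keys):
        result.sort(key=itemgetter(index), reverse=descending)
    return result


# Equivalent of the default: ORDER BY DEVICE ASC, TIME ASC
default_order = order_rows(rows, [(DEVICE, False), (TIME, False)])
# Equivalent of: ORDER BY DEVICE DESC, TIME ASC
device_desc = order_rows(rows, [(DEVICE, True), (TIME, False)])
```

In `default_order` the two `wt01` rows come first, each device's rows in ascending time; in `device_desc` the `wt02` rows come first while time stays ascending within each device.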
- -The SQL statement is: - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -``` - -The result shows below: - -``` -+-----------------------------+-----------------+-----------+------+--------+ -| Time| Device|temperature|status|hardware| -+-----------------------------+-----------------+-----------+------+--------+ -|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01| 25.96| true| null| -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| 24.36| true| null| -|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02| null| true| v1| -|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02| null| false| v2| -|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02| null| true| v2| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| null| true| v2| -+-----------------------------+-----------------+-----------+------+--------+ -Total line number = 6 -It costs 0.012s -``` - -### 9.2 Ordering in ALIGN BY DEVICE - -ALIGN BY DEVICE mode arranges according to the device first, and sort each device in ascending order according to the timestamp. The ordering and priority can be adjusted through `ORDER BY` clause. - -## 10. `INTO` CLAUSE (QUERY WRITE-BACK) - -The `SELECT INTO` statement copies data from query result set into target time series. - -The application scenarios are as follows: - -- **Implement IoTDB internal ETL**: ETL the original data and write a new time series. -- **Query result storage**: Persistently store the query results, which acts like a materialized view. -- **Non-aligned time series to aligned time series**: Rewrite non-aligned time series into another aligned time series. - -### 10.1 SQL Syntax - -#### Syntax Definition - -**The following is the syntax definition of the `select` statement:** - -```sql -selectIntoStatement -: SELECT - resultColumn [, resultColumn] ... - INTO intoItem [, intoItem] ... - FROM prefixPath [, prefixPath] ... 
- [WHERE whereCondition] - [GROUP BY groupByTimeClause, groupByLevelClause] - [FILL {PREVIOUS | LINEAR | constant}] - [LIMIT rowLimit OFFSET rowOffset] - [ALIGN BY DEVICE] -; - -intoItem -: [ALIGNED] intoDevicePath '(' intoMeasurementName [',' intoMeasurementName]* ')' - ; -``` - -#### `INTO` Clause - -The `INTO` clause consists of several `intoItem`. - -Each `intoItem` consists of a target device and a list of target measurements (similar to the `INTO` clause in an `INSERT` statement). - -Each target measurement and device form a target time series, and an `intoItem` contains a series of time series. For example: `root.sg_copy.d1(s1, s2)` specifies two target time series `root.sg_copy.d1.s1` and `root.sg_copy.d1.s2`. - -The target time series specified by the `INTO` clause must correspond one-to-one with the columns of the query result set. The specific rules are as follows: - -- **Align by time** (default): The number of target time series contained in all `intoItem` must be consistent with the number of columns in the query result set (except the time column) and correspond one-to-one in the order from left to right in the header. -- **Align by device** (using `ALIGN BY DEVICE`): the number of target devices specified in all `intoItem` is the same as the number of devices queried (i.e., the number of devices matched by the path pattern in the `FROM` clause), and One-to-one correspondence according to the output order of the result set device. -
The number of measurements specified for each target device should be consistent with the number of columns in the query result set (except for the time and device columns), and the measurements should correspond one-to-one with those columns from left to right in the header. - -For example: - -- **Example 1** (aligned by time) - -```sql -select s1, s2 into root.sg_copy.d1(t1), root.sg_copy.d2(t1, t2), root.sg_copy.d1(t2) from root.sg.d1, root.sg.d2; -``` -``` -+--------------+-------------------+--------+ -| source column| target timeseries| written| -+--------------+-------------------+--------+ -| root.sg.d1.s1| root.sg_copy.d1.t1| 8000| -+--------------+-------------------+--------+ -| root.sg.d2.s1| root.sg_copy.d2.t1| 10000| -+--------------+-------------------+--------+ -| root.sg.d1.s2| root.sg_copy.d2.t2| 12000| -+--------------+-------------------+--------+ -| root.sg.d2.s2| root.sg_copy.d1.t2| 10000| -+--------------+-------------------+--------+ -Total line number = 4 -It costs 0.725s -``` - -This statement writes the query results of the four time series under the `root.sg` database to the four specified time series under the `root.sg_copy` database. Note that `root.sg_copy.d2(t1, t2)` can also be written as `root.sg_copy.d2(t1), root.sg_copy.d2(t2)`. - -We can see that the `INTO` clause can be written very flexibly, as long as the combined target time series are not repeated and correspond one-to-one to the query result columns. - -> In the result set displayed by `CLI`, the meaning of each column is as follows: -> -> - The `source column` column represents the column name of the query result. -> - `target timeseries` represents the target time series that the corresponding column is written to. -> - `written` indicates the amount of data expected to be written.
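
The one-to-one rule for the default (align by time) mode can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not IoTDB's implementation: `expand_into_items` and `map_align_by_time` are hypothetical helpers, and the header order of the result columns is taken as given from Example 1.

```python
def expand_into_items(into_items):
    """Flatten [(device, [measurement, ...]), ...] into full target series paths."""
    return [f"{device}.{m}" for device, measurements in into_items for m in measurements]


def map_align_by_time(result_columns, into_items):
    """Pair each result column (time column excluded) with its target series."""
    targets = expand_into_items(into_items)
    if len(targets) != len(result_columns):
        raise ValueError("number of target series must equal number of result columns")
    if len(set(targets)) != len(targets):
        raise ValueError("target series must not repeat")
    # Correspondence is positional: left to right in the header.
    return list(zip(result_columns, targets))


# Example 1: header order as shown in the result table above.
columns = ["root.sg.d1.s1", "root.sg.d2.s1", "root.sg.d1.s2", "root.sg.d2.s2"]
into_items = [
    ("root.sg_copy.d1", ["t1"]),
    ("root.sg_copy.d2", ["t1", "t2"]),
    ("root.sg_copy.d1", ["t2"]),
]
mapping = map_align_by_time(columns, into_items)
```

Here `mapping` pairs `root.sg.d1.s2` with `root.sg_copy.d2.t2`, the same correspondence that Example 1 reports.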
- - -- **Example 2** (aligned by time) - -```sql -select count(s1 + s2), last_value(s2) into root.agg.count(s1_add_s2), root.agg.last_value(s2) from root.sg.d1 group by ([0, 100), 10ms); -``` -``` -+--------------------------------------+-------------------------+--------+ -| source column| target timeseries| written| -+--------------------------------------+-------------------------+--------+ -| count(root.sg.d1.s1 + root.sg.d1.s2)| root.agg.count.s1_add_s2| 10| -+--------------------------------------+-------------------------+--------+ -| last_value(root.sg.d1.s2)| root.agg.last_value.s2| 10| -+--------------------------------------+-------------------------+--------+ -Total line number = 2 -It costs 0.375s -``` - -This statement stores the results of an aggregated query into the specified time series. - -- **Example 3** (aligned by device) - -```sql -select s1, s2 into root.sg_copy.d1(t1, t2), root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -``` -``` -+--------------+--------------+-------------------+--------+ -| source device| source column| target timeseries| written| -+--------------+--------------+-------------------+--------+ -| root.sg.d1| s1| root.sg_copy.d1.t1| 8000| -+--------------+--------------+-------------------+--------+ -| root.sg.d1| s2| root.sg_copy.d1.t2| 11000| -+--------------+--------------+-------------------+--------+ -| root.sg.d2| s1| root.sg_copy.d2.t1| 12000| -+--------------+--------------+-------------------+--------+ -| root.sg.d2| s2| root.sg_copy.d2.t2| 9000| -+--------------+--------------+-------------------+--------+ -Total line number = 4 -It costs 0.625s -``` - -This statement also writes the query results of the four time series under the `root.sg` database to the four specified time series under the `root.sg_copy` database. However, in ALIGN BY DEVICE, the number of `intoItem` must be the same as the number of queried devices, and each queried device corresponds to one `intoItem`. 
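
The device-level rule for `ALIGN BY DEVICE` can be sketched the same way. This is a hedged illustration only: `map_align_by_device` is a hypothetical helper, and it assumes the devices are already listed in the output order of the result set.

```python
def map_align_by_device(devices, result_columns, into_items):
    """Pair each (source device, column) with a target series, device by device."""
    if len(into_items) != len(devices):
        raise ValueError("one intoItem is required per queried device")
    mapping = []
    for source_device, (target_device, measurements) in zip(devices, into_items):
        if len(measurements) != len(result_columns):
            raise ValueError("each intoItem needs one measurement per result column")
        for column, measurement in zip(result_columns, measurements):
            mapping.append((source_device, column, f"{target_device}.{measurement}"))
    return mapping


# Example 3: two queried devices, two result columns (time and device excluded).
mapping = map_align_by_device(
    devices=["root.sg.d1", "root.sg.d2"],
    result_columns=["s1", "s2"],
    into_items=[
        ("root.sg_copy.d1", ["t1", "t2"]),
        ("root.sg_copy.d2", ["t1", "t2"]),
    ],
)
```

The resulting triples match the `source device`, `source column`, and `target timeseries` columns that Example 3 displays.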
- -> When aligning the query by device, the result set displayed by `CLI` has one more column, the `source device` column indicating the queried device. - -- **Example 4** (aligned by device) - -```sql -select s1 + s2 into root.expr.add(d1s1_d1s2), root.expr.add(d2s1_d2s2) from root.sg.d1, root.sg.d2 align by device; -``` -``` -+--------------+--------------+------------------------+--------+ -| source device| source column| target timeseries| written| -+--------------+--------------+------------------------+--------+ -| root.sg.d1| s1 + s2| root.expr.add.d1s1_d1s2| 10000| -+--------------+--------------+------------------------+--------+ -| root.sg.d2| s1 + s2| root.expr.add.d2s1_d2s2| 10000| -+--------------+--------------+------------------------+--------+ -Total line number = 2 -It costs 0.532s -``` - -This statement stores the result of evaluating an expression into the specified time series. - -#### Using variable placeholders - -We can use variable placeholders to describe the correspondence between the target and query time series, which simplifies the statement. The following two variable placeholders are currently supported: - -- Suffix duplication character `::`: Copies the suffix (or measurement) of the query device. From this level to the last level (or measurement) of the device, the node names (or measurement) of the target device are the same as those of the queried device. -- Single-level node matcher `${i}`: Indicates that the current level node name of the target sequence is the same as the i-th level node name of the query sequence. For example, for the path `root.sg1.d1.s1`, `${1}` means `sg1`, `${2}` means `d1`, and `${3}` means `s1`. - -When using variable placeholders, there must be no ambiguity in the correspondence between `intoItem` and the columns of the query result set.
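
The two placeholders can be read as simple string substitutions over the nodes of the source path. The Python sketch below is one interpretation of the description above, not IoTDB's resolver: `resolve_device` and `resolve_measurement` are hypothetical helpers, and level 0 is assumed to be `root`, which agrees with the `root.sg1.d1.s1` example.

```python
import re

_PLACEHOLDER = re.compile(r"\$\{(\d+)\}")


def _substitute(node, source_nodes):
    """Replace every ${i} in a node with the i-th level node name of the source."""
    return _PLACEHOLDER.sub(lambda m: source_nodes[int(m.group(1))], node)


def resolve_device(target_pattern, source_device):
    """Resolve `${i}` and `::` in a target device pattern."""
    source_nodes = source_device.split(".")  # level 0 is "root"
    resolved = []
    for level, node in enumerate(target_pattern.split(".")):
        if node == "::":
            # Copy the source device suffix from this level to the last level.
            resolved.extend(source_nodes[level:])
            break
        resolved.append(_substitute(node, source_nodes))
    return ".".join(resolved)


def resolve_measurement(target_pattern, source_series):
    """Resolve `${i}` or `::` in a target measurement pattern."""
    source_nodes = source_series.split(".")
    if target_pattern == "::":
        return source_nodes[-1]  # keep the source measurement name
    return _substitute(target_pattern, source_nodes)


backup_device = resolve_device("root.sg_bk.::", "root.sg.d1")
same_device = resolve_device("::", "root.sg.d1")
third_level = resolve_measurement("${3}", "root.sg.d1.s2")
```

Under this reading, `backup_device` is `root.sg_bk.d1`, `same_device` is `root.sg.d1`, and `third_level` is `s2`.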
The specific cases are classified as follows: - -##### ALIGN BY TIME (default) - -> Note: The variable placeholder **can only describe the correspondence between time series**. If the query includes aggregation and expression calculation, the columns in the query result cannot correspond to a time series, so neither the target device nor the measurement can use variable placeholders. - -###### (1) The target device does not use variable placeholders & the target measurement list uses variable placeholders - -**Limitations:** - -1. In each `intoItem`, the length of the list of physical quantities must be 1.
(If the length were allowed to exceed 1, e.g. `root.sg1.d1(::, s1)`, it would be impossible to determine which columns match `::`.) -2. The number of `intoItem` is 1, or the same as the number of columns in the query result set.
(When the length of each target measurement list is 1, if there is only one `intoItem`, it means that all the query sequences are written to the same device; if the number of `intoItem` is consistent with the query sequence, it is expressed as each query time series specifies a target device; if `intoItem` is greater than one and less than the number of query sequences, it cannot be a one-to-one correspondence with the query sequence) - -**Matching method:** Each query time series specifies the target device, and the target measurement is generated from the variable placeholder. - -**Example:** - -```sql -select s1, s2 -into root.sg_copy.d1(::), root.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::) -from root.sg.d1, root.sg.d2; -```` - -This statement is equivalent to: - -```sql -select s1, s2 -into root.sg_copy.d1(s1), root.sg_copy.d2(s1), root.sg_copy.d1(s2), root.sg_copy.d2(s2) -from root.sg.d1, root.sg.d2; -```` - -As you can see, the statement is not very simplified in this case. - -###### (2) The target device uses variable placeholders & the target measurement list does not use variable placeholders - -**Limitations:** The number of target measurements in all `intoItem` is the same as the number of columns in the query result set. - -**Matching method:** The target measurement is specified for each query time series, and the target device is generated according to the target device placeholder of the `intoItem` where the corresponding target measurement is located. - -**Example:** - -```sql -select d1.s1, d1.s2, d2.s3, d3.s4 -into ::(s1_1, s2_2), root.sg.d2_2(s3_3), root.${2}_copy.::(s4) -from root.sg; -```` - -###### (3) The target device uses variable placeholders & the target measurement list uses variable placeholders - -**Limitations:** There is only one `intoItem`, and the length of the list of measurement list is 1. - -**Matching method:** Each query time series can get a target time series according to the variable placeholder. 
- -**Example:** - -```sql -select * into root.sg_bk.::(::) from root.sg.**; -```` - -Write the query results of all time series under `root.sg` to `root.sg_bk`, the device name suffix and measurement remain unchanged. - -##### ALIGN BY DEVICE - -> Note: The variable placeholder **can only describe the correspondence between time series**. If the query includes aggregation and expression calculation, the columns in the query result cannot correspond to a specific physical quantity, so the target measurement cannot use variable placeholders. - -###### (1) The target device does not use variable placeholders & the target measurement list uses variable placeholders - -**Limitations:** In each `intoItem`, if the list of measurement uses variable placeholders, the length of the list must be 1. - -**Matching method:** Each query time series specifies the target device, and the target measurement is generated from the variable placeholder. - -**Example:** - -```sql -select s1, s2, s3, s4 -into root.backup_sg.d1(s1, s2, s3, s4), root.backup_sg.d2(::), root.sg.d3(backup_${4}) -from root.sg.d1, root.sg.d2, root.sg.d3 -align by device; -```` - -###### (2) The target device uses variable placeholders & the target measurement list does not use variable placeholders - -**Limitations:** There is only one `intoItem`. (If there are multiple `intoItem` with placeholders, we will not know which source devices each `intoItem` needs to match) - -**Matching method:** Each query device obtains a target device according to the variable placeholder, and the target measurement written in each column of the result set under each device is specified by the target measurement list. 
- -**Example:** - -```sql -select avg(s1), sum(s2) + sum(s3), count(s4) -into root.agg_${2}.::(avg_s1, sum_s2_add_s3, count_s4) -from root.** -align by device; -```` - -###### (3) The target device uses variable placeholders & the target measurement list uses variable placeholders - -**Limitations:** There is only one `intoItem` and the length of the target measurement list is 1. - -**Matching method:** Each query time series can get a target time series according to the variable placeholder. - -**Example:** - -```sql -select * into ::(backup_${4}) from root.sg.** align by device; -```` - -Write the query result of each time series in `root.sg` to the same device, and add `backup_` before the measurement. - -#### Specify the target time series as the aligned time series - -We can use the `ALIGNED` keyword to specify the target device for writing to be aligned, and each `intoItem` can be set independently. - -**Example:** - -```sql -select s1, s2 into root.sg_copy.d1(t1, t2), aligned root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -``` - -This statement specifies that `root.sg_copy.d1` is an unaligned device and `root.sg_copy.d2` is an aligned device. - -#### Unsupported query clauses - -- `SLIMIT`, `SOFFSET`: The query columns are uncertain, so they are not supported. -- `LAST`, `GROUP BY TAGS`, `DISABLE ALIGN`: The table structure is inconsistent with the writing structure, so it is not supported. - -#### Other points to note - -- For general aggregation queries, the timestamp is meaningless, and the convention is to use 0 to store. -- When the target time-series exists, the data type of the source column and the target time-series must be compatible. About data type compatibility, see the document [Data Type](../Background-knowledge/Data-Type.md). -- When the target time series does not exist, the system automatically creates it (including the database). 
-- When the queried time series does not exist, or the queried sequence does not have data, the target time series will not be created automatically. - -### 10.2 Application examples - -#### Implement IoTDB internal ETL - -Run ETL on the original data and write the result to new time series. - -```sql -SELECT preprocess_udf(s1), preprocess_udf(s2) INTO ::(preprocessed_s1, preprocessed_s2) FROM root.sg.* ALIGN BY DEVICE; -``` -``` -+--------------+-------------------+---------------------------+--------+ -| source device| source column| target timeseries| written| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d1| preprocess_udf(s1)| root.sg.d1.preprocessed_s1| 8000| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d1| preprocess_udf(s2)| root.sg.d1.preprocessed_s2| 10000| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d2| preprocess_udf(s1)| root.sg.d2.preprocessed_s1| 11000| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d2| preprocess_udf(s2)| root.sg.d2.preprocessed_s2| 9000| -+--------------+-------------------+---------------------------+--------+ -``` - -#### Query result storage - -Persistently store the query results, which acts like a materialized view.
- -```sql -SELECT count(s1), last_value(s1) INTO root.sg.agg_${2}(count_s1, last_value_s1) FROM root.sg1.d1 GROUP BY ([0, 10000), 10ms); -``` -``` -+--------------------------+-----------------------------+--------+ -| source column| target timeseries| written| -+--------------------------+-----------------------------+--------+ -| count(root.sg.d1.s1)| root.sg.agg_d1.count_s1| 1000| -+--------------------------+-----------------------------+--------+ -| last_value(root.sg.d1.s2)| root.sg.agg_d1.last_value_s2| 1000| -+--------------------------+-----------------------------+--------+ -Total line number = 2 -It costs 0.115s -``` - -#### Non-aligned time series to aligned time series - -Rewrite non-aligned time series into another aligned time series. - -**Note:** It is recommended to use the `LIMIT & OFFSET` clause or the `WHERE` clause (time filter) to batch data to prevent excessive data volume in a single operation. - -```sql -SELECT s1, s2 INTO ALIGNED root.sg1.aligned_d(s1, s2) FROM root.sg1.non_aligned_d WHERE time >= 0 and time < 10000; -``` -``` -+--------------------------+----------------------+--------+ -| source column| target timeseries| written| -+--------------------------+----------------------+--------+ -| root.sg1.non_aligned_d.s1| root.sg1.aligned_d.s1| 10000| -+--------------------------+----------------------+--------+ -| root.sg1.non_aligned_d.s2| root.sg1.aligned_d.s2| 10000| -+--------------------------+----------------------+--------+ -Total line number = 2 -It costs 0.375s -``` - -### 10.3 User Permission Management - -The user must have the following permissions to execute a query write-back statement: - -* All `WRITE_SCHEMA` permissions for the source series in the `select` clause. -* All `WRITE_DATA` permissions for the target series in the `into` clause. - -For more user permissions related content, please refer to [Account Management Statements](../User-Manual/Authority-Management_timecho.md). 
- -### 10.4 Configurable Properties - -* `select_into_insert_tablet_plan_row_limit`: The maximum number of rows can be processed in one insert-tablet-plan when executing select-into statements. 10000 by default. diff --git a/src/UserGuide/Master/Tree/Basic-Concept/Write-Data_timecho.md b/src/UserGuide/Master/Tree/Basic-Concept/Write-Data_timecho.md deleted file mode 100644 index 33ba179fa..000000000 --- a/src/UserGuide/Master/Tree/Basic-Concept/Write-Data_timecho.md +++ /dev/null @@ -1,202 +0,0 @@ - - - -# Write Data -## 1. CLI INSERT - -IoTDB provides users with a variety of ways to insert real-time data, such as directly inputting [INSERT SQL statement](../SQL-Manual/SQL-Manual_timecho#insert-data) in [Client/Shell tools](../Tools-System/CLI.md), or using [Java JDBC](../API/Programming-JDBC_timecho) to perform single or batch execution of [INSERT SQL statement](../SQL-Manual/SQL-Manual_timecho). - -NOTE: This section mainly introduces the use of [INSERT SQL statement](../SQL-Manual/SQL-Manual_timecho#insert-data) for real-time data import in the scenario. - -When writing data with duplicate timestamps, the existing data with the same timestamp will be overwritten directly, which is equivalent to data update; however, if the written value is NULL, the operation will not take effect and the original field value will not be overwritten. - -### 1.1 Use of INSERT Statements - -The [INSERT SQL statement](../SQL-Manual/SQL-Manual_timecho#insert-data) statement is used to insert data into one or more specified timeseries created. For each point of data inserted, it consists of a [timestamp](../Basic-Concept/Operate-Metadata.md) and a sensor acquisition value (see [Data Type](../Background-knowledge/Data-Type_timecho.md)). - -In the scenario of this section, take two timeseries `root.ln.wf02.wt02.status` and `root.ln.wf02.wt02.hardware` as an example, and their data types are BOOLEAN and TEXT, respectively. 
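
The overwrite rule stated at the start of this section (a duplicate timestamp updates the value, a NULL write is ignored) can be modeled with a small Python sketch. This is a toy model for illustration only: `write_point` is a hypothetical helper, and the dictionary merely stands in for one time series.

```python
def write_point(series, timestamp, value):
    """Apply one write to a {timestamp: value} store for a single time series."""
    if value is None:
        # A NULL write does not take effect; any existing value is kept.
        return
    # A write at an existing timestamp overwrites the old value (an update).
    series[timestamp] = value


status = {}  # stands in for root.ln.wf02.wt02.status
write_point(status, 1, True)
write_point(status, 1, False)  # same timestamp: the old value is overwritten
write_point(status, 1, None)   # NULL: ignored, False is kept
```

After these three writes, `status` holds the single point `{1: False}`.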
- -The sample code for single column data insertion is as follows: - -``` -IoTDB > insert into root.ln.wf02.wt02(timestamp,status) values(1,true) -IoTDB > insert into root.ln.wf02.wt02(timestamp,hardware) values(1, "v1") -``` - -The above example code inserts the long integer timestamp 1 and the value "true" into the timeseries `root.ln.wf02.wt02.status` and inserts the long integer timestamp 1 and the value "v1" into the timeseries `root.ln.wf02.wt02.hardware`. When the execution is successful, the elapsed time is shown to indicate that the data insertion has been completed. - -> Note: In IoTDB, TEXT type data can be represented by single or double quotation marks. The insertion statement above uses double quotation marks for TEXT type data. The following example uses single quotation marks for TEXT type data. - -The INSERT statement also supports inserting multi-column data at the same time point. The sample code for inserting the values of the two timeseries at the same time point '2' is as follows: - -```sql -IoTDB > insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (2, false, 'v2') -``` - -In addition, the INSERT statement supports inserting multiple rows at once. The sample code for inserting two rows is as follows: - -```sql -IoTDB > insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4') -``` - -When writing data to the tree model, both `timestamp` and `time` can be used as time column identifiers in INSERT statements, and there is no need to deliberately distinguish between them when writing statements. However, in query results, the time column is uniformly displayed as `Time` (a fixed name) to ensure a consistent result format. - -After inserting the data, we can simply query the inserted data using the SELECT statement: - -```sql -IoTDB > select * from root.ln.wf02.wt02 where time < 5 -``` - -The result is shown below.
The query result shows that the insertion statements of single column and multi column data are performed correctly. - -``` -+-----------------------------+--------------------------+------------------------+ -| Time|root.ln.wf02.wt02.hardware|root.ln.wf02.wt02.status| -+-----------------------------+--------------------------+------------------------+ -|1970-01-01T08:00:00.001+08:00| v1| true| -|1970-01-01T08:00:00.002+08:00| v2| false| -|1970-01-01T08:00:00.003+08:00| v3| false| -|1970-01-01T08:00:00.004+08:00| v4| true| -+-----------------------------+--------------------------+------------------------+ -Total line number = 4 -It costs 0.004s -``` - -In addition, we can omit the timestamp column, and the system will use the current system timestamp as the timestamp of the data point. The sample code is as follows: - -```sql -IoTDB > insert into root.ln.wf02.wt02(status, hardware) values (false, 'v2') -``` - -**Note:** Timestamps must be specified when inserting multiple rows of data in a SQL. - -### 1.2 Insert Data Into Aligned Timeseries - -To insert data into a group of aligned time series, we only need to add the `ALIGNED` keyword in SQL, and others are similar. - -The sample code is as follows: - -```sql -IoTDB > create aligned timeseries root.sg1.d1(s1 INT32, s2 DOUBLE) -IoTDB > insert into root.sg1.d1(time, s1, s2) aligned values(1, 1, 1) -IoTDB > insert into root.sg1.d1(time, s1, s2) aligned values(2, 2, 2), (3, 3, 3) -IoTDB > select * from root.sg1.d1 -``` - -The result is shown below. The query result shows that the insertion statements are performed correctly. 
- -``` -+-----------------------------+--------------+--------------+ -| Time|root.sg1.d1.s1|root.sg1.d1.s2| -+-----------------------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 1| 1.0| -|1970-01-01T08:00:00.002+08:00| 2| 2.0| -|1970-01-01T08:00:00.003+08:00| 3| 3.0| -+-----------------------------+--------------+--------------+ -Total line number = 3 -It costs 0.004s -``` - -## 2. NATIVE API WRITE - -The Native API ( Session ) is the most widely used series of APIs of IoTDB, including multiple APIs, adapted to different data collection scenarios, with high performance and multi-language support. - -### 2.1 Multi-language API write - -#### Java - -Before writing via the Java API, you need to establish a connection, refer to [Java Native API](../API/Programming-Java-Native-API_timecho). -then refer to [ JAVA Data Manipulation Interface (DML) ](../API/Programming-Java-Native-API_timecho#insert) - -#### Python - -Refer to [ Python Data Manipulation Interface (DML) ](../API/Programming-Python-Native-API_timecho#insert) - -#### C++ - -Refer to [ C++ Data Manipulation Interface (DML) ](../API/Programming-Cpp-Native-API.md#insert) - -#### Go - -Refer to [Go Native API](../API/Programming-Go-Native-API.md) - -## 3. REST API WRITE - -Refer to [insertTablet (v1)](../API/RestServiceV1_timecho#inserttablet) or [insertTablet (v2)](../API/RestServiceV2_timecho#inserttablet) - -Example: - -```JSON -{ -      "timestamps": [ -            1, -            2, -            3 -      ], -      "measurements": [ -            "temperature", -            "status" -      ], -      "data_types": [ -            "FLOAT", -            "BOOLEAN" -      ], -      "values": [ -            [ -                  1.1, -                  2.2, -                  3.3 -            ], -            [ -                  false, -                  true, -                  true -            ] -      ], -      "is_aligned": false, -      "device": "root.ln.wf01.wt01" -} -``` - -## 4. 
MQTT WRITE - -Refer to [Built-in MQTT Service](../API/Programming-MQTT_timecho.md#_2-built-in-mqtt-service) - -## 5. BATCH DATA LOAD - -IoTDB provides several methods for importing data in batches, depending on the scenario. This section describes the two most common ones: importing data in CSV format and in TsFile format. - -### 5.1 TsFile Batch Load - -TsFile is the time series file format used in IoTDB. You can import one or more TsFile files containing time series directly into another running IoTDB instance through tools such as the CLI. For details, see [Data Import](../Tools-System/Data-Import-Tool_timecho). - -### 5.2 CSV Batch Load - -CSV stores tabular data in plain text. You can write multiple formatted records into a CSV file and import them into IoTDB in batches. Before importing, you are advised to create the corresponding metadata in IoTDB. If the metadata has not been created, IoTDB can infer the data type of each CSV column automatically, as long as each column contains a single data type. In addition to a single file, the tool supports importing a folder of CSV files and setting optimization parameters such as time precision. For details, see [Data Import](../Tools-System/Data-Import-Tool_timecho). - -## 6. SCHEMALESS WRITING -In IoT scenarios, the types and number of devices may increase or decrease over time, and different devices may generate data with different fields (e.g., temperature, humidity, status codes). In addition, businesses often need to deploy and integrate new devices quickly, without cumbersome predefinition steps. Therefore, unlike traditional time-series databases that typically require a predefined data model, IoTDB supports schemaless writing: the database automatically identifies and registers the necessary metadata while data is being written, enabling automatic modeling.
- -Users can write data in real time with CLI `INSERT` statements or the native APIs, in batches or row by row, for a single device or for multiple devices. Alternatively, they can import historical data in formats such as CSV or TsFile using the import tools; during import, metadata such as time series, data types, and compression and encoding methods is created automatically. - - - diff --git a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md b/src/UserGuide/Master/Tree/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md deleted file mode 100644 index b6df4bb02..000000000 --- a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md +++ /dev/null @@ -1,267 +0,0 @@ - -# AINode Deployment - -## 1. AINode Introduction - -### 1.1 Capability Introduction - -AINode is the third type of native node provided by TimechoDB, after ConfigNode and DataNode. By interacting with the DataNodes and ConfigNodes of a TimechoDB cluster, this node extends the cluster with machine learning analysis on time series. AINode integrates model management, training, and inference within the database engine. It supports performing time series analysis tasks on specified time series data using registered models through simple SQL statements, and also supports registering and using custom machine learning models. AINode currently integrates machine learning algorithms and self-developed models for common time series analysis scenarios (e.g., forecasting). - -### 1.2 Deployment Modes - -AINode is an additional component outside the TimechoDB cluster and is deployed using a separate installation package. - -
- -## 2. Installation Preparation - -### 2.1 Installation Package Acquisition - -The key directory structure after extracting the AINode installation package (`timechodb--ainode-bin.zip`) is as follows: - -| Directory | Type | Description | -| :--- | :--- | :--- | -| lib | Folder | Executable programs and dependencies for AINode | -| sbin | Folder | Operation scripts for AINode, used to start or stop AINode | -| conf | Folder | Configuration files and version declaration file for AINode | - -### 2.2 Pre-installation Verification - -To ensure the AINode installation package you obtained is complete and correct, it is recommended to perform an SHA512 verification before installation and deployment. - -**Preparation:** - -- Obtain the official SHA512 checksum: Please contact Timecho staff. - -**Verification Steps (using Linux as an example):** - -1. Open a terminal, navigate to the directory containing the installation package (e.g., `/data/ainode`): - -```bash -cd /data/ainode -``` - -2. Execute the following command to calculate the hash value: - -```bash -sha512sum timechodb-{version}-ainode-bin.zip -``` - -3. The terminal will output the result (left side is the SHA512 checksum, right side is the filename): - -```SQL -(base) root@hadoop@1:/data/ainode (0.664s) -sha512sum timechodb-2.0.6.1-ainode-bin.zip -4d5a6a64935b4f0459bc9ed214c4563aa7a6a5941024336e9416212424707f27bdfdfc70f4c528b51b812687d660014adc1b8add699498ea67ff17c7e619a6f0 timechodb-2.0.6.1-ainode-bin.zip -``` - -4. Compare the output with the official SHA512 checksum. If they match, you can proceed with the AINode installation and deployment steps below. - -**Notes:** - -- If the verification results do not match, please contact Timecho staff to obtain a new installation package. -- If you encounter a "file not found" prompt during verification, check if the file path is correct or if the installation package was downloaded completely. 
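The comparison in step 4 can also be scripted instead of done by eye. The following is a minimal sketch, assuming GNU coreutils `sha512sum` is available on the machine. The function name `verify_sha512` and its two arguments are illustrative only and are not part of the AINode package.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not shipped with AINode: compares the SHA512 digest
# of a file with an expected checksum.
# Returns 0 on match, 1 on mismatch, 2 if the file does not exist.
verify_sha512() {
  local file="$1" expected="$2" actual
  if [ ! -f "$file" ]; then
    echo "file not found: $file" >&2
    return 2
  fi
  # sha512sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(sha512sum "$file" | awk '{print $1}')"
  # Normalize case so an upper-case published checksum still matches.
  expected="$(printf '%s' "$expected" | tr '[:upper:]' '[:lower:]')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: checksum matches"
    return 0
  fi
  echo "MISMATCH: expected $expected, got $actual" >&2
  return 1
}

# Example call; both arguments are placeholders to fill in:
# verify_sha512 "timechodb-{version}-ainode-bin.zip" "<official SHA512 checksum>"
```

A non-zero return value maps onto the notes above: 1 means the package should be obtained again, and 2 means the path or the download should be checked.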
- -### 2.3 Environment Requirements - -- Recommended operating environment: Linux, macOS. -- TimechoDB Version: >= V2.0.8-beta. - -## 3. Installation, Deployment, and Usage - -### 3.1 Installing AINode - -Download the AINode installation package, import it into a dedicated folder, switch to that folder, and extract the package. - -```bash -unzip timechodb--ainode-bin.zip -``` - -### 3.2 Modifying Configuration Items - -AINode supports modifying some necessary parameters. You can find the following parameters in the `/TIMECHODB_AINODE_HOME/conf/iotdb-ainode.properties` file and make persistent modifications: - -| Name | Description | Type | Default Value | -| :--- | :--- | :--- | :--- | -| `cluster_name` | The cluster identifier the AINode is to join | String | `defaultCluster` | -| `ain_seed_config_node` | The ConfigNode address for AINode registration upon startup | String | `127.0.0.1:10710` | -| `ain_cluster_ingress_address` | The rpc address of the DataNode from which AINode pulls data | String | `127.0.0.1` | -| `ain_cluster_ingress_port` | The rpc port of the DataNode from which AINode pulls data | Integer | `6667` | -| `ain_cluster_ingress_username` | The client username for the DataNode from which AINode pulls data | String | `root` | -| `ain_cluster_ingress_password` | The client password for the DataNode from which AINode pulls data | String | `root` | -| `ain_rpc_address` | The address for AINode service provision and communication (internal service communication interface) | String | `127.0.0.1` | -| `ain_rpc_port` | The port for AINode service provision and communication | String | `10810` | -| `ain_system_dir` | AINode metadata storage path. The starting directory for relative paths is OS-dependent; using an absolute path is recommended. | String | `data/AINode/system` | -| `ain_models_dir` | AINode model file storage path. The starting directory for relative paths is OS-dependent; using an absolute path is recommended. 
| String | `data/AINode/models` | -| `ain_thrift_compression_enabled` | Whether to enable Thrift compression mechanism for AINode. 0-disable, 1-enable. | Boolean | `0` | - -### 3.3 Importing Built-in Weight Files - -*If the deployment environment has network connectivity and can access HuggingFace, the system will automatically pull the built-in model weight files. This step can be skipped.* -*For offline environments, contact Timecho staff to obtain the model weight folder and place it under the `/TIMECHODB_AINODE_HOME/data/ainode/models/builtin` directory.* -**NOTE:** Pay attention to the directory hierarchy. The parent directory for all built-in model weights should be `builtin`. - -### 3.4 Starting AINode - -After completing the deployment of ConfigNodes, you can add an AINode to support time series model management and inference functionality. After specifying the TimechoDB cluster information in the configuration items, you can execute the corresponding command to start the AINode and join the TimechoDB cluster. - -```bash -# Startup command -# Linux and macOS systems -bash sbin/start-ainode.sh - -# Windows system -sbin\start-ainode.bat - -# Background startup command (recommended for long-term operation) -# Linux and macOS systems -bash sbin/start-ainode.sh -d - -# Windows system -sbin\start-ainode.bat -d -``` - -### 3.5 Activating AINode - -1. Refer to TimechoDB Activation: [Activation Method](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md#_2-6-activate-database) - -2. You can verify AINode activation as follows. When the status shows `ACTIVATED`, it indicates successful activation. 
- -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -Total line number = 3 -It costs 0.002s -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2025-07-16T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| AiNodeLimit| 1| 1| -| CpuLimit| 11| Unlimited| -| DeviceLimit| 0| Unlimited| -|TimeSeriesLimit| 0| 9,999| -+---------------+---------+-----------------------------+ -Total line number = 7 -It costs 0.013s -``` - -### 3.6 Checking AINode Node Status - -During startup, AINode automatically joins the TimechoDB cluster. After starting AINode, you can enter an SQL query in the command line. Seeing the AINode node in the cluster with a `Running` status (as shown below) indicates a successful join. 
- -```sql -TimechoDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -Additionally, you can check the model status using the `show models` command. If the model status is incorrect, please verify the weight file path. - -```sql -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### 3.7 Stopping AINode - -If you need to stop a running AINode node, execute the corresponding shutdown script. It supports specifying the port via the `-p` parameter, which corresponds to the `ain_rpc_port` configuration item. - -```bash -# Linux / macOS -bash sbin/stop-ainode.sh -bash sbin/stop-ainode.sh -p # Specify port - -# Windows -sbin\stop-ainode.bat -sbin\stop-ainode.bat -p # Specify port -``` - -After stopping AINode, you can still see the AINode node in the cluster, but its status will be `UNKNOWN` (as shown below). 
AINode functionality will be unavailable at this time. - -```sql -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|UNKNOWN| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -If you need to restart the node, re-execute the startup script. - -### 3.8 Upgrading AINode -If you need to upgrade the version of the current AINode, follow these steps: - -1. Stop the current AINode service - - Run the stop command and ensure the service has completely exited before proceeding with subsequent operations. - - ```bash - # Linux / MacOS - bash sbin/stop-ainode.sh - bash sbin/stop-ainode.sh -p # Specify port - - # Windows - sbin\stop-ainode.bat - sbin\stop-ainode.bat -p # Specify port - ``` - -2. Replace core files - - Delete the `lib` and `sbin` directories of the current version, then copy the `lib` and `sbin` directories from the new version to the corresponding locations. - - Back up the modified configuration files in the `conf` directory, then replace the `conf` folder and synchronize your modified configurations to the corresponding files. - -3. Update built-in model weights (optional) - - If the new version includes updates to built-in models, relevant information will be announced in the [Release History](../IoTDB-Introduction/Release-history_timecho.md). You may contact Timecho staff to obtain the latest weight package, and replace it in the `data/ainode/models/builtin` directory. - -4. 
A \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/AINode_Deployment_timecho.md b/src/UserGuide/Master/Tree/Deployment-and-Maintenance/AINode_Deployment_timecho.md deleted file mode 100644 index 8981ee335..000000000 --- a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/AINode_Deployment_timecho.md +++ /dev/null @@ -1,319 +0,0 @@ - -# AINode Deployment - -## 1. AINode Introduction - -### 1.1 Capability Introduction - - AINode is the third type of native node provided by IoTDB, after ConfigNode and DataNode. By interacting with the DataNodes and ConfigNodes of an IoTDB cluster, it extends the cluster with machine learning analysis on time series. It supports registering existing machine learning models from external sources and using registered models to complete time series analysis tasks on specified time series data through simple SQL statements. Model creation, management, and inference are integrated into the database engine. Currently, machine learning algorithms and self-developed models are available for common time series analysis scenarios, such as prediction and anomaly detection. - -### 1.2 Delivery Method - AINode is an additional package outside the IoTDB cluster and is installed independently. - -### 1.3 Deployment mode -
- -## 2. Installation preparation - -### 2.1 Get installation package - - Unzip the installation package - (`timechodb--ainode-bin.zip`). The directory structure after unpacking is as follows: - -| **Directory** | **Type** | **Description** | -| ----------- | -------- |-----------------------------------------------------------------------| -| lib | folder | Python package files for AINode | -| sbin | folder | Scripts for starting, removing, and stopping AINode | -| conf | folder | Configuration files for AINode, and runtime environment setup scripts | -| LICENSE | file | License | -| NOTICE | file | Notice | -| README_ZH.md | file | Chinese version of the README, in Markdown format | -| README.md | file | Usage instructions | - -### 2.2 Pre-installation Check - -To ensure the AINode installation package you obtained is complete and valid, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: please contact the Timecho team. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/ainode): - ```Bash - cd /data/ainode - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-ainode-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-06.png) - -4. Compare the output with the official SHA512 checksum. If they match, you can proceed with the installation and deployment of AINode as described below. - -#### Notes: - -- If the verification results do not match, please contact the Timecho team to obtain the installation package again.
-- If a "file not found" prompt appears during verification, check whether the file path is correct and whether the installation package has been fully downloaded. - -### 2.3 Environmental Preparation - -1. Recommended operating systems: Ubuntu, macOS -2. IoTDB version: >= V2.0.5.1 -3. Runtime environment - - Python version between 3.9 and 3.12, with the pip and venv tools installed. - -## 3. Installation steps - -### 3.1 Install AINode - -1. Ensure the Python version is between 3.9 and 3.12: -```shell -python --version -# or -python3 --version -``` - -2. Download the AINode package, place it in a dedicated folder, switch to that folder, and unzip the package: -```shell - unzip timechodb--ainode-bin.zip - ``` -3. Activate AINode: - -- Enter the IoTDB CLI - -```shell -# For Linux or macOS -./start-cli.sh - -# For Windows -./start-cli.bat -``` - -- Run the following command to retrieve the machine code required for activation: - -```sql -show system info -``` - -- Copy the returned machine code and send it to the Timecho team: - -```sql -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -``` - -- Enter the activation code provided by the Timecho team in the CLI using the following format. Wrap the activation code in single quotes ('): - -```sql -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZK' -``` - -- You can verify the activation as follows: when the status shows ACTIVATED, activation was successful.
- -```sql -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ - -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2025-07-16T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| AiNodeLimit| 1| 1| -| CpuLimit| 11| Unlimited| -| DeviceLimit| 0| Unlimited| -|TimeSeriesLimit| 0| 9,999| -+---------------+---------+-----------------------------+ - -``` - -### 3.2 Configuration item modification - -AINode supports modifying some necessary parameters. 
You can find the following parameters in the `conf/iotdb-ainode.properties` file and make persistent modifications to them: - -| **Name** | **Description** | **Type** | **Default Value** | -| ------------------------------ | ------------------------------------------------------------ | -------- | ------------------ | -| cluster_name | Identifier of the cluster AINode joins | string | defaultCluster | -| ain_seed_config_node | Address of the ConfigNode registered when AINode starts | String | 127.0.0.1:10710 | -| ain_cluster_ingress_address | RPC address of the DataNode for AINode to pull data | String | 127.0.0.1 | -| ain_cluster_ingress_port | RPC port of the DataNode for AINode to pull data | Integer | 6667 | -| ain_cluster_ingress_username | Client username for AINode to pull data from the DataNode | String | root | -| ain_cluster_ingress_password | Client password for AINode to pull data from the DataNode | String | root | -| ain_cluster_ingress_time_zone | Client time zone for AINode to pull data from the DataNode | String | UTC+8 | -| ain_inference_rpc_address | Address for AINode to provide services and communication (internal interface) | String | 127.0.0.1 | -| ain_inference_rpc_port | Port for AINode to provide services and communication | String | 10810 | -| ain_system_dir | Metadata storage path for AINode (relative path starts from OS-dependent directory; absolute path is recommended) | String | data/AINode/system | -| ain_models_dir | Path to store model files for AINode (relative path starts from OS-dependent directory; absolute path is recommended) | String | data/AINode/models | -| ain_thrift_compression_enabled | Whether to enable Thrift compression for AINode (0=disabled, 1=enabled) | Boolean | 0 | - -### 3.3 Importing Weight Files - -> Offline environment only (Online environments can skip this step) -> -Contact Timecho team to obtain the model weight files, then place them in the /IOTDB_AINODE_HOME/data/ainode/models/weights/ directory. 
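The offline weight import described above amounts to copying the obtained files into the `weights` directory. Below is a minimal sketch, assuming the weight files have already been unpacked into a local folder. The function name `install_weights` and both of its arguments are hypothetical, and the internal layout of the weight package is not specified here, so the folder contents are copied unchanged.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not shipped with AINode: copies the contents of a
# local weight folder into <AINode home>/data/ainode/models/weights.
install_weights() {
  local src="$1" ainode_home="$2"
  local dest="$ainode_home/data/ainode/models/weights"
  if [ ! -d "$src" ]; then
    echo "weight folder not found: $src" >&2
    return 1
  fi
  mkdir -p "$dest"
  # "src/." copies the folder's contents (not the folder itself), so the
  # files end up directly under weights/.
  cp -R "$src"/. "$dest"/
}

# Example call; both paths are placeholders to fill in:
# install_weights "/path/to/unpacked/weights" "/IOTDB_AINODE_HOME"
```

Copying the contents rather than the folder itself avoids an extra directory level under `weights/`, which would otherwise change the path at which the files are found.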
- - -### 3.4 Start AINode - - After the seed ConfigNode has been deployed, you can add AINode nodes to support model registration and inference. After specifying the IoTDB cluster information in the configuration file, run the corresponding command to start AINode and join the IoTDB cluster. - -- Networking environment startup - -Start command - -```shell - # Start command - # Linux and macOS systems - bash sbin/start-ainode.sh - - # Windows systems - sbin\start-ainode.bat - - # Background startup command (recommended for long-term running) - # Linux and macOS systems - nohup bash sbin/start-ainode.sh > myout.file 2>&1 & - - # Windows systems - nohup bash sbin\start-ainode.bat > myout.file 2>&1 & - ``` - -### 3.5 Detecting the status of AINode nodes - -During startup, the new AINode is automatically added to the IoTDB cluster. After starting AINode, you can run an SQL query in the command line. If the AINode appears in the cluster with the status Running (as shown below), it has joined successfully. - - -```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|Running| 127.0.0.1| 10810|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` - -### 3.6 Stop AINode - -If you need to stop a running AINode, execute the corresponding shutdown script.
- - Stop command - -```shell - # Linux / macOS - bash sbin/stop-ainode.sh - - # Windows - sbin\stop-ainode.bat - ``` - -After stopping AINode, you can still see the AINode in the cluster, but its status is UNKNOWN (as shown below), and AINode functionality is unavailable at this time. - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|UNKNOWN| 127.0.0.1| 10790|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -If you need to restart the node, execute the startup script again. - -## 4. Common Problems - -### 4.1 An error occurs when starting AINode stating that the venv module cannot be found - - When AINode is started in the default way, a Python virtual environment is created in the installation package directory and the dependencies are installed into it, so the venv module must be available. Python 3.10 and later generally ship with venv, but the Python environment bundled with some systems does not include it. When this error occurs, there are two solutions (choose either one): - - Install the venv module locally. On Ubuntu, for example, you can run the following command to install it; alternatively, install a Python version that includes venv from the official Python website. - - ```shell -apt-get install python3.10-venv -``` -Then create a virtual environment with Python 3.10.0 in the AINode path:
- - ```shell -# "venv" is the name of the folder to create -../Python-3.10.0/python -m venv venv -``` - Alternatively, when running the startup script, use `-i` to specify the path of an existing Python interpreter as the runtime environment for AINode, which removes the need to create a new virtual environment. - - ### 4.2 The SSL module in Python is not properly installed and configured to handle HTTPS resources -The following warning is reported: "WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available." -You can install OpenSSL and then rebuild Python to solve this problem. -> Currently Python versions 3.6 to 3.9 are compatible with OpenSSL 1.0.2, 1.1.0, and 1.1.1. - - Python requires OpenSSL to be installed on the system. The specific installation method can be found in [link](https://stackoverflow.com/questions/56552390/how-to-fix-ssl-module-in-python-is-not-available-in-centos) - - ```shell -sudo apt-get install build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev uuid-dev lzma-dev liblzma-dev -sudo -E ./configure --with-ssl -make -sudo make install -``` - - ### 4.3 The pip version is too low - - A compilation issue similar to "error: Microsoft Visual C++ 14.0 or greater is required..." appears on Windows. - -This error occurs during installation and compilation, usually because the C++ build tools or the setuptools version is too old.
You can upgrade pip and setuptools as follows: - - ```shell -./python -m pip install --upgrade pip -./python -m pip install --upgrade setuptools -``` - - - ### 4.4 Install and compile Python - - Use the following commands to download the source package from the official website and extract it: - ```shell -wget https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tar.xz -tar Jxf Python-3.10.0.tar.xz -``` - Compile and install Python: - ```shell -cd Python-3.10.0 -./configure --prefix=/usr/local/python3 -make -sudo make install -python3 --version -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Cluster-Deployment_timecho.md b/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Cluster-Deployment_timecho.md deleted file mode 100644 index 516cf88b1..000000000 --- a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Cluster-Deployment_timecho.md +++ /dev/null @@ -1,640 +0,0 @@ - -# Cluster Deployment - -This guide describes how to manually deploy a cluster instance consisting of 3 ConfigNodes and 3 DataNodes (commonly referred to as a 3C3D cluster). - -
- - - -## 1. Prerequisites - -1. [System configuration](./Environment-Requirements.md): Ensure the system has been configured according to the preparation guidelines. - -2. **IP Configuration**: It is recommended to use hostnames for IP configuration to prevent issues caused by IP address changes. Configure the `/etc/hosts` file on each server. For example, if the local IP is `11.101.17.224` and the hostname is `iotdb-1`, use the following command to set the hostname: - - ```shell - echo "11.101.17.224 iotdb-1" >> /etc/hosts - ``` - - Use the hostname for `cn_internal_address` and `dn_internal_address` in IoTDB configuration. - -3. **Unmodifiable Parameters**: Some parameters cannot be changed after the first startup. Refer to the Parameter Configuration section. - -4. **Installation Path**: Ensure the installation path contains no spaces or non-ASCII characters to prevent runtime issues. - -5. **User Permissions**: Choose one of the following permissions during installation and deployment: - - - **Root User (Recommended)**: This avoids permission-related issues. - - **Non-Root User**: - - Use the same user for all operations, including starting, activating, and stopping services. - - Avoid using `sudo`, which can cause permission conflicts. - -6. **Monitoring Panel**: Deploy a monitoring panel to track key performance metrics. Contact the Timecho team for access and refer to the "[Monitoring Panel Deployment](./Monitoring-panel-deployment.md)" guide. - -7. **Health Check Tool**: Before installation, the health check tool can inspect the operating environment of IoTDB nodes and produce detailed inspection results. For usage, see [Health Check Tool](../Tools-System/Health-Check-Tool.md). - - -## 2. Preparation - -1. Obtain the TimechoDB installation package `timechodb-{version}-bin.zip` by following [IoTDB-Package](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) - -2.
Configure the operating system environment according to [Environment Requirement](../Deployment-and-Maintenance/Environment-Requirements.md). - -### 2.1 Pre-installation Check - -To ensure the IoTDB Enterprise Edition installation package you obtained is complete and authentic, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Find the "SHA512 Checksum" corresponding to each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/iotdb): - ```Bash - cd /data/iotdb - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-02.png) - -4. Compare the output with the official SHA512 checksum. Once you have confirmed that they match, proceed with the installation and deployment steps below. - -#### Notes: - -- If the checksums do not match, contact the Timecho team to obtain the installation package again. -- If a "file not found" prompt appears during verification, check whether the file path is correct and whether the installation package has been fully downloaded. - -## 3. 
Installation Steps - -Taking a cluster with three Linux servers with the following information as example: - -| Node IP | Host Name | Service | -| ------------- | --------- | -------------------- | -| 11.101.17.224 | iotdb-1 | ConfigNode、DataNode | -| 11.101.17.225 | iotdb-2 | ConfigNode、DataNode | -| 11.101.17.226 | iotdb-3 | ConfigNode、DataNode | - -### 3.1 Configure Hostnames - -On all three servers, configure the hostnames by editing the `/etc/hosts` file. Use the following commands: - -```Bash -echo "11.101.17.224 iotdb-1" >> /etc/hosts -echo "11.101.17.225 iotdb-2" >> /etc/hosts -echo "11.101.17.226 iotdb-3" >> /etc/hosts -``` - -### 3.2 Extract Installation Package - -Unzip the installation package and enter the installation directory: - -```Plain -unzip timechodb-{version}-bin.zip -cd timechodb-{version}-bin -``` - -### 3.3 Parameters Configuration - -- #### Memory Configuration - - Edit the following files for memory allocation: - - - **ConfigNode**: `./conf/confignode-env.sh` (or `.bat` for Windows) - - | **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | - | :------------ | :--------------------------------- | :---------- | :-------------- | :-------------------------------------- | - | MEMORY_SIZE | Total memory allocated to the node | Automatically calculated based on system memory, defaulting to 30% of the system memory. | As needed | Save changes without immediate execution; modifications take effect after service restart. | - - - **DataNode**: `./conf/datanode-env.sh` (or `.bat` for Windows) - - | **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | - | :------------ | :--------------------------------- |:-----------------------------------------------------------------------------------------| :-------------- | :-------------------------------------- | - | MEMORY_SIZE | Total memory allocated to the node | Automatically calculated based on system memory, defaulting to 50% of the system memory. 
| As needed | Save changes without immediate execution; modifications take effect after service restart. | - - -**General Configuration** - -Set the following parameters in `./conf/iotdb-system.properties`. Refer to `./conf/iotdb-system.properties.template` for a complete list. - -**Cluster-Level Parameters**: - -| **Parameter** | **Description** | **11.101.17.224** | **11.101.17.225** | **11.101.17.226** | -| :------------------------ | :----------------------------------------------------------- | :---------------- | :---------------- | :---------------- | -| cluster_name | Name of the cluster | defaultCluster | defaultCluster | defaultCluster | -| schema_replication_factor | Metadata replication factor; DataNode count shall not be fewer than this value | 3 | 3 | 3 | -| data_replication_factor | Data replication factor; DataNode count shall not be fewer than this value | 2 | 2 | 2 | - -#### ConfigNode Parameters - -| **Parameter** | **Description** | **Default** | **Recommended** | **11.101.17.224** | **11.101.17.225** | **11.101.17.226** | **Notes** | -| :------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------------------- | :---------------- | :---------------- | :---------------- | :--------------------------------------------------------- | -| cn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. | iotdb-1 | iotdb-2 | iotdb-3 | This parameter cannot be modified after the first startup. | -| cn_internal_port | Port used for internal communication within the cluster | 10710 | 10710 | 10710 | 10710 | 10710 | This parameter cannot be modified after the first startup. 
| -| cn_consensus_port | Port used for consensus protocol communication among ConfigNode replicas | 10720 | 10720 | 10720 | 10720 | 10720 | This parameter cannot be modified after the first startup. | -| cn_seed_config_node | Address of the ConfigNode for registering and joining the cluster. (e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Address and port of the seed ConfigNode (e.g., `cn_internal_address:cn_internal_port`) | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | This parameter cannot be modified after the first startup. | - -#### DataNode Parameters - -| **Parameter** | **Description** | **Default** | **Recommended** | **11.101.17.224** | **11.101.17.225** | **11.101.17.226** | **Notes** | -| :------------------------------ | :----------------------------------------------------------- |:----------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :---------------- | :---------------- | :---------------- | :--------------------------------------------------------- | -| dn_rpc_address | Address for the client RPC service | 127.0.0.1 | By default, the local machine can directly access it. For non-local access, please modify this configuration item to the IPv4 address or hostname of the server where it is located. It is recommended to use the IPv4 address of the server where it is located. | iotdb-1 | iotdb-2 | iotdb-3 | Effective after restarting the service. | -| dn_rpc_port | Port for the client RPC service | 6667 | 6667 | 6667 | 6667 | 6667 | Effective after restarting the service. | -| dn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. 
| iotdb-1 | iotdb-2 | iotdb-3 | This parameter cannot be modified after the first startup. | -| dn_internal_port | Port used for internal communication within the cluster | 10730 | 10730 | 10730 | 10730 | 10730 | This parameter cannot be modified after the first startup. | -| dn_mpp_data_exchange_port | Port used for receiving data streams | 10740 | 10740 | 10740 | 10740 | 10740 | This parameter cannot be modified after the first startup. | -| dn_data_region_consensus_port | Port used for data replica consensus protocol communication | 10750 | 10750 | 10750 | 10750 | 10750 | This parameter cannot be modified after the first startup. | -| dn_schema_region_consensus_port | Port used for metadata replica consensus protocol communication | 10760 | 10760 | 10760 | 10760 | 10760 | This parameter cannot be modified after the first startup. | -| dn_seed_config_node | Address of the ConfigNode for registering and joining the cluster.(e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Address of the first ConfigNode | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | This parameter cannot be modified after the first startup. | - -**Note:** Ensure files are saved after editing. Tools like VSCode Remote do not save changes automatically. - -### 3.4 Start ConfigNode Instances - -1. Start the first ConfigNode (`iotdb-1`) as the seed node - -```Bash -# Unix/OS X -cd sbin -./start-confignode.sh -d #"- d" parameter will start in the background - -# Windows -# Before version V2.0.4.x -.\start-confignode.bat - -# V2.0.4.x and later versions -.\windows\start-confignode.bat -``` - -2. Start the remaining ConfigNodes (`iotdb-2` and `iotdb-3`) in sequence. - - If the startup fails, refer to the [Common Questions](#common-questions) section below for troubleshooting. 
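
Before starting the remaining nodes, it may be worth confirming that the seed address in each node's configuration has the `host:port` form described in the parameter tables above. The sketch below is only an illustration: `get_property` and `is_host_port` are hypothetical helper names, not scripts shipped with IoTDB, and the example path assumes the current directory is the installation root.

```shell
# Hypothetical helpers (not part of the IoTDB distribution).

# Print the value of a key from a properties file (last occurrence wins).
# $1 = properties file, $2 = key
get_property() {
  grep "^$2=" "$1" | tail -n 1 | cut -d '=' -f 2-
}

# Succeed only if $1 looks like <host>:<port> with a port in 1..65535.
is_host_port() {
  hp_value="$1"
  hp_host="${hp_value%:*}"
  hp_port="${hp_value##*:}"
  # No colon present: stripping the suffix leaves the value unchanged.
  [ "$hp_host" != "$hp_value" ] || return 1
  [ -n "$hp_host" ] || return 1
  case "$hp_port" in
    ''|*[!0-9]*) return 1 ;;  # empty or non-numeric port
  esac
  [ "$hp_port" -ge 1 ] && [ "$hp_port" -le 65535 ]
}

# Example usage (assumed path; adjust to your installation):
#   seed="$(get_property ./conf/iotdb-system.properties cn_seed_config_node)"
#   if is_host_port "$seed"; then echo "seed ConfigNode: $seed"; else echo "unexpected value: $seed"; fi
```

The check is purely syntactic: it does not verify that the host resolves or that the port is reachable, so the startup logs remain the authoritative source when a node fails to join.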
- -### 3.5 Start DataNode Instances - -On each server, navigate to the `sbin` directory and start the DataNode: - -```Bash -# Unix/OS X -cd sbin -./start-datanode.sh -d # The -d parameter starts the process in the background - -# Windows -# Before version V2.0.4.x -.\start-datanode.bat - -# V2.0.4.x and later versions -.\windows\start-datanode.bat -``` - -### 3.6 Activate Database - -#### Option 1: Command-Based Activation - -1. Enter the IoTDB CLI on any node of the cluster: - -The Linux and macOS startup commands are as follows: - -```shell -# Before version V2.0.6.x -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x and later versions -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -The Windows startup commands are as follows: - -```shell -# Before version V2.0.4.x -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.4.x and later versions, before version V2.0.6.x -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x and later versions -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -2. 
Execute the following command to obtain the machine code required for activation: - -```SQL -IoTDB> show system info -``` -```shell -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -|01-TE5NLES4-UDDWCMYE,01-GG5NLES4-XXDWCMYE,01-FF5NLES4-WWWWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -3. Execute the following statement to obtain the version number of the database to be activated: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.9.2| 5ea21bc| -+-------+---------+ -Total line number = 1 -``` - -4. Provide the obtained machine code and version number to the Timecho team. - -5. Enter the activation codes provided by the Timecho team in the CLI in sequence using the following format. Wrap the activation code in single quotes ('): - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - -- Note : The activation operation only needs to be performed once on any machine in the cluster. - -#### Option 2: File-Based Activation - -1. Start all ConfigNodes and DataNodes. -2. Copy the `system_info` file from the `activation` directory on each server and send them to the Timecho team. 
-3. Place the license files provided by the Timecho team into the corresponding `activation` folder for each node. - - -### 3.7 Verify Activation - -In the CLI, you can check the activation status by running the `show activation` command; the example below shows a status of ACTIVATED, indicating successful activation. - -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - - -### 3.8 One-click Cluster Start and Stop - -#### 3.8.1 Overview - -The `sbin` subdirectory of the IoTDB root directory contains the `start-all.sh` and `stop-all.sh` scripts. Together with the `iotdb-cluster.properties` configuration file in the `conf` subdirectory, they start or stop all nodes in the cluster from a single node, which simplifies deployment and routine operation of the cluster. - -The following section introduces the configuration items in the `iotdb-cluster.properties` file. - -#### 3.8.2 Configuration Items - -> Note: -> -> * When the cluster changes, this configuration file needs to be manually updated. -> * If the `iotdb-cluster.properties` configuration file is not set up and the `start-all.sh` or `stop-all.sh` scripts are executed, the scripts will, by default, start or stop the ConfigNode and DataNode nodes located in the IOTDB\_HOME directory where the scripts reside. 
-> * It is recommended to configure SSH passwordless login: If not configured, the script will prompt for the server password after execution to facilitate subsequent start, stop, or destroy operations. If already configured, there is no need to enter the server password during script execution. - -* confignode\_address\_list - -| **Name** | **confignode\_address\_list** | -| :----------------: |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | A list of IP addresses or hostname of the hosts where the ConfigNodes to be started/stopped are located. If there are multiple, they should be separated by commas. | -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* datanode\_address\_list - -| **Name** | **datanode\_address\_list** | -| :----------------: |:------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | A list of IP addresses or hostname of the hosts where the DataNodes to be started/stopped are located. If there are multiple, they should be separated by commas. | -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* ssh\_account - -| **Name** | **ssh\_account** | -| :----------------: | :------------------------------------------------------------------------------------------------- | -| Description | The username used to log in to the target hosts via SSH. All hosts must have the same username. | -| Type | String | -| Default | root | -| Effective | After restarting the system | - -* ssh\_port - -| **Name** | **ssh\_port** | -| :----------------: |:--------------------------------------------------------------------------------------| -| Description | The SSH port exposed by the target hosts. All hosts must have the same SSH port. 
| -| Type | int | -| Default | 22 | -| Effective | After restarting the system | - -* confignode\_deploy\_path - -| **Name** | **confignode\_deploy\_path** | -| :----------------: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Description | The path on the target hosts where all ConfigNodes to be started/stopped are located. All ConfigNodes must be in the same directory on their respective hosts. eg: `/data/demo/apache-iotdb-1.3.1-all-bin`| -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* datanode\_deploy\_path - -| **Name** | **datanode\_deploy\_path** | -| :----------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| Description | The path on the target hosts where all DataNodes to be started/stopped are located. All DataNodes must be in the same directory on their respective hosts. eg: `/data/demo/apache-iotdb-1.3.1-all-bin`| -| Type | String | -| Default | None | -| Effective | After restarting the system | - - -#### 3.8.3 Quick Example - -1. Configuration File: `iotdb-cluster.properties` -```properties -# Configure ConfigNode node addresses, separated by commas -confignode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# Configure DataNode node addresses, separated by commas -datanode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# SSH login username for target deployment servers -ssh_account=root - -# SSH service port number -ssh_port=22 - -# IoTDB installation directory (the program will be deployed into this path on remote nodes) -confignode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -datanode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -``` - -2. 
Run `./start-all.sh` to launch cluster and verify status - Connect to IoTDB CLI and execute `show cluster`. A successful output is shown below: -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -| 0|ConfigNode|Running| 172.xx.xx.16| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 1|ConfigNode|Running| 172.xx.xx.18| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 2|ConfigNode|Running| 172.xx.xx.17| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 3| DataNode|Running| 172.xx.xx.18| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 4| DataNode|Running| 172.xx.xx.17| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 5| DataNode|Running| 172.xx.xx.16| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -``` - - -## 4. Maintenance - -### 4.1 ConfigNode Maintenance - -ConfigNode maintenance includes adding and removing ConfigNodes. Common use cases include: - -- **Cluster Expansion:** If the cluster contains only 1 ConfigNode, adding 2 more ConfigNodes enhances high availability, resulting in a total of 3 ConfigNodes. -- **Cluster Fault Recovery:** If a ConfigNode's machine fails and it cannot function normally, remove the faulty ConfigNode and add a new one to the cluster. - -**Note:** After completing ConfigNode maintenance, ensure that the cluster contains either 1 or 3 active ConfigNodes. Two ConfigNodes do not provide high availability, and more than three ConfigNodes can degrade performance. - -#### Adding a ConfigNode - -**Linux /** **MacOS**: - -```Bash -sbin/start-confignode.sh -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\start-confignode.bat - -# V2.0.4.x and later versions -sbin\windows\start-confignode.bat -``` - -#### Removing a ConfigNode - -1. 
Connect to the cluster using the CLI and confirm the internal address and port of the ConfigNode to be removed: - - ```Plain - show confignodes; - ``` - -Example output: - -```Plain -IoTDB> show confignodes -+------+-------+---------------+------------+--------+ -|NodeID| Status|InternalAddress|InternalPort| Role| -+------+-------+---------------+------------+--------+ -| 0|Running| 127.0.0.1| 10710| Leader| -| 1|Running| 127.0.0.1| 10711|Follower| -| 2|Running| 127.0.0.1| 10712|Follower| -+------+-------+---------------+------------+--------+ -Total line number = 3 -It costs 0.030s -``` - -2. Remove the ConfigNode using the script: - -**Linux /** **MacOS**: - -```Bash -sbin/remove-confignode.sh [confignode_id] -# Or: -sbin/remove-confignode.sh [cn_internal_address:cn_internal_port] -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\remove-confignode.bat [confignode_id] -# Or: -sbin\remove-confignode.bat [cn_internal_address:cn_internal_port] - -# V2.0.4.x and later versions -sbin\windows\remove-confignode.bat [confignode_id] -# Or: -sbin\windows\remove-confignode.bat [cn_internal_address:cn_internal_port] -``` - -### 4.2 DataNode Maintenance - -DataNode maintenance includes adding and removing DataNodes. Common use cases include: - -- **Cluster Expansion:** Add new DataNodes to increase cluster capacity. -- **Cluster Fault Recovery:** If a DataNode's machine fails and it cannot function normally, remove the faulty DataNode and add a new one to the cluster. - -**Note:** During and after DataNode maintenance, ensure that the number of active DataNodes is not fewer than the data replication factor (usually 2) or the schema replication factor (usually 3). 
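
The constraint in the note above can be checked with simple arithmetic before a DataNode is removed. The following is a minimal sketch; `can_remove_datanode` is a hypothetical helper name, and the three numbers must be supplied by the operator (for example from `show datanodes` and the cluster-level parameters), since the function does not query the cluster itself.

```shell
# Hypothetical helper (not part of the IoTDB distribution).
# Succeeds if removing one DataNode still leaves at least as many active
# DataNodes as the larger of the two replication factors.
# $1 = active DataNodes, $2 = data_replication_factor, $3 = schema_replication_factor
can_remove_datanode() {
  dn_active="$1"
  dn_required="$2"
  if [ "$3" -gt "$dn_required" ]; then
    dn_required="$3"
  fi
  dn_remaining=$((dn_active - 1))
  [ "$dn_remaining" -ge "$dn_required" ]
}

# Example usage with the replication factors used in this guide (data 2, schema 3):
#   if can_remove_datanode 4 2 3; then echo "safe to remove one DataNode"; fi
```

With 3 active DataNodes and a schema replication factor of 3, the check fails, which matches the guidance to add a replacement node before removing a faulty one.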
- -#### Adding a DataNode - -**Linux /** **MacOS**: - -```Bash -sbin/start-datanode.sh -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\start-datanode.bat - -# V2.0.4.x and later versions -sbin\windows\start-datanode.bat -``` - -**Note:** After adding a DataNode, the cluster load will gradually balance across all nodes as new writes arrive and old data expires (if TTL is set). - -#### Removing a DataNode - -1. Connect to the cluster using the CLI and confirm the RPC address and port of the DataNode to be removed: - -```Plain -show datanodes; -``` - -Example output: - -```Plain -IoTDB> show datanodes -+------+-------+----------+-------+-------------+---------------+ -|NodeID| Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum| -+------+-------+----------+-------+-------------+---------------+ -| 1|Running| 0.0.0.0| 6667| 0| 0| -| 2|Running| 0.0.0.0| 6668| 1| 1| -| 3|Running| 0.0.0.0| 6669| 1| 0| -+------+-------+----------+-------+-------------+---------------+ -Total line number = 3 -It costs 0.110s -``` - -2. Remove the DataNode using the script: - -**Linux / MacOS:** - -```Bash -sbin/remove-datanode.sh [dn_rpc_address:dn_rpc_port] -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\remove-datanode.bat [dn_rpc_address:dn_rpc_port] - -# V2.0.4.x and later versions -sbin\windows\remove-datanode.bat [dn_rpc_address:dn_rpc_port] -``` - -### 4.3 Cluster Maintenance - -For more details on cluster maintenance, please refer to: [Cluster Maintenance](../User-Manual/Load-Balance.md) - -## 5. Common Questions - -1. Activation Fails Repeatedly - - Use the `ls -al` command to verify that the ownership of the installation directory matches the current user. - - Check the ownership of all files in the `./activation` directory to ensure they belong to the current user. -2. ConfigNode Fails to Start - - Review the startup logs to check if any parameters, which cannot be modified after the first startup, were changed. 
- - Check the logs for any other errors. If unresolved, contact technical support for assistance. - - If the deployment is fresh or data can be discarded, clean the environment and redeploy using the following steps: - **Clean the Environment** - - - Stop all ConfigNode and DataNode processes: - ```Bash - sbin/stop-standalone.sh - ``` - - - Check for any remaining processes: - ```Bash - jps - # or - ps -ef | grep iotdb - ``` - - - If processes remain, terminate them manually: - ```Bash - kill -9 <pid> - - # For systems with a single IoTDB instance, you can clean up residual processes with: - ps -ef | grep iotdb | grep -v grep | tr -s ' ' ' ' | cut -d ' ' -f2 | xargs kill -9 - ``` - - - Delete the `data` and `logs` directories: - ```Bash - cd /data/iotdb - rm -rf data logs - ``` - -## 6. Appendix - -### 6.1 ConfigNode Parameters - -| Parameter | Description | Required | -| :-------- | :---------------------------------------------------------- | :------- | -| -d | Starts the process in daemon mode (runs in the background). | No | - -### 6.2 DataNode Parameters - -| Parameter | Description | Required | -| :-------- | :----------------------------------------------------------- | :------- | -| -v | Displays version information. | No | -| -f | Runs the script in the foreground without backgrounding it. | No | -| -d | Starts the process in daemon mode (runs in the background). | No | -| -p | Specifies a file to store the process ID for process management. | No | -| -c | Specifies the path to the configuration folder; the script loads configuration files from this location. | No | -| -g | Prints detailed garbage collection (GC) information. | No | -| -H | Specifies the path for the Java heap dump file, used during JVM memory overflow. | No | -| -E | Specifies the file for JVM error logs. | No | -| -D | Defines system properties in the format `key=value`. | No | -| -X | Passes `-XX` options directly to the JVM. | No | -| -h | Displays the help instructions. 
| No | diff --git a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Deployment-form_timecho.md b/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Deployment-form_timecho.md deleted file mode 100644 index b2daee47f..000000000 --- a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Deployment-form_timecho.md +++ /dev/null @@ -1,63 +0,0 @@ - -# Deployment form - -IoTDB has three deployment modes: standalone mode, dual-active mode, and cluster mode. - -## 1. Standalone Mode - -An IoTDB standalone instance includes 1 ConfigNode and 1 DataNode, i.e., 1C1D. - -- **Features**: Easy for developers to install and deploy, with low deployment and maintenance costs and convenient operations. -- **Use Cases**: Scenarios with limited resources or low high-availability requirements, such as edge servers. -- **Deployment Method**: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -## 2. Dual-Active Mode - -Dual-Active Deployment is a feature of TimechoDB, where two independent instances synchronize bidirectionally and can provide services simultaneously. If one instance stops and restarts, the other instance will resume data transfer from the breakpoint. - -> An IoTDB Dual-Active instance typically consists of 2 standalone nodes, i.e., 2 sets of 1C1D. Each instance can also be a cluster. - -- **Features**: The high-availability solution with the lowest resource consumption. -- **Use Cases**: Scenarios with limited resources (only two servers) but requiring high availability. -- **Deployment Method**: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -## 3. Cluster Mode - -An IoTDB cluster instance consists of 3 ConfigNodes and no fewer than 3 DataNodes, typically 3 DataNodes, i.e., 3C3D. If some nodes fail, the remaining nodes can still provide services, ensuring high availability of the database. Performance can be improved by adding DataNodes. 
- -- **Features**: High availability, high scalability, and improved system performance by adding DataNodes. -- **Use Cases**: Enterprise-level application scenarios requiring high availability and reliability. -- **Deployment Method**: [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - - - -## 4. Feature Summary - -| **Dimension** | **Stand-Alone Mode** | **Dual-Active Mode** | **Cluster Mode** | -| :-------------------------- | :------------------------------------------------------- | :------------------------------------------------------ | :------------------------------------------------------ | -| Use Cases | Edge-side deployment, low high-availability requirements | High-availability services, disaster recovery scenarios | High-availability services, disaster recovery scenarios | -| Number of Machines Required | 1 | 2 | ≥3 | -| Security and Reliability | Cannot tolerate single-point failure | High, can tolerate single-point failure | High, can tolerate single-point failure | -| Scalability | Can expand DataNodes to improve performance | Each instance can be scaled as needed | Can expand DataNodes to improve performance | -| Performance | Can scale with the number of DataNodes | Same as one of the instances | Can scale with the number of DataNodes | - -- The deployment steps for Stand-Alone Mode and Cluster Mode are similar (adding ConfigNodes and DataNodes one by one), with differences only in the number of replicas and the minimum number of nodes required to provide services. \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Docker-Deployment_timecho.md b/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Docker-Deployment_timecho.md deleted file mode 100644 index 5d1d89d05..000000000 --- a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Docker-Deployment_timecho.md +++ /dev/null @@ -1,496 +0,0 @@ - -# Docker Deployment - -## 1. 
Environmental Preparation - -### 1.1 Docker Installation - -```Bash -# Example for Ubuntu; for other operating systems, consult the corresponding Docker installation instructions -# Step 1: Install the necessary system tools -sudo apt-get update -sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common -# Step 2: Install the GPG certificate -curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add - -# Step 3: Add the software source -sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" -# Step 4: Update and install Docker CE -sudo apt-get -y update -sudo apt-get -y install docker-ce -# Step 5: Set Docker to start automatically at boot -sudo systemctl enable docker -# Step 6: Verify that Docker was installed successfully -docker --version # Prints version information if the installation succeeded -``` - -### 1.2 Docker-compose Installation - -```Bash -# Installation command -curl -L "https://github.com/docker/compose/releases/download/v2.20.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose -chmod +x /usr/local/bin/docker-compose -ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose -# Verify that the installation was successful -docker-compose --version # Prints version information if the installation succeeded -``` - -### 1.3 Install The Dmidecode Plugin - -dmidecode is usually already installed on Linux servers. If it is not, use the following command to install it. - -```Bash -sudo apt-get install dmidecode -``` - -After installing dmidecode, look up its installation path with `whereis dmidecode`. Assuming the result is `/usr/sbin/dmidecode`, note this path, as it is used later in the docker-compose yml file. - -### 1.4 Get Container Image Of IoTDB - -You can contact business or technical support to obtain container images for IoTDB Enterprise Edition. - -## 2. 
Stand-Alone Deployment - -This section demonstrates how to deploy a standalone (1C1D) instance with Docker. - -### 2.1 Load Image File - -For example, the container image file name of IoTDB obtained here is: `iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz` - -Load image: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -View image: - -```Bash -docker images -``` - -![](/img/%E5%8D%95%E6%9C%BA-%E6%9F%A5%E7%9C%8B%E9%95%9C%E5%83%8F.png) - -### 2.2 Create Docker Bridge Network - -```Bash -docker network create --driver=bridge --subnet=172.18.0.0/16 --gateway=172.18.0.1 iotdb -``` - -### 2.3 Write The Yml File For docker-compose - -This example keeps the IoTDB installation directory and the yml file together in the `/docker-iotdb` folder: - -The file directory structure is: `/docker-iotdb/iotdb`, `/docker-iotdb/docker-compose-standalone.yml` - -```Bash -docker-iotdb: -├── iotdb #IoTDB installation directory -└── docker-compose-standalone.yml #YML file for standalone Docker Compose -``` - -The complete docker-compose-standalone.yml content is as follows: - -```Bash -version: "3" -services: - iotdb-service: - image: timecho/timechodb:2.0.2.1-standalone #The image used - hostname: iotdb - container_name: iotdb - restart: always - ports: - - "6667:6667" - environment: - - cn_internal_address=iotdb - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb:10710 - - dn_rpc_address=iotdb - - dn_internal_address=iotdb - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - dn_seed_config_node=iotdb:10710 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - networks: - iotdb: - ipv4_address: 172.18.0.6 - # Note: Some environments set an extremely high 
container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -networks: - iotdb: - external: true -``` - -### 2.4 First Launch - -Use the following command to start: - -```Bash -cd /docker-iotdb -docker-compose -f docker-compose-standalone.yml up -``` - -Because IoTDB has not been activated yet, it is normal for it to exit immediately on the first startup. The purpose of the first startup is to generate the machine code file needed for the subsequent activation process. - -![](/img/%E5%8D%95%E6%9C%BA-%E6%BF%80%E6%B4%BB.png) - -### 2.5 Apply For Activation - -- After the first startup, a system_info file is generated in the host directory `/docker-iotdb/iotdb/activation`. Send this file to the Timecho staff. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- After receiving the license file returned by the staff, copy it to the `/docker-iotdb/iotdb/activation` folder. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -### 2.6 Restart IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -![](/img/%E5%90%AF%E5%8A%A8iotdb.png) - -### 2.7 Validate Deployment - -- View the log; the following message indicates a successful startup - - ```Bash - docker logs -f iotdb #View log command (the container_name set in the yml file above is iotdb) - 2024-07-19 12:02:32,608 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
- ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B21.png) - -- Enter the container to view the service running status and activation information - - View the launched container - - ```Bash - docker ps - ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B22.png) - - Enter the container, log in to the database through CLI, and use the `show cluster` command to view the service status and activation status - - ```Bash - docker exec -it iotdb /bin/bash #Entering the container - ./start-cli.sh -h iotdb #Log in to the database - IoTDB> show cluster #View status - ``` - - You can see that all services are running and the activation status shows as activated. - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B23.png) - -### 2.8 Map /conf Directory (optional) - -If you want to modify the configuration files directly on the host machine in the future, you can map the `/conf` folder in the container in three steps: - -Step 1: Copy the `/conf` directory from the container to `/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -Step 2: Add mappings in docker-compose-standalone.yml - -```Bash - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -Step 3: Restart IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -## 3. Cluster Deployment - -This section describes how to manually deploy an instance that includes 3 ConfigNodes and 3 DataNodes, commonly known as a 3C3D cluster. - -
- -**Note: The cluster version currently only supports host and overlay networks, and does not support bridge networks.** - -Taking the host network as an example, we will demonstrate how to deploy a 3C3D cluster. - -### 3.1 Set Host Name - -Assuming there are currently three Linux servers, the IP addresses and service role assignments are as follows: - -| Node IP | Host Name | Service | -| ----------- | --------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode, DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode, DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode, DataNode | - -Configure the host names on each of the three machines. To set the host names, configure `/etc/hosts` on each server using the following commands: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### 3.2 Load Image File - -For example, the container image file name obtained for IoTDB is: `iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz` - -Execute the load image command on each of the three servers: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -View image: - -```Bash -docker images -``` - -![](/img/%E9%95%9C%E5%83%8F%E5%8A%A0%E8%BD%BD.png) - -### 3.3 Write The Yml File For Docker Compose - -Here we take the example of placing the IoTDB installation directory and the yml files together in the `/docker-iotdb` folder: - -The file directory structure is: `/docker-iotdb/iotdb`, `/docker-iotdb/confignode.yml`, `/docker-iotdb/datanode.yml` - -```Bash -docker-iotdb: -├── confignode.yml #Yml file of confignode -├── datanode.yml #Yml file of datanode -└── iotdb #IoTDB installation directory -``` - -On each server, two yml files need to be written, namely `confignode.yml` and `datanode.yml`. 
The example of yml is as follows: - -**confignode.yml:** - -```Bash -#confignode.yml -version: "3" -services: - iotdb-confignode: - image: iotdb-enterprise:2.0.x.x-standalone #The image used - hostname: iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - container_name: iotdb-confignode - command: ["bash", "-c", "entrypoint.sh confignode"] - restart: always - environment: - - cn_internal_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb-1:10710 #The default first node is the seed node - - schema_replication_factor=3 #Number of metadata copies - - data_replication_factor=2 #Number of data replicas - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #Using the host network - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -**datanode.yml:** - -```Bash -#datanode.yml -version: "3" -services: - iotdb-datanode: - image: iotdb-enterprise:2.0.x.x-standalone #The image used - hostname: iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - container_name: iotdb-datanode - command: ["bash", "-c", "entrypoint.sh datanode"] - restart: always - ports: - - "6667:6667" - privileged: true - environment: - - dn_rpc_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - dn_internal_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - dn_seed_config_node=iotdb-1:10710 #The default first node is the seed node - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - schema_replication_factor=3 #Number of metadata copies - - data_replication_factor=2 #Number of data replicas - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #Using the host network - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -### 3.4 Starting Confignode For The First Time - -First, start configNodes on each of the three servers to obtain the machine code. Pay attention to the startup order, start the first iotdb-1 first, then start iotdb-2 and iotdb-3. 
- -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d #Background startup -``` - -### 3.5 Apply For Activation - -- After the three ConfigNodes are started for the first time, a system_info file is generated in the `/docker-iotdb/iotdb/activation` directory on each host. Send the system_info files of all three servers to the Timecho staff; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- Put each of the three license files into the `/docker-iotdb/iotdb/activation` folder of the corresponding ConfigNode; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -- After the license is placed in the corresponding activation folder, the ConfigNode is activated automatically; no restart is required - -### 3.6 Start Datanode - -Start the DataNode on each of the 3 servers - -```Bash -cd /docker-iotdb -docker-compose -f datanode.yml up -d #Background startup -``` - -![](/img/%E9%9B%86%E7%BE%A4%E7%89%88-dn%E5%90%AF%E5%8A%A8.png) - -### 3.7 Validate Deployment - -- View the logs; the following message indicates that the DataNode has started successfully - - ```Bash - docker logs -f iotdb-datanode #View log command - 2024-07-20 16:50:48,937 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
- ``` - - ![](/img/dn%E5%90%AF%E5%8A%A8.png) - -- Enter any container to view the service running status and activation information - - View the launched container - - ```Bash - docker ps - ``` - - ![](/img/%E6%9F%A5%E7%9C%8B%E5%AE%B9%E5%99%A8.png) - - Enter the container, log in to the database through CLI, and use the `show cluster` command to view the service status and activation status - - ```Bash - docker exec -it iotdb-datanode /bin/bash #Entering the container - ./start-cli.sh -h iotdb-1 #Log in to the database - IoTDB> show cluster #View status - ``` - - You can see that all services are running and the activation status shows as activated. - - ![](/img/%E9%9B%86%E7%BE%A4-%E6%BF%80%E6%B4%BB.png) - -### 3.8 Map /conf Directory (optional) - -If you want to modify the configuration files directly on the host machine in the future, you can map the `/conf` folder in the container in three steps: - -Step 1: Copy the `/conf` directory from the container to `/docker-iotdb/iotdb/conf` on each of the three servers - -```Bash -docker cp iotdb-confignode:/iotdb/conf /docker-iotdb/iotdb/conf -# or -docker cp iotdb-datanode:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -Step 2: Add `/conf` directory mapping in `confignode.yml` and `datanode.yml` on 3 servers - -```Bash -#confignode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - -#datanode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -Step 3: Restart IoTDB on 3 servers - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d -docker-compose -f datanode.yml up -d -``` - diff --git a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md b/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md deleted file mode 100644 index 8bf34b405..000000000 --- a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md +++ /dev/null @@ -1,204 +0,0 @@ - -# Dual Active Deployment - -## 1. What Is the Dual Active Version? - -Dual active usually refers to two independent machines (or clusters) that perform real-time mirror synchronization. Their configurations are completely independent and both can receive external writes at the same time. Each independent machine (or cluster) synchronizes the data written to itself to the other machine (or cluster), and the data of the two machines (or clusters) reaches eventual consistency. - -- Two standalone machines (or clusters) can form a high availability group: when one of the standalone machines (or clusters) stops serving, the other standalone machine (or cluster) will not be affected. When the machine (or cluster) that stopped serving is restarted, the other machine (or cluster) will synchronize the newly written data to it. 
Applications can connect to both standalone machines (or clusters) for read and write operations, thereby achieving high availability. -- The dual active deployment scheme allows for high availability with fewer than 3 physical nodes and has certain advantages in deployment costs. At the same time, the two sets of machines (or clusters) can be physically isolated in power supply and network through dual-ring power and network setups, ensuring stable operation. -- At present, the dual active capability is a feature of the enterprise version. - -![](/img/20240731104336.png) - -## 2. Note - -1. It is recommended to use `hostname` rather than a raw IP address in the configuration during deployment, to avoid database failures caused by a later change of the host IP. To set the hostname, you need to configure `/etc/hosts` on the target server. If the local IP is 192.168.1.3 and the hostname is iotdb-1, you can use the following command to set the server's hostname, and then configure IoTDB's `cn_internal_address` and `dn_internal_address` using the hostname. - - ```Bash - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -2. Some parameters cannot be modified after the first startup; please refer to the "Installation Steps" section below to set them. - -3. It is recommended to deploy a monitoring panel, which can monitor important operational indicators and keep track of database operation status at any time. The monitoring panel can be obtained by contacting the business department. For the deployment steps, refer to [Monitoring Panel Deployment](https://www.timecho.com/docs/UserGuide/latest/Deployment-and-Maintenance/Monitoring-panel-deployment.html) - -## 3. Installation Steps - -Taking the dual active version IoTDB built by two single machines A and B as an example, the IP addresses of A and B are 192.168.1.3 and 192.168.1.4, respectively. Here, we use hostname to represent different hosts. 
The plan is as follows: - -| Machine | Machine IP | Host Name | -| ------- | ----------- | --------- | -| A | 192.168.1.3 | iotdb-1 | -| B | 192.168.1.4 | iotdb-2 | - -### 3.1 Install Two Independent IoTDBs Separately - -Install IoTDB on the two machines separately. For the standalone version, refer to [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md). For the cluster version, refer to [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md). **It is recommended that the configurations of clusters A and B remain consistent to achieve the best dual active effect** - -### 3.2 Create A Data Synchronization Task On Machine A To Machine B - -- Create a data synchronization process on machine A, where the data on machine A is automatically synchronized to machine B. Use the cli tool in the sbin directory to connect to the IoTDB database on machine A: - - ```Bash - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-1 - - # Windows - # Before version V2.0.4.x - .\sbin\start-cli.bat -h iotdb-1 - - # V2.0.4.x and later versions - .\sbin\windows\start-cli.bat -h iotdb-1 - ``` - -- Create and start the data synchronization pipe with the following SQL: - - ```Bash - create pipe AB - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-2', - 'sink.port'='6667' - ) - ``` - -- Note: To avoid infinite data loops, it is necessary to set the parameter `source.forwarding-pipe-requests` on both A and B to `false`, indicating that data transmitted from another pipe will not be forwarded. - -### 3.3 Create A Data Synchronization Task On Machine B To Machine A - -- Create a data synchronization process on machine B, where the data on machine B is automatically synchronized to machine A. 
Use the cli tool in the sbin directory to connect to the IoTDB database on machine B: - - ```Bash - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-2 - - # Windows - # Before version V2.0.4.x - .\sbin\start-cli.bat -h iotdb-2 - - # V2.0.4.x and later versions - .\sbin\windows\start-cli.bat -h iotdb-2 - ``` - - Create and start the pipe with the following SQL: - - ```Bash - create pipe BA - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-1', - 'sink.port'='6667' - ) - ``` - -- Note: To avoid infinite data loops, it is necessary to set the parameter `source.forwarding-pipe-requests` on both A and B to `false`, indicating that data transmitted from another pipe will not be forwarded. - -### 3.4 Validate Deployment - -After the above data synchronization process is created, the dual active cluster can be started. - -#### Check the running status of the cluster - -```Bash -#Execute the show cluster command on two nodes respectively to check the status of IoTDB service -show cluster -``` - -**Machine A**: - -![](/img/%E5%8F%8C%E6%B4%BB-A.png) - -**Machine B**: - -![](/img/%E5%8F%8C%E6%B4%BB-B.png) - -Ensure that every ConfigNode and DataNode is in the Running state. - -#### Check synchronization status - -- Check the synchronization status on machine A - -```Bash -show pipes -``` - -![](/img/show%20pipes-A.png) - -- Check the synchronization status on machine B - -```Bash -show pipes -``` - -![](/img/show%20pipes-B.png) - -Ensure that every pipe is in the RUNNING state. 
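The two `show pipes` checks above can also be scripted. The following is a minimal sketch, not an official tool: it assumes that each pipe row in the `show pipes` output contains its state as the word `RUNNING` or `STOPPED`, and the function name `all_pipes_running` is hypothetical.

```Bash
#!/usr/bin/env bash
# Sketch: decide from `show pipes` output whether every pipe is RUNNING.
# Assumption: each pipe row contains its state as the word RUNNING or STOPPED.

all_pipes_running() {
  local output
  output=$(cat)                      # read the CLI output from stdin
  # Fail if any row reports STOPPED.
  if grep -qw 'STOPPED' <<<"$output"; then
    return 1
  fi
  # Require at least one RUNNING row so that empty output is not treated as healthy.
  grep -qw 'RUNNING' <<<"$output"
}

# Hypothetical usage on machine A (the CLI's -e option runs one statement and exits):
#   ./sbin/start-cli.sh -h iotdb-1 -e "show pipes" | all_pipes_running && echo "pipes OK"
```

Reading from stdin keeps the check independent of how the CLI is invoked, so the same function can be reused for machine B by changing only the `-h` host.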
- -### 3.5 Stop Dual Active Version IoTDB - -- Execute the following command on machine A: - - ```SQL - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-1 #Log in to CLI - IoTDB> stop pipe AB #Stop the data synchronization process - ./sbin/stop-standalone.sh #Stop database service - - # Windows - # Before version V2.0.4.x - .\sbin\start-cli.bat -h iotdb-1 - IoTDB> stop pipe AB - .\sbin\stop-standalone.bat - - # V2.0.4.x and later versions - .\sbin\windows\start-cli.bat -h iotdb-1 - IoTDB> stop pipe AB - .\sbin\windows\stop-standalone.bat - ``` - -- Execute the following command on machine B: - - ```SQL - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-2 #Log in to CLI - IoTDB> stop pipe BA #Stop the data synchronization process - ./sbin/stop-standalone.sh #Stop database service - - # Windows - # Before version V2.0.4.x - .\sbin\start-cli.bat -h iotdb-2 - IoTDB> stop pipe BA - .\sbin\stop-standalone.bat - - # V2.0.4.x and later versions - .\sbin\windows\start-cli.bat -h iotdb-2 - IoTDB> stop pipe BA - .\sbin\windows\stop-standalone.bat - ``` - diff --git a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/IoTDB-Package_timecho.md b/src/UserGuide/Master/Tree/Deployment-and-Maintenance/IoTDB-Package_timecho.md deleted file mode 100644 index c2bffcf22..000000000 --- a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/IoTDB-Package_timecho.md +++ /dev/null @@ -1,48 +0,0 @@ - -# Obtain TimechoDB - -## 1. How to obtain TimechoDB - -The TimechoDB installation package can be obtained through product trial application or by directly contacting the Timecho team. - -## 2. 
Installation Package Structure - -After unpacking the installation package (`iotdb-enterprise-{version}-bin.zip`), you will see the following directory structure: - -| **Name** | **Type** | **Description** | -| :--------------- | :------- | :----------------------------------------------------------- | -| activation | Folder | Directory for activation files, including the generated machine code and the TimechoDB activation code obtained from Timecho staff. *(This directory is generated after starting the ConfigNode, enabling you to obtain the activation code.)* | -| conf | Folder | Configuration files directory, containing ConfigNode, DataNode, JMX, and logback configuration files. | -| data | Folder | Default data file directory, containing data files for ConfigNode and DataNode. *(This directory is generated after starting the program.)* | -| lib | Folder | Library files directory. | -| licenses | Folder | Directory for open-source license certificates. | -| logs | Folder | Default log file directory, containing log files for ConfigNode and DataNode. *(This directory is generated after starting the program.)* | -| sbin | Folder | Main scripts directory, containing scripts for starting, stopping, and managing the database. | -| tools | Folder | Tools directory. | -| ext | Folder | Directory for pipe, trigger, and UDF plugin-related files. | -| LICENSE | File | Open-source license file. | -| NOTICE | File | Open-source notice file. | -| README_ZH.md | File | User manual (Chinese version). | -| README.md | File | User manual (English version). | -| RELEASE_NOTES.md | File | Release notes. | - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the MQTT service and REST service JAR files by default. If you need to use them, please contact the Timecho team to obtain them. 
\ No newline at end of file diff --git a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Kubernetes_timecho.md b/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Kubernetes_timecho.md deleted file mode 100644 index 14b51ab84..000000000 --- a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Kubernetes_timecho.md +++ /dev/null @@ -1,445 +0,0 @@ - - -# Kubernetes - -## 1. Environment Preparation - -### 1.1 Prepare a Kubernetes Cluster - -Ensure that you have an available Kubernetes cluster (minimum recommended version: Kubernetes 1.24) as the foundation for deploying the IoTDB cluster. - -Kubernetes Version Requirement: The recommended version is Kubernetes 1.24 or above. - -IoTDB Version Requirement: The version of TimechoDB must not be lower than v1.3.3.2. - -## 2. Create Namespace - -### 2.1 Create Namespace - -> Note: Before executing the namespace creation operation, verify that the specified namespace name has not been used in the Kubernetes cluster. If the namespace already exists, the creation command will fail, which may lead to errors during the deployment process. - -```Bash -kubectl create ns iotdb-ns -``` - -### 2.2 View Namespace - -```Bash -kubectl get ns -``` - -## 3. Create PersistentVolume (PV) - -### 3.1 Create PV Configuration File - -PV is used for persistent storage of IoTDB's ConfigNode and DataNode data. You need to create one PV for each node. - -> Note: One ConfigNode and one DataNode count as two nodes, requiring two PVs. - -For example, with 3 ConfigNodes and 3 DataNodes: - -1. Create a `pv.yaml` file and make six copies, renaming them to `pv01.yaml` through `pv06.yaml`. - -```Bash -# Create a directory to store YAML files -# Create pv.yaml file -touch pv.yaml -``` - -2. Modify the `name` and `path` in each file to ensure consistency. 
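The two steps above (copying `pv.yaml` six times and adjusting `name` and `path` in each copy) can be sketched as a small shell loop. This is only an illustration: the output directory `pv-yaml` and the `OUT_DIR` variable are hypothetical names, and the field values mirror the `pv.yaml` example in this section, so adjust them to your own storage setup.

```Bash
#!/usr/bin/env bash
# Sketch: generate pv01.yaml ... pv06.yaml, one PV per ConfigNode/DataNode.
set -euo pipefail

out_dir="${OUT_DIR:-./pv-yaml}"   # assumed output directory (hypothetical name)
mkdir -p "$out_dir"

for i in 1 2 3 4 5 6; do
  n=$(printf '%02d' "$i")         # zero-padded index: 01, 02, ...
  # Write one PV definition; only name and path differ between files.
  cat > "$out_dir/pv${n}.yaml" <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: iotdb-pv-${n}
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  hostPath:
    path: /data/k8s-data/iotdb-pv-${n}
    type: DirectoryOrCreate
EOF
done

ls "$out_dir"
```

Generating the files from one template keeps `name` and `path` consistent with each other, which is the point of step 2.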
- -**pv.yaml Example:** - -```YAML -# pv.yaml -apiVersion: v1 -kind: PersistentVolume -metadata: - name: iotdb-pv-01 -spec: - capacity: - storage: 10Gi # Storage capacity - accessModes: # Access modes - - ReadWriteOnce - persistentVolumeReclaimPolicy: Retain # Reclaim policy - # Storage class name, if using local static storage, do not configure; if using dynamic storage, this must be set - storageClassName: local-storage - # Add the corresponding configuration based on your storage type - hostPath: # If using a local path - path: /data/k8s-data/iotdb-pv-01 - type: DirectoryOrCreate # If this line is not configured, you need to manually create the directory -``` - -### 3.2 Apply PV Configuration - -```Bash -kubectl apply -f pv01.yaml -kubectl apply -f pv02.yaml -... -``` - -### 3.3 View PV - -```Bash -kubectl get pv -``` - - -### 3.4 Manually Create Directories - -> Note: If the type in the hostPath of the YAML file is not configured, you need to manually create the corresponding directories. - -Create the corresponding directories on all Kubernetes nodes: -```Bash -mkdir -p /data/k8s-data/iotdb-pv-01 -mkdir -p /data/k8s-data/iotdb-pv-02 -... -``` - -## 4. Install Helm - -For installation steps, please refer to the [Helm Official Website](https://helm.sh/zh/docs/intro/install/). - -## 5. Configure IoTDB Helm Chart - -### 5.1 Clone IoTDB Kubernetes Deployment Code - -Please contact Timecho staff to obtain the IoTDB Helm Chart. 
If you encounter proxy issues, disable the proxy settings: - -### 5.2 Modify YAML Files - -> Ensure that the version used is supported (>=1.3.3.2): - -**values.yaml Example:** - -```YAML -nameOverride: "iotdb" -fullnameOverride: "iotdb" # Name after installation - -image: - repository: nexus.infra.timecho.com:8143/timecho/iotdb-enterprise - pullPolicy: IfNotPresent - tag: 1.3.3.2-standalone # Repository and version used - -storage: - # Storage class name, if using local static storage, do not configure; if using dynamic storage, this must be set - className: local-storage - -datanode: - name: datanode - nodeCount: 3 # Number of DataNode nodes - enableRestService: true - storageCapacity: 10Gi # Available space for DataNode - resources: - requests: - memory: 2Gi # Initial memory size for DataNode - cpu: 1000m # Initial CPU size for DataNode - limits: - memory: 4Gi # Maximum memory size for DataNode - cpu: 1000m # Maximum CPU size for DataNode - -confignode: - name: confignode - nodeCount: 3 # Number of ConfigNode nodes - storageCapacity: 10Gi # Available space for ConfigNode - resources: - requests: - memory: 512Mi # Initial memory size for ConfigNode - cpu: 1000m # Initial CPU size for ConfigNode - limits: - memory: 1024Mi # Maximum memory size for ConfigNode - cpu: 2000m # Maximum CPU size for ConfigNode - configNodeConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - schemaReplicationFactor: 3 - schemaRegionConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - dataReplicationFactor: 2 - dataRegionConsensusProtocolClass: org.apache.iotdb.consensus.iot.IoTConsensus -``` - -## 6. Configure Private Repository Information or Pre-Pull Images - -Configure private repository information on k8s as a prerequisite for the next helm install step. - -Option one is to pull the available IoTDB images during `helm install`, while option two is to import the available IoTDB images into containerd in advance. 
- -### 6.1 [Option 1] Pull Image from Private Repository - -#### 6.1.1 Create a Secret to Allow k8s to Access the IoTDB Helm Private Repository - -Replace xxxxxx with the IoTDB private repository account, password, and email. - - - -```Bash -# Note the single quotes -kubectl create secret docker-registry timecho-nexus \ - --docker-server='nexus.infra.timecho.com:8143' \ - --docker-username='xxxxxx' \ - --docker-password='xxxxxx' \ - --docker-email='xxxxxx' \ - -n iotdb-ns - -# View the secret -kubectl get secret timecho-nexus -n iotdb-ns -# View and output as YAML -kubectl get secret timecho-nexus --output=yaml -n iotdb-ns -# View and decrypt -kubectl get secret timecho-nexus --output="jsonpath={.data.\.dockerconfigjson}" -n iotdb-ns | base64 --decode -``` - -#### 6.1.2 Load the Secret as a Patch to the Namespace iotdb-ns - -```Bash -# Add a patch to include login information for nexus in this namespace -kubectl patch serviceaccount default -n iotdb-ns -p '{"imagePullSecrets": [{"name": "timecho-nexus"}]}' - -# View the information in this namespace -kubectl get serviceaccounts -n iotdb-ns -o yaml -``` - -### 6.2 [Option 2] Import Image - -This step is for scenarios where the customer cannot connect to the private repository and requires assistance from company implementation staff. - -#### 6.2.1 Pull and Export the Image: - -```Bash -ctr images pull --user xxxxxxxx nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.2 View and Export the Image: - -```Bash -# View -ctr images ls - -# Export -ctr images export iotdb-enterprise:1.3.3.2-standalone.tar nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.3 Import into the k8s Namespace: - -> Note that k8s.io is the namespace for ctr in the example environment; importing to other namespaces will not work. 
- -```Bash -# Import into the k8s namespace -ctr -n k8s.io images import iotdb-enterprise:1.3.3.2-standalone.tar -``` - -#### 6.2.4 View the Image: - -```Bash -ctr --namespace k8s.io images list | grep 1.3.3.2 -``` - -## 7. Install IoTDB - -### 7.1 Install IoTDB - -```Bash -# Enter the directory -cd iotdb-cluster-k8s/helm - -# Install IoTDB -helm install iotdb ./ -n iotdb-ns -``` - -### 7.2 View Helm Installation List - -```Bash -# helm list -helm list -n iotdb-ns -``` - -### 7.3 View Pods - -```Bash -# View IoTDB pods -kubectl get pods -n iotdb-ns -o wide -``` - -After executing the command, if the output shows 6 Pods with confignode and datanode labels (3 each), it indicates a successful installation. Note that not all Pods may be in the Running state initially; inactive datanode Pods may keep restarting but will normalize after activation. - -### 7.4 Troubleshooting - -```Bash -# View k8s creation logs -kubectl get events -n iotdb-ns -watch kubectl get events -n iotdb-ns - -# Get detailed information -kubectl describe pod confignode-0 -n iotdb-ns -kubectl describe pod datanode-0 -n iotdb-ns - -# View ConfigNode logs -kubectl logs -n iotdb-ns confignode-0 -f -``` - -## 8. Activate IoTDB - -### 8.1 Option 1: Activate Directly in the Pod (Quickest) - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-1 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-2 -- /iotdb/sbin/start-activate.sh -# Obtain the machine code and proceed with activation -``` - -### 8.2 Option 2: Activate Inside the ConfigNode Container - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /bin/bash -cd /iotdb/sbin -/bin/bash start-activate.sh -# Obtain the machine code and proceed with activation -# Exit the container -``` - -### 8.3 Option 3: Manual Activation - -1. 
View ConfigNode details to determine the node: - -```Bash -kubectl describe pod confignode-0 -n iotdb-ns | grep -e "Node:" -e "Path:" - -# Example output: -# Node: a87/172.20.31.87 -# Path: /data/k8s-data/env/confignode/.env -``` - -2. View PVC and find the corresponding Volume for ConfigNode to determine the path: - -```Bash -kubectl get pvc -n iotdb-ns | grep "confignode-0" -# Example output: -# map-confignode-confignode-0 Bound iotdb-pv-04 10Gi RWO local-storage 8h - -# To view multiple ConfigNodes, use the following: -for i in {0..2}; do echo confignode-$i; kubectl describe pod confignode-${i} -n iotdb-ns | grep -e "Node:" -e "Path:"; done -``` - -3. View the detailed information of the corresponding Volume to determine the physical directory location: - - -```Bash -kubectl describe pv iotdb-pv-04 | grep "Path:" - -# Example output: -# Path: /data/k8s-data/iotdb-pv-04 -``` - -4. Locate the system-info file in the corresponding directory on the corresponding node, use this system-info as the machine code to generate an activation code, and create a new file named license in the same directory, writing the activation code into this file. - -## 9. Verify IoTDB - -### 9.1 Check the Status of Pods within the Namespace - -View the IP, status, and other information of the pods in the iotdb-ns namespace to ensure they are all running normally. 
- -```Bash -kubectl get pods -n iotdb-ns -o wide - -# Example output: -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` - -### 9.2 Check the Port Mapping within the Namespace - -```Bash -kubectl get svc -n iotdb-ns - -# Example output: -# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE -# confignode-svc NodePort 10.10.226.151 80:31026/TCP 7d8h -# datanode-svc NodePort 10.10.194.225 6667:31563/TCP 7d8h -# jdbc-balancer LoadBalancer 10.10.191.209 6667:31895/TCP 7d8h -``` - -### 9.3 Start the CLI Script on Any Server to Verify the IoTDB Cluster Status - -Use the port of jdbc-balancer and the IP of any k8s node. - -```Bash -start-cli.sh -h 172.20.31.86 -p 31895 -start-cli.sh -h 172.20.31.87 -p 31895 -start-cli.sh -h 172.20.31.88 -p 31895 -``` - - - -## 10. Scaling - -### 10.1 Add New PV - -Add a new PV; scaling is only possible with available PVs. - - - -**Note: DataNode cannot join the cluster after restart** - -**Reason**:The static storage hostPath mode is configured, and the script modifies the `iotdb-system.properties` file to set `dn_data_dirs` to `/iotdb6/iotdb_data,/iotdb7/iotdb_data`. However, the default storage path `/iotdb/data` is not mounted, leading to data loss upon restart. -**Solution**:Mount the `/iotdb/data` directory as well, and ensure this setting is applied to both ConfigNode and DataNode to maintain data integrity and cluster stability. - -### 10.2 Scale ConfigNode - -Example: Scale from 3 ConfigNodes to 4 ConfigNodes - -Modify the values.yaml file in iotdb-cluster-k8s/helm to change the number of ConfigNodes from 3 to 4. - -```Shell -helm upgrade iotdb . 
-n iotdb-ns -``` - - - - -### 10.3 Scale DataNode - -Example: Scale from 3 DataNodes to 4 DataNodes - -Modify the values.yaml file in iotdb-cluster-k8s/helm to change the number of DataNodes from 3 to 4. - -```Shell -helm upgrade iotdb . -n iotdb-ns -``` - -### 10.4 Verify IoTDB Status - -```Shell -kubectl get pods -n iotdb-ns -o wide - -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -# datanode-3 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md b/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md deleted file mode 100644 index 608491c6f..000000000 --- a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md +++ /dev/null @@ -1,320 +0,0 @@ - -# Stand-Alone Deployment - -This guide introduces how to set up a standalone TimechoDB instance, which includes one ConfigNode and one DataNode (commonly referred to as 1C1D). - -## 1. Prerequisites - -1. [System configuration](./Environment-Requirements.md): Ensure the system has been configured according to the preparation guidelines. - -2. **IP Configuration**: It is recommended to use hostnames for IP configuration to prevent issues caused by IP address changes. Set the hostname by editing the `/etc/hosts` file. For example, if the local IP is `192.168.1.3` and the hostname is `iotdb-1`, run: - - ```shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - - Use the hostname for `cn_internal_address` and `dn_internal_address` in IoTDB configuration. - -3. 
**Unmodifiable Parameters**: Some parameters cannot be changed after the first startup. Refer to the Parameter Configuration section. - -4. **Installation Path**: Ensure the installation path contains no spaces or non-ASCII characters to prevent runtime issues. - -5. - **User Permissions**: Choose one of the following permissions during installation and deployment: - - **Root User (Recommended)**: This avoids permission-related issues. - - **Non-Root User**: - - Use the same user for all operations, including starting, activating, and stopping services. - - Avoid using `sudo`, which can cause permission conflicts. - -6. **Monitoring Panel**: Deploy a monitoring panel to track key performance metrics. Contact the Timecho team for access and refer to the "[Monitoring Board Install and Deploy](./Monitoring-panel-deployment.md)" guide. - -7. **Health Check Tool**: Before installation, the health check tool can help inspect the operating environment of IoTDB nodes and obtain detailed inspection results. The usage method of the IoTDB health check tool can be found in:[Health Check Tool](../Tools-System/Health-Check-Tool.md). - - -## 2. Installation Steps - -### 2.1 Pre-installation Check - -To ensure the IoTDB Enterprise Edition installation package you obtained is complete and authentic, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Find the "SHA512 Checksum" corresponding to each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/iotdb): - ```Bash - cd /data/iotdb - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. 
The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-02.png) - -4. Compare the output result with the official SHA512 checksum. Once confirmed that they match, you can proceed with the installation and deployment operations in accordance with the procedures below. - -#### Notes: - -- If the verification results do not match, please contact Timecho Team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - -### 2.2 Extract Installation Package - -Unzip the installation package and navigate to the directory: - -```Plain -unzip timechodb-{version}-bin.zip -cd timechodb-{version}-bin -``` - -### 2.3 Parameter Configuration - -#### Memory Configuration - -Edit the following files for memory allocation: - -- **ConfigNode**: `conf/confignode-env.sh` (or `.bat` for Windows) - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------ | :---------------------------------- | :---------- | :-------------- | :---------------------- | -| MEMORY_SIZE | Total memory allocated for the node | Automatically calculated based on system memory, defaulting to 30% of the system memory. | As needed | Save changes without immediate execution; modifications take effect after service restart. | - -- **DataNode**: `conf/datanode-env.sh` (or `.bat` for Windows) - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------ | :---------------------------------- |:-----------------------------------------------------------------------------------------| :-------------- | :---------------------- | -| MEMORY_SIZE | Total memory allocated for the node | Automatically calculated based on system memory, defaulting to 50% of the system memory. 
| As needed | Save changes without immediate execution; modifications take effect after service restart. | - - -#### General Configuration - -Set the following parameters in `conf/iotdb-system.properties`. Refer to `conf/iotdb-system.properties.template` for a complete list. - -**Cluster-Level Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------------------ | :-------------------------- | :------------- | :-------------- | :----------------------------------------------------------- | -| cluster_name | Name of the cluster | defaultCluster | Customizable | Support hot loading, but it is not recommended to change the cluster name by manually modifying the configuration file | -| schema_replication_factor | Number of metadata replicas | 1 | 1 | In standalone mode, set this to 1. This value cannot be modified after the first startup. | -| data_replication_factor | Number of data replicas | 1 | 1 | In standalone mode, set this to 1. This value cannot be modified after the first startup. | - -**ConfigNode Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------------------- | :--------------------------------------------------------- | -| cn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. | This parameter cannot be modified after the first startup. | -| cn_internal_port | Port used for internal communication within the cluster | 10710 | 10710 | This parameter cannot be modified after the first startup. | -| cn_consensus_port | Port used for consensus protocol communication among ConfigNode replicas | 10720 | 10720 | This parameter cannot be modified after the first startup. 
| -| cn_seed_config_node | Address of the ConfigNode for registering and joining the cluster. (e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Use `cn_internal_address:cn_internal_port` | This parameter cannot be modified after the first startup. | - -**DataNode** **Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------------------------ | :----------------------------------------------------------- | :-------------- |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :--------------------------------------------------------- | -| dn_rpc_address | Address for the client RPC service | 127.0.0.1 | By default, the local machine can directly access it. For non-local access, please modify this configuration item to the IPv4 address or hostname of the server where it is located. It is recommended to use the IPv4 address of the server where it is located. | Effective after restarting the service. | -| dn_rpc_port | Port for the client RPC service | 6667 | 6667 | Effective after restarting the service. | -| dn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. | This parameter cannot be modified after the first startup. | -| dn_internal_port | Port used for internal communication within the cluster | 10730 | 10730 | This parameter cannot be modified after the first startup. | -| dn_mpp_data_exchange_port | Port used for receiving data streams | 10740 | 10740 | This parameter cannot be modified after the first startup. 
| -| dn_data_region_consensus_port | Port used for data replica consensus protocol communication | 10750 | 10750 | This parameter cannot be modified after the first startup. | -| dn_schema_region_consensus_port | Port used for metadata replica consensus protocol communication | 10760 | 10760 | This parameter cannot be modified after the first startup. | -| dn_seed_config_node | Address of the ConfigNode for registering and joining the cluster. (e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Use `cn_internal_address:cn_internal_port` | This parameter cannot be modified after the first startup. | - -### 2.4 Start ConfigNode - -Navigate to the `sbin` directory and start ConfigNode: - -```Bash -# Unix/OS X -./sbin/start-confignode.sh -d # The "-d" flag starts the process in the background. - -# Windows -# Before version V2.0.4.x -.\sbin\start-confignode.bat - -# V2.0.4.x and later versions -.\sbin\windows\start-confignode.bat -``` - - If the startup fails, refer to the [**Common Problem**](#Common Problem) section below for troubleshooting. - -### 2.5 Start DataNode - -Navigate to the `sbin` directory of IoTDB and start the DataNode: - -````shell -# Unix/OS X -./sbin/start-datanode.sh -d # The "-d" flag starts the process in the background. - -# Windows -# Before version V2.0.4.x -.\sbin\start-datanode.bat - -# V2.0.4.x and later versions -.\sbin\windows\start-datanode.bat -```` - -### 2.6 Activate Database - -#### Option 1: Command-Based Activation - -1. Enter the IoTDB CLI. 
- -The Linux and MacOS system startup commands are as follows: - -```shell -# Before version V2.0.6.x -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x and later versions -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -The Windows system startup commands are as follows: - -```shell -# Before version V2.0.4.x -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.4.x and later versions, before version V2.0.6.x -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x and later versions -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -2. Execute the following command to obtain the machine code required for activation: - -```SQL -show system info -``` -```Bash -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -3. Execute the following statement to obtain the version number of the database to be activated: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.9.2| 5ea21bc| -+-------+---------+ -Total line number = 1 -``` - -4. Provide the obtained machine code and version number to the Timecho team. - -5. Enter the activation codes provided by the Timecho team in the CLI in sequence using the following format. Wrap the activation code in single quotes ('): - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - -#### Option 2: File-Based Activation - -1. 
After starting the ConfigNode and DataNode, enter the `activation` folder and send the `system_info` file to the Timecho team.
-2. Receive the `license` file returned by the staff.
-3. Place the `license` file into the `activation` folder of the corresponding node.
-
-### 2.7 Verify Activation
-
-In the CLI, run the `show activation` command and check the `ClusterActivationStatus` field. If it shows `ACTIVATED`, the database has been successfully activated.
-
-![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81.png)
-
-## 3. Common Problem
-
-1. Activation Fails Repeatedly
-    1. Use the `ls -al` command to verify that the ownership of the installation directory matches the current user.
-    2. Check the ownership of all files in the `./activation` directory to ensure they belong to the current user.
-2. ConfigNode Fails to Start
-    1. Review the startup logs to check whether any parameter that cannot be modified after the first startup was changed.
-    2. Check the logs for any other errors. If the problem remains unresolved, contact technical support for assistance.
-    3. If the deployment is fresh or the data can be discarded, clean the environment and redeploy using the following steps:
-
- **Clean the Environment**
-
-1. Stop all ConfigNode and DataNode processes:
-
-```Bash
-sbin/stop-standalone.sh
-```
-
-2. Check for any remaining processes:
-
-```Bash
-jps
-# or
-ps -ef | grep iotdb
-```
-
-3. If processes remain, terminate them manually:
-
-```Bash
-# Replace <pid> with a process ID reported by the previous step
-kill -9 <pid>
-
-# For systems with a single IoTDB instance, you can clean up residual processes with:
-ps -ef | grep iotdb | grep -v grep | tr -s ' ' ' ' | cut -d ' ' -f2 | xargs kill -9
-```
-
-4. Delete the `data` and `logs` directories:
-
-```Bash
-cd /data/iotdb
-rm -rf data logs
-```
-
-## 4.
Appendix - -### 4.1 ConfigNode Parameters - -| Parameter | Description | **Is it required** | -| :-------- | :---------------------------------------------------------- | :----------------- | -| -d | Starts the process in daemon mode (runs in the background). | No | - -### 4.2 DataNode Parameters - -| Parameter | Description | Required | -| :-------- | :----------------------------------------------------------- | :------- | -| -v | Displays version information. | No | -| -f | Runs the script in the foreground without backgrounding it. | No | -| -d | Starts the process in daemon mode (runs in the background). | No | -| -p | Specifies a file to store the process ID for process management. | No | -| -c | Specifies the path to the configuration folder; the script loads configuration files from this location. | No | -| -g | Prints detailed garbage collection (GC) information. | No | -| -H | Specifies the path for the Java heap dump file, used during JVM memory overflow. | No | -| -E | Specifies the file for JVM error logs. | No | -| -D | Defines system properties in the format `key=value`. | No | -| -X | Passes `-XX` options directly to the JVM. | No | -| -h | Displays the help instructions. | No | diff --git a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/workbench-deployment_timecho.md b/src/UserGuide/Master/Tree/Deployment-and-Maintenance/workbench-deployment_timecho.md deleted file mode 100644 index ab236c0d1..000000000 --- a/src/UserGuide/Master/Tree/Deployment-and-Maintenance/workbench-deployment_timecho.md +++ /dev/null @@ -1,273 +0,0 @@ - -# Workbench Deployment - -The visualization console is one of the supporting tools for IoTDB (similar to Navicat for MySQL). 
It is an official tool that supports database deployment, operation and maintenance, and application development, making the database simpler and more efficient to use, operate, and manage at low cost. This document walks you through installing Workbench.
-
- -The instructions for using the visualization console tool can be found in the [Instructions](../Tools-System/Monitor-Tool.md) section of the document. - -## 1. Installation Preparation - -| Preparation Content | Name | Version Requirements | Link | -| :----------------------: | :-------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | -| Operating System | Windows or Linux | - | - | -| Installation Environment | JDK | v1.5.4 and below require ≥ 1.8; v1.5.5 and above require ≥ 17. Choose the ARM or x64 installer according to your system. | https://www.oracle.com/java/technologies/downloads/ | -| Related Software | Prometheus | Requires installation of V2.30.3 and above. | https://prometheus.io/download/ | -| Database | IoTDB | Requires V1.2.0 Enterprise Edition and above | You can contact business or technical support to obtain | -| Console | IoTDB-Workbench-`` | - | You can choose according to the appendix version comparison table and contact business or technical support to obtain it | - -### Pre-installation Check - -To ensure the Workbench installation package you obtained is complete and valid, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Contact the Timecho Team to get it. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/workbench): - ```Bash - cd /data/workbench - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum IoTDB-Workbench-``.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-04.png) - -4. Compare the output result with the official SHA512 checksum. 
Once confirmed that they match, you can proceed with the installation and deployment operations in accordance with the procedures below. - -#### Notes: - -- If the verification results do not match, please contact the Timecho Team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - -## 2. Installation Steps - -### 2.1 IoTDB enables monitoring indicator collection - -1. Open the monitoring configuration item. The configuration items related to monitoring in IoTDB are disabled by default. Before deploying the monitoring panel, you need to open the relevant configuration items (note that the service needs to be restarted after enabling monitoring configuration). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-    | Configuration | Configuration file | Description |
-    | :--------------------------------- | :--------------------------- | :----------------------------------------------------------- |
-    | cn_metric_reporter_list | conf/iotdb-system.properties | Add this item to the configuration file and set the value to PROMETHEUS |
-    | cn_metric_level | conf/iotdb-system.properties | Add this item to the configuration file and set the value to IMPORTANT |
-    | cn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this item to the configuration file and keep the default value of 9091. If you set a different port, make sure it does not conflict with any port already in use |
-    | dn_metric_reporter_list | conf/iotdb-system.properties | Add this item to the configuration file and set the value to PROMETHEUS |
-    | dn_metric_level | conf/iotdb-system.properties | Add this item to the configuration file and set the value to IMPORTANT |
-    | dn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this item to the configuration file and keep the default value of 9092. If you set a different port, make sure it does not conflict with any port already in use |
-    | dn_metric_internal_reporter_type | conf/iotdb-system.properties | Add this item to the configuration file and set the value to IOTDB |
-    | enable_audit_log | conf/iotdb-system.properties | Add this item to the configuration file and set the value to true |
-    | audit_log_storage | conf/iotdb-system.properties | Add this item to the configuration file and set the values to IOTDB and LOGGER |
-    | audit_log_operation | conf/iotdb-system.properties | Add this item to the configuration file and set the values to DML,DDL,QUERY |
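
As a rough illustration, the settings listed in the table could be appended to the configuration file with a small shell helper. This is only a sketch under stated assumptions: the function name `enable_monitoring` is hypothetical, the comma-separated form `IOTDB,LOGGER` for `audit_log_storage` is an assumption about how the two values are written, and the helper skips any key that is already present instead of overwriting it.

```shell
#!/usr/bin/env bash
# Sketch only: append the monitoring and audit settings from the table
# to an IoTDB properties file. enable_monitoring is a hypothetical helper name.

enable_monitoring() {
  local conf_file="$1"
  local entry key
  for entry in \
    "cn_metric_reporter_list=PROMETHEUS" \
    "cn_metric_level=IMPORTANT" \
    "cn_metric_prometheus_reporter_port=9091" \
    "dn_metric_reporter_list=PROMETHEUS" \
    "dn_metric_level=IMPORTANT" \
    "dn_metric_prometheus_reporter_port=9092" \
    "dn_metric_internal_reporter_type=IOTDB" \
    "enable_audit_log=true" \
    "audit_log_storage=IOTDB,LOGGER" \
    "audit_log_operation=DML,DDL,QUERY"
  do
    # Everything before the first "=" is the property name.
    key="${entry%%=*}"
    # Leave keys that are already set untouched to avoid duplicate entries.
    if grep -q "^${key}=" "$conf_file" 2>/dev/null; then
      echo "skipped ${key} (already set)"
    else
      echo "$entry" >> "$conf_file"
    fi
  done
}

# Usage (path assumed relative to the IoTDB installation directory):
#   enable_monitoring conf/iotdb-system.properties
```

Because existing keys are skipped, running the helper twice leaves the file unchanged the second time. Review the file by hand afterwards if any of these keys were already set to other values, and restart the services for the changes to take effect.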
-
-2. Restart all nodes. After modifying the monitoring configuration on all three nodes, restart the ConfigNode and DataNode services on every node:
-
-    ```shell
-    # Unix/OS X
-    ./sbin/stop-standalone.sh      # Stop ConfigNode and DataNode first
-    ./sbin/start-confignode.sh -d  # Start ConfigNode
-    ./sbin/start-datanode.sh -d    # Start DataNode
-
-    # Windows
-    # Before version V2.0.4.x
-    .\sbin\stop-standalone.bat
-    .\sbin\start-confignode.bat
-    .\sbin\start-datanode.bat
-
-    # V2.0.4.x and later versions
-    .\sbin\windows\stop-standalone.bat
-    .\sbin\windows\start-confignode.bat
-    .\sbin\windows\start-datanode.bat
-    ```
-
-3. After restarting, confirm the running status of each node through the client. If the status is Running, the configuration succeeded:
-
-    ![](/img/%E5%90%AF%E5%8A%A8.png)
-
-### 2.2 Install and configure Prometheus
-
-1. Download the Prometheus installation package (V2.30.3 or later) from the Prometheus official website (https://prometheus.io/docs/introduction/first_steps/).
-2. Unzip the installation package and enter the unzipped folder:
-
-    ```Shell
-    tar xvfz prometheus-*.tar.gz
-    cd prometheus-*
-    ```
-
-3. Modify the configuration file `prometheus.yml` as follows:
-    1. Add a `confignode` job to collect monitoring data from the ConfigNodes.
-    2. Add a `datanode` job to collect monitoring data from the DataNodes.
-
-    ```yaml
-    global:
-      scrape_interval: 15s
-      evaluation_interval: 15s
-    scrape_configs:
-      - job_name: "prometheus"
-        static_configs:
-          - targets: ["localhost:9090"]
-      - job_name: "confignode"
-        static_configs:
-          - targets: ["iotdb-1:9091","iotdb-2:9091","iotdb-3:9091"]
-        honor_labels: true
-      - job_name: "datanode"
-        static_configs:
-          - targets: ["iotdb-1:9092","iotdb-2:9092","iotdb-3:9092"]
-        honor_labels: true
-    ```
-
-4. Start Prometheus. By default, Prometheus retains monitoring data for 15 days.
In production environments, it is recommended to extend the retention period to 180 days or more so that historical monitoring data can be tracked over a longer time. The startup command is as follows:
-
-    ```Shell
-    ./prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=180d
-    ```
-
-5. Confirm successful startup. Open `http://IP:port` in a browser to reach Prometheus, then click Targets under Status. When every State shows Up, the configuration and connectivity are working.
-
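
The same check can be approximated from the command line. The sketch below assumes that Prometheus exposes its standard HTTP API endpoint `/api/v1/targets`, that the response is compact JSON containing `"health":"up"` style fields for active targets, and that Prometheus listens on `localhost:9090` as in the `prometheus.yml` example; the function name `summarize_target_health` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: summarize scrape-target health from the Prometheus HTTP API.
# Reads the JSON body of GET /api/v1/targets on stdin and prints one
# "<count> <health>" line per distinct health value.

summarize_target_health() {
  grep -o '"health":"[a-z]*"' \
    | cut -d'"' -f4 \
    | sort \
    | uniq -c \
    | awk '{print $1, $2}'
}

# Usage (host and port are assumptions, adjust to your deployment):
#   curl -s http://localhost:9090/api/v1/targets | summarize_target_health
```

If every configured ConfigNode and DataNode target is reachable, the output should contain only a line ending in `up`; any line ending in `down` or `unknown` points to a target that still needs attention.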
- - -### 2.3 Install Workbench - -1. Enter the config directory of iotdb Workbench -`` - -2. Modify Workbench configuration file: Go to the `config` folder and modify the configuration file `application-prod.properties`. If you are installing it locally, there is no need to modify it. If you are deploying it on a server, you need to modify the IP address - > Workbench can be deployed on a local or cloud server as long as it can connect to IoTDB - - | Configuration | Before Modification | After modification | - | ---------------- | ----------------------------------- | ----------------------------------------------- | - | pipe.callbackUrl | pipe.callbackUrl=`http://127.0.0.1` | pipe.callbackUrl=`http://` | - - ![](/img/workbench-conf-1.png) - -3. Startup program: Please execute the startup command in the sbin folder of IoTDB Workbench -`` - Windows: - ```shell - # Start Workbench in the background - start.bat -d - ``` - Linux: - ```shell - # Start Workbench in the background - ./start.sh -d - ``` -4. You can use the `jps` command to check if the startup was successful, as shown in the figure: - - ![](/img/windows-jps.png) - -5. Verification successful: Open "`http://Server IP: Port in configuration file`" in the browser to access, for example:"`http://127.0.0.1:9190`" When the login interface appears, it is considered successful - - ![](/img/workbench-en.png) - - -### 2.4 Configure Instance Information - -1. 
Configure instance information: You only need to fill in the following information to connect to the instance - - ![](/img/workbench-en-1.jpeg) - - - | Field Name | Is It A Required Field | Field Meaning | Default Value | - | --------------- | ---------------------- | ------------------------------------------------------------ | ------ | - | Connection Type | Yes | The content filled in for different connection types varies, and supports selecting "single machine, cluster, dual active" | - | - | Instance Name | Yes | You can distinguish different instances based on their names, with a maximum input of 50 characters | - | - | Instance | Yes | Fill in the database address (`dn_rpc_address` field in the `iotdb/conf/iotdb-system.properties` file) and port number (`dn_rpc_port` field). Note: For clusters and dual active devices, clicking the "+" button supports entering multiple instance information | - | - | Prometheus | No | Fill in `http://:/app/v1/query` to view some monitoring information on the homepage. We recommend that you configure and use it | - | - | Username | Yes | Fill in the username for IoTDB, supporting input of 4 to 32 characters, including uppercase and lowercase letters, numbers, and special characters (! @ # $% ^&* () _+-=) | root | - | Enter Password | No | Fill in the password for IoTDB. To ensure the security of the database, we will not save the password. Please fill in the password yourself every time you connect to the instance or test | root | - -2. Test the accuracy of the information filled in: You can perform a connection test on the instance information by clicking the "Test" button - - ![](/img/workbench-en-2.png) - -## 3. 
Appendix: IoTDB and Workbench Version Comparison Table - -| Version | Description | Supported IoTDB Versions | -|---------|-----------------------------------------------------------------------------------------------------------------------------|-------------------------------------| -| V2.0.1-beta | The first version of the V2.x series, supporting dual models of tree and table | V2.0 and above, The AI analysis module only supports versions above 2.0.5. | -| V1.5.7 | Optimize the point list by splitting point names into device names and points, ensure the point selection area supports horizontal scrolling, and align the export file column order with the page display. | All 1.x versions from V1.3.4 onward | -| V1.5.6 | Enhanced CSV import/export: optional tags/aliases on import; support for measurement descriptions with backtick-quoted quotes on export. | All 1.x versions from V1.3.4 onward | -| V1.5.5 | Added server clock functionality and support for activating Enterprise Edition license databases | All 1.x versions from V1.3.4 onward | -| V1.5.4 | Added authentication for Prometheus settings in Instance Management | All 1.x versions from V1.3.4 onward | -| V1.5.1 | Added AI analysis and pattern matching | All 1.x versions from V1.3.2 onward | -| V1.4.0 | Added tree model display and English UI | All 1.x versions from V1.3.2 onward | -| V1.3.1 | Enhanced analysis methods and import templates | All 1.x versions from V1.3.2 onward | -| V1.3.0 | Added DB configuration and UI refinements | All 1.x versions from V1.3.2 onward | -| V1.2.6 | Optimized permission controls | All 1.x versions from V1.3.1 onward | -| V1.2.5 | Added "Common Templates" and caching | All 1.x versions from V1.3.0 onward | -| V1.2.4 | Added import/export for calculations, time alignment field | All 1.x versions from V1.2.2 onward | -| V1.2.3 | Added activation details and analysis features | All 1.x versions from V1.2.2 onward | -| V1.2.2 | Optimized point description display | All 1.x 
versions from V1.2.2 onward | -| V1.2.1 | Added sync monitoring panel, Prometheus hints | All 1.x versions from V1.2.2 onward | -| V1.2.0 | Major Workbench upgrade | All 1.x versions from V1.2.0 onward | diff --git a/src/UserGuide/Master/Tree/Ecosystem-Integration/Ecosystem-Overview_timecho.md b/src/UserGuide/Master/Tree/Ecosystem-Integration/Ecosystem-Overview_timecho.md deleted file mode 100644 index 742d3ca85..000000000 --- a/src/UserGuide/Master/Tree/Ecosystem-Integration/Ecosystem-Overview_timecho.md +++ /dev/null @@ -1,53 +0,0 @@ - - -# Overview - -IoTDB Ecosystem Integration Bridges the Full Pipeline of Time-Series Data: -- Through data collection, it enables second-level device connectivity. -- Via data integration, it constructs cross-cloud pipelines. -- Leveraging programming frameworks, it accelerates business logic development. -- With computing engines, it accomplishes distributed processing. -- Through visualization and SQL development, it implements analytical strategies. -- Finally, by interfacing with IoT platforms, it achieves edge-cloud synergy—building a complete intelligent closed loop from the physical world to digital decision-making. 
-
-![](/img/eco-overview-n-en.png)
-
-The following documentation will help you quickly and comprehensively understand the usage of various integration tools at each stage:
-
-- Data Acquisition
-  - Telegraf [Telegraf Plugin](./Telegraf.md)
-- Data Integration
-  - NiFi [Apache NiFi](./NiFi-IoTDB.md)
-  - Kafka [Kafka](./Programming-Kafka.md)
-- Computing Engine
-  - Flink [Flink](./Flink-IoTDB.md)
-  - Spark [Spark](./Spark-IoTDB.md)
-- Visual Analytics
-  - Zeppelin [Zeppelin](./Zeppelin-IoTDB.md)
-  - Grafana [Grafana](./Grafana-Connector.md)
-  - Grafana Plugin [Grafana Plugin](./Grafana-Plugin.md)
-  - DataEase [DataEase](./DataEase.md)
-- SQL Development
-  - DBeaver [DBeaver](./DBeaver.md)
-- IoT Platform
-  - Ignition [Ignition](./Ignition-IoTDB-plugin_timecho.md)
-  - Thingsboard [Thingsboard](./Thingsboard.md)
\ No newline at end of file
diff --git a/src/UserGuide/Master/Tree/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md b/src/UserGuide/Master/Tree/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md
deleted file mode 100644
index ac82207e8..000000000
--- a/src/UserGuide/Master/Tree/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md
+++ /dev/null
@@ -1,275 +0,0 @@
-
-
-# Ignition
-
-## 1. Product Overview
-
-1. Introduction to Ignition
-
-   Ignition is a web-based supervisory control and data acquisition (SCADA) tool: an open, scalable, general-purpose platform. It makes it easier to control, track, display, and analyze all of your enterprise's data, enhancing business capabilities. For more details, please refer to the [Ignition Official Website](https://docs.inductiveautomation.com/docs/8.1/getting-started/introducing-ignition)
-
-2. Introduction to the Ignition-IoTDB Connector
-
-   The Ignition-IoTDB integration consists of two modules: the Ignition-IoTDB Connector and Ignition-IoTDB With JDBC.
-
-   - Ignition-IoTDB Connector: Stores data collected by Ignition in IoTDB and also supports reading data in Components. It injects script interfaces such as `system.iotdb.insert` and `system.iotdb.query` to facilitate programming in Ignition.
-   - Ignition-IoTDB With JDBC: Can be used in the `Transaction Groups` module for custom writing and querying. It is not applicable to the `Tag Historian` module.
-
-   The relationship between the two modules and Ignition is shown in the following figure.
-
-   ![](/img/20240703114443.png)
-
-## 2. Installation Requirements
-
-| **Preparation Content** | Version Requirements |
-| ------------------------------- | ------------------------------------------------------------ |
-| IoTDB | Version 1.3.1 or later. For installation, see the IoTDB [Deployment Guidance](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) |
-| Ignition | Version 8.1 (8.1.37 or later). For installation, see the Ignition [Installation Guidance](https://docs.inductiveautomation.com/docs/8.1/getting-started/installing-and-upgrading). For compatibility with other versions, please contact the business department. |
-| Ignition-IoTDB Connector module | Please contact the business department to obtain it |
-| Ignition-IoTDB With JDBC module | Download address: https://repo1.maven.org/maven2/org/apache/iotdb/iotdb-jdbc/ |
-
-## 3. Instruction Manual For Ignition-IoTDB Connector
-
-### 3.1 Introduction
-
-The Ignition-IoTDB Connector module stores data through a database connection associated with the historian provider. The data is stored directly in a table in the SQL database according to its data type, together with a millisecond timestamp. Data is stored only when a value changes, subject to the value mode and deadband settings on each tag, which avoids duplicate and unnecessary storage.
-
-The Ignition-IoTDB Connector provides the ability to store the data collected by Ignition in IoTDB.
-
-### 3.2 Installation Steps
-
-Step 1: Enter the `Configuration` - `System` - `Modules` module and click the `Install or Upgrade a Module` button at the bottom.
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-1.png)
-
-Step 2: Select the `modl` file you obtained, upload it, click `Install`, and trust the relevant certificate.
-
-![](/img/20240703-151030.png)
-
-Step 3: After the installation is complete, you can see the following content:
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-3.png)
-
-Step 4: Enter the `Configuration` - `Tags` - `History` module and click `Create new Historical Tag Provider` below.
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-4.png)
-
-Step 5: Select `IoTDB` and fill in the configuration information.
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-5.png)
-
-The configuration content is as follows:
-
-| Name | Description | Default Value | Notes |
-| --- | --- | --- | --- |
-| **Main** | | | |
-| Provider Name | Provider Name | - | |
-| Enabled | | true | The provider can only be used when it is true |
-| Description | Description | - | |
-| **IoTDB Settings** | | | |
-| Host Name | The address of the target IoTDB instance | - | |
-| Port Number | The port of the target IoTDB instance | 6667 | |
-| Username | The username of the target IoTDB | - | |
-| Password | Password for target IoTDB | - | |
-| Database Name | The database name to be stored, starting with root, such as root.db | - | |
-| Pool Size | Size of SessionPool | 50 | Can be configured as needed |
-| **Store and Forward Settings** | Just keep it as default | | |
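Before entering these settings in the Gateway, it can help to sanity-check them in a script. The sketch below is illustrative only: `validate_provider_settings` and the dictionary keys are hypothetical names that mirror the settings listed above, not part of the Ignition-IoTDB Connector. The defaults (port 6667, pool size 50) come from the table; the requirement that the database name begins with `root.` and the numeric range checks are assumptions made for this example.

```python
# Hypothetical pre-flight check for the provider settings; not part of the connector.

def validate_provider_settings(settings):
    """Return a list of human-readable problems (empty when nothing looks wrong)."""
    problems = []

    if not str(settings.get("host_name", "")).strip():
        problems.append("Host Name is required")

    # 6667 is the default port listed in the settings table.
    port = settings.get("port_number", 6667)
    if not isinstance(port, int) or not 1 <= port <= 65535:
        problems.append("Port Number must be an integer between 1 and 65535")

    # Assumption: the database name is a path under root, for example root.db.
    database = str(settings.get("database_name", ""))
    if not database.startswith("root.") or len(database) <= len("root."):
        problems.append("Database Name must start with 'root.', for example root.db")

    # 50 is the default SessionPool size listed in the settings table.
    pool_size = settings.get("pool_size", 50)
    if not isinstance(pool_size, int) or pool_size < 1:
        problems.append("Pool Size must be a positive integer")

    return problems


if __name__ == "__main__":
    example = {
        "host_name": "127.0.0.1",
        "port_number": 6667,
        "database_name": "root.db",
        "pool_size": 50,
    }
    print(validate_provider_settings(example))
```

An empty list means the values pass these basic checks; it does not confirm that the IoTDB instance is reachable or that the credentials are valid.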
-
-### 3.3 Instructions
-
-#### Configure Historical Data Storage
-
-- After configuring the `Provider`, you can use the `IoTDB Tag Historian` in the `Designer`, just like any other `Provider`. Right-click the corresponding `Tag` and select `Edit Tag(s)`, then select the History category in the Tag Editor.
-
-  ![](/img/ignition-7.png)
-
-- Set `History Enabled` to `true`, select the `Provider` created in the previous step as the `Storage Provider`, configure other parameters as needed, click `OK`, and then save the project. From this point on, data is continuously stored in the IoTDB instance according to these settings.
-
-  ![](/img/ignition-8.png)
-
-#### Read Data
-
-- You can also directly select the tags stored in IoTDB under the Data tab of the Report
-
-  ![](/img/ignition-9.png)
-
-- You can also directly browse relevant data in Components
-
-  ![](/img/ignition-10.png)
-
-#### Script Module: Functions for Interacting with IoTDB
-
-1. system.iotdb.insert:
-
-
-- Script Description: Write data to an IoTDB instance
-
-- Script Definition:
-
-  `system.iotdb.insert(historian, deviceId, timestamps, measurementNames, measurementValues)`
-
-- Parameters:
-
-  - `str historian`: The name of the corresponding IoTDB Tag Historian Provider
-  - `str deviceId`: The device ID to write to, excluding the configured database, such as Sine
-  - `long[] timestamps`: List of timestamps of the data points to write
-  - `str[] measurementNames`: List of measurement names to write
-  - `str[][] measurementValues`: The values to write, corresponding to the timestamp list and the measurement name list
-
-- Return Value: None
-
-- Available Scope: Client, Designer, Gateway
-
-- Usage example:
-
-  ```Python
-  system.iotdb.insert("IoTDB", "Sine", [system.date.now()],["measure1","measure2"],[["val1","val2"]])
-  ```
-
-2. system.iotdb.query:
-
-
-- Script Description: Query the data written to the IoTDB instance
-
-- Script Definition:
-
-  `system.iotdb.query(historian, sql)`
-
-- Parameters:
-
-  - `str historian`: The name of the corresponding IoTDB Tag Historian Provider
-  - `str sql`: The SQL statement to execute
-
-- Return Value:
-  Query Results: `List>`
-
-- Available Scope: Client, Designer, Gateway
-
-- Usage example:
-
-  ```Python
-  system.iotdb.query("IoTDB", "select * from root.db.Sine where time > 1709563427247")
-  ```
-
-## 4. Ignition-IoTDB With JDBC
-
-### 4.1 Introduction
-
-Ignition-IoTDB With JDBC provides a JDBC driver that allows users to connect to and query IoTDB from Ignition using standard JDBC APIs.
-
-### 4.2 Installation Steps
-
-Step 1: Enter the `Configuration` - `Databases` - `Drivers` module and create the `Translator`.
-
-![](/img/Ignition-IoTDBWithJDBC-1.png)
-
-Step 2: Enter the `Configuration` - `Databases` - `Drivers` module, create a `JDBC Driver`, select the `Translator` configured in the previous step, and upload the downloaded `IoTDB JDBC`. Set the Classname to `org.apache.iotdb.jdbc.IoTDBDriver`.
-
-![](/img/Ignition-IoTDBWithJDBC-2.png)
-
-Step 3: Enter the `Configuration` - `Databases` - `Connections` module, create a new `Connection`, select the `IoTDB Driver` created in the previous step as the `JDBC Driver`, configure the relevant information, and save it.
-
-![](/img/Ignition-IoTDBWithJDBC-3.png)
-
-### 4.3 Instructions
-
-#### Data Writing
-
-Select the previously created `Connection` as the `Data Source` in `Transaction Groups`.
-
-- `Table name` must be set to the complete device path starting from root
-- Uncheck `Automatically create table`
-- Set `Store timestamp to` to time
-
-Leave the other options unselected and set the fields. After the group is `enabled`, the data is stored in the corresponding IoTDB instance.
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%86%99%E5%85%A5-1.png)
-
-#### Query
-
-- Select `Data Source` in the `Database Query Browser` and select the previously created `Connection` to write an SQL statement to query the data in IoTDB
-
-![](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2-ponz.png)
-
diff --git a/src/UserGuide/Master/Tree/Ecosystem-Integration/SeaTunnel_timecho.md b/src/UserGuide/Master/Tree/Ecosystem-Integration/SeaTunnel_timecho.md
deleted file mode 100644
index f804b8c0b..000000000
--- a/src/UserGuide/Master/Tree/Ecosystem-Integration/SeaTunnel_timecho.md
+++ /dev/null
@@ -1,190 +0,0 @@
-
-
-
-# Apache SeaTunnel
-
-## 1. Overview
-
-SeaTunnel is a distributed integration platform designed for massive data. Leveraging its high performance and elastic scaling capabilities, it connects multi-source heterogeneous data links through standardized Connectors (composed of Source and Sink). The platform uniformly abstracts various data sources into the SeaTunnelRow format via Source. After dynamic resource scheduling and batch processing optimization, it efficiently writes data to different storage systems through Sink.
Through the deep integration of the IoTDB Connector with SeaTunnel, it not only addresses core challenges in time-series data scenarios such as **high-throughput writing, multi-source governance, and complex analysis**, but also helps enterprises quickly build **low-cost, highly reliable, and easily scalable** data infrastructure in fields like the Internet of Things and industrial internet, leveraging the out-of-the-box connector ecosystem and automated operation and maintenance capabilities. - -## 2. Usage Steps - -### 2.1 Environment Preparation - -#### 2.1.1 Software Requirements - -| Software | Version | Installation Reference | -| ------------- | ------------- |-----------------------------------------------------------| -| IoTDB | >= 2.0.5 | [Quick Start](../QuickStart/QuickStart_timecho.md) | -| SeaTunnel | 2.3.12 | [Official Website](https://seatunnel.apache.org/download) | - -* Thrift Version Conflict Resolution (Only required for Spark engine): - -```Bash -# Remove older Thrift from Spark -rm -f $SPARK_HOME/jars/libthrift* -# Copy IoTDB's Thrift library to Spark classpath -cp $IOTDB_HOME/lib/libthrift* $SPARK_HOME/jars/ -``` - -#### 2.1.2 Dependency Configuration - -1. JDBC - -* Spark/Flink Engine: Place the [JDBC driver JAR](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) into the `${SEATUNNEL_HOME}/plugins/` directory. -* SeaTunnel Zeta Engine: Place the [JDBC driver JAR](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) into the `${SEATUNNEL_HOME}/lib/` directory. - -2. Connector - -Place the corresponding version of the [SeaTunnel Connector](https://mvnrepository.com/artifact/org.apache.seatunnel/connector-iotdb) into the `${SEATUNNEL_HOME}/plugins/` directory. 
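The placement rules in 2.1.2 can be captured in a small helper, for example inside a deployment script. This is a minimal sketch: the function names and the engine labels `spark`, `flink`, and `zeta` are chosen here for illustration and are not part of SeaTunnel; only the directory layout comes from the text above.

```python
import os

# Directory (relative to SEATUNNEL_HOME) that receives the IoTDB JDBC driver JAR,
# following the rules in 2.1.2. The engine labels are illustrative names.
_JDBC_DIR_BY_ENGINE = {"spark": "plugins", "flink": "plugins", "zeta": "lib"}


def jdbc_jar_destination(engine, seatunnel_home):
    """Return the directory where the JDBC driver JAR should be placed."""
    key = engine.strip().lower()
    if key not in _JDBC_DIR_BY_ENGINE:
        raise ValueError("unknown engine: %r" % engine)
    return os.path.join(seatunnel_home, _JDBC_DIR_BY_ENGINE[key])


def connector_jar_destination(seatunnel_home):
    """Return the directory for the SeaTunnel IoTDB connector JAR."""
    return os.path.join(seatunnel_home, "plugins")


if __name__ == "__main__":
    home = os.environ.get("SEATUNNEL_HOME", "/opt/seatunnel")  # fallback path is an assumption
    print(jdbc_jar_destination("zeta", home))
    print(connector_jar_destination(home))
```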
-
-### 2.2 Reading Data (IoTDB Source Connector)
-
-#### 2.2.1 Configuration Parameters
-
-| **Parameter** | **Type** | **Required** | **Default** | **Description** |
-| -------------------------- | -------- | ------------ | ----------- | ---------------------------------------------------------------- |
-| `node_urls` | string | yes | - | IoTDB cluster address, format: `"host1:port"` or `"host1:port,host2:port"` |
-| `username` | string | yes | - | IoTDB username |
-| `password` | string | yes | - | IoTDB password |
-| `sql_dialect` | string | no | tree | IoTDB model: `tree` for tree model; `table` for table model |
-| `sql` | string | yes | - | SQL query statement to execute |
-| `database` | string | no | - | Database name, only effective in table model |
-| `schema` | config | yes | - | Data schema definition |
-| `fetch_size` | int | no | - | Number of data rows fetched per request from IoTDB during query execution |
-| `lower_bound` | long | no | - | Lower bound of time range (used for data partitioning by time column) |
-| `upper_bound` | long | no | - | Upper bound of time range (used for data partitioning by time column) |
-| `num_partitions` | int | no | - | Number of partitions (used when partitioning by time column). With 1 partition, the full time range is used. If (upper_bound - lower_bound) < num_partitions, the difference is used as the actual number of partitions |
-| `thrift_default_buffer_size` | int | no | - | Thrift protocol buffer size |
-| `thrift_max_frame_size` | int | no | - | Thrift maximum frame size |
-| `enable_cache_leader` | boolean | no | - | Whether to enable leader node caching |
-| `version` | string | no | - | Client SQL semantic version (`V_0_12` / `V_0_13`) |
-
-#### 2.2.2 Configuration Example
-
-1. Create a new file `iotdb_source_example.conf` in the `${SEATUNNEL_HOME}/config/` directory:
-
-```bash
-env {
-  parallelism = 2 # Parallelism set to 2
-  job.mode = "BATCH" # Batch mode
-}
-
-source {
-  IoTDB {
-    node_urls = "localhost:6667"
-    username = "root"
-    password = "root"
-    sql = "SELECT temperature, humidity, status FROM root.testdb.seatunnel.source.device align by device"
-    schema {
-      fields {
-        ts = timestamp
-        device_name = string
-        temperature = double
-        humidity = double
-        status = boolean
-      }
-    }
-  }
-}
-
-sink {
-  Console {
-  } # Output to console
-}
-```
-
-2. Run SeaTunnel with the following command:
-
-```Bash
-./bin/seatunnel.sh --config config/iotdb_source_example.conf -e local
-```
-
-3. For more details, please refer to the official Apache SeaTunnel documentation on [IoTDB Source Connector](https://seatunnel.apache.org/docs/2.3.12/connector-v2/source/IoTDB).
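To make the `lower_bound`, `upper_bound`, and `num_partitions` parameters from 2.2.1 more concrete, the sketch below splits a time range into contiguous sub-ranges. It is an illustration under stated assumptions, not the connector's actual algorithm: the name `split_time_range` is hypothetical, ranges are treated as half-open `[start, end)`, and the partition count is capped at `upper_bound - lower_bound` so that no sub-range is empty.

```python
def split_time_range(lower_bound, upper_bound, num_partitions):
    """Split [lower_bound, upper_bound) into contiguous half-open (start, end) ranges."""
    if upper_bound <= lower_bound:
        raise ValueError("upper_bound must be greater than lower_bound")
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")

    span = upper_bound - lower_bound
    # Assumption: never create more partitions than there are timestamps in the range.
    parts = min(num_partitions, span)

    base, extra = divmod(span, parts)
    ranges = []
    start = lower_bound
    for i in range(parts):
        # The first `extra` partitions each take one additional timestamp.
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


if __name__ == "__main__":
    for start, end in split_time_range(0, 10, 3):
        print("time >= %d and time < %d" % (start, end))
```

Each range could then serve as a time predicate for one parallel reader, which is the general idea behind partitioning by the time column.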
-
-### 2.3 Writing Data (IoTDB Sink Connector)
-
-#### 2.3.1 Configuration Parameters
-
-| **Parameter** | **Type** | **Required** | **Default** | **Description** |
-| ----------------------------- | --------- | ------------ | ------------------- | ---------------------------------------------------------------- |
-| `node_urls` | Array | yes | - | IoTDB cluster address, format: `["host1:port"]` or `["host1:port","host2:port"]` |
-| `username` | String | yes | - | IoTDB username |
-| `password` | String | yes | - | IoTDB password |
-| `sql_dialect` | String | no | tree | IoTDB model: `tree` for tree model; `table` for table model |
-| `storage_group` | String | yes | - | IoTDB tree model: specifies the storage group for devices (path prefix), e.g., deviceId = \${storage_group} + "." + \${key_device}; IoTDB table model: specifies the database |
-| `key_device` | String | yes | - | IoTDB tree model: field name in SeaTunnelRow that specifies the IoTDB device ID; IoTDB table model: field name in SeaTunnelRow that specifies the IoTDB table name |
-| `key_timestamp` | String | no | processing time | IoTDB tree model: field name in SeaTunnelRow that specifies the IoTDB timestamp (if not specified, processing time is used as timestamp); IoTDB table model: field name in SeaTunnelRow that specifies the IoTDB time column (if not specified, processing time is used as timestamp) |
-| `key_measurement_fields` | Array | no | See description | IoTDB tree model: field names in SeaTunnelRow that specify the list of IoTDB measurements (if not specified, includes all fields except `key_device` and `key_timestamp`); IoTDB table model: field names in SeaTunnelRow that specify the IoTDB field columns (if not specified, includes all fields except `key_device`, `key_timestamp`, `key_tag_fields`, `key_attribute_fields`) |
-| `key_tag_fields` | Array | no | - | IoTDB tree model: not applicable; IoTDB table model: field names in SeaTunnelRow that specify the IoTDB tag columns |
-| `key_attribute_fields` | Array | no | - | IoTDB tree model: not applicable; IoTDB table model: field names in SeaTunnelRow that specify the IoTDB attribute columns |
-| `batch_size` | Integer | no | 1024 | For batch writing, data is flushed to IoTDB when the buffer reaches `batch_size` or when the time reaches `batch_interval_ms` |
-| `max_retries` | Integer | no | - | Number of retries on failed flush |
-| `retry_backoff_multiplier_ms` | Integer | no | - | Multiplier used to generate the next backoff delay |
-| `max_retry_backoff_ms` | Integer | no | - | Maximum wait time before retrying a request to IoTDB |
-| `default_thrift_buffer_size` | Integer | no | - | Initial buffer size for Thrift client in IoTDB |
-| `max_thrift_frame_size` | Integer | no | - | Maximum frame size for Thrift client in IoTDB |
-| `zone_id` | string | no | - | IoTDB client `java.time.ZoneId` |
-| `enable_rpc_compression` | Boolean | no | - | Enable RPC compression in IoTDB client |
-| `connection_timeout_in_ms` | Integer | no | - | Maximum time (in milliseconds) to wait when connecting to IoTDB |
-
-#### 2.3.2 Configuration Example
-
-1. Create a new file `iotdb_sink_example.conf` in the `${SEATUNNEL_HOME}/config/` directory:
-
-```bash
-# Define runtime environment
-env {
-  parallelism = 4
-  job.mode = "BATCH"
-}
-
-source {
-  Jdbc {
-    url = "jdbc:mysql://localhost:3306/demo_db?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true"
-    driver = "com.mysql.cj.jdbc.Driver"
-    connection_check_timeout_sec = 100
-    user = "root"
-    password = "IoTDB@2024"
-    query = "select * from device"
-  }
-}
-
-sink {
-  IoTDB {
-    node_urls = ["localhost:6667"]
-    username = "root"
-    password = "root"
-    key_device = "id" # Use the id field as the IoTDB device ID
-    key_timestamp = "intime"
-    storage_group = "root.mysql"
-  }
-}
-```
-
-2. Run SeaTunnel with the following command:
-
-```Bash
-./bin/seatunnel.sh --config config/iotdb_sink_example.conf -e local
-```
-
-3. For more configuration parameters and examples, please refer to the official Apache SeaTunnel documentation on [IoTDB Sink Connector](https://seatunnel.apache.org/docs/2.3.12/connector-v2/sink/IoTDB).
diff --git a/src/UserGuide/Master/Tree/IoTDB-Introduction/IoTDB-Introduction_timecho.md b/src/UserGuide/Master/Tree/IoTDB-Introduction/IoTDB-Introduction_timecho.md
deleted file mode 100644
index 10fa4c4d4..000000000
--- a/src/UserGuide/Master/Tree/IoTDB-Introduction/IoTDB-Introduction_timecho.md
+++ /dev/null
@@ -1,267 +0,0 @@
-
-
-# IoTDB Introduction
-
-TimechoDB is a low-cost, high-performance, IoT-native time-series database. It is a commercial product that Timecho provides on the basis of the Apache IoTDB community version.
It can solve various problems encountered by enterprises when building IoT big data platforms to manage time-series data, such as complex application scenarios, large data volumes, high sampling frequencies, high amount of unaligned data, long data processing time, diverse analysis requirements, and high storage and operation costs. - -Timecho provides a more diverse range of product features, stronger performance and stability, and a richer set of utility tools based on TimechoDB. It also offers comprehensive enterprise services to users, thereby providing commercial customers with more powerful product capabilities and a higher quality of development, operations, and usage experience. - -- Download 、Deployment and Usage:[QuickStart](../QuickStart/QuickStart_timecho.md) - - -## 1. Product Components - -Timecho products is composed of several components, covering the entire time-series data lifecycle from data collection, data management to data analysis & application, helping users efficiently manage and analyze the massive amount of time-series data generated by the IoT. - -
- Introduction-en-timecho-new.png - -
- -1. **Time-series database (TimechoDB, a commercial product based on Apache IoTDB provided by the original team)**: The core component of time-series data storage, which can provide users with high-compression storage capabilities, rich time-series query capabilities, real-time stream processing capabilities, while also having high availability of data and high scalability of clusters, and providing security protection. At the same time, TimechoDB also provides users with a variety of application tools for easy management of the system; multi-language API and external system application integration capabilities, making it convenient for users to build applications based on TimechoDB. - -2. **Time-series data standard file format (Apache TsFile, led and contributed by core team members of Timecho)**: This file format is a storage format specifically designed for time-series data, which can efficiently store and query massive amounts of time-series data. Currently, the underlying storage files of Timecho's collection, storage, and intelligent analysis modules are all supported by Apache TsFile. TsFile can be efficiently loaded into TimechoDB and can also be migrated out. Through TsFile, users can use the same file format for data management in the stages of collection, management, application & analysis, greatly simplifying the entire process from data collection to analysis, and improving the efficiency and convenience of time-series data management. - -3. **Time-series model training and inference integrated engine (AINode)**: For intelligent analysis scenarios, TimechoDB provides the AINode time-series model training and inference integrated engine, which offers a complete set of time-series data analysis tools, with the underlying model training engine supporting training tasks and data management, including machine learning, deep learning, etc. With these tools, users can conduct in-depth analysis of the data stored in TimechoDB and mine its value. - -4. 
**Data collection**: To more conveniently dock with various industrial collection scenarios, Timecho provides data collection access services, supporting multiple protocols and formats, which can access data generated by various sensors and devices, while also supporting features such as breakpoint resumption and network barrier penetration. It is more adapted to the characteristics of difficult configuration, slow transmission, and weak network in the industrial field collection process, making the user's data collection simpler and more efficient. - -## 2. Product Features - -TimechoDB has the following advantages and characteristics: - -- Flexible deployment methods: Support for one-click cloud deployment, out-of-the-box use after unzipping at the terminal, and seamless connection between terminal and cloud (data cloud synchronization tool). - -- Low hardware cost storage solution: Supports high compression ratio disk storage, no need to distinguish between historical and real-time databases, unified data management. - -- Hierarchical sensor organization and management: Supports modeling in the system according to the actual hierarchical relationship of devices to achieve alignment with the industrial sensor management structure, and supports directory viewing, search, and other capabilities for hierarchical structures. - -- High throughput data reading and writing: supports access to millions of devices, high-speed data reading and writing, out of unaligned/multi frequency acquisition, and other complex industrial reading and writing scenarios. - -- Rich time series query semantics: Supports a native computation engine for time series data, supports timestamp alignment during queries, provides nearly a hundred built-in aggregation and time series calculation functions, and supports time series feature analysis and AI capabilities. 
- -- Highly available distributed system: Supports HA distributed architecture, the system provides 7*24 hours uninterrupted real-time database services, the failure of a physical node or network fault will not affect the normal operation of the system; supports the addition, deletion, or overheating of physical nodes, the system will automatically perform load balancing of computing/storage resources; supports heterogeneous environments, servers of different types and different performance can form a cluster, and the system will automatically load balance according to the configuration of the physical machine. - -- Extremely low usage and operation threshold: supports SQL like language, provides multi language native secondary development interface, and has a complete tool system such as console. - -- Rich ecological environment docking: Supports docking with big data ecosystem components such as Hadoop, Spark, and supports equipment management and visualization tools such as Grafana, Thingsboard, DataEase. - -## 3. Enterprise characteristics - -### 3.1 Higher level product features - -Based on Apache IoTDB, TimechoDB offers a range of advanced product features, with native upgrades and optimizations at the kernel level for industrial production scenarios. These include multi-level storage, cloud-edge collaboration, visualization tools, and security enhancements, allowing users to focus more on business development without worrying too much about underlying logic. This simplifies and enhances industrial production, bringing more economic benefits to enterprises. For example: - - -- Dual Active Deployment:Dual active usually refers to two independent single machines (or clusters) that perform real-time mirror synchronization. Their configurations are completely independent and can simultaneously receive external writes. 
Each independent single machine (or clusters) can synchronize the data written to itself to another single machine (or clusters), and the data of the two single machines (or clusters) can achieve final consistency. - -- Data Synchronisation:Through the built-in synchronization module of the database, data can be aggregated from the station to the center, supporting various scenarios such as full aggregation, partial aggregation, and hierarchical aggregation. It can support both real-time data synchronization and batch data synchronization modes. Simultaneously providing multiple built-in plugins to support requirements such as gateway penetration, encrypted transmission, and compressed transmission in enterprise data synchronization applications. - -- Tiered Storage:Multi level storage: By upgrading the underlying storage capacity, data can be divided into different levels such as cold, warm, and hot based on factors such as access frequency and data importance, and stored in different media (such as SSD, mechanical hard drive, cloud storage, etc.). At the same time, the system also performs data scheduling during the query process. Thereby reducing customer data storage costs while ensuring data access speed. - -- Security Enhancements: Features like whitelists and audit logs strengthen internal management and reduce the risk of data breaches. - -The detailed functional comparison is as follows: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
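-| Function | | Apache IoTDB | TimechoDB |
-| --- | --- | --- | --- |
-| Deployment Mode | Stand-Alone Deployment | √ | √ |
-| | Distributed Deployment | √ | √ |
-| | Dual Active Deployment | - | √ |
-| | Container Deployment | Partial support | √ |
-| Database Functionality | Sensor Management | √ | √ |
-| | Write Data | √ | √ |
-| | Query Data | √ | √ |
-| | Continuous Query | √ | √ |
-| | Trigger | √ | √ |
-| | User Defined Function | √ | √ |
-| | Permission Management | √ | √ |
-| | Data Synchronisation | Only file synchronization, no built-in plugins | Real-time synchronization + file synchronization, enriched with built-in plugins |
-| | Stream Processing | Only framework, no built-in plugins | Framework + rich built-in plugins |
-| | Tiered Storage | - | √ |
-| | View | - | √ |
-| | White List | - | √ |
-| | Audit Log | - | √ |
-| Supporting Tools | Workbench | - | √ |
-| | Cluster Management Tool | - | √ |
-| | System Monitor Tool | - | √ |
-| Localization | Localization Compatibility Certification | - | √ |
-| Technical Support | Expert Support | - | √ |
-| | Use Training | - | √ |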
- -### 3.2 More efficient/stable product performance - -TimechoDB has optimized stability and performance on the basis of the open source version. With technical support from the enterprise version, it can achieve more than 10 times performance improvement and has the performance advantage of timely fault recovery. - -### 3.3 More User-Friendly Tool System - -TimechoDB will provide users with a simpler and more user-friendly tool system. Through products such as the Cluster Monitoring Panel (Grafana), Database Console (Workbench), and Cluster Management Tool (Deploy Tool, abbreviated as IoTD), it will help users quickly deploy, manage, and monitor database clusters, reduce the work/learning costs of operation and maintenance personnel, simplify database operation and maintenance work, and make the operation and maintenance process more convenient and efficient. - -- Cluster Monitoring Panel: Designed to address the monitoring issues of TimechoDB and its operating system, including operating system resource monitoring, TimechoDB performance monitoring, and hundreds of kernel monitoring indicators, to help users monitor the health status of the cluster and perform cluster tuning and operation. - -
-[Screenshots: Overall Overview; Operating System Resource Monitoring; TimechoDB Performance Monitoring]
- -- Database Console: Designed to provide a low threshold database interaction tool, it helps users perform metadata management, data addition, deletion, modification, query, permission management, system management, and other operations in a concise and clear manner through an interface console, simplifying the difficulty of database use and improving database efficiency. - - -
-[Screenshots: Home Page; Operate Metadata; SQL Query]
- - -- Cluster management tool: aimed at solving the operational difficulties of multi node distributed systems, mainly including cluster deployment, cluster start stop, elastic expansion, configuration updates, data export and other functions, so as to achieve one click instruction issuance for complex database clusters, greatly reducing management difficulty. - - -
-
-### 3.4 More professional enterprise technical services
-
-Timecho provides TimechoDB customers with comprehensive vendor services, including but not limited to on-site installation and training, expert consulting, on-site emergency assistance, software upgrades, online self-service, remote support, and guidance on using the latest development version. In addition, to make TimechoDB better suited to industrial production scenarios, we recommend modeling solutions, optimize read-write performance and compression ratios, recommend database configurations, and provide other technical support based on the enterprise's actual data structure and read-write load. For industrial customization scenarios that the product does not yet cover, Timecho provides customized development tools based on user characteristics.
-
-Compared to the open source version, TimechoDB has a faster release cadence, with a new release every 2-3 months. It also offers day-level exclusive fixes for urgent customer issues to keep production environments stable.
-
-### 3.5 More compatible localization adaptation
-
-The TimechoDB code is self-developed and controllable. It is compatible with most mainstream domestic IT products (CPUs, operating systems, etc.) and has completed compatibility certification with multiple manufacturers to ensure product compliance and security.
\ No newline at end of file
diff --git a/src/UserGuide/Master/Tree/IoTDB-Introduction/Release-history_timecho.md b/src/UserGuide/Master/Tree/IoTDB-Introduction/Release-history_timecho.md
deleted file mode 100644
index 6c5df5236..000000000
--- a/src/UserGuide/Master/Tree/IoTDB-Introduction/Release-history_timecho.md
+++ /dev/null
@@ -1,701 +0,0 @@
-
-# Release History
-
-## 1. TimechoDB (Database Core)
-
-### V2.0.9.4
-> Release Date: 2026.06.10
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.4-bin.zip
-> SHA512 Checksum: 040ebdd9e45d93535e9628cf377003d560be83cec9737f5a5fbd0c3a93a12810814094752eac3eacdfec5cddcf433fa83e76edc14be34c73c1a54d9b937ea1b5 - -Version 2.0.9.4 primarily optimizes table model AINode inference, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- AINode: Table model covariate inference models adaptively support filling null values - - -### V2.0.9.3 -> Release Date: 2026.05.14
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.3-bin.zip
-> SHA512 Checksum: f6c5d50cbf8902503289884f073593c650ffdc8edbebfabf27f6ab4499630749331aa4ed09dd34627a39fa8dee27b4d7e2689d0ed1cf23c76dd9c7270f9fae2a - -Version 2.0.9.3 of AINode newly supports registering multiple models by using the same model code with different model weights. It also includes enhancements and bug fixes for previous versions, with comprehensive improvements to database monitoring, performance and stability. Details are as follows: - -- AINode: [Supports registering custom models with the same model code and different model weights](../AI-capability/AINode_Upgrade_timecho.md#_4-3-register-custom-models) - - -### V2.0.9.2 -> Release Date: 2026.05.11
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.2-bin.zip
-> SHA512 Checksum: 10d3f34b6e65ad5c09b1cf3538ee27e181cc38c5fedf6acfd7d7053797ca23c76245683536275b69bd478aa1e43364351eceef1948832ab663a7398665af9eff - -Version 2.0.9.2 adds import and export capabilities for the Object data type, and introduces the new `tsfile-backup` script (currently supported only for table model scenarios). It also brings optimizations and bug fixes for legacy versions, with overall upgrades to database monitoring, performance and stability. Details are as follows: - -- Scripts & Tools: [The `import-data` script for TsFile format](../../latest-Table/Tools-System/Data-Import-Tool_timecho.md#_2-4-tsfile-format) supports Object type data import for table models -- Scripts & Tools: New [`tsfile-backup` script](../../latest-Table/Tools-System/Data-Export-Tool_timecho.md#_3-tsfilebackup-based-on-pipe-framework) added for table models -- Stream Processing Module: PIPE for table models supports [local export and remote transmission of Object type data](../../latest-Table/User-Manual/Data-Sync_timecho.md#_3-9-object-type-data-export) -- System Module: [Audit logs](../User-Manual/Audit-Log_timecho.md) support slow request quantity statistics - -### V2.0.9.1 -> Release Date: 2026.05.11  
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.1-bin.zip
-> SHA512 Checksum: 18ff3801ba58550e06ef0aa4bf4465e8ce1b31d1aecb9c6899eb843f5d9187d3cc575e930ee38d96b87b17067e2b21f1852ab5127eac7480cf5051c20a68894b - -Version 2.0.9.1 endows AINode with covariate classification inference capability, supports schema-level and table-level storage space statistics. It adds set operations, CTE and multiple built-in functions for data query, enables SQL debugging via DEBUG statements, and supports configuring auto-start on boot. This version also contains legacy version improvements, bug fixes, and comprehensive enhancements to database monitoring, performance and stability. Details are as follows: - -- AINode: Table models support [time series data classification inference](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md#_4-1-model-inference) -- Query Module: Table models support [set operations (UNION/INTERSECT/EXCEPT)](../../latest-Table/SQL-Manual/Set-Operations_timecho.md) and [Common Table Expressions (CTE)](../../latest-Table/SQL-Manual/Common-Table-Expression_timecho.md) -- Query Module: Newly added [IF scalar function](../../latest-Table/SQL-Manual/Basis-Function_timecho.md#_8-3-if-expression), [binary functions](../../latest-Table/SQL-Manual/Basis-Function_timecho.md#_7-binary-functions) and [APPROX_PERCENTILE aggregate function](../../latest-Table/SQL-Manual/Basis-Function_timecho.md#_2-aggregate-functions) for table models -- Query Module: Supports [DEBUG SQL](../User-Manual/Maintenance-commands_timecho.md#_6-query-debugging) for query debugging and optimizes the result set of [Explain Analyze](../User-Manual/Query-Performance-Analysis.md) -- Query Module: Supports [schema-level](../User-Manual/Maintenance-commands_timecho.md#_1-10-view-disk-space-usage) and [table-level](../../latest-Table/Reference/System-Tables_timecho.md#_2-22-table-disk-usage) storage space occupancy statistics; the[ `SHOW CONFIGURATION` statement](../../latest-Table/User-Manual/Maintenance-commands_timecho.md#_1-13-view-node-configuration) 
is available to view cluster configuration information -- Scripts & Tools: Data and metadata import/export tools support the SSL protocol -- Scripts & Tools: Command-line tool adds access [history display](../Tools-System/CLI_timecho.md#_5-access-history-feature) capability -- System Module: Supports [system auto-start](../User-Manual/Auto-Start-On-Boot_timecho.md) configuration -- Others: Fixed security vulnerability CVE-2026-28564 - - -### V2.0.8.3 -> Release Date: 2026.04.21
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.8.3-bin.zip
-> SHA512 Checksum: 4b95bea87cc375bc455897dcf4cec80692421fa5c3eee746e1095b94288611d4afdd94aa8dad70340757d041757758924701cbdb2b73b49fb8730c4caac2a126 - -Version 2.0.8.3 enables reading and writing Object type data via Python. It also includes optimizations and bug fixes for previous versions, with comprehensive upgrades to database monitoring, performance and stability. Details are as follows: - -- Interface Module: [Python Native API](../../latest-Table/API/Programming-Python-Native-API_timecho.md) supports reading and writing Object type data for table models - - -### V2.0.8.2 - -> Release Date: 2026.03.31
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.8.2-bin.zip  
-> SHA512 Checksum: 02ab10e3e94786dd5676e0a69609eef192afd90d87f4d8d7bd44e7e9cbc8a18d61ba5668bae56cb8e4416ac71a877f760963b72ca7838d7c39ae10f1ed321d89 - -Version 2.0.8.2 adds support for modifying the full path of time series in the tree model, customizing the Time column name in the table model, and changing data types in both tree and table models. It also adds the ODBC Driver, among other features, and introduces improvements and bug fixes for earlier versions, with comprehensive enhancements to database monitoring, performance, and stability. The detailed release notes are as follows: - -- Storage Module: The tree model supports [modifying the full name of time series](../Basic-Concept/Operate-Metadata_timecho.md#_2-4-修改时间序列名称) and [changing the data type of time series](../Basic-Concept/Operate-Metadata_timecho.md#_2-3-修改时间序列数据类型). -- Storage Module: The table model supports [modifying column data types](../../latest-Table/Basic-Concept/Table-Management_timecho.md#_1-5-修改表) and [customizing the Time column name](../../latest-Table/Basic-Concept/Table-Management_timecho.md#_1-1-创建表). -- Interface Module: Adds support for the [ODBC Driver](../API/Programming-ODBC_timecho.md); the Python SessionDataset supports fetching DataFrames in batches; the MQTT service is externalized, and a new system table named Services is added for service queries. -- AINode: The table model supports adaptive [covariate inference](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md#_4-1-模型推理). -- Stream Processing Module: The tree model data synchronization PIPE statement supports specifying multiple precise paths. - - -### V2.0.8.1 - -> Release Date: 2026.02.04  
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.8.1-bin.zip  
-> SHA512 Checksum: 49d97cbf488443f8e8e73cc39f6f320b3bc84b194aed90af695ebd5771650b5e5b6a3abb0fb68059bd01827260485b903c035657b337442f4fdd32c877f2aca3 - -V2.0.8.1 introduces the **Object data type** to table models, significantly enhances audit logging capabilities, optimizes the tree model’s **OPC UA protocol**, adds **covariate-based forecasting** support in AINode, and enables **concurrent inference** in AINode. Additionally, comprehensive improvements have been made to database monitoring, performance, and stability. The detailed release notes are as follows: - -- **Query Module**: Added a list view of available DataNode instances, allowing users to [view each node's RPC address and port](../User-Manual/Maintenance-commands_timecho.md#_1-7-viewing-available-nodes). -- **Query Module**: Introduced a new system table for [statistical query latency analysis](../../latest-Table/Reference/System-Tables_timecho.md#_2-20-queries-costs-histogram). -- **Storage Module**: Added SQL support to retrieve the full definition statements for [tables](../../latest-Table/Basic-Concept/Table-Management_timecho.md#_1-4-view-table-creation-statement) and [views](../../latest-Table/User-Manual/Tree-to-Table_timecho.md#_2-4-viewing-table-views). -- **Storage Module**: Optimized the tree model’s [OPC UA protocol](../API/Programming-OPC-UA_timecho.md). -- **System Module**: Added support for the [Object data type](../../latest-Table/Background-knowledge/Data-Type_timecho.md) in table models. -- **System Module**: Significantly enhanced and upgraded the [audit log](../User-Manual/Audit-Log_timecho.md) functionality. -- **System Module**: Added a new system table to monitor [DataNode connection status](../../latest-Table/Reference/System-Tables_timecho.md#_2-18-connections). -- **AINode**: Integrated the built-in **Chronos-2** model, supporting [covariate-based forecasting](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md). 
-- **AINode**: Built-in models **Timer-XL** and **Sundial** now support [concurrent inference](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md). -- **Stream Processing Module**: When creating a full-data synchronization pipe, it will be [automatically split](../User-Manual/Data-Sync_timecho.md#_2-1-create-a-task) into two independent pipes—one for real-time data and one for historical data—whose remaining event counts can be monitored separately via the `SHOW PIPES` statement. -- **Others**: Fixed security vulnerabilities **CVE-2025-12183**, **CVE-2025-66566**, and **CVE-2025-11226**. - -### V2.0.6.6 - -> Release Date: 2026.01.20
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.6.6-bin.zip
-> SHA512 Checksum: d12e60b8119690d63c501d0c2afcd527e39df8a8786198e35b53338e21939e1a9244805e710d81cbb62d02c2739909d7e8227c029660a0cd9ea7ca718cf9bdf6 - -V2.0.6.6 primarily optimizes query performance for time series in the tree mode, while delivering comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Query Module**: Improved query performance for `SHOW/COUNT TIMESERIES/DEVICES` statements. - - -### V2.0.6.4 - -> Release Date: 2025.11.17
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.6.4-bin.zip
-> SHA512 Checksum: 57b9998cc14632862c32b6781c70db1c52caf8172b5d45d27cc214cab50d3afd4230ed0754e1c1a4ed825666bf971dc81fbb7d3b93261e57e9dabc20e794a2b8 - -V2.0.6.4 focuses on enhancements to the storage and AINode modules, resolves several product defects, and provides comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Storage Module**: Added support for modifying the encoding and compression methods of time series in the tree mode. -* **AINode**: Introduced one-click deployment and optimized model inference capabilities. - - -### V2.0.6.1 - -> Release Date: 2025.09.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.6.1-bin.zip
-> SHA512 Checksum: c88e3e2c0dbd06578bd0697ca9992880b300baee2c4906ba1f952134e37ae2fa803a6af236f4541d318b75f43a498b5d5bfbbc7c445783271076c36e696e4dd0 - -V2.0.6.1 introduces the new table model query write-back function, access control blacklist/whitelist function, bitwise operation functions (built-in scalar functions), and push-downable time functions. Comprehensive enhancements to database monitoring, performance, and stability are also included. Key updates: - -* ​**​Query Module:​**​ - * Supports the table model query write-back function - * The table model row pattern recognition supports the use of aggregate functions to capture continuous data for analytical calculation - * The table model adds built-in scalar functions - bitwise operation functions - * The table model adds push-downable EXTRACT time functions -* ​**​System Module:​**​ - * Adds access control, supporting users to customize and configure blacklist/whitelist functions -* ​**​Others:​**​ - * The default user password is updated to "TimechoDB@2021" with higher security strength - -### V2.0.5.2 - -> Release Date: 2025.08.08
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.5.2-bin.zip
-> SHA512 Checksum: a00a4075c9937b7749c454f71d2480fea5e9ff9659c0628b132e30e2f256c7c537cd91dca4f6be924db0274bb180946a1b88e460c025bf82fdb994a3c2c7b91e - -V2.0.5.2 addresses certain product defects and optimizes the data synchronization function. Comprehensive enhancements to database monitoring, performance, and stability are also included. - - -### V2.0.5.1 - -> Release Date: 2025.07.14  
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.5.1-bin.zip
-> SHA512 Checksum: aa724755b659bf89a60da6f2123dfa91fe469d2e330ed9bd029e8f36dd49212f3d83b1025e9da26cb69315e02f65c7e9a93922e40df4f2aa4c7f8da8da2a4cea - -V2.0.5.1 introduces ​**​tree-to-table view​**​, ​**​window functions​**​ and the ​**​approx\_most\_frequent​**​ aggregate function for the table model, along with support for ​**​LEFT & RIGHT JOIN​**​ and ​**​ASOF LEFT JOIN​**​. AINode adds two built-in models: ​**​Timer-XL​**​ and ​**​Timer-Sundial​**​, supporting inference and fine-tuning for tree and table models. Comprehensive enhancements to database monitoring, performance, and stability are also included. Key updates: - -* ​**​Query Module:​**​ - * Supports manually creating tree-to-table views - * Adds window functions for table model - * Adds approx\_most\_frequent aggregate function - * Extends JOIN support: LEFT/RIGHT JOIN, ASOF LEFT JOIN - * Enables row pattern recognition (captures continuous data for analysis) - * New system tables: VIEWS (view metadata), MODELS (model info), etc. -* ​**​System Module:​**​ - * Adds TsFile data encryption -* ​**​AI Module:​**​ - * New built-in models: Timer-XL and Timer-Sundial - * Supports inference/fine-tuning for tree and table models -* ​**​Others:​**​ - * Enables data publishing via OPC DA protocol - -### 2.x Other historical versions - -#### V2.0.4.2 - -> Release Date: 2025.06.21
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.4.2-bin.zip
-> SHA512 Checksum: 31f26473ac90988ce970dac8d0950671bde918f9af6f2f6a6c2bf99a53aa1c0a459c53a137b18ff0b28e70952e9c4b6acb50029e0b2e38837b969eb8f78f2939 - -V2.0.4.2 adds support for passing TOPIC to custom MQTT plugins. Includes comprehensive improvements to monitoring, performance, and stability. - -#### V2.0.4.1 - -> Release Date: 2025.06.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.4.1-bin.zip
-> SHA512 Checksum: 93ac08bfae06aff6db04849f474458433026f66778f4f5c402eb22f1a7cb14d8096daf0a9e9cc365ddfefd4f8ca4443b2a9fb6461906f056b1e6a344990beb3a - -V2.0.4.1 introduces ​**​User-Defined Table Functions (UDTF)​**​ and multiple built-in table functions for the table model, adds the ​**​approx\_count\_distinct​**​ aggregate function, and enables ​**​ASOF INNER JOIN on timestamp columns​**​. Script tools are categorized, with Windows-specific scripts separated out. Key updates: - -* ​**​Query Module:​**​ - * Adds UDTFs and built-in table functions - * Supports ASOF INNER JOIN on timestamps - * Adds approx\_count\_distinct aggregate function -* ​**​Stream Processing:​**​ - * Supports asynchronous TsFile loading via SQL -* ​**​System Module:​**​ - * Disaster-aware load balancing strategy for replica selection during downsizing - * Compatibility with Windows Server 2025 -* ​**​Scripts & Tools:​**​ - * Categorized scripts; isolated Windows-specific tools - -#### V2.0.3.4 - -> Release Date: 2025.06.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.3.4-bin.zip
-> SHA512 Checksum: d80d34b7d3890def75b17c491fc4c13efc36153a5950a9b23744755d04d6adb5d6ab9ec970101183fef7bfeb8a559ef92fce90d2d22f7b7fd5795cd5589461bb - -V2.0.3.4 upgrades the user password encryption algorithm to ​**​SHA-256​**​. Includes comprehensive monitoring, performance, and stability improvements. - -#### V2.0.3.3 - -> Release Date: 2025.05.16
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.3.3-bin.zip
-> SHA512 Checksum: f47e3fb45f869dbe690e7cfaa93f95e5e08a462b362aa9d7ccac7ee5b55022dc8f62db12009dfde055f278f3003ff9ea7c22849d52a3ef2c25822f01ade78591 - -V2.0.3.3 introduces ​**​metadata import/export scripts for table models​**​, ​**​Spark ecosystem integration​**​, and adds ​**​timestamps to AINode results​**​. New aggregate/scalar functions are added. Key updates: - -* ​**​Query Module:​**​ - * New aggregate function: count\_if; scalar functions: greatest/least - * Significant optimization for full-table count(\*) queries -* ​**​AI Module:​**​ - * Timestamps added to AINode results -* ​**​System Module:​**​ - * Optimized metadata performance for table model - * Active monitoring & loading of TsFiles - * New metrics: TsFile parsing time, Tablet conversion count -* ​**​Ecosystem Integration:​**​ - * Spark integration for table model -* ​**​Scripts & Tools:​**​ - * import-schema/export-schema scripts support table model metadata - -#### V2.0.3.2 - -> Release Date: 2025.05.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.3.2-bin.zip
-> SHA512 Checksum: 76bd294de4b01782e5dd621a996aeb448e4581f98c70fb5b72b17dc392c2e1227c0d26bd3df5533669a80f217a83a566bc6ec926b7efd21ce7a89b894cd33e19 - -V2.0.3.2 resolves product defects, optimizes node removal, and enhances monitoring, performance, and stability. - -#### V2.0.2.1 - -> Release Date: 2025.04.07
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.2.1-bin.zip
-> SHA512 Checksum: a41be3f8c57e6a39ac165f1d6ab92c9ed790b0712528f31662c58617f4c94e6bfc9392a9c1ef2fc5bdd8c7ca79901389f368cbdbec3e5b1d5c1ce155b2f1a457 - -V2.0.2.1 adds ​**​table model permission management​**​, ​**​user management​**​, and ​**​operation authentication​**​, alongside UDFs, system tables, and nested queries. Data subscription mechanisms are optimized. Key updates: - -* ​**​Query Module:​**​ - * Added UDF management: User-Defined Scalar Functions (UDSF) & Aggregate Functions (UDAF) - * Configurable URI-based loading for UDF/PipePlugin/Trigger/AINode JARs - * Permission/user management with operation authentication - * New system tables and maintenance statements -* ​**​System Module:​**​ - * CSharp client supports table model - * New C++ Session write APIs for table model - * Multi-tier storage supports S3-compliant non-AWS object storage - * New pattern\_match function -* ​**​Data Sync:​**​ - * Table model metadata sync and delete propagation - -#### V2.0.1.2 - -> Release Date: 2025.01.25
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.1.2-bin.zip
-> SHA512 Checksum: 51c2fa5da2974a8a3c8871dec1c49bd98e5d193a13ef33ac7801adb833a1e360d74f0160bcdf33c7ffb23a5c5e0f376e26a4315cf877f1459483356285b85349 - -V2.0.1.2 officially implements ​**​dual-model configuration (tree + table)​**​. The table model supports ​**​standard SQL queries​**​, diverse functions/operators, stream processing, and Benchmarking. Python client adds four new data types, and script tools support TsFile/CSV/SQL import/export. Key updates: - -* ​**​Time-Series Table Mode:​**​ - * Standard SQL: SELECT, WHERE, JOIN, GROUP BY, ORDER BY, LIMIT, nested queries -* ​**​Query Module:​**​ - * Logical operators, math functions, time-series functions (e.g., DIFF) - * Configurable URI-based JAR loading -* ​**​Storage Module:​**​ - * Session API writes with auto-metadata creation - * Python client supports: String, Blob, Date, Timestamp - * Optimized compaction task priority -* ​**​Stream Processing:​**​ - * Auth info specification on sender side - * TsFile Load for table model - * Plugin adaptation for table model -* ​**​System Module:​**​ - * Enhanced DataNode downsizing stability - * Supports DROP DATABASE in read-only mode -* ​**​Scripts & Tools:​**​ - * Benchmark adapted for table model - * Support for String/Blob/Date/Timestamp in Benchmark - * import-data/export-data: Universal support for TsFile/CSV/SQL -* ​**​Ecosystem Integration:​**​ - * Kubernetes Operator support - - -### V1.3.7.3 - -> Release Date: 2026.06.02
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.7.3-bin.zip
-> SHA512 Checksum: 8e6cde061421a552b9855f39f9cccd4838c820dc15ef0ad2a7c23a54cd6cc4f06c35190c1f428784e6a4d5463dd1b794f58ff5cdf891f27f6d0be4d3ab00bf6f - -V1.3.7.3 primarily optimizes query module and data synchronization capabilities, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- Query Module: Optimized `Last` queries, aligned series queries, reverse-order time filter queries, and other scenarios. -- Metadata Module: Optimized device creation validation for activated series and their child paths. -- Data Synchronization: Optimized the retry mechanism after synchronization failures. -- Data Synchronization: Cross-network-gateway synchronization plugin supports configuring the real-time write transmission timeout. -- Interface Module: Added error code validation to the Go client write interface. -- Interface Module: Optimized C# client connection pool management. - - -### V1.3.7.2 - -> Release Date: 2026.04.07
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.7.2-bin.zip
-> SHA512 Checksum: 787766af64992069f0db0ac8b250b461d799307b3ce06b0782fc25752c8c5307fa2205c9e3a38a41685b81bb6b4b5c1ec9f71a395bfad285caf90de7b8224783 - -V1.3.7.2 primarily optimizes data synchronization and query module capabilities, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- Data Synchronization: Optimized distribution performance for Pipe complex path matching scenarios. -- Query Module: The `SHOW QUERIES` statement now includes client IP, query timeout, server wait time, and other information. -- Ecosystem Integration: Supports IoTDB pushing data to an external OPC Server in OPC Client mode. - - -### V1.3.6.6 - -> Release Date: 2026.01.20
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.6-bin.zip
-> SHA512 Checksum: 590d3ead053298c6df0ede637572ba598b9b684f8b35ab874bd4452f765e1421938f4cca2cf0423af2e806592aa8b15bdd25b41df7de809435a4d0239fc04790 - -V1.3.6.6 enhances data read/write capabilities, resolves several product defects, and delivers comprehensive improvements in database monitoring, performance, and stability. - - -### V1.3.6.3 - -> Release Date: 2026.01.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.3-bin.zip
-> SHA512 Checksum: 43719a1384f59f63cb0029cdda0aba433383cd1a0f5ebc142e54f8aa6623cc30a7efb3e3aef7f3d485d5e07bec91be215c92ed21b5201613d5cc44044251c978 - -V1.3.6.3 focuses on deep optimizations in two core areas—query performance and memory management—while comprehensively enhancing database monitoring, performance, and stability. Specific release contents are as follows: - -* **Query Module**: Optimized query performance across multiple scenarios, including multi-series `Last` queries. -* **Query Module**: Added a new `FastLastQuery` interface in the Java SDK for more efficient `Last` query operations. -* **Query Module**: Modified the tree model’s `fetchSchema` to return results in segmented streaming mode, improving response speed under large-data-volume conditions. -* **Storage Module**: Enhanced memory management to mitigate memory leak risks and ensure long-term system stability. -* **Storage Module**: Optimized the file compaction mechanism to improve compaction efficiency and reduce storage resource consumption. -* **Others**: Fixed security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226. - -### V1.3.6.1 - -> Release Date: 2025.12.09
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.1-bin.zip
-> SHA512 Checksum: 9fb6a6870aa2133bfc40508324a7d97ee078d0d44895beef7b0a331edd203419119fb02b933f585b6c4a6fe9b59708a053d7cf65206b22b1a4f01a5fe518424c - -V1.3.6.1 focuses on deep optimization of data synchronization stability, while delivering comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Data Synchronization**: Enhanced Pipe SQL parameter configuration to support specifying asynchronous loading methods. -* **Data Synchronization**: Introduced syntactic sugar that automatically splits full-data Pipe creation SQL into real-time and historical synchronization components. -* **System Module**: Added a global configuration option for data-type-specific compression strategies, enabling on-demand adjustment of storage compression policies. - - -### V1.3.5.11 - -> Release Date: 2025.09.24
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.11-bin.zip
-> SHA512 Checksum: f18419e20c0d7e9316febee5a053306a97268cb07e18e6933716c2ef98520fbbe051dfa1da02a9c83e8481a839ce35525ce6c50f890f821e3d760f550c75f804 - -V1.3.5.11 version primarily optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -### V1.3.5.10 - -> Release Date: 2025.08.27
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.10-bin.zip
-> SHA512 Checksum: 3aea6d2318f52b39bfb86dae9ff06fe1b719fdeceaabb39278c9a73544e1ceaf0660339f9342abb888c8281a0fb6144179dac9bb0c40ba0ecc66bac4dd7cbe80 - -V1.3.5.10 version fixes certain product defects and includes comprehensive enhancements to database monitoring, performance, and stability. - -### V1.3.5.9 - -> Release Date: 2025.08.25
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.9-bin.zip
-> SHA512 Checksum: 95b7a6790e94dc88e355a81e5a54b10ee87bdadae69ba0b215273967b3422178d5ee81fa5adf1c5380a67dbb30cf9782eaa3cbfd6ec744b0fd9a91c983ee8f70 - -V1.3.5.9 version optimizes memory control, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -### 1.x Other historical versions - -#### V1.3.5.8 - -> Release Date: 2025.08.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.8-bin.zip
-> SHA512 Checksum: aa9802301614e20294a7f2fc4c149ba20d58213d9b74e8f8c607e0f4860949bad164bce2851b63c1d39b7568d62975ab257c269b3a9c168a29ea3945b6d28982 - -V1.3.5.8 version optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -#### V1.3.5.7 - -> Release Date: 2025.08.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.7-bin.zip
-> SHA512 Checksum: 17374a440267aed3507dcc8cf4dc8703f8136d5af30d16206a6e1101e378cbbc50eda340b1598a12df35fe87d96db20f7802f0e64033a013d4b81499198663d4 - -V1.3.5.7 version optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -#### V1.3.5.6 - -> Release Date: 2025.07.16
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.6-bin.zip
-> SHA512 Checksum: 05b9fda4d98ba8a1c9313c0831362ed3d667ce07cb00acaeabcf6441a6d67dff7da27f3fda2a5e1b3c3b85d1e5c730a534f3aa2f0c731b8c03ef447203b32493 - -V1.3.5.6 introduces a new configuration switch to disable the data subscription feature. It optimizes the C++ high-availability client and addresses PIPE synchronization latency issues in normal operation, restart, and deletion scenarios, along with query performance for large TEXT objects. Comprehensive enhancements to database monitoring, performance, and stability are also included. - -#### V1.3.5.4 - -> Release Date: 2025.06.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.4-bin.zip
-> SHA512 Checksum: edac5f8b70dd67b3f84d3e693dc025a10b41565143afa15fc0c4937f8207479ffe2da787cc9384440262b1b05748c23411373c08606c6e354ea3dcdba0371778 - -V1.3.5.4 fixes several product defects and optimizes the node removal functionality. It also delivers comprehensive improvements to database monitoring, performance, and stability. - -#### V1.3.5.3 - -> Release Date: 2025.06.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.3-bin.zip
-> SHA512 Checksum: 5f807322ceec9e63a6be86108cc57e7ad4251b99a6c28baf11256ab65b2145768e9110409f89834d5f4256094a8ad995775c0e59a17224ff2627cd9354e09d82 - -V1.3.5.3 focuses on optimizing data synchronization capabilities, including persisting PIPE transmission progress and adding monitoring metrics for PIPE event transfer time. Related defects have been resolved. Additionally, the encryption algorithm for user passwords has been upgraded to SHA-256. Comprehensive enhancements to database monitoring, performance, and stability are included. - -#### V1.3.5.2 - -> Release Date: 2025.06.10  
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.2-bin.zip
-> SHA512 Checksum: 4c0a5db76c6045dfd27cce303546155cdb402318024dae5f999f596000d7b038b13bbeac39068331b5c6e2c80bc1d89cd346dd0be566fe2fe865007d441d9d05 - -V1.3.5.2 primarily optimizes data synchronization features, adding support for cascading configurations via parameters and ensuring fully consistent ordering between synchronized and real-time writes. It also enables partitioned sending of historical and real-time data after system restarts. Comprehensive enhancements to database monitoring, performance, and stability are included. - -#### V1.3.5.1 - -> Release Date: 2025.05.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.1-bin.zip
-> SHA512 Checksum: 91f22bafbdd4d580126ed59ba1ba99d14209f10ce4a0a4bd7d731943ac99fdb6ebfab6e3a1e294a7cb7f46367e9fd4252b0d9ac4d4240ddedf6d85658e48f212 - -V1.3.5.1 resolves several product defects and delivers comprehensive improvements to database monitoring, performance, and stability. - -#### V1.3.4.2 - -> Release Date: 2025.04.14
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.4.2-bin.zip
-> SHA512 Checksum: 52fbd79f5e7256e7d04edc8f640bb8d918e837fedd1e64642beb2b2b25e3525b5f5a4c92235f88f6f7b59bfcdf096e4ea52ab85bfef0b69274334470017a2c5b - -V1.3.4.2 enhances the data synchronization function by supporting bi-directional active-active synchronization of data forwarded through external PIPE sources. - -#### V1.3.4.1 - -> Release Date: 2025.01.08
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.4.1-bin.zip
-> SHA512 Checksum: e9d46516f1f25732a93cc915041a8e59bca77cf8a1018c89d18ed29598540c9f2bdf1ffae9029c87425cecd9ecb5ebebea0334c7e23af11e28d78621d4a78148 - -V1.3.4.1 introduces pattern matching functions, continuously optimizes the data subscription mechanism, improves stability, and extends import-data/export-data scripts to support new data types while unifying TsFile, CSV and SQL import/export formats. Comprehensive improvements have been made to database monitoring, performance and stability. Key updates: - -* Query Module: Configurable URI-based JAR loading for UDFs, PipePlugins, Triggers and AINodes -* System Module: Extended UDF functionality with new pattern\_match function -* Data Sync: Supports specifying authentication info at sender -* Ecosystem: Kubernetes Operator support -* Scripts: import-data/export-data now supports strings, BLOBs, dates and timestamps -* Scripts: Unified import/export support for TsFile, CSV and SQL formats - -#### V1.3.3.3 - -> Release Date: 2024.10.31
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.3-bin.zip
-> SHA512 Checksum: 4a3eceda479db3980e9c8058628e71ba5a16fbfccf70894e8181aea5e014c7b89988d0093f6d42df29d478340a33878602a3924bec13f442a48611cec4e0e961 - -V1.3.3.3 improves restart recovery performance, enables DataNodes to actively monitor/load TsFiles with observability metrics, supports automatic loading at receivers when senders transfer files to specified directories, and adds Alter Source capability for Pipes. Comprehensive improvements to monitoring, performance and stability include: - -* Data Sync: Automatic type conversion for inconsistent data at receivers -* Data Sync: Enhanced observability with ops/latency metrics for internal APIs -* Data Sync: OPC-UA sink plugin supports CS mode and non-anonymous access -* Subscription: SDK supports create\_if\_not\_exists and drop\_if\_exists APIs -* Stream Processing: Alter Pipe supports Alter Source -* System: Added latency monitoring for REST module -* Scripts: Auto-loading TsFiles from specified directories -* Scripts: import-tsfile supports remote server execution -* Scripts: Kubernetes Helm support -* Scripts: Python client supports new data types (string, BLOB, date, timestamp) - -#### V1.3.3.2 - -> Release Date: 2024.08.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.2-bin.zip
-> SHA512 Checksum: 32733610da40aa965e5e9263a869d6e315c5673feaefad43b61749afcf534926398209d9ca7fff866c09deb92c09d950c583cea84be5a6aa2c315e1c7e8cfb74 - -V1.3.3.2 adds metrics for mods file reading time, merge sort memory usage and dispatch latency, supports configurable time partition origin adjustment, enables automatic subscription termination based on pipe completion markers, and improves merge memory control. Key updates: - -* Query: Explain Analyze shows mods file read time -* Query: Explain Analyze shows merge sort memory and dispatch latency -* Storage: Added configurable file splitting during compaction -* System: Configurable time partition origin -* Stream Processing: Auto-terminate subscriptions on pipe completion markers -* Data Sync: Configurable RPC compression levels -* Scripts: Export filters only root.\_\_system paths - -#### V1.3.3.1 - -> Release Date: 2024.07.12
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.1-bin.zip
-> SHA512 Checksum: 1fdffbc1f18bfabfa3463a5a6fbc4f6ba6ab686942f9e85e7e6be1840fb8700e0147e5e73fd52201656ae6adb572cc2e5ecc61bcad6fa4c5a4048c4207e3c6c0 - -V1.3.3.1 adds tiered storage throttling, supports username/password auth specification at sync senders, optimizes ambiguous WARN logs at receivers, improves restart performance, and merges configuration files. Key updates: - -* Query: Optimized Filter performance for faster aggregation/WHERE queries -* Query: Java Session evenly distributes SQL requests across nodes -* System: Merged config files into iotdb-system.properties -* Storage: Added tiered storage throttling -* Data Sync: Username/password auth specification at senders -* System: Optimized restart recovery time - -#### V1.3.2.2 - -> Release Date: 2024.06.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.2.2-bin.zip
-> SHA512 Checksum: ad73212a0b5025d18d2481163f6b2d4f604e06eb5e391cc6cba7bf4e42792e115b527ed8bfb5cd95d20a150645c8b4d56a531889dac229ce0f63139a27267322 - -V1.3.2.2 introduces EXPLAIN ANALYZE for SQL profiling, UDAF framework, automatic data deletion at disk thresholds, metadata sync, path-specific data point counting, and SQL import/export scripts. Supports rolling cluster upgrades and cluster-wide plugin distribution with comprehensive monitoring/performance improvements. Key updates: - -* Storage: Improved insertRecords performance -* Storage: SpaceTL feature for auto-deletion at disk thresholds -* Query: EXPLAIN ANALYZE for SQL stage-level profiling -* Query: New UDAF framework -* Query: New envelope demodulation analysis in UDFs -* Query: MaxBy/MinBy functions returning timestamps with values -* Query: Faster value-filtered queries -* Data Sync: Wildcard path matching -* Data Sync: Metadata synchronization (including attributes/permissions) -* Stream Processing: ALTER PIPE for hot plugin updates -* System: TsFile load statistics in data point counting -* Scripts: Local upgrade/backup via hard links -* Scripts: New export-data/import-data for CSV/TsFile/SQL formats -* Scripts: Windows window title differentiation for ConfigNode/DataNode/Cli - -#### V1.3.1.4 - -> Release Date: 2024.04.23
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.1.4-bin.zip
-> SHA512 Checksum: 8547702061d52e2707c750a624730eb2d9b605b60661efa3c8f11611ca1685aeb51b6f8a93f94c1b30bf2e8764139489c9fbb76cf598cfa8bf9c874b2a7c57eb - -V1.3.1.4 adds cluster activation status viewing, variance/stddev aggregation functions, FILL timeout settings, TsFile repair command, one-click info collection scripts, and cluster control scripts while optimizing views and stream processing. Key updates: - -* Query: FILL clause timeout threshold -* Query: REST V2 returns column types -* Data Sync: Simplified time range specification -* Data Sync: SSL support (iotdb-thrift-ssl-sink) -* System: SQL query for cluster activation status -* System: Tiered storage transfer rate control -* System: Enhanced observability (node divergence, task scheduling) -* System: Optimized default logging -* Scripts: One-click cluster control scripts (start-all/stop-all) -* Scripts: One-click info collection scripts (collect-info) - -#### V1.3.0.4 - -> Release Date: 2024.01.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.0.4-bin.zip
-> SHA512 Checksum: 3c07798f37c07e776e5cd24f758e8aaa563a2aae0fb820dad5ebf565ad8a76c765b896d44e7fdb7dad2e46ffd4262af901c765f9bf6af926bc62103118e38951 - -V1.3.0.4 introduces the AINode machine learning framework, upgrades permission granularity to time-series level, and optimizes views/stream processing for better usability and stability. Key updates: - -* Query: New AINode ML framework -* Query: Fixed slow SHOW PATH responses -* Security: Time-series granular permissions -* Security: SSL client-server encryption -* Stream Processing: New metrics monitoring -* Query: LAST queries on non-writable views -* System: Improved data point counting accuracy - -#### V1.2.0.1 - -> Release Date: 2023.06.30
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.2.0.1-bin.zip
-> SHA512 Checksum: dcf910d0c047d148a6c52fa9ee03a4d6bc3ff2a102dc31c0864695a25268ae933a274b093e5f3121689063544d7c6b3b635e5e87ae6408072e8705b3c4e20bf0 - -V1.2.0.1 introduces stream processing framework, dynamic templates, substring/replace/round functions, enhances SHOW REGION/TIMESERIES/VARIABLE statements and Session APIs while optimizing monitoring metrics. Key updates: - -* Stream Processing: New framework -* Metadata: Dynamic template expansion -* Storage: New SPRINTZ/RLBE encoding and LZMA2 compression -* Query: New CAST, ROUND, SUBSTR, REPLACE functions -* Query: New TIME\_DURATION, MODE aggregation -* Query: CASE WHEN syntax support -* Query: ORDER BY expression support -* Interface: Python API multi-node connection -* Interface: Python client write redirection -* Interface: Batch sequence creation via templates - -#### V1.1.0.1 - -> Release Date: 2023.04.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.1.0.1.zip
-> SHA512 Checksum: 58df58fc8b11afeec8436678842210ec092ac32f6308656d5356b7819acc199f1aec4b531635976b091b61d6736f0d9706badcabeaa5de50939e5c331c1dc804 - -V1.1.0.1 introduces GROUP BY VARIATION/CONDITION, DIFF/COUNT\_IF functions, and pipeline execution engine while fixing issues including: - -* Aligned sequence LAST queries with ORDER BY TIMESERIES -* LIMIT & OFFSET failures -* Post-restart metadata template errors -* Sequence creation after database deletion - -Key updates: - -* Query: ALIGN BY DEVICE supports ORDER BY TIME -* Query: SHOW QUERIES/KILL QUERY commands -* System: SHOW REGIONS per database -* System: SHOW VARIABLES for cluster parameters -* Query: GROUP BY VARIATION/CONDITION -* Query: SELECT INTO type casting -* Query: New DIFF (scalar), COUNT\_IF (aggregate) -* System: SHOW REGIONS creation time -* System: Configurable dn\_rpc\_port/address - -## 2. Workbench (Console Tool) - -| **Version** | **Description** | **Supported IoTDB Versions** | **SHA512 checksum** | -| ----------- | ------------------------------------------------------------ | ----------------------------------- | ------------------------------------------------------------ | -| V2.1.1 | Optimize the measuring point selection on the trend interface to support scenarios without devices | V2.0 and above | aa05fd4d9f33f07c0949bc2d6546bb4b9791ed5ea94bcef27e2bf51ea141ec0206f1c12466aced7bf3449e11ad68d65378d697f3d10cb4881024a83746029a65 | -| V2.0.1-beta | The first version of the V2.x series, supporting dual models of tree and table | V2.0 and above | 0ca0d5029874ed8ada9c7d1cb562370b3a46913eed66d39c08759287ccc8bf332cf80bb8861e788614b61ae5d53a9f5605f553e1a607e856f395eb5102e7cc4d | -| V1.5.7 | Optimize the point list by splitting point names into device names and points, ensure the point selection area supports horizontal scrolling, and align the export file column order with the page display. 
| All 1.x versions from V1.3.4 onward | d3cd4a63372ca5d6217b67dddf661980c6a442b3b1564235e9ad34fc254d681febd58c2cc59c6273ffbfd8a1b003b9adb130ecfaaebe1942003b0d07427b1fcc | -| V1.5.6 | Enhanced CSV import/export: optional tags/aliases on import; support for measurement descriptions with backtick-quoted quotes on export. | All 1.x versions from V1.3.4 onward | 276ac1ea341f468bf6d29489c9109e9aa61afe2d1caaab577bc40603c6f4120efccc36b65a58a29ce6a266c21b46837aad6128f84ba5e676231ea9e6284a35e5 | -| V1.5.5 | Added server clock functionality and support for activating Enterprise Edition license databases | All 1.x versions from V1.3.4 onward | b18d01b70908d503a25866d1cc69d14e024d5b10ca6fcc536932fdbef8257c66e53204663ce3be5548479911aca238645be79dfd7ee7e65a07ab3c0f68c497f6 | -| V1.5.4 | Added authentication for Prometheus settings in Instance Management | All 1.x versions from V1.3.4 onward | adc7e13576913f9e43a9671fed02911983888da57be98ec8fbbb2593600d310f69619d32b22b569520c88e29f100d7ccae995b20eba757dbb1b2825655719335 | -| V1.5.1 | Added AI analysis and pattern matching | All 1.x versions from V1.3.2 onward | 4f2053a2a3b2b255ce195268d6cd245278f3be32ba4cf68be1552c386d78ed4424f7bdc9d8e68c6b8260b3e398c8fd23ff342439c4e88e1e777c62640d2279f9 | -| V1.4.0 | Added tree model display and English UI | All 1.x versions from V1.3.2 onward | 734077f3bb5e1719d20b319d8b554ce30718c935cb0451e02b2c9267ff770e9c2d63b958222f314f16c2e6e62bf78b643255249b574ee6f37d00e123433981e8 | -| V1.3.1 | Enhanced analysis methods and import templates | All 1.x versions from V1.3.2 onward | 134f87101cc7f159f8a22ac976ad2a3a295c5435058ee0a15160892aac46ac61dd3cfb0633b4aea9cc7415bf904d0ae65aaf77d663f027d864204d81fb34768b | -| V1.3.0 | Added DB configuration and UI refinements | All 1.x versions from V1.3.2 onward | 94a137fc5c681b211f3e076472a9c5875d59e7f0cd6d7409cb8f66bb9e4f87577a0f12dd500e2bcb99a435860c82183e4a6514b638bcb4aecfb48f184730f3f1 | -| V1.2.6 | Optimized permission controls | All 1.x versions from V1.3.1 
onward | f345b7edcbe245a561cb94ec2e4f4d40731fe205f134acadf5e391e5874c5c2477d9f75f15dbaf36c3a7cb6506823ac6fbc2a0ccce484b7c4cc71ec0fbdd9901 | -| V1.2.5 | Added "Common Templates" and caching | All 1.x versions from V1.3.0 onward | 37376b6cfbef7df8496e255fc33627de01bd68f636e50b573ed3940906b6f3da1e8e8b25260262293b8589718f5a72180fa15e5823437bf6dc51ed7da0c583f7 | -| V1.2.4 | Added import/export for calculations, time alignment field | All 1.x versions from V1.2.2 onward | 061ad1add38c109c1a90b06f1ddb7797bd45e84a34a4f77154ee48b90bdc7ecccc1e25eaa53fbbc98170d99facca93e3536192dd8d10a50ce505f59923ce6186 | -| V1.2.3 | Added activation details and analysis features | All 1.x versions from V1.2.2 onward | 254f5b7451300f6f99937d27fd7a5b20847d5293f53e0eaf045ac9235c7ea011785716b800014645ed5d2161078b37e1d04f3c59589c976614fb801c4da982e1 | -| V1.2.2 | Optimized point description display | All 1.x versions from V1.2.2 onward | 062e520d010082be852d6db0e2a3aa6de594eb26aeb608da28a212726e378cd4ea30fca5e1d2c3231ebd8de29e94ca9641f1fabc1cea46acfb650c37b7681b4e | -| V1.2.1 | Added sync monitoring panel, Prometheus hints | All 1.x versions from V1.2.2 onward | 8a3bcf87982ad5004528829b121f2d3945429deb77069917a42a8c8d2e2e2a2c24a398aaa87003920eeacc0c692f1ed39eac52a696887aa085cce011f0ddd745 | -| V1.2.0 | Major Workbench upgrade | All 1.x versions from V1.2.0 onward | ea1f7d3a4c0c6476a195479e69bbd3b3a2da08b5b2bb70b0a4aba988a28b5db5a209d4e2c697eb8095dfdf130e29f61f2ddf58c5b51d002c8d4c65cfc13106b3 | diff --git a/src/UserGuide/Master/Tree/QuickStart/QuickStart_timecho.md b/src/UserGuide/Master/Tree/QuickStart/QuickStart_timecho.md deleted file mode 100644 index cb58bf5f5..000000000 --- a/src/UserGuide/Master/Tree/QuickStart/QuickStart_timecho.md +++ /dev/null @@ -1,108 +0,0 @@ - - - -# Quick Start - -This document will guide you through methods to get started quickly with IoTDB. - -## 1. How to Install and Deploy? - -This guide will assist you in quickly installing and deploying IoTDB. 
You can quickly navigate to the content you need through the following document links: - -1. Prepare the necessary machine resources: The deployment and operation of IoTDB require consideration of various aspects of machine resource configuration. For specific resource configurations, please refer to [Database Resource](../Deployment-and-Maintenance/Database-Resources.md) - -2. Complete system configuration preparations: IoTDB's system configuration involves multiple aspects. For an introduction to key system configurations, please see [System Requirements](../Deployment-and-Maintenance/Environment-Requirements.md) - -3. Obtain the installation package: You can contact the Timecho Team to get the IoTDB installation package to ensure you download the latest and most stable version. For the specific structure of the installation package, please refer to [Obtain TimechoDB](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) - -4. Install the database and activate it: Depending on your actual deployment architecture, you can choose from the following tutorials for installation and deployment: - - - Stand-Alone Deployment: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - - - Distributed (Cluster) Deployment: [Distributed (Cluster) Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - - - Dual-Active Deployment: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -> ❗️Note: We currently still recommend direct installation and deployment on physical/virtual machines. For Docker deployment, please refer to [Docker Deployment](../Deployment-and-Maintenance/Docker-Deployment_timecho.md) - -5. 
Install database supporting tools: The enterprise edition provides supporting tools such as a monitoring panel and Workbench. It is recommended to install them when deploying the enterprise edition, as they make IoTDB more convenient to use: - - - Monitoring panel: Provides over a hundred database monitoring metrics for detailed monitoring of IoTDB and its operating system, enabling system optimization, performance optimization, bottleneck discovery, and more. For installation steps, see [Monitoring panel](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - Workbench: The visual interface of IoTDB. It supports metadata operations, data querying, data visualization, and other functions through an interactive interface, helping users work with the database easily and efficiently. For installation steps, see [Workbench Deployment](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - -## 2. How to Use IoTDB? - -1. Database Modeling Design: Database modeling is a crucial step in creating a database system, involving the design of data structures and relationships to ensure that the organization of data meets the needs of specific applications. The following documents will help you quickly understand IoTDB's modeling design: - - - Introduction to Time Series Concepts: [Navigating Time Series Data](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - - - Introduction to Modeling Design: [Data Model and Terminology](../Background-knowledge/Data-Model-and-Terminology_timecho.md) - - - Introduction to SQL syntax: [SQL syntax](../Basic-Concept/Operate-Metadata_timecho.md) - -2. Write Data: IoTDB provides multiple ways to insert real-time data. For the basic data writing operations, see [Write Data](../Basic-Concept/Write-Data_timecho.md) - -3. Query Data: IoTDB provides rich data query functions. 
For a basic introduction to data querying, see [Query Data](../Basic-Concept/Query-Data_timecho.md) - -4. Other advanced features: In addition to common database functions such as writing and querying, IoTDB also supports data synchronisation, a stream framework, security management, database administration, AI capability, and other functions. Usage details can be found in the corresponding documents: - - - Data Synchronisation: [Data Synchronisation](../User-Manual/Data-Sync_timecho.md) - - - Stream Framework: [Stream Framework](../User-Manual/Streaming_timecho.md) - - - Security Management: [Security Management](../User-Manual/Black-White-List_timecho.md) - - - Database Administration: [Database Administration](../User-Manual/Authority-Management_timecho.md) - - - AI Capability: [AI Capability](../AI-capability/AINode_timecho.md) - -5. API: IoTDB provides multiple application programming interfaces (APIs) for developers to interact with IoTDB in their applications. It currently supports the [Java Native API](../API/Programming-Java-Native-API_timecho.md), [Python Native API](../API/Programming-Python-Native-API_timecho.md), [C++ Native API](../API/Programming-Cpp-Native-API.md), and [Go Native API](../API/Programming-Go-Native-API.md). For more APIs, please refer to the API chapters on the official website - -## 3. What other convenient tools are available? - -In addition to its rich features, IoTDB also has a comprehensive range of peripheral tools. This section will help you quickly get started with them: - - - Workbench: Workbench is a visual interface for IoTDB that supports interactive operations. It offers intuitive features for metadata management, data querying, and data visualization, enhancing the convenience and efficiency of user database operations. 
For detailed usage instructions, please refer to: [Workbench](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - - - Monitor Tool: This is a tool for meticulous monitoring of IoTDB and its host operating system, covering hundreds of database monitoring metrics including database performance and system resources, which aids in system optimization and bottleneck identification. For detailed usage instructions, please refer to: [Monitor Tool](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - Benchmark Tool: IoT benchmark is a time series database benchmark testing tool developed based on Java and big data environments, developed and open sourced by the School of Software at Tsinghua University. It supports multiple writing and querying methods, can store test information and results for further query or analysis, and supports integration with Tableau to visualize test results. For specific usage instructions, please refer to: [Benchmark Tool](../Tools-System/Benchmark.md) - - - Data Import Script: For different scenarios, IoTDB provides users with multiple ways to batch import data. For specific usage instructions, please refer to: [Data Import](../Tools-System/Data-Import-Tool_timecho.md) - - - Data Export Script: For different scenarios, IoTDB provides users with multiple ways to batch export data. For specific usage instructions, please refer to: [Data Export](../Tools-System/Data-Export-Tool_timecho.md) - - -## 4. Want to Learn More About the Technical Details? - -If you are interested in delving deeper into the technical aspects of IoTDB, you can refer to the following documents: - - - Research Paper: IoTDB features columnar storage, data encoding, pre-calculation, and indexing technologies, along with a SQL-like interface and high-performance data processing capabilities. It also integrates seamlessly with Apache Hadoop, MapReduce, and Apache Spark. 
For related research papers, please refer to: [Research Paper](../Technical-Insider/Publication.md) - - - Compression & Encoding: IoTDB optimizes storage efficiency for different data types through a variety of encoding and compression techniques. To learn more, please refer to: [Compression & Encoding](../Technical-Insider/Encoding-and-Compression.md) - - - Data Partitioning and Load Balancing: IoTDB has meticulously designed data partitioning strategies and load balancing algorithms based on the characteristics of time series data, enhancing the availability and performance of the cluster. For more information, please refer to: [Data Partitioning & Load Balancing](../Technical-Insider/Cluster-data-partitioning.md) - -## 5. Encountering problems during use? - -If you encounter difficulties during installation or use, see [Frequently Asked Questions](../FAQ/Frequently-asked-questions.md) \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/Reference/DataNode-Config-Manual_timecho.md b/src/UserGuide/Master/Tree/Reference/DataNode-Config-Manual_timecho.md deleted file mode 100644 index f3f5f4f75..000000000 --- a/src/UserGuide/Master/Tree/Reference/DataNode-Config-Manual_timecho.md +++ /dev/null @@ -1,632 +0,0 @@ - - -# DataNode Config Manual - -The IoTDB DataNode and the Standalone version use the same configuration files, all located in the `conf` directory. - -* `datanode-env.sh/bat`: Environment configurations, in which the memory allocation of DataNode and Standalone can be set. - -* `iotdb-system.properties`: IoTDB system configurations. - -## 1. Hot Modification Configuration - -For convenience, IoTDB provides a hot modification function: some configuration parameters in `iotdb-system.properties` can be modified while the system is running and take effect immediately. -Among the parameters described below, those whose `Effective` value is `hot-load` support hot modification. 
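As a rough illustration of the file-editing half of a hot modification, the sketch below rewrites one key in the text of a properties file. It is only a sketch under stated assumptions: the helper `set_property` and the key `example_hot_load_key` are hypothetical and not part of IoTDB, and the key stands in for any parameter whose `Effective` value is `hot-load`.

```python
def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with `key` set to `value`.

    Hypothetical helper for illustration only; it is not an IoTDB API.
    Existing `key=...` lines are replaced, and the key is appended if absent.
    """
    lines = text.splitlines()
    updated = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave comments and lines without '=' untouched.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = f"{key}={value}"
            updated = True
    if not updated:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# `example_hot_load_key` is a placeholder, not a real IoTDB parameter.
original = "# sample file\ndn_rpc_port=6667\nexample_hot_load_key=1\n"
print(set_property(original, "example_hot_load_key", "2"))
```

After the edited text is written back to `iotdb-system.properties`, the change still has to be applied by sending `load configuration` from a client.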
- -How to trigger: The client sends the SQL command `load configuration` or `set configuration` to the IoTDB server. - -## 2. Environment Configuration File (datanode-env.sh/bat) - -The environment configuration file is mainly used to configure the Java environment related parameters when DataNode is running, such as JVM related configuration. This part of the configuration is passed to the JVM when the DataNode starts. - -The details of each parameter are as follows: - -* MEMORY\_SIZE - -|Name|MEMORY\_SIZE| -|:---:|:---| -|Description|The minimum heap memory size that IoTDB DataNode will use at startup | -|Type|String| -|Default| The default is half of the memory.| -|Effective|After restarting system| - -* ON\_HEAP\_MEMORY - -|Name|ON\_HEAP\_MEMORY| -|:---:|:---| -|Description|The heap memory size that IoTDB DataNode can use. Former name: MAX\_HEAP\_SIZE | -|Type|String| -|Default| Calculated based on MEMORY\_SIZE.| -|Effective|After restarting system| - -* OFF\_HEAP\_MEMORY - -|Name|OFF\_HEAP\_MEMORY| -|:---:|:---| -|Description|The direct memory that IoTDB DataNode can use. Former name: MAX\_DIRECT\_MEMORY\_SIZE| -|Type|String| -|Default| Calculated based on MEMORY\_SIZE.| -|Effective|After restarting system| - -* JMX\_LOCAL - -|Name|JMX\_LOCAL| -|:---:|:---| -|Description|JMX monitoring mode. Set to true to allow only local monitoring, or false to allow remote monitoring| -|Type|Enum String: "true", "false"| -|Default|true| -|Effective|After restarting system| - -* JMX\_PORT - -|Name|JMX\_PORT| -|:---:|:---| -|Description|JMX listening port. Please confirm that the port is not a system reserved port and is not occupied| -|Type|Short Int: [0,65535]| -|Default|31999| -|Effective|After restarting system| - -* JMX\_IP - -|Name|JMX\_IP| -|:---:|:---| -|Description|JMX listening address. Only takes effect if JMX\_LOCAL=false. 0.0.0.0 is never allowed| -|Type|String| -|Default|127.0.0.1| -|Effective|After restarting system| - -## 3. 
JMX Authorization - -We **STRONGLY RECOMMEND** that you CHANGE the PASSWORD for the JMX remote connection. - -The users and passwords are defined in ${IOTDB\_CONF}/conf/jmx.password. - -The permission definitions are in ${IOTDB\_CONF}/conf/jmx.access. - -## 4. DataNode/Standalone Configuration File (iotdb-system.properties) - -### 4.1 Data Node RPC Configuration - -* dn\_rpc\_address - -|Name| dn\_rpc\_address | -|:---:|:-----------------------------------------------| -|Description| The address on which the client rpc service listens. | -|Type| String | -|Default| 127.0.0.1 | -|Effective| After restarting system | - -* dn\_rpc\_port - -|Name| dn\_rpc\_port | -|:---:|:---| -|Description| The port on which the client rpc service listens.| -|Type|Short Int : [0,65535]| -|Default| 6667 | -|Effective|After restarting system| - -* dn\_internal\_address - -|Name| dn\_internal\_address | -|:---:|:---| -|Description| DataNode internal service host/IP | -|Type| String | -|Default| 127.0.0.1 | -|Effective|Only allowed to be modified in first start up| - -* dn\_internal\_port - -|Name| dn\_internal\_port | -|:---:|:-------------------------------| -|Description| DataNode internal service port | -|Type| int | -|Default| 10730 | -|Effective| Only allowed to be modified in first start up | - -* dn\_mpp\_data\_exchange\_port - -|Name| dn\_mpp\_data\_exchange\_port | -|:---:|:---| -|Description| MPP data exchange port | -|Type| int | -|Default| 10740 | -|Effective|Only allowed to be modified in first start up| - -* dn\_schema\_region\_consensus\_port - -|Name| dn\_schema\_region\_consensus\_port | -|:---:|:---| -|Description| DataNode Schema replica communication port for consensus | -|Type| int | -|Default| 10750 | -|Effective|Only allowed to be modified in first start up| - -* dn\_data\_region\_consensus\_port - -|Name| dn\_data\_region\_consensus\_port | -|:---:|:---| -|Description| DataNode Data replica communication port for consensus | -|Type| int | -|Default| 10760 | -|Effective|Only allowed to be
modified in first start up| - -* dn\_join\_cluster\_retry\_interval\_ms - -|Name| dn\_join\_cluster\_retry\_interval\_ms | -|:---:|:--------------------------------------------------------------------------| -|Description| The time of data node waiting for the next retry to join into the cluster | -|Type| long | -|Default| 5000 | -|Effective| After restarting system | - -### 4.2 SSL Configuration - -* enable\_thrift\_ssl - -|Name| enable\_thrift\_ssl | -|:---:|:---------------------------| -|Description|When enable\_thrift\_ssl is configured as true, SSL encryption will be used for communication through dn\_rpc\_port | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* enable\_https - -|Name| enable\_https | -|:---:|:-------------------------| -|Description| REST Service Specifies whether to enable SSL configuration | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* key\_store\_path - -|Name| key\_store\_path | -|:---:|:-----------------| -|Description| SSL certificate path | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* key\_store\_pwd - -|Name| key\_store\_pwd | -|:---:|:----------------| -|Description| SSL certificate password | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -### 4.3 SeedConfigNode - -* dn\_seed\_config\_node - -|Name| dn\_seed\_config\_node | -|:---:|:------------------------------------------------| -|Description| ConfigNode Address for DataNode to join cluster. 
This parameter corresponds to dn\_target\_config\_node\_list before V1.2.2 | -|Type| String | -|Default| 127.0.0.1:10710 | -|Effective| Only allowed to be modified in first start up | - -### 4.4 Connection Configuration - -* dn\_rpc\_thrift\_compression\_enable - -|Name| dn\_rpc\_thrift\_compression\_enable | -|:---:|:---| -|Description| Whether to enable thrift's compression (using GZIP).| -|Type|Boolean| -|Default| false | -|Effective|After restarting system| - -* dn\_rpc\_advanced\_compression\_enable - -|Name| dn\_rpc\_advanced\_compression\_enable | -|:---:|:---| -|Description| Whether to enable thrift's advanced compression.| -|Type|Boolean| -|Default| false | -|Effective|After restarting system| - -* dn\_rpc\_selector\_thread\_count - -|Name| dn\_rpc\_selector\_thread\_count | -|:---:|:-----------------------------------| -|Description| The number of rpc selector threads. | -|Type| int | -|Default| false | -|Effective| After restarting system | - -* dn\_rpc\_min\_concurrent\_client\_num - -|Name| dn\_rpc\_min\_concurrent\_client\_num | -|:---:|:-----------------------------------| -|Description| Minimum concurrent rpc connections | -|Type| Short Int : [0,65535] | -|Default| 1 | -|Effective| After restarting system | - -* dn\_rpc\_max\_concurrent\_client\_num - -|Name| dn\_rpc\_max\_concurrent\_client\_num | -|:---:|:--------------------------------------| -|Description| Max concurrent rpc connections | -|Type| Short Int : [0,65535] | -|Default| 1000 | -|Effective| After restarting system | - -* dn\_thrift\_max\_frame\_size - -|Name| dn\_thrift\_max\_frame\_size | -|:---:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -|Description| Maximum size, in bytes, of each thrift RPC request/response | -|Type| int | -| Default | Defaults to 0, 
which means the value is automatically calculated based on the DN JVM configuration parameters at startup:
a. min(64MB, dn_alloc_memory/64)
b. If the user manually configures `dn_thrift_max_frame_size`, the user-specified value will be used instead. | -|Effective| After restarting system | - -* dn\_thrift\_init\_buffer\_size - -|Name| dn\_thrift\_init\_buffer\_size | -|:---:|:---| -|Description| Initial size in bytes of the buffer used by Thrift | -|Type| long | -|Default| 1024 | -|Effective|After restarting system| - -* dn\_connection\_timeout\_ms - -| Name | dn\_connection\_timeout\_ms | -|:-----------:|:---------------------------------------------------| -| Description | Thrift socket and connection timeout between nodes | -| Type | int | -| Default | 60000 | -| Effective | After restarting system | - -* dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager | -|:------------:|:--------------------------------------------------------------| -| Description | Number of core clients routed to each node in a ClientManager | -| Type | int | -| Default | 200 | -| Effective | After restarting system | - -* dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager | -|:--------------:|:-------------------------------------------------------------| -| Description | Maximum number of clients routed to each node in a ClientManager | -| Type | int | -| Default | 300 | -| Effective | After restarting system | - -### 4.5 Directory Configuration - -* dn\_system\_dir - -| Name | dn\_system\_dir | -|:-----------:|:----------------------------------------------------------------------------| -| Description | The directories of system files. It is recommended to use an absolute path.
| -| Type | String | -| Default | data/datanode/system (Windows: data\\datanode\\system) | -| Effective | After restarting system | - -* dn\_data\_dirs - -| Name | dn\_data\_dirs | -|:-----------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | The directories of data files. Multiple directories are separated by comma. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. If the path does not exist, the system will automatically create it. | -| Type | String[] | -| Default | data/datanode/data (Windows: data\\datanode\\data) | -| Effective | After restarting system | - -* dn\_multi\_dir\_strategy - -| Name | dn\_multi\_dir\_strategy | -|:-----------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | IoTDB's 
strategy for selecting directories for TsFile in tsfile\_dir. You can use either the simple class name or the fully qualified class name. The system provides the following strategies:
1. SequenceStrategy: IoTDB selects the directory from tsfile\_dir in order, traverses all the directories in tsfile\_dir in turn, and keeps counting;
2. MaxDiskUsableSpaceFirstStrategy: IoTDB first selects the directory with the largest free disk space in tsfile\_dir;
You can implement a user-defined strategy as follows:
1. Extend the org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy class and implement its strategy method;
2. Set this configuration item to the full class name of the implementing class (package name plus class name, UserDefineStrategyPackage);
3. Add the jar file to the project. | -| Type | String | -| Default | SequenceStrategy | -| Effective | hot-load | - -* dn\_consensus\_dir - -| Name | dn\_consensus\_dir | -|:-----------:|:-------------------------------------------------------------------------------| -| Description | The directories of consensus files. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/consensus | -| Effective | After restarting system | - -* dn\_wal\_dirs - -| Name | dn\_wal\_dirs | -|:-----------:|:-------------------------------------------------------------------------| -| Description | Write Ahead Log storage path. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/wal | -| Effective | After restarting system | - -* dn\_tracing\_dir - -| Name | dn\_tracing\_dir | -|:-----------:|:----------------------------------------------------------------------------| -| Description | The tracing root directory path. It is recommended to use an absolute path. | -| Type | String | -| Default | datanode/tracing | -| Effective | After restarting system | - -* dn\_sync\_dir - -| Name | dn\_sync\_dir | -|:-----------:|:--------------------------------------------------------------------------| -| Description | The directories of sync files. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/sync | -| Effective | After restarting system | - -### 4.6 Metric Configuration - -* dn\_metric\_reporter\_list - -| Name | dn\_metric\_reporter\_list | -|:-----------:|:--------------------------------------| -| Description | Systems for reporting DataNode metrics. | -| Type | String | -| Default | None | -| Effective | After restarting system | - -* dn\_metric\_level - -| Name | dn\_metric\_level | -|:-----------:|:------------------------------------| -| Description | Level of detail for DataNode metrics. 
| -| Type | String | -| Default | IMPORTANT | -| Effective | After restarting system | - -* dn\_metric\_async\_collect\_period - -| Name | dn\_metric\_async\_collect\_period | -|:-----------:|:------------------------------------------------------------| -| Description | Period for asynchronous metric collection in DataNode (in seconds). | -| Type | int | -| Default | 5 | -| Effective | After restarting system | - -* dn\_metric\_prometheus\_reporter\_port - -| Name | dn\_metric\_prometheus\_reporter\_port | -|:-----------:|:------------------------------------------| -| Description | Port for Prometheus metric reporting in DataNode. | -| Type | int | -| Default | 9092 | -| Effective | After restarting system | - -* dn\_metric\_internal\_reporter\_type - -| Name | dn\_metric\_internal\_reporter\_type | -|:-----------:|:------------------------------------------------------------| -| Description | Internal reporter types for DataNode metrics. For internal monitoring and checking that the data has been successfully written and refreshed. | -| Type | String | -| Default | IOTDB | -| Effective | After restarting system | - -## 5. Enable GC log - -GC log is off by default. -For performance tuning, you may want to collect the GC info. - -To enable GC log, just add a parameter "printgc" when you start the DataNode. - -```bash -nohup sbin/start-datanode.sh printgc >/dev/null 2>&1 & -``` -Or -```bash -# Before version V2.0.4.x -sbin\start-datanode.bat printgc - -# V2.0.4.x and later versions -sbin\windows\start-datanode.bat printgc -``` - -GC log is stored at `IOTDB_HOME/logs/gc.log`. -There will be at most 10 gc.log.* files and each one can reach to 10MB. 
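The rotation limits above imply an upper bound of roughly 10 × 10 MB = 100 MB of disk space for GC logs. The following is a minimal sketch for checking actual usage against that bound. It assumes the installation directory is passed in explicitly, that rotated files match `gc.log*` as described above, and that "MB" means 2^20 bytes. The function name `gc_log_usage` is hypothetical and is not part of IoTDB.

```python
from pathlib import Path

MAX_FILES = 10                      # "at most 10 gc.log.* files"
MAX_FILE_BYTES = 10 * 1024 * 1024   # "10MB" per file (assumed to mean 2**20-byte MB)


def gc_log_usage(iotdb_home):
    """Return (file_count, total_bytes, upper_bound_bytes) for GC logs.

    Looks only at files named gc.log* directly under <iotdb_home>/logs.
    """
    upper_bound = MAX_FILES * MAX_FILE_BYTES
    log_dir = Path(iotdb_home) / "logs"
    if not log_dir.is_dir():
        # GC logging not enabled yet, or a different log location is in use.
        return 0, 0, upper_bound
    files = [p for p in log_dir.glob("gc.log*") if p.is_file()]
    total = sum(p.stat().st_size for p in files)
    return len(files), total, upper_bound
```

For example, `gc_log_usage(os.environ["IOTDB_HOME"])` (after `import os`) could be run once the DataNode has been started with `printgc`; a file count of zero suggests the parameter was not picked up.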
- -### 5.1 REST Service Configuration - -* enable\_rest\_service - -|Name| enable\_rest\_service | -|:---:|:--------------------------------------| -|Description| Whether to enable the REST service | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* rest\_service\_port - -|Name| rest\_service\_port | -|:---:|:------------------| -|Description| The port number the REST service listens on | -|Type| int32 | -|Default| 18080 | -|Effective| After restarting system | - -* enable\_swagger - -|Name| enable\_swagger | -|:---:|:-----------------------| -|Description| Whether to enable Swagger to display REST interface information | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* rest\_query\_default\_row\_size\_limit - -|Name| rest\_query\_default\_row\_size\_limit | -|:---:|:------------------------------------------------------------------------------------------| -|Description| The maximum number of rows in a result set that can be returned by a query | -|Type| int32 | -|Default| 10000 | -|Effective| After restarting system | - -* cache\_expire - -|Name| cache\_expire | -|:---:|:--------------------------------------------------------| -|Description| Expiration time for cached user login information | -|Type| int32 | -|Default| 28800 | -|Effective| After restarting system | - -* cache\_max\_num - -|Name| cache\_max\_num | -|:---:|:--------------| -|Description| The maximum number of users stored in the cache | -|Type| int32 | -|Default| 100 | -|Effective| After restarting system | - -* cache\_init\_num - -|Name| cache\_init\_num | -|:---:|:---------------| -|Description| Initial cache capacity | -|Type| int32 | -|Default| 10 | -|Effective| After restarting system | - - -* trust\_store\_path - -|Name| trust\_store\_path | -|:---:|:---------------| -|Description| trustStore path (optional) | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* trust\_store\_pwd - -|Name|
trust\_store\_pwd | -|:---:|:---------------------------------| -|Description| trustStore Password (Optional) | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* idle\_timeout - -|Name| idle\_timeout | -|:---:|:--------------| -|Description| SSL timeout duration, expressed in seconds | -|Type| int32 | -|Default| 5000 | -|Effective| After restarting system | - - -#### Storage engine configuration - - -* dn\_default\_space\_usage\_thresholds - -|Name| dn\_default\_space\_usage\_thresholds | -|:---:|:--------------| -|Description| Define the minimum remaining space ratio for each tier data catalogue; when the remaining space is less than this ratio, the data will be automatically migrated to the next tier; when the remaining storage space of the last tier falls below this threshold, the system will be set to READ_ONLY | -|Type| double | -|Default| 0.85 | -|Effective| hot-load | - -* remote\_tsfile\_cache\_dirs - -|Name| remote\_tsfile\_cache\_dirs | -|:---:|:--------------| -|Description| Cache directory stored locally in the cloud | -|Type| string | -|Default| data/datanode/data/cache | -|Effective| After restarting system | - -* remote\_tsfile\_cache\_page\_size\_in\_kb - -|Name| remote\_tsfile\_cache\_page\_size\_in\_kb | -|:---:|:--------------| -|Description| Block size of locally cached files stored in the cloud | -|Type| int | -|Default| 20480 | -|Effective| After restarting system | - -* remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb - -|Name| remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb | -|:---:|:--------------| -|Description| Maximum Disk Occupancy Size for Cloud Storage Local Cache | -|Type| long | -|Default| 51200 | -|Effective| After restarting system | - -* object\_storage\_type - -|Name| object\_storage\_type | -|:---:|:--------------| -|Description| Cloud Storage Type | -|Type| string | -|Default| AWS_S3 | -|Effective| After restarting system | - -* object\_storage\_bucket - -|Name| object\_storage\_bucket | 
-|:---:|:--------------| -|Description| Name of cloud storage bucket | -|Type| string | -|Default| iotdb_data | -|Effective| After restarting system | - -* object\_storage\_endpoint - -|Name| object\_storage\_endpoint | -|:---:|:--------------------------------| -|Description| endpoint of cloud storage | -|Type| string | -|Default| None | -|Effective| After restarting system | - -* object\_storage\_access\_key - -|Name| object\_storage\_access\_key | -|:---:|:--------------| -|Description| Authentication information stored in the cloud: key | -|Type| string | -|Default| None | -|Effective| After restarting system | - -* object\_storage\_access\_secret - -|Name| object\_storage\_access\_secret | -|:---:|:--------------| -|Description| Authentication information stored in the cloud: secret | -|Type| string | -|Default| None | -|Effective| After restarting system | diff --git a/src/UserGuide/Master/Tree/SQL-Manual/QuickStart-Only-Sql_timecho.md b/src/UserGuide/Master/Tree/SQL-Manual/QuickStart-Only-Sql_timecho.md deleted file mode 100644 index a0a58fba3..000000000 --- a/src/UserGuide/Master/Tree/SQL-Manual/QuickStart-Only-Sql_timecho.md +++ /dev/null @@ -1,111 +0,0 @@ - - -# QuickStart Only SQL - -> **Before executing the following SQL statements, please ensure** -> -> * **IoTDB service has been successfully started** -> * **Connected to IoTDB via Cli client** -> -> Note: If your terminal does not support multi-line pasting (e.g., Windows CMD), please adjust the SQL statements to single-line format before execution. - -## 1. Database Management - -```SQL --- Create database; -CREATE DATABASE root.ln; - --- View database; -SHOW DATABASES root.**; - --- Delete database; -DELETE DATABASE root.ln; - --- Count database; -COUNT DATABASES root.**; -``` - -For detailed syntax description, please refer to: [Database Management](../Basic-Concept/Operate-Metadata_timecho.md#_1-database-management) - -## 2. 
Time Series Management - -```SQL --- Create time series; -CREATE TIMESERIES root.ln.wf01.wt01.status BOOLEAN; -CREATE TIMESERIES root.ln.wf01.wt01.temperature FLOAT; - --- Create aligned time series; -CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT, longitude FLOAT); - --- Delete time series; -DELETE TIMESERIES root.ln.wf01.wt01.status; - --- View time series; -SHOW TIMESERIES root.ln.**; - --- Count time series; -COUNT TIMESERIES root.ln.**; -``` - -For detailed syntax description, please refer to: [Time Series Management](../Basic-Concept/Operate-Metadata_timecho.md#_2-timeseries-management) - -## 3. Data Writing - -```SQL --- Single column writing; -INSERT INTO root.ln.wf01.wt01(timestamp, temperature) VALUES(1, 23.0),(2, 42.6); - --- Multi-column writing; -INSERT INTO root.ln.wf01.wt01(timestamp, status, temperature) VALUES (3, false, 33.1),(4, true, 24.6); -``` - -For detailed syntax description, please refer to: [Data Writing](../Basic-Concept/Write-Data_timecho.md) - -## 4. Data Query - -```SQL --- Time filter query; -SELECT * from root.ln.** where time > 1; - --- Value filter query; -SELECT temperature FROM root.ln.wf01.wt01 where temperature > 36.5; - --- Function query; -SELECT count(temperature) FROM root.ln.wf01.wt01; - --- Latest point query; -SELECT LAST status FROM root.ln.wf01.wt01; -``` - -For detailed syntax description, please refer to: [Data Query](../Basic-Concept/Query-Data_timecho.md) - -## 5. 
Data Deletion - -```SQL --- Single column deletion; -DELETE FROM root.ln.wf01.wt01.status WHERE time >= 20; - --- Multi-column deletion; -DELETE FROM root.ln.wf01.wt01.* where time <= 10; -``` - -For detailed syntax description, please refer to: [Data Deletion](../Basic-Concept/Delete-Data.md) \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/SQL-Manual/SQL-Manual_timecho.md b/src/UserGuide/Master/Tree/SQL-Manual/SQL-Manual_timecho.md deleted file mode 100644 index b791e680e..000000000 --- a/src/UserGuide/Master/Tree/SQL-Manual/SQL-Manual_timecho.md +++ /dev/null @@ -1,1697 +0,0 @@ - - -# SQL Manual - -## 1. DATABASE MANAGEMENT - -For more details, see document [Operate-Metadata](../Basic-Concept/Operate-Metadata_timecho.md). - -### 1.1 Create Database - -```sql -create database root.ln; -create database root.sgcc; -``` - -### 1.2 Show Databases - -```sql -SHOW DATABASES; -SHOW DATABASES root.**; -``` - -### 1.3 Delete Database - -```sql -DELETE DATABASE root.ln; -DELETE DATABASE root.sgcc; -// delete all data, all timeseries and all databases; -DELETE DATABASE root.**; -``` - -### 1.4 Count Databases - -```sql -count databases; -count databases root.*; -count databases root.sgcc.*; -count databases root.sgcc; -``` - -### 1.5 Setting up heterogeneous databases (Advanced operations) - -#### Set heterogeneous parameters when creating a Database - -```sql -CREATE DATABASE root.db WITH SCHEMA_REPLICATION_FACTOR=1, DATA_REPLICATION_FACTOR=3, SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -#### Adjust heterogeneous parameters at run time - -```sql -ALTER DATABASE root.db WITH SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -#### Show heterogeneous databases - -```sql -SHOW DATABASES DETAILS; -``` - -### 1.6 TTL - -#### Set TTL - -```sql -set ttl to root.ln 3600000; -set ttl to root.sgcc.** 3600000; -set ttl to root.** 3600000; -``` - -#### Unset TTL - -```sql -unset ttl from root.ln; -unset ttl from root.sgcc.**; -unset ttl from 
root.**; -``` - -#### Show TTL - -```sql -SHOW ALL TTL; -SHOW TTL ON StorageGroupNames; -SHOW DEVICES; -``` - -## 2. TIMESERIES MANAGEMENT - -For more details, see document [Operate-Metadata](../Basic-Concept/Operate-Metadata_timecho.md). - -### 2.1 Create Timeseries - -```sql -create timeseries root.ln.wf01.wt01.status with datatype=BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature with datatype=FLOAT; -create timeseries root.ln.wf02.wt02.hardware with datatype=TEXT; -create timeseries root.ln.wf02.wt02.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature with datatype=FLOAT; -``` - -- From v0.13, you can use a simplified version of the SQL statements to create timeseries: - -```sql -create timeseries root.ln.wf01.wt01.status BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature FLOAT; -create timeseries root.ln.wf02.wt02.hardware TEXT; -create timeseries root.ln.wf02.wt02.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature FLOAT; -``` - -- Note that if the encoding method in a CREATE TIMESERIES statement conflicts with the data type, the system returns an error such as the following: - -```sql -create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF; -error: encoding TS_2DIFF does not support BOOLEAN -``` - -### 2.2 Create Aligned Timeseries - -```sql -CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT , longitude FLOAT); -``` - -### 2.3 Modify Timeseries Data Type -> Supported since V2.0.8.2 - -```SQL -ALTER TIMESERIES root.ln.wf01.wt01.temperature set data type DOUBLE -``` - -### 2.4 Modify Timeseries Name -> This statement is supported from V2.0.8.2 onwards - -```sql -ALTER TIMESERIES root.ln.wf01.wt01.temperature RENAME TO
root.newln.newwf.newwt.temperature -``` - -### 2.5 Delete Timeseries - -```sql -delete timeseries root.ln.wf01.wt01.status; -delete timeseries root.ln.wf01.wt01.temperature, root.ln.wf02.wt02.hardware; -delete timeseries root.ln.wf02.*; -drop timeseries root.ln.wf02.*; -``` - -### 2.6 Show Timeseries - -```sql -show timeseries root.**; -show timeseries root.ln.**; -show timeseries root.ln.** limit 10 offset 10; -show timeseries root.ln.** where timeseries contains 'wf01.wt'; -show timeseries root.ln.** where dataType=FLOAT; -show timeseries root.ln.** where time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -show latest timeseries; -show invalid timeseries; -- This statement is supported from V2.0.8.2 onwards; -``` - -### 2.7 Count Timeseries - -```sql -COUNT TIMESERIES root.**; -COUNT TIMESERIES root.ln.**; -COUNT TIMESERIES root.ln.*.*.status; -COUNT TIMESERIES root.ln.wf01.wt01.status; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc'; -COUNT TIMESERIES root.** WHERE DATATYPE = INT64; -COUNT TIMESERIES root.** WHERE TAGS(unit) contains 'c'; -COUNT TIMESERIES root.** WHERE TAGS(unit) = 'c'; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' group by level = 1; -COUNT TIMESERIES root.** GROUP BY LEVEL=1; -COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2; -COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2; -``` - -### 2.8 Tag and Attribute Management - -```sql -create timeseries root.turbine.d1.s1(temprature) with datatype=FLOAT tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2); -``` - -* Rename the tag/attribute key - -```SQL -ALTER timeseries root.turbine.d1.s1 RENAME tag1 TO newTag1; -``` - -* Reset the tag/attribute value - -```SQL -ALTER timeseries root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1; -``` - -* Delete the existing tag/attribute - -```SQL -ALTER timeseries root.turbine.d1.s1 DROP tag1, tag2; -``` - -* Add new tags - -```SQL -ALTER timeseries root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4; -``` - -* Add new attributes - -```SQL 
-ALTER timeseries root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4; -``` - -* Upsert alias, tags and attributes - -> add alias or a new key-value if the alias or key doesn't exist, otherwise, update the old one with new value. - -```SQL -ALTER timeseries root.turbine.d1.s1 UPSERT ALIAS=newAlias TAGS(tag3=v3, tag4=v4) ATTRIBUTES(attr3=v3, attr4=v4); -``` - -* Show timeseries using tags. Use TAGS(tagKey) to identify the tags used as filter key - -```SQL -SHOW TIMESERIES (<`PathPattern`>)? timeseriesWhereClause; -``` - -returns all the timeseries information that satisfy the where condition and match the pathPattern. SQL statements are as follows: - -```SQL -ALTER timeseries root.ln.wf02.wt02.hardware ADD TAGS unit=c; -ALTER timeseries root.ln.wf02.wt02.status ADD TAGS description=test1; -show timeseries root.ln.** where TAGS(unit)='c'; -show timeseries root.ln.** where TAGS(description) contains 'test1'; -``` - -- count timeseries using tags - -```SQL -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause; -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause GROUP BY LEVEL=; -``` - -returns all the number of timeseries that satisfy the where condition and match the pathPattern. 
SQL statements are as follows: - -```SQL -count timeseries; -count timeseries root.** where TAGS(unit)='c'; -count timeseries root.** where TAGS(unit)='c' group by level = 2; -``` - -create aligned timeseries - -```SQL -create aligned timeseries root.sg1.d1(s1 INT32 tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2), s2 DOUBLE tags(tag3=v3, tag4=v4) attributes(attr3=v3, attr4=v4)); -``` - -The execution result is as follows: - -```SQL -show timeseries; -``` -```shell -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -|root.sg1.d1.s2| null| root.sg1| DOUBLE| GORILLA| SNAPPY|{"tag4":"v4","tag3":"v3"}|{"attr4":"v4","attr3":"v3"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -Support query: - -```SQL -show timeseries where TAGS(tag1)='v1'; -``` -```shell -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| 
-+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -The above operations are supported for timeseries tag, attribute updates, etc. - -## 3. NODE MANAGEMENT - -For more details, see document [Operate-Metadata](../Basic-Concept/Operate-Metadata_timecho.md). - -### 3.1 Show Child Paths - -```SQL -SHOW CHILD PATHS pathPattern; -``` - -### 3.2 Show Child Nodes - -```SQL -SHOW CHILD NODES pathPattern; -``` - -### 3.3 Count Nodes - -```SQL -COUNT NODES root.** LEVEL=2; -COUNT NODES root.ln.** LEVEL=2; -COUNT NODES root.ln.wf01.** LEVEL=3; -COUNT NODES root.**.temperature LEVEL=3; -``` - -### 3.4 Show Devices - -```SQL -show devices; -show devices root.ln.**; -show devices root.ln.** where device contains 't'; -show devices with database; -show devices root.ln.** with database; -``` - -### 3.5 Count Devices - -```SQL -show devices; -count devices; -count devices root.ln.**; -``` - -## 4. INSERT & LOAD DATA - -### 4.1 Insert Data - -For more details, see document [Write-Data](../Basic-Concept/Write-Data_timecho). 
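When INSERT statements of the form used in this section are generated by client code rather than typed by hand, a small helper can keep the literal formatting consistent. The following is a minimal sketch that assumes only the literal forms appearing in this manual: unquoted numbers, lowercase `true`/`false`, and single-quoted text. `build_insert` and `format_literal` are hypothetical helpers, not part of any IoTDB client library, and quote escaping is deliberately left out.

```python
def format_literal(value):
    """Render a Python value as a literal in the style used by this manual."""
    if isinstance(value, bool):
        # bool must be checked before int, because bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        if "'" in value:
            raise ValueError("quote escaping is out of scope for this sketch")
        return "'" + value + "'"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def build_insert(device, measurements, rows):
    """Build one multi-row INSERT statement.

    device       -- device path, e.g. "root.ln.wf02.wt02" (used verbatim)
    measurements -- measurement names, e.g. ["status", "hardware"]
    rows         -- iterable of (timestamp, value1, value2, ...) tuples
    """
    columns = ", ".join(["timestamp", *measurements])
    rendered = []
    for row in rows:
        if len(row) != len(measurements) + 1:
            raise ValueError("each row needs a timestamp plus one value per measurement")
        rendered.append("(" + ", ".join(format_literal(v) for v in row) + ")")
    return f"insert into {device}({columns}) values {', '.join(rendered)};"
```

Calling `build_insert("root.ln.wf02.wt02", ["status", "hardware"], [(3, False, "v3"), (4, True, "v4")])` yields a statement equivalent to the multi-row example in this section. Device and measurement names are inserted verbatim, so they should come from trusted input only.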
- -#### Use of INSERT Statements - -- Insert Single Timeseries - -```sql -insert into root.ln.wf02.wt02(timestamp,status) values(1,true); -insert into root.ln.wf02.wt02(timestamp,hardware) values(1, 'v1'); -``` - -- Insert Multiple Timeseries - -```sql -insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (2, false, 'v2'); -insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4'); -``` - -- Use the Current System Timestamp as the Timestamp of the Data Point - -```SQL -insert into root.ln.wf02.wt02(status, hardware) values (false, 'v2'); -``` - -#### Insert Data Into Aligned Timeseries - -```SQL -create aligned timeseries root.sg1.d1(s1 INT32, s2 DOUBLE); -insert into root.sg1.d1(time, s1, s2) aligned values(1, 1, 1); -insert into root.sg1.d1(time, s1, s2) aligned values(2, 2, 2), (3, 3, 3); -select * from root.sg1.d1; -``` - -### 4.2 Load External TsFile Tool - -For more details, see document [Data Import](../Tools-System/Data-Import-Tool_timecho). - -#### Load with SQL - -1. Load a single tsfile by specifying a file path (absolute path). - -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile'` -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile' sglevel=1` -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile' onSuccess=delete` -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile' sglevel=1 onSuccess=delete` - - -2. Load a batch of files by specifying a folder path (absolute path). - -- `load '/Users/Desktop/data'` -- `load '/Users/Desktop/data' sglevel=1` -- `load '/Users/Desktop/data' onSuccess=delete` -- `load '/Users/Desktop/data' sglevel=1 onSuccess=delete` - -#### Load with Script - -```sql -./load-rewrite.bat -f D:\IoTDB\data -h 192.168.0.101 -p 6667 -u root -pw root -``` - -## 5. DELETE DATA - -For more details, see document [Write-Delete-Data](../Basic-Concept/Write-Data_timecho). 
- -### 5.1 Delete Single Timeseries - -```sql -delete from root.ln.wf02.wt02.status where time<=2017-11-01T16:26:00; -delete from root.ln.wf02.wt02.status where time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -delete from root.ln.wf02.wt02.status where time < 10; -delete from root.ln.wf02.wt02.status where time <= 10; -delete from root.ln.wf02.wt02.status where time < 20 and time > 10; -delete from root.ln.wf02.wt02.status where time <= 20 and time >= 10; -delete from root.ln.wf02.wt02.status where time > 20; -delete from root.ln.wf02.wt02.status where time >= 20; -delete from root.ln.wf02.wt02.status where time = 20; -delete from root.ln.wf02.wt02.status where time > 4 or time < 0; -Msg: 303: Check metadata error: For delete statement, where clause can only contain atomic; -expressions like : time > XXX, time <= XXX, or two atomic expressions connected by 'AND'; -delete from root.ln.wf02.wt02.status; -``` - -### 5.2 Delete Multiple Timeseries - -```sql -delete from root.ln.wf02.wt02 where time <= 2017-11-01T16:26:00; -delete from root.ln.wf02.wt02.* where time <= 2017-11-01T16:26:00; -delete from root.ln.wf03.wt02.status where time < now(); -Msg: The statement is executed successfully. -``` - -### 5.3 Delete Time Partition (experimental) - -```sql -DELETE PARTITION root.ln 0,1,2; -``` - -## 6. QUERY DATA - -For more details, see document [Query-Data](../Basic-Concept/Query-Data_timecho). - -```sql -SELECT [LAST] selectExpr [, selectExpr] ... - [INTO intoItem [, intoItem] ...] - FROM prefixPath [, prefixPath] ... - [WHERE whereCondition] - [GROUP BY { - ([startTime, endTime), interval [, slidingStep]) | - LEVEL = levelNum [, levelNum] ... | - TAGS(tagKey [, tagKey] ... 
) | - VARIATION(expression[,delta][,ignoreNull=true/false]) | - CONDITION(expression,[keep>/>=/=/ 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; -``` - -#### Select Multiple Columns of Data for the Same Device According to Multiple Time Intervals - -```sql -select status,temperature from root.ln.wf01.wt01 where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` - -#### Choose Multiple Columns of Data for Different Devices According to Multiple Time Intervals - -```sql -select wf01.wt01.status,wf02.wt02.hardware from root.ln where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` - -#### Order By Time Query - -```sql -select * from root.ln.** where time > 1 order by time desc limit 10; -``` - -### 6.2 `SELECT` CLAUSE - -#### Use Alias - -```sql -select s1 as temperature, s2 as speed from root.ln.wf01.wt01; -``` - -#### Nested Expressions - -##### Nested Expressions with Time Series Query - -```sql -select a, - b, - ((a + 1) * 2 - 1) % 2 + 1.5, - sin(a + sin(a + sin(b))), - -(a + b) * (sin(a + b) * sin(a + b) + cos(a + b) * cos(a + b)) + 1 -from root.sg1; - -select (a + b) * 2 + sin(a) from root.sg; - -select (a + *) / 2 from root.sg1; - -select (a + b) * 3 from root.sg, root.ln; -``` - -##### Nested Expressions query with aggregations - -```sql -select avg(temperature), - sin(avg(temperature)), - avg(temperature) + 1, - -sum(hardware), - avg(temperature) + sum(hardware) -from root.ln.wf01.wt01; - -select avg(*), - (avg(*) + 1) * 3 / 2 -1 -from root.sg1; - -select avg(temperature), - sin(avg(temperature)), - avg(temperature) + 1, - -sum(hardware), - avg(temperature) + sum(hardware) as custom_sum -from root.ln.wf01.wt01 -GROUP BY([10, 90), 10ms); -``` - -#### Last Query - -```sql -select last status from root.ln.wf01.wt01; -select last status, temperature from 
root.ln.wf01.wt01 where time >= 2017-11-07T23:50:00; -select last * from root.ln.wf01.wt01 order by timeseries desc; -select last * from root.ln.wf01.wt01 order by dataType desc; -``` - -### 6.3 `WHERE` CLAUSE - -#### Time Filter - -```sql -select s1 from root.sg1.d1 where time > 2022-01-01T00:05:00.000; -select s1 from root.sg1.d1 where time = 2022-01-01T00:05:00.000; -select s1 from root.sg1.d1 where time >= 2022-01-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; -``` - -#### Value Filter - -```sql -select temperature from root.sg1.d1 where temperature > 36.5; -select status from root.sg1.d1 where status = true; -select temperature from root.sg1.d1 where temperature between 36.5 and 40; -select temperature from root.sg1.d1 where temperature not between 36.5 and 40; -select code from root.sg1.d1 where code in ('200', '300', '400', '500'); -select code from root.sg1.d1 where code not in ('200', '300', '400', '500'); -select code from root.sg1.d1 where temperature is null; -select code from root.sg1.d1 where temperature is not null; -``` - -#### Fuzzy Query - -- Fuzzy matching using `Like` - -```sql -select * from root.sg.d1 where value like '%cc%'; -select * from root.sg.device where value like '_b_'; -``` - -- Fuzzy matching using `Regexp` - -```sql -select * from root.sg.d1 where value regexp '^[A-Za-z]+$'; -select * from root.sg.d1 where value regexp '^[a-z]+$' and time > 100; -``` - -### 6.4 `GROUP BY` CLAUSE - -- Aggregate By Time without Specifying the Sliding Step Length - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d); -``` - -- Aggregate By Time Specifying the Sliding Step Length - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d); -``` - -- Aggregate by Natural Month - -```sql -select count(status) from root.ln.wf01.wt01 group by([2017-11-01T00:00:00, 2019-11-07T23:00:00), 1mo, 
2mo); -select count(status) from root.ln.wf01.wt01 group by([2017-10-31T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo); -``` - -- Left-Open, Right-Closed Range - -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d); -``` - -- Aggregation By Variation - -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6); -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, ignoreNull=false); -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, 4); -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6+s5, 10); -``` - -- Aggregation By Condition - -```sql -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoringNull=true); -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoringNull=false); -``` - -- Aggregation By Session - -```sql -select __endTime,count(*) from root.** group by session(1d); -select __endTime,sum(hardware) from root.ln.wf02.wt01 group by session(50s) having sum(hardware)>0 align by device; -``` - -- Aggregation By Count - -```sql -select count(charging_status), first_value(soc) from root.sg group by count(charging_status,5); -select count(charging_status), first_value(soc) from root.sg group by count(charging_status,5,ignoreNull=false); -``` - -- Aggregation By Level - -```sql -select count(status) from root.** group by level = 1; -select count(status) from root.** group by level = 3; -select count(status) from root.** group by level = 1, 3; -select max_value(temperature) from root.** group by level = 0; -select count(*) from root.ln.** group by level = 2; -``` - -- Aggregate By Time with Level Clause - -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d), 
level=1; -select count(status) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d), level=1; -``` - -- Aggregation query by one single tag - -```sql -SELECT AVG(temperature) FROM root.factory1.** GROUP BY TAGS(city); -``` - -- Aggregation query by multiple tags - -```sql -SELECT avg(temperature) FROM root.factory1.** GROUP BY TAGS(city, workshop); -``` - -- Downsampling Aggregation by tags based on Time Window - -```sql -SELECT avg(temperature) FROM root.factory1.** GROUP BY ([1000, 10000), 5s), TAGS(city, workshop); -``` - -### 6.5 `HAVING` CLAUSE - -Correct: - -```sql -select count(s1) from root.** group by ([1,11),2ms), level=1 having count(s2) > 1; -select count(s1), count(s2) from root.** group by ([1,11),2ms) having count(s2) > 1 align by device; -``` - -Incorrect: - -```sql -select count(s1) from root.** group by ([1,3),1ms) having sum(s1) > s1; -select count(s1) from root.** group by ([1,3),1ms) having s1 > 1; -select count(s1) from root.** group by ([1,3),1ms), level=1 having sum(d1.s1) > 1; -select count(d1.s1) from root.** group by ([1,3),1ms), level=1 having sum(s1) > 1; -``` - -### 6.6 `FILL` CLAUSE - -#### `PREVIOUS` Fill - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(previous); -``` - -#### `PREVIOUS` FILL and specify the fill timeout threshold -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(previous, 2m); -``` - -#### `LINEAR` Fill - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(linear); -``` - -#### Constant Fill - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(2.0); -select temperature, status from root.sgcc.wf03.wt01 where time >= 
2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(true); -``` - -### 6.7 `LIMIT` and `SLIMIT` CLAUSES (PAGINATION) - -#### Row Control over Query Results - -```sql -select status, temperature from root.ln.wf01.wt01 limit 10; -select status, temperature from root.ln.wf01.wt01 limit 5 offset 3; -select status,temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time< 2017-11-01T00:12:00.000 limit 2 offset 3; -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) limit 5 offset 3; -``` - -#### Column Control over Query Results - -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1; -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1 soffset 1; -select max_value(*) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) slimit 1 soffset 1; -``` - -#### Row and Column Control over Query Results - -```sql -select * from root.ln.wf01.wt01 limit 10 offset 100 slimit 2 soffset 0; -``` - -### 6.8 `ORDER BY` CLAUSE - -#### Order by in ALIGN BY TIME mode - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time desc; -``` - -#### Order by in ALIGN BY DEVICE mode - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by device desc,time asc align by device; -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time asc,device desc align by device; -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -select count(*) from root.ln.** group by ((2017-11-01T00:00:00.000+08:00,2017-11-01T00:03:00.000+08:00],1m) order by device asc,time asc align by device; -``` - -#### Order by arbitrary expressions - -```sql -select score from root.** order by score desc align by device; -select score,total from root.one order by base+score+bonus desc; 
-select score,total from root.one order by total desc; -select base, score, bonus, total from root.** order by total desc NULLS Last, - score desc NULLS Last, - bonus desc NULLS Last, - time desc align by device; -select min_value(total) from root.** order by min_value(total) asc align by device; -select min_value(total),max_value(base) from root.** order by max_value(total) desc align by device; -select score from root.** order by device asc, score desc, time asc align by device; -``` - -### 6.9 `ALIGN BY` CLAUSE - -#### Align by Device - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -``` - -### 6.10 `INTO` CLAUSE (QUERY WRITE-BACK) - -```sql -select s1, s2 into root.sg_copy.d1(t1), root.sg_copy.d2(t1, t2), root.sg_copy.d1(t2) from root.sg.d1, root.sg.d2; -select count(s1 + s2), last_value(s2) into root.agg.count(s1_add_s2), root.agg.last_value(s2) from root.sg.d1 group by ([0, 100), 10ms); -select s1, s2 into root.sg_copy.d1(t1, t2), root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -select s1 + s2 into root.expr.add(d1s1_d1s2), root.expr.add(d2s1_d2s2) from root.sg.d1, root.sg.d2 align by device; -``` - -- Using variable placeholders: - -```sql -select s1, s2 -into root.sg_copy.d1(::), root.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::) -from root.sg.d1, root.sg.d2; - -select d1.s1, d1.s2, d2.s3, d3.s4 -into ::(s1_1, s2_2), root.sg.d2_2(s3_3), root.${2}_copy.::(s4) -from root.sg; - -select * into root.sg_bk.::(::) from root.sg.**; - -select s1, s2, s3, s4 -into root.backup_sg.d1(s1, s2, s3, s4), root.backup_sg.d2(::), root.sg.d3(backup_${4}) -from root.sg.d1, root.sg.d2, root.sg.d3 -align by device; - -select avg(s1), sum(s2) + sum(s3), count(s4) -into root.agg_${2}.::(avg_s1, sum_s2_add_s3, count_s4) -from root.** -align by device; - -select * into ::(backup_${4}) from root.sg.** align by device; - -select s1, s2 into root.sg_copy.d1(t1, t2), aligned root.sg_copy.d2(t1, t2) from root.sg.d1, 
root.sg.d2 align by device; -``` - -## 7. Maintenance -Generate the corresponding query plan: -```sql -explain select s1,s2 from root.sg.d1; -``` -Execute the SQL statement and output an analysis of its execution: -```sql -explain analyze select s1,s2 from root.sg.d1 order by s1; -``` - -For more maintenance commands, refer to [Maintenance commands](../User-Manual/Maintenance-commands_timecho.md). - -## 8. OPERATOR - -For more details, see document [Operator-and-Expression](./Operator-and-Expression.md). - -### 8.1 Arithmetic Operators - -For details and examples, see the document [Arithmetic Operators and Functions](./Operator-and-Expression.md#_1-1-arithmetic-operators). - -```sql -select s1, - s1, s2, + s2, s1 + s2, s1 - s2, s1 * s2, s1 / s2, s1 % s2 from root.sg.d1; -``` - -### 8.2 Comparison Operators - -For details and examples, see the document [Comparison Operators and Functions](./Operator-and-Expression.md#_1-2-comparison-operators). - -```sql -# Basic comparison operators; -select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test; - -# `BETWEEN ... 
AND ...` operator; -select temperature from root.sg1.d1 where temperature between 36.5 and 40; -select temperature from root.sg1.d1 where temperature not between 36.5 and 40; - -# Fuzzy matching operator: Use `Like` for fuzzy matching; -select * from root.sg.d1 where value like '%cc%'; -select * from root.sg.device where value like '_b_'; - -# Fuzzy matching operator: Use `Regexp` for fuzzy matching; -select * from root.sg.d1 where value regexp '^[A-Za-z]+$'; -select * from root.sg.d1 where value regexp '^[a-z]+$' and time > 100; -select b, b like '1%', b regexp '[0-2]' from root.test; - -# `IS NULL` operator; -select code from root.sg1.d1 where temperature is null; -select code from root.sg1.d1 where temperature is not null; - -# `IN` operator; -select code from root.sg1.d1 where code in ('200', '300', '400', '500'); -select code from root.sg1.d1 where code not in ('200', '300', '400', '500'); -select a, a in (1, 2) from root.test; -``` - -### 8.3 Logical Operators - -For details and examples, see the document [Logical Operators](./Operator-and-Expression.md#_1-3-logical-operators). - -```sql -select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test; -``` - -## 9. BUILT-IN FUNCTIONS - -For more details, see document [Operator-and-Expression](./Operator-and-Expression.md#_2-built-in-functions). - -### 9.1 Aggregate Functions - -For details and examples, see the document [Aggregate Functions](./Operator-and-Expression.md#_2-1-aggregate-functions). - -```sql -select count(status) from root.ln.wf01.wt01; - -select count_if(s1=0 & s2=0, 3), count_if(s1=1 & s2=0, 3) from root.db.d1; -select count_if(s1=0 & s2=0, 3, 'ignoreNull'='false'), count_if(s1=1 & s2=0, 3, 'ignoreNull'='false') from root.db.d1; - -select time_duration(s1) from root.db.d1; -``` - -### 9.2 Arithmetic Functions - -For details and examples, see the document [Arithmetic Operators and Functions](./Operator-and-Expression.md#_2-2-arithmetic-functions). 
- -```sql -select s1, sin(s1), cos(s1), tan(s1) from root.sg1.d1 limit 5 offset 1000; -select s4,round(s4),round(s4,2),round(s4,-1) from root.sg1.d1; -``` - -### 9.3 Comparison Functions - -For details and examples, see the document [Comparison Operators and Functions](./Operator-and-Expression.md#_2-3-comparison-functions). - -```sql -select ts, on_off(ts, 'threshold'='2') from root.test; -select ts, in_range(ts, 'lower'='2', 'upper'='3.1') from root.test; -``` - -### 9.4 String Processing Functions - -For details and examples, see the document [String Processing](./Operator-and-Expression.md#_2-4-string-processing-functions). - -```sql -select s1, string_contains(s1, 's'='warn') from root.sg1.d4; -select s1, string_matches(s1, 'regex'='[^\\s]+37229') from root.sg1.d4; -select s1, length(s1) from root.sg1.d1; -select s1, locate(s1, "target"="1") from root.sg1.d1; -select s1, locate(s1, "target"="1", "reverse"="true") from root.sg1.d1; -select s1, startswith(s1, "target"="1") from root.sg1.d1; -select s1, endswith(s1, "target"="1") from root.sg1.d1; -select s1, s2, concat(s1, s2, "target1"="IoT", "target2"="DB") from root.sg1.d1; -select s1, s2, concat(s1, s2, "target1"="IoT", "target2"="DB", "series_behind"="true") from root.sg1.d1; -select s1, substring(s1 from 1 for 2) from root.sg1.d1; -select s1, replace(s1, 'es', 'tt') from root.sg1.d1; -select s1, upper(s1) from root.sg1.d1; -select s1, lower(s1) from root.sg1.d1; -select s3, trim(s3) from root.sg1.d1; -select s1, s2, strcmp(s1, s2) from root.sg1.d1; -select strreplace(s1, "target"=",", "replace"="/", "limit"="2") from root.test.d1; -select strreplace(s1, "target"=",", "replace"="/", "limit"="1", "offset"="1", "reverse"="true") from root.test.d1; -select regexmatch(s1, "regex"="\d+\.\d+\.\d+\.\d+", "group"="0") from root.test.d1; -select regexreplace(s1, "regex"="192\.168\.0\.(\d+)", "replace"="cluster-$1", "limit"="1") from root.test.d1; -select regexsplit(s1, "regex"=",", "index"="-1") from root.test.d1; 
-select regexsplit(s1, "regex"=",", "index"="3") from root.test.d1; -``` - -### 9.5 Data Type Conversion Function - -For details and examples, see the document [Data Type Conversion Function](./Operator-and-Expression.md#_2-5-data-type-conversion-function). - -```sql -SELECT cast(s1 as INT32) from root.sg; -``` - -### 9.6 Constant Timeseries Generating Functions - -For details and examples, see the document [Constant Timeseries Generating Functions](./Operator-and-Expression.md#_2-6-constant-timeseries-generating-functions). - -```sql -select s1, s2, const(s1, 'value'='1024', 'type'='INT64'), pi(s2), e(s1, s2) from root.sg1.d1; -``` - -### 9.7 Selector Functions - -For details and examples, see the document [Selector Functions](./Operator-and-Expression.md#_2-7-selector-functions). - -```sql -select s1, top_k(s1, 'k'='2'), bottom_k(s1, 'k'='2') from root.sg1.d2 where time > 2020-12-10T20:36:15.530+08:00; -``` - -### 9.8 Continuous Interval Functions - -For details and examples, see the document [Continuous Interval Functions](./Operator-and-Expression.md#_2-8-continuous-interval-functions). - -```sql -select s1, zero_count(s1), non_zero_count(s2), zero_duration(s3), non_zero_duration(s4) from root.sg.d2; -``` - -### 9.9 Variation Trend Calculation Functions - -For details and examples, see the document [Variation Trend Calculation Functions](./Operator-and-Expression.md#_2-9-variation-trend-calculation-functions). - -```sql -select s1, time_difference(s1), difference(s1), non_negative_difference(s1), derivative(s1), non_negative_derivative(s1) from root.sg1.d1 limit 5 offset 1000; - -SELECT DIFF(s1), DIFF(s2) from root.test; -SELECT DIFF(s1, 'ignoreNull'='false'), DIFF(s2, 'ignoreNull'='false') from root.test; -``` - -### 9.10 Sample Functions - -For details and examples, see the document [Sample Functions](./Operator-and-Expression.md#_2-10-sample-functions). 
- -```sql -select equal_size_bucket_random_sample(temperature,'proportion'='0.1') as random_sample from root.ln.wf01.wt01; -select equal_size_bucket_agg_sample(temperature, 'type'='avg','proportion'='0.1') as agg_avg, equal_size_bucket_agg_sample(temperature, 'type'='max','proportion'='0.1') as agg_max, equal_size_bucket_agg_sample(temperature,'type'='min','proportion'='0.1') as agg_min, equal_size_bucket_agg_sample(temperature, 'type'='sum','proportion'='0.1') as agg_sum, equal_size_bucket_agg_sample(temperature, 'type'='extreme','proportion'='0.1') as agg_extreme, equal_size_bucket_agg_sample(temperature, 'type'='variance','proportion'='0.1') as agg_variance from root.ln.wf01.wt01; -select equal_size_bucket_m4_sample(temperature, 'proportion'='0.1') as M4_sample from root.ln.wf01.wt01; -select equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='avg', 'number'='2') as outlier_avg_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='stendis', 'number'='2') as outlier_stendis_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='cos', 'number'='2') as outlier_cos_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='prenextdis', 'number'='2') as outlier_prenextdis_sample from root.ln.wf01.wt01; - -select M4(s1,'timeInterval'='25','displayWindowBegin'='0','displayWindowEnd'='100') from root.vehicle.d1; -select M4(s1,'windowSize'='10') from root.vehicle.d1; -``` - -### 9.11 Change Points Function - -For details and examples, see the document [Time-Series](./Operator-and-Expression.md#_2-11-change-points-function). - -```sql -select change_points(s1), change_points(s2), change_points(s3), change_points(s4), change_points(s5), change_points(s6) from root.testChangePoints.d1; -``` - -## 10. DATA QUALITY FUNCTION LIBRARY - -For more details, see document [Operator-and-Expression](../SQL-Manual/UDF-Libraries.md). 
- -### 10.1 Data Quality - -For details and examples, see the document [Data-Quality](../SQL-Manual/UDF-Libraries.md#data-quality). - -```sql -# Completeness; -select completeness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select completeness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Consistency; -select consistency(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select consistency(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Timeliness; -select timeliness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select timeliness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Validity; -select Validity(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select Validity(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Accuracy; -select Accuracy(t1,t2,t3,m1,m2,m3) from root.test; -``` - -### 10.2 Data Profiling - -For details and examples, see the document [Data-Profiling](../SQL-Manual/UDF-Libraries.md#data-profiling). 
- -```sql -# ACF; -select acf(s1) from root.test.d1 where time <= 2020-01-01 00:00:05; - -# Distinct; -select distinct(s2) from root.test.d2; - -# Histogram; -select histogram(s1,"min"="1","max"="20","count"="10") from root.test.d1; - -# Integral; -select integral(s1) from root.test.d1 where time <= 2020-01-01 00:00:10; -select integral(s1, "unit"="1m") from root.test.d1 where time <= 2020-01-01 00:00:10; - -# IntegralAvg; -select integralavg(s1) from root.test.d1 where time <= 2020-01-01 00:00:10; - -# Mad; -select mad(s0) from root.test; -select mad(s0, "error"="0.01") from root.test; - -# Median; -select median(s0, "error"="0.01") from root.test; - -# MinMax; -select minmax(s1) from root.test; - -# Mode; -select mode(s2) from root.test.d2; - -# MvAvg; -select mvavg(s1, "window"="3") from root.test; - -# PACF; -select pacf(s1, "lag"="5") from root.test; - -# Percentile; -select percentile(s0, "rank"="0.2", "error"="0.01") from root.test; - -# Quantile; -select quantile(s0, "rank"="0.2", "K"="800") from root.test; - -# Period; -select period(s1) from root.test.d3; - -# QLB; -select QLB(s1) from root.test.d1; - -# Resample; -select resample(s1,'every'='5m','interp'='linear') from root.test.d1; -select resample(s1,'every'='30m','aggr'='first') from root.test.d1; -select resample(s1,'every'='30m','start'='2021-03-06 15:00:00') from root.test.d1; - -# Sample; -select sample(s1,'method'='reservoir','k'='5') from root.test.d1; -select sample(s1,'method'='isometric','k'='5') from root.test.d1; - -# Segment; -select segment(s1, "error"="0.1") from root.test; - -# Skew; -select skew(s1) from root.test.d1; - -# Spline; -select spline(s1, "points"="151") from root.test; - -# Spread; -select spread(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; - -# Stddev; -select stddev(s1) from root.test.d1; - -# ZScore; -select zscore(s1) from root.test; -``` - -### 10.3 Anomaly Detection - -For details and examples, see the document 
[Anomaly-Detection](../SQL-Manual/UDF-Libraries.md#anomaly-detection). - -```sql -# IQR; -select iqr(s1) from root.test; - -# KSigma; -select ksigma(s1,"k"="1.0") from root.test.d1 where time <= 2020-01-01 00:00:30; - -# LOF; -select lof(s1,s2) from root.test.d1 where time<1000; -select lof(s1, "method"="series") from root.test.d1 where time<1000; - -# MissDetect; -select missdetect(s2,'minlen'='10') from root.test.d2; - -# Range; -select range(s1,"lower_bound"="101.0","upper_bound"="125.0") from root.test.d1 where time <= 2020-01-01 00:00:30; - -# TwoSidedFilter; -select TwoSidedFilter(s0, 'len'='5', 'threshold'='0.3') from root.test; - -# Outlier; -select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test; - -# MasterTrain; -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test; - -# MasterDetect; -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test; -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test; -``` - -### 10.4 Frequency Domain - -For details and examples, see the document [Frequency-Domain](../SQL-Manual/UDF-Libraries.md#frequency-domain-analysis). 
- -```sql -# Conv; -select conv(s1,s2) from root.test.d2; - -# Deconv; -select deconv(s3,s2) from root.test.d2; -select deconv(s3,s2,'result'='remainder') from root.test.d2; - -# DWT; -select dwt(s1,"method"="haar") from root.test.d1; - -# FFT; -select fft(s1) from root.test.d1; -select fft(s1, 'result'='real', 'compress'='0.99'), fft(s1, 'result'='imag','compress'='0.99') from root.test.d1; - -# HighPass; -select highpass(s1,'wpass'='0.45') from root.test.d1; - -# IFFT; -select ifft(re, im, 'interval'='1m', 'start'='2021-01-01 00:00:00') from root.test.d1; - -# LowPass; -select lowpass(s1,'wpass'='0.45') from root.test.d1; - -# Envelope; -select envelope(s1) from root.test.d1; -``` - -### 10.5 Data Matching - -For details and examples, see the document [Data-Matching](../SQL-Manual/UDF-Libraries.md#data-matching). - -```sql -# Cov; -select cov(s1,s2) from root.test.d2; - -# DTW; -select dtw(s1,s2) from root.test.d2; - -# Pearson; -select pearson(s1,s2) from root.test.d2; - -# PtnSym; -select ptnsym(s4, 'window'='5', 'threshold'='0') from root.test.d1; - -# XCorr; -select xcorr(s1, s2) from root.test.d1 where time <= 2020-01-01 00:00:05; -``` - -### 10.6 Data Repairing - -For details and examples, see the document [Data-Repairing](../SQL-Manual/UDF-Libraries.md#data-repairing). 
- -```sql -# TimestampRepair; -select timestamprepair(s1,'interval'='10000') from root.test.d2; -select timestamprepair(s1) from root.test.d2; - -# ValueFill; -select valuefill(s1) from root.test.d2; -select valuefill(s1,"method"="previous") from root.test.d2; - -# ValueRepair; -select valuerepair(s1) from root.test.d2; -select valuerepair(s1,'method'='LsGreedy') from root.test.d2; - -# MasterRepair; -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test; - -# SeasonalRepair; -select seasonalrepair(s1,'period'=3,'k'=2) from root.test.d2; -select seasonalrepair(s1,'method'='improved','period'=3) from root.test.d2; -``` - -### 10.7 Series Discovery - -For details and examples, see the document [Series-Discovery](../SQL-Manual/UDF-Libraries.md#series-discovery). - -```sql -# ConsecutiveSequences; -select consecutivesequences(s1,s2,'gap'='5m') from root.test.d1; -select consecutivesequences(s1,s2) from root.test.d1; - -# ConsecutiveWindows; -select consecutivewindows(s1,s2,'length'='10m') from root.test.d1; -``` - -### 10.8 Machine Learning - -For details and examples, see the document [Machine-Learning](../SQL-Manual/UDF-Libraries.md#machine-learning). - -```sql -# AR; -select ar(s0,"p"="2") from root.test.d0; - -# Representation; -select representation(s0,"tb"="3","vb"="2") from root.test.d0; - -# RM; -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0; -``` - - -## 11. CONDITIONAL EXPRESSION - -For details and examples, see the document [Conditional Expressions](../SQL-Manual/UDF-Libraries.md#conditional-expressions). 
- -```sql -select T, P, case -when 1000=1050 then "bad temperature" -when P<=1000000 or P>=1100000 then "bad pressure" -end as `result` -from root.test1; - -select str, case -when str like "%cc%" then "has cc" -when str like "%dd%" then "has dd" -else "no cc and dd" end as `result` -from root.test2; - -select -count(case when x<=1 then 1 end) as `(-∞,1]`, -count(case when 1 -[RESAMPLE - [EVERY ] - [BOUNDARY ] - [RANGE [, end_time_offset]] -] -[TIMEOUT POLICY BLOCKED|DISCARD] -BEGIN - SELECT CLAUSE - INTO CLAUSE - FROM CLAUSE - [WHERE CLAUSE] - [GROUP BY([, ]) [, level = ]] - [HAVING CLAUSE] - [FILL ({PREVIOUS | LINEAR | constant} (, interval=DURATION_LITERAL)?)] - [LIMIT rowLimit OFFSET rowOffset] - [ALIGN BY DEVICE] -END; -``` - -### 13.1 Configuring execution intervals - -```sql -CREATE CONTINUOUS QUERY cq1 -RESAMPLE EVERY 20s -BEGIN -SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) -END; -``` - -### 13.2 Configuring time range for resampling - -```sql -CREATE CONTINUOUS QUERY cq2 -RESAMPLE RANGE 40s -BEGIN - SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) -END; -``` - -### 13.3 Configuring execution intervals and CQ time ranges - -```sql -CREATE CONTINUOUS QUERY cq3 -RESAMPLE EVERY 20s RANGE 40s -BEGIN - SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) - FILL(100.0) -END; -``` - -### 13.4 Configuring end_time_offset for CQ time range - -```sql -CREATE CONTINUOUS QUERY cq4 -RESAMPLE EVERY 20s RANGE 40s, 20s -BEGIN - SELECT max_value(temperature) - INTO 
root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) - FILL(100.0) -END; -``` - -### 13.5 CQ without group by clause - -```sql -CREATE CONTINUOUS QUERY cq5 -RESAMPLE EVERY 20s -BEGIN - SELECT temperature + 1 - INTO root.precalculated_sg.::(temperature) - FROM root.ln.*.* - align by device -END; -``` - -### 13.6 CQ Management - -#### Listing continuous queries - -```sql -SHOW (CONTINUOUS QUERIES | CQS) -``` - -#### Dropping continuous queries - -```sql -DROP (CONTINUOUS QUERY | CQ) <cq_id> -``` - -#### Altering continuous queries - -CQs can't be altered once they're created. To change a CQ, you must `DROP` it and `CREATE` it again with the updated settings. - -## 14. USER-DEFINED FUNCTION (UDF) - -For more details, see document [UDF Libraries](../SQL-Manual/UDF-Libraries.md). - -### 14.1 UDF Registration - -```sql -CREATE FUNCTION <UDF-NAME> AS <UDF-CLASS-FULL-PATHNAME> (USING URI URI-STRING)? -``` - -### 14.2 UDF Deregistration - -```sql -DROP FUNCTION <UDF-NAME> -``` - -### 14.3 UDF Queries - -```sql -SELECT example(*) from root.sg.d1; -SELECT example(s1, *) from root.sg.d1; -SELECT example(*, *) from root.sg.d1; - -SELECT example(s1, 'key1'='value1', 'key2'='value2'), example(*, 'key3'='value3') FROM root.sg.d1; -SELECT example(s1, s2, 'key1'='value1', 'key2'='value2') FROM root.sg.d1; - -SELECT s1, s2, example(s1, s2) FROM root.sg.d1; -SELECT *, example(*) FROM root.sg.d1 DISABLE ALIGN; -SELECT s1 * example(* / s1 + s2) FROM root.sg.d1; -SELECT s1, s2, s1 + example(s1, s2), s1 - example(s1 + example(s1, s2) / s2) FROM root.sg.d1; -``` - -### 14.4 Show All Registered UDFs - -```sql -SHOW FUNCTIONS; -``` - -## 15. ADMINISTRATION MANAGEMENT - -For more details, see document [Authority Management](../User-Manual/Authority-Management_timecho.md). 
- -### 15.1 SQL Statements - -- Create user (Requires MANAGE_USER permission) - -```sql -CREATE USER <userName> <password>; -eg: CREATE USER user1 'passwd'; -``` - -- Delete user (Requires MANAGE_USER permission) - -```sql -DROP USER <userName>; -eg: DROP USER user1; -``` - -- Create role (Requires MANAGE_ROLE permission) - -```sql -CREATE ROLE <roleName>; -eg: CREATE ROLE role1; -``` - -- Delete role (Requires MANAGE_ROLE permission) - -```sql -DROP ROLE <roleName>; -eg: DROP ROLE role1; -``` - -- Grant role to user (Requires MANAGE_ROLE permission) - -```sql -GRANT ROLE <roleName> TO <userName>; -eg: GRANT ROLE admin TO user1; -``` - -- Revoke role from user (Requires MANAGE_ROLE permission) - -```sql -REVOKE ROLE <roleName> FROM <userName>; -eg: REVOKE ROLE admin FROM user1; -``` - -- List all users (Requires MANAGE_USER permission) - -```sql -LIST USER; -``` - -- List all roles (Requires MANAGE_ROLE permission) - -```sql -LIST ROLE; -``` - -- List all users granted a specific role (Requires MANAGE_USER permission) - -```sql -LIST USER OF ROLE <roleName>; -eg: LIST USER OF ROLE roleuser; -``` - -- List all roles granted to a specific user 
- -```sql -LIST ROLE OF USER <userName>; -eg: LIST ROLE OF USER tempuser; -``` - -- List all privileges of user - -```sql -LIST PRIVILEGES OF USER <userName>; -eg: LIST PRIVILEGES OF USER tempuser; -``` - -- List all privileges of role - -```sql -LIST PRIVILEGES OF ROLE <roleName>; -eg: LIST PRIVILEGES OF ROLE actor; -``` - -- Modify password - -```sql -ALTER USER <userName> SET PASSWORD <password>; -eg: ALTER USER tempuser SET PASSWORD 'newpwd'; -``` - -### 15.2 Authorization and Deauthorization - - -```sql -GRANT <PRIVILEGES> ON <PATHS> TO ROLE/USER <NAME> [WITH GRANT OPTION]; -eg: GRANT READ ON root.** TO ROLE role1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.** TO USER user1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.**,root.t2.** TO USER user1; -eg: GRANT MANAGE_ROLE ON root.** TO USER user1 WITH GRANT OPTION; -eg: GRANT ALL ON root.** TO USER user1 WITH GRANT OPTION; -``` - -```sql -REVOKE <PRIVILEGES> ON <PATHS> FROM ROLE/USER <NAME>; -eg: REVOKE READ ON root.** FROM ROLE role1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.** FROM USER user1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.**, root.t2.** FROM USER user1; -eg: REVOKE MANAGE_ROLE ON root.** FROM USER user1; -eg: REVOKE ALL ON root.** FROM USER user1; -``` - - -#### Delete Time Partition (experimental) - -```sql -Eg: DELETE PARTITION root.ln 0,1,2; -``` - -#### Continuous Query (CQ) - -```sql -Eg: CREATE CONTINUOUS QUERY cq1 BEGIN SELECT max_value(temperature) INTO temperature_max FROM root.ln.*.* GROUP BY time(10s) END; -``` - -#### Maintenance Command - -- FLUSH - -```sql -Eg: flush; -``` - -- MERGE - -```sql -Eg: MERGE; -Eg: FULL MERGE; -``` - -- CLEAR CACHE - -```sql -Eg: CLEAR CACHE; -``` - -- START REPAIR DATA - -```sql -Eg: START REPAIR DATA; -``` - -- STOP REPAIR DATA - -```sql -Eg: STOP REPAIR DATA; -``` - -- SET SYSTEM TO READONLY / WRITABLE - -```sql -Eg: SET SYSTEM TO READONLY / WRITABLE; -``` - -- Query abort - -```sql -Eg: KILL QUERY 1; -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md 
b/src/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md deleted file mode 100644 index c2e76b944..000000000 --- a/src/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md +++ /dev/null @@ -1,4977 +0,0 @@ - - -# UDF Libraries - -Building on its user-defined function (UDF) capability, IoTDB provides a library of functions for time series data processing, covering data quality, data profiling, anomaly detection, frequency domain analysis, data matching, data repairing, series discovery, and machine learning, to meet the time series processing needs of industrial applications. - -> Note: The functions in the current UDF library only support millisecond-level timestamp precision. - -## 1. Installation steps - -1. Obtain the compressed UDF library JAR package that is compatible with your IoTDB version. - - | UDF installation package | Supported IoTDB versions | Download link | - | --------------- | ----------------- | ------------------------------------------------------------ | - | TimechoDB-UDF-1.3.3.zip | V1.3.3 and above | Please contact Timecho for assistance | - | TimechoDB-UDF-1.3.2.zip | V1.0.0~V1.3.2 | Please contact Timecho for assistance| - -2. Extract the package and place the `library-udf.jar` file in the `/ext/udf` directory of every node in the IoTDB cluster. -3. In the IoTDB SQL command line terminal (CLI) or the SQL operation interface of the visualization console (Workbench), execute the registration statement of each function you need. -4. 
Batch registration: two methods are available, a registration script or the full set of SQL statements -- Register Script - - Copy the registration script (`register-UDF.sh` or `register-UDF.bat`) from the compressed package to the `tools` directory of IoTDB as needed, and modify the parameters in the script (default is host=127.0.0.1, rpcPort=6667, user=root, pass=root); - - Start the IoTDB service and run the registration script to batch register the UDFs - -- All SQL statements - - Open the SQL file in the compressed package, copy all SQL statements, and execute them in the SQL command line terminal (CLI) of IoTDB or the SQL operation interface of the visualization console (Workbench) to batch register the UDFs - -## 2. Data Quality - -### 2.1 Completeness - -#### Registration statement - -```sql -create function completeness as 'org.apache.iotdb.library.dquality.UDTFCompleteness' -``` - -#### Usage - -This function calculates the completeness of a time series, which measures the presence or absence of missing values in the time series data. The function divides the input time series data into consecutive non-overlapping time windows, computes the data completeness for each window individually, and outputs the timestamp of the first data point in the window along with the completeness result. - -**Name:** COMPLETENESS - -**Input Series:** Supports only a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window, given either as a positive integer or as a positive number with a unit. An integer is the number of data points in each window; the last window may contain fewer. A number with a unit is the time span of the window; the unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. -+ `downtime`: Whether the downtime exception is considered in the calculation of completeness. It is 'true' or 'false' (default). 
When considering the downtime exception, long-term missing data will be considered as downtime exception without any influence on completeness. - -**Output Series:** Output a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** Only when the number of data points in the window exceeds 10, the calculation will be performed. Otherwise, the window will be ignored and nothing will be output. - -#### Examples - -##### Default Parameters - -With default parameters, this function will regard all input data as the same window. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select completeness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-----------------------------+ -| Time|completeness(root.test.d1.s1)| -+-----------------------------+-----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -+-----------------------------+-----------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
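The count-based windowing described above can be sketched as follows. This is a minimal illustration in Python, not the library's implementation: the helper names `split_by_count` and `window_starts` are hypothetical, timestamps are simplified to integer seconds, and the completeness score itself is not computed because the text does not define its formula.

```python
from typing import List, Tuple

Point = Tuple[int, float]  # (timestamp in seconds, value); a simplification

MIN_POINTS = 10  # per the note above: a window needs MORE than 10 points


def split_by_count(points: List[Point], window: int) -> List[List[Point]]:
    """Split points into consecutive, non-overlapping windows of `window`
    points each; the last window may contain fewer."""
    return [points[i:i + window] for i in range(0, len(points), window)]


def window_starts(points: List[Point], window: int) -> List[int]:
    """Timestamp of the first point of every window that is large enough
    to be evaluated; smaller windows are skipped and produce no output."""
    return [w[0][0] for w in split_by_count(points, window) if len(w) > MIN_POINTS]


# Timestamps (seconds past 00:00:00) of the 30-point example series.
seconds = [2, 3, 4, 6, 8, 10, 14, 15, 16, 18, 20, 22, 26, 28, 30] + list(range(32, 62, 2))
series = [(t, float(t)) for t in seconds]  # values are placeholders

print(window_starts(series, 15))  # two windows of 15 points each
print(window_starts(series, 20))  # second window has only 10 points, so it is skipped
```

With `window = 15` the two reported start times correspond to the `00:00:02` and `00:00:32` rows of the example output.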
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select completeness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------+ -| Time|completeness(root.test.d1.s1, "window"="15")| -+-----------------------------+--------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+--------------------------------------------+ -``` - -### 2.2 Consistency - -#### Registration statement - -```sql -create function 
consistency as 'org.apache.iotdb.library.dquality.UDTFConsistency' -``` - -#### Usage - -This function calculates the consistency of a time series, which measures whether the changes in the time series data are stable and follow uniform patterns. The function divides the input time series data into consecutive non-overlapping time windows, computes the data consistency for each window individually, and outputs the timestamp of the first data point in the window along with the consistency result. - -**Name:** CONSISTENCY - -**Input Series:** Supports only a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window, given either as a positive integer or as a positive number with a unit. An integer is the number of data points in each window; the last window may contain fewer. A number with a unit is the time span of the window; the unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. - -**Output Series:** Outputs a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** The calculation is performed only when a window contains more than 10 data points. Otherwise, the window is ignored and nothing is output. - -#### Examples - -##### Default Parameters - -With default parameters, this function will regard all input data as the same window. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select consistency(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|consistency(root.test.d1.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+----------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select consistency(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------+ -| Time|consistency(root.test.d1.s1, "window"="15")| -+-----------------------------+-------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+-------------------------------------------+ -``` - -### 2.3 Timeliness - -#### Registration statement - -```sql -create 
function timeliness as 'org.apache.iotdb.library.dquality.UDTFTimeliness' -``` - -#### Usage - -This function calculates the timeliness of a time series, which measures whether the time series data is collected and reported on schedule. The function divides the input time series data into consecutive non-overlapping time windows, computes the data timeliness for each window individually, and outputs the timestamp of the first data point in the window along with the timeliness result. - -**Name:** TIMELINESS - -**Input Series:** Supports only a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window, given either as a positive integer or as a positive number with a unit. An integer is the number of data points in each window; the last window may contain fewer. A number with a unit is the time span of the window; the unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. - -**Output Series:** Outputs a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** The calculation is performed only when a window contains more than 10 data points. Otherwise, the window is ignored and nothing is output. - -#### Examples - -##### Default Parameters - -With default parameters, this function will regard all input data as the same window. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timeliness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------+ -| Time|timeliness(root.test.d1.s1)| -+-----------------------------+---------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+---------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timeliness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------+ -| Time|timeliness(root.test.d1.s1, "window"="15")| -+-----------------------------+------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+------------------------------------------+ -``` - -### 2.4 Validity - -#### Registration statement - -```sql -create function 
validity as 'org.apache.iotdb.library.dquality.UDTFValidity' -``` - -#### Usage - -This function calculates the validity of a time series, which measures whether the time series data is normal, usable, and free of outliers. The function divides the input time series data into consecutive non-overlapping time windows, computes the data validity for each window individually, and outputs the timestamp of the first data point in the window along with the validity result. - -**Name:** VALIDITY - -**Input Series:** Supports only a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window, given either as a positive integer or as a positive number with a unit. An integer is the number of data points in each window; the last window may contain fewer. A number with a unit is the time span of the window; the unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. - -**Output Series:** Outputs a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** The calculation is performed only when a window contains more than 10 data points. Otherwise, the window is ignored and nothing is output. - -#### Examples - -##### Default Parameters - -With default parameters, this function will regard all input data as the same window. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select Validity(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|validity(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -+-----------------------------+-------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select Validity(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|validity(root.test.d1.s1, "window"="15")| -+-----------------------------+----------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - - - - - -## 3. 
Data Profiling - -### 3.1 ACF - -#### Registration statement - -```sql -create function acf as 'org.apache.iotdb.library.dprofile.UDTFACF' -``` - -#### Usage - -This function is used to calculate the autocorrelation function (ACF) of the input time series, -which equals the cross-correlation of the series with itself. -For more information, please refer to the [XCorr](#XCorr) function. - -**Name:** ACF - -**Input Series:** Supports only a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Outputs a single series. The type is DOUBLE. -There are $2N-1$ data points in the series, and the values are explained in detail in the [XCorr](#XCorr) function. - -**Note:** - -+ `null` and `NaN` values in the input series are treated as 0. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| null| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select acf(s1) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|acf(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 6.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -|1970-01-01T08:00:00.004+08:00| 7.0| -|1970-01-01T08:00:00.005+08:00| 0.0| -|1970-01-01T08:00:00.006+08:00| 3.6| -|1970-01-01T08:00:00.007+08:00| 0.0| -|1970-01-01T08:00:00.008+08:00| 1.0| -+-----------------------------+--------------------+ -``` - -### 3.2 Distinct - -#### Registration statement - -```sql -create function distinct as 'org.apache.iotdb.library.dprofile.UDTFDistinct' -``` - -#### 
Usage - -This function returns all unique values in a time series. - -**Name:** DISTINCT - -**Input Series:** Supports only a single input series. The type is arbitrary. - -**Output Series:** Outputs a single series. The type is the same as the input. - -**Note:** - -+ The timestamp of the output series is meaningless. The output order is arbitrary. -+ Missing points and null points in the input series will be ignored, but `NaN` will not. -+ Comparison is case-sensitive. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2020-01-01T08:00:00.001+08:00| Hello| -|2020-01-01T08:00:00.002+08:00| hello| -|2020-01-01T08:00:00.003+08:00| Hello| -|2020-01-01T08:00:00.004+08:00| World| -|2020-01-01T08:00:00.005+08:00| World| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select distinct(s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|distinct(root.test.d2.s2)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.001+08:00| Hello| -|1970-01-01T08:00:00.002+08:00| hello| -|1970-01-01T08:00:00.003+08:00| World| -+-----------------------------+-------------------------+ -``` - -### 3.3 Histogram - -#### Registration statement - -```sql -create function histogram as 'org.apache.iotdb.library.dprofile.UDTFHistogram' -``` - -#### Usage - -This function is used to calculate the distribution histogram of a single column of numerical data. - -**Name:** HISTOGRAM - -**Input Series:** Supports only a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `min`: The lower limit of the requested data range. The default value is -Double.MAX_VALUE. -+ `max`: The upper limit of the requested data range. The default value is Double.MAX_VALUE. The value of `min` must be less than or equal to `max`. 
-+ `count`: The number of buckets of the histogram, the default value is 1. It must be a positive integer. - -**Output Series:** The value of the bucket of the histogram, where the lower bound represented by the i-th bucket (index starts from 1) is $min+ (i-1)\cdot\frac{max-min}{count}$ and the upper bound is $min + i \cdot \frac{max-min}{count}$. - -**Note:** - -+ If the value is lower than `min`, it will be put into the 1st bucket. If the value is larger than `max`, it will be put into the last bucket. -+ Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 11.0| -|2020-01-01T00:00:11.000+08:00| 12.0| -|2020-01-01T00:00:12.000+08:00| 13.0| -|2020-01-01T00:00:13.000+08:00| 14.0| -|2020-01-01T00:00:14.000+08:00| 15.0| -|2020-01-01T00:00:15.000+08:00| 16.0| -|2020-01-01T00:00:16.000+08:00| 17.0| -|2020-01-01T00:00:17.000+08:00| 18.0| -|2020-01-01T00:00:18.000+08:00| 19.0| -|2020-01-01T00:00:19.000+08:00| 20.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select histogram(s1,"min"="1","max"="20","count"="10") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------+ -| Time|histogram(root.test.d1.s1, "min"="1", "max"="20", "count"="10")| -+-----------------------------+---------------------------------------------------------------+ 
-|1970-01-01T08:00:00.000+08:00| 2| -|1970-01-01T08:00:00.001+08:00| 2| -|1970-01-01T08:00:00.002+08:00| 2| -|1970-01-01T08:00:00.003+08:00| 2| -|1970-01-01T08:00:00.004+08:00| 2| -|1970-01-01T08:00:00.005+08:00| 2| -|1970-01-01T08:00:00.006+08:00| 2| -|1970-01-01T08:00:00.007+08:00| 2| -|1970-01-01T08:00:00.008+08:00| 2| -|1970-01-01T08:00:00.009+08:00| 2| -+-----------------------------+---------------------------------------------------------------+ -``` - -### 3.4 Integral - -#### Registration statement - -```sql -create function integral as 'org.apache.iotdb.library.dprofile.UDAFIntegral' -``` - -#### Usage - -This function is used to calculate the integral of a time series, -which equals the area under the curve with time as the X-axis and values as the Y-axis. - -**Name:** INTEGRAL - -**Input Series:** Supports only a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `unit`: The unit of time used when computing the integral. - The value should be chosen from "1S", "1s", "1m", "1H", "1d" (case-sensitive), - which take one millisecond / second / minute / hour / day, respectively, as 1.0 when calculating the area and integral. - -**Output Series:** Outputs a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and whose value is the integral. - -**Note:** - -+ The integral equals the sum of the areas of the right-angled trapezoids formed by each pair of adjacent points and the time axis. - Choosing a different `unit` only rescales the time axis, so results computed with different units differ by a constant factor. - -+ `NaN` values in the input series will be ignored. The curve or trapezoids will skip these points and use the next valid point. - -#### Examples - -##### Default Parameters - -With default parameters, this function will take one second as 1.0. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| 2| -|2020-01-01T00:00:03.000+08:00| 5| -|2020-01-01T00:00:04.000+08:00| 6| -|2020-01-01T00:00:05.000+08:00| 7| -|2020-01-01T00:00:08.000+08:00| 8| -|2020-01-01T00:00:09.000+08:00| NaN| -|2020-01-01T00:00:10.000+08:00| 10| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select integral(s1) from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|integral(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.000+08:00| 57.5| -+-----------------------------+-------------------------+ -``` - -Calculation expression: -$$\frac{1}{2}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 57.5$$ - -##### Specific time unit - -With time unit specified as "1m", this function will take one minute as 1.0. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select integral(s1, "unit"="1m") from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|integral(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.958| -+-----------------------------+-------------------------+ -``` - -Calculation expression: -$$\frac{1}{2\times 60}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 0.958$$ - -### 3.5 IntegralAvg - -#### Registration statement - -```sql -create function integralavg as 'org.apache.iotdb.library.dprofile.UDAFIntegralAvg' -``` - -#### Usage - -This function is used to calculate the function average of time series. 
-The output equals the area under the curve divided by the time interval, both measured in the same time `unit`. -For more information about the area under the curve, please refer to the `Integral` function. - -**Name:** INTEGRALAVG - -**Input Series:** Supports only a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Outputs a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and whose value is the time-weighted average. - -**Note:** - -+ The time-weighted average equals the integral computed with any `unit` divided by the time interval of the input series expressed in that same `unit`. - The result therefore does not depend on the time unit used for the integral, and by default it is consistent with the timestamp precision of IoTDB. - -+ `NaN` values in the input series will be ignored. The curve or trapezoids will skip these points and use the next valid point. - -+ If the input series is empty, the output value will be 0.0; if there is only one data point, the output equals that point's value. 
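The trapezoid rule and the time-weighted average described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the UDF's actual code: the function names `integral` and `integral_avg` and the `unit_seconds` parameter are hypothetical, and timestamps are simplified to seconds.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (timestamp in seconds, value); a simplification


def _valid(points: List[Point]) -> List[Point]:
    """Drop NaN values so that trapezoids span to the next valid point."""
    return [(t, v) for t, v in points if not math.isnan(v)]


def integral(points: List[Point], unit_seconds: float = 1.0) -> float:
    """Sum of right-angled trapezoid areas between adjacent valid points,
    with `unit_seconds` seconds counted as 1.0 on the time axis."""
    pts = _valid(points)
    area = 0.0
    for (t0, v0), (t1, v1) in zip(pts, pts[1:]):
        area += (v0 + v1) * (t1 - t0) / 2.0
    return area / unit_seconds


def integral_avg(points: List[Point]) -> float:
    """Integral divided by the covered time interval (unit-independent)."""
    pts = _valid(points)
    if not pts:
        return 0.0                 # empty input
    if len(pts) == 1:
        return float(pts[0][1])    # a single point: its own value
    return integral(pts) / (pts[-1][0] - pts[0][0])


series = [(1, 1.0), (2, 2.0), (3, 5.0), (4, 6.0), (5, 7.0),
          (8, 8.0), (9, float("nan")), (10, 10.0)]

print(integral(series))                   # area with one second as 1.0
print(integral(series, unit_seconds=60))  # same area with one minute as 1.0
print(integral_avg(series))               # area divided by the 9-second span
```

For the eight-point series used in the examples, the area is 57.5 and the covered interval runs from second 1 to second 10, so the average is 57.5 / 9.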
- -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| 2| -|2020-01-01T00:00:03.000+08:00| 5| -|2020-01-01T00:00:04.000+08:00| 6| -|2020-01-01T00:00:05.000+08:00| 7| -|2020-01-01T00:00:08.000+08:00| 8| -|2020-01-01T00:00:09.000+08:00| NaN| -|2020-01-01T00:00:10.000+08:00| 10| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select integralavg(s1) from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|integralavg(root.test.d1.s1)| -+-----------------------------+----------------------------+ -|1970-01-01T08:00:00.000+08:00| 6.388888888888889| -+-----------------------------+----------------------------+ -``` - -Calculation expression (the series spans 9 seconds, from 00:00:01 to 00:00:10): -$$\frac{1}{2}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] / 9 \approx 6.389$$ - -### 3.6 Mad - -#### Registration statement - -```sql -create function mad as 'org.apache.iotdb.library.dprofile.UDAFMad' -``` - -#### Usage - -The function is used to compute the exact or approximate median absolute deviation (MAD) of a numeric time series. MAD is the median of the absolute deviations of each element from the elements' median. - -Take the dataset $\{1,3,3,5,5,6,7,8,9\}$ as an example. Its median is 5 and the absolute deviations of the elements from the median, in sorted order, are $\{0,0,1,2,2,2,3,4,4\}$, whose median is 2. Therefore, the MAD of the original dataset is 2. - -**Name:** MAD - -**Input Series:** Supports only a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `error`: The relative error of the approximate MAD. It should be within [0,1) and the default value is 0. 
Taking `error`=0.01 as an instance, suppose the exact MAD is $a$ and the approximate MAD is $b$, we have $0.99a \le b \le 1.01a$. With `error`=0, the output is the exact MAD. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the MAD. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -##### Approximate Query - -By setting `error` within (0,1), the function queries the approximate MAD. - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -............ 
-Total line number = 20 -``` - -SQL for query: - -```sql -select mad(s1, "error"="0.01") from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -| Time|mad(root.test.s1, "error"="0.01")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9900000000000001| -+-----------------------------+---------------------------------+ -``` - -### 3.7 Median - -#### Registration statement - -```sql -create function median as 'org.apache.iotdb.library.dprofile.UDAFMedian' -``` - -#### Usage - -The function is used to compute the exact or approximate median of a numeric time series. Median is the value separating the higher half from the lower half of a data sample. - -**Name:** MEDIAN - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `error`: The rank error of the approximate median. It should be within [0,1) and the default value is 0. For instance, a median with `error`=0.01 is the value of the element with rank percentage 0.49~0.51. With `error`=0, the output is the exact median. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the median. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -Total line number = 20 -``` - -SQL for query: - -```sql -select median(s1, "error"="0.01") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|median(root.test.s1, "error"="0.01")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -+-----------------------------+------------------------------------+ -``` - -### 3.8 MinMax - -#### Registration statement - -```sql -create function minmax as 'org.apache.iotdb.library.dprofile.UDTFMinMax' -``` - -#### Usage - -This function is used to standardize the input series with min-max. Minimum value is transformed to 0; maximum value is transformed to 1. - -**Name:** MINMAX - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `compute`: When set to "batch", anomaly test is conducted after importing all data points; when set to "stream", it is required to provide minimum and maximum values. 
The default method is "batch". -+ `min`: The minimum value, used when `compute` is set to "stream". -+ `max`: The maximum value, used when `compute` is set to "stream". - -**Output Series:** Output a single series. The type is DOUBLE. - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select minmax(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|minmax(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.200+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.300+08:00| 0.25| -|1970-01-01T08:00:00.400+08:00| 0.08333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.700+08:00| 0.0| -|1970-01-01T08:00:00.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.900+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.000+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.100+08:00| 0.25|
-|1970-01-01T08:00:01.200+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.300+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.25| -|1970-01-01T08:00:01.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.700+08:00| 1.0| -|1970-01-01T08:00:01.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.900+08:00| 0.0| -|1970-01-01T08:00:02.000+08:00| 0.16666666666666666| -+-----------------------------+--------------------+ -``` - - -### 3.9 MvAvg - -#### Registration statement - -```sql -create function mvavg as 'org.apache.iotdb.library.dprofile.UDTFMvAvg' -``` - -#### Usage - -This function is used to calculate moving average of input series. - -**Name:** MVAVG - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `window`: Length of the moving window. Default value is 10. - -**Output Series:** Output a single series. The type is DOUBLE. - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql 
-select mvavg(s1, "window"="3") from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -| Time|mvavg(root.test.s1, "window"="3")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.300+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.400+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.700+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.100+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.200+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.300+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.0| -|1970-01-01T08:00:01.500+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.600+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.700+08:00| 3.0| -|1970-01-01T08:00:01.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.900+08:00| -0.6666666666666666| -|1970-01-01T08:00:02.000+08:00| -3.3333333333333335| -+-----------------------------+---------------------------------+ -``` - -### 3.10 PACF - -#### Registration statement - -```sql -create function pacf as 'org.apache.iotdb.library.dprofile.UDTFPACF' -``` - -#### Usage - -This function is used to calculate partial autocorrelation of input series by solving Yule-Walker equation. For some cases, the equation may not be solved, and NaN will be output. - -**Name:** PACF - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `lag`: Maximum lag of pacf to calculate. The default value is $\min(10\log_{10}n,n-1)$, where $n$ is the number of data points. - -**Output Series:** Output a single series. The type is DOUBLE. 
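One common way to solve the Yule-Walker equations lag by lag is the Durbin-Levinson recursion over the sample autocorrelations. The sketch below illustrates that general technique only: the function names are hypothetical, and the exact estimator and `NaN` handling used by the PACF function are not specified here, so the sketch is not expected to reproduce the function's output digit for digit.

```python
import math

def autocorrelations(x, max_lag):
    """Sample autocorrelations r[0..max_lag] of a list of numbers."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    if c0 == 0.0:
        return [math.nan] * (max_lag + 1)   # constant series: undefined
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]

def pacf(x, max_lag):
    """Partial autocorrelations for lags 0..max_lag via Durbin-Levinson."""
    r = autocorrelations(x, max_lag)
    result = [1.0]          # lag 0 is 1 by definition
    phi_prev = []           # AR coefficients of order k-1
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        if den == 0.0 or math.isnan(den) or math.isnan(num):
            # The equations cannot be solved from here on: emit NaN.
            result.extend([math.nan] * (max_lag - k + 1))
            break
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result

print(pacf([1.0, 2.0, 3.0, 4.0, 5.0], 2))
```

The value at lag 1 always equals the lag-1 autocorrelation, which is a quick sanity check for any implementation.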
- -#### Examples - -##### Assigning maximum lag - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select pacf(s1, "lag"="5") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------+ -| Time|pacf(root.test.d1.s1, "lag"="5")| -+-----------------------------+--------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| -0.5744680851063829| -|2020-01-01T00:00:03.000+08:00| 0.3172297297297296| -|2020-01-01T00:00:04.000+08:00| -0.2977686586304181| -|2020-01-01T00:00:05.000+08:00| -2.0609033521065867| -+-----------------------------+--------------------------------+ -``` - -### 3.11 Percentile - -#### Registration statement - -```sql -create function percentile as 'org.apache.iotdb.library.dprofile.UDAFPercentile' -``` - -#### Usage - -The function is used to compute the exact or approximate percentile of a numeric time series. A percentile is the value of the element at a given rank in the sorted series. - -**Name:** PERCENTILE - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `rank`: The rank percentage of the percentile. It should be within (0,1] and the default value is 0.5. For instance, a percentile with `rank`=0.5 is the median. -+ `error`: The rank error of the approximate percentile. It should be within [0,1) and the default value is 0. For instance, a 0.5-percentile with `error`=0.01 is the value of the element with rank percentage 0.49~0.51. With `error`=0, the output is the exact percentile. - -**Output Series:** Output a single series.
The type is the same as the input series. The series contains only one data point, whose value is the percentile. If `error`=0, its timestamp is the timestamp of the first input point that has the percentile value; otherwise, its timestamp is 0. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+-------------+ -| Time|root.test2.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+-------------+ -Total line number = 20 -``` - -SQL for query: - -```sql -select percentile(s1, "rank"="0.2", "error"="0.01") from root.test2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------------+ -| Time|percentile(root.test2.s1, "rank"="0.2", "error"="0.01")| -+-----------------------------+-------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| -1.0| -+-----------------------------+-------------------------------------------------------+ -``` - -### 3.12 Quantile - -#### Registration statement - -```sql -create function quantile as 'org.apache.iotdb.library.dprofile.UDAFQuantile' -``` - -#### Usage - 
-The function is used to compute the approximate quantile of a numeric time series. A quantile is the value of the element at a given rank in the sorted series. - -**Name:** QUANTILE - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `rank`: The rank of the quantile. It should be within (0,1] and the default value is 0.5. For instance, a quantile with `rank`=0.5 is the median. -+ `K`: The size of the KLL sketch maintained in the query. It should be within [100,+inf) and the default value is 800. For instance, the 0.5-quantile computed by a KLL sketch with K=800 items is a value whose rank lies within 0.49~0.51 with a confidence of at least 99%. The result becomes more accurate as K increases. - -**Output Series:** Output a single series. The type is the same as the input series. The timestamp of the only data point is 0. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+-------------+ -| Time|root.test1.s1| -+-----------------------------+-------------+ -|2021-03-17T10:32:17.054+08:00| 7| -|2021-03-17T10:32:18.054+08:00| 15| -|2021-03-17T10:32:19.054+08:00| 36| -|2021-03-17T10:32:20.054+08:00| 39| -|2021-03-17T10:32:21.054+08:00| 40| -|2021-03-17T10:32:22.054+08:00| 41| -|2021-03-17T10:32:23.054+08:00| 20| -|2021-03-17T10:32:24.054+08:00| 18| -+-----------------------------+-------------+ -............
-Total line number = 8 -``` - -SQL for query: - -```sql -select quantile(s1, "rank"="0.2", "K"="800") from root.test1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------+ -| Time|quantile(root.test1.s1, "rank"="0.2", "K"="800")| -+-----------------------------+------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 7.000000000000001| -+-----------------------------+------------------------------------------------+ -``` - -### 3.13 Period - -#### Registration statement - -```sql -create function period as 'org.apache.iotdb.library.dprofile.UDAFPeriod' -``` - -#### Usage - -The function is used to compute the period of a numeric time series. - -**Name:** PERIOD - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is INT32. There is only one data point in the series, whose timestamp is 0 and value is the period. 
- -#### Examples - -Input series: - - -``` -+-----------------------------+---------------+ -| Time|root.test.d3.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| -|1970-01-01T08:00:00.002+08:00| 2.0| -|1970-01-01T08:00:00.003+08:00| 3.0| -|1970-01-01T08:00:00.004+08:00| 1.0| -|1970-01-01T08:00:00.005+08:00| 2.0| -|1970-01-01T08:00:00.006+08:00| 3.0| -|1970-01-01T08:00:00.007+08:00| 1.0| -|1970-01-01T08:00:00.008+08:00| 2.0| -|1970-01-01T08:00:00.009+08:00| 3.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select period(s1) from root.test.d3 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time|period(root.test.d3.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 3| -+-----------------------------+-----------------------+ -``` - -### 3.14 QLB - -#### Registration statement - -```sql -create function qlb as 'org.apache.iotdb.library.dprofile.UDTFQLB' -``` - -#### Usage - -This function is used to calculate Ljung-Box statistics $Q_{LB}$ for time series, and convert it to p value. - -**Name:** QLB - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters**: - -`lag`: max lag to calculate. Legal input shall be integer from 1 to n-2, where n is the sample number. Default value is n-2. - -**Output Series:** Output a single series. The type is DOUBLE. The output series is p value, and timestamp means lag. - -**Note:** If you want to calculate Ljung-Box statistics $Q_{LB}$ instead of p value, you may use ACF function. 
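For reference, the standard textbook form of the Ljung-Box statistic is given below, where $n$ is the number of samples, $h$ is the maximum lag (`lag`) and $\hat{\rho}_k$ is the sample autocorrelation at lag $k$. Taking the p value from a chi-squared distribution with $h$ degrees of freedom is the usual convention and is assumed here, not confirmed for this function.

$$Q_{LB} = n(n+2)\sum_{k=1}^{h}\frac{\hat{\rho}_k^2}{n-k}, \qquad p = 1 - F_{\chi^2_h}(Q_{LB})$$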
- -#### Examples - -##### Using Default Parameter - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T00:00:00.100+08:00| 1.22| -|1970-01-01T00:00:00.200+08:00| -2.78| -|1970-01-01T00:00:00.300+08:00| 1.53| -|1970-01-01T00:00:00.400+08:00| 0.70| -|1970-01-01T00:00:00.500+08:00| 0.75| -|1970-01-01T00:00:00.600+08:00| -0.72| -|1970-01-01T00:00:00.700+08:00| -0.22| -|1970-01-01T00:00:00.800+08:00| 0.28| -|1970-01-01T00:00:00.900+08:00| 0.57| -|1970-01-01T00:00:01.000+08:00| -0.22| -|1970-01-01T00:00:01.100+08:00| -0.72| -|1970-01-01T00:00:01.200+08:00| 1.34| -|1970-01-01T00:00:01.300+08:00| -0.25| -|1970-01-01T00:00:01.400+08:00| 0.17| -|1970-01-01T00:00:01.500+08:00| 2.51| -|1970-01-01T00:00:01.600+08:00| 1.42| -|1970-01-01T00:00:01.700+08:00| -1.34| -|1970-01-01T00:00:01.800+08:00| -0.01| -|1970-01-01T00:00:01.900+08:00| -0.49| -|1970-01-01T00:00:02.000+08:00| 1.63| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select QLB(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+---------------------+ -| Time| QLB(root.test.d1.s1)| -+-----------------------------+---------------------+ -|1970-01-01T08:00:00.021+08:00| -0.31671| -|1970-01-01T08:00:00.001+08:00| 0.12748561639660716| -|1970-01-01T08:00:00.022+08:00| -0.17051499999999997| -|1970-01-01T08:00:00.002+08:00| 0.21941409592365868| -|1970-01-01T08:00:00.023+08:00| -0.11341499999999997| -|1970-01-01T08:00:00.003+08:00| 0.3384920824593398| -|1970-01-01T08:00:00.024+08:00| 0.26146| -|1970-01-01T08:00:00.004+08:00| 0.26293189359893154| -|1970-01-01T08:00:00.025+08:00| 0.06431999999999996| -|1970-01-01T08:00:00.005+08:00| 0.37265953802871943| -|1970-01-01T08:00:00.026+08:00| 0.036919999999999994| -|1970-01-01T08:00:00.006+08:00| 0.4923218142923832| -|1970-01-01T08:00:00.027+08:00|-0.009294999999999993| -|1970-01-01T08:00:00.007+08:00| 
0.609628728420623| -|1970-01-01T08:00:00.028+08:00| 0.12271499999999999| -|1970-01-01T08:00:00.008+08:00| 0.6510708392264906| -|1970-01-01T08:00:00.029+08:00| 0.008480000000000033| -|1970-01-01T08:00:00.009+08:00| 0.7430561964288097| -|1970-01-01T08:00:00.030+08:00| -0.21764500000000003| -|1970-01-01T08:00:00.010+08:00| 0.6236738200492055| -|1970-01-01T08:00:00.031+08:00| 0.35853999999999997| -|1970-01-01T08:00:00.011+08:00| 0.21487390993160937| -|1970-01-01T08:00:00.032+08:00| 0.18115499999999998| -|1970-01-01T08:00:00.012+08:00| 0.18479562182870324| -|1970-01-01T08:00:00.033+08:00| -0.27745499999999995| -|1970-01-01T08:00:00.013+08:00| 0.07329862193377235| -|1970-01-01T08:00:00.034+08:00| -0.22418500000000002| -|1970-01-01T08:00:00.014+08:00| 0.038000864459751926| -|1970-01-01T08:00:00.035+08:00| 0.31609000000000004| -|1970-01-01T08:00:00.015+08:00| 0.004052989734200874| -|1970-01-01T08:00:00.036+08:00| -0.06078500000000001| -|1970-01-01T08:00:00.016+08:00| 0.005663787468609627| -|1970-01-01T08:00:00.037+08:00| 0.19219499999999998| -|1970-01-01T08:00:00.017+08:00|0.0016316380755082571| -|1970-01-01T08:00:00.038+08:00| -0.25646| -|1970-01-01T08:00:00.018+08:00|2.0047954405910673E-5| -+-----------------------------+---------------------+ -``` - -### 3.15 Resample - -#### Registration statement - -```sql -create function re_sample as 'org.apache.iotdb.library.dprofile.UDTFResample' -``` - -#### Usage - -This function is used to resample the input series according to a given frequency, -including up-sampling and down-sampling. -Currently, the supported up-sampling methods are -NaN (filling with `NaN`), -FFill (filling with previous value), -BFill (filling with next value) and -Linear (filling with linear interpolation). -Down-sampling relies on group aggregation, -which supports Max, Min, First, Last, Mean and Median. - -**Name:** RESAMPLE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. 
- -**Parameters:** - - -+ `every`: The frequency of resampling, which is a positive number with a unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. This parameter is required. -+ `interp`: The interpolation method of up-sampling, which is 'NaN', 'FFill', 'BFill' or 'Linear'. By default, NaN is used. -+ `aggr`: The aggregation method of down-sampling, which is 'Max', 'Min', 'First', 'Last', 'Mean' or 'Median'. By default, Mean is used. -+ `start`: The start time (inclusive) of resampling with the format 'yyyy-MM-dd HH:mm:ss'. By default, it is the timestamp of the first valid data point. -+ `end`: The end time (exclusive) of resampling with the format 'yyyy-MM-dd HH:mm:ss'. By default, it is the timestamp of the last valid data point. - -**Output Series:** Output a single series. The type is DOUBLE. It is strictly equispaced with the frequency `every`. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -##### Up-sampling - -When the frequency of resampling is higher than the original frequency, the series is up-sampled.
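Linear up-sampling can be sketched as follows. This is a minimal illustration, assuming a hypothetical helper `upsample_linear`, timestamps expressed as plain numbers (minutes here) and an output grid that includes the last input timestamp; it is not the function's own implementation.

```python
def upsample_linear(points, every):
    """points: at least two (timestamp, value) pairs sorted by timestamp.
    every: output step, in the same unit as the timestamps."""
    out = []
    i = 0
    t = points[0][0]
    while t <= points[-1][0]:
        # Move to the input segment [points[i], points[i + 1]] that contains t.
        while i + 1 < len(points) - 1 and points[i + 1][0] <= t:
            i += 1
        (t0, v0), (t1, v1) = points[i], points[i + 1]
        # Linear interpolation between the two neighbouring input points.
        out.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
        t += every
    return out

# Minutes after 16:00, one input point every 15 minutes.
pts = [(0, 3.09), (15, 3.53), (30, 3.5), (45, 3.51), (60, 3.41)]
for t, v in upsample_linear(pts, 5):
    print(t, round(v, 4))
```

With a 5-minute step, each 15-minute input interval is filled with two interpolated points, giving 13 output points in total.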
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2021-03-06T16:00:00.000+08:00| 3.09| -|2021-03-06T16:15:00.000+08:00| 3.53| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:45:00.000+08:00| 3.51| -|2021-03-06T17:00:00.000+08:00| 3.41| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select resample(s1,'every'='5m','interp'='linear') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="5m", "interp"="linear")| -+-----------------------------+----------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:05:00.000+08:00| 3.2366665999094644| -|2021-03-06T16:10:00.000+08:00| 3.3833332856496177| -|2021-03-06T16:15:00.000+08:00| 3.5299999713897705| -|2021-03-06T16:20:00.000+08:00| 3.5199999809265137| -|2021-03-06T16:25:00.000+08:00| 3.509999990463257| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:35:00.000+08:00| 3.503333330154419| -|2021-03-06T16:40:00.000+08:00| 3.506666660308838| -|2021-03-06T16:45:00.000+08:00| 3.509999990463257| -|2021-03-06T16:50:00.000+08:00| 3.4766666889190674| -|2021-03-06T16:55:00.000+08:00| 3.443333387374878| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+----------------------------------------------------------+ -``` - -##### Down-sampling - -When the frequency of resampling is lower than the original frequency, down-sampling starts. 
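Down-sampling by group aggregation can be sketched as follows. The helper `downsample`, the numeric timestamps (minutes) and the bucket layout `[t, t + every)` are assumptions made for the illustration; buckets without any input point yield `NaN`, and the bucket that starts at the last input timestamp is kept.

```python
import math
from statistics import mean

def downsample(points, every, start=None, aggr=mean):
    """points: (timestamp, value) pairs sorted by timestamp.
    every: bucket width; start: first bucket start (defaults to first timestamp).
    aggr: function that reduces a non-empty list of values to one value."""
    t = points[0][0] if start is None else start
    last = points[-1][0]
    out = []
    while t <= last:
        # Collect the values that fall into the bucket [t, t + every).
        values = [v for ts, v in points if t <= ts < t + every]
        out.append((t, aggr(values) if values else math.nan))
        t += every
    return out

# Minutes after 16:00, one input point every 15 minutes.
pts = [(0, 3.09), (15, 3.53), (30, 3.5), (45, 3.51), (60, 3.41)]
print(downsample(pts, 30, aggr=lambda values: values[0]))  # 'First' aggregation
print(downsample(pts, 30, start=-60))                      # 'Mean', starting at 15:00
```

The `aggr` argument plays the role of the aggregation method: passing `max`, `min` or `statistics.median` gives the other variants.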
- -Input series is the same as above, the SQL for query is shown below: - -```sql -select resample(s1,'every'='30m','aggr'='first') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "aggr"="first")| -+-----------------------------+--------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+--------------------------------------------------------+ -``` - - - -##### Specify the time period - -The time period of resampling can be specified with `start` and `end`. -The period outside the actual time range will be interpolated. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select resample(s1,'every'='30m','start'='2021-03-06 15:00:00') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "start"="2021-03-06 15:00:00")| -+-----------------------------+-----------------------------------------------------------------------+ -|2021-03-06T15:00:00.000+08:00| NaN| -|2021-03-06T15:30:00.000+08:00| NaN| -|2021-03-06T16:00:00.000+08:00| 3.309999942779541| -|2021-03-06T16:30:00.000+08:00| 3.5049999952316284| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+-----------------------------------------------------------------------+ -``` - -### 3.16 Sample - -#### Registration statement - -```sql -create function sample as 'org.apache.iotdb.library.dprofile.UDTFSample' -``` - -#### Usage - -This function is used to sample the input series, -that is, select a specified number of data points from the input series and output them. 
-Currently, three sampling methods are supported: -**Reservoir sampling** randomly selects data points. -All of the points have the same probability of being sampled. -**Isometric sampling** selects data points at equal index intervals. -**Triangle sampling** assigns data points to buckets based on the number of samples. -Then, within each bucket, it calculates the area of the triangle formed by each point and selects the point with the largest triangle area. -For more details, see the [paper](http://skemman.is/stream/get/1946/15343/37285/3/SS_MSthesis.pdf). - -**Name:** SAMPLE - -**Input Series:** Only support a single input series. The type is arbitrary. - -**Parameters:** - -+ `method`: The method of sampling, which is 'reservoir', 'isometric' or 'triangle'. By default, reservoir sampling is used. -+ `k`: The number of samples, which is a positive integer. By default, it's 1. - -**Output Series:** Output a single series. The type is the same as the input. The length of the output series is `k`. Each data point in the output series comes from the input series. - -**Note:** If `k` is greater than the length of the input series, all data points in the input series will be output. - -#### Examples - -##### Reservoir Sampling - -When `method` is 'reservoir' or is not specified, reservoir sampling is used. -Due to the randomness of this method, the output series shown below is only one possible result.
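A classic way to give every point the same probability of being selected in a single pass is reservoir sampling (Algorithm R). The sketch below shows that general algorithm; whether the function uses exactly this variant is an assumption, and the name `reservoir_sample` and the `(timestamp, value)` layout are hypothetical.

```python
import random

def reservoir_sample(points, k, rng=None):
    """Pick k of the (timestamp, value) pairs uniformly at random in one pass."""
    rng = rng or random.Random()
    reservoir = []
    for i, point in enumerate(points):
        if i < k:
            reservoir.append(point)        # fill the reservoir first
        else:
            j = rng.randint(0, i)          # uniform index in [0, i], inclusive
            if j < k:
                reservoir[j] = point       # replace with probability k / (i + 1)
    return sorted(reservoir)               # restore time order

series = [(t, float(t)) for t in range(1, 11)]
print(reservoir_sample(series, 5, random.Random(42)))
```

When `k` is at least the length of the input, the reservoir is never overwritten, so all points are returned, in line with the note above.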
- - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| 2.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:04.000+08:00| 4.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -|2020-01-01T00:00:10.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select sample(s1,'method'='reservoir','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="reservoir", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - -##### Isometric Sampling - -When `method` is 'isometric', isometric sampling is used. 
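Selecting points at equal index intervals can be sketched as follows. The helper name `isometric_sample` and the rounding rule (truncating `i * n / k`) are assumptions chosen for the illustration; the function's own index rule may differ for lengths that are not a multiple of `k`.

```python
def isometric_sample(points, k):
    """Pick k points at (approximately) equal index intervals."""
    n = len(points)
    if k >= n:
        return list(points)        # nothing to drop
    step = n / k                   # index distance between selected points
    return [points[int(i * step)] for i in range(k)]

series = [(t, float(t)) for t in range(1, 11)]
print(isometric_sample(series, 5))
```

For 10 input points and `k` = 5, the sketch selects indices 0, 2, 4, 6 and 8, that is, the values 1.0, 3.0, 5.0, 7.0 and 9.0.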
- -The input series is the same as above. The SQL for query is shown below: - -```sql -select sample(s1,'method'='isometric','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="isometric", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - -### 3.17 Segment - -#### Registration statement - -```sql -create function segment as 'org.apache.iotdb.library.dprofile.UDTFSegment' -``` - -#### Usage - -This function is used to segment a time series into subsequences according to linear trend. It returns the linear fitted value of either the first point of each subsequence or every data point. - -**Name:** SEGMENT - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `output`: "all" to output all fitted points; "first" to output the first fitted point of each subsequence. - -+ `error`: The error allowed in linear regression. It is defined as the mean absolute error of a subsequence. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note:** This function treats the input series as sampled at equal intervals. All data are loaded into memory, so down-sample the input series first if there are too many data points.
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:00.300+08:00| 2.0| -|1970-01-01T08:00:00.400+08:00| 3.0| -|1970-01-01T08:00:00.500+08:00| 4.0| -|1970-01-01T08:00:00.600+08:00| 5.0| -|1970-01-01T08:00:00.700+08:00| 6.0| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 8.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:01.100+08:00| 9.1| -|1970-01-01T08:00:01.200+08:00| 9.2| -|1970-01-01T08:00:01.300+08:00| 9.3| -|1970-01-01T08:00:01.400+08:00| 9.4| -|1970-01-01T08:00:01.500+08:00| 9.5| -|1970-01-01T08:00:01.600+08:00| 9.6| -|1970-01-01T08:00:01.700+08:00| 9.7| -|1970-01-01T08:00:01.800+08:00| 9.8| -|1970-01-01T08:00:01.900+08:00| 9.9| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:02.100+08:00| 8.0| -|1970-01-01T08:00:02.200+08:00| 6.0| -|1970-01-01T08:00:02.300+08:00| 4.0| -|1970-01-01T08:00:02.400+08:00| 2.0| -|1970-01-01T08:00:02.500+08:00| 0.0| -|1970-01-01T08:00:02.600+08:00| -2.0| -|1970-01-01T08:00:02.700+08:00| -4.0| -|1970-01-01T08:00:02.800+08:00| -6.0| -|1970-01-01T08:00:02.900+08:00| -8.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.100+08:00| 10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -|1970-01-01T08:00:03.300+08:00| 10.0| -|1970-01-01T08:00:03.400+08:00| 10.0| -|1970-01-01T08:00:03.500+08:00| 10.0| -|1970-01-01T08:00:03.600+08:00| 10.0| -|1970-01-01T08:00:03.700+08:00| 10.0| -|1970-01-01T08:00:03.800+08:00| 10.0| -|1970-01-01T08:00:03.900+08:00| 10.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select segment(s1, "error"="0.1") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|segment(root.test.s1, "error"="0.1")| 
-+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -+-----------------------------+------------------------------------+ -``` - -### 3.18 Skew - -#### Registration statement - -```sql -create function skew as 'org.apache.iotdb.library.dprofile.UDAFSkew' -``` - -#### Usage - -This function is used to calculate the population skewness. - -**Name:** SKEW - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the population skewness. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 10.0| -|2020-01-01T00:00:11.000+08:00| 10.0| -|2020-01-01T00:00:12.000+08:00| 10.0| -|2020-01-01T00:00:13.000+08:00| 10.0| -|2020-01-01T00:00:14.000+08:00| 10.0| -|2020-01-01T00:00:15.000+08:00| 10.0| -|2020-01-01T00:00:16.000+08:00| 10.0| -|2020-01-01T00:00:17.000+08:00| 10.0| -|2020-01-01T00:00:18.000+08:00| 10.0| -|2020-01-01T00:00:19.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select skew(s1) 
from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time| skew(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| -0.9998427402292644| -+-----------------------------+-----------------------+ -``` - -### 3.19 Spline - -#### Registration statement - -```sql -create function spline as 'org.apache.iotdb.library.dprofile.UDTFSpline' -``` - -#### Usage - -This function is used to calculate the cubic spline interpolation of the input series. - -**Name:** SPLINE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `points`: Number of resampling points. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note:** The output series retains the first and last timestamps of the input series. Interpolation points are selected at equal intervals. The function computes a result only when the input series has at least 4 points. 
- -#### Examples - -##### Assigning number of interpolation points - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select spline(s1, "points"="151") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|spline(root.test.s1, "points"="151")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.010+08:00| 0.04870000251134237| -|1970-01-01T08:00:00.020+08:00| 0.09680000495910646| -|1970-01-01T08:00:00.030+08:00| 0.14430000734329226| -|1970-01-01T08:00:00.040+08:00| 0.19120000966389972| -|1970-01-01T08:00:00.050+08:00| 0.23750001192092896| -|1970-01-01T08:00:00.060+08:00| 0.2832000141143799| -|1970-01-01T08:00:00.070+08:00| 0.32830001624425253| -|1970-01-01T08:00:00.080+08:00| 0.3728000183105469| -|1970-01-01T08:00:00.090+08:00| 0.416700020313263| -|1970-01-01T08:00:00.100+08:00| 0.4600000222524008| -|1970-01-01T08:00:00.110+08:00| 0.5027000241279602| -|1970-01-01T08:00:00.120+08:00| 0.5448000259399414| -|1970-01-01T08:00:00.130+08:00| 0.5863000276883443| -|1970-01-01T08:00:00.140+08:00| 0.627200029373169| -|1970-01-01T08:00:00.150+08:00| 0.6675000309944153| -|1970-01-01T08:00:00.160+08:00| 0.7072000325520833| -|1970-01-01T08:00:00.170+08:00| 0.7463000340461731| -|1970-01-01T08:00:00.180+08:00| 0.7848000354766846| -|1970-01-01T08:00:00.190+08:00| 0.8227000368436178| 
-|1970-01-01T08:00:00.200+08:00| 0.8600000381469728| -|1970-01-01T08:00:00.210+08:00| 0.8967000393867494| -|1970-01-01T08:00:00.220+08:00| 0.9328000405629477| -|1970-01-01T08:00:00.230+08:00| 0.9683000416755676| -|1970-01-01T08:00:00.240+08:00| 1.0032000427246095| -|1970-01-01T08:00:00.250+08:00| 1.037500043710073| -|1970-01-01T08:00:00.260+08:00| 1.071200044631958| -|1970-01-01T08:00:00.270+08:00| 1.1043000454902647| -|1970-01-01T08:00:00.280+08:00| 1.1368000462849934| -|1970-01-01T08:00:00.290+08:00| 1.1687000470161437| -|1970-01-01T08:00:00.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:00.310+08:00| 1.2307000483103594| -|1970-01-01T08:00:00.320+08:00| 1.2608000489139557| -|1970-01-01T08:00:00.330+08:00| 1.2903000494873524| -|1970-01-01T08:00:00.340+08:00| 1.3192000500233967| -|1970-01-01T08:00:00.350+08:00| 1.3475000505149364| -|1970-01-01T08:00:00.360+08:00| 1.3752000509548186| -|1970-01-01T08:00:00.370+08:00| 1.402300051335891| -|1970-01-01T08:00:00.380+08:00| 1.4288000516510009| -|1970-01-01T08:00:00.390+08:00| 1.4547000518929958| -|1970-01-01T08:00:00.400+08:00| 1.480000052054723| -|1970-01-01T08:00:00.410+08:00| 1.5047000521290301| -|1970-01-01T08:00:00.420+08:00| 1.5288000521087646| -|1970-01-01T08:00:00.430+08:00| 1.5523000519867738| -|1970-01-01T08:00:00.440+08:00| 1.575200051755905| -|1970-01-01T08:00:00.450+08:00| 1.597500051409006| -|1970-01-01T08:00:00.460+08:00| 1.619200050938924| -|1970-01-01T08:00:00.470+08:00| 1.6403000503385066| -|1970-01-01T08:00:00.480+08:00| 1.660800049600601| -|1970-01-01T08:00:00.490+08:00| 1.680700048718055| -|1970-01-01T08:00:00.500+08:00| 1.7000000476837158| -|1970-01-01T08:00:00.510+08:00| 1.7188475466453037| -|1970-01-01T08:00:00.520+08:00| 1.7373800457262996| -|1970-01-01T08:00:00.530+08:00| 1.7555825448831923| -|1970-01-01T08:00:00.540+08:00| 1.7734400440724702| -|1970-01-01T08:00:00.550+08:00| 1.790937543250622| -|1970-01-01T08:00:00.560+08:00| 1.8080600423741364| -|1970-01-01T08:00:00.570+08:00| 
1.8247925413995016| -|1970-01-01T08:00:00.580+08:00| 1.8411200402832066| -|1970-01-01T08:00:00.590+08:00| 1.8570275389817397| -|1970-01-01T08:00:00.600+08:00| 1.8725000374515897| -|1970-01-01T08:00:00.610+08:00| 1.8875225356492449| -|1970-01-01T08:00:00.620+08:00| 1.902080033531194| -|1970-01-01T08:00:00.630+08:00| 1.9161575310539258| -|1970-01-01T08:00:00.640+08:00| 1.9297400281739288| -|1970-01-01T08:00:00.650+08:00| 1.9428125248476913| -|1970-01-01T08:00:00.660+08:00| 1.9553600210317021| -|1970-01-01T08:00:00.670+08:00| 1.96736751668245| -|1970-01-01T08:00:00.680+08:00| 1.9788200117564232| -|1970-01-01T08:00:00.690+08:00| 1.9897025062101101| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.710+08:00| 2.0097024933913334| -|1970-01-01T08:00:00.720+08:00| 2.0188199867081615| -|1970-01-01T08:00:00.730+08:00| 2.027367479995188| -|1970-01-01T08:00:00.740+08:00| 2.0353599732971155| -|1970-01-01T08:00:00.750+08:00| 2.0428124666586482| -|1970-01-01T08:00:00.760+08:00| 2.049739960124489| -|1970-01-01T08:00:00.770+08:00| 2.056157453739342| -|1970-01-01T08:00:00.780+08:00| 2.06207994754791| -|1970-01-01T08:00:00.790+08:00| 2.067522441594897| -|1970-01-01T08:00:00.800+08:00| 2.072499935925006| -|1970-01-01T08:00:00.810+08:00| 2.07702743058294| -|1970-01-01T08:00:00.820+08:00| 2.081119925613404| -|1970-01-01T08:00:00.830+08:00| 2.0847924210611| -|1970-01-01T08:00:00.840+08:00| 2.0880599169707317| -|1970-01-01T08:00:00.850+08:00| 2.0909374133870027| -|1970-01-01T08:00:00.860+08:00| 2.0934399103546166| -|1970-01-01T08:00:00.870+08:00| 2.0955824079182768| -|1970-01-01T08:00:00.880+08:00| 2.0973799061226863| -|1970-01-01T08:00:00.890+08:00| 2.098847405012549| -|1970-01-01T08:00:00.900+08:00| 2.0999999046325684| -|1970-01-01T08:00:00.910+08:00| 2.1005574051201332| -|1970-01-01T08:00:00.920+08:00| 2.1002599065303778| -|1970-01-01T08:00:00.930+08:00| 2.0991524087846245| -|1970-01-01T08:00:00.940+08:00| 2.0972799118041947| -|1970-01-01T08:00:00.950+08:00| 
2.0946874155104105| -|1970-01-01T08:00:00.960+08:00| 2.0914199198245944| -|1970-01-01T08:00:00.970+08:00| 2.0875224246680673| -|1970-01-01T08:00:00.980+08:00| 2.083039929962151| -|1970-01-01T08:00:00.990+08:00| 2.0780174356281687| -|1970-01-01T08:00:01.000+08:00| 2.0724999415874406| -|1970-01-01T08:00:01.010+08:00| 2.06653244776129| -|1970-01-01T08:00:01.020+08:00| 2.060159954071038| -|1970-01-01T08:00:01.030+08:00| 2.053427460438006| -|1970-01-01T08:00:01.040+08:00| 2.046379966783517| -|1970-01-01T08:00:01.050+08:00| 2.0390624730288924| -|1970-01-01T08:00:01.060+08:00| 2.031519979095454| -|1970-01-01T08:00:01.070+08:00| 2.0237974849045237| -|1970-01-01T08:00:01.080+08:00| 2.015939990377423| -|1970-01-01T08:00:01.090+08:00| 2.0079924954354746| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.110+08:00| 1.9907018211101906| -|1970-01-01T08:00:01.120+08:00| 1.9788509124245144| -|1970-01-01T08:00:01.130+08:00| 1.9645127287932083| -|1970-01-01T08:00:01.140+08:00| 1.9477527250665083| -|1970-01-01T08:00:01.150+08:00| 1.9286363560946513| -|1970-01-01T08:00:01.160+08:00| 1.9072290767278735| -|1970-01-01T08:00:01.170+08:00| 1.8835963418164114| -|1970-01-01T08:00:01.180+08:00| 1.8578036062105014| -|1970-01-01T08:00:01.190+08:00| 1.8299163247603802| -|1970-01-01T08:00:01.200+08:00| 1.7999999523162842| -|1970-01-01T08:00:01.210+08:00| 1.7623635841923329| -|1970-01-01T08:00:01.220+08:00| 1.7129696477516976| -|1970-01-01T08:00:01.230+08:00| 1.6543635959181928| -|1970-01-01T08:00:01.240+08:00| 1.5890908816156328| -|1970-01-01T08:00:01.250+08:00| 1.5196969577678319| -|1970-01-01T08:00:01.260+08:00| 1.4487272772986044| -|1970-01-01T08:00:01.270+08:00| 1.3787272931317647| -|1970-01-01T08:00:01.280+08:00| 1.3122424581911272| -|1970-01-01T08:00:01.290+08:00| 1.251818225400506| -|1970-01-01T08:00:01.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:01.310+08:00| 1.1548000470995912| -|1970-01-01T08:00:01.320+08:00| 1.1130667107899999| -|1970-01-01T08:00:01.330+08:00| 
1.0756000393033045| -|1970-01-01T08:00:01.340+08:00| 1.043200033187868| -|1970-01-01T08:00:01.350+08:00| 1.016666692992053| -|1970-01-01T08:00:01.360+08:00| 0.9968000192642223| -|1970-01-01T08:00:01.370+08:00| 0.9844000125527389| -|1970-01-01T08:00:01.380+08:00| 0.9802666734059655| -|1970-01-01T08:00:01.390+08:00| 0.9852000023722649| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.410+08:00| 1.023999999165535| -|1970-01-01T08:00:01.420+08:00| 1.0559999990463256| -|1970-01-01T08:00:01.430+08:00| 1.0959999996423722| -|1970-01-01T08:00:01.440+08:00| 1.1440000009536744| -|1970-01-01T08:00:01.450+08:00| 1.2000000029802322| -|1970-01-01T08:00:01.460+08:00| 1.264000005722046| -|1970-01-01T08:00:01.470+08:00| 1.3360000091791153| -|1970-01-01T08:00:01.480+08:00| 1.4160000133514405| -|1970-01-01T08:00:01.490+08:00| 1.5040000182390214| -|1970-01-01T08:00:01.500+08:00| 1.600000023841858| -+-----------------------------+------------------------------------+ -``` - -### 3.20 Spread - -#### Registration statement - -```sql -create function spread as 'org.apache.iotdb.library.dprofile.UDAFSpread' -``` - -#### Usage - -This function is used to calculate the spread of time series, that is, the maximum value minus the minimum value. - -**Name:** SPREAD - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is the same as the input. There is only one data point in the series, whose timestamp is 0 and value is the spread. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. 
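The definition above can be sketched in a few lines of Python. This is an illustration only, not the library's implementation; the helper name `spread` and the list-based input are assumptions made for the example. The sample values are those of the example input in this section.

```python
import math


def spread(values):
    """Return max - min, skipping None and NaN as described in the Note."""
    valid = [v for v in values if v is not None and not math.isnan(v)]
    if not valid:
        return None  # assumption: nothing to report for an empty series
    return max(valid) - min(valid)


sample = [100.0, 101.0, 102.0, 104.0, 126.0, 108.0, 112.0, 113.0,
          114.0, 116.0, 118.0, 120.0, 124.0, 126.0, float("nan")]
print(spread(sample))  # 26.0
```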
- -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select spread(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time|spread(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 26.0| -+-----------------------------+-----------------------+ -``` - - - -### 3.21 ZScore - -#### Registration statement - -```sql -create function zscore as 'org.apache.iotdb.library.dprofile.UDTFZScore' -``` - -#### Usage - -This function is used to standardize the input series with z-score. - -**Name:** ZSCORE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `compute`: When set to "batch", standardization is performed after all data points are imported; when set to "stream", the mean and standard deviation must be provided. The default is "batch". -+ `avg`: Mean value when `compute` is set to "stream". -+ `sd`: Standard deviation when `compute` is set to "stream". - -**Output Series:** Output a single series. The type is DOUBLE. 
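The two modes can be illustrated with a short Python sketch. It is not the library's code: the function names are hypothetical, and the use of the population standard deviation in batch mode is an assumption (it does reproduce the batch example output in this section).

```python
import statistics


def zscore_batch(values):
    """Standardize using the mean and population standard deviation of all points."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population, not sample, deviation
    # A constant series (sd == 0) is not handled in this sketch.
    return [(v - mean) / sd for v in values]


def zscore_stream(values, avg, sd):
    """Standardize using a mean and standard deviation supplied by the caller."""
    return [(v - avg) / sd for v in values]


series = [0.0, 0.0, 1.0, -1.0, 0.0, 0.0, -2.0, 2.0, 0.0, 0.0,
          1.0, -1.0, -1.0, 1.0, 0.0, 0.0, 10.0, 2.0, -2.0, 0.0]
batch = zscore_batch(series)
print(batch[0], batch[16])
```

In stream mode the caller is responsible for supplying `avg` and `sd`, which is why no pass over the full series is needed.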
- -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select zscore(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|zscore(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.200+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.300+08:00| 0.20672455764868078| -|1970-01-01T08:00:00.400+08:00| -0.6201736729460423| -|1970-01-01T08:00:00.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.700+08:00| -1.033622788243404| -|1970-01-01T08:00:00.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:00.900+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.000+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.100+08:00| 0.20672455764868078| -|1970-01-01T08:00:01.200+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.300+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.400+08:00| 0.20672455764868078| 
-|1970-01-01T08:00:01.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.700+08:00| 3.9277665953249348| -|1970-01-01T08:00:01.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:01.900+08:00| -1.033622788243404| -|1970-01-01T08:00:02.000+08:00|-0.20672455764868078| -+-----------------------------+--------------------+ -``` - - -## 4. Anomaly Detection - -### 4.1 IQR - -#### Registration statement - -```sql -create function iqr as 'org.apache.iotdb.library.anomaly.UDTFIQR' -``` - -#### Usage - -This function is used to detect anomalies based on the interquartile range (IQR). Points lying more than 1.5 times the IQR outside the quartile range are selected as anomalies. - -**Name:** IQR - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: When set to "batch", the anomaly test is conducted after all data points are imported; when set to "stream", the upper and lower quantiles must be provided. The default method is "batch". -+ `q1`: The lower quantile when `method` is set to "stream". -+ `q3`: The upper quantile when `method` is set to "stream". - -**Output Series:** Output a single series. The type is DOUBLE. 
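As a rough illustration of the 1.5 × IQR rule, the following Python sketch flags points outside `[Q1 - 1.5*IQR, Q3 + 1.5*IQR]`. The function name, the `(time, value)` tuple layout, and the quartile estimator (`statistics.quantiles` with its default method) are assumptions; the library may compute quantiles differently.

```python
import statistics


def iqr_anomalies(points, q1=None, q3=None):
    """points: list of (time, value). Pass q1 and q3 to mimic "stream" mode."""
    if q1 is None or q3 is None:
        # "batch"-like mode: estimate the quartiles from all values.
        q1, _, q3 = statistics.quantiles([v for _, v in points], n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [(t, v) for t, v in points if v < low or v > high]


values = [0.0, 0.0, 1.0, -1.0, 0.0, 0.0, -2.0, 2.0, 0.0, 0.0,
          1.0, -1.0, -1.0, 1.0, 0.0, 0.0, 10.0, 2.0, -2.0, 0.0]
# Timestamps in milliseconds: 100, 200, ..., 2000 (hypothetical encoding).
points = [(100 * (i + 1), v) for i, v in enumerate(values)]
print(iqr_anomalies(points))
```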
- -**Note:** $IQR=Q_3-Q_1$ - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select iqr(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+-----------------+ -| Time|iqr(root.test.s1)| -+-----------------------------+-----------------+ -|1970-01-01T08:00:01.700+08:00| 10.0| -+-----------------------------+-----------------+ -``` - -### 4.2 KSigma - -#### Registration statement - -```sql -create function ksigma as 'org.apache.iotdb.library.anomaly.UDTFKSigma' -``` - -#### Usage - -This function is used to detect anomalies based on the Dynamic K-Sigma Algorithm. -Within a sliding window, any input value that deviates from the average by more than k times the standard deviation is output as an anomaly. - -**Name:** KSIGMA - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `k`: The multiple of the standard deviation used to define an anomaly. The default value is 3. 
-+ `window`: The window size of the Dynamic K-Sigma Algorithm. The default value is 10000. - -**Output Series:** Output a single series. The type is the same as the input series. - -**Note:** Anomaly detection is performed only when `k` is larger than 0. Otherwise, nothing will be output. - -#### Examples - -##### Assigning k - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:04.000+08:00| 100.0| -|2020-01-01T00:00:06.000+08:00| 150.0| -|2020-01-01T00:00:08.000+08:00| 200.0| -|2020-01-01T00:00:10.000+08:00| 200.0| -|2020-01-01T00:00:14.000+08:00| 200.0| -|2020-01-01T00:00:15.000+08:00| 200.0| -|2020-01-01T00:00:16.000+08:00| 200.0| -|2020-01-01T00:00:18.000+08:00| 200.0| -|2020-01-01T00:00:20.000+08:00| 150.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ksigma(s1,"k"="1.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -|Time |ksigma(root.test.d1.s1,"k"="1.0")| -+-----------------------------+---------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -+-----------------------------+---------------------------------+ -``` - -### 4.3 LOF - -#### Registration statement - -```sql -create function LOF as 'org.apache.iotdb.library.anomaly.UDTFLOF' -``` - -#### Usage - -This function is used to detect density anomalies in time series. 
According to the k-th distance calculation parameter and the local outlier factor (LOF) threshold, the function judges whether a set of input values is a density anomaly and outputs the LOF value computed for each set. - -**Name:** LOF - -**Input Series:** Multiple input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The detection method. The default value is "default", used when the input data has multiple dimensions. The alternative is "series", in which a single input series is transformed into a high-dimensional one. -+ `k`: Use the k-th distance to calculate LOF. The default value is 3. -+ `window`: The size of the window used to split the original data points. The default value is 10000. -+ `windowsize`: The dimension that the series is transformed into when `method` is "series". The default value is 5. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note:** Incomplete rows will be ignored. They are neither calculated nor marked as anomalies. - -#### Examples - -##### Using default parameters - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| 1.0| -|1970-01-01T08:00:00.300+08:00| 1.0| 1.0| -|1970-01-01T08:00:00.400+08:00| 1.0| 0.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -1.0| -|1970-01-01T08:00:00.600+08:00| -1.0| -1.0| -|1970-01-01T08:00:00.700+08:00| -1.0| 0.0| -|1970-01-01T08:00:00.800+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1,s2) from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|lof(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.100+08:00| 3.8274824267668244| 
-|1970-01-01T08:00:00.200+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.300+08:00| 2.838155437762879| -|1970-01-01T08:00:00.400+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.500+08:00| 2.73518261244453| -|1970-01-01T08:00:00.600+08:00| 2.371440975708148| -|1970-01-01T08:00:00.700+08:00| 2.73518261244453| -|1970-01-01T08:00:00.800+08:00| 1.7561416374270742| -+-----------------------------+-------------------------------------+ -``` - -##### Diagnosing 1d timeseries - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 1.0| -|1970-01-01T08:00:00.200+08:00| 2.0| -|1970-01-01T08:00:00.300+08:00| 3.0| -|1970-01-01T08:00:00.400+08:00| 4.0| -|1970-01-01T08:00:00.500+08:00| 5.0| -|1970-01-01T08:00:00.600+08:00| 6.0| -|1970-01-01T08:00:00.700+08:00| 7.0| -|1970-01-01T08:00:00.800+08:00| 8.0| -|1970-01-01T08:00:00.900+08:00| 9.0| -|1970-01-01T08:00:01.000+08:00| 10.0| -|1970-01-01T08:00:01.100+08:00| 11.0| -|1970-01-01T08:00:01.200+08:00| 12.0| -|1970-01-01T08:00:01.300+08:00| 13.0| -|1970-01-01T08:00:01.400+08:00| 14.0| -|1970-01-01T08:00:01.500+08:00| 15.0| -|1970-01-01T08:00:01.600+08:00| 16.0| -|1970-01-01T08:00:01.700+08:00| 17.0| -|1970-01-01T08:00:01.800+08:00| 18.0| -|1970-01-01T08:00:01.900+08:00| 19.0| -|1970-01-01T08:00:02.000+08:00| 20.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1, "method"="series") from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|lof(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 3.77777777777778| -|1970-01-01T08:00:00.200+08:00| 4.32727272727273| -|1970-01-01T08:00:00.300+08:00| 4.85714285714286| -|1970-01-01T08:00:00.400+08:00| 5.40909090909091| -|1970-01-01T08:00:00.500+08:00| 5.94999999999999| -|1970-01-01T08:00:00.600+08:00| 
6.43243243243243| -|1970-01-01T08:00:00.700+08:00| 6.79999999999999| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 7.0| -|1970-01-01T08:00:01.000+08:00| 6.79999999999999| -|1970-01-01T08:00:01.100+08:00| 6.43243243243243| -|1970-01-01T08:00:01.200+08:00| 5.94999999999999| -|1970-01-01T08:00:01.300+08:00| 5.40909090909091| -|1970-01-01T08:00:01.400+08:00| 4.85714285714286| -|1970-01-01T08:00:01.500+08:00| 4.32727272727273| -|1970-01-01T08:00:01.600+08:00| 3.77777777777778| -+-----------------------------+--------------------+ -``` - -### 4.4 MissDetect - -#### Registration statement - -```sql -create function missdetect as 'org.apache.iotdb.library.anomaly.UDTFMissDetect' -``` - -#### Usage - -This function is used to detect missing anomalies. -In some datasets, missing values are filled by linear interpolation. -Thus, there are several long perfect linear segments. -By discovering these perfect linear segments, -missing anomalies are detected. - -**Name:** MISSDETECT - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -`minlen`: The minimum length of the detected missing anomalies, which is an integer greater than or equal to 10. By default, it is 10. - -**Output Series:** Output a single series. The type is BOOLEAN. Each data point that is a missing anomaly is labeled true. 
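The idea of "discovering perfect linear segments" can be sketched as follows. This is a simplified illustration, not the library's algorithm: it assumes equally spaced timestamps, treats a run as linear when consecutive differences are equal within a tolerance `tol`, and uses a hypothetical function name. The minimum run length plays the role of the minimum-length parameter.

```python
def miss_detect(values, minlen=10, tol=1e-9):
    """Flag points that belong to a perfectly linear run of at least minlen points."""
    n = len(values)
    flags = [False] * n
    if n < 2:
        return flags
    start = 0  # first index of the current linear run
    for i in range(1, n):
        # The run continues while the step between neighbours stays constant.
        same_step = (
            i - start < 2
            or abs((values[i] - values[i - 1])
                   - (values[i - 1] - values[i - 2])) <= tol
        )
        if not same_step:
            if i - start >= minlen:
                flags[start:i] = [True] * (i - start)
            start = i - 1  # the previous point also opens the next run
    if n - start >= minlen:
        flags[start:n] = [True] * (n - start)
    return flags


series = [0.0, 1.0, 0.0, 1.0] + [0.0] * 12 + [1.0, 0.0, 1.0, 0.0, 1.0]
print(miss_detect(series, minlen=10))
```

With the 21-point example series of this section, only the twelve consecutive zeros form a run long enough to be flagged.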
- -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 0.0| -|2021-07-01T12:00:01.000+08:00| 1.0| -|2021-07-01T12:00:02.000+08:00| 0.0| -|2021-07-01T12:00:03.000+08:00| 1.0| -|2021-07-01T12:00:04.000+08:00| 0.0| -|2021-07-01T12:00:05.000+08:00| 0.0| -|2021-07-01T12:00:06.000+08:00| 0.0| -|2021-07-01T12:00:07.000+08:00| 0.0| -|2021-07-01T12:00:08.000+08:00| 0.0| -|2021-07-01T12:00:09.000+08:00| 0.0| -|2021-07-01T12:00:10.000+08:00| 0.0| -|2021-07-01T12:00:11.000+08:00| 0.0| -|2021-07-01T12:00:12.000+08:00| 0.0| -|2021-07-01T12:00:13.000+08:00| 0.0| -|2021-07-01T12:00:14.000+08:00| 0.0| -|2021-07-01T12:00:15.000+08:00| 0.0| -|2021-07-01T12:00:16.000+08:00| 1.0| -|2021-07-01T12:00:17.000+08:00| 0.0| -|2021-07-01T12:00:18.000+08:00| 1.0| -|2021-07-01T12:00:19.000+08:00| 0.0| -|2021-07-01T12:00:20.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select missdetect(s2,'minlen'='10') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------+ -| Time|missdetect(root.test.d2.s2, "minlen"="10")| -+-----------------------------+------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| false| -|2021-07-01T12:00:01.000+08:00| false| -|2021-07-01T12:00:02.000+08:00| false| -|2021-07-01T12:00:03.000+08:00| false| -|2021-07-01T12:00:04.000+08:00| true| -|2021-07-01T12:00:05.000+08:00| true| -|2021-07-01T12:00:06.000+08:00| true| -|2021-07-01T12:00:07.000+08:00| true| -|2021-07-01T12:00:08.000+08:00| true| -|2021-07-01T12:00:09.000+08:00| true| -|2021-07-01T12:00:10.000+08:00| true| -|2021-07-01T12:00:11.000+08:00| true| -|2021-07-01T12:00:12.000+08:00| true| -|2021-07-01T12:00:13.000+08:00| true| -|2021-07-01T12:00:14.000+08:00| true| -|2021-07-01T12:00:15.000+08:00| true| -|2021-07-01T12:00:16.000+08:00| 
false| -|2021-07-01T12:00:17.000+08:00| false| -|2021-07-01T12:00:18.000+08:00| false| -|2021-07-01T12:00:19.000+08:00| false| -|2021-07-01T12:00:20.000+08:00| false| -+-----------------------------+------------------------------------------+ -``` - -### 4.5 Range - -#### Registration statement - -```sql -create function range as 'org.apache.iotdb.library.anomaly.UDTFRange' -``` - -#### Usage - -This function is used to detect range anomalies in time series. Based on the upper bound and lower bound parameters, the function judges whether an input value is out of range, i.e., a range anomaly, and outputs a new time series of the anomalies. - -**Name:** RANGE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `lower_bound`: Lower bound of range anomaly detection. -+ `upper_bound`: Upper bound of range anomaly detection. - -**Output Series:** Output a single series. The type is the same as the input. - -**Note:** Anomaly detection is performed only when `upper_bound` is larger than `lower_bound`. Otherwise, nothing will be output. 
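The rule above amounts to a simple filter. The Python sketch below is illustrative only; the function name and the `(time, value)` tuple layout are assumptions. `NaN` values never compare as out of range, so they are not reported, which is consistent with the example in this section.

```python
def range_anomalies(points, lower_bound, upper_bound):
    """points: list of (time, value). Return the points outside the bounds."""
    if upper_bound <= lower_bound:
        return []  # per the Note: nothing is output for an invalid range
    return [(t, v) for t, v in points if v < lower_bound or v > upper_bound]


# Timestamps are seconds past 2020-01-01T00:00:00 (hypothetical encoding).
points = [(2, 100.0), (3, 101.0), (4, 102.0), (6, 104.0), (8, 126.0),
          (10, 108.0), (14, 112.0), (15, 113.0), (16, 114.0), (18, 116.0),
          (20, 118.0), (22, 120.0), (26, 124.0), (28, 126.0), (30, float("nan"))]
print(range_anomalies(points, 101.0, 125.0))
# [(2, 100.0), (8, 126.0), (28, 126.0)]
```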
- - - -#### Examples - -##### Assigning Lower and Upper Bound - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select range(s1,"lower_bound"="101.0","upper_bound"="125.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------------------+ -|Time |range(root.test.d1.s1,"lower_bound"="101.0","upper_bound"="125.0")| -+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -+-----------------------------+------------------------------------------------------------------+ -``` - -### 4.6 TwoSidedFilter - -#### Registration statement - -```sql -create function twosidedfilter as 'org.apache.iotdb.library.anomaly.UDTFTwoSidedFilter' -``` - -#### Usage - -The function is used to filter anomalies of a numeric time series based on two-sided window detection. - -**Name:** TWOSIDEDFILTER - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE - -**Output Series:** Output a single series. 
The type is the same as the input. It is the input without anomalies. - -**Parameters:** - -- `len`: The size of the window, which is a positive integer. By default, it's 5. When `len`=3, the algorithm examines the forward and backward windows of length 3 and calculates the outlierness of the current point. - -- `threshold`: The threshold of outlierness, which is a floating-point number in (0,1). By default, it's 0.3. The strictness of anomaly detection is proportional to the threshold. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:00:37.000+08:00| 1484.0| -|1970-01-01T08:00:38.000+08:00| 1055.0| -|1970-01-01T08:00:39.000+08:00| 1050.0| -|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select TwoSidedFilter(s0, 'len'='5', 'threshold'='0.3') from root.test -``` - -Output series: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| 
-+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -### 4.7 Outlier - -#### Registration statement - -```sql -create function outlier as 'org.apache.iotdb.library.anomaly.UDTFOutlier' -``` - -#### Usage - -This function is used to detect distance-based outliers. For each point in the current window, if the number of its neighbors within the neighbor distance threshold is less than the neighbor count threshold, the point is detected as an outlier. - -**Name:** OUTLIER - -**Input Series:** Only supports a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `r`: the neighbor distance threshold. -+ `k`: the neighbor count threshold. -+ `w`: the window size. -+ `s`: the slide size. - -**Output Series:** Output a single series. The type is the same as the input. 
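
The detection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the UDF's actual implementation: windows are assumed to be count-based (`w` points, advancing by `s` points), distance is taken as the absolute difference between values, a point is not counted as its own neighbor, and the name `distance_outliers` is hypothetical.

```python
def distance_outliers(values, r, k, w, s):
    """Flag points that have fewer than k neighbors within distance r
    inside any sliding window of w points (slide step s).

    Assumption: count-based windows and absolute-difference distance.
    """
    flagged = set()
    last_start = max(len(values) - w, 0)
    for start in range(0, last_start + 1, s):
        window = range(start, min(start + w, len(values)))
        for i in window:
            # Count the other points of the window that lie within r of point i.
            neighbors = sum(
                1 for j in window
                if j != i and abs(values[i] - values[j]) <= r
            )
            if neighbors < k:
                flagged.add(i)
    return [values[i] for i in sorted(flagged)]


# Values of root.test.s1 from the example in this section.
values = [56.0, 55.1, 54.2, 56.3, 59.0, 60.0, 60.5, 64.5, 69.0, 64.2,
          62.3, 58.0, 58.9, 52.0, 62.3, 61.0, 64.2, 61.8, 64.0, 63.0]
flagged = distance_outliers(values, r=5.0, k=4, w=10, s=5)
print(flagged)
```

Under these assumptions the sketch flags 69.0 and 52.0, the same two values that appear in the output table of the example for this function.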
- -#### Examples - -##### Assigning Parameters of Queries - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|2020-01-04T23:59:55.000+08:00| 56.0| -|2020-01-04T23:59:56.000+08:00| 55.1| -|2020-01-04T23:59:57.000+08:00| 54.2| -|2020-01-04T23:59:58.000+08:00| 56.3| -|2020-01-04T23:59:59.000+08:00| 59.0| -|2020-01-05T00:00:00.000+08:00| 60.0| -|2020-01-05T00:00:01.000+08:00| 60.5| -|2020-01-05T00:00:02.000+08:00| 64.5| -|2020-01-05T00:00:03.000+08:00| 69.0| -|2020-01-05T00:00:04.000+08:00| 64.2| -|2020-01-05T00:00:05.000+08:00| 62.3| -|2020-01-05T00:00:06.000+08:00| 58.0| -|2020-01-05T00:00:07.000+08:00| 58.9| -|2020-01-05T00:00:08.000+08:00| 52.0| -|2020-01-05T00:00:09.000+08:00| 62.3| -|2020-01-05T00:00:10.000+08:00| 61.0| -|2020-01-05T00:00:11.000+08:00| 64.2| -|2020-01-05T00:00:12.000+08:00| 61.8| -|2020-01-05T00:00:13.000+08:00| 64.0| -|2020-01-05T00:00:14.000+08:00| 63.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|outlier(root.test.s1,"r"="5.0","k"="4","w"="10","s"="5")| -+-----------------------------+--------------------------------------------------------+ -|2020-01-05T00:00:03.000+08:00| 69.0| -+-----------------------------+--------------------------------------------------------+ -|2020-01-05T00:00:08.000+08:00| 52.0| -+-----------------------------+--------------------------------------------------------+ -``` - -## 5. Frequency Domain Analysis - -### 5.1 Conv - -#### Registration statement - -```sql -create function conv as 'org.apache.iotdb.library.frequency.UDTFConv' -``` - -#### Usage - -This function is used to calculate the convolution, i.e. polynomial multiplication. - -**Name:** CONV - -**Input:** Only support two input series. 
The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output:** Output a single series. The type is DOUBLE. It is the result of convolution whose timestamps starting from 0 only indicate the order. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| 7.0| -|1970-01-01T08:00:00.001+08:00| 0.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select conv(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------+ -| Time|conv(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 7.0| -|1970-01-01T08:00:00.001+08:00| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| -|1970-01-01T08:00:00.003+08:00| 2.0| -+-----------------------------+--------------------------------------+ -``` - -### 5.2 Deconv - -#### Registration statement - -```sql -create function deconv as 'org.apache.iotdb.library.frequency.UDTFDeconv' -``` - -#### Usage - -This function is used to calculate the deconvolution, i.e. polynomial division. - -**Name:** DECONV - -**Input:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `result`: The result of deconvolution, which is 'quotient' or 'remainder'. By default, the quotient will be output. - -**Output:** Output a single series. The type is DOUBLE. It is the result of deconvolving the second series from the first series (dividing the first series by the second series) whose timestamps starting from 0 only indicate the order. - -**Note:** `NaN` in the input series will be ignored. 
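
To make the "polynomial multiplication" and "polynomial division" wording concrete, here is a minimal Python sketch of both operations. It assumes that each series holds polynomial coefficients in ascending order (index 0 is the constant term), which is consistent with the examples in this chapter; the helper names `conv` and `deconv` are illustrative and are not the UDF implementation.

```python
def conv(a, b):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def deconv(num, den):
    """Divide polynomial num by den; return (quotient, remainder).

    The remainder keeps the length of num, with zeros in the
    positions that the division cancels.
    """
    rem = [float(x) for x in num]
    quot = [0.0] * (len(num) - len(den) + 1)
    # Long division, starting from the highest-degree term.
    for k in range(len(quot) - 1, -1, -1):
        q = rem[k + len(den) - 1] / den[-1]
        quot[k] = q
        for j, d in enumerate(den):
            rem[k + j] -= q * d
    return quot, rem


product = conv([1.0, 0.0, 1.0], [7.0, 2.0])
quotient, remainder = deconv([8.0, 2.0, 7.0, 2.0], [7.0, 2.0])
print(product, quotient, remainder)
```

With the example inputs used in this chapter, the product is `[7, 2, 7, 2]`, the quotient is `[1, 0, 1]`, and the remainder is `[1, 0, 0, 0]`.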
- -#### Examples - - -##### Calculate the quotient - -When `result` is 'quotient' or the default, this function calculates the quotient of the deconvolution. - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s3|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 8.0| 7.0| -|1970-01-01T08:00:00.001+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| null| -|1970-01-01T08:00:00.003+08:00| 2.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select deconv(s3,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2)| -+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - -##### Calculate the remainder - -When `result` is 'remainder', this function calculates the remainder of the deconvolution. 
- -Input series is the same as above, the SQL for query is shown below: - - -```sql -select deconv(s3,s2,'result'='remainder') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2, "result"="remainder")| -+-----------------------------+--------------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 0.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -+-----------------------------+--------------------------------------------------------------+ -``` - -### 5.3 DWT - -#### Registration statement - -```sql -create function dwt as 'org.apache.iotdb.library.frequency.UDTFDWT' -``` - -#### Usage - -This function is used to calculate 1d discrete wavelet transform of a numerical series. - -**Name:** DWT - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of wavelet. May select 'Haar', 'DB4', 'DB6', 'DB8', where DB means Daubechies. User may offer coefficients of wavelet transform and ignore this parameter. Case ignored. -+ `coef`: Coefficients of wavelet transform. When providing this parameter, use comma ',' to split them, and leave no spaces or other punctuations. -+ `layer`: Times to transform. The number of output vectors equals $layer+1$. Default is 1. - -**Output:** Output a single series. The type is DOUBLE. The length is the same as the input. - -**Note:** The length of input series must be an integer number power of 2. 
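
As a rough illustration of what a single-layer Haar transform computes, the Python sketch below pairs adjacent points and emits the scaled sums (approximation coefficients) followed by the scaled differences (detail coefficients). The output layout, approximations first and details second, is inferred from the Haar example in this section; the function name `haar_dwt` is hypothetical, and other wavelets, the `coef` parameter, and `layer` greater than 1 are not covered.

```python
import math


def haar_dwt(values):
    """One layer of the Haar wavelet transform.

    Returns the approximation coefficients followed by the detail
    coefficients, so the output has the same length as the input.
    """
    n = len(values)
    if n < 2 or n & (n - 1):
        raise ValueError("input length must be a power of 2 (at least 2)")
    scale = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, n, 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, n, 2)]
    return approx + detail


# Values of root.test.d1.s1 from the Haar example in this section.
series = [0.0, 0.2, 1.5, 1.2, 0.6, 1.7, 0.8, 2.0,
          2.5, 2.1, 0.0, 2.0, 1.8, 1.2, 1.0, 1.6]
coefficients = haar_dwt(series)
print(coefficients)
```

The first eight outputs are the pairwise sums divided by the square root of 2, and the last eight are the pairwise differences divided by the square root of 2.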
- -#### Examples - - -##### Haar wavelet transform - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.100+08:00| 0.2| -|1970-01-01T08:00:00.200+08:00| 1.5| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.600+08:00| 0.8| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.800+08:00| 2.5| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select dwt(s1,"method"="haar") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|dwt(root.test.d1.s1, "method"="haar")| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.14142135834465192| -|1970-01-01T08:00:00.100+08:00| 1.909188342921157| -|1970-01-01T08:00:00.200+08:00| 1.6263456473052773| -|1970-01-01T08:00:00.300+08:00| 1.9798989957517026| -|1970-01-01T08:00:00.400+08:00| 3.252691126023161| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776479437628| -|1970-01-01T08:00:00.800+08:00| -0.14142135834465192| -|1970-01-01T08:00:00.900+08:00| 0.21213200063848547| -|1970-01-01T08:00:01.000+08:00| -0.7778174761639416| -|1970-01-01T08:00:01.100+08:00| -0.8485281289944873| -|1970-01-01T08:00:01.200+08:00| 0.2828427799095765| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426400127697095| -|1970-01-01T08:00:01.500+08:00| 
-0.42426408557066786| -+-----------------------------+-------------------------------------+ -``` - - -### 5.4 IDWT - -#### Registration statement - -```sql -create function idwt as 'org.apache.iotdb.library.frequency.UDTFIDWT' -``` - -#### Usage - -This function performs one-dimensional inverse discrete wavelet transform on the input series, reconstructing the original data from DWT decomposed wavelet coefficients. - -**Name:** IDWT - -**Input:** Only supports a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of wavelet. May select 'Haar', 'DB4', 'DB6', 'DB8', where DB means Daubechies. User may offer coefficients of wavelet transform and ignore this parameter. Case-insensitive. -+ `coef`: Coefficients of wavelet transform. When providing this parameter, use comma ',' to split them, and leave no spaces or other punctuation. -+ `layer`: Times to transform. The number of output vectors equals $layer+1$. Default is 1. - -**Output:** Output a single series. The type is DOUBLE. The length is the same as the input. - -**Note:** -* The length of the input series must be an integer power of 2. -* The parameter settings of the IDWT function (method/coef/layer) should be consistent with the corresponding DWT transformation to correctly reconstruct the original data. -* Typically, the input of IDWT is the output result of the DWT function. 
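
The reconstruction step can be sketched for the single-layer Haar case as follows. The sketch assumes the input is laid out as approximation coefficients followed by detail coefficients (the layout inferred from the DWT example output); `haar_idwt` is an illustrative name, not the UDF implementation, and other wavelets or multiple layers are not covered.

```python
import math


def haar_idwt(coefficients):
    """Invert one layer of the Haar transform.

    The first half of the input holds approximation coefficients,
    the second half holds detail coefficients.
    """
    n = len(coefficients)
    if n < 2 or n & (n - 1):
        raise ValueError("input length must be a power of 2 (at least 2)")
    half = n // 2
    scale = math.sqrt(2.0)
    out = []
    for approx, detail in zip(coefficients[:half], coefficients[half:]):
        out.append((approx + detail) / scale)  # first point of the pair
        out.append((approx - detail) / scale)  # second point of the pair
    return out


# Approximation and detail coefficients of the pairs (0.0, 0.2) and (1.5, 1.2).
restored = haar_idwt([0.1414213562373095, 1.909188309203678,
                      -0.1414213562373095, 0.21213203435596428])
print(restored)
```

Up to floating-point rounding, the sketch restores the four original points 0.0, 0.2, 1.5 and 1.2.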
- -#### Examples - -##### Haar wavelet transform - -Input series: - -``` -+-----------------------------+--------------------+ -| Time| root.test.d1.s2| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 0.1414213562373095| -|1970-01-01T08:00:00.100+08:00| 1.909188309203678| -|1970-01-01T08:00:00.200+08:00| 1.6263455967290592| -|1970-01-01T08:00:00.300+08:00| 1.979898987322333| -|1970-01-01T08:00:00.400+08:00| 3.2526911934581184| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776310850235| -|1970-01-01T08:00:00.800+08:00| -0.1414213562373095| -|1970-01-01T08:00:00.900+08:00| 0.21213203435596428| -|1970-01-01T08:00:01.000+08:00| -0.7778174593052022| -|1970-01-01T08:00:01.100+08:00| -0.8485281374238569| -|1970-01-01T08:00:01.200+08:00| 0.2828427124746189| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426406871192857| -|1970-01-01T08:00:01.500+08:00|-0.42426406871192857| -+-----------------------------+--------------------+ -``` - -SQL for query: - -```sql -select idwt(s2,"method"="haar") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------+ -| Time|idwt(root.test.d1.s2, "method"="haar")| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.100+08:00| 0.19999999999999998| -|1970-01-01T08:00:00.200+08:00| 1.4999999999999996| -|1970-01-01T08:00:00.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.6999999999999997| -|1970-01-01T08:00:00.600+08:00| 0.7999999999999998| -|1970-01-01T08:00:00.700+08:00| 1.9999999999999996| -|1970-01-01T08:00:00.800+08:00| 2.4999999999999996| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.9999999999999996| 
-|1970-01-01T08:00:01.200+08:00| 1.7999999999999998| -|1970-01-01T08:00:01.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:01.400+08:00| 0.9999999999999998| -|1970-01-01T08:00:01.500+08:00| 1.5999999999999999| -+-----------------------------+--------------------------------------+ -``` - - - -### 5.5 FFT - -#### Registration statement - -```sql -create function fft as 'org.apache.iotdb.library.frequency.UDTFFFT' -``` - -#### Usage - -This function is used to calculate the fast Fourier transform (FFT) of a numerical series. - -**Name:** FFT - -**Input:** Only supports a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of FFT, which is 'uniform' (by default) or 'nonuniform'. If the value is 'uniform', the timestamps will be ignored and all data points will be regarded as equidistant. Thus, the equidistant fast Fourier transform algorithm will be applied. If the value is 'nonuniform' (TODO), the non-equidistant fast Fourier transform algorithm will be applied based on timestamps. -+ `result`: The result of FFT, which is 'real', 'imag', 'abs' or 'angle', corresponding to the real part, imaginary part, magnitude and phase angle. By default, the magnitude will be output. -+ `compress`: The parameter of compression, which is within (0,1]. It is the reserved energy ratio of lossy compression. By default, there is no compression. - - -**Output:** Output a single series. The type is DOUBLE. The length is the same as the input. The timestamps start from 0 and only indicate the order. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - - -##### Uniform FFT - -With the default `method`, uniform FFT is applied. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select fft(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------+ -| Time| fft(root.test.d1.s1)| -+-----------------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 1.2727111142703152E-8| -|1970-01-01T08:00:00.002+08:00| 2.385520799101839E-7| -|1970-01-01T08:00:00.003+08:00| 8.723291723972645E-8| -|1970-01-01T08:00:00.004+08:00| 19.999999960195904| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| -|1970-01-01T08:00:00.006+08:00| 3.2260694930700566E-7| -|1970-01-01T08:00:00.007+08:00| 8.723291605373329E-8| -|1970-01-01T08:00:00.008+08:00| 1.108657103979944E-7| -|1970-01-01T08:00:00.009+08:00| 1.2727110997246171E-8| -|1970-01-01T08:00:00.010+08:00|1.9852334701272664E-23| -|1970-01-01T08:00:00.011+08:00| 1.2727111194499847E-8| -|1970-01-01T08:00:00.012+08:00| 1.108657103979944E-7| 
-|1970-01-01T08:00:00.013+08:00| 8.723291785769131E-8| -|1970-01-01T08:00:00.014+08:00| 3.226069493070057E-7| -|1970-01-01T08:00:00.015+08:00| 9.999999850988388| -|1970-01-01T08:00:00.016+08:00| 19.999999960195904| -|1970-01-01T08:00:00.017+08:00| 8.723291747109068E-8| -|1970-01-01T08:00:00.018+08:00| 2.3855207991018386E-7| -|1970-01-01T08:00:00.019+08:00| 1.2727112069910878E-8| -+-----------------------------+----------------------+ -``` - -Note: The input is $y=sin(2\pi t/4)+2sin(2\pi t/5)$ with a length of 20. Thus, there are peaks in $k=4$ and $k=5$ of the output. - -##### Uniform FFT with Compression - -Input series is the same as above, the SQL for query is shown below: - -```sql -select fft(s1, 'result'='real', 'compress'='0.99'), fft(s1, 'result'='imag','compress'='0.99') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| fft(root.test.d1.s1,| fft(root.test.d1.s1,| -| | "result"="real",| "result"="imag",| -| | "compress"="0.99")| "compress"="0.99")| -+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - -Note: Based on the conjugation of the Fourier transform result, only the first half of the compression result is reserved. -According to the given parameter, data points are reserved from low frequency to high frequency until the reserved energy ratio exceeds it. 
-The last data point is reserved to indicate the length of the series. - -### 5.6 HighPass - -#### Registration statement - -```sql -create function highpass as 'org.apache.iotdb.library.frequency.UDTFHighPass' -``` - -#### Usage - -This function performs high-pass filtering on the input series and extracts components above the cutoff frequency. -The timestamps of input will be ignored and all data points will be regarded as equidistant. - -**Name:** HIGHPASS - -**Input:** Only supports a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `wpass`: The normalized cutoff frequency, which takes a value in (0,1). This parameter is required. - -**Output:** Output a single series. The type is DOUBLE. It is the input after filtering. The length and timestamps of output are the same as the input. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql 
-select highpass(s1,'wpass'='0.45') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------+ -| Time|highpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9999999534830373| -|1970-01-01T08:00:01.000+08:00| 1.7462829277628608E-8| -|1970-01-01T08:00:02.000+08:00| -0.9999999593178128| -|1970-01-01T08:00:03.000+08:00| -4.1115269056426626E-8| -|1970-01-01T08:00:04.000+08:00| 0.9999999925494194| -|1970-01-01T08:00:05.000+08:00| 3.328126513330016E-8| -|1970-01-01T08:00:06.000+08:00| -1.0000000183304454| -|1970-01-01T08:00:07.000+08:00| 6.260191433311374E-10| -|1970-01-01T08:00:08.000+08:00| 1.0000000018134796| -|1970-01-01T08:00:09.000+08:00| -3.097210911744423E-17| -|1970-01-01T08:00:10.000+08:00| -1.0000000018134794| -|1970-01-01T08:00:11.000+08:00| -6.260191627862097E-10| -|1970-01-01T08:00:12.000+08:00| 1.0000000183304454| -|1970-01-01T08:00:13.000+08:00| -3.328126501424346E-8| -|1970-01-01T08:00:14.000+08:00| -0.9999999925494196| -|1970-01-01T08:00:15.000+08:00| 4.111526915498874E-8| -|1970-01-01T08:00:16.000+08:00| 0.9999999593178128| -|1970-01-01T08:00:17.000+08:00| -1.7462829341296528E-8| -|1970-01-01T08:00:18.000+08:00| -0.9999999534830369| -|1970-01-01T08:00:19.000+08:00| -1.035237222742873E-16| -+-----------------------------+-----------------------------------------+ -``` - -Note: The input is $y=sin(2\pi t/4)+2sin(2\pi t/5)$ with a length of 20. Thus, the output is $y=sin(2\pi t/4)$ after high-pass filtering. - -### 5.7 IFFT - -#### Registration statement - -```sql -create function ifft as 'org.apache.iotdb.library.frequency.UDTFIFFT' -``` - -#### Usage - -This function treats the two input series as the real and imaginary part of a complex series, performs an inverse fast Fourier transform (IFFT), and outputs the real part of the result. 
-For the input format, please refer to the output format of `FFT` function. -Moreover, the compressed output of `FFT` function is also supported. - -**Name:** IFFT - -**Input:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `start`: The start time of the output series with the format 'yyyy-MM-dd HH:mm:ss'. By default, it is '1970-01-01 08:00:00'. -+ `interval`: The interval of the output series, which is a positive number with an unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, it is 1s. - -**Output:** Output a single series. The type is DOUBLE. It is strictly equispaced. The values are the results of IFFT. - -**Note:** If a row contains null points or `NaN`, it will be ignored. - -#### Examples - - -Input series: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| root.test.d1.re| root.test.d1.im| -+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - - -SQL for query: - -```sql -select ifft(re, im, 'interval'='1m', 'start'='2021-01-01 00:00:00') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------------+ -| Time|ifft(root.test.d1.re, root.test.d1.im, "interval"="1m",| -| | "start"="2021-01-01 00:00:00")| 
-+-----------------------------+-------------------------------------------------------+ -|2021-01-01T00:00:00.000+08:00| 2.902112992431231| -|2021-01-01T00:01:00.000+08:00| 1.1755704705132448| -|2021-01-01T00:02:00.000+08:00| -2.175570513757101| -|2021-01-01T00:03:00.000+08:00| -1.9021130389094498| -|2021-01-01T00:04:00.000+08:00| 0.9999999925494194| -|2021-01-01T00:05:00.000+08:00| 1.902113046743454| -|2021-01-01T00:06:00.000+08:00| 0.17557053610884188| -|2021-01-01T00:07:00.000+08:00| -1.1755704886020932| -|2021-01-01T00:08:00.000+08:00| -0.9021130371347148| -|2021-01-01T00:09:00.000+08:00| 3.552713678800501E-16| -|2021-01-01T00:10:00.000+08:00| 0.9021130371347154| -|2021-01-01T00:11:00.000+08:00| 1.1755704886020932| -|2021-01-01T00:12:00.000+08:00| -0.17557053610884144| -|2021-01-01T00:13:00.000+08:00| -1.902113046743454| -|2021-01-01T00:14:00.000+08:00| -0.9999999925494196| -|2021-01-01T00:15:00.000+08:00| 1.9021130389094498| -|2021-01-01T00:16:00.000+08:00| 2.1755705137571004| -|2021-01-01T00:17:00.000+08:00| -1.1755704705132448| -|2021-01-01T00:18:00.000+08:00| -2.902112992431231| -|2021-01-01T00:19:00.000+08:00| -3.552713678800501E-16| -+-----------------------------+-------------------------------------------------------+ -``` - -### 5.8 LowPass - -#### Registration statement - -```sql -create function lowpass as 'org.apache.iotdb.library.frequency.UDTFLowPass' -``` - -#### Usage - -This function performs low-pass filtering on the input series and extracts components below the cutoff frequency. -The timestamps of input will be ignored and all data points will be regarded as equidistant. - -**Name:** LOWPASS - -**Input:** Only supports a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `wpass`: The normalized cutoff frequency, which takes a value in (0,1). This parameter is required. - -**Output:** Output a single series. The type is DOUBLE. It is the input after filtering. 
The length and timestamps of output are the same as the input. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select lowpass(s1,'wpass'='0.45') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|lowpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.9021130073323922| -|1970-01-01T08:00:01.000+08:00| 1.1755704705132448| -|1970-01-01T08:00:02.000+08:00| -1.1755705286582614| -|1970-01-01T08:00:03.000+08:00| -1.9021130389094498| -|1970-01-01T08:00:04.000+08:00| 7.450580419288145E-9| -|1970-01-01T08:00:05.000+08:00| 1.902113046743454| -|1970-01-01T08:00:06.000+08:00| 1.1755705212076808| -|1970-01-01T08:00:07.000+08:00| -1.1755704886020932| -|1970-01-01T08:00:08.000+08:00| -1.9021130222335536| 
-|1970-01-01T08:00:09.000+08:00| 3.552713678800501E-16| -|1970-01-01T08:00:10.000+08:00| 1.9021130222335536| -|1970-01-01T08:00:11.000+08:00| 1.1755704886020932| -|1970-01-01T08:00:12.000+08:00| -1.1755705212076801| -|1970-01-01T08:00:13.000+08:00| -1.902113046743454| -|1970-01-01T08:00:14.000+08:00| -7.45058112983088E-9| -|1970-01-01T08:00:15.000+08:00| 1.9021130389094498| -|1970-01-01T08:00:16.000+08:00| 1.1755705286582616| -|1970-01-01T08:00:17.000+08:00| -1.1755704705132448| -|1970-01-01T08:00:18.000+08:00| -1.9021130073323924| -|1970-01-01T08:00:19.000+08:00| -2.664535259100376E-16| -+-----------------------------+----------------------------------------+ -``` - -Note: The input is $y=sin(2\pi t/4)+2sin(2\pi t/5)$ with a length of 20. Thus, the output is $y=2sin(2\pi t/5)$ after low-pass filtering. - - -### 5.9 Envelope - -#### Registration statement - -```sql -create function envelope as 'org.apache.iotdb.library.frequency.UDFEnvelopeAnalysis' -``` - -#### Usage - -This function achieves signal demodulation and envelope extraction by inputting a one-dimensional floating-point array and a user specified modulation frequency. The goal of demodulation is to extract the parts of interest from complex signals, making them easier to understand. For example, demodulation can be used to find the envelope of the signal, that is, the trend of amplitude changes. - -**Name:** Envelope - -**Input:** Only supports a single input sequence, with types INT32/INT64/FLOAT/DOUBLE - - -**Parameters:** - -+ `frequency`: Frequency (optional, positive number. If this parameter is not filled in, the system will infer the frequency based on the time interval corresponding to the sequence). -+ `amplification`: Amplification factor (optional, positive integer. The output of the Time column is a set of positive integers and does not output decimals. When the frequency is less than 1, this parameter can be used to amplify the frequency to display normal results). 
- -**Output:** -+ `Time`: The meaning of the value returned by this column is frequency rather than time. If the output format is time format (e.g. 1970-01-01T08:00:19.000+08:00), please convert it to a timestamp value. - - -+ `Envelope(Path, 'frequency'='{frequency}')`: Output a single sequence of type DOUBLE, which is the result of envelope analysis. - -**Note:** When the values of the demodulated original sequence are discontinuous, this function treats the sequence as continuous. It is recommended that the analyzed time series be a complete time series of values. It is also recommended to specify a start time and an end time. - -#### Examples - -Input series: - - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:01.000+08:00| 1.0 | -|1970-01-01T08:00:02.000+08:00| 2.0 | -|1970-01-01T08:00:03.000+08:00| 3.0 | -|1970-01-01T08:00:04.000+08:00| 4.0 | -|1970-01-01T08:00:05.000+08:00| 5.0 | -|1970-01-01T08:00:06.000+08:00| 6.0 | -|1970-01-01T08:00:07.000+08:00| 7.0 | -|1970-01-01T08:00:08.000+08:00| 8.0 | -|1970-01-01T08:00:09.000+08:00| 9.0 | -|1970-01-01T08:00:10.000+08:00| 10.0 | -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -set time_display_type=long; -select envelope(s1),envelope(s1,'frequency'='1000'),envelope(s1,'amplification'='10') from root.test.d1; -``` - -Output series: - - -``` -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -|Time|envelope(root.test.d1.s1)|envelope(root.test.d1.s1, "frequency"="1000")|envelope(root.test.d1.s1, "amplification"="10")| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -| 0| 6.284350808484124| 6.284350808484124| 6.284350808484124| -| 100| 1.5581923657404393| 1.5581923657404393| null| -| 200| 0.8503211038340728| 0.8503211038340728| 
null| -| 300| 0.512808785945551| 0.512808785945551| null| -| 400| 0.26361156774506744| 0.26361156774506744| null| -|1000| null| null| 1.5581923657404393| -|2000| null| null| 0.8503211038340728| -|3000| null| null| 0.512808785945551| -|4000| null| null| 0.26361156774506744| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ - -``` - - -## 6. Data Matching - -### 6.1 Cov - -#### Registration statement - -```sql -create function cov as 'org.apache.iotdb.library.dmatch.UDAFCov' -``` - -#### Usage - -This function is used to calculate the population covariance. - -**Name:** COV - -**Input Series:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the population covariance. - -**Note:** - -+ If a row contains missing points, null points or `NaN`, it will be ignored; -+ If all rows are ignored, `NaN` will be output. 
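The rule described above (drop every row where either value is missing, `null` or `NaN`, then take the population covariance of what remains) can be sketched in Python. This is an illustration only, not the UDF's actual implementation; the name `population_cov` and the use of `None` to stand for `null` are assumptions made for this sketch. The data is the input series of the example in this section.

```python
import math


def population_cov(xs, ys):
    """Population covariance over rows where both values are present and not NaN."""
    pairs = [
        (x, y)
        for x, y in zip(xs, ys)
        if x is not None and y is not None
        and not math.isnan(x) and not math.isnan(y)
    ]
    if not pairs:
        # Every row was ignored, so the result is NaN.
        return math.nan
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Divide by n (population), not n - 1 (sample).
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n


# None stands for null in this sketch.
s1 = [100.0, 101.0, 102.0, 104.0, 126.0, 108.0, None, 112.0,
      113.0, 114.0, 116.0, 118.0, 100.0, 124.0, 126.0, math.nan]
s2 = [101.0, None, 101.0, 102.0, 102.0, 103.0, 103.0, 104.0,
      None, 104.0, 105.0, 105.0, 106.0, 108.0, 108.0, 108.0]

result = population_cov(s1, s2)
print(result)
```

Twelve of the sixteen rows survive the filter, and the result agrees with the 12.2916... value in the example output up to floating-point rounding.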
- - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select cov(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|cov(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 12.291666666666666| -+-----------------------------+-------------------------------------+ -``` - -### 6.2 DTW - -#### Registration statement - -```sql -create function dtw as 'org.apache.iotdb.library.dmatch.UDAFDtw' -``` - -#### Usage - -This function is used to calculate the DTW distance between two input series. - -**Name:** DTW - -**Input Series:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the DTW distance. 
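As a rough illustration of what a DTW distance measures, here is a minimal Python sketch of the classic dynamic-programming recurrence. It assumes the absolute difference as the point-wise cost and no warping-window constraint; the source does not state which cost function the UDF uses, so treat both choices, and the name `dtw_distance`, as assumptions.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute-difference cost (assumed)."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        return 0.0
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Twenty points of 1.0 against twenty points of 2.0, as in the example of this section.
result = dtw_distance([1.0] * 20, [2.0] * 20)
print(result)
```

With these two constant series every aligned pair costs 1 and the shortest warping path has 20 steps, so the sketch yields 20.0, the same value as the example output.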
- -**Note:** - -+ If a row contains missing points, null points or `NaN`, it will be ignored; -+ If all rows are ignored, `0` will be output. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.003+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.004+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.005+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.006+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.007+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.008+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.009+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.010+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.011+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.012+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.013+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.014+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.015+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.016+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.017+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.018+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.019+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.020+08:00| 1.0| 2.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select dtw(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|dtw(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 20.0| -+-----------------------------+-------------------------------------+ -``` - -### 6.3 Pearson - -#### Registration statement - -```sql -create function pearson as 'org.apache.iotdb.library.dmatch.UDAFPearson' -``` - -#### Usage - -This function is used to calculate the Pearson Correlation Coefficient. - -**Name:** PEARSON - -**Input Series:** Only support two input series. 
The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the Pearson Correlation Coefficient. - -**Note:** - -+ If a row contains missing points, null points or `NaN`, it will be ignored; -+ If all rows are ignored, `NaN` will be output. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select pearson(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------+ -| Time|pearson(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.5630881927754872| -+-----------------------------+-----------------------------------------+ -``` - -### 6.4 PtnSym - -#### Registration statement - -```sql -create function ptnsym as 'org.apache.iotdb.library.dmatch.UDTFPtnSym' -``` - -#### Usage - -This 
function is used to find all symmetric subseries in the input whose degree of symmetry is less than the threshold. -The degree of symmetry is calculated by DTW. -The smaller the degree, the more symmetrical the series is. - -**Name:** PATTERNSYMMETRIC - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE - -**Parameter:** - -+ `window`: The length of the symmetric subseries. It's a positive integer and the default value is 10. -+ `threshold`: The threshold of the degree of symmetry. It's non-negative. Only the subseries whose degree of symmetry is below it will be output. By default, all subseries will be output. - - -**Output Series:** Output a single series. The type is DOUBLE. Each data point in the output series corresponds to a symmetric subseries. The output timestamp is the starting timestamp of the subseries and the output value is the degree of symmetry. - -#### Example - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s4| -+-----------------------------+---------------+ -|2021-01-01T12:00:00.000+08:00| 1.0| -|2021-01-01T12:00:01.000+08:00| 2.0| -|2021-01-01T12:00:02.000+08:00| 3.0| -|2021-01-01T12:00:03.000+08:00| 2.0| -|2021-01-01T12:00:04.000+08:00| 1.0| -|2021-01-01T12:00:05.000+08:00| 1.0| -|2021-01-01T12:00:06.000+08:00| 1.0| -|2021-01-01T12:00:07.000+08:00| 1.0| -|2021-01-01T12:00:08.000+08:00| 2.0| -|2021-01-01T12:00:09.000+08:00| 3.0| -|2021-01-01T12:00:10.000+08:00| 2.0| -|2021-01-01T12:00:11.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ptnsym(s4, 'window'='5', 'threshold'='0') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|ptnsym(root.test.d1.s4, "window"="5", "threshold"="0")| -+-----------------------------+------------------------------------------------------+ -|2021-01-01T12:00:00.000+08:00| 0.0| 
-|2021-01-01T12:00:07.000+08:00| 0.0| -+-----------------------------+------------------------------------------------------+ -``` - -### 6.5 XCorr - -#### Registration statement - -```sql -create function xcorr as 'org.apache.iotdb.library.dmatch.UDTFXCorr' -``` - -#### Usage - -This function is used to calculate the cross correlation function of two given time series. -For discrete time series, cross correlation is given by -$$CR(n) = \frac{1}{N} \sum_{m=1}^N S_1[m]S_2[m+n]$$ -which represents the similarity between the two series at different index shifts. Terms whose index falls outside $1..N$ are treated as 0. - -**Name:** XCORR - -**Input Series:** Only support two input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series with DOUBLE as datatype. -There are $2N-1$ data points in the series, the center of which represents the cross correlation -calculated with pre-aligned series (that is, $CR(0)$ in the formula above), -and the preceding (or following) values represent those obtained by shifting the latter series forward (or backward) -until the two series no longer overlap (the non-overlapping case is not included). -In short, the values of the output series are given by (index starts from 1) -$$OS[i] = CR(N-i) = \frac{1}{N} \sum_{m=1}^{i} S_1[m]S_2[N-i+m],\ if\ i <= N$$ -$$OS[i] = CR(N-i) = \frac{1}{N} \sum_{m=1}^{2N-i} S_1[i-N+m]S_2[m],\ if\ i > N$$ - -**Note:** - -+ `null` and `NaN` values in the input series are treated as 0.
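The two explicit sums for $OS[i]$ above can be checked with a short Python sketch. It is an illustration under stated assumptions, not the UDF's code: the function name `xcorr` and the use of `None` for `null` are choices made here, and the input is the example data of this section.

```python
import math


def xcorr(s1, s2):
    """Return the 2N-1 values OS[1..2N-1], treating null/NaN as 0."""

    def clean(v):
        return 0.0 if v is None or math.isnan(v) else float(v)

    a = [clean(v) for v in s1]
    b = [clean(v) for v in s2]
    n = len(a)
    out = []
    # The first output pairs S1[1] with S2[N]; the last pairs S1[N] with S2[1].
    # In 0-based terms that is a shift of n-1 down to -(n-1) applied to b's index.
    for shift in range(n - 1, -n, -1):
        total = sum(a[m] * b[m + shift] for m in range(n) if 0 <= m + shift < n)
        out.append(total / n)
    return out


s1 = [None, 2, 3, 4, 5]      # None stands for null in this sketch
s2 = [6, 7, math.nan, 9, 10]

result = xcorr(s1, s2)
print(result)
```

The nine values produced for this input agree with the example output of this section, with 20.0 (the pre-aligned case) in the center.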
- -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| null| 6| -|2020-01-01T00:00:02.000+08:00| 2| 7| -|2020-01-01T00:00:03.000+08:00| 3| NaN| -|2020-01-01T00:00:04.000+08:00| 4| 9| -|2020-01-01T00:00:05.000+08:00| 5| 10| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select xcorr(s1, s2) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -Output series: - -``` -+-----------------------------+---------------------------------------+ -| Time|xcorr(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+---------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 4.0| -|1970-01-01T08:00:00.002+08:00| 9.6| -|1970-01-01T08:00:00.003+08:00| 13.4| -|1970-01-01T08:00:00.004+08:00| 20.0| -|1970-01-01T08:00:00.005+08:00| 15.6| -|1970-01-01T08:00:00.006+08:00| 9.2| -|1970-01-01T08:00:00.007+08:00| 11.8| -|1970-01-01T08:00:00.008+08:00| 6.0| -+-----------------------------+---------------------------------------+ -``` - -### 6.6 Pattern\_match - -#### Registration statement - -```SQL -create function pattern_match as 'org.apache.iotdb.library.match.UDAFPatternMatch' -``` - -#### Usage - -This function performs pattern matching between an input time series and a predefined pattern. A match is considered successful if the similarity measure (distance) is less than or equal to a specified threshold. The results are output as a JSON list. - -**Name:** PATTERN\_MATCH - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE / BOOLEAN. - -**Parameter:** - -* `timePattern`: A comma-separated string of timestamps (e.g., `"t1,t2,t3"`). Length must be **greater than 1**. Required.
-* `valuePattern`: A comma-separated string of numerical values corresponding to `timePattern`. Length must **match** `timePattern` and be greater than 1. Required. - -> For boolean values: Use `1` for `true` and `0` for `false`. - -* `threshold`: Float-type similarity threshold. Required. - -**Output Series:** A JSON list containing all successfully matched segments. Each entry includes: start timestamp `startTime`, end timestamp `endTime`, calculated similarity value `distance`. - -#### Example - -1. Linear Data - -Input series: - -```SQL -IoTDB> select s0 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s0| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.1| -|1970-01-01T08:00:00.003+08:00| 1.2| -|1970-01-01T08:00:00.004+08:00| 1.3| -|1970-01-01T08:00:00.005+08:00| 0.0| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+--------------------------------------------------------------------------------------------------+ -| match_result| -+--------------------------------------------------------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]| -+--------------------------------------------------------------------------------------------------+ -``` - -2.
Boolean Data - -Input series: - -```SQL -IoTDB> select s1 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| true| -|1970-01-01T08:00:00.002+08:00| true| -|1970-01-01T08:00:00.003+08:00| true| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| false| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", "threshold"="0.5") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+-------------------------------------------------+ -| match_result| -+-------------------------------------------------+ -|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+-------------------------------------------------+ -``` - -3. V-shaped Data - -Input series: - -```SQL -IoTDB> select s2 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s2| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| -1.0| -|1970-01-01T08:00:00.003+08:00| -2.0| -|1970-01-01T08:00:00.004+08:00| -3.0| -|1970-01-01T08:00:00.005+08:00| -2.0| -|1970-01-01T08:00:00.006+08:00| -1.0| -|1970-01-01T08:00:00.007+08:00| -0.0| -|1970-01-01T08:00:00.008+08:00| -0.0| -|1970-01-01T08:00:00.009+08:00| -0.0| -|1970-01-01T08:00:00.010+08:00| -0.0| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s2, "timePattern"="1,2,3,4,5,6,7", "valuePattern"="0.0,-1.0,-2.0,-3.0,-2.0,-1.0,-0.0", "threshold"="10") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+----------------------------------------------+ -| match_result| -+----------------------------------------------+ -|[{"distance":0.53,"startTime":1,"endTime":10}]| -+----------------------------------------------+ -``` - -4. 
Multiple Matching Pattern - -Input series: - -```SQL -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------+-------------+ -| Time|root.db.d0.s0|root.db.d0.s1| -+-----------------------------+-------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| true| -|1970-01-01T08:00:00.002+08:00| 1.1| true| -|1970-01-01T08:00:00.003+08:00| 1.2| true| -|1970-01-01T08:00:00.004+08:00| 1.3| false| -|1970-01-01T08:00:00.005+08:00| 0.0| false| -+-----------------------------+-------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result1, pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", - "threshold"="0.5") as match_result2 from root.db.d0 -``` - -Output series: - -```SQL -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -| match_result1| match_result2| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -``` - - -## 7. Data Repairing - -### 7.1 TimestampRepair - -#### Registration statement - -```sql -create function timestamprepair as 'org.apache.iotdb.library.drepair.UDTFTimestampRepair' -``` - -#### Usage - -This function is used for timestamp repair. -According to the given standard time interval, -the method of minimizing the repair cost is adopted. -By fine-tuning the timestamps, -the original data with unstable timestamp interval is repaired to strictly equispaced data. 
-If no standard time interval is given, -this function will use the **median**, **mode** or **cluster** of the time interval to estimate the standard time interval. - -**Name:** TIMESTAMPREPAIR - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `interval`: The standard time interval whose unit is millisecond. It is a positive integer. By default, it will be estimated according to the given method. -+ `method`: The method to estimate the standard time interval, which is 'median', 'mode' or 'cluster'. This parameter is only valid when `interval` is not given. By default, median will be used. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. - -#### Examples - -##### Manually Specify the Standard Time Interval - -When `interval` is given, this function repairs according to the given standard time interval. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:19.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:01.000+08:00| 7.0| -|2021-07-01T12:01:11.000+08:00| 8.0| -|2021-07-01T12:01:21.000+08:00| 9.0| -|2021-07-01T12:01:31.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timestamprepair(s1,'interval'='10000') from root.test.d2 -``` - -Output series: - - -``` -+-----------------------------+----------------------------------------------------+ -| Time|timestamprepair(root.test.d2.s1, "interval"="10000")| -+-----------------------------+----------------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| 
-|2021-07-01T12:00:20.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:00.000+08:00| 7.0| -|2021-07-01T12:01:10.000+08:00| 8.0| -|2021-07-01T12:01:20.000+08:00| 9.0| -|2021-07-01T12:01:30.000+08:00| 10.0| -|2021-07-01T12:01:40.000+08:00| NaN| -+-----------------------------+----------------------------------------------------+ -``` - -##### Automatically Estimate the Standard Time Interval - -When `interval` is default, this function estimates the standard time interval. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select timestamprepair(s1) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------+ -| Time|timestamprepair(root.test.d2.s1)| -+-----------------------------+--------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:20.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:00.000+08:00| 7.0| -|2021-07-01T12:01:10.000+08:00| 8.0| -|2021-07-01T12:01:20.000+08:00| 9.0| -|2021-07-01T12:01:30.000+08:00| 10.0| -|2021-07-01T12:01:40.000+08:00| NaN| -+-----------------------------+--------------------------------+ -``` - -### 7.2 ValueFill - -#### Registration statement - -```sql -create function valuefill as 'org.apache.iotdb.library.drepair.UDTFValueFill' -``` - -#### Usage - -This function is used to impute time series. Several methods are supported. - -**Name**: ValueFill -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: {"mean", "previous", "linear", "likelihood", "AR", "MA", "SCREEN"}, default "linear". - Method to use for imputation in series. 
"mean": use global mean value to fill holes; "previous": propagate last valid observation forward to next valid. "linear": simplest interpolation method; "likelihood":Maximum likelihood estimation based on the normal distribution of speed; "AR": auto regression; "MA": moving average; "SCREEN": speed constraint. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. - -**Note:** AR method use AR(1) model. Input value should be auto-correlated, or the function would output a single point (0, 0.0). - -#### Examples - -##### Fill with linear - -When `method` is "linear" or the default, Screen method is used to impute. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| NaN| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| NaN| -|2020-01-01T00:00:22.000+08:00| NaN| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select valuefill(s1) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------+ -| Time|valuefill(root.test.d2.s1)| -+-----------------------------+--------------------------+ -|2020-01-01T00:00:02.000+08:00| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 
-|2020-01-01T00:00:14.000+08:00| 110.5| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.66666666666667| -|2020-01-01T00:00:22.000+08:00| 121.33333333333333| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+--------------------------+ -``` - -##### Previous Fill - -When `method` is "previous", previous method is used. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select valuefill(s1,"method"="previous") from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------------+ -| Time|valuefill(root.test.d2.s1, "method"="previous")| -+-----------------------------+-----------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 108.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 116.0| -|2020-01-01T00:00:22.000+08:00| 116.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-----------------------------------------------+ -``` - -### 7.3 ValueRepair - -#### Registration statement - -```sql -create function valuerepair as 'org.apache.iotdb.library.drepair.UDTFValueRepair' -``` - -#### Usage - -This function is used to repair the value of the time series. 
-Currently, two methods are supported: -**Screen** is a method based on a speed threshold, which makes all speeds meet the threshold requirements with minimal changes to the data; -**LsGreedy** is a method based on the likelihood of speed changes, which models speed changes as a Gaussian distribution and uses a greedy algorithm to maximize the likelihood. - - -**Name:** VALUEREPAIR - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The method used to repair, which is 'Screen' or 'LsGreedy'. By default, Screen is used. -+ `minSpeed`: This parameter is only valid with Screen. It is the lower speed threshold. Speeds below it will be regarded as outliers. By default, it is the median minus 3 times the median absolute deviation. -+ `maxSpeed`: This parameter is only valid with Screen. It is the upper speed threshold. Speeds above it will be regarded as outliers. By default, it is the median plus 3 times the median absolute deviation. -+ `center`: This parameter is only valid with LsGreedy. It is the center of the Gaussian distribution of speed changes. By default, it is 0. -+ `sigma`: This parameter is only valid with LsGreedy. It is the standard deviation of the Gaussian distribution of speed changes. By default, it is the median absolute deviation. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. - -**Note:** `NaN` will be filled with linear interpolation before repairing. - -#### Examples - -##### Repair with Screen - -When `method` is 'Screen' or the default, the Screen method is used.
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select valuerepair(s1) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|valuerepair(root.test.d2.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+----------------------------+ -``` - -##### Repair with LsGreedy - -When `method` is 'LsGreedy', LsGreedy method is used. 
- -Input series is the same as above, the SQL for query is shown below: - -```sql -select valuerepair(s1,'method'='LsGreedy') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|valuerepair(root.test.d2.s1, "method"="LsGreedy")| -+-----------------------------+-------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-------------------------------------------------+ -``` - -## 8. Series Discovery - -### 8.1 ConsecutiveSequences - -#### Registration statement - -```sql -create function consecutivesequences as 'org.apache.iotdb.library.series.UDTFConsecutiveSequences' -``` - -#### Usage - -This function is used to find locally longest consecutive subsequences in strictly equispaced multidimensional data. - -Strictly equispaced data is the data whose time intervals are strictly equal. Missing data, including missing rows and missing values, is allowed in it, while data redundancy and timestamp drift is not allowed. - -Consecutive subsequence is the subsequence that is strictly equispaced with the standard time interval without any missing data. If a consecutive subsequence is not a proper subsequence of any consecutive subsequence, it is locally longest. - -**Name:** CONSECUTIVESEQUENCES - -**Input Series:** Support multiple input series. 
The type is arbitrary but the data is strictly equispaced. - -**Parameters:** - -+ `gap`: The standard time interval, which is a positive number with a unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, it will be estimated by the mode of time intervals. - -**Output Series:** Output a single series. The type is INT32. Each data point in the output series corresponds to a locally longest consecutive subsequence. The output timestamp is the starting timestamp of the subsequence and the output value is the number of data points in the subsequence. - -**Note:** For input series that are not strictly equispaced, there is no guarantee on the output. - -#### Examples - -##### Manually Specify the Standard Time Interval - -The standard time interval can be specified manually with the parameter `gap`. Note that an incorrect value leads to incorrect output. - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| -|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select consecutivesequences(s1,s2,'gap'='5m') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2, "gap"="5m")|
-+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| -+-----------------------------+------------------------------------------------------------------+ -``` - - -##### Automatically Estimate the Standard Time Interval - -When `gap` is default, this function estimates the standard time interval by the mode of time intervals and gets the same results. Therefore, this usage is more recommended. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select consecutivesequences(s1,s2) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| -+-----------------------------+------------------------------------------------------+ -``` - -### 8.2 ConsecutiveWindows - -#### Registration statement - -```sql -create function consecutivewindows as 'org.apache.iotdb.library.series.UDTFConsecutiveWindows' -``` - -#### Usage - -This function is used to find consecutive windows of specified length in strictly equispaced multidimensional data. - -Strictly equispaced data is the data whose time intervals are strictly equal. Missing data, including missing rows and missing values, is allowed in it, while data redundancy and timestamp drift is not allowed. - -Consecutive window is the subsequence that is strictly equispaced with the standard time interval without any missing data. - -**Name:** CONSECUTIVEWINDOWS - -**Input Series:** Support multiple input series. The type is arbitrary but the data is strictly equispaced. 
- -**Parameters:** - -+ `gap`: The standard time interval which is a positive number with an unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, it will be estimated by the mode of time intervals. -+ `length`: The length of the window which is a positive number with an unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. This parameter cannot be lacked. - -**Output Series:** Output a single series. The type is INT32. Each data point in the output series corresponds to a consecutive window. The output timestamp is the starting timestamp of the window and the output value is the number of data points in the window. - -**Note:** For input series that is not strictly equispaced, there is no guarantee on the output. - -#### Examples - - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| -|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select consecutivewindows(s1,s2,'length'='10m') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------+ -| Time|consecutivewindows(root.test.d1.s1, root.test.d1.s2, "length"="10m")| -+-----------------------------+--------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| 
-|2020-01-01T00:20:00.000+08:00| 4| -+-----------------------------+--------------------------------------------------------------------+ -``` - - - -## 9. Machine Learning - -### 9.1 AR - -#### Registration statement - -```sql -create function ar as 'org.apache.iotdb.library.dlearn.UDTFAR' -``` - -#### Usage - -This function is used to learn the coefficients of the autoregressive models for a time series. - -**Name:** AR - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `p`: The order of the autoregressive model. Its default value is 1. - -**Output Series:** Output a single series. The type is DOUBLE. The first line corresponds to the first order coefficient, and so on. - -**Note:** - -- Parameter `p` should be a positive integer. -- Most points in the series should be sampled at a constant time interval. -- Linear interpolation is applied for the missing points in the series. - -#### Examples - -##### Assigning Model Order - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ar(s0,"p"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+---------------------------+ -| Time|ar(root.test.d0.s0,"p"="2")| -+-----------------------------+---------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.9429| -|1970-01-01T08:00:00.002+08:00| -0.2571| -+-----------------------------+---------------------------+ -``` - diff --git 
a/src/UserGuide/Master/Tree/Tools-System/CLI_timecho.md b/src/UserGuide/Master/Tree/Tools-System/CLI_timecho.md deleted file mode 100644 index ad512f4ae..000000000 --- a/src/UserGuide/Master/Tree/Tools-System/CLI_timecho.md +++ /dev/null @@ -1,275 +0,0 @@ - - -# Command Line Interface (CLI) - - -IoTDB provides Cli/shell tools for users to interact with IoTDB server in command lines. This document shows how Cli/shell tool works and the meaning of its parameters. - -> Note: In this document, \$IOTDB\_HOME represents the path of the IoTDB installation directory. - -## 1. Running Cli - -After installation, there is a default user in IoTDB: `root`, and the -default password is `TimechoDB@2021`(Before V2.0.6 it is `root`). Users can use this username to try IoTDB Cli/Shell tool. The cli startup script is the `start-cli` file under the \$IOTDB\_HOME/bin folder. When starting the script, you need to specify the IP and PORT. (Make sure the IoTDB cluster is running properly when you use Cli/Shell tool to connect to it.) - -Here is an example where the cluster is started locally and the user has not changed the running port. The default rpc port is -6667
-If you need to connect to a remote DataNode, or the rpc port number -of the running DataNode has been changed, specify the IP and RPC port with -h and -p.
-You can also set your own environment variables at the beginning of the startup script - -The Linux and MacOS system startup commands are as follows: - -```shell -# Before version V2.0.6.x -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x and later versions -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -The Windows system startup commands are as follows: - -```shell -# Before version V2.0.4.x -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.4.x and later versions, before version V2.0.6.x -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x and later versions -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -After running these commands, the cli starts. A successful startup looks as follows: - -``` - _____ _________ ______ ______ -|_ _| | _ _ ||_ _ `.|_ _ \ - | | .--.|_/ | | \_| | | `. \ | |_) | - | | / .'`\ \ | | | | | | | __'. - _| |_| \__. | _| |_ _| |_.' /_| |__) | -|_____|'.__.' |_____| |______.'|_______/ version - - -Successfully login at 127.0.0.1:6667 -IoTDB> -``` - -Enter `quit` or `exit` to exit the Cli. - -## 2. Cli Parameters - -| **Parameter** | **Type** | **Required** | **Description** | **Example** | -| -------------------------- | -------- | ------------ |-----------------------------------------------------------------------------------| ------------------- | -| -h `` | string | No | The IP address of the IoTDB server. (Default: 127.0.0.1) | -h 127.0.0.1 | -| -p `` | int | No | The RPC port of the IoTDB server. (Default: 6667) | -p 6667 | -| -u `` | string | No | The username to connect to the IoTDB server. (Default: root) | -u root | -| -pw `` | string | No | The password to connect to the IoTDB server. (Default: root) | -pw root | -| -sql_dialect `` | string | No | The data model type: tree or table. 
(Default: tree) | -sql_dialect table | -| -e `` | string | No | Batch operations in non-interactive mode. | -e "show databases" | -| -c | Flag | No | Required if rpc_thrift_compression_enable=true on the server. | -c | -| -disableISO8601 | Flag | No | If set, timestamps will be displayed as numeric values instead of ISO8601 format. | -disableISO8601 | -| -usessl `` | Boolean | No | Enable SSL connection | -usessl true | -| -ts `` | string | No | SSL certificate store path | -ts /path/to/truststore | -| -tpw `` | string | No | SSL certificate store password | -tpw myTrustPassword | -| -timeout `` | int | No | Query timeout (seconds). If not set, the server's configuration will be used. | -timeout 30 | -| -help | Flag | No | Displays help information for the CLI tool. | -help | - -Following is a cli command which connects the host with IP -10.129.187.21, rpc port 6667, username "root", password "root", and prints the timestamp in digital form. The maximum number of lines displayed on the IoTDB command line is 10. - -The Linux and MacOS system startup commands are as follows: - -```shell -Shell > bash sbin/start-cli.sh -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 -``` - -The Windows system startup commands are as follows: - -```shell -# Before version V2.0.4.x -Shell > sbin\start-cli.bat -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 - -# # V2.0.4.x and later versions -Shell > sbin\windows\start-cli.bat -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 -``` - -## 3. CLI Special Command - -Special commands of Cli are below. - -| Command | Description / Example | -| :-------------------------- | :------------------------------------------------------ | -| `set time_display_type=xxx` | eg. long, default, ISO8601, yyyy-MM-dd HH:mm:ss | -| `show time_display_type` | show time display type | -| `set time_zone=xxx` | eg. 
+08:00, Asia/Shanghai | -| `show time_zone` | show cli time zone | -| `set fetch_size=xxx` | set fetch size when querying data from server | -| `show fetch_size` | show fetch size | -| `set max_display_num=xxx` | set max lines for cli to output, -1 means unlimited | -| `help` | Get hints for CLI special commands | -| `exit/quit` | Exit CLI | - - -## 4. Batch Operation of Cli - -The -e parameter lets the Cli/shell tool operate on IoTDB in batches through scripts. With the -e parameter, you can operate IoTDB without entering the cli's input mode. - -To avoid confusion between statements and other parameters, the current version only supports the -e parameter as the last parameter. - -The usage of the -e parameter for Cli/shell is as follows: - -The Linux and MacOS system commands: - -```shell -Shell > bash sbin/start-cli.sh -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} -``` - -The Windows system commands: - -```shell -# Before version V2.0.4.x -Shell > sbin\start-cli.bat -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} - -# V2.0.4.x and later versions -Shell > sbin\windows\start-cli.bat -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} -``` - -In the Windows environment, the SQL statement of the -e parameter needs to use ` `` ` to replace `" "` - -To illustrate the use of the -e parameter, take the following as an example (on a Linux system). - -Suppose you want to create a database root.demo in a newly launched IoTDB, create a timeseries root.demo.s1, and insert three data points into it. 
With the -e parameter, you could write a shell script like this: - -```shell -#!/bin/bash - -host=127.0.0.1 -rpcPort=6667 -user=root -pass=root - -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "create database root.demo" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "create timeseries root.demo.s1 WITH DATATYPE=INT32, ENCODING=RLE" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(1,10)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(2,11)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(3,12)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "select s1 from root.demo" -``` - -The results are shown below and are consistent with those of the Cli and JDBC operations. - -```shell - Shell > bash ./shell.sh -+-----------------------------+------------+ -| Time|root.demo.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.001+08:00| 10| -|1970-01-01T08:00:00.002+08:00| 11| -|1970-01-01T08:00:00.003+08:00| 12| -+-----------------------------+------------+ -Total line number = 3 -It costs 0.267s -``` - -Note that when the -e parameter is used in shell scripts, special characters must be escaped. - -## 5. Access History Feature - -Since IoTDB **V2.0.9.1**, the access history feature is supported. After a client logs in successfully, key historical access information is displayed, and the feature supports distributed scenarios. Both administrators and regular users can only view their own access history. The core displayed information includes: - -- Last successful session: displays date, time, access application, IP address, and access method (not shown for first login or when no history exists). 
-- Most recent failed attempt: displays the date, time, access application, IP address, and access method of the latest failed login attempt before the current successful login. -- Cumulative failed attempts: total number of failed session attempts since the last successful session was established. - -### 5.1 Enabling Access History - -You can enable or disable the access history feature by modifying relevant parameters in the `iotdb-system.properties` file. A restart is required for changes to take effect. For example: - -```Plain -# Controls whether the audit log feature is enabled -enable_audit_log=false -``` - -- When enabled: login information is recorded and expired data is cleaned periodically. -- When disabled: no data is recorded, displayed, or cleaned. -- If disabled and then re-enabled, the displayed history will be the last record before disabling, which may not represent the actual latest login. - -Usage example: - -```Bash ---------------------- -Starting IoTDB Cli ---------------------- - _____ _________ ______ ______ -|_ _| | _ _ ||_ _ `.|_ _ \ - | | .--.|_/ | | \_| | | `. \ | |_) | - | | / .'`\ \ | | | | | | | __'. - _| |_| \__. | _| |_ _| |_.' /_| |__) | -|_____|'.__.' |_____| |______.'|_______/ Enterprise version 2.0.9.1 (Build: xxxxxxx) - - ----Last Successful Session------------------ -Time: 2026-03-24T10:25:47.759+08:00 -IP Address: 127.0.0.1 ----Last Failed Session---------------------- -Time: 2026-03-24T10:27:26.314+08:00 -IP Address: 127.0.0.1 -Cumulative Failed Attempts: 1 -Successfully login at 127.0.0.1:6667 -IoTDB> -``` - -### 5.2 Viewing Access History - -The `root` user and users with the `AUDIT` privilege can view access history records using SQL statements. - -Syntax: - -```SQL -SELECT * FROM root.__audit.login.u_{userid}.** -``` - -The `userid` can be obtained using the `LIST USER` statement. 
- -Example: - -```SQL -IoTDB> SELECT * FROM root.__audit.login.** -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -| Time|root.__audit.login.u_0.node_1.result|root.__audit.login.u_0.node_1.ip|root.__audit.login.u_0.node_1.username| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -|2026-03-25T10:55:58.240+08:00| true| 127.0.0.1| root| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -Total line number = 1 -It costs 0.039s - -IoTDB> SELECT * FROM root.__audit.login.u_0.** -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -| Time|root.__audit.login.u_0.node_1.result|root.__audit.login.u_0.node_1.ip|root.__audit.login.u_0.node_1.username| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -|2026-03-25T10:55:58.240+08:00| true| 127.0.0.1| root| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -Total line number = 1 -It costs 0.020s -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/Tools-System/Data-Export-Tool_timecho.md b/src/UserGuide/Master/Tree/Tools-System/Data-Export-Tool_timecho.md deleted file mode 100644 index c03ac9270..000000000 --- a/src/UserGuide/Master/Tree/Tools-System/Data-Export-Tool_timecho.md +++ /dev/null @@ -1,166 +0,0 @@ -# Data Export - -## 1. 
Overview -The data export tool, export-data.sh (Unix/OS X) or export-data.bat (Windows), located in the tools directory, allows users to export query results from specified SQL statements into CSV, SQL, or TsFile (open-source time-series file format) formats. The specific functionalities are as follows: - - - - - - - - - - - - - - - - - - - - - -
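| File Format | IoTDB Tool | Description |
| ----------- | ---------- | ----------- |
| CSV | export-data.sh/bat | Plain text format for storing structured data. Must follow the CSV format specified below. |
| SQL | export-data.sh/bat | File containing custom SQL statements. |
| TsFile | export-data.sh/bat | Open-source time-series file format. |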
- - -## 2. Detailed Functionality -### 2.1 Common Parameters -| Short | Full Parameter | Description | Required | Default | -|------------------|--------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------| ----------------- |-----------------------------------------------------------------------------------------------------| -| `-ft` | `--file_type` | Export file type: `csv`, `sql`, `tsfile`. | ​**Yes** | - | -| `-h` | `--host` | Hostname of the IoTDB server. | No | `127.0.0.1` | -| `-p` | `--port` | Port number of the IoTDB server. | No | `6667` | -| `-u` | `--username` | Username for authentication. | No | `root` | -| `-pw` | `--password` | Password for authentication. Supported for hidden input since V2.0.9.1 | No | `TimechoDB@2021`(Before V2.0.6 it is `root` ) | -| `-t` | `--target` | Target directory for the output files. If the path does not exist, it will be created. | ​**Yes** | - | -| `-pfn` | `--prefix_file_name` | Prefix for the exported file names. For example, `abc` will generate files like `abc_0.tsfile`, `abc_1.tsfile`. | No | `dump_0.tsfile` | -| `-q` | `--query` | SQL query command to execute. Starting from v2.0.8, semicolons in SQL statements are automatically removed, and query execution proceeds normally. | No | - | -| `-timeout` | `--query_timeout` | Query timeout in milliseconds (ms). | No | `-1` (before v2.0.8)
`Long.MAX_VALUE` (v2.0.8 and later)
(Range: `-1~Long.MAX_VALUE`) | -| `-help` | `--help` | Display help information. | No | - | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - -### 2.2 CSV Format -#### 2.2.1 Command - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-h ] [-p ] [-u ] [-pw ] -t - [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -# Windows -# Before version V2.0.4.x -> tools\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] -t - [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] - -# V2.0.4.x and later versions -> tools\windows\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] -t - [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -``` -#### 2.2.2 CSV-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ------------ | ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- |------------------------------------------| -| `-dt` | `--datatype` | Whether to include data types in the CSV file header (`true` or `false`). | No | `false` | -| `-lpf` | `--lines_per_file` | Number of rows per exported file. | No | `10000` (Range:0~Integer.Max=2147483647) | -| `-tf` | `--time_format` | Time format for the CSV file. Options: 1) Timestamp (numeric, long), 2) ISO8601 (default), 3) Custom pattern (e.g., `yyyy-MM-dd HH:mm:ss`). SQL file timestamps are unaffected by this setting. | No | `ISO8601` | -| `-tz` | `--timezone` | Timezone setting (e.g., `+08:00`, `-01:00`). 
| No | System default | - -#### 2.2.3 Examples - -```Shell -# Valid Example -> tools/export-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -t /path/export/dir - -pfn exported-data.csv -dt true -lpf 1000 -tf "yyyy-MM-dd HH:mm:ss" - -tz +08:00 -q "SELECT * FROM root.ln" -timeout 20000 - -# Error Example -> tools/export-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -Parse error: Missing required option: t - -# Note: Before version V2.0.6, the default value for the -pw parameter was root. -``` -## 2.3 SQL Format -#### 2.3.1 Command -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-aligned ] - -lpf - [-tf ] [-tz ] [-q ] [-timeout ] - -# Windows -# Before version V2.0.4.x -> tools\export-data.bat -ft [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] - -# V2.0.4.x and later versions -> tools\windows\export-data.bat -ft [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] -``` -#### 2.3.2 SQL-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ---------------- | ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ---------------- | -| `-aligned` | `--use_aligned` | Whether to export as aligned SQL format (`true` or `false`). | No | `true` | -| `-lpf` | `--lines_per_file` | Number of rows per exported file. | No | `10000` (Range:0~Integer.Max=2147483647) | -| `-tf` | `--time_format` | Time format for the CSV file. Options: 1) Timestamp (numeric, long), 2) ISO8601 (default), 3) Custom pattern (e.g., `yyyy-MM-dd HH:mm:ss`). SQL file timestamps are unaffected by this setting. | No | `ISO8601` | -| `-tz` | `--timezone` | Timezone setting (e.g., `+08:00`, `-01:00`). 
| No | System default | - -#### 2.3.3 Examples -```Shell -# Valid Example -> tools/export-data.sh -ft sql -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -t /path/export/dir - -pfn exported-data.csv -aligned true -lpf 1000 -tf "yyyy-MM-dd HH:mm:ss" - -tz +08:00 -q "SELECT * FROM root.ln" -timeout 20000 - -# Error Example -> tools/export-data.sh -ft sql -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -Parse error: Missing required option: t - -# Note: Before version V2.0.6, the default value for the -pw parameter was root. -``` - -### 2.4 TsFile Format - -#### 2.4.1 Command - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# Windows -# Before version V2.0.4.x -> tools\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# V2.0.4.x and later versions -> tools\windows\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] -``` - -#### 2.4.2 TsFile-Specific Parameters - -* None - -#### 2.4.3 Examples - -```Shell -# Valid Example -> tools/export-data.sh -ft tsfile -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -t /path/export/dir - -pfn export-data.tsfile -q "SELECT * FROM root.ln" -timeout 10000 - -# Error Example -> tools/export-data.sh -ft tsfile -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -Parse error: Missing required option: t - -# Note: Before version V2.0.6, the default value for the -pw parameter was root. -``` diff --git a/src/UserGuide/Master/Tree/Tools-System/Data-Import-Tool_timecho.md b/src/UserGuide/Master/Tree/Tools-System/Data-Import-Tool_timecho.md deleted file mode 100644 index 9296b8146..000000000 --- a/src/UserGuide/Master/Tree/Tools-System/Data-Import-Tool_timecho.md +++ /dev/null @@ -1,328 +0,0 @@ -# Data Import - -## 1. 
Overview -IoTDB supports three methods for data import: -- Data Import Tool: Use the `import-data.sh/bat` script in the `tools` directory to manually import CSV, SQL, or TsFile (open-source time-series file format) data into IoTDB. -- `TsFile` Auto-Loading Feature -- Load `TsFile` SQL - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| File Format | IoTDB Tool | Description |
| ----------- | ---------- | ----------- |
| CSV | import-data.sh/bat | Can be used for single or batch import of CSV files into IoTDB |
| SQL | import-data.sh/bat | Can be used for single or batch import of SQL files into IoTDB |
| TsFile | import-data.sh/bat | Can be used for single or batch import of TsFile files into IoTDB |
| TsFile | TsFile Auto-Loading Feature | Can automatically monitor a specified directory for newly generated TsFiles and load them into IoTDB |
| TsFile | Load SQL | Can be used for single or batch import of TsFile files into IoTDB |
- -## 2. Data Import Tool -### 2.1 Common Parameters - -| Short | Full Parameter | Description | Required | Default | -|-----------------|--------------------------| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------- |-----------------------------------------------| -| `-ft` | `--file_type` | File type: `csv`, `sql`, `tsfile`. | ​**Yes** | - | -| `-h` | `--host` | IoTDB server hostname. | No | `127.0.0.1` | -| `-p` | `--port` | IoTDB server port. | No | `6667` | -| `-u` | `--username` | Username. | No | `root` | -| `-pw` | `--password` | Password. Supported for hidden input since V2.0.9.1 | No | `TimechoDB@2021`(Before V2.0.6 it is `root` ) | -| `-s` | `--source` | Local path to the file/directory to import. ​​**Supported formats**​: CSV, SQL, TsFile. Unsupported formats trigger error: `The file name must end with "csv", "sql", or "tsfile"!` | ​**Yes** | - | -| `-tn` | `--thread_num` | Maximum parallel threads | No | `8`
Range: 0 to Integer.Max(2147483647). | -| `-tz` | `--timezone` | Timezone (e.g., `+08:00`, `-01:00`). | No | System default | -| `-help` | `--help` | Display help (general or format-specific: `-help csv`). | No | - | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - -### 2.2 CSV Format - -#### 2.2.1 Command -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] - -# Windows -# Before version V2.0.4.x -> tools\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] - -# V2.0.4.x and later versions -> tools\windows\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] -``` - -#### 2.2.2 CSV-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ---------------- | ------------------------------- |----------------------------------------------------------| ---------- |-----------------| -| `-fd` | `--fail_dir` | Directory to save failed files. | No | YOUR_CSV_FILE_PATH | -| `-lpf` | `--lines_per_failed_file` | Max lines per failed file. | No | `100000`
Range: 0 to Integer.Max(2147483647). | -| `-aligned` | `--use_aligned` | Import as aligned time series. | No | `false` | -| `-batch` | `--batch_size` | Rows processed per API call. | No | `100000`
Range: 0 to Integer.Max(2147483647). | -| `-ti` | `--type_infer` | Type mapping (e.g., `BOOLEAN=text,INT=long`). | No | - | -| `-tp` | `--timestamp_precision` | Timestamp precision: `ms`, `us`, `ns`. | No | `ms` | - -#### 2.2.3 Examples - -```Shell -# Valid Example -> tools/import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -s /path/sql - -fd /path/failure/dir -lpf 100 -aligned true -ti "BOOLEAN=text,INT=long,FLOAT=double" - -tp ms -tz +08:00 -batch 5000 -tn 4 - -# Error Example -> tools/import-data.sh -ft csv -s /non_path -error: Source file or directory /non_path does not exist - -> tools/import-data.sh -ft csv -s /path/sql -tn 0 -error: Invalid thread number '0'. Please set a positive integer. - -# Note: Before version V2.0.6, the default value for the -pw parameter was root. -``` - -#### 2.2.4 Import Notes - -1. CSV Import Specifications - -- Special Character Escaping Rules: If a text-type field contains special characters (e.g., commas ,), they must be escaped using a backslash (\). -- Supported Time Formats: yyyy-MM-dd'T'HH:mm:ss, yyyy-MM-dd HH:mm:ss, or yyyy-MM-dd'T'HH:mm:ss.SSSZ. -- Timestamp Column Requirement: The timestamp column must be the first column in the data file. - -2. 
CSV File Example - -- Time Alignment - -```sql --- Headers without data types -Time,root.test.t1.str,root.test.t2.str,root.test.t2.var -1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100 -1970-01-01T08:00:00.002+08:00,"123",, - --- Headers with data types (Text-type data supports both quoted and unquoted formats) -Time,root.test.t1.str(TEXT),root.test.t2.str(TEXT),root.test.t2.var(INT32) -1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100 -1970-01-01T08:00:00.002+08:00,123,hello world,123 -1970-01-01T08:00:00.003+08:00,"123",, -1970-01-01T08:00:00.004+08:00,123,,12 -``` - -- Device Alignment - -```sql --- Headers without data types -Time,Device,str,var -1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world", -1970-01-01T08:00:00.002+08:00,root.test.t1,"123", -1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100 - --- Headers with data types (Text-type data supports both quoted and unquoted formats) -Time,Device,str(TEXT),var(INT32) -1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world", -1970-01-01T08:00:00.002+08:00,root.test.t1,"123", -1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100 -1970-01-01T08:00:00.002+08:00,root.test.t1,hello world,123 -``` - - -### 2.3 SQL Format - -#### 2.3.1 Command - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] - -# Windows -# Before version V2.0.4.x -> tools\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] - -# V2.0.4.x and later versions -> tools\windows\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] -``` - -#### 2.3.2 SQL-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| -------------- | ------------------------------- | -------------------------------------------------------------------- | ---------- | ------------------ | -| `-fd` | `--fail_dir` | Directory to save failed files. 
| No |YOUR_CSV_FILE_PATH| -| `-lpf` | `--lines_per_failed_file` | Max lines per failed file. | No | `100000`
Range: 0 to Integer.Max(2147483647). | -| `-batch` | `--batch_size` | Rows processed per API call. | No | `100000`
Range: 0 to Integer.Max(2147483647). | - -#### 2.3.3 Examples - -```Shell -# Valid Example -> tools/import-data.sh -ft sql -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -s /path/sql - -fd /path/failure/dir -lpf 500 -tz +08:00 - -batch 100000 -tn 4 - -# Error Example -> tools/import-data.sh -ft sql -s /path/sql -fd /non_path -error: Source file or directory /path/sql does not exist - - -> tools/import-data.sh -ft sql -s /path/sql -tn 0 -error: Invalid thread number '0'. Please set a positive integer. - -# Note: Before version V2.0.6, the default value for the -pw parameter was root. -``` -### 2.4 TsFile Format - -#### 2.4.1 Command - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-h ] [-p ] [-u ] [-pw ] - -s -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] - -# Windows -# Before version V2.0.4.x -> tools\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -s -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] - -# V2.0.4.x and later versions -> tools\windows\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -s -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] -``` -#### 2.4.2 TsFile-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ----------- | ----------------------------- |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ----------------- | --------------------------- | -| `-os` | `--on_success` | Action for successful files:
`none`: Do not delete the file.
`mv`: Move the successful file to the target directory.
`cp`: Create a hard link (copy) of the successful file to the target directory.
`delete`: Delete the file. | **Yes** | - | -| `-sd` | `--success_dir` | Target directory for `mv`/`cp` actions on success. Required if `-os` is `mv`/`cp`. The file name will be flattened and concatenated with the original file name. | Conditional | `${EXEC_DIR}/success` | -| `-of` | `--on_fail` | Action for failed files:
`none`: Skip the file.
`mv`: Move the failed file to the target directory.
`cp`: Create a hard link (copy) of the failed file to the target directory.
`delete`: Delete the file. | **Yes** | - | -| `-fd` | `--fail_dir` | Target directory for `mv`/`cp` actions on failure. Required if `-of` is `mv`/`cp`. The file name will be flattened and concatenated with the original file name. | Conditional | `${EXEC_DIR}/fail` | -| `-tp` | `--timestamp_precision` | TsFile timestamp precision: `ms`, `us`, `ns`.
For non-remote TsFile imports: Use -tp to specify the timestamp precision of the TsFile. The system will manually verify if the timestamp precision matches the server. If it does not match, an error will be returned.
​For remote TsFile imports: Use -tp to specify the timestamp precision of the TsFile. The Pipe system will automatically verify if the timestamp precision matches. If it does not match, a Pipe error will be returned. | No | `ms` | - -#### 2.4.3 Examples - -```Shell -# Valid Example -> tools/import-data.sh -ft tsfile -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 - -s /path/sql -os mv -of cp -sd /path/success/dir -fd /path/failure/dir - -tn 8 -tz +08:00 -tp ms - -# Error Example -> tools/import-data.sh -ft tsfile -s /path/sql -os mv -of cp - -fd /path/failure/dir -tn 8 -error: Missing option --success_dir (or -sd) when --on_success is 'mv' or 'cp' - -> tools/import-data.sh -ft tsfile -s /path/sql -os mv -of cp - -sd /path/success/dir -fd /path/failure/dir -tn 0 -error: Invalid thread number '0'. Please set a positive integer. - -# Note: Before version V2.0.6, the default value for the -pw parameter was root. -``` - - -## 3. TsFile Auto-Loading - -This feature enables IoTDB to automatically monitor a specified directory for new TsFiles and load them into the database without manual intervention. - -![](/img/Data-import2.png) - -### 3.1 Configuration - -Add the following parameters to `iotdb-system.properties` (template: `iotdb-system.properties.template`): - -| Parameter | Description | Value Range | Required | Default | Hot-Load? | -| ---------------------------------------------------- |---------------------------------------------------------------------------------------| --------------------------------- | ---------- | ----------------------------- | ----------------------- | -| `load_active_listening_enable` | Enable auto-loading. | `true`/`false` | Optional | `true` | Yes | -| `load_active_listening_dirs` | Directories to monitor (subdirectories included). Multiple paths separated by commas. | String | Optional | `ext/load/pending` | Yes | -| `load_active_listening_fail_dir` | Directory to store failed TsFiles. Only can set one. 
| String | Optional | `ext/load/failed` | Yes | -| `load_active_listening_max_thread_num` | Maximum Threads for TsFile Loading Tasks:The default value for this parameter, when commented out, is max(1, CPU cores / 2). If the value set by the user falls outside the range [1, CPU cores / 2], it will be reset to the default value of max(1, CPU cores / 2). | `1` to `Long.MAX_VALUE` | Optional | `max(1, CPU_CORES / 2)` | No (restart required) | -| `load_active_listening_check_interval_seconds` | Active Listening Polling Interval (in seconds):The active listening feature for TsFiles is implemented through polling the target directory. This configuration specifies the time interval between two consecutive checks of the `load_active_listening_dirs`. After each check, the next check will be performed after `load_active_listening_check_interval_seconds` seconds. If the polling interval set by the user is less than 1, it will be reset to the default value of 5 seconds. | `1` to `Long.MAX_VALUE` | Optional | `5` | No (restart required) | - -### 3.2 Notes - -1. ​​**Mods Files**​: If TsFiles have associated `.mods` files, move `.mods` files to the monitored directory ​**before** their corresponding TsFiles. Ensure `.mods` and TsFiles are in the same directory. -2. ​​**Restricted Directories**​: Do NOT set Pipe receiver directories, data directories, or other system paths as monitored directories. -3. ​​**Directory Conflicts**​: Ensure `load_active_listening_fail_dir` does not overlap with `load_active_listening_dirs` or its subdirectories. -4. ​​**Permissions**​: The monitored directory must have write permissions. Files are deleted after successful loading; insufficient permissions may cause duplicate loading. - -## 4. Load SQL - -IoTDB supports importing one or multiple TsFile files containing time series into another running IoTDB instance directly via SQL execution through the CLI. 
- -### 4.1 Command - -```SQL -load '' with ( - 'attribute-key1'='attribute-value1', - 'attribute-key2'='attribute-value2', -) -``` - -* `` : The path to a TsFile or a folder containing multiple TsFiles. -* ``: Optional parameters, as described below. - -| Key | Key Description | Value Type | Value Range | Value is Required | Default Value | -|--------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------|--------------------------------|-------------------|----------------------------| -| `database-level` | When the database corresponding to the TsFile does not exist, the database hierarchy level can be specified via the ` database-level` parameter. The default is the level set in `iotdb-common.properties`. For example, setting level=1 means the prefix path of level 1 in all time series in the TsFile will be used as the database. | Integer | `[1: Integer.MAX_VALUE]` | No | 1 | -| `on-success` | Action for successfully loaded TsFiles: `delete` (delete the TsFile after successful import) or `none` (retain the TsFile in the source folder). | String | `delete / none` | No | delete | -| `model` | Specifies whether the TsFile uses the `table` model or `tree` model. This parameter becomes invalid starting from V2.0.2.1. The system automatically identifies whether the data model is tree-based or table-based. | String | `tree / table` | No | Aligns with `-sql_dialect` | -| `database-name` | Table model only: Target database for import. Automatically created if it does not exist. The database-name must not include the `root.` prefix (an error will occur if included). 
| String | `-` | No | null | -| `convert-on-type-mismatch` | Whether to perform type conversion during loading if data types in the TsFile mismatch the target schema. | Boolean | `true / false` | No | true | -| `verify` | Whether to validate the schema before loading the TsFile. | Boolean | `true / false` | No | true | -| `tablet-conversion-threshold` | Size threshold (in bytes) for converting TsFiles into tablet format during loading. Default: `-1` (no conversion for any TsFile). | Integer | `[-1, 0 : Integer.MAX_VALUE]` | No | -1 | -| `async` | Whether to enable asynchronous loading. If enabled, TsFiles are moved to an active-load directory and loaded into the `database-name` asynchronously. | Boolean | `true / false` | No | false | - -### 4.2 Example - -```SQL --- Before import -IoTDB> show databases -+-------------+-----------------------+---------------------+-------------------+---------------------+ -| Database|SchemaReplicationFactor|DataReplicationFactor|TimePartitionOrigin|TimePartitionInterval| -+-------------+-----------------------+---------------------+-------------------+---------------------+ -|root.__system| 1| 1| 0| 604800000| -+-------------+-----------------------+---------------------+-------------------+---------------------+ - --- Import the TsFile by executing the load SQL statement -IoTDB> load '/home/dump1.tsfile' with ( 'on-success'='none') -Msg: The statement is executed successfully. 
- --- Verify whether the import was successful -IoTDB> select * from root.testdb.** -+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -| Time|root.testdb.device.model.temperature|root.testdb.device.model.humidity|root.testdb.device.model.status| -+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -|2025-04-17T10:35:47.218+08:00| 22.3| 19.4| true| -+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/Tools-System/Maintenance-Tool_timecho.md b/src/UserGuide/Master/Tree/Tools-System/Maintenance-Tool_timecho.md deleted file mode 100644 index 9a8405f52..000000000 --- a/src/UserGuide/Master/Tree/Tools-System/Maintenance-Tool_timecho.md +++ /dev/null @@ -1,960 +0,0 @@ - -# Cluster Management Tool - -## 1. IoTDB-OpsKit - -The IoTDB OpsKit is an easy-to-use operation and maintenance tool (enterprise version tool). -It is designed to solve the operation and maintenance problems of multiple nodes in the IoTDB distributed system. -It mainly includes cluster deployment, cluster start and stop, elastic expansion, configuration update, data export and other functions, thereby realizing one-click command issuance for complex database clusters, which greatly Reduce management difficulty. -This document will explain how to remotely deploy, configure, start and stop IoTDB cluster instances with cluster management tools. - -### 1.1 Environment dependence - -This tool is a supporting tool for TimechoDB(Enterprise Edition based on IoTDB). You can contact your sales representative to obtain the tool download method. - -The machine where IoTDB is to be deployed needs to rely on jdk 8 and above, lsof, netstat, and unzip functions. If not, please install them yourself. 
You can refer to the installation commands required for the environment in the last section of the document. - -Tip: The IoTDB cluster management tool requires an account with root privileges - -### 1.2 Deployment method - -#### Download and install - -This tool is a supporting tool for TimechoDB(Enterprise Edition based on IoTDB). You can contact your salesperson to obtain the tool download method. - -Note: Since the binary package only supports GLIBC2.17 and above, the minimum version is Centos7. - -* After entering the following commands in the iotdb-opskit directory: - -```bash -bash install-iotdbctl.sh -``` - -The iotdbctl keyword can be activated in the subsequent shell, such as checking the environment instructions required before deployment as follows: - -```bash -iotdbctl cluster check example -``` - -* You can also directly use <iotdbctl absolute path>/sbin/iotdbctl without activating iotdbctl to execute commands, such as checking the environment required before deployment: - -```bash -/sbin/iotdbctl cluster check example -``` - -### 1.3 Introduction to cluster configuration files - -* There is a cluster configuration yaml file in the `iotdbctl/config` directory. The yaml file name is the cluster name. There can be multiple yaml files. In order to facilitate users to configure yaml files, a `default_cluster.yaml` example is provided under the iotdbctl/config directory. -* The yaml file configuration consists of five major parts: `global`, `confignode_servers`, `datanode_servers`, `grafana_server`, and `prometheus_server` -* `global` is a general configuration that mainly configures machine username and password, IoTDB local installation files, Jdk configuration, etc. A `default_cluster.yaml` sample data is provided in the `iotdbctl/config` directory, - Users can copy and modify it to their own cluster name and refer to the instructions inside to configure the IoTDB cluster. 
In the `default_cluster.yaml` sample, all uncommented items are required, and commented items are optional. - -For example, to run the check command against `default_cluster.yaml`, execute `iotdbctl cluster check default_cluster`. -For more commands, refer to the command list below. - - -| parameter name | parameter description | required | -|-------------------------|------------------------------------------------------------|----------| -| iotdb\_zip\_dir | IoTDB deployment distribution directory. If the value is empty, the package is downloaded from the address specified by `iotdb_download_url` | NO | -| iotdb\_download\_url | IoTDB download address. If `iotdb_zip_dir` has no value, the package is downloaded from this address | NO | -| jdk\_tar\_dir | Local jdk directory. This jdk package can be uploaded and deployed to the target node. | NO | -| jdk\_deploy\_dir | jdk deployment directory on the remote machine. The jdk is deployed to this directory, which together with the `jdk_dir_name` parameter below forms the complete jdk deployment directory, that is, `/` | NO | -| jdk\_dir\_name | The directory name after jdk decompression, jdk_iotdb by default | NO | -| iotdb\_lib\_dir | The IoTDB lib directory or a compressed IoTDB lib package (only .zip is supported), used only for IoTDB upgrades. It is commented out by default. To upgrade, uncomment it and set the path. If you use a zip file, compress the iotdb/lib directory with the zip command, for example `zip -r lib.zip apache-iotdb-1.2.0/lib/*` | NO | -| user | User name for ssh login to the deployment machine | YES | -| password | The password for ssh login. 
If no password is specified, pkey is used to log in; please ensure that passwordless ssh login between nodes has been configured. | NO | -| pkey | Key login: If password has a value, password is used first, otherwise pkey is used to log in. | NO | -| ssh\_port | ssh port | YES | -| deploy\_dir | IoTDB deployment directory. IoTDB is deployed to this directory, which together with the `iotdb_dir_name` parameter below forms the complete IoTDB deployment directory, that is, `/` | YES | -| iotdb\_dir\_name | The directory name after IoTDB decompression, iotdb by default. | NO | -| datanode-env.sh | Corresponds to `iotdb/config/datanode-env.sh`. When `global` and `datanode_servers` are configured at the same time, the value in `datanode_servers` is used first | NO | -| confignode-env.sh | Corresponds to `iotdb/config/confignode-env.sh`. When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first | NO | -| iotdb-system.properties | Corresponds to `/config/iotdb-system.properties` | NO | -| cn\_internal\_address | The cluster configuration address points to the surviving ConfigNode, and it points to confignode_x by default. When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first, corresponding to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_internal\_address | The cluster configuration address points to the surviving ConfigNode, and points to confignode_x by default. When configuring values for `global` and `datanode_servers` at the same time, the value in `datanode_servers` is used first, corresponding to `dn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | - -In addition, `datanode-env.sh` and `confignode-env.sh` accept an extra parameter, `extra_opts`. When it is configured, its values are appended to the end of `datanode-env.sh` and `confignode-env.sh`. 
Refer to default_cluster.yaml for configuration examples as follows: -datanode-env.sh: -extra_opts: | -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC" -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200" - -* `confignode_servers` is the configuration for deploying IoTDB Confignodes, in which multiple Confignodes can be configured - By default, the first started ConfigNode node node1 is regarded as the Seed-ConfigNode - -| parameter name | parameter describe | required | -|-----------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------| -| name | Confignode name | YES | -| deploy\_dir | IoTDB config node deployment directory | YES | -| cn\_internal\_address | Corresponds to iotdb/internal communication address, corresponding to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| cn_internal_address | The cluster configuration address points to the surviving ConfigNode, and it points to confignode_x by default. When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first, corresponding to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| cn\_internal\_port | Internal communication port, corresponding to `cn_internal_port` in `iotdb/config/iotdb-system.properties` | YES | -| cn\_consensus\_port | Corresponds to `cn_consensus_port` in `iotdb/config/iotdb-system.properties` | NO | -| cn\_data\_dir | Corresponds to `cn_consensus_port` in `iotdb/config/iotdb-system.properties` Corresponds to `cn_data_dir` in `iotdb/config/iotdb-system.properties` | YES | -| iotdb-system.properties | Corresponding to `iotdb/config/iotdb-system.properties`, when configuring values in `global` and `confignode_servers` at the same time, the value in confignode_servers will be used first. 
| NO | - -* `datanode_servers` is the configuration for deploying IoTDB DataNodes, in which multiple DataNodes can be configured - -| parameter name | parameter description | required | -|-------------------------|------------------------------------------------------------|----------| -| name | Datanode name | YES | -| deploy\_dir | IoTDB data node deployment directory | YES | -| dn\_rpc\_address | The datanode rpc address, corresponding to `dn_rpc_address` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_internal\_address | Internal communication address, corresponding to `dn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_seed\_config\_node | The cluster configuration address points to the surviving ConfigNode, and points to confignode_x by default. When configuring values for `global` and `datanode_servers` at the same time, the value in `datanode_servers` is used first, corresponding to `dn_seed_config_node` in `iotdb/config/iotdb-system.properties`. | YES | -| dn\_rpc\_port | Datanode rpc port, corresponding to `dn_rpc_port` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_internal\_port | Internal communication port, corresponding to `dn_internal_port` in `iotdb/config/iotdb-system.properties` | YES | -| iotdb-system.properties | Corresponding to `iotdb/config/iotdb-system.properties`, when configuring values in `global` and `datanode_servers` at the same time, the value in `datanode_servers` will be used first. 
| NO | - -* `grafana_server` is the configuration related to deploying Grafana - -| parameter name | parameter description | required | -|--------------------|-------------------------------------------------------------|-----------| -| grafana\_dir\_name | Grafana decompression directory name (default grafana_iotdb) | NO | -| host | IP of the server where Grafana is deployed | YES | -| grafana\_port | The port of the Grafana deployment machine, default 3000 | NO | -| deploy\_dir | Grafana deployment directory on the server | YES | -| grafana\_tar\_dir | Grafana compressed package location | YES | -| dashboards | dashboards directory | NO | - -* `prometheus_server` is the configuration related to deploying Prometheus - -| parameter name | parameter description | required | -|--------------------------------|----------------------------------------------------|----------| -| prometheus\_dir\_name | Prometheus decompression directory name, default prometheus_iotdb | NO | -| host | IP of the server where Prometheus is deployed | YES | -| prometheus\_port | The port of the Prometheus deployment machine, default 9090 | NO | -| deploy\_dir | Prometheus deployment directory on the server | YES | -| prometheus\_tar\_dir | Prometheus compressed package path | YES | -| storage\_tsdb\_retention\_time | The number of days to keep data, 15 days by default | NO | -| storage\_tsdb\_retention\_size | The data size that can be saved by the specified block defaults to 512M. Please note the units are KB, MB, GB, TB, PB, and EB. | NO | - -If metrics are configured in `iotdb-system.properties` and `iotdb-system.properties` of config/xxx.yaml, the configuration will be automatically put into Prometheus without manual modification. - -Note: If the value of a yaml key contains special characters (such as `:`), it is recommended to wrap the entire value in double quotes. Do not use file paths that contain spaces, to prevent recognition problems. 
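
The quoting advice in the note above can be illustrated with a small fragment. This is a minimal sketch rather than an excerpt from `default_cluster.yaml`: the key names `user`, `password`, `deploy_dir`, and `iotdb_dir_name` come from the `global` table above, while the nesting under `global:` and every value shown are assumptions made for illustration.

```yaml
# Hypothetical values for illustration only; adapt them to your own cluster file.
global:
  user: root
  # The value contains ':' and '#', so the entire value is wrapped in double quotes.
  password: "pa:ss #word"
  # A path without spaces avoids the recognition problems mentioned in the note.
  deploy_dir: "/home/data"
  iotdb_dir_name: iotdb
```

Without the double quotes, a YAML parser would treat ` #word` as a comment and could misread the `:` in other positions, so quoting the whole value is the safer habit for passwords and similar free-form strings.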
- -### 1.4 Usage scenarios - -#### Clean data - -* The clean-data scenario deletes the data directory in the IoTDB cluster, as well as the `cn_system_dir`, `cn_consensus_dir`, - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories configured in the yaml file. -* First execute the stop cluster command, and then execute the cluster cleanup command. - -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster clean default_cluster -``` - -#### Cluster destruction - -* The cluster destruction scenario deletes `data`, `cn_system_dir`, `cn_consensus_dir`, - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, `ext`, the `IoTDB` deployment directory, - the grafana deployment directory and the prometheus deployment directory in the IoTDB cluster. -* First execute the stop cluster command, and then execute the cluster destruction command. - - -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster destroy default_cluster -``` - -#### Cluster upgrade - -* To upgrade the cluster, first set `iotdb_lib_dir` in config/xxx.yaml to the directory containing the jars to be uploaded to the server (for example, iotdb/lib). -* If you upload a zip file, compress the iotdb/lib directory with the zip command, for example `zip -r lib.zip apache-iotdb-1.2.0/lib/*` -* Execute the upload command and then execute the restart IoTDB cluster command to complete the cluster upgrade. - -```bash -iotdbctl cluster dist-lib default_cluster -iotdbctl cluster restart default_cluster -``` - -#### Hot deployment - -* First modify the configuration in config/xxx.yaml. 
-* Execute the distribution command, and then execute the hot deployment command to complete the hot deployment of the cluster configuration - -```bash -iotdbctl cluster dist-conf default_cluster -iotdbctl cluster reload default_cluster -``` - -#### Cluster expansion - -* First add a datanode or confignode node in config/xxx.yaml. -* Execute the cluster expansion command - -```bash -iotdbctl cluster scaleout default_cluster -``` - -#### Cluster shrinking - -* First find the name or ip+port of the node to remove in config/xxx.yaml (where confignode port is cn_internal_port, datanode port is rpc_port) -* Execute the cluster shrink command - -```bash -iotdbctl cluster scalein default_cluster -``` - -#### Using cluster management tools to manipulate existing IoTDB clusters - -* Configure the server's `user`, `password` or `pkey`, `ssh_port` -* Modify the IoTDB deployment path in config/xxx.yaml, `deploy_dir` (IoTDB deployment directory), `iotdb_dir_name` (IoTDB decompression directory name, the default is iotdb) - For example, if the full path of IoTDB deployment is `/home/data/apache-iotdb-1.1.1`, you need to modify the yaml files `deploy_dir:/home/data/` and `iotdb_dir_name:apache-iotdb-1.1.1` -* If the server is not using java_home, modify `jdk_deploy_dir` (jdk deployment directory) and `jdk_dir_name` (the directory name after jdk decompression, the default is jdk_iotdb). If java_home is used, there is no need to modify the configuration. - For example, if the full path of jdk deployment is `/home/data/jdk_1.8.2`, you need to modify the yaml files `jdk_deploy_dir:/home/data/`, `jdk_dir_name:jdk_1.8.2` -* Configure `cn_internal_address`, `dn_internal_address` -* Configure `cn_internal_address`, `cn_internal_port`, `cn_consensus_port`, `cn_system_dir`, in `iotdb-system.properties` in `confignode_servers` - If the values in `cn_consensus_dir` and `iotdb-system.properties` are not the default for IoTDB, they need to be configured, otherwise there is no need to configure them. 
-* Configure `dn_rpc_address`, `dn_internal_address`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir` in `iotdb-system.properties` -* Execute initialization command - -```bash -iotdbctl cluster init default_cluster -``` - -#### Deploy IoTDB, Grafana and Prometheus - -* Configure `iotdb-system.properties` to open the metrics interface -* Configure the Grafana configuration. If there are multiple `dashboards`, separate them with commas. The names cannot be repeated or they will be overwritten. -* Configure the Prometheus configuration. If the IoTDB cluster is configured with metrics, there is no need to manually modify the Prometheus configuration. The Prometheus configuration will be automatically modified according to which node is configured with metrics. -* Start the cluster - -```bash -iotdbctl cluster start default_cluster -``` - -For more detailed parameters, please refer to the cluster configuration file introduction above - -### 1.5 Command - -The basic usage of this tool is: -```bash -iotdbctl cluster [params (Optional)] -``` -* key indicates a specific command. - -* cluster name indicates the cluster name (that is, the name of the yaml file in the `iotdbctl/config` file). - -* params indicates the required parameters of the command (optional). 
- -* For example, the command format to deploy the default_cluster cluster is: - -```bash -iotdbctl cluster deploy default_cluster -``` - -* The functions and parameters of the cluster are listed as follows: - -| command | description | parameter | -|-----------------|-----------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| check | check whether the cluster can be deployed | Cluster name list | -| clean | cleanup-cluster | cluster-name | -| deploy/dist-all | deploy cluster | Cluster name, -N, module name (optional for iotdb, grafana, prometheus), -op force (optional) | -| list | cluster status list | None | -| start | start cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional) | -| stop | stop cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional), -op force (nodename, grafana, prometheus optional) | -| restart | restart cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional), -op force (nodename, grafana, prometheus optional) | -| show | view cluster information. The details field indicates the details of the cluster information. 
| Cluster name, details (optional) | -| destroy | destroy cluster | Cluster name, -N, module name (iotdb, grafana, prometheus optional) | -| scaleout | cluster expansion | Cluster name | -| scalein | cluster shrink | Cluster name, -N, cluster node name or cluster node ip+port | -| reload | hot loading of cluster configuration files | Cluster name | -| dist-conf | cluster configuration file distribution | Cluster name | -| dumplog | Back up specified cluster logs | Cluster name, -N, cluster node name -h Back up to target machine ip -pw Back up to target machine password -p Back up to target machine port -path Backup directory -startdate Start time -enddate End time -loglevel Log type -l transfer speed | -| dumpdata | Backup cluster data | Cluster name, -h backup to target machine ip -pw backup to target machine password -p backup to target machine port -path backup directory -startdate start time -enddate end time -l transmission speed | -| dist-lib | lib package upgrade | Cluster name | -| init | When an existing cluster uses the cluster deployment tool, initialize the cluster configuration | Cluster name | -| status | View process status | Cluster name | -| activate | Activate cluster | Cluster name | -| health_check | health check | Cluster name, -N, nodename (optional) | -| backup | Activate cluster | Cluster name,-N nodename (optional) | -| importschema | Activate cluster | Cluster name,-N nodename -param paramters | -| exportschema | Activate cluster | Cluster name,-N nodename -param paramters | - - - -### 1.6 Detailed command execution process - -The following commands are executed using default_cluster.yaml as an example, and users can modify them to their own cluster files to execute - -#### Check cluster deployment environment commands - -```bash -iotdbctl cluster check default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* 
Verify that the target node is able to log in via SSH - -* Verify whether the JDK version on the corresponding node meets IoTDB jdk1.8 and above, and whether the server is installed with unzip, lsof, and netstat. - -* If you see the following prompt `Info:example check successfully!`, it proves that the server has already met the installation requirements. - If `Error:example check fail!` is output, it proves that some conditions do not meet the requirements. You can check the Error log output above (for example: `Error:Server (ip:172.20.31.76) iotdb port(10713) is listening`) to make repairs. , - If the jdk check does not meet the requirements, we can configure a jdk1.8 or above version in the yaml file ourselves for deployment without affecting subsequent use. - If checking lsof, netstat or unzip does not meet the requirements, you need to install it on the server yourself. - -#### Deploy cluster command - -```bash -iotdbctl cluster deploy default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Upload IoTDB compressed package and jdk compressed package according to the node information in `confignode_servers` and `datanode_servers` (if `jdk_tar_dir` and `jdk_deploy_dir` values ​​are configured in yaml) - -* Generate and upload `iotdb-system.properties` according to the yaml file node configuration information - -```bash -iotdbctl cluster deploy default_cluster -op force -``` - -Note: This command will force the deployment, and the specific process will delete the existing deployment directory and redeploy - -*deploy a single module* -```bash -# Deploy grafana module -iotdbctl cluster deploy default_cluster -N grafana -# Deploy the prometheus module -iotdbctl cluster deploy default_cluster -N prometheus -# Deploy the iotdb module -iotdbctl cluster deploy default_cluster -N iotdb -``` - -#### Start cluster command - -```bash -iotdbctl 
cluster start default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Start the confignodes sequentially according to the order in `confignode_servers` in the yaml configuration file and check whether each confignode is normal according to the process id. The first confignode is the seed confignode - -* Start the datanodes in sequence according to the order in `datanode_servers` in the yaml configuration file and check whether each datanode is normal according to the process id. - -* After checking the existence of the process according to the process id, check whether each service in the cluster list is normal through the cli. If the cli connection fails, retry every 10s, up to 5 times - - -*Start a single node command* -```bash -#Start according to the IoTDB node name -iotdbctl cluster start default_cluster -N datanode_1 -#Start according to IoTDB cluster ip+port, where port corresponds to cn_internal_port of confignode and rpc_port of datanode. -iotdbctl cluster start default_cluster -N 192.168.1.5:6667 -#Start grafana -iotdbctl cluster start default_cluster -N grafana -#Start prometheus -iotdbctl cluster start default_cluster -N prometheus -``` - -* Find the yaml file in the default location based on cluster-name - -* Find the node location information based on the provided node name or ip:port. If the started node is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file.
- If the started node is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - -* start the node - -Note: Since the cluster deployment tool only calls the start-confignode.sh and start-datanode.sh scripts in the IoTDB cluster, -When the actual output result fails, it may be that the cluster has not started normally. It is recommended to use the status command to check the current cluster status (iotdbctl cluster status xxx) - - -#### View IoTDB cluster status command - -```bash -iotdbctl cluster show default_cluster -#View IoTDB cluster details -iotdbctl cluster show default_cluster details -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Execute `show cluster details` through cli on datanode in turn. If one node is executed successfully, it will not continue to execute cli on subsequent nodes and return the result directly. - -#### Stop cluster command - - -```bash -iotdbctl cluster stop default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* According to the datanode node information in `datanode_servers`, stop the datanode nodes in order according to the configuration. 
- -* Based on the confignode node information in `confignode_servers`, stop the confignode nodes in sequence according to the configuration - -*force stop cluster command* - -```bash -iotdbctl cluster stop default_cluster -op force -``` -Will directly execute the kill -9 pid command to forcibly stop the cluster - -*Stop single node command* - -```bash -#Stop by IoTDB node name -iotdbctl cluster stop default_cluster -N datanode_1 -#Stop according to IoTDB cluster ip+port (ip+port is to get the only node according to ip+dn_rpc_port in datanode or ip+cn_internal_port in confignode to get the only node) -iotdbctl cluster stop default_cluster -N 192.168.1.5:6667 -#Stop grafana -iotdbctl cluster stop default_cluster -N grafana -#Stop prometheus -iotdbctl cluster stop default_cluster -N prometheus -``` - -* Find the yaml file in the default location based on cluster-name - -* Find the corresponding node location information based on the provided node name or ip:port. If the stopped node is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file. - If the stopped node is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - -* stop the node - -Note: Since the cluster deployment tool only calls the stop-confignode.sh and stop-datanode.sh scripts in the IoTDB cluster, in some cases the iotdb cluster may not be stopped. - - -#### Clean cluster data command - -```bash -iotdbctl cluster clean default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Based on the information in `confignode_servers` and `datanode_servers`, check whether there are still services running, - If any service is running, the cleanup command will not be executed. 
- -* Delete the data directory in the IoTDB cluster and the `cn_system_dir`, `cn_consensus_dir`, configured in the yaml file - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories. - - - -#### Restart cluster command - -```bash -iotdbctl cluster restart default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` - -* Execute the above stop cluster command (stop), and then execute the start cluster command (start). For details, refer to the above start and stop commands. - -*Force restart cluster command* - -```bash -iotdbctl cluster restart default_cluster -op force -``` -Will directly execute the kill -9 pid command to force stop the cluster, and then start the cluster - - -*Restart a single node command* - -```bash -#Restart datanode_1 according to the IoTDB node name -iotdbctl cluster restart default_cluster -N datanode_1 -#Restart confignode_1 according to the IoTDB node name -iotdbctl cluster restart default_cluster -N confignode_1 -#Restart grafana -iotdbctl cluster restart default_cluster -N grafana -#Restart prometheus -iotdbctl cluster restart default_cluster -N prometheus -``` - -#### Cluster shrink command - -```bash -#Scale down by node name -iotdbctl cluster scalein default_cluster -N nodename -#Scale down according to ip+port (ip+port obtains the only node according to ip+dn_rpc_port in datanode, and obtains the only node according to ip+cn_internal_port in confignode) -iotdbctl cluster scalein default_cluster -N ip:port -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Determine whether there is only one confignode node and datanode to be reduced. If there is only one left, the reduction cannot be performed. 
- -* Then get the information of the node to shrink according to ip:port or nodename, execute the shrink command, and then destroy the node directory. If the shrinking node is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file. - If the shrinking node is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - - -Tip: Currently, only one node can be scaled in at a time - -#### Cluster expansion command - -```bash -iotdbctl cluster scaleout default_cluster -``` -* Modify the config/xxx.yaml file to add a datanode node or confignode node - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Find the node to be expanded, upload the IoTDB compressed package and the JDK package (the latter only if the `jdk_tar_dir` and `jdk_deploy_dir` values are configured in yaml) and decompress them - -* Generate and upload `iotdb-system.properties` according to the yaml file node configuration information - -* Execute the command to start the node and verify whether the node is started successfully - -Tip: Currently, only one node can be scaled out at a time - -#### Destroy cluster command -```bash -iotdbctl cluster destroy default_cluster -``` - -* Find the yaml file in the default location based on cluster-name - -* Check whether any node is still running based on the node information in `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus`.
- Stop the destroy command if any node is running - -* Delete `data` in the IoTDB cluster and `cn_system_dir`, `cn_consensus_dir` configured in the yaml file - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, `ext`, `IoTDB` deployment directory, - grafana deployment directory and prometheus deployment directory - -*Destroy a single module* - -```bash -# Destroy grafana module -iotdbctl cluster destroy default_cluster -N grafana -# Destroy prometheus module -iotdbctl cluster destroy default_cluster -N prometheus -# Destroy iotdb module -iotdbctl cluster destroy default_cluster -N iotdb -``` - -#### Distribute cluster configuration commands - -```bash -iotdbctl cluster dist-conf default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` - -* Generate and upload `iotdb-system.properties` to the specified node according to the node configuration information of the yaml file - -#### Hot load cluster configuration command - -```bash -iotdbctl cluster reload default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Execute `load configuration` in the cli according to the node configuration information of the yaml file. 
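
The two commands above can be chained when a changed yaml configuration should be distributed and then hot-loaded in one step. The following is only a sketch under stated assumptions: the function name `push_and_reload` and the `IOTDBCTL` override variable are hypothetical helpers introduced here for illustration, and `iotdbctl` is assumed to be on `PATH`. Only the `dist-conf` and `reload` subcommands come from this document.

```shell
# Hypothetical helper (not part of iotdbctl): distribute the configuration,
# then hot-load it. Set IOTDBCTL=echo for a dry run that only prints the calls.
push_and_reload() {
  ctl="${IOTDBCTL:-iotdbctl}"
  cluster="${1:-default_cluster}"
  # Generate and upload iotdb-system.properties to the configured nodes.
  "$ctl" cluster dist-conf "$cluster" || return 1
  # Execute `load configuration` through the cli.
  "$ctl" cluster reload "$cluster"
}
```

For example, `push_and_reload default_cluster` runs both steps and stops early if distributing the configuration fails.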
- -#### Cluster node log backup -```bash -iotdbctl cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs' -``` - -* Find the yaml file in the default location based on cluster-name - -* This command will verify the existence of datanode_1 and confignode_1 according to the yaml file, and then back up the log data of the specified node datanode_1 and confignode_1 to the specified service `192.168.9.48` port 36000 according to the configured start and end dates (startdate<=logtime<=enddate) The data backup path is `/iotdb/logs`, and the IoTDB log storage path is `/root/data/db/iotdb/logs` (not required, if you do not fill in -logs xxx, the default is to backup logs from the IoTDB installation path /logs ) - -| command | description | required | -|------------|-------------------------------------------------------------------------|----------| -| -h | backup data server ip | NO | -| -u | backup data server username | NO | -| -pw | backup data machine password | NO | -| -p | backup data machine port(default 22) | NO | -| -path | path to backup data (default current path) | NO | -| -loglevel | Log levels include all, info, error, warn (default is all) | NO | -| -l | speed limit (default 1024 speed limit range 0 to 104857601 unit Kbit/s) | NO | -| -N | multiple configuration file cluster names are separated by commas. 
| YES | -| -startdate | start time (including default 1970-01-01) | NO | -| -enddate | end time (included) | NO | -| -logs | IoTDB log storage path, the default is ({iotdb}/logs)) | NO | - -#### Cluster data backup -```bash -iotdbctl cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas' -``` -* This command will obtain the leader node based on the yaml file, and then back up the data to the /iotdb/datas directory on the 192.168.9.48 service based on the start and end dates (startdate<=logtime<=enddate) - -| command | description | required | -|--------------|-------------------------------------------------------------------------|----------| -| -h | backup data server ip | NO | -| -u | backup data server username | NO | -| -pw | backup data machine password | NO | -| -p | backup data machine port(default 22) | NO | -| -path | path to backup data (default current path) | NO | -| -granularity | partition | YES | -| -l | speed limit (default 1024 speed limit range 0 to 104857601 unit Kbit/s) | NO | -| -startdate | start time (including default 1970-01-01) | YES | -| -enddate | end time (included) | YES | - -#### Cluster upgrade -```bash -iotdbctl cluster dist-lib default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Upload lib package - -Note that after performing the upgrade, please restart IoTDB for it to take effect. 
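
Because the uploaded lib package only takes effect after a restart, the upgrade can be followed directly by the restart and status commands described elsewhere in this document. The snippet below is a minimal sketch, not an official procedure: `upgrade_and_restart` and the `IOTDBCTL` override variable are hypothetical names introduced for illustration, and `iotdbctl` is assumed to be on `PATH`.

```shell
# Hypothetical helper (not part of iotdbctl): upload the lib package,
# restart the cluster, then show the process status of each node.
# Set IOTDBCTL=echo for a dry run that only prints the calls.
upgrade_and_restart() {
  ctl="${IOTDBCTL:-iotdbctl}"
  cluster="${1:-default_cluster}"
  "$ctl" cluster dist-lib "$cluster" || return 1   # upload lib package
  "$ctl" cluster restart "$cluster" || return 1    # stop, then start
  "$ctl" cluster status "$cluster"                 # confirm nodes are alive
}
```

Stopping at the first failing step avoids restarting a cluster whose lib upload did not complete.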
- -#### Cluster initialization -```bash -iotdbctl cluster init default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` -* Initialize cluster configuration - -#### View cluster process status -```bash -iotdbctl cluster status default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` -* Display the survival status of each node in the cluster - -#### Cluster authorization activation - -By default, the cluster is activated by entering an activation code. It can also be activated through a license path by using `-op license_path` - -* Default activation method -```bash -iotdbctl cluster activate default_cluster -``` -* Find the yaml file in the default location based on `cluster-name` and obtain the `confignode_servers` configuration information -* Obtain the machine code inside -* Wait for the activation code to be entered - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* Activate a node - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -``` - -* Activate through license path - -```bash -iotdbctl cluster activate default_cluster -op license_path -``` -* Find the yaml file in the default location based on `cluster-name` and obtain the `confignode_servers` configuration information -* Obtain the machine code inside -* Wait for the activation code to be entered - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ==
-JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* Activate a node - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -op license_path -``` - -#### Cluster Health Check -```bash -iotdbctl cluster health_check default_cluster -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve confignode_servers and datanode_servers configuration information. -* Execute health_check.sh on each node. -* Single Node Health Check -```bash -iotdbctl cluster health_check default_cluster -N datanode_1 -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute health_check.sh on datanode1. - -#### Cluster Shutdown Backup - -```bash -iotdbctl cluster backup default_cluster -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve confignode_servers and datanode_servers configuration information. -* Execute backup.sh on each node - -* Single Node Backup - -```bash -iotdbctl cluster backup default_cluster -N datanode_1 -``` - -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute backup.sh on datanode1. -Note: Multi-node deployment on a single machine only supports quick mode. - -#### Cluster Metadata Import -```bash -iotdbctl cluster importschema default_cluster -N datanode1 -param "-s ./dump0.csv -fd ./failed/ -lpf 10000" -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute metadata import with import-schema.sh on datanode1. 
-* Parameters for -param are as follows: - -| command | description | required | -|------------|-------------------------------------------------------------------------|----------| -| -s | Specify the data file to be imported. You can specify a file or a directory. If a directory is specified, all files with a .csv extension in the directory will be imported in bulk. | YES | -| -fd | Specify a directory to store failed import files. If this parameter is not specified, failed files will be saved in the source data directory with the extension .failed added to the original filename. | No | -| -lpf | Specify the number of lines written to each failed import file. The default is 10000.| NO | - -#### Cluster Metadata Export - -```bash -iotdbctl cluster exportschema default_cluster -N datanode1 -param "-t ./ -pf ./pattern.txt -lpf 10 -t 10000" -``` - -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute metadata export with export-schema.sh on datanode1. -* Parameters for -param are as follows: - -| command | description | required | -|-------------|-------------------------------------------------------------------------|----------| -| -t | Specify the output path for the exported CSV file. | YES | -| -path | Specify the path pattern for exporting metadata. If this parameter is specified, the -s parameter will be ignored. Example: root.stock.** | NO | -| -pf | If -path is not specified, this parameter must be specified. It designates the file path containing the metadata paths to be exported, supporting txt file format. Each path to be exported is on a new line.| NO | -| -lpf | Specify the maximum number of lines for the exported dump file. 
The default is 10000.| NO | -| -timeout | Specify the timeout for session queries in milliseconds.| NO | - - - -### 1.7 Introduction to Cluster Deployment Tool Samples - -In the cluster deployment tool installation directory config/example, there are three yaml examples. If necessary, you can copy them to config and modify them. - -| name | description | -|-----------------------------|------------------------------------------------| -| default\_1c1d.yaml | 1 confignode and 1 datanode configuration example | -| default\_3c3d.yaml | 3 confignode and 3 datanode configuration samples | -| default\_3c3d\_grafa\_prome | 3 confignode and 3 datanode, Grafana, Prometheus configuration examples | - - -## 2. IoTDB Data Directory Overview Tool - -IoTDB data directory overview tool is used to print an overview of the IoTDB data directory structure. The location is tools/tsfile/print-iotdb-data-dir. - -### 2.1 Usage - -- For Windows: - -```bash -.\print-iotdb-data-dir.bat () -``` - -- For Linux or MacOs: - -```shell -./print-iotdb-data-dir.sh () -``` - -Note: if the storage path of the output overview file is not set, the default relative path "IoTDB_data_dir_overview.txt" will be used. - -### 2.2 Example - -Use Windows in this example: - -`````````````````````````bash -.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data -```````````````````````` -Starting Printing the IoTDB Data Directory Overview -```````````````````````` -output save path:IoTDB_data_dir_overview.txt -data dir num:1 -143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
-|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -````````````````````````` - -## 3. TsFile Sketch Tool - -TsFile sketch tool is used to print the content of a TsFile in sketch mode. The location is tools/tsfile/print-tsfile. - -### 3.1 Usage - -- For Windows: - -``` -.\print-tsfile-sketch.bat () -``` - -- For Linux or MacOs: - -``` -./print-tsfile-sketch.sh () -``` - -Note: if the storage path of the output sketch file is not set, the default relative path "TsFile_sketch_view.txt" will be used. - -### 3.2 Example - -Use Windows in this example: - -`````````````````````````bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
--------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] 
offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - └──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -````````````````````````` - -Explanations: - -- Separated by "|", the left is the actual position in the TsFile, and the right is the summary content. -- "||||||||||||||||||||" is the guide information added to enhance readability, not the actual data stored in TsFile. -- The last printed "IndexOfTimerseriesIndex Tree" is a reorganization of the metadata index tree at the end of the TsFile, which is convenient for intuitive understanding, and again not the actual data stored in TsFile. - -## 4. TsFile Resource Sketch Tool - -TsFile resource sketch tool is used to print the content of a TsFile resource file. The location is tools/tsfile/print-tsfile-resource-files. 
- -### 4.1 Usage - -- For Windows: - -```bash -.\print-tsfile-resource-files.bat -``` - -- For Linux or MacOs: - -``` -./print-tsfile-resource-files.sh -``` - -### 4.2 Example - -Use Windows in this example: - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. 
-````````````````````````` - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. 
-````````````````````````` diff --git a/src/UserGuide/Master/Tree/Tools-System/Monitor-Tool_timecho.md b/src/UserGuide/Master/Tree/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index 197ea27e2..000000000 --- a/src/UserGuide/Master/Tree/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,200 +0,0 @@ - - -# Monitor Tool - -The deployment of monitoring tools can refer to the document [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) section. - -## 1. Prometheus - -### 1.1 The mapping from metric type to prometheus format - -> For metrics whose Metric Name is name and Tags are K1=V1, ..., Kn=Vn, the mapping is as follows, where value is a -> specific value - -| Metric Type | Mapping | -| ---------------- | ------------------------------------------------------------ | -| Counter | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value | -| AutoGauge、Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value | -| Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.5"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.99"} value | -| Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m1"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m5"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m15"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="mean"} value | -| Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.5"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.99"} value | - -### 1.2 Config File - -1) Taking DataNode as an example, modify the iotdb-system.properties configuration file as follows: - -```properties -dn_metric_reporter_list=PROMETHEUS -dn_metric_level=CORE -dn_metric_prometheus_reporter_port=9091 -``` - -Then you can get metrics data as follows - -2) Start IoTDB DataNodes -3) Open a browser or use ```curl``` to visit ```http://server_ip:9091/metrics```, and you will get metric data like the following: - -``` -... -# HELP file_count -# TYPE file_count gauge -file_count{name="wal",} 0.0 -file_count{name="unseq",} 0.0 -file_count{name="seq",} 2.0 -... -``` - -### 1.3 Prometheus + Grafana - -As shown above, IoTDB exposes its monitoring metrics in the standard Prometheus format. Prometheus -can be used to collect and store the metrics, and Grafana can be used to visualize them. - -The following picture describes the relationships among IoTDB, Prometheus and Grafana - -![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png) - -1. While running, IoTDB continuously collects its metrics. -2. Prometheus scrapes metrics from IoTDB at a constant interval (can be configured). -3. Prometheus saves these metrics to its internal TSDB. -4. Grafana queries metrics from Prometheus at a constant interval (can be configured) and then presents them on the - graph. - -So, some additional work is needed to configure and deploy Prometheus and Grafana.
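
Before deploying Prometheus, it can help to confirm from a shell that the endpoint from section 1.2 returns the expected samples. The sketch below reads Prometheus-format text on standard input and prints the value of one sample. The function name `metric_value` is a hypothetical helper introduced here, and it assumes label values contain no spaces, as in the `file_count` sample shown earlier.

```shell
# Hypothetical helper: print the value of the sample whose line starts with
# "<metric>{" and contains the given label fragment, e.g. name="seq".
# Comment lines (# HELP / # TYPE) never match because they start with "#".
metric_value() {
  awk -v m="$1" -v l="$2" \
    'index($0, m "{") == 1 && index($0, l) > 0 { print $NF }'
}
```

A possible use against a running DataNode, with `server_ip` as a placeholder, is `curl -s http://server_ip:9091/metrics | metric_value file_count 'name="seq"'`.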
- -For instance, Prometheus can be configured as follows to scrape metrics from IoTDB: - -```yaml -job_name: pull-metrics -honor_labels: true -honor_timestamps: true -scrape_interval: 15s -scrape_timeout: 10s -metrics_path: /metrics -scheme: http -follow_redirects: true -static_configs: - - targets: - - localhost:9091 -``` - -The following documents may help you get started with Prometheus and Grafana. - -[Prometheus getting_started](https://prometheus.io/docs/prometheus/latest/getting_started/) - -[Prometheus scrape metrics](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) - -[Grafana getting_started](https://grafana.com/docs/grafana/latest/getting-started/getting-started/) - -[Grafana query metrics from Prometheus](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus) - -## 2. Apache IoTDB Dashboard - -The Apache IoTDB Dashboard is designed for unified, centralized operations and management. With it, multiple clusters can be monitored through a single panel. - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png) - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png) - - -The Dashboard's JSON file is available in the enterprise edition. - -### 2.1 Cluster Overview - -Including but not limited to: - -- Total cluster CPU cores, memory space, and hard disk space. -- Number of ConfigNodes and DataNodes in the cluster. -- Cluster uptime duration. -- Cluster write speed. -- Current CPU, memory, and disk usage across all nodes in the cluster. -- Information on individual nodes. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png) - - -### 2.2 Data Writing - -Including but not limited to: - -- Average write latency, median latency, and 99th percentile latency. -- Number and size of WAL files. -- Node WAL flush SyncBuffer latency. 
- -![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png) - -### 2.3 Data Querying - -Including but not limited to: - -- Node query load times for time series metadata. -- Node read duration for time series. -- Node edit duration for time series metadata. -- Node query load time for Chunk metadata list. -- Node edit duration for Chunk metadata. -- Node filtering duration based on Chunk metadata. -- Average time to construct a Chunk Reader. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png) - -### 2.4 Storage Engine - -Including but not limited to: - -- File count and sizes by type. -- The count and size of TsFiles at various stages. -- Number and duration of various tasks. - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png) - -### 2.5 System Monitoring - -Including but not limited to: - -- System memory, swap memory, and process memory. -- Disk space, file count, and file sizes. -- JVM GC time percentage, GC occurrences by type, GC volume, and heap memory usage across generations. 
-- Network transmission rate, packet sending rate - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png) - -### 2.6 Data Synchronization - -Including but not limited to: - -- Pipe event commit queue size, number of unassigned Pipe events -- Number of unprocessed events in the Source queue, Source event feeding rate, Processor event processing rate -- Number of untransmitted events for all Pipe Sinks/Sources, transmission event rate of Pipe connectors -- Retry queue size and pending handler count of Pipe Sinks; total data size before and after compression and compression duration of Pipe Sinks; batch size and batch interval distribution of Pipe Sinks -- Pipe memory usage and capacity, number of Pipe phantom references, quantity and total size of linked TsFiles, disk bytes read for TsFile transmission via Pipe - -![](/img/monitor-tool-pipe-1-en.png) - -![](/img/monitor-tool-pipe-2-en.png) - -![](/img/monitor-tool-pipe-3-en.png) - -![](/img/monitor-tool-pipe-4-en.png) \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/Tools-System/Schema-Export-Tool_timecho.md b/src/UserGuide/Master/Tree/Tools-System/Schema-Export-Tool_timecho.md deleted file mode 100644 index 6ad9ae662..000000000 --- a/src/UserGuide/Master/Tree/Tools-System/Schema-Export-Tool_timecho.md +++ /dev/null @@ -1,85 +0,0 @@ - - -# Schema Export - -## 1. Overview - -The schema export tool `export-schema.sh/bat` is located in the `tools` directory. It can export schema from a specified database in IoTDB to a script file. - -## 2. 
Detailed Functionality - -### 2.1 Parameter - -| **Short Param** | **Full Param** | **Description** | Required | Default | -|---|---|---|---|---| -| `-h` | `--host` | Hostname | No | 127.0.0.1 | -| `-p` | `--port` | Port number | No | 6667 | -| `-u` | `--username` | Username | No | root | -| `-pw` | `--password` | Password. Hidden input is supported since V2.0.9.1 | No | TimechoDB@2021 (`root` before V2.0.6) | -| `-sql_dialect` | `--sql_dialect` | Specifies whether the server uses the `tree` model or the `table` model | No | tree | -| `-db` | `--database` | Target database to export (only applies when `-sql_dialect=table`) | Required if `-sql_dialect=table` | - | -| `-table` | `--table` | Target table to export (only applies when `-sql_dialect=table`) | No | - | -| `-t` | `--target` | Output directory (created if it doesn't exist) | Yes | | -| `-path` | `--path_pattern` | Path pattern for metadata export | Required if `-sql_dialect=tree` | | -| `-pfn` | `--prefix_file_name` | Output filename prefix | No | `dump_dbname.sql` | -| `-lpf` | `--lines_per_file` | Maximum lines per dump file (only applies when `-sql_dialect=tree`) | No | `10000` | -| `-timeout` | `--query_timeout` | Query timeout in milliseconds (`-1` = no timeout) | No | `-1`. Range: `-1` to `Long.MAX_VALUE` (9223372036854775807) | -| `-help` | `--help` | Display help information | No | | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. 
Supported since V2.0.9.1 | No | - | - -### 2.2 Command - -```Bash -Shell -# Unix/OS X -> tools/export-schema.sh [-sql_dialect] -db -table - [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -# Windows -# Before version V2.0.4.x -> tools\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] - -# V2.0.4.x and later versions -> tools\windows\schema\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -``` - -### 2.3 Examples - - -```Bash -# Export schema under root.treedb -./export-schema.sh -sql_dialect tree -t /home/ -path "root.treedb.**" - -# Output -Timeseries,Alias,DataType,Encoding,Compression -root.treedb.device.temperature,,DOUBLE,GORILLA,LZ4 -root.treedb.device.humidity,,DOUBLE,GORILLA,LZ4 -``` \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/Tools-System/Schema-Import-Tool_timecho.md b/src/UserGuide/Master/Tree/Tools-System/Schema-Import-Tool_timecho.md deleted file mode 100644 index 5d9de70a3..000000000 --- a/src/UserGuide/Master/Tree/Tools-System/Schema-Import-Tool_timecho.md +++ /dev/null @@ -1,90 +0,0 @@ - - -# Schema Import - -## 1. Overview - -The schema import tool `import-schema.sh/bat` is located in `tools` directory. - -## 2. Detailed Functionality - -### 2.1 Parameter - -| **Short Param** | **Full Param** | **Description** | Required | Default | -|-----------------| ------------------------------- |-----------------------------------------------------------------------| ---------- |-------------------------------------------| -| `-h` | `-- host` | Hostname | No | 127.0.0.1 | -| `-p` | `--port` | Port number | No | 6667 | -| `-u` | `--username` | Username | No | root | -| `-pw` | `--password` | Password, Supported for hidden input since V2.0.9.1 | No | TimechoDB@2021(Before V2.0.6 it is root) | -| `-sql_dialect` | `--sql_dialect` | Specifies whether the server uses`tree `model or`table `model | No | tree | -| `-db` | `--database` | Target database for import | Yes | - | -| `-table` | `--table` | Target table for import (only applies when`-sql_dialect=table`) | No | - | -| `-s` | `--source` | Local directory path containing script file(s) to import | Yes | | -| `-fd` | `--fail_dir` | Directory to save failed import files | No | | -| `-lpf` | `--lines_per_failed_file` | Maximum lines per failed file (only applies when`-sql_dialect=table`) | No | 
`100000`. Range: `0` to `Integer.MAX_VALUE` (2147483647) | -| `-help` | `--help` | Display help information | No | | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - -### 2.2 Command - -```Bash -# Unix/OS X -tools/import-schema.sh [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# Windows -# Before version V2.0.4.x -tools\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# V2.0.4.x and later versions -tools\windows\schema\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] -``` - -### 2.3 Examples - -```Bash -# Before import -IoTDB> show timeseries root.treedb.** -+----------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -+----------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -# Execution -./import-schema.sh -sql_dialect tree -s /home/dump0_0.csv -db root.treedb - -# Verification -IoTDB> show timeseries root.treedb.** -+------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias| Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.treedb.device.temperature| null|root.treedb| DOUBLE| GORILLA| LZ4|null| null| null| null| BASE| -| root.treedb.device.humidity| null|root.treedb| DOUBLE| GORILLA| LZ4|null| null| null| null| BASE| -+------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+ -``` diff --git a/src/UserGuide/Master/Tree/Tools-System/Workbench_timecho.md b/src/UserGuide/Master/Tree/Tools-System/Workbench_timecho.md deleted file mode 100644 index cb0d17f1e..000000000 --- a/src/UserGuide/Master/Tree/Tools-System/Workbench_timecho.md +++ /dev/null @@ -1,33 +0,0 @@ -# WorkBench - -The deployment of the visualization console can refer to the document [Workbench Deployment](../Deployment-and-Maintenance/workbench-deployment_timecho.md) chapter. - -## 1. 
Product Introduction -IoTDB Visualization Console is an extension component developed for industrial scenarios based on the IoTDB Enterprise Edition time series database. It integrates real-time data collection, storage, and analysis, aiming to provide users with efficient and reliable real-time data storage and query solutions. It features lightweight, high performance, and ease of use, seamlessly integrating with the Hadoop and Spark ecosystems. It is suitable for high-speed writing and complex analytical queries of massive time series data in industrial IoT applications. - -## 2. Instructions for Use -| **Functional Module** | **Functional Description** | -| ---------------------- | ------------------------------------------------------------ | -| Instance Management | Support unified management of connected instances, support creation, editing, and deletion, while visualizing the relationships between multiple instances, helping customers manage multiple database instances more clearly | -| Home | Support viewing the service running status of each node in the database instance (such as activation status, running status, IP information, etc.), support viewing the running monitoring status of clusters, ConfigNodes, and DataNodes, monitor the operational health of the database, and determine if there are any potential operational issues with the instance. | -| Measurement Point List | Support directly viewing the measurement point information in the instance, including database information (such as database name, data retention time, number of devices, etc.), and measurement point information (measurement point name, data type, compression encoding, etc.), while also supporting the creation, export, and deletion of measurement points either individually or in batches. | -| Data Model | Support viewing hierarchical relationships and visually displaying the hierarchical model. 
| -| Data Query | Support interface-based query interactions for common data query scenarios, and enable batch import and export of queried data. | -| Statistical Query | Support interface-based query interactions for common statistical data scenarios, such as outputting results for maximum, minimum, average, and sum values. | -| SQL Operations | Support interactive SQL operations on the database through a graphical user interface, allowing for the execution of single or multiple statements, and displaying and exporting the results. | -| Trend | Support one-click visualization to view the overall trend of data, draw real-time and historical data for selected measurement points, and observe the real-time and historical operational status of the measurement points. | -| Analysis | Support visualizing data with different analysis methods (such as FFT). | -| View | Support viewing information such as view name, view description, result measurement points, and expressions through the interface. Additionally, enable users to quickly create, edit, and delete views through interactive interfaces. | -| Data synchronization | Support the intuitive creation, viewing, and management of data synchronization tasks between databases. Enable direct viewing of task running status, synchronized data, and target addresses. Users can also monitor changes in synchronization status in real time through the interface. | -| Permission management | Support interface-based permission control over database user access and operations. | -| Audit logs | Support detailed logging of user operations on the database, including Data Definition Language (DDL), Data Manipulation Language (DML), and query operations. Assist users in tracking and identifying potential security threats, database errors, and misuse. 
| - -Main feature showcase -* Home -![首页.png](/img/%E9%A6%96%E9%A1%B5.png) -* Measurement Point List -![测点列表.png](/img/workbench-en-bxzk.png) -* Data Query -![数据查询.png](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2.png) -* Trend -![历史趋势.png](/img/%E5%8E%86%E5%8F%B2%E8%B6%8B%E5%8A%BF.png) \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/User-Manual/Audit-Log_timecho.md b/src/UserGuide/Master/Tree/User-Manual/Audit-Log_timecho.md deleted file mode 100644 index f63be18b4..000000000 --- a/src/UserGuide/Master/Tree/User-Manual/Audit-Log_timecho.md +++ /dev/null @@ -1,165 +0,0 @@ - - - -# Security Audit - -## 1. Introduction - -Audit logs serve as the record credentials of a database, enabling tracking of various operations (e.g., create, read, update, delete) to ensure information security. The audit log feature in IoTDB supports the following capabilities: - -* Supports enabling/disabling the audit log functionality through configuration -* Supports configuring operation types and privilege levels to be recorded via parameters -* Supports setting the storage duration of audit log files, including time-based rolling (via TTL) and space-based rolling (via SpaceTL) -* Supports configuring parameters to count slow requests (with write/query latency exceeding a threshold, default 3000 milliseconds) within any specified time period -* Audit log files are stored in encrypted format by default - -> Note: This feature is available from version V2.0.8 onwards. - -## 2. Configuration Parameters - -Edit the `iotdb-system.properties` file to enable audit logging using the following parameters: - -* V2.0.8.1 - -| Parameter Name | Description | Data Type | Default Value | Activation Method | -|-------------------------------------------|------------------------------------------------------------------------------------------------------------|-----------|-------------------------------|-------------------| -| `enable_audit_log` | Whether to enable audit logging. 
true: enabled. false: disabled. | Boolean | false | Hot Reload | -| `auditable_operation_type` | Operation type selection. DML: all DML operations are logged; DDL: all DDL operations are logged; QUERY: all query operations are logged; CONTROL: all control statements are logged. | String | DML,DDL,QUERY,CONTROL | Hot Reload | -| `auditable_operation_level` | Permission level selection. global: log all audit events; object: only log events related to data instances. Containment relationship: object < global. For example: when set to global, all audit logs are recorded normally; when set to object, only operations on specific data instances are recorded. | String | global | Hot Reload | -| `auditable_operation_result` | Audit result selection. success: log only successful events; fail: log only failed events | String | success,fail | Hot Reload | -| `audit_log_ttl_in_days` | Audit log TTL (Time To Live). Logs older than this threshold will expire. | Double | -1.0 (never deleted) | Hot Reload | -| `audit_log_space_tl_in_GB` | Audit log SpaceTL. Logs will start rotating when total space reaches this threshold. | Double | 1.0 | Hot Reload | -| `audit_log_batch_interval_in_ms` | Batch write interval for audit logs | Long | 1000 | Hot Reload | -| `audit_log_batch_max_queue_bytes` | Maximum byte size of the queue for batch processing audit logs. Subsequent write operations will be blocked when this threshold is exceeded. | Long | 268435456 | Hot Reload | - -* V2.0.9.2 - -| Parameter Name | Description | Data Type | Default Value | Activation Method | -|-------------------------------------------|------------------------------------------------------------------------------------------------------------|-----------|-------------------------------|-------------------| -| `enable_audit_log` | Whether to enable audit logging. true: enabled. false: disabled. | Boolean | false | Hot Reload | -| `auditable_operation_type` | Operation type selection. 
DML: all DML operations are logged; DDL: all DDL operations are logged; QUERY: all query operations are logged; CONTROL: all control statements are logged. | String | DML,DDL,QUERY,CONTROL | Hot Reload | -| `auditable_dml_event_type` | Event types for auditing DML operations. `OBJECT_AUTHENTICATION`: object authentication, `SLOW_OPERATION`: slow operation | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | Hot Reload | -| `auditable_ddl_event_type` | Event types for auditing DDL operations. `OBJECT_AUTHENTICATION`: object authentication, `SLOW_OPERATION`: slow operation | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | Hot Reload | -| `auditable_query_event_type` | Event types for auditing query operations. `OBJECT_AUTHENTICATION`: object authentication, `SLOW_OPERATION`: slow operation | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | Hot Reload | -| `auditable_control_event_type` | Event types for auditing control operations. `CHANGE_AUDIT_OPTION`: audit option change, `OBJECT_AUTHENTICATION`: object authentication, `LOGIN`: login, `LOGOUT`: logout, `DN_SHUTDOWN`: data node shutdown, `SLOW_OPERATION`: slow operation | String | `CHANGE_AUDIT_OPTION`,`OBJECT_AUTHENTICATION`,`LOGIN`,`LOGOUT`,`DN_SHUTDOWN`,`SLOW_OPERATION` | Hot Reload | -| `auditable_operation_level` | Permission level selection. global: log all audit events; object: only log events related to data instances. Containment relationship: object < global. For example: when set to global, all audit logs are recorded normally; when set to object, only operations on specific data instances are recorded. | String | global | Hot Reload | -| `auditable_operation_result` | Audit result selection. success: log only successful events; fail: log only failed events | String | success,fail | Hot Reload | -| `audit_log_ttl_in_days` | Audit log TTL (Time To Live). Logs older than this threshold will expire. | Double | -1.0 (never deleted) | Hot Reload | -| `audit_log_space_tl_in_GB` | Audit log SpaceTL. 
Logs will start rotating when total space reaches this threshold. | Double | 1.0 | Hot Reload | -| `audit_log_batch_interval_in_ms` | Batch write interval for audit logs | Long | 1000 | Hot Reload | -| `audit_log_batch_max_queue_bytes` | Maximum byte size of the queue for batch processing audit logs. Subsequent write operations will be blocked when this threshold is exceeded. | Long | 268435456 | Hot Reload | - -**Instructions for Object Authentication and Slow Operations:** -- When the parameters `auditable_dml_event_type`, `auditable_ddl_event_type`, `auditable_query_event_type`, or `auditable_control_event_type` are set to `OBJECT_AUTHENTICATION`, the corresponding event types will be recorded in the audit log. -- When the parameters `auditable_dml_event_type`, `auditable_ddl_event_type`, `auditable_query_event_type`, or `auditable_control_event_type` are set to `SLOW_OPERATION`, only the corresponding event types whose execution time exceeds the value of the `slow_query_threshold` parameter (default: 3000 ms) will be recorded in the audit log. The value of the `slow_query_threshold` parameter can be configured in the `iotdb-system.properties` file. - - -## 3. Access Methods - -Supports direct reading of audit logs via SQL. 
- -### 3.1 SQL Syntax - -```SQL -SELECT (, )* log FROM WHERE whereclause ORDER BY order_expression -``` - -* `AUDIT_LOG_PATH`: Audit log storage location `root.__audit.log..` -* `audit_log_field`: Query fields refer to the metadata structure below -* Supports WHERE clause filtering and ORDER BY sorting - -### 3.2 Metadata Structure - -| Field | Description | Data Type | -|------------------------|--------------------------------------------------|----------------| -| `time` | The date and time when the event started | timestamp | -| `username` | User name | string | -| `cli_hostname` | Client hostname identifier | string | -| `audit_event_type` | Audit event type, e.g., WRITE_DATA, GENERATE_KEY| string | -| `operation_type` | Operation type, e.g., DML, DDL, QUERY, CONTROL | string | -| `privilege_type` | Privilege used, e.g., WRITE_DATA, MANAGE_USER | string | -| `privilege_level` | Event privilege level, global or object | string | -| `result` | Event result, success=1, fail=0 | boolean | -| `database` | Database name | string | -| `sql_string` | User's original SQL statement | string | -| `log` | Detailed event description | string | - -### 3.3 Usage Examples - -* Query times, usernames and host information for successfully executed queries: - -```SQL -IoTDB> select username,cli_hostname from root.__audit.log.** where operation_type='QUERY' and result=true align by device -+-----------------------------+---------------------------+--------+------------+ -| Time| Device|username|cli_hostname| -+-----------------------------+---------------------------+--------+------------+ -|2026-01-23T10:39:21.563+08:00|root.__audit.log.node_1.u_0| root| 127.0.0.1| -|2026-01-23T10:39:33.746+08:00|root.__audit.log.node_1.u_0| root| 127.0.0.1| -|2026-01-23T10:42:15.032+08:00|root.__audit.log.node_1.u_0| root| 127.0.0.1| -+-----------------------------+---------------------------+--------+------------+ -Total line number = 3 -It costs 0.036s -``` - -* Query latest operation 
details: - -```SQL -IoTDB> select username,cli_hostname,operation_type,sql_string from root.__audit.log.** order by time desc limit 1 align by device -+-----------------------------+---------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------------------+ -| Time| Device|username|cli_hostname|operation_type| sql_string| -+-----------------------------+---------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------------------+ -|2026-01-23T10:42:32.795+08:00|root.__audit.log.node_1.u_0| root| 127.0.0.1| QUERY|select username,cli_hostname from root.__audit.log.** where operation_type='QUERY' and result=true align by device| -+-----------------------------+---------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------------------+ -Total line number = 1 -It costs 0.033s -``` - -* Query failed operations: - -```SQL -IoTDB> select database,operation_type,log from root.__audit.log.** where result=false align by device -+-----------------------------+-------------------------------+-----------+--------------+---------------------------------------------------------------------------------+ -| Time| Device| database|operation_type| log| -+-----------------------------+-------------------------------+-----------+--------------+---------------------------------------------------------------------------------+ -|2026-01-23T10:49:55.159+08:00|root.__audit.log.node_1.u_10000| | CONTROL| User user1 (ID=10000) login failed with code: 801, Authentication failed.| -|2026-01-23T10:52:04.579+08:00|root.__audit.log.node_1.u_10000| [root.**]| QUERY| User user1 (ID=10000) requests authority on object [root.**] with result false| 
-|2026-01-23T10:52:43.412+08:00|root.__audit.log.node_1.u_10000|root.userdb| DDL| User user1 (ID=10000) requests authority on object root.userdb with result false| -|2026-01-23T10:52:48.075+08:00|root.__audit.log.node_1.u_10000| null| QUERY|User user1 (ID=10000) requests authority on object root.__audit with result false| -+-----------------------------+-------------------------------+-----------+--------------+---------------------------------------------------------------------------------+ -Total line number = 4 -It costs 0.024s -``` - -* Query audit records for user 'u_0' on node 'node_1' with event types 'SLOW_OPERATION' - -```SQL -IoTDB> select * from root.__audit.log.node_1.u_0 where audit_event_type='SLOW_OPERATION' align by device -+-----------------------------+---------------------------+------+---------------+--------------+--------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------+----------------+------------+--------+ -| Time| Device|result|privilege_level|privilege_type|database|operation_type| log| sql_string|audit_event_type|cli_hostname|username| -+-----------------------------+---------------------------+------+---------------+--------------+--------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------+----------------+------------+--------+ -|2026-05-06T14:43:55.088+08:00|root.__audit.log.node_1.u_0| true| OBJECT| [READ_DATA]| | QUERY| SLOW_QUERY: cost 60 ms, select * from root.__audit.log.node_1.u_0 where 
audit_event_type='SLOW_OPERATION' or audit_event_type='LOGIN'limit 1 align by device|select * from root.__audit.log.node_1.u_0 where audit_event_type='SLOW_OPERATION' or audit_event_type='LOGIN'limit 1 align by device| SLOW_OPERATION| 127.0.0.1| root| -|2026-05-06T14:44:08.715+08:00|root.__audit.log.node_1.u_0| true| OBJECT| [WRITE_DATA]| | DML| Execution: insert into root.ln.wf02.wt02(timestamp, status, hardware) values (2, false, 'v2') cost 290 ms, with status code: TSStatus(code:200, message:)| insert into root.ln.wf02.wt02(timestamp, status, hardware) values (2, false, 'v2')| SLOW_OPERATION| 127.0.0.1| root| -|2026-05-06T14:44:11.684+08:00|root.__audit.log.node_1.u_0| true| OBJECT| [WRITE_DATA]| | DML|Execution: insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4') cost 6 ms, with status code: TSStatus(code:200, message:)| insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4')| SLOW_OPERATION| 127.0.0.1| root| -+-----------------------------+---------------------------+------+---------------+--------------+--------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------+----------------+------------+--------+ -Total line number = 3 -It costs 0.010s -``` diff --git a/src/UserGuide/Master/Tree/User-Manual/Authority-Management-Upgrade_timecho.md b/src/UserGuide/Master/Tree/User-Manual/Authority-Management-Upgrade_timecho.md deleted file mode 100644 index 83690fdeb..000000000 --- a/src/UserGuide/Master/Tree/User-Manual/Authority-Management-Upgrade_timecho.md +++ /dev/null @@ -1,410 +0,0 @@ - -# Authority Management - -IoTDB provides comprehensive permission management features to control access to data and cluster 
resources, ensuring data and system security. This document introduces the core concepts of the IoTDB permission module, user specifications, permission governance, authentication logic, and practical application examples. - -## 1. Core Concepts -### 1.1 User -A user refers to a legitimate database operator. Each user is identified by a unique username and authenticated by a password. To access the database, users must log in with valid usernames and passwords stored in the system. - -### 1.2 Privilege -The database supports a wide range of operations, but not all users are authorized to perform every action. A user is granted a corresponding privilege if permitted to execute a specific operation. Each privilege is bounded by a designated path. Flexible permission management can be implemented via [Path Pattern](../Basic-Concept/Operate-Metadata_timecho.md). - -### 1.3 Role -A role is a collection of privileges identified by a unique role name. Roles correspond to actual job identities (e.g., traffic dispatchers), and multiple users may share the same identity with identical permission sets. Roles enable unified and centralized management of permissions for user groups with consistent access requirements. - -### 1.4 Default Users and Roles -After initialization, IoTDB provides one default user: `root`, with the default password `TimechoDB@2021`. As the built-in super administrator, the root user permanently owns all privileges. Its permissions cannot be granted, revoked, or deleted, and it is the sole administrator account in the database. - -Newly created users and roles have no permissions by default. - -## 2. User Specifications -Users with the `SECURITY` privilege are authorized to create users and roles, subject to the following constraints: - -### 2.1 Username Rules -Usernames must be 4 to 32 characters long, including uppercase and lowercase letters, digits, and special symbols (`!@#$%^&*()_+-=`). 
Creation of usernames identical to the administrator account is prohibited. - -### 2.2 Password Rules -Passwords must be 12 to 32 characters long, containing both uppercase and lowercase letters, at least one digit, and at least one special symbol (`!@#$%^&*()_+-=`). Passwords cannot be the same as the associated username. - -### 2.3 Role Name Rules -Role names must be 4 to 32 characters long, including uppercase and lowercase letters, digits, and special symbols (`!@#$%^&*()_+-=`). Creation of role names identical to the administrator account is prohibited. - -## 3. Permission Management -Based on its tree data model, IoTDB classifies permissions into two major categories: global privileges and time series privileges. - -### 3.1 Global Privileges -Global privileges include three types: `SYSTEM`, `SECURITY`, and `AUDIT`: -- **SYSTEM**: Governs O&M operations and Data Definition Language (DDL) operations. -- **SECURITY**: Governs user and role management, as well as privilege granting for other accounts. -- **AUDIT**: Governs audit rule maintenance and audit log viewing. - -Detailed descriptions of each global privilege are shown in the table below: - -
-| Privilege Name | Original Privilege Name | Description |
-| --- | --- | --- |
-| SYSTEM | MANAGE_DATABASE | Allows users to create and drop databases. |
-| SYSTEM | USE_TRIGGER | Allows users to create, drop and query triggers. |
-| SYSTEM | USE_UDF | Allows users to create, drop and query user-defined functions. |
-| SYSTEM | USE_PIPE | Allows users to create, start, stop, drop and query PIPE tasks; allows users to create, drop and query PIPEPLUGINS. |
-| SYSTEM | USE_CQ | Allows users to register, start, stop, uninstall and query stream processing tasks; allows users to register, uninstall and query stream processing plugins. |
-| SYSTEM | MAINTAIN | Allows users to execute and cancel queries, view system variables, and check cluster status. |
-| SYSTEM | USE_MODEL | Allows users to create, drop and query deep learning models. |
-| SECURITY | MANAGE_USER | Allows users to create, drop, modify and query users. |
-| SECURITY | MANAGE_ROLE | Allows users to create, drop and query roles; grant roles to other users or revoke roles from other users. |
-| AUDIT | N/A | Allows users to maintain audit log rules and view audit logs. |
- -### 3.2 Time Series Privileges -Time series privileges control the scope and mode of user data access. They support authorization for absolute paths and prefix matching paths, and take effect at the time series granularity. - -Definitions of all time series privileges are listed in the table below: - -| Privilege Name | Description | -| --------------- |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| READ_DATA | Allows reading time series data under authorized paths. | -| WRITE_DATA | Allows reading time series data under authorized paths.
Allows inserting and deleting time series data under authorized paths.
Supports data import and loading within authorized paths. Data import requires the `WRITE_DATA` privilege for target paths; automatic creation of databases and time series additionally requires `SYSTEM` and `WRITE_SCHEMA` privileges. | -| READ_SCHEMA | Allows viewing detailed metadata tree information under authorized paths, including databases, sub-paths, child nodes, devices, time series, templates, views and other metadata. | -| WRITE_SCHEMA | Allows viewing metadata tree information under authorized paths.
Allows creating, dropping and modifying time series, templates and views under authorized paths.
When creating or modifying a view, the system checks the `WRITE_SCHEMA` privilege for the view path and the `READ_SCHEMA` privilege for data sources.
Querying and inserting data into a view requires the `READ_DATA` and `WRITE_DATA` privileges for the view path.
Allows configuring, canceling and querying TTL settings under authorized paths.
Allows mounting and unmounting templates under authorized paths.
Supports renaming the full path of time series (supported since V2.0.8.2). | - -### 3.3 Privilege Granting and Revocation -In IoTDB, users can obtain permissions through three methods: -1. Granted by the super administrator, who has full control over all user permissions. -2. Granted by common users with authorization permission, who have been assigned the `GRANT OPTION` keyword for specific privileges. -3. Assigned via roles granted by the super administrator or users with the `SECURITY` privilege. - -Permissions can be revoked through the following methods: -1. Revoked by the super administrator. -2. Revoked by common users with authorization permission, who have been assigned the `GRANT OPTION` keyword for specific privileges. -3. Revoked by the super administrator or users with the `SECURITY` privilege by removing specific roles from target users. - -- A valid path must be specified for all authorization operations. Global privileges require the path `root.**`, while time series privileges must use absolute paths or prefix paths ending with double wildcards. -- The `WITH GRANT OPTION` keyword can be specified when granting privileges to roles, enabling grantees to regrant or revoke corresponding privileges within the authorized path scope. For example, if User A is granted read access to `Group1.Company1.**` with the `GRANT OPTION`, User A can authorize or revoke read permissions for all sub-nodes and time series under `Group1.Company1`. -- During revocation, the system matches the revocation statement with all existing permission paths of the target user and clears all matched permissions. For instance, if User A owns the read privilege for `Group1.Company1.Factory1`, revoking the read privilege for `Group1.Company1.**` will clear the permission for the sub-path as well. - -## 4. 
Syntax and Usage Examples -IoTDB provides combined privilege aliases to simplify authorization configuration: - -| Privilege Name | Scope of Authority | -| ---------- | ---------------------------- | -| ALL | All privileges | -| READ | READ_SCHEMA, READ_DATA | -| WRITE | WRITE_SCHEMA, WRITE_DATA | - -Combined privileges are simplified aliases rather than independent privilege types, and function identically to declaring individual privileges separately. - -The following examples demonstrate common permission management SQL statements. Non-administrator users need corresponding prerequisites to execute these operations, which are noted in each scenario. - -### 4.1 User and Role Management -- **Create User** (Requires `SECURITY` privilege) -```SQL -CREATE USER -eg: CREATE USER user1 'Passwd@202604' -``` - -- **Drop User** (Requires `SECURITY` privilege) -```SQL -DROP USER -eg: DROP USER user1 -``` - -- **Create Role** (Requires `SECURITY` privilege) -```SQL -CREATE ROLE -eg: CREATE ROLE role1 -``` - -- **Drop Role** (Requires `SECURITY` privilege) -```SQL -DROP ROLE -eg: DROP ROLE role1 -``` - -- **Grant Role to User** (Requires `SECURITY` privilege) -```SQL -GRANT ROLE TO -eg: GRANT ROLE admin TO user1 -``` - -- **Revoke Role from User** (Requires `SECURITY` privilege) -```SQL -REVOKE ROLE FROM -eg: REVOKE ROLE admin FROM user1 -``` - -- **List All Users** (Requires `SECURITY` privilege) -```SQL -LIST USER -``` - -- **List All Roles** (Requires `SECURITY` privilege) -```SQL -LIST ROLE -``` - -- **List All Users Under a Specified Role** (Requires `SECURITY` privilege) -```SQL -LIST USER OF ROLE -eg: LIST USER OF ROLE roleuser -``` - -- **List Roles of a Specified User** - Users can view their own roles; viewing other users' roles requires the `SECURITY` privilege. 
-```SQL -LIST ROLE OF USER -eg: LIST ROLE OF USER tempuser -``` - -- **List All Privileges of a Specified User** - Users can view their own privileges; viewing other users' privileges requires the `SECURITY` privilege. -```SQL -LIST PRIVILEGES OF USER ; -eg: LIST PRIVILEGES OF USER tempuser; -``` - -- **List All Privileges of a Specified Role** - Users can view privileges of their assigned roles; viewing other roles' privileges requires the `SECURITY` privilege. -```SQL -LIST PRIVILEGES OF ROLE ; -eg: LIST PRIVILEGES OF ROLE actor; -``` - -- **Modify Password** - Users can update their own passwords; modifying other users' passwords requires the `SECURITY` privilege. -```SQL -ALTER USER SET PASSWORD ; -eg: ALTER USER tempuser SET PASSWORD 'Newpwd@202604'; -``` - -### 4.2 Privilege Granting and Revocation -#### Grant Syntax -```SQL -GRANT ON TO ROLE/USER [WITH GRANT OPTION]; -eg: GRANT READ ON root.** TO ROLE role1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.** TO USER user1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.**,root.t2.** TO USER user1; -eg: GRANT SECURITY ON root.** TO USER user1 WITH GRANT OPTION; -eg: GRANT ALL ON root.** TO USER user1 WITH GRANT OPTION; -``` - -#### Revoke Syntax -```SQL -REVOKE ON FROM ROLE/USER ; -eg: REVOKE READ ON root.** FROM ROLE role1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.** FROM USER user1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.**, root.t2.** FROM USER user1; -eg: REVOKE SECURITY ON root.** FROM USER user1; -eg: REVOKE ALL ON root.** FROM USER user1; -``` - -- Non-administrator users must hold the target privileges with the `WITH GRANT OPTION` attribute for the specified paths to execute grant or revoke operations. -- When granting or revoking global privileges (or statements containing global privileges, as `ALL` includes global privileges), the path must be set to `root.**`. 
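
The two path rules above (global privileges only on `root.**`; every other path either complete or ending in `.**`) can be sketched as a small pre-check. This is an illustrative Python model, not IoTDB's implementation: the function names, the simplified node-name pattern, and the treatment of `ALL` as the only alias that includes global privileges are assumptions made for this example.

```python
import re

# Global privileges as named in this document (SYSTEM, SECURITY, AUDIT).
GLOBAL_PRIVILEGES = {"SYSTEM", "SECURITY", "AUDIT"}
# Aliases that expand to a set containing global privileges (assumption: only ALL).
ALIASES_WITH_GLOBAL = {"ALL"}

# Simplified assumption: a path node is made of letters, digits and underscores.
_NODE = r"[A-Za-z0-9_]+"
# Either a complete path (root.a.b) or a prefix path ending in ".**" (root.a.**).
_PATH_RE = re.compile(rf"root(\.{_NODE})*(\.\*\*)?")


def is_valid_auth_path(path):
    """Return True if the path has a shape accepted for GRANT/REVOKE."""
    return _PATH_RE.fullmatch(path) is not None


def check_grant(privileges, path):
    """Return True if the privilege list may be granted or revoked on the path."""
    if not is_valid_auth_path(path):
        return False
    wanted = {p.upper() for p in privileges}
    if wanted & (GLOBAL_PRIVILEGES | ALIASES_WITH_GLOBAL):
        # Any global privilege in the statement forces the path root.**
        return path == "root.**"
    return True
```

Under these assumptions the sketch agrees with the valid and invalid statements listed in this section: `check_grant(["READ", "SECURITY"], "root.t1.**")` is rejected, while `check_grant(["SECURITY"], "root.**")` is accepted.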
- -**Valid Examples of Grant & Revoke** -```SQL -GRANT SECURITY ON root.** TO USER user1; -GRANT SECURITY ON root.** TO ROLE role1 WITH GRANT OPTION; -GRANT ALL ON root.** TO role role1 WITH GRANT OPTION; -REVOKE SECURITY ON root.** FROM USER user1; -REVOKE SECURITY ON root.** FROM ROLE role1; -REVOKE ALL ON root.** FROM ROLE role1; -``` - -**Invalid Statements** -```SQL -GRANT READ, SECURITY ON root.t1.** TO USER user1; -GRANT ALL ON root.t1.t2 TO USER user1 WITH GRANT OPTION; -REVOKE ALL ON root.t1.t2 FROM USER user1; -REVOKE READ, SECURITY ON root.t1.t2 FROM ROLE ROLE1; -``` - -- Valid path formats include complete absolute paths and paths ending with double wildcards: -```SQL --- Valid Paths -root.** -root.t1.t2.** -root.t1.t2.t3 -``` - -```SQL --- Invalid Paths -root.t1.* -root.t1.**.t2 -root.t1*.t2.t3 -``` - -## 5. Practical Scenario Example -Based on the [sample data](https://github.com/thulab/iotdb/files/4438687/OtherMaterial-Sample.Data.txt), IoTDB sample data belongs to multiple power generation groups such as ln and sgcc. To prevent cross-group data access, strict permission isolation at the group level is required. - -### 5.1 Create Users -Use the `CREATE USER` statement to create new users. For example, the root administrator creates two write users for the ln and sgcc groups with the unified password `write_Pwd@2026`. It is recommended to wrap usernames with backticks (`). - -```SQL -CREATE USER `ln_write_user` 'write_Pwd@2026'; -CREATE USER `sgcc_write_user` 'write_Pwd@2026'; -``` - -Execute the following statement to query all users: -```SQL -LIST USER; -``` - -Query result: -``` -IoTDB> CREATE USER `ln_write_user` 'write_Pwd@2026'; -Msg: The statement is executed successfully. -IoTDB> CREATE USER `sgcc_write_user` 'write_Pwd@2026'; -Msg: The statement is executed successfully. 
-IoTDB> LIST USER; -+------+---------------+-----------------+-----------------+ -|UserId| User|MaxSessionPerUser|MinSessionPerUser| -+------+---------------+-----------------+-----------------+ -| 0| root| -1| 1| -| 10000| ln_write_user| -1| -1| -| 10001|sgcc_write_user| -1| -1| -+------+---------------+-----------------+-----------------+ -Total line number = 3 -It costs 0.005s -``` - -### 5.2 Grant Permissions -Newly created users have no permissions by default and cannot perform any database operations. For example, an insertion executed by `ln_write_user` will fail: -```SQL -INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true); -``` - -Error message: -``` -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true); -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -Grant targeted write permissions to each user via the root account: -```SQL -GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user`; -GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user`; -``` - -Execution result: -``` -IoTDB> GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user`; -Msg: The statement is executed successfully. -IoTDB> GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user`; -Msg: The statement is executed successfully. -``` - -Retry data insertion with `ln_write_user`: -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true); -Msg: The statement is executed successfully. -``` - -### 5.3 Revoke Permissions -Use the `REVOKE` statement to reclaim granted permissions: -```SQL -REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user` -REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user` -``` - -Execution result: -``` -IoTDB> REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user` -Msg: The statement is executed successfully. 
-IoTDB> REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user` -Msg: The statement is executed successfully. -``` - -After permission revocation, `ln_write_user` loses write access to the `root.ln.**` path: -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true) -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -## 6. Authentication & Supplementary Instructions -### 6.1 Authentication Mechanism -User permissions consist of three core elements: effective path scope, privilege type, and the `WITH GRANT OPTION` tag. -```Plain -userTest1 : - root.t1.** - read_schema, read_data - with grant option - root.** - write_schema, write_data - with grant option -``` - -Each user has an independent permission list recording all authorized privileges, which can be queried via `LIST PRIVILEGES OF USER `. - -During authentication, the system matches the target operation path with authorized paths in sequence. When verifying the `read_schema` privilege for `root.t1.t2`, the system first matches the path rule `root.t1.**`. If matched, it checks whether the required privilege is included; otherwise, it continues matching until a valid rule is found or all rules are traversed. - -- For multi-path query tasks, the system only returns data accessible to the current user and filters out unauthorized content. -- For multi-path write tasks, the operation requires valid write permissions for **all** target time series. - -**Operations Requiring Combined Privileges** -1. With automatic time series creation enabled, inserting data into non-existent time series requires both `WRITE_DATA` and metadata modification privileges. -2. The `SELECT INTO` statement requires read privileges for source paths and write privileges for target paths. Insufficient source permissions lead to incomplete data; insufficient target permissions will terminate the task and throw an error. -3. 
View permissions are independent of underlying data sources. Read and write operations on views only verify view-specific permissions without checking privileges of the original data paths. - -### 6.2 Supplementary Notes -A role is a collection of privileges, while users have two types of attributes: independent individual privileges and inherited role privileges. A single role can contain multiple privileges, and a single user can be assigned multiple roles and independent permissions. - -No conflicting permissions exist in IoTDB. A user’s final effective permissions are the **union** of personal privileges and all privileges from assigned roles. An operation is permitted if either the user’s individual privileges or inherited role privileges contain the required authorization. Duplicate permissions between personal and role settings do not affect normal usage. - -Key notes: -If a user holds an independent privilege for Operation A and obtains the same privilege via a role, revoking only the user’s individual privilege cannot restrict the operation. Administrators must revoke the privilege from the corresponding role or remove the role from the user to disable the operation completely. Similarly, revoking privileges only from roles cannot restrict users with independent permissions. - -Modifications to role permissions take effect in real time for all bound users. Adding privileges to a role immediately grants access to all associated users, and removing privileges will revoke corresponding access unless users hold independent overriding permissions. 
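
The union rule described above can be illustrated with a minimal Python sketch. It is a model for explanation only, not IoTDB code: grants are represented as hypothetical `(path_pattern, privilege)` pairs, and `.**` is assumed to cover every path strictly below the prefix.

```python
def path_covers(pattern, path):
    """Return True if an authorized path pattern covers a concrete path."""
    if pattern.endswith(".**"):
        prefix = pattern[:-3]  # strip the trailing ".**"
        return path.startswith(prefix + ".")
    return pattern == path


def is_allowed(user_grants, role_grants, privilege, path):
    """Check a privilege against the union of personal and role grants.

    user_grants: set of (pattern, privilege) pairs held by the user directly.
    role_grants: dict mapping role name -> set of (pattern, privilege) pairs.
    """
    combined = set(user_grants)
    for grants in role_grants.values():
        combined |= set(grants)
    return any(p == privilege and path_covers(pat, path) for pat, p in combined)


own = {("root.ln.**", "READ_DATA")}
roles = {"reader": {("root.ln.**", "READ_DATA")}}
target = "root.ln.wf01.wt01.status"

print(is_allowed(own, roles, "READ_DATA", target))    # True
# Revoking only the personal grant does not block the operation,
# because the role still carries the same privilege.
print(is_allowed(set(), roles, "READ_DATA", target))  # True
# Removing the role as well finally denies access.
print(is_allowed(set(), {}, "READ_DATA", target))     # False
```

The second call mirrors the key note above: access is only lost once both the personal grant and the role grant are gone.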
\ No newline at end of file diff --git a/src/UserGuide/Master/Tree/User-Manual/Authority-Management_timecho.md b/src/UserGuide/Master/Tree/User-Manual/Authority-Management_timecho.md deleted file mode 100644 index 54bee9150..000000000 --- a/src/UserGuide/Master/Tree/User-Manual/Authority-Management_timecho.md +++ /dev/null @@ -1,519 +0,0 @@ - - -# Authority Management - -IoTDB provides permission management operations, offering users the ability to manage permissions for data and cluster systems, ensuring data and system security. - -This article introduces the basic concepts of the permission module in IoTDB, including user definition, permission management, authentication logic, and use cases. In the JAVA programming environment, you can use the [JDBC API](https://chat.openai.com/API/Programming-JDBC.md) to execute permission management statements individually or in batches. - -## 1. Basic Concepts - -### 1.1 User - -A user is a legitimate user of the database. Each user corresponds to a unique username and has a password as a means of authentication. Before using the database, a person must provide a valid (i.e., stored in the database) username and password for a successful login. - -### 1.2 Permission - -The database provides various operations, but not all users can perform all operations. If a user can perform a certain operation, they are said to have permission to execute that operation. Permissions are typically limited in scope by a path, and [path patterns](https://chat.openai.com/Basic-Concept/Data-Model-and-Terminology.md) can be used to manage permissions flexibly. - -### 1.3 Role - -A role is a collection of multiple permissions and has a unique role name as an identifier. Roles often correspond to real-world identities (e.g., a traffic dispatcher), and a real-world identity may correspond to multiple users. Users with the same real-world identity often have the same permissions, and roles are abstractions for unified management of such permissions. 
-
-### 1.4 Default Users and Roles
-
-After installation and initialization, IoTDB includes a default user: root, with the default password TimechoDB@2021 (before V2.0.6.x, the default password was root). This user is an administrator with fixed permissions, which cannot be granted or revoked, and the user cannot be deleted. There is only one administrator user in the database.
-
-A newly created user or role does not have any permissions initially.
-
-## 2. User Definition
-
-Users with MANAGE_USER and MANAGE_ROLE permissions or administrators can create users or roles. Creating a user must meet the following constraints.
-
-### 2.1 Username Constraints
-
-Usernames must be 4 to 32 characters long and may contain uppercase and lowercase English letters, numbers, and special characters (`!@#$%^&*()_+-=`).
-
-Users cannot create users with the same name as the administrator.
-
-### 2.2 Password Constraints
-
-Passwords must be 4 to 32 characters long and may contain uppercase and lowercase letters, numbers, and special characters (`!@#$%^&*()_+-=`). Passwords are encrypted by default using SHA-256.
-
-### 2.3 Role Name Constraints
-
-Role names must be 4 to 32 characters long and may contain uppercase and lowercase English letters, numbers, and special characters (`!@#$%^&*()_+-=`).
-
-Users cannot create roles with the same name as the administrator.
-
-
-
-## 3. Permission Management
-
-IoTDB primarily has two types of permissions: series permissions and global permissions.
-
-### 3.1 Series Permissions
-
-Series permissions constrain the scope and manner in which users access data. IoTDB supports authorization for both absolute paths and prefix-matching paths, and permissions take effect at the time series granularity.
- -The table below describes the types and scope of these permissions: - - - -| Permission Name | Description | -|-----------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| READ_DATA | Allows reading time series data under the authorized path. | -| WRITE_DATA | Allows reading time series data under the authorized path.
Allows inserting and deleting time series data under the authorized path.
Allows importing and loading data under the authorized path. When importing data, you need the WRITE_DATA permission for the corresponding path. When automatically creating databases or time series, you need MANAGE_DATABASE and WRITE_SCHEMA permissions. | -| READ_SCHEMA | Allows obtaining detailed information about the metadata tree under the authorized path,
including databases, child paths, child nodes, devices, time series, templates, views, etc. | -| WRITE_SCHEMA | Allows obtaining detailed information about the metadata tree under the authorized path.
Allows creating, deleting, and modifying time series, templates, views, etc. under the authorized path. When creating or modifying views, it checks the WRITE_SCHEMA permission for the view path and READ_SCHEMA permission for the data source. When querying and inserting data into views, it checks the READ_DATA and WRITE_DATA permissions for the view path.
Allows setting, unsetting, and viewing TTL under the authorized path.
Allows attaching or detaching templates under the authorized path.
Allowed to modify the full path name of a timeseries under an authorized path. -- Supported from V2.0.8.2 onwards | - - -### 3.2 Global Permissions - -Global permissions constrain the database functions that users can use and restrict commands that change the system and task state. Once a user obtains global authorization, they can manage the database. -The table below describes the types of system permissions: - - -| Permission Name | Description | -|:---------------:|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| MANAGE_DATABASE | Allow users to create and delete databases. | -| MANAGE_USER | Allow users to create, delete, modify, and view users. | -| MANAGE_ROLE | Allow users to create, delete, modify, and view roles.
Allow users to grant/revoke roles to/from other users. | -| USE_TRIGGER | Allow users to create, delete, and view triggers.
Independent of data source permission checks for triggers. | -| USE_UDF | Allow users to create, delete, and view user-defined functions.
Independent of data source permission checks for user-defined functions. | -| USE_CQ | Allow users to create, delete, and view continuous queries.
Independent of data source permission checks for continuous queries. | -| USE_PIPE | Allow users to create, start, stop, delete, and view pipelines.
Allow users to create, delete, and view pipeline plugins.
Independent of data source permission checks for pipelines. | -| EXTEND_TEMPLATE | Permission to automatically create templates. | -| MAINTAIN | Allow users to query and cancel queries.
Allow users to view variables.
Allow users to view cluster status. |
-| USE_MODEL | Allow users to create, delete, and view deep learning models. |
-Regarding template permissions:
-
-1. Only administrators are allowed to create, delete, modify, query, mount, and unmount templates.
-2. To activate a template, you need WRITE_SCHEMA permission for the activation path.
-3. If automatic creation is enabled, writing to a non-existent path that has a template mounted will automatically extend the template and insert data. Therefore, both EXTEND_TEMPLATE permission and WRITE_DATA permission for the target series are required.
-4. To deactivate a template, WRITE_SCHEMA permission for the mounted template path is required.
-5. To query paths that use a specific metadata template, you need READ_SCHEMA permission for the paths; otherwise, the query returns empty results.
-
-
-
-### 3.3 Granting and Revoking Permissions
-
-In IoTDB, users can obtain permissions through three methods:
-
-1. Granted by the administrator, who has control over the permissions of other users.
-2. Granted by a user allowed to authorize permissions, that is, a user who was assigned the grant option keyword when obtaining the permission.
-3. Granted a certain role by the administrator or a user with MANAGE_ROLE, thereby obtaining the role's permissions.
-
-Revoking a user's permissions can be done through the following methods:
-
-1. Revoked by the administrator.
-2. Revoked by a user allowed to authorize permissions, that is, a user who was assigned the grant option keyword when obtaining the permission.
-3. Revoked from a user's role by the administrator or a user with MANAGE_ROLE, thereby revoking the permissions.
-
-- When granting permissions, a path must be specified. Global permissions must be specified on root.**, while series-specific permissions must be absolute paths or prefix paths ending with a double wildcard.
-- When granting user/role permissions, you can specify the "with grant option" keyword for that permission, which means that the user can grant permissions on their authorized paths and can also revoke permissions on other users' authorized paths. For example, if User A is granted read permission for `group1.company1.**` with the grant option keyword, then A can grant read permissions to others on any node or series below `group1.company1`, and can also revoke read permissions on any node below `group1.company1` for other users. -- When revoking permissions, the revocation statement will match against all of the user's permission paths and clear the matched permission paths. For example, if User A has read permission for `group1.company1.factory1`, when revoking read permission for `group1.company1.**`, it will remove A's read permission for `group1.company1.factory1`. - - - -## 4. Authentication - -User permissions mainly consist of three parts: permission scope (path), permission type, and the "with grant option" flag: - -``` -userTest1: - root.t1.** - read_schema, read_data - with grant option - root.** - write_schema, write_data - with grant option -``` - -Each user has such a permission access list, identifying all the permissions they have acquired. You can view their permissions by using the command `LIST PRIVILEGES OF USER `. - -When authorizing a path, the database will match the path with the permissions. For example, when checking the read_schema permission for `root.t1.t2`, it will first match with the permission access list `root.t1.**`. If it matches successfully, it will then check if that path contains the permission to be authorized. If not, it continues to the next path-permission match until a match is found or all matches are exhausted. - -When performing authorization for multiple paths, such as executing a multi-path query task, the database will only present data for which the user has permissions. 
Data for which the user does not have permissions will not be included in the results, and information about these paths without permissions will be output to the alert messages. - -Please note that the following operations require checking multiple permissions: - -1. Enabling the automatic sequence creation feature requires not only write permission for the corresponding sequence when a user inserts data into a non-existent sequence but also metadata modification permission for the sequence. - -2. When executing the "select into" statement, it is necessary to check the read permission for the source sequence and the write permission for the target sequence. It should be noted that the source sequence data may only be partially accessible due to insufficient permissions, and if the target sequence has insufficient write permissions, an error will occur, terminating the task. - -3. View permissions and data source permissions are independent. Performing read and write operations on a view will only check the permissions of the view itself and will not perform permission validation on the source path. - - -## 5. Function Syntax and Examples - -IoTDB provides composite permissions for user authorization: - -| Permission Name | Permission Scope | -|-----------------|--------------------------| -| ALL | All permissions | -| READ | READ_SCHEMA, READ_DATA | -| WRITE | WRITE_SCHEMA, WRITE_DATA | - -Composite permissions are not specific permissions themselves but a shorthand way to denote a combination of permissions, with no difference from directly specifying the corresponding permission names. - -The following series of specific use cases will demonstrate the usage of permission statements. Non-administrator users executing the following statements require obtaining the necessary permissions, which are indicated after the operation description. 
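
The composite names in the table above behave like plain shorthand, which the following Python sketch makes concrete. It is only an illustration: the `expand` helper is hypothetical, and the full privilege list behind `ALL` is assumed to be exactly the series and global permissions listed in the tables of this document.

```python
# Permission names taken from the tables in this document.
SERIES = ("READ_DATA", "WRITE_DATA", "READ_SCHEMA", "WRITE_SCHEMA")
GLOBAL = (
    "MANAGE_DATABASE", "MANAGE_USER", "MANAGE_ROLE", "USE_TRIGGER", "USE_UDF",
    "USE_CQ", "USE_PIPE", "EXTEND_TEMPLATE", "MAINTAIN", "USE_MODEL",
)

COMPOSITES = {
    "READ": ("READ_SCHEMA", "READ_DATA"),
    "WRITE": ("WRITE_SCHEMA", "WRITE_DATA"),
    "ALL": SERIES + GLOBAL,  # assumption: ALL covers every name listed above
}


def expand(names):
    """Expand composite names into individual permissions, without duplicates."""
    result = []
    for name in names:
        key = name.upper()
        for permission in COMPOSITES.get(key, (key,)):
            if permission not in result:
                result.append(permission)
    return result


print(expand(["READ"]))               # ['READ_SCHEMA', 'READ_DATA']
print(expand(["READ", "READ_DATA"]))  # ['READ_SCHEMA', 'READ_DATA']
```

Granting `READ` and granting `READ_SCHEMA, READ_DATA` therefore describe the same set of permissions in this model.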
- -### 5.1 User and Role Related - -- Create user (Requires MANAGE_USER permission) - -```SQL -CREATE USER -eg: CREATE USER user1 'passwd' -``` - -- Delete user (Requires MANAGE_USER permission) - -```sql -DROP USER -eg: DROP USER user1 -``` - -- Create role (Requires MANAGE_ROLE permission) - -```sql -CREATE ROLE -eg: CREATE ROLE role1 -``` - -- Delete role (Requires MANAGE_ROLE permission) - -```sql -DROP ROLE -eg: DROP ROLE role1 -``` - -- Grant role to user (Requires MANAGE_ROLE permission) - -```sql -GRANT ROLE TO -eg: GRANT ROLE admin TO user1 -``` - -- Revoke role from user(Requires MANAGE_ROLE permission) - -```sql -REVOKE ROLE FROM -eg: REVOKE ROLE admin FROM user1 -``` - -- List all user (Requires MANAGE_USER permission) - -```sql -LIST USER -``` - -- List all role (Requires MANAGE_ROLE permission) - -```sql -LIST ROLE -``` - -- List all users granted specific role.(Requires MANAGE_USER permission) - -```sql -LIST USER OF ROLE -eg: LIST USER OF ROLE roleuser -``` - -- List all role granted to specific user. - - Users can list their own roles, but listing roles of other users requires the MANAGE_ROLE permission. - -```sql -LIST ROLE OF USER -eg: LIST ROLE OF USER tempuser -``` - -- List all privileges of user - -Users can list their own privileges, but listing privileges of other users requires the MANAGE_USER permission. - -```sql -LIST PRIVILEGES OF USER ; -eg: LIST PRIVILEGES OF USER tempuser; -``` - -- List all privileges of role - -Users can list the permission information of roles they have, but listing permissions of other roles requires the MANAGE_ROLE permission. - -```sql -LIST PRIVILEGES OF ROLE ; -eg: LIST PRIVILEGES OF ROLE actor; -``` - -- Modify password - -Users can modify their own password, but modifying passwords of other users requires the MANAGE_USER permission. 
- -```sql -ALTER USER SET PASSWORD ; -eg: ALTER USER tempuser SET PASSWORD 'newpwd'; -``` - -### 5.2 Authorization and Deauthorization - -Users can use authorization statements to grant permissions to other users. The syntax is as follows: - -```sql -GRANT ON TO ROLE/USER [WITH GRANT OPTION]; -eg: GRANT READ ON root.** TO ROLE role1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.** TO USER user1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.**,root.t2.** TO USER user1; -eg: GRANT MANAGE_ROLE ON root.** TO USER user1 WITH GRANT OPTION; -eg: GRANT ALL ON root.** TO USER user1 WITH GRANT OPTION; -``` - -Users can use deauthorization statements to revoke permissions from others. The syntax is as follows: - -```sql -REVOKE ON FROM ROLE/USER ; -eg: REVOKE READ ON root.** FROM ROLE role1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.** FROM USER user1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.**, root.t2.** FROM USER user1; -eg: REVOKE MANAGE_ROLE ON root.** FROM USER user1; -eg: REVOKE ALL ON root.** FROM USER user1; -``` - -- **When non-administrator users execute authorization/deauthorization statements, they need to have \ permissions on \, and these permissions must be marked with WITH GRANT OPTION.** - -- When granting or revoking global permissions or when the statement contains global permissions (expanding ALL includes global permissions), you must specify the path as root.**.
For example, the following authorization/deauthorization statements are valid: - - ```sql - GRANT MANAGE_USER ON root.** TO USER user1; - GRANT MANAGE_ROLE ON root.** TO ROLE role1 WITH GRANT OPTION; - GRANT ALL ON root.** TO role role1 WITH GRANT OPTION; - REVOKE MANAGE_USER ON root.** FROM USER user1; - REVOKE MANAGE_ROLE ON root.** FROM ROLE role1; - REVOKE ALL ON root.** FROM ROLE role1; - ``` - - The following statements are invalid: - - ```sql - GRANT READ, MANAGE_ROLE ON root.t1.** TO USER user1; - GRANT ALL ON root.t1.t2 TO USER user1 WITH GRANT OPTION; - REVOKE ALL ON root.t1.t2 FROM USER user1; - REVOKE READ, MANAGE_ROLE ON root.t1.t2 FROM ROLE ROLE1; - ``` - -- \ must be a full path or a matching path ending with a double wildcard. The following paths are valid: - - ```sql - root.** - root.t1.t2.** - root.t1.t2.t3 - ``` - - The following paths are invalid: - - ```sql - root.t1.* - root.t1.**.t2 - root.t1*.t2.t3 - ``` - - - -## 6. Examples - - Based on the described [sample data](https://github.com/thulab/iotdb/files/4438687/OtherMaterial-Sample.Data.txt), IoTDB's sample data may belong to different power generation groups such as ln, sgcc, and so on. Different power generation groups do not want other groups to access their database data, so we need to implement data isolation at the group level. - -#### Create Users -Use `CREATE USER ` to create users. For example, we can create two users for the ln and sgcc groups with the root user, who has all permissions, and name them ln_write_user and sgcc_write_user. It is recommended to enclose the username in backticks. The SQL statements are as follows: -```SQL -CREATE USER `ln_write_user` 'write_pwd' -CREATE USER `sgcc_write_user` 'write_pwd' -``` - -Now, using the SQL statement to display users: - -```sql -LIST USER -``` - -We can see that these two users have been created, and the result is as follows: - -```sql -IoTDB> CREATE USER `ln_write_user` 'write_pwd' -Msg: The statement is executed successfully. 
-IoTDB> CREATE USER `sgcc_write_user` 'write_pwd' -Msg: The statement is executed successfully. -IoTDB> LIST USER; -+---------------+ -| user| -+---------------+ -| ln_write_user| -| root| -|sgcc_write_user| -+---------------+ -Total line number = 3 -It costs 0.012s -``` - -#### Granting Permissions to Users - -At this point, although two users have been created, they do not have any permissions, so they cannot operate on the database. For example, if we use the ln_write_user to write data to the database, the SQL statement is as follows: - -```sql -INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true) -``` - -At this point, the system does not allow this operation, and an error is displayed: - -```sql -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true) -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -Now, we will grant each user write permissions to the corresponding paths using the root user. - -We use the `GRANT ON TO USER ` statement to grant permissions to users, for example: - -```sql -GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user` -GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user` -``` - -The execution status is as follows: - -```sql -IoTDB> GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user` -Msg: The statement is executed successfully. -IoTDB> GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user` -Msg: The statement is executed successfully. -``` - -Then, using ln_write_user, try to write data again: - -```sql -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true) -Msg: The statement is executed successfully. -``` - -#### Revoking User Permissions - -After granting user permissions, we can use the `REVOKE ON FROM USER ` to revoke the permissions granted to users. 
For example, using the root user to revoke the permissions of ln_write_user and sgcc_write_user: - -```sql -REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user` -REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user` -``` - - -The execution status is as follows: - -```sql -IoTDB> REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user` -Msg: The statement is executed successfully. -IoTDB> REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user` -Msg: The statement is executed successfully. -``` - -After revoking the permissions, ln_write_user no longer has the permission to write data to root.ln.**: - -```sql -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true) -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -## 7. Other Explanations - -Roles are collections of permissions, and both permissions and roles are attributes of users. In other words, a role can have multiple permissions, and a user can have multiple roles and permissions (referred to as the user's self-permissions). - -Currently, in IoTDB, there are no conflicting permissions. Therefore, the actual permissions a user has are the union of their self-permissions and the permissions of all their roles. In other words, to determine if a user can perform a certain operation, it's necessary to check whether their self-permissions or the permissions of all their roles allow that operation. Self-permissions, role permissions, and the permissions of multiple roles a user has may contain the same permission, but this does not have any impact. - -It's important to note that if a user has a certain permission (corresponding to operation A) on their own, and one of their roles has the same permission, revoking the permission from the user alone will not prevent the user from performing operation A. 
To prevent the user from performing operation A, you need to revoke the permission from both the user and the role, or remove the user from the role that has the permission. Similarly, if you only revoke the permission from the role, it won't prevent the user from performing operation A if they have the same permission on their own. - -At the same time, changes to roles will be immediately reflected in all users who have that role. For example, adding a certain permission to a role will immediately grant that permission to all users who have that role, and removing a certain permission will cause those users to lose that permission (unless the user has it on their own). - - - -## 8. Upgrading from a previous version - -Before version 1.3, there were many different permission types. In 1.3 version's implementation, we have streamlined the permission types. - -The permission paths in version 1.3 of the database must be either full paths or matching paths ending with a double wildcard. During system upgrades, any invalid permission paths and permission types will be automatically converted. The first invalid node on the path will be replaced with "**", and any unsupported permission types will be mapped to the permissions supported by the current system. 
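
The path conversion rule above can be sketched in shell. This is only an illustration of the stated rule, not the upgrade code itself: `convert_permission_path` is a hypothetical name, and dropping every node after the replaced one is inferred from the example rows in the mapping table of this section (for instance `root.db.t2.*.t3` becoming `root.db.t2.**`). Permission-type mapping (such as global permissions moving to `root.**`) is not modelled.

```bash
# Hypothetical helper: rewrite an old permission path into the form accepted by 1.3
# (a full path, or a path ending in "**").
convert_permission_path() {
  local path=$1 node out=""
  local -a nodes
  # Split the path into its dot-separated nodes.
  IFS=. read -r -a nodes <<< "$path"
  for node in "${nodes[@]}"; do
    if [[ $node == *'*'* ]]; then
      # First node containing a wildcard: replace it with "**" and
      # drop everything after it (assumption inferred from the table).
      printf '%s\n' "${out:+$out.}**"
      return 0
    fi
    out="${out:+$out.}$node"
  done
  # No wildcard at all: the path is already a valid full path.
  printf '%s\n' "$out"
}
```

For example, `convert_permission_path 'root.db.t2*c.t3'` prints `root.db.**`, while a full path such as `root.t1.t2.t3` is printed unchanged.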
- -| Permission | Path | Mapped-Permission | Mapped-path | -|-------------------|-----------------|-------------------|---------------| -| CREATE_DATABASE | root.db.t1.* | MANAGE_DATABASE | root.** | -| INSERT_TIMESERIES | root.db.t2.*.t3 | WRITE_DATA | root.db.t2.** | -| CREATE_TIMESERIES | root.db.t2*c.t3 | WRITE_SCHEMA | root.db.** | -| LIST_ROLE | root.** | (ignore) | | - - - -You can refer to the table below for a comparison of permission types between the old and new versions (where "-- IGNORE" indicates that the new version ignores that permission): - -| Permission Name | Path-Related | New Permission Name | Path-Related | -|---------------------------|--------------|---------------------|--------------| -| CREATE_DATABASE | YES | MANAGE_DATABASE | NO | -| INSERT_TIMESERIES | YES | WRITE_DATA | YES | -| UPDATE_TIMESERIES | YES | WRITE_DATA | YES | -| READ_TIMESERIES | YES | READ_DATA | YES | -| CREATE_TIMESERIES | YES | WRITE_SCHEMA | YES | -| DELETE_TIMESERIES | YES | WRITE_SCHEMA | YES | -| CREATE_USER | NO | MANAGE_USER | NO | -| DELETE_USER | NO | MANAGE_USER | NO | -| MODIFY_PASSWORD | NO | -- IGNORE | | -| LIST_USER | NO | -- IGNORE | | -| GRANT_USER_PRIVILEGE | NO | -- IGNORE | | -| REVOKE_USER_PRIVILEGE | NO | -- IGNORE | | -| GRANT_USER_ROLE | NO | MANAGE_ROLE | NO | -| REVOKE_USER_ROLE | NO | MANAGE_ROLE | NO | -| CREATE_ROLE | NO | MANAGE_ROLE | NO | -| DELETE_ROLE | NO | MANAGE_ROLE | NO | -| LIST_ROLE | NO | -- IGNORE | | -| GRANT_ROLE_PRIVILEGE | NO | -- IGNORE | | -| REVOKE_ROLE_PRIVILEGE | NO | -- IGNORE | | -| CREATE_FUNCTION | NO | USE_UDF | NO | -| DROP_FUNCTION | NO | USE_UDF | NO | -| CREATE_TRIGGER | YES | USE_TRIGGER | NO | -| DROP_TRIGGER | YES | USE_TRIGGER | NO | -| START_TRIGGER | YES | USE_TRIGGER | NO | -| STOP_TRIGGER | YES | USE_TRIGGER | NO | -| CREATE_CONTINUOUS_QUERY | NO | USE_CQ | NO | -| DROP_CONTINUOUS_QUERY | NO | USE_CQ | NO | -| ALL | NO | All privileges | | -| DELETE_DATABASE | YES | MANAGE_DATABASE | NO | -|
ALTER_TIMESERIES | YES | WRITE_SCHEMA | YES | -| UPDATE_TEMPLATE | NO | -- IGNORE | | -| READ_TEMPLATE | NO | -- IGNORE | | -| APPLY_TEMPLATE | YES | WRITE_SCHEMA | YES | -| READ_TEMPLATE_APPLICATION | NO | -- IGNORE | | -| SHOW_CONTINUOUS_QUERIES | NO | -- IGNORE | | -| CREATE_PIPEPLUGIN | NO | USE_PIPE | NO | -| DROP_PIPEPLUGINS | NO | USE_PIPE | NO | -| SHOW_PIPEPLUGINS | NO | -- IGNORE | | -| CREATE_PIPE | NO | USE_PIPE | NO | -| START_PIPE | NO | USE_PIPE | NO | -| STOP_PIPE | NO | USE_PIPE | NO | -| DROP_PIPE | NO | USE_PIPE | NO | -| SHOW_PIPES | NO | -- IGNORE | | -| CREATE_VIEW | YES | WRITE_SCHEMA | YES | -| ALTER_VIEW | YES | WRITE_SCHEMA | YES | -| RENAME_VIEW | YES | WRITE_SCHEMA | YES | -| DELETE_VIEW | YES | WRITE_SCHEMA | YES | diff --git a/src/UserGuide/Master/Tree/User-Manual/Auto-Start-On-Boot_timecho.md b/src/UserGuide/Master/Tree/User-Manual/Auto-Start-On-Boot_timecho.md deleted file mode 100644 index 1fbcd4c6f..000000000 --- a/src/UserGuide/Master/Tree/User-Manual/Auto-Start-On-Boot_timecho.md +++ /dev/null @@ -1,213 +0,0 @@ - - -# Auto-start on Boot -## 1. Overview -TimechoDB supports registering ConfigNode, DataNode, and AINode as Linux system services via the three scripts `daemon-confignode.sh`, `daemon-datanode.sh`, and `daemon-ainode.sh`. Combined with the system-built `systemctl` command, it manages the TimechoDB cluster in daemon mode, enabling more convenient startup, shutdown, restart, and auto-start on boot operations, and improving service stability. - -> Note: This feature is available starting from version 2.0.9.1. - -## 2. Environment Requirements -| Item | Specification | -|--------------|-------------------------------------------------------------------------------| -| OS | Linux (supports the `systemctl` command) | -| User Privilege | root user | -| Environment Variable | `JAVA_HOME` must be set before deploying ConfigNode and DataNode | - -## 3. 
Service Registration -Enter the TimechoDB installation directory and execute the corresponding daemon script: - -```bash -# Register ConfigNode service -./tools/ops/daemon-confignode.sh - -# Register DataNode service -./tools/ops/daemon-datanode.sh - -# Register AINode service -./tools/ops/daemon-ainode.sh -``` - -During script execution, you will be prompted with two options: -1. Whether to start the corresponding TimechoDB service immediately (timechodb-confignode / timechodb-datanode / timechodb-ainode); -2. Whether to register the corresponding service for auto-start on boot. - -After script execution, the corresponding service files will be generated in the `/etc/systemd/system/` directory: -- `timechodb-confignode.service` -- `timechodb-datanode.service` -- `timechodb-ainode.service` - -## 4. Service Management -After service registration, you can use `systemctl` commands to start, stop, restart, check status, and configure auto-start on boot for each TimechoDB node service. All commands below must be executed as the root user. - -### 4.1 Manual Service Startup -```bash -# Start ConfigNode service -systemctl start timechodb-confignode -# Start DataNode service -systemctl start timechodb-datanode -# Start AINode service -systemctl start timechodb-ainode -``` - -### 4.2 Manual Service Shutdown -```bash -# Stop ConfigNode service -systemctl stop timechodb-confignode -# Stop DataNode service -systemctl stop timechodb-datanode -# Stop AINode service -systemctl stop timechodb-ainode -``` - -After stopping the service, check the service status. If it shows `inactive (dead)`, the service has been shut down successfully. For other statuses, check TimechoDB logs to analyze exceptions. 
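
The check described for section 4.2 can be scripted. The following is a minimal sketch, assuming the unit names generated by the daemon scripts; `confirm_stopped` is a hypothetical helper, not something shipped with TimechoDB. It relies on `systemctl is-active`, which prints the short state (`inactive`, `active`, `failed`, and so on) rather than the longer `inactive (dead)` text shown by `systemctl status`.

```bash
# Hypothetical helper: confirm that a node service has really stopped.
confirm_stopped() {
  local unit=$1 state
  # is-active exits non-zero for any state other than "active",
  # so the exit code is ignored and only the printed state is used.
  state=$(systemctl is-active "$unit" || true)
  if [ "$state" = "inactive" ]; then
    echo "$unit stopped cleanly"
  else
    echo "$unit is '$state'; check the TimechoDB logs" >&2
    return 1
  fi
}
```

A typical use would be `systemctl stop timechodb-datanode && confirm_stopped timechodb-datanode`, run as root like the other commands in this section.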
- -### 4.3 Check Service Status -```bash -# Check ConfigNode service status -systemctl status timechodb-confignode -# Check DataNode service status -systemctl status timechodb-datanode -# Check AINode service status -systemctl status timechodb-ainode -``` - -Status Description: -- `active (running)`: Service is running. If this status persists for 10 minutes, the service has started successfully. -- `failed`: Service startup failed. Check TimechoDB logs for troubleshooting. - -### 4.4 Restart Service -Restarting a service is equivalent to stopping and then starting it. Commands are as follows: -```bash -# Restart ConfigNode service -systemctl restart timechodb-confignode -# Restart DataNode service -systemctl restart timechodb-datanode -# Restart AINode service -systemctl restart timechodb-ainode -``` - -### 4.5 Enable Auto-start on Boot -```bash -# Enable ConfigNode auto-start on boot -systemctl enable timechodb-confignode -# Enable DataNode auto-start on boot -systemctl enable timechodb-datanode -# Enable AINode auto-start on boot -systemctl enable timechodb-ainode -``` - -### 4.6 Disable Auto-start on Boot -```bash -# Disable ConfigNode auto-start on boot -systemctl disable timechodb-confignode -# Disable DataNode auto-start on boot -systemctl disable timechodb-datanode -# Disable AINode auto-start on boot -systemctl disable timechodb-ainode -``` - -## 5. Custom Service Configuration -### 5.1 Customization Methods -#### 5.1.1 Method 1: Modify the Script -1. Modify the `[Unit]`, `[Service]`, and `[Install]` sections in the `daemon-xxx.sh` script. For details of configuration items, refer to the next section. -2. Execute the `daemon-xxx.sh` script. - -#### 5.1.2 Method 2: Modify the Service File -1. Modify the `xx.service` file in `/etc/systemd/system`. -2. Execute `systemctl daemon-reload`. 
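
Method 2 can be wrapped in a small helper, sketched below. `apply_unit_change` is a hypothetical name, and the restart step is an assumption added here: `systemctl daemon-reload` only makes systemd re-read the unit files, so a service that is already running generally keeps its old settings until it is restarted.

```bash
# Hypothetical helper: apply an edited unit file for one node service.
apply_unit_change() {
  local unit=$1
  # Make systemd re-read the unit files under /etc/systemd/system.
  systemctl daemon-reload || return 1
  # Assumption: restart so the running service picks up the new settings.
  systemctl restart "$unit" || return 1
  # Print the resulting state (active, failed, ...).
  systemctl is-active "$unit"
}
```

For example, after editing `/etc/systemd/system/timechodb-confignode.service`, run `apply_unit_change timechodb-confignode` as root.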
- -### 5.2 `daemon-xxx.sh` Configuration Items -#### 5.2.1 `[Unit]` Section (Service Metadata) -| Item | Description | -|---------------|-----------------------------------------------------------------------------| -| Description | Service description | -| Documentation | Link to the official TimechoDB documentation | -| After | Ensures the service starts only after the network service has started | - -#### 5.2.2 `[Service]` Section (Service Runtime Configuration) -| Item | Meaning | -|-------------------------------------------|-----------------------------------------------------------------------------------------------------------| -| StandardOutput, StandardError | Specify storage paths for service standard output and error logs | -| LimitNOFILE=65536 | Set the maximum number of file descriptors, default value is 65536 | -| Type=simple | Service type is a simple foreground process; systemd tracks the main service process | -| User=root, Group=root | Run the service with root user and group permissions | -| ExecStart / ExecStop | Specify the paths of the service startup and shutdown scripts respectively | -| Restart=on-failure | Automatically restart the service only if it exits abnormally | -| SuccessExitStatus=143 | Treat exit code 143 (128+15, normal termination via SIGTERM) as a successful exit | -| RestartSec=5 | Interval between service restarts, default 5 seconds | -| StartLimitInterval=600s, StartLimitBurst=3 | Maximum 3 restarts within 10 minutes (600 seconds) to prevent excessive resource consumption from frequent restarts | -| RestartPreventExitStatus=SIGKILL | Do not auto-restart the service if killed by the SIGKILL signal, avoiding infinite restart of zombie processes | - -#### 5.2.3 `[Install]` Section (Installation Configuration) -| Item | Meaning | -|-----------------------|----------------------------------------------------------------------| -| WantedBy=multi-user.target | Start the service automatically when the system enters multi-user 
mode | - -### 5.3 Sample `.service` File Format -```bash -[Unit] -Description=timechodb-confignode -Documentation=https://www.timecho.com/ -After=network.target - -[Service] -StandardOutput=null -StandardError=null -LimitNOFILE=65536 -Type=simple -User=root -Group=root -Environment=JAVA_HOME=$JAVA_HOME -ExecStart=$TimechoDB_SBIN_HOME/start-confignode.sh -Restart=on-failure -SuccessExitStatus=143 -RestartSec=5 -StartLimitInterval=600s -StartLimitBurst=3 -RestartPreventExitStatus=SIGKILL - -[Install] -WantedBy=multi-user.target -``` - -Note: The above is the standard format of the `timechodb-confignode.service` file. The formats of `timechodb-datanode.service` and `timechodb-ainode.service` are similar. - -## 6. Notes -1. **Process Daemon Mechanism** - - **Auto-restart**: The system will auto-restart the service if it fails to start or exits abnormally during runtime (e.g., OOM). - - **No restart**: Normal exits (e.g., executing `kill`, `./sbin/stop-xxx.sh`, or `systemctl stop`) will not trigger auto-restart. - -2. **Log Location** - - All runtime logs are stored in the `logs` folder under the TimechoDB installation directory. Refer to this directory for troubleshooting. - -3. **Cluster Status Check** - - After service startup, execute `./sbin/start-cli.sh` and run the `show cluster` command to view the cluster status. - -4. **Fault Recovery Procedure** - - If the service status is `failed`, after fixing the issue, **you must first execute `systemctl daemon-reload`** before running `systemctl start`, otherwise startup will fail. - -5. **Configuration Activation** - - After modifying the `daemon-xxx.sh` script, execute `systemctl daemon-reload` to re-register the service for new configurations to take effect. - -6. **Startup Mode Compatibility** - - Services started via `systemctl start` can be stopped using `./sbin/stop` (no restart triggered). - - Processes started via `./sbin/start` cannot be monitored via `systemctl`. 
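
Note 1 treats a plain `kill` as a normal exit because `SuccessExitStatus=143` is set in the unit file: a process ended by SIGTERM (signal 15) is reported with status 128 + 15. The sketch below only demonstrates that arithmetic in a shell; it uses `sleep` as a stand-in process and does not touch any TimechoDB service. `term_exit_status` is a hypothetical name.

```bash
# Hypothetical demo: show the exit status a shell reports for a SIGTERM-ed child.
term_exit_status() {
  local pid status=0
  # Stand-in for a long-running service process.
  sleep 30 &
  pid=$!
  # Plain `kill` sends SIGTERM by default; spelled out here for clarity.
  kill -TERM "$pid"
  # wait returns 128 + signal number when the child was killed by a signal.
  wait "$pid" || status=$?
  echo "$status"
}
```

Calling `term_exit_status` prints `143`, the value that the unit file lists as a successful exit so that `Restart=on-failure` does not restart the service after a deliberate stop.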
\ No newline at end of file diff --git a/src/UserGuide/Master/Tree/User-Manual/Black-White-List_timecho.md b/src/UserGuide/Master/Tree/User-Manual/Black-White-List_timecho.md deleted file mode 100644 index 2692edd4a..000000000 --- a/src/UserGuide/Master/Tree/User-Manual/Black-White-List_timecho.md +++ /dev/null @@ -1,78 +0,0 @@ - - -# Black White List - -## 1. Introduction - -IoTDB is a time-series database designed for IoT scenarios, supporting efficient data storage, query, and analysis. With the widespread application of IoT technology, data security and access control have become critical. In open environments, ensuring secure data access for legitimate users presents a key challenge. The whitelist mechanism allows only trusted IPs or users to connect, reducing the attack surface at the source. The blacklist function can block malicious IPs in real time in edge-cloud collaborative scenarios, preventing unauthorized access, SQL injection, brute‑force attacks, DDoS, and other threats, thereby providing continuous and stable security for data transmission. - -> Note: This feature is available starting from version 2.0.6. - -## 2. Whitelist - -### 2.1 Function Description - -By enabling the whitelist function and configuring the whitelist, client addresses allowed to connect to IoTDB are specified. Only clients within the whitelist can access IoTDB, achieving security control. - -### 2.2 Configuration Parameters - -Administrators can enable/disable the whitelist function and add, modify, or delete whitelist IPs/IP segments in the following two ways: - -* Edit the configuration file `iotdb‑system.properties`. -* Use the `set configuration` statement. 
- * Tree model reference: [set configuration](../Reference/Modify-Config-Manual.md) - -Related parameters are as follows: - -| Name | Description | Default Value | Effective Mode | Example | -| ----------------- | ----------------------------------------------------------------------------------------------------------------------------------- | --------------- | ---------------- | ------------------------------------------------------------------- | -| `enable_white_list` | Whether to enable the whitelist function. true: enable; false: disable. The value is case‑insensitive. | false | Hot reload | `set enable_white_list = 'true'` | -| `white_ip_list` | Add, modify, or delete whitelist IPs/IP segments. Supports exact match and the \* wildcard. Multiple IPs are separated by commas. | empty | Hot reload | `set white_ip_list='192.168.1.200,192.168.1.201,192.168.1.*'` | - -## 3. Blacklist - -### 3.1 Function Description - -By enabling the blacklist function and configuring the blacklist, certain specific IP addresses are prevented from accessing the database, guarding against unauthorized access, SQL injection, brute‑force attacks, DDoS attacks, and other security threats, thereby ensuring the security and stability of data transmission. - -### 3.2 Configuration Parameters - -Administrators can enable/disable the blacklist function and add, modify, or delete blacklist IPs/IP segments in the following two ways: - -* Edit the configuration file `iotdb‑system.properties`. -* Use the `set configuration`statement. 
- * Tree model reference:[set configuration](../Reference/Modify-Config-Manual.md) - -Related parameters are as follows: - -| Name | Description | Default Value | Effective Mode | Example | -|---------------------| ----------------------------------------------------------------------------------------------------------------------------------- | --------------- | ---------------- | ------------------------------------------------------------------- | -| `enable_black_list` | Whether to enable the blacklist function. true: enable; false: disable. The value is case‑insensitive. | false | Hot reload | `set enable_black_list = 'true'` | -| `black_ip_list` | Add, modify, or delete blacklist IPs/IP segments. Supports exact match and the \* wildcard. Multiple IPs are separated by commas. | empty | Hot reload | `set black_ip_list='192.168.1.200,192.168.1.201,192.168.1.*'` | - -## 4. Notes - -1. After the whitelist is enabled, if the list is empty, all connections are denied. If the local IP is not included, local login is denied. -2. When the same IP appears in both the whitelist and blacklist, the blacklist takes precedence. -3. The system validates the IP format. Invalid entries will cause an error when the user connects and be skipped, without affecting the loading of other valid IPs. -4. Duplicate IPs in the configuration are supported; they are automatically deduplicated in memory without notification. For manual deduplication, edit the configuration accordingly. -5. Blacklist/whitelist rules only apply to new connections. Existing connections before enabling the function are not affected; they will be intercepted only upon subsequent reconnection. 
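
The interaction between notes 1 and 2 can be sketched as a small decision function. This is an illustration of the documented rules only, not IoTDB's implementation: `ip_in_list` and `connection_allowed` are hypothetical names, and the `*` wildcard is approximated with shell pattern matching, which may be more permissive than the server's own matching.

```bash
# Returns 0 if IP ($1) matches any entry of a comma-separated list ($2).
ip_in_list() {
  local ip=$1 entry
  local -a entries
  IFS=, read -r -a entries <<< "$2"
  for entry in "${entries[@]}"; do
    # $entry is left unquoted so that `*` acts as a wildcard in the pattern.
    case "$ip" in
      $entry) return 0 ;;
    esac
  done
  return 1
}

# Usage: connection_allowed IP WHITE_ENABLED WHITE_LIST BLACK_ENABLED BLACK_LIST
connection_allowed() {
  local ip=$1 white_on=$2 white=$3 black_on=$4 black=$5
  # Note 2: the blacklist takes precedence over the whitelist.
  if [ "$black_on" = true ] && ip_in_list "$ip" "$black"; then
    return 1
  fi
  # Note 1: an enabled whitelist that is empty, or does not list the IP, denies it.
  if [ "$white_on" = true ] && ! ip_in_list "$ip" "$white"; then
    return 1
  fi
  return 0
}
```

With these rules, `connection_allowed 192.168.1.200 true '192.168.1.*' true '192.168.1.200'` fails (blacklisted even though the whitelist matches), and any address fails when the whitelist is enabled but empty.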
diff --git a/src/UserGuide/Master/Tree/User-Manual/Data-Sync_timecho.md b/src/UserGuide/Master/Tree/User-Manual/Data-Sync_timecho.md deleted file mode 100644 index b3bf1debd..000000000 --- a/src/UserGuide/Master/Tree/User-Manual/Data-Sync_timecho.md +++ /dev/null @@ -1,744 +0,0 @@ - - -# Data Sync - -Data synchronization is a typical requirement in industrial Internet of Things (IoT). Through data synchronization mechanisms, it is possible to achieve data sharing between IoTDB, and to establish a complete data link to meet the needs for internal and external network data interconnectivity, edge-cloud synchronization, data migration, and data backup. - -## 1. Function Overview - -### 1.1 Data Synchronization - -A data synchronization task consists of three stages: - -![](/img/data-sync-new.png) - -- Source Stage:This part is used to extract data from the source IoTDB, defined in the source section of the SQL statement. -- Process Stage:This part is used to process the data extracted from the source IoTDB, defined in the processor section of the SQL statement. -- Sink Stage:This part is used to send data to the target IoTDB, defined in the sink section of the SQL statement. - -By declaratively configuring the specific content of the three parts through SQL statements, flexible data synchronization capabilities can be achieved. Currently, data synchronization supports the synchronization of the following information, and you can select the synchronization scope when creating a synchronization task (the default is data.insert, which means synchronizing newly written data): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Synchronization Scope | | Synchronization Content Description |
|-----------------------|------------|---------------------------------------------------------------------|
| all | | All scopes |
| data (Data) | insert | Synchronize newly written data |
| | delete | Synchronize deleted data |
| schema | database | Synchronize database creation, modification or deletion operations |
| | timeseries | Synchronize the definition and attributes of time series |
| | TTL | Synchronize the data retention time |
| auth | - | Synchronize user permissions and access control |
- -### 1.2 Functional limitations and instructions - -1. The schema and auth synchronization functions have the following limitations: - -- When using schema synchronization, it is required that the consensus protocol for `Schema region` and `ConfigNode` must be the default ratis protocol. This means that the `iotdb-system.properties` configuration file should contain the settings `config_node_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus` and `schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus`. If these are not specified, the default ratis protocol is used. - -- To prevent potential conflicts, please disable the automatic creation of schema on the receiving end when enabling schema synchronization. This can be done by setting the `enable_auto_create_schema` configuration in the `iotdb-system.properties` file to false. - -- When schema synchronization is enabled, the use of custom plugins is not supported. - -- In a dual-active cluster, schema synchronization should avoid simultaneous operations on both ends. - -- During data synchronization tasks, please avoid performing any deletion operations to prevent inconsistent states between the two ends. - -2. Pipe Permission Control Specifications - -- When creating a pipe, a username and password can be specified for the extraction/write‑back plugins. If the password is incorrect, creation is prohibited. If not specified, the current user is used for synchronization by default. - -- During data/metadata synchronization, filtering is first performed according to the path pattern (pattern/path) configured in the Pipe, followed by authentication based on the user’s read permissions: - - If the permission scope is greater than or equal to the write path: full synchronization. - - If the permission scope has no intersection with the write path: no synchronization. 
- - If the permission scope is smaller than the write path or overlaps partially: synchronize only the intersecting part. - -- When encountering data for which the user lacks permission: - - If the sender’s skipIf=no‑privileges, the unauthorized data is skipped. - - If skipIf is left empty (unconfigured), the task reports an error (Error 803). - - Note: This skipIf configuration is independent of the receiver’s skipIf setting (which defaults to empty). - -- Data under root.__system and root.__audit will not be synchronized. - -3. Automatic Type Conversion for Pipe Sink - -When Pipe fails to write data to the sink due to field type mismatches, IoTDB automatically converts the data to the field types defined in the existing sink schema and retries the write operation to improve synchronization success rate. This feature is controlled by the parameter `sink.exception.data.convert-on-type-mismatch`. Refer to the subsequent sink parameter table for detailed parameter descriptions. - -The conversion rules for type mismatches are as follows: - -| Source Type | Target Type | Conversion Rule | -|---------------------|-------------|---------------------------------------------------------------------------------| -| Numeric Type | Numeric Type| Convert to the target numeric type. Truncation, precision loss or overflow may occur. | -| Numeric Type | BOOLEAN | `0` is converted to `false`; non-zero values are converted to `true`. | -| BOOLEAN | Numeric Type| `true` is converted to `1`; `false` is converted to `0`. | -| TEXT, STRING, BLOB | BOOLEAN | Parse the string into a BOOLEAN value. | -| TEXT, STRING, BLOB | Numeric Type| Parse the string into the target numeric type. If parsing fails, write the default value `0`, `0L` or `0.0`. | -| TEXT, STRING, BLOB | TIMESTAMP | Parse the string into a TIMESTAMP value. If parsing fails, write the default value `0L`. | -| TEXT, STRING, BLOB | DATE | Parse the string into a DATE value. 
If parsing fails, write the default date `1970-01-01`. | -| Invalid Numeric Value | DATE | If conversion to a valid DATE fails, write the default date `1970-01-01`. | -| DATE | TIMESTAMP | Convert to the timestamp of 00:00 (UTC) on the same day. | -| TIMESTAMP | DATE | Convert to the corresponding date in UTC. | - -> **Note**: Automatic conversion is performed based on the existing sink schema and will **not** modify the sink schema. This feature prioritizes continuous data synchronization, which may result in precision loss or writing of default values. - - -## 2. Usage Instructions - -Data synchronization tasks have three states: RUNNING, STOPPED, and DROPPED. The task state transitions are shown in the following diagram: - -![](/img/Data-Sync02.png) - -After creation, the task will start directly, and when the task stops abnormally, the system will automatically attempt to restart the task. - -Provide the following SQL statements for state management of synchronization tasks. - -### 2.1 Create Task - -Use the `CREATE PIPE` statement to create a data synchronization task. The `PipeId` and `sink` attributes are required, while `source` and `processor` are optional. When entering the SQL, note that the order of the `SOURCE` and `SINK` plugins cannot be swapped. - -The SQL example is as follows: - -```SQL -CREATE PIPE [IF NOT EXISTS] -- PipeId is the name that uniquely identifies the task. --- Data extraction plugin, optional plugin -WITH SOURCE ( - [ = ,], -) --- Data processing plugin, optional plugin -WITH PROCESSOR ( - [ = ,], -) --- Data connection plugin, required plugin -WITH SINK ( - [ = ,], -) -``` - -**IF NOT EXISTS semantics**: Used in creation operations to ensure that the create command is executed when the specified Pipe does not exist, preventing errors caused by attempting to create an existing Pipe. - -**Note**: - -Starting from V2.0.8, when creating a full data synchronization Pipe (e.g. 
PipeId: `alldatapipe`), the system will automatically split it into two independent Pipes:
-
-* History Pipe: The PipeId is the original name plus the suffix `_history` (e.g. `alldatapipe_history`). The source parameter carries the default configurations: `'realtime.enable'='false', 'inclusion'='data.insert', 'inclusion.exclusion'=''`
-* Realtime Pipe: The PipeId is the original name plus the suffix `_realtime` (e.g. `alldatapipe_realtime`). The source parameter carries the default configuration: `'history.enable'='false'`. If metadata synchronization is configured, the Realtime Pipe is responsible for sending the metadata.
-
-After successful creation, the original PipeId (e.g. `alldatapipe`) is no longer a valid identifier. When performing task operations such as starting, stopping, deleting, or viewing, you must use the split independent PipeId (i.e. `*_history` or `*_realtime`). For operation examples, see the [View Task](./Data-Sync_timecho.md#_2-5-view-task) section.
-
-### 2.2 Start Task
-
-Start processing data:
-
-```SQL
-START PIPE <PipeId>
-```
-
-### 2.3 Stop Task
-
-Stop processing data:
-
-```SQL
-STOP PIPE <PipeId>
-```
-
-### 2.4 Delete Task
-
-Delete the specified task:
-
-```SQL
-DROP PIPE [IF EXISTS] <PipeId>
-```
-**IF EXISTS semantics**: Used in deletion operations so that the delete command is executed only when the specified Pipe exists, preventing errors caused by attempting to delete a non-existent Pipe.
-
-Deleting a task does not require stopping the synchronization task first.
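
The start, stop, and delete statements can be combined with the split PipeIds described in 2.1. The sketch below assumes a full data synchronization pipe was created under the name `alldatapipe` on V2.0.8 or later, so that only `alldatapipe_history` and `alldatapipe_realtime` exist. The names are the illustrative ones used in this section, not required values.

```sql
-- Assumption: `alldatapipe` was split into the two pipes below at creation time.
-- The original PipeId `alldatapipe` is no longer valid in these statements.

-- Pause both halves of the synchronization.
STOP PIPE alldatapipe_history;
STOP PIPE alldatapipe_realtime;

-- Resume only the realtime half.
START PIPE alldatapipe_realtime;

-- Remove the history half; IF EXISTS avoids an error if it is already gone.
DROP PIPE IF EXISTS alldatapipe_history;
```

Each statement targets one split pipe, so the two halves can be managed independently.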
-
-### 2.5 View Task
-
-View all tasks:
-
-```SQL
-SHOW PIPES
-```
-
-To view a specified task:
-
-```SQL
-SHOW PIPE <PipeId>
-```
-
-Example of the `SHOW PIPES` result:
-
-```SQL
-+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+
-|                              ID|           CreationTime|  State|PipeSource|PipeProcessor|                                                   PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds|
-+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+
-|59abf95db892428b9d01c5fa318014ea|2024-06-17T14:03:44.189|RUNNING|        {}|           {}|{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}|                |                128|                     1.03|
-+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+
-```
-
-The meaning of each column is as follows:
-
-- **ID**: The unique identifier of the synchronization task
-- **CreationTime**: The time when the synchronization task was created
-- **State**: The state of the synchronization task
-- **PipeSource**: The source of the synchronized data stream
-- **PipeProcessor**: The processing logic applied to the synchronized data stream during transmission
-- **PipeSink**: The destination of the synchronized data stream
-- **ExceptionMessage**: The exception information of the synchronization task
-- **RemainingEventCount (Statistics with Delay)**: The number of remaining events, which is the total count of all events in the current data synchronization task, including data and schema synchronization events, as well as system and user-defined events.
-- **EstimatedRemainingSeconds (Statistics with Delay)**: The estimated remaining time, based on the current number of events and the rate at the pipe, to complete the transfer. - -Example: - -In V2.0.8 and later versions, create a full data synchronization task and view the task details. - -```sql -IoTDB> create pipe alldatapipe with source('inclusion'='all','exclusion'='auth') with sink('node-urls'='127.0.0.1:6668') - -IoTDB> show pipe alldatapipe_history -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_history|2025-12-18T15:06:16.697|RUNNING|{exclusion=auth, history.enable=true, inclusion=data.insert, inclusion.exclusion=, realtime.enable=false}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -IoTDB> show pipe alldatapipe_realtime -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| 
-+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_realtime|2025-12-18T15:06:16.312|RUNNING|{exclusion=auth, history.enable=false, inclusion=all, realtime.enable=true}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -``` - - -### 2.6 Modify Task - -The `ALTER PIPE` statement dynamically updates an existing PIPE and supports modifying or replacing the configuration of source, processor, and sink. - -```SQL -ALTER PIPE [IF EXISTS] - MODIFY/REPLACE SOURCE(...) - MODIFY/REPLACE PROCESSOR(...) - MODIFY/REPLACE SINK(...) -``` - -Description: - -* Executing this operation does not change the running state of the PIPE. It is equivalent to keeping the processing progress of the original PipeId and creating a new PIPE at the original progress position. -* The modify/replace parameters for source/processor/sink are all optional. If no modification parameter is specified, it is equivalent to deleting the current PIPE and recreating it with the original configuration and progress. -* For a plugin specified with modify, the plugin's other parameters are retained, and only the given parameters are replaced or added. -* For a plugin specified with replace, all parameters of the plugin are replaced directly. -* When the [IF EXISTS] keyword is used, execution succeeds even if no Pipe with the same name exists, but no operation is actually performed. 
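
As a hedged sketch of the MODIFY behaviour described above, the statement below would change only `node-urls` on the sink of a pipe and keep the sink's other parameters. The pipe name `A2B` and the address are placeholders chosen for illustration.

```sql
-- Placeholder pipe name and target address.
-- MODIFY keeps the sink's other parameters and only replaces or adds
-- the parameters listed here; REPLACE would discard the unlisted ones.
ALTER PIPE A2B MODIFY SINK ('node-urls' = '127.0.0.1:6669');
```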
- -Example: - -```SQL -ALTER PIPE A2B REPLACE SINK ('sink'='iotdb-thrift-sink', 'node-urls' = '127.0.0.1:6668'); -``` - -### 2.7 Synchronization Plugins - -To make the overall architecture more flexible to match different synchronization scenario requirements, we support plugin assembly within the synchronization task framework. The system comes with some pre-installed common plugins that you can use directly. At the same time, you can also customize processor plugins and Sink plugins, and load them into the IoTDB system for use. You can view the plugins in the system (including custom and built-in plugins) with the following statement: - -```SQL -SHOW PIPEPLUGINS -``` - -The return result is as follows (version 1.3.2): - -```SQL -IoTDB> SHOW PIPEPLUGINS -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| PluginName|PluginType| ClassName| PluginJar| -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.processor.donothing.DoNothingProcessor| | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.donothing.DoNothingConnector| | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector| | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.extractor.iotdb.IoTDBExtractor| | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector| | -| IOTDB-THRIFT-SSL-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector| | 
-+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ - -``` - -Detailed introduction of pre-installed plugins is as follows (for detailed parameters of each plugin, please refer to the [Parameter Description](#reference-parameter-description) section): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-| Type | Custom Plugin | Plugin Name | Description | Applicable Version |
-|:-----|:--------------|:------------|:------------|:-------------------|
-| source plugin | Not Supported | iotdb-source | The default extractor plugin, used to extract historical or real-time data from IoTDB | 1.2.x |
-| processor plugin | Supported | do-nothing-processor | The default processor plugin, which does not process the incoming data | 1.2.x |
-| sink plugin | Supported | do-nothing-sink | Does not process the data that is sent out | 1.2.x |
-| sink plugin | Supported | iotdb-thrift-sink | The default sink plugin (V1.3.1+), used for data transfer between IoTDB (V1.2.0+) and IoTDB (V1.2.0+). It uses the Thrift RPC framework to transfer data, with a multi-threaded async non-blocking IO model and high transfer performance; especially suitable for scenarios where the target end is distributed | 1.2.x |
-| sink plugin | Supported | iotdb-air-gap-sink | Used for data synchronization across unidirectional data diodes from IoTDB (V1.2.0+) to IoTDB (V1.2.0+). Supported diode models include Nanrui Syskeeper 2000, etc. | 1.2.x |
-| sink plugin | Supported | iotdb-thrift-ssl-sink | Used for data transfer between IoTDB (V1.3.1+) and IoTDB (V1.2.0+). It uses the Thrift RPC framework to transfer data, with a single-threaded sync blocking IO model; suitable for scenarios with higher security requirements | 1.3.1+ |
-
-For importing custom plugins, please refer to the [Stream Processing](./Streaming_timecho.md#custom-stream-processing-plugin-management) section.
-
-## 3. Use examples
-
-### 3.1 Full data synchronisation
-
-This example demonstrates synchronizing all data from one IoTDB to another IoTDB, with the data link shown below:
-
-![](/img/pipe1.jpg)
-
-In this example, we create a synchronization task named A2B to synchronize the full data from IoTDB A to IoTDB B. The sink must use the built-in iotdb-thrift-sink plugin, and the URL of the data service port of the DataNode on the target IoTDB is configured through node-urls, as shown in the following statement:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### 3.2 Partial data synchronization
-
-This example demonstrates synchronizing data from a certain historical time range (08:00 on 23 August 2023 to 08:00 on 23 October 2023) to another IoTDB, with the data link shown below:
-
-![](/img/pipe2.jpg)
-
-In this example, we create a synchronization task named A2B. First, define the range of data to be transferred in the source. Since the data being transferred is historical data (data that existed before the synchronization task was created), configure the start-time and end-time of the data and the transfer mode. The URL of the data service port of the DataNode on the target IoTDB is configured through node-urls.
-
-The detailed statements are as follows:
-
-```SQL
-create pipe A2B
-WITH SOURCE (
-  'source'= 'iotdb-source',
-  'realtime.mode' = 'stream', -- The extraction mode for newly inserted data (after pipe creation)
-  'path' = 'root.vehicle.**', -- Scope of data synchronization
-  'start-time' = '2023.08.23T08:00:00+00:00', -- The start event time for synchronizing all data, including start-time
-  'end-time' = '2023.10.23T08:00:00+00:00' -- The end event time for synchronizing all data, including end-time
-)
-with SINK (
-  'sink'='iotdb-thrift-async-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### 3.3 Bidirectional data transfer
-
-This example demonstrates the scenario where two IoTDB instances act as an active-active pair, with the data link shown in the figure below:
-
-![](/img/pipe3.jpg)
-
-In this example, to avoid infinite data loops, the `forwarding-pipe-requests` parameter on A and B needs to be set to `false`, indicating that data transmitted from another pipe is not forwarded. To keep the data consistent on both sides, the pipe also needs to be configured with `inclusion=all` to synchronize full data and metadata.
- -The detailed statement is as follows: - -On A IoTDB, execute the following statement: - -```SQL -create pipe AB -with source ( - 'inclusion'='all', -- Indicates synchronization of full data, schema , and auth - 'forwarding-pipe-requests' = 'false' -- Do not forward data written by other Pipes -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -On B IoTDB, execute the following statement: - -```SQL -create pipe BA -with source ( - 'inclusion'='all', -- Indicates synchronization of full data, schema , and auth - 'forwarding-pipe-requests' = 'false' -- Do not forward data written by other Pipes -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -### 3.4 Edge-cloud data transfer - -This example is used to demonstrate the scenario where data from multiple IoTDB is transferred to the cloud, with data from clusters B, C, and D all synchronized to cluster A, as shown in the figure below: - -![](/img/sync_en_03.png) - -In this example, to synchronize the data from clusters B, C, and D to A, the pipe between BA, CA, and DA needs to configure the `path` to limit the range, and to keep the edge and cloud data consistent, the pipe needs to be configured with `inclusion=all` to synchronize full data and metadata. 
The detailed statement is as follows:
-
-On B IoTDB, execute the following statement to synchronize data from B to A:
-
-```SQL
-create pipe BA
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'path'='root.db.**', -- Limit the range
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-On C IoTDB, execute the following statement to synchronize data from C to A:
-
-```SQL
-create pipe CA
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'path'='root.db.**', -- Limit the range
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-On D IoTDB, execute the following statement to synchronize data from D to A:
-
-```SQL
-create pipe DA
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'path'='root.db.**', -- Limit the range
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### 3.5 Cascading data transfer
-
-This example demonstrates the scenario where data is transferred in a cascading manner between multiple IoTDB instances, with data from cluster A synchronized to cluster B, and then to cluster C, as shown in the figure below:
-
-![](/img/sync_en_04.png)
-
-In this example, to synchronize the data from cluster A to C, `forwarding-pipe-requests` needs to be set to `true` on the pipe from B to C.
The detailed statement is as follows:
-
-On A IoTDB, execute the following statement to synchronize data from A to B:
-
-```SQL
-create pipe AB
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-On B IoTDB, execute the following statement to synchronize data from B to C:
-
-```SQL
-create pipe BC
-with source (
-  'forwarding-pipe-requests' = 'true' -- Whether to forward data written by other Pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6669', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### 3.6 Cross-gate data transfer
-
-This example demonstrates the scenario where data from one IoTDB is synchronized to another IoTDB through a unidirectional gateway, as shown in the figure below:
-
-![](/img/cross-network-gateway.png)
-
-
-In this example, the sink must use the iotdb-air-gap-sink plugin. After configuring the gateway, execute the following statement on A IoTDB. Fill in node-urls with the URL of the data service port of the DataNode node on the target IoTDB configured by the gateway, as detailed below:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-air-gap-sink',
-  'node-urls' = '10.53.53.53:9780', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-**Note:**
-* When creating a pipe for synchronization across a network gap (data diode), you must ensure that the target user on the receiving end already exists. If the receiving-end user does not exist when the pipe is created, data sent before that user is created will not be synchronized.
-* Currently supported network gap device models are listed in the table below.
-> For other models of network gateway devices, please contact TimechoDB staff to confirm compatibility.
-
-| Gateway Type | Model | Return Packet Limit | Send Limit |
-|--------------|-------|---------------------|------------|
-| Forward Gate | NARI Syskeeper-2000 Forward Gate | All 0 / All 1 bytes | No Limit |
-| Forward Gate | XJ Self-developed Diaphragm | All 0 / All 1 bytes | No Limit |
-| Unknown | WISGAP | No Limit | No Limit |
-| Forward Gate | KEDONG StoneWall-2000 Network Security Isolation Device | No Limit | No Limit |
-| Reverse Gate | NARI Syskeeper-2000 Reverse Direction | All 0 / All 1 bytes | Meet E Language Format |
-| Unknown | DPtech ISG5000 | No Limit | No Limit |
-| Unknown | GAP XL—GAP | No Limit | No Limit |
-
-
-### 3.7 Compression Synchronization (V1.3.3+)
-
-IoTDB supports specifying data compression methods during synchronization. Real-time compression and transmission of data can be achieved by configuring the `compressor` parameter. `compressor` currently supports five algorithms: snappy / gzip / lz4 / zstd / lzma2. Multiple algorithms can be combined, and they are applied in the order in which they are configured. `rate-limit-bytes-per-second` (supported in V1.3.3 and later versions) is the maximum number of bytes allowed to be transmitted per second, calculated on the compressed bytes. If it is less than 0, there is no limit.
-
-For example, to create a synchronization task named A2B:
-
-```SQL
-create pipe A2B
-with sink (
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-  'compressor' = 'snappy,lz4' -- Compression algorithms
-)
-```
-
-### 3.8 Encrypted Synchronization (V1.3.1+)
-
-IoTDB supports the use of SSL encryption during the synchronization process, ensuring the secure transfer of data between different IoTDB instances.
By configuring SSL-related parameters, such as the trust store certificate path (`ssl.trust-store-path`) and password (`ssl.trust-store-pwd`), data can be protected by SSL encryption during the synchronization process.
-
-For example, to create a synchronization task named A2B:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-ssl-sink',
-  'node-urls'='127.0.0.1:6667', -- The URL of the data service port of the DataNode node on the target IoTDB
-  'ssl.trust-store-path'='pki/trusted', -- The trust store certificate path required to connect to the target DataNode
-  'ssl.trust-store-pwd'='root' -- The trust store certificate password required to connect to the target DataNode
-)
-```
-
-## 4. Reference: Notes
-
-You can adjust the parameters for data synchronization by modifying the IoTDB configuration file (`iotdb-system.properties`), such as the directory for storing synchronized data. The complete configuration is as follows:
-
-V1.3.3+:
-
-```Properties
-# pipe_receiver_file_dir
-# If this property is unset, system will save the data in the default relative path directory under the IoTDB folder(i.e., %IOTDB_HOME%/${cn_system_dir}/pipe/receiver).
-# If it is absolute, system will save the data in the exact location it points to.
-# If it is relative, system will save the data in the relative path directory it indicates under the IoTDB folder.
-# Note: If pipe_receiver_file_dir is assigned an empty string(i.e.,zero-size), it will be handled as a relative path.
-# effectiveMode: restart
-# For windows platform
-# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is absolute. Otherwise, it is relative.
-# pipe_receiver_file_dir=data\\confignode\\system\\pipe\\receiver
-# For Linux platform
-# If its prefix is "/", then the path is absolute. Otherwise, it is relative.
-pipe_receiver_file_dir=data/confignode/system/pipe/receiver - -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# effectiveMode: first_start -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# effectiveMode: restart -# Datatype: int -pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# effectiveMode: restart -# Datatype: int -pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# effectiveMode: restart -# Datatype: int -pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# effectiveMode: restart -# Datatype: int -pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# effectiveMode: restart -# Datatype: Boolean -pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# Datatype: int -# effectiveMode: restart -pipe_air_gap_receiver_port=9780 - -# The total bytes that all pipe sinks can transfer per second. -# When given a value less than or equal to 0, it means no limit. -# default value is -1, which means no limit. 
-# effectiveMode: hot_reload -# Datatype: double -pipe_all_sinks_rate_limit_bytes_per_second=-1 -``` - -## 5. Reference: parameter description - -### 5.1 source parameter(V1.3.3) - -| Key | Value | Value range | Required | Default value | -|:-------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------|:----------|:---------------| -| source | iotdb-source | String: iotdb-source | Yes | - | -| inclusion | Used to specify the range of data to be synchronized in the data synchronization task, including data, schema, and auth | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | No | data.insert | -| inclusion.exclusion | Used to exclude specific operations from the range specified by inclusion, reducing the amount of data synchronized | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | No | - | -| mode.streaming | Specifies the capture source for time-series data writes. Applicable when mode.streamingis false, determining the source for capturing data.insertspecified in inclusion. 
Offers two strategies:- true: ​​Dynamic capture selection.​​ The system adaptively chooses between capturing individual write requests or only TsFile sealing requests based on downstream processing speed. Prioritizes capturing write requests for lower latency when processing is fast; captures only file sealing requests to avoid backlog when slow. Suitable for most scenarios, balancing latency and throughput optimally.- false: ​​Fixed batch capture.​​ Captures only TsFile sealing requests. Suitable for resource-constrained scenarios to reduce system load. Note: The snapshot data captured upon pipe startup is only provided to downstream processing in file format. | Boolean: true / false | No | true | -| mode.strict | Determines the strictness when filtering data using time/ path/ database-name/ table-nameparameters:- true: ​​Strict filtering.​​ The system strictly filters captured data according to the given conditions, ensuring only matching data is selected.- false: ​​Non-strict filtering.​​ The system may include some extra data during filtering. Suitable for performance-sensitive scenarios to reduce CPU and I/O consumption. | Boolean: true / false | No | true | -| mode.snapshot | Determines the capture mode for time-series data, affecting the dataspecified in inclusion. Offers two modes:- true: ​​Static data capture.​​ Upon pipe startup, a one-time data snapshot is captured. ​​The pipe will automatically terminate (DROP PIPE SQL is executed automatically) after the snapshot data is fully consumed.​​- false: ​​Dynamic data capture.​​ In addition to capturing a snapshot upon startup, the pipe continuously captures subsequent data changes. The pipe runs continuously to handle the dynamic data stream. | Boolean: true / false | No | false | -| path | Can be specified when the user connects with sql_dialectset to tree. For upgraded user pipes, the default sql_dialectis tree. 
This parameter determines the capture scope for time-series data, affecting the `data` specified in `inclusion`, as well as some sequence-related metadata. Data is selected into the streaming pipe if its tree model path matches the specified path.
Starting from version V2.0.8.2, this parameter supports specifying multiple exact paths in a single pipe, e.g., `'path'='root.test.d0.s1,root.test.d0.s2,root.test.d0.s3'`. | String: IoTDB-standard tree path pattern, wildcards allowed | No | root.** | -| start-time | The start event time for synchronizing all data, including start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | No | Long.MIN_VALUE | -| end-time | The end event time for synchronizing all data, including end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | No | Long.MAX_VALUE | -| forwarding-pipe-requests | Whether to forward data written by other Pipes (usually data synchronization) | Boolean: true / false | No | true | -| mods | Same as mods.enable, whether to send the MODS file for TSFile. | Boolean: true / false | No | false | -| skipIf | Which errors can be skipped? Currently only the insufficient privileges error. | String:no-privileges | No | no-privileges | - -> 💎 **Note:** The difference between the values of true and false for the data extraction mode `mode.streaming` -> -> - True (recommended): Under this value, the task will process and send the data in real-time. Its characteristics are high timeliness and low throughput. -> - False: Under this value, the task will process and send the data in batches (according to the underlying data files). Its characteristics are low timeliness and high throughput. 
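
To show how several of the source parameters in this table fit together, here is a minimal sketch of a one-off snapshot transfer. The pipe name `snapshot_vehicle`, the path, and the target address are hypothetical. The parameter keys and values come from the table above, and the statement shape follows the `CREATE PIPE` examples in section 3.

```sql
-- Hypothetical pipe name, path, and target address, for illustration only.
CREATE PIPE snapshot_vehicle
WITH SOURCE (
  'source' = 'iotdb-source',
  'inclusion' = 'data.insert',  -- the default range: inserted data only
  'path' = 'root.vehicle.**',   -- capture only series under this tree path
  'mode.snapshot' = 'true',     -- capture a one-time snapshot at startup
  'mode.strict' = 'true'        -- filter strictly by the conditions above
)
WITH SINK (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668'
)
```

According to the `mode.snapshot` description, a pipe configured this way terminates automatically once the snapshot data has been fully consumed, so no explicit `DROP PIPE` should be needed afterwards.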
- -### 5.2 sink parameter - -#### iotdb-thrift-sink - -| **Parameter** | **Description** | Value Range | Required | Default Value | -|:------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :----------------------------------------------------------- | :------- |:---------------------------------------------| -| sink | iotdb-thrift-sink or iotdb-thrift-async-sink | String: iotdb-thrift-sink or iotdb-thrift-async-sink | Yes | - | -| node-urls | URLs of the DataNode service ports on the target IoTDB. (please note that the synchronization task does not support forwarding to its own service). | String. Example:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Yes | - | -| user/username | Username for connecting to the target IoTDB. Must have appropriate permissions. | String | No | root | -| password | Password for the username. | String | No | TimechoDB@2021 (Before V2.0.6.x it is root) | -| batch.enable | Enables batch mode for log transmission to improve throughput and reduce IOPS. | Boolean: true, false | No | true | -| batch.max-delay-seconds | Maximum delay (in seconds) for batch transmission. | Integer | No | 1 | -| batch.max-delay-ms | Maximum delay (in ms) for batch transmission. (Available since v2.0.5) | Integer | No | 1 | -| batch.size-bytes | Maximum batch size (in bytes) for batch transmission. | Long | No | 16*1024*1024 | -| compressor | The selected RPC compression algorithm. 
Multiple algorithms can be configured and will be adopted in sequence for each request. | String: snappy / gzip / lz4 / zstd / lzma2 | No | "" | -| compressor.zstd.level | When the selected RPC compression algorithm is zstd, this parameter can be used to additionally configure the compression level of the zstd algorithm. | Int: [-131072, 22] | No | 3 | -| rate-limit-bytes-per-second | The maximum number of bytes allowed to be transmitted per second. The compressed bytes (such as after compression) are calculated. If it is less than 0, there is no limit. | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | No | -1 | -| load-tsfile-strategy | When synchronizing file data, ​​whether the receiver waits for the local load tsfile operation to complete before responding to the sender​​:
​​sync​​: Wait for the local load tsfile operation to complete before returning the response.
​​async​​: Do not wait for the local load tsfile operation to complete; return the response immediately. | String: sync / async | No | sync | -| format | The payload formats for data transmission include the following options:
- hybrid: The format depends on what is passed from the processor (either tsfile or tablet), and the sink performs no conversion.
- tsfile: Data is forcibly converted to tsfile format before transmission. This is suitable for scenarios like data file backup.
- tablet: Data is forcibly converted to tablet format before transmission. This is useful for data synchronization when the sender and receiver have incompatible data types (to minimize errors). | String: hybrid / tsfile / tablet | No | hybrid | -| exception.data.convert-on-type-mismatch | Whether to enable automatic conversion when data types mismatch on the sink side | Boolean: true / false | No | true | - -#### iotdb-air-gap-sink - -| **Parameter** | **Description** | Value Range | Required | Default Value | -|:----------------------------------------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------|:---------------------------------------------| -| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | Yes | - | -| node-urls | URLs of the data service ports of any DataNodes on the target IoTDB. | String. Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Yes | - | -| user/username | Username for connecting to the target IoTDB. Must have appropriate permissions. | String | No | root | -| password | Password for the username. | String | No | TimechoDB@2021 (Before V2.0.6.x it is root) | -| compressor | The selected RPC compression algorithm. Multiple algorithms can be configured and will be adopted in sequence for each request. | String: snappy / gzip / lz4 / zstd / lzma2 | No | "" | -| compressor.zstd.level | When the selected RPC compression algorithm is zstd, this parameter can be used to additionally configure the compression level of the zstd algorithm.
| Int: [-131072, 22] | No | 3 | -| rate-limit-bytes-per-second | Maximum bytes allowed per second for transmission (calculated after compression). Set to a value less than 0 for no limit. | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | No | -1 | -| load-tsfile-strategy | When synchronizing file data, ​​whether the receiver waits for the local load tsfile operation to complete before responding to the sender​​:
​​sync​​: Wait for the local load tsfile operation to complete before returning the response.
​​async​​: Do not wait for the local load tsfile operation to complete; return the response immediately. | String: sync / async | No | sync | -| air-gap.handshake-timeout-ms | The timeout duration of the handshake request when the sender and receiver first attempt to establish a connection, unit: ms | Integer | No | 5000 | -| exception.data.convert-on-type-mismatch | Whether to enable automatic conversion when data types mismatch on the sink side | Boolean: true / false | No | true | - -#### iotdb-thrift-ssl-sink - -| **Parameter** | **Description** | Value Range | Required | Default Value | -|:-------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:---------|:---------------------------------------------| -| sink | iotdb-thrift-ssl-sink | String: iotdb-thrift-ssl-sink | Yes | - | -| node-urls | URLs of the DataNode service ports on the target IoTDB. (please note that the synchronization task does not support forwarding to its own service). | String. Example:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Yes | - | -| user/username | Username for connecting to the target IoTDB. Must have appropriate permissions. | String | No | root | -| password | Password for the username. | String | No | TimechoDB@2021 (Before V2.0.6.x it is root) | -| batch.enable | Enables batch mode for log transmission to improve throughput and reduce IOPS. 
| Boolean: true, false | No | true | -| batch.max-delay-seconds | Maximum delay (in seconds) for batch transmission. | Integer | No | 1 | -| batch.max-delay-ms | Maximum delay (in ms) for batch transmission. (Available since v2.0.5) | Integer | No | 1 | -| batch.size-bytes | Maximum batch size (in bytes) for batch transmission. | Long | No | 16*1024*1024 | -| compressor | The selected RPC compression algorithm. Multiple algorithms can be configured and will be adopted in sequence for each request. | String: snappy / gzip / lz4 / zstd / lzma2 | No | "" | -| compressor.zstd.level | When the selected RPC compression algorithm is zstd, this parameter can be used to additionally configure the compression level of the zstd algorithm. | Int: [-131072, 22] | No | 3 | -| rate-limit-bytes-per-second | Maximum bytes allowed per second for transmission (calculated after compression). Set to a value less than 0 for no limit. | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | No | -1 | -| load-tsfile-strategy | When synchronizing file data, ​​whether the receiver waits for the local load tsfile operation to complete before responding to the sender​​:
​​sync​​: Wait for the local load tsfile operation to complete before returning the response.
​​async​​: Do not wait for the local load tsfile operation to complete; return the response immediately. | String: sync / async | No | sync | -| ssl.trust-store-path | Path to the trust store certificate for SSL connection. | String.Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Yes | - | -| ssl.trust-store-pwd | Password for the trust store certificate. | Integer | Yes | - | -| format | The payload formats for data transmission include the following options:
- hybrid: The format depends on what is passed from the processor (either tsfile or tablet), and the sink performs no conversion.
- tsfile: Data is forcibly converted to tsfile format before transmission. This is suitable for scenarios like data file backup.
- tablet: Data is forcibly converted to tablet format before transmission. This is useful for data synchronization when the sender and receiver have incompatible data types (to minimize errors). | String: hybrid / tsfile / tablet | No | hybrid | -| exception.data.convert-on-type-mismatch | Whether to enable automatic conversion when data types mismatch on the sink side | Boolean: true / false | No | true | - diff --git a/src/UserGuide/Master/Tree/User-Manual/Data-subscription_timecho.md b/src/UserGuide/Master/Tree/User-Manual/Data-subscription_timecho.md deleted file mode 100644 index 2d13c01b8..000000000 --- a/src/UserGuide/Master/Tree/User-Manual/Data-subscription_timecho.md +++ /dev/null @@ -1,148 +0,0 @@ -# Data Subscription - -## 1. Feature Introduction - -The IoTDB data subscription module (also known as the IoTDB subscription client) is a feature supported after IoTDB V1.3.3 that gives users a streaming way to consume data, distinct from data queries. It borrows the basic concepts and logic of message queue products such as Kafka, **providing data subscription and consumption interfaces**, but it is not intended to completely replace these message queue products. Instead, it offers a more convenient data subscription service for scenarios that only need simple streaming data acquisition. - -Using the IoTDB Subscription Client to consume data has significant advantages in the following application scenarios: - -1. **Continuously obtaining the latest data**: Subscription is closer to real time than scheduled queries, makes applications simpler to program, and places a lower burden on the system; - -2. **Simplifying data push to third-party systems**: There is no need to develop data push components for different systems within IoTDB. Data can be streamed within third-party systems, making it easier to send data to systems such as Flink, Kafka, DataX, Camel, MySQL, PG, etc. - -## 2. Key Concepts - -The IoTDB Subscription Client encompasses three core concepts: Topic, Consumer, and Consumer Group. The specific relationships are illustrated in the diagram below: - -
- -
- -1. **Topic**: A Topic is the data space of IoTDB, represented by paths and time ranges (such as the full time range of `root.**`). Consumers can subscribe to the data of these topics (both existing data and data written in the future). Unlike Kafka, IoTDB can create topics after data is stored, and the output format can be either Message or TsFile. - -2. **Consumer**: A Consumer is where an IoTDB subscription client resides; it is responsible for receiving and processing data published to specific topics. Consumers retrieve data from the queue and process it accordingly. There are two types of Consumers available in the IoTDB subscription client: - - `SubscriptionPullConsumer`, which corresponds to the pull consumption model in message queues, where user code needs to actively invoke data retrieval logic. - - `SubscriptionPushConsumer`, which corresponds to the push consumption model in message queues, where user code is triggered by newly arriving data events. - - -3. **Consumer Group**: A Consumer Group is a collection of Consumers that share the same Consumer Group ID. The Consumer Group has the following characteristics: - - Consumer Group and Consumer are in a one-to-many relationship. That is, there can be any number of consumers in a consumer group, but a consumer is not allowed to join multiple consumer groups simultaneously. - - A Consumer Group can have different types of Consumers (`SubscriptionPullConsumer` and `SubscriptionPushConsumer`). - - It is not necessary for all consumers in a Consumer Group to subscribe to the same topic. - - When different Consumers in the same Consumer Group subscribe to the same Topic, each piece of data under that Topic will only be processed by one Consumer within the group, ensuring that data is not processed repeatedly. - -## 3. SQL Statements - -### 3.1 Topic Management - -IoTDB supports the creation, deletion, and viewing of Topics through SQL statements. The status changes of Topics are illustrated in the diagram below: - -
- -
- -#### 3.1.1 Create Topic - -The SQL statement is as follows: - -```SQL - CREATE TOPIC [IF NOT EXISTS] <topicName> - WITH ( - [<key> = <value>,], - ); -``` - -**IF NOT EXISTS semantics**: Used in creation operations to ensure that the create command is executed only when the specified topic does not exist, preventing errors caused by attempting to create an existing topic. - -Detailed explanation of each parameter is as follows: - -| Key | Required or Optional with Default | Description | -| :-------------------------------------------- | :--------------------------------- | :----------------------------------------------------------- | -| **path** | optional: `root.**` | The path of the time series data corresponding to the topic, representing a set of time series to be subscribed. | -| **start-time** | optional: `MIN_VALUE` | The start time (event time) of the time series data corresponding to the topic. Can be in ISO format, such as 2011-12-03T10:15:30 or 2011-12-03T10:15:30+01:00, or a long value representing a raw timestamp consistent with the database's timestamp precision. Supports the special value `now`, which means the creation time of the topic. When start-time is `now` and end-time is MAX_VALUE, it indicates that only real-time data is subscribed. | -| **end-time** | optional: `MAX_VALUE` | The end time (event time) of the time series data corresponding to the topic. Can be in ISO format, such as 2011-12-03T10:15:30 or 2011-12-03T10:15:30+01:00, or a long value representing a raw timestamp consistent with the database's timestamp precision. Supports the special value `now`, which means the creation time of the topic. When end-time is `now` and start-time is MIN_VALUE, it indicates that only historical data is subscribed.
| -| **processor** | optional: `do-nothing-processor` | The name and parameter configuration of the processor plugin, representing the custom processing logic applied to the original subscribed data, which can be specified in a similar way to pipe processor plugins. | -| **format** | optional: `SessionDataSetsHandler` | Represents the form in which data is subscribed from the topic. Currently supports the following two forms of data: `SessionDataSetsHandler`: Data subscribed from the topic is obtained using `SubscriptionSessionDataSetsHandler`, and consumers can consume each piece of data row by row. `TsFileHandler`: Data subscribed from the topic is obtained using `SubscriptionTsFileHandler`, and consumers can directly subscribe to the TsFile storing the corresponding data. | -| **mode** **(supported in versions 1.3.3.2 and later)** | optional: `live` | The subscription mode corresponding to the topic, with two options: `live`: When subscribing to this topic, the subscribed dataset is dynamic, which means that you can continuously consume the latest data. `snapshot`: When the consumer subscribes to this topic, the subscribed dataset is static, which means a snapshot of the data at the moment the consumer group subscribes to the topic (not the moment the topic is created); the static dataset formed after subscription does not support TTL. | -| **loose-range** **(supported in versions 1.3.3.2 and later)** | optional: `""` | String: Whether to strictly filter the data corresponding to this topic according to the path and time range, for example: `""`: Strictly filter the data corresponding to this topic according to the path and time range. `"time"`: Do not strictly filter the data corresponding to this topic according to the time range (rough filter); strictly filter the data corresponding to this topic according to the path. `"path"`: Do not strictly filter the data corresponding to this topic according to the path (rough filter); strictly filter the data corresponding to this topic according to the time range. `"time, path"` / `"path, time"` / `"all"`: Do not strictly filter the data corresponding to this topic according to the path and time range (rough filter). | - -Examples are as follows: - -```SQL --- Full subscription -CREATE TOPIC root_all; - --- Custom subscription -CREATE TOPIC IF NOT EXISTS db_timerange -WITH ( - 'path' = 'root.db.**', - 'start-time' = '2023-01-01', - 'end-time' = '2023-12-31' -); -``` - -#### 3.1.2 Delete Topic - -A Topic can only be deleted if it is not subscribed to. When a Topic is deleted, its related consumption progress will be cleared. - -```SQL -DROP TOPIC [IF EXISTS] <topicName>; -``` -**IF EXISTS semantics**: Used in deletion operations to ensure that the delete command is executed only when the specified topic exists, preventing errors caused by attempting to delete non-existent topics. - -#### 3.1.3 View Topic - -```SQL -SHOW TOPICS; -SHOW TOPIC <topicName>; -``` - -Result set: - -```SQL -[TopicName|TopicConfigs] -``` - -- TopicName: Topic ID -- TopicConfigs: Topic configurations - -### 3.2 Check Subscription Status - -View all subscription relationships: - -```SQL --- Query the subscription relationships between all topics and consumer groups -SHOW SUBSCRIPTIONS --- Query all subscriptions under a specific topic -SHOW SUBSCRIPTIONS ON <topicName> -``` - -Result set: - -```SQL -[TopicName|ConsumerGroupName|SubscribedConsumers] -``` - -- TopicName: The ID of the topic. -- ConsumerGroupName: The ID of the consumer group specified in the user's code. -- SubscribedConsumers: All client IDs in the consumer group that have subscribed to the topic. - -## 4. API interface - -In addition to SQL statements, IoTDB also supports using data subscription features through Java native interfaces. For details, see the [Java native API documentation](../API/Programming-Java-Native-API_timecho). - - -## 5. Frequently Asked Questions - -### 5.1 What is the difference between IoTDB data subscription and Kafka? - -1. Consumption Orderliness - -- **Kafka guarantees that messages within a single partition are ordered**. When a topic corresponds to only one partition and only one consumer subscribes to this topic, the order in which the consumer (single-threaded) consumes the topic data is the same as the order in which the data is written. -- The IoTDB subscription client **does not guarantee** that the order in which the consumer consumes the data is the same as the order in which the data is written, but it will try to reflect the order of data writing. - -2. Message Delivery Semantics - -- Kafka can achieve Exactly once semantics for both Producers and Consumers through configuration. -- The IoTDB subscription client currently cannot provide Exactly once semantics for Consumers. \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/User-Manual/IoTDB-View_timecho.md b/src/UserGuide/Master/Tree/User-Manual/IoTDB-View_timecho.md deleted file mode 100644 index 111bf1f01..000000000 --- a/src/UserGuide/Master/Tree/User-Manual/IoTDB-View_timecho.md +++ /dev/null @@ -1,547 +0,0 @@ - - -# View - -## 1. Sequence View Application Background - -## 2. Application Scenario 1 Time Series Renaming (PI Asset Management) - -In practice, the equipment collecting data may be named with identification numbers that are hard for people to understand, which makes querying difficult at the business layer. - -The Sequence View, on the other hand, is able to re-organise the management of these sequences and access them using a new model structure without changing the original sequence content and without the need to create new or copy sequences. - -**For example**: a cloud device uses its own NIC MAC address to form entity numbers and stores data by writing the following time series: `root.db.0800200A8C6D.xvjeifg`. - -It is difficult for the user to understand.
However, the user can rename it using the sequence view feature: map it to a sequence view and use `root.view.device001.temperature` to access the captured data. - -### 2.1 Application Scenario 2 Simplifying business layer query logic - -Sometimes users have a large number of devices that manage a large number of time series. When conducting a certain business, the user wants to deal with only some of these sequences. In this case, the sequences of interest can be picked out with the sequence view feature, which is convenient for repeated querying and writing. - -**For example**: Users manage a product assembly line with a large number of time series for each segment of the equipment. The temperature inspector only needs to focus on the temperature of the equipment, so they can extract the temperature-related sequences and compose a sequence view. - -### 2.2 Application Scenario 3 Auxiliary Rights Management - -In the production process, different operations are generally responsible for different scopes. For security reasons, it is often necessary to restrict the access scope of the operations staff through permission management. - -**For example**: The safety management department now only needs to monitor the temperature of each device in a production line, but these data are stored in the same database with other confidential data. At this point, it is possible to create a number of new views that contain only temperature-related time series on the production line, and then to give the security officer access to only these sequence views, thus achieving the purpose of permission restriction. - -### 2.3 Motivation for designing sequence view functionality - -Combining the three usage scenarios above, the motivations for designing the sequence view functionality are: - -1. Time series renaming. -2. Simplifying the query logic at the business level. -3. Auxiliary rights management: opening data to specific users through views. - -## 3. Sequence View Concepts - -### 3.1 Terminology Concepts - -Concept: Unless otherwise specified, the views referred to in this document are **Sequence Views**; new features such as device views may be introduced in the future. - -### 3.2 Sequence view - -A sequence view is a way of organising the management of time series. - -In traditional relational databases, data must all be stored in a table, whereas in time series databases such as IoTDB, it is the sequence that is the storage unit. Therefore, the concept of sequence views in IoTDB is also built on sequences. - -A sequence view is a virtual time series, and each virtual time series is like a soft link or shortcut that maps to a sequence or some kind of computational logic external to a certain view. In other words, a virtual sequence either maps to some defined external sequence or is computed from multiple external sequences. - -Users can create views using complex SQL queries, where the sequence view acts as a stored query statement, and when data is read from the view, the stored query statement is used as the source of the data in the FROM clause. - -### 3.3 Alias Sequences - -There is a special class of sequence views that satisfy all of the following conditions: - -1. the data source is a single time series -2. there is no computational logic -3. no filtering conditions (e.g., no WHERE clause restrictions). - -Such a sequence view is called an **alias sequence**, or alias sequence view. A sequence view that does not fully satisfy all of the above conditions is called a non-alias sequence view. The difference between them is that only aliased sequences support write functionality. - -**All sequence views, including aliased sequences, do not currently support Trigger functionality.** - -### 3.4 Nested Views - -A user may want to select a number of sequences from an existing sequence view to form a new sequence view, called a nested view.
- -**The current version does not support the nested view feature**. - -### 3.5 Some constraints on sequence views in IoTDB - -#### Constraint 1 A sequence view must depend on one or several time series - -A sequence view has two possible forms of existence: - -1. it maps to a time series -2. it is computed from one or more time series. - -The former form has been exemplified in the previous section and is easy to understand; the latter form exists because the sequence view allows for computational logic. - -For example, the user has installed two thermometers in the same boiler and now needs to calculate the average of the two temperature values as a measurement. The user has captured the following two sequences: `root.db.d01.temperature01`, `root.db.d01.temperature02`. - -At this point, the user can use the average of the two sequences as one sequence in the view: `root.db.d01.avg_temperature`. - -This example is expanded in detail under "Creating views with computational logic" in section 4.1. - -#### Constraint 2 Non-alias sequence views are read-only - -Writing to non-alias sequence views is not allowed. - -Only aliased sequence views are supported for writing. - -#### Constraint 3 Nested views are not allowed - -It is not possible to select certain columns in an existing sequence view to create a sequence view, either directly or indirectly. - -An example of this restriction is given under "Nested sequence views not supported" in section 4.1. - -#### Constraint 4 A sequence view and a time series cannot share the same name - -Both sequence views and time series are located under the same tree, so they cannot have the same name. - -The name (path) of any sequence should be uniquely determined. - -#### Constraint 5 Sequence views share time series data with time series; metadata such as tags is not shared - -Sequence views are mappings pointing to time series, so they fully share the time series data, with the time series being responsible for persistent storage. - -However, their metadata such as tags and attributes are not shared.
- -This is because business users who query through views care about the structure of the current view. If they query with group by tag or similar methods, they expect the grouping to follow the tags of the view, not the tags of the underlying time series (the user may not even be aware of those time series). - -## 4. Sequence view functionality - -### 4.1 Creating a view - -Creating a sequence view is similar to creating a time series; the difference is that you need to specify the data source, i.e., the original sequence, through the AS keyword. - -#### SQL for creating a view - -Users can select some sequences to create a view: - -```SQL -CREATE VIEW root.view.device.status -AS - SELECT s01 - FROM root.db.device -``` - -It indicates that the user has selected the sequence `s01` from the existing device `root.db.device`, creating the sequence view `root.view.device.status`. - -The sequence view can exist under the same entity as the time series, for example: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device -``` - -Thus, there is a virtual copy of `s01` under `root.db.device`, but with a different name `status`. - -Note that the sequence views in both of the above examples are aliased sequences. For this case, a more convenient way of creating the view is provided: - -```SQL -CREATE VIEW root.view.device.status -AS - root.db.device.s01 -``` - -#### Creating views with computational logic - -Following the example in Constraint 1 of section 3.5: - -> A user has installed two thermometers in the same boiler and now needs to calculate the average of the two temperature values as a measurement. The user has captured the following two sequences: `root.db.d01.temperature01`, `root.db.d01.temperature02`. -> -> At this point, the user can use the average of the two sequences as one sequence in the view: `root.db.d01.avg_temperature`.
- -If the view is not used, the user can query the average of the two temperatures like this: - -```SQL -SELECT (temperature01 + temperature02) / 2 -FROM root.db.d01 -``` - -With a sequence view, the user can create a view this way to simplify future queries: - -```SQL -CREATE VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02) / 2 - FROM root.db.d01 -``` - -The user can then query it like this: - -```SQL -SELECT avg_temperature FROM root.db.d01 -``` - -#### Nested sequence views not supported - -Continuing with the example above, the user now wants to create a new view using the sequence view `root.db.d01.avg_temperature`, which is not allowed. Nested views are currently not supported, whether or not the view is an aliased sequence. - -For example, the following SQL statement will report an error: - -```SQL -CREATE VIEW root.view.device.avg_temp_copy -AS - root.db.d01.avg_temperature -- Not supported. Nested views are not allowed -``` - -#### Creating multiple sequence views at once - -Specifying only one sequence view at a time is inconvenient, so multiple sequence views can be specified in a single statement, for example: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - SELECT s01, s02 - FROM root.db.device -``` - -In addition, the statement above can be simplified: - -```SQL -CREATE VIEW root.db.device(status, sub.hardware) -AS - SELECT s01, s02 - FROM root.db.device -``` - -Both statements above are equivalent to the following: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device; - -CREATE VIEW root.db.device.sub.hardware -AS - SELECT s02 - FROM root.db.device -``` - -They are also equivalent to the following: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - root.db.device.s01, root.db.device.s02 - --- or - -CREATE VIEW root.db.device(status, sub.hardware) -AS - root.db.device(s01, s02) -``` - -##### The mapping relationships between all sequences are statically stored - -Sometimes, the SELECT clause may match a number of sequences that can only be determined at runtime, such as below: - -```SQL -SELECT s01, s02 -FROM root.db.d01, root.db.d02 -``` - -The number of sequences that can be matched by the above statement is uncertain and is related to the state of the system. Even so, the user can use it to create views. - -However, it is important to note that the mapping relationship between all sequences is stored statically (fixed at creation)! Consider the following example: - -The current database contains only three sequences `root.db.d01.s01`, `root.db.d02.s01`, `root.db.d02.s02`, and then the view is created: - -```SQL -CREATE VIEW root.view.d(alpha, beta, gamma) -AS - SELECT s01, s02 - FROM root.db.d01, root.db.d02 -``` - -The mapping relationship between time series is as follows: - -| sequence number | time series | sequence view | -| ---- | ----------------- | ----------------- | -| 1 | `root.db.d01.s01` | root.view.d.alpha | -| 2 | `root.db.d02.s01` | root.view.d.beta | -| 3 | `root.db.d02.s02` | root.view.d.gamma | - -After that, if the user adds the sequence `root.db.d01.s02`, it does not correspond to any view; then, if the user deletes `root.db.d01.s01`, the query for `root.view.d.alpha` will report an error directly, and it will not correspond to `root.db.d01.s02` either. - -Always keep in mind that the mapping relationships between sequences are stored statically and are fixed at creation. - -#### Batch Creation of Sequence Views - -There are several existing devices, each with a temperature value, for example: - -1. root.db.d1.temperature -2. root.db.d2.temperature -3. ... - -There may be many other sequences stored under these devices (e.g. `root.db.d1.speed`), but it is possible to create a view that contains only the temperature values for these devices, independent of the other sequences:
- -```SQL -CREATE VIEW root.db.view(${2}_temperature) -AS - SELECT temperature FROM root.db.* -``` - -This is modelled on the query writeback (`SELECT INTO`) convention for naming rules, which uses variable placeholders to specify naming rules. See also: [QUERY WRITEBACK (SELECT INTO)](../Basic-Concept/Query-Data_timecho#into-clause-query-write-back) - -Here `root.db.*.temperature` specifies what time series will be included in the view; and `${2}` specifies from which node in the time series the name is extracted to name the sequence view. - -Here, `${2}` refers to level 2 (starting at 0) of `root.db.*.temperature`, which is the result of the `*` match; and `${2}_temperature` is the result of the match and `temperature` spliced together with underscores to make up the node names of the sequences under the view. - -The above statement for creating a view is equivalent to the following writeup: - -```SQL -CREATE VIEW root.db.view(${2}_${3}) -AS - SELECT temperature from root.db.* -``` - -The final view contains these sequences: - -1. root.db.view.d1_temperature -2. root.db.view.d2_temperature -3. ... - -Created using wildcards, only static mapping relationships at the moment of creation will be stored. - -#### SELECT clauses are somewhat limited when creating views - -The SELECT clause used when creating a serial view is subject to certain restrictions. The main restrictions are as follows: - -1. the `WHERE` clause cannot be used. -2. `GROUP BY` clause cannot be used. -3. `MAX_VALUE` and other aggregation functions cannot be used. - -Simply put, after `AS` you can only use `SELECT ... FROM ... ` and the results of this query must form a time series. - -### 4.2 View Data Queries - -For the data query functions that can be supported, the sequence view and time series can be used indiscriminately with identical behaviour when performing time series data queries. - -**The types of queries that are not currently supported by the sequence view are as follows:** - -1. 
**align by device query** -2. **group by tags query** - -Users can also mix time series and sequence view queries in the same SELECT statement, for example: - -```SQL -SELECT temperature01, temperature02, avg_temperature -FROM root.db.d01 -WHERE temperature01 < temperature02 -``` - -However, if the user queries the metadata of a sequence, such as tags and attributes, the result is the metadata of the sequence view, not that of the time series referenced by the sequence view. - -In addition, for aliased sequences, if the user wants to get information about the underlying time series such as tags and attributes, the user needs to first query the mapping of the view columns to find the corresponding time series, and then query that time series for its tags and attributes. The method of querying the mapping of the view columns is explained in section 4.5. - -### 4.3 Modify Views - -The modification operations supported by the view include: modifying its calculation logic, modifying tags/attributes, and deleting. 
- -#### Modify view data source - -```SQL -ALTER VIEW root.view.device.status -AS - SELECT s01 - FROM root.ln.wf.d01 -``` - -#### Modify the view's calculation logic - -```SQL -ALTER VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02 + temperature03) / 3 - FROM root.db.d01 -``` - -#### Tag point management - -- Add a new tag - -```SQL -ALTER view root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4 -``` - -- Add a new attribute - -```SQL -ALTER view root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4 -``` - -- Rename a tag or attribute - -```SQL -ALTER view root.turbine.d1.s1 RENAME tag1 TO newTag1 -``` - -- Reset the value of a tag or attribute - -```SQL -ALTER view root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1 -``` - -- Delete an existing tag or attribute - -```SQL -ALTER view root.turbine.d1.s1 DROP tag1, tag2 -``` - -- Upsert tags and attributes - -> If the tag or attribute did not exist before, insert it; otherwise, update the old value with the new one. - -```SQL -ALTER view root.turbine.d1.s1 UPSERT TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4) -``` - -#### Deleting Views - -Since a view is a sequence, a view can be deleted as if it were a time series. - - -```SQL -DELETE VIEW root.view.device.avg_temperature -``` - -### 4.4 View Synchronisation - -#### If the dependent original sequence is deleted - -When the sequence view is queried (when the sequence is parsed), **an empty result set** is returned if the dependent time series does not exist. - -This is similar to the feedback for querying a non-existent sequence, but with a difference: if the dependent time series cannot be parsed, the empty result set still contains the table header, as a reminder to the user that the view is problematic. - -Additionally, when the dependent time series is deleted, no check is made for views that depend on it, and the user receives no warning. 
- -#### Data Writes to Non-Aliased Sequences Not Supported - -Writes to non-alias sequences are not supported. - -Please refer to the previous section 2.1.6 Restriction 2 for more details. - -#### Metadata for sequences is not shared - -Please refer to the previous section 2.1.6 Restriction 5 for details. - -### 4.5 View Metadata Queries - -View metadata query specifically refers to querying the metadata of the view itself (e.g., how many columns the view has), as well as information about the views in the database (e.g., what views are available). - -#### Viewing Current View Columns - -The user has two ways of querying: - -1. a query using `SHOW TIMESERIES`, which returns both time series and sequence views. However, only some of the attributes of the view can be displayed. -2. a query using `SHOW VIEW`, which returns only sequence views. It displays the complete properties of the sequence view. - -Example: - -```Shell -IoTDB> show timeseries; -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.device.s01 | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.view.status | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp01 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| 
-+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp02 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.avg_temp| null| root.db| FLOAT| null| null|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -Total line number = 5 -It costs 0.789s -IoTDB> -``` - -The last column `ViewType` shows the type of the sequence: time series are `BASE` and sequence views are `VIEW`. - -In addition, some properties of a sequence view may be missing. For example, `root.db.d01.avg_temp` is calculated from temperature averages, so its `Encoding` and `Compression` properties are null. - -The query results of the `SHOW TIMESERIES` statement are divided into two main parts: - -1. information about the time series data, such as data type, compression, encoding, etc. -2. other metadata information, such as tag, attribute, database, etc. - -For the sequence view, the time series data information presented is the same as the original sequence or null (e.g., the calculated average temperature has a data type but no compression method); the metadata information presented is the content of the view. - -To learn more about the view, use `SHOW VIEW`. `SHOW VIEW` shows the source of the view's data, etc. 
- -```Shell -IoTDB> show VIEW root.**; -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -| Timeseries|Database|DataType|Tags|Attributes|ViewType| SOURCE| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.view.status | root.db| INT32|null| null| VIEW| root.db.device.s01| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.d01.avg_temp| root.db| FLOAT|null| null| VIEW|(root.db.d01.temp01+root.db.d01.temp02)/2| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -Total line number = 2 -It costs 0.789s -IoTDB> -``` - -The last column, `SOURCE`, shows the data source of the sequence view, listing the expression from the SQL statement that created the view. - -##### About Data Types - -Both of the above queries involve the data type of the view. The data type of a view is inferred from the query statement that defines the view or, for an aliased sequence, from the type of the original time series. This data type is computed in real time based on the current state of the system, so the data type returned may differ between queries made at different moments. - -## 5. FAQ - -#### Q1: I want the view to implement the function of type conversion. For example, a time series of type int32 was originally placed in the same view as other series of type int64. I now want all the data queried through the view to be automatically converted to int64 type. - -> Ans: This is not the function of the sequence view. But the conversion can be done using `CAST`, for example: - -```SQL -CREATE VIEW root.db.device.int64_status -AS - SELECT CAST(s1, 'type'='INT64') from root.db.device -``` - -> This way, a query for `root.db.device.int64_status` will yield a result of type int64. 
-> -> Please note in particular that in the above example, the data for the sequence view is obtained by `CAST` conversion, so `root.db.device.int64_status` is not an aliased sequence, and thus **not supported for writing**. - -#### Q2: Is default naming supported? Select a number of time series and create a view; but I don't specify the name of each series, it is named automatically by the database? - -> Ans: Not supported. Users must specify the naming explicitly. - -#### Q3: In the original system, create time series `root.db.device.s01`, you can find that database `root.db` is automatically created and device `root.db.device` is automatically created. Next, deleting the time series `root.db.device.s01` reveals that `root.db.device` was automatically deleted, while `root.db` remained. Will this mechanism be followed for creating views? What are the considerations? - -> Ans: Keep the original behaviour unchanged, the introduction of view functionality will not change these original logics. - -#### Q4: Does it support sequence view renaming? - -> A: Renaming is not supported in the current version, you can create your own view with new name to put it into use. \ No newline at end of file diff --git a/src/UserGuide/Master/Tree/User-Manual/Maintenance-commands_timecho.md b/src/UserGuide/Master/Tree/User-Manual/Maintenance-commands_timecho.md deleted file mode 100644 index cf1f6d9c9..000000000 --- a/src/UserGuide/Master/Tree/User-Manual/Maintenance-commands_timecho.md +++ /dev/null @@ -1,694 +0,0 @@ - -# Maintenance Statement - -## 1. Status Checking - -### 1.1 Viewing the Connected Model - -**Description**: Returns the current SQL dialect mode (`Tree` or `Table`). 
- -**Syntax**: - -```SQL -showCurrentSqlDialectStatement - : SHOW CURRENT_SQL_DIALECT - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CURRENT_SQL_DIALECT; -``` - -**Result:** - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TREE| -+-----------------+ -``` - -### 1.2 Viewing the Cluster Version - -**Description**: Returns the current cluster version. - -**Syntax**: - -```SQL -showVersionStatement - : SHOW VERSION - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW VERSION; -``` - -**Result**: - -```Plain -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.1.2| 1ca4008| -+-------+---------+ -``` - -### 1.3 Viewing Cluster Key Parameters - -**Description**: Returns key parameters of the current cluster. - -**Syntax**: - -```SQL -showVariablesStatement - : SHOW VARIABLES - ; -``` - -Key Parameters: - -1. **ClusterName**: The name of the current cluster. -2. **DataReplicationFactor**: Number of data replicas per DataRegion. -3. **SchemaReplicationFactor**: Number of schema replicas per SchemaRegion. -4. **DataRegionConsensusProtocolClass**: Consensus protocol class for DataRegions. -5. **SchemaRegionConsensusProtocolClass**: Consensus protocol class for SchemaRegions. -6. **ConfigNodeConsensusProtocolClass**: Consensus protocol class for ConfigNodes. -7. **TimePartitionOrigin**: The starting timestamp of database time partitions. -8. **TimePartitionInterval**: The interval of database time partitions (in milliseconds). -9. **ReadConsistencyLevel**: The consistency level for read operations. -10. **SchemaRegionPerDataNode**: Number of SchemaRegions per DataNode. -11. **DataRegionPerDataNode**: Number of DataRegions per DataNode. -12. **SeriesSlotNum**: Number of SeriesSlots per DataRegion. -13. **SeriesSlotExecutorClass**: Implementation class for SeriesSlots. -14. **DiskSpaceWarningThreshold**: Disk space warning threshold (in percentage). -15. **TimestampPrecision**: Timestamp precision. 
- -**Example**: - -```SQL -IoTDB> SHOW VARIABLES; -``` - -**Result**: - -```Plain -+----------------------------------+-----------------------------------------------------------------+ -| Variable| Value| -+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - -### 1.4 Viewing the Current Timestamp of Database - -**Description**: Returns the current timestamp of the database. - -**Syntax**: - -```SQL -showCurrentTimestampStatement - : SHOW CURRENT_TIMESTAMP - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CURRENT_TIMESTAMP; -``` - -**Result**: - -```Plain -+-----------------------------+ -| CurrentTimestamp| -+-----------------------------+ -|2025-02-17T11:11:52.987+08:00| -+-----------------------------+ -``` - -### 1.5 Viewing Executing Queries - -**Description**: Displays information about all currently executing queries. - -**Syntax**: - -```SQL -showQueriesStatement - : SHOW (QUERIES | QUERY PROCESSLIST) - (WHERE where=booleanExpression)? - (ORDER BY sortItem (',' sortItem)*)? - limitOffsetClause - ; -``` - -**Parameters**: - -1. **WHERE Clause**: Filters the result set based on specified conditions. -2. 
**ORDER BY Clause**: Sorts the result set based on specified columns. -3. **limitOffsetClause**: Limits the number of rows returned. - 1. Format: `LIMIT , `. - -**Columns in QUERIES Table**: - -- **time**: Timestamp when the query started. -- **queryid**: Unique ID of the query. -- **datanodeid**: ID of the DataNode executing the query. -- **elapsedtime**: Time elapsed since the query started (in seconds). -- **statement**: The SQL statement being executed. - -**Example**: - -```SQL -IoTDB> SHOW QUERIES WHERE elapsedtime > 0.003 -``` - -**Result**: - -```SQL -+-----------------------------+-----------------------+----------+-----------+--------------------------------------+ -| Time| QueryId|DataNodeId|ElapsedTime| Statement| -+-----------------------------+-----------------------+----------+-----------+--------------------------------------+ -|2025-05-09T15:16:01.293+08:00|20250509_071601_00015_1| 1| 0.006|SHOW QUERIES WHERE elapsedtime > 0.003| -+-----------------------------+-----------------------+----------+-----------+--------------------------------------+ -``` - -### 1.6 Viewing Region Information - -**Description**: Displays regions' information of the current cluster. 
- -**Syntax**: - -```SQL -showRegionsStatement - : SHOW REGIONS - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW REGIONS -``` - -**Result**: - -```SQL -+--------+------------+-------+-------------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -|RegionId| Type| Status| Database|SeriesSlotNum|TimeSlotNum|DataNodeId|RpcAddress|RpcPort|InternalAddress| Role| CreateTime|TsFileSize| -+--------+------------+-------+-------------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -| 9|SchemaRegion|Running|root.__system| 21| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.555| | -| 10| DataRegion|Running|root.__system| 21| 21| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.556| 8.27 KB| -| 65|SchemaRegion|Running| root.ln| 1| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-25T14:46:50.113| | -| 66| DataRegion|Running| root.ln| 1| 1| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-25T14:46:50.425| 524 B| -+--------+------------+-------+-------------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -``` - -### 1.7 Viewing Available Nodes - -**Description**: Returns the RPC addresses and ports of all available DataNodes in the current cluster. Note: A DataNode is considered "available" if it is not in the REMOVING state. - -> This feature is supported starting from v2.0.8. - -**Syntax**: - -```SQL -showAvailableUrlsStatement - : SHOW AVAILABLE URLS - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW AVAILABLE URLS -``` - -**Result**: - -```SQL -+----------+-------+ -|RpcAddress|RpcPort| -+----------+-------+ -| 0.0.0.0| 6667| -+----------+-------+ -``` - -### 1.8 View Service Information - -**Description**: Returns service information (MQTT service, REST service) on all active DataNodes (in RUNNING or READ-ONLY state) in the current cluster. 
- -> This feature is supported starting from v2.0.8.2. - -#### Syntax: -```sql -showServicesStatement - : SHOW SERVICES - ; -``` - -#### Examples: -```sql -IoTDB> SHOW SERVICES -IoTDB> SHOW SERVICES ON 1 -``` - -Execution result: -```sql -+--------------+-------------+---------+ -| Service Name | DataNode ID | State | -+--------------+-------------+---------+ -| MQTT | 1 | STOPPED | -| REST | 1 | RUNNING | -+--------------+-------------+---------+ -``` - -### 1.9 View Cluster Activation Status - -**Description**:Returns the activation status of the current cluster. - -#### Syntax: - -```SQL -showActivationStatement - : SHOW ACTIVATION - ; -``` - -#### Examples: - -```SQL -IoTDB> SHOW ACTIVATION -``` - -Execution result: - -```SQL -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -### 1.10 View Disk Space Usage - -**Description**: Returns the disk space usage of the specified `pattern`, including the size of ChunkGroups and the size of Metadata. - -**Note**: Statistics are based on the actual size of data in TsFiles; therefore, deletions made via `mods` are not considered. - -> Supported since version 2.0.9.1 - -#### Syntax: -```sql -showDiskUsageStatement - : SHOW DISK_USAGE FROM pathPattern - whereClause? - orderByClause? - rowPaginationClause? - ; -pathPattern - : ROOT (DOT nodeName)* - ; -``` - -**Explanation**: The `pattern` is used to match devices, must start with `ROOT`, and intermediate nodes in the path support `*` or `**`. 
- -#### Result Set - -| Column Name | Column Type | Description | -|---------------|-------------|----------------------------------| -| Database | string | Database name | -| DataNodeId | int32 | DataNode ID | -| RegionId | int32 | Region ID | -| TimePartition | int64 | Time partition ID | -| SizeInBytes | int64 | Disk space occupied (in bytes) | - -#### Example: -```sql -SHOW DISK_USAGE FROM root.ln.**; -``` - -**Execution Result**: -```bash -+--------+----------+--------+-------------+-----------+ -|Database|DataNodeId|RegionId|TimePartition|SizeInBytes| -+--------+----------+--------+-------------+-----------+ -| root.ln| 1| 13| 2932| 203| -+--------+----------+--------+-------------+-----------+ -``` - -## 2. Status Setting - -### 2.1 Setting the Connected Model - -**Description**: Sets the current SQL dialect mode to `Tree` or `Table`. This statement can be used in both tree and table modes. - -**Syntax**: - -```SQL -SET SQL_DIALECT = (TABLE | TREE); -``` - -**Example**: - -```SQL -IoTDB> SET SQL_DIALECT=TREE; -IoTDB> SHOW CURRENT_SQL_DIALECT; -``` - -**Result**: - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TREE| -+-----------------+ -``` - -### 2.2 Updating Configuration Items - -**Description**: Updates configuration items. Changes take effect immediately without restarting if the items support hot modification. - -**Syntax**: - -```SQL -setConfigurationStatement - : SET CONFIGURATION propertyAssignments (ON INTEGER_VALUE)? - ; - -propertyAssignments - : property (',' property)* - ; - -property - : identifier EQ propertyValue - ; - -propertyValue - : DEFAULT - | expression - ; -``` - -**Parameters**: - -1. **propertyAssignments**: A list of properties to update. - 1. Format: `property (',' property)*`. - 2. Values: - - `DEFAULT`: Resets the configuration to its default value. - - `expression`: A specific value (must be a string). -2. **ON INTEGER_VALUE** **(Optional):** Specifies the node ID to update. - 1. 
If not specified or set to a negative value, updates all ConfigNodes and DataNodes. - -**Example**: - -```SQL -IoTDB> SET CONFIGURATION 'disk_space_warning_threshold'='0.05','heartbeat_interval_in_ms'='1000' ON 1; -``` - -### 2.3 Loading Manually Modified Configuration Files - -**Description**: Loads manually modified configuration files and hot-loads the changes. Configuration items that support hot modification take effect immediately. - -**Syntax**: - -```SQL -loadConfigurationStatement - : LOAD CONFIGURATION localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** **(Optional):** - 1. Specifies the scope of configuration loading. - 2. Default: `CLUSTER`. - 3. Values: - - `LOCAL`: Loads configuration only on the DataNode directly connected to the client. - - `CLUSTER`: Loads configuration on all DataNodes in the cluster. - -**Example**: - -```SQL -IoTDB> LOAD CONFIGURATION ON LOCAL; -``` - -### 2.4 Setting the System Status - -**Description**: Sets the system status to either `READONLY` or `RUNNING`. - -**Syntax**: - -```SQL -setSystemStatusStatement - : SET SYSTEM TO (READONLY | RUNNING) localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **RUNNING |** **READONLY**: - 1. **RUNNING**: Sets the system to running mode, allowing both read and write operations. - 2. **READONLY**: Sets the system to read-only mode, allowing only read operations and prohibiting writes. -2. **localOrClusterMode** **(Optional):** - 1. **LOCAL**: Applies the status change only to the DataNode directly connected to the client. - 2. **CLUSTER**: Applies the status change to all DataNodes in the cluster. - 3. **Default**: `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> SET SYSTEM TO READONLY ON CLUSTER; -``` - -## 3. Data Management - -### 3.1 Flushing Data from Memory to Disk - -**Description**: Flushes data from the memory table to disk. 
- -**Syntax**: - -```SQL -flushStatement - : FLUSH identifier? (',' identifier)* booleanValue? localOrClusterMode? - ; - -booleanValue - : TRUE | FALSE - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **identifier** **(Optional):** - 1. Specifies the name of the path to flush. - 2. If not specified, all paths are flushed. - 3. **Multiple Paths**: Multiple path names can be specified, separated by commas (e.g., `FLUSH root.ln, root.lnm.**`). -2. **booleanValue** **(Optional):** - 1. Specifies the type of data to flush. - 2. **TRUE**: Flushes only the sequential memory table. - 3. **FALSE**: Flushes only the unsequential memory table. - 4. **Default**: Flushes both sequential and unsequential memory tables. -3. **localOrClusterMode** **(Optional):** - 1. **ON LOCAL**: Flushes only the memory tables on the DataNode directly connected to the client. - 2. **ON CLUSTER**: Flushes memory tables on all DataNodes in the cluster. - 3. **Default:** `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> FLUSH root.ln TRUE ON LOCAL; -``` - -## 4. Data Repair - -### 4.1 Starting Background Scan and Repair of TsFiles - -**Description**: Starts a background task to scan and repair TsFiles, fixing issues such as timestamp disorder within data files. - -**Syntax**: - -```SQL -startRepairDataStatement - : START REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** **(Optional):** - 1. **ON LOCAL**: Executes the repair task only on the DataNode directly connected to the client. - 2. **ON CLUSTER**: Executes the repair task on all DataNodes in the cluster. - 3. **Default:** `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> START REPAIR DATA ON CLUSTER; -``` - -### 4.2 Pausing Background TsFile Repair Task - -**Description**: Pauses the background repair task. The paused task can be resumed by executing the `START REPAIR DATA` command again. 
- -**Syntax**: - -```SQL -stopRepairDataStatement - : STOP REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** **(Optional):** - 1. **ON LOCAL**: Executes the pause command only on the DataNode directly connected to the client. - 2. **ON CLUSTER**: Executes the pause command on all DataNodes in the cluster. - 3. **Default:** `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> STOP REPAIR DATA ON CLUSTER; -``` - -## 5. Query Termination - -### 5.1 Terminating Queries - -**Description**: Terminates one or more running queries. - -**Syntax**: - -```SQL -killQueryStatement - : KILL (QUERY queryId=string | ALL QUERIES) - ; -``` - -**Parameters**: - -1. **QUERY** **queryId:** Specifies the ID of the query to terminate. - -- To obtain the `queryId`, use the `SHOW QUERIES` command. - -2. **ALL QUERIES:** Terminates all currently running queries. - -**Example**: - -Terminate a specific query: - -```SQL -IoTDB> KILL QUERY 20250108_101015_00000_1; -``` - -Terminate all queries: - -```SQL -IoTDB> KILL ALL QUERIES; -``` - -## 6. Query Debugging - -### 6.1 DEBUG SQL - -**Definition**: Add the `DEBUG` keyword at the beginning of an SQL query statement. During execution, debug logs will be output, including underlying file scan information involved in the query. - -> Supported since V2.0.9.1 - -#### Syntax: -```sql -debugSQLStatement - : DEBUG ? query - ; -``` - -**Description**: -* Log output directory: `logs/log_datanode_query_debug.log` - -#### Example: -1. Execute the following SQL for a DEBUG query -```sql -DEBUG SELECT * FROM root.ln.**; -``` - -2. Check the log content in `log_datanode_query_debug.log` to view the file scan information involved in the query. 
- -```bash -2026-03-24 10:06:18,755 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:159 - Cache miss: root.ln.wf01.wt01.temperature in file: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile -2026-03-24 10:06:18,757 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:160 - Device: root.ln.wf01.wt01, all sensors: [temperature] -2026-03-24 10:06:18,758 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.BloomFilterCache:110 - get bloomFilter from cache where filePath is: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile -2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:227 - Get timeseries: root.ln.wf01.wt01.temperature metadata in file: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile from cache: TimeseriesMetadata{timeSeriesMetadataType=0, chunkMetaDataListDataSize=8, measurementId='temperature', dataType=DOUBLE, statistics=startTime: 1773824951259 endTime: 1773824951259 count: 1 [minValue:12.9,maxValue:12.9,firstValue:12.9,lastValue:12.9,sumValue:12.9], modified=false, isSeq=true, chunkMetadataList=[measurementId: temperature, datatype: DOUBLE, version: 0, Statistics: startTime: 1773824951259 endTime: 1773824951259 count: 1 [minValue:12.9,maxValue:12.9,firstValue:12.9,lastValue:12.9,sumValue:12.9], deleteIntervalList: null]}. 
-2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskChunkMetadataLoader:97 - Modifications size is 0 for file Path: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile -2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskChunkMetadataLoader:109 - After modification Chunk meta data list is: -2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskChunkMetadataLoader:110 - measurementId: temperature, datatype: DOUBLE, version: 0, Statistics: startTime: 1773824951259 endTime: 1773824951259 count: 1 [minValue:12.9,maxValue:12.9,firstValue:12.9,lastValue:12.9,sumValue:12.9], deleteIntervalList: null -2026-03-24 10:06:18,760 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.ChunkCache:167 - get chunk from cache whose key is: ChunkCacheKey{filePath='/home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile', regionId=13, timePartitionId=2932, tsFileVersion=1, compactionVersion=0, offsetOfChunkHeader=27} -2026-03-24 10:06:18,761 [pool-69-IoTDB-ClientRPC-Processor-1$20260324_020618_00052_1] INFO o.a.i.d.q.p.Coordinator:902 - debug select * from root.ln.** -``` diff --git a/src/UserGuide/Master/Tree/User-Manual/Streaming_timecho.md b/src/UserGuide/Master/Tree/User-Manual/Streaming_timecho.md deleted file mode 100644 index 7cad70485..000000000 --- a/src/UserGuide/Master/Tree/User-Manual/Streaming_timecho.md +++ /dev/null @@ -1,854 +0,0 @@ - - -# Stream Computing Framework - -The IoTDB stream processing framework allows users to implement customized stream processing logic, which can monitor and capture storage engine changes, transform changed data, and push transformed data outward. - -We call a data flow processing task a Pipe. 
A stream processing task (Pipe) contains three subtasks: - -- Source task -- Processor task -- Sink task - -The stream processing framework allows users to customize the processing logic of the three subtasks in Java and to process data in a UDF-like manner. -In a Pipe, the three subtasks are executed by three plugins respectively, and the data is processed by these three plugins in turn: -Pipe Source extracts data, Pipe Processor processes data, and Pipe Sink sends data; the final data is sent to an external system. - -**The model of the Pipe task is as follows:** - -![pipe.png](/img/1706778988482.jpg) - -Describing a data flow processing task essentially means describing the properties of the Pipe Source, Pipe Processor and Pipe Sink plugins. -Users can declaratively configure the specific attributes of the three subtasks through SQL statements, and achieve flexible data ETL capabilities by combining different attributes. - -Using the stream processing framework, a complete data link can be built to meet the needs of device-edge-cloud synchronization, off-site disaster recovery, and splitting read and write load across databases. - -## 1. Custom stream processing plugin development - -### 1.1 Programming development dependencies - -It is recommended to use Maven to build the project and add the following dependency in `pom.xml`. Please be careful to select the same dependency version as the IoTDB server version. - -```xml -<dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>pipe-api</artifactId> - <version>1.3.1</version> - <scope>provided</scope> -</dependency> -``` - -### 1.2 Event-driven programming model - -The user programming interface of the stream processing plugin follows the general design of the event-driven programming model. Events are data abstractions in the user programming interface, and the programming interface is decoupled from the specific execution method. Users only need to focus on describing how the system should process an event (data) once it reaches the system. 
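
The decoupling described above can be sketched in a few lines of plain Java. This is only an illustrative sketch, not the IoTDB `pipe-api`: the names `EventDrivenSketch`, `WriteEvent` and `dispatch` are hypothetical and exist purely to show the shape of the model, in which the engine decides how events are delivered and user code only describes what to do with each one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch; none of these types belong to the IoTDB pipe-api. */
public class EventDrivenSketch {

  /** An event is a plain data abstraction: here, one written value. */
  public static final class WriteEvent {
    public final String series;
    public final double value;

    public WriteEvent(String series, double value) {
      this.series = series;
      this.value = value;
    }
  }

  /**
   * The "engine" side: it owns the delivery loop. User logic is passed in
   * as a handler and is invoked once per event, in arrival order.
   */
  public static List<String> dispatch(
      List<WriteEvent> events, Function<WriteEvent, String> handler) {
    List<String> out = new ArrayList<>();
    for (WriteEvent e : events) {
      out.add(handler.apply(e));
    }
    return out;
  }

  public static void main(String[] args) {
    List<WriteEvent> events = Arrays.asList(
        new WriteEvent("root.db.d01.s01", 1.5),
        new WriteEvent("root.db.d01.s02", 2.0));
    // The "user" side: only describes what happens once an event arrives.
    List<String> result = dispatch(events, e -> e.series + "=" + e.value);
    result.forEach(System.out::println);
  }
}
```

Because the handler never drives the loop itself, the same user logic would keep working if the hypothetical engine changed how it batches or schedules events, which is the property the event-driven model is after.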
 - -In the user programming interface of the stream processing plugin, events are an abstraction of database data writing operations. An event is captured by the stand-alone stream processing engine, passed to the PipeSource plugin, PipeProcessor plugin, and PipeSink plugin in sequence according to the three-stage stream processing process, and triggers the execution of user logic in the three plugins in turn. - -To achieve both low latency under low load and high throughput under high load on the end side, the stream processing engine dynamically chooses whether to process operation logs or data files. Therefore, the user programming interface of stream processing requires users to provide processing logic for the following two types of events: the operation log writing event TabletInsertionEvent and the data file writing event TsFileInsertionEvent. - -#### **Operation log writing event (TabletInsertionEvent)** - -The operation log writing event (TabletInsertionEvent) is a high-level data abstraction for user write requests. It gives users the ability to manipulate the underlying data of write requests through a unified operation interface. - -For different database deployment methods, the underlying storage structures corresponding to operation log writing events are different. For stand-alone deployment scenarios, the operation log writing event is an encapsulation of write-ahead log (WAL) entries; for distributed deployment scenarios, the operation log writing event is an encapsulation of a single-node consensus protocol operation log entry. - -For write operations generated by different write request interfaces in the database, the request data structure corresponding to the operation log writing event is also different.
IoTDB provides numerous writing interfaces such as InsertRecord, InsertRecords, InsertTablet, InsertTablets, etc. Each writing request uses a completely different serialization method, and the generated binary entries are also different. - -The existence of operation log writing events provides users with a unified view of data operations, which shields the implementation differences of the underlying data structure, greatly reduces the user's programming threshold, and improves the ease of use of the function. - -```java -/** TabletInsertionEvent is used to define the event of data insertion. */ -public interface TabletInsertionEvent extends Event { - - /** - * The consumer processes the data row by row and collects the results by RowCollector. - * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processRowByRow(BiConsumer<Row, RowCollector> consumer); - - /** - * The consumer processes the Tablet directly and collects the results by RowCollector. - * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processTablet(BiConsumer<Tablet, RowCollector> consumer); -} -``` - -#### **Data file writing event (TsFileInsertionEvent)** - -The data file writing event (TsFileInsertionEvent) is a high-level abstraction of the database file writing operation. It is a data collection of several operation log writing events (TabletInsertionEvent). - -The storage engine of IoTDB is LSM structured. When data is written, the writing operation will first be placed into a log-structured file, and the written data will be stored in the memory at the same time. When the memory reaches the control upper limit, the disk flushing behavior will be triggered, that is, the data in the memory will be converted into a database file, and the previously prewritten operation log will be deleted.
When the data in the memory is converted into the data in the database file, it undergoes two compression steps: encoding compression and general compression. Therefore, the data in the database file takes up less space than the original data in the memory. - -In extreme network conditions, directly transmitting data files is more economical than transmitting data writing operations: it occupies less network bandwidth and achieves faster transmission. Of course, there is no free lunch. Computing and processing data in files requires additional file I/O costs compared to directly computing and processing data in memory. However, it is precisely the existence of these two structures, disk data files and in-memory write operations, each with its own advantages and disadvantages, that gives the system the opportunity to make dynamic trade-offs and adjustments. It is based on this observation that the data file writing event is introduced into the plugin's event model. - -To sum up, the data file writing event appears in the event stream of the stream processing plugin in two situations: - -(1) Historical data extraction: Before a stream processing task starts, all written data that has been persisted to disk exists in the form of TsFile. After the stream processing task starts, the historical data is abstracted as TsFileInsertionEvent when it is collected; - -(2) Real-time data extraction: While a stream processing task is in progress, if operation log writing events in the data stream are processed more slowly than write requests arrive, then beyond a certain backlog the operation log writing events that cannot be processed in time are persisted to disk and exist in the form of TsFile. After this data is extracted by the stream processing engine, it is abstracted as TsFileInsertionEvent. - -```java -/** - * TsFileInsertionEvent is used to define the event of writing TsFile.
Event data is stored on disk, - * compressed and encoded, and requires IO cost for computational processing. - */ -public interface TsFileInsertionEvent extends Event { - - /** - * The method is used to convert the TsFileInsertionEvent into several TabletInsertionEvents. - * - * @return {@code Iterable<TabletInsertionEvent>} the list of TabletInsertionEvent - */ - Iterable<TabletInsertionEvent> toTabletInsertionEvents(); -} -``` - -### 1.3 Custom stream processing plugin programming interface definition - -Based on the custom stream processing plugin programming interface, users can easily write data extraction plugins, data processing plugins and data sending plugins, so that the stream processing function can be flexibly adapted to various industrial scenarios. - -#### Data extraction plugin interface - -Data extraction is the first stage of the three-stage process of stream processing, which includes data extraction, data processing, and data sending. The data extraction plugin (PipeSource) serves as a bridge between the stream processing engine and the storage engine. It captures various data write events by listening to the behavior of the storage engine. - -```java -/** - * PipeSource - * - *

PipeSource is responsible for capturing events from sources. - * - *

Various data sources can be supported by implementing different PipeSource classes. - * - *

The lifecycle of a PipeSource is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of the `WITH SOURCE` clause in SQL are - * parsed and the validation method {@link PipeSource#validate(PipeParameterValidator)} will - * be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} will be called to - * configure the runtime behavior of the PipeSource. - *
  • Then the method {@link PipeSource#start()} will be called to start the PipeSource. - *
  • While the collaboration task is in progress, the method {@link PipeSource#supply()} will be - * called to capture events from sources and then the events will be passed to the - * PipeProcessor. - *
  • The method {@link PipeSource#close()} will be called when the collaboration task is - * cancelled (the `DROP PIPE` command is executed). - *
- */ -public interface PipeSource extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSource. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeSourceRuntimeConfiguration. - *
- * - *

This method is called after the method {@link PipeSource#validate(PipeParameterValidator)} - * is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSource - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSourceRuntimeConfiguration configuration) - throws Exception; - - /** - * Start the Source. After this method is called, events should be ready to be supplied by - * {@link PipeSource#supply()}. This method is called after {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @throws Exception the user can throw errors if necessary - */ - void start() throws Exception; - - /** - * Supply single event from the Source and the caller will send the event to the processor. - * This method is called after {@link PipeSource#start()} is called. - * - * @return the event to be supplied. the event may be null if the Source has no more events at - * the moment, but the Source is still running for more events. - * @throws Exception the user can throw errors if necessary - */ - Event supply() throws Exception; -} -``` - -#### Data processing plugin interface - -Data processing is the second stage of the three-stage process of stream processing, which includes data extraction, data processing, and data sending. The data processing plugin (PipeProcessor) is primarily used for filtering and transforming the various events captured by the data extraction plugin (PipeSource). - -```java -/** - * PipeProcessor - * - *

PipeProcessor is used to filter and transform the Event formed by the PipeSource. - * - *

The lifecycle of a PipeProcessor is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH PROCESSOR` clause in SQL are - * parsed and the validation method {@link PipeProcessor#validate(PipeParameterValidator)} - * will be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} will be called - * to configure the runtime behavior of the PipeProcessor. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeSource captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the events and then passes them to the PipeSink. The - * following 3 methods will be called: {@link - * PipeProcessor#process(TabletInsertionEvent, EventCollector)}, {@link - * PipeProcessor#process(TsFileInsertionEvent, EventCollector)} and {@link - * PipeProcessor#process(Event, EventCollector)}. - *
    • PipeSink serializes the events into binaries and sends them to sinks. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeProcessor#close() } method will be called. - *
- */ -public interface PipeProcessor extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeProcessor. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeProcessorRuntimeConfiguration. - *
- * - *

This method is called after the method {@link - * PipeProcessor#validate(PipeParameterValidator)} is called and before the beginning of the - * events processing. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeProcessor - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeProcessorRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is called to process the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(TabletInsertionEvent tabletInsertionEvent, EventCollector eventCollector) - throws Exception; - - /** - * This method is called to process the TsFileInsertionEvent. - * - * @param tsFileInsertionEvent TsFileInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - default void process(TsFileInsertionEvent tsFileInsertionEvent, EventCollector eventCollector) - throws Exception { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - process(tabletInsertionEvent, eventCollector); - } - } - - /** - * This method is called to process the Event. - * - * @param event Event to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(Event event, EventCollector eventCollector) throws Exception; -} -``` - -#### Data sending plugin interface - -Data sending is the third stage of the three-stage process of stream processing, which includes data extraction, data processing, and data sending. 
The data sending plugin (PipeSink) is responsible for sending the various events processed by the data processing plugin (PipeProcessor). It serves as the network implementation layer of the stream processing framework and should support multiple real-time communication protocols and connectors in its interface. - -```java -/** - * PipeSink - * - *

PipeSink is responsible for sending events to sinks. - * - *

Various network protocols can be supported by implementing different PipeSink classes. - * - *

The lifecycle of a PipeSink is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH SINK` clause in SQL are - * parsed and the validation method {@link PipeSink#validate(PipeParameterValidator)} will be - * called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link PipeSink#customize(PipeParameters, - * PipeSinkRuntimeConfiguration)} will be called to configure the runtime behavior of the - * PipeSink and the method {@link PipeSink#handshake()} will be called to create a connection - * with sink. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeSource captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the events and then passes them to the PipeSink. - *
    • PipeSink serializes the events into binaries and sends them to sinks. The following 3 - * methods will be called: {@link PipeSink#transfer(TabletInsertionEvent)}, {@link - * PipeSink#transfer(TsFileInsertionEvent)} and {@link PipeSink#transfer(Event)}. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeSink#close() } method will be called. - *
- * - *

In addition, the method {@link PipeSink#heartbeat()} will be called periodically to check - * whether the connection with sink is still alive. The method {@link PipeSink#handshake()} will be - * called to create a new connection with the sink when the method {@link PipeSink#heartbeat()} - * throws exceptions. - */ -public interface PipeSink extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSink. In this method, the user can do the following - * things: - * - *

    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeSinkRuntimeConfiguration. - *
- * - *

This method is called after the method {@link PipeSink#validate(PipeParameterValidator)} is - * called and before the method {@link PipeSink#handshake()} is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSink - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSinkRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is used to create a connection with sink. This method will be called after the - * method {@link PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called or - * will be called when the method {@link PipeSink#heartbeat()} throws exceptions. - * - * @throws Exception if the connection is failed to be created - */ - void handshake() throws Exception; - - /** - * This method will be called periodically to check whether the connection with sink is still - * alive. - * - * @throws Exception if the connection dies - */ - void heartbeat() throws Exception; - - /** - * This method is used to transfer the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(TabletInsertionEvent tabletInsertionEvent) throws Exception; - - /** - * This method is used to transfer the TsFileInsertionEvent. 
- * - * @param tsFileInsertionEvent TsFileInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - default void transfer(TsFileInsertionEvent tsFileInsertionEvent) throws Exception { - try { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - transfer(tabletInsertionEvent); - } - } finally { - tsFileInsertionEvent.close(); - } - } - - /** - * This method is used to transfer the generic events, including HeartbeatEvent. - * - * @param event Event to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(Event event) throws Exception; -} -``` - -## 2. Custom stream processing plugin management - -In order to ensure the flexibility and ease of use of user-defined plugins in actual production, the system also needs to provide the ability to dynamically and uniformly manage plugins. -The stream processing plugin management statements introduced in this chapter provide an entry point for dynamic unified management of plugins. - -### 2.1 Load plugin statement - -In IoTDB, if you want to dynamically load a user-defined plugin in the system, you first need to implement a specific plugin class based on PipeSource, PipeProcessor or PipeSink. -Then the plugin class needs to be compiled and packaged into a jar executable file, and finally the plugin is loaded into IoTDB using the management statement for loading the plugin. - -The syntax of the management statement for loading the plugin is shown in the figure. - -```sql -CREATE PIPEPLUGIN [IF NOT EXISTS] -AS -USING -``` -**IF NOT EXISTS semantics**: Used in creation operations to ensure that the create command is executed when the specified Pipe Plugin does not exist, preventing errors caused by attempting to create an existing Pipe Plugin. 
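To make the "implement a specific plugin class" step more concrete, here is a minimal sketch of a processor that forwards every event unchanged. So that it compiles on its own, it declares simplified stand-ins (`SketchEvent`, `SketchCollector`, `SketchProcessor`) instead of importing pipe-api; these names are hypothetical. A real plugin would implement `PipeProcessor` from the pipe-api dependency, with the method signatures listed in section 1.3, before being packaged into a JAR.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins so the sketch compiles on its own; NOT the real pipe-api types.
interface SketchEvent {}

interface SketchCollector {
  void collect(SketchEvent event);
}

interface SketchProcessor extends AutoCloseable {
  void process(SketchEvent event, SketchCollector collector) throws Exception;
}

// A processor that forwards every event unchanged and counts what it has seen.
final class PassThroughProcessor implements SketchProcessor {
  private int processed = 0;
  private boolean closed = false;

  @Override
  public void process(SketchEvent event, SketchCollector collector) {
    processed++;
    collector.collect(event); // hand the event on to the next stage
  }

  @Override
  public void close() {
    closed = true; // a real plugin would release its resources here
  }

  int processedCount() {
    return processed;
  }

  boolean isClosed() {
    return closed;
  }
}

final class PassThroughDemo {
  // Pushes n events through the processor and returns how many reached the collector.
  static int run(int n) {
    List<SketchEvent> out = new ArrayList<>();
    try (PassThroughProcessor processor = new PassThroughProcessor()) {
      for (int i = 0; i < n; i++) {
        processor.process(new SketchEvent() {}, out::add);
      }
    }
    return out.size();
  }
}
```

The try-with-resources block mirrors the lifecycle in which `close()` is invoked once the task is dropped; in the real framework the engine, not user code, drives these calls.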
 - -Example: Suppose you have implemented a data processing plugin whose fully qualified class name is edu.tsinghua.iotdb.pipe.ExampleProcessor, packaged it as pipe-plugin.jar, and want to use it in the stream processing engine under the name example. There are two ways to provide the plugin package: upload it to a URI server, or place it in a local directory of the cluster. - -Method 1: Upload to the URI server - -Preparation: To register in this way, you need to upload the JAR package to the URI server in advance and ensure that the IoTDB instance that executes the registration statement can access the URI server. For example https://example.com:8080/iotdb/pipe-plugin.jar . - -SQL: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI 'https://example.com:8080/iotdb/pipe-plugin.jar' -``` - -Method 2: Upload the data to the local directory of the cluster - -Preparation: To register in this way, you need to place the JAR package in any path on the machine where the DataNode is located. We recommend placing it in the /ext/pipe directory of the IoTDB installation path (the directory already exists in the installation package, so you do not need to create it). For example: iotdb-1.x.x-bin/ext/pipe/pipe-plugin.jar. **(Note: If you are using a cluster, you need to place the JAR package under the same path on the machine of every DataNode)** - -SQL: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI -``` - -### 2.2 Delete plugin statement - -When a user no longer wants to use a plugin and needs to uninstall it from the system, they can use the delete plugin statement shown below.
- -```sql -DROP PIPEPLUGIN [IF EXISTS] -``` - -**IF EXISTS semantics**: Used in deletion operations to ensure that when a specified Pipe Plugin exists, the delete command is executed to prevent errors caused by attempting to delete a non-existent Pipe Plugin. - -### 2.3 View plugin statements - -Users can also view plugins in the system on demand. View the statement of the plugin as shown in the figure. -```sql -SHOW PIPEPLUGINS -``` - -## 3. System preset stream processing plugin - -### 3.1 Pre-built Source Plugin - -#### iotdb-source - -Function: Extract historical or realtime data inside IoTDB into pipe. - - -| key | value | value range | required or optional with default | -|---------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------|-----------------------------------| -| source | iotdb-source | String: iotdb-source | required | -| source.pattern | path prefix for filtering time series | String: any time series prefix | optional: root | -| source.history.start-time | start of synchronizing historical data event time,including start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| source.history.end-time | end of synchronizing historical data event time,including end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| source.forwarding-pipe-requests | Whether to forward data written by another Pipe (usually Data Sync) | Boolean: true, false | optional:true | -| start-time(V1.3.1+) | start of synchronizing all data event time,including start-time. Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| end-time(V1.3.1+) | end of synchronizing all data event time,including end-time. 
Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| source.realtime.mode | Extraction mode for real-time data | String: hybrid, stream, batch | optional:hybrid | -| source.forwarding-pipe-requests | Whether to forward data written by another Pipe (usually Data Sync) | Boolean: true, false | optional:true | - -> 🚫 **source.pattern Parameter Description** -> -> * Pattern should use backquotes to modify illegal characters or illegal path nodes, for example, if you want to filter root.\`a@b\` or root.\`123\`, you should set the pattern to root.\`a@b\` or root.\`123\`(Refer specifically to [Timing of single and double quotes and backquotes](https://iotdb.apache.org/Download/)) -> * In the underlying implementation, when pattern is detected as root (default value) or a database name, synchronization efficiency is higher, and any other format will reduce performance. -> * The path prefix does not need to form a complete path. For example, when creating a pipe with the parameter 'source.pattern'='root.aligned.1': - > - > * root.aligned.1TS - > * root.aligned.1TS.\`1\` - > * root.aligned.100TS - > - > the data will be synchronized; - > - > * root.aligned.\`1\` -> * root.aligned.\`123\` - > - > the data will not be synchronized. - -> ❗️**start-time, end-time parameter description of source** -> -> * start-time, end-time should be in ISO format, such as 2011-12-03T10:15:30 or 2011-12-03T10:15:30+01:00. However, version 1.3.1+ supports timeStamp format like 1706704494000. - -> ✅ **A piece of data from production to IoTDB contains two key concepts of time** -> -> * **event time:** The time when the data is actually produced (or the generation time assigned to the data by the data production system, which is the time item in the data point), also called event time. -> * **arrival time:** The time when data arrives in the IoTDB system. 
-> -> The out-of-order data we often refer to is data whose **event time** is far behind the current system time (or the maximum **event time** that has already been flushed to disk) when the data arrives. On the other hand, whether data is out-of-order or sequential, as long as it newly arrives in the system, its **arrival time** increases with the order in which the data arrives at IoTDB. - -> 💎 **The work of iotdb-source can be split into two stages** -> -> 1. Historical data extraction: All data with **arrival time** < **current system time** when the pipe is created is called historical data -> 2. Realtime data extraction: All data with **arrival time** >= **current system time** when the pipe is created is called realtime data -> -> The historical data transmission phase and the realtime data transmission phase are executed serially. The realtime data transmission phase starts only after the historical data transmission phase is completed. - -> 📌 **source.realtime.mode: Data extraction mode** -> -> * log: In this mode, the task only uses the operation log for data processing and sending. -> * file: In this mode, the task only uses data files for data processing and sending. -> * hybrid: This mode takes into account that sending data entry by entry from the operation log has low latency but low throughput, while sending data files in batches has high throughput but high latency. It automatically switches to the appropriate data extraction method under different write loads. It first uses the extraction method based on operation logs to ensure low sending latency. When a data backlog occurs, it automatically switches to the extraction method based on data files to ensure high sending throughput. When the backlog is eliminated, it automatically switches back to the extraction method based on operation logs.
This automatic switching avoids the difficulty of balancing sending latency against throughput that a single data extraction method would have. - -> 🍕 **source.forwarding-pipe-requests: Whether to allow forwarding data transmitted from another pipe** -> -> * If you want to use pipes to build data synchronization of A -> B -> C, then the pipe of B -> C needs to set this parameter to true, so that the data written by A to B through the pipe A -> B can be forwarded correctly to C -> * If you want to use pipes to build two-way data synchronization (dual-active) of A \<-> B, then the pipes of A -> B and B -> A need to set this parameter to false, otherwise the data will be forwarded back and forth between the clusters endlessly - -### 3.2 Preset processor plugin - -#### do-nothing-processor - -Function: No processing is done on the events passed in by the source. - - -| key | value | value range | required or optional with default | -|-----------|----------------------|------------------------------|-----------------------------------| -| processor | do-nothing-processor | String: do-nothing-processor | required | - -### 3.3 Preset sink plugin - -#### do-nothing-sink - -Function: No processing is done on the events passed in by the processor. - -| key | value | value range | required or optional with default | -|------|-----------------|-------------------------|-----------------------------------| -| sink | do-nothing-sink | String: do-nothing-sink | required | - -## 4. Stream processing task management - -### 4.1 Create a stream processing task - -Use the `CREATE PIPE` statement to create a stream processing task.
Taking the creation of a data synchronization stream processing task as an example, the sample SQL statement is as follows: - -```sql -CREATE PIPE <PipeId> -- PipeId is the name that uniquely identifies the sync task -WITH SOURCE ( - -- Default IoTDB Data Extraction Plugin - 'source' = 'iotdb-source', - -- Path prefix, only data that can match the path prefix will be extracted for subsequent processing and delivery - 'source.pattern' = 'root.timecho', - -- Whether to extract historical data - 'source.history.enable' = 'true', - -- Describes the time range of the historical data being extracted, indicating the earliest possible time - 'source.history.start-time' = '2011-12-03T10:15:30+01:00', - -- Describes the time range of the extracted historical data, indicating the latest time - 'source.history.end-time' = '2022-12-03T10:15:30+01:00', - -- Whether to extract realtime data - 'source.realtime.enable' = 'true', -) -WITH PROCESSOR ( - -- Default data processing plugin, means no processing - 'processor' = 'do-nothing-processor', -) -WITH SINK ( - -- IoTDB data sending plugin with target IoTDB - 'sink' = 'iotdb-thrift-sink', - -- Data service IP of one of the DataNode nodes of the target IoTDB - 'sink.ip' = '127.0.0.1', - -- Data service port of one of the DataNode nodes of the target IoTDB - 'sink.port' = '6667', -) -``` - -**When creating a stream processing task, you need to configure the PipeId and the parameters of the three plugin parts:** - -| Configuration | Description | Required or not | Default implementation | Default implementation description | Custom implementation allowed | -|---------------|-----------------------------------------------------------------------------------------------------|---------------------------------|------------------------|---------------------------------------------------------------------------------------------------------------------------|------------------------------------| -| PipeId | A globally unique name that
identifies a stream processing | Required | - | - | - | -| source | Pipe Source plugin, responsible for extracting stream processing data at the bottom of the database | Optional | iotdb-source | Integrate the full historical data of the database and subsequent real-time data arriving into the stream processing task | No | -| processor | Pipe Processor plugin, responsible for processing data | Optional | do-nothing-processor | Does not do any processing on the incoming data | Yes | -| sink | Pipe Sink plugin, responsible for sending data | Required | - | - | Yes | - -In the example, the iotdb-source, do-nothing-processor and iotdb-thrift-sink plugins are used to build the data flow processing task. IoTDB also has other built-in stream processing plugins, **please check the "System Preset Stream Processing plugin" section**. - -**A simplest example of the CREATE PIPE statement is as follows:** - -```sql -CREATE PIPE -- PipeId is a name that uniquely identifies the stream processing task -WITH SINK ( - -- IoTDB data sending plugin, the target is IoTDB - 'sink' = 'iotdb-thrift-sink', - --The data service IP of one of the DataNode nodes in the target IoTDB - 'sink.ip' = '127.0.0.1', - -- The data service port of one of the DataNode nodes in the target IoTDB - 'sink.port' = '6667', -) -``` - -The semantics expressed are: synchronize all historical data in this database instance and subsequent real-time data arriving to the IoTDB instance with the target 127.0.0.1:6667. - -**Notice:** - -- SOURCE and PROCESSOR are optional configurations. If you do not fill in the configuration parameters, the system will use the corresponding default implementation. -- SINK is a required configuration and needs to be configured declaratively in the CREATE PIPE statement -- SINK has self-reuse capability. 
For different stream processing tasks, if their SINKs have the same KV attributes (all keys and their corresponding values are identical), the system creates only one SINK instance so that connection resources are reused. - - - For example, there are the following declarations of two stream processing tasks, pipe1 and pipe2: - - ```sql - CREATE PIPE pipe1 - WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'sink.ip' = 'localhost', - 'sink.port' = '9999', - ) - - CREATE PIPE pipe2 - WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'sink.port' = '9999', - 'sink.ip' = 'localhost', - ) - ``` - -- Because their declarations of SINK are exactly the same (**even if the order of declaration of some attributes is different**), the framework will automatically reuse the SINKs they declared, and ultimately the SINKs of pipe1 and pipe2 will be the same instance. -- When the source is the default iotdb-source and source.forwarding-pipe-requests has its default value true, do not build an application scenario that includes cyclic data synchronization (it will cause an infinite loop): - - - IoTDB A -> IoTDB B -> IoTDB A - - IoTDB A -> IoTDB A - -### 4.2 Start the stream processing task - -After the CREATE PIPE statement executes successfully, the instances related to the stream processing task are created. In V1.3.0, the running status of the task is then set to STOPPED, that is, the task does not process data immediately. In V1.3.1 and later, the status of the task is set to RUNNING after CREATE.
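
The creation rule above, together with the START PIPE, STOP PIPE, and DROP PIPE statements of this chapter, can be pictured as a small state machine. The following is a minimal sketch under stated assumptions: the class `PipeLifecycleSketch` and its methods are hypothetical names used only for illustration and are not part of IoTDB.

```java
/** Hypothetical model of the pipe life cycle; illustration only, not IoTDB source code. */
final class PipeLifecycleSketch {

  /** The three states named in this chapter. */
  enum State { RUNNING, STOPPED, DROPPED }

  private State state;

  /** V1.3.0 creates a pipe as STOPPED; V1.3.1 and later create it as RUNNING. */
  PipeLifecycleSketch(boolean v131OrLater) {
    this.state = v131OrLater ? State.RUNNING : State.STOPPED;
  }

  State state() {
    return state;
  }

  /** Models START PIPE: the task begins processing data. */
  void start() {
    requireNotDropped();
    state = State.RUNNING;
  }

  /** Models STOP PIPE: the task stops processing data. */
  void stop() {
    requireNotDropped();
    state = State.STOPPED;
  }

  /** Models DROP PIPE: allowed from RUNNING or STOPPED, no prior STOP needed. */
  void drop() {
    requireNotDropped();
    state = State.DROPPED;
  }

  private void requireNotDropped() {
    if (state == State.DROPPED) {
      throw new IllegalStateException("pipe has been dropped");
    }
  }

  public static void main(String[] args) {
    PipeLifecycleSketch pipe = new PipeLifecycleSketch(false); // V1.3.0 behavior
    System.out.println(pipe.state()); // STOPPED
    pipe.start();
    System.out.println(pipe.state()); // RUNNING
    pipe.drop();
    System.out.println(pipe.state()); // DROPPED
  }
}
```

The sketch leaves out the automatic RUNNING to STOPPED transition that occurs on an unrecoverable error; it only covers the transitions a user triggers with statements.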
- -You can use the START PIPE statement to make a stream processing task start processing data: - -```sql -START PIPE <PipeId> -``` - -### 4.3 Stop the stream processing task - -Use the STOP PIPE statement to stop the stream processing task from processing data: - -```sql -STOP PIPE <PipeId> -``` - -### 4.4 Delete stream processing tasks - -Use the DROP PIPE statement to stop the stream processing task from processing data (when the stream processing task status is RUNNING), and then delete the entire stream processing task: - -```sql -DROP PIPE <PipeId> -``` - -Users do not need to perform a STOP operation before deleting the stream processing task. - -### 4.5 Display stream processing tasks - -Use the SHOW PIPES statement to view all stream processing tasks: - -```sql -SHOW PIPES -``` - -The query results are as follows: - -```sql -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -| ID| CreationTime| State|PipeSource|PipeProcessor|PipeSink|ExceptionMessage| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -|iotdb-kafka|2022-03-30T20:58:30.689|RUNNING| ...| ...| ...| {}| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -|iotdb-iotdb|2022-03-31T12:55:28.129|STOPPED| ...| ...| ...| TException: ...| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -``` - -Specify `<PipeId>` to view the status of a single stream processing task: - -```sql -SHOW PIPE <PipeId> -``` - -You can also use the WHERE clause to determine whether the Pipe Sink used by a certain `<PipeId>` is reused.
- -```sql -SHOW PIPES -WHERE SINK USED BY <PipeId> -``` - -### 4.6 Stream processing task running status migration - -A stream processing pipe passes through several states during its managed life cycle: - -- **RUNNING:** The pipe is working properly. - - When a pipe is successfully created, its initial state is RUNNING (V1.3.1+). -- **STOPPED:** The pipe is stopped. A pipe can be in this state for several reasons: - - When a pipe is successfully created, its initial state is STOPPED (V1.3.0). - - The user manually stops a pipe that is running normally, and its status changes from RUNNING to STOPPED. - - When an unrecoverable error occurs while a pipe is running, its status automatically changes from RUNNING to STOPPED. -- **DROPPED:** The pipe task has been permanently deleted. - -The following diagram shows all states and state transitions: - -![State migration diagram](/img/%E7%8A%B6%E6%80%81%E8%BF%81%E7%A7%BB%E5%9B%BE.png) - -## 5. Authority management - -### 5.1 Stream processing tasks - - -| Permission name | Description | -|-----------------|------------------------------------------------------------| -| USE_PIPE | Register a stream processing task. The path is irrelevant. | -| USE_PIPE | Start the stream processing task. The path is irrelevant. | -| USE_PIPE | Stop the stream processing task. The path is irrelevant. | -| USE_PIPE | Drop stream processing tasks. The path is irrelevant. | -| USE_PIPE | Query stream processing tasks. The path is irrelevant. | - -### 5.2 Stream processing task plugin - - -| Permission name | Description | -|-----------------|----------------------------------------------------------------------| -| USE_PIPE | Register stream processing task plugin. The path is irrelevant. | -| USE_PIPE | Uninstall the stream processing task plugin. The path is irrelevant. | -| USE_PIPE | Query stream processing task plugin. The path is irrelevant. | - -## 6.
Configuration parameters - -In iotdb-system.properties: - -V1.3.0+: -```Properties -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_connector_timeout_ms=900000 - -# The maximum number of selectors that can be used in the async connector. -# pipe_async_connector_selector_number=1 - -# The core number of clients that can be used in the async connector. -# pipe_async_connector_core_client_number=8 - -# The maximum number of clients that can be used in the async connector. -# pipe_async_connector_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` - -V1.3.1+: -```Properties -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. 
-# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` diff --git a/src/UserGuide/Master/Tree/User-Manual/Tiered-Storage_timecho.md b/src/UserGuide/Master/Tree/User-Manual/Tiered-Storage_timecho.md deleted file mode 100644 index 5d4c4a1dd..000000000 --- a/src/UserGuide/Master/Tree/User-Manual/Tiered-Storage_timecho.md +++ /dev/null @@ -1,100 +0,0 @@ - - -# Tiered Storage -## 1. Overview - -The Tiered storage functionality allows users to define multiple layers of storage, spanning across multiple types of storage media (Memory mapped directory, SSD, rotational hard discs or cloud storage). While memory and cloud storage is usually singular, the local file system storages can consist of multiple directories joined together into one tier. Meanwhile, users can classify data based on its hot or cold nature and store data of different categories in specified "tier". Currently, IoTDB supports the classification of hot and cold data through TTL (Time to live / age) of data. 
When the data in one tier does not meet the TTL rules defined in the current tier, the data will be automatically migrated to the next tier. - -## 2. Parameter Definition - -To enable tiered storage in IoTDB, you need to configure the following aspects: - -1. configure the data catalogue and divide the data catalogue into different tiers -2. configure the TTL of the data managed in each tier to distinguish between hot and cold data categories managed in different tiers. -3. configure the minimum remaining storage space ratio for each tier so that when the storage space of the tier triggers the threshold, the data of the tier will be automatically migrated to the next tier (optional). - -The specific parameter definitions and their descriptions are as follows. - -| Configuration | Default | Required | Description | Constraint | -| --------------------------------------- | ------------------------ | --- | ------------------------------------------------------------ | ------------------------------------------------------------ | -| dn_data_dirs | data/datanode/data | Yes | specify different storage directories and divide the storage directories into tiers | Each level of storage uses a semicolon to separate, and commas to separate within a single level; cloud (OBJECT_STORAGE) configuration can only be used as the last level of storage and the first level can't be used as cloud storage; a cloud object at most; the remote storage directory is denoted by OBJECT_STORAGE | -| tier_ttl_in_ms | -1 | Yes | Define the maximum age of data for which each tier is responsible | Each level of storage is separated by a semicolon; the number of levels should match the number of levels defined by dn_data_dirs;"-1" means "unlimited". | -| dn_default_space_usage_thresholds | 0.85 | Yes | Define the maximum storage usage threshold ratio for each tier of data directories. When the used space exceeds this ratio, the data will be automatically migrated to the next tier. 
If the storage usage of the last tier surpasses this threshold, the system will be set to ​​READ_ONLY​​ mode. | Each level of storage is separated by a semicolon; the number of levels should match the number of levels defined by dn_data_dirs | -| object_storage_type | `AWS_S3` | Required when using remote storage | Cloud storage type. | all `AWS_S3` is supported. | -| object_storage_bucket | iotdb_data | Required when using remote storage | Name of cloud storage bucket | Bucket definition in AWS S3 | -| object_storage_endpoint | | Required when using remote storage | endpoint of cloud storage | endpoint of AWS S3 | -| object_storage_region | (Empty) | Required when using remote storage | Cloud storage Region. | | -| object_storage_access_key | | Required when using remote storage | Authentication information stored in the cloud: key | AWS S3 credential key | -| object_storage_access_secret | | Required when using remote storage | Authentication information stored in the cloud: secret | AWS S3 credential secret | -| enable_path_style_access | false | No | Whether to enable path style access for object storage service. | | -| remote_tsfile_cache_dirs | data/datanode/data/cache | No | Cache directory stored locally in the cloud | | -| remote_tsfile_cache_page_size_in_kb | 20480 | No | Block size of locally cached files stored in the cloud | | -| remote_tsfile_cache_max_disk_usage_in_mb | 51200 | No | Maximum Disk Occupancy Size for Cloud Storage Local Cache | | - -## 3. local tiered storag configuration example - -The following is an example of a local two-level storage configuration. 
- -```JavaScript -//Required configuration items -dn_data_dirs=/data1/data;/data2/data,/data3/data; -tier_ttl_in_ms=86400000;-1 -dn_default_space_usage_thresholds=0.2;0.1 -``` - -In this example, two levels of storage are configured, specifically: - -| **tier** | **data path** | **data range** | **threshold for minimum remaining disk space** | -| -------- | -------------------------------------- | --------------- | ------------------------ | -| tier 1 | path 1:/data1/data | data from the last 1 day | 20% | -| tier 2 | path 1:/data2/data path 2:/data3/data | data older than 1 day | 10% | - -## 4. Remote tiered storage configuration example - -The following takes three-level storage as an example: - -```JavaScript -//Required configuration items -dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE -tier_ttl_in_ms=86400000;864000000;-1 -dn_default_space_usage_thresholds=0.2;0.15;0.1 -object_storage_type=AWS_S3 -object_storage_bucket=iotdb -object_storage_region= -object_storage_endpoint= -object_storage_access_key= -object_storage_access_secret= - -// Optional configuration items -enable_path_style_access=false -remote_tsfile_cache_dirs=data/datanode/data/cache -remote_tsfile_cache_page_size_in_kb=20971520 -remote_tsfile_cache_max_disk_usage_in_mb=53687091200 -``` - -In this example, a total of three levels of storage are configured, specifically: - -| **tier** | **data path** | **data range** | **threshold for minimum remaining disk space** | -| -------- | -------------------------------------- | ---------------------------- | ------------------------ | -| tier 1 | path 1:/data1/data | data from the last 1 day | 20% | -| tier 2 | path 1:/data2/data path 2:/data3/data | data between 1 day and 10 days old | 15% | -| tier 3 | S3 Cloud Storage | data older than 10 days | 10% | diff --git a/src/UserGuide/Master/Tree/User-Manual/User-defined-function_timecho.md b/src/UserGuide/Master/Tree/User-Manual/User-defined-function_timecho.md deleted file mode 100644 index
5d5623569..000000000 --- a/src/UserGuide/Master/Tree/User-Manual/User-defined-function_timecho.md +++ /dev/null @@ -1,953 +0,0 @@ -# UDF - -## 1. UDF Introduction - -UDF (User Defined Function) refers to user-defined functions. IoTDB provides a variety of built-in time series processing functions and also supports extending custom functions to meet more computing needs. - -In IoTDB, you can expand two types of UDF: - - - - - - - - - - - - - - - - - - - - - - -
| UDF Class | AccessStrategy | Description |
| --------- | -------------- | ----------- |
| UDTF | MAPPABLE_ROW_BY_ROW | Custom scalar function: takes k columns of time series and 1 row of data as input and outputs 1 column of time series and 1 row of data. It can be used in any clause or expression where a scalar function may appear, such as the SELECT clause and the WHERE clause. |
| UDTF | ROW_BY_ROW, SLIDING_TIME_WINDOW, SLIDING_SIZE_WINDOW, SESSION_TIME_WINDOW, STATE_WINDOW | Custom time series generation function: takes k columns of time series and m rows of data as input and outputs 1 column of time series and n rows of data. The number of input rows m can differ from the number of output rows n. It can only be used in SELECT clauses. |
| UDAF | - | Custom aggregation function: takes k columns of time series and m rows of data as input and outputs 1 column of time series and 1 row of data. It can be used in any clause or expression where an aggregation function may appear, such as the SELECT clause and the HAVING clause. |
- -### 1.1 UDF usage - -The usage of UDF is similar to that of regular built-in functions, and can be directly used in SELECT statements like calling regular functions. - -#### 1.Basic SQL syntax support - -* Support `SLIMIT` / `SOFFSET` -* Support `LIMIT` / `OFFSET` -* Support queries with value filters -* Support queries with time filters - - -#### 2. Queries with * in SELECT Clauses - -Assume that there are 2 time series (`root.sg.d1.s1` and `root.sg.d1.s2`) in the system. - -* **`SELECT example(*) from root.sg.d1`** - -Then the result set will include the results of `example (root.sg.d1.s1)` and `example (root.sg.d1.s2)`. - -* **`SELECT example(s1, *) from root.sg.d1`** - -Then the result set will include the results of `example(root.sg.d1.s1, root.sg.d1.s1)` and `example(root.sg.d1.s1, root.sg.d1.s2)`. - -* **`SELECT example(*, *) from root.sg.d1`** - -Then the result set will include the results of `example(root.sg.d1.s1, root.sg.d1.s1)`, `example(root.sg.d1.s2, root.sg.d1.s1)`, `example(root.sg.d1.s1, root.sg.d1.s2)` and `example(root.sg.d1.s2, root.sg.d1.s2)`. - -#### 3. Queries with Key-value Attributes in UDF Parameters - -You can pass any number of key-value pair parameters to the UDF when constructing a UDF query. The key and value in the key-value pair need to be enclosed in single or double quotes. Note that key-value pair parameters can only be passed in after all time series have been passed in. Here is a set of examples: - - Example: -``` sql -SELECT example(s1, 'key1'='value1', 'key2'='value2'), example(*, 'key3'='value3') FROM root.sg.d1; -SELECT example(s1, s2, 'key1'='value1', 'key2'='value2') FROM root.sg.d1; -``` - -#### 4. Nested Queries - - Example: -``` sql -SELECT s1, s2, example(s1, s2) FROM root.sg.d1; -SELECT *, example(*) FROM root.sg.d1 DISABLE ALIGN; -SELECT s1 * example(* / s1 + s2) FROM root.sg.d1; -SELECT s1, s2, s1 + example(s1, s2), s1 - example(s1 + example(s1, s2) / s2) FROM root.sg.d1; -``` - - -## 2. 
UDF management - -### 2.1 UDF Registration - -The process of registering a UDF in IoTDB is as follows: - -1. Implement a complete UDF class, assuming the full class name of this class is `org.apache.iotdb.udf.UDTFExample`. -2. Package the project into a JAR. If you manage the project with Maven, you can refer to the [Maven project example](https://github.com/apache/iotdb/tree/master/example/udf). -3. Make preparations for registration according to the registration method. For details, see the following example. -4. Register the UDF with the following SQL statement. - -```sql -CREATE FUNCTION <UDF-NAME> AS <UDF-CLASS-FULL-PATHNAME> (USING URI URI-STRING) -``` - -#### Example: register UDF named `example`, you can choose either of the following two registration methods - -#### Method 1: Manually place the jar package - -Prepare: -When registering using this method, you must place the JAR package in advance in the `ext/udf` directory (the directory is configurable) of all nodes in the cluster. - -Registration statement: - -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' -``` - -#### Method 2: Cluster automatically installs jar packages through URI - -Prepare: -When registering using this method, you must upload the JAR package to the URI server in advance and ensure that the IoTDB instance executing the registration statement can access the URI server. - -Registration statement: - -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' USING URI 'http://jar/example.jar' -``` - -IoTDB will download the JAR package and synchronize it to the entire cluster. - -#### Note - -1. Since UDF instances are dynamically loaded through reflection, you do not need to restart the server during the UDF registration process. - -2. UDF function names are not case-sensitive. - -3. Please ensure that the function name given to the UDF is different from all built-in function names. A UDF with the same name as a built-in function cannot be registered. - -4.
We recommend that you do not use classes that have the same class name but different function logic in different JAR packages. For example, in `UDF(UDAF/UDTF): udf1, udf2`, the JAR package of udf1 is `udf1.jar` and the JAR package of udf2 is `udf2.jar`. Assume that both JAR packages contain the `org.apache.iotdb.udf.ExampleUDTF` class. If you use two UDFs in the same SQL statement at the same time, the system will randomly load either of them and may cause inconsistency in UDF execution behavior. - -### 2.2 UDF Deregistration - -The SQL syntax is as follows: - -```sql -DROP FUNCTION -``` - -Example: Uninstall the UDF from the above example: - -```sql -DROP FUNCTION example -``` - -Note: For functions registered using USING URI, you need to remove the UDF's JAR files from the cluster-wide node path (`installation_package/ext/udf/install`). - -### 2.3 Show All Registered UDFs - -``` sql -SHOW FUNCTIONS -``` - -### 2.4 UDF configuration - -- UDF configuration allows configuring the storage directory of UDF in `iotdb-system.properties` - ``` Properties -# UDF lib dir - -udf_lib_dir=ext/udf -``` - -- -When using custom functions, there is a message indicating insufficient memory. Change the following configuration parameters in `iotdb-system.properties` and restart the service. - - ``` Properties - -# Used to estimate the memory usage of text fields in a UDF query. -# It is recommended to set this value to be slightly larger than the average length of all text -# effectiveMode: restart -# Datatype: int -udf_initial_byte_array_length_for_memory_control=48 - -# How much memory may be used in ONE UDF query (in MB). -# The upper limit is 20% of allocated memory for read. -# effectiveMode: restart -# Datatype: float -udf_memory_budget_in_mb=30.0 - -# UDF memory allocation ratio. -# The parameter form is a:b:c, where a, b, and c are integers. 
-# effectiveMode: restart -udf_reader_transformer_collector_memory_proportion=1:1:1 -``` - -### 2.5 UDF User Permissions - - -Using UDFs requires the `USE_UDF` permission; only users with this permission are allowed to perform UDF registration, uninstallation, and query operations. - -For more user permissions related content, please refer to [Account Management Statements](../User-Manual/Authority-Management_timecho). - - -## 3. UDF Libraries - -Based on the ability of user-defined functions, IoTDB provides a series of functions for time series data processing, including data quality, data profiling, anomaly detection, frequency domain analysis, data matching, data repairing, sequence discovery, machine learning, etc., which can meet the needs of industrial fields for time series data processing. - -You can refer to the [UDF Libraries](../SQL-Manual/UDF-Libraries_timecho.md) document to find the installation steps and registration statements for each function, to ensure that all required functions are registered correctly. - - -## 4. UDF development - -### 4.1 UDF Development Dependencies - -If you use [Maven](http://search.maven.org/), you can search for the development dependencies listed below from the [Maven repository](http://search.maven.org/). Please note that you must select the same dependency version as the target IoTDB server version for development. - -``` xml -<dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>udf-api</artifactId> - <version>1.0.0</version> - <scope>provided</scope> -</dependency> -``` - -### 4.2 UDTF(User Defined Timeseries Generating Function) - -To write a UDTF, you need to implement `org.apache.iotdb.udf.api.UDTF`, providing at least the `beforeStart` method and one `transform` method.
- -#### Interface Description: - -| Interface definition | Description | Required to Implement | -| :----------------------------------------------------------- | :----------------------------------------------------------- | ----------------------------------------------------- | -| void validate(UDFParameterValidator validator) throws Exception | This method is mainly used to validate `UDFParameters` and it is executed before `beforeStart(UDFParameters, UDTFConfigurations)` is called. | Optional | -| void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception | The initialization method to call the user-defined initialization behavior before a UDTF processes the input data. Every time a user executes a UDTF query, the framework will construct a new UDF instance, and `beforeStart` will be called. | Required | -| Object transform(Row row) throws Exception | This method is called by the framework. This data processing method will be called when you choose to use the `MappableRowByRowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `Row`, and the transformation result should be returned. | Required to implement at least one `transform` method | -| void transform(Column[] columns, ColumnBuilder builder) throws Exception | This method is called by the framework. This data processing method will be called when you choose to use the `MappableRowByRowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `Column[]`, and the transformation result should be output by `ColumnBuilder`. You need to call the data collection method provided by `builder` to determine the output data. | Required to implement at least one `transform` method | -| void transform(Row row, PointCollector collector) throws Exception | This method is called by the framework. 
This data processing method will be called when you choose to use the `RowByRowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `Row`, and the transformation result should be output by `PointCollector`. You need to call the data collection method provided by `collector` to determine the output data. | Required to implement at least one `transform` method | -| void transform(RowWindow rowWindow, PointCollector collector) throws Exception | This method is called by the framework. This data processing method will be called when you choose to use the `SlidingSizeWindowAccessStrategy` or `SlidingTimeWindowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `RowWindow`, and the transformation result should be output by `PointCollector`. You need to call the data collection method provided by `collector` to determine the output data. | Required to implement at least one `transform` method | -| void terminate(PointCollector collector) throws Exception | This method is called by the framework. This method will be called once after all `transform` calls have been executed. In a single UDF query, this method will and will only be called once. You need to call the data collection method provided by `collector` to determine the output data. | Optional | -| void beforeDestroy() | This method is called by the framework after the last input data is processed, and will only be called once in the life cycle of each UDF instance. | Optional | - -In the life cycle of a UDTF instance, the calling sequence of each method is as follows: - -1. void validate(UDFParameterValidator validator) throws Exception -2. void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception -3. 
`Object transform(Row row) throws Exception` or `void transform(Column[] columns, ColumnBuilder builder) throws Exception` or `void transform(Row row, PointCollector collector) throws Exception` or `void transform(RowWindow rowWindow, PointCollector collector) throws Exception` -4. void terminate(PointCollector collector) throws Exception -5. void beforeDestroy() - -> Note that every time the framework executes a UDTF query, a new UDF instance will be constructed. When the query ends, the corresponding instance will be destroyed. Therefore, the internal data of the instances in different UDTF queries (even in the same SQL statement) are isolated. You can maintain some state data in the UDTF without considering the influence of concurrency and other factors. - -#### Detailed interface introduction: - -1. **void validate(UDFParameterValidator validator) throws Exception** - -The `validate` method is used to validate the parameters entered by the user. - -In this method, you can limit the number and types of input time series, check the attributes of user input, or perform any custom verification. - -Please refer to the [Javadoc](https://github.com/apache/iotdb/blob/rc/2.0.4/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/parameter/UDFParameterValidator.java) for the usage of `UDFParameterValidator`. - - -2. **void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception** - -This method is mainly used to customize UDTF. In this method, the user can do the following things: - -1. Use UDFParameters to get the time series paths and parse key-value pair attributes entered by the user. -2. Set the strategy to access the raw data and set the output data type in UDTFConfigurations. -3. Create resources, such as establishing external connections, opening files, etc. - - -2.1 **UDFParameters** - -`UDFParameters` is used to parse UDF parameters in SQL statements (the part in parentheses after the UDF function name in SQL). 
The input parameters have two parts. The first part is data types of the time series that the UDF needs to process, and the second part is the key-value pair attributes for customization. Only the second part can be empty. - - -Example: - -``` sql -SELECT UDF(s1, s2, 'key1'='iotdb', 'key2'='123.45') FROM root.sg.d; -``` - -Usage: - -``` java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - String stringValue = parameters.getString("key1"); // iotdb - Float floatValue = parameters.getFloat("key2"); // 123.45 - Double doubleValue = parameters.getDouble("key3"); // null - int intValue = parameters.getIntOrDefault("key4", 678); // 678 - // do something - - // configurations - // ... -} -``` - - -2.2 **UDTFConfigurations** - -You must use `UDTFConfigurations` to specify the strategy used by UDF to access raw data and the type of output sequence. - -Usage: - -``` java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - // parameters - // ... - - // configurations - configurations - .setAccessStrategy(new RowByRowAccessStrategy()) - .setOutputDataType(Type.INT32); -} -``` - -The `setAccessStrategy` method is used to set the UDF's strategy for accessing the raw data, and the `setOutputDataType` method is used to set the data type of the output sequence. - - 2.2.1 **setAccessStrategy** - - -Note that the raw data access strategy you set here determines which `transform` method the framework will call. Please implement the `transform` method corresponding to the raw data access strategy. Of course, you can also dynamically decide which strategy to set based on the attribute parameters parsed by `UDFParameters`. Therefore, two `transform` methods are also allowed to be implemented in one UDF. 
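
To illustrate why one UDF may implement two `transform` methods, the sketch below models the decision in plain Java. It is a self-contained illustration under stated assumptions, not code against the IoTDB UDF API: the class `AccessStrategyChoiceSketch`, its enum, and the `window` attribute key are hypothetical, and only the two `transform` signatures are taken from the interface description in this document.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical stand-in for the choice made in beforeStart; not the IoTDB UDF API. */
final class AccessStrategyChoiceSketch {

  /** Two of the access strategies named in this document. */
  enum Strategy { ROW_BY_ROW, SLIDING_SIZE_WINDOW }

  /**
   * Picks a strategy from the key-value attributes of the query.
   * Assumption: a "window" attribute means the user wants window processing.
   */
  static Strategy chooseStrategy(Map<String, String> attributes) {
    return attributes.containsKey("window")
        ? Strategy.SLIDING_SIZE_WINDOW
        : Strategy.ROW_BY_ROW;
  }

  /** Returns the transform signature that the framework calls for a strategy. */
  static String transformCalledFor(Strategy strategy) {
    switch (strategy) {
      case ROW_BY_ROW:
        return "transform(Row row, PointCollector collector)";
      case SLIDING_SIZE_WINDOW:
        return "transform(RowWindow rowWindow, PointCollector collector)";
      default:
        throw new IllegalArgumentException("unknown strategy: " + strategy);
    }
  }

  public static void main(String[] args) {
    Map<String, String> noAttributes = Collections.emptyMap();
    Map<String, String> withWindow = Collections.singletonMap("window", "3");
    System.out.println(transformCalledFor(chooseStrategy(noAttributes)));
    System.out.println(transformCalledFor(chooseStrategy(withWindow)));
  }
}
```

Because the strategy is only known once the attributes have been parsed, a UDF written this way has to provide both `transform` variants, and the framework calls whichever one matches the strategy that was set.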
- -The following are the strategies you can set: - -| Interface definition | Description | The `transform` Method to Call | -| :-------------------------------- | :----------------------------------------------------------- | ------------------------------------------------------------ | -| MappableRowByRowAccessStrategy | Custom scalar function
The framework will call the `transform` method once for each row of raw data input, with k columns of time series and 1 row of data input, and 1 column of time series and 1 row of data output. It can be used in any clause and expression where scalar functions appear, such as select clauses, where clauses, etc. | void transform(Column[] columns, ColumnBuilder builder) throws Exception or Object transform(Row row) throws Exception | -| RowByRowAccessStrategy | Customize time series generation function to process raw data line by line.
The framework will call the `transform` method once for each row of raw data input, inputting k columns of time series and 1 row of data, and outputting 1 column of time series and n rows of data.
When a sequence is input, the row serves as a data point for the input sequence.
When multiple sequences are input, after aligning the input sequences in time, each row serves as a data point for the input sequence.
(In a row of data, there may be a column with a `null` value, but not all columns are `null`) | void transform(Row row, PointCollector collector) throws Exception | -| SlidingTimeWindowAccessStrategy | Customize time series generation functions to process raw data in a sliding time window manner.
The framework will call the `transform` method once for each raw data input window, input k columns of time series m rows of data, and output 1 column of time series n rows of data.
A window may contain multiple rows of data, and after aligning the input sequence in time, each window serves as a data point for the input sequence.
(Each window may have i rows, and each row of data may have a column with a `null` value, but not all of them are `null`) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception | -| SlidingSizeWindowAccessStrategy | Customize the time series generation function to process raw data in a fixed number of rows, meaning that each data processing window will contain a fixed number of rows of data (except for the last window).
The framework will call the `transform` method once for each raw data input window, input k columns of time series m rows of data, and output 1 column of time series n rows of data.
A window may contain multiple rows of data, and after aligning the input sequence in time, each window serves as a data point for the input sequence.
(Each window may have i rows, and each row of data may have a column with a `null` value, but not all of them are `null`) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception | -| SessionTimeWindowAccessStrategy | Customize time series generation functions to process raw data in a session window format.
The framework will call the `transform` method once for each raw data input window, input k columns of time series m rows of data, and output 1 column of time series n rows of data.
A window may contain multiple rows of data, and after aligning the input sequence in time, each window serves as a data point for the input sequence.
(Each window may have i rows, and each row of data may have a column with a `null` value, but not all of them are `null`) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception | -| StateWindowAccessStrategy | Customize time series generation functions to process raw data in a state window format.
The framework will call the `transform` method once for each raw data input window, inputting 1 column of time series m rows of data and outputting 1 column of time series n rows of data.
A window may contain multiple rows of data. Currently, windows can only be opened on one physical quantity, that is, one column of data. | void transform(RowWindow rowWindow, PointCollector collector) throws Exception | - - -#### Interface Description: - -- `MappableRowByRowAccessStrategy` and `RowByRowAccessStrategy`: Neither constructor requires any parameters. - -- `SlidingTimeWindowAccessStrategy` - -Window opening diagram: - - - -`SlidingTimeWindowAccessStrategy` has several constructors. You can pass 3 types of parameters to them: - -- Parameter 1: The display window on the time axis - -The first type of parameter is optional. If it is not provided, the beginning time of the display window will be set to the minimum timestamp of the query result set, and the ending time of the display window will be set to the maximum timestamp of the query result set. - -- Parameter 2: Time interval for dividing the time axis (must be positive) -- Parameter 3: Time sliding step (not required to be greater than or equal to the time interval, but must be a positive number) - -The sliding step parameter is also optional. If it is not provided, the sliding step will be set to the time interval for dividing the time axis. - -The relationship between the three types of parameters can be seen in the figure below. Please see the [Javadoc](https://github.com/apache/iotdb/blob/rc/2.0.4/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/SlidingTimeWindowAccessStrategy.java) for more details. - -
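The way the display window, the time interval and the sliding step interact can also be worked through in plain Java. The sketch below is not the IoTDB implementation and does not call the UDF API. The class and method names are made up for illustration, and it assumes half-open windows `[start, end)` that are clipped at the end of the display window.

```java
import java.util.ArrayList;
import java.util.List;

public class SlidingTimeWindowSketch {

  // Splits the display window [displayBegin, displayEnd) into windows of
  // length `interval` whose start times are `step` apart. Each element of
  // the result is a {start, end} pair describing the range [start, end).
  static List<long[]> windows(long displayBegin, long displayEnd, long interval, long step) {
    if (interval <= 0 || step <= 0) {
      throw new IllegalArgumentException("interval and step must be positive");
    }
    List<long[]> result = new ArrayList<>();
    for (long start = displayBegin; start < displayEnd; start += step) {
      // Windows near the end are clipped, so they may be shorter than `interval`.
      long end = Math.min(start + interval, displayEnd);
      result.add(new long[] {start, end});
    }
    return result;
  }

  // Omitting the sliding step makes it default to the time interval.
  static List<long[]> windows(long displayBegin, long displayEnd, long interval) {
    return windows(displayBegin, displayEnd, interval, interval);
  }

  public static void main(String[] args) {
    for (long[] w : windows(0, 10, 4)) {
      System.out.println("[" + w[0] + ", " + w[1] + ")");
    }
  }
}
```

With a display window of `[0, 10)` and a time interval of 4, the sketch yields `[0, 4)`, `[4, 8)` and `[8, 10)`, where the last window is shorter than the requested interval. A sliding step smaller than the interval produces overlapping windows.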

- -> Note that the actual time interval of some of the last time windows may be less than the specified time interval parameter. In addition, there may be cases where the number of data rows in some time windows is 0. In these cases, the framework will also call the `transform` method for the empty windows. - -- `SlidingSizeWindowAccessStrategy` - -Window opening diagram: - - - -`SlidingSizeWindowAccessStrategy`: `SlidingSizeWindowAccessStrategy` has many constructors, you can pass 2 types of parameters to them: - -* Parameter 1: Window size. This parameter specifies the number of data rows contained in a data processing window. Note that the number of data rows in some of the last time windows may be less than the specified number of data rows. -* Parameter 2: Sliding step. This parameter means the number of rows between the first point of the next window and the first point of the current window. (This parameter is not required to be greater than or equal to the window size, but must be a positive number) - -The sliding step parameter is optional. If the parameter is not provided, the sliding step will be set to the same as the window size. - -- `SessionTimeWindowAccessStrategy` - -Window opening diagram: **Time intervals less than or equal to the given minimum time interval `sessionGap` are assigned in one group.** - - - -`SessionTimeWindowAccessStrategy`: `SessionTimeWindowAccessStrategy` has many constructors, you can pass 2 types of parameters to them: - -- Parameter 1: The display window on the time axis. -- Parameter 2: The minimum time interval `sessionGap` of two adjacent windows. - -- `StateWindowAccessStrategy` - -Window opening diagram: **For numerical data, if the state difference is less than or equal to the given threshold `delta`, it will be assigned in one group.** - - - -`StateWindowAccessStrategy` has four constructors. 
- -- Constructor 1: For numerical data, there are 3 parameters: the start time and end time of the display window on the time axis, and the threshold `delta` for the allowable change within a single window. -- Constructor 2: For text data and boolean data, there are 2 parameters: the start time and end time of the display window on the time axis. For both data types, the data within a single window is the same, so there is no need to provide an allowable change threshold. -- Constructor 3: For numerical data, there is 1 parameter: the threshold `delta` for the allowable change within a single window. The start time of the display window is set to the smallest timestamp in the entire query result set, and the end time is set to the largest timestamp in the entire query result set. -- Constructor 4: For text data and boolean data, no parameters are needed. The start and end timestamps are set as described for Constructor 3. - -`StateWindowAccessStrategy` can only take one column as input for now. - -Please see the [Javadoc](https://github.com/apache/iotdb/blob/rc/2.0.4/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/StateWindowAccessStrategy.java) for more details. - - 2.2.2 **setOutputDataType** - -Note that the type of output sequence you set here determines the type of data that the `PointCollector` can actually receive in the `transform` method. 
The relationship between the output data type set in `setOutputDataType` and the actual data output type that `PointCollector` can receive is as follows: - -| Output Data Type Set in `setOutputDataType` | Data Type that `PointCollector` Can Receive | -| :------------------------------------------ | :----------------------------------------------------------- | -| INT32 | int | -| INT64 | long | -| FLOAT | float | -| DOUBLE | double | -| BOOLEAN | boolean | -| TEXT | `java.lang.String` and `org.apache.iotdb.udf.api.type.Binary` | - -The type of output time series of a UDTF is determined at runtime, which means that a UDTF can dynamically determine the type of output time series according to the type of input time series. -Here is a simple example: - -```java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - // do something - // ... - - configurations - .setAccessStrategy(new RowByRowAccessStrategy()) - .setOutputDataType(parameters.getDataType(0)); -} -``` - -3. **Object transform(Row row) throws Exception** - -You need to implement this method or `transform(Column[] columns, ColumnBuilder builder) throws Exception` when you specify the strategy of UDF to read the original data as `MappableRowByRowAccessStrategy`. - -This method processes the raw data one row at a time. The raw data is input from `Row` and output by its return object. You must return only one object based on each input data point in a single `transform` method call, i.e., input and output are one-to-one. It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements the `Object transform(Row row) throws Exception` method. It is an adder that receives two columns of time series as input. 
- -```java -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - private Type dataType; - - @Override - public void validate(UDFParameterValidator validator) throws Exception { - validator - .validateInputSeriesNumber(2) - .validateInputSeriesDataType(0, Type.INT64) - .validateInputSeriesDataType(1, Type.INT64); - } - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - dataType = parameters.getDataType(0); - configurations - .setAccessStrategy(new MappableRowByRowAccessStrategy()) - .setOutputDataType(dataType); - } - - @Override - public Object transform(Row row) throws Exception { - return row.getLong(0) + row.getLong(1); - } -} -``` - - - -4. **void transform(Column[] columns, ColumnBuilder builder) throws Exception** - -You need to implement this method or `Object transform(Row row) throws Exception` when you specify the strategy of UDF to read the original data as `MappableRowByRowAccessStrategy`. - -This method processes the raw data multiple rows at a time. After performance tests, we found that UDTF that process multiple rows at once perform better than those UDTF that process one data point at a time. The raw data is input from `Column[]` and output by `ColumnBuilder`. You must output a corresponding data point based on each input data point in a single `transform` method call, i.e., input and output are still one-to-one. 
It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements the `void transform(Column[] columns, ColumnBuilder builder) throws Exception` method. It is an adder that receives two columns of time series as input. - -```java -import org.apache.iotdb.tsfile.read.common.block.column.Column; -import org.apache.iotdb.tsfile.read.common.block.column.ColumnBuilder; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - private Type type; - - @Override - public void validate(UDFParameterValidator validator) throws Exception { - validator - .validateInputSeriesNumber(2) - .validateInputSeriesDataType(0, Type.INT64) - .validateInputSeriesDataType(1, Type.INT64); - } - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - type = parameters.getDataType(0); - configurations.setAccessStrategy(new MappableRowByRowAccessStrategy()).setOutputDataType(type); - } - - @Override - public void transform(Column[] columns, ColumnBuilder builder) throws Exception { - long[] inputs1 = columns[0].getLongs(); - long[] inputs2 = columns[1].getLongs(); - - int count = columns[0].getPositionCount(); - for (int i = 0; i < count; i++) { - builder.writeLong(inputs1[i] + inputs2[i]); - } - } -} -``` - -5. 
**void transform(Row row, PointCollector collector) throws Exception** - -You need to implement this method when you specify the strategy of UDF to read the original data as `RowByRowAccessStrategy`. - -This method processes the raw data one row at a time. The raw data is input from `Row` and output by `PointCollector`. You can output any number of data points in one `transform` method call. It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements the `void transform(Row row, PointCollector collector) throws Exception` method. It is an adder that receives two columns of time series as input. When two data points in a row are not `null`, this UDF will output the algebraic sum of these two data points. - -``` java -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT64) - .setAccessStrategy(new RowByRowAccessStrategy()); - } - - @Override - public void transform(Row row, PointCollector collector) throws Exception { - if (row.isNull(0) || row.isNull(1)) { - return; - } - collector.putLong(row.getTime(), row.getLong(0) + row.getLong(1)); - } -} -``` - -6. 
**void transform(RowWindow rowWindow, PointCollector collector) throws Exception** - -You need to implement this method when you specify the strategy of UDF to read the original data as `SlidingTimeWindowAccessStrategy` or `SlidingSizeWindowAccessStrategy`. - -This method processes a batch of data in a fixed number of rows or a fixed time interval each time, and we call the container containing this batch of data a window. The raw data is input from `RowWindow` and output by `PointCollector`. `RowWindow` can help you access a batch of `Row`, it provides a set of interfaces for random access and iterative access to this batch of `Row`. You can output any number of data points in one `transform` method call. It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -Below is a complete UDF example that implements the `void transform(RowWindow rowWindow, PointCollector collector) throws Exception` method. It is a counter that receives any number of time series as input, and its function is to count and output the number of data rows in each time window within a specified time range. 
- -```java -import java.io.IOException; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.access.RowWindow; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.SlidingTimeWindowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Counter implements UDTF { - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT32) - .setAccessStrategy(new SlidingTimeWindowAccessStrategy( - parameters.getLong("time_interval"), - parameters.getLong("sliding_step"), - parameters.getLong("display_window_begin"), - parameters.getLong("display_window_end"))); - } - - @Override - public void transform(RowWindow rowWindow, PointCollector collector) { - if (rowWindow.windowSize() != 0) { - collector.putInt(rowWindow.windowStartTime(), rowWindow.windowSize()); - } - } -} -``` - -7. **void terminate(PointCollector collector) throws Exception** - -In some scenarios, a UDF needs to traverse all the original data to calculate the final output data points. The `terminate` interface provides support for those scenarios. - -This method is called after all `transform` calls are executed and before the `beforeDestroy` method is executed. You can implement the `transform` method to perform pure data processing (without outputting any data points), and implement the `terminate` method to output the processing results. - -The processing results need to be output by the `PointCollector`. You can output any number of data points in one `terminate` method call. 
It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -Below is a complete UDF example that implements the `void terminate(PointCollector collector) throws Exception` method. It takes one time series whose data type is `INT32` as input, and outputs the maximum value point of the series. - -```java -import java.io.IOException; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Max implements UDTF { - - private Long time; - private int value; - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT32) - .setAccessStrategy(new RowByRowAccessStrategy()); - } - - @Override - public void transform(Row row, PointCollector collector) { - if (row.isNull(0)) { - return; - } - int candidateValue = row.getInt(0); - if (time == null || value < candidateValue) { - time = row.getTime(); - value = candidateValue; - } - } - - @Override - public void terminate(PointCollector collector) throws IOException { - if (time != null) { - collector.putInt(time, value); - } - } -} -``` - -8. **void beforeDestroy()** - -The method for terminating a UDF. - -This method is called by the framework. For a UDF instance, `beforeDestroy` will be called after the last record is processed. In the entire life cycle of the instance, `beforeDestroy` will only be called once. - - - -### 4.3 UDAF (User Defined Aggregation Function) - -A complete definition of UDAF involves two classes, `State` and `UDAF`. 
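As orientation before the details, the following self-contained sketch simulates how a `State` and the aggregation steps might cooperate when one group's data is split across two nodes. It does not depend on the IoTDB UDF API: `SketchState` is a hypothetical stand-in that only mirrors the three `State` methods described in the following sections, and the static helpers loosely imitate `addInput`, `combineState` and `outputFinal` with simplified signatures.

```java
import java.nio.ByteBuffer;

public class AvgLifecycleSketch {

  // Hypothetical stand-in mirroring the three methods of the State interface.
  interface SketchState {
    void reset();
    byte[] serialize();
    void deserialize(byte[] bytes);
  }

  static class AvgState implements SketchState {
    double sum;
    long count;

    @Override
    public void reset() {
      sum = 0;
      count = 0;
    }

    @Override
    public byte[] serialize() {
      ByteBuffer buffer = ByteBuffer.allocate(Double.BYTES + Long.BYTES);
      buffer.putDouble(sum);
      buffer.putLong(count);
      return buffer.array();
    }

    @Override
    public void deserialize(byte[] bytes) {
      // Fields are read in the same order in which serialize() wrote them.
      ByteBuffer buffer = ByteBuffer.wrap(bytes);
      sum = buffer.getDouble();
      count = buffer.getLong();
    }
  }

  // Simplified addInput: folds a batch of values into the partial state.
  static void addInput(AvgState state, double[] values) {
    for (double v : values) {
      state.sum += v;
      state.count++;
    }
  }

  // Simplified combineState: merges rhs into state.
  static void combine(AvgState state, AvgState rhs) {
    state.sum += rhs.sum;
    state.count += rhs.count;
  }

  // Simplified outputFinal: one value per group, or null for an empty group.
  static Double outputFinal(AvgState state) {
    return state.count == 0 ? null : state.sum / state.count;
  }

  // Each "node" aggregates its own part; node B's state is shipped as bytes.
  static Double averageAcrossNodes(double[] nodeA, double[] nodeB) {
    AvgState a = new AvgState();
    a.reset();
    addInput(a, nodeA);

    AvgState b = new AvgState();
    b.reset();
    addInput(b, nodeB);

    AvgState received = new AvgState();
    received.deserialize(b.serialize());
    combine(a, received);
    return outputFinal(a);
  }

  public static void main(String[] args) {
    System.out.println(averageAcrossNodes(new double[] {1, 2, 3}, new double[] {4, 5}));
  }
}
```

The sketch keeps the partial aggregation, the byte-level transfer and the final computation separate, which is the same division of work that the `State` and `UDAF` interfaces express.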
- -#### State Class - -To write your own `State`, you need to implement the `org.apache.iotdb.udf.api.State` interface. - -#### Interface Description: - -| Interface Definition | Description | Required to Implement | -| -------------------------------- | ------------------------------------------------------------ | --------------------- | -| void reset() | To reset the `State` object to its initial state, you need to fill in the initial values of the fields in the `State` class within this method as if you were writing a constructor. | Required | -| byte[] serialize() | Serializes `State` to binary data. This method is used for IoTDB internal `State` passing. Note that the order of serialization must be consistent with the following deserialization methods. | Required | -| void deserialize(byte[] bytes) | Deserializes binary data to `State`. This method is used for IoTDB internal `State` passing. Note that the order of deserialization must be consistent with the serialization method above. | Required | - -#### Detailed interface introduction: - -1. **void reset()** - -This method resets the `State` to its initial state, you need to fill in the initial values of the fields in the `State` object in this method. For optimization reasons, IoTDB reuses `State` as much as possible internally, rather than creating a new `State` for each group, which would introduce unnecessary overhead. When `State` has finished updating the data in a group, this method is called to reset to the initial state as a way to process the next group. - -In the case of `State` for averaging (aka `avg`), for example, you would need the sum of the data, `sum`, and the number of entries in the data, `count`, and initialize both to 0 in the `reset()` method. - -```java -class AvgState implements State { - double sum; - - long count; - - @Override - public void reset() { - sum = 0; - count = 0; - } - - // other methods -} -``` - -2. 
**byte[] serialize()/void deserialize(byte[] bytes)** - -These methods serialize the `State` into binary data, and deserialize the `State` from the binary data. IoTDB, as a distributed database, involves passing data among different nodes, so you need to write these two methods to enable the passing of the `State` among different nodes. Note that the order of serialization and deserialization must be consistent. - -In the case of `State` for averaging (aka `avg`), for example, you can convert the content of `State` to a `byte[]` array and read the content of `State` back from the `byte[]` array in any way you want. The following shows the code for serialization/deserialization using Java's `ByteBuffer`: - -```java -@Override -public byte[] serialize() { - ByteBuffer buffer = ByteBuffer.allocate(Double.BYTES + Long.BYTES); - buffer.putDouble(sum); - buffer.putLong(count); - - return buffer.array(); -} - -@Override -public void deserialize(byte[] bytes) { - ByteBuffer buffer = ByteBuffer.wrap(bytes); - sum = buffer.getDouble(); - count = buffer.getLong(); -} -``` - - - -#### UDAF Classes - -To write a UDAF, you need to implement the `org.apache.iotdb.udf.api.UDAF` interface. - -#### Interface Description: - -| Interface definition | Description | Required to Implement | -| ------------------------------------------------------------ | ------------------------------------------------------------ | --------------------- | -| void validate(UDFParameterValidator validator) throws Exception | This method is mainly used to validate `UDFParameters` and it is executed before `beforeStart(UDFParameters, UDAFConfigurations)` is called. | Optional | -| void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception | Initialization method that invokes user-defined initialization behavior before UDAF processes the input data. Unlike UDTF, configuration is of type `UDAFConfigurations`. 
| Required | -| State createState() | To create a `State` object, usually just call the default constructor and modify the default initial value as needed. | Required | -| void addInput(State state, Column[] columns, BitMap bitMap) | Update `State` object according to the incoming data `Column[]` in batch, note that last column `columns[columns.length - 1]` always represents the time column. In addition, `BitMap` represents the data that has been filtered out before, you need to manually determine whether the corresponding data has been filtered out when writing this method. | Required | -| void combineState(State state, State rhs) | Merge `rhs` state into `state` state. In a distributed scenario, the same set of data may be distributed on different nodes, IoTDB generates a `State` object for the partial data on each node, and then calls this method to merge it into the complete `State`. | Required | -| void outputFinal(State state, ResultValue resultValue) | Computes the final aggregated result based on the data in `State`. Note that according to the semantics of the aggregation, only one value can be output per group. | Required | -| void beforeDestroy() | This method is called by the framework after the last input data is processed, and will only be called once in the life cycle of each UDF instance. | Optional | - -In the life cycle of a UDAF instance, the calling sequence of each method is as follows: - -1. State createState() -2. void validate(UDFParameterValidator validator) throws Exception -3. void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception -4. void addInput(State state, Column[] columns, BitMap bitMap) -5. void combineState(State state, State rhs) -6. void outputFinal(State state, ResultValue resultValue) -7. void beforeDestroy() - -Similar to UDTF, every time the framework executes a UDAF query, a new UDF instance will be constructed. When the query ends, the corresponding instance will be destroyed. 
Therefore, the internal data of the instances in different UDAF queries (even in the same SQL statement) are isolated. You can maintain some state data in the UDAF without considering the influence of concurrency and other factors. - -#### Detailed interface introduction: - - -1. **void validate(UDFParameterValidator validator) throws Exception** - -Same as UDTF, the `validate` method is used to validate the parameters entered by the user. - -In this method, you can limit the number and types of input time series, check the attributes of user input, or perform any custom verification. - -2. **void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception** - - The `beforeStart` method plays the same role as in UDTF: - -1. Use UDFParameters to get the time series paths and parse key-value pair attributes entered by the user. -2. Set the output data type in UDAFConfigurations. -3. Create resources, such as establishing external connections, opening files, etc. - -The role of the `UDFParameters` type can be seen above. - -2.2 **UDAFConfigurations** - -The difference from UDTF is that UDAF uses `UDAFConfigurations` as the type of `configuration` object. - -Currently, this class only supports setting the type of output data. - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // parameters - // ... 
- - // configurations - configurations - .setOutputDataType(Type.INT32); -} -``` - -The relationship between the output type set in `setOutputDataType` and the type of data output that `ResultValue` can actually receive is as follows: - -| The output type set in `setOutputDataType` | The output type that `ResultValue` can actually receive | -| ------------------------------------------ | ------------------------------------------------------- | -| INT32 | int | -| INT64 | long | -| FLOAT | float | -| DOUBLE | double | -| BOOLEAN | boolean | -| TEXT | org.apache.iotdb.udf.api.type.Binary | - -The output type of the UDAF is determined at runtime. You can dynamically determine the output sequence type based on the input type. - -Here is a simple example: - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // do something - // ... - - configurations - .setOutputDataType(parameters.getDataType(0)); -} -``` - -3. **State createState()** - - -This method creates and initializes a `State` object for UDAF. Due to the limitations of the Java language, you can only call the default constructor for the `State` class. The default constructor assigns a default initial value to all the fields in the class, and if that initial value does not meet your requirements, you need to initialize them manually within this method. - -The following is an example that includes manual initialization. Suppose you want to implement an aggregate function that multiplies all numbers in the group. Your initial `State` value should then be set to 1, but the default constructor initializes it to 0, so you need to initialize `State` manually after calling the default constructor: - -```java -public State createState() { - MultiplyState state = new MultiplyState(); - state.result = 1; - return state; -} -``` - -4. **void addInput(State state, Column[] columns, BitMap bitMap)** - -This method updates the `State` object with the raw input data. 
For performance reasons, and to align with the IoTDB vectorized query engine, the raw input data is no longer a single data point but an array of columns, `Column[]`. Note that the last column (i.e. `columns[columns.length - 1]`) is always the time column, so you can also do different operations in UDAF depending on the time. - -Since the input parameter is not of a single data point type, but of multiple columns, you need to manually filter some of the data in the columns, which is why the third parameter, `BitMap`, exists. It identifies which rows in these columns have been filtered out, so that you can skip the filtered data. - -Here's an example of `addInput()` that counts the number of items (aka count). It shows how you can use `BitMap` to ignore data that has been filtered out. Note that due to the limitations of the Java language, you need to explicitly cast the `State` object from the type defined in the interface to your custom `State` type at the beginning of the method, otherwise you won't be able to use the `State` object. - -```java -public void addInput(State state, Column[] columns, BitMap bitMap) { - CountState countState = (CountState) state; - - int count = columns[0].getPositionCount(); - for (int i = 0; i < count; i++) { - if (bitMap != null && !bitMap.isMarked(i)) { - continue; - } - if (!columns[0].isNull(i)) { - countState.count++; - } - } -} -``` - -5. **void combineState(State state, State rhs)** - - -This method combines two `State`s, or more precisely, updates the first `State` object with the second `State` object. IoTDB is a distributed database, and the data of the same group may be distributed on different nodes. For performance reasons, IoTDB will first aggregate some of the data on each node into `State`, and then merge the `State`s on different nodes that belong to the same group, which is what `combineState` does. - -Here's an example of `combineState()` for averaging (aka avg). 
Similar to `addInput`, you need to do an explicit type conversion for the two `State`s at the beginning. Also note that you are updating the value of the first `State` with the contents of the second `State`. - -```java -public void combineState(State state, State rhs) { - AvgState avgState = (AvgState) state; - AvgState avgRhs = (AvgState) rhs; - - avgState.count += avgRhs.count; - avgState.sum += avgRhs.sum; -} -``` - -6. **void outputFinal(State state, ResultValue resultValue)** - -This method calculates the final result from `State`. You need to access the various fields in `State`, derive the final result, and set the final result into the `ResultValue` object. IoTDB internally calls this method once at the end for each group. Note that according to the semantics of aggregation, the final result can only be one value. - -Here is an `outputFinal` example for averaging (aka avg). In addition to the explicit type conversion at the beginning, you will also see a specific use of the `ResultValue` object, where the final result is set by `setXXX` (where `XXX` is the type name). - -```java -public void outputFinal(State state, ResultValue resultValue) { - AvgState avgState = (AvgState) state; - - if (avgState.count != 0) { - resultValue.setDouble(avgState.sum / avgState.count); - } else { - resultValue.setNull(); - } -} -``` - -7. **void beforeDestroy()** - - -The method for terminating a UDF. - -This method is called by the framework. For a UDF instance, `beforeDestroy` will be called after the last record is processed. In the entire life cycle of the instance, `beforeDestroy` will only be called once. - - -### 4.4 Maven Project Example - -If you use Maven, you can build your own UDF project referring to our **udf-example** module. You can find the project [here](https://github.com/apache/iotdb/tree/master/example/udf). - - -## 5. 
Contribute Universal Built-in UDF Functions to IoTDB - -This part mainly introduces how external users can contribute their own UDFs to the IoTDB community. - -### 5.1 Prerequisites - -1. UDFs must be universal. - - Here, "universal" means that the UDF can be widely used in some scenarios. In other words, the UDF must have reuse value and may be directly used by other users in the community. - - If you are not sure whether the UDF you want to contribute is universal, you can send an email to `dev@iotdb.apache.org` or create an issue to initiate a discussion. - -2. The UDF you are going to contribute has been well tested and can run normally in the production environment. - - -### 5.2 What you need to prepare - -1. UDF source code -2. Test cases -3. Instructions - -### 5.3 Contribution Content - -#### 5.3.1 UDF Source Code - -1. Create the UDF main class and related classes in `iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin` or in its subfolders. -2. Register your UDF in `iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin/BuiltinTimeSeriesGeneratingFunction.java`. - -#### 5.3.2 Test Cases - -At a minimum, you need to write integration tests for the UDF. - -You can add a test class in `integration-test/src/test/java/org/apache/iotdb/db/it/udf`. - - -#### 5.3.3 Instructions - -The instructions need to include: the name and the function of the UDF, the attribute parameters that must be provided when the UDF is executed, the applicable scenarios, and usage examples. - -The instructions should include both Chinese and English versions. They should be added separately in `docs/zh/UserGuide/Operation Manual/DML Data Manipulation Language.md` and `docs/UserGuide/Operation Manual/DML Data Manipulation Language.md`. 
- -#### 5.3.4 Submit a PR - -When you have prepared the UDF source code, test cases, and instructions, you are ready to submit a Pull Request (PR) on [GitHub](https://github.com/apache/iotdb). You can refer to our code contribution guide to submit a PR: [Development Guide](https://iotdb.apache.org/Community/Development-Guide.html). - - -After the PR is reviewed, approved, and merged, your UDF has been contributed to the IoTDB community! - -## 6. Common Problems - -Q1: How do I modify a registered UDF? - -A1: Assume that the name of the UDF is `example` and the full class name is `org.apache.iotdb.udf.ExampleUDTF`, which is introduced by `example.jar`. - -1. Unload the registered function by executing `DROP FUNCTION example`. -2. Delete `example.jar` under `iotdb-server-2.0.x-all-bin/ext/udf`. -3. Modify the logic in `org.apache.iotdb.udf.ExampleUDTF` and repackage it. The name of the JAR package can still be `example.jar`. -4. Upload the new JAR package to `iotdb-server-2.0.x-all-bin/ext/udf`. -5. Load the new UDF by executing `CREATE FUNCTION example AS "org.apache.iotdb.udf.ExampleUDTF"`. - diff --git a/src/UserGuide/V1.2.x/Deployment-and-Maintenance/Deployment-Guide_timecho.md b/src/UserGuide/V1.2.x/Deployment-and-Maintenance/Deployment-Guide_timecho.md deleted file mode 100644 index bd323615c..000000000 --- a/src/UserGuide/V1.2.x/Deployment-and-Maintenance/Deployment-Guide_timecho.md +++ /dev/null @@ -1,1117 +0,0 @@ - - -# Deployment Guide - -## Stand-Alone Deployment - -This short guide will walk you through the basic process of using IoTDB. For a more complete guide, please visit our website's [User Guide](../IoTDB-Introduction/What-is-IoTDB.md). - -### Prerequisites - -To use IoTDB, you need to have: - -1. Java >= 1.8 (Please make sure the environment path has been set) -2. Set the maximum number of open files to 65535 to avoid the "too many open files" problem. 
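The two prerequisites above can be checked from a shell before installing. The following is a minimal sketch rather than anything shipped with IoTDB: the helper `nofile_ok` and the variable `required_nofile` are hypothetical names, and it assumes a POSIX-like system with bash.

```bash
#!/usr/bin/env bash
# Sketch: check the two prerequisites listed above (Java on PATH, open-files limit).
# nofile_ok and required_nofile are illustrative names, not part of IoTDB.
required_nofile=65535

# Succeeds when the given limit is "unlimited" or at least the required minimum.
nofile_ok() {
  local limit="$1"
  [ "$limit" = "unlimited" ] || [ "$limit" -ge "$required_nofile" ]
}

# java -version writes to stderr, so redirect it before taking the first line.
if command -v java >/dev/null 2>&1; then
  java -version 2>&1 | head -n 1
else
  echo "java not found on PATH"
fi

# ulimit -n reports the open-files limit of the current shell.
current="$(ulimit -n)"
if nofile_ok "$current"; then
  echo "open files limit OK: $current"
else
  echo "open files limit too low: $current (want >= $required_nofile)"
fi
```

The script only reports the current state; raising the limit is system-specific and is left to the administrator.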
- -### Installation - -IoTDB provides three installation methods. Choose one of them based on the following suggestions: - -* Installation from source code. If you need to modify the code yourself, you can use this method. -* Installation from binary files. Download the binary files from the official website. This is the recommended method, which gives you a released binary package that works out of the box. -* Using Docker: The path to the dockerfile is [github](https://github.com/apache/iotdb/blob/master/docker/src/main) - - -### Download - -You can download the binary file from: -[Download Page](https://iotdb.apache.org/Download/) - -### Configurations - -Configuration files are under the "conf" folder: - -* environment config module (`datanode-env.bat`, `datanode-env.sh`) -* system config module (`iotdb-datanode.properties`) -* log config module (`logback.xml`) - -For more details, see [Config](../Reference/DataNode-Config-Manual.md). - -### Start - -You can go through the following steps to test the installation. If there is no error after execution, the installation is complete. - -#### Start IoTDB - -IoTDB is a database based on a distributed system. To launch IoTDB, you can first start standalone mode (i.e. 1 ConfigNode and 1 DataNode) to check. - -Users can start IoTDB in standalone mode with the start-standalone script under the sbin folder. - -``` -# Unix/OS X -> bash sbin/start-standalone.sh -``` - -``` -# Windows -> sbin\start-standalone.bat -``` - -Note: Currently, to run standalone mode, you need to ensure that all addresses are set to 127.0.0.1. If you need to access IoTDB from a machine other than the one where IoTDB is located, change the configuration item `dn_rpc_address` to the IP of the machine where IoTDB runs. The replication factors must also be set to 1, which is currently the default setting. 
- -## Cluster deployment (Cluster management tool) - -The IoTDB cluster management tool is an easy-to-use operation and maintenance tool (enterprise version tool). -It is designed to solve the operation and maintenance problems of multiple nodes in the IoTDB distributed system. -It mainly includes cluster deployment, cluster start and stop, elastic expansion, configuration update, data export and other functions, enabling one-click command issuance for complex database clusters, which greatly reduces management difficulty. -This document explains how to remotely deploy, configure, start and stop IoTDB cluster instances with the cluster management tool. - -### Environment dependence - -This tool is a supporting tool for TimechoDB (Enterprise Edition based on IoTDB). You can contact your sales representative to obtain the tool download method. - -The machine where IoTDB is to be deployed requires JDK 8 or above, lsof, netstat, and unzip. If any of them is missing, please install it yourself. You can refer to the installation commands required for the environment in the last section of the document. - -Tip: The IoTDB cluster management tool requires an account with root privileges - -### Deployment method - -#### Download and install - -This tool is a supporting tool for TimechoDB (Enterprise Edition based on IoTDB). You can contact your salesperson to obtain the tool download method. - -Note: Since the binary package only supports GLIBC 2.17 and above, the minimum supported system version is CentOS 7. 
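The dependencies named above (a JDK, lsof, netstat, unzip, and GLIBC 2.17 or later for the tool's binary package) can be pre-checked by hand. The sketch below is an illustration only and is separate from the tool's own check command: `version_ge` and `missing_tools` are hypothetical helper names, and it assumes bash, `sort -V`, and `getconf GNU_LIBC_VERSION`, which are available on glibc-based Linux systems.

```bash
#!/usr/bin/env bash
# Sketch: manual pre-check for the dependencies named above.
# version_ge and missing_tools are illustrative helpers, not part of iotd.

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print each tool from the argument list that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing="$(missing_tools java lsof netstat unzip)"
if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names are printed on one line.
  echo "missing tools:" $missing
fi

# getconf reports e.g. "glibc 2.17" on glibc-based systems; empty elsewhere.
glibc="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"
if [ -n "$glibc" ] && ! version_ge "$glibc" 2.17; then
  echo "glibc $glibc is older than 2.17"
fi
```

Note that the tool and tool-specific dependencies may live on different machines: the GLIBC requirement applies where the management tool runs, while the JDK and utilities are needed on each deployment target.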
- -* Run the following command in the iotd directory: - -```bash -bash install-iotd.sh -``` - -The iotd keyword can then be used in subsequent shells. For example, to check the environment before deployment: - -```bash -iotd cluster check example -``` - -* You can also use `<iotd absolute path>/sbin/iotd` directly to execute commands without activating iotd. For example, to check the environment before deployment: - -```bash -<iotd absolute path>/sbin/iotd cluster check example -``` - -### Introduction to cluster configuration files - -* There is a cluster configuration yaml file in the `iotd/config` directory. The yaml file name is the cluster name. There can be multiple yaml files. To make configuration easier, a `default_cluster.yaml` example is provided under the `iotd/config` directory. -* The yaml file configuration consists of five major parts: `global`, `confignode_servers`, `datanode_servers`, `grafana_server`, and `prometheus_server` -* `global` is a general configuration that mainly configures the machine username and password, IoTDB local installation files, JDK configuration, etc. A `default_cluster.yaml` sample is provided in the `iotd/config` directory. - Users can copy it, rename it to their own cluster name, and refer to the instructions inside to configure the IoTDB cluster. In the `default_cluster.yaml` sample, all uncommented items are required, and those that have been commented out are optional. 
- -For example, to run the check command against `default_cluster.yaml`, execute `iotd cluster check default_cluster`. -For more commands, refer to the command list below. - - -| parameter name | description | required | -|--------------------------------|--------------------------------|----------| -| iotdb\_zip\_dir | IoTDB deployment distribution directory. If the value is empty, the package is downloaded from the address specified by `iotdb_download_url` | NO | -| iotdb\_download\_url | IoTDB download address. If `iotdb_zip_dir` has no value, the package is downloaded from this address | NO | -| jdk\_tar\_dir | jdk local directory, you can use this jdk path to upload and deploy to the target node. | NO | -| jdk\_deploy\_dir | jdk remote machine deployment directory, jdk will be deployed to this directory, and the following `jdk_dir_name` parameter forms a complete jdk deployment directory, that is, `/` | NO | -| jdk\_dir\_name | The directory name after jdk decompression, jdk_iotdb by default | NO | -| iotdb\_lib\_dir | The IoTDB lib directory or the IoTDB lib compressed package (only .zip format is supported). It is only used for IoTDB upgrade and is commented out by default. To upgrade, uncomment it and modify the path. If you use a zip file, please use the zip command to compress the iotdb/lib directory, such as zip -r lib.zip apache-iotdb-1.2.0/lib/* | NO | -| user | User name for ssh login to the deployment machine | YES | -| password | The password for ssh login. If no password is specified, pkey is used to log in; please ensure that key-based (passwordless) ssh login between nodes has been configured. 
| NO | -| pkey | Key login: If password has a value, password is used first, otherwise pkey is used to log in. | NO | -| ssh\_port | ssh port | YES | -| deploy\_dir | IoTDB deployment directory, IoTDB will be deployed to this directory and the following `iotdb_dir_name` parameter will form a complete IoTDB deployment directory, that is, `/` | YES | -| iotdb\_dir\_name | The directory name after decompression of IoTDB, iotdb by default. | NO | -| datanode-env.sh | Corresponding to `iotdb/config/datanode-env.sh`. When `global` and `datanode_servers` are configured at the same time, the value in `datanode_servers` is used first | NO | -| confignode-env.sh | Corresponding to `iotdb/config/confignode-env.sh`. When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first | NO | -| iotdb-common.properties | Corresponds to `/config/iotdb-common.properties` | NO | -| cn\_target\_config\_node\_list | The cluster configuration address points to the surviving ConfigNode, and it points to confignode_x by default. When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first, corresponding to `cn_target_config_node_list` in `iotdb/config/iotdb-confignode.properties` | YES | -| dn\_target\_config\_node\_list | The cluster configuration address points to the surviving ConfigNode, and points to confignode_x by default. When configuring values for `global` and `datanode_servers` at the same time, the value in `datanode_servers` is used first, corresponding to `dn_target_config_node_list` in `iotdb/config/iotdb-datanode.properties` | YES | - -datanode-env.sh and confignode-env.sh can each be configured with an extra parameter, extra_opts. When this parameter is configured, the corresponding values are appended to datanode-env.sh and confignode-env.sh. 
Refer to default_cluster.yaml for configuration examples as follows: -datanode-env.sh: -extra_opts: | -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC" -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200" - -* `confignode_servers` is the configuration for deploying IoTDB Confignodes, in which multiple Confignodes can be configured. - By default, the first started ConfigNode node node1 is regarded as the Seed-ConfigNode - -| parameter name | description | required | -|--------------------------------|--------------------------------|----------| -| name | Confignode name | YES | -| deploy\_dir | IoTDB config node deployment directory | YES | -| iotdb-confignode.properties | Corresponding to `iotdb/config/iotdb-confignode.properties`, please refer to the `iotdb-confignode.properties` file description for more details. | NO | -| cn_internal_address | Internal communication address, corresponding to `cn_internal_address` in `iotdb/config/iotdb-confignode.properties` | YES | -| cn\_target\_config\_node\_list | The cluster configuration address points to the surviving ConfigNode, and it points to confignode_x by default. 
When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first, corresponding to `cn_target_config_node_list` in `iotdb/config/iotdb-confignode.properties` | YES | -| cn_internal_port | Internal communication port, corresponding to `cn_internal_port` in `iotdb/config/iotdb-confignode.properties` | YES | -| cn\_consensus\_port | Corresponds to `cn_consensus_port` in `iotdb/config/iotdb-confignode.properties` | NO | -| cn\_data\_dir | Corresponds to `cn_data_dir` in `iotdb/config/iotdb-confignode.properties` | YES | -| iotdb-common.properties | Corresponding to `iotdb/config/iotdb-common.properties`, when configuring values in `global` and `confignode_servers` at the same time, the value in `confignode_servers` will be used first. | NO | - -* `datanode_servers` is the configuration for deploying IoTDB Datanodes, in which multiple Datanodes can be configured - -| parameter name | description | required | -|--------------------------------|--------------------------------|----------| -| name | Datanode name | YES | -| deploy\_dir | IoTDB data node deployment directory | YES | -| iotdb-datanode.properties | Corresponding to `iotdb/config/iotdb-datanode.properties`, please refer to the `iotdb-datanode.properties` file description for more details. 
| NO | -| dn\_rpc\_address | The datanode rpc address corresponds to `dn_rpc_address` in `iotdb/config/iotdb-datanode.properties` | YES | -| dn\_internal\_address | Internal communication address, corresponding to `dn_internal_address` in `iotdb/config/iotdb-datanode.properties` | YES | -| dn\_target\_config\_node\_list | The cluster configuration address points to the surviving ConfigNode, and points to confignode_x by default. When configuring values for `global` and `datanode_servers` at the same time, the value in `datanode_servers` is used first, corresponding to `dn_target_config_node_list` in `iotdb/config/iotdb-datanode.properties`. | YES | -| dn\_rpc\_port | Datanode rpc port address, corresponding to `dn_rpc_port` in `iotdb/config/iotdb-datanode.properties` | YES | -| dn\_internal\_port | Internal communication port, corresponding to `dn_internal_port` in `iotdb/config/iotdb-datanode.properties` | YES | -| iotdb-common.properties | Corresponding to `iotdb/config/iotdb-common.properties`, when configuring values in `global` and `datanode_servers` at the same time, the value in `datanode_servers` will be used first. 
| NO | - -* `grafana_server` is the configuration related to deploying Grafana - -| parameter name | description | required | -|--------------------|-------------------------------------------------------------|-----------| -| grafana\_dir\_name | Grafana decompression directory name (default grafana_iotdb) | NO | -| host | IP of the server where Grafana is deployed | YES | -| grafana\_port | The port of the Grafana deployment machine, default 3000 | NO | -| deploy\_dir | Grafana deployment server directory | YES | -| grafana\_tar\_dir | Grafana compressed package location | YES | -| dashboards | dashboards directory | NO | - -* `prometheus_server` is the configuration related to deploying Prometheus - -| parameter name | description | required | -|--------------------------------|----------------------------------------------------|----------| -| prometheus\_dir\_name | Prometheus decompression directory name, default prometheus_iotdb | NO | -| host | IP of the server where Prometheus is deployed | YES | -| prometheus\_port | The port of the Prometheus deployment machine, default 9090 | NO | -| deploy\_dir | Prometheus deployment server directory | YES | -| prometheus\_tar\_dir | Prometheus compressed package path | YES | -| storage\_tsdb\_retention\_time | The number of days to retain data, 15 days by default | NO | -| storage\_tsdb\_retention\_size | The data size that can be saved by the specified block defaults to 512M. Please note the units are KB, MB, GB, TB, PB, and EB. | NO | - -If metrics are configured in `iotdb-datanode.properties` and `iotdb-confignode.properties` of config/xxx.yaml, the configuration will be automatically put into Prometheus without manual modification. - -Note: If the value corresponding to a yaml key contains special characters, it is recommended to wrap the entire value in double quotes. Do not use paths containing spaces in the corresponding file paths, to prevent abnormal recognition problems. 
- -### Usage scenarios - -#### Clean data - -* The clean data scenario deletes the data directory in the IoTDB cluster and the `cn_system_dir`, `cn_consensus_dir`, - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories configured in the yaml file. -* First execute the stop cluster command, and then execute the cluster cleanup command. - -```bash -iotd cluster stop default_cluster -iotd cluster clean default_cluster -``` - -#### Cluster destruction - -* The cluster destruction scenario deletes the `data`, `cn_system_dir`, `cn_consensus_dir`, - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories in the IoTDB cluster, the `IoTDB` deployment directory, - the grafana deployment directory and the prometheus deployment directory. -* First execute the stop cluster command, and then execute the cluster destruction command. - - -```bash -iotd cluster stop default_cluster -iotd cluster destroy default_cluster -``` - -#### Cluster upgrade - -* To upgrade the cluster, you first need to configure `iotdb_lib_dir` in config/xxx.yaml as the directory path where the jar to be uploaded to the server is located (for example, iotdb/lib). -* If you use zip files to upload, please use the zip command to compress the iotdb/lib directory, such as zip -r lib.zip apache-iotdb-1.2.0/lib/* -* Execute the upload command and then execute the restart IoTDB cluster command to complete the cluster upgrade. - -```bash -iotd cluster upgrade default_cluster -iotd cluster restart default_cluster -``` - -#### Hot deployment - -* First modify the configuration in config/xxx.yaml. -* Execute the distribution command, and then execute the hot deployment command to complete the hot deployment of the cluster configuration - -```bash -iotd cluster distribute default_cluster -iotd cluster reload default_cluster -``` - -#### Cluster expansion - -* First modify and add a datanode or confignode node in config/xxx.yaml. 
-* Execute the cluster expansion command - -```bash -iotd cluster scaleout default_cluster -``` - -#### Cluster shrinking - -* First find the node name or ip+port to shrink in config/xxx.yaml (where confignode port is cn_internal_port, datanode port is rpc_port) -* Execute the cluster shrink command - -```bash -iotd cluster scalein default_cluster -``` - -#### Using cluster management tools to manipulate existing IoTDB clusters - -* Configure the server's `user`, `password` or `pkey`, `ssh_port` -* Modify the IoTDB deployment path in config/xxx.yaml, `deploy_dir` (IoTDB deployment directory), `iotdb_dir_name` (IoTDB decompression directory name, the default is iotdb) - For example, if the full path of IoTDB deployment is `/home/data/apache-iotdb-1.1.1`, you need to set `deploy_dir:/home/data/` and `iotdb_dir_name:apache-iotdb-1.1.1` in the yaml file -* If the server is not using java_home, modify `jdk_deploy_dir` (jdk deployment directory) and `jdk_dir_name` (the directory name after jdk decompression, the default is jdk_iotdb). If java_home is used, there is no need to modify the configuration. - For example, if the full path of jdk deployment is `/home/data/jdk_1.8.2`, you need to set `jdk_deploy_dir:/home/data/` and `jdk_dir_name:jdk_1.8.2` in the yaml file -* Configure `cn_target_config_node_list`, `dn_target_config_node_list` -* Configure `cn_internal_address`, `cn_internal_port`, `cn_consensus_port`, `cn_system_dir`, in `iotdb-confignode.properties` in `confignode_servers` - If the values of `cn_consensus_dir` and `iotdb-common.properties` are not the IoTDB defaults, they need to be configured; otherwise there is no need to configure them. 
-* Configure `dn_rpc_address`, `dn_internal_address`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir` and `iotdb-common.properties` in `iotdb-datanode.properties` in `datanode_servers` -* Execute the initialization command - -```bash -iotd cluster init default_cluster -``` - -#### Deploy IoTDB, Grafana and Prometheus - -* Configure `iotdb-datanode.properties`, `iotdb-confignode.properties` to open the metrics interface -* Configure the Grafana configuration. If there are multiple `dashboards`, separate them with commas. The names cannot be repeated or they will be overwritten. -* Configure the Prometheus configuration. If the IoTDB cluster is configured with metrics, there is no need to manually modify the Prometheus configuration. The Prometheus configuration will be automatically modified according to which node is configured with metrics. -* Start the cluster - -```bash -iotd cluster start default_cluster -``` - -For more detailed parameters, please refer to the cluster configuration file introduction above - -### Command - -The basic usage of this tool is: -```bash -iotd cluster <key> <cluster name> [params (Optional)] -``` -* key indicates a specific command. - -* cluster name indicates the cluster name (that is, the name of the yaml file in the `iotd/config` directory). - -* params indicates the parameters of the command (optional). 
- -* For example, the command format to deploy the default_cluster cluster is: - -```bash -iotd cluster deploy default_cluster -``` - -* The functions and parameters of the cluster are listed as follows: - -| command | description | parameter | -|------------|-----------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| check | check whether the cluster can be deployed | Cluster name list | -| clean | cleanup-cluster | cluster-name | -| deploy | deploy cluster | Cluster name, -N, module name (optional for iotdb, grafana, prometheus), -op force (optional) | -| list | cluster status list | None | -| start | start cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional) | -| stop | stop cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional), -op force (nodename, grafana, prometheus optional) | -| restart | restart cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional), -op force (nodename, grafana, prometheus optional) | -| show | view cluster information. The details field indicates the details of the cluster information. 
| Cluster name, details (optional) | -| destroy | destroy cluster | Cluster name, -N, module name (iotdb, grafana, prometheus optional) | -| scaleout | cluster expansion | Cluster name | -| scalein | cluster shrink | Cluster name, -N, cluster node name or cluster node ip+port | -| reload | hot loading of cluster configuration files | Cluster name | -| distribute | cluster configuration file distribution | Cluster name | -| dumplog | Back up specified cluster logs | Cluster name, -N, cluster node name -h Back up to target machine ip -pw Back up to target machine password -p Back up to target machine port -path Backup directory -startdate Start time -enddate End time -loglevel Log type -l transfer speed | -| dumpdata | Backup cluster data | Cluster name, -h backup to target machine ip -pw backup to target machine password -p backup to target machine port -path backup directory -startdate start time -enddate end time -l transmission speed | -| upgrade | lib package upgrade | Cluster name | -| init | When an existing cluster uses the cluster deployment tool, initialize the cluster configuration | Cluster name | -| status | View process status | Cluster name | - -### Detailed command execution process - -The following commands are executed using default_cluster.yaml as an example, and users can modify them to their own cluster files to execute - -#### Check cluster deployment environment commands - -```bash -iotd cluster check default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Verify that the target node is able to log in via SSH - -* Verify whether the JDK version on the corresponding node meets IoTDB jdk1.8 and above, and whether the server is installed with unzip, lsof, and netstat. - -* If you see the following prompt `Info:example check successfully!`, it proves that the server has already met the installation requirements. 
- If `Error:example check fail!` is output, it proves that some conditions do not meet the requirements. You can check the Error log output above (for example: `Error:Server (ip:172.20.31.76) iotdb port(10713) is listening`) to make repairs. , - If the jdk check does not meet the requirements, we can configure a jdk1.8 or above version in the yaml file ourselves for deployment without affecting subsequent use. - If checking lsof, netstat or unzip does not meet the requirements, you need to install it on the server yourself. - -#### Deploy cluster command - -```bash -iotd cluster deploy default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Upload IoTDB compressed package and jdk compressed package according to the node information in `confignode_servers` and `datanode_servers` (if `jdk_tar_dir` and `jdk_deploy_dir` values ​​are configured in yaml) - -* Generate and upload `iotdb-common.properties`, `iotdb-confignode.properties`, `iotdb-datanode.properties` according to the yaml file node configuration information - -```bash -iotd cluster deploy default_cluster -op force -``` - -Note: This command will force the deployment, and the specific process will delete the existing deployment directory and redeploy - -*deploy a single module* -```bash -# Deploy grafana module -iotd cluster deploy default_cluster -N grafana -# Deploy the prometheus module -iotd cluster deploy default_cluster -N prometheus -# Deploy the iotdb module -iotd cluster deploy default_cluster -N iotdb -``` - -#### Start cluster command - -```bash -iotd cluster start default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Start confignode, start sequentially according to the order in `confignode_servers` in the yaml configuration file and 
check whether each confignode is running normally according to its process id; the first confignode is the Seed-ConfigNode - -* Start the datanode in sequence according to the order in `datanode_servers` in the yaml configuration file and check whether the datanode is normal according to the process id. - -* After checking the existence of the process according to the process id, check whether each service in the cluster list is normal through the cli. If the cli connection fails, retry every 10s until it succeeds, up to 5 times - - -*Start a single node command* -```bash -#Start according to the IoTDB node name -iotd cluster start default_cluster -N datanode_1 -#Start according to IoTDB cluster ip+port, where port corresponds to cn_internal_port of confignode and rpc_port of datanode. -iotd cluster start default_cluster -N 192.168.1.5:6667 -#Start grafana -iotd cluster start default_cluster -N grafana -#Start prometheus -iotd cluster start default_cluster -N prometheus -``` - -* Find the yaml file in the default location based on cluster-name - -* Find the node location information based on the provided node name or ip:port. If the started node is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file. - If the started node is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - -* Start the node - -Note: Since the cluster deployment tool only calls the start-confignode.sh and start-datanode.sh scripts in the IoTDB cluster, -when the actual output result fails, it may be that the cluster has not started normally. 
It is recommended to use the status command to check the current cluster status (iotd cluster status xxx) - - -#### View IoTDB cluster status command - -```bash -iotd cluster show default_cluster -#View IoTDB cluster details -iotd cluster show default_cluster details -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Execute `show cluster details` through cli on datanode in turn. If one node is executed successfully, it will not continue to execute cli on subsequent nodes and return the result directly. - -#### Stop cluster command - - -```bash -iotd cluster stop default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* According to the datanode node information in `datanode_servers`, stop the datanode nodes in order according to the configuration. - -* Based on the confignode node information in `confignode_servers`, stop the confignode nodes in sequence according to the configuration - -*force stop cluster command* - -```bash -iotd cluster stop default_cluster -op force -``` -Will directly execute the kill -9 pid command to forcibly stop the cluster - -*Stop single node command* - -```bash -#Stop by IoTDB node name -iotd cluster stop default_cluster -N datanode_1 -#Stop according to IoTDB cluster ip+port (ip+port is to get the only node according to ip+dn_rpc_port in datanode or ip+cn_internal_port in confignode to get the only node) -iotd cluster stop default_cluster -N 192.168.1.5:6667 -#Stop grafana -iotd cluster stop default_cluster -N grafana -#Stop prometheus -iotd cluster stop default_cluster -N prometheus -``` - -* Find the yaml file in the default location based on cluster-name - -* Find the corresponding node location information based on the provided node name or ip:port. 
If the stopped node is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file. - If the stopped node is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - -* stop the node - -Note: Since the cluster deployment tool only calls the stop-confignode.sh and stop-datanode.sh scripts in the IoTDB cluster, in some cases the iotdb cluster may not be stopped. - - -#### Clean cluster data command - -```bash -iotd cluster clean default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Based on the information in `confignode_servers` and `datanode_servers`, check whether there are still services running, - If any service is running, the cleanup command will not be executed. - -* Delete the data directory in the IoTDB cluster and the `cn_system_dir`, `cn_consensus_dir`, configured in the yaml file - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories. - - - -#### Restart cluster command - -```bash -iotd cluster restart default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` - -* Execute the above stop cluster command (stop), and then execute the start cluster command (start). For details, refer to the above start and stop commands. 
- -*Force restart cluster command* - -```bash -iotd cluster restart default_cluster -op force -``` -This directly executes the `kill -9 pid` command to force-stop the cluster, and then starts the cluster - - -*Restart a single node command* - -```bash -#Restart datanode_1 according to the IoTDB node name -iotd cluster restart default_cluster -N datanode_1 -#Restart confignode_1 according to the IoTDB node name -iotd cluster restart default_cluster -N confignode_1 -#Restart grafana -iotd cluster restart default_cluster -N grafana -#Restart prometheus -iotd cluster restart default_cluster -N prometheus -``` - -#### Cluster shrink command - -```bash -#Scale down by node name -iotd cluster scalein default_cluster -N nodename -#Scale down according to ip+port (ip+port obtains the only node according to ip+dn_rpc_port in datanode, and obtains the only node according to ip+cn_internal_port in confignode) -iotd cluster scalein default_cluster -N ip:port -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Check whether the node to be removed is the only remaining confignode or datanode. If only one is left, the scale-in cannot be performed. - -* Then get the information of the node to remove according to ip:port or nodename, execute the shrink command, and then destroy the node directory. If the node being removed is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file.
- If the shrinking node is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - - -Tip: Currently, only one node can be scaled in at a time - -#### Cluster expansion command - -```bash -iotd cluster scaleout default_cluster -``` -* Modify the config/xxx.yaml file to add a datanode node or confignode node - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Find the node to be expanded, upload the IoTDB compressed package and jdk package (if the `jdk_tar_dir` and `jdk_deploy_dir` values are configured in yaml) and decompress them - -* Generate and upload `iotdb-common.properties`, `iotdb-confignode.properties` or `iotdb-datanode.properties` according to the yaml file node configuration information - -* Execute the command to start the node and verify whether the node is started successfully - -Tip: Currently, only one node can be added at a time - -#### Destroy cluster command -```bash -iotd cluster destroy default_cluster -``` - -* Find the yaml file in the default location based on cluster-name - -* Check whether any node is still running based on the node information in `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus`.
- Stop the destroy command if any node is running - -* Delete `data` in the IoTDB cluster and `cn_system_dir`, `cn_consensus_dir` configured in the yaml file - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, `ext`, `IoTDB` deployment directory, - grafana deployment directory and prometheus deployment directory - -*Destroy a single module* - -```bash -# Destroy grafana module -iotd cluster destroy default_cluster -N grafana -# Destroy prometheus module -iotd cluster destroy default_cluster -N prometheus -# Destroy iotdb module -iotd cluster destroy default_cluster -N iotdb -``` - -#### Distribute cluster configuration commands - -```bash -iotd cluster distribute default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` - -* Generate and upload `iotdb-common.properties`, `iotdb-confignode.properties`, `iotdb-datanode.properties` to the specified node according to the node configuration information of the yaml file - -#### Hot load cluster configuration command - -```bash -iotd cluster reload default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Execute `load configuration` in the cli according to the node configuration information of the yaml file. 
- -#### Cluster node log backup -```bash -iotd cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs' -``` - -* Find the yaml file in the default location based on cluster-name - -* This command verifies the existence of datanode_1 and confignode_1 according to the yaml file, and then backs up the log data of the specified nodes datanode_1 and confignode_1 to the specified server `192.168.9.48`, port `36000`, according to the configured start and end dates (startdate<=logtime<=enddate). The data backup path is `/iotdb/logs`, and the IoTDB log storage path is `/root/data/db/iotdb/logs` (not required; if you do not fill in -logs xxx, logs are backed up from the /logs directory under the IoTDB installation path) - -| command | description | required | -|------------|-------------------------------------------------------------------------|----------| -| -h | backup data server ip | NO | -| -u | backup data server username | NO | -| -pw | backup data machine password | NO | -| -p | backup data machine port (default 22) | NO | -| -path | path to backup data (default current path) | NO | -| -loglevel | Log levels include all, info, error, warn (default is all) | NO | -| -l | speed limit (default 1024; range 0 to 104857601; unit Kbit/s) | NO | -| -N | multiple configuration file cluster names are separated by commas.
| YES | -| -startdate | start time (including default 1970-01-01) | NO | -| -enddate | end time (included) | NO | -| -logs | IoTDB log storage path, the default is ({iotdb}/logs)) | NO | - -#### Cluster data backup -```bash -iotd cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas' -``` -* This command will obtain the leader node based on the yaml file, and then back up the data to the /iotdb/datas directory on the 192.168.9.48 service based on the start and end dates (startdate<=logtime<=enddate) - -| command | description | required | -|--------------|-------------------------------------------------------------------------|----------| -| -h | backup data server ip | NO | -| -u | backup data server username | NO | -| -pw | backup data machine password | NO | -| -p | backup data machine port(default 22) | NO | -| -path | path to backup data (default current path) | NO | -| -granularity | partition | YES | -| -l | speed limit (default 1024 speed limit range 0 to 104857601 unit Kbit/s) | NO | -| -startdate | start time (including default 1970-01-01) | YES | -| -enddate | end time (included) | YES | - -#### Cluster upgrade -```bash -iotd cluster upgrade default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Upload lib package - -Note that after performing the upgrade, please restart IoTDB for it to take effect. 
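Because the uploaded lib package only takes effect after a restart, the upgrade can be chained with the restart and status commands described in this document. The following is a minimal sketch, assuming the `iotd` subcommands behave as documented here; the `upgrade_and_restart` function and the `IOTD` variable are illustrative helpers and not part of the tool.

```bash
# IOTD is a hypothetical override for dry runs (for example IOTD=echo);
# it defaults to the real deployment tool.
IOTD="${IOTD:-iotd}"

# Upload the new lib package, restart so it takes effect, then show the
# process status. The chain stops at the first failing step.
upgrade_and_restart() {
  cluster_name="$1"
  "$IOTD" cluster upgrade "$cluster_name" &&
    "$IOTD" cluster restart "$cluster_name" &&
    "$IOTD" cluster status "$cluster_name"
}

# Usage: upgrade_and_restart default_cluster
```

Setting `IOTD=echo` before calling the function prints the three commands instead of running them, which may be useful for reviewing the sequence first.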
- -#### Cluster initialization -```bash -iotd cluster init default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` -* Initialize cluster configuration - -#### View cluster process status -```bash -iotd cluster status default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` -* Display whether each node in the cluster is alive - -### Introduction to Cluster Deployment Tool Samples - -In the cluster deployment tool installation directory config/example, there are three yaml examples. If necessary, you can copy them to config and modify them. - -| name | description | -|-----------------------------|------------------------------------------------| -| default\_1c1d.yaml | 1 confignode and 1 datanode configuration example | -| default\_3c3d.yaml | 3 confignode and 3 datanode configuration example | -| default\_3c3d\_grafa\_prome | 3 confignode and 3 datanode, Grafana, Prometheus configuration example | - -## Manual Deployment - -### Prerequisites - -1. JDK>=1.8. -2. Max open files: 65535. -3. Disable swap memory. -4. Ensure that the data/confignode directory has been cleared when starting ConfigNode for the first time, - and the data/datanode directory has been cleared when starting DataNode for the first time -5. Turn off the firewall of the server if the entire cluster is in a trusted environment. -6. By default, IoTDB Cluster will use ports 10710, 10720 for the ConfigNode and - 6667, 10730, 10740, 10750 and 10760 for the DataNode. - Please make sure those ports are not occupied, or modify the ports in the configuration files.
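Two of the prerequisites above (free default ports and the open-file limit) can be checked from a shell before the first start. The following is a rough sketch, assuming bash with its `/dev/tcp` redirection feature enabled; the function names `port_in_use` and `limit_ok` are made up for this example. It only detects listeners reachable on 127.0.0.1, so a service bound to another interface would be missed.

```bash
# Default ports listed in the prerequisites.
CONFIGNODE_PORTS="10710 10720"
DATANODE_PORTS="6667 10730 10740 10750 10760"

# Returns 0 if something accepts TCP connections on 127.0.0.1:<port>.
# Relies on bash's /dev/tcp pseudo-device; the subshell closes the socket.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Returns 0 if the given open-file limit meets the 65535 requirement.
limit_ok() {
  [ "$1" = "unlimited" ] || [ "$1" -ge 65535 ]
}

for port in $CONFIGNODE_PORTS $DATANODE_PORTS; do
  if port_in_use "$port"; then
    echo "port $port is occupied"
  fi
done

if ! limit_ok "$(ulimit -n)"; then
  echo "max open files is $(ulimit -n), expected at least 65535"
fi
```

If a port is reported as occupied, either free it or change the corresponding port in the configuration files before starting the node.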
- -### Get the Installation Package - -You can either download the binary release files (see Chap 3.1) or compile with source code (see Chap 3.2). - -#### Download the binary distribution - -1. Open our website [Download Page](https://iotdb.apache.org/Download/). -2. Download the binary distribution. -3. Decompress to get the apache-iotdb-1.0.0-all-bin directory. - -#### Compile with source code - -##### Download the source code - -**Git** - -``` -git clone https://github.com/apache/iotdb.git -git checkout v1.0.0 -``` - -**Website** - -1. Open our website [Download Page](https://iotdb.apache.org/Download/). -2. Download the source code. -3. Decompress to get the apache-iotdb-1.0.0 directory. - -##### Compile source code - -Under the source root folder: - -``` -mvn clean package -pl distribution -am -DskipTests -``` - -Then you will get the binary distribution under -**distribution/target/apache-iotdb-1.0.0-SNAPSHOT-all-bin/apache-iotdb-1.0.0-SNAPSHOT-all-bin**. - -### Binary Distribution Content - -| **Folder** | **Description** | -| ---------- | ------------------------------------------------------------ | -| conf | Configuration files folder, contains configuration files of ConfigNode, DataNode, JMX and logback | -| data | Data files folder, contains data files of ConfigNode and DataNode | -| lib | Jar files folder | -| licenses | Licenses files folder | -| logs | Logs files folder, contains logs files of ConfigNode and DataNode | -| sbin | Shell files folder, contains start/stop/remove shell of ConfigNode and DataNode, cli shell | -| tools | System tools | - -### Cluster Installation and Configuration - -#### Cluster Installation - -`apache-iotdb-1.0.0-SNAPSHOT-all-bin` contains both the ConfigNode and the DataNode. -Please deploy the files to all servers of your target cluster. -A best practice is deploying the files into the same directory in all servers. 
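Copying the package to the same directory on every server can be scripted. The sketch below only prints the copy commands so the plan can be reviewed before anything is transferred; the host list, the target directory `/opt/iotdb`, and the `plan_deploy` function are hypothetical and should be adapted. It assumes `rsync` over SSH is available, which this guide does not require.

```bash
# Hypothetical host list and target directory; adjust to your environment.
HOSTS="192.168.1.10 192.168.1.11 192.168.1.12"
TARGET_DIR="/opt/iotdb"
PACKAGE_DIR="apache-iotdb-1.0.0-SNAPSHOT-all-bin"

# Print (rather than run) one rsync command per host, using the same
# destination directory on every server.
plan_deploy() {
  for host in $HOSTS; do
    echo "rsync -a $PACKAGE_DIR/ $host:$TARGET_DIR/$PACKAGE_DIR/"
  done
}

plan_deploy
```

Once the printed commands look right, they can be run one by one or piped to a shell.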
- -If you want to try the cluster mode on one server, please read -[Cluster Quick Start](../QuickStart/ClusterQuickStart.md). - -#### Cluster Configuration - -We need to modify the configurations on each server. -Therefore, login each server and switch the working directory to `apache-iotdb-1.0.0-SNAPSHOT-all-bin`. -The configuration files are stored in the `./conf` directory. - -For all ConfigNode servers, we need to modify the common configuration (see Chap 5.2.1) -and ConfigNode configuration (see Chap 5.2.2). - -For all DataNode servers, we need to modify the common configuration (see Chap 5.2.1) -and DataNode configuration (see Chap 5.2.3). - -##### Common configuration - -Open the common configuration file ./conf/iotdb-common.properties, -and set the following parameters base on the -[Deployment Recommendation](./Deployment-Recommendation.md): - -| **Configuration** | **Description** | **Default** | -| ------------------------------------------ | ------------------------------------------------------------ | ----------------------------------------------- | -| cluster\_name | Cluster name for which the Node to join in | defaultCluster | -| config\_node\_consensus\_protocol\_class | Consensus protocol of ConfigNode | org.apache.iotdb.consensus.ratis.RatisConsensus | -| schema\_replication\_factor | Schema replication factor, no more than DataNode number | 1 | -| schema\_region\_consensus\_protocol\_class | Consensus protocol of schema replicas | org.apache.iotdb.consensus.ratis.RatisConsensus | -| data\_replication\_factor | Data replication factor, no more than DataNode number | 1 | -| data\_region\_consensus\_protocol\_class | Consensus protocol of data replicas. Note that RatisConsensus currently does not support multiple data directories | org.apache.iotdb.consensus.iot.IoTConsensus | - -**Notice: The preceding configuration parameters cannot be changed after the cluster is started. Ensure that the common configurations of all Nodes are the same. 
Otherwise, the Nodes cannot be started.** - -##### ConfigNode configuration - -Open the ConfigNode configuration file ./conf/iotdb-confignode.properties, -and set the following parameters based on the IP address and available port of the server or VM: - -| **Configuration** | **Description** | **Default** | **Usage** | -| ------------------------------ | ------------------------------------------------------------ | --------------- | ------------------------------------------------------------ | -| cn\_internal\_address | Internal rpc service address of ConfigNode | 127.0.0.1 | Set to the IPV4 address or domain name of the server | -| cn\_internal\_port | Internal rpc service port of ConfigNode | 10710 | Set to any unoccupied port | -| cn\_consensus\_port | ConfigNode replication consensus protocol communication port | 10720 | Set to any unoccupied port | -| cn\_target\_config\_node\_list | ConfigNode address to which the node is connected when it is registered to the cluster. Note that Only one ConfigNode can be configured. | 127.0.0.1:10710 | For Seed-ConfigNode, set to its own cn\_internal\_address:cn\_internal\_port; For other ConfigNodes, set to other one running ConfigNode's cn\_internal\_address:cn\_internal\_port | - -**Notice: The preceding configuration parameters cannot be changed after the node is started. Ensure that all ports are not occupied. 
Otherwise, the Node cannot be started.** - -##### DataNode configuration - -Open the DataNode configuration file ./conf/iotdb-datanode.properties, -and set the following parameters based on the IP address and available port of the server or VM: - -| **Configuration** | **Description** | **Default** | **Usage** | -| ----------------------------------- | ------------------------------------------------ | --------------- | ------------------------------------------------------------ | -| dn\_rpc\_address | Client RPC Service address | 127.0.0.1 | Set to the IPV4 address or domain name of the server | -| dn\_rpc\_port | Client RPC Service port | 6667 | Set to any unoccupied port | -| dn\_internal\_address | Control flow address of DataNode inside cluster | 127.0.0.1 | Set to the IPV4 address or domain name of the server | -| dn\_internal\_port | Control flow port of DataNode inside cluster | 10730 | Set to any unoccupied port | -| dn\_mpp\_data\_exchange\_port | Data flow port of DataNode inside cluster | 10740 | Set to any unoccupied port | -| dn\_data\_region\_consensus\_port | Data replicas communication port for consensus | 10750 | Set to any unoccupied port | -| dn\_schema\_region\_consensus\_port | Schema replicas communication port for consensus | 10760 | Set to any unoccupied port | -| dn\_target\_config\_node\_list | Running ConfigNode of the Cluster | 127.0.0.1:10710 | Set to any running ConfigNode's cn\_internal\_address:cn\_internal\_port. You can set multiple values, separate them with commas(",") | - -**Notice: The preceding configuration parameters cannot be changed after the node is started. Ensure that all ports are not occupied. Otherwise, the Node cannot be started.** - -### Cluster Operation - -#### Starting the cluster - -This section describes how to start a cluster that includes several ConfigNodes and DataNodes. 
-The cluster can provide services only after at least one ConfigNode -and at least max{schema\_replication\_factor, data\_replication\_factor} DataNodes have been started. - -The whole process has three steps: - -* Start the Seed-ConfigNode -* Add ConfigNode (Optional) -* Add DataNode - -##### Start the Seed-ConfigNode - -**The first Node started in the cluster must be ConfigNode. The first started ConfigNode must follow the tutorial in this section.** - -The first ConfigNode to start is the Seed-ConfigNode, which marks the creation of the new cluster. -Before starting the Seed-ConfigNode, please open the common configuration file ./conf/iotdb-common.properties and check the following parameters: - -| **Configuration** | **Check** | -| ------------------------------------------ | ----------------------------------------------- | -| cluster\_name | Is set to the expected name | -| config\_node\_consensus\_protocol\_class | Is set to the expected consensus protocol | -| schema\_replication\_factor | Is set to the expected schema replication count | -| schema\_region\_consensus\_protocol\_class | Is set to the expected consensus protocol | -| data\_replication\_factor | Is set to the expected data replication count | -| data\_region\_consensus\_protocol\_class | Is set to the expected consensus protocol | - -**Notice:** Please set these parameters carefully based on the [Deployment Recommendation](./Deployment-Recommendation.md).
- -Then open its configuration file ./conf/iotdb-confignode.properties and check the following parameters: - -| **Configuration** | **Check** | -| ------------------------------ | ------------------------------------------------------------ | -| cn\_internal\_address | Is set to the IPv4 address or domain name of the server | -| cn\_internal\_port | The port isn't occupied | -| cn\_consensus\_port | The port isn't occupied | -| cn\_target\_config\_node\_list | Is set to its own internal communication address, which is cn\_internal\_address:cn\_internal\_port | - -After checking, you can run the startup script on the server: - -``` -# Linux foreground -bash ./sbin/start-confignode.sh - -# Linux background -nohup bash ./sbin/start-confignode.sh >/dev/null 2>&1 & - -# Windows -.\sbin\start-confignode.bat -``` - -For more details about other configuration parameters of ConfigNode, see the -[ConfigNode Configurations](../Reference/ConfigNode-Config-Manual.md). - -##### Add more ConfigNodes (Optional) - -**Any ConfigNode that is not the first one started must follow the tutorial in this section.** - -You can add more ConfigNodes to the cluster to ensure high availability of ConfigNodes. -A common configuration is to add two extra ConfigNodes so that the cluster has three ConfigNodes. - -Ensure that all configuration parameters in ./conf/iotdb-common.properties are the same as those in the Seed-ConfigNode; -otherwise, it may fail to start or generate runtime errors.
-Therefore, please check the following parameters in common configuration file: - -| **Configuration** | **Check** | -| ------------------------------------------ | -------------------------------------- | -| cluster\_name | Is consistent with the Seed-ConfigNode | -| config\_node\_consensus\_protocol\_class | Is consistent with the Seed-ConfigNode | -| schema\_replication\_factor | Is consistent with the Seed-ConfigNode | -| schema\_region\_consensus\_protocol\_class | Is consistent with the Seed-ConfigNode | -| data\_replication\_factor | Is consistent with the Seed-ConfigNode | -| data\_region\_consensus\_protocol\_class | Is consistent with the Seed-ConfigNode | - -Then, please open its configuration file ./conf/iotdb-confignode.properties and check the following parameters: - -| **Configuration** | **Check** | -| ------------------------------ | ------------------------------------------------------------ | -| cn\_internal\_address | Is set to the IPV4 address or domain name of the server | -| cn\_internal\_port | The port isn't occupied | -| cn\_consensus\_port | The port isn't occupied | -| cn\_target\_config\_node\_list | Is set to the internal communication address of an other running ConfigNode. The internal communication address of the seed ConfigNode is recommended. | - -After checking, you can run the startup script on the server: - -``` -# Linux foreground -bash ./sbin/start-confignode.sh - -# Linux background -nohup bash ./sbin/start-confignode.sh >/dev/null 2>&1 & - -# Windows -.\sbin\start-confignode.bat -``` - -For more details about other configuration parameters of ConfigNode, see the -[ConfigNode Configurations](../Reference/ConfigNode-Config-Manual.md). - -##### Start DataNode - -**Before adding DataNodes, ensure that there exists at least one ConfigNode is running in the cluster.** - -You can add any number of DataNodes to the cluster. 
-Before adding a new DataNode, - -please open its common configuration file ./conf/iotdb-common.properties and check the following parameters: - -| **Configuration** | **Check** | -| ----------------- | -------------------------------------- | -| cluster\_name | Is consistent with the Seed-ConfigNode | - -Then open its configuration file ./conf/iotdb-datanode.properties and check the following parameters: - -| **Configuration** | **Check** | -| ----------------------------------- | ------------------------------------------------------------ | -| dn\_rpc\_address | Is set to the IPV4 address or domain name of the server | -| dn\_rpc\_port | The port isn't occupied | -| dn\_internal\_address | Is set to the IPV4 address or domain name of the server | -| dn\_internal\_port | The port isn't occupied | -| dn\_mpp\_data\_exchange\_port | The port isn't occupied | -| dn\_data\_region\_consensus\_port | The port isn't occupied | -| dn\_schema\_region\_consensus\_port | The port isn't occupied | -| dn\_target\_config\_node\_list | Is set to the internal communication address of other running ConfigNodes. The internal communication address of the seed ConfigNode is recommended. | - -After checking, you can run the startup script on the server: - -``` -# Linux foreground -bash ./sbin/start-datanode.sh - -# Linux background -nohup bash ./sbin/start-datanode.sh >/dev/null 2>&1 & - -# Windows -.\sbin\start-datanode.bat -``` - -For more details about other configuration parameters of DataNode, see the -[DataNode Configurations](../Reference/DataNode-Config-Manual.md). 
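The number of DataNodes needed before the cluster can serve requests follows from the two replication factors in the common configuration file: it is the larger of `schema_replication_factor` and `data_replication_factor`. The following is a minimal sketch that reads both values and prints that minimum. The helper names `get_prop` and `min_datanodes` are invented for this example, and the fallback of 1 assumes the defaults listed in the common configuration table.

```bash
# Print the value of a key from a .properties file, or the given default
# when the key is absent, commented out, or the file is missing.
get_prop() {
  local file key default value
  file="$1"; key="$2"; default="$3"
  value=$(grep -E "^[[:space:]]*${key}[[:space:]]*=" "$file" 2>/dev/null \
    | tail -n 1 | cut -d= -f2- | tr -d '[:space:]') || true
  echo "${value:-$default}"
}

# Print max(schema_replication_factor, data_replication_factor).
min_datanodes() {
  local conf schema data
  conf="$1"
  schema=$(get_prop "$conf" schema_replication_factor 1)
  data=$(get_prop "$conf" data_replication_factor 1)
  if [ "$schema" -gt "$data" ]; then
    echo "$schema"
  else
    echo "$data"
  fi
}

# Usage: min_datanodes ./conf/iotdb-common.properties
```

Comparing the printed number with the count of running DataNodes gives a quick check of whether the cluster is expected to be serviceable yet.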
- -**Notice: The cluster can provide services only if the number of its DataNodes is no less than the number of replicas(max{schema\_replication\_factor, data\_replication\_factor}).** - -#### Start Cli - -If the cluster is in local environment, you can directly run the Cli startup script in the ./sbin directory: - -``` -# Linux -./sbin/start-cli.sh - -# Windows -.\sbin\start-cli.bat -``` - -If you want to use the Cli to connect to a cluster in the production environment, -Please read the [Cli manual](../Tools-System/CLI.md). - -#### Verify Cluster - -Use a 3C3D(3 ConfigNodes and 3 DataNodes) as an example. -Assumed that the IP addresses of the 3 ConfigNodes are 192.168.1.10, 192.168.1.11 and 192.168.1.12, and the default ports 10710 and 10720 are used. -Assumed that the IP addresses of the 3 DataNodes are 192.168.1.20, 192.168.1.21 and 192.168.1.22, and the default ports 6667, 10730, 10740, 10750 and 10760 are used. - -After starting the cluster successfully according to chapter 6.1, you can run the `show cluster details` command on the Cli, and you will see the following results: - -``` -IoTDB> show cluster details -+------+----------+-------+---------------+------------+-------------------+------------+-------+-------+-------------------+-----------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|ConfigConsensusPort| RpcAddress|RpcPort|MppPort|SchemaConsensusPort|DataConsensusPort| -+------+----------+-------+---------------+------------+-------------------+------------+-------+-------+-------------------+-----------------+ -| 0|ConfigNode|Running| 192.168.1.10| 10710| 10720| | | | | | -| 2|ConfigNode|Running| 192.168.1.11| 10710| 10720| | | | | | -| 3|ConfigNode|Running| 192.168.1.12| 10710| 10720| | | | | | -| 1| DataNode|Running| 192.168.1.20| 10730| |192.168.1.20| 6667| 10740| 10750| 10760| -| 4| DataNode|Running| 192.168.1.21| 10730| |192.168.1.21| 6667| 10740| 10750| 10760| -| 5| DataNode|Running| 192.168.1.22| 10730| |192.168.1.22| 6667| 
10740| 10750| 10760| -+------+----------+-------+---------------+------------+-------------------+------------+-------+-------+-------------------+-----------------+ -Total line number = 6 -It costs 0.012s -``` - -If the status of all Nodes is **Running**, the cluster deployment is successful. -Otherwise, read the run logs of the Node that fails to start and -check the corresponding configuration parameters. - -#### Stop IoTDB - -This section describes how to manually shut down the ConfigNode or DataNode process of the IoTDB. - -##### Stop ConfigNode by script - -Run the stop ConfigNode script: - -``` -# Linux -./sbin/stop-confignode.sh - -# Windows -.\sbin\stop-confignode.bat -``` - -##### Stop DataNode by script - -Run the stop DataNode script: - -``` -# Linux -./sbin/stop-datanode.sh - -# Windows -.\sbin\stop-datanode.bat -``` - -##### Kill Node process - -Get the process number of the Node: - -``` -jps - -# or - -ps aux | grep iotdb -``` - -Kill the process: - -``` -kill -9 -``` - -**Notice Some ports require root access, in which case use sudo** - -#### Shrink the Cluster - -This section describes how to remove ConfigNode or DataNode from the cluster. - -##### Remove ConfigNode - -Before removing a ConfigNode, ensure that there is at least one active ConfigNode in the cluster after the removal. -Run the remove-confignode script on an active ConfigNode: - -``` -# Linux -# Remove the ConfigNode with confignode_id -./sbin/remove-confignode.sh - -# Remove the ConfigNode with address:port -./sbin/remove-confignode.sh : - - -# Windows -# Remove the ConfigNode with confignode_id -.\sbin\remove-confignode.bat - -# Remove the ConfigNode with address:port -.\sbin\remove-confignode.bat : -``` - -##### Remove DataNode - -Before removing a DataNode, ensure that the cluster has at least the number of data/schema replicas DataNodes. 
-Run the remove-datanode script on an active DataNode: - -``` -# Linux -# Remove the DataNode with datanode_id -./sbin/remove-datanode.sh - -# Remove the DataNode with rpc address:port -./sbin/remove-datanode.sh : - - -# Windows -# Remove the DataNode with datanode_id -.\sbin\remove-datanode.bat - -# Remove the DataNode with rpc address:port -.\sbin\remove-datanode.bat : -``` - -### FAQ - -See [FAQ](../FAQ/Frequently-asked-questions.md). \ No newline at end of file diff --git a/src/UserGuide/V1.2.x/Tools-System/Maintenance-Tool_timecho.md b/src/UserGuide/V1.2.x/Tools-System/Maintenance-Tool_timecho.md deleted file mode 100644 index 03387256b..000000000 --- a/src/UserGuide/V1.2.x/Tools-System/Maintenance-Tool_timecho.md +++ /dev/null @@ -1,232 +0,0 @@ - -# Maintenance Tool - -## Cluster Management Tool - -TODO - -## IoTDB Data Directory Overview Tool - -IoTDB data directory overview tool is used to print an overview of the IoTDB data directory structure. The location is tools/tsfile/print-iotdb-data-dir. - -### Usage - -- For Windows: - -```bash -.\print-iotdb-data-dir.bat () -``` - -- For Linux or MacOs: - -```shell -./print-iotdb-data-dir.sh () -``` - -Note: if the storage path of the output overview file is not set, the default relative path "IoTDB_data_dir_overview.txt" will be used. - -### Example - -Use Windows in this example: - -`````````````````````````bash -.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data -```````````````````````` -Starting Printing the IoTDB Data Directory Overview -```````````````````````` -output save path:IoTDB_data_dir_overview.txt -data dir num:1 -143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-common.properties, use the default configs. 
-|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -````````````````````````` - -## TsFile Sketch Tool - -TsFile sketch tool is used to print the content of a TsFile in sketch mode. The location is tools/tsfile/print-tsfile. - -### Usage - -- For Windows: - -``` -.\print-tsfile-sketch.bat () -``` - -- For Linux or MacOs: - -``` -./print-tsfile-sketch.sh () -``` - -Note: if the storage path of the output sketch file is not set, the default relative path "TsFile_sketch_view.txt" will be used. - -### Example - -Use Windows in this example: - -`````````````````````````bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-common.properties, use the default configs. 
--------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] 
offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - └──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -````````````````````````` - -Explanations: - -- Separated by "|", the left is the actual position in the TsFile, and the right is the summary content. -- "||||||||||||||||||||" is the guide information added to enhance readability, not the actual data stored in TsFile. -- The last printed "IndexOfTimerseriesIndex Tree" is a reorganization of the metadata index tree at the end of the TsFile, which is convenient for intuitive understanding, and again not the actual data stored in TsFile. - -## TsFile Resource Sketch Tool - -TsFile resource sketch tool is used to print the content of a TsFile resource file. The location is tools/tsfile/print-tsfile-resource-files. 
- -### Usage - -- For Windows: - -```bash -.\print-tsfile-resource-files.bat -``` - -- For Linux or MacOs: - -``` -./print-tsfile-resource-files.sh -``` - -### Example - -Use Windows in this example: - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-common.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-common.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-common.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-datanode.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-datanode.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. 
-````````````````````````` - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-common.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-common.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-common.properties from any of the known sources. -188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-datanode.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-datanode.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. 
-````````````````````````` diff --git a/src/UserGuide/V1.2.x/Tools-System/Monitor-Tool_timecho.md b/src/UserGuide/V1.2.x/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index 55faee221..000000000 --- a/src/UserGuide/V1.2.x/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,180 +0,0 @@ - - -# Monitor Tool - -## Prometheus - -### The mapping from metric type to prometheus format - -> For metrics whose Metric Name is name and Tags are K1=V1, ..., Kn=Vn, the mapping is as follows, where value is a -> specific value - -| Metric Type | Mapping | -| ---------------- | ------------------------------------------------------------ | -| Counter | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value | -| AutoGauge、Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value | -| Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.5"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.99"} value | -| Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m1"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m5"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m15"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="mean"} value | -| Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.5"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.99"} value | - -### Config File - -1) Taking DataNode as an example, modify the iotdb-datanode.properties configuration file as follows: - -```properties -dn_metric_reporter_list=PROMETHEUS -dn_metric_level=CORE -dn_metric_prometheus_reporter_port=9091 -``` - -Then you can get metrics data as follows: - -2) Start IoTDB DataNodes -3) Open a browser or use ```curl``` to visit ```http://server_ip:9091/metrics```; the response contains metric -data such as the following: - -``` -... -# HELP file_count -# TYPE file_count gauge -file_count{name="wal",} 0.0 -file_count{name="unseq",} 0.0 -file_count{name="seq",} 2.0 -... -``` - -### Prometheus + Grafana - -As shown above, IoTDB exposes its monitoring metrics in the standard Prometheus format. Prometheus -can be used to collect and store the metrics, and Grafana can be used to visualize them. - -The following picture describes the relationships among IoTDB, Prometheus and Grafana: - -![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png) - -1. While running, IoTDB continuously collects its metrics. -2. Prometheus scrapes metrics from IoTDB at a constant interval (can be configured). -3. Prometheus saves these metrics to its internal TSDB. -4. Grafana queries metrics from Prometheus at a constant interval (can be configured) and then presents them in -graphs. - -So some additional work is needed to configure and deploy Prometheus and Grafana.
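Before wiring up Prometheus, it can help to check the text returned by the `/metrics` endpoint by hand. The following is a minimal sketch in Python, not part of IoTDB: the function name `parse_metrics` is hypothetical, and the parser is deliberately simplified (it assumes no `}` inside label values and no trailing timestamp on a sample line). It only shows how lines such as `file_count{name="seq",} 2.0` break down into a metric name, tags, and a value.

```python
import re

# One sample line of the Prometheus text format, e.g.
#   file_count{name="seq",} 2.0
# Simplification (assumption): label values contain no "}" and there is
# no trailing timestamp after the value.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'   # metric name
    r'(?:\{(?P<labels>[^}]*)\})?'            # optional {k="v",...} block
    r'\s+(?P<value>\S+)$'                    # sample value
)
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels_dict, value) for each sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, "# HELP" / "# TYPE" comments, and "..." placeholders.
        if not line or line.startswith("#") or line == "...":
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a sample line this simplified parser understands
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


if __name__ == "__main__":
    body = (
        "# HELP file_count\n"
        "# TYPE file_count gauge\n"
        'file_count{name="wal",} 0.0\n'
        'file_count{name="unseq",} 0.0\n'
        'file_count{name="seq",} 2.0\n'
    )
    for name, labels, value in parse_metrics(body):
        print(name, labels, value)
```

In practice the `body` string would be the response fetched from the reporter port configured above; for anything beyond a quick check, a maintained Prometheus client library is the safer choice.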
- -For instance, you can configure Prometheus as follows to get metrics data from IoTDB: - -```yaml -job_name: pull-metrics -honor_labels: true -honor_timestamps: true -scrape_interval: 15s -scrape_timeout: 10s -metrics_path: /metrics -scheme: http -follow_redirects: true -static_configs: - - targets: - - localhost:9091 -``` - -The following documents may help you get started with Prometheus and Grafana. - -[Prometheus getting_started](https://prometheus.io/docs/prometheus/latest/getting_started/) - -[Prometheus scrape metrics](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) - -[Grafana getting_started](https://grafana.com/docs/grafana/latest/getting-started/getting-started/) - -[Grafana query metrics from Prometheus](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus) - -## Apache IoTDB Dashboard - -The Apache IoTDB Dashboard is designed for unified, centralized operations and management. With it, multiple clusters can be monitored through a single panel. - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png) - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png) - - -You can access the Dashboard's JSON file in the enterprise edition. - -### Cluster Overview - -Including but not limited to: - -- Total cluster CPU cores, memory space, and hard disk space. -- Number of ConfigNodes and DataNodes in the cluster. -- Cluster uptime duration. -- Cluster write speed. -- Current CPU, memory, and disk usage across all nodes in the cluster. -- Information on individual nodes. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png) - - -### Data Writing - -Including but not limited to: - -- Average write latency, median latency, and 99th percentile latency. -- Number and size of WAL files. -- Node WAL flush SyncBuffer latency.
- -![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png) - -### Data Querying - -Including but not limited to: - -- Node query load times for time series metadata. -- Node read duration for time series. -- Node edit duration for time series metadata. -- Node query load time for Chunk metadata list. -- Node edit duration for Chunk metadata. -- Node filtering duration based on Chunk metadata. -- Average time to construct a Chunk Reader. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png) - -### Storage Engine - -Including but not limited to: - -- File count and sizes by type. -- The count and size of TsFiles at various stages. -- Number and duration of various tasks. - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png) - -### System Monitoring - -Including but not limited to: - -- System memory, swap memory, and process memory. -- Disk space, file count, and file sizes. -- JVM GC time percentage, GC occurrences by type, GC volume, and heap memory usage across generations. -- Network transmission rate, packet sending rate - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png) \ No newline at end of file diff --git a/src/UserGuide/V1.2.x/Tools-System/Workbench_timecho.md b/src/UserGuide/V1.2.x/Tools-System/Workbench_timecho.md deleted file mode 100644 index 06cd1f743..000000000 --- a/src/UserGuide/V1.2.x/Tools-System/Workbench_timecho.md +++ /dev/null @@ -1,30 +0,0 @@ -# WorkBench -## Product Introduction -IoTDB Visualization Console is an extension component developed for industrial scenarios based on the IoTDB Enterprise Edition time series database. It integrates real-time data collection, storage, and analysis, aiming to provide users with efficient and reliable real-time data storage and query solutions. 
It features lightweight, high performance, and ease of use, seamlessly integrating with the Hadoop and Spark ecosystems. It is suitable for high-speed writing and complex analytical queries of massive time series data in industrial IoT applications. - -## Instructions for Use -| **Functional Module** | **Functional Description** | -| ---------------------- | ------------------------------------------------------------ | -| Instance Management | Support unified management of connected instances, support creation, editing, and deletion, while visualizing the relationships between multiple instances, helping customers manage multiple database instances more clearly | -| Home | Support viewing the service running status of each node in the database instance (such as activation status, running status, IP information, etc.), support viewing the running monitoring status of clusters, ConfigNodes, and DataNodes, monitor the operational health of the database, and determine if there are any potential operational issues with the instance. | -| Measurement Point List | Support directly viewing the measurement point information in the instance, including database information (such as database name, data retention time, number of devices, etc.), and measurement point information (measurement point name, data type, compression encoding, etc.), while also supporting the creation, export, and deletion of measurement points either individually or in batches. | -| Data Model | Support viewing hierarchical relationships and visually displaying the hierarchical model. | -| Data Query | Support interface-based query interactions for common data query scenarios, and enable batch import and export of queried data. | -| Statistical Query | Support interface-based query interactions for common statistical data scenarios, such as outputting results for maximum, minimum, average, and sum values. 
| -| SQL Operations | Support interactive SQL operations on the database through a graphical user interface, allowing for the execution of single or multiple statements, and displaying and exporting the results. | -| Trend | Support one-click visualization to view the overall trend of data, draw real-time and historical data for selected measurement points, and observe the real-time and historical operational status of the measurement points. | -| Analysis | Support visualizing data through different analysis methods (such as FFT). | -| View | Support viewing information such as view name, view description, result measurement points, and expressions through the interface. Additionally, enable users to quickly create, edit, and delete views through interactive interfaces. | -| Data synchronization | Support the intuitive creation, viewing, and management of data synchronization tasks between databases. Enable direct viewing of task running status, synchronized data, and target addresses. Users can also monitor changes in synchronization status in real time through the interface. | -| Permission management | Support interface-based management and control of database user access and operation permissions. | -| Audit logs | Support detailed logging of user operations on the database, including Data Definition Language (DDL), Data Manipulation Language (DML), and query operations. Assist users in tracking and identifying potential security threats, database errors, and misuse behavior.
| - -Main feature showcase -* Home -![首页.png](/img/%E9%A6%96%E9%A1%B5.png) -* Measurement Point List -![测点列表.png](/img/%E6%B5%8B%E7%82%B9%E5%88%97%E8%A1%A8.png) -* Data Query -![数据查询.png](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2.png) -* Trend -![历史趋势.png](/img/%E5%8E%86%E5%8F%B2%E8%B6%8B%E5%8A%BF.png) \ No newline at end of file diff --git a/src/UserGuide/V1.2.x/User-Manual/Data-Sync_timecho.md b/src/UserGuide/V1.2.x/User-Manual/Data-Sync_timecho.md deleted file mode 100644 index f41f03a92..000000000 --- a/src/UserGuide/V1.2.x/User-Manual/Data-Sync_timecho.md +++ /dev/null @@ -1,521 +0,0 @@ - - -# Data Sync -**The IoTDB data sync transfers data from IoTDB to another data platform, and a data sync task is called a Pipe.** - -**A Pipe consists of three subtasks (plugins):** - -- Extract -- Process -- Connect - -**Pipe allows users to customize the processing logic of these three subtasks, just like handling data using UDF (User-Defined Functions)**. Within a Pipe, the aforementioned subtasks are executed and implemented by three types of plugins. Data flows through these three plugins sequentially: Pipe Extractor is used to extract data, Pipe Processor is used to process data, and Pipe Connector is used to send data to an external system. - -**The model of a Pipe task is as follows:** - -![pipe.png](/img/pipe.png) - -It describes a data sync task, which essentially describes the attributes of the Pipe Extractor, Pipe Processor, and Pipe Connector plugins. Users can declaratively configure the specific attributes of the three subtasks through SQL statements. By combining different attributes, flexible data ETL (Extract, Transform, Load) capabilities can be achieved. - -By utilizing the data sync functionality, a complete data pipeline can be built to fulfill various requirements such as edge-to-cloud sync, remote disaster recovery, and read-write workload distribution across multiple databases. 
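The three-stage model described above can be pictured with a small conceptual sketch. This is not the IoTDB plugin API; all names here (`Point`, `prefix_extractor`, `do_nothing_processor`, `run_pipe`) are hypothetical, and the extractor is assumed, for illustration only, to filter by a simple path prefix. The sketch just shows data flowing through extract, process, and connect in order.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# One data point: (series path, timestamp, value).
# Hypothetical shape chosen for this illustration only.
Point = Tuple[str, int, float]


def prefix_extractor(source: Iterable[Point], pattern: str) -> Iterator[Point]:
    """Extract stage: yield only points whose path starts with `pattern`."""
    for point in source:
        if point[0].startswith(pattern):
            yield point


def do_nothing_processor(points: Iterable[Point]) -> Iterator[Point]:
    """Process stage: pass every point through unchanged."""
    yield from points


def run_pipe(
    points: Iterable[Point],
    process: Callable[[Iterable[Point]], Iterator[Point]],
    connect: Callable[[Point], None],
) -> int:
    """Drive extracted points through process, then connect; return the count delivered."""
    delivered = 0
    for point in process(points):
        connect(point)  # connect stage: hand the point to the external system
        delivered += 1
    return delivered


source = [
    ("root.timecho.d1.s1", 1, 1.0),
    ("root.other.d1.s1", 2, 2.0),
    ("root.timecho.d2.s1", 3, 3.0),
]
received: List[Point] = []  # stands in for the external system
count = run_pipe(
    prefix_extractor(source, "root.timecho"),
    do_nothing_processor,
    received.append,
)
print(count, received)
```

Swapping any one of the three callables changes that stage without touching the others, which is the flexibility that declarative per-stage configuration is meant to give.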
- -## Quick Start - -**🎯 Goal: Achieve full data sync of IoTDB A -> IoTDB B** - -- Start two IoTDB instances: A (DataNode -> 127.0.0.1:6667) and B (DataNode -> 127.0.0.1:6668) -- Create a Pipe from A -> B (execute on A) - - ```sql - create pipe a2b - with connector ( - 'connector'='iotdb-thrift-connector', - 'connector.ip'='127.0.0.1', - 'connector.port'='6668' - ) - ``` -- Start the Pipe from A -> B (execute on A) - - ```sql - start pipe a2b - ``` -- Write data to A - - ```sql - INSERT INTO root.db.d(time, m) values (1, 1) - ``` -- Check on B the data synchronized from A - ```sql - SELECT ** FROM root - ``` - -> ❗️**Note: The current IoTDB -> IoTDB implementation of data sync does not support DDL sync** -> -> That is: ttl, trigger, alias, template, view, create/delete time series, create/delete database, etc. are not supported. -> -> **IoTDB -> IoTDB data sync requires the target IoTDB to either:** -> -> * Enable automatic metadata creation, in which case the encoding and compression of each data type must be configured manually to match the sender; or -> * Disable automatic metadata creation, in which case metadata consistent with the source must be created manually - -## Sync Task Management - -### Create a sync task - -A data sync task can be created using the `CREATE PIPE` statement. A sample SQL statement is shown below: - -```sql -CREATE PIPE -- PipeId is the name that uniquely identifies the sync task -WITH EXTRACTOR ( - -- Default IoTDB Data Extraction Plugin - 'extractor' = 'iotdb-extractor', - -- Path prefix, only data that can match the path prefix will be extracted for subsequent processing and delivery - 'extractor.pattern' = 'root.timecho', - -- Whether to extract historical data - 'extractor.history.enable' = 'true', - -- Describes the time range of the historical data being extracted, indicating the earliest possible time - 'extractor.history.start-time' = '2011.12.03T10:15:30+01:00', - -- Describes the time range of the extracted historical data, indicating the latest time -
'extractor.history.end-time' = '2022.12.03T10:15:30+01:00', - -- Whether to extract realtime data - 'extractor.realtime.enable' = 'true', -) -WITH PROCESSOR ( - -- Default data processing plugin, means no processing - 'processor' = 'do-nothing-processor', -) -WITH CONNECTOR ( - -- IoTDB data sending plugin with target IoTDB - 'connector' = 'iotdb-thrift-connector', - -- Data service for one of the DataNode nodes on the target IoTDB ip - 'connector.ip' = '127.0.0.1', - -- Data service port of one of the DataNode nodes of the target IoTDB - 'connector.port' = '6667', -) -``` - -**To create a sync task it is necessary to configure the PipeId and the parameters of the three plugin sections:** - - -| configuration item | description | Required or not | default implementation | Default implementation description | Whether to allow custom implementations | -| --------- | ------------------------------------------------- | --------------------------- | -------------------- | ------------------------------------------------------ | ------------------------- | -| pipeId | Globally uniquely identifies the name of a sync task | required | - | - | - | -| extractor | pipe Extractor plugin, for extracting synchronized data at the bottom of the database | Optional | iotdb-extractor | Integrate all historical data of the database and subsequent realtime data into the sync task | no | -| processor | Pipe Processor plugin, for processing data | Optional | do-nothing-processor | no processing of incoming data | yes | -| connector | Pipe Connector plugin,for sending data | required | - | - | yes | - -In the example, the iotdb-extractor, do-nothing-processor, and iotdb-thrift-connector plugins are used to build the data sync task. iotdb has other built-in data sync plugins, **see the section "System Pre-built Data Sync Plugin"**. -**An example of a minimalist CREATE PIPE statement is as follows:** - -```sql -CREATE PIPE -- PipeId is a name that uniquely identifies the task. 
-WITH CONNECTOR ( - -- IoTDB data sending plugin with target IoTDB - 'connector' = 'iotdb-thrift-connector', - -- Data service for one of the DataNode nodes on the target IoTDB ip - 'connector.ip' = '127.0.0.1', - -- Data service port of one of the DataNode nodes of the target IoTDB - 'connector.port' = '6667', -) -``` - -The expressed semantics are: synchronise the full amount of historical data and subsequent arrivals of realtime data from this database instance to the IoTDB instance with target 127.0.0.1:6667. - -**Note:** - -- EXTRACTOR and PROCESSOR are optional, if no configuration parameters are filled in, the system will use the corresponding default implementation. -- The CONNECTOR is a mandatory configuration that needs to be declared in the CREATE PIPE statement for configuring purposes. -- The CONNECTOR exhibits self-reusability. For different tasks, if their CONNECTOR possesses identical KV properties (where the value corresponds to every key), **the system will ultimately create only one instance of the CONNECTOR** to achieve resource reuse for connections. - - - For example, there are the following pipe1, pipe2 task declarations: - - ```sql - CREATE PIPE pipe1 - WITH CONNECTOR ( - 'connector' = 'iotdb-thrift-connector', - 'connector.thrift.host' = 'localhost', - 'connector.thrift.port' = '9999', - ) - - CREATE PIPE pipe2 - WITH CONNECTOR ( - 'connector' = 'iotdb-thrift-connector', - 'connector.thrift.port' = '9999', - 'connector.thrift.host' = 'localhost', - ) - ``` - - - Since they have identical CONNECTOR declarations (**even if the order of some properties is different**), the framework will automatically reuse the CONNECTOR declared by them. Hence, the CONNECTOR instances for pipe1 and pipe2 will be the same. 
- - - When extractor is the default iotdb-extractor, and extractor.forwarding-pipe-requests is the default value true, please do not build an application scenario that involve data cycle sync (as it can result in an infinite loop): - - - IoTDB A -> IoTDB B -> IoTDB A - - IoTDB A -> IoTDB A - -### START TASK - -After the successful execution of the CREATE PIPE statement, task-related instances will be created. However, the overall task's running status will be set to STOPPED, meaning the task will not immediately process data. - -You can use the START PIPE statement to begin processing data for a task: - -```sql -START PIPE -``` - -### STOP TASK - -the STOP PIPE statement can be used to halt the data processing: - -```sql -STOP PIPE -``` - -### DELETE TASK - -If a task is in the RUNNING state, you can use the DROP PIPE statement to stop the data processing and delete the entire task: - -```sql -DROP PIPE -``` - -Before deleting a task, there is no need to execute the STOP operation. - -### SHOW TASK - -You can use the SHOW PIPES statement to view all tasks: - -```sql -SHOW PIPES -``` - -The query results are as follows: - -```sql -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -| ID| CreationTime | State|PipeExtractor|PipeProcessor|PipeConnector|ExceptionMessage| -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -|iotdb-kafka|2022-03-30T20:58:30.689|RUNNING| ...| ...| ...| None| -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -|iotdb-iotdb|2022-03-31T12:55:28.129|STOPPED| ...| ...| ...| TException: ...| -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -``` - -You can use \ to specify the status of a particular synchronization task: - -```sql -SHOW PIPE -``` - -Additionally, the WHERE clause can be used to determine if the Pipe Connector used 
by a specific \ is being reused. - -```sql -SHOW PIPES -WHERE CONNECTOR USED BY -``` - -### Task Running Status Migration - -The task running status can transition through several states during the lifecycle of a data synchronization pipe: - -- **STOPPED:** The pipe is in a stopped state. It can have the following possibilities: - - After the successful creation of a pipe, its initial state is set to stopped - - The user manually pauses a pipe that is in normal running state, transitioning its status from RUNNING to STOPPED - - If a pipe encounters an unrecoverable error during execution, its status automatically changes from RUNNING to STOPPED. -- **RUNNING:** The pipe is actively processing data -- **DROPPED:** The pipe is permanently deleted - -The following diagram illustrates the different states and their transitions: - -![state migration diagram](/img/%E7%8A%B6%E6%80%81%E8%BF%81%E7%A7%BB%E5%9B%BE.png) -## System Pre-built Data Sync Plugin - -### View pre-built plugin - -User can view the plugins in the system on demand. The statement for viewing plugins is shown below. -```sql -SHOW PIPEPLUGINS -``` - -### Pre-built Extractor Plugin - -#### iotdb-extractor - -Function: Extract historical or realtime data inside IoTDB into pipe. 
- -| key | value | value range | required or optional with default | -| ---------------------------------- | ------------------------------------------------ | -------------------------------------- | --------------------------------- | -| extractor | iotdb-extractor | String: iotdb-extractor | required | -| extractor.pattern | path prefix for filtering time series | String: any time series prefix | optional: root | -| extractor.history.enable | whether to synchronize historical data | Boolean: true, false | optional: true | -| extractor.history.start-time | start of synchronizing historical data event time,Include start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| extractor.history.end-time | end of synchronizing historical data event time,Include end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| extractor.realtime.enable | Whether to sync realtime data | Boolean: true, false | optional: true | -| extractor.realtime.mode | Extraction pattern for realtime data | String: hybrid, log, file | optional: hybrid | -| extractor.forwarding-pipe-requests | Whether or not to forward data written by another Pipe (usually Data Sync) | Boolean: true, false | optional: true | - -> 🚫 **extractor.pattern Parameter Description** -> -> * Pattern should use backquotes to modify illegal characters or illegal path nodes, for example, if you want to filter root.\`a@b\` or root.\`123\`, you should set the pattern to root.\`a@b\` or root.\`123\`(Refer specifically to [Timing of single and double quotes and backquotes](https://iotdb.apache.org/zh/Download/#_1-0-版本不兼容的语法详细说明)) -> * In the underlying implementation, when pattern is detected as root (default value), synchronization efficiency is higher, and any other format will reduce performance. -> * The path prefix does not need to form a complete path. 
For example, when creating a pipe with the parameter 'extractor.pattern'='root.aligned.1': -> -> * root.aligned.1TS -> * root.aligned.1TS.\`1\` -> * root.aligned.100TS -> -> the data will be synchronized; -> -> * root.aligned.\`1\` -> * root.aligned.\`123\` -> -> the data will not be synchronized. - -> ❗️**start-time, end-time parameter description of extractor.history** -> -> * start-time, end-time should be in ISO format, such as 2011-12-03T10:15:30 or 2011-12-03T10:15:30+01:00 - -> ✅ **a piece of data from production to IoTDB contains two key concepts of time** -> -> * **event time:** the time when the data is actually produced (or the generation time assigned to the data by the data production system, which is a time item in the data point), also called the event time. -> * **arrival time:** the time the data arrived in the IoTDB system. -> -> The out-of-order data we often refer to refers to data whose **event time** is far behind the current system time (or the maximum **event time** that has been dropped) when the data arrives. On the other hand, whether it is out-of-order data or sequential data, as long as they arrive newly in the system, their **arrival time** will increase with the order in which the data arrives at IoTDB. - -> 💎 **the work of iotdb-extractor can be split into two stages** -> -> 1. Historical data extraction: All data with **arrival time** < **current system time** when creating the pipe is called historical data -> 2. Realtime data extraction: All data with **arrival time** >= **current system time** when the pipe is created is called realtime data -> -> The historical data transmission phase and the realtime data transmission phase are executed serially. 
The realtime data transmission phase starts only after the historical data transmission phase has completed. -> -> Users can configure iotdb-extractor for: -> -> * Historical data extraction (`'extractor.history.enable' = 'true'`, `'extractor.realtime.enable' = 'false'`) -> * Realtime data extraction (`'extractor.history.enable' = 'false'`, `'extractor.realtime.enable' = 'true'`) -> * Full data extraction (`'extractor.history.enable' = 'true'`, `'extractor.realtime.enable' = 'true'`) -> * Setting both `extractor.history.enable` and `extractor.realtime.enable` to `false` at the same time is not allowed -> -> 📌 **extractor.realtime.mode: mode in which data is extracted** -> -> * log: in this mode, the task uses only operation logs for data processing and sending. -> * file: in this mode, the task uses only data files for data processing and sending. -> * hybrid: sending data item by item from the operation log gives low latency but low throughput, while sending data in batches from data files gives high throughput but high latency. This mode switches automatically between the two according to the write load. When a data backlog builds up, it switches to data file-based extraction to keep sending throughput high; when the backlog is cleared, it switches back to operation log-based extraction. This avoids the difficulty of balancing sending latency and throughput with a single extraction method. -> -> 🍕 **extractor.forwarding-pipe-requests: whether to allow forwarding of data transferred from another pipe** -> -> * To build an A -> B -> C data sync chain, the B -> C pipe needs this parameter set to true so that data written to B by the A -> B pipe is forwarded to C correctly.
-> * To build a bi-directional data sync A \<-> B, both the A -> B pipe and the B -> A pipe need this parameter set to false; otherwise data is forwarded between the clusters in an endless loop. - -### Pre-built Processor Plugin - -#### do-nothing-processor - -Function: Does not do anything with the events passed in by the extractor. - - -| key | value | value range | required or optional with default | -| --------- | -------------------- | ---------------------------- | --------------------------------- | -| processor | do-nothing-processor | String: do-nothing-processor | required | - -### Pre-built Connector Plugin - -#### iotdb-thrift-sync-connector (alias: iotdb-thrift-connector) - -Function: Primarily used for data transfer between IoTDB instances (v1.2.0+). Data is transmitted using the Thrift RPC framework and a single-threaded blocking IO model. It guarantees that the receiving end applies the data in the same order as the sending end receives the write requests. - -Limitation: Both the source and target IoTDB versions need to be v1.2.0+.
- -| key | value | value range | required or optional with default | -| --------------------------------- | --------------------------------------------------------------------------- | ---------------------------------------------------------------------------- | ----------------------------------------------------- | -| connector | iotdb-thrift-connector or iotdb-thrift-sync-connector | String: iotdb-thrift-connector or iotdb-thrift-sync-connector | required | -| connector.ip | the data service IP of one of the DataNode nodes in the target IoTDB | String | optional: fill in either this or connector.node-urls | -| connector.port | the data service port of one of the DataNode nodes in the target IoTDB | Integer | optional: fill in either this or connector.node-urls | -| connector.node-urls | the data service URLs of one or more DataNode nodes in the target IoTDB | String, e.g. '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | optional: fill in either this or connector.ip:connector.port | -| connector.batch.enable | Whether to enable batch mode, which accumulates logs and sends them in batches to improve transmission throughput and reduce IOPS | Boolean: true, false | optional: true | -| connector.batch.max-delay-seconds | Effective when batch mode is enabled; the longest time a batch of data waits before being sent (unit: s) | Integer | optional: 1 | -| connector.batch.size-bytes | Effective when batch mode is enabled; the maximum size of a batch of data (unit: byte) | Long | optional: 16 * 1024 * 1024 (16MiB) | - -> 📌 Make sure that the receiver has created all the time series on the sender side, or that automatic metadata creation is turned on, otherwise the pipe run will fail. - -#### iotdb-thrift-async-connector - -Function: Primarily used for data transfer between IoTDB instances (v1.2.0+).
-Data is transmitted using the Thrift RPC framework, employing a multi-threaded async non-blocking IO model, resulting in high transfer performance. It is particularly suitable for distributed scenarios on the target end. -It does not guarantee that the receiving end applies the data in the same order as the sending end receives the write requests, but it guarantees data integrity (at-least-once). - -Limitation: Both the source and target IoTDB versions need to be v1.2.0+. - - -| key | value | value range | required or optional with default | -| --------------------------------- | --------------------------------------------------------------------------- | ---------------------------------------------------------------------------- | ----------------------------------------------------- | -| connector | iotdb-thrift-async-connector | String: iotdb-thrift-async-connector | required | -| connector.ip | the data service IP of one of the DataNode nodes in the target IoTDB | String | optional: fill in either this or connector.node-urls | -| connector.port | the data service port of one of the DataNode nodes in the target IoTDB | Integer | optional: fill in either this or connector.node-urls | -| connector.node-urls | the data service URLs of one or more DataNode nodes in the target IoTDB | String, e.g. '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | optional: fill in either this or connector.ip:connector.port | -| connector.batch.enable | Whether to enable batch mode, which accumulates logs and sends them in batches to improve transmission throughput and reduce IOPS | Boolean: true, false | optional: true | -| connector.batch.max-delay-seconds | Effective when batch mode is enabled; the longest time a batch of data waits before being sent (unit: s) | Integer | optional: 1 | -| connector.batch.size-bytes | Effective when batch mode is enabled; the maximum size of a batch of data (unit:
byte) | Long | optional: 16 * 1024 * 1024 (16MiB) | - -> 📌 Please ensure that the receiving end has already created all the time series present in the sending end or has enabled automatic metadata creation. Otherwise, it may result in the failure of the pipe operation. - -#### iotdb-legacy-pipe-connector - -Function: Mainly used to transfer data from IoTDB (v1.2.0+) to lower versions of IoTDB, using the data synchronization (Sync) protocol before version v1.2.0. -Data is transmitted using the Thrift RPC framework. It employs a single-threaded sync blocking IO model, resulting in weak transfer performance. - -Limitation: The source IoTDB version needs to be v1.2.0+. The target IoTDB version can be either v1.2.0+, v1.1.x (lower versions of IoTDB are theoretically supported but untested). - -Note: In theory, any version prior to v1.2.0 of IoTDB can serve as the data synchronization (Sync) receiver for v1.2.0+. - -| key | value | value range | required or optional with default | -| ------------------ | --------------------------------------------------------------------- | ----------------------------------- | --------------------------------- | -| connector | iotdb-legacy-pipe-connector | string: iotdb-legacy-pipe-connector | required | -| connector.ip | data service of one DataNode node of the target IoTDB ip | string | required | -| connector.port | the data service port of one of the DataNode nodes in the target IoTDB | integer | required | -| connector.user | the user name of the target IoTDB. Note that the user needs to support data writing and TsFile Load permissions. | string | optional: root | -| connector.password | the password of the target IoTDB. Note that the user needs to support data writing and TsFile Load permissions. | string | optional: root | -| connector.version | the version of the target IoTDB, used to disguise its actual version and bypass the version consistency check of the target. 
| string | optional: 1.1 | - -> 📌 Make sure that the receiver has created all the time series on the sender side, or that automatic metadata creation is turned on, otherwise the pipe run will fail. - -#### iotdb-air-gap-connector - -Function: Used for data sync from IoTDB (v1.2.2+) to IoTDB (v1.2.2+) across one-way data gatekeepers. Supported gatekeeper models include NARI Syskeeper 2000, etc. -This connector uses Java's built-in Socket to implement data transmission with a single-threaded blocking IO model, and its performance is comparable to iotdb-thrift-sync-connector. -It guarantees that the receiving end applies the data in the same order as the sending end accepts the write requests. - -Scenario: For example, in the specification of power systems - -> 1. Applications between Zone I/II and Zone III are prohibited from using SQL commands to access the database and bidirectional data transmission based on B/S mode. -> -> 2. For data communication between Zone I/II and Zone III, the transmission end is initiated by the intranet. The reverse response message is not allowed to carry data. The response message of the application layer is at most 1 byte and 1 word. The section has two states: all 0s or all 1s. - -Limitation: - -1. Both the source IoTDB and target IoTDB versions need to be v1.2.2+. -2. The one-way data gatekeeper needs to allow TCP requests to cross, and each request can return a byte of all 1s or all 0s. -3. The target IoTDB needs to be configured in iotdb-common.properties - a. pipe_air_gap_receiver_enabled=true - b.
pipe_air_gap_receiver_port configures the receiving port of the receiver - - -| key | value | value range | required or optional with default | -| -------------------------------------- | ---------------------------------------------------------------- | ---------------------------------------------------------------------------- | ----------------------------------------------------- | -| connector | iotdb-air-gap-connector | String: iotdb-air-gap-connector | required | -| connector.ip | the data service IP of one of the DataNode nodes in the target IoTDB | String | optional: fill in either this or connector.node-urls | -| connector.port | the data service port of one of the DataNode nodes in the target IoTDB | Integer | optional: fill in either this or connector.node-urls | -| connector.node-urls | the data service URLs of one or more DataNode nodes in the target IoTDB | String, e.g. '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | optional: fill in either this or connector.ip:connector.port | -| connector.air-gap.handshake-timeout-ms | The timeout period for the handshake request when the source and target try to establish a connection for the first time, unit: milliseconds | Integer | optional: 5000 | - -> 📌 Make sure that the receiver has created all the time series on the sender side, or that automatic metadata creation is turned on, otherwise the pipe run will fail. - -#### do-nothing-connector - -Function: Does not do anything with the events passed in by the processor.
- - -| key | value | value range | required or optional with default | -| --------- | -------------------- | ---------------------------- | --------------------------------- | -| connector | do-nothing-connector | String: do-nothing-connector | required | - -## Authority Management - -| Authority Name | Description | -| ----------- | -------------------- | -| CREATE_PIPE | Register task, path-independent | -| START_PIPE | Start task, path-independent | -| STOP_PIPE | Stop task, path-independent | -| DROP_PIPE | Uninstall task, path-independent | -| SHOW_PIPES | Query task, path-independent | - -## Configure Parameters - -In iotdb-common.properties: - -```Properties -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_connector_timeout_ms=900000 - -# The maximum number of selectors that can be used in the async connector. -# pipe_async_connector_selector_number=1 - -# The core number of clients that can be used in the async connector. -# pipe_async_connector_core_client_number=8 - -# The maximum number of clients that can be used in the async connector. -# pipe_async_connector_max_client_number=16 - -# Whether to enable receiving pipe data through air gap.
-# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` - -## Functionality Features - -### At-least-once semantic guarantee - -The data synchronization feature provides an at-least-once delivery semantic when transferring data to external systems. In most scenarios, the synchronization feature guarantees exactly-once delivery, ensuring that all data is synchronized exactly once. - -However, in the following scenarios, it is possible for some data to be synchronized multiple times **(due to resumable transmission)**: - -- Temporary network failures: If a data transmission request fails, the system will retry sending it until reaching the maximum retry attempts. -- Abnormal implementation of the Pipe plugin logic: If an error is thrown during the plugin's execution, the system will retry sending the data until reaching the maximum retry attempts. -- Data partition switching due to node failures or restarts: After the partition change is completed, the affected data will be retransmitted. -- Cluster unavailability: Once the cluster becomes available again, the affected data will be retransmitted. - -### Source: Data Writing with Pipe Processing and Asynchronous Decoupling of Data Transmission - -In the data sync feature, data transfer adopts an asynchronous replication mode. - -Data sync is completely decoupled from the writing operation, eliminating any impact on the critical path of writing. This mechanism allows the framework to maintain the writing speed of a time-series database while ensuring continuous data sync. - -### Source: Adaptive data transfer policy for data write load - -The data transfer mode is adjusted dynamically according to the write load.
By default, sync uses a dynamic hybrid of TsFile transfer and operation stream transfer (`'extractor.realtime.mode'='hybrid'`). - -When the data writing load is high, TsFile transfer is preferred, which has a high compression ratio and saves network bandwidth. - -When the data writing load is low, operation stream transfer is preferred, which has high real-time performance. - -### Source: High Availability of Pipe Service in a Highly Available Cluster Deployment - -When the sender end IoTDB is deployed in a high availability cluster mode, the data sync service will also be highly available. The data sync framework monitors the data sync progress of each data node and periodically takes lightweight distributed consistent snapshots to preserve the sync state. - -- In the event of a failure of a data node in the sender cluster, the data sync framework can leverage the consistent snapshot and the data stored in replicas to quickly recover and resume sync, thus achieving high availability of the data sync service. -- In the event of a complete failure and restart of the sender cluster, the data sync framework can also use snapshots to recover the sync service. diff --git a/src/UserGuide/V1.2.x/User-Manual/IoTDB-View_timecho.md b/src/UserGuide/V1.2.x/User-Manual/IoTDB-View_timecho.md deleted file mode 100644 index 4a1485903..000000000 --- a/src/UserGuide/V1.2.x/User-Manual/IoTDB-View_timecho.md +++ /dev/null @@ -1,550 +0,0 @@ - - -# View - -## 1. Sequence View Application Background - -### 1.1 Application Scenario 1 Time Series Renaming (PI Asset Management) - -In practice, the equipment collecting data may be named with identification numbers that are difficult for humans to understand, which makes querying difficult for the business layer.
- -The Sequence View, on the other hand, is able to re-organise the management of these sequences and access them using a new model structure without changing the original sequence content and without the need to create new or copy sequences. - -**For example**: a cloud device uses its own NIC MAC address to form entity numbers and stores data by writing the following time sequence:`root.db.0800200A8C6D.xvjeifg`. - -It is difficult for the user to understand. However, at this point, the user is able to rename it using the sequence view feature, map it to a sequence view, and use `root.view.device001.temperature` to access the captured data. - -### 1.2 Application Scenario 2 Simplifying business layer query logic - -Sometimes users have a large number of devices that manage a large number of time series. When conducting a certain business, the user wants to deal with only some of these sequences. At this time, the focus of attention can be picked out by the sequence view function, which is convenient for repeated querying and writing. - -**For example**: Users manage a product assembly line with a large number of time series for each segment of the equipment. The temperature inspector only needs to focus on the temperature of the equipment, so he can extract the temperature-related sequences and compose the sequence view. - -### 1.3 Application Scenario 3 Auxiliary Rights Management - -In the production process, different operations are generally responsible for different scopes. For security reasons, it is often necessary to restrict the access scope of the operations staff through permission management. - -**For example**: The safety management department now only needs to monitor the temperature of each device in a production line, but these data are stored in the same database with other confidential data. 
At this point, it is possible to create a number of new views that contain only temperature-related time series on the production line, and then to give the security officer access to only these sequence views, thus achieving the purpose of permission restriction. - -### 1.4 Motivation for designing sequence view functionality - -Combining the above three usage scenarios, the motivations for designing the sequence view functionality are: - -1. Time series renaming. -2. Simplifying the query logic at the business level. -3. Auxiliary rights management: opening data to specific users through views. - -## 2. Sequence View Concepts - -### 2.1 Terminology Concepts - -Concept: Unless otherwise specified, the views referred to in this document are **Sequence Views**; new features such as device views may be introduced in the future. - -### 2.2 Sequence view - -A sequence view is a way of organising the management of time series. - -In traditional relational databases, data must all be stored in a table, whereas in time series databases such as IoTDB, it is the sequence that is the storage unit. Therefore, the concept of sequence views in IoTDB is also built on sequences. - -A sequence view is a virtual time series, and each virtual time series is like a soft link or shortcut that maps to a sequence or some kind of computational logic external to a certain view. In other words, a virtual sequence either maps to some defined external sequence or is computed from multiple external sequences. - -Users can create views using complex SQL queries, where the sequence view acts as a stored query statement, and when data is read from the view, the stored query statement is used as the source of the data in the FROM clause. - -### 2.3 Alias Sequences - -There is a special class of sequence views that satisfy all of the following conditions: - -1. the data source is a single time series -2. there is no computational logic -3.
no filtering conditions (e.g., no WHERE clause restrictions). - -Such a sequence view is called an **alias sequence**, or alias sequence view. A sequence view that does not satisfy all of the above conditions is called a non-alias sequence view. The difference between them is that only alias sequences support write functionality. - -**All sequence views, including alias sequences, do not currently support Trigger functionality.** - -### 2.4 Nested Views - -A user may want to select a number of sequences from an existing sequence view to form a new sequence view, called a nested view. - -**The current version does not support the nested view feature**. - -### 2.5 Some constraints on sequence views in IoTDB - -#### Constraint 1 A sequence view must depend on one or several time series - -A sequence view has two possible forms of existence: - -1. it maps to a time series -2. it is computed from one or more time series. - -The former has been exemplified in the previous section and is easy to understand; the latter exists because the sequence view allows for computational logic. - -For example, the user has installed two thermometers in the same boiler and now needs to calculate the average of the two temperature values as a measurement. The user has captured the following two sequences: `root.db.d01.temperature01`, `root.db.d01.temperature02`. - -At this point, the user can use the average of the two sequences as one sequence in the view: `root.db.d01.avg_temperature`. - -This example is expanded in detail in 3.1.2. - -#### Constraint 2 Non-alias sequence views are read-only - -Writing to non-alias sequence views is not allowed. - -Only alias sequence views support writing. - -#### Constraint 3 Nested views are not allowed - -It is not possible to select certain columns in an existing sequence view to create a sequence view, either directly or indirectly.
- -An example of this restriction will be given in 3.1.3. - -#### Constraint 4 Sequence views and time series cannot share a name - -Both sequence views and time series are located under the same tree, so a sequence view cannot have the same name as a time series. - -The name (path) of any sequence should be uniquely determined. - -#### Constraint 5 Sequence views share timing data with time series, metadata such as labels are not shared - -Sequence views are mappings pointing to time series, so they fully share timing data, with the time series being responsible for persistent storage. - -However, their metadata such as tags and attributes are not shared. - -This is because users who query through views care about the structure of the current view. If they query with group by tag or similar, they expect the grouping to follow the tags of the view, not the tags of the underlying time series (the user may not even be aware of those time series). - -## 3. Sequence view functionality - -### 3.1 Creating a view - -Creating a sequence view is similar to creating a time series; the difference is that you need to specify the data source, i.e., the original sequence, through the AS keyword. - -#### 3.1.1. SQL for creating a view - -Users can select some sequences to create a view: - -```SQL -CREATE VIEW root.view.device.status -AS - SELECT s01 - FROM root.db.device -``` - -It indicates that the user has selected the sequence `s01` from the existing device `root.db.device`, creating the sequence view `root.view.device.status`. - -The sequence view can exist under the same entity as the time series, for example: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device -``` - -Thus, there is a virtual copy of `s01` under `root.db.device`, but with a different name `status`.
- -Note that the sequence views in both of the above examples are alias sequences, for which there is a more convenient way of creating the view: - -```SQL -CREATE VIEW root.view.device.status -AS - root.db.device.s01 -``` - -#### 3.1.2 Creating views with computational logic - -Following the example in section 2.5, Constraint 1: - -> A user has installed two thermometers in the same boiler and now needs to calculate the average of the two temperature values as a measurement. The user has captured the following two sequences: `root.db.d01.temperature01`, `root.db.d01.temperature02`. -> -> At this point, the user can use the average of the two sequences as one sequence in the view: `root.db.d01.avg_temperature`. - -If the view is not used, the user can query the average of the two temperatures like this: - -```SQL -SELECT (temperature01 + temperature02) / 2 -FROM root.db.d01 -``` - -And if using a sequence view, the user can create a view this way to simplify future queries: - -```SQL -CREATE VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02) / 2 - FROM root.db.d01 -``` - -The user can then query it like this: - -```SQL -SELECT avg_temperature FROM root.db.d01 -``` - -#### 3.1.3 Nested sequence views not supported - -Continuing with the example from 3.1.2, the user now wants to create a new view using the sequence view `root.db.d01.avg_temperature`, which is not allowed. Nested views are currently not supported, whether or not the view is an alias sequence. - -For example, the following SQL statement will report an error: - -```SQL -CREATE VIEW root.view.device.avg_temp_copy -AS - root.db.d01.avg_temperature -- Not supported.
Nested views are not allowed -``` - -#### 3.1.4 Creating multiple sequence views at once - -Creating only one sequence view per statement would be inconvenient, so multiple sequence views can be specified at once, for example: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - SELECT s01, s02 - FROM root.db.device -``` - -In addition, the above statement can be simplified: - -```SQL -CREATE VIEW root.db.device(status, sub.hardware) -AS - SELECT s01, s02 - FROM root.db.device -``` - -Both statements above are equivalent to the following: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device; - -CREATE VIEW root.db.device.sub.hardware -AS - SELECT s02 - FROM root.db.device -``` - -They are also equivalent to the following: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - root.db.device.s01, root.db.device.s02 - --- or - -CREATE VIEW root.db.device(status, sub.hardware) -AS - root.db.device(s01, s02) -``` - -##### The mapping relationships between all sequences are statically stored - -Sometimes, the SELECT clause may match a number of sequences that can only be determined at runtime, such as below: - -```SQL -SELECT s01, s02 -FROM root.db.d01, root.db.d02 -``` - -The number of sequences that can be matched by the above statement is uncertain and depends on the state of the system. Even so, the user can use it to create views. - -However, it is important to note that the mapping relationship between all sequences is stored statically (fixed at creation)!
Consider the following example: - -The current database contains only three sequences `root.db.d01.s01`, `root.db.d02.s01`, `root.db.d02.s02`, and then the view is created: - -```SQL -CREATE VIEW root.view.d(alpha, beta, gamma) -AS - SELECT s01, s02 - FROM root.db.d01, root.db.d02 -``` - -The mapping relationship between time series is as follows: - -| sequence number | time series | sequence view | -| ---- | ----------------- | ----------------- | -| 1 | `root.db.d01.s01` | root.view.d.alpha | -| 2 | `root.db.d02.s01` | root.view.d.beta | -| 3 | `root.db.d02.s02` | root.view.d.gamma | - -After that, if the user adds the sequence `root.db.d01.s02`, it does not correspond to any view; then, if the user deletes `root.db.d01.s01`, the query for `root.view.d.alpha` will report an error directly, and it will not correspond to `root.db.d01.s02` either. - -Always keep in mind that the mapping relationships between sequences are stored statically and do not change after creation. - -#### 3.1.5 Batch Creation of Sequence Views - -There are several existing devices, each with a temperature value, for example: - -1. root.db.d1.temperature -2. root.db.d2.temperature -3. ... - -There may be many other sequences stored under these devices (e.g. `root.db.d1.speed`), but it is possible to create a view that contains only the temperature values of these devices, independent of the other sequences: - -```SQL -CREATE VIEW root.db.view(${2}_temperature) -AS - SELECT temperature FROM root.db.* -``` - -This is modelled on the query writeback (`SELECT INTO`) convention for naming rules, which uses variable placeholders to specify naming rules. See also: [QUERY WRITEBACK (SELECT INTO)](../User-Manual/Query-Data.md#into-clause-query-write-back) - -Here `root.db.*.temperature` specifies what time series will be included in the view; and `${2}` specifies from which node in the time series the name is extracted to name the sequence view.
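
The placeholder substitution described above can be sketched in a few lines of Python. This is only an illustration of the naming rule, not IoTDB's implementation: the helper `resolve_view_name` is hypothetical, and splitting paths on `.` is a simplifying assumption that ignores backquoted path nodes.

```python
import re


def resolve_view_name(template, source_path):
    """Replace each ${i} in template with the i-th node (0-based) of source_path.

    Hypothetical helper for illustration; paths are split naively on '.'.
    """
    nodes = source_path.split(".")

    def substitute(match):
        # match.group(1) is the node index written inside ${...}
        return nodes[int(match.group(1))]

    return re.sub(r"\$\{(\d+)\}", substitute, template)


# Time series matched by `SELECT temperature FROM root.db.*`
sources = ["root.db.d1.temperature", "root.db.d2.temperature"]

# Node names produced under root.db.view by the template ${2}_temperature
views = ["root.db.view." + resolve_view_name("${2}_temperature", s) for s in sources]
print(views)
```

In this sketch, node 0 is `root`, node 1 is `db`, node 2 is the device matched by `*`, and node 3 is `temperature`, so the template `${2}_${3}` yields the same names as `${2}_temperature`.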
 - -Here, `${2}` refers to level 2 (starting at 0) of `root.db.*.temperature`, which is the result of the `*` match; and `${2}_temperature` is the result of the match and `temperature` joined with an underscore to make up the node names of the sequences under the view. - -The above statement for creating a view is equivalent to the following statement: - -```SQL -CREATE VIEW root.db.view(${2}_${3}) -AS - SELECT temperature from root.db.* -``` - -The final view contains these sequences: - -1. root.db.view.d1_temperature -2. root.db.view.d2_temperature -3. ... - -When a view is created using wildcards, only the static mapping relationships at the moment of creation are stored. - -#### 3.1.6 SELECT clauses are somewhat limited when creating views - -The SELECT clause used when creating a sequence view is subject to certain restrictions. The main restrictions are as follows: - -1. The `WHERE` clause cannot be used. -2. The `GROUP BY` clause cannot be used. -3. `MAX_VALUE` and other aggregation functions cannot be used. - -Simply put, after `AS` you can only use `SELECT ... FROM ... ` and the results of this query must form a time series. - -### 3.2 View Data Queries - -For the supported data query functions, sequence views and time series can be used interchangeably, with identical behaviour, when performing time series data queries. - -**The types of queries that are not currently supported by the sequence view are as follows:** - -1. align by device queries -2. last queries for non-alias sequence views -3. group by tags queries - -Users can also mix time series and sequence view queries in the same SELECT statement, for example: - -```SQL -SELECT temperature01, temperature02, avg_temperature -FROM root.db.d01 -WHERE temperature01 < temperature02 -``` - -However, if the user queries the metadata of the sequence, such as tags and attributes, the result is the metadata of the sequence view, not that of the time series referenced by the sequence view.
 - -In addition, for aliased sequences, if the user wants to get information about the time series such as tags, attributes, etc., the user needs to query the mapping of the view columns to find the corresponding time series, and then query the time series for the tags, attributes, etc. The method of querying the mapping of the view columns will be explained in section 3.5. - -### 3.3 Modify Views - -The modification operations supported by the view include: modifying its calculation logic, modifying tags/attributes/aliases, and deleting. - -#### 3.3.1 Modify view data source - -```SQL -ALTER VIEW root.view.device.status -AS - SELECT s01 - FROM root.ln.wf.d01 -``` - -#### 3.3.2 Modify the view's calculation logic - -```SQL -ALTER VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02 + temperature03) / 3 - FROM root.db.d01 -``` - -#### 3.3.3 Tag point management - -- Add a new tag - -```SQL -ALTER view root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4 -``` - -- Add a new attribute - -```SQL -ALTER view root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4 -``` - -- Rename a tag or attribute - -```SQL -ALTER view root.turbine.d1.s1 RENAME tag1 TO newTag1 -``` - -- Reset the value of a tag or attribute - -```SQL -ALTER view root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1 -``` - -- Delete an existing tag or attribute - -```SQL -ALTER view root.turbine.d1.s1 DROP tag1, tag2 -``` - -- Upsert aliases, tags and attributes - -> If the alias, tag or attribute did not exist before, insert it; otherwise, update the old value with the new one. - -```SQL -ALTER view root.turbine.d1.s1 UPSERT TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4) -``` - -#### 3.3.4 Deleting Views - -Since a view is a sequence, a view can be deleted as if it were a time series.
 - - -```SQL -DELETE VIEW root.view.device.avg_temperature -``` - -### 3.4 View Synchronisation - - - -#### If the dependent original sequence is deleted - -When the sequence view is queried (when the sequence is parsed), **the empty result set** is returned if the dependent time series does not exist. - -This is similar to the feedback for querying a non-existent sequence, but with a difference: if the dependent time series cannot be parsed, the empty result set still contains the table header, as a reminder to the user that the view is problematic. - -Additionally, when the dependent time series is deleted, no attempt is made to find out whether any view depends on it, and the user receives no warning. - -#### Data Writes to Non-Aliased Sequences Not Supported - -Writes to non-alias sequences are not supported. - -Please refer to the previous section 2.1.6 Restriction 2 for more details. - -#### Metadata for sequences is not shared - -Please refer to the previous section 2.1.6 Restriction 5 for details. - -### 3.5 View Metadata Queries - -View metadata query specifically refers to querying the metadata of the view itself (e.g., how many columns the view has), as well as information about the views in the database (e.g., what views are available). - -#### 3.5.1 Viewing Current View Columns - -The user has two ways of querying: - -1. A query using `SHOW TIMESERIES`, whose result contains both time series and sequence views. However, only some of the attributes of the view can be displayed. -2. A query using `SHOW VIEW`, whose result contains only sequence views. It displays the complete properties of the sequence view.
- -Example: - -```Shell -IoTDB> show timeseries; -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.device.s01 | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.view.status | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp01 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp02 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.avg_temp| null| root.db| FLOAT| null| null|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -Total line number = 5 -It costs 0.789s -IoTDB> -``` - -The last column `ViewType` shows the type of the sequence, the time series is BASE and the sequence view is VIEW. - -In addition, some of the sequence view properties will be missing, for example `root.db.d01.avg_temp` is calculated from temperature averages, so the `Encoding` and `Compression` properties are null values. - -In addition, the query results of the `SHOW TIMESERIES` statement are divided into two main parts. - -1. 
information about the time series data, such as data type, compression, encoding, etc. -2. other metadata information, such as tag, attribute, database, etc. - -For the sequence view, the time series data information presented is the same as the original sequence or null (e.g., the calculated average temperature has a data type but no compression method); the metadata information presented is the content of the view. - -To learn more about the view, use `SHOW VIEW`. `SHOW VIEW` shows the source of the view's data, etc. - -```Shell -IoTDB> show VIEW root.**; -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -| Timeseries|Database|DataType|Tags|Attributes|ViewType| SOURCE| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.view.status | root.db| INT32|null| null| VIEW| root.db.device.s01| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.d01.avg_temp| root.db| FLOAT|null| null| VIEW|(root.db.d01.temp01+root.db.d01.temp02)/2| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -Total line number = 2 -It costs 0.789s -IoTDB> -``` - -The last column, `SOURCE`, shows the data source for the sequence view, listing the SQL statement that created the sequence. - -##### About Data Types - -Both of the above queries involve the data type of the view. The data type of a view is inferred from the original time series type of the query statement or alias sequence that defines the view. This data type is computed in real time based on the current state of the system, so the data type returned at different moments may change. - -## IV. FAQ - -#### Q1: I want the view to implement the function of type conversion. For example, a time series of type int32 was originally placed in the same view as other series of type int64.
I now want all the data queried through the view to be automatically converted to int64 type. - -> Ans: This is not the function of the sequence view. But the conversion can be done using `CAST`, for example: - -```SQL -CREATE VIEW root.db.device.int64_status -AS - SELECT CAST(s1, 'type'='INT64') from root.db.device -``` - -> This way, a query for `root.db.device.int64_status` will yield a result of type int64. -> -> Please note in particular that in the above example, the data for the sequence view is obtained by `CAST` conversion, so `root.db.device.int64_status` is not an aliased sequence, and thus **not supported for writing**. - -#### Q2: Is default naming supported? If I select a number of time series and create a view without specifying the name of each series, will the database name them automatically? - -> Ans: Not supported. Users must specify the naming explicitly. - -#### Q3: In the original system, creating the time series `root.db.device.s01` automatically creates the database `root.db` and the device `root.db.device`. Deleting the time series `root.db.device.s01` afterwards automatically deletes `root.db.device`, while `root.db` remains. Will this mechanism be followed for creating views? What are the considerations? - -> Ans: The original behaviour is kept unchanged; the introduction of the view functionality does not change this logic. - -#### Q4: Does it support sequence view renaming? - -> Ans: Renaming is not supported in the current version. You can create a new view with the new name and use that instead.
\ No newline at end of file diff --git a/src/UserGuide/V1.2.x/User-Manual/Security-Management_timecho.md b/src/UserGuide/V1.2.x/User-Manual/Security-Management_timecho.md deleted file mode 100644 index 4e88a0183..000000000 --- a/src/UserGuide/V1.2.x/User-Manual/Security-Management_timecho.md +++ /dev/null @@ -1,30 +0,0 @@ - - -# Security Management - -## White List - -coming soon - -## Security Audit -coming soon - diff --git a/src/UserGuide/V1.2.x/User-Manual/Streaming_timecho.md b/src/UserGuide/V1.2.x/User-Manual/Streaming_timecho.md deleted file mode 100644 index 9dda4c279..000000000 --- a/src/UserGuide/V1.2.x/User-Manual/Streaming_timecho.md +++ /dev/null @@ -1,811 +0,0 @@ - - -# Stream Computing Framework - -The IoTDB stream processing framework allows users to implement customized stream processing logic, which can monitor and capture storage engine changes, transform changed data, and push transformed data outward. - -We call a data flow processing task a Pipe. A stream processing task (Pipe) contains three subtasks: - -- Extract -- Process -- Send (Connect) - -The stream processing framework allows users to customize the processing logic of three subtasks using Java language and process data in a UDF-like manner. -In a Pipe, the above three subtasks are executed by three plugins respectively, and the data will be processed by these three plugins in turn: -Pipe Extractor is used to extract data, Pipe Processor is used to process data, Pipe Connector is used to send data, and the final data will be sent to an external system. - -**The model of the Pipe task is as follows:** - -![pipe.png](/img/pipe.png) - -Describing a data flow processing task essentially describes the properties of Pipe Extractor, Pipe Processor and Pipe Connector plugins. -Users can declaratively configure the specific attributes of the three subtasks through SQL statements, and achieve flexible data ETL capabilities by combining different attributes. 
 - -Using the stream processing framework, a complete data link can be built to meet the needs of end-side-cloud synchronization, off-site disaster recovery, and read-write load separation across databases. - -## Custom stream processing plugin development - -### Programming development dependencies - -It is recommended to use Maven to build the project and add the following dependency in `pom.xml`. Please be careful to select the same dependency version as the IoTDB server version. - -```xml -<dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>pipe-api</artifactId> - <version>1.2.1</version> - <scope>provided</scope> -</dependency> -``` - -### Event-driven programming model - -The user programming interface design of the stream processing plugin refers to the general design concept of the event-driven programming model. Events are data abstractions in the user programming interface, and the programming interface is decoupled from the specific execution method. It only needs to focus on describing the processing method expected by the system after the event (data) reaches the system. - -In the user programming interface of the stream processing plugin, events are an abstraction of database data writing operations. The event is captured by the stand-alone stream processing engine, and is passed to the PipeExtractor plugin, PipeProcessor plugin, and PipeConnector plugin in sequence according to the three-stage stream processing process, and triggers the execution of user logic in the three plugins in turn. - -In order to take into account the low latency of stream processing in low load scenarios on the end side and the high throughput of stream processing in high load scenarios on the end side, the stream processing engine will dynamically select processing objects in the operation logs and data files. Therefore, the user programming interface of stream processing requires users to provide processing logic for the following two types of events: operation log writing event TabletInsertionEvent and data file writing event TsFileInsertionEvent.
 - -#### **Operation log writing event (TabletInsertionEvent)** - -The operation log write event (TabletInsertionEvent) is a high-level data abstraction for user write requests. It provides users with the ability to manipulate the underlying data of write requests by providing a unified operation interface. - -For different database deployment methods, the underlying storage structures corresponding to operation log writing events are different. For stand-alone deployment scenarios, the operation log writing event is an encapsulation of write-ahead log (WAL) entries; for a distributed deployment scenario, the operation log writing event is an encapsulation of a single node consensus protocol operation log entry. - -For write operations generated by different write request interfaces in the database, the data structure of the request structure corresponding to the operation log write event is also different. IoTDB provides numerous writing interfaces such as InsertRecord, InsertRecords, InsertTablet, InsertTablets, etc. Each writing request uses a completely different serialization method, and the generated binary entries are also different. - -The existence of operation log writing events provides users with a unified view of data operations, which shields the implementation differences of the underlying data structure, greatly reduces the user's programming threshold, and improves the ease of use of the function. - -```java -/** TabletInsertionEvent is used to define the event of data insertion. */ -public interface TabletInsertionEvent extends Event { - - /** - * The consumer processes the data row by row and collects the results by RowCollector. - * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processRowByRow(BiConsumer<Row, RowCollector> consumer); - - /** - * The consumer processes the Tablet directly and collects the results by RowCollector.
 - * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processTablet(BiConsumer<Tablet, RowCollector> consumer); -} -``` - -#### **Data file writing event (TsFileInsertionEvent)** - -The data file writing event (TsFileInsertionEvent) is a high-level abstraction of the database file writing operation. It is a data collection of several operation log writing events (TabletInsertionEvent). - -The storage engine of IoTDB is LSM structured. When data is written, the writing operation will first be placed into a log-structured file, and the written data will be stored in the memory at the same time. When the memory reaches the control upper limit, the disk flushing behavior will be triggered, that is, the data in the memory will be converted into a database file, and the previously prewritten operation log will be deleted. When the data in the memory is converted into the data in the database file, it will undergo two compression processes: encoding compression and general compression. Therefore, the data in the database file takes up less space than the original data in the memory. - -In extreme network conditions, directly transmitting data files is more economical than transmitting data writing operations. It will occupy lower network bandwidth and achieve faster transmission speeds. Of course, there is no free lunch. Computing and processing data in files requires additional file I/O costs compared to directly computing and processing data in memory. However, it is precisely the existence of two structures, disk data files and memory write operations, with their own advantages and disadvantages, that gives the system the opportunity to make dynamic trade-offs and adjustments. It is based on this observation that data file writing events are introduced into the plugin's event model.
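
The relationship described above, where one data file writing event stands for a batch of operation log writing events, can be sketched with stand-in types. `TabletEvent`, `TsFileEvent` and `RecordingProcessor` below are hypothetical simplifications introduced only for this illustration; they are not the pipe-api interfaces and carry no real data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch only: the nested types are hypothetical stand-ins, not the pipe-api types. */
public class EventFanOutSketch {

    /** Stand-in for an operation log writing event; carries only a label. */
    static final class TabletEvent {
        final String label;

        TabletEvent(String label) {
            this.label = label;
        }
    }

    /** Stand-in for a data file writing event: a batch of tablet events. */
    static final class TsFileEvent {
        private final List<TabletEvent> tablets;

        TsFileEvent(TabletEvent... tablets) {
            this.tablets = Arrays.asList(tablets);
        }

        Iterable<TabletEvent> toTabletEvents() {
            return tablets;
        }
    }

    /** Records labels in arrival order; file events are unpacked into tablet events. */
    static final class RecordingProcessor {
        final List<String> seen = new ArrayList<>();

        void process(TabletEvent event) {
            seen.add(event.label);
        }

        void process(TsFileEvent event) {
            // Reuse the tablet-level logic for every event contained in the file.
            for (TabletEvent tablet : event.toTabletEvents()) {
                process(tablet);
            }
        }
    }

    /** Feeds one real-time event, then one file event holding two historical events. */
    public static List<String> demo() {
        RecordingProcessor processor = new RecordingProcessor();
        processor.process(new TabletEvent("realtime-1"));
        processor.process(new TsFileEvent(new TabletEvent("history-1"), new TabletEvent("history-2")));
        return processor.seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With this shape, logic written once for the tablet-level event also covers data that arrives as a file, at the cost of reading the file back into tablet form.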
 - -To sum up, the data file writing event appears in the event stream of the stream processing plugin in two situations: - -(1) Historical data extraction: Before a stream processing task starts, all written data that has been placed on the disk will exist in the form of TsFile. After a stream processing task starts, when collecting historical data, the historical data will be abstracted using TsFileInsertionEvent; - -(2) Real-time data extraction: While a stream processing task is in progress, if operation log write events in the data stream are processed more slowly than write requests arrive, then after a certain point the operation log write events that cannot be processed in time will be persisted to disk and exist in the form of TsFile. After this data is extracted by the stream processing engine, TsFileInsertionEvent will be used as an abstraction. - -```java -/** - * TsFileInsertionEvent is used to define the event of writing TsFile. Event data stores in disks, - * which is compressed and encoded, and requires IO cost for computational processing. - */ -public interface TsFileInsertionEvent extends Event { - - /** - * The method is used to convert the TsFileInsertionEvent into several TabletInsertionEvents. - * - * @return {@code Iterable<TabletInsertionEvent>} the list of TabletInsertionEvent - */ - Iterable<TabletInsertionEvent> toTabletInsertionEvents(); -} -``` - -### Custom stream processing plugin programming interface definition - -Based on the custom stream processing plugin programming interface, users can easily write data extraction plugins, data processing plugins and data sending plugins, so that the stream processing function can be flexibly adapted to various industrial scenarios. - -#### Data extraction plugin interface - -Data extraction is the first stage of the three stages of stream processing data from data extraction to data sending.
The data extraction plugin (PipeExtractor) is the bridge between the stream processing engine and the storage engine. It monitors the behavior of the storage engine -and captures various data write events. - -```java -/** - * PipeExtractor - * - *

PipeExtractor is responsible for capturing events from sources. - * - *

Various data sources can be supported by implementing different PipeExtractor classes. - * - *

The lifecycle of a PipeExtractor is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH EXTRACTOR` clause in SQL are - * parsed and the validation method {@link PipeExtractor#validate(PipeParameterValidator)} - * will be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeExtractor#customize(PipeParameters, PipeExtractorRuntimeConfiguration)} will be called - * to config the runtime behavior of the PipeExtractor. - *
  • Then the method {@link PipeExtractor#start()} will be called to start the PipeExtractor. - *
  • While the collaboration task is in progress, the method {@link PipeExtractor#supply()} will - * be called to capture events from sources and then the events will be passed to the - * PipeProcessor. - *
  • The method {@link PipeExtractor#close()} will be called when the collaboration task is - * cancelled (the `DROP PIPE` command is executed). - *
- */ -public interface PipeExtractor extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeExtractor#customize(PipeParameters, PipeExtractorRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeExtractor. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeExtractorRuntimeConfiguration. - *
- * - *

This method is called after the method {@link - * PipeExtractor#validate(PipeParameterValidator)} is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeExtractor - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeExtractorRuntimeConfiguration configuration) - throws Exception; - - /** - * Start the extractor. After this method is called, events should be ready to be supplied by - * {@link PipeExtractor#supply()}. This method is called after {@link - * PipeExtractor#customize(PipeParameters, PipeExtractorRuntimeConfiguration)} is called. - * - * @throws Exception the user can throw errors if necessary - */ - void start() throws Exception; - - /** - * Supply single event from the extractor and the caller will send the event to the processor. - * This method is called after {@link PipeExtractor#start()} is called. - * - * @return the event to be supplied. the event may be null if the extractor has no more events at - * the moment, but the extractor is still running for more events. - * @throws Exception the user can throw errors if necessary - */ - Event supply() throws Exception; -} -``` - -#### Data processing plugin interface - -Data processing is the second stage of the three stages of stream processing data from data extraction to data sending. The data processing plugin (PipeProcessor) is mainly used to filter and transform -the various events captured by the data extraction plugin (PipeExtractor). - -```java -/** - * PipeProcessor - * - *

PipeProcessor is used to filter and transform the Event formed by the PipeExtractor. - * - *

The lifecycle of a PipeProcessor is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH PROCESSOR` clause in SQL are - * parsed and the validation method {@link PipeProcessor#validate(PipeParameterValidator)} - * will be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} will be called - * to config the runtime behavior of the PipeProcessor. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeExtractor captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the event and then passes them to the PipeConnector. The - * following 3 methods will be called: {@link - * PipeProcessor#process(TabletInsertionEvent, EventCollector)}, {@link - * PipeProcessor#process(TsFileInsertionEvent, EventCollector)} and {@link - * PipeProcessor#process(Event, EventCollector)}. - *
    • PipeConnector serializes the events into binaries and sends them to sinks. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeProcessor#close() } method will be called. - *
- */ -public interface PipeProcessor extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeProcessor. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeProcessorRuntimeConfiguration. - *
- * - *

This method is called after the method {@link - * PipeProcessor#validate(PipeParameterValidator)} is called and before the beginning of the - * events processing. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeProcessor - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeProcessorRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is called to process the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(TabletInsertionEvent tabletInsertionEvent, EventCollector eventCollector) - throws Exception; - - /** - * This method is called to process the TsFileInsertionEvent. - * - * @param tsFileInsertionEvent TsFileInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - default void process(TsFileInsertionEvent tsFileInsertionEvent, EventCollector eventCollector) - throws Exception { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - process(tabletInsertionEvent, eventCollector); - } - } - - /** - * This method is called to process the Event. - * - * @param event Event to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(Event event, EventCollector eventCollector) throws Exception; -} -``` - -#### Data sending plugin interface - -Data sending is the third stage of the three stages of stream processing data from data extraction to data sending. 
The data sending plugin (PipeConnector) is mainly used to send the various events processed by the data processing plugin (PipeProcessor). -It serves as the network implementation layer of the stream processing framework, and the interface should allow access to multiple real-time communication protocols and multiple connectors. - -```java -/** - * PipeConnector - * - *

PipeConnector is responsible for sending events to sinks. - * - *

Various network protocols can be supported by implementing different PipeConnector classes. - * - *

The lifecycle of a PipeConnector is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH CONNECTOR` clause in SQL are - * parsed and the validation method {@link PipeConnector#validate(PipeParameterValidator)} - * will be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeConnector#customize(PipeParameters, PipeConnectorRuntimeConfiguration)} will be called - * to config the runtime behavior of the PipeConnector and the method {@link - * PipeConnector#handshake()} will be called to create a connection with sink. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeExtractor captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the event and then passes them to the PipeConnector. - *
    • PipeConnector serializes the events into binaries and sends them to sinks. The - * following 3 methods will be called: {@link - * PipeConnector#transfer(TabletInsertionEvent)}, {@link - * PipeConnector#transfer(TsFileInsertionEvent)} and {@link - * PipeConnector#transfer(Event)}. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeConnector#close() } method will be called. - *
- * - *

In addition, the method {@link PipeConnector#heartbeat()} will be called periodically to check - * whether the connection with sink is still alive. The method {@link PipeConnector#handshake()} - * will be called to create a new connection with the sink when the method {@link - * PipeConnector#heartbeat()} throws exceptions. - */ -public interface PipeConnector extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeConnector#customize(PipeParameters, PipeConnectorRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeConnector. In this method, the user can do the - * following things: - * - *

    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeConnectorRuntimeConfiguration. - *
- * - *

This method is called after the method {@link - * PipeConnector#validate(PipeParameterValidator)} is called and before the method {@link - * PipeConnector#handshake()} is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeConnector - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeConnectorRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is used to create a connection with sink. This method will be called after the - * method {@link PipeConnector#customize(PipeParameters, PipeConnectorRuntimeConfiguration)} is - * called or will be called when the method {@link PipeConnector#heartbeat()} throws exceptions. - * - * @throws Exception if the connection is failed to be created - */ - void handshake() throws Exception; - - /** - * This method will be called periodically to check whether the connection with sink is still - * alive. - * - * @throws Exception if the connection dies - */ - void heartbeat() throws Exception; - - /** - * This method is used to transfer the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(TabletInsertionEvent tabletInsertionEvent) throws Exception; - - /** - * This method is used to transfer the TsFileInsertionEvent. 
 - * - * @param tsFileInsertionEvent TsFileInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - default void transfer(TsFileInsertionEvent tsFileInsertionEvent) throws Exception { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - transfer(tabletInsertionEvent); - } - } - - /** - * This method is used to transfer the Event. - * - * @param event Event to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(Event event) throws Exception; -} -``` - -## Custom stream processing plugin management - -To keep user-defined plugins flexible and easy to use in production, the system also needs to manage plugins dynamically and uniformly. -The stream processing plugin management statements introduced in this chapter provide the entry point for this dynamic, unified plugin management. - -### Load plugin statement - -In IoTDB, to dynamically load a user-defined plugin into the system, you first need to implement a plugin class based on PipeExtractor, PipeProcessor or PipeConnector. -The plugin class then needs to be compiled and packaged into a JAR file, and finally the plugin is loaded into IoTDB using the load plugin management statement. - -The syntax of the load plugin management statement is as follows. - -```sql -CREATE PIPEPLUGIN <alias> -AS <full class name> -USING <URI of JAR package> -``` - -Example: Suppose you have implemented a data processing plugin named edu.tsinghua.iotdb.pipe.ExampleProcessor, packaged it as pipe-plugin.jar, and want to use it in the stream processing engine under the alias example.
There are two ways to provide the plugin package: upload it to a URI server, or place it in a local directory of the cluster. - -Method 1: Upload to the URI server - -Preparation: To register in this way, you need to upload the JAR package to the URI server in advance and ensure that the IoTDB instance that executes the registration statement can access the URI server. For example https://example.com:8080/iotdb/pipe-plugin.jar . - -SQL: - -```sql -CREATE PIPEPLUGIN example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI 'https://example.com:8080/iotdb/pipe-plugin.jar' -``` - -Method 2: Place the package in a local directory of the cluster - -Preparation: To register in this way, you need to place the JAR package in any path on the machine where the DataNode is located. We recommend placing it in the /ext/pipe directory of the IoTDB installation path (this directory already exists in the installation package, so you do not need to create it). For example: iotdb-1.x.x-bin/ext/pipe/pipe-plugin.jar. **(Note: If you are using a cluster, you need to place the JAR package under the same path on the machine of every DataNode)** - -SQL: - -```sql -CREATE PIPEPLUGIN example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI 'file:/<IoTDB installation path>/iotdb-1.x.x-bin/ext/pipe/pipe-plugin.jar' -``` - -### Delete plugin statement - -When a plugin is no longer needed and should be uninstalled from the system, use the delete plugin statement shown below. - -```sql -DROP PIPEPLUGIN <alias> -``` - -### View plugin statements - -Users can also view the plugins in the system on demand, using the statement shown below. -```sql -SHOW PIPEPLUGINS -``` - -## System preset stream processing plugin - -### Preset extractor plugin - -#### iotdb-extractor - -Function: Extract historical or real-time data inside IoTDB into pipe.
 - - -| key | value | value range | required or not | default value | -| ---------------------------------- | ------------------------------------------------ | -------------------------------------- | -------- |------| -| source | iotdb-source | String: iotdb-source | required | - | -| source.pattern | Path prefix for filtering time series | String: any time series prefix | optional | root | -| source.history.enable | Whether to synchronise historical data | Boolean: true, false | optional | true | -| source.history.start-time | Start event time of the historical data to synchronise (inclusive) | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional | Long.MIN_VALUE | -| source.history.end-time | End event time of the historical data to synchronise (inclusive) | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional | Long.MAX_VALUE | -| source.realtime.enable | Whether to synchronise real-time data | Boolean: true, false | optional | true | -| source.realtime.mode | Extraction mode for real-time data | String: hybrid, stream, batch | optional | hybrid | -| source.forwarding-pipe-requests | Whether to forward data written by another Pipe (usually Data Sync) | Boolean: true, false | optional | true | - -> 🚫 **extractor.pattern parameter description** -> ->* Pattern needs to use backticks to quote illegal characters or illegal path nodes. For example, if you want to filter root.\`a@b\` or root.\`123\`, you should set pattern to root.\`a@b\` or root.\`123\` (For details, please refer to [When to use single and double quotes and backticks](https://iotdb.apache.org/zh/Download/#_1-0-version incompatible syntax details illustrate)) -> * In the underlying implementation, extraction is most efficient when pattern is root (the default value); any other value reduces performance. -> * The path prefix does not need to form a complete path.
For example, when creating a pipe with the parameter 'extractor.pattern'='root.aligned.1': - > - > * root.aligned.1TS -> * root.aligned.1TS.\`1\` -> * root.aligned.100T - > - > The data will be extracted; - > - > * root.aligned.\`1\` -> * root.aligned.\`123\` - > - > The data will not be extracted. -> * The data of root.\_\_system will not be extracted by pipe. Although users can include any prefix in extractor.pattern, including prefixes with (or covering) root.\_\_system, the data under root.\_\_system will always be ignored by pipe - -> ❗️**Start-time, end-time parameter description of extractor.history** -> -> * start-time, end-time should be in ISO format, such as 2011-12-03T10:15:30 or 2011-12-03T10:15:30+01:00 - -> ✅ **A piece of data from production to IoTDB contains two key concepts of time** -> -> * **event time:** The time when the data is actually produced (or the generation time assigned to the data by the data production system, which is the time item in the data point). -> * **arrival time:** The time when data arrives in the IoTDB system. -> -> What we often call out-of-order data refers to data whose **event time** is far behind the current system time (or the maximum **event time** that has already been flushed to disk) when the data arrives. On the other hand, whether data is out-of-order or sequential, as long as it newly arrives in the system, its **arrival time** increases with the order in which the data arrives at IoTDB. - -> 💎 **iotdb-extractor’s work can be split into two stages** -> -> 1. Historical data extraction: all data with **arrival time** < **current system time** when creating pipe is called historical data -> 2. Real-time data extraction: all data with **arrival time** >= **current system time** when creating pipe is called real-time data -> -> The historical data transmission phase and the real-time data transmission phase are executed serially.
The real-time data transmission phase starts only after the historical data transmission phase has completed. -> -> Users can configure iotdb-extractor for: -> -> * Historical data extraction (`'extractor.history.enable' = 'true'`, `'extractor.realtime.enable' = 'false'` ) -> * Real-time data extraction (`'extractor.history.enable' = 'false'`, `'extractor.realtime.enable' = 'true'` ) -> * Full data extraction (`'extractor.history.enable' = 'true'`, `'extractor.realtime.enable' = 'true'` ) -> * Setting `extractor.history.enable` and `extractor.realtime.enable` to `false` at the same time is not allowed -> -> 📌 **extractor.realtime.mode: Data extraction mode** -> -> * log: In this mode, the task only uses the operation log for data processing and sending -> * file: In this mode, the task only uses data files for data processing and sending. -> * hybrid: This mode weighs the low latency but low throughput of sending data one by one from the operation log against the high throughput but high latency of sending data files in batches, and automatically switches to the appropriate extraction method under different write loads. It starts with extraction based on operation logs to ensure low sending latency. When a data backlog occurs, it automatically switches to extraction based on data files to ensure high sending throughput. When the backlog is cleared, it automatically switches back to extraction based on operation logs. This avoids the difficulty of balancing sending latency and throughput with a single data extraction algorithm.
 - -> 🍕 **extractor.forwarding-pipe-requests: Whether to allow forwarding data transmitted from another pipe** -> -> * If you want to use pipe to build data synchronization of A -> B -> C, then the pipe of B -> C needs to set this parameter to true, so that the data written by A to B through the pipe in A -> B can be forwarded correctly to C -> * If you want to use pipe to build two-way data synchronization (dual-active) of A \<-> B, then the pipes of A -> B and B -> A need to set this parameter to false, otherwise the data will be forwarded endlessly back and forth between the clusters - -### Preset processor plugin - -#### do-nothing-processor - -Function: No processing is done on the events passed in by the extractor. - - -| key | value | value range | required or optional with default | -| --------- | -------------------- | ---------------------------- | --------------------------------- | -| processor | do-nothing-processor | String: do-nothing-processor | required | - -### Preset connector plugin - -#### do-nothing-connector - -Function: No processing is done on the events passed in by the processor. - -| key | value | value range | required or optional with default | -| --------- | -------------------- | ---------------------------- | --------------------------------- | -| connector | do-nothing-connector | String: do-nothing-connector | required | - -## Stream processing task management - -### Create a stream processing task - -Use the `CREATE PIPE` statement to create a stream processing task.
Taking the creation of a data synchronization stream processing task as an example, the sample SQL statement is as follows: - -```sql -CREATE PIPE <PipeId> -- PipeId is a name that uniquely identifies the stream processing task -WITH EXTRACTOR ( - -- Default IoTDB data extraction plugin - 'extractor' = 'iotdb-extractor', - -- Path prefix, only data that matches the path prefix will be extracted for subsequent processing and sending - 'extractor.pattern' = 'root.timecho', - -- Whether to extract historical data - 'extractor.history.enable' = 'true', - -- Describes the time range of the extracted historical data, indicating the earliest time - 'extractor.history.start-time' = '2011-12-03T10:15:30+01:00', - -- Describes the time range of the extracted historical data, indicating the latest time - 'extractor.history.end-time' = '2022-12-03T10:15:30+01:00', - -- Whether to extract real-time data - 'extractor.realtime.enable' = 'true', - -- Describes the extraction method of real-time data - 'extractor.realtime.mode' = 'hybrid', -) -WITH PROCESSOR ( - -- The default data processing plugin, which does not do any processing - 'processor' = 'do-nothing-processor', -) -WITH CONNECTOR ( - -- IoTDB data sending plugin, the target is IoTDB - 'connector' = 'iotdb-thrift-connector', - -- The data service IP of one of the DataNode nodes in the target IoTDB - 'connector.ip' = '127.0.0.1', - -- The data service port of one of the DataNode nodes in the target IoTDB - 'connector.port' = '6667', -) -``` - -**When creating a stream processing task, you need to configure the PipeId and the parameters of the three plugin parts:** - -| Configuration | Description | Required or not | Default implementation | Default implementation description | Whether custom implementation is allowed | -| ------------- | ------------------------------------------------------------ | ------------------------------- | ---------------------- | ------------------------------------------------------------ | 
---------------------------------- | -| PipeId | A globally unique name that identifies a stream processing task | Required | - | - | - | -| extractor | Pipe Extractor plugin, responsible for extracting stream processing data at the bottom of the database | Optional | iotdb-extractor | Feeds the full historical data of the database and subsequently arriving real-time data into the stream processing task | No | -| processor | Pipe Processor plugin, responsible for processing data | Optional | do-nothing-processor | Does not do any processing on the incoming data | Yes | -| connector | Pipe Connector plugin, responsible for sending data | Required | - | - | Yes | - -In the example, the iotdb-extractor, do-nothing-processor and iotdb-thrift-connector plugins are used to build the stream processing task. IoTDB also has other built-in stream processing plugins, **please check the "System Preset Stream Processing plugin" section**. - -**The simplest example of the CREATE PIPE statement is as follows:** - -```sql -CREATE PIPE <PipeId> -- PipeId is a name that uniquely identifies the stream processing task -WITH CONNECTOR ( - -- IoTDB data sending plugin, the target is IoTDB - 'connector' = 'iotdb-thrift-connector', - -- The data service IP of one of the DataNode nodes in the target IoTDB - 'connector.ip' = '127.0.0.1', - -- The data service port of one of the DataNode nodes in the target IoTDB - 'connector.port' = '6667', -) -``` - -The semantics expressed are: synchronize all historical data in this database instance, and all subsequently arriving real-time data, to the target IoTDB instance at 127.0.0.1:6667. - -**Notice:** - -- EXTRACTOR and PROCESSOR are optional configurations. If you do not fill in the configuration parameters, the system will use the corresponding default implementation. -- CONNECTOR is a required configuration and needs to be configured declaratively in the CREATE PIPE statement -- CONNECTOR has self-reuse capability.
For different stream processing tasks, if their CONNECTORs have the same KV attributes (every key maps to the same value in both), the system will create only one CONNECTOR instance so that connection resources are reused. - - - For example, there are the following declarations of two stream processing tasks, pipe1 and pipe2: - - ```sql - CREATE PIPE pipe1 - WITH CONNECTOR ( - 'connector' = 'iotdb-thrift-connector', - 'connector.thrift.host' = 'localhost', - 'connector.thrift.port' = '9999', - ) - - CREATE PIPE pipe2 - WITH CONNECTOR ( - 'connector' = 'iotdb-thrift-connector', - 'connector.thrift.port' = '9999', - 'connector.thrift.host' = 'localhost', - ) - ``` - -- Because their declarations of CONNECTOR are exactly the same (**even if the order of declaration of some attributes is different**), the framework will automatically reuse the CONNECTORs they declared, and ultimately the CONNECTORs of pipe1 and pipe2 will be the same instance. -- When the extractor is the default iotdb-extractor, and extractor.forwarding-pipe-requests is the default value true, please do not build an application scenario that includes cyclic data synchronization (it will cause an infinite loop): - - - IoTDB A -> IoTDB B -> IoTDB A - - IoTDB A -> IoTDB A - -### Start the stream processing task - -After the CREATE PIPE statement is successfully executed, the instances related to the stream processing task are created, but the running status of the whole task is set to STOPPED, that is, the stream processing task will not process data immediately.
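The status behaviour described here, together with the START, STOP and DROP statements covered in the following sections, can be pictured as a small state machine. The Java sketch below is purely illustrative: the class `PipeLifecycleSketch` and its methods are hypothetical names, not part of IoTDB, and it assumes that START and STOP are harmless when the task is already in the target status, which the text does not specify.

```java
/** Illustrative model of pipe status transitions; hypothetical, not IoTDB source code. */
public class PipeLifecycleSketch {

  /** Status names as used in this document. */
  public enum Status { STOPPED, RUNNING, DROPPED }

  // CREATE PIPE leaves the task in STOPPED, so that is the initial status.
  private Status status = Status.STOPPED;

  public Status status() {
    return status;
  }

  /** Models START PIPE: the task begins processing data. */
  public void start() {
    requireNotDropped();
    status = Status.RUNNING;
  }

  /** Models STOP PIPE: the task stops processing data. */
  public void stop() {
    requireNotDropped();
    status = Status.STOPPED;
  }

  /** Models DROP PIPE: allowed from RUNNING or STOPPED, no prior STOP needed. */
  public void drop() {
    status = Status.DROPPED;
  }

  // Assumption: a dropped task cannot be started or stopped again.
  private void requireNotDropped() {
    if (status == Status.DROPPED) {
      throw new IllegalStateException("pipe was dropped");
    }
  }

  public static void main(String[] args) {
    PipeLifecycleSketch pipe = new PipeLifecycleSketch();
    System.out.println("after CREATE PIPE: " + pipe.status());
    pipe.start();
    System.out.println("after START PIPE:  " + pipe.status());
    pipe.stop();
    System.out.println("after STOP PIPE:   " + pipe.status());
    pipe.drop();
    System.out.println("after DROP PIPE:   " + pipe.status());
  }
}
```

The point of the sketch is only that a newly created task does nothing until it is explicitly started.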
 - -You can use the START PIPE statement to make a stream processing task start processing data: - -```sql -START PIPE <PipeId> -``` - -### Stop the stream processing task - -Use the STOP PIPE statement to stop the stream processing task from processing data: - -```sql -STOP PIPE <PipeId> -``` - -### Delete stream processing tasks - -Use the DROP PIPE statement to stop the stream processing task from processing data (when the stream processing task status is RUNNING), and then delete the entire stream processing task: - -```sql -DROP PIPE <PipeId> -``` - -Users do not need to perform a STOP operation before deleting the stream processing task. - -### Display stream processing tasks - -Use the SHOW PIPES statement to view all stream processing tasks: - -```sql -SHOW PIPES -``` - -The query results are as follows: - -```sql -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -| ID| CreationTime | State|PipeExtractor|PipeProcessor|PipeConnector|ExceptionMessage| -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -|iotdb-kafka|2022-03-30T20:58:30.689|RUNNING| ...| ...| ...| None| -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -|iotdb-iotdb|2022-03-31T12:55:28.129|STOPPED| ...| ...| ...| TException: ...| -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -``` - -You can specify a `<PipeId>` to view the status of a particular stream processing task: - -```sql -SHOW PIPE <PipeId> -``` - -You can also use the where clause to determine whether the Pipe Connector used by a certain \<PipeId\> is reused. - -```sql -SHOW PIPES -WHERE CONNECTOR USED BY <PipeId> -``` - -### Stream processing task running status migration - -A stream processing pipe will pass through various states during its managed life cycle: - -- **STOPPED:** The pipe is stopped.
A pipe can be in this state for several reasons: - - When a pipe is successfully created, its initial state is STOPPED. - - The user manually stops a pipe that is running normally, and its status changes from RUNNING to STOPPED. - - When an unrecoverable error occurs while a pipe is running, its status automatically changes from RUNNING to STOPPED -- **RUNNING:** The pipe is working properly -- **DROPPED:** The pipe task was permanently deleted - -The following diagram shows all states and state transitions: - -![State migration diagram](/img/%E7%8A%B6%E6%80%81%E8%BF%81%E7%A7%BB%E5%9B%BE.png) - -## Authority management - -### Stream processing tasks - - -| Permission name | Description | -| ----------- | -------------------------- | -| CREATE_PIPE | Register a stream processing task. The path is irrelevant. | -| START_PIPE | Start the stream processing task. The path is irrelevant. | -| STOP_PIPE | Stop the stream processing task. The path is irrelevant. | -| DROP_PIPE | Delete the stream processing task. The path is irrelevant. | -| SHOW_PIPES | Query stream processing tasks. The path is irrelevant. | - -### Stream processing task plugin - - -| Permission name | Description | -| ------------------ | ---------------------------------- | -| CREATE_PIPEPLUGIN | Register a stream processing task plugin. The path is irrelevant. | -| DROP_PIPEPLUGIN | Uninstall the stream processing task plugin. The path is irrelevant. | -| SHOW_PIPEPLUGINS | Query stream processing task plugins. The path is irrelevant. | - -## Configuration parameters - -In iotdb-common.properties: - -```Properties -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative.
-# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_connector_timeout_ms=900000 -``` diff --git a/src/UserGuide/V1.2.x/User-Manual/Tiered-Storage_timecho.md b/src/UserGuide/V1.2.x/User-Manual/Tiered-Storage_timecho.md deleted file mode 100644 index 2bde0000c..000000000 --- a/src/UserGuide/V1.2.x/User-Manual/Tiered-Storage_timecho.md +++ /dev/null @@ -1,96 +0,0 @@ - - -# Tiered Storage -## Overview - -The Tiered storage functionality allows users to define multiple layers of storage, spanning across multiple types of storage media (Memory mapped directory, SSD, rotational hard discs or cloud storage). While memory and cloud storage is usually singular, the local file system storages can consist of multiple directories joined together into one tier. Meanwhile, users can classify data based on its hot or cold nature and store data of different categories in specified "tier". Currently, IoTDB supports the classification of hot and cold data through TTL (Time to live / age) of data. When the data in one tier does not meet the TTL rules defined in the current tier, the data will be automatically migrated to the next tier. - -## Parameter Definition - -To enable tiered storage in IoTDB, you need to configure the following aspects: - -1. configure the data catalogue and divide the data catalogue into different tiers -2. configure the TTL of the data managed in each tier to distinguish between hot and cold data categories managed in different tiers. -3. 
configure the minimum remaining storage space ratio for each tier, so that when the remaining space of a tier falls below the threshold, the data of that tier is automatically migrated to the next tier (optional). - -The specific parameter definitions and their descriptions are as follows. - -| Configuration | Default | Description | Constraint | -| --------------------------------------- | ------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -| dn_data_dirs | data/datanode/data | specify different storage directories and divide the storage directories into tiers | Tiers are separated by semicolons, and directories within a single tier by commas; cloud storage (OBJECT_STORAGE) can only be used as the last tier and cannot be the first tier; at most one cloud storage tier is allowed; the remote storage directory is denoted by OBJECT_STORAGE | -| default_ttl_in_ms | -1 | Define the maximum age of data for which each tier is responsible | Tiers are separated by semicolons; the number of tiers should match the number of tiers defined by dn_data_dirs | -| dn_default_space_move_thresholds | 0.15 | Define the minimum remaining space ratio for the data directories of each tier; when the remaining space is less than this ratio, the data will be automatically migrated to the next tier; when the remaining storage space of the last tier falls below this threshold, the system will be set to READ_ONLY | Tiers are separated by semicolons; the number of tiers should match the number of tiers defined by dn_data_dirs | -| object_storage_type | AWS_S3 | Cloud Storage Type | IoTDB currently only supports AWS S3 as a remote storage type, and this parameter can't be modified | -| object_storage_bucket | iotdb_data | Name of cloud storage bucket | Bucket definition in AWS S3; no need to configure if remote storage is not used | 
 -| object_storage_endpoint | | Endpoint of cloud storage | Endpoint of AWS S3; if remote storage is not used, no configuration required | -| object_storage_access_key | | Cloud storage credential: access key | AWS S3 credential key; if remote storage is not used, no configuration required | -| object_storage_access_secret | | Cloud storage credential: access secret | AWS S3 credential secret; if remote storage is not used, no configuration required | -| remote_tsfile_cache_dirs | data/datanode/data/cache | Local cache directory for cloud storage | If remote storage is not used, no configuration required | -| remote_tsfile_cache_page_size_in_kb | 20480 | Block size of the locally cached cloud storage files | If remote storage is not used, no configuration required | -| remote_tsfile_cache_max_disk_usage_in_mb | 51200 | Maximum disk usage of the local cache for cloud storage | If remote storage is not used, no configuration required | - -## Local tiered storage configuration example - -The following is an example of a local two-level storage configuration.
 - -```JavaScript -//Required configuration items -dn_data_dirs=/data1/data;/data2/data,/data3/data -default_ttl_in_ms=86400000;-1 -dn_default_space_move_thresholds=0.2;0.1 -``` - -In this example, two levels of storage are configured, specifically: - -| **tier** | **data path** | **data range** | **threshold for minimum remaining disk space** | -| -------- | -------------------------------------- | --------------- | ------------------------ | -| tier 1 | path 1:/data1/data | data for last 1 day | 20% | -| tier 2 | path 1:/data2/data path 2:/data3/data | data older than 1 day | 10% | - -## Remote tiered storage configuration example - -The following takes three-level storage as an example: - -```JavaScript -//Required configuration items -dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE -default_ttl_in_ms=86400000;864000000;-1 -dn_default_space_move_thresholds=0.2;0.15;0.1 -object_storage_type=AWS_S3 -object_storage_bucket=iotdb -object_storage_endpoint= -object_storage_access_key= -object_storage_access_secret= - -// Optional configuration items -remote_tsfile_cache_dirs=data/datanode/data/cache -remote_tsfile_cache_page_size_in_kb=20480 -remote_tsfile_cache_max_disk_usage_in_mb=51200 -``` - -In this example, a total of three levels of storage are configured, specifically: - -| **tier** | **data path** | **data range** | **threshold for minimum remaining disk space** | -| -------- | -------------------------------------- | ---------------------------- | ------------------------ | -| tier1 | path 1:/data1/data | data for last 1 day | 20% | -| tier2 | path 1:/data2/data path 2:/data3/data | data from past 1 day to past 10 days | 15% | -| tier3 | Remote AWS S3 Storage | data older than 10 days | 10% | diff --git a/src/UserGuide/V1.3.x/AI-capability/AINode_timecho.md b/src/UserGuide/V1.3.x/AI-capability/AINode_timecho.md deleted file mode 100644 index 0676658d3..000000000 --- a/src/UserGuide/V1.3.x/AI-capability/AINode_timecho.md +++ /dev/null @@ 
-1,661 +0,0 @@ - - -# AINode - -AINode is a native IoTDB node that supports the registration, management, and invocation of time-series-related models. It comes with built-in industry-leading self-developed time-series large models, such as the Timer series developed by Tsinghua University. These models can be invoked through standard SQL statements, enabling real-time inference of time series data at the millisecond level, and supporting application scenarios such as trend forecasting, missing value imputation, and anomaly detection for time series data. - - -The system architecture is shown below: -::: center - -::: -The responsibilities of the three nodes are as follows: - -- **ConfigNode**: responsible for storing and managing the meta-information of the model; responsible for distributed node management. -- **DataNode**: responsible for receiving and parsing SQL requests from users; responsible for storing time-series data; responsible for preprocessing computation of data. -- **AINode**: responsible for model file import creation and model inference. - -## 1. Advantageous features - -Compared with building a machine learning service alone, it has the following advantages: - -- **Simple and easy to use**: no need to use Python or Java programming, the complete process of machine learning model management and inference can be completed using SQL statements. Creating a model can be done using the CREATE MODEL statement, and using a model for inference can be done using the CALL INFERENCE (...) statement, making it simpler and more convenient to use. - -- **Avoid Data Migration**: With IoTDB native machine learning, data stored in IoTDB can be directly applied to the inference of machine learning models without having to move the data to a separate machine learning service platform, which accelerates data processing, improves security, and reduces costs. 
 - -![](/img/AInode1.png) - -- **Built-in Advanced Algorithms**: supports industry-leading machine learning analytics algorithms covering typical time series analysis tasks, empowering the time series database with native data analysis capabilities. For example: - - **Time Series Forecasting**: learns patterns of change from past time series, and outputs the most likely prediction of the future series based on the given past observations. - - **Anomaly Detection for Time Series**: detects and identifies outliers in given time series data, helping to discover anomalous behaviour in the time series. - - **Annotation for Time Series (Time Series Annotation)**: Adds additional information or markers, such as event occurrence, outliers, trend changes, etc., to each data point or specific time period to better understand and analyse the data. - - - -## 2. Basic Concepts - -- **Model**: a machine learning model that takes time-series data as input and outputs the results or decisions of an analysis task. The model is the basic management unit of AINode, which supports adding (registering), deleting, querying, and using (inference with) models. -- **Create**: Load externally designed or trained model files or algorithms into AINode for unified management and use by IoTDB. -- **Inference**: The process of using a created model to complete the time series analysis task it is suited to on the specified time series data. -- **Built-in capabilities**: AINode comes with machine learning algorithms or self-developed models for common time series analysis scenarios (e.g., prediction and anomaly detection). - -::: center - -::: - -## 3. Installation and Deployment - -The deployment of AINode can be found in the document [Deployment Guidelines](../Deployment-and-Maintenance/AINode_Deployment_timecho.md#AINode-部署) . - - -## 4. Usage Guidelines - -AINode provides model creation and deletion workflows for deep learning models that work on time series data.
Built-in models do not need to be created or deleted; they can be used directly, and the built-in model instances created for inference are destroyed automatically once inference completes. - -### 4.1 Registering Models - -A trained deep learning model can be registered by specifying the vector dimensions of the model's inputs and outputs, after which it can be used for model inference. - -Models that meet the following criteria can be registered in AINode: -1. AINode supports models trained with PyTorch 2.1.0 and 2.2.0; avoid using features introduced in versions later than 2.2.0. -2. AINode supports models stored using PyTorch JIT, and the model file needs to include the parameters and structure of the model. -3. The input sequence of the model can contain one or more columns, and if there are multiple columns, they need to correspond to the model capability and model configuration file. -4. The input and output dimensions of the model must be clearly defined in the `config.yaml` configuration file. When using the model, it is necessary to strictly follow the input-output dimensions defined in the `config.yaml` configuration file. If the number of input and output columns does not match the configuration file, it will result in errors. - -The following is the SQL syntax definition for model registration. - -```SQL -create model <model_name> using uri <uri> -``` - -The specific meanings of the parameters in the SQL are as follows: - -- model_name: a globally unique identifier for the model, which cannot be repeated. The model name has the following constraints: - - - Identifiers [ 0-9 a-z A-Z _ ] (letters, numbers, underscores) are allowed. - - Length is limited to 2-64 characters - - Case sensitive - -- uri: resource path to the model registration file, which should contain the **model weights model.pt file and the model's metadata description file config.yaml**.
- - - Model weight file: the weight file obtained after training of the deep learning model is completed; currently the .pt file produced by PyTorch training is supported - - - yaml metadata description file: parameters related to the model structure that need to be provided when the model is registered, which must contain the input and output dimensions of the model for model inference: - - - | **Parameter name** | **Parameter description** | **Example** | - | ------------ | ---------------------------- | -------- | - | input_shape | rows and columns of the model input, for model inference | [96,2] | - | output_shape | rows and columns of the model output, for model inference | [48,2] | - - - In addition to the shapes used for model inference, the data types of the model input and output can be specified: - - - | **Parameter name** | **Parameter description** | **Example** | - | ----------- | ------------------ | --------------------- | - | input_type | data type of the model input | ['float32','float32'] | - | output_type | data type of the model output | ['float32','float32'] | - - - In addition, notes can be specified for display during model management: - - - | **Parameter name** | **Parameter description** | **Example** | - | ---------- | ---------------------------------------------- | ------------------------------------------- | - | attributes | optional, user-defined model notes for model display | 'model_type': 'dlinear','kernel_size': '25' | - - -In addition to registering local model files, a model can be registered by specifying a remote resource path via a URI, for example from an open source model repository (e.g. HuggingFace). - -#### Example - -The current example folder contains the files model.pt and config.yaml. model.pt is the trained model, and the content of config.yaml is as follows: - -```YAML -configs: - # Required options - input_shape: [96, 2] # The model receives data in 96 rows x 2 columns.
- output_shape: [48, 2] # Indicates that the model outputs 48 rows x 2 columns. - - # Optional. Defaults to float32 for every column; the number of entries must equal the number of columns in the shape. - input_type: ["int64", "int64"] # Input data type, needs to match the number of columns. - output_type: ["text", "int64"] # Output data type, needs to match the number of columns. - -attributes: # Optional user-defined notes for the model. - 'model_type': 'dlinear' - 'kernel_size': '25' -``` - -Specify this folder as the load path to register the model. - -```SQL -IoTDB> create model dlinear_example using uri "file://./example" -``` - -Alternatively, you can download the corresponding model file from HuggingFace and register it. - -```SQL -IoTDB> create model dlinear_example using uri "https://huggingface.com/IoTDBML/dlinear/" -``` - -After the SQL is executed, registration proceeds asynchronously. You can check the registration status with the show models command (see section 4.2 Viewing Models). The time needed for registration depends mainly on the size of the model file. - -Once the model registration is complete, you can call the inference function and perform model inference with ordinary queries. - -### 4.2 Viewing Models - -Information about successfully registered models can be queried with the show models command. The SQL definition is as follows: - -```SQL -show models - -show models <model_name> -``` - -In addition to displaying information about all models, you can specify a model id to view information about a specific model.
The result of show models contains the following information: - -| **ModelId** | **State** | **Configs** | **Attributes** | -| ------------ | ------------------------------------- | ---------------------------------------------- | -------------- | -| Model Unique Identifier | Model Registration Status (LOADING, ACTIVE, DROPPING, UNAVAILABLE) | inputShape, outputShape, inputTypes, outputTypes | Model Notes | - -State shows the current state of model registration and takes one of the following four values: - -- **LOADING**: The corresponding model meta information has been added to the ConfigNode, and the model file is being transferred to the AINode. -- **ACTIVE**: The model has been set up and is available. -- **DROPPING**: Model deletion is in progress; model-related information is being deleted from the ConfigNode and the AINode. -- **UNAVAILABLE**: Model creation failed; you can delete the failed model with drop model. - -#### Example - -```SQL -IoTDB> show models - - -+---------------------+--------------------------+-----------+----------------------------+-----------------------+ -| ModelId| ModelType| State| Configs| Notes| -+---------------------+--------------------------+-----------+----------------------------+-----------------------+ -| dlinear_example| USER_DEFINED| ACTIVE| inputShape:[96,2]| | -| | | | outputShape:[48,2]| | -| | | | inputDataType:[float,float]| | -| | | |outputDataType:[float,float]| | -| _STLForecaster| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB| -| _NaiveForecaster| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB| -| _ARIMA| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB| -|_ExponentialSmoothing| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB| -| _GaussianHMM|BUILT_IN_ANOMALY_DETECTION| ACTIVE| |Built-in model in IoTDB| -| _GMMHMM|BUILT_IN_ANOMALY_DETECTION| ACTIVE| |Built-in model in IoTDB| -| _Stray|BUILT_IN_ANOMALY_DETECTION| ACTIVE| |Built-in model in IoTDB|
-+---------------------+--------------------------+-----------+----------------------------+-----------------------+ -``` - -The model registered earlier appears in the list under its ModelId; the ACTIVE state indicates that it was registered successfully and can be used for inference. - -### 4.3 Delete Model - -A successfully registered model can be deleted via SQL. In addition to deleting the meta information on the ConfigNode, this operation also deletes all related model files on the AINode. The SQL is as follows: - -```SQL -drop model <model_name> -``` - -Specify the model_name of a successfully registered model to delete it. Because model deletion removes data on multiple nodes, the operation does not complete immediately. While it is in progress the model is in the DROPPING state and cannot be used for inference. - -### 4.4 Inference with Built-in Models - -The SQL syntax is as follows: - - -```SQL -call inference(<built_in_model_name>, sql[, <parameterName>=<parameterValue>]) -``` - -Built-in model inference does not require registration. The inference function is invoked with the call keyword, and its parameters are described as follows: - -- **built_in_model_name**: built-in model name -- **parameterName**: parameter name -- **parameterValue**: parameter value - -#### Built-in Models and Parameter Descriptions - -The following machine learning models are currently built in; refer to the links below for detailed parameter descriptions.
- -| Model | built_in_model_name | Task type | Parameter description | -| -------------------- | --------------------- | -------- | ------------------------------------------------------------ | -| Arima | _Arima | Forecast | [Arima Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.arima.ARIMA.html?highlight=Arima) | -| STLForecaster | _STLForecaster | Forecast | [STLForecaster Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.trend.STLForecaster.html#sktime.forecasting.trend.STLForecaster) | -| NaiveForecaster | _NaiveForecaster | Forecast | [NaiveForecaster Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.naive.NaiveForecaster.html#naiveforecaster) | -| ExponentialSmoothing | _ExponentialSmoothing | Forecast | [ExponentialSmoothing Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.exp_smoothing.ExponentialSmoothing.html) | -| GaussianHMM | _GaussianHMM | Annotation | [GaussianHMM Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.annotation.hmm_learn.gaussian.GaussianHMM.html) | -| GMMHMM | _GMMHMM | Annotation | [GMMHMM Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.annotation.hmm_learn.gmm.GMMHMM.html) | -| Stray | _Stray | Anomaly detection | [Stray Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.annotation.stray.STRAY.html) | - - -#### Example - -The following is an example of inference with a built-in model. The built-in Stray model is an anomaly detection algorithm. Its input is `[144,1]` and its output is `[144,1]`. We invoke it for inference through SQL.
- -```SQL -IoTDB> select * from root.eg.airline -+-----------------------------+------------------+ -| Time|root.eg.airline.s0| -+-----------------------------+------------------+ -|1949-01-31T00:00:00.000+08:00| 224.0| -|1949-02-28T00:00:00.000+08:00| 118.0| -|1949-03-31T00:00:00.000+08:00| 132.0| -|1949-04-30T00:00:00.000+08:00| 129.0| -...... -|1960-09-30T00:00:00.000+08:00| 508.0| -|1960-10-31T00:00:00.000+08:00| 461.0| -|1960-11-30T00:00:00.000+08:00| 390.0| -|1960-12-31T00:00:00.000+08:00| 432.0| -+-----------------------------+------------------+ -Total line number = 144 - -IoTDB> call inference(_Stray, "select s0 from root.eg.airline", k=2) -+-------+ -|output0| -+-------+ -| 0| -| 0| -| 0| -| 0| -...... -| 1| -| 1| -| 0| -| 0| -| 0| -| 0| -+-------+ -Total line number = 144 -``` - -### 4.5 Inference with Deep Learning Models - -The SQL syntax is as follows: - -```SQL -call inference(<model_name>, sql[, window=<window_function>]) - - -window_function: - head(window_size) - tail(window_size) - count(window_size,sliding_step) -``` - -After a model has been registered, the inference function is invoked with the call keyword, and its parameters are described as follows: - -- **model_name**: the name of a registered model -- **sql**: SQL query statement; the result of the query is used as the input to the model for inference. The numbers of rows and columns in the query result need to match the size specified in the model's config.
(Using the `SELECT *` clause here is not recommended: in IoTDB, `*` does not sort the columns, so the column order is undefined. Use an explicit list such as `SELECT s0,s1` to ensure that the column order matches what the model expects as input.) -- **window_function**: window function applied during inference. Three window functions are currently provided to assist model inference: - - **head(window_size)**: takes the first window_size points in the data for model inference; this window can be used for data cropping. - ![](/img/AINode-call1.png) - - - **tail(window_size)**: takes the last window_size points in the data for model inference; this window can be used for data cropping. - ![](/img/AINode-call2.png) - - - **count(window_size, sliding_step)**: sliding window based on the number of points. The data in each window is passed through the model separately. As shown in the figure below, a window function with a window_size of 2 divides the input dataset into three windows, and inference is run on each window to produce its own result. This window can be used for continuous inference. - ![](/img/AINode-call3.png) - -**Explanation 1**: a window can be used to crop rows when the number of rows returned by the SQL query does not match the number of input rows the model requires. Note that when the number of columns does not match, or the number of rows is less than the model requires, inference cannot proceed and an error message is returned. - -**Explanation 2**: In deep learning applications, timestamp-derived features (time columns in the data) are often used as covariates in generative tasks and are fed into the model together with the data to improve it, but the time columns are generally not included in the model's output.
To keep the implementation general, the inference result contains only the actual output of the model; if the model does not output a time column, the result does not include one. - - -#### Example - -The following is an example of inference with a deep learning model, using the `dlinear` prediction model mentioned above with input `[96,2]` and output `[48,2]`, which we invoke via SQL. - -```Shell -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... -|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**") -+--------------------------------------------+-----------------------------+ -| _result_0| _result_1| -+--------------------------------------------+-----------------------------+ -| 0.726302981376648| 1.6549958229064941| -| 0.7354921698570251| 1.6482787370681763| -| 0.7238251566886902| 1.6278168201446533| -......
-| 0.7692174911499023| 1.654654049873352| -| 0.7685555815696716| 1.6625318765640259| -| 0.7856493592262268| 1.6508299350738525| -+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### Example of using the tail/head window function - -When the amount of data varies and you want to run inference on the latest 96 rows, use the tail window function. The head function is used in the same way, except that it takes the earliest 96 rows. - -```Shell -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1988-01-01T00:00:00.000+08:00| 0.7355| 1.211| -...... -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -......
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 996 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**",window=tail(96)) -+--------------------------------------------+-----------------------------+ -| _result_0| _result_1| -+--------------------------------------------+-----------------------------+ -| 0.726302981376648| 1.6549958229064941| -| 0.7354921698570251| 1.6482787370681763| -| 0.7238251566886902| 1.6278168201446533| -...... -| 0.7692174911499023| 1.654654049873352| -| 0.7685555815696716| 1.6625318765640259| -| 0.7856493592262268| 1.6508299350738525| -+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### Example of using the count window function - -This window is mainly used for computational tasks. When the task's corresponding model can only handle a fixed number of rows of data at a time, but the final desired outcome is multiple sets of prediction results, this window function can be used to perform continuous inference using a sliding window of points. Suppose we now have an anomaly detection model `anomaly_example(input: [24,2], output[1,1])`, which generates a 0/1 label for every 24 rows of data. 
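
The three window functions can be pictured as plain row-slicing rules. The following Python sketch is illustrative only: the helper names `head`, `tail`, and `count` mirror the SQL keywords but are hypothetical functions, not part of any IoTDB client, and dropping a trailing partial window is an assumption rather than documented AINode behaviour.

```python
from typing import List, Sequence, TypeVar

Row = TypeVar("Row")


def head(rows: Sequence[Row], window_size: int) -> List[Row]:
    """Keep the earliest window_size rows."""
    return list(rows[:window_size])


def tail(rows: Sequence[Row], window_size: int) -> List[Row]:
    """Keep the latest window_size rows."""
    return list(rows[-window_size:]) if window_size > 0 else []


def count(rows: Sequence[Row], window_size: int, sliding_step: int) -> List[List[Row]]:
    """Split rows into windows of window_size rows, advancing by sliding_step rows.

    Assumption: a trailing window with fewer than window_size rows is dropped.
    """
    if window_size <= 0 or sliding_step <= 0:
        raise ValueError("window_size and sliding_step must be positive")
    last_start = len(rows) - window_size
    return [list(rows[start:start + window_size])
            for start in range(0, last_start + 1, sliding_step)]


rows = list(range(96))                 # stand-in for 96 query result rows
windows = count(rows, 24, 24)          # same shape as window=count(24,24)
print(len(windows), len(windows[0]))   # number of windows, rows per window
```

For 96 rows, `count(rows, 24, 24)` produces four windows of 24 rows each, so a model such as `anomaly_example` would be called four times.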
An example of its use is as follows: - -```Shell -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... -|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(anomaly_example,"select s0,s1 from root.**",window=count(24,24)) -+-------------------------+ -| _result_0| -+-------------------------+ -| 0| -| 1| -| 1| -| 0| -+-------------------------+ -Total line number = 4 -``` - -In the result set, each row's label is the output of the anomaly detection model for one group of 24 input rows. - - -### 4.6 TimeSeries Large Models Import Steps - -AINode currently supports a variety of time series large models. For deployment and usage, please refer to [TimeSeries Large Models](../AI-capability/TimeSeries-Large-Model) - - -## 5. Privilege Management - -AINode-related functions are governed by IoTDB's own authentication and permission management: users can use the model management functions only when they have the USE_MODEL permission.
When using the inference function, the user needs permission to access the source series referenced by the SQL that supplies the model input. - -| Privilege Name | Privilege Scope | Administrator User (default ROOT) | Normal User | Path Related | -| --------- | --------------------------------- | ---------------------- | -------- | -------- | -| USE_MODEL | create model/show models/drop model | √ | √ | x | -| READ_DATA| call inference | √ | √|√ | - -## 6. Practical Examples - -### 6.1 Power Load Prediction - -In some industrial scenarios, there is a need to predict power loads, which can be used to optimise power supply, conserve energy and resources, support planning and expansion, and enhance power system reliability. - -The data for the test set of ETTh1 that we use is [ETTh1](/img/ETTh1.csv). - - -It contains power data collected at 1h intervals. Each record consists of load values and oil temperature: High UseFul Load, High UseLess Load, Middle UseFul Load, Middle UseLess Load, Low UseFul Load, Low UseLess Load, and Oil Temperature. - -On this dataset, the model inference function of IoTDB-ML can predict the oil temperature over a future period from the past values of the high, middle and low loads and the oil temperature at the corresponding timestamps, which supports the automatic regulation and monitoring of grid transformers. - -#### Step 1: Data Import - -Users can import the ETT dataset into IoTDB using `import-csv.sh` in the tools folder - -```Bash -bash ./import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f ../../ETTh1.csv -``` - -#### Step 2: Model Import - -We can enter the following SQL in iotdb-cli to pull a trained model from HuggingFace and register it for subsequent inference.
- -```SQL -create model dlinear using uri 'https://huggingface.co/hvlgo/dlinear/tree/main' -``` - -This model is based on DLinear, a lightweight deep model that captures trends within a sequence and relationships between variables while keeping inference relatively fast, which makes it better suited to fast real-time prediction than deeper models. - -#### Step 3: Model Inference - -```Shell -IoTDB> select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth LIMIT 96 -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -| Time|root.eg.etth.s0|root.eg.etth.s1|root.eg.etth.s2|root.eg.etth.s3|root.eg.etth.s4|root.eg.etth.s5|root.eg.etth.s6| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -|2017-10-20T00:00:00.000+08:00| 10.449| 3.885| 8.706| 2.025| 2.041| 0.944| 8.864| -|2017-10-20T01:00:00.000+08:00| 11.119| 3.952| 8.813| 2.31| 2.071| 1.005| 8.442| -|2017-10-20T02:00:00.000+08:00| 9.511| 2.88| 7.533| 1.564| 1.949| 0.883| 8.16| -|2017-10-20T03:00:00.000+08:00| 9.645| 2.21| 7.249| 1.066| 1.828| 0.914| 7.949| -......
-|2017-10-23T20:00:00.000+08:00| 8.105| 0.938| 4.371| -0.569| 3.533| 1.279| 9.708| -|2017-10-23T21:00:00.000+08:00| 7.167| 1.206| 4.087| -0.462| 3.107| 1.432| 8.723| -|2017-10-23T22:00:00.000+08:00| 7.1| 1.34| 4.015| -0.32| 2.772| 1.31| 8.864| -|2017-10-23T23:00:00.000+08:00| 9.176| 2.746| 7.107| 1.635| 2.65| 1.097| 9.004| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -Total line number = 96 - -IoTDB> call inference(dlinear, "select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth", window=head(96)) -+-----------+----------+----------+------------+---------+----------+----------+ -| output0| output1| output2| output3| output4| output5| output6| -+-----------+----------+----------+------------+---------+----------+----------+ -| 10.319546| 3.1450553| 7.877341| 1.5723765|2.7303758| 1.1362307| 8.867775| -| 10.443649| 3.3286757| 7.8593454| 1.7675098| 2.560634| 1.1177158| 8.920919| -| 10.883752| 3.2341104| 8.47036| 1.6116762|2.4874182| 1.1760603| 8.798939| -...... -| 8.0115595| 1.2995274| 6.9900327|-0.098746896| 3.04923| 1.176214| 9.548782| -| 8.612427| 2.5036244| 5.6790237| 0.66474205|2.8870275| 1.2051733| 9.330128| -| 10.096699| 3.399722| 6.9909| 1.7478468|2.7642853| 1.1119363| 9.541455| -+-----------+----------+----------+------------+---------+----------+----------+ -Total line number = 48 -``` - -Comparing the predicted oil temperature with the actual values gives the following image. - -The data before 10/24 00:00 is the past data input to the model, the blue line after 10/24 00:00 is the oil temperature forecast given by the model, and the red line is the actual oil temperature from the dataset (used for comparison).
 - -![](/img/AINode-analysis1.png) - -As can be seen, the model uses the six load series and the corresponding oil temperatures from the past 96 hours (4 days), together with the inter-series relationships it learned during training, to forecast the oil temperature for the next 48 hours (2 days). After visualisation, the predicted curve stays highly consistent in trend with the actual results. - -### 6.2 Power Prediction - -Monitoring of current, voltage and power data is required in substations for detecting potential grid problems, identifying faults in the power system, effectively managing grid loads and analysing power system performance and trends. - -We used the current, voltage and power data from a substation to form a dataset from a real scenario. The dataset consists of data such as A-phase voltage, B-phase voltage, and C-phase voltage, collected every 5 to 6 seconds over a time span of nearly four months. - -The test set data content is [data](/img/data.csv). - -On this dataset, the model inference function of IoTDB-ML can predict the C-phase voltage over a future period from the previous values and corresponding timestamps of the A-phase, B-phase and C-phase voltages, supporting the monitoring and management of the substation. - -#### Step 1: Data Import - -Users can import the dataset using `import-csv.sh` in the tools folder - -```Bash -bash ./import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f ../../data.csv -``` - -#### Step 2: Model Import - -We can select built-in models or registered models in IoTDB CLI for subsequent inference. - -We use the built-in model STLForecaster for prediction. STLForecaster is a time series forecasting method based on the STL implementation in the statsmodels library.
- -#### Step 3: Model Inference - -```Shell -IoTDB> select * from root.eg.voltage limit 96 -+-----------------------------+------------------+------------------+------------------+ -| Time|root.eg.voltage.s0|root.eg.voltage.s1|root.eg.voltage.s2| -+-----------------------------+------------------+------------------+------------------+ -|2023-02-14T20:38:32.000+08:00| 2038.0| 2028.0| 2041.0| -|2023-02-14T20:38:38.000+08:00| 2014.0| 2005.0| 2018.0| -|2023-02-14T20:38:44.000+08:00| 2014.0| 2005.0| 2018.0| -...... -|2023-02-14T20:47:52.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:47:57.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:48:03.000+08:00| 2024.0| 2016.0| 2027.0| -+-----------------------------+------------------+------------------+------------------+ -Total line number = 96 - -IoTDB> call inference(_STLForecaster, "select s0,s1,s2 from root.eg.voltage", window=head(96),predict_length=48) -+---------+---------+---------+ -| output0| output1| output2| -+---------+---------+---------+ -|2026.3601|2018.2953|2029.4257| -|2019.1538|2011.4361|2022.0888| -|2025.5074|2017.4522|2028.5199| -...... - -|2022.2336|2015.0290|2025.1023| -|2015.7241|2008.8975|2018.5085| -|2022.0777|2014.9136|2024.9396| -|2015.5682|2008.7821|2018.3458| -+---------+---------+---------+ -Total line number = 48 -``` - -Comparing the predicted results of the C-phase voltage with the real results, we can get the following image. - -The data before 02/14 20:48 represents the past data input to the model, the blue line after 02/14 20:48 is the predicted result of phase C voltage given by the model, while the red line is the actual phase C voltage data from the dataset (used for comparison). - -![](/img/AINode-analysis2.png) - -It can be seen that we used the voltage data from the past 10 minutes and, based on the previously learned inter-sequence relationships, modeled the possible changes in the phase C voltage data for the next 5 minutes. 
The visualized forecast curve shows a certain degree of synchronicity with the actual results in terms of trend. - -### 6.3 Anomaly Detection - -In the civil aviation and transport industry, there is a need for anomaly detection on the number of air passengers. The results of anomaly detection can be used to guide adjustments to flight scheduling and make operations more efficient. - -Airline Passengers is a time-series dataset that records the number of international air passengers between 1949 and 1960, sampled at one-month intervals. The dataset contains a single time series. The dataset is [airline](/img/airline.csv). -On this dataset, the model inference function of IoTDB-ML can support the transport industry by capturing the changing patterns of the series in order to detect anomalous time points. - -#### Step 1: Data Import - -Users can import the dataset using `import-csv.sh` in the tools folder - -```Bash -bash ./import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f ../../data.csv -``` - -#### Step 2: Model Inference - -IoTDB has some built-in machine learning algorithms that can be used directly. A sample run using one of the anomaly detection algorithms is shown below: - -```Shell -IoTDB> select * from root.eg.airline -+-----------------------------+------------------+ -| Time|root.eg.airline.s0| -+-----------------------------+------------------+ -|1949-01-31T00:00:00.000+08:00| 224.0| -|1949-02-28T00:00:00.000+08:00| 118.0| -|1949-03-31T00:00:00.000+08:00| 132.0| -|1949-04-30T00:00:00.000+08:00| 129.0| -...... -|1960-09-30T00:00:00.000+08:00| 508.0| -|1960-10-31T00:00:00.000+08:00| 461.0| -|1960-11-30T00:00:00.000+08:00| 390.0| -|1960-12-31T00:00:00.000+08:00| 432.0| -+-----------------------------+------------------+ -Total line number = 144 - -IoTDB> call inference(_Stray, "select s0 from root.eg.airline", k=2) -+-------+ -|output0| -+-------+ -| 0| -| 0| -| 0| -| 0| -......
-| 1| -| 1| -| 0| -| 0| -| 0| -| 0| -+-------+ -Total line number = 144 -``` - -Plotting the points detected as anomalies gives the following image, where the blue curve is the original time series and the red dots mark the time points that the algorithm detects as anomalies. - -![](/img/s6.png) - -It can be seen that the Stray model has modelled the changes in the input sequence and successfully detected the time points where anomalies occur. \ No newline at end of file diff --git a/src/UserGuide/V1.3.x/API/Programming-OPC-DA_timecho.md b/src/UserGuide/V1.3.x/API/Programming-OPC-DA_timecho.md deleted file mode 100644 index 80e568300..000000000 --- a/src/UserGuide/V1.3.x/API/Programming-OPC-DA_timecho.md +++ /dev/null @@ -1,209 +0,0 @@ - - -# OPC DA Protocol - -## 1. OPC DA - -OPC DA (OPC Data Access) is a communication protocol standard in the field of industrial automation and a core part of the classic OPC (OLE for Process Control) technology. Its primary goal is to enable real-time data exchange between industrial devices and software (such as SCADA, HMI, and databases) in a Windows environment. OPC DA is implemented based on COM/DCOM and is a lightweight protocol with two roles: server and client. - -* **Server:** Can be regarded as a pool of items, storing the latest data and status of each item. All items can only be managed on the server side; clients can only read and write data and have no authority to manipulate metadata. - -![](/img/opc-da-1-1.png) - -* **Client:** After connecting to the server, the client needs to define a custom group (this group is only relevant to the client) and create items with the same names as those on the server. The client can then read and write the items it has created. - -![](/img/opc-da-1-2-en.png) - -## 2. OPC DA Sink - -IoTDB (available since V1.3.5.2 for V1.x) provides an OPC DA Sink that supports pushing tree-model data to a local COM server plugin.
It encapsulates the OPC DA interface specifications and their inherent complexity, significantly simplifying the integration process. The data flow diagram for the OPC DA Sink is shown below. - -![](/img/opc-da-2-1-en.png) - -### 2.1 SQL Syntax - -```SQL ----- Note: The clsID here needs to be replaced with your own clsID -create pipe opc ( - 'sink'='opc-da-sink', - --- 'opcda.progid'='opcserversim.Instance.1' - 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C' -); -``` - -### 2.2 Parameter Description - -| **Parameter** | **Description** | **Value Range** | **Required** | -| ----------------------------- | ----------------------------------------------------------------------------------------------------------- | ------------------------------- | ----------------------------------------- | -| sink | OPC DA Sink | String: opc-da-sink | Yes | -| sink.opcda.clsid | The ClsID (unique identifier string) of the OPC Server. It is recommended to use clsID instead of progID. | String | Either clsID or progID must be provided | -| sink.opcda.progid | The ProgID of the OPC Server. If clsID is available, it is preferred over progID. | String | Either clsID or progID must be provided | - - -### 2.3 Mapping Specifications - -When used, IoTDB will push the latest data from its tree model to the server. The itemID for the data is the full path of the time series in the tree model, such as root.a.b.c.d. Note that, according to the OPC DA standard, clients cannot directly create items on the server side. Therefore, the server must pre-create items corresponding to IoTDB's time series with the itemID and the appropriate data type.
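
The mapping rule above can be checked before a pipe is created. The sketch below is a hypothetical pre-flight check in Python, not part of IoTDB or of any OPC library: `EXPECTED_VARIANT` copies the data type correspondence listed in this section, while `check_items`, `series_types`, and `server_items` are illustrative names for data you would collect yourself from IoTDB metadata and from the OPC DA server's item list.

```python
from typing import Dict, List

# IoTDB data type -> OPC DA variant type, copied from the correspondence table.
EXPECTED_VARIANT: Dict[str, str] = {
    "INT32": "VT_I4",
    "INT64": "VT_I8",
    "FLOAT": "VT_R4",
    "DOUBLE": "VT_R8",
    "TEXT": "VT_BSTR",
    "BOOLEAN": "VT_BOOL",
    "DATE": "VT_DATE",
    "TIMESTAMP": "VT_DATE",
    "BLOB": "VT_BSTR",
    "STRING": "VT_BSTR",
}


def check_items(series_types: Dict[str, str], server_items: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means every series has a matching item.

    series_types: full time series path -> IoTDB data type
    server_items: itemID -> variant type already created on the OPC DA server
    """
    problems = []
    for path, iotdb_type in sorted(series_types.items()):
        expected = EXPECTED_VARIANT[iotdb_type]
        actual = server_items.get(path)  # the itemID is the full series path
        if actual is None:
            problems.append(f"{path}: no item on the server")
        elif actual != expected:
            problems.append(f"{path}: server item is {actual}, expected {expected}")
    return problems


series = {"root.a.b.c.d": "DOUBLE", "root.a.b.c.e": "INT32"}
items = {"root.a.b.c.d": "VT_R4"}
for problem in check_items(series, items):
    print(problem)
```

A missing item and a mismatched type are reported separately because they correspond to different failures on the server side: an unknown itemID versus a type the server cannot convert.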
- -* Data type correspondence is as follows: - -| IoTDB | OPC-DA Server | -| ----------- | ----------------------------------------------------------- | -| INT32 | VT\_I4 | -| INT64 | VT\_I8 | -| FLOAT | VT\_R4 | -| DOUBLE | VT\_R8 | -| TEXT | VT\_BSTR | -| BOOLEAN | VT\_BOOL | -| DATE | VT\_DATE | -| TIMESTAMP | VT\_DATE | -| BLOB | VT_BSTR (Variant does not support VT_BLOB, so VT_BSTR is used as a substitute) | -| STRING | VT\_BSTR | - -### 2.4 Common Error Codes - -| Symbol | Error Code | Description | -| ----------------------------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| OPC\_E\_BADTYPE | 0xC0040004 | The server cannot convert the data between the specified format/requested data type and the canonical data type. This means the server's data type does not match IoTDB's registered type. | -| OPC\_E\_UNKNOWNITEMID | 0xC0040007 | The item ID is not defined in the server's address space (when adding or validating), or the item ID no longer exists in the server's address space (when reading or writing). This means IoTDB's measurement point does not have a corresponding itemID on the server. | -| OPC\_E\_INVALIDITEMID | 0xC0040008 | The itemID does not conform to the server's syntax specifications. | -| REGDB\_E\_CLASSNOTREG | 0x80040154 | Class not registered | -| RPC\_S\_SERVER\_UNAVAILABLE | 0x800706BA | RPC service unavailable | -| DISP\_E\_OVERFLOW | 0x8002000A | Exceeds the maximum value of the type | -| DISP\_E\_BADVARTYPE | 0x80020005 | Type mismatch | - - -### 2.5 Usage Limitations - -* Only supports COM and can only be used on Windows. -* A small amount of old data may be pushed after restarting, but new data will eventually be pushed. -* Currently, only tree-model data is supported. - -## 3. 
Usage Steps -### 3.1 Prerequisites -1. Windows environment, version >= 8. -2. IoTDB is installed and running normally. -3. OPC DA Server is installed. - -* Using Simple OPC Server Simulator as an example: - -![](/img/opc-da-3-1.png) - -* Double-click an item to modify its name (itemID), data, data type, and other information. -* Right-click an item to delete it, update its value, or create a new item. - -![](/img/opc-da-3-2.png) - -4. OPC DA Client is installed. -* Using KepwareServerEX's quickClient as an example: -* In Kepware, the OPC DA Client can be opened as follows: - -![](/img/opc-da-3-3-en.png) - -![](/img/opc-da-3-4-en.png) - - -### 3.2 Configuration Modifications - -Modify the server configuration so that IoTDB's write client and Kepware's read client do not connect to two different server instances, which would make debugging impossible. - -* First, press Win+R, type `dcomcnfg` in the Run dialog, and open the DCOM component configuration: - -![](/img/opc-da-3-5-en.png) - -* Navigate to Component Services -> Computers -> My Computer -> DCOM Config, find AGG Software Simple OPC Server Simulator, right-click, and select "Properties": - -![](/img/opc-da-3-6-en.png) - -* Under Identity, change User Account to Interactive User. Note: Do not use Launching User, as this may cause the two clients to start different server instances. - -![](/img/opc-da-3-7-en.png) - -### 3.3 Obtaining clsID -1. Method 1: Obtain via DCOM Configuration -* Press Win+R, type `dcomcnfg` in the Run dialog, and open the DCOM component configuration. -* Navigate to Component Services -> Computers -> My Computer -> DCOM Config, find AGG Software Simple OPC Server Simulator, right-click, and select "Properties". -* Under General, you can obtain the application's clsID, which will be used for the opc-da-sink connection later. Note: Do not include the curly braces. - -![](/img/opc-da-3-8-en.png) - -2. Method 2: clsID and progID can also be obtained directly from the server.
- -* Click `Help` > `Show OPC Server Info` - -![](/img/opc-da-3-9.png) - -* The pop-up window will display the information. - -![](/img/opc-da-3-10-en.png) - -### 3.4 Writing Data -#### 3.4.1 DA Server -1. Create a new item in the DA Server with the same name and type as the item to be written in IoTDB. - -![](/img/opc-da-3-11.png) - -2. Connect to the server in Kepware: - -![](/img/opc-da-3-12-en.png) - -3. Right-click the server to create a new group (the group name can be arbitrary): - -![](/img/opc-da-3-13-en.png) - -![](/img/opc-da-3-14-en.png) - -4. Right-click to create a new item with the same name as the one created earlier. - -![](/img/opc-da-3-15-en.png) - -![](/img/opc-da-3-16-en.png) - -![](/img/opc-da-3-17-en.png) - -#### 3.4.2 IoTDB - -1. Start IoTDB. -2. Create a Pipe. - -```SQL -create pipe opc ('sink'='opc-da-sink', 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C') -``` - -* Note: If the creation fails with the error Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 1107: Failed to connect to server, error code: 0x80040154, refer to this solution: https://opcexpert.com/support/0x80040154-class-not-registered/. - -3. Create a time series (if automatic metadata creation is enabled, this step can be skipped). - -```SQL -create timeseries root.a.b.c.r string; -``` - -4. Insert data. - -```SQL -insert into root.a.b.c (time, r) values(10000, "SomeString") -``` - -### 3.5 Verifying Data - -Check the data in Quick Client; it should have been updated. - -![](/img/opc-da-3-18-en.png) \ No newline at end of file diff --git a/src/UserGuide/V1.3.x/API/Programming-OPC-UA_timecho.md b/src/UserGuide/V1.3.x/API/Programming-OPC-UA_timecho.md deleted file mode 100644 index 5cca37d1c..000000000 --- a/src/UserGuide/V1.3.x/API/Programming-OPC-UA_timecho.md +++ /dev/null @@ -1,295 +0,0 @@ - - -# OPC UA Protocol - -## OPC UA Subscription Data - -This feature allows users to subscribe to data from IoTDB using the OPC UA protocol. 
The communication modes for subscription data support both Client/Server and Pub/Sub. - -Note: This feature is not about collecting data from external OPC Servers and writing it into IoTDB. - -![](/img/opc-ua-new-1-en.png) - -## OPC Service Startup Method - -### Syntax - -The syntax to start the OPC UA protocol: - -```sql -create pipe p1 - with source (...) - with processor (...) - with sink ('sink' = 'opc-ua-sink', - 'sink.opcua.tcp.port' = '12686', - 'sink.opcua.https.port' = '8443', - 'sink.user' = 'root', - 'sink.password' = 'root', - 'sink.opcua.security.dir' = '...' - ) -``` - -### Parameters - -| key | value | value range | required or not | default value | -| :--------------------------------- | :-------------------------------------------------- | :------------------------------------------------------- | :-------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| sink | OPC UA SINK | String: opc-ua-sink | Required | | -| sink.opcua.model | OPC UA model used | String: client-server / pub-sub | Optional | pub-sub | -| sink.opcua.tcp.port | OPC UA's TCP port | Integer: \[0, 65536] | Optional | 12686 | -| sink.opcua.https.port | OPC UA's HTTPS port | Integer: \[0, 65536] | Optional | 8443 | -| sink.opcua.security.dir | Directory for OPC UA's keys and certificates | String: Path, supports absolute and relative directories | Optional | Opc_security folder/in the conf directory of the DataNode related to iotdb
If there is no conf directory for iotdb (such as launching DataNode in IDEA), it will be the iotdb_opc_Security folder/\in the user's home directory | -| sink.opcua.enable-anonymous-access | Whether OPC UA allows anonymous access | Boolean | Optional | true | -| sink.user | User for OPC UA, specified in the configuration | String | Optional | root | -| sink.password | Password for OPC UA, specified in the configuration | String | Optional | root | - -### Example - -```Bash -create pipe p1 - with sink ('sink' = 'opc-ua-sink', - 'sink.user' = 'root', - 'sink.password' = 'root'); -start pipe p1; -``` - -### Usage Limitations - -1. After starting the protocol, data needs to be written to establish a connection. Only data after the connection is established can be subscribed to. -2. Recommended for use in standalone mode. In distributed mode, each IoTDB DataNode acts as an independent OPC Server providing data and requires separate subscription. - - -## Examples of Two Communication Modes - -### Client / Server Mode - -In this mode, IoTDB's stream processing engine establishes a connection with the OPC UA Server via an OPC UA Sink. The OPC UA Server maintains data within its Address Space, from which IoTDB can request and retrieve data. Additionally, other OPC UA Clients can access the data on the server. - -* Features: - * OPC UA organizes device information received from the Sink into folders under the Objects folder according to a tree model. - * Each measurement point is recorded as a variable node, storing the latest value from the current database. - * OPC UA cannot delete data or change data type settings. - -#### Preparation Work - -1. Take UAExpert client as an example, download the UAExpert client: - -2. Install UAExpert and fill in your own certificate information. - -#### Quick Start -##### Scenarios Supporting the None Security Policy - -1. Use the following SQL to create and start the OPC UA Sink in client-server mode. 
For detailed syntax, please refer to: [IoTDB OPC Server Syntax](#syntax) - - ```sql - create pipe p1 with sink ('sink'='opc-ua-sink', 'opcua.security-policy'='AES128_SHA256_RSAOAEP, AES256_SHA256_RSAPSS, BASIC256SHA256, NONE'); - ``` - - Note: Since version V1.3.7.2, None is no longer supported by default. To use it, you must manually enable it via the security-policy parameter as shown above. - -2. Write some data. - - ```sql - insert into root.test.db(time, s2) values(now(), 2) - ``` - - ​The metadata is automatically created and enabled here. - -3. Configure the connection to IoTDB in UAExpert, where the password should be set to the one defined in the sink.password parameter (using the default password "root" as an example): - - ::: center - - - - ::: - - ::: center - - - - ::: - -4. After trusting the server's certificate, you can see the written data in the Objects folder on the left. - - ::: center - - - - ::: - - ::: center - - - - ::: - - Note: Since the SecurityPolicy is set to None, mutual certificate trust is not required. For production environments, it is recommended to use a non-None SecurityPolicy for connection, which requires mutual certificate trust. For operations, refer to the Pub/Sub mode section below. In the Client/Server certificate directory (search for the keyword keyStore in the printed logs), move the contents in reject to trusted/certs. Follow the sequence: connect → move server directory → connect → move client directory → connect. - - -5. You can drag the node on the left to the center and display the latest value of that node: - - ::: center - - - - ::: - -##### Scenarios Not Supporting the None Security Policy -1. Use the following SQL to create and start the OPC UA service. - ```SQL - create pipe p1 with sink ('sink'='opc-ua-sink'); - ``` - - Note: Since version V1.3.7.2, OpcUaSink no longer supports None mode by default for security considerations. - -2. Insert some test data. 
- ```SQL - insert into root.test.db(time, s2) values(now(), 2); - ``` - -3. Configure the IoTDB connection in UAExpert: - - - Do not access the URL directly; endpoints must be discovered using the Discover method - - The client first sends a GetEndpoints request with the None policy to retrieve the endpoint list - - It then selects the corresponding encrypted endpoint based on the configured Basic256Sha256 + SignAndEncrypt to establish an encrypted connection - - ![](/img/opc-ua-un-none-1.png) - -4. Use the same username and password configuration as above. After selecting the relevant connection mode (Sign / Sign & Encrypt), if the following prompt appears, click Ignore to connect directly. - - ![](/img/opc-ua-un-none-2.png) - - -### Pub / Sub Mode - -In this mode, IoTDB's stream processing engine sends data change events to the OPC UA Server through an OPC UA Sink. These events are published to the server's message queue and managed through Event Nodes. Other OPC UA Clients can subscribe to these Event Nodes to receive notifications upon data changes. - -- Features: - - - Each measurement point is wrapped as an Event Node in OPC UA. - - - The relevant fields and their meanings are as follows: - - | Field | Meaning | Type (Milo) | Example | - | :--------- | :--------------------------------- | :------------ | :-------------------- | - | Time | Timestamp | DateTime | 1698907326198 | - | SourceName | Full path of the measurement point | String | root.test.opc.sensor0 | - | SourceNode | Data type of the measurement point | NodeId | Int32 | - | Message | Data | LocalizedText | 3.0 | - - - Events are only sent to clients that are already listening; if a client is not connected, the Event will be ignored. - - If data is deleted, the information cannot be pushed to clients. 
- - -#### Preparation Work - -The code is located in the [opc-ua-sink](https://github.com/apache/iotdb/tree/rc/1.3.5/example/pipe-opc-ua-sink/src/main/java/org/apache/iotdb/opcua) under the iotdb-example package. - -The code includes: - -- The main class (ClientTest) -- Client certificate-related logic(IoTDBKeyStoreLoaderClient) -- Client configuration and startup logic(ClientExampleRunner) -- The parent class of ClientTest(ClientExample) - -#### Quick Start - -The steps are as follows: - -1. Start IoTDB and write some data. - - ```sql - insert into root.a.b(time, c, d) values(now(), 1, 2); - ``` - - ​The metadata is automatically created and enabled here. - -2. Use the following SQL to create and start the OPC UA Sink in Pub-Sub mode. For detailed syntax, please refer to: [IoTDB OPC Server Syntax](#syntax) - - ```sql - create pipe p1 with sink ('sink'='opc-ua-sink', - 'sink.opcua.model'='pub-sub'); - start pipe p1; - ``` - - ​ At this point, you can see that the opc certificate-related directory has been created under the server's conf directory. - - ::: center - - - - ::: - -3. Run the Client connection directly; the Client's certificate will be rejected by the server. - - ::: center - - - - ::: - -4. Go to the server's sink.opcua.security.dir directory, then to the pki's rejected directory, where the Client's certificate should have been generated. - - ::: center - - - - ::: - -5. Move (not copy) the client's certificate into (not into a subdirectory of) the trusted directory's certs folder in the same directory. - - ::: center - - - - ::: - -6. Open the Client connection again; the server's certificate should now be rejected by the Client. - - ::: center - - - - ::: - -7. Go to the client's /client/security directory, then to the pki's rejected directory, and move the server's certificate into (not into a subdirectory of) the trusted directory. - - ::: center - - - - ::: - -8. 
Open the Client again; two-way trust is now established, and the Client can connect to the server. - -9. Write data to the server, and the Client will print out the received data. - - ::: center - - - - ::: - -### Notes - -1. **Stand-alone and cluster:** It is recommended to use a 1C1D (one ConfigNode and one DataNode) stand-alone deployment. If there are multiple DataNodes in the cluster, data may be scattered across the DataNodes, and it may not be possible to listen to all the data. - -2. **No Need to Operate Root Directory Certificates:** During the certificate operation process, there is no need to touch the `iotdb-server.pfx` certificate under the IoTDB security root directory or the `example-client.pfx` certificate under the client security directory. When the Client and Server connect bidirectionally, they send the root directory certificate to each other. If it is the first time the other party sees this certificate, it is placed in the rejected directory. If the certificate is in trusted/certs, the other party can trust it. - -3. **It is Recommended to Use Java 17+:** - In JVM 8 versions, there may be a key length restriction, resulting in an "Illegal key size" error. For specific versions (such as jdk.1.8u151+), you can add `Security.setProperty("crypto.policy", "unlimited");` in the create client of ClientExampleRunner to solve this, or you can download the unlimited policy packages `local_policy.jar` and `US_export_policy.jar` to replace the packages in `JDK/jre/lib/security`. Download link: .
diff --git a/src/UserGuide/V1.3.x/Background-knowledge/Cluster-Concept_timecho.md b/src/UserGuide/V1.3.x/Background-knowledge/Cluster-Concept_timecho.md deleted file mode 100644 index 22afc3aa4..000000000 --- a/src/UserGuide/V1.3.x/Background-knowledge/Cluster-Concept_timecho.md +++ /dev/null @@ -1,118 +0,0 @@ - - -# Common Concepts - -## Sql_dialect Related Concepts - -| Concept | Meaning | -| ----------------------- |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| sql_dialect | Tree model: manages devices and measurement points, manages data in a hierarchical path manner, where one path corresponds to one measurement point of a device. | -| Schema | Schema is the data model information of the database, i.e., tree structure. It includes definitions such as the names and data types of measurement points. | -| Device | Corresponds to a physical device in an actual scenario, usually containing multiple measurement points. | -| Timeseries | Also known as: physical quantity, time series, timeline, point location, semaphore, indicator, measurement value, etc. It is a time series formed by arranging multiple data points in ascending order of timestamps. Usually, a Timeseries represents a collection point that can periodically collect physical quantities of the environment it is in. | -| Encoding | Encoding is a compression technique that represents data in binary form to improve storage efficiency. IoTDB supports various encoding methods for different types of data. 
For more detailed information, please refer to: [Encoding-and-Compression](../Technical-Insider/Encoding-and-Compression.md) | -| Compression | After data encoding, IoTDB uses compression technology to further compress binary data to enhance storage efficiency. IoTDB supports multiple compression methods. For more detailed information, please refer to: [Encoding-and-Compression](../Technical-Insider/Encoding-and-Compression.md) | - -## Distributed Related Concepts - -The following figure shows a common IoTDB 3C3D (3 ConfigNodes, 3 DataNodes) cluster deployment pattern: - - - -IoTDB's cluster includes the following common concepts: - -- Nodes (ConfigNode, DataNode, AINode) -- Regions (SchemaRegion, DataRegion) -- Replica Groups - -These concepts are introduced in the following sections. - - -### Nodes - -An IoTDB cluster includes three types of nodes (processes): ConfigNode (management node), DataNode (data node), and AINode (analysis node), as shown below: - -- ConfigNode: Manages cluster node information, configuration information, user permissions, metadata, partition information, etc., and is responsible for the scheduling of distributed operations and load balancing. All ConfigNodes are full backups of each other, as shown by ConfigNode-1, ConfigNode-2, and ConfigNode-3 in the figure above. -- DataNode: Serves client requests and is responsible for data storage and computation, as shown by DataNode-1, DataNode-2, and DataNode-3 in the figure above. -- AINode: Provides machine learning capabilities, supports the registration of trained machine learning models, and allows model inference through SQL calls. It ships with built-in self-developed time-series large models and common machine learning algorithms (such as prediction and anomaly detection). - -### Data Partitioning - -In IoTDB, both metadata and data are divided into small partitions, namely Regions, which are managed by various DataNodes in the cluster.
- -- SchemaRegion: Metadata partition, managing the metadata of a part of devices and measurement points. SchemaRegions with the same RegionID on different DataNodes are mutual replicas, as shown in SchemaRegion-1 in the figure above, which has three replicas located on DataNode-1, DataNode-2, and DataNode-3. -- DataRegion: Data partition, managing the data of a part of devices for a certain period of time. DataRegions with the same RegionID on different DataNodes are mutual replicas, as shown in DataRegion-2 in the figure above, which has two replicas located on DataNode-1 and DataNode-2. -- For specific partitioning algorithms, please refer to: [Data Partitioning](../Technical-Insider/Cluster-data-partitioning.md) - -### Replica Groups - -The number of replicas for data and metadata can be configured. The recommended configurations for different deployment modes are as follows, where multi-replication can provide high-availability services. - -| Category | Parameter | Stand-Alone Recommended Configuration | Cluster Recommended Configuration | -| :----- | :------------------------ | :----------- | :----------- | -| Schema | schema_replication_factor | 1 | 3 | -| Data | data_replication_factor | 1 | 2 | - - -## Deployment Related Concepts - -IoTDB has three operating modes: Stand-Alone mode, Cluster mode, and Dual-Active mode. - -### Stand-Alone Mode - -An IoTDB Stand-Alone instance includes 1 ConfigNode and 1 DataNode, i.e., 1C1D. - - -- **Features**: Easy for developers to install and deploy, with low deployment and maintenance costs and convenient operations. -- **Applicable Scenarios**: Scenarios with limited resources or low requirements for high availability, such as edge-side servers.
-- **Deployment Method**: [Stand-Alone-Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -### Dual-Active Mode - -Dual-active deployment is a feature of TimechoDB Enterprise Edition in which two independent instances synchronize bidirectionally and can both provide services at the same time. When one instance is restarted after a shutdown, the other instance will resume transmission of the missing data. - - -> An IoTDB dual-active instance usually consists of 2 single-machine nodes, i.e., 2 sets of 1C1D. Each instance can also be a cluster. - -- **Features**: The most resource-efficient high-availability solution. -- **Applicable Scenarios**: Scenarios with limited resources (only two servers) but requiring high-availability capabilities. -- **Deployment Method**: [Dual-Active-Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -### Cluster Mode - -An IoTDB cluster instance consists of 3 ConfigNodes and no fewer than 3 DataNodes, usually 3 DataNodes, i.e., 3C3D. When some nodes fail, the remaining nodes can still provide services, ensuring the high availability of the database service, and database performance can be improved by adding nodes. - -- **Features**: High availability and scalability; system performance can be improved by adding DataNodes. -- **Applicable Scenarios**: Enterprise-level application scenarios requiring high availability and reliability. -- **Deployment Method**: [Cluster-Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - -### Summary of Features - -| Dimension | Stand-Alone Mode | Dual-Active Mode | Cluster Mode | -| ------------ | ---------------------------- | ------------------------ | ------------------------ | -| Applicable Scenarios | Edge-side deployment, scenarios with low requirements for high availability | High-availability business, disaster recovery scenarios, etc.
| High-availability business, disaster recovery scenarios, etc. | -| Number of Machines Required | 1 | 2 | ≥3 | -| Security and Reliability | Cannot tolerate single-point failures | High, can tolerate single-point failures | High, can tolerate single-point failures | -| Scalability | Can expand DataNodes to improve performance | Each instance can be expanded as needed | Can expand DataNodes to improve performance | -| Performance | Can be expanded with the number of DataNodes | Same as the performance of one of the instances | Can be expanded with the number of DataNodes | - -- The deployment steps for single-machine mode and cluster mode are similar (adding ConfigNodes and DataNodes one by one), with only the number of replicas and the minimum number of nodes that can provide services being different. \ No newline at end of file diff --git a/src/UserGuide/V1.3.x/Basic-Concept/Operate-Metadata_timecho.md b/src/UserGuide/V1.3.x/Basic-Concept/Operate-Metadata_timecho.md deleted file mode 100644 index 1a21e8fa9..000000000 --- a/src/UserGuide/V1.3.x/Basic-Concept/Operate-Metadata_timecho.md +++ /dev/null @@ -1,1360 +0,0 @@ - - -# Timeseries Management - -## Database Management - -### Create Database - -According to the storage model we can set up the corresponding database. Two SQL statements are supported for creating databases, as follows: - -``` -IoTDB > create database root.ln -IoTDB > create database root.sgcc -``` - -We can thus create two databases using the above two SQL statements. - -It is worth noting that 1 database is recommended. - -When the path itself or the parent/child layer of the path is already created as database, the path is then not allowed to be created as database. For example, it is not feasible to create `root.ln.wf01` as database when two databases `root.ln` and `root.sgcc` exist. 
The system gives the corresponding error prompt as shown below: - -``` -IoTDB> CREATE DATABASE root.ln.wf01 -Msg: 300: root.ln has already been created as database. -IoTDB> create database root.ln.wf01 -Msg: 300: root.ln has already been created as database. -``` - -Database Node Naming Rules: -1. Node names may contain: **Chinese/English letters, Digits (0-9), Underscore (\_), Period (.), Backtick (\`)** -2. The entire name must be enclosed in **backticks (\`)** if: - - It consists solely of digits (e.g., 12345) - - It contains special characters (. or \_) that may cause ambiguity (e.g., db.01, \_temp) -3. Escaping Backticks: - If the node name itself contains a backtick (\`), use **two consecutive backticks (\`\`)** to represent a single backtick. Example: To name a node as \`db123\`\` (containing one backtick), write it as \`db123\`\`\`. - -In addition, when IoTDB is deployed on Windows or macOS, the LayerName is case-insensitive, so the databases `root.ln` and `root.LN` cannot both be created. - -### Show Databases - -After creating the database, we can use the [SHOW DATABASES](../SQL-Manual/SQL-Manual.md) statement and [SHOW DATABASES \](../SQL-Manual/SQL-Manual.md) to view the databases. The SQL statements are as follows: - -``` -IoTDB> SHOW DATABASES -IoTDB> SHOW DATABASES root.** -``` - -The result is as follows: - -``` -+-------------+----+-------------------------+-----------------------+-----------------------+ -|database| ttl|schema_replication_factor|data_replication_factor|time_partition_interval| -+-------------+----+-------------------------+-----------------------+-----------------------+ -| root.sgcc|null| 2| 2| 604800| -| root.ln|null| 2| 2| 604800| -+-------------+----+-------------------------+-----------------------+-----------------------+ -Total line number = 2 -It costs 0.060s -``` - -### Delete Database - -Users can use the `DELETE DATABASE ` statement to delete all databases matching the pathPattern.
Please note that the data in the database will also be deleted. - -``` -IoTDB > DELETE DATABASE root.ln -IoTDB > DELETE DATABASE root.sgcc -// delete all data, all timeseries and all databases -IoTDB > DELETE DATABASE root.** -``` - -### Count Databases - -Users can use the `COUNT DATABASES ` statement to count the number of databases. A `PathPattern` can be specified to count only the databases matching that `PathPattern`. - -The SQL statements are as follows: - -``` -IoTDB> count databases -IoTDB> count databases root.* -IoTDB> count databases root.sgcc.* -IoTDB> count databases root.sgcc -``` - -The result is as follows: - -``` -+-------------+ -| database| -+-------------+ -| root.sgcc| -| root.turbine| -| root.ln| -+-------------+ -Total line number = 3 -It costs 0.003s - -+-------------+ -| database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.003s - -+-------------+ -| database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| database| -+-------------+ -| 0| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| database| -+-------------+ -| 1| -+-------------+ -Total line number = 1 -It costs 0.002s -``` - -### Setting up heterogeneous databases (Advanced operations) - -Users who are familiar with IoTDB metadata modeling -can set up heterogeneous databases in IoTDB to cope with different production needs.
- -Currently, the following database heterogeneous parameters are supported: - -| Parameter | Type | Description | -| ------------------------- | ------- | --------------------------------------------- | -| TTL | Long | TTL of the Database | -| SCHEMA_REPLICATION_FACTOR | Integer | The schema replication number of the Database | -| DATA_REPLICATION_FACTOR | Integer | The data replication number of the Database | -| SCHEMA_REGION_GROUP_NUM | Integer | The SchemaRegionGroup number of the Database | -| DATA_REGION_GROUP_NUM | Integer | The DataRegionGroup number of the Database | - -Note the following when configuring heterogeneous parameters: - -+ TTL and TIME_PARTITION_INTERVAL must be positive integers. -+ SCHEMA_REPLICATION_FACTOR and DATA_REPLICATION_FACTOR must be smaller than or equal to the number of deployed DataNodes. -+ The function of SCHEMA_REGION_GROUP_NUM and DATA_REGION_GROUP_NUM are related to the parameter `schema_region_group_extension_policy` and `data_region_group_extension_policy` in iotdb-common.properties configuration file. Take DATA_REGION_GROUP_NUM as an example: - If `data_region_group_extension_policy=CUSTOM` is set, DATA_REGION_GROUP_NUM serves as the number of DataRegionGroups owned by the Database. - If `data_region_group_extension_policy=AUTO`, DATA_REGION_GROUP_NUM is used as the lower bound of the DataRegionGroup quota owned by the Database. That is, when the Database starts writing data, it will have at least this number of DataRegionGroups. - -Users can set any heterogeneous parameters when creating a Database, or adjust some heterogeneous parameters during a stand-alone/distributed IoTDB run. - -#### Set heterogeneous parameters when creating a Database - -The user can set any of the above heterogeneous parameters when creating a Database. The SQL statement is as follows: - -``` -CREATE DATABASE prefixPath (WITH databaseAttributeClause (COMMA? databaseAttributeClause)*)? 
-``` - -For example: - -``` -CREATE DATABASE root.db WITH SCHEMA_REPLICATION_FACTOR=1, DATA_REPLICATION_FACTOR=3, SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -#### Adjust heterogeneous parameters at run time - -Users can adjust some heterogeneous parameters during the IoTDB runtime, as shown in the following SQL statement: - -``` -ALTER DATABASE prefixPath WITH databaseAttributeClause (COMMA? databaseAttributeClause)* -``` - -For example: - -``` -ALTER DATABASE root.db WITH SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -Note that only the following heterogeneous parameters can be adjusted at runtime: - -+ SCHEMA_REGION_GROUP_NUM -+ DATA_REGION_GROUP_NUM - -#### Show heterogeneous databases - -The user can query the specific heterogeneous configuration of each Database, and the SQL statement is as follows: - -``` -SHOW DATABASES DETAILS prefixPath? -``` - -For example: - -``` -IoTDB> SHOW DATABASES DETAILS -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|Database| TTL|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|MinSchemaRegionGroupNum|MaxSchemaRegionGroupNum|DataRegionGroupNum|MinDataRegionGroupNum|MaxDataRegionGroupNum| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|root.db1| null| 1| 3| 604800000| 0| 1| 1| 0| 2| 2| -|root.db2|86400000| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -|root.db3| null| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -Total 
line number = 3 -It costs 0.058s -``` - -The query results in each column are as follows: - -+ The name of the Database -+ The TTL of the Database -+ The schema replication number of the Database -+ The data replication number of the Database -+ The time partition interval of the Database -+ The current SchemaRegionGroup number of the Database -+ The required minimum SchemaRegionGroup number of the Database -+ The permitted maximum SchemaRegionGroup number of the Database -+ The current DataRegionGroup number of the Database -+ The required minimum DataRegionGroup number of the Database -+ The permitted maximum DataRegionGroup number of the Database - -### TTL - -IoTDB supports setting data retention time (TTL) at the device level, allowing the system to automatically and periodically delete old data to effectively control disk space and maintain high query performance and low memory usage. TTL is set in milliseconds by default. Once data expires, it cannot be queried or written, but physical deletion is delayed until compaction. Please note that changes to TTL may temporarily affect data queryability, and if TTL is reduced or removed, previously invisible data due to TTL may reappear. - -Important notes: -- TTL is set in milliseconds and is not affected by the time precision in the configuration file. -- Changes to TTL may affect data queryability. -- The system will eventually remove expired data, but there may be a delay. -- TTL determines data expiration based on the data point timestamp, not the ingestion time. -- The system supports setting up to 1000 TTL rules. When the limit is reached, existing rules must be removed before new ones can be added. - -#### TTL Path Rule -The path can only be prefix paths (i.e., the path cannot contain \* , except \*\* in the last level). -This path will match devices and also allows users to specify paths without asterisks as specific databases or devices. 
-When the path does not contain asterisks, the system checks whether it matches a database; if it does, both the path and path.\*\* are set at the same time. Note: Device TTL settings do not verify the existence of metadata, i.e., it is allowed to set TTL for a non-existent device. -``` -qualified paths: -root.** -root.db.** -root.db.group1.** -root.db -root.db.group1.d1 - -unqualified paths: -root.*.db -root.**.db.* -root.db.* -``` -#### TTL Applicable Rules -When a device is subject to multiple TTL rules, the more precise and longer rule takes priority. For example, for the device "root.bj.hd.dist001.turbine001", the rule "root.bj.hd.dist001.turbine001" takes precedence over "root.bj.hd.dist001.\*\*", and the rule "root.bj.hd.dist001.\*\*" takes precedence over "root.bj.hd.\*\*". -#### Set TTL -The set ttl operation can be understood as setting a TTL rule. For example, setting ttl to root.sg.group1.\*\* is equivalent to mounting ttl for all devices that match this path pattern. -The unset ttl operation unmounts the TTL for the corresponding path pattern; if there is no corresponding TTL, nothing is done. -To set an infinitely large TTL, use the INF keyword. -The SQL statement for setting TTL is as follows: -``` -set ttl to pathPattern 360000; -``` -This sets a TTL of 360,000 milliseconds on pathPattern. The pathPattern must not contain a wildcard (\*) in the middle; if it contains a wildcard at all, it must be a double asterisk (\*\*) at the last level. The pathPattern is used to match corresponding devices. -To maintain compatibility with older SQL syntax, if the user-provided pathPattern matches a database (db), the path pattern is automatically expanded to include all sub-paths denoted by path.\*\*. -For instance, writing "set ttl to root.sg 360000" will automatically be transformed into "set ttl to root.sg.\*\* 360000", which sets the TTL for all devices under root.sg. 
However, if the specified pathPattern does not match a database, the aforementioned logic does not apply. For example, writing "set ttl to root.sg.group 360000" will not be expanded to "root.sg.group.\*\*" since root.sg.group does not match a database. -It is also permissible to specify a particular device without a wildcard (*). -#### Unset TTL - -To unset TTL, use the following SQL statement: - -``` -IoTDB> unset ttl from root.ln -``` - -After the TTL is unset, all data will be accepted in `root.ln`. -``` -IoTDB> unset ttl from root.sgcc.** -``` - -Unset the TTL in the `root.sgcc` path. - -New syntax -``` -IoTDB> unset ttl from root.** -``` - -Old syntax -``` -IoTDB> unset ttl to root.** -``` -There is no functional difference between the old and new syntax, and they are compatible with each other. -The new syntax is just more conventional in terms of wording. - -Unset the TTL setting for all path patterns. - -#### Show TTL - -To show TTL, use the following SQL statements: - -show all ttl - -``` -IoTDB> SHOW ALL TTL -+--------------+--------+ -| path| TTL| -| root.**|55555555| -| root.sg2.a.**|44440000| -+--------------+--------+ -``` - -show ttl on pathPattern -``` -IoTDB> SHOW TTL ON root.db.**; -+--------------+--------+ -| path| TTL| -| root.db.**|55555555| -| root.db.a.**|44440000| -+--------------+--------+ -``` - -The SHOW ALL TTL example gives the TTL for all path patterns. -The SHOW TTL ON pathPattern shows the TTL for the path pattern specified. - -Display devices' ttl -``` -IoTDB> show devices -+---------------+---------+---------+ -| Device|IsAligned| TTL| -+---------------+---------+---------+ -|root.sg.device1| false| 36000000| -|root.sg.device2| true| INF| -+---------------+---------+---------+ -``` -Every device always has a TTL, so the value cannot be null. INF represents infinity. 
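The TTL path rule and the precedence rule described above can be illustrated with a small Python sketch. This is not IoTDB's implementation: the helper names `is_valid_ttl_path`, `rule_matches`, and `effective_ttl` are hypothetical, and the sketch assumes that `**` matches one or more trailing levels and that "more precise and longer" means the rule with the most literal (non-wildcard) levels wins.

```python
from typing import Dict, Optional


def is_valid_ttl_path(path: str) -> bool:
    """Return True if `path` is a prefix path usable in a TTL rule."""
    nodes = path.split(".")
    if nodes[0] != "root":
        return False
    # No wildcard is allowed before the last level.
    if any("*" in node for node in nodes[:-1]):
        return False
    # The last level is either a plain name or exactly "**".
    last = nodes[-1]
    return "*" not in last or last == "**"


def rule_matches(rule: str, device: str) -> bool:
    """Assumption: 'a.b.**' matches any device strictly below 'a.b'."""
    if rule.endswith(".**"):
        return device.startswith(rule[:-2])  # keeps the trailing dot
    return rule == device


def literal_depth(rule: str) -> int:
    """Number of levels in the rule that are not the '**' wildcard."""
    return sum(1 for node in rule.split(".") if node != "**")


def effective_ttl(rules: Dict[str, int], device: str) -> Optional[int]:
    """Pick the TTL of the most specific matching rule, or None if no rule matches."""
    matching = [rule for rule in rules if rule_matches(rule, device)]
    if not matching:
        return None
    return rules[max(matching, key=literal_depth)]
```

With the three rules from the example above, `effective_ttl` picks the exact device rule for `root.bj.hd.dist001.turbine001` and falls back to `root.bj.hd.dist001.**` for a sibling device under the same prefix.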
- - -## Device Template - -IoTDB supports the device template function, enabling different entities of the same type to share metadata, reduce the memory usage of metadata, and simplify the management of numerous entities and measurements. - - -### Create Device Template - -The SQL syntax for creating a device template is as follows: - -```sql -CREATE DEVICE TEMPLATE ALIGNED? '(' [',' ]+ ')' -``` - -**Example 1:** Create a template containing two non-aligned timeseries - -```shell -IoTDB> create device template t1 (temperature FLOAT encoding=RLE, status BOOLEAN encoding=PLAIN compression=SNAPPY) -``` - -**Example 2:** Create a template containing a group of aligned timeseries - -```shell -IoTDB> create device template t2 aligned (lat FLOAT encoding=Gorilla, lon FLOAT encoding=Gorilla) -``` - -The `lat` and `lon` measurements are aligned. - -![img](/img/%E6%A8%A1%E6%9D%BF.png) - -![img](/img/templateEN.jpg) - -### Set Device Template - -After a device template is created, it should be set to a specific path before creating related timeseries or inserting data. - -**Make sure that the related database has been created before setting the template.** - -**It is recommended to set the device template to a database path. Setting a device template to a path above the database is not recommended.** - -**It is forbidden to create timeseries under a path on which a device template is set. A device template shall not be set on a prefix path of an existing timeseries.** - -The SQL statement for setting a device template is as follows: - -```shell -IoTDB> set device template t1 to root.sg1.d1 -``` - -### Activate Device Template - -After setting the device template, with the system enabled to auto create schema, you can insert data into the timeseries. For example, suppose there's a database root.sg1 and t1 has been set to root.sg1.d1, then timeseries like root.sg1.d1.temperature and root.sg1.d1.status are available and data points can be inserted. 
- - -**Attention**: Before data is inserted, or if the system is not enabled to auto create schema, timeseries defined by the device template will not be created. You can use the following SQL statement to create the timeseries, that is, to activate the device template, before inserting data: - -```shell -IoTDB> create timeseries using device template on root.sg1.d1 -``` - -**Example:** Execute the following statements - -```shell -IoTDB> set device template t1 to root.sg1.d1 -IoTDB> set device template t2 to root.sg1.d2 -IoTDB> create timeseries using device template on root.sg1.d1 -IoTDB> create timeseries using device template on root.sg1.d2 -``` - -Show the time series: - -```sql -show timeseries root.sg1.** -```` - -```shell -+-----------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression|tags|attributes|deadband|deadband parameters| -+-----------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -|root.sg1.d1.temperature| null| root.sg1| FLOAT| RLE| SNAPPY|null| null| null| null| -| root.sg1.d1.status| null| root.sg1| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -| root.sg1.d2.lon| null| root.sg1| FLOAT| GORILLA| SNAPPY|null| null| null| null| -| root.sg1.d2.lat| null| root.sg1| FLOAT| GORILLA| SNAPPY|null| null| null| null| -+-----------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -``` - -Show the devices: - -```sql -show devices root.sg1.** -```` - -```shell -+---------------+---------+ -| devices|isAligned| -+---------------+---------+ -| root.sg1.d1| false| -| root.sg1.d2| true| -+---------------+---------+ -```` - -### Show Device Template - -- Show all device templates - -The SQL statement looks like this: - -```shell -IoTDB> show device templates -``` - -The execution result is as follows: - -```shell -+-------------+ -|template 
name| -+-------------+ -| t2| -| t1| -+-------------+ -``` - -- Show nodes under in device template - -The SQL statement looks like this: - -```shell -IoTDB> show nodes in device template t1 -``` - -The execution result is as follows: - -```shell -+-----------+--------+--------+-----------+ -|child nodes|dataType|encoding|compression| -+-----------+--------+--------+-----------+ -|temperature| FLOAT| RLE| SNAPPY| -| status| BOOLEAN| PLAIN| SNAPPY| -+-----------+--------+--------+-----------+ -``` - -- Show the path prefix where a device template is set - -```shell -IoTDB> show paths set device template t1 -``` - -The execution result is as follows: - -```shell -+-----------+ -|child paths| -+-----------+ -|root.sg1.d1| -+-----------+ -``` - -- Show the path prefix where a device template is used (i.e. the time series has been created) - -```shell -IoTDB> show paths using device template t1 -``` - -The execution result is as follows: - -```shell -+-----------+ -|child paths| -+-----------+ -|root.sg1.d1| -+-----------+ -``` - -### Deactivate device Template - -To delete a group of timeseries represented by device template, namely deactivate the device template, use the following SQL statement: - -```shell -IoTDB> delete timeseries of device template t1 from root.sg1.d1 -``` - -or - -```shell -IoTDB> deactivate device template t1 from root.sg1.d1 -``` - -The deactivation supports batch process. - -```shell -IoTDB> delete timeseries of device template t1 from root.sg1.*, root.sg2.* -``` - -or - -```shell -IoTDB> deactivate device template t1 from root.sg1.*, root.sg2.* -``` - -If the template name is not provided in sql, all template activation on paths matched by given path pattern will be removed. 
- -### Unset Device Template - -The SQL statement for unsetting a device template is as follows: - -```shell -IoTDB> unset device template t1 from root.sg1.d1 -``` - -**Attention**: Before unsetting a device template, make sure that none of the timeseries represented by it still exists. This can be achieved with the deactivation operation. - -### Drop Device Template - -The SQL statement for dropping a device template is as follows: - -```shell -IoTDB> drop device template t1 -``` - -**Attention**: Dropping an already set template is not supported. - -### Alter Device Template - -In a scenario where measurements need to be added, you can modify the template to add measurements to all devices using the device template. - -The SQL statement for altering a device template is as follows: - -```shell -IoTDB> alter device template t1 add (speed FLOAT encoding=RLE) -``` - -**When data is inserted into devices that have a device template set on a related prefix path, and the insertion contains measurements not present in this device template, those measurements are automatically added to the device template.** - -## Timeseries Management - -### Create Timeseries - -According to the storage model selected before, we can create corresponding timeseries in the two databases respectively. 
The SQL statements for creating timeseries are as follows: - -``` -IoTDB > create timeseries root.ln.wf01.wt01.status with datatype=BOOLEAN,encoding=PLAIN -IoTDB > create timeseries root.ln.wf01.wt01.temperature with datatype=FLOAT,encoding=RLE -IoTDB > create timeseries root.ln.wf02.wt02.hardware with datatype=TEXT,encoding=PLAIN -IoTDB > create timeseries root.ln.wf02.wt02.status with datatype=BOOLEAN,encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.status with datatype=BOOLEAN,encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.temperature with datatype=FLOAT,encoding=RLE -``` - -From v0.13, you can use a simplified version of the SQL statements to create timeseries: - -``` -IoTDB > create timeseries root.ln.wf01.wt01.status BOOLEAN encoding=PLAIN -IoTDB > create timeseries root.ln.wf01.wt01.temperature FLOAT encoding=RLE -IoTDB > create timeseries root.ln.wf02.wt02.hardware TEXT encoding=PLAIN -IoTDB > create timeseries root.ln.wf02.wt02.status BOOLEAN encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.status BOOLEAN encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.temperature FLOAT encoding=RLE -``` - -Notice that when in the CREATE TIMESERIES statement the encoding method conflicts with the data type, the system gives the corresponding error prompt as shown below: - -``` -IoTDB > create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF -error: encoding TS_2DIFF does not support BOOLEAN -``` - -Please refer to [Encoding](../Technical-Insider/Encoding-and-Compression.md) for correspondence between data type and encoding. 
- -### Create Aligned Timeseries - -The SQL statement for creating a group of aligned timeseries is as follows: - -``` -IoTDB> CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT encoding=PLAIN compressor=SNAPPY, longitude FLOAT encoding=PLAIN compressor=SNAPPY) -``` - -You can set a different datatype, encoding, and compression for each timeseries in a group of aligned timeseries. - -Setting an alias, tags, and attributes for aligned timeseries is also supported. - -### Delete Timeseries - -To delete the timeseries we created before, we can use the `(DELETE | DROP) TimeSeries ` statement. - -The usage is as follows: - -``` -IoTDB> delete timeseries root.ln.wf01.wt01.status -IoTDB> delete timeseries root.ln.wf01.wt01.temperature, root.ln.wf02.wt02.hardware -IoTDB> delete timeseries root.ln.wf02.* -IoTDB> drop timeseries root.ln.wf02.* -``` - -### Show Timeseries - -* SHOW LATEST? TIMESERIES pathPattern? whereClause? limitClause? - - SHOW TIMESERIES accepts four optional clauses and returns information about the time series. - -Timeseries information includes: timeseries path, alias of measurement, database it belongs to, data type, encoding type, compression type, tags and attributes. - -Examples: - -* SHOW TIMESERIES - - presents all timeseries information in JSON form - -* SHOW TIMESERIES <`PathPattern`> - - returns all timeseries information matching the given <`PathPattern`>. 
SQL statements are as follows: - -``` -IoTDB> show timeseries root.** -IoTDB> show timeseries root.ln.** -``` - -The results are shown below respectively: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.016s - -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression|tags|attributes|deadband|deadband parameters| 
-+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -|root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY|null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -Total line number = 4 -It costs 0.004s -``` - -* SHOW TIMESERIES LIMIT INT OFFSET INT - - returns all the timeseries information start from the offset and limit the number of series returned. For example, - -``` -show timeseries root.ln.** limit 10 offset 10 -``` - -* SHOW TIMESERIES WHERE TIMESERIES contains 'containStr' - - The query result set is filtered by string fuzzy matching based on the names of the timeseries. 
For example: - -``` -show timeseries root.ln.** where timeseries contains 'wf01.wt' -``` - -The result is shown below: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 2 -It costs 0.016s -``` - -* SHOW TIMESERIES WHERE DataType=type - - The query result set is filtered by data type. 
For example: - -``` -show timeseries root.ln.** where dataType=FLOAT -``` - -The result is shown below: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 3 -It costs 0.016s - -``` - - -* SHOW TIMESERIES WHERE TAGS(KEY) = VALUE -* SHOW TIMESERIES WHERE TAGS(KEY) CONTAINS VALUE - - The query result set is filtered by tags. 
For example: - -``` -show timeseries root.ln.** where TAGS(unit)='c' -show timeseries root.ln.** where TAGS(description) contains 'test1' -``` - -The query results are as follows: - -``` -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s - -``` - - -* SHOW LATEST TIMESERIES - - all the returned timeseries information should be sorted in descending order of the last timestamp of timeseries - -It is worth noting that when the queried path does not exist, the system will return no timeseries. - - -### Count Timeseries - -IoTDB is able to use `COUNT TIMESERIES ` to count the number of timeseries matching the path. 
SQL statements are as follows: - -* `WHERE` condition could be used to fuzzy match a time series name with the following syntax: `COUNT TIMESERIES WHERE TIMESERIES contains 'containStr'`. -* `WHERE` condition could be used to filter result by data type with the syntax: `COUNT TIMESERIES WHERE DataType='`. -* `WHERE` condition could be used to filter result by tags with the syntax: `COUNT TIMESERIES WHERE TAGS(key)='value'` or `COUNT TIMESERIES WHERE TAGS(key) contains 'value'`. -* `LEVEL` could be defined to show count the number of timeseries of each node at the given level in current Metadata Tree. This could be used to query the number of sensors under each device. The grammar is: `COUNT TIMESERIES GROUP BY LEVEL=`. - - -``` -IoTDB > COUNT TIMESERIES root.** -IoTDB > COUNT TIMESERIES root.ln.** -IoTDB > COUNT TIMESERIES root.ln.*.*.status -IoTDB > COUNT TIMESERIES root.ln.wf01.wt01.status -IoTDB > COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' -IoTDB > COUNT TIMESERIES root.** WHERE DATATYPE = INT64 -IoTDB > COUNT TIMESERIES root.** WHERE TAGS(unit) contains 'c' -IoTDB > COUNT TIMESERIES root.** WHERE TAGS(unit) = 'c' -IoTDB > COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' group by level = 1 -``` - -For example, if there are several timeseries (use `show timeseries` to show all timeseries): - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| 
root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| {"unit":"c"}| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| {"description":"test1"}| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.004s -``` - -Then the Metadata Tree will be as below: - -

- -As can be seen, `root` is considered as `LEVEL=0`. So when you enter statements such as: - -``` -IoTDB > COUNT TIMESERIES root.** GROUP BY LEVEL=1 -IoTDB > COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2 -IoTDB > COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2 -``` - -You will get following results: - -``` -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -| root.sgcc| 2| -| root.ln| 4| -+------------+-----------------+ -Total line number = 3 -It costs 0.002s - -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -|root.ln.wf02| 2| -|root.ln.wf01| 2| -+------------+-----------------+ -Total line number = 2 -It costs 0.002s - -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -|root.ln.wf01| 2| -+------------+-----------------+ -Total line number = 1 -It costs 0.002s -``` - -> Note: The path of timeseries is just a filter condition, which has no relationship with the definition of level. - -### Active Timeseries Query -By adding WHERE time filter conditions to the existing SHOW/COUNT TIMESERIES, we can obtain time series with data within the specified time range. - -It is important to note that in metadata queries with time filters, views are not considered; only the time series actually stored in the TsFile are taken into account. 
- -An example usage is as follows: -``` -IoTDB> insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -IoTDB> insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -IoTDB> insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -IoTDB> show timeseries; -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -IoTDB> show timeseries where time >= 15000 and time < 16000; -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| 
null| null| BASE| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -IoTDB> count timeseries where time >= 15000 and time < 16000; -+-----------------+ -|count(timeseries)| -+-----------------+ -| 4| -+-----------------+ -``` -Regarding the definition of active time series, data that can be queried normally is considered active, meaning time series that have been inserted but deleted are not included. -### Tag and Attribute Management - -We can also add an alias, extra tag and attribute information while creating one timeseries. - -The differences between tag and attribute are: - -* Tag could be used to query the path of timeseries, we will maintain an inverted index in memory on the tag: Tag -> Timeseries -* Attribute could only be queried by timeseries path : Timeseries -> Attribute - -The SQL statements for creating timeseries with extra tag and attribute information are extended as follows: - -``` -create timeseries root.turbine.d1.s1(temprature) with datatype=FLOAT, encoding=RLE, compression=SNAPPY tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2) -``` - -The `temprature` in the brackets is an alias for the sensor `s1`. So we can use `temprature` to replace `s1` anywhere. - -> IoTDB also supports using AS function to set alias. The difference between the two is: the alias set by the AS function is used to replace the whole time series name, temporary and not bound with the time series; while the alias mentioned above is only used as the alias of the sensor, which is bound with it and can be used equivalent to the original sensor name. - -> Notice that the size of the extra tag and attribute information shouldn't exceed the `tag_attribute_total_size`. 
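The difference between tags and attributes can be pictured with a small in-memory model. The sketch below is only an illustration of the two lookup directions described above (Tag -> Timeseries through an inverted index, Timeseries -> Attribute through a per-series map); the class `SeriesMetadata` and its methods are hypothetical names and do not reflect IoTDB's internal data structures.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple


class SeriesMetadata:
    """Toy model: tags are indexed by (key, value), attributes are stored per series."""

    def __init__(self) -> None:
        # Inverted index: (tag key, tag value) -> set of timeseries paths.
        self._tag_index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
        # Forward map: timeseries path -> its attributes.
        self._attributes: Dict[str, Dict[str, str]] = {}

    def create(
        self,
        path: str,
        tags: Optional[Dict[str, str]] = None,
        attributes: Optional[Dict[str, str]] = None,
    ) -> None:
        for key, value in (tags or {}).items():
            self._tag_index[(key, value)].add(path)
        self._attributes[path] = dict(attributes or {})

    def find_by_tag(self, key: str, value: str) -> List[str]:
        """Tag -> Timeseries: exact match on the tag value."""
        return sorted(self._tag_index.get((key, value), set()))

    def find_by_tag_contains(self, key: str, fragment: str) -> List[str]:
        """Tag -> Timeseries: substring match on the tag value."""
        found: Set[str] = set()
        for (tag_key, tag_value), paths in self._tag_index.items():
            if tag_key == key and fragment in tag_value:
                found |= paths
        return sorted(found)

    def attributes_of(self, path: str) -> Dict[str, str]:
        """Timeseries -> Attribute: attributes are reachable only through the path."""
        return dict(self._attributes.get(path, {}))
```

In this model there is deliberately no method that searches by attribute value, which mirrors the statement that attributes can only be queried through the timeseries path.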
- -We can update the tag and attribute information after creation as follows: - -* Rename the tag/attribute key - -``` -ALTER timeseries root.turbine.d1.s1 RENAME tag1 TO newTag1 -``` - -* Reset the tag/attribute value - -``` -ALTER timeseries root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1 -``` - -* Delete the existing tag/attribute - -``` -ALTER timeseries root.turbine.d1.s1 DROP tag1, tag2 -``` - -* Add new tags - -``` -ALTER timeseries root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4 -``` - -* Add new attributes - -``` -ALTER timeseries root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4 -``` - -* Upsert alias, tags and attributes - -> Adds the alias or a new key-value pair if the alias or key doesn't exist; otherwise, updates the old one with the new value. - -``` -ALTER timeseries root.turbine.d1.s1 UPSERT ALIAS=newAlias TAGS(tag3=v3, tag4=v4) ATTRIBUTES(attr3=v3, attr4=v4) -``` - -* Show timeseries using tags. Use TAGS(tagKey) to identify the tags used as filter key - -``` -SHOW TIMESERIES (<`PathPattern`>)? timeseriesWhereClause -``` - -Returns all the timeseries information that satisfies the where condition and matches the pathPattern. 
SQL statements are as follows: - -``` -ALTER timeseries root.ln.wf02.wt02.hardware ADD TAGS unit=c -ALTER timeseries root.ln.wf02.wt02.status ADD TAGS description=test1 -show timeseries root.ln.** where TAGS(unit)='c' -show timeseries root.ln.** where TAGS(description) contains 'test1' -``` - -The results are shown below respectly: - -``` -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s -``` - -- count timeseries using tags - -``` -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause GROUP BY LEVEL= -``` - -returns all the number of timeseries that satisfy the where condition and match the pathPattern. 
SQL statements are as follows:
-
-```
-count timeseries
-count timeseries root.** where TAGS(unit)='c'
-count timeseries root.** where TAGS(unit)='c' group by level = 2
-```
-
-The results are shown below, respectively:
-
-```
-IoTDB> count timeseries
-+-----------------+
-|count(timeseries)|
-+-----------------+
-|                6|
-+-----------------+
-Total line number = 1
-It costs 0.019s
-IoTDB> count timeseries root.** where TAGS(unit)='c'
-+-----------------+
-|count(timeseries)|
-+-----------------+
-|                2|
-+-----------------+
-Total line number = 1
-It costs 0.020s
-IoTDB> count timeseries root.** where TAGS(unit)='c' group by level = 2
-+--------------+-----------------+
-|        column|count(timeseries)|
-+--------------+-----------------+
-|  root.ln.wf02|                2|
-|  root.ln.wf01|                0|
-|root.sgcc.wf03|                0|
-+--------------+-----------------+
-Total line number = 3
-It costs 0.011s
-```
-
-> Note that only one condition is supported in the where clause: either an equality filter or a `contains` filter. In both cases, the property in the where condition must be a tag.
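
The tag filters above rely on an inverted index from tag to timeseries. The following is a minimal Python sketch of that idea, not IoTDB's actual implementation: the class `TagIndex` and its methods are hypothetical names introduced only for illustration, and the sample paths are taken from the `ALTER timeseries ... ADD TAGS` examples in this section.

```python
from collections import defaultdict


class TagIndex:
    """Toy in-memory inverted index: (tag key, tag value) -> set of timeseries paths.

    Hypothetical illustration only; this is not how IoTDB stores tags internally.
    """

    def __init__(self):
        self._index = defaultdict(set)

    def add_tags(self, path, **tags):
        # Register every tag key/value pair of one timeseries.
        for key, value in tags.items():
            self._index[(key, value)].add(path)

    def equals(self, key, value):
        # Analogue of: where TAGS(key)='value'
        return sorted(self._index.get((key, value), set()))

    def contains(self, key, fragment):
        # Analogue of: where TAGS(key) contains 'fragment'
        matches = set()
        for (k, v), paths in self._index.items():
            if k == key and fragment in v:
                matches |= paths
        return sorted(matches)


index = TagIndex()
index.add_tags("root.ln.wf02.wt02.hardware", unit="c")
index.add_tags("root.ln.wf02.wt02.status", description="test1")

print(index.equals("unit", "c"))
print(index.contains("description", "test"))
```

An equality filter is a single dictionary lookup, while a `contains` filter has to scan the values stored under the tag key, which is one plausible reason to allow only a single tag condition per query.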
- -create aligned timeseries - -``` -create aligned timeseries root.sg1.d1(s1 INT32 tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2), s2 DOUBLE tags(tag3=v3, tag4=v4) attributes(attr3=v3, attr4=v4)) -``` - -The execution result is as follows: - -``` -IoTDB> show timeseries -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -|root.sg1.d1.s2| null| root.sg1| DOUBLE| GORILLA| SNAPPY|{"tag4":"v4","tag3":"v3"}|{"attr4":"v4","attr3":"v3"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -Support query: - -``` -IoTDB> show timeseries where TAGS(tag1)='v1' -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -The above operations are supported for timeseries tag, attribute updates, etc. 
- -## Node Management - -### Show Child Paths - -``` -SHOW CHILD PATHS pathPattern -``` - -Return all child paths and their node types of all the paths matching pathPattern. - -node types: ROOT -> DB INTERNAL -> DATABASE -> INTERNAL -> DEVICE -> TIMESERIES - - -Example: - -* return the child paths of root.ln:show child paths root.ln - -``` -+------------+----------+ -| child paths|node types| -+------------+----------+ -|root.ln.wf01| INTERNAL| -|root.ln.wf02| INTERNAL| -+------------+----------+ -Total line number = 2 -It costs 0.002s -``` - -> get all paths in form of root.xx.xx.xx:show child paths root.xx.xx - -### Show Child Nodes - -``` -SHOW CHILD NODES pathPattern -``` - -Return all child nodes of the pathPattern. - -Example: - -* return the child nodes of root:show child nodes root - -``` -+------------+ -| child nodes| -+------------+ -| ln| -+------------+ -``` - -* return the child nodes of root.ln:show child nodes root.ln - -``` -+------------+ -| child nodes| -+------------+ -| wf01| -| wf02| -+------------+ -``` - -### Count Nodes - -IoTDB is able to use `COUNT NODES LEVEL=` to count the number of nodes at - the given level in current Metadata Tree considering a given pattern. IoTDB will find paths that - match the pattern and counts distinct nodes at the specified level among the matched paths. - This could be used to query the number of devices with specified measurements. 
The usage is as follows:
-
-```
-IoTDB > COUNT NODES root.** LEVEL=2
-IoTDB > COUNT NODES root.ln.** LEVEL=2
-IoTDB > COUNT NODES root.ln.wf01.** LEVEL=3
-IoTDB > COUNT NODES root.**.temperature LEVEL=3
-```
-
-For the example above and the given metadata tree, you get the following results:
-
-```
-+------------+
-|count(nodes)|
-+------------+
-|           4|
-+------------+
-Total line number = 1
-It costs 0.003s
-
-+------------+
-|count(nodes)|
-+------------+
-|           2|
-+------------+
-Total line number = 1
-It costs 0.002s
-
-+------------+
-|count(nodes)|
-+------------+
-|           1|
-+------------+
-Total line number = 1
-It costs 0.002s
-
-+------------+
-|count(nodes)|
-+------------+
-|           2|
-+------------+
-Total line number = 1
-It costs 0.002s
-```
-
-> Note: The timeseries path is only a filter condition and has no bearing on how the level is defined.
-
-### Show Devices
-
-* SHOW DEVICES pathPattern? (WITH DATABASE)? devicesWhereClause? limitClause?
-
-Similar to `Show Timeseries`, IoTDB also supports two ways of viewing devices:
-
-* The `SHOW DEVICES` statement presents the information of all devices, which is equivalent to `SHOW DEVICES root.**`.
-* The `SHOW DEVICES <PathPattern>` statement specifies the `PathPattern` and returns the information of the devices that match the pathPattern and are under the given level.
-* The `WHERE` condition supports `DEVICE contains 'xxx'` to do a fuzzy query based on the device name.
- -SQL statement is as follows: - -``` -IoTDB> show devices -IoTDB> show devices root.ln.** -IoTDB> show devices root.ln.** where device contains 't' -``` - -You can get results below: - -``` -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.ln.wf01.wt01| false| -| root.ln.wf02.wt02| false| -|root.sgcc.wf03.wt01| false| -| root.turbine.d1| false| -+-------------------+---------+ -Total line number = 4 -It costs 0.002s - -+-----------------+---------+ -| devices|isAligned| -+-----------------+---------+ -|root.ln.wf01.wt01| false| -|root.ln.wf02.wt02| false| -+-----------------+---------+ -Total line number = 2 -It costs 0.001s -``` - -`isAligned` indicates whether the timeseries under the device are aligned. - -To view devices' information with database, we can use `SHOW DEVICES WITH DATABASE` statement. - -* `SHOW DEVICES WITH DATABASE` statement presents all devices' information with their database. -* `SHOW DEVICES WITH DATABASE` statement specifies the `PathPattern` and returns the - devices' information under the given level with their database information. 
- -SQL statement is as follows: - -``` -IoTDB> show devices with database -IoTDB> show devices root.ln.** with database -``` - -You can get results below: - -``` -+-------------------+-------------+---------+ -| devices| database|isAligned| -+-------------------+-------------+---------+ -| root.ln.wf01.wt01| root.ln| false| -| root.ln.wf02.wt02| root.ln| false| -|root.sgcc.wf03.wt01| root.sgcc| false| -| root.turbine.d1| root.turbine| false| -+-------------------+-------------+---------+ -Total line number = 4 -It costs 0.003s - -+-----------------+-------------+---------+ -| devices| database|isAligned| -+-----------------+-------------+---------+ -|root.ln.wf01.wt01| root.ln| false| -|root.ln.wf02.wt02| root.ln| false| -+-----------------+-------------+---------+ -Total line number = 2 -It costs 0.001s -``` - -### Count Devices - -* COUNT DEVICES / - -The above statement is used to count the number of devices. At the same time, it is allowed to specify `PathPattern` to count the number of devices matching the `PathPattern`. - -SQL statement is as follows: - -``` -IoTDB> show devices -IoTDB> count devices -IoTDB> count devices root.ln.** -``` - -You can get results below: - -``` -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -|root.sgcc.wf03.wt03| false| -| root.turbine.d1| false| -| root.ln.wf02.wt02| false| -| root.ln.wf01.wt01| false| -+-------------------+---------+ -Total line number = 4 -It costs 0.024s - -+--------------+ -|count(devices)| -+--------------+ -| 4| -+--------------+ -Total line number = 1 -It costs 0.004s - -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -Total line number = 1 -It costs 0.004s -``` - -### Active Device Query -Similar to active timeseries query, we can add time filter conditions to device viewing and statistics to query active devices that have data within a certain time range. The definition of active here is the same as for active time series. 
An example usage is as follows: -``` -IoTDB> insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -IoTDB> insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -IoTDB> insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -IoTDB> show devices; -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -| root.sg.data3| false| -+-------------------+---------+ - -IoTDB> show devices where time >= 15000 and time < 16000; -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -+-------------------+---------+ - -IoTDB> count devices where time >= 15000 and time < 16000; -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -``` \ No newline at end of file diff --git a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/AINode_Deployment_timecho.md b/src/UserGuide/V1.3.x/Deployment-and-Maintenance/AINode_Deployment_timecho.md deleted file mode 100644 index 35070c791..000000000 --- a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/AINode_Deployment_timecho.md +++ /dev/null @@ -1,574 +0,0 @@ - -# AINode Deployment - -## AINode Introduction - -### Capability Introduction - - AINode is the third type of endogenous node provided by IoTDB after the Configurable Node and DataNode. This node extends its ability to perform machine learning analysis on time series by interacting with the DataNode and Configurable Node of the IoTDB cluster. It supports the introduction of existing machine learning models from external sources for registration and the use of registered models to complete time series analysis tasks on specified time series data through simple SQL statements. The creation, management, and inference of models are integrated into the database engine. 
Currently, machine learning algorithms or self-developed models are available for common time series analysis scenarios, such as prediction and anomaly detection. - -### Delivery Method - It is an additional package outside the IoTDB cluster, with independent installation. - -### Deployment mode -
- - -
-
-## Installation preparation
-
-### Get installation package
-
- Download the AINode software installation package and unzip it to complete the installation of AINode.
-
- Unzip the package
- `(apache-iotdb-<version>-ainode-bin.zip)`. The directory structure after unpacking the installation package is as follows:
-
-| **Directory** | **Type** | **Description**                                                   |
-| ------------- | -------- | ----------------------------------------------------------------- |
-| lib           | folder   | AINode compiled binary executables and related code dependencies  |
-| sbin          | folder   | Scripts for starting, removing, and stopping AINode               |
-| conf          | folder   | Configuration files for AINode                                    |
-| LICENSE       | file     | License                                                           |
-| NOTICE        | file     | Notice                                                            |
-| README_ZH.md  | file     | Instructions (Chinese version, Markdown format)                   |
-| `README.md`   | file     | Instructions                                                      |
-
-
-### Pre-installation Check
-
-To ensure the AINode installation package you obtained is complete and valid, we recommend performing an SHA512 verification before proceeding with the installation and deployment.
-
-#### Preparation:
-
-- Obtain the officially released SHA512 checksum: please contact the Timecho team to obtain it.
-
-#### Verification Steps (Linux as an Example):
-
-1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/ainode):
-   ```Bash
-   cd /data/ainode
-   ```
-2. Execute the following command to calculate the hash value:
-   ```Bash
-   sha512sum timechodb-{version}-ainode-bin.zip
-   ```
-3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name):
-
-![img](/img/sha512-05.png)
-
-4. Compare the output with the official SHA512 checksum. Once you have confirmed that they match, you can proceed with the installation and deployment of AINode as described below.
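
The verification steps above can also be scripted where `sha512sum` is not available. The sketch below is one possible way to do it with Python's standard `hashlib` module; the function names `sha512_of_file` and `matches_official` are illustrative assumptions, not part of AINode, and the package path and checksum in the usage comment are placeholders to be replaced with real values.

```python
import hashlib
import hmac


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_official(path, official_checksum):
    """Compare the computed digest with the officially released checksum.

    Surrounding whitespace and letter case in the official value are ignored.
    """
    expected = official_checksum.strip().lower()
    return hmac.compare_digest(sha512_of_file(path), expected)


# Usage (placeholders, not real values):
#   matches_official("/data/ainode/<package>.zip", "<official checksum>")
```

If `matches_official` returns `False`, treat it the same way as a mismatch in step 4 and do not proceed with the installation.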
-
-#### Notes:
-
-- If the verification results do not match, please contact the Timecho team to re-obtain the installation package.
-- If a "file not found" prompt appears during verification, check whether the file path is correct and whether the installation package has been fully downloaded.
-
-### Environment preparation
-- Suggested operating environment: Ubuntu, CentOS, MacOS
-
-- Runtime Environment
-  - Python 3.10 to 3.12 (inclusive) is required, with the pip and venv tools available. For non-networked environments, download the zip package for the corresponding operating system from [here](https://cloud.tsinghua.edu.cn/d/4c1342f6c272439aa96c/?p=%2Flibs&mode=list) (note that when downloading dependencies, you need to select the zip file in the libs folder, as shown in the following figure). Copy all files in the folder to the `lib` folder in the `apache-iotdb-<version>-ainode-bin` folder, and follow the steps below to start AINode.
-
-
-
-  - A Python interpreter must be available through the environment variables so that it can be called directly with the `python` command.
-  - It is recommended to create a Python venv virtual environment in the `apache-iotdb-<version>-ainode-bin` folder. To create a virtual environment with Python 3.10.0, the command is as follows:
-    ```shell
-    # Use the Python 3.10.0 interpreter to create a virtual environment in the folder named venv.
-    ../Python-3.10.0/python -m venv venv
-    ```
-
-## Installation steps
-
-### Install AINode
-
-1. AINode activation
-
-     IoTDB must be running normally, and the license must include authorization for the AINode module.
- - The authorization method for activating the AINode module is as follows: - - Method 1: Activate file copy activation - - After restarting the confignode node, enter the activation folder, copy the system_info file to the Timecho staff, and inform them to apply for independent authorization for AINode; - - Received the license file returned by the staff; - - Put the license file into the activation folder of the corresponding node; - -- Method 2: Activate Script Activation - - Obtain the required machine code for activation, enter the `sbin` directory of the installation directory, and execute the activation script: - ```shell - cd sbin - ./start-activate.sh - ``` - - The following information is displayed. Please copy the machine code (i.e. this string of characters) to the Timecho staff and inform them to apply for independent authorization of AINode: - ```shell - Please copy the system_info's content and send it to Timecho: - Y17hFA0xRCE1TmkVxILuCIEPc7uJcr5bzlXWiptw8uZTmTX5aThfypQdLUIhMljw075hNRSicyvyJR9JM7QaNm1gcFZPHVRWVXIiY5IlZkXdxCVc1erXMsbCqUYsR2R2Mw4PSpFJsUF5jHWSoFIIjQ2bmJFW5P52KCccFMVeHTc= - Please enter license: - ``` - - Enter the activation code returned by the staff into the `Please enter license:` command prompt in the previous step, as shown below: - ```shell - Please enter license: - Jw+MmF+AtexsfgNGOFgTm83BgXbq0zT1+fOfPvQsLlj6ZsooHFU6HycUSEGC78eT1g67KPvkcLCUIsz2QpbyVmPLr9x1+kVjBubZPYlVpsGYLqLFc8kgpb5vIrPLd3hGLbJ5Ks8fV1WOVrDDVQq89YF2atQa2EaB9EAeTWd0bRMZ+s9ffjc/1Zmh9NSP/T3VCfJcJQyi7YpXWy5nMtcW0gSV+S6fS5r7a96PjbtE0zXNjnEhqgRzdU+mfO8gVuUNaIy9l375cp1GLpeCh6m6pF+APW1CiXLTSijK9Qh3nsL5bAOXNeob5l+HO5fEMgzrW8OJPh26Vl6ljKUpCvpTiw== - License has been stored to sbin/../activation/license - Import completed. 
Please start cluster and excute 'show cluster' to verify activation status - ``` -- After updating the license, restart the DataNode node and enter the sbin directory of IoTDB to start the datanode: - ```shell - cd sbin - ./start-datanode.sh -d #The parameter'd 'will be started in the background - ``` - -2. Check the kernel architecture of Linux - ```shell - uname -m - ``` - -3. Import Python environment [Download](https://repo.anaconda.com/miniconda/) - - Recommend downloading the py311 version application and importing it into the iotdb dedicated folder in the user's root directory - - 4. Verify Python version - -```shell - python --version - ``` -5. Create a virtual environment (execute in the ainode directory) - - ```shell - python -m venv venv - ``` - -6. Activate the virtual environment - - ```shell - source venv/bin/activate - ``` - - 7. Download and import AINode to a dedicated folder, switch to the dedicated folder and extract the installation package - - ```shell - unzip iotdb-enterprise-ainode-1.3.3.2.zip - ``` - - 8. Configuration item modification - - ```shell - vi iotdb-enterprise-ainode-1.3.3.2/conf/iotdb-ainode.properties - ``` - Configuration item modification:[detailed information](#configuration-item-modification) - - > ain_seed_config_node=iotdb-1:10710 (Cluster communication node IP: communication node port)
- > ain_inference_rpc_address=iotdb-3 (IP address of the server running AINode) - - 9. Replace Python source - - ```shell - pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/ - ``` - - 10. Start the AINode node - - ```shell - nohup bash iotdb-enterprise-ainode-1.3.3.2/sbin/start-ainode.sh > myout.file 2>& 1 & - ``` - > Return to the default environment of the system: conda deactivate - - ### Configuration item modification - -AINode supports modifying some necessary parameters. You can find the following parameters in the `conf/iotdb-ainode.properties` file and make persistent modifications to them: -: - -| **Name** | **Describe** | **Type** | **Default value** | **Effective method after modification** | -| :----------------------------- | ------------------------------------------------------------ | ------- | ------------------ | ---------------------------- | -| cluster_name | The identifier for AINode to join the cluster | string | defaultCluster | Only allow modifications before the first service startup | -| ain_seed_config_node | The Configurable Node address registered during AINode startup | String | 127.0.0.1:10710 | Only allow modifications before the first service startup | -| ain_inference_rpc_address | AINode provides service and communication addresses , Internal Service Communication Interface | String | 127.0.0.1 | Only allow modifications before the first service startup | -| ain_inference_rpc_port | AINode provides ports for services and communication | String | 10810 | Only allow modifications before the first service startup | -| ain_system_dir | AINode metadata storage path, the starting directory of the relative path is related to the operating system, and it is recommended to use an absolute path | String | data/AINode/system | Only allow modifications before the first service startup | -| ain_models_dir | AINode stores the path of the model file, and the starting directory of the relative path is related to the 
operating system. It is recommended to use an absolute path | String | data/AINode/models | Only allow modifications before the first service startup | -| ain_logs_dir | The path where AINode stores logs, the starting directory of the relative path is related to the operating system, and it is recommended to use an absolute path | String | logs/AINode | Effective after restart | -| ain_thrift_compression_enabled | Does AINode enable Thrift's compression mechanism , 0-Do not start, 1-Start | Boolean | 0 | Effective after restart | - -### Start AINode - - After completing the deployment of Seed Config Node, the registration and inference functions of the model can be supported by adding AINode nodes. After specifying the information of the IoTDB cluster in the configuration file, the corresponding instruction can be executed to start AINode and join the IoTDB cluster。 - -#### Networking environment startup - -##### Start command - -```shell - # Start command - # Linux and MacOS systems - bash sbin/start-ainode.sh - - # Windows systems - sbin\start-ainode.bat - - # Backend startup command (recommended for long-term running) - # Linux and MacOS systems - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - - # Windows systems - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -#### Detailed Syntax - -```shell - # Start command - # Linux and MacOS systems - bash sbin/start-ainode.sh -i -r -n - - # Windows systems - sbin\start-ainode.bat -i -r -n - ``` - -##### Parameter introduction: - -| **Name** | **Label** | **Describe** | **Is it mandatory** | **Type** | **Default value** | **Input method** | -| ------------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ---------------- | ---------------------- | -| ain_interpreter_dir | -i | The interpreter path of the virtual environment where AINode is installed requires the use of an absolute path. 
| no | String | Default reading of environment variables | Input or persist modifications during invocation | -| ain_force_reinstall | -r | Does this script check the version when checking the installation status of AINode. If it does, it will force the installation of the whl package in lib if the version is incorrect. | no | Bool | false | Input when calling | -| ain_no_dependencies | -n | Specify whether to install dependencies when installing AINode, and if so, only install the AINode main program without installing dependencies. | no | Bool | false | Input when calling | - - If you don't want to specify the corresponding parameters every time you start, you can also persistently modify the parameters in the `ainode-env.sh` and `ainode-env.bat` scripts in the `conf` folder (currently supporting persistent modification of the ain_interpreter-dir parameter). - - `ainode-env.sh` : - ```shell - # The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - # ain_interpreter_dir= - ``` - `ainode-env.bat` : -```shell - @REM The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - @REM set ain_interpreter_dir= - ``` - After writing the parameter value, uncomment the corresponding line and save it to take effect on the next script execution. - - -#### Example - -##### Directly start: - -```shell - # Start command - # Linux and MacOS systems - bash sbin/start-ainode.sh - # Windows systems - sbin\start-ainode.bat - - - # Backend startup command (recommended for long-term running) - # Linux and MacOS systems - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - # Windows systems - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -##### Update Start: -If the version of AINode has been updated (such as updating the `lib` folder), this command can be used. 
Firstly, it is necessary to ensure that AINode has stopped running, and then restart it using the `-r` parameter, which will reinstall AINode based on the files under `lib`. - - -```shell - # Update startup command - # Linux and MacOS systems - bash sbin/start-ainode.sh -r - # Windows systems - sbin\start-ainode.bat -r - - - # Backend startup command (recommended for long-term running) - # Linux and MacOS systems - nohup bash sbin/start-ainode.sh -r > myout.file 2>& 1 & - # Windows c - nohup bash sbin\start-ainode.bat -r > myout.file 2>& 1 & - ``` -#### Non networked environment startup - -##### Start command - -```shell - # Start command - # Linux and MacOS systems - bash sbin/start-ainode.sh - - # Windows systems - sbin\start-ainode.bat - - # Backend startup command (recommended for long-term running) - # Linux and MacOS systems - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - - # Windows systems - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -#### Detailed Syntax - -```shell - # Start command - # Linux and MacOS systems - bash sbin/start-ainode.sh -i -r -n - - # Windows systems - sbin\start-ainode.bat -i -r -n - ``` - -##### Parameter introduction: - -| **Name** | **Label** | **Describe** | **Is it mandatory** | **Type** | **Default value** | **Input method** | -| ------------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ---------------- | ---------------------- | -| ain_interpreter_dir | -i | The interpreter path of the virtual environment where AINode is installed requires the use of an absolute path | no | String | Default reading of environment variables | Input or persist modifications during invocation | -| ain_force_reinstall | -r | Does this script check the version when checking the installation status of AINode. 
If it does, it will force the installation of the whl package in lib if the version is incorrect | no | Bool | false | Input when calling | - -> Attention: When installation fails in a non networked environment, first check if the installation package corresponding to the platform is selected, and then confirm the Python version (due to the limitations of the downloaded installation package on Python versions, 3.7, 3.9, and others are not allowed) - -#### Example - -##### Directly start: - -```shell - # Start command - # Linux and MacOS systems - bash sbin/start-ainode.sh - # Windows systems - sbin\start-ainode.bat - - # Backend startup command (recommended for long-term running) - # Linux and MacOS systems - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - # Windows systems - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -### Detecting the status of AINode nodes - -During the startup process of AINode, the new AINode will be automatically added to the IoTDB cluster. After starting AINode, you can enter SQL in the command line to query. If you see an AINode node in the cluster and its running status is Running (as shown below), it indicates successful joining. - - -```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|Running| 127.0.0.1| 10810|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` - -### Stop AINode - -If you need to stop a running AINode node, execute the corresponding shutdown script. 
- -#### Stop command - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh - - #Windows - sbin\stop-ainode.bat - ``` - - -#### Detailed Syntax - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh -t - - #Windows - sbin\stop-ainode.bat -t - ``` - -##### Parameter introduction: - -| **Name** | **Label** | **Describe** | **Is it mandatory** | **Type** | **Default value** | **Input method** | -| ----------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ------ | ---------- | -| ain_remove_target | -t | When closing AINode, you can specify the Node ID, address, and port number of the target AINode to be removed, in the format of `` | no | String | nothing | Input when calling | - -#### Example - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh - - # Windows - sbin\stop-ainode.bat - ``` -After stopping AINode, you can still see AINode nodes in the cluster, whose running status is UNKNOWN (as shown below), and the AINode function cannot be used at this time. - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|UNKNOWN| 127.0.0.1| 10790|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -If you need to restart the node, you need to execute the startup script again. - -### Remove AINode - -When it is necessary to remove an AINode node from the cluster, a removal script can be executed. The difference between removing and stopping scripts is that stopping retains the AINode node in the cluster but stops the AINode service, while removing removes the AINode node from the cluster. 
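
The difference between a stopped and a removed AINode is visible in the `show cluster` output: a stopped node is still listed with status UNKNOWN, while a removed node is no longer listed. The sketch below is a hedged illustration of how that check could be scripted on captured CLI output. The function names are hypothetical, and the parsing assumes the ASCII table layout shown in this document rather than any stable IoTDB output format.

```python
def parse_cluster_table(text):
    """Parse the ASCII table printed by `show cluster` into a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip separators such as +------+ and prompt lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def ainode_statuses(text):
    """Return the Status value of every AINode row; an empty list means none is registered."""
    return [row["Status"] for row in parse_cluster_table(text) if row["NodeType"] == "AINode"]


SAMPLE = """
+------+----------+-------+---------------+------------+-------+-----------+
|NodeID|  NodeType| Status|InternalAddress|InternalPort|Version|  BuildInfo|
+------+----------+-------+---------------+------------+-------+-----------+
|     0|ConfigNode|Running|      127.0.0.1|       10710|UNKNOWN|190e303-dev|
|     1|  DataNode|Running|      127.0.0.1|       10730|UNKNOWN|190e303-dev|
|     2|    AINode|UNKNOWN|      127.0.0.1|       10790|UNKNOWN|190e303-dev|
+------+----------+-------+---------------+------------+-------+-----------+
"""

print(ainode_statuses(SAMPLE))
```

A result of `["Running"]` would indicate a started node, `["UNKNOWN"]` a stopped one, and an empty list a cluster from which the AINode has been removed.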
- -#### Remove command - - -```shell - # Linux / MacOS - bash sbin/remove-ainode.sh - - # Windows - sbin\remove-ainode.bat - ``` - -#### Detailed Syntax - -```shell - # Linux / MacOS - bash sbin/remove-ainode.sh -i -t/: -r -n - - # Windows - sbin\remove-ainode.bat -i -t/: -r -n - ``` - -##### Parameter introduction: - - | **Name** | **Label** | **Describe** | **Is it mandatory** | **Type** | **Default value** | **Input method** | -| ------------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ---------------- | --------------------- | -| ain_interpreter_dir | -i | The interpreter path of the virtual environment where AINode is installed requires the use of an absolute path | no | String | Default reading of environment variables | Input+persistent modification during invocation | -| ain_remove_target | -t | When closing AINode, you can specify the Node ID, address, and port number of the target AINode to be removed, in the format of `` | no | String | nothing | Input when calling | -| ain_force_reinstall | -r | Does this script check the version when checking the installation status of AINode. If it does, it will force the installation of the whl package in lib if the version is incorrect | no | Bool | false | Input when calling | -| ain_no_dependencies | -n | Specify whether to install dependencies when installing AINode, and if so, only install the AINode main program without installing dependencies | no | Bool | false | Input when calling | - - If you don't want to specify the corresponding parameters every time you start, you can also persistently modify the parameters in the `ainode-env.sh` and `ainode-env.bat` scripts in the `conf` folder (currently supporting persistent modification of the ain_interpreter-dir parameter). - - `ainode-env.sh` : - ```shell - # The defaulte venv environment is used if ain_interpreter_dir is not set. 
Please use absolute path without quotation mark - # ain_interpreter_dir= - ``` - `ainode-env.bat` : -```shell - @REM The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - @REM set ain_interpreter_dir= - ``` - After writing the parameter value, uncomment the corresponding line and save it to take effect on the next script execution. - -#### Example - -##### Directly remove: - - ```shell - # Linux / MacOS - bash sbin/remove-ainode.sh - - # Windows - sbin\remove-ainode.bat - ``` - After removing the node, relevant information about the node cannot be queried. - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -##### Specify removal: - -If the user loses files in the data folder, AINode may not be able to actively remove them locally. The user needs to specify the node number, address, and port number for removal. In this case, we support users to input parameters according to the following methods for deletion. - - ```shell - # Linux / MacOS - bash sbin/remove-ainode.sh -t /: - - # Windows - sbin\remove-ainode.bat -t /: - ``` - -## common problem - -### An error occurs when starting AINode stating that the venv module cannot be found - - When starting AINode using the default method, a Python virtual environment will be created in the installation package directory and dependencies will be installed, so it is required to install the venv module. 
Generally speaking, Python 3.10 and later ship with the venv module, but the Python preinstalled on some systems does not include it. There are two solutions when this error occurs (choose one): - - Install the venv module locally. On Ubuntu, for example, you can run the following command to install it, or you can install a Python version that bundles venv from the official Python website. - - ```shell -apt-get install python3.10-venv -``` -Then create a virtual environment in the AINode path using Python 3.10.0. - - ```shell -../Python-3.10.0/python -m venv venv(Folder Name) -``` - When running the startup script, use ` -i ` to specify an existing Python interpreter path as the running environment for AINode, which eliminates the need to create a new virtual environment. - - ### The SSL module in Python is not properly installed and configured to handle HTTPS resources -WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available. -You can install OpenSSL and then rebuild Python to solve this problem. -> Currently Python versions 3.6 to 3.9 are compatible with OpenSSL 1.0.2, 1.1.0, and 1.1.1. - - Python requires OpenSSL to be installed on the system. The specific installation method can be found in [link](https://stackoverflow.com/questions/56552390/how-to-fix-ssl-module-in-python-is-not-available-in-centos) - - ```shell -sudo apt-get install build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev uuid-dev lzma-dev liblzma-dev -sudo -E ./configure --with-ssl -make -sudo make install -``` - - ### Pip version is too low - - A compilation issue similar to "error: Microsoft Visual C++14.0 or greater is required..."
appears on Windows - -This error occurs during installation and compilation, usually because the C++ build tools or the setuptools version is too old. You can upgrade pip and setuptools as follows: - - ```shell -./python -m pip install --upgrade pip -./python -m pip install --upgrade setuptools -``` - - - ### Install and compile Python - - Use the following commands to download the installation package from the official website and extract it: - ```shell -wget https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tar.xz -tar Jxf Python-3.10.0.tar.xz -``` - Compile and install the corresponding Python package: - ```shell -cd Python-3.10.0 -./configure --prefix=/usr/local/python3 -make -sudo make install -python3 --version -``` \ No newline at end of file diff --git a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Cluster-Deployment_timecho.md b/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Cluster-Deployment_timecho.md deleted file mode 100644 index fb039eae1..000000000 --- a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Cluster-Deployment_timecho.md +++ /dev/null @@ -1,565 +0,0 @@ - -# Cluster Deployment - -This section describes how to manually deploy an instance that includes 3 ConfigNodes and 3 DataNodes, commonly known as a 3C3D cluster. - -
- -
- -## Note - -1. Before installation, ensure that the system environment is fully configured by referring to [System configuration](./Environment-Requirements.md). - -2. It is recommended to use `hostname` rather than IP addresses in the configuration during deployment, which avoids the database failing to start if the host IP is changed later. To set the host name, configure /etc/hosts on the target server. For example, if the local IP is 192.168.1.3 and the host name is iotdb-1, you can use the following command to set the server's host name, and then configure the `cn_internal_address` and `dn_internal_address` of IoTDB using the host name. - ``` shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -3. Some parameters cannot be modified after the first startup. Please refer to the "Configuration" section below for settings. - -4. On both Linux and Windows, ensure that the IoTDB installation path does not contain spaces or Chinese characters, to avoid software exceptions. - -5. When installing and deploying IoTDB (including activating and using the software), the same user must be used for all operations. You can: -- Use the root user (recommended): Using the root user avoids issues such as permission problems. -- Use a fixed non-root user: - - Use the same user for all operations: Ensure that the same user is used for start, activation, stop, and other operations, and do not switch users. - - Avoid using sudo: Try to avoid sudo, because it executes commands with root privileges, which may cause confusion or security issues. - -6. It is recommended to deploy a monitoring panel, which can monitor important operational indicators and keep track of database operation status at any time. The monitoring panel can be obtained by contacting the business department. For deployment steps, refer to [Monitoring Panel Deployment](./Monitoring-panel-deployment.md). - -7.
Before installation, the health check tool can help inspect the operating environment of IoTDB nodes and obtain detailed inspection results. For usage of the IoTDB health check tool, see [Health Check Tool](../Tools-System/Health-Check-Tool.md). - -## Preparation Steps - -1. Prepare the IoTDB database installation package: iotdb-enterprise-{version}-bin.zip (the installation package can be obtained from [IoTDB-Package](../Deployment-and-Maintenance/IoTDB-Package_timecho.md)) -2. Configure the operating system environment according to the environment requirements (the system environment configuration can be found in [Environment Requirement](https://www.timecho.com/docs/UserGuide/latest/Deployment-and-Maintenance/Environment-Requirements.html)) - -### Pre-installation Check - -To ensure the IoTDB Enterprise Edition installation package you obtained is complete and authentic, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Find the "SHA512 Checksum" corresponding to each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/iotdb): - ```Bash - cd /data/iotdb - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-02.png) - -4. Compare the output with the official SHA512 checksum. Once you have confirmed that they match, you can proceed with the installation and deployment steps below.
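The comparison in step 4 can also be scripted instead of done by eye. The following is a minimal sketch, assuming `sha512sum` and `awk` are available and that the expected checksum has been copied from the Release History document; `verify_sha512` is a hypothetical helper name, not something shipped with IoTDB.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of IoTDB): compare a file's SHA512 digest
# with an expected value copied from the release notes.
verify_sha512() {
  local file="$1" expected="$2" actual
  [ -f "$file" ] || { echo "file not found: $file" >&2; return 2; }
  # sha512sum prints "<digest>  <file>"; keep only the digest.
  actual="$(sha512sum "$file" | awk '{print $1}')"
  # Normalise case, since a published checksum may be written in upper case.
  actual="$(printf '%s' "$actual" | tr '[:upper:]' '[:lower:]')"
  expected="$(printf '%s' "$expected" | tr '[:upper:]' '[:lower:]')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: checksum matches"
  else
    echo "MISMATCH: expected $expected, got $actual" >&2
    return 1
  fi
}
```

A call would pass the package file name and the published checksum as the two arguments. A non-zero return code means the package should not be installed.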
- -#### Notes: - -- If the verification results do not match, please contact Timecho Team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - -## Installation Steps - -Assuming there are three Linux servers now, the IP addresses and service roles are assigned as follows: - -| Node IP | Host Name | Service | -| ----------- | --------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode、DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode、DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode、DataNode | - -### Set Host Name - -On three machines, configure the host names separately. To set the host names, configure `/etc/hosts` on the target server. Use the following command: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### Configuration - -Unzip the installation package and enter the installation directory - -```Plain -unzip iotdb-enterprise-{version}-bin.zip -cd iotdb-enterprise-{version}-bin -``` - -#### Environment script configuration - -- `./conf/confignode-env.sh` configuration - - | **Configuration** | **Description** | **Default** | **Recommended value** | **Note** | - | :---------------- | :----------------------------------------------------------- | :---------- | :----------------------------------------------------------- | :---------------------------------- | - | MEMORY_SIZE | The total amount of memory that IoTDB ConfigNode nodes can use | Automatically calculated based on system memory, defaulting to 30% of the system memory. | Can be filled in as needed, and the system will allocate memory based on the filled in values | Save changes without immediate execution; modifications take effect after service restart. 
| - -- `./conf/datanode-env.sh` configuration - - | **Configuration** | **Description** | **Default** | **Recommended value** | **Note** | - | :---------------- | :----------------------------------------------------------- |:-----------------------------------------------------------------------------------------| :----------------------------------------------------------- | :---------------------------------- | - | MEMORY_SIZE | The total amount of memory that IoTDB DataNode nodes can use | Automatically calculated based on system memory, defaulting to 50% of the system memory. | Can be filled in as needed, and the system will allocate memory based on the filled in values | Save changes without immediate execution; modifications take effect after service restart. | - -#### General Configuration - -Open the general configuration file `./conf/iotdb-system.properties`,The following parameters can be set according to the deployment method: - -| **Configuration** | **Description** | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | -| ------------------------- | ------------------------------------------------------------ | -------------- | -------------- | -------------- | -| cluster_name | Cluster Name | defaultCluster | defaultCluster | defaultCluster | -| schema_replication_factor | The number of metadata replicas, the number of DataNodes should not be less than this number | 3 | 3 | 3 | -| data_replication_factor | The number of data replicas should not be less than this number of DataNodes | 2 | 2 | 2 | - -#### ConfigNode Configuration - -Open the ConfigNode configuration file `./conf/iotdb-system.properties`,Set the following parameters - -| **Configuration** | **Description** | **Default** | **Recommended value** | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | Note | -| ------------------- | ------------------------------------------------------------ | --------------- | ------------------------------------------------------------ | ------------- | ------------- | 
------------- | ---------------------------------------- | -| cn_internal_address | The address used by ConfigNode for communication within the cluster | 127.0.0.1 | The IPV4 address or host name of the server where it is located, and it is recommended to use host name | iotdb-1 | iotdb-2 | iotdb-3 | Cannot be modified after initial startup | -| cn_internal_port | The port used by ConfigNode for communication within the cluster | 10710 | 10710 | 10710 | 10710 | 10710 | Cannot be modified after initial startup | -| cn_consensus_port | The port used for ConfigNode replica group consensus protocol communication | 10720 | 10720 | 10720 | 10720 | 10720 | Cannot be modified after initial startup | -| cn_seed_config_node | The address of the ConfigNode that the node connects to when registering to join the cluster, `cn_internal_address:cn_internal_port` | 127.0.0.1:10710 | The first CongfigNode's `cn_internal-address: cn_internal_port` | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | Cannot be modified after initial startup | - -#### DataNode Configuration - -Open DataNode Configuration File `./conf/iotdb-system.properties`,Set the following parameters: - -| **Configuration** | **Description** | **Default** | **Recommended value** | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | Note | -| ------------------------------- | ------------------------------------------------------------ |-----------------|-----------------------------------------------------------------------------------------------------------------| ------------- | ------------- | ------------- | ---------------------------------------- | -| dn_rpc_address | The address of the client RPC service | 0.0.0.0 | The IPV4 address or host name of the server where it is located, and it is recommended to use the IPV4 address | iotdb-1 |iotdb-2 | iotdb-3 | Restarting the service takes effect | -| dn_rpc_port | The port of the client RPC service | 6667 | 6667 | 6667 | 6667 | 6667 | Restarting the service takes effect | -| 
dn_internal_address | The address used by DataNode for communication within the cluster | 127.0.0.1 | The IPV4 address or host name of the server where it is located, and it is recommended to use host name | iotdb-1 | iotdb-2 | iotdb-3 | Cannot be modified after initial startup | -| dn_internal_port | The port used by DataNode for communication within the cluster | 10730 | 10730 | 10730 | 10730 | 10730 | Cannot be modified after initial startup | -| dn_mpp_data_exchange_port | The port used by DataNode to receive data streams | 10740 | 10740 | 10740 | 10740 | 10740 | Cannot be modified after initial startup | -| dn_data_region_consensus_port | The port used by DataNode for data replica consensus protocol communication | 10750 | 10750 | 10750 | 10750 | 10750 | Cannot be modified after initial startup | -| dn_schema_region_consensus_port | The port used by DataNode for metadata replica consensus protocol communication | 10760 | 10760 | 10760 | 10760 | 10760 | Cannot be modified after initial startup | -| dn_seed_config_node | The address of the ConfigNode that the node connects to when registering to join the cluster, i.e. `cn_internal_address:cn_internal_port` | 127.0.0.1:10710 | The first ConfigNode's `cn_internal_address:cn_internal_port` | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | Cannot be modified after initial startup | - -> ❗️Attention: Editors such as VSCode Remote do not save configuration files automatically. Please ensure that the modified files are saved to disk, otherwise the configuration items will not take effect. - -### Start and Activate Database (Available since V1.3.4) - -#### Start ConfigNode - -Start the ConfigNode on iotdb-1 first, so that the seed ConfigNode is running before any other node, and then start the second and third ConfigNodes in sequence. - -```Bash -./start-confignode.sh -d # the "-d" parameter starts the node in the background -``` - -If the startup fails, please refer to [Common Questions](#common-questions).
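The start order described above (seed ConfigNode first, then the others) could be scripted roughly as follows. This is only a sketch under assumptions that this guide does not state: passwordless SSH to every host, the same installation path on each machine, and the host names `iotdb-1` to `iotdb-3` from the table above. `IOTDB_HOME`, `DRY_RUN`, and `start_confignodes` are illustrative names, not IoTDB settings.

```shell
#!/usr/bin/env bash
# Sketch only: start ConfigNodes one by one, seed node first.
# IOTDB_HOME and DRY_RUN are illustrative variables, not IoTDB settings.
IOTDB_HOME="${IOTDB_HOME:-/data/iotdb}"

start_confignodes() {
  # The first argument must be the seed ConfigNode host.
  local host cmd
  for host in "$@"; do
    cmd="cd ${IOTDB_HOME}/sbin && ./start-confignode.sh -d"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Print the command instead of running it.
      echo "ssh ${host} '${cmd}'"
    else
      # Stop at the first failure so later nodes never start before the seed.
      ssh "$host" "$cmd" || return 1
    fi
  done
}
```

Running `DRY_RUN=1 start_confignodes iotdb-1 iotdb-2 iotdb-3` prints the three SSH commands in order without contacting any host, which is a cheap way to review the sequence before a real start.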
- -#### Start DataNode - -Enter the `sbin` directory of iotdb and start three datanode nodes in sequence: - -```Bash -./start-datanode.sh -d #"- d" parameter will start in the background -``` - -#### Activate Database - -##### Activation via CLI - -- Enter the CLI of any node in the cluster - - ```SQL - ./sbin/start-cli.sh -``` - -- Obtain the machine codes of all nodes: - - - Execute the following command to get the machine codes required for activation: - - ```Bash - show system info - ``` - - - The machine codes of all nodes in the cluster will be displayed: - - ```Bash - +--------------------------------------------------------------+ - | SystemInfo| - +--------------------------------------------------------------+ - |01-TE5NLES4-UDDWCMYE,01-GG5NLES4-XXDWCMYE,01-FF5NLES4-WWWWCMYE| - +--------------------------------------------------------------+ - Total line number = 1 - It costs 0.030s - ``` - -- Copy the obtained machine codes and provide them to the Timecho team - -- The Timecho team will return an activation code, which normally corresponds to the order of the provided machine codes. 
Paste the complete activation code into the CLI for activation - - - Note: The activation code must be enclosed in ' symbols, as shown below: - - ```Bash - IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' - ``` - -### Start and Activate Database (Available before V1.3.4) - -#### Start ConfigNode - -Start the first confignode of IoTDB-1 first, ensuring that the seed confignode node starts first, and then start the second and third confignode nodes in sequence - -```Bash -./start-confignode.sh -d #"- d" parameter will start in the background -``` - -If the startup fails, please refer to [Common Questions](#common-questions). 
- -#### Activate Database - -##### Method 1: Activate file copy activation - -- After starting three confignode nodes in sequence, copy the `activation` folder of each machine and the `system_info` file of each machine to the Timecho staff; -- The staff will return the license files for each ConfigNode node, where 3 license files will be returned; -- Put the three license files into the `activation` folder of the corresponding ConfigNode node; - -##### Method 2: Activate Script Activation - -- Obtain the machine codes of three machines in sequence, enter the `sbin` directory of the installation directory, and execute the activation script `start activate.sh`: - - ```Bash - cd sbin - ./start-activate.sh - ``` - -- The following information is displayed, where the machine code of one machine is displayed: - - ```Bash - Please copy the system_info's content and send it to Timecho: - 01-KU5LDFFN-PNBEHDRH - Please enter license: - ``` - -- The other two nodes execute the activation script `start activate.sh` in sequence, and then copy the machine codes of the three machines obtained to the Timecho staff -- The staff will return 3 activation codes, which normally correspond to the order of the provided 3 machine codes. Please paste each activation code into the previous command line prompt `Please enter license:`, as shown below: - - ```Bash - Please enter license: - Jw+MmF+Atxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx5bAOXNeob5l+HO5fEMgzrW8OJPh26Vl6ljKUpCvpTiw== - License has been stored to sbin/../activation/license - Import completed. 
Please start cluster and excute 'show cluster' to verify activation status - ``` - -#### Start DataNode - - Enter the `sbin` directory of IoTDB and start the three DataNodes in sequence: - -```Bash -./start-datanode.sh -d # the "-d" parameter starts the node in the background -``` - -### Verify Deployment - -Run the CLI startup script in the `./sbin` directory: - -```Plain -./start-cli.sh -h ip(local IP or domain name) -p port(6667) -``` - - After a successful startup, the following interface appears, indicating that IoTDB was installed successfully. - -![](/img/%E4%BC%81%E4%B8%9A%E7%89%88%E6%88%90%E5%8A%9F.png) - -After the installation success interface appears, use the `show cluster` command to check whether the activation succeeded. - -When `Activated` is displayed on the far right, the activation was successful. - -![](/img/%E4%BC%81%E4%B8%9A%E7%89%88%E6%BF%80%E6%B4%BB.png) - -In the CLI, you can also check the activation status by running the `show activation` command; the example below shows a status of ACTIVATED, indicating successful activation. - -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -> `ACTIVATED (W)` indicates passive activation, which means that this ConfigNode does not have a license file (or has not been issued the latest license file with a timestamp), and its activation depends on other activated ConfigNodes in the cluster. In this case, it is recommended to check whether the license file has been placed in the license folder. If not, please place the license file.
If a license file already exists, it may be due to inconsistency between the license file of this node and the information of other nodes. Please contact Timecho staff to reapply. - - -### One-click Cluster Start and Stop - -#### Overview - -Within the root directory of IoTDB, the `sbin `subdirectory houses the `start-all.sh` and `stop-all.sh` scripts, which work in concert with the `iotdb-cluster.properties` configuration file located in the `conf` subdirectory. This synergy enables the one-click initiation or termination of all nodes within the cluster from a single node. This approach facilitates efficient management of the IoTDB cluster's lifecycle, streamlining the deployment and operational maintenance processes. - -This following section will introduce the specific configuration items in the `iotdb-cluster.properties` file. - -#### Configuration Items - -> Note: -> -> * When the cluster changes, this configuration file needs to be manually updated. -> * If the `iotdb-cluster.properties` configuration file is not set up and the `start-all.sh` or `stop-all.sh` scripts are executed, the scripts will, by default, start or stop the ConfigNode and DataNode nodes located in the IOTDB\_HOME directory where the scripts reside. -> * It is recommended to configure SSH passwordless login: If not configured, the script will prompt for the server password after execution to facilitate subsequent start, stop, or destroy operations. If already configured, there is no need to enter the server password during script execution. - -* confignode\_address\_list - -| **Name** | **confignode\_address\_list** | -| :----------------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Description | A list of IP addresses of the hosts where the ConfigNodes to be started/stopped are located. If there are multiple, they should be separated by commas. 
| -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* datanode\_address\_list - -| **Name** | **datanode\_address\_list** | -| :----------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Description | A list of IP addresses of the hosts where the DataNodes to be started/stopped are located. If there are multiple, they should be separated by commas. | -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* ssh\_account - -| **Name** | **ssh\_account** | -| :----------------: | :------------------------------------------------------------------------------------------------- | -| Description | The username used to log in to the target hosts via SSH. All hosts must have the same username. | -| Type | String | -| Default | root | -| Effective | After restarting the system | - -* ssh\_port - -| **Name** | **ssh\_port** | -| :----------------: | :---------------------------------------------------------------------------------- | -| Description | The SSH port exposed by the target hosts. All hosts must have the same SSH port. | -| Type | int | -| Default | 22 | -| Effective | After restarting the system | - -* confignode\_deploy\_path - -| **Name** | **confignode\_deploy\_path** | -| :----------------: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Description | The path on the target hosts where all ConfigNodes to be started/stopped are located. All ConfigNodes must be in the same directory on their respective hosts. 
| -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* datanode\_deploy\_path - -| **Name** | **datanode\_deploy\_path** | -| :----------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| Description | The path on the target hosts where all DataNodes to be started/stopped are located. All DataNodes must be in the same directory on their respective hosts. | -| Type | String | -| Default | None | -| Effective | After restarting the system | - - - - - -## Node Maintenance Steps - -### ConfigNode Node Maintenance - -ConfigNode node maintenance is divided into two types of operations: adding and removing ConfigNodes, with two common use cases: -- Cluster expansion: For example, when there is only one ConfigNode in the cluster, and you want to increase the high availability of ConfigNode nodes, you can add two ConfigNodes, making a total of three ConfigNodes in the cluster. -- Cluster failure recovery: When the machine where a ConfigNode is located fails, making the ConfigNode unable to run normally, you can remove this ConfigNode and then add a new ConfigNode to the cluster. - -> ❗️Note, after completing ConfigNode node maintenance, you need to ensure that there are 1 or 3 ConfigNodes running normally in the cluster. Two ConfigNodes do not have high availability, and more than three ConfigNodes will lead to performance loss. 
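The rule in the note above (1 or 3 ConfigNodes running) can be checked from the output of `show confignodes`. The following is a minimal sketch, assuming the tabular output format used in this section is piped in on stdin; `count_running_confignodes` and `check_confignode_count` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of IoTDB): inspect `show confignodes` output.

# Count rows whose Status column is exactly "Running".
count_running_confignodes() {
  # Data rows look like: |     0|Running|      127.0.0.1|       10710|  Leader|
  awk -F'|' 'NF >= 6 && $2 ~ /^ *[0-9]+ *$/ && $3 ~ /^ *Running *$/ { n++ } END { print n + 0 }'
}

# Succeed only when 1 or 3 ConfigNodes are running.
check_confignode_count() {
  local n
  n="$(count_running_confignodes)"
  case "$n" in
    1|3) echo "OK: $n ConfigNode(s) running" ;;
    *)   echo "WARN: $n ConfigNode(s) running; expected 1 or 3" >&2; return 1 ;;
  esac
}
```

The input can come from a saved copy of the CLI output. If your CLI version supports running a single statement non-interactively, its output could be piped in directly, but that is an assumption to verify against your installation.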
- -#### Adding ConfigNode Nodes - -Script command: -```shell -# Linux / MacOS -# First switch to the IoTDB root directory -sbin/start-confignode.sh - -# Windows -# First switch to the IoTDB root directory -sbin/start-confignode.bat -``` - -Parameter introduction: - -| Parameter | Description | Is it required | -| :--- | :--------------------------------------------- | :----------- | -| -v | Show version information | No | -| -f | Run the script in the foreground, do not put it in the background | No | -| -d | Start in daemon mode, i.e. run in the background | No | -| -p | Specify a file to store the process ID for process management | No | -| -c | Specify the path to the configuration file folder, the script will load the configuration file from here | No | -| -g | Print detailed garbage collection (GC) information | No | -| -H | Specify the path of the Java heap dump file, used when JVM memory overflows | No | -| -E | Specify the path of the JVM error log file | No | -| -D | Define system properties, in the format key=value | No | -| -X | Pass -XX parameters directly to the JVM | No | -| -h | Help instruction | No | - -#### Removing ConfigNode Nodes - -First connect to the cluster through the CLI and confirm the internal address and port number of the ConfigNode you want to remove by using `show confignodes`: - -```Bash -IoTDB> show confignodes -+------+-------+---------------+------------+--------+ -|NodeID| Status|InternalAddress|InternalPort| Role| -+------+-------+---------------+------------+--------+ -| 0|Running| 127.0.0.1| 10710| Leader| -| 1|Running| 127.0.0.1| 10711|Follower| -| 2|Running| 127.0.0.1| 10712|Follower| -+------+-------+---------------+------------+--------+ -Total line number = 3 -It costs 0.030s -``` - -Then use the script to remove the ConfigNode. 
Script command: - -```Bash -# Linux / MacOS -sbin/remove-confignode.sh [confignode_id] - -#Windows -sbin/remove-confignode.bat [confignode_id] - -``` - -### DataNode Node Maintenance - -There are two common scenarios for DataNode node maintenance: - -- Cluster expansion: For the purpose of expanding cluster capabilities, add new DataNodes to the cluster -- Cluster failure recovery: When a machine where a DataNode is located fails, making the DataNode unable to run normally, you can remove this DataNode and add a new DataNode to the cluster - -> ❗️Note, in order for the cluster to work normally, during the process of DataNode node maintenance and after the maintenance is completed, the total number of DataNodes running normally should not be less than the number of data replicas (usually 2), nor less than the number of metadata replicas (usually 3). - -#### Adding DataNode Nodes - -Script command: - -```Bash -# Linux / MacOS -# First switch to the IoTDB root directory -sbin/start-datanode.sh - -# Windows -# First switch to the IoTDB root directory -sbin/start-datanode.bat -``` - -Parameter introduction: - -| Abbreviation | Description | Is it required | -| :--- | :--------------------------------------------- | :----------- | -| -v | Show version information | No | -| -f | Run the script in the foreground, do not put it in the background | No | -| -d | Start in daemon mode, i.e. 
run in the background | No | -| -p | Specify a file to store the process ID for process management | No | -| -c | Specify the path to the configuration file folder, the script will load the configuration file from here | No | -| -g | Print detailed garbage collection (GC) information | No | -| -H | Specify the path of the Java heap dump file, used when JVM memory overflows | No | -| -E | Specify the path of the JVM error log file | No | -| -D | Define system properties, in the format key=value | No | -| -X | Pass -XX parameters directly to the JVM | No | -| -h | Help instruction | No | - -Note: After adding a DataNode, as new writes arrive (and old data expires, if TTL is set), the cluster load will gradually balance towards the new DataNode, eventually achieving a balance of storage and computation resources on all nodes. - -#### Removing DataNode Nodes - -First connect to the cluster through the CLI and confirm the RPC address and port number of the DataNode you want to remove with `show datanodes`: - -```Bash -IoTDB> show datanodes -+------+-------+----------+-------+-------------+---------------+ -|NodeID| Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum| -+------+-------+----------+-------+-------------+---------------+ -| 1|Running| 0.0.0.0| 6667| 0| 0| -| 2|Running| 0.0.0.0| 6668| 1| 1| -| 3|Running| 0.0.0.0| 6669| 1| 0| -+------+-------+----------+-------+-------------+---------------+ -Total line number = 3 -It costs 0.110s -``` - -Then use the script to remove the DataNode. Script command: - -```Bash -# Linux / MacOS -sbin/remove-datanode.sh [datanode_id] - -#Windows -sbin/remove-datanode.bat [datanode_id] -``` - -## Common Questions -1. Multiple prompts indicating activation failure during deployment process - - Use the `ls -al` command: Use the `ls -al` command to check if the owner information of the installation package root directory is the current user. 
- - Check activation directory: Check all files in the `./activation` directory and whether the owner information is the current user. - -2. Confignode failed to start - - Step 1: Please check the startup log to see if any parameters that cannot be changed after the first startup have been modified. - - Step 2: Please check the startup log for any other abnormalities. If there are any abnormal phenomena in the log, please contact Timecho Technical Support personnel for consultation on solutions. - - Step 3: If it is the first deployment or data can be deleted, you can also clean up the environment according to the following steps, redeploy, and restart. - - Step 4: Clean up the environment: - - a. Terminate all ConfigNode Node and DataNode processes. - ```Bash - # 1. Stop the ConfigNode and DataNode services - sbin/stop-standalone.sh - - # 2. Check for any remaining processes - jps - # Or - ps -ef|grep iotdb - - # 3. If there are any remaining processes, manually kill the - kill -9 - # If you are sure there is only one iotdb on the machine, you can use the following command to clean up residual processes - ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9 - ``` - b. Delete the data and logs directories. - - Explanation: Deleting the data directory is necessary, deleting the logs directory is for clean logs and is not mandatory. 
- - ```Bash - cd /data/iotdb - rm -rf data logs - ``` \ No newline at end of file diff --git a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Docker-Deployment_timecho.md b/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Docker-Deployment_timecho.md deleted file mode 100644 index 0c22cc530..000000000 --- a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Docker-Deployment_timecho.md +++ /dev/null @@ -1,496 +0,0 @@ - -# Docker Deployment - -## Environmental Preparation - -### Docker Installation - -```Bash -#Taking Ubuntu as an example, other operating systems can search for installation methods themselves -#step1: Install some necessary system tools -sudo apt-get update -sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common -#step2: Install GPG certificate -curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add - -#step3: Write software source information -sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" -#step4: Update and install Docker CE -sudo apt-get -y update -sudo apt-get -y install docker-ce -#step5: Set Docker to start automatically upon startup -sudo systemctl enable docker -#step6: Verify if Docker installation is successful -docker --version #Display version information, indicating successful installation -``` - -### Docker-compose Installation - -```Bash -#Installation command -curl -L "https://github.com/docker/compose/releases/download/v2.20.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose -chmod +x /usr/local/bin/docker-compose -ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose -#Verify if the installation was successful -docker-compose --version #Displaying version information indicates successful installation -``` - -### Install The Dmidecode Plugin - -By default, Linux servers should already be installed. If not, you can use the following command to install them. 
- -```Bash -sudo apt-get install dmidecode -``` - -After installing dmidecode, find its installation path with `whereis dmidecode`. Assuming the result is `/usr/sbin/dmidecode`, remember this path, as it is used later in the docker-compose yml file. - -### Get Container Image Of IoTDB - -You can contact business or technical support to obtain container images for IoTDB Enterprise Edition. - -## Stand-Alone Deployment - -This section demonstrates how to deploy a standalone Docker version of 1C1D. - -### Load Image File - -For example, the container image file name of IoTDB obtained here is: `iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz` - -Load image: - -```Bash -docker load -i iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz -``` - -View image: - -```Bash -docker images -``` - -![](/img/%E5%8D%95%E6%9C%BA-%E6%9F%A5%E7%9C%8B%E9%95%9C%E5%83%8F.png) - -### Create Docker Bridge Network - -```Bash -docker network create --driver=bridge --subnet=172.18.0.0/16 --gateway=172.18.0.1 iotdb -``` - -### Write The Yml File For docker-compose - -Here we take the example of placing the IoTDB installation directory and the yml file together in the `/docker-iotdb` folder: - -The file directory structure is: `/docker-iotdb/iotdb`, `/docker-iotdb/docker-compose-standalone.yml` - -```Bash -docker-iotdb: -├── iotdb #IoTDB installation directory -└── docker-compose-standalone.yml #YML file for standalone Docker Compose -``` - -The complete docker-compose-standalone.yml content is as follows: - -```Bash -version: "3" -services: - iotdb-service: - image: iotdb-enterprise:1.3.2.3-standalone #The image used - hostname: iotdb - container_name: iotdb - restart: always - ports: - - "6667:6667" - environment: - - cn_internal_address=iotdb - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb:10710 - - dn_rpc_address=iotdb - - dn_internal_address=iotdb - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - 
dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - dn_seed_config_node=iotdb:10710 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - networks: - iotdb: - ipv4_address: 172.18.0.6 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -networks: - iotdb: - external: true -``` - -### First Launch - -Use the following command to start: - -```Bash -cd /docker-iotdb -docker-compose -f docker-compose-standalone.yml up -``` - -Because IoTDB has not been activated yet, it is normal for the container to exit immediately on the first startup. The purpose of the first startup is to generate the machine code file used in the subsequent activation process. - -![](/img/%E5%8D%95%E6%9C%BA-%E6%BF%80%E6%B4%BB.png) - -### Apply For Activation - -- After the first startup, a system_info file is generated in the physical machine directory `/docker-iotdb/iotdb/activation`. Send this file to the Timecho staff. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- After receiving the license file returned by the staff, copy it to the `/docker-iotdb/iotdb/activation` folder.
- - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -### Restart IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -![](/img/%E5%90%AF%E5%8A%A8iotdb.png) - -### Validate Deployment - -- View the log; the following line indicates a successful startup - - ```Bash - docker logs -f iotdb #View log command (the container is named iotdb in docker-compose-standalone.yml) - 2024-07-19 12:02:32,608 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! - ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B21.png) - -- Enter the container to view the service running status and activation information - - View the launched container - - ```Bash - docker ps - ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B22.png) - - Enter the container, log in to the database through the CLI, and use the `show cluster` command to view the service status and activation status - - ```Bash - docker exec -it iotdb /bin/bash #Entering the container - ./start-cli.sh -h iotdb #Log in to the database - IoTDB> show cluster #View status - ``` - - You can see that all services are running and the activation status shows as activated.
- - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B23.png) - -### Map /conf Directory (optional) - -If you want to modify the configuration files directly on the physical machine in the future, you can map the `/conf` folder in the container in three steps: - -Step 1: Copy the `/conf` directory from the container to `/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -Step 2: Add mappings in docker-compose-standalone.yml - -```Bash - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -Step 3: Restart IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -## Cluster Deployment - -This section describes how to manually deploy an instance that includes 3 ConfigNodes and 3 DataNodes, commonly known as a 3C3D cluster.
- -
- -**Note: The cluster version currently only supports host and overlay networks, and does not support bridge networks.** - -Taking the host network as an example, we will demonstrate how to deploy a 3C3D cluster. - -### Set Host Name - -Assuming there are currently three Linux servers, the IP addresses and service role assignments are as follows: - -| Node IP | Host Name | Service | -| ----------- | --------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode, DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode, DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode, DataNode | - -Configure the host names on each of the three machines. To set the host names, configure `/etc/hosts` on the target server using the following command: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### Load Image File - -For example, the container image file name obtained for IoTDB is: `iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz` - -Execute the load image command on each of the three servers: - -```Bash -docker load -i iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz -``` - -View image: - -```Bash -docker images -``` - -![](/img/%E9%95%9C%E5%83%8F%E5%8A%A0%E8%BD%BD.png) - -### Write The Yml File For Docker Compose - -Here we take the example of placing the IoTDB installation directory and the yml files together in the `/docker-iotdb` folder: - -The file directory structure is: `/docker-iotdb/iotdb`, `/docker-iotdb/confignode.yml`, `/docker-iotdb/datanode.yml` - -```Bash -docker-iotdb: -├── confignode.yml #Yml file of confignode -├── datanode.yml #Yml file of datanode -└── iotdb #IoTDB installation directory -``` - -On each server, two yml files need to be written, namely `confignode.yml` and `datanode.yml`.
The example of yml is as follows: - -**confignode.yml:** - -```Bash -#confignode.yml -version: "3" -services: - iotdb-confignode: - image: iotdb-enterprise:1.3.2.3-standalone #The image used - hostname: iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - container_name: iotdb-confignode - command: ["bash", "-c", "entrypoint.sh confignode"] - restart: always - environment: - - cn_internal_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb-1:10710 #The default first node is the seed node - - schema_replication_factor=3 #Number of metadata copies - - data_replication_factor=2 #Number of data replicas - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #Using the host network - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -**datanode.yml:** - -```Bash -#datanode.yml -version: "3" -services: - iotdb-datanode: - image: iotdb-enterprise:1.3.2.3-standalone #The image used - hostname: iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - container_name: iotdb-datanode - command: ["bash", "-c", "entrypoint.sh datanode"] - restart: always - ports: - - "6667:6667" - privileged: true - environment: - - dn_rpc_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - dn_internal_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - dn_seed_config_node=iotdb-1:10710 #The default first node is the seed node - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - schema_replication_factor=3 #Number of metadata copies - - data_replication_factor=2 #Number of data replicas - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #Using the host network - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -### Starting Confignode For The First Time - -First, start configNodes on each of the three servers to obtain the machine code. Pay attention to the startup order, start the first iotdb-1 first, then start iotdb-2 and iotdb-3. 
- -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d #Background startup -``` - -### Apply For Activation - -- After the three ConfigNodes are started for the first time, a system_info file is generated in the `/docker-iotdb/iotdb/activation` directory on each physical machine. Send the system_info files from all three servers to the Timecho staff; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- Put the three license files into the `/docker-iotdb/iotdb/activation` folder of the corresponding ConfigNode; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -- After the license is placed in the corresponding activation folder, the ConfigNode is activated automatically; no restart is needed - -### Start Datanode - -Start the DataNodes on each of the 3 servers - -```Bash -cd /docker-iotdb -docker-compose -f datanode.yml up -d #Background startup -``` - -![](/img/%E9%9B%86%E7%BE%A4%E7%89%88-dn%E5%90%AF%E5%8A%A8.png) - -### Validate Deployment - -- View the logs; the following line indicates that the DataNode has started successfully - - ```Bash - docker logs -f iotdb-datanode #View log command - 2024-07-20 16:50:48,937 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself!
- ``` - - ![](/img/dn%E5%90%AF%E5%8A%A8.png) - -- Enter any container to view the service running status and activation information - - View the launched container - - ```Bash - docker ps - ``` - - ![](/img/%E6%9F%A5%E7%9C%8B%E5%AE%B9%E5%99%A8.png) - - Enter the container, log in to the database through the CLI, and use the `show cluster` command to view the service status and activation status - - ```Bash - docker exec -it iotdb-datanode /bin/bash #Entering the container - ./start-cli.sh -h iotdb-1 #Log in to the database - IoTDB> show cluster #View status - ``` - - You can see that all services are running and the activation status shows as activated. - - ![](/img/%E9%9B%86%E7%BE%A4-%E6%BF%80%E6%B4%BB.png) - -### Map /conf Directory (optional) - -If you want to modify the configuration files directly on the physical machine in the future, you can map the `/conf` folder in the container in three steps: - -Step 1: Copy the `/conf` directory from the container to `/docker-iotdb/iotdb/conf` on each of the three servers - -```Bash -docker cp iotdb-confignode:/iotdb/conf /docker-iotdb/iotdb/conf -# or -docker cp iotdb-datanode:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -Step 2: Add `/conf` directory mapping in `confignode.yml` and `datanode.
yml` on 3 servers - -```Bash -#confignode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - -#datanode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -Step 3: Restart IoTDB on 3 servers - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d -docker-compose -f datanode.yml up -d -``` - diff --git a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md b/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md deleted file mode 100644 index 40c5e1d3d..000000000 --- a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md +++ /dev/null @@ -1,164 +0,0 @@ - -# Dual Active Deployment - -## What is a double active version? - -Dual active usually refers to two independent machines (or clusters) that perform real-time mirror synchronization. Their configurations are completely independent and can simultaneously receive external writes. Each independent machine (or cluster) can synchronize the data written to itself to another machine (or cluster), and the data of the two machines (or clusters) can achieve final consistency. - -- Two standalone machines (or clusters) can form a high availability group: when one of the standalone machines (or clusters) stops serving, the other standalone machine (or cluster) will not be affected. When the single machine (or cluster) that stopped the service is restarted, another single machine (or cluster) will synchronize the newly written data. 
Business applications can be bound to both standalone machines (or clusters) for read and write operations, thereby achieving high availability. -- The dual active deployment scheme allows for high availability with fewer than 3 physical nodes and has certain advantages in deployment costs. At the same time, the two standalone machines (or clusters) can be physically isolated from each other through dual-ring power and network supplies, ensuring stable operation. -- At present, the dual active capability is a feature of the enterprise version. - -![](/img/20240731104336.png) - -## Note - -1. It is recommended to use `hostname` rather than a raw IP in the configuration during deployment, to avoid database failures caused by changing the host IP at a later stage. To set the hostname, you need to configure `/etc/hosts` on the target server. If the local IP is 192.168.1.3 and the hostname is iotdb-1, you can use the following command to set the server's hostname, and then configure IoTDB's `cn_internal_address` and `dn_internal_address` using the hostname. - - ```Bash - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -2. Some parameters cannot be modified after the first startup; please refer to the "Installation Steps" section below to set them. - -3. We recommend deploying a monitoring panel, which can monitor important operational indicators and keep track of database operation status at any time. The monitoring panel can be obtained by contacting the business department. For the deployment steps, see [Monitoring Panel Deployment](https://www.timecho.com/docs/UserGuide/latest/Deployment-and-Maintenance/Monitoring-panel-deployment.html) - -## Installation Steps - -Taking a dual active IoTDB built from two standalone machines A and B as an example, the IP addresses of A and B are 192.168.1.3 and 192.168.1.4, respectively. Here, we use hostnames to represent the different hosts.
The plan is as follows: - -| Machine | Machine IP | Host Name | -| ------- | ----------- | --------- | -| A | 192.168.1.3 | iotdb-1 | -| B | 192.168.1.4 | iotdb-2 | - -### Step1:Install Two Independent IoTDBs Separately - -Install IoTDB on the two machines separately. For the standalone version, refer to [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md). For the cluster version, refer to [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md). **It is recommended that the configurations of clusters A and B remain consistent to achieve the best dual active effect** - -### Step2:Create A Data Synchronization Task On Machine A To Machine B - -- Create a data synchronization process on machine A, where the data on machine A is automatically synchronized to machine B. Use the cli tool in the sbin directory to connect to the IoTDB database on machine A: - - ```Bash - ./sbin/start-cli.sh -h iotdb-1 - ``` - -- Create and start the data synchronization pipe with the following SQL: - - ```Bash - create pipe AB - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-2', - 'sink.port'='6667' - ) - ``` - -- Note: To avoid infinite data loops, it is necessary to set the parameter `source.forwarding-pipe-requests` on both A and B to `false`, indicating that data transmitted from another pipe will not be forwarded. - -### Step3:Create A Data Synchronization Task On Machine B To Machine A - -- Create a data synchronization process on machine B, where the data on machine B is automatically synchronized to machine A.
Use the cli tool in the sbin directory to connect to the IoTDB database on machine B: - - ```Bash - ./sbin/start-cli.sh -h iotdb-2 - ``` - -- Create and start the pipe with the following SQL: - - ```Bash - create pipe BA - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-1', - 'sink.port'='6667' - ) - ``` - -- Note: To avoid infinite data loops, it is necessary to set the parameter `source.forwarding-pipe-requests` on both A and B to `false`, indicating that data transmitted from another pipe will not be forwarded. - -### Step4:Validate Deployment - -After the above data synchronization processes are created, the dual active cluster is ready for use. - -#### Check the running status of the cluster - -```Bash -#Execute the show cluster command on each of the two nodes to check the status of the IoTDB service -show cluster -``` - -**Machine A**: - -![](/img/%E5%8F%8C%E6%B4%BB-A.png) - -**Machine B**: - -![](/img/%E5%8F%8C%E6%B4%BB-B.png) - -Ensure that every ConfigNode and DataNode is in the Running state. - -#### Check synchronization status - -- Check the synchronization status on machine A - -```Bash -show pipes -``` - -![](/img/show%20pipes-A.png) - -- Check the synchronization status on machine B - -```Bash -show pipes -``` - -![](/img/show%20pipes-B.png) - -Ensure that every pipe is in the RUNNING state.
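
The pipe check above can also be scripted. The following is a minimal sketch, not part of the product tooling: `all_pipes_running` is a hypothetical helper that reads the table printed by `show pipes` on standard input and succeeds only if every data row shows `RUNNING`. It assumes the CLI prints a `|`-delimited table whose header row contains a column named `State`; adjust the column name if your version prints something different.

```Bash
#!/usr/bin/env bash
# Hypothetical helper (an illustration, not an IoTDB script).
# Reads a "|"-delimited CLI table on stdin.
# Exit status: 0 = every row is RUNNING, 1 = some row is not (or no rows),
#              2 = no "State" column was found in the header row.
all_pipes_running() {
  awk -F'|' '
    /^\|/ {
      if (!col) {                         # first table row is the header
        for (i = 2; i < NF; i++) {
          name = $i
          gsub(/^ +| +$/, "", name)       # trim padding around the header name
          if (name == "State") col = i
        }
        if (!col) exit 2
        next
      }
      rows++
      state = $col
      gsub(/^ +| +$/, "", state)          # trim padding around the cell value
      if (state != "RUNNING") bad++
    }
    END {
      if (!col) exit 2
      exit ((rows > 0 && bad == 0) ? 0 : 1)
    }
  '
}
```

One possible way to use it, assuming your CLI version supports the `-e` option for running a single statement: `./sbin/start-cli.sh -h iotdb-1 -e "show pipes" | all_pipes_running && echo "all pipes running"`. Run the same check against `iotdb-2` for machine B.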
- -### Step5:Stop Dual Active Version IoTDB - -- Execute the following command on machine A: - - ```SQL - ./sbin/start-cli.sh -h iotdb-1 #Log in to CLI - IoTDB> stop pipe AB #Stop the data synchronization process - ./sbin/stop-standalone.sh #Stop database service - ``` - -- Execute the following command on machine B: - - ```SQL - ./sbin/start-cli.sh -h iotdb-2 #Log in to CLI - IoTDB> stop pipe BA #Stop the data synchronization process - ./sbin/stop-standalone.sh #Stop database service - ``` - diff --git a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/IoTDB-Package_timecho.md b/src/UserGuide/V1.3.x/Deployment-and-Maintenance/IoTDB-Package_timecho.md deleted file mode 100644 index 86e0af2aa..000000000 --- a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/IoTDB-Package_timecho.md +++ /dev/null @@ -1,42 +0,0 @@ - -# Obtain TimechoDB -## How to obtain TimechoDB -The enterprise version installation package can be obtained through product trial application or by directly contacting the business personnel who are in contact with you. - -## Installation Package Structure -Install the package after decompression(iotdb-enterprise-{version}-bin.zip),The directory structure after unpacking the installation package is as follows: -| **catalogue** | **Type** | **Explanation** | -| :--------------: | -------- | ------------------------------------------------------------ | -| activation | folder | The directory where the activation file is located, including the generated machine code and the enterprise version activation code obtained from the business side (this directory will only be generated after starting ConfigNode to obtain the activation code) | -| conf | folder | Configuration file directory, including configuration files such as ConfigNode, DataNode, JMX, and logback | -| data | folder | The default data file directory contains data files for ConfigNode and DataNode. 
(The directory will only be generated after starting the program) | -| lib | folder | IoTDB executable library file directory | -| licenses | folder | Open source community certificate file directory | -| logs | folder | The default log file directory, which includes log files for ConfigNode and DataNode (this directory will only be generated after starting the program) | -| sbin | folder | Main script directory, including start, stop, and other scripts | -| tools | folder | Directory of System Peripheral Tools | -| ext | folder | Related files for pipe, trigger, and UDF plugins (created by the user when needed) | -| LICENSE | file | certificate | -| NOTICE | file | Tip | -| README_ZH\.md | file | Explanation of the Chinese version in Markdown format | -| README\.md | file | Instructions for use | -| RELEASE_NOTES\.md | file | Version Description | diff --git a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Kubernetes_timecho.md b/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Kubernetes_timecho.md deleted file mode 100644 index 14b51ab84..000000000 --- a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Kubernetes_timecho.md +++ /dev/null @@ -1,445 +0,0 @@ - - -# Kubernetes - -## 1. Environment Preparation - -### 1.1 Prepare a Kubernetes Cluster - -Ensure that you have an available Kubernetes cluster (minimum recommended version: Kubernetes 1.24) as the foundation for deploying the IoTDB cluster. - -Kubernetes Version Requirement: The recommended version is Kubernetes 1.24 or above. - -IoTDB Version Requirement: The version of TimechoDB must not be lower than v1.3.3.2. - -## 2. Create Namespace - -### 2.1 Create Namespace - -> Note: Before executing the namespace creation operation, verify that the specified namespace name has not been used in the Kubernetes cluster. If the namespace already exists, the creation command will fail, which may lead to errors during the deployment process. 
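
The check described in the note can be folded into the creation step. A minimal sketch, where `ensure_namespace` is a hypothetical helper name (it is not part of kubectl or IoTDB): it refuses to continue if `kubectl get namespace` already finds the name, and otherwise creates the namespace.

```Bash
#!/usr/bin/env bash
# Hypothetical helper (an illustration only).
# Creates the namespace given as $1 unless it already exists.
ensure_namespace() {
  local ns="$1"
  # "kubectl get namespace" exits non-zero when the namespace is absent
  if kubectl get namespace "$ns" >/dev/null 2>&1; then
    echo "namespace $ns already exists; pick another name or reuse it deliberately" >&2
    return 1
  fi
  kubectl create namespace "$ns"
}
```

For the namespace used in this guide, the call would be `ensure_namespace iotdb-ns`.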
- -```Bash -kubectl create ns iotdb-ns -``` - -### 2.2 View Namespace - -```Bash -kubectl get ns -``` - -## 3. Create PersistentVolume (PV) - -### 3.1 Create PV Configuration File - -PV is used for persistent storage of IoTDB's ConfigNode and DataNode data. You need to create one PV for each node. - -> Note: One ConfigNode and one DataNode count as two nodes, requiring two PVs. - -For example, with 3 ConfigNodes and 3 DataNodes: - -1. Create a `pv.yaml` file and make six copies, renaming them to `pv01.yaml` through `pv06.yaml`. - -```Bash -# Create a directory to store YAML files -# Create pv.yaml file -touch pv.yaml -``` - -2. Modify the `name` and `path` in each file so that they are unique and consistent with each other (for example, `iotdb-pv-02` and `/data/k8s-data/iotdb-pv-02` in `pv02.yaml`). - -**pv.yaml Example:** - -```YAML -# pv.yaml -apiVersion: v1 -kind: PersistentVolume -metadata: - name: iotdb-pv-01 -spec: - capacity: - storage: 10Gi # Storage capacity - accessModes: # Access modes - - ReadWriteOnce - persistentVolumeReclaimPolicy: Retain # Reclaim policy - # Storage class name, if using local static storage, do not configure; if using dynamic storage, this must be set - storageClassName: local-storage - # Add the corresponding configuration based on your storage type - hostPath: # If using a local path - path: /data/k8s-data/iotdb-pv-01 - type: DirectoryOrCreate # If this line is not configured, you need to manually create the directory -``` - -### 3.2 Apply PV Configuration - -```Bash -kubectl apply -f pv01.yaml -kubectl apply -f pv02.yaml -... -``` - -### 3.3 View PV - -```Bash -kubectl get pv -``` - - -### 3.4 Manually Create Directories - -> Note: If the type in the hostPath of the YAML file is not configured, you need to manually create the corresponding directories. - -Create the corresponding directories on all Kubernetes nodes: -```Bash -mkdir -p /data/k8s-data/iotdb-pv-01 -mkdir -p /data/k8s-data/iotdb-pv-02 -... -``` - -## 4.
Install Helm - -For installation steps, please refer to the[Helm Official Website.](https://helm.sh/zh/docs/intro/install/) - -## 5. Configure IoTDB Helm Chart - -### 5.1 Clone IoTDB Kubernetes Deployment Code - -Please contact timechodb staff to obtain the IoTDB Helm Chart. If you encounter proxy issues, disable the proxy settings: - -### 5.2 Modify YAML Files - -> Ensure that the version used is supported (>=1.3.3.2): - -**values.yaml Example:** - -```YAML -nameOverride: "iotdb" -fullnameOverride: "iotdb" # Name after installation - -image: - repository: nexus.infra.timecho.com:8143/timecho/iotdb-enterprise - pullPolicy: IfNotPresent - tag: 1.3.3.2-standalone # Repository and version used - -storage: - # Storage class name, if using local static storage, do not configure; if using dynamic storage, this must be set - className: local-storage - -datanode: - name: datanode - nodeCount: 3 # Number of DataNode nodes - enableRestService: true - storageCapacity: 10Gi # Available space for DataNode - resources: - requests: - memory: 2Gi # Initial memory size for DataNode - cpu: 1000m # Initial CPU size for DataNode - limits: - memory: 4Gi # Maximum memory size for DataNode - cpu: 1000m # Maximum CPU size for DataNode - -confignode: - name: confignode - nodeCount: 3 # Number of ConfigNode nodes - storageCapacity: 10Gi # Available space for ConfigNode - resources: - requests: - memory: 512Mi # Initial memory size for ConfigNode - cpu: 1000m # Initial CPU size for ConfigNode - limits: - memory: 1024Mi # Maximum memory size for ConfigNode - cpu: 2000m # Maximum CPU size for ConfigNode - configNodeConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - schemaReplicationFactor: 3 - schemaRegionConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - dataReplicationFactor: 2 - dataRegionConsensusProtocolClass: org.apache.iotdb.consensus.iot.IoTConsensus -``` - -## 6. 
Configure Private Repository Information or Pre-Pull Images - -Configure the private repository information on k8s, or pre-pull the image, as a prerequisite for the next `helm install` step. - -Option one is to pull the available IoTDB images during `helm install`, while option two is to import the available IoTDB images into containerd in advance. - -### 6.1 [Option 1] Pull Image from Private Repository - -#### 6.1.1 Create a Secret to Allow k8s to Access the IoTDB Helm Private Repository - -Replace xxxxxx with the IoTDB private repository account, password, and email. - - - -```Bash -# Note the single quotes -kubectl create secret docker-registry timecho-nexus \ - --docker-server='nexus.infra.timecho.com:8143' \ - --docker-username='xxxxxx' \ - --docker-password='xxxxxx' \ - --docker-email='xxxxxx' \ - -n iotdb-ns - -# View the secret -kubectl get secret timecho-nexus -n iotdb-ns -# View and output as YAML -kubectl get secret timecho-nexus --output=yaml -n iotdb-ns -# View and decode (the stored value is base64-encoded) -kubectl get secret timecho-nexus --output="jsonpath={.data.\.dockerconfigjson}" -n iotdb-ns | base64 --decode -``` - -#### 6.1.2 Load the Secret as a Patch to the Namespace iotdb-ns - -```Bash -# Add a patch to include login information for nexus in this namespace -kubectl patch serviceaccount default -n iotdb-ns -p '{"imagePullSecrets": [{"name": "timecho-nexus"}]}' - -# View the information in this namespace -kubectl get serviceaccounts -n iotdb-ns -o yaml -``` - -### 6.2 [Option 2] Import Image - -This option is for scenarios where the customer cannot connect to the private repository and requires assistance from company implementation staff.
- -#### 6.2.1 Pull and Export the Image: - -```Bash -ctr images pull --user xxxxxxxx nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.2 View and Export the Image: - -```Bash -# View -ctr images ls - -# Export -ctr images export iotdb-enterprise:1.3.3.2-standalone.tar nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.3 Import into the k8s Namespace: - -> Note that k8s.io is the namespace for ctr in the example environment; importing to other namespaces will not work. - -```Bash -# Import into the k8s namespace -ctr -n k8s.io images import iotdb-enterprise:1.3.3.2-standalone.tar -``` - -#### 6.2.4 View the Image: - -```Bash -ctr --namespace k8s.io images list | grep 1.3.3.2 -``` - -## 7. Install IoTDB - -### 7.1 Install IoTDB - -```Bash -# Enter the directory -cd iotdb-cluster-k8s/helm - -# Install IoTDB -helm install iotdb ./ -n iotdb-ns -``` - -### 7.2 View Helm Installation List - -```Bash -# helm list -helm list -n iotdb-ns -``` - -### 7.3 View Pods - -```Bash -# View IoTDB pods -kubectl get pods -n iotdb-ns -o wide -``` - -After executing the command, if the output shows 6 Pods with confignode and datanode labels (3 each), it indicates a successful installation. Note that not all Pods may be in the Running state initially; inactive datanode Pods may keep restarting but will normalize after activation. - -### 7.4 Troubleshooting - -```Bash -# View k8s creation logs -kubectl get events -n iotdb-ns -watch kubectl get events -n iotdb-ns - -# Get detailed information -kubectl describe pod confignode-0 -n iotdb-ns -kubectl describe pod datanode-0 -n iotdb-ns - -# View ConfigNode logs -kubectl logs -n iotdb-ns confignode-0 -f -``` - -## 8. 
Activate IoTDB - -### 8.1 Option 1: Activate Directly in the Pod (Quickest) - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-1 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-2 -- /iotdb/sbin/start-activate.sh -# Obtain the machine code and proceed with activation -``` - -### 8.2 Option 2: Activate Inside the ConfigNode Container - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /bin/bash -cd /iotdb/sbin -/bin/bash start-activate.sh -# Obtain the machine code and proceed with activation -# Exit the container -``` - -### Option 3: Manual Activation - -1. View ConfigNode details to determine the node: - -```Bash -kubectl describe pod confignode-0 -n iotdb-ns | grep -e "Node:" -e "Path:" - -# Example output: -# Node: a87/172.20.31.87 -# Path: /data/k8s-data/env/confignode/.env -``` - -2. View PVC and find the corresponding Volume for ConfigNode to determine the path: - -```Bash -kubectl get pvc -n iotdb-ns | grep "confignode-0" -# Example output: -# map-confignode-confignode-0 Bound iotdb-pv-04 10Gi RWO local-storage 8h - -# To view multiple ConfigNodes, use the following: -for i in {0..2}; do echo confignode-$i; kubectl describe pod confignode-${i} -n iotdb-ns | grep -e "Node:" -e "Path:" -``` - -3. View the Detailed Information of the Corresponding Volume to Determine the Physical Directory Location: - - -```Bash -kubectl describe pv iotdb-pv-04 | grep "Path:" - -# Example output: -# Path: /data/k8s-data/iotdb-pv-04 -``` - -4. Locate the system-info file in the corresponding directory on the corresponding node, use this system-info as the machine code to generate an activation code, and create a new file named license in the same directory, writing the activation code into this file. - -## 9. 
Verify IoTDB - -### 9.1 Check the Status of Pods within the Namespace - -View the IP, status, and other information of the pods in the iotdb-ns namespace to ensure they are all running normally. - -```Bash -kubectl get pods -n iotdb-ns -o wide - -# Example output: -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` - -### 9.2 Check the Port Mapping within the Namespace - -```Bash -kubectl get svc -n iotdb-ns - -# Example output: -# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE -# confignode-svc NodePort 10.10.226.151 80:31026/TCP 7d8h -# datanode-svc NodePort 10.10.194.225 6667:31563/TCP 7d8h -# jdbc-balancer LoadBalancer 10.10.191.209 6667:31895/TCP 7d8h -``` - -### 9.3 Start the CLI Script on Any Server to Verify the IoTDB Cluster Status - -Use the port of jdbc-balancer and the IP of any k8s node. - -```Bash -start-cli.sh -h 172.20.31.86 -p 31895 -start-cli.sh -h 172.20.31.87 -p 31895 -start-cli.sh -h 172.20.31.88 -p 31895 -``` - - - -## 10. Scaling - -### 10.1 Add New PV - -Add a new PV; scaling is only possible with available PVs. - - - -**Note: DataNode cannot join the cluster after restart** - -**Reason**:The static storage hostPath mode is configured, and the script modifies the `iotdb-system.properties` file to set `dn_data_dirs` to `/iotdb6/iotdb_data,/iotdb7/iotdb_data`. However, the default storage path `/iotdb/data` is not mounted, leading to data loss upon restart. -**Solution**:Mount the `/iotdb/data` directory as well, and ensure this setting is applied to both ConfigNode and DataNode to maintain data integrity and cluster stability. 
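
To check for this condition before restarting, it can help to list the directories configured in `dn_data_dirs` and confirm that each one, plus the default `/iotdb/data`, is backed by a mounted volume. The following is a minimal sketch, not part of the IoTDB tooling: `list_data_dirs` is a hypothetical helper, and the location of `iotdb-system.properties` inside the pod (`/iotdb/conf/`) is an assumption that may differ in your image.

```shell
# list_data_dirs is a hypothetical helper: it prints every directory listed in
# the dn_data_dirs entry of a properties file, one directory per line.
list_data_dirs() {
  grep -E '^dn_data_dirs=' "$1" | tail -n 1 | cut -d '=' -f 2 | tr ',' '\n'
}

# Example usage (the file path inside the pod is an assumption; adjust as needed):
#   kubectl exec -n iotdb-ns datanode-0 -- cat /iotdb/conf/iotdb-system.properties > iotdb-system.properties
#   list_data_dirs iotdb-system.properties
#   kubectl exec -n iotdb-ns datanode-0 -- df -h /iotdb/data
```

Any directory printed by `list_data_dirs` that `df` reports as living on the container's root filesystem is not persisted and will be lost when the pod restarts.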
- -### 10.2 Scale ConfigNode - -Example: Scale from 3 ConfigNodes to 4 ConfigNodes - -Modify the values.yaml file in iotdb-cluster-k8s/helm to change the number of ConfigNodes from 3 to 4. - -```Shell -helm upgrade iotdb . -n iotdb-ns -``` - - - - -### 10.3 Scale DataNode - -Example: Scale from 3 DataNodes to 4 DataNodes - -Modify the values.yaml file in iotdb-cluster-k8s/helm to change the number of DataNodes from 3 to 4. - -```Shell -helm upgrade iotdb . -n iotdb-ns -``` - -### 10.4 Verify IoTDB Status - -```Shell -kubectl get pods -n iotdb-ns -o wide - -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -# datanode-3 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` \ No newline at end of file diff --git a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md b/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md deleted file mode 100644 index 25fe15c59..000000000 --- a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md +++ /dev/null @@ -1,322 +0,0 @@ - -# Stand-Alone Deployment - -This chapter will introduce how to start an IoTDB standalone instance, which includes 1 ConfigNode and 1 DataNode (commonly known as 1C1D). - -## Matters Needing Attention - -1. Before installation, ensure that the system is complete by referring to [System configuration](./Environment-Requirements.md). - -2. It is recommended to prioritize using 'hostname' for IP configuration during deployment, which can avoid the problem of modifying the host IP in the later stage and causing the database to fail to start. 
   To set the host name, configure `/etc/hosts` on the target server. For example, if the local IP is 192.168.1.3 and the host name is iotdb-1, run the following command to set the server's host name, then use that host name for IoTDB's `cn_internal_address`, `dn_internal_address`, and `dn_rpc_address`.

   ```shell
   echo "192.168.1.3 iotdb-1" >> /etc/hosts
   ```

3. Some parameters cannot be modified after the first startup. Please refer to the "Parameter Configuration" section below for settings.

4. On both Linux and Windows, ensure that the IoTDB installation path contains no spaces or Chinese characters, to avoid software exceptions.

5. When installing and deploying IoTDB (including activating and using the software), use the same operating-system user for all operations. You can:
- Use the root user (recommended): this avoids permission issues.
- Use a fixed non-root user:
  - Use the same user throughout: start, activate, stop, and perform all other operations as the same user, and do not switch users.
  - Avoid sudo: sudo executes commands with root privileges, which may cause confusion or security issues.

6. It is recommended to deploy a monitoring panel, which tracks important operational indicators and shows the database's status at any time. The monitoring panel can be obtained by contacting the business department; for deployment steps, see [Monitoring Board Install and Deploy](./Monitoring-panel-deployment.md).

7. Before installation, the health check tool can inspect the operating environment of IoTDB nodes and produce detailed inspection results. For usage, see [Health Check Tool](../Tools-System/Health-Check-Tool.md).

## Installation Steps

### 1. 
Pre-installation Check - -To ensure the IoTDB Enterprise Edition installation package you obtained is complete and authentic, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Find the "SHA512 Checksum" corresponding to each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/iotdb): - ```Bash - cd /data/iotdb - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-02.png) - -4. Compare the output result with the official SHA512 checksum. Once confirmed that they match, you can proceed with the installation and deployment operations in accordance with the procedures below. - -#### Notes: - -- If the verification results do not match, please contact Timecho Team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - -### 2、Unzip the installation package and enter the installation directory - -```shell -unzip iotdb-enterprise-{version}-bin.zip -cd iotdb-enterprise-{version}-bin -``` - -### 2. 
Parameter Configuration - -#### Environment Script Configuration - -- ./conf/confignode-env.sh (./conf/confignode-env.bat) configuration - -| **Configuration** | **Description** | **Default** | **Recommended value** | Note | -| :---------------: | :----------------------------------------------------------: | :---------: | :----------------------------------------------------------: | :---------------------------------: | -| MEMORY_SIZE | The total amount of memory that IoTDB ConfigNode nodes can use | Automatically calculated based on system memory, defaulting to 30% of the system memory. | Can be filled in as needed, and the system will allocate memory based on the filled in values | Save changes without immediate execution; modifications take effect after service restart. | - -- ./conf/datanode-env.sh (./conf/datanode-env.bat) configuration - -| **Configuration** | **Description** | **Default** | **Recommended value** | **Note** | -| :---------: | :----------------------------------: |:----------------------------------------------------------------------------------------:| :----------------------------------------------: | :----------: | -| MEMORY_SIZE | The total amount of memory that IoTDB DataNode nodes can use | Automatically calculated based on system memory, defaulting to 50% of the system memory. | Can be filled in as needed, and the system will allocate memory based on the filled in values | Save changes without immediate execution; modifications take effect after service restart. | - -#### System General Configuration - -Open the general configuration file (./conf/iotdb-system. 
properties file) and set the following parameters: - -| **Configuration** | **Description** | **Default** | **Recommended value** | Note | -| :-----------------------: | :----------------------------------------------------------: | :------------: | :----------------------------------------------------------: | :---------------------------------------------------: | -| cluster_name | Cluster Name | defaultCluster | The cluster name can be set as needed, and if there are no special needs, the default can be kept | Support hot loading from V1.3.3, but it is not recommended to change the cluster name by manually modifying the configuration file | -| schema_replication_factor | Number of metadata replicas, set to 1 for the standalone version here | 1 | 1 | Default 1, cannot be modified after the first startup | -| data_replication_factor | Number of data replicas, set to 1 for the standalone version here | 1 | 1 | Default 1, cannot be modified after the first startup | - -#### ConfigNode Configuration - -Open the ConfigNode configuration file (./conf/iotdb-system. 
properties file) and set the following parameters: - -| **Configuration** | **Description** | **Default** | **Recommended value** | Note | -| :-----------------: | :----------------------------------------------------------: | :-------------: | :----------------------------------------------------------: | :--------------------------------------: | -| cn_internal_address | The address used by ConfigNode for communication within the cluster | 127.0.0.1 | The IPV4 address or host name of the server where it is located, and it is recommended to use host name | Cannot be modified after initial startup | -| cn_internal_port | The port used by ConfigNode for communication within the cluster | 10710 | 10710 | Cannot be modified after initial startup | -| cn_consensus_port | The port used for ConfigNode replica group consensus protocol communication | 10720 | 10720 | Cannot be modified after initial startup | -| cn_seed_config_node | The address of the ConfigNode that the node connects to when registering to join the cluster, cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | Cannot be modified after initial startup | - -#### DataNode Configuration - -Open the DataNode configuration file (./conf/iotdb-system. 
properties file) and set the following parameters: - -| **Configuration** | **Description** | **Default** | **Recommended value** | **Note** | -| :------------------------------ | :----------------------------------------------------------- | :-------------- |:----------------------------------------------------------------------------------------------------------------| :--------------------------------------- | -| dn_rpc_address | The address of the client RPC service | 0.0.0.0 | The IPV4 address or host name of the server where it is located, and it is recommended to use the IPV4 address | Restarting the service takes effect | -| dn_rpc_port | The port of the client RPC service | 6667 | 6667 | Restarting the service takes effect | -| dn_internal_address | The address used by DataNode for communication within the cluster | 127.0.0.1 | The IPV4 address or host name of the server where it is located, and it is recommended to use host name | Cannot be modified after initial startup | -| dn_internal_port | The port used by DataNode for communication within the cluster | 10730 | 10730 | Cannot be modified after initial startup | -| dn_mpp_data_exchange_port | The port used by DataNode to receive data streams | 10740 | 10740 | Cannot be modified after initial startup | -| dn_data_region_consensus_port | The port used by DataNode for data replica consensus protocol communication | 10750 | 10750 | Cannot be modified after initial startup | -| dn_schema_region_consensus_port | The port used by DataNode for metadata replica consensus protocol communication | 10760 | 10760 | Cannot be modified after initial startup | -| dn_seed_config_node | The ConfigNode address that the node connects to when registering to join the cluster, i.e. cn_internal-address: cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | Cannot be modified after initial startup | - -> ❗️Attention: Editors such as VSCode Remote do not have automatic configuration saving function. 
Please make sure the modified files are actually saved; otherwise the configuration items will not take effect.

### 3. Start and Activate Database (Available since V1.3.4)

#### 3.1 Start ConfigNode

Enter the sbin directory of IoTDB and start the ConfigNode:

```shell
./start-confignode.sh -d   # The "-d" parameter starts the process in the background
```
If the startup fails, please refer to [Common Questions](#common-questions).

#### 3.2 Start DataNode

Enter the sbin directory of IoTDB and start the DataNode:

```shell
./start-datanode.sh -d   # The "-d" parameter starts the process in the background
```

#### 3.3 Activate Database

##### Activation via CLI

- Enter the CLI:

```shell
./sbin/start-cli.sh
```

- Execute the following command to obtain the machine code required for activation:

```SQL
show system info
```

- Copy the returned machine code and provide it to the Timecho team:

```Bash
+--------------------------------------------------------------+
|                                                    SystemInfo|
+--------------------------------------------------------------+
|                                          01-TE5NLES4-UDDWCMYE|
+--------------------------------------------------------------+
Total line number = 1
It costs 0.030s
```

- Input the activation code returned by the Timecho team into the CLI using the following command.
  - Note: The activation code must be enclosed in single quotes (`'`), as shown:

```Bash
IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA==='
```

### 4. Start and Activate Database (Available before V1.3.4)

#### 4.1 Start ConfigNode

Enter the sbin directory of IoTDB and start the ConfigNode:

```shell
./start-confignode.sh -d   # The "-d" parameter starts the process in the background
```
If the startup fails, please refer to [Common Questions](#common-questions).
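
Before moving on to activation, it can be useful to confirm that the ConfigNode is actually accepting connections. The following is a minimal sketch, assuming bash (it relies on bash's `/dev/tcp` redirection) and the default `cn_internal_port` of 10710; `wait_for_port` is a hypothetical helper, not a script shipped with IoTDB.

```shell
# wait_for_port HOST PORT [ATTEMPTS]
# Hypothetical helper: retries once per second until a TCP connection to
# HOST:PORT succeeds, and returns 1 after ATTEMPTS failed tries (default 30).
wait_for_port() {
  local host="$1" port="$2" attempts="${3:-30}" tried=0
  # The subshell opens a TCP connection through bash's /dev/tcp and closes it on exit.
  until (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
    tried=$((tried + 1))
    if [ "$tried" -ge "$attempts" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}

# Example usage (10710 is the default cn_internal_port):
#   wait_for_port 127.0.0.1 10710 60 && echo "ConfigNode is listening"
```

An open port only shows that the process is listening; the startup log remains the authoritative place to look if the node misbehaves.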

#### 4.2 Activate Database

##### Method 1: Activation by Copying Files

- After starting the ConfigNode, enter the activation folder and send the systeminfo file to Timecho staff.
- Receive the license file returned by the staff.
- Place the license file in the activation folder of the corresponding node.

##### Method 2: Activation by Script

- To obtain the machine code required for activation, enter the sbin directory of the installation directory and run the activation script:

```shell
cd sbin
./start-activate.sh
```

- The following information is displayed. Copy the machine code (the string of characters) and send it to Timecho staff:

```shell
Please copy the system_info's content and send it to Timecho:
01-KU5LDFFN-PNBEHDRH
Please enter license:
```

- Enter the activation code returned by the staff at the `Please enter license:` prompt, as shown below:

```shell
Please enter license:
JJw+MmF+AtexsfgNGOFgTm83Bxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxm6pF+APW1CiXLTSijK9Qh3nsLgzrW8OJPh26Vl6ljKUpCvpTiw==
License has been stored to sbin/../activation/license
Import completed. Please start cluster and excute 'show cluster' to verify activation status
```

#### 4.3 Start DataNode

Enter the sbin directory of IoTDB and start the DataNode:

```shell
./start-datanode.sh -d   # The "-d" parameter starts the process in the background
```

### 5. Verify Deployment

Run the CLI startup script in the sbin directory directly:

```shell
# -h: local IP or domain name; -p: RPC port (6667 by default)
./start-cli.sh -h 127.0.0.1 -p 6667
```

After a successful start, the following interface appears, indicating that IoTDB was installed successfully.

![](/img/%E5%90%AF%E5%8A%A8%E6%88%90%E5%8A%9F.png)

After the installation success interface appears, run the `show cluster` command to check whether activation succeeded.

When "Activated" is displayed in the rightmost column, activation was successful.

![](/img/show%20cluster.png)

In the CLI, you can also check the activation status by running the `show activation` command; the example below shows a status of ACTIVATED, indicating successful activation.

```sql
IoTDB> show activation
+---------------+---------+-----------------------------+
|    LicenseInfo|    Usage|                        Limit|
+---------------+---------+-----------------------------+
|         Status|ACTIVATED|                            -|
|    ExpiredTime|        -|2026-04-30T00:00:00.000+08:00|
|  DataNodeLimit|        1|                    Unlimited|
|       CpuLimit|       16|                    Unlimited|
|    DeviceLimit|       30|                    Unlimited|
|TimeSeriesLimit|       72|                1,000,000,000|
+---------------+---------+-----------------------------+
```

> 'Activated (W)' indicates passive activation: this ConfigNode has no license file (or has not been issued the latest timestamped license file). Check whether the license file has been placed in the license folder; if not, place it there. If a license file already exists, it may be inconsistent with the information on other nodes; in that case, contact Timecho staff to reapply.

## Common Questions
1. Repeated activation-failure prompts during deployment
    - Check ownership of the installation directory: use the `ls -al` command to confirm that the owner of the installation package root directory is the current user.
    - Check the activation directory: confirm that every file in the `./activation` directory is owned by the current user.

2. 
Confignode failed to start - - Step 1: Please check the startup log to see if any parameters that cannot be changed after the first startup have been modified. - - Step 2: Please check the startup log for any other abnormalities. If there are any abnormal phenomena in the log, please contact Timecho Technical Support personnel for consultation on solutions. - - Step 3: If it is the first deployment or data can be deleted, you can also clean up the environment according to the following steps, redeploy, and restart. - - Step 4: Clean up the environment: - - a. Terminate all ConfigNode Node and DataNode processes. - ```Bash - # 1. Stop the ConfigNode and DataNode services - sbin/stop-standalone.sh - - # 2. Check for any remaining processes - jps - # Or - ps -ef|grep iotdb - - # 3. If there are any remaining processes, manually kill the - kill -9 - # If you are sure there is only one iotdb on the machine, you can use the following command to clean up residual processes - ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9 - ``` - b. Delete the data and logs directories. - - Explanation: Deleting the data directory is necessary, deleting the logs directory is for clean logs and is not mandatory. - - ```Bash - cd /data/iotdb - rm -rf data logs - ``` \ No newline at end of file diff --git a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/workbench-deployment_timecho.md b/src/UserGuide/V1.3.x/Deployment-and-Maintenance/workbench-deployment_timecho.md deleted file mode 100644 index 1e37e14ca..000000000 --- a/src/UserGuide/V1.3.x/Deployment-and-Maintenance/workbench-deployment_timecho.md +++ /dev/null @@ -1,261 +0,0 @@ - -# Workbench Deployment - -The visualization console is one of the supporting tools for IoTDB (similar to Navicat for MySQL). 
It is an official application tool system used for database deployment implementation, operation and maintenance management, and application development stages, making the use, operation, and management of databases simpler and more efficient, truly achieving low-cost management and operation of databases. This document will assist you in installing Workbench. - -
- -The instructions for using the visualization console tool can be found in the [Instructions](../Tools-System/Monitor-Tool.md) section of the document. - -## Installation Preparation - -| Preparation Content | Name | Version Requirements | Link | -| :----------------------: | :-------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | -| Operating System | Windows or Linux | - | - | -| Installation Environment | JDK | v1.5.4 and below require ≥ 1.8; v1.5.5 and above require ≥ 17. Choose the ARM or x64 installer according to your system. | https://www.oracle.com/java/technologies/downloads/ | -| Related Software | Prometheus | Requires installation of V2.30.3 and above. | https://prometheus.io/download/ | -| Database | IoTDB | Requires V1.2.0 Enterprise Edition and above | You can contact business or technical support to obtain | -| Console | IoTDB-Workbench-`` | - | You can choose according to the appendix version comparison table and contact business or technical support to obtain it | - - -### Pre-installation Check - -To ensure the Workbench installation package you obtained is complete and valid, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Contact the Timecho Team to get it. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/workbench): - ```Bash - cd /data/workbench - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum IoTDB-Workbench-``.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-03.png) - -4. Compare the output result with the official SHA512 checksum. 
Once confirmed that they match, you can proceed with the installation and deployment operations in accordance with the procedures below. - -#### Notes: - -- If the verification results do not match, please contact the Timecho Team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - -## Installation Steps - -### Step 1: IoTDB enables monitoring indicator collection - -1. Open the monitoring configuration item. The configuration items related to monitoring in IoTDB are disabled by default. Before deploying the monitoring panel, you need to open the relevant configuration items (note that the service needs to be restarted after enabling monitoring configuration). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    | Configuration | Located in the configuration file | Description |
    | ---------------------------------- | ---------------------------- | ------------------------------------------------------------ |
    | cn_metric_reporter_list | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to PROMETHEUS |
    | cn_metric_level | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to IMPORTANT |
    | cn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this configuration item to the configuration file and keep the default value of 9091. If you set a different port, make sure it does not conflict with other ports |
    | dn_metric_reporter_list | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to PROMETHEUS |
    | dn_metric_level | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to IMPORTANT |
    | dn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this configuration item to the configuration file and keep the default value of 9092. If you set a different port, make sure it does not conflict with other ports |
    | dn_metric_internal_reporter_type | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to IOTDB |
    | enable_audit_log | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to true |
    | audit_log_storage | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the values to IOTDB and LOGGER |
    | audit_log_operation | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to DML,DDL,QUERY |
- - -2. Restart all nodes. After modifying the monitoring indicator configuration of three nodes, the confignode and datanode services of all nodes can be restarted: - - ```shell - ./sbin/stop-standalone.sh #Stop confignode and datanode first - ./sbin/start-confignode.sh -d #Start confignode - ./sbin/start-datanode.sh -d #Start datanode - ``` - -3. After restarting, confirm the running status of each node through the client. If the status is Running, it indicates successful configuration: - - ![](/img/%E5%90%AF%E5%8A%A8.png) - -### Step 2: Install and configure Prometheus - -1. Download the Prometheus installation package, which requires installation of V2.30.3 and above. You can go to the Prometheus official website to download it (https://prometheus.io/docs/introduction/first_steps/) -2. Unzip the installation package and enter the unzipped folder: - - ```Shell - tar xvfz prometheus-*.tar.gz - cd prometheus-* - ``` - -3. Modify the configuration. Modify the configuration file prometheus.yml as follows - 1. Add configNode task to collect monitoring data for ConfigNode - 2. Add a datanode task to collect monitoring data for DataNodes - - ```shell - global: - scrape_interval: 15s - evaluation_interval: 15s - scrape_configs: - - job_name: "prometheus" - static_configs: - - targets: ["localhost:9090"] - - job_name: "confignode" - static_configs: - - targets: ["iotdb-1:9091","iotdb-2:9091","iotdb-3:9091"] - honor_labels: true - - job_name: "datanode" - static_configs: - - targets: ["iotdb-1:9092","iotdb-2:9092","iotdb-3:9092"] - honor_labels: true - ``` - -4. Start Prometheus. The default expiration time for Prometheus monitoring data is 15 days. In production environments, it is recommended to adjust it to 180 days or more to track historical monitoring data for a longer period of time. The startup command is as follows: - - ```Shell - ./prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=180d - ``` - -5. Confirm successful startup. 
Open `http://IP:port` in a browser to reach Prometheus, then open the Targets page under Status. When the State of every target is Up, the configuration and connectivity are working.
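
The same check can be scripted against the Prometheus HTTP API instead of the browser. The following is a minimal sketch, assuming Prometheus listens on `localhost:9090`, that `curl` is installed, and that the `/api/v1/targets` response is compact JSON; `count_not_up` is a hypothetical helper that counts targets whose `health` field is anything other than `up`.

```shell
# count_not_up is a hypothetical helper: it reads a /api/v1/targets JSON
# response on stdin and prints how many targets report a health value
# other than "up" (for example "down" or "unknown").
count_not_up() {
  grep -o '"health":"[a-z]*"' | grep -vc '"health":"up"' || true
}

# Example usage (the address is an assumption; use your Prometheus host and port):
#   curl -s http://localhost:9090/api/v1/targets | count_not_up
```

A result of 0 corresponds to every State showing Up on the Targets page; any other number means at least one ConfigNode or DataNode endpoint is not being scraped successfully.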
- - -
- - -### Step 3: Install Workbench - -1. Enter the config directory of iotdb Workbench -`` - -2. Modify Workbench configuration file: Go to the `config` folder and modify the configuration file `application-prod.properties`. If you are installing it locally, there is no need to modify it. If you are deploying it on a server, you need to modify the IP address - > Workbench can be deployed on a local or cloud server as long as it can connect to IoTDB - - | Configuration | Before Modification | After modification | - | ---------------- | ----------------------------------- | ----------------------------------------------- | - | pipe.callbackUrl | pipe.callbackUrl=`http://127.0.0.1` | pipe.callbackUrl=`http://` | - - ![](/img/workbench-conf-1.png) - -3. Startup program: Please execute the startup command in the sbin folder of IoTDB Workbench -`` - Windows: - ```shell - # Start Workbench in the background - start.bat -d - ``` - Linux: - ```shell - # Start Workbench in the background - ./start.sh -d - ``` -4. You can use the `jps` command to check if the startup was successful, as shown in the figure: - - ![](/img/windows-jps.png) - -5. Verification successful: Open "`http://Server IP: Port in configuration file`" in the browser to access, for example:"`http://127.0.0.1:9190`" When the login interface appears, it is considered successful - - ![](/img/workbench-en.png) - - -### Step 4: Configure Instance Information - -1. 
Configure instance information: You only need to fill in the following information to connect to the instance - - ![](/img/workbench-en-1.jpeg) - - - | Field Name | Is It A Required Field | Field Meaning | Default Value | - | --------------- | ---------------------- | ------------------------------------------------------------ | ------ | - | Connection Type | Yes | The content filled in for different connection types varies, and supports selecting "single machine, cluster, dual active" | - | - | Instance Name | Yes | You can distinguish different instances based on their names, with a maximum input of 50 characters | - | - | Instance | Yes | Fill in the database address (`dn_rpc_address` field in the `iotdb/conf/iotdb-system.properties` file) and port number (`dn_rpc_port` field). Note: For clusters and dual active devices, clicking the "+" button supports entering multiple instance information | - | - | Prometheus | No | Fill in `http://:/app/v1/query` to view some monitoring information on the homepage. We recommend that you configure and use it | - | - | Username | Yes | Fill in the username for IoTDB, supporting input of 4 to 32 characters, including uppercase and lowercase letters, numbers, and special characters (! @ # $% ^&* () _+-=) | root | - | Enter Password | No | Fill in the password for IoTDB. To ensure the security of the database, we will not save the password. Please fill in the password yourself every time you connect to the instance or test | root | - -2. 
Test the accuracy of the information filled in: You can perform a connection test on the instance information by clicking the "Test" button - - ![](/img/workbench-en-2.png) - -## Appendix: IoTDB and Workbench Version Comparison Table - -| Version | Description | Supported IoTDB Versions | -|---------|-----------------------------------------------------------------------------------------------------------------------------|-------------------------------------| -| V1.5.7 | Optimize the point list by splitting point names into device names and points, ensure the point selection area supports horizontal scrolling, and align the export file column order with the page display. | All 1.x versions from V1.3.4 onward | -| V1.5.6 | Enhanced CSV import/export: optional tags/aliases on import; support for measurement descriptions with backtick-quoted quotes on export. | All 1.x versions from V1.3.4 onward | -| V1.5.5 | Added server clock functionality and support for activating Enterprise Edition license databases | All 1.x versions from V1.3.4 onward | -| V1.5.4 | Added authentication for Prometheus settings in Instance Management | All 1.x versions from V1.3.4 onward | -| V1.5.1 | Added AI analysis and pattern matching | All 1.x versions from V1.3.2 onward | -| V1.4.0 | Added tree model display and English UI | All 1.x versions from V1.3.2 onward | -| V1.3.1 | Enhanced analysis methods and import templates | All 1.x versions from V1.3.2 onward | -| V1.3.0 | Added DB configuration and UI refinements | All 1.x versions from V1.3.2 onward | -| V1.2.6 | Optimized permission controls | All 1.x versions from V1.3.1 onward | -| V1.2.5 | Added "Common Templates" and caching | All 1.x versions from V1.3.0 onward | -| V1.2.4 | Added import/export for calculations, time alignment field | All 1.x versions from V1.2.2 onward | -| V1.2.3 | Added activation details and analysis features | All 1.x versions from V1.2.2 onward | -| V1.2.2 | Optimized point description display | All 1.x 
versions from V1.2.2 onward | -| V1.2.1 | Added sync monitoring panel, Prometheus hints | All 1.x versions from V1.2.2 onward | -| V1.2.0 | Major Workbench upgrade | All 1.x versions from V1.2.0 onward | diff --git a/src/UserGuide/V1.3.x/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md b/src/UserGuide/V1.3.x/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md deleted file mode 100644 index 10f07ed73..000000000 --- a/src/UserGuide/V1.3.x/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md +++ /dev/null @@ -1,275 +0,0 @@ - - -# Ignition - -## Product Overview - -1. Introduction to Ignition - - Ignition is a web-based monitoring and data acquisition tool (SCADA) - an open and scalable universal platform. Ignition allows you to more easily control, track, display, and analyze all data of your enterprise, enhancing business capabilities. For more introduction details, please refer to [Ignition Official Website](https://docs.inductiveautomation.com/docs/8.1/getting-started/introducing-ignition) - -2. Introduction to the Ignition-IoTDB Connector - - The ignition-IoTDB Connector is divided into two modules: the ignition-IoTDB Connector,Ignition-IoTDB With JDBC。 Among them: - - - Ignition-IoTDB Connector: Provides the ability to store data collected by Ignition into IoTDB, and also supports data reading in Components. It injects script interfaces such as `system. iotdb. insert`and`system. iotdb. query`to facilitate programming in Ignition - - Ignition-IoTDB With JDBC: Ignition-IoTDB With JDBC can be used in the`Transaction Groups`module and is not applicable to the`Tag Historian`module. It can be used for custom writing and querying. - - The specific relationship and content between the two modules and ignition are shown in the following figure. 
-
-    ![](/img/20240703114443.png)
-
-## Installation Requirements
-
-| **Preparation Content**         | Version Requirements                                         |
-| ------------------------------- | ------------------------------------------------------------ |
-| IoTDB                           | Version 1.3.1 or later is required. For installation, see the IoTDB [Deployment Guidance](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) |
-| Ignition                        | Version 8.1 is required (8.1.37 or later). For installation, see the [Installation Guidance](https://docs.inductiveautomation.com/docs/8.1/getting-started/installing-and-upgrading) on the Ignition official website. (Other versions are compatible; please contact the business department for more information) |
-| Ignition-IoTDB Connector module | Please contact the business department to obtain it          |
-| Ignition-IoTDB With JDBC module | Download address: https://repo1.maven.org/maven2/org/apache/iotdb/iotdb-jdbc/ |
-
-## Instruction Manual For Ignition-IoTDB Connector
-
-### Introduction
-
-The Ignition-IoTDB Connector module stores data through the database connection associated with the historical tag provider. Data is stored directly in a table according to its data type, together with a millisecond timestamp. Data is stored only when a value changes, according to the value mode and deadband settings on each tag, which avoids duplicate and unnecessary storage.
-
-The Ignition-IoTDB Connector provides the ability to store the data collected by Ignition into IoTDB.
-
-### Installation Steps
-
-Step 1: Enter the `Configuration` - `System` - `Modules` module and click the `Install or Upgrade a Module` button at the bottom
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-1.png)
-
-Step 2: Select the obtained `modl` file and upload it, click `Install`, and trust the relevant certificate.
-
-![](/img/20240703-151030.png)
-
-Step 3: After the installation is complete, you can see the following content
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-3.png)
-
-Step 4: Enter the `Configuration` - `Tags` - `History` module and click `Create new Historical Tag Provider` below
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-4.png)
-
-Step 5: Select `IoTDB` and fill in the configuration information
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-5.png)
-
-The configuration content is as follows:
-
-| Name | Description | Default Value | Notes |
-| --- | --- | --- | --- |
-| **Main** | | | |
-| Provider Name | Provider Name | - | |
-| Enabled | | true | The provider can only be used when it is true |
-| Description | Description | - | |
-| **IoTDB Settings** | | | |
-| Host Name | The address of the target IoTDB instance | - | |
-| Port Number | The port of the target IoTDB instance | 6667 | |
-| Username | The username of the target IoTDB | - | |
-| Password | Password for target IoTDB | - | |
-| Database Name | The database name to be stored, starting with root, such as root.db | - | |
-| Pool Size | Size of SessionPool | 50 | Can be configured as needed |
-| **Store and Forward Settings** | | | Just keep it as default |
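As a rough illustration of how the settings in this table fit together, the sketch below models them as a plain Python data class. It uses the defaults given in the table (port 6667, pool size 50, enabled) and checks the one constraint the table states, namely that the database name starts with `root`. The class `ProviderSettings`, its field names, and the sample host are hypothetical; they are not part of the Ignition module's API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProviderSettings:
    """Hypothetical container mirroring the provider settings table."""
    provider_name: str
    host_name: str
    username: str
    password: str
    database_name: str
    port_number: int = 6667  # default from the table
    pool_size: int = 50      # default from the table
    enabled: bool = True     # the provider is usable only when true

    def problems(self) -> List[str]:
        """Return a list of problems; an empty list means the settings look usable."""
        found = []
        if not self.database_name.startswith("root."):
            found.append("Database Name must start with root, such as root.db")
        if not 1 <= self.port_number <= 65535:
            found.append("Port Number must be between 1 and 65535")
        if self.pool_size < 1:
            found.append("Pool Size must be at least 1")
        if not self.enabled:
            found.append("Enabled must be true for the provider to be used")
        return found


good = ProviderSettings("IoTDB", "127.0.0.1", "root", "root", "root.db")
bad = ProviderSettings("IoTDB", "127.0.0.1", "root", "root", "db", pool_size=0)

print(good.problems())       # []
print(len(bad.problems()))   # 2
```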
-
-
-
-### Instructions
-
-#### Configure Historical Data Storage
-
-- After configuring the `Provider`, you can use the `IoTDB Tag Historian` in the `Designer`, just like any other `Provider`. Right-click the corresponding `Tag` and select `Edit Tag(s)`, then select the History category in the Tag Editor
-
-  ![](/img/ignition-7.png)
-
-- Set `History Enabled` to `true`, select the `Provider` created in the previous step as the `Storage Provider`, configure other parameters as needed, click `OK`, and then save the project. From this point on, data is continuously stored in the `IoTDB` instance according to these settings.
-
-  ![](/img/ignition-8.png)
-
-#### Read Data
-
-- You can directly select the tags stored in IoTDB under the Data tab of the Report
-
-  ![](/img/ignition-9.png)
-
-- You can also directly browse the relevant data in Components
-
-  ![](/img/ignition-10.png)
-
-#### Script module: This function can interact with IoTDB
-
-1. system.iotdb.insert:
-
-
-- Script Description: Write data to an IoTDB instance
-
-- Script Definition:
-
-  `system.iotdb.insert(historian, deviceId, timestamps, measurementNames, measurementValues)`
-
-- Parameter:
-
-  - `str historian`: The name of the corresponding IoTDB Tag Historian Provider
-  - `str deviceId`: The deviceId to write to, excluding the configured database, such as Sine
-  - `long[] timestamps`: List of timestamps of the data points to write
-  - `str[] measurementNames`: List of names of the measurements to write
-  - `str[][] measurementValues`: The data point values to write, corresponding to the timestamp list and the measurement name list
-
-- Return Value: None
-
-- Available Range: Client, Designer, Gateway
-
-- Usage example:
-
-  ```python
-  system.iotdb.insert("IoTDB", "Sine", [system.date.now()],["measure1","measure2"],[["val1","val2"]])
-  ```
-
-2. system.iotdb.query:
-
-
-- Script Description: Query the data written to the IoTDB instance
-
-- Script Definition:
-
-  `system.iotdb.query(historian, sql)`
-
-- Parameter:
-
-  - `str historian`: The name of the corresponding IoTDB Tag Historian Provider
-  - `str sql`: SQL statement to be queried
-
-- Return Value:
-  Query Results: `List>`
-
-- Available Range: Client, Designer, Gateway
-
-- Usage example:
-
-  ```Python
-  system.iotdb.query("IoTDB", "select * from root.db.Sine where time > 1709563427247")
-  ```
-
-## Ignition-IoTDB With JDBC
-
-### Introduction
-
- Ignition-IoTDB With JDBC provides a JDBC driver that allows users to connect to and query IoTDB from Ignition using standard JDBC APIs
-
-### Installation Steps
-
-Step 1: Enter the `Configuration` - `Databases` - `Drivers` module and create the `Translator`
-
-![](/img/Ignition-IoTDBWithJDBC-1.png)
-
-Step 2: Enter the `Configuration` - `Databases` - `Drivers` module, create a `JDBC Driver`, select the `Translator` configured in the previous step, and upload the downloaded `IoTDB JDBC`. Set the Classname to `org.apache.iotdb.jdbc.IoTDBDriver`
-
-![](/img/Ignition-IoTDBWithJDBC-2.png)
-
-Step 3: Enter the `Configuration` - `Databases` - `Connections` module, create a new `Connection`, select the `IoTDB Driver` created in the previous step for `JDBC Driver`, configure the relevant information, and save it
-
-![](/img/Ignition-IoTDBWithJDBC-3.png)
-
-### Instructions
-
-#### Data Writing
-
-Select the previously created `Connection` as the `Data Source` in the `Transaction Groups`
-
-- `Table name` must be set to the complete device path starting from root
-- Uncheck `Automatically create table`
-- Set `Store timestamp to` to time
-
-Leave the other options unselected and set the fields. Once the group is `enabled`, the data is written to the corresponding IoTDB
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%86%99%E5%85%A5-1.png)
-
-#### Query
-
-- Select `Data Source` in the `Database Query Browser`, choose the previously created `Connection`, and write an SQL statement to query the data in IoTDB
-
-![](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2-ponz.png)
-
diff --git a/src/UserGuide/V1.3.x/IoTDB-Introduction/IoTDB-Introduction_timecho.md b/src/UserGuide/V1.3.x/IoTDB-Introduction/IoTDB-Introduction_timecho.md
deleted file mode 100644
index 4f7b6d570..000000000
--- a/src/UserGuide/V1.3.x/IoTDB-Introduction/IoTDB-Introduction_timecho.md
+++ /dev/null
@@ -1,267 +0,0 @@
-
-
-# What is TimechoDB
-
-TimechoDB is a low-cost, high-performance native time-series database for the Internet of Things. It is a commercial product provided by Timecho, the original team, and is based on the Apache IoTDB community version. It addresses the problems enterprises encounter when building IoT big data platforms to manage time-series data, such as complex application scenarios, large data volumes, high sampling frequencies, large amounts of unaligned data, long data processing times, diverse analysis requirements, and high storage and operation costs.
-
-Based on TimechoDB, Timecho provides a more diverse range of product features, stronger performance and stability, and a richer set of utility tools. It also offers comprehensive enterprise services, giving commercial customers more powerful product capabilities and a higher-quality development, operations, and usage experience.
-
-- Download, Deployment and Usage: [QuickStart](../QuickStart/QuickStart_timecho.md)
-
-
-## Product Components
-
-Timecho products are composed of several components that cover the entire time-series data lifecycle, from data collection and data management to data analysis and application, helping users efficiently manage and analyze the massive amount of time-series data generated by the IoT.
-
-Figure: Introduction-en-timecho-new.png
-
-1. **Time-series database (TimechoDB, a commercial product based on Apache IoTDB provided by the original team)**: The core component for time-series data storage. It provides high-compression storage, rich time-series queries, and real-time stream processing, together with high data availability, high cluster scalability, and security protection. TimechoDB also provides a variety of application tools for easy system management, as well as multi-language APIs and integration with external systems, making it convenient for users to build applications on TimechoDB.
-
-2. **Time-series data standard file format (Apache TsFile, led and contributed by core team members of Timecho)**: A storage format designed specifically for time-series data, which can efficiently store and query massive amounts of time-series data. Currently, the underlying storage files of Timecho's collection, storage, and intelligent analysis modules are all based on Apache TsFile. TsFile can be efficiently loaded into TimechoDB and can also be migrated out. With TsFile, users can use the same file format in the collection, management, and application and analysis stages, which greatly simplifies the process from data collection to analysis and makes time-series data management more efficient and convenient.
-
-3. **Time-series model training and inference integrated engine (AINode)**: For intelligent analysis scenarios, TimechoDB provides AINode, an integrated engine for time-series model training and inference. It offers a complete set of time-series data analysis tools, and its underlying model training engine supports training tasks and data management for machine learning, deep learning, and more. With these tools, users can analyze the data stored in TimechoDB in depth and mine its value.
-
-4. **Data collection**: To connect more easily with various industrial collection scenarios, Timecho provides data collection access services. They support multiple protocols and formats, can ingest data generated by various sensors and devices, and support features such as breakpoint resumption and network barrier penetration. They are well suited to industrial field collection, where configuration is difficult, transmission is slow, and networks are weak, and they make data collection simpler and more efficient.
-
-## Product Features
-
-TimechoDB has the following advantages and characteristics:
-
-- Flexible deployment methods: Supports one-click cloud deployment, out-of-the-box use after unzipping at the terminal, and seamless connection between terminal and cloud (data cloud synchronization tool).
-
-- Low hardware cost storage solution: Supports high compression ratio disk storage, with no need to separate historical and real-time databases, so data is managed in one place.
-
-- Hierarchical sensor organization and management: Supports modeling in the system according to the actual hierarchical relationship of devices, aligning with the industrial sensor management structure, and supports directory viewing, search, and other capabilities for hierarchical structures.
-
-- High throughput data reading and writing: Supports access by millions of devices, high-speed data reading and writing, and complex industrial read/write scenarios such as out-of-order and multi-frequency acquisition.
-
-- Rich time series query semantics: Provides a native computation engine for time series data, supports timestamp alignment during queries, offers nearly a hundred built-in aggregation and time series calculation functions, and supports time series feature analysis and AI capabilities.
-
-- Highly available distributed system: Supports an HA distributed architecture and provides uninterrupted 7*24 real-time database services. The failure of a physical node or a network fault does not affect the normal operation of the system. When physical nodes are added, removed, or overloaded, the system automatically rebalances computing/storage resources. Heterogeneous environments are supported: servers of different types and performance can form a cluster, and the system balances load automatically according to the configuration of each physical machine.
-
-- Extremely low usage and operation threshold: Supports an SQL-like language, provides native multi-language interfaces for secondary development, and has a complete tool system such as a console.
-
-- Rich ecosystem integration: Supports integration with big data ecosystem components such as Hadoop and Spark, and with equipment management and visualization tools such as Grafana, Thingsboard, and DataEase.
-
-## Enterprise characteristics
-
-### Higher level product features
-
-Based on Apache IoTDB, TimechoDB offers a range of advanced product features, with native upgrades and optimizations at the kernel level for industrial production scenarios. These include multi-level storage, cloud-edge collaboration, visualization tools, and security enhancements, allowing users to focus on business development without worrying too much about the underlying logic. This simplifies and enhances industrial production and brings more economic benefits to enterprises. For example:
-
-
-- Dual Active Deployment: Dual active usually refers to two independent single machines (or clusters) that mirror each other in real time. Their configurations are completely independent, and both can receive external writes at the same time. Each single machine (or cluster) synchronizes the data written to it to the other one, and the data on the two sides reaches eventual consistency.
-
-- Data Synchronisation: Through the database's built-in synchronization module, data can be aggregated from the station to the center, supporting scenarios such as full aggregation, partial aggregation, and hierarchical aggregation. Both real-time and batch synchronization modes are supported. Multiple built-in plugins are also provided to meet requirements such as gateway penetration, encrypted transmission, and compressed transmission in enterprise data synchronization applications.
-
-- Tiered Storage: By upgrading the underlying storage capability, data can be divided into levels such as cold, warm, and hot based on factors such as access frequency and data importance, and stored on different media (such as SSD, mechanical hard drive, and cloud storage). The system also schedules data across these levels during queries. This reduces customers' data storage costs while preserving data access speed.
-
-- Security Enhancements: Features like whitelists and audit logs strengthen internal management and reduce the risk of data breaches.
-
-The detailed functional comparison is as follows:
-
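-| Category | Function | Apache IoTDB | TimechoDB |
-| --- | --- | --- | --- |
-| Deployment Mode | Stand-Alone Deployment | | |
-| Deployment Mode | Distributed Deployment | | |
-| Deployment Mode | Dual Active Deployment | - | |
-| Deployment Mode | Container Deployment | Partial support | |
-| Database Functionality | Sensor Management | | |
-| Database Functionality | Write Data | | |
-| Database Functionality | Query Data | | |
-| Database Functionality | Continuous Query | | |
-| Database Functionality | Trigger | | |
-| Database Functionality | User Defined Function | | |
-| Database Functionality | Permission Management | | |
-| Database Functionality | Data Synchronisation | Only file synchronization, no built-in plugins | Real time synchronization+file synchronization, enriched with built-in plugins |
-| Database Functionality | Stream Processing | Only framework, no built-in plugins | Framework+rich built-in plugins |
-| Database Functionality | Tiered Storage | - | |
-| Database Functionality | View | - | |
-| Database Functionality | White List | - | |
-| Database Functionality | Audit Log | - | |
-| Supporting Tools | Workbench | - | |
-| Supporting Tools | Cluster Management Tool | - | |
-| Supporting Tools | System Monitor Tool | - | |
-| Localization | Localization Compatibility Certification | - | |
-| Technical Support | Expert Support | - | |
-| Technical Support | Use Training | - | |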
-
-### More efficient/stable product performance
-
-TimechoDB improves stability and performance over the open source version. With technical support for the enterprise version, it can achieve a performance improvement of more than 10 times and recovers from faults promptly.
-
-### More User-Friendly Tool System
-
-TimechoDB provides users with a simpler and more user-friendly tool system. Products such as the Cluster Monitoring Panel (Grafana), Database Console (Workbench), and Cluster Management Tool (Deploy Tool, abbreviated as IoTD) help users quickly deploy, manage, and monitor database clusters. They reduce the workload and learning cost for operation and maintenance personnel and make database operation and maintenance simpler and more efficient.
-
-- Cluster Monitoring Panel: Addresses the monitoring needs of TimechoDB and its operating system. It covers operating system resource monitoring, TimechoDB performance monitoring, and hundreds of kernel monitoring metrics, helping users track the health of the cluster and carry out cluster tuning and operation.
-
-
-Figures: Overall Overview; Operating System Resource Monitoring; TimechoDB Performance Monitoring
-
-- Database Console: Provides a low-threshold tool for interacting with the database. Through a graphical console, users can perform metadata management, data insertion, deletion, modification, and query, permission management, system management, and other operations in a concise and clear manner, which makes the database easier to use and improves efficiency.
-
-
-Figures: Home Page; Operate Metadata; SQL Query
-
-
-- Cluster Management Tool: Addresses the operational difficulties of multi-node distributed systems. Its main functions include cluster deployment, cluster start and stop, elastic scaling, configuration updates, and data export, so that complex database clusters can be operated with one-click commands, greatly reducing management difficulty.
-
-
-
-### More professional enterprise technical services
-
-Timecho provides TimechoDB customers with comprehensive vendor services, including but not limited to on-site installation and training, expert consultation, on-site emergency assistance, software upgrades, online self-service, remote support, and guidance on using the latest development version. To make TimechoDB a better fit for industrial production scenarios, we also recommend modeling solutions, optimize read-write performance and compression ratios, recommend database configurations, and provide other technical support based on the enterprise's actual data structure and read-write load. For industrial customization scenarios that the products do not cover, Timecho provides customized development tools based on user characteristics.
-
-Compared to the open source version, TimechoDB is released more frequently, every 2-3 months. It also offers day-level exclusive fixes for urgent customer issues to keep production environments stable.
-
-### More compatible localization adaptation
-
-The TimechoDB code is self-developed and controllable. It is compatible with most mainstream domestic IT innovation products (CPU, operating system, etc.) and has completed compatibility certification with multiple manufacturers to ensure product compliance and security.
\ No newline at end of file
diff --git a/src/UserGuide/V1.3.x/IoTDB-Introduction/Release-history_timecho.md b/src/UserGuide/V1.3.x/IoTDB-Introduction/Release-history_timecho.md
deleted file mode 100644
index b1826d6ce..000000000
--- a/src/UserGuide/V1.3.x/IoTDB-Introduction/Release-history_timecho.md
+++ /dev/null
@@ -1,391 +0,0 @@
-
-# Release History
-
-## TimechoDB (Database Core)
-
-### V1.3.7.3
-
-> Release Date: 2026.06.02
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.7.3-bin.zip
-> SHA512 Checksum: 8e6cde061421a552b9855f39f9cccd4838c820dc15ef0ad2a7c23a54cd6cc4f06c35190c1f428784e6a4d5463dd1b794f58ff5cdf891f27f6d0be4d3ab00bf6f - -V1.3.7.3 primarily optimizes query module and data synchronization capabilities, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- Query Module: Optimized `Last` queries, aligned series queries, reverse-order time filter queries, and other scenarios. -- Metadata Module: Optimized device creation validation for activated series and their child paths. -- Data Synchronization: Optimized the retry mechanism after synchronization failures. -- Data Synchronization: Cross-network-gateway synchronization plugin supports configuring the real-time write transmission timeout. -- Interface Module: Added error code validation to the Go client write interface. -- Interface Module: Optimized C# client connection pool management. - - -### V1.3.7.2 - -> Release Date: 2026.04.07
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.7.2-bin.zip
-> SHA512 Checksum: 787766af64992069f0db0ac8b250b461d799307b3ce06b0782fc25752c8c5307fa2205c9e3a38a41685b81bb6b4b5c1ec9f71a395bfad285caf90de7b8224783 - -V1.3.7.2 primarily optimizes data synchronization and query module capabilities, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- Data Synchronization: Optimized distribution performance for Pipe complex path matching scenarios. -- Query Module: The `SHOW QUERIES` statement now includes client IP, query timeout, server wait time, and other information. -- Ecosystem Integration: Supports IoTDB pushing data to an external OPC Server in OPC Client mode. - - -### V1.3.6.6 - -> Release Date: 2026.01.20
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.6-bin.zip
-> SHA512 Checksum: 590d3ead053298c6df0ede637572ba598b9b684f8b35ab874bd4452f765e1421938f4cca2cf0423af2e806592aa8b15bdd25b41df7de809435a4d0239fc04790 - -V1.3.6.6 enhances data read/write capabilities, resolves several product defects, and delivers comprehensive improvements in database monitoring, performance, and stability. - - -### V1.3.6.3 - -> Release Date: 2026.01.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.3-bin.zip
-> SHA512 Checksum: 43719a1384f59f63cb0029cdda0aba433383cd1a0f5ebc142e54f8aa6623cc30a7efb3e3aef7f3d485d5e07bec91be215c92ed21b5201613d5cc44044251c978 - -V1.3.6.3 focuses on deep optimizations in two core areas—query performance and memory management—while comprehensively enhancing database monitoring, performance, and stability. Specific release contents are as follows: - -* **Query Module**: Optimized query performance across multiple scenarios, including multi-series `Last` queries. -* **Query Module**: Added a new `FastLastQuery` interface in the Java SDK for more efficient `Last` query operations. -* **Query Module**: Modified the tree model’s `fetchSchema` to return results in segmented streaming mode, improving response speed under large-data-volume conditions. -* **Storage Module**: Enhanced memory management to mitigate memory leak risks and ensure long-term system stability. -* **Storage Module**: Optimized the file compaction mechanism to improve compaction efficiency and reduce storage resource consumption. -* **Others**: Fixed security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226. - -### V1.3.6.1 - -> Release Date: 2025.12.09
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.1-bin.zip
-> SHA512 Checksum: 9fb6a6870aa2133bfc40508324a7d97ee078d0d44895beef7b0a331edd203419119fb02b933f585b6c4a6fe9b59708a053d7cf65206b22b1a4f01a5fe518424c - -V1.3.6.1 focuses on deep optimization of data synchronization stability, while delivering comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Data Synchronization**: Enhanced Pipe SQL parameter configuration to support specifying asynchronous loading methods. -* **Data Synchronization**: Introduced syntactic sugar that automatically splits full-data Pipe creation SQL into real-time and historical synchronization components. -* **System Module**: Added a global configuration option for data-type-specific compression strategies, enabling on-demand adjustment of storage compression policies. - -### V1.3.5.11 - -> Release Date: 2025.09.24
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.11-bin.zip
-> SHA512 Checksum: f18419e20c0d7e9316febee5a053306a97268cb07e18e6933716c2ef98520fbbe051dfa1da02a9c83e8481a839ce35525ce6c50f890f821e3d760f550c75f804 - -V1.3.5.11 version primarily optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -### V1.3.5.10 - -> Release Date: 2025.08.27
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.10-bin.zip
-> SHA512 Checksum: 3aea6d2318f52b39bfb86dae9ff06fe1b719fdeceaabb39278c9a73544e1ceaf0660339f9342abb888c8281a0fb6144179dac9bb0c40ba0ecc66bac4dd7cbe80 - -V1.3.5.10 version fixes certain product defects and includes comprehensive enhancements to database monitoring, performance, and stability. - -### V1.3.5.9 - -> Release Date: 2025.08.25
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.9-bin.zip
-> SHA512 Checksum: 95b7a6790e94dc88e355a81e5a54b10ee87bdadae69ba0b215273967b3422178d5ee81fa5adf1c5380a67dbb30cf9782eaa3cbfd6ec744b0fd9a91c983ee8f70 - -V1.3.5.9 version optimizes memory control, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -### 1.x Other historical versions - -#### V1.3.5.8 - -> Release Date: 2025.08.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.8-bin.zip
-> SHA512 Checksum: aa9802301614e20294a7f2fc4c149ba20d58213d9b74e8f8c607e0f4860949bad164bce2851b63c1d39b7568d62975ab257c269b3a9c168a29ea3945b6d28982 - -V1.3.5.8 version optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -#### V1.3.5.7 - -> Release Date: 2025.08.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.7-bin.zip
-> SHA512 Checksum: 17374a440267aed3507dcc8cf4dc8703f8136d5af30d16206a6e1101e378cbbc50eda340b1598a12df35fe87d96db20f7802f0e64033a013d4b81499198663d4 - -V1.3.5.7 version optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -#### V1.3.5.6 - -> Release Date: 2025.07.16
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.6-bin.zip
-> SHA512 Checksum: 05b9fda4d98ba8a1c9313c0831362ed3d667ce07cb00acaeabcf6441a6d67dff7da27f3fda2a5e1b3c3b85d1e5c730a534f3aa2f0c731b8c03ef447203b32493 - -V1.3.5.6 introduces a new configuration switch to disable the data subscription feature. It optimizes the C++ high-availability client and addresses PIPE synchronization latency issues in normal operation, restart, and deletion scenarios, along with query performance for large TEXT objects. Comprehensive enhancements to database monitoring, performance, and stability are also included. - -#### V1.3.5.4 - -> Release Date: 2025.06.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.4-bin.zip
-> SHA512 Checksum: edac5f8b70dd67b3f84d3e693dc025a10b41565143afa15fc0c4937f8207479ffe2da787cc9384440262b1b05748c23411373c08606c6e354ea3dcdba0371778 - -V1.3.5.4 fixes several product defects and optimizes the node removal functionality. It also delivers comprehensive improvements to database monitoring, performance, and stability. - -#### V1.3.5.3 - -> Release Date: 2025.06.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.3-bin.zip
-> SHA512 Checksum: 5f807322ceec9e63a6be86108cc57e7ad4251b99a6c28baf11256ab65b2145768e9110409f89834d5f4256094a8ad995775c0e59a17224ff2627cd9354e09d82 - -V1.3.5.3 focuses on optimizing data synchronization capabilities, including persisting PIPE transmission progress and adding monitoring metrics for PIPE event transfer time. Related defects have been resolved. Additionally, the encryption algorithm for user passwords has been upgraded to SHA-256. Comprehensive enhancements to database monitoring, performance, and stability are included. - -#### V1.3.5.2 - -> Release Date: 2025.06.10
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.2-bin.zip
-> SHA512 Checksum: 4c0a5db76c6045dfd27cce303546155cdb402318024dae5f999f596000d7b038b13bbeac39068331b5c6e2c80bc1d89cd346dd0be566fe2fe865007d441d9d05 - -V1.3.5.2 primarily optimizes data synchronization features, adding support for cascading configurations via parameters and ensuring fully consistent ordering between synchronized and real-time writes. It also enables partitioned sending of historical and real-time data after system restarts. Comprehensive enhancements to database monitoring, performance, and stability are included. - -#### V1.3.5.1 - -> Release Date: 2025.05.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.1-bin.zip
-> SHA512 Checksum: 91f22bafbdd4d580126ed59ba1ba99d14209f10ce4a0a4bd7d731943ac99fdb6ebfab6e3a1e294a7cb7f46367e9fd4252b0d9ac4d4240ddedf6d85658e48f212 - -V1.3.5.1 resolves several product defects and delivers comprehensive improvements to database monitoring, performance, and stability. - -#### V1.3.4.2 - -> Release Date: 2025.04.14
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.4.2-bin.zip
-> SHA512 Checksum: 52fbd79f5e7256e7d04edc8f640bb8d918e837fedd1e64642beb2b2b25e3525b5f5a4c92235f88f6f7b59bfcdf096e4ea52ab85bfef0b69274334470017a2c5b - -V1.3.4.2 enhances the data synchronization function by supporting bi-directional active-active synchronization of data forwarded through external PIPE sources. - -#### V1.3.4.1 - -> Release Date: 2025.01.08
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.4.1-bin.zip
-> SHA512 Checksum: e9d46516f1f25732a93cc915041a8e59bca77cf8a1018c89d18ed29598540c9f2bdf1ffae9029c87425cecd9ecb5ebebea0334c7e23af11e28d78621d4a78148 - -V1.3.4.1 introduces pattern matching functions, continuously optimizes the data subscription mechanism, improves stability, and extends import-data/export-data scripts to support new data types while unifying TsFile, CSV and SQL import/export formats. Comprehensive improvements have been made to database monitoring, performance and stability. Key updates: - -* Query Module: Configurable URI-based JAR loading for UDFs, PipePlugins, Triggers and AINodes -* System Module: Extended UDF functionality with new pattern\_match function -* Data Sync: Supports specifying authentication info at sender -* Ecosystem: Kubernetes Operator support -* Scripts: import-data/export-data now supports strings, BLOBs, dates and timestamps -* Scripts: Unified import/export support for TsFile, CSV and SQL formats - -#### V1.3.3.3 - -> Release Date: 2024.10.31
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.3-bin.zip
-> SHA512 Checksum: 4a3eceda479db3980e9c8058628e71ba5a16fbfccf70894e8181aea5e014c7b89988d0093f6d42df29d478340a33878602a3924bec13f442a48611cec4e0e961 - -V1.3.3.3 improves restart recovery performance, enables DataNodes to actively monitor/load TsFiles with observability metrics, supports automatic loading at receivers when senders transfer files to specified directories, and adds Alter Source capability for Pipes. Comprehensive improvements to monitoring, performance and stability include: - -* Data Sync: Automatic type conversion for inconsistent data at receivers -* Data Sync: Enhanced observability with ops/latency metrics for internal APIs -* Data Sync: OPC-UA sink plugin supports CS mode and non-anonymous access -* Subscription: SDK supports create\_if\_not\_exists and drop\_if\_exists APIs -* Stream Processing: Alter Pipe supports Alter Source -* System: Added latency monitoring for REST module -* Scripts: Auto-loading TsFiles from specified directories -* Scripts: import-tsfile supports remote server execution -* Scripts: Kubernetes Helm support -* Scripts: Python client supports new data types (string, BLOB, date, timestamp) - -#### V1.3.3.2 - -> Release Date: 2024.08.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.2-bin.zip
-> SHA512 Checksum: 32733610da40aa965e5e9263a869d6e315c5673feaefad43b61749afcf534926398209d9ca7fff866c09deb92c09d950c583cea84be5a6aa2c315e1c7e8cfb74 - -V1.3.3.2 adds metrics for mods file reading time, merge sort memory usage and dispatch latency, supports configurable time partition origin adjustment, enables automatic subscription termination based on pipe completion markers, and improves merge memory control. Key updates: - -* Query: Explain Analyze shows mods file read time -* Query: Explain Analyze shows merge sort memory and dispatch latency -* Storage: Added configurable file splitting during compaction -* System: Configurable time partition origin -* Stream Processing: Auto-terminate subscriptions on pipe completion markers -* Data Sync: Configurable RPC compression levels -* Scripts: Export filters only root.\_\_system paths - -#### V1.3.3.1 - -> Release Date: 2024.07.12
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.1-bin.zip
-> SHA512 Checksum: 1fdffbc1f18bfabfa3463a5a6fbc4f6ba6ab686942f9e85e7e6be1840fb8700e0147e5e73fd52201656ae6adb572cc2e5ecc61bcad6fa4c5a4048c4207e3c6c0 - -V1.3.3.1 adds tiered storage throttling, supports username/password auth specification at sync senders, optimizes ambiguous WARN logs at receivers, improves restart performance, and merges configuration files. Key updates: - -* Query: Optimized Filter performance for faster aggregation/WHERE queries -* Query: Java Session evenly distributes SQL requests across nodes -* System: Merged config files into iotdb-system.properties -* Storage: Added tiered storage throttling -* Data Sync: Username/password auth specification at senders -* System: Optimized restart recovery time - -#### V1.3.2.2 - -> Release Date: 2024.06.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.2.2-bin.zip
-> SHA512 Checksum: ad73212a0b5025d18d2481163f6b2d4f604e06eb5e391cc6cba7bf4e42792e115b527ed8bfb5cd95d20a150645c8b4d56a531889dac229ce0f63139a27267322 - -V1.3.2.2 introduces EXPLAIN ANALYZE for SQL profiling, UDAF framework, automatic data deletion at disk thresholds, metadata sync, path-specific data point counting, and SQL import/export scripts. Supports rolling cluster upgrades and cluster-wide plugin distribution with comprehensive monitoring/performance improvements. Key updates: - -* Storage: Improved insertRecords performance -* Storage: SpaceTL feature for auto-deletion at disk thresholds -* Query: EXPLAIN ANALYZE for SQL stage-level profiling -* Query: New UDAF framework -* Query: New envelope demodulation analysis in UDFs -* Query: MaxBy/MinBy functions returning timestamps with values -* Query: Faster value-filtered queries -* Data Sync: Wildcard path matching -* Data Sync: Metadata synchronization (including attributes/permissions) -* Stream Processing: ALTER PIPE for hot plugin updates -* System: TsFile load statistics in data point counting -* Scripts: Local upgrade/backup via hard links -* Scripts: New export-data/import-data for CSV/TsFile/SQL formats -* Scripts: Windows window title differentiation for ConfigNode/DataNode/Cli - -#### V1.3.1.4 - -> Release Date: 2024.04.23
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.1.4-bin.zip
-> SHA512 Checksum: 8547702061d52e2707c750a624730eb2d9b605b60661efa3c8f11611ca1685aeb51b6f8a93f94c1b30bf2e8764139489c9fbb76cf598cfa8bf9c874b2a7c57eb - -V1.3.1.4 adds cluster activation status viewing, variance/stddev aggregation functions, FILL timeout settings, TsFile repair command, one-click info collection scripts, and cluster control scripts while optimizing views and stream processing. Key updates: - -* Query: FILL clause timeout threshold -* Query: REST V2 returns column types -* Data Sync: Simplified time range specification -* Data Sync: SSL support (iotdb-thrift-ssl-sink) -* System: SQL query for cluster activation status -* System: Tiered storage transfer rate control -* System: Enhanced observability (node divergence, task scheduling) -* System: Optimized default logging -* Scripts: One-click cluster control scripts (start-all/stop-all) -* Scripts: One-click info collection scripts (collect-info) - -#### V1.3.0.4 - -> Release Date: 2024.01.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.0.4-bin.zip
-> SHA512 Checksum: 3c07798f37c07e776e5cd24f758e8aaa563a2aae0fb820dad5ebf565ad8a76c765b896d44e7fdb7dad2e46ffd4262af901c765f9bf6af926bc62103118e38951 - -V1.3.0.4 introduces the AINode machine learning framework, upgrades permission granularity to time-series level, and optimizes views/stream processing for better usability and stability. Key updates: - -* Query: New AINode ML framework -* Query: Fixed slow SHOW PATH responses -* Security: Time-series granular permissions -* Security: SSL client-server encryption -* Stream Processing: New metrics monitoring -* Query: LAST queries on non-writable views -* System: Improved data point counting accuracy - -#### V1.2.0.1 - -> Release Date: 2023.06.30
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.2.0.1-bin.zip
-> SHA512 Checksum: dcf910d0c047d148a6c52fa9ee03a4d6bc3ff2a102dc31c0864695a25268ae933a274b093e5f3121689063544d7c6b3b635e5e87ae6408072e8705b3c4e20bf0 - -V1.2.0.1 introduces stream processing framework, dynamic templates, substring/replace/round functions, enhances SHOW REGION/TIMESERIES/VARIABLE statements and Session APIs while optimizing monitoring metrics. Key updates: - -* Stream Processing: New framework -* Metadata: Dynamic template expansion -* Storage: New SPRINTZ/RLBE encoding and LZMA2 compression -* Query: New CAST, ROUND, SUBSTR, REPLACE functions -* Query: New TIME\_DURATION, MODE aggregation -* Query: CASE WHEN syntax support -* Query: ORDER BY expression support -* Interface: Python API multi-node connection -* Interface: Python client write redirection -* Interface: Batch sequence creation via templates - -#### V1.1.0.1 - -> Release Date: 2023.04.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.1.0.1.zip
-> SHA512 Checksum: 58df58fc8b11afeec8436678842210ec092ac32f6308656d5356b7819acc199f1aec4b531635976b091b61d6736f0d9706badcabeaa5de50939e5c331c1dc804 - -V1.1.0.1 introduces GROUP BY VARIATION/CONDITION, DIFF/COUNT\_IF functions, and pipeline execution engine while fixing issues including: - -* Aligned sequence LAST queries with ORDER BY TIMESERIES -* LIMIT & OFFSET failures -* Post-restart metadata template errors -* Sequence creation after database deletion - -Key updates: - -* Query: ALIGN BY DEVICE supports ORDER BY TIME -* Query: SHOW QUERIES/KILL QUERY commands -* System: SHOW REGIONS per database -* System: SHOW VARIABLES for cluster parameters -* Query: GROUP BY VARIATION/CONDITION -* Query: SELECT INTO type casting -* Query: New DIFF (scalar), COUNT\_IF (aggregate) -* System: SHOW REGIONS creation time -* System: Configurable dn\_rpc\_port/address - -## Workbench (Console Tool) - - -| **Version** | **Description** | **Supported IoTDB Versions** | **SHA512 checksum** | -| ----------- | ------------------------------------------------------------ | ----------------------------------- | ------------------------------------------------------------ | -| V1.5.7 | Optimize the point list by splitting point names into device names and points, ensure the point selection area supports horizontal scrolling, and align the export file column order with the page display. | All 1.x versions from V1.3.4 onward | d3cd4a63372ca5d6217b67dddf661980c6a442b3b1564235e9ad34fc254d681febd58c2cc59c6273ffbfd8a1b003b9adb130ecfaaebe1942003b0d07427b1fcc | -| V1.5.6 | Enhanced CSV import/export: optional tags/aliases on import; support for measurement descriptions with backtick-quoted quotes on export. 
| All 1.x versions from V1.3.4 onward | 276ac1ea341f468bf6d29489c9109e9aa61afe2d1caaab577bc40603c6f4120efccc36b65a58a29ce6a266c21b46837aad6128f84ba5e676231ea9e6284a35e5 | -| V1.5.5 | Added server clock functionality and support for activating Enterprise Edition license databases | All 1.x versions from V1.3.4 onward | b18d01b70908d503a25866d1cc69d14e024d5b10ca6fcc536932fdbef8257c66e53204663ce3be5548479911aca238645be79dfd7ee7e65a07ab3c0f68c497f6 | -| V1.5.4 | Added authentication for Prometheus settings in Instance Management | All 1.x versions from V1.3.4 onward | adc7e13576913f9e43a9671fed02911983888da57be98ec8fbbb2593600d310f69619d32b22b569520c88e29f100d7ccae995b20eba757dbb1b2825655719335 | -| V1.5.1 | Added AI analysis and pattern matching | All 1.x versions from V1.3.2 onward | 4f2053a2a3b2b255ce195268d6cd245278f3be32ba4cf68be1552c386d78ed4424f7bdc9d8e68c6b8260b3e398c8fd23ff342439c4e88e1e777c62640d2279f9 | -| V1.4.0 | Added tree model display and English UI | All 1.x versions from V1.3.2 onward | 734077f3bb5e1719d20b319d8b554ce30718c935cb0451e02b2c9267ff770e9c2d63b958222f314f16c2e6e62bf78b643255249b574ee6f37d00e123433981e8 | -| V1.3.1 | Enhanced analysis methods and import templates | All 1.x versions from V1.3.2 onward | 134f87101cc7f159f8a22ac976ad2a3a295c5435058ee0a15160892aac46ac61dd3cfb0633b4aea9cc7415bf904d0ae65aaf77d663f027d864204d81fb34768b | -| V1.3.0 | Added DB configuration and UI refinements | All 1.x versions from V1.3.2 onward | 94a137fc5c681b211f3e076472a9c5875d59e7f0cd6d7409cb8f66bb9e4f87577a0f12dd500e2bcb99a435860c82183e4a6514b638bcb4aecfb48f184730f3f1 | -| V1.2.6 | Optimized permission controls | All 1.x versions from V1.3.1 onward | f345b7edcbe245a561cb94ec2e4f4d40731fe205f134acadf5e391e5874c5c2477d9f75f15dbaf36c3a7cb6506823ac6fbc2a0ccce484b7c4cc71ec0fbdd9901 | -| V1.2.5 | Added "Common Templates" and caching | All 1.x versions from V1.3.0 onward | 
37376b6cfbef7df8496e255fc33627de01bd68f636e50b573ed3940906b6f3da1e8e8b25260262293b8589718f5a72180fa15e5823437bf6dc51ed7da0c583f7 | -| V1.2.4 | Added import/export for calculations, time alignment field | All 1.x versions from V1.2.2 onward | 061ad1add38c109c1a90b06f1ddb7797bd45e84a34a4f77154ee48b90bdc7ecccc1e25eaa53fbbc98170d99facca93e3536192dd8d10a50ce505f59923ce6186 | -| V1.2.3 | Added activation details and analysis features | All 1.x versions from V1.2.2 onward | 254f5b7451300f6f99937d27fd7a5b20847d5293f53e0eaf045ac9235c7ea011785716b800014645ed5d2161078b37e1d04f3c59589c976614fb801c4da982e1 | -| V1.2.2 | Optimized point description display | All 1.x versions from V1.2.2 onward | 062e520d010082be852d6db0e2a3aa6de594eb26aeb608da28a212726e378cd4ea30fca5e1d2c3231ebd8de29e94ca9641f1fabc1cea46acfb650c37b7681b4e | -| V1.2.1 | Added sync monitoring panel, Prometheus hints | All 1.x versions from V1.2.2 onward | 8a3bcf87982ad5004528829b121f2d3945429deb77069917a42a8c8d2e2e2a2c24a398aaa87003920eeacc0c692f1ed39eac52a696887aa085cce011f0ddd745 | -| V1.2.0 | Major Workbench upgrade | All 1.x versions from V1.2.0 onward | ea1f7d3a4c0c6476a195479e69bbd3b3a2da08b5b2bb70b0a4aba988a28b5db5a209d4e2c697eb8095dfdf130e29f61f2ddf58c5b51d002c8d4c65cfc13106b3 | diff --git a/src/UserGuide/V1.3.x/QuickStart/QuickStart_timecho.md b/src/UserGuide/V1.3.x/QuickStart/QuickStart_timecho.md deleted file mode 100644 index 632545c6a..000000000 --- a/src/UserGuide/V1.3.x/QuickStart/QuickStart_timecho.md +++ /dev/null @@ -1,106 +0,0 @@ - -# Quick Start - -This document will help you understand how to quickly get started with IoTDB. - -## How to install and deploy? - -This document will help you quickly install and deploy IoTDB. You can quickly locate the content you need to view through the following document links: - -1. Prepare the necessary machine resources: The deployment and operation of IoTDB require consideration of multiple aspects of machine resource configuration. 
For specific resource allocation, see [Database Resources](../Deployment-and-Maintenance/Database-Resources.md) - -2. Complete system configuration preparation: IoTDB system configuration involves multiple aspects. For an introduction to the key settings, see [System Requirements](../Deployment-and-Maintenance/Environment-Requirements.md) - -3. Get installation package: Contact Timecho Business to obtain the IoTDB installation package, which ensures that you receive the latest stable version. For the structure of the installation package, see [Obtain TimechoDB](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) - -4. Install database and activate: Choose one of the following tutorials according to your actual deployment architecture: - - - Stand-Alone Deployment: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - - - Distributed (Cluster) Deployment: [Distributed(Cluster) Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - - - Dual Active Deployment: [Dual Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -> ❗️Attention: Currently, we still recommend installing and deploying directly on physical/virtual machines. If Docker deployment is required, please refer to: [Docker Deployment](../Deployment-and-Maintenance/Docker-Deployment_timecho.md) - -5. Install database supporting tools: The enterprise version provides supporting tools such as the monitoring panel and Workbench. It is recommended to install them when deploying the enterprise version, as they make IoTDB more convenient to use: - - - Monitoring panel: Provides over a hundred database monitoring metrics for detailed monitoring of IoTDB and its operating system, enabling system optimization, performance optimization, bottleneck discovery, and more.
For installation steps, see [Monitoring panel](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - Workbench: The visual interface of IoTDB. Through an interactive interface it supports metadata operations, data querying, data visualization, and other functions, helping users work with the database easily and efficiently. For installation steps, see [Workbench Deployment](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - -## How to use it? - -1. Database modeling design: Database modeling is an important step in creating a database system. It involves designing the structure and relationships of data so that the organization of data meets the requirements of the specific application. The following documents will help you quickly understand modeling design in IoTDB: - - - Introduction to the concept of time series: [Navigating Time Series Data](../Basic-Concept/Navigating_Time_Series_Data.md) - - - Introduction to Modeling Design: [Data Model](../Basic-Concept/Data-Model-and-Terminology.md) - - - SQL syntax introduction: [Operate Metadata](../Basic-Concept/Operate-Metadata_timecho.md) - -2. Write Data: IoTDB provides multiple ways to insert real-time data. For the basic data writing operations, see [Write Data](../Basic-Concept/Write-Data) - -3. Query Data: IoTDB provides rich data query functions. For a basic introduction to data queries, see [Query Data](../Basic-Concept/Query-Data.md) - -4.
Other advanced features: In addition to common database functions such as writing and querying, IoTDB also supports Data Synchronisation, Stream Framework, Security Management, Database Administration, AI Capability, and other functions. For usage details, see the corresponding documents: - - - Data Synchronisation: [Data Synchronisation](../User-Manual/Data-Sync_timecho.md) - - - Stream Framework: [Stream Framework](../User-Manual/Streaming_timecho.md) - - - Security Management: [Security Management](../User-Manual/White-List_timecho.md) - - - Database Administration: [Database Administration](../User-Manual/Authority-Management.md) - - - AI Capability: [AI Capability](../AI-capability/AINode_timecho.md) - -5. API: IoTDB provides multiple application programming interfaces (APIs) for developers to interact with IoTDB in their applications. It currently supports the [Java Native API](../API/Programming-Java-Native-API.md), [Python Native API](../API/Programming-Python-Native-API.md), [C++ Native API](../API/Programming-Cpp-Native-API.md), and [Go Native API](../API/Programming-Go-Native-API.md). For more APIs, please refer to the API and other chapters on the official website. - -## What other convenient tools are available? - -In addition to its rich features, IoTDB also offers a comprehensive set of peripheral tools. This document will help you quickly get started with them: - - - Workbench: Workbench is a visual interface for IoTDB that supports interactive operations. It offers intuitive features for metadata management, data querying, and data visualization, enhancing the convenience and efficiency of user database operations.
For detailed usage instructions, please refer to: [Workbench](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - - - Monitor Tool: This is a tool for meticulous monitoring of IoTDB and its host operating system, covering hundreds of database monitoring metrics including database performance and system resources, which aids in system optimization and bottleneck identification. For detailed usage instructions, please refer to: [Monitor Tool](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - Benchmark Tool: IoT benchmark is a time series database benchmark testing tool developed based on Java and big data environments, developed and open sourced by the School of Software at Tsinghua University. It supports multiple writing and querying methods, can store test information and results for further query or analysis, and supports integration with Tableau to visualize test results. For specific usage instructions, please refer to: [Benchmark Tool](../Tools-System/Benchmark.md) - - - Data Import Script: For different scenarios, IoTDB provides users with multiple ways to batch import data. For specific usage instructions, please refer to: [Data Import](../Tools-System/Data-Import-Tool.md) - - - Data Export Script: For different scenarios, IoTDB provides users with multiple ways to batch export data. For specific usage instructions, please refer to: [Data Export](../Tools-System/Data-Export-Tool.md) - -## Want to Learn More About the Technical Details? - -If you are interested in delving deeper into the technical aspects of IoTDB, you can refer to the following documents: - - - Research Paper: IoTDB features columnar storage, data encoding, pre-calculation, and indexing technologies, along with a SQL-like interface and high-performance data processing capabilities. It also integrates seamlessly with Apache Hadoop, MapReduce, and Apache Spark. 
For related research papers, please refer to: [Research Paper](../Technical-Insider/Publication.md) - - - Compression & Encoding: IoTDB optimizes storage efficiency for different data types through a variety of encoding and compression techniques. To learn more, please refer to: [Compression & Encoding](../Technical-Insider/Encoding-and-Compression.md) - - - Data Partitioning and Load Balancing: IoTDB has meticulously designed data partitioning strategies and load balancing algorithms based on the characteristics of time series data, enhancing the availability and performance of the cluster. For more information, please refer to: [Data Partitioning & Load Balancing](../Technical-Insider/Cluster-data-partitioning.md) - - -## Encountering problems during use? - -If you encounter difficulties during installation or use, see [Frequently Asked Questions](../FAQ/Frequently-asked-questions.md). \ No newline at end of file diff --git a/src/UserGuide/V1.3.x/Reference/DataNode-Config-Manual-old_timecho.md b/src/UserGuide/V1.3.x/Reference/DataNode-Config-Manual-old_timecho.md deleted file mode 100644 index 9b5e10a20..000000000 --- a/src/UserGuide/V1.3.x/Reference/DataNode-Config-Manual-old_timecho.md +++ /dev/null @@ -1,592 +0,0 @@ - - -# DataNode Configuration Parameters - -IoTDB DataNode and the Standalone version use the same configuration files, all located under the `conf` directory. - -* `datanode-env.sh/bat`: Environment configuration, where the memory allocation of DataNode and Standalone can be set. - -* `iotdb-datanode.properties`: IoTDB system configuration. - -## Hot Modification Configuration - -For convenience, IoTDB supports hot modification: some configuration parameters in `iotdb-datanode.properties` can be modified while the system is running and take effect immediately. -Among the parameters described below, those whose `Effective` value is `hot-load` support hot modification.
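
As a minimal sketch of this workflow, assume a parameter marked `hot-load` has just been edited in `iotdb-datanode.properties` on the server. The change can then be applied from a connected SQL client without restarting the DataNode by sending the reload statement this manual names. Which parameters actually take effect depends on their `Effective` value.

```sql
load configuration
```

Parameters whose `Effective` value is `After restarting system` are not applied by this statement and still require a restart.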
- -How to trigger: The client sends the SQL command `load configuration` or `set configuration` to the IoTDB server. - -## Environment Configuration File (datanode-env.sh/bat) - -The environment configuration file is mainly used to configure the Java environment parameters for a running DataNode, such as JVM-related settings. This part of the configuration is passed to the JVM when the DataNode starts. - -The details of each parameter are as follows: - -* MEMORY\_SIZE - -|Name|MEMORY\_SIZE| -|:---:|:---| -|Description|The minimum heap memory size that IoTDB DataNode uses at startup | -|Type|String| -|Default| The default is half of the memory.| -|Effective|After restarting system| - -* ON\_HEAP\_MEMORY - -|Name|ON\_HEAP\_MEMORY| -|:---:|:---| -|Description|The heap memory size that IoTDB DataNode can use. Former name: MAX\_HEAP\_SIZE | -|Type|String| -|Default| Calculated based on MEMORY\_SIZE.| -|Effective|After restarting system| - -* OFF\_HEAP\_MEMORY - -|Name|OFF\_HEAP\_MEMORY| -|:---:|:---| -|Description|The direct memory that IoTDB DataNode can use. Former name: MAX\_DIRECT\_MEMORY\_SIZE| -|Type|String| -|Default| Calculated based on MEMORY\_SIZE.| -|Effective|After restarting system| - -* JMX\_LOCAL - -|Name|JMX\_LOCAL| -|:---:|:---| -|Description|JMX monitoring mode. Set to true to allow only local monitoring, or false to allow remote monitoring| -|Type|Enum String: "true", "false"| -|Default|true| -|Effective|After restarting system| - -* JMX\_PORT - -|Name|JMX\_PORT| -|:---:|:---| -|Description|JMX listening port. Please confirm that the port is not a system reserved port and is not occupied| -|Type|Short Int: [0,65535]| -|Default|31999| -|Effective|After restarting system| - -* JMX\_IP - -|Name|JMX\_IP| -|:---:|:---| -|Description|JMX listening address. Only takes effect if JMX\_LOCAL=false.
0.0.0.0 is never allowed| -|Type|String| -|Default|127.0.0.1| -|Effective|After restarting system| - -## JMX Authorization - -We **STRONGLY RECOMMEND** that you CHANGE the PASSWORD for the JMX remote connection. - -The usernames and passwords are in ${IOTDB\_CONF}/conf/jmx.password. - -The permission definitions are in ${IOTDB\_CONF}/conf/jmx.access. - -## DataNode/Standalone Configuration File (iotdb-datanode.properties) - -### Data Node RPC Configuration - -* dn\_rpc\_address - -|Name| dn\_rpc\_address | -|:---:|:-----------------------------------------------| -|Description| The address that the client RPC service listens on. | -|Type| String | -|Default| 0.0.0.0 | -|Effective| After restarting system | - -* dn\_rpc\_port - -|Name| dn\_rpc\_port | -|:---:|:---| -|Description| The port that the client RPC service listens on.| -|Type|Short Int : [0,65535]| -|Default| 6667 | -|Effective|After restarting system| - -* dn\_internal\_address - -|Name| dn\_internal\_address | -|:---:|:---| -|Description| DataNode internal service host/IP | -|Type| string | -|Default| 127.0.0.1 | -|Effective|Only allowed to be modified in first start up| - -* dn\_internal\_port - -|Name| dn\_internal\_port | -|:---:|:-------------------------------| -|Description| DataNode internal service port | -|Type| int | -|Default| 10730 | -|Effective| Only allowed to be modified in first start up | - -* dn\_mpp\_data\_exchange\_port - -|Name| dn\_mpp\_data\_exchange\_port | -|:---:|:---| -|Description| MPP data exchange port | -|Type| int | -|Default| 10740 | -|Effective|Only allowed to be modified in first start up| - -* dn\_schema\_region\_consensus\_port - -|Name| dn\_schema\_region\_consensus\_port | -|:---:|:---| -|Description| DataNode Schema replica communication port for consensus | -|Type| int | -|Default| 10750 | -|Effective|Only allowed to be modified in first start up| - -* dn\_data\_region\_consensus\_port - -|Name| dn\_data\_region\_consensus\_port | -|:---:|:---| -|Description| DataNode Data replica
communication port for consensus | -|Type| int | -|Default| 10760 | -|Effective|Only allowed to be modified in first start up| - -* dn\_join\_cluster\_retry\_interval\_ms - -|Name| dn\_join\_cluster\_retry\_interval\_ms | -|:---:|:--------------------------------------------------------------------------| -|Description| The time of data node waiting for the next retry to join into the cluster | -|Type| long | -|Default| 5000 | -|Effective| After restarting system | - -### SSL Configuration - -* enable\_thrift\_ssl - -|Name| enable\_thrift\_ssl | -|:---:|:---------------------------| -|Description|When enable\_thrift\_ssl is configured as true, SSL encryption will be used for communication through dn\_rpc\_port | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* enable\_https - -|Name| enable\_https | -|:---:|:-------------------------| -|Description| REST Service Specifies whether to enable SSL configuration | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* key\_store\_path - -|Name| key\_store\_path | -|:---:|:-----------------| -|Description| SSL certificate path | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* key\_store\_pwd - -|Name| key\_store\_pwd | -|:---:|:----------------| -|Description| SSL certificate password | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -### Target Config Nodes - -* dn\_seed\_config\_node - -|Name| dn\_seed\_config\_node | -|:---:|:------------------------------------------------| -|Description| ConfigNode Address for DataNode to join cluster | -|Type| String | -|Default| 127.0.0.1:10710 | -|Effective| Only allowed to be modified in first start up | - -### Connection Configuration - -* dn\_rpc\_thrift\_compression\_enable - -|Name| dn\_rpc\_thrift\_compression\_enable | -|:---:|:---| -|Description| Whether enable thrift's compression (using GZIP).| -|Type|Boolean| -|Default| false | -|Effective|After restarting 
system| - -* dn\_rpc\_advanced\_compression\_enable - -|Name| dn\_rpc\_advanced\_compression\_enable | -|:---:|:---| -|Description| Whether to enable Thrift's advanced compression.| -|Type|Boolean| -|Default| false | -|Effective|After restarting system| - -* dn\_rpc\_selector\_thread\_count - -|Name| dn\_rpc\_selector\_thread\_count | -|:---:|:-----------------------------------| -|Description| The number of RPC selector threads. | -|Type| int | -|Default| false | -|Effective| After restarting system | - -* dn\_rpc\_min\_concurrent\_client\_num - -|Name| dn\_rpc\_min\_concurrent\_client\_num | -|:---:|:-----------------------------------| -|Description| Minimum concurrent RPC connections | -|Type| Short Int : [0,65535] | -|Default| 1 | -|Effective| After restarting system | - -* dn\_rpc\_max\_concurrent\_client\_num - -|Name| dn\_rpc\_max\_concurrent\_client\_num | -|:---:|:--------------------------------------| -|Description| Maximum concurrent RPC connections | -|Type| Short Int : [0,65535] | -|Default| 1000 | -|Effective| After restarting system | - -* dn\_thrift\_max\_frame\_size - -|Name| dn\_thrift\_max\_frame\_size | -|:---:|:---| -|Description| Maximum size in bytes of each Thrift RPC request/response| -|Type| Long | -|Unit|Byte| -|Default| 536870912 | -|Effective|After restarting system| - -* dn\_thrift\_init\_buffer\_size - -|Name| dn\_thrift\_init\_buffer\_size | -|:---:|:---| -|Description| Initial size in bytes of the buffer used by Thrift | -|Type| long | -|Default| 1024 | -|Effective|After restarting system| - -* dn\_connection\_timeout\_ms - -| Name | dn\_connection\_timeout\_ms | -|:-----------:|:---------------------------------------------------| -| Description | Thrift socket and connection timeout between nodes | -| Type | int | -| Default | 60000 | -| Effective | After restarting system | - -* dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager | 
-|:------------:|:--------------------------------------------------------------| -| Description | Number of core clients routed to each node in a ClientManager | -| Type | int | -| Default | 200 | -| Effective | After restarting system | - -* dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager | -|:--------------:|:-------------------------------------------------------------| -| Description | Number of max clients routed to each node in a ClientManager | -| Type | int | -| Default | 300 | -| Effective | After restarting system | - -### Dictionary Configuration - -* dn\_system\_dir - -| Name | dn\_system\_dir | -|:-----------:|:----------------------------------------------------------------------------| -| Description | The directories of system files. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/system (Windows: data\\datanode\\system) | -| Effective | After restarting system | - -* dn\_data\_dirs - -| Name | dn\_data\_dirs | -|:-----------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | The directories of data files. Multiple directories are separated by comma. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. If the path does not exist, the system will automatically create it. 
| -| Type | String[] | -| Default | data/datanode/data (Windows: data\\datanode\\data) | -| Effective | After restarting system | - -* dn\_multi\_dir\_strategy - -| Name | dn\_multi\_dir\_strategy | -|:-----------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | IoTDB's strategy for selecting directories for TsFile in tsfile_dir. You can use a simple class name or a full name of the class. The system provides the following three strategies:
1. SequenceStrategy: IoTDB picks directories from tsfile\_dir in order, cycling through all of them in turn;
2. MaxDiskUsableSpaceFirstStrategy: IoTDB first selects the directory with the largest free disk space in tsfile\_dir;
You can complete a user-defined policy in the following ways:
1. Extend the org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy class and implement your own strategy method;
2. Set this configuration item to the full class name of your implementation (package name plus class name, e.g. UserDefineStrategyPackage);
3. Add the jar file to the project. | -| Type | String | -| Default | SequenceStrategy | -| Effective | hot-load | - -* dn\_consensus\_dir - -| Name | dn\_consensus\_dir | -|:-----------:|:-------------------------------------------------------------------------------| -| Description | The directories of consensus files. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/consensus | -| Effective | After restarting system | - -* dn\_wal\_dirs - -| Name | dn\_wal\_dirs | -|:-----------:|:-------------------------------------------------------------------------| -| Description | Write Ahead Log storage path. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/wal | -| Effective | After restarting system | - -* dn\_tracing\_dir - -| Name | dn\_tracing\_dir | -|:-----------:|:----------------------------------------------------------------------------| -| Description | The tracing root directory path. It is recommended to use an absolute path. | -| Type | String | -| Default | datanode/tracing | -| Effective | After restarting system | - -* dn\_sync\_dir - -| Name | dn\_sync\_dir | -|:-----------:|:--------------------------------------------------------------------------| -| Description | The directories of sync files. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/sync | -| Effective | After restarting system | - -### Metric Configuration - -## Enable GC log - -GC log is off by default. -For performance tuning, you may want to collect the GC info. - -To enable GC log, just add a parameter "printgc" when you start the DataNode. - -```bash -nohup sbin/start-datanode.sh printgc >/dev/null 2>&1 & -``` -Or -```cmd -sbin\start-datanode.bat printgc -``` - -GC log is stored at `IOTDB_HOME/logs/gc.log`. -There will be at most 10 gc.log.* files and each one can reach to 10MB. 
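With at most 10 rotated files of up to 10 MB each, the GC logs should stay at roughly 100 MB in total. As a rough way to check this on a running node, the sketch below sums the sizes of `gc.log*` files in a log directory. The helper name `gc_log_bytes` is hypothetical, and the directory is assumed to be `$IOTDB_HOME/logs` as stated above.

```shell
# Hypothetical helper (not part of IoTDB): print the total size in bytes of
# gc.log and its rotated gc.log.* siblings inside the given log directory.
gc_log_bytes() {
  local log_dir="$1" total=0 size f
  for f in "$log_dir"/gc.log*; do
    [ -f "$f" ] || continue        # skip when the glob matches nothing
    size=$(wc -c < "$f")           # size of one log file in bytes
    total=$((total + size))
  done
  echo "$total"
}

# Example call, assuming IOTDB_HOME is set:
#   gc_log_bytes "$IOTDB_HOME/logs"
```

A result far above about 100 MB (104857600 bytes) would suggest that rotation is not working as described.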
- -### REST Service Configuration - -* enable\_rest\_service - -|Name| enable\_rest\_service | -|:---:|:--------------------------------------| -|Description| Whether to enable the REST service | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* rest\_service\_port - -|Name| rest\_service\_port | -|:---:|:------------------| -|Description| The port number the REST service listens on | -|Type| int32 | -|Default| 18080 | -|Effective| After restarting system | - -* enable\_swagger - -|Name| enable\_swagger | -|:---:|:-----------------------| -|Description| Whether to enable Swagger to display REST interface information | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* rest\_query\_default\_row\_size\_limit - -|Name| rest\_query\_default\_row\_size\_limit | -|:---:|:------------------------------------------------------------------------------------------| -|Description| The maximum number of rows in a result set that can be returned by a query | -|Type| int32 | -|Default| 10000 | -|Effective| After restarting system | - -* cache\_expire - -|Name| cache\_expire | -|:---:|:--------------------------------------------------------| -|Description| Expiration time for cached user login information | -|Type| int32 | -|Default| 28800 | -|Effective| After restarting system | - -* cache\_max\_num - -|Name| cache\_max\_num | -|:---:|:--------------| -|Description| The maximum number of users stored in the cache | -|Type| int32 | -|Default| 100 | -|Effective| After restarting system | - -* cache\_init\_num - -|Name| cache\_init\_num | -|:---:|:---------------| -|Description| Initial cache capacity | -|Type| int32 | -|Default| 10 | -|Effective| After restarting system | - - -* trust\_store\_path - -|Name| trust\_store\_path | -|:---:|:---------------| -|Description| trustStore path (optional) | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* trust\_store\_pwd - -|Name| 
trust\_store\_pwd | -|:---:|:---------------------------------| -|Description| trustStore Password (Optional) | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* idle\_timeout - -|Name| idle\_timeout | -|:---:|:--------------| -|Description| SSL timeout duration, expressed in seconds | -|Type| int32 | -|Default| 5000 | -|Effective| After restarting system | - -#### Storage engine configuration - -* dn\_default\_space\_move\_thresholds - -|Name| dn\_default\_space\_move\_thresholds | -|:---:|:--------------| -|Description| Version 1.3.0/1: Define the minimum remaining space ratio for each tier data catalogue; when the remaining space is less than this ratio, the data will be automatically migrated to the next tier; when the remaining storage space of the last tier falls below this threshold, the system will be set to READ_ONLY | -|Type| double | -|Default| 0.15 | -|Effective| hot-load | - - -* dn\_default\_space\_usage\_thresholds - -|Name| dn\_default\_space\_usage\_thresholds | -|:---:|:--------------| -|Description| Version 1.3.2: Define the minimum remaining space ratio for each tier data catalogue; when the remaining space is less than this ratio, the data will be automatically migrated to the next tier; when the remaining storage space of the last tier falls below this threshold, the system will be set to READ_ONLY | -|Type| double | -|Default| 0.85 | -|Effective| hot-load | - -* remote\_tsfile\_cache\_dirs - -|Name| remote\_tsfile\_cache\_dirs | -|:---:|:--------------| -|Description| Cache directory stored locally in the cloud | -|Type| string | -|Default| data/datanode/data/cache | -|Effective| After restarting system | - -* remote\_tsfile\_cache\_page\_size\_in\_kb - -|Name| remote\_tsfile\_cache\_page\_size\_in\_kb | -|:---:|:--------------| -|Description| Block size of locally cached files stored in the cloud | -|Type| int | -|Default| 20480 | -|Effective| After restarting system | - -* 
remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb - -|Name| remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb | -|:---:|:--------------| -|Description| Maximum Disk Occupancy Size for Cloud Storage Local Cache | -|Type| long | -|Default| 51200 | -|Effective| After restarting system | - -* object\_storage\_type - -|Name| object\_storage\_type | -|:---:|:--------------| -|Description| Cloud Storage Type | -|Type| string | -|Default| AWS_S3 | -|Effective| After restarting system | - -* object\_storage\_bucket - -|Name| object\_storage\_bucket | -|:---:|:--------------| -|Description| Name of cloud storage bucket | -|Type| string | -|Default| iotdb_data | -|Effective| After restarting system | - -* object\_storage\_endpoint - -|Name| object\_storage\_endpoint | -|:---:|:--------------------------------| -|Description| endpoint of cloud storage | -|Type| string | -|Default| None | -|Effective| After restarting system | - -* object\_storage\_access\_key - -|Name| object\_storage\_access\_key | -|:---:|:--------------| -|Description| Authentication information stored in the cloud: key | -|Type| string | -|Default| None | -|Effective| After restarting system | - -* object\_storage\_access\_secret - -|Name| object\_storage\_access\_secret | -|:---:|:--------------| -|Description| Authentication information stored in the cloud: secret | -|Type| string | -|Default| None | -|Effective| After restarting system | diff --git a/src/UserGuide/V1.3.x/Reference/DataNode-Config-Manual_timecho.md b/src/UserGuide/V1.3.x/Reference/DataNode-Config-Manual_timecho.md deleted file mode 100644 index 3ce17bebe..000000000 --- a/src/UserGuide/V1.3.x/Reference/DataNode-Config-Manual_timecho.md +++ /dev/null @@ -1,584 +0,0 @@ - - -# DataNode Configuration Parameters - -We use the same configuration files for IoTDB DataNode and Standalone version, all under the `conf`. - -* `datanode-env.sh/bat`:Environment configurations, in which we could set the memory allocation of DataNode and Standalone. 
- -* `iotdb-system.properties`: IoTDB system configurations. - -## Hot Modification Configuration - -For convenience, IoTDB supports hot modification: some configuration parameters in `iotdb-system.properties` can be changed while the system is running and take effect immediately. -Among the parameters described below, those whose `Effective` value is `hot-load` support hot modification. - -How to trigger: The client sends the SQL command `load configuration` or `set configuration` to the IoTDB server. - -## Environment Configuration File (datanode-env.sh/bat) - -The environment configuration file is mainly used to configure the Java environment parameters of a running DataNode, such as JVM settings. This part of the configuration is passed to the JVM when the DataNode starts. - -The details of each parameter are as follows: - -* MEMORY\_SIZE - -|Name|MEMORY\_SIZE| -|:---:|:---| -|Description|The minimum heap memory size that IoTDB DataNode will use at startup | -|Type|String| -|Default| The default is half of the memory.| -|Effective|After restarting system| - -* ON\_HEAP\_MEMORY - -|Name|ON\_HEAP\_MEMORY| -|:---:|:---| -|Description|The heap memory size that IoTDB DataNode can use. Former name: MAX\_HEAP\_SIZE | -|Type|String| -|Default| Calculated based on MEMORY\_SIZE.| -|Effective|After restarting system| - -* OFF\_HEAP\_MEMORY - -|Name|OFF\_HEAP\_MEMORY| -|:---:|:---| -|Description|The direct memory that IoTDB DataNode can use. Former name: MAX\_DIRECT\_MEMORY\_SIZE| -|Type|String| -|Default| Calculated based on MEMORY\_SIZE.| -|Effective|After restarting system| - -* JMX\_LOCAL - -|Name|JMX\_LOCAL| -|:---:|:---| -|Description|JMX monitoring mode. Set to true to allow only local monitoring, or false to allow remote monitoring| -|Type|Enum String: "true", "false"| -|Default|true| -|Effective|After restarting system| - -* JMX\_PORT - -|Name|JMX\_PORT| -|:---:|:---| 
-|Description|JMX listening port. Please confirm that the port is not a system reserved port and is not occupied| -|Type|Short Int: [0,65535]| -|Default|31999| -|Effective|After restarting system| - -* JMX\_IP - -|Name|JMX\_IP| -|:---:|:---| -|Description|JMX listening address. Only take effect if JMX\_LOCAL=false. 0.0.0.0 is never allowed| -|Type|String| -|Default|127.0.0.1| -|Effective|After restarting system| - -## JMX Authorization - -We **STRONGLY RECOMMENDED** you CHANGE the PASSWORD for the JMX remote connection. - -The user and passwords are in ${IOTDB\_CONF}/conf/jmx.password. - -The permission definitions are in ${IOTDB\_CONF}/conf/jmx.access. - -## DataNode/Standalone Configuration File (iotdb-system.properties) - -### Data Node RPC Configuration - -* dn\_rpc\_address - -|Name| dn\_rpc\_address | -|:---:|:-----------------------------------------------| -|Description| The client rpc service listens on the address. | -|Type| String | -|Default| 0.0.0.0 | -|Effective| After restarting system | - -* dn\_rpc\_port - -|Name| dn\_rpc\_port | -|:---:|:---| -|Description| The client rpc service listens on the port.| -|Type|Short Int : [0,65535]| -|Default| 6667 | -|Effective|After restarting system| - -* dn\_internal\_address - -|Name| dn\_internal\_address | -|:---:|:---| -|Description| DataNode internal service host/IP | -|Type| string | -|Default| 127.0.0.1 | -|Effective|Only allowed to be modified in first start up| - -* dn\_internal\_port - -|Name| dn\_internal\_port | -|:---:|:-------------------------------| -|Description| DataNode internal service port | -|Type| int | -|Default| 10730 | -|Effective| Only allowed to be modified in first start up | - -* dn\_mpp\_data\_exchange\_port - -|Name| mpp\_data\_exchange\_port | -|:---:|:---| -|Description| MPP data exchange port | -|Type| int | -|Default| 10740 | -|Effective|Only allowed to be modified in first start up| - -* dn\_schema\_region\_consensus\_port - -|Name| dn\_schema\_region\_consensus\_port | 
-|:---:|:---| -|Description| DataNode Schema replica communication port for consensus | -|Type| int | -|Default| 10750 | -|Effective|Only allowed to be modified in first start up| - -* dn\_data\_region\_consensus\_port - -|Name| dn\_data\_region\_consensus\_port | -|:---:|:---| -|Description| DataNode Data replica communication port for consensus | -|Type| int | -|Default| 10760 | -|Effective|Only allowed to be modified in first start up| - -* dn\_join\_cluster\_retry\_interval\_ms - -|Name| dn\_join\_cluster\_retry\_interval\_ms | -|:---:|:--------------------------------------------------------------------------| -|Description| The time of data node waiting for the next retry to join into the cluster | -|Type| long | -|Default| 5000 | -|Effective| After restarting system | - -### SSL Configuration - -* enable\_thrift\_ssl - -|Name| enable\_thrift\_ssl | -|:---:|:---------------------------| -|Description|When enable\_thrift\_ssl is configured as true, SSL encryption will be used for communication through dn\_rpc\_port | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* enable\_https - -|Name| enable\_https | -|:---:|:-------------------------| -|Description| REST Service Specifies whether to enable SSL configuration | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* key\_store\_path - -|Name| key\_store\_path | -|:---:|:-----------------| -|Description| SSL certificate path | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* key\_store\_pwd - -|Name| key\_store\_pwd | -|:---:|:----------------| -|Description| SSL certificate password | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -### SeedConfigNode - -* dn\_seed\_config\_node - -|Name| dn\_seed\_config\_node | -|:---:|:------------------------------------------------| -|Description| ConfigNode Address for DataNode to join cluster. 
This parameter corresponds to dn\_target\_config\_node\_list before V1.2.2 | -|Type| String | -|Default| 127.0.0.1:10710 | -|Effective| Only allowed to be modified in first start up | - -### Connection Configuration - -* dn\_rpc\_thrift\_compression\_enable - -|Name| dn\_rpc\_thrift\_compression\_enable | -|:---:|:---| -|Description| Whether to enable Thrift's compression (using GZIP).| -|Type|Boolean| -|Default| false | -|Effective|After restarting system| - -* dn\_rpc\_advanced\_compression\_enable - -|Name| dn\_rpc\_advanced\_compression\_enable | -|:---:|:---| -|Description| Whether to enable Thrift's advanced compression.| -|Type|Boolean| -|Default| false | -|Effective|After restarting system| - -* dn\_rpc\_selector\_thread\_count - -|Name| dn\_rpc\_selector\_thread\_count | -|:---:|:-----------------------------------| -|Description| The number of RPC selector threads. | -|Type| int | -|Default| false | -|Effective| After restarting system | - -* dn\_rpc\_min\_concurrent\_client\_num - -|Name| dn\_rpc\_min\_concurrent\_client\_num | -|:---:|:-----------------------------------| -|Description| Minimum concurrent RPC connections | -|Type| Short Int : [0,65535] | -|Default| 1 | -|Effective| After restarting system | - -* dn\_rpc\_max\_concurrent\_client\_num - -|Name| dn\_rpc\_max\_concurrent\_client\_num | -|:---:|:--------------------------------------| -|Description| Maximum concurrent RPC connections | -|Type| Short Int : [0,65535] | -|Default| 1000 | -|Effective| After restarting system | - -* dn\_thrift\_max\_frame\_size - -|Name| dn\_thrift\_max\_frame\_size | -|:---:|:---| -|Description| Maximum size in bytes of each Thrift RPC request/response| -|Type| Long | -|Unit|Byte| -|Default| 536870912 | -|Effective|After restarting system| - -* dn\_thrift\_init\_buffer\_size - -|Name| dn\_thrift\_init\_buffer\_size | -|:---:|:---| -|Description| Initial size in bytes of the buffer used by Thrift | -|Type| long | -|Default| 1024 | -|Effective|After restarting system| 
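
Because every connection parameter above only takes effect after a restart, it can help to confirm what the configuration file actually contains before restarting. The following is a minimal sketch, not an IoTDB tool: the helper name `get_prop` is hypothetical, it assumes plain `key=value` lines without spaces around `=`, and the defaults passed in are the ones listed in the tables above.

```shell
# Hypothetical helper (not part of IoTDB): print the value of KEY from a
# properties file, or DEFAULT when the key is absent or commented out.
get_prop() {
  local file="$1" key="$2" default="$3" value
  # Keep the last uncommented assignment; tolerate a missing key or file.
  value=$(grep -E "^${key}=" "$file" 2>/dev/null | tail -n 1 | cut -d= -f2-) || true
  echo "${value:-$default}"
}

# Example calls, assuming the working directory is the IoTDB home:
#   get_prop conf/iotdb-system.properties dn_rpc_thrift_compression_enable false
#   get_prop conf/iotdb-system.properties dn_thrift_max_frame_size 536870912
```

For reference, the default frame size of 536870912 bytes is 512 MiB.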
- -* dn\_connection\_timeout\_ms - -| Name | dn\_connection\_timeout\_ms | -|:-----------:|:---------------------------------------------------| -| Description | Thrift socket and connection timeout between nodes | -| Type | int | -| Default | 60000 | -| Effective | After restarting system | - -* dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager | -|:------------:|:--------------------------------------------------------------| -| Description | Number of core clients routed to each node in a ClientManager | -| Type | int | -| Default | 200 | -| Effective | After restarting system | - -* dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager | -|:--------------:|:-------------------------------------------------------------| -| Description | Number of max clients routed to each node in a ClientManager | -| Type | int | -| Default | 300 | -| Effective | After restarting system | - -### Dictionary Configuration - -* dn\_system\_dir - -| Name | dn\_system\_dir | -|:-----------:|:----------------------------------------------------------------------------| -| Description | The directories of system files. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/system (Windows: data\\datanode\\system) | -| Effective | After restarting system | - -* dn\_data\_dirs - -| Name | dn\_data\_dirs | -|:-----------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | The directories of data files. Multiple directories are separated by comma. The starting directory of the relative path is related to the operating system. 
It is recommended to use an absolute path. If the path does not exist, the system will automatically create it. | -| Type | String[] | -| Default | data/datanode/data (Windows: data\\datanode\\data) | -| Effective | After restarting system | - -* dn\_multi\_dir\_strategy - -| Name | dn\_multi\_dir\_strategy | -|:-----------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | IoTDB's strategy for selecting directories for TsFile in tsfile_dir. You can use a simple class name or a full name of the class. The system provides the following three strategies:
1. SequenceStrategy: IoTDB picks directories from tsfile\_dir in order, cycling through all of them in turn;
2. MaxDiskUsableSpaceFirstStrategy: IoTDB first selects the directory with the largest free disk space in tsfile\_dir;
You can complete a user-defined policy in the following ways:
1. Extend the org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy class and implement your own strategy method;
2. Set this configuration item to the full class name of your implementation (package name plus class name, e.g. UserDefineStrategyPackage);
3. Add the jar file to the project. | -| Type | String | -| Default | SequenceStrategy | -| Effective | hot-load | - -* dn\_consensus\_dir - -| Name | dn\_consensus\_dir | -|:-----------:|:-------------------------------------------------------------------------------| -| Description | The directories of consensus files. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/consensus | -| Effective | After restarting system | - -* dn\_wal\_dirs - -| Name | dn\_wal\_dirs | -|:-----------:|:-------------------------------------------------------------------------| -| Description | Write Ahead Log storage path. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/wal | -| Effective | After restarting system | - -* dn\_tracing\_dir - -| Name | dn\_tracing\_dir | -|:-----------:|:----------------------------------------------------------------------------| -| Description | The tracing root directory path. It is recommended to use an absolute path. | -| Type | String | -| Default | datanode/tracing | -| Effective | After restarting system | - -* dn\_sync\_dir - -| Name | dn\_sync\_dir | -|:-----------:|:--------------------------------------------------------------------------| -| Description | The directories of sync files. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/sync | -| Effective | After restarting system | - -### Metric Configuration - -## Enable GC log - -GC log is off by default. -For performance tuning, you may want to collect the GC info. - -To enable GC log, just add a parameter "printgc" when you start the DataNode. - -```bash -nohup sbin/start-datanode.sh printgc >/dev/null 2>&1 & -``` -Or -```cmd -sbin\start-datanode.bat printgc -``` - -GC log is stored at `IOTDB_HOME/logs/gc.log`. -There will be at most 10 gc.log.* files and each one can reach to 10MB. 
- -### REST Service Configuration - -* enable\_rest\_service - -|Name| enable\_rest\_service | -|:---:|:--------------------------------------| -|Description| Whether to enable the REST service | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* rest\_service\_port - -|Name| rest\_service\_port | -|:---:|:------------------| -|Description| The port number the REST service listens on | -|Type| int32 | -|Default| 18080 | -|Effective| After restarting system | - -* enable\_swagger - -|Name| enable\_swagger | -|:---:|:-----------------------| -|Description| Whether to enable Swagger to display REST interface information | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* rest\_query\_default\_row\_size\_limit - -|Name| rest\_query\_default\_row\_size\_limit | -|:---:|:------------------------------------------------------------------------------------------| -|Description| The maximum number of rows in a result set that can be returned by a query | -|Type| int32 | -|Default| 10000 | -|Effective| After restarting system | - -* cache\_expire - -|Name| cache\_expire | -|:---:|:--------------------------------------------------------| -|Description| Expiration time for cached user login information | -|Type| int32 | -|Default| 28800 | -|Effective| After restarting system | - -* cache\_max\_num - -|Name| cache\_max\_num | -|:---:|:--------------| -|Description| The maximum number of users stored in the cache | -|Type| int32 | -|Default| 100 | -|Effective| After restarting system | - -* cache\_init\_num - -|Name| cache\_init\_num | -|:---:|:---------------| -|Description| Initial cache capacity | -|Type| int32 | -|Default| 10 | -|Effective| After restarting system | - - -* trust\_store\_path - -|Name| trust\_store\_path | -|:---:|:---------------| -|Description| trustStore path (optional) | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* trust\_store\_pwd - -|Name| 
trust\_store\_pwd | -|:---:|:---------------------------------| -|Description| trustStore Password (Optional) | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* idle\_timeout - -|Name| idle\_timeout | -|:---:|:--------------| -|Description| SSL timeout duration, expressed in seconds | -|Type| int32 | -|Default| 5000 | -|Effective| After restarting system | - - -#### Storage engine configuration - - -* dn\_default\_space\_usage\_thresholds - -|Name| dn\_default\_space\_usage\_thresholds | -|:---:|:--------------| -|Description| Define the minimum remaining space ratio for each tier data catalogue; when the remaining space is less than this ratio, the data will be automatically migrated to the next tier; when the remaining storage space of the last tier falls below this threshold, the system will be set to READ_ONLY | -|Type| double | -|Default| 0.85 | -|Effective| hot-load | - -* remote\_tsfile\_cache\_dirs - -|Name| remote\_tsfile\_cache\_dirs | -|:---:|:--------------| -|Description| Cache directory stored locally in the cloud | -|Type| string | -|Default| data/datanode/data/cache | -|Effective| After restarting system | - -* remote\_tsfile\_cache\_page\_size\_in\_kb - -|Name| remote\_tsfile\_cache\_page\_size\_in\_kb | -|:---:|:--------------| -|Description| Block size of locally cached files stored in the cloud | -|Type| int | -|Default| 20480 | -|Effective| After restarting system | - -* remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb - -|Name| remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb | -|:---:|:--------------| -|Description| Maximum Disk Occupancy Size for Cloud Storage Local Cache | -|Type| long | -|Default| 51200 | -|Effective| After restarting system | - -* object\_storage\_type - -|Name| object\_storage\_type | -|:---:|:--------------| -|Description| Cloud Storage Type | -|Type| string | -|Default| AWS_S3 | -|Effective| After restarting system | - -* object\_storage\_bucket - -|Name| object\_storage\_bucket | 
-|:---:|:--------------| -|Description| Name of cloud storage bucket | -|Type| string | -|Default| iotdb_data | -|Effective| After restarting system | - -* object\_storage\_endpoint - -|Name| object\_storage\_endpoint | -|:---:|:--------------------------------| -|Description| endpoint of cloud storage | -|Type| string | -|Default| None | -|Effective| After restarting system | - -* object\_storage\_access\_key - -|Name| object\_storage\_access\_key | -|:---:|:--------------| -|Description| Authentication information stored in the cloud: key | -|Type| string | -|Default| None | -|Effective| After restarting system | - -* object\_storage\_access\_secret - -|Name| object\_storage\_access\_secret | -|:---:|:--------------| -|Description| Authentication information stored in the cloud: secret | -|Type| string | -|Default| None | -|Effective| After restarting system | diff --git a/src/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md b/src/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md deleted file mode 100644 index 2f719fe11..000000000 --- a/src/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md +++ /dev/null @@ -1,5064 +0,0 @@ - - -# UDF Libraries - -# UDF Libraries - -Based on the ability of user-defined functions, IoTDB provides a series of functions for temporal data processing, including data quality, data profiling, anomaly detection, frequency domain analysis, data matching, data repairing, sequence discovery, machine learning, etc., which can meet the needs of industrial fields for temporal data processing. - -> Note: The functions in the current UDF library only support millisecond level timestamp accuracy. - -## Installation steps - -1. Please obtain the compressed file of the UDF library JAR package that is compatible with the IoTDB version. 
- - | UDF installation package | Supported IoTDB versions | Download link | - | --------------- | ----------------- | ------------------------------------------------------------ | - | TimechoDB-UDF-1.3.3.zip | V1.3.3 and above | Please contact Timecho for assistance | - | TimechoDB-UDF-1.3.2.zip | V1.0.0~V1.3.2 | Please contact Timecho for assistance| - -2. Extract the `library-udf.jar` file from the compressed file and place it in the `/ext/udf` directory of every node in the IoTDB cluster -3. In the SQL command line terminal (CLI) of IoTDB or the SQL operation interface of the visualization console (Workbench), execute the registration statement of the function you need, as listed below. -4. Batch registration: two methods are available, the registration script or the full set of SQL statements -- Register Script - - Copy the registration script (`register-UDF.sh` or `register-UDF.bat`) from the compressed package to the `tools` directory of IoTDB as needed, and modify the parameters in the script (default is host=127.0.0.1, rpcPort=6667, user=root, pass=root); - - Start the IoTDB service and run the registration script to batch register the UDFs - -- All SQL statements - - Open the SQL file in the compressed package, copy all SQL statements, and execute them in the SQL command line terminal (CLI) of IoTDB or the SQL operation interface of the visualization console (Workbench) to batch register the UDFs - -## Data Quality - -### Completeness - -#### Registration statement - -```sql -create function completeness as 'org.apache.iotdb.library.dquality.UDTFCompleteness' -``` - -#### Usage - -This function calculates the completeness of a time series, which measures the presence or absence of missing values in the time series data. The function divides the input time series data into consecutive non-overlapping time windows, computes the data completeness for each window individually, and outputs the timestamp of the first data point in the window along with the completeness result.
- -**Name:** COMPLETENESS - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window. It is either a positive integer or a positive number with a unit. The former is the number of data points in each window; the last window may contain fewer points. The latter is the time span of the window; the unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. -+ `downtime`: Whether downtime is taken into account when calculating completeness. It is 'true' or 'false' (default). When it is 'true', long-term missing data is treated as downtime and does not affect completeness. - -**Output Series:** Output a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** The calculation is performed only when the number of data points in the window exceeds 10. Otherwise, the window is ignored and nothing is output. - -#### Examples - -##### Default Parameters - -With default parameters, this function will regard all input data as the same window.
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select completeness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-----------------------------+ -| Time|completeness(root.test.d1.s1)| -+-----------------------------+-----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -+-----------------------------+-----------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select completeness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------+ -| Time|completeness(root.test.d1.s1, "window"="15")| -+-----------------------------+--------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+--------------------------------------------+ -``` - -### Consistency - -#### Registration statement - -```sql -create function 
consistency as 'org.apache.iotdb.library.dquality.UDTFConsistency' -``` - -#### Usage - -This function calculates the consistency of a time series, which measures whether the changes in the time series data are stable and follow uniform patterns. The function divides the input time series data into consecutive non-overlapping time windows, computes the data consistency for each window individually, and outputs the timestamp of the first data point in the window along with the consistency result. - -**Name:** CONSISTENCY - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window. It is either a positive integer or a positive number with a unit. The former is the number of data points in each window; the last window may contain fewer points. The latter is the time span of the window; the unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. - -**Output Series:** Output a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** The calculation is performed only when the number of data points in the window exceeds 10. Otherwise, the window is ignored and nothing is output. - -#### Examples - -##### Default Parameters - -With default parameters, this function will regard all input data as the same window.
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select consistency(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|consistency(root.test.d1.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+----------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select consistency(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------+ -| Time|consistency(root.test.d1.s1, "window"="15")| -+-----------------------------+-------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+-------------------------------------------+ -``` - -### Timeliness - -#### Registration statement - -```sql -create 
function timeliness as 'org.apache.iotdb.library.dquality.UDTFTimeliness' -``` - -#### Usage - -This function calculates the timeliness of a time series, which measures whether the time series data is collected and reported on schedule. The function divides the input time series data into consecutive non-overlapping time windows, computes the data timeliness for each window individually, and outputs the timestamp of the first data point in the window along with the timeliness result. - -**Name:** TIMELINESS - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window. It is either a positive integer or a positive number with a unit. The former is the number of data points in each window; the last window may contain fewer points. The latter is the time span of the window; the unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. - -**Output Series:** Output a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** The calculation is performed only when the number of data points in the window exceeds 10. Otherwise, the window is ignored and nothing is output. - -#### Examples - -##### Default Parameters - -With default parameters, this function will regard all input data as the same window.
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timeliness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------+ -| Time|timeliness(root.test.d1.s1)| -+-----------------------------+---------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+---------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timeliness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------+ -| Time|timeliness(root.test.d1.s1, "window"="15")| -+-----------------------------+------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+------------------------------------------+ -``` - -### Validity - -#### Registration statement - -```sql -create function 
validity as 'org.apache.iotdb.library.dquality.UDTFValidity' -``` - -#### Usage - -This function calculates the validity of a time series, which measures whether the time series data is normal, usable, and free of outliers. The function divides the input time series data into consecutive non-overlapping time windows, computes the data validity for each window individually, and outputs the timestamp of the first data point in the window along with the validity result. - -**Name:** VALIDITY - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window. It is either a positive integer or a positive number with a unit. The former is the number of data points in each window; the last window may contain fewer points. The latter is the time span of the window; the unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. - -**Output Series:** Output a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** The calculation is performed only when the number of data points in the window exceeds 10. Otherwise, the window is ignored and nothing is output. - -#### Examples - -##### Default Parameters - -With default parameters, this function will regard all input data as the same window.
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select Validity(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|validity(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -+-----------------------------+-------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select Validity(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|validity(root.test.d1.s1, "window"="15")| -+-----------------------------+----------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - - - - - -## Data Profiling - -### ACF - -#### Registration statement - -```sql 
-create function acf as 'org.apache.iotdb.library.dprofile.UDTFACF' -``` - -#### Usage - -This function is used to calculate the autocorrelation function (ACF) of the input time series, -which equals the cross-correlation of the series with itself. -For more information, please refer to the [XCorr](#XCorr) function. - -**Name:** ACF - -**Input Series:** Only a single numeric input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. -There are $2N-1$ data points in the series, and the values are explained in detail in the [XCorr](#XCorr) function. - -**Note:** - -+ `null` and `NaN` values in the input series are treated as 0. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| null| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select acf(s1) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|acf(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 6.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -|1970-01-01T08:00:00.004+08:00| 7.0| -|1970-01-01T08:00:00.005+08:00| 0.0| -|1970-01-01T08:00:00.006+08:00| 3.6| -|1970-01-01T08:00:00.007+08:00| 0.0| -|1970-01-01T08:00:00.008+08:00| 1.0| -+-----------------------------+--------------------+ -``` - -### Distinct - -#### Registration statement - -```sql -create function distinct as 'org.apache.iotdb.library.dprofile.UDTFDistinct' -``` - -#### Usage - -This function returns all unique values in a time series.
- -**Name:** DISTINCT - -**Input Series:** Only a single input series is supported. The type is arbitrary. - -**Output Series:** Output a single series. The type is the same as the input. - -**Note:** - -+ The timestamp of the output series is meaningless. The output order is arbitrary. -+ Missing points and null points in the input series will be ignored, but `NaN` will not. -+ String comparison is case-sensitive. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2020-01-01T08:00:00.001+08:00| Hello| -|2020-01-01T08:00:00.002+08:00| hello| -|2020-01-01T08:00:00.003+08:00| Hello| -|2020-01-01T08:00:00.004+08:00| World| -|2020-01-01T08:00:00.005+08:00| World| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select distinct(s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|distinct(root.test.d2.s2)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.001+08:00| Hello| -|1970-01-01T08:00:00.002+08:00| hello| -|1970-01-01T08:00:00.003+08:00| World| -+-----------------------------+-------------------------+ -``` - -### Histogram - -#### Registration statement - -```sql -create function histogram as 'org.apache.iotdb.library.dprofile.UDTFHistogram' -``` - -#### Usage - -This function is used to calculate the distribution histogram of a single column of numerical data. - -**Name:** HISTOGRAM - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `min`: The lower limit of the requested data range, the default value is -Double.MAX_VALUE. -+ `max`: The upper limit of the requested data range, the default value is Double.MAX_VALUE. The value of `min` must be less than or equal to `max`. -+ `count`: The number of buckets of the histogram, the default value is 1.
It must be a positive integer. - -**Output Series:** The value of the bucket of the histogram, where the lower bound represented by the i-th bucket (index starts from 1) is $min+ (i-1)\cdot\frac{max-min}{count}$ and the upper bound is $min + i \cdot \frac{max-min}{count}$. - -**Note:** - -+ If the value is lower than `min`, it will be put into the 1st bucket. If the value is larger than `max`, it will be put into the last bucket. -+ Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 11.0| -|2020-01-01T00:00:11.000+08:00| 12.0| -|2020-01-01T00:00:12.000+08:00| 13.0| -|2020-01-01T00:00:13.000+08:00| 14.0| -|2020-01-01T00:00:14.000+08:00| 15.0| -|2020-01-01T00:00:15.000+08:00| 16.0| -|2020-01-01T00:00:16.000+08:00| 17.0| -|2020-01-01T00:00:17.000+08:00| 18.0| -|2020-01-01T00:00:18.000+08:00| 19.0| -|2020-01-01T00:00:19.000+08:00| 20.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select histogram(s1,"min"="1","max"="20","count"="10") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------+ -| Time|histogram(root.test.d1.s1, "min"="1", "max"="20", "count"="10")| -+-----------------------------+---------------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 2| -|1970-01-01T08:00:00.001+08:00| 2| 
-|1970-01-01T08:00:00.002+08:00| 2| -|1970-01-01T08:00:00.003+08:00| 2| -|1970-01-01T08:00:00.004+08:00| 2| -|1970-01-01T08:00:00.005+08:00| 2| -|1970-01-01T08:00:00.006+08:00| 2| -|1970-01-01T08:00:00.007+08:00| 2| -|1970-01-01T08:00:00.008+08:00| 2| -|1970-01-01T08:00:00.009+08:00| 2| -+-----------------------------+---------------------------------------------------------------+ -``` - -### Integral - -#### Registration statement - -```sql -create function integral as 'org.apache.iotdb.library.dprofile.UDAFIntegral' -``` - -#### Usage - -This function is used to calculate the integral of a time series, -which equals the area under the curve with time as the X-axis and values as the Y-axis. - -**Name:** INTEGRAL - -**Input Series:** Only a single numeric input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `unit`: The unit of time used when computing the integral. - The value should be chosen from "1S", "1s", "1m", "1H", "1d" (case-sensitive), - which take one millisecond / second / minute / hour / day, respectively, as 1.0 when calculating the area. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and whose value is the integral. - -**Note:** - -+ The integral value equals the sum of the areas of the right-angled trapezoids formed by each pair of adjacent points and the time axis. - Choosing a different `unit` only rescales the time axis, so results for different units differ by a constant factor and are easy to convert. - -+ `NaN` values in the input series will be ignored. The curve or trapezoids will skip these points and use the next valid point. - -#### Examples - -##### Default Parameters - -With default parameters, this function will take one second as 1.0.
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| 2| -|2020-01-01T00:00:03.000+08:00| 5| -|2020-01-01T00:00:04.000+08:00| 6| -|2020-01-01T00:00:05.000+08:00| 7| -|2020-01-01T00:00:08.000+08:00| 8| -|2020-01-01T00:00:09.000+08:00| NaN| -|2020-01-01T00:00:10.000+08:00| 10| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select integral(s1) from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|integral(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.000+08:00| 57.5| -+-----------------------------+-------------------------+ -``` - -Calculation expression: -$$\frac{1}{2}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 57.5$$ - -##### Specific time unit - -With time unit specified as "1m", this function will take one minute as 1.0. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select integral(s1, "unit"="1m") from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|integral(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.958| -+-----------------------------+-------------------------+ -``` - -Calculation expression: -$$\frac{1}{2\times 60}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 0.958$$ - -### IntegralAvg - -#### Registration statement - -```sql -create function integralavg as 'org.apache.iotdb.library.dprofile.UDAFIntegralAvg' -``` - -#### Usage - -This function is used to calculate the function average of time series. 
-The output equals the area divided by the time interval, using the same time `unit`. -For more information about the area under the curve, please refer to the `Integral` function. - -**Name:** INTEGRALAVG - -**Input Series:** Only supports a single numeric input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and whose value is the time-weighted average. - -**Note:** - -+ The time-weighted average equals the integral value, computed with any `unit`, divided by the time interval of the input series measured in the same `unit`. - The result is therefore independent of the time unit used in the integral, and it's consistent with the timestamp precision of IoTDB by default. - -+ `NaN` values in the input series will be ignored. The curve or trapezoids will skip these points and use the next valid point. - -+ If the input series is empty, the output value will be 0.0; if there is only one data point, the output equals that point's value.
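
The trapezoid rule described in the notes can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the library's implementation: the helper names `integral` and `integral_avg`, the `unit_ms` argument, and the use of millisecond timestamps are assumptions made for this example.

```python
import math

def _valid(points):
    """Drop points whose value is NaN, as the notes describe."""
    return [(t, v) for t, v in points if not math.isnan(v)]

def integral(points, unit_ms=1000.0):
    """Trapezoidal area under (timestamp_ms, value) points.

    unit_ms is the length of one time unit in milliseconds
    (1000.0 mirrors the default of one second).
    """
    pts = _valid(points)
    area_ms = 0.0
    for (t0, v0), (t1, v1) in zip(pts, pts[1:]):
        area_ms += (v0 + v1) * (t1 - t0) / 2.0
    return area_ms / unit_ms

def integral_avg(points):
    """Time-weighted average: area divided by the covered time interval."""
    pts = _valid(points)
    if not pts:
        return 0.0            # empty input
    if len(pts) == 1:
        return pts[0][1]      # a single point is its own average
    span_ms = pts[-1][0] - pts[0][0]
    return integral(pts, unit_ms=1.0) / span_ms

# Same data as the example series (timestamps in milliseconds).
series = [(1000, 1.0), (2000, 2.0), (3000, 5.0), (4000, 6.0),
          (5000, 7.0), (8000, 8.0), (9000, float("nan")), (10000, 10.0)]

area_seconds = integral(series)
average = integral_avg(series)
print(area_seconds, average)
```

Because the area and the interval are measured in the same unit, the unit cancels in `integral_avg`, which is why the average does not depend on `unit`.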
- -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| 2| -|2020-01-01T00:00:03.000+08:00| 5| -|2020-01-01T00:00:04.000+08:00| 6| -|2020-01-01T00:00:05.000+08:00| 7| -|2020-01-01T00:00:08.000+08:00| 8| -|2020-01-01T00:00:09.000+08:00| NaN| -|2020-01-01T00:00:10.000+08:00| 10| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select integralavg(s1) from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|integralavg(root.test.d1.s1)| -+-----------------------------+----------------------------+ -|1970-01-01T08:00:00.000+08:00| 6.388888888888889| -+-----------------------------+----------------------------+ -``` - -Calculation expression (the time interval from 00:00:01 to 00:00:10 is 9 seconds): -$$\frac{1}{2}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] / 9 \approx 6.389$$ - -### Mad - -#### Registration statement - -```sql -create function mad as 'org.apache.iotdb.library.dprofile.UDAFMad' -``` - -#### Usage - -The function is used to compute the exact or approximate median absolute deviation (MAD) of a numeric time series. MAD is the median of the absolute deviations of the elements from their median. - -Take the dataset $\{1,3,3,5,5,6,7,8,9\}$ as an example. Its median is 5, and the absolute deviations of the elements from the median, in sorted order, are $\{0,0,1,2,2,2,3,4,4\}$, whose median is 2. Therefore, the MAD of the original dataset is 2. - -**Name:** MAD - -**Input Series:** Only supports a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `error`: The relative error of the approximate MAD. It should be within [0,1) and the default value is 0.
Taking `error`=0.01 as an instance, suppose the exact MAD is $a$ and the approximate MAD is $b$, we have $0.99a \le b \le 1.01a$. With `error`=0, the output is the exact MAD. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the MAD. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -##### Approximate Query - -By setting `error` within (0,1), the function queries the approximate MAD. - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -............ 
-Total line number = 20 -``` - -SQL for query: - -```sql -select mad(s1, "error"="0.01") from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -| Time|mad(root.test.s1, "error"="0.01")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9900000000000001| -+-----------------------------+---------------------------------+ -``` - -### Median - -#### Registration statement - -```sql -create function median as 'org.apache.iotdb.library.dprofile.UDAFMedian' -``` - -#### Usage - -The function is used to compute the exact or approximate median of a numeric time series. Median is the value separating the higher half from the lower half of a data sample. - -**Name:** MEDIAN - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `error`: The rank error of the approximate median. It should be within [0,1) and the default value is 0. For instance, a median with `error`=0.01 is the value of the element with rank percentage 0.49~0.51. With `error`=0, the output is the exact median. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the median. 
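
For the exact case (`error`=0), the median, and the MAD that is defined in terms of it, can be sketched in plain Python. This is a minimal illustration rather than the library's code: the helper names `exact_median` and `exact_mad` are made up for this example, and the approximate (sketch-based) path used when `error` is greater than 0 is not shown.

```python
import math
from statistics import median

def _clean(values):
    """Ignore missing (None) and NaN values, as the notes describe."""
    return [v for v in values if v is not None and not math.isnan(v)]

def exact_median(values):
    """Value separating the higher half from the lower half of the data."""
    return float(median(_clean(values)))

def exact_mad(values):
    """Median of the absolute deviations from the median."""
    data = _clean(values)
    center = median(data)
    return float(median([abs(v - center) for v in data]))

# The small dataset used in the MAD description.
small = [1, 3, 3, 5, 5, 6, 7, 8, 9]

# The 20-point series used in the MAD and Median examples.
series = [0.0, 0.0, 1.0, -1.0, 0.0, 0.0, -2.0, 2.0, 0.0, 0.0,
          1.0, -1.0, -1.0, 1.0, 0.0, 0.0, 10.0, 2.0, -2.0, 0.0]

print(exact_mad(small), exact_median(series), exact_mad(series))
```

On the 20-point series the exact MAD is 1.0, so the approximate result 0.99 obtained with `error`=0.01 lies inside the stated bound $0.99a \le b \le 1.01a$.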
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -Total line number = 20 -``` - -SQL for query: - -```sql -select median(s1, "error"="0.01") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|median(root.test.s1, "error"="0.01")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -+-----------------------------+------------------------------------+ -``` - -### MinMax - -#### Registration statement - -```sql -create function minmax as 'org.apache.iotdb.library.dprofile.UDTFMinMax' -``` - -#### Usage - -This function is used to normalize the input series with min-max scaling. The minimum value is transformed to 0 and the maximum value is transformed to 1. - -**Name:** MINMAX - -**Input Series:** Only supports a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `compute`: When set to "batch", the transformation is performed after all data points have been imported; when set to "stream", the minimum and maximum values must be provided.
The default method is "batch". -+ `min`: The minimum value when `compute` is set to "stream". -+ `max`: The maximum value when `compute` is set to "stream". - -**Output Series:** Output a single series. The type is DOUBLE. - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select minmax(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|minmax(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.200+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.300+08:00| 0.25| -|1970-01-01T08:00:00.400+08:00| 0.08333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.700+08:00| 0.0| -|1970-01-01T08:00:00.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.900+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.000+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.100+08:00| 0.25|
-|1970-01-01T08:00:01.200+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.300+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.25| -|1970-01-01T08:00:01.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.700+08:00| 1.0| -|1970-01-01T08:00:01.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.900+08:00| 0.0| -|1970-01-01T08:00:02.000+08:00| 0.16666666666666666| -+-----------------------------+--------------------+ -``` - - -### MvAvg - -#### Registration statement - -```sql -create function mvavg as 'org.apache.iotdb.library.dprofile.UDTFMvAvg' -``` - -#### Usage - -This function is used to calculate moving average of input series. - -**Name:** MVAVG - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `window`: Length of the moving window. Default value is 10. - -**Output Series:** Output a single series. The type is DOUBLE. - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select 
mvavg(s1, "window"="3") from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -| Time|mvavg(root.test.s1, "window"="3")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.300+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.400+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.700+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.100+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.200+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.300+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.0| -|1970-01-01T08:00:01.500+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.600+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.700+08:00| 3.0| -|1970-01-01T08:00:01.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.900+08:00| -0.6666666666666666| -|1970-01-01T08:00:02.000+08:00| -3.3333333333333335| -+-----------------------------+---------------------------------+ -``` - -### PACF - -#### Registration statement - -```sql -create function pacf as 'org.apache.iotdb.library.dprofile.UDTFPACF' -``` - -#### Usage - -This function is used to calculate partial autocorrelation of input series by solving Yule-Walker equation. For some cases, the equation may not be solved, and NaN will be output. - -**Name:** PACF - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `lag`: Maximum lag of pacf to calculate. The default value is $\min(10\log_{10}n,n-1)$, where $n$ is the number of data points. - -**Output Series:** Output a single series. The type is DOUBLE. 
- -#### Examples - -##### Assigning maximum lag - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select pacf(s1, "lag"="5") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------+ -| Time|pacf(root.test.d1.s1, "lag"="5")| -+-----------------------------+--------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| -0.5744680851063829| -|2020-01-01T00:00:03.000+08:00| 0.3172297297297296| -|2020-01-01T00:00:04.000+08:00| -0.2977686586304181| -|2020-01-01T00:00:05.000+08:00| -2.0609033521065867| -+-----------------------------+--------------------------------+ -``` - -### Percentile - -#### Registration statement - -```sql -create function percentile as 'org.apache.iotdb.library.dprofile.UDAFPercentile' -``` - -#### Usage - -The function is used to compute the exact or approximate percentile of a numeric time series. A percentile is the value of the element at a given rank in the sorted series. - -**Name:** PERCENTILE - -**Input Series:** Only supports a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `rank`: The rank percentage of the percentile. It should be within (0,1] and the default value is 0.5. For instance, a percentile with `rank`=0.5 is the median. -+ `error`: The rank error of the approximate percentile. It should be within [0,1) and the default value is 0. For instance, a 0.5-percentile with `error`=0.01 is the value of the element with rank percentage 0.49~0.51. With `error`=0, the output is the exact percentile. - -**Output Series:** Output a single series.
The type is the same as the input series. The series contains a single data point whose value is the percentile. If `error`=0, its timestamp is that of the first input point holding the percentile value; otherwise its timestamp is 0. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+-------------+ -| Time|root.test2.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+-------------+ -Total line number = 20 -``` - -SQL for query: - -```sql -select percentile(s1, "rank"="0.2", "error"="0.01") from root.test2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------------+ -| Time|percentile(root.test2.s1, "rank"="0.2", "error"="0.01")| -+-----------------------------+-------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| -1.0| -+-----------------------------+-------------------------------------------------------+ -``` - -### Quantile - -#### Registration statement - -```sql -create function quantile as 'org.apache.iotdb.library.dprofile.UDAFQuantile' -``` - -#### Usage - -The
function is used to compute the approximate quantile of a numeric time series. A quantile is the value of the element at a given rank in the sorted series. - -**Name:** QUANTILE - -**Input Series:** Only supports a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `rank`: The rank of the quantile. It should be within (0,1] and the default value is 0.5. For instance, a quantile with `rank`=0.5 is the median. -+ `K`: The size of the KLL sketch maintained in the query. It should be within [100,+inf) and the default value is 800. For instance, the 0.5-quantile computed by a KLL sketch with K=800 items is a value whose rank lies within 0.49~0.51 with a confidence of at least 99%. The result becomes more accurate as K increases. - -**Output Series:** Output a single series. The type is the same as the input series. The timestamp of the only data point is 0. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+-------------+ -| Time|root.test1.s1| -+-----------------------------+-------------+ -|2021-03-17T10:32:17.054+08:00| 7| -|2021-03-17T10:32:18.054+08:00| 15| -|2021-03-17T10:32:19.054+08:00| 36| -|2021-03-17T10:32:20.054+08:00| 39| -|2021-03-17T10:32:21.054+08:00| 40| -|2021-03-17T10:32:22.054+08:00| 41| -|2021-03-17T10:32:23.054+08:00| 20| -|2021-03-17T10:32:24.054+08:00| 18| -+-----------------------------+-------------+ -............
-Total line number = 8 -``` - -SQL for query: - -```sql -select quantile(s1, "rank"="0.2", "K"="800") from root.test1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------+ -| Time|quantile(root.test1.s1, "rank"="0.2", "K"="800")| -+-----------------------------+------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 7.000000000000001| -+-----------------------------+------------------------------------------------+ -``` - -### Period - -#### Registration statement - -```sql -create function period as 'org.apache.iotdb.library.dprofile.UDAFPeriod' -``` - -#### Usage - -The function is used to compute the period of a numeric time series. - -**Name:** PERIOD - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is INT32. There is only one data point in the series, whose timestamp is 0 and value is the period. 
- -#### Examples - -Input series: - - -``` -+-----------------------------+---------------+ -| Time|root.test.d3.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| -|1970-01-01T08:00:00.002+08:00| 2.0| -|1970-01-01T08:00:00.003+08:00| 3.0| -|1970-01-01T08:00:00.004+08:00| 1.0| -|1970-01-01T08:00:00.005+08:00| 2.0| -|1970-01-01T08:00:00.006+08:00| 3.0| -|1970-01-01T08:00:00.007+08:00| 1.0| -|1970-01-01T08:00:00.008+08:00| 2.0| -|1970-01-01T08:00:00.009+08:00| 3.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select period(s1) from root.test.d3 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time|period(root.test.d3.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 3| -+-----------------------------+-----------------------+ -``` - -### QLB - -#### Registration statement - -```sql -create function qlb as 'org.apache.iotdb.library.dprofile.UDTFQLB' -``` - -#### Usage - -This function is used to calculate the Ljung-Box statistic $Q_{LB}$ of a time series and convert it to a p-value. - -**Name:** QLB - -**Input Series:** Only supports a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `lag`: The maximum lag to calculate. It must be an integer from 1 to n-2, where n is the number of samples. The default value is n-2. - -**Output Series:** Output a single series. The type is DOUBLE. Each value of the output series is a p-value, and its timestamp denotes the lag. - -**Note:** If you want to calculate the Ljung-Box statistic $Q_{LB}$ itself instead of the p-value, you may use the ACF function.
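
For reference, the textbook definition of the Ljung-Box statistic is reproduced below. The symbols $n$ (number of samples), $h$ (lag) and $\hat{\rho}_k$ (sample autocorrelation at lag $k$) are the usual ones from the statistics literature; whether the library estimates $\hat{\rho}_k$ in exactly this way is an assumption.

$$Q_{LB}(h) = n(n+2)\sum_{k=1}^{h}\frac{\hat{\rho}_k^{2}}{n-k}$$

Under the null hypothesis that the series is uncorrelated, $Q_{LB}(h)$ approximately follows a chi-square distribution with $h$ degrees of freedom, so the p-value for lag $h$ is $P(\chi^2_h > Q_{LB}(h))$.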
- -#### Examples - -##### Using Default Parameter - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T00:00:00.100+08:00| 1.22| -|1970-01-01T00:00:00.200+08:00| -2.78| -|1970-01-01T00:00:00.300+08:00| 1.53| -|1970-01-01T00:00:00.400+08:00| 0.70| -|1970-01-01T00:00:00.500+08:00| 0.75| -|1970-01-01T00:00:00.600+08:00| -0.72| -|1970-01-01T00:00:00.700+08:00| -0.22| -|1970-01-01T00:00:00.800+08:00| 0.28| -|1970-01-01T00:00:00.900+08:00| 0.57| -|1970-01-01T00:00:01.000+08:00| -0.22| -|1970-01-01T00:00:01.100+08:00| -0.72| -|1970-01-01T00:00:01.200+08:00| 1.34| -|1970-01-01T00:00:01.300+08:00| -0.25| -|1970-01-01T00:00:01.400+08:00| 0.17| -|1970-01-01T00:00:01.500+08:00| 2.51| -|1970-01-01T00:00:01.600+08:00| 1.42| -|1970-01-01T00:00:01.700+08:00| -1.34| -|1970-01-01T00:00:01.800+08:00| -0.01| -|1970-01-01T00:00:01.900+08:00| -0.49| -|1970-01-01T00:00:02.000+08:00| 1.63| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select QLB(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+---------------------+ -| Time| QLB(root.test.d1.s1)| -+-----------------------------+---------------------+ -|1970-01-01T08:00:00.021+08:00| -0.31671| -|1970-01-01T08:00:00.001+08:00| 0.12748561639660716| -|1970-01-01T08:00:00.022+08:00| -0.17051499999999997| -|1970-01-01T08:00:00.002+08:00| 0.21941409592365868| -|1970-01-01T08:00:00.023+08:00| -0.11341499999999997| -|1970-01-01T08:00:00.003+08:00| 0.3384920824593398| -|1970-01-01T08:00:00.024+08:00| 0.26146| -|1970-01-01T08:00:00.004+08:00| 0.26293189359893154| -|1970-01-01T08:00:00.025+08:00| 0.06431999999999996| -|1970-01-01T08:00:00.005+08:00| 0.37265953802871943| -|1970-01-01T08:00:00.026+08:00| 0.036919999999999994| -|1970-01-01T08:00:00.006+08:00| 0.4923218142923832| -|1970-01-01T08:00:00.027+08:00|-0.009294999999999993| -|1970-01-01T08:00:00.007+08:00| 
0.609628728420623| -|1970-01-01T08:00:00.028+08:00| 0.12271499999999999| -|1970-01-01T08:00:00.008+08:00| 0.6510708392264906| -|1970-01-01T08:00:00.029+08:00| 0.008480000000000033| -|1970-01-01T08:00:00.009+08:00| 0.7430561964288097| -|1970-01-01T08:00:00.030+08:00| -0.21764500000000003| -|1970-01-01T08:00:00.010+08:00| 0.6236738200492055| -|1970-01-01T08:00:00.031+08:00| 0.35853999999999997| -|1970-01-01T08:00:00.011+08:00| 0.21487390993160937| -|1970-01-01T08:00:00.032+08:00| 0.18115499999999998| -|1970-01-01T08:00:00.012+08:00| 0.18479562182870324| -|1970-01-01T08:00:00.033+08:00| -0.27745499999999995| -|1970-01-01T08:00:00.013+08:00| 0.07329862193377235| -|1970-01-01T08:00:00.034+08:00| -0.22418500000000002| -|1970-01-01T08:00:00.014+08:00| 0.038000864459751926| -|1970-01-01T08:00:00.035+08:00| 0.31609000000000004| -|1970-01-01T08:00:00.015+08:00| 0.004052989734200874| -|1970-01-01T08:00:00.036+08:00| -0.06078500000000001| -|1970-01-01T08:00:00.016+08:00| 0.005663787468609627| -|1970-01-01T08:00:00.037+08:00| 0.19219499999999998| -|1970-01-01T08:00:00.017+08:00|0.0016316380755082571| -|1970-01-01T08:00:00.038+08:00| -0.25646| -|1970-01-01T08:00:00.018+08:00|2.0047954405910673E-5| -+-----------------------------+---------------------+ -``` - -### Resample - -#### Registration statement - -```sql -create function re_sample as 'org.apache.iotdb.library.dprofile.UDTFResample' -``` - -#### Usage - -This function is used to resample the input series according to a given frequency, -including up-sampling and down-sampling. -Currently, the supported up-sampling methods are -NaN (filling with `NaN`), -FFill (filling with previous value), -BFill (filling with next value) and -Linear (filling with linear interpolation). -Down-sampling relies on group aggregation, -which supports Max, Min, First, Last, Mean and Median. - -**Name:** RESAMPLE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. 
- -**Parameters:** - - -+ `every`: The frequency of resampling, which is a positive number with a unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. This parameter is required. -+ `interp`: The interpolation method of up-sampling, which is 'NaN', 'FFill', 'BFill' or 'Linear'. By default, NaN is used. -+ `aggr`: The aggregation method of down-sampling, which is 'Max', 'Min', 'First', 'Last', 'Mean' or 'Median'. By default, Mean is used. -+ `start`: The start time (inclusive) of resampling with the format 'yyyy-MM-dd HH:mm:ss'. By default, it is the timestamp of the first valid data point. -+ `end`: The end time (exclusive) of resampling with the format 'yyyy-MM-dd HH:mm:ss'. By default, it is the timestamp of the last valid data point. - -**Output Series:** Output a single series. The type is DOUBLE. It is strictly equispaced with the frequency `every`. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -##### Up-sampling - -When the frequency of resampling is higher than the original frequency, up-sampling starts.
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2021-03-06T16:00:00.000+08:00| 3.09| -|2021-03-06T16:15:00.000+08:00| 3.53| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:45:00.000+08:00| 3.51| -|2021-03-06T17:00:00.000+08:00| 3.41| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select resample(s1,'every'='5m','interp'='linear') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="5m", "interp"="linear")| -+-----------------------------+----------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:05:00.000+08:00| 3.2366665999094644| -|2021-03-06T16:10:00.000+08:00| 3.3833332856496177| -|2021-03-06T16:15:00.000+08:00| 3.5299999713897705| -|2021-03-06T16:20:00.000+08:00| 3.5199999809265137| -|2021-03-06T16:25:00.000+08:00| 3.509999990463257| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:35:00.000+08:00| 3.503333330154419| -|2021-03-06T16:40:00.000+08:00| 3.506666660308838| -|2021-03-06T16:45:00.000+08:00| 3.509999990463257| -|2021-03-06T16:50:00.000+08:00| 3.4766666889190674| -|2021-03-06T16:55:00.000+08:00| 3.443333387374878| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+----------------------------------------------------------+ -``` - -##### Down-sampling - -When the frequency of resampling is lower than the original frequency, down-sampling starts. 
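
Down-sampling by group aggregation can be sketched in Python as follows. This is an illustration under stated assumptions, not the library's implementation: the function name `downsample`, the use of minute offsets as timestamps, and the choice to align buckets to the first timestamp are all hypothetical. Empty buckets are simply omitted here instead of being filled.

```python
def downsample(points, every, aggr="mean", start=None):
    """Group time-sorted (timestamp, value) points into buckets of width
    `every` and aggregate each bucket; `start` defaults to the first timestamp."""
    if not points:
        return []
    origin = points[0][0] if start is None else start
    buckets = {}
    for t, v in points:
        if t < origin:
            continue
        buckets.setdefault((t - origin) // every, []).append(v)
    aggregators = {
        "first": lambda vs: vs[0],
        "last": lambda vs: vs[-1],
        "max": max,
        "min": min,
        "mean": lambda vs: sum(vs) / len(vs),
    }
    fn = aggregators[aggr]
    return [(origin + idx * every, fn(vs)) for idx, vs in sorted(buckets.items())]

# The example series, with timestamps as minutes after 16:00.
points = [(0, 3.09), (15, 3.53), (30, 3.5), (45, 3.51), (60, 3.41)]

first = downsample(points, every=30, aggr="first")
mean = downsample(points, every=30, aggr="mean")
print(first)
print(mean)
```

With 30-minute buckets, `first` keeps the values at 16:00, 16:30 and 17:00, while `mean` averages the two points that fall into each of the first two buckets.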
- -Input series is the same as above, the SQL for query is shown below: - -```sql -select resample(s1,'every'='30m','aggr'='first') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "aggr"="first")| -+-----------------------------+--------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+--------------------------------------------------------+ -``` - - - -##### Specify the time period - -The time period of resampling can be specified with `start` and `end`. -The period outside the actual time range will be interpolated. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select resample(s1,'every'='30m','start'='2021-03-06 15:00:00') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "start"="2021-03-06 15:00:00")| -+-----------------------------+-----------------------------------------------------------------------+ -|2021-03-06T15:00:00.000+08:00| NaN| -|2021-03-06T15:30:00.000+08:00| NaN| -|2021-03-06T16:00:00.000+08:00| 3.309999942779541| -|2021-03-06T16:30:00.000+08:00| 3.5049999952316284| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+-----------------------------------------------------------------------+ -``` - -### Sample - -#### Registration statement - -```sql -create function sample as 'org.apache.iotdb.library.dprofile.UDTFSample' -``` - -#### Usage - -This function is used to sample the input series, -that is, select a specified number of data points from the input series and output them. 
-Currently, three sampling methods are supported: -**Reservoir sampling** randomly selects data points. -All of the points have the same probability of being sampled. -**Isometric sampling** selects data points at equal index intervals. -**Triangle sampling** assigns data points to buckets based on the number of samples. -Then it calculates the area of the triangle based on these points inside the bucket and selects the point with the largest area of the triangle. -For more details, please read this [paper](http://skemman.is/stream/get/1946/15343/37285/3/SS_MSthesis.pdf). - -**Name:** SAMPLE - -**Input Series:** Only supports a single input series. The type is arbitrary. - -**Parameters:** - -+ `method`: The method of sampling, which is 'reservoir', 'isometric' or 'triangle'. By default, reservoir sampling is used. -+ `k`: The number of data points to sample, which is a positive integer. By default, it's 1. - -**Output Series:** Output a single series. The type is the same as the input. The length of the output series is `k`. Each data point in the output series comes from the input series. - -**Note:** If `k` is greater than the length of the input series, all data points in the input series will be output. - -#### Examples - -##### Reservoir Sampling - -When `method` is 'reservoir' or the default, reservoir sampling is used. -Due to the randomness of this method, the output series shown below is only a possible result.
- - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| 2.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:04.000+08:00| 4.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -|2020-01-01T00:00:10.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select sample(s1,'method'='reservoir','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="reservoir", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - -##### Isometric Sampling - -When `method` is 'isometric', isometric sampling is used. 
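
The two simpler methods can be sketched in Python as follows. Both helpers are hypothetical illustrations, not the library's code: `isometric_sample` assumes the index step is the series length divided by `k`, and `reservoir_sample` uses the classic single-pass reservoir scheme (Algorithm R) with the result re-sorted by time.

```python
import random

def isometric_sample(points, k):
    """Pick k points at equal index intervals (assumed step: len(points) / k)."""
    n = len(points)
    if k >= n:
        return list(points)
    step = n / k
    return [points[int(i * step)] for i in range(k)]

def reservoir_sample(points, k, rng=None):
    """Uniformly sample k points in one pass; every point has the same
    probability of ending up in the result."""
    rng = rng or random.Random()
    reservoir = []
    for i, p in enumerate(points):
        if i < k:
            reservoir.append(p)
        else:
            j = rng.randint(0, i)   # inclusive on both ends
            if j < k:
                reservoir[j] = p
    return sorted(reservoir)        # restore time order

# Ten points with timestamps 1..10 and values 1.0..10.0.
points = [(t, float(t)) for t in range(1, 11)]

iso = isometric_sample(points, 5)
res = reservoir_sample(points, 5, rng=random.Random(0))
print(iso)
print(res)
```

Isometric sampling is deterministic, whereas the reservoir result changes with the random seed.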
- -The input series is the same as above. The SQL for query is shown below: - -```sql -select sample(s1,'method'='isometric','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="isometric", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - -### Segment - -#### Registration statement - -```sql -create function segment as 'org.apache.iotdb.library.dprofile.UDTFSegment' -``` - -#### Usage - -This function is used to segment a time series into subsequences according to linear trend. It returns the linear fitted values of either the first point in each subsequence or every data point. - -**Name:** SEGMENT - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `output`: "all" to output all fitted points; "first" to output the first fitted point in each subsequence. - -+ `error`: The error allowed in the linear regression. It is defined as the mean absolute error of a subsequence. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note:** This function treats the input series as sampled at equal intervals. All data are loaded into memory, so downsample the input series first if there are too many data points.
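To make the role of `error` concrete, the following Python sketch segments a list of equally spaced values greedily: a segment grows while the mean absolute error of its least-squares line stays within `error`. This is only one possible segmentation strategy and is an assumption of this example, as are the function names; the SEGMENT function may split the series differently.

```python
def linear_fit(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; returns (slope, intercept)."""
    n = len(ys)
    if n == 1:
        return 0.0, float(ys[0])
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(ys):
    """Mean absolute error of the least-squares line over ys."""
    slope, intercept = linear_fit(ys)
    return sum(abs(y - (slope * x + intercept)) for x, y in enumerate(ys)) / len(ys)


def segment_first_points(values, error=0.1):
    """Return (start_index, first_fitted_value) for each greedy segment."""
    result = []
    start, n = 0, len(values)
    while start < n:
        end = start + 1  # exclusive end of the current segment
        # Extend the segment while the fit including the next point is good enough.
        while end < n and mean_abs_error(values[start:end + 1]) <= error:
            end += 1
        _, intercept = linear_fit(values[start:end])
        result.append((start, intercept))  # fitted value at the segment's first point
        start = end
    return result


print(segment_first_points([0, 1, 2, 3, 4, 10, 10, 10, 10], error=0.1))
```

In this sketch the rising ramp and the flat run become two separate segments, because no single line fits both within the allowed error.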
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:00.300+08:00| 2.0| -|1970-01-01T08:00:00.400+08:00| 3.0| -|1970-01-01T08:00:00.500+08:00| 4.0| -|1970-01-01T08:00:00.600+08:00| 5.0| -|1970-01-01T08:00:00.700+08:00| 6.0| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 8.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:01.100+08:00| 9.1| -|1970-01-01T08:00:01.200+08:00| 9.2| -|1970-01-01T08:00:01.300+08:00| 9.3| -|1970-01-01T08:00:01.400+08:00| 9.4| -|1970-01-01T08:00:01.500+08:00| 9.5| -|1970-01-01T08:00:01.600+08:00| 9.6| -|1970-01-01T08:00:01.700+08:00| 9.7| -|1970-01-01T08:00:01.800+08:00| 9.8| -|1970-01-01T08:00:01.900+08:00| 9.9| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:02.100+08:00| 8.0| -|1970-01-01T08:00:02.200+08:00| 6.0| -|1970-01-01T08:00:02.300+08:00| 4.0| -|1970-01-01T08:00:02.400+08:00| 2.0| -|1970-01-01T08:00:02.500+08:00| 0.0| -|1970-01-01T08:00:02.600+08:00| -2.0| -|1970-01-01T08:00:02.700+08:00| -4.0| -|1970-01-01T08:00:02.800+08:00| -6.0| -|1970-01-01T08:00:02.900+08:00| -8.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.100+08:00| 10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -|1970-01-01T08:00:03.300+08:00| 10.0| -|1970-01-01T08:00:03.400+08:00| 10.0| -|1970-01-01T08:00:03.500+08:00| 10.0| -|1970-01-01T08:00:03.600+08:00| 10.0| -|1970-01-01T08:00:03.700+08:00| 10.0| -|1970-01-01T08:00:03.800+08:00| 10.0| -|1970-01-01T08:00:03.900+08:00| 10.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select segment(s1, "error"="0.1") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|segment(root.test.s1, "error"="0.1")| 
-+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -+-----------------------------+------------------------------------+ -``` - -### Skew - -#### Registration statement - -```sql -create function skew as 'org.apache.iotdb.library.dprofile.UDAFSkew' -``` - -#### Usage - -This function is used to calculate the population skewness. - -**Name:** SKEW - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the population skewness. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 10.0| -|2020-01-01T00:00:11.000+08:00| 10.0| -|2020-01-01T00:00:12.000+08:00| 10.0| -|2020-01-01T00:00:13.000+08:00| 10.0| -|2020-01-01T00:00:14.000+08:00| 10.0| -|2020-01-01T00:00:15.000+08:00| 10.0| -|2020-01-01T00:00:16.000+08:00| 10.0| -|2020-01-01T00:00:17.000+08:00| 10.0| -|2020-01-01T00:00:18.000+08:00| 10.0| -|2020-01-01T00:00:19.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select skew(s1) from 
root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time| skew(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| -0.9998427402292644| -+-----------------------------+-----------------------+ -``` - -### Spline - -#### Registration statement - -```sql -create function spline as 'org.apache.iotdb.library.dprofile.UDTFSpline' -``` - -#### Usage - -This function is used to calculate cubic spline interpolation of input series. - -**Name:** SPLINE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `points`: Number of resampling points. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note**: The output series retains the first and last timestamps of the input series. Interpolation points are selected at equal intervals. The function performs the calculation only when the input series has at least 4 points. - -#### Examples - -##### Assigning number of interpolation points - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select spline(s1, "points"="151") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|spline(root.test.s1, "points"="151")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.010+08:00|
0.04870000251134237| -|1970-01-01T08:00:00.020+08:00| 0.09680000495910646| -|1970-01-01T08:00:00.030+08:00| 0.14430000734329226| -|1970-01-01T08:00:00.040+08:00| 0.19120000966389972| -|1970-01-01T08:00:00.050+08:00| 0.23750001192092896| -|1970-01-01T08:00:00.060+08:00| 0.2832000141143799| -|1970-01-01T08:00:00.070+08:00| 0.32830001624425253| -|1970-01-01T08:00:00.080+08:00| 0.3728000183105469| -|1970-01-01T08:00:00.090+08:00| 0.416700020313263| -|1970-01-01T08:00:00.100+08:00| 0.4600000222524008| -|1970-01-01T08:00:00.110+08:00| 0.5027000241279602| -|1970-01-01T08:00:00.120+08:00| 0.5448000259399414| -|1970-01-01T08:00:00.130+08:00| 0.5863000276883443| -|1970-01-01T08:00:00.140+08:00| 0.627200029373169| -|1970-01-01T08:00:00.150+08:00| 0.6675000309944153| -|1970-01-01T08:00:00.160+08:00| 0.7072000325520833| -|1970-01-01T08:00:00.170+08:00| 0.7463000340461731| -|1970-01-01T08:00:00.180+08:00| 0.7848000354766846| -|1970-01-01T08:00:00.190+08:00| 0.8227000368436178| -|1970-01-01T08:00:00.200+08:00| 0.8600000381469728| -|1970-01-01T08:00:00.210+08:00| 0.8967000393867494| -|1970-01-01T08:00:00.220+08:00| 0.9328000405629477| -|1970-01-01T08:00:00.230+08:00| 0.9683000416755676| -|1970-01-01T08:00:00.240+08:00| 1.0032000427246095| -|1970-01-01T08:00:00.250+08:00| 1.037500043710073| -|1970-01-01T08:00:00.260+08:00| 1.071200044631958| -|1970-01-01T08:00:00.270+08:00| 1.1043000454902647| -|1970-01-01T08:00:00.280+08:00| 1.1368000462849934| -|1970-01-01T08:00:00.290+08:00| 1.1687000470161437| -|1970-01-01T08:00:00.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:00.310+08:00| 1.2307000483103594| -|1970-01-01T08:00:00.320+08:00| 1.2608000489139557| -|1970-01-01T08:00:00.330+08:00| 1.2903000494873524| -|1970-01-01T08:00:00.340+08:00| 1.3192000500233967| -|1970-01-01T08:00:00.350+08:00| 1.3475000505149364| -|1970-01-01T08:00:00.360+08:00| 1.3752000509548186| -|1970-01-01T08:00:00.370+08:00| 1.402300051335891| -|1970-01-01T08:00:00.380+08:00| 1.4288000516510009| 
-|1970-01-01T08:00:00.390+08:00| 1.4547000518929958| -|1970-01-01T08:00:00.400+08:00| 1.480000052054723| -|1970-01-01T08:00:00.410+08:00| 1.5047000521290301| -|1970-01-01T08:00:00.420+08:00| 1.5288000521087646| -|1970-01-01T08:00:00.430+08:00| 1.5523000519867738| -|1970-01-01T08:00:00.440+08:00| 1.575200051755905| -|1970-01-01T08:00:00.450+08:00| 1.597500051409006| -|1970-01-01T08:00:00.460+08:00| 1.619200050938924| -|1970-01-01T08:00:00.470+08:00| 1.6403000503385066| -|1970-01-01T08:00:00.480+08:00| 1.660800049600601| -|1970-01-01T08:00:00.490+08:00| 1.680700048718055| -|1970-01-01T08:00:00.500+08:00| 1.7000000476837158| -|1970-01-01T08:00:00.510+08:00| 1.7188475466453037| -|1970-01-01T08:00:00.520+08:00| 1.7373800457262996| -|1970-01-01T08:00:00.530+08:00| 1.7555825448831923| -|1970-01-01T08:00:00.540+08:00| 1.7734400440724702| -|1970-01-01T08:00:00.550+08:00| 1.790937543250622| -|1970-01-01T08:00:00.560+08:00| 1.8080600423741364| -|1970-01-01T08:00:00.570+08:00| 1.8247925413995016| -|1970-01-01T08:00:00.580+08:00| 1.8411200402832066| -|1970-01-01T08:00:00.590+08:00| 1.8570275389817397| -|1970-01-01T08:00:00.600+08:00| 1.8725000374515897| -|1970-01-01T08:00:00.610+08:00| 1.8875225356492449| -|1970-01-01T08:00:00.620+08:00| 1.902080033531194| -|1970-01-01T08:00:00.630+08:00| 1.9161575310539258| -|1970-01-01T08:00:00.640+08:00| 1.9297400281739288| -|1970-01-01T08:00:00.650+08:00| 1.9428125248476913| -|1970-01-01T08:00:00.660+08:00| 1.9553600210317021| -|1970-01-01T08:00:00.670+08:00| 1.96736751668245| -|1970-01-01T08:00:00.680+08:00| 1.9788200117564232| -|1970-01-01T08:00:00.690+08:00| 1.9897025062101101| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.710+08:00| 2.0097024933913334| -|1970-01-01T08:00:00.720+08:00| 2.0188199867081615| -|1970-01-01T08:00:00.730+08:00| 2.027367479995188| -|1970-01-01T08:00:00.740+08:00| 2.0353599732971155| -|1970-01-01T08:00:00.750+08:00| 2.0428124666586482| -|1970-01-01T08:00:00.760+08:00| 2.049739960124489| 
-|1970-01-01T08:00:00.770+08:00| 2.056157453739342| -|1970-01-01T08:00:00.780+08:00| 2.06207994754791| -|1970-01-01T08:00:00.790+08:00| 2.067522441594897| -|1970-01-01T08:00:00.800+08:00| 2.072499935925006| -|1970-01-01T08:00:00.810+08:00| 2.07702743058294| -|1970-01-01T08:00:00.820+08:00| 2.081119925613404| -|1970-01-01T08:00:00.830+08:00| 2.0847924210611| -|1970-01-01T08:00:00.840+08:00| 2.0880599169707317| -|1970-01-01T08:00:00.850+08:00| 2.0909374133870027| -|1970-01-01T08:00:00.860+08:00| 2.0934399103546166| -|1970-01-01T08:00:00.870+08:00| 2.0955824079182768| -|1970-01-01T08:00:00.880+08:00| 2.0973799061226863| -|1970-01-01T08:00:00.890+08:00| 2.098847405012549| -|1970-01-01T08:00:00.900+08:00| 2.0999999046325684| -|1970-01-01T08:00:00.910+08:00| 2.1005574051201332| -|1970-01-01T08:00:00.920+08:00| 2.1002599065303778| -|1970-01-01T08:00:00.930+08:00| 2.0991524087846245| -|1970-01-01T08:00:00.940+08:00| 2.0972799118041947| -|1970-01-01T08:00:00.950+08:00| 2.0946874155104105| -|1970-01-01T08:00:00.960+08:00| 2.0914199198245944| -|1970-01-01T08:00:00.970+08:00| 2.0875224246680673| -|1970-01-01T08:00:00.980+08:00| 2.083039929962151| -|1970-01-01T08:00:00.990+08:00| 2.0780174356281687| -|1970-01-01T08:00:01.000+08:00| 2.0724999415874406| -|1970-01-01T08:00:01.010+08:00| 2.06653244776129| -|1970-01-01T08:00:01.020+08:00| 2.060159954071038| -|1970-01-01T08:00:01.030+08:00| 2.053427460438006| -|1970-01-01T08:00:01.040+08:00| 2.046379966783517| -|1970-01-01T08:00:01.050+08:00| 2.0390624730288924| -|1970-01-01T08:00:01.060+08:00| 2.031519979095454| -|1970-01-01T08:00:01.070+08:00| 2.0237974849045237| -|1970-01-01T08:00:01.080+08:00| 2.015939990377423| -|1970-01-01T08:00:01.090+08:00| 2.0079924954354746| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.110+08:00| 1.9907018211101906| -|1970-01-01T08:00:01.120+08:00| 1.9788509124245144| -|1970-01-01T08:00:01.130+08:00| 1.9645127287932083| -|1970-01-01T08:00:01.140+08:00| 1.9477527250665083| 
-|1970-01-01T08:00:01.150+08:00| 1.9286363560946513| -|1970-01-01T08:00:01.160+08:00| 1.9072290767278735| -|1970-01-01T08:00:01.170+08:00| 1.8835963418164114| -|1970-01-01T08:00:01.180+08:00| 1.8578036062105014| -|1970-01-01T08:00:01.190+08:00| 1.8299163247603802| -|1970-01-01T08:00:01.200+08:00| 1.7999999523162842| -|1970-01-01T08:00:01.210+08:00| 1.7623635841923329| -|1970-01-01T08:00:01.220+08:00| 1.7129696477516976| -|1970-01-01T08:00:01.230+08:00| 1.6543635959181928| -|1970-01-01T08:00:01.240+08:00| 1.5890908816156328| -|1970-01-01T08:00:01.250+08:00| 1.5196969577678319| -|1970-01-01T08:00:01.260+08:00| 1.4487272772986044| -|1970-01-01T08:00:01.270+08:00| 1.3787272931317647| -|1970-01-01T08:00:01.280+08:00| 1.3122424581911272| -|1970-01-01T08:00:01.290+08:00| 1.251818225400506| -|1970-01-01T08:00:01.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:01.310+08:00| 1.1548000470995912| -|1970-01-01T08:00:01.320+08:00| 1.1130667107899999| -|1970-01-01T08:00:01.330+08:00| 1.0756000393033045| -|1970-01-01T08:00:01.340+08:00| 1.043200033187868| -|1970-01-01T08:00:01.350+08:00| 1.016666692992053| -|1970-01-01T08:00:01.360+08:00| 0.9968000192642223| -|1970-01-01T08:00:01.370+08:00| 0.9844000125527389| -|1970-01-01T08:00:01.380+08:00| 0.9802666734059655| -|1970-01-01T08:00:01.390+08:00| 0.9852000023722649| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.410+08:00| 1.023999999165535| -|1970-01-01T08:00:01.420+08:00| 1.0559999990463256| -|1970-01-01T08:00:01.430+08:00| 1.0959999996423722| -|1970-01-01T08:00:01.440+08:00| 1.1440000009536744| -|1970-01-01T08:00:01.450+08:00| 1.2000000029802322| -|1970-01-01T08:00:01.460+08:00| 1.264000005722046| -|1970-01-01T08:00:01.470+08:00| 1.3360000091791153| -|1970-01-01T08:00:01.480+08:00| 1.4160000133514405| -|1970-01-01T08:00:01.490+08:00| 1.5040000182390214| -|1970-01-01T08:00:01.500+08:00| 1.600000023841858| -+-----------------------------+------------------------------------+ -``` - -### Spread - -#### Registration 
statement - -```sql -create function spread as 'org.apache.iotdb.library.dprofile.UDAFSpread' -``` - -#### Usage - -This function is used to calculate the spread of time series, that is, the maximum value minus the minimum value. - -**Name:** SPREAD - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is the same as the input. There is only one data point in the series, whose timestamp is 0 and value is the spread. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select spread(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time|spread(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 26.0| -+-----------------------------+-----------------------+ -``` - - - -### ZScore - -#### Registration statement - -```sql -create function zscore as 'org.apache.iotdb.library.dprofile.UDTFZScore' -``` - -#### Usage - -This function is used to standardize the 
input series with z-score. - -**Name:** ZSCORE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `compute`: When set to "batch", the z-score is computed after all data points have been imported; when set to "stream", the mean and standard deviation must be provided. The default is "batch". -+ `avg`: Mean value when `compute` is set to "stream". -+ `sd`: Standard deviation when `compute` is set to "stream". - -**Output Series:** Output a single series. The type is DOUBLE. - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select zscore(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|zscore(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.200+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.300+08:00| 0.20672455764868078| -|1970-01-01T08:00:00.400+08:00| -0.6201736729460423|
-|1970-01-01T08:00:00.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.700+08:00| -1.033622788243404| -|1970-01-01T08:00:00.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:00.900+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.000+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.100+08:00| 0.20672455764868078| -|1970-01-01T08:00:01.200+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.300+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.400+08:00| 0.20672455764868078| -|1970-01-01T08:00:01.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.700+08:00| 3.9277665953249348| -|1970-01-01T08:00:01.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:01.900+08:00| -1.033622788243404| -|1970-01-01T08:00:02.000+08:00|-0.20672455764868078| -+-----------------------------+--------------------+ -``` - - -## Anomaly Detection - -### IQR - -#### Registration statement - -```sql -create function iqr as 'org.apache.iotdb.library.anomaly.UDTFIQR' -``` - -#### Usage - -This function is used to detect anomalies based on the interquartile range (IQR). Points lying more than 1.5 times the IQR below the lower quartile or above the upper quartile are selected. - -**Name:** IQR - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: When set to "batch", the anomaly test is conducted after all data points have been imported; when set to "stream", the upper and lower quantiles must be provided. The default method is "batch". -+ `q1`: The lower quantile when `method` is set to "stream". -+ `q3`: The upper quantile when `method` is set to "stream". - -**Output Series:** Output a single series. The type is DOUBLE.
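The batch rule can be sketched in Python as follows, with the IQR taken as the upper quartile minus the lower quartile. This is a minimal sketch, not the function's implementation: the quartile interpolation (`method="inclusive"` in `statistics.quantiles`) and the name `iqr_anomalies` are assumptions, and a different quartile convention can shift the bounds slightly.

```python
import statistics


def iqr_anomalies(points):
    """Return the (time, value) points lying beyond 1.5 * IQR from the quartiles."""
    values = [v for _, v in points]
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [(t, v) for t, v in points if v < low or v > high]


values = [0, 0, 1, -1, 0, 0, -2, 2, 0, 0, 1, -1, -1, 1, 0, 0, 10, 2, -2, 0]
points = [(100 * (i + 1), float(v)) for i, v in enumerate(values)]
print(iqr_anomalies(points))
```

With these twenty values the quartiles are -0.25 and 1.0, so the bounds are -2.125 and 2.875 and only the value 10.0 at offset 1700 ms falls outside them.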
- -**Note:** $IQR=Q_3-Q_1$ - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select iqr(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+-----------------+ -| Time|iqr(root.test.s1)| -+-----------------------------+-----------------+ -|1970-01-01T08:00:01.700+08:00| 10.0| -+-----------------------------+-----------------+ -``` - -### KSigma - -#### Registration statement - -```sql -create function ksigma as 'org.apache.iotdb.library.anomaly.UDTFKSigma' -``` - -#### Usage - -This function is used to detect anomalies based on the Dynamic K-Sigma Algorithm. -Within a sliding window, any input value that deviates from the average by more than k times the standard deviation is output as an anomaly. - -**Name:** KSIGMA - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `k`: The multiple of the standard deviation beyond which a value is treated as an anomaly. The default value is 3.
-+ `window`: The window size of the Dynamic K-Sigma Algorithm. The default value is 10000. - -**Output Series:** Output a single series. The type is the same as the input series. - -**Note:** Anomaly detection is performed only when `k` is larger than 0. Otherwise, nothing will be output. - -#### Examples - -##### Assigning k - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:04.000+08:00| 100.0| -|2020-01-01T00:00:06.000+08:00| 150.0| -|2020-01-01T00:00:08.000+08:00| 200.0| -|2020-01-01T00:00:10.000+08:00| 200.0| -|2020-01-01T00:00:14.000+08:00| 200.0| -|2020-01-01T00:00:15.000+08:00| 200.0| -|2020-01-01T00:00:16.000+08:00| 200.0| -|2020-01-01T00:00:18.000+08:00| 200.0| -|2020-01-01T00:00:20.000+08:00| 150.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ksigma(s1,"k"="1.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -|Time |ksigma(root.test.d1.s1,"k"="1.0")| -+-----------------------------+---------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -+-----------------------------+---------------------------------+ -``` - -### LOF - -#### Registration statement - -```sql -create function LOF as 'org.apache.iotdb.library.anomaly.UDTFLOF' -``` - -#### Usage - -This function is used to detect density anomaly of time series.
Using the k-th distance parameter, the function computes the local outlier factor (LOF) of each set of input values and outputs that LOF value, which can then be compared against a threshold to judge whether the set of values is a density anomaly. - -**Name:** LOF - -**Input Series:** Multiple input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The detection method. The default value is "default", which is used when the input data has multiple dimensions. The alternative is "series", in which a single input series is transformed into a high-dimensional one. -+ `k`: Use the k-th distance to calculate the LOF. The default value is 3. -+ `window`: The size of the window used to split the original data points. The default value is 10000. -+ `windowsize`: The dimension that the series is transformed into when `method` is "series". The default value is 5. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note:** Incomplete rows will be ignored. They are neither calculated nor marked as anomalies. - -#### Examples - -##### Using default parameters - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| 1.0| -|1970-01-01T08:00:00.300+08:00| 1.0| 1.0| -|1970-01-01T08:00:00.400+08:00| 1.0| 0.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -1.0| -|1970-01-01T08:00:00.600+08:00| -1.0| -1.0| -|1970-01-01T08:00:00.700+08:00| -1.0| 0.0| -|1970-01-01T08:00:00.800+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1,s2) from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|lof(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.100+08:00| 3.8274824267668244|
-|1970-01-01T08:00:00.200+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.300+08:00| 2.838155437762879| -|1970-01-01T08:00:00.400+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.500+08:00| 2.73518261244453| -|1970-01-01T08:00:00.600+08:00| 2.371440975708148| -|1970-01-01T08:00:00.700+08:00| 2.73518261244453| -|1970-01-01T08:00:00.800+08:00| 1.7561416374270742| -+-----------------------------+-------------------------------------+ -``` - -##### Diagnosing 1d timeseries - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 1.0| -|1970-01-01T08:00:00.200+08:00| 2.0| -|1970-01-01T08:00:00.300+08:00| 3.0| -|1970-01-01T08:00:00.400+08:00| 4.0| -|1970-01-01T08:00:00.500+08:00| 5.0| -|1970-01-01T08:00:00.600+08:00| 6.0| -|1970-01-01T08:00:00.700+08:00| 7.0| -|1970-01-01T08:00:00.800+08:00| 8.0| -|1970-01-01T08:00:00.900+08:00| 9.0| -|1970-01-01T08:00:01.000+08:00| 10.0| -|1970-01-01T08:00:01.100+08:00| 11.0| -|1970-01-01T08:00:01.200+08:00| 12.0| -|1970-01-01T08:00:01.300+08:00| 13.0| -|1970-01-01T08:00:01.400+08:00| 14.0| -|1970-01-01T08:00:01.500+08:00| 15.0| -|1970-01-01T08:00:01.600+08:00| 16.0| -|1970-01-01T08:00:01.700+08:00| 17.0| -|1970-01-01T08:00:01.800+08:00| 18.0| -|1970-01-01T08:00:01.900+08:00| 19.0| -|1970-01-01T08:00:02.000+08:00| 20.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1, "method"="series") from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|lof(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 3.77777777777778| -|1970-01-01T08:00:00.200+08:00| 4.32727272727273| -|1970-01-01T08:00:00.300+08:00| 4.85714285714286| -|1970-01-01T08:00:00.400+08:00| 5.40909090909091| -|1970-01-01T08:00:00.500+08:00| 5.94999999999999| -|1970-01-01T08:00:00.600+08:00| 
6.43243243243243| -|1970-01-01T08:00:00.700+08:00| 6.79999999999999| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 7.0| -|1970-01-01T08:00:01.000+08:00| 6.79999999999999| -|1970-01-01T08:00:01.100+08:00| 6.43243243243243| -|1970-01-01T08:00:01.200+08:00| 5.94999999999999| -|1970-01-01T08:00:01.300+08:00| 5.40909090909091| -|1970-01-01T08:00:01.400+08:00| 4.85714285714286| -|1970-01-01T08:00:01.500+08:00| 4.32727272727273| -|1970-01-01T08:00:01.600+08:00| 3.77777777777778| -+-----------------------------+--------------------+ -``` - -### MissDetect - -#### Registration statement - -```sql -create function missdetect as 'org.apache.iotdb.library.anomaly.UDTFMissDetect' -``` - -#### Usage - -This function is used to detect missing anomalies. -In some datasets, missing values are filled by linear interpolation. -As a result, the data contains several long, perfectly linear segments. -By discovering these perfectly linear segments, -missing anomalies are detected. - -**Name:** MISSDETECT - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -`minlen`: The minimum length of the detected missing anomalies, which is an integer greater than or equal to 10. By default, it is 10. - -**Output Series:** Output a single series. The type is BOOLEAN. Each data point that is a missing anomaly is labeled true.
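The idea of finding perfectly linear segments can be sketched in Python as follows. This is a minimal sketch under two assumptions: the points are equally spaced, and "perfectly linear" means that consecutive differences are exactly equal. The name `miss_detect` is hypothetical and the function's own linearity test may differ.

```python
def miss_detect(values, minlen=10):
    """Flag points that belong to a perfectly linear run of at least minlen points."""
    n = len(values)
    flags = [False] * n
    if n < 2:
        return flags
    start = 0  # index of the first point in the current linear run
    for i in range(2, n + 1):
        # The run continues while the step between neighbours stays constant.
        if i < n and values[i] - values[i - 1] == values[i - 1] - values[i - 2]:
            continue
        # The run covers points start .. i - 1.
        if i - start >= minlen:
            for j in range(start, i):
                flags[j] = True
        start = i - 1  # the last point of this run starts the next one
    return flags


values = [0, 1, 0, 1] + [0] * 12 + [1, 0, 1, 0, 1]
print(miss_detect(values, minlen=10))
```

For this input the twelve consecutive zeros form the only run of ten or more points with a constant step, so only those twelve points are flagged.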
- -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 0.0| -|2021-07-01T12:00:01.000+08:00| 1.0| -|2021-07-01T12:00:02.000+08:00| 0.0| -|2021-07-01T12:00:03.000+08:00| 1.0| -|2021-07-01T12:00:04.000+08:00| 0.0| -|2021-07-01T12:00:05.000+08:00| 0.0| -|2021-07-01T12:00:06.000+08:00| 0.0| -|2021-07-01T12:00:07.000+08:00| 0.0| -|2021-07-01T12:00:08.000+08:00| 0.0| -|2021-07-01T12:00:09.000+08:00| 0.0| -|2021-07-01T12:00:10.000+08:00| 0.0| -|2021-07-01T12:00:11.000+08:00| 0.0| -|2021-07-01T12:00:12.000+08:00| 0.0| -|2021-07-01T12:00:13.000+08:00| 0.0| -|2021-07-01T12:00:14.000+08:00| 0.0| -|2021-07-01T12:00:15.000+08:00| 0.0| -|2021-07-01T12:00:16.000+08:00| 1.0| -|2021-07-01T12:00:17.000+08:00| 0.0| -|2021-07-01T12:00:18.000+08:00| 1.0| -|2021-07-01T12:00:19.000+08:00| 0.0| -|2021-07-01T12:00:20.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select missdetect(s2,'minlen'='10') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------+ -| Time|missdetect(root.test.d2.s2, "minlen"="10")| -+-----------------------------+------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| false| -|2021-07-01T12:00:01.000+08:00| false| -|2021-07-01T12:00:02.000+08:00| false| -|2021-07-01T12:00:03.000+08:00| false| -|2021-07-01T12:00:04.000+08:00| true| -|2021-07-01T12:00:05.000+08:00| true| -|2021-07-01T12:00:06.000+08:00| true| -|2021-07-01T12:00:07.000+08:00| true| -|2021-07-01T12:00:08.000+08:00| true| -|2021-07-01T12:00:09.000+08:00| true| -|2021-07-01T12:00:10.000+08:00| true| -|2021-07-01T12:00:11.000+08:00| true| -|2021-07-01T12:00:12.000+08:00| true| -|2021-07-01T12:00:13.000+08:00| true| -|2021-07-01T12:00:14.000+08:00| true| -|2021-07-01T12:00:15.000+08:00| true| -|2021-07-01T12:00:16.000+08:00| 
false| -|2021-07-01T12:00:17.000+08:00| false| -|2021-07-01T12:00:18.000+08:00| false| -|2021-07-01T12:00:19.000+08:00| false| -|2021-07-01T12:00:20.000+08:00| false| -+-----------------------------+------------------------------------------+ -``` - -### Range - -#### Registration statement - -```sql -create function range as 'org.apache.iotdb.library.anomaly.UDTFRange' -``` - -#### Usage - -This function is used to detect range anomalies in a time series. Given the lower bound and upper bound parameters, the function judges whether an input value falls outside the range (a range anomaly) and outputs a new time series containing the anomalies. - -**Name:** RANGE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `lower_bound`: lower bound of range anomaly detection. -+ `upper_bound`: upper bound of range anomaly detection. - -**Output Series:** Output a single series. The type is the same as the input. - -**Note:** Anomaly detection is performed only when `upper_bound` is larger than `lower_bound`. Otherwise, nothing will be output. 
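The range check described above can be sketched as follows. This is an illustrative sketch, not the library's code: the function name `range_anomalies` and the `(time, value)` tuple layout are assumptions, and the bounds are treated as inclusive for normal values, which matches the example in this section (101.0 is not reported when `lower_bound` is 101.0).

```python
def range_anomalies(points, lower_bound, upper_bound):
    """Return the (time, value) pairs whose value lies outside [lower_bound, upper_bound].

    Sketch only. NaN never compares as smaller or larger, so NaN values
    are not reported.
    """
    if upper_bound <= lower_bound:
        return []  # invalid bounds: nothing is output
    return [(t, v) for t, v in points if v < lower_bound or v > upper_bound]


# (second offset, value) pairs from the example in this section.
d1_s1 = [
    (2, 100.0), (3, 101.0), (4, 102.0), (6, 104.0), (8, 126.0),
    (10, 108.0), (14, 112.0), (15, 113.0), (16, 114.0), (18, 116.0),
    (20, 118.0), (22, 120.0), (26, 124.0), (28, 126.0), (30, float("nan")),
]
anomalies = range_anomalies(d1_s1, lower_bound=101.0, upper_bound=125.0)
```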
- - - -#### Examples - -##### Assigning Lower and Upper Bound - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select range(s1,"lower_bound"="101.0","upper_bound"="125.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------------------+ -|Time |range(root.test.d1.s1,"lower_bound"="101.0","upper_bound"="125.0")| -+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -+-----------------------------+------------------------------------------------------------------+ -``` - -### TwoSidedFilter - -#### Registration statement - -```sql -create function twosidedfilter as 'org.apache.iotdb.library.anomaly.UDTFTwoSidedFilter' -``` - -#### Usage - -The function is used to filter anomalies of a numeric time series based on two-sided window detection. - -**Name:** TWOSIDEDFILTER - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE - -**Output Series:** Output a single series. 
The type is the same as the input. It is the input without anomalies. - -**Parameter:** - -- `len`: The size of the window, which is a positive integer. By default, it's 5. When `len`=3, the algorithm detects forward window and backward window with length 3 and calculates the outlierness of the current point. - -- `threshold`: The threshold of outlierness, which is a floating number in (0,1). By default, it's 0.3. The strict standard of detecting anomalies is in proportion to the threshold. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:00:37.000+08:00| 1484.0| -|1970-01-01T08:00:38.000+08:00| 1055.0| -|1970-01-01T08:00:39.000+08:00| 1050.0| -|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select TwoSidedFilter(s0, 'len'='5', 'threshold'='0.3') from root.test -``` - -Output series: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| 
-+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -### Outlier - -#### Registration statement - -```sql -create function outlier as 'org.apache.iotdb.library.anomaly.UDTFOutlier' -``` - -#### Usage - -This function is used to detect distance-based outliers. For each point in the current window, if the number of its neighbors within the neighbor distance threshold is less than the neighbor count threshold, the point is detected as an outlier. - -**Name:** OUTLIER - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `r`: the neighbor distance threshold. -+ `k`: the neighbor count threshold. -+ `w`: the window size. -+ `s`: the slide size. - -**Output Series:** Output a single series. The type is the same as the input. 
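The distance-based rule can be sketched in Python as below. This is a naive illustration under stated assumptions, not the library's algorithm: windows are count-based and only full windows are scanned, a point does not count as its own neighbor, and a neighbor is any other point in the window whose value differs by at most `r`. The function name `distance_outliers` is hypothetical.

```python
def distance_outliers(points, r, k, w, s):
    """Return (time, value) pairs that have fewer than k neighbors within r.

    Sketch only: slides a window of w points forward by s points at a time
    and checks every point of the window against all others (O(w^2)).
    """
    outliers = {}
    n = len(points)
    for start in range(0, max(n - w, 0) + 1, s):
        window = points[start:start + w]
        for i, (t, v) in enumerate(window):
            neighbors = sum(
                1 for j, (_, u) in enumerate(window)
                if j != i and abs(u - v) <= r
            )
            if neighbors < k:
                outliers[t] = v  # a dict avoids duplicates from overlapping windows
    return sorted(outliers.items())


# Values of root.test.s1 from the example in this section, indexed 0..19.
s1 = [56.0, 55.1, 54.2, 56.3, 59.0, 60.0, 60.5, 64.5, 69.0, 64.2,
      62.3, 58.0, 58.9, 52.0, 62.3, 61.0, 64.2, 61.8, 64.0, 63.0]
found = distance_outliers(list(enumerate(s1)), r=5.0, k=4, w=10, s=5)
```

With `r`=5.0, `k`=4, `w`=10 and `s`=5, the sketch flags the values 69.0 and 52.0, the same two points reported in the example of this section.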
- -#### Examples - -##### Assigning Parameters of Queries - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|2020-01-04T23:59:55.000+08:00| 56.0| -|2020-01-04T23:59:56.000+08:00| 55.1| -|2020-01-04T23:59:57.000+08:00| 54.2| -|2020-01-04T23:59:58.000+08:00| 56.3| -|2020-01-04T23:59:59.000+08:00| 59.0| -|2020-01-05T00:00:00.000+08:00| 60.0| -|2020-01-05T00:00:01.000+08:00| 60.5| -|2020-01-05T00:00:02.000+08:00| 64.5| -|2020-01-05T00:00:03.000+08:00| 69.0| -|2020-01-05T00:00:04.000+08:00| 64.2| -|2020-01-05T00:00:05.000+08:00| 62.3| -|2020-01-05T00:00:06.000+08:00| 58.0| -|2020-01-05T00:00:07.000+08:00| 58.9| -|2020-01-05T00:00:08.000+08:00| 52.0| -|2020-01-05T00:00:09.000+08:00| 62.3| -|2020-01-05T00:00:10.000+08:00| 61.0| -|2020-01-05T00:00:11.000+08:00| 64.2| -|2020-01-05T00:00:12.000+08:00| 61.8| -|2020-01-05T00:00:13.000+08:00| 64.0| -|2020-01-05T00:00:14.000+08:00| 63.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|outlier(root.test.s1,"r"="5.0","k"="4","w"="10","s"="5")| -+-----------------------------+--------------------------------------------------------+ -|2020-01-05T00:00:03.000+08:00| 69.0| -+-----------------------------+--------------------------------------------------------+ -|2020-01-05T00:00:08.000+08:00| 52.0| -+-----------------------------+--------------------------------------------------------+ -``` - -## Frequency Domain Analysis - -### Conv - -#### Registration statement - -```sql -create function conv as 'org.apache.iotdb.library.frequency.UDTFConv' -``` - -#### Usage - -This function is used to calculate the convolution, i.e. polynomial multiplication. - -**Name:** CONV - -**Input:** Only support two input series. 
The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output:** Output a single series. The type is DOUBLE. It is the result of convolution whose timestamps starting from 0 only indicate the order. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| 7.0| -|1970-01-01T08:00:00.001+08:00| 0.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select conv(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------+ -| Time|conv(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 7.0| -|1970-01-01T08:00:00.001+08:00| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| -|1970-01-01T08:00:00.003+08:00| 2.0| -+-----------------------------+--------------------------------------+ -``` - -### Deconv - -#### Registration statement - -```sql -create function deconv as 'org.apache.iotdb.library.frequency.UDTFDeconv' -``` - -#### Usage - -This function is used to calculate the deconvolution, i.e. polynomial division. - -**Name:** DECONV - -**Input:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `result`: The result of deconvolution, which is 'quotient' or 'remainder'. By default, the quotient will be output. - -**Output:** Output a single series. The type is DOUBLE. It is the result of deconvolving the second series from the first series (dividing the first series by the second series) whose timestamps starting from 0 only indicate the order. - -**Note:** `NaN` in the input series will be ignored. 
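Reading each series as the coefficients of a polynomial, with position `i` holding the coefficient of the power `i`, CONV is polynomial multiplication and DECONV is polynomial long division. The sketch below illustrates that interpretation in plain Python. It is not the library's implementation, and it assumes the divisor is no longer than the dividend and that its last coefficient is non-zero. The function names mirror the UDF names only for readability.

```python
def conv(a, b):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def deconv(a, b):
    """Divide polynomial a by polynomial b; return (quotient, remainder).

    Sketch only: classic long division that eliminates the highest
    remaining power at every step.
    """
    rem = [float(x) for x in a]
    quot = [0.0] * (len(a) - len(b) + 1)
    for k in range(len(quot) - 1, -1, -1):
        q = rem[k + len(b) - 1] / b[-1]
        quot[k] = q
        for j, y in enumerate(b):
            rem[k + j] -= q * y
    return quot, rem


product = conv([1.0, 0.0, 1.0], [7.0, 2.0])
quotient, remainder = deconv([8.0, 2.0, 7.0, 2.0], [7.0, 2.0])
```

Here `product` is `[7.0, 2.0, 7.0, 2.0]`, `quotient` is `[1.0, 0.0, 1.0]` and `remainder` is `[1.0, 0.0, 0.0, 0.0]`, which matches the values shown in the CONV and DECONV examples.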
- -#### Examples - - -##### Calculate the quotient - -When `result` is 'quotient' or the default, this function calculates the quotient of the deconvolution. - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s3|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 8.0| 7.0| -|1970-01-01T08:00:00.001+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| null| -|1970-01-01T08:00:00.003+08:00| 2.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select deconv(s3,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2)| -+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - -##### Calculate the remainder - -When `result` is 'remainder', this function calculates the remainder of the deconvolution. 
 - -The input series is the same as above. The SQL for query is shown below: - - -```sql -select deconv(s3,s2,'result'='remainder') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2, "result"="remainder")| -+-----------------------------+--------------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 0.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -+-----------------------------+--------------------------------------------------------------+ -``` - -### DWT - -#### Registration statement - -```sql -create function dwt as 'org.apache.iotdb.library.frequency.UDTFDWT' -``` - -#### Usage - -This function is used to calculate the one-dimensional discrete wavelet transform of a numerical series. - -**Name:** DWT - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of wavelet. May select 'Haar', 'DB4', 'DB6', 'DB8', where DB means Daubechies. Users may instead provide the coefficients of the wavelet transform and omit this parameter. Case is ignored. -+ `coef`: Coefficients of the wavelet transform. When providing this parameter, separate the values with commas ',' and leave no spaces or other punctuation. -+ `layer`: The number of times to transform. The number of output vectors equals $layer+1$. Default is 1. - -**Output:** Output a single series. The type is DOUBLE. The length is the same as the input. - -**Note:** The length of the input series must be an integer power of 2. 
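For the Haar wavelet, a single transform layer can be sketched as below. This is an illustration rather than the library's code: it assumes the output holds the scaled pairwise sums (approximation) in its first half and the scaled pairwise differences (detail) in its second half, which is the layout the Haar example in this section appears to follow. The function name `haar_dwt` is hypothetical.

```python
import math


def haar_dwt(x):
    """One layer of the Haar wavelet transform.

    Sketch only: returns approximation coefficients followed by detail
    coefficients, so the output has the same length as the input.
    """
    n = len(x)
    if n < 2 or n & (n - 1):
        raise ValueError("input length must be a power of 2 (at least 2)")
    root2 = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / root2 for i in range(0, n, 2)]
    detail = [(x[i] - x[i + 1]) / root2 for i in range(0, n, 2)]
    return approx + detail


# Values of root.test.d1.s1 from the Haar example in this section.
s1 = [0.0, 0.2, 1.5, 1.2, 0.6, 1.7, 0.8, 2.0,
      2.5, 2.1, 0.0, 2.0, 1.8, 1.2, 1.0, 1.6]
coefficients = haar_dwt(s1)
```

The first pair (0.0, 0.2) yields an approximation of about 0.1414 and a detail of about -0.1414, in line with the first and ninth rows of the example output.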
- -#### Examples - - -##### Haar wavelet transform - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.100+08:00| 0.2| -|1970-01-01T08:00:00.200+08:00| 1.5| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.600+08:00| 0.8| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.800+08:00| 2.5| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select dwt(s1,"method"="haar") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|dwt(root.test.d1.s1, "method"="haar")| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.14142135834465192| -|1970-01-01T08:00:00.100+08:00| 1.909188342921157| -|1970-01-01T08:00:00.200+08:00| 1.6263456473052773| -|1970-01-01T08:00:00.300+08:00| 1.9798989957517026| -|1970-01-01T08:00:00.400+08:00| 3.252691126023161| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776479437628| -|1970-01-01T08:00:00.800+08:00| -0.14142135834465192| -|1970-01-01T08:00:00.900+08:00| 0.21213200063848547| -|1970-01-01T08:00:01.000+08:00| -0.7778174761639416| -|1970-01-01T08:00:01.100+08:00| -0.8485281289944873| -|1970-01-01T08:00:01.200+08:00| 0.2828427799095765| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426400127697095| -|1970-01-01T08:00:01.500+08:00| 
-0.42426408557066786| -+-----------------------------+-------------------------------------+ -``` - - -### IDWT - -#### Registration statement - -```sql -create function idwt as 'org.apache.iotdb.library.frequency.UDTFIDWT' -``` - -#### Usage - -This function performs a one-dimensional inverse discrete wavelet transform on the input series, reconstructing the original data from the wavelet coefficients produced by DWT. - -**Name:** IDWT - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of wavelet. May select 'Haar', 'DB4', 'DB6', 'DB8', where DB means Daubechies. Users may instead provide the coefficients of the wavelet transform and omit this parameter. Case is ignored. -+ `coef`: Coefficients of the wavelet transform. When providing this parameter, separate the values with commas ',' and leave no spaces or other punctuation. -+ `layer`: The number of times to transform. The number of output vectors equals $layer+1$. Default is 1. - -**Output:** Output a single series. The type is DOUBLE. The length is the same as the input. - -**Note:** -* The length of the input series must be an integer power of 2. -* The parameter settings of the IDWT function (method/coef/layer) should be consistent with the corresponding DWT transformation to correctly reconstruct the original data. -* Typically, the input of IDWT is the output result of the DWT function. 
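For the Haar wavelet with a single layer, the reconstruction can be sketched as below. This is an illustration, not the library's implementation: it assumes the input holds the approximation coefficients in its first half and the detail coefficients in its second half, and the function name `haar_idwt` is hypothetical.

```python
import math


def haar_idwt(coefficients):
    """Invert one layer of the Haar wavelet transform.

    Sketch only: each (approximation, detail) pair is turned back into
    two consecutive samples of the original series.
    """
    n = len(coefficients)
    if n < 2 or n & (n - 1):
        raise ValueError("input length must be a power of 2 (at least 2)")
    half = n // 2
    root2 = math.sqrt(2.0)
    out = []
    for a, d in zip(coefficients[:half], coefficients[half:]):
        out.append((a + d) / root2)
        out.append((a - d) / root2)
    return out


# Values of root.test.d1.s2 from the Haar example in this section.
s2 = [0.1414213562373095, 1.909188309203678, 1.6263455967290592,
      1.979898987322333, 3.2526911934581184, 1.414213562373095,
      2.1213203435596424, 1.8384776310850235, -0.1414213562373095,
      0.21213203435596428, -0.7778174593052022, -0.8485281374238569,
      0.2828427124746189, -1.414213562373095, 0.42426406871192857,
      -0.42426406871192857]
reconstructed = haar_idwt(s2)
```

Up to floating-point rounding, `reconstructed` starts with 0.0, 0.2, 1.5, 1.2, the values that the example output lists.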
- -#### Examples - -##### Haar wavelet transform - -Input series: - -``` -+-----------------------------+--------------------+ -| Time| root.test.d1.s2| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 0.1414213562373095| -|1970-01-01T08:00:00.100+08:00| 1.909188309203678| -|1970-01-01T08:00:00.200+08:00| 1.6263455967290592| -|1970-01-01T08:00:00.300+08:00| 1.979898987322333| -|1970-01-01T08:00:00.400+08:00| 3.2526911934581184| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776310850235| -|1970-01-01T08:00:00.800+08:00| -0.1414213562373095| -|1970-01-01T08:00:00.900+08:00| 0.21213203435596428| -|1970-01-01T08:00:01.000+08:00| -0.7778174593052022| -|1970-01-01T08:00:01.100+08:00| -0.8485281374238569| -|1970-01-01T08:00:01.200+08:00| 0.2828427124746189| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426406871192857| -|1970-01-01T08:00:01.500+08:00|-0.42426406871192857| -+-----------------------------+--------------------+ -``` - -SQL for query: - -```sql -select idwt(s2,"method"="haar") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------+ -| Time|idwt(root.test.d1.s2, "method"="haar")| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.100+08:00| 0.19999999999999998| -|1970-01-01T08:00:00.200+08:00| 1.4999999999999996| -|1970-01-01T08:00:00.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.6999999999999997| -|1970-01-01T08:00:00.600+08:00| 0.7999999999999998| -|1970-01-01T08:00:00.700+08:00| 1.9999999999999996| -|1970-01-01T08:00:00.800+08:00| 2.4999999999999996| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.9999999999999996| 
-|1970-01-01T08:00:01.200+08:00| 1.7999999999999998| -|1970-01-01T08:00:01.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:01.400+08:00| 0.9999999999999998| -|1970-01-01T08:00:01.500+08:00| 1.5999999999999999| -+-----------------------------+--------------------------------------+ -``` - - -### FFT - -#### Registration statement - -```sql -create function fft as 'org.apache.iotdb.library.frequency.UDTFFFT' -``` - -#### Usage - -This function is used to calculate the fast Fourier transform (FFT) of a numerical series. - -**Name:** FFT - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of FFT, which is 'uniform' (by default) or 'nonuniform'. If the value is 'uniform', the timestamps will be ignored and all data points will be regarded as equidistant. Thus, the equidistant fast Fourier transform algorithm will be applied. If the value is 'nonuniform' (TODO), the non-equidistant fast Fourier transform algorithm will be applied based on timestamps. -+ `result`: The result of FFT, which is 'real', 'imag', 'abs' or 'angle', corresponding to the real part, imaginary part, magnitude and phase angle. By default, the magnitude will be output. -+ `compress`: The parameter of compression, which is within (0,1]. It is the reserved energy ratio of lossy compression. By default, there is no compression. - - -**Output:** Output a single series. The type is DOUBLE. The length is the same as the input. The timestamps starting from 0 only indicate the order. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - - -##### Uniform FFT - -With the default `method`, uniform FFT is applied. 
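To make the default magnitude output concrete, the sketch below evaluates a plain discrete Fourier transform in Python for a 20-point signal of the form $y=sin(2\pi t/4)+2sin(2\pi t/5)$. It is a direct O(n²) definition used for illustration, not the FFT algorithm the library runs, and the function name `dft_magnitude` is hypothetical.

```python
import cmath
import math


def dft_magnitude(x):
    """Magnitude of the discrete Fourier transform, straight from the definition."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


# 20 equidistant samples; timestamps play no role in the uniform case.
signal = [
    math.sin(2 * math.pi * t / 4) + 2 * math.sin(2 * math.pi * t / 5)
    for t in range(20)
]
magnitude = dft_magnitude(signal)
```

The component with period 5 completes 4 cycles in 20 samples and the component with period 4 completes 5 cycles, so the magnitude peaks at $k=4$ (about 20) and $k=5$ (about 10), with mirrored peaks at $k=16$ and $k=15$.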
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select fft(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------+ -| Time| fft(root.test.d1.s1)| -+-----------------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 1.2727111142703152E-8| -|1970-01-01T08:00:00.002+08:00| 2.385520799101839E-7| -|1970-01-01T08:00:00.003+08:00| 8.723291723972645E-8| -|1970-01-01T08:00:00.004+08:00| 19.999999960195904| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| -|1970-01-01T08:00:00.006+08:00| 3.2260694930700566E-7| -|1970-01-01T08:00:00.007+08:00| 8.723291605373329E-8| -|1970-01-01T08:00:00.008+08:00| 1.108657103979944E-7| -|1970-01-01T08:00:00.009+08:00| 1.2727110997246171E-8| -|1970-01-01T08:00:00.010+08:00|1.9852334701272664E-23| -|1970-01-01T08:00:00.011+08:00| 1.2727111194499847E-8| -|1970-01-01T08:00:00.012+08:00| 1.108657103979944E-7| 
-|1970-01-01T08:00:00.013+08:00| 8.723291785769131E-8| -|1970-01-01T08:00:00.014+08:00| 3.226069493070057E-7| -|1970-01-01T08:00:00.015+08:00| 9.999999850988388| -|1970-01-01T08:00:00.016+08:00| 19.999999960195904| -|1970-01-01T08:00:00.017+08:00| 8.723291747109068E-8| -|1970-01-01T08:00:00.018+08:00| 2.3855207991018386E-7| -|1970-01-01T08:00:00.019+08:00| 1.2727112069910878E-8| -+-----------------------------+----------------------+ -``` - -Note: The input is $y=sin(2\pi t/4)+2sin(2\pi t/5)$ with a length of 20. Thus, there are peaks in $k=4$ and $k=5$ of the output. - -##### Uniform FFT with Compression - -Input series is the same as above, the SQL for query is shown below: - -```sql -select fft(s1, 'result'='real', 'compress'='0.99'), fft(s1, 'result'='imag','compress'='0.99') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| fft(root.test.d1.s1,| fft(root.test.d1.s1,| -| | "result"="real",| "result"="imag",| -| | "compress"="0.99")| "compress"="0.99")| -+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - -Note: Based on the conjugation of the Fourier transform result, only the first half of the compression result is reserved. -According to the given parameter, data points are reserved from low frequency to high frequency until the reserved energy ratio exceeds it. 
-The last data point is reserved to indicate the length of the series. - -### HighPass - -#### Registration statement - -```sql -create function highpass as 'org.apache.iotdb.library.frequency.UDTFHighPass' -``` - -#### Usage - -This function performs high-pass filtering on the input series and extracts components above the cutoff frequency. -The timestamps of input will be ignored and all data points will be regarded as equidistant. - -**Name:** HIGHPASS - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `wpass`: The normalized cutoff frequency, which is a value in (0,1). This parameter is required. - -**Output:** Output a single series. The type is DOUBLE. It is the input after filtering. The length and timestamps of output are the same as the input. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select 
highpass(s1,'wpass'='0.45') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------+ -| Time|highpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9999999534830373| -|1970-01-01T08:00:01.000+08:00| 1.7462829277628608E-8| -|1970-01-01T08:00:02.000+08:00| -0.9999999593178128| -|1970-01-01T08:00:03.000+08:00| -4.1115269056426626E-8| -|1970-01-01T08:00:04.000+08:00| 0.9999999925494194| -|1970-01-01T08:00:05.000+08:00| 3.328126513330016E-8| -|1970-01-01T08:00:06.000+08:00| -1.0000000183304454| -|1970-01-01T08:00:07.000+08:00| 6.260191433311374E-10| -|1970-01-01T08:00:08.000+08:00| 1.0000000018134796| -|1970-01-01T08:00:09.000+08:00| -3.097210911744423E-17| -|1970-01-01T08:00:10.000+08:00| -1.0000000018134794| -|1970-01-01T08:00:11.000+08:00| -6.260191627862097E-10| -|1970-01-01T08:00:12.000+08:00| 1.0000000183304454| -|1970-01-01T08:00:13.000+08:00| -3.328126501424346E-8| -|1970-01-01T08:00:14.000+08:00| -0.9999999925494196| -|1970-01-01T08:00:15.000+08:00| 4.111526915498874E-8| -|1970-01-01T08:00:16.000+08:00| 0.9999999593178128| -|1970-01-01T08:00:17.000+08:00| -1.7462829341296528E-8| -|1970-01-01T08:00:18.000+08:00| -0.9999999534830369| -|1970-01-01T08:00:19.000+08:00| -1.035237222742873E-16| -+-----------------------------+-----------------------------------------+ -``` - -Note: The input is $y=sin(2\pi t/4)+2sin(2\pi t/5)$ with a length of 20. Thus, the output is $y=sin(2\pi t/4)$ after high-pass filtering. - -### IFFT - -#### Registration statement - -```sql -create function ifft as 'org.apache.iotdb.library.frequency.UDTFIFFT' -``` - -#### Usage - -This function treats the two input series as the real and imaginary part of a complex series, performs an inverse fast Fourier transform (IFFT), and outputs the real part of the result. 
-For the input format, please refer to the output format of the `FFT` function. -Moreover, the compressed output of the `FFT` function is also supported. - -**Name:** IFFT - -**Input:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `start`: The start time of the output series with the format 'yyyy-MM-dd HH:mm:ss'. By default, it is '1970-01-01 08:00:00'. -+ `interval`: The interval of the output series, which is a positive number with a unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, it is 1s. - -**Output:** Output a single series. The type is DOUBLE. It is strictly equispaced. The values are the results of IFFT. - -**Note:** If a row contains null points or `NaN`, it will be ignored. - -#### Examples - - -Input series: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| root.test.d1.re| root.test.d1.im| -+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - - -SQL for query: - -```sql -select ifft(re, im, 'interval'='1m', 'start'='2021-01-01 00:00:00') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------------+ -| Time|ifft(root.test.d1.re, root.test.d1.im, "interval"="1m",| -| | "start"="2021-01-01 00:00:00")| 
-+-----------------------------+-------------------------------------------------------+ -|2021-01-01T00:00:00.000+08:00| 2.902112992431231| -|2021-01-01T00:01:00.000+08:00| 1.1755704705132448| -|2021-01-01T00:02:00.000+08:00| -2.175570513757101| -|2021-01-01T00:03:00.000+08:00| -1.9021130389094498| -|2021-01-01T00:04:00.000+08:00| 0.9999999925494194| -|2021-01-01T00:05:00.000+08:00| 1.902113046743454| -|2021-01-01T00:06:00.000+08:00| 0.17557053610884188| -|2021-01-01T00:07:00.000+08:00| -1.1755704886020932| -|2021-01-01T00:08:00.000+08:00| -0.9021130371347148| -|2021-01-01T00:09:00.000+08:00| 3.552713678800501E-16| -|2021-01-01T00:10:00.000+08:00| 0.9021130371347154| -|2021-01-01T00:11:00.000+08:00| 1.1755704886020932| -|2021-01-01T00:12:00.000+08:00| -0.17557053610884144| -|2021-01-01T00:13:00.000+08:00| -1.902113046743454| -|2021-01-01T00:14:00.000+08:00| -0.9999999925494196| -|2021-01-01T00:15:00.000+08:00| 1.9021130389094498| -|2021-01-01T00:16:00.000+08:00| 2.1755705137571004| -|2021-01-01T00:17:00.000+08:00| -1.1755704705132448| -|2021-01-01T00:18:00.000+08:00| -2.902112992431231| -|2021-01-01T00:19:00.000+08:00| -3.552713678800501E-16| -+-----------------------------+-------------------------------------------------------+ -``` - -### LowPass - -#### Registration statement - -```sql -create function lowpass as 'org.apache.iotdb.library.frequency.UDTFLowPass' -``` - -#### Usage - -This function performs low-pass filtering on the input series and extracts components below the cutoff frequency. -The timestamps of input will be ignored and all data points will be regarded as equidistant. - -**Name:** LOWPASS - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `wpass`: The normalized cutoff frequency, which is a value in (0,1). This parameter is required. - -**Output:** Output a single series. The type is DOUBLE. It is the input after filtering. 
The length and timestamps of output are the same as the input. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select lowpass(s1,'wpass'='0.45') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|lowpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.9021130073323922| -|1970-01-01T08:00:01.000+08:00| 1.1755704705132448| -|1970-01-01T08:00:02.000+08:00| -1.1755705286582614| -|1970-01-01T08:00:03.000+08:00| -1.9021130389094498| -|1970-01-01T08:00:04.000+08:00| 7.450580419288145E-9| -|1970-01-01T08:00:05.000+08:00| 1.902113046743454| -|1970-01-01T08:00:06.000+08:00| 1.1755705212076808| -|1970-01-01T08:00:07.000+08:00| -1.1755704886020932| -|1970-01-01T08:00:08.000+08:00| -1.9021130222335536| 
-|1970-01-01T08:00:09.000+08:00| 3.552713678800501E-16| -|1970-01-01T08:00:10.000+08:00| 1.9021130222335536| -|1970-01-01T08:00:11.000+08:00| 1.1755704886020932| -|1970-01-01T08:00:12.000+08:00| -1.1755705212076801| -|1970-01-01T08:00:13.000+08:00| -1.902113046743454| -|1970-01-01T08:00:14.000+08:00| -7.45058112983088E-9| -|1970-01-01T08:00:15.000+08:00| 1.9021130389094498| -|1970-01-01T08:00:16.000+08:00| 1.1755705286582616| -|1970-01-01T08:00:17.000+08:00| -1.1755704705132448| -|1970-01-01T08:00:18.000+08:00| -1.9021130073323924| -|1970-01-01T08:00:19.000+08:00| -2.664535259100376E-16| -+-----------------------------+----------------------------------------+ -``` - -Note: The input is $y=sin(2\pi t/4)+2sin(2\pi t/5)$ with a length of 20. Thus, the output is $y=2sin(2\pi t/5)$ after low-pass filtering. - - -### Envelope - -#### Registration statement - -```sql -create function envelope as 'org.apache.iotdb.library.frequency.UDFEnvelopeAnalysis' -``` - -#### Usage - -This function achieves signal demodulation and envelope extraction by inputting a one-dimensional floating-point array and a user specified modulation frequency. The goal of demodulation is to extract the parts of interest from complex signals, making them easier to understand. For example, demodulation can be used to find the envelope of the signal, that is, the trend of amplitude changes. - -**Name:** Envelope - -**Input:** Only supports a single input sequence, with types INT32/INT64/FLOAT/DOUBLE - - -**Parameters:** - -+ `frequency`: Frequency (optional, positive number. If this parameter is not filled in, the system will infer the frequency based on the time interval corresponding to the sequence). -+ `amplification`: Amplification factor (optional, positive integer. The output of the Time column is a set of positive integers and does not output decimals. When the frequency is less than 1, this parameter can be used to amplify the frequency to display normal results). 
- -**Output:** -+ `Time`: The meaning of the value returned by this column is frequency rather than time. If the output format is time format (e.g. 1970-01-01T08:00: 19.000+08:00), please convert it to a timestamp value. - - -+ `Envelope(Path, 'frequency'='{frequency}')`:Output a single sequence of type DOUBLE, which is the result of envelope analysis. - -**Note:** When the values of the demodulated original sequence are discontinuous, this function will treat it as continuous processing. It is recommended that the analyzed time series be a complete time series of values. It is also recommended to specify a start time and an end time. - -#### Examples - -Input series: - - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:01.000+08:00| 1.0 | -|1970-01-01T08:00:02.000+08:00| 2.0 | -|1970-01-01T08:00:03.000+08:00| 3.0 | -|1970-01-01T08:00:04.000+08:00| 4.0 | -|1970-01-01T08:00:05.000+08:00| 5.0 | -|1970-01-01T08:00:06.000+08:00| 6.0 | -|1970-01-01T08:00:07.000+08:00| 7.0 | -|1970-01-01T08:00:08.000+08:00| 8.0 | -|1970-01-01T08:00:09.000+08:00| 9.0 | -|1970-01-01T08:00:10.000+08:00| 10.0 | -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -set time_display_type=long; -select envelope(s1),envelope(s1,'frequency'='1000'),envelope(s1,'amplification'='10') from root.test.d1; -``` - -Output series: - - -``` -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -|Time|envelope(root.test.d1.s1)|envelope(root.test.d1.s1, "frequency"="1000")|envelope(root.test.d1.s1, "amplification"="10")| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -| 0| 6.284350808484124| 6.284350808484124| 6.284350808484124| -| 100| 1.5581923657404393| 1.5581923657404393| null| -| 200| 0.8503211038340728| 0.8503211038340728| 
null| -| 300| 0.512808785945551| 0.512808785945551| null| -| 400| 0.26361156774506744| 0.26361156774506744| null| -|1000| null| null| 1.5581923657404393| -|2000| null| null| 0.8503211038340728| -|3000| null| null| 0.512808785945551| -|4000| null| null| 0.26361156774506744| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ - -``` - - -## Data Matching - -### Cov - -#### Registration statement - -```sql -create function cov as 'org.apache.iotdb.library.dmatch.UDAFCov' -``` - -#### Usage - -This function is used to calculate the population covariance. - -**Name:** COV - -**Input Series:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the population covariance. - -**Note:** - -+ If a row contains missing points, null points or `NaN`, it will be ignored; -+ If all rows are ignored, `NaN` will be output. 
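
The rule described above (drop every row that has a missing, null or `NaN` value, then take the population covariance of the remaining pairs) can be sketched in a few lines of Python. This is only an illustration, not the UDF's implementation; the function name `population_cov` is hypothetical, and the sample values are the rows of the example input in this section.

```python
import math


def population_cov(xs, ys):
    """Population covariance over rows where both values are present."""
    # Keep only rows in which neither value is None or NaN.
    pairs = [
        (x, y)
        for x, y in zip(xs, ys)
        if x is not None and y is not None
        and not math.isnan(x) and not math.isnan(y)
    ]
    if not pairs:
        # Mirrors the note: if all rows are ignored, NaN is returned.
        return math.nan
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Population covariance divides by n, not n - 1.
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n


# Rows of root.test.d2.s1 / root.test.d2.s2 from the example in this section.
s1 = [100.0, 101.0, 102.0, 104.0, 126.0, 108.0, None, 112.0,
      113.0, 114.0, 116.0, 118.0, 100.0, 124.0, 126.0, math.nan]
s2 = [101.0, None, 101.0, 102.0, 102.0, 103.0, 103.0, 104.0,
      None, 104.0, 105.0, 105.0, 106.0, 108.0, 108.0, 108.0]

cov_result = population_cov(s1, s2)
print(cov_result)
```

Twelve of the sixteen rows survive the filter, and the result agrees with the value shown in the example output (about 12.2917).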
- - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select cov(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|cov(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 12.291666666666666| -+-----------------------------+-------------------------------------+ -``` - -### DTW - -#### Registration statement - -```sql -create function dtw as 'org.apache.iotdb.library.dmatch.UDAFDtw' -``` - -#### Usage - -This function is used to calculate the DTW distance between two input series. - -**Name:** DTW - -**Input Series:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the DTW distance. 
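
As a rough illustration of what a DTW distance is, the sketch below uses the textbook dynamic-programming recurrence with the absolute difference as the point-wise cost. The cost function and the name `dtw_distance` are assumptions made for this example; the source does not state which cost the UDF uses internally.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic-programming DTW distance."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])  # assumed point-wise cost
            d[i][j] = cost + min(d[i - 1][j],      # advance in a only
                                 d[i][j - 1],      # advance in b only
                                 d[i - 1][j - 1])  # advance in both
    return d[n][m]


# Two constant series of length 20, as in the example of this section.
print(dtw_distance([1.0] * 20, [2.0] * 20))
```

With this cost, two constant series of length 20 that differ by 1 accumulate a cost of 1 on each of the 20 diagonal steps, and two empty inputs give 0.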
- -**Note:** - -+ If a row contains missing points, null points or `NaN`, it will be ignored; -+ If all rows are ignored, `0` will be output. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.003+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.004+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.005+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.006+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.007+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.008+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.009+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.010+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.011+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.012+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.013+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.014+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.015+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.016+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.017+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.018+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.019+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.020+08:00| 1.0| 2.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select dtw(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|dtw(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 20.0| -+-----------------------------+-------------------------------------+ -``` - -### Pearson - -#### Registration statement - -```sql -create function pearson as 'org.apache.iotdb.library.dmatch.UDAFPearson' -``` - -#### Usage - -This function is used to calculate the Pearson Correlation Coefficient. - -**Name:** PEARSON - -**Input Series:** Only support two input series. 
The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the Pearson Correlation Coefficient. - -**Note:** - -+ If a row contains missing points, null points or `NaN`, it will be ignored; -+ If all rows are ignored, `NaN` will be output. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select pearson(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------+ -| Time|pearson(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.5630881927754872| -+-----------------------------+-----------------------------------------+ -``` - -### PtnSym - -#### Registration statement - -```sql -create function ptnsym as 'org.apache.iotdb.library.dmatch.UDTFPtnSym' -``` - -#### Usage - -This 
function is used to find all symmetric subseries in the input whose degree of symmetry is less than the threshold. -The degree of symmetry is calculated by DTW. -The smaller the degree, the more symmetrical the series is. - -**Name:** PATTERNSYMMETRIC - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE - -**Parameter:** - -+ `window`: The length of the symmetric subseries. It's a positive integer and the default value is 10. -+ `threshold`: The threshold of the degree of symmetry. It's non-negative. Only the subseries whose degree of symmetry is below it will be output. By default, all subseries will be output. - - -**Output Series:** Output a single series. The type is DOUBLE. Each data point in the output series corresponds to a symmetric subseries. The output timestamp is the starting timestamp of the subseries and the output value is the degree of symmetry. - -#### Example - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s4| -+-----------------------------+---------------+ -|2021-01-01T12:00:00.000+08:00| 1.0| -|2021-01-01T12:00:01.000+08:00| 2.0| -|2021-01-01T12:00:02.000+08:00| 3.0| -|2021-01-01T12:00:03.000+08:00| 2.0| -|2021-01-01T12:00:04.000+08:00| 1.0| -|2021-01-01T12:00:05.000+08:00| 1.0| -|2021-01-01T12:00:06.000+08:00| 1.0| -|2021-01-01T12:00:07.000+08:00| 1.0| -|2021-01-01T12:00:08.000+08:00| 2.0| -|2021-01-01T12:00:09.000+08:00| 3.0| -|2021-01-01T12:00:10.000+08:00| 2.0| -|2021-01-01T12:00:11.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ptnsym(s4, 'window'='5', 'threshold'='0') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|ptnsym(root.test.d1.s4, "window"="5", "threshold"="0")| -+-----------------------------+------------------------------------------------------+ -|2021-01-01T12:00:00.000+08:00| 0.0| 
-|2021-01-01T12:00:07.000+08:00| 0.0| -+-----------------------------+------------------------------------------------------+ -``` - -### XCorr - -#### Registration statement - -```sql -create function xcorr as 'org.apache.iotdb.library.dmatch.UDTFXCorr' -``` - -#### Usage - -This function is used to calculate the cross correlation function of two given time series. -For discrete time series, the cross correlation is given by -$$CR(n) = \frac{1}{N} \sum_{m=1}^N S_1[m]S_2[m+n]$$ -which represents the similarity between the two series at different index shifts. - -**Name:** XCORR - -**Input Series:** Only support two input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series with DOUBLE as datatype. -There are $2N-1$ data points in the series. The center point is the cross correlation -calculated with the two series aligned (that is, $CR(0)$ in the formula above), -and the points before (or after) it are the values obtained by shifting the latter series forward (or backward) -until the two series no longer overlap (the non-overlapping shift itself is not included). -In short, the values of the output series are given by (index starts from 1) -$$OS[i] = CR(-N+i) = \frac{1}{N} \sum_{m=1}^{i} S_1[m]S_2[N-i+m],\ if\ i <= N$$ -$$OS[i] = CR(i-N) = \frac{1}{N} \sum_{m=1}^{2N-i} S_1[i-N+m]S_2[m],\ if\ i > N$$ - -**Note:** - -+ `null` and `NaN` values in the input series are treated as 0. 
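
The two $OS[i]$ formulas above translate directly into code. The following Python sketch is a literal transcription of those formulas, with `null` and `NaN` replaced by 0 as the note says; the function name `xcorr` and the requirement that both lists have the same length $N$ are assumptions of this illustration, and the sample values are those of the example in this section.

```python
import math


def xcorr(s1, s2):
    """Cross correlation OS[1..2N-1] of two equally long series."""
    def clean(v):
        # null (None) and NaN are treated as 0.
        return 0.0 if v is None or math.isnan(v) else float(v)

    a = [clean(v) for v in s1]
    b = [clean(v) for v in s2]
    n = len(a)
    out = []
    for i in range(1, 2 * n):  # i is the 1-based output index
        if i <= n:
            # OS[i] = 1/N * sum_{m=1..i} S1[m] * S2[N-i+m]
            total = sum(a[m - 1] * b[n - i + m - 1] for m in range(1, i + 1))
        else:
            # OS[i] = 1/N * sum_{m=1..2N-i} S1[i-N+m] * S2[m]
            total = sum(a[i - n + m - 1] * b[m - 1]
                        for m in range(1, 2 * n - i + 1))
        out.append(total / n)
    return out


xcorr_result = xcorr([None, 2, 3, 4, 5], [6, 7, math.nan, 9, 10])
print(xcorr_result)
```

For these five-point inputs the sketch yields nine values that agree with the output table of the example, with the center value $CR(0) = 20.0$.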
- -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| null| 6| -|2020-01-01T00:00:02.000+08:00| 2| 7| -|2020-01-01T00:00:03.000+08:00| 3| NaN| -|2020-01-01T00:00:04.000+08:00| 4| 9| -|2020-01-01T00:00:05.000+08:00| 5| 10| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select xcorr(s1, s2) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -Output series: - -``` -+-----------------------------+---------------------------------------+ -| Time|xcorr(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+---------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 4.0| -|1970-01-01T08:00:00.002+08:00| 9.6| -|1970-01-01T08:00:00.003+08:00| 13.4| -|1970-01-01T08:00:00.004+08:00| 20.0| -|1970-01-01T08:00:00.005+08:00| 15.6| -|1970-01-01T08:00:00.006+08:00| 9.2| -|1970-01-01T08:00:00.007+08:00| 11.8| -|1970-01-01T08:00:00.008+08:00| 6.0| -+-----------------------------+---------------------------------------+ -``` - -### Pattern\_match - -#### Registration statement - -```SQL -create function pattern_match as 'org.apache.iotdb.library.match.UDAFPatternMatch' -``` - -#### Usage - -This function performs pattern matching between an input time series and a predefined pattern. A match is considered successful if the similarity measure (distance) is less than or equal to a specified threshold. The results are output as a JSON list. - -**Name:** PATTERN\_MATCH - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE / BOOLEAN. - -**Parameter:** - -* `timePattern`: A comma-separated string of timestamps (e.g., `"t1,t2,t3"`). Length must be **greater than 1**. Required. 
-* `valuePattern`: A comma-separated string of numerical values corresponding to `timePattern`. Length must **match** `timePattern` and be greater than 1. Required. - -> For boolean values: Use `1` for `true` and `0` for `false`. - -* `threshold`: Float-type similarity threshold. Required. - -**Output Series:** A JSON list containing all successfully matched segments. Each entry includes: start timestamp `startTime`, end timestamp `endTime`, calculated similarity value `distance`. - -#### Example - -1. Linear Data - -Input series: - -```SQL -IoTDB> select s0 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s0| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.1| -|1970-01-01T08:00:00.003+08:00| 1.2| -|1970-01-01T08:00:00.004+08:00| 1.3| -|1970-01-01T08:00:00.005+08:00| 0.0| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+--------------------------------------------------------------------------------------------------+ -| match_result| -+--------------------------------------------------------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]| -+--------------------------------------------------------------------------------------------------+ -``` - -2. 
Boolean Data - -Input series: - -```SQL -IoTDB> select s1 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| true| -|1970-01-01T08:00:00.002+08:00| true| -|1970-01-01T08:00:00.003+08:00| true| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| false| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", "threshold"="0.5") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+-------------------------------------------------+ -| match_result| -+-------------------------------------------------+ -|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+-------------------------------------------------+ -``` - -3. V-shaped Data - -Input series: - -```SQL -IoTDB> select s2 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s2| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| -1.0| -|1970-01-01T08:00:00.003+08:00| -2.0| -|1970-01-01T08:00:00.004+08:00| -3.0| -|1970-01-01T08:00:00.005+08:00| -2.0| -|1970-01-01T08:00:00.006+08:00| -1.0| -|1970-01-01T08:00:00.007+08:00| -0.0| -|1970-01-01T08:00:00.008+08:00| -0.0| -|1970-01-01T08:00:00.009+08:00| -0.0| -|1970-01-01T08:00:00.010+08:00| -0.0| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s2, "timePattern"="1,2,3,4,5,6,7", "valuePattern"="0.0,-1.0,-2.0,-3.0,-2.0,-1.0,-0.0", "threshold"="10") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+----------------------------------------------+ -| match_result| -+----------------------------------------------+ -|[{"distance":0.53,"startTime":1,"endTime":10}]| -+----------------------------------------------+ -``` - -4. 
Multiple Matching Pattern - -Input series: - -```SQL -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------+-------------+ -| Time|root.db.d0.s0|root.db.d0.s1| -+-----------------------------+-------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| true| -|1970-01-01T08:00:00.002+08:00| 1.1| true| -|1970-01-01T08:00:00.003+08:00| 1.2| true| -|1970-01-01T08:00:00.004+08:00| 1.3| false| -|1970-01-01T08:00:00.005+08:00| 0.0| false| -+-----------------------------+-------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result1, pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", - "threshold"="0.5") as match_result2 from root.db.d0 -``` - -Output series: - -```SQL -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -| match_result1| match_result2| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -``` - -## Data Repairing - -### TimestampRepair - -#### Registration statement - -```sql -create function timestamprepair as 'org.apache.iotdb.library.drepair.UDTFTimestampRepair' -``` - -#### Usage - -This function is used for timestamp repair. -According to the given standard time interval, -the method of minimizing the repair cost is adopted. -By fine-tuning the timestamps, -the original data with unstable timestamp interval is repaired to strictly equispaced data. 
-If no standard time interval is given, -this function will use the **median**, **mode** or **cluster** of the time interval to estimate the standard time interval. - -**Name:** TIMESTAMPREPAIR - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `interval`: The standard time interval whose unit is millisecond. It is a positive integer. By default, it will be estimated according to the given method. -+ `method`: The method to estimate the standard time interval, which is 'median', 'mode' or 'cluster'. This parameter is only valid when `interval` is not given. By default, median will be used. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. - -#### Examples - -##### Manually Specify the Standard Time Interval - -When `interval` is given, this function repairs according to the given standard time interval. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:19.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:01.000+08:00| 7.0| -|2021-07-01T12:01:11.000+08:00| 8.0| -|2021-07-01T12:01:21.000+08:00| 9.0| -|2021-07-01T12:01:31.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timestamprepair(s1,'interval'='10000') from root.test.d2 -``` - -Output series: - - -``` -+-----------------------------+----------------------------------------------------+ -| Time|timestamprepair(root.test.d2.s1, "interval"="10000")| -+-----------------------------+----------------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| 
-|2021-07-01T12:00:20.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:00.000+08:00| 7.0| -|2021-07-01T12:01:10.000+08:00| 8.0| -|2021-07-01T12:01:20.000+08:00| 9.0| -|2021-07-01T12:01:30.000+08:00| 10.0| -|2021-07-01T12:01:40.000+08:00| NaN| -+-----------------------------+----------------------------------------------------+ -``` - -##### Automatically Estimate the Standard Time Interval - -When `interval` is default, this function estimates the standard time interval. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select timestamprepair(s1) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------+ -| Time|timestamprepair(root.test.d2.s1)| -+-----------------------------+--------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:20.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:00.000+08:00| 7.0| -|2021-07-01T12:01:10.000+08:00| 8.0| -|2021-07-01T12:01:20.000+08:00| 9.0| -|2021-07-01T12:01:30.000+08:00| 10.0| -|2021-07-01T12:01:40.000+08:00| NaN| -+-----------------------------+--------------------------------+ -``` - -### ValueFill - -#### Registration statement - -```sql -create function valuefill as 'org.apache.iotdb.library.drepair.UDTFValueFill' -``` - -#### Usage - -This function is used to impute time series. Several methods are supported. - -**Name**: ValueFill -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: {"mean", "previous", "linear", "likelihood", "AR", "MA", "SCREEN"}, default "linear". - Method to use for imputation in series. 
"mean": use global mean value to fill holes; "previous": propagate last valid observation forward to next valid. "linear": simplest interpolation method; "likelihood":Maximum likelihood estimation based on the normal distribution of speed; "AR": auto regression; "MA": moving average; "SCREEN": speed constraint. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. - -**Note:** AR method use AR(1) model. Input value should be auto-correlated, or the function would output a single point (0, 0.0). - -#### Examples - -##### Fill with linear - -When `method` is "linear" or the default, Screen method is used to impute. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| NaN| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| NaN| -|2020-01-01T00:00:22.000+08:00| NaN| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select valuefill(s1) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------+ -| Time|valuefill(root.test.d2.s1)| -+-----------------------------+--------------------------+ -|2020-01-01T00:00:02.000+08:00| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 
-|2020-01-01T00:00:14.000+08:00| 110.5| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.66666666666667| -|2020-01-01T00:00:22.000+08:00| 121.33333333333333| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+--------------------------+ -``` - -##### Previous Fill - -When `method` is "previous", previous method is used. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select valuefill(s1,"method"="previous") from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------------+ -| Time|valuefill(root.test.d2.s1, "method"="previous")| -+-----------------------------+-----------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 108.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 116.0| -|2020-01-01T00:00:22.000+08:00| 116.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-----------------------------------------------+ -``` - -### ValueRepair - -#### Registration statement - -```sql -create function valuerepair as 'org.apache.iotdb.library.drepair.UDTFValueRepair' -``` - -#### Usage - -This function is used to repair the value of the time series. 
-Currently, two methods are supported: -**Screen** is a method based on speed thresholds, which makes all speeds meet the threshold requirements while changing the data as little as possible; -**LsGreedy** is a method based on speed change likelihood, which models speed changes as a Gaussian distribution and uses a greedy algorithm to maximize the likelihood. - - -**Name:** VALUEREPAIR - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The method used to repair, which is 'Screen' or 'LsGreedy'. By default, Screen is used. -+ `minSpeed`: This parameter is only valid with Screen. It is the lower speed threshold. Speeds below it will be regarded as outliers. By default, it is the median minus 3 times the median absolute deviation. -+ `maxSpeed`: This parameter is only valid with Screen. It is the upper speed threshold. Speeds above it will be regarded as outliers. By default, it is the median plus 3 times the median absolute deviation. -+ `center`: This parameter is only valid with LsGreedy. It is the center of the Gaussian distribution of speed changes. By default, it is 0. -+ `sigma`: This parameter is only valid with LsGreedy. It is the standard deviation of the Gaussian distribution of speed changes. By default, it is the median absolute deviation. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. - -**Note:** `NaN` will be filled with linear interpolation before repairing. - -#### Examples - -##### Repair with Screen - -When `method` is 'Screen' or the default, the Screen method is used. 
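
To make the default `minSpeed` / `maxSpeed` rule more concrete, the sketch below computes the median of the point-to-point speeds plus or minus 3 times their median absolute deviation. It rests on several assumptions that the source does not confirm: speed is taken as value difference divided by time difference in seconds, the deviation is the plain (unscaled) median absolute deviation, and the trailing `NaN` row of the example is simply left out. The name `default_speed_bounds` is hypothetical.

```python
from statistics import median


def default_speed_bounds(times, values):
    """Return (minSpeed, maxSpeed) as median -/+ 3 * MAD of the speeds."""
    # Speed between consecutive points: value change per unit of time.
    speeds = [
        (values[i + 1] - values[i]) / (times[i + 1] - times[i])
        for i in range(len(values) - 1)
    ]
    mid = median(speeds)
    # Unscaled median absolute deviation (an assumption of this sketch).
    mad = median([abs(s - mid) for s in speeds])
    return mid - 3 * mad, mid + 3 * mad


# Seconds and values of the example input below, without the final NaN row.
times = [2, 3, 4, 6, 8, 10, 14, 15, 16, 18, 20, 22, 26, 28]
values = [100.0, 101.0, 102.0, 104.0, 126.0, 108.0, 112.0,
          113.0, 114.0, 116.0, 118.0, 100.0, 124.0, 126.0]

bounds = default_speed_bounds(times, values)
print(bounds)
```

Under these assumptions most speeds in the example equal 1 per second, so the deviation is 0 and both bounds collapse to 1.0. The jumps to 126.0 and down to 100.0 then fall outside the bounds and are treated as outliers.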
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select valuerepair(s1) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|valuerepair(root.test.d2.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+----------------------------+ -``` - -##### Repair with LsGreedy - -When `method` is 'LsGreedy', LsGreedy method is used. 
- -Input series is the same as above, the SQL for query is shown below: - -```sql -select valuerepair(s1,'method'='LsGreedy') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|valuerepair(root.test.d2.s1, "method"="LsGreedy")| -+-----------------------------+-------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-------------------------------------------------+ -``` - -## Series Discovery - -### ConsecutiveSequences - -#### Registration statement - -```sql -create function consecutivesequences as 'org.apache.iotdb.library.series.UDTFConsecutiveSequences' -``` - -#### Usage - -This function is used to find locally longest consecutive subsequences in strictly equispaced multidimensional data. - -Strictly equispaced data is the data whose time intervals are strictly equal. Missing data, including missing rows and missing values, is allowed in it, while data redundancy and timestamp drift is not allowed. - -Consecutive subsequence is the subsequence that is strictly equispaced with the standard time interval without any missing data. If a consecutive subsequence is not a proper subsequence of any consecutive subsequence, it is locally longest. - -**Name:** CONSECUTIVESEQUENCES - -**Input Series:** Support multiple input series. 
The type is arbitrary but the data is strictly equispaced. - -**Parameters:** - -+ `gap`: The standard time interval which is a positive number with an unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, it will be estimated by the mode of time intervals. - -**Output Series:** Output a single series. The type is INT32. Each data point in the output series corresponds to a locally longest consecutive subsequence. The output timestamp is the starting timestamp of the subsequence and the output value is the number of data points in the subsequence. - -**Note:** For input series that is not strictly equispaced, there is no guarantee on the output. - -#### Examples - -##### Manually Specify the Standard Time Interval - -It's able to manually specify the standard time interval by the parameter `gap`. It's notable that false parameter leads to false output. - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| -|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select consecutivesequences(s1,s2,'gap'='5m') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2, "gap"="5m")| 
-+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| -+-----------------------------+------------------------------------------------------------------+ -``` - - -##### Automatically Estimate the Standard Time Interval - -When `gap` is default, this function estimates the standard time interval by the mode of time intervals and gets the same results. Therefore, this usage is more recommended. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select consecutivesequences(s1,s2) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| -+-----------------------------+------------------------------------------------------+ -``` - -### ConsecutiveWindows - -#### Registration statement - -```sql -create function consecutivewindows as 'org.apache.iotdb.library.series.UDTFConsecutiveWindows' -``` - -#### Usage - -This function is used to find consecutive windows of specified length in strictly equispaced multidimensional data. - -Strictly equispaced data is the data whose time intervals are strictly equal. Missing data, including missing rows and missing values, is allowed in it, while data redundancy and timestamp drift is not allowed. - -Consecutive window is the subsequence that is strictly equispaced with the standard time interval without any missing data. - -**Name:** CONSECUTIVEWINDOWS - -**Input Series:** Support multiple input series. The type is arbitrary but the data is strictly equispaced. 
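The notion of a consecutive run shared by ConsecutiveSequences and ConsecutiveWindows can be sketched in Python. This is a minimal sketch under assumptions, not the UDF's code: the function name `consecutive_sequences` is hypothetical, timestamps are plain numbers in the same unit as `gap`, a missing value is represented by `None`, and a single complete row is treated as a run of length 1.

```python
def consecutive_sequences(rows, gap):
    """Find locally longest consecutive runs.

    rows: list of (timestamp, values) sorted by timestamp.
    gap:  standard time interval, in the same unit as the timestamps.
    Returns a list of (start_timestamp, number_of_points).
    """
    runs = []
    start = None   # start timestamp of the run being built
    prev = None    # timestamp of the previous complete row, if any
    length = 0
    for ts, vals in rows:
        complete = all(v is not None for v in vals)
        if complete and prev is not None and ts - prev == gap:
            length += 1  # the row extends the current run
        else:
            if length:
                runs.append((start, length))  # close the previous run
            start, length = (ts, 1) if complete else (None, 0)
        prev = ts if complete else None
    if length:
        runs.append((start, length))
    return runs

# Timestamps in minutes, modelled on the two-column example input
# (the row at minute 40 has a missing value in the second column).
rows = [
    (0, (1.0, 1.0)), (5, (1.0, 1.0)), (10, (1.0, 1.0)),
    (20, (1.0, 1.0)), (25, (1.0, 1.0)), (30, (1.0, 1.0)), (35, (1.0, 1.0)),
    (40, (1.0, None)),
    (45, (1.0, 1.0)), (50, (1.0, 1.0)),
]
print(consecutive_sequences(rows, gap=5))
```

A run is closed either by a missing row (the jump from minute 10 to minute 20) or by a row with a missing value (minute 40), which is why the example data splits into three runs.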
- -**Parameters:** - -+ `gap`: The standard time interval which is a positive number with an unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, it will be estimated by the mode of time intervals. -+ `length`: The length of the window which is a positive number with an unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. This parameter cannot be lacked. - -**Output Series:** Output a single series. The type is INT32. Each data point in the output series corresponds to a consecutive window. The output timestamp is the starting timestamp of the window and the output value is the number of data points in the window. - -**Note:** For input series that is not strictly equispaced, there is no guarantee on the output. - -#### Examples - - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| -|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select consecutivewindows(s1,s2,'length'='10m') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------+ -| Time|consecutivewindows(root.test.d1.s1, root.test.d1.s2, "length"="10m")| -+-----------------------------+--------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| 
-|2020-01-01T00:20:00.000+08:00| 4| -+-----------------------------+--------------------------------------------------------------------+ -``` - - - -## Machine Learning - -### AR - -#### Registration statement - -```sql -create function ar as 'org.apache.iotdb.library.dlearn.UDTFAR' -``` - -#### Usage - -This function is used to learn the coefficients of the autoregressive models for a time series. - -**Name:** AR - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `p`: The order of the autoregressive model. Its default value is 1. - -**Output Series:** Output a single series. The type is DOUBLE. The first line corresponds to the first order coefficient, and so on. - -**Note:** - -- Parameter `p` should be a positive integer. -- Most points in the series should be sampled at a constant time interval. -- Linear interpolation is applied for the missing points in the series. - -#### Examples - -##### Assigning Model Order - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ar(s0,"p"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+---------------------------+ -| Time|ar(root.test.d0.s0,"p"="2")| -+-----------------------------+---------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.9429| -|1970-01-01T08:00:00.002+08:00| -0.2571| -+-----------------------------+---------------------------+ -``` - - -### Cluster - -#### 
Registration statement - -```sql -create function cluster as 'org.apache.iotdb.library.dlearn.UDTFCluster' -``` - -#### Usage - -This function takes a **single input time series**, splits it into **non-overlapping** contiguous subsequences (windows) of fixed length `l`, and clusters those subsequences into `k` groups. - -**Name:** Cluster - -**Input Series:** Only support single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. Points are read in time order; trailing samples that do not fill a full window are dropped (only `⌊n/l⌋` windows are used, where `n` is the number of valid points). - -**Parameters:** - -| Name | Meaning | Default | Notes | -|------|---------|---------|--------| -| `l` | Subsequence (window) length | (required) | Positive integer; each window has `l` consecutive samples. | -| `k` | Number of clusters | (required) | Integer ≥ 2. | -| `method` | Clustering algorithm | `kmeans` | Optional: `kmeans`, `kshape`, `medoidshape` (case-insensitive). Defaults to k-means if omitted. | -| `norm` | Z-score normalize each subsequence | `true` | Boolean; if `true`, each subsequence is standardized before clustering. | -| `maxiter` | Maximum iterations | `200` | Positive integer. | -| `output` | Output mode | `label` | `label`: one cluster id per window; `centroid`: concatenate the `k` centroid vectors in cluster order. | -| `sample_rate` | Greedy sampling rate | `0.3` | Used only when **`method` = `medoidshape`**; must be in `(0, 1]`. | - - -**`method` details:** - -- **kmeans**: k-means in Euclidean space (optionally after per-window normalization). -- **kshape**: Assign by shape-based distance (SBD from normalized cross-correlation, NCC); centroids updated via SVD on the cluster matrix. -- **medoidshape**: Coarsely cluster, then greedy selection of `k` representative subsequences; `sample_rate` controls how many candidates are sampled each round. 
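The windowing and per-window normalization described above can be sketched in Python. This is an illustrative sketch only: the helper names `split_windows` and `z_normalize` are hypothetical, and the use of the population standard deviation is an assumption about how `norm` = `true` standardizes a window. The clustering step itself is not shown.

```python
import math

def split_windows(series, l):
    """Non-overlapping windows of length l; trailing samples that do not fill a window are dropped."""
    count = len(series) // l
    return [series[i * l:(i + 1) * l] for i in range(count)]

def z_normalize(window):
    """Standardize one window (population standard deviation; an assumption)."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0:
        return [0.0 for _ in window]  # constant window: no shape information
    return [(x - mean) / std for x in window]

series = [1, 2, 3, 10, 20, 30, 1, 5, 1]
windows = split_windows(series, 3)
normalized = [z_normalize(w) for w in windows]
for w, z in zip(windows, normalized):
    print(w, [round(x, 3) for x in z])
```

After normalization, `{1,2,3}` and `{10,20,30}` have the same shape even though their scales differ, while `{1,5,1}` does not. That is consistent with the first two windows sharing a cluster label in the KShape example of this section.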
- -**Output Series:** Controlled by `output`: - -- **`output` = `label` (default):** One output series, type **INT32**. Number of points = number of full windows, `⌊n/l⌋`. Timestamp of each point = **time of the first sample** in that window; value = cluster id **0 … k−1**. -- **`output` = `centroid`:** One output series, type **DOUBLE**. Number of points = **`k × l`**: for clusters **0 → k−1**, emit the `l` components of each centroid in order (concatenated). Timestamps are `0, 1, 2, …` (placeholders only, no physical time meaning). - -**Note:** - -- Require valid point count `n ≥ l` and window count `⌊n/l⌋ ≥ k`. - -#### Examples - -##### KShape: window length 3, k = 2 - -Nine samples `{1,2,3,10,20,30,1,5,1}` form three non-overlapping windows `{1,2,3}`, `{10,20,30}`, `{1,5,1}`. With **`method` = `kshape`** (default `norm` = `true`), each output row is the cluster id for one window; timestamps are the window start times. Resulting labels: **0, 0, 1**. - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| 2.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:04.000+08:00| 10.0| -|2020-01-01T00:00:05.000+08:00| 20.0| -|2020-01-01T00:00:06.000+08:00| 30.0| -|2020-01-01T00:00:07.000+08:00| 1.0| -|2020-01-01T00:00:08.000+08:00| 5.0| -|2020-01-01T00:00:09.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select cluster(s0, "l"="3", "k"="2", "method"="kshape", "output"="label") -from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+----------------------------------------------------------------------------+ -| Time|cluster(root.test.d0.s0,"l"="3","k"="2","method"="kshape","output"="label")| -+-----------------------------+----------------------------------------------------------------------------+ 
-|2020-01-01T00:00:01.000+08:00| 0| -|2020-01-01T00:00:04.000+08:00| 0| -|2020-01-01T00:00:07.000+08:00| 1| -+-----------------------------+----------------------------------------------------------------------------+ -``` \ No newline at end of file diff --git a/src/UserGuide/V1.3.x/Tools-System/CLI_timecho.md b/src/UserGuide/V1.3.x/Tools-System/CLI_timecho.md deleted file mode 100644 index 07e4ab9ba..000000000 --- a/src/UserGuide/V1.3.x/Tools-System/CLI_timecho.md +++ /dev/null @@ -1,175 +0,0 @@ - - -# Command Line Interface (CLI) - - -IoTDB provides Cli/shell tools for users to interact with IoTDB server in command lines. This document shows how Cli/shell tool works and the meaning of its parameters. - -> Note: In this document, \$IOTDB\_HOME represents the path of the IoTDB installation directory. - -## Running Cli - -After installation, there is a default user in IoTDB: `root`, and the -default password is `root`. Users can use this username to try IoTDB Cli/Shell tool. The cli startup script is the `start-cli` file under the \$IOTDB\_HOME/bin folder. When starting the script, you need to specify the IP and PORT. (Make sure the IoTDB cluster is running properly when you use Cli/Shell tool to connect to it.) - -Here is an example where the cluster is started locally and the user has not changed the running port. The default rpc port is -6667
-If you need to connect to a remote DataNode or change -the rpc port number of the running DataNode, set the specific IP and RPC PORT with -h and -p.
-You also can set your own environment variable at the front of the start script ("/sbin/start-cli.sh" for linux and "/sbin/start-cli.bat" for windows) - -The Linux and MacOS system startup commands are as follows: - -```shell -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -``` - -The Windows system startup commands are as follows: - -```shell -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -``` - -After operating these commands, the cli can be started successfully. The successful status will be as follows: - -``` - _____ _________ ______ ______ -|_ _| | _ _ ||_ _ `.|_ _ \ - | | .--.|_/ | | \_| | | `. \ | |_) | - | | / .'`\ \ | | | | | | | __'. - _| |_| \__. | _| |_ _| |_.' /_| |__) | -|_____|'.__.' |_____| |______.'|_______/ version - - -Successfully login at 127.0.0.1:6667 -IoTDB> -``` - -Enter ```quit``` or `exit` can exit Cli. - -## Cli Parameters - -| **Parameter** | **Type** | **Required** | **Description** | **Example** | -| -------------------------- | -------- | ------------ |-----------------------------------------------------------------------------------| ------------------- | -| -h `` | string | No | The IP address of the IoTDB server. (Default: 127.0.0.1) | -h 127.0.0.1 | -| -p `` | int | No | The RPC port of the IoTDB server. (Default: 6667) | -p 6667 | -| -u `` | string | No | The username to connect to the IoTDB server. (Default: root) | -u root | -| -pw `` | string | No | The password to connect to the IoTDB server. (Default: root) | -pw root | -| -e `` | string | No | Batch operations in non-interactive mode. | -e "show databases" | -| -c | Flag | No | Required if rpc_thrift_compression_enable=true on the server. | -c | -| -disableISO8601 | Flag | No | If set, timestamps will be displayed as numeric values instead of ISO8601 format. 
| -disableISO8601 | -| -usessl `` | Boolean | No | Enable SSL connection | -usessl true | -| -ts `` | string | No | SSL certificate store path | -ts /path/to/truststore | -| -tpw `` | string | No | SSL certificate store password | -tpw myTrustPassword | -| -timeout `` | int | No | Query timeout (seconds). If not set, the server's configuration will be used. | -timeout 30 | -| -help | Flag | No | Displays help information for the CLI tool. | -help | - -Following is a cli command which connects the host with IP -10.129.187.21, rpc port 6667, username "root", password "root", and prints the timestamp in digital form. The maximum number of lines displayed on the IoTDB command line is 10. - -The Linux and MacOS system startup commands are as follows: - -```shell -Shell > bash sbin/start-cli.sh -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 -``` - -The Windows system startup commands are as follows: - -```shell -Shell > sbin\start-cli.bat -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 -``` - -## CLI Special Command - -Special commands of Cli are below. - -| Command | Description / Example | -| :-------------------------- | :------------------------------------------------------ | -| `set time_display_type=xxx` | eg. long, default, ISO8601, yyyy-MM-dd HH:mm:ss | -| `show time_display_type` | show time display type | -| `set time_zone=xxx` | eg. +08:00, Asia/Shanghai | -| `show time_zone` | show cli time zone | -| `set fetch_size=xxx` | set fetch size when querying data from server | -| `show fetch_size` | show fetch size | -| `set max_display_num=xxx` | set max lines for cli to output, -1 equals to unlimited | -| `help` | Get hints for CLI special commands | -| `exit/quit` | Exit CLI | - - -## Batch Operation of Cli - --e parameter is designed for the Cli/shell tool in the situation where you would like to manipulate IoTDB in batches through scripts. 
By using the -e parameter, you can operate IoTDB without entering the cli's input mode. - -To avoid confusion between statements and other parameters, the current version only supports the -e parameter as the last parameter. - -The usage of the -e parameter for Cli/shell is as follows: - -The Linux and MacOS system commands: - -```shell -Shell > bash sbin/start-cli.sh -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} -``` - -The Windows system commands: - -```shell -Shell > sbin\start-cli.bat -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} -``` - -In the Windows environment, the SQL statement passed to the -e parameter must use ` `` ` in place of `" "`. - -To better explain the use of the -e parameter, take the following as an example (on a Linux system). - -Suppose you want to create a database root.demo in a newly launched IoTDB, create a timeseries root.demo.s1 and insert three data points into it. With the -e parameter, you could write a shell script like this: - -```shell -#!/bin/bash - -host=127.0.0.1 -rpcPort=6667 -user=root -pass=root - -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "create database root.demo" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "create timeseries root.demo.s1 WITH DATATYPE=INT32, ENCODING=RLE" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(1,10)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(2,11)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(3,12)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "select s1 from root.demo" -``` - -The results are shown below and are consistent with those of the Cli and JDBC operations.
- -```shell - Shell > bash ./shell.sh -+-----------------------------+------------+ -| Time|root.demo.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.001+08:00| 10| -|1970-01-01T08:00:00.002+08:00| 11| -|1970-01-01T08:00:00.003+08:00| 12| -+-----------------------------+------------+ -Total line number = 3 -It costs 0.267s -``` - -It should be noted that the use of the -e parameter in shell scripts requires attention to the escaping of special characters. diff --git a/src/UserGuide/V1.3.x/Tools-System/Maintenance-Tool_timecho.md b/src/UserGuide/V1.3.x/Tools-System/Maintenance-Tool_timecho.md deleted file mode 100644 index d77787934..000000000 --- a/src/UserGuide/V1.3.x/Tools-System/Maintenance-Tool_timecho.md +++ /dev/null @@ -1,960 +0,0 @@ - -# Cluster management tool - -## IoTDB-OpsKit - -The IoTDB OpsKit is an easy-to-use operation and maintenance tool (enterprise version tool). -It is designed to solve the operation and maintenance problems of multiple nodes in the IoTDB distributed system. -It mainly includes cluster deployment, cluster start and stop, elastic expansion, configuration update, data export and other functions, thereby realizing one-click command issuance for complex database clusters, which greatly Reduce management difficulty. -This document will explain how to remotely deploy, configure, start and stop IoTDB cluster instances with cluster management tools. - -### Environment dependence - -This tool is a supporting tool for TimechoDB(Enterprise Edition based on IoTDB). You can contact your sales representative to obtain the tool download method. - -The machine where IoTDB is to be deployed needs to rely on jdk 8 and above, lsof, netstat, and unzip functions. If not, please install them yourself. You can refer to the installation commands required for the environment in the last section of the document. 
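The dependency list above can be checked with a small Python script before deployment. This is a hedged sketch rather than part of the tool: the name `missing_tools` is hypothetical, the script only checks the machine it runs on (not remote target nodes), and it verifies presence on `PATH` only, not that the JDK version is 8 or above.

```python
import shutil

# Commands named in the environment requirements; "java" stands in for the JDK.
REQUIRED_TOOLS = ["java", "lsof", "netstat", "unzip"]

def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from the list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

missing = missing_tools()
if missing:
    print("Missing prerequisites:", ", ".join(missing))
else:
    print("All prerequisites found on PATH")
```

Running it on each machine where IoTDB will be deployed gives a quick list of what still needs to be installed.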
- -Tip: The IoTDB cluster management tool requires an account with root privileges - -### Deployment method - -#### Download and install - -This tool is a supporting tool for TimechoDB(Enterprise Edition based on IoTDB). You can contact your salesperson to obtain the tool download method. - -Note: Since the binary package only supports GLIBC2.17 and above, the minimum version is Centos7. - -* After entering the following commands in the iotdb-opskit directory: - -```bash -bash install-iotdbctl.sh -``` - -The iotdbctl keyword can be activated in the subsequent shell, such as checking the environment instructions required before deployment as follows: - -```bash -iotdbctl cluster check example -``` - -* You can also directly use <iotdbctl absolute path>/sbin/iotdbctl without activating iotdbctl to execute commands, such as checking the environment required before deployment: - -```bash -/sbin/iotdbctl cluster check example -``` - -### Introduction to cluster configuration files - -* There is a cluster configuration yaml file in the `iotdbctl/config` directory. The yaml file name is the cluster name. There can be multiple yaml files. In order to facilitate users to configure yaml files, a `default_cluster.yaml` example is provided under the iotdbctl/config directory. -* The yaml file configuration consists of five major parts: `global`, `confignode_servers`, `datanode_servers`, `grafana_server`, and `prometheus_server` -* `global` is a general configuration that mainly configures machine username and password, IoTDB local installation files, Jdk configuration, etc. A `default_cluster.yaml` sample data is provided in the `iotdbctl/config` directory, - Users can copy and modify it to their own cluster name and refer to the instructions inside to configure the IoTDB cluster. In the `default_cluster.yaml` sample, all uncommented items are required, and those that have been commented are non-required. 
- -For example, to run the check command against `default_cluster.yaml`, execute `iotdbctl cluster check default_cluster`. -For more commands, refer to the command list below. - - -| parameter name | parameter description | required | -|-------------------------|----------------------------------------------------------------------------------------------------------------------------------|----------| -| iotdb\_zip\_dir | IoTDB distribution package directory. If the value is empty, the package is downloaded from the address specified by `iotdb_download_url` | NO | -| iotdb\_download\_url | IoTDB download address. If `iotdb_zip_dir` has no value, the package is downloaded from this address | NO | -| jdk\_tar\_dir | Local JDK directory; the JDK at this path is uploaded and deployed to the target node. | NO | -| jdk\_deploy\_dir | JDK deployment directory on the remote machine. The JDK is deployed to this directory, which together with the following `jdk_dir_name` parameter forms the complete JDK deployment directory, that is, `/` | NO | -| jdk\_dir\_name | Directory name after JDK decompression; defaults to jdk_iotdb | NO | -| iotdb\_lib\_dir | The IoTDB lib directory or a compressed IoTDB lib package (only .zip format is supported); used only for IoTDB upgrades. It is commented out by default. To upgrade, uncomment it and modify the path. If you use a zip file, compress the iotdb/lib directory with the zip command, such as zip -r lib.zip apache-iotdb-1.2.0/lib/* | NO | -| user | User name for ssh login to the deployment machine | YES | -| password | The password for ssh login. If no password is specified, pkey is used to log in; in that case, please ensure that key-based ssh login between nodes has been configured.
| NO | -| pkey | Key login: if password has a value, password is used first; otherwise pkey is used to log in. | NO | -| ssh\_port | ssh port | YES | -| deploy\_dir | IoTDB deployment directory. IoTDB is deployed to this directory, which together with the following `iotdb_dir_name` parameter forms the complete IoTDB deployment directory, that is, `/` | YES | -| iotdb\_dir\_name | Directory name after IoTDB decompression; defaults to iotdb. | NO | -| datanode-env.sh | Corresponds to `iotdb/config/datanode-env.sh`. When `global` and `datanode_servers` are configured at the same time, the value in `datanode_servers` is used first | NO | -| confignode-env.sh | Corresponds to `iotdb/config/confignode-env.sh`. When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first | NO | -| iotdb-system.properties | Corresponds to `/config/iotdb-system.properties` | NO | -| cn\_internal\_address | The cluster configuration address points to the surviving ConfigNode, and it points to confignode_x by default. When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first, corresponding to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_internal\_address | The cluster configuration address points to the surviving ConfigNode, and points to confignode_x by default. When `global` and `datanode_servers` are configured at the same time, the value in `datanode_servers` is used first, corresponding to `dn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | - -Both datanode-env.sh and confignode-env.sh accept the extra parameter extra_opts. When it is configured, the corresponding values are appended to datanode-env.sh and confignode-env.sh.
Refer to default_cluster.yaml for configuration examples as follows: -datanode-env.sh: -  extra_opts: | -    IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC" -    IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200" - -* `confignode_servers` is the configuration for deploying IoTDB Confignodes, in which multiple Confignodes can be configured. - By default, the first started ConfigNode (node1) is regarded as the Seed-ConfigNode. - -| parameter name | parameter description | required | -|-----------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------| -| name | Confignode name | YES | -| deploy\_dir | IoTDB config node deployment directory | YES | -| cn\_internal\_address | Internal communication address, corresponding to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| cn_internal_address | The cluster configuration address points to the surviving ConfigNode, and it points to confignode_x by default. When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first, corresponding to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| cn\_internal\_port | Internal communication port, corresponding to `cn_internal_port` in `iotdb/config/iotdb-system.properties` | YES | -| cn\_consensus\_port | Corresponds to `cn_consensus_port` in `iotdb/config/iotdb-system.properties` | NO | -| cn\_data\_dir | Corresponds to `cn_data_dir` in `iotdb/config/iotdb-system.properties` | YES | -| iotdb-system.properties | Corresponds to `iotdb/config/iotdb-system.properties`. When values are configured in both `global` and `confignode_servers`, the value in `confignode_servers` is used first.
| NO | - -* `datanode_servers` is the configuration for deploying IoTDB DataNodes, in which multiple DataNodes can be configured - -| parameter name | parameter description | required | -|-------------------------|-----------------------|----------| -| name | Datanode name | YES | -| deploy\_dir | IoTDB data node deployment directory | YES | -| dn\_rpc\_address | The datanode rpc address, corresponding to `dn_rpc_address` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_internal\_address | Internal communication address, corresponding to `dn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_seed\_config\_node | The cluster configuration address points to the surviving ConfigNode, and points to confignode_x by default. When configuring values for `global` and `datanode_servers` at the same time, the value in `datanode_servers` is used first, corresponding to `dn_seed_config_node` in `iotdb/config/iotdb-system.properties`. | YES | -| dn\_rpc\_port | Datanode rpc port, corresponding to `dn_rpc_port` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_internal\_port | Internal communication port, corresponding to `dn_internal_port` in `iotdb/config/iotdb-system.properties` | YES | -| iotdb-system.properties | Corresponding to `iotdb/config/iotdb-system.properties`, when configuring values in `global` and `datanode_servers` at the same time, the value in `datanode_servers` will be used first.
| NO | - -* `grafana_server` is the configuration related to deploying Grafana - -| parameter name | parameter description | required | -|--------------------|-----------------------|----------| -| grafana\_dir\_name | Grafana decompression directory name (default grafana_iotdb) | NO | -| host | IP of the server where Grafana is deployed | YES | -| grafana\_port | The port of the Grafana deployment machine, default 3000 | NO | -| deploy\_dir | Grafana deployment directory on the server | YES | -| grafana\_tar\_dir | Grafana compressed package location | YES | -| dashboards | dashboards directory | NO | - -* `prometheus_server` is the configuration related to deploying Prometheus - -| parameter name | parameter description | required | -|--------------------------------|-----------------------|----------| -| prometheus\_dir\_name | Prometheus decompression directory name, default prometheus_iotdb | NO | -| host | IP of the server where Prometheus is deployed | YES | -| prometheus\_port | The port of the Prometheus deployment machine, default 9090 | NO | -| deploy\_dir | Prometheus deployment directory on the server | YES | -| prometheus\_tar\_dir | Prometheus compressed package path | YES | -| storage\_tsdb\_retention\_time | The number of days to save data, 15 days by default | NO | -| storage\_tsdb\_retention\_size | The data size that can be saved by the specified block, 512M by default. Please note the units are KB, MB, GB, TB, PB, and EB. | NO | - -If metrics are configured in `iotdb-system.properties` in config/xxx.yaml, the configuration is automatically added to Prometheus without manual modification. - -Note: If the value for a yaml key contains special characters, it is recommended to wrap the entire value in double quotes. Do not use file paths that contain spaces, to prevent recognition problems.
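
The quoting advice in the note above can be checked before a value is written into config/xxx.yaml. The following is a minimal sketch, not part of `iotdbctl`: the helper names `path_is_safe` and `yaml_quote` are hypothetical, and the sample value is made up for illustration.

```shell
# Hypothetical helpers (not shipped with iotdbctl) for preparing values
# that will be pasted into config/xxx.yaml.

# Succeeds only if the path contains no space character.
path_is_safe() {
  case "$1" in
    *" "*) return 1 ;;
    *) return 0 ;;
  esac
}

# Print the value as a double-quoted YAML scalar, escaping \ and ".
yaml_quote() {
  printf '"%s"\n' "$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')"
}

# Example: a made-up value containing ':' and '#'.
yaml_quote 'pa:ss#word'   # prints "pa:ss#word" (including the quotes)
```

Wrapping the whole value in double quotes keeps characters such as `:` and `#` from being read as YAML syntax, which is the failure mode the note warns about.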
- -### Usage scenarios - -#### Clean data - -* The clean data scenario deletes the data directory in the IoTDB cluster, together with the `cn_system_dir`, `cn_consensus_dir`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories configured in the yaml file. -* First execute the stop cluster command, and then execute the cluster cleanup command. - -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster clean default_cluster -``` - -#### Cluster destruction - -* The cluster destruction scenario deletes `data`, `cn_system_dir`, `cn_consensus_dir`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, `ext`, the IoTDB deployment directory, the Grafana deployment directory and the Prometheus deployment directory. -* First execute the stop cluster command, and then execute the cluster destruction command. - - -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster destroy default_cluster -``` - -#### Cluster upgrade - -* To upgrade the cluster, first set `iotdb_lib_dir` in config/xxx.yaml to the directory that contains the jars to be uploaded to the server (for example, iotdb/lib). -* If you upload a zip file, compress the iotdb/lib directory with the zip command, for example `zip -r lib.zip apache-iotdb-1.2.0/lib/*` -* Execute the upload command and then the restart command to complete the cluster upgrade. - -```bash -iotdbctl cluster dist-lib default_cluster -iotdbctl cluster restart default_cluster -``` - -#### Hot deployment - -* First modify the configuration in config/xxx.yaml.
-* Execute the distribution command, and then execute the hot deployment command to complete the hot deployment of the cluster configuration - -```bash -iotdbctl cluster dist-conf default_cluster -iotdbctl cluster reload default_cluster -``` - -#### Cluster expansion - -* First add a datanode or confignode node in config/xxx.yaml. -* Execute the cluster expansion command - -```bash -iotdbctl cluster scaleout default_cluster -``` - -#### Cluster shrink - -* First find the node name or ip+port of the node to remove in config/xxx.yaml (the confignode port is `cn_internal_port`, the datanode port is `dn_rpc_port`) -* Execute the cluster shrink command - -```bash -iotdbctl cluster scalein default_cluster -``` - -#### Using cluster management tools to manipulate existing IoTDB clusters - -* Configure the server's `user`, `password` or `pkey`, `ssh_port` -* Modify the IoTDB deployment path in config/xxx.yaml, `deploy_dir` (IoTDB deployment directory), `iotdb_dir_name` (IoTDB decompression directory name, the default is iotdb) - For example, if the full path of IoTDB deployment is `/home/data/apache-iotdb-1.1.1`, you need to modify the yaml file to `deploy_dir:/home/data/` and `iotdb_dir_name:apache-iotdb-1.1.1` -* If the server is not using java_home, modify `jdk_deploy_dir` (jdk deployment directory) and `jdk_dir_name` (the directory name after jdk decompression, the default is jdk_iotdb). If java_home is used, there is no need to modify the configuration. - For example, if the full path of jdk deployment is `/home/data/jdk_1.8.2`, you need to modify the yaml file to `jdk_deploy_dir:/home/data/`, `jdk_dir_name:jdk_1.8.2` -* Configure `cn_internal_address`, `dn_internal_address` -* Configure `cn_internal_address`, `cn_internal_port`, `cn_consensus_port`, `cn_system_dir` and `cn_consensus_dir` in `iotdb-system.properties` under `confignode_servers`. Values that differ from the IoTDB defaults must be configured; values left at the default do not need to be configured.
-* Configure `dn_rpc_address`, `dn_internal_address`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir` in `iotdb-system.properties` -* Execute the initialization command - -```bash -iotdbctl cluster init default_cluster -``` - -#### Deploy IoTDB, Grafana and Prometheus - -* Configure `iotdb-system.properties` to open the metrics interface -* Configure Grafana. If there are multiple `dashboards`, separate them with commas. The names must be unique, otherwise they will be overwritten. -* Configure Prometheus. If the IoTDB cluster is configured with metrics, there is no need to manually modify the Prometheus configuration. It is modified automatically according to which nodes are configured with metrics. -* Start the cluster - -```bash -iotdbctl cluster start default_cluster -``` - -For more detailed parameters, please refer to the cluster configuration file introduction above - -### Command - -The basic usage of this tool is: -```bash -iotdbctl cluster <key> <cluster name> [params (Optional)] -``` -* key indicates a specific command. - -* cluster name indicates the cluster name (that is, the name of the yaml file in the `iotdbctl/config` directory). - -* params indicates the parameters of the command (optional).
- -* For example, the command format to deploy the default_cluster cluster is: - -```bash -iotdbctl cluster deploy default_cluster -``` - -* The functions and parameters of the cluster are listed as follows: - -| command | description | parameter | -|-----------------|-----------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| check | check whether the cluster can be deployed | Cluster name list | -| clean | cleanup-cluster | cluster-name | -| deploy/dist-all | deploy cluster | Cluster name, -N, module name (optional for iotdb, grafana, prometheus), -op force (optional) | -| list | cluster status list | None | -| start | start cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional) | -| stop | stop cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional), -op force (nodename, grafana, prometheus optional) | -| restart | restart cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional), -op force (nodename, grafana, prometheus optional) | -| show | view cluster information. The details field indicates the details of the cluster information. 
| Cluster name, details (optional) | -| destroy | destroy cluster | Cluster name, -N, module name (iotdb, grafana, prometheus optional) | -| scaleout | cluster expansion | Cluster name | -| scalein | cluster shrink | Cluster name, -N, cluster node name or cluster node ip+port | -| reload | hot loading of cluster configuration files | Cluster name | -| dist-conf | cluster configuration file distribution | Cluster name | -| dumplog | Back up specified cluster logs | Cluster name, -N, cluster node name -h Back up to target machine ip -pw Back up to target machine password -p Back up to target machine port -path Backup directory -startdate Start time -enddate End time -loglevel Log type -l transfer speed | -| dumpdata | Back up cluster data | Cluster name, -h backup to target machine ip -pw backup to target machine password -p backup to target machine port -path backup directory -startdate start time -enddate end time -l transmission speed | -| dist-lib | lib package upgrade | Cluster name | -| init | When an existing cluster uses the cluster deployment tool, initialize the cluster configuration | Cluster name | -| status | View process status | Cluster name | -| activate | Activate cluster | Cluster name | -| health_check | health check | Cluster name, -N, nodename (optional) | -| backup | Cluster shutdown backup | Cluster name, -N nodename (optional) | -| importschema | Import cluster metadata | Cluster name, -N nodename -param parameters | -| exportschema | Export cluster metadata | Cluster name, -N nodename -param parameters | - - - -### Detailed command execution process - -The following commands use default_cluster.yaml as an example; users can substitute their own cluster files - -#### Check cluster deployment environment commands - -```bash -iotdbctl cluster check default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* 
Verify that the target node is able to log in via SSH - -* Verify whether the JDK version on the corresponding node meets IoTDB jdk1.8 and above, and whether the server is installed with unzip, lsof, and netstat. - -* If you see the following prompt `Info:example check successfully!`, it proves that the server has already met the installation requirements. - If `Error:example check fail!` is output, it proves that some conditions do not meet the requirements. You can check the Error log output above (for example: `Error:Server (ip:172.20.31.76) iotdb port(10713) is listening`) to make repairs. , - If the jdk check does not meet the requirements, we can configure a jdk1.8 or above version in the yaml file ourselves for deployment without affecting subsequent use. - If checking lsof, netstat or unzip does not meet the requirements, you need to install it on the server yourself. - -#### Deploy cluster command - -```bash -iotdbctl cluster deploy default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Upload IoTDB compressed package and jdk compressed package according to the node information in `confignode_servers` and `datanode_servers` (if `jdk_tar_dir` and `jdk_deploy_dir` values ​​are configured in yaml) - -* Generate and upload `iotdb-system.properties` according to the yaml file node configuration information - -```bash -iotdbctl cluster deploy default_cluster -op force -``` - -Note: This command will force the deployment, and the specific process will delete the existing deployment directory and redeploy - -*deploy a single module* -```bash -# Deploy grafana module -iotdbctl cluster deploy default_cluster -N grafana -# Deploy the prometheus module -iotdbctl cluster deploy default_cluster -N prometheus -# Deploy the iotdb module -iotdbctl cluster deploy default_cluster -N iotdb -``` - -#### Start cluster command - -```bash -iotdbctl 
cluster start default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Start the confignodes sequentially according to the order in `confignode_servers` in the yaml configuration file and check whether each confignode is normal according to the process id. The first confignode is the Seed-ConfigNode - -* Start the datanodes in sequence according to the order in `datanode_servers` in the yaml configuration file and check whether each datanode is normal according to the process id. - -* After checking the existence of the process according to the process id, check whether each service in the cluster list is normal through the cli. If the cli connection fails, retry every 10s, up to 5 times - - -*Start a single node command* -```bash -# Start according to the IoTDB node name -iotdbctl cluster start default_cluster -N datanode_1 -# Start according to IoTDB cluster ip+port, where port corresponds to cn_internal_port of confignode and dn_rpc_port of datanode. -iotdbctl cluster start default_cluster -N 192.168.1.5:6667 -# Start grafana -iotdbctl cluster start default_cluster -N grafana -# Start prometheus -iotdbctl cluster start default_cluster -N prometheus -``` - -* Find the yaml file in the default location based on cluster-name - -* Find the node location information based on the provided node name or ip:port. If the started node is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file.
- If the started node is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - -* start the node - -Note: Since the cluster deployment tool only calls the start-confignode.sh and start-datanode.sh scripts in the IoTDB cluster, -When the actual output result fails, it may be that the cluster has not started normally. It is recommended to use the status command to check the current cluster status (iotdbctl cluster status xxx) - - -#### View IoTDB cluster status command - -```bash -iotdbctl cluster show default_cluster -#View IoTDB cluster details -iotdbctl cluster show default_cluster details -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Execute `show cluster details` through cli on datanode in turn. If one node is executed successfully, it will not continue to execute cli on subsequent nodes and return the result directly. - -#### Stop cluster command - - -```bash -iotdbctl cluster stop default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* According to the datanode node information in `datanode_servers`, stop the datanode nodes in order according to the configuration. 
- -* Based on the confignode node information in `confignode_servers`, stop the confignode nodes in sequence according to the configuration - -*force stop cluster command* - -```bash -iotdbctl cluster stop default_cluster -op force -``` -Will directly execute the kill -9 pid command to forcibly stop the cluster - -*Stop single node command* - -```bash -#Stop by IoTDB node name -iotdbctl cluster stop default_cluster -N datanode_1 -#Stop according to IoTDB cluster ip+port (ip+port is to get the only node according to ip+dn_rpc_port in datanode or ip+cn_internal_port in confignode to get the only node) -iotdbctl cluster stop default_cluster -N 192.168.1.5:6667 -#Stop grafana -iotdbctl cluster stop default_cluster -N grafana -#Stop prometheus -iotdbctl cluster stop default_cluster -N prometheus -``` - -* Find the yaml file in the default location based on cluster-name - -* Find the corresponding node location information based on the provided node name or ip:port. If the stopped node is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file. - If the stopped node is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - -* stop the node - -Note: Since the cluster deployment tool only calls the stop-confignode.sh and stop-datanode.sh scripts in the IoTDB cluster, in some cases the iotdb cluster may not be stopped. - - -#### Clean cluster data command - -```bash -iotdbctl cluster clean default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Based on the information in `confignode_servers` and `datanode_servers`, check whether there are still services running, - If any service is running, the cleanup command will not be executed. 
- -* Delete the data directory in the IoTDB cluster and the `cn_system_dir`, `cn_consensus_dir`, configured in the yaml file - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories. - - - -#### Restart cluster command - -```bash -iotdbctl cluster restart default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` - -* Execute the above stop cluster command (stop), and then execute the start cluster command (start). For details, refer to the above start and stop commands. - -*Force restart cluster command* - -```bash -iotdbctl cluster restart default_cluster -op force -``` -Will directly execute the kill -9 pid command to force stop the cluster, and then start the cluster - - -*Restart a single node command* - -```bash -#Restart datanode_1 according to the IoTDB node name -iotdbctl cluster restart default_cluster -N datanode_1 -#Restart confignode_1 according to the IoTDB node name -iotdbctl cluster restart default_cluster -N confignode_1 -#Restart grafana -iotdbctl cluster restart default_cluster -N grafana -#Restart prometheus -iotdbctl cluster restart default_cluster -N prometheus -``` - -#### Cluster shrink command - -```bash -#Scale down by node name -iotdbctl cluster scalein default_cluster -N nodename -#Scale down according to ip+port (ip+port obtains the only node according to ip+dn_rpc_port in datanode, and obtains the only node according to ip+cn_internal_port in confignode) -iotdbctl cluster scalein default_cluster -N ip:port -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Determine whether there is only one confignode node and datanode to be reduced. If there is only one left, the reduction cannot be performed. 
- -* Then get the information of the node to shrink according to ip:port or nodename, execute the shrink command, and then destroy the node directory. If the node to shrink is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file. - If the node to shrink is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - - -Tip: Currently, only one node can be removed at a time - -#### Cluster expansion command - -```bash -iotdbctl cluster scaleout default_cluster -``` -* Modify the config/xxx.yaml file to add a datanode node or confignode node - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Find the node to be expanded, upload the IoTDB compressed package and jdk package (if the `jdk_tar_dir` and `jdk_deploy_dir` values are configured in yaml) and decompress them - -* Generate and upload `iotdb-system.properties` according to the yaml file node configuration information - -* Execute the command to start the node and verify whether the node is started successfully - -Tip: Currently, only one node can be added at a time - -#### Destroy cluster command -```bash -iotdbctl cluster destroy default_cluster -``` - -* Find the yaml file in the default location based on cluster-name - -* Check whether any node is still running based on the node information in `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus`.
- Stop the destroy command if any node is running - -* Delete `data` in the IoTDB cluster and `cn_system_dir`, `cn_consensus_dir` configured in the yaml file - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, `ext`, `IoTDB` deployment directory, - grafana deployment directory and prometheus deployment directory - -*Destroy a single module* - -```bash -# Destroy grafana module -iotdbctl cluster destroy default_cluster -N grafana -# Destroy prometheus module -iotdbctl cluster destroy default_cluster -N prometheus -# Destroy iotdb module -iotdbctl cluster destroy default_cluster -N iotdb -``` - -#### Distribute cluster configuration commands - -```bash -iotdbctl cluster dist-conf default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` - -* Generate and upload `iotdb-system.properties` to the specified node according to the node configuration information of the yaml file - -#### Hot load cluster configuration command - -```bash -iotdbctl cluster reload default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Execute `load configuration` in the cli according to the node configuration information of the yaml file. 
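
The hot deployment flow (distribute the configuration, then hot-load it) can be wrapped so that the reload only runs when distribution succeeds. This is a minimal sketch that uses only the `dist-conf` and `reload` commands described above; the function name `hot_deploy` and the `IOTDBCTL` override variable are assumptions made for illustration and are not part of the tool.

```shell
# IOTDBCTL is a hypothetical override so the sequence can be dry-run,
# for example with IOTDBCTL=echo. It defaults to the real command.
IOTDBCTL=${IOTDBCTL:-iotdbctl}

# Distribute the edited configuration, then hot-load it.
# Stops before the reload if the distribution step fails.
hot_deploy() {
  cluster=${1:-default_cluster}
  $IOTDBCTL cluster dist-conf "$cluster" || return 1
  $IOTDBCTL cluster reload "$cluster"
}
```

Running `IOTDBCTL=echo` before calling `hot_deploy default_cluster` prints the two commands instead of executing them, which is a cheap way to review the sequence first.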
- -#### Cluster node log backup -```bash -iotdbctl cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs' -``` - -* Find the yaml file in the default location based on cluster-name - -* This command will verify the existence of datanode_1 and confignode_1 according to the yaml file, and then back up the log data of the specified node datanode_1 and confignode_1 to the specified service `192.168.9.48` port 36000 according to the configured start and end dates (startdate<=logtime<=enddate) The data backup path is `/iotdb/logs`, and the IoTDB log storage path is `/root/data/db/iotdb/logs` (not required, if you do not fill in -logs xxx, the default is to backup logs from the IoTDB installation path /logs ) - -| command | description | required | -|------------|-------------------------------------------------------------------------|----------| -| -h | backup data server ip | NO | -| -u | backup data server username | NO | -| -pw | backup data machine password | NO | -| -p | backup data machine port(default 22) | NO | -| -path | path to backup data (default current path) | NO | -| -loglevel | Log levels include all, info, error, warn (default is all) | NO | -| -l | speed limit (default 1024 speed limit range 0 to 104857601 unit Kbit/s) | NO | -| -N | multiple configuration file cluster names are separated by commas. 
| YES | -| -startdate | start time (including default 1970-01-01) | NO | -| -enddate | end time (included) | NO | -| -logs | IoTDB log storage path, the default is ({iotdb}/logs)) | NO | - -#### Cluster data backup -```bash -iotdbctl cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas' -``` -* This command will obtain the leader node based on the yaml file, and then back up the data to the /iotdb/datas directory on the 192.168.9.48 service based on the start and end dates (startdate<=logtime<=enddate) - -| command | description | required | -|--------------|-------------------------------------------------------------------------|----------| -| -h | backup data server ip | NO | -| -u | backup data server username | NO | -| -pw | backup data machine password | NO | -| -p | backup data machine port(default 22) | NO | -| -path | path to backup data (default current path) | NO | -| -granularity | partition | YES | -| -l | speed limit (default 1024 speed limit range 0 to 104857601 unit Kbit/s) | NO | -| -startdate | start time (including default 1970-01-01) | YES | -| -enddate | end time (included) | YES | - -#### Cluster upgrade -```bash -iotdbctl cluster dist-lib default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Upload lib package - -Note that after performing the upgrade, please restart IoTDB for it to take effect. 
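
Because the uploaded lib package only takes effect after a restart, the upgrade is in practice a short sequence. The sketch below chains the `dist-lib`, `restart` and `status` commands from the command table; the function name `upgrade_cluster` and the `IOTDBCTL` override variable are hypothetical, and appending a `status` check at the end is a suggestion rather than a documented requirement.

```shell
# IOTDBCTL is a hypothetical override for dry runs (e.g. IOTDBCTL=echo).
IOTDBCTL=${IOTDBCTL:-iotdbctl}

# Upload the lib package, restart so it takes effect, then show process status.
# Aborts at the first failing step and reports which one failed.
upgrade_cluster() {
  cluster=${1:-default_cluster}
  for step in dist-lib restart status; do
    $IOTDBCTL cluster "$step" "$cluster" || {
      echo "step '$step' failed for $cluster" >&2
      return 1
    }
  done
}
```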
- -#### Cluster initialization -```bash -iotdbctl cluster init default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` -* Initialize cluster configuration - -#### View cluster process status -```bash -iotdbctl cluster status default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` -* Display the survival status of each node in the cluster - -#### Cluster authorization activation - -Cluster activation is activated by entering the activation code by default, or by using the - op license_path activated through license path - -* Default activation method -```bash -iotdbctl cluster activate default_cluster -``` -* Find the yaml file in the default location based on `cluster-name` and obtain the `confignode_servers` configuration information -* Obtain the machine code inside -* Waiting for activation code input - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* Activate a node - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -``` - -* Activate through license path - -```bash -iotdbctl cluster activate default_cluster -op license_path -``` -* Find the yaml file in the default location based on `cluster-name` and obtain the `confignode_servers` configuration information -* Obtain the machine code inside -* Waiting for activation code input - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== 
-JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* Activate a node - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -op license_path -``` - -#### Cluster Health Check -```bash -iotdbctl cluster health_check default_cluster -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve confignode_servers and datanode_servers configuration information. -* Execute health_check.sh on each node. -* Single Node Health Check -```bash -iotdbctl cluster health_check default_cluster -N datanode_1 -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute health_check.sh on datanode1. - -#### Cluster Shutdown Backup - -```bash -iotdbctl cluster backup default_cluster -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve confignode_servers and datanode_servers configuration information. -* Execute backup.sh on each node - -* Single Node Backup - -```bash -iotdbctl cluster backup default_cluster -N datanode_1 -``` - -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute backup.sh on datanode1. -Note: Multi-node deployment on a single machine only supports quick mode. - -#### Cluster Metadata Import -```bash -iotdbctl cluster importschema default_cluster -N datanode1 -param "-s ./dump0.csv -fd ./failed/ -lpf 10000" -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute metadata import with import-schema.sh on datanode1. 
-* Parameters for -param are as follows: - -| command | description | required | -|------------|-------------------------------------------------------------------------|----------| -| -s | Specify the data file to be imported. You can specify a file or a directory. If a directory is specified, all files with a .csv extension in the directory will be imported in bulk. | YES | -| -fd | Specify a directory to store failed import files. If this parameter is not specified, failed files will be saved in the source data directory with the extension .failed added to the original filename. | No | -| -lpf | Specify the number of lines written to each failed import file. The default is 10000.| NO | - -#### Cluster Metadata Export - -```bash -iotdbctl cluster exportschema default_cluster -N datanode1 -param "-t ./ -pf ./pattern.txt -lpf 10 -t 10000" -``` - -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute metadata export with export-schema.sh on datanode1. -* Parameters for -param are as follows: - -| command | description | required | -|-------------|-------------------------------------------------------------------------|----------| -| -t | Specify the output path for the exported CSV file. | YES | -| -path | Specify the path pattern for exporting metadata. If this parameter is specified, the -s parameter will be ignored. Example: root.stock.** | NO | -| -pf | If -path is not specified, this parameter must be specified. It designates the file path containing the metadata paths to be exported, supporting txt file format. Each path to be exported is on a new line.| NO | -| -lpf | Specify the maximum number of lines for the exported dump file. 
The default is 10000.| NO | -| -timeout | Specify the timeout for session queries in milliseconds.| NO | - - - -### Introduction to Cluster Deployment Tool Samples - -In the cluster deployment tool installation directory config/example, there are three yaml examples. If necessary, you can copy them to config and modify them. - -| name | description | -|-----------------------------|------------------------------------------------| -| default\_1c1d.yaml | 1 confignode and 1 datanode configuration example | -| default\_3c3d.yaml | 3 confignode and 3 datanode configuration samples | -| default\_3c3d\_grafa\_prome | 3 confignode and 3 datanode, Grafana, Prometheus configuration examples | - - -## IoTDB Data Directory Overview Tool - -IoTDB data directory overview tool is used to print an overview of the IoTDB data directory structure. The location is tools/tsfile/print-iotdb-data-dir. - -### Usage - -- For Windows: - -```bash -.\print-iotdb-data-dir.bat () -``` - -- For Linux or MacOs: - -```shell -./print-iotdb-data-dir.sh () -``` - -Note: if the storage path of the output overview file is not set, the default relative path "IoTDB_data_dir_overview.txt" will be used. - -### Example - -Use Windows in this example: - -`````````````````````````bash -.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data -```````````````````````` -Starting Printing the IoTDB Data Directory Overview -```````````````````````` -output save path:IoTDB_data_dir_overview.txt -data dir num:1 -143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
-|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -````````````````````````` - -## TsFile Sketch Tool - -TsFile sketch tool is used to print the content of a TsFile in sketch mode. The location is tools/tsfile/print-tsfile. - -### Usage - -- For Windows: - -``` -.\print-tsfile-sketch.bat () -``` - -- For Linux or MacOs: - -``` -./print-tsfile-sketch.sh () -``` - -Note: if the storage path of the output sketch file is not set, the default relative path "TsFile_sketch_view.txt" will be used. - -### Example - -Use Windows in this example: - -`````````````````````````bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
--------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] 
offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - └──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -````````````````````````` - -Explanations: - -- Separated by "|", the left is the actual position in the TsFile, and the right is the summary content. -- "||||||||||||||||||||" is the guide information added to enhance readability, not the actual data stored in TsFile. -- The last printed "IndexOfTimerseriesIndex Tree" is a reorganization of the metadata index tree at the end of the TsFile, which is convenient for intuitive understanding, and again not the actual data stored in TsFile. - -## TsFile Resource Sketch Tool - -TsFile resource sketch tool is used to print the content of a TsFile resource file. The location is tools/tsfile/print-tsfile-resource-files. 
- -### Usage - -- For Windows: - -```bash -.\print-tsfile-resource-files.bat -``` - -- For Linux or MacOs: - -``` -./print-tsfile-resource-files.sh -``` - -### Example - -Use Windows in this example: - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. 
-````````````````````````` - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. -````````````````````````` diff --git a/src/UserGuide/V1.3.x/Tools-System/Monitor-Tool_timecho.md b/src/UserGuide/V1.3.x/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index 5e0964932..000000000 --- a/src/UserGuide/V1.3.x/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,182 +0,0 @@ - - -# Monitor Tool - -The deployment of monitoring tools can refer to the document [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) section. 
- -## Prometheus - -### The mapping from metric type to prometheus format - -> For metrics whose Metric Name is name and Tags are K1=V1, ..., Kn=Vn, the mapping is as follows, where value is a -> specific value - -| Metric Type | Mapping | -| ---------------- | ------------------------------------------------------------ | -| Counter | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value | -| AutoGauge、Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value | -| Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.5"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.99"} value | -| Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m1"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m5"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m15"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="mean"} value | -| Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.5"} value
 name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.99"} value | - -### Config File - -1) Taking DataNode as an example, modify the iotdb-system.properties configuration file as follows: - -```properties -dn_metric_reporter_list=PROMETHEUS -dn_metric_level=CORE -dn_metric_prometheus_reporter_port=9091 -``` - -Then you can get metrics data as follows - -2) Start IoTDB DataNodes -3) Open a browser or use ```curl``` to visit ```http://server_ip:9091/metrics```, you can get the following metric - data: - -``` -... -# HELP file_count -# TYPE file_count gauge -file_count{name="wal",} 0.0 -file_count{name="unseq",} 0.0 -file_count{name="seq",} 2.0 -... -``` - -### Prometheus + Grafana - -As shown above, IoTDB exposes monitoring metrics data in the standard Prometheus format to the outside world. Prometheus -can be used to collect and store monitoring indicators, and Grafana can be used to visualize monitoring indicators. - -The following picture describes the relationships among IoTDB, Prometheus and Grafana - -![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png) - -1. While running, IoTDB continuously collects its metrics. -2. Prometheus scrapes metrics from IoTDB at a constant interval (can be configured). -3. Prometheus saves these metrics to its internal TSDB. -4. Grafana queries metrics from Prometheus at a constant interval (can be configured) and then presents them on the - graph. - -So, we need to do some additional work to configure and deploy Prometheus and Grafana. 
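The text returned by the `/metrics` endpoint can also be read by a short script, which is handy for a quick check before Prometheus is deployed. The following is a minimal Python sketch, assuming only the line shapes shown in the mapping table and in the sample output above. `parse_metrics` and the two regular expressions are illustrative names, not part of IoTDB or Prometheus, and the sketch does not cover the full exposition format.

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
SAMPLE_RE = re.compile(
    r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# One label pair inside the braces, e.g. name="wal".
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text-format samples into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, "# HELP" / "# TYPE" comments and "..." elisions.
        if not line or line.startswith("#") or line == "...":
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # Not a sample line this sketch understands.
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


# Sample taken from the output shown above.
EXAMPLE = '''
# HELP file_count
# TYPE file_count gauge
file_count{name="wal",} 0.0
file_count{name="unseq",} 0.0
file_count{name="seq",} 2.0
'''

for name, labels, value in parse_metrics(EXAMPLE):
    print(name, labels, value)
```

In practice the text would come from an HTTP request to the reporter port configured in `dn_metric_prometheus_reporter_port`. The sketch keeps the input as a string so that it runs without a live DataNode.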
 - -You can configure Prometheus as follows to collect metrics data from IoTDB: - -```yaml -job_name: pull-metrics -honor_labels: true -honor_timestamps: true -scrape_interval: 15s -scrape_timeout: 10s -metrics_path: /metrics -scheme: http -follow_redirects: true -static_configs: - - targets: - - localhost:9091 -``` - -The following documents may help you get started with Prometheus and Grafana. - -[Prometheus getting_started](https://prometheus.io/docs/prometheus/latest/getting_started/) - -[Prometheus scrape metrics](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) - -[Grafana getting_started](https://grafana.com/docs/grafana/latest/getting-started/getting-started/) - -[Grafana query metrics from Prometheus](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus) - -## Apache IoTDB Dashboard - -We introduce the Apache IoTDB Dashboard, designed for unified centralized operations and management. With it, multiple clusters can be monitored through a single panel. - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png) - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png) - - -You can access the Dashboard's JSON file in the enterprise edition. - -### Cluster Overview - -Including but not limited to: - -- Total cluster CPU cores, memory space, and hard disk space. -- Number of ConfigNodes and DataNodes in the cluster. -- Cluster uptime duration. -- Cluster write speed. -- Current CPU, memory, and disk usage across all nodes in the cluster. -- Information on individual nodes. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png) - - -### Data Writing - -Including but not limited to: - -- Average write latency, median latency, and the 99th percentile latency. -- Number and size of WAL files. -- Node WAL flush SyncBuffer latency. 
- -![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png) - -### Data Querying - -Including but not limited to: - -- Node query load times for time series metadata. -- Node read duration for time series. -- Node edit duration for time series metadata. -- Node query load time for Chunk metadata list. -- Node edit duration for Chunk metadata. -- Node filtering duration based on Chunk metadata. -- Average time to construct a Chunk Reader. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png) - -### Storage Engine - -Including but not limited to: - -- File count and sizes by type. -- The count and size of TsFiles at various stages. -- Number and duration of various tasks. - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png) - -### System Monitoring - -Including but not limited to: - -- System memory, swap memory, and process memory. -- Disk space, file count, and file sizes. -- JVM GC time percentage, GC occurrences by type, GC volume, and heap memory usage across generations. -- Network transmission rate, packet sending rate - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png) diff --git a/src/UserGuide/V1.3.x/Tools-System/Workbench_timecho.md b/src/UserGuide/V1.3.x/Tools-System/Workbench_timecho.md deleted file mode 100644 index 8b124a643..000000000 --- a/src/UserGuide/V1.3.x/Tools-System/Workbench_timecho.md +++ /dev/null @@ -1,33 +0,0 @@ -# WorkBench - -The deployment of the visualization console can refer to the document [Workbench Deployment](../Deployment-and-Maintenance/workbench-deployment_timecho.md) chapter. - -## Product Introduction -IoTDB Visualization Console is an extension component developed for industrial scenarios based on the IoTDB Enterprise Edition time series database. 
It integrates real-time data collection, storage, and analysis, aiming to provide users with efficient and reliable real-time data storage and query solutions. It features lightweight, high performance, and ease of use, seamlessly integrating with the Hadoop and Spark ecosystems. It is suitable for high-speed writing and complex analytical queries of massive time series data in industrial IoT applications. - -## Instructions for Use -| **Functional Module** | **Functional Description** | -| ---------------------- | ------------------------------------------------------------ | -| Instance Management | Support unified management of connected instances, support creation, editing, and deletion, while visualizing the relationships between multiple instances, helping customers manage multiple database instances more clearly | -| Home | Support viewing the service running status of each node in the database instance (such as activation status, running status, IP information, etc.), support viewing the running monitoring status of clusters, ConfigNodes, and DataNodes, monitor the operational health of the database, and determine if there are any potential operational issues with the instance. | -| Measurement Point List | Support directly viewing the measurement point information in the instance, including database information (such as database name, data retention time, number of devices, etc.), and measurement point information (measurement point name, data type, compression encoding, etc.), while also supporting the creation, export, and deletion of measurement points either individually or in batches. | -| Data Model | Support viewing hierarchical relationships and visually displaying the hierarchical model. | -| Data Query | Support interface-based query interactions for common data query scenarios, and enable batch import and export of queried data. 
| -| Statistical Query | Support interface-based query interactions for common statistical data scenarios, such as outputting results for maximum, minimum, average, and sum values. | -| SQL Operations | Support interactive SQL operations on the database through a graphical user interface, allowing for the execution of single or multiple statements, and displaying and exporting the results. | -| Trend | Support one-click visualization to view the overall trend of data, draw real-time and historical data for selected measurement points, and observe the real-time and historical operational status of the measurement points. | -| Analysis | Support visualizing data through different analysis methods (such as FFT) for visualization. | -| View | Support viewing information such as view name, view description, result measuring points, and expressions through the interface. Additionally, enable users to quickly create, edit, and delete views through interactive interfaces. | -| Data synchronization | Support the intuitive creation, viewing, and management of data synchronization tasks between databases. Enable direct viewing of task running status, synchronized data, and target addresses. Users can also monitor changes in synchronization status in real-time through the interface. | -| Permission management | Support interface-based control of permissions for managing and controlling database user access and operations. | -| Audit logs | Support detailed logging of user operations on the database, including Data Definition Language (DDL), Data Manipulation Language (DML), and query operations. Assist users in tracking and identifying potential security threats, database errors, and misuse behavior. 
 | - -Main feature showcase -* Home -![Home.png](/img/%E9%A6%96%E9%A1%B5.png) -* Measurement Point List -![Measurement point list.png](/img/workbench-en-bxzk.png) -* Data Query -![Data query.png](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2.png) -* Trend -![Historical trend.png](/img/%E5%8E%86%E5%8F%B2%E8%B6%8B%E5%8A%BF.png) \ No newline at end of file diff --git a/src/UserGuide/V1.3.x/User-Manual/Audit-Log_timecho.md b/src/UserGuide/V1.3.x/User-Manual/Audit-Log_timecho.md deleted file mode 100644 index 976b7da17..000000000 --- a/src/UserGuide/V1.3.x/User-Manual/Audit-Log_timecho.md +++ /dev/null @@ -1,93 +0,0 @@ - - -# Security Audit - -## Background of the function - -An audit log is the record of the operations performed on a database. It can be queried to trace user operations such as adding, deleting, modifying, and querying data, which helps ensure information security. With the audit log function of IoTDB, the following scenarios can be achieved: - -- Decide whether to record audit logs according to the source of the connection (human operation or not). For example, non-human operations such as data written by a hardware collector do not need audit logs, while human operations such as ordinary users working with the data through the CLI, Workbench, and other tools do. -- Filter out system-level write operations, such as those recorded by the IoTDB monitoring system itself. - -### Scene Description - -#### Logging all operations (add, delete, change, check) of all users - -The audit log function traces all user operations in the database. The information recorded should include data operations (add, delete, query), metadata operations (add, modify, delete, query), and client login information (user name, IP address). 
 - -Client Sources: -- CLI, Workbench, Zeppelin, Grafana, and requests received through protocols such as Session/JDBC/MQTT - -![](/img/audit-log.png) - -#### Audit logging can be turned off for some user connections - -No audit logs are required for data written by the hardware collector via Session/JDBC/MQTT if it is a non-human action. - -## Function Definition - -The following can be controlled through configuration: - -- Decide whether to enable the audit function or not -- Decide where to output the audit logs; one or more of the following outputs are supported - 1. log file - 2. IoTDB storage -- Decide whether to exclude native interface writes from auditing, so that recording too many audit logs does not affect performance. -- Decide the content categories of the audit log; one or more of the following can be recorded - 1. data addition and deletion operations - 2. data and metadata query operations - 3. metadata addition, modification, and deletion operations. - -### configuration item - -In iotdb-system.properties, change the following configurations: - -```YAML -#################### -### Audit log Configuration -#################### - -# whether to enable the audit log. 
 -# Datatype: Boolean -# enable_audit_log=false - -# Output location of audit logs -# Datatype: String -# IOTDB: the stored time series is: root.__system.audit._{user} -# LOGGER: log_audit.log in the log directory -# audit_log_storage=IOTDB,LOGGER - -# whether enable audit log for DML operation of data -# whether enable audit log for DDL operation of schema -# whether enable audit log for QUERY operation of data and schema -# Datatype: String -# audit_log_operation=DML,DDL,QUERY - -# whether the local write api records audit logs -# Datatype: Boolean -# This contains Session insert api: insertRecord(s), insertTablet(s),insertRecordsOfOneDevice -# MQTT insert api -# RestAPI insert api -# This parameter will cover the DML in audit_log_operation -# enable_audit_log_for_native_insert_api=true -``` - diff --git a/src/UserGuide/V1.3.x/User-Manual/Data-Sync-old_timecho.md b/src/UserGuide/V1.3.x/User-Manual/Data-Sync-old_timecho.md deleted file mode 100644 index 0fb367524..000000000 --- a/src/UserGuide/V1.3.x/User-Manual/Data-Sync-old_timecho.md +++ /dev/null @@ -1,613 +0,0 @@ - - -# Data Sync - -Data synchronization is a typical requirement in industrial Internet of Things (IoT). Through data synchronization mechanisms, it is possible to achieve data sharing between IoTDB instances, and to establish a complete data link to meet the needs for internal and external network data interconnectivity, edge-cloud synchronization, data migration, and data backup. - -## Function Overview - -### Data Synchronization - -A data synchronization task consists of three stages: - -![](/img/sync_en_01.png) - -- Source Stage: This part is used to extract data from the source IoTDB, defined in the source section of the SQL statement. -- Process Stage: This part is used to process the data extracted from the source IoTDB, defined in the processor section of the SQL statement. -- Sink Stage: This part is used to send data to the target IoTDB, defined in the sink section of the SQL statement. 
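The three stages map onto the three `WITH` sections of a `CREATE PIPE` statement, which the Create Task section describes. As a rough illustration, the Python sketch below assembles such a statement from one attribute mapping per stage. `build_create_pipe` is a hypothetical helper, not an IoTDB API. The `'source' = 'iotdb-source'` pair is an assumption made by analogy with the documented `'sink' = 'iotdb-thrift-sink'` pair, and values are not escaped.

```python
def build_create_pipe(pipe_id, sink, source=None, processor=None):
    """Assemble a CREATE PIPE statement from per-stage attribute mappings.

    Hypothetical helper for illustration only. The sink mapping is required;
    source and processor are optional, as in the CREATE PIPE syntax.
    """
    if not sink:
        raise ValueError("the sink section is required")

    def section(keyword, attrs):
        # Render one "WITH <KEYWORD> ( 'key' = 'value', ... )" block.
        body = ",\n".join(f"  '{key}' = '{value}'" for key, value in attrs.items())
        return f"WITH {keyword} (\n{body}\n)"

    parts = [f"CREATE PIPE {pipe_id}"]
    # Sections are emitted in the fixed order SOURCE, PROCESSOR, SINK.
    if source:
        parts.append(section("SOURCE", source))
    if processor:
        parts.append(section("PROCESSOR", processor))
    parts.append(section("SINK", sink))
    return "\n".join(parts)


statement = build_create_pipe(
    "A2B",
    sink={"sink": "iotdb-thrift-sink", "node-urls": "127.0.0.1:6668"},
    source={"source": "iotdb-source"},  # assumed key, see the note above
)
print(statement)
```

Keeping each stage as its own mapping mirrors the declarative style of the SQL statement: a stage is left out of the statement simply by not passing its mapping.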
- -By declaratively configuring the specific content of the three parts through SQL statements, flexible data synchronization capabilities can be achieved. Currently, data synchronization supports the synchronization of the following information, and you can select the synchronization scope when creating a synchronization task (the default is data.insert, which means synchronizing newly written data): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-| Synchronization Scope | | Synchronization Content Description |
-|-----------------------|------------|-------------------------------------|
-| all | | All scopes |
-| data | insert | Synchronize newly written data |
-| data | delete | Synchronize deleted data |
-| schema | database | Synchronize database creation, modification or deletion operations |
-| schema | timeseries | Synchronize the definition and attributes of time series |
-| schema | TTL | Synchronize the data retention time |
-| auth | - | Synchronize user permissions and access control |
- -### Functional limitations and instructions - -The schema and auth synchronization functions have the following limitations: - -- When using schema synchronization, it is required that the consensus protocol of `Schema region` and `ConfigNode` must be the default ratis protocol, that is: In the `iotdb-common.properties` configuration file, both the `config_node_consensus_protocol_class` and `schema_region_consensus_protocol_class` configuration items are set to `org.apache.iotdb.consensus.ratis.RatisConsensus`. - -- To prevent potential conflicts, please turn off the automatic creation of metadata on the receiving end when enabling schema synchronization. You can do this by setting the `enable_auto_create_schema` configuration in the `iotdb-common.properties` configuration file to false. - -- When schema synchronization is enabled, the use of custom plugins is not supported. - -- In a dual-active cluster, schema synchronization should avoid simultaneous operations on both ends. - -- During data synchronization tasks, please avoid performing any deletion operations to prevent inconsistent states between the two ends. - -## Usage Instructions - -Data synchronization tasks have three states: RUNNING, STOPPED, and DROPPED. The task state transitions are shown in the following diagram: - - -V1.3.0 and earlier versions: - -After creation, it will not start immediately and needs to execute the `START PIPE` statement to start the task. - -![](/img/sync_en_02.png) - -V1.3.1 and later versions: - -After creation, the task will start directly, and when the task stops abnormally, the system will automatically attempt to restart the task. - -![](/img/Data-Sync02.png) - -Provide the following SQL statements for state management of synchronization tasks. - -### Create Task - -Use the `CREATE PIPE` statement to create a data synchronization task. The `PipeId` and `sink` attributes are required, while `source` and `processor` are optional. 
When entering the SQL, note that the order of the `SOURCE` and `SINK` plugins cannot be swapped. - -The SQL example is as follows: - -```SQL -CREATE PIPE -- PipeId is the name that uniquely identifies the task. --- Data extraction plugin, optional plugin -WITH SOURCE ( - [ = ,], -) --- Data processing plugin, optional plugin -WITH PROCESSOR ( - [ = ,], -) --- Data connection plugin, required plugin -WITH SINK ( - [ = ,], -) -``` - -### Start Task - -Start processing data: - -```SQL -START PIPE -``` - -### Stop Task - -Stop processing data: - -```SQL -STOP PIPE -``` - -### Delete Task - -Deletes the specified task: - -```SQL -DROP PIPE -``` - -Deleting a task does not require stopping the synchronization task first. - -### View Task - -View all tasks: - -```SQL -SHOW PIPES -``` - -To view a specified task: - -```SQL -SHOW PIPE -``` - -Example of the show pipes result for a pipe: - -```SQL -+--------------------------------+-----------------------+-------+---------------+--------------------+------------------------------------------------------------+----------------+ -| ID| CreationTime| State| PipeSource| PipeProcessor| PipeSink|ExceptionMessage| -+--------------------------------+-----------------------+-------+---------------+--------------------+------------------------------------------------------------+----------------+ -|3421aacb16ae46249bac96ce4048a220|2024-08-13T09:55:18.717|RUNNING| {}| {}|{{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}}| | -+--------------------------------+-----------------------+-------+---------------+--------------------+------------------------------------------------------------+----------------+ -``` - -The meanings of each column are as follows: - -- **ID**:The unique identifier for the synchronization task -- **CreationTime**:The time when the synchronization task was created -- **State**:The state of the synchronization task -- **PipeSource**:The source of the synchronized data stream -- **PipeProcessor**:The 
processing logic of the synchronized data stream during transmission -- **PipeSink**:The destination of the synchronized data stream -- **ExceptionMessage**:Displays the exception information of the synchronization task - - -### Synchronization Plugins - -To make the overall architecture more flexible to match different synchronization scenario requirements, we support plugin assembly within the synchronization task framework. The system comes with some pre-installed common plugins that you can use directly. At the same time, you can also customize processor plugins and Sink plugins, and load them into the IoTDB system for use. You can view the plugins in the system (including custom and built-in plugins) with the following statement: - -```SQL -SHOW PIPEPLUGINS -``` - -The return result is as follows (version 1.3.2): - -```SQL -IoTDB> SHOW PIPEPLUGINS -+---------------------+----------+-------------------------------------------------------------------------------------------+----------------------------------------------------+ -| PluginName|PluginType| ClassName| PluginJar| -+---------------------+----------+-------------------------------------------------------------------------------------------+----------------------------------------------------+ -| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.processor.donothing.DoNothingProcessor| | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.donothing.DoNothingConnector| | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector| | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.extractor.iotdb.IoTDBExtractor| | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector| | -|IOTDB-THRIFT-SSL-SINK| Builtin|org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector| | 
-+---------------------+----------+-------------------------------------------------------------------------------------------+----------------------------------------------------+ -``` - -Detailed introduction of pre-installed plugins is as follows (for detailed parameters of each plugin, please refer to the [Parameter Description](#reference-parameter-description) section): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-| Type | Custom Plugin | Plugin Name | Description | Applicable Version |
-| :--- | :--- | :--- | :--- | :--- |
-| source plugin | Not Supported | iotdb-source | The default extractor plugin, used to extract historical or real-time data from IoTDB | 1.2.x |
-| processor plugin | Supported | do-nothing-processor | The default processor plugin, which does not process the incoming data | 1.2.x |
-| sink plugin | Supported | do-nothing-sink | Does not process the data that is sent out | 1.2.x |
-| sink plugin | Supported | iotdb-thrift-sink | The default sink plugin (V1.3.1+), used for data transfer between IoTDB (V1.2.0+) and IoTDB (V1.2.0+). It uses the Thrift RPC framework to transfer data, with a multi-threaded async non-blocking IO model and high transfer performance, especially suitable for scenarios where the target end is distributed | 1.2.x |
-| sink plugin | Supported | iotdb-air-gap-sink | Used for data synchronization across unidirectional data diodes from IoTDB (V1.2.0+) to IoTDB (V1.2.0+). Supported diode models include Nanrui Syskeeper 2000, etc. | 1.2.x |
-| sink plugin | Supported | iotdb-thrift-ssl-sink | Used for data transfer between IoTDB (V1.3.1+) and IoTDB (V1.2.0+). It uses the Thrift RPC framework to transfer data, with a single-threaded sync blocking IO model, suitable for scenarios with higher security requirements | 1.3.1+ |
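
A plugin from the table above is selected by passing its name as the value of the `sink` (or `source` / `processor`) key in `CREATE PIPE`. The following is a minimal sketch only: the pipe name `smoke_test` is hypothetical, and it assumes that `do-nothing-sink` needs no parameters beyond its name.

```SQL
-- Hypothetical pipe: extracts data with the default source and discards it.
-- Assumes do-nothing-sink takes no additional parameters.
create pipe smoke_test
with sink (
  'sink' = 'do-nothing-sink'
)
```

Because the sink discards everything it receives, a pipe like this is only useful for checking that a task can be created and listed with `SHOW PIPES`.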
-
-For importing custom plugins, please refer to the [Stream Processing](./Streaming_timecho.md#custom-stream-processing-plugin-management) section.
-
-## Use examples
-
-### Full data synchronisation
-
-This example demonstrates synchronizing all data from one IoTDB to another. The data link is shown below:
-
-![](/img/pipe1.jpg)
-
-In this example, we create a synchronization task named A2B to synchronize the full data from IoTDB A to IoTDB B. The sink must use the built-in iotdb-thrift-sink plugin, and the URL of the data service port of a DataNode on the target IoTDB must be configured through node-urls, as shown in the following example statement:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### Partial data synchronization
-
-This example demonstrates synchronizing data from a historical time range (08:00 on 23 August 2023 to 08:00 on 23 October 2023) to another IoTDB. The data link is shown below:
-
-![](/img/pipe2.jpg)
-
-In this example, we create a synchronization task named A2B. First, we define the range of data to be transferred in the source. Since the data being transferred is historical data (data that existed before the synchronization task was created), we need to configure the start-time and end-time of the data and the transfer mode (realtime.mode). The URL of the data service port of a DataNode on the target IoTDB must be configured through node-urls.
-
-The detailed statements are as follows:
-
-```SQL
-create pipe A2B
-WITH SOURCE (
-  'source'= 'iotdb-source',
-  'realtime.mode' = 'stream', -- The extraction mode for newly inserted data (after pipe creation)
-  'start-time' = '2023.08.23T08:00:00+00:00', -- The start event time for synchronizing all data, including start-time
-  'end-time' = '2023.10.23T08:00:00+00:00' -- The end event time for synchronizing all data, including end-time
-)
-with SINK (
-  'sink'='iotdb-thrift-async-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### Bidirectional data transfer
-
-This example demonstrates the scenario where two IoTDB instances act as an active-active pair, with the data link shown in the figure below:
-
-![](/img/pipe3.jpg)
-
-In this example, to avoid infinite data loops, the `forwarding-pipe-requests` parameter on both A and B must be set to `false`, so that data received from another pipe is not forwarded. To keep the data consistent on both sides, each pipe must also be configured with `inclusion=all` to synchronize full data and metadata.
- -The detailed statement is as follows: - -On A IoTDB, execute the following statement: - -```SQL -create pipe AB -with source ( - 'inclusion'='all', -- Indicates synchronization of full data, schema , and auth - 'forwarding-pipe-requests' = 'false' -- Do not forward data written by other Pipes -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -On B IoTDB, execute the following statement: - -```SQL -create pipe BA -with source ( - 'inclusion'='all', -- Indicates synchronization of full data, schema , and auth - 'forwarding-pipe-requests' = 'false' -- Do not forward data written by other Pipes -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -### Edge-cloud data transfer - -This example is used to demonstrate the scenario where data from multiple IoTDB is transferred to the cloud, with data from clusters B, C, and D all synchronized to cluster A, as shown in the figure below: - -![](/img/sync_en_03.png) - -In this example, to synchronize the data from clusters B, C, and D to A, the pipe between BA, CA, and DA needs to configure the `path` to limit the range, and to keep the edge and cloud data consistent, the pipe needs to be configured with `inclusion=all` to synchronize full data and metadata. 
The detailed statement is as follows:
-
-On B IoTDB, execute the following statement to synchronize data from B to A:
-
-```SQL
-create pipe BA
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'path'='root.db.**', -- Limit the range
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-On C IoTDB, execute the following statement to synchronize data from C to A:
-
-```SQL
-create pipe CA
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'path'='root.db.**', -- Limit the range
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-On D IoTDB, execute the following statement to synchronize data from D to A:
-
-```SQL
-create pipe DA
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'path'='root.db.**', -- Limit the range
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### Cascading data transfer
-
-This example demonstrates the scenario where data is transferred in a cascading manner between multiple IoTDB clusters, with data from cluster A synchronized to cluster B, and then to cluster C, as shown in the figure below:
-
-![](/img/sync_en_04.png)
-
-In this example, to synchronize the data from cluster A to C, the `forwarding-pipe-requests` parameter must be set to `true` on the pipe from B to C.
The detailed statement is as follows:
-
-On A IoTDB, execute the following statement to synchronize data from A to B:
-
-```SQL
-create pipe AB
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-On B IoTDB, execute the following statement to synchronize data from B to C:
-
-```SQL
-create pipe BC
-with source (
-  'forwarding-pipe-requests' = 'true' -- Whether to forward data written by other Pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6669', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### Cross-gate data transfer
-
-This example demonstrates the scenario where data from one IoTDB is synchronized to another IoTDB through a unidirectional gateway, as shown in the figure below:
-
-![](/img/cross-network-gateway.png)
-
-
-In this example, the sink must use the iotdb-air-gap-sink plugin (only some gateway models are currently supported; please contact Timecho staff to confirm specific models). After configuring the gateway, execute the following statement on A IoTDB. Set node-urls to the URL of the data service port of the DataNode on the target IoTDB, as configured by the gateway:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-air-gap-sink',
-  'node-urls' = '10.53.53.53:9780', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-
-### Encrypted Synchronization (V1.3.1+)
-
-IoTDB supports SSL encryption during synchronization, ensuring the secure transfer of data between different IoTDB instances. By configuring the SSL-related parameters, namely the trust store certificate path (`ssl.trust-store-path`) and password (`ssl.trust-store-pwd`), data is protected by SSL encryption during the synchronization process.
- -For example, to create a synchronization task named A2B: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-ssl-sink', - 'node-urls'='127.0.0.1:6667', -- The URL of the data service port of the DataNode node on the target IoTDB - 'ssl.trust-store-path'='pki/trusted', -- The trust store certificate path required to connect to the target DataNode - 'ssl.trust-store-pwd'='root' -- The trust store certificate password required to connect to the target DataNode -) -``` - -## Reference: Notes - -You can adjust the parameters for data synchronization by modifying the IoTDB configuration file (`iotdb-common.properties`), such as the directory for storing synchronized data. The complete configuration is as follows: - -V1.3.0/1/2: - -```Properties -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. 
-# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` - -## Reference: parameter description - -### source parameter(V1.3.0) - -| key | value | value range | required or not | default value | -| :------------------------------ | :----------------------------------------------------------- | :------------------------------------- | :------- | :------------- | -| source | iotdb-source | String: iotdb-source | required | - | -| source.pattern | Used to filter the path prefix of time series | String: any time series prefix | optional | root | -| source.history.enable | Whether to send historical data | Boolean: true / false | optional | true | -| source.history.start-time | The start event time for synchronizing historical data, including start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional | Long.MIN_VALUE | -| source.history.end-time | The end event time for synchronizing historical data, including end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional | Long.MAX_VALUE | -| source.realtime.enable | Whether to send real-time data | Boolean: true / false | optional | true | -| source.realtime.mode | The extraction mode for newly inserted data (after pipe creation) | String: stream, batch | optional | stream | -| source.forwarding-pipe-requests | Whether to forward data written by other Pipes (usually data synchronization) | Boolean: true, false | optional | true | -| source.history.loose-range | When transferring tsfile, whether to relax the historical data (before pipe creation) range. 
"": Do not relax the range, select data strictly according to the set conditions "time": Relax the time range to avoid splitting TsFile, which can improve synchronization efficiency | String: "" / "time" | optional | Empty String | - -> 💎 **Explanation: Difference between Historical Data and Real-time Data** -> - **Historical Data**: All data with arrival time < the current system time when the pipe is created is called historical data. -> - **Real-time Data**:All data with arrival time >= the current system time when the pipe is created is called real-time data. -> - **Full Data**: Full data = Historical data + Real-time data -> -> 💎 **Explanation: Differences between Stream and Batch Data Extraction Modes** -> - **stream (recommended)**: In this mode, tasks process and send data in real-time. It is characterized by high timeliness and low throughput. -> - **batch**: In this mode, tasks process and send data in batches (according to the underlying data files). It is characterized by low timeliness and high throughput. - -### source parameter(V1.3.1) - -> In versions 1.3.1 and above, the parameters no longer require additional source, processor, and sink prefixes. 
- -| key | value | value range | required or not | default value | -| :----------------------- | :----------------------------------------------------------- | :------------------------------------- | :------- | :------------- | -| source | iotdb-source | String: iotdb-source | Required | - | -| pattern | Used to filter the path prefix of time series | String: any time series prefix | Optional | root | -| start-time | The start event time for synchronizing all data, including start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | Optional | Long.MIN_VALUE | -| end-time | The end event time for synchronizing all data, including end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | Optional | Long.MAX_VALUE | -| realtime.mode | The extraction mode for newly inserted data (after pipe creation) | String: stream, batch | Optional | stream | -| forwarding-pipe-requests | Whether to forward data written by other Pipes (usually data synchronization) | Boolean: true, false | Optional | true | -| history.loose-range | When transferring tsfile, whether to relax the historical data (before pipe creation) range. "": Do not relax the range, select data strictly according to the set conditions "time": Relax the time range to avoid splitting TsFile, which can improve synchronization efficiency | String: "" / "time" | Optional | Empty String | - -> 💎 **Explanation**:To maintain compatibility with lower versions, history.enable, history.start-time, history.end-time, realtime.enable can still be used, but they are not recommended in the new version. -> -> 💎 **Explanation: Differences between Stream and Batch Data Extraction Modes** -> - **stream (recommended)**: In this mode, tasks process and send data in real-time. It is characterized by high timeliness and low throughput. -> - **batch**: In this mode, tasks process and send data in batches (according to the underlying data files). It is characterized by low timeliness and high throughput. 
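
To tie the table and notes above together, the following sketch writes the same batch-mode pipe twice: once with the prefixed keys listed for V1.3.0 and once with the unprefixed keys accepted from V1.3.1. The pipe names `A2B_v130` and `A2B_v131` are hypothetical, and the target address simply reuses the one from the usage examples; only one of the two statements would be run on a given version.

```SQL
-- V1.3.0 style (assumed from the V1.3.0 tables): keys carry the source. / sink. prefix
create pipe A2B_v130
with source (
  'source' = 'iotdb-source',
  'source.realtime.mode' = 'batch'  -- send data in batches, by underlying data file
)
with sink (
  'sink' = 'iotdb-thrift-sink',
  'sink.node-urls' = '127.0.0.1:6668'
)

-- V1.3.1+ style: the same settings without the prefixes
create pipe A2B_v131
with source (
  'source' = 'iotdb-source',
  'realtime.mode' = 'batch'
)
with sink (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668'
)
```

Switching `'batch'` back to `'stream'` (the default) trades throughput for timeliness, as described in the explanation above.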
- -### source parameter(V1.3.2) - -> In versions 1.3.1 and above, the parameters no longer require additional source, processor, and sink prefixes. - -| key | value | value range | required or not | default value | -| :----------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :------------- | -| source | iotdb-source | String: iotdb-source | Required | - | -| inclusion | Used to specify the range of data to be synchronized in the data synchronization task, including data, schema, and auth | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | Optional | data.insert | -| inclusion.exclusion | Used to exclude specific operations from the range specified by inclusion, reducing the amount of data synchronized | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | Optional | - | -| path | Used to filter the path pattern schema of time series and data to be synchronized / schema synchronization can only use pathpath is exact matching, parameters must be prefix paths or complete paths, i.e., cannot contain `"*"`, at most one `"**"` at the end of the path parameter | String:IoTDB pattern | Optional | root.** | -| pattern | Used to filter the path prefix of time series | String: Optional | Optional | root | -| start-time | The start event time for synchronizing all data, including start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | Optional | Long.MIN_VALUE | -| end-time | The end event time for synchronizing all data, including end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | Optional | Long.MAX_VALUE | -| realtime.mode | The extraction mode for newly inserted data (after pipe creation) | String: stream, batch | Optional | stream | -| forwarding-pipe-requests | Whether to forward data written by other Pipes (usually data synchronization) | Boolean: true, false | Optional | true | -| history.loose-range | When transferring 
tsfile, whether to relax the historical data (before pipe creation) range. "": Do not relax the range, select data strictly according to the set conditions "time": Relax the time range to avoid splitting TsFile, which can improve synchronization efficiency | String: "" 、 "time" | Optional | "" | -| mods.enable | Whether to send the mods file of tsfile | Boolean: true / false | Optional | false | - -> 💎 **Explanation**:To maintain compatibility with lower versions, history.enable, history.start-time, history.end-time, realtime.enable can still be used, but they are not recommended in the new version. -> -> 💎 **Explanation: Differences between Stream and Batch Data Extraction Modes** -> - **stream (recommended)**: In this mode, tasks process and send data in real-time. It is characterized by high timeliness and low throughput. -> - **batch**: In this mode, tasks process and send data in batches (according to the underlying data files). It is characterized by low timeliness and high throughput. - -### sink parameter - -> In versions 1.3.1 and above, the parameters no longer require additional source, processor, and sink prefixes. - -#### iotdb-thrift-sink( V1.3.0/1/2) - - -| key | value | value Range | required or not | Default Value | -| :--------------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :----------- | -| sink | iotdb-thrift-sink or iotdb-thrift-async-sink | String: iotdb-thrift-sink or iotdb-thrift-async-sink | Required | | -| sink.node-urls | The URL of the data service port of any DataNode nodes on the target IoTDB (please note that synchronization tasks do not support forwarding to its own service) | String. 
Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - | -| sink.batch.enable | Whether to enable batched log transmission mode to improve transmission throughput and reduce IOPS | Boolean: true, false | Optional | true | -| sink.batch.max-delay-seconds | Effective when batched log transmission mode is enabled, it represents the maximum waiting time for a batch of data before sending (unit: s) | Integer | Optional | 1 | -| sink.batch.size-bytes | Effective when batched log transmission mode is enabled, it represents the maximum batch size for a batch of data (unit: byte) | Long | Optional | 16*1024*1024 | - -#### iotdb-air-gap-sink( V1.3.0/1/2) - -| key | value | value Range | required or not | Default Value | -| :--------------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :----------- | -| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | Required | - | -| sink.node-urls | The URL of the data service port of any DataNode nodes on the target IoTDB | String. Example: :'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - | -| sink.air-gap.handshake-timeout-ms | The timeout duration of the handshake request when the sender and receiver first attempt to establish a connection, unit: ms | Integer | Optional | 5000 | - - -#### iotdb-thrift-ssl-sink( V1.3.1/2) - -| key | value | value Range | required or not | Default Value | -| :---------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :----------- | -| sink | iotdb-thrift-ssl-sink | String: iotdb-thrift-ssl-sink | Required | - | -| node-urls | The URL of the data service port of any DataNode nodes on the target IoTDB (please note that synchronization tasks do not support forwarding to its own service) | String. 
Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - |
-| batch.enable | Whether to enable batched log transmission mode to improve transmission throughput and reduce IOPS | Boolean: true, false | Optional | true |
-| batch.max-delay-seconds | Effective when batched log transmission mode is enabled, it represents the maximum waiting time for a batch of data before sending (unit: s) | Integer | Optional | 1 |
-| batch.size-bytes | Effective when batched log transmission mode is enabled, it represents the maximum batch size for a batch of data (unit: byte) | Long | Optional | 16*1024*1024 |
-| ssl.trust-store-path | The trust store certificate path required to connect to the target DataNode | String: certificate directory name, when configured as a relative directory, it is relative to the IoTDB root directory. Example: 'pki/trusted' | Required | - |
-| ssl.trust-store-pwd | The trust store certificate password required to connect to the target DataNode | Integer | Required | - |
diff --git a/src/UserGuide/V1.3.x/User-Manual/Data-Sync_timecho.md b/src/UserGuide/V1.3.x/User-Manual/Data-Sync_timecho.md
deleted file mode 100644
index da7d96a57..000000000
--- a/src/UserGuide/V1.3.x/User-Manual/Data-Sync_timecho.md
+++ /dev/null
@@ -1,664 +0,0 @@
-
-
-# Data Sync
-
-Data synchronization is a typical requirement in the industrial Internet of Things (IoT). Data synchronization mechanisms make it possible to share data between IoTDB instances and to establish a complete data link that meets the needs of internal and external network data interconnectivity, edge-cloud synchronization, data migration, and data backup.
-
-## Function Overview
-
-### Data Synchronization
-
-A data synchronization task consists of three stages:
-
-![](/img/sync_en_01.png)
-
-- Source Stage: This part is used to extract data from the source IoTDB, defined in the source section of the SQL statement.
-- Process Stage: This part is used to process the data extracted from the source IoTDB, defined in the processor section of the SQL statement.
-- Sink Stage: This part is used to send data to the target IoTDB, defined in the sink section of the SQL statement.
-
-By declaratively configuring these three parts through SQL statements, flexible data synchronization can be achieved. Data synchronization currently supports the following scopes, and you can select the scope when creating a synchronization task (the default is data.insert, which synchronizes newly written data):
-
-| Synchronization Scope | Synchronization Content | Description |
-| :--- | :--- | :--- |
-| all | - | All scopes |
-| data | insert | Synchronize newly written data |
-| data | delete | Synchronize deleted data |
-| schema | database | Synchronize database creation, modification or deletion operations |
-| schema | timeseries | Synchronize the definition and attributes of time series |
-| schema | TTL | Synchronize the data retention time |
-| auth | - | Synchronize user permissions and access control |
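
As a sketch of how a scope from the table above is chosen, the statement below widens the default `data.insert` scope to everything except permissions. It assumes the `inclusion` and `inclusion.exclusion` source parameters; the pipe name `A2B_no_auth` and the target address are placeholders, not values taken from a real deployment.

```SQL
-- Hypothetical pipe: synchronize data and schema, but not auth
create pipe A2B_no_auth
with source (
  'inclusion' = 'all',            -- start from every scope in the table
  'inclusion.exclusion' = 'auth'  -- then remove user permissions and access control
)
with sink (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668'  -- data service port of a DataNode on the target IoTDB
)
```

Omitting both parameters leaves the pipe on the default scope, so only newly written data is synchronized.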
- -### Functional limitations and instructions - -The schema and auth synchronization functions have the following limitations: - -- When using schema synchronization, it is required that the consensus protocol for `Schema region` and `ConfigNode` must be the default ratis protocol. This means that the `iotdb-system.properties` configuration file should contain the settings `config_node_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus` and `schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus`. If these are not specified, the default ratis protocol is used. - -- To prevent potential conflicts, please disable the automatic creation of schema on the receiving end when enabling schema synchronization. This can be done by setting the `enable_auto_create_schema` configuration in the `iotdb-system.properties` file to false. - -- When schema synchronization is enabled, the use of custom plugins is not supported. - -- In a dual-active cluster, schema synchronization should avoid simultaneous operations on both ends. - -- During data synchronization tasks, please avoid performing any deletion operations to prevent inconsistent states between the two ends. - -## Usage Instructions - -Data synchronization tasks have three states: RUNNING, STOPPED, and DROPPED. The task state transitions are shown in the following diagram: - -![](/img/Data-Sync02.png) - -After creation, the task will start directly, and when the task stops abnormally, the system will automatically attempt to restart the task. - -Provide the following SQL statements for state management of synchronization tasks. - -### Create Task - -Use the `CREATE PIPE` statement to create a data synchronization task. The `PipeId` and `sink` attributes are required, while `source` and `processor` are optional. When entering the SQL, note that the order of the `SOURCE` and `SINK` plugins cannot be swapped. 
- -The SQL example is as follows: - -```SQL -CREATE PIPE [IF NOT EXISTS] -- PipeId is the name that uniquely identifies the task. --- Data extraction plugin, optional plugin -WITH SOURCE ( - [ = ,], -) --- Data processing plugin, optional plugin -WITH PROCESSOR ( - [ = ,], -) --- Data connection plugin, required plugin -WITH SINK ( - [ = ,], -) -``` - -**IF NOT EXISTS semantics**: Used in creation operations to ensure that the create command is executed when the specified Pipe does not exist, preventing errors caused by attempting to create an existing Pipe. - -**Note**: - -Starting from V1.3.6, when creating a full data synchronization Pipe (e.g. Pipeid: `alldatapipe`), the system will automatically split it into two independent Pipes: - -* History Pipe: The PipeId is the original name plus the suffix `_history` (e.g. `alldatapipe_history`). The source parameter carries the default configurations: `'realtime.enable'='false', 'inclusion'='data.insert', 'inclusion.exclusion'=''` -* Realtime Pipe: The PipeId is the original name plus the suffix `_realtime` (e.g. `alldatapipe_realtime`). The source parameter carries the default configuration: `'history.enable'='false'`. If metadata synchronization is configured, the Realtime Pipe will be responsible for sending the data. - -After successful creation, the original PipeId (e.g. `alldatapipe`) will no longer be a valid identifier. When performing task operations such as starting, stopping, deleting, or viewing, you must use the split independent PipeId (i.e. `*_history` or `*_realtime`). 
For operation examples, see the [View Task](./Data-Sync_timecho.md#view-task) section - -### Start Task - -Start processing data: - -```SQL -START PIPE -``` - -### Stop Task - -Stop processing data: - -```SQL -STOP PIPE -``` - -### Delete Task - -Deletes the specified task: - -```SQL -DROP PIPE [IF EXISTS] -``` -**IF EXISTS semantics**: Used in deletion operations to ensure that when a specified Pipe exists, the delete command is executed to prevent errors caused by attempting to delete non-existent Pipes. - -Deleting a task does not require stopping the synchronization task first. - -### View Task - -View all tasks: - -```SQL -SHOW PIPES -``` - -To view a specified task: - -```SQL -SHOW PIPE -``` - -Example of the show pipes result for a pipe: - -```SQL -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State|PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -|59abf95db892428b9d01c5fa318014ea|2024-06-17T14:03:44.189|RUNNING| {}| {}|{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}| | 128| 1.03| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -``` - -The meanings of each column are as follows: - -- **ID**:The unique identifier for the synchronization task -- **CreationTime**:The time when the synchronization task was created -- **State**:The state of the synchronization task -- **PipeSource**:The source of the synchronized data stream -- 
**PipeProcessor**:The processing logic of the synchronized data stream during transmission -- **PipeSink**:The destination of the synchronized data stream -- **ExceptionMessage**:Displays the exception information of the synchronization task -- **RemainingEventCount (Statistics with Delay)**: The number of remaining events, which is the total count of all events in the current data synchronization task, including data and schema synchronization events, as well as system and user-defined events. -- **EstimatedRemainingSeconds (Statistics with Delay)**: The estimated remaining time, based on the current number of events and the rate at the pipe, to complete the transfer. - -Example: - -In V1.3.6 and later versions, create a full data synchronization task and view the task details. - -```sql -IoTDB> create pipe alldatapipe with source('inclusion'='all','exclusion'='auth') with sink('node-urls'='127.0.0.1:6668') - -IoTDB> show pipe alldatapipe_history -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_history|2025-12-18T15:06:16.697|RUNNING|{exclusion=auth, history.enable=true, inclusion=data.insert, inclusion.exclusion=, realtime.enable=false}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| 
-+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -IoTDB> show pipe alldatapipe_realtime -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_realtime|2025-12-18T15:06:16.312|RUNNING|{exclusion=auth, history.enable=false, inclusion=all, realtime.enable=true}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -``` - -### Synchronization Plugins - -To make the overall architecture more flexible to match different synchronization scenario requirements, we support plugin assembly within the synchronization task framework. The system comes with some pre-installed common plugins that you can use directly. At the same time, you can also customize processor plugins and Sink plugins, and load them into the IoTDB system for use. 
You can view the plugins in the system (including custom and built-in plugins) with the following statement: - -```SQL -SHOW PIPEPLUGINS -``` - -The return result is as follows (version 1.3.2): - -```SQL -IoTDB> SHOW PIPEPLUGINS -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| PluginName|PluginType| ClassName| PluginJar| -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.processor.donothing.DoNothingProcessor| | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.donothing.DoNothingConnector| | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector| | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.extractor.iotdb.IoTDBExtractor| | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector| | -| IOTDB-THRIFT-SSL-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector| | -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ - -``` - -Detailed introduction of pre-installed plugins is as follows (for detailed parameters of each plugin, please refer to the [Parameter Description](#reference-parameter-description) section): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Type | Custom Plugin | Plugin Name | Description | Applicable Version |
| :--- | :--- | :--- | :--- | :--- |
| source plugin | Not Supported | iotdb-source | The default extractor plugin, used to extract historical or real-time data from IoTDB | 1.2.x |
| processor plugin | Supported | do-nothing-processor | The default processor plugin, which does not process the incoming data | 1.2.x |
| sink plugin | Supported | do-nothing-sink | Does not process the data that is sent out | 1.2.x |
| sink plugin | Supported | iotdb-thrift-sink | The default sink plugin (V1.3.1+), used for data transfer between IoTDB (V1.2.0+) and IoTDB (V1.2.0+). It uses the Thrift RPC framework to transfer data, with a multi-threaded async non-blocking IO model, high transfer performance, especially suitable for scenarios where the target end is distributed | 1.2.x |
| sink plugin | Supported | iotdb-air-gap-sink | Used for data synchronization across unidirectional data diodes from IoTDB (V1.2.0+) to IoTDB (V1.2.0+). Supported diode models include Nanrui Syskeeper 2000, etc. | 1.2.x |
| sink plugin | Supported | iotdb-thrift-ssl-sink | Used for data transfer between IoTDB (V1.3.1+) and IoTDB (V1.2.0+). It uses the Thrift RPC framework to transfer data, with a single-threaded sync blocking IO model, suitable for scenarios with higher security requirements | 1.3.1+ |
- -For importing custom plugins, please refer to the [Stream Processing](./Streaming_timecho.md#custom-stream-processing-plugin-management) section. - -## Use examples - -### Full data synchronisation - -This example demonstrates the synchronisation of all data from one IoTDB to another IoTDB, with the data link as shown below: - -![](/img/pipe1.jpg) - -In this example, we can create a synchronization task named A2B to synchronize the full data from A IoTDB to B IoTDB. The iotdb-thrift-sink plugin (built-in plugin) for the sink is required. The URL of the data service port of the DataNode node on the target IoTDB needs to be configured through node-urls, as shown in the following example statement: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -### Partial data synchronization - -This example demonstrates the synchronisation of data from a certain historical time range (08:00 on 23 August 2023 to 08:00 on 23 October 2023, UTC) to another IoTDB; the data link is shown below: - -![](/img/pipe2.jpg) - -In this example, we can create a synchronization task named A2B. First, we need to define the range of data to be transferred in the source. Since the data being transferred is historical data (historical data refers to data that existed before the creation of the synchronization task), we need to configure the start-time and end-time of the data and the transfer mode `realtime.mode`. The URL of the data service port of the DataNode node on the target IoTDB needs to be configured through node-urls.
- -The detailed statements are as follows: - -```SQL -create pipe A2B -WITH SOURCE ( - 'source'= 'iotdb-source', - 'realtime.mode' = 'stream', -- The extraction mode for newly inserted data (after pipe creation) - 'path' = 'root.vehicle.**', -- Scope of Data Synchronization - 'start-time' = '2023.08.23T08:00:00+00:00', -- The start event time for synchronizing all data, including start-time - 'end-time' = '2023.10.23T08:00:00+00:00' -- The end event time for synchronizing all data, including end-time -) -with SINK ( - 'sink'='iotdb-thrift-async-sink', - 'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -### Bidirectional data transfer - -This example demonstrates the scenario where two IoTDB instances act as an active-active pair, with the data link shown in the figure below: - -![](/img/pipe3.jpg) - -In this example, to avoid infinite data loops, the `forwarding-pipe-requests` parameter on A and B needs to be set to `false`, indicating that data transmitted from another pipe is not forwarded. To keep the data consistent on both sides, the pipe also needs to be configured with `inclusion=all` to synchronize full data and metadata.
- -The detailed statement is as follows: - -On A IoTDB, execute the following statement: - -```SQL -create pipe AB -with source ( - 'inclusion'='all', -- Indicates synchronization of full data, schema , and auth - 'forwarding-pipe-requests' = 'false' -- Do not forward data written by other Pipes -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -On B IoTDB, execute the following statement: - -```SQL -create pipe BA -with source ( - 'inclusion'='all', -- Indicates synchronization of full data, schema , and auth - 'forwarding-pipe-requests' = 'false' -- Do not forward data written by other Pipes -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -### Edge-cloud data transfer - -This example is used to demonstrate the scenario where data from multiple IoTDB is transferred to the cloud, with data from clusters B, C, and D all synchronized to cluster A, as shown in the figure below: - -![](/img/sync_en_03.png) - -In this example, to synchronize the data from clusters B, C, and D to A, the pipe between BA, CA, and DA needs to configure the `path` to limit the range, and to keep the edge and cloud data consistent, the pipe needs to be configured with `inclusion=all` to synchronize full data and metadata. 
The detailed statements are as follows: - -On B IoTDB, execute the following statement to synchronize data from B to A: - -```SQL -create pipe BA -with source ( - 'inclusion'='all', -- Indicates synchronization of full data, schema, and auth - 'path'='root.db.**', -- Limit the range -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -On C IoTDB, execute the following statement to synchronize data from C to A: - -```SQL -create pipe CA -with source ( - 'inclusion'='all', -- Indicates synchronization of full data, schema, and auth - 'path'='root.db.**', -- Limit the range -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -On D IoTDB, execute the following statement to synchronize data from D to A: - -```SQL -create pipe DA -with source ( - 'inclusion'='all', -- Indicates synchronization of full data, schema, and auth - 'path'='root.db.**', -- Limit the range -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -### Cascading data transfer - -This example demonstrates the scenario where data is transferred in a cascading manner between multiple IoTDB instances, with data from cluster A synchronized to cluster B, and then to cluster C, as shown in the figure below: - -![](/img/sync_en_04.png) - -In this example, to synchronize the data from cluster A to C, `forwarding-pipe-requests` needs to be set to `true` on the pipe from B to C.
The detailed statements are as follows: - -On A IoTDB, execute the following statement to synchronize data from A to B: - -```SQL -create pipe AB -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -On B IoTDB, execute the following statement to synchronize data from B to C: - -```SQL -create pipe BC -with source ( - 'forwarding-pipe-requests' = 'true' -- Whether to forward data written by other Pipes -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -### Cross-gate data transfer - -This example demonstrates the scenario where data from one IoTDB is synchronized to another IoTDB through a unidirectional gateway, as shown in the figure below: - -![](/img/cross-network-gateway.png) - - -In this example, the iotdb-air-gap-sink plugin needs to be used in the sink task. After configuring the gateway, execute the following statement on A IoTDB. Fill in node-urls with the URL of the data service port of the DataNode node on the target IoTDB configured by the gateway, as detailed below: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-air-gap-sink', - 'node-urls' = '10.53.53.53:9780', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` -**Notes: Currently supported gateway models** -> For other models of network gateway devices, please contact timechodb staff to confirm compatibility.
- -| Gateway Type | Model | Return Packet Limit | Send Limit | -| ---------------------- | ------------------------------------------------------------ | ------------------- | ---------------------- | -| Forward Gate | NARI Syskeeper-2000 Forward Gate | All 0 / All 1 bytes | No Limit | -| Forward Gate | XJ Self-developed Diaphragm | All 0 / All 1 bytes | No Limit | -| Unknown | WISGAP | No Limit | No Limit | -| Forward Gate | KEDONG StoneWall-2000 Network Security Isolation Device | No Limit | No Limit | -| Reverse Gate | NARI Syskeeper-2000 Reverse Direction | All 0 / All 1 bytes | Meet E Language Format | -| Unknown | DPtech ISG5000 | No Limit | No Limit | -| Unknown | GAP - XL—GAP | No Limit | No Limit | - -### Compression Synchronization (V1.3.3+) - -IoTDB supports specifying data compression methods during synchronization. Real-time compression and transmission of data can be achieved by configuring the `compressor` parameter. `compressor` currently supports 5 optional algorithms: snappy/gzip/lz4/zstd/lzma2. Multiple algorithms can be combined, in which case data is compressed in the configured order. `rate-limit-bytes-per-second` (supported in V1.3.3 and later versions) is the maximum number of bytes allowed to be transmitted per second, calculated on compressed bytes. If it is less than 0, there is no limit. - -For example, to create a synchronization task named A2B: - -```SQL -create pipe A2B -with sink ( - 'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB - 'compressor' = 'snappy,lz4' -- Compression algorithms -) -``` - -### Encrypted Synchronization (V1.3.1+) - -IoTDB supports the use of SSL encryption during the synchronization process, ensuring the secure transfer of data between different IoTDB instances.
By configuring SSL-related parameters, such as the certificate address (`ssl.trust-store-path`) and password (`ssl.trust-store-pwd`), data can be protected by SSL encryption during the synchronization process. - -For example, to create a synchronization task named A2B: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-ssl-sink', - 'node-urls'='127.0.0.1:6667', -- The URL of the data service port of the DataNode node on the target IoTDB - 'ssl.trust-store-path'='pki/trusted', -- The trust store certificate path required to connect to the target DataNode - 'ssl.trust-store-pwd'='root' -- The trust store certificate password required to connect to the target DataNode -) -``` - -## Reference: Notes - -You can adjust the parameters for data synchronization by modifying the IoTDB configuration file (`iotdb-system.properties`), such as the directory for storing synchronized data. The complete configuration is as follows: - -V1.3.3+: - -```Properties -# pipe_receiver_file_dir -# If this property is unset, system will save the data in the default relative path directory under the IoTDB folder(i.e., %IOTDB_HOME%/${cn_system_dir}/pipe/receiver). -# If it is absolute, system will save the data in the exact location it points to. -# If it is relative, system will save the data in the relative path directory it indicates under the IoTDB folder. -# Note: If pipe_receiver_file_dir is assigned an empty string(i.e.,zero-size), it will be handled as a relative path. -# effectiveMode: restart -# For windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is absolute. Otherwise, it is relative. -# pipe_receiver_file_dir=data\\confignode\\system\\pipe\\receiver -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative.
-pipe_receiver_file_dir=data/confignode/system/pipe/receiver - -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# effectiveMode: first_start -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# effectiveMode: restart -# Datatype: int -pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# effectiveMode: restart -# Datatype: int -pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# effectiveMode: restart -# Datatype: int -pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# effectiveMode: restart -# Datatype: int -pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# effectiveMode: restart -# Datatype: Boolean -pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# Datatype: int -# effectiveMode: restart -pipe_air_gap_receiver_port=9780 - -# The total bytes that all pipe sinks can transfer per second. -# When given a value less than or equal to 0, it means no limit. -# default value is -1, which means no limit. 
-# effectiveMode: hot_reload -# Datatype: double -pipe_all_sinks_rate_limit_bytes_per_second=-1 -``` - -## Reference: parameter description - -### source parameter(V1.3.3) - -| key | value | value range | required or not | default value | -| :------------------------------ | :----------------------------------------------------------- | :------------------------------------- | :------- | :------------- | -| source | iotdb-source | String: iotdb-source | Required | - | -| inclusion | Used to specify the range of data to be synchronized in the data synchronization task, including data, schema, and auth | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | Optional | data.insert | -| inclusion.exclusion | Used to exclude specific operations from the range specified by inclusion, reducing the amount of data synchronized | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | Optional | - | -| path | Used to filter the path pattern schema of time series and data to be synchronized / schema synchronization can only use pathpath is exact matching, parameters must be prefix paths or complete paths, i.e., cannot contain `"*"`, at most one `"**"` at the end of the path parameter | String:IoTDB pattern | Optional | root.** | -| pattern | Used to filter the path prefix of time series | String: Optional | Optional | root | -| start-time | The start event time for synchronizing all data, including start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | Optional | Long.MIN_VALUE | -| end-time | The end event time for synchronizing all data, including end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | Optional | Long.MAX_VALUE | -| realtime.mode | The extraction mode for newly inserted data (after pipe creation) | String: stream, batch | Optional | stream | -| forwarding-pipe-requests | Whether to forward data written by other Pipes (usually data synchronization) | Boolean: true, false | Optional | true | -| history.loose-range | When 
transferring TsFile, whether to relax the range of historical data (before the creation of the pipe). "": Do not relax the range, select data strictly according to the set conditions. "time": Relax the time range to avoid splitting TsFile, which can improve synchronization efficiency. "path": Relax the path range to avoid splitting TsFile, which can improve synchronization efficiency. "time, path", "path, time", "all": Relax all ranges to avoid splitting TsFile, which can improve synchronization efficiency. | String: "", "time", "path", "time, path", "path, time", "all" | Optional |""| -| realtime.loose-range | When transferring TsFile, whether to relax the range of real-time data (after the creation of the pipe). "": Do not relax the range, select data strictly according to the set conditions. "time": Relax the time range to avoid splitting TsFile, which can improve synchronization efficiency. "path": Relax the path range to avoid splitting TsFile, which can improve synchronization efficiency. "time, path", "path, time", "all": Relax all ranges to avoid splitting TsFile, which can improve synchronization efficiency. | String: "", "time", "path", "time, path", "path, time", "all" | Optional |""| -| mods.enable | Whether to send the mods file of tsfile | Boolean: true / false | Optional | false | - -> 💎 **Explanation**: To maintain compatibility with lower versions, history.enable, history.start-time, history.end-time, realtime.enable can still be used, but they are not recommended in the new version. -> -> 💎 **Explanation: Differences between Stream and Batch Data Extraction Modes** -> - **stream (recommended)**: In this mode, tasks process and send data in real-time. It is characterized by high timeliness and low throughput. -> - **batch**: In this mode, tasks process and send data in batches (according to the underlying data files). It is characterized by low timeliness and high throughput.
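
As a hedged illustration of the two extraction modes, the statement below sketches a pipe that opts into `batch` mode. The pipe name `A2B_batch` and the target address are hypothetical; only the `realtime.mode` key and its two values (`stream`, `batch`) come from the source parameter table.

```SQL
-- Hypothetical pipe name and target address; adjust to your deployment
create pipe A2B_batch
with source (
  'source' = 'iotdb-source',
  'realtime.mode' = 'batch'   -- favour throughput over timeliness
)
with sink (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668'
)
```

Leaving `realtime.mode` out keeps the default `stream` mode, which is the recommended choice when timeliness matters more than throughput.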
- - -### sink parameter - -> In versions 1.3.3 and above, when only the sink is included, the additional "with sink" prefix is no longer required. - -#### iotdb-thrift-sink - - -| key | value | value Range | required or not | Default Value | -| :--------------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :----------- | -| sink | iotdb-thrift-sink or iotdb-thrift-async-sink | String: iotdb-thrift-sink or iotdb-thrift-async-sink | Required | | -| node-urls | The URL of the data service port of any DataNode nodes on the target IoTDB (please note that synchronization tasks do not support forwarding to its own service) | String. Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - | -| batch.enable | Whether to enable batched log transmission mode to improve transmission throughput and reduce IOPS | Boolean: true, false | Optional | true | -| batch.max-delay-seconds | Effective when batched log transmission mode is enabled, it represents the maximum waiting time for a batch of data before sending (unit: s) | Integer | Optional | 1 | -| batch.max-delay-ms | Effective when batched log transmission mode is enabled, it represents the maximum waiting time for a batch of data before sending (unit: ms) (Available since v1.3.6) | Integer | Optional | 1 | -| batch.size-bytes | Effective when batched log transmission mode is enabled, it represents the maximum batch size for a batch of data (unit: byte) | Long | Optional | 16*1024*1024 | -| load-tsfile-strategy | When synchronizing file data, whether the receiver waits for the local load tsfile operation to complete before responding to the sender:
sync: Wait for the local load tsfile operation to complete before returning the response.
async: Do not wait for the local load tsfile operation to complete; return the response immediately. (Available since v1.3.6) | String: sync / async | Optional | sync | - - -#### iotdb-air-gap-sink - -| key | value | value Range | required or not | Default Value | -| :--------------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :----------- | -| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | Required | - | -| node-urls | The URL of the data service port of any DataNode nodes on the target IoTDB | String. Example: :'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - | -| air-gap.handshake-timeout-ms | The timeout duration of the handshake request when the sender and receiver first attempt to establish a connection, unit: ms | Integer | Optional | 5000 | -| load-tsfile-strategy | When synchronizing file data, whether the receiver waits for the local load tsfile operation to complete before responding to the sender:
sync: Wait for the local load tsfile operation to complete before returning the response.
async: Do not wait for the local load tsfile operation to complete; return the response immediately. (Available since v1.3.6) | String: sync / async | Optional | sync | - -#### iotdb-thrift-ssl-sink - -| key | value | value Range | required or not | Default Value | -| :---------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :----------- | -| sink | iotdb-thrift-ssl-sink | String: iotdb-thrift-ssl-sink | Required | - | -| node-urls | The URL of the data service port of any DataNode nodes on the target IoTDB (please note that synchronization tasks do not support forwarding to its own service) | String. Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - | -| batch.enable | Whether to enable batched log transmission mode to improve transmission throughput and reduce IOPS | Boolean: true, false | Optional | true | -| batch.max-delay-seconds | Effective when batched log transmission mode is enabled, it represents the maximum waiting time for a batch of data before sending (unit: s) | Integer | Optional | 1 | -| batch.max-delay-ms | Effective when batched log transmission mode is enabled, it represents the maximum waiting time for a batch of data before sending (unit: ms) (Available since v1.3.6) | Integer | Optional | 1 | -| batch.size-bytes | Effective when batched log transmission mode is enabled, it represents the maximum batch size for a batch of data (unit: byte) | Long | Optional | 16*1024*1024 | -| load-tsfile-strategy | When synchronizing file data, whether the receiver waits for the local load tsfile operation to complete before responding to the sender:
sync: Wait for the local load tsfile operation to complete before returning the response.
async: Do not wait for the local load tsfile operation to complete; return the response immediately. (Available since v1.3.6) | String: sync / async | Optional | sync | -| ssl.trust-store-path | The trust store certificate path required to connect to the target DataNode | String: certificate directory name, when configured as a relative directory, it is relative to the IoTDB root directory. Example: 'pki/trusted' | Required | - | -| ssl.trust-store-pwd | The trust store certificate password required to connect to the target DataNode | String | Required | - | diff --git a/src/UserGuide/V1.3.x/User-Manual/IoTDB-View_timecho.md b/src/UserGuide/V1.3.x/User-Manual/IoTDB-View_timecho.md deleted file mode 100644 index b84bfef7a..000000000 --- a/src/UserGuide/V1.3.x/User-Manual/IoTDB-View_timecho.md +++ /dev/null @@ -1,549 +0,0 @@ - - -# View - -## Sequence View Application Background - -### Application Scenario 1 Time Series Renaming (PI Asset Management) - -In practice, the equipment collecting data may be named with identification numbers that are hard for people to understand, which makes querying difficult for the business layer. - -The Sequence View, on the other hand, is able to re-organise the management of these sequences and access them using a new model structure without changing the original sequence content and without the need to create new or copy sequences. - -**For example**: a cloud device uses its own NIC MAC address to form entity numbers and stores data by writing the following time series: `root.db.0800200A8C6D.xvjeifg`. - -This name is difficult for the user to understand. However, the user can rename it using the sequence view feature: map it to a sequence view, and use `root.view.device001.temperature` to access the captured data.
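
A minimal sketch of this renaming, assuming the alias form of `CREATE VIEW` (target path, `AS`, source path) described in the view-creation section of this document. Depending on the IoTDB version, a node name such as `0800200A8C6D` may need to be quoted with backticks; that detail is an assumption, not something this document states.

```SQL
-- Map the hard-to-read series to a readable alias (no data is copied)
CREATE VIEW root.view.device001.temperature
AS
  root.db.0800200A8C6D.xvjeifg
```

After this, the business layer can query `temperature` under `root.view.device001` without knowing the original path.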
- -### Application Scenario 2 Simplifying business layer query logic - -Sometimes users have a large number of devices that manage a large number of time series. For a particular piece of business, the user wants to deal with only some of these sequences. The sequences of interest can then be picked out with the sequence view function, which is convenient for repeated querying and writing. - -**For example**: Users manage a product assembly line with a large number of time series for each segment of the equipment. The temperature inspector only needs to focus on the temperature of the equipment, so they can extract the temperature-related sequences and compose a sequence view. - -### Application Scenario 3 Auxiliary Rights Management - -In the production process, different operations are generally responsible for different scopes. For security reasons, it is often necessary to restrict the access scope of the operations staff through permission management. - -**For example**: The safety management department only needs to monitor the temperature of each device in a production line, but this data is stored in the same database as other confidential data. In this case, it is possible to create a number of new views that contain only temperature-related time series on the production line, and then give the security officer access to only these sequence views, thus achieving the purpose of permission restriction. - -### Motivation for designing sequence view functionality - -Combining the three usage scenarios above, the motivations for designing sequence view functionality are: - -1. Time series renaming. -2. Simplifying the query logic at the business level. -3. Auxiliary rights management: opening data to specific users through the view.
- -## Sequence View Concepts - -### Terminology Concepts - -Concept: If not specified, the views specified in this document are **Sequence Views**, and new features such as device views may be introduced in the future. - -### Sequence view - -A sequence view is a way of organising the management of time series. - -In traditional relational databases, data must all be stored in a table, whereas in time series databases such as IoTDB, it is the sequence that is the storage unit. Therefore, the concept of sequence views in IoTDB is also built on sequences. - -A sequence view is a virtual time series, and each virtual time series is like a soft link or shortcut that maps to a sequence or some kind of computational logic external to a certain view. In other words, a virtual sequence either maps to some defined external sequence or is computed from multiple external sequences. - -Users can create views using complex SQL queries, where the sequence view acts as a stored query statement, and when data is read from the view, the stored query statement is used as the source of the data in the FROM clause. - -### Alias Sequences - -There is a special class of beings in a sequence view that satisfy all of the following conditions: - -1. the data source is a single time series -2. there is no computational logic -3. no filtering conditions (e.g., no WHERE clause restrictions). - -Such a sequence view is called an **alias sequence**, or alias sequence view. A sequence view that does not fully satisfy all of the above conditions is called a non-alias sequence view. The difference between them is that only aliased sequences support write functionality. - -** All sequence views, including aliased sequences, do not currently support Trigger functionality. ** - -### Nested Views - -A user may want to select a number of sequences from an existing sequence view to form a new sequence view, called a nested view. - -**The current version does not support the nested view feature**. 
- -### Some restrictions on sequence views in IoTDB - -#### Restriction 1 A sequence view must depend on one or several time series - -A sequence view takes one of two forms: - -1. it maps to a time series -2. it is computed from one or more time series. - -The first form has been exemplified in the previous section and is easy to understand; the second form exists because the sequence view allows for computational logic. - -For example, the user has installed two thermometers in the same boiler and now needs to calculate the average of the two temperature values as a measurement. The user has captured the following two sequences: `root.db.d01.temperature01`, `root.db.d01.temperature02`. - -At this point, the user can use the average of the two sequences as one sequence in the view: `root.db.d01.avg_temperature`. - -This example is expanded in detail in section 3.1.2. - -#### Restriction 2 Non-alias sequence views are read-only - -Writing to non-alias sequence views is not allowed. - -Only aliased sequence views are supported for writing. - -#### Restriction 3 Nested views are not allowed - -It is not possible to select certain columns in an existing sequence view to create a sequence view, either directly or indirectly. - -An example of this restriction will be given in 3.1.3. - -#### Restriction 4 A sequence view and a time series cannot share the same name - -Both sequence views and time series are located under the same tree, so a sequence view cannot have the same name as a time series. - -The name (path) of any sequence must be unique. - -#### Restriction 5 Sequence views share time series data with time series, but metadata such as tags is not shared - -Sequence views are mappings pointing to time series, so they fully share the time series data, with the time series being responsible for persistent storage. - -However, their metadata such as tags and attributes are not shared.
- -This is because the business query, view-oriented users are concerned about the structure of the current view, and if you use group by tag and other ways to do the query, obviously want to get the view contains the corresponding tag grouping effect, rather than the time series of the tag grouping effect (the user is not even aware of those time series). - -## Sequence view functionality - -### Creating a view - -Creating a sequence view is similar to creating a time series, the difference is that you need to specify the data source, i.e., the original sequence, through the AS keyword. - -#### SQL for creating a view - -User can select some sequences to create a view: - -```SQL -CREATE VIEW root.view.device.status -AS - SELECT s01 - FROM root.db.device -``` - -It indicates that the user has selected the sequence `s01` from the existing device `root.db.device`, creating the sequence view `root.view.device.status`. - -The sequence view can exist under the same entity as the time series, for example: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device -``` - -Thus, there is a virtual copy of `s01` under `root.db.device`, but with a different name `status`. - -It can be noticed that the sequence views in both of the above examples are aliased sequences, and we are giving the user a more convenient way of creating a sequence for that sequence: - -```SQL -CREATE VIEW root.view.device.status -AS - root.db.device.s01 -``` - -#### Creating views with computational logic - -Following the example in section 2.2 Limitations 1: - -> A user has installed two thermometers in the same boiler and now needs to calculate the average of the two temperature values as a measurement. The user has captured the following two sequences: `root.db.d01.temperature01`, `root.db.d01.temperature02`. -> -> At this point, the user can use the two sequences averaged as one sequence in the view: `root.view.device01.avg_temperature`. 
- -If the view is not used, the user can query the average of the two temperatures like this: - -```SQL -SELECT (temperature01 + temperature02) / 2 -FROM root.db.d01 -``` - -With a sequence view, the user can create a view this way to simplify future queries: - -```SQL -CREATE VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02) / 2 - FROM root.db.d01 -``` - -The user can then query it like this: - -```SQL -SELECT avg_temperature FROM root.db.d01 -``` - -#### Nested sequence views not supported - -Continuing with the example from 3.1.2, the user now wants to create a new view using the sequence view `root.db.d01.avg_temperature`, which is not allowed. Nested views are currently not supported, whether the source view is an aliased sequence or not. - -For example, the following SQL statement will report an error: - -```SQL -CREATE VIEW root.view.device.avg_temp_copy -AS - root.db.d01.avg_temperature -- Not supported. Nested views are not allowed -``` - -#### Creating multiple sequence views at once - -Specifying only one sequence view per statement would be inconvenient, so multiple sequence views can be specified in a single statement, for example: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - SELECT s01, s02 - FROM root.db.device -``` - -In addition, the above statement can be simplified: - -```SQL -CREATE VIEW root.db.device(status, sub.hardware) -AS - SELECT s01, s02 - FROM root.db.device -``` - -Both statements above are equivalent to the following: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device; - -CREATE VIEW root.db.device.sub.hardware -AS - SELECT s02 - FROM root.db.device -``` - -They are also equivalent to the following: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - root.db.device.s01, root.db.device.s02 - --- or - -CREATE VIEW root.db.device(status, sub.hardware) -AS - root.db.device(s01, s02) -``` - -##### The mapping relationships between all sequences
are statically stored - -Sometimes, the SELECT clause may contain a number of statements that can only be determined at runtime, such as below: - -```SQL -SELECT s01, s02 -FROM root.db.d01, root.db.d02 -``` - -The number of sequences that can be matched by the above statement is uncertain and is related to the state of the system. Even so, the user can use it to create views. - -However, it is important to note that the mapping relationship between all sequences is stored statically (fixed at creation)! Consider the following example: - -The current database contains only three sequences `root.db.d01.s01`, `root.db.d02.s01`, `root.db.d02.s02`, and then the view is created: - -```SQL -CREATE VIEW root.view.d(alpha, beta, gamma) -AS - SELECT s01, s02 - FROM root.db.d01, root.db.d02 -``` - -The mapping relationship between time series is as follows: - -| sequence number | time series | sequence view | -| ---- | ----------------- | ----------------- | -| 1 | `root.db.d01.s01` | root.view.d.alpha | -| 2 | `root.db.d02.s01` | root.view.d.beta | -| 3 | `root.db.d02.s02` | root.view.d.gamma | - -After that, if the user adds the sequence `root.db.d01.s02`, it does not correspond to any view; then, if the user deletes `root.db.d01.s01`, the query for `root.view.d.alpha` will report an error directly, and it will not correspond to `root.db.d01.s02` either. - -Please always note that inter-sequence mapping relationships are stored statically and solidly. - -#### Batch Creation of Sequence Views - -There are several existing devices, each with a temperature value, for example: - -1. root.db.d1.temperature -2. root.db.d2.temperature -3. ... - -There may be many other sequences stored under these devices (e.g. `root.db.d1.speed`), but for now it is possible to create a view that contains only the temperature values for these devices, without relation to the other sequences:. 
- -```SQL -CREATE VIEW root.db.view(${2}_temperature) -AS - SELECT temperature FROM root.db.* -``` - -This is modelled on the query writeback (`SELECT INTO`) convention for naming rules, which uses variable placeholders to specify naming rules. See also: [QUERY WRITEBACK (SELECT INTO)](../Basic-Concept/Query-Data.md#into-clause-query-write-back) - -Here `root.db.*.temperature` specifies what time series will be included in the view; and `${2}` specifies from which node in the time series the name is extracted to name the sequence view. - -Here, `${2}` refers to level 2 (starting at 0) of `root.db.*.temperature`, which is the result of the `*` match; and `${2}_temperature` is the result of the match and `temperature` spliced together with underscores to make up the node names of the sequences under the view. - -The above statement for creating a view is equivalent to the following writeup: - -```SQL -CREATE VIEW root.db.view(${2}_${3}) -AS - SELECT temperature from root.db.* -``` - -The final view contains these sequences: - -1. root.db.view.d1_temperature -2. root.db.view.d2_temperature -3. ... - -Created using wildcards, only static mapping relationships at the moment of creation will be stored. - -#### SELECT clauses are somewhat limited when creating views - -The SELECT clause used when creating a serial view is subject to certain restrictions. The main restrictions are as follows: - -1. the `WHERE` clause cannot be used. -2. `GROUP BY` clause cannot be used. -3. `MAX_VALUE` and other aggregation functions cannot be used. - -Simply put, after `AS` you can only use `SELECT ... FROM ... ` and the results of this query must form a time series. - -### View Data Queries - -For the data query functions that can be supported, the sequence view and time series can be used indiscriminately with identical behaviour when performing time series data queries. - -**The types of queries that are not currently supported by the sequence view are as follows:** - -1. 
**align by device** query -2. **group by tags** query - -Users can also mix time series and sequence view queries in the same SELECT statement, for example: - -```SQL -SELECT temperature01, temperature02, avg_temperature -FROM root.db.d01 -WHERE temperature01 < temperature02 -``` - -However, if the user queries the metadata of a sequence, such as tags and attributes, the result is the metadata of the sequence view, not that of the time series referenced by the sequence view. - -In addition, for aliased sequences, if the user wants to get information about the underlying time series such as tags and attributes, the user needs to query the mapping of the view columns to find the corresponding time series, and then query that time series for its tags and attributes. The method of querying the mapping of the view columns will be explained in section 3.5. - -### Modify Views - -The modification operations supported by the view include: modifying its calculation logic, modifying tags/attributes, and deleting.
- -#### Modify view data source - -```SQL -ALTER VIEW root.view.device.status -AS - SELECT s01 - FROM root.ln.wf.d01 -``` - -#### Modify the view's calculation logic - -```SQL -ALTER VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02 + temperature03) / 3 - FROM root.db.d01 -``` - -#### Tag point management - -- Add a new tag - -```SQL -ALTER VIEW root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4 -``` - -- Add a new attribute - -```SQL -ALTER VIEW root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4 -``` - -- Rename a tag or attribute - -```SQL -ALTER VIEW root.turbine.d1.s1 RENAME tag1 TO newTag1 -``` - -- Reset the value of a tag or attribute - -```SQL -ALTER VIEW root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1 -``` - -- Delete an existing tag or attribute - -```SQL -ALTER VIEW root.turbine.d1.s1 DROP tag1, tag2 -``` - -- Upsert tags and attributes - -> If the tag or attribute did not exist before, insert it; otherwise, update the old value with the new one. - -```SQL -ALTER VIEW root.turbine.d1.s1 UPSERT TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4) -``` - -#### Deleting Views - -Since a view is a sequence, a view can be deleted as if it were a time series. - - -```SQL -DELETE VIEW root.view.device.avg_temperature -``` - -### View Synchronisation - - - -#### If the dependent original sequence is deleted - -When the sequence view is queried (when the sequence is parsed), **the empty result set** is returned if the dependent time series does not exist. - -This is similar to the feedback for querying a non-existent sequence, but with a difference: if the dependent time series cannot be parsed, the empty result set still contains the table header, as a reminder to the user that the view is problematic. - -Additionally, when the dependent time series is deleted, no attempt is made to find out whether any view depends on it, and the user receives no warning.
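The behaviour described above can be sketched as the following sequence of statements. This is a hedged illustration rather than captured output: it assumes an alias view `root.view.device.status` over `root.db.device.s01` (paths reused from earlier examples) and uses the standard `DELETE TIMESERIES` statement to remove the source series.

```SQL
-- Assumption: the view is created as an alias of root.db.device.s01.
CREATE VIEW root.view.device.status
AS
    root.db.device.s01;

-- Deleting the source series raises no warning about the dependent view.
DELETE TIMESERIES root.db.device.s01;

-- The view still exists. Per the text above, this query is expected to
-- return an empty result set that keeps the table header.
SELECT status FROM root.view.device;

-- The SOURCE column of SHOW VIEW can help to spot views whose source
-- series no longer exists.
SHOW VIEW root.view.**;
```

Because no dependency check is made on deletion, inspecting the sources of existing views before dropping a time series is a reasonable precaution.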
- -#### Data Writes to Non-Aliased Sequences Not Supported - -Writes to non-alias sequences are not supported. - -Please refer to the previous section 2.1.6 Restrictions2 for more details. - -#### Metadata for sequences is not shared - -Please refer to the previous section 2.1.6 Restriction 5 for details. - -### View Metadata Queries - -View metadata query specifically refers to querying the metadata of the view itself (e.g., how many columns the view has), as well as information about the views in the database (e.g., what views are available). - -#### Viewing Current View Columns - -The user has two ways of querying: - -1. a query using `SHOW TIMESERIES`, which contains both time series and series views. This query contains both the time series and the sequence view. However, only some of the attributes of the view can be displayed. -2. a query using `SHOW VIEW`, which contains only the sequence view. It displays the complete properties of the sequence view. - -Example: - -```Shell -IoTDB> show timeseries; -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.device.s01 | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.view.status | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp01 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| 
-+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp02 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.avg_temp| null| root.db| FLOAT| null| null|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -Total line number = 5 -It costs 0.789s -IoTDB> -``` - -The last column `ViewType` shows the type of the sequence, the time series is BASE and the sequence view is VIEW. - -In addition, some of the sequence view properties will be missing, for example `root.db.d01.avg_temp` is calculated from temperature averages, so the `Encoding` and `Compression` properties are null values. - -In addition, the query results of the `SHOW TIMESERIES` statement are divided into two main parts. - -1. information about the timing data, such as data type, compression, encoding, etc. -2. other metadata information, such as tag, attribute, database, etc. - -For the sequence view, the temporal data information presented is the same as the original sequence or null (e.g., the calculated average temperature has a data type but no compression method); the metadata information presented is the content of the view. - -To learn more about the view, use `SHOW ``VIEW`. The `SHOW ``VIEW` shows the source of the view's data, etc. 
- -```Shell -IoTDB> show VIEW root.**; -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -| Timeseries|Database|DataType|Tags|Attributes|ViewType| SOURCE| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.view.status | root.db| INT32|null| null| VIEW| root.db.device.s01| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.d01.avg_temp| root.db| FLOAT|null| null| VIEW|(root.db.d01.temp01+root.db.d01.temp02)/2| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -Total line number = 2 -It costs 0.789s -IoTDB> -``` - -The last column, `SOURCE`, shows the data source for the sequence view, listing the SQL statement that created the sequence. - -##### About Data Types - -Both of the above queries involve the data type of the view. The data type of a view is inferred from the original time series type of the query statement or alias sequence that defines the view. This data type is computed in real time based on the current state of the system, so the data type queried at different moments may be changing. - -## FAQ - -#### Q1: I want the view to implement the function of type conversion. For example, a time series of type int32 was originally placed in the same view as other series of type int64. I now want all the data queried through the view to be automatically converted to int64 type. - -> Ans: This is not the function of the sequence view. But the conversion can be done using `CAST`, for example: - -```SQL -CREATE VIEW root.db.device.int64_status -AS - SELECT CAST(s1, 'type'='INT64') from root.db.device -``` - -> This way, a query for `root.view.status` will yield a result of type int64. 
-> -> Please note in particular that in the above example, the data for the sequence view is obtained by `CAST` conversion, so `root.db.device.int64_status` is not an aliased sequence, and thus **not supported for writing**. - -#### Q2: Is default naming supported? Select a number of time series and create a view; but I don't specify the name of each series, it is named automatically by the database? - -> Ans: Not supported. Users must specify the naming explicitly. - -#### Q3: In the original system, create time series `root.db.device.s01`, you can find that database `root.db` is automatically created and device `root.db.device` is automatically created. Next, deleting the time series `root.db.device.s01` reveals that `root.db.device` was automatically deleted, while `root.db` remained. Will this mechanism be followed for creating views? What are the considerations? - -> Ans: Keep the original behaviour unchanged, the introduction of view functionality will not change these original logics. - -#### Q4: Does it support sequence view renaming? - -> A: Renaming is not supported in the current version, you can create your own view with new name to put it into use. \ No newline at end of file diff --git a/src/UserGuide/V1.3.x/User-Manual/Streaming_timecho.md b/src/UserGuide/V1.3.x/User-Manual/Streaming_timecho.md deleted file mode 100644 index 80edebe9c..000000000 --- a/src/UserGuide/V1.3.x/User-Manual/Streaming_timecho.md +++ /dev/null @@ -1,857 +0,0 @@ - - -# Stream Computing Framework - -The IoTDB stream processing framework allows users to implement customized stream processing logic, which can monitor and capture storage engine changes, transform changed data, and push transformed data outward. - -We call a data flow processing task a Pipe. 
A stream processing task (Pipe) contains three subtasks: - -- Source task -- Processor task -- Sink task - -The stream processing framework allows users to customize the processing logic of the three subtasks in Java and to process data in a UDF-like manner. -In a Pipe, the three subtasks are executed by three plugins respectively, and the data is processed by these three plugins in turn: -Pipe Source is used to extract data, Pipe Processor is used to process data, Pipe Sink is used to send data, and the final data will be sent to an external system. - -**The model of the Pipe task is as follows:** - -![pipe.png](/img/1706778988482.jpg) - -Describing a data flow processing task essentially means describing the properties of the Pipe Source, Pipe Processor and Pipe Sink plugins. -Users can declaratively configure the specific attributes of the three subtasks through SQL statements, and achieve flexible data ETL capabilities by combining different attributes. - -Using the stream processing framework, a complete data link can be built to meet the needs of edge-cloud synchronization, off-site disaster recovery, and read/write load splitting across databases. - -## Custom stream processing plugin development - -### Programming development dependencies - -It is recommended to use Maven to build the project and add the following dependency in `pom.xml`. Please be careful to select the same dependency version as the IoTDB server version. - -```xml -<dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>pipe-api</artifactId> - <version>1.3.1</version> - <scope>provided</scope> -</dependency> -``` - -### Event-driven programming model - -The user programming interface of the stream processing plugin follows the general design of the event-driven programming model. Events are data abstractions in the user programming interface, and the programming interface is decoupled from the specific execution method. The user only needs to describe how the system should process an event (data) once it reaches the system.
- -In the user programming interface of the stream processing plugin, events are an abstraction of database data writing operations. The event is captured by the stand-alone stream processing engine, and is passed to the PipeSource plugin, PipeProcessor plugin, and PipeSink plugin in sequence according to the three-stage stream processing process, and triggers the execution of user logic in the three plugins in turn. - -In order to achieve both low latency under low load and high throughput under high load on the end side, the stream processing engine dynamically chooses between operation logs and data files as the objects to process. Therefore, the user programming interface of stream processing requires users to provide processing logic for the following two types of events: operation log writing event TabletInsertionEvent and data file writing event TsFileInsertionEvent. - -#### **Operation log writing event (TabletInsertionEvent)** - -The operation log writing event (TabletInsertionEvent) is a high-level data abstraction for user write requests. It provides users with the ability to manipulate the underlying data of write requests by providing a unified operation interface. - -For different database deployment methods, the underlying storage structures corresponding to operation log writing events are different. For stand-alone deployment scenarios, the operation log writing event is an encapsulation of write-ahead log (WAL) entries; for a distributed deployment scenario, the operation log writing event is an encapsulation of a single node consensus protocol operation log entry. - -For write operations generated by different write request interfaces in the database, the request data structure corresponding to the operation log writing event is also different.
IoTDB provides numerous writing interfaces such as InsertRecord, InsertRecords, InsertTablet, InsertTablets, etc. Each writing request uses a completely different serialization method, and the generated binary entries are also different. - -The existence of operation log writing events provides users with a unified view of data operations, which shields the implementation differences of the underlying data structure, greatly reduces the user's programming threshold, and improves the ease of use of the function. - -```java -/** TabletInsertionEvent is used to define the event of data insertion. */ -public interface TabletInsertionEvent extends Event { - - /** - * The consumer processes the data row by row and collects the results by RowCollector. - * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processRowByRow(BiConsumer<Row, RowCollector> consumer); - - /** - * The consumer processes the Tablet directly and collects the results by RowCollector. - * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processTablet(BiConsumer<Tablet, RowCollector> consumer); -} -``` - -#### **Data file writing event (TsFileInsertionEvent)** - -The data file writing event (TsFileInsertionEvent) is a high-level abstraction of the database file writing operation. It is a data collection of several operation log writing events (TabletInsertionEvent). - -The storage engine of IoTDB is LSM structured. When data is written, the writing operation will first be placed into a log-structured file, and the written data will be stored in the memory at the same time. When the memory reaches the control upper limit, the disk flushing behavior will be triggered, that is, the data in the memory will be converted into a database file, and the previously prewritten operation log will be deleted.
When the data in the memory is converted into the data in the database file, it will undergo two compression processes: encoding compression and general compression. Therefore, the data in the database file takes up less space than the original data in the memory. - -In extreme network conditions, directly transmitting data files is more economical than transmitting data writing operations. It will occupy lower network bandwidth and achieve faster transmission speeds. Of course, there is no free lunch. Computing and processing data in files requires additional file I/O costs compared to directly computing and processing data in memory. However, it is precisely the existence of two structures, disk data files and memory write operations, with their own advantages and disadvantages, that gives the system the opportunity to make dynamic trade-offs and adjustments. It is based on this observation that data file writing events are introduced into the plugin's event model. - -To sum up, the data file writing event appears in the event stream of the stream processing plugin in two situations: - -(1) Historical data extraction: Before a stream processing task starts, all written data that has been placed on the disk will exist in the form of TsFile. After a stream processing task starts, when collecting historical data, the historical data will be abstracted using TsFileInsertionEvent; - -(2) Real-time data extraction: While a stream processing task is in progress, if operation log writing events in the data stream are processed more slowly than write requests arrive, then after a certain point the operation log writing events that cannot be processed in time are persisted to disk and exist in the form of TsFile. After this data is extracted by the stream processing engine, TsFileInsertionEvent will be used as an abstraction. - -```java -/** - * TsFileInsertionEvent is used to define the event of writing TsFile.
Event data is stored on disk, - * which is compressed and encoded, and requires IO cost for computational processing. - */ -public interface TsFileInsertionEvent extends Event { - - /** - * The method is used to convert the TsFileInsertionEvent into several TabletInsertionEvents. - * - * @return {@code Iterable<TabletInsertionEvent>} the list of TabletInsertionEvent - */ - Iterable<TabletInsertionEvent> toTabletInsertionEvents(); -} -``` - -### Custom stream processing plugin programming interface definition - -Based on the custom stream processing plugin programming interface, users can easily write data extraction plugins, data processing plugins and data sending plugins, so that the stream processing function can be flexibly adapted to various industrial scenarios. - -#### Data extraction plugin interface - -Data extraction is the first of the three stages of stream processing, which run from data extraction to data sending. The data extraction plugin (PipeSource) is the bridge between the stream processing engine and the storage engine. It monitors the behavior of the storage engine and -captures various data write events. - -```java -/** - * PipeSource - * - *

PipeSource is responsible for capturing events from sources. - * - *

Various data sources can be supported by implementing different PipeSource classes. - * - *

The lifecycle of a PipeSource is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH SOURCE` clause in SQL are - * parsed and the validation method {@link PipeSource#validate(PipeParameterValidator)} will - * be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} will be called to - * configure the runtime behavior of the PipeSource. - *
  • Then the method {@link PipeSource#start()} will be called to start the PipeSource. - *
  • While the collaboration task is in progress, the method {@link PipeSource#supply()} will be - * called to capture events from sources and then the events will be passed to the - * PipeProcessor. - *
  • The method {@link PipeSource#close()} will be called when the collaboration task is - * cancelled (the `DROP PIPE` command is executed). - *
- */ -public interface PipeSource extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSource. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeSourceRuntimeConfiguration. - *
- * - *

This method is called after the method {@link PipeSource#validate(PipeParameterValidator)} - * is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSource - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSourceRuntimeConfiguration configuration) - throws Exception; - - /** - * Start the Source. After this method is called, events should be ready to be supplied by - * {@link PipeSource#supply()}. This method is called after {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @throws Exception the user can throw errors if necessary - */ - void start() throws Exception; - - /** - * Supply single event from the Source and the caller will send the event to the processor. - * This method is called after {@link PipeSource#start()} is called. - * - * @return the event to be supplied. the event may be null if the Source has no more events at - * the moment, but the Source is still running for more events. - * @throws Exception the user can throw errors if necessary - */ - Event supply() throws Exception; -} -``` - -#### Data processing plugin interface - -Data processing is the second stage of the three stages of stream processing data from data extraction to data sending. The data processing plugin (PipeProcessor) is mainly used to filter and transform the data captured by the data extraction plugin (PipeSource). -various events. - -```java -/** - * PipeProcessor - * - *

PipeProcessor is used to filter and transform the Event formed by the PipeSource. - * - *

The lifecycle of a PipeProcessor is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH PROCESSOR` clause in SQL are - * parsed and the validation method {@link PipeProcessor#validate(PipeParameterValidator)} - * will be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} will be called - * to configure the runtime behavior of the PipeProcessor. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeSource captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the events and then passes them to the PipeSink. The - * following 3 methods will be called: {@link - * PipeProcessor#process(TabletInsertionEvent, EventCollector)}, {@link - * PipeProcessor#process(TsFileInsertionEvent, EventCollector)} and {@link - * PipeProcessor#process(Event, EventCollector)}. - *
    • PipeSink serializes the events into binaries and sends them to sinks. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeProcessor#close() } method will be called. - *
- */ -public interface PipeProcessor extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeProcessor. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeProcessorRuntimeConfiguration. - *
- * - *

This method is called after the method {@link - * PipeProcessor#validate(PipeParameterValidator)} is called and before the beginning of the - * events processing. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeProcessor - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeProcessorRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is called to process the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(TabletInsertionEvent tabletInsertionEvent, EventCollector eventCollector) - throws Exception; - - /** - * This method is called to process the TsFileInsertionEvent. - * - * @param tsFileInsertionEvent TsFileInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - default void process(TsFileInsertionEvent tsFileInsertionEvent, EventCollector eventCollector) - throws Exception { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - process(tabletInsertionEvent, eventCollector); - } - } - - /** - * This method is called to process the Event. - * - * @param event Event to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(Event event, EventCollector eventCollector) throws Exception; -} -``` - -#### Data sending plugin interface - -Data sending is the third stage of the three stages of stream processing data from data extraction to data sending. 
The data sending plugin (PipeSink) is mainly used to send the various events processed by the data processing plugin (PipeProcessor). -It serves as the network implementation layer of the stream processing framework, and its interface should allow access to multiple real-time communication protocols and multiple sinks. - -```java -/** - * PipeSink - * - *

PipeSink is responsible for sending events to sinks. - * - *

Various network protocols can be supported by implementing different PipeSink classes. - * - *

The lifecycle of a PipeSink is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH SINK` clause in SQL are - * parsed and the validation method {@link PipeSink#validate(PipeParameterValidator)} will be - * called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link PipeSink#customize(PipeParameters, - * PipeSinkRuntimeConfiguration)} will be called to configure the runtime behavior of the - * PipeSink and the method {@link PipeSink#handshake()} will be called to create a connection - * with sink. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeSource captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the events and then passes them to the PipeSink. - *
    • PipeSink serializes the events into binaries and sends them to sinks. The following 3 - * methods will be called: {@link PipeSink#transfer(TabletInsertionEvent)}, {@link - * PipeSink#transfer(TsFileInsertionEvent)} and {@link PipeSink#transfer(Event)}. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeSink#close() } method will be called. - *
- * - *

In addition, the method {@link PipeSink#heartbeat()} will be called periodically to check - * whether the connection with sink is still alive. The method {@link PipeSink#handshake()} will be - * called to create a new connection with the sink when the method {@link PipeSink#heartbeat()} - * throws exceptions. - */ -public interface PipeSink extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSink. In this method, the user can do the following - * things: - * - *

    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeSinkRuntimeConfiguration. - *
- * - *

This method is called after the method {@link PipeSink#validate(PipeParameterValidator)} is - * called and before the method {@link PipeSink#handshake()} is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSink - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSinkRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is used to create a connection with sink. This method will be called after the - * method {@link PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called or - * will be called when the method {@link PipeSink#heartbeat()} throws exceptions. - * - * @throws Exception if the connection is failed to be created - */ - void handshake() throws Exception; - - /** - * This method will be called periodically to check whether the connection with sink is still - * alive. - * - * @throws Exception if the connection dies - */ - void heartbeat() throws Exception; - - /** - * This method is used to transfer the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(TabletInsertionEvent tabletInsertionEvent) throws Exception; - - /** - * This method is used to transfer the TsFileInsertionEvent. 
- * - * @param tsFileInsertionEvent TsFileInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - default void transfer(TsFileInsertionEvent tsFileInsertionEvent) throws Exception { - try { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - transfer(tabletInsertionEvent); - } - } finally { - tsFileInsertionEvent.close(); - } - } - - /** - * This method is used to transfer the generic events, including HeartbeatEvent. - * - * @param event Event to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(Event event) throws Exception; -} -``` - -## Custom stream processing plugin management - -To keep user-defined plugins flexible and easy to use in production, the system also needs to provide the ability to manage plugins dynamically and uniformly. -The stream processing plugin management statements introduced in this chapter provide the entry point for this dynamic, unified plugin management. - -### Load plugin statement - -In IoTDB, to dynamically load a user-defined plugin into the system, you first need to implement a plugin class based on PipeSource, PipeProcessor or PipeSink. -The plugin class then needs to be compiled and packaged into a JAR file, and finally the plugin is loaded into IoTDB using the management statement for loading plugins. - -The syntax of the management statement for loading a plugin is shown below. - -```sql -CREATE PIPEPLUGIN [IF NOT EXISTS] <alias> -AS <full class name> -USING <URI of JAR package> -``` -**IF NOT EXISTS semantics**: Used in creation operations to ensure that the create command is executed only when the specified Pipe Plugin does not exist, preventing errors caused by attempting to create an existing Pipe Plugin. 
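The `IF NOT EXISTS` rule can be illustrated with a small Java sketch. This is only a model of the semantics described above, under stated assumptions: `PluginRegistry`, `create` and `classNameOf` are hypothetical names invented for this illustration and are not part of IoTDB, and the in-memory map says nothing about how IoTDB actually stores plugin metadata.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory registry. It only models the IF NOT EXISTS rule
// and is not how IoTDB manages plugins internally.
class PluginRegistry {
  private final Map<String, String> aliasToClassName = new HashMap<>();

  /**
   * Models CREATE PIPEPLUGIN [IF NOT EXISTS].
   *
   * @return true if a new plugin was registered, false if the alias already
   *     existed and ifNotExists was set (the statement becomes a no-op)
   * @throws IllegalStateException if the alias exists and ifNotExists is false
   */
  boolean create(String alias, String className, boolean ifNotExists) {
    if (aliasToClassName.containsKey(alias)) {
      if (ifNotExists) {
        return false; // skip silently: the plugin is already registered
      }
      throw new IllegalStateException("Pipe plugin already exists: " + alias);
    }
    aliasToClassName.put(alias, className);
    return true;
  }

  /** Returns the class name registered under the alias, or null if absent. */
  String classNameOf(String alias) {
    return aliasToClassName.get(alias);
  }

  public static void main(String[] args) {
    PluginRegistry registry = new PluginRegistry();
    // First creation registers the plugin.
    System.out.println(
        registry.create("example", "edu.tsinghua.iotdb.pipe.ExampleProcessor", true));
    // Repeating the statement with IF NOT EXISTS is a no-op instead of an error.
    System.out.println(
        registry.create("example", "edu.tsinghua.iotdb.pipe.ExampleProcessor", true));
  }
}
```

Without the `IF NOT EXISTS` flag, the second call in this sketch would throw, which mirrors the error the clause is meant to prevent.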
- -Example: Suppose you have implemented a data processing plugin named edu.tsinghua.iotdb.pipe.ExampleProcessor, packaged it as pipe-plugin.jar, and want to use it in the stream processing engine under the alias example. There are two ways to provide the plugin package: upload it to a URI server, or place it in a local directory of the cluster. - -Method 1: Upload to the URI server - -Preparation: To register in this way, you need to upload the JAR package to the URI server in advance and ensure that the IoTDB instance that executes the registration statement can access the URI server. For example https://example.com:8080/iotdb/pipe-plugin.jar . - -SQL: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI 'https://example.com:8080/iotdb/pipe-plugin.jar' -``` - -Method 2: Place the JAR package in a local directory of the cluster - -Preparation: To register in this way, you need to place the JAR package in any path on the machine where the DataNode is located. We recommend placing it in the /ext/pipe directory of the IoTDB installation path (this directory already exists in the installation package, so you do not need to create it). For example: iotdb-1.x.x-bin/ext/pipe/pipe-plugin.jar. **(Note: If you are using a cluster, you need to place the JAR package under the same path on the machine of every DataNode)** - -SQL: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI '<file URI of pipe-plugin.jar>' -``` - -### Delete plugin statement - -When a user no longer wants to use a plugin and needs to uninstall it from the system, they can use the delete plugin statement shown below. 
- -```sql -DROP PIPEPLUGIN [IF EXISTS] -``` - -**IF EXISTS semantics**: Used in deletion operations to ensure that when a specified Pipe Plugin exists, the delete command is executed to prevent errors caused by attempting to delete a non-existent Pipe Plugin. - -### View plugin statements - -Users can also view plugins in the system on demand. View the statement of the plugin as shown in the figure. -```sql -SHOW PIPEPLUGINS -``` - -## System preset stream processing plugin - -### Pre-built Source Plugin - -#### iotdb-source - -Function: Extract historical or realtime data inside IoTDB into pipe. - - -| key | value | value range | required or optional with default | -|---------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------|-----------------------------------| -| source | iotdb-source | String: iotdb-source | required | -| source.pattern | path prefix for filtering time series | String: any time series prefix | optional: root | -| source.history.start-time | start of synchronizing historical data event time,including start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| source.history.end-time | end of synchronizing historical data event time,including end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| source.forwarding-pipe-requests | Whether to forward data written by another Pipe (usually Data Sync) | Boolean: true, false | optional:true | -| start-time(V1.3.1+) | start of synchronizing all data event time,including start-time. Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| end-time(V1.3.1+) | end of synchronizing all data event time,including end-time. 
Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| source.realtime.mode | Extraction mode for real-time data | String: hybrid, stream, batch | optional:hybrid | -| source.forwarding-pipe-requests | Whether to forward data written by another Pipe (usually Data Sync) | Boolean: true, false | optional:true | - -> 🚫 **source.pattern Parameter Description** -> -> * Pattern should use backquotes to modify illegal characters or illegal path nodes, for example, if you want to filter root.\`a@b\` or root.\`123\`, you should set the pattern to root.\`a@b\` or root.\`123\`(Refer specifically to [Timing of single and double quotes and backquotes](https://iotdb.apache.org/Download/)) -> * In the underlying implementation, when pattern is detected as root (default value) or a database name, synchronization efficiency is higher, and any other format will reduce performance. -> * The path prefix does not need to form a complete path. For example, when creating a pipe with the parameter 'source.pattern'='root.aligned.1': - > - > * root.aligned.1TS - > * root.aligned.1TS.\`1\` - > * root.aligned.100TS - > - > the data will be synchronized; - > - > * root.aligned.\`1\` -> * root.aligned.\`123\` - > - > the data will not be synchronized. - -> ❗️**start-time, end-time parameter description of source** -> -> * start-time, end-time should be in ISO format, such as 2011-12-03T10:15:30 or 2011-12-03T10:15:30+01:00. However, version 1.3.1+ supports timeStamp format like 1706704494000. - -> ✅ **A piece of data from production to IoTDB contains two key concepts of time** -> -> * **event time:** The time when the data is actually produced (or the generation time assigned to the data by the data production system, which is the time item in the data point), also called event time. -> * **arrival time:** The time when data arrives in the IoTDB system. 
-> -> Out-of-order data refers to data whose **event time** is far behind the current system time (or the maximum **event time** that has already been persisted to disk) when the data arrives. On the other hand, whether data is out-of-order or sequential, as long as it newly arrives in the system, its **arrival time** increases with the order in which the data arrives at IoTDB. - -> 💎 **The work of iotdb-source can be split into two stages** -> -> 1. Historical data extraction: All data with **arrival time** < **current system time** when the pipe is created is called historical data -> 2. Realtime data extraction: All data with **arrival time** >= **current system time** when the pipe is created is called realtime data -> -> The historical data transmission phase and the realtime data transmission phase are executed serially. The realtime data transmission phase is executed only after the historical data transmission phase is completed. - -> 📌 **source.realtime.mode: Data extraction mode** -> -> * log: In this mode, the task only uses the operation log for data processing and sending. -> * file: In this mode, the task only uses data files for data processing and sending. -> * hybrid: This mode takes into account the low latency but low throughput of sending data one by one from the operation log, and the high throughput but high latency of sending data files in batches. It automatically selects the appropriate data extraction method under different write loads: it first uses the extraction method based on operation logs to ensure low sending latency; when a data backlog occurs, it automatically switches to the extraction method based on data files to ensure high sending throughput; when the backlog is eliminated, it automatically switches back to the extraction method based on operation logs. 
This avoids the difficulty of balancing data sending latency and throughput with a single data extraction algorithm. - -> 🍕 **source.forwarding-pipe-requests: Whether to allow forwarding data transmitted from another pipe** -> -> * If you want to use pipe to build data synchronization of A -> B -> C, then the pipe of B -> C needs to set this parameter to true, so that the data written by A to B through the pipe in A -> B can be correctly forwarded to C -> * If you want to use pipe to build two-way data synchronization (dual-active) of A \<-> B, then the pipes of A -> B and B -> A need to set this parameter to false, otherwise the data will be forwarded back and forth between the clusters in an endless loop - -### Preset processor plugin - -#### do-nothing-processor - -Function: No processing is done on the events passed in by the source. - - -| key | value | value range | required or optional with default | -|-----------|----------------------|------------------------------|-----------------------------------| -| processor | do-nothing-processor | String: do-nothing-processor | required | - -### Preset sink plugin - -#### do-nothing-sink - -Function: No processing is done on the events passed in by the processor. - -| key | value | value range | required or optional with default | -|------|-----------------|-------------------------|-----------------------------------| -| sink | do-nothing-sink | String: do-nothing-sink | required | - -## Stream processing task management - -### Create a stream processing task - -Use the `CREATE PIPE` statement to create a stream processing task. 
Taking the creation of a data synchronization stream processing task as an example, the sample SQL statement is as follows: - -```sql -CREATE PIPE -- PipeId is the name that uniquely identifies the sync task -WITH SOURCE ( - -- Default IoTDB Data Extraction Plugin - 'source' = 'iotdb-source', - -- Path prefix, only data that can match the path prefix will be extracted for subsequent processing and delivery - 'source.pattern' = 'root.timecho', - -- Whether to extract historical data - 'source.history.enable' = 'true', - -- Describes the time range of the historical data being extracted, indicating the earliest possible time - 'source.history.start-time' = '2011.12.03T10:15:30+01:00', - -- Describes the time range of the extracted historical data, indicating the latest time - 'source.history.end-time' = '2022.12.03T10:15:30+01:00', - -- Whether to extract realtime data - 'source.realtime.enable' = 'true', -) -WITH PROCESSOR ( - -- Default data processing plugin, means no processing - 'processor' = 'do-nothing-processor', -) -WITH SINK ( - -- IoTDB data sending plugin with target IoTDB - 'sink' = 'iotdb-thrift-sink', - -- Data service for one of the DataNode nodes on the target IoTDB ip - 'sink.ip' = '127.0.0.1', - -- Data service port of one of the DataNode nodes of the target IoTDB - 'sink.port' = '6667', -) -``` - -**When creating a stream processing task, you need to configure the PipeId and the parameters of the three plugin parts:** - -| Configuration | Description | Required or not | Default implementation | Default implementation description | Default implementation description | -|---------------|-----------------------------------------------------------------------------------------------------|---------------------------------|------------------------|---------------------------------------------------------------------------------------------------------------------------|------------------------------------| -| PipeId | A globally unique name that 
identifies a stream processing | Required | - | - | - | -| source | Pipe Source plugin, responsible for extracting stream processing data at the bottom of the database | Optional | iotdb-source | Integrate the full historical data of the database and subsequent real-time data arriving into the stream processing task | No | -| processor | Pipe Processor plugin, responsible for processing data | Optional | do-nothing-processor | Does not do any processing on the incoming data | Yes | -| sink | Pipe Sink plugin, responsible for sending data | Required | - | - | Yes | - -In the example, the iotdb-source, do-nothing-processor and iotdb-thrift-sink plugins are used to build the data flow processing task. IoTDB also has other built-in stream processing plugins, **please check the "System Preset Stream Processing plugin" section**. - -**A simplest example of the CREATE PIPE statement is as follows:** - -```sql -CREATE PIPE -- PipeId is a name that uniquely identifies the stream processing task -WITH SINK ( - -- IoTDB data sending plugin, the target is IoTDB - 'sink' = 'iotdb-thrift-sink', - --The data service IP of one of the DataNode nodes in the target IoTDB - 'sink.ip' = '127.0.0.1', - -- The data service port of one of the DataNode nodes in the target IoTDB - 'sink.port' = '6667', -) -``` - -The semantics expressed are: synchronize all historical data in this database instance and subsequent real-time data arriving to the IoTDB instance with the target 127.0.0.1:6667. - -**Notice:** - -- SOURCE and PROCESSOR are optional configurations. If you do not fill in the configuration parameters, the system will use the corresponding default implementation. -- SINK is a required configuration and needs to be configured declaratively in the CREATE PIPE statement -- SINK has self-reuse capability. 
For different stream processing tasks, if their SINKs have the same KV attributes (the values corresponding to the keys of all attributes are the same), then the system will only create one SINK instance in the end, so that connection resources are reused. - - - For example, there are the following declarations of two stream processing tasks, pipe1 and pipe2: - - ```sql - CREATE PIPE pipe1 - WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'sink.ip' = 'localhost', - 'sink.port' = '9999', - ) - - CREATE PIPE pipe2 - WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'sink.port' = '9999', - 'sink.ip' = 'localhost', - ) - ``` - -- Because their declarations of SINK are exactly the same (**even if the order of declaration of some attributes is different**), the framework will automatically reuse the SINKs they declared, and ultimately the SINKs of pipe1 and pipe2 will be the same instance. -- When the source is the default iotdb-source, and source.forwarding-pipe-requests is the default value true, please do not build an application scenario that includes circular data synchronization (it will cause an infinite loop): - - - IoTDB A -> IoTDB B -> IoTDB A - - IoTDB A -> IoTDB A - -### Start the stream processing task - -After the CREATE PIPE statement is successfully executed, the instances related to the stream processing task are created. In V1.3.0, the running status of the stream processing task is then set to STOPPED, that is, the task does not process data immediately. In version 1.3.1 and later, the status of the task is set to RUNNING after CREATE. 
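The version-dependent initial status and the effect of the management statements can be summarized in a small Java sketch. It is a hypothetical model written for this guide, not IoTDB code: the names `PipeState` and `PipeLifecycle` are assumptions, versions earlier than 1.3.0 are treated like 1.3.0 only for simplicity, and rejecting START or STOP on a dropped pipe is an assumption as well.

```java
// Hypothetical model of the pipe states described in this document.
enum PipeState { RUNNING, STOPPED, DROPPED }

final class PipeLifecycle {
  private PipeLifecycle() {}

  /**
   * State right after a successful CREATE PIPE: STOPPED in V1.3.0, RUNNING in
   * V1.3.1 and later. Versions before 1.3.0 are not covered by the text and
   * are treated like 1.3.0 here (assumption).
   */
  static PipeState initialState(int major, int minor, int patch) {
    boolean atLeast131 =
        major > 1 || (major == 1 && (minor > 3 || (minor == 3 && patch >= 1)));
    return atLeast131 ? PipeState.RUNNING : PipeState.STOPPED;
  }

  /** START PIPE: the task begins processing data. */
  static PipeState start(PipeState current) {
    requireNotDropped(current);
    return PipeState.RUNNING;
  }

  /** STOP PIPE: the task stops processing data. */
  static PipeState stop(PipeState current) {
    requireNotDropped(current);
    return PipeState.STOPPED;
  }

  /** DROP PIPE: allowed from any state, no prior STOP is required. */
  static PipeState drop(PipeState current) {
    return PipeState.DROPPED;
  }

  // Assumption: a dropped pipe cannot be started or stopped again.
  private static void requireNotDropped(PipeState current) {
    if (current == PipeState.DROPPED) {
      throw new IllegalStateException("pipe has been dropped");
    }
  }
}
```

In this model, a pipe created on V1.3.0 needs an explicit START before it processes data, while a pipe created on V1.3.1 or later is already RUNNING.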
- -You can use the START PIPE statement to make a stream processing task start processing data: - -```sql -START PIPE <PipeId> -``` - -### Stop the stream processing task - -Use the STOP PIPE statement to stop the stream processing task from processing data: - -```sql -STOP PIPE <PipeId> -``` - -### Delete stream processing tasks - -Use the DROP PIPE statement to stop the stream processing task from processing data (when the stream processing task status is RUNNING), and then delete the entire stream processing task: - -```sql -DROP PIPE <PipeId> -``` - -Users do not need to perform a STOP operation before deleting the stream processing task. - -### Display stream processing tasks - -Use the SHOW PIPES statement to view all stream processing tasks: - -```sql -SHOW PIPES -``` - -The query results are as follows: - -```sql -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -| ID| CreationTime| State|PipeSource|PipeProcessor|PipeSink|ExceptionMessage| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -|iotdb-kafka|2022-03-30T20:58:30.689|RUNNING| ...| ...| ...| {}| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -|iotdb-iotdb|2022-03-31T12:55:28.129|STOPPED| ...| ...| ...| TException: ...| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -``` - -You can use `<PipeId>` to specify the stream processing task whose status you want to see: - -```sql -SHOW PIPE <PipeId> -``` - -You can also use the WHERE clause to determine whether the Pipe Sink used by a certain \<PipeId\> is reused. - -```sql -SHOW PIPES -WHERE SINK USED BY <PipeId> -``` - -### Stream processing task running status migration - -A stream processing pipe will pass through various states during its managed life cycle: - -- **RUNNING:** The pipe is working properly - - When a pipe is successfully created, its initial state is RUNNING.(V1.3.1+) -- **STOPPED:** The pipe is stopped. 
When the pipeline is in this state, there are several possibilities: - - When a pipe is successfully created, its initial state is STOPPED.(V1.3.0) - - The user manually pauses a pipe that is in normal running status, and its status will passively change from RUNNING to STOPPED. - - When an unrecoverable error occurs during the running of a pipe, its status will automatically change from RUNNING to STOPPED -- **DROPPED:** The pipe task was permanently deleted - -The following diagram shows all states and state transitions: - -![State migration diagram](/img/%E7%8A%B6%E6%80%81%E8%BF%81%E7%A7%BB%E5%9B%BE.png) - -## authority management - -### Stream processing tasks - - -| Permission name | Description | -|-----------------|------------------------------------------------------------| -| USE_PIPE | Register a stream processing task. The path is irrelevant. | -| USE_PIPE | Start the stream processing task. The path is irrelevant. | -| USE_PIPE | Stop the stream processing task. The path is irrelevant. | -| USE_PIPE | Offload stream processing tasks. The path is irrelevant. | -| USE_PIPE | Query stream processing tasks. The path is irrelevant. | - -### Stream processing task plugin - - -| Permission name | Description | -|-----------------|----------------------------------------------------------------------| -| USE_PIPE | Register stream processing task plugin. The path is irrelevant. | -| USE_PIPE | Uninstall the stream processing task plugin. The path is irrelevant. | -| USE_PIPE | Query stream processing task plugin. The path is irrelevant. | - -## Configuration parameters - -In iotdb-system.properties: - -V1.3.0+: -```Properties -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. 
-# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_connector_timeout_ms=900000 - -# The maximum number of selectors that can be used in the async connector. -# pipe_async_connector_selector_number=1 - -# The core number of clients that can be used in the async connector. -# pipe_async_connector_core_client_number=8 - -# The maximum number of clients that can be used in the async connector. -# pipe_async_connector_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` - -V1.3.1+: -```Properties -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. 
-# pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` diff --git a/src/UserGuide/V1.3.x/User-Manual/Tiered-Storage_timecho.md b/src/UserGuide/V1.3.x/User-Manual/Tiered-Storage_timecho.md deleted file mode 100644 index b1d8cac66..000000000 --- a/src/UserGuide/V1.3.x/User-Manual/Tiered-Storage_timecho.md +++ /dev/null @@ -1,96 +0,0 @@ - - -# Tiered Storage -## Overview - -The Tiered storage functionality allows users to define multiple layers of storage, spanning across multiple types of storage media (Memory mapped directory, SSD, rotational hard discs or cloud storage). While memory and cloud storage is usually singular, the local file system storages can consist of multiple directories joined together into one tier. Meanwhile, users can classify data based on its hot or cold nature and store data of different categories in specified "tier". Currently, IoTDB supports the classification of hot and cold data through TTL (Time to live / age) of data. When the data in one tier does not meet the TTL rules defined in the current tier, the data will be automatically migrated to the next tier. - -## Parameter Definition - -To enable tiered storage in IoTDB, you need to configure the following aspects: - -1. configure the data catalogue and divide the data catalogue into different tiers -2. 
configure the TTL of the data managed in each tier to distinguish between the hot and cold data categories managed in different tiers.
-3. configure the minimum remaining storage space ratio for each tier so that, when the storage space of a tier reaches the threshold, the data of that tier is automatically migrated to the next tier (optional).
-
-The specific parameter definitions and their descriptions are as follows.
-
-| Configuration | Default | Description | Constraint |
-| --- | --- | --- | --- |
-| dn_data_dirs | data/datanode/data | Specifies the storage directories and divides them into tiers | Tiers are separated by semicolons, and directories within a single tier are separated by commas; cloud storage (OBJECT_STORAGE) can only be configured as the last tier and cannot be the first tier; at most one cloud storage tier is allowed; the remote storage directory is denoted by OBJECT_STORAGE |
-| tier_ttl_in_ms | -1 | Defines the maximum age of the data that each tier is responsible for | Tiers are separated by semicolons; the number of tiers must match the number defined by dn_data_dirs; "-1" means "unlimited" |
-| dn_default_space_usage_thresholds | 0.85 | Defines the maximum storage usage ratio for the data directories of each tier. When the used space exceeds this ratio, the data is automatically migrated to the next tier. If the storage usage of the last tier exceeds this threshold, the system is set to `READ_ONLY` mode. | Tiers are separated by semicolons; the number of tiers must match the number defined by dn_data_dirs |
-| object_storage_type | AWS_S3 | Cloud storage type | IoTDB currently supports only AWS S3 as the remote storage type, and this parameter cannot be modified |
-| object_storage_bucket | iotdb_data | Name of the cloud storage bucket | The bucket defined in AWS S3; no configuration is required if remote storage is not used |
-| object_storage_endpoint | | Endpoint of the cloud storage | Endpoint of AWS S3; no configuration is required if remote storage is not used |
-| object_storage_access_key | | Cloud storage credential: key | AWS S3 credential key; no configuration is required if remote storage is not used |
-| object_storage_access_secret | | Cloud storage credential: secret | AWS S3 credential secret; no configuration is required if remote storage is not used |
-| remote_tsfile_cache_dirs | data/datanode/data/cache | Local cache directory for cloud storage | No configuration is required if remote storage is not used |
-| remote_tsfile_cache_page_size_in_kb | 20480 | Block size of the local cache files for cloud storage | No configuration is required if remote storage is not used |
-| remote_tsfile_cache_max_disk_usage_in_mb | 51200 | Maximum disk space used by the local cache for cloud storage | No configuration is required if remote storage is not used |
-
-## Local Tiered Storage Configuration Example
-
-The following is an example of a local two-level storage configuration.
-
-```JavaScript
-//Required configuration items
-dn_data_dirs=/data1/data;/data2/data,/data3/data
-tier_ttl_in_ms=86400000;-1
-dn_default_space_usage_thresholds=0.2;0.1
-```
-
-In this example, two levels of storage are configured, specifically:
-
-| **tier** | **data path** | **data range** | **threshold for minimum remaining disk space** |
-| -------- | -------------------------------------- | --------------- | ------------------------ |
-| tier 1 | path 1:/data1/data | data from the last 1 day | 20% |
-| tier 2 | path 1:/data2/data path 2:/data3/data | data older than 1 day | 10% |
-
-## Remote Tiered Storage Configuration Example
-
-The following takes three-level storage as an example:
-
-```JavaScript
-//Required configuration items
-dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE
-tier_ttl_in_ms=86400000;864000000;-1
-dn_default_space_usage_thresholds=0.2;0.15;0.1
-object_storage_type=AWS_S3
-object_storage_bucket=iotdb
-object_storage_endpoint=
-object_storage_access_key=
-object_storage_access_secret=
-
-// Optional configuration items
-remote_tsfile_cache_dirs=data/datanode/data/cache
-remote_tsfile_cache_page_size_in_kb=20971520
-remote_tsfile_cache_max_disk_usage_in_mb=53687091200
-```
-
-In this example, a total of three levels of storage are configured, specifically:
-
-| **tier** | **data path** | **data range** | **threshold for minimum remaining disk space** |
-| -------- | -------------------------------------- | ---------------------------- | ------------------------ |
-| tier1 | path 1:/data1/data | data from the last 1 day | 20% |
-| tier2 | path 1:/data2/data path 2:/data3/data | data between 1 and 10 days old | 15% |
-| tier3 | Remote AWS S3 Storage | data older than 10 days | 10% |
diff --git a/src/UserGuide/V1.3.x/User-Manual/User-defined-function_timecho.md b/src/UserGuide/V1.3.x/User-Manual/User-defined-function_timecho.md
deleted file mode 100644
index f37270df7..000000000
---
a/src/UserGuide/V1.3.x/User-Manual/User-defined-function_timecho.md +++ /dev/null @@ -1,953 +0,0 @@ -# UDF - -## 1. UDF Introduction - -UDF (User Defined Function) refers to user-defined functions. IoTDB provides a variety of built-in time series processing functions and also supports extending custom functions to meet more computing needs. - -In IoTDB, you can expand two types of UDF: - - - - - - - - - - - - - - - - - - - - - - -
-| UDF Class | AccessStrategy | Description |
-| --- | --- | --- |
-| UDTF | MAPPABLE_ROW_BY_ROW | Custom scalar function: takes k columns of time series and 1 row of data as input and outputs 1 column of time series and 1 row of data. It can be used in any clause or expression where a scalar function can appear, such as the SELECT clause and the WHERE clause. |
-| UDTF | ROW_BY_ROW<br>SLIDING_TIME_WINDOW<br>SLIDING_SIZE_WINDOW<br>SESSION_TIME_WINDOW<br>STATE_WINDOW | Custom time series generation function: takes k columns of time series and m rows of data as input and outputs 1 column of time series and n rows of data. The number of input rows m can differ from the number of output rows n. It can only be used in SELECT clauses. |
-| UDAF | - | Custom aggregation function: takes k columns of time series and m rows of data as input and outputs 1 column of time series and 1 row of data. It can be used in any clause or expression where an aggregation function can appear, such as the SELECT clause and the HAVING clause. |
- -### 1.1 UDF usage - -The usage of UDF is similar to that of regular built-in functions, and can be directly used in SELECT statements like calling regular functions. - -#### 1.Basic SQL syntax support - -* Support `SLIMIT` / `SOFFSET` -* Support `LIMIT` / `OFFSET` -* Support queries with value filters -* Support queries with time filters - - -#### 2. Queries with * in SELECT Clauses - -Assume that there are 2 time series (`root.sg.d1.s1` and `root.sg.d1.s2`) in the system. - -* **`SELECT example(*) from root.sg.d1`** - -Then the result set will include the results of `example (root.sg.d1.s1)` and `example (root.sg.d1.s2)`. - -* **`SELECT example(s1, *) from root.sg.d1`** - -Then the result set will include the results of `example(root.sg.d1.s1, root.sg.d1.s1)` and `example(root.sg.d1.s1, root.sg.d1.s2)`. - -* **`SELECT example(*, *) from root.sg.d1`** - -Then the result set will include the results of `example(root.sg.d1.s1, root.sg.d1.s1)`, `example(root.sg.d1.s2, root.sg.d1.s1)`, `example(root.sg.d1.s1, root.sg.d1.s2)` and `example(root.sg.d1.s2, root.sg.d1.s2)`. - -#### 3. Queries with Key-value Attributes in UDF Parameters - -You can pass any number of key-value pair parameters to the UDF when constructing a UDF query. The key and value in the key-value pair need to be enclosed in single or double quotes. Note that key-value pair parameters can only be passed in after all time series have been passed in. Here is a set of examples: - - Example: -``` sql -SELECT example(s1, 'key1'='value1', 'key2'='value2'), example(*, 'key3'='value3') FROM root.sg.d1; -SELECT example(s1, s2, 'key1'='value1', 'key2'='value2') FROM root.sg.d1; -``` - -#### 4. Nested Queries - - Example: -``` sql -SELECT s1, s2, example(s1, s2) FROM root.sg.d1; -SELECT *, example(*) FROM root.sg.d1 DISABLE ALIGN; -SELECT s1 * example(* / s1 + s2) FROM root.sg.d1; -SELECT s1, s2, s1 + example(s1, s2), s1 - example(s1 + example(s1, s2) / s2) FROM root.sg.d1; -``` - -## 2. 
UDF management
-
-### 2.1 UDF Registration
-
-The process of registering a UDF in IoTDB is as follows:
-
-1. Implement a complete UDF class, assuming the full class name of this class is `org.apache.iotdb.udf.UDTFExample`.
-2. Package the project into a JAR. If you use Maven to manage the project, you can refer to the [Maven project example](https://github.com/apache/iotdb/tree/master/example/udf).
-3. Make preparations for registration according to the registration method. For details, see the following example.
-4. Use the following SQL statement to register the UDF.
-
-```sql
-CREATE FUNCTION <UDF-NAME> AS <UDF-CLASS-FULL-PATHNAME> (USING URI URI-STRING)
-```
-
-#### Example: to register a UDF named `example`, you can choose either of the following two registration methods
-
-#### Method 1: Manually place the jar package
-
-Prepare:
-When registering using this method, you must place the JAR package in advance in the `ext/udf` directory (configurable) of all nodes in the cluster.
-
-Registration statement:
-
-```sql
-CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample'
-```
-
-#### Method 2: Cluster automatically installs jar packages through URI
-
-Prepare:
-When registering using this method, you must upload the JAR package to the URI server in advance and ensure that the IoTDB instance executing the registration statement can access the URI server.
-
-Registration statement:
-
-```sql
-CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' USING URI 'http://jar/example.jar'
-```
-
-IoTDB will download the JAR package and synchronize it to the entire cluster.
-
-#### Note
-
-1. Since UDF instances are dynamically loaded through reflection, you do not need to restart the server during the UDF registration process.
-
-2. UDF function names are not case-sensitive.
-
-3. Please ensure that the function name given to the UDF is different from all built-in function names. A UDF with the same name as a built-in function cannot be registered.
-
-4. We recommend that you do not use classes that have the same class name but different function logic in different JAR packages. For example, suppose there are two UDFs (UDAF or UDTF), `udf1` and `udf2`, where the JAR package of udf1 is `udf1.jar` and the JAR package of udf2 is `udf2.jar`, and both JAR packages contain the `org.apache.iotdb.udf.ExampleUDTF` class. If you use both UDFs in the same SQL statement, the system will randomly load one of the two classes, which may cause inconsistent UDF execution behavior.
-
-### 2.2 UDF Deregistration
-
-The SQL syntax is as follows:
-
-```sql
-DROP FUNCTION <UDF-NAME>
-```
-
-Example: Uninstall the UDF from the above example:
-
-```sql
-DROP FUNCTION example
-```
-
-Note: For functions registered with USING URI, you need to remove the UDF's JAR files from the `installation_package/ext/udf/install` path on all nodes in the cluster.
-
-### 2.3 Show All Registered UDFs
-
-``` sql
-SHOW FUNCTIONS
-```
-
-### 2.4 UDF configuration
-
-- The storage directory of UDFs can be configured in `iotdb-system.properties`:
-``` Properties
-# UDF lib dir
-
-udf_lib_dir=ext/udf
-```
-
-- If a message indicating insufficient memory appears when using custom functions, change the following configuration parameters in `iotdb-system.properties` and restart the service.
-
-``` Properties
-
-# Used to estimate the memory usage of text fields in a UDF query.
-# It is recommended to set this value to be slightly larger than the average length of all text
-# effectiveMode: restart
-# Datatype: int
-udf_initial_byte_array_length_for_memory_control=48
-
-# How much memory may be used in ONE UDF query (in MB).
-# The upper limit is 20% of allocated memory for read.
-# effectiveMode: restart
-# Datatype: float
-udf_memory_budget_in_mb=30.0
-
-# UDF memory allocation ratio.
-# The parameter form is a:b:c, where a, b, and c are integers.
-# effectiveMode: restart
-udf_reader_transformer_collector_memory_proportion=1:1:1
-```
-
-### 2.5 UDF User Permissions
-
-
-Using UDFs requires the `USE_UDF` permission: only users with this permission are allowed to register, uninstall, and query with UDFs.
-
-For more information about user permissions, please refer to [Account Management Statements](../User-Manual/Authority-Management.md).
-
-
-## 3. UDF Libraries
-
-Based on the user-defined function capability, IoTDB provides a series of functions for time series data processing, including data quality, data profiling, anomaly detection, frequency domain analysis, data matching, data repairing, sequence discovery, machine learning, etc., which can meet the needs of industrial fields for time series data processing.
-
-You can refer to the [UDF Libraries](../SQL-Manual/UDF-Libraries_timecho.md) document to find the installation steps and registration statements for each function, to ensure that all required functions are registered correctly.
-
-
-
-## 4. UDF development
-
-### 4.1 UDF Development Dependencies
-
-If you use [Maven](http://search.maven.org/), you can search for the development dependency listed below in the [Maven repository](http://search.maven.org/). Please note that you must select the dependency version that matches the target IoTDB server version.
-
-``` xml
-<dependency>
-  <groupId>org.apache.iotdb</groupId>
-  <artifactId>udf-api</artifactId>
-  <version>1.0.0</version>
-  <scope>provided</scope>
-</dependency>
-```
-
-### 4.2 UDTF (User Defined Timeseries Generating Function)
-
-To write a UDTF, you need to implement the `org.apache.iotdb.udf.api.UDTF` interface, and implement at least the `beforeStart` method and one `transform` method.
- -#### Interface Description: - -| Interface definition | Description | Required to Implement | -| :----------------------------------------------------------- | :----------------------------------------------------------- | ----------------------------------------------------- | -| void validate(UDFParameterValidator validator) throws Exception | This method is mainly used to validate `UDFParameters` and it is executed before `beforeStart(UDFParameters, UDTFConfigurations)` is called. | Optional | -| void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception | The initialization method to call the user-defined initialization behavior before a UDTF processes the input data. Every time a user executes a UDTF query, the framework will construct a new UDF instance, and `beforeStart` will be called. | Required | -| Object transform(Row row) throws Exception | This method is called by the framework. This data processing method will be called when you choose to use the `MappableRowByRowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `Row`, and the transformation result should be returned. | Required to implement at least one `transform` method | -| void transform(Column[] columns, ColumnBuilder builder) throws Exception | This method is called by the framework. This data processing method will be called when you choose to use the `MappableRowByRowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `Column[]`, and the transformation result should be output by `ColumnBuilder`. You need to call the data collection method provided by `builder` to determine the output data. | Required to implement at least one `transform` method | -| void transform(Row row, PointCollector collector) throws Exception | This method is called by the framework. 
This data processing method will be called when you choose to use the `RowByRowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `Row`, and the transformation result should be output by `PointCollector`. You need to call the data collection method provided by `collector` to determine the output data. | Required to implement at least one `transform` method | -| void transform(RowWindow rowWindow, PointCollector collector) throws Exception | This method is called by the framework. This data processing method will be called when you choose to use the `SlidingSizeWindowAccessStrategy` or `SlidingTimeWindowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `RowWindow`, and the transformation result should be output by `PointCollector`. You need to call the data collection method provided by `collector` to determine the output data. | Required to implement at least one `transform` method | -| void terminate(PointCollector collector) throws Exception | This method is called by the framework. This method will be called once after all `transform` calls have been executed. In a single UDF query, this method will and will only be called once. You need to call the data collection method provided by `collector` to determine the output data. | Optional | -| void beforeDestroy() | This method is called by the framework after the last input data is processed, and will only be called once in the life cycle of each UDF instance. | Optional | - -In the life cycle of a UDTF instance, the calling sequence of each method is as follows: - -1. void validate(UDFParameterValidator validator) throws Exception -2. void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception -3. 
`Object transform(Row row) throws Exception` or `void transform(Column[] columns, ColumnBuilder builder) throws Exception` or `void transform(Row row, PointCollector collector) throws Exception` or `void transform(RowWindow rowWindow, PointCollector collector) throws Exception` -4. void terminate(PointCollector collector) throws Exception -5. void beforeDestroy() - -> Note that every time the framework executes a UDTF query, a new UDF instance will be constructed. When the query ends, the corresponding instance will be destroyed. Therefore, the internal data of the instances in different UDTF queries (even in the same SQL statement) are isolated. You can maintain some state data in the UDTF without considering the influence of concurrency and other factors. - -#### Detailed interface introduction: - -1. **void validate(UDFParameterValidator validator) throws Exception** - -The `validate` method is used to validate the parameters entered by the user. - -In this method, you can limit the number and types of input time series, check the attributes of user input, or perform any custom verification. - -Please refer to the [Javadoc](https://github.com/apache/iotdb/blob/rc/1.3.4-1/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/parameter/UDFParameterValidator.java) for the usage of `UDFParameterValidator`. - - -2. **void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception** - -This method is mainly used to customize UDTF. In this method, the user can do the following things: - -1. Use UDFParameters to get the time series paths and parse key-value pair attributes entered by the user. -2. Set the strategy to access the raw data and set the output data type in UDTFConfigurations. -3. Create resources, such as establishing external connections, opening files, etc. - - -2.1 **UDFParameters** - -`UDFParameters` is used to parse UDF parameters in SQL statements (the part in parentheses after the UDF function name in SQL). 
The input parameters have two parts. The first part is data types of the time series that the UDF needs to process, and the second part is the key-value pair attributes for customization. Only the second part can be empty. - - -Example: - -``` sql -SELECT UDF(s1, s2, 'key1'='iotdb', 'key2'='123.45') FROM root.sg.d; -``` - -Usage: - -``` java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - String stringValue = parameters.getString("key1"); // iotdb - Float floatValue = parameters.getFloat("key2"); // 123.45 - Double doubleValue = parameters.getDouble("key3"); // null - int intValue = parameters.getIntOrDefault("key4", 678); // 678 - // do something - - // configurations - // ... -} -``` - - -2.2 **UDTFConfigurations** - -You must use `UDTFConfigurations` to specify the strategy used by UDF to access raw data and the type of output sequence. - -Usage: - -``` java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - // parameters - // ... - - // configurations - configurations - .setAccessStrategy(new RowByRowAccessStrategy()) - .setOutputDataType(Type.INT32); -} -``` - -The `setAccessStrategy` method is used to set the UDF's strategy for accessing the raw data, and the `setOutputDataType` method is used to set the data type of the output sequence. - - 2.2.1 **setAccessStrategy** - - -Note that the raw data access strategy you set here determines which `transform` method the framework will call. Please implement the `transform` method corresponding to the raw data access strategy. Of course, you can also dynamically decide which strategy to set based on the attribute parameters parsed by `UDFParameters`. Therefore, two `transform` methods are also allowed to be implemented in one UDF. 
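
The dispatch rule described above can be pictured with a small plain-Java model. This is only a sketch under stated assumptions: it does not use the IoTDB UDF API, and `StrategyDispatchSketch`, `SumUdf`, the `window` attribute, and the `run` driver are hypothetical names introduced for illustration. It shows a UDF that chooses its strategy in `beforeStart` from a user attribute and therefore implements two `transform` overloads, of which the driver calls only the one that matches the chosen strategy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class StrategyDispatchSketch {

  // Stand-in for the access strategies; these are not the IoTDB classes.
  enum Strategy { ROW_BY_ROW, SLIDING_SIZE_WINDOW }

  // Hypothetical UDF that sums its input either per row or per window.
  static final class SumUdf {
    Strategy strategy;
    int windowSize;

    // Chooses the strategy from a user attribute, as beforeStart would.
    void beforeStart(Map<String, String> attributes) {
      String window = attributes.get("window");
      if (window == null) {
        strategy = Strategy.ROW_BY_ROW;
      } else {
        windowSize = Integer.parseInt(window);
        if (windowSize <= 0) {
          throw new IllegalArgumentException("window must be positive");
        }
        strategy = Strategy.SLIDING_SIZE_WINDOW;
      }
    }

    // Called only when ROW_BY_ROW was chosen.
    void transform(long[] row, List<Long> collector) {
      long sum = 0;
      for (long value : row) {
        sum += value;
      }
      collector.add(sum);
    }

    // Called only when SLIDING_SIZE_WINDOW was chosen.
    void transform(List<long[]> window, List<Long> collector) {
      long sum = 0;
      for (long[] row : window) {
        for (long value : row) {
          sum += value;
        }
      }
      collector.add(sum);
    }
  }

  // Toy driver: calls the transform overload that matches the chosen strategy.
  static List<Long> run(SumUdf udf, Map<String, String> attributes, List<long[]> rows) {
    udf.beforeStart(attributes);
    List<Long> collector = new ArrayList<>();
    if (udf.strategy == Strategy.ROW_BY_ROW) {
      for (long[] row : rows) {
        udf.transform(row, collector);
      }
    } else {
      for (int i = 0; i < rows.size(); i += udf.windowSize) {
        int end = Math.min(i + udf.windowSize, rows.size());
        udf.transform(rows.subList(i, end), collector);
      }
    }
    return collector;
  }

  public static void main(String[] args) {
    List<long[]> rows = List.of(new long[] {1, 2}, new long[] {3, 4}, new long[] {5, 6});
    System.out.println(run(new SumUdf(), Map.of(), rows)); // prints [3, 7, 11]
    System.out.println(run(new SumUdf(), Map.of("window", "2"), rows)); // prints [10, 11]
  }
}
```

With no attributes, the driver calls the row overload once per row; with `'window'='2'`, it calls the window overload once per group of two rows. The unused overload is never invoked.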
-
-The following are the strategies you can set:
-
-| Interface definition | Description | The `transform` Method to Call |
-| :-------------------------------- | :----------------------------------------------------------- | ------------------------------------------------------------ |
-| MappableRowByRowAccessStrategy | Custom scalar function.<br>The framework calls the `transform` method once for each row of raw input data, taking k columns of time series and 1 row of data as input and producing 1 column of time series and 1 row of data as output. It can be used in any clause or expression where scalar functions can appear, such as the SELECT clause and the WHERE clause. | void transform(Column[] columns, ColumnBuilder builder) throws Exception<br>Object transform(Row row) throws Exception |
-| RowByRowAccessStrategy | Custom time series generation function that processes raw data row by row.<br>The framework calls the `transform` method once for each row of raw input data, taking k columns of time series and 1 row of data as input and producing 1 column of time series and n rows of data as output.<br>When one series is input, each row serves as a data point of the input series.<br>When multiple series are input, the input series are aligned by time, and each row serves as a data point of the input series.<br>(A row may contain columns with a `null` value, but not all columns are `null`) | void transform(Row row, PointCollector collector) throws Exception |
-| SlidingTimeWindowAccessStrategy | Custom time series generation function that processes raw data in sliding time windows.<br>The framework calls the `transform` method once for each raw data input window, taking k columns of time series and m rows of data as input and producing 1 column of time series and n rows of data as output.<br>A window may contain multiple rows of data. After the input series are aligned by time, each window serves as a data point of the input series.<br>(Each window may have i rows, and each row may contain columns with a `null` value, but not all of them are `null`) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| SlidingSizeWindowAccessStrategy | Custom time series generation function that processes raw data in windows of a fixed number of rows, meaning that each data processing window contains a fixed number of rows (except for the last window).<br>The framework calls the `transform` method once for each raw data input window, taking k columns of time series and m rows of data as input and producing 1 column of time series and n rows of data as output.<br>A window may contain multiple rows of data. After the input series are aligned by time, each window serves as a data point of the input series.<br>(Each window may have i rows, and each row may contain columns with a `null` value, but not all of them are `null`) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| SessionTimeWindowAccessStrategy | Custom time series generation function that processes raw data in session windows.<br>The framework calls the `transform` method once for each raw data input window, taking k columns of time series and m rows of data as input and producing 1 column of time series and n rows of data as output.<br>A window may contain multiple rows of data. After the input series are aligned by time, each window serves as a data point of the input series.<br>(Each window may have i rows, and each row may contain columns with a `null` value, but not all of them are `null`) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| StateWindowAccessStrategy | Custom time series generation function that processes raw data in state windows.<br>The framework calls the `transform` method once for each raw data input window, taking 1 column of time series and m rows of data as input and producing 1 column of time series and n rows of data as output.<br>A window may contain multiple rows of data. Currently, windows can only be opened on one physical quantity, that is, one column of data. | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-
-
-#### Interface Description:
-
-- `MappableRowByRowAccessStrategy` and `RowByRowAccessStrategy`: Constructing either strategy does not require any parameters.
-
-- `SlidingTimeWindowAccessStrategy`
-
-Window opening diagram:
-
-
-
-`SlidingTimeWindowAccessStrategy` has several constructors, which accept 3 types of parameters:
-
-- Parameter 1: The display window on the time axis
-
-The first type of parameter is optional. If it is not provided, the beginning time of the display window is set to the minimum timestamp of the query result set, and the ending time of the display window is set to the maximum timestamp of the query result set.
-
-- Parameter 2: Time interval for dividing the time axis (must be positive)
-- Parameter 3: Time sliding step (not required to be greater than or equal to the time interval, but must be positive)
-
-The sliding step parameter is also optional. If it is not provided, the sliding step is set to the time interval for dividing the time axis.
-
-The relationship between the three types of parameters can be seen in the figure below. Please see the [Javadoc](https://github.com/apache/iotdb/blob/rc/1.3.4-1/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/SlidingTimeWindowAccessStrategy.java) for more details.
-
-
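
To make the interplay of the three parameter types concrete, here is a minimal plain-Java sketch that enumerates window boundaries. It is not IoTDB code: `SlidingTimeWindowSketch` and `windows` are hypothetical names, and the sketch assumes half-open windows `[start, end)` that are clipped to the end of the display window. Check the Javadoc for the boundary conventions the framework actually uses.

```java
import java.util.ArrayList;
import java.util.List;

final class SlidingTimeWindowSketch {

  // Returns {start, end} pairs; each window is assumed to be half-open: [start, end).
  static List<long[]> windows(long displayBegin, long displayEnd, long interval, long step) {
    if (interval <= 0 || step <= 0) {
      throw new IllegalArgumentException("interval and step must be positive");
    }
    List<long[]> result = new ArrayList<>();
    for (long start = displayBegin; start < displayEnd; start += step) {
      // Trailing windows are clipped, so they may be shorter than the interval.
      long end = Math.min(start + interval, displayEnd);
      result.add(new long[] {start, end});
    }
    return result;
  }

  // Omitting the step makes it equal to the interval (non-overlapping windows).
  static List<long[]> windows(long displayBegin, long displayEnd, long interval) {
    return windows(displayBegin, displayEnd, interval, interval);
  }

  public static void main(String[] args) {
    // Display window [0, 10), interval 4, step 3.
    for (long[] w : windows(0, 10, 4, 3)) {
      System.out.println("[" + w[0] + ", " + w[1] + ")");
    }
  }
}
```

For a display window of `[0, 10)` with interval 4 and step 3, the sketch yields `[0, 4)`, `[3, 7)`, `[6, 10)` and `[9, 10)`: consecutive windows overlap because the step is smaller than the interval, and the last window is shorter than the interval.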

- -> Note that the actual time interval of some of the last time windows may be less than the specified time interval parameter. In addition, there may be cases where the number of data rows in some time windows is 0. In these cases, the framework will also call the `transform` method for the empty windows. - -- `SlidingSizeWindowAccessStrategy` - -Window opening diagram: - - - -`SlidingSizeWindowAccessStrategy`: `SlidingSizeWindowAccessStrategy` has many constructors, you can pass 2 types of parameters to them: - -* Parameter 1: Window size. This parameter specifies the number of data rows contained in a data processing window. Note that the number of data rows in some of the last time windows may be less than the specified number of data rows. -* Parameter 2: Sliding step. This parameter means the number of rows between the first point of the next window and the first point of the current window. (This parameter is not required to be greater than or equal to the window size, but must be a positive number) - -The sliding step parameter is optional. If the parameter is not provided, the sliding step will be set to the same as the window size. - -- `SessionTimeWindowAccessStrategy` - -Window opening diagram: **Time intervals less than or equal to the given minimum time interval `sessionGap` are assigned in one group.** - - - -`SessionTimeWindowAccessStrategy`: `SessionTimeWindowAccessStrategy` has many constructors, you can pass 2 types of parameters to them: - -- Parameter 1: The display window on the time axis. -- Parameter 2: The minimum time interval `sessionGap` of two adjacent windows. - -- `StateWindowAccessStrategy` - -Window opening diagram: **For numerical data, if the state difference is less than or equal to the given threshold `delta`, it will be assigned in one group.** - - - -`StateWindowAccessStrategy` has four constructors. 
-
-- Constructor 1: For numerical data, there are 3 parameters: the start time and end time of the display window on the time axis, and the threshold `delta` for the allowable change within a single window.
-- Constructor 2: For text data and boolean data, there are 2 parameters: the start time and end time of the display window on the time axis. For both data types, the data within a single window is the same, so there is no need to provide an allowable change threshold.
-- Constructor 3: For numerical data, there is 1 parameter: the threshold `delta` for the allowable change within a single window. The start time of the display window is set to the smallest timestamp in the entire query result set, and the end time of the display window is set to the largest timestamp in the entire query result set.
-- Constructor 4: For text data and boolean data, no parameters are required. The start and end timestamps are determined as explained in Constructor 3.
-
-`StateWindowAccessStrategy` can only take one column as input for now.
-
-Please see the [Javadoc](https://github.com/apache/iotdb/blob/rc/1.3.4-1/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/StateWindowAccessStrategy.java) for more details.
-
- 2.2.2 **setOutputDataType**
-
-Note that the type of output sequence you set here determines the type of data that the `PointCollector` can actually receive in the `transform` method. The relationship between the output data type set in `setOutputDataType` and the actual data output type that `PointCollector` can receive is as follows:
-
-| Output Data Type Set in `setOutputDataType` | Data Type that `PointCollector` Can Receive |
-| :------------------------------------------ | :----------------------------------------------------------- |
-| INT32 | int |
-| INT64 | long |
-| FLOAT | float |
-| DOUBLE | double |
-| BOOLEAN | boolean |
-| TEXT | `java.lang.String` and `org.apache.iotdb.udf.api.type.Binary` |
-
-The type of the output time series of a UDTF is determined at runtime, which means that a UDTF can dynamically determine the type of its output time series according to the type of its input time series.
-Here is a simple example:
-
-```java
-void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception {
-  // do something
-  // ...
-
-  configurations
-    .setAccessStrategy(new RowByRowAccessStrategy())
-    .setOutputDataType(parameters.getDataType(0));
-}
-```
-
-3. **Object transform(Row row) throws Exception**
-
-You need to implement this method or `transform(Column[] columns, ColumnBuilder builder) throws Exception` when you specify `MappableRowByRowAccessStrategy` as the strategy the UDF uses to read the raw data.
-
-This method processes the raw data one row at a time. The raw data is input from `Row` and output through the returned object. You must return exactly one object for each input data point in a single `transform` method call, i.e., input and output are one-to-one. Note that the type of the output data points must be the same as the one you set in the `beforeStart` method, and the timestamps of the output data points must be strictly monotonically increasing.
-
-The following is a complete UDF example that implements the `Object transform(Row row) throws Exception` method. It is an adder that receives two columns of time series as input.
- -```java -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - private Type dataType; - - @Override - public void validate(UDFParameterValidator validator) throws Exception { - validator - .validateInputSeriesNumber(2) - .validateInputSeriesDataType(0, Type.INT64) - .validateInputSeriesDataType(1, Type.INT64); - } - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - dataType = parameters.getDataType(0); - configurations - .setAccessStrategy(new MappableRowByRowAccessStrategy()) - .setOutputDataType(dataType); - } - - @Override - public Object transform(Row row) throws Exception { - return row.getLong(0) + row.getLong(1); - } -} -``` - - - -4. **void transform(Column[] columns, ColumnBuilder builder) throws Exception** - -You need to implement this method or `Object transform(Row row) throws Exception` when you specify the strategy of UDF to read the original data as `MappableRowByRowAccessStrategy`. - -This method processes the raw data multiple rows at a time. After performance tests, we found that UDTF that process multiple rows at once perform better than those UDTF that process one data point at a time. The raw data is input from `Column[]` and output by `ColumnBuilder`. You must output a corresponding data point based on each input data point in a single `transform` method call, i.e., input and output are still one-to-one. 
It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements the `void transform(Column[] columns, ColumnBuilder builder) throws Exception` method. It is an adder that receives two columns of time series as input. - -```java -import org.apache.iotdb.tsfile.read.common.block.column.Column; -import org.apache.iotdb.tsfile.read.common.block.column.ColumnBuilder; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - private Type type; - - @Override - public void validate(UDFParameterValidator validator) throws Exception { - validator - .validateInputSeriesNumber(2) - .validateInputSeriesDataType(0, Type.INT64) - .validateInputSeriesDataType(1, Type.INT64); - } - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - type = parameters.getDataType(0); - configurations.setAccessStrategy(new MappableRowByRowAccessStrategy()).setOutputDataType(type); - } - - @Override - public void transform(Column[] columns, ColumnBuilder builder) throws Exception { - long[] inputs1 = columns[0].getLongs(); - long[] inputs2 = columns[1].getLongs(); - - int count = columns[0].getPositionCount(); - for (int i = 0; i < count; i++) { - builder.writeLong(inputs1[i] + inputs2[i]); - } - } -} -``` - -5. 
**void transform(Row row, PointCollector collector) throws Exception** - -You need to implement this method when you specify the strategy of UDF to read the original data as `RowByRowAccessStrategy`. - -This method processes the raw data one row at a time. The raw data is input from `Row` and output by `PointCollector`. You can output any number of data points in one `transform` method call. It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements the `void transform(Row row, PointCollector collector) throws Exception` method. It is an adder that receives two columns of time series as input. When neither data point in a row is `null`, this UDF outputs the algebraic sum of the two data points. - -```java -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT64) - .setAccessStrategy(new RowByRowAccessStrategy()); - } - - @Override - public void transform(Row row, PointCollector collector) throws Exception { - if (row.isNull(0) || row.isNull(1)) { - return; - } - collector.putLong(row.getTime(), row.getLong(0) + row.getLong(1)); - } -} -``` - -6. 
**void transform(RowWindow rowWindow, PointCollector collector) throws Exception** - -You need to implement this method when you specify the strategy of UDF to read the original data as `SlidingTimeWindowAccessStrategy` or `SlidingSizeWindowAccessStrategy`. - -This method processes a batch of data in a fixed number of rows or a fixed time interval each time, and we call the container containing this batch of data a window. The raw data is input from `RowWindow` and output by `PointCollector`. `RowWindow` can help you access a batch of `Row`, it provides a set of interfaces for random access and iterative access to this batch of `Row`. You can output any number of data points in one `transform` method call. It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -Below is a complete UDF example that implements the `void transform(RowWindow rowWindow, PointCollector collector) throws Exception` method. It is a counter that receives any number of time series as input, and its function is to count and output the number of data rows in each time window within a specified time range. 
- -```java -import java.io.IOException; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.access.RowWindow; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.SlidingTimeWindowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Counter implements UDTF { - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT32) - .setAccessStrategy(new SlidingTimeWindowAccessStrategy( - parameters.getLong("time_interval"), - parameters.getLong("sliding_step"), - parameters.getLong("display_window_begin"), - parameters.getLong("display_window_end"))); - } - - @Override - public void transform(RowWindow rowWindow, PointCollector collector) throws IOException { - if (rowWindow.windowSize() != 0) { - collector.putInt(rowWindow.windowStartTime(), rowWindow.windowSize()); - } - } -} -``` - -7. **void terminate(PointCollector collector) throws Exception** - -In some scenarios, a UDF needs to traverse all the original data to calculate the final output data points. The `terminate` interface provides support for those scenarios. - -This method is called after all `transform` calls are executed and before the `beforeDestroy` method is executed. You can implement the `transform` method to perform pure data processing (without outputting any data points), and implement the `terminate` method to output the processing results. - -The processing results need to be output by the `PointCollector`. You can output any number of data points in one `terminate` method call. 
It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -Below is a complete UDF example that implements the `void terminate(PointCollector collector) throws Exception` method. It takes one time series whose data type is `INT32` as input, and outputs the maximum value point of the series. - -```java -import java.io.IOException; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Max implements UDTF { - - private Long time; - private int value; - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT32) - .setAccessStrategy(new RowByRowAccessStrategy()); - } - - @Override - public void transform(Row row, PointCollector collector) { - if (row.isNull(0)) { - return; - } - int candidateValue = row.getInt(0); - if (time == null || value < candidateValue) { - time = row.getTime(); - value = candidateValue; - } - } - - @Override - public void terminate(PointCollector collector) throws IOException { - if (time != null) { - collector.putInt(time, value); - } - } -} -``` - -8. **void beforeDestroy()** - -The method for terminating a UDF. - -This method is called by the framework. For a UDF instance, `beforeDestroy` will be called after the last record is processed. In the entire life cycle of the instance, `beforeDestroy` will only be called once. - - - -### 4.3 UDAF (User Defined Aggregation Function) - -A complete definition of UDAF involves two classes, `State` and `UDAF`. 
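As a rough preview of how the two classes divide the work, the sketch below keeps the intermediate result (a running sum and count) in a state object and leaves the merging and final computation to separate logic. It is a minimal, standalone illustration only: the `State` interface here is a locally declared stand-in for `org.apache.iotdb.udf.api.State` so that the code compiles without the IoTDB UDF API, and `AvgLogic` with its `combine` and `finalResult` methods is a hypothetical helper, not part of any IoTDB interface.

```java
import java.nio.ByteBuffer;

// Stand-in for org.apache.iotdb.udf.api.State (assumption: declared locally
// so that this sketch compiles without the IoTDB UDF API on the classpath).
interface State {
  void reset();

  byte[] serialize();

  void deserialize(byte[] bytes);
}

// Intermediate result of an averaging aggregate: a running sum and a row count.
class AvgState implements State {
  double sum;
  long count;

  @Override
  public void reset() {
    sum = 0;
    count = 0;
  }

  @Override
  public byte[] serialize() {
    // Write order: sum first, then count.
    ByteBuffer buffer = ByteBuffer.allocate(Double.BYTES + Long.BYTES);
    buffer.putDouble(sum);
    buffer.putLong(count);
    return buffer.array();
  }

  @Override
  public void deserialize(byte[] bytes) {
    // Read order must match the write order in serialize().
    ByteBuffer buffer = ByteBuffer.wrap(bytes);
    sum = buffer.getDouble();
    count = buffer.getLong();
  }
}

// Hypothetical helper that mimics the merge and final-output steps of an aggregate.
class AvgLogic {
  // Merges the partial result in rhs into state.
  static void combine(AvgState state, AvgState rhs) {
    state.sum += rhs.sum;
    state.count += rhs.count;
  }

  // Returns the average, or null when the group is empty.
  static Double finalResult(AvgState state) {
    return state.count == 0 ? null : state.sum / state.count;
  }
}
```

Keeping all mutable data in the state object is what allows a partial result to be serialized, shipped to another node, and merged there; the sections that follow describe the actual interfaces.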
- -#### State Class - -To write your own `State`, you need to implement the `org.apache.iotdb.udf.api.State` interface. - -#### Interface Description: - -| Interface Definition | Description | Required to Implement | -| -------------------------------- | ------------------------------------------------------------ | --------------------- | -| void reset() | To reset the `State` object to its initial state, you need to fill in the initial values of the fields in the `State` class within this method as if you were writing a constructor. | Required | -| byte[] serialize() | Serializes `State` to binary data. This method is used for IoTDB internal `State` passing. Note that the order of serialization must be consistent with the following deserialization methods. | Required | -| void deserialize(byte[] bytes) | Deserializes binary data to `State`. This method is used for IoTDB internal `State` passing. Note that the order of deserialization must be consistent with the serialization method above. | Required | - -#### Detailed interface introduction: - -1. **void reset()** - -This method resets the `State` to its initial state, you need to fill in the initial values of the fields in the `State` object in this method. For optimization reasons, IoTDB reuses `State` as much as possible internally, rather than creating a new `State` for each group, which would introduce unnecessary overhead. When `State` has finished updating the data in a group, this method is called to reset to the initial state as a way to process the next group. - -In the case of `State` for averaging (aka `avg`), for example, you would need the sum of the data, `sum`, and the number of entries in the data, `count`, and initialize both to 0 in the `reset()` method. - -```java -class AvgState implements State { - double sum; - - long count; - - @Override - public void reset() { - sum = 0; - count = 0; - } - - // other methods -} -``` - -2. 
**byte[] serialize()/void deserialize(byte[] bytes)** - -These methods serialize the `State` into binary data, and deserialize the `State` from the binary data. IoTDB, as a distributed database, involves passing data among different nodes, so you need to write these two methods to enable the passing of the `State` among different nodes. Note that the order of serialization and deserialization must be consistent. - -In the case of `State` for averaging (aka `avg`), for example, you can convert the content of `State` to a `byte[]` array and read the content of `State` back from the `byte[]` array in any way you want. The following shows the code for serialization/deserialization using Java's `ByteBuffer`: - -```java -@Override -public byte[] serialize() { - ByteBuffer buffer = ByteBuffer.allocate(Double.BYTES + Long.BYTES); - buffer.putDouble(sum); - buffer.putLong(count); - - return buffer.array(); -} - -@Override -public void deserialize(byte[] bytes) { - ByteBuffer buffer = ByteBuffer.wrap(bytes); - sum = buffer.getDouble(); - count = buffer.getLong(); -} -``` - - - -#### UDAF Classes - -To write a UDAF, you need to implement the `org.apache.iotdb.udf.api.UDAF` interface. - -#### Interface Description: - -| Interface definition | Description | Required to Implement | -| ------------------------------------------------------------ | ------------------------------------------------------------ | --------------------- | -| void validate(UDFParameterValidator validator) throws Exception | This method is mainly used to validate `UDFParameters` and it is executed before `beforeStart(UDFParameters, UDAFConfigurations)` is called. | Optional | -| void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception | Initialization method that invokes user-defined initialization behavior before UDAF processes the input data. Unlike UDTF, configuration is of type `UDAFConfigurations`. 
| Required | -| State createState() | To create a `State` object, usually just call the default constructor and modify the default initial value as needed. | Required | -| void addInput(State state, Column[] columns, BitMap bitMap) | Update `State` object according to the incoming data `Column[]` in batch, note that last column `columns[columns.length - 1]` always represents the time column. In addition, `BitMap` represents the data that has been filtered out before, you need to manually determine whether the corresponding data has been filtered out when writing this method. | Required | -| void combineState(State state, State rhs) | Merge `rhs` state into `state` state. In a distributed scenario, the same set of data may be distributed on different nodes, IoTDB generates a `State` object for the partial data on each node, and then calls this method to merge it into the complete `State`. | Required | -| void outputFinal(State state, ResultValue resultValue) | Computes the final aggregated result based on the data in `State`. Note that according to the semantics of the aggregation, only one value can be output per group. | Required | -| void beforeDestroy() | This method is called by the framework after the last input data is processed, and will only be called once in the life cycle of each UDF instance. | Optional | - -In the life cycle of a UDAF instance, the calling sequence of each method is as follows: - -1. State createState() -2. void validate(UDFParameterValidator validator) throws Exception -3. void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception -4. void addInput(State state, Column[] columns, BitMap bitMap) -5. void combineState(State state, State rhs) -6. void outputFinal(State state, ResultValue resultValue) -7. void beforeDestroy() - -Similar to UDTF, every time the framework executes a UDAF query, a new UDF instance will be constructed. When the query ends, the corresponding instance will be destroyed. 
Therefore, the internal data of the instances in different UDAF queries (even in the same SQL statement) are isolated. You can maintain some state data in the UDAF without considering the influence of concurrency and other factors. - -#### Detailed interface introduction: - - -1. **void validate(UDFParameterValidator validator) throws Exception** - -Same as UDTF, the `validate` method is used to validate the parameters entered by the user. - -In this method, you can limit the number and types of input time series, check the attributes of user input, or perform any custom verification. - -2. **void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception** - - The `beforeStart` method does the same things as in UDTF: - -1. Use `UDFParameters` to get the time series paths and parse key-value pair attributes entered by the user. -2. Set the output data type in `UDAFConfigurations`. -3. Create resources, such as establishing external connections, opening files, etc. - -The role of the `UDFParameters` type can be seen above. - -2.2 **UDAFConfigurations** - -The difference from UDTF is that UDAF uses `UDAFConfigurations` as the type of `configuration` object. - -Currently, this class only supports setting the type of output data. - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // parameters - // ... 
- - // configurations - configurations - .setOutputDataType(Type.INT32); -} -``` - -The relationship between the output type set in `setOutputDataType` and the type of data output that `ResultValue` can actually receive is as follows: - -| The output type set in `setOutputDataType` | The output type that `ResultValue` can actually receive | -| ------------------------------------------ | ------------------------------------------------------- | -| INT32 | int | -| INT64 | long | -| FLOAT | float | -| DOUBLE | double | -| BOOLEAN | boolean | -| TEXT | org.apache.iotdb.udf.api.type.Binary | - -The output type of the UDAF is determined at runtime. You can dynamically determine the output sequence type based on the input type. - -Here is a simple example: - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // do something - // ... - - configurations - .setOutputDataType(parameters.getDataType(0)); -} -``` - -3. **State createState()** - - -This method creates and initializes a `State` object for UDAF. Due to the limitations of the Java language, you can only call the default constructor for the `State` class. The default constructor assigns a default initial value to all the fields in the class, and if that initial value does not meet your requirements, you need to initialize them manually within this method. - -The following is an example that includes manual initialization. Suppose you want to implement an aggregate function that multiplies all numbers in the group. The initial `State` value should then be 1, but the default constructor initializes it to 0, so you need to initialize `State` manually after calling the default constructor: - -```java -public State createState() { - MultiplyState state = new MultiplyState(); - state.result = 1; - return state; -} -``` - -4. **void addInput(State state, Column[] columns, BitMap bitMap)** - -This method updates the `State` object with the raw input data. 
For performance reasons, and to align with the IoTDB vectorized query engine, the raw input data is no longer a single data point but an array of columns, `Column[]`. Note that the last column (i.e. `columns[columns.length - 1]`) is always the time column, so you can also do different operations in UDAF depending on the time. - -Because the input is a batch of columns rather than a single data point, you need to skip filtered-out rows yourself, which is why the third parameter, `BitMap`, exists. It identifies which rows of these columns have been filtered out so that you can ignore them; in the example below, rows whose bit is not marked are skipped. - -Here's an example of `addInput()` that counts the number of items (aka count). It shows how you can use `BitMap` to ignore data that has been filtered out. Note that due to the limitations of the Java language, you need to explicitly cast the `State` object from the type defined in the interface to your custom `State` type at the beginning of the method; otherwise you won't be able to use the `State` object. - -```java -public void addInput(State state, Column[] columns, BitMap bitMap) { - CountState countState = (CountState) state; - - int count = columns[0].getPositionCount(); - for (int i = 0; i < count; i++) { - if (bitMap != null && !bitMap.isMarked(i)) { - continue; - } - if (!columns[0].isNull(i)) { - countState.count++; - } - } -} -``` - -5. **void combineState(State state, State rhs)** - - -This method combines two `State`s, or more precisely, updates the first `State` object with the second `State` object. IoTDB is a distributed database, and the data of the same group may be distributed on different nodes. For performance reasons, IoTDB will first aggregate some of the data on each node into `State`, and then merge the `State`s on different nodes that belong to the same group, which is what `combineState` does. - -Here's an example of `combineState()` for averaging (aka avg). 
Similar to `addInput`, you need to do an explicit type conversion for the two `State`s at the beginning. Also note that you are updating the value of the first `State` with the contents of the second `State`. - -```java -public void combineState(State state, State rhs) { - AvgState avgState = (AvgState) state; - AvgState avgRhs = (AvgState) rhs; - - avgState.count += avgRhs.count; - avgState.sum += avgRhs.sum; -} -``` - -6. **void outputFinal(State state, ResultValue resultValue)** - -This method calculates the final result from `State`. You need to access the various fields in `State`, derive the final result, and set the final result into the `ResultValue` object. IoTDB internally calls this method once at the end for each group. Note that according to the semantics of aggregation, the final result can only be one value. - -Here is another `outputFinal` example for averaging (aka avg). In addition to the forced type conversion at the beginning, you will also see a specific use of the `ResultValue` object, where the final result is set by `setXXX` (where `XXX` is the type name). - -```java -public void outputFinal(State state, ResultValue resultValue) { - AvgState avgState = (AvgState) state; - - if (avgState.count != 0) { - resultValue.setDouble(avgState.sum / avgState.count); - } else { - resultValue.setNull(); - } -} -``` - -7. **void beforeDestroy()** - - -The method for terminating a UDF. - -This method is called by the framework. For a UDF instance, `beforeDestroy` will be called after the last record is processed. In the entire life cycle of the instance, `beforeDestroy` will only be called once. - - -### 4.4 Maven Project Example - -If you use Maven, you can build your own UDF project referring to our **udf-example** module. You can find the project [here](https://github.com/apache/iotdb/tree/master/example/udf). - - -## 5. 
Contribute universal built-in UDF functions to iotdb - -This part mainly introduces how external users can contribute their own UDFs to the IoTDB community. - -### 5.1 Prerequisites - -1. UDFs must be universal. - - The "universal" mentioned here refers to: UDFs can be widely used in some scenarios. In other words, the UDF function must have reuse value and may be directly used by other users in the community. - - If you are not sure whether the UDF you want to contribute is universal, you can send an email to `dev@iotdb.apache.org` or create an issue to initiate a discussion. - -2. The UDF you are going to contribute has been well tested and can run normally in the production environment. - - -### 5.2 What you need to prepare - -1. UDF source code -2. Test cases -3. Instructions - -### 5.3 Contribution Content - -#### 5.3.1 UDF Source Code - -1. Create the UDF main class and related classes in `iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin` or in its subfolders. -2. Register your UDF in `iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin/BuiltinTimeSeriesGeneratingFunction.java`. - -#### 5.3.2 Test Cases - -At a minimum, you need to write integration tests for the UDF. - -You can add a test class in `integration-test/src/test/java/org/apache/iotdb/db/it/udf`. - - -#### 5.3.3 Instructions - -The instructions need to include: the name and the function of the UDF, the attribute parameters that must be provided when the UDF is executed, the applicable scenarios, and the usage examples, etc. - -The instructions for use should include both Chinese and English versions. Instructions for use should be added separately in `docs/zh/UserGuide/Operation Manual/DML Data Manipulation Language.md` and `docs/UserGuide/Operation Manual/DML Data Manipulation Language.md`. 
- -#### 5.3.4 Submit a PR - -When you have prepared the UDF source code, test cases, and instructions, you are ready to submit a Pull Request (PR) on [GitHub](https://github.com/apache/iotdb). You can refer to our code contribution guide to submit a PR: [Development Guide](https://iotdb.apache.org/Community/Development-Guide.html). - - -After the PR is reviewed, approved, and merged, your UDF becomes part of the IoTDB community! - -## 6. Common Problems - -Q1: How do I modify a registered UDF? - -A1: Assume that the name of the UDF is `example` and the full class name is `org.apache.iotdb.udf.ExampleUDTF`, which is introduced by `example.jar`. - -1. Unload the registered function by executing `DROP FUNCTION example`. -2. Delete `example.jar` under `iotdb-server-1.0.0-all-bin/ext/udf`. -3. Modify the logic in `org.apache.iotdb.udf.ExampleUDTF` and repackage it. The name of the JAR package can still be `example.jar`. -4. Upload the new JAR package to `iotdb-server-1.0.0-all-bin/ext/udf`. -5. Load the new UDF by executing `CREATE FUNCTION example AS "org.apache.iotdb.udf.ExampleUDTF"`. - diff --git a/src/UserGuide/V1.3.x/User-Manual/White-List_timecho.md b/src/UserGuide/V1.3.x/User-Manual/White-List_timecho.md deleted file mode 100644 index 5194f7051..000000000 --- a/src/UserGuide/V1.3.x/User-Manual/White-List_timecho.md +++ /dev/null @@ -1,70 +0,0 @@ - - -# White List - -**function description** - -Controls which client addresses are allowed to connect to IoTDB - -**configuration file** - -conf/iotdb-system.properties - -conf/white.list - -**configuration item** - -iotdb-system.properties: - -Decide whether to enable white list - -```YAML - -# Whether to enable white list -enable_white_list=true -``` - -white.list: - -Decide which IP addresses can connect to IoTDB - -```YAML -# Comments are supported -# Exact matching is supported, one IP per line -10.2.3.4 - -# The * wildcard is supported, one IP per line -10.*.1.3 -10.100.0.* -``` - -**note** - -1. 
If a client removes its own address from the white list via the session client, the current connection is not disconnected immediately. It is rejected the next time a connection is created. -2. If white.list is modified directly, it takes effect within one minute. If modified via the session client, it takes effect immediately, updating the values in memory and the white.list disk file. -3. If the white list function is enabled but there is no white.list file, the DB service starts successfully, but all connections are rejected. -4. If the white.list file is deleted while the DB service is running, all connections are rejected within at most one minute. -5. The configuration item that enables or disables the white list function can be hot loaded. -6. To modify the white list through the Java native interface, you must be the root user; modifications by non-root users are rejected. The modified content must be valid, otherwise a StatementExecutionException is thrown. - -![](/img/%E7%99%BD%E5%90%8D%E5%8D%95.png) - diff --git a/src/UserGuide/dev-1.3/AI-capability/AINode_timecho.md b/src/UserGuide/dev-1.3/AI-capability/AINode_timecho.md deleted file mode 100644 index 0676658d3..000000000 --- a/src/UserGuide/dev-1.3/AI-capability/AINode_timecho.md +++ /dev/null @@ -1,661 +0,0 @@ - - -# AINode - -AINode is a native IoTDB node that supports the registration, management, and invocation of time-series-related models. It comes with built-in industry-leading self-developed time-series large models, such as the Timer series developed by Tsinghua University. These models can be invoked through standard SQL statements, enabling real-time inference of time series data at the millisecond level, and supporting application scenarios such as trend forecasting, missing value imputation, and anomaly detection for time series data. 
- - -The system architecture is shown below: -::: center - -::: -The responsibilities of the three nodes are as follows: - -- **ConfigNode**: responsible for storing and managing the meta-information of the model; responsible for distributed node management. -- **DataNode**: responsible for receiving and parsing SQL requests from users; responsible for storing time-series data; responsible for preprocessing computation of data. -- **AINode**: responsible for model file import creation and model inference. - -## 1. Advantageous features - -Compared with building a machine learning service alone, it has the following advantages: - -- **Simple and easy to use**: no need to use Python or Java programming, the complete process of machine learning model management and inference can be completed using SQL statements. Creating a model can be done using the CREATE MODEL statement, and using a model for inference can be done using the CALL INFERENCE (...) statement, making it simpler and more convenient to use. - -- **Avoid Data Migration**: With IoTDB native machine learning, data stored in IoTDB can be directly applied to the inference of machine learning models without having to move the data to a separate machine learning service platform, which accelerates data processing, improves security, and reduces costs. - -![](/img/AInode1.png) - -- **Built-in Advanced Algorithms**: supports industry-leading machine learning analytics algorithms covering typical timing analysis tasks, empowering the timing database with native data analysis capabilities. Such as: - - **Time Series Forecasting**: learns patterns of change from past time series; thus outputs the most likely prediction of future series based on observations at a given past time. - - **Anomaly Detection for Time Series**: detects and identifies outliers in a given time series data, helping to discover anomalous behaviour in the time series. 
- - **Annotation for Time Series (Time Series Annotation)**: Adds additional information or markers, such as event occurrence, outliers, trend changes, etc., to each data point or specific time period to better understand and analyse the data. - - - -## 2. Basic Concepts - -- **Model**: a machine learning model that takes time-series data as input and outputs the results or decisions of an analysis task. The model is the basic management unit of AINode, which supports adding (registering), deleting, querying, and using (inference with) models. -- **Create**: Load externally designed or trained model files or algorithms into AINode for unified management and use by IoTDB. -- **Inference**: The process of using the created model to complete the time series analysis task applicable to the model on the specified time series data. -- **Built-in capabilities**: AINode comes with machine learning algorithms or self-developed models for common time series analysis scenarios (e.g., prediction and anomaly detection). - -::: center - -::: - -## 3. Installation and Deployment - -The deployment of AINode can be found in the document [Deployment Guidelines](../Deployment-and-Maintenance/AINode_Deployment_timecho.md#AINode-部署) . - - -## 4. Usage Guidelines - -AINode provides model creation and deletion for deep learning models related to time series data. Built-in models do not need to be created or deleted; they can be used directly, and the built-in model instances created for inference are destroyed automatically after inference is completed. - -### 4.1 Registering Models - -A trained deep learning model can be registered by specifying the vector dimensions of the model's inputs and outputs, after which it can be used for model inference. - -Models that meet the following criteria can be registered in AINode: -1. AINode supports models trained with PyTorch 2.1.0 and 2.2.0; avoid using features introduced in versions later than 2.2.0. -2. 
AINode supports models saved with PyTorch JIT; the model file must include both the parameters and the structure of the model. -3. The input sequence of the model can contain one or more columns; if there are multiple columns, they must correspond to the model capability and the model configuration file. -4. The input and output dimensions of the model must be clearly defined in the `config.yaml` configuration file. When using the model, strictly follow the input and output dimensions defined in `config.yaml`. If the number of input or output columns does not match the configuration file, an error is raised. - -The following is the SQL syntax definition for model registration. - -```SQL -create model <model_name> using uri <uri> -``` - -The specific meanings of the parameters in the SQL are as follows: - -- model_name: a globally unique identifier for the model, which cannot be repeated. The model name has the following constraints: - - - Identifiers [ 0-9 a-z A-Z _ ] (letters, numbers, underscores) are allowed. - - Length is limited to 2-64 characters - - Case sensitive - -- uri: resource path to the model registration file, which should contain the **model weights model.pt file and the model's metadata description file config.yaml**. 
- - - Model weight file: the weight file obtained after training of the deep learning model is complete; currently .pt files produced by PyTorch training are supported - - - yaml metadata description file: parameters related to the model structure that need to be provided when the model is registered, which must contain the input and output dimensions of the model for model inference: - - - | **Parameter name** | **Parameter description** | **Example** | - | ------------ | ---------------------------- | -------- | - | input_shape | rows and columns of model inputs, for model inference | [96,2] | - | output_shape | rows and columns of model outputs, for model inference | [48,2] | - - - In addition to the shapes used for model inference, the data types of model input and output can be specified: - - - | **Parameter name** | **Parameter description** | **Example** | - | ----------- | ------------------ | --------------------- | - | input_type | data type of the model input | ['float32','float32'] | - | output_type | data type of the model output | ['float32','float32'] | - - - In addition, notes can be specified for display during model management: - - - | **Parameter name** | **Parameter description** | **Example** | - | ---------- | ---------------------------------------------- | ------------------------------------------- | - | attributes | optional, user-defined model notes for model display | 'model_type': 'dlinear','kernel_size': '25' | - - -In addition to registering local model files, a model can be registered by specifying a remote resource path via URI, for example from an open source model repository (e.g. HuggingFace). - -#### Example - -The current example folder contains the model.pt and config.yaml files. model.pt is the trained model file, and the content of config.yaml is as follows: - -```YAML -configs: - # Required options - input_shape: [96, 2] # The model receives data in 96 rows x 2 columns. 
- output_shape: [48, 2] # Indicates that the model outputs 48 rows x 2 columns. - - # Optional. Default is all float32, and the number of columns is the number of columns in the shape. - input_type: ["int64", "int64"] # Input data type, needs to match the number of columns. - output_type: ["text", "int64"] # Output data type, needs to match the number of columns. - -attributes: # Optional user-defined notes for the model. - 'model_type': 'dlinear' - 'kernel_size': '25' -``` - -Specify this folder as the load path to register the model. - -```SQL -IoTDB> create model dlinear_example using uri "file://./example" -``` - -Alternatively, you can download the corresponding model file from HuggingFace and register it. - -```SQL -IoTDB> create model dlinear_example using uri "https://huggingface.com/IoTDBML/dlinear/" -``` - -After the SQL is executed, registration is carried out asynchronously. You can check the registration status of the model with the show models statement (see section 4.2 Viewing Models). The time needed for a successful registration depends mainly on the size of the model file. - -Once the model registration is complete, you can perform model inference by calling the inference function in a normal query. - -### 4.2 Viewing Models - -Information about successfully registered models can be queried with the show models command. The SQL definition is as follows: - -```SQL -show models - -show models <model_name> -``` - -In addition to displaying information about all models directly, you can specify a model id to view information about a specific model. 
The result of show models contains the following information: - -| **ModelId** | **State** | **Configs** | **Attributes** | -| ------------ | ------------------------------------- | ---------------------------------------------- | -------------- | -| Model Unique Identifier | Model Registration Status (LOADING, ACTIVE, DROPPING, UNAVAILABLE) | inputShape, outputShape, inputTypes, outputTypes | Model Notes | - -State shows the current registration state of the model, which is one of the following four states: - -- **LOADING**: The corresponding model meta information has been added to the configNode, and the model file is being transferred to the AINode node. -- **ACTIVE**: The model has been set up and is available for use. -- **DROPPING**: Model deletion is in progress; model related information is being deleted from configNode and AINode. -- **UNAVAILABLE**: Model creation failed; you can delete the failed model_name with drop model. - -#### Example - -```SQL -IoTDB> show models - - -+---------------------+--------------------------+-----------+----------------------------+-----------------------+ -| ModelId| ModelType| State| Configs| Notes| -+---------------------+--------------------------+-----------+----------------------------+-----------------------+ -| dlinear_example| USER_DEFINED| ACTIVE| inputShape:[96,2]| | -| | | | outputShape:[48,2]| | -| | | | inputDataType:[float,float]| | -| | | |outputDataType:[float,float]| | -| _STLForecaster| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB| -| _NaiveForecaster| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB| -| _ARIMA| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB| -|_ExponentialSmoothing| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB| -| _GaussianHMM|BUILT_IN_ANOMALY_DETECTION| ACTIVE| |Built-in model in IoTDB| -| _GMMHMM|BUILT_IN_ANOMALY_DETECTION| ACTIVE| |Built-in model in IoTDB| -| _Stray|BUILT_IN_ANOMALY_DETECTION| ACTIVE| |Built-in model in IoTDB| 
-+---------------------+--------------------------+-----------+------------------------------------------------------------+-----------------------+ -``` - -The model registered earlier appears in this list. You can check its status in the State column; ACTIVE indicates that the model was registered successfully and can be used for inference. - -### 4.3 Delete Model - -For a successfully registered model, the user can delete it via SQL. In addition to deleting the meta information on the configNode, this operation also deletes all the related model files under the AINode. The SQL is as follows: - -```SQL -drop model <model_name> -``` - -You need to specify the model_name of a model that has been successfully registered to delete it. Since model deletion involves deleting data on multiple nodes, the operation does not complete immediately. While it is in progress, the state of the model is DROPPING, and a model in this state cannot be used for inference. - -### 4.4 Inference with Built-in Models - -The SQL syntax is as follows: - - -```SQL -call inference(<built_in_model_name>,sql[,<parameterName>=<parameterValue>]) -``` - -Built-in model inference does not require a registration process. Inference is performed by calling the inference function through the call keyword; its parameters are described as follows: - -- **built_in_model_name**: built-in model name -- **parameterName**: parameter name -- **parameterValue**: parameter value - -#### Built-in Models and Parameter Descriptions - -The following machine learning models are currently built in; refer to the following links for detailed parameter descriptions. 
- -| Model | built_in_model_name | Task type | Parameter description | -| -------------------- | --------------------- | -------- | ------------------------------------------------------------ | -| Arima | _Arima | Forecast | [Arima Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.arima.ARIMA.html?highlight=Arima) | -| STLForecaster | _STLForecaster | Forecast | [STLForecaster Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.trend.STLForecaster.html#sktime.forecasting.trend.STLForecaster) | -| NaiveForecaster | _NaiveForecaster | Forecast | [NaiveForecaster Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.naive.NaiveForecaster.html#naiveforecaster) | -| ExponentialSmoothing | _ExponentialSmoothing | Forecast | [ExponentialSmoothing Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.exp_smoothing.ExponentialSmoothing.html) | -| GaussianHMM | _GaussianHMM | Annotation | [GaussianHMM Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.annotation.hmm_learn.gaussian.GaussianHMM.html) | -| GMMHMM | _GMMHMM | Annotation | [GMMHMM Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.annotation.hmm_learn.gmm.GMMHMM.html) | -| Stray | _Stray | Anomaly detection | [Stray Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.annotation.stray.STRAY.html) | - - -#### Example - -The following is an example of inference with a built-in model. The built-in Stray model is used for anomaly detection. The input is `[144,1]` and the output is `[144,1]`. We run the inference through SQL. 
- -```SQL -IoTDB> select * from root.eg.airline -+-----------------------------+------------------+ -| Time|root.eg.airline.s0| -+-----------------------------+------------------+ -|1949-01-31T00:00:00.000+08:00| 224.0| -|1949-02-28T00:00:00.000+08:00| 118.0| -|1949-03-31T00:00:00.000+08:00| 132.0| -|1949-04-30T00:00:00.000+08:00| 129.0| -...... -|1960-09-30T00:00:00.000+08:00| 508.0| -|1960-10-31T00:00:00.000+08:00| 461.0| -|1960-11-30T00:00:00.000+08:00| 390.0| -|1960-12-31T00:00:00.000+08:00| 432.0| -+-----------------------------+------------------+ -Total line number = 144 - -IoTDB> call inference(_Stray, "select s0 from root.eg.airline", k=2) -+-------+ -|output0| -+-------+ -| 0| -| 0| -| 0| -| 0| -...... -| 1| -| 1| -| 0| -| 0| -| 0| -| 0| -+-------+ -Total line number = 144 -``` - -### 4.5 Inference with Deep Learning Models - -The SQL syntax is as follows: - -```SQL -call inference(<model_name>,sql[,window=<window_function>]) - - -window_function: - head(window_size) - tail(window_size) - count(window_size,sliding_step) -``` - -After the model has been registered, inference is performed by calling the inference function through the call keyword; its parameters are described as follows: - -- **model_name**: corresponds to a registered model -- **sql**: sql query statement; the result of the query is used as input to the model for inference. The numbers of rows and columns in the query result need to match the size specified in the model's config. 
(Using `SELECT *` in this sql is not recommended: in IoTDB, `*` does not sort the columns, so the column order is undefined. Use an explicit list such as `SELECT s0,s1` to ensure that the column order matches what the model expects.) -- **window_function**: Window functions that can be used during inference. Three window functions are currently provided to assist model inference: - - **head(window_size)**: takes the first window_size points of the data for model inference; this window can be used for data cropping. - ![](/img/AINode-call1.png) - - - **tail(window_size)**: takes the last window_size points of the data for model inference; this window can be used for data cropping. - ![](/img/AINode-call2.png) - - - **count(window_size, sliding_step)**: a sliding window based on the number of points; the data in each window is passed through the model separately. In the example below, a window function with window_size 2 divides the input dataset into three windows, and inference is performed on each window to generate its own result. This window can be used for continuous inference. - ![](/img/AINode-call3.png) - -**Explanation 1**: window can be used to crop rows when the result of the sql query does not match the input row requirement of the model. Note that when the number of columns does not match, or the number of rows is less than the model requires, inference cannot proceed and an error message is returned. - -**Explanation 2**: In deep learning applications, timestamp-derived features (time columns in the data) are often used as covariates in generative tasks and are fed into the model together with the data to improve it, but the time columns are generally not included in the model's output. 
In order to ensure the generality of the implementation, the model inference results only correspond to the real output of the model, if the model does not output the time column, it will not be included in the results. - - -#### Example - -The following is an example of inference in action using a deep learning model, for the `dlinear` prediction model with input `[96,2]` and output `[48,2]` mentioned above, which we use via SQL. - -```Shell -IoTDB> select s1,s2 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... -|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**") -+--------------------------------------------+-----------------------------+ -| _result_0| _result_1| -+--------------------------------------------+-----------------------------+ -| 0.726302981376648| 1.6549958229064941| -| 0.7354921698570251| 1.6482787370681763| -| 0.7238251566886902| 1.6278168201446533| -...... 
-| 0.7692174911499023| 1.654654049873352| -| 0.7685555815696716| 1.6625318765640259| -| 0.7856493592262268| 1.6508299350738525| -+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### Example of using the tail/head window function - -When the amount of data is variable and you want to take the latest 96 rows of data for inference, you can use the corresponding window function tail. head function is used in a similar way, except that it takes the earliest 96 points. - -```Shell -IoTDB> select s1,s2 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1988-01-01T00:00:00.000+08:00| 0.7355| 1.211| -...... -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... 
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 996 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**",window=tail(96)) -+--------------------------------------------+-----------------------------+ -| _result_0| _result_1| -+--------------------------------------------+-----------------------------+ -| 0.726302981376648| 1.6549958229064941| -| 0.7354921698570251| 1.6482787370681763| -| 0.7238251566886902| 1.6278168201446533| -...... -| 0.7692174911499023| 1.654654049873352| -| 0.7685555815696716| 1.6625318765640259| -| 0.7856493592262268| 1.6508299350738525| -+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### Example of using the count window function - -This window is mainly used for computational tasks. When the task's corresponding model can only handle a fixed number of rows of data at a time, but the final desired outcome is multiple sets of prediction results, this window function can be used to perform continuous inference using a sliding window of points. Suppose we now have an anomaly detection model `anomaly_example(input: [24,2], output[1,1])`, which generates a 0/1 label for every 24 rows of data. 
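The row selection performed by the three window functions can be pictured with a short Python sketch. This is only an illustration of the behavior described in section 4.5, not AINode's implementation: the function names `head_window`, `tail_window`, and `count_window` are hypothetical, and dropping a trailing incomplete window in `count_window` is an assumption rather than documented behavior.

```python
from typing import List, Sequence, TypeVar

Row = TypeVar("Row")


def _check_size(window_size: int) -> None:
    # A window must contain at least one row.
    if window_size <= 0:
        raise ValueError("window_size must be positive")


def head_window(rows: Sequence[Row], window_size: int) -> List[List[Row]]:
    """Keep the first window_size rows as a single window."""
    _check_size(window_size)
    return [list(rows[:window_size])]


def tail_window(rows: Sequence[Row], window_size: int) -> List[List[Row]]:
    """Keep the last window_size rows as a single window."""
    _check_size(window_size)
    return [list(rows[-window_size:])]


def count_window(rows: Sequence[Row], window_size: int,
                 sliding_step: int) -> List[List[Row]]:
    """Split rows into windows of window_size rows, advancing by sliding_step.

    Assumption: a trailing window with fewer than window_size rows is dropped.
    """
    _check_size(window_size)
    if sliding_step <= 0:
        raise ValueError("sliding_step must be positive")
    windows = []
    start = 0
    while start + window_size <= len(rows):
        windows.append(list(rows[start:start + window_size]))
        start += sliding_step
    return windows
```

Under these assumptions, 96 input rows with `count_window(rows, 24, 24)` yield four windows, which matches one output label per group of 24 rows for a model whose input shape is `[24,2]`.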
An example of its use is as follows: - -```Shell -IoTDB> select s1,s2 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... -|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(anomaly_example,"select s0,s1 from root.**",window=count(24,24)) -+-------------------------+ -| _result_0| -+-------------------------+ -| 0| -| 1| -| 1| -| 0| -+-------------------------+ -Total line number = 4 -``` - -In the result set, each row's label corresponds to the output of the anomaly detection model after inputting each group of 24 rows of data. - - -### 4.6 TimeSeries Large Models Import Steps - -AINode currently supports a variety of time series large models. For deployment and usage, please refer to [TimeSeries Large Models](../AI-capability/TimeSeries-Large-Model) - - -## 5. Privilege Management - -When using AINode related functions, the authentication of IoTDB itself can be used to do a permission management, users can only use the model management related functions when they have the USE_MODEL permission. 
When using the inference function, the user needs to have the permission to access the source series referenced by the SQL that feeds the model. - -| Privilege Name | Privilege Scope | Administrator User (default ROOT) | Normal User | Path Related | -| --------- | --------------------------------- | ---------------------- | -------- | -------- | -| USE_MODEL | create model/show models/drop model | √ | √ | x | -| READ_DATA| call inference | √ | √|√ | - -## 6. Practical Examples - -### 6.1 Power Load Prediction - -In some industrial scenarios, there is a need to predict power loads, which can be used to optimise power supply, conserve energy and resources, support planning and expansion, and enhance power system reliability. - -The test set data of ETTh1 that we use is [ETTh1](/img/ETTh1.csv). - - -It contains power data collected at 1h intervals; each record consists of load and oil temperature values: High UseFul Load, High UseLess Load, Middle UseFul Load, Middle UseLess Load, Low UseFul Load, Low UseLess Load, Oil Temperature. - -On this dataset, the model inference function of IoTDB-ML can predict the oil temperature for a future period from the past values of the high, middle and low use loads and the oil temperature at the corresponding timestamps, which supports automatic regulation and monitoring of grid transformers. - -#### Step 1: Data Import - -Users can import the ETT dataset into IoTDB using `import-csv.sh` in the tools folder - -```Bash -bash ./import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f ../../ETTh1.csv -``` - -#### Step 2: Model Import - -We can enter the following SQL in iotdb-cli to pull a trained model from huggingface and register it for subsequent inference. 
- -```SQL -create model dlinear using uri 'https://huggingface.co/hvlgo/dlinear/tree/main' -``` - -This model is trained on the lighter weight deep model DLinear, which is able to capture as many trends within a sequence and relationships between variables as possible with relatively fast inference, making it more suitable for fast real-time prediction than other deeper models. - -#### Step 3: Model inference - -```Shell -IoTDB> select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth LIMIT 96 -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -| Time|root.eg.etth.s0|root.eg.etth.s1|root.eg.etth.s2|root.eg.etth.s3|root.eg.etth.s4|root.eg.etth.s5|root.eg.etth.s6| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -|2017-10-20T00:00:00.000+08:00| 10.449| 3.885| 8.706| 2.025| 2.041| 0.944| 8.864| -|2017-10-20T01:00:00.000+08:00| 11.119| 3.952| 8.813| 2.31| 2.071| 1.005| 8.442| -|2017-10-20T02:00:00.000+08:00| 9.511| 2.88| 7.533| 1.564| 1.949| 0.883| 8.16| -|2017-10-20T03:00:00.000+08:00| 9.645| 2.21| 7.249| 1.066| 1.828| 0.914| 7.949| -...... 
-|2017-10-23T20:00:00.000+08:00| 8.105| 0.938| 4.371| -0.569| 3.533| 1.279| 9.708| -|2017-10-23T21:00:00.000+08:00| 7.167| 1.206| 4.087| -0.462| 3.107| 1.432| 8.723| -|2017-10-23T22:00:00.000+08:00| 7.1| 1.34| 4.015| -0.32| 2.772| 1.31| 8.864| -|2017-10-23T23:00:00.000+08:00| 9.176| 2.746| 7.107| 1.635| 2.65| 1.097| 9.004| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example, "select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth", window=head(96)) -+-----------+----------+----------+------------+---------+----------+----------+ -| output0| output1| output2| output3| output4| output5| output6| -+-----------+----------+----------+------------+---------+----------+----------+ -| 10.319546| 3.1450553| 7.877341| 1.5723765|2.7303758| 1.1362307| 8.867775| -| 10.443649| 3.3286757| 7.8593454| 1.7675098| 2.560634| 1.1177158| 8.920919| -| 10.883752| 3.2341104| 8.47036| 1.6116762|2.4874182| 1.1760603| 8.798939| -...... -| 8.0115595| 1.2995274| 6.9900327|-0.098746896| 3.04923| 1.176214| 9.548782| -| 8.612427| 2.5036244| 5.6790237| 0.66474205|2.8870275| 1.2051733| 9.330128| -| 10.096699| 3.399722| 6.9909| 1.7478468|2.7642853| 1.1119363| 9.541455| -+-----------+----------+----------+------------+---------+----------+----------+ -Total line number = 48 -``` - -We compare the results of the prediction of the oil temperature with the real results, and we can get the following image. - -The data before 10/24 00:00 represents the past data input to the model, the blue line after 10/24 00:00 is the oil temperature forecast result given by the model, and the red line is the actual oil temperature data from the dataset (used for comparison). 
- -![](/img/AINode-analysis1.png) - -As can be seen, we have used the relationship between the six load information and the corresponding time oil temperatures for the past 96 hours (4 days) to model the possible changes in this data for the oil temperature for the next 48 hours (2 days) based on the inter-relationships between the sequences learned previously, and it can be seen that the predicted curves maintain a high degree of consistency in trend with the actual results after visualisation. - -### 6.2 Power Prediction - -Power monitoring of current, voltage and power data is required in substations for detecting potential grid problems, identifying faults in the power system, effectively managing grid loads and analysing power system performance and trends. - -We have used the current, voltage and power data in a substation to form a dataset in a real scenario. The dataset consists of data such as A-phase voltage, B-phase voltage, and C-phase voltage collected every 5 - 6s for a time span of nearly four months in the substation. - -The test set data content is [data](/img/data.csv). - -On this dataset, the model inference function of IoTDB-ML can predict the C-phase voltage in the future period through the previous values and corresponding timestamps of A-phase voltage, B-phase voltage and C-phase voltage, empowering the monitoring management of the substation. - -#### Step 1: Data Import - -Users can import the dataset using `import-csv.sh` in the tools folder - -```Bash -bash ./import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f ... /... /data.csv -``` - -#### Step 2: Model Import - -We can select built-in models or registered models in IoTDB CLI for subsequent inference. - -We use the built-in model STLForecaster for prediction. STLForecaster is a time series forecasting method based on the STL implementation in the statsmodels library. 
- -#### Step 3: Model Inference - -```Shell -IoTDB> select * from root.eg.voltage limit 96 -+-----------------------------+------------------+------------------+------------------+ -| Time|root.eg.voltage.s0|root.eg.voltage.s1|root.eg.voltage.s2| -+-----------------------------+------------------+------------------+------------------+ -|2023-02-14T20:38:32.000+08:00| 2038.0| 2028.0| 2041.0| -|2023-02-14T20:38:38.000+08:00| 2014.0| 2005.0| 2018.0| -|2023-02-14T20:38:44.000+08:00| 2014.0| 2005.0| 2018.0| -...... -|2023-02-14T20:47:52.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:47:57.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:48:03.000+08:00| 2024.0| 2016.0| 2027.0| -+-----------------------------+------------------+------------------+------------------+ -Total line number = 96 - -IoTDB> call inference(_STLForecaster, "select s0,s1,s2 from root.eg.voltage", window=head(96),predict_length=48) -+---------+---------+---------+ -| output0| output1| output2| -+---------+---------+---------+ -|2026.3601|2018.2953|2029.4257| -|2019.1538|2011.4361|2022.0888| -|2025.5074|2017.4522|2028.5199| -...... - -|2022.2336|2015.0290|2025.1023| -|2015.7241|2008.8975|2018.5085| -|2022.0777|2014.9136|2024.9396| -|2015.5682|2008.7821|2018.3458| -+---------+---------+---------+ -Total line number = 48 -``` - -Comparing the predicted results of the C-phase voltage with the real results, we can get the following image. - -The data before 02/14 20:48 represents the past data input to the model, the blue line after 02/14 20:48 is the predicted result of phase C voltage given by the model, while the red line is the actual phase C voltage data from the dataset (used for comparison). - -![](/img/AINode-analysis2.png) - -It can be seen that we used the voltage data from the past 10 minutes and, based on the previously learned inter-sequence relationships, modeled the possible changes in the phase C voltage data for the next 5 minutes. 
The visualized forecast curve shows a certain degree of synchronicity with the actual results in terms of trend. - -### 6.3 Anomaly Detection - -In the civil aviation and transport industry, there is a need for anomaly detection on the number of passengers travelling by air. The results of anomaly detection can be used to guide the adjustment of flight scheduling to make the organisation more efficient. - -Airline Passengers is a time-series dataset that records the number of international air passengers between 1949 and 1960, sampled at one-month intervals. The dataset contains a total of one time series. The dataset is [airline](/img/airline.csv). -On this dataset, the model inference function of IoTDB-ML can support the transport industry by capturing the changing patterns of the sequence in order to detect anomalies at individual time points. - -#### Step 1: Data Import - -Users can import the dataset using `import-csv.sh` in the tools folder - -```Bash -bash ./import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f ../../data.csv -``` - -#### Step 2: Model Inference - -IoTDB has some built-in machine learning algorithms that can be used directly. A sample run using one of the anomaly detection algorithms is shown below: - -```Shell -IoTDB> select * from root.eg.airline -+-----------------------------+------------------+ -| Time|root.eg.airline.s0| -+-----------------------------+------------------+ -|1949-01-31T00:00:00.000+08:00| 224.0| -|1949-02-28T00:00:00.000+08:00| 118.0| -|1949-03-31T00:00:00.000+08:00| 132.0| -|1949-04-30T00:00:00.000+08:00| 129.0| -...... -|1960-09-30T00:00:00.000+08:00| 508.0| -|1960-10-31T00:00:00.000+08:00| 461.0| -|1960-11-30T00:00:00.000+08:00| 390.0| -|1960-12-31T00:00:00.000+08:00| 432.0| -+-----------------------------+------------------+ -Total line number = 144 - -IoTDB> call inference(_Stray, "select s0 from root.eg.airline", k=2) -+-------+ -|output0| -+-------+ -| 0| -| 0| -| 0| -| 0| -...... 
-| 1| -| 1| -| 0| -| 0| -| 0| -| 0| -+-------+ -Total line number = 144 -``` - -We plot the results detected as anomalies to get the following image. Where the blue curve is the original time series and the time points specially marked with red dots are the time points that the algorithm detects as anomalies. - -![](/img/s6.png) - -It can be seen that the Stray model has modelled the input sequence changes and successfully detected the time points where anomalies occur. \ No newline at end of file diff --git a/src/UserGuide/dev-1.3/API/Programming-OPC-DA_timecho.md b/src/UserGuide/dev-1.3/API/Programming-OPC-DA_timecho.md deleted file mode 100644 index 80e568300..000000000 --- a/src/UserGuide/dev-1.3/API/Programming-OPC-DA_timecho.md +++ /dev/null @@ -1,209 +0,0 @@ - - -# OPC DA Protocol - -## 1. OPC DA - -OPC DA (OPC Data Access) is a communication protocol standard in the field of industrial automation and a core part of the classic OPC (OLE for Process Control) technology. Its primary goal is to enable real-time data exchange between industrial devices and software (such as SCADA, HMI, and databases) in a Windows environment. OPC DA is implemented based on COM/DCOM and is a lightweight protocol with two roles: server and client. - -* **Server:** Can be regarded as a pool of items, storing the latest data and status of each instance. All items can only be managed on the server side; clients can only read and write data and have no authority to manipulate metadata. - -![](/img/opc-da-1-1.png) - -* **Client:** After connecting to the server, the client needs to define a custom group (this group is only relevant to the client) and create items with the same names as those on the server. The client can then read and write the items it has created. - -![](/img/opc-da-1-2-en.png) - -## 2. OPC DA Sink - -IoTDB (available since V1.3.5.2 for V1.x) provides an OPC DA Sink that supports pushing tree-model data to a local COM server plugin. 
It encapsulates the OPC DA interface specifications and their inherent complexity, significantly simplifying the integration process. The data flow diagram for the OPC DA Sink is shown below. - -![](/img/opc-da-2-1-en.png) - -### 2.1 SQL Syntax - -```SQL ----- Note: The clsID here needs to be replaced with your own clsID -create pipe opc ( - 'sink'='opc-da-sink', - --- 'opcda.progid'='opcserversim.Instance.1' - 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C' -); -``` - -### 2.2 Parameter Description - -| ​**​Parameter​**​ | ​**​Description​**​ | ​**​Value Range​**​ | ​**​Required​**​ | -| ----------------------------- | ----------------------------------------------------------------------------------------------------------- | ------------------------------- | ----------------------------------------- | -| sink | OPC DA Sink | String: opc-da-sink | Yes | -| sink.opcda.clsid | The ClsID (unique identifier string) of the OPC Server. It is recommended to use clsID instead of progID. | String | Either clsID or progID must be provided | -| sink.opcda.progid | The ProgID of the OPC Server. If clsID is available, it is preferred over progID. | String | Either clsID or progID must be provided | - - -### 2.3 Mapping Specifications - -When used, IoTDB will push the latest data from its tree model to the server. The itemID for the data is the full path of the time series in the tree model, such as root.a.b.c.d. Note that, according to the OPC DA standard, clients cannot directly create items on the server side. Therefore, the server must pre-create items corresponding to IoTDB's time series with the itemID and the appropriate data type. 
- -* Data type correspondence is as follows: - -| IoTDB | OPC-DA Server | -| ----------- | ----------------------------------------------------------- | -| INT32 | VT\_I4 | -| INT64 | VT\_I8 | -| FLOAT | VT\_R4 | -| DOUBLE | VT\_R8 | -| TEXT | VT\_BSTR | -| BOOLEAN | VT\_BOOL | -| DATE | VT\_DATE | -| TIMESTAMP | VT\_DATE | -| BLOB | VT_BSTR (Variant does not support VT_BLOB, so VT_BSTR is used as a substitute) | -| STRING | VT\_BSTR | - -### 2.4 Common Error Codes - -| Symbol | Error Code | Description | -| ----------------------------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| OPC\_E\_BADTYPE | 0xC0040004 | The server cannot convert the data between the specified format/requested data type and the canonical data type. This means the server's data type does not match IoTDB's registered type. | -| OPC\_E\_UNKNOWNITEMID | 0xC0040007 | The item ID is not defined in the server's address space (when adding or validating), or the item ID no longer exists in the server's address space (when reading or writing). This means IoTDB's measurement point does not have a corresponding itemID on the server. | -| OPC\_E\_INVALIDITEMID | 0xC0040008 | The itemID does not conform to the server's syntax specifications. | -| REGDB\_E\_CLASSNOTREG | 0x80040154 | Class not registered | -| RPC\_S\_SERVER\_UNAVAILABLE | 0x800706BA | RPC service unavailable | -| DISP\_E\_OVERFLOW | 0x8002000A | Exceeds the maximum value of the type | -| DISP\_E\_BADVARTYPE | 0x80020005 | Type mismatch | - - -### 2.5 Usage Limitations - -* Only supports COM and can only be used on Windows. -* A small amount of old data may be pushed after restarting, but new data will eventually be pushed. -* Currently, only tree-model data is supported. - -## 3. 
Usage Steps -### 3.1 Prerequisites -1. Windows environment, version >= 8. -2. IoTDB is installed and running normally. -3. OPC DA Server is installed. - -* Using Simple OPC Server Simulator as an example: - -![](/img/opc-da-3-1.png) - -* Double-click an item to modify its name (itemID), data, data type, and other information. -* Right-click an item to delete it, update its value, or create a new item. - -![](/img/opc-da-3-2.png) - -4. OPC DA Client is installed. -* Using the Quick Client of Kepware's KEPServerEX as an example: -* In Kepware, the OPC DA Client can be opened as follows: - -![](/img/opc-da-3-3-en.png) - -![](/img/opc-da-3-4-en.png) - - -### 3.2 Configuration Modifications - -Modify the server configuration to prevent IoTDB's write client and Kepware's read client from connecting to two different instances, which would make debugging impossible. - -* First, press Win+R, type `dcomcnfg` in the Run dialog, and open the DCOM component configuration: - -![](/img/opc-da-3-5-en.png) - -* Navigate to Component Services -> Computers -> My Computer -> DCOM Config, find AGG Software Simple OPC Server Simulator, right-click, and select "Properties": - -![](/img/opc-da-3-6-en.png) - -* Under Identity, change User Account to Interactive User. Note: Do not use Launching User, as this may cause the two clients to start different server instances. - -![](/img/opc-da-3-7-en.png) - -### 3.3 Obtaining clsID -1. Method 1: Obtain via DCOM Configuration -* Press Win+R, type `dcomcnfg` in the Run dialog, and open the DCOM component configuration. -* Navigate to Component Services -> Computers -> My Computer -> DCOM Config, find AGG Software Simple OPC Server Simulator, right-click, and select "Properties". -* Under General, you can obtain the application's clsID, which will be used for the opc-da-sink connection later. Note: Do not include the curly braces. - -![](/img/opc-da-3-8-en.png) - -2. Method 2: clsID and progID can also be obtained directly from the server.
- -* Click `Help` > `Show OPC Server Info` - -![](/img/opc-da-3-9.png) - -* The pop-up window will display the information. - -![](/img/opc-da-3-10-en.png) - -### 3.4 Writing Data -#### 3.4.1 DA Server -1. Create a new item in the DA Server with the same name and type as the item to be written in IoTDB. - -![](/img/opc-da-3-11.png) - -2. Connect to the server in Kepware: - -![](/img/opc-da-3-12-en.png) - -3. Right-click the server to create a new group (the group name can be arbitrary): - -![](/img/opc-da-3-13-en.png) - -![](/img/opc-da-3-14-en.png) - -4. Right-click to create a new item with the same name as the one created earlier. - -![](/img/opc-da-3-15-en.png) - -![](/img/opc-da-3-16-en.png) - -![](/img/opc-da-3-17-en.png) - -#### 3.4.2 IoTDB - -1. Start IoTDB. -2. Create a Pipe. - -```SQL -create pipe opc ('sink'='opc-da-sink', 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C') -``` - -* Note: If the creation fails with the error Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 1107: Failed to connect to server, error code: 0x80040154, refer to this solution: https://opcexpert.com/support/0x80040154-class-not-registered/. - -3. Create a time series (if automatic metadata creation is enabled, this step can be skipped). - -```SQL -create timeseries root.a.b.c.r string; -``` - -4. Insert data. - -```SQL -insert into root.a.b.c (time, r) values(10000, "SomeString") -``` - -### 3.5 Verifying Data - -Check the data in Quick Client; it should have been updated. - -![](/img/opc-da-3-18-en.png) \ No newline at end of file diff --git a/src/UserGuide/dev-1.3/API/Programming-OPC-UA_timecho.md b/src/UserGuide/dev-1.3/API/Programming-OPC-UA_timecho.md deleted file mode 100644 index 5cca37d1c..000000000 --- a/src/UserGuide/dev-1.3/API/Programming-OPC-UA_timecho.md +++ /dev/null @@ -1,295 +0,0 @@ - - -# OPC UA Protocol - -## OPC UA Subscription Data - -This feature allows users to subscribe to data from IoTDB using the OPC UA protocol. 
The communication modes for subscription data support both Client/Server and Pub/Sub. - -Note: This feature is not about collecting data from external OPC Servers and writing it into IoTDB. - -![](/img/opc-ua-new-1-en.png) - -## OPC Service Startup Method - -### Syntax - -The syntax to start the OPC UA protocol: - -```sql -create pipe p1 - with source (...) - with processor (...) - with sink ('sink' = 'opc-ua-sink', - 'sink.opcua.tcp.port' = '12686', - 'sink.opcua.https.port' = '8443', - 'sink.user' = 'root', - 'sink.password' = 'root', - 'sink.opcua.security.dir' = '...' - ) -``` - -### Parameters - -| key | value | value range | required or not | default value | -| :--------------------------------- | :-------------------------------------------------- | :------------------------------------------------------- | :-------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| sink | OPC UA SINK | String: opc-ua-sink | Required | | -| sink.opcua.model | OPC UA model used | String: client-server / pub-sub | Optional | pub-sub | -| sink.opcua.tcp.port | OPC UA's TCP port | Integer: \[0, 65536] | Optional | 12686 | -| sink.opcua.https.port | OPC UA's HTTPS port | Integer: \[0, 65536] | Optional | 8443 | -| sink.opcua.security.dir | Directory for OPC UA's keys and certificates | String: Path, supports absolute and relative directories | Optional | Opc_security folder/in the conf directory of the DataNode related to iotdb
If there is no conf directory for iotdb (such as launching DataNode in IDEA), it will be the iotdb_opc_Security folder/\in the user's home directory | -| sink.opcua.enable-anonymous-access | Whether OPC UA allows anonymous access | Boolean | Optional | true | -| sink.user | User for OPC UA, specified in the configuration | String | Optional | root | -| sink.password | Password for OPC UA, specified in the configuration | String | Optional | root | - -### Example - -```Bash -create pipe p1 - with sink ('sink' = 'opc-ua-sink', - 'sink.user' = 'root', - 'sink.password' = 'root'); -start pipe p1; -``` - -### Usage Limitations - -1. After starting the protocol, data needs to be written to establish a connection. Only data after the connection is established can be subscribed to. -2. Recommended for use in standalone mode. In distributed mode, each IoTDB DataNode acts as an independent OPC Server providing data and requires separate subscription. - - -## Examples of Two Communication Modes - -### Client / Server Mode - -In this mode, IoTDB's stream processing engine establishes a connection with the OPC UA Server via an OPC UA Sink. The OPC UA Server maintains data within its Address Space, from which IoTDB can request and retrieve data. Additionally, other OPC UA Clients can access the data on the server. - -* Features: - * OPC UA organizes device information received from the Sink into folders under the Objects folder according to a tree model. - * Each measurement point is recorded as a variable node, storing the latest value from the current database. - * OPC UA cannot delete data or change data type settings. - -#### Preparation Work - -1. Take UAExpert client as an example, download the UAExpert client: - -2. Install UAExpert and fill in your own certificate information. - -#### Quick Start -##### Scenarios Supporting the None Security Policy - -1. Use the following SQL to create and start the OPC UA Sink in client-server mode. 
For detailed syntax, please refer to: [IoTDB OPC Server Syntax](#syntax) - - ```sql - create pipe p1 with sink ('sink'='opc-ua-sink', 'opcua.security-policy'='AES128_SHA256_RSAOAEP, AES256_SHA256_RSAPSS, BASIC256SHA256, NONE'); - ``` - - Note: Since version V1.3.7.2, None is no longer supported by default. To use it, you must manually enable it via the security-policy parameter as shown above. - -2. Write some data. - - ```sql - insert into root.test.db(time, s2) values(now(), 2) - ``` - - ​The metadata is automatically created and enabled here. - -3. Configure the connection to IoTDB in UAExpert, where the password should be set to the one defined in the sink.password parameter (using the default password "root" as an example): - - ::: center - - - - ::: - - ::: center - - - - ::: - -4. After trusting the server's certificate, you can see the written data in the Objects folder on the left. - - ::: center - - - - ::: - - ::: center - - - - ::: - - Note: Since the SecurityPolicy is set to None, mutual certificate trust is not required. For production environments, it is recommended to use a non-None SecurityPolicy for connection, which requires mutual certificate trust. For operations, refer to the Pub/Sub mode section below. In the Client/Server certificate directory (search for the keyword keyStore in the printed logs), move the contents in reject to trusted/certs. Follow the sequence: connect → move server directory → connect → move client directory → connect. - - -5. You can drag the node on the left to the center and display the latest value of that node: - - ::: center - - - - ::: - -##### Scenarios Not Supporting the None Security Policy -1. Use the following SQL to create and start the OPC UA service. - ```SQL - create pipe p1 with sink ('sink'='opc-ua-sink'); - ``` - - Note: Since version V1.3.7.2, OpcUaSink no longer supports None mode by default for security considerations. - -2. Insert some test data. 
- ```SQL - insert into root.test.db(time, s2) values(now(), 2); - ``` - -3. Configure the IoTDB connection in UAExpert: - - - Do not access the URL directly; endpoints must be discovered using the Discover method - - The client first sends a GetEndpoints request with the None policy to retrieve the endpoint list - - It then selects the corresponding encrypted endpoint based on the configured Basic256Sha256 + SignAndEncrypt to establish an encrypted connection - - ![](/img/opc-ua-un-none-1.png) - -4. Use the same username and password configuration as above. After selecting the relevant connection mode (Sign / Sign & Encrypt), if the following prompt appears, click Ignore to connect directly. - - ![](/img/opc-ua-un-none-2.png) - - -### Pub / Sub Mode - -In this mode, IoTDB's stream processing engine sends data change events to the OPC UA Server through an OPC UA Sink. These events are published to the server's message queue and managed through Event Nodes. Other OPC UA Clients can subscribe to these Event Nodes to receive notifications upon data changes. - -- Features: - - - Each measurement point is wrapped as an Event Node in OPC UA. - - - The relevant fields and their meanings are as follows: - - | Field | Meaning | Type (Milo) | Example | - | :--------- | :--------------------------------- | :------------ | :-------------------- | - | Time | Timestamp | DateTime | 1698907326198 | - | SourceName | Full path of the measurement point | String | root.test.opc.sensor0 | - | SourceNode | Data type of the measurement point | NodeId | Int32 | - | Message | Data | LocalizedText | 3.0 | - - - Events are only sent to clients that are already listening; if a client is not connected, the Event will be ignored. - - If data is deleted, the information cannot be pushed to clients. 
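The event fields listed in the table above can be pictured as a small record. The following is only an illustrative Python sketch of that shape, not the Milo or IoTDB implementation: the class name `OpcUaEventSketch` and the helper `to_event` are hypothetical, and plain Python types stand in for the Milo types (`DateTime`, `NodeId`, `LocalizedText`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpcUaEventSketch:
    """Hypothetical stand-in for one Pub/Sub event; field names follow the table."""

    time: int          # Time: timestamp of the data point (DateTime in Milo)
    source_name: str   # SourceName: full path of the measurement point
    source_node: str   # SourceNode: data type of the measurement point (NodeId in Milo)
    message: str       # Message: the data itself, carried as text (LocalizedText in Milo)


def to_event(timestamp: int, path: str, data_type: str, value: object) -> OpcUaEventSketch:
    """Wrap one written data point as an event record (illustrative helper)."""
    return OpcUaEventSketch(timestamp, path, data_type, str(value))


# Values taken from the example column of the table.
event = to_event(1698907326198, "root.test.opc.sensor0", "Int32", 3.0)
print(event)
```

Because events are only delivered to clients that are already listening, a record like this exists only at the moment of the change and is not replayed to clients that connect later.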
- - -#### Preparation Work - -The code is located in the [opc-ua-sink](https://github.com/apache/iotdb/tree/rc/1.3.5/example/pipe-opc-ua-sink/src/main/java/org/apache/iotdb/opcua) under the iotdb-example package. - -The code includes: - -- The main class (ClientTest) -- Client certificate-related logic(IoTDBKeyStoreLoaderClient) -- Client configuration and startup logic(ClientExampleRunner) -- The parent class of ClientTest(ClientExample) - -#### Quick Start - -The steps are as follows: - -1. Start IoTDB and write some data. - - ```sql - insert into root.a.b(time, c, d) values(now(), 1, 2); - ``` - - ​The metadata is automatically created and enabled here. - -2. Use the following SQL to create and start the OPC UA Sink in Pub-Sub mode. For detailed syntax, please refer to: [IoTDB OPC Server Syntax](#syntax) - - ```sql - create pipe p1 with sink ('sink'='opc-ua-sink', - 'sink.opcua.model'='pub-sub'); - start pipe p1; - ``` - - ​ At this point, you can see that the opc certificate-related directory has been created under the server's conf directory. - - ::: center - - - - ::: - -3. Run the Client connection directly; the Client's certificate will be rejected by the server. - - ::: center - - - - ::: - -4. Go to the server's sink.opcua.security.dir directory, then to the pki's rejected directory, where the Client's certificate should have been generated. - - ::: center - - - - ::: - -5. Move (not copy) the client's certificate into (not into a subdirectory of) the trusted directory's certs folder in the same directory. - - ::: center - - - - ::: - -6. Open the Client connection again; the server's certificate should now be rejected by the Client. - - ::: center - - - - ::: - -7. Go to the client's /client/security directory, then to the pki's rejected directory, and move the server's certificate into (not into a subdirectory of) the trusted directory. - - ::: center - - - - ::: - -8. 
Open the Client again; two-way trust is now established, and the Client can connect to the server. - -9. Write data to the server, and the Client will print out the received data. - - ::: center - - - - ::: - -### Notes - -1. **Stand-alone and cluster:** It is recommended to use a stand-alone 1C1D deployment (one ConfigNode and one DataNode). If the cluster has multiple DataNodes, data may be spread across them, and a single client may not be able to listen to all of it. - -2. **No need to operate on root directory certificates:** During the certificate exchange, there is no need to touch the `iotdb-server.pfx` certificate under the IoTDB security root directory or the `example-client.pfx` certificate under the client security directory. When the Client and Server connect bidirectionally, they send their root directory certificates to each other. If it is the first time the other party sees a certificate, it is placed in the rejected directory. Once the certificate is in trusted/certs, the other party trusts it. - -3. **Java 17+ is recommended:** - In Java 8, there may be a key length restriction, resulting in an "Illegal key size" error. For specific versions (such as JDK 1.8u151+), you can add `Security.setProperty("crypto.policy", "unlimited");` in the create client method of ClientExampleRunner to solve this, or you can download the unlimited policy files `local_policy.jar` and `US_export_policy.jar` and use them to replace the ones in `JDK/jre/lib/security`. Download link: . 
diff --git a/src/UserGuide/dev-1.3/Background-knowledge/Cluster-Concept_timecho.md b/src/UserGuide/dev-1.3/Background-knowledge/Cluster-Concept_timecho.md deleted file mode 100644 index 22afc3aa4..000000000 --- a/src/UserGuide/dev-1.3/Background-knowledge/Cluster-Concept_timecho.md +++ /dev/null @@ -1,118 +0,0 @@ - - -# Common Concepts - -## Sql_dialect Related Concepts - -| Concept | Meaning | -| ----------------------- |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| sql_dialect | Tree model: manages devices and measurement points, manages data in a hierarchical path manner, where one path corresponds to one measurement point of a device. | -| Schema | Schema is the data model information of the database, i.e., tree structure. It includes definitions such as the names and data types of measurement points. | -| Device | Corresponds to a physical device in an actual scenario, usually containing multiple measurement points. | -| Timeseries | Also known as: physical quantity, time series, timeline, point location, semaphore, indicator, measurement value, etc. It is a time series formed by arranging multiple data points in ascending order of timestamps. Usually, a Timeseries represents a collection point that can periodically collect physical quantities of the environment it is in. | -| Encoding | Encoding is a compression technique that represents data in binary form to improve storage efficiency. IoTDB supports various encoding methods for different types of data. 
For more detailed information, please refer to: [Encoding-and-Compression](../Technical-Insider/Encoding-and-Compression.md) | -| Compression | After data encoding, IoTDB uses compression technology to further compress binary data to enhance storage efficiency. IoTDB supports multiple compression methods. For more detailed information, please refer to: [Encoding-and-Compression](../Technical-Insider/Encoding-and-Compression.md) | - -## Distributed Related Concepts - -The following figure shows a common IoTDB 3C3D (3 ConfigNodes, 3 DataNodes) cluster deployment pattern: - - - -IoTDB's cluster includes the following common concepts: - -- Nodes (ConfigNode, DataNode, AINode) -- Region (SchemaRegion, DataRegion) -- Replica Groups - -These concepts are introduced in the sections below. - - -### Nodes - -An IoTDB cluster includes three types of nodes (processes): ConfigNode (management node), DataNode (data node), and AINode (analysis node), as shown below: - -- ConfigNode: Manages cluster node information, configuration information, user permissions, metadata, partition information, etc., and is responsible for the scheduling of distributed operations and load balancing. All ConfigNodes are full backups of each other, as shown by ConfigNode-1, ConfigNode-2, and ConfigNode-3 in the figure above. -- DataNode: Serves client requests and is responsible for data storage and computation, as shown by DataNode-1, DataNode-2, and DataNode-3 in the figure above. -- AINode: Provides machine learning capabilities, supports the registration of trained machine learning models, and allows model inference through SQL calls. It ships with built-in self-developed time-series large models and common machine learning algorithms (such as prediction and anomaly detection). - -### Data Partitioning - -In IoTDB, both metadata and data are divided into small partitions, namely Regions, which are managed by various DataNodes in the cluster.
- -- SchemaRegion: Metadata partition, managing the metadata of a part of devices and measurement points. SchemaRegions with the same RegionID on different DataNodes are mutual replicas, as shown in SchemaRegion-1 in the figure above, which has three replicas located on DataNode-1, DataNode-2, and DataNode-3. -- DataRegion: Data partition, managing the data of a part of devices for a certain period of time. DataRegions with the same RegionID on different DataNodes are mutual replicas, as shown in DataRegion-2 in the figure above, which has two replicas located on DataNode-1 and DataNode-2. -- For specific partitioning algorithms, please refer to: [Data Partitioning](../Technical-Insider/Cluster-data-partitioning.md) - -### Replica Groups - -The number of replicas for data and metadata can be configured. The recommended configurations for different deployment modes are as follows, where multi-replication can provide high-availability services. - -| Category | Parameter | Stand-Alone Recommended Configuration | Cluster Recommended Configuration | -| :----- | :------------------------ | :----------- | :----------- | -| Schema | schema_replication_factor | 1 | 3 | -| Data | data_replication_factor | 1 | 2 | - - -## Deployment Related Concepts - -IoTDB has three operating modes: Stand-Alone mode, Cluster mode, and Dual-Active mode. - -### Stand-Alone Mode - -An IoTDB Stand-Alone instance includes 1 ConfigNode and 1 DataNode, i.e., 1C1D; - - -- **Features**:Easy for developers to install and deploy, with low deployment and maintenance costs and convenient operations. -- **Applicable Scenarios**:Scenarios with limited resources or low requirements for high availability, such as edge-side servers. 
-- **Deployment Method**:[Stand-Alone-Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -### Dual-Active Mode - -Dual-active deployment is a feature of TimechoDB Enterprise Edition, which refers to two independent instances performing bidirectional synchronization and can provide services simultaneously. When one instance is restarted after a shutdown, the other instance will resume transmission of the missing data. - - -> An IoTDB dual-active instance usually consists of 2 single-machine nodes, i.e., 2 sets of 1C1D. Each instance can also be a cluster. - -- **Features**:The most resource-efficient high-availability solution. -- **Applicable Scenarios**:Scenarios with limited resources (only two servers) but requiring high-availability capabilities. -- **Deployment Method**:[Dual-Active-Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -### Cluster Mode - -An IoTDB cluster instance consists of 3 ConfigNodes and no less than 3 DataNodes, usually 3 DataNodes, i.e., 3C3D; when some nodes fail, the remaining nodes can still provide services, ensuring the high availability of the database service, and the database performance can be improved with the addition of nodes. - -- **Features**:High availability and scalability, and the system performance can be improved by adding DataNodes. -- **Applicable Scenarios**:Enterprise-level application scenarios requiring high availability and reliability. -- **Deployment Method**:[Cluster-Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - -### Summary of Features - -| Dimension | Stand-Alone Mode | Dual-Active Mode | Cluster Mode | -| ------------ | ---------------------------- | ------------------------ | ------------------------ | -| Applicable Scenarios | Edge-side deployment, scenarios with low requirements for high availability | High-availability business, disaster recovery scenarios, etc. 
| High-availability business, disaster recovery scenarios, etc. | -| Number of Machines Required | 1 | 2 | ≥3 | -| Security and Reliability | Cannot tolerate single-point failures | High, can tolerate single-point failures | High, can tolerate single-point failures | -| Scalability | Can expand DataNodes to improve performance | Each instance can be expanded as needed | Can expand DataNodes to improve performance | -| Performance | Can be expanded with the number of DataNodes | Same as the performance of one of the instances | Can be expanded with the number of DataNodes | - -- The deployment steps for single-machine mode and cluster mode are similar (adding ConfigNodes and DataNodes one by one), with only the number of replicas and the minimum number of nodes that can provide services being different. \ No newline at end of file diff --git a/src/UserGuide/dev-1.3/Basic-Concept/Operate-Metadata_timecho.md b/src/UserGuide/dev-1.3/Basic-Concept/Operate-Metadata_timecho.md deleted file mode 100644 index 1a21e8fa9..000000000 --- a/src/UserGuide/dev-1.3/Basic-Concept/Operate-Metadata_timecho.md +++ /dev/null @@ -1,1360 +0,0 @@ - - -# Timeseries Management - -## Database Management - -### Create Database - -According to the storage model we can set up the corresponding database. Two SQL statements are supported for creating databases, as follows: - -``` -IoTDB > create database root.ln -IoTDB > create database root.sgcc -``` - -We can thus create two databases using the above two SQL statements. - -It is worth noting that 1 database is recommended. - -When the path itself or the parent/child layer of the path is already created as database, the path is then not allowed to be created as database. For example, it is not feasible to create `root.ln.wf01` as database when two databases `root.ln` and `root.sgcc` exist. 
The system gives the corresponding error prompt as shown below: - -``` -IoTDB> CREATE DATABASE root.ln.wf01 -Msg: 300: root.ln has already been created as database. -IoTDB> create database root.ln.wf01 -Msg: 300: root.ln has already been created as database. -``` - -Database Node Naming Rules: -1. Node names may contain: **Chinese/English letters, Digits (0-9), Underscore (\_), Period (.), Backtick (\`)** -2. The entire name must be enclosed in **backticks (\`)** if: - - It consists solely of digits (e.g., 12345) - - It contains special characters (. or \_) that may cause ambiguity (e.g., db.01, \_temp) -3. Escaping Backticks: - If the node name itself contains a backtick (\`), use **two consecutive backticks (\`\`)** to represent a single backtick. Example: To name a node db123\` (containing one backtick), write it as \`db123\`\`\`. - -In addition, when IoTDB is deployed on Windows or macOS, the LayerName is case-insensitive, which means the databases `root.ln` and `root.LN` cannot both be created. - -### Show Databases - -After creating the database, we can use the [SHOW DATABASES](../SQL-Manual/SQL-Manual.md) statement and [SHOW DATABASES \](../SQL-Manual/SQL-Manual.md) to view the databases. The SQL statements are as follows: - -``` -IoTDB> SHOW DATABASES -IoTDB> SHOW DATABASES root.** -``` - -The result is as follows: - -``` -+-------------+----+-------------------------+-----------------------+-----------------------+ -|database| ttl|schema_replication_factor|data_replication_factor|time_partition_interval| -+-------------+----+-------------------------+-----------------------+-----------------------+ -| root.sgcc|null| 2| 2| 604800| -| root.ln|null| 2| 2| 604800| -+-------------+----+-------------------------+-----------------------+-----------------------+ -Total line number = 2 -It costs 0.060s -``` - -### Delete Database - -Users can use the `DELETE DATABASE ` statement to delete all databases matching the pathPattern.
Please note that the data in the database will also be deleted. - -``` -IoTDB > DELETE DATABASE root.ln -IoTDB > DELETE DATABASE root.sgcc -// delete all data, all timeseries and all databases -IoTDB > DELETE DATABASE root.** -``` - -### Count Databases - -Users can use the `COUNT DATABASE ` statement to count the number of databases. A `PathPattern` can be specified to count only the databases that match it. - -The SQL statements are as follows: - -``` -IoTDB> count databases -IoTDB> count databases root.* -IoTDB> count databases root.sgcc.* -IoTDB> count databases root.sgcc -``` - -The result is as follows: - -``` -+-------------+ -| database| -+-------------+ -| root.sgcc| -| root.turbine| -| root.ln| -+-------------+ -Total line number = 3 -It costs 0.003s - -+-------------+ -| database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.003s - -+-------------+ -| database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| database| -+-------------+ -| 0| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| database| -+-------------+ -| 1| -+-------------+ -Total line number = 1 -It costs 0.002s -``` - -### Setting up heterogeneous databases (Advanced operations) - -Users who are familiar with IoTDB metadata modeling -can set up heterogeneous databases in IoTDB to meet different production needs.
- -Currently, the following database heterogeneous parameters are supported: - -| Parameter | Type | Description | -| ------------------------- | ------- | --------------------------------------------- | -| TTL | Long | TTL of the Database | -| SCHEMA_REPLICATION_FACTOR | Integer | The schema replication number of the Database | -| DATA_REPLICATION_FACTOR | Integer | The data replication number of the Database | -| SCHEMA_REGION_GROUP_NUM | Integer | The SchemaRegionGroup number of the Database | -| DATA_REGION_GROUP_NUM | Integer | The DataRegionGroup number of the Database | - -Note the following when configuring heterogeneous parameters: - -+ TTL and TIME_PARTITION_INTERVAL must be positive integers. -+ SCHEMA_REPLICATION_FACTOR and DATA_REPLICATION_FACTOR must be smaller than or equal to the number of deployed DataNodes. -+ The function of SCHEMA_REGION_GROUP_NUM and DATA_REGION_GROUP_NUM are related to the parameter `schema_region_group_extension_policy` and `data_region_group_extension_policy` in iotdb-common.properties configuration file. Take DATA_REGION_GROUP_NUM as an example: - If `data_region_group_extension_policy=CUSTOM` is set, DATA_REGION_GROUP_NUM serves as the number of DataRegionGroups owned by the Database. - If `data_region_group_extension_policy=AUTO`, DATA_REGION_GROUP_NUM is used as the lower bound of the DataRegionGroup quota owned by the Database. That is, when the Database starts writing data, it will have at least this number of DataRegionGroups. - -Users can set any heterogeneous parameters when creating a Database, or adjust some heterogeneous parameters during a stand-alone/distributed IoTDB run. - -#### Set heterogeneous parameters when creating a Database - -The user can set any of the above heterogeneous parameters when creating a Database. The SQL statement is as follows: - -``` -CREATE DATABASE prefixPath (WITH databaseAttributeClause (COMMA? databaseAttributeClause)*)? 
-``` - -For example: - -``` -CREATE DATABASE root.db WITH SCHEMA_REPLICATION_FACTOR=1, DATA_REPLICATION_FACTOR=3, SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -#### Adjust heterogeneous parameters at run time - -Users can adjust some heterogeneous parameters during the IoTDB runtime, as shown in the following SQL statement: - -``` -ALTER DATABASE prefixPath WITH databaseAttributeClause (COMMA? databaseAttributeClause)* -``` - -For example: - -``` -ALTER DATABASE root.db WITH SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -Note that only the following heterogeneous parameters can be adjusted at runtime: - -+ SCHEMA_REGION_GROUP_NUM -+ DATA_REGION_GROUP_NUM - -#### Show heterogeneous databases - -The user can query the specific heterogeneous configuration of each Database, and the SQL statement is as follows: - -``` -SHOW DATABASES DETAILS prefixPath? -``` - -For example: - -``` -IoTDB> SHOW DATABASES DETAILS -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|Database| TTL|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|MinSchemaRegionGroupNum|MaxSchemaRegionGroupNum|DataRegionGroupNum|MinDataRegionGroupNum|MaxDataRegionGroupNum| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|root.db1| null| 1| 3| 604800000| 0| 1| 1| 0| 2| 2| -|root.db2|86400000| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -|root.db3| null| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -Total 
line number = 3 -It costs 0.058s -``` - -The query results in each column are as follows: - -+ The name of the Database -+ The TTL of the Database -+ The schema replication number of the Database -+ The data replication number of the Database -+ The time partition interval of the Database -+ The current SchemaRegionGroup number of the Database -+ The required minimum SchemaRegionGroup number of the Database -+ The permitted maximum SchemaRegionGroup number of the Database -+ The current DataRegionGroup number of the Database -+ The required minimum DataRegionGroup number of the Database -+ The permitted maximum DataRegionGroup number of the Database - -### TTL - -IoTDB supports setting data retention time (TTL) at the device level, allowing the system to automatically and periodically delete old data to effectively control disk space and maintain high query performance and low memory usage. TTL is set in milliseconds by default. Once data expires, it cannot be queried or written, but physical deletion is delayed until compaction. Please note that changes to TTL may temporarily affect data queryability, and if TTL is reduced or removed, previously invisible data due to TTL may reappear. - -Important notes: -- TTL is set in milliseconds and is not affected by the time precision in the configuration file. -- Changes to TTL may affect data queryability. -- The system will eventually remove expired data, but there may be a delay. -- TTL determines data expiration based on the data point timestamp, not the ingestion time. -- The system supports setting up to 1000 TTL rules. When the limit is reached, existing rules must be removed before new ones can be added. - -#### TTL Path Rule -The path can only be prefix paths (i.e., the path cannot contain \* , except \*\* in the last level). -This path will match devices and also allows users to specify paths without asterisks as specific databases or devices. 
-When the path does not contain asterisks, the system will check if it matches a database; if it matches a database, both the path and path.\*\* will be set at the same time. Note: Device TTL settings do not verify the existence of metadata, i.e., it is allowed to set TTL for a non-existent device. -``` -qualified paths: -root.** -root.db.** -root.db.group1.** -root.db -root.db.group1.d1 - -unqualified paths: -root.*.db -root.**.db.* -root.db.* -``` -#### TTL Applicable Rules -When a device is subject to multiple TTL rules, the longest and most specific matching rule takes priority. For example, for the device "root.bj.hd.dist001.turbine001", the rule "root.bj.hd.dist001.turbine001" takes precedence over "root.bj.hd.dist001.\*\*", and the rule "root.bj.hd.dist001.\*\*" takes precedence over "root.bj.hd.\*\*". -#### Set TTL -The set ttl operation can be understood as setting a TTL rule. For example, setting ttl to root.sg.group1.\*\* is equivalent to mounting the TTL on all devices that match this path pattern. -The unset ttl operation unmounts the TTL from the corresponding path pattern; if there is no corresponding TTL, nothing is done. -If you want to set TTL to be infinitely large, you can use the INF keyword. -The SQL statement for setting TTL is as follows: -``` -set ttl to pathPattern 360000; -``` -This sets the Time to Live (TTL) of pathPattern to 360,000 milliseconds. The pathPattern must not contain a wildcard (\*) in the middle, and any wildcard it does contain must be a double asterisk (\*\*) at the last level. The pathPattern is used to match corresponding devices. -To maintain compatibility with older SQL syntax, if the user-provided pathPattern matches a database (db), the path pattern is automatically expanded to include all sub-paths denoted by path.\*\*. -For instance, writing "set ttl to root.sg 360000" will automatically be transformed into "set ttl to root.sg.\*\* 360000", which sets the TTL for all devices under root.sg. 
However, if the specified pathPattern does not match a database, the aforementioned logic will not apply. For example, writing "set ttl to root.sg.group 360000" will not be expanded to "root.sg.group.\*\*" since root.sg.group does not match a database. -It is also permissible to specify a particular device without a wildcard (\*). -#### Unset TTL - -To unset TTL, use the following SQL statement: - -``` -IoTDB> unset ttl from root.ln -``` - -After the TTL is unset, all data will be accepted in `root.ln`. -``` -IoTDB> unset ttl from root.sgcc.** -``` - -Unset the TTL in the `root.sgcc` path. - -New syntax -``` -IoTDB> unset ttl from root.** -``` - -Old syntax -``` -IoTDB> unset ttl to root.** -``` -There is no functional difference between the old and new syntax, and they are compatible with each other. -The new syntax is just more conventional in terms of wording. - -Unset the TTL setting for all path patterns. - -#### Show TTL - -To show TTL, use the following SQL statements: - -show all ttl - -``` -IoTDB> SHOW ALL TTL -+--------------+--------+ -| path| TTL| -| root.**|55555555| -| root.sg2.a.**|44440000| -+--------------+--------+ -``` - -show ttl on pathPattern -``` -IoTDB> SHOW TTL ON root.db.**; -+--------------+--------+ -| path| TTL| -| root.db.**|55555555| -| root.db.a.**|44440000| -+--------------+--------+ -``` - -The SHOW ALL TTL example gives the TTL for all path patterns. -The SHOW TTL ON pathPattern example shows the TTL for the specified path pattern. - -Display devices' TTL -``` -IoTDB> show devices -+---------------+---------+---------+ -| Device|IsAligned| TTL| -+---------------+---------+---------+ -|root.sg.device1| false| 36000000| -|root.sg.device2| true| INF| -+---------------+---------+---------+ -``` -Every device always has a TTL; the value cannot be null. INF represents infinity. 
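The TTL path rule and the precedence rule described in this section can be modeled with a short Python sketch. This is an illustration only, not IoTDB's implementation: the helper names `is_valid_ttl_path`, `rule_matches`, and `effective_ttl` are hypothetical, and the sketch assumes that `prefix.**` matches every device strictly below `prefix` and that a device with no matching rule has a TTL of INF.

```python
from typing import Dict

INF = float("inf")  # stands in for the INF keyword (assumption for this sketch)


def is_valid_ttl_path(path: str) -> bool:
    """A TTL path may not contain '*', except '**' as its last level."""
    nodes = path.split(".")
    middle_ok = all("*" not in node for node in nodes[:-1])
    last_ok = "*" not in nodes[-1] or nodes[-1] == "**"
    return middle_ok and last_ok


def rule_matches(rule: str, device: str) -> bool:
    """Exact rules match one device; 'prefix.**' matches devices below prefix."""
    if rule.endswith(".**"):
        return device.startswith(rule[:-3] + ".")
    return rule == device


def concrete_levels(rule: str) -> int:
    """Number of non-wildcard levels, used as the 'longer, more precise' score."""
    nodes = rule.split(".")
    return len(nodes) - 1 if nodes[-1] == "**" else len(nodes)


def effective_ttl(device: str, rules: Dict[str, float]) -> float:
    """Return the TTL of the most specific matching rule, or INF if none match."""
    candidates = [rule for rule in rules if rule_matches(rule, device)]
    if not candidates:
        return INF
    return rules[max(candidates, key=concrete_levels)]


rules = {
    "root.bj.hd.**": 1000,
    "root.bj.hd.dist001.**": 2000,
    "root.bj.hd.dist001.turbine001": 3000,
}
print(effective_ttl("root.bj.hd.dist001.turbine001", rules))
print(effective_ttl("root.bj.hd.dist001.turbine002", rules))
```

Scoring a rule by its number of concrete levels is enough here because an exact rule always has more concrete levels than any wildcard rule that matches the same device.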
- - -## Device Template - -IoTDB supports the device template function, enabling different entities of the same type to share metadata, reduce the memory usage of metadata, and simplify the management of numerous entities and measurements. - - -### Create Device Template - -The SQL syntax for creating a device template is as follows: - -```sql -CREATE DEVICE TEMPLATE ALIGNED? '(' [',' ]+ ')' -``` - -**Example 1:** Create a template containing two non-aligned timeseries - -```shell -IoTDB> create device template t1 (temperature FLOAT encoding=RLE, status BOOLEAN encoding=PLAIN compression=SNAPPY) -``` - -**Example 2:** Create a template containing a group of aligned timeseries - -```shell -IoTDB> create device template t2 aligned (lat FLOAT encoding=Gorilla, lon FLOAT encoding=Gorilla) -``` - -The `lat` and `lon` measurements are aligned. - -![img](/img/%E6%A8%A1%E6%9D%BF.png) - -![img](/img/templateEN.jpg) - -### Set Device Template - -After a device template is created, it should be set to a specific path before creating related timeseries or inserting data. - -**Make sure the related database has been created before setting the template.** - -**It is recommended to set a device template to a database path. Setting a device template to a path above the database is not recommended.** - -**It is forbidden to create timeseries under a path on which a device template is set. A device template shall not be set on a prefix path of an existing timeseries.** - -The SQL statement for setting a device template is as follows: - -```shell -IoTDB> set device template t1 to root.sg1.d1 -``` - -### Activate Device Template - -After the device template is set, and if automatic schema creation is enabled, you can insert data into the timeseries. For example, suppose there's a database root.sg1 and t1 has been set to root.sg1.d1, then timeseries like root.sg1.d1.temperature and root.sg1.d1.status are available and data points can be inserted. 
- - -**Attention**: Until data is inserted, or if automatic schema creation is not enabled, the timeseries defined by the device template are not created. You can use the following SQL statement to create the timeseries, that is, to activate the device template, before inserting data: - -```shell -IoTDB> create timeseries using device template on root.sg1.d1 -``` - -**Example:** Execute the following statement - -```shell -IoTDB> set device template t1 to root.sg1.d1 -IoTDB> set device template t2 to root.sg1.d2 -IoTDB> create timeseries using device template on root.sg1.d1 -IoTDB> create timeseries using device template on root.sg1.d2 -``` - -Show the time series: - -```sql -show timeseries root.sg1.** -```` - -```shell -+-----------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression|tags|attributes|deadband|deadband parameters| -+-----------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -|root.sg1.d1.temperature| null| root.sg1| FLOAT| RLE| SNAPPY|null| null| null| null| -| root.sg1.d1.status| null| root.sg1| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -| root.sg1.d2.lon| null| root.sg1| FLOAT| GORILLA| SNAPPY|null| null| null| null| -| root.sg1.d2.lat| null| root.sg1| FLOAT| GORILLA| SNAPPY|null| null| null| null| -+-----------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -``` - -Show the devices: - -```sql -show devices root.sg1.** -```` - -```shell -+---------------+---------+ -| devices|isAligned| -+---------------+---------+ -| root.sg1.d1| false| -| root.sg1.d2| true| -+---------------+---------+ -```` - -### Show Device Template - -- Show all device templates - -The SQL statement looks like this: - -```shell -IoTDB> show device templates -``` - -The execution result is as follows: - -```shell -+-------------+ -|template 
name| -+-------------+ -| t2| -| t1| -+-------------+ -``` - -- Show nodes under in device template - -The SQL statement looks like this: - -```shell -IoTDB> show nodes in device template t1 -``` - -The execution result is as follows: - -```shell -+-----------+--------+--------+-----------+ -|child nodes|dataType|encoding|compression| -+-----------+--------+--------+-----------+ -|temperature| FLOAT| RLE| SNAPPY| -| status| BOOLEAN| PLAIN| SNAPPY| -+-----------+--------+--------+-----------+ -``` - -- Show the path prefix where a device template is set - -```shell -IoTDB> show paths set device template t1 -``` - -The execution result is as follows: - -```shell -+-----------+ -|child paths| -+-----------+ -|root.sg1.d1| -+-----------+ -``` - -- Show the path prefix where a device template is used (i.e. the time series has been created) - -```shell -IoTDB> show paths using device template t1 -``` - -The execution result is as follows: - -```shell -+-----------+ -|child paths| -+-----------+ -|root.sg1.d1| -+-----------+ -``` - -### Deactivate device Template - -To delete a group of timeseries represented by device template, namely deactivate the device template, use the following SQL statement: - -```shell -IoTDB> delete timeseries of device template t1 from root.sg1.d1 -``` - -or - -```shell -IoTDB> deactivate device template t1 from root.sg1.d1 -``` - -The deactivation supports batch process. - -```shell -IoTDB> delete timeseries of device template t1 from root.sg1.*, root.sg2.* -``` - -or - -```shell -IoTDB> deactivate device template t1 from root.sg1.*, root.sg2.* -``` - -If the template name is not provided in sql, all template activation on paths matched by given path pattern will be removed. 
- -### Unset Device Template - -The SQL statement for unsetting a device template is as follows: - -```shell -IoTDB> unset device template t1 from root.sg1.d1 -``` - -**Attention**: Before unsetting a device template, make sure that none of the timeseries represented by it still exist. This can be achieved with the deactivation operation. - -### Drop Device Template - -The SQL statement for dropping a device template is as follows: - -```shell -IoTDB> drop device template t1 -``` - -**Attention**: Dropping a template that is still set on a path is not supported. - -### Alter Device Template - -In a scenario where measurements need to be added, you can modify the template to add measurements to all devices using the device template. - -The SQL statement for altering a device template is as follows: - -```shell -IoTDB> alter device template t1 add (speed FLOAT encoding=RLE) -``` - -**When data is inserted into a device whose prefix path has a device template set, and the insertion contains measurements not present in that device template, those measurements are automatically added to the device template.** - -## Timeseries Management - -### Create Timeseries - -According to the storage model selected before, we can create corresponding timeseries in the two databases respectively. 
The SQL statements for creating timeseries are as follows: - -``` -IoTDB > create timeseries root.ln.wf01.wt01.status with datatype=BOOLEAN,encoding=PLAIN -IoTDB > create timeseries root.ln.wf01.wt01.temperature with datatype=FLOAT,encoding=RLE -IoTDB > create timeseries root.ln.wf02.wt02.hardware with datatype=TEXT,encoding=PLAIN -IoTDB > create timeseries root.ln.wf02.wt02.status with datatype=BOOLEAN,encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.status with datatype=BOOLEAN,encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.temperature with datatype=FLOAT,encoding=RLE -``` - -From v0.13, you can use a simplified version of the SQL statements to create timeseries: - -``` -IoTDB > create timeseries root.ln.wf01.wt01.status BOOLEAN encoding=PLAIN -IoTDB > create timeseries root.ln.wf01.wt01.temperature FLOAT encoding=RLE -IoTDB > create timeseries root.ln.wf02.wt02.hardware TEXT encoding=PLAIN -IoTDB > create timeseries root.ln.wf02.wt02.status BOOLEAN encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.status BOOLEAN encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.temperature FLOAT encoding=RLE -``` - -Notice that when in the CREATE TIMESERIES statement the encoding method conflicts with the data type, the system gives the corresponding error prompt as shown below: - -``` -IoTDB > create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF -error: encoding TS_2DIFF does not support BOOLEAN -``` - -Please refer to [Encoding](../Technical-Insider/Encoding-and-Compression.md) for correspondence between data type and encoding. 
- -### Create Aligned Timeseries - -The SQL statement for creating a group of aligned timeseries is as follows: - -``` -IoTDB> CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT encoding=PLAIN compressor=SNAPPY, longitude FLOAT encoding=PLAIN compressor=SNAPPY) -``` - -You can set a different datatype, encoding, and compression for each timeseries in a group of aligned timeseries. - -Setting an alias, tags, and attributes for aligned timeseries is also supported. - -### Delete Timeseries - -To delete the timeseries we created before, we can use the `(DELETE | DROP) TimeSeries ` statement. - -The usage is as follows: - -``` -IoTDB> delete timeseries root.ln.wf01.wt01.status -IoTDB> delete timeseries root.ln.wf01.wt01.temperature, root.ln.wf02.wt02.hardware -IoTDB> delete timeseries root.ln.wf02.* -IoTDB> drop timeseries root.ln.wf02.* -``` - -### Show Timeseries - -* SHOW LATEST? TIMESERIES pathPattern? whereClause? limitClause? - - SHOW TIMESERIES accepts four optional clauses and returns information about the matching time series. - -Timeseries information includes: timeseries path, alias of measurement, database it belongs to, data type, encoding type, compression type, tags and attributes. - -Examples: - -* SHOW TIMESERIES - - presents all timeseries information in JSON form - -* SHOW TIMESERIES <`PathPattern`> - - returns all timeseries information matching the given <`PathPattern`>. 
SQL statements are as follows: - -``` -IoTDB> show timeseries root.** -IoTDB> show timeseries root.ln.** -``` - -The results are shown below respectively: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.016s - -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression|tags|attributes|deadband|deadband parameters| 
-+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -|root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY|null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -Total line number = 4 -It costs 0.004s -``` - -* SHOW TIMESERIES LIMIT INT OFFSET INT - - returns all the timeseries information start from the offset and limit the number of series returned. For example, - -``` -show timeseries root.ln.** limit 10 offset 10 -``` - -* SHOW TIMESERIES WHERE TIMESERIES contains 'containStr' - - The query result set is filtered by string fuzzy matching based on the names of the timeseries. 
For example: - -``` -show timeseries root.ln.** where timeseries contains 'wf01.wt' -``` - -The result is shown below: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 2 -It costs 0.016s -``` - -* SHOW TIMESERIES WHERE DataType=type - - The query result set is filtered by data type. 
For example: - -``` -show timeseries root.ln.** where dataType=FLOAT -``` - -The result is shown below: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 3 -It costs 0.016s - -``` - - -* SHOW TIMESERIES WHERE TAGS(KEY) = VALUE -* SHOW TIMESERIES WHERE TAGS(KEY) CONTAINS VALUE - - The query result set is filtered by tags. 
For example: - -``` -show timeseries root.ln.** where TAGS(unit)='c' -show timeseries root.ln.** where TAGS(description) contains 'test1' -``` - -The query results are as follows: - -``` -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s - -``` - - -* SHOW LATEST TIMESERIES - - all the returned timeseries information should be sorted in descending order of the last timestamp of timeseries - -It is worth noting that when the queried path does not exist, the system will return no timeseries. - - -### Count Timeseries - -IoTDB is able to use `COUNT TIMESERIES ` to count the number of timeseries matching the path. 
SQL statements are as follows: - -* `WHERE` condition could be used to fuzzy match a time series name with the following syntax: `COUNT TIMESERIES WHERE TIMESERIES contains 'containStr'`. -* `WHERE` condition could be used to filter result by data type with the syntax: `COUNT TIMESERIES WHERE DataType='`. -* `WHERE` condition could be used to filter result by tags with the syntax: `COUNT TIMESERIES WHERE TAGS(key)='value'` or `COUNT TIMESERIES WHERE TAGS(key) contains 'value'`. -* `LEVEL` could be defined to show count the number of timeseries of each node at the given level in current Metadata Tree. This could be used to query the number of sensors under each device. The grammar is: `COUNT TIMESERIES GROUP BY LEVEL=`. - - -``` -IoTDB > COUNT TIMESERIES root.** -IoTDB > COUNT TIMESERIES root.ln.** -IoTDB > COUNT TIMESERIES root.ln.*.*.status -IoTDB > COUNT TIMESERIES root.ln.wf01.wt01.status -IoTDB > COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' -IoTDB > COUNT TIMESERIES root.** WHERE DATATYPE = INT64 -IoTDB > COUNT TIMESERIES root.** WHERE TAGS(unit) contains 'c' -IoTDB > COUNT TIMESERIES root.** WHERE TAGS(unit) = 'c' -IoTDB > COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' group by level = 1 -``` - -For example, if there are several timeseries (use `show timeseries` to show all timeseries): - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| 
root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| {"unit":"c"}| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| {"description":"test1"}| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.004s -``` - -Then the Metadata Tree will be as below: - -
- -As can be seen, `root` is considered as `LEVEL=0`. So when you enter statements such as: - -``` -IoTDB > COUNT TIMESERIES root.** GROUP BY LEVEL=1 -IoTDB > COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2 -IoTDB > COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2 -``` - -You will get following results: - -``` -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -| root.sgcc| 2| -| root.ln| 4| -+------------+-----------------+ -Total line number = 3 -It costs 0.002s - -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -|root.ln.wf02| 2| -|root.ln.wf01| 2| -+------------+-----------------+ -Total line number = 2 -It costs 0.002s - -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -|root.ln.wf01| 2| -+------------+-----------------+ -Total line number = 1 -It costs 0.002s -``` - -> Note: The path of timeseries is just a filter condition, which has no relationship with the definition of level. - -### Active Timeseries Query -By adding WHERE time filter conditions to the existing SHOW/COUNT TIMESERIES, we can obtain time series with data within the specified time range. - -It is important to note that in metadata queries with time filters, views are not considered; only the time series actually stored in the TsFile are taken into account. 
- -An example usage is as follows: -``` -IoTDB> insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -IoTDB> insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -IoTDB> insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -IoTDB> show timeseries; -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -IoTDB> show timeseries where time >= 15000 and time < 16000; -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| 
null| null| BASE| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -IoTDB> count timeseries where time >= 15000 and time < 16000; -+-----------------+ -|count(timeseries)| -+-----------------+ -| 4| -+-----------------+ -``` -Regarding the definition of active time series: a time series is considered active if its data can be queried normally, so time series whose data was inserted and then deleted are not included. -### Tag and Attribute Management - -We can also add an alias, extra tag and attribute information while creating one timeseries. - -The differences between tag and attribute are: - -* A tag can be used to query the path of a timeseries; an inverted index on tags is maintained in memory: Tag -> Timeseries -* An attribute can only be queried by timeseries path: Timeseries -> Attribute - -The SQL statements for creating timeseries with extra tag and attribute information are extended as follows: - -``` -create timeseries root.turbine.d1.s1(temperature) with datatype=FLOAT, encoding=RLE, compression=SNAPPY tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2) -``` - -The `temperature` in the brackets is an alias for the sensor `s1`. So we can use `temperature` to replace `s1` anywhere. - -> IoTDB also supports using the AS function to set an alias. The difference between the two is: the alias set by the AS function replaces the whole time series name, is temporary, and is not bound to the time series; the alias mentioned above is only an alias of the sensor, is bound to it, and can be used interchangeably with the original sensor name. - -> Notice that the size of the extra tag and attribute information shouldn't exceed the `tag_attribute_total_size`. 
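The difference between tags and attributes stated above (tags are indexed as Tag -> Timeseries, attributes are only reachable from the timeseries path) can be illustrated with a small Python model. This is a conceptual sketch, not IoTDB's internal data structure; the class `TagIndex` and its methods are hypothetical names chosen for the example.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class TagIndex:
    """Toy model: tags get an inverted index, attributes are stored per path."""

    def __init__(self) -> None:
        # (tag key, tag value) -> set of timeseries paths
        self._by_tag = defaultdict(set)
        # timeseries path -> attribute dictionary
        self._attributes: Dict[str, Dict[str, str]] = {}

    def add(
        self,
        path: str,
        tags: Optional[Dict[str, str]] = None,
        attributes: Optional[Dict[str, str]] = None,
    ) -> None:
        for key, value in (tags or {}).items():
            self._by_tag[(key, value)].add(path)
        self._attributes[path] = dict(attributes or {})

    def find_by_tag(self, key: str, value: str) -> List[str]:
        """Tag -> Timeseries lookup through the inverted index."""
        return sorted(self._by_tag.get((key, value), set()))

    def attributes_of(self, path: str) -> Dict[str, str]:
        """Timeseries -> Attribute lookup; there is no reverse index."""
        return self._attributes.get(path, {})


index = TagIndex()
index.add("root.turbine.d1.s1", tags={"tag1": "v1", "tag2": "v2"}, attributes={"attr1": "v1"})
index.add("root.turbine.d1.s2", tags={"tag1": "v1"})
print(index.find_by_tag("tag1", "v1"))
print(index.attributes_of("root.turbine.d1.s1"))
```

In this model, searching for `attr1=v1` through `find_by_tag` returns nothing, which mirrors the statement that attributes cannot be used to look up timeseries paths.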
- -We can update the tag and attribute information after creating the timeseries as follows: - -* Rename the tag/attribute key - -``` -ALTER timeseries root.turbine.d1.s1 RENAME tag1 TO newTag1 -``` - -* Reset the tag/attribute value - -``` -ALTER timeseries root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1 -``` - -* Delete the existing tag/attribute - -``` -ALTER timeseries root.turbine.d1.s1 DROP tag1, tag2 -``` - -* Add new tags - -``` -ALTER timeseries root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4 -``` - -* Add new attributes - -``` -ALTER timeseries root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4 -``` - -* Upsert alias, tags and attributes - -> Adds the alias or a new key-value pair if the alias or key doesn't exist; otherwise, updates the old one with the new value. - -``` -ALTER timeseries root.turbine.d1.s1 UPSERT ALIAS=newAlias TAGS(tag3=v3, tag4=v4) ATTRIBUTES(attr3=v3, attr4=v4) -``` - -* Show timeseries using tags. Use TAGS(tagKey) to identify the tags used as filter key - -``` -SHOW TIMESERIES (<`PathPattern`>)? timeseriesWhereClause -``` - -Returns all the timeseries information that satisfies the where condition and matches the pathPattern.
SQL statements are as follows: - -``` -ALTER timeseries root.ln.wf02.wt02.hardware ADD TAGS unit=c -ALTER timeseries root.ln.wf02.wt02.status ADD TAGS description=test1 -show timeseries root.ln.** where TAGS(unit)='c' -show timeseries root.ln.** where TAGS(description) contains 'test1' -``` - -The results are shown below, respectively: - -``` -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s -``` - -- count timeseries using tags - -``` -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause GROUP BY LEVEL= -``` - -Returns the number of timeseries that satisfy the where condition and match the pathPattern.
SQL statements are as follows: - -``` -count timeseries -count timeseries root.** where TAGS(unit)='c' -count timeseries root.** where TAGS(unit)='c' group by level = 2 -``` - -The results are shown below, respectively: - -``` -IoTDB> count timeseries -+-----------------+ -|count(timeseries)| -+-----------------+ -| 6| -+-----------------+ -Total line number = 1 -It costs 0.019s -IoTDB> count timeseries root.** where TAGS(unit)='c' -+-----------------+ -|count(timeseries)| -+-----------------+ -| 2| -+-----------------+ -Total line number = 1 -It costs 0.020s -IoTDB> count timeseries root.** where TAGS(unit)='c' group by level = 2 -+--------------+-----------------+ -| column|count(timeseries)| -+--------------+-----------------+ -| root.ln.wf02| 2| -| root.ln.wf01| 0| -|root.sgcc.wf03| 0| -+--------------+-----------------+ -Total line number = 3 -It costs 0.011s -``` - -> Notice that only one condition is supported in the where clause. It is either an equality filter or a `contains` filter. In both cases, the property in the where condition must be a tag.
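
The single-condition rule in the note above can be checked before a statement is sent to IoTDB. The following is a minimal sketch, not part of IoTDB: `build_tag_filter_sql` is a hypothetical helper name, and it only composes the statement text for a `show` or `count` query with one tag condition, rejecting any operator other than `=` or `contains`. Tag values are inserted as-is, so the sketch assumes they contain no single quotes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of IoTDB): compose a tag-filtered statement.
# Usage: build_tag_filter_sql VERB PATH_PATTERN TAG_KEY TAG_VALUE [OPERATOR]
#   VERB     : show | count
#   OPERATOR : "=" (default) or "contains"
build_tag_filter_sql() {
  local verb="$1" pattern="$2" key="$3" value="$4" op="${5:-=}"

  # Only the two statement kinds described in this section are accepted.
  case "$verb" in
    show|count) ;;
    *) echo "unsupported verb: $verb" >&2; return 1 ;;
  esac

  # Exactly one condition is produced: an equality or a contains filter.
  case "$op" in
    "=")
      printf "%s timeseries %s where TAGS(%s)='%s'\n" "$verb" "$pattern" "$key" "$value"
      ;;
    contains)
      printf "%s timeseries %s where TAGS(%s) contains '%s'\n" "$verb" "$pattern" "$key" "$value"
      ;;
    *)
      echo "unsupported operator: $op" >&2
      return 1
      ;;
  esac
}

# Reproduces two of the statement shapes used in this section.
build_tag_filter_sql show 'root.ln.**' unit c
build_tag_filter_sql count 'root.**' description test1 contains
```

The generated text can then be pasted into the IoTDB CLI. Keeping the operator check in one place makes it harder to build a where clause with several conditions by accident.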
- -create aligned timeseries - -``` -create aligned timeseries root.sg1.d1(s1 INT32 tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2), s2 DOUBLE tags(tag3=v3, tag4=v4) attributes(attr3=v3, attr4=v4)) -``` - -The execution result is as follows: - -``` -IoTDB> show timeseries -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -|root.sg1.d1.s2| null| root.sg1| DOUBLE| GORILLA| SNAPPY|{"tag4":"v4","tag3":"v3"}|{"attr4":"v4","attr3":"v3"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -Support query: - -``` -IoTDB> show timeseries where TAGS(tag1)='v1' -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -The above operations are supported for timeseries tag, attribute updates, etc. 
- -## Node Management - -### Show Child Paths - -``` -SHOW CHILD PATHS pathPattern -``` - -Return all child paths and their node types of all the paths matching pathPattern. - -node types: ROOT -> DB INTERNAL -> DATABASE -> INTERNAL -> DEVICE -> TIMESERIES - - -Example: - -* return the child paths of root.ln:show child paths root.ln - -``` -+------------+----------+ -| child paths|node types| -+------------+----------+ -|root.ln.wf01| INTERNAL| -|root.ln.wf02| INTERNAL| -+------------+----------+ -Total line number = 2 -It costs 0.002s -``` - -> get all paths in form of root.xx.xx.xx:show child paths root.xx.xx - -### Show Child Nodes - -``` -SHOW CHILD NODES pathPattern -``` - -Return all child nodes of the pathPattern. - -Example: - -* return the child nodes of root:show child nodes root - -``` -+------------+ -| child nodes| -+------------+ -| ln| -+------------+ -``` - -* return the child nodes of root.ln:show child nodes root.ln - -``` -+------------+ -| child nodes| -+------------+ -| wf01| -| wf02| -+------------+ -``` - -### Count Nodes - -IoTDB is able to use `COUNT NODES LEVEL=` to count the number of nodes at - the given level in current Metadata Tree considering a given pattern. IoTDB will find paths that - match the pattern and counts distinct nodes at the specified level among the matched paths. - This could be used to query the number of devices with specified measurements. 
The usage is as - follows: - -``` -IoTDB > COUNT NODES root.** LEVEL=2 -IoTDB > COUNT NODES root.ln.** LEVEL=2 -IoTDB > COUNT NODES root.ln.wf01.** LEVEL=3 -IoTDB > COUNT NODES root.**.temperature LEVEL=3 -``` - -For the above examples and metadata tree, you get the following results: - -``` -+------------+ -|count(nodes)| -+------------+ -| 4| -+------------+ -Total line number = 1 -It costs 0.003s - -+------------+ -|count(nodes)| -+------------+ -| 2| -+------------+ -Total line number = 1 -It costs 0.002s - -+------------+ -|count(nodes)| -+------------+ -| 1| -+------------+ -Total line number = 1 -It costs 0.002s - -+------------+ -|count(nodes)| -+------------+ -| 2| -+------------+ -Total line number = 1 -It costs 0.002s -``` - -> Note: The path of timeseries is just a filter condition, which has no relationship with the definition of level. - -### Show Devices - -* SHOW DEVICES pathPattern? (WITH DATABASE)? devicesWhereClause? limitClause? - -Similar to `Show Timeseries`, IoTDB also supports two ways of viewing devices: - -* `SHOW DEVICES` statement presents all devices' information, which is equivalent to `SHOW DEVICES root.**`. -* `SHOW DEVICES ` statement specifies the `PathPattern` and returns the devices' information matching the pathPattern and under the given level. -* `WHERE` condition supports `DEVICE contains 'xxx'` to do a fuzzy query based on the device name.
- -SQL statement is as follows: - -``` -IoTDB> show devices -IoTDB> show devices root.ln.** -IoTDB> show devices root.ln.** where device contains 't' -``` - -You can get results below: - -``` -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.ln.wf01.wt01| false| -| root.ln.wf02.wt02| false| -|root.sgcc.wf03.wt01| false| -| root.turbine.d1| false| -+-------------------+---------+ -Total line number = 4 -It costs 0.002s - -+-----------------+---------+ -| devices|isAligned| -+-----------------+---------+ -|root.ln.wf01.wt01| false| -|root.ln.wf02.wt02| false| -+-----------------+---------+ -Total line number = 2 -It costs 0.001s -``` - -`isAligned` indicates whether the timeseries under the device are aligned. - -To view devices' information with database, we can use `SHOW DEVICES WITH DATABASE` statement. - -* `SHOW DEVICES WITH DATABASE` statement presents all devices' information with their database. -* `SHOW DEVICES WITH DATABASE` statement specifies the `PathPattern` and returns the - devices' information under the given level with their database information. 
- -SQL statement is as follows: - -``` -IoTDB> show devices with database -IoTDB> show devices root.ln.** with database -``` - -You can get results below: - -``` -+-------------------+-------------+---------+ -| devices| database|isAligned| -+-------------------+-------------+---------+ -| root.ln.wf01.wt01| root.ln| false| -| root.ln.wf02.wt02| root.ln| false| -|root.sgcc.wf03.wt01| root.sgcc| false| -| root.turbine.d1| root.turbine| false| -+-------------------+-------------+---------+ -Total line number = 4 -It costs 0.003s - -+-----------------+-------------+---------+ -| devices| database|isAligned| -+-----------------+-------------+---------+ -|root.ln.wf01.wt01| root.ln| false| -|root.ln.wf02.wt02| root.ln| false| -+-----------------+-------------+---------+ -Total line number = 2 -It costs 0.001s -``` - -### Count Devices - -* COUNT DEVICES / - -The above statement is used to count the number of devices. At the same time, it is allowed to specify `PathPattern` to count the number of devices matching the `PathPattern`. - -SQL statement is as follows: - -``` -IoTDB> show devices -IoTDB> count devices -IoTDB> count devices root.ln.** -``` - -You can get results below: - -``` -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -|root.sgcc.wf03.wt03| false| -| root.turbine.d1| false| -| root.ln.wf02.wt02| false| -| root.ln.wf01.wt01| false| -+-------------------+---------+ -Total line number = 4 -It costs 0.024s - -+--------------+ -|count(devices)| -+--------------+ -| 4| -+--------------+ -Total line number = 1 -It costs 0.004s - -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -Total line number = 1 -It costs 0.004s -``` - -### Active Device Query -Similar to active timeseries query, we can add time filter conditions to device viewing and statistics to query active devices that have data within a certain time range. The definition of active here is the same as for active time series. 
An example usage is as follows: -``` -IoTDB> insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -IoTDB> insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -IoTDB> insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -IoTDB> show devices; -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -| root.sg.data3| false| -+-------------------+---------+ - -IoTDB> show devices where time >= 15000 and time < 16000; -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -+-------------------+---------+ - -IoTDB> count devices where time >= 15000 and time < 16000; -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -``` \ No newline at end of file diff --git a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/AINode_Deployment_timecho.md b/src/UserGuide/dev-1.3/Deployment-and-Maintenance/AINode_Deployment_timecho.md deleted file mode 100644 index 35070c791..000000000 --- a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/AINode_Deployment_timecho.md +++ /dev/null @@ -1,574 +0,0 @@ - -# AINode Deployment - -## AINode Introduction - -### Capability Introduction - - AINode is the third type of endogenous node provided by IoTDB after the Configurable Node and DataNode. This node extends its ability to perform machine learning analysis on time series by interacting with the DataNode and Configurable Node of the IoTDB cluster. It supports the introduction of existing machine learning models from external sources for registration and the use of registered models to complete time series analysis tasks on specified time series data through simple SQL statements. The creation, management, and inference of models are integrated into the database engine. 
Currently, machine learning algorithms or self-developed models are available for common time series analysis scenarios, such as prediction and anomaly detection. - -### Delivery Method - It is an additional package outside the IoTDB cluster, with independent installation. - -### Deployment mode -
- - -
- -## Installation preparation - -### Get installation package - - Users can download the AINode software installation package and unzip it to complete the installation of AINode. - - Unzip the installation package - `(apache-iotdb--ainode-bin.zip)`. The directory structure after unpacking the installation package is as follows: -| **Directory** | **Type** | **Description** | -| ------------ | -------- | ------------------------------------------------ | -| lib | folder | Compiled AINode binary executables and related code dependencies | -| sbin | folder | Scripts for starting, removing, and stopping AINode | -| conf | folder | Configuration files for AINode | -| LICENSE | file | License | -| NOTICE | file | Notice | -| README_ZH.md | file | Chinese version of the instructions, in Markdown format | -| `README.md` | file | Instructions | - - -### Pre-installation Check - -To ensure the AINode installation package you obtained is complete and valid, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: please contact the Timecho Team to obtain it. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/ainode): - ```Bash - cd /data/ainode - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-ainode-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-05.png) - -4. Compare the output with the official SHA512 checksum. Once you have confirmed that they match, you can proceed with the installation and deployment of AINode as per the procedures below.
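
The comparison in step 4 can also be scripted rather than done by eye. The following is a minimal sketch, assuming the `sha512sum` tool used in step 2 is available; the function name `verify_sha512` is a hypothetical choice for illustration, and the expected checksum is whatever value the Timecho Team provides.

```shell
#!/usr/bin/env bash
# verify_sha512 FILE EXPECTED_HEX   (hypothetical helper name)
# Exit status: 0 = checksums match, 1 = mismatch, 2 = file not found.
verify_sha512() {
  local file="$1" expected actual

  if [ ! -f "$file" ]; then
    echo "file not found: $file" >&2
    return 2
  fi

  # sha512sum prints lowercase hex, so normalize the expected value as well.
  expected="$(printf '%s' "$2" | tr 'A-F' 'a-f')"

  # Reading from stdin makes the output "<checksum>  -"; keep the first field.
  actual="$(sha512sum < "$file" | awk '{print $1}')"

  if [ "$actual" = "$expected" ]; then
    echo "OK: checksum matches"
  else
    echo "MISMATCH: expected $expected but got $actual" >&2
    return 1
  fi
}
```

A call would take the package file name as the first argument and the official checksum as the second. A non-zero exit status corresponds to the two cases described in the notes: a mismatch or a file that cannot be found.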
- -#### Notes: - -- If the verification results do not match, please contact the Timecho Team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - -### Environment preparation -- Suggested operating environment: Ubuntu, CentOS, MacOS - -- Runtime Environment - - Python 3.10 through 3.12 (inclusive) is sufficient, and it must come with the pip and venv tools. For non-networked environments, download the zip package for the corresponding operating system from [here](https://cloud.tsinghua.edu.cn/d/4c1342f6c272439aa96c/?p=%2Flibs&mode=list) (Note that when downloading dependencies, you need to select the zip file in the libs folder, as shown in the following figure). Copy all files in the folder to the `lib` folder in the `apache-iotdb--ainode-bin` folder, and follow the steps below to start AINode. - - - - - There must be a Python interpreter on the PATH that can be called directly through the `python` command. - - It is recommended to create a Python venv virtual environment in the `apache-iotdb--ainode-bin` folder. To create a virtual environment with Python 3.10.0, the statement is as follows: - ```shell - # Use the Python 3.10.0 interpreter to create a virtual environment in a folder named venv. - ../Python-3.10.0/python -m venv venv - ``` - -## Installation steps - -### Install AINode - -1. AINode activation - - Requires IoTDB to be running normally and the license to include AINode module authorization.
- - The authorization method for activating the AINode module is as follows: - - Method 1: Activate file copy activation - - After restarting the confignode node, enter the activation folder, copy the system_info file to the Timecho staff, and inform them to apply for independent authorization for AINode; - - Received the license file returned by the staff; - - Put the license file into the activation folder of the corresponding node; - -- Method 2: Activate Script Activation - - Obtain the required machine code for activation, enter the `sbin` directory of the installation directory, and execute the activation script: - ```shell - cd sbin - ./start-activate.sh - ``` - - The following information is displayed. Please copy the machine code (i.e. this string of characters) to the Timecho staff and inform them to apply for independent authorization of AINode: - ```shell - Please copy the system_info's content and send it to Timecho: - Y17hFA0xRCE1TmkVxILuCIEPc7uJcr5bzlXWiptw8uZTmTX5aThfypQdLUIhMljw075hNRSicyvyJR9JM7QaNm1gcFZPHVRWVXIiY5IlZkXdxCVc1erXMsbCqUYsR2R2Mw4PSpFJsUF5jHWSoFIIjQ2bmJFW5P52KCccFMVeHTc= - Please enter license: - ``` - - Enter the activation code returned by the staff into the `Please enter license:` command prompt in the previous step, as shown below: - ```shell - Please enter license: - Jw+MmF+AtexsfgNGOFgTm83BgXbq0zT1+fOfPvQsLlj6ZsooHFU6HycUSEGC78eT1g67KPvkcLCUIsz2QpbyVmPLr9x1+kVjBubZPYlVpsGYLqLFc8kgpb5vIrPLd3hGLbJ5Ks8fV1WOVrDDVQq89YF2atQa2EaB9EAeTWd0bRMZ+s9ffjc/1Zmh9NSP/T3VCfJcJQyi7YpXWy5nMtcW0gSV+S6fS5r7a96PjbtE0zXNjnEhqgRzdU+mfO8gVuUNaIy9l375cp1GLpeCh6m6pF+APW1CiXLTSijK9Qh3nsL5bAOXNeob5l+HO5fEMgzrW8OJPh26Vl6ljKUpCvpTiw== - License has been stored to sbin/../activation/license - Import completed. 
Please start cluster and excute 'show cluster' to verify activation status - ``` -- After updating the license, restart the DataNode node and enter the sbin directory of IoTDB to start the datanode: - ```shell - cd sbin - ./start-datanode.sh -d #The parameter'd 'will be started in the background - ``` - -2. Check the kernel architecture of Linux - ```shell - uname -m - ``` - -3. Import Python environment [Download](https://repo.anaconda.com/miniconda/) - - Recommend downloading the py311 version application and importing it into the iotdb dedicated folder in the user's root directory - - 4. Verify Python version - -```shell - python --version - ``` -5. Create a virtual environment (execute in the ainode directory) - - ```shell - python -m venv venv - ``` - -6. Activate the virtual environment - - ```shell - source venv/bin/activate - ``` - - 7. Download and import AINode to a dedicated folder, switch to the dedicated folder and extract the installation package - - ```shell - unzip iotdb-enterprise-ainode-1.3.3.2.zip - ``` - - 8. Configuration item modification - - ```shell - vi iotdb-enterprise-ainode-1.3.3.2/conf/iotdb-ainode.properties - ``` - Configuration item modification:[detailed information](#configuration-item-modification) - - > ain_seed_config_node=iotdb-1:10710 (Cluster communication node IP: communication node port)
- > ain_inference_rpc_address=iotdb-3 (IP address of the server running AINode) - - 9. Replace Python source - - ```shell - pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/ - ``` - - 10. Start the AINode node - - ```shell - nohup bash iotdb-enterprise-ainode-1.3.3.2/sbin/start-ainode.sh > myout.file 2>& 1 & - ``` - > Return to the default environment of the system: conda deactivate - - ### Configuration item modification - -AINode supports modifying some necessary parameters. You can find the following parameters in the `conf/iotdb-ainode.properties` file and make persistent modifications to them: -: - -| **Name** | **Describe** | **Type** | **Default value** | **Effective method after modification** | -| :----------------------------- | ------------------------------------------------------------ | ------- | ------------------ | ---------------------------- | -| cluster_name | The identifier for AINode to join the cluster | string | defaultCluster | Only allow modifications before the first service startup | -| ain_seed_config_node | The Configurable Node address registered during AINode startup | String | 127.0.0.1:10710 | Only allow modifications before the first service startup | -| ain_inference_rpc_address | AINode provides service and communication addresses , Internal Service Communication Interface | String | 127.0.0.1 | Only allow modifications before the first service startup | -| ain_inference_rpc_port | AINode provides ports for services and communication | String | 10810 | Only allow modifications before the first service startup | -| ain_system_dir | AINode metadata storage path, the starting directory of the relative path is related to the operating system, and it is recommended to use an absolute path | String | data/AINode/system | Only allow modifications before the first service startup | -| ain_models_dir | AINode stores the path of the model file, and the starting directory of the relative path is related to the 
operating system. It is recommended to use an absolute path | String | data/AINode/models | Only allow modifications before the first service startup | -| ain_logs_dir | The path where AINode stores logs, the starting directory of the relative path is related to the operating system, and it is recommended to use an absolute path | String | logs/AINode | Effective after restart | -| ain_thrift_compression_enabled | Does AINode enable Thrift's compression mechanism , 0-Do not start, 1-Start | Boolean | 0 | Effective after restart | - -### Start AINode - - After completing the deployment of Seed Config Node, the registration and inference functions of the model can be supported by adding AINode nodes. After specifying the information of the IoTDB cluster in the configuration file, the corresponding instruction can be executed to start AINode and join the IoTDB cluster。 - -#### Networking environment startup - -##### Start command - -```shell - # Start command - # Linux and MacOS systems - bash sbin/start-ainode.sh - - # Windows systems - sbin\start-ainode.bat - - # Backend startup command (recommended for long-term running) - # Linux and MacOS systems - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - - # Windows systems - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -#### Detailed Syntax - -```shell - # Start command - # Linux and MacOS systems - bash sbin/start-ainode.sh -i -r -n - - # Windows systems - sbin\start-ainode.bat -i -r -n - ``` - -##### Parameter introduction: - -| **Name** | **Label** | **Describe** | **Is it mandatory** | **Type** | **Default value** | **Input method** | -| ------------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ---------------- | ---------------------- | -| ain_interpreter_dir | -i | The interpreter path of the virtual environment where AINode is installed requires the use of an absolute path. 
| no | String | Default reading of environment variables | Input or persist modifications during invocation | -| ain_force_reinstall | -r | Does this script check the version when checking the installation status of AINode. If it does, it will force the installation of the whl package in lib if the version is incorrect. | no | Bool | false | Input when calling | -| ain_no_dependencies | -n | Specify whether to install dependencies when installing AINode, and if so, only install the AINode main program without installing dependencies. | no | Bool | false | Input when calling | - - If you don't want to specify the corresponding parameters every time you start, you can also persistently modify the parameters in the `ainode-env.sh` and `ainode-env.bat` scripts in the `conf` folder (currently supporting persistent modification of the ain_interpreter-dir parameter). - - `ainode-env.sh` : - ```shell - # The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - # ain_interpreter_dir= - ``` - `ainode-env.bat` : -```shell - @REM The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - @REM set ain_interpreter_dir= - ``` - After writing the parameter value, uncomment the corresponding line and save it to take effect on the next script execution. - - -#### Example - -##### Directly start: - -```shell - # Start command - # Linux and MacOS systems - bash sbin/start-ainode.sh - # Windows systems - sbin\start-ainode.bat - - - # Backend startup command (recommended for long-term running) - # Linux and MacOS systems - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - # Windows systems - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -##### Update Start: -If the version of AINode has been updated (such as updating the `lib` folder), this command can be used. 
First, make sure that AINode has stopped running, then restart it with the `-r` parameter, which reinstalls AINode based on the files under `lib`. - - -```shell - # Update startup command - # Linux and MacOS systems - bash sbin/start-ainode.sh -r - # Windows systems - sbin\start-ainode.bat -r - - - # Backend startup command (recommended for long-term running) - # Linux and MacOS systems - nohup bash sbin/start-ainode.sh -r > myout.file 2>& 1 & - # Windows systems - nohup bash sbin\start-ainode.bat -r > myout.file 2>& 1 & - ``` -#### Non-networked environment startup - -##### Start command - -```shell - # Start command - # Linux and MacOS systems - bash sbin/start-ainode.sh - - # Windows systems - sbin\start-ainode.bat - - # Backend startup command (recommended for long-term running) - # Linux and MacOS systems - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - - # Windows systems - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -#### Detailed Syntax - -```shell - # Start command - # Linux and MacOS systems - bash sbin/start-ainode.sh -i -r -n - - # Windows systems - sbin\start-ainode.bat -i -r -n - ``` - -##### Parameter introduction: - -| **Name** | **Label** | **Describe** | **Is it mandatory** | **Type** | **Default value** | **Input method** | -| ------------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ---------------- | ---------------------- | -| ain_interpreter_dir | -i | The interpreter path of the virtual environment where AINode is installed requires the use of an absolute path | no | String | Default reading of environment variables | Input or persist modifications during invocation | -| ain_force_reinstall | -r | Does this script check the version when checking the installation status of AINode.
If it does, it will force the installation of the whl package in lib if the version is incorrect | no | Bool | false | Input when calling | - -> Attention: When installation fails in a non networked environment, first check if the installation package corresponding to the platform is selected, and then confirm the Python version (due to the limitations of the downloaded installation package on Python versions, 3.7, 3.9, and others are not allowed) - -#### Example - -##### Directly start: - -```shell - # Start command - # Linux and MacOS systems - bash sbin/start-ainode.sh - # Windows systems - sbin\start-ainode.bat - - # Backend startup command (recommended for long-term running) - # Linux and MacOS systems - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - # Windows systems - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -### Detecting the status of AINode nodes - -During the startup process of AINode, the new AINode will be automatically added to the IoTDB cluster. After starting AINode, you can enter SQL in the command line to query. If you see an AINode node in the cluster and its running status is Running (as shown below), it indicates successful joining. - - -```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|Running| 127.0.0.1| 10810|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` - -### Stop AINode - -If you need to stop a running AINode node, execute the corresponding shutdown script. 
- -#### Stop command - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh - - #Windows - sbin\stop-ainode.bat - ``` - - -#### Detailed Syntax - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh -t - - #Windows - sbin\stop-ainode.bat -t - ``` - -##### Parameter introduction: - -| **Name** | **Label** | **Describe** | **Is it mandatory** | **Type** | **Default value** | **Input method** | -| ----------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ------ | ---------- | -| ain_remove_target | -t | When closing AINode, you can specify the Node ID, address, and port number of the target AINode to be removed, in the format of `` | no | String | nothing | Input when calling | - -#### Example - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh - - # Windows - sbin\stop-ainode.bat - ``` -After stopping AINode, you can still see AINode nodes in the cluster, whose running status is UNKNOWN (as shown below), and the AINode function cannot be used at this time. - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|UNKNOWN| 127.0.0.1| 10790|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -If you need to restart the node, you need to execute the startup script again. - -### Remove AINode - -When it is necessary to remove an AINode node from the cluster, a removal script can be executed. The difference between removing and stopping scripts is that stopping retains the AINode node in the cluster but stops the AINode service, while removing removes the AINode node from the cluster. 
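
The difference between a stopped node (still listed, but no longer `Running`) and a removed node (no longer listed) can be read from the `show cluster` output. The following is a minimal sketch, assuming the table layout shown in this document and a single AINode in the cluster; `ainode_state` is a hypothetical helper name, and it reads the table text from standard input.

```shell
#!/usr/bin/env bash
# ainode_state (hypothetical helper): read `show cluster` output on stdin
# and print one of: running, stopped, absent.
# Assumes the column order NodeID | NodeType | Status | ... shown in this
# document and a single AINode row (with several, the last row wins).
ainode_state() {
  awk -F'|' '
    {
      # Splitting on "|" leaves an empty first field, so NodeType is $3
      # and Status is $4; strip the padding around both.
      type = $3; status = $4
      gsub(/[ \t]/, "", type)
      gsub(/[ \t]/, "", status)
    }
    type == "AINode" {
      found = 1
      state = (status == "Running") ? "running" : "stopped"
    }
    END { print (found ? state : "absent") }
  '
}

# Demo with two rows in the shape of the stopped-node table above.
printf '%s\n' \
  '|     1|  DataNode|Running|      127.0.0.1|       10730|UNKNOWN|190e303-dev|' \
  '|     2|    AINode|UNKNOWN|      127.0.0.1|       10790|UNKNOWN|190e303-dev|' \
  | ainode_state
```

In practice the input would come from a saved copy of the `show cluster` result. How that output is captured (for example through a CLI option that executes a single statement) depends on the IoTDB client in use and is not assumed here.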
- -#### Remove command - - -```shell - # Linux / MacOS - bash sbin/remove-ainode.sh - - # Windows - sbin\remove-ainode.bat - ``` - -#### Detailed Syntax - -```shell - # Linux / MacOS - bash sbin/remove-ainode.sh -i -t/: -r -n - - # Windows - sbin\remove-ainode.bat -i -t/: -r -n - ``` - -##### Parameter introduction: - - | **Name** | **Label** | **Describe** | **Is it mandatory** | **Type** | **Default value** | **Input method** | -| ------------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ---------------- | --------------------- | -| ain_interpreter_dir | -i | The interpreter path of the virtual environment where AINode is installed requires the use of an absolute path | no | String | Default reading of environment variables | Input+persistent modification during invocation | -| ain_remove_target | -t | When closing AINode, you can specify the Node ID, address, and port number of the target AINode to be removed, in the format of `` | no | String | nothing | Input when calling | -| ain_force_reinstall | -r | Does this script check the version when checking the installation status of AINode. If it does, it will force the installation of the whl package in lib if the version is incorrect | no | Bool | false | Input when calling | -| ain_no_dependencies | -n | Specify whether to install dependencies when installing AINode, and if so, only install the AINode main program without installing dependencies | no | Bool | false | Input when calling | - - If you don't want to specify the corresponding parameters every time you start, you can also persistently modify the parameters in the `ainode-env.sh` and `ainode-env.bat` scripts in the `conf` folder (currently supporting persistent modification of the ain_interpreter-dir parameter). - - `ainode-env.sh` : - ```shell - # The defaulte venv environment is used if ain_interpreter_dir is not set. 
Please use absolute path without quotation mark - # ain_interpreter_dir= - ``` - `ainode-env.bat` : -```shell - @REM The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - @REM set ain_interpreter_dir= - ``` - After writing the parameter value, uncomment the corresponding line and save it to take effect on the next script execution. - -#### Example - -##### Directly remove: - - ```shell - # Linux / MacOS - bash sbin/remove-ainode.sh - - # Windows - sbin\remove-ainode.bat - ``` - After removing the node, relevant information about the node cannot be queried. - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -##### Specify removal: - -If the user loses files in the data folder, AINode may not be able to actively remove them locally. The user needs to specify the node number, address, and port number for removal. In this case, we support users to input parameters according to the following methods for deletion. - - ```shell - # Linux / MacOS - bash sbin/remove-ainode.sh -t /: - - # Windows - sbin\remove-ainode.bat -t /: - ``` - -## common problem - -### An error occurs when starting AINode stating that the venv module cannot be found - - When starting AINode using the default method, a Python virtual environment will be created in the installation package directory and dependencies will be installed, so it is required to install the venv module. 
Generally speaking, Python 3.10 and above come with the venv module built in, but some systems ship a Python environment that does not include it. There are two solutions when this error occurs (choose one or the other): - - Install the venv module locally. Taking Ubuntu as an example, you can run the following command to install the venv module for the system Python, or install a Python version with built-in venv from the official Python website. - - ```shell -apt-get install python3.10-venv -``` -Create the virtual environment in the AINode path with Python 3.10.0. - - ```shell -../Python-3.10.0/python -m venv venv(Folder Name) -``` - When running the startup script, use ` -i ` to specify an existing Python interpreter path as the running environment for AINode, eliminating the need to create a new virtual environment. - - ### The SSL module in Python is not properly installed and configured to handle HTTPS resources -WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available. -You can install OpenSSL and then rebuild Python to solve this problem -> Currently Python versions 3.6 to 3.9 are compatible with OpenSSL 1.0.2, 1.1.0, and 1.1.1. - - Python requires OpenSSL to be installed on the system; the specific installation method can be found in [link](https://stackoverflow.com/questions/56552390/how-to-fix-ssl-module-in-python-is-not-available-in-centos) - - ```shell -sudo apt-get install build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev uuid-dev lzma-dev liblzma-dev -sudo -E ./configure --with-ssl -make -sudo make install -``` - - ### Pip version is too low - - A compilation issue similar to "error: Microsoft Visual C++14.0 or greater is required..."
appears on Windows - -This error occurs during installation and compilation, usually because the C++ build tools or the setuptools version is too old. You can upgrade pip and setuptools with: - - ```shell -./python -m pip install --upgrade pip -./python -m pip install --upgrade setuptools -``` - - - ### Install and compile Python - - Use the following commands to download the installation package from the official website and extract it: - ```shell -wget https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tar.xz -tar Jxf Python-3.10.0.tar.xz -``` - Compile and install Python: - ```shell -cd Python-3.10.0 -./configure --prefix=/usr/local/python3 -make -sudo make install -python3 --version -``` \ No newline at end of file diff --git a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Cluster-Deployment_timecho.md b/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Cluster-Deployment_timecho.md deleted file mode 100644 index 21a284770..000000000 --- a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Cluster-Deployment_timecho.md +++ /dev/null @@ -1,566 +0,0 @@ - -# Cluster Deployment - -This section describes how to manually deploy an instance that includes 3 ConfigNodes and 3 DataNodes, commonly known as a 3C3D cluster. - -
- -
- -## Note - -1. Before installation, ensure that the system environment is fully configured by referring to [System configuration](./Environment-Requirements.md). - -2. It is recommended to use the `hostname` rather than the IP address in the configuration during deployment. This avoids the database failing to start if the host IP is changed later. To set the host name, configure /etc/hosts on the target server. For example, if the local IP is 192.168.1.3 and the host name is iotdb-1, you can use the following command to set the server's host name, and then configure the `cn_internal_address` and `dn_internal_address` of IoTDB using the host name. - ``` shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -3. Some parameters cannot be modified after the first startup. Please refer to the "Parameter Configuration" section below for settings. - -4. On both Linux and Windows, ensure that the IoTDB installation path does not contain spaces or Chinese characters, to avoid software exceptions. - -5. When installing and deploying IoTDB (including activating and using the software), use the same user for all operations. You can: -- Use the root user (recommended): Using the root user avoids issues such as insufficient permissions. -- Use a fixed non-root user: - - Use the same user for all operations: Ensure that the same user is used for start, activation, stop, and other operations, and do not switch users. - - Avoid using sudo: Try to avoid sudo, because it executes commands with root privileges, which may cause confusion or security issues. - -6. It is recommended to deploy a monitoring panel, which can monitor important operational indicators and keep track of database operation status at any time. The monitoring panel can be obtained by contacting the business department. For the deployment steps, refer to: [Monitoring Panel Deployment](./Monitoring-panel-deployment.md) - -7.
Before installation, the health check tool can help inspect the operating environment of IoTDB nodes and obtain detailed inspection results. For usage of the IoTDB health check tool, see: [Health Check Tool](../Tools-System/Health-Check-Tool.md). - - -## Preparation Steps - -1. Prepare the IoTDB database installation package: `iotdb-enterprise-{version}-bin.zip` (the installation package can be obtained from: [IoTDB-Package](../Deployment-and-Maintenance/IoTDB-Package_timecho.md)) -2. Configure the operating system environment according to the environment requirements (the system environment configuration can be found in: [Environment Requirement](https://www.timecho.com/docs/UserGuide/latest/Deployment-and-Maintenance/Environment-Requirements.html)) - -### Pre-installation Check - -To ensure the IoTDB Enterprise Edition installation package you obtained is complete and authentic, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Find the "SHA512 Checksum" corresponding to each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/iotdb): - ```Bash - cd /data/iotdb - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-02.png) - -4. Compare the output with the official SHA512 checksum. Once you have confirmed that they match, you can proceed with the installation and deployment in accordance with the procedures below.
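The manual comparison in the verification steps can also be scripted. The following is a minimal sketch, not part of the IoTDB tooling: the function name `verify_sha512` is a hypothetical helper introduced here for illustration, and it assumes a Linux host where `sha512sum` and `awk` are available. It computes the digest of the package and compares it, ignoring letter case, with the checksum you copied from the Release History document.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with IoTDB): compare a file's SHA512
# digest with an expected checksum copied from the release notes.
# Returns 0 on match, 1 on mismatch, 2 if the file does not exist.
verify_sha512() {
  local file="$1" expected="$2" actual

  if [ ! -f "$file" ]; then
    echo "file not found: $file" >&2
    return 2
  fi

  # sha512sum prints "<digest>  <file name>"; keep only the digest.
  actual="$(sha512sum "$file" | awk '{print $1}')"

  # Normalize both values to lower case before comparing.
  actual="$(printf '%s' "$actual" | tr '[:upper:]' '[:lower:]')"
  expected="$(printf '%s' "$expected" | tr '[:upper:]' '[:lower:]')"

  if [ "$actual" = "$expected" ]; then
    echo "OK: checksum matches"
    return 0
  fi

  echo "MISMATCH: expected $expected, got $actual" >&2
  return 1
}
```

To use it, source the function and call it with the package file name as the first argument and the official checksum as the second. A non-zero exit status means the package should not be installed until the cause is resolved.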
- -#### Notes: - -- If the verification results do not match, please contact Timecho Team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - -## Installation Steps - -Assuming there are three Linux servers now, the IP addresses and service roles are assigned as follows: - -| Node IP | Host Name | Service | -| ----------- | --------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode、DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode、DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode、DataNode | - -### Set Host Name - -On three machines, configure the host names separately. To set the host names, configure `/etc/hosts` on the target server. Use the following command: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### Configuration - -Unzip the installation package and enter the installation directory - -```Plain -unzip iotdb-enterprise-{version}-bin.zip -cd iotdb-enterprise-{version}-bin -``` - -#### Environment script configuration - -- `./conf/confignode-env.sh` configuration - - | **Configuration** | **Description** | **Default** | **Recommended value** | **Note** | - | :---------------- | :----------------------------------------------------------- | :---------- | :----------------------------------------------------------- | :---------------------------------- | - | MEMORY_SIZE | The total amount of memory that IoTDB ConfigNode nodes can use | Automatically calculated based on system memory, defaulting to 30% of the system memory. | Can be filled in as needed, and the system will allocate memory based on the filled in values | Save changes without immediate execution; modifications take effect after service restart. 
| - -- `./conf/datanode-env.sh` configuration - - | **Configuration** | **Description** | **Default** | **Recommended value** | **Note** | - | :---------------- | :----------------------------------------------------------- |:-----------------------------------------------------------------------------------------| :----------------------------------------------------------- | :---------------------------------- | - | MEMORY_SIZE | The total amount of memory that IoTDB DataNode nodes can use | Automatically calculated based on system memory, defaulting to 50% of the system memory. | Can be filled in as needed, and the system will allocate memory based on the filled in values | Save changes without immediate execution; modifications take effect after service restart. | - -#### General Configuration - -Open the general configuration file `./conf/iotdb-system.properties`,The following parameters can be set according to the deployment method: - -| **Configuration** | **Description** | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | -| ------------------------- | ------------------------------------------------------------ | -------------- | -------------- | -------------- | -| cluster_name | Cluster Name | defaultCluster | defaultCluster | defaultCluster | -| schema_replication_factor | The number of metadata replicas, the number of DataNodes should not be less than this number | 3 | 3 | 3 | -| data_replication_factor | The number of data replicas should not be less than this number of DataNodes | 2 | 2 | 2 | - -#### ConfigNode Configuration - -Open the ConfigNode configuration file `./conf/iotdb-system.properties`,Set the following parameters - -| **Configuration** | **Description** | **Default** | **Recommended value** | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | Note | -| ------------------- | ------------------------------------------------------------ | --------------- | ------------------------------------------------------------ | ------------- | ------------- | 
------------- | ---------------------------------------- | -| cn_internal_address | The address used by ConfigNode for communication within the cluster | 127.0.0.1 | The IPV4 address or host name of the server where it is located, and it is recommended to use host name | iotdb-1 | iotdb-2 | iotdb-3 | Cannot be modified after initial startup | -| cn_internal_port | The port used by ConfigNode for communication within the cluster | 10710 | 10710 | 10710 | 10710 | 10710 | Cannot be modified after initial startup | -| cn_consensus_port | The port used for ConfigNode replica group consensus protocol communication | 10720 | 10720 | 10720 | 10720 | 10720 | Cannot be modified after initial startup | -| cn_seed_config_node | The address of the ConfigNode that the node connects to when registering to join the cluster, `cn_internal_address:cn_internal_port` | 127.0.0.1:10710 | The first CongfigNode's `cn_internal-address: cn_internal_port` | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | Cannot be modified after initial startup | - -#### DataNode Configuration - -Open DataNode Configuration File `./conf/iotdb-system.properties`,Set the following parameters: - -| **Configuration** | **Description** | **Default** | **Recommended value** | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | Note | -| ------------------------------- | ------------------------------------------------------------ |-----------------|-----------------------------------------------------------------------------------------------------------------| ------------- | ------------- | ------------- | ---------------------------------------- | -| dn_rpc_address | The address of the client RPC service | 0.0.0.0 | The IPV4 address or host name of the server where it is located, and it is recommended to use the IPV4 address | iotdb-1 |iotdb-2 | iotdb-3 | Restarting the service takes effect | -| dn_rpc_port | The port of the client RPC service | 6667 | 6667 | 6667 | 6667 | 6667 | Restarting the service takes effect | -| 
dn_internal_address | The address used by DataNode for communication within the cluster | 127.0.0.1 | The IPV4 address or host name of the server where it is located, and it is recommended to use host name | iotdb-1 | iotdb-2 | iotdb-3 | Cannot be modified after initial startup | -| dn_internal_port | The port used by DataNode for communication within the cluster | 10730 | 10730 | 10730 | 10730 | 10730 | Cannot be modified after initial startup | -| dn_mpp_data_exchange_port | The port used by DataNode to receive data streams | 10740 | 10740 | 10740 | 10740 | 10740 | Cannot be modified after initial startup | -| dn_data_region_consensus_port | The port used by DataNode for data replica consensus protocol communication | 10750 | 10750 | 10750 | 10750 | 10750 | Cannot be modified after initial startup | -| dn_schema_region_consensus_port | The port used by DataNode for metadata replica consensus protocol communication | 10760 | 10760 | 10760 | 10760 | 10760 | Cannot be modified after initial startup | -| dn_seed_config_node | The address of the ConfigNode that the node connects to when registering to join the cluster, i.e. `cn_internal-address: cn_internal_port` | 127.0.0.1:10710 | The first CongfigNode's cn_internal-address: cn_internal_port | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | Cannot be modified after initial startup | - -> ❗️Attention: Editors such as VSCode Remote do not have automatic configuration saving function. Please ensure that the modified files are saved persistently, otherwise the configuration items will not take effect - -### Start and Activate Database (Available since V1.3.4) - -#### Start ConfigNode - -Start the first confignode of IoTDB-1 first, ensuring that the seed confignode node starts first, and then start the second and third confignode nodes in sequence - -```Bash -./start-confignode.sh -d #"- d" parameter will start in the background -``` - -If the startup fails, please refer to [Common Questions](#common-questions). 
- -#### Start DataNode - -Enter the `sbin` directory of iotdb and start three datanode nodes in sequence: - -```Bash -./start-datanode.sh -d #"- d" parameter will start in the background -``` - -#### Activate Database - -##### Activation via CLI - -- Enter the CLI of any node in the cluster - - ```SQL - ./sbin/start-cli.sh -``` - -- Obtain the machine codes of all nodes: - - - Execute the following command to get the machine codes required for activation: - - ```Bash - show system info - ``` - - - The machine codes of all nodes in the cluster will be displayed: - - ```Bash - +--------------------------------------------------------------+ - | SystemInfo| - +--------------------------------------------------------------+ - |01-TE5NLES4-UDDWCMYE,01-GG5NLES4-XXDWCMYE,01-FF5NLES4-WWWWCMYE| - +--------------------------------------------------------------+ - Total line number = 1 - It costs 0.030s - ``` - -- Copy the obtained machine codes and provide them to the Timecho team - -- The Timecho team will return an activation code, which normally corresponds to the order of the provided machine codes. 
Paste the complete activation code into the CLI for activation - - - Note: The activation code must be enclosed in ' symbols, as shown below: - - ```Bash - IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' - ``` - -### Start and Activate Database (Available before V1.3.4) - -#### Start ConfigNode - -Start the first confignode of IoTDB-1 first, ensuring that the seed confignode node starts first, and then start the second and third confignode nodes in sequence - -```Bash -./start-confignode.sh -d #"- d" parameter will start in the background -``` - -If the startup fails, please refer to [Common Questions](#common-questions). 
- -#### Activate Database - -##### Method 1: Activate file copy activation - -- After starting three confignode nodes in sequence, copy the `activation` folder of each machine and the `system_info` file of each machine to the Timecho staff; -- The staff will return the license files for each ConfigNode node, where 3 license files will be returned; -- Put the three license files into the `activation` folder of the corresponding ConfigNode node; - -##### Method 2: Activate Script Activation - -- Obtain the machine codes of three machines in sequence, enter the `sbin` directory of the installation directory, and execute the activation script `start activate.sh`: - - ```Bash - cd sbin - ./start-activate.sh - ``` - -- The following information is displayed, where the machine code of one machine is displayed: - - ```Bash - Please copy the system_info's content and send it to Timecho: - 01-KU5LDFFN-PNBEHDRH - Please enter license: - ``` - -- The other two nodes execute the activation script `start activate.sh` in sequence, and then copy the machine codes of the three machines obtained to the Timecho staff -- The staff will return 3 activation codes, which normally correspond to the order of the provided 3 machine codes. Please paste each activation code into the previous command line prompt `Please enter license:`, as shown below: - - ```Bash - Please enter license: - Jw+MmF+Atxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx5bAOXNeob5l+HO5fEMgzrW8OJPh26Vl6ljKUpCvpTiw== - License has been stored to sbin/../activation/license - Import completed. 
Please start cluster and excute 'show cluster' to verify activation status - ``` - -#### Start DataNode - - Enter the `sbin` directory of IoTDB and start the three DataNodes in sequence: - -```Bash -./start-datanode.sh -d # the "-d" parameter starts the node in the background -``` - -### Verify Deployment - -You can run the CLI startup script directly from the `./sbin` directory: - -```Plain -./start-cli.sh -h ip(local IP or domain name) -p port(6667) -``` - - After a successful startup, the following interface appears, indicating that IoTDB has been installed successfully. - -![](/img/%E4%BC%81%E4%B8%9A%E7%89%88%E6%88%90%E5%8A%9F.png) - -After the installation success interface appears, use the `show cluster` command to check whether the activation succeeded. - -When you see `Activated` on the far right, the activation was successful. - -![](/img/%E4%BC%81%E4%B8%9A%E7%89%88%E6%BF%80%E6%B4%BB.png) - -In the CLI, you can also check the activation status by running the `show activation` command; the example below shows a status of ACTIVATED, indicating successful activation. - -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -> The appearance of `ACTIVATED (W)` indicates passive activation, which means that this ConfigNode does not have a license file (or has not been issued the latest license file with a timestamp), and its activation depends on other activated ConfigNodes in the cluster. At this point, it is recommended to check if the license file has been placed in the license folder. If not, please place the license file.
If a license file already exists, it may be due to inconsistency between the license file of this node and the information of other nodes. Please contact Timecho staff to reapply. - - -### One-click Cluster Start and Stop - -#### Overview - -Within the root directory of IoTDB, the `sbin `subdirectory houses the `start-all.sh` and `stop-all.sh` scripts, which work in concert with the `iotdb-cluster.properties` configuration file located in the `conf` subdirectory. This synergy enables the one-click initiation or termination of all nodes within the cluster from a single node. This approach facilitates efficient management of the IoTDB cluster's lifecycle, streamlining the deployment and operational maintenance processes. - -This following section will introduce the specific configuration items in the `iotdb-cluster.properties` file. - -#### Configuration Items - -> Note: -> -> * When the cluster changes, this configuration file needs to be manually updated. -> * If the `iotdb-cluster.properties` configuration file is not set up and the `start-all.sh` or `stop-all.sh` scripts are executed, the scripts will, by default, start or stop the ConfigNode and DataNode nodes located in the IOTDB\_HOME directory where the scripts reside. -> * It is recommended to configure SSH passwordless login: If not configured, the script will prompt for the server password after execution to facilitate subsequent start, stop, or destroy operations. If already configured, there is no need to enter the server password during script execution. - -* confignode\_address\_list - -| **Name** | **confignode\_address\_list** | -| :----------------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Description | A list of IP addresses of the hosts where the ConfigNodes to be started/stopped are located. If there are multiple, they should be separated by commas. 
| -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* datanode\_address\_list - -| **Name** | **datanode\_address\_list** | -| :----------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Description | A list of IP addresses of the hosts where the DataNodes to be started/stopped are located. If there are multiple, they should be separated by commas. | -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* ssh\_account - -| **Name** | **ssh\_account** | -| :----------------: | :------------------------------------------------------------------------------------------------- | -| Description | The username used to log in to the target hosts via SSH. All hosts must have the same username. | -| Type | String | -| Default | root | -| Effective | After restarting the system | - -* ssh\_port - -| **Name** | **ssh\_port** | -| :----------------: | :---------------------------------------------------------------------------------- | -| Description | The SSH port exposed by the target hosts. All hosts must have the same SSH port. | -| Type | int | -| Default | 22 | -| Effective | After restarting the system | - -* confignode\_deploy\_path - -| **Name** | **confignode\_deploy\_path** | -| :----------------: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Description | The path on the target hosts where all ConfigNodes to be started/stopped are located. All ConfigNodes must be in the same directory on their respective hosts. 
| -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* datanode\_deploy\_path - -| **Name** | **datanode\_deploy\_path** | -| :----------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| Description | The path on the target hosts where all DataNodes to be started/stopped are located. All DataNodes must be in the same directory on their respective hosts. | -| Type | String | -| Default | None | -| Effective | After restarting the system | - - - - - -## Node Maintenance Steps - -### ConfigNode Node Maintenance - -ConfigNode node maintenance is divided into two types of operations: adding and removing ConfigNodes, with two common use cases: -- Cluster expansion: For example, when there is only one ConfigNode in the cluster, and you want to increase the high availability of ConfigNode nodes, you can add two ConfigNodes, making a total of three ConfigNodes in the cluster. -- Cluster failure recovery: When the machine where a ConfigNode is located fails, making the ConfigNode unable to run normally, you can remove this ConfigNode and then add a new ConfigNode to the cluster. - -> ❗️Note, after completing ConfigNode node maintenance, you need to ensure that there are 1 or 3 ConfigNodes running normally in the cluster. Two ConfigNodes do not have high availability, and more than three ConfigNodes will lead to performance loss. 
- -#### Adding ConfigNode Nodes - -Script command: -```shell -# Linux / MacOS -# First switch to the IoTDB root directory -sbin/start-confignode.sh - -# Windows -# First switch to the IoTDB root directory -sbin/start-confignode.bat -``` - -Parameter introduction: - -| Parameter | Description | Is it required | -| :--- | :--------------------------------------------- | :----------- | -| -v | Show version information | No | -| -f | Run the script in the foreground, do not put it in the background | No | -| -d | Start in daemon mode, i.e. run in the background | No | -| -p | Specify a file to store the process ID for process management | No | -| -c | Specify the path to the configuration file folder, the script will load the configuration file from here | No | -| -g | Print detailed garbage collection (GC) information | No | -| -H | Specify the path of the Java heap dump file, used when JVM memory overflows | No | -| -E | Specify the path of the JVM error log file | No | -| -D | Define system properties, in the format key=value | No | -| -X | Pass -XX parameters directly to the JVM | No | -| -h | Help instruction | No | - -#### Removing ConfigNode Nodes - -First connect to the cluster through the CLI and confirm the internal address and port number of the ConfigNode you want to remove by using `show confignodes`: - -```Bash -IoTDB> show confignodes -+------+-------+---------------+------------+--------+ -|NodeID| Status|InternalAddress|InternalPort| Role| -+------+-------+---------------+------------+--------+ -| 0|Running| 127.0.0.1| 10710| Leader| -| 1|Running| 127.0.0.1| 10711|Follower| -| 2|Running| 127.0.0.1| 10712|Follower| -+------+-------+---------------+------------+--------+ -Total line number = 3 -It costs 0.030s -``` - -Then use the script to remove the ConfigNode. 
Script command: - -```Bash -# Linux / MacOS -sbin/remove-confignode.sh [confignode_id] - -#Windows -sbin/remove-confignode.bat [confignode_id] - -``` - -### DataNode Node Maintenance - -There are two common scenarios for DataNode node maintenance: - -- Cluster expansion: For the purpose of expanding cluster capabilities, add new DataNodes to the cluster -- Cluster failure recovery: When a machine where a DataNode is located fails, making the DataNode unable to run normally, you can remove this DataNode and add a new DataNode to the cluster - -> ❗️Note, in order for the cluster to work normally, during the process of DataNode node maintenance and after the maintenance is completed, the total number of DataNodes running normally should not be less than the number of data replicas (usually 2), nor less than the number of metadata replicas (usually 3). - -#### Adding DataNode Nodes - -Script command: - -```Bash -# Linux / MacOS -# First switch to the IoTDB root directory -sbin/start-datanode.sh - -# Windows -# First switch to the IoTDB root directory -sbin/start-datanode.bat -``` - -Parameter introduction: - -| Abbreviation | Description | Is it required | -| :--- | :--------------------------------------------- | :----------- | -| -v | Show version information | No | -| -f | Run the script in the foreground, do not put it in the background | No | -| -d | Start in daemon mode, i.e. 
run in the background | No | -| -p | Specify a file to store the process ID for process management | No | -| -c | Specify the path to the configuration file folder, the script will load the configuration file from here | No | -| -g | Print detailed garbage collection (GC) information | No | -| -H | Specify the path of the Java heap dump file, used when JVM memory overflows | No | -| -E | Specify the path of the JVM error log file | No | -| -D | Define system properties, in the format key=value | No | -| -X | Pass -XX parameters directly to the JVM | No | -| -h | Help instruction | No | - -Note: After adding a DataNode, as new writes arrive (and old data expires, if TTL is set), the cluster load will gradually balance towards the new DataNode, eventually achieving a balance of storage and computation resources on all nodes. - -#### Removing DataNode Nodes - -First connect to the cluster through the CLI and confirm the RPC address and port number of the DataNode you want to remove with `show datanodes`: - -```Bash -IoTDB> show datanodes -+------+-------+----------+-------+-------------+---------------+ -|NodeID| Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum| -+------+-------+----------+-------+-------------+---------------+ -| 1|Running| 0.0.0.0| 6667| 0| 0| -| 2|Running| 0.0.0.0| 6668| 1| 1| -| 3|Running| 0.0.0.0| 6669| 1| 0| -+------+-------+----------+-------+-------------+---------------+ -Total line number = 3 -It costs 0.110s -``` - -Then use the script to remove the DataNode. Script command: - -```Bash -# Linux / MacOS -sbin/remove-datanode.sh [datanode_id] - -#Windows -sbin/remove-datanode.bat [datanode_id] -``` - -## Common Questions -1. Multiple prompts indicating activation failure during deployment process - - Use the `ls -al` command: Use the `ls -al` command to check if the owner information of the installation package root directory is the current user. 
- - Check activation directory: Check all files in the `./activation` directory and whether the owner information is the current user. - -2. Confignode failed to start - - Step 1: Please check the startup log to see if any parameters that cannot be changed after the first startup have been modified. - - Step 2: Please check the startup log for any other abnormalities. If there are any abnormal phenomena in the log, please contact Timecho Technical Support personnel for consultation on solutions. - - Step 3: If it is the first deployment or data can be deleted, you can also clean up the environment according to the following steps, redeploy, and restart. - - Step 4: Clean up the environment: - - a. Terminate all ConfigNode Node and DataNode processes. - ```Bash - # 1. Stop the ConfigNode and DataNode services - sbin/stop-standalone.sh - - # 2. Check for any remaining processes - jps - # Or - ps -ef|grep iotdb - - # 3. If there are any remaining processes, manually kill the - kill -9 - # If you are sure there is only one iotdb on the machine, you can use the following command to clean up residual processes - ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9 - ``` - b. Delete the data and logs directories. - - Explanation: Deleting the data directory is necessary, deleting the logs directory is for clean logs and is not mandatory. 
- - ```Bash - cd /data/iotdb - rm -rf data logs - ``` \ No newline at end of file diff --git a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Docker-Deployment_timecho.md b/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Docker-Deployment_timecho.md deleted file mode 100644 index 0c22cc530..000000000 --- a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Docker-Deployment_timecho.md +++ /dev/null @@ -1,496 +0,0 @@ - -# Docker Deployment - -## Environmental Preparation - -### Docker Installation - -```Bash -#Taking Ubuntu as an example, other operating systems can search for installation methods themselves -#step1: Install some necessary system tools -sudo apt-get update -sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common -#step2: Install GPG certificate -curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add - -#step3: Write software source information -sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" -#step4: Update and install Docker CE -sudo apt-get -y update -sudo apt-get -y install docker-ce -#step5: Set Docker to start automatically upon startup -sudo systemctl enable docker -#step6: Verify if Docker installation is successful -docker --version #Display version information, indicating successful installation -``` - -### Docker-compose Installation - -```Bash -#Installation command -curl -L "https://github.com/docker/compose/releases/download/v2.20.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose -chmod +x /usr/local/bin/docker-compose -ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose -#Verify if the installation was successful -docker-compose --version #Displaying version information indicates successful installation -``` - -### Install The Dmidecode Plugin - -By default, Linux servers should already be installed. If not, you can use the following command to install them. 
- -```Bash -sudo apt-get install dmidecode -``` - -After installing dmidecode, find its installation path with `whereis dmidecode`. Assuming the result is `/usr/sbin/dmidecode`, note this path, as it will be used later in the docker-compose yml file. - -### Get Container Image Of IoTDB - -You can contact business or technical support to obtain container images for IoTDB Enterprise Edition. - -## Stand-Alone Deployment - -This section demonstrates how to deploy a standalone (1C1D) instance with Docker. - -### Load Image File - -For example, the container image file name of IoTDB obtained here is: `iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz` - -Load image: - -```Bash -docker load -i iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz -``` - -View image: - -```Bash -docker images -``` - -![](/img/%E5%8D%95%E6%9C%BA-%E6%9F%A5%E7%9C%8B%E9%95%9C%E5%83%8F.png) - -### Create Docker Bridge Network - -```Bash -docker network create --driver=bridge --subnet=172.18.0.0/16 --gateway=172.18.0.1 iotdb -``` - -### Write The Yml File For docker-compose - -Here we take the example of placing the IoTDB installation directory and the yml file together in the `/docker-iotdb` folder: - -The file directory structure is: `/docker-iotdb/iotdb`, `/docker-iotdb/docker-compose-standalone.yml` - -```Bash -docker-iotdb: -├── iotdb #IoTDB installation directory -└── docker-compose-standalone.yml #YML file for standalone Docker Compose -``` - -The complete docker-compose-standalone.yml content is as follows: - -```Bash -version: "3" -services: - iotdb-service: - image: iotdb-enterprise:1.3.2.3-standalone #The image used - hostname: iotdb - container_name: iotdb - restart: always - ports: - - "6667:6667" - environment: - - cn_internal_address=iotdb - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb:10710 - - dn_rpc_address=iotdb - - dn_internal_address=iotdb - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - -
dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - dn_seed_config_node=iotdb:10710 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - networks: - iotdb: - ipv4_address: 172.18.0.6 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -networks: - iotdb: - external: true -``` - -### First Launch - -Use the following command to start: - -```Bash -cd /docker-iotdb -docker-compose -f docker-compose-standalone.yml up -``` - -Because the instance has not been activated yet, it is normal for the container to exit immediately on the first startup. The purpose of the first startup is to generate the machine code file needed for the subsequent activation process. - -![](/img/%E5%8D%95%E6%9C%BA-%E6%BF%80%E6%B4%BB.png) - -### Apply For Activation - -- After the first startup, a system_info file is generated in the physical machine directory `/docker-iotdb/iotdb/activation`. Send this file to Timecho staff. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- After receiving the license file returned by the staff, copy it to the `/docker-iotdb/iotdb/activation` folder.
- - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -### Restart IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -![](/img/%E5%90%AF%E5%8A%A8iotdb.png) - -### Validate Deployment - -- View the log; the following message indicates a successful startup - - ```Bash - docker logs -f iotdb #View log command (iotdb is the container_name set in the yml file) - 2024-07-19 12:02:32,608 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! - ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B21.png) - -- Enter the container to view the service running status and activation information - - View the launched container - - ```Bash - docker ps - ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B22.png) - - Enter the container, log in to the database through the CLI, and use the `show cluster` command to view the service status and activation status - - ```Bash - docker exec -it iotdb /bin/bash #Enter the container - ./start-cli.sh -h iotdb #Log in to the database - IoTDB> show cluster #View status - ``` - - You should see that all services are running and that the activation status is shown as activated.
- - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B23.png) - -### Map /conf Directory (optional) - -If you want to modify the configuration files directly on the physical machine in the future, you can map the `/conf` folder in the container in three steps: - -Step 1: Copy the `/conf` directory from the container to `/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -Step 2: Add mappings in docker-compose-standalone.yml - -```Bash - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -Step 3: Restart IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -## Cluster Deployment - -This section describes how to manually deploy an instance that includes 3 ConfigNodes and 3 DataNodes, commonly known as a 3C3D cluster. - -
- -
- -**Note: The cluster version currently only supports host and overlay networks, and does not support bridge networks.** - -Taking the host network as an example, we will demonstrate how to deploy a 3C3D cluster. - -### Set Host Name - -Assuming there are currently three Linux servers, the IP addresses and service role assignments are as follows: - -| Node IP | Host Name | Service | -| ----------- | --------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode, DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode, DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode, DataNode | - -Configure the host names on each of the three machines. To set the host names, add them to `/etc/hosts` on each server using the following commands: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### Load Image File - -For example, the container image file name obtained for IoTDB is: `iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz` - -Execute the load image command on each of the three servers: - -```Bash -docker load -i iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz -``` - -View image: - -```Bash -docker images -``` - -![](/img/%E9%95%9C%E5%83%8F%E5%8A%A0%E8%BD%BD.png) - -### Write The Yml File For Docker Compose - -Here we take the example of placing the IoTDB installation directory and the yml files together in the `/docker-iotdb` folder: - -The file directory structure is: `/docker-iotdb/iotdb`, `/docker-iotdb/confignode.yml`, `/docker-iotdb/datanode.yml` - -```Bash -docker-iotdb: -├── confignode.yml #Yml file of confignode -├── datanode.yml #Yml file of datanode -└── iotdb #IoTDB installation directory -``` - -On each server, two yml files need to be written, namely `confignode.yml` and `datanode.yml`.
The example of yml is as follows: - -**confignode.yml:** - -```Bash -#confignode.yml -version: "3" -services: - iotdb-confignode: - image: iotdb-enterprise:1.3.2.3-standalone #The image used - hostname: iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - container_name: iotdb-confignode - command: ["bash", "-c", "entrypoint.sh confignode"] - restart: always - environment: - - cn_internal_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb-1:10710 #The default first node is the seed node - - schema_replication_factor=3 #Number of metadata copies - - data_replication_factor=2 #Number of data replicas - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #Using the host network - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -**datanode.yml:** - -```Bash -#datanode.yml -version: "3" -services: - iotdb-datanode: - image: iotdb-enterprise:1.3.2.3-standalone #The image used - hostname: iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - container_name: iotdb-datanode - command: ["bash", "-c", "entrypoint.sh datanode"] - restart: always - ports: - - "6667:6667" - privileged: true - environment: - - dn_rpc_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - dn_internal_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - dn_seed_config_node=iotdb-1:10710 #The default first node is the seed node - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - schema_replication_factor=3 #Number of metadata copies - - data_replication_factor=2 #Number of data replicas - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #Using the host network - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -### Starting Confignode For The First Time - -First, start configNodes on each of the three servers to obtain the machine code. Pay attention to the startup order, start the first iotdb-1 first, then start iotdb-2 and iotdb-3. 
- -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d #Background startup -``` - -### Apply For Activation - -- After the three ConfigNodes are started for the first time, a system_info file is generated in the `/docker-iotdb/iotdb/activation` directory on each physical machine. Send the system_info files from all three servers to Timecho staff; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- Put the three license files into the `/docker-iotdb/iotdb/activation` folder of the corresponding ConfigNode; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -- After the license is placed in the corresponding activation folder, the ConfigNode is activated automatically; no restart is required - -### Start Datanode - -Start the DataNodes on each of the 3 servers - -```Bash -cd /docker-iotdb -docker-compose -f datanode.yml up -d #Background startup -``` - -![](/img/%E9%9B%86%E7%BE%A4%E7%89%88-dn%E5%90%AF%E5%8A%A8.png) - -### Validate Deployment - -- View the logs; the following message indicates that the DataNode has started successfully - - ```Bash - docker logs -f iotdb-datanode #View log command - 2024-07-20 16:50:48,937 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself!
- ``` - - ![](/img/dn%E5%90%AF%E5%8A%A8.png) - -- Enter any container to view the service running status and activation information - - View the launched container - - ```Bash - docker ps - ``` - - ![](/img/%E6%9F%A5%E7%9C%8B%E5%AE%B9%E5%99%A8.png) - - Enter the container, log in to the database through CLI, and use the `show cluster` command to view the service status and activation status - - ```Bash - docker exec -it iotdb-datanode /bin/bash #Entering the container - ./start-cli.sh -h iotdb-1 #Log in to the database - IoTDB> show cluster #View status - ``` - - You can see that all services are running and the activation status shows as activated. - - ![](/img/%E9%9B%86%E7%BE%A4-%E6%BF%80%E6%B4%BB.png) - -### Map/conf Directory (optional) - -If you want to directly modify the configuration file in the physical machine in the future, you can map the/conf folder in the container in three steps: - -Step 1: Copy the `/conf` directory from the container to `/docker-iotdb/iotdb/conf` on each of the three servers - -```Bash -docker cp iotdb-confignode:/iotdb/conf /docker-iotdb/iotdb/conf -or -docker cp iotdb-datanode:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -Step 2: Add `/conf` directory mapping in `confignode.yml` and `datanode. 
yml` on 3 servers - -```Bash -#confignode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - -#datanode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -Step 3: Restart IoTDB on 3 servers - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d -docker-compose -f datanode.yml up -d -``` - diff --git a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md b/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md deleted file mode 100644 index 40c5e1d3d..000000000 --- a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md +++ /dev/null @@ -1,164 +0,0 @@ - -# Dual Active Deployment - -## What is a double active version? - -Dual active usually refers to two independent machines (or clusters) that perform real-time mirror synchronization. Their configurations are completely independent and can simultaneously receive external writes. Each independent machine (or cluster) can synchronize the data written to itself to another machine (or cluster), and the data of the two machines (or clusters) can achieve final consistency. - -- Two standalone machines (or clusters) can form a high availability group: when one of the standalone machines (or clusters) stops serving, the other standalone machine (or cluster) will not be affected. When the single machine (or cluster) that stopped the service is restarted, another single machine (or cluster) will synchronize the newly written data. 
Applications can connect to both standalone machines (or clusters) for read and write operations, thereby achieving high availability. -- The dual active deployment scheme allows for high availability with fewer than 3 physical nodes and has certain advantages in deployment costs. At the same time, the two sets of single machines (or clusters) can be physically isolated in both power supply and network through dual-ring power and network connections, ensuring the stability of operation. -- At present, the dual active capability is a feature of the enterprise version. - -![](/img/20240731104336.png) - -## Note - -1. It is recommended to use `hostname` rather than IP addresses in the configuration during deployment, to avoid database failures caused by changing the host IP later. To set the hostname, you need to configure `/etc/hosts` on the target server. If the local IP is 192.168.1.3 and the hostname is iotdb-1, you can use the following command to set the server's hostname, and then configure IoTDB's `cn_internal_address` and `dn_internal_address` using the hostname. - - ```Bash - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -2. Some parameters cannot be modified after the first startup. Please refer to the "Installation Steps" section below to set them. - -3. It is recommended to deploy a monitoring panel, which tracks important operational indicators and lets you check the database status at any time. The monitoring panel can be obtained by contacting the business department. For deployment steps, see [Monitoring Panel Deployment](https://www.timecho.com/docs/UserGuide/latest/Deployment-and-Maintenance/Monitoring-panel-deployment.html) - -## Installation Steps - -Taking a dual active IoTDB built from two single machines A and B as an example, the IP addresses of A and B are 192.168.1.3 and 192.168.1.4, respectively. Here, we use hostnames to represent the different hosts.
The plan is as follows: - -| Machine | Machine IP | Host Name | -| ------- | ----------- | --------- | -| A | 192.168.1.3 | iotdb-1 | -| B | 192.168.1.4 | iotdb-2 | - -### Step1:Install Two Independent IoTDBs Separately - -Install IoTDB on each of the two machines. For the standalone version, refer to [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md); for the cluster version, refer to [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md). **It is recommended that the configurations of clusters A and B remain consistent to achieve the best dual active effect.** - -### Step2:Create A Data Synchronization Task On Machine A To Machine B - -- Create a data synchronization process on machine A, where the data on machine A is automatically synchronized to machine B. Use the CLI tool in the sbin directory to connect to the IoTDB database on machine A: - - ```Bash - ./sbin/start-cli.sh -h iotdb-1 - ``` - -- Create and start the pipe with the following SQL: - - ```Bash - create pipe AB - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-2', - 'sink.port'='6667' - ) - ``` - -- Note: To avoid infinite data loops, the parameter `source.forwarding-pipe-requests` must be set to `false` on both A and B, indicating that data received from another pipe will not be forwarded. - -### Step3:Create A Data Synchronization Task On Machine B To Machine A - -- Create a data synchronization process on machine B, where the data on machine B is automatically synchronized to machine A.
Use the CLI tool in the sbin directory to connect to the IoTDB database on machine B: - - ```Bash - ./sbin/start-cli.sh -h iotdb-2 - ``` - -- Create and start the pipe with the following SQL: - - ```Bash - create pipe BA - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-1', - 'sink.port'='6667' - ) - ``` - -- Note: To avoid infinite data loops, the parameter `source.forwarding-pipe-requests` must be set to `false` on both A and B, indicating that data received from another pipe will not be forwarded. - -### Step4:Validate Deployment - -After the data synchronization pipes above are created, the dual active cluster is ready for use. - -#### Check the running status of the cluster - -```Bash -#Execute the show cluster command on each of the two nodes to check the status of the IoTDB service -show cluster -``` - -**Machine A**: - -![](/img/%E5%8F%8C%E6%B4%BB-A.png) - -**Machine B**: - -![](/img/%E5%8F%8C%E6%B4%BB-B.png) - -Ensure that every ConfigNode and DataNode is in the Running state. - -#### Check synchronization status - -- Check the synchronization status on machine A - -```Bash -show pipes -``` - -![](/img/show%20pipes-A.png) - -- Check the synchronization status on machine B - -```Bash -show pipes -``` - -![](/img/show%20pipes-B.png) - -Ensure that every pipe is in the RUNNING state.
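
The two pipes created in Step2 and Step3 differ only in the pipe name and the sink host, so it can help to keep both definitions in one template. The following is a minimal sketch under that assumption: `render_pipe_sql` is a hypothetical helper, not part of IoTDB, and the statement body is copied from the steps above. It only prints the SQL; running it in the CLI is still done by hand.

```Bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with IoTDB): prints the CREATE PIPE
# statement from Step2/Step3 for a given pipe name, sink host, and sink port.
render_pipe_sql() {
  local pipe_name="$1"
  local sink_host="$2"
  local sink_port="${3:-6667}"   # defaults to the RPC port used in this guide
  cat <<EOF
create pipe ${pipe_name}
with source (
  'source.forwarding-pipe-requests' = 'false'
)
with sink (
  'sink'='iotdb-thrift-sink',
  'sink.ip'='${sink_host}',
  'sink.port'='${sink_port}'
)
EOF
}

# Statement to run on machine A (synchronizes A to B)
render_pipe_sql AB iotdb-2
# Statement to run on machine B (synchronizes B to A)
render_pipe_sql BA iotdb-1
```

Keeping `'source.forwarding-pipe-requests'` fixed to `'false'` in the shared template means neither direction can accidentally forward data it received from the other pipe.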
- -### Step5:Stop Dual Active Version IoTDB - -- Execute the following command on machine A: - - ```SQL - ./sbin/start-cli.sh -h iotdb-1 #Log in to CLI - IoTDB> stop pipe AB #Stop the data synchronization process - ./sbin/stop-standalone.sh #Stop database service - ``` - -- Execute the following command on machine B: - - ```SQL - ./sbin/start-cli.sh -h iotdb-2 #Log in to CLI - IoTDB> stop pipe BA #Stop the data synchronization process - ./sbin/stop-standalone.sh #Stop database service - ``` - diff --git a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/IoTDB-Package_timecho.md b/src/UserGuide/dev-1.3/Deployment-and-Maintenance/IoTDB-Package_timecho.md deleted file mode 100644 index 86e0af2aa..000000000 --- a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/IoTDB-Package_timecho.md +++ /dev/null @@ -1,42 +0,0 @@ - -# Obtain TimechoDB -## How to obtain TimechoDB -The enterprise version installation package can be obtained through product trial application or by directly contacting the business personnel who are in contact with you. - -## Installation Package Structure -Install the package after decompression(iotdb-enterprise-{version}-bin.zip),The directory structure after unpacking the installation package is as follows: -| **catalogue** | **Type** | **Explanation** | -| :--------------: | -------- | ------------------------------------------------------------ | -| activation | folder | The directory where the activation file is located, including the generated machine code and the enterprise version activation code obtained from the business side (this directory will only be generated after starting ConfigNode to obtain the activation code) | -| conf | folder | Configuration file directory, including configuration files such as ConfigNode, DataNode, JMX, and logback | -| data | folder | The default data file directory contains data files for ConfigNode and DataNode. 
(The directory will only be generated after starting the program) | -| lib | folder | IoTDB executable library file directory | -| licenses | folder | Open source community certificate file directory | -| logs | folder | The default log file directory, which includes log files for ConfigNode and DataNode (this directory will only be generated after starting the program) | -| sbin | folder | Main script directory, including start, stop, and other scripts | -| tools | folder | Directory of System Peripheral Tools | -| ext | folder | Related files for pipe, trigger, and UDF plugins (created by the user when needed) | -| LICENSE | file | certificate | -| NOTICE | file | Tip | -| README_ZH\.md | file | Explanation of the Chinese version in Markdown format | -| README\.md | file | Instructions for use | -| RELEASE_NOTES\.md | file | Version Description | diff --git a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Kubernetes_timecho.md b/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Kubernetes_timecho.md deleted file mode 100644 index 14b51ab84..000000000 --- a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Kubernetes_timecho.md +++ /dev/null @@ -1,445 +0,0 @@ - - -# Kubernetes - -## 1. Environment Preparation - -### 1.1 Prepare a Kubernetes Cluster - -Ensure that you have an available Kubernetes cluster (minimum recommended version: Kubernetes 1.24) as the foundation for deploying the IoTDB cluster. - -Kubernetes Version Requirement: The recommended version is Kubernetes 1.24 or above. - -IoTDB Version Requirement: The version of TimechoDB must not be lower than v1.3.3.2. - -## 2. Create Namespace - -### 2.1 Create Namespace - -> Note: Before executing the namespace creation operation, verify that the specified namespace name has not been used in the Kubernetes cluster. If the namespace already exists, the creation command will fail, which may lead to errors during the deployment process. 
- -```Bash -kubectl create ns iotdb-ns -``` - -### 2.2 View Namespace - -```Bash -kubectl get ns -``` - -## 3. Create PersistentVolume (PV) - -### 3.1 Create PV Configuration File - -PV is used for persistent storage of IoTDB's ConfigNode and DataNode data. You need to create one PV for each node. - -> Note: One ConfigNode and one DataNode count as two nodes, requiring two PVs. - -For example, with 3 ConfigNodes and 3 DataNodes: - -1. Create a `pv.yaml` file and make six copies, renaming them to `pv01.yaml` through `pv06.yaml`. - -```Bash -# Create a directory to store YAML files -# Create pv.yaml file -touch pv.yaml -``` - -2. Modify the `name` and `path` in each file so that they match each other and are unique per PV (for example, `iotdb-pv-02` with `/data/k8s-data/iotdb-pv-02`). - -**pv.yaml Example:** - -```YAML -# pv.yaml -apiVersion: v1 -kind: PersistentVolume -metadata: - name: iotdb-pv-01 -spec: - capacity: - storage: 10Gi # Storage capacity - accessModes: # Access modes - - ReadWriteOnce - persistentVolumeReclaimPolicy: Retain # Reclaim policy - # Storage class name, if using local static storage, do not configure; if using dynamic storage, this must be set - storageClassName: local-storage - # Add the corresponding configuration based on your storage type - hostPath: # If using a local path - path: /data/k8s-data/iotdb-pv-01 - type: DirectoryOrCreate # If this line is not configured, you need to manually create the directory -``` - -### 3.2 Apply PV Configuration - -```Bash -kubectl apply -f pv01.yaml -kubectl apply -f pv02.yaml -... -``` - -### 3.3 View PV - -```Bash -kubectl get pv -``` - - -### 3.4 Manually Create Directories - -> Note: If the type in the hostPath of the YAML file is not configured, you need to manually create the corresponding directories. - -Create the corresponding directories on all Kubernetes nodes: -```Bash -mkdir -p /data/k8s-data/iotdb-pv-01 -mkdir -p /data/k8s-data/iotdb-pv-02 -... -``` - -## 4.
Install Helm - -For installation steps, please refer to the[Helm Official Website.](https://helm.sh/zh/docs/intro/install/) - -## 5. Configure IoTDB Helm Chart - -### 5.1 Clone IoTDB Kubernetes Deployment Code - -Please contact timechodb staff to obtain the IoTDB Helm Chart. If you encounter proxy issues, disable the proxy settings: - -### 5.2 Modify YAML Files - -> Ensure that the version used is supported (>=1.3.3.2): - -**values.yaml Example:** - -```YAML -nameOverride: "iotdb" -fullnameOverride: "iotdb" # Name after installation - -image: - repository: nexus.infra.timecho.com:8143/timecho/iotdb-enterprise - pullPolicy: IfNotPresent - tag: 1.3.3.2-standalone # Repository and version used - -storage: - # Storage class name, if using local static storage, do not configure; if using dynamic storage, this must be set - className: local-storage - -datanode: - name: datanode - nodeCount: 3 # Number of DataNode nodes - enableRestService: true - storageCapacity: 10Gi # Available space for DataNode - resources: - requests: - memory: 2Gi # Initial memory size for DataNode - cpu: 1000m # Initial CPU size for DataNode - limits: - memory: 4Gi # Maximum memory size for DataNode - cpu: 1000m # Maximum CPU size for DataNode - -confignode: - name: confignode - nodeCount: 3 # Number of ConfigNode nodes - storageCapacity: 10Gi # Available space for ConfigNode - resources: - requests: - memory: 512Mi # Initial memory size for ConfigNode - cpu: 1000m # Initial CPU size for ConfigNode - limits: - memory: 1024Mi # Maximum memory size for ConfigNode - cpu: 2000m # Maximum CPU size for ConfigNode - configNodeConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - schemaReplicationFactor: 3 - schemaRegionConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - dataReplicationFactor: 2 - dataRegionConsensusProtocolClass: org.apache.iotdb.consensus.iot.IoTConsensus -``` - -## 6. 
Configure Private Repository Information or Pre-Pull Images - -Configure private repository information on k8s as a prerequisite for the next `helm install` step. - -Option 1 pulls the available IoTDB images during `helm install`; Option 2 imports the available IoTDB images into containerd in advance. - -### 6.1 [Option 1] Pull Image from Private Repository - -#### 6.1.1 Create a Secret to Allow k8s to Access the IoTDB Helm Private Repository - -Replace xxxxxx with the IoTDB private repository account, password, and email. - - - -```Bash -# Note the single quotes -kubectl create secret docker-registry timecho-nexus \ - --docker-server='nexus.infra.timecho.com:8143' \ - --docker-username='xxxxxx' \ - --docker-password='xxxxxx' \ - --docker-email='xxxxxx' \ - -n iotdb-ns - -# View the secret -kubectl get secret timecho-nexus -n iotdb-ns -# View and output as YAML -kubectl get secret timecho-nexus --output=yaml -n iotdb-ns -# View and decode (the value is base64-encoded, not encrypted) -kubectl get secret timecho-nexus --output="jsonpath={.data.\.dockerconfigjson}" -n iotdb-ns | base64 --decode -``` - -#### 6.1.2 Load the Secret as a Patch to the Namespace iotdb-ns - -```Bash -# Add a patch to include login information for nexus in this namespace -kubectl patch serviceaccount default -n iotdb-ns -p '{"imagePullSecrets": [{"name": "timecho-nexus"}]}' - -# View the information in this namespace -kubectl get serviceaccounts -n iotdb-ns -o yaml -``` - -### 6.2 [Option 2] Import Image - -This step is for scenarios where the customer cannot connect to the private repository and requires assistance from company implementation staff. 
- -#### 6.2.1 Pull and Export the Image: - -```Bash -ctr images pull --user xxxxxxxx nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.2 View and Export the Image: - -```Bash -# View -ctr images ls - -# Export -ctr images export iotdb-enterprise:1.3.3.2-standalone.tar nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.3 Import into the k8s Namespace: - -> Note that k8s.io is the namespace for ctr in the example environment; importing to other namespaces will not work. - -```Bash -# Import into the k8s namespace -ctr -n k8s.io images import iotdb-enterprise:1.3.3.2-standalone.tar -``` - -#### 6.2.4 View the Image: - -```Bash -ctr --namespace k8s.io images list | grep 1.3.3.2 -``` - -## 7. Install IoTDB - -### 7.1 Install IoTDB - -```Bash -# Enter the directory -cd iotdb-cluster-k8s/helm - -# Install IoTDB -helm install iotdb ./ -n iotdb-ns -``` - -### 7.2 View Helm Installation List - -```Bash -# helm list -helm list -n iotdb-ns -``` - -### 7.3 View Pods - -```Bash -# View IoTDB pods -kubectl get pods -n iotdb-ns -o wide -``` - -After executing the command, if the output shows 6 Pods with confignode and datanode labels (3 each), it indicates a successful installation. Note that not all Pods may be in the Running state initially; inactive datanode Pods may keep restarting but will normalize after activation. - -### 7.4 Troubleshooting - -```Bash -# View k8s creation logs -kubectl get events -n iotdb-ns -watch kubectl get events -n iotdb-ns - -# Get detailed information -kubectl describe pod confignode-0 -n iotdb-ns -kubectl describe pod datanode-0 -n iotdb-ns - -# View ConfigNode logs -kubectl logs -n iotdb-ns confignode-0 -f -``` - -## 8. 
Activate IoTDB - -### 8.1 Option 1: Activate Directly in the Pod (Quickest) - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-1 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-2 -- /iotdb/sbin/start-activate.sh -# Obtain the machine code and proceed with activation -``` - -### 8.2 Option 2: Activate Inside the ConfigNode Container - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /bin/bash -cd /iotdb/sbin -/bin/bash start-activate.sh -# Obtain the machine code and proceed with activation -# Exit the container -``` - -### Option 3: Manual Activation - -1. View ConfigNode details to determine the node: - -```Bash -kubectl describe pod confignode-0 -n iotdb-ns | grep -e "Node:" -e "Path:" - -# Example output: -# Node: a87/172.20.31.87 -# Path: /data/k8s-data/env/confignode/.env -``` - -2. View PVC and find the corresponding Volume for ConfigNode to determine the path: - -```Bash -kubectl get pvc -n iotdb-ns | grep "confignode-0" -# Example output: -# map-confignode-confignode-0 Bound iotdb-pv-04 10Gi RWO local-storage 8h - -# To view multiple ConfigNodes, use the following: -for i in {0..2}; do echo confignode-$i; kubectl describe pod confignode-${i} -n iotdb-ns | grep -e "Node:" -e "Path:" -``` - -3. View the Detailed Information of the Corresponding Volume to Determine the Physical Directory Location: - - -```Bash -kubectl describe pv iotdb-pv-04 | grep "Path:" - -# Example output: -# Path: /data/k8s-data/iotdb-pv-04 -``` - -4. Locate the system-info file in the corresponding directory on the corresponding node, use this system-info as the machine code to generate an activation code, and create a new file named license in the same directory, writing the activation code into this file. - -## 9. 
Verify IoTDB - -### 9.1 Check the Status of Pods within the Namespace - -View the IP, status, and other information of the pods in the iotdb-ns namespace to ensure they are all running normally. - -```Bash -kubectl get pods -n iotdb-ns -o wide - -# Example output: -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` - -### 9.2 Check the Port Mapping within the Namespace - -```Bash -kubectl get svc -n iotdb-ns - -# Example output: -# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE -# confignode-svc NodePort 10.10.226.151 80:31026/TCP 7d8h -# datanode-svc NodePort 10.10.194.225 6667:31563/TCP 7d8h -# jdbc-balancer LoadBalancer 10.10.191.209 6667:31895/TCP 7d8h -``` - -### 9.3 Start the CLI Script on Any Server to Verify the IoTDB Cluster Status - -Use the port of jdbc-balancer and the IP of any k8s node. - -```Bash -start-cli.sh -h 172.20.31.86 -p 31895 -start-cli.sh -h 172.20.31.87 -p 31895 -start-cli.sh -h 172.20.31.88 -p 31895 -``` - - - -## 10. Scaling - -### 10.1 Add New PV - -Add a new PV; scaling is only possible with available PVs. - - - -**Note: DataNode cannot join the cluster after restart** - -**Reason**:The static storage hostPath mode is configured, and the script modifies the `iotdb-system.properties` file to set `dn_data_dirs` to `/iotdb6/iotdb_data,/iotdb7/iotdb_data`. However, the default storage path `/iotdb/data` is not mounted, leading to data loss upon restart. -**Solution**:Mount the `/iotdb/data` directory as well, and ensure this setting is applied to both ConfigNode and DataNode to maintain data integrity and cluster stability. 
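
Because scaling depends on PVs that are not yet bound, it can help to count the PVs in the `Available` state before running `helm upgrade`. The sketch below is only an illustration: `count_available_pvs` is a hypothetical helper name, and it assumes that the STATUS value is the fifth whitespace-separated field of `kubectl get pv --no-headers` output.

```shell
# Sketch only: count_available_pvs is a hypothetical helper, not part of IoTDB or kubectl.
# It reads `kubectl get pv --no-headers` output on stdin and counts the rows
# whose STATUS column (assumed to be the 5th field) is "Available".
count_available_pvs() {
  awk '$5 == "Available" { n++ } END { print n + 0 }'
}
```

A possible invocation is `kubectl get pv --no-headers | count_available_pvs`; a result of `0` suggests that new PVs should be created before scaling.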
- -### 10.2 Scale ConfigNode - -Example: Scale from 3 ConfigNodes to 4 ConfigNodes - -Modify the values.yaml file in iotdb-cluster-k8s/helm to change the number of ConfigNodes from 3 to 4. - -```Shell -helm upgrade iotdb . -n iotdb-ns -``` - - - - -### 10.3 Scale DataNode - -Example: Scale from 3 DataNodes to 4 DataNodes - -Modify the values.yaml file in iotdb-cluster-k8s/helm to change the number of DataNodes from 3 to 4. - -```Shell -helm upgrade iotdb . -n iotdb-ns -``` - -### 10.4 Verify IoTDB Status - -```Shell -kubectl get pods -n iotdb-ns -o wide - -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -# datanode-3 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` \ No newline at end of file diff --git a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md b/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md deleted file mode 100644 index b6a8f2d4b..000000000 --- a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md +++ /dev/null @@ -1,323 +0,0 @@ - -# Stand-Alone Deployment - -This chapter will introduce how to start an IoTDB standalone instance, which includes 1 ConfigNode and 1 DataNode (commonly known as 1C1D). - -## Matters Needing Attention - -1. Before installation, ensure that the system is complete by referring to [System configuration](./Environment-Requirements.md). - -2. It is recommended to prioritize using 'hostname' for IP configuration during deployment, which can avoid the problem of modifying the host IP in the later stage and causing the database to fail to start. 
To set the host name, configure `/etc/hosts` on the target server. For example, if the local IP is 192.168.1.3 and the host name is iotdb-1, use the following command to register the host name, then use that host name for IoTDB's `cn_internal_address`, `dn_internal_address`, and `dn_rpc_address`. - - ```shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -3. Some parameters cannot be modified after the first startup. Please refer to the "Parameter Configuration" section below for settings. - -4. On both Linux and Windows, ensure that the IoTDB installation path contains no spaces or Chinese characters, to avoid software exceptions. - -5. Use the same user for all operations when installing and deploying IoTDB (including activating and using the software). You can: -- Use the root user (recommended): this avoids permission issues. -- Use a fixed non-root user: - - Use the same user throughout: ensure that the same user performs start, activation, stop, and other operations, and do not switch users. - - Avoid sudo: sudo executes commands with root privileges, which may cause confusion or security issues. - -6. It is recommended to deploy a monitoring panel, which tracks important operational indicators and the database status at any time. The monitoring panel can be obtained by contacting the business department. For deployment steps, see [Monitoring Board Install and Deploy](./Monitoring-panel-deployment.md). - -7. Before installation, the health check tool can help inspect the operating environment of IoTDB nodes and obtain detailed inspection results. For usage, see [Health Check Tool](../Tools-System/Health-Check-Tool.md). - - -## Installation Steps - -### 1. 
Pre-installation Check - -To ensure the IoTDB Enterprise Edition installation package you obtained is complete and authentic, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Find the "SHA512 Checksum" corresponding to each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/iotdb): - ```Bash - cd /data/iotdb - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-02.png) - -4. Compare the output result with the official SHA512 checksum. Once confirmed that they match, you can proceed with the installation and deployment operations in accordance with the procedures below. - -#### Notes: - -- If the verification results do not match, please contact Timecho Team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - -### 2、Unzip the installation package and enter the installation directory - -```shell -unzip iotdb-enterprise-{version}-bin.zip -cd iotdb-enterprise-{version}-bin -``` - -### 2. 
Parameter Configuration - -#### Environment Script Configuration - -- ./conf/confignode-env.sh (./conf/confignode-env.bat) configuration - -| **Configuration** | **Description** | **Default** | **Recommended value** | Note | -| :---------------: | :----------------------------------------------------------: | :---------: | :----------------------------------------------------------: | :---------------------------------: | -| MEMORY_SIZE | The total amount of memory that IoTDB ConfigNode nodes can use | Automatically calculated based on system memory, defaulting to 30% of the system memory. | Can be filled in as needed, and the system will allocate memory based on the filled in values | Save changes without immediate execution; modifications take effect after service restart. | - -- ./conf/datanode-env.sh (./conf/datanode-env.bat) configuration - -| **Configuration** | **Description** | **Default** | **Recommended value** | **Note** | -| :---------: | :----------------------------------: |:----------------------------------------------------------------------------------------:| :----------------------------------------------: | :----------: | -| MEMORY_SIZE | The total amount of memory that IoTDB DataNode nodes can use | Automatically calculated based on system memory, defaulting to 50% of the system memory. | Can be filled in as needed, and the system will allocate memory based on the filled in values | Save changes without immediate execution; modifications take effect after service restart. | - -#### System General Configuration - -Open the general configuration file (./conf/iotdb-system. 
properties file) and set the following parameters: - -| **Configuration** | **Description** | **Default** | **Recommended value** | Note | -| :-----------------------: | :----------------------------------------------------------: | :------------: | :----------------------------------------------------------: |:----------------------------------------------------------------------------------------------------------------------------------:| -| cluster_name | Cluster Name | defaultCluster | The cluster name can be set as needed, and if there are no special needs, the default can be kept | Support hot loading from V1.3.3, but it is not recommended to change the cluster name by manually modifying the configuration file | -| schema_replication_factor | Number of metadata replicas, set to 1 for the standalone version here | 1 | 1 | Default 1, cannot be modified after the first startup | -| data_replication_factor | Number of data replicas, set to 1 for the standalone version here | 1 | 1 | Default 1, cannot be modified after the first startup | - -#### ConfigNode Configuration - -Open the ConfigNode configuration file (./conf/iotdb-system. 
properties file) and set the following parameters: - -| **Configuration** | **Description** | **Default** | **Recommended value** | Note | -| :-----------------: | :----------------------------------------------------------: | :-------------: | :----------------------------------------------------------: | :--------------------------------------: | -| cn_internal_address | The address used by ConfigNode for communication within the cluster | 127.0.0.1 | The IPV4 address or host name of the server where it is located, and it is recommended to use host name | Cannot be modified after initial startup | -| cn_internal_port | The port used by ConfigNode for communication within the cluster | 10710 | 10710 | Cannot be modified after initial startup | -| cn_consensus_port | The port used for ConfigNode replica group consensus protocol communication | 10720 | 10720 | Cannot be modified after initial startup | -| cn_seed_config_node | The address of the ConfigNode that the node connects to when registering to join the cluster, cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | Cannot be modified after initial startup | - -#### DataNode Configuration - -Open the DataNode configuration file (./conf/iotdb-system. 
properties file) and set the following parameters: - -| **Configuration** | **Description** | **Default** | **Recommended value** | **Note** | -| :------------------------------ | :----------------------------------------------------------- | :-------------- |:----------------------------------------------------------------------------------------------------------------| :--------------------------------------- | -| dn_rpc_address | The address of the client RPC service | 0.0.0.0 | The IPV4 address or host name of the server where it is located, and it is recommended to use the IPV4 address | Restarting the service takes effect | -| dn_rpc_port | The port of the client RPC service | 6667 | 6667 | Restarting the service takes effect | -| dn_internal_address | The address used by DataNode for communication within the cluster | 127.0.0.1 | The IPV4 address or host name of the server where it is located, and it is recommended to use host name | Cannot be modified after initial startup | -| dn_internal_port | The port used by DataNode for communication within the cluster | 10730 | 10730 | Cannot be modified after initial startup | -| dn_mpp_data_exchange_port | The port used by DataNode to receive data streams | 10740 | 10740 | Cannot be modified after initial startup | -| dn_data_region_consensus_port | The port used by DataNode for data replica consensus protocol communication | 10750 | 10750 | Cannot be modified after initial startup | -| dn_schema_region_consensus_port | The port used by DataNode for metadata replica consensus protocol communication | 10760 | 10760 | Cannot be modified after initial startup | -| dn_seed_config_node | The ConfigNode address that the node connects to when registering to join the cluster, i.e. cn_internal-address: cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | Cannot be modified after initial startup | - -> ❗️Attention: Editors such as VSCode Remote do not have automatic configuration saving function. 
Please ensure that the modified files are saved persistently, otherwise the configuration items will not take effect - -### 3. Start and Activate Database (Available since V1.3.4) - -#### 3.1 Start ConfigNode - -Enter the sbin directory of iotdb and start confignode - -```shell -./start-confignode.sh -d #The "- d" parameter will start in the background -``` -If the startup fails, please refer to [Common Questions](#common-questions). - -#### 3.2 Start DataNode - -Enter the sbin directory of iotdb and start datanode: - -```shell -./start-datanode.sh -d # The "- d" parameter will start in the background -``` - -#### 3.3 Activate Database - -##### Activation via CLI - -- Enter the CLI - - ```SQL - ./sbin/start-cli.sh -``` - -- Execute the following command to obtain the machine code required for activation: - - ```Bash - show system info - ``` - -- Copy the returned machine code and provide it to the Timecho team: - -```Bash -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -It costs 0.030s -``` - -- Input the activation code returned by the Timecho team into the CLI using the following command: - - Note: The activation code must be enclosed in ' symbols, as shown: - -```Bash -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - -### 4. Start and Activate Database (Available before V1.3.4) - -#### 4.1 Start ConfigNode - -Enter the sbin directory of iotdb and start confignode - -```shell -./start-confignode.sh -d #The "- d" parameter will start in the background -``` -If the startup fails, please refer to [Common Questions](#common-questions). 
- -#### 4.2 Activate Database - -##### Method 1: Activation by Copying Files - -- After starting the ConfigNode, enter the activation folder and send the systeminfo file to Timecho staff. -- Receive the license file returned by the staff. -- Place the license file in the activation folder of the corresponding node. - -##### Method 2: Activation via Script - -- To obtain the machine code required for activation, enter the sbin directory of the installation directory and execute the activation script: - -```shell - cd sbin -./start-activate.sh -``` - -- The following information is displayed. Copy the machine code (i.e. the string of characters) and send it to Timecho staff: - -```shell -Please copy the system_info's content and send it to Timecho: -01-KU5LDFFN-PNBEHDRH -Please enter license: -``` - -- Enter the activation code returned by the staff at the 'Please enter license:' prompt, as shown below: - -```shell -Please enter license: -JJw+MmF+AtexsfgNGOFgTm83Bxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxm6pF+APW1CiXLTSijK9Qh3nsLgzrW8OJPh26Vl6ljKUpCvpTiw== -License has been stored to sbin/../activation/license -Import completed. Please start cluster and excute 'show cluster' to verify activation status -``` - -#### 4.3 Start DataNode - -Enter the sbin directory of iotdb and start datanode: - -```shell -./start-datanode.sh -d # The "- d" parameter will start in the background -``` - -### 5. Verify Deployment - -Run the CLI startup script in the sbin directory directly: - -```shell -./start-cli.sh -h ip(local IP or domain name) -p port(6667) -``` - -After a successful startup, the following interface appears, indicating that IoTDB has been installed successfully. 
- -![](/img/%E5%90%AF%E5%8A%A8%E6%88%90%E5%8A%9F.png) - -After the installation success interface appears, continue to check if the activation is successful and use the `show cluster`command - -When you see the display "Activated" on the far right, it indicates successful activation - -![](/img/show%20cluster.png) - -In the CLI, you can also check the activation status by running the `show activation` command; the example below shows a status of ACTIVATED, indicating successful activation. - -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - - -> The appearance of 'Activated (W)' indicates passive activation, indicating that this Config Node does not have a license file (or has not issued the latest license file with a timestamp). At this point, it is recommended to check if the license file has been placed in the license folder. If not, please place the license file. If a license file already exists, it may be due to inconsistency between the license file of this node and the information of other nodes. Please contact Timecho staff to reapply. - -## Common Problem -1. Multiple prompts indicating activation failure during deployment process - - Use the `ls -al` command: Use the `ls -al` command to check if the owner information of the installation package root directory is the current user. - - Check activation directory: Check all files in the `./activation` directory and whether the owner information is the current user. - -2. 
Confignode failed to start - - Step 1: Please check the startup log to see if any parameters that cannot be changed after the first startup have been modified. - - Step 2: Please check the startup log for any other abnormalities. If there are any abnormal phenomena in the log, please contact Timecho Technical Support personnel for consultation on solutions. - - Step 3: If it is the first deployment or data can be deleted, you can also clean up the environment according to the following steps, redeploy, and restart. - - Step 4: Clean up the environment: - - a. Terminate all ConfigNode Node and DataNode processes. - ```Bash - # 1. Stop the ConfigNode and DataNode services - sbin/stop-standalone.sh - - # 2. Check for any remaining processes - jps - # Or - ps -ef|grep iotdb - - # 3. If there are any remaining processes, manually kill the - kill -9 - # If you are sure there is only one iotdb on the machine, you can use the following command to clean up residual processes - ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9 - ``` - b. Delete the data and logs directories. - - Explanation: Deleting the data directory is necessary, deleting the logs directory is for clean logs and is not mandatory. - - ```Bash - cd /data/iotdb - rm -rf data logs - ``` \ No newline at end of file diff --git a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/workbench-deployment_timecho.md b/src/UserGuide/dev-1.3/Deployment-and-Maintenance/workbench-deployment_timecho.md deleted file mode 100644 index 6d4bac18d..000000000 --- a/src/UserGuide/dev-1.3/Deployment-and-Maintenance/workbench-deployment_timecho.md +++ /dev/null @@ -1,262 +0,0 @@ - -# Workbench Deployment - -The visualization console is one of the supporting tools for IoTDB (similar to Navicat for MySQL). 
It is the official tool for database deployment, operation and maintenance, and application development. It makes using, operating, and managing the database simpler and more efficient, which lowers the cost of running it. This document describes how to install Workbench. - -
-  -  -
- -The instructions for using the visualization console tool can be found in the [Instructions](../Tools-System/Monitor-Tool.md) section of the document. - -## Installation Preparation - -| Preparation Content | Name | Version Requirements | Link | -| :----------------------: | :-------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | -| Operating System | Windows or Linux | - | - | -| Installation Environment | JDK | v1.5.4 and below require ≥ 1.8; v1.5.5 and above require ≥ 17. Choose the ARM or x64 installer according to your system.| https://www.oracle.com/java/technologies/downloads/ | -| Related Software | Prometheus | Requires installation of V2.30.3 and above. | https://prometheus.io/download/ | -| Database | IoTDB | Requires V1.2.0 Enterprise Edition and above | You can contact business or technical support to obtain | -| Console | IoTDB-Workbench-`` | - | You can choose according to the appendix version comparison table and contact business or technical support to obtain it | - - -### Pre-installation Check - -To ensure the Workbench installation package you obtained is complete and valid, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Contact the Timecho Team to get it. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/workbench): - ```Bash - cd /data/workbench - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum IoTDB-Workbench-``.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-03.png) - -4. Compare the output result with the official SHA512 checksum. 
Once confirmed that they match, you can proceed with the installation and deployment operations in accordance with the procedures below. - -#### Notes: - -- If the verification results do not match, please contact the Timecho Team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - - -## Installation Steps - -### Step 1: IoTDB enables monitoring indicator collection - -1. Open the monitoring configuration item. The configuration items related to monitoring in IoTDB are disabled by default. Before deploying the monitoring panel, you need to open the relevant configuration items (note that the service needs to be restarted after enabling monitoring configuration). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Configuration | Configuration file | Description |
| :--- | :--- | :--- |
| cn_metric_reporter_list | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to PROMETHEUS |
| cn_metric_level | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to IMPORTANT |
| cn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this configuration item to the configuration file and keep the default value of 9091. Another port may be used as long as it does not conflict with ports already in use |
| dn_metric_reporter_list | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to PROMETHEUS |
| dn_metric_level | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to IMPORTANT |
| dn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this configuration item to the configuration file and keep the default value of 9092. Another port may be used as long as it does not conflict with ports already in use |
| dn_metric_internal_reporter_type | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to IOTDB |
| enable_audit_log | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to true |
| audit_log_storage | conf/iotdb-system.properties | Add this configuration item to the configuration file, with values set to IOTDB and LOGGER |
| audit_log_operation | conf/iotdb-system.properties | Add this configuration item to the configuration file, with values set to DML,DDL,QUERY |
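
The settings listed above can be checked mechanically before restarting the nodes. The following is a minimal sketch under stated assumptions: `check_monitoring_config` is a hypothetical helper name, and the file is assumed to contain plain `key=value` lines without spaces around `=`. The two port items and `audit_log_storage` are left out because their exact values may legitimately differ per deployment.

```shell
# Sketch only: check_monitoring_config is a hypothetical helper, not part of IoTDB.
# It assumes plain "key=value" lines with no spaces around "=".
check_monitoring_config() {
  file="$1"
  status=0
  for pair in \
    "cn_metric_reporter_list=PROMETHEUS" \
    "cn_metric_level=IMPORTANT" \
    "dn_metric_reporter_list=PROMETHEUS" \
    "dn_metric_level=IMPORTANT" \
    "dn_metric_internal_reporter_type=IOTDB" \
    "enable_audit_log=true" \
    "audit_log_operation=DML,DDL,QUERY"
  do
    # -x matches the whole line, -F treats the pair as a literal string
    if ! grep -qxF "$pair" "$file"; then
      echo "missing or different: $pair"
      status=1
    fi
  done
  return "$status"
}
```

A possible invocation is `check_monitoring_config conf/iotdb-system.properties`; the function prints every expected line it could not find and returns a non-zero status if any is missing.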
- - -2. Restart all nodes. After modifying the monitoring indicator configuration of three nodes, the confignode and datanode services of all nodes can be restarted: - - ```shell - ./sbin/stop-standalone.sh #Stop confignode and datanode first - ./sbin/start-confignode.sh -d #Start confignode - ./sbin/start-datanode.sh -d #Start datanode - ``` - -3. After restarting, confirm the running status of each node through the client. If the status is Running, it indicates successful configuration: - - ![](/img/%E5%90%AF%E5%8A%A8.png) - -### Step 2: Install and configure Prometheus - -1. Download the Prometheus installation package, which requires installation of V2.30.3 and above. You can go to the Prometheus official website to download it (https://prometheus.io/docs/introduction/first_steps/) -2. Unzip the installation package and enter the unzipped folder: - - ```Shell - tar xvfz prometheus-*.tar.gz - cd prometheus-* - ``` - -3. Modify the configuration. Modify the configuration file prometheus.yml as follows - 1. Add configNode task to collect monitoring data for ConfigNode - 2. Add a datanode task to collect monitoring data for DataNodes - - ```shell - global: - scrape_interval: 15s - evaluation_interval: 15s - scrape_configs: - - job_name: "prometheus" - static_configs: - - targets: ["localhost:9090"] - - job_name: "confignode" - static_configs: - - targets: ["iotdb-1:9091","iotdb-2:9091","iotdb-3:9091"] - honor_labels: true - - job_name: "datanode" - static_configs: - - targets: ["iotdb-1:9092","iotdb-2:9092","iotdb-3:9092"] - honor_labels: true - ``` - -4. Start Prometheus. The default expiration time for Prometheus monitoring data is 15 days. In production environments, it is recommended to adjust it to 180 days or more to track historical monitoring data for a longer period of time. The startup command is as follows: - - ```Shell - ./prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=180d - ``` - -5. Confirm successful startup. 
Enter `http://IP:port` in a browser to open Prometheus, then click Targets under Status. When every State shows Up, the configuration and connectivity are successful.
-
- - -
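The same check can be scripted rather than read off the Targets page. The following is a minimal sketch, assuming Prometheus is reachable over HTTP and relying on its `/api/v1/targets` endpoint, which lists active targets with a `health` field. The helper names `unhealthy_targets` and `fetch_targets` and the sample payload are illustrative assumptions, not part of IoTDB or Workbench.

```python
import json
from urllib.request import urlopen


def unhealthy_targets(payload):
    """Return (job, instance, health) for every active target that is not 'up'."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (
            t.get("labels", {}).get("job"),
            t.get("labels", {}).get("instance"),
            t.get("health"),
        )
        for t in targets
        if t.get("health") != "up"
    ]


def fetch_targets(base_url):
    """Fetch the target list from a running Prometheus (base_url is an assumption)."""
    with urlopen(base_url.rstrip("/") + "/api/v1/targets", timeout=5) as resp:
        return json.load(resp)


# Hand-written sample shaped like the API response, so the sketch runs offline.
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "confignode", "instance": "iotdb-1:9091"}, "health": "up"},
            {"labels": {"job": "datanode", "instance": "iotdb-2:9092"}, "health": "down"},
        ]
    },
}
print(unhealthy_targets(sample))
```

Against a live server, `unhealthy_targets(fetch_targets("http://IP:port"))` should return an empty list once the `confignode` and `datanode` jobs are all being scraped successfully.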
- - -### Step 3: Install Workbench - -1. Enter the config directory of iotdb Workbench -`` - -2. Modify Workbench configuration file: Go to the `config` folder and modify the configuration file `application-prod.properties`. If you are installing it locally, there is no need to modify it. If you are deploying it on a server, you need to modify the IP address - > Workbench can be deployed on a local or cloud server as long as it can connect to IoTDB - - | Configuration | Before Modification | After modification | - | ---------------- | ----------------------------------- | ----------------------------------------------- | - | pipe.callbackUrl | pipe.callbackUrl=`http://127.0.0.1` | pipe.callbackUrl=`http://` | - - ![](/img/workbench-conf-1.png) - -3. Startup program: Please execute the startup command in the sbin folder of IoTDB Workbench -`` - Windows: - ```shell - # Start Workbench in the background - start.bat -d - ``` - Linux: - ```shell - # Start Workbench in the background - ./start.sh -d - ``` -4. You can use the `jps` command to check if the startup was successful, as shown in the figure: - - ![](/img/windows-jps.png) - -5. Verification successful: Open "`http://Server IP: Port in configuration file`" in the browser to access, for example:"`http://127.0.0.1:9190`" When the login interface appears, it is considered successful - - ![](/img/workbench-en.png) - - -### Step 4: Configure Instance Information - -1. 
Configure instance information: You only need to fill in the following information to connect to the instance - - ![](/img/workbench-en-1.jpeg) - - - | Field Name | Is It A Required Field | Field Meaning | Default Value | - | --------------- | ---------------------- | ------------------------------------------------------------ | ------ | - | Connection Type | Yes | The content filled in for different connection types varies, and supports selecting "single machine, cluster, dual active" | - | - | Instance Name | Yes | You can distinguish different instances based on their names, with a maximum input of 50 characters | - | - | Instance | Yes | Fill in the database address (`dn_rpc_address` field in the `iotdb/conf/iotdb-system.properties` file) and port number (`dn_rpc_port` field). Note: For clusters and dual active devices, clicking the "+" button supports entering multiple instance information | - | - | Prometheus | No | Fill in `http://:/app/v1/query` to view some monitoring information on the homepage. We recommend that you configure and use it | - | - | Username | Yes | Fill in the username for IoTDB, supporting input of 4 to 32 characters, including uppercase and lowercase letters, numbers, and special characters (! @ # $% ^&* () _+-=) | root | - | Enter Password | No | Fill in the password for IoTDB. To ensure the security of the database, we will not save the password. Please fill in the password yourself every time you connect to the instance or test | root | - -2. 
Test the accuracy of the information filled in: You can perform a connection test on the instance information by clicking the "Test" button - - ![](/img/workbench-en-2.png) - -## Appendix: IoTDB and Workbench Version Comparison Table - -| Version | Description | Supported IoTDB Versions | -|---------|-----------------------------------------------------------------------------------------------------------------------------|-------------------------------------| -| V1.5.7 | Optimize the point list by splitting point names into device names and points, ensure the point selection area supports horizontal scrolling, and align the export file column order with the page display. | All 1.x versions from V1.3.4 onward | -| V1.5.6 | Enhanced CSV import/export: optional tags/aliases on import; support for measurement descriptions with backtick-quoted quotes on export. | All 1.x versions from V1.3.4 onward | -| V1.5.5 | Added server clock functionality and support for activating Enterprise Edition license databases | All 1.x versions from V1.3.4 onward | -| V1.5.4 | Added authentication for Prometheus settings in Instance Management | All 1.x versions from V1.3.4 onward | -| V1.5.1 | Added AI analysis and pattern matching | All 1.x versions from V1.3.2 onward | -| V1.4.0 | Added tree model display and English UI | All 1.x versions from V1.3.2 onward | -| V1.3.1 | Enhanced analysis methods and import templates | All 1.x versions from V1.3.2 onward | -| V1.3.0 | Added DB configuration and UI refinements | All 1.x versions from V1.3.2 onward | -| V1.2.6 | Optimized permission controls | All 1.x versions from V1.3.1 onward | -| V1.2.5 | Added "Common Templates" and caching | All 1.x versions from V1.3.0 onward | -| V1.2.4 | Added import/export for calculations, time alignment field | All 1.x versions from V1.2.2 onward | -| V1.2.3 | Added activation details and analysis features | All 1.x versions from V1.2.2 onward | -| V1.2.2 | Optimized point description display | All 1.x 
versions from V1.2.2 onward | -| V1.2.1 | Added sync monitoring panel, Prometheus hints | All 1.x versions from V1.2.2 onward | -| V1.2.0 | Major Workbench upgrade | All 1.x versions from V1.2.0 onward | diff --git a/src/UserGuide/dev-1.3/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md b/src/UserGuide/dev-1.3/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md deleted file mode 100644 index 10f07ed73..000000000 --- a/src/UserGuide/dev-1.3/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md +++ /dev/null @@ -1,275 +0,0 @@ - - -# Ignition - -## Product Overview - -1. Introduction to Ignition - - Ignition is a web-based monitoring and data acquisition tool (SCADA) - an open and scalable universal platform. Ignition allows you to more easily control, track, display, and analyze all data of your enterprise, enhancing business capabilities. For more introduction details, please refer to [Ignition Official Website](https://docs.inductiveautomation.com/docs/8.1/getting-started/introducing-ignition) - -2. Introduction to the Ignition-IoTDB Connector - - The ignition-IoTDB Connector is divided into two modules: the ignition-IoTDB Connector,Ignition-IoTDB With JDBC。 Among them: - - - Ignition-IoTDB Connector: Provides the ability to store data collected by Ignition into IoTDB, and also supports data reading in Components. It injects script interfaces such as `system. iotdb. insert`and`system. iotdb. query`to facilitate programming in Ignition - - Ignition-IoTDB With JDBC: Ignition-IoTDB With JDBC can be used in the`Transaction Groups`module and is not applicable to the`Tag Historian`module. It can be used for custom writing and querying. - - The specific relationship and content between the two modules and ignition are shown in the following figure. 
-
-  ![](/img/20240703114443.png)
-
-## Installation Requirements
-
-| **Preparation Content** | Version Requirements |
-| ------------------------------- | ------------------------------------------------------------ |
-| IoTDB | Version 1.3.1 or above must be installed. For installation, refer to the IoTDB [Deployment Guidance](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) |
-| Ignition | Version 8.1 (8.1.37 or above) must be installed. For installation, refer to the [Installation Guidance](https://docs.inductiveautomation.com/docs/8.1/getting-started/installing-and-upgrading) on the Ignition official website (for other versions, please contact the business department for compatibility information) |
-| Ignition-IoTDB Connector module | Please contact the business department to obtain it |
-| Ignition-IoTDB With JDBC module | Download address: https://repo1.maven.org/maven2/org/apache/iotdb/iotdb-jdbc/ |
-
-## Instruction Manual For Ignition-IoTDB Connector
-
-### Introduction
-
-The Ignition-IoTDB Connector module can store data in a database connection associated with the historical database provider. The data is stored directly in a table in the SQL database according to its data type, together with a millisecond timestamp. Data is stored only when it changes, based on the value mode and deadband settings on each tag, which avoids duplicate and unnecessary data storage.
-
-The Ignition-IoTDB Connector provides the ability to store the data collected by Ignition into IoTDB.
-
-### Installation Steps
-
-Step 1: Enter the `Configuration` - `System` - `Modules` module and click the `Install or Upgrade a Module` button at the bottom
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-1.png)
-
-Step 2: Select the obtained `modl` file and upload it, click `Install`, and trust the relevant certificate.
- -![](/img/20240703-151030.png) - -Step 3: After installation is completed, you can see the following content - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-3.png) - -Step 4: Enter the `Configuration` - `Tags` - `History` module and click on `Create new Historical Tag Provider` below - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-4.png) - -Step 5: Select `IoTDB` and fill in the configuration information - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-5.png) - -The configuration content is as follows: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-| Name | Description | Default Value | Notes |
-| --- | --- | --- | --- |
-| **Main** | | | |
-| Provider Name | Provider Name | - | |
-| Enabled | | true | The provider can only be used when it is true |
-| Description | Description | - | |
-| **IoTDB Settings** | | | |
-| Host Name | The address of the target IoTDB instance | - | |
-| Port Number | The port of the target IoTDB instance | 6667 | |
-| Username | The username of the target IoTDB | - | |
-| Password | Password for target IoTDB | - | |
-| Database Name | The database name to be stored, starting with root, such as root.db | - | |
-| Pool Size | Size of SessionPool | 50 | Can be configured as needed |
-| **Store and Forward Settings** | Just keep it as default | | |
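The settings above can be sanity-checked before they are typed into the Gateway. This is a minimal sketch, not part of the connector: the `ProviderSettings` class and `find_problems` helper are hypothetical names, the credentials are placeholders, and the only rules taken from the table are the defaults (port 6667, pool size 50), the `Enabled` note, and the requirement that the database name starts with root. The port range and minimum pool size checks are added assumptions.

```python
from dataclasses import dataclass


@dataclass
class ProviderSettings:
    """Hypothetical container mirroring the settings table."""
    host_name: str
    username: str
    password: str
    database_name: str
    port_number: int = 6667  # default listed in the table
    pool_size: int = 50      # default listed in the table
    enabled: bool = True     # the provider is usable only when true


def find_problems(s):
    """Return human-readable problems; an empty list means the values look sane."""
    problems = []
    if not s.host_name.strip():
        problems.append("Host Name is empty")
    if not 1 <= s.port_number <= 65535:
        problems.append("Port Number is outside 1-65535")
    if not s.database_name.startswith("root."):
        problems.append("Database Name should start with root, e.g. root.db")
    if s.pool_size < 1:
        problems.append("Pool Size must be at least 1")
    if not s.enabled:
        problems.append("Enabled is false, so the provider cannot be used")
    return problems


# Placeholder values for illustration only.
good = ProviderSettings("127.0.0.1", "user", "secret", "root.db")
bad = ProviderSettings("127.0.0.1", "user", "secret", "db", pool_size=0)
print(find_problems(good))
print(find_problems(bad))
```

Keeping the checks in a plain function makes it easy to extend them if a deployment has stricter rules, for example a fixed list of allowed hosts.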
-
-### Instructions
-
-#### Configure Historical Data Storage
-
-- After configuring the `Provider`, you can use the `IoTDB Tag Historian` in the `Designer`, just like any other `Provider`. Right-click the corresponding `Tag` and select `Edit Tag(s)`, then select the History category in the Tag Editor
-
-  ![](/img/ignition-7.png)
-
-- Set `History Enabled` to `true`, select the `Provider` created in the previous step as the `Storage Provider`, configure other parameters as needed, click `OK`, and then save the project. From this point on, data is continuously stored in the `IoTDB` instance according to these settings.
-
-  ![](/img/ignition-8.png)
-
-#### Read Data
-
-- You can directly select the tags stored in IoTDB under the Data tab of the Report
-
-  ![](/img/ignition-9.png)
-
-- You can also directly browse relevant data in Components
-
-  ![](/img/ignition-10.png)
-
-#### Script Module: Functions for Interacting with IoTDB
-
-1. system.iotdb.insert:
-
-
-- Script Description: Write data to an IoTDB instance
-
-- Script Definition:
-
-  `system.iotdb.insert(historian, deviceId, timestamps, measurementNames, measurementValues)`
-
-- Parameter:
-
-  - `str historian`: The name of the corresponding IoTDB Tag Historian Provider
-  - `str deviceId`: The deviceId to write to, excluding the configured database, such as Sine
-  - `long[] timestamps`: List of timestamps of the data points to write
-  - `str[] measurementNames`: List of names of the measurements to write
-  - `str[][] measurementValues`: The data point values to write, corresponding to the timestamp list and the measurement name list
-
-- Return Value: None
-
-- Available Range: Client, Designer, Gateway
-
-- Usage example:
-
-  ```python
-  system.iotdb.insert("IoTDB", "Sine", [system.date.now()],["measure1","measure2"],[["val1","val2"]])
-  ```
-
-2. system.iotdb.query:
-
-
-- Script Description: Query the data written to the IoTDB instance
-
-- Script Definition:
-
-  `system.iotdb.query(historian, sql)`
-
-- Parameter:
-
-  - `str historian`: The name of the corresponding IoTDB Tag Historian Provider
-  - `str sql`: SQL statement to be queried
-
-- Return Value:
-  Query Results: `List>`
-
-- Available Range: Client, Designer, Gateway
-
-- Usage example:
-
-  ```Python
-  system.iotdb.query("IoTDB", "select * from root.db.Sine where time > 1709563427247")
-  ```
-
-## Ignition-IoTDB With JDBC
-
-### Introduction
-
-Ignition-IoTDB With JDBC provides a JDBC driver that allows users to connect to and query IoTDB from Ignition using standard JDBC APIs
-
-### Installation Steps
-
-Step 1: Enter the `Configuration` - `Databases` - `Drivers` module and create the `Translator`
-
-![](/img/Ignition-IoTDBWithJDBC-1.png)
-
-Step 2: Enter the `Configuration` - `Databases` - `Drivers` module, create a `JDBC Driver`, select the `Translator` configured in the previous step, and upload the downloaded `IoTDB JDBC`. Set the Classname to `org.apache.iotdb.jdbc.IoTDBDriver`
-
-![](/img/Ignition-IoTDBWithJDBC-2.png)
-
-Step 3: Enter the `Configuration` - `Databases` - `Connections` module, create a new `Connection`, select the `IoTDB Driver` created in the previous step as the `JDBC Driver`, configure the relevant information, and save it
-
-![](/img/Ignition-IoTDBWithJDBC-3.png)
-
-### Instructions
-
-#### Data Writing
-
-Select the previously created `Connection` as the `Data Source` in the `Transaction Groups`
-
-- Set `Table name` to the complete device path starting from root
-- Uncheck `Automatically create table`
-- Set `Store timestamp to` to time
-
-Leave the other options unselected and set the fields. Once the group is `enabled`, the data is written to and stored in the corresponding IoTDB
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%86%99%E5%85%A5-1.png)
-
-#### Query
-
-- Select `Data Source` in the `Database Query Browser`, choose the previously created `Connection`, and write an SQL statement to query the data in IoTDB
-
-![](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2-ponz.png)
-
diff --git a/src/UserGuide/dev-1.3/IoTDB-Introduction/IoTDB-Introduction_timecho.md b/src/UserGuide/dev-1.3/IoTDB-Introduction/IoTDB-Introduction_timecho.md
deleted file mode 100644
index 4f7b6d570..000000000
--- a/src/UserGuide/dev-1.3/IoTDB-Introduction/IoTDB-Introduction_timecho.md
+++ /dev/null
@@ -1,267 +0,0 @@
-
-
-# What is TimechoDB
-
-TimechoDB is a low-cost, high-performance, IoT-native time-series database. It is the commercial product that Timecho provides on top of the Apache IoTDB community edition. It addresses the problems enterprises encounter when building IoT big data platforms to manage time-series data, such as complex application scenarios, large data volumes, high sampling frequencies, large amounts of unaligned data, long data processing times, diverse analysis requirements, and high storage and operation costs.
-
-Building on TimechoDB, Timecho provides a broader range of product features, stronger performance and stability, and a richer set of utility tools. It also offers comprehensive enterprise services, giving commercial customers more powerful product capabilities and a higher-quality development, operations, and usage experience.
-
-- Download, Deployment and Usage: [QuickStart](../QuickStart/QuickStart_timecho.md)
-
-
-## Product Components
-
-Timecho's products consist of several components that cover the entire time-series data lifecycle, from data collection and data management to data analysis and application, helping users efficiently manage and analyze the massive amounts of time-series data generated by the IoT.
-
-[Figure: Introduction-en-timecho-new.png]
- -1. **Time-series database (TimechoDB, a commercial product based on Apache IoTDB provided by the original team)**: The core component of time-series data storage, which can provide users with high-compression storage capabilities, rich time-series query capabilities, real-time stream processing capabilities, while also having high availability of data and high scalability of clusters, and providing security protection. At the same time, TimechoDB also provides users with a variety of application tools for easy management of the system; multi-language API and external system application integration capabilities, making it convenient for users to build applications based on TimechoDB. - -2. **Time-series data standard file format (Apache TsFile, led and contributed by core team members of Timecho)**: This file format is a storage format specifically designed for time-series data, which can efficiently store and query massive amounts of time-series data. Currently, the underlying storage files of Timecho's collection, storage, and intelligent analysis modules are all supported by Apache TsFile. TsFile can be efficiently loaded into TimechoDB and can also be migrated out. Through TsFile, users can use the same file format for data management in the stages of collection, management, application & analysis, greatly simplifying the entire process from data collection to analysis, and improving the efficiency and convenience of time-series data management. - -3. **Time-series model training and inference integrated engine (AINode)**: For intelligent analysis scenarios, TimechoDB provides the AINode time-series model training and inference integrated engine, which offers a complete set of time-series data analysis tools, with the underlying model training engine supporting training tasks and data management, including machine learning, deep learning, etc. With these tools, users can conduct in-depth analysis of the data stored in TimechoDB and mine its value. - -4. 
**Data collection**: To connect more easily with various industrial collection scenarios, Timecho provides data collection access services that support multiple protocols and formats and can ingest data generated by various sensors and devices. They also support features such as breakpoint resumption and network barrier penetration. These services are adapted to the difficult configuration, slow transmission, and weak networks typical of industrial field collection, making data collection simpler and more efficient for users.
-
-## Product Features
-
-TimechoDB has the following advantages and characteristics:
-
-- Flexible deployment methods: Supports one-click cloud deployment, out-of-the-box use on terminal devices after unzipping, and seamless terminal-to-cloud connection (data cloud synchronization tool).
-
-- Low hardware cost storage solution: Supports high-compression-ratio disk storage, with no need to distinguish between historical and real-time databases, providing unified data management.
-
-- Hierarchical sensor organization and management: Supports modeling in the system according to the actual hierarchical relationship of devices, aligning with the industrial sensor management structure, and supports directory viewing, search, and other capabilities for hierarchical structures.
-
-- High-throughput data reading and writing: Supports access by millions of devices, high-speed data reading and writing, and complex industrial read/write scenarios such as out-of-order, unaligned, and multi-frequency acquisition.
-
-- Rich time series query semantics: Provides a native computation engine for time series data, supports timestamp alignment during queries, offers nearly a hundred built-in aggregation and time series calculation functions, and supports time series feature analysis and AI capabilities.
- -- Highly available distributed system: Supports HA distributed architecture, the system provides 7*24 hours uninterrupted real-time database services, the failure of a physical node or network fault will not affect the normal operation of the system; supports the addition, deletion, or overheating of physical nodes, the system will automatically perform load balancing of computing/storage resources; supports heterogeneous environments, servers of different types and different performance can form a cluster, and the system will automatically load balance according to the configuration of the physical machine. - -- Extremely low usage and operation threshold: supports SQL like language, provides multi language native secondary development interface, and has a complete tool system such as console. - -- Rich ecological environment docking: Supports docking with big data ecosystem components such as Hadoop, Spark, and supports equipment management and visualization tools such as Grafana, Thingsboard, DataEase. - -## Enterprise characteristics - -### Higher level product features - -Based on Apache IoTDB, TimechoDB offers a range of advanced product features, with native upgrades and optimizations at the kernel level for industrial production scenarios. These include multi-level storage, cloud-edge collaboration, visualization tools, and security enhancements, allowing users to focus more on business development without worrying too much about underlying logic. This simplifies and enhances industrial production, bringing more economic benefits to enterprises. For example: - - -- Dual Active Deployment:Dual active usually refers to two independent single machines (or clusters) that perform real-time mirror synchronization. Their configurations are completely independent and can simultaneously receive external writes. 
Each independent single machine (or clusters) can synchronize the data written to itself to another single machine (or clusters), and the data of the two single machines (or clusters) can achieve final consistency. - -- Data Synchronisation:Through the built-in synchronization module of the database, data can be aggregated from the station to the center, supporting various scenarios such as full aggregation, partial aggregation, and hierarchical aggregation. It can support both real-time data synchronization and batch data synchronization modes. Simultaneously providing multiple built-in plugins to support requirements such as gateway penetration, encrypted transmission, and compressed transmission in enterprise data synchronization applications. - -- Tiered Storage:Multi level storage: By upgrading the underlying storage capacity, data can be divided into different levels such as cold, warm, and hot based on factors such as access frequency and data importance, and stored in different media (such as SSD, mechanical hard drive, cloud storage, etc.). At the same time, the system also performs data scheduling during the query process. Thereby reducing customer data storage costs while ensuring data access speed. - -- Security Enhancements: Features like whitelists and audit logs strengthen internal management and reduce the risk of data breaches. - -The detailed functional comparison is as follows: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
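-| Category | Function | Apache IoTDB | TimechoDB |
-| --- | --- | --- | --- |
-| Deployment Mode | Stand-Alone Deployment | √ | √ |
-| Deployment Mode | Distributed Deployment | √ | √ |
-| Deployment Mode | Dual Active Deployment | - | √ |
-| Deployment Mode | Container Deployment | Partial support | √ |
-| Database Functionality | Sensor Management | √ | √ |
-| Database Functionality | Write Data | √ | √ |
-| Database Functionality | Query Data | √ | √ |
-| Database Functionality | Continuous Query | √ | √ |
-| Database Functionality | Trigger | √ | √ |
-| Database Functionality | User Defined Function | √ | √ |
-| Database Functionality | Permission Management | √ | √ |
-| Database Functionality | Data Synchronisation | Only file synchronization, no built-in plugins | Real time synchronization+file synchronization, enriched with built-in plugins |
-| Database Functionality | Stream Processing | Only framework, no built-in plugins | Framework+rich built-in plugins |
-| Database Functionality | Tiered Storage | - | √ |
-| Database Functionality | View | - | √ |
-| Database Functionality | White List | - | √ |
-| Database Functionality | Audit Log | - | √ |
-| Supporting Tools | Workbench | - | √ |
-| Supporting Tools | Cluster Management Tool | - | √ |
-| Supporting Tools | System Monitor Tool | - | √ |
-| Localization | Localization Compatibility Certification | - | √ |
-| Technical Support | Expert Support | - | √ |
-| Technical Support | Use Training | - | √ |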
- -### More efficient/stable product performance - -TimechoDB has optimized stability and performance on the basis of the open source version. With technical support from the enterprise version, it can achieve more than 10 times performance improvement and has the performance advantage of timely fault recovery. - -### More User-Friendly Tool System - -TimechoDB will provide users with a simpler and more user-friendly tool system. Through products such as the Cluster Monitoring Panel (Grafana), Database Console (Workbench), and Cluster Management Tool (Deploy Tool, abbreviated as IoTD), it will help users quickly deploy, manage, and monitor database clusters, reduce the work/learning costs of operation and maintenance personnel, simplify database operation and maintenance work, and make the operation and maintenance process more convenient and efficient. - -- Cluster Monitoring Panel: Designed to address the monitoring issues of TimechoDB and its operating system, including operating system resource monitoring, TimechoDB performance monitoring, and hundreds of kernel monitoring indicators, to help users monitor the health status of the cluster and perform cluster tuning and operation. - -
-[Figures: Overall Overview; Operating System Resource Monitoring; TimechoDB Performance Monitoring]
- -- Database Console: Designed to provide a low threshold database interaction tool, it helps users perform metadata management, data addition, deletion, modification, query, permission management, system management, and other operations in a concise and clear manner through an interface console, simplifying the difficulty of database use and improving database efficiency. - - -
-[Figures: Home Page; Operate Metadata; SQL Query]
- - -- Cluster management tool: aimed at solving the operational difficulties of multi node distributed systems, mainly including cluster deployment, cluster start stop, elastic expansion, configuration updates, data export and other functions, so as to achieve one click instruction issuance for complex database clusters, greatly reducing management difficulty. - - -
- -### More professional enterprise technical services - -TimechoDB customers provide powerful original factory services, including but not limited to on-site installation and training, expert consultant consultation, on-site emergency assistance, software upgrades, online self-service, remote support, and guidance on using the latest development version. At the same time, in order to make TimechoDB more suitable for industrial production scenarios, we will recommend modeling solutions, optimize read-write performance, optimize compression ratios, recommend database configurations, and provide other technical support based on the actual data structure and read-write load of the enterprise. If encountering industrial customization scenarios that are not covered by some products, TimechoDB will provide customized development tools based on user characteristics. - -Compared to the open source version, TimechoDB provides a faster release frequency every 2-3 months. At the same time, it offers day level exclusive fixes for urgent customer issues to ensure stable production environments. - -### More compatible localization adaptation - -The TimechoDB code is self-developed and controllable, and is compatible with most mainstream information and creative products (CPU, operating system, etc.), and has completed compatibility certification with multiple manufacturers to ensure product compliance and security. \ No newline at end of file diff --git a/src/UserGuide/dev-1.3/IoTDB-Introduction/Release-history_timecho.md b/src/UserGuide/dev-1.3/IoTDB-Introduction/Release-history_timecho.md deleted file mode 100644 index b1826d6ce..000000000 --- a/src/UserGuide/dev-1.3/IoTDB-Introduction/Release-history_timecho.md +++ /dev/null @@ -1,391 +0,0 @@ - -# Release History - -## TimechoDB (Database Core) - -### V1.3.7.3 - -> Release Date: 2026.06.02
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.7.3-bin.zip
-> SHA512 Checksum: 8e6cde061421a552b9855f39f9cccd4838c820dc15ef0ad2a7c23a54cd6cc4f06c35190c1f428784e6a4d5463dd1b794f58ff5cdf891f27f6d0be4d3ab00bf6f - -V1.3.7.3 primarily optimizes query module and data synchronization capabilities, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- Query Module: Optimized `Last` queries, aligned series queries, reverse-order time filter queries, and other scenarios. -- Metadata Module: Optimized device creation validation for activated series and their child paths. -- Data Synchronization: Optimized the retry mechanism after synchronization failures. -- Data Synchronization: Cross-network-gateway synchronization plugin supports configuring the real-time write transmission timeout. -- Interface Module: Added error code validation to the Go client write interface. -- Interface Module: Optimized C# client connection pool management. - - -### V1.3.7.2 - -> Release Date: 2026.04.07
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.7.2-bin.zip
-> SHA512 Checksum: 787766af64992069f0db0ac8b250b461d799307b3ce06b0782fc25752c8c5307fa2205c9e3a38a41685b81bb6b4b5c1ec9f71a395bfad285caf90de7b8224783 - -V1.3.7.2 primarily optimizes data synchronization and query module capabilities, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- Data Synchronization: Optimized distribution performance for Pipe complex path matching scenarios. -- Query Module: The `SHOW QUERIES` statement now includes client IP, query timeout, server wait time, and other information. -- Ecosystem Integration: Supports IoTDB pushing data to an external OPC Server in OPC Client mode. - - -### V1.3.6.6 - -> Release Date: 2026.01.20
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.6-bin.zip
-> SHA512 Checksum: 590d3ead053298c6df0ede637572ba598b9b684f8b35ab874bd4452f765e1421938f4cca2cf0423af2e806592aa8b15bdd25b41df7de809435a4d0239fc04790 - -V1.3.6.6 enhances data read/write capabilities, resolves several product defects, and delivers comprehensive improvements in database monitoring, performance, and stability. - - -### V1.3.6.3 - -> Release Date: 2026.01.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.3-bin.zip
-> SHA512 Checksum: 43719a1384f59f63cb0029cdda0aba433383cd1a0f5ebc142e54f8aa6623cc30a7efb3e3aef7f3d485d5e07bec91be215c92ed21b5201613d5cc44044251c978 - -V1.3.6.3 focuses on deep optimizations in two core areas—query performance and memory management—while comprehensively enhancing database monitoring, performance, and stability. Specific release contents are as follows: - -* **Query Module**: Optimized query performance across multiple scenarios, including multi-series `Last` queries. -* **Query Module**: Added a new `FastLastQuery` interface in the Java SDK for more efficient `Last` query operations. -* **Query Module**: Modified the tree model’s `fetchSchema` to return results in segmented streaming mode, improving response speed under large-data-volume conditions. -* **Storage Module**: Enhanced memory management to mitigate memory leak risks and ensure long-term system stability. -* **Storage Module**: Optimized the file compaction mechanism to improve compaction efficiency and reduce storage resource consumption. -* **Others**: Fixed security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226. - -### V1.3.6.1 - -> Release Date: 2025.12.09
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.1-bin.zip
-> SHA512 Checksum: 9fb6a6870aa2133bfc40508324a7d97ee078d0d44895beef7b0a331edd203419119fb02b933f585b6c4a6fe9b59708a053d7cf65206b22b1a4f01a5fe518424c - -V1.3.6.1 focuses on deep optimization of data synchronization stability, while delivering comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Data Synchronization**: Enhanced Pipe SQL parameter configuration to support specifying asynchronous loading methods. -* **Data Synchronization**: Introduced syntactic sugar that automatically splits full-data Pipe creation SQL into real-time and historical synchronization components. -* **System Module**: Added a global configuration option for data-type-specific compression strategies, enabling on-demand adjustment of storage compression policies. - -### V1.3.5.11 - -> Release Date: 2025.09.24
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.11-bin.zip
-> SHA512 Checksum: f18419e20c0d7e9316febee5a053306a97268cb07e18e6933716c2ef98520fbbe051dfa1da02a9c83e8481a839ce35525ce6c50f890f821e3d760f550c75f804 - -V1.3.5.11 version primarily optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -### V1.3.5.10 - -> Release Date: 2025.08.27
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.10-bin.zip
-> SHA512 Checksum: 3aea6d2318f52b39bfb86dae9ff06fe1b719fdeceaabb39278c9a73544e1ceaf0660339f9342abb888c8281a0fb6144179dac9bb0c40ba0ecc66bac4dd7cbe80 - -V1.3.5.10 version fixes certain product defects and includes comprehensive enhancements to database monitoring, performance, and stability. - -### V1.3.5.9 - -> Release Date: 2025.08.25
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.9-bin.zip
-> SHA512 Checksum: 95b7a6790e94dc88e355a81e5a54b10ee87bdadae69ba0b215273967b3422178d5ee81fa5adf1c5380a67dbb30cf9782eaa3cbfd6ec744b0fd9a91c983ee8f70 - -V1.3.5.9 version optimizes memory control, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -### 1.x Other historical versions - -#### V1.3.5.8 - -> Release Date: 2025.08.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.8-bin.zip
-> SHA512 Checksum: aa9802301614e20294a7f2fc4c149ba20d58213d9b74e8f8c607e0f4860949bad164bce2851b63c1d39b7568d62975ab257c269b3a9c168a29ea3945b6d28982 - -V1.3.5.8 version optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -#### V1.3.5.7 - -> Release Date: 2025.08.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.7-bin.zip
-> SHA512 Checksum: 17374a440267aed3507dcc8cf4dc8703f8136d5af30d16206a6e1101e378cbbc50eda340b1598a12df35fe87d96db20f7802f0e64033a013d4b81499198663d4 - -V1.3.5.7 version optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -#### V1.3.5.6 - -> Release Date: 2025.07.16
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.6-bin.zip
-> SHA512 Checksum: 05b9fda4d98ba8a1c9313c0831362ed3d667ce07cb00acaeabcf6441a6d67dff7da27f3fda2a5e1b3c3b85d1e5c730a534f3aa2f0c731b8c03ef447203b32493 - -V1.3.5.6 introduces a new configuration switch to disable the data subscription feature. It optimizes the C++ high-availability client and addresses PIPE synchronization latency issues in normal operation, restart, and deletion scenarios, along with query performance for large TEXT objects. Comprehensive enhancements to database monitoring, performance, and stability are also included. - -#### V1.3.5.4 - -> Release Date: 2025.06.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.4-bin.zip
-> SHA512 Checksum: edac5f8b70dd67b3f84d3e693dc025a10b41565143afa15fc0c4937f8207479ffe2da787cc9384440262b1b05748c23411373c08606c6e354ea3dcdba0371778 - -V1.3.5.4 fixes several product defects and optimizes the node removal functionality. It also delivers comprehensive improvements to database monitoring, performance, and stability. - -#### V1.3.5.3 - -> Release Date: 2025.06.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.3-bin.zip
-> SHA512 Checksum: 5f807322ceec9e63a6be86108cc57e7ad4251b99a6c28baf11256ab65b2145768e9110409f89834d5f4256094a8ad995775c0e59a17224ff2627cd9354e09d82 - -V1.3.5.3 focuses on optimizing data synchronization capabilities, including persisting PIPE transmission progress and adding monitoring metrics for PIPE event transfer time. Related defects have been resolved. Additionally, the encryption algorithm for user passwords has been upgraded to SHA-256. Comprehensive enhancements to database monitoring, performance, and stability are included. - -#### V1.3.5.2 - -> Release Date: 2025.06.10
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.2-bin.zip
-> SHA512 Checksum: 4c0a5db76c6045dfd27cce303546155cdb402318024dae5f999f596000d7b038b13bbeac39068331b5c6e2c80bc1d89cd346dd0be566fe2fe865007d441d9d05 - -V1.3.5.2 primarily optimizes data synchronization features, adding support for cascading configurations via parameters and ensuring fully consistent ordering between synchronized and real-time writes. It also enables partitioned sending of historical and real-time data after system restarts. Comprehensive enhancements to database monitoring, performance, and stability are included. - -#### V1.3.5.1 - -> Release Date: 2025.05.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.1-bin.zip
-> SHA512 Checksum: 91f22bafbdd4d580126ed59ba1ba99d14209f10ce4a0a4bd7d731943ac99fdb6ebfab6e3a1e294a7cb7f46367e9fd4252b0d9ac4d4240ddedf6d85658e48f212 - -V1.3.5.1 resolves several product defects and delivers comprehensive improvements to database monitoring, performance, and stability. - -#### V1.3.4.2 - -> Release Date: 2025.04.14
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.4.2-bin.zip
-> SHA512 Checksum: 52fbd79f5e7256e7d04edc8f640bb8d918e837fedd1e64642beb2b2b25e3525b5f5a4c92235f88f6f7b59bfcdf096e4ea52ab85bfef0b69274334470017a2c5b - -V1.3.4.2 enhances the data synchronization function by supporting bi-directional active-active synchronization of data forwarded through external PIPE sources. - -#### V1.3.4.1 - -> Release Date: 2025.01.08
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.4.1-bin.zip
-> SHA512 Checksum: e9d46516f1f25732a93cc915041a8e59bca77cf8a1018c89d18ed29598540c9f2bdf1ffae9029c87425cecd9ecb5ebebea0334c7e23af11e28d78621d4a78148 - -V1.3.4.1 introduces pattern matching functions, continuously optimizes the data subscription mechanism, improves stability, and extends import-data/export-data scripts to support new data types while unifying TsFile, CSV and SQL import/export formats. Comprehensive improvements have been made to database monitoring, performance and stability. Key updates: - -* Query Module: Configurable URI-based JAR loading for UDFs, PipePlugins, Triggers and AINodes -* System Module: Extended UDF functionality with new pattern\_match function -* Data Sync: Supports specifying authentication info at sender -* Ecosystem: Kubernetes Operator support -* Scripts: import-data/export-data now supports strings, BLOBs, dates and timestamps -* Scripts: Unified import/export support for TsFile, CSV and SQL formats - -#### V1.3.3.3 - -> Release Date: 2024.10.31
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.3-bin.zip
-> SHA512 Checksum: 4a3eceda479db3980e9c8058628e71ba5a16fbfccf70894e8181aea5e014c7b89988d0093f6d42df29d478340a33878602a3924bec13f442a48611cec4e0e961 - -V1.3.3.3 improves restart recovery performance, enables DataNodes to actively monitor/load TsFiles with observability metrics, supports automatic loading at receivers when senders transfer files to specified directories, and adds Alter Source capability for Pipes. Comprehensive improvements to monitoring, performance and stability include: - -* Data Sync: Automatic type conversion for inconsistent data at receivers -* Data Sync: Enhanced observability with ops/latency metrics for internal APIs -* Data Sync: OPC-UA sink plugin supports CS mode and non-anonymous access -* Subscription: SDK supports create\_if\_not\_exists and drop\_if\_exists APIs -* Stream Processing: Alter Pipe supports Alter Source -* System: Added latency monitoring for REST module -* Scripts: Auto-loading TsFiles from specified directories -* Scripts: import-tsfile supports remote server execution -* Scripts: Kubernetes Helm support -* Scripts: Python client supports new data types (string, BLOB, date, timestamp) - -#### V1.3.3.2 - -> Release Date: 2024.08.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.2-bin.zip
-> SHA512 Checksum: 32733610da40aa965e5e9263a869d6e315c5673feaefad43b61749afcf534926398209d9ca7fff866c09deb92c09d950c583cea84be5a6aa2c315e1c7e8cfb74 - -V1.3.3.2 adds metrics for mods file reading time, merge sort memory usage and dispatch latency, supports configurable time partition origin adjustment, enables automatic subscription termination based on pipe completion markers, and improves merge memory control. Key updates: - -* Query: Explain Analyze shows mods file read time -* Query: Explain Analyze shows merge sort memory and dispatch latency -* Storage: Added configurable file splitting during compaction -* System: Configurable time partition origin -* Stream Processing: Auto-terminate subscriptions on pipe completion markers -* Data Sync: Configurable RPC compression levels -* Scripts: Export filters only root.\_\_system paths - -#### V1.3.3.1 - -> Release Date: 2024.07.12
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.1-bin.zip
-> SHA512 Checksum: 1fdffbc1f18bfabfa3463a5a6fbc4f6ba6ab686942f9e85e7e6be1840fb8700e0147e5e73fd52201656ae6adb572cc2e5ecc61bcad6fa4c5a4048c4207e3c6c0 - -V1.3.3.1 adds tiered storage throttling, supports username/password auth specification at sync senders, optimizes ambiguous WARN logs at receivers, improves restart performance, and merges configuration files. Key updates: - -* Query: Optimized Filter performance for faster aggregation/WHERE queries -* Query: Java Session evenly distributes SQL requests across nodes -* System: Merged config files into iotdb-system.properties -* Storage: Added tiered storage throttling -* Data Sync: Username/password auth specification at senders -* System: Optimized restart recovery time - -#### V1.3.2.2 - -> Release Date: 2024.06.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.2.2-bin.zip
-> SHA512 Checksum: ad73212a0b5025d18d2481163f6b2d4f604e06eb5e391cc6cba7bf4e42792e115b527ed8bfb5cd95d20a150645c8b4d56a531889dac229ce0f63139a27267322 - -V1.3.2.2 introduces EXPLAIN ANALYZE for SQL profiling, UDAF framework, automatic data deletion at disk thresholds, metadata sync, path-specific data point counting, and SQL import/export scripts. Supports rolling cluster upgrades and cluster-wide plugin distribution with comprehensive monitoring/performance improvements. Key updates: - -* Storage: Improved insertRecords performance -* Storage: SpaceTL feature for auto-deletion at disk thresholds -* Query: EXPLAIN ANALYZE for SQL stage-level profiling -* Query: New UDAF framework -* Query: New envelope demodulation analysis in UDFs -* Query: MaxBy/MinBy functions returning timestamps with values -* Query: Faster value-filtered queries -* Data Sync: Wildcard path matching -* Data Sync: Metadata synchronization (including attributes/permissions) -* Stream Processing: ALTER PIPE for hot plugin updates -* System: TsFile load statistics in data point counting -* Scripts: Local upgrade/backup via hard links -* Scripts: New export-data/import-data for CSV/TsFile/SQL formats -* Scripts: Windows window title differentiation for ConfigNode/DataNode/Cli - -#### V1.3.1.4 - -> Release Date: 2024.04.23
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.1.4-bin.zip
-> SHA512 Checksum: 8547702061d52e2707c750a624730eb2d9b605b60661efa3c8f11611ca1685aeb51b6f8a93f94c1b30bf2e8764139489c9fbb76cf598cfa8bf9c874b2a7c57eb - -V1.3.1.4 adds cluster activation status viewing, variance/stddev aggregation functions, FILL timeout settings, TsFile repair command, one-click info collection scripts, and cluster control scripts while optimizing views and stream processing. Key updates: - -* Query: FILL clause timeout threshold -* Query: REST V2 returns column types -* Data Sync: Simplified time range specification -* Data Sync: SSL support (iotdb-thrift-ssl-sink) -* System: SQL query for cluster activation status -* System: Tiered storage transfer rate control -* System: Enhanced observability (node divergence, task scheduling) -* System: Optimized default logging -* Scripts: One-click cluster control scripts (start-all/stop-all) -* Scripts: One-click info collection scripts (collect-info) - -#### V1.3.0.4 - -> Release Date: 2024.01.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.0.4-bin.zip
-> SHA512 Checksum: 3c07798f37c07e776e5cd24f758e8aaa563a2aae0fb820dad5ebf565ad8a76c765b896d44e7fdb7dad2e46ffd4262af901c765f9bf6af926bc62103118e38951 - -V1.3.0.4 introduces the AINode machine learning framework, upgrades permission granularity to time-series level, and optimizes views/stream processing for better usability and stability. Key updates: - -* Query: New AINode ML framework -* Query: Fixed slow SHOW PATH responses -* Security: Time-series granular permissions -* Security: SSL client-server encryption -* Stream Processing: New metrics monitoring -* Query: LAST queries on non-writable views -* System: Improved data point counting accuracy - -#### V1.2.0.1 - -> Release Date: 2023.06.30
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.2.0.1-bin.zip
-> SHA512 Checksum: dcf910d0c047d148a6c52fa9ee03a4d6bc3ff2a102dc31c0864695a25268ae933a274b093e5f3121689063544d7c6b3b635e5e87ae6408072e8705b3c4e20bf0 - -V1.2.0.1 introduces stream processing framework, dynamic templates, substring/replace/round functions, enhances SHOW REGION/TIMESERIES/VARIABLE statements and Session APIs while optimizing monitoring metrics. Key updates: - -* Stream Processing: New framework -* Metadata: Dynamic template expansion -* Storage: New SPRINTZ/RLBE encoding and LZMA2 compression -* Query: New CAST, ROUND, SUBSTR, REPLACE functions -* Query: New TIME\_DURATION, MODE aggregation -* Query: CASE WHEN syntax support -* Query: ORDER BY expression support -* Interface: Python API multi-node connection -* Interface: Python client write redirection -* Interface: Batch sequence creation via templates - -#### V1.1.0.1 - -> Release Date: 2023.04.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.1.0.1.zip
-> SHA512 Checksum: 58df58fc8b11afeec8436678842210ec092ac32f6308656d5356b7819acc199f1aec4b531635976b091b61d6736f0d9706badcabeaa5de50939e5c331c1dc804 - -V1.1.0.1 introduces GROUP BY VARIATION/CONDITION, DIFF/COUNT\_IF functions, and pipeline execution engine while fixing issues including: - -* Aligned sequence LAST queries with ORDER BY TIMESERIES -* LIMIT & OFFSET failures -* Post-restart metadata template errors -* Sequence creation after database deletion - -Key updates: - -* Query: ALIGN BY DEVICE supports ORDER BY TIME -* Query: SHOW QUERIES/KILL QUERY commands -* System: SHOW REGIONS per database -* System: SHOW VARIABLES for cluster parameters -* Query: GROUP BY VARIATION/CONDITION -* Query: SELECT INTO type casting -* Query: New DIFF (scalar), COUNT\_IF (aggregate) -* System: SHOW REGIONS creation time -* System: Configurable dn\_rpc\_port/address - -## Workbench (Console Tool) - - -| **Version** | **Description** | **Supported IoTDB Versions** | **SHA512 checksum** | -| ----------- | ------------------------------------------------------------ | ----------------------------------- | ------------------------------------------------------------ | -| V1.5.7 | Optimize the point list by splitting point names into device names and points, ensure the point selection area supports horizontal scrolling, and align the export file column order with the page display. | All 1.x versions from V1.3.4 onward | d3cd4a63372ca5d6217b67dddf661980c6a442b3b1564235e9ad34fc254d681febd58c2cc59c6273ffbfd8a1b003b9adb130ecfaaebe1942003b0d07427b1fcc | -| V1.5.6 | Enhanced CSV import/export: optional tags/aliases on import; support for measurement descriptions with backtick-quoted quotes on export. 
| All 1.x versions from V1.3.4 onward | 276ac1ea341f468bf6d29489c9109e9aa61afe2d1caaab577bc40603c6f4120efccc36b65a58a29ce6a266c21b46837aad6128f84ba5e676231ea9e6284a35e5 | -| V1.5.5 | Added server clock functionality and support for activating Enterprise Edition license databases | All 1.x versions from V1.3.4 onward | b18d01b70908d503a25866d1cc69d14e024d5b10ca6fcc536932fdbef8257c66e53204663ce3be5548479911aca238645be79dfd7ee7e65a07ab3c0f68c497f6 | -| V1.5.4 | Added authentication for Prometheus settings in Instance Management | All 1.x versions from V1.3.4 onward | adc7e13576913f9e43a9671fed02911983888da57be98ec8fbbb2593600d310f69619d32b22b569520c88e29f100d7ccae995b20eba757dbb1b2825655719335 | -| V1.5.1 | Added AI analysis and pattern matching | All 1.x versions from V1.3.2 onward | 4f2053a2a3b2b255ce195268d6cd245278f3be32ba4cf68be1552c386d78ed4424f7bdc9d8e68c6b8260b3e398c8fd23ff342439c4e88e1e777c62640d2279f9 | -| V1.4.0 | Added tree model display and English UI | All 1.x versions from V1.3.2 onward | 734077f3bb5e1719d20b319d8b554ce30718c935cb0451e02b2c9267ff770e9c2d63b958222f314f16c2e6e62bf78b643255249b574ee6f37d00e123433981e8 | -| V1.3.1 | Enhanced analysis methods and import templates | All 1.x versions from V1.3.2 onward | 134f87101cc7f159f8a22ac976ad2a3a295c5435058ee0a15160892aac46ac61dd3cfb0633b4aea9cc7415bf904d0ae65aaf77d663f027d864204d81fb34768b | -| V1.3.0 | Added DB configuration and UI refinements | All 1.x versions from V1.3.2 onward | 94a137fc5c681b211f3e076472a9c5875d59e7f0cd6d7409cb8f66bb9e4f87577a0f12dd500e2bcb99a435860c82183e4a6514b638bcb4aecfb48f184730f3f1 | -| V1.2.6 | Optimized permission controls | All 1.x versions from V1.3.1 onward | f345b7edcbe245a561cb94ec2e4f4d40731fe205f134acadf5e391e5874c5c2477d9f75f15dbaf36c3a7cb6506823ac6fbc2a0ccce484b7c4cc71ec0fbdd9901 | -| V1.2.5 | Added "Common Templates" and caching | All 1.x versions from V1.3.0 onward | 
37376b6cfbef7df8496e255fc33627de01bd68f636e50b573ed3940906b6f3da1e8e8b25260262293b8589718f5a72180fa15e5823437bf6dc51ed7da0c583f7 | -| V1.2.4 | Added import/export for calculations, time alignment field | All 1.x versions from V1.2.2 onward | 061ad1add38c109c1a90b06f1ddb7797bd45e84a34a4f77154ee48b90bdc7ecccc1e25eaa53fbbc98170d99facca93e3536192dd8d10a50ce505f59923ce6186 | -| V1.2.3 | Added activation details and analysis features | All 1.x versions from V1.2.2 onward | 254f5b7451300f6f99937d27fd7a5b20847d5293f53e0eaf045ac9235c7ea011785716b800014645ed5d2161078b37e1d04f3c59589c976614fb801c4da982e1 | -| V1.2.2 | Optimized point description display | All 1.x versions from V1.2.2 onward | 062e520d010082be852d6db0e2a3aa6de594eb26aeb608da28a212726e378cd4ea30fca5e1d2c3231ebd8de29e94ca9641f1fabc1cea46acfb650c37b7681b4e | -| V1.2.1 | Added sync monitoring panel, Prometheus hints | All 1.x versions from V1.2.2 onward | 8a3bcf87982ad5004528829b121f2d3945429deb77069917a42a8c8d2e2e2a2c24a398aaa87003920eeacc0c692f1ed39eac52a696887aa085cce011f0ddd745 | -| V1.2.0 | Major Workbench upgrade | All 1.x versions from V1.2.0 onward | ea1f7d3a4c0c6476a195479e69bbd3b3a2da08b5b2bb70b0a4aba988a28b5db5a209d4e2c697eb8095dfdf130e29f61f2ddf58c5b51d002c8d4c65cfc13106b3 | diff --git a/src/UserGuide/dev-1.3/QuickStart/QuickStart_timecho.md b/src/UserGuide/dev-1.3/QuickStart/QuickStart_timecho.md deleted file mode 100644 index 632545c6a..000000000 --- a/src/UserGuide/dev-1.3/QuickStart/QuickStart_timecho.md +++ /dev/null @@ -1,106 +0,0 @@ - -# Quick Start - -This document will help you understand how to quickly get started with IoTDB. - -## How to install and deploy? - -This document will help you quickly install and deploy IoTDB. You can quickly locate the content you need to view through the following document links: - -1. Prepare the necessary machine resources: The deployment and operation of IoTDB require consideration of multiple aspects of machine resource configuration. 
For specific resource allocation, see [Database Resources](../Deployment-and-Maintenance/Database-Resources.md) - -2. Complete system configuration preparation: The system configuration of IoTDB involves multiple aspects. For an introduction to the key settings, see [System Requirements](../Deployment-and-Maintenance/Environment-Requirements.md) - -3. Get the installation package: Contact Timecho Business to obtain the IoTDB installation package and ensure that you receive the latest stable version. For the structure of the installation package, see [Obtain TimechoDB](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) - -4. Install and activate the database: Choose one of the following tutorials based on your actual deployment architecture: - - - Stand-Alone Deployment: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - - - Distributed (Cluster) Deployment: [Distributed(Cluster) Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - - - Dual Active Deployment: [Dual Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -> ❗️Attention: Currently, we still recommend installing and deploying directly on physical/virtual machines. If Docker deployment is required, please refer to: [Docker Deployment](../Deployment-and-Maintenance/Docker-Deployment_timecho.md) - -5. Install database supporting tools: The enterprise edition provides supporting tools such as the monitoring panel and Workbench. It is recommended to install them when deploying the enterprise edition, as they help you use IoTDB more conveniently: - - - Monitoring panel: Provides over a hundred database monitoring metrics for detailed monitoring of IoTDB and its operating system, enabling system optimization, performance optimization, bottleneck discovery, and more. 
For installation steps, see [Monitoring panel](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - Workbench: The visual interface of IoTDB. It supports metadata operations, data querying, data visualization, and other functions through an interactive interface, helping users use the database easily and efficiently. For installation steps, see [Workbench Deployment](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - -## How to use it? - -1. Database modeling design: Database modeling is an important step in creating a database system. It involves designing the structure and relationships of data to ensure that the organization of data meets the specific application requirements. The following documents will help you quickly understand the modeling design of IoTDB: - - - Introduction to the concept of time series: [Navigating Time Series Data](../Basic-Concept/Navigating_Time_Series_Data.md) - - - Introduction to Modeling Design: [Data Model](../Basic-Concept/Data-Model-and-Terminology.md) - - - SQL syntax introduction: [Operate Metadata](../Basic-Concept/Operate-Metadata_timecho.md) - -2. Write Data: IoTDB provides multiple ways to insert real-time data. For the basic data writing operations, see [Write Data](../Basic-Concept/Write-Data) - -3. Query Data: IoTDB provides rich data query functions. For a basic introduction to data queries, see [Query Data](../Basic-Concept/Query-Data.md) - -4. 
Other advanced features: In addition to common functions such as writing and querying data, IoTDB also supports Data Synchronisation, Stream Framework, Security Management, Database Administration, AI Capability, and other functions. For specific usage, see the corresponding documents: - - - Data Synchronisation: [Data Synchronisation](../User-Manual/Data-Sync_timecho.md) - - - Stream Framework: [Stream Framework](../User-Manual/Streaming_timecho.md) - - - Security Management: [Security Management](../User-Manual/White-List_timecho.md) - - - Database Administration: [Database Administration](../User-Manual/Authority-Management.md) - - - AI Capability: [AI Capability](../AI-capability/AINode_timecho.md) - -5. API: IoTDB provides multiple application programming interfaces (APIs) for developers to interact with IoTDB in their applications. It currently supports the [Java Native API](../API/Programming-Java-Native-API.md), [Python Native API](../API/Programming-Python-Native-API.md), [C++ Native API](../API/Programming-Cpp-Native-API.md), and [Go Native API](../API/Programming-Go-Native-API.md). For more APIs, please refer to the API chapters on the official website. - -## What other convenient tools are available? - -In addition to its rich features, IoTDB also has a comprehensive range of peripheral tools. This document will help you quickly get started with them: - - - Workbench: Workbench is a visual interface for IoTDB that supports interactive operations. It offers intuitive features for metadata management, data querying, and data visualization, enhancing the convenience and efficiency of user database operations. 
For detailed usage instructions, please refer to: [Workbench](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - - - Monitor Tool: This is a tool for meticulous monitoring of IoTDB and its host operating system, covering hundreds of database monitoring metrics including database performance and system resources, which aids in system optimization and bottleneck identification. For detailed usage instructions, please refer to: [Monitor Tool](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - Benchmark Tool: IoT benchmark is a time series database benchmark testing tool developed based on Java and big data environments, developed and open sourced by the School of Software at Tsinghua University. It supports multiple writing and querying methods, can store test information and results for further query or analysis, and supports integration with Tableau to visualize test results. For specific usage instructions, please refer to: [Benchmark Tool](../Tools-System/Benchmark.md) - - - Data Import Script: For different scenarios, IoTDB provides users with multiple ways to batch import data. For specific usage instructions, please refer to: [Data Import](../Tools-System/Data-Import-Tool.md) - - - Data Export Script: For different scenarios, IoTDB provides users with multiple ways to batch export data. For specific usage instructions, please refer to: [Data Export](../Tools-System/Data-Export-Tool.md) - -## Want to Learn More About the Technical Details? - -If you are interested in delving deeper into the technical aspects of IoTDB, you can refer to the following documents: - - - Research Paper: IoTDB features columnar storage, data encoding, pre-calculation, and indexing technologies, along with a SQL-like interface and high-performance data processing capabilities. It also integrates seamlessly with Apache Hadoop, MapReduce, and Apache Spark. 
For related research papers, please refer to: [Research Paper](../Technical-Insider/Publication.md) - - - Compression & Encoding: IoTDB optimizes storage efficiency for different data types through a variety of encoding and compression techniques. To learn more, please refer to: [Compression & Encoding](../Technical-Insider/Encoding-and-Compression.md) - - - Data Partitioning and Load Balancing: IoTDB has meticulously designed data partitioning strategies and load balancing algorithms based on the characteristics of time series data, enhancing the availability and performance of the cluster. For more information, please refer to: [Data Partitioning & Load Balancing](../Technical-Insider/Cluster-data-partitioning.md) - - -## Encountering problems during use? - -If you encounter difficulties during installation or use, see [Frequently Asked Questions](../FAQ/Frequently-asked-questions.md). \ No newline at end of file diff --git a/src/UserGuide/dev-1.3/Reference/DataNode-Config-Manual_timecho.md b/src/UserGuide/dev-1.3/Reference/DataNode-Config-Manual_timecho.md deleted file mode 100644 index 3ce17bebe..000000000 --- a/src/UserGuide/dev-1.3/Reference/DataNode-Config-Manual_timecho.md +++ /dev/null @@ -1,584 +0,0 @@ - - -# DataNode Configuration Parameters - -IoTDB DataNode and the Standalone version use the same configuration files, all located in the `conf` directory. - -* `datanode-env.sh/bat`: Environment configuration, where the memory allocation of DataNode and Standalone can be set. - -* `iotdb-system.properties`: IoTDB system configuration. - -## Hot Modification Configuration - -For convenience, IoTDB supports hot modification: some configuration parameters in `iotdb-system.properties` can be modified while the system is running and take effect immediately. -In the parameters described below, those whose `Effective` value is `hot-load` support hot modification. 
- -Trigger way: The client sends the SQL command `load configuration` or `set configuration` to the IoTDB server. - -## Environment Configuration File(datanode-env.sh/bat) - -The environment configuration file is mainly used to configure Java environment parameters for the running DataNode, such as JVM settings. This part of the configuration is passed to the JVM when the DataNode starts. - -The details of each parameter are as follows: - -* MEMORY\_SIZE - -|Name|MEMORY\_SIZE| -|:---:|:---| -|Description|The minimum heap memory size that IoTDB DataNode will use at startup | -|Type|String| -|Default| The default is half of the memory.| -|Effective|After restarting system| - -* ON\_HEAP\_MEMORY - -|Name|ON\_HEAP\_MEMORY| -|:---:|:---| -|Description|The heap memory size that IoTDB DataNode can use. Former name: MAX\_HEAP\_SIZE | -|Type|String| -|Default| Calculated based on MEMORY\_SIZE.| -|Effective|After restarting system| - -* OFF\_HEAP\_MEMORY - -|Name|OFF\_HEAP\_MEMORY| -|:---:|:---| -|Description|The direct memory that IoTDB DataNode can use. Former name: MAX\_DIRECT\_MEMORY\_SIZE| -|Type|String| -|Default| Calculated based on MEMORY\_SIZE.| -|Effective|After restarting system| - -* JMX\_LOCAL - -|Name|JMX\_LOCAL| -|:---:|:---| -|Description|JMX monitoring mode. Set to true to allow only local monitoring, or false to allow remote monitoring| -|Type|Enum String: "true", "false"| -|Default|true| -|Effective|After restarting system| - -* JMX\_PORT - -|Name|JMX\_PORT| -|:---:|:---| -|Description|JMX listening port. Please confirm that the port is not a system reserved port and is not occupied| -|Type|Short Int: [0,65535]| -|Default|31999| -|Effective|After restarting system| - -* JMX\_IP - -|Name|JMX\_IP| -|:---:|:---| -|Description|JMX listening address. Only takes effect if JMX\_LOCAL=false. 
0.0.0.0 is never allowed| -|Type|String| -|Default|127.0.0.1| -|Effective|After restarting system| - -## JMX Authorization - -We **STRONGLY RECOMMEND** that you CHANGE the PASSWORD for the JMX remote connection. - -The usernames and passwords are in ${IOTDB\_CONF}/conf/jmx.password. - -The permission definitions are in ${IOTDB\_CONF}/conf/jmx.access. - -## DataNode/Standalone Configuration File (iotdb-system.properties) - -### Data Node RPC Configuration - -* dn\_rpc\_address - -|Name| dn\_rpc\_address | -|:---:|:-----------------------------------------------| -|Description| The address that the client RPC service listens on. | -|Type| String | -|Default| 0.0.0.0 | -|Effective| After restarting system | - -* dn\_rpc\_port - -|Name| dn\_rpc\_port | -|:---:|:---| -|Description| The port that the client RPC service listens on.| -|Type|Short Int : [0,65535]| -|Default| 6667 | -|Effective|After restarting system| - -* dn\_internal\_address - -|Name| dn\_internal\_address | -|:---:|:---| -|Description| DataNode internal service host/IP | -|Type| string | -|Default| 127.0.0.1 | -|Effective|Only allowed to be modified in first start up| - -* dn\_internal\_port - -|Name| dn\_internal\_port | -|:---:|:-------------------------------| -|Description| DataNode internal service port | -|Type| int | -|Default| 10730 | -|Effective| Only allowed to be modified in first start up | - -* dn\_mpp\_data\_exchange\_port - -|Name| dn\_mpp\_data\_exchange\_port | -|:---:|:---| -|Description| MPP data exchange port | -|Type| int | -|Default| 10740 | -|Effective|Only allowed to be modified in first start up| - -* dn\_schema\_region\_consensus\_port - -|Name| dn\_schema\_region\_consensus\_port | -|:---:|:---| -|Description| DataNode Schema replica communication port for consensus | -|Type| int | -|Default| 10750 | -|Effective|Only allowed to be modified in first start up| - -* dn\_data\_region\_consensus\_port - -|Name| dn\_data\_region\_consensus\_port | -|:---:|:---| -|Description| DataNode Data replica 
communication port for consensus | -|Type| int | -|Default| 10760 | -|Effective|Only allowed to be modified in first start up| - -* dn\_join\_cluster\_retry\_interval\_ms - -|Name| dn\_join\_cluster\_retry\_interval\_ms | -|:---:|:--------------------------------------------------------------------------| -|Description| The time of data node waiting for the next retry to join into the cluster | -|Type| long | -|Default| 5000 | -|Effective| After restarting system | - -### SSL Configuration - -* enable\_thrift\_ssl - -|Name| enable\_thrift\_ssl | -|:---:|:---------------------------| -|Description|When enable\_thrift\_ssl is configured as true, SSL encryption will be used for communication through dn\_rpc\_port | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* enable\_https - -|Name| enable\_https | -|:---:|:-------------------------| -|Description| REST Service Specifies whether to enable SSL configuration | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* key\_store\_path - -|Name| key\_store\_path | -|:---:|:-----------------| -|Description| SSL certificate path | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* key\_store\_pwd - -|Name| key\_store\_pwd | -|:---:|:----------------| -|Description| SSL certificate password | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -### SeedConfigNode - -* dn\_seed\_config\_node - -|Name| dn\_seed\_config\_node | -|:---:|:------------------------------------------------| -|Description| ConfigNode Address for DataNode to join cluster. 
This parameter corresponds to dn\_target\_config\_node\_list before V1.2.2 | -|Type| String | -|Default| 127.0.0.1:10710 | -|Effective| Only allowed to be modified in first start up | - -### Connection Configuration - -* dn\_rpc\_thrift\_compression\_enable - -|Name| dn\_rpc\_thrift\_compression\_enable | -|:---:|:---| -|Description| Whether to enable Thrift's compression (using GZIP).| -|Type|Boolean| -|Default| false | -|Effective|After restarting system| - -* dn\_rpc\_advanced\_compression\_enable - -|Name| dn\_rpc\_advanced\_compression\_enable | -|:---:|:---| -|Description| Whether to enable Thrift's advanced compression.| -|Type|Boolean| -|Default| false | -|Effective|After restarting system| - -* dn\_rpc\_selector\_thread\_count - -|Name| dn\_rpc\_selector\_thread\_count | -|:---:|:-----------------------------------| -|Description| The number of RPC selector threads. | -|Type| int | -|Default| 1 | -|Effective| After restarting system | - -* dn\_rpc\_min\_concurrent\_client\_num - -|Name| dn\_rpc\_min\_concurrent\_client\_num | -|:---:|:-----------------------------------| -|Description| Minimum number of concurrent RPC connections | -|Type| Short Int : [0,65535] | -|Default| 1 | -|Effective| After restarting system | - -* dn\_rpc\_max\_concurrent\_client\_num - -|Name| dn\_rpc\_max\_concurrent\_client\_num | -|:---:|:--------------------------------------| -|Description| Maximum number of concurrent RPC connections | -|Type| Short Int : [0,65535] | -|Default| 1000 | -|Effective| After restarting system | - -* dn\_thrift\_max\_frame\_size - -|Name| dn\_thrift\_max\_frame\_size | -|:---:|:---| -|Description| Maximum size, in bytes, of each Thrift RPC request/response| -|Type| Long | -|Unit|Byte| -|Default| 536870912 | -|Effective|After restarting system| - -* dn\_thrift\_init\_buffer\_size - -|Name| dn\_thrift\_init\_buffer\_size | -|:---:|:---| -|Description| Initial size, in bytes, of the buffer used by Thrift | -|Type| long | -|Default| 1024 | -|Effective|After restarting system|
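
As a rough illustration of how the connection settings listed so far relate to each other, the sketch below parses a properties-style fragment and runs two sanity checks. It is not part of IoTDB: `parse_properties` and `check_connection_settings` are hypothetical helper names, the sample fragment simply reuses the values from the tables above, and both checks (minimum clients not above maximum clients, initial buffer not above maximum frame size) are assumptions about a sensible configuration rather than documented rules.

```python
from typing import Dict, List

# Hypothetical fragment of iotdb-system.properties, using the values listed above.
SAMPLE = """\
# connection settings
dn_rpc_min_concurrent_client_num=1
dn_rpc_max_concurrent_client_num=1000
dn_thrift_max_frame_size=536870912
dn_thrift_init_buffer_size=1024
"""


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '='
            props[key.strip()] = value.strip()
    return props


def check_connection_settings(props: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems: List[str] = []
    min_clients = int(props["dn_rpc_min_concurrent_client_num"])
    max_clients = int(props["dn_rpc_max_concurrent_client_num"])
    if min_clients > max_clients:
        problems.append("min concurrent clients exceeds max concurrent clients")
    init_buffer = int(props["dn_thrift_init_buffer_size"])
    max_frame = int(props["dn_thrift_max_frame_size"])
    if init_buffer > max_frame:
        problems.append("initial buffer is larger than the maximum frame size")
    return problems


if __name__ == "__main__":
    settings = parse_properties(SAMPLE)
    print(check_connection_settings(settings))
    # Convert the frame size from bytes to MiB for readability.
    print(int(settings["dn_thrift_max_frame_size"]) // (1024 * 1024), "MiB")
```

The byte-to-MiB conversion at the end makes the default `dn_thrift_max_frame_size` easier to read: 536870912 bytes is 512 MiB.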
- -* dn\_connection\_timeout\_ms - -| Name | dn\_connection\_timeout\_ms | -|:-----------:|:---------------------------------------------------| -| Description | Thrift socket and connection timeout between nodes | -| Type | int | -| Default | 60000 | -| Effective | After restarting system | - -* dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager | -|:------------:|:--------------------------------------------------------------| -| Description | Number of core clients routed to each node in a ClientManager | -| Type | int | -| Default | 200 | -| Effective | After restarting system | - -* dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager | -|:--------------:|:-------------------------------------------------------------| -| Description | Maximum number of clients routed to each node in a ClientManager | -| Type | int | -| Default | 300 | -| Effective | After restarting system | - -### Directory Configuration - -* dn\_system\_dir - -| Name | dn\_system\_dir | -|:-----------:|:----------------------------------------------------------------------------| -| Description | The directory for system files. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/system (Windows: data\\datanode\\system) | -| Effective | After restarting system | - -* dn\_data\_dirs - -| Name | dn\_data\_dirs | -|:-----------:|:----------------------------------------------------------------------------| -| Description | The directories for data files. Multiple directories are separated by commas. The starting directory of a relative path depends on the operating system.
It is recommended to use an absolute path. If the path does not exist, the system will automatically create it. | -| Type | String[] | -| Default | data/datanode/data (Windows: data\\datanode\\data) | -| Effective | After restarting system | - -* dn\_multi\_dir\_strategy - -| Name | dn\_multi\_dir\_strategy | -|:-----------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | IoTDB's strategy for selecting directories for TsFile in tsfile_dir. You can use a simple class name or a full name of the class. The system provides the following three strategies:
1. SequenceStrategy: IoTDB selects the directory from tsfile\_dir in order, traverses all the directories in tsfile\_dir in turn, and keeps counting;
2. MaxDiskUsableSpaceFirstStrategy: IoTDB first selects the directory with the largest free disk space in tsfile\_dir;
You can implement a user-defined strategy with the following steps:
1. Extend the org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy class and implement your own strategy method;
2. Set the configuration item to the fully qualified class name of your implementation (package name plus class name, e.g. UserDefineStrategyPackage);
3. Add the jar file to the project. | -| Type | String | -| Default | SequenceStrategy | -| Effective | hot-load | - -* dn\_consensus\_dir - -| Name | dn\_consensus\_dir | -|:-----------:|:-------------------------------------------------------------------------------| -| Description | The directories of consensus files. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/consensus | -| Effective | After restarting system | - -* dn\_wal\_dirs - -| Name | dn\_wal\_dirs | -|:-----------:|:-------------------------------------------------------------------------| -| Description | Write Ahead Log storage path. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/wal | -| Effective | After restarting system | - -* dn\_tracing\_dir - -| Name | dn\_tracing\_dir | -|:-----------:|:----------------------------------------------------------------------------| -| Description | The tracing root directory path. It is recommended to use an absolute path. | -| Type | String | -| Default | datanode/tracing | -| Effective | After restarting system | - -* dn\_sync\_dir - -| Name | dn\_sync\_dir | -|:-----------:|:--------------------------------------------------------------------------| -| Description | The directories of sync files. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/sync | -| Effective | After restarting system | - -### Metric Configuration - -## Enable GC log - -GC log is off by default. -For performance tuning, you may want to collect the GC info. - -To enable GC log, just add a parameter "printgc" when you start the DataNode. - -```bash -nohup sbin/start-datanode.sh printgc >/dev/null 2>&1 & -``` -Or -```cmd -sbin\start-datanode.bat printgc -``` - -GC log is stored at `IOTDB_HOME/logs/gc.log`. -There will be at most 10 gc.log.* files and each one can reach to 10MB. 
- -### REST Service Configuration - -* enable\_rest\_service - -|Name| enable\_rest\_service | -|:---:|:--------------------------------------| -|Description| Whether to enable the REST service | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* rest\_service\_port - -|Name| rest\_service\_port | -|:---:|:------------------| -|Description| The port on which the REST service listens | -|Type| int32 | -|Default| 18080 | -|Effective| After restarting system | - -* enable\_swagger - -|Name| enable\_swagger | -|:---:|:-----------------------| -|Description| Whether to enable Swagger to display REST interface information | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* rest\_query\_default\_row\_size\_limit - -|Name| rest\_query\_default\_row\_size\_limit | -|:---:|:------------------------------------------------------------------------------------------| -|Description| The maximum number of rows in a result set that can be returned by a query | -|Type| int32 | -|Default| 10000 | -|Effective| After restarting system | - -* cache\_expire - -|Name| cache\_expire | -|:---:|:--------------------------------------------------------| -|Description| Expiration time for cached user login information | -|Type| int32 | -|Default| 28800 | -|Effective| After restarting system | - -* cache\_max\_num - -|Name| cache\_max\_num | -|:---:|:--------------| -|Description| The maximum number of users stored in the cache | -|Type| int32 | -|Default| 100 | -|Effective| After restarting system | - -* cache\_init\_num - -|Name| cache\_init\_num | -|:---:|:---------------| -|Description| Initial cache capacity | -|Type| int32 | -|Default| 10 | -|Effective| After restarting system | - - -* trust\_store\_path - -|Name| trust\_store\_path | -|:---:|:---------------| -|Description| trustStore path (optional) | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* trust\_store\_pwd - -|Name|
trust\_store\_pwd | -|:---:|:---------------------------------| -|Description| trustStore Password (Optional) | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* idle\_timeout - -|Name| idle\_timeout | -|:---:|:--------------| -|Description| SSL timeout duration, expressed in seconds | -|Type| int32 | -|Default| 5000 | -|Effective| After restarting system | - - -#### Storage engine configuration - - -* dn\_default\_space\_usage\_thresholds - -|Name| dn\_default\_space\_usage\_thresholds | -|:---:|:--------------| -|Description| Define the minimum remaining space ratio for each tier data catalogue; when the remaining space is less than this ratio, the data will be automatically migrated to the next tier; when the remaining storage space of the last tier falls below this threshold, the system will be set to READ_ONLY | -|Type| double | -|Default| 0.85 | -|Effective| hot-load | - -* remote\_tsfile\_cache\_dirs - -|Name| remote\_tsfile\_cache\_dirs | -|:---:|:--------------| -|Description| Cache directory stored locally in the cloud | -|Type| string | -|Default| data/datanode/data/cache | -|Effective| After restarting system | - -* remote\_tsfile\_cache\_page\_size\_in\_kb - -|Name| remote\_tsfile\_cache\_page\_size\_in\_kb | -|:---:|:--------------| -|Description| Block size of locally cached files stored in the cloud | -|Type| int | -|Default| 20480 | -|Effective| After restarting system | - -* remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb - -|Name| remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb | -|:---:|:--------------| -|Description| Maximum Disk Occupancy Size for Cloud Storage Local Cache | -|Type| long | -|Default| 51200 | -|Effective| After restarting system | - -* object\_storage\_type - -|Name| object\_storage\_type | -|:---:|:--------------| -|Description| Cloud Storage Type | -|Type| string | -|Default| AWS_S3 | -|Effective| After restarting system | - -* object\_storage\_bucket - -|Name| object\_storage\_bucket | 
-|:---:|:--------------| -|Description| Name of cloud storage bucket | -|Type| string | -|Default| iotdb_data | -|Effective| After restarting system | - -* object\_storage\_endpoint - -|Name| object\_storage\_endpoint | -|:---:|:--------------------------------| -|Description| endpoint of cloud storage | -|Type| string | -|Default| None | -|Effective| After restarting system | - -* object\_storage\_access\_key - -|Name| object\_storage\_access\_key | -|:---:|:--------------| -|Description| Authentication information stored in the cloud: key | -|Type| string | -|Default| None | -|Effective| After restarting system | - -* object\_storage\_access\_secret - -|Name| object\_storage\_access\_secret | -|:---:|:--------------| -|Description| Authentication information stored in the cloud: secret | -|Type| string | -|Default| None | -|Effective| After restarting system | diff --git a/src/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md b/src/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md deleted file mode 100644 index 84541c7fd..000000000 --- a/src/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md +++ /dev/null @@ -1,4978 +0,0 @@ - - -# UDF Libraries - -# UDF Libraries - -Based on the ability of user-defined functions, IoTDB provides a series of functions for temporal data processing, including data quality, data profiling, anomaly detection, frequency domain analysis, data matching, data repairing, sequence discovery, machine learning, etc., which can meet the needs of industrial fields for temporal data processing. - -> Note: The functions in the current UDF library only support millisecond level timestamp accuracy. - -## Installation steps - -1. Please obtain the compressed file of the UDF library JAR package that is compatible with the IoTDB version. 
- - | UDF installation package | Supported IoTDB versions | Download link | - | --------------- | ----------------- | ------------------------------------------------------------ | - | TimechoDB-UDF-1.3.3.zip | V1.3.3 and above | Please contact Timecho for assistance | - | TimechoDB-UDF-1.3.2.zip | V1.0.0~V1.3.2 | Please contact Timecho for assistance| - -2. Place the `library-udf.jar` file from the extracted package in the `/ext/udf` directory of every node in the IoTDB cluster. -3. In the SQL command line terminal (CLI) or the SQL operation interface of the visualization console (Workbench), execute the registration statement of each function you need, as given in the sections that follow. -4. Batch registration: two methods are available, the registration script or the full set of SQL statements. -- Registration script - - Copy the registration script (`register-UDF.sh` or `register-UDF.bat`) from the compressed package to the `tools` directory of IoTDB as needed, and modify the parameters in the script (default is host=127.0.0.1, rpcPort=6667, user=root, pass=root); - - Start the IoTDB service, then run the registration script to register the UDFs in batch. - -- All SQL statements - - Open the SQL file in the compressed package, copy all SQL statements, and execute them in the SQL command line terminal (CLI) of IoTDB or the SQL operation interface of the visualization console (Workbench) to register the UDFs in batch. - -## Data Quality - -### Completeness - -#### Registration statement - -```sql -create function completeness as 'org.apache.iotdb.library.dquality.UDTFCompleteness' -``` - -#### Usage - -This function calculates the completeness of a time series, which measures how much of the expected data is present rather than missing. The function divides the input time series data into consecutive non-overlapping time windows, computes the data completeness for each window individually, and outputs the timestamp of the first data point in the window along with the completeness result.
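
The windowing behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the library's implementation: `windowed_metric` and `non_nan_ratio` are hypothetical names, windows are formed by point count only, and the metric is a simplified stand-in (the share of non-NaN values) that does not reproduce the completeness formula used by the UDF. The rule that windows with 10 or fewer points produce no output follows the function's documented note.

```python
import math
from typing import Callable, Iterator, List, Tuple

Point = Tuple[int, float]  # (timestamp in ms, value)


def non_nan_ratio(window: List[Point]) -> float:
    """Simplified stand-in metric: share of values that are not NaN."""
    valid = sum(1 for _, value in window if not math.isnan(value))
    return valid / len(window)


def windowed_metric(
    points: List[Point],
    window_size: int,
    metric: Callable[[List[Point]], float] = non_nan_ratio,
    min_points: int = 10,
) -> Iterator[Tuple[int, float]]:
    """Yield (first timestamp, metric) per consecutive non-overlapping window."""
    for start in range(0, len(points), window_size):
        window = points[start:start + window_size]
        # Windows with 10 or fewer points are ignored and produce no output.
        if len(window) <= min_points:
            continue
        yield window[0][0], metric(window)


if __name__ == "__main__":
    # 30 points at 1-second spacing, with one NaN in the first window.
    series = [(t * 1000, float(t)) for t in range(30)]
    series[14] = (14000, math.nan)
    for first_ts, score in windowed_metric(series, window_size=15):
        print(first_ts, round(score, 4))
```

With a window of 15 points, the 30-point series yields two results, one per window, each labelled with the timestamp of the window's first point. This mirrors the shape of the function's output, although the scores differ from those of the real UDF.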
- -**Name:** COMPLETENESS - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window. It is either a positive integer or a positive number with a unit. The former is the number of data points in each window; the last window may contain fewer points. The latter is the time length of the window. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. -+ `downtime`: Whether downtime is taken into account when calculating completeness. It is 'true' or 'false' (default). When downtime is taken into account, long-term missing data is treated as downtime and does not affect completeness. - -**Output Series:** Output a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** The calculation is performed only when the number of data points in the window exceeds 10. Otherwise, the window is ignored and nothing is output. - -#### Examples - -##### Default Parameters - -With default parameters, this function treats all input data as a single window.
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select completeness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-----------------------------+ -| Time|completeness(root.test.d1.s1)| -+-----------------------------+-----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -+-----------------------------+-----------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select completeness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------+ -| Time|completeness(root.test.d1.s1, "window"="15")| -+-----------------------------+--------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+--------------------------------------------+ -``` - -### Consistency - -#### Registration statement - -```sql -create function 
consistency as 'org.apache.iotdb.library.dquality.UDTFConsistency' -``` - -#### Usage - -This function calculates the consistency of a time series, which measures whether the changes in the time series data are stable and follow uniform patterns. The function divides the input time series data into consecutive non-overlapping time windows, computes the data consistency for each window individually, and outputs the timestamp of the first data point in the window along with the consistency result. - -**Name:** CONSISTENCY - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window. It is a positive integer or a positive number with an unit. The former is the number of data points in each window. The number of data points in the last window may be less than it. The latter is the time of the window. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. - -**Output Series:** Output a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** Only when the number of data points in the window exceeds 10, the calculation will be performed. Otherwise, the window will be ignored and nothing will be output. - -#### Examples - -##### Default Parameters - -With default parameters, this function will regard all input data as the same window. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select consistency(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|consistency(root.test.d1.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+----------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select consistency(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------+ -| Time|consistency(root.test.d1.s1, "window"="15")| -+-----------------------------+-------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+-------------------------------------------+ -``` - -### Timeliness - -#### Registration statement - -```sql -create 
function timeliness as 'org.apache.iotdb.library.dquality.UDTFTimeliness' -``` - -#### Usage - -This function calculates the timeliness of a time series, which measures whether the time series data is collected and reported on schedule. The function divides the input time series data into consecutive non-overlapping time windows, computes the data timeliness for each window individually, and outputs the timestamp of the first data point in the window along with the timeliness result. - -**Name:** TIMELINESS - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window. It is a positive integer or a positive number with an unit. The former is the number of data points in each window. The number of data points in the last window may be less than it. The latter is the time of the window. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. - -**Output Series:** Output a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** Only when the number of data points in the window exceeds 10, the calculation will be performed. Otherwise, the window will be ignored and nothing will be output. - -#### Examples - -##### Default Parameters - -With default parameters, this function will regard all input data as the same window. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timeliness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------+ -| Time|timeliness(root.test.d1.s1)| -+-----------------------------+---------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+---------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timeliness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------+ -| Time|timeliness(root.test.d1.s1, "window"="15")| -+-----------------------------+------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+------------------------------------------+ -``` - -### Validity - -#### Registration statement - -```sql -create function 
validity as 'org.apache.iotdb.library.dquality.UDTFValidity' -``` - -#### Usage - -This function calculates the validity of a time series, which measures whether the time series data is normal, usable, and free of outliers. The function divides the input time series data into consecutive non-overlapping time windows, computes the data validity for each window individually, and outputs the timestamp of the first data point in the window along with the validity result. - -**Name:** VALIDITY - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window. It is a positive integer or a positive number with an unit. The former is the number of data points in each window. The number of data points in the last window may be less than it. The latter is the time of the window. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. - -**Output Series:** Output a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** Only when the number of data points in the window exceeds 10, the calculation will be performed. Otherwise, the window will be ignored and nothing will be output. - -#### Examples - -##### Default Parameters - -With default parameters, this function will regard all input data as the same window. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select Validity(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|validity(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -+-----------------------------+-------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select Validity(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|validity(root.test.d1.s1, "window"="15")| -+-----------------------------+----------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - - - - - -## Data Profiling - -### ACF - -#### Registration statement - -```sql 
-create function acf as 'org.apache.iotdb.library.dprofile.UDTFACF' -``` - -#### Usage - -This function is used to calculate the auto-correlation factor of the input time series, -which equals to cross correlation between the same series. -For more information, please refer to [XCorr](#XCorr) function. - -**Name:** ACF - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. -There are $2N-1$ data points in the series, and the values are interpreted in details in [XCorr](#XCorr) function. - -**Note:** - -+ `null` and `NaN` values in the input series will be ignored and treated as 0. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| null| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select acf(s1) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|acf(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 6.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -|1970-01-01T08:00:00.004+08:00| 7.0| -|1970-01-01T08:00:00.005+08:00| 0.0| -|1970-01-01T08:00:00.006+08:00| 3.6| -|1970-01-01T08:00:00.007+08:00| 0.0| -|1970-01-01T08:00:00.008+08:00| 1.0| -+-----------------------------+--------------------+ -``` - -### Distinct - -#### Registration statement - -```sql -create function distinct as 'org.apache.iotdb.library.dprofile.UDTFDistinct' -``` - -#### Usage - -This function returns all unique values in time series. 
- -**Name:** DISTINCT - -**Input Series:** Only supports a single input series. The type is arbitrary. - -**Output Series:** Output a single series. The type is the same as the input. - -**Note:** - -+ The timestamp of the output series is meaningless. The output order is arbitrary. -+ Missing points and null points in the input series will be ignored, but `NaN` will not. -+ String comparison is case-sensitive. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2020-01-01T08:00:00.001+08:00| Hello| -|2020-01-01T08:00:00.002+08:00| hello| -|2020-01-01T08:00:00.003+08:00| Hello| -|2020-01-01T08:00:00.004+08:00| World| -|2020-01-01T08:00:00.005+08:00| World| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select distinct(s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|distinct(root.test.d2.s2)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.001+08:00| Hello| -|1970-01-01T08:00:00.002+08:00| hello| -|1970-01-01T08:00:00.003+08:00| World| -+-----------------------------+-------------------------+ -``` - -### Histogram - -#### Registration statement - -```sql -create function histogram as 'org.apache.iotdb.library.dprofile.UDTFHistogram' -``` - -#### Usage - -This function is used to calculate the distribution histogram of a single column of numerical data. - -**Name:** HISTOGRAM - -**Input Series:** Only supports a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `min`: The lower limit of the requested data range. The default value is -Double.MAX_VALUE. -+ `max`: The upper limit of the requested data range. The default value is Double.MAX_VALUE. The value of `min` must be less than or equal to `max`. -+ `count`: The number of buckets of the histogram. The default value is 1.
It must be a positive integer. - -**Output Series:** The value of the bucket of the histogram, where the lower bound represented by the i-th bucket (index starts from 1) is $min+ (i-1)\cdot\frac{max-min}{count}$ and the upper bound is $min + i \cdot \frac{max-min}{count}$. - -**Note:** - -+ If the value is lower than `min`, it will be put into the 1st bucket. If the value is larger than `max`, it will be put into the last bucket. -+ Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 11.0| -|2020-01-01T00:00:11.000+08:00| 12.0| -|2020-01-01T00:00:12.000+08:00| 13.0| -|2020-01-01T00:00:13.000+08:00| 14.0| -|2020-01-01T00:00:14.000+08:00| 15.0| -|2020-01-01T00:00:15.000+08:00| 16.0| -|2020-01-01T00:00:16.000+08:00| 17.0| -|2020-01-01T00:00:17.000+08:00| 18.0| -|2020-01-01T00:00:18.000+08:00| 19.0| -|2020-01-01T00:00:19.000+08:00| 20.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select histogram(s1,"min"="1","max"="20","count"="10") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------+ -| Time|histogram(root.test.d1.s1, "min"="1", "max"="20", "count"="10")| -+-----------------------------+---------------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 2| -|1970-01-01T08:00:00.001+08:00| 2| 
-|1970-01-01T08:00:00.002+08:00| 2| -|1970-01-01T08:00:00.003+08:00| 2| -|1970-01-01T08:00:00.004+08:00| 2| -|1970-01-01T08:00:00.005+08:00| 2| -|1970-01-01T08:00:00.006+08:00| 2| -|1970-01-01T08:00:00.007+08:00| 2| -|1970-01-01T08:00:00.008+08:00| 2| -|1970-01-01T08:00:00.009+08:00| 2| -+-----------------------------+---------------------------------------------------------------+ -``` - -### Integral - -#### Registration statement - -```sql -create function integral as 'org.apache.iotdb.library.dprofile.UDAFIntegral' -``` - -#### Usage - -This function is used to calculate the integral of a time series, -which equals the area under the curve with time as the X-axis and values as the Y-axis. - -**Name:** INTEGRAL - -**Input Series:** Only supports a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `unit`: The unit of time used when computing the integral. - The value should be chosen from "1S", "1s", "1m", "1H", "1d" (case-sensitive), - which take one millisecond / second / minute / hour / day, respectively, as 1.0 when calculating the area and integral. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the integral. - -**Note:** - -+ The integral value equals the sum of the areas of the right-angled trapezoids formed by each pair of adjacent points and the time axis. - Choosing a different `unit` only rescales the time axis, so results for different units differ by a constant factor. - -+ `NaN` values in the input series will be ignored. The curve or trapezoids will skip these points and use the next valid point. - -#### Examples - -##### Default Parameters - -With default parameters, this function will take one second as 1.0.
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| 2| -|2020-01-01T00:00:03.000+08:00| 5| -|2020-01-01T00:00:04.000+08:00| 6| -|2020-01-01T00:00:05.000+08:00| 7| -|2020-01-01T00:00:08.000+08:00| 8| -|2020-01-01T00:00:09.000+08:00| NaN| -|2020-01-01T00:00:10.000+08:00| 10| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select integral(s1) from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|integral(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.000+08:00| 57.5| -+-----------------------------+-------------------------+ -``` - -Calculation expression: -$$\frac{1}{2}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 57.5$$ - -##### Specific time unit - -With time unit specified as "1m", this function will take one minute as 1.0. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select integral(s1, "unit"="1m") from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|integral(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.958| -+-----------------------------+-------------------------+ -``` - -Calculation expression: -$$\frac{1}{2\times 60}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 0.958$$ - -### IntegralAvg - -#### Registration statement - -```sql -create function integralavg as 'org.apache.iotdb.library.dprofile.UDAFIntegralAvg' -``` - -#### Usage - -This function is used to calculate the function average of time series. 
-The output equals the area under the curve divided by the time interval, both measured in the same time `unit`. -For more information about the area under the curve, please refer to the `Integral` function. - -**Name:** INTEGRALAVG - -**Input Series:** Only supports a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the time-weighted average. - -**Note:** - -+ The time-weighted average equals the integral value with any `unit` divided by the time interval of the input series. - The result is independent of the time unit used in the integral, and it's consistent with the timestamp precision of IoTDB by default. - -+ `NaN` values in the input series will be ignored. The curve or trapezoids will skip these points and use the next valid point. - -+ If the input series is empty, the output value will be 0.0; if there is only one data point, the output value will equal that point's value.
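
The relationship between `Integral` and `IntegralAvg` can be sketched in a few lines of Python. This is an illustrative re-computation under stated assumptions, not the library's implementation: the helper names `integral` and `integral_avg` are hypothetical, time is measured in seconds, and the sample points are those of the `Integral` example above.

```python
import math

def integral(points):
    """Trapezoidal area under (time_in_seconds, value) points, skipping NaN values."""
    valid = [(t, v) for t, v in points if not math.isnan(v)]
    area = 0.0
    # Each pair of adjacent valid points forms one right-angled trapezoid.
    for (t0, v0), (t1, v1) in zip(valid, valid[1:]):
        area += (v0 + v1) * (t1 - t0) / 2.0
    return area

def integral_avg(points):
    """Time-weighted average: area divided by the time span of the valid points."""
    valid = [(t, v) for t, v in points if not math.isnan(v)]
    if not valid:
        return 0.0           # empty input
    if len(valid) == 1:
        return valid[0][1]   # a single point is its own average
    span = valid[-1][0] - valid[0][0]
    return integral(valid) / span

points = [(1, 1.0), (2, 2.0), (3, 5.0), (4, 6.0), (5, 7.0),
          (8, 8.0), (9, float("nan")), (10, 10.0)]
print(integral(points))      # 57.5
print(integral_avg(points))  # 57.5 / 9, about 6.389
```

The divisor is the 9-second span between the first and last valid points, not the number of points.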
- -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| 2| -|2020-01-01T00:00:03.000+08:00| 5| -|2020-01-01T00:00:04.000+08:00| 6| -|2020-01-01T00:00:05.000+08:00| 7| -|2020-01-01T00:00:08.000+08:00| 8| -|2020-01-01T00:00:09.000+08:00| NaN| -|2020-01-01T00:00:10.000+08:00| 10| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select integralavg(s1) from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|integralavg(root.test.d1.s1)| -+-----------------------------+----------------------------+ -|1970-01-01T08:00:00.000+08:00| 6.388888888888889| -+-----------------------------+----------------------------+ -``` - -Calculation expression (the area is divided by the 9-second span between the first and last points): -$$\frac{1}{2}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] / 9 \approx 6.389$$ - -### Mad - -#### Registration statement - -```sql -create function mad as 'org.apache.iotdb.library.dprofile.UDAFMad' -``` - -#### Usage - -The function is used to compute the exact or approximate median absolute deviation (MAD) of a numeric time series. MAD is the median of the absolute deviations of the elements from their median. - -Take a dataset $\{1,3,3,5,5,6,7,8,9\}$ as an instance. Its median is 5 and the absolute deviations of the elements from the median, in sorted order, are $\{0,0,1,2,2,2,3,4,4\}$, whose median is 2. Therefore, the MAD of the original dataset is 2. - -**Name:** MAD - -**Input Series:** Only supports a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `error`: The relative error of the approximate MAD. It should be within [0,1) and the default value is 0.
Taking `error`=0.01 as an instance, suppose the exact MAD is $a$ and the approximate MAD is $b$, we have $0.99a \le b \le 1.01a$. With `error`=0, the output is the exact MAD. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the MAD. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -##### Approximate Query - -By setting `error` within (0,1), the function queries the approximate MAD. - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -............ 
-Total line number = 20 -``` - -SQL for query: - -```sql -select mad(s1, "error"="0.01") from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -| Time|mad(root.test.s1, "error"="0.01")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9900000000000001| -+-----------------------------+---------------------------------+ -``` - -### Median - -#### Registration statement - -```sql -create function median as 'org.apache.iotdb.library.dprofile.UDAFMedian' -``` - -#### Usage - -The function is used to compute the exact or approximate median of a numeric time series. Median is the value separating the higher half from the lower half of a data sample. - -**Name:** MEDIAN - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `error`: The rank error of the approximate median. It should be within [0,1) and the default value is 0. For instance, a median with `error`=0.01 is the value of the element with rank percentage 0.49~0.51. With `error`=0, the output is the exact median. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the median. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -Total line number = 20 -``` - -SQL for query: - -```sql -select median(s1, "error"="0.01") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|median(root.test.s1, "error"="0.01")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -+-----------------------------+------------------------------------+ -``` - -### MinMax - -#### Registration statement - -```sql -create function minmax as 'org.apache.iotdb.library.dprofile.UDTFMinMax' -``` - -#### Usage - -This function is used to standardize the input series with min-max. Minimum value is transformed to 0; maximum value is transformed to 1. - -**Name:** MINMAX - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `compute`: When set to "batch", anomaly test is conducted after importing all data points; when set to "stream", it is required to provide minimum and maximum values. 
The default method is "batch". -+ `min`: The minimum value when `compute` is set to "stream". -+ `max`: The maximum value when `compute` is set to "stream". - -**Output Series:** Output a single series. The type is DOUBLE. - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select minmax(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|minmax(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.200+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.300+08:00| 0.25| -|1970-01-01T08:00:00.400+08:00| 0.08333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.700+08:00| 0.0| -|1970-01-01T08:00:00.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.900+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.000+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.100+08:00| 0.25|
-|1970-01-01T08:00:01.200+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.300+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.25| -|1970-01-01T08:00:01.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.700+08:00| 1.0| -|1970-01-01T08:00:01.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.900+08:00| 0.0| -|1970-01-01T08:00:02.000+08:00| 0.16666666666666666| -+-----------------------------+--------------------+ -``` - - -### MvAvg - -#### Registration statement - -```sql -create function mvavg as 'org.apache.iotdb.library.dprofile.UDTFMvAvg' -``` - -#### Usage - -This function is used to calculate moving average of input series. - -**Name:** MVAVG - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `window`: Length of the moving window. Default value is 10. - -**Output Series:** Output a single series. The type is DOUBLE. - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select 
mvavg(s1, "window"="3") from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -| Time|mvavg(root.test.s1, "window"="3")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.300+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.400+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.700+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.100+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.200+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.300+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.0| -|1970-01-01T08:00:01.500+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.600+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.700+08:00| 3.0| -|1970-01-01T08:00:01.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.900+08:00| -0.6666666666666666| -|1970-01-01T08:00:02.000+08:00| -3.3333333333333335| -+-----------------------------+---------------------------------+ -``` - -### PACF - -#### Registration statement - -```sql -create function pacf as 'org.apache.iotdb.library.dprofile.UDTFPACF' -``` - -#### Usage - -This function is used to calculate partial autocorrelation of input series by solving Yule-Walker equation. For some cases, the equation may not be solved, and NaN will be output. - -**Name:** PACF - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `lag`: Maximum lag of pacf to calculate. The default value is $\min(10\log_{10}n,n-1)$, where $n$ is the number of data points. - -**Output Series:** Output a single series. The type is DOUBLE. 
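
The Yule-Walker equations mentioned above can be solved lag by lag with the Durbin-Levinson recursion. The sketch below is a minimal illustration under stated assumptions, not the library's implementation: the function name `pacf_from_acf` is hypothetical, the input is a list of autocorrelations `rho` with `rho[0] == 1`, and the choice of Durbin-Levinson as the solver is an assumption. When a denominator becomes zero, the equation has no solution and `NaN` is emitted for that lag and all later ones.

```python
import math

def pacf_from_acf(rho):
    """Partial autocorrelations for lags 0..len(rho)-1 via Durbin-Levinson.

    rho[k] is the autocorrelation at lag k, with rho[0] == 1.
    """
    max_lag = len(rho) - 1
    result = [1.0]   # lag 0 is 1 by convention
    prev = []        # prev[j] holds phi_{k-1, j+1}
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        if den == 0.0:
            # The Yule-Walker system is singular from this lag on.
            result.extend([math.nan] * (max_lag - k + 1))
            break
        phi_kk = num / den
        # Update the AR coefficients of order k from those of order k-1.
        cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        cur.append(phi_kk)
        result.append(phi_kk)
        prev = cur
    return result

# Autocorrelations rho(k) = 0.5**k: only lag 1 has a non-zero
# partial autocorrelation.
print(pacf_from_acf([1.0, 0.5, 0.25, 0.125]))
```

The recursion reuses the coefficients of order k-1 to obtain order k, so each additional lag costs only O(k) work instead of a fresh matrix solve.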
- -#### Examples - -##### Assigning maximum lag - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select pacf(s1, "lag"="5") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------+ -| Time|pacf(root.test.d1.s1, "lag"="5")| -+-----------------------------+--------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| -0.5744680851063829| -|2020-01-01T00:00:03.000+08:00| 0.3172297297297296| -|2020-01-01T00:00:04.000+08:00| -0.2977686586304181| -|2020-01-01T00:00:05.000+08:00| -2.0609033521065867| -+-----------------------------+--------------------------------+ -``` - -### Percentile - -#### Registration statement - -```sql -create function percentile as 'org.apache.iotdb.library.dprofile.UDAFPercentile' -``` - -#### Usage - -The function is used to compute the exact or approximate percentile of a numeric time series. A percentile is the value of the element at a given rank in the sorted series. - -**Name:** PERCENTILE - -**Input Series:** Only supports a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `rank`: The rank percentage of the percentile. It should be within (0,1] and the default value is 0.5. For instance, a percentile with `rank`=0.5 is the median. -+ `error`: The rank error of the approximate percentile. It should be within [0,1) and the default value is 0. For instance, a 0.5-percentile with `error`=0.01 is the value of the element with rank percentage 0.49~0.51. With `error`=0, the output is the exact percentile. - -**Output Series:** Output a single series.
The type is the same as the input series. There is only one data point in the series. If `error`=0, its timestamp is that of the first input point whose value equals the percentile, and its value is the percentile; otherwise, its timestamp is 0. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+-------------+ -| Time|root.test2.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+-------------+ -Total line number = 20 -``` - -SQL for query: - -```sql -select percentile(s1, "rank"="0.2", "error"="0.01") from root.test2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------------+ -| Time|percentile(root.test2.s1, "rank"="0.2", "error"="0.01")| -+-----------------------------+-------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| -1.0| -+-----------------------------+-------------------------------------------------------+ -``` - -### Quantile - -#### Registration statement - -```sql -create function quantile as 'org.apache.iotdb.library.dprofile.UDAFQuantile' -``` - -#### Usage - -The
function is used to compute the approximate quantile of a numeric time series. A quantile is value of element in the certain rank of the sorted series. - -**Name:** QUANTILE - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `rank`: The rank of the quantile. It should be (0,1] and the default value is 0.5. For instance, a quantile with `rank`=0.5 is the median. -+ `K`: The size of KLL sketch maintained in the query. It should be within [100,+inf) and the default value is 800. For instance, the 0.5-quantile computed by a KLL sketch with K=800 items is a value with rank quantile 0.49~0.51 with a confidence of at least 99%. The result will be more accurate as K increases. - -**Output Series:** Output a single series. The type is the same as input series. The timestamp of the only data point is 0. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+-------------+ -| Time|root.test1.s1| -+-----------------------------+-------------+ -|2021-03-17T10:32:17.054+08:00| 7| -|2021-03-17T10:32:18.054+08:00| 15| -|2021-03-17T10:32:19.054+08:00| 36| -|2021-03-17T10:32:20.054+08:00| 39| -|2021-03-17T10:32:21.054+08:00| 40| -|2021-03-17T10:32:22.054+08:00| 41| -|2021-03-17T10:32:23.054+08:00| 20| -|2021-03-17T10:32:24.054+08:00| 18| -+-----------------------------+-------------+ -............ 
-Total line number = 8 -``` - -SQL for query: - -```sql -select quantile(s1, "rank"="0.2", "K"="800") from root.test1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------+ -| Time|quantile(root.test1.s1, "rank"="0.2", "K"="800")| -+-----------------------------+------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 7.000000000000001| -+-----------------------------+------------------------------------------------+ -``` - -### Period - -#### Registration statement - -```sql -create function period as 'org.apache.iotdb.library.dprofile.UDAFPeriod' -``` - -#### Usage - -The function is used to compute the period of a numeric time series. - -**Name:** PERIOD - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is INT32. There is only one data point in the series, whose timestamp is 0 and value is the period. 
- -#### Examples - -Input series: - - -``` -+-----------------------------+---------------+ -| Time|root.test.d3.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| -|1970-01-01T08:00:00.002+08:00| 2.0| -|1970-01-01T08:00:00.003+08:00| 3.0| -|1970-01-01T08:00:00.004+08:00| 1.0| -|1970-01-01T08:00:00.005+08:00| 2.0| -|1970-01-01T08:00:00.006+08:00| 3.0| -|1970-01-01T08:00:00.007+08:00| 1.0| -|1970-01-01T08:00:00.008+08:00| 2.0| -|1970-01-01T08:00:00.009+08:00| 3.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select period(s1) from root.test.d3 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time|period(root.test.d3.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 3| -+-----------------------------+-----------------------+ -``` - -### QLB - -#### Registration statement - -```sql -create function qlb as 'org.apache.iotdb.library.dprofile.UDTFQLB' -``` - -#### Usage - -This function is used to calculate the Ljung-Box statistic $Q_{LB}$ of a time series and convert it to a p-value. - -**Name:** QLB - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `lag`: The maximum lag to calculate. It must be an integer from 1 to n-2, where n is the number of samples. The default value is n-2. - -**Output Series:** Output a single series. The type is DOUBLE. Each value is a p-value, and its timestamp indicates the lag. - -**Note:** If you want to calculate the Ljung-Box statistic $Q_{LB}$ instead of the p-value, you may use the ACF function.
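
For orientation, the textbook Ljung-Box statistic for a maximum lag $h$ is $Q_{LB} = n(n+2)\sum_{k=1}^{h} \hat{\rho}_k^2 / (n-k)$, where $\hat{\rho}_k$ is the sample autocorrelation at lag $k$. The sketch below computes it in plain Python. The function name `ljung_box_q` is hypothetical, and it is an assumption that the UDF uses exactly this autocorrelation estimator; the conversion to a p-value is not shown.

```python
from typing import List, Sequence


def ljung_box_q(series: Sequence[float], max_lag: int) -> List[float]:
    """Return Q_LB for every lag h = 1..max_lag (illustrative, not IoTDB code)."""
    n = len(series)
    if not 1 <= max_lag <= n - 2:
        raise ValueError("max_lag must be an integer from 1 to n-2")
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")

    q_values = []
    running = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k.
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        running += rho_k * rho_k / (n - k)
        q_values.append(n * (n + 2) * running)
    return q_values


if __name__ == "__main__":
    print(ljung_box_q([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2))
```

The list index corresponds to the lag minus one, which mirrors how the output series of QLB uses the timestamp to indicate the lag.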
- -#### Examples - -##### Using Default Parameter - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T00:00:00.100+08:00| 1.22| -|1970-01-01T00:00:00.200+08:00| -2.78| -|1970-01-01T00:00:00.300+08:00| 1.53| -|1970-01-01T00:00:00.400+08:00| 0.70| -|1970-01-01T00:00:00.500+08:00| 0.75| -|1970-01-01T00:00:00.600+08:00| -0.72| -|1970-01-01T00:00:00.700+08:00| -0.22| -|1970-01-01T00:00:00.800+08:00| 0.28| -|1970-01-01T00:00:00.900+08:00| 0.57| -|1970-01-01T00:00:01.000+08:00| -0.22| -|1970-01-01T00:00:01.100+08:00| -0.72| -|1970-01-01T00:00:01.200+08:00| 1.34| -|1970-01-01T00:00:01.300+08:00| -0.25| -|1970-01-01T00:00:01.400+08:00| 0.17| -|1970-01-01T00:00:01.500+08:00| 2.51| -|1970-01-01T00:00:01.600+08:00| 1.42| -|1970-01-01T00:00:01.700+08:00| -1.34| -|1970-01-01T00:00:01.800+08:00| -0.01| -|1970-01-01T00:00:01.900+08:00| -0.49| -|1970-01-01T00:00:02.000+08:00| 1.63| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select QLB(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+---------------------+ -| Time| QLB(root.test.d1.s1)| -+-----------------------------+---------------------+ -|1970-01-01T08:00:00.021+08:00| -0.31671| -|1970-01-01T08:00:00.001+08:00| 0.12748561639660716| -|1970-01-01T08:00:00.022+08:00| -0.17051499999999997| -|1970-01-01T08:00:00.002+08:00| 0.21941409592365868| -|1970-01-01T08:00:00.023+08:00| -0.11341499999999997| -|1970-01-01T08:00:00.003+08:00| 0.3384920824593398| -|1970-01-01T08:00:00.024+08:00| 0.26146| -|1970-01-01T08:00:00.004+08:00| 0.26293189359893154| -|1970-01-01T08:00:00.025+08:00| 0.06431999999999996| -|1970-01-01T08:00:00.005+08:00| 0.37265953802871943| -|1970-01-01T08:00:00.026+08:00| 0.036919999999999994| -|1970-01-01T08:00:00.006+08:00| 0.4923218142923832| -|1970-01-01T08:00:00.027+08:00|-0.009294999999999993| -|1970-01-01T08:00:00.007+08:00| 
0.609628728420623| -|1970-01-01T08:00:00.028+08:00| 0.12271499999999999| -|1970-01-01T08:00:00.008+08:00| 0.6510708392264906| -|1970-01-01T08:00:00.029+08:00| 0.008480000000000033| -|1970-01-01T08:00:00.009+08:00| 0.7430561964288097| -|1970-01-01T08:00:00.030+08:00| -0.21764500000000003| -|1970-01-01T08:00:00.010+08:00| 0.6236738200492055| -|1970-01-01T08:00:00.031+08:00| 0.35853999999999997| -|1970-01-01T08:00:00.011+08:00| 0.21487390993160937| -|1970-01-01T08:00:00.032+08:00| 0.18115499999999998| -|1970-01-01T08:00:00.012+08:00| 0.18479562182870324| -|1970-01-01T08:00:00.033+08:00| -0.27745499999999995| -|1970-01-01T08:00:00.013+08:00| 0.07329862193377235| -|1970-01-01T08:00:00.034+08:00| -0.22418500000000002| -|1970-01-01T08:00:00.014+08:00| 0.038000864459751926| -|1970-01-01T08:00:00.035+08:00| 0.31609000000000004| -|1970-01-01T08:00:00.015+08:00| 0.004052989734200874| -|1970-01-01T08:00:00.036+08:00| -0.06078500000000001| -|1970-01-01T08:00:00.016+08:00| 0.005663787468609627| -|1970-01-01T08:00:00.037+08:00| 0.19219499999999998| -|1970-01-01T08:00:00.017+08:00|0.0016316380755082571| -|1970-01-01T08:00:00.038+08:00| -0.25646| -|1970-01-01T08:00:00.018+08:00|2.0047954405910673E-5| -+-----------------------------+---------------------+ -``` - -### Resample - -#### Registration statement - -```sql -create function re_sample as 'org.apache.iotdb.library.dprofile.UDTFResample' -``` - -#### Usage - -This function is used to resample the input series according to a given frequency, -including up-sampling and down-sampling. -Currently, the supported up-sampling methods are -NaN (filling with `NaN`), -FFill (filling with previous value), -BFill (filling with next value) and -Linear (filling with linear interpolation). -Down-sampling relies on group aggregation, -which supports Max, Min, First, Last, Mean and Median. - -**Name:** RESAMPLE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. 
- -**Parameters:** - - -+ `every`: The frequency of resampling, which is a positive number with a unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. This parameter is required. -+ `interp`: The interpolation method of up-sampling, which is 'NaN', 'FFill', 'BFill' or 'Linear'. By default, NaN is used. -+ `aggr`: The aggregation method of down-sampling, which is 'Max', 'Min', 'First', 'Last', 'Mean' or 'Median'. By default, Mean is used. -+ `start`: The start time (inclusive) of resampling with the format 'yyyy-MM-dd HH:mm:ss'. By default, it is the timestamp of the first valid data point. -+ `end`: The end time (exclusive) of resampling with the format 'yyyy-MM-dd HH:mm:ss'. By default, it is the timestamp of the last valid data point. - -**Output Series:** Output a single series. The type is DOUBLE. It is strictly equispaced with the frequency `every`. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -##### Up-sampling - -When the frequency of resampling is higher than the original frequency, up-sampling is performed.
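
To make the 'Linear' method concrete, here is a minimal Python sketch of linear up-sampling onto an equispaced grid. The helper name `upsample_linear` and the use of plain integer timestamps (minutes) are assumptions made for illustration; this is not the UDF's actual implementation.

```python
from typing import List, Tuple


def upsample_linear(points: List[Tuple[int, float]], every: int) -> List[Tuple[int, float]]:
    """Resample (timestamp, value) pairs onto a grid of step `every`
    using linear interpolation between neighbouring input points."""
    if len(points) < 2:
        return list(points)
    result = []
    t = points[0][0]
    i = 0
    while t <= points[-1][0]:
        # Advance to the input segment [points[i], points[i + 1]] that contains t.
        while i + 1 < len(points) - 1 and points[i + 1][0] <= t:
            i += 1
        (t0, v0), (t1, v1) = points[i], points[i + 1]
        result.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
        t += every
    return result


if __name__ == "__main__":
    # Timestamps are minutes after the first point; values are sampled every 15 minutes.
    series = [(0, 3.09), (15, 3.53), (30, 3.5), (45, 3.51), (60, 3.41)]
    for ts, value in upsample_linear(series, every=5):
        print(ts, value)
```

With a 15-minute input and `every` set to 5, each original interval gains two interpolated points, so five input points become thirteen output points.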
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2021-03-06T16:00:00.000+08:00| 3.09| -|2021-03-06T16:15:00.000+08:00| 3.53| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:45:00.000+08:00| 3.51| -|2021-03-06T17:00:00.000+08:00| 3.41| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select resample(s1,'every'='5m','interp'='linear') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="5m", "interp"="linear")| -+-----------------------------+----------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:05:00.000+08:00| 3.2366665999094644| -|2021-03-06T16:10:00.000+08:00| 3.3833332856496177| -|2021-03-06T16:15:00.000+08:00| 3.5299999713897705| -|2021-03-06T16:20:00.000+08:00| 3.5199999809265137| -|2021-03-06T16:25:00.000+08:00| 3.509999990463257| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:35:00.000+08:00| 3.503333330154419| -|2021-03-06T16:40:00.000+08:00| 3.506666660308838| -|2021-03-06T16:45:00.000+08:00| 3.509999990463257| -|2021-03-06T16:50:00.000+08:00| 3.4766666889190674| -|2021-03-06T16:55:00.000+08:00| 3.443333387374878| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+----------------------------------------------------------+ -``` - -##### Down-sampling - -When the frequency of resampling is lower than the original frequency, down-sampling starts. 
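
Group aggregation for down-sampling can be sketched in Python as follows. The helper `downsample` and the integer minute timestamps are hypothetical, and aligning buckets to the first timestamp is an assumption for illustration rather than documented behavior of the UDF.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple


def downsample(
    points: List[Tuple[int, float]],
    every: int,
    aggr: Callable[[List[float]], float],
) -> List[Tuple[int, float]]:
    """Group points into buckets of width `every` (aligned to the first
    timestamp) and reduce each bucket with `aggr`."""
    start = points[0][0]
    buckets: Dict[int, List[float]] = {}
    for t, v in points:
        bucket_start = start + (t - start) // every * every
        buckets.setdefault(bucket_start, []).append(v)
    return [(b, aggr(vs)) for b, vs in sorted(buckets.items())]


if __name__ == "__main__":
    # Same values as the input series of this section, timestamps in minutes.
    series = [(0, 3.09), (15, 3.53), (30, 3.5), (45, 3.51), (60, 3.41)]
    print(downsample(series, every=30, aggr=lambda vs: vs[0]))  # 'First'
    print(downsample(series, every=30, aggr=mean))              # 'Mean'
```

Passing a different reducer (for example `max`, `min` or `statistics.median`) corresponds to choosing another value for `aggr`.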
- -Input series is the same as above, the SQL for query is shown below: - -```sql -select resample(s1,'every'='30m','aggr'='first') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "aggr"="first")| -+-----------------------------+--------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+--------------------------------------------------------+ -``` - - - -##### Specify the time period - -The time period of resampling can be specified with `start` and `end`. -The period outside the actual time range will be interpolated. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select resample(s1,'every'='30m','start'='2021-03-06 15:00:00') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "start"="2021-03-06 15:00:00")| -+-----------------------------+-----------------------------------------------------------------------+ -|2021-03-06T15:00:00.000+08:00| NaN| -|2021-03-06T15:30:00.000+08:00| NaN| -|2021-03-06T16:00:00.000+08:00| 3.309999942779541| -|2021-03-06T16:30:00.000+08:00| 3.5049999952316284| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+-----------------------------------------------------------------------+ -``` - -### Sample - -#### Registration statement - -```sql -create function sample as 'org.apache.iotdb.library.dprofile.UDTFSample' -``` - -#### Usage - -This function is used to sample the input series, -that is, select a specified number of data points from the input series and output them. 
-Currently, three sampling methods are supported: -**Reservoir sampling** randomly selects data points. -All of the points have the same probability of being sampled. -**Isometric sampling** selects data points at equal index intervals. -**Triangle sampling** assigns data points to buckets according to the number of samples. -Then it calculates the triangle areas based on the points inside each bucket and selects the point with the largest triangle area. -For more details, see this [paper](http://skemman.is/stream/get/1946/15343/37285/3/SS_MSthesis.pdf). - -**Name:** SAMPLE - -**Input Series:** Only support a single input series. The type is arbitrary. - -**Parameters:** - -+ `method`: The method of sampling, which is 'reservoir', 'isometric' or 'triangle'. By default, reservoir sampling is used. -+ `k`: The number of samples, which is a positive integer. By default, it is 1. - -**Output Series:** Output a single series. The type is the same as the input. The length of the output series is `k`. Each data point in the output series comes from the input series. - -**Note:** If `k` is greater than the length of the input series, all data points in the input series will be output. - -#### Examples - -##### Reservoir Sampling - -When `method` is 'reservoir' or is not specified, reservoir sampling is used. -Due to the randomness of this method, the output series shown below is only a possible result.
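
For readers unfamiliar with the technique, the classic reservoir sampling scheme (often called Algorithm R) can be sketched in Python as below. The function name `reservoir_sample` is hypothetical, and whether the UDF uses this exact variant is an assumption.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def reservoir_sample(items: Sequence[T], k: int, rng: random.Random) -> List[T]:
    """Pick k items so that every input item is equally likely to be kept."""
    # Fill the reservoir with the first k items (or all items if there are fewer).
    reservoir = list(items[:k])
    for i in range(k, len(items)):
        # Item i replaces a random slot with probability k / (i + 1).
        j = rng.randint(0, i)  # randint is inclusive on both ends
        if j < k:
            reservoir[j] = items[i]
    return reservoir


if __name__ == "__main__":
    values = [float(v) for v in range(1, 11)]
    print(sorted(reservoir_sample(values, k=5, rng=random.Random())))
```

Because the selection is random, repeated runs generally return different subsets; when `k` exceeds the input length, every item is returned.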
- - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| 2.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:04.000+08:00| 4.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -|2020-01-01T00:00:10.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select sample(s1,'method'='reservoir','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="reservoir", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - -##### Isometric Sampling - -When `method` is 'isometric', isometric sampling is used. 
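
Selecting points at equal index intervals can be sketched in Python as follows. The helper `isometric_sample` is hypothetical, and the index rule `i * n // k` is an assumption chosen for illustration; the UDF may round the indices differently.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def isometric_sample(items: Sequence[T], k: int) -> List[T]:
    """Pick k items whose indices are spread evenly over the input."""
    n = len(items)
    if k >= n:
        # Nothing to thin out: return every point.
        return list(items)
    return [items[i * n // k] for i in range(k)]


if __name__ == "__main__":
    values = [float(v) for v in range(1, 11)]
    print(isometric_sample(values, k=5))
```

For ten points and `k` set to 5, this rule keeps every second point starting from the first one.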
- -Input series is the same as above; the SQL for query is shown below: - -```sql -select sample(s1,'method'='isometric','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="isometric", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - -### Segment - -#### Registration statement - -```sql -create function segment as 'org.apache.iotdb.library.dprofile.UDTFSegment' -``` - -#### Usage - -This function is used to segment a time series into subsequences according to linear trend, and returns the linearly fitted value of either the first point in each subsequence or every data point. - -**Name:** SEGMENT - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `output`: "all" to output all fitted points; "first" to output the first fitted point in each subsequence. - -+ `error`: The error allowed in linear regression. It is defined as the mean absolute error of a subsequence. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note:** This function treats the input series as sampled at equal intervals. All data are loaded, so downsample the input series first if there are too many data points.
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:00.300+08:00| 2.0| -|1970-01-01T08:00:00.400+08:00| 3.0| -|1970-01-01T08:00:00.500+08:00| 4.0| -|1970-01-01T08:00:00.600+08:00| 5.0| -|1970-01-01T08:00:00.700+08:00| 6.0| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 8.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:01.100+08:00| 9.1| -|1970-01-01T08:00:01.200+08:00| 9.2| -|1970-01-01T08:00:01.300+08:00| 9.3| -|1970-01-01T08:00:01.400+08:00| 9.4| -|1970-01-01T08:00:01.500+08:00| 9.5| -|1970-01-01T08:00:01.600+08:00| 9.6| -|1970-01-01T08:00:01.700+08:00| 9.7| -|1970-01-01T08:00:01.800+08:00| 9.8| -|1970-01-01T08:00:01.900+08:00| 9.9| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:02.100+08:00| 8.0| -|1970-01-01T08:00:02.200+08:00| 6.0| -|1970-01-01T08:00:02.300+08:00| 4.0| -|1970-01-01T08:00:02.400+08:00| 2.0| -|1970-01-01T08:00:02.500+08:00| 0.0| -|1970-01-01T08:00:02.600+08:00| -2.0| -|1970-01-01T08:00:02.700+08:00| -4.0| -|1970-01-01T08:00:02.800+08:00| -6.0| -|1970-01-01T08:00:02.900+08:00| -8.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.100+08:00| 10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -|1970-01-01T08:00:03.300+08:00| 10.0| -|1970-01-01T08:00:03.400+08:00| 10.0| -|1970-01-01T08:00:03.500+08:00| 10.0| -|1970-01-01T08:00:03.600+08:00| 10.0| -|1970-01-01T08:00:03.700+08:00| 10.0| -|1970-01-01T08:00:03.800+08:00| 10.0| -|1970-01-01T08:00:03.900+08:00| 10.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select segment(s1, "error"="0.1") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|segment(root.test.s1, "error"="0.1")| 
-+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -+-----------------------------+------------------------------------+ -``` - -### Skew - -#### Registration statement - -```sql -create function skew as 'org.apache.iotdb.library.dprofile.UDAFSkew' -``` - -#### Usage - -This function is used to calculate the population skewness. - -**Name:** SKEW - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the population skewness. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 10.0| -|2020-01-01T00:00:11.000+08:00| 10.0| -|2020-01-01T00:00:12.000+08:00| 10.0| -|2020-01-01T00:00:13.000+08:00| 10.0| -|2020-01-01T00:00:14.000+08:00| 10.0| -|2020-01-01T00:00:15.000+08:00| 10.0| -|2020-01-01T00:00:16.000+08:00| 10.0| -|2020-01-01T00:00:17.000+08:00| 10.0| -|2020-01-01T00:00:18.000+08:00| 10.0| -|2020-01-01T00:00:19.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select skew(s1) from 
root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time| skew(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| -0.9998427402292644| -+-----------------------------+-----------------------+ -``` - -### Spline - -#### Registration statement - -```sql -create function spline as 'org.apache.iotdb.library.dprofile.UDTFSpline' -``` - -#### Usage - -This function is used to calculate cubic spline interpolation of input series. - -**Name:** SPLINE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `points`: Number of resampling points. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note**: Output series retains the first and last timestamps of input series. Interpolation points are selected at equal intervals. The function tries to calculate only when there are no less than 4 points in input series. - -#### Examples - -##### Assigning number of interpolation points - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select spline(s1, "points"="151") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|spline(root.test.s1, "points"="151")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.010+08:00| 
0.04870000251134237| -|1970-01-01T08:00:00.020+08:00| 0.09680000495910646| -|1970-01-01T08:00:00.030+08:00| 0.14430000734329226| -|1970-01-01T08:00:00.040+08:00| 0.19120000966389972| -|1970-01-01T08:00:00.050+08:00| 0.23750001192092896| -|1970-01-01T08:00:00.060+08:00| 0.2832000141143799| -|1970-01-01T08:00:00.070+08:00| 0.32830001624425253| -|1970-01-01T08:00:00.080+08:00| 0.3728000183105469| -|1970-01-01T08:00:00.090+08:00| 0.416700020313263| -|1970-01-01T08:00:00.100+08:00| 0.4600000222524008| -|1970-01-01T08:00:00.110+08:00| 0.5027000241279602| -|1970-01-01T08:00:00.120+08:00| 0.5448000259399414| -|1970-01-01T08:00:00.130+08:00| 0.5863000276883443| -|1970-01-01T08:00:00.140+08:00| 0.627200029373169| -|1970-01-01T08:00:00.150+08:00| 0.6675000309944153| -|1970-01-01T08:00:00.160+08:00| 0.7072000325520833| -|1970-01-01T08:00:00.170+08:00| 0.7463000340461731| -|1970-01-01T08:00:00.180+08:00| 0.7848000354766846| -|1970-01-01T08:00:00.190+08:00| 0.8227000368436178| -|1970-01-01T08:00:00.200+08:00| 0.8600000381469728| -|1970-01-01T08:00:00.210+08:00| 0.8967000393867494| -|1970-01-01T08:00:00.220+08:00| 0.9328000405629477| -|1970-01-01T08:00:00.230+08:00| 0.9683000416755676| -|1970-01-01T08:00:00.240+08:00| 1.0032000427246095| -|1970-01-01T08:00:00.250+08:00| 1.037500043710073| -|1970-01-01T08:00:00.260+08:00| 1.071200044631958| -|1970-01-01T08:00:00.270+08:00| 1.1043000454902647| -|1970-01-01T08:00:00.280+08:00| 1.1368000462849934| -|1970-01-01T08:00:00.290+08:00| 1.1687000470161437| -|1970-01-01T08:00:00.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:00.310+08:00| 1.2307000483103594| -|1970-01-01T08:00:00.320+08:00| 1.2608000489139557| -|1970-01-01T08:00:00.330+08:00| 1.2903000494873524| -|1970-01-01T08:00:00.340+08:00| 1.3192000500233967| -|1970-01-01T08:00:00.350+08:00| 1.3475000505149364| -|1970-01-01T08:00:00.360+08:00| 1.3752000509548186| -|1970-01-01T08:00:00.370+08:00| 1.402300051335891| -|1970-01-01T08:00:00.380+08:00| 1.4288000516510009| 
-|1970-01-01T08:00:00.390+08:00| 1.4547000518929958| -|1970-01-01T08:00:00.400+08:00| 1.480000052054723| -|1970-01-01T08:00:00.410+08:00| 1.5047000521290301| -|1970-01-01T08:00:00.420+08:00| 1.5288000521087646| -|1970-01-01T08:00:00.430+08:00| 1.5523000519867738| -|1970-01-01T08:00:00.440+08:00| 1.575200051755905| -|1970-01-01T08:00:00.450+08:00| 1.597500051409006| -|1970-01-01T08:00:00.460+08:00| 1.619200050938924| -|1970-01-01T08:00:00.470+08:00| 1.6403000503385066| -|1970-01-01T08:00:00.480+08:00| 1.660800049600601| -|1970-01-01T08:00:00.490+08:00| 1.680700048718055| -|1970-01-01T08:00:00.500+08:00| 1.7000000476837158| -|1970-01-01T08:00:00.510+08:00| 1.7188475466453037| -|1970-01-01T08:00:00.520+08:00| 1.7373800457262996| -|1970-01-01T08:00:00.530+08:00| 1.7555825448831923| -|1970-01-01T08:00:00.540+08:00| 1.7734400440724702| -|1970-01-01T08:00:00.550+08:00| 1.790937543250622| -|1970-01-01T08:00:00.560+08:00| 1.8080600423741364| -|1970-01-01T08:00:00.570+08:00| 1.8247925413995016| -|1970-01-01T08:00:00.580+08:00| 1.8411200402832066| -|1970-01-01T08:00:00.590+08:00| 1.8570275389817397| -|1970-01-01T08:00:00.600+08:00| 1.8725000374515897| -|1970-01-01T08:00:00.610+08:00| 1.8875225356492449| -|1970-01-01T08:00:00.620+08:00| 1.902080033531194| -|1970-01-01T08:00:00.630+08:00| 1.9161575310539258| -|1970-01-01T08:00:00.640+08:00| 1.9297400281739288| -|1970-01-01T08:00:00.650+08:00| 1.9428125248476913| -|1970-01-01T08:00:00.660+08:00| 1.9553600210317021| -|1970-01-01T08:00:00.670+08:00| 1.96736751668245| -|1970-01-01T08:00:00.680+08:00| 1.9788200117564232| -|1970-01-01T08:00:00.690+08:00| 1.9897025062101101| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.710+08:00| 2.0097024933913334| -|1970-01-01T08:00:00.720+08:00| 2.0188199867081615| -|1970-01-01T08:00:00.730+08:00| 2.027367479995188| -|1970-01-01T08:00:00.740+08:00| 2.0353599732971155| -|1970-01-01T08:00:00.750+08:00| 2.0428124666586482| -|1970-01-01T08:00:00.760+08:00| 2.049739960124489| 
-|1970-01-01T08:00:00.770+08:00| 2.056157453739342| -|1970-01-01T08:00:00.780+08:00| 2.06207994754791| -|1970-01-01T08:00:00.790+08:00| 2.067522441594897| -|1970-01-01T08:00:00.800+08:00| 2.072499935925006| -|1970-01-01T08:00:00.810+08:00| 2.07702743058294| -|1970-01-01T08:00:00.820+08:00| 2.081119925613404| -|1970-01-01T08:00:00.830+08:00| 2.0847924210611| -|1970-01-01T08:00:00.840+08:00| 2.0880599169707317| -|1970-01-01T08:00:00.850+08:00| 2.0909374133870027| -|1970-01-01T08:00:00.860+08:00| 2.0934399103546166| -|1970-01-01T08:00:00.870+08:00| 2.0955824079182768| -|1970-01-01T08:00:00.880+08:00| 2.0973799061226863| -|1970-01-01T08:00:00.890+08:00| 2.098847405012549| -|1970-01-01T08:00:00.900+08:00| 2.0999999046325684| -|1970-01-01T08:00:00.910+08:00| 2.1005574051201332| -|1970-01-01T08:00:00.920+08:00| 2.1002599065303778| -|1970-01-01T08:00:00.930+08:00| 2.0991524087846245| -|1970-01-01T08:00:00.940+08:00| 2.0972799118041947| -|1970-01-01T08:00:00.950+08:00| 2.0946874155104105| -|1970-01-01T08:00:00.960+08:00| 2.0914199198245944| -|1970-01-01T08:00:00.970+08:00| 2.0875224246680673| -|1970-01-01T08:00:00.980+08:00| 2.083039929962151| -|1970-01-01T08:00:00.990+08:00| 2.0780174356281687| -|1970-01-01T08:00:01.000+08:00| 2.0724999415874406| -|1970-01-01T08:00:01.010+08:00| 2.06653244776129| -|1970-01-01T08:00:01.020+08:00| 2.060159954071038| -|1970-01-01T08:00:01.030+08:00| 2.053427460438006| -|1970-01-01T08:00:01.040+08:00| 2.046379966783517| -|1970-01-01T08:00:01.050+08:00| 2.0390624730288924| -|1970-01-01T08:00:01.060+08:00| 2.031519979095454| -|1970-01-01T08:00:01.070+08:00| 2.0237974849045237| -|1970-01-01T08:00:01.080+08:00| 2.015939990377423| -|1970-01-01T08:00:01.090+08:00| 2.0079924954354746| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.110+08:00| 1.9907018211101906| -|1970-01-01T08:00:01.120+08:00| 1.9788509124245144| -|1970-01-01T08:00:01.130+08:00| 1.9645127287932083| -|1970-01-01T08:00:01.140+08:00| 1.9477527250665083| 
-|1970-01-01T08:00:01.150+08:00| 1.9286363560946513| -|1970-01-01T08:00:01.160+08:00| 1.9072290767278735| -|1970-01-01T08:00:01.170+08:00| 1.8835963418164114| -|1970-01-01T08:00:01.180+08:00| 1.8578036062105014| -|1970-01-01T08:00:01.190+08:00| 1.8299163247603802| -|1970-01-01T08:00:01.200+08:00| 1.7999999523162842| -|1970-01-01T08:00:01.210+08:00| 1.7623635841923329| -|1970-01-01T08:00:01.220+08:00| 1.7129696477516976| -|1970-01-01T08:00:01.230+08:00| 1.6543635959181928| -|1970-01-01T08:00:01.240+08:00| 1.5890908816156328| -|1970-01-01T08:00:01.250+08:00| 1.5196969577678319| -|1970-01-01T08:00:01.260+08:00| 1.4487272772986044| -|1970-01-01T08:00:01.270+08:00| 1.3787272931317647| -|1970-01-01T08:00:01.280+08:00| 1.3122424581911272| -|1970-01-01T08:00:01.290+08:00| 1.251818225400506| -|1970-01-01T08:00:01.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:01.310+08:00| 1.1548000470995912| -|1970-01-01T08:00:01.320+08:00| 1.1130667107899999| -|1970-01-01T08:00:01.330+08:00| 1.0756000393033045| -|1970-01-01T08:00:01.340+08:00| 1.043200033187868| -|1970-01-01T08:00:01.350+08:00| 1.016666692992053| -|1970-01-01T08:00:01.360+08:00| 0.9968000192642223| -|1970-01-01T08:00:01.370+08:00| 0.9844000125527389| -|1970-01-01T08:00:01.380+08:00| 0.9802666734059655| -|1970-01-01T08:00:01.390+08:00| 0.9852000023722649| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.410+08:00| 1.023999999165535| -|1970-01-01T08:00:01.420+08:00| 1.0559999990463256| -|1970-01-01T08:00:01.430+08:00| 1.0959999996423722| -|1970-01-01T08:00:01.440+08:00| 1.1440000009536744| -|1970-01-01T08:00:01.450+08:00| 1.2000000029802322| -|1970-01-01T08:00:01.460+08:00| 1.264000005722046| -|1970-01-01T08:00:01.470+08:00| 1.3360000091791153| -|1970-01-01T08:00:01.480+08:00| 1.4160000133514405| -|1970-01-01T08:00:01.490+08:00| 1.5040000182390214| -|1970-01-01T08:00:01.500+08:00| 1.600000023841858| -+-----------------------------+------------------------------------+ -``` - -### Spread - -#### Registration 
statement - -```sql -create function spread as 'org.apache.iotdb.library.dprofile.UDAFSpread' -``` - -#### Usage - -This function is used to calculate the spread of time series, that is, the maximum value minus the minimum value. - -**Name:** SPREAD - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is the same as the input. There is only one data point in the series, whose timestamp is 0 and value is the spread. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select spread(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time|spread(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 26.0| -+-----------------------------+-----------------------+ -``` - - - -### ZScore - -#### Registration statement - -```sql -create function zscore as 'org.apache.iotdb.library.dprofile.UDTFZScore' -``` - -#### Usage - -This function is used to standardize the 
input series with z-score. - -**Name:** ZSCORE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `compute`: When set to "batch", anomaly test is conducted after importing all data points; when set to "stream", it is required to provide mean and standard deviation. The default method is "batch". -+ `avg`: Mean value when method is set to "stream". -+ `sd`: Standard deviation when method is set to "stream". - -**Output Series:** Output a single series. The type is DOUBLE. - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select zscore(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|zscore(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.200+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.300+08:00| 0.20672455764868078| -|1970-01-01T08:00:00.400+08:00| -0.6201736729460423| 
-|1970-01-01T08:00:00.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.700+08:00| -1.033622788243404| -|1970-01-01T08:00:00.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:00.900+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.000+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.100+08:00| 0.20672455764868078| -|1970-01-01T08:00:01.200+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.300+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.400+08:00| 0.20672455764868078| -|1970-01-01T08:00:01.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.700+08:00| 3.9277665953249348| -|1970-01-01T08:00:01.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:01.900+08:00| -1.033622788243404| -|1970-01-01T08:00:02.000+08:00|-0.20672455764868078| -+-----------------------------+--------------------+ -``` - - -## Anomaly Detection - -### IQR - -#### Registration statement - -```sql -create function iqr as 'org.apache.iotdb.library.anomaly.UDTFIQR' -``` - -#### Usage - -This function is used to detect anomalies based on IQR. Points distributing beyond 1.5 times IQR are selected. - -**Name:** IQR - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `method`: When set to "batch", anomaly test is conducted after importing all data points; when set to "stream", it is required to provide upper and lower quantiles. The default method is "batch". -+ `q1`: The lower quantile when method is set to "stream". -+ `q3`: The upper quantile when method is set to "stream". - -**Output Series:** Output a single series. The type is DOUBLE. 
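The 1.5 × IQR rule described above can be sketched in a few lines of Python. This is only an illustration, not the UDF's implementation: the function name `iqr_anomalies` and the "inclusive" quartile convention are assumptions, and passing `q1`/`q3` explicitly merely mimics the idea of the "stream" method.

```python
from statistics import quantiles


def iqr_anomalies(values, q1=None, q3=None):
    """Return the values lying more than 1.5 * IQR below Q1 or above Q3."""
    if q1 is None or q3 is None:
        # Batch-style: estimate the quartiles from the data itself.
        # The interpolation convention ("inclusive") is an assumption.
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]


# Values of root.test.s1 from the batch example in this section.
data = [0, 0, 1, -1, 0, 0, -2, 2, 0, 0, 1, -1, -1, 1, 0, 0, 10, 2, -2, 0]
batch_result = iqr_anomalies(data)
stream_result = iqr_anomalies(data, q1=-1.0, q3=1.0)  # caller-supplied quartiles
print(batch_result, stream_result)
```

With this data only the value 10 falls outside the fences, whichever of the two quartile sources is used.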
- -**Note:** $IQR=Q_3-Q_1$ - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select iqr(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+-----------------+ -| Time|iqr(root.test.s1)| -+-----------------------------+-----------------+ -|1970-01-01T08:00:01.700+08:00| 10.0| -+-----------------------------+-----------------+ -``` - -### KSigma - -#### Registration statement - -```sql -create function ksigma as 'org.apache.iotdb.library.anomaly.UDTFKSigma' -``` - -#### Usage - -This function is used to detect anomalies based on the Dynamic K-Sigma Algorithm. -Within a sliding window, the input value with a deviation of more than k times the standard deviation from the average will be output as anomaly. - -**Name:** KSIGMA - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `k`: How many times to multiply on standard deviation to define anomaly, the default value is 3. 
-+ `window`: The window size of Dynamic K-Sigma Algorithm, the default value is 10000. - -**Output Series:** Output a single series. The type is the same as the input series. - -**Note:** Only when `k` is larger than 0, the anomaly detection will be performed. Otherwise, nothing will be output. - -#### Examples - -##### Assigning k - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:04.000+08:00| 100.0| -|2020-01-01T00:00:06.000+08:00| 150.0| -|2020-01-01T00:00:08.000+08:00| 200.0| -|2020-01-01T00:00:10.000+08:00| 200.0| -|2020-01-01T00:00:14.000+08:00| 200.0| -|2020-01-01T00:00:15.000+08:00| 200.0| -|2020-01-01T00:00:16.000+08:00| 200.0| -|2020-01-01T00:00:18.000+08:00| 200.0| -|2020-01-01T00:00:20.000+08:00| 150.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ksigma(s1,"k"="1.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -|Time |ksigma(root.test.d1.s1,"k"="1.0")| -+-----------------------------+---------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -+-----------------------------+---------------------------------+ -``` - -### LOF - -#### Registration statement - -```sql -create function LOF as 'org.apache.iotdb.library.anomaly.UDTFLOF' -``` - -#### Usage - -This function is used to detect density anomalies in time series. 
Based on the k-th distance, the function computes the local outlier factor (LOF) of each set of input values, so that density anomalies can be identified from the LOF values in the output. - -**Name:** LOF - -**Input Series:** Multiple input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `method`: The detection method. The default value is "default", used when the input data has multiple dimensions. The alternative is "series", in which a single input series is transformed into a higher-dimensional one. -+ `k`: The k-th distance is used to calculate the LOF. The default value is 3. -+ `window`: The size of the window used to split the original data points. The default value is 10000. -+ `windowsize`: The dimension the series is transformed into when `method` is "series". The default value is 5. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note:** Incomplete rows will be ignored. They are neither calculated nor marked as anomaly. - -#### Examples - -##### Using default parameters - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| 1.0| -|1970-01-01T08:00:00.300+08:00| 1.0| 1.0| -|1970-01-01T08:00:00.400+08:00| 1.0| 0.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -1.0| -|1970-01-01T08:00:00.600+08:00| -1.0| -1.0| -|1970-01-01T08:00:00.700+08:00| -1.0| 0.0| -|1970-01-01T08:00:00.800+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1,s2) from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|lof(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.100+08:00| 3.8274824267668244| 
-|1970-01-01T08:00:00.200+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.300+08:00| 2.838155437762879| -|1970-01-01T08:00:00.400+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.500+08:00| 2.73518261244453| -|1970-01-01T08:00:00.600+08:00| 2.371440975708148| -|1970-01-01T08:00:00.700+08:00| 2.73518261244453| -|1970-01-01T08:00:00.800+08:00| 1.7561416374270742| -+-----------------------------+-------------------------------------+ -``` - -##### Diagnosing 1d timeseries - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 1.0| -|1970-01-01T08:00:00.200+08:00| 2.0| -|1970-01-01T08:00:00.300+08:00| 3.0| -|1970-01-01T08:00:00.400+08:00| 4.0| -|1970-01-01T08:00:00.500+08:00| 5.0| -|1970-01-01T08:00:00.600+08:00| 6.0| -|1970-01-01T08:00:00.700+08:00| 7.0| -|1970-01-01T08:00:00.800+08:00| 8.0| -|1970-01-01T08:00:00.900+08:00| 9.0| -|1970-01-01T08:00:01.000+08:00| 10.0| -|1970-01-01T08:00:01.100+08:00| 11.0| -|1970-01-01T08:00:01.200+08:00| 12.0| -|1970-01-01T08:00:01.300+08:00| 13.0| -|1970-01-01T08:00:01.400+08:00| 14.0| -|1970-01-01T08:00:01.500+08:00| 15.0| -|1970-01-01T08:00:01.600+08:00| 16.0| -|1970-01-01T08:00:01.700+08:00| 17.0| -|1970-01-01T08:00:01.800+08:00| 18.0| -|1970-01-01T08:00:01.900+08:00| 19.0| -|1970-01-01T08:00:02.000+08:00| 20.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1, "method"="series") from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|lof(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 3.77777777777778| -|1970-01-01T08:00:00.200+08:00| 4.32727272727273| -|1970-01-01T08:00:00.300+08:00| 4.85714285714286| -|1970-01-01T08:00:00.400+08:00| 5.40909090909091| -|1970-01-01T08:00:00.500+08:00| 5.94999999999999| -|1970-01-01T08:00:00.600+08:00| 
 6.43243243243243| -|1970-01-01T08:00:00.700+08:00| 6.79999999999999| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 7.0| -|1970-01-01T08:00:01.000+08:00| 6.79999999999999| -|1970-01-01T08:00:01.100+08:00| 6.43243243243243| -|1970-01-01T08:00:01.200+08:00| 5.94999999999999| -|1970-01-01T08:00:01.300+08:00| 5.40909090909091| -|1970-01-01T08:00:01.400+08:00| 4.85714285714286| -|1970-01-01T08:00:01.500+08:00| 4.32727272727273| -|1970-01-01T08:00:01.600+08:00| 3.77777777777778| -+-----------------------------+--------------------+ -``` - -### MissDetect - -#### Registration statement - -```sql -create function missdetect as 'org.apache.iotdb.library.anomaly.UDTFMissDetect' -``` - -#### Usage - -This function is used to detect missing anomalies. -In some datasets, missing values are filled by linear interpolation. -Thus, there are several long perfect linear segments. -By discovering these perfect linear segments, -missing anomalies are detected. - -**Name:** MISSDETECT - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -`minlen`: The minimum length of the detected missing anomalies, which is an integer greater than or equal to 10. By default, it is 10. - -**Output Series:** Output a single series. The type is BOOLEAN. Each data point that belongs to a missing anomaly is labeled as true. 
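The idea of flagging long, perfectly linear segments can be sketched as follows. This is a simplified illustration rather than the UDF's actual algorithm: it assumes evenly spaced timestamps, compares slopes with exact equality, and the function name `linear_segment_flags` is hypothetical.

```python
def linear_segment_flags(values, minlen=10):
    """Flag every point that lies on a perfectly linear run of >= minlen points."""
    n = len(values)
    flags = [False] * n
    start = 0  # index of the first point of the current linear run
    for i in range(2, n + 1):
        # The run continues while the step between neighbours stays constant.
        if i < n and values[i] - values[i - 1] == values[i - 1] - values[i - 2]:
            continue
        # The run covers values[start:i]; flag it if it is long enough.
        if i - start >= minlen:
            for j in range(start, i):
                flags[j] = True
        start = i - 1  # the last point of this run starts the next one
    return flags


# root.test.d2.s2 from the example in this section: a flat run of 12 zeros
# surrounded by an alternating 0/1 pattern.
series = [0, 1, 0, 1] + [0] * 12 + [1, 0, 1, 0, 1]
flags = linear_segment_flags(series, minlen=10)
print(flags)
```

On this series the 12 points of the flat run are flagged and the alternating points are not, which mirrors the `true`/`false` pattern of the example output.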
- -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 0.0| -|2021-07-01T12:00:01.000+08:00| 1.0| -|2021-07-01T12:00:02.000+08:00| 0.0| -|2021-07-01T12:00:03.000+08:00| 1.0| -|2021-07-01T12:00:04.000+08:00| 0.0| -|2021-07-01T12:00:05.000+08:00| 0.0| -|2021-07-01T12:00:06.000+08:00| 0.0| -|2021-07-01T12:00:07.000+08:00| 0.0| -|2021-07-01T12:00:08.000+08:00| 0.0| -|2021-07-01T12:00:09.000+08:00| 0.0| -|2021-07-01T12:00:10.000+08:00| 0.0| -|2021-07-01T12:00:11.000+08:00| 0.0| -|2021-07-01T12:00:12.000+08:00| 0.0| -|2021-07-01T12:00:13.000+08:00| 0.0| -|2021-07-01T12:00:14.000+08:00| 0.0| -|2021-07-01T12:00:15.000+08:00| 0.0| -|2021-07-01T12:00:16.000+08:00| 1.0| -|2021-07-01T12:00:17.000+08:00| 0.0| -|2021-07-01T12:00:18.000+08:00| 1.0| -|2021-07-01T12:00:19.000+08:00| 0.0| -|2021-07-01T12:00:20.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select missdetect(s2,'minlen'='10') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------+ -| Time|missdetect(root.test.d2.s2, "minlen"="10")| -+-----------------------------+------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| false| -|2021-07-01T12:00:01.000+08:00| false| -|2021-07-01T12:00:02.000+08:00| false| -|2021-07-01T12:00:03.000+08:00| false| -|2021-07-01T12:00:04.000+08:00| true| -|2021-07-01T12:00:05.000+08:00| true| -|2021-07-01T12:00:06.000+08:00| true| -|2021-07-01T12:00:07.000+08:00| true| -|2021-07-01T12:00:08.000+08:00| true| -|2021-07-01T12:00:09.000+08:00| true| -|2021-07-01T12:00:10.000+08:00| true| -|2021-07-01T12:00:11.000+08:00| true| -|2021-07-01T12:00:12.000+08:00| true| -|2021-07-01T12:00:13.000+08:00| true| -|2021-07-01T12:00:14.000+08:00| true| -|2021-07-01T12:00:15.000+08:00| true| -|2021-07-01T12:00:16.000+08:00| 
 false| -|2021-07-01T12:00:17.000+08:00| false| -|2021-07-01T12:00:18.000+08:00| false| -|2021-07-01T12:00:19.000+08:00| false| -|2021-07-01T12:00:20.000+08:00| false| -+-----------------------------+------------------------------------------+ -``` - -### Range - -#### Registration statement - -```sql -create function range as 'org.apache.iotdb.library.anomaly.UDTFRange' -``` - -#### Usage - -This function is used to detect range anomalies in time series. According to the upper bound and lower bound parameters, the function judges whether an input value is out of range (a range anomaly), and outputs a new time series containing the anomalies. - -**Name:** RANGE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `lower_bound`: lower bound of range anomaly detection. -+ `upper_bound`: upper bound of range anomaly detection. - -**Output Series:** Output a single series. The type is the same as the input. - -**Note:** Only when `upper_bound` is larger than `lower_bound`, the anomaly detection will be performed. Otherwise, nothing will be output. 
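As a rough illustration of the rule above, the following Python sketch keeps only the points that fall outside the given bounds. The function name `range_anomalies` and the integer timestamps are assumptions made for the example; treating the bounds as inclusive is inferred from the example in this section, where 101.0 is not reported.

```python
import math


def range_anomalies(points, lower_bound, upper_bound):
    """Return the (time, value) pairs whose value lies outside [lower_bound, upper_bound]."""
    if upper_bound <= lower_bound:
        return []  # no detection is performed, so nothing is output
    # NaN compares false against both bounds, so it is never reported.
    return [(t, v) for t, v in points if v < lower_bound or v > upper_bound]


values = [100.0, 101.0, 102.0, 104.0, 126.0, 108.0, 112.0, 113.0,
          114.0, 116.0, 118.0, 120.0, 124.0, 126.0, math.nan]
points = list(enumerate(values))  # hypothetical integer timestamps
anomalies = range_anomalies(points, 101.0, 125.0)
print([v for _, v in anomalies])
```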
- - - -#### Examples - -##### Assigning Lower and Upper Bound - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select range(s1,"lower_bound"="101.0","upper_bound"="125.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------------------+ -|Time |range(root.test.d1.s1,"lower_bound"="101.0","upper_bound"="125.0")| -+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -+-----------------------------+------------------------------------------------------------------+ -``` - -### TwoSidedFilter - -#### Registration statement - -```sql -create function twosidedfilter as 'org.apache.iotdb.library.anomaly.UDTFTwoSidedFilter' -``` - -#### Usage - -The function is used to filter anomalies of a numeric time series based on two-sided window detection. - -**Name:** TWOSIDEDFILTER - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE - -**Output Series:** Output a single series. 
The type is the same as the input. It is the input without anomalies. - -**Parameter:** - -- `len`: The size of the window, which is a positive integer. By default, it's 5. When `len`=3, the algorithm detects forward window and backward window with length 3 and calculates the outlierness of the current point. - -- `threshold`: The threshold of outlierness, which is a floating number in (0,1). By default, it's 0.3. The strict standard of detecting anomalies is in proportion to the threshold. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:00:37.000+08:00| 1484.0| -|1970-01-01T08:00:38.000+08:00| 1055.0| -|1970-01-01T08:00:39.000+08:00| 1050.0| -|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select TwoSidedFilter(s0, 'len'='5', 'threshold'='0.3') from root.test -``` - -Output series: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| 
-+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -### Outlier - -#### Registration statement - -```sql -create function outlier as 'org.apache.iotdb.library.anomaly.UDTFOutlier' -``` - -#### Usage - -This function is used to detect distance-based outliers. For each point in the current window, if the number of its neighbors within the neighbor distance threshold is less than the neighbor count threshold, the point is detected as an outlier. - -**Name:** OUTLIER - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `r`: the neighbor distance threshold. -+ `k`: the neighbor count threshold. -+ `w`: the window size. -+ `s`: the slide size. - -**Output Series:** Output a single series. The type is the same as the input. 
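The sliding-window rule above can be sketched in Python as shown below. Several details are assumptions for the sake of illustration and are not confirmed by this document: windows are counted in points, a point is not its own neighbor, "within" means a distance less than or equal to `r`, and a trailing partial window is skipped. The function name `distance_outliers` is hypothetical.

```python
def distance_outliers(values, r, k, w, s):
    """Return values that have fewer than k neighbours within distance r
    in at least one sliding window of w points (slide step s)."""
    outlier_idx = set()
    last_start = max(len(values) - w, 0)
    for start in range(0, last_start + 1, s):
        window = range(start, min(start + w, len(values)))
        for i in window:
            neighbours = sum(
                1 for j in window
                if j != i and abs(values[i] - values[j]) <= r
            )
            if neighbours < k:
                outlier_idx.add(i)
    # Report each outlier once, in time order.
    return [values[i] for i in sorted(outlier_idx)]


# root.test.s1 from the example in this section.
s1 = [56.0, 55.1, 54.2, 56.3, 59.0, 60.0, 60.5, 64.5, 69.0, 64.2,
      62.3, 58.0, 58.9, 52.0, 62.3, 61.0, 64.2, 61.8, 64.0, 63.0]
found = distance_outliers(s1, r=5.0, k=4, w=10, s=5)
print(found)
```

Under these assumptions the sketch reports 69.0 and 52.0 for the example parameters, the same two values that appear in the example output.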
- -#### Examples - -##### Assigning Parameters of Queries - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|2020-01-04T23:59:55.000+08:00| 56.0| -|2020-01-04T23:59:56.000+08:00| 55.1| -|2020-01-04T23:59:57.000+08:00| 54.2| -|2020-01-04T23:59:58.000+08:00| 56.3| -|2020-01-04T23:59:59.000+08:00| 59.0| -|2020-01-05T00:00:00.000+08:00| 60.0| -|2020-01-05T00:00:01.000+08:00| 60.5| -|2020-01-05T00:00:02.000+08:00| 64.5| -|2020-01-05T00:00:03.000+08:00| 69.0| -|2020-01-05T00:00:04.000+08:00| 64.2| -|2020-01-05T00:00:05.000+08:00| 62.3| -|2020-01-05T00:00:06.000+08:00| 58.0| -|2020-01-05T00:00:07.000+08:00| 58.9| -|2020-01-05T00:00:08.000+08:00| 52.0| -|2020-01-05T00:00:09.000+08:00| 62.3| -|2020-01-05T00:00:10.000+08:00| 61.0| -|2020-01-05T00:00:11.000+08:00| 64.2| -|2020-01-05T00:00:12.000+08:00| 61.8| -|2020-01-05T00:00:13.000+08:00| 64.0| -|2020-01-05T00:00:14.000+08:00| 63.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|outlier(root.test.s1,"r"="5.0","k"="4","w"="10","s"="5")| -+-----------------------------+--------------------------------------------------------+ -|2020-01-05T00:00:03.000+08:00| 69.0| -+-----------------------------+--------------------------------------------------------+ -|2020-01-05T00:00:08.000+08:00| 52.0| -+-----------------------------+--------------------------------------------------------+ -``` - -## Frequency Domain Analysis - -### Conv - -#### Registration statement - -```sql -create function conv as 'org.apache.iotdb.library.frequency.UDTFConv' -``` - -#### Usage - -This function is used to calculate the convolution, i.e. polynomial multiplication. - -**Name:** CONV - -**Input:** Only support two input series. 
The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output:** Output a single series. The type is DOUBLE. It is the result of convolution whose timestamps starting from 0 only indicate the order. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| 7.0| -|1970-01-01T08:00:00.001+08:00| 0.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select conv(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------+ -| Time|conv(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 7.0| -|1970-01-01T08:00:00.001+08:00| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| -|1970-01-01T08:00:00.003+08:00| 2.0| -+-----------------------------+--------------------------------------+ -``` - -### Deconv - -#### Registration statement - -```sql -create function deconv as 'org.apache.iotdb.library.frequency.UDTFDeconv' -``` - -#### Usage - -This function is used to calculate the deconvolution, i.e. polynomial division. - -**Name:** DECONV - -**Input:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `result`: The result of deconvolution, which is 'quotient' or 'remainder'. By default, the quotient will be output. - -**Output:** Output a single series. The type is DOUBLE. It is the result of deconvolving the second series from the first series (dividing the first series by the second series) whose timestamps starting from 0 only indicate the order. - -**Note:** `NaN` in the input series will be ignored. 
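Convolution and deconvolution of two coefficient sequences can be illustrated with plain polynomial arithmetic. The sketch below is not the UDF's code: the function names are hypothetical, the position of a value is taken as the power of the polynomial term (inferred from the examples in this section), and the divisor is assumed to be no longer than the dividend with a non-zero last coefficient.

```python
def conv(a, b):
    """Polynomial multiplication: out[i + j] accumulates a[i] * b[j]."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def deconv(u, v):
    """Polynomial long division, so that u == conv(quotient, v) + remainder."""
    remainder = [float(x) for x in u]
    quotient = [0.0] * (len(u) - len(v) + 1)
    # Eliminate the highest-order term first, as in schoolbook division.
    for k in range(len(quotient) - 1, -1, -1):
        quotient[k] = remainder[k + len(v) - 1] / v[-1]
        for j, y in enumerate(v):
            remainder[k + j] -= quotient[k] * y
    return quotient, remainder


product = conv([1.0, 0.0, 1.0], [7.0, 2.0])
quotient, remainder = deconv([8.0, 2.0, 7.0, 2.0], [7.0, 2.0])
print(product, quotient, remainder)
```

The three results correspond to the Conv example, the Deconv quotient example, and the Deconv remainder example respectively.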
- -#### Examples - - -##### Calculate the quotient - -When `result` is 'quotient' or the default, this function calculates the quotient of the deconvolution. - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s3|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 8.0| 7.0| -|1970-01-01T08:00:00.001+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| null| -|1970-01-01T08:00:00.003+08:00| 2.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select deconv(s3,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2)| -+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - -##### Calculate the remainder - -When `result` is 'remainder', this function calculates the remainder of the deconvolution. 
- -Input series is the same as above, the SQL for query is shown below: - - -```sql -select deconv(s3,s2,'result'='remainder') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2, "result"="remainder")| -+-----------------------------+--------------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 0.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -+-----------------------------+--------------------------------------------------------------+ -``` - -### DWT - -#### Registration statement - -```sql -create function dwt as 'org.apache.iotdb.library.frequency.UDTFDWT' -``` - -#### Usage - -This function is used to calculate 1d discrete wavelet transform of a numerical series. - -**Name:** DWT - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of wavelet. May select 'Haar', 'DB4', 'DB6', 'DB8', where DB means Daubechies. User may offer coefficients of wavelet transform and ignore this parameter. Case ignored. -+ `coef`: Coefficients of wavelet transform. When providing this parameter, use comma ',' to split them, and leave no spaces or other punctuations. -+ `layer`: Times to transform. The number of output vectors equals $layer+1$. Default is 1. - -**Output:** Output a single series. The type is DOUBLE. The length is the same as the input. - -**Note:** The length of input series must be an integer number power of 2. 
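For the Haar wavelet, one layer of the transform reduces to pairwise sums and differences scaled by 1/√2. The following Python sketch assumes that output layout (approximation coefficients first, detail coefficients second), which is inferred from the Haar example in this section; the function name `haar_dwt` is hypothetical and only a single layer is handled.

```python
import math


def haar_dwt(values):
    """One layer of the Haar DWT: approximation half followed by detail half."""
    n = len(values)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of 2 (and at least 2)")
    scale = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, n, 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, n, 2)]
    return approx + detail


signal = [0.0, 0.2, 1.5, 1.2, 0.6, 1.7, 0.8, 2.0,
          2.5, 2.1, 0.0, 2.0, 1.8, 1.2, 1.0, 1.6]
coeffs = haar_dwt(signal)
print(coeffs)
```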
- -#### Examples - - -##### Haar wavelet transform - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.100+08:00| 0.2| -|1970-01-01T08:00:00.200+08:00| 1.5| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.600+08:00| 0.8| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.800+08:00| 2.5| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select dwt(s1,"method"="haar") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|dwt(root.test.d1.s1, "method"="haar")| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.14142135834465192| -|1970-01-01T08:00:00.100+08:00| 1.909188342921157| -|1970-01-01T08:00:00.200+08:00| 1.6263456473052773| -|1970-01-01T08:00:00.300+08:00| 1.9798989957517026| -|1970-01-01T08:00:00.400+08:00| 3.252691126023161| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776479437628| -|1970-01-01T08:00:00.800+08:00| -0.14142135834465192| -|1970-01-01T08:00:00.900+08:00| 0.21213200063848547| -|1970-01-01T08:00:01.000+08:00| -0.7778174761639416| -|1970-01-01T08:00:01.100+08:00| -0.8485281289944873| -|1970-01-01T08:00:01.200+08:00| 0.2828427799095765| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426400127697095| -|1970-01-01T08:00:01.500+08:00| 
 -0.42426408557066786| -+-----------------------------+-------------------------------------+ -``` - - -### IDWT - -#### Registration statement - -```sql -create function idwt as 'org.apache.iotdb.library.frequency.UDTFIDWT' -``` - -#### Usage - -This function performs one-dimensional inverse discrete wavelet transform on the input series, reconstructing the original data from DWT decomposed wavelet coefficients. - -**Name:** IDWT - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of wavelet. May select 'Haar', 'DB4', 'DB6', 'DB8', where DB means Daubechies. User may offer coefficients of wavelet transform and ignore this parameter. Case ignored. -+ `coef`: Coefficients of wavelet transform. When providing this parameter, use comma ',' to split them, and leave no spaces or other punctuations. -+ `layer`: Times to transform. The number of output vectors equals $layer+1$. Default is 1. - -**Output:** Output a single series. The type is DOUBLE. The length is the same as the input. - -**Note:** -* The length of input series must be an integer number power of 2. -* The parameter settings of the IDWT function (method/coef/layer) should be consistent with the corresponding DWT transformation to correctly reconstruct the original data. -* Typically, the input of IDWT is the output result of the DWT function. 
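The reconstruction step can be illustrated for the Haar wavelet with one layer. The sketch assumes the input holds the approximation coefficients in its first half and the detail coefficients in its second half, a layout inferred from the examples in this section; the function name `haar_idwt` is hypothetical.

```python
import math


def haar_idwt(coeffs):
    """Invert one layer of the Haar DWT (approximation half, then detail half)."""
    n = len(coeffs)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of 2 (and at least 2)")
    half = n // 2
    scale = math.sqrt(2.0)
    restored = []
    for a, d in zip(coeffs[:half], coeffs[half:]):
        restored.append((a + d) / scale)  # first sample of the pair
        restored.append((a - d) / scale)  # second sample of the pair
    return restored


# Haar coefficients of the four samples 0.0, 0.2, 1.5, 1.2.
root2 = math.sqrt(2.0)
coeffs = [0.2 / root2, 2.7 / root2, -0.2 / root2, 0.3 / root2]
restored = haar_idwt(coeffs)
print(restored)
```

The result equals the original samples up to floating-point rounding, which is also why the example output in this section shows values such as 0.19999999999999998 instead of 0.2.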
- -#### Examples - -##### Haar wavelet transform - -Input series: - -``` -+-----------------------------+--------------------+ -| Time| root.test.d1.s2| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 0.1414213562373095| -|1970-01-01T08:00:00.100+08:00| 1.909188309203678| -|1970-01-01T08:00:00.200+08:00| 1.6263455967290592| -|1970-01-01T08:00:00.300+08:00| 1.979898987322333| -|1970-01-01T08:00:00.400+08:00| 3.2526911934581184| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776310850235| -|1970-01-01T08:00:00.800+08:00| -0.1414213562373095| -|1970-01-01T08:00:00.900+08:00| 0.21213203435596428| -|1970-01-01T08:00:01.000+08:00| -0.7778174593052022| -|1970-01-01T08:00:01.100+08:00| -0.8485281374238569| -|1970-01-01T08:00:01.200+08:00| 0.2828427124746189| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426406871192857| -|1970-01-01T08:00:01.500+08:00|-0.42426406871192857| -+-----------------------------+--------------------+ -``` - -SQL for query: - -```sql -select idwt(s2,"method"="haar") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------+ -| Time|idwt(root.test.d1.s2, "method"="haar")| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.100+08:00| 0.19999999999999998| -|1970-01-01T08:00:00.200+08:00| 1.4999999999999996| -|1970-01-01T08:00:00.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.6999999999999997| -|1970-01-01T08:00:00.600+08:00| 0.7999999999999998| -|1970-01-01T08:00:00.700+08:00| 1.9999999999999996| -|1970-01-01T08:00:00.800+08:00| 2.4999999999999996| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.9999999999999996| 
-|1970-01-01T08:00:01.200+08:00| 1.7999999999999998| -|1970-01-01T08:00:01.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:01.400+08:00| 0.9999999999999998| -|1970-01-01T08:00:01.500+08:00| 1.5999999999999999| -+-----------------------------+--------------------------------------+ -``` - - -### FFT - -#### Registration statement - -```sql -create function fft as 'org.apache.iotdb.library.frequency.UDTFFFT' -``` - -#### Usage - -This function is used to calculate the fast Fourier transform (FFT) of a numerical series. - -**Name:** FFT - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of FFT, which is 'uniform' (by default) or 'nonuniform'. If the value is 'uniform', the timestamps will be ignored and all data points will be regarded as equidistant. Thus, the equidistant fast Fourier transform algorithm will be applied. If the value is 'nonuniform' (TODO), the non-equidistant fast Fourier transform algorithm will be applied based on timestamps. -+ `result`: The result of FFT, which is 'real', 'imag', 'abs' or 'angle', corresponding to the real part, imaginary part, magnitude and phase angle. By default, the magnitude will be output. -+ `compress`: The parameter of compression, which is within (0,1]. It is the reserved energy ratio of lossy compression. By default, there is no compression. - - -**Output:** Output a single series. The type is DOUBLE. The length is the same as the input. The timestamps starting from 0 only indicate the order. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - - -##### Uniform FFT - -With the default `method`, uniform FFT is applied. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select fft(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------+ -| Time| fft(root.test.d1.s1)| -+-----------------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 1.2727111142703152E-8| -|1970-01-01T08:00:00.002+08:00| 2.385520799101839E-7| -|1970-01-01T08:00:00.003+08:00| 8.723291723972645E-8| -|1970-01-01T08:00:00.004+08:00| 19.999999960195904| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| -|1970-01-01T08:00:00.006+08:00| 3.2260694930700566E-7| -|1970-01-01T08:00:00.007+08:00| 8.723291605373329E-8| -|1970-01-01T08:00:00.008+08:00| 1.108657103979944E-7| -|1970-01-01T08:00:00.009+08:00| 1.2727110997246171E-8| -|1970-01-01T08:00:00.010+08:00|1.9852334701272664E-23| -|1970-01-01T08:00:00.011+08:00| 1.2727111194499847E-8| -|1970-01-01T08:00:00.012+08:00| 1.108657103979944E-7| 
-|1970-01-01T08:00:00.013+08:00| 8.723291785769131E-8| -|1970-01-01T08:00:00.014+08:00| 3.226069493070057E-7| -|1970-01-01T08:00:00.015+08:00| 9.999999850988388| -|1970-01-01T08:00:00.016+08:00| 19.999999960195904| -|1970-01-01T08:00:00.017+08:00| 8.723291747109068E-8| -|1970-01-01T08:00:00.018+08:00| 2.3855207991018386E-7| -|1970-01-01T08:00:00.019+08:00| 1.2727112069910878E-8| -+-----------------------------+----------------------+ -``` - -Note: The input is $y=sin(2\pi t/4)+2sin(2\pi t/5)$ with a length of 20. Thus, there are peaks in $k=4$ and $k=5$ of the output. - -##### Uniform FFT with Compression - -Input series is the same as above, the SQL for query is shown below: - -```sql -select fft(s1, 'result'='real', 'compress'='0.99'), fft(s1, 'result'='imag','compress'='0.99') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| fft(root.test.d1.s1,| fft(root.test.d1.s1,| -| | "result"="real",| "result"="imag",| -| | "compress"="0.99")| "compress"="0.99")| -+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - -Note: Based on the conjugation of the Fourier transform result, only the first half of the compression result is reserved. -According to the given parameter, data points are reserved from low frequency to high frequency until the reserved energy ratio exceeds it. 
-The last data point is reserved to indicate the length of the series. - -### HighPass - -#### Registration statement - -```sql -create function highpass as 'org.apache.iotdb.library.frequency.UDTFHighPass' -``` - -#### Usage - -This function performs high-pass filtering on the input series and extracts components above the cutoff frequency. -The timestamps of input will be ignored and all data points will be regarded as equidistant. - -**Name:** HIGHPASS - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `wpass`: The normalized cutoff frequency, which must be within (0,1). This parameter is required. - -**Output:** Output a single series. The type is DOUBLE. It is the input after filtering. The length and timestamps of output are the same as the input. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select 
highpass(s1,'wpass'='0.45') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------+ -| Time|highpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9999999534830373| -|1970-01-01T08:00:01.000+08:00| 1.7462829277628608E-8| -|1970-01-01T08:00:02.000+08:00| -0.9999999593178128| -|1970-01-01T08:00:03.000+08:00| -4.1115269056426626E-8| -|1970-01-01T08:00:04.000+08:00| 0.9999999925494194| -|1970-01-01T08:00:05.000+08:00| 3.328126513330016E-8| -|1970-01-01T08:00:06.000+08:00| -1.0000000183304454| -|1970-01-01T08:00:07.000+08:00| 6.260191433311374E-10| -|1970-01-01T08:00:08.000+08:00| 1.0000000018134796| -|1970-01-01T08:00:09.000+08:00| -3.097210911744423E-17| -|1970-01-01T08:00:10.000+08:00| -1.0000000018134794| -|1970-01-01T08:00:11.000+08:00| -6.260191627862097E-10| -|1970-01-01T08:00:12.000+08:00| 1.0000000183304454| -|1970-01-01T08:00:13.000+08:00| -3.328126501424346E-8| -|1970-01-01T08:00:14.000+08:00| -0.9999999925494196| -|1970-01-01T08:00:15.000+08:00| 4.111526915498874E-8| -|1970-01-01T08:00:16.000+08:00| 0.9999999593178128| -|1970-01-01T08:00:17.000+08:00| -1.7462829341296528E-8| -|1970-01-01T08:00:18.000+08:00| -0.9999999534830369| -|1970-01-01T08:00:19.000+08:00| -1.035237222742873E-16| -+-----------------------------+-----------------------------------------+ -``` - -Note: The input is $y=sin(2\pi t/4)+2sin(2\pi t/5)$ with a length of 20. Thus, the output is $y=sin(2\pi t/4)$ after high-pass filtering. - -### IFFT - -#### Registration statement - -```sql -create function ifft as 'org.apache.iotdb.library.frequency.UDTFIFFT' -``` - -#### Usage - -This function treats the two input series as the real and imaginary part of a complex series, performs an inverse fast Fourier transform (IFFT), and outputs the real part of the result. 
-For the input format, please refer to the output format of the `FFT` function. -Moreover, the compressed output of the `FFT` function is also supported. - -**Name:** IFFT - -**Input:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `start`: The start time of the output series with the format 'yyyy-MM-dd HH:mm:ss'. By default, it is '1970-01-01 08:00:00'. -+ `interval`: The interval of the output series, which is a positive number with a unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, it is 1s. - -**Output:** Output a single series. The type is DOUBLE. It is strictly equispaced. The values are the results of IFFT. - -**Note:** If a row contains null points or `NaN`, it will be ignored. - -#### Examples - - -Input series: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| root.test.d1.re| root.test.d1.im| -+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - - -SQL for query: - -```sql -select ifft(re, im, 'interval'='1m', 'start'='2021-01-01 00:00:00') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------------+ -| Time|ifft(root.test.d1.re, root.test.d1.im, "interval"="1m",| -| | "start"="2021-01-01 00:00:00")| 
-+-----------------------------+-------------------------------------------------------+ -|2021-01-01T00:00:00.000+08:00| 2.902112992431231| -|2021-01-01T00:01:00.000+08:00| 1.1755704705132448| -|2021-01-01T00:02:00.000+08:00| -2.175570513757101| -|2021-01-01T00:03:00.000+08:00| -1.9021130389094498| -|2021-01-01T00:04:00.000+08:00| 0.9999999925494194| -|2021-01-01T00:05:00.000+08:00| 1.902113046743454| -|2021-01-01T00:06:00.000+08:00| 0.17557053610884188| -|2021-01-01T00:07:00.000+08:00| -1.1755704886020932| -|2021-01-01T00:08:00.000+08:00| -0.9021130371347148| -|2021-01-01T00:09:00.000+08:00| 3.552713678800501E-16| -|2021-01-01T00:10:00.000+08:00| 0.9021130371347154| -|2021-01-01T00:11:00.000+08:00| 1.1755704886020932| -|2021-01-01T00:12:00.000+08:00| -0.17557053610884144| -|2021-01-01T00:13:00.000+08:00| -1.902113046743454| -|2021-01-01T00:14:00.000+08:00| -0.9999999925494196| -|2021-01-01T00:15:00.000+08:00| 1.9021130389094498| -|2021-01-01T00:16:00.000+08:00| 2.1755705137571004| -|2021-01-01T00:17:00.000+08:00| -1.1755704705132448| -|2021-01-01T00:18:00.000+08:00| -2.902112992431231| -|2021-01-01T00:19:00.000+08:00| -3.552713678800501E-16| -+-----------------------------+-------------------------------------------------------+ -``` - -### LowPass - -#### Registration statement - -```sql -create function lowpass as 'org.apache.iotdb.library.frequency.UDTFLowPass' -``` - -#### Usage - -This function performs low-pass filtering on the input series and extracts components below the cutoff frequency. -The timestamps of input will be ignored and all data points will be regarded as equidistant. - -**Name:** LOWPASS - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `wpass`: The normalized cutoff frequency, which must be within (0,1). This parameter is required. - -**Output:** Output a single series. The type is DOUBLE. It is the input after filtering. 
The length and timestamps of output are the same as the input. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select lowpass(s1,'wpass'='0.45') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|lowpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.9021130073323922| -|1970-01-01T08:00:01.000+08:00| 1.1755704705132448| -|1970-01-01T08:00:02.000+08:00| -1.1755705286582614| -|1970-01-01T08:00:03.000+08:00| -1.9021130389094498| -|1970-01-01T08:00:04.000+08:00| 7.450580419288145E-9| -|1970-01-01T08:00:05.000+08:00| 1.902113046743454| -|1970-01-01T08:00:06.000+08:00| 1.1755705212076808| -|1970-01-01T08:00:07.000+08:00| -1.1755704886020932| -|1970-01-01T08:00:08.000+08:00| -1.9021130222335536| 
-|1970-01-01T08:00:09.000+08:00| 3.552713678800501E-16| -|1970-01-01T08:00:10.000+08:00| 1.9021130222335536| -|1970-01-01T08:00:11.000+08:00| 1.1755704886020932| -|1970-01-01T08:00:12.000+08:00| -1.1755705212076801| -|1970-01-01T08:00:13.000+08:00| -1.902113046743454| -|1970-01-01T08:00:14.000+08:00| -7.45058112983088E-9| -|1970-01-01T08:00:15.000+08:00| 1.9021130389094498| -|1970-01-01T08:00:16.000+08:00| 1.1755705286582616| -|1970-01-01T08:00:17.000+08:00| -1.1755704705132448| -|1970-01-01T08:00:18.000+08:00| -1.9021130073323924| -|1970-01-01T08:00:19.000+08:00| -2.664535259100376E-16| -+-----------------------------+----------------------------------------+ -``` - -Note: The input is $y=sin(2\pi t/4)+2sin(2\pi t/5)$ with a length of 20. Thus, the output is $y=2sin(2\pi t/5)$ after low-pass filtering. - - -### Envelope - -#### Registration statement - -```sql -create function envelope as 'org.apache.iotdb.library.frequency.UDFEnvelopeAnalysis' -``` - -#### Usage - -This function achieves signal demodulation and envelope extraction by inputting a one-dimensional floating-point array and a user specified modulation frequency. The goal of demodulation is to extract the parts of interest from complex signals, making them easier to understand. For example, demodulation can be used to find the envelope of the signal, that is, the trend of amplitude changes. - -**Name:** Envelope - -**Input:** Only supports a single input sequence, with types INT32/INT64/FLOAT/DOUBLE - - -**Parameters:** - -+ `frequency`: Frequency (optional, positive number. If this parameter is not filled in, the system will infer the frequency based on the time interval corresponding to the sequence). -+ `amplification`: Amplification factor (optional, positive integer. The output of the Time column is a set of positive integers and does not output decimals. When the frequency is less than 1, this parameter can be used to amplify the frequency to display normal results). 
- -**Output:** -+ `Time`: The values in this column represent frequency rather than time. If the column is displayed in time format (e.g. 1970-01-01T08:00:19.000+08:00), convert it to a numeric timestamp value. - - -+ `Envelope(Path, 'frequency'='{frequency}')`: Output a single series of type DOUBLE, which is the result of envelope analysis. - -**Note:** When the values of the original series to be demodulated are discontinuous, this function treats them as continuous. It is recommended that the analyzed time series be a complete time series of values. It is also recommended to specify a start time and an end time. - -#### Examples - -Input series: - - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:01.000+08:00| 1.0 | -|1970-01-01T08:00:02.000+08:00| 2.0 | -|1970-01-01T08:00:03.000+08:00| 3.0 | -|1970-01-01T08:00:04.000+08:00| 4.0 | -|1970-01-01T08:00:05.000+08:00| 5.0 | -|1970-01-01T08:00:06.000+08:00| 6.0 | -|1970-01-01T08:00:07.000+08:00| 7.0 | -|1970-01-01T08:00:08.000+08:00| 8.0 | -|1970-01-01T08:00:09.000+08:00| 9.0 | -|1970-01-01T08:00:10.000+08:00| 10.0 | -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -set time_display_type=long; -select envelope(s1),envelope(s1,'frequency'='1000'),envelope(s1,'amplification'='10') from root.test.d1; -``` - -Output series: - - -``` -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -|Time|envelope(root.test.d1.s1)|envelope(root.test.d1.s1, "frequency"="1000")|envelope(root.test.d1.s1, "amplification"="10")| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -| 0| 6.284350808484124| 6.284350808484124| 6.284350808484124| -| 100| 1.5581923657404393| 1.5581923657404393| null| -| 200| 0.8503211038340728| 0.8503211038340728| 
null| -| 300| 0.512808785945551| 0.512808785945551| null| -| 400| 0.26361156774506744| 0.26361156774506744| null| -|1000| null| null| 1.5581923657404393| -|2000| null| null| 0.8503211038340728| -|3000| null| null| 0.512808785945551| -|4000| null| null| 0.26361156774506744| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ - -``` - - -## Data Matching - -### Cov - -#### Registration statement - -```sql -create function cov as 'org.apache.iotdb.library.dmatch.UDAFCov' -``` - -#### Usage - -This function is used to calculate the population covariance. - -**Name:** COV - -**Input Series:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the population covariance. - -**Note:** - -+ If a row contains missing points, null points or `NaN`, it will be ignored; -+ If all rows are ignored, `NaN` will be output. 
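The row-skipping rule and the population (divide by N) normalization described above can be sketched outside IoTDB. The following Python snippet is only an illustration under stated assumptions: `population_cov` is a hypothetical helper, not part of the UDF library, and the two lists are the `s1` and `s2` values from the example for this function, with `None` standing in for `null`.

```python
import math

def population_cov(xs, ys):
    """Population covariance over rows where both values are usable.

    Hypothetical helper for illustration; it is not the UDF implementation.
    """
    # Keep only rows in which neither value is null (None) or NaN.
    pairs = [
        (x, y)
        for x, y in zip(xs, ys)
        if x is not None and y is not None
        and not math.isnan(x) and not math.isnan(y)
    ]
    if not pairs:
        # Mirrors the documented behavior when every row is ignored.
        return float("nan")
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Population covariance divides by n, not n - 1.
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n

s1 = [100.0, 101.0, 102.0, 104.0, 126.0, 108.0, None, 112.0,
      113.0, 114.0, 116.0, 118.0, 100.0, 124.0, 126.0, float("nan")]
s2 = [101.0, None, 101.0, 102.0, 102.0, 103.0, 103.0, 104.0,
      None, 104.0, 105.0, 105.0, 106.0, 108.0, 108.0, 108.0]

cov = population_cov(s1, s2)
print(cov)
```

Twelve of the sixteen rows survive the filter, and the result agrees with the value reported in the example output (about 12.2917) up to floating-point rounding.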
- - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select cov(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|cov(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 12.291666666666666| -+-----------------------------+-------------------------------------+ -``` - -### DTW - -#### Registration statement - -```sql -create function dtw as 'org.apache.iotdb.library.dmatch.UDAFDtw' -``` - -#### Usage - -This function is used to calculate the DTW distance between two input series. - -**Name:** DTW - -**Input Series:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the DTW distance. 
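For readers unfamiliar with DTW, the distance can be sketched with the classic dynamic-programming recurrence. This is a minimal illustration, not the UDF's code: `dtw_distance` is a hypothetical name, and the use of the absolute difference as the point-wise cost is an assumption, since this document does not state which cost the UDF uses.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic-programming DTW.

    Hypothetical helper; assumes |x - y| as the point-wise cost.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        return 0.0
    inf = float("inf")
    # dp[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    dp = [[inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            dp[i][j] = cost + min(dp[i - 1][j],      # advance in a only
                                  dp[i][j - 1],      # advance in b only
                                  dp[i - 1][j - 1])  # advance in both
    return dp[n][m]

# Twenty points of 1.0 against twenty points of 2.0.
distance = dtw_distance([1.0] * 20, [2.0] * 20)
print(distance)
```

With two constant series of length 20 that differ by 1, the cheapest warping path is the diagonal, so the accumulated cost is 20.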
- -**Note:** - -+ If a row contains missing points, null points or `NaN`, it will be ignored; -+ If all rows are ignored, `0` will be output. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.003+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.004+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.005+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.006+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.007+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.008+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.009+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.010+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.011+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.012+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.013+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.014+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.015+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.016+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.017+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.018+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.019+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.020+08:00| 1.0| 2.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select dtw(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|dtw(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 20.0| -+-----------------------------+-------------------------------------+ -``` - -### Pearson - -#### Registration statement - -```sql -create function pearson as 'org.apache.iotdb.library.dmatch.UDAFPearson' -``` - -#### Usage - -This function is used to calculate the Pearson Correlation Coefficient. - -**Name:** PEARSON - -**Input Series:** Only support two input series. 
The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the Pearson Correlation Coefficient. - -**Note:** - -+ If a row contains missing points, null points or `NaN`, it will be ignored; -+ If all rows are ignored, `NaN` will be output. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select pearson(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------+ -| Time|pearson(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.5630881927754872| -+-----------------------------+-----------------------------------------+ -``` - -### PtnSym - -#### Registration statement - -```sql -create function ptnsym as 'org.apache.iotdb.library.dmatch.UDTFPtnSym' -``` - -#### Usage - -This 
function is used to find all symmetric subseries in the input whose degree of symmetry is less than the threshold. -The degree of symmetry is calculated by DTW. -The smaller the degree, the more symmetrical the series is. - -**Name:** PATTERNSYMMETRIC - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE - -**Parameter:** - -+ `window`: The length of the symmetric subseries. It's a positive integer and the default value is 10. -+ `threshold`: The threshold of the degree of symmetry. It's non-negative. Only the subseries whose degree of symmetry is below it will be output. By default, all subseries will be output. - - -**Output Series:** Output a single series. The type is DOUBLE. Each data point in the output series corresponds to a symmetric subseries. The output timestamp is the starting timestamp of the subseries and the output value is the degree of symmetry. - -#### Example - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s4| -+-----------------------------+---------------+ -|2021-01-01T12:00:00.000+08:00| 1.0| -|2021-01-01T12:00:01.000+08:00| 2.0| -|2021-01-01T12:00:02.000+08:00| 3.0| -|2021-01-01T12:00:03.000+08:00| 2.0| -|2021-01-01T12:00:04.000+08:00| 1.0| -|2021-01-01T12:00:05.000+08:00| 1.0| -|2021-01-01T12:00:06.000+08:00| 1.0| -|2021-01-01T12:00:07.000+08:00| 1.0| -|2021-01-01T12:00:08.000+08:00| 2.0| -|2021-01-01T12:00:09.000+08:00| 3.0| -|2021-01-01T12:00:10.000+08:00| 2.0| -|2021-01-01T12:00:11.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ptnsym(s4, 'window'='5', 'threshold'='0') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|ptnsym(root.test.d1.s4, "window"="5", "threshold"="0")| -+-----------------------------+------------------------------------------------------+ -|2021-01-01T12:00:00.000+08:00| 0.0| 
-|2021-01-01T12:00:07.000+08:00| 0.0| -+-----------------------------+------------------------------------------------------+ -``` - -### XCorr - -#### Registration statement - -```sql -create function xcorr as 'org.apache.iotdb.library.dmatch.UDTFXCorr' -``` - -#### Usage - -This function is used to calculate the cross correlation function of two given time series. -For discrete time series, cross correlation is given by -$$CR(n) = \frac{1}{N} \sum_{m=1}^N S_1[m]S_2[m+n]$$ -which represents the similarity between the two series at different index shifts. - -**Name:** XCORR - -**Input Series:** Only support two input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series with DOUBLE as datatype. -There are $2N-1$ data points in the series. The center point represents the cross correlation -calculated with the pre-aligned series (that is, $CR(0)$ in the formula above), -and the points before (or after) it represent the values obtained by shifting the latter series forward (or backward) -until the two series no longer overlap (the non-overlapping case itself is not included). -In short, the values of the output series are given by (index starts from 1) -$$OS[i] = CR(-N+i) = \frac{1}{N} \sum_{m=1}^{i} S_1[m]S_2[N-i+m],\ if\ i <= N$$ -$$OS[i] = CR(i-N) = \frac{1}{N} \sum_{m=1}^{2N-i} S_1[i-N+m]S_2[m],\ if\ i > N$$ - -**Note:** - -+ `null` and `NaN` values in the input series will be ignored and treated as 0. 
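The two $OS[i]$ formulas can be read off directly as code. The sketch below is a literal transcription of those formulas under stated assumptions: `xcorr` and `clean` are hypothetical Python helpers, both series are assumed to have the same length $N$, and `None` stands in for `null`. It is not the UDF implementation, whose normalization and shift direction may differ from this literal reading.

```python
import math

def clean(series):
    """Replace null (None) and NaN with 0, as described in the note above."""
    return [0.0 if v is None or math.isnan(v) else float(v) for v in series]

def xcorr(s1, s2):
    """Literal transcription of the two OS[i] formulas (1-based i).

    Hypothetical helper; assumes len(s1) == len(s2) == N.
    """
    a, b = clean(s1), clean(s2)
    n = len(a)
    out = []
    for i in range(1, 2 * n):  # i = 1 .. 2N-1
        if i <= n:
            total = sum(a[m - 1] * b[n - i + m - 1] for m in range(1, i + 1))
        else:
            total = sum(a[i - n + m - 1] * b[m - 1]
                        for m in range(1, 2 * n - i + 1))
        out.append(total / n)
    return out

result = xcorr([None, 2, 3, 4, 5], [6, 7, float("nan"), 9, 10])
print(len(result), result[len(result) // 2])
```

For two series of length 5 the result has $2N-1 = 9$ points, and the center point is $CR(0)$, the mean of the element-wise products of the cleaned series.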
- -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| null| 6| -|2020-01-01T00:00:02.000+08:00| 2| 7| -|2020-01-01T00:00:03.000+08:00| 3| NaN| -|2020-01-01T00:00:04.000+08:00| 4| 9| -|2020-01-01T00:00:05.000+08:00| 5| 10| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select xcorr(s1, s2) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -Output series: - -``` -+-----------------------------+---------------------------------------+ -| Time|xcorr(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+---------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 10.0| -|1970-01-01T08:00:00.002+08:00| 16.0| -|1970-01-01T08:00:00.003+08:00| 16.75| -|1970-01-01T08:00:00.004+08:00| 20.0| -|1970-01-01T08:00:00.005+08:00| 13.2| -|1970-01-01T08:00:00.006+08:00| 5.6| -|1970-01-01T08:00:00.007+08:00| 7.0| -|1970-01-01T08:00:00.008+08:00| 0.0| -+-----------------------------+---------------------------------------+ -``` - -### Pattern\_match - -#### Registration statement - -```SQL -create function pattern_match as 'org.apache.iotdb.library.match.UDAFPatternMatch' -``` - -#### Usage - -This function performs pattern matching between an input time series and a predefined pattern. A match is considered successful if the similarity measure (distance) is less than or equal to a specified threshold. The results are output as a JSON list. - -**Name:** PATTERN\_MATCH - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE / BOOLEAN. - -**Parameter:** - -* `timePattern`: A comma-separated string of timestamps (e.g., `"t1,t2,t3"`). Length must be **greater than 1**. Required. 
-* `valuePattern`: A comma-separated string of numerical values corresponding to `timePattern`. Length must **match** `timePattern` and be greater than 1. Required. - -> For boolean values: Use `1` for `true` and `0` for `false`. - -* `threshold`: Float-type similarity threshold. Required. - -**Output Series:** A JSON list containing all successfully matched segments. Each entry includes: start timestamp `startTime`, end timestamp `endTime`, calculated similarity value `distance`. - -#### Example - -1. Linear Data - -Input series: - -```SQL -IoTDB> select s0 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s0| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.1| -|1970-01-01T08:00:00.003+08:00| 1.2| -|1970-01-01T08:00:00.004+08:00| 1.3| -|1970-01-01T08:00:00.005+08:00| 0.0| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+--------------------------------------------------------------------------------------------------+ -| match_result| -+--------------------------------------------------------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]| -+--------------------------------------------------------------------------------------------------+ -``` - -2. 
Boolean Data - -Input series: - -```SQL -IoTDB> select s1 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| true| -|1970-01-01T08:00:00.002+08:00| true| -|1970-01-01T08:00:00.003+08:00| true| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| false| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", "threshold"="0.5") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+-------------------------------------------------+ -| match_result| -+-------------------------------------------------+ -|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+-------------------------------------------------+ -``` - -3. V-shaped Data - -Input series: - -```SQL -IoTDB> select s2 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s2| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| -1.0| -|1970-01-01T08:00:00.003+08:00| -2.0| -|1970-01-01T08:00:00.004+08:00| -3.0| -|1970-01-01T08:00:00.005+08:00| -2.0| -|1970-01-01T08:00:00.006+08:00| -1.0| -|1970-01-01T08:00:00.007+08:00| -0.0| -|1970-01-01T08:00:00.008+08:00| -0.0| -|1970-01-01T08:00:00.009+08:00| -0.0| -|1970-01-01T08:00:00.010+08:00| -0.0| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s2, "timePattern"="1,2,3,4,5,6,7", "valuePattern"="0.0,-1.0,-2.0,-3.0,-2.0,-1.0,-0.0", "threshold"="10") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+----------------------------------------------+ -| match_result| -+----------------------------------------------+ -|[{"distance":0.53,"startTime":1,"endTime":10}]| -+----------------------------------------------+ -``` - -4. 
Multiple Matching Pattern - -Input series: - -```SQL -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------+-------------+ -| Time|root.db.d0.s0|root.db.d0.s1| -+-----------------------------+-------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| true| -|1970-01-01T08:00:00.002+08:00| 1.1| true| -|1970-01-01T08:00:00.003+08:00| 1.2| true| -|1970-01-01T08:00:00.004+08:00| 1.3| false| -|1970-01-01T08:00:00.005+08:00| 0.0| false| -+-----------------------------+-------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result1, pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", - "threshold"="0.5") as match_result2 from root.db.d0 -``` - -Output series: - -```SQL -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -| match_result1| match_result2| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -``` - -## Data Repairing - -### TimestampRepair - -#### Registration statement - -```sql -create function timestamprepair as 'org.apache.iotdb.library.drepair.UDTFTimestampRepair' -``` - -#### Usage - -This function is used for timestamp repair. -According to the given standard time interval, -the method of minimizing the repair cost is adopted. -By fine-tuning the timestamps, -the original data with unstable timestamp interval is repaired to strictly equispaced data. 
-If no standard time interval is given, -this function will use the **median**, **mode** or **cluster** of the time interval to estimate the standard time interval. - -**Name:** TIMESTAMPREPAIR - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `interval`: The standard time interval whose unit is millisecond. It is a positive integer. By default, it will be estimated according to the given method. -+ `method`: The method to estimate the standard time interval, which is 'median', 'mode' or 'cluster'. This parameter is only valid when `interval` is not given. By default, median will be used. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. - -#### Examples - -##### Manually Specify the Standard Time Interval - -When `interval` is given, this function repairs according to the given standard time interval. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:19.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:01.000+08:00| 7.0| -|2021-07-01T12:01:11.000+08:00| 8.0| -|2021-07-01T12:01:21.000+08:00| 9.0| -|2021-07-01T12:01:31.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timestamprepair(s1,'interval'='10000') from root.test.d2 -``` - -Output series: - - -``` -+-----------------------------+----------------------------------------------------+ -| Time|timestamprepair(root.test.d2.s1, "interval"="10000")| -+-----------------------------+----------------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| 
-|2021-07-01T12:00:20.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:00.000+08:00| 7.0| -|2021-07-01T12:01:10.000+08:00| 8.0| -|2021-07-01T12:01:20.000+08:00| 9.0| -|2021-07-01T12:01:30.000+08:00| 10.0| -|2021-07-01T12:01:40.000+08:00| NaN| -+-----------------------------+----------------------------------------------------+ -``` - -##### Automatically Estimate the Standard Time Interval - -When `interval` is default, this function estimates the standard time interval. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select timestamprepair(s1) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------+ -| Time|timestamprepair(root.test.d2.s1)| -+-----------------------------+--------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:20.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:00.000+08:00| 7.0| -|2021-07-01T12:01:10.000+08:00| 8.0| -|2021-07-01T12:01:20.000+08:00| 9.0| -|2021-07-01T12:01:30.000+08:00| 10.0| -|2021-07-01T12:01:40.000+08:00| NaN| -+-----------------------------+--------------------------------+ -``` - -### ValueFill - -#### Registration statement - -```sql -create function valuefill as 'org.apache.iotdb.library.drepair.UDTFValueFill' -``` - -#### Usage - -This function is used to impute time series. Several methods are supported. - -**Name**: ValueFill -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: {"mean", "previous", "linear", "likelihood", "AR", "MA", "SCREEN"}, default "linear". - Method to use for imputation in series. 
"mean": use the global mean value to fill holes; "previous": propagate the last valid observation forward to the next valid one; "linear": simplest interpolation method; "likelihood": maximum likelihood estimation based on the normal distribution of speed; "AR": auto regression; "MA": moving average; "SCREEN": speed constraint. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. - -**Note:** The AR method uses an AR(1) model. The input values should be auto-correlated; otherwise the function outputs a single point (0, 0.0). - -#### Examples - -##### Fill with linear - -When `method` is "linear" or the default, linear interpolation is used to impute. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| NaN| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| NaN| -|2020-01-01T00:00:22.000+08:00| NaN| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select valuefill(s1) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------+ -| Time|valuefill(root.test.d2.s1)| -+-----------------------------+--------------------------+ -|2020-01-01T00:00:02.000+08:00| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0|
-|2020-01-01T00:00:14.000+08:00| 110.5| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.66666666666667| -|2020-01-01T00:00:22.000+08:00| 121.33333333333333| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+--------------------------+ -``` - -##### Previous Fill - -When `method` is "previous", previous method is used. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select valuefill(s1,"method"="previous") from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------------+ -| Time|valuefill(root.test.d2.s1, "method"="previous")| -+-----------------------------+-----------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 108.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 116.0| -|2020-01-01T00:00:22.000+08:00| 116.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-----------------------------------------------+ -``` - -### ValueRepair - -#### Registration statement - -```sql -create function valuerepair as 'org.apache.iotdb.library.drepair.UDTFValueRepair' -``` - -#### Usage - -This function is used to repair the value of the time series. 
-Currently, two methods are supported: -**Screen** is a method based on speed threshold, which makes all speeds meet the threshold requirements under the premise of minimum changes; -**LsGreedy** is a method based on speed change likelihood, which models speed changes as Gaussian distribution, and uses a greedy algorithm to maximize the likelihood. - - -**Name:** VALUEREPAIR - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The method used to repair, which is 'Screen' or 'LsGreedy'. By default, Screen is used. -+ `minSpeed`: This parameter is only valid with Screen. It is the speed threshold. Speeds below it will be regarded as outliers. By default, it is the median minus 3 times of median absolute deviation. -+ `maxSpeed`: This parameter is only valid with Screen. It is the speed threshold. Speeds above it will be regarded as outliers. By default, it is the median plus 3 times of median absolute deviation. -+ `center`: This parameter is only valid with LsGreedy. It is the center of the Gaussian distribution of speed changes. By default, it is 0. -+ `sigma`: This parameter is only valid with LsGreedy. It is the standard deviation of the Gaussian distribution of speed changes. By default, it is the median absolute deviation. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. - -**Note:** `NaN` will be filled with linear interpolation before repairing. - -#### Examples - -##### Repair with Screen - -When `method` is 'Screen' or the default, Screen method is used. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select valuerepair(s1) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|valuerepair(root.test.d2.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+----------------------------+ -``` - -##### Repair with LsGreedy - -When `method` is 'LsGreedy', LsGreedy method is used. 
- -Input series is the same as above, the SQL for query is shown below: - -```sql -select valuerepair(s1,'method'='LsGreedy') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|valuerepair(root.test.d2.s1, "method"="LsGreedy")| -+-----------------------------+-------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-------------------------------------------------+ -``` - -## Series Discovery - -### ConsecutiveSequences - -#### Registration statement - -```sql -create function consecutivesequences as 'org.apache.iotdb.library.series.UDTFConsecutiveSequences' -``` - -#### Usage - -This function is used to find locally longest consecutive subsequences in strictly equispaced multidimensional data. - -Strictly equispaced data is the data whose time intervals are strictly equal. Missing data, including missing rows and missing values, is allowed in it, while data redundancy and timestamp drift is not allowed. - -Consecutive subsequence is the subsequence that is strictly equispaced with the standard time interval without any missing data. If a consecutive subsequence is not a proper subsequence of any consecutive subsequence, it is locally longest. - -**Name:** CONSECUTIVESEQUENCES - -**Input Series:** Support multiple input series. 
The type is arbitrary but the data is strictly equispaced. - -**Parameters:** - -+ `gap`: The standard time interval which is a positive number with an unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, it will be estimated by the mode of time intervals. - -**Output Series:** Output a single series. The type is INT32. Each data point in the output series corresponds to a locally longest consecutive subsequence. The output timestamp is the starting timestamp of the subsequence and the output value is the number of data points in the subsequence. - -**Note:** For input series that is not strictly equispaced, there is no guarantee on the output. - -#### Examples - -##### Manually Specify the Standard Time Interval - -It's able to manually specify the standard time interval by the parameter `gap`. It's notable that false parameter leads to false output. - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| -|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select consecutivesequences(s1,s2,'gap'='5m') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2, "gap"="5m")| 
-+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| -+-----------------------------+------------------------------------------------------------------+ -``` - - -##### Automatically Estimate the Standard Time Interval - -When `gap` is default, this function estimates the standard time interval by the mode of time intervals and gets the same results. Therefore, this usage is more recommended. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select consecutivesequences(s1,s2) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| -+-----------------------------+------------------------------------------------------+ -``` - -### ConsecutiveWindows - -#### Registration statement - -```sql -create function consecutivewindows as 'org.apache.iotdb.library.series.UDTFConsecutiveWindows' -``` - -#### Usage - -This function is used to find consecutive windows of specified length in strictly equispaced multidimensional data. - -Strictly equispaced data is the data whose time intervals are strictly equal. Missing data, including missing rows and missing values, is allowed in it, while data redundancy and timestamp drift is not allowed. - -Consecutive window is the subsequence that is strictly equispaced with the standard time interval without any missing data. - -**Name:** CONSECUTIVEWINDOWS - -**Input Series:** Support multiple input series. The type is arbitrary but the data is strictly equispaced. 
- -**Parameters:** - -+ `gap`: The standard time interval which is a positive number with an unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, it will be estimated by the mode of time intervals. -+ `length`: The length of the window which is a positive number with an unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. This parameter cannot be lacked. - -**Output Series:** Output a single series. The type is INT32. Each data point in the output series corresponds to a consecutive window. The output timestamp is the starting timestamp of the window and the output value is the number of data points in the window. - -**Note:** For input series that is not strictly equispaced, there is no guarantee on the output. - -#### Examples - - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| -|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select consecutivewindows(s1,s2,'length'='10m') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------+ -| Time|consecutivewindows(root.test.d1.s1, root.test.d1.s2, "length"="10m")| -+-----------------------------+--------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| 
-|2020-01-01T00:20:00.000+08:00| 4| -+-----------------------------+--------------------------------------------------------------------+ -``` - - - -## Machine Learning - -### AR - -#### Registration statement - -```sql -create function ar as 'org.apache.iotdb.library.dlearn.UDTFAR' -``` - -#### Usage - -This function is used to learn the coefficients of the autoregressive models for a time series. - -**Name:** AR - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `p`: The order of the autoregressive model. Its default value is 1. - -**Output Series:** Output a single series. The type is DOUBLE. The first line corresponds to the first order coefficient, and so on. - -**Note:** - -- Parameter `p` should be a positive integer. -- Most points in the series should be sampled at a constant time interval. -- Linear interpolation is applied for the missing points in the series. - -#### Examples - -##### Assigning Model Order - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ar(s0,"p"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+---------------------------+ -| Time|ar(root.test.d0.s0,"p"="2")| -+-----------------------------+---------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.9429| -|1970-01-01T08:00:00.002+08:00| -0.2571| -+-----------------------------+---------------------------+ -``` - - diff --git 
a/src/UserGuide/dev-1.3/Tools-System/CLI_timecho.md b/src/UserGuide/dev-1.3/Tools-System/CLI_timecho.md deleted file mode 100644 index 07e4ab9ba..000000000 --- a/src/UserGuide/dev-1.3/Tools-System/CLI_timecho.md +++ /dev/null @@ -1,175 +0,0 @@ - - -# Command Line Interface (CLI) - - -IoTDB provides Cli/shell tools for users to interact with IoTDB server in command lines. This document shows how Cli/shell tool works and the meaning of its parameters. - -> Note: In this document, \$IOTDB\_HOME represents the path of the IoTDB installation directory. - -## Running Cli - -After installation, there is a default user in IoTDB: `root`, and the -default password is `root`. Users can use this username to try IoTDB Cli/Shell tool. The cli startup script is the `start-cli` file under the \$IOTDB\_HOME/bin folder. When starting the script, you need to specify the IP and PORT. (Make sure the IoTDB cluster is running properly when you use Cli/Shell tool to connect to it.) - -Here is an example where the cluster is started locally and the user has not changed the running port. The default rpc port is -6667
-If you need to connect to a remote DataNode or have changed -the rpc port of the running DataNode, specify the IP and RPC port with -h and -p.
-You also can set your own environment variable at the front of the start script ("/sbin/start-cli.sh" for linux and "/sbin/start-cli.bat" for windows) - -The Linux and MacOS system startup commands are as follows: - -```shell -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -``` - -The Windows system startup commands are as follows: - -```shell -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -``` - -After operating these commands, the cli can be started successfully. The successful status will be as follows: - -``` - _____ _________ ______ ______ -|_ _| | _ _ ||_ _ `.|_ _ \ - | | .--.|_/ | | \_| | | `. \ | |_) | - | | / .'`\ \ | | | | | | | __'. - _| |_| \__. | _| |_ _| |_.' /_| |__) | -|_____|'.__.' |_____| |______.'|_______/ version - - -Successfully login at 127.0.0.1:6667 -IoTDB> -``` - -Enter ```quit``` or `exit` can exit Cli. - -## Cli Parameters - -| **Parameter** | **Type** | **Required** | **Description** | **Example** | -| -------------------------- | -------- | ------------ |-----------------------------------------------------------------------------------| ------------------- | -| -h `` | string | No | The IP address of the IoTDB server. (Default: 127.0.0.1) | -h 127.0.0.1 | -| -p `` | int | No | The RPC port of the IoTDB server. (Default: 6667) | -p 6667 | -| -u `` | string | No | The username to connect to the IoTDB server. (Default: root) | -u root | -| -pw `` | string | No | The password to connect to the IoTDB server. (Default: root) | -pw root | -| -e `` | string | No | Batch operations in non-interactive mode. | -e "show databases" | -| -c | Flag | No | Required if rpc_thrift_compression_enable=true on the server. | -c | -| -disableISO8601 | Flag | No | If set, timestamps will be displayed as numeric values instead of ISO8601 format. 
| -disableISO8601 | -| -usessl `` | Boolean | No | Enable SSL connection | -usessl true | -| -ts `` | string | No | SSL certificate store path | -ts /path/to/truststore | -| -tpw `` | string | No | SSL certificate store password | -tpw myTrustPassword | -| -timeout `` | int | No | Query timeout (seconds). If not set, the server's configuration will be used. | -timeout 30 | -| -help | Flag | No | Displays help information for the CLI tool. | -help | - -Following is a cli command which connects the host with IP -10.129.187.21, rpc port 6667, username "root", password "root", and prints the timestamp in digital form. The maximum number of lines displayed on the IoTDB command line is 10. - -The Linux and MacOS system startup commands are as follows: - -```shell -Shell > bash sbin/start-cli.sh -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 -``` - -The Windows system startup commands are as follows: - -```shell -Shell > sbin\start-cli.bat -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 -``` - -## CLI Special Command - -Special commands of Cli are below. - -| Command | Description / Example | -| :-------------------------- | :------------------------------------------------------ | -| `set time_display_type=xxx` | eg. long, default, ISO8601, yyyy-MM-dd HH:mm:ss | -| `show time_display_type` | show time display type | -| `set time_zone=xxx` | eg. +08:00, Asia/Shanghai | -| `show time_zone` | show cli time zone | -| `set fetch_size=xxx` | set fetch size when querying data from server | -| `show fetch_size` | show fetch size | -| `set max_display_num=xxx` | set max lines for cli to output, -1 equals to unlimited | -| `help` | Get hints for CLI special commands | -| `exit/quit` | Exit CLI | - - -## Batch Operation of Cli - --e parameter is designed for the Cli/shell tool in the situation where you would like to manipulate IoTDB in batches through scripts. 
By using the -e parameter, you can operate IoTDB without entering the cli's input mode. - -To avoid confusion between statements and other parameters, the current version only supports the -e parameter as the last parameter. - -The usage of the -e parameter for Cli/shell is as follows: - -The Linux and MacOS system commands: - -```shell -Shell > bash sbin/start-cli.sh -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} -``` - -The Windows system commands: - -```shell -Shell > sbin\start-cli.bat -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} -``` - -In the Windows environment, wrap the SQL statement of the -e parameter in ` `` ` instead of `" "`. - -To better explain the use of the -e parameter, consider the following example (on a Linux system). - -Suppose you want to create a database root.demo in a newly launched IoTDB, create a timeseries root.demo.s1 and insert three data points into it. With the -e parameter, you could write a shell script like this: - -```shell -#!/bin/bash - -host=127.0.0.1 -rpcPort=6667 -user=root -pass=root - -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "create database root.demo" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "create timeseries root.demo.s1 WITH DATATYPE=INT32, ENCODING=RLE" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(1,10)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(2,11)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(3,12)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "select s1 from root.demo" -``` - -The results are shown below and are consistent with those of the Cli and jdbc operations.
- -```shell - Shell > bash ./shell.sh -+-----------------------------+------------+ -| Time|root.demo.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.001+08:00| 10| -|1970-01-01T08:00:00.002+08:00| 11| -|1970-01-01T08:00:00.003+08:00| 12| -+-----------------------------+------------+ -Total line number = 3 -It costs 0.267s -``` - -It should be noted that the use of the -e parameter in shell scripts requires attention to the escaping of special characters. diff --git a/src/UserGuide/dev-1.3/Tools-System/Maintenance-Tool_timecho.md b/src/UserGuide/dev-1.3/Tools-System/Maintenance-Tool_timecho.md deleted file mode 100644 index d77787934..000000000 --- a/src/UserGuide/dev-1.3/Tools-System/Maintenance-Tool_timecho.md +++ /dev/null @@ -1,960 +0,0 @@ - -# Cluster management tool - -## IoTDB-OpsKit - -The IoTDB OpsKit is an easy-to-use operation and maintenance tool (enterprise version tool). -It is designed to solve the operation and maintenance problems of multiple nodes in the IoTDB distributed system. -It mainly includes cluster deployment, cluster start and stop, elastic expansion, configuration update, data export and other functions, thereby realizing one-click command issuance for complex database clusters, which greatly Reduce management difficulty. -This document will explain how to remotely deploy, configure, start and stop IoTDB cluster instances with cluster management tools. - -### Environment dependence - -This tool is a supporting tool for TimechoDB(Enterprise Edition based on IoTDB). You can contact your sales representative to obtain the tool download method. - -The machine where IoTDB is to be deployed needs to rely on jdk 8 and above, lsof, netstat, and unzip functions. If not, please install them yourself. You can refer to the installation commands required for the environment in the last section of the document. 
- -Tip: The IoTDB cluster management tool requires an account with root privileges - -### Deployment method - -#### Download and install - -This tool is a supporting tool for TimechoDB(Enterprise Edition based on IoTDB). You can contact your salesperson to obtain the tool download method. - -Note: Since the binary package only supports GLIBC2.17 and above, the minimum version is Centos7. - -* After entering the following commands in the iotdb-opskit directory: - -```bash -bash install-iotdbctl.sh -``` - -The iotdbctl keyword can be activated in the subsequent shell, such as checking the environment instructions required before deployment as follows: - -```bash -iotdbctl cluster check example -``` - -* You can also directly use <iotdbctl absolute path>/sbin/iotdbctl without activating iotdbctl to execute commands, such as checking the environment required before deployment: - -```bash -/sbin/iotdbctl cluster check example -``` - -### Introduction to cluster configuration files - -* There is a cluster configuration yaml file in the `iotdbctl/config` directory. The yaml file name is the cluster name. There can be multiple yaml files. In order to facilitate users to configure yaml files, a `default_cluster.yaml` example is provided under the iotdbctl/config directory. -* The yaml file configuration consists of five major parts: `global`, `confignode_servers`, `datanode_servers`, `grafana_server`, and `prometheus_server` -* `global` is a general configuration that mainly configures machine username and password, IoTDB local installation files, Jdk configuration, etc. A `default_cluster.yaml` sample data is provided in the `iotdbctl/config` directory, - Users can copy and modify it to their own cluster name and refer to the instructions inside to configure the IoTDB cluster. In the `default_cluster.yaml` sample, all uncommented items are required, and those that have been commented are non-required. 
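
As a rough orientation, the five top-level parts named above could be laid out as in the sketch below. This is an illustrative skeleton, not the shipped `default_cluster.yaml`: the file name `my_cluster.yaml`, all values, and the empty list/map shapes of the server sections are assumptions made for the example. Only the section names and the `global` keys shown come from this document.

```yaml
# Hypothetical iotdbctl/config/my_cluster.yaml (the file name is the cluster name).
# All values are placeholders chosen for illustration, not tool defaults.
global:
  user: root                # user name for ssh login to the deployment machines
  ssh_port: 22              # ssh port (placeholder value)
  deploy_dir: /data/iotdb   # remote IoTDB deployment directory (placeholder path)
  # password: "..."         # when omitted, pkey login is used instead

# The shapes below (empty list / empty map) are assumptions; consult the
# bundled default_cluster.yaml for the real per-node keys.
confignode_servers: []
datanode_servers: []
grafana_server: {}
prometheus_server: {}
```

With a file like this in place, the cluster name passed to `iotdbctl` commands would be `my_cluster`.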
- -For example, to run the check command against `default_cluster.yaml`, execute `iotdbctl cluster check default_cluster`. -For more commands, refer to the command list below. - - -| parameter name | description | required | -|-------------------------|--------------------------------------------------|----------| -| iotdb\_zip\_dir | IoTDB deployment distribution directory. If the value is empty, the package is downloaded from the address specified by `iotdb_download_url` | NO | -| iotdb\_download\_url | IoTDB download address. If `iotdb_zip_dir` has no value, the package is downloaded from this address | NO | -| jdk\_tar\_dir | Local jdk directory. This jdk package is uploaded and deployed to the target node. | NO | -| jdk\_deploy\_dir | jdk deployment directory on the remote machine. The jdk is deployed to this directory; together with the `jdk_dir_name` parameter below it forms the complete jdk deployment directory, that is, `/` | NO | -| jdk\_dir\_name | The directory name after jdk decompression, jdk_iotdb by default | NO | -| iotdb\_lib\_dir | The IoTDB lib directory or an IoTDB lib compressed package (only .zip format is supported). It is only used for IoTDB upgrade and is commented out by default. If you need to upgrade, uncomment it and modify the path. If you use a zip file, compress the iotdb/lib directory with the zip command, such as `zip -r lib.zip apache-iotdb-1.2.0/lib/*` | NO | -| user | User name for ssh login to the deployment machine | YES | -| password | The password for ssh login. If no password is specified, pkey is used to log in; in that case please ensure that key-based ssh login between nodes has been configured.
| NO | -| pkey | Key login: If password has a value, password is used first, otherwise pkey is used to log in. | NO | -| ssh\_port | ssh port | YES | -| deploy\_dir | IoTDB deployment directory. IoTDB is deployed to this directory; together with the `iotdb_dir_name` parameter below it forms the complete IoTDB deployment directory, that is, `/` | YES | -| iotdb\_dir\_name | The directory name after decompression of IoTDB, iotdb by default. | NO | -| datanode-env.sh | Corresponding to `iotdb/config/datanode-env.sh`. When `global` and `datanode_servers` are configured at the same time, the value in `datanode_servers` is used first | NO | -| confignode-env.sh | Corresponding to `iotdb/config/confignode-env.sh`. When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first | NO | -| iotdb-system.properties | Corresponds to `/config/iotdb-system.properties` | NO | -| cn\_internal\_address | The cluster configuration address points to the surviving ConfigNode, and it points to confignode_x by default. When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first, corresponding to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_internal\_address | The cluster configuration address points to the surviving ConfigNode, and points to confignode_x by default. When configuring values for `global` and `datanode_servers` at the same time, the value in `datanode_servers` is used first, corresponding to `dn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | - -Both datanode-env.sh and confignode-env.sh accept an extra parameter, extra_opts. When this parameter is configured, its values are appended to the end of datanode-env.sh and confignode-env.sh.
Refer to default_cluster.yaml for configuration examples as follows: -datanode-env.sh: -extra_opts: | -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC" -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200" - -* `confignode_servers` is the configuration for deploying IoTDB Confignodes, in which multiple Confignodes can be configured. - By default, the first started ConfigNode node node1 is regarded as the Seed-ConfigNode - -| parameter name | description | required | -|-----------------------------|--------------------------------------------------|----------| -| name | Confignode name | YES | -| deploy\_dir | IoTDB config node deployment directory | YES | -| cn\_internal\_address | Internal communication address, corresponding to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| cn\_seed\_config\_node | The cluster configuration address points to the surviving ConfigNode, and it points to confignode_x by default. When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first, corresponding to `cn_seed_config_node` in `iotdb/config/iotdb-system.properties` | YES | -| cn\_internal\_port | Internal communication port, corresponding to `cn_internal_port` in `iotdb/config/iotdb-system.properties` | YES | -| cn\_consensus\_port | Corresponds to `cn_consensus_port` in `iotdb/config/iotdb-system.properties` | NO | -| cn\_data\_dir | Corresponds to `cn_data_dir` in `iotdb/config/iotdb-system.properties` | YES | -| iotdb-system.properties | Corresponding to `iotdb/config/iotdb-system.properties`, when configuring values in `global` and `confignode_servers` at the same time, the value in `confignode_servers` will be used first.
| NO | - -* `datanode_servers` is the configuration for deploying IoTDB Datanodes, in which multiple Datanodes can be configured - -| parameter name | description | required | -|-------------------------|--------------------------------------------------|----------| -| name | Datanode name | YES | -| deploy\_dir | IoTDB data node deployment directory | YES | -| dn\_rpc\_address | The datanode rpc address, corresponding to `dn_rpc_address` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_internal\_address | Internal communication address, corresponding to `dn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_seed\_config\_node | The cluster configuration address points to the surviving ConfigNode, and points to confignode_x by default. When configuring values for `global` and `datanode_servers` at the same time, the value in `datanode_servers` is used first, corresponding to `dn_seed_config_node` in `iotdb/config/iotdb-system.properties`. | YES | -| dn\_rpc\_port | Datanode rpc port, corresponding to `dn_rpc_port` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_internal\_port | Internal communication port, corresponding to `dn_internal_port` in `iotdb/config/iotdb-system.properties` | YES | -| iotdb-system.properties | Corresponding to `iotdb/config/iotdb-system.properties`, when configuring values in `global` and `datanode_servers` at the same time, the value in `datanode_servers` will be used first.
| NO | - -* `grafana_server` is the configuration related to deploying Grafana - -| parameter name | description | required | -|--------------------|-------------------------------------------------------------|-----------| -| grafana\_dir\_name | Grafana decompression directory name (default grafana_iotdb) | NO | -| host | IP of the server where Grafana is deployed | YES | -| grafana\_port | The port of the Grafana deployment machine, default 3000 | NO | -| deploy\_dir | Grafana deployment directory on the server | YES | -| grafana\_tar\_dir | Grafana compressed package location | YES | -| dashboards | dashboards directory | NO | - -* `prometheus_server` is the configuration related to deploying Prometheus - -| parameter name | description | required | -|--------------------------------|----------------------------------------------------|----------| -| prometheus\_dir\_name | Prometheus decompression directory name, default prometheus_iotdb | NO | -| host | IP of the server where Prometheus is deployed | YES | -| prometheus\_port | The port of the Prometheus deployment machine, default 9090 | NO | -| deploy\_dir | Prometheus deployment directory on the server | YES | -| prometheus\_tar\_dir | Prometheus compressed package path | YES | -| storage\_tsdb\_retention\_time | The number of days to keep data, 15 days by default | NO | -| storage\_tsdb\_retention\_size | The maximum data size that can be kept, 512M by default. Please note the units are KB, MB, GB, TB, PB, and EB. | NO | - -If metrics are configured in `iotdb-system.properties` of config/xxx.yaml, the configuration will be automatically applied to Prometheus without manual modification. - -Note: If the value of a yaml key contains special characters such as `:`, it is recommended to wrap the entire value in double quotes. Do not use file paths containing spaces, to prevent recognition problems.
- -### Usage scenarios - -#### Clean data - -* Cleaning up the cluster data deletes the data directory in the IoTDB cluster and the `cn_system_dir`, `cn_consensus_dir`, - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories configured in the yaml file. -* First execute the stop cluster command, and then execute the cluster cleanup command. - -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster clean default_cluster -``` - -#### Cluster destruction - -* Destroying the cluster deletes `data`, `cn_system_dir`, `cn_consensus_dir`, - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, `ext`, the `IoTDB` deployment directory, - the grafana deployment directory and the prometheus deployment directory in the IoTDB cluster. -* First execute the stop cluster command, and then execute the cluster destruction command. - - -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster destroy default_cluster -``` - -#### Cluster upgrade - -* To upgrade the cluster, first set `iotdb_lib_dir` in config/xxx.yaml to the directory path where the jars to be uploaded to the server are located (for example, iotdb/lib). -* If you upload a zip file, use the zip command to compress the iotdb/lib directory, such as `zip -r lib.zip apache-iotdb-1.2.0/lib/*` -* Execute the upload command and then execute the restart IoTDB cluster command to complete the cluster upgrade. - -```bash -iotdbctl cluster dist-lib default_cluster -iotdbctl cluster restart default_cluster -``` - -#### Hot deployment - -* First modify the configuration in config/xxx.yaml.
-* Execute the distribution command, and then execute the hot deployment command to complete the hot deployment of the cluster configuration - -```bash -iotdbctl cluster dist-conf default_cluster -iotdbctl cluster reload default_cluster -``` - -#### Cluster expansion - -* First add a datanode or confignode node in config/xxx.yaml. -* Execute the cluster expansion command - -```bash -iotdbctl cluster scaleout default_cluster -``` - -#### Cluster shrink - -* First find the node name or ip+port to shrink in config/xxx.yaml (where confignode port is cn_internal_port, datanode port is rpc_port) -* Execute the cluster shrink command - -```bash -iotdbctl cluster scalein default_cluster -``` - -#### Using cluster management tools to manipulate existing IoTDB clusters - -* Configure the server's `user`, `password` or `pkey`, `ssh_port` -* Modify the IoTDB deployment path in config/xxx.yaml, `deploy_dir` (IoTDB deployment directory), `iotdb_dir_name` (IoTDB decompression directory name, the default is iotdb) - For example, if the full path of IoTDB deployment is `/home/data/apache-iotdb-1.1.1`, you need to modify the yaml file to `deploy_dir:/home/data/` and `iotdb_dir_name:apache-iotdb-1.1.1` -* If the server is not using java_home, modify `jdk_deploy_dir` (jdk deployment directory) and `jdk_dir_name` (the directory name after jdk decompression, the default is jdk_iotdb). If java_home is used, there is no need to modify the configuration. - For example, if the full path of jdk deployment is `/home/data/jdk_1.8.2`, you need to modify the yaml file to `jdk_deploy_dir:/home/data/`, `jdk_dir_name:jdk_1.8.2` -* Configure `cn_internal_address`, `dn_internal_address` -* Configure `cn_internal_address`, `cn_internal_port`, `cn_consensus_port`, `cn_system_dir` and `cn_consensus_dir` in `iotdb-system.properties` in `confignode_servers` - If the values in `iotdb-system.properties` are not the IoTDB defaults, they need to be configured; otherwise there is no need to configure them.
-* Configure `dn_rpc_address`, `dn_internal_address`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir` in `iotdb-system.properties` -* Execute initialization command - -```bash -iotdbctl cluster init default_cluster -``` - -#### Deploy IoTDB, Grafana and Prometheus - -* Configure `iotdb-system.properties` to open the metrics interface -* Configure the Grafana configuration. If there are multiple `dashboards`, separate them with commas. The names cannot be repeated or they will be overwritten. -* Configure the Prometheus configuration. If the IoTDB cluster is configured with metrics, there is no need to manually modify the Prometheus configuration. The Prometheus configuration will be automatically modified according to which node is configured with metrics. -* Start the cluster - -```bash -iotdbctl cluster start default_cluster -``` - -For more detailed parameters, please refer to the cluster configuration file introduction above - -### Command - -The basic usage of this tool is: -```bash -iotdbctl cluster [params (Optional)] -``` -* key indicates a specific command. - -* cluster name indicates the cluster name (that is, the name of the yaml file in the `iotdbctl/config` file). - -* params indicates the required parameters of the command (optional). 
- -* For example, the command format to deploy the default_cluster cluster is: - -```bash -iotdbctl cluster deploy default_cluster -``` - -* The functions and parameters of the cluster are listed as follows: - -| command | description | parameter | -|-----------------|-----------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| check | check whether the cluster can be deployed | Cluster name list | -| clean | cleanup-cluster | cluster-name | -| deploy/dist-all | deploy cluster | Cluster name, -N, module name (optional for iotdb, grafana, prometheus), -op force (optional) | -| list | cluster status list | None | -| start | start cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional) | -| stop | stop cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional), -op force (nodename, grafana, prometheus optional) | -| restart | restart cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional), -op force (nodename, grafana, prometheus optional) | -| show | view cluster information. The details field indicates the details of the cluster information. 
| Cluster name, details (optional) | -| destroy | destroy cluster | Cluster name, -N, module name (iotdb, grafana, prometheus optional) | -| scaleout | cluster expansion | Cluster name | -| scalein | cluster shrink | Cluster name, -N, cluster node name or cluster node ip+port | -| reload | hot loading of cluster configuration files | Cluster name | -| dist-conf | cluster configuration file distribution | Cluster name | -| dumplog | Back up specified cluster logs | Cluster name, -N, cluster node name -h Back up to target machine ip -pw Back up to target machine password -p Back up to target machine port -path Backup directory -startdate Start time -enddate End time -loglevel Log type -l transfer speed | -| dumpdata | Back up cluster data | Cluster name, -h backup to target machine ip -pw backup to target machine password -p backup to target machine port -path backup directory -startdate start time -enddate end time -l transmission speed | -| dist-lib | lib package upgrade | Cluster name | -| init | When an existing cluster uses the cluster deployment tool, initialize the cluster configuration | Cluster name | -| status | View process status | Cluster name | -| activate | Activate cluster | Cluster name | -| health_check | health check | Cluster name, -N, nodename (optional) | -| backup | Cluster shutdown backup | Cluster name, -N nodename (optional) | -| importschema | Cluster metadata import | Cluster name, -N nodename -param parameters | -| exportschema | Cluster metadata export | Cluster name, -N nodename -param parameters | - - - -### Detailed command execution process - -The following commands use default_cluster.yaml as an example; users can substitute their own cluster file - -#### Check cluster deployment environment commands - -```bash -iotdbctl cluster check default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -*
Verify that the target node is able to log in via SSH - -* Verify whether the JDK version on the corresponding node meets IoTDB jdk1.8 and above, and whether the server is installed with unzip, lsof, and netstat. - -* If you see the following prompt `Info:example check successfully!`, it proves that the server has already met the installation requirements. - If `Error:example check fail!` is output, it proves that some conditions do not meet the requirements. You can check the Error log output above (for example: `Error:Server (ip:172.20.31.76) iotdb port(10713) is listening`) to make repairs. , - If the jdk check does not meet the requirements, we can configure a jdk1.8 or above version in the yaml file ourselves for deployment without affecting subsequent use. - If checking lsof, netstat or unzip does not meet the requirements, you need to install it on the server yourself. - -#### Deploy cluster command - -```bash -iotdbctl cluster deploy default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Upload IoTDB compressed package and jdk compressed package according to the node information in `confignode_servers` and `datanode_servers` (if `jdk_tar_dir` and `jdk_deploy_dir` values ​​are configured in yaml) - -* Generate and upload `iotdb-system.properties` according to the yaml file node configuration information - -```bash -iotdbctl cluster deploy default_cluster -op force -``` - -Note: This command will force the deployment, and the specific process will delete the existing deployment directory and redeploy - -*deploy a single module* -```bash -# Deploy grafana module -iotdbctl cluster deploy default_cluster -N grafana -# Deploy the prometheus module -iotdbctl cluster deploy default_cluster -N prometheus -# Deploy the iotdb module -iotdbctl cluster deploy default_cluster -N iotdb -``` - -#### Start cluster command - -```bash -iotdbctl 
 cluster start default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Start the confignodes sequentially according to the order in `confignode_servers` in the yaml configuration file and check whether each confignode is normal according to the process id. The first confignode is the Seed-ConfigNode - -* Start the datanodes in sequence according to the order in `datanode_servers` in the yaml configuration file and check whether each datanode is normal according to the process id. - -* After checking the existence of the process according to the process id, check whether each service in the cluster list is normal through the cli. If the cli connection fails, retry every 10s, up to 5 times, until it succeeds - - -*Start a single node command* -```bash -#Start according to the IoTDB node name -iotdbctl cluster start default_cluster -N datanode_1 -#Start according to IoTDB cluster ip+port, where port corresponds to cn_internal_port of confignode and rpc_port of datanode. -iotdbctl cluster start default_cluster -N 192.168.1.5:6667 -#Start grafana -iotdbctl cluster start default_cluster -N grafana -#Start prometheus -iotdbctl cluster start default_cluster -N prometheus -``` - -* Find the yaml file in the default location based on cluster-name - -* Find the node location information based on the provided node name or ip:port. If the started node is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file.
- If the started node is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - -* start the node - -Note: Since the cluster deployment tool only calls the start-confignode.sh and start-datanode.sh scripts in the IoTDB cluster, -When the actual output result fails, it may be that the cluster has not started normally. It is recommended to use the status command to check the current cluster status (iotdbctl cluster status xxx) - - -#### View IoTDB cluster status command - -```bash -iotdbctl cluster show default_cluster -#View IoTDB cluster details -iotdbctl cluster show default_cluster details -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Execute `show cluster details` through cli on datanode in turn. If one node is executed successfully, it will not continue to execute cli on subsequent nodes and return the result directly. - -#### Stop cluster command - - -```bash -iotdbctl cluster stop default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* According to the datanode node information in `datanode_servers`, stop the datanode nodes in order according to the configuration. 
- -* Based on the confignode node information in `confignode_servers`, stop the confignode nodes in sequence according to the configuration - -*force stop cluster command* - -```bash -iotdbctl cluster stop default_cluster -op force -``` -Will directly execute the kill -9 pid command to forcibly stop the cluster - -*Stop single node command* - -```bash -#Stop by IoTDB node name -iotdbctl cluster stop default_cluster -N datanode_1 -#Stop according to IoTDB cluster ip+port (ip+port is to get the only node according to ip+dn_rpc_port in datanode or ip+cn_internal_port in confignode to get the only node) -iotdbctl cluster stop default_cluster -N 192.168.1.5:6667 -#Stop grafana -iotdbctl cluster stop default_cluster -N grafana -#Stop prometheus -iotdbctl cluster stop default_cluster -N prometheus -``` - -* Find the yaml file in the default location based on cluster-name - -* Find the corresponding node location information based on the provided node name or ip:port. If the stopped node is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file. - If the stopped node is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - -* stop the node - -Note: Since the cluster deployment tool only calls the stop-confignode.sh and stop-datanode.sh scripts in the IoTDB cluster, in some cases the iotdb cluster may not be stopped. - - -#### Clean cluster data command - -```bash -iotdbctl cluster clean default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Based on the information in `confignode_servers` and `datanode_servers`, check whether there are still services running, - If any service is running, the cleanup command will not be executed. 
- -* Delete the data directory in the IoTDB cluster and the `cn_system_dir`, `cn_consensus_dir`, configured in the yaml file - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories. - - - -#### Restart cluster command - -```bash -iotdbctl cluster restart default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` - -* Execute the above stop cluster command (stop), and then execute the start cluster command (start). For details, refer to the above start and stop commands. - -*Force restart cluster command* - -```bash -iotdbctl cluster restart default_cluster -op force -``` -Will directly execute the kill -9 pid command to force stop the cluster, and then start the cluster - - -*Restart a single node command* - -```bash -#Restart datanode_1 according to the IoTDB node name -iotdbctl cluster restart default_cluster -N datanode_1 -#Restart confignode_1 according to the IoTDB node name -iotdbctl cluster restart default_cluster -N confignode_1 -#Restart grafana -iotdbctl cluster restart default_cluster -N grafana -#Restart prometheus -iotdbctl cluster restart default_cluster -N prometheus -``` - -#### Cluster shrink command - -```bash -#Scale down by node name -iotdbctl cluster scalein default_cluster -N nodename -#Scale down according to ip+port (ip+port obtains the only node according to ip+dn_rpc_port in datanode, and obtains the only node according to ip+cn_internal_port in confignode) -iotdbctl cluster scalein default_cluster -N ip:port -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Determine whether there is only one confignode node and datanode to be reduced. If there is only one left, the reduction cannot be performed. 
- -* Then get the information of the node to shrink according to ip:port or nodename, execute the shrink command, and then destroy the node directory. If the node to shrink is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file. - If the node to shrink is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - - -Tip: Currently, only one node can be removed at a time - -#### Cluster expansion command - -```bash -iotdbctl cluster scaleout default_cluster -``` -* Modify the config/xxx.yaml file to add a datanode node or confignode node - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Find the node to be expanded, upload the IoTDB compressed package and jdk package (if the `jdk_tar_dir` and `jdk_deploy_dir` values are configured in yaml) and decompress them - -* Generate and upload `iotdb-system.properties` according to the yaml file node configuration information - -* Execute the command to start the node and verify whether the node is started successfully - -Tip: Currently, only one node can be added at a time - -#### Destroy cluster command -```bash -iotdbctl cluster destroy default_cluster -``` - -* Find the yaml file in the default location according to cluster-name - -* Check whether any node is still running based on the node information in `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus`.
- Stop the destroy command if any node is running - -* Delete `data` in the IoTDB cluster and `cn_system_dir`, `cn_consensus_dir` configured in the yaml file - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, `ext`, `IoTDB` deployment directory, - grafana deployment directory and prometheus deployment directory - -*Destroy a single module* - -```bash -# Destroy grafana module -iotdbctl cluster destroy default_cluster -N grafana -# Destroy prometheus module -iotdbctl cluster destroy default_cluster -N prometheus -# Destroy iotdb module -iotdbctl cluster destroy default_cluster -N iotdb -``` - -#### Distribute cluster configuration commands - -```bash -iotdbctl cluster dist-conf default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` - -* Generate and upload `iotdb-system.properties` to the specified node according to the node configuration information of the yaml file - -#### Hot load cluster configuration command - -```bash -iotdbctl cluster reload default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Execute `load configuration` in the cli according to the node configuration information of the yaml file. 
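The two commands above are typically run back to back for hot deployment: distribute the configuration, then hot-load it. A minimal sketch of that sequence, which stops if distribution fails, is shown below. The wrapper name `hot_deploy` and the `IOTDBCTL` override variable are hypothetical and exist only so the sequence can be dry-run without a cluster.

```bash
#!/usr/bin/env bash
# Sketch: distribute configuration, then hot-load it, stopping at the first failure.
# IOTDBCTL is a hypothetical override used for dry runs; it defaults to iotdbctl.
IOTDBCTL="${IOTDBCTL:-iotdbctl}"

hot_deploy() {
  local cluster="$1"
  # Do not reload if the configuration could not be distributed.
  "$IOTDBCTL" cluster dist-conf "$cluster" || return 1
  "$IOTDBCTL" cluster reload "$cluster"
}

# Dry run: echo the arguments each call would receive instead of executing them.
IOTDBCTL=echo hot_deploy default_cluster
```

Skipping the reload when `dist-conf` fails avoids hot-loading a configuration that only reached some of the nodes.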
- -#### Cluster node log backup -```bash -iotdbctl cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs' -``` - -* Find the yaml file in the default location based on cluster-name - -* This command will verify the existence of datanode_1 and confignode_1 according to the yaml file, and then back up the log data of the specified node datanode_1 and confignode_1 to the specified service `192.168.9.48` port 36000 according to the configured start and end dates (startdate<=logtime<=enddate) The data backup path is `/iotdb/logs`, and the IoTDB log storage path is `/root/data/db/iotdb/logs` (not required, if you do not fill in -logs xxx, the default is to backup logs from the IoTDB installation path /logs ) - -| command | description | required | -|------------|-------------------------------------------------------------------------|----------| -| -h | backup data server ip | NO | -| -u | backup data server username | NO | -| -pw | backup data machine password | NO | -| -p | backup data machine port(default 22) | NO | -| -path | path to backup data (default current path) | NO | -| -loglevel | Log levels include all, info, error, warn (default is all) | NO | -| -l | speed limit (default 1024 speed limit range 0 to 104857601 unit Kbit/s) | NO | -| -N | multiple configuration file cluster names are separated by commas. 
| YES | -| -startdate | start time (including default 1970-01-01) | NO | -| -enddate | end time (included) | NO | -| -logs | IoTDB log storage path, the default is ({iotdb}/logs)) | NO | - -#### Cluster data backup -```bash -iotdbctl cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas' -``` -* This command will obtain the leader node based on the yaml file, and then back up the data to the /iotdb/datas directory on the 192.168.9.48 service based on the start and end dates (startdate<=logtime<=enddate) - -| command | description | required | -|--------------|-------------------------------------------------------------------------|----------| -| -h | backup data server ip | NO | -| -u | backup data server username | NO | -| -pw | backup data machine password | NO | -| -p | backup data machine port(default 22) | NO | -| -path | path to backup data (default current path) | NO | -| -granularity | partition | YES | -| -l | speed limit (default 1024 speed limit range 0 to 104857601 unit Kbit/s) | NO | -| -startdate | start time (including default 1970-01-01) | YES | -| -enddate | end time (included) | YES | - -#### Cluster upgrade -```bash -iotdbctl cluster dist-lib default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Upload lib package - -Note that after performing the upgrade, please restart IoTDB for it to take effect. 
- -#### Cluster initialization -```bash -iotdbctl cluster init default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` -* Initialize cluster configuration - -#### View cluster process status -```bash -iotdbctl cluster status default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` -* Display the survival status of each node in the cluster - -#### Cluster authorization activation - -Cluster activation is activated by entering the activation code by default, or by using the - op license_path activated through license path - -* Default activation method -```bash -iotdbctl cluster activate default_cluster -``` -* Find the yaml file in the default location based on `cluster-name` and obtain the `confignode_servers` configuration information -* Obtain the machine code inside -* Waiting for activation code input - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* Activate a node - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -``` - -* Activate through license path - -```bash -iotdbctl cluster activate default_cluster -op license_path -``` -* Find the yaml file in the default location based on `cluster-name` and obtain the `confignode_servers` configuration information -* Obtain the machine code inside -* Waiting for activation code input - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== 
-JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* Activate a node - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -op license_path -``` - -#### Cluster Health Check -```bash -iotdbctl cluster health_check default_cluster -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve confignode_servers and datanode_servers configuration information. -* Execute health_check.sh on each node. -* Single Node Health Check -```bash -iotdbctl cluster health_check default_cluster -N datanode_1 -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute health_check.sh on datanode1. - -#### Cluster Shutdown Backup - -```bash -iotdbctl cluster backup default_cluster -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve confignode_servers and datanode_servers configuration information. -* Execute backup.sh on each node - -* Single Node Backup - -```bash -iotdbctl cluster backup default_cluster -N datanode_1 -``` - -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute backup.sh on datanode1. -Note: Multi-node deployment on a single machine only supports quick mode. - -#### Cluster Metadata Import -```bash -iotdbctl cluster importschema default_cluster -N datanode1 -param "-s ./dump0.csv -fd ./failed/ -lpf 10000" -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute metadata import with import-schema.sh on datanode1. 
-* Parameters for -param are as follows: - -| command | description | required | -|------------|-------------------------------------------------------------------------|----------| -| -s | Specify the data file to be imported. You can specify a file or a directory. If a directory is specified, all files with a .csv extension in the directory will be imported in bulk. | YES | -| -fd | Specify a directory to store failed import files. If this parameter is not specified, failed files will be saved in the source data directory with the extension .failed added to the original filename. | NO | -| -lpf | Specify the number of lines written to each failed import file. The default is 10000.| NO | - -#### Cluster Metadata Export - -```bash -iotdbctl cluster exportschema default_cluster -N datanode1 -param "-t ./ -pf ./pattern.txt -lpf 10 -timeout 10000" -``` - -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute metadata export with export-schema.sh on datanode1. -* Parameters for -param are as follows: - -| command | description | required | -|-------------|-------------------------------------------------------------------------|----------| -| -t | Specify the output path for the exported CSV file. | YES | -| -path | Specify the path pattern for exporting metadata. If this parameter is specified, the -s parameter will be ignored. Example: root.stock.** | NO | -| -pf | If -path is not specified, this parameter must be specified. It designates the file path containing the metadata paths to be exported, supporting txt file format. Each path to be exported is on a new line.| NO | -| -lpf | Specify the maximum number of lines for the exported dump file. 
The default is 10000.| NO | -| -timeout | Specify the timeout for session queries in milliseconds.| NO | - - - -### Introduction to Cluster Deployment Tool Samples - -In the cluster deployment tool installation directory config/example, there are three yaml examples. If necessary, you can copy them to config and modify them. - -| name | description | -|-----------------------------|------------------------------------------------| -| default\_1c1d.yaml | 1 confignode and 1 datanode configuration example | -| default\_3c3d.yaml | 3 confignode and 3 datanode configuration samples | -| default\_3c3d\_grafa\_prome | 3 confignode and 3 datanode, Grafana, Prometheus configuration examples | - - -## IoTDB Data Directory Overview Tool - -IoTDB data directory overview tool is used to print an overview of the IoTDB data directory structure. The location is tools/tsfile/print-iotdb-data-dir. - -### Usage - -- For Windows: - -```bash -.\print-iotdb-data-dir.bat () -``` - -- For Linux or MacOs: - -```shell -./print-iotdb-data-dir.sh () -``` - -Note: if the storage path of the output overview file is not set, the default relative path "IoTDB_data_dir_overview.txt" will be used. - -### Example - -Use Windows in this example: - -`````````````````````````bash -.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data -```````````````````````` -Starting Printing the IoTDB Data Directory Overview -```````````````````````` -output save path:IoTDB_data_dir_overview.txt -data dir num:1 -143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
-|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -````````````````````````` - -## TsFile Sketch Tool - -TsFile sketch tool is used to print the content of a TsFile in sketch mode. The location is tools/tsfile/print-tsfile. - -### Usage - -- For Windows: - -``` -.\print-tsfile-sketch.bat () -``` - -- For Linux or MacOs: - -``` -./print-tsfile-sketch.sh () -``` - -Note: if the storage path of the output sketch file is not set, the default relative path "TsFile_sketch_view.txt" will be used. - -### Example - -Use Windows in this example: - -`````````````````````````bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
--------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] 
offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - └──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -````````````````````````` - -Explanations: - -- Separated by "|", the left is the actual position in the TsFile, and the right is the summary content. -- "||||||||||||||||||||" is the guide information added to enhance readability, not the actual data stored in TsFile. -- The last printed "IndexOfTimerseriesIndex Tree" is a reorganization of the metadata index tree at the end of the TsFile, which is convenient for intuitive understanding, and again not the actual data stored in TsFile. - -## TsFile Resource Sketch Tool - -TsFile resource sketch tool is used to print the content of a TsFile resource file. The location is tools/tsfile/print-tsfile-resource-files. 
- -### Usage - -- For Windows: - -```bash -.\print-tsfile-resource-files.bat -``` - -- For Linux or MacOs: - -``` -./print-tsfile-resource-files.sh -``` - -### Example - -Use Windows in this example: - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. 
-````````````````````````` - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. -````````````````````````` diff --git a/src/UserGuide/dev-1.3/Tools-System/Monitor-Tool_timecho.md b/src/UserGuide/dev-1.3/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index 5e0964932..000000000 --- a/src/UserGuide/dev-1.3/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,182 +0,0 @@ - - -# Monitor Tool - -The deployment of monitoring tools can refer to the document [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) section. 
- -## Prometheus - -### The mapping from metric type to prometheus format - -> For metrics whose Metric Name is name and Tags are K1=V1, ..., Kn=Vn, the mapping is as follows, where value is a -> specific value - -| Metric Type | Mapping | -| ---------------- | ------------------------------------------------------------ | -| Counter | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value | -| AutoGauge、Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value | -| Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.5"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.99"} value | -| Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m1"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m5"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m15"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="mean"} value | -| Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.5"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.99"} value | - -### Config File - -1) Taking DataNode as an example, modify the iotdb-system.properties configuration file as follows: - -```properties -dn_metric_reporter_list=PROMETHEUS -dn_metric_level=CORE -dn_metric_prometheus_reporter_port=9091 -``` - -Then you can get metrics data as follows: - -2) Start the IoTDB DataNodes. -3) Open a browser or use ```curl``` to visit ```http://server_ip:9091/metrics```, and you will get metric - data like the following: - -``` -... -# HELP file_count -# TYPE file_count gauge -file_count{name="wal",} 0.0 -file_count{name="unseq",} 0.0 -file_count{name="seq",} 2.0 -... -``` - -### Prometheus + Grafana - -As shown above, IoTDB exposes its monitoring metrics in the standard Prometheus format. Prometheus -can be used to collect and store the metrics, and Grafana can be used to visualize them. - -The following picture describes the relationships among IoTDB, Prometheus and Grafana: - -![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png) - -1. While running, IoTDB continuously collects its metrics. -2. Prometheus scrapes metrics from IoTDB at a constant interval (can be configured). -3. Prometheus saves these metrics to its internal TSDB. -4. Grafana queries metrics from Prometheus at a constant interval (can be configured) and then presents them on the - graph. - -So, some additional work is needed to configure and deploy Prometheus and Grafana. 
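Before Prometheus is wired in, it can be handy to inspect the `/metrics` output by hand. The following is a minimal sketch, not part of IoTDB, that parses sample lines such as `file_count{name="wal",} 0.0` into tuples. The function name `parse_metrics` and the regular expressions are assumptions made for illustration; the sketch skips comment lines and does not unescape label values, so a real Prometheus client library is the better choice for production use.

```python
import re

# Hypothetical helper, not an IoTDB API: matches lines such as
#   file_count{name="wal",} 0.0
# with an optional label block and an optional trailing integer timestamp.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)(?:\s+-?\d+)?$'
)
# Matches key="value" pairs inside the braces (escapes are kept verbatim).
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text-format samples into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, "# HELP" / "# TYPE" comments, and "..." elisions.
        if not line or line.startswith("#") or line == "...":
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# The sample output shown in the Config File section above.
EXAMPLE = '''
# HELP file_count
# TYPE file_count gauge
file_count{name="wal",} 0.0
file_count{name="unseq",} 0.0
file_count{name="seq",} 2.0
'''

if __name__ == "__main__":
    for name, labels, value in parse_metrics(EXAMPLE):
        print(name, labels, value)
```

The text passed to `parse_metrics` could come from the `curl` call described above; the tag keys (`name` here) become dictionary keys, which makes it easy to filter for a single series.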
- -For instance, you can config your Prometheus as follows to get metrics data from IoTDB: - -```yaml -job_name: pull-metrics -honor_labels: true -honor_timestamps: true -scrape_interval: 15s -scrape_timeout: 10s -metrics_path: /metrics -scheme: http -follow_redirects: true -static_configs: - - targets: - - localhost:9091 -``` - -The following documents may help you have a good journey with Prometheus and Grafana. - -[Prometheus getting_started](https://prometheus.io/docs/prometheus/latest/getting_started/) - -[Prometheus scrape metrics](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) - -[Grafana getting_started](https://grafana.com/docs/grafana/latest/getting-started/getting-started/) - -[Grafana query metrics from Prometheus](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus) - -## Apache IoTDB Dashboard - -We introduce the Apache IoTDB Dashboard, designed for unified centralized operations and management. With it, multiple clusters can be monitored through a single panel. - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png) - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png) - - -You can access the Dashboard's Json file in the enterprise edition. - -### Cluster Overview - -Including but not limited to: - -- Total cluster CPU cores, memory space, and hard disk space. -- Number of ConfigNodes and DataNodes in the cluster. -- Cluster uptime duration. -- Cluster write speed. -- Current CPU, memory, and disk usage across all nodes in the cluster. -- Information on individual nodes. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png) - - -### Data Writing - -Including but not limited to: - -- Average write latency, median latency, and the 99% percentile latency. -- Number and size of WAL files. -- Node WAL flush SyncBuffer latency. 
- -![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png) - -### Data Querying - -Including but not limited to: - -- Node query load times for time series metadata. -- Node read duration for time series. -- Node edit duration for time series metadata. -- Node query load time for Chunk metadata list. -- Node edit duration for Chunk metadata. -- Node filtering duration based on Chunk metadata. -- Average time to construct a Chunk Reader. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png) - -### Storage Engine - -Including but not limited to: - -- File count and sizes by type. -- The count and size of TsFiles at various stages. -- Number and duration of various tasks. - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png) - -### System Monitoring - -Including but not limited to: - -- System memory, swap memory, and process memory. -- Disk space, file count, and file sizes. -- JVM GC time percentage, GC occurrences by type, GC volume, and heap memory usage across generations. -- Network transmission rate, packet sending rate - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png) diff --git a/src/UserGuide/dev-1.3/Tools-System/Workbench_timecho.md b/src/UserGuide/dev-1.3/Tools-System/Workbench_timecho.md deleted file mode 100644 index 8b124a643..000000000 --- a/src/UserGuide/dev-1.3/Tools-System/Workbench_timecho.md +++ /dev/null @@ -1,33 +0,0 @@ -# WorkBench - -The deployment of the visualization console can refer to the document [Workbench Deployment](../Deployment-and-Maintenance/workbench-deployment_timecho.md) chapter. - -## Product Introduction -IoTDB Visualization Console is an extension component developed for industrial scenarios based on the IoTDB Enterprise Edition time series database. 
It integrates real-time data collection, storage, and analysis, aiming to provide users with efficient and reliable real-time data storage and query solutions. It features lightweight, high performance, and ease of use, seamlessly integrating with the Hadoop and Spark ecosystems. It is suitable for high-speed writing and complex analytical queries of massive time series data in industrial IoT applications. - -## Instructions for Use -| **Functional Module** | **Functional Description** | -| ---------------------- | ------------------------------------------------------------ | -| Instance Management | Support unified management of connected instances, support creation, editing, and deletion, while visualizing the relationships between multiple instances, helping customers manage multiple database instances more clearly | -| Home | Support viewing the service running status of each node in the database instance (such as activation status, running status, IP information, etc.), support viewing the running monitoring status of clusters, ConfigNodes, and DataNodes, monitor the operational health of the database, and determine if there are any potential operational issues with the instance. | -| Measurement Point List | Support directly viewing the measurement point information in the instance, including database information (such as database name, data retention time, number of devices, etc.), and measurement point information (measurement point name, data type, compression encoding, etc.), while also supporting the creation, export, and deletion of measurement points either individually or in batches. | -| Data Model | Support viewing hierarchical relationships and visually displaying the hierarchical model. | -| Data Query | Support interface-based query interactions for common data query scenarios, and enable batch import and export of queried data. 
| -| Statistical Query | Support interface-based query interactions for common statistical data scenarios, such as outputting results for maximum, minimum, average, and sum values. | -| SQL Operations | Support interactive SQL operations on the database through a graphical user interface, allowing for the execution of single or multiple statements, and displaying and exporting the results. | -| Trend | Support one-click visualization to view the overall trend of data, draw real-time and historical data for selected measurement points, and observe the real-time and historical operational status of the measurement points. | -| Analysis | Support visualizing data through different analysis methods (such as FFT) for visualization. | -| View | Support viewing information such as view name, view description, result measuring points, and expressions through the interface. Additionally, enable users to quickly create, edit, and delete views through interactive interfaces. | -| Data synchronization | Support the intuitive creation, viewing, and management of data synchronization tasks between databases. Enable direct viewing of task running status, synchronized data, and target addresses. Users can also monitor changes in synchronization status in real-time through the interface. | -| Permission management | Support interface-based control of permissions for managing and controlling database user access and operations. | -| Audit logs | Support detailed logging of user operations on the database, including Data Definition Language (DDL), Data Manipulation Language (DML), and query operations. Assist users in tracking and identifying potential security threats, database errors, and misuse behavior. 
| - -Main feature showcase -* Home -![Home](/img/%E9%A6%96%E9%A1%B5.png) -* Measurement Point List -![Measurement point list](/img/workbench-en-bxzk.png) -* Data Query -![Data query](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2.png) -* Trend -![Historical trend](/img/%E5%8E%86%E5%8F%B2%E8%B6%8B%E5%8A%BF.png) \ No newline at end of file diff --git a/src/UserGuide/dev-1.3/User-Manual/Audit-Log_timecho.md b/src/UserGuide/dev-1.3/User-Manual/Audit-Log_timecho.md deleted file mode 100644 index 976b7da17..000000000 --- a/src/UserGuide/dev-1.3/User-Manual/Audit-Log_timecho.md +++ /dev/null @@ -1,93 +0,0 @@ - - -# Security Audit - -## Background of the function - -An audit log is a record of the operations performed on a database. The audit log function lets you trace user operations such as adding, deleting, modifying, and querying, which helps ensure information security. With the audit log function of IoTDB, the following scenarios can be achieved: - -- Decide whether to record audit logs according to the source of the connection (human operation or not). For example, non-human operations such as data written by hardware collectors do not need audit logs, while human operations, such as ordinary users operating on data through the CLI, Workbench, and other tools, do. -- Filter out system-level write operations, such as those recorded by the IoTDB monitoring system itself. - -### Scene Description - -#### Logging all operations (add, delete, change, check) of all users - -The audit log function traces all user operations in the database. The information recorded should include data operations (add, delete, query), metadata operations (add, modify, delete, query), and client login information (user name, IP address). 
 - -Client Sources: -- CLI, Workbench, Zeppelin, Grafana, and requests sent over protocols such as Session/JDBC/MQTT - -![](/img/audit-log.png) - -#### Audit logging can be turned off for some user connections - -No audit logs are required for data written by the hardware collector via Session/JDBC/MQTT if it is a non-human action. - -## Function Definition - -The following can be set through configuration: - -- Decide whether to enable the audit function or not -- Decide where to output the audit logs; one or more of the following are supported - 1. log file - 2. IoTDB storage -- Decide whether to skip audit logging for native interface writes, so that an excessive volume of audit logs does not affect performance. -- Decide the content categories of the audit log; one or more of the following are supported - 1. data addition and deletion operations - 2. data and metadata query operations - 3. metadata adding, modifying, and deleting operations. - -### configuration item - -In iotdb-system.properties, change the following configurations: - -```YAML -#################### -### Audit log Configuration -#################### - -# whether to enable the audit log. 
-# Datatype: Boolean -# enable_audit_log=false - -# Output location of audit logs -# Datatype: String -# IOTDB: the stored time series is: root.__system.audit._{user} -# LOGGER: log_audit.log in the log directory -# audit_log_storage=IOTDB,LOGGER - -# whether enable audit log for DML operation of data -# whether enable audit log for DDL operation of schema -# whether enable audit log for QUERY operation of data and schema -# Datatype: String -# audit_log_operation=DML,DDL,QUERY - -# whether the local write api records audit logs -# Datatype: Boolean -# This contains Session insert api: insertRecord(s), insertTablet(s),insertRecordsOfOneDevice -# MQTT insert api -# RestAPI insert api -# This parameter will cover the DML in audit_log_operation -# enable_audit_log_for_native_insert_api=true -``` - diff --git a/src/UserGuide/dev-1.3/User-Manual/Data-Sync_timecho.md b/src/UserGuide/dev-1.3/User-Manual/Data-Sync_timecho.md deleted file mode 100644 index dd4b5b8f4..000000000 --- a/src/UserGuide/dev-1.3/User-Manual/Data-Sync_timecho.md +++ /dev/null @@ -1,663 +0,0 @@ - - -# Data Sync - -Data synchronization is a typical requirement in the industrial Internet of Things (IoT). Through data synchronization mechanisms, it is possible to share data between IoTDB instances and to establish a complete data link that meets the needs of internal and external network data interconnectivity, edge-cloud synchronization, data migration, and data backup. - -## Function Overview - -### Data Synchronization - -A data synchronization task consists of three stages: - -![](/img/sync_en_01.png) - -- Source Stage: This part is used to extract data from the source IoTDB, defined in the source section of the SQL statement. -- Process Stage: This part is used to process the data extracted from the source IoTDB, defined in the processor section of the SQL statement. -- Sink Stage: This part is used to send data to the target IoTDB, defined in the sink section of the SQL statement. 
- -By declaratively configuring the specific content of the three parts through SQL statements, flexible data synchronization capabilities can be achieved. Currently, data synchronization supports the synchronization of the following information, and you can select the synchronization scope when creating a synchronization task (the default is data.insert, which means synchronizing newly written data): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-| Synchronization Scope | Sub-scope | Synchronization Content Description |
-| :-------------------- | :-------- | :---------------------------------- |
-| all | - | All scopes |
-| data | insert | Synchronize newly written data |
-| data | delete | Synchronize deleted data |
-| schema | database | Synchronize database creation, modification or deletion operations |
-| schema | timeseries | Synchronize the definition and attributes of time series |
-| schema | TTL | Synchronize the data retention time |
-| auth | - | Synchronize user permissions and access control |
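
A minimal sketch of selecting a scope when creating a task, assuming the scope names in the table are passed through the source's `inclusion` and `inclusion.exclusion` parameters. The pipe name and target address are placeholders.

```SQL
-- Synchronize everything except user permissions (hypothetical name and address).
create pipe all_but_auth
with source (
  'inclusion' = 'all',
  'inclusion.exclusion' = 'auth'
)
with sink (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668'
)
```

Leaving out `inclusion` altogether falls back to the default scope, `data.insert`.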
- -### Functional limitations and instructions - -The schema and auth synchronization functions have the following limitations: - -- When using schema synchronization, it is required that the consensus protocol for `Schema region` and `ConfigNode` must be the default ratis protocol. This means that the `iotdb-system.properties` configuration file should contain the settings `config_node_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus` and `schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus`. If these are not specified, the default ratis protocol is used. - -- To prevent potential conflicts, please disable the automatic creation of schema on the receiving end when enabling schema synchronization. This can be done by setting the `enable_auto_create_schema` configuration in the `iotdb-system.properties` file to false. - -- When schema synchronization is enabled, the use of custom plugins is not supported. - -- In a dual-active cluster, schema synchronization should avoid simultaneous operations on both ends. - -- During data synchronization tasks, please avoid performing any deletion operations to prevent inconsistent states between the two ends. - -## Usage Instructions - -Data synchronization tasks have three states: RUNNING, STOPPED, and DROPPED. The task state transitions are shown in the following diagram: - -![](/img/Data-Sync02.png) - -After creation, the task will start directly, and when the task stops abnormally, the system will automatically attempt to restart the task. - -Provide the following SQL statements for state management of synchronization tasks. - -### Create Task - -Use the `CREATE PIPE` statement to create a data synchronization task. The `PipeId` and `sink` attributes are required, while `source` and `processor` are optional. When entering the SQL, note that the order of the `SOURCE` and `SINK` plugins cannot be swapped. 
-
-The SQL example is as follows:
-
-```SQL
-CREATE PIPE [IF NOT EXISTS] <PipeId> -- PipeId is the name that uniquely identifies the task.
--- Data extraction plugin, optional plugin
-WITH SOURCE (
-  [<parameter> = <value>,],
-)
--- Data processing plugin, optional plugin
-WITH PROCESSOR (
-  [<parameter> = <value>,],
-)
--- Data connection plugin, required plugin
-WITH SINK (
-  [<parameter> = <value>,],
-)
-```
-
-**IF NOT EXISTS semantics**: Used in creation operations to ensure that the create command is executed only when the specified Pipe does not exist, preventing errors caused by attempting to create an existing Pipe.
-
-**Note**:
-
-Starting from V1.3.6, when creating a full data synchronization Pipe (e.g. PipeId: `alldatapipe`), the system will automatically split it into two independent Pipes:
-
-* History Pipe: The PipeId is the original name plus the suffix `_history` (e.g. `alldatapipe_history`). The source parameter carries the default configurations: `'realtime.enable'='false', 'inclusion'='data.insert', 'inclusion.exclusion'=''`
-* Realtime Pipe: The PipeId is the original name plus the suffix `_realtime` (e.g. `alldatapipe_realtime`). The source parameter carries the default configuration: `'history.enable'='false'`. If metadata synchronization is configured, the Realtime Pipe is responsible for sending it.
-
-After successful creation, the original PipeId (e.g. `alldatapipe`) will no longer be a valid identifier. When performing task operations such as starting, stopping, deleting, or viewing, you must use the PipeId of one of the split Pipes (i.e. `*_history` or `*_realtime`).
For operation examples, see the [View Task](./Data-Sync_timecho.md#view-task) section.
-
-### Start Task
-
-Start processing data:
-
-```SQL
-START PIPE <PipeId>
-```
-
-### Stop Task
-
-Stop processing data:
-
-```SQL
-STOP PIPE <PipeId>
-```
-
-### Delete Task
-
-Deletes the specified task:
-
-```SQL
-DROP PIPE [IF EXISTS] <PipeId>
-```
-**IF EXISTS semantics**: Used in deletion operations to ensure that the delete command is executed only when the specified Pipe exists, preventing errors caused by attempting to delete a non-existent Pipe.
-
-Deleting a task does not require stopping the synchronization task first.
-
-### View Task
-
-View all tasks:
-
-```SQL
-SHOW PIPES
-```
-
-To view a specified task:
-
-```SQL
-SHOW PIPE <PipeId>
-```
-
-Example of the show pipes result for a pipe:
-
-```SQL
-+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+
-|                              ID|           CreationTime|  State|PipeSource|PipeProcessor|                                                   PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds|
-+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+
-|59abf95db892428b9d01c5fa318014ea|2024-06-17T14:03:44.189|RUNNING|        {}|           {}|{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}|                |                128|                     1.03|
-+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+
-```
-
-The meanings of each column are as follows:
-
-- **ID**：The unique identifier for the synchronization task
-- **CreationTime**：The time when the synchronization task was created
-- **State**：The state of the synchronization task
-- **PipeSource**：The source of the synchronized data stream
-- 
**PipeProcessor**:The processing logic of the synchronized data stream during transmission -- **PipeSink**:The destination of the synchronized data stream -- **ExceptionMessage**:Displays the exception information of the synchronization task -- **RemainingEventCount (Statistics with Delay)**: The number of remaining events, which is the total count of all events in the current data synchronization task, including data and schema synchronization events, as well as system and user-defined events. -- **EstimatedRemainingSeconds (Statistics with Delay)**: The estimated remaining time, based on the current number of events and the rate at the pipe, to complete the transfer. - -Example: - -In V1.3.6 and later versions, create a full data synchronization task and view the task details. - -```sql -IoTDB> create pipe alldatapipe with source('inclusion'='all','exclusion'='auth') with sink('node-urls'='127.0.0.1:6668') - -IoTDB> show pipe alldatapipe_history -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_history|2025-12-18T15:06:16.697|RUNNING|{exclusion=auth, history.enable=true, inclusion=data.insert, inclusion.exclusion=, realtime.enable=false}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| 
-+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -IoTDB> show pipe alldatapipe_realtime -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_realtime|2025-12-18T15:06:16.312|RUNNING|{exclusion=auth, history.enable=false, inclusion=all, realtime.enable=true}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -``` - -### Synchronization Plugins - -To make the overall architecture more flexible to match different synchronization scenario requirements, we support plugin assembly within the synchronization task framework. The system comes with some pre-installed common plugins that you can use directly. At the same time, you can also customize processor plugins and Sink plugins, and load them into the IoTDB system for use. 
You can view the plugins in the system (including custom and built-in plugins) with the following statement: - -```SQL -SHOW PIPEPLUGINS -``` - -The return result is as follows (version 1.3.2): - -```SQL -IoTDB> SHOW PIPEPLUGINS -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| PluginName|PluginType| ClassName| PluginJar| -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.processor.donothing.DoNothingProcessor| | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.donothing.DoNothingConnector| | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector| | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.extractor.iotdb.IoTDBExtractor| | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector| | -| IOTDB-THRIFT-SSL-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector| | -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ - -``` - -Detailed introduction of pre-installed plugins is as follows (for detailed parameters of each plugin, please refer to the [Parameter Description](#reference-parameter-description) section): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-| Type | Custom Plugin | Plugin Name | Description | Applicable Version |
-| :--- | :------------ | :---------- | :---------- | :----------------- |
-| source plugin | Not Supported | iotdb-source | The default extractor plugin, used to extract historical or real-time data from IoTDB | 1.2.x |
-| processor plugin | Supported | do-nothing-processor | The default processor plugin, which does not process the incoming data | 1.2.x |
-| sink plugin | Supported | do-nothing-sink | Does not process the data that is sent out | 1.2.x |
-| sink plugin | Supported | iotdb-thrift-sink | The default sink plugin (V1.3.1+), used for data transfer between IoTDB (V1.2.0+) and IoTDB (V1.2.0+). It uses the Thrift RPC framework to transfer data, with a multi-threaded async non-blocking IO model and high transfer performance, and is especially suitable for scenarios where the target end is distributed | 1.2.x |
-| sink plugin | Supported | iotdb-air-gap-sink | Used for data synchronization across unidirectional data diodes from IoTDB (V1.2.0+) to IoTDB (V1.2.0+). Supported diode models include Nanrui Syskeeper 2000, etc. | 1.2.x |
-| sink plugin | Supported | iotdb-thrift-ssl-sink | Used for data transfer between IoTDB (V1.3.1+) and IoTDB (V1.2.0+). It uses the Thrift RPC framework to transfer data, with a single-threaded sync blocking IO model, suitable for scenarios with higher security requirements | 1.3.1+ |
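
One way the `do-nothing-sink` plugin might be used is to exercise a source configuration without sending data anywhere. This is only a sketch: the pipe name and path are placeholders, and it assumes `do-nothing-sink` needs no parameters beyond its own name.

```SQL
-- Hypothetical dry-run pipe: extract from a path but discard the result.
create pipe source_dry_run
with source (
  'source' = 'iotdb-source',
  'path' = 'root.vehicle.**'
)
with sink (
  'sink' = 'do-nothing-sink'  -- discards whatever the source extracts
)
```

A pipe like this can be inspected with `SHOW PIPE source_dry_run` and removed with `DROP PIPE IF EXISTS source_dry_run` once the check is done.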
-
-For importing custom plugins, please refer to the [Stream Processing](./Streaming_timecho.md#custom-stream-processing-plugin-management) section.
-
-## Use examples
-
-### Full data synchronisation
-
-This example demonstrates the synchronization of all data from one IoTDB to another, with the data link shown below:
-
-![](/img/pipe1.jpg)
-
-In this example, we create a synchronization task named A2B to synchronize the full data from A IoTDB to B IoTDB. The sink must use the built-in iotdb-thrift-sink plugin, and the URL of the data service port of a DataNode on the target IoTDB must be configured through node-urls, as shown in the following example statement:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### Partial data synchronization
-
-This example demonstrates the synchronization of data from a historical time range (08:00 on 23 August 2023 to 08:00 on 23 October 2023) to another IoTDB. The data link is shown below:
-
-![](/img/pipe2.jpg)
-
-In this example, we create a synchronization task named A2B. First, we define the range of data to be transferred in the source. Since the data being transferred is historical data (data that existed before the creation of the synchronization task), we need to configure the start-time and end-time of the data as well as the transfer mode, realtime.mode. The URL of the data service port of a DataNode on the target IoTDB is configured through node-urls.
-
-The detailed statements are as follows:
-
-```SQL
-create pipe A2B
-WITH SOURCE (
-  'source'= 'iotdb-source',
-  'realtime.mode' = 'stream', -- The extraction mode for newly inserted data (after pipe creation)
-  'path' = 'root.vehicle.**', -- Scope of Data Synchronization
-  'start-time' = '2023.08.23T08:00:00+00:00', -- The start event time for synchronizing all data (inclusive)
-  'end-time' = '2023.10.23T08:00:00+00:00' -- The end event time for synchronizing all data (inclusive)
-)
-with SINK (
-  'sink'='iotdb-thrift-async-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### Bidirectional data transfer
-
-This example demonstrates the scenario where two IoTDB instances act as an active-active pair, with the data link shown in the figure below:
-
-![](/img/pipe3.jpg)
-
-In this example, to avoid infinite data loops, the `forwarding-pipe-requests` parameter on A and B needs to be set to `false`, indicating that data transmitted from another pipe is not forwarded. To keep the data consistent on both sides, the pipe needs to be configured with `inclusion=all` to synchronize full data and metadata.
- -The detailed statement is as follows: - -On A IoTDB, execute the following statement: - -```SQL -create pipe AB -with source ( - 'inclusion'='all', -- Indicates synchronization of full data, schema , and auth - 'forwarding-pipe-requests' = 'false' -- Do not forward data written by other Pipes -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -On B IoTDB, execute the following statement: - -```SQL -create pipe BA -with source ( - 'inclusion'='all', -- Indicates synchronization of full data, schema , and auth - 'forwarding-pipe-requests' = 'false' -- Do not forward data written by other Pipes -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -- The URL of the data service port of the DataNode node on the target IoTDB -) -``` - -### Edge-cloud data transfer - -This example is used to demonstrate the scenario where data from multiple IoTDB is transferred to the cloud, with data from clusters B, C, and D all synchronized to cluster A, as shown in the figure below: - -![](/img/sync_en_03.png) - -In this example, to synchronize the data from clusters B, C, and D to A, the pipe between BA, CA, and DA needs to configure the `path` to limit the range, and to keep the edge and cloud data consistent, the pipe needs to be configured with `inclusion=all` to synchronize full data and metadata. 
The detailed statement is as follows:
-
-On B IoTDB, execute the following statement to synchronize data from B to A:
-
-```SQL
-create pipe BA
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'path'='root.db.**', -- Limit the range
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-On C IoTDB, execute the following statement to synchronize data from C to A:
-
-```SQL
-create pipe CA
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'path'='root.db.**', -- Limit the range
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-On D IoTDB, execute the following statement to synchronize data from D to A:
-
-```SQL
-create pipe DA
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'path'='root.db.**', -- Limit the range
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### Cascading data transfer
-
-This example demonstrates the scenario where data is transferred in a cascading manner between multiple IoTDB instances, with data from cluster A synchronized to cluster B, and then to cluster C, as shown in the figure below:
-
-![](/img/sync_en_04.png)
-
-In this example, to synchronize the data from cluster A to C, `forwarding-pipe-requests` needs to be set to `true` on the pipe from B to C.
The detailed statement is as follows:
-
-On A IoTDB, execute the following statement to synchronize data from A to B:
-
-```SQL
-create pipe AB
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-On B IoTDB, execute the following statement to synchronize data from B to C:
-
-```SQL
-create pipe BC
-with source (
-  'forwarding-pipe-requests' = 'true' -- Whether to forward data written by other Pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6669', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### Cross-gate data transfer
-
-This example demonstrates the scenario where data from one IoTDB is synchronized to another IoTDB through a unidirectional gateway, as shown in the figure below:
-
-![](/img/cross-network-gateway.png)
-
-
-In this example, the sink must use the iotdb-air-gap-sink plugin. After configuring the gateway, execute the following statement on A IoTDB. Fill in node-urls with the URL of the data service port of the DataNode on the target IoTDB, as configured by the gateway:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-air-gap-sink',
-  'node-urls' = '10.53.53.53:9780', -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-**Notes: Currently supported gateway models**
-> For other models of network gateway devices, please contact timechodb staff to confirm compatibility.
-
-| Gateway Type | Model | Return Packet Limit | Send Limit |
-| ------------ | ----- | ------------------- | ---------- |
-| Forward Gate | NARI Syskeeper-2000 Forward Gate | All 0 / All 1 bytes | No Limit |
-| Forward Gate | XJ Self-developed Diaphragm | All 0 / All 1 bytes | No Limit |
-| Unknown | WISGAP | No Limit | No Limit |
-| Forward Gate | KEDONG StoneWall-2000 Network Security Isolation Device | No Limit | No Limit |
-| Reverse Gate | NARI Syskeeper-2000 Reverse Direction | All 0 / All 1 bytes | Meet E Language Format |
-| Unknown | DPtech ISG5000 | No Limit | No Limit |
-| Unknown | GAP - XL—GAP | No Limit | No Limit |
-
-### Compression Synchronization (V1.3.3+)
-
-IoTDB supports specifying data compression methods during synchronization. Real-time compression and transmission of data can be achieved by configuring the `compressor` parameter. `compressor` currently supports five algorithms: snappy, gzip, lz4, zstd, and lzma2. Multiple algorithms can be combined, and they are applied in the configured order. `rate-limit-bytes-per-second` (supported in V1.3.3 and later versions) is the maximum number of bytes allowed to be transmitted per second, measured after compression. If it is less than 0, there is no limit.
-
-For example, to create a synchronization task named A2B:
-
-```SQL
-create pipe A2B
-with sink (
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-  'compressor' = 'snappy,lz4' -- Compression algorithms
-)
-```
-
-### Encrypted Synchronization (V1.3.1+)
-
-IoTDB supports the use of SSL encryption during the synchronization process, ensuring the secure transfer of data between different IoTDB instances.
By configuring SSL-related parameters, such as the certificate address and password (`ssl.trust-store-path`)、(`ssl.trust-store-pwd`), data can be protected by SSL encryption during the synchronization process. - -For example, to create a synchronization task named A2B: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-ssl-sink', - 'node-urls'='127.0.0.1:6667', -- The URL of the data service port of the DataNode node on the target IoTDB - 'ssl.trust-store-path'='pki/trusted', -- The trust store certificate path required to connect to the target DataNode - 'ssl.trust-store-pwd'='root' -- The trust store certificate password required to connect to the target DataNode -) -``` - -## Reference: Notes - -You can adjust the parameters for data synchronization by modifying the IoTDB configuration file (`iotdb-system.properties`), such as the directory for storing synchronized data. The complete configuration is as follows: - -V1.3.3+: - -```Properties -# pipe_receiver_file_dir -# If this property is unset, system will save the data in the default relative path directory under the IoTDB folder(i.e., %IOTDB_HOME%/${cn_system_dir}/pipe/receiver). -# If it is absolute, system will save the data in the exact location it points to. -# If it is relative, system will save the data in the relative path directory it indicates under the IoTDB folder. -# Note: If pipe_receiver_file_dir is assigned an empty string(i.e.,zero-size), it will be handled as a relative path. -# effectiveMode: restart -# For windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is absolute. Otherwise, it is relative. -# pipe_receiver_file_dir=data\\confignode\\system\\pipe\\receiver -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. 
-pipe_receiver_file_dir=data/confignode/system/pipe/receiver - -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# effectiveMode: first_start -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# effectiveMode: restart -# Datatype: int -pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# effectiveMode: restart -# Datatype: int -pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# effectiveMode: restart -# Datatype: int -pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# effectiveMode: restart -# Datatype: int -pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# effectiveMode: restart -# Datatype: Boolean -pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# Datatype: int -# effectiveMode: restart -pipe_air_gap_receiver_port=9780 - -# The total bytes that all pipe sinks can transfer per second. -# When given a value less than or equal to 0, it means no limit. -# default value is -1, which means no limit. 
-# effectiveMode: hot_reload -# Datatype: double -pipe_all_sinks_rate_limit_bytes_per_second=-1 -``` - -## Reference: parameter description - -### source parameter(V1.3.3) - -| key | value | value range | required or not | default value | -| :------------------------------ | :----------------------------------------------------------- | :------------------------------------- | :------- | :------------- | -| source | iotdb-source | String: iotdb-source | Required | - | -| inclusion | Used to specify the range of data to be synchronized in the data synchronization task, including data, schema, and auth | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | Optional | data.insert | -| inclusion.exclusion | Used to exclude specific operations from the range specified by inclusion, reducing the amount of data synchronized | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | Optional | - | -| path | Used to filter the path pattern schema of time series and data to be synchronized / schema synchronization can only use pathpath is exact matching, parameters must be prefix paths or complete paths, i.e., cannot contain `"*"`, at most one `"**"` at the end of the path parameter | String:IoTDB pattern | Optional | root.** | -| pattern | Used to filter the path prefix of time series | String: Optional | Optional | root | -| start-time | The start event time for synchronizing all data, including start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | Optional | Long.MIN_VALUE | -| end-time | The end event time for synchronizing all data, including end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | Optional | Long.MAX_VALUE | -| realtime.mode | The extraction mode for newly inserted data (after pipe creation) | String: stream, batch | Optional | stream | -| forwarding-pipe-requests | Whether to forward data written by other Pipes (usually data synchronization) | Boolean: true, false | Optional | true | -| history.loose-range | When 
transferring TsFile, whether to relax the range of historical data (before the creation of the pipe). "": Do not relax the range, select data strictly according to the set conditions. "time": Relax the time range to avoid splitting TsFile, which can improve synchronization efficiency. "path": Relax the path range to avoid splitting TsFile, which can improve synchronization efficiency. "time, path", "path, time", "all": Relax all ranges to avoid splitting TsFile, which can improve synchronization efficiency. | String: "", "time", "path", "time, path", "path, time", "all" | Optional | "" |
-| realtime.loose-range | When transferring TsFile, whether to relax the range of real-time data (after the creation of the pipe). "": Do not relax the range, select data strictly according to the set conditions. "time": Relax the time range to avoid splitting TsFile, which can improve synchronization efficiency. "path": Relax the path range to avoid splitting TsFile, which can improve synchronization efficiency. "time, path", "path, time", "all": Relax all ranges to avoid splitting TsFile, which can improve synchronization efficiency. | String: "", "time", "path", "time, path", "path, time", "all" | Optional | "" |
-| mods.enable | Whether to send the mods file of tsfile | Boolean: true / false | Optional | false |
-
-> 💎 **Explanation**: To maintain compatibility with lower versions, history.enable, history.start-time, history.end-time, and realtime.enable can still be used, but they are not recommended in the new version.
->
-> 💎 **Explanation: Differences between Stream and Batch Data Extraction Modes**
-> - **stream (recommended)**: In this mode, tasks process and send data in real time. It is characterized by high timeliness and low throughput.
-> - **batch**: In this mode, tasks process and send data in batches (according to the underlying data files). It is characterized by low timeliness and high throughput.
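
To make the mode choice concrete, the following sketch switches a pipe to batch extraction. The pipe name and target address are placeholders. Only `realtime.mode` and its two values, `stream` and `batch`, are taken from the parameter table.

```SQL
-- Hypothetical pipe that trades timeliness for throughput.
create pipe batch_demo
with source (
  'source' = 'iotdb-source',
  'realtime.mode' = 'batch'  -- send data in batches, following the underlying data files
)
with sink (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668'
)
```

Omitting `realtime.mode` keeps the default, `stream`, which is the recommended setting.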
- - -### sink parameter - -> In versions 1.3.3 and above, when only the sink is included, the additional "with sink" prefix is no longer required. - -#### iotdb-thrift-sink - - -| key | value | value Range | required or not | Default Value | -| :--------------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :----------- | -| sink | iotdb-thrift-sink or iotdb-thrift-async-sink | String: iotdb-thrift-sink or iotdb-thrift-async-sink | Required | | -| node-urls | The URL of the data service port of any DataNode nodes on the target IoTDB (please note that synchronization tasks do not support forwarding to its own service) | String. Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - | -| batch.enable | Whether to enable batched log transmission mode to improve transmission throughput and reduce IOPS | Boolean: true, false | Optional | true | -| batch.max-delay-seconds | Effective when batched log transmission mode is enabled, it represents the maximum waiting time for a batch of data before sending (unit: s) | Integer | Optional | 1 | -| batch.max-delay-ms | Effective when batched log transmission mode is enabled, it represents the maximum waiting time for a batch of data before sending (unit: ms) (Available since v1.3.6) | Integer | Optional | 1 | -| batch.size-bytes | Effective when batched log transmission mode is enabled, it represents the maximum batch size for a batch of data (unit: byte) | Long | Optional | 16*1024*1024 | -| load-tsfile-strategy | When synchronizing file data, whether the receiver waits for the local load tsfile operation to complete before responding to the sender:
sync: Wait for the local load tsfile operation to complete before returning the response.
async: Do not wait for the local load tsfile operation to complete; return the response immediately. (Available since v1.3.6) | String: sync / async | Optional | sync | - -#### iotdb-air-gap-sink - -| key | value | value Range | required or not | Default Value | -| :--------------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :----------- | -| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | Required | - | -| node-urls | The URL of the data service port of any DataNode nodes on the target IoTDB | String. Example: :'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - | -| air-gap.handshake-timeout-ms | The timeout duration of the handshake request when the sender and receiver first attempt to establish a connection, unit: ms | Integer | Optional | 5000 | -| load-tsfile-strategy | When synchronizing file data, whether the receiver waits for the local load tsfile operation to complete before responding to the sender:
sync: Wait for the local load tsfile operation to complete before returning the response.
async: Do not wait for the local load tsfile operation to complete; return the response immediately. (Available since v1.3.6) | String: sync / async | Optional | sync | - -#### iotdb-thrift-ssl-sink - -| key | value | value Range | required or not | Default Value | -| :---------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :----------- | -| sink | iotdb-thrift-ssl-sink | String: iotdb-thrift-ssl-sink | Required | - | -| node-urls | The URL of the data service port of any DataNode nodes on the target IoTDB (please note that synchronization tasks do not support forwarding to its own service) | String. Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - | -| batch.enable | Whether to enable batched log transmission mode to improve transmission throughput and reduce IOPS | Boolean: true, false | Optional | true | -| batch.max-delay-seconds | Effective when batched log transmission mode is enabled, it represents the maximum waiting time for a batch of data before sending (unit: s) | Integer | Optional | 1 | -| batch.max-delay-ms | Effective when batched log transmission mode is enabled, it represents the maximum waiting time for a batch of data before sending (unit: ms) (Available since v1.3.6) | Integer | Optional | 1 | -| batch.size-bytes | Effective when batched log transmission mode is enabled, it represents the maximum batch size for a batch of data (unit: byte) | Long | Optional | 16*1024*1024 | -| load-tsfile-strategy | When synchronizing file data, whether the receiver waits for the local load tsfile operation to complete before responding to the sender:
sync: Wait for the local load tsfile operation to complete before returning the response.
async: Do not wait for the local load tsfile operation to complete; return the response immediately. (Available since v1.3.6) | String: sync / async | Optional | sync | -| ssl.trust-store-path | The trust store certificate path required to connect to the target DataNode | String: certificate directory name; when configured as a relative directory, it is relative to the IoTDB root directory. | Required | - | -| ssl.trust-store-pwd | The trust store certificate password required to connect to the target DataNode | Integer | Required | - | diff --git a/src/UserGuide/dev-1.3/User-Manual/IoTDB-View_timecho.md b/src/UserGuide/dev-1.3/User-Manual/IoTDB-View_timecho.md deleted file mode 100644 index b84bfef7a..000000000 --- a/src/UserGuide/dev-1.3/User-Manual/IoTDB-View_timecho.md +++ /dev/null @@ -1,549 +0,0 @@ - - -# View - -## Sequence View Application Background - -### Application Scenario 1 Time Series Renaming (PI Asset Management) - -In practice, the equipment that collects data may be named with identification numbers that are hard for people to read, which makes queries at the business layer difficult. - -A sequence view, on the other hand, can reorganise the management of these sequences and expose them under a new model structure, without changing the original sequence content and without creating or copying sequences. - -**For example**: a cloud device uses its NIC MAC address as its entity number and stores data by writing to the following time series: `root.db.0800200A8C6D.xvjeifg`. - -This name is hard for the user to understand. With the sequence view feature, the user can map it to a sequence view and use `root.view.device001.temperature` to access the captured data. 
- -### Application Scenario 2 Simplifying business layer query logic - -Sometimes users have a large number of devices that manage a large number of time series. For a given business task, the user wants to deal with only some of these sequences. The sequences of interest can then be picked out with the sequence view function, which is convenient for repeated querying and writing. - -**For example**: Users manage a product assembly line with a large number of time series for each segment of the equipment. The temperature inspector only needs to focus on the temperature of the equipment, so they can extract the temperature-related sequences and compose a sequence view. - -### Application Scenario 3 Auxiliary Rights Management - -In the production process, different operations staff are generally responsible for different scopes. For security reasons, it is often necessary to restrict the access scope of the operations staff through permission management. - -**For example**: The safety management department only needs to monitor the temperature of each device in a production line, but these data are stored in the same database as other confidential data. In this case, it is possible to create a number of new views that contain only the temperature-related time series on the production line, and then give the security officer access to only these sequence views, thus restricting permissions. - -### Motivation for designing sequence view functionality - -Combining the three usage scenarios above, the motivations for designing the sequence view functionality are: - -1. Time series renaming. -2. Simplifying the query logic at the business level. -3. Auxiliary rights management: opening data to specific users through the view. 
- -## Sequence View Concepts - -### Terminology Concepts - -Concept: Unless otherwise specified, the views referred to in this document are **Sequence Views**; new features such as device views may be introduced in the future. - -### Sequence view - -A sequence view is a way of organising the management of time series. - -In traditional relational databases, data must all be stored in a table, whereas in time series databases such as IoTDB, it is the sequence that is the storage unit. Therefore, the concept of sequence views in IoTDB is also built on sequences. - -A sequence view is a virtual time series, and each virtual time series is like a soft link or shortcut that maps to a sequence, or to some computational logic, outside the view. In other words, a virtual sequence either maps to some defined external sequence or is computed from multiple external sequences. - -Users can create views using complex SQL queries, where the sequence view acts as a stored query statement, and when data is read from the view, the stored query statement is used as the source of the data in the FROM clause. - -### Alias Sequences - -Some sequence views form a special class that satisfies all of the following conditions: - -1. the data source is a single time series -2. there is no computational logic -3. there are no filtering conditions (e.g., no WHERE clause restrictions). - -Such a sequence view is called an **alias sequence**, or alias sequence view. A sequence view that does not satisfy all of the above conditions is called a non-alias sequence view. The difference between them is that only aliased sequences support write functionality. - -**All sequence views, including aliased sequences, do not currently support Trigger functionality.** - -### Nested Views - -A user may want to select a number of sequences from an existing sequence view to form a new sequence view, called a nested view. - -**The current version does not support the nested view feature**. 
- -### Some constraints on sequence views in IoTDB - -#### Restriction 1 A sequence view must depend on one or several time series - -A sequence view takes one of two forms: - -1. it maps to a time series -2. it is computed from one or more time series. - -The first form has been illustrated in the previous section and is easy to understand; the second form exists because a sequence view allows computational logic. - -For example, the user has installed two thermometers in the same boiler and now needs to calculate the average of the two temperature values as a measurement. The user has captured the following two sequences: `root.db.d01.temperature01`, `root.db.d01.temperature02`. - -At this point, the user can use the average of the two sequences as one sequence in the view: `root.db.d01.avg_temperature`. - -This example is expanded in detail in section 3.1.2. - -#### Restriction 2 Non-alias sequence views are read-only - -Writing to non-alias sequence views is not allowed. - -Only aliased sequence views are supported for writing. - -#### Restriction 3 Nested views are not allowed - -It is not possible to select certain columns in an existing sequence view to create a sequence view, either directly or indirectly. - -An example of this restriction will be given in 3.1.3. - -#### Restriction 4 A sequence view and a time series cannot share the same name - -Both sequence views and time series are located under the same tree, so they cannot have the same name. - -The name (path) of any sequence should be uniquely determined. - -#### Restriction 5 Sequence views share time series data with time series; metadata such as tags is not shared - -Sequence views are mappings pointing to time series, so they fully share the time series data, with the time series being responsible for persistent storage. - -However, their metadata such as tags and attributes are not shared. 
- -This is because users who query through views care about the structure of the current view. If they query with GROUP BY TAGS or similar, they expect the grouping to follow the tags of the view, not the tags of the underlying time series (which they may not even be aware of). - -## Sequence view functionality - -### Creating a view - -Creating a sequence view is similar to creating a time series; the difference is that you need to specify the data source, i.e., the original sequence, through the AS keyword. - -#### SQL for creating a view - -Users can select some sequences to create a view: - -```SQL -CREATE VIEW root.view.device.status -AS - SELECT s01 - FROM root.db.device -``` - -This indicates that the user has selected the sequence `s01` from the existing device `root.db.device`, creating the sequence view `root.view.device.status`. - -The sequence view can exist under the same entity as the time series, for example: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device -``` - -Thus, there is a virtual copy of `s01` under `root.db.device`, but with a different name `status`. - -Note that the sequence views in both of the above examples are aliased sequences. For this case there is a more convenient way of creating the view: - -```SQL -CREATE VIEW root.view.device.status -AS - root.db.device.s01 -``` - -#### Creating views with computational logic - -Following the example in section 2.2 Restriction 1: - -> A user has installed two thermometers in the same boiler and now needs to calculate the average of the two temperature values as a measurement. The user has captured the following two sequences: `root.db.d01.temperature01`, `root.db.d01.temperature02`. -> -> At this point, the user can use the average of the two sequences as one sequence in the view: `root.db.d01.avg_temperature`. 
- -If the view is not used, the user can query the average of the two temperatures like this: - -```SQL -SELECT (temperature01 + temperature02) / 2 -FROM root.db.d01 -``` - -With a sequence view, the user can create a view this way to simplify future queries: - -```SQL -CREATE VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02) / 2 - FROM root.db.d01 -``` - -The user can then query it like this: - -```SQL -SELECT avg_temperature FROM root.db.d01 -``` - -#### Nested sequence views not supported - -Continuing with the example from 3.1.2, the user now wants to create a new view using the sequence view `root.db.d01.avg_temperature`, which is not allowed. Nested views are currently not supported, whether or not the source view is an aliased sequence. - -For example, the following SQL statement will report an error: - -```SQL -CREATE VIEW root.view.device.avg_temp_copy -AS - root.db.d01.avg_temperature -- Not supported. Nested views are not allowed -``` - -#### Creating multiple sequence views at once - -Specifying only one sequence view per statement is inconvenient, so multiple sequence views can be specified at a time, for example: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - SELECT s01, s02 - FROM root.db.device -``` - -In addition, the statement above can be simplified: - -```SQL -CREATE VIEW root.db.device(status, sub.hardware) -AS - SELECT s01, s02 - FROM root.db.device -``` - -Both statements above are equivalent to the following: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device; - -CREATE VIEW root.db.device.sub.hardware -AS - SELECT s02 - FROM root.db.device -``` - -They are also equivalent to the following: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - root.db.device.s01, root.db.device.s02 - --- or - -CREATE VIEW root.db.device(status, sub.hardware) -AS - root.db.device(s01, s02) -``` - -##### The mapping relationships between all sequences 
are statically stored - -Sometimes, the SELECT clause may match a number of sequences that can only be determined at runtime, such as below: - -```SQL -SELECT s01, s02 -FROM root.db.d01, root.db.d02 -``` - -The number of sequences that can be matched by the above statement is uncertain and depends on the state of the system. Even so, the user can use it to create views. - -However, it is important to note that the mapping relationship between all sequences is stored statically (fixed at creation). Consider the following example: - -The current database contains only three sequences `root.db.d01.s01`, `root.db.d02.s01`, `root.db.d02.s02`, and then the view is created: - -```SQL -CREATE VIEW root.view.d(alpha, beta, gamma) -AS - SELECT s01, s02 - FROM root.db.d01, root.db.d02 -``` - -The mapping relationship between time series and sequence views is as follows: - -| sequence number | time series | sequence view | -| ---- | ----------------- | ----------------- | -| 1 | `root.db.d01.s01` | root.view.d.alpha | -| 2 | `root.db.d02.s01` | root.view.d.beta | -| 3 | `root.db.d02.s02` | root.view.d.gamma | - -After that, if the user adds the sequence `root.db.d01.s02`, it does not correspond to any view; then, if the user deletes `root.db.d01.s01`, the query for `root.view.d.alpha` will report an error directly, and it will not correspond to `root.db.d01.s02` either. - -Always keep in mind that the mapping relationships between sequences are stored statically and do not change after creation. - -#### Batch Creation of Sequence Views - -There are several existing devices, each with a temperature value, for example: - -1. root.db.d1.temperature -2. root.db.d2.temperature -3. ... - -There may be many other sequences stored under these devices (e.g. `root.db.d1.speed`), but it is possible to create a view that contains only the temperature values for these devices, independent of the other sequences: 
- -```SQL -CREATE VIEW root.db.view(${2}_temperature) -AS - SELECT temperature FROM root.db.* -``` - -This is modelled on the naming convention of query writeback (`SELECT INTO`), which uses variable placeholders to specify naming rules. See also: [QUERY WRITEBACK (SELECT INTO)](../Basic-Concept/Query-Data.md#into-clause-query-write-back) - -Here `root.db.*.temperature` specifies which time series will be included in the view, and `${2}` specifies from which node in the time series the name is extracted to name the sequence view. - -Here, `${2}` refers to level 2 (starting at 0) of `root.db.*.temperature`, which is the result of the `*` match; and `${2}_temperature` is that match joined to `temperature` with an underscore, which forms the node name of each sequence under the view. - -The above statement for creating a view is equivalent to the following form: - -```SQL -CREATE VIEW root.db.view(${2}_${3}) -AS - SELECT temperature from root.db.* -``` - -The final view contains these sequences: - -1. root.db.view.d1_temperature -2. root.db.view.d2_temperature -3. ... - -When a view is created with wildcards, only the static mapping relationships at the moment of creation are stored. - -#### SELECT clauses are somewhat limited when creating views - -The SELECT clause used when creating a sequence view is subject to certain restrictions. The main restrictions are as follows: - -1. the `WHERE` clause cannot be used. -2. the `GROUP BY` clause cannot be used. -3. `MAX_VALUE` and other aggregation functions cannot be used. - -Simply put, after `AS` you can only use `SELECT ... FROM ... ` and the results of this query must form a time series. - -### View Data Queries - -For the supported data query functions, sequence views and time series can be used interchangeably, with identical behaviour, when performing time series data queries. - -**The types of queries that are not currently supported by the sequence view are as follows:** - -1. 
align by device query -2. group by tags query - -Users can also mix time series and sequence view queries in the same SELECT statement, for example: - -```SQL -SELECT temperature01, temperature02, avg_temperature -FROM root.db.d01 -WHERE temperature01 < temperature02 -``` - -However, if the user queries the metadata of a sequence, such as tags and attributes, the result is the metadata of the sequence view, not that of the time series referenced by the sequence view. - -In addition, for aliased sequences, if the user wants to get information about the underlying time series, such as tags and attributes, the user needs to query the mapping of the view columns to find the corresponding time series, and then query that time series for its tags and attributes. The method of querying the mapping of the view columns will be explained in section 3.5. - -### Modify Views - -The modification operations supported by the view include: modifying its calculation logic, modifying tags/attributes, and deleting. 
- -#### Modify view data source - -```SQL -ALTER VIEW root.view.device.status -AS - SELECT s01 - FROM root.ln.wf.d01 -``` - -#### Modify the view's calculation logic - -```SQL -ALTER VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02 + temperature03) / 3 - FROM root.db.d01 -``` - -#### Tag point management - -- Add a new tag - -```SQL -ALTER VIEW root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4 -``` - -- Add a new attribute - -```SQL -ALTER VIEW root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4 -``` - -- Rename a tag or attribute - -```SQL -ALTER VIEW root.turbine.d1.s1 RENAME tag1 TO newTag1 -``` - -- Reset the value of a tag or attribute - -```SQL -ALTER VIEW root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1 -``` - -- Delete an existing tag or attribute - -```SQL -ALTER VIEW root.turbine.d1.s1 DROP tag1, tag2 -``` - -- Upsert tags and attributes - -> If the tag or attribute did not exist before, insert it; otherwise, update the old value with the new one. - -```SQL -ALTER VIEW root.turbine.d1.s1 UPSERT TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4) -``` - -#### Deleting Views - -Since a view is a sequence, a view can be deleted as if it were a time series. - - -```SQL -DELETE VIEW root.view.device.avg_temperature -``` - -### View Synchronisation - - - -#### If the dependent original sequence is deleted - -When the sequence view is queried (when the sequence is parsed), **an empty result set** is returned if the dependent time series does not exist. - -This is similar to the feedback for querying a non-existent sequence, but with a difference: if the dependent time series cannot be parsed, the empty result set still contains the table header, as a reminder to the user that the view is problematic. - -Additionally, when the dependent time series is deleted, no attempt is made to find out whether any view depends on it, and the user receives no warning. 
- -#### Data Writes to Non-Aliased Sequences Not Supported - -Writes to non-alias sequences are not supported. - -Please refer to the previous section 2.1.6 Restriction 2 for more details. - -#### Metadata for sequences is not shared - -Please refer to the previous section 2.1.6 Restriction 5 for details. - -### View Metadata Queries - -View metadata query specifically refers to querying the metadata of the view itself (e.g., how many columns the view has), as well as information about the views in the database (e.g., what views are available). - -#### Viewing Current View Columns - -The user has two ways of querying: - -1. a query using `SHOW TIMESERIES`, which returns both time series and sequence views. However, only some of the attributes of the view can be displayed. -2. a query using `SHOW VIEW`, which returns only sequence views. It displays the complete properties of the sequence view. - -Example: - -```Shell -IoTDB> show timeseries; -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.device.s01 | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.view.status | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp01 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| 
-+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp02 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.avg_temp| null| root.db| FLOAT| null| null|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -Total line number = 5 -It costs 0.789s -IoTDB> -``` - -The last column `ViewType` shows the type of the sequence: a time series is BASE and a sequence view is VIEW. - -Some of the sequence view properties will be missing. For example, `root.db.d01.avg_temp` is calculated as an average of temperatures, so its `Encoding` and `Compression` properties are null. - -In addition, the query results of the `SHOW TIMESERIES` statement are divided into two main parts. - -1. information about the time series data, such as data type, compression, encoding, etc. -2. other metadata information, such as tag, attribute, database, etc. - -For the sequence view, the time series data information presented is the same as that of the original sequence, or null (e.g., the calculated average temperature has a data type but no compression method); the metadata information presented is that of the view. - -To learn more about the view, use `SHOW VIEW`. `SHOW VIEW` shows the source of the view's data, among other details. 
- -```Shell -IoTDB> show VIEW root.**; -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -| Timeseries|Database|DataType|Tags|Attributes|ViewType| SOURCE| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.view.status | root.db| INT32|null| null| VIEW| root.db.device.s01| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.d01.avg_temp| root.db| FLOAT|null| null| VIEW|(root.db.d01.temp01+root.db.d01.temp02)/2| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -Total line number = 2 -It costs 0.789s -IoTDB> -``` - -The last column, `SOURCE`, shows the data source for the sequence view: the source sequence or the expression that defines the view. - -##### About Data Types - -Both of the above queries involve the data type of the view. The data type of a view is inferred from the original time series types of the query statement or alias sequence that defines the view. This data type is computed in real time based on the current state of the system, so the data type returned may differ from one moment to another. - -## FAQ - -#### Q1: I want the view to implement the function of type conversion. For example, a time series of type int32 was originally placed in the same view as other series of type int64. I now want all the data queried through the view to be automatically converted to int64 type. - -> Ans: This is not the function of the sequence view. But the conversion can be done using `CAST`, for example: - -```SQL -CREATE VIEW root.db.device.int64_status -AS - SELECT CAST(s1, 'type'='INT64') from root.db.device -``` - -> This way, a query for `root.db.device.int64_status` will yield a result of type int64. 
-> -> Please note in particular that in the above example, the data for the sequence view is obtained by `CAST` conversion, so `root.db.device.int64_status` is not an aliased sequence, and thus **not supported for writing**. - -#### Q2: Is default naming supported? If I select a number of time series and create a view without specifying the name of each series, will the database name them automatically? - -> Ans: Not supported. Users must specify the naming explicitly. - -#### Q3: In the original system, creating the time series `root.db.device.s01` automatically creates the database `root.db` and the device `root.db.device`. Deleting the time series `root.db.device.s01` then automatically deletes `root.db.device`, while `root.db` remains. Will this mechanism be followed for creating views? What are the considerations? - -> Ans: The original behaviour is kept unchanged; the introduction of view functionality does not change this existing logic. - -#### Q4: Does it support sequence view renaming? - -> Ans: Renaming is not supported in the current version. You can create a new view with the new name and use that instead. \ No newline at end of file diff --git a/src/UserGuide/dev-1.3/User-Manual/Streaming_timecho.md b/src/UserGuide/dev-1.3/User-Manual/Streaming_timecho.md deleted file mode 100644 index 80edebe9c..000000000 --- a/src/UserGuide/dev-1.3/User-Manual/Streaming_timecho.md +++ /dev/null @@ -1,857 +0,0 @@ - - -# Stream Computing Framework - -The IoTDB stream processing framework allows users to implement customized stream processing logic, which can monitor and capture storage engine changes, transform changed data, and push transformed data outward. - -We call a data flow processing task a Pipe. 
A stream processing task (Pipe) contains three subtasks: - -- Source task -- Processor task -- Sink task - -The stream processing framework allows users to customize the processing logic of the three subtasks in Java and to process data in a UDF-like manner. -In a Pipe, the three subtasks are executed by three plugins respectively, and the data is processed by these three plugins in turn: -Pipe Source is used to extract data, Pipe Processor is used to process data, Pipe Sink is used to send data, and the final data is sent to an external system. - -**The model of the Pipe task is as follows:** - -![pipe.png](/img/1706778988482.jpg) - -Describing a data flow processing task essentially means describing the properties of the Pipe Source, Pipe Processor and Pipe Sink plugins. -Users can declaratively configure the specific attributes of the three subtasks through SQL statements, and achieve flexible data ETL capabilities by combining different attributes. - -Using the stream processing framework, a complete data link can be built to meet the needs of end-side-cloud synchronization, off-site disaster recovery, and read-write load sub-library. - -## Custom stream processing plugin development - -### Programming development dependencies - -It is recommended to use Maven to build the project and add the following dependency in `pom.xml`. Please be careful to select the same dependency version as the IoTDB server version. - -```xml -<dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>pipe-api</artifactId> - <version>1.3.1</version> - <scope>provided</scope> -</dependency> -``` - -### Event-driven programming model - -The user programming interface of the stream processing plugin follows the general design concept of the event-driven programming model. Events are data abstractions in the user programming interface, and the programming interface is decoupled from the specific execution method. The user only needs to describe how the system should process an event (data) once it reaches the system. 
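The decoupling described above can be pictured with a small sketch. It is only an illustration under stated assumptions: `DemoEvent`, `DemoHandler`, and `dispatch` are hypothetical stand-ins invented here, not types from `pipe-api`. The handler states what should happen when an event arrives, while a separate loop (standing in for the engine) decides when to deliver it.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Event-driven sketch. Every type below is a hypothetical stand-in for
 * illustration only; none of them is part of the real pipe-api.
 */
public class EventDrivenSketch {

  /** Hypothetical event: one written data point. */
  public static final class DemoEvent {
    public final String series;
    public final double value;

    public DemoEvent(String series, double value) {
      this.series = series;
      this.value = value;
    }
  }

  /** The plugin side: only states what should happen when an event arrives. */
  public interface DemoHandler {
    String onEvent(DemoEvent event);
  }

  /** The engine side: owns delivery order and timing, and calls the handler. */
  public static List<String> dispatch(List<DemoEvent> events, DemoHandler handler) {
    List<String> results = new ArrayList<>();
    for (DemoEvent event : events) {
      results.add(handler.onEvent(event));
    }
    return results;
  }

  public static void main(String[] args) {
    List<DemoEvent> events = new ArrayList<>();
    events.add(new DemoEvent("root.db.d01.temperature01", 21.5));
    events.add(new DemoEvent("root.db.d01.temperature02", 22.5));

    // The handler is plain user logic; it never schedules or fetches events itself.
    DemoHandler handler = e -> e.series + "=" + e.value;

    for (String line : dispatch(events, handler)) {
      System.out.println(line);
    }
  }
}
```

In the actual framework the engine, not user code, plays the role of `dispatch`; the sketch only shows why plugin logic can stay independent of how and when events are delivered.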
- -In the user programming interface of the stream processing plugin, events are an abstraction of database data writing operations. An event is captured by the stand-alone stream processing engine and passed to the PipeSource plugin, PipeProcessor plugin, and PipeSink plugin in sequence, following the three-stage stream processing flow, triggering the execution of user logic in the three plugins in turn. - -To provide both low latency for stream processing in low-load scenarios on the end side and high throughput in high-load scenarios on the end side, the stream processing engine dynamically chooses whether to process operation logs or data files. Therefore, the user programming interface of stream processing requires users to provide processing logic for the following two types of events: the operation log writing event TabletInsertionEvent and the data file writing event TsFileInsertionEvent. - -#### **Operation log writing event (TabletInsertionEvent)** - -The operation log writing event (TabletInsertionEvent) is a high-level data abstraction for user write requests. It lets users manipulate the underlying data of write requests through a unified operation interface. - -For different database deployment methods, the underlying storage structures corresponding to operation log writing events are different. For stand-alone deployment scenarios, the operation log writing event is an encapsulation of write-ahead log (WAL) entries; for distributed deployment scenarios, the operation log writing event is an encapsulation of a single node's consensus protocol operation log entries. - -For write operations generated by different write request interfaces in the database, the request data structure corresponding to the operation log writing event is also different. 
IoTDB provides numerous writing interfaces such as InsertRecord, InsertRecords, InsertTablet, InsertTablets, etc. Each writing request uses a completely different serialization method, and the generated binary entries are also different. - -Operation log writing events give users a unified view of data operations, which hides the implementation differences of the underlying data structures, greatly lowers the programming threshold for users, and improves ease of use. - -```java -/** TabletInsertionEvent is used to define the event of data insertion. */ -public interface TabletInsertionEvent extends Event { - - /** - * The consumer processes the data row by row and collects the results by RowCollector. - * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processRowByRow(BiConsumer<Row, RowCollector> consumer); - - /** - * The consumer processes the Tablet directly and collects the results by RowCollector. - * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processTablet(BiConsumer<Tablet, RowCollector> consumer); -} -``` - -#### **Data file writing event (TsFileInsertionEvent)** - -The data file writing event (TsFileInsertionEvent) is a high-level abstraction of the database file writing operation. It is a data collection of several operation log writing events (TabletInsertionEvent). - -The storage engine of IoTDB is LSM structured. When data is written, the write operation is first appended to a log-structured file, and the written data is kept in memory at the same time. When memory usage reaches its configured upper limit, a flush to disk is triggered: the data in memory is converted into a database file, and the previously written operation log is deleted. 
When the data in memory is converted into data in the database file, it goes through two compression steps: encoding compression and general compression. Therefore, the data in the database file takes up less space than the original data in memory. - -In extreme network conditions, directly transmitting data files is more economical than transmitting data write operations. It occupies less network bandwidth and achieves faster transmission speeds. Of course, there is no free lunch. Computing and processing data in files requires additional file I/O costs compared to directly computing and processing data in memory. However, it is precisely the existence of two structures, disk data files and in-memory write operations, each with its own advantages and disadvantages, that gives the system the opportunity to make dynamic trade-offs and adjustments. It is based on this observation that the data file writing event is introduced into the plugin's event model. - -To sum up, the data file writing event appears in the event stream of the stream processing plugin in two situations: - -(1) Historical data extraction: Before a stream processing task starts, all written data that has been persisted to disk exists in the form of TsFile. After the stream processing task starts, this historical data is abstracted as TsFileInsertionEvent when it is collected; - -(2) Real-time data extraction: While a stream processing task is running, if operation log writing events are processed more slowly than write requests arrive, then after the backlog reaches a certain point, the operation log writing events that cannot be processed in time are persisted to disk and exist in the form of TsFile. When this data is extracted by the stream processing engine, it is abstracted as TsFileInsertionEvent. - -```java -/** - * TsFileInsertionEvent is used to define the event of writing TsFile.
Event data is stored on disk, - * which is compressed and encoded, and requires IO cost for computational processing. - */ -public interface TsFileInsertionEvent extends Event { - - /** - * The method is used to convert the TsFileInsertionEvent into several TabletInsertionEvents. - * - * @return {@code Iterable<TabletInsertionEvent>} the list of TabletInsertionEvent - */ - Iterable<TabletInsertionEvent> toTabletInsertionEvents(); -} -``` - -### Custom stream processing plugin programming interface definition - -Based on the custom stream processing plugin programming interface, users can easily write data extraction plugins, data processing plugins and data sending plugins, so that the stream processing function can be flexibly adapted to various industrial scenarios. - -#### Data extraction plugin interface - -Data extraction is the first of the three stages of stream processing, which run from data extraction to data sending. The data extraction plugin (PipeSource) is the bridge between the stream processing engine and the storage engine. It monitors the behavior of the storage engine and captures various data write events. - -```java -/** - * PipeSource - * - *

PipeSource is responsible for capturing events from sources. - * - *

Various data sources can be supported by implementing different PipeSource classes. - * - *

The lifecycle of a PipeSource is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH SOURCE` clause in SQL are - * parsed and the validation method {@link PipeSource#validate(PipeParameterValidator)} will - * be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} will be called to - * configure the runtime behavior of the PipeSource. - *
  • Then the method {@link PipeSource#start()} will be called to start the PipeSource. - *
  • While the collaboration task is in progress, the method {@link PipeSource#supply()} will be - * called to capture events from sources and then the events will be passed to the - * PipeProcessor. - *
  • The method {@link PipeSource#close()} will be called when the collaboration task is - * cancelled (the `DROP PIPE` command is executed). - *
- */ -public interface PipeSource extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSource. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeSourceRuntimeConfiguration. - *
- * - *

This method is called after the method {@link PipeSource#validate(PipeParameterValidator)} - * is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSource - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSourceRuntimeConfiguration configuration) - throws Exception; - - /** - * Start the Source. After this method is called, events should be ready to be supplied by - * {@link PipeSource#supply()}. This method is called after {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @throws Exception the user can throw errors if necessary - */ - void start() throws Exception; - - /** - * Supply a single event from the Source and the caller will send the event to the processor. - * This method is called after {@link PipeSource#start()} is called. - * - * @return the event to be supplied. The event may be null if the Source has no more events at - * the moment, but the Source is still running for more events. - * @throws Exception the user can throw errors if necessary - */ - Event supply() throws Exception; -} -``` - -#### Data processing plugin interface - -Data processing is the second of the three stages of stream processing, which run from data extraction to data sending. The data processing plugin (PipeProcessor) is mainly used to filter and transform the various events captured by the data extraction plugin (PipeSource). - -```java -/** - * PipeProcessor - * - *

PipeProcessor is used to filter and transform the Event formed by the PipeSource. - * - *

The lifecycle of a PipeProcessor is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH PROCESSOR` clause in SQL are - * parsed and the validation method {@link PipeProcessor#validate(PipeParameterValidator)} - * will be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} will be called - * to configure the runtime behavior of the PipeProcessor. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeSource captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the events and then passes them to the PipeSink. The - * following 3 methods will be called: {@link - * PipeProcessor#process(TabletInsertionEvent, EventCollector)}, {@link - * PipeProcessor#process(TsFileInsertionEvent, EventCollector)} and {@link - * PipeProcessor#process(Event, EventCollector)}. - *
    • PipeSink serializes the events into binaries and sends them to sinks. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeProcessor#close() } method will be called. - *
- */ -public interface PipeProcessor extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeProcessor. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeProcessorRuntimeConfiguration. - *
- * - *

This method is called after the method {@link - * PipeProcessor#validate(PipeParameterValidator)} is called and before the beginning of the - * events processing. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeProcessor - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeProcessorRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is called to process the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(TabletInsertionEvent tabletInsertionEvent, EventCollector eventCollector) - throws Exception; - - /** - * This method is called to process the TsFileInsertionEvent. - * - * @param tsFileInsertionEvent TsFileInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - default void process(TsFileInsertionEvent tsFileInsertionEvent, EventCollector eventCollector) - throws Exception { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - process(tabletInsertionEvent, eventCollector); - } - } - - /** - * This method is called to process the Event. - * - * @param event Event to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(Event event, EventCollector eventCollector) throws Exception; -} -``` - -#### Data sending plugin interface - -Data sending is the third stage of the three stages of stream processing data from data extraction to data sending. 
The data sending plugin (PipeSink) is mainly used to send the various events processed by the data processing plugin (PipeProcessor). It serves as the network implementation layer of the stream processing framework, and its interface should allow access to multiple real-time communication protocols and multiple sinks. - -```java -/** - * PipeSink - * - *

PipeSink is responsible for sending events to sinks. - * - *

Various network protocols can be supported by implementing different PipeSink classes. - * - *

The lifecycle of a PipeSink is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH SINK` clause in SQL are - * parsed and the validation method {@link PipeSink#validate(PipeParameterValidator)} will be - * called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link PipeSink#customize(PipeParameters, - * PipeSinkRuntimeConfiguration)} will be called to configure the runtime behavior of the - * PipeSink and the method {@link PipeSink#handshake()} will be called to create a connection - * with sink. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeSource captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the events and then passes them to the PipeSink. - *
    • PipeSink serializes the events into binaries and sends them to sinks. The following 3 - * methods will be called: {@link PipeSink#transfer(TabletInsertionEvent)}, {@link - * PipeSink#transfer(TsFileInsertionEvent)} and {@link PipeSink#transfer(Event)}. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeSink#close() } method will be called. - *
- * - *

In addition, the method {@link PipeSink#heartbeat()} will be called periodically to check - * whether the connection with sink is still alive. The method {@link PipeSink#handshake()} will be - * called to create a new connection with the sink when the method {@link PipeSink#heartbeat()} - * throws exceptions. - */ -public interface PipeSink extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSink. In this method, the user can do the following - * things: - * - *

    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeSinkRuntimeConfiguration. - *
- * - *

This method is called after the method {@link PipeSink#validate(PipeParameterValidator)} is - * called and before the method {@link PipeSink#handshake()} is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSink - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSinkRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is used to create a connection with sink. This method will be called after the - * method {@link PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called or - * will be called when the method {@link PipeSink#heartbeat()} throws exceptions. - * - * @throws Exception if the connection is failed to be created - */ - void handshake() throws Exception; - - /** - * This method will be called periodically to check whether the connection with sink is still - * alive. - * - * @throws Exception if the connection dies - */ - void heartbeat() throws Exception; - - /** - * This method is used to transfer the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(TabletInsertionEvent tabletInsertionEvent) throws Exception; - - /** - * This method is used to transfer the TsFileInsertionEvent. 
- * - * @param tsFileInsertionEvent TsFileInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - default void transfer(TsFileInsertionEvent tsFileInsertionEvent) throws Exception { - try { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - transfer(tabletInsertionEvent); - } - } finally { - tsFileInsertionEvent.close(); - } - } - - /** - * This method is used to transfer the generic events, including HeartbeatEvent. - * - * @param event Event to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(Event event) throws Exception; -} -``` - -## Custom stream processing plugin management - -In order to ensure the flexibility and ease of use of user-defined plugins in actual production, the system also needs to provide the ability to dynamically and uniformly manage plugins. -The stream processing plugin management statements introduced in this chapter provide an entry point for dynamic unified management of plugins. - -### Load plugin statement - -In IoTDB, if you want to dynamically load a user-defined plugin in the system, you first need to implement a specific plugin class based on PipeSource, PipeProcessor or PipeSink. -Then the plugin class needs to be compiled and packaged into a jar executable file, and finally the plugin is loaded into IoTDB using the management statement for loading the plugin. - -The syntax of the management statement for loading the plugin is shown in the figure. - -```sql -CREATE PIPEPLUGIN [IF NOT EXISTS] -AS -USING -``` -**IF NOT EXISTS semantics**: Used in creation operations to ensure that the create command is executed when the specified Pipe Plugin does not exist, preventing errors caused by attempting to create an existing Pipe Plugin. 
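As a rough illustration of the first step above (implementing a plugin class), the sketch below shows the filter-then-collect shape of a processor. It is not the IoTDB pipe API: `Event`, `EventCollector`, `ValueEvent`, `ThresholdProcessor`, and `ThresholdProcessorDemo` are simplified stand-ins defined here only so the example compiles on its own. A real plugin would implement `PipeProcessor`, including `validate`, `customize`, and the three `process` methods.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the real Event interface (assumption for illustration only).
interface Event {}

// Simplified stand-in for the real EventCollector (assumption for illustration only).
interface EventCollector {
  void collect(Event event);
}

// Hypothetical event carrying a single numeric value; not part of IoTDB.
final class ValueEvent implements Event {
  final double value;

  ValueEvent(double value) {
    this.value = value;
  }
}

// Hypothetical processor: forwards only ValueEvents whose value exceeds a threshold.
final class ThresholdProcessor {
  private final double threshold;

  ThresholdProcessor(double threshold) {
    this.threshold = threshold;
  }

  // Mirrors the shape of process(Event, EventCollector): filter, then collect.
  void process(Event event, EventCollector collector) {
    if (event instanceof ValueEvent && ((ValueEvent) event).value > threshold) {
      collector.collect(event);
    }
  }
}

// Small driver that plays the role of the engine feeding events to the processor.
final class ThresholdProcessorDemo {
  static List<Event> run(double threshold, double... values) {
    List<Event> collected = new ArrayList<>();
    ThresholdProcessor processor = new ThresholdProcessor(threshold);
    for (double v : values) {
      // The lambda-style method reference acts as the EventCollector.
      processor.process(new ValueEvent(v), collected::add);
    }
    return collected;
  }
}
```

Events that the processor does not pass to the collector are simply dropped from the stream, which is how a filtering plugin would behave under these assumptions.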
- -Example: Suppose you have implemented a data processing plugin named edu.tsinghua.iotdb.pipe.ExampleProcessor, packaged it as pipe-plugin.jar, and want to use it in the stream processing engine under the name example. There are two ways to provide the plugin package: upload it to a URI server, or place it in a local directory of the cluster. - -Method 1: Upload to the URI server - -Preparation: To register in this way, you need to upload the JAR package to the URI server in advance and ensure that the IoTDB instance that executes the registration statement can access the URI server. For example: https://example.com:8080/iotdb/pipe-plugin.jar . - -SQL: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI -``` - -Method 2: Place the JAR package in a local directory of the cluster - -Preparation: To register in this way, you need to place the JAR package in any path on the machine where the DataNode is located. We recommend placing it in the /ext/pipe directory under the IoTDB installation path (this directory already exists in the installation package, so you do not need to create it). For example: iotdb-1.x.x-bin/ext/pipe/pipe-plugin.jar. **(Note: If you are using a cluster, you need to place the JAR package under the same path on the machine of every DataNode)** - -SQL: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI -``` - -### Delete plugin statement - -When a user no longer wants to use a plugin and needs to uninstall it from the system, they can use the delete plugin statement shown below.
- -```sql -DROP PIPEPLUGIN [IF EXISTS] -``` - -**IF EXISTS semantics**: Used in deletion operations to ensure that when a specified Pipe Plugin exists, the delete command is executed to prevent errors caused by attempting to delete a non-existent Pipe Plugin. - -### View plugin statements - -Users can also view plugins in the system on demand. View the statement of the plugin as shown in the figure. -```sql -SHOW PIPEPLUGINS -``` - -## System preset stream processing plugin - -### Pre-built Source Plugin - -#### iotdb-source - -Function: Extract historical or realtime data inside IoTDB into pipe. - - -| key | value | value range | required or optional with default | -|---------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------|-----------------------------------| -| source | iotdb-source | String: iotdb-source | required | -| source.pattern | path prefix for filtering time series | String: any time series prefix | optional: root | -| source.history.start-time | start of synchronizing historical data event time,including start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| source.history.end-time | end of synchronizing historical data event time,including end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| source.forwarding-pipe-requests | Whether to forward data written by another Pipe (usually Data Sync) | Boolean: true, false | optional:true | -| start-time(V1.3.1+) | start of synchronizing all data event time,including start-time. Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| end-time(V1.3.1+) | end of synchronizing all data event time,including end-time. 
Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| source.realtime.mode | Extraction mode for real-time data | String: hybrid, stream, batch | optional:hybrid | -| source.forwarding-pipe-requests | Whether to forward data written by another Pipe (usually Data Sync) | Boolean: true, false | optional:true | - -> 🚫 **source.pattern Parameter Description** -> -> * Pattern should use backquotes to modify illegal characters or illegal path nodes, for example, if you want to filter root.\`a@b\` or root.\`123\`, you should set the pattern to root.\`a@b\` or root.\`123\`(Refer specifically to [Timing of single and double quotes and backquotes](https://iotdb.apache.org/Download/)) -> * In the underlying implementation, when pattern is detected as root (default value) or a database name, synchronization efficiency is higher, and any other format will reduce performance. -> * The path prefix does not need to form a complete path. For example, when creating a pipe with the parameter 'source.pattern'='root.aligned.1': - > - > * root.aligned.1TS - > * root.aligned.1TS.\`1\` - > * root.aligned.100TS - > - > the data will be synchronized; - > - > * root.aligned.\`1\` -> * root.aligned.\`123\` - > - > the data will not be synchronized. - -> ❗️**start-time, end-time parameter description of source** -> -> * start-time, end-time should be in ISO format, such as 2011-12-03T10:15:30 or 2011-12-03T10:15:30+01:00. However, version 1.3.1+ supports timeStamp format like 1706704494000. - -> ✅ **A piece of data from production to IoTDB contains two key concepts of time** -> -> * **event time:** The time when the data is actually produced (or the generation time assigned to the data by the data production system, which is the time item in the data point), also called event time. -> * **arrival time:** The time when data arrives in the IoTDB system. 
-> -> Out-of-order data usually refers to data whose **event time** is far behind the current system time (or the maximum **event time** that has already been persisted to disk) when the data arrives. On the other hand, whether data is out of order or sequential, as long as it newly arrives in the system, its **arrival time** increases with the order in which the data arrives at IoTDB. - -> 💎 **The work of iotdb-source can be split into two stages** -> -> 1. Historical data extraction: All data with **arrival time** < **current system time** when the pipe is created is called historical data -> 2. Realtime data extraction: All data with **arrival time** >= **current system time** when the pipe is created is called realtime data -> -> The historical data transmission phase and the realtime data transmission phase are executed serially. The realtime data transmission phase starts only after the historical data transmission phase is completed. - -> 📌 **source.realtime.mode: Data extraction mode** -> -> * log: In this mode, the task only uses the operation log for data processing and sending. -> * file: In this mode, the task only uses data files for data processing and sending. -> * hybrid: This mode combines the two approaches. Sending data entry by entry from the operation log has low latency but low throughput, while sending data files in batches has high throughput but high latency. Hybrid mode automatically switches to the appropriate extraction method under different write loads. It first extracts data from the operation log to keep sending latency low. When a data backlog occurs, it automatically switches to extraction based on data files to keep sending throughput high. When the backlog is cleared, it automatically switches back to extraction based on the operation log.
This avoids the difficulty of balancing sending latency and throughput with a single data extraction algorithm. - -> 🍕 **source.forwarding-pipe-requests: Whether to allow forwarding data transmitted from another pipe** -> -> * If you want to use pipes to build data synchronization of A -> B -> C, then the pipe of B -> C needs to set this parameter to true, so that the data written by A to B through the pipe in A -> B can be correctly forwarded to C. -> * If you want to use pipes to build two-way data synchronization (dual-active) of A \<-> B, then the pipes of A -> B and B -> A need to set this parameter to false; otherwise the data will be forwarded back and forth between the clusters endlessly. - -### Preset processor plugin - -#### do-nothing-processor - -Function: No processing is done on the events passed in by the source. - - -| key | value | value range | required or optional with default | -|-----------|----------------------|------------------------------|-----------------------------------| -| processor | do-nothing-processor | String: do-nothing-processor | required | - -### Preset sink plugin - -#### do-nothing-sink - -Function: No processing is done on the events passed in by the processor. - -| key | value | value range | required or optional with default | -|------|-----------------|-------------------------|-----------------------------------| -| sink | do-nothing-sink | String: do-nothing-sink | required | - -## Stream processing task management - -### Create a stream processing task - -Use the `CREATE PIPE` statement to create a stream processing task.
Taking the creation of a data synchronization stream processing task as an example, the sample SQL statement is as follows: - -```sql -CREATE PIPE -- PipeId is the name that uniquely identifies the sync task -WITH SOURCE ( - -- Default IoTDB Data Extraction Plugin - 'source' = 'iotdb-source', - -- Path prefix, only data that can match the path prefix will be extracted for subsequent processing and delivery - 'source.pattern' = 'root.timecho', - -- Whether to extract historical data - 'source.history.enable' = 'true', - -- Describes the time range of the historical data being extracted, indicating the earliest possible time - 'source.history.start-time' = '2011-12-03T10:15:30+01:00', - -- Describes the time range of the extracted historical data, indicating the latest time - 'source.history.end-time' = '2022-12-03T10:15:30+01:00', - -- Whether to extract realtime data - 'source.realtime.enable' = 'true', -) -WITH PROCESSOR ( - -- Default data processing plugin, means no processing - 'processor' = 'do-nothing-processor', -) -WITH SINK ( - -- IoTDB data sending plugin with target IoTDB - 'sink' = 'iotdb-thrift-sink', - -- Data service IP of one of the DataNode nodes of the target IoTDB - 'sink.ip' = '127.0.0.1', - -- Data service port of one of the DataNode nodes of the target IoTDB - 'sink.port' = '6667', -) -``` - -**When creating a stream processing task, you need to configure the PipeId and the parameters of the three plugin parts:** - -| Configuration | Description | Required or not | Default implementation | Default implementation description | Allow custom implementation | -|---------------|-----------------------------------------------------------------------------------------------------|---------------------------------|------------------------|---------------------------------------------------------------------------------------------------------------------------|------------------------------------| -| PipeId | A globally unique name that
identifies a stream processing | Required | - | - | - | -| source | Pipe Source plugin, responsible for extracting stream processing data at the bottom of the database | Optional | iotdb-source | Integrate the full historical data of the database and subsequent real-time data arriving into the stream processing task | No | -| processor | Pipe Processor plugin, responsible for processing data | Optional | do-nothing-processor | Does not do any processing on the incoming data | Yes | -| sink | Pipe Sink plugin, responsible for sending data | Required | - | - | Yes | - -In the example, the iotdb-source, do-nothing-processor and iotdb-thrift-sink plugins are used to build the data flow processing task. IoTDB also has other built-in stream processing plugins, **please check the "System Preset Stream Processing plugin" section**. - -**A simplest example of the CREATE PIPE statement is as follows:** - -```sql -CREATE PIPE -- PipeId is a name that uniquely identifies the stream processing task -WITH SINK ( - -- IoTDB data sending plugin, the target is IoTDB - 'sink' = 'iotdb-thrift-sink', - --The data service IP of one of the DataNode nodes in the target IoTDB - 'sink.ip' = '127.0.0.1', - -- The data service port of one of the DataNode nodes in the target IoTDB - 'sink.port' = '6667', -) -``` - -The semantics expressed are: synchronize all historical data in this database instance and subsequent real-time data arriving to the IoTDB instance with the target 127.0.0.1:6667. - -**Notice:** - -- SOURCE and PROCESSOR are optional configurations. If you do not fill in the configuration parameters, the system will use the corresponding default implementation. -- SINK is a required configuration and needs to be configured declaratively in the CREATE PIPE statement -- SINK has self-reuse capability. 
For different stream processing tasks, if their SINKs have the same KV attributes (the values corresponding to the keys of all attributes are the same), then the system will create only one SINK instance, so that connection resources are reused. - - - For example, there are the following declarations of two stream processing tasks, pipe1 and pipe2: - - ```sql - CREATE PIPE pipe1 - WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'sink.ip' = 'localhost', - 'sink.port' = '9999', - ) - - CREATE PIPE pipe2 - WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'sink.port' = '9999', - 'sink.ip' = 'localhost', - ) - ``` - -- Because their declarations of SINK are exactly the same (**even if the order of declaration of some attributes is different**), the framework will automatically reuse the SINKs they declared, and ultimately the SINKs of pipe1 and pipe2 will be the same instance. -- When the source is the default iotdb-source, and source.forwarding-pipe-requests is the default value true, please do not build an application scenario that includes cyclic data synchronization (it will cause an infinite loop): - - - IoTDB A -> IoTDB B -> IoTDB A - - IoTDB A -> IoTDB A - -### Start the stream processing task - -After the CREATE PIPE statement is successfully executed, the instances related to the stream processing task are created. In V1.3.0, the running status of the entire stream processing task is set to STOPPED, that is, the task does not process data immediately. In version 1.3.1 and later, the status of the task is set to RUNNING after CREATE.
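The initial status described above, together with the START PIPE, STOP PIPE, and DROP PIPE statements, can be pictured as a small state machine. The sketch below is a hypothetical model rather than IoTDB code: `PipeState`, `PipeLifecycle`, and the `createdRunning` flag are names invented here. The flag models the version difference (STOPPED after CREATE in V1.3.0, RUNNING in V1.3.1 and later), and rejecting START or STOP on a dropped pipe is a modeling choice of this sketch.

```java
// Hypothetical model of the pipe status values named in this document.
enum PipeState {
  RUNNING,
  STOPPED,
  DROPPED
}

// Hypothetical lifecycle model; not part of IoTDB.
final class PipeLifecycle {
  private PipeState state;

  // createdRunning = true models V1.3.1+ (RUNNING after CREATE PIPE),
  // createdRunning = false models V1.3.0 (STOPPED after CREATE PIPE).
  PipeLifecycle(boolean createdRunning) {
    this.state = createdRunning ? PipeState.RUNNING : PipeState.STOPPED;
  }

  PipeState state() {
    return state;
  }

  // Models START PIPE.
  void start() {
    requireNotDropped();
    state = PipeState.RUNNING;
  }

  // Models STOP PIPE.
  void stop() {
    requireNotDropped();
    state = PipeState.STOPPED;
  }

  // Models DROP PIPE; allowed from any state, no prior STOP needed.
  void drop() {
    state = PipeState.DROPPED;
  }

  private void requireNotDropped() {
    if (state == PipeState.DROPPED) {
      throw new IllegalStateException("pipe has been dropped");
    }
  }
}
```

Under this model, a pipe created on V1.3.0 needs an explicit start before it processes data, while one created on V1.3.1 or later does not.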
- -You can use the START PIPE statement to cause a stream processing task to start processing data: - -```sql -START PIPE -``` - -### Stop the stream processing task - -Use the STOP PIPE statement to stop the stream processing task from processing data: - -```sql -STOP PIPE -``` - -### Delete stream processing tasks - -Use the DROP PIPE statement to stop the stream processing task from processing data (when the stream processing task status is RUNNING), and then delete the entire stream processing task: - -```sql -DROP PIPE -``` - -Users do not need to perform a STOP operation before deleting the stream processing task. - -### Display stream processing tasks - -Use the SHOW PIPES statement to view all stream processing tasks: - -```sql -SHOW PIPES -``` - -The query results are as follows: - -```sql -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -| ID| CreationTime| State|PipeSource|PipeProcessor|PipeSink|ExceptionMessage| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -|iotdb-kafka|2022-03-30T20:58:30.689|RUNNING| ...| ...| ...| {}| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -|iotdb-iotdb|2022-03-31T12:55:28.129|STOPPED| ...| ...| ...| TException: ...| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -``` - -You can use `` to specify the status of a stream processing task you want to see: - -```sql -SHOW PIPE -``` - -You can also use the where clause to determine whether the Pipe Sink used by a certain \ is reused. - -```sql -SHOW PIPES -WHERE SINK USED BY -``` - -### Stream processing task running status migration - -A stream processing pipe will pass through various states during its managed life cycle: - -- **RUNNING:** pipe is working properly - - When a pipe is successfully created, its initial state is RUNNING.(V1.3.1+) -- **STOPPED:** The pipe is stopped. 
When the pipeline is in this state, there are several possibilities: - - When a pipe is successfully created, its initial state is STOPPED.(V1.3.0) - - The user manually pauses a pipe that is in normal running status, and its status will passively change from RUNNING to STOPPED. - - When an unrecoverable error occurs during the running of a pipe, its status will automatically change from RUNNING to STOPPED -- **DROPPED:** The pipe task was permanently deleted - -The following diagram shows all states and state transitions: - -![State migration diagram](/img/%E7%8A%B6%E6%80%81%E8%BF%81%E7%A7%BB%E5%9B%BE.png) - -## authority management - -### Stream processing tasks - - -| Permission name | Description | -|-----------------|------------------------------------------------------------| -| USE_PIPE | Register a stream processing task. The path is irrelevant. | -| USE_PIPE | Start the stream processing task. The path is irrelevant. | -| USE_PIPE | Stop the stream processing task. The path is irrelevant. | -| USE_PIPE | Offload stream processing tasks. The path is irrelevant. | -| USE_PIPE | Query stream processing tasks. The path is irrelevant. | - -### Stream processing task plugin - - -| Permission name | Description | -|-----------------|----------------------------------------------------------------------| -| USE_PIPE | Register stream processing task plugin. The path is irrelevant. | -| USE_PIPE | Uninstall the stream processing task plugin. The path is irrelevant. | -| USE_PIPE | Query stream processing task plugin. The path is irrelevant. | - -## Configuration parameters - -In iotdb-system.properties: - -V1.3.0+: -```Properties -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. 
-# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_connector_timeout_ms=900000 - -# The maximum number of selectors that can be used in the async connector. -# pipe_async_connector_selector_number=1 - -# The core number of clients that can be used in the async connector. -# pipe_async_connector_core_client_number=8 - -# The maximum number of clients that can be used in the async connector. -# pipe_async_connector_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` - -V1.3.1+: -```Properties -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. 
-# pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` diff --git a/src/UserGuide/dev-1.3/User-Manual/Tiered-Storage_timecho.md b/src/UserGuide/dev-1.3/User-Manual/Tiered-Storage_timecho.md deleted file mode 100644 index 2567c5595..000000000 --- a/src/UserGuide/dev-1.3/User-Manual/Tiered-Storage_timecho.md +++ /dev/null @@ -1,96 +0,0 @@ - - -# Tiered Storage -## Overview - -The Tiered storage functionality allows users to define multiple layers of storage, spanning across multiple types of storage media (Memory mapped directory, SSD, rotational hard discs or cloud storage). While memory and cloud storage is usually singular, the local file system storages can consist of multiple directories joined together into one tier. Meanwhile, users can classify data based on its hot or cold nature and store data of different categories in specified "tier". Currently, IoTDB supports the classification of hot and cold data through TTL (Time to live / age) of data. When the data in one tier does not meet the TTL rules defined in the current tier, the data will be automatically migrated to the next tier. - -## Parameter Definition - -To enable tiered storage in IoTDB, you need to configure the following aspects: - -1. configure the data catalogue and divide the data catalogue into different tiers -2. 
configure the TTL of the data managed in each tier to distinguish between hot and cold data categories managed in different tiers. -3. configure the minimum remaining storage space ratio for each tier so that when the storage space of the tier triggers the threshold, the data of the tier will be automatically migrated to the next tier (optional). - -The specific parameter definitions and their descriptions are as follows. - -| Configuration | Default | Description | Constraint | -| --------------------------------------- | ------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -| dn_data_dirs | data/datanode/data | specify different storage directories and divide the storage directories into tiers | Each level of storage uses a semicolon to separate, and commas to separate within a single level; cloud (OBJECT_STORAGE) configuration can only be used as the last level of storage and the first level can't be used as cloud storage; a cloud object at most; the remote storage directory is denoted by OBJECT_STORAGE | -| tier_ttl_in_ms | -1 | Define the maximum age of data for which each tier is responsible | Each level of storage is separated by a semicolon; the number of levels should match the number of levels defined by dn_data_dirs;"-1" means "unlimited". | -| dn_default_space_usage_thresholds | 0.85 | Define the maximum storage usage threshold ratio for each tier of data directories. When the used space exceeds this ratio, the data will be automatically migrated to the next tier. If the storage usage of the last tier surpasses this threshold, the system will be set to ​​READ_ONLY​​ mode. 
| Each level of storage is separated by a semicolon; the number of levels should match the number of levels defined by dn_data_dirs | -| object_storage_type | AWS_S3 | Cloud storage type | IoTDB currently only supports AWS S3 as a remote storage type, and this parameter can't be modified | -| object_storage_bucket | iotdb_data | Name of the cloud storage bucket | Bucket definition in AWS S3; no need to configure if remote storage is not used | -| object_storage_endpoint | | Endpoint of the cloud storage | Endpoint of AWS S3; no need to configure if remote storage is not used | -| object_storage_access_key | | Cloud storage credential: access key | AWS S3 credential key; no need to configure if remote storage is not used | -| object_storage_access_secret | | Cloud storage credential: access secret | AWS S3 credential secret; no need to configure if remote storage is not used | -| remote_tsfile_cache_dirs | data/datanode/data/cache | Local cache directory for cloud storage | No need to configure if remote storage is not used | -| remote_tsfile_cache_page_size_in_kb | 20480 | Block size of the local cache files for cloud storage | No need to configure if remote storage is not used | -| remote_tsfile_cache_max_disk_usage_in_mb | 51200 | Maximum disk usage of the local cache for cloud storage | No need to configure if remote storage is not used | - -## Local tiered storage configuration example - -The following is an example of a local two-level storage configuration.
- -```JavaScript -//Required configuration items -dn_data_dirs=/data1/data;/data2/data,/data3/data; -tier_ttl_in_ms=86400000;-1 -dn_default_space_usage_thresholds=0.2;0.1 -``` - -In this example, two levels of storage are configured, specifically: - -| **tier** | **data path** | **data range** | **threshold for minimum remaining disk space** | -| -------- | -------------------------------------- | --------------- | ------------------------ | -| tier 1 | path 1: /data1/data | data from the last 1 day | 20% | -| tier 2 | path 1: /data2/data path 2: /data3/data | data older than 1 day | 10% | - -## Remote tiered storage configuration example - -The following takes three-level storage as an example: - -```JavaScript -//Required configuration items -dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE -tier_ttl_in_ms=86400000;864000000;-1 -dn_default_space_usage_thresholds=0.2;0.15;0.1 -object_storage_name=AWS_S3 -object_storage_bucket=iotdb -object_storage_endpoint= -object_storage_access_key= -object_storage_access_secret= - -// Optional configuration items -remote_tsfile_cache_dirs=data/datanode/data/cache -remote_tsfile_cache_page_size_in_kb=20971520 -remote_tsfile_cache_max_disk_usage_in_mb=53687091200 -``` - -In this example, a total of three levels of storage are configured, specifically: - -| **tier** | **data path** | **data range** | **threshold for minimum remaining disk space** | -| -------- | -------------------------------------- | ---------------------------- | ------------------------ | -| tier 1 | path 1: /data1/data | data from the last 1 day | 20% | -| tier 2 | path 1: /data2/data path 2: /data3/data | data between 1 and 10 days old | 15% | -| tier 3 | Remote AWS S3 storage | data older than 10 days | 10% | diff --git a/src/UserGuide/dev-1.3/User-Manual/User-defined-function_timecho.md b/src/UserGuide/dev-1.3/User-Manual/User-defined-function_timecho.md deleted file mode 100644 index f37270df7..000000000 --- 
a/src/UserGuide/dev-1.3/User-Manual/User-defined-function_timecho.md +++ /dev/null @@ -1,953 +0,0 @@ -# UDF - -## 1. UDF Introduction - -UDF (User Defined Function) refers to user-defined functions. IoTDB provides a variety of built-in time series processing functions and also supports extending custom functions to meet more computing needs. - -In IoTDB, you can expand two types of UDF: - - - - - - - - - - - - - - - - - - - - - - -

| UDF Class | AccessStrategy | Description |
| --------- | -------------- | ----------- |
| UDTF | MAPPABLE_ROW_BY_ROW | Custom scalar function, input k columns of time series and 1 row of data, output 1 column of time series and 1 row of data, can be used in any clause and expression that appears in the scalar function, such as select clause, where clause, etc. |
| UDTF | ROW_BY_ROW, SLIDING_TIME_WINDOW, SLIDING_SIZE_WINDOW, SESSION_TIME_WINDOW, STATE_WINDOW | Custom time series generation function, input k columns of time series m rows of data, output 1 column of time series n rows of data, the number of input rows m can be different from the number of output rows n, and can only be used in SELECT clauses. |
| UDAF | - | Custom aggregation function, input k columns of time series m rows of data, output 1 column of time series 1 row of data, can be used in any clause and expression that appears in the aggregation function, such as select clause, having clause, etc. |

- -### 1.1 UDF usage - -The usage of UDF is similar to that of regular built-in functions, and can be directly used in SELECT statements like calling regular functions. - -#### 1.Basic SQL syntax support - -* Support `SLIMIT` / `SOFFSET` -* Support `LIMIT` / `OFFSET` -* Support queries with value filters -* Support queries with time filters - - -#### 2. Queries with * in SELECT Clauses - -Assume that there are 2 time series (`root.sg.d1.s1` and `root.sg.d1.s2`) in the system. - -* **`SELECT example(*) from root.sg.d1`** - -Then the result set will include the results of `example (root.sg.d1.s1)` and `example (root.sg.d1.s2)`. - -* **`SELECT example(s1, *) from root.sg.d1`** - -Then the result set will include the results of `example(root.sg.d1.s1, root.sg.d1.s1)` and `example(root.sg.d1.s1, root.sg.d1.s2)`. - -* **`SELECT example(*, *) from root.sg.d1`** - -Then the result set will include the results of `example(root.sg.d1.s1, root.sg.d1.s1)`, `example(root.sg.d1.s2, root.sg.d1.s1)`, `example(root.sg.d1.s1, root.sg.d1.s2)` and `example(root.sg.d1.s2, root.sg.d1.s2)`. - -#### 3. Queries with Key-value Attributes in UDF Parameters - -You can pass any number of key-value pair parameters to the UDF when constructing a UDF query. The key and value in the key-value pair need to be enclosed in single or double quotes. Note that key-value pair parameters can only be passed in after all time series have been passed in. Here is a set of examples: - - Example: -``` sql -SELECT example(s1, 'key1'='value1', 'key2'='value2'), example(*, 'key3'='value3') FROM root.sg.d1; -SELECT example(s1, s2, 'key1'='value1', 'key2'='value2') FROM root.sg.d1; -``` - -#### 4. Nested Queries - - Example: -``` sql -SELECT s1, s2, example(s1, s2) FROM root.sg.d1; -SELECT *, example(*) FROM root.sg.d1 DISABLE ALIGN; -SELECT s1 * example(* / s1 + s2) FROM root.sg.d1; -SELECT s1, s2, s1 + example(s1, s2), s1 - example(s1 + example(s1, s2) / s2) FROM root.sg.d1; -``` - -## 2. 
UDF management - -### 2.1 UDF Registration - -The process of registering a UDF in IoTDB is as follows: - -1. Implement a complete UDF class, assuming the full class name of this class is `org.apache.iotdb.udf.ExampleUDTF`. -2. Convert the project into a JAR package. If using Maven to manage the project, you can refer to the [Maven project example](https://github.com/apache/iotdb/tree/master/example/udf) above. -3. Make preparations for registration according to the registration mode. For details, see the following example. -4. You can use following SQL to register UDF. - -```sql -CREATE FUNCTION AS (USING URI URI-STRING) -``` - -#### Example: register UDF named `example`, you can choose either of the following two registration methods - -#### Method 1: Manually place the jar package - -Prepare: -When registering using this method, it is necessary to place the JAR package in advance in the `ext/udf` directory of all nodes in the cluster (which can be configured). - -Registration statement: - -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' -``` - -#### Method 2: Cluster automatically installs jar packages through URI - -Prepare: -When registering using this method, it is necessary to upload the JAR package to the URI server in advance and ensure that the IoTDB instance executing the registration statement can access the URI server. - -Registration statement: - -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' USING URI 'http://jar/example.jar' -``` - -IoTDB will download JAR packages and synchronize them to the entire cluster. - -#### Note - -1. Since UDF instances are dynamically loaded through reflection technology, you do not need to restart the server during the UDF registration process. - -2. UDF function names are not case-sensitive. - -3. Please ensure that the function name given to the UDF is different from all built-in function names. A UDF with the same name as a built-in function cannot be registered. - -4. 
We recommend that you do not use classes that have the same class name but different function logic in different JAR packages. For example, in `UDF(UDAF/UDTF): udf1, udf2`, the JAR package of udf1 is `udf1.jar` and the JAR package of udf2 is `udf2.jar`. Assume that both JAR packages contain the `org.apache.iotdb.udf.ExampleUDTF` class. If you use two UDFs in the same SQL statement at the same time, the system will randomly load either of them and may cause inconsistency in UDF execution behavior. - -### 2.2 UDF Deregistration - -The SQL syntax is as follows: - -```sql -DROP FUNCTION -``` - -Example: Uninstall the UDF from the above example: - -```sql -DROP FUNCTION example -``` - -Note: For functions registered using USING URI, you need to remove the UDF's JAR files from the cluster-wide node path (`installation_package/ext/udf/install`). - -### 2.3 Show All Registered UDFs - -``` sql -SHOW FUNCTIONS -``` - -### 2.4 UDF configuration - -- UDF configuration allows configuring the storage directory of UDF in `iotdb-system.properties` - ``` Properties -# UDF lib dir - -udf_lib_dir=ext/udf -``` - -- -When using custom functions, there is a message indicating insufficient memory. Change the following configuration parameters in `iotdb-system.properties` and restart the service. - - ``` Properties - -# Used to estimate the memory usage of text fields in a UDF query. -# It is recommended to set this value to be slightly larger than the average length of all text -# effectiveMode: restart -# Datatype: int -udf_initial_byte_array_length_for_memory_control=48 - -# How much memory may be used in ONE UDF query (in MB). -# The upper limit is 20% of allocated memory for read. -# effectiveMode: restart -# Datatype: float -udf_memory_budget_in_mb=30.0 - -# UDF memory allocation ratio. -# The parameter form is a:b:c, where a, b, and c are integers. 
-# effectiveMode: restart -udf_reader_transformer_collector_memory_proportion=1:1:1 -``` - -### 2.5 UDF User Permissions - - -Using UDFs requires the `USE_UDF` permission. Only users with this permission are allowed to perform UDF registration, uninstallation, and query operations. - -For more information about user permissions, please refer to [Account Management Statements](../User-Manual/Authority-Management.md). - - -## 3. UDF Libraries - -Based on the ability of user-defined functions, IoTDB provides a series of functions for temporal data processing, including data quality, data profiling, anomaly detection, frequency domain analysis, data matching, data repairing, sequence discovery, machine learning, etc., which can meet the needs of industrial fields for temporal data processing. - -You can refer to the [UDF Libraries](../SQL-Manual/UDF-Libraries_timecho.md) document to find the installation steps and registration statements for each function, to ensure that all required functions are registered correctly. - - - -## 4. UDF development - -### 4.1 UDF Development Dependencies - -If you use [Maven](http://search.maven.org/), you can search for the development dependencies listed below from the [Maven repository](http://search.maven.org/). Please note that you must select the same dependency version as the target IoTDB server version for development. - -``` xml -<dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>udf-api</artifactId> - <version>1.0.0</version> - <scope>provided</scope> -</dependency> -``` - -### 4.2 UDTF (User Defined Timeseries Generating Function) - -To write a UDTF, you need to inherit the `org.apache.iotdb.udf.api.UDTF` class, and at least implement the `beforeStart` method and a `transform` method.
- -#### Interface Description: - -| Interface definition | Description | Required to Implement | -| :----------------------------------------------------------- | :----------------------------------------------------------- | ----------------------------------------------------- | -| void validate(UDFParameterValidator validator) throws Exception | This method is mainly used to validate `UDFParameters` and it is executed before `beforeStart(UDFParameters, UDTFConfigurations)` is called. | Optional | -| void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception | The initialization method to call the user-defined initialization behavior before a UDTF processes the input data. Every time a user executes a UDTF query, the framework will construct a new UDF instance, and `beforeStart` will be called. | Required | -| Object transform(Row row) throws Exception | This method is called by the framework. This data processing method will be called when you choose to use the `MappableRowByRowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `Row`, and the transformation result should be returned. | Required to implement at least one `transform` method | -| void transform(Column[] columns, ColumnBuilder builder) throws Exception | This method is called by the framework. This data processing method will be called when you choose to use the `MappableRowByRowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `Column[]`, and the transformation result should be output by `ColumnBuilder`. You need to call the data collection method provided by `builder` to determine the output data. | Required to implement at least one `transform` method | -| void transform(Row row, PointCollector collector) throws Exception | This method is called by the framework. 
This data processing method will be called when you choose to use the `RowByRowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `Row`, and the transformation result should be output by `PointCollector`. You need to call the data collection method provided by `collector` to determine the output data. | Required to implement at least one `transform` method | -| void transform(RowWindow rowWindow, PointCollector collector) throws Exception | This method is called by the framework. This data processing method will be called when you choose to use the `SlidingSizeWindowAccessStrategy` or `SlidingTimeWindowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `RowWindow`, and the transformation result should be output by `PointCollector`. You need to call the data collection method provided by `collector` to determine the output data. | Required to implement at least one `transform` method | -| void terminate(PointCollector collector) throws Exception | This method is called by the framework. This method will be called once after all `transform` calls have been executed. In a single UDF query, this method will and will only be called once. You need to call the data collection method provided by `collector` to determine the output data. | Optional | -| void beforeDestroy() | This method is called by the framework after the last input data is processed, and will only be called once in the life cycle of each UDF instance. | Optional | - -In the life cycle of a UDTF instance, the calling sequence of each method is as follows: - -1. void validate(UDFParameterValidator validator) throws Exception -2. void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception -3. 
`Object transform(Row row) throws Exception` or `void transform(Column[] columns, ColumnBuilder builder) throws Exception` or `void transform(Row row, PointCollector collector) throws Exception` or `void transform(RowWindow rowWindow, PointCollector collector) throws Exception` -4. void terminate(PointCollector collector) throws Exception -5. void beforeDestroy() - -> Note that every time the framework executes a UDTF query, a new UDF instance will be constructed. When the query ends, the corresponding instance will be destroyed. Therefore, the internal data of the instances in different UDTF queries (even in the same SQL statement) are isolated. You can maintain some state data in the UDTF without considering the influence of concurrency and other factors. - -#### Detailed interface introduction: - -1. **void validate(UDFParameterValidator validator) throws Exception** - -The `validate` method is used to validate the parameters entered by the user. - -In this method, you can limit the number and types of input time series, check the attributes of user input, or perform any custom verification. - -Please refer to the [Javadoc](https://github.com/apache/iotdb/blob/rc/1.3.4-1/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/parameter/UDFParameterValidator.java) for the usage of `UDFParameterValidator`. - - -2. **void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception** - -This method is mainly used to customize UDTF. In this method, the user can do the following things: - -1. Use UDFParameters to get the time series paths and parse key-value pair attributes entered by the user. -2. Set the strategy to access the raw data and set the output data type in UDTFConfigurations. -3. Create resources, such as establishing external connections, opening files, etc. - - -2.1 **UDFParameters** - -`UDFParameters` is used to parse UDF parameters in SQL statements (the part in parentheses after the UDF function name in SQL). 
The input parameters have two parts. The first part is data types of the time series that the UDF needs to process, and the second part is the key-value pair attributes for customization. Only the second part can be empty. - - -Example: - -``` sql -SELECT UDF(s1, s2, 'key1'='iotdb', 'key2'='123.45') FROM root.sg.d; -``` - -Usage: - -``` java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - String stringValue = parameters.getString("key1"); // iotdb - Float floatValue = parameters.getFloat("key2"); // 123.45 - Double doubleValue = parameters.getDouble("key3"); // null - int intValue = parameters.getIntOrDefault("key4", 678); // 678 - // do something - - // configurations - // ... -} -``` - - -2.2 **UDTFConfigurations** - -You must use `UDTFConfigurations` to specify the strategy used by UDF to access raw data and the type of output sequence. - -Usage: - -``` java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - // parameters - // ... - - // configurations - configurations - .setAccessStrategy(new RowByRowAccessStrategy()) - .setOutputDataType(Type.INT32); -} -``` - -The `setAccessStrategy` method is used to set the UDF's strategy for accessing the raw data, and the `setOutputDataType` method is used to set the data type of the output sequence. - - 2.2.1 **setAccessStrategy** - - -Note that the raw data access strategy you set here determines which `transform` method the framework will call. Please implement the `transform` method corresponding to the raw data access strategy. Of course, you can also dynamically decide which strategy to set based on the attribute parameters parsed by `UDFParameters`. Therefore, two `transform` methods are also allowed to be implemented in one UDF. 
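
The idea of choosing the access strategy from query attributes can be sketched in plain Java. This is only an illustration and does not use the IoTDB API: the class `StrategyChoiceSketch`, its `Strategy` enum, and the `window` attribute key are hypothetical stand-ins, and the `Map` stands in for the key-value attributes that `UDFParameters` would expose. The returned strings name the `transform` overloads that this section pairs with each strategy.

```java
import java.util.Map;

public class StrategyChoiceSketch {
  /** Hypothetical stand-in for two of the access strategies named in this section. */
  public enum Strategy {
    ROW_BY_ROW,
    SLIDING_SIZE_WINDOW
  }

  /**
   * Picks a strategy from the key-value attributes of the query.
   * Assumption for this sketch: a "window" attribute requests window processing.
   */
  public static Strategy chooseStrategy(Map<String, String> attributes) {
    return attributes.containsKey("window") ? Strategy.SLIDING_SIZE_WINDOW : Strategy.ROW_BY_ROW;
  }

  /** Names the transform overload that corresponds to the chosen strategy. */
  public static String transformFor(Strategy strategy) {
    switch (strategy) {
      case ROW_BY_ROW:
        return "transform(Row, PointCollector)";
      case SLIDING_SIZE_WINDOW:
        return "transform(RowWindow, PointCollector)";
      default:
        throw new IllegalArgumentException("unknown strategy: " + strategy);
    }
  }
}
```

A UDF written this way would need to implement both overloads, since either one may be selected at query time.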
- -The following are the strategies you can set: - -| Interface definition | Description | The `transform` Method to Call | -| :-------------------------------- | :----------------------------------------------------------- | ------------------------------------------------------------ | -| MappableRowByRowAccessStrategy | Custom scalar function
The framework will call the `transform` method once for each row of raw data input, with k columns of time series and 1 row of data input, and 1 column of time series and 1 row of data output. It can be used in any clause and expression where scalar functions appear, such as select clauses, where clauses, etc. | `void transform(Column[] columns, ColumnBuilder builder) throws Exception` or `Object transform(Row row) throws Exception` | -| RowByRowAccessStrategy | Custom time series generation function that processes raw data row by row.
The framework will call the `transform` method once for each row of raw data input, inputting k columns of time series and 1 row of data, and outputting 1 column of time series and n rows of data.
When a sequence is input, the row serves as a data point for the input sequence.
When multiple sequences are input, after aligning the input sequences in time, each row serves as a data point for the input sequence.
(In a row of data, there may be a column with a `null` value, but not all columns are `null`) | void transform(Row row, PointCollector collector) throws Exception | -| SlidingTimeWindowAccessStrategy | Customize time series generation functions to process raw data in a sliding time window manner.
The framework will call the `transform` method once for each raw data input window, input k columns of time series m rows of data, and output 1 column of time series n rows of data.
A window may contain multiple rows of data, and after aligning the input sequence in time, each window serves as a data point for the input sequence.
(Each window may have i rows, and each row of data may have a column with a `null` value, but not all of them are `null`) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception | -| SlidingSizeWindowAccessStrategy | Customize the time series generation function to process raw data in a fixed number of rows, meaning that each data processing window will contain a fixed number of rows of data (except for the last window).
The framework will call the `transform` method once for each raw data input window, input k columns of time series m rows of data, and output 1 column of time series n rows of data.
A window may contain multiple rows of data, and after aligning the input sequence in time, each window serves as a data point for the input sequence.
(Each window may have i rows, and each row of data may have a column with a `null` value, but not all of them are `null`) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception | -| SessionTimeWindowAccessStrategy | Customize time series generation functions to process raw data in a session window format.
The framework will call the `transform` method once for each raw data input window, input k columns of time series m rows of data, and output 1 column of time series n rows of data.
A window may contain multiple rows of data, and after aligning the input sequence in time, each window serves as a data point for the input sequence.
(Each window may have i rows, and each row of data may have a column with a `null` value, but not all of them are `null`) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception | -| StateWindowAccessStrategy | Customize time series generation functions to process raw data in a state window format.
The framework will call the `transform` method once for each raw data input window, inputting 1 column of time series m rows of data and outputting 1 column of time series n rows of data.
A window may contain multiple rows of data, and currently only supports opening windows for one physical quantity, which is one column of data. | void transform(RowWindow rowWindow, PointCollector collector) throws Exception | - - -#### Interface Description: - -- `MappableRowByRowAccessStrategy` and `RowByRowAccessStrategy`: Constructing either strategy does not require any parameters. - -- `SlidingTimeWindowAccessStrategy` - -Window opening diagram: - - - -`SlidingTimeWindowAccessStrategy`: `SlidingTimeWindowAccessStrategy` has several constructors; you can pass 3 types of parameters to them: - -- Parameter 1: The display window on the time axis - -The first type of parameter is optional. If it is not provided, the beginning time of the display window will be set to the minimum timestamp of the query result set, and the ending time of the display window will be set to the maximum timestamp of the query result set. - -- Parameter 2: Time interval for dividing the time axis (should be positive) -- Parameter 3: Time sliding step (not required to be greater than or equal to the time interval, but must be a positive number) - -The sliding step parameter is also optional. If the parameter is not provided, the sliding step will be set to the same value as the time interval for dividing the time axis. - -The relationship between the three types of parameters can be seen in the figure below. Please see the [Javadoc](https://github.com/apache/iotdb/blob/rc/1.3.4-1/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/SlidingTimeWindowAccessStrategy.java) for more details.
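
As a rough illustration of how the display window, the time interval, and the sliding step interact, the sketch below enumerates window boundaries in plain Java. It does not use the IoTDB API: the class `SlidingTimeWindowSketch`, the method `windows`, and the half-open `[start, end)` convention are assumptions made for this example, not the framework's implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class SlidingTimeWindowSketch {
  /**
   * Enumerates sliding time windows as {start, end} pairs, treated as [start, end).
   *
   * @param displayBegin start of the display window (inclusive, assumed)
   * @param displayEnd end of the display window (exclusive, assumed)
   * @param interval length of each window; must be positive
   * @param step distance between consecutive window starts; must be positive
   */
  public static List<long[]> windows(long displayBegin, long displayEnd, long interval, long step) {
    if (interval <= 0 || step <= 0) {
      throw new IllegalArgumentException("interval and step must be positive");
    }
    List<long[]> result = new ArrayList<>();
    for (long start = displayBegin; start < displayEnd; start += step) {
      // The last windows are clipped to the display window, so they may be shorter than interval.
      long end = Math.min(start + interval, displayEnd);
      result.add(new long[] {start, end});
    }
    return result;
  }
}
```

With a display window of `[0, 10)`, an interval of 4, and a step of 4, the sketch produces `[0, 4)`, `[4, 8)`, and a shortened final window `[8, 10)`. A step smaller than the interval, such as 2, produces overlapping windows.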

- -> Note that the actual time interval of some of the last time windows may be less than the specified time interval parameter. In addition, there may be cases where the number of data rows in some time windows is 0. In these cases, the framework will also call the `transform` method for the empty windows. - -- `SlidingSizeWindowAccessStrategy` - -Window opening diagram: - - - -`SlidingSizeWindowAccessStrategy`: `SlidingSizeWindowAccessStrategy` has many constructors, you can pass 2 types of parameters to them: - -* Parameter 1: Window size. This parameter specifies the number of data rows contained in a data processing window. Note that the number of data rows in some of the last time windows may be less than the specified number of data rows. -* Parameter 2: Sliding step. This parameter means the number of rows between the first point of the next window and the first point of the current window. (This parameter is not required to be greater than or equal to the window size, but must be a positive number) - -The sliding step parameter is optional. If the parameter is not provided, the sliding step will be set to the same as the window size. - -- `SessionTimeWindowAccessStrategy` - -Window opening diagram: **Time intervals less than or equal to the given minimum time interval `sessionGap` are assigned in one group.** - - - -`SessionTimeWindowAccessStrategy`: `SessionTimeWindowAccessStrategy` has many constructors, you can pass 2 types of parameters to them: - -- Parameter 1: The display window on the time axis. -- Parameter 2: The minimum time interval `sessionGap` of two adjacent windows. - -- `StateWindowAccessStrategy` - -Window opening diagram: **For numerical data, if the state difference is less than or equal to the given threshold `delta`, it will be assigned in one group.** - - - -`StateWindowAccessStrategy` has four constructors. 
- -- Constructor 1: For numerical data, there are 3 parameters: the start time and end time of the display window on the time axis, and the threshold `delta` for the allowable change within a single window. -- Constructor 2: For text data and boolean data, there are 2 parameters: the start time and end time of the display window on the time axis. For both data types, all data within a single window is the same, so no allowable change threshold is needed. -- Constructor 3: For numerical data, there is 1 parameter: the threshold `delta` for the allowable change within a single window. The start time of the display window is set to the smallest timestamp in the entire query result set, and the end time is set to the largest timestamp in the entire query result set. -- Constructor 4: For text data and boolean data, no parameters are needed. The start and end times of the display window are set as described in Constructor 3. - -`StateWindowAccessStrategy` can only take one column as input for now. - -Please see the [Javadoc](https://github.com/apache/iotdb/blob/rc/1.3.4-1/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/StateWindowAccessStrategy.java) for more details. - - 2.2.2 **setOutputDataType** - -Note that the type of output sequence you set here determines the type of data that the `PointCollector` can actually receive in the `transform` method. 
The relationship between the output data type set in `setOutputDataType` and the actual data output type that `PointCollector` can receive is as follows: - -| Output Data Type Set in `setOutputDataType` | Data Type that `PointCollector` Can Receive | -| :------------------------------------------ | :----------------------------------------------------------- | -| INT32 | int | -| INT64 | long | -| FLOAT | float | -| DOUBLE | double | -| BOOLEAN | boolean | -| TEXT | java.lang.String and org.apache.iotdb.udf.api.type.Binary | - -The type of output time series of a UDTF is determined at runtime, which means that a UDTF can dynamically determine the type of output time series according to the type of input time series. -Here is a simple example: - -```java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - // do something - // ... - - configurations - .setAccessStrategy(new RowByRowAccessStrategy()) - .setOutputDataType(parameters.getDataType(0)); -} -``` - -3. **Object transform(Row row) throws Exception** - -You need to implement this method or `transform(Column[] columns, ColumnBuilder builder) throws Exception` when you specify the strategy of UDF to read the original data as `MappableRowByRowAccessStrategy`. - -This method processes the raw data one row at a time. The raw data is input from `Row` and output by its return object. You must return only one object based on each input data point in a single `transform` method call, i.e., input and output are one-to-one. It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements the `Object transform(Row row) throws Exception` method. It is an adder that receives two columns of time series as input. 
- -```java -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - private Type dataType; - - @Override - public void validate(UDFParameterValidator validator) throws Exception { - validator - .validateInputSeriesNumber(2) - .validateInputSeriesDataType(0, Type.INT64) - .validateInputSeriesDataType(1, Type.INT64); - } - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - dataType = parameters.getDataType(0); - configurations - .setAccessStrategy(new MappableRowByRowAccessStrategy()) - .setOutputDataType(dataType); - } - - @Override - public Object transform(Row row) throws Exception { - return row.getLong(0) + row.getLong(1); - } -} -``` - - - -4. **void transform(Column[] columns, ColumnBuilder builder) throws Exception** - -You need to implement this method or `Object transform(Row row) throws Exception` when you specify the strategy of UDF to read the original data as `MappableRowByRowAccessStrategy`. - -This method processes the raw data multiple rows at a time. After performance tests, we found that UDTF that process multiple rows at once perform better than those UDTF that process one data point at a time. The raw data is input from `Column[]` and output by `ColumnBuilder`. You must output a corresponding data point based on each input data point in a single `transform` method call, i.e., input and output are still one-to-one. 
It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements the `void transform(Column[] columns, ColumnBuilder builder) throws Exception` method. It is an adder that receives two columns of time series as input. - -```java -import org.apache.iotdb.tsfile.read.common.block.column.Column; -import org.apache.iotdb.tsfile.read.common.block.column.ColumnBuilder; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - private Type type; - - @Override - public void validate(UDFParameterValidator validator) throws Exception { - validator - .validateInputSeriesNumber(2) - .validateInputSeriesDataType(0, Type.INT64) - .validateInputSeriesDataType(1, Type.INT64); - } - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - type = parameters.getDataType(0); - configurations.setAccessStrategy(new MappableRowByRowAccessStrategy()).setOutputDataType(type); - } - - @Override - public void transform(Column[] columns, ColumnBuilder builder) throws Exception { - long[] inputs1 = columns[0].getLongs(); - long[] inputs2 = columns[1].getLongs(); - - int count = columns[0].getPositionCount(); - for (int i = 0; i < count; i++) { - builder.writeLong(inputs1[i] + inputs2[i]); - } - } -} -``` - -5. 
**void transform(Row row, PointCollector collector) throws Exception** - -You need to implement this method when you specify the strategy of UDF to read the original data as `RowByRowAccessStrategy`. - -This method processes the raw data one row at a time. The raw data is input from `Row` and output by `PointCollector`. You can output any number of data points in one `transform` method call. It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements the `void transform(Row row, PointCollector collector) throws Exception` method. It is an adder that receives two columns of time series as input. When both data points in a row are not `null`, this UDF will output the algebraic sum of these two data points. - -```java -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT64) - .setAccessStrategy(new RowByRowAccessStrategy()); - } - - @Override - public void transform(Row row, PointCollector collector) throws Exception { - if (row.isNull(0) || row.isNull(1)) { - return; - } - collector.putLong(row.getTime(), row.getLong(0) + row.getLong(1)); - } -} -``` - -6. 
**void transform(RowWindow rowWindow, PointCollector collector) throws Exception** - -You need to implement this method when you specify the strategy of UDF to read the original data as `SlidingTimeWindowAccessStrategy` or `SlidingSizeWindowAccessStrategy`. - -This method processes a batch of data in a fixed number of rows or a fixed time interval each time, and we call the container containing this batch of data a window. The raw data is input from `RowWindow` and output by `PointCollector`. `RowWindow` can help you access a batch of `Row`, it provides a set of interfaces for random access and iterative access to this batch of `Row`. You can output any number of data points in one `transform` method call. It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -Below is a complete UDF example that implements the `void transform(RowWindow rowWindow, PointCollector collector) throws Exception` method. It is a counter that receives any number of time series as input, and its function is to count and output the number of data rows in each time window within a specified time range. 
- -```java -import java.io.IOException; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.access.RowWindow; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.SlidingTimeWindowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Counter implements UDTF { - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT32) - .setAccessStrategy(new SlidingTimeWindowAccessStrategy( - parameters.getLong("time_interval"), - parameters.getLong("sliding_step"), - parameters.getLong("display_window_begin"), - parameters.getLong("display_window_end"))); - } - - @Override - public void transform(RowWindow rowWindow, PointCollector collector) throws Exception { - if (rowWindow.windowSize() != 0) { - collector.putInt(rowWindow.windowStartTime(), rowWindow.windowSize()); - } - } -} -``` - -7. **void terminate(PointCollector collector) throws Exception** - -In some scenarios, a UDF needs to traverse all the original data to calculate the final output data points. The `terminate` interface provides support for those scenarios. - -This method is called after all `transform` calls are executed and before the `beforeDestroy` method is executed. You can implement the `transform` method to perform pure data processing (without outputting any data points), and implement the `terminate` method to output the processing results. - -The processing results need to be output by the `PointCollector`. You can output any number of data points in one `terminate` method call. 
It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -Below is a complete UDF example that implements the `void terminate(PointCollector collector) throws Exception` method. It takes one time series whose data type is `INT32` as input, and outputs the maximum value point of the series. - -```java -import java.io.IOException; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Max implements UDTF { - - private Long time; - private int value; - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT32) - .setAccessStrategy(new RowByRowAccessStrategy()); - } - - @Override - public void transform(Row row, PointCollector collector) throws Exception { - if (row.isNull(0)) { - return; - } - int candidateValue = row.getInt(0); - if (time == null || value < candidateValue) { - time = row.getTime(); - value = candidateValue; - } - } - - @Override - public void terminate(PointCollector collector) throws IOException { - if (time != null) { - collector.putInt(time, value); - } - } -} -``` - -8. **void beforeDestroy()** - -The method for terminating a UDF. - -This method is called by the framework. For a UDF instance, `beforeDestroy` will be called after the last record is processed. In the entire life cycle of the instance, `beforeDestroy` will only be called once. - - - -### 4.3 UDAF (User Defined Aggregation Function) - -A complete definition of UDAF involves two classes, `State` and `UDAF`. 
- -#### State Class - -To write your own `State`, you need to implement the `org.apache.iotdb.udf.api.State` interface. - -#### Interface Description: - -| Interface Definition | Description | Required to Implement | -| -------------------------------- | ------------------------------------------------------------ | --------------------- | -| void reset() | To reset the `State` object to its initial state, you need to fill in the initial values of the fields in the `State` class within this method as if you were writing a constructor. | Required | -| byte[] serialize() | Serializes `State` to binary data. This method is used for IoTDB internal `State` passing. Note that the order of serialization must be consistent with the following deserialization methods. | Required | -| void deserialize(byte[] bytes) | Deserializes binary data to `State`. This method is used for IoTDB internal `State` passing. Note that the order of deserialization must be consistent with the serialization method above. | Required | - -#### Detailed interface introduction: - -1. **void reset()** - -This method resets the `State` to its initial state, you need to fill in the initial values of the fields in the `State` object in this method. For optimization reasons, IoTDB reuses `State` as much as possible internally, rather than creating a new `State` for each group, which would introduce unnecessary overhead. When `State` has finished updating the data in a group, this method is called to reset to the initial state as a way to process the next group. - -In the case of `State` for averaging (aka `avg`), for example, you would need the sum of the data, `sum`, and the number of entries in the data, `count`, and initialize both to 0 in the `reset()` method. - -```java -class AvgState implements State { - double sum; - - long count; - - @Override - public void reset() { - sum = 0; - count = 0; - } - - // other methods -} -``` - -2. 
**byte[] serialize()/void deserialize(byte[] bytes)** - -These methods serialize the `State` into binary data, and deserialize the `State` from the binary data. IoTDB, as a distributed database, involves passing data among different nodes, so you need to write these two methods to enable the passing of the `State` among different nodes. Note that the order of serialization and deserialization must be consistent. - -In the case of `State` for averaging (aka `avg`), for example, you can convert the content of `State` to a `byte[]` array and read it back from the `byte[]` array in any way you want. The following shows the code for serialization/deserialization using Java's `ByteBuffer`: - -```java -@Override -public byte[] serialize() { - ByteBuffer buffer = ByteBuffer.allocate(Double.BYTES + Long.BYTES); - buffer.putDouble(sum); - buffer.putLong(count); - - return buffer.array(); -} - -@Override -public void deserialize(byte[] bytes) { - ByteBuffer buffer = ByteBuffer.wrap(bytes); - sum = buffer.getDouble(); - count = buffer.getLong(); -} -``` - - - -#### UDAF Classes - -To write a UDAF, you need to implement the `org.apache.iotdb.udf.api.UDAF` interface. - -#### Interface Description: - -| Interface definition | Description | Required to Implement | -| ------------------------------------------------------------ | ------------------------------------------------------------ | --------------------- | -| void validate(UDFParameterValidator validator) throws Exception | This method is mainly used to validate `UDFParameters` and it is executed before `beforeStart(UDFParameters, UDAFConfigurations)` is called. | Optional | -| void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception | Initialization method that invokes user-defined initialization behavior before UDAF processes the input data. Unlike UDTF, configuration is of type `UDAFConfigurations`. 
| Required | -| State createState() | To create a `State` object, usually just call the default constructor and modify the default initial value as needed. | Required | -| void addInput(State state, Column[] columns, BitMap bitMap) | Update `State` object according to the incoming data `Column[]` in batch, note that last column `columns[columns.length - 1]` always represents the time column. In addition, `BitMap` represents the data that has been filtered out before, you need to manually determine whether the corresponding data has been filtered out when writing this method. | Required | -| void combineState(State state, State rhs) | Merge `rhs` state into `state` state. In a distributed scenario, the same set of data may be distributed on different nodes, IoTDB generates a `State` object for the partial data on each node, and then calls this method to merge it into the complete `State`. | Required | -| void outputFinal(State state, ResultValue resultValue) | Computes the final aggregated result based on the data in `State`. Note that according to the semantics of the aggregation, only one value can be output per group. | Required | -| void beforeDestroy() | This method is called by the framework after the last input data is processed, and will only be called once in the life cycle of each UDF instance. | Optional | - -In the life cycle of a UDAF instance, the calling sequence of each method is as follows: - -1. State createState() -2. void validate(UDFParameterValidator validator) throws Exception -3. void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception -4. void addInput(State state, Column[] columns, BitMap bitMap) -5. void combineState(State state, State rhs) -6. void outputFinal(State state, ResultValue resultValue) -7. void beforeDestroy() - -Similar to UDTF, every time the framework executes a UDAF query, a new UDF instance will be constructed. When the query ends, the corresponding instance will be destroyed. 
Therefore, the internal data of the instances in different UDAF queries (even in the same SQL statement) are isolated. You can maintain some state data in the UDAF without considering the influence of concurrency and other factors. - -#### Detailed interface introduction: - - -1. **void validate(UDFParameterValidator validator) throws Exception** - -Same as UDTF, the `validate` method is used to validate the parameters entered by the user. - -In this method, you can limit the number and types of input time series, check the attributes of user input, or perform any custom verification. - -2. **void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception** - - The `beforeStart` method plays the same role as in UDTF: - -1. Use `UDFParameters` to get the time series paths and parse key-value pair attributes entered by the user. -2. Set the output data type in `UDAFConfigurations`. -3. Create resources, such as establishing external connections, opening files, etc. - -The role of the `UDFParameters` type can be seen above. - -2.2 **UDAFConfigurations** - -The difference from UDTF is that UDAF uses `UDAFConfigurations` as the type of `configuration` object. - -Currently, this class only supports setting the type of output data. - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // parameters - // ... 
- - // configurations - configurations - .setOutputDataType(Type.INT32); -} -``` - -The relationship between the output type set in `setOutputDataType` and the type of data output that `ResultValue` can actually receive is as follows: - -| The output type set in `setOutputDataType` | The output type that `ResultValue` can actually receive | -| ------------------------------------------ | ------------------------------------------------------- | -| INT32 | int | -| INT64 | long | -| FLOAT | float | -| DOUBLE | double | -| BOOLEAN | boolean | -| TEXT | org.apache.iotdb.udf.api.type.Binary | - -The output type of the UDAF is determined at runtime. You can dynamically determine the output sequence type based on the input type. - -Here is a simple example: - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // do something - // ... - - configurations - .setOutputDataType(parameters.getDataType(0)); -} -``` - -3. **State createState()** - - -This method creates and initializes a `State` object for UDAF. Due to the limitations of the Java language, you can only call the default constructor for the `State` class. The default constructor assigns a default initial value to all the fields in the class, and if that initial value does not meet your requirements, you need to initialize them manually within this method. - -The following is an example that includes manual initialization. Suppose you want to implement an aggregate function that multiplies all numbers in the group. The initial `State` value should then be 1, but the default constructor initializes it to 0, so you need to initialize `State` manually after calling the default constructor: - -```java -public State createState() { - MultiplyState state = new MultiplyState(); - state.result = 1; - return state; -} -``` - -4. **void addInput(State state, Column[] columns, BitMap bitMap)** - -This method updates the `State` object with the raw input data. 
For performance reasons, and to align with the IoTDB vectorized query engine, the raw input data is no longer a single data point but an array of columns, `Column[]`. Note that the last column (i.e. `columns[columns.length - 1]`) is always the time column, so you can also do different operations in UDAF depending on the time. - -Since the input is a batch of columns rather than a single data point, you need to manually skip the data that has been filtered out, which is why the third parameter, `BitMap`, exists. It identifies which data in these columns has been filtered out, so that you can ignore the filtered data. - -Here's an example of `addInput()` that counts the number of items (aka count). It shows how you can use `BitMap` to ignore data that has been filtered out. Note that due to the limitations of the Java language, you need to explicitly cast the `State` object from the type defined in the interface to your custom `State` type at the beginning of the method, otherwise you won't be able to use the `State` object. - -```java -public void addInput(State state, Column[] columns, BitMap bitMap) { - CountState countState = (CountState) state; - - int count = columns[0].getPositionCount(); - for (int i = 0; i < count; i++) { - if (bitMap != null && !bitMap.isMarked(i)) { - continue; - } - if (!columns[0].isNull(i)) { - countState.count++; - } - } -} -``` - -5. **void combineState(State state, State rhs)** - - -This method combines two `State`s, or more precisely, updates the first `State` object with the second `State` object. IoTDB is a distributed database, and the data of the same group may be distributed on different nodes. For performance reasons, IoTDB will first aggregate some of the data on each node into `State`, and then merge the `State`s on different nodes that belong to the same group, which is what `combineState` does. - -Here's an example of `combineState()` for averaging (aka avg). 
Similar to `addInput`, you need to do an explicit type conversion for the two `State`s at the beginning. Also note that you are updating the value of the first `State` with the contents of the second `State`. - -```java -public void combineState(State state, State rhs) { - AvgState avgState = (AvgState) state; - AvgState avgRhs = (AvgState) rhs; - - avgState.count += avgRhs.count; - avgState.sum += avgRhs.sum; -} -``` - -6. **void outputFinal(State state, ResultValue resultValue)** - -This method calculates the final result from `State`. You need to access the various fields in `State`, derive the final result, and set the final result into the `ResultValue` object. IoTDB internally calls this method once at the end for each group. Note that according to the semantics of aggregation, the final result can only be one value. - -Here is another `outputFinal` example for averaging (aka avg). In addition to the forced type conversion at the beginning, you will also see a specific use of the `ResultValue` object, where the final result is set by `setXXX` (where `XXX` is the type name). - -```java -public void outputFinal(State state, ResultValue resultValue) { - AvgState avgState = (AvgState) state; - - if (avgState.count != 0) { - resultValue.setDouble(avgState.sum / avgState.count); - } else { - resultValue.setNull(); - } -} -``` - -7. **void beforeDestroy()** - - -The method for terminating a UDF. - -This method is called by the framework. For a UDF instance, `beforeDestroy` will be called after the last record is processed. In the entire life cycle of the instance, `beforeDestroy` will only be called once. - - -### 4.4 Maven Project Example - -If you use Maven, you can build your own UDF project referring to our **udf-example** module. You can find the project [here](https://github.com/apache/iotdb/tree/master/example/udf). - - -## 5. 
Contribute universal built-in UDF functions to IoTDB - -This part mainly introduces how external users can contribute their own UDFs to the IoTDB community. - -### 5.1 Prerequisites - -1. UDFs must be universal. - - Here, "universal" means that the UDF can be widely used in some scenarios. In other words, the UDF must have reuse value and may be directly used by other users in the community. - - If you are not sure whether the UDF you want to contribute is universal, you can send an email to `dev@iotdb.apache.org` or create an issue to initiate a discussion. - -2. The UDF you are going to contribute has been well tested and can run normally in the production environment. - - -### 5.2 What you need to prepare - -1. UDF source code -2. Test cases -3. Instructions - -### 5.3 Contribution Content - -#### 5.3.1 UDF Source Code - -1. Create the UDF main class and related classes in `iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin` or in its subfolders. -2. Register your UDF in `iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin/BuiltinTimeSeriesGeneratingFunction.java`. - -#### 5.3.2 Test Cases - -At a minimum, you need to write integration tests for the UDF. - -You can add a test class in `integration-test/src/test/java/org/apache/iotdb/db/it/udf`. - - -#### 5.3.3 Instructions - -The instructions need to include: the name and the function of the UDF, the attribute parameters that must be provided when the UDF is executed, the applicable scenarios, and the usage examples, etc. - -The instructions for use should include both Chinese and English versions. Instructions for use should be added separately in `docs/zh/UserGuide/Operation Manual/DML Data Manipulation Language.md` and `docs/UserGuide/Operation Manual/DML Data Manipulation Language.md`. 
- -#### 5.3.4 Submit a PR - -When you have prepared the UDF source code, test cases, and instructions, you are ready to submit a Pull Request (PR) on [GitHub](https://github.com/apache/iotdb). You can refer to our code contribution guide to submit a PR: [Development Guide](https://iotdb.apache.org/Community/Development-Guide.html). - - -After the PR is reviewed, approved, and merged, your UDF has been contributed to the IoTDB community! - -## 6. Common Problems - -Q1: How do I modify a registered UDF? - -A1: Assume that the name of the UDF is `example` and the full class name is `org.apache.iotdb.udf.ExampleUDTF`, which is introduced by `example.jar`. - -1. Unload the registered function by executing `DROP FUNCTION example`. -2. Delete `example.jar` under `iotdb-server-1.0.0-all-bin/ext/udf`. -3. Modify the logic in `org.apache.iotdb.udf.ExampleUDTF` and repackage it. The name of the JAR package can still be `example.jar`. -4. Upload the new JAR package to `iotdb-server-1.0.0-all-bin/ext/udf`. -5. Load the new UDF by executing `CREATE FUNCTION example AS "org.apache.iotdb.udf.ExampleUDTF"`. - diff --git a/src/UserGuide/dev-1.3/User-Manual/White-List_timecho.md b/src/UserGuide/dev-1.3/User-Manual/White-List_timecho.md deleted file mode 100644 index 5194f7051..000000000 --- a/src/UserGuide/dev-1.3/User-Manual/White-List_timecho.md +++ /dev/null @@ -1,70 +0,0 @@ - - -# White List - -**function description** - -Specifies which client addresses are allowed to connect to IoTDB - -**configuration file** - -conf/iotdb-system.properties - -conf/white.list - -**configuration item** - -iotdb-system.properties: - -Decide whether to enable white list - -```YAML - -# Whether to enable white list -enable_white_list=true -``` - -white.list: - -Decide which IP addresses can connect to IoTDB - -```YAML -# Support for annotation -# Supports precise matching, one IP per line -10.2.3.4 - -# Support for * wildcards, one ip per line -10.*.1.3 -10.100.0.* -``` - -**note** - -1. 
If the white list itself is cancelled via the session client, the current connection is not immediately disconnected. It is rejected the next time the connection is created. -2. If white.list is modified directly, it takes effect within one minute. If modified via the session client, it takes effect immediately, updating the values in memory and the white.list disk file. -3. If the white list function is enabled but no white.list file exists, the DB service starts successfully, but all connections are rejected. -4. If the white.list file is deleted while the DB service is running, all connections are denied within at most one minute. -5. The setting that enables or disables the white list function can be hot-loaded. -6. Only the root user can modify the white list through the Java native interface; modifications by non-root users are rejected. The modified content must be legal; otherwise, a StatementExecutionException is thrown. - -![](/img/%E7%99%BD%E5%90%8D%E5%8D%95.png) - diff --git a/src/UserGuide/latest-Table/AI-capability/AINode_Upgrade_timecho.md b/src/UserGuide/latest-Table/AI-capability/AINode_Upgrade_timecho.md deleted file mode 100644 index a3167dc86..000000000 --- a/src/UserGuide/latest-Table/AI-capability/AINode_Upgrade_timecho.md +++ /dev/null @@ -1,900 +0,0 @@ - - -# AINode - -AINode is a native IoTDB node that supports the registration, management, and invocation of time series related models. It includes industry-leading self-developed time series large models, such as the Timer series models developed by Tsinghua University. Models can be invoked using standard SQL statements, enabling millisecond-level real-time inference on time series data, and supporting application scenarios such as time series trend prediction, missing value filling, and anomaly value detection. 
- -The system architecture is shown in the following figure: - -![](/img/AINode-0-en.png) - -The responsibilities of the three nodes are as follows: - -* **ConfigNode**: Responsible for distributed node management and load balancing. -* **DataNode**: Responsible for receiving and parsing user SQL requests; responsible for storing time series data; responsible for data preprocessing calculations. -* **AINode**: Responsible for managing and using time series models. - -## 1. Advantages and Features - -Compared to building a machine learning service separately, it has the following advantages: - -* **Simple and Easy to Use**: No need to use Python or Java programming, you can complete the entire process of machine learning model management and inference using SQL statements. For example, creating a model can be done using the CREATE MODEL statement, and using a model for inference can be done using the `SELECT * FROM FORECAST (...)` statement, making it more simple and convenient. -* **Avoid Data Migration**: Using IoTDB-native machine learning can directly apply data stored in IoTDB to machine learning model inference without moving data to a separate machine learning service platform, thus accelerating data processing, improving security, and reducing costs. - -![](/img/h1.png) - -* **Built-in Advanced Algorithms**: Supports industry-leading machine learning analysis algorithms, covering typical time series analysis tasks, and empowering time series databases with native data analysis capabilities. For example: - * **Time Series Forecasting**: Learning change patterns from past time series data; outputting the most likely predictions for future sequences based on given past observations. - * **Time Series Anomaly Detection**: Detecting and identifying abnormal values in given time series data to help discover abnormal behavior in time series. - -## 2. 
Basic Concepts - -* **Model (Model)**: A machine learning model that takes time series data as input and outputs the results or decisions of the analysis task. The model is the basic management unit of AINode, supporting the creation (registration), deletion, query, modification (fine-tuning), and use (inference) of models. -* **Create (Create)**: Load the external designed or trained model file or algorithm into AINode, managed and used uniformly by IoTDB. -* **Inference (Inference)**: Use the created model to complete the time series analysis task on the specified time series data. -* **Built-in (Built-in)**: AINode comes with common time series analysis scenario (e.g., prediction and anomaly detection) machine learning algorithms or self-developed models. - -![](/img/AINode-en.png) - -## 3. Installation and Deployment - -AINode deployment can be referred to the documentation [AINode Deployment](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md). - -## 4. Usage Guide - -TimechoDB-AINode supports three major functions: model inference, model fine-tuning, and model management (registration, viewing, deletion, loading, unloading, etc.). The following sections will provide detailed explanations. - -### 4.1 Model Inference - -The AINode table model supports two major inference capabilities: time series prediction and time series classification. - -#### 4.1.1 Time Series Prediction - -The time series prediction capability provided by the AINode table model includes: - -* **Univariate Prediction**: Supports prediction of a single target variable. -* **Covariate Prediction**: Can simultaneously predict multiple target variables and supports introducing covariates in prediction to improve accuracy. - -The following sections will detail the syntax definition, parameter descriptions, and usage examples of the prediction inference function. - -1. 
**SQL Syntax** - -```SQL -SELECT * FROM FORECAST( - MODEL_ID, - TARGETS, -- SQL to get target variables - [HISTORY_COVS, -- String, SQL to get historical covariates - FUTURE_COVS, -- String, SQL to get future covariates - OUTPUT_START_TIME, - OUTPUT_LENGTH, - OUTPUT_INTERVAL, - TIMECOL, - PRESERVE_INPUT, - AUTO_ADAPT, -- Boolean type, indicating whether adaptive mode is enabled. - MODEL_OPTIONS]? -) -``` - -* Built-in model inference does not require a registration process. By using the forecast function and specifying model_id, you can use the inference function of the model. -* Parameter description - -| Parameter Name | Parameter Type | Parameter Attributes | Description | Required | Notes | -|----------------|----------------|----------------------|-------------|----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| model_id | Scalar parameter | String type | Unique identifier of the prediction model | Yes | | -| targets | Table parameter | SET SEMANTIC | Input data for the target variables to be predicted. IoTDB will automatically sort the data in ascending order of time before passing it to AINode. | Yes | Use SQL to describe the input data with target variables. If the input SQL is invalid, corresponding query errors will be reported. 
| -| history_covs | Scalar parameter | String type (valid table model query SQL), default: none | Specifies historical data of covariates for this prediction task, which are used to assist in predicting target variables. AINode will not output prediction results for historical covariates. Before passing data to the model, AINode will automatically sort the data in ascending order of time. | No | 1. Query results can only contain FIELD columns;
2. Other: Different models may have specific requirements, and errors will be thrown if not met. | -| future_covs | Scalar parameter | String type (valid table model query SQL), default: none | Specifies future data of some covariates for this prediction task, which are used to assist in predicting target variables. Before passing data to the model, AINode will automatically sort the data in ascending order of time. | No | 1. Can only be specified when history_covs is set;
2. The covariate names involved must be a subset of history_covs;
3. Query results can only contain FIELD columns;
4. Other: Different models may have specific requirements, and errors will be thrown if not met. | -| auto_adapt | Scalar parameter | Boolean type, default value: true | Whether to enable adaptive processing for covariate inference (supported since V2.0.8.2). | No | When adaptive mode is enabled:
1. If the set of future covariates (`future_covs`) is not a subset of the historical covariates (`history_covs`), any future covariates not present in the historical set will be automatically discarded.
2. If the length of any historical covariate does not match the length of the input target variable: a. If shorter, pad zeros at the beginning; b. If longer, discard the earliest data points.
3. If the length of any future covariate does not match the prediction length (`output_length`): a. If shorter, pad zeros at the end; b. If longer, discard the most recent data points. | -| output_start_time | Scalar parameter | Timestamp type. Default value: last timestamp of target variable + output_interval | Starting timestamp of output prediction points [i.e., forecast start time] | No | Must be greater than the maximum timestamp of the target variable | -| output_length | Scalar parameter | INT32 type. Default value: 96 | Output window size | No | Must be greater than 0 | -| output_interval | Scalar parameter | Time interval type. Default value: (last timestamp - first timestamp of input data) / (n - 1) | Time interval between output prediction points. Supported units: ns, us, ms, s, m, h, d, w | No | Must be greater than 0 | -| timecol | Scalar parameter | String type. Default value: time | Name of time column | No | Must be a TIMESTAMP column existing in targets | -| preserve_input | Scalar parameter | Boolean type. Default value: false | Whether to retain all original rows of target variable input in the output result set | No | | -| model_options | Scalar parameter | String type. Default value: empty string | Key-value pairs related to the model, such as whether to normalize the input. Different key-value pairs are separated by ';'. | No | | - -Notes: -* **Default behavior**: Predict all columns of targets. Currently, only INT32, INT64, FLOAT, and DOUBLE types are supported. -* **Input data requirements**: - * Must contain a time column. - * Row count requirements: If insufficient, an error will be reported; if exceeding the maximum, the last data will be automatically truncated. - * Column count requirements: Univariate models support only a single column, and multi-column input reports an error; covariate models usually have no restrictions unless the model itself has clear constraints. 
-* **Output results**: - * Includes all target variable columns, with data types consistent with the original table. - * If `preserve_input=true` is specified, an additional `is_input` column will be added to identify original data rows. -* **Timestamp generation**: - * Uses `OUTPUT_START_TIME` (optional) as the starting time point for prediction and divides historical and future data. - * Uses `OUTPUT_INTERVAL` (optional, default is the sampling interval of input data) as the output time interval. The timestamp of the Nth row is calculated as: `OUTPUT_START_TIME + (N - 1) * OUTPUT_INTERVAL`. - -2. **Usage Examples** - -**Example 1: Univariate Prediction** - -Create database etth and table eg in advance - -```SQL -create database etth; -create table eg (hufl FLOAT FIELD, hull FLOAT FIELD, mufl FLOAT FIELD, mull FLOAT FIELD, lufl FLOAT FIELD, lull FLOAT FIELD, ot FLOAT FIELD) -``` - -Prepare original data [ETTh1-tab](/img/ETTh1-tab.csv). - -You can import the raw data using the [import-data](../Tools-System/Data-Import-Tool_timecho.md#_2-2-csv-format) script. For example: - -```bash -./tools/import-data.sh -ft csv -sql_dialect table -db etth -table eg -s ~/Desktop/model-compare-html/ETTh1-tab.csv -``` - -Use 1440 rows of data from column ot in table eg to predict its next 96 rows of data. - -```SQL -IoTDB:etth> select Time, HUFL,HULL,MUFL,MULL,LUFL,LULL,OT from eg LIMIT 1440 -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -| Time| HUFL| HULL| MUFL| MULL| LUFL| LULL| OT| -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -|2016-07-01T00:00:00.000+08:00| 5.827|2.009|1.599|0.462|4.203| 1.34|30.531| -|2016-07-01T01:00:00.000+08:00| 5.693|2.076|1.492|0.426|4.142|1.371|27.787| -|2016-07-01T02:00:00.000+08:00| 5.157|1.741|1.279|0.355|3.777|1.218|27.787| -|2016-07-01T03:00:00.000+08:00| 5.09|1.942|1.279|0.391|3.807|1.279|25.044| -...... 
-Total line number = 1440 -It costs 0.119s - -IoTDB:etth> select * from forecast( - model_id => 'sundial', - targets => (select Time, ot from etth.eg where time >= 2016-08-07T18:00:00.000+08:00 limit 1440) order BY time, - output_length => 96 -) -+-----------------------------+---------+ -| time| ot| -+-----------------------------+---------+ -|2016-10-06T18:00:00.000+08:00|20.733124| -|2016-10-06T19:00:00.000+08:00|20.258146| -|2016-10-06T20:00:00.000+08:00|20.022043| -|2016-10-06T21:00:00.000+08:00|19.789446| -...... -Total line number = 96 -It costs 1.615s -``` - -**Example 2: Covariate Prediction** - -Create table tab_real (to store original real data) in advance - -```SQL -create table tab_real (target1 DOUBLE FIELD, target2 DOUBLE FIELD, cov1 DOUBLE FIELD, cov2 DOUBLE FIELD, cov3 DOUBLE FIELD); -``` - -Prepare original data - -```SQL --- Insert statement -IoTDB:etth> INSERT INTO tab_real (time, target1, target2, cov1, cov2, cov3) VALUES -(1, 1.0, 1.0, 1.0, 1.0, 1.0), -(2, 2.0, 2.0, 2.0, 2.0, 2.0), -(3, 3.0, 3.0, 3.0, 3.0, 3.0), -(4, 4.0, 4.0, 4.0, 4.0, 4.0), -(5, 5.0, 5.0, 5.0, 5.0, 5.0), -(6, 6.0, 6.0, 6.0, 6.0, 6.0), -(7, NULL, NULL, NULL, NULL, 7.0), -(8, NULL, NULL, NULL, NULL, 8.0); - -IoTDB:etth> SELECT * FROM tab_real -+-----------------------------+-------+-------+----+----+----+ -| time|target1|target2|cov1|cov2|cov3| -+-----------------------------+-------+-------+----+----+----+ -|1970-01-01T08:00:00.001+08:00| 1.0| 1.0| 1.0| 1.0| 1.0| -|1970-01-01T08:00:00.002+08:00| 2.0| 2.0| 2.0| 2.0| 2.0| -|1970-01-01T08:00:00.003+08:00| 3.0| 3.0| 3.0| 3.0| 3.0| -|1970-01-01T08:00:00.004+08:00| 4.0| 4.0| 4.0| 4.0| 4.0| -|1970-01-01T08:00:00.005+08:00| 5.0| 5.0| 5.0| 5.0| 5.0| -|1970-01-01T08:00:00.006+08:00| 6.0| 6.0| 6.0| 6.0| 6.0| -|1970-01-01T08:00:00.007+08:00| null| null|null|null| 7.0| -|1970-01-01T08:00:00.008+08:00| null| null|null|null| 8.0| -+-----------------------------+-------+-------+----+----+----+ -``` - -* Prediction task 1: Use historical 
covariates cov1, cov2, and cov3 to assist in predicting target variables target1 and target2. - - ![](/img/ainode-upgrade-table-forecast-timecho-1-en.png) - - * Use the first 6 rows of historical data from cov1, cov2, cov3, target1, target2 in table tab_real to predict the next 2 rows of target variables target1 and target2. - ```SQL - IoTDB:etth> SELECT * FROM FORECAST ( - MODEL_ID => 'chronos2', - TARGETS => ( - SELECT TIME, target1, target2 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6) ORDER BY TIME, - HISTORY_COVS => ' - SELECT TIME, cov1, cov2, cov3 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6', - OUTPUT_LENGTH => 2 - ) - +-----------------------------+-----------------+-----------------+ - | time| target1| target2| - +-----------------------------+-----------------+-----------------+ - |1970-01-01T08:00:00.007+08:00|7.338330268859863|7.338330268859863| - |1970-01-01T08:00:00.008+08:00| 8.02529525756836| 8.02529525756836| - +-----------------------------+-----------------+-----------------+ - Total line number = 2 - It costs 0.315s - ``` -* Prediction task 2: Use historical covariates cov1, cov2 and known covariates cov3 in the same table to assist in predicting target variables target1 and target2. - - ![](/img/ainode-upgrade-table-forecast-timecho-2-en.png) - - * Use the first 6 rows of historical data from cov1, cov2, cov3, target1, target2 in table tab_real, and known covariate cov3 in the future 2 rows of the same table to predict the next 2 rows of target variables target1 and target2. 
- ```SQL - IoTDB:etth> SELECT * FROM FORECAST ( - MODEL_ID => 'chronos2', - TARGETS => ( - SELECT TIME, target1, target2 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6) ORDER BY TIME, - HISTORY_COVS => ' - SELECT TIME, cov1, cov2, cov3 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6', - FUTURE_COVS => ' - SELECT TIME, cov3 - FROM etth.tab_real - WHERE TIME >= 7 - LIMIT 2', - OUTPUT_LENGTH => 2 - ) - +-----------------------------+-----------------+-----------------+ - | time| target1| target2| - +-----------------------------+-----------------+-----------------+ - |1970-01-01T08:00:00.007+08:00|7.244050025939941|7.244050025939941| - |1970-01-01T08:00:00.008+08:00|7.907227516174316|7.907227516174316| - +-----------------------------+-----------------+-----------------+ - Total line number = 2 - It costs 0.291s - ``` -* Prediction task 3: Use historical covariates cov1, cov2 from different tables and known covariates cov3 to assist in predicting target variables target1 and target2. - - ![](/img/ainode-upgrade-table-forecast-timecho-3-en.png) - - * Create table tab_cov_forecast (to store known covariate cov3 prediction values) in advance, and prepare related data. - ```SQL - create table tab_cov_forecast (cov3 DOUBLE FIELD); - - -- Insert statement - INSERT INTO tab_cov_forecast (time, cov3) VALUES (7, 7.0),(8, 8.0); - - IoTDB:etth> SELECT * FROM tab_cov_forecast - +----+----+ - |time|cov3| - +----+----+ - | 7| 7.0| - | 8| 8.0| - +----+----+ - ``` - * Use the first 6 rows of known data from cov1, cov2, cov3, target1, target2 in table tab_real, and known covariate cov3 in the future 2 rows from table tab_cov_forecast to predict the next 2 rows of target variables target1 and target2. 
- ```SQL - IoTDB:etth> SELECT * FROM FORECAST ( - MODEL_ID => 'chronos2', - TARGETS => ( - SELECT TIME, target1, target2 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6) ORDER BY TIME, - HISTORY_COVS => ' - SELECT TIME, cov1, cov2, cov3 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6', - FUTURE_COVS => ' - SELECT TIME, cov3 - FROM etth.tab_cov_forecast - WHERE TIME >= 7 - LIMIT 2', - OUTPUT_LENGTH => 2 - ) - +-----------------------------+-----------------+-----------------+ - | time| target1| target2| - +-----------------------------+-----------------+-----------------+ - |1970-01-01T08:00:00.007+08:00|7.244050025939941|7.244050025939941| - |1970-01-01T08:00:00.008+08:00|7.907227516174316|7.907227516174316| - +-----------------------------+-----------------+-----------------+ - Total line number = 2 - It costs 0.351s - ``` - - -#### 4.1.2 Time Series Classification - -Time series classification is a critical capability beyond time series prediction, with extensive applications in industrial scenarios. Its typical paradigm is to input the recent sampling values of multiple measuring points, comprehensively judge the overall operating status of the equipment, and output a classification label for the current status. For example, it can be used for operating status classification of new energy battery pack equipment and other scenarios. - -The AINode table model supports executing time series classification tasks by calling covariate classification models. - -> Note: This feature is available starting from version V2.0.9.1. - -1. **SQL Syntax** -```sql -SELECT * FROM CLASSIFY( - MODEL_ID, - INPUTS -- SQL to retrieve input variables - [TIMECOL, - MODEL_OPTIONS]? 
-) -``` - -* Parameter Description - -| Parameter Name | Parameter Type | Parameter Attribute | Description | Required | Remarks | -|----------------|----------------|---------------------|-------------|----------|---------| -| model_id | Scalar Parameter | String | Unique identifier of the model used for classification | Yes | - | -| inputs | Table Parameter | SET SEMANTIC | Input data to be classified. IoTDB will automatically sort the data in ascending chronological order before passing it to AINode. | Yes | Describes the input data to be classified via SQL; corresponding query errors will be thrown if the input SQL is invalid. | -| timecol | Scalar Parameter | String, Default: `time` | Name of the time column | No | Must be a column of TIMESTAMP type present in the `inputs` result set; otherwise, an error will be thrown. | -| model_options | Scalar Parameter | String, Default: Empty string | Model-related key-value pairs (e.g., whether input normalization is required). Different key-value pairs are separated by `;`. | No | Unsupported parameters for a specific model will be ignored without throwing errors. | - -**Specifications** - -* **Input Data Requirements** - - Type Constraint: Only INT32, INT64, FLOAT, and DOUBLE data types are supported currently. - - Row Count Constraint: Varies by model. Errors will be thrown if the row count is below the minimum or above the maximum required by the model. - - Column Count Constraint: Must include a time column. Univariate classification models support only one data column and will throw an error for multiple columns; multivariate classification models generally have no restrictions unless explicitly specified by the model itself. - - Order Constraint: Multivariate zero-shot classification models generally have no order restrictions unless explicitly specified by the model itself. 
- -* **Output Result** - The returned result is a table composed of time series data classification results, and its schema depends on the specific implementation of the model. - -2. **Usage Example** - -Suppose a project has 10 time series variables with an input length of 192. The custom `mantis_custom` model is used as an example for time series classification inference. - -* Model Registration -```sql -CREATE MODEL mantis_custom USING URI 'file:///path/to/mantis' -``` -For detailed steps to register a custom model, refer to Section 4.3. - -* Execute SQL -```sql -IoTDB:etth> SELECT * FROM CLASSIFY ( - MODEL_ID => 'mantis_custom', - INPUTS => ( - SELECT Time, HUFL,HULL,MUFL,MULL,LUFL,LULL,OT,UT,MT,LT - FROM eg - WHERE TIME < 2016-07-09 00:00:00 - ORDER BY TIME DESC - LIMIT 192) ORDER BY TIME -) -``` - -* Execution Result -```sql -+--------+ -|category| -+--------+ -| 4| -+--------+ -``` - - -### 4.2 Model Fine-Tuning - -AINode supports model fine-tuning through SQL. - -**SQL Syntax** - -```SQL -createModelStatement - | CREATE MODEL modelId=identifier (WITH HYPERPARAMETERS '(' hparamPair (',' hparamPair)* ')')? FROM MODEL existingModelId=identifier ON DATASET '(' targetData=string ')' - ; -hparamPair - : hparamKey=identifier '=' hyparamValue=primaryExpression - ; -``` - -**Parameter Description** - -| Name | Description | -|------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| modelId | Unique identifier of the fine-tuned model | -| hparamPair | Hyperparameter key-value pairs used for fine-tuning, currently supports the following:
`train_epochs`: int type, number of fine-tuning epochs
`iter_per_epoch`: int type, number of iterations per epoch
`learning_rate`: double type, learning rate | -| existingModelId | Base model used for fine-tuning | -| targetData | SQL to get the dataset used for fine-tuning | - -**Example** - -1. Select data from the ot field in the specified time range as the fine-tuning dataset, and create the model sundialv3 based on sundial. - -```SQL -IoTDB> set sql_dialect=table -Msg: The statement is executed successfully. -IoTDB> CREATE MODEL sundialv3 FROM MODEL sundial ON DATASET ('SELECT time, ot from etth.eg where 1467302400000 <= time and time < 1517468400001') -Msg: The statement is executed successfully. -IoTDB> show models -+---------------------+---------+-----------+---------+ -| ModelId|ModelType| Category| State| -+---------------------+---------+-----------+---------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -| sundialv2| sundial| fine_tuned| active| -| sundialv3| sundial| fine_tuned| training| -+---------------------+---------+-----------+---------+ -``` - -2. Fine-tuning tasks are started asynchronously in the background, which can be seen in the AINode process log; after fine-tuning is completed, query and use the new model. 
- -```SQL -IoTDB> show models -+---------------------+---------+-----------+---------+ -| ModelId|ModelType| Category| State| -+---------------------+---------+-----------+---------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -| sundialv2| sundial| fine_tuned| active| -| sundialv3| sundial| fine_tuned| active| -+---------------------+---------+-----------+---------+ -``` - -### 4.3 Register Custom Models - -**The following Transformers models can be registered to AINode:** - -1. AINode currently uses transformers version 4.56.2, so when building the model, avoid inheriting interfaces from lower versions (<4.50); -2. The model must inherit a pipeline for inference tasks of AINode (currently supports prediction pipeline): - * iotdb-core/ainode/iotdb/ainode/core/inference/pipeline/basic_pipeline.py - - **Before V2.0.9.3** - ```Python - class BasicPipeline(ABC): - def __init__(self, model_id, **model_kwargs): - self.model_info = model_info - self.device = model_kwargs.get("device", "cpu") - self.model = load_model(model_info, device_map=self.device, **model_kwargs) - - @abstractmethod - def preprocess(self, inputs, **infer_kwargs): - """ - Preprocess the input data before the inference task starts, including shape validation and numerical conversion. - """ - pass - - @abstractmethod - def postprocess(self, output, **infer_kwargs): - """ - Postprocess the output results after the inference task is completed. 
- """ - pass - - - class ForecastPipeline(BasicPipeline): - def __init__(self, model_info, **model_kwargs): - super().__init__(model_info, model_kwargs=model_kwargs) - - def preprocess(self, inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], **infer_kwargs): - """ - Preprocess the input data before passing it to the model for inference, validating the shape and type of the input data. - - Args: - inputs (list[dict]): - Input data, a list of dictionaries, each dictionary contains: - - 'targets': Tensor with shape (input_length,) or (target_count, input_length). - - 'past_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - 'future_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - infer_kwargs (dict, optional): Additional keyword arguments for inference, such as: - - `output_length`(int): Used to validate the validity of 'future_covariates' if provided. - - Raises: - ValueError: If the input format is invalid (e.g., missing keys, invalid tensor shapes). - - Returns: - Preprocessed and validated input data that can be directly used for model inference. - """ - pass - - def forecast(self, inputs, **infer_kwargs): - """ - Perform forecasting on the given inputs. - - Parameters: - inputs: Input data for forecasting. The type and structure depend on the specific model implementation. - **infer_kwargs: Additional inference parameters, e.g.: - - `output_length`(int): The number of time points the model should generate. - - Returns: - Forecast output, the specific form depends on the specific model implementation. - """ - pass - - def postprocess(self, outputs: list[torch.Tensor], **infer_kwargs) -> list[torch.Tensor]: - """ - Postprocess the model outputs after inference, validating the shape of the output data and ensuring it meets the expected dimensions. - - Args: - outputs: - Model outputs, a list of 2D tensors, each tensor with shape `[target_count, output_length]`. 
- - Raises: - InferenceModelInternalException: If the output tensor shape is invalid (e.g., incorrect dimensions). - ValueError: If the output format is incorrect. - - Returns: - list[torch.Tensor]: - Postprocessed outputs, which will be a list of 2D tensors. - """ - pass - ``` - - **From V2.0.9.3 onwards** - ```Python - class BasicPipeline(ABC): - def __init__(self, model_id, **model_kwargs): - self.model_info = model_info - self.device = model_kwargs.get("device", "cpu") - self.model = load_model(model_info, device_map=self.device, **model_kwargs) - - @abstractmethod - def preprocess(self, inputs, **infer_kwargs): - """ - Preprocess the input data before the inference task starts, including shape validation and numerical conversion. - """ - pass - - @abstractmethod - def postprocess(self, output, **infer_kwargs): - """ - Postprocess the output results after the inference task is completed. - """ - pass - - - class ForecastPipeline(BasicPipeline): - def __init__(self, model_info, **model_kwargs): - super().__init__(model_info, model_kwargs=model_kwargs) - - def _preprocess( - self, - inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], - **infer_kwargs, - ): - """ - Preprocess the input data before passing it to the model for inference, validating the shape and type of the input data. - - Args: - inputs (list[dict[str, dict[str, torch.Tensor] | torch.Tensor]]): - Input data, a list of dictionaries, each dictionary contains: - - 'targets': Tensor with shape (input_length,) or (target_count, input_length). - - 'past_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - 'future_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - infer_kwargs (dict, optional): Additional keyword arguments for inference, such as: - - `output_length`(int): Used to validate the validity of 'future_covariates' if provided. 
- - Raises: - ValueError: If the input format is invalid (e.g., missing keys, invalid tensor shapes). - - Returns: - Preprocessed and validated input data that can be directly used for model inference. - """ - pass - - def forecast(self, inputs, **infer_kwargs): - """ - Perform forecasting on the given inputs. - - Parameters: - inputs: Input data for forecasting. The type and structure depend on the specific model implementation. - **infer_kwargs: Additional inference parameters, e.g.: - - `output_length`(int): The number of time points the model should generate. - - Returns: - Forecast output, the specific form depends on the specific model implementation. - """ - pass - - def _postprocess(self, outputs, **infer_kwargs) -> list[torch.Tensor]: - """ - Postprocess the model outputs after inference, validating the shape of the output data and ensuring it meets the expected dimensions. - - Args: - outputs: - Model outputs, a list of 2D tensors, each tensor with shape `[target_count, output_length]`. - - Raises: - InferenceModelInternalException: If the output tensor shape is invalid (e.g., incorrect dimensions). - ValueError: If the output format is incorrect. - - Returns: - list[torch.Tensor]: - Postprocessed outputs, which will be a list of 2D tensors. - """ - pass - ``` - -3. 
Modify the model configuration file `config.json` to ensure it contains the following fields: - - **Before V2.0.9.3** - ```JSON - { - "auto_map": { - "AutoConfig": "config.Chronos2CoreConfig", // Specify the model Config class - "AutoModelForCausalLM": "model.Chronos2Model" // Specify the model class - }, - "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // Specify the inference pipeline for the model - "model_type": "custom_t5", // Specify the model type - } - ``` - * The model Config class and model class **must** be specified via `auto_map`; - * The inference pipeline class **must** be inherited and specified; - * For built-in and user-defined models managed by AINode, `model_type` also serves as a unique non-duplicable identifier. That is, the model type to be registered must not duplicate any existing model types; models created via fine-tuning will inherit the model type of the original model. - - **From V2.0.9.3 onwards** - > The `model_type` parameter is **not required** - ```JSON - { - "auto_map": { - "AutoConfig": "config.Chronos2CoreConfig", // Specify the model Config class - "AutoModelForCausalLM": "model.Chronos2Model" // Specify the model class - }, - "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // Specify the inference pipeline for the model - } - ``` - * The model Config class and model class **must** be specified via `auto_map`; - * The inference pipeline class **must** be inherited and specified; - -4. Ensure the model directory to be registered contains the following files, and the model configuration file name and weight file name are not customizable: - * Model configuration file: config.json; - * Model weight file: model.safetensors; - * Model code: other .py files. 
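The directory and `config.json` requirements listed above can be checked before running `CREATE MODEL`. The following is a minimal sketch, not part of AINode: the function name `check_model_dir` is hypothetical, and it assumes `config.json` is strict JSON (the `//` comments shown in the examples above would have to be removed first). It does not check `model_type`, which is only required before V2.0.9.3.

```python
import json
from pathlib import Path

# File names fixed by the registration rules above (not customizable).
REQUIRED_FILES = ("config.json", "model.safetensors")
# Keys that the examples above place under "auto_map".
REQUIRED_AUTO_MAP_KEYS = ("AutoConfig", "AutoModelForCausalLM")


def check_model_dir(model_dir: str) -> list[str]:
    """Return a list of problems; an empty list means the layout looks registrable."""
    root = Path(model_dir)
    problems = []

    for name in REQUIRED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")

    # Model code is expected as additional .py files next to the config.
    if not any(root.glob("*.py")):
        problems.append("no .py model code found")

    config_path = root / "config.json"
    if config_path.is_file():
        try:
            config = json.loads(config_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"config.json is not valid JSON: {exc}")
            return problems

        auto_map = config.get("auto_map", {})
        for key in REQUIRED_AUTO_MAP_KEYS:
            if key not in auto_map:
                problems.append(f"auto_map is missing {key}")
        if "pipeline_cls" not in config:
            problems.append("config.json is missing pipeline_cls")

    return problems
```

Calling `check_model_dir("/path/to/chronos2")` on a complete directory returns an empty list; any returned strings name what still has to be fixed before registration.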
- -**The SQL syntax for registering custom models is as follows:** - -```SQL -CREATE MODEL <model_id> USING URI <uri> -``` - -**Parameter Description:** - -* **model_id**: Unique identifier for the custom model; cannot be duplicated, with the following constraints: - * Allowed characters: [0-9 a-z A-Z \_ ] (letters, numbers (not at the beginning), underscore (not at the beginning)) - * Length limit: 2-64 characters - * Case-sensitive -* **uri**: Local URI address containing the model code and weights. - -**Registration Example:** - -Upload a custom Transformers model from a local path; AINode copies the folder to the user_defined directory. - -```SQL -CREATE MODEL chronos2 USING URI 'file:///path/to/chronos2' -``` - -Registration proceeds asynchronously after the SQL is executed. The registration status can be checked in the model list (see the "Viewing Models" section). Once the model is registered successfully, it can be invoked for inference through normal queries. - -### 4.4 Viewing Models - -Registered models can be queried using the view command. - -```SQL -SHOW MODELS -``` - -In addition to displaying all model information directly, you can specify `model_id` to view the information of a specific model. - -```SQL -SHOW MODELS <model_id> -- Only show specific model -``` - -The result of the model display contains the following: - -| **ModelId** | **ModelType** | **Category** | **State** | -| ------------------- | --------------------- | -------------------- | ----------------- | -| Model ID | Model Type | Model Category | Model State | - -The state machine for the State field is shown below: - -![](/img/ainode-upgrade-state-timecho-en.png) - -State machine flow description: - -1. After AINode starts, executing the `show models` command shows only **system built-in (BUILTIN)** models. -2. 
Users can import their own models, which are identified as **user-defined (USER_DEFINED)**; AINode will try to parse the model type (ModelType) from the model configuration file; if parsing fails, this field will be displayed as empty. -3. Time series large models (built-in models) do not have weight files packaged with AINode, and AINode automatically downloads them when starting. - 1. During download, it is ACTIVATING, and after successful download, it becomes ACTIVE, and if failed, it becomes INACTIVE. -4. After users start a model fine-tuning task, the model state during training is TRAINING, and after successful training, it becomes ACTIVE, and if failed, it becomes FAILED. -5. If the fine-tuning task is successful, after fine-tuning, all ckpt (training files) will be statistically analyzed to find the best file and automatically renamed to the user-specified model_id. - -**Viewing Example** - -```SQL -IoTDB> show models -+---------------------+--------------+--------------+-------------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------+--------------+-------------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| custom| | user_defined| active| -| timer_xl| timer| builtin| activating| -| sundial| sundial| builtin| active| -| sundialx_1| sundial| fine_tuned| active| -| sundialx_4| sundial| fine_tuned| training| -| sundialx_5| sundial| fine_tuned| failed| -| chronos2| t5| builtin| inactive| -+---------------------+--------------+--------------+-------------+ -``` - -**Built-in Traditional Time Series Models:** - -| Model Name | Core Concept | Applicable Scenarios | Key Features | 
-|-----------------------------------------|------------------------------------------------------------------------------|-----------------------------------------------------------|------------------------------------------------------------------------------| -| **ARIMA** (Autoregressive Integrated Moving Average) | Combines AR, differencing (I), and MA for stationary or differenced series | Univariate forecasting (stock prices, sales, economics) | 1. For linear trends with weak seasonality<br>2. Requires (p,d,q) tuning<br>3. Sensitive to missing values | -| **Holt-Winters** (Triple Exponential Smoothing) | Exponential smoothing with level, trend, and seasonal components | Data with clear trend & seasonality (monthly sales, power demand) | 1. Handles additive/multiplicative seasonality<br>2. Weights recent data higher<br>3. Simple implementation | -| **Exponential Smoothing** | Weighted average of history with exponentially decaying weights | Trending but non-seasonal data (short-term demand) | 1. Few parameters, simple computation<br>2. Suitable for stable/slow-changing series<br>3. Extensible to double/triple smoothing | -| **Naive Forecaster** | Uses last observation as next prediction (simplest baseline) | Benchmarking or data with no clear pattern | 1. No training needed<br>2. Sensitive to sudden changes<br>3. Seasonal variant uses prior season value | -| **STL Forecaster** | Decomposes series into trend, seasonal, residual; forecasts components | Complex seasonality/trends (climate, traffic) | 1. Handles non-fixed seasonality<br>2. Robust to outliers<br>3. Components can use other models | -| **Gaussian HMM** | Hidden states generate observations; each state follows Gaussian distribution | State sequence prediction/classification (speech, finance) | 1. Models temporal state transitions<br>2. Observations independent per state<br>3. Requires state count | -| **GMM HMM** | Extends Gaussian HMM; each state uses Gaussian Mixture Model | Multi-modal observation scenarios (motion recognition, biosignals) | 1. 
More flexible than single Gaussian<br>2. Higher complexity<br>3. Requires GMM component count | -| **STRAY** (Search for Outliers using Random Projection and Adaptive Thresholding) | Uses SVD to detect anomalies in high-dimensional time series | High-dimensional anomaly detection (sensor networks, IT monitoring) | 1. No distribution assumption<br>2. Handles high dimensions<br>3. Sensitive to global anomalies | - -**Built-in Time Series Large Models:** - -| Model Name | Core Concept | Applicable Scenarios | Key Features | -|-----------------|------------------------------------------------------------------------------|-----------------------------------------------------------|------------------------------------------------------------------------------| -| **Timer-XL** | Long-context time series large model pretrained on massive industrial data | Complex industrial forecasting requiring ultra-long history (energy, aerospace, transport) | 1. Supports input of tens of thousands of time points<br>2. Covers non-stationary, multivariate, and covariate scenarios<br>3. Pretrained on trillion-scale high-quality industrial IoT data | -| **Timer-Sundial** | Generative foundation model with "Transformer + TimeFlow" architecture | Zero-shot forecasting requiring uncertainty quantification (finance, supply chain, renewable energy) | 1. Strong zero-shot generalization; supports point & probabilistic forecasting<br>2. Flexible analysis of any prediction distribution statistic<br>3. Innovative flow-matching architecture for efficient non-deterministic sample generation | -| **Chronos-2** | Universal time series foundation model based on discrete tokenization | Rapid zero-shot univariate forecasting; scenarios enhanced by covariates (promotions, weather) | 1. Powerful zero-shot probabilistic forecasting<br>2. Unified multi-variable & covariate modeling (strict input requirements):<br>a. Future covariate names ⊆ historical covariate names<br>b. Each historical covariate length = target length<br>c. 
Each future covariate length = prediction length<br>3. Efficient encoder-only structure balancing performance and speed | - -### 4.5 Model Deletion - -Registered models can be deleted via SQL. AINode removes the corresponding model folder under `user_defined`. Syntax: -```SQL -DROP MODEL <model_id> -``` -- Requires specifying an existing `model_id`. -- Deletion is asynchronous (status: `DROPPING`), during which the model cannot be used for inference. -- **Built-in models cannot be deleted.** - -### 4.6 Loading/Unloading Models - -AINode supports two loading strategies: -* **On-Demand Loading**: Load the model temporarily during inference, then release resources. Suitable for testing or low-load scenarios. -* **Persistent Loading**: Keep model instances resident in CPU memory or GPU VRAM to support high-concurrency inference. Users specify load/unload targets via SQL; AINode manages instance counts automatically. The current load status can be queried. - -Details below: - -1. **Configuration Parameters** - Edit these settings to control persistent loading behavior: - ```properties - # Ratio of total device memory/VRAM usable by AINode for inference - # Datatype: Float - ain_inference_memory_usage_ratio=0.4 - - # Memory overhead ratio per loaded model instance (model_size * this_value) - # Datatype: Float - ain_inference_extra_memory_ratio=1.2 - ``` - -2. **List Available Devices** - ```SQL - SHOW AI_DEVICES - ``` - Example: - ```SQL - IoTDB> SHOW AI_DEVICES - +-------------+ - | DeviceId| - +-------------+ - | cpu| - | 0| - | 1| - +-------------+ - ``` - -3. **Load Model** - Manually load a model; the system balances the instance count automatically based on available resources: - ```SQL - LOAD MODEL <existing_model_id> TO DEVICES <device_id>(, <device_id>)* - ``` - Parameters: - * `existing_model_id`: Model ID (current version supports `timer_xl` and `sundial` only) - * `device_id`: - * `cpu`: Load into server memory - * `gpu_id`: Load into specified GPU(s), e.g., `'0,1'` for GPUs 0 and 1 - Example: - ```SQL - LOAD MODEL sundial TO DEVICES 'cpu,0,1' - ``` - -4. 
**Unload Model** - Unload all instances of a model; the system reallocates the freed resources: - ```SQL - UNLOAD MODEL <existing_model_id> FROM DEVICES <device_id>(, <device_id>)* - ``` - Parameters same as `LOAD MODEL`. - Example: - ```SQL - UNLOAD MODEL sundial FROM DEVICES 'cpu,0,1' - ``` - -5. **View Loaded Models** - ```SQL - SHOW LOADED MODELS - SHOW LOADED MODELS <device_id>(, <device_id>)* -- Filter by device - ``` - Example (sundial loaded on CPU, GPU 0, GPU 1): - ```SQL - IoTDB> SHOW LOADED MODELS - +-------------+--------------+------------------+ - | DeviceId| ModelId| Count(instances)| - +-------------+--------------+------------------+ - | cpu| sundial| 4| - | 0| sundial| 6| - | 1| sundial| 6| - +-------------+--------------+------------------+ - ``` - * `DeviceId`: Device identifier - * `ModelId`: Loaded model ID - * `Count(instances)`: Number of model instances per device (auto-assigned by system) - -### 4.7 Large Time Series Models - -AINode supports multiple large time series models. For deployment details, refer to [Time Series Large Model](../AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md). - -## 5. Permission Management - -AINode permissions are managed through IoTDB's built-in authentication. Users need the `USE_MODEL` permission to manage models, and the `READ_SCHEMA` and `READ_DATA` permissions to access input data for inference. 
- -| **Permission** | **Scope** | **Administrator (default ROOT)** | **Normal User** | -|---------------------|--------------------------------|----------------------------------|-----------------| -| USE_MODEL | create model / show models / drop model | √ | √ | -| READ_SCHEMA&READ_DATA | forecast | √ | √ | \ No newline at end of file diff --git a/src/UserGuide/latest-Table/AI-capability/AINode_timecho.md b/src/UserGuide/latest-Table/AI-capability/AINode_timecho.md deleted file mode 100644 index d31d879b2..000000000 --- a/src/UserGuide/latest-Table/AI-capability/AINode_timecho.md +++ /dev/null @@ -1,482 +0,0 @@ - - -# AINode - -AINode is a native IoTDB node that supports the registration, management, and invocation of time-series-related models. It comes with built-in industry-leading self-developed time-series large models, such as the Timer series developed by Tsinghua University. These models can be invoked through standard SQL statements, enabling real-time inference of time series data at the millisecond level, and supporting application scenarios such as trend forecasting, missing value imputation, and anomaly detection for time series data. - -> Available since V2.0.5.1 - -The system architecture is shown below: -::: center - -::: - -The responsibilities of the three nodes are as follows: - -- **ConfigNode:** - - Manages distributed nodes and handles load balancing across the system. -- **DataNode:** - - Receives and parses user SQL queries. - - Stores time-series data. - - Performs preprocessing computations on raw data. -- **AINode:** - - Manages and utilizes time-series models (including training/inference). - - Supports deep learning and machine learning workflows. - -## 1. 
Advantageous features - -Compared with building a machine learning service alone, it has the following advantages: - -- **Simple and easy to use**: no need to use Python or Java programming, the complete process of machine learning model management and inference can be completed using SQL statements. Creating a model can be done using the CREATE MODEL statement, and using a model for inference can be done using the SELECT * FROM FORECAST (...) statement, making it simpler and more convenient to use. - -- **Avoid Data Migration**: With IoTDB native machine learning, data stored in IoTDB can be directly applied to the inference of machine learning models without having to move the data to a separate machine learning service platform, which accelerates data processing, improves security, and reduces costs. - -![](/img/AInode1.png) - -- **Built-in Advanced Algorithms**: supports industry-leading machine learning analytics algorithms covering typical timing analysis tasks, empowering the timing database with native data analysis capabilities. Such as: - - **Time Series Forecasting**: learns patterns of change from past time series; thus outputs the most likely prediction of future series based on observations at a given past time. - - **Anomaly Detection for Time Series**: detects and identifies outliers in a given time series data, helping to discover anomalous behaviour in the time series. - - **Annotation for Time Series (Time Series Annotation)**: Adds additional information or markers, such as event occurrence, outliers, trend changes, etc., to each data point or specific time period to better understand and analyse the data. - - -## 2. Basic Concepts - -- **Model**: A machine learning model takes time series data as input and outputs analysis task results or decisions. Models are the basic management units of AINode, supporting model operations such as creation (registration), deletion, query, modification (fine-tuning), and usage (inference). 
-- **Create**: Load externally designed or trained model files/algorithms into AINode for unified management and usage by IoTDB. -- **Inference**: Use the created model to complete time series analysis tasks applicable to the model on specified time series data. -- **Built-in Capabilities**: AINode comes with machine learning algorithms or self-developed models for common time series analysis scenarios (e.g., forecasting and anomaly detection). -![](/img/AINode-en.png) - -## 3. Installation and Deployment - -The deployment of AINode is described in the document [AINode Deployment](../Deployment-and-Maintenance/AINode_Deployment_timecho.md). - - -## 4. Usage Guide - -AINode provides model creation and deletion functions for time series models. Built-in models do not require creation and can be used directly. - -### 4.1 Registering Models - -Trained deep learning models can be registered by specifying their input and output vector dimensions for inference. - -Models that meet the following criteria can be registered with AINode: - -1. AINode currently supports models trained with PyTorch 2.4.0. Features introduced after version 2.4.0 should be avoided. -2. AINode supports models stored using PyTorch JIT (`model.pt`), which must include both the model structure and weights. -3. The model input sequence can include single or multiple columns. If multi-column, it must match the model capabilities and configuration file. -4. Model configuration parameters must be clearly defined in the `config.yaml` file. When using the model, the input and output dimensions defined in `config.yaml` must be strictly followed. Mismatches with the configuration file will cause errors. - -The SQL syntax for model registration is defined as follows: - -```SQL -create model <model_id> using uri <uri> -``` - -Detailed meanings of SQL parameters: - -- **model_id**: The globally unique identifier for the model; it must not be duplicated. 
Model names have the following constraints: - - Allowed characters: [0-9 a-z A-Z _] (letters, digits (not at the beginning), underscores (not at the beginning)) - - Length: 2-64 characters - - Case-sensitive -- **uri**: The resource path of the model registration files, which should include the **model structure and weight file `model.pt` and the model configuration file `config.yaml`** - - - **Model structure and weight file**: The weight file generated after model training, currently supporting `.pt` files from PyTorch training. - - - **Model configuration file**: Parameters related to the model structure provided during registration, which must include input and output dimensions for inference: - - | **Parameter Name** | **Description** | **Example** | - | ------------------ | -------------------------------- | ----------- | - | input_shape | Rows and columns of model input | [96,2] | - | output_shape | Rows and columns of model output | [48,2] | - - In addition to inference, data types of input and output can also be specified: - - | **Parameter Name** | **Description** | **Example** | - | ------------------ | ------------------------- | ---------------------- | - | input_type | Data type of model input | ['float32', 'float32'] | - | output_type | Data type of model output | ['float32', 'float32'] | - - Additional notes can be specified for model management display: - - | **Parameter Name** | **Description** | **Example** | - | ------------------ | --------------------------------------------- | -------------------------------------------- | - | attributes | Optional notes set by users for model display | 'model_type': 'dlinear', 'kernel_size': '25' | - -In addition to registering local model files, remote resource paths can be specified via URIs for registration, using open-source model repositories (e.g., HuggingFace). 
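The `model_id` constraints and the `config.yaml` parameters described above can be checked before registration. The sketch below is illustrative only: `is_valid_model_id` and `check_config` are hypothetical helper names, not AINode APIs. It assumes the `configs` section of `config.yaml` has already been parsed into a Python dict, and it reads "not at the beginning" as "the first character must be a letter". The rule that the type lists must match the column count is taken from the comments in this section's example configuration.

```python
import re

# Letters, digits, underscore; first character a letter; total length 2-64.
MODEL_ID_PATTERN = re.compile(r"[A-Za-z][0-9A-Za-z_]{1,63}")


def is_valid_model_id(model_id: str) -> bool:
    """Check a candidate model_id against the documented naming constraints."""
    return MODEL_ID_PATTERN.fullmatch(model_id) is not None


def check_config(configs: dict) -> list[str]:
    """Return a list of problems found in a parsed `configs` section."""
    problems = []

    # input_shape and output_shape are required: [rows, columns].
    for key in ("input_shape", "output_shape"):
        shape = configs.get(key)
        ok = (
            isinstance(shape, list)
            and len(shape) == 2
            and all(isinstance(n, int) and n > 0 for n in shape)
        )
        if not ok:
            problems.append(f"{key} must be [rows, columns] with positive integers")
    if problems:
        return problems

    # input_type / output_type are optional, but must match the column count.
    for type_key, shape_key in (("input_type", "input_shape"),
                                ("output_type", "output_shape")):
        types = configs.get(type_key)
        columns = configs[shape_key][1]
        if types is not None and len(types) != columns:
            problems.append(f"{type_key} has {len(types)} entries, expected {columns}")

    return problems
```

For example, `check_config({"input_shape": [96, 2], "output_shape": [48, 2]})` returns an empty list, while adding `"input_type": ["float32"]` reports a column-count mismatch.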
- -#### Example - -The [example folder](https://github.com/apache/iotdb/tree/master/integration-test/src/test/resources/ainode-example) contains model.pt (trained model) and config.yaml with the following content: - -```YAML -configs: - # Required - input_shape: [96, 2] # Model accepts 96 rows x 2 columns of data - output_shape: [48, 2] # Model outputs 48 rows x 2 columns of data - - # Optional (default to all float32, column count matches shape) - input_type: ["int64", "int64"] # Data types of inputs, must match input column count - output_type: ["text", "int64"] # Data types of outputs, must match output column count - -attributes: # Optional user-defined notes - 'model_type': 'dlinear' - 'kernel_size': '25' -``` - -Register the model by specifying this folder as the loading path: - -```SQL -IoTDB> create model dlinear_example using uri "file://./example" -``` - -Models can also be downloaded from HuggingFace for registration: - -```SQL -IoTDB> create model dlinear_example using uri "https://huggingface.co/google/timesfm-2.0-500m-pytorch" -``` - -After SQL execution, registration proceeds asynchronously. The registration status can be checked via model display (see Model Display section). The registration success time mainly depends on the model file size. - -Once registered, the model can be invoked for inference through normal query syntax. - -### 4.2 Viewing Models - -Registered models can be queried using the `show models` command. The SQL definitions are: - -```SQL -show models - -show models -``` - -In addition to displaying all models, specifying a `model_id` shows details of a specific model. The display includes: - -| **ModelId** | **ModelType** | **Category** | **State** | -|-------------|---------------|----------------|-------------| -| Model ID | Model Type | Model Category | Model State | - -- Model State Transition Diagram - -![](/img/AINode-State-en.png) - -**Instructions:** - -1. 
Initialization: - - When AINode starts, `show models` displays only BUILT-IN models. -2. Custom Model Import: - - Users can import custom models (marked as USER-DEFINED). - - The system attempts to parse the ModelType from the config file. - - If parsing fails, the field remains empty. -3. Foundation Model Weights: - - Time-series foundation model weights are not bundled with AINode. - - AINode automatically downloads them during startup. - - Download state: LOADING. -4. Download Outcomes: - - Success → State changes to ACTIVE. - - Failure → State changes to INACTIVE. -5. Fine-Tuning Process: - - When fine-tuning starts: State becomes TRAINING. - - Successful training → State transitions to ACTIVE. - - Training failure → State changes to FAILED. - -**Example** - -```SQL -IoTDB> show models -+---------------------+--------------------+--------------+---------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+--------------+---------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| custom| | USER-DEFINED| ACTIVE| -| timerxl| Timer-XL| BUILT-IN| LOADING| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| sundialx_1| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx_2| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx_4| Timer-Sundial| FINE-TUNED| TRAINING| -| sundialx_5| Timer-Sundial| FINE-TUNED| FAILED| -+---------------------+--------------------+--------------+---------+ -``` - - -### 4.3 Deleting Models - -Registered models can be deleted via SQL, which removes all related files under AINode: - -```SQL -drop model <model_id> -``` - -Specify the registered `model_id` to delete the 
model. Since deletion involves data cleanup, the operation is not immediate, and the model state becomes `DROPPING`, during which it cannot be used for inference. **Note:** Built-in models cannot be deleted. - -### 4.4 Inference with Built-in Models - -SQL syntax: - - -```SQL -SELECT * FROM forecast( - input, - model_id, - [output_length, - output_start_time, - output_interval, - timecol, - preserve_input, - model_options]? -) -``` - - -Built-in models do not require prior registration for inference. Simply use the `forecast` function and specify the `model_id` to invoke the model's inference capabilities. - - - **Note**: Inference with built-in time series large models requires local availability of model weights in the directory `/IOTDB_AINODE_HOME/data/ainode/models/weights/model_id/`. If weights are missing, they will be automatically downloaded from HuggingFace. Ensure direct network access to HuggingFace. - -Parameter descriptions: - -| Parameter | Type | Attribute | Description | Required | Notes | -| ----------------- | ------------- | -------------------------------------------------------- | ------------------------------------------------------------ | -------- | ------------------------------------------------------------ | -| input | Table | SET SEMANTIC | Input data for forecasting | Yes | | -| model_id | String | Scalar | Name of the model to use | Yes | Must be non-empty and a built-in model; otherwise, errors like "MODEL_ID cannot be null" occur. | -| output_length | INT32 | Scalar (default: 96) | Size of the output forecast window | No | Must be > 0. | -| output_start_time | Timestamp | Scalar (default: last input timestamp + output_interval) | Start timestamp of the forecast results | No | Can be negative (before 1970-01-01). | -| output_interval | Time interval | Scalar (default: inferred from input) | Time interval between forecast points (supports ns, us, ms, s, m, h, d, w) | No | If > 0, uses user-specified interval; else, infers from input. 
| -| timecol | String | Scalar (default: "time") | Name of the timestamp column | No | Must exist in `input` and be of TIMESTAMP type; otherwise, errors occur. | -| preserve_input | Boolean | Scalar (default: false) | Retain all input rows in the output | No | | -| model_options | String | Scalar (default: empty) | Model-specific key-value pairs (e.g., normalization) | No | Unsupported parameters are ignored. See appendix for built-in model parameters. | - -**Notes:** - -1. The `forecast` function predicts all columns in the input table by default (excluding the time column and columns specified in `partition by`). -2. The `forecast` function does not require the input data to be in any specific order. It sorts the input data in ascending order by the timestamp (specified by the `TIMECOL` parameter) before invoking the model for prediction. -3. Different models have varying requirements for the number of input data rows. If the input data has fewer rows than the minimum requirement, an error will be reported. - - Among the current built-in models in AINode: - - Timer-XL requires at least 96 rows of input data. - - Timer-Sundial requires at least 16 rows of input data. -4. The result columns of the `forecast` function include all input columns from the input table, with their original data types preserved. If `preserve_input = true`, an additional `is_input` column will be included to indicate whether a row is from the input data. - - Currently, only columns of type INT32, INT64, FLOAT, or DOUBLE are supported for prediction. Otherwise, an error will occur: "The type of the column [%s] is [%s], only INT32, INT64, FLOAT, DOUBLE is allowed." -5. `output_start_time` and `output_interval` only affect the generation of the timestamp column in the output results. Both are optional parameters: - - `output_start_time` defaults to the last timestamp of the input data plus `output_interval`. 
- - `output_interval` defaults to the sampling interval of the input data, calculated as: (last timestamp - first timestamp) / (number of rows - 1). - - The timestamp of the Nth output row is calculated as: `output_start_time + (N - 1) * output_interval`. - -**Example: Database and table must be pre-created** - -```sql -create database etth -create table eg (hufl FLOAT FIELD, hull FLOAT FIELD, mufl FLOAT FIELD, mull FLOAT FIELD, lufl FLOAT FIELD, lull FLOAT FIELD, ot FLOAT FIELD) -``` - -Using the ETTh1-tab dataset:[ETTh1-tab](/img/ETTh1-tab.csv)。 - -**View supported models** - -```Bash -IoTDB:etth> show models -+---------------------+--------------------+--------+------+ -| ModelId| ModelType|Category| State| -+---------------------+--------------------+--------+------+ -| arima| Arima|BUILT-IN|ACTIVE| -| holtwinters| HoltWinters|BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing|BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster|BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster|BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm|BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm|BUILT-IN|ACTIVE| -| stray| Stray|BUILT-IN|ACTIVE| -| sundial| Timer-Sundial|BUILT-IN|ACTIVE| -| timer_xl| Timer-XL|BUILT-IN|ACTIVE| -+---------------------+--------------------+--------+------+ -Total line number = 10 -It costs 0.004s -``` - -**Inference with sundial model:** - -```Bash -IoTDB:etth> select Time, HUFL,HULL,MUFL,MULL,LUFL,LULL,OT from eg LIMIT 96 -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -| Time| HUFL| HULL| MUFL| MULL| LUFL| LULL| OT| -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -|2016-07-01T00:00:00.000+08:00| 5.827|2.009|1.599|0.462|4.203| 1.34|30.531| -|2016-07-01T01:00:00.000+08:00| 5.693|2.076|1.492|0.426|4.142|1.371|27.787| -|2016-07-01T02:00:00.000+08:00| 5.157|1.741|1.279|0.355|3.777|1.218|27.787| -|2016-07-01T03:00:00.000+08:00| 5.09|1.942|1.279|0.391|3.807|1.279|25.044| -...... 
-Total line number = 96 -It costs 0.119s - -IoTDB:etth> select * from forecast( - model_id => 'sundial', - input => (select Time, ot from etth.eg where time >= 2016-08-07T18:00:00.000+08:00 limit 1440) order BY time, - output_length => 96 -) -+-----------------------------+---------+ -| time| ot| -+-----------------------------+---------+ -|2016-10-06T18:00:00.000+08:00|20.781654| -|2016-10-06T19:00:00.000+08:00|20.252121| -|2016-10-06T20:00:00.000+08:00|19.960138| -|2016-10-06T21:00:00.000+08:00|19.662334| -...... -Total line number = 96 -It costs 1.615s -``` -### 4.5 Fine-tuning Built-in Models -> Only Timer-XL and Timer-Sundial support fine-tuning. - - -The SQL syntax is as follows: - - -```SQL -create model (with hyperparameters -(=(, =)*))? -from model -on dataset (inputSql) -``` - -#### Example - -1. Select the first 80% of data from the measurement `ot` as the fine-tuning dataset, and create the model `sundialv3` based on `sundial`. - -```SQL -IoTDB> set sql_dialect=table -Msg: The statement is executed successfully. -IoTDB> CREATE MODEL sundialv3 FROM MODEL sundial ON DATASET ('SELECT time, ot from etth.eg where 1467302400000 <= time and time < 1517468400001') -Msg: The statement is executed successfully. 
-IoTDB> show models -+---------------------+--------------------+----------+--------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+--------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| timer_xl| Timer-XL| BUILT-IN| ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED| ACTIVE| -| sundialv3| Timer-Sundial|FINE-TUNED|TRAINING| -+---------------------+--------------------+----------+--------+ -``` - -2. The fine-tuning task starts asynchronously in the background, and logs can be viewed in the AINode process. After fine-tuning is complete, query and use the new model - -```SQL -IoTDB> show models -+---------------------+--------------------+----------+------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+------+ -| arima| Arima| BUILT-IN|ACTIVE| -| holtwinters| HoltWinters| BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN|ACTIVE| -| stray| Stray| BUILT-IN|ACTIVE| -| sundial| Timer-Sundial| BUILT-IN|ACTIVE| -| timer_xl| Timer-XL| BUILT-IN|ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED|ACTIVE| -| sundialv3| Timer-Sundial|FINE-TUNED|ACTIVE| -+---------------------+--------------------+----------+------+ -``` - -### 4.6 Time Series Large Model Import Steps - -AINode supports multiple time series large models. 
For deployment, refer to [Time Series Large Model](../AI-capability/TimeSeries-Large-Model.md) - - -## 5 Permission Management - -AINode uses IoTDB's authentication for permission management. Users need `USE_MODEL` permission for model management and `READ_DATA` permission for inference (to access input data sources). - -| **Permission** | **Scope** | **Admin (ROOT)** | **Regular User** | **Path-Related** | -| -------------- | ------------------------ | ---------------- | ---------------- | ---------------- | -| USE_MODEL | Create/Show/Drop models | ✔️ | ✔️ | ❌ | -| READ_DATA | Call inference functions | ✔️ | ✔️ | ✔️ | - - -## 6 Appendix - -**Arima** - -| Parameter | Description | Default | -| ----------------------- | ------------------------------------------------------------ | --------- | -| order | ARIMA order `(p, d, q)`: p=autoregressive, d=differencing, q=moving average. | (1,0,0) | -| seasonal_order | Seasonal ARIMA order `(P, D, Q, s)`: seasonal AR, differencing, MA orders, and season length (e.g., 12 for monthly data). | (0,0,0,0) | -| method | Optimizer: 'newton', 'nm', 'bfgs', 'lbfgs', 'powell', 'cg', 'ncg', 'basinhopping'. | 'lbfgs' | -| maxiter | Maximum iterations/function evaluations. | 50 | -| out_of_sample_size | Number of tail samples for validation (not used in fitting). | 0 | -| scoring | Scoring function for validation (sklearn metric or custom). | 'mse' | -| trend | Trend term configuration. If `with_intercept=True` and None, defaults to 'c' (constant). | None | -| with_intercept | Include intercept term. | True | -| time_varying_regression | Allow regression coefficients to vary over time. | False | -| enforce_stationarity | Enforce stationarity of AR components. | True | -| enforce_invertibility | Enforce invertibility of MA components. | True | -| simple_differencing | Use differenced data for estimation (sacrifices first rows). | False | -| measurement_error | Assume observation errors. 
| False | -| mle_regression | Use maximum likelihood for regression (must be False if `time_varying_regression=True`). | True | -| hamilton_representation | Use Hamilton representation (default is Harvey). | False | -| concentrate_scale | Exclude scale parameter from likelihood (reduces parameters). | False | - -**NaiveForecaster** - -| Parameter | Description | Default | -| --------- | ------------------------------------------------------------ | ------- | -| strategy | Forecasting strategy: - `"last"`: Use last training value (seasonal if `sp`>1). - `"mean"`: Use mean of last window (seasonal if `sp`>1). - `"drift"`: Fit line through last window and extrapolate (non-robust to NaN). | "last" | -| sp | Seasonal period. `None` or 1 means no seasonality; 12 means monthly. | 1 | - -**STLForecaster** - -| Parameter | Description | Default | -| ------------- | ---------------------------------------------------------- | ------- | -| sp | Seasonal period (units). Passed to statsmodels' STL. | 2 | -| seasonal | Seasonal smoothing window (odd ≥3, typically ≥7). | 7 | -| seasonal_deg | LOESS polynomial degree for season (0=constant, 1=linear). | 1 | -| trend_deg | LOESS polynomial degree for trend (0 or 1). | 1 | -| low_pass_deg | LOESS polynomial degree for low-pass (0 or 1). | 1 | -| seasonal_jump | Interpolation step for season LOESS (larger = faster). | 1 | -| trend_jump | Interpolation step for trend LOESS (larger = faster). | 1 | -| low_pass_jump | Interpolation step for low-pass LOESS. | 1 | - -**ExponentialSmoothing (HoltWinters)** - -| Parameter | Description | Default | -| --------------------- | ------------------------------------------------------------ | ----------- | -| damped_trend | Use damped trend (trend flattens instead of growing infinitely). 
| True |
-| initialization_method | Initialization method: - `"estimated"`: Fit to estimate initial states - `"heuristic"`: Use heuristic for initial level/trend/season - `"known"`: User-provided initial values - `"legacy-heuristic"`: Legacy compatibility | "estimated" |
-| optimized | Optimize parameters via maximum likelihood. | True |
-| remove_bias | Remove bias to make residuals' mean zero. | False |
-| use_brute | Use brute-force grid search for initial parameters. | |
diff --git a/src/UserGuide/latest-Table/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md b/src/UserGuide/latest-Table/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md
deleted file mode 100644
index c7fde0bf2..000000000
--- a/src/UserGuide/latest-Table/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md
+++ /dev/null
@@ -1,157 +0,0 @@
-
-# Time Series Large Models
-
-## 1. Introduction
-
-Time Series Large Models are foundation models designed specifically for time series analysis. The IoTDB team develops Timer, an in-house time series foundation model based on the Transformer architecture and pre-trained on large-scale, multi-domain time series data. It supports downstream tasks such as time series forecasting, anomaly detection, and imputation. The AINode platform developed by the team also supports integrating state-of-the-art time series foundation models from the wider community, giving users a choice of models. Unlike traditional time series analysis techniques, these large models extract general-purpose features and can serve a wide range of analytical tasks through zero-shot analysis, fine-tuning, and other services.
-
-The research behind the time series large models described in this document (both the team's own models and those from the wider community) has been published, mostly at top international machine learning conferences; see the appendix for references.
- -## 2. Application Scenarios - -* **Time Series Forecasting**: Providing time series data forecasting services for industrial production, natural environments, and other fields to help users understand future trends in advance. -* **Data Imputation**: Performing context-based filling for missing segments in time series to enhance the continuity and integrity of the dataset. -* **Anomaly Detection**: Using autoregressive analysis technology to monitor time series data in real-time, promptly alerting potential anomalies. - -![](/img/LargeModel10.png) - -## 3. Timer-1 Model - -The Timer model (non-built-in model) not only demonstrates excellent few-shot generalization and multi-task adaptability, but also acquires a rich knowledge base through pre-training, endowing it with universal capabilities to handle diverse downstream tasks, with the following characteristics: - -* **Generalizability**: The model can achieve industry-leading deep model prediction results through fine-tuning with only a small number of samples. -* **Universality**: The model design is flexible, capable of adapting to various different task requirements, and supports variable input and output lengths, enabling it to function effectively in various application scenarios. -* **Scalability**: As the number of model parameters increases or the scale of pre-training data expands, the model's performance will continue to improve, ensuring that the model can continuously optimize its prediction effectiveness as time and data volume grow. - -![](/img/model01.png) - -## 4. 
Timer-XL Model - -Timer-XL further extends and upgrades the network structure based on Timer, achieving comprehensive breakthroughs in multiple dimensions: - -* **Ultra-Long Context Support**: This model breaks through the limitations of traditional time series forecasting models, supporting the processing of inputs with thousands of Tokens (equivalent to tens of thousands of time points), effectively solving the context length bottleneck problem. -* **Coverage of Multi-Variable Forecasting Scenarios**: Supports various forecasting scenarios, including the prediction of non-stationary time series, multi-variable prediction tasks, and predictions involving covariates, meeting diversified business needs. -* **Large-Scale Industrial Time Series Dataset**: Pre-trained on a trillion-scale time series dataset from the industrial IoT field, the dataset possesses important characteristics such as massive scale, excellent quality, and rich domain coverage, covering multiple fields including energy, aerospace, steel, and transportation. - -![](/img/model02.png) - -## 5. Timer-Sundial Model - -Timer-Sundial is a series of generative foundational models focused on time series forecasting. The base version has 128 million parameters and has been pre-trained on 1 trillion time points, with the following core characteristics: - -* **Strong Generalization Performance**: Possesses zero-shot forecasting capabilities and can support both point forecasting and probabilistic forecasting simultaneously. -* **Flexible Prediction Distribution Analysis**: Not only can it predict means or quantiles, but it can also evaluate any statistical properties of the prediction distribution through the raw samples generated by the model. 
-* **Innovative Generative Architecture**: Employs a "Transformer + TimeFlow" collaborative architecture - the Transformer learns the autoregressive representations of time segments, while the TimeFlow module transforms random noise into diverse prediction trajectories based on the flow-matching framework (Flow-Matching), achieving efficient generation of non-deterministic samples. - -![](/img/model03.png) - -## 6. Chronos-2 Model - -Chronos-2 is a universal time series foundational model developed by the Amazon Web Services (AWS) research team, evolved from the Chronos discrete token modeling paradigm. This model is suitable for both zero-shot univariate forecasting and covariate forecasting. Its main characteristics include: - -* **Probabilistic Forecasting Capability**: The model outputs multi-step prediction results in a generative manner, supporting quantile or distribution-level forecasting to characterize future uncertainty. -* **Zero-Shot General Forecasting**: Leveraging the contextual learning ability acquired through pre-training, it can directly execute forecasting on unseen datasets without retraining or parameter updates. -* **Unified Modeling of Multi-Variable and Covariates**: Supports joint modeling of multiple related time series and their covariates under the same architecture to improve prediction performance for complex tasks. However, it has strict input requirements: - * The set of names of future covariates must be a subset of the set of names of historical covariates; - * The length of each historical covariate must equal the length of the target variable; - * The length of each future covariate must equal the prediction length; -* **Efficient Inference and Deployment**: The model adopts a compact encoder-only structure, maintaining strong generalization capabilities while ensuring inference efficiency. - -![](/img/timeseries-large-model-chronos2.png) - -## 7. 
Performance Showcase - -Time Series Large Models can adapt to real time series data from various different domains and scenarios, demonstrating excellent processing capabilities across various tasks. The following shows the actual performance on different datasets: - -**Time Series Forecasting:** - -Leveraging the forecasting capabilities of Time Series Large Models, future trends of time series can be accurately predicted. The blue curve in the following figure represents the predicted trend, while the red curve represents the actual trend, with both curves highly consistent. - -![](/img/LargeModel03.png) - -**Data Imputation:** - -Using Time Series Large Models to fill missing data segments through predictive imputation. - -![](/img/timeseries-large-model-data-imputation.png) - -**Anomaly Detection:** - -Using Time Series Large Models to accurately identify outliers that deviate significantly from the normal trend. - -![](/img/LargeModel05.png) - -## 8. Deployment and Usage - -1. Open the IoTDB CLI console and check that the ConfigNode, DataNode, and AINode nodes are all Running. - -```Plain -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| -+------+----------+-------+---------------+------------+--------------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| 2.0.5.1| 069354f| -| 1| DataNode|Running| 127.0.0.1| 10730| 2.0.5.1| 069354f| -| 2| AINode|Running| 127.0.0.1| 10810| 2.0.5.1|069354f-dev| -+------+----------+-------+---------------+------------+--------------+-----------+ -Total line number = 3 -It costs 0.140s -``` - -2. In an online environment, the first startup of the AINode node will automatically pull the Timer-XL, Sundial, and Chronos2 models. - - > Note: - > - > * The AINode installation package does not include model weight files. 
- > * The automatic pull feature depends on the deployment environment having HuggingFace network access capability. - > * AINode supports manual upload of model weight files. For specific operation methods, refer to [Importing Weight Files](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md#_3-3-importing-built-in-weight-files). - -3. Check if the models are available. - -```Bash -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### Appendix - -**[1]** Timer: Generative Pre-trained Transformers Are Large Time Series Models, Yong Liu, Haoran Zhang, Chenyu Li, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ Back]() - -**[2]** TIMER-XL: LONG-CONTEXT TRANSFORMERS FOR UNIFIED TIME SERIES FORECASTING, Yong Liu, Guo Qin, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ Back]() - -**[3]** Sundial: A Family of Highly Capable Time Series Foundation Models, Yong Liu, Guo Qin, Zhiyuan Shi, Zhi Chen, Caiyin Yang, Xiangdong Huang, Jianmin Wang, Mingsheng Long, **ICML 2025 spotlight**. [↩ Back]() - -**[4]** Chronos-2: From Univariate to Universal Forecasting, Abdul Fatir Ansari, Oleksandr Shchur, Jaris Küken, Andreas Auer, Boran Han, Pedro Mercado, Syama Sundar Rangapuram, Huibin Shen, Lorenzo Stella, Xiyuan Zhang, Mononito Goswami, Shubham Kapoor, Danielle C. 
Maddix, Pablo Guerron, Tony Hu, Junming Yin, Nick Erickson, Prateek Mutalik Desai, Hao Wang, Huzefa Rangwala, George Karypis, Yuyang Wang, Michael Bohlke-Schneider, **arXiv:2510.15821**. [↩ Back]() \ No newline at end of file diff --git a/src/UserGuide/latest-Table/API/Programming-CSharp-Native-API_timecho.md b/src/UserGuide/latest-Table/API/Programming-CSharp-Native-API_timecho.md deleted file mode 100644 index 7c515df99..000000000 --- a/src/UserGuide/latest-Table/API/Programming-CSharp-Native-API_timecho.md +++ /dev/null @@ -1,403 +0,0 @@ - -# C# Native API - -## 1. Feature Overview - -IoTDB provides a C# native client driver and corresponding connection pool, offering object-oriented interfaces that allow direct assembly of time-series objects for writing without SQL construction. It is recommended to use the connection pool for multi-threaded parallel database operations. - -## 2. Usage Instructions - -**Environment Requirements:** - -* .NET SDK >= 5.0 or .NET Framework 4.x -* Thrift >= 0.14.1 -* NLog >= 4.7.9 - -**Dependency Installation:** - -It supports installation using tools such as NuGet Package Manager or .NET CLI. Taking .NET CLI as an example: - -If using .NET 5.0 or a later version of the SDK, enter the following command to install the latest NuGet package: - -```Plain -dotnet add package Apache.IoTDB -``` -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. - -## 3. Read/Write Operations - -### 3.1 TableSessionPool - -#### 3.1.1 Functional Description - -The `TableSessionPool` defines basic operations for interacting with IoTDB, supporting data insertion, query execution, and session closure. It also serves as a connection pool to efficiently reuse connections and properly release resources when unused. This interface defines how to acquire sessions from the pool and how to close the pool. 
- -#### 3.1.2 Method List - -Below are the methods defined in `TableSessionPool` with detailed descriptions: - -| Method | Description | Parameters | Return Type | -| ---------------------------------------------------------------- | -------------------------------------------------------------------------------- |-----------------------------------------------------------------------------------------------------------| ---------------------------- | -| `Open(bool enableRpcCompression)` | Opens a session connection with custom `enableRpcCompression` | `enableRpcCompression`: Whether to enable `RpcCompression` (requires server-side configuration alignment) | `Task` | -| `Open()` | Opens a session connection without enabling `RpcCompression` | None | `Task` | -| `InsertAsync(Tablet tablet)` | Inserts a `Tablet` object containing time-series data into the database | `tablet`: The Tablet object to insert | `Task` | -| `ExecuteNonQueryStatementAsync(string sql)` | Executes a non-query SQL statement (e.g., DDL/DML commands) | `sql`: The SQL statement to execute | `Task` | -| `ExecuteQueryStatementAsync(string sql)` | Executes a query SQL statement and returns a `SessionDataSet` with results | `sql`: The SQL query to execute | `Task` | -| `ExecuteQueryStatementAsync(string sql, long timeoutInMs)` | Executes a query SQL statement with a timeout (milliseconds) | `sql`: The SQL query to execute
`timeoutInMs`: Query timeout in milliseconds | `Task` | -| `Close()` | Closes the session and releases held resources | None | `Task` | - -#### 3.1.3 Interface Examples - -```C# -public async Task Open(bool enableRpcCompression, CancellationToken cancellationToken = default) - - public async Task Open(CancellationToken cancellationToken = default) - - public async Task InsertAsync(Tablet tablet) - - public async Task ExecuteNonQueryStatementAsync(string sql) - - public async Task ExecuteQueryStatementAsync(string sql) - - public async Task ExecuteQueryStatementAsync(string sql, long timeoutInMs) - - public async Task Close() -``` - -### 3.2 TableSessionPool.Builder - -#### 3.2.1 Functional Description - -The `TableSessionPool.Builder` class configures and creates instances of `TableSessionPool`, allowing developers to set connection parameters, session settings, and pooling behaviors. - -#### 3.2.2 Configuration Options - -Below are the available configuration options for `TableSessionPool.Builder` and their defaults: - -| ​**Configuration Method** | ​**Description** | ​**Default Value** | -| --------------------------------------------- | -------------------------------------------------------------------------------- |---------------------------------------------------| -| `SetHost(string host)` | Sets the IoTDB node host | `localhost` | -| `SetPort(int port)` | Sets the IoTDB node port | `6667` | -| `SetNodeUrls(List nodeUrls)` | Sets IoTDB cluster node URLs (overrides `SetHost`/`SetPort` when used) | Not set | -| `SetUsername(string username)` | Sets the connection username | `"root"` | -| `SetPassword(string password)` | Sets the connection password | `"TimechoDB@2021"` //before V2.0.6 it is root | -| `SetFetchSize(int fetchSize)` | Sets the fetch size for query results | `1024` | -| `SetZoneId(string zoneId)` | Sets the timezone ZoneID | `UTC+08:00` | -| `SetPoolSize(int poolSize)` | Sets the maximum number of sessions in the connection pool | `8` | -| 
`SetEnableRpcCompression(bool enable)` | Enables/disables RPC compression | `false` | -| `SetConnectionTimeoutInMs(int timeout)` | Sets the connection timeout in milliseconds | `500` | -| `SetDatabase(string database)` | Sets the target database name | `""` | - -#### 3.2.3 Interface Examples - -```c# -public Builder SetHost(string host) - { - _host = host; - return this; - } - - public Builder SetPort(int port) - { - _port = port; - return this; - } - - public Builder SetUsername(string username) - { - _username = username; - return this; - } - - public Builder SetPassword(string password) - { - _password = password; - return this; - } - - public Builder SetFetchSize(int fetchSize) - { - _fetchSize = fetchSize; - return this; - } - - public Builder SetZoneId(string zoneId) - { - _zoneId = zoneId; - return this; - } - - public Builder SetPoolSize(int poolSize) - { - _poolSize = poolSize; - return this; - } - - public Builder SetEnableRpcCompression(bool enableRpcCompression) - { - _enableRpcCompression = enableRpcCompression; - return this; - } - - public Builder SetConnectionTimeoutInMs(int timeout) - { - _connectionTimeoutInMs = timeout; - return this; - } - - public Builder SetNodeUrls(List nodeUrls) - { - _nodeUrls = nodeUrls; - return this; - } - - protected internal Builder SetSqlDialect(string sqlDialect) - { - _sqlDialect = sqlDialect; - return this; - } - - public Builder SetDatabase(string database) - { - _database = database; - return this; - } - - public Builder() - { - _host = "localhost"; - _port = 6667; - _username = "root"; - _password = "TimechoDB@2021"; //before V2.0.6 it is root - _fetchSize = 1024; - _zoneId = "UTC+08:00"; - _poolSize = 8; - _enableRpcCompression = false; - _connectionTimeoutInMs = 500; - _sqlDialect = IoTDBConstant.TABLE_SQL_DIALECT; - _database = ""; - } - - public TableSessionPool Build() - { - SessionPool sessionPool; - // if nodeUrls is not empty, use nodeUrls to create session pool - if (_nodeUrls.Count > 0) - { - 
sessionPool = new SessionPool(_nodeUrls, _username, _password, _fetchSize, _zoneId, _poolSize, _enableRpcCompression, _connectionTimeoutInMs, _sqlDialect, _database); - } - else - { - sessionPool = new SessionPool(_host, _port, _username, _password, _fetchSize, _zoneId, _poolSize, _enableRpcCompression, _connectionTimeoutInMs, _sqlDialect, _database); - } - return new TableSessionPool(sessionPool); - } -``` - -## 4. Example - -Complete example : [samples/Apache.IoTDB.Samples/TableSessionPoolTest.cs](https://github.com/apache/iotdb-client-csharp/blob/main/samples/Apache.IoTDB.Samples/TableSessionPoolTest.cs) - -```c# -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. 
- */ - -using System; -using System.Collections.Generic; -using System.Threading.Tasks; -using Apache.IoTDB.DataStructure; - -namespace Apache.IoTDB.Samples; - -public class TableSessionPoolTest -{ - private readonly SessionPoolTest sessionPoolTest; - - public TableSessionPoolTest(SessionPoolTest sessionPoolTest) - { - this.sessionPoolTest = sessionPoolTest; - } - - public async Task Test() - { - await TestCleanup(); - - await TestSelectAndInsert(); - await TestUseDatabase(); - // await TestCleanup(); - } - - - public async Task TestSelectAndInsert() - { - var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(sessionPoolTest.nodeUrls) - .SetUsername(sessionPoolTest.username) - .SetPassword(sessionPoolTest.password) - .SetFetchSize(1024) - .Build(); - - await tableSessionPool.Open(false); - - if (sessionPoolTest.debug) tableSessionPool.OpenDebugMode(); - - - await tableSessionPool.ExecuteNonQueryStatementAsync("CREATE DATABASE test1"); - await tableSessionPool.ExecuteNonQueryStatementAsync("CREATE DATABASE test2"); - - await tableSessionPool.ExecuteNonQueryStatementAsync("use test2"); - - // or use full qualified table name - await tableSessionPool.ExecuteNonQueryStatementAsync( - "create table test1.table1(region_id STRING TAG, plant_id STRING TAG, device_id STRING TAG, model STRING ATTRIBUTE, temperature FLOAT FIELD, humidity DOUBLE FIELD) with (TTL=3600000)"); - - await tableSessionPool.ExecuteNonQueryStatementAsync( - "create table table2(region_id STRING TAG, plant_id STRING TAG, color STRING ATTRIBUTE, temperature FLOAT FIELD, speed DOUBLE FIELD) with (TTL=6600000)"); - - // show tables from current database - var res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - // show tables by specifying another database - // using SHOW tables FROM - res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES FROM test1"); - 
res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - var tableName = "testTable1"; - List columnNames = - new List { - "region_id", - "plant_id", - "device_id", - "model", - "temperature", - "humidity" }; - List dataTypes = - new List{ - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.FLOAT, - TSDataType.DOUBLE}; - List columnCategories = - new List{ - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.ATTRIBUTE, - ColumnCategory.FIELD, - ColumnCategory.FIELD}; - var values = new List> { }; - var timestamps = new List { }; - for (long timestamp = 0; timestamp < 100; timestamp++) - { - timestamps.Add(timestamp); - values.Add(new List { "1", "5", "3", "A", 1.23F + timestamp, 111.1 + timestamp }); - } - var tablet = new Tablet(tableName, columnNames, columnCategories, dataTypes, values, timestamps); - - await tableSessionPool.InsertAsync(tablet); - - - res = await tableSessionPool.ExecuteQueryStatementAsync("select * from testTable1 " - + "where region_id = '1' and plant_id in ('3', '5') and device_id = '3'"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - await tableSessionPool.Close(); - } - - - public async Task TestUseDatabase() - { - var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(sessionPoolTest.nodeUrls) - .SetUsername(sessionPoolTest.username) - .SetPassword(sessionPoolTest.password) - .SetDatabase("test1") - .SetFetchSize(1024) - .Build(); - - await tableSessionPool.Open(false); - - if (sessionPoolTest.debug) tableSessionPool.OpenDebugMode(); - - - // show tables from current database - var res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - await tableSessionPool.ExecuteNonQueryStatementAsync("use test2"); - - // show tables from 
current database - res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - await tableSessionPool.Close(); - } - - public async Task TestCleanup() - { - var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(sessionPoolTest.nodeUrls) - .SetUsername(sessionPoolTest.username) - .SetPassword(sessionPoolTest.password) - .SetFetchSize(1024) - .Build(); - - await tableSessionPool.Open(false); - - if (sessionPoolTest.debug) tableSessionPool.OpenDebugMode(); - - await tableSessionPool.ExecuteNonQueryStatementAsync("drop database test1"); - await tableSessionPool.ExecuteNonQueryStatementAsync("drop database test2"); - - await tableSessionPool.Close(); - } -} -``` diff --git a/src/UserGuide/latest-Table/API/Programming-Cpp-Native-API_timecho.md b/src/UserGuide/latest-Table/API/Programming-Cpp-Native-API_timecho.md deleted file mode 100644 index 9be6f1da9..000000000 --- a/src/UserGuide/latest-Table/API/Programming-Cpp-Native-API_timecho.md +++ /dev/null @@ -1,454 +0,0 @@ - - -# C++ Native API - -## 1. Dependencies - -- Java 8+ -- Flex -- Bison 2.7+ -- Boost 1.56+ -- OpenSSL 1.0+ -- GCC 5.5.0+ - -## 2. Installation - -### 2.1 Install Required Dependencies - -- **MAC** - 1. Install Bison: - - Use the following brew command to install the Bison version: - ```shell - brew install bison - ``` - - 2. Install Boost: Make sure to install the latest version of Boost. - - ```shell - brew install boost - ``` - - 3. Check OpenSSL: Make sure the OpenSSL library is installed. The default OpenSSL header file path is "/usr/local/opt/openssl/include". - - If you encounter errors related to OpenSSL not being found during compilation, try adding `-Dopenssl.include.dir=""`. 
- -- **Ubuntu 16.04+ or Other Debian-based Systems** - - Use the following commands to install dependencies: - - ```shell - sudo apt-get update - sudo apt-get install gcc g++ bison flex libboost-all-dev libssl-dev - ``` - -- **CentOS 7.7+/Fedora/Rocky Linux or Other Red Hat-based Systems** - - Use the yum command to install dependencies: - - ```shell - sudo yum update - sudo yum install gcc gcc-c++ boost-devel bison flex openssl-devel - ``` - -- **Windows** - - 1. Set Up the Build Environment - - Install MS Visual Studio (version 2019+ recommended): Make sure to select Visual Studio C/C++ IDE and compiler (supporting CMake, Clang, MinGW) during installation. - - Download and install [CMake](https://cmake.org/download/). - - 2. Download and Install Flex, Bison - - Download [Win_Flex_Bison](https://sourceforge.net/projects/winflexbison/). - - After downloading, rename the executables to flex.exe and bison.exe to ensure they can be found during compilation, and add the directory of these executables to the PATH environment variable. - - 3. Install Boost Library - - Download [Boost](https://www.boost.org/users/download/). - - Compile Boost locally: Run `bootstrap.bat` and `b2.exe` in sequence. - - Add the Boost installation directory to the PATH environment variable, e.g., `C:\Program Files (x86)\boost_1_78_0`. - - 4. Install OpenSSL - - Download and install [OpenSSL](http://slproweb.com/products/Win32OpenSSL.html). - - Add the include directory under the installation directory to the PATH environment variable. - -### 2.2 Compilation - -Clone the source code from git: -```shell -git clone https://github.com/apache/iotdb.git -``` - -The default main branch is the master branch. If you want to use a specific release version, switch to that branch (e.g., version 2.0.6): -```shell -git checkout rc/2.0.6 -``` -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. 
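On Linux, the Maven commands that follow differ only by glibc version. As a rough sketch (not part of the IoTDB build tooling), the helper below maps a glibc version string to the extra `-Diotdb-tools-thrift.version` flag. The function name `thrift_flag_for_glibc` is hypothetical, and the thresholds are copied from the list in this section rather than verified independently.

```shell
# Hypothetical helper: prints the extra Maven flag for a given glibc version.
# Thresholds mirror the build list in this section.
thrift_flag_for_glibc() {
  major=${1%%.*}          # "2.31" -> "2"
  rest=${1#*.}            # "2.31" -> "31"
  minor=${rest%%.*}       # drop any patch component
  if [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 32 ]; }; then
    echo ""               # glibc >= 2.32: no extra flag needed
  elif [ "$major" -eq 2 ] && [ "$minor" -ge 31 ]; then
    echo "-Diotdb-tools-thrift.version=0.14.1.1-old-glibc-SNAPSHOT"
  elif [ "$major" -eq 2 ] && [ "$minor" -ge 17 ]; then
    echo "-Diotdb-tools-thrift.version=0.14.1.1-glibc223-SNAPSHOT"
  else
    echo "glibc $1 is below the documented minimum (2.17)" >&2
    return 1
  fi
}

# On glibc-based Linux, `getconf GNU_LIBC_VERSION` prints e.g. "glibc 2.31".
# The guard skips detection on systems without glibc (macOS, musl).
if libc=$(getconf GNU_LIBC_VERSION 2>/dev/null); then
  echo "detected ${libc}; extra flag: $(thrift_flag_for_glibc "${libc#glibc }")"
fi
```

The printed flag, if any, is meant to be appended to the `./mvnw clean package ... -P with-cpp` command for the matching platform.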
- -Run Maven to compile in the IoTDB root directory: - -- Mac or Linux with glibc version >= 2.32 - ```shell - ./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp - ``` - -- Linux with glibc version >= 2.31 - ```shell - ./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Diotdb-tools-thrift.version=0.14.1.1-old-glibc-SNAPSHOT - ``` - -- Linux with glibc version >= 2.17 - ```shell - ./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Diotdb-tools-thrift.version=0.14.1.1-glibc223-SNAPSHOT - ``` - -- Windows using Visual Studio 2022 - ```batch - .\mvnw.cmd clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp - ``` - -- Windows using Visual Studio 2019 - ```batch - .\mvnw.cmd clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Dcmake.generator="Visual Studio 16 2019" -Diotdb-tools-thrift.version=0.14.1.1-msvc142-SNAPSHOT - ``` - - If you haven't added the Boost library path to the PATH environment variable, you need to add the relevant parameters to the compile command, e.g., `-DboostIncludeDir="C:\Program Files (x86)\boost_1_78_0" -DboostLibraryDir="C:\Program Files (x86)\boost_1_78_0\stage\lib"`. - -After successful compilation, the packaged library files will be located in `iotdb-client/client-cpp/target`, and you can find the compiled example program under `example/client-cpp-example/target`. - -### 2.3 Compilation Q&A - -Q: What are the requirements for the environment on Linux? - -A: -- The known minimum version requirement for glibc (x86_64 version) is 2.17, and the minimum version for GCC is 5.5. -- The known minimum version requirement for glibc (ARM version) is 2.31, and the minimum version for GCC is 10.2. -- If the above requirements are not met, you can try compiling Thrift locally: - - Download the code from https://github.com/apache/iotdb-bin-resources/tree/iotdb-tools-thrift-v0.14.1.0/iotdb-tools-thrift. 
- - Run `./mvnw clean install`. - - Go back to the IoTDB code directory and run `./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp`. - -Q: How do I resolve the `undefined reference to '__libc_single_threaded'` error during Linux compilation? - -A: -- This issue occurs because the precompiled Thrift dependencies require a newer version of glibc than the one installed. -- You can try adding `-Diotdb-tools-thrift.version=0.14.1.1-glibc223-SNAPSHOT` or `-Diotdb-tools-thrift.version=0.14.1.1-old-glibc-SNAPSHOT` to the Maven compile command. - -Q: What if I need to compile using Visual Studio 2017 or earlier on Windows? - -A: -- You can try compiling Thrift locally before compiling the client: - - Download the code from https://github.com/apache/iotdb-bin-resources/tree/iotdb-tools-thrift-v0.14.1.0/iotdb-tools-thrift. - - Run `.\mvnw.cmd clean install`. - - Go back to the IoTDB code directory and run `.\mvnw.cmd clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Dcmake.generator="Visual Studio 15 2017"`. - -## 3. Usage - -### 3.1 TableSession Class - -All operations in the C++ client are performed through the TableSession class. The methods defined in the TableSession interface are described below. - -#### 3.1.1 Method List - -1. `insert(Tablet& tablet, bool sorted = false)`: Inserts a Tablet object containing time series data into the database. The sorted parameter indicates whether the rows in the tablet are already sorted by time. -2. `executeNonQueryStatement(const std::string& sql)`: Executes non-query SQL statements, such as DDL (Data Definition Language) or DML (Data Manipulation Language) commands. -3. `executeQueryStatement(const std::string& sql)`: Executes query SQL statements and returns a SessionDataSet object containing the query results. An overload accepts an additional timeoutInMs parameter that sets the query timeout in milliseconds. 
- * Note: When retrieving rows of query results by calling `SessionDataSet::next()`, you must store the returned `std::shared_ptr` object in a local variable (e.g., `auto row = dataSet->next();`) so that the row data stays alive while you use it. If you access it directly via `.get()` or a raw pointer (e.g., `dataSet->next().get()`), the temporary smart pointer is destroyed at the end of the expression, its reference count drops to zero, the data is released immediately, and any subsequent access is undefined behavior. -4. `open(bool enableRPCCompression = false)`: Opens the connection and determines whether to enable RPC compression (the client setting must match the server setting; disabled by default). -5. `close()`: Closes the connection. - -#### 3.1.2 Interface Display - -```cpp -class TableSession { -private: - Session* session; -public: - TableSession(Session* session) { - this->session = session; - } - void insert(Tablet& tablet, bool sorted = false); - void executeNonQueryStatement(const std::string& sql); - unique_ptr<SessionDataSet> executeQueryStatement(const std::string& sql); - unique_ptr<SessionDataSet> executeQueryStatement(const std::string& sql, int64_t timeoutInMs); - string getDatabase(); // Get the currently selected database; can be replaced by executeNonQueryStatement - void open(bool enableRPCCompression = false); - void close(); -}; -``` - -### 3.2 TableSessionBuilder Class - -The TableSessionBuilder class is a builder used to configure and create instances of the TableSession class. It lets you set connection parameters, query parameters, and other settings when creating an instance. 
- -#### 3.2.1 Usage Example - -```cpp -// Set connection IP, port, username, password -// The setters may be called in any order as long as build() is called last; the created instance is connected by default through open() -session = (new TableSessionBuilder()) - ->host("127.0.0.1") - ->rpcPort(6667) - ->username("root") - ->password("TimechoDB@2021") // "root" before V2.0.6 - ->build(); -``` - -#### 3.2.2 Configurable Parameter List - -| **Parameter Name** | **Description** | **Default Value** | -| :---: | :---: | :---: | -| host | Set the connected node IP | "127.0.0.1" ("localhost") | -| rpcPort | Set the connected node port | 6667 | -| username | Set the connection username | "root" | -| password | Set the connection password | "TimechoDB@2021" ("root" before V2.0.6) | -| zoneId | Set the ZoneId related to timezone | "" | -| fetchSize | Set the query result fetch size | 10000 | -| database | Set the target database name | "" | - -## 4. Examples - -Sample code that uses these interfaces is available at: - -- `example/client-cpp-example/src/TableModelSessionExample.cpp`: [TableModelSessionExample](https://github.com/apache/iotdb/blob/master/example/client-cpp-example/src/TableModelSessionExample.cpp) - -If compilation finishes successfully, the example project is placed under `example/client-cpp-example/target`. - - -```cpp -#include "TableSession.h" -#include "TableSessionBuilder.h" - -using namespace std; - -shared_ptr<TableSession> session; - -void insertRelationalTablet() { - - vector<pair<string, TSDataType::TSDataType>> schemaList { - make_pair("region_id", TSDataType::TEXT), - make_pair("plant_id", TSDataType::TEXT), - make_pair("device_id", TSDataType::TEXT), - make_pair("model", TSDataType::TEXT), - make_pair("temperature", TSDataType::FLOAT), - make_pair("humidity", TSDataType::DOUBLE) - }; - - vector<ColumnCategory> columnTypes = { - ColumnCategory::TAG, - ColumnCategory::TAG, - ColumnCategory::TAG, - ColumnCategory::ATTRIBUTE, - ColumnCategory::FIELD, - ColumnCategory::FIELD 
- }; - - Tablet tablet("table1", schemaList, columnTypes, 100); - - for (int row = 0; row < 100; row++) { - int rowIndex = tablet.rowSize++; - tablet.timestamps[rowIndex] = row; - - // Using index-based API is more efficient than column name lookup - // Prefer: tablet.addValue(0, rowIndex, "1"); - // Avoid: tablet.addValue("region_id", rowIndex, "1"); - tablet.addValue(0, rowIndex, "1"); // region_id - tablet.addValue(1, rowIndex, "5"); // plant_id - tablet.addValue(2, rowIndex, "3"); // device_id - tablet.addValue(3, rowIndex, "A"); // model - tablet.addValue(4, rowIndex, 37.6F); // temperature - tablet.addValue(5, rowIndex, 111.1); // humidity - if (tablet.rowSize == tablet.maxRowNumber) { - session->insert(tablet); - tablet.reset(); - } - } - - if (tablet.rowSize != 0) { - session->insert(tablet); - tablet.reset(); - } -} - -void Output(unique_ptr<SessionDataSet> &dataSet) { - for (const string &name: dataSet->getColumnNames()) { - cout << name << " "; - } - cout << endl; - while (dataSet->hasNext()) { - cout << dataSet->next()->toString(); - } - cout << endl; -} - -void OutputWithType(unique_ptr<SessionDataSet> &dataSet) { - for (const string &name: dataSet->getColumnNames()) { - cout << name << " "; - } - cout << endl; - for (const string &type: dataSet->getColumnTypeList()) { - cout << type << " "; - } - cout << endl; - while (dataSet->hasNext()) { - cout << dataSet->next()->toString(); - } - cout << endl; -} - -int main() { - try { - session = (new TableSessionBuilder()) - ->host("127.0.0.1") - ->rpcPort(6667) - ->username("root") - ->password("root") - ->build(); - - cout << "[Create Database db1,db2]\n" << endl; - try { - session->executeNonQueryStatement("CREATE DATABASE IF NOT EXISTS db1"); - session->executeNonQueryStatement("CREATE DATABASE IF NOT EXISTS db2"); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Use db1 as database]\n" << endl; - try { - session->executeNonQueryStatement("USE db1"); - } catch (IoTDBException &e) { - cout << e.what() << endl; 
- } - - cout << "[Create Table table1,table2]\n" << endl; - try { - session->executeNonQueryStatement("create table db1.table1(region_id STRING TAG, plant_id STRING TAG, device_id STRING TAG, model STRING ATTRIBUTE, temperature FLOAT FIELD, humidity DOUBLE FIELD) with (TTL=3600000)"); - session->executeNonQueryStatement("create table db2.table2(region_id STRING TAG, plant_id STRING TAG, color STRING ATTRIBUTE, temperature FLOAT FIELD, speed DOUBLE FIELD) with (TTL=6600000)"); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Show Tables]\n" << endl; - try { - unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SHOW TABLES"); - Output(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Show tables from specific database]\n" << endl; - try { - unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SHOW TABLES FROM db1"); - Output(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[InsertTablet]\n" << endl; - try { - insertRelationalTablet(); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Query Table Data]\n" << endl; - try { - unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SELECT * FROM table1" - " where region_id = '1' and plant_id in ('3', '5') and device_id = '3'"); - OutputWithType(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - session->close(); - - // specify the database through the builder - session = (new TableSessionBuilder()) - ->host("127.0.0.1") - ->rpcPort(6667) - ->username("root") - ->password("root") - ->database("db1") - ->build(); - - cout << "[Show tables from current database(db1)]\n" << endl; - try { - unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SHOW TABLES"); - Output(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Change database to db2]\n" << endl; - try { - session->executeNonQueryStatement("USE db2"); - } catch (IoTDBException &e) { - 
cout << e.what() << endl; - } - - cout << "[Show tables from current database(db2)]\n" << endl; - try { - unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SHOW TABLES"); - Output(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Drop Database db1,db2]\n" << endl; - try { - session->executeNonQueryStatement("DROP DATABASE db1"); - session->executeNonQueryStatement("DROP DATABASE db2"); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "session close\n" << endl; - session->close(); - - cout << "finished!\n" << endl; - } catch (IoTDBConnectionException &e) { - cout << e.what() << endl; - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - return 0; -} -``` - - -## 5. FAQ - -### 5.1 On Mac - -If errors occur when compiling the Thrift source code, try downgrading the Xcode Command Line Tools from 12 to 11.5. - -See https://stackoverflow.com/questions/63592445/ld-unsupported-tapi-file-type-tapi-tbd-in-yaml-file/65518087#65518087 - - -### 5.2 On Windows - -When building Thrift and downloading packages via "wget", you may encounter an error -message like the following: -```shell -Failed to delete cached file C:\Users\Administrator\.m2\repository\.cache\download-maven-plugin\index.ser -``` -Possible fixes: -- Delete the `.m2\repository\.cache\` directory and try again. -- Add the `<skipCache>true</skipCache>` configuration to the download-maven-plugin Maven phase that reports this error. - diff --git a/src/UserGuide/latest-Table/API/Programming-Go-Native-API_timecho.md b/src/UserGuide/latest-Table/API/Programming-Go-Native-API_timecho.md deleted file mode 100644 index 41ab86372..000000000 --- a/src/UserGuide/latest-Table/API/Programming-Go-Native-API_timecho.md +++ /dev/null @@ -1,576 +0,0 @@ - - -# Go Native API - -The Git repository for the Go Native API client is located [here](https://github.com/apache/iotdb-client-go/). - -## 1. 
Usage -### 1.1 Dependencies - -* golang >= 1.13 -* make >= 3.0 -* curl >= 7.1.1 -* thrift 0.15.0 -* Linux, macOS, or other Unix-like systems -* Windows + bash (WSL, Cygwin, Git Bash) - -### 1.2 Installation - -* go mod - -```sh -export GO111MODULE=on -export GOPROXY=https://goproxy.io - -mkdir session_example && cd session_example - -curl -o session_example.go -L https://github.com/apache/iotdb-client-go/raw/main/example/session_example.go - -go mod init session_example -go run session_example.go -``` - -* GOPATH - -```sh -# get thrift 0.15.0 -go get github.com/apache/thrift -cd $GOPATH/src/github.com/apache/thrift -git checkout 0.15.0 - -mkdir -p $GOPATH/src/iotdb-client-go-example/session_example -cd $GOPATH/src/iotdb-client-go-example/session_example -curl -o session_example.go -L https://github.com/apache/iotdb-client-go/raw/main/example/session_example.go -go run session_example.go -``` -* Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. - -## 2. ITableSession Interface -### 2.1 Description - -Defines core operations for interacting with IoTDB tables, including data insertion, query execution, and session closure. Not thread-safe. - -### 2.2 Method List - -| **Method Name** | **Description** | **Parameters** | **Return Value** | **Return Error** | -| --- | --- | --- | --- | --- | -| `Insert(tablet *Tablet)` | Inserts a `Tablet` containing time-series data into the database. | `tablet`: A pointer to a Tablet containing time-series data to be inserted. | A pointer to TSStatus indicating the execution result. 
| An error if an issue occurs during the operation, such as a connection error or execution failure. | -| `ExecuteNonQueryStatement(sql string)` | Executes non-query SQL statements such as DDL or DML commands. | `sql`: The SQL statement to execute. | A pointer to TSStatus indicating the execution result. | An error if an issue occurs during the operation, such as a connection error or execution failure. | -| `ExecuteQueryStatement(sql string, timeoutInMs *int64)` | Executes a query SQL statement with a specified timeout in milliseconds. | `sql`: The SQL query statement. `timeoutInMs`: Query timeout in milliseconds. | A pointer to SessionDataSet containing the query results. | An error if an issue occurs during the operation, such as a connection error or execution failure. | -| `Close()` | Closes the session and releases resources. | None | None | An error if there is an issue with closing the IoTDB connection. | - -### 2.3 Interface Definition -1. ITableSession - -```go -// ITableSession defines an interface for interacting with IoTDB tables. -// It supports operations such as data insertion, executing queries, and closing the session. -// Implementations of this interface are expected to manage connections and ensure -// proper resource cleanup. -// -// Each method may return an error to indicate issues such as connection errors -// or execution failures. -// -// Since this interface includes a Close method, it is recommended to use -// defer to ensure the session is properly closed. -type ITableSession interface { - - // Insert inserts a Tablet into the database. - // - // Parameters: - // - tablet: A pointer to a Tablet containing time-series data to be inserted. - // - // Returns: - // - r: A pointer to TSStatus indicating the execution result. - // - err: An error if an issue occurs during the operation, such as a connection error or execution failure. 
- Insert(tablet *Tablet) (r *common.TSStatus, err error) - - // ExecuteNonQueryStatement executes a non-query SQL statement, such as a DDL or DML command. - // - // Parameters: - // - sql: The SQL statement to execute. - // - // Returns: - // - r: A pointer to TSStatus indicating the execution result. - // - err: An error if an issue occurs during the operation, such as a connection error or execution failure. - ExecuteNonQueryStatement(sql string) (r *common.TSStatus, err error) - - // ExecuteQueryStatement executes a query SQL statement and returns the result set. - // - // Parameters: - // - sql: The SQL query statement to execute. - // - timeoutInMs: A pointer to the timeout duration in milliseconds for the query execution. - // - // Returns: - // - result: A pointer to SessionDataSet containing the query results. - // - err: An error if an issue occurs during the operation, such as a connection error or execution failure. - ExecuteQueryStatement(sql string, timeoutInMs *int64) (*SessionDataSet, error) - - // Close closes the session, releasing any held resources. - // - // Returns: - // - err: An error if there is an issue with closing the IoTDB connection. - Close() (err error) -} -``` - -2. Constructing a TableSession - -* There’s no need to manually set the `sqlDialect` field in the `Config`structs. This is automatically handled by the corresponding `NewSession` function during initialization. Simply use the appropriate constructor based on your use case (single-node or cluster). 
- -```Go -type Config struct { - Host string - Port string - UserName string - Password string - FetchSize int32 - TimeZone string - ConnectRetryMax int - sqlDialect string - Version Version - Database string -} - -type ClusterConfig struct { - NodeUrls []string //ip:port - UserName string - Password string - FetchSize int32 - TimeZone string - ConnectRetryMax int - sqlDialect string - Database string -} - -// NewTableSession creates a new TableSession instance using the provided configuration. -// -// Parameters: -// - config: The configuration for the session. -// - enableRPCCompression: A boolean indicating whether RPC compression is enabled. -// - connectionTimeoutInMs: The timeout in milliseconds for establishing a connection. -// -// Returns: -// - An ITableSession instance if the session is successfully created. -// - An error if there is an issue during session initialization. -func NewTableSession(config *Config, enableRPCCompression bool, connectionTimeoutInMs int) (ITableSession, error) - -// NewClusterTableSession creates a new TableSession instance for a cluster setup. -// -// Parameters: -// - clusterConfig: The configuration for the cluster session. -// - enableRPCCompression: A boolean indicating whether RPC compression is enabled. -// -// Returns: -// - An ITableSession instance if the session is successfully created. -// - An error if there is an issue during session initialization. -func NewClusterTableSession(clusterConfig *ClusterConfig, enableRPCCompression bool) (ITableSession, error) -``` - -> Note: -> -> When creating a `TableSession` via `NewTableSession` or `NewClusterTableSession`, the connection is already established; no additional `Open` operation is required. 
- -### 2.4 Example - -```go -package main - -import ( - "flag" - "log" - "math/rand" - "strconv" - "time" - - "github.com/apache/iotdb-client-go/v2/client" -) - -func main() { - flag.Parse() - config := &client.Config{ - Host: "127.0.0.1", - Port: "6667", - UserName: "root", - Password: "root", - Database: "test_session", - } - session, err := client.NewTableSession(config, false, 0) - if err != nil { - log.Fatal(err) - } - defer session.Close() - - checkError(session.ExecuteNonQueryStatement("create database test_db")) - checkError(session.ExecuteNonQueryStatement("use test_db")) - checkError(session.ExecuteNonQueryStatement("create table t1 (tag1 string tag, tag2 string tag, s1 text field, s2 text field)")) - insertRelationalTablet(session) - showTables(session) - query(session) -} - -func getTextValueFromDataSet(dataSet *client.SessionDataSet, columnName string) string { - if isNull, err := dataSet.IsNull(columnName); err != nil { - log.Fatal(err) - } else if isNull { - return "null" - } - v, err := dataSet.GetString(columnName) - if err != nil { - log.Fatal(err) - } - return v -} - -func insertRelationalTablet(session client.ITableSession) { - tablet, err := client.NewRelationalTablet("t1", []*client.MeasurementSchema{ - { - Measurement: "tag1", - DataType: client.STRING, - }, - { - Measurement: "tag2", - DataType: client.STRING, - }, - { - Measurement: "s1", - DataType: client.TEXT, - }, - { - Measurement: "s2", - DataType: client.TEXT, - }, - }, []client.ColumnCategory{client.TAG, client.TAG, client.FIELD, client.FIELD}, 1024) - if err != nil { - log.Fatalf("Failed to create relational tablet: %v", err) - } - ts := time.Now().UTC().UnixNano() / 1000000 - for row := 0; row < 16; row++ { - ts++ - tablet.SetTimestamp(ts, row) - tablet.SetValueAt("tag1_value_"+strconv.Itoa(row), 0, row) - tablet.SetValueAt("tag2_value_"+strconv.Itoa(row), 1, row) - tablet.SetValueAt("s1_value_"+strconv.Itoa(row), 2, row) - tablet.SetValueAt("s2_value_"+strconv.Itoa(row), 3, row) - 
tablet.RowSize++ - } - checkError(session.Insert(tablet)) - - tablet.Reset() - - for row := 0; row < 16; row++ { - ts++ - tablet.SetTimestamp(ts, row) - tablet.SetValueAt("tag1_value_1", 0, row) - tablet.SetValueAt("tag2_value_1", 1, row) - tablet.SetValueAt("s1_value_"+strconv.Itoa(row), 2, row) - tablet.SetValueAt("s2_value_"+strconv.Itoa(row), 3, row) - - nullValueColumn := rand.Intn(4) - tablet.SetValueAt(nil, nullValueColumn, row) - tablet.RowSize++ - } - checkError(session.Insert(tablet)) -} - -func showTables(session client.ITableSession) { - timeout := int64(2000) - dataSet, err := session.ExecuteQueryStatement("show tables", &timeout) - if err != nil { - log.Fatal(err) - } - // Defer Close only after the error check, when dataSet is known to be valid. - defer dataSet.Close() - for { - hasNext, err := dataSet.Next() - if err != nil { - log.Fatal(err) - } - if !hasNext { - break - } - value, err := dataSet.GetString("TableName") - if err != nil { - log.Fatal(err) - } - log.Printf("tableName is %v", value) - } -} - -func query(session client.ITableSession) { - timeout := int64(2000) - dataSet, err := session.ExecuteQueryStatement("select * from t1", &timeout) - if err != nil { - log.Fatal(err) - } - defer dataSet.Close() - for { - hasNext, err := dataSet.Next() - if err != nil { - log.Fatal(err) - } - if !hasNext { - break - } - log.Printf("%v %v %v %v", getTextValueFromDataSet(dataSet, "tag1"), getTextValueFromDataSet(dataSet, "tag2"), getTextValueFromDataSet(dataSet, "s1"), getTextValueFromDataSet(dataSet, "s2")) - } -} - -// checkError accepts both return values of Insert/ExecuteNonQueryStatement; -// the status is ignored in this example and only the error is checked. -func checkError(_ interface{}, err error) { - if err != nil { - log.Fatal(err) - } -} -``` - -## 3. TableSessionPool Interface -### 3.1 Description - -Manages a pool of `ITableSession` instances for efficient connection reuse and resource cleanup. 
- -### 3.2 Method List - -| **Method Name** | **Description** | **Return Value** | **Return Error** | -| --- | --- | --- | --- | -| `GetSession()` | Acquires a session from the pool for database interaction. | A usable ITableSession instance for interacting with IoTDB. | An error if a session cannot be acquired. | -| `Close()` | Closes the session pool and releases resources. | None | None | - -### 3.3 Interface Definition -1. TableSessionPool - -```Go -// TableSessionPool manages a pool of ITableSession instances, enabling efficient -// reuse and management of resources. It provides methods to acquire a session -// from the pool and to close the pool, releasing all held resources. -// -// This implementation ensures proper lifecycle management of sessions, -// including efficient reuse and cleanup of resources. - -// GetSession acquires an ITableSession instance from the pool. -// -// Returns: -// - A usable ITableSession instance for interacting with IoTDB. -// - An error if a session cannot be acquired. -func (spool *TableSessionPool) GetSession() (ITableSession, error) { - return spool.sessionPool.getTableSession() -} - -// Close closes the TableSessionPool, releasing all held resources. -// Once closed, no further sessions can be acquired from the pool. -func (spool *TableSessionPool) Close() -``` - -2. Constructing a TableSessionPool - -```Go -type PoolConfig struct { - Host string - Port string - NodeUrls []string - UserName string - Password string - FetchSize int32 - TimeZone string - ConnectRetryMax int - Database string - sqlDialect string -} - -// NewTableSessionPool creates a new TableSessionPool with the specified configuration. -// -// Parameters: -// - conf: PoolConfig defining the configuration for the pool. -// - maxSize: The maximum number of sessions the pool can hold. 
-// - connectionTimeoutInMs: Timeout for establishing a connection in milliseconds. -// - waitToGetSessionTimeoutInMs: Timeout for waiting to acquire a session in milliseconds. -// - enableCompression: A boolean indicating whether to enable compression. -// -// Returns: -// - A TableSessionPool instance. -func NewTableSessionPool(conf *PoolConfig, maxSize, connectionTimeoutInMs, waitToGetSessionTimeoutInMs int, - enableCompression bool) TableSessionPool -``` - -> Note: -> -> * If a `Database` is specified when creating the `TableSessionPool`, all sessions acquired from the pool will automatically use this database. There is no need to explicitly set the database during operations. -> * Automatic State Reset: If a session temporarily switches to another database using `USE DATABASE` during usage, the session will automatically revert to the original database specified in the pool when closed and returned to the pool. - -### 3.4 Example - -```go -package main - -import ( - "log" - "strconv" - "sync" - "sync/atomic" - "time" - - "github.com/apache/iotdb-client-go/v2/client" -) - -func main() { - sessionPoolWithSpecificDatabaseExample() - sessionPoolWithoutSpecificDatabaseExample() - putBackToSessionPoolExample() -} - -func putBackToSessionPoolExample() { - // should create database test_db before executing - config := &client.PoolConfig{ - Host: "127.0.0.1", - Port: "6667", - UserName: "root", - Password: "root", - Database: "test_db", - } - sessionPool := client.NewTableSessionPool(config, 3, 60000, 4000, false) - defer sessionPool.Close() - - num := 4 - successGetSessionNum := int32(0) - var wg sync.WaitGroup - wg.Add(num) - for i := 0; i < num; i++ { - dbName := "db" + strconv.Itoa(i) - go func() { - defer wg.Done() - session, err := sessionPool.GetSession() - if err != nil { - log.Println("Failed to create database "+dbName+"because ", err) - return - } - atomic.AddInt32(&successGetSessionNum, 1) - defer func() { - time.Sleep(6 * time.Second) - // put back to 
session pool - session.Close() - }() - checkError(session.ExecuteNonQueryStatement("create database " + dbName)) - checkError(session.ExecuteNonQueryStatement("use " + dbName)) - checkError(session.ExecuteNonQueryStatement("create table table_of_" + dbName + " (tag1 string tag, tag2 string tag, s1 text field, s2 text field)")) - }() - } - wg.Wait() - log.Println("success num is", successGetSessionNum) - - log.Println("All sessions' databases have been reset.") - // the database in use is automatically reset to the session pool's database after the session is closed - wg.Add(5) - for i := 0; i < 5; i++ { - go func() { - defer wg.Done() - session, err := sessionPool.GetSession() - if err != nil { - log.Println("Failed to get session because ", err) - return - } - defer session.Close() - timeout := int64(3000) - dataSet, err := session.ExecuteQueryStatement("show tables", &timeout) - if err != nil { - log.Fatal(err) - } - for { - hasNext, err := dataSet.Next() - if err != nil { - log.Fatal(err) - } - if !hasNext { - break - } - value, err := dataSet.GetString("TableName") - if err != nil { - log.Fatal(err) - } - log.Println("table is", value) - } - dataSet.Close() - }() - } - wg.Wait() -} - -func sessionPoolWithSpecificDatabaseExample() { - // should create database test_db before executing - config := &client.PoolConfig{ - Host: "127.0.0.1", - Port: "6667", - UserName: "root", - Password: "root", - Database: "test_db", - } - sessionPool := client.NewTableSessionPool(config, 3, 60000, 8000, false) - defer sessionPool.Close() - num := 10 - var wg sync.WaitGroup - wg.Add(num) - for i := 0; i < num; i++ { - tableName := "t" + strconv.Itoa(i) - go func() { - defer wg.Done() - session, err := sessionPool.GetSession() - if err != nil { - log.Println("Failed to create table "+tableName+" because ", err) - return - } - defer session.Close() - checkError(session.ExecuteNonQueryStatement("create table " + tableName + " (tag1 string tag, tag2 string tag, s1 text field, s2 text field)")) - }() - } - wg.Wait() -} - -func 
sessionPoolWithoutSpecificDatabaseExample() { - config := &client.PoolConfig{ - Host: "127.0.0.1", - Port: "6667", - UserName: "root", - Password: "root", - } - sessionPool := client.NewTableSessionPool(config, 3, 60000, 8000, false) - defer sessionPool.Close() - num := 10 - var wg sync.WaitGroup - wg.Add(num) - for i := 0; i < num; i++ { - dbName := "db" + strconv.Itoa(i) - go func() { - defer wg.Done() - session, err := sessionPool.GetSession() - if err != nil { - log.Println("Failed to create database ", dbName, err) - return - } - defer session.Close() - checkError(session.ExecuteNonQueryStatement("create database " + dbName)) - checkError(session.ExecuteNonQueryStatement("use " + dbName)) - checkError(session.ExecuteNonQueryStatement("create table t1 (tag1 string tag, tag2 string tag, s1 text field, s2 text field)")) - }() - } - wg.Wait() -} - -// checkError accepts both return values of ExecuteNonQueryStatement; -// the status is ignored in this example and only the error is checked. -func checkError(_ interface{}, err error) { - if err != nil { - log.Fatal(err) - } -} -``` - diff --git a/src/UserGuide/latest-Table/API/Programming-JDBC_timecho.md b/src/UserGuide/latest-Table/API/Programming-JDBC_timecho.md deleted file mode 100644 index 6b85ccab2..000000000 --- a/src/UserGuide/latest-Table/API/Programming-JDBC_timecho.md +++ /dev/null @@ -1,189 +0,0 @@ - -# JDBC - -The IoTDB JDBC driver provides a standardized way to interact with the IoTDB database, allowing users to execute SQL statements from Java programs for managing databases and time-series data. It supports operations such as connecting to the database, creating, querying, updating, and deleting data, as well as batch insertion and querying of time-series data. - -**Note:** The current JDBC implementation is designed primarily for integration with third-party tools. High-performance writing **may not be achieved** when using JDBC for insert operations. For Java applications, it is recommended to use the **Java Native API** for optimal performance. - -## 1. 
Prerequisites - -### 1.1 **Environment Requirements** - -- **JDK:** Version 1.8 or higher -- **Maven:** Version 3.6 or higher - -### 1.2 **Adding Maven Dependencies** - -Add the following dependency to your Maven `pom.xml` file: - -```XML -<dependencies> - <dependency> - <groupId>com.timecho.iotdb</groupId> - <artifactId>iotdb-session</artifactId> - <version>2.0.1.1</version> - </dependency> -</dependencies> -``` -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. - -## 2. Read and Write Operations - -**Write Operations:** Perform database operations such as inserting data, creating databases, and creating time-series using the `execute` method. - -**Read Operations:** Execute queries using the `executeQuery` method and retrieve results via the `ResultSet` object. - -### 2.1 Method Overview - -| **Method Name** | **Description** | **Parameters** | **Return Value** | -| --- | --- | --- | --- | -| Class.forName(String driver) | Loads the JDBC driver class | `driver`: Name of the JDBC driver class | `Class`: Loaded class object | -| DriverManager.getConnection(String url, String username, String password) | Establishes a database connection | `url`: Database URL; `username`: Username; `password`: Password | `Connection`: Database connection object | -| Connection.createStatement() | Creates a `Statement` object for executing SQL statements | None | `Statement`: SQL execution object | -| Statement.execute(String sql) | Executes a non-query SQL statement | `sql`: SQL statement to execute | `boolean`: Indicates if a `ResultSet` is returned | -| Statement.executeQuery(String sql) | Executes a query SQL statement and retrieves the result set | `sql`: SQL query statement | `ResultSet`: Query result set | -| ResultSet.getMetaData() | Retrieves metadata of the result set | None | `ResultSetMetaData`: Metadata 
object | -| ResultSet.next() | Moves to the next row in the result set | None | `boolean`: Whether the move was successful | -| ResultSet.getString(int columnIndex) | Retrieves the string value of a specified column | `columnIndex`: Column index (starting from 1) | `String`: Column value | - -## 3. Sample Code - -**Note:** When using the Table Mode, you must specify the `sql_dialect` parameter as `table` in the URL. Example: - -```Java -String url = "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table"; -``` - -You can find the full example code at [GitHub Repository](https://github.com/apache/iotdb/blob/rc/2.0.1/example/jdbc/src/main/java/org/apache/iotdb/TableModelJDBCExample.java). - -Here is an excerpt of the sample code: - -```Java -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. 
- */ - -package org.apache.iotdb; - -import org.apache.iotdb.jdbc.IoTDBSQLException; - -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; - -import java.sql.Connection; -import java.sql.DriverManager; -import java.sql.ResultSet; -import java.sql.ResultSetMetaData; -import java.sql.SQLException; -import java.sql.Statement; - -public class TableModelJDBCExample { - - private static final Logger LOGGER = LoggerFactory.getLogger(TableModelJDBCExample.class); - - public static void main(String[] args) throws ClassNotFoundException, SQLException { - Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); - - // don't specify database in url - try (Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table", "root", "TimechoDB@2021"); //before V2.0.6 it is root - Statement statement = connection.createStatement()) { - - statement.execute("CREATE DATABASE test1"); - statement.execute("CREATE DATABASE test2"); - - statement.execute("use test2"); - - // or use full qualified table name - statement.execute( - "create table test1.table1(region_id STRING TAG, plant_id STRING TAG, device_id STRING TAG, model STRING ATTRIBUTE, temperature FLOAT FIELD, humidity DOUBLE FIELD) with (TTL=3600000)"); - - statement.execute( - "create table table2(region_id STRING TAG, plant_id STRING TAG, color STRING ATTRIBUTE, temperature FLOAT FIELD, speed DOUBLE FIELD) with (TTL=6600000)"); - - // show tables from current database - try (ResultSet resultSet = statement.executeQuery("SHOW TABLES")) { - ResultSetMetaData metaData = resultSet.getMetaData(); - System.out.println(metaData.getColumnCount()); - while (resultSet.next()) { - System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2)); - } - } - - // show tables by specifying another database - // using SHOW tables FROM - try (ResultSet resultSet = statement.executeQuery("SHOW TABLES FROM test1")) { - ResultSetMetaData metaData = resultSet.getMetaData(); - 
System.out.println(metaData.getColumnCount()); - while (resultSet.next()) { - System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2)); - } - } - - } catch (IoTDBSQLException e) { - LOGGER.error("IoTDB Jdbc example error", e); - } - - // specify database in url - try (Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667/test1?sql_dialect=table", "root", "TimechoDB@2021"); //before V2.0.6 it is root - Statement statement = connection.createStatement()) { - // show tables from current database test1 - try (ResultSet resultSet = statement.executeQuery("SHOW TABLES")) { - ResultSetMetaData metaData = resultSet.getMetaData(); - System.out.println(metaData.getColumnCount()); - while (resultSet.next()) { - System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2)); - } - } - - // change database to test2 - statement.execute("use test2"); - - try (ResultSet resultSet = statement.executeQuery("SHOW TABLES")) { - ResultSetMetaData metaData = resultSet.getMetaData(); - System.out.println(metaData.getColumnCount()); - while (resultSet.next()) { - System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2)); - } - } - } - } -} -``` \ No newline at end of file diff --git a/src/UserGuide/latest-Table/API/Programming-Java-Native-API_timecho.md b/src/UserGuide/latest-Table/API/Programming-Java-Native-API_timecho.md deleted file mode 100644 index ec19bf235..000000000 --- a/src/UserGuide/latest-Table/API/Programming-Java-Native-API_timecho.md +++ /dev/null @@ -1,851 +0,0 @@ - -# Java Native API - -## 1. Function Introduction - -IoTDB provides a Java native client driver and a session pool management mechanism. These tools enable developers to interact with IoTDB using object-oriented APIs, allowing time-series objects to be directly assembled and inserted into the database without constructing SQL statements. 
It is recommended to use the `ITableSessionPool` for multi-threaded database operations to maximize efficiency. - -## 2. Usage Instructions - -**Environment Requirements** - -- **JDK**: Version 1.8 or higher -- **Maven**: Version 3.6 or higher - -**Adding Maven Dependencies** - -```XML -<dependencies> - <dependency> - <groupId>com.timecho.iotdb</groupId> - <artifactId>iotdb-session</artifactId> - - <version>${project.version}</version> - </dependency> -</dependencies> -``` -* The latest version of `iotdb-session` can be viewed [here](https://repo1.maven.org/maven2/com/timecho/iotdb/iotdb-session/) -* Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. - -## 3. Read and Write Operations - -### 3.1 ITableSession Interface - -#### 3.1.1 Feature Description - -The `ITableSession` interface defines basic operations for interacting with IoTDB, including data insertion, query execution, and session closure. Note that this interface is **not thread-safe**. - -#### 3.1.2 Method Overview - -| **Method Name** | **Description** | **Parameters** | **Return Value** | **Exceptions** | -| --------------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ---------------- | --------------------------------------------------------- | -| insert(Tablet tablet) | Inserts a `Tablet` containing time-series data into the database. | `tablet`: The `Tablet` object to be inserted. | None | `StatementExecutionException`, `IoTDBConnectionException` | -| executeNonQueryStatement(String sql) | Executes non-query SQL statements such as DDL or DML commands. | `sql`: The SQL statement to execute. | None | `StatementExecutionException`, `IoTDBConnectionException` | -| executeQueryStatement(String sql) | Executes a query SQL statement and returns a `SessionDataSet` containing the query results. | `sql`: The SQL query statement to execute. 
| `SessionDataSet` | `StatementExecutionException`, `IoTDBConnectionException` | -| executeQueryStatement(String sql, long timeoutInMs) | Executes a query SQL statement with a specified timeout in milliseconds. | `sql`: The SQL query statement. `timeoutInMs`: Query timeout in milliseconds. | `SessionDataSet` | `StatementExecutionException`, `IoTDBConnectionException` | -| close() | Closes the session and releases resources. | None | None | `IoTDBConnectionException` | - -**Description of Object Data Type:** - -Since V2.0.8, the `ITableSession.insert(Tablet tablet)` interface supports splitting a single Object-class file into multiple segments and writing them in order. When the column data type in the Tablet data structure is **`TSDataType.Object`**, you need to use the following method to populate the Tablet: - -```Java -/* -rowIndex: row position in the tablet -columnIndex: column position in the tablet -isEOF: whether the current write operation contains the last segment of the Object file -offset: starting offset of the current write content within the Object file -content: byte array of the current write content -Note: When writing, ensure the total length of all segmented byte[] arrays equals the original Object size, -otherwise it will cause incorrect data size. -*/ -void addValue(int rowIndex, int columnIndex, boolean isEOF, long offset, byte[] content) -``` - -During queries, the following four methods are supported to retrieve values: -`Field.getStringValue`, `Field.getObjectValue`, `SessionDataSet.DataIterator.getObject`, and `SessionDataSet.DataIterator.getString`. -All these methods return a String containing metadata in the format: -`(Object) XX.XX KB` (where XX.XX KB represents the actual object size). - -#### 3.1.3 Sample Code - -```java -/** - * This interface defines a session for interacting with IoTDB tables. - * It supports operations such as data insertion, executing queries, and closing the session. 
- * Implementations of this interface are expected to manage connections and ensure - * proper resource cleanup. - * - *

Each method may throw exceptions to indicate issues such as connection errors or - * execution failures. - * - *

Since this interface extends {@link AutoCloseable}, it is recommended to use - * try-with-resources to ensure the session is properly closed. - */ -public interface ITableSession extends AutoCloseable { - - /** - * Inserts a {@link Tablet} into the database. - * - * @param tablet the tablet containing time-series data to be inserted. - * @throws StatementExecutionException if an error occurs while executing the statement. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. - */ - void insert(Tablet tablet) throws StatementExecutionException, IoTDBConnectionException; - - /** - * Executes a non-query SQL statement, such as a DDL or DML command. - * - * @param sql the SQL statement to execute. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. - * @throws StatementExecutionException if an error occurs while executing the statement. - */ - void executeNonQueryStatement(String sql) throws IoTDBConnectionException, StatementExecutionException; - - /** - * Executes a query SQL statement and returns the result set. - * - * @param sql the SQL query statement to execute. - * @return a {@link SessionDataSet} containing the query results. - * @throws StatementExecutionException if an error occurs while executing the statement. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. - */ - SessionDataSet executeQueryStatement(String sql) - throws StatementExecutionException, IoTDBConnectionException; - - /** - * Executes a query SQL statement with a specified timeout and returns the result set. - * - * @param sql the SQL query statement to execute. - * @param timeoutInMs the timeout duration in milliseconds for the query execution. - * @return a {@link SessionDataSet} containing the query results. - * @throws StatementExecutionException if an error occurs while executing the statement. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. 
- */ - SessionDataSet executeQueryStatement(String sql, long timeoutInMs) - throws StatementExecutionException, IoTDBConnectionException; - - /** - * Closes the session, releasing any held resources. - * - * @throws IoTDBConnectionException if there is an issue with closing the IoTDB connection. - */ - @Override - void close() throws IoTDBConnectionException; -} -``` - -### 3.2 TableSessionBuilder Class - -#### 3.2.1 Feature Description - -The `TableSessionBuilder` class is a builder for configuring and creating instances of the `ITableSession` interface. It allows developers to set connection parameters, query parameters, and security features. - -#### 3.2.2 Parameter Configuration - -| **Parameter** | **Description** | **Default Value** | -|-----------------------------------------------------| ------------------------------------------------------------ |---------------------------------------------------| -| nodeUrls(List\<String> nodeUrls) | Sets the list of IoTDB cluster node URLs. | `Collections.singletonList("localhost:6667")` | -| username(String username) | Sets the username for the connection. | `"root"` | -| password(String password) | Sets the password for the connection. | `"TimechoDB@2021"` //before V2.0.6 it is root | -| database(String database) | Sets the target database name. | `null` | -| queryTimeoutInMs(long queryTimeoutInMs) | Sets the query timeout in milliseconds. | `60000` (1 minute) | -| fetchSize(int fetchSize) | Sets the fetch size for query results. | `5000` | -| zoneId(ZoneId zoneId) | Sets the timezone-related `ZoneId`. | `ZoneId.systemDefault()` | -| thriftDefaultBufferSize(int thriftDefaultBufferSize) | Sets the default buffer size for the Thrift client (in bytes). | `1024` (1 KB) | -| thriftMaxFrameSize(int thriftMaxFrameSize) | Sets the maximum frame size for the Thrift client (in bytes). | `64 * 1024 * 1024` (64 MB) | -| enableRedirection(boolean enableRedirection) | Enables or disables redirection for cluster nodes. 
| `true` | -| enableAutoFetch(boolean enableAutoFetch) | Enables or disables automatic fetching of available DataNodes. | `true` | -| maxRetryCount(int maxRetryCount) | Sets the maximum number of connection retry attempts. | `60` | -| retryIntervalInMs(long retryIntervalInMs) | Sets the interval between retry attempts (in milliseconds). | `500` (500 milliseconds) | -| useSSL(boolean useSSL) | Enables or disables SSL for secure connections. | `false` | -| trustStore(String keyStore) | Sets the path to the trust store for SSL connections. | `null` | -| trustStorePwd(String keyStorePwd) | Sets the password for the SSL trust store. | `null` | -| enableCompression(boolean enableCompression) | Enables or disables RPC compression for the connection. | `false` | -| connectionTimeoutInMs(int connectionTimeoutInMs) | Sets the connection timeout in milliseconds. | `0` (no timeout) | - -#### 3.2.3 Sample Code - -```java -/** - * A builder class for constructing instances of {@link ITableSession}. - * - *

This builder provides a fluent API for configuring various options such as connection - * settings, query parameters, and security features. - * - *

All configurations have reasonable default values, which can be overridden as needed. - */ -public class TableSessionBuilder { - - /** - * Builds and returns a configured {@link ITableSession} instance. - * - * @return a fully configured {@link ITableSession}. - * @throws IoTDBConnectionException if an error occurs while establishing the connection. - */ - public ITableSession build() throws IoTDBConnectionException; - - /** - * Sets the list of node URLs for the IoTDB cluster. - * - * @param nodeUrls a list of node URLs. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue Collection.singletonList("localhost:6667") - */ - public TableSessionBuilder nodeUrls(List nodeUrls); - - /** - * Sets the username for the connection. - * - * @param username the username. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue "root" - */ - public TableSessionBuilder username(String username); - - /** - * Sets the password for the connection. - * - * @param password the password. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue "TimechoDB@2021" //before V2.0.6 it is root - */ - public TableSessionBuilder password(String password); - - /** - * Sets the target database name. - * - * @param database the database name. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue null - */ - public TableSessionBuilder database(String database); - - /** - * Sets the query timeout in milliseconds. - * - * @param queryTimeoutInMs the query timeout in milliseconds. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 60000 (1 minute) - */ - public TableSessionBuilder queryTimeoutInMs(long queryTimeoutInMs); - - /** - * Sets the fetch size for query results. - * - * @param fetchSize the fetch size. - * @return the current {@link TableSessionBuilder} instance. 
- * @defaultValue 5000 - */ - public TableSessionBuilder fetchSize(int fetchSize); - - /** - * Sets the {@link ZoneId} for timezone-related operations. - * - * @param zoneId the {@link ZoneId}. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue ZoneId.systemDefault() - */ - public TableSessionBuilder zoneId(ZoneId zoneId); - - /** - * Sets the default init buffer size for the Thrift client. - * - * @param thriftDefaultBufferSize the buffer size in bytes. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 1024 (1 KB) - */ - public TableSessionBuilder thriftDefaultBufferSize(int thriftDefaultBufferSize); - - /** - * Sets the maximum frame size for the Thrift client. - * - * @param thriftMaxFrameSize the maximum frame size in bytes. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 64 * 1024 * 1024 (64 MB) - */ - public TableSessionBuilder thriftMaxFrameSize(int thriftMaxFrameSize); - - /** - * Enables or disables redirection for cluster nodes. - * - * @param enableRedirection whether to enable redirection. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue true - */ - public TableSessionBuilder enableRedirection(boolean enableRedirection); - - /** - * Enables or disables automatic fetching of available DataNodes. - * - * @param enableAutoFetch whether to enable automatic fetching. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue true - */ - public TableSessionBuilder enableAutoFetch(boolean enableAutoFetch); - - /** - * Sets the maximum number of retries for connection attempts. - * - * @param maxRetryCount the maximum retry count. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 60 - */ - public TableSessionBuilder maxRetryCount(int maxRetryCount); - - /** - * Sets the interval between retries in milliseconds. - * - * @param retryIntervalInMs the interval in milliseconds. 
- * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 500 milliseconds - */ - public TableSessionBuilder retryIntervalInMs(long retryIntervalInMs); - - /** - * Enables or disables SSL for secure connections. - * - * @param useSSL whether to enable SSL. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue false - */ - public TableSessionBuilder useSSL(boolean useSSL); - - /** - * Sets the trust store path for SSL connections. - * - * @param keyStore the trust store path. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue null - */ - public TableSessionBuilder trustStore(String keyStore); - - /** - * Sets the trust store password for SSL connections. - * - * @param keyStorePwd the trust store password. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue null - */ - public TableSessionBuilder trustStorePwd(String keyStorePwd); - - /** - * Enables or disables rpc compression for the connection. - * - * @param enableCompression whether to enable compression. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue false - */ - public TableSessionBuilder enableCompression(boolean enableCompression); - - /** - * Sets the connection timeout in milliseconds. - * - * @param connectionTimeoutInMs the connection timeout in milliseconds. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 0 (no timeout) - */ - public TableSessionBuilder connectionTimeoutInMs(int connectionTimeoutInMs); -} -``` - -> Note: When creating tables using the native API, if table or column names contain special characters or Chinese characters, do not add extra double quotes around them. Otherwise, the quotation marks will become part of the name itself. - -## 4. 
Session Pool - -### 4.1 ITableSessionPool Interface - -#### 4.1.1 Feature Description - -The `ITableSessionPool` interface manages a pool of `ITableSession` instances, enabling efficient reuse of connections and proper cleanup of resources. - -#### 4.1.2 Method Overview - -| **Method Name** | **Description** | **Return Value** | **Exceptions** | -| --------------- | ---------------------------------------------------------- | ---------------- | -------------------------- | -| getSession() | Acquires a session from the pool for database interaction. | `ITableSession` | `IoTDBConnectionException` | -| close() | Closes the session pool and releases resources. | None | None | - -#### 4.1.3 Sample Code - -```Java -/** - * This interface defines a pool for managing {@link ITableSession} instances. - * It provides methods to acquire a session from the pool and to close the pool. - * - *

The implementation should handle the lifecycle of sessions, ensuring efficient - * reuse and proper cleanup of resources. - */ -public interface ITableSessionPool { - - /** - * Acquires an {@link ITableSession} instance from the pool. - * - * @return an {@link ITableSession} instance for interacting with the IoTDB. - * @throws IoTDBConnectionException if there is an issue obtaining a session from the pool. - */ - ITableSession getSession() throws IoTDBConnectionException; - - /** - * Closes the session pool, releasing any held resources. - * - *

 Once the pool is closed, no further sessions can be acquired. - */ - void close(); -} -``` - -### 4.2 TableSessionPoolBuilder Class - -#### 4.2.1 Feature Description - -The `TableSessionPoolBuilder` class is a builder for configuring and creating `ITableSessionPool` instances, supporting options like connection settings and pooling behavior. - -#### 4.2.2 Parameter Configuration - -| **Parameter** | **Description** | **Default Value** | -|---------------------------------------------------------------| ------------------------------------------------------------ |------------------------------------------------| -| nodeUrls(List\<String> nodeUrls) | Sets the list of IoTDB cluster node URLs. | `Collections.singletonList("localhost:6667")` | -| maxSize(int maxSize) | Sets the maximum size of the session pool, i.e., the maximum number of sessions allowed in the pool. | `5` | -| user(String user) | Sets the username for the connection. | `"root"` | -| password(String password) | Sets the password for the connection. | `"TimechoDB@2021"` //before V2.0.6 it is root | -| database(String database) | Sets the target database name. | `"root"` | -| queryTimeoutInMs(long queryTimeoutInMs) | Sets the query timeout in milliseconds. | `60000` (1 minute) | -| fetchSize(int fetchSize) | Sets the fetch size for query results. | `5000` | -| zoneId(ZoneId zoneId) | Sets the timezone-related `ZoneId`. | `ZoneId.systemDefault()` | -| waitToGetSessionTimeoutInMs(long waitToGetSessionTimeoutInMs) | Sets the timeout duration (in milliseconds) for acquiring a session from the pool. | `30000` (30 seconds) | -| thriftDefaultBufferSize(int thriftDefaultBufferSize) | Sets the default buffer size for the Thrift client (in bytes). | `1024` (1 KB) | -| thriftMaxFrameSize(int thriftMaxFrameSize) | Sets the maximum frame size for the Thrift client (in bytes). | `64 * 1024 * 1024` (64 MB) | -| enableCompression(boolean enableCompression) | Enables or disables compression for the connection. 
| `false` | -| enableRedirection(boolean enableRedirection) | Enables or disables redirection for cluster nodes. | `true` | -| connectionTimeoutInMs(int connectionTimeoutInMs) | Sets the connection timeout in milliseconds. | `10000` (10 seconds) | -| enableAutoFetch(boolean enableAutoFetch) | Enables or disables automatic fetching of available DataNodes. | `true` | -| maxRetryCount(int maxRetryCount) | Sets the maximum number of connection retry attempts. | `60` | -| retryIntervalInMs(long retryIntervalInMs) | Sets the interval between retry attempts (in milliseconds). | `500` (500 milliseconds) | -| useSSL(boolean useSSL) | Enables or disables SSL for secure connections. | `false` | -| trustStore(String keyStore) | Sets the path to the trust store for SSL connections. | `null` | -| trustStorePwd(String keyStorePwd) | Sets the password for the SSL trust store. | `null` | - -#### 4.2.3 Sample Code - -```Java -/** - * A builder class for constructing instances of {@link ITableSessionPool}. - * - *

This builder provides a fluent API for configuring a session pool, including - * connection settings, session parameters, and pool behavior. - * - *

All configurations have reasonable default values, which can be overridden as needed. - */ -public class TableSessionPoolBuilder { - - /** - * Builds and returns a configured {@link ITableSessionPool} instance. - * - * @return a fully configured {@link ITableSessionPool}. - */ - public ITableSessionPool build(); - - /** - * Sets the list of node URLs for the IoTDB cluster. - * - * @param nodeUrls a list of node URLs. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue Collection.singletonList("localhost:6667") - */ - public TableSessionPoolBuilder nodeUrls(List nodeUrls); - - /** - * Sets the maximum size of the session pool. - * - * @param maxSize the maximum number of sessions allowed in the pool. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 5 - */ - public TableSessionPoolBuilder maxSize(int maxSize); - - /** - * Sets the username for the connection. - * - * @param user the username. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue "root" - */ - public TableSessionPoolBuilder user(String user); - - /** - * Sets the password for the connection. - * - * @param password the password. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue "TimechoDB@2021" //before V2.0.6 it is root - */ - public TableSessionPoolBuilder password(String password); - - /** - * Sets the target database name. - * - * @param database the database name. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue "root" - */ - public TableSessionPoolBuilder database(String database); - - /** - * Sets the query timeout in milliseconds. - * - * @param queryTimeoutInMs the query timeout in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 60000 (1 minute) - */ - public TableSessionPoolBuilder queryTimeoutInMs(long queryTimeoutInMs); - - /** - * Sets the fetch size for query results. 
- * - * @param fetchSize the fetch size. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 5000 - */ - public TableSessionPoolBuilder fetchSize(int fetchSize); - - /** - * Sets the {@link ZoneId} for timezone-related operations. - * - * @param zoneId the {@link ZoneId}. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue ZoneId.systemDefault() - */ - public TableSessionPoolBuilder zoneId(ZoneId zoneId); - - /** - * Sets the timeout for waiting to acquire a session from the pool. - * - * @param waitToGetSessionTimeoutInMs the timeout duration in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 30000 (30 seconds) - */ - public TableSessionPoolBuilder waitToGetSessionTimeoutInMs(long waitToGetSessionTimeoutInMs); - - /** - * Sets the default buffer size for the Thrift client. - * - * @param thriftDefaultBufferSize the buffer size in bytes. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 1024 (1 KB) - */ - public TableSessionPoolBuilder thriftDefaultBufferSize(int thriftDefaultBufferSize); - - /** - * Sets the maximum frame size for the Thrift client. - * - * @param thriftMaxFrameSize the maximum frame size in bytes. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 64 * 1024 * 1024 (64 MB) - */ - public TableSessionPoolBuilder thriftMaxFrameSize(int thriftMaxFrameSize); - - /** - * Enables or disables compression for the connection. - * - * @param enableCompression whether to enable compression. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue false - */ - public TableSessionPoolBuilder enableCompression(boolean enableCompression); - - /** - * Enables or disables redirection for cluster nodes. - * - * @param enableRedirection whether to enable redirection. - * @return the current {@link TableSessionPoolBuilder} instance. 
- * @defaultValue true - */ - public TableSessionPoolBuilder enableRedirection(boolean enableRedirection); - - /** - * Sets the connection timeout in milliseconds. - * - * @param connectionTimeoutInMs the connection timeout in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 10000 (10 seconds) - */ - public TableSessionPoolBuilder connectionTimeoutInMs(int connectionTimeoutInMs); - - /** - * Enables or disables automatic fetching of available DataNodes. - * - * @param enableAutoFetch whether to enable automatic fetching. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue true - */ - public TableSessionPoolBuilder enableAutoFetch(boolean enableAutoFetch); - - /** - * Sets the maximum number of retries for connection attempts. - * - * @param maxRetryCount the maximum retry count. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 60 - */ - public TableSessionPoolBuilder maxRetryCount(int maxRetryCount); - - /** - * Sets the interval between retries in milliseconds. - * - * @param retryIntervalInMs the interval in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 500 milliseconds - */ - public TableSessionPoolBuilder retryIntervalInMs(long retryIntervalInMs); - - /** - * Enables or disables SSL for secure connections. - * - * @param useSSL whether to enable SSL. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue false - */ - public TableSessionPoolBuilder useSSL(boolean useSSL); - - /** - * Sets the trust store path for SSL connections. - * - * @param keyStore the trust store path. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue null - */ - public TableSessionPoolBuilder trustStore(String keyStore); - - /** - * Sets the trust store password for SSL connections. - * - * @param keyStorePwd the trust store password. 
- * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue null - */ - public TableSessionPoolBuilder trustStorePwd(String keyStorePwd); -} -``` - -## 5. Example Code - -Session: [src/main/java/org/apache/iotdb/TableModelSessionExample.java](https://github.com/apache/iotdb/blob/master/example/session/src/main/java/org/apache/iotdb/TableModelSessionExample.java) - -SessionPool: [src/main/java/org/apache/iotdb/TableModelSessionPoolExample.java](https://github.com/apache/iotdb/blob/master/example/session/src/main/java/org/apache/iotdb/TableModelSessionPoolExample.java) - -```Java -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. 
- */ - -package org.apache.iotdb; - -import org.apache.iotdb.isession.ITableSession; -import org.apache.iotdb.isession.SessionDataSet; -import org.apache.iotdb.isession.pool.ITableSessionPool; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.TableSessionPoolBuilder; - -import org.apache.tsfile.enums.ColumnCategory; -import org.apache.tsfile.enums.TSDataType; -import org.apache.tsfile.write.record.Tablet; - -import java.util.ArrayList; -import java.util.Arrays; -import java.util.Collections; -import java.util.List; - -import static org.apache.iotdb.SessionExample.printDataSet; - -public class TableModelSessionPoolExample { - - private static final String LOCAL_URL = "127.0.0.1:6667"; - - public static void main(String[] args) { - - // don't specify database in constructor - ITableSessionPool tableSessionPool = - new TableSessionPoolBuilder() - .nodeUrls(Collections.singletonList(LOCAL_URL)) - .user("root") - .password("TimechoDB@2021") //before V2.0.6 it is root - .maxSize(1) - .build(); - - try (ITableSession session = tableSessionPool.getSession()) { - - session.executeNonQueryStatement("CREATE DATABASE test1"); - session.executeNonQueryStatement("CREATE DATABASE test2"); - - session.executeNonQueryStatement("use test2"); - - // or use full qualified table name - session.executeNonQueryStatement( - "create table test1.table1(" - + "region_id STRING TAG, " - + "plant_id STRING TAG, " - + "device_id STRING TAG, " - + "model STRING ATTRIBUTE, " - + "temperature FLOAT FIELD, " - + "humidity DOUBLE FIELD) with (TTL=3600000)"); - - session.executeNonQueryStatement( - "create table table2(" - + "region_id STRING TAG, " - + "plant_id STRING TAG, " - + "color STRING ATTRIBUTE, " - + "temperature FLOAT FIELD, " - + "speed DOUBLE FIELD) with (TTL=6600000)"); - - // show tables from current database - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) 
{ - printDataSet(dataSet); - } - - // show tables by specifying another database - // using SHOW tables FROM - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES FROM test1")) { - printDataSet(dataSet); - } - - // insert table data by tablet - List<String> columnNameList = - Arrays.asList("region_id", "plant_id", "device_id", "model", "temperature", "humidity"); - List<TSDataType> dataTypeList = - Arrays.asList( - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.FLOAT, - TSDataType.DOUBLE); - List<ColumnCategory> columnTypeList = - new ArrayList<>( - Arrays.asList( - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.ATTRIBUTE, - ColumnCategory.FIELD, - ColumnCategory.FIELD)); - Tablet tablet = new Tablet("test1", columnNameList, dataTypeList, columnTypeList, 100); - for (long timestamp = 0; timestamp < 100; timestamp++) { - int rowIndex = tablet.getRowSize(); - tablet.addTimestamp(rowIndex, timestamp); - tablet.addValue("region_id", rowIndex, "1"); - tablet.addValue("plant_id", rowIndex, "5"); - tablet.addValue("device_id", rowIndex, "3"); - tablet.addValue("model", rowIndex, "A"); - tablet.addValue("temperature", rowIndex, 37.6F); - tablet.addValue("humidity", rowIndex, 111.1); - if (tablet.getRowSize() == tablet.getMaxRowNumber()) { - session.insert(tablet); - tablet.reset(); - } - } - if (tablet.getRowSize() != 0) { - session.insert(tablet); - tablet.reset(); - } - - // query table data - try (SessionDataSet dataSet = - session.executeQueryStatement( - "select * from test1 " - + "where region_id = '1' and plant_id in ('3', '5') and device_id = '3'")) { - printDataSet(dataSet); - } - - } catch (IoTDBConnectionException e) { - e.printStackTrace(); - } catch (StatementExecutionException e) { - e.printStackTrace(); - } finally { - tableSessionPool.close(); - } - - // specify database in constructor - tableSessionPool = - new TableSessionPoolBuilder() - .nodeUrls(Collections.singletonList(LOCAL_URL)) - 
.user("root") - .password("TimechoDB@2021")//before V2.0.6 it is root - .maxSize(1) - .database("test1") - .build(); - - try (ITableSession session = tableSessionPool.getSession()) { - - // show tables from current database - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) { - printDataSet(dataSet); - } - - // change database to test2 - session.executeNonQueryStatement("use test2"); - - // show tables by specifying another database - // using SHOW tables FROM - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) { - printDataSet(dataSet); - } - - } catch (IoTDBConnectionException e) { - e.printStackTrace(); - } catch (StatementExecutionException e) { - e.printStackTrace(); - } - - try (ITableSession session = tableSessionPool.getSession()) { - - // show tables from default database test1 - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) { - printDataSet(dataSet); - } - - } catch (IoTDBConnectionException e) { - e.printStackTrace(); - } catch (StatementExecutionException e) { - e.printStackTrace(); - } finally { - tableSessionPool.close(); - } - } -} -``` \ No newline at end of file diff --git a/src/UserGuide/latest-Table/API/Programming-MQTT_timecho.md b/src/UserGuide/latest-Table/API/Programming-MQTT_timecho.md deleted file mode 100644 index 39184f479..000000000 --- a/src/UserGuide/latest-Table/API/Programming-MQTT_timecho.md +++ /dev/null @@ -1,261 +0,0 @@ - -# MQTT Protocol - -## 1. Overview - -MQTT (Message Queuing Telemetry Transport) is a lightweight messaging protocol designed for IoT and low-bandwidth environments. It operates on a Publish/Subscribe (Pub/Sub) model, enabling efficient and reliable bidirectional communication between devices. Its core objectives are low power consumption, minimal bandwidth usage, and high real-time performance, making it ideal for unstable networks or resource-constrained scenarios (e.g., sensors, mobile devices). 
- -IoTDB provides deep integration with the MQTT protocol, fully compliant with MQTT v3.1 (OASIS International Standard). The IoTDB server includes a built-in high-performance MQTT Broker module, eliminating the need for third-party middleware. Devices can directly write time-series data into the IoTDB storage engine via MQTT messages. - -![](/img/mqtt-table-en-1.png) - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the MQTT service JAR file by default. Please contact the Timecho team to obtain the JAR file before using this service, and place it in the `timechodb_home/lib` or `timechodb_home/ext/external_service` directory. - - -## 2. Configuration - -By default, the IoTDB MQTT service loads configurations from `${IOTDB_HOME}/${IOTDB_CONF}/iotdb-system.properties`. - -| **Property** | **Description** | **Default** | -| ------------------------ | -------------------------------------------------------------------------------------------------------------------- | ------------------- | -| `enable_mqtt_service` | Enable/disable the MQTT service. | FALSE | -| `mqtt_host` | Host address bound to the MQTT service. | 127.0.0.1 | -| `mqtt_port` | Port bound to the MQTT service. | 1883 | -| `mqtt_handler_pool_size` | Thread pool size for processing MQTT messages. | 1 | -| **`mqtt_payload_formatter`** | **Formatting method for MQTT message payloads. Options: `json` (tree mode), `line` (table mode).** | **json** | -| `mqtt_max_message_size` | Maximum allowed MQTT message size (bytes). | 1048576 | - -## 3. Write Protocol - -* Line Protocol Syntax - -```JavaScript -<measurement>[,<tag_key>=<tag_value>[,<tag_key>=<tag_value>]][ <attribute_key>=<attribute_value>[,<attribute_key>=<attribute_value>]] <field_key>=<field_value>[,<field_key>=<field_value>] [<timestamp>] -``` - -* Example - -```JavaScript -myMeasurement,tag1=value1,tag2=value2 attr1=value1,attr2=value2 fieldKey="fieldValue" 1556813561098000000 -``` - -![](/img/mqtt-table-en-2.png) - -## 4. Naming Conventions - -* Database Name - -The first segment of the MQTT topic (split by `/`) is used as the database name. 
- -```Properties -topic: stock/Legacy -databaseName: stock - - -topic: stock/Legacy/# -databaseName: stock -``` - -* Table Name - -The table name is derived from the `<measurement>` in the line protocol. - -* Type Identifiers - -| Field Value | IoTDB Data Type | -|--------------------------------------------------------------------| ----------------- | -| 1
1.12 | DOUBLE | -| 1`f`
1.12`f` | FLOAT | -| 1`i`
123`i` | INT64 | -| 1`u`
123`u` | INT64 | -| 1`i32`
123`i32` | INT32 | -| `"xxx"` | TEXT | -| `t`,`T`,`true`,`True`,`TRUE`
 `f`,`F`,`false`,`False`,`FALSE` | BOOLEAN | - - -## 5. Coding Examples -The following is an example in which an MQTT client sends messages to the IoTDB server. - - ```java -MQTT mqtt = new MQTT(); -mqtt.setHost("127.0.0.1", 1883); -mqtt.setUserName("root"); -mqtt.setPassword("root"); - -BlockingConnection connection = mqtt.blockingConnection(); -String DATABASE = "myMqttTest"; -connection.connect(); - -String payload = - "test1,tag1=t1,tag2=t2 attr3=a5,attr4=a4 field1=\"fieldValue1\",field2=1i,field3=1u 1"; -connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -payload = "test1,tag1=t1,tag2=t2 field4=2,field5=2i32,field6=2f 2"; -connection.publish(DATABASE, payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -payload = "# It's a remark\n " + "test1,tag1=t1,tag2=t2 field4=2,field5=2i32,field6=2f 6"; -connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -//batch write example -payload = - "test1,tag1=t1,tag2=t2 field7=t,field8=T,field9=true 3 \n " - + "test1,tag1=t1,tag2=t2 field7=f,field8=F,field9=FALSE 4"; -connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -//batch write example -payload = - "test1,tag1=t1,tag2=t2 attr1=a1,attr2=a2 field1=\"fieldValue1\",field2=1i,field3=1u 4 \n " - + "test1,tag1=t1,tag2=t2 field4=2,field5=2i32,field6=2f 5"; -connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -connection.disconnect(); - ``` - - - -## 6. Customize your MQTT Message Format - -If the line format above does not suit your needs, you can customize the MQTT message format by writing a few lines -of code. An example can be found in the [example/mqtt-customize](https://github.com/apache/iotdb/tree/master/example/mqtt-customize) project. - -Steps: -1. 
Create a Java project and add the dependency: -```xml -<dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-server</artifactId> - <version>2.0.4-SNAPSHOT</version> -</dependency> -``` -2. Define your implementation which implements `org.apache.iotdb.db.protocol.mqtt.PayloadFormatter` - e.g., - -```java -package org.apache.iotdb.mqtt.server; - -import io.netty.buffer.ByteBuf; -import org.apache.iotdb.db.protocol.mqtt.Message; -import org.apache.iotdb.db.protocol.mqtt.PayloadFormatter; -import org.apache.iotdb.db.protocol.mqtt.TableMessage; -import org.apache.tsfile.enums.TSDataType; - -import java.nio.charset.StandardCharsets; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.List; - -public class CustomizedLinePayloadFormatter implements PayloadFormatter { - - @Override - public List<Message> format(String topic, ByteBuf payload) { - // Suppose the payload is a line format - if (payload == null) { - return null; - } - - String line = payload.toString(StandardCharsets.UTF_8); - // parse data from the line and generate Messages and put them into List ret - List<Message> ret = new ArrayList<>(); - // this is just an example, so we just generate some Messages directly - for (int i = 0; i < 3; i++) { - long ts = i; - TableMessage message = new TableMessage(); - - // Parsing Database Name - message.setDatabase("db" + i); - - // Parsing Table Names - message.setTable("t" + i); - - // Parsing Tags - List<String> tagKeys = new ArrayList<>(); - tagKeys.add("tag1" + i); - tagKeys.add("tag2" + i); - List<Object> tagValues = new ArrayList<>(); - tagValues.add("t_value1" + i); - tagValues.add("t_value2" + i); - message.setTagKeys(tagKeys); - message.setTagValues(tagValues); - - // Parsing Attributes - List<String> attributeKeys = new ArrayList<>(); - List<Object> attributeValues = new ArrayList<>(); - attributeKeys.add("attr1" + i); - attributeKeys.add("attr2" + i); - attributeValues.add("a_value1" + i); - attributeValues.add("a_value2" + i); - message.setAttributeKeys(attributeKeys); - message.setAttributeValues(attributeValues); - - // Parsing Fields - List<String> fields = Arrays.asList("field1" + i, "field2" + i); - List<TSDataType> dataTypes = Arrays.asList(TSDataType.FLOAT, 
 TSDataType.FLOAT); - List<Object> values = Arrays.asList("4.0" + i, "5.0" + i); - message.setFields(fields); - message.setDataTypes(dataTypes); - message.setValues(values); - - // Parsing timestamp - message.setTimestamp(ts); - ret.add(message); - } - return ret; - } - - @Override - public String getName() { - // set the value of mqtt_payload_formatter in iotdb-system.properties as the following string: - return "CustomizedLine"; - } -} -``` -3. Modify the file in `src/main/resources/META-INF/services/org.apache.iotdb.db.protocol.mqtt.PayloadFormatter`: - clear the file and put your implementation class name into it. - In this example, the content is: `org.apache.iotdb.mqtt.server.CustomizedLinePayloadFormatter` -4. Compile your implementation as a jar file: `mvn package -DskipTests` - - -Then, in your server: -1. Create the `${IOTDB_HOME}/ext/mqtt/` folder, and put the jar into this folder. -2. Update the configuration to enable the MQTT service (`enable_mqtt_service=true` in `conf/iotdb-system.properties`). -3. Set the value of `mqtt_payload_formatter` in `conf/iotdb-system.properties` to the value returned by `getName()` in your implementation - ; in this example, the value is `CustomizedLine`. -4. Launch the IoTDB server. -5. IoTDB will now use your implementation to parse MQTT messages. - -More: the message format can be anything you want. For example, if it is a binary format, -just use `payload.forEachByte()` or `payload.array()` to get the byte content. - -## 7. Caution - -To avoid compatibility issues caused by a default client_id, always explicitly supply a unique, non-empty client_id in every MQTT client. -Behavior varies when the client_id is missing or empty. Common examples: -1. Explicitly sending an empty string -• MQTTX: When client_id="", IoTDB silently discards the message. -• mosquitto_pub: When client_id="", IoTDB receives the message normally. -2. Omitting client_id entirely -• MQTTX: IoTDB accepts the message. -• mosquitto_pub: IoTDB rejects the connection. 
-Therefore, explicitly assigning a unique, non-empty client_id is the simplest way to eliminate these discrepancies and ensure reliable message delivery. \ No newline at end of file diff --git a/src/UserGuide/latest-Table/API/Programming-ODBC_timecho.md b/src/UserGuide/latest-Table/API/Programming-ODBC_timecho.md deleted file mode 100644 index fe40f1f8f..000000000 --- a/src/UserGuide/latest-Table/API/Programming-ODBC_timecho.md +++ /dev/null @@ -1,1051 +0,0 @@ - - -# ODBC - -## 1. Feature Introduction -The IoTDB ODBC driver provides the ability to interact with the database via the standard ODBC interface, supporting data management in time-series databases through ODBC connections. It currently supports database connection, data query, data insertion, data modification, and data deletion operations, and is compatible with various applications and toolchains that support the ODBC protocol. - -> Note: This feature is supported starting from V2.0.8.2. - -## 2. Usage Method -It is recommended to install using the pre-compiled binary package. There is no need to compile it yourself; simply use the script to complete the driver installation and system registration. Currently, only Windows systems are supported. - -### 2.1 Environment Requirements -Only the ODBC Driver Manager dependency at the operating system level is required; no compilation environment configuration is needed: - -| **Operating System** | **Requirements and Installation Method** | -| :--- |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Windows | 1. **Windows 10/11, Server 2016/2019/2022**: Comes with ODBC Driver Manager version 17/18 built-in; no extra installation needed.
 2. **Windows 8.1/Server 2012 R2**: Requires manual installation of the corresponding version of the ODBC Driver Manager. | - -### 2.2 Installation Steps -1. Contact the Timecho team to obtain the pre-compiled binary package. - Binary package directory structure: - ```Plain - ├── bin/ - │ ├── apache_iotdb_odbc.dll - │ └── install_driver.exe - ├── install.bat - └── registry.bat - ``` -2. Open a command line tool (CMD/PowerShell) with **Administrator privileges** and run the following command (you can replace the path with any absolute path): - ```Bash - install.bat "C:\Program Files\Apache IoTDB ODBC Driver" - ``` - The script automatically completes the following operations: - * Creates the installation directory (if it does not exist). - * Copies `bin\apache_iotdb_odbc.dll` to the specified installation directory. - * Calls `install_driver.exe` to register the driver to the system via the ODBC standard API (`SQLInstallDriverEx`). -3. Verify installation: Open "ODBC Data Source Administrator". If you can see `Apache IoTDB ODBC Driver` in the "Drivers" tab, the registration was successful. - ![](/img/odbc-1-en.png) - -### 2.3 Uninstallation Steps -1. Open Command Prompt as Administrator and `cd` into the project root directory. -2. Run the uninstallation script: - ```Bash - uninstall.bat - ``` - The script will call `install_driver.exe` to unregister the driver from the system via the ODBC standard API (`SQLRemoveDriver`). The DLL files in the installation directory will not be automatically deleted; please delete them manually if cleanup is required. - -### 2.4 Connection Configuration -After installing the driver, you need to configure a Data Source Name (DSN) to allow applications to connect to the database using the DSN name. The IoTDB ODBC driver supports two methods for configuring connection parameters: via Data Source and via Connection String. - -#### 2.4.1 Configuring Data Source -**Configure via ODBC Data Source Administrator** -1. 
Open "ODBC Data Source Administrator", switch to the "User DSN" tab, and click the "Add" button. - ![](/img/odbc-2-en.png) -2. Select "Apache IoTDB ODBC Driver" from the pop-up driver list and click "Finish". - ![](/img/odbc-3-en.png) -3. The data source configuration dialog will appear. Fill in the connection parameters and click OK: - ![](/img/odbc-4-en.png) - The meaning of each field in the dialog box is as follows: - - | **Area** | **Field** | **Description** | - | :--- | :--- | :--- | - | Data Source | DSN Name | Data Source Name; applications refer to this data source by this name. | - | Data Source | Description | Data Source description (optional). | - | Connection | Server | IoTDB server IP address, default 127.0.0.1. | - | Connection | Port | IoTDB Session API port, default 6667. | - | Connection | User | Username, default root. | - | Connection | Password | Password, default root. | - | Options | Table Model | Check to use Table Model; uncheck to use Tree Model. | - | Options | Database | Database name. Only available in Table Model mode; grayed out in Tree Model. | - | Options | Log Level | Log level (0-4): 0=OFF, 1=ERROR, 2=WARN, 3=INFO, 4=TRACE. | - | Options | Session Timeout | Session timeout time (milliseconds); 0 means no timeout. Note: The server-side `queryTimeoutThreshold` defaults to 60000ms; exceeding this value requires modifying server configuration. | - | Options | Batch Size | Number of rows fetched per batch, default 1000. Setting to 0 resets to the default value. | - -4. After filling in the details, you can click the "Test Connection" button to test the connection. Testing will attempt to connect to the IoTDB server using the current parameters and execute a `SHOW VERSION` query. If successful, the server version information will be displayed; if failed, the specific error reason will be shown. -5. Once parameters are confirmed correct, click "OK" to save. 
The data source will appear in the "User DSN" list, as shown in the example below with the name "123". - ![](/img/odbc-5-en.png) - To modify the configuration of an existing data source, select it in the list and click the "Configure" button to edit again. - -#### 2.4.2 Connection String -The connection string format is **semicolon-separated key-value pairs**, for example: -```Bash -Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;database=testdb;isTableModel=true;loglevel=2 -``` -Specific field attributes are introduced in the table below: - -| **Field Name** | **Description** | **Optional Values** | **Default Value** | -| :--- | :--- |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :--- | -| DSN | Data Source Name | Custom data source name | - | -| uid | Database username | Any string | root | -| pwd | Database password | Any string | root | -| server | IoTDB server address | IP address | 127.0.0.1 | -| port | IoTDB server port | Port number | 6667 | -| database | Database name (only effective in Table Model mode) | Any string | Empty string | -| loglevel | Log level | Integer value (0-4) | 4 (LOG_LEVEL_TRACE) | -| isTableModel / tablemodel | Whether to enable Table Model mode | Boolean type, supports multiple representations:
1. 0, false, no, off: set to false;
2. 1, true, yes, on: set to true;
3. Other values default to true. | true | -| sessiontimeoutms | Session timeout time (milliseconds) | 64-bit integer, defaults to `LLONG_MAX`; setting to `0` will be replaced with `LLONG_MAX`. Note: The server has a timeout setting: `private long queryTimeoutThreshold = 60000;` this item needs to be modified to get a timeout time exceeding 60 seconds. | LLONG_MAX | -| batchsize | Batch size for fetching data each time | 64-bit integer, defaults to `1000`; setting to `0` will be replaced with `1000` | 1000 | - -Notes: -* Field names are case-insensitive (automatically converted to lowercase for comparison). -* Connection string format is semicolon-separated key-value pairs, e.g., `Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;database=testdb;isTableModel=true;loglevel=2`. -* For boolean fields (`isTableModel`), multiple representation methods are supported. -* All fields are optional; if not specified, default values are used. -* Unsupported fields will be ignored and a warning logged, but will not affect the connection. -* The default server interface port 6667 is the default port used by IoTDB's C++ Session interface. This ODBC driver uses the C++ Session interface to transfer data with IoTDB. If the C++ Session interface on the IoTDB server uses a non-default port, corresponding changes must be made in the ODBC connection string. - -#### 2.4.3 Relationship between Data Source Configuration and Connection String -Configurations saved in the ODBC Data Source Administrator are written into the system's ODBC data source configuration as key-value pairs (corresponding to the registry `HKEY_CURRENT_USER\SOFTWARE\ODBC\ODBC.INI` under Windows). When an application uses `SQLConnect` or specifies `DSN=DataSourceName` in the connection string, the driver reads these parameters from the system configuration. - -**The priority of the connection string is higher than the configuration saved in the DSN.** Specific rules are as follows: -1. 
If the connection string contains `DSN=xxx` and does not contain `DRIVER=...`, the driver first loads all parameters of that DSN from the system configuration as base values. -2. Then, parameters explicitly specified in the connection string will override parameters with the same name in the DSN. -3. If the connection string contains `DRIVER=...`, no DSN parameters will be read from the system configuration; it will rely entirely on the connection string. - -For example: If the DSN is configured with `Server=192.168.1.100` and `Port=6667`, but the connection string is `DSN=MyDSN;Server=127.0.0.1`, then the actual connection will use `Server=127.0.0.1` (overridden by connection string) and `Port=6667` (from DSN). - -### 2.5 Logging -Log output during driver runtime is divided into "Driver Self-Logs" and "ODBC Manager Tracing Logs". Note the impact of log levels on performance. - -#### 2.5.1 Driver Self-Logs -* Output location: `apache_iotdb_odbc.log` in the user's home directory. -* Log level: Configured via the `loglevel` parameter in the connection string (0-4; higher levels produce more detailed output). -* Performance impact: High log levels will significantly reduce driver performance; recommended for debugging only. - -#### 2.5.2 ODBC Manager Tracing Logs -* How to enable: Open "ODBC Data Source Administrator" → "Tracing" → "Start Tracing Now". -* Precautions: Enabling this will greatly reduce driver performance; use only for troubleshooting. - -## 3. 
Interface Support - -### 3.1 Method List -The driver's support status for standard ODBC APIs is as follows: - -| ODBC/Setup API | Function | Parameter List | Parameter Description | -|:------------------|:---------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :--- | -| SQLAllocHandle | Allocate ODBC Handle | (SQLSMALLINT HandleType, SQLHANDLE InputHandle, SQLHANDLE *OutputHandle) | HandleType: Type of handle to allocate (ENV/DBC/STMT/DESC);
InputHandle: Parent context handle;
OutputHandle: Pointer to the returned new handle. | -| SQLBindCol | Bind column to result buffer | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN *StrLen_or_Ind) | StatementHandle: Statement handle;
ColumnNumber: Column number;
TargetType: C data type;
TargetValue: Data buffer; BufferLength: Buffer length;
StrLen_or_Ind: Returns data length or NULL indicator. | -| SQLColAttribute | Get column attribute information | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLUSMALLINT FieldIdentifier, SQLPOINTER CharacterAttribute, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength, SQLLEN *NumericAttribute) | StatementHandle: Statement handle;
ColumnNumber: Column number;
FieldIdentifier: Attribute ID;
CharacterAttribute: Character attribute output;
BufferLength: Buffer length;
StringLength: Returned length;
NumericAttribute: Numeric attribute output. | -| SQLColumns | Query table column information | (SQLHSTMT StatementHandle, SQLCHAR *CatalogName, SQLSMALLINT NameLength1, SQLCHAR *SchemaName, SQLSMALLINT NameLength2, SQLCHAR *TableName, SQLSMALLINT NameLength3, SQLCHAR *ColumnName, SQLSMALLINT NameLength4) | StatementHandle: Statement handle;
Catalog/Schema/Table/ColumnName: Query object names;

NameLength*: Corresponding name lengths. | -| SQLConnect | Establish database connection | (SQLHDBC ConnectionHandle, SQLCHAR *ServerName, SQLSMALLINT NameLength1, SQLCHAR *UserName, SQLSMALLINT NameLength2, SQLCHAR *Authentication, SQLSMALLINT NameLength3) | ConnectionHandle: Connection handle;
ServerName: Data source name;
UserName: Username;
Authentication: Password;
NameLength*: String lengths. | -| SQLDescribeCol | Describe columns in result set | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLCHAR *ColumnName, SQLSMALLINT BufferLength, SQLSMALLINT *NameLength, SQLSMALLINT *DataType, SQLULEN *ColumnSize, SQLSMALLINT *DecimalDigits, SQLSMALLINT *Nullable) | StatementHandle: Statement handle;
ColumnNumber: Column number;
ColumnName: Column name output;
BufferLength: Buffer length;
NameLength: Returned column name length;
DataType: SQL type;
ColumnSize: Column size;
DecimalDigits: Decimal digits;
Nullable: Whether nullable. | -| SQLDisconnect | Disconnect database connection | (SQLHDBC ConnectionHandle) | ConnectionHandle: Connection handle. | -| SQLDriverConnect | Establish connection using connection string | (SQLHDBC ConnectionHandle, SQLHWND WindowHandle, SQLCHAR *InConnectionString, SQLSMALLINT StringLength1, SQLCHAR *OutConnectionString, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength2, SQLUSMALLINT DriverCompletion) | ConnectionHandle: Connection handle;
WindowHandle: Window handle; InConnectionString: Input connection string;
StringLength1: Input length;
OutConnectionString: Output connection string;
BufferLength: Output buffer;
StringLength2: Returned length;
DriverCompletion: Connection prompt method. | -| SQLEndTran | Commit or rollback transaction | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT CompletionType) | HandleType: Handle type;
Handle: Connection or environment handle;
CompletionType: Commit or rollback transaction. | -| SQLExecDirect | Execute SQL statement directly | (SQLHSTMT StatementHandle, SQLCHAR *StatementText, SQLINTEGER TextLength) | StatementHandle: Statement handle;
StatementText: SQL text; TextLength: SQL length. | -| SQLFetch | Fetch next row in result set | (SQLHSTMT StatementHandle) | StatementHandle: Statement handle. | -| SQLFreeHandle | Free ODBC handle | (SQLSMALLINT HandleType, SQLHANDLE Handle) | HandleType: Handle type;
Handle: Handle to free. | -| SQLFreeStmt | Free statement-related resources | (SQLHSTMT StatementHandle, SQLUSMALLINT Option) | StatementHandle: Statement handle;
Option: Free option (close cursor/reset parameters, etc.). | -| SQLGetConnectAttr | Get connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER *StringLength) | ConnectionHandle: Connection handle;
Attribute: Attribute ID;
Value: Returned attribute value;
BufferLength: Buffer length;
StringLength: Returned length. | -| SQLGetData | Get result data | (SQLHSTMT StatementHandle, SQLUSMALLINT Col_or_Param_Num, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN *StrLen_or_Ind) | StatementHandle: Statement handle;
Col_or_Param_Num: Column number;
TargetType: C type;
TargetValue: Data buffer;
BufferLength: Buffer size;
StrLen_or_Ind: Returned length or NULL flag. | -| SQLGetDiagField | Get diagnostic field | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLSMALLINT DiagIdentifier, SQLPOINTER DiagInfo, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength) | HandleType: Handle type;
Handle: Handle;
RecNumber: Record number;
DiagIdentifier: Diagnostic field ID;
DiagInfo: Output info;
BufferLength: Buffer;
StringLength: Returned length. | -| SQLGetDiagRec | Get diagnostic record | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLCHAR *Sqlstate, SQLINTEGER *NativeError, SQLCHAR *MessageText, SQLSMALLINT BufferLength, SQLSMALLINT *TextLength) | HandleType: Handle type;
Handle: Handle;
RecNumber: Record number;
Sqlstate: SQL state code;
NativeError: Native error code;
MessageText: Error message;
BufferLength: Buffer;
TextLength: Returned length. | -| SQLGetInfo | Get database information | (SQLHDBC ConnectionHandle, SQLUSMALLINT InfoType, SQLPOINTER InfoValue, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength) | ConnectionHandle: Connection handle;
InfoType: Information type;
InfoValue: Return value;
BufferLength: Buffer length;
StringLength: Returned length. | -| SQLGetStmtAttr | Get statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER *StringLength) | StatementHandle: Statement handle;
Attribute: Attribute ID;
Value: Return value;
BufferLength: Buffer;
StringLength: Returned length. | -| SQLGetTypeInfo | Get data type information | (SQLHSTMT StatementHandle, SQLSMALLINT DataType) | StatementHandle: Statement handle;
DataType: SQL data type. | -| SQLMoreResults | Get more result sets | (SQLHSTMT StatementHandle) | StatementHandle: Statement handle. | -| SQLNumResultCols | Get number of columns in result set | (SQLHSTMT StatementHandle, SQLSMALLINT *ColumnCount) | StatementHandle: Statement handle;
ColumnCount: Returned column count. | -| SQLRowCount | Get number of affected rows | (SQLHSTMT StatementHandle, SQLLEN *RowCount) | StatementHandle: Statement handle;
RowCount: Returned number of affected rows. | -| SQLSetConnectAttr | Set connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | ConnectionHandle: Connection handle;
Attribute: Attribute ID;
Value: Attribute value;
StringLength: Attribute value length. | -| SQLSetEnvAttr | Set environment attribute | (SQLHENV EnvironmentHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | EnvironmentHandle: Environment handle;
Attribute: Attribute ID;
Value: Attribute value;
StringLength: Length. | -| SQLSetStmtAttr | Set statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | StatementHandle: Statement handle;
Attribute: Attribute ID;
Value: Attribute value;
StringLength: Length. | -| SQLTables | Query table information | (SQLHSTMT StatementHandle, SQLCHAR *CatalogName, SQLSMALLINT NameLength1, SQLCHAR *SchemaName, SQLSMALLINT NameLength2, SQLCHAR *TableName, SQLSMALLINT NameLength3, SQLCHAR *TableType, SQLSMALLINT NameLength4) | StatementHandle: Statement handle;
CatalogName/SchemaName/TableName: Catalog, schema, and table names;
TableType: Table type;
NameLength*: Corresponding lengths. | - -### 3.2 Data Type Conversion -IoTDB data types map to standard ODBC data types as follows: - -| **IoTDB Data Type** | **ODBC Data Type** | -| :--- | :--- | -| BOOLEAN | SQL_BIT | -| INT32 | SQL_INTEGER | -| INT64 | SQL_BIGINT | -| FLOAT | SQL_REAL | -| DOUBLE | SQL_DOUBLE | -| TEXT | SQL_VARCHAR | -| STRING | SQL_VARCHAR | -| BLOB | SQL_LONGVARBINARY | -| TIMESTAMP | SQL_BIGINT | -| DATE | SQL_DATE | - -## 4. Operation Examples -This chapter provides examples that exercise every data type from **C#**, **Python**, **C++**, **PowerBI**, and **Excel**, covering core operations such as data insertion and query. - -### 4.1 C# Example - -```C# -/******* -Note: Output that contains Chinese characters may appear garbled. -This is because the table.Write() function cannot output strings in UTF-8 encoding -and can only output using GB2312 (or another system default encoding). This issue -may not occur in software like Power BI; it also does not occur when using the Console.WriteLine function. -This is an issue with the ConsoleTable package. 
-*****/ - -using System.Data.Common; -using System.Data.Odbc; -using System.Reflection.PortableExecutable; -using ConsoleTables; -using System; - -/// Executes a SELECT query and outputs the results of fulltable in table format -void Query(OdbcConnection dbConnection) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - dbCommand.CommandText = "select * from fulltable"; - using (OdbcDataReader dbReader = dbCommand.ExecuteReader()) - { - var fCount = dbReader.FieldCount; - Console.WriteLine($"fCount = {fCount}"); - - // Output header row - var columns = new string[fCount]; - for (var i = 0; i < fCount; i++) - { - var fName = dbReader.GetName(i); - if (fName.Contains('.')) - { - fName = fName.Substring(fName.LastIndexOf('.') + 1); - } - columns[i] = fName; - } - - // Output content rows - var table = new ConsoleTable(columns); - while (dbReader.Read()) - { - var row = new object[fCount]; - for (var i = 0; i < fCount; i++) - { - if (dbReader.IsDBNull(i)) - { - row[i] = null; - continue; - } - row[i] = dbReader.GetValue(i); - } - table.AddRow(row); - } - table.Write(); - Console.WriteLine(); - } - } - } - catch (Exception ex) - { - Console.WriteLine(ex.ToString()); - } -} - -/// Executes non-query SQL statements (such as CREATE DATABASE, CREATE TABLE, INSERT, etc.) 
-void Execute(OdbcConnection dbConnection, string command) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - try - { - dbCommand.CommandText = command; - Console.WriteLine($"Execute command: {command}"); - dbCommand.ExecuteNonQuery(); - } - catch (Exception ex) - { - Console.WriteLine($"CommandText error: {ex.Message}"); - } - } - } - catch (OdbcException ex) - { - Console.WriteLine($"Database error: {ex.Message}"); - } - catch (Exception ex) - { - Console.WriteLine($"Unknown error occurred: {ex.Message}"); - } -} - -var dsn = "Apache IoTDB DSN"; -var user = "root"; -var password = "root"; -var server = "127.0.0.1"; -var database = "test"; -var connectionString = $"DSN={dsn};Server={server};UID={user};PWD={password};Database={database};loglevel=4"; - -using (OdbcConnection dbConnection = new OdbcConnection(connectionString)) -{ - Console.WriteLine("Start"); - try - { - dbConnection.Open(); - } - catch (Exception ex) - { - Console.WriteLine($"Login failed: {ex.Message}"); - Console.WriteLine($"Stack Trace: {ex.StackTrace}"); - dbConnection.Dispose(); - return; - } - Console.WriteLine($"Successfully opened connection. driver = {dbConnection.Driver}"); - - Execute(dbConnection, "CREATE DATABASE IF NOT EXISTS test"); - Execute(dbConnection, "use test"); - Console.WriteLine("use test Execute complete. 
Begin to setup fulltable."); - - Execute(dbConnection, "CREATE TABLE IF NOT EXISTS fullTable (time TIMESTAMP TIME, bool_col BOOLEAN FIELD, int32_col INT32 FIELD, int64_col INT64 FIELD, float_col FLOAT FIELD, double_col DOUBLE FIELD, text_col TEXT FIELD, string_col STRING FIELD, blob_col BLOB FIELD, timestamp_col TIMESTAMP FIELD, date_col DATE FIELD) WITH (TTL=315360000000)"); - - string[] insertStatements = new string[] - { - "INSERT INTO fulltable VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689600000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689660000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689720000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, 'Device temperature high alarm', 'DeviceA-Room1', '0x506C616E7444617462', 1735689780000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, 'Device status returned to normal', 'DeviceA-Room1', '0x506C616E7444617461', 1735689840000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, 'Device operating normally', 'DeviceB-Room2', '0x506C616E7444617463', 1735689900000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, 'Device operating normally', 'DeviceB-Room2', '0x506C616E7444617463', 1735689960000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, 'Device humidity low alarm', 'DeviceB-Room2', '0x506C616E7444617464', 1735690020000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690080000, 
true, 108, 10000000008, 37.3, 129.489, 'Device status returned to normal', 'DeviceB-Room2', '0x506C616E7444617463', 1735690080000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 'Device operating normally', 'DeviceC-Room3', '0x506C616E7444617465', 1735690140000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, 'Device operating normally', 'DeviceC-Room3', '0x506C616E7444617465', 1735690200000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, 'Device voltage unstable alarm', 'DeviceC-Room3', '0x506C616E7444617466', 1735690260000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, 'Device status returned to normal', 'DeviceC-Room3', '0x506C616E7444617465', 1735690320000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690380000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690440000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690500000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, 'Device signal interrupted alarm', 'DeviceD-Room4', '0x506C616E7444617468', 1735690560000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690620000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, 'Device operating normally', 'DeviceE-Room5', 
'0x506C616E7444617469', 1735690680000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690740000, '2026-01-04')" - }; - - foreach (var insert in insertStatements) - { - Execute(dbConnection, insert); - } - Console.WriteLine("fulltable setup complete. Begin to query."); - - Query(dbConnection); // Execute query and output results -} -``` - -### 4.2 Python Example -1. To access ODBC via Python, install the `pyodbc` package: - ```Plain - pip install pyodbc - ``` -2. Full code: - - -```Python -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -""" -Apache IoTDB ODBC Python example -Use pyodbc to connect to the IoTDB ODBC driver and perform operations such as query and insert. -For reference, see examples/cpp-example/test.cpp and examples/BasicTest/BasicTest/Program.cs -""" - -import pyodbc - -def execute(conn: pyodbc.Connection, command: str) -> None: - """Executes non-query SQL statements (such as USE, CREATE, INSERT, DELETE, etc.)""" - try: - with conn.cursor() as cursor: - cursor.execute(command) - # INSERT/UPDATE/DELETE require commit; session commands such as USE do not. 
- cmd_upper = command.strip().upper() - if cmd_upper.startswith(("INSERT", "UPDATE", "DELETE")): - conn.commit() - print(f"Execute command: {command}") - except pyodbc.Error as ex: - print(f"CommandText error: {ex}") - -def query(conn: pyodbc.Connection, sql: str) -> None: - """Executes a SELECT query and outputs the results in table format""" - try: - with conn.cursor() as cursor: - cursor.execute(sql) - col_count = len(cursor.description) - print(f"fCount = {col_count}") - - if col_count <= 0: - return - - # Get column names (if the name contains '.', take the last segment, consistent with C++/C# samples). - columns = [] - for i in range(col_count): - col_name = cursor.description[i][0] or f"Column{i}" - if "." in str(col_name): - col_name = str(col_name).split(".")[-1] - columns.append(str(col_name)) - - # Fetch data rows - rows = cursor.fetchall() - - # Simple table output - col_widths = [max(len(str(col)), 4) for col in columns] - for i, row in enumerate(rows): - for j, val in enumerate(row): - if j < len(col_widths): - col_widths[j] = max(col_widths[j], len(str(val) if val is not None else "NULL")) - - # Print header - header = " | ".join(str(c).ljust(col_widths[i]) for i, c in enumerate(columns)) - print(header) - print("-" * len(header)) - - # Print data rows - for row in rows: - values = [] - for i, val in enumerate(row): - if val is None: - cell = "NULL" - else: - cell = str(val) - values.append(cell.ljust(col_widths[i]) if i < len(col_widths) else cell) - print(" | ".join(values)) - - print() - - except pyodbc.Error as ex: - print(f"Query error: {ex}") - -def main() -> None: - dsn = "Apache IoTDB DSN" - user = "root" - password = "root" - server = "127.0.0.1" - database = "test" - connection_string = ( - f"DSN={dsn};Server={server};UID={user};PWD={password};" - f"Database={database};loglevel=4" - ) - - print("Start") - - try: - conn = pyodbc.connect(connection_string) - except pyodbc.Error as ex: - print(f"Login failed: {ex}") - return - - try: - 
driver_name = conn.getinfo(6) # SQL_DRIVER_NAME - print(f"Successfully opened connection. driver = {driver_name}") - except Exception: - print("Successfully opened connection.") - - try: - execute(conn, "CREATE DATABASE IF NOT EXISTS test") - execute(conn, "use test") - print("use test Execute complete. Begin to setup fulltable.") - - # Create the fulltable table and insert test data - execute( - conn, - "CREATE TABLE IF NOT EXISTS fullTable (time TIMESTAMP TIME, bool_col BOOLEAN FIELD, " - "int32_col INT32 FIELD, int64_col INT64 FIELD, float_col FLOAT FIELD, " - "double_col DOUBLE FIELD, text_col TEXT FIELD, string_col STRING FIELD, " - "blob_col BLOB FIELD, timestamp_col TIMESTAMP FIELD, date_col DATE FIELD) " - "WITH (TTL=315360000000)", - ) - insert_statements = [ - "INSERT INTO fulltable VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689600000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689660000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689720000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, 'Device temperature high alarm', 'DeviceA-Room1', '0x506C616E7444617462', 1735689780000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, 'Device status returned to normal', 'DeviceA-Room1', '0x506C616E7444617461', 1735689840000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, 'Device operating normally', 'DeviceB-Room2', '0x506C616E7444617463', 1735689900000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689960000, true, 106, 10000000006, 37.1, 
129.289, 'Device operating normally', 'DeviceB-Room2', '0x506C616E7444617463', 1735689960000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, 'Device humidity low alarm', 'DeviceB-Room2', '0x506C616E7444617464', 1735690020000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, 'Device status returned to normal', 'DeviceB-Room2', '0x506C616E7444617463', 1735690080000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 'Device operating normally', 'DeviceC-Room3', '0x506C616E7444617465', 1735690140000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, 'Device operating normally', 'DeviceC-Room3', '0x506C616E7444617465', 1735690200000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, 'Device voltage unstable alarm', 'DeviceC-Room3', '0x506C616E7444617466', 1735690260000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, 'Device status returned to normal', 'DeviceC-Room3', '0x506C616E7444617465', 1735690320000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690380000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690440000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690500000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, 'Device signal interrupted alarm', 'DeviceD-Room4', '0x506C616E7444617468', 
1735690560000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690620000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690680000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - ] - for insert_sql in insert_statements: - execute(conn, insert_sql) - print("fulltable setup complete. Begin to query.") - query(conn, "select * from fulltable") - print("Query ok") - finally: - conn.close() - -if __name__ == "__main__": - main() -``` - - -### 4.3 C++ Example - - -```C++ -#define WIN32_LEAN_AND_MEAN -#include <windows.h> - -#include <sql.h> -#include <sqlext.h> -#include <cstdio> -#include <cstdlib> -#include <iostream> -#include <string> -#include <vector> - -#ifndef SQL_DIAG_COLUMN_SIZE -#define SQL_DIAG_COLUMN_SIZE 33L -#endif - -// Error handling function: prints ODBC diagnostics and exits on fatal errors -void CheckOdbcError(SQLRETURN retCode, SQLSMALLINT handleType, SQLHANDLE handle, const char* functionName) { - if (retCode == SQL_SUCCESS || retCode == SQL_SUCCESS_WITH_INFO) { - return; - } - - SQLCHAR sqlState[6]; - SQLCHAR message[SQL_MAX_MESSAGE_LENGTH]; - SQLINTEGER nativeError; - SQLSMALLINT textLength; - SQLRETURN errRet; - errRet = SQLGetDiagRec(handleType, handle, 1, sqlState, &nativeError, message, sizeof(message), &textLength); - - std::cerr << "ODBC Error in " << functionName << ":\n"; - std::cerr << " SQL State: " << sqlState << "\n"; - std::cerr << " 
Native Error: " << nativeError << "\n"; - std::cerr << " Message: " << message << "\n"; - std::cerr << " SQLGetDiagRec Return: " << errRet << "\n"; - - if (retCode == SQL_ERROR || retCode == SQL_INVALID_HANDLE) { - exit(1); - } -} - -// Simplified table output - displays basic data only -void PrintSimpleTable(const std::vector& headers, - const std::vector>& rows) { - // Print header row - for (size_t i = 0; i < headers.size(); i++) { - std::cout << headers[i]; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - // Print separator line - for (size_t i = 0; i < headers.size(); i++) { - std::cout << "----------------"; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - // Print data rows - for (const auto& row : rows) { - for (size_t i = 0; i < row.size(); i++) { - std::cout << row[i]; - if (i < row.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - } - std::cout << std::endl; -} - -/// Executes a SELECT query and outputs the results of fulltable in table format -void Query(SQLHDBC hDbc) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN ret = SQL_SUCCESS; - - try { - // Allocate statement handle - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - return; - } - - // Execute query - const std::string sqlQuery = "select * from fulltable"; - std::cout << "Execute query: " << sqlQuery << std::endl; - - ret = SQLExecDirect(hStmt, reinterpret_cast(const_cast(sqlQuery.c_str())), SQL_NTS); - if (!SQL_SUCCEEDED(ret)) { - if (ret != SQL_NO_DATA) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect(SELECT)"); - } - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - // Get column count - SQLSMALLINT colCount = 0; - ret = SQLNumResultCols(hStmt, &colCount); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLNumResultCols"); - 
SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::cout << "Column count = " << colCount << std::endl; - - // If no columns, return directly - if (colCount <= 0) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - // Get column names and type information - std::vector columnNames; - std::vector columnTypes(colCount); - std::vector columnSizes(colCount); - std::vector decimalDigits(colCount); - std::vector nullable(colCount); - - // Get basic column information - for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLSMALLINT nameLength = 0; - ret = SQLDescribeCol(hStmt, i, NULL, 0, &nameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get length)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::vector colNameBuffer(nameLength + 1); - SQLSMALLINT actualNameLength = 0; - - ret = SQLDescribeCol(hStmt, i, colNameBuffer.data(), nameLength + 1, - &actualNameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get name)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::string fullName(reinterpret_cast(colNameBuffer.data())); - - size_t pos = fullName.find_last_of('.'); - if (pos != std::string::npos) { - columnNames.push_back(fullName.substr(pos + 1)); - } else { - columnNames.push_back(fullName); - } - - ret = SQLDescribeCol(hStmt, i, NULL, 0, NULL, &columnTypes[i-1], - &columnSizes[i-1], &decimalDigits[i-1], &nullable[i-1]); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get type info)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - } - - std::vector> tableRows; - - int rowCount = 0; - // Fetch data for every row - while (true) { - ret = SQLFetch(hStmt); - if (ret == SQL_NO_DATA) { - break; - } - - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLFetch"); - break; - } - - std::vector row; - 
- for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLLEN indicator = 0; - std::string valueStr; - - SQLSMALLINT cType; - size_t bufferSize; - bool isCharacterType = false; - const int maxBufferSize = 32768; - - switch (columnTypes[i-1]) { - case SQL_CHAR: - case SQL_VARCHAR: - case SQL_LONGVARCHAR: - case SQL_WCHAR: - case SQL_WVARCHAR: - case SQL_WLONGVARCHAR: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_DECIMAL: - case SQL_NUMERIC: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_INTEGER: - case SQL_SMALLINT: - case SQL_TINYINT: - case SQL_BIGINT: - cType = SQL_C_SBIGINT; - bufferSize = sizeof(SQLBIGINT); - break; - - case SQL_REAL: - case SQL_FLOAT: - case SQL_DOUBLE: - cType = SQL_C_DOUBLE; - bufferSize = sizeof(double); - break; - - case SQL_BIT: - cType = SQL_C_BIT; - bufferSize = sizeof(SQLCHAR); - break; - - case SQL_DATE: - case SQL_TYPE_DATE: - cType = SQL_C_DATE; - bufferSize = sizeof(SQL_DATE_STRUCT); - break; - - case SQL_TIME: - case SQL_TYPE_TIME: - cType = SQL_C_TIME; - bufferSize = sizeof(SQL_TIME_STRUCT); - break; - - case SQL_TIMESTAMP: - case SQL_TYPE_TIMESTAMP: - cType = SQL_C_TIMESTAMP; - bufferSize = sizeof(SQL_TIMESTAMP_STRUCT); - break; - - default: - cType = SQL_C_CHAR; - bufferSize = 256; - isCharacterType = true; - break; - } - - std::vector buffer(bufferSize); - - ret = SQLGetData(hStmt, i, cType, buffer.data(), bufferSize, &indicator); - - if (indicator == SQL_NULL_DATA) { - valueStr = "NULL"; - } - else if (ret != SQL_SUCCESS) { - valueStr = "ERR_CONV"; - } - else { - if (cType == SQL_C_CHAR) { - valueStr = reinterpret_cast(buffer.data()); - } - else if (cType == SQL_C_SBIGINT) { - SQLBIGINT 
intVal = *reinterpret_cast(buffer.data()); - valueStr = std::to_string(intVal); - } - else if (cType == SQL_C_DOUBLE) { - double doubleVal = *reinterpret_cast(buffer.data()); - valueStr = std::to_string(doubleVal); - } - else if (cType == SQL_C_BIT) { - valueStr = (*buffer.data() != 0) ? "TRUE" : "FALSE"; - } - else if (cType == SQL_C_DATE) { - SQL_DATE_STRUCT* date = reinterpret_cast(buffer.data()); - char dateStr[20]; - snprintf(dateStr, sizeof(dateStr), "%04d-%02d-%02d", - date->year, date->month, date->day); - valueStr = dateStr; - } - else if (cType == SQL_C_TIME) { - SQL_TIME_STRUCT* time = reinterpret_cast(buffer.data()); - char timeStr[15]; - snprintf(timeStr, sizeof(timeStr), "%02d:%02d:%02d", - time->hour, time->minute, time->second); - valueStr = timeStr; - } - else if (cType == SQL_C_TIMESTAMP) { - SQL_TIMESTAMP_STRUCT* ts = reinterpret_cast(buffer.data()); - char tsStr[30]; - snprintf(tsStr, sizeof(tsStr), "%04d-%02d-%02d %02d:%02d:%02d.%06d", - ts->year, ts->month, ts->day, - ts->hour, ts->minute, ts->second, - ts->fraction / 1000); - valueStr = tsStr; - } - else { - valueStr = "UNKNOWN_TYPE"; - } - - if (isCharacterType && ret == SQL_SUCCESS_WITH_INFO) { - SQLLEN actualSize = 0; - SQLGetDiagField(SQL_HANDLE_STMT, hStmt, 0, SQL_DIAG_COLUMN_SIZE, - &actualSize, SQL_IS_INTEGER, NULL); - - if (indicator > 0 && static_cast(indicator) > bufferSize - 1) { - valueStr += "..."; - } - } - - } - - row.push_back(valueStr); - } - - tableRows.push_back(row); - } - - if (!tableRows.empty()) { - PrintSimpleTable(columnNames, tableRows); - } - - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (const std::exception& ex) { - std::cerr << "Exception: " << ex.what() << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } - catch (...) 
{ - std::cerr << "Unknown exception occurred" << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -/// Executes non-query SQL statements (such as CREATE DATABASE, CREATE TABLE, INSERT, etc.) -void Execute(SQLHDBC hDbc, const std::string& command) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN ret; - - try { - // Allocate statement handle - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - - // Execute command - ret = SQLExecDirect(hStmt, (SQLCHAR*)command.c_str(), SQL_NTS); - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect"); - } - - // Free statement handle - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (...) { - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -int main() { - SQLHENV hEnv = SQL_NULL_HENV; - SQLHDBC hDbc = SQL_NULL_HDBC; - SQLRETURN ret; - - try { - std::cout << "Start" << std::endl; - - // 1. Initialize ODBC environment - ret = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &hEnv); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_ENV)"); - - ret = SQLSetEnvAttr(hEnv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLSetEnvAttr"); - - // 2. 
Establish connection - ret = SQLAllocHandle(SQL_HANDLE_DBC, hEnv, &hDbc); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_DBC)"); - - // Connection string - std::string dsn = "Apache IoTDB DSN"; - std::string user = "root"; - std::string password = "root"; - std::string server = "127.0.0.1"; - std::string database = "test"; - - std::string connectionString = "DSN=" + dsn + ";Server=" + server + - ";UID=" + user + ";PWD=" + password + - ";Database=" + database + ";loglevel=4"; - std::cout << "Using connection string: " << connectionString << std::endl; - - SQLCHAR outConnStr[1024]; - SQLSMALLINT outConnStrLen; - - ret = SQLDriverConnect(hDbc, NULL, - (SQLCHAR*)connectionString.c_str(), SQL_NTS, - outConnStr, sizeof(outConnStr), - &outConnStrLen, SQL_DRIVER_COMPLETE); - - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - std::cerr << "Login failed" << std::endl; - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLDriverConnect"); - return 1; - } - - // Get driver name - SQLCHAR driverName[256]; - SQLSMALLINT nameLength; - ret = SQLGetInfo(hDbc, SQL_DRIVER_NAME, driverName, sizeof(driverName), &nameLength); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLGetInfo"); - - std::cout << "Successfully opened connection. driver name = " << driverName << std::endl; - - // 3. Execute operations - Execute(hDbc, "CREATE DATABASE IF NOT EXISTS test"); - Execute(hDbc, "use test"); - std::cout << "use test Execute complete. Begin to setup fulltable." 
<< std::endl; - - // Create fulltable table and insert test data - Execute(hDbc, "CREATE TABLE IF NOT EXISTS fullTable (time TIMESTAMP TIME, bool_col BOOLEAN FIELD, int32_col INT32 FIELD, int64_col INT64 FIELD, float_col FLOAT FIELD, double_col DOUBLE FIELD, text_col TEXT FIELD, string_col STRING FIELD, blob_col BLOB FIELD, timestamp_col TIMESTAMP FIELD, date_col DATE FIELD) WITH (TTL=315360000000)"); - const char* insertStatements[] = { - "INSERT INTO fulltable VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689600000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689660000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, 'Device operating normally', 'DeviceA-Room1', '0x506C616E7444617461', 1735689720000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, 'Device temperature high alarm', 'DeviceA-Room1', '0x506C616E7444617462', 1735689780000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, 'Device status returned to normal', 'DeviceA-Room1', '0x506C616E7444617461', 1735689840000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, 'Device operating normally', 'DeviceB-Room2', '0x506C616E7444617463', 1735689900000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, 'Device operating normally', 'DeviceB-Room2', '0x506C616E7444617463', 1735689960000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, 'Device humidity low alarm', 'DeviceB-Room2', '0x506C616E7444617464', 1735690020000, '2026-01-04')", - "INSERT INTO fulltable VALUES 
(1735690080000, true, 108, 10000000008, 37.3, 129.489, 'Device status returned to normal', 'DeviceB-Room2', '0x506C616E7444617463', 1735690080000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 'Device operating normally', 'DeviceC-Room3', '0x506C616E7444617465', 1735690140000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, 'Device operating normally', 'DeviceC-Room3', '0x506C616E7444617465', 1735690200000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, 'Device voltage unstable alarm', 'DeviceC-Room3', '0x506C616E7444617466', 1735690260000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, 'Device status returned to normal', 'DeviceC-Room3', '0x506C616E7444617465', 1735690320000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690380000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690440000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, 'Device operating normally', 'DeviceD-Room4', '0x506C616E7444617467', 1735690500000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, 'Device signal interrupted alarm', 'DeviceD-Room4', '0x506C616E7444617468', 1735690560000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690620000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, 'Device operating normally', 
'DeviceE-Room5', '0x506C616E7444617469', 1735690680000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', '0x506C616E7444617469', 1735690740000, '2026-01-04')" - }; - for (const char* sql : insertStatements) { - Execute(hDbc, sql); - } - std::cout << "fulltable setup complete. Begin to query." << std::endl; - Query(hDbc); - std::cout << "Query ok" << std::endl; - - // 4. Clean up resources - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - - return 0; - } - catch (...) { - // Exception cleanup - if (hDbc != SQL_NULL_HDBC) { - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - } - if (hEnv != SQL_NULL_HENV) { - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - } - - std::cerr << "Unexpected error!" << std::endl; - return 1; - } -} -``` - -### 4.4 PowerBI Example -1. Open PowerBI Desktop and create a new project. -2. Click "Home" → "Get Data" → "More..." → "ODBC" → Click the "Connect" button. -3. Data Source Selection: In the pop-up window, select "Data Source Name (DSN)" and choose `Apache IoTDB DSN` from the dropdown. -4. Advanced Configuration: - * Click "Advanced options" and fill in the configuration in the "Connection string" input box (example): - ```Plain - server=127.0.0.1;port=6667;database=test;isTableModel=true;loglevel=4 - ``` - * Notes: - * The `dsn` item is optional; filling it in or not does not affect the connection. - * `loglevel` ranges from 0-4: Level 0 (ERROR) has the least logs, Level 4 (TRACE) has the most detailed logs; set as needed. - * `server`/`database`/`dsn`/`loglevel` are case-insensitive (e.g., can be written as `Server/DATABASE`). 
- * If relevant information is configured in the DSN, you do not need to fill in any configuration information; the Driver Manager will automatically use the configuration filled in the DSN. -5. Authentication: Enter the username (default `root`) and password (default `root`), then click "Connect". -6. Data Loading: Select the table to be called in the interface (e.g., fulltable/table1) , and then click "Load" to view the data. - -### 4.5 Excel Example -1. Open Excel and create a blank workbook. -2. Click the "Data" tab → "From Other Sources" → "From Data Connection Wizard". -3. Data Source Selection: Select "ODBC DSN" → Next → Select `Apache IoTDB DSN` → Next. -4. Connection Configuration: - * The input process for connection string, username, and password is exactly the same as in PowerBI. Reference format for connection string: - ```Plain - server=127.0.0.1;port=6667;database=test;isTableModel=true;loglevel=4 - ``` - * If relevant information is configured in the DSN, you do not need to fill in any configuration information; the Driver Manager will automatically use the configuration filled in the DSN. -5. Table Selection: Choose the database and table you wish to access (e.g., fulltable), then click Next. -6. Save Connection: Customize settings for the data connection file name, connection description, etc., then click "Finish". -7. Import Data: Select the location to import the data into the worksheet (e.g., cell A1 of "Existing Worksheet"), click "OK" to complete data loading. \ No newline at end of file diff --git a/src/UserGuide/latest-Table/API/Programming-Python-Native-API_timecho.md b/src/UserGuide/latest-Table/API/Programming-Python-Native-API_timecho.md deleted file mode 100644 index 364bb31ba..000000000 --- a/src/UserGuide/latest-Table/API/Programming-Python-Native-API_timecho.md +++ /dev/null @@ -1,752 +0,0 @@ - -# Python Native API - -IoTDB provides a Python native client driver and a session pool management mechanism. 
These tools allow developers to interact with IoTDB in a programmatic and efficient manner. Using the Python API, developers can encapsulate time-series data into objects (e.g., `Tablet`, `NumpyTablet`) and insert them into the database directly, without the need to manually construct SQL statements. For multi-threaded operations, the `TableSessionPool` is recommended to optimize resource utilization and enhance performance. - -## 1. Prerequisites - -To use the IoTDB Python API, install the required package using pip: - -```shell -pip3 install "apache-iotdb>=2.0" -``` -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. - -## 2. Read and Write Operations - -### 2.1 TableSession - -`TableSession` is a core class in IoTDB, enabling users to interact with the IoTDB database. It provides methods to execute SQL statements, insert data, and manage database sessions. - -#### Method Overview - -| **Method Name** | **Description** | **Parameter Type** | **Return Type** | -| --------------------------- | ----------------------------------------------------- | ------------------------------------ | ---------------- | -| insert | Inserts data into the database. | tablet: `Union[Tablet, NumpyTablet]` | None | -| execute_non_query_statement | Executes non-query SQL statements like DDL/DML. | sql: `str` | None | -| execute_query_statement | Executes a query SQL statement and retrieves results. | sql: `str` | `SessionDataSet` | -| close | Closes the session and releases resources. | None | None | - -**Since V2.0.8.2**, `SessionDataSet` provides methods for batch DataFrame retrieval to efficiently handle large-volume queries: - -```python -# Batch DataFrame retrieval -has_next = result.has_next_df() -if has_next: - df = result.next_df() - # Process DataFrame -``` - -**Method Details:** -- `has_next_df()`: Returns `True`/`False` indicating whether more data exists -- `next_df()`: Returns a `DataFrame` or `None`. 
Each call returns `fetchSize` rows (default: 5000 rows, controlled by Session's `fetch_size` parameter): - - If remaining data ≥ `fetchSize`: returns `fetchSize` rows - - If remaining data < `fetchSize`: returns all remaining rows - - If traversal completes: returns `None` -- Session validates `fetchSize` at initialization: if ≤0, resets to 5000 and logs a warning: `fetch_size xxx is illegal, use default fetch_size 5000` - -**Note:** Avoid mixing different traversal methods (e.g., combining `todf()` with `next_df()`), which may cause unexpected errors. - -**Since V2.0.8.3**, the Python client supports `TSDataType.OBJECT` for Tablet batch write and Session value serialization. Query results are read via the `Field` object. The related interfaces are defined as follows: - -| Function Name | Description | Parameters | Return Value | -|---------------|-------------|------------|--------------| -| `encode_object_cell` | Encodes a single OBJECT cell into wire-format bytes | `is_eof: bool`, `offset: int`, `content: bytes` | `bytes`: `\|[eof 1B]\|[offset 8B BE]\|[payload]\|` | -| `decode_object_cell` | Parses a wire-format cell back into `eof`, `offset`, and `payload` | `cell: bytes` (length ≥ 9) | `Tuple[bool, int, bytes]`: `(is_eof, offset, payload)` | -| `Tablet.add_value_object` | Writes an OBJECT cell at the specified row and column (internally calls `encode_object_cell`) | `row_index: int`, `column_index: int`, `is_eof: bool`, `offset: int`, `content: bytes` | `None` | -| `Tablet.add_value_object_by_name` | Same as `Tablet.add_value_object`, but locates the column by name | `column_name: str`, `row_index: int`, `is_eof: bool`, `offset: int`, `content: bytes` | `None` | -| `NumpyTablet.add_value_object` | Same semantics as `Tablet.add_value_object`; column data is stored as `ndarray` | Same as `Tablet.add_value_object` (`row_index`, `column_index`, ...) | `None` | -| `Field.get_object_value` | Converts the value to a Python value based on the **target type** | `data_type: TSDataType` | Depends on type; for OBJECT: `str` decoded from the entire `self.value` in UTF-8 (see Field.py) | -| `Field.get_string_value` | Returns a string representation | None | `str`; for OBJECT: `self.value.decode("utf-8")` | -| `Field.get_binary_value` | Gets the binary data of TEXT/STRING/BLOB | None | `bytes` or `None`; **throws an error for OBJECT columns and should not be called** | - - -#### Sample Code - -```Python -class TableSession(object): -def insert(self, tablet: Union[Tablet, NumpyTablet]): - """ - Insert data into the database. - - Parameters: - tablet (Tablet | NumpyTablet): The tablet containing the data to be inserted. - Accepts either a `Tablet` or `NumpyTablet`. - - Raises: - IoTDBConnectionException: If there is an issue with the database connection. - """ - pass - -def execute_non_query_statement(self, sql: str): - """ - Execute a non-query SQL statement. - - Parameters: - sql (str): The SQL statement to execute. Typically used for commands - such as INSERT, DELETE, or UPDATE. - - Raises: - IoTDBConnectionException: If there is an issue with the database connection. - """ - pass - -def execute_query_statement(self, sql: str, timeout_in_ms: int = 0) -> "SessionDataSet": - """ - Execute a query SQL statement and return the result set. - - Parameters: - sql (str): The SQL query to execute. - timeout_in_ms (int, optional): Timeout for the query in milliseconds. Defaults to 0, - which means no timeout. - - Returns: - SessionDataSet: The result set of the query. - - Raises: - IoTDBConnectionException: If there is an issue with the database connection. - """ - pass - -def close(self): - """ - Close the session and release resources. - - Raises: - IoTDBConnectionException: If there is an issue closing the connection. - """ - pass -``` - -### 2.2 TableSessionConfig - -`TableSessionConfig` is a configuration class that sets parameters for creating a `TableSession` instance, defining essential settings for connecting to the IoTDB database. - -#### Parameter Configuration - -| **Parameter** | **Description** | **Type** | **Default Value** | -| ------------------ | ------------------------------------- | -------- |-----------------------------------------------| -| node_urls | List of database node URLs. 
| `list` | `["localhost:6667"]` | -| username | Username for the database connection. | `str` | `"root"` | -| password | Password for the database connection. | `str` | `"TimechoDB@2021"` (`"root"` before V2.0.6) | -| database | Target database to connect to. | `str` | `None` | -| fetch_size | Number of rows to fetch per query. | `int` | `5000` | -| time_zone | Default session time zone. | `str` | `Session.DEFAULT_ZONE_ID` | -| enable_compression | Enable data compression. | `bool` | `False` | - -#### Sample Code - -```Python -class TableSessionConfig(object): - """ - Configuration class for a TableSession. - - This class defines various parameters for connecting to and interacting - with the IoTDB tables. - """ - - def __init__( - self, - node_urls: list = None, - username: str = Session.DEFAULT_USER, - password: str = Session.DEFAULT_PASSWORD, - database: str = None, - fetch_size: int = 5000, - time_zone: str = Session.DEFAULT_ZONE_ID, - enable_compression: bool = False, - ): - """ - Initialize a TableSessionConfig object with the provided parameters. - - Parameters: - node_urls (list, optional): A list of node URLs for the database connection. - Defaults to ["localhost:6667"]. - username (str, optional): The username for the database connection. - Defaults to "root". - password (str, optional): The password for the database connection. - Defaults to "TimechoDB@2021" ("root" before V2.0.6). - database (str, optional): The target database to connect to. Defaults to None. - fetch_size (int, optional): The number of rows to fetch per query. Defaults to 5000. - time_zone (str, optional): The default time zone for the session. - Defaults to Session.DEFAULT_ZONE_ID. - enable_compression (bool, optional): Whether to enable data compression. - Defaults to False. - """ -``` - -**Note:** After using a `TableSession`, make sure to call the `close` method to release resources. - -## 3. 
Session Pool - -### 3.1 TableSessionPool - -`TableSessionPool` is a session pool management class designed for creating and managing `TableSession` instances. It provides functionality to retrieve sessions from the pool and close the pool when it is no longer needed. - -#### Method Overview - -| **Method Name** | **Description** | **Return Type** | **Exceptions** | -| --------------- | ------------------------------------------------------ | --------------- | -------------- | -| get_session | Retrieves a new `TableSession` instance from the pool. | `TableSession` | None | -| close | Closes the session pool and releases all resources. | None | None | - -#### Sample Code - -```Python -def get_session(self) -> TableSession: - """ - Retrieve a new TableSession instance. - - Returns: - TableSession: A new session object configured with the session pool. - - Notes: - The session is initialized with the underlying session pool for managing - connections. Ensure proper usage of the session's lifecycle. - """ - -def close(self): - """ - Close the session pool and release all resources. - - This method closes the underlying session pool, ensuring that all - resources associated with it are properly released. - - Notes: - After calling this method, the session pool cannot be used to retrieve - new sessions, and any attempt to do so may raise an exception. - """ -``` - -### 3.2 TableSessionPoolConfig - -`TableSessionPoolConfig` is a configuration class used to define parameters for initializing and managing a `TableSessionPool` instance. It specifies the settings needed for efficient session pool management in IoTDB. - -#### Parameter Configuration - -| **Parameter** | **Description** | **Type** | **Default Value** | -| ------------------ | ------------------------------------------------------------ | -------- | -------------------------- | -| node_urls | List of IoTDB cluster node URLs. 
| `list` | None | -| max_pool_size | Maximum size of the session pool, i.e., the maximum number of sessions allowed in the pool. | `int` | `5` | -| username | Username for the connection. | `str` | `Session.DEFAULT_USER` | -| password | Password for the connection. | `str` | `Session.DEFAULT_PASSWORD` | -| database | Target database to connect to. | `str` | None | -| fetch_size | Fetch size for query results | `int` | `5000` | -| time_zone | Timezone-related `ZoneId` | `str` | `Session.DEFAULT_ZONE_ID` | -| enable_redirection | Whether to enable redirection. | `bool` | `False` | -| enable_compression | Whether to enable data compression. | `bool` | `False` | -| wait_timeout_in_ms | Sets the connection timeout in milliseconds. | `int` | `10000` | -| max_retry | Maximum number of connection retry attempts. | `int` | `3` | - -#### Sample Code - -```Python -class TableSessionPoolConfig(object): - """ - Configuration class for a TableSessionPool. - - This class defines the parameters required to initialize and manage - a session pool for interacting with the IoTDB database. - """ - def __init__( - self, - node_urls: list = None, - max_pool_size: int = 5, - username: str = Session.DEFAULT_USER, - password: str = Session.DEFAULT_PASSWORD, - database: str = None, - fetch_size: int = 5000, - time_zone: str = Session.DEFAULT_ZONE_ID, - enable_redirection: bool = False, - enable_compression: bool = False, - wait_timeout_in_ms: int = 10000, - max_retry: int = 3, - ): - """ - Initialize a TableSessionPoolConfig object with the provided parameters. - - Parameters: - node_urls (list, optional): A list of node URLs for the database connection. - Defaults to None. - max_pool_size (int, optional): The maximum number of sessions in the pool. - Defaults to 5. - username (str, optional): The username for the database connection. - Defaults to Session.DEFAULT_USER. - password (str, optional): The password for the database connection. - Defaults to Session.DEFAULT_PASSWORD. 
- database (str, optional): The target database to connect to. Defaults to None. - fetch_size (int, optional): The number of rows to fetch per query. Defaults to 5000. - time_zone (str, optional): The default time zone for the session pool. - Defaults to Session.DEFAULT_ZONE_ID. - enable_redirection (bool, optional): Whether to enable redirection. - Defaults to False. - enable_compression (bool, optional): Whether to enable data compression. - Defaults to False. - wait_timeout_in_ms (int, optional): The maximum time (in milliseconds) to wait for a session - to become available. Defaults to 10000. - max_retry (int, optional): The maximum number of retry attempts for operations. Defaults to 3. - - """ -``` - -**Notes:** - -- Ensure that `TableSession` instances retrieved from the `TableSessionPool` are properly closed after use. -- After closing the `TableSessionPool`, it will no longer be possible to retrieve new sessions. - -### 3.3 SSL Connection - -#### 3.3.1 Server Certificate Configuration - -In the `conf/iotdb-system.properties` configuration file, locate or add the following configuration items: - -``` -enable_thrift_ssl=true -key_store_path=/path/to/your/server_keystore.jks -key_store_pwd=your_keystore_password -``` - -#### 3.3.2 Configure Python Client Certificate - -- Set `use_ssl` to True to enable SSL. -- Specify the client certificate path using the `ca_certs` parameter. - -``` -use_ssl = True -ca_certs = "/path/to/your/server.crt" # or ca_certs = "/path/to/your/ca_cert.pem" -``` -**Example Code: Using SSL to Connect to IoTDB** - -```Python -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. 
You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, -# software distributed under the License is distributed on an -# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -# KIND, either express or implied. See the License for the -# specific language governing permissions and limitations -# under the License. -# - -from iotdb.SessionPool import PoolConfig, SessionPool -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021" # before V2.0.6 the default password is "root" -# Configure SSL enabled -use_ssl = True -# Configure certificate path -ca_certs = "/path/server.crt" - - -def get_data(): - session = Session( - ip, port_, username_, password_, use_ssl=use_ssl, ca_certs=ca_certs - ) - session.open(False) - with session.execute_query_statement("SHOW DATABASES") as session_data_set: - print(session_data_set.get_column_names()) - while session_data_set.has_next(): - print(session_data_set.next()) - - session.close() - - -def get_data2(): - pool_config = PoolConfig( - host=ip, - port=port_, - user_name=username_, - password=password_, - fetch_size=1024, - time_zone="UTC+8", - max_retry=3, - use_ssl=use_ssl, - ca_certs=ca_certs, - ) - max_pool_size = 5 - wait_timeout_in_ms = 3000 - session_pool = SessionPool(pool_config, max_pool_size, wait_timeout_in_ms) - session = session_pool.get_session() - with session.execute_query_statement("SHOW DATABASES") as session_data_set: - print(session_data_set.get_column_names()) - while session_data_set.has_next(): - print(session_data_set.next()) - session_pool.put_back(session) - session_pool.close() - - -if __name__ == "__main__": - get_data() -``` - -## 4. Sample Code - -**Session** Example: You can find the full example code at [Session Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/table_model_session_example.py). 
- -**Session Pool** Example: You can find the full example code at [SessionPool Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/table_model_session_pool_example.py). - -Here is an excerpt of the sample code: - -```Python -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, -# software distributed under the License is distributed on an -# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -# KIND, either express or implied. See the License for the -# specific language governing permissions and limitations -# under the License. 
-# -import threading - -import numpy as np - -from iotdb.table_session_pool import TableSessionPool, TableSessionPoolConfig -from iotdb.utils.IoTDBConstants import TSDataType -from iotdb.utils.NumpyTablet import NumpyTablet -from iotdb.utils.Tablet import ColumnType, Tablet - - -def prepare_data(): - print("create database") - # Get a session from the pool - session = session_pool.get_session() - session.execute_non_query_statement("CREATE DATABASE IF NOT EXISTS db1") - session.execute_non_query_statement('USE "db1"') - session.execute_non_query_statement( - "CREATE TABLE table0 (id1 string id, attr1 string attribute, " - + "m1 double " - + "field)" - ) - session.execute_non_query_statement( - "CREATE TABLE table1 (id1 string tag, attr1 string attribute, " - + "m1 double " - + "field)" - ) - - print("now the tables are:") - # show result - with session.execute_query_statement("SHOW TABLES") as res: - while res.has_next(): - print(res.next()) - - session.close() - - -def insert_data(num: int): - print("insert data for table" + str(num)) - # Get a session from the pool - session = session_pool.get_session() - column_names = [ - "id1", - "attr1", - "m1", - ] - data_types = [ - TSDataType.STRING, - TSDataType.STRING, - TSDataType.DOUBLE, - ] - column_types = [ColumnType.TAG, ColumnType.ATTRIBUTE, ColumnType.FIELD] - timestamps = [] - values = [] - for row in range(15): - timestamps.append(row) - values.append(["id:" + str(row), "attr:" + str(row), row * 1.0]) - tablet = Tablet( - "table" + str(num), column_names, data_types, values, timestamps, column_types - ) - session.insert(tablet) - session.execute_non_query_statement("FLush") - - np_timestamps = np.arange(15, 30, dtype=np.dtype(">i8")) - np_values = [ - np.array(["id:{}".format(i) for i in range(15, 30)]), - np.array(["attr:{}".format(i) for i in range(15, 30)]), - np.linspace(15.0, 29.0, num=15, dtype=TSDataType.DOUBLE.np_dtype()), - ] - - np_tablet = NumpyTablet( - "table" + str(num), - column_names, - 
data_types, - np_values, - np_timestamps, - column_types=column_types, - ) - session.insert(np_tablet) - session.close() - - -def query_data(): - # Get a session from the pool - session = session_pool.get_session() - - print("get data from table0") - with session.execute_query_statement("select * from table0") as res: - while res.has_next(): - print(res.next()) - - print("get data from table1") - with session.execute_query_statement("select * from table1") as res: - while res.has_next(): - print(res.next()) - - # Querying Table Data Using Batch DataFrame (Recommended for Large Datasets) - print("get data from table0 using batch DataFrame") - with session.execute_query_statement("select * from table0") as res: - while res.has_next_df(): - print(res.next_df()) - - session.close() - - -def delete_data(): - session = session_pool.get_session() - session.execute_non_query_statement("drop database db1") - print("data has been deleted. now the databases are:") - with session.execute_query_statement("show databases") as res: - while res.has_next(): - print(res.next()) - session.close() - - -# Create a session pool -username = "root" -password = "TimechoDB@2021" # before V2.0.6 the default password is "root" -node_urls = ["127.0.0.1:6667", "127.0.0.1:6668", "127.0.0.1:6669"] -fetch_size = 1024 -database = "db1" -max_pool_size = 5 -wait_timeout_in_ms = 3000 -config = TableSessionPoolConfig( - node_urls=node_urls, - username=username, - password=password, - database=database, - max_pool_size=max_pool_size, - fetch_size=fetch_size, - wait_timeout_in_ms=wait_timeout_in_ms, -) -session_pool = TableSessionPool(config) - -prepare_data() - -insert_thread1 = threading.Thread(target=insert_data, args=(0,)) -insert_thread2 = threading.Thread(target=insert_data, args=(1,)) - -insert_thread1.start() -insert_thread2.start() - -insert_thread1.join() -insert_thread2.join() - -query_data() -delete_data() -session_pool.close() -print("example is finished!") -``` - -**Object Type Usage Example** - -```python 
-import os - -import numpy as np -import pytest - -from iotdb.utils.IoTDBConstants import TSDataType -from iotdb.utils.NumpyTablet import NumpyTablet -from iotdb.utils.Tablet import Tablet, ColumnType -from iotdb.utils.object_column import decode_object_cell - - -def _require_thrift(): - pytest.importorskip("iotdb.thrift.common.ttypes") - - -def _session_endpoint(): - host = os.environ.get("IOTDB_HOST", "127.0.0.1") - port = int(os.environ.get("IOTDB_PORT", "6667")) - return host, port - - -@pytest.fixture(scope="module") -def table_session(): - _require_thrift() - from iotdb.Session import Session - from iotdb.table_session import TableSession, TableSessionConfig - - host, port = _session_endpoint() - cfg = TableSessionConfig( - node_urls=[f"{host}:{port}"], - username=os.environ.get("IOTDB_USER", Session.DEFAULT_USER), - password=os.environ.get("IOTDB_PASSWORD", Session.DEFAULT_PASSWORD), - ) - ts = TableSession(cfg) - yield ts - ts.close() - - -def test_table_numpy_tablet_object_columns(table_session): - """ - Table model: Tablet.add_value_object / add_value_object_by_name, - NumpyTablet.add_value_object, insert + query Field + decode_object_cell; - Also includes writing OBJECT in two segments at the same timestamp - (first with is_eof=False/offset=0, then with is_eof=True/offset=length of the first segment), - and verifies the complete concatenated bytes using read_object(f1). 
- """ - db = "test_py_object_e2e" - table = "obj_tbl" - table_session.execute_non_query_statement(f"CREATE DATABASE IF NOT EXISTS {db}") - table_session.execute_non_query_statement(f"USE {db}") - table_session.execute_non_query_statement(f"DROP TABLE IF EXISTS {table}") - table_session.execute_non_query_statement( - f"CREATE TABLE {table}(" - "device STRING TAG, temp FLOAT FIELD, f1 OBJECT FIELD, f2 OBJECT FIELD)" - ) - - column_names = ["device", "temp", "f1", "f2"] - data_types = [ - TSDataType.STRING, - TSDataType.FLOAT, - TSDataType.OBJECT, - TSDataType.OBJECT, - ] - column_types = [ - ColumnType.TAG, - ColumnType.FIELD, - ColumnType.FIELD, - ColumnType.FIELD, - ] - timestamps = [100, 200] - values = [ - ["d1", 1.5, None, None], - ["d1", 2.5, None, None], - ] - - tablet = Tablet( - table, column_names, data_types, values, timestamps, column_types - ) - tablet.add_value_object(0, 2, True, 0, b"first-row-obj") - # Single-segment write for the entire object: is_eof=True and offset=0; - # Segmented sequential writes must pass server-side offset/length validation - tablet.add_value_object_by_name("f2", 0, True, 0, b"seg") - tablet.add_value_object(1, 2, True, 0, b"second-f1") - tablet.add_value_object(1, 3, True, 0, b"second-f2") - table_session.insert(tablet) - - ts_arr = np.array([300, 400], dtype=TSDataType.INT64.np_dtype()) - np_vals = [ - np.array(["d1", "d1"]), - np.array([1.0, 2.0], dtype=np.float32), - np.array([None, None], dtype=object), - np.array([None, None], dtype=object), - ] - np_tab = NumpyTablet( - table, column_names, data_types, np_vals, ts_arr, column_types=column_types - ) - np_tab.add_value_object(0, 2, True, 0, b"np-r0-f1") - np_tab.add_value_object(0, 3, True, 0, b"np-r0-f2") - np_tab.add_value_object(1, 2, True, 0, b"np-r1-f1") - np_tab.add_value_object(1, 3, True, 0, b"np-r1-f2") - table_session.insert(np_tab) - - # Segmented OBJECT: first with is_eof=False (continue transmission), - # then with is_eof=True (last segment); offset is the 
length of written bytes - chunk0 = bytes((i % 256) for i in range(512)) - chunk1 = b"\xab" * 64 - expected_segmented = chunk0 + chunk1 - seg1 = Tablet( - table, - column_names, - data_types, - [["d1", 3.0, None, None]], - [500], - column_types, - ) - seg1.add_value_object(0, 2, False, 0, chunk0) - seg1.add_value_object(0, 3, True, 0, b"f2-seg") - table_session.insert(seg1) - seg2 = Tablet( - table, - column_names, - data_types, - [["d1", 3.0, None, None]], - [500], - column_types, - ) - seg2.add_value_object(0, 2, True, 512, chunk1) - seg2.add_value_object(0, 3, True, 0, b"f2-seg") - table_session.insert(seg2) - - with table_session.execute_query_statement( - f"SELECT read_object(f1) FROM {table} WHERE time = 500" - ) as ds: - assert ds.has_next() - row = ds.next() - blob = row.get_fields()[0].get_binary_value() - assert blob == expected_segmented - assert not ds.has_next() - - seen = 0 - with table_session.execute_query_statement( - f"SELECT device, temp, f1, f2 FROM {table} ORDER BY time" - ) as ds: - while ds.has_next(): - row = ds.next() - fields = row.get_fields() - assert fields[0].get_object_value(TSDataType.STRING) == "d1" - assert fields[1].get_object_value(TSDataType.FLOAT) is not None - for j in (2, 3): - raw = fields[j].value - assert isinstance(raw, (bytes, bytearray)) - eof, off, body = decode_object_cell(bytes(raw)) - assert isinstance(eof, bool) and isinstance(off, int) - assert isinstance(body, bytes) - fields[j].get_string_value() - fields[j].get_object_value(TSDataType.OBJECT) - seen += 1 - assert seen == 5 - - -if __name__ == "__main__": - pytest.main([__file__, "-v", "-rs"]) -``` diff --git a/src/UserGuide/latest-Table/API/RestAPI-V1_timecho.md b/src/UserGuide/latest-Table/API/RestAPI-V1_timecho.md deleted file mode 100644 index 21ce0768c..000000000 --- a/src/UserGuide/latest-Table/API/RestAPI-V1_timecho.md +++ /dev/null @@ -1,363 +0,0 @@ - -# RestAPI V1 - -IoTDB's RESTful service can be used for querying, writing, and management operations. 
It uses the OpenAPI standard to define interfaces and generate frameworks. - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the REST service JAR file by default. Please contact the Timecho team to obtain the corresponding JAR file before using this service, and place it in the `timechodb_home/lib` or `timechodb_home/ext/external_service` directory. - -## 1. Enabling RESTful Service - -The RESTful service is disabled by default. To enable it, locate the `conf/iotdb-system.properties` file in the IoTDB installation directory, set `enable_rest_service` to `true`, and then restart the datanode process. - -```Properties -enable_rest_service=true -``` - -## 2. Authentication - -All RESTful APIs adopt **Basic Authentication**, except the health check interface `/ping`. -All other requests must carry the `Authorization` information in the request header. - -1. Authentication Format -``` -Authorization: Basic -``` -The value after `Basic` is generated by directly Base64-encoding the string in the format `username:password`. -Quick generation methods are shown below: - -* Linux / macOS -```bash -echo -n "your_username:your_password" | base64 -# Example: -echo -n "root:TimechoDB@2021" | base64 -``` - -* Windows -```powershell -# PowerShell -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("username:password")) -# Example: -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("root:TimechoDB@2021")) -``` -```cmd -REM CMD -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"username:password\"))" -REM Example: -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"root:TimechoDB@2021\"))" -``` - -2. Authentication Example - -Default username: `root`, password: `TimechoDB@2021` - -- Concatenated string: `root:TimechoDB@2021` -- Base64 encoded result: `cm9vdDpUaW1lY2hvREJAMjAyMQ==` -- Final Header: -``` -Authorization: Basic cm9vdDpUaW1lY2hvREJAMjAyMQ== -``` - -3. 
Error Description -- Incorrect username or password: returns HTTP status code `801` -```json -{"code":801,"message":"WRONG_LOGIN_PASSWORD"} -``` - -- Missing `Authorization` configuration: returns HTTP status code `800` -```json -{"code":800,"message":"INIT_AUTH_ERROR"} -``` - - -## 3. Interface Definitions - -### 3.1 Ping - -The `/ping` endpoint can be used for online service health checks. - -- Request Method: GET - -- Request Path: `http://ip:port/ping` - -- Example Request: - - ```Bash - curl http://127.0.0.1:18080/ping - ``` - -- HTTP Status Codes: - - - `200`: The service is working normally and can accept external requests. - - - `503`: The service is experiencing issues and cannot accept external requests. - - | Parameter Name | Type | Description | - | :------------- | :------ | :--------------- | - | code | integer | Status Code | - | message | string | Code Information | - -- Response Example: - - - When the HTTP status code is `200`: - - ```JSON - { "code": 200, "message": "SUCCESS_STATUS"} - ``` - - - When the HTTP status code is `503`: - - ```JSON - { "code": 500, "message": "thrift service is unavailable"} - ``` - -**Note**: The `/ping` endpoint does not require authentication. - -### 3.2 Query Interface - -- Request Path: `/rest/table/v1/query` - -- Request Method: POST - -- Request Format: - - - Header: `application/json` - - - Request Parameters: - - | Parameter Name | Type | Required | Description | - | :------------- | :----- | :------- | :----------------------------------------------------------- | - | `database` | string | Yes | Database name | - | `sql` | string | Yes | SQL query | - | `row_limit` | int | No | Maximum number of rows to return in a single query. If not set, the default value from the configuration file (`rest_query_default_row_size_limit`) is used. If the result set exceeds this limit, status code `411` is returned. 
| - -- Response Format: - - | Parameter Name | Type | Description | - | :------------- | :---- | :----------------------------------------------------------- | - | `column_names` | array | Column names | - | `data_types` | array | Data types of each column | - | `values` | array | A 2D array where the first dimension represents rows, and the second dimension represents columns. Each element corresponds to a column, with the same length as `column_names`. | - -- Example Request: - - ```Bash - curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"database":"test","sql":"select s1,s2,s3 from test_table"}' http://127.0.0.1:18080/rest/table/v1/query - ``` - -- Example Response: - - ```JSON - { - "column_names": [ - "s1", - "s2", - "s3" - ], - "data_types": [ - "STRING", - "BOOLEAN", - "INT32" - ], - "values": [ - [ - "a11", - true, - 2024 - ], - [ - "a11", - false, - 2025 - ] - ] - } - ``` - -### 3.3 Non-Query Interface - -- Request Path: `/rest/table/v1/nonQuery` - -- Request Method: POST - -- Request Format: - - - Header: `application/json` - - - Request Parameters: - - | Parameter Name | Type | Required | Description | - | :------------- | :----- | :------- | :------------ | - | `sql` | string | Yes | SQL statement | - | `database` | string | No | Database name | - -- Response Format: - - | Parameter Name | Type | Description | - | :------------- | :------ | :---------- | - | `code` | integer | Status code | - | `message` | string | Message | - -- Example Requests: - - - Create a database: - - ```Bash - curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"create database test","database":""}' http://127.0.0.1:18080/rest/table/v1/nonQuery - ``` - - - Create a table `test_table` in the `test` database: - - ```Bash - curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"CREATE TABLE table1 (time TIMESTAMP TIME,region STRING 
TAG,plant_id STRING TAG,device_id STRING TAG,model_id STRING ATTRIBUTE,maintenance STRING ATTRIBUTE,temperature FLOAT FIELD,humidity FLOAT FIELD,status Boolean FIELD,arrival_time TIMESTAMP FIELD) WITH (TTL=31536000000)","database":"test"}' http://127.0.0.1:18080/rest/table/v1/nonQuery - ``` - -- Example Response: - - ```JSON - { - "code": 200, - "message": "SUCCESS_STATUS" - } - ``` - -### 3.4 Batch Write Interface - -- Request Path: `/rest/table/v1/insertTablet` - -- Request Method: POST - -- Request Format: - - - Header: `application/json` - - - Request Parameters: - - | Parameter Name | Type | Required | Description | - | :------------------ | :----- | :------- | :----------------------------------------------------------- | - | `database` | string | Yes | Database name | - | `table` | string | Yes | Table name | - | `column_names` | array | Yes | Column names | - | `column_categories` | array | Yes | Column categories (`TAG`, `FIELD`, `ATTRIBUTE`) | - | `data_types` | array | Yes | Data types | - | `timestamps` | array | Yes | Timestamp column | - | `values` | array | Yes | Value columns. Each column's values can be `null`. A 2D array where the first dimension corresponds to timestamps, and the second dimension corresponds to columns. | - -- Response Format: - - | Parameter Name | Type | Description | - | :------------- | :------ | :---------- | - | `code` | integer | Status code | - | `message` | string | Message | - -- Example Request: - - ```Bash - curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"database":"test","column_categories":["TAG","FIELD","FIELD"],"timestamps":[1739702535000,1739789055000],"column_names":["s1","s2","s3"],"data_types":["STRING","BOOLEAN","INT32"],"values":[["a11",true,2024],["a11",false,2025]],"table":"test_table"}' http://127.0.0.1:18080/rest/table/v1/insertTablet - ``` - -- Example Response: - - ```JSON - { - "code": 200, - "message": "SUCCESS_STATUS" - } - ``` - -## 4. 
Configuration - -The configuration file is located in `iotdb-system.properties`. - -- Set `enable_rest_service` to `true` to enable the module, or `false` to disable it. The default value is `false`. - - ```Properties - enable_rest_service=true - ``` - -- Only effective when `enable_rest_service=true`. Set `rest_service_port` to a number (1025~65535) to customize the REST service socket port. The default value is `18080`. - - ```Properties - rest_service_port=18080 - ``` - -- Set `enable_swagger` to `true` to enable Swagger for displaying REST interface information, or `false` to disable it. The default value is `false`. - - ```Properties - enable_swagger=false - ``` - -- The maximum number of rows that can be returned in a single query. If the result set exceeds this limit, only the rows within the limit will be returned, and status code `411` will be returned. - - ```Properties - rest_query_default_row_size_limit=10000 - ``` - -- Expiration time for caching client login information (used to speed up user authentication, in seconds, default is 8 hours). - - ```Properties - cache_expire_in_seconds=28800 - ``` - -- Maximum number of users stored in the cache (default is 100). - - ```Properties - cache_max_num=100 - ``` - -- Initial cache capacity (default is 10). - - ```Properties - cache_init_num=10 - ``` - -- Whether to enable SSL configuration for the REST service. Set `enable_https` to `true` to enable it, or `false` to disable it. The default value is `false`. - - ```Properties - enable_https=false - ``` - -- Path to the `keyStore` (optional). - - ```Properties - key_store_path= - ``` - -- Password for the `keyStore` (optional). - - ```Properties - key_store_pwd= - ``` - -- Path to the `trustStore` (optional). - - ```Properties - trust_store_path="" - ``` - -- Password for the `trustStore` (optional). - - ```Properties - trust_store_pwd="" - ``` - -- SSL timeout time, in seconds. 
- - ```Properties - idle_timeout_in_seconds=5000 - ``` \ No newline at end of file diff --git a/src/UserGuide/latest-Table/Background-knowledge/Cluster-Concept_timecho.md b/src/UserGuide/latest-Table/Background-knowledge/Cluster-Concept_timecho.md deleted file mode 100644 index 7ebd77297..000000000 --- a/src/UserGuide/latest-Table/Background-knowledge/Cluster-Concept_timecho.md +++ /dev/null @@ -1,140 +0,0 @@ - - -# Common Concepts - -## 1. SQL Dialect Related Concepts - -### 1.1 sql_dialect - -IoTDB supports two time-series data mode (SQL dialects), both managing devices and measurement points: - -- **Tree** **Mode**: Organizes data in a hierarchical path structure, where each path represents a measurement point of a device. -- **Table** **Mode**: Organizes data in a relational table format, where each table corresponds to a type of device. - -Each dialect comes with its own SQL syntax and query patterns tailored to its data mode. - -### 1.2 Schema - -Schema refers to the metadata structure of the database, which can follow either a tree or table format. It includes definitions such as measurement point names, data types, and storage configurations. - -### 1.3 Device - -A device corresponds to a physical device in a real-world scenario, typically associated with multiple measurement points. - -### 1.4 Timeseries - -Also referred to as: physical quantity, time series, timeline, point, signal, metric, measurement value, etc. -A measurement point is a time series consisting of multiple data points arranged in ascending timestamp order. It typically represents a collection point that periodically gathers physical quantities from its environment. - -### 1.5 Encoding - -Encoding is a compression technique that represents data in binary form, improving storage efficiency. IoTDB supports multiple encoding methods for different types of data. 
For details, refer to: [Compression and Encoding](../Technical-Insider/Encoding-and-Compression.md).
-
-### 1.6 Compression
-
-After encoding, IoTDB applies additional compression techniques to further reduce data size and improve storage efficiency. Various compression algorithms are supported. For details, refer to: [Compression and Encoding](../Technical-Insider/Encoding-and-Compression.md).
-
-## 2. Distributed System Related Concepts
-
-IoTDB supports distributed deployments, typically in a 3C3D cluster model (3 ConfigNodes, 3 DataNodes), as illustrated below:
-
-
-
-### 2.1 Key Concepts
-
-- **Nodes** (*ConfigNode, DataNode, AINode*)
-- **Regions** (*SchemaRegion, DataRegion*)
-- **Replica Groups**
-
-Below is an introduction to these concepts.
-
-
-### 2.2 Nodes
-
-An IoTDB cluster consists of three types of nodes, each with distinct responsibilities:
-
-- **ConfigNode (Management Node)**: Manages cluster metadata, configuration, user permissions, schema, and partitioning. It also handles distributed scheduling and load balancing. All ConfigNodes are replicated for high availability.
-- **DataNode (Storage and Computation Node)**: Handles client requests, stores data, and executes computations.
-- **AINode (Analytics Node)**: Provides machine learning capabilities, allowing users to register pre-trained models and perform inference via SQL. It includes built-in time-series models and common ML algorithms for tasks like prediction and anomaly detection.
-
-### 2.3 Data Partitioning
-
-IoTDB divides schema and data into **Regions**, which are managed by DataNodes.
-
-- **SchemaRegion**: Stores schema information (devices and measurement points). Regions with the same RegionID across different DataNodes serve as replicas.
-- **DataRegion**: Stores time-series data for a subset of devices over a specified time period. Regions with the same RegionID across different DataNodes act as replicas.
- -For more details, see [Cluster Data Partitioning](../Technical-Insider/Cluster-data-partitioning.md) - -### 2.4 Replica Groups - -Replica groups ensure high availability by maintaining multiple copies of schema and data. The recommended replication configurations are: - -| **Category** | **Configuration Item** | **Standalone Recommended** | **Cluster Recommended** | -| ------------ | ------------------------- | -------------------------- | ----------------------- | -| Metadata | schema_replication_factor | 1 | 3 | -| Data | data_replication_factor | 1 | 2 | - - -## 3. Deployment Related Concepts - -IoTDB has two operation modes: standalone mode and cluster mode. - -### 3.1 Standalone Mode - -An IoTDB standalone instance includes 1 ConfigNode and 1 DataNode, i.e., 1C1D. - -- **Features**: Easy for developers to install and deploy, with low deployment and maintenance costs and convenient operations. -- **Use Cases**: Scenarios with limited resources or low high-availability requirements, such as edge servers. -- **Deployment Method**: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - - -### 3.2 Dual-Active Mode - -Dual-Active Deployment is a feature of TimechoDB, where two independent instances synchronize bidirectionally and can provide services simultaneously. If one instance stops and restarts, the other instance will resume data transfer from the breakpoint. - -> An IoTDB Dual-Active instance typically consists of 2 standalone nodes, i.e., 2 sets of 1C1D. Each instance can also be a cluster. - -- **Features**: The high-availability solution with the lowest resource consumption. -- **Use Cases**: Scenarios with limited resources (only two servers) but requiring high availability. 
-- **Deployment Method**: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -### 3.3 Cluster Mode - -An IoTDB cluster instance consists of 3 ConfigNodes and no fewer than 3 DataNodes, typically 3 DataNodes, i.e., 3C3D. If some nodes fail, the remaining nodes can still provide services, ensuring high availability of the database. Performance can be improved by adding DataNodes. - -- **Features**: High availability, high scalability, and improved system performance by adding DataNodes. -- **Use Cases**: Enterprise-level application scenarios requiring high availability and reliability. -- **Deployment Method**: [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - - -### 3.4 Feature Summary - -| **Dimension** | **Stand-Alone Mode** | **Dual-Active Mode** | **Cluster Mode** | -| :-------------------------- | :------------------------------------------------------- | :------------------------------------------------------ | :------------------------------------------------------ | -| Use Cases | Edge-side deployment, low high-availability requirements | High-availability services, disaster recovery scenarios | High-availability services, disaster recovery scenarios | -| Number of Machines Required | 1 | 2 | ≥3 | -| Security and Reliability | Cannot tolerate single-point failure | High, can tolerate single-point failure | High, can tolerate single-point failure | -| Scalability | Can expand DataNodes to improve performance | Each instance can be scaled as needed | Can expand DataNodes to improve performance | -| Performance | Can scale with the number of DataNodes | Same as one of the instances | Can scale with the number of DataNodes | - -- Notes: The deployment steps for Stand-Alone Mode and Cluster Mode are similar (adding ConfigNodes and DataNodes one by one), with differences only in the number of replicas and the minimum number of nodes required to provide services. 
\ No newline at end of file diff --git a/src/UserGuide/latest-Table/Background-knowledge/Data-Model-and-Terminology_timecho.md b/src/UserGuide/latest-Table/Background-knowledge/Data-Model-and-Terminology_timecho.md deleted file mode 100644 index 88537b320..000000000 --- a/src/UserGuide/latest-Table/Background-knowledge/Data-Model-and-Terminology_timecho.md +++ /dev/null @@ -1,388 +0,0 @@ - - -# Modeling Scheme Design - -This section introduces how to transform time series data application scenarios into IoTDB time series mode. - -## 1. Time Series Data Mode - -Before designing an IoTDB data mode, it's essential to understand time series data and its underlying structure. For more details, refer to: [Time Series Data Mode](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - -## 2. Tree-Table Twin Mode in IoTDB - -IoTDB offers tree-table twin mode, each with its distinct characteristics as follows: - -**Tree Mode**: It manages data points as objects, with each data point corresponding to a time series. The data point names, segmented by dots, form a tree-like directory structure that corresponds one-to-one with the physical world, making the read and write operations on data points straightforward and intuitive. - -**Table Mode**: It is recommended to create a table for each type of device. The collection of physical quantities from devices of the same type shares certain commonalities (such as the collection of temperature and humidity physical quantities), allowing for flexible and rich data analysis. - -### 2.1 Mode Characteristics - -Tree-table twin mode syntaxes have their own applicable scenarios. - -The following table compares the tree mode and the table mode from various dimensions, including applicable scenarios and typical operations. Users can choose the appropriate mode based on their specific usage requirements to achieve efficient data storage and management. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-| Dimension | Tree Mode | Table Mode |
-| :------------------------- | :------------------------------------------------------- | :------------------------------------------------------ |
-| Applicable Scenarios | Measurements management, monitoring scenarios | Device management, analysis scenarios |
-| Typical Operations | Read and write operations by specifying data point paths | Data filtering and analysis through tags |
-| Structural Characteristics | Flexible addition and deletion, similar to a file system | Template-based management, facilitating data governance |
-| Syntax Characteristics | Concise and flexible | Rich analysis |
-| Performance Comparison | Similar | Similar |
- -**Notes:** - -- Both mode spaces can coexist within the same cluster instance. Each mode follows distinct syntax and database naming conventions, and they remain isolated by default. - -## 2.2 Model Selection - -IoTDB supports model selection through various client tools. The configuration methods for different clients are as follows: - -1. [Command-Line Interface (CLI)](../Tools-System/CLI_timecho.md) - -When connecting via CLI, specify the model using the `sql_dialect` parameter (default: tree model). - -```bash -# Tree model -start-cli.sh(bat) -start-cli.sh(bat) -sql_dialect tree - -# Table model -start-cli.sh(bat) -sql_dialect table -``` - -2. [SQL](../User-Manual/Maintenance-commands_timecho.md#_2-1-setting-the-connected-model) - -Use the `SET` statement to switch models in SQL: - -```sql --- Tree model -IoTDB> SET SQL_DIALECT=TREE - --- Table model -IoTDB> SET SQL_DIALECT=TABLE -``` - -3. Application Programming Interfaces (APIs) - -For multi-language APIs, create connections via model-specific session/session pool classes. 
Examples: - -* [Java Native API](../API/Programming-Java-Native-API_timecho.md) - -```java -// Tree model -SessionPool sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(3) - .build(); - -// Table model -ITableSessionPool tableSessionPool = - new TableSessionPoolBuilder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(1) - .build(); -``` - -* [Python Native API](../API/Programming-Python-Native-API_timecho.md) - -```python -# Tree model -session = Session( - ip=ip, - port=port, - user=username, - password=password, - fetch_size=1024, - zone_id="UTC+8", - enable_redirection=True -) - -# Table model -config = TableSessionPoolConfig( - node_urls=node_urls, - username=username, - password=password, - database=database, - max_pool_size=max_pool_size, - fetch_size=fetch_size, - wait_timeout_in_ms=wait_timeout_in_ms, -) -session_pool = TableSessionPool(config) -``` - -* [C++ Native API](../API/Programming-Cpp-Native-API_timecho.md) - -```cpp -// Tree model -session = new Session(hostip, port, username, password); - -// Table model -session = (new TableSessionBuilder()) - ->host(ip) - ->rpcPort(port) - ->username(username) - ->password(password) - ->build(); -``` - -* [Go Native API](../API/Programming-Go-Native-API_timecho.md) - -```go -// Tree model -config := &client.PoolConfig{ - Host: host, - Port: port, - UserName: user, - Password: password, -} -sessionPool = client.NewSessionPool(config, 3, 60000, 60000, false) -defer sessionPool.Close() - -// Table model -config := &client.PoolConfig{ - Host: host, - Port: port, - UserName: user, - Password: password, - Database: dbname, -} -sessionPool := client.NewTableSessionPool(config, 3, 60000, 4000, false) -defer sessionPool.Close() -``` - -* [C# Native API](../API/Programming-CSharp-Native-API_timecho.md) - -```csharp -// Tree model -var session_pool = new SessionPool(host, port, pool_size); - -// Table model -var tableSessionPool = 
new TableSessionPool.Builder() - .SetNodeUrls(nodeUrls) - .SetUsername(username) - .SetPassword(password) - .SetFetchSize(1024) - .Build(); -``` - -* [JDBC](../API/Programming-JDBC_timecho.md) - -For the table model, include `sql_dialect=table` in the JDBC URL: - -```java -// Tree model -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection connection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667/", username, password); - -// Table model -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection connection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table", username, password); -``` - -## 2.3 Tree-to-Table Conversion - -IoTDB supports **tree-to-table conversion**, as shown in the figure below: - -![](/img/tree-to-table-en-1.png) - -This feature allows existing tree-model data to be transformed into table views. Users can then query the same dataset using either model. Detailed instructions are available in [Tree-to-Table View](../User-Manual/Tree-to-Table_timecho.md). **Note**: SQL statements for creating tree-to-table views **must be executed in table mode**. - - -## 3. Application Scenarios - -The application scenarios mainly include three categories: - -- Scenario 1: Using the tree mode for data reading and writing. - -- Scenario 2: Using the table mode for data reading and writing. - -- Scenario 3: Sharing the same dataset, using the tree mode for data reading and writing, and the table mode for data analysis. - -### 3.1 Scenario 1: Tree Mode - -#### 3.1.1 Characteristics - -- Simple and intuitive, corresponding one-to-one with monitoring points in the physical world. - -- Flexible like a file system, allowing the design of any branch structure. - -- Suitable for industrial monitoring scenarios such as DCS and SCADA. 
-
-#### 3.1.2 Basic Concepts
-
-| **Concept** | **Definition** |
-| ---------------------------- | ------------------------------------------------------------ |
-| **Database** | **Definition**: A path prefixed with `root.`.<br>**Naming Recommendation**: Only include the next level node under `root`, such as `root.db`.<br>**Quantity Recommendation**: The upper limit is related to memory. A single database can fully utilize machine resources; there is no need to create multiple databases for performance reasons.<br>**Creation Method**: Recommended to create manually, but can also be created automatically when a time series is created (defaults to the next level node under `root`). |
-| **Time Series (Data Point)** | **Definition**:<br>A path prefixed with the database path, segmented by `.`, and can contain any number of levels, such as `root.db.turbine.device1.metric1`.<br>Each time series can have a different data type.<br>**Naming Recommendation**:<br>Only include unique identifiers (similar to a composite primary key) in the path, generally not exceeding 10 levels.<br>Typically, place tags with low cardinality (fewer distinct values) at the front to facilitate system compression of common prefixes.<br>**Quantity Recommendation**:<br>The total number of time series manageable by the cluster is related to total memory; refer to the resource recommendation section.<br>There is no limit to the number of child nodes at any level.<br>**Creation Method**: Can be created manually or automatically during data writing. |
-| **Device** | **Definition**: The second-to-last level is the device, such as `device1` in `root.db.turbine.device1.metric1`.<br>**Creation Method**: A device cannot be created on its own; it comes into existence when its time series are created. |
-
-#### 3.1.3 Modeling Examples
-
-##### 3.1.3.1 How to model when managing multiple types of devices?
-
-- If different types of devices in the scenario have different hierarchical paths and data point sets, create branches under the database node by device type. Each device type can have a different data point structure.
-
-
- -
-
-##### 3.1.3.2 How to model when there are no devices, only data points?
-
-- For example, in a monitoring system for a station, each data point has a unique number but does not correspond to any specific device.
-
-
- -
-
-##### 3.1.3.3 How to model when a device has both sub-devices and data points?
-
-- For example, in an energy storage scenario, each layer of the structure monitors its own voltage and current. The following modeling approach can be used.
-
-
- -
-
-
-### 3.2 Scenario 2: Table Mode
-
-#### 3.2.1 Characteristics
-
-- Models and manages device time series data using time series tables, facilitating analysis with standard SQL.
-
-- Suitable for device data analysis or migrating data from other databases to IoTDB.
-
-#### 3.2.2 Basic Concepts
-
-- Database: Can manage multiple types of devices.
-
-- Time Series Table: Corresponds to a type of device.
-
-| **Category** | **Definition** |
-| -------------------------------- | ------------------------------------------------------------ |
-| **Time Column (TIME)** | Each time series table must have a time column named `time`, with the data type `TIMESTAMP`. |
-| **Tag Column (TAG)** | Unique identifiers (composite primary key) for devices; a table can have zero or more.<br>Tag information cannot be modified or deleted but can be added.<br>Recommended to arrange from coarse to fine granularity. |
-| **Data Point Column (FIELD)** | A device can collect one or more data points, with values changing over time.<br>There is no limit to the number of data point columns; it can reach hundreds of thousands. |
-| **Attribute Column (ATTRIBUTE)** | Supplementary descriptions of devices, not changing over time.<br>A device can have zero or more attributes, and they can be updated or added.<br>A small number of static attributes that may need modification can be stored here. |
-
-**Data Filtering Efficiency**: Time Column = Tag Column > Attribute Column > Data Point Column.
-
-#### 3.2.3 Modeling Examples
-
-##### 3.2.3.1 How to model when managing multiple types of devices?
-
-- Recommended to create a table for each type of device, with each table having different tags and data point sets.
-
-- Even if devices are related or have hierarchical relationships, it is recommended to create a table for each type of device.
-
-
- -
-
-##### 3.2.3.2 How to model when there are no device identifier columns or attribute columns?
-
-- There is no limit to the number of columns; it can reach hundreds of thousands.
-
-
- -
-
-##### 3.2.3.3 How to model when a device has both sub-devices and data points?
-
-- Each device has multiple sub-devices and data point information. It is recommended to create a table for each type of device for management.
-
-
- -
-
-### 3.3 Scenario 3: Dual-Mode Integration
-
-#### 3.3.1 Characteristics
-
-- Combines the advantages of the tree mode and the table mode: both share the same dataset, giving flexible writing and rich querying.
-
-- During the data writing phase, the tree mode syntax is used, supporting flexible data access and expansion.
-
-- During the data analysis phase, the table mode syntax is used, allowing users to perform complex data analysis using standard SQL queries.
-
-#### 3.3.2 Modeling Examples
-
-##### 3.3.2.1 How to model when managing multiple types of devices?
-
-- Different types of devices in the scenario have different hierarchical paths and data point sets.
-
-- **Tree Mode**: Create branches under the database node by device type, with each device type having a different data point structure.
-
-- **Table View**: Create a table view for each type of device, with each table view having different tags and data point sets.
-
-
- -
-
-##### 3.3.2.2 How to model when there are no device identifier columns or attribute columns?
-
-- **Tree Mode**: Each data point has a unique number but does not correspond to any specific device.
-- **Table View**: Place all data points into a single table. There is no limit to the number of data point columns; it can reach hundreds of thousands. If data points have the same data type, they can be treated as the same type of device.
-
-
- -
-
-##### 3.3.2.3 How to model when a device has both sub-devices and data points?
-
-- **Tree Mode**: Model each layer of the structure according to the monitoring points in the physical world.
-- **Table View**: Create multiple tables to manage each layer of structural information according to device classification.
-
-
- -
diff --git a/src/UserGuide/latest-Table/Background-knowledge/Data-Type_timecho.md b/src/UserGuide/latest-Table/Background-knowledge/Data-Type_timecho.md deleted file mode 100644 index d35af9d07..000000000 --- a/src/UserGuide/latest-Table/Background-knowledge/Data-Type_timecho.md +++ /dev/null @@ -1,201 +0,0 @@ - - -# Data Type - -## 1. Basic Data Types - -IoTDB supports the following data types: - -- **BOOLEAN** (Boolean value) -- **INT32** (32-bit integer) -- **INT64** (64-bit integer) -- **FLOAT** (Single-precision floating-point number) -- **DOUBLE** (Double-precision floating-point number) -- **TEXT** (Text data, suitable for long strings, Not recommended) -- **STRING** (String data with additional statistical information for optimized queries) -- **BLOB** (Large binary object) -- **OBJECT** (Large Binary Object) - > Supported since V2.0.8 -- **TIMESTAMP** (Timestamp, representing precise moments in time) -- **DATE** (Date, storing only calendar date information) - -The difference between **STRING** and **TEXT**: - -- **STRING** stores text data and includes additional statistical information to optimize value-filtering queries. -- **TEXT** is suitable for storing long text strings without additional query optimization. - -The differences between **OBJECT** and **BLOB** types are as follows: - -| | **OBJECT** | **BLOB** | -|----------------------|-------------------------------------------------------------------------------------------------------------------------|--------------------------------------| -| **Write Amplification** (Lower is better) | Low (Write amplification factor is always 1) | High (Write amplification factor = 2 + number of merges) | -| **Space Amplification** (Lower is better) | Low (Merge & release on write) | High (Merge on read and release on compact) | -| **Query Results** | When querying an OBJECT column by default, returns metadata like: `(Object) XX.XX KB`. Actual OBJECT data storage path: `${data_dir}/object_data`. 
Use `READ_OBJECT` function to retrieve raw content | Directly returns raw binary content | - - -### 1.1 Data Type Compatibility - -If the written data type does not match the registered data type of a series: - -- **Incompatible types** → The system will issue an error. -- **Compatible types** → The system will automatically convert the written data type to match the registered type. - -The compatibility of data types is shown in the table below: - -| Registered Data Type | Compatible Write Data Types | -|:---------------------|:---------------------------------------| -| BOOLEAN | BOOLEAN | -| INT32 | INT32 | -| INT64 | INT32, INT64, TIMESTAMP | -| FLOAT | INT32, FLOAT | -| DOUBLE | INT32, INT64, FLOAT, DOUBLE, TIMESTAMP | -| TEXT | TEXT, STRING | -| STRING | TEXT, STRING | -| BLOB | TEXT, STRING, BLOB | -| OBJECT | OBJECT | -| TIMESTAMP | INT32, INT64, TIMESTAMP | -| DATE | DATE | - -## 2. Timestamp Types - -A timestamp represents the moment when data is recorded. IoTDB supports two types: - -- **Absolute timestamps**: Directly specify a point in time. -- **Relative timestamps**: Define time offsets from a reference point (e.g., `now()`). - -### 2.1 Absolute Timestamp - -IoTDB supports timestamps in two formats: - -1. **LONG**: Milliseconds since the Unix epoch (1970-01-01 00:00:00 UTC). -2. **DATETIME**: Human-readable date-time strings. (including **DATETIME-INPUT** and **DATETIME-DISPLAY** subcategories). - -When entering a timestamp, users can use either a LONG value or a DATETIME string. Supported input formats include: - -
- -**Formats Supported by the DATETIME-INPUT Type** - - -| Format | -| :--------------------------- | -| yyyy-MM-dd HH:mm:ss | -| yyyy/MM/dd HH:mm:ss | -| yyyy.MM.dd HH:mm:ss | -| yyyy-MM-dd HH:mm:ssZZ | -| yyyy/MM/dd HH:mm:ssZZ | -| yyyy.MM.dd HH:mm:ssZZ | -| yyyy/MM/dd HH:mm:ss.SSS | -| yyyy-MM-dd HH:mm:ss.SSS | -| yyyy.MM.dd HH:mm:ss.SSS | -| yyyy-MM-dd HH:mm:ss.SSSZZ | -| yyyy/MM/dd HH:mm:ss.SSSZZ | -| yyyy.MM.dd HH:mm:ss.SSSZZ | -| ISO8601 standard time format | - - -
- -> **Note:** `ZZ` represents a time zone offset (e.g., `+0800` for Beijing Time, `-0500` for Eastern Standard Time). - -IoTDB supports timestamp display in **LONG** format or **DATETIME-DISPLAY** format, allowing users to customize time output. - -
- -**Syntax for Custom Time Formats in DATETIME-DISPLAY** - - -| Symbol | Meaning | Presentation | Examples | -| :----: | :-------------------------: | :----------: | :--------------------------------: | -| G | era | era | era | -| C | century of era (>=0) | number | 20 | -| Y | year of era (>=0) | year | 1996 | -| | | | | -| x | weekyear | year | 1996 | -| w | week of weekyear | number | 27 | -| e | day of week | number | 2 | -| E | day of week | text | Tuesday; Tue | -| | | | | -| y | year | year | 1996 | -| D | day of year | number | 189 | -| M | month of year | month | July; Jul; 07 | -| d | day of month | number | 10 | -| | | | | -| a | halfday of day | text | PM | -| K | hour of halfday (0~11) | number | 0 | -| h | clockhour of halfday (1~12) | number | 12 | -| | | | | -| H | hour of day (0~23) | number | 0 | -| k | clockhour of day (1~24) | number | 24 | -| m | minute of hour | number | 30 | -| s | second of minute | number | 55 | -| S | fraction of second | millis | 978 | -| | | | | -| z | time zone | text | Pacific Standard Time; PST | -| Z | time zone offset/id | zone | -0800; -08:00; America/Los_Angeles | -| | | | | -| ' | escape for text | delimiter | | -| '' | single quote | literal | ' | - -
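To make the relationship between the **LONG** format and a displayed date-time string concrete, here is a minimal Python sketch. It is not IoTDB code: the helper names `long_to_display` and `display_to_long` are hypothetical, and the `+08:00` default offset is only an assumption chosen to match the sample output shown elsewhere in this guide.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch in UTC; LONG timestamps count milliseconds from this instant.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def long_to_display(ms: int, offset_hours: int = 8) -> str:
    """Render a LONG (epoch milliseconds) value as an ISO 8601 string.

    offset_hours is an assumed display time zone, not an IoTDB setting.
    """
    tz = timezone(timedelta(hours=offset_hours))
    moment = EPOCH + timedelta(milliseconds=ms)
    return moment.astimezone(tz).isoformat(timespec="milliseconds")


def display_to_long(text: str) -> int:
    """Convert an ISO 8601 string with an explicit offset back to epoch milliseconds."""
    moment = datetime.fromisoformat(text)
    return (moment - EPOCH) // timedelta(milliseconds=1)


print(long_to_display(1732752000000))
print(display_to_long("2024-11-28T08:00:00.000+08:00"))
```

Both representations name the same instant, so converting in one direction and back returns the original value.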
- -### 2.2 Relative Timestamp - -Relative timestamps allow specifying time offsets from **now()** or a **DATETIME** reference. - -The formal definition is: - -```Plain -Duration = (Digit+ ('Y'|'MO'|'W'|'D'|'H'|'M'|'S'|'MS'|'US'|'NS'))+ -RelativeTime = (now() | DATETIME) ((+|-) Duration)+ -``` - -
- - **The syntax of the duration unit** - - - | Symbol | Meaning | Presentation | Examples | - | :----: | :---------: | :----------------------: | :------: | - | y | year | 1y=365 days | 1y | - | mo | month | 1mo=30 days | 1mo | - | w | week | 1w=7 days | 1w | - | d | day | 1d=1 day | 1d | - | | | | | - | h | hour | 1h=3600 seconds | 1h | - | m | minute | 1m=60 seconds | 1m | - | s | second | 1s=1 second | 1s | - | | | | | - | ms | millisecond | 1ms=1000_000 nanoseconds | 1ms | - | us | microsecond | 1us=1000 nanoseconds | 1us | - | ns | nanosecond | 1ns=1 nanosecond | 1ns | - -
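The unit table above can be read as a simple conversion rule. The following Python sketch sums a duration string such as `1d2h` into nanoseconds using the fixed equivalences from the table (1y = 365 days, 1mo = 30 days). It is only an illustration: `duration_to_ns` is a hypothetical helper, and treating unit symbols case-insensitively is an assumption, since the formal definition uses uppercase symbols while the table uses lowercase ones.

```python
import re

NS_PER_SECOND = 10**9
NS_PER_DAY = 86_400 * NS_PER_SECOND

# Fixed equivalences taken from the duration unit table.
UNIT_NS = {
    "y": 365 * NS_PER_DAY,
    "mo": 30 * NS_PER_DAY,
    "w": 7 * NS_PER_DAY,
    "d": NS_PER_DAY,
    "h": 3_600 * NS_PER_SECOND,
    "m": 60 * NS_PER_SECOND,
    "s": NS_PER_SECOND,
    "ms": 10**6,
    "us": 10**3,
    "ns": 1,
}

# Two-letter units are listed first so that "mo" and "ms" are not read as "m".
TOKEN = re.compile(r"(\d+)(mo|ms|us|ns|y|w|d|h|m|s)", re.IGNORECASE)


def duration_to_ns(text: str) -> int:
    """Sum every <digits><unit> pair in text; reject anything left over."""
    position = 0
    total = 0
    for match in TOKEN.finditer(text):
        if match.start() != position:
            raise ValueError(f"unexpected characters in duration: {text!r}")
        total += int(match.group(1)) * UNIT_NS[match.group(2).lower()]
        position = match.end()
    if position == 0 or position != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return total


print(duration_to_ns("1d2h"))  # 1 day plus 2 hours, in nanoseconds
```

Under these assumptions, `1d2h` is 26 hours and `1w` is 7 days, which matches how `now() - 1d2h` and `now() - 1w` are described.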
- -**Examples:** - -```Plain -now() - 1d2h // A time 1 day and 2 hours earlier than the server time -now() - 1w // A time 1 week earlier than the server time -``` - -> **Note:** There must be spaces on both sides of `+` and `-` operators. \ No newline at end of file diff --git a/src/UserGuide/latest-Table/Background-knowledge/Navigating_Time_Series_Data_timecho.md b/src/UserGuide/latest-Table/Background-knowledge/Navigating_Time_Series_Data_timecho.md deleted file mode 100644 index c60e112d5..000000000 --- a/src/UserGuide/latest-Table/Background-knowledge/Navigating_Time_Series_Data_timecho.md +++ /dev/null @@ -1,51 +0,0 @@ - -# Timeseries Data Model - -## 1. What is Time Series Data? - -In today's interconnected world, industries such as the Internet of Things (IoT) and manufacturing are undergoing rapid digital transformation. Sensors are widely deployed on various devices to collect real-time operational data. For example: - -- **Motors** record voltage and current. -- **Wind Turbines** track blade speed, angular velocity, and power output. -- **Vehicles** capture GPS coordinates, speed, and fuel consumption. -- **Bridges** monitor vibration frequency, deflection, and displacement. - -Sensor data collection has permeated almost every industry, generating vast amounts of **time series data**. - -![](/img/time-series-data-en-01.png) - -Each data collection point is referred to as a **measurement point** (also known as a physical quantity, time series, signal, metric, or measurement value). As time progresses, new data is continuously recorded for each measurement point, forming a **time series**. In tabular form, a time series consists of two columns: **timestamp** and **value**. When visualized, a time series appears as a trend chart over time, resembling an "electrocardiogram" of a device. 
- -![](/img/time-series-data-en-02.png) - -Given the vast amount of time-series data generated by sensors, structuring this data effectively is essential for digital transformation across industries. Therefore, time-series data modeling is primarily centered around **devices** and **sensors**. - -## 2. Key Concepts in Time Series Data - -Several fundamental concepts define time-series data: - -| **Device** | Also known as an entity or equipment, a device is a real-world object that generates time-series data. In IoTDB, a device serves as a logical grouping of multiple time series. A device could be a physical machine, a measuring instrument, or a collection of sensors. Examples include:
- Energy sector: A wind turbine, identified by parameters such as region, power station, line, model, and instance.
- Manufacturing sector: A robotic arm, uniquely identified by an IoT platform-assigned ID.
- Connected vehicles: A car, identified by its Vehicle Identification Number (VIN).
- Monitoring systems: A CPU, identified by attributes such as data center, rack, hostname, and device type. | -| ------------------------------- |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| **FIELD** | Also referred to as a physical quantity, signal, metric, or status point, a field represents a specific measurable property recorded by a sensor. Each field corresponds to a measurement point that periodically captures environmental data. Examples include:
- Energy and power: Current, voltage, wind speed, rotational speed.
- Connected vehicles: Fuel level, vehicle speed, latitude, longitude.
- Manufacturing: Temperature, humidity.
Under the **table model**, the total number of **measurement points** is the sum of measurement points of all tables (measurement points per table = number of devices × number of field columns). For detailed statistics methods, please refer to [Metadata Query](../Basic-Concept/Table-Management_timecho.md#_1-7-metadata-query) | -| **Data Point** | A data point consists of a timestamp and a value. The timestamp is typically stored as a long integer, while the value can be of various data types such as BOOLEAN, FLOAT, or INT32.
In tabular format, a data point corresponds to a single row in a time-series dataset, while in graphical representation, it is a single point on a time-series chart.
| -| **Frequency** | The sampling frequency determines how often a sensor records data within a given timeframe.
For example, if a temperature sensor records data once per second, its sampling frequency is 1Hz (1 sample per second). | -| **TTL** | TTL (Time-to-Live) defines the retention period of stored data. Once the TTL expires, the data is automatically deleted.
IoTDB allows different TTL values for different datasets, enabling automated, periodic data deletion. Proper TTL configuration helps:
- Manage disk space efficiently, preventing storage overflow.
- Maintain high query performance.
- Reduce memory resource consumption. | \ No newline at end of file diff --git a/src/UserGuide/latest-Table/Basic-Concept/Database-Management_timecho.md b/src/UserGuide/latest-Table/Basic-Concept/Database-Management_timecho.md deleted file mode 100644 index 2ac3e52fd..000000000 --- a/src/UserGuide/latest-Table/Basic-Concept/Database-Management_timecho.md +++ /dev/null @@ -1,175 +0,0 @@ - - -# Database Management - -## 1. Database Management - -### 1.1 Create a Database - -This command is used to create a database. - -**Syntax:** - -```SQL - CREATE DATABASE (IF NOT EXISTS)? (WITH properties)? -``` - -**Note:** - -1. ``: The name of the database, with the following characteristics: - - Case-insensitive. After creation, it will be displayed uniformly in lowercase. - - Can include commas (`,`), underscores (`_`), numbers, letters, and Chinese characters. - - Maximum length is 64 characters. - - Names with special characters or Chinese characters must be enclosed in double quotes (`""`). - -2. `WITH properties`: Property names are case-insensitive. For more details, refer to the case sensitivity rules in [case-sensitivity](../SQL-Manual/Identifier.md#2-case-sensitivity). Configurable properties include: - -| Property | Description | Default Value | -| ----------------------- | ------------------------------------------------------------ | -------------------- | -| TTL | Automatic data expiration time, in milliseconds | `INF` | -| TIME_PARTITION_INTERVAL | Time partition interval for the database, in milliseconds | `604800000` (7 days) | -| SCHEMA_REGION_GROUP_NUM | Number of metadata replica groups; generally does not require modification | `1` | -| DATA_REGION_GROUP_NUM | Number of data replica groups; generally does not require modification | `2` | - -**Examples:** - -```SQL -CREATE DATABASE IF NOT EXISTS database1 with(TTL=31536000000); -``` - -### 1.2 Use a Database - -Specifies the current database as the namespace for table operations. 
- -**Syntax:** - -```SQL -USE -``` - -**Example:** - -```SQL -USE database1; -``` - -### 1.3 View the Current Database - -Displays the name of the currently connected database. If no USE statement has been executed, the default is `null`. - -**Syntax:** - -```SQL -SHOW CURRENT_DATABASE -``` - -**Example:** - -```SQL -USE database1; -SHOW CURRENT_DATABASE; -``` -```shell -+---------------+ -|CurrentDatabase| -+---------------+ -| database1| -+---------------+ -``` - - -### 1.4 View All Databases - -Displays all databases and their properties. - -**Syntax:** - -```SQL -SHOW DATABASES (DETAILS)? -``` - -**Columns Explained:** - - -| Column Name | Description | -| ----------------------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| database | Name of the database. | -| TTL | Data retention period. If TTL is specified when creating a database, it applies to all tables within the database. You can also set or update the TTL of individual tables using [create table](../Basic-Concept/Table-Management_timecho.md#11-create-a-table) 、[alter table](../Basic-Concept/Table-Management_timecho.md#14-update-tables) . | -| SchemaReplicationFactor | Number of metadata replicas, ensuring metadata high availability. This can be configured in the `iotdb-system.properties` file under the `schema_replication_factor` property. | -| DataReplicationFactor | Number of data replicas, ensuring data high availability. This can be configured in the `iotdb-system.properties` file under the `data_replication_factor` property. | -| TimePartitionInterval | Time partition interval, determining how often data is grouped into directories on disk. The default is typically one week. 
| -| Model | Returned when using the `DETAILS` option, showing the data model corresponding to each database (e.g., timeseries tree model or device table model). | - -**Examples:** - -```SQL -SHOW DATABASES DETAILS; -``` -```shell -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|DataRegionGroupNum| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| database1| INF| 1| 1| 604800000| 1| 2| -|information_schema| INF| null| null| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -``` - -### 1.5 Update a Database - -Used to modify some attributes in the database. - -**Syntax:** - -```SQL -ALTER DATABASE (IF EXISTS)? database=identifier SET PROPERTIES propertyAssignments -``` - -**Note:** - -1. The `ALTER DATABASE` operation currently only supports modifications to the database's `SCHEMA_REGION_GROUP_NUM`, `DATA_REGION_GROUP_NUM`, and `TTL` attributes. - -**Example:** - -```SQL -ALTER DATABASE database1 SET PROPERTIES TTL=31536000000; -``` - -### 1.6 Delete a Database - -Deletes the specified database and all associated tables and data. - -**Syntax:** - -```SQL -DROP DATABASE (IF EXISTS)? -``` - -**Note:** - -1. A database currently in use can still be dropped. -2. Deleting a database removes all its tables and stored data. 
- -**Example:** - -```SQL -DROP DATABASE IF EXISTS database1; -``` diff --git a/src/UserGuide/latest-Table/Basic-Concept/Query-Data_timecho.md b/src/UserGuide/latest-Table/Basic-Concept/Query-Data_timecho.md deleted file mode 100644 index 47976fb45..000000000 --- a/src/UserGuide/latest-Table/Basic-Concept/Query-Data_timecho.md +++ /dev/null @@ -1,590 +0,0 @@ - - -# Query Data - -## 1. Syntax Overview - -```SQL -SELECT ⟨select_list⟩ - FROM ⟨tables⟩ | patternRecognition - [WHERE ⟨condition⟩] - [GROUP BY ⟨groups⟩] - [HAVING ⟨group_filter⟩] - [WINDOW windowDefinition (',' windowDefinition)*] - [FILL ⟨fill_methods⟩] - [ORDER BY ⟨order_expression⟩] - [OFFSET ⟨n⟩] - [LIMIT ⟨n⟩]; -``` - -The IoTDB table model query syntax supports the following clauses: - -- **SELECT Clause**: Specifies the columns to be included in the result. Details: [SELECT Clause](../SQL-Manual/Select-Clause_timecho.md) -- **FROM Clause**: Indicates the data source for the query, which can be a single table, multiple tables joined using the `JOIN` clause, or a subquery. Details: [FROM & JOIN Clause](../SQL-Manual/From-Join-Clause.md) -- **WHERE Clause**: Filters rows based on specific conditions. Logically executed immediately after the `FROM` clause. Details: [WHERE Clause](../SQL-Manual/Where-Clause.md) -- **GROUP BY Clause**: Used for aggregating data, specifying the columns for grouping. Details: [GROUP BY Clause](../SQL-Manual/GroupBy-Clause.md) -- **HAVING Clause**: Applied after the `GROUP BY` clause to filter grouped data, similar to `WHERE` but operates after grouping. Details: [HAVING Clause](../SQL-Manual/Having-Clause.md) -- **FILL Clause**: Handles missing values in query results by specifying fill methods (e.g., previous non-null value or linear interpolation) for better visualization and analysis. 
Details: [FILL Clause](../SQL-Manual/Fill-Clause.md) -- **ORDER BY Clause**: Sorts query results in ascending (`ASC`) or descending (`DESC`) order, with optional handling for null values (`NULLS FIRST` or `NULLS LAST`). Details: [ORDER BY Clause](../SQL-Manual/OrderBy-Clause.md) -- **OFFSET Clause**: Specifies the starting position for the query result, skipping the first `OFFSET` rows. Often used with the `LIMIT` clause. Details: [LIMIT and OFFSET Clause](../SQL-Manual/Limit-Offset-Clause.md) -- **LIMIT Clause**: Limits the number of rows in the query result. Typically used in conjunction with the `OFFSET` clause for pagination. Details: [LIMIT and OFFSET Clause](../SQL-Manual/Limit-Offset-Clause.md) - -## 2. Clause Execution Order - -![](/img/data-query-1.png) - - -## 3. Common Query Examples - -### 3.1 Sample Dataset - -The [Example Data page](../Reference/Sample-Data.md) provides SQL statements to construct table schemas and insert data. By downloading and executing these statements in the IoTDB CLI, you can import the data into IoTDB. This data can be used to test and run the example SQL queries included in this documentation, allowing you to reproduce the described results. 
- -### 3.2 Basic Data Query - -**Example 1: Filter by Time** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE time >= 2024-11-27 00:00:00 and time <= 2024-11-29 00:00:00; -``` - -**Result**: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-28T08:00:00.000+08:00| 85.0| null| -|2024-11-28T09:00:00.000+08:00| null| 40.9| -|2024-11-28T10:00:00.000+08:00| 85.0| 35.2| -|2024-11-28T11:00:00.000+08:00| 88.0| 45.1| -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:40:00.000+08:00| 85.0| null| -|2024-11-27T16:41:00.000+08:00| 85.0| null| -|2024-11-27T16:42:00.000+08:00| null| 35.2| -|2024-11-27T16:43:00.000+08:00| null| null| -|2024-11-27T16:44:00.000+08:00| null| null| -+-----------------------------+-----------+--------+ -Total line number = 11 -It costs 0.075s -``` - -**Example 2: Filter by** **Value** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE temperature > 89.0; -``` - -**Result**: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-29T18:30:00.000+08:00| 90.0| 35.4| -|2024-11-26T13:37:00.000+08:00| 90.0| 35.1| -|2024-11-26T13:38:00.000+08:00| 90.0| 35.1| -|2024-11-30T09:30:00.000+08:00| 90.0| 35.2| -|2024-11-30T14:30:00.000+08:00| 90.0| 34.8| -+-----------------------------+-----------+--------+ -Total line number = 5 -It costs 0.156s -``` - -**Example 3: Filter by Attribute** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE model_id ='B'; -``` - -**Result**: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| 
-|2024-11-27T16:40:00.000+08:00| 85.0| null| -|2024-11-27T16:41:00.000+08:00| 85.0| null| -|2024-11-27T16:42:00.000+08:00| null| 35.2| -|2024-11-27T16:43:00.000+08:00| null| null| -|2024-11-27T16:44:00.000+08:00| null| null| -+-----------------------------+-----------+--------+ -Total line number = 7 -It costs 0.106s -``` - -**Example 4: Multi-device time-aligned query** - -```SQL -IoTDB> SELECT date_bin_gapfill(1d, TIME) AS a_time, - device_id, - AVG(temperature) AS avg_temp - FROM table1 - WHERE TIME >= 2024-11-26 13:00:00 - AND TIME <= 2024-11-27 17:00:00 - GROUP BY 1, device_id FILL METHOD PREVIOUS; -``` - -**Result**: - -```SQL -+-----------------------------+---------+--------+ -| a_time|device_id|avg_temp| -+-----------------------------+---------+--------+ -|2024-11-26T08:00:00.000+08:00| 100| 90.0| -|2024-11-27T08:00:00.000+08:00| 100| 90.0| -|2024-11-26T08:00:00.000+08:00| 101| 90.0| -|2024-11-27T08:00:00.000+08:00| 101| 85.0| -+-----------------------------+---------+--------+ -Total line number = 4 -It costs 0.048s -``` - -### 3.3 Aggregation Query - -**Example**: Calculate the average, maximum, and minimum temperature for each `device_id` within a specific time range. - -```SQL -IoTDB> SELECT device_id, AVG(temperature) as avg_temp, MAX(temperature) as max_temp, MIN(temperature) as min_temp - FROM table1 - WHERE time >= 2024-11-26 00:00:00 AND time <= 2024-11-29 00:00:00 - GROUP BY device_id; -``` - -**Result**: - -```SQL -+---------+--------+--------+--------+ -|device_id|avg_temp|max_temp|min_temp| -+---------+--------+--------+--------+ -| 100| 87.6| 90.0| 85.0| -| 101| 85.0| 85.0| 85.0| -+---------+--------+--------+--------+ -Total line number = 2 -It costs 0.278s -``` - -### 3.4 Latest Point Query - -**Example**: Retrieve the latest record for each `device_id`, including the temperature value and the timestamp of the last record. 
- -```SQL -IoTDB> SELECT device_id,last(time),last_by(temperature,time) - FROM table1 - GROUP BY device_id; -``` - -**Result**: - -```SQL -+---------+-----------------------------+-----+ -|device_id| _col1|_col2| -+---------+-----------------------------+-----+ -| 100|2024-11-29T18:30:00.000+08:00| 90.0| -| 101|2024-11-30T14:30:00.000+08:00| 90.0| -+---------+-----------------------------+-----+ -Total line number = 2 -It costs 0.090s -``` - -### 3.5 Downsampling Query - -**Example**: Group data by day and calculate the average temperature using the `date_bin` function. - -```SQL -IoTDB> SELECT device_id,date_bin(1d ,time) as day_time, AVG(temperature) as avg_temp - FROM table1 - WHERE time >= 2024-11-26 00:00:00 AND time <= 2024-11-30 00:00:00 - GROUP BY device_id,date_bin(1d ,time); -``` - -**Result**: - -```SQL -+---------+-----------------------------+--------+ -|device_id| day_time|avg_temp| -+---------+-----------------------------+--------+ -| 100|2024-11-29T08:00:00.000+08:00| 90.0| -| 100|2024-11-28T08:00:00.000+08:00| 86.0| -| 100|2024-11-26T08:00:00.000+08:00| 90.0| -| 101|2024-11-29T08:00:00.000+08:00| 85.0| -| 101|2024-11-27T08:00:00.000+08:00| 85.0| -+---------+-----------------------------+--------+ -Total line number = 5 -It costs 0.066s -``` -### 3.6 Multi device downsampling alignment query - -#### 3.6.1 Sampling Frequency is the Same, but Time is Different - -**Table 1: Sampling Frequency: 1s** - -| Time | device_id | temperature | -| ------------ | --------- | ----------- | -| 00:00:00.001 | d1 | 90.0 | -| 00:00:01.002 | d1 | 85.0 | -| 00:00:02.101 | d1 | 85.0 | -| 00:00:03.201 | d1 | null | -| 00:00:04.105 | d1 | 90.0 | -| 00:00:05.023 | d1 | 85.0 | -| 00:00:06.129 | d1 | 90.0 | - -**Table 2: Sampling Frequency: 1s** - -| Time | device_id | humidity | -| ------------ | --------- | -------- | -| 00:00:00.003 | d1 | 35.1 | -| 00:00:01.012 | d1 | 37.2 | -| 00:00:02.031 | d1 | null | -| 00:00:03.134 | d1 | 35.2 | -| 00:00:04.201 | d1 | 38.2 | 
-| 00:00:05.091 | d1 | 35.4 | -| 00:00:06.231 | d1 | 35.1 | - -**Example: Querying the downsampled data of table1:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS a_time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**Result:** - -```SQL -+-----------------------------+-------+ -| a_time|a_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| -|2025-05-13T00:00:01.000+08:00| 85.0| -|2025-05-13T00:00:02.000+08:00| 85.0| -|2025-05-13T00:00:03.000+08:00| 85.0| -|2025-05-13T00:00:04.000+08:00| 90.0| -|2025-05-13T00:00:05.000+08:00| 85.0| -|2025-05-13T00:00:06.000+08:00| 90.0| -+-----------------------------+-------+ -``` - -**Example: Querying the downsampled data of table2:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS b_time, - first(humidity) AS b_value - FROM table2 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**Result:** - -```SQL -+-----------------------------+-------+ -| b_time|b_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 35.1| -|2025-05-13T00:00:01.000+08:00| 37.2| -|2025-05-13T00:00:02.000+08:00| 37.2| -|2025-05-13T00:00:03.000+08:00| 35.2| -|2025-05-13T00:00:04.000+08:00| 38.2| -|2025-05-13T00:00:05.000+08:00| 35.4| -|2025-05-13T00:00:06.000+08:00| 35.1| -+-----------------------------+-------+ -``` - -**Example: Aligning multiple sequences by integer time:** - -```SQL -IoTDB> SELECT time, - a_value, - b_value - FROM - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) A - JOIN - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(humidity) AS b_value - FROM 
table2 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) B - USING (time) -``` - -**Result:** - -```SQL -+-----------------------------+-------+-------+ -| time|a_value|b_value| -+-----------------------------+-------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| 35.1| -|2025-05-13T00:00:01.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:02.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:03.000+08:00| 85.0| 35.2| -|2025-05-13T00:00:04.000+08:00| 90.0| 38.2| -|2025-05-13T00:00:05.000+08:00| 85.0| 35.4| -|2025-05-13T00:00:06.000+08:00| 90.0| 35.1| -+-----------------------------+-------+-------+ -``` - -- **Retaining NULL Values**: When NULL values have special significance or when you wish to preserve the null values in the data, you can choose to omit FILL METHOD PREVIOUS to avoid filling in the gaps. -**Example:** - -```SQL -IoTDB> SELECT time, - a_value, - b_value - FROM - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1) A - JOIN - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(humidity) AS b_value - FROM table2 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1) B - USING (time) -``` - -**Result:** - -```SQL -+-----------------------------+-------+-------+ -| time|a_value|b_value| -+-----------------------------+-------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| 35.1| -|2025-05-13T00:00:01.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:02.000+08:00| 85.0| null| -|2025-05-13T00:00:03.000+08:00| null| 35.2| -|2025-05-13T00:00:04.000+08:00| 90.0| 38.2| -|2025-05-13T00:00:05.000+08:00| 85.0| 35.4| -|2025-05-13T00:00:06.000+08:00| 90.0| 35.1| -+-----------------------------+-------+-------+ -``` -#### 3.6.2 Different Sampling Frequencies, Different Times - -**Table 1: 
Sampling Frequency: 1s** - -| Time | device_id | temperature | -| ------------ | --------- | ----------- | -| 00:00:00.001 | d1 | 90.0 | -| 00:00:01.002 | d1 | 85.0 | -| 00:00:02.101 | d1 | 85.0 | -| 00:00:03.201 | d1 | null | -| 00:00:04.105 | d1 | 90.0 | -| 00:00:05.023 | d1 | 85.0 | -| 00:00:06.129 | d1 | 90.0 | - -**Table 3: Sampling Frequency: 2s** - -| Time | device_id | humidity | -| ------------ | --------- | -------- | -| 00:00:00.005 | d1 | 35.1 | -| 00:00:02.106 | d1 | 37.2 | -| 00:00:04.187 | d1 | null | -| 00:00:06.156 | d1 | 35.1 | - -**Example: Querying the downsampled data of table1:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS a_time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**Result:** - -```SQL -+-----------------------------+-------+ -| a_time|a_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| -|2025-05-13T00:00:01.000+08:00| 85.0| -|2025-05-13T00:00:02.000+08:00| 85.0| -|2025-05-13T00:00:03.000+08:00| 85.0| -|2025-05-13T00:00:04.000+08:00| 90.0| -|2025-05-13T00:00:05.000+08:00| 85.0| -|2025-05-13T00:00:06.000+08:00| 90.0| -+-----------------------------+-------+ -``` -**Example: Querying the downsampled data of table3:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS c_time, - first(humidity) AS c_value - FROM table3 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**Result:** - -```SQL -+-----------------------------+-------+ -| c_time|c_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 35.1| -|2025-05-13T00:00:01.000+08:00| 35.1| -|2025-05-13T00:00:02.000+08:00| 37.2| -|2025-05-13T00:00:03.000+08:00| 37.2| -|2025-05-13T00:00:04.000+08:00| 37.2| -|2025-05-13T00:00:05.000+08:00| 37.2| -|2025-05-13T00:00:06.000+08:00| 
35.1| -+-----------------------------+-------+ -``` - -**Example: Aligning multiple sequences by the higher sampling frequency:** - -```SQL -IoTDB> SELECT time, - a_value, - c_value - FROM - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) A - JOIN - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(humidity) AS c_value - FROM table3 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) C - USING (time) -``` - -**Result:** - -```SQL -+-----------------------------+-------+-------+ -| time|a_value|c_value| -+-----------------------------+-------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| 35.1| -|2025-05-13T00:00:01.000+08:00| 85.0| 35.1| -|2025-05-13T00:00:02.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:03.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:04.000+08:00| 90.0| 37.2| -|2025-05-13T00:00:05.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:06.000+08:00| 90.0| 35.1| -+-----------------------------+-------+-------+ -``` - -### 3.7 Missing Data Filling - -**Example**: Query the records within a specified time range where `device_id` is '101'. If there are missing data points, fill them using the previous non-null value. 
- -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE time >= 2024-11-26 00:00:00 and time <= 2024-11-30 11:00:00 - AND region='East' AND plant_id='1001' AND device_id='101' - FILL METHOD PREVIOUS; -``` - -**Result**: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:40:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:41:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:42:00.000+08:00| 85.0| 35.2| -|2024-11-27T16:43:00.000+08:00| 85.0| 35.2| -|2024-11-27T16:44:00.000+08:00| 85.0| 35.2| -+-----------------------------+-----------+--------+ -Total line number = 7 -It costs 0.101s -``` - -### 3.8 Sorting & Pagination - -**Example**: Query records from the table, sorting by `humidity` in descending order and placing null values (NULL) at the end. Skip the first 2 rows and return the next 10 rows. 
- -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - ORDER BY humidity desc NULLS LAST - OFFSET 2 - LIMIT 10; -``` - -**Result**: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-28T09:00:00.000+08:00| null| 40.9| -|2024-11-29T18:30:00.000+08:00| 90.0| 35.4| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| -|2024-11-28T10:00:00.000+08:00| 85.0| 35.2| -|2024-11-30T09:30:00.000+08:00| 90.0| 35.2| -|2024-11-27T16:42:00.000+08:00| null| 35.2| -|2024-11-26T13:38:00.000+08:00| 90.0| 35.1| -|2024-11-26T13:37:00.000+08:00| 90.0| 35.1| -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-30T14:30:00.000+08:00| 90.0| 34.8| -+-----------------------------+-----------+--------+ -Total line number = 10 -It costs 0.093s -``` diff --git a/src/UserGuide/latest-Table/Basic-Concept/TTL-Delete-Data_timecho.md b/src/UserGuide/latest-Table/Basic-Concept/TTL-Delete-Data_timecho.md deleted file mode 100644 index de696fa1d..000000000 --- a/src/UserGuide/latest-Table/Basic-Concept/TTL-Delete-Data_timecho.md +++ /dev/null @@ -1,145 +0,0 @@ - - -# TTL Delete Data - -## 1. Overview - -Time-to-Live (TTL) is a mechanism for defining the lifespan of data in a database. In IoTDB, TTL allows setting table-level expiration policies, enabling the system to automatically delete outdated data periodically. This helps manage disk space efficiently, maintain high query performance, and reduce memory usage. - -TTL values are specified in milliseconds, and once data exceeds its defined lifespan, it becomes unavailable for queries and cannot be written to. However, the physical deletion of expired data occurs later during the compaction process. Note that changes to TTL settings can briefly impact the accessibility of data. - -**Notes:** - -1. TTL defines the expiration time of data in milliseconds, independent of the time precision configuration file. -2. 
Modifying TTL settings can cause temporary variations in data accessibility. -3. The system eventually removes expired data, though this process may involve some delay. -4. The TTL expiration check is based on the data point timestamp, not the write time. - -## 2. Set TTL - -In the table model, IoTDB’s TTL operates at the granularity of individual tables. You can set TTL directly on a table or at the database level. When TTL is configured at the database level, it serves as the default for new tables created within the database. However, each table can still have its own independent TTL settings. - -**Note:** Modifying the database-level TTL does not retroactively affect the TTL settings of existing tables. - -### 2.1 Set TTL for Tables - -If TTL is specified when creating a table using SQL, the table’s TTL takes precedence. Refer to [Table-Management](../Basic-Concept/Table-Management_timecho.md) for details. - -Example 1: Setting TTL during table creation: - -```SQL -CREATE TABLE test3 ("site" string id, "temperature" int32) with (TTL=3600); -``` - -Example 2: Changing TTL for an existing table: - -```SQL -ALTER TABLE tableB SET PROPERTIES TTL=3600; -``` - -**Example 3:** If TTL is not specified or is set to the default value, the table inherits the database's TTL. By default, the database TTL is `'INF'` (infinite): - -```SQL -CREATE TABLE test3 ("site" string id, "temperature" int32) with (TTL=DEFAULT); -CREATE TABLE test3 ("site" string id, "temperature" int32); -ALTER TABLE tableB set properties TTL=DEFAULT; -``` - -### 2.2 Set TTL for Databases - -Tables without explicit TTL settings inherit the TTL of their database. Refer to [Database-Management](../Basic-Concept/Database-Management_timecho.md) for details. 
- -Example 4: A database with TTL=3600000 creates tables inheriting this TTL: - -```SQL -CREATE DATABASE db WITH (ttl=3600000); -use db; -CREATE TABLE test3 ("site" string id, "temperature" int32); -``` - -Example 5: A database without a TTL setting creates tables without TTL: - -```SQL -CREATE DATABASE db; -use db; -CREATE TABLE test3 ("site" string id, "temperature" int32); -``` - -Example 6: Setting a table with no TTL explicitly (TTL=INF) in a database with a configured TTL: - -```SQL -CREATE DATABASE db WITH (ttl=3600000); -use db; -CREATE TABLE test3 ("site" string id, "temperature" int32) with (ttl='INF'); -``` - -## 3. Remove TTL - -To cancel a TTL setting, modify the table's TTL to 'INF'. Note that IoTDB does not currently support modifying the TTL of a database. - -```SQL -ALTER TABLE tableB set properties TTL='INF'; -``` - -## 4. View TTL Information - -Use the SHOW DATABASES and SHOW TABLES commands to view TTL details for databases and tables. Refer to [Database-Management](../Basic-Concept/Database-Management_timecho.md) and [Table-Management](../Basic-Concept/Table-Management_timecho.md) for details. - -> Note: TTL settings in the tree model will also be shown. 
- -Example Output: - -```SQL -IoTDB> show databases; -+---------+-------+-----------------------+---------------------+---------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval| -+---------+-------+-----------------------+---------------------+---------------------+ -|test_prop| 300| 1| 3| 100000| -| test2| 300| 1| 1| 604800000| -+---------+-------+-----------------------+---------------------+---------------------+ - -IoTDB> show databases details; -+---------+-------+-----------------------+---------------------+---------------------+-----+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|Model| -+---------+-------+-----------------------+---------------------+---------------------+-----+ -|test_prop| 300| 1| 3| 100000|TABLE| -| test2| 300| 1| 1| 604800000| TREE| -+---------+-------+-----------------------+---------------------+---------------------+-----+ -IoTDB> show tables; -+---------+-------+ -|TableName|TTL(ms)| -+---------+-------+ -| grass| 1000| -| bamboo| 300| -| flower| INF| -+---------+-------+ - -IoTDB> show tables details; -+---------+-------+----------+ -|TableName|TTL(ms)| Status| -+---------+-------+----------+ -| bean| 300|PRE_CREATE| -| grass| 1000| USING| -| bamboo| 300| USING| -| flower| INF| USING| -+---------+-------+----------+ -``` \ No newline at end of file diff --git a/src/UserGuide/latest-Table/Basic-Concept/Table-Management_timecho.md b/src/UserGuide/latest-Table/Basic-Concept/Table-Management_timecho.md deleted file mode 100644 index f2e1bb2cd..000000000 --- a/src/UserGuide/latest-Table/Basic-Concept/Table-Management_timecho.md +++ /dev/null @@ -1,321 +0,0 @@ - - -# Table Management - -Before starting to use the table management functionality, we recommend familiarizing yourself with the following related background knowledge for a better understanding and application of the table management features: -* [Timeseries Data 
Model](../Background-knowledge/Navigating_Time_Series_Data_timecho.md): Understand the basic concepts and characteristics of time series data to establish a foundation for data modeling. -* [Modeling Scheme Design](../Background-knowledge/Data-Model-and-Terminology_timecho.md): Master the IoTDB time series model and its applicable scenarios to provide a design basis for table management. - -## 1. Table Management - -### 1.1 Create a Table - -#### 1.1.1 Manually create a table with CREATE - -Manually create a table within the current or specified database.The format is "database name. table name". - -**Syntax:** - -```SQL -createTableStatement - : CREATE TABLE (IF NOT EXISTS)? qualifiedName - '(' (columnDefinition (',' columnDefinition)*)? ')' - charsetDesc? - comment? - (WITH properties)? - ; - -charsetDesc - : DEFAULT? (CHAR SET | CHARSET | CHARACTER SET) EQ? identifierOrString - ; - -columnDefinition - : identifier columnCategory=(TAG | ATTRIBUTE | TIME) charsetName? comment? - | identifier type (columnCategory=(TAG | ATTRIBUTE | TIME | FIELD))? charsetName? comment? - ; - -charsetName - : CHAR SET identifier - | CHARSET identifier - | CHARACTER SET identifier - ; - -comment - : COMMENT string - ; -``` - -**Note:** - -1. When creating a table, you do not need to specify a time column. IoTDB automatically adds a column named "time" and places it as the first column. All other columns can be added by enabling the `enable_auto_create_schema` option in the database configuration, or through the session interface for automatic creation or by using table modification statements. -2. Since version V2.0.8.2, tables support custom naming of the time column during creation. The order of the custom time column in the table is determined by the order in the creation SQL. The related constraints are as follows: - - - When the column category is set to TIME, the data type must be TIMESTAMP. - - Each table allows at most one time column (columnCategory = TIME). 
- - If no time column is explicitly defined, no other column can use "time" as its name to avoid conflicts with the system's default time column naming. -3. The column category can be omitted and defaults to FIELD. When the column category is TAG or ATTRIBUTE, the data type must be STRING (can be omitted). -4. The TTL of a table defaults to the TTL of its database. If the default value is used, this attribute can be omitted or set to default. -5. table name has the following characteristics: - - - It is case-insensitive and, upon successful creation, is uniformly displayed in lowercase. - - The name can include special characters, such as `~!`"%`, etc. - - Table names containing special characters or Chinese characters must be enclosed in double quotation marks ("") during creation. - - - Note: In SQL, special characters or Chinese table names must be enclosed in double quotes. In the native API, no additional quotes are needed; otherwise, the table name will include the quote characters. - - When naming a table, the outermost double quotation marks (`""`) will not appear in the actual table name. - - ```sql - -- In SQL - "a""b" --> a"b - """""" --> "" - -- In API - "a""b" --> "a""b" - ``` -6. columnDefinition column names have the same characteristics as table names and can include the special character `.`. -7. COMMENT adds a comment to the table. - -**Examples:** - -```SQL -CREATE TABLE table1 ( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE COMMENT 'maintenance', - temperature FLOAT FIELD COMMENT 'temperature', - humidity FLOAT FIELD COMMENT 'humidity', - status Boolean FIELD COMMENT 'status', - arrival_time TIMESTAMP FIELD COMMENT 'arrival_time' -) COMMENT 'table1' WITH (TTL=31536000000); -``` - -Note: If your terminal does not support multi-line paste (e.g., Windows CMD), please reformat the SQL statement into a single line before execution. 
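
The quoting rule for table names in the notes above (the outermost double quotes are dropped and a doubled `""` stands for one literal quote) can be sketched in Java. This is only an illustration of the documented SQL behaviour, assuming nothing about how IoTDB implements it; the class `QuotedIdentifier` and its method `unquote` are hypothetical names, not part of any IoTDB API.

```java
public class QuotedIdentifier {

    // Hypothetical helper: returns the table name that a double-quoted
    // SQL identifier denotes, following the rule described in the notes.
    public static String unquote(String sqlIdentifier) {
        int length = sqlIdentifier.length();
        boolean quoted = length >= 2
                && sqlIdentifier.charAt(0) == '"'
                && sqlIdentifier.charAt(length - 1) == '"';
        if (!quoted) {
            // Unquoted identifiers are returned unchanged in this sketch.
            return sqlIdentifier;
        }
        // Drop the outermost quotes, then collapse each doubled quote.
        String inner = sqlIdentifier.substring(1, length - 1);
        return inner.replace("\"\"", "\"");
    }

    public static void main(String[] args) {
        // The two SQL cases from the notes.
        System.out.println(unquote("\"a\"\"b\""));
        System.out.println(unquote("\"\"\"\"\"\""));
    }
}
```

In the native API no such step applies: per the notes, the string passed in is used as the table name as-is, quote characters included.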
- -### 1.2 View Tables - -Used to view all tables and their properties in the current or a specified database. - -**Syntax:** - -```SQL -SHOW TABLES (DETAILS)? ((FROM | IN) database_name)? -``` - -**Note:** - -1. If the `FROM` or `IN` clause is specified, the command lists all tables in the specified database. -2. If neither `FROM` nor `IN` is specified, the command lists all tables in the currently selected database. If no database is selected (`USE` statement not executed), an error is returned. -3. When the `DETAILS` option is used, the command shows the current state of each table: - 1. `USING`: The table is available and operational. - 2. `PRE_CREATE`: The table is in the process of being created or the creation has failed; the table is not available. - 3. `PRE_DELETE`: The table is in the process of being deleted or the deletion has failed; the table will remain permanently unavailable. - -**Examples:** - -```sql -show tables details from database1; -``` -```shell -+---------------+-----------+------+-------+ -| TableName| TTL(ms)|Status|Comment| -+---------------+-----------+------+-------+ -| table1|31536000000| USING| table1| -+---------------+-----------+------+-------+ -``` - -### 1.3 View Table Columns - -Used to view column names, data types, categories, and states of a table. - -**Syntax:** - -```SQL -(DESC | DESCRIBE) (DETAILS)? -``` - -**Note:** If the `DETAILS` option is specified, detailed state information of the columns is displayed: - -- `USING`: The column is in normal use. -- `PRE_DELETE`: The column is being deleted or the deletion has failed; it is permanently unavailable. 
- - - -**Examples:** - -```sql -desc table1 details; -``` -```shell -+------------+---------+---------+------+------------+ -| ColumnName| DataType| Category|Status| Comment| -+------------+---------+---------+------+------------+ -| time|TIMESTAMP| TIME| USING| null| -| region| STRING| TAG| USING| null| -| plant_id| STRING| TAG| USING| null| -| device_id| STRING| TAG| USING| null| -| model_id| STRING|ATTRIBUTE| USING| null| -| maintenance| STRING|ATTRIBUTE| USING| maintenance| -| temperature| FLOAT| FIELD| USING| temperature| -| humidity| FLOAT| FIELD| USING| humidity| -| status| BOOLEAN| FIELD| USING| status| -|arrival_time|TIMESTAMP| FIELD| USING|arrival_time| -+------------+---------+---------+------+------------+ -``` - -### 1.4 View Table Creation Statement - -Retrieves the complete definition statement of a table or view under the table model. This feature automatically fills in all default values that were omitted during creation, so the displayed statement may differ from the original CREATE statement. - -> This feature is supported starting from v2.0.5. - -**Syntax:** - -```SQL -SHOW CREATE TABLE -``` - -**Note:** - -1. This statement does not support queries on system tables. 
- -**Example:** - -```SQL -show create table table1; -``` -```shell -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| Table| Create Table| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -|table1|CREATE TABLE "table1" ("region" STRING TAG,"plant_id" STRING TAG,"device_id" STRING TAG,"model_id" STRING ATTRIBUTE,"maintenance" STRING ATTRIBUTE,"temperature" FLOAT FIELD,"humidity" FLOAT FIELD,"status" BOOLEAN FIELD,"arrival_time" TIMESTAMP FIELD) WITH (ttl=31536000000)| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 1 -``` - - -### 1.5 Update Tables - -Used to update a table, including adding or deleting columns, modify column type (V2.0.8.2) and configuring table properties. - -**Syntax:** - -```SQL -#addColumn; -ALTER TABLE (IF EXISTS)? tableName=qualifiedName ADD COLUMN (IF NOT EXISTS)? column=columnDefinition; -#dropColumn; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName DROP COLUMN (IF EXISTS)? column=identifier; -#setTableProperties; -// set TTL can use this; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName SET PROPERTIES propertyAssignments; -| COMMENT ON TABLE tableName=qualifiedName IS 'table_comment'; -| COMMENT ON COLUMN tableName.column IS 'column_comment'; -#changeColumndatatype; -| ALTER TABLE (IF EXISTS)? 
tableName=qualifiedName ALTER COLUMN (IF EXISTS)? column=identifier SET DATA TYPE new_type=type; -``` - -**Note:** - -1. The `SET PROPERTIES` operation currently only supports configuring the `TTL` property of a table. -2. Only ATTRIBUTE and FIELD columns can be dropped; TAG columns cannot be dropped. -3. The modified comment will overwrite the original comment. If null is specified, the previous comment will be erased. -4. Since version V2.0.8.2, modifying the data type of a column is supported. Currently, only columns with Category type FIELD can be modified. - - * If the time series is concurrently deleted during the modification process, an error will be reported. - * The new data type must be compatible with the original type. The specific compatibility is shown in the following table: - -**Example:** - -add column -```SQL -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS a TAG COMMENT 'a'; -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS b FLOAT FIELD COMMENT 'b'; -``` -set TTL -```SQL -ALTER TABLE table1 set properties TTL=3600; -``` -set comment -```SQL -COMMENT ON TABLE table1 IS 'table1'; -COMMENT ON COLUMN table1.a IS null; -``` -alter column datatype -```SQL -ALTER TABLE table1 ALTER COLUMN IF EXISTS b SET DATA TYPE DOUBLE; -``` - -### 1.6 Delete Tables - -Used to delete a table. - -**Syntax:** - -```SQL -DROP TABLE (IF EXISTS)? -``` - -**Examples:** - -```SQL -DROP TABLE table1; -DROP TABLE database1.table1; -``` - -### 1.7 Metadata Query -Under the table model, the **total number of measurement points** equals the sum of measurement points of all tables. Currently, the number of measurement points in a single table can be calculated with the formula: -**Measurement points per single table = Number of devices × Number of field columns**. -Support for directly querying measurement points under the table model via SQL statements will be available in future updates. Please stay tuned. 
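
The formula above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not an IoTDB API: the class `MeasurementPointEstimate` and its methods are hypothetical, and the device and field-column counts are supplied by the caller (for example, after running `count devices from` on each table and counting the FIELD columns shown by `desc`). The second table in `main` is invented purely to show the summation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MeasurementPointEstimate {

    // Measurement points of one table = number of devices x number of FIELD columns.
    public static long pointsForTable(long deviceCount, long fieldColumnCount) {
        return Math.multiplyExact(deviceCount, fieldColumnCount);
    }

    // Total for the table model = sum over all tables.
    // Each map value is {deviceCount, fieldColumnCount}.
    public static long totalPoints(Map<String, long[]> tables) {
        long total = 0;
        for (long[] counts : tables.values()) {
            total = Math.addExact(total, pointsForTable(counts[0], counts[1]));
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, long[]> tables = new LinkedHashMap<>();
        tables.put("table1", new long[] {6, 4});      // 6 devices, 4 FIELD columns
        tables.put("other_table", new long[] {3, 2}); // hypothetical second table
        System.out.println("table1: " + pointsForTable(6, 4));
        System.out.println("total: " + totalPoints(tables));
    }
}
```

`Math.multiplyExact` and `Math.addExact` are used so that an overflow on very large deployments raises an exception instead of silently wrapping.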
- -Take `table1` in the [sample data](../Reference/Sample-Data.md) as an example. - -In the organizational structure of this sample, there are three tag columns (`region`, `plant_id`, `device_id`) and four field columns (`temperature`, `humidity`, `status`, `arrival_time`). - -A unique device is identified by the combination of all tag columns. Each unique combination of `region` + `plant_id` + `device_id` represents an independent device. - -The sample data defines 2 regions: Beijing and Shanghai. Details are as follows: -- **Beijing**: 1 factory with ID 1001 - - 2 devices under this factory: IDs 100 and 101 -- **Shanghai**: 2 factories with IDs 3001 and 3002 - - Factory 3001: 2 devices (IDs 100, 101) - - Factory 3002: 2 devices (IDs 100, 101) - -In total, there are 6 unique tag combinations in the table, corresponding to 6 independent devices. - -### Complete Calculation Example for Single-Table Measurement Points -1. Query the number of devices -```sql -IoTDB:database1> count devices from table1; -+--------------+ -|count(devices)| -+--------------+ -| 6| -+--------------+ -Total line number = 1 -It costs 0.019s -``` - -2. Calculate the total measurement points of the single table -- Number of devices: 6 -- Number of field columns: 4 -- Total measurement points of the table: **6 × 4 = 24** - diff --git a/src/UserGuide/latest-Table/Basic-Concept/Write-Updata-Data_timecho.md b/src/UserGuide/latest-Table/Basic-Concept/Write-Updata-Data_timecho.md deleted file mode 100644 index 38436f17d..000000000 --- a/src/UserGuide/latest-Table/Basic-Concept/Write-Updata-Data_timecho.md +++ /dev/null @@ -1,398 +0,0 @@ - - -# Write & Update Data - -## 1. SQL Insertion - -### 1.1 Syntax - -In IoTDB, data insertion follows the general syntax: - -```SQL -INSERT INTO [(COLUMN_NAME[, COLUMN_NAME]*)]? VALUES (COLUMN_VALUE[, COLUMN_VALUE]*) -``` - -**Basic Constraints**: - -1. Tables cannot be automatically created using `INSERT` statements. -2. 
Columns not specified in the `INSERT` statement will automatically be filled with `null`. -3. If no timestamp is provided, the system will use the current time (`now()`). -4. If a column value does not exist for the identified device, the insertion will overwrite any existing `null` values with the new data. -5. If a column value already exists for the identified device, a new insertion will update the column with the new value. -6. Writing duplicate timestamps will update the values in the columns corresponding to the original timestamps. -7. When an `INSERT` statement does not specify column names (e.g., `INSERT INTO table VALUES (...)`), the values in `VALUES` must strictly follow the physical order of columns in the table (this order can be checked via the `DESC table` command). - -Since attributes generally do not change over time, it is recommended to update attribute values using the `UPDATE` statement. See [Data Update](#2-data-updates) for details. - -
- -
- - -### 1.2 Specified Column Insertion - -It is possible to insert data for specific columns. Columns not specified will remain `null`. - -**Example:** - -```SQL -INSERT INTO table1(region, plant_id, device_id, time, temperature, humidity) VALUES ('Hamburg', '1001', '100', '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, time, temperature) VALUES ('Hamburg', '1001', '100', '2025-11-26 13:38:00', 91.0); -``` - -### 1.3 Null Value Insertion - -You can explicitly set `null` values for tag columns, attribute columns, and field columns. - -**Example**: - -Equivalent to the above partial column insertion. - -```SQL -# Equivalent to the example above; -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('Hamburg', '1001', '100', null, null, '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('Hamburg', '1001', '100', null, null, '2025-11-26 13:38:00', 91.0, null); -``` - -If no tag columns are included, the system will automatically create a device with all tag column values set to `null`. - -> **Note:** This operation will not only automatically populate existing tag columns in the table with `null` values but will also populate any newly added tag columns with `null` values in the future. - -### 1.4 Multi-Row Insertion - -IoTDB supports inserting multiple rows of data in a single statement to improve efficiency. 
- -**Example**: - -```SQL -INSERT INTO table1 -VALUES -('2025-11-26 13:37:00', 'Frankfurt', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('2025-11-26 13:38:00', 'Frankfurt', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:38:25'); - -INSERT INTO table1 -(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) -VALUES -('Frankfurt', '1001', '100', 'A', '180', '2025-11-26 13:37:00', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('Frankfurt', '1001', '100', 'A', '180', '2025-11-26 13:38:00', 90.0, 35.1, true, '2025-11-26 13:38:25'); -``` - -#### Notes - -- Referencing non-existent columns in SQL will result in an error code `COLUMN_NOT_EXIST(616)`. -- Data type mismatches between the insertion data and the column's data type will result in an error code `DATA_TYPE_MISMATCH(614)`. - - -### 1.5 Query Write-back - -The IoTDB table model supports the **append-only query write-back** feature, implemented via the `INSERT INTO QUERY` statement. This feature allows writing the results of a query into an **existing** table. - -> **Note**: This feature is available starting from version V2.0.6. - -#### 1.5.1 Syntax Definition - -```sql -INSERT INTO table_name [ ( column [, ... ] ) ] query -``` - -The **query** component supports three formats, which are illustrated with examples below. - -Using the [sample data](../Reference/Sample-Data.md) as the data source, first create the target table: - -```sql -CREATE TABLE target_table ( time TIMESTAMP TIME, region STRING TAG, device_id STRING TAG, temperature FLOAT FIELD ); -Msg: The statement is executed successfully. -``` - -1. Write-back via Standard Query Statement - -The `query` part is a direct `select ... from ...` query. - -**Example**: Use a standard query statement to write the `time`, `region`, `device_id`, and `temperature` data of the Beijing region from `table1` into `target_table`. 
- -```sql -insert into target_table select time,region,device_id,temperature from table1 where region = 'Beijing'; -Msg: The statement is executed successfully. -``` -```sql -select * from target_table where region='Beijing'; -``` -```shell -+-----------------------------+--------+-----------+-------------+ -| time| region| device_id| temperature| -+-----------------------------+--------+-----------+-------------+ -|2024-11-26T13:37:00.000+08:00| Beijing| 100| 90.0| -|2024-11-26T13:38:00.000+08:00| Beijing| 100| 90.0| -|2024-11-27T16:38:00.000+08:00| Beijing| 101| null| -|2024-11-27T16:39:00.000+08:00| Beijing| 101| 85.0| -|2024-11-27T16:40:00.000+08:00| Beijing| 101| 85.0| -|2024-11-27T16:41:00.000+08:00| Beijing| 101| 85.0| -|2024-11-27T16:42:00.000+08:00| Beijing| 101| null| -|2024-11-27T16:43:00.000+08:00| Beijing| 101| null| -|2024-11-27T16:44:00.000+08:00| Beijing| 101| null| -+-----------------------------+--------+-----------+-------------+ -Total line number = 9 -It costs 0.029s -``` - -2. Write-back via Table Reference Query - -The `query` part uses the table reference syntax `table source_table`. - -**Example**: Use a table reference query to write data from `table3` into `target_table`. - -```sql -insert into target_table(time,device_id,temperature) table table3; -Msg: The statement is executed successfully. 
-``` -```sql -select * from target_table where region is null; -``` -```shell -+-----------------------------+------+-----------+-------------+ -| time|region| device_id| temperature| -+-----------------------------+------+-----------+-------------+ -|2025-05-13T00:00:00.001+08:00| null| d1| 90.0| -|2025-05-13T00:00:01.002+08:00| null| d1| 85.0| -|2025-05-13T00:00:02.101+08:00| null| d1| 85.0| -|2025-05-13T00:00:03.201+08:00| null| d1| null| -|2025-05-13T00:00:04.105+08:00| null| d1| 90.0| -|2025-05-13T00:00:05.023+08:00| null| d1| 85.0| -|2025-05-13T00:00:06.129+08:00| null| d1| 90.0| -+-----------------------------+------+-----------+-------------+ -Total line number = 7 -It costs 0.015s -``` - -3. Write-back via Subquery - -The `query` part is a parenthesized subquery. - -**Example**: Use a subquery to write the `time`, `region`, `device_id`, and `temperature` data from `table1` whose timestamps match the records of the Shanghai region in `table2` into `target_table`. - -```sql -insert into target_table (select t1.time, t1.region as region, t1.device_id as device_id, t1.temperature as temperature from table1 t1 where t1.time in (select t2.time from table2 t2 where t2.region = 'Shanghai')); -Msg: The statement is executed successfully. -``` -```sql -select * from target_table where region = 'Shanghai'; -``` -```shell -+-----------------------------+---------+-----------+-------------+ -| time| region| device_id| temperature| -+-----------------------------+---------+-----------+-------------+ -|2024-11-28T08:00:00.000+08:00| Shanghai| 100| 85.0| -|2024-11-29T11:00:00.000+08:00| Shanghai| 100| null| -+-----------------------------+---------+-----------+-------------+ -Total line number = 2 -It costs 0.014s -``` - -#### 1.5.2 Notes - -* The source table in the `query` and the target table `table_name` are allowed to be the same table, e.g., `INSERT INTO testtb SELECT * FROM testtb`. 
-* The target table ​**must already exist**​; otherwise, the error message `550: Table 'xxx.xxx' does not exist` will be thrown. -* The number and types of query result columns must exactly match those of the target table. Object type is currently not supported, and no implicit type conversion is supported. If types mismatch, the error `701: Insert query has mismatched column types` will be raised. -* You can specify a subset of columns in the target table, provided the following rules are met: - * The timestamp column must be included; otherwise, the error message `701: time column can not be null` will be thrown. - * At least one **FIELD** column must be included; otherwise, the error message `701: No Field column present` will be thrown. - * **TAG** columns are optional. - * The number of specified columns can be less than that of the target table. Missing columns will be automatically filled with `NULL` values. -* For Java applications, the `INSERT INTO QUERY` statement can be executed using the [executeNonQueryStatement](../API/Programming-Java-Native-API_timecho.md#_3-1-itablesession-interface) method. -* For REST API access, the `INSERT INTO QUERY` statement can be executed via the [/rest/table/v1/nonQuery](../API/RestAPI-V1_timecho.md#_3-3-non-query-interface) endpoint. -* `INSERT INTO QUERY` does **not** support the `EXPLAIN` and `EXPLAIN ANALYZE` commands. -* To execute the query write-back statement successfully, users must have the following permissions: - * The `SELECT` permission on the source tables involved in the query. - * The `WRITE` permission on the target table. - * For more details about user permissions, refer to [Authority Management](../User-Manual/Authority-Management_timecho.md). - - -### 1.6 Writing Object Type - -To avoid oversized Object write requests, values of **Object** type can be split into segments and written sequentially. In SQL, the `to_object(isEOF, offset, content)` function must be used for value insertion. 
- -> Supported since V2.0.8 - -**Syntax:** - -```SQL -INSERT INTO tableName(time, columnName) VALUES(timeValue, TO_OBJECT(isEOF, offset, content)); -``` - -**Parameters:** - -| Name | Data Type | Description | -|---------|--------------------|-----------------------------------------------------------------------------| -| isEOF | BOOLEAN | Whether the current write contains the last segment of the Object | -| offset | INT64 | Starting offset of the current segment within the Object | -| content | Hexadecimal (HEX) | Content of the current segment | - -**Examples:** - -Add an Object-type column `s1` to table `table1`: - -```SQL -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS s1 OBJECT FIELD COMMENT 'object type'; -``` - -1. **Non-segmented write** - -```SQL -INSERT INTO table1(time, device_id, s1) VALUES(NOW(), 'tag1', TO_OBJECT(TRUE, 0, X'696F746462')); -``` - -2. **Segmented write** - -```SQL --- First write: TO_OBJECT(FALSE, 0, X'696F'); -INSERT INTO table1(time, device_id, s1) VALUES(1, 'tag1', TO_OBJECT(FALSE, 0, X'696F')); - --- Second write: TO_OBJECT(FALSE, 2, X'7464'); -INSERT INTO table1(time, device_id, s1) VALUES(1, 'tag1', TO_OBJECT(FALSE, 2, X'7464')); - --- Third write: TO_OBJECT(TRUE, 4, X'62'); -INSERT INTO table1(time, device_id, s1) VALUES(1, 'tag1', TO_OBJECT(TRUE, 4, X'62')); -``` - -**Notes:** - -1. If only partial segments of an Object are written, querying the column will return `NULL`. Data becomes accessible only after all segments are successfully written. -2. During segmented writes, if the `offset` of the current write does not match the current size of the Object, the write operation will fail. -3. If `offset=0` is used after partial writes, the existing content will be overwritten with new data. - - -## 2. Schema-less Writing - -When performing data writing through Session, IoTDB supports schema-less writing: there is no need to manually create tables beforehand. 
The system automatically constructs the table structure based on the information in the write request, and then directly executes the data writing operation. - -**Example:** - -```Java -try (ITableSession session = - new TableSessionBuilder() - .nodeUrls(Collections.singletonList("127.0.0.1:6667")) - .username("root") - .password("root") - .build()) { - - session.executeNonQueryStatement("CREATE DATABASE db1"); - session.executeNonQueryStatement("use db1"); - - // Insert data without manually creating the table - List columnNameList = - Arrays.asList("region_id", "plant_id", "device_id", "model", "temperature", "humidity"); - List dataTypeList = - Arrays.asList( - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.FLOAT, - TSDataType.DOUBLE); - List columnTypeList = - new ArrayList<>( - Arrays.asList( - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.ATTRIBUTE, - ColumnCategory.FIELD, - ColumnCategory.FIELD)); - Tablet tablet = new Tablet("table1", columnNameList, dataTypeList, columnTypeList, 100); - for (long timestamp = 0; timestamp < 100; timestamp++) { - int rowIndex = tablet.getRowSize(); - tablet.addTimestamp(rowIndex, timestamp); - tablet.addValue("region_id", rowIndex, "1"); - tablet.addValue("plant_id", rowIndex, "5"); - tablet.addValue("device_id", rowIndex, "3"); - tablet.addValue("model", rowIndex, "A"); - tablet.addValue("temperature", rowIndex, 37.6F); - tablet.addValue("humidity", rowIndex, 111.1); - if (tablet.getRowSize() == tablet.getMaxRowNumber()) { - session.insert(tablet); - tablet.reset(); - } - } - if (tablet.getRowSize() != 0) { - session.insert(tablet); - tablet.reset(); - } -} -``` - -After execution, you can verify the table creation using the following command: - -```SQL -desc table1; -``` -```shell -+-----------+---------+-----------+ -| ColumnName| DataType| Category| -+-----------+---------+-----------+ -| time|TIMESTAMP| TIME| -| region_id| STRING| TAG| -| 
plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model| STRING| ATTRIBUTE| -|temperature| FLOAT| FIELD| -| humidity| DOUBLE| FIELD| -+-----------+---------+-----------+ -``` - - -## 3. Data Updates - -### 3.1 Syntax - -```SQL -UPDATE SET updateAssignment (',' updateAssignment)* (WHERE where=booleanExpression)? - -updateAssignment - : identifier EQ expression - ; -``` - -**Note:** - -- Updates are allowed only on `ATTRIBUTE` columns. -- `WHERE` conditions: - - Can only include `TAG` and `ATTRIBUTE` columns; `FIELD` and `TIME` columns are not allowed. - - Aggregation functions are not supported. -- The result of the `SET` assignment expression must be of `string` type and follows the same constraints as the `WHERE` clause. - -**Example**: - -```SQL -update table1 set b = a where substring(a, 1, 1) like '%'; -``` diff --git a/src/UserGuide/latest-Table/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md b/src/UserGuide/latest-Table/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md deleted file mode 100644 index 31a27cfb9..000000000 --- a/src/UserGuide/latest-Table/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md +++ /dev/null @@ -1,290 +0,0 @@ - -# AINode Deployment - -## 1. AINode Introduction - -### 1.1 Capability Introduction - -AINode is the third type of native node provided by TimechoDB, after ConfigNode and DataNode. By interacting with the DataNodes and ConfigNodes of a TimechoDB cluster, this node extends the capability for machine learning analysis on time series. AINode integrates model management, training, and inference within the database engine. It supports performing time series analysis tasks on specified time series data using registered models through simple SQL statements and also supports registering and using custom machine learning models. AINode currently integrates machine learning algorithms and self-developed models for common time series analysis scenarios (e.g., forecasting). 
- -### 1.2 Deployment Modes - -AINode is an additional component outside the TimechoDB cluster and is deployed using a separate installation package. - -
- - -
- -## 2. Installation Preparation - -### 2.1 Installation Package Acquisition - -The key directory structure after extracting the AINode installation package (`timechodb-{version}-ainode-bin.zip`) is as follows: - -| Directory | Type | Description | -| :--- | :--- | :--- | -| lib | Folder | Executable programs and dependencies for AINode | -| sbin | Folder | Operation scripts for AINode, used to start or stop AINode | -| conf | Folder | Configuration files and version declaration file for AINode | - -### 2.2 Pre-installation Verification - -To ensure the AINode installation package you obtained is complete and correct, it is recommended to perform an SHA512 verification before installation and deployment. - -**Preparation:** - -- Obtain the official SHA512 checksum: Please contact Timecho staff. - -**Verification Steps (using Linux as an example):** - -1. Open a terminal, navigate to the directory containing the installation package (e.g., `/data/ainode`): - -```bash -cd /data/ainode -``` - -2. Execute the following command to calculate the hash value: - -```bash -sha512sum timechodb-{version}-ainode-bin.zip -``` - -3. The terminal will output the result (left side is the SHA512 checksum, right side is the filename): - -```bash -(base) root@hadoop@1:/data/ainode (0.664s) -sha512sum timechodb-2.0.6.1-ainode-bin.zip -4d5a6a64935b4f0459bc9ed214c4563aa7a6a5941024336e9416212424707f27bdfdfc70f4c528b51b812687d660014adc1b8add699498ea67ff17c7e619a6f0 timechodb-2.0.6.1-ainode-bin.zip -``` - -4. Compare the output with the official SHA512 checksum. If they match, you can proceed with the AINode installation and deployment steps below. - -**Notes:** - -- If the verification results do not match, please contact Timecho staff to obtain a new installation package. -- If you encounter a "file not found" prompt during verification, check if the file path is correct or if the installation package was downloaded completely. 
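
The manual comparison in the verification steps can also be scripted. The following is a minimal sketch, assuming `sha512sum` and `awk` are available on the host; `verify_sha512` is a hypothetical helper name and is not shipped with the AINode package.

```bash
# Sketch only: verify_sha512 is a hypothetical helper, not part of AINode.
# Usage: verify_sha512 <package-file> <official-sha512-checksum>
verify_sha512() {
  local file expected actual
  file="$1"
  # Normalize the expected digest to lowercase, because sha512sum prints lowercase hex.
  expected="$(printf '%s' "$2" | tr 'A-F' 'a-f')"
  if [ ! -f "$file" ]; then
    echo "file not found: $file" >&2
    return 2
  fi
  # sha512sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(sha512sum "$file" | awk '{print $1}')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: checksum matches"
  else
    echo "MISMATCH: checksum differs" >&2
    return 1
  fi
}
```

The function returns 0 on a match, 1 on a mismatch, and 2 when the file is missing, so a deployment script can stop before unpacking a damaged package.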
- -### 2.3 Environment Requirements - -- Recommended operating environment: Linux, macOS. -- TimechoDB Version: >= V2.0.8. - -#### 2.3.1 Resource Configuration Recommendations - -> Note: The resource configuration recommendations in this section apply only to **model inference tasks**. Guidelines for model training tasks will be provided in subsequent releases. - -The following are baseline resource configurations for model inference running on a single NVIDIA RTX 4090 (24 GB VRAM). For model inference on AINode, overall throughput can be improved by horizontally scaling the number of GPUs. It is generally recommended to deploy servers with 1, 2, 4 or 8 GPUs. - -Specifications of inference tasks used in benchmark tests: -- **Univariate inference**: Historical sequence length: 2880, prediction length: 720 -- **Covariate inference**: Historical sequence length: 2880, prediction length: 720, with 20 known covariates - -| Number of GPUs (NVIDIA 4090, 24 GB VRAM) | Recommended CPU Cores | Recommended Memory (GB) | Supported QPS for Univariate Inference | Supported QPS for Covariate Inference | -|------------------------------------------|-----------------------|-------------------------|-----------------------------------------|---------------------------------------| -| 1 GPU | 16 cores | 24 GB | 100 | 10 | -| 2 GPUs | 32 cores | 48 GB | 200 | 20 | -| 4 GPUs | 64 cores | 96 GB | 400 | 40 | -| 8 GPUs | 128 cores | 192 GB | 800 | 80 | - -**Notes**: -- The CPU and memory configurations above follow this general rule: allocate 16 CPU cores per GPU, and set system memory equal to GPU VRAM at a ratio of 1:1. -- The throughput figures are benchmark references. Actual performance may vary depending on model type, data complexity and deployment environment. -- The throughput of univariate and covariate inference shall be evaluated separately as required, and the two values cannot be summed directly. - - -## 3. 
Installation, Deployment, and Usage - -### 3.1 Installing AINode - -Download the AINode installation package, place it in a dedicated folder, switch to that folder, and extract the package. - -```bash -unzip timechodb-{version}-ainode-bin.zip -``` - -### 3.2 Modifying Configuration Items - -AINode supports modifying some necessary parameters. You can find the following parameters in the `/TIMECHODB_AINODE_HOME/conf/iotdb-ainode.properties` file and make persistent modifications: - -| Name | Description | Type | Default Value | -| :--- | :--- | :--- | :--- | -| `cluster_name` | The cluster identifier the AINode is to join | String | `defaultCluster` | -| `ain_seed_config_node` | The ConfigNode address for AINode registration upon startup | String | `127.0.0.1:10710` | -| `ain_cluster_ingress_address` | The rpc address of the DataNode from which AINode pulls data | String | `127.0.0.1` | -| `ain_cluster_ingress_port` | The rpc port of the DataNode from which AINode pulls data | Integer | `6667` | -| `ain_cluster_ingress_username` | The client username for the DataNode from which AINode pulls data | String | `root` | -| `ain_cluster_ingress_password` | The client password for the DataNode from which AINode pulls data | String | `root` | -| `ain_rpc_address` | The address for AINode service provision and communication (internal service communication interface) | String | `127.0.0.1` | -| `ain_rpc_port` | The port for AINode service provision and communication | String | `10810` | -| `ain_system_dir` | AINode metadata storage path. The starting directory for relative paths is OS-dependent; using an absolute path is recommended. | String | `data/AINode/system` | -| `ain_models_dir` | AINode model file storage path. The starting directory for relative paths is OS-dependent; using an absolute path is recommended. | String | `data/AINode/models` | -| `ain_thrift_compression_enabled` | Whether to enable Thrift compression mechanism for AINode. 0-disable, 1-enable. 
| Boolean | `0` | - -### 3.3 Importing Built-in Weight Files - -*If the deployment environment has network connectivity and can access HuggingFace, the system will automatically pull the built-in model weight files. This step can be skipped.* -*For offline environments, contact Timecho staff to obtain the model weight folder and place it under the `/TIMECHODB_AINODE_HOME/data/ainode/models/builtin` directory.* -**NOTE:** Pay attention to the directory hierarchy. The parent directory for all built-in model weights should be `builtin`. - -### 3.4 Starting AINode - -After completing the deployment of ConfigNodes, you can add an AINode to support time series model management and inference functionality. After specifying the TimechoDB cluster information in the configuration items, you can execute the corresponding command to start the AINode and join the TimechoDB cluster. - -```bash -# Startup command -# Linux and macOS systems -bash sbin/start-ainode.sh - -# Windows system -sbin\start-ainode.bat - -# Background startup command (recommended for long-term operation) -# Linux and macOS systems -bash sbin/start-ainode.sh -d - -# Windows system -sbin\start-ainode.bat -d -``` - -### 3.5 Activating AINode - -1. Refer to TimechoDB Activation: [Activation Method](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md#_2-6-activate-database) - -2. You can verify AINode activation as follows. When the status shows `ACTIVATED`, it indicates successful activation. 
- -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -Total line number = 3 -It costs 0.002s -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2025-07-16T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| AiNodeLimit| 1| 1| -| CpuLimit| 11| Unlimited| -| DeviceLimit| 0| Unlimited| -|TimeSeriesLimit| 0| 9,999| -+---------------+---------+-----------------------------+ -Total line number = 7 -It costs 0.013s -``` - -### 3.6 Checking AINode Node Status - -During startup, AINode automatically joins the TimechoDB cluster. After starting AINode, you can enter an SQL query in the command line. Seeing the AINode node in the cluster with a `Running` status (as shown below) indicates a successful join. 
- -```sql -TimechoDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -Additionally, you can check the model status using the `show models` command. If the model status is incorrect, please verify the weight file path. - -```sql -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### 3.7 Stopping AINode - -If you need to stop a running AINode node, execute the corresponding shutdown script. It supports specifying the port via the `-p` parameter, which corresponds to the `ain_rpc_port` configuration item. - -```bash -# Linux / macOS -bash sbin/stop-ainode.sh -bash sbin/stop-ainode.sh -p # Specify port - -# Windows -sbin\stop-ainode.bat -sbin\stop-ainode.bat -p # Specify port -``` - -After stopping AINode, you can still see the AINode node in the cluster, but its status will be `UNKNOWN` (as shown below). 
AINode functionality will be unavailable at this time. - -```sql -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|UNKNOWN| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -If you need to restart the node, re-execute the startup script. - -### 3.8 Upgrading AINode -If you need to upgrade the version of the current AINode, follow these steps: - -1. Stop the current AINode service - - Run the stop command and ensure the service has completely exited before proceeding with subsequent operations. - - ```bash - # Linux / MacOS - bash sbin/stop-ainode.sh - bash sbin/stop-ainode.sh -p # Specify port - - # Windows - sbin\stop-ainode.bat - sbin\stop-ainode.bat -p # Specify port - ``` - -2. Replace core files - - Delete the `lib` and `sbin` directories of the current version, then copy the `lib` and `sbin` directories from the new version to the corresponding locations. - - Back up the modified configuration files in the `conf` directory, then replace the `conf` folder and synchronize your modified configurations to the corresponding files. - -3. Update built-in model weights (optional) - - If the new version includes updates to built-in models, relevant information will be announced in the [Release History](../IoTDB-Introduction/Release-history_timecho.md). You may contact Timecho staff to obtain the latest weight package, and replace it in the `data/ainode/models/builtin` directory. - -4. After the upgrade is complete, start the AINode service and check the node status. 
For detailed commands, refer to Sections 3.4 and 3.6. \ No newline at end of file diff --git a/src/UserGuide/latest-Table/Deployment-and-Maintenance/AINode_Deployment_timecho.md b/src/UserGuide/latest-Table/Deployment-and-Maintenance/AINode_Deployment_timecho.md deleted file mode 100644 index 08882b31d..000000000 --- a/src/UserGuide/latest-Table/Deployment-and-Maintenance/AINode_Deployment_timecho.md +++ /dev/null @@ -1,319 +0,0 @@ - -# AINode Deployment - -## 1. AINode Introduction - -### 1.1 Capability Introduction - -AINode is the third type of endogenous node provided by IoTDB, after ConfigNode and DataNode. By interacting with the DataNodes and ConfigNodes of the IoTDB cluster, it extends the cluster with machine learning analysis on time series. It supports registering existing machine learning models from external sources and using registered models to complete time series analysis tasks on specified time series data through simple SQL statements. Model creation, management, and inference are integrated into the database engine. Machine learning algorithms and self-developed models are currently available for common time series analysis scenarios, such as prediction and anomaly detection. - -### 1.2 Delivery Method -AINode is an additional package outside the IoTDB cluster and is installed independently. - -### 1.3 Deployment mode -
- - -
- -## 2. Installation preparation - -### 2.1 Get installation package - -Unzip the installation package (`timechodb-{version}-ainode-bin.zip`). The directory structure after unpacking is as follows: - -| **Directory** | **Type** | **Description** | -| ----------- | -------- |-----------------------------------------------------------------------| -| lib | folder | Python package files for AINode | -| sbin | folder | Scripts for starting, removing, and stopping AINode | -| conf | folder | Configuration files for AINode, and runtime environment setup scripts | -| LICENSE | file | License | -| NOTICE | file | Notice | -| README_ZH.md | file | Instructions in Chinese (Markdown format) | -| README.md | file | Instructions | - -### 2.2 Pre-installation Check - -To ensure the AINode installation package you obtained is complete and valid, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: please contact the Timecho team. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/ainode): - ```Bash - cd /data/ainode - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-ainode-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-06.png) - -4. Compare the output result with the official SHA512 checksum. Once confirmed that they match, you can proceed with the installation and deployment of AINode as per the procedures below. - -#### Notes: - -- If the verification results do not match, please contact the Timecho team to re-obtain the installation package. 
-- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - -### 2.3 Environmental Preparation - -1. Recommended operating systems: Ubuntu, macOS -2. IoTDB version: >= V2.0.5.1 -3. Runtime environment - - Python version between 3.9 and 3.12, with pip and venv tools installed. - -## 3. Installation steps - -### 3.1 Install AINode - -1. Ensure Python version is between 3.9 and 3.12: -```shell -python --version -# or -python3 --version -``` - -2. Download the AINode package, place it in a dedicated folder, switch to the folder, and unzip the package: -```shell - unzip timechodb-{version}-ainode-bin.zip - ``` -3. Activate AINode: - -- Enter the IoTDB CLI - -```shell -# For Linux or macOS -./start-cli.sh -sql_dialect table - -# For Windows -./start-cli.bat -sql_dialect table -``` - -- Run the following command to retrieve the machine code required for activation: - -```sql -show system info -``` - -- Copy the returned machine code and send it to the Timecho team: - -```sql -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -``` - -- Enter the activation code provided by the Timecho team in the CLI using the following format. Wrap the activation code in single quotes ('): - -```sql -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZK' -``` - -- You can verify the activation using the following method: when the status shows ACTIVATED, it indicates successful activation. 
- -```sql -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ - -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2025-07-16T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| AiNodeLimit| 1| 1| -| CpuLimit| 11| Unlimited| -| DeviceLimit| 0| Unlimited| -|TimeSeriesLimit| 0| 9,999| -+---------------+---------+-----------------------------+ - -``` - -### 3.2 Configuration item modification - -AINode supports modifying some necessary parameters. 
You can find the following parameters in the `conf/iotdb-ainode.properties` file and make persistent modifications to them: - -| **Name** | **Description** | **Type** | **Default Value** | -| ------------------------------ | ------------------------------------------------------------ | -------- | ------------------ | -| cluster_name | Identifier of the cluster AINode joins | string | defaultCluster | -| ain_seed_config_node | Address of the ConfigNode registered when AINode starts | String | 127.0.0.1:10710 | -| ain_cluster_ingress_address | RPC address of the DataNode for AINode to pull data | String | 127.0.0.1 | -| ain_cluster_ingress_port | RPC port of the DataNode for AINode to pull data | Integer | 6667 | -| ain_cluster_ingress_username | Client username for AINode to pull data from the DataNode | String | root | -| ain_cluster_ingress_password | Client password for AINode to pull data from the DataNode | String | root | -| ain_cluster_ingress_time_zone | Client time zone for AINode to pull data from the DataNode | String | UTC+8 | -| ain_inference_rpc_address | Address for AINode to provide services and communication (internal interface) | String | 127.0.0.1 | -| ain_inference_rpc_port | Port for AINode to provide services and communication | String | 10810 | -| ain_system_dir | Metadata storage path for AINode (relative path starts from OS-dependent directory; absolute path is recommended) | String | data/AINode/system | -| ain_models_dir | Path to store model files for AINode (relative path starts from OS-dependent directory; absolute path is recommended) | String | data/AINode/models | -| ain_thrift_compression_enabled | Whether to enable Thrift compression for AINode (0=disabled, 1=enabled) | Boolean | 0 | - -### 3.3 Importing Weight Files - -> Offline environment only (Online environments can skip this step) -> -Contact Timecho team to obtain the model weight files, then place them in the /IOTDB_AINODE_HOME/data/ainode/models/weights/ directory. 
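
For the offline case, placing the weight files can be sketched as a small shell helper. This is an illustration under stated assumptions: `install_weights` is a hypothetical function name, the AINode home directory is passed in by the caller, and the layout inside the weight folder is whatever the Timecho team delivers (it is not specified here).

```bash
# Sketch only: install_weights is a hypothetical helper, not part of AINode.
# Usage: install_weights <weight-folder> <AINode home directory>
install_weights() {
  local src ainode_home dest
  src="$1"
  ainode_home="$2"
  # Target directory named in the step above, relative to the AINode home.
  dest="$ainode_home/data/ainode/models/weights"
  if [ ! -d "$src" ]; then
    echo "weight folder not found: $src" >&2
    return 1
  fi
  mkdir -p "$dest"
  # Copy the contents of src (not src itself) so its internal hierarchy is kept.
  cp -R "$src"/. "$dest"/
  # List the result so the directory hierarchy can be checked by eye.
  ls "$dest"
}
```

Listing the target directory after the copy makes it easy to confirm that the files landed at the expected level before starting AINode.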
- - -### 3.4 Start AINode - -After the seed ConfigNode has been deployed, you can add AINode nodes to support model registration and inference. After specifying the IoTDB cluster information in the configuration file, execute the corresponding command to start AINode and join the IoTDB cluster. - -- Networking environment startup - -Start command - -```shell - # Start command - # Linux and macOS systems - bash sbin/start-ainode.sh - - # Windows systems - sbin\start-ainode.bat - - # Background startup command (recommended for long-term running) - # Linux and macOS systems - nohup bash sbin/start-ainode.sh > myout.file 2>&1 & - - # Windows systems - nohup bash sbin\start-ainode.bat > myout.file 2>&1 & - ``` - -### 3.5 Detecting the status of AINode nodes - -During startup, the new AINode is automatically added to the IoTDB cluster. After starting AINode, you can run an SQL query in the command line. If you see an AINode node in the cluster and its status is Running (as shown below), the node has joined successfully. - - -```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|Running| 127.0.0.1| 10810|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` - -### 3.6 Stop AINode - -If you need to stop a running AINode node, execute the corresponding shutdown script. 
- -Stop command - -```shell - # Linux / macOS - bash sbin/stop-ainode.sh - - # Windows - sbin\stop-ainode.bat - ``` - -After stopping AINode, you can still see the AINode node in the cluster, but its status is UNKNOWN (as shown below), and AINode functionality is unavailable at this time. - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|UNKNOWN| 127.0.0.1| 10790|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -If you need to restart the node, execute the startup script again. - -## 4. Common Problems - -### 4.1 An error occurs when starting AINode stating that the venv module cannot be found - -When AINode is started using the default method, a Python virtual environment is created in the installation package directory and dependencies are installed into it, so the venv module must be available. Python 3.10 and later generally include venv, but some systems with a preinstalled Python environment do not meet this requirement. There are two solutions when this error occurs (choose one): - -Install the venv module locally. Taking Ubuntu as an example, you can run the following command to install the venv module for the system Python, or install a Python version with built-in venv from the official Python website. - - ```shell -apt-get install python3.10-venv -``` -Install version 3.10.0 of venv into AINode in the AINode path. 
- - ```shell -../Python-3.10.0/python -m venv venv(Folder Name) -``` -When running the startup script, use ` -i ` to specify an existing Python interpreter path as the running environment for AINode, eliminating the need to create a new virtual environment. - -### 4.2 The SSL module in Python is not properly installed and configured to handle HTTPS resources -WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available. -You can install OpenSSL and then rebuild Python to solve this problem. -> Currently Python versions 3.6 to 3.9 are compatible with OpenSSL 1.0.2, 1.1.0, and 1.1.1. - -Python requires OpenSSL to be installed on the system. The specific installation method can be found in [link](https://stackoverflow.com/questions/56552390/how-to-fix-ssl-module-in-python-is-not-available-in-centos) - - ```shell -sudo apt-get install build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev uuid-dev lzma-dev liblzma-dev -sudo -E ./configure --with-ssl -make -sudo make install -``` - -### 4.3 Pip version is too low - -A compilation error similar to "error: Microsoft Visual C++14.0 or greater is required..." appears on Windows. - -This error occurs during installation and compilation, usually because the C++ build tools or setuptools version is too old. You can upgrade pip and setuptools as follows: 
You can check it in - - ```shell -./python -m pip install --upgrade pip -./python -m pip install --upgrade setuptools -``` - - -### 4.4 Install and compile Python - -Use the following instructions to download the installation package from the official website and extract it: - ```shell -.wget https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tar.xz -tar Jxf Python-3.10.0.tar.xz -``` -Compile and install the corresponding Python package: - ```shell -cd Python-3.10.0 -./configure prefix=/usr/local/python3 -make -sudo make install -python3 --version -``` \ No newline at end of file diff --git a/src/UserGuide/latest-Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md b/src/UserGuide/latest-Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md deleted file mode 100644 index 04f6f2342..000000000 --- a/src/UserGuide/latest-Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md +++ /dev/null @@ -1,624 +0,0 @@ - -# Cluster Deployment - -This guide describes how to manually deploy a cluster instance consisting of 3 ConfigNodes and 3 DataNodes (commonly referred to as a 3C3D cluster). - -
- -
- - -## 1. Prerequisites - -1. **System Preparation**: Ensure the system has been configured according to the [System Requirements](../Deployment-and-Maintenance/Environment-Requirements.md). - -2. **IP Configuration**: It is recommended to use hostnames for IP configuration to prevent issues caused by IP address changes. Set the hostname by editing the `/etc/hosts` file. For example, if the local IP is `192.168.1.3` and the hostname is `iotdb-1`, run: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -``` - -Use the hostname for `cn_internal_address` and `dn_internal_address` in IoTDB configuration. - -3. **Unmodifiable Parameters**: Some parameters cannot be changed after the first startup. Refer to the [Parameter Configuration](#22-parameters-configuration) section. - - -4. **Installation Path**: Ensure the installation path contains no spaces or non-ASCII characters to prevent runtime issues. -5. **User Permissions**: Choose one of the following permissions during installation and deployment: - - **Root User (Recommended)**: This avoids permission-related issues. - - **Non-Root User**: - - Use the same user for all operations, including starting, activating, and stopping services. - - Avoid using `sudo`, which can cause permission conflicts. - -6. **Monitoring Panel**: Deploy a monitoring panel to track key performance metrics. Contact the Timecho team for access and refer to the [Monitoring Board Install and Deploy](../Deployment-and-Maintenance/Monitoring-panel-deployment.md). - -7. **Health Check Tool**: Before installation, the health check tool can inspect the operating environment of IoTDB nodes and produce detailed inspection results. For usage, see [Health Check Tool](../Tools-System/Health-Check-Tool.md). - - -## 2. Preparation - -1. Obtain the TimechoDB installation package `timechodb-{version}-bin.zip` by following [IoTDB-Package](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) - -2. 
Configure the operating system environment according to [Environment Requirement](../Deployment-and-Maintenance/Environment-Requirements.md) - -### 2.1 Pre-installation Check - -To ensure the IoTDB Enterprise Edition installation package you obtained is complete and authentic, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Find the "SHA512 Checksum" corresponding to each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/iotdb): - ```Bash - cd /data/iotdb - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-02.png) - -4. Compare the output result with the official SHA512 checksum. Once confirmed that they match, you can proceed with the installation and deployment operations in accordance with the procedures below. - -#### Notes: - -- If the verification results do not match, please contact the Timecho team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - -## 3. 
Installation Steps - -Taking a cluster with three Linux servers with the following information as example: - -| Node IP | Hostname | Services | -| ------------- | -------- | -------------------- | -| 11.101.17.224 | iotdb-1 | ConfigNode, DataNode | -| 11.101.17.225 | iotdb-2 | ConfigNode, DataNode | -| 11.101.17.226 | iotdb-3 | ConfigNode, DataNode | - -### 3.1 Configure Hostnames - -On all three servers, configure the hostnames by editing the `/etc/hosts` file. Use the following commands: - -```Bash -echo "11.101.17.224 iotdb-1" >> /etc/hosts -echo "11.101.17.225 iotdb-2" >> /etc/hosts -echo "11.101.17.226 iotdb-3" >> /etc/hosts -``` - -### 3.2 Extract Installation Package - -Unzip the installation package and navigate to the directory: - -```Plain -unzip timechodb-{version}-bin.zip -cd timechodb-{version}-bin -``` - -### 3.3 Parameters Configuration - -#### 3.3.1 Memory Configuration - -Edit the following files for memory allocation: - -- **ConfigNode**: `./conf/confignode-env.sh` (or `.bat` for Windows) - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------ | :--------------------------------- | :---------- | :-------------- | :-------------------------------------- | -| MEMORY_SIZE | Total memory allocated to the node | Automatically calculated based on system memory, defaulting to 30% of the system memory. | As needed | Save changes without immediate execution; modifications take effect after service restart. | - - -- **DataNode**: `./conf/datanode-env.sh` (or `.bat` for Windows) - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------ | :--------------------------------- |:-----------------------------------------------------------------------------------------| :-------------- | :-------------------------------------- | -| MEMORY_SIZE | Total memory allocated to the node | Automatically calculated based on system memory, defaulting to 50% of the system memory. 
| As needed | Save changes without immediate execution; modifications take effect after service restart. | - - -#### 3.3.2 General Configuration - -Set the following parameters in `./conf/iotdb-system.properties`. Refer to `./conf/iotdb-system.properties.template` for a complete list. - -**Cluster-Level Parameters**: - -| **Parameter** | **Description** | **11.101.17.224** | **11.101.17.225** | **11.101.17.226** | -| :------------------------ | :----------------------------------------------------------- | :---------------- | :---------------- | :---------------- | -| cluster_name | Name of the cluster | defaultCluster | defaultCluster | defaultCluster | -| schema_replication_factor | Metadata replication factor; DataNode count shall not be fewer than this value | 3 | 3 | 3 | -| data_replication_factor | Data replication factor; DataNode count shall not be fewer than this value | 2 | 2 | 2 | - -**ConfigNode Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **11.101.17.224** | **11.101.17.225** | **11.101.17.226** | **Notes** | -| :------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------------------- | :---------------- | :---------------- | :---------------- | :--------------------------------------------------------- | -| cn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. | iotdb-1 | iotdb-2 | iotdb-3 | This parameter cannot be modified after the first startup. | -| cn_internal_port | Port used for internal communication within the cluster | 10710 | 10710 | 10710 | 10710 | 10710 | This parameter cannot be modified after the first startup. 
| -| cn_consensus_port | Port used for consensus protocol communication among ConfigNode replicas | 10720 | 10720 | 10720 | 10720 | 10720 | This parameter cannot be modified after the first startup. | -| cn_seed_config_node | Address of the ConfigNode for registering and joining the cluster. (e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Address and port of the seed ConfigNode (e.g., `cn_internal_address:cn_internal_port`) | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | This parameter cannot be modified after the first startup. | - -**DataNode Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **11.101.17.224** | **11.101.17.225** | **11.101.17.226** | **Notes** | -| :------------------------------ | :----------------------------------------------------------- |:----------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :---------------- | :---------------- | :---------------- | :--------------------------------------------------------- | -| dn_rpc_address | Address for the client RPC service | 127.0.0.1 | By default, the local machine can directly access it. For non-local access, please modify this configuration item to the IPv4 address or hostname of the server where it is located. It is recommended to use the IPv4 address of the server where it is located. | iotdb-1 | iotdb-2 | iotdb-3 | Effective after restarting the service. | -| dn_rpc_port | Port for the client RPC service | 6667 | 6667 | 6667 | 6667 | 6667 | Effective after restarting the service. | -| dn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. 
| iotdb-1 | iotdb-2 | iotdb-3 | This parameter cannot be modified after the first startup. | -| dn_internal_port | Port used for internal communication within the cluster | 10730 | 10730 | 10730 | 10730 | 10730 | This parameter cannot be modified after the first startup. | -| dn_mpp_data_exchange_port | Port used for receiving data streams | 10740 | 10740 | 10740 | 10740 | 10740 | This parameter cannot be modified after the first startup. | -| dn_data_region_consensus_port | Port used for data replica consensus protocol communication | 10750 | 10750 | 10750 | 10750 | 10750 | This parameter cannot be modified after the first startup. | -| dn_schema_region_consensus_port | Port used for metadata replica consensus protocol communication | 10760 | 10760 | 10760 | 10760 | 10760 | This parameter cannot be modified after the first startup. | -| dn_seed_config_node | Address of the ConfigNode for registering and joining the cluster.(e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Address of the first ConfigNode | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | This parameter cannot be modified after the first startup. | - -**Note:** Ensure files are saved after editing. Tools like VSCode Remote do not save changes automatically. - -### 3.4 Start ConfigNode Instances - -1. Start the first ConfigNode (`iotdb-1`) as the seed node - -```Bash - # Unix/OS X - cd sbin - ./start-confignode.sh -d # The "-d" flag starts the process in the background. - - # Windows - # Before version V2.0.4.x - .\start-confignode.bat - - # V2.0.4.x and later versions - .\windows\start-confignode.bat - ``` - -2. Start the remaining ConfigNodes (`iotdb-2` and `iotdb-3`) in sequence. - -If the startup fails, refer to the [Common Issues](#5-common-issues) section below for troubleshooting. 
- -### 3.5 Start DataNode Instances - -On each server, navigate to the `sbin` directory and start the DataNode: - -```Bash - # Unix/OS X - cd sbin - ./start-datanode.sh -d # The "-d" flag starts the process in the background. - - # Windows - # Before version V2.0.4.x - .\start-datanode.bat - - # V2.0.4.x and later versions - .\windows\start-datanode.bat - ``` - -### 3.6 Activate the Database - -#### Option 1: Command-Based Activation - -1. Enter the IoTDB CLI on any node of the cluster: - -**Linux** or **MacOS** - -```Bash -# Before version V2.0.6.x -Shell> bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x and later versions -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` -**Windows** - -```Bash -# Before version V2.0.4.x -Shell> sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.4.x and later versions, before version V2.0.6.x -Shell> sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x and later versions -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` - -2. Execute the following command to obtain the machine code required for activation: - -```SQL -IoTDB> show system info -``` -```shell -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -|01-TE5NLES4-UDDWCMYE,01-GG5NLES4-XXDWCMYE,01-FF5NLES4-WWWWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -3. Execute the following statement to obtain the version number of the database to be activated: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.9.2| 5ea21bc| -+-------+---------+ -Total line number = 1 -``` - -4. 
Provide the obtained machine code and version number to the Timecho team. - -5. Enter the activation codes provided by the Timecho team in the CLI in sequence using the following format. Wrap the activation code in single quotes ('): - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - -- Note : The activation operation only needs to be performed once on any machine in the cluster. - -#### Option 2: File-Based Activation - -1. Start all ConfigNodes and DataNodes. -2. Copy the `system_info` file from the `activation` directory on each server and send them to the Timecho team. -3. Place the license files provided by the Timecho team into the corresponding `activation` folder for each node. - - -### 3.7 Verify Activation - -In the CLI, you can check the activation status by running the `show activation` command; the example below shows a status of ACTIVATED, indicating successful activation. 
- -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - - -### 3.8 One-click Cluster Start and Stop - -#### 3.8.1 Overview - -Within the root directory of IoTDB, the `sbin `subdirectory houses the `start-all.sh` and `stop-all.sh` scripts, which work in concert with the `iotdb-cluster.properties` configuration file located in the `conf` subdirectory. This synergy enables the one-click initiation or termination of all nodes within the cluster from a single node. This approach facilitates efficient management of the IoTDB cluster's lifecycle, streamlining the deployment and operational maintenance processes. - -This following section will introduce the specific configuration items in the `iotdb-cluster.properties` file. - -#### 3.8.2 Configuration Items - -> Note: -> -> * When the cluster changes, this configuration file needs to be manually updated. -> * If the `iotdb-cluster.properties` configuration file is not set up and the `start-all.sh` or `stop-all.sh` scripts are executed, the scripts will, by default, start or stop the ConfigNode and DataNode nodes located in the IOTDB\_HOME directory where the scripts reside. -> * It is recommended to configure SSH passwordless login: If not configured, the script will prompt for the server password after execution to facilitate subsequent start, stop, or destroy operations. If already configured, there is no need to enter the server password during script execution. 
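
The passwordless SSH login recommended in the note above can be prepared with standard OpenSSH tools. The following is a minimal sketch rather than a TimechoDB-provided script: the function name `print_ssh_setup` is hypothetical, the host names `iotdb-1` to `iotdb-3` are illustrative, and the account and port are simply the documented defaults of `ssh_account` and `ssh_port`. It prints the commands instead of running them, so they can be reviewed first.

```bash
#!/usr/bin/env bash
# Sketch only: prints the commands for key-based SSH login instead of running them.
# print_ssh_setup is a hypothetical helper, not part of the IoTDB distribution.
print_ssh_setup() {
  local account=$1 port=$2
  shift 2
  # Generate a key pair once on the node that will run start-all.sh / stop-all.sh.
  echo "ssh-keygen -t ed25519"
  # Copy the public key to every host listed in iotdb-cluster.properties.
  local host
  for host in "$@"; do
    echo "ssh-copy-id -p ${port} ${account}@${host}"
  done
}

# Illustrative values: default ssh_account (root) and ssh_port (22).
print_ssh_setup root 22 iotdb-1 iotdb-2 iotdb-3
```

Run the printed commands on the node from which the one-click scripts will be executed. After that, the scripts should no longer prompt for the server password.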
- -* confignode\_address\_list - -| **Name** | **confignode\_address\_list** | -| :----------------: |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | A list of IP addresses or hostname of the hosts where the ConfigNodes to be started/stopped are located. If there are multiple, they should be separated by commas. | -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* datanode\_address\_list - -| **Name** | **datanode\_address\_list** | -| :----------------: |:------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | A list of IP addresses or hostname of the hosts where the DataNodes to be started/stopped are located. If there are multiple, they should be separated by commas. | -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* ssh\_account - -| **Name** | **ssh\_account** | -| :----------------: | :------------------------------------------------------------------------------------------------- | -| Description | The username used to log in to the target hosts via SSH. All hosts must have the same username. | -| Type | String | -| Default | root | -| Effective | After restarting the system | - -* ssh\_port - -| **Name** | **ssh\_port** | -| :----------------: | :---------------------------------------------------------------------------------- | -| Description | The SSH port exposed by the target hosts. All hosts must have the same SSH port. 
| -| Type | int | -| Default | 22 | -| Effective | After restarting the system | - -* confignode\_deploy\_path - -| **Name** | **confignode\_deploy\_path** | -| :----------------: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Description | The path on the target hosts where all ConfigNodes to be started/stopped are located. All ConfigNodes must be in the same directory on their respective hosts. eg: `/data/demo/apache-iotdb-1.3.1-all-bin`| -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* datanode\_deploy\_path - -| **Name** | **datanode\_deploy\_path** | -| :----------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| Description | The path on the target hosts where all DataNodes to be started/stopped are located. All DataNodes must be in the same directory on their respective hosts.eg: `/data/demo/apache-iotdb-1.3.1-all-bin` | -| Type | String | -| Default | None | -| Effective | After restarting the system | - - -#### 3.8.3 Quick Example - -1. Configuration File: `iotdb-cluster.properties` -```properties -# Configure ConfigNode node addresses, separated by commas -confignode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# Configure DataNode node addresses, separated by commas -datanode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# SSH login username for target deployment servers -ssh_account=root - -# SSH service port number -ssh_port=22 - -# IoTDB installation directory (the program will be deployed into this path on remote nodes) -confignode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -datanode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -``` - -2. 
Run `./start-all.sh` to launch cluster and verify status - Connect to IoTDB CLI and execute `show cluster`. A successful output is shown below: -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -| 0|ConfigNode|Running| 172.xx.xx.16| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 1|ConfigNode|Running| 172.xx.xx.18| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 2|ConfigNode|Running| 172.xx.xx.17| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 3| DataNode|Running| 172.xx.xx.18| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 4| DataNode|Running| 172.xx.xx.17| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 5| DataNode|Running| 172.xx.xx.16| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -``` - - -## 4. Maintenance - -### 4.1 ConfigNode Maintenance - -ConfigNode maintenance includes adding and removing ConfigNodes. Common use cases include: - -- **Cluster Expansion:** If the cluster contains only 1 ConfigNode, adding 2 more ConfigNodes enhances high availability, resulting in a total of 3 ConfigNodes. -- **Cluster Fault Recovery:** If a ConfigNode's machine fails and it cannot function normally, remove the faulty ConfigNode and add a new one to the cluster. - -**Note:** After completing ConfigNode maintenance, ensure that the cluster contains either 1 or 3 active ConfigNodes. Two ConfigNodes do not provide high availability, and more than three ConfigNodes can degrade performance. 
- -#### 4.1.1 Adding a ConfigNode - -**Linux / MacOS :** - -```Bash -sbin/start-confignode.sh -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\start-confignode.bat - -# V2.0.4.x and later versions -sbin\windows\start-confignode.bat -``` - -#### 4.1.2 Removing a ConfigNode - -1. Connect to the cluster using the CLI and confirm the internal address and port of the ConfigNode to be removed: - -```Plain -show confignodes; -``` - -Example output: - -```Plain -IoTDB> show confignodes -+------+-------+---------------+------------+--------+ -|NodeID| Status|InternalAddress|InternalPort| Role| -+------+-------+---------------+------------+--------+ -| 0|Running| 127.0.0.1| 10710| Leader| -| 1|Running| 127.0.0.1| 10711|Follower| -| 2|Running| 127.0.0.1| 10712|Follower| -+------+-------+---------------+------------+--------+ -Total line number = 3 -It costs 0.030s -``` - -2. Remove the ConfigNode using the script: - -**Linux / MacOS:** - -```Bash -sbin/remove-confignode.sh [confignode_id] -# Or: -sbin/remove-confignode.sh [cn_internal_address:cn_internal_port] -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\remove-confignode.bat [confignode_id] -# Or: -sbin\remove-confignode.bat [cn_internal_address:cn_internal_port] - -# V2.0.4.x and later versions -sbin\windows\remove-confignode.bat [confignode_id] -# Or: -sbin\windows\remove-confignode.bat [cn_internal_address:cn_internal_port] -``` - -### 4.2 DataNode Maintenance - -DataNode maintenance includes adding and removing DataNodes. Common use cases include: - -- **Cluster Expansion:** Add new DataNodes to increase cluster capacity. -- **Cluster Fault Recovery:** If a DataNode's machine fails and it cannot function normally, remove the faulty DataNode and add a new one to the cluster. - -**Note:** During and after DataNode maintenance, ensure that the number of active DataNodes is not fewer than the data replication factor (usually 2) or the schema replication factor (usually 3). 
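
The constraint in the note above can be checked before a DataNode is removed. The sketch below is an assumption-laden illustration, not an IoTDB tool: `can_remove_datanode` is a hypothetical helper, and the node count and replication factors are passed in by hand (for example, read from `show datanodes` and `iotdb-system.properties`).

```bash
#!/usr/bin/env bash
# Hypothetical pre-removal check (not shipped with IoTDB).
# Arguments: active DataNode count, data_replication_factor, schema_replication_factor
can_remove_datanode() {
  local active=$1 data_rf=$2 schema_rf=$3
  # The cluster needs at least as many DataNodes as the larger replication factor.
  local required=$(( data_rf > schema_rf ? data_rf : schema_rf ))
  # Succeeds (exit status 0) only if the count after removal still meets the requirement.
  (( active - 1 >= required ))
}

# Example values from this guide: 3 DataNodes, data factor 2, schema factor 3.
if can_remove_datanode 3 2 3; then
  echo "Removal keeps enough DataNodes."
else
  echo "Add a replacement DataNode before removing one."
fi
```

With three DataNodes and a schema replication factor of 3, the check fails, which matches the fault-recovery advice: add the new DataNode first, then remove the faulty one.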
- -#### 4.2.1 Adding a DataNode - -**Linux / MacOS:** - -```Bash -sbin/start-datanode.sh -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\start-datanode.bat - -# V2.0.4.x and later versions -sbin\windows\start-datanode.bat -``` - -**Note:** After adding a DataNode, the cluster load will gradually balance across all nodes as new writes arrive and old data expires (if TTL is set). - -#### 4.2.2 Removing a DataNode - -1. Connect to the cluster using the CLI and confirm the RPC address and port of the DataNode to be removed: - -```sql -show datanodes; -``` - -Example output: - -```sql -IoTDB> show datanodes -+------+-------+----------+-------+-------------+---------------+ -|NodeID| Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum| -+------+-------+----------+-------+-------------+---------------+ -| 1|Running| 0.0.0.0| 6667| 0| 0| -| 2|Running| 0.0.0.0| 6668| 1| 1| -| 3|Running| 0.0.0.0| 6669| 1| 0| -+------+-------+----------+-------+-------------+---------------+ -Total line number = 3 -It costs 0.110s -``` - -2. Remove the DataNode using the script: - -**Linux / MacOS:** - -```Bash -sbin/remove-datanode.sh [dn_rpc_address:dn_rpc_port] -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\remove-datanode.bat [dn_rpc_address:dn_rpc_port] - -# V2.0.4.x and later versions -sbin\windows\remove-datanode.bat [dn_rpc_address:dn_rpc_port] -``` - -### 4.3 Cluster Maintenance - -For more details on cluster maintenance, please refer to: [Cluster Maintenance](../User-Manual/Load-Balance.md) - -## 5. Common Issues - -1. Activation Fails Repeatedly - - Use the `ls -al` command to verify that the ownership of the installation directory matches the current user. - - Check the ownership of all files in the `./activation` directory to ensure they belong to the current user. - -2. ConfigNode Fails to Start - 1. Review the startup logs to check if any parameters, which cannot be modified after the first startup, were changed. - 2. 
Check the logs for any other errors. If unresolved, contact technical support for assistance. - 3. If the deployment is fresh or data can be discarded, clean the environment and redeploy using the following steps: - - **Clean the Environment** - - 1. Stop all ConfigNode and DataNode processes: - ```Bash - sbin/stop-standalone.sh - ``` - - 2. Check for any remaining processes: - ```Bash - jps - # or - ps -ef | grep iotdb - ``` - - 3. If processes remain, terminate them manually: - ```Bash - kill -9 - - #For systems with a single IoTDB instance, you can clean up residual processes with: - ps -ef | grep iotdb | grep -v grep | tr -s ' ' ' ' | cut -d ' ' -f2 | xargs kill -9 - ``` - - 4. Delete the `data` and `logs` directories: - ```Bash - cd /data/iotdb - rm -rf data logs - ``` - -## 6. Appendix - -### 6.1 ConfigNode Parameters - -| Parameter | Description | Required | -| :-------- | :---------------------------------------------------------- | :------- | -| -d | Starts the process in daemon mode (runs in the background). | No | - -### 6.2 DataNode Parameters - -| Parameter | Description | Required | -| :-------- | :----------------------------------------------------------- | :------- | -| -v | Displays version information. | No | -| -f | Runs the script in the foreground without backgrounding it. | No | -| -d | Starts the process in daemon mode (runs in the background). | No | -| -p | Specifies a file to store the process ID for process management. | No | -| -c | Specifies the path to the configuration folder; the script loads configuration files from this location. | No | -| -g | Prints detailed garbage collection (GC) information. | No | -| -H | Specifies the path for the Java heap dump file, used during JVM memory overflow. | No | -| -E | Specifies the file for JVM error logs. | No | -| -D | Defines system properties in the format `key=value`. | No | -| -X | Passes `-XX` options directly to the JVM. | No | -| -h | Displays the help instructions. 
| No | diff --git a/src/UserGuide/latest-Table/Deployment-and-Maintenance/Database-Resources_timecho.md b/src/UserGuide/latest-Table/Deployment-and-Maintenance/Database-Resources_timecho.md deleted file mode 100644 index bb15f8a36..000000000 --- a/src/UserGuide/latest-Table/Deployment-and-Maintenance/Database-Resources_timecho.md +++ /dev/null @@ -1,222 +0,0 @@ - -# Database Resources -## 1. CPU - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
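
| Number of timeseries (frequency<=1HZ) | CPU | Number of nodes (standalone) | Number of nodes (Dual-Active) | Number of nodes (Distributed) |
| :------------------------------------ | :-- | :--------------------------- | :---------------------------- | :---------------------------- |
| Within 100000 | 2-4 cores | 1 | 2 | 3 |
| Within 300000 | 4-8 cores | 1 | 2 | 3 |
| Within 500000 | 8-16 cores | 1 | 2 | 3 |
| Within 1000000 | 16-32 cores | 1 | 2 | 3 |
| Within 2000000 | 32-48 cores | 1 | 2 | 3 |
| Within 10000000 | 48 cores | 1 | 2 | Please contact Timecho Business for consultation |
| Over 10000000 | Please contact Timecho Business for consultation | | | |
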
- -> Supported CPU models: Kunpeng, Phytium, Sunway, Hygon, Zhaoxin, Loongson - -## 2. Memory - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
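
| Number of timeseries (frequency<=1HZ) | Memory | Number of nodes (standalone) | Number of nodes (Dual-Active) | Number of nodes (Distributed) |
| :------------------------------------ | :----- | :--------------------------- | :---------------------------- | :---------------------------- |
| Within 100000 | 2-4G | 1 | 2 | 3 |
| Within 300000 | 6-12G | 1 | 2 | 3 |
| Within 500000 | 12-24G | 1 | 2 | 3 |
| Within 1000000 | 24-48G | 1 | 2 | 3 |
| Within 2000000 | 48-96G | 1 | 2 | 3 |
| Within 10000000 | 128G | 1 | 2 | Please contact Timecho Business for consultation |
| Over 10000000 | Please contact Timecho Business for consultation | | | |
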
- -> Flexible memory configuration options are provided. Users can adjust them in the datanode-env file. For details and configuration guidelines, please refer to [datanode-env](../Reference/System-Config-Manual.md#_3-2-datanode-env-sh-bat) - -**Note**: For dedicated hardware allocation and throughput references for AI model inference scenarios, refer to Section **[2.3.1 Resource Configuration Recommendations](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md#_2-3-1-resource-configuration-recommendations)** in the AINode deployment documentation. - -## 3. Storage (Disk) -### 3.1 Storage space -Calculation Formula: - -```Plain -Storage Space = Number of Measurement Points * Sampling Frequency (Hz) * Size of Each Data Point (Bytes, see the table below) * Storage Duration * Replication Factor / Compression Ratio -``` - -Data Point Size Calculation Table: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
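
| Data Type | Timestamp (Bytes) | Value (Bytes) | Total Data Point Size (Bytes) |
| :-------- | :---------------- | :------------ | :---------------------------- |
| Boolean | 8 | 1 | 9 |
| INT32 / FLOAT (Single Precision) | 8 | 4 | 12 |
| INT64 / DOUBLE (Double Precision) | 8 | 8 | 16 |
| TEXT (String) | 8 | Average = a | 8+a |
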
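
The storage formula and the per-type sizes above can be evaluated with plain shell arithmetic. This is a minimal sketch under stated assumptions: `estimate_storage_bytes` is a hypothetical helper (not part of IoTDB), the sampling frequency is given as an integer number of Hz, the storage duration is given in days, and the compression ratio of 10 is an illustrative figure rather than a guaranteed one.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with IoTDB): estimates required storage in bytes.
# Arguments: points, frequency_hz, bytes_per_point, days, replicas, compression_ratio
estimate_storage_bytes() {
  local points=$1 freq_hz=$2 point_bytes=$3 days=$4 replicas=$5 ratio=$6
  # 86,400 seconds per day. Bash arithmetic is integer-only, so frequencies
  # below 1 Hz would need a different unit (for example, samples per day).
  echo $(( points * freq_hz * point_bytes * 86400 * days * replicas / ratio ))
}

# 100,000 INT32 series (12 bytes per point) at 1 Hz, kept 365 days,
# 3 replicas, assumed compression ratio of 10.
bytes=$(estimate_storage_bytes 100000 1 12 365 3 10)
echo "${bytes} bytes (~$(( bytes / 1000000000000 )) TB)"
# prints "11352960000000 bytes (~11 TB)"
```

The result is reported in decimal terabytes (10^12 bytes). Measured in binary units, the same figure is slightly lower.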
-Example: - -- Scenario: 1,000 devices, 100 measurement points per device, i.e. 100,000 sequences in total. Data type is INT32. Sampling frequency is 1Hz (once per second). Storage duration is 1 year. Replication factor is 3. -- Full Calculation: - ```Plain - 1,000 devices * 100 measurement points * 12 bytes per data point * 86,400 seconds per day * 365 days per year * 3 replicas / 10 compression ratio = 11 TB - ``` -- Simplified Calculation: - ```Plain - 1,000 * 100 * 12 * 86,400 * 365 * 3 / 10 = 11 TB - ``` -### 3.2 Storage Configuration - -- For systems with > 10 million measurement points or high query loads, SSD is recommended. - -## 4. Network (NIC) -When the write throughput does not exceed 10 million points per second, a gigabit network card is required. When the write throughput exceeds 10 million points per second, a 10-gigabit network card is required. - -| **Write** **Throughput** **(Data Points/Second)** | **NIC** **Speed** | -| ------------------------------------------------- | -------------------- | -| < 10 million | 1 Gbps (Gigabit) | -| ≥ 10 million | 10 Gbps (10 Gigabit) | - -## 5. Additional Notes - -- IoTDB supports second-level cluster scaling . Data migration is not required when adding new nodes, so there is no need to worry about limited cluster capacity based on current data estimates. You can add new nodes to the cluster when scaling is needed in the future. \ No newline at end of file diff --git a/src/UserGuide/latest-Table/Deployment-and-Maintenance/Deployment-form_timecho.md b/src/UserGuide/latest-Table/Deployment-and-Maintenance/Deployment-form_timecho.md deleted file mode 100644 index b2daee47f..000000000 --- a/src/UserGuide/latest-Table/Deployment-and-Maintenance/Deployment-form_timecho.md +++ /dev/null @@ -1,63 +0,0 @@ - -# Deployment form - -IoTDB has two operation modes: standalone mode and cluster mode. - -## 1. Standalone Mode - -An IoTDB standalone instance includes 1 ConfigNode and 1 DataNode, i.e., 1C1D. 
- -- **Features**: Easy for developers to install and deploy, with low deployment and maintenance costs and convenient operations. -- **Use Cases**: Scenarios with limited resources or low high-availability requirements, such as edge servers. -- **Deployment Method**: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -## 2. Dual-Active Mode - -Dual-Active Deployment is a feature of TimechoDB, where two independent instances synchronize bidirectionally and can provide services simultaneously. If one instance stops and restarts, the other instance will resume data transfer from the breakpoint. - -> An IoTDB Dual-Active instance typically consists of 2 standalone nodes, i.e., 2 sets of 1C1D. Each instance can also be a cluster. - -- **Features**: The high-availability solution with the lowest resource consumption. -- **Use Cases**: Scenarios with limited resources (only two servers) but requiring high availability. -- **Deployment Method**: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -## 3. Cluster Mode - -An IoTDB cluster instance consists of 3 ConfigNodes and no fewer than 3 DataNodes, typically 3 DataNodes, i.e., 3C3D. If some nodes fail, the remaining nodes can still provide services, ensuring high availability of the database. Performance can be improved by adding DataNodes. - -- **Features**: High availability, high scalability, and improved system performance by adding DataNodes. -- **Use Cases**: Enterprise-level application scenarios requiring high availability and reliability. -- **Deployment Method**: [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - - - -## 4. 
Feature Summary - -| **Dimension** | **Stand-Alone Mode** | **Dual-Active Mode** | **Cluster Mode** | -| :-------------------------- | :------------------------------------------------------- | :------------------------------------------------------ | :------------------------------------------------------ | -| Use Cases | Edge-side deployment, low high-availability requirements | High-availability services, disaster recovery scenarios | High-availability services, disaster recovery scenarios | -| Number of Machines Required | 1 | 2 | ≥3 | -| Security and Reliability | Cannot tolerate single-point failure | High, can tolerate single-point failure | High, can tolerate single-point failure | -| Scalability | Can expand DataNodes to improve performance | Each instance can be scaled as needed | Can expand DataNodes to improve performance | -| Performance | Can scale with the number of DataNodes | Same as one of the instances | Can scale with the number of DataNodes | - -- The deployment steps for Stand-Alone Mode and Cluster Mode are similar (adding ConfigNodes and DataNodes one by one), with differences only in the number of replicas and the minimum number of nodes required to provide services. \ No newline at end of file diff --git a/src/UserGuide/latest-Table/Deployment-and-Maintenance/Docker-Deployment_timecho.md b/src/UserGuide/latest-Table/Deployment-and-Maintenance/Docker-Deployment_timecho.md deleted file mode 100644 index a0d4293d9..000000000 --- a/src/UserGuide/latest-Table/Deployment-and-Maintenance/Docker-Deployment_timecho.md +++ /dev/null @@ -1,487 +0,0 @@ - -# Docker Deployment - -## 1. Environment Preparation - -### 1.1 Install Docker - -```Bash -#Taking Ubuntu as an example. For other operating systems, you can search for installation methods on your own. 
-#step1: Install necessary system tools -sudo apt-get update -sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common -#step2: Install GPG certificate -curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add - -#step3: Add the software source -sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" -#step4: Update and install Docker CE -sudo apt-get -y update -sudo apt-get -y install docker-ce -#step5: Set Docker to start automatically on boot -sudo systemctl enable docker -#step6: Verify if Docker is installed successfully -docker --version #Display version information, indicating successful installation. -``` - -### 1.2 Install Docker Compose - -```Bash -#Installation command -curl -L "https://github.com/docker/compose/releases/download/v2.20.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose -chmod +x /usr/local/bin/docker-compose -ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose -#Verify the installation -docker-compose --version #Display version information, indicating successful installation. -``` - -### 1.3 Install dmidecode - -By default, Linux servers should already have dmidecode. If not, you can use the following command to install it. - -```Bash -sudo apt-get install dmidecode -``` - -After installing `dmidecode`, you can locate its installation path by running:`whereis dmidecode`. Assuming the result is `/usr/sbin/dmidecode`, please remember this path as it will be used in the YML file of Docker Compose later. - -### 1.4 Obtain the Container Image - -For the TimechoDB container image, you can contact the Timecho team to acquire it. - -## 2. Stand-Alone Deployment - -This section demonstrates how to deploy a standalone Docker version of 1C1D. 
- -### 2.1 Load the Image File - -For example, if the IoTDB container image file you obtained is named: `iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz`, use the following command to load the image: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -To view the loaded image, use the following command: - -```Bash -docker images -``` - -![](/img/%E5%8D%95%E6%9C%BA-%E6%9F%A5%E7%9C%8B%E9%95%9C%E5%83%8F.png) - -### 2.2 Create a Docker Bridge Network - -```Bash -docker network create --driver=bridge --subnet=172.18.0.0/16 --gateway=172.18.0.1 iotdb -``` - -### 2.3 Write the Docker-Compose YML File - -Assume the IoTDB installation directory and the YML file are placed under the `/docker-iotdb` folder. The directory structure is as follows:`docker-iotdb/iotdb`, `/docker-iotdb/docker-compose-standalone.yml` - -```Bash -docker-iotdb: -├── iotdb #Iotdb installation directory -│── docker-compose-standalone.yml #YML file for standalone Docker Composer -``` - -The complete content of `docker-compose-standalone.yml` is as follows: - -```Bash -version: "3" -services: - iotdb-service: - image: timecho/timechodb:2.0.2.1-standalone #The image used - hostname: iotdb - container_name: iotdb - restart: always - ports: - - "6667:6667" - environment: - - cn_internal_address=iotdb - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb:10710 - - dn_rpc_address=iotdb - - dn_internal_address=iotdb - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - dn_seed_config_node=iotdb:10710 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - networks: - iotdb: - ipv4_address: 172.18.0.6 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 
1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -networks: - iotdb: - external: true -``` - -### 2.4 First Startup - -Use the following command to start: - -```Bash -cd /docker-iotdb -docker-compose -f docker-compose-standalone.yml up -``` - -Because the system is not activated yet, it exits immediately after the first startup, which is expected. The purpose of the first startup is to generate the machine code file used in the activation process. - -![](/img/%E5%8D%95%E6%9C%BA-%E6%BF%80%E6%B4%BB.png) - -### 2.5 Apply for Activation - -- After the first startup, a `system_info` file is generated in the host directory `/docker-iotdb/iotdb/activation`. Copy this file and send it to the Timecho team. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- Once you receive the `license` file, copy it to the `/docker-iotdb/iotdb/activation` folder. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -### 2.6 Start IoTDB Again - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -![](/img/%E5%90%AF%E5%8A%A8iotdb.png) - -### 2.7 Verify the Deployment - -- Check the logs: If you see the following message, the startup is successful. - - ```Bash - docker logs -f iotdb #View log command; the container is named iotdb in the YML file - 2024-07-19 12:02:32,608 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself!
- ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B21.png) - -- Enter the container and check the service status: - - View the launched container - - ```Bash - docker ps - ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B22.png) - - Enter the container, log in to the database through CLI, and use the show cluster command to view the service status and activation status - - ```Bash - docker exec -it iotdb /bin/bash #Enter the container - ./start-cli.sh -h iotdb #Log in to the database - IoTDB> show cluster #Check the service status - ``` - - If all services are in the `running` state, the IoTDB deployment is successful. - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B23.png) - -### 2.8 Map the `/conf` Directory (Optional) - -If you want to modify configuration files directly on the physical machine, you can map the `/conf` folder from the container. Follow these steps: - -**Step 1**: Copy the `/conf` directory from the container to `/docker-iotdb/iotdb/conf`: - -```Bash -docker cp iotdb:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -**Step 2**: Add the mapping in `docker-compose-standalone.yml`: - -```Bash - volumes: - - ./iotdb/conf:/iotdb/conf # Add this mapping for the /conf folder - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /dev/mem:/dev/mem:ro -``` - -**Step 3**: Restart IoTDB: - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -## 3. Cluster Deployment - -This section describes how to manually deploy a cluster consisting of 3 ConfigNodes and 3 DataNodes, commonly referred to as a 3C3D cluster. - -
- -
- -**Note: Cluster deployment currently supports only host and overlay networks. Bridge networks are not supported.** - -Below, we demonstrate how to deploy a 3C3D cluster using the host network as an example. - -### 3.1 Set Hostnames - -Assume there are 3 Linux servers with the following IP addresses and service roles: - -| Node IP | Hostname | Services | -| :---------- | :------- | :------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode, DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode, DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode, DataNode | - -On each of the 3 machines, configure the hostnames by editing the `/etc/hosts` file. Use the following commands: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### 3.2 Load the Image File - -For example, if the IoTDB container image file is named `iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz`, execute the following command on all 3 servers to load the image: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -To view the loaded images, run: - -```Bash -docker images -``` - -![](/img/%E9%95%9C%E5%83%8F%E5%8A%A0%E8%BD%BD.png) - -### 3.3 Write the Docker-Compose YML Files - -Here, we assume the IoTDB installation directory and YML files are placed under the `/docker-iotdb` folder. The directory structure is as follows: - -```Bash -docker-iotdb: -├── confignode.yml #ConfigNode YML file -├── datanode.yml #DataNode YML file -└── iotdb #IoTDB installation directory -``` - -On each server, create two YML files: `confignode.yml` and `datanode.yml`.
Examples are provided below: - -**confignode.yml:** - -```Bash -#confignode.yml -version: "3" -services: - iotdb-confignode: - image: iotdb-enterprise:2.0.x.x-standalone #The image used - hostname: iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - container_name: iotdb-confignode - command: ["bash", "-c", "entrypoint.sh confignode"] - restart: always - environment: - - cn_internal_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb-1:10710 #The default first node is the seed node - - schema_replication_factor=3 #Number of metadata copies - - data_replication_factor=2 #Number of data replicas - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #Using the host network - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -**datanode.yml:** - -```Bash -#datanode.yml -version: "3" -services: - iotdb-datanode: - image: iotdb-enterprise:2.0.x.x-standalone #The image used - hostname: iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - container_name: iotdb-datanode - command: ["bash", "-c", "entrypoint.sh datanode"] - restart: always - ports: - - "6667:6667" - privileged: true - environment: - - dn_rpc_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - dn_internal_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - dn_seed_config_node=iotdb-1:10710 #The default first node is the seed node - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - schema_replication_factor=3 #Number of metadata copies - - data_replication_factor=2 #Number of data replicas - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #Using the host network - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -### 3.4 Start ConfigNode for the First Time - -Start the ConfigNode on all 3 servers. **Note the startup order**: Start `iotdb-1` first, followed by `iotdb-2` and `iotdb-3`. 
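The ordering requirement can be scripted from a single admin machine. The sketch below is an illustration under stated assumptions: it presumes passwordless SSH to all three hosts and the `/docker-iotdb` layout described above, and `start_confignodes_in_order` with its `runner` argument is a hypothetical helper.

```Bash
# Start ConfigNodes strictly in the order the hosts are given (seed node first).
# $1 is the command used to reach a host (e.g. ssh); remaining args are hosts.
# Hypothetical helper; assumes passwordless SSH and the /docker-iotdb layout.
start_confignodes_in_order() {
  runner="$1"
  shift
  for host in "$@"; do
    # Stop at the first failure so later nodes are never started after a failed one.
    "$runner" "$host" "cd /docker-iotdb && docker-compose -f confignode.yml up -d" || return 1
  done
}

# Dry run: 'echo' prints what would be executed instead of connecting anywhere.
start_confignodes_in_order echo iotdb-1 iotdb-2 iotdb-3
```

Replacing `echo` with `ssh` would issue the commands for real. Note that this only enforces the order in which the start commands are issued; it does not wait for a ConfigNode to finish starting before moving to the next host.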
- -Run the following command on each server: - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d #Background startup -``` - -### 3.5 Apply for Activation - -- After starting the 3 ConfigNodes for the first time, a `system_info` file will be generated in the `/docker-iotdb/iotdb/activation` directory on each physical machine. Copy the `system_info` files from all 3 servers and send them to the Timecho team. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- Place the 3 `license` files into the corresponding `/docker-iotdb/iotdb/activation` folders on each ConfigNode server. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -- Once the `license` files are placed in the `activation` folders, the ConfigNodes will automatically activate. **No restart is required for the ConfigNodes.** - -### 3.6 Start DataNode - -Start the DataNode on all 3 servers: - -```Bash -cd /docker-iotdb -docker-compose -f datanode.yml up -d #Background startup -``` - -![](/img/%E9%9B%86%E7%BE%A4%E7%89%88-dn%E5%90%AF%E5%8A%A8.png) - -### 3.7 Verify Deployment - -- Check the logs: If you see the following message, the DataNode has started successfully. - - ```Bash - docker logs -f iotdb-datanode #View log command - 2024-07-20 16:50:48,937 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
- ``` - - ![](/img/dn%E5%90%AF%E5%8A%A8.png) - -- Enter the container and check the service status: - - View the launched container - - ```Bash - docker ps - ``` - - ![](/img/%E6%9F%A5%E7%9C%8B%E5%AE%B9%E5%99%A8.png) - - Enter any container, log in to the database via CLI, and use the `show cluster` command to check the service status: - -```Bash -docker exec -it iotdb-datanode /bin/bash #Entering the container -./start-cli.sh -h iotdb-1 #Log in to the database -IoTDB> show cluster #View status -``` - -If all services are in the `running` state, the IoTDB deployment is successful. - - ![](/img/%E9%9B%86%E7%BE%A4-%E6%BF%80%E6%B4%BB.png) - -### 3.8 Map the `/conf` Directory (Optional) - -If you want to modify configuration files directly on the physical machine, you can map the `/conf` folder from the container. Follow these steps: - -**Step 1**: Copy the `/conf` directory from the container to `/docker-iotdb/iotdb/conf` on all 3 servers: - -```Bash -docker cp iotdb-confignode:/iotdb/conf /docker-iotdb/iotdb/conf -or -docker cp iotdb-datanode:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -**Step 2**: Add the `/conf` directory mapping in both `confignode.yml` and `datanode.yml` on all 3 servers: - -```Bash -#confignode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - -#datanode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -**Step 3**: Restart IoTDB on all 3 servers: - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d -docker-compose -f datanode.yml up -d -``` \ No newline at end of file diff --git 
a/src/UserGuide/latest-Table/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md b/src/UserGuide/latest-Table/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md deleted file mode 100644 index ac07865bd..000000000 --- a/src/UserGuide/latest-Table/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md +++ /dev/null @@ -1,208 +0,0 @@ - -# Dual Active Deployment - -## 1. What Is Dual-Active Mode? - -Dual-active mode refers to two independent instances (either standalone or clusters) with completely independent configurations. Both instances can handle external read and write operations at the same time, with real-time bidirectional synchronization that can resume from the point of interruption. - -Key features include: - -- **Mutual Backup of Instances**: If one instance stops service, the other remains unaffected. When the stopped instance resumes, the other instance synchronizes the newly written data to it. Applications can connect to both instances for read and write operations, achieving high availability. -- **Cost-Effective Deployment**: The dual-active deployment solution achieves high availability with only two physical nodes, offering cost advantages. Additionally, physical resource isolation for the two instances can be ensured by leveraging dual-ring power and network designs, enhancing operational stability. - -**Note:** The dual-active functionality is exclusively available in enterprise-grade TimechoDB. - -![](/img/20240731104336.png) - -## 2. Prerequisites - -1. **Hostname Configuration**: Prefer hostnames over IP addresses during deployment, so that a later change of the host IP does not prevent the database from starting. For instance, if the local IP is `192.168.1.3` and the hostname is `iotdb-1`, configure it in `/etc/hosts` using: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -``` - -Use the hostname to configure IoTDB’s `cn_internal_address` and `dn_internal_address`. - -2.
**Immutable Parameters**: Some parameters cannot be changed after the initial startup. Follow the steps in the "Installation Steps" section to configure them correctly. - -3. **Monitoring Panel**: Deploying a monitoring panel is recommended to monitor key performance indicators and stay informed about the database’s operational status. Contact the Timecho team to obtain the monitoring panel and refer to [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) for deployment steps. - -## 3. Installation Steps - -This guide uses two standalone nodes, A and B, to deploy the dual-active version of TimechoDB. The IP addresses and hostnames for the nodes are as follows: - -| Machine | IP Address | Hostname | -| ------- | ----------- | -------- | -| A | 192.168.1.3 | iotdb-1 | -| B | 192.168.1.4 | iotdb-2 | - -### 3.1 Install Two Independent TimechoDB Instances - -Install TimechoDB on both machines (A and B) independently. For detailed instructions, refer to the [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) or [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) guide. - -Ensure that configurations for A and B are consistent for optimal dual-active performance. - -### 3.2 Configure Data Synchronization from Machine A to Machine B - -- Connect to the database on Machine A using the CLI tool from the `sbin` directory: - -```Bash -# Unix/OS X -./sbin/start-cli.sh -h iotdb-1 - -# Windows -# Before version V2.0.4.x -.\sbin\start-cli.bat -h iotdb-1 - -# V2.0.4.x and later versions -.\sbin\windows\start-cli.bat -h iotdb-1 -``` - -- Then create and start a data synchronization process.
Use the following SQL command: - -```Bash -create pipe AB -with source ( - 'source.mode.double-living' ='true' -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-2', - 'sink.port'='6667' -) -``` - -- **Note:** To avoid infinite data loops, ensure the parameter `source.mode.double-living` is set to `true` on both A and B. This prevents retransmission of data received through the other instance's pipe. - -### 3.3 Configure Data Synchronization from Machine B to Machine A - -- Connect to the database on Machine B: - -```Bash -# Unix/OS X -./sbin/start-cli.sh -h iotdb-2 - -# Windows -# Before version V2.0.4.x -.\sbin\start-cli.bat -h iotdb-2 - -# V2.0.4.x and later versions -.\sbin\windows\start-cli.bat -h iotdb-2 -``` - -- Then create and start the synchronization process with the following SQL command: - -```Bash -create pipe BA -with source ( - 'source.mode.double-living' ='true' -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-1', - 'sink.port'='6667' -) -``` - -- **Note:** To avoid infinite data loops, ensure the parameter `source.mode.double-living` is set to `true` on both A and B. This prevents retransmission of data received through the other instance's pipe. - -### 3.4 Verify Deployment - -#### Check Cluster Status - -Run the `show cluster` command on both nodes to verify the status of the TimechoDB services: - -```Bash -show cluster -``` - -**Machine A**: - -![](/img/%E5%8F%8C%E6%B4%BB-A.png) - -**Machine B**: - -![](/img/%E5%8F%8C%E6%B4%BB-B.png) - -Ensure all `ConfigNode` and `DataNode` processes are in the `Running` state.
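This status check can also be scripted. The sketch below assumes that `show cluster` prints one table row per node containing the node type (`ConfigNode` or `DataNode`) and its status; `all_nodes_running` is a hypothetical helper, and the sample text is illustrative rather than captured from a real instance.

```Bash
# Read `show cluster` output on stdin; succeed only if at least one node row
# exists and every ConfigNode/DataNode row reports Running.
# Hypothetical helper; the row layout is an assumption.
all_nodes_running() {
  awk '
    /ConfigNode|DataNode/ { total++; if ($0 ~ /Running/) ok++ }
    END { if (total > 0 && ok == total) exit 0; exit 1 }
  '
}

# Illustrative sample, not real CLI output.
sample='|0|ConfigNode|Running|iotdb-1|10710|
|1|DataNode|Running|iotdb-1|10730|'

if printf '%s\n' "$sample" | all_nodes_running; then
  echo "all nodes running"
else
  echo "some node is not running"
fi
```

On a live system the input might come from something like `./sbin/start-cli.sh -h iotdb-1 -e "show cluster"`, assuming your CLI version supports an `-e` option for running a single statement; check the CLI help before relying on it.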
- -#### Check Synchronization Status - -Use the `show pipes` command on both nodes: - -```Bash -show pipes -``` - -Confirm that all pipes are in the `RUNNING` state: - -On machine A: - -![](/img/show%20pipes-A.png) - -On machine B: - -![](/img/show%20pipes-B.png) - -### 3.5 Stop the Dual-Active Instances - -To stop the dual-active instances: - -On machine A: - -```SQL -# Unix/OS X -./sbin/start-cli.sh -h iotdb-1 # Log in to CLI -IoTDB> stop pipe AB # Stop data synchronization -./sbin/stop-standalone.sh # Stop database service - -# Windows -# Before version V2.0.4.x -.\sbin\start-cli.bat -h iotdb-1 -IoTDB> stop pipe AB -.\sbin\stop-standalone.bat - -# V2.0.4.x and later versions -.\sbin\windows\start-cli.bat -h iotdb-1 -IoTDB> stop pipe AB -.\sbin\windows\stop-standalone.bat -``` - -On machine B: - -```SQL -# Unix/OS X -./sbin/start-cli.sh -h iotdb-2 # Log in to CLI -IoTDB> stop pipe BA # Stop data synchronization -./sbin/stop-standalone.sh # Stop database service - -# Windows -# Before version V2.0.4.x -.\sbin\start-cli.bat -h iotdb-2 -IoTDB> stop pipe BA -.\sbin\stop-standalone.bat - -# V2.0.4.x and later versions -.\sbin\windows\start-cli.bat -h iotdb-2 -IoTDB> stop pipe BA -.\sbin\windows\stop-standalone.bat -``` diff --git a/src/UserGuide/latest-Table/Deployment-and-Maintenance/IoTDB-Package_timecho.md b/src/UserGuide/latest-Table/Deployment-and-Maintenance/IoTDB-Package_timecho.md deleted file mode 100644 index c2bffcf22..000000000 --- a/src/UserGuide/latest-Table/Deployment-and-Maintenance/IoTDB-Package_timecho.md +++ /dev/null @@ -1,48 +0,0 @@ - -# Obtain TimechoDB - -## 1. How to obtain TimechoDB - -The TimechoDB installation package can be obtained through product trial application or by directly contacting the Timecho team. - -## 2. 
Installation Package Structure - -After unpacking the installation package (`iotdb-enterprise-{version}-bin.zip`), you will see the following directory structure: - -| **Name** | **Type** | **Description** | -| :--------------- | :------- | :----------------------------------------------------------- | -| activation | Folder | Directory for activation files, including the generated machine code and the TimechoDB activation code obtained from Timecho staff. *(This directory is generated after starting the ConfigNode, after which you can request the activation code.)* | -| conf | Folder | Configuration files directory, containing ConfigNode, DataNode, JMX, and logback configuration files. | -| data | Folder | Default data file directory, containing data files for ConfigNode and DataNode. *(This directory is generated after starting the program.)* | -| lib | Folder | Library files directory. | -| licenses | Folder | Directory for open-source license certificates. | -| logs | Folder | Default log file directory, containing log files for ConfigNode and DataNode. *(This directory is generated after starting the program.)* | -| sbin | Folder | Main scripts directory, containing scripts for starting, stopping, and managing the database. | -| tools | Folder | Tools directory. | -| ext | Folder | Directory for pipe, trigger, and UDF plugin-related files. | -| LICENSE | File | Open-source license file. | -| NOTICE | File | Open-source notice file. | -| README_ZH.md | File | User manual (Chinese version). | -| README.md | File | User manual (English version). | -| RELEASE_NOTES.md | File | Release notes. | - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the MQTT service and REST service JAR files by default. If you need to use them, please contact the Timecho team to obtain them.
\ No newline at end of file diff --git a/src/UserGuide/latest-Table/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md b/src/UserGuide/latest-Table/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md deleted file mode 100644 index b2e601dd2..000000000 --- a/src/UserGuide/latest-Table/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md +++ /dev/null @@ -1,319 +0,0 @@ - -# Stand-Alone Deployment - -This guide describes how to set up a standalone TimechoDB instance, which includes one ConfigNode and one DataNode (commonly referred to as 1C1D). - -## 1. Prerequisites - -1. **System Preparation**: Ensure the system has been configured according to the [System Requirements](../Deployment-and-Maintenance/Environment-Requirements.md). - -2. **Hostname Configuration**: Use hostnames rather than raw IP addresses in the configuration to prevent issues caused by IP address changes. Set the hostname by editing the `/etc/hosts` file. For example, if the local IP is `192.168.1.3` and the hostname is `iotdb-1`, run: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -``` - -Use the hostname for `cn_internal_address` and `dn_internal_address` in IoTDB configuration. - -3. **Unmodifiable Parameters**: Some parameters cannot be changed after the first startup. Refer to the [Parameters Configuration](#23-parameters-configuration) section. - -4. **Installation Path**: Ensure the installation path contains no spaces or non-ASCII characters to prevent runtime issues. - -5. **User Permissions**: Choose one of the following user setups for installation and deployment: - - **Root User (Recommended)**: This avoids permission-related issues. - - **Non-Root User**: - - Use the same user for all operations, including starting, activating, and stopping services. - - Avoid using `sudo`, which can cause permission conflicts. - -6. **Monitoring Panel**: Deploy a monitoring panel to track key performance metrics.
Contact the Timecho team for access and refer to [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) for deployment steps. - -7. **Health Check Tool**: Before installation, the health check tool can inspect the operating environment of IoTDB nodes and produce detailed inspection results. For usage instructions, see [Health Check Tool](../Tools-System/Health-Check-Tool.md). - - -## 2. Installation Steps - -### 2.1 Pre-installation Check - -To ensure the IoTDB Enterprise Edition installation package you obtained is complete and authentic, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Find the "SHA512 Checksum" corresponding to each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/iotdb): - ```Bash - cd /data/iotdb - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-02.png) - -4. Compare the output with the official SHA512 checksum. If they match, proceed with the installation and deployment steps below. - -#### Notes: - -- If the checksums do not match, contact the Timecho team to obtain the installation package again. -- If a "file not found" prompt appears during verification, check whether the file path is correct and whether the installation package has been fully downloaded.
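The comparison in step 4 can be automated rather than done by eye. The following is a minimal sketch, assuming `sha512sum` from GNU coreutils is available; `verify_sha512` is a hypothetical helper, and the expected checksum must still be taken from the Release History document.

```Bash
# Compare a file's SHA512 digest with an expected value.
# Usage: verify_sha512 <file> <expected-hex-digest>
# Hypothetical helper; returns 0 on match, 1 on mismatch or missing file.
verify_sha512() {
  file="$1"
  expected="$2"
  [ -f "$file" ] || { echo "file not found: $file" >&2; return 1; }
  # sha512sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(sha512sum "$file" | awk '{print $1}')"
  # Digests are hexadecimal, so compare them case-insensitively.
  if [ "$(printf '%s' "$actual" | tr 'A-F' 'a-f')" = "$(printf '%s' "$expected" | tr 'A-F' 'a-f')" ]; then
    echo "OK: checksum matches"
  else
    echo "MISMATCH: expected $expected, got $actual" >&2
    return 1
  fi
}
```

A call would look like `verify_sha512 timechodb-{version}-bin.zip "<checksum from Release History>"`; because the function returns a non-zero status on mismatch, it can gate the unzip step in a larger install script.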
- -### 2.2 Extract Installation Package - -Unzip the installation package and navigate to the directory: - -```Bash -unzip timechodb-{version}-bin.zip -cd timechodb-{version}-bin -``` - -### 2.3 Parameters Configuration - -#### 2.3.1 Memory Configuration - -Edit the following files for memory allocation: - -- **ConfigNode**: `./conf/confignode-env.sh` (or `.bat` for Windows) - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------ | :--------------------------------- | :---------- | :-------------- | :-------------------------------------- | -| MEMORY_SIZE | Total memory allocated to the node | Automatically calculated based on system memory, defaulting to 30% of the system memory. | As needed | Save changes without immediate execution; modifications take effect after service restart. | - - -- **DataNode**: `./conf/datanode-env.sh` (or `.bat` for Windows) - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------ | :--------------------------------- |:-----------------------------------------------------------------------------------------| :-------------- | :-------------------------------------- | -| MEMORY_SIZE | Total memory allocated to the node | Automatically calculated based on system memory, defaulting to 50% of the system memory. | As needed | Save changes without immediate execution; modifications take effect after service restart. | - - -#### 2.3.2 General Configuration - -Set the following parameters in `conf/iotdb-system.properties`. Refer to `conf/iotdb-system.properties.template` for a complete list. 
- -**Cluster-Level Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------------------ | :-------------------------- | :------------- | :-------------- | :----------------------------------------------------------- | -| cluster_name | Name of the cluster | defaultCluster | Customizable | Support hot loading, but it is not recommended to change the cluster name by manually modifying the configuration file. | -| schema_replication_factor | Number of metadata replicas | 1 | 1 | In standalone mode, set this to 1. This value cannot be modified after the first startup. | -| data_replication_factor | Number of data replicas | 1 | 1 | In standalone mode, set this to 1. This value cannot be modified after the first startup. | - -**ConfigNode Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------------------- | :--------------------------------------------------------- | -| cn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. | This parameter cannot be modified after the first startup. | -| cn_internal_port | Port used for internal communication within the cluster | 10710 | 10710 | This parameter cannot be modified after the first startup. | -| cn_consensus_port | Port used for consensus protocol communication among ConfigNode replicas | 10720 | 10720 | This parameter cannot be modified after the first startup. | -| cn_seed_config_node | Address of the ConfigNode for registering and joining the cluster. (e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Use `cn_internal_address:cn_internal_port` | This parameter cannot be modified after the first startup. 
| - -**DataNode Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------------------------ | :----------------------------------------------------------- | :-------------- |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :--------------------------------------------------------- | -| dn_rpc_address | Address for the client RPC service | 127.0.0.1 | By default, the local machine can directly access it. For non-local access, please modify this configuration item to the IPv4 address or hostname of the server where it is located. It is recommended to use the IPv4 address of the server where it is located. | Effective after restarting the service. | -| dn_rpc_port | Port for the client RPC service | 6667 | 6667 | Effective after restarting the service. | -| dn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. | This parameter cannot be modified after the first startup. | -| dn_internal_port | Port used for internal communication within the cluster | 10730 | 10730 | This parameter cannot be modified after the first startup. | -| dn_mpp_data_exchange_port | Port used for receiving data streams | 10740 | 10740 | This parameter cannot be modified after the first startup. | -| dn_data_region_consensus_port | Port used for data replica consensus protocol communication | 10750 | 10750 | This parameter cannot be modified after the first startup. | -| dn_schema_region_consensus_port | Port used for metadata replica consensus protocol communication | 10760 | 10760 | This parameter cannot be modified after the first startup. 
| -| dn_seed_config_node | Address of the ConfigNode for registering and joining the cluster (e.g., `cn_internal_address:cn_internal_port`). | 127.0.0.1:10710 | Use `cn_internal_address:cn_internal_port` | This parameter cannot be modified after the first startup. | - -### 2.4 Start ConfigNode - -From the installation directory, run the ConfigNode start script in `sbin`: - -```Bash -# Unix/OS X -./sbin/start-confignode.sh -d # The "-d" flag starts the process in the background. - -# Windows -# Before version V2.0.4.x -.\sbin\start-confignode.bat - -# V2.0.4.x and later versions -.\sbin\windows\start-confignode.bat -``` - -If the startup fails, refer to the [Common Issues](#3-common-issues) section below for troubleshooting. - - - -### 2.5 Start DataNode - -From the installation directory, run the DataNode start script in `sbin`: - -```Bash -# Unix/OS X -./sbin/start-datanode.sh -d # The "-d" flag starts the process in the background. - -# Windows -# Before version V2.0.4.x -.\sbin\start-datanode.bat - -# V2.0.4.x and later versions -.\sbin\windows\start-datanode.bat -``` - -### 2.6 Activate the Database - -#### Option 1: Command-Based Activation - -1. Enter the IoTDB CLI. - -**Linux** or **macOS** - -```Bash -# Before version V2.0.6.x -Shell> bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x and later versions -Shell> bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` -**Windows** - -```Bash -# Before version V2.0.4.x -Shell> sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.4.x and later versions, before version V2.0.6.x -Shell> sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x and later versions -Shell> sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` - - -2.
Execute the following command to obtain the machine code required for activation: - -```SQL -show system info -``` -```Bash -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -3. Execute the following statement to obtain the version number of the database to be activated: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.9.2| 5ea21bc| -+-------+---------+ -Total line number = 1 -``` - -4. Provide the obtained machine code and version number to the Timecho team. - -5. Enter the activation codes provided by the Timecho team in the CLI in sequence using the following format. Wrap the activation code in single quotes ('): - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - -#### Option 2: File-Based Activation - -1. After starting the Confignode and Datanode nodes, enter the `activation` folder and send the `system_info` file to the Timecho team. -2. Receive the `license` file returned by the staff. -3. Place the `license` file into the `activation` folder of the corresponding node. - - -### 2.7 Verify Activation - -In the CLI, you can check the activation status by running the `show activation` command. Check the `ClusterActivationStatus` field. If it shows `ACTIVATED`, the database has been successfully activated. - -![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81.png) - -## 3. Common Issues -1. Activation Fails Repeatedly - - 1. Use the `ls -al` command to verify that the ownership of the installation directory matches the current user. - 2. 
Check the ownership of all files in the `./activation` directory to ensure they belong to the current user. - -2. ConfigNode Fails to Start - - 1. Review the startup logs to check whether any parameter that cannot be modified after the first startup was changed. - 2. Check the logs for any other errors. If unresolved, contact technical support for assistance. - 3. If the deployment is fresh or data can be discarded, clean the environment and redeploy using the following steps: - - **Clean the Environment** - - 1. Stop all ConfigNode and DataNode processes: - ```Bash - sbin/stop-standalone.sh - ``` - - 2. Check for any remaining processes: - ```Bash - jps - # or - ps -ef | grep iotdb - ``` - - 3. If processes remain, terminate them manually: - ```Bash - kill -9 <PID> - - # For systems with a single IoTDB instance, you can clean up residual processes with: - ps -ef | grep iotdb | grep -v grep | tr -s ' ' ' ' | cut -d ' ' -f2 | xargs kill -9 - ``` - - 4. Delete the `data` and `logs` directories: - ```Bash - cd /data/iotdb - rm -rf data logs - ``` - -## 4. Appendix - -### 4.1 ConfigNode Parameters - -| Parameter | Description | Required | -| :-------- | :---------------------------------------------------------- | :------- | -| -d | Starts the process in daemon mode (runs in the background). | No | - -### 4.2 DataNode Parameters - -| Parameter | Description | Required | -| :-------- | :----------------------------------------------------------- | :------- | -| -v | Displays version information. | No | -| -f | Runs the script in the foreground without backgrounding it. | No | -| -d | Starts the process in daemon mode (runs in the background). | No | -| -p | Specifies a file to store the process ID for process management. | No | -| -c | Specifies the path to the configuration folder; the script loads configuration files from this location. | No | -| -g | Prints detailed garbage collection (GC) information.
| No | -| -H | Specifies the path for the Java heap dump file, used during JVM memory overflow. | No | -| -E | Specifies the file for JVM error logs. | No | -| -D | Defines system properties in the format `key=value`. | No | -| -X | Passes `-XX` options directly to the JVM. | No | -| -h | Displays the help instructions. | No | diff --git a/src/UserGuide/latest-Table/Ecosystem-Integration/Ecosystem-Overview_timecho.md b/src/UserGuide/latest-Table/Ecosystem-Integration/Ecosystem-Overview_timecho.md deleted file mode 100644 index 96d73ffc1..000000000 --- a/src/UserGuide/latest-Table/Ecosystem-Integration/Ecosystem-Overview_timecho.md +++ /dev/null @@ -1,44 +0,0 @@ - - -# Overview - -IoTDB Ecosystem Integration Bridges the Full Pipeline of Time-Series Data: -- Through data collection, it enables second-level device connectivity. -- Via data integration, it constructs cross-cloud pipelines. -- Leveraging programming frameworks, it accelerates business logic development. -- With computing engines, it accomplishes distributed processing. -- Through visualization and SQL development, it implements analytical strategies. -- Finally, by interfacing with IoT platforms, it achieves edge-cloud synergy—building a complete intelligent closed loop from the physical world to digital decision-making. 
- -![](/img/eco-overview-n-en.png) - -The following documentation will help you quickly and comprehensively understand the usage of various integration tools at each stage: - -- Computing Engine - - Spark [Spark](./Spark-IoTDB.md) -- SQL Development - - DBeaver [DBeaver](./DBeaver.md) - - DataGrip [DataGrip](./DataGrip.md) -- Programming Framework - - Spring Boot Starter [Spring Boot Starter](./Spring-Boot-Starter.md) - - Mybatis Generator [Mybatis Generator](./Mybatis-Generator.md) - - MyBatisPlus Generator [MyBatisPlus Generator](./MyBatisPlus-Generator.md) \ No newline at end of file diff --git a/src/UserGuide/latest-Table/Ecosystem-Integration/SeaTunnel_timecho.md b/src/UserGuide/latest-Table/Ecosystem-Integration/SeaTunnel_timecho.md deleted file mode 100644 index 115a94e1d..000000000 --- a/src/UserGuide/latest-Table/Ecosystem-Integration/SeaTunnel_timecho.md +++ /dev/null @@ -1,193 +0,0 @@ - - - -# Apache SeaTunnel - -## 1. Overview - -SeaTunnel is a distributed data integration platform designed for massive data volumes. It offers high performance and elastic scaling, and connects multi-source heterogeneous data through standardized Connectors, each composed of a Source and a Sink. A Source abstracts a data source into the unified SeaTunnelRow format; after dynamic resource scheduling and batch processing optimization, a Sink writes the data efficiently to the target storage system. The IoTDB Connector integrates IoTDB with SeaTunnel to address core challenges in time-series scenarios such as **high-throughput writing, multi-source governance, and complex analysis**. Together with SeaTunnel's out-of-the-box connector ecosystem and automated operation and maintenance capabilities, it helps enterprises quickly build **low-cost, highly reliable, and easily scalable** data infrastructure in fields like the Internet of Things and the industrial internet. - -## 2.
Usage Steps - -### 2.1 Environment Preparation - -#### 2.1.1 Software Requirements - -| Software | Version | Installation Reference | -| ------------- | ------------- |-----------------------------------------------------------| -| IoTDB | >= 2.0.5 | [Quick Start](../QuickStart/QuickStart_timecho.md) | -| SeaTunnel | 2.3.12 | [Official Website](https://seatunnel.apache.org/download) | - -* Thrift Version Conflict Resolution (Only required for Spark engine): - -```Bash -# Remove older Thrift from Spark -rm -f $SPARK_HOME/jars/libthrift* -# Copy IoTDB's Thrift library to Spark classpath -cp $IOTDB_HOME/lib/libthrift* $SPARK_HOME/jars/ -``` - -#### 2.1.2 Dependency Configuration - -1. JDBC - -* Spark/Flink Engine: Place the [JDBC driver JAR](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) into the `${SEATUNNEL_HOME}/plugins/` directory. -* SeaTunnel Zeta Engine: Place the [JDBC driver JAR](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) into the `${SEATUNNEL_HOME}/lib/` directory. - -2. Connector - -Place the corresponding version of the [SeaTunnel Connector](https://mvnrepository.com/artifact/org.apache.seatunnel/connector-iotdb) into the `${SEATUNNEL_HOME}/plugins/` directory. 
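The directory rules in 2.1.2 can be captured in a small shell helper. This is only a sketch: the function name `jdbc_target_dir`, the engine labels `spark`, `flink` and `zeta`, and the fallback path `/opt/seatunnel` are assumptions made for illustration, not part of SeaTunnel or IoTDB.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the directory the IoTDB JDBC driver JAR belongs in
# for a given SeaTunnel engine, following the rules listed in 2.1.2.
jdbc_target_dir() {
  local engine="$1"
  local home="${SEATUNNEL_HOME:-/opt/seatunnel}"  # assumed fallback install path
  case "${engine}" in
    spark|flink) echo "${home}/plugins" ;;  # Spark/Flink engine
    zeta)        echo "${home}/lib" ;;      # SeaTunnel Zeta engine
    *)           echo "unknown engine: ${engine}" >&2; return 1 ;;
  esac
}

# Print the target directory for the Zeta engine.
jdbc_target_dir zeta
```

A possible use is `cp <your-iotdb-jdbc-jar> "$(jdbc_target_dir zeta)/"`, where the JAR name is whatever version you downloaded.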
- -### 2.2 Reading Data (IoTDB Source Connector) - -#### 2.2.1 Configuration Parameters - -| **Parameter** | **Type** | **Required** | **Default** | **Description** | -| -------------------------- | -------- | ------------ | ----------- | --------------------------------------------------------------------------------------------------------------------------------------- | -| `node_urls` | string | yes | - | IoTDB cluster address, format: `"host1:port"` or `"host1:port,host2:port"` | -| `username` | string | yes | - | IoTDB username | -| `password` | string | yes | - | IoTDB password | -| `sql_dialect` | string | no | tree | IoTDB model: `tree` for tree model; `table` for table model | -| `sql` | string | yes | - | SQL query statement to execute | -| `database` | string | no | - | Database name, only effective in table model | -| `schema` | config | yes | - | Data schema definition | -| `fetch_size` | int | no | - | Number of data rows fetched per request from IoTDB during query execution | -| `lower_bound` | long | no | - | Lower bound of time range (used for data partitioning by time column) | -| `upper_bound` | long | no | - | Upper bound of time range (used for data partitioning by time column) | -| `num_partitions` | int | no | - | Number of partitions (used when partitioning by time column):
1 partition: uses the full time range
If partitions < (upper_bound - lower_bound), the difference is used as actual partitions | -| `thrift_default_buffer_size`| int | no | - | Thrift protocol buffer size | -| `thrift_max_frame_size` | int | no | - | Thrift maximum frame size | -| `enable_cache_leader` | boolean | no | - | Whether to enable leader node caching | -| `version` | string | no | - | Client SQL semantic version (`V_0_12` / `V_0_13`) | - -#### 2.2.2 Configuration Example - -1. Create a new file `iotdb_source_example.conf` in the `${SEATUNNEL_HOME}/config/` directory: - -```bash -env { - parallelism = 2 # Parallelism set to 2 - job.mode = "BATCH" # Batch mode -} - -source { - IoTDB { - node_urls = "localhost:6667" - username = "root" - password = "root" - sql_dialect = "table" - sql = "SELECT time,device_id,city,s1,s2,s3,s4 FROM tcollector.table1" - schema { - fields { - time = timestamp - device_id = string - city= string - s1= int - s2= bigint - s3= float - s4= double - } - } - } -} - -sink { - Console { - } # Output to console -} -``` - -2. Run SeaTunnel with the following command: - -```Bash -./bin/seatunnel.sh --config config/iotdb_source_example.conf -e local -``` - -3. For more details, please refer to the official Apache SeaTunnel documentation on [IoTDB Source Connector](https://seatunnel.apache.org/docs/2.3.12/connector-v2/source/IoTDB). 
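The time-range parameters from 2.2.1 (`lower_bound`, `upper_bound`, `num_partitions`, `fetch_size`) might be combined as in the sketch below, which writes a second job file next to the first one. The file name `iotdb_source_partitioned.conf`, the bound values, the partition count and the fetch size are illustrative assumptions; the remaining settings mirror the example above. How the connector splits the range is described only by the parameter table, so treat the result as something to verify against your own data.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Assumption: SEATUNNEL_HOME points at the SeaTunnel install; falls back to the current directory.
CONF_DIR="${SEATUNNEL_HOME:-.}/config"
CONF_FILE="${CONF_DIR}/iotdb_source_partitioned.conf"  # hypothetical file name
mkdir -p "${CONF_DIR}"

# Quoted heredoc delimiter: the job definition is written verbatim, without shell expansion.
cat > "${CONF_FILE}" <<'EOF'
env {
  parallelism = 2
  job.mode = "BATCH"
}

source {
  IoTDB {
    node_urls = "localhost:6667"
    username = "root"
    password = "root"
    sql_dialect = "table"
    sql = "SELECT time,device_id,s1 FROM tcollector.table1"
    # Hypothetical bounds, expressed in the same unit as the time column
    lower_bound = 1704067200000
    upper_bound = 1704153600000
    num_partitions = 4
    fetch_size = 1024
    schema {
      fields {
        time = timestamp
        device_id = string
        s1 = int
      }
    }
  }
}

sink {
  Console {
  }
}
EOF

echo "Wrote ${CONF_FILE}"
```

The job can then be started the same way as in step 2, pointing `--config` at the new file.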
- -### 2.3 Writing Data (IoTDB Sink Connector) - -#### 2.3.1 Configuration Parameters - -| **Parameter** | **Type** | **Required** | **Default** | **Description** | -| ----------------------------- | --------- | ------------ | ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `node_urls` | Array | yes | - | IoTDB cluster address, format: `["host1:port"]` or `["host1:port","host2:port"]` | -| `username` | String | yes | - | IoTDB username | -| `password` | String | yes | - | IoTDB password | -| `sql_dialect` | String | no | tree | IoTDB model: `tree` for tree model; `table` for table model | -| `storage_group` | String | yes | - | IoTDB tree model: specifies the storage group for devices (path prefix) e.g., deviceId = \${storage_group} + "." 
+ \${key_device}; IoTDB table model: specifies the database | -| `key_device` | String | yes | - | IoTDB tree model: field name in SeaTunnelRow that specifies the IoTDB device ID; IoTDB table model: field name in SeaTunnelRow that specifies the IoTDB table name | -| `key_timestamp` | String | no | processing time | IoTDB tree model: field name in SeaTunnelRow that specifies the IoTDB timestamp (if not specified, processing time is used as timestamp); IoTDB table model: field name in SeaTunnelRow that specifies the IoTDB time column (if not specified, processing time is used as timestamp) | -| `key_measurement_fields` | Array | no | See description | IoTDB tree model: field names in SeaTunnelRow that specify the list of IoTDB measurements (if not specified, includes all fields except `key_device` and `key_timestamp`); IoTDB table model: field names in SeaTunnelRow that specify the IoTDB field columns (if not specified, includes all fields except `key_device`, `key_timestamp`, `key_tag_fields`, `key_attribute_fields`) | -| `key_tag_fields` | Array | no | - | IoTDB tree model: not applicable; IoTDB table model: field names in SeaTunnelRow that specify the IoTDB tag columns | -| `key_attribute_fields` | Array | no | - | IoTDB tree model: not applicable; IoTDB table model: field names in SeaTunnelRow that specify the IoTDB attribute columns | -| `batch_size` | Integer | no | 1024 | For batch writing, data is flushed to IoTDB when the buffer reaches `batch_size` or when the time reaches `batch_interval_ms` | -| `max_retries` | Integer | no | - | Number of retries on failed flush | -| `retry_backoff_multiplier_ms` | Integer | no | - | Multiplier used to generate the next backoff delay | -| `max_retry_backoff_ms` | Integer | no | - | Maximum wait time before retrying a request to IoTDB | -| `default_thrift_buffer_size` | Integer | no | - | Initial buffer size for Thrift client in IoTDB | -| `max_thrift_frame_size` | Integer | no | - | Maximum frame size for Thrift client 
in IoTDB | -| `zone_id` | string | no | - | IoTDB client `java.time.ZoneId` | -| `enable_rpc_compression` | Boolean | no | - | Enable RPC compression in IoTDB client | -| `connection_timeout_in_ms` | Integer | no | - | Maximum time (in milliseconds) to wait when connecting to IoTDB | - -#### 2.3.2 Configuration Example - -1. Create a new file `iotdb_sink_example.conf` in the `${SEATUNNEL_HOME}/config/` directory: - -```bash -# Define runtime environment -env { - parallelism = 4 - job.mode = "BATCH" -} - -source{ - Jdbc { - url = "jdbc:mysql://localhost:3306/demo_db?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true" - driver = "com.mysql.cj.jdbc.Driver" - connection_check_timeout_sec = 100 - user = "root" - password = "IoTDB@2024" - query = "select * from device" - } -} -sink { - IoTDB { - node_urls = ["localhost:6667"] - username = "root" - password = "root" - sql_dialect = "table" - storage_group = "seatunnel" - key_device = "id" - key_timestamp = "intime" - } -} -``` - -2. Run SeaTunnel with the following command: - -```Bash -./bin/seatunnel.sh --config config/iotdb_sink_example.conf -e local -``` - -3. For more configuration parameters and examples, please refer to the official Apache SeaTunnel documentation on [IoTDB Sink Connector](https://seatunnel.apache.org/docs/2.3.12/connector-v2/sink/IoTDB). diff --git a/src/UserGuide/latest-Table/IoTDB-Introduction/IoTDB-Introduction_timecho.md b/src/UserGuide/latest-Table/IoTDB-Introduction/IoTDB-Introduction_timecho.md deleted file mode 100644 index f74257551..000000000 --- a/src/UserGuide/latest-Table/IoTDB-Introduction/IoTDB-Introduction_timecho.md +++ /dev/null @@ -1,299 +0,0 @@ - -# IoTDB Introduction - -TimechoDB is a high-performance, cost-efficient, and IoT-native time-series database developed by Timecho. As an enterprise-grade extension of Apache IoTDB, it is designed to tackle the complexities of managing large-scale time-series data in IoT environments. 
These challenges include high-frequency data sampling, massive data volumes, out-of-order data, extended processing times, diverse analytical demands, and high storage and maintenance costs. - -TimechoDB enhances Apache IoTDB with superior functionality, optimized performance, enterprise-grade reliability, and an intuitive toolset, enabling industrial users to streamline data operations and unlock deeper insights. - -- [Quick Start](../QuickStart/QuickStart_timecho.md): Download, Deploy, and Use - -## 1. TimechoDB Data Management Solution - -The Timecho ecosystem provides an integrated **collect-store-use** solution, covering the complete lifecycle of time-series data, from acquisition to analysis. - -![](/img/Introduction-en-timecho-new.png) - -Key components include: - -1. **Time-Series Database (TimechoDB)**: - 1. The primary storage and processing engine for time-series data, based on Apache IoTDB. - 2. Offers **high compression, advanced query capabilities, real-time stream processing, high availability, and scalability**. - 3. Provides **security features, multi-language APIs, and seamless integration with external systems**. -2. **Time-Series Standard File Format (Apache TsFile)**: - 1. A high-performance storage format originally developed by Timecho’s core contributors. - 2. Enables **efficient compression and fast querying**. - 3. Powers TimechoDB’s **data collection, storage, and analysis pipeline**, ensuring unified data management. -3. **Time-Series AI Engine (AINode)**: - 1. Integrates **machine learning and deep learning** for time-series analytics. - 2. Extracts actionable insights directly from TimechoDB-stored data. -4. **Data Collection Framework**: - 1. Supports **various industrial protocols, resumable transfers, and network barrier penetration**. - 2. Facilitates **reliable data acquisition in challenging industrial environments**. - -## 2.
TimechoDB Architecture - -The diagram below illustrates a common cluster deployment (3 ConfigNodes, 3 DataNodes) of TimechoDB: - -![](/img/Cluster-Concept03N.png) - -## 3. Key Features - -TimechoDB offers the following advantages: - -**Flexible Deployment:** - -- Supports one-click cloud deployment, on-premise installation, and seamless terminal-cloud synchronization. -- Adapts to hybrid, edge, and cloud-native architectures. - -**Cost-Efficient Storage:** - -- Utilizes high compression ratio storage, eliminating the need for separate real-time and historical databases. -- Supports unified data management across different time horizons. - -**Hierarchical Data Organization:** - -- Mirrors real-world industrial structures through hierarchical measurement point modeling. -- Enables directory-based navigation, search, and retrieval. - -**High-Throughput Read & Write:** - -- Optimized for millions of concurrent device connections. -- Handles multi-frequency and out-of-order data ingestion with high efficiency. - -**Advanced Time-Series Query Semantics:** - -- Features a native time-series computation engine with built-in timestamp alignment. -- Provides nearly 100 aggregation and analytical functions, enabling AI-powered time-series insights. - -**Enterprise-Grade High Availability:** - -- Distributed HA architecture ensures 24/7 real-time database services. -- Automated resource balancing when nodes are added, removed, or overheated. -- Supports heterogeneous clusters with varying hardware configurations. - -**Operational Simplicity:** - -- Standard SQL query syntax for ease of use. -- Multi-language APIs for flexible development. -- Comes with a comprehensive toolset, including an intuitive management console. - -**Robust Ecosystem Integration:** - -- Seamlessly integrates with big data frameworks (Hadoop, Spark) and visualization tools (Grafana, ThingsBoard, DataEase). -- Supports device management for industrial IoT environments. - -## 4.
Enterprise-level Enhancements - -TimechoDB extends Apache IoTDB with advanced industrial-grade capabilities, including tiered storage, cloud-edge collaboration, visualization tools, and security upgrades. - -**Dual-Active Deployment:** - -- Implements active-active high availability, ensuring continuous operations. -- Two independent clusters perform real-time bidirectional synchronization. -- Both systems accept external writes and maintain eventual consistency. - -**Seamless Data Synchronization:** - -- Built-in synchronization module supports real-time and batch data aggregation from field devices to central hubs. -- Supports full, partial, and cascading aggregation. -- Includes enterprise-ready plugins for cross air-gap transmission, encrypted transmission, and compression. - -**Tiered Storage:** - -- Dynamically categorizes data into hot, warm, and cold tiers. -- Efficiently balances SSD, HDD, and cloud storage utilization. -- Automatically optimizes data access speed and storage costs. - -**Enhanced Security:** - -- Implements whitelist-based access control and audit logging. -- Strengthens data governance and risk mitigation. - -**Feature Comparison**:
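| Category | Function | Apache IoTDB | TimechoDB |
| :--------------------- | :--------------------------------------- | :--------------------------------------------- | :------------------------------------------------------------------------------- |
| Deployment Mode | Stand-Alone Deployment | | |
| | Distributed Deployment | | |
| | Dual Active Deployment | - | |
| | Container Deployment | Partial support | |
| Database Functionality | Sensor Management | | |
| | Write Data | | |
| | Query Data | | |
| | Continuous Query | | |
| | Trigger | | |
| | User Defined Function | | |
| | Permission Management | | |
| | Data Synchronisation | Only file synchronization, no built-in plugins | Real-time synchronization + file synchronization, enriched with built-in plugins |
| | Stream Processing | Only framework, no built-in plugins | Framework + rich built-in plugins |
| | Tiered Storage | - | |
| | View | - | |
| | White List | - | |
| | Audit Log | - | |
| Supporting Tools | Workbench | - | |
| | Cluster Management Tool | - | |
| | System Monitor Tool | - | |
| Localization | Localization Compatibility Certification | - | |
| Technical Support | Expert Support | - | |
| | Use Training | - | |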
- -### 4.1 Higher Efficiency and Stability - -TimechoDB achieves up to 10x performance improvements over Apache IoTDB in mission-critical workloads, and provides rapid fault recovery for industrial environments. - -### 4.2 Comprehensive Management Tools - -TimechoDB simplifies deployment, monitoring, and maintenance through an intuitive toolset: - -- **Cluster Monitoring Dashboard:** - - Real-time insights into IoTDB and underlying OS health. - - 100+ performance metrics for in-depth monitoring and optimization. - - - - ![](/img/Introduction01.png) - - - - ![](/img/Introduction02.png) - - - - ![](/img/Introduction03.png) - - -- **Database Console:** - - Simplifies interaction with an intuitive GUI for metadata management, SQL execution, user permissions, and system configuration. -- **Cluster Management Tool:** - - Provides **one-click operations** for cluster deployment, scaling, start/stop, and configuration updates. - - ![](/img/introduction-opskit-en.png) - -### 4.3 Professional Enterprise Technical Services - -TimechoDB offers **vendor-backed enterprise services** to support industrial-scale deployments: - -- **On-Site Installation & Training**: Hands-on guidance for fast adoption. -- **Expert Consulting & Advisory**: Performance tuning and expert support. -- **Emergency Support & Remote Assistance**: Minimized downtime for mission-critical operations. -- **Custom Development & Optimization**: Tailored solutions for unique industrial use cases. - -Compared to Apache IoTDB’s 2-3 month release cycle, TimechoDB delivers faster updates and same-day critical issue resolutions, ensuring production stability. - -### 4.4 Ecosystem Compatibility & Compliance - -TimechoDB is self-developed, supports mainstream CPUs and operating systems, and meets industry compliance standards, making it a reliable choice for enterprise IoT deployments.
\ No newline at end of file diff --git a/src/UserGuide/latest-Table/IoTDB-Introduction/Release-history_timecho.md b/src/UserGuide/latest-Table/IoTDB-Introduction/Release-history_timecho.md deleted file mode 100644 index 77764aaef..000000000 --- a/src/UserGuide/latest-Table/IoTDB-Introduction/Release-history_timecho.md +++ /dev/null @@ -1,703 +0,0 @@ - -# Release History - -## 1. TimechoDB (Database Core) - - -### V2.0.9.4 -> Release Date: 2026.06.10
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.4-bin.zip
-> SHA512 Checksum: 040ebdd9e45d93535e9628cf377003d560be83cec9737f5a5fbd0c3a93a12810814094752eac3eacdfec5cddcf433fa83e76edc14be34c73c1a54d9b937ea1b5 - -Version 2.0.9.4 primarily optimizes table model AINode inference, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- AINode: Table model covariate inference models adaptively support filling null values - - -### V2.0.9.3 -> Release Date: 2026.05.14
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.3-bin.zip
-> SHA512 Checksum: f6c5d50cbf8902503289884f073593c650ffdc8edbebfabf27f6ab4499630749331aa4ed09dd34627a39fa8dee27b4d7e2689d0ed1cf23c76dd9c7270f9fae2a - -Version 2.0.9.3 of AINode newly supports registering multiple models by using the same model code with different model weights. It also includes enhancements and bug fixes for previous versions, with comprehensive improvements to database monitoring, performance and stability. Details are as follows: - -- AINode: [Supports registering custom models with the same model code and different model weights](../AI-capability/AINode_Upgrade_timecho.md#_4-3-register-custom-models) - - -### V2.0.9.2 -> Release Date: 2026.05.11
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.2-bin.zip
-> SHA512 Checksum: 10d3f34b6e65ad5c09b1cf3538ee27e181cc38c5fedf6acfd7d7053797ca23c76245683536275b69bd478aa1e43364351eceef1948832ab663a7398665af9eff - -Version 2.0.9.2 adds import and export capabilities for the Object data type, and introduces the new `tsfile-backup` script (currently supported only for table model scenarios). It also brings optimizations and bug fixes for legacy versions, with overall upgrades to database monitoring, performance and stability. Details are as follows: - -- Scripts & Tools: [The `import-data` script for TsFile format](../Tools-System/Data-Import-Tool_timecho.md#_2-4-tsfile-format) supports Object type data import for table models -- Scripts & Tools: New[ `tsfile-backup` script ](../Tools-System/Data-Export-Tool_timecho.md#_3-tsfilebackup-based-on-pipe-framework)added for table models -- Stream Processing Module: PIPE for table models supports [local export and remote transmission of Object type data](../User-Manual/Data-Sync_timecho.md#_3-9-object-type-data-export) -- System Module: [Audit logs](../User-Manual/Audit-Log_timecho.md) support slow request quantity statistics - -### V2.0.9.1 -> Release Date: 2026.05.11
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.1-bin.zip
-> SHA512 Checksum: 18ff3801ba58550e06ef0aa4bf4465e8ce1b31d1aecb9c6899eb843f5d9187d3cc575e930ee38d96b87b17067e2b21f1852ab5127eac7480cf5051c20a68894b - -Version 2.0.9.1 endows AINode with covariate classification inference capability, supports schema-level and table-level storage space statistics. It adds set operations, CTE and multiple built-in functions for data query, enables SQL debugging via DEBUG statements, and supports configuring auto-start on boot. This version also contains legacy version improvements, bug fixes, and comprehensive enhancements to database monitoring, performance and stability. Details are as follows: - -- AINode: Table models support [time series data classification inference](../AI-capability/AINode_Upgrade_timecho.md#_4-1-model-inference) -- Query Module: Table models support [set operations (UNION/INTERSECT/EXCEPT)](../SQL-Manual/Set-Operations_timecho.md) and [Common Table Expressions (CTE)](../SQL-Manual/Common-Table-Expression_timecho.md) -- Query Module: Newly added [IF scalar function](../SQL-Manual/Basis-Function_timecho.md#_8-3-if-expression), [binary functions](../SQL-Manual/Basis-Function_timecho.md#_7-binary-functions) and [APPROX_PERCENTILE aggregate function](../SQL-Manual/Basis-Function_timecho.md#_2-aggregate-functions) for table models -- Query Module: Supports [DEBUG SQL](../User-Manual/Maintenance-commands_timecho.md#_6-query-debugging) for query debugging and optimizes the result set of [Explain Analyze](../User-Manual/Query-Performance-Analysis.md) -- Query Module: Supports [schema-level](../../latest/User-Manual/Maintenance-commands_timecho.md#_1-10-view-disk-space-usage) and [table-level](../Reference/System-Tables_timecho.md#_2-22-table-disk-usage) storage space occupancy statistics; the[ `SHOW CONFIGURATION` statement](../User-Manual/Maintenance-commands_timecho.md#_1-13-view-node-configuration) is available to view cluster configuration information -- Scripts & Tools: Data and metadata import/export tools 
support the SSL protocol -- Scripts & Tools: Command-line tool adds access [history display](../Tools-System/CLI_timecho.md#_4-access-history-feature) capability -- System Module: Supports [system auto-start](../User-Manual/Auto-Start-On-Boot_timecho.md) configuration -- Others: Fixed security vulnerability CVE-2026-28564 - - -### V2.0.8.3 -> Release Date: 2026.04.21
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.8.3-bin.zip
-> SHA512 Checksum: 4b95bea87cc375bc455897dcf4cec80692421fa5c3eee746e1095b94288611d4afdd94aa8dad70340757d041757758924701cbdb2b73b49fb8730c4caac2a126 - -Version 2.0.8.3 enables reading and writing Object type data via Python. It also includes optimizations and bug fixes for previous versions, with comprehensive upgrades to database monitoring, performance and stability. Details are as follows: - -- Interface Module: [Python Native API](../API/Programming-Python-Native-API_timecho.md) supports reading and writing Object type data for table models - - -### V2.0.8.2 - -> Release Date: 2026.03.31
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.8.2-bin.zip
-> SHA512 Checksum:02ab10e3e94786dd5676e0a69609eef192afd90d87f4d8d7bd44e7e9cbc8a18d61ba5668bae56cb8e4416ac71a877f760963b72ca7838d7c39ae10f1ed321d89 - -Version 2.0.8.2 adds support for modifying the full path of time series in the tree model, customizing the Time column name in the table model, changing data types in both tree and table models, and includes the ODBC Driver, among other features. It also introduces improvements and bug fixes for earlier versions, with comprehensive enhancements to database monitoring, performance, and stability. The detailed release notes are as follows: - -- Storage Module: The tree model supports [modifying the full name of time series](../../latest/Basic-Concept/Operate-Metadata_timecho.md#_2-4-修改时间序列名称) and [changing the data type of time series](../../latest/Basic-Concept/Operate-Metadata_timecho.md#_2-3-修改时间序列数据类型). -- Storage Module: The table model supports [modifying column data types](../Basic-Concept/Table-Management_timecho.md#_1-5-修改表) and [customizing the Time column name](../Basic-Concept/Table-Management_timecho.md#_1-1-创建表). -- Interface Module: Adds support for the [ODBC Driver](../API/Programming-ODBC_timecho.md); the Python SessionDataset supports fetching DataFrames in batches; the MQTT service is externalized, and a new system table named Services is added for service queries. -- AI Node: The table model supports adaptive [covariate inference](../AI-capability/AINode_Upgrade_timecho.md#_4-1-模型推理). -- Stream Processing Module: The tree model data synchronization PIPE statement supports specifying multiple precise paths. - - -### V2.0.8.1 - -> Release Date: 2026.02.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.8.1-bin.zip
-> SHA512 Checksum: 49d97cbf488443f8e8e73cc39f6f320b3bc84b194aed90af695ebd5771650b5e5b6a3abb0fb68059bd01827260485b903c035657b337442f4fdd32c877f2aca3 - -V2.0.8.1 introduces the **Object data type** to table models, significantly enhances audit logging capabilities, optimizes the tree model’s **OPC UA protocol**, adds **covariate-based forecasting** support in AINode, and enables **concurrent inference** in AINode. Additionally, comprehensive improvements have been made to database monitoring, performance, and stability. The detailed release notes are as follows: - -- **Query Module**: Added a list view of available DataNode instances, allowing users to [view each node's RPC address and port](../User-Manual/Maintenance-commands_timecho.md#_1-7-viewing-available-nodes). -- **Query Module**: Introduced a new system table for [statistical query latency analysis](../Reference/System-Tables_timecho.md#_2-20-queries-costs-histogram). -- **Storage Module**: Added SQL support to retrieve the full definition statements for [tables](../Basic-Concept/Table-Management_timecho.md#_1-4-view-table-creation-statement) and [views](../User-Manual/Tree-to-Table_timecho.md#_2-4-viewing-table-views). -- **Storage Module**: Optimized the tree model’s [OPC UA protocol](../../latest/API/Programming-OPC-UA_timecho.md). -- **System Module**: Added support for the [Object data type](../Background-knowledge/Data-Type_timecho.md) in table models. -- **System Module**: Significantly enhanced and upgraded the [audit log](../User-Manual/Audit-Log_timecho.md) functionality. -- **System Module**: Added a new system table to monitor [DataNode connection status](../Reference/System-Tables_timecho.md#_2-18-connections). -- **AINode**: Integrated the built-in **Chronos-2** model, supporting [covariate-based forecasting](../AI-capability/AINode_Upgrade_timecho.md). -- **AINode**: Built-in models **Timer-XL** and **Sundial** now support [concurrent inference](../AI-capability/AINode_Upgrade_timecho.md). 
-- **Stream Processing Module**: When creating a full-data synchronization pipe, it will be [automatically split](../User-Manual/Data-Sync_timecho.md#_2-1-create-a-task) into two independent pipes—one for real-time data and one for historical data—whose remaining event counts can be monitored separately via the `SHOW PIPES` statement. -- **Others**: Fixed security vulnerabilities **CVE-2025-12183**, **CVE-2025-66566**, and **CVE-2025-11226**. - -### V2.0.6.6 - -> Release Date: 2026.01.20
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.6.6-bin.zip
-> SHA512 Checksum: d12e60b8119690d63c501d0c2afcd527e39df8a8786198e35b53338e21939e1a9244805e710d81cbb62d02c2739909d7e8227c029660a0cd9ea7ca718cf9bdf6 - -V2.0.6.6 primarily optimizes query performance for time series in the tree model, while delivering comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Query Module**: Improved query performance for `SHOW/COUNT TIMESERIES/DEVICES` statements. -* **Others**: Fixed security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226. - - -### V2.0.6.4 - -> Release Date: 2025.11.17
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.6.4-bin.zip
-> SHA512 Checksum: 57b9998cc14632862c32b6781c70db1c52caf8172b5d45d27cc214cab50d3afd4230ed0754e1c1a4ed825666bf971dc81fbb7d3b93261e57e9dabc20e794a2b8 - -V2.0.6.4 focuses on enhancements to the storage and AINode modules, resolves several product defects, and provides comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Storage Module**: Added support for modifying the encoding and compression methods of time series in the tree model. -* **AINode**: Introduced one-click deployment and optimized model inference capabilities. - -### V2.0.6.1 - -> Release Date: 2025.09.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.6.1-bin.zip
-> SHA512 Checksum: c88e3e2c0dbd06578bd0697ca9992880b300baee2c4906ba1f952134e37ae2fa803a6af236f4541d318b75f43a498b5d5bfbbc7c445783271076c36e696e4dd0 - -V2.0.6.1 introduces query write-back for the table model, access control blacklists/whitelists, bitwise operation functions (built-in scalar functions), and push-downable time functions. Comprehensive enhancements to database monitoring, performance, and stability are also included. Key updates: - -* **Query Module:** - * Supports query write-back in the table model - * Row pattern recognition in the table model supports aggregate functions for capturing continuous data for analytical calculation - * Adds bitwise operation functions as built-in scalar functions in the table model - * Adds push-downable EXTRACT time functions in the table model -* **System Module:** - * Adds access control, allowing users to configure custom blacklists and whitelists -* **Others:** - * The default user password is changed to "TimechoDB@2021", which has higher security strength - -### V2.0.5.2 - -> Release Date: 2025.08.08
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.5.2-bin.zip
-> SHA512 Checksum: a00a4075c9937b7749c454f71d2480fea5e9ff9659c0628b132e30e2f256c7c537cd91dca4f6be924db0274bb180946a1b88e460c025bf82fdb994a3c2c7b91e - -V2.0.5.2 addresses certain product defects and optimizes the data synchronization function. Comprehensive enhancements to database monitoring, performance, and stability are also included. - - -### V2.0.5.1 - -> Release Date: 2025.07.14
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.5.1-bin.zip
-> SHA512 Checksum: aa724755b659bf89a60da6f2123dfa91fe469d2e330ed9bd029e8f36dd49212f3d83b1025e9da26cb69315e02f65c7e9a93922e40df4f2aa4c7f8da8da2a4cea - -V2.0.5.1 introduces **tree-to-table view**, **window functions** and the **approx\_most\_frequent** aggregate function for the table model, along with support for **LEFT & RIGHT JOIN** and **ASOF LEFT JOIN**. AINode adds two built-in models: **Timer-XL** and **Timer-Sundial**, supporting inference and fine-tuning for tree and table models. Comprehensive enhancements to database monitoring, performance, and stability are also included. Key updates: - -* **Query Module:** - * Supports manually creating tree-to-table views - * Adds window functions for table model - * Adds approx\_most\_frequent aggregate function - * Extends JOIN support: LEFT/RIGHT JOIN, ASOF LEFT JOIN - * Enables row pattern recognition (captures continuous data for analysis) - * New system tables: VIEWS (view metadata), MODELS (model info), etc. -* **System Module:** - * Adds TsFile data encryption -* **AI Module:** - * New built-in models: Timer-XL and Timer-Sundial - * Supports inference/fine-tuning for tree and table models -* **Others:** - * Enables data publishing via OPC DA protocol - -### 2.x Other historical versions - -#### V2.0.4.2 - -> Release Date: 2025.06.21
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.4.2-bin.zip
-> SHA512 Checksum: 31f26473ac90988ce970dac8d0950671bde918f9af6f2f6a6c2bf99a53aa1c0a459c53a137b18ff0b28e70952e9c4b6acb50029e0b2e38837b969eb8f78f2939 - -V2.0.4.2 adds support for passing TOPIC to custom MQTT plugins. It also includes comprehensive improvements to monitoring, performance, and stability. - -#### V2.0.4.1 - -> Release Date: 2025.06.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.4.1-bin.zip
-> SHA512 Checksum: 93ac08bfae06aff6db04849f474458433026f66778f4f5c402eb22f1a7cb14d8096daf0a9e9cc365ddfefd4f8ca4443b2a9fb6461906f056b1e6a344990beb3a - -V2.0.4.1 introduces **User-Defined Table Functions (UDTF)** and multiple built-in table functions for the table model, adds the **approx\_count\_distinct** aggregate function, and enables **ASOF INNER JOIN on timestamp columns**. Script tools are categorized, with Windows-specific scripts separated out. Key updates: - -* **Query Module:** - * Adds UDTFs and built-in table functions - * Supports ASOF INNER JOIN on timestamps - * Adds approx\_count\_distinct aggregate function -* **Stream Processing:** - * Supports asynchronous TsFile loading via SQL -* **System Module:** - * Disaster-aware load balancing strategy for replica selection during downsizing - * Compatibility with Windows Server 2025 -* **Scripts & Tools:** - * Categorized scripts; isolated Windows-specific tools - -#### V2.0.3.4 - -> Release Date: 2025.06.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.3.4-bin.zip
-> SHA512 Checksum: d80d34b7d3890def75b17c491fc4c13efc36153a5950a9b23744755d04d6adb5d6ab9ec970101183fef7bfeb8a559ef92fce90d2d22f7b7fd5795cd5589461bb - -V2.0.3.4 upgrades the user password encryption algorithm to **SHA-256**. It also includes comprehensive monitoring, performance, and stability improvements. - -#### V2.0.3.3 - -> Release Date: 2025.05.16
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.3.3-bin.zip
-> SHA512 Checksum: f47e3fb45f869dbe690e7cfaa93f95e5e08a462b362aa9d7ccac7ee5b55022dc8f62db12009dfde055f278f3003ff9ea7c22849d52a3ef2c25822f01ade78591 - -V2.0.3.3 introduces **metadata import/export scripts for table models** and **Spark ecosystem integration**, and adds **timestamps to AINode results**. New aggregate and scalar functions are also added. Key updates: - -* **Query Module:** - * New aggregate function: count\_if; scalar functions: greatest/least - * Significant optimization for full-table count(\*) queries -* **AI Module:** - * Timestamps added to AINode results -* **System Module:** - * Optimized metadata performance for table model - * Active monitoring & loading of TsFiles - * New metrics: TsFile parsing time, Tablet conversion count -* **Ecosystem Integration:** - * Spark integration for table model -* **Scripts & Tools:** - * import-schema/export-schema scripts support table model metadata - -#### V2.0.3.2 - -> Release Date: 2025.05.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.3.2-bin.zip
-> SHA512 Checksum: 76bd294de4b01782e5dd621a996aeb448e4581f98c70fb5b72b17dc392c2e1227c0d26bd3df5533669a80f217a83a566bc6ec926b7efd21ce7a89b894cd33e19 - -V2.0.3.2 resolves product defects, optimizes node removal, and enhances monitoring, performance, and stability. - -#### V2.0.2.1 - -> Release Date: 2025.04.07
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.2.1-bin.zip
-> SHA512 Checksum: a41be3f8c57e6a39ac165f1d6ab92c9ed790b0712528f31662c58617f4c94e6bfc9392a9c1ef2fc5bdd8c7ca79901389f368cbdbec3e5b1d5c1ce155b2f1a457 - -V2.0.2.1 adds **table model permission management**, **user management**, and **operation authentication**, alongside UDFs, system tables, and nested queries. Data subscription mechanisms are optimized. Key updates: - -* **Query Module:** - * Added UDF management: User-Defined Scalar Functions (UDSF) & Aggregate Functions (UDAF) - * Configurable URI-based loading for UDF/PipePlugin/Trigger/AINode JARs - * Permission/user management with operation authentication - * New system tables and maintenance statements -* **System Module:** - * CSharp client supports table model - * New C++ Session write APIs for table model - * Multi-tier storage supports S3-compliant non-AWS object storage - * New pattern\_match function -* **Data Sync:** - * Table model metadata sync and delete propagation - -#### V2.0.1.2 - -> Release Date: 2025.01.25
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.1.2-bin.zip
-> SHA512 Checksum: 51c2fa5da2974a8a3c8871dec1c49bd98e5d193a13ef33ac7801adb833a1e360d74f0160bcdf33c7ffb23a5c5e0f376e26a4315cf877f1459483356285b85349 - -V2.0.1.2 officially implements **dual-model configuration (tree + table)**. The table model supports **standard SQL queries**, diverse functions/operators, stream processing, and Benchmarking. Python client adds four new data types, and script tools support TsFile/CSV/SQL import/export. Key updates: - -* **Time-Series Table Model:** - * Standard SQL: SELECT, WHERE, JOIN, GROUP BY, ORDER BY, LIMIT, nested queries -* **Query Module:** - * Logical operators, math functions, time-series functions (e.g., DIFF) - * Configurable URI-based JAR loading -* **Storage Module:** - * Session API writes with auto-metadata creation - * Python client supports: String, Blob, Date, Timestamp - * Optimized compaction task priority -* **Stream Processing:** - * Auth info specification on sender side - * TsFile Load for table model - * Plugin adaptation for table model -* **System Module:** - * Enhanced DataNode downsizing stability - * Supports DROP DATABASE in read-only mode -* **Scripts & Tools:** - * Benchmark adapted for table model - * Support for String/Blob/Date/Timestamp in Benchmark - * import-data/export-data: Universal support for TsFile/CSV/SQL -* **Ecosystem Integration:** - * Kubernetes Operator support - - -### V1.3.7.3 - -> Release Date: 2026.06.02
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.7.3-bin.zip
-> SHA512 Checksum: 8e6cde061421a552b9855f39f9cccd4838c820dc15ef0ad2a7c23a54cd6cc4f06c35190c1f428784e6a4d5463dd1b794f58ff5cdf891f27f6d0be4d3ab00bf6f - -V1.3.7.3 primarily optimizes query module and data synchronization capabilities, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- Query Module: Optimized `Last` queries, aligned series queries, reverse-order time filter queries, and other scenarios. -- Metadata Module: Optimized device creation validation for activated series and their child paths. -- Data Synchronization: Optimized the retry mechanism after synchronization failures. -- Data Synchronization: Cross-network-gateway synchronization plugin supports configuring the real-time write transmission timeout. -- Interface Module: Added error code validation to the Go client write interface. -- Interface Module: Optimized C# client connection pool management. - - -### V1.3.7.2 - -> Release Date: 2026.04.07
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.7.2-bin.zip
-> SHA512 Checksum: 787766af64992069f0db0ac8b250b461d799307b3ce06b0782fc25752c8c5307fa2205c9e3a38a41685b81bb6b4b5c1ec9f71a395bfad285caf90de7b8224783 - -V1.3.7.2 primarily optimizes data synchronization and query module capabilities, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- Data Synchronization: Optimized distribution performance for Pipe complex path matching scenarios. -- Query Module: The `SHOW QUERIES` statement now includes client IP, query timeout, server wait time, and other information. -- Ecosystem Integration: Supports IoTDB pushing data to an external OPC Server in OPC Client mode. - - -### V1.3.6.6 - -> Release Date: 2026.01.20
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.6-bin.zip
-> SHA512 Checksum: 590d3ead053298c6df0ede637572ba598b9b684f8b35ab874bd4452f765e1421938f4cca2cf0423af2e806592aa8b15bdd25b41df7de809435a4d0239fc04790 - -V1.3.6.6 enhances data read/write capabilities, resolves several product defects, and delivers comprehensive improvements in database monitoring, performance, and stability. - - -### V1.3.6.3 - -> Release Date: 2026.01.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.3-bin.zip
-> SHA512 Checksum: 43719a1384f59f63cb0029cdda0aba433383cd1a0f5ebc142e54f8aa6623cc30a7efb3e3aef7f3d485d5e07bec91be215c92ed21b5201613d5cc44044251c978 - -V1.3.6.3 focuses on deep optimizations in two core areas—query performance and memory management—while comprehensively enhancing database monitoring, performance, and stability. Specific release contents are as follows: - -* **Query Module**: Optimized query performance across multiple scenarios, including multi-series `Last` queries. -* **Query Module**: Added a new `FastLastQuery` interface in the Java SDK for more efficient `Last` query operations. -* **Query Module**: Modified the tree model’s `fetchSchema` to return results in segmented streaming mode, improving response speed under large-data-volume conditions. -* **Storage Module**: Enhanced memory management to mitigate memory leak risks and ensure long-term system stability. -* **Storage Module**: Optimized the file compaction mechanism to improve compaction efficiency and reduce storage resource consumption. -* **Others**: Fixed security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226. - -### V1.3.6.1 - -> Release Date: 2025.12.09
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.1-bin.zip
-> SHA512 Checksum: 9fb6a6870aa2133bfc40508324a7d97ee078d0d44895beef7b0a331edd203419119fb02b933f585b6c4a6fe9b59708a053d7cf65206b22b1a4f01a5fe518424c - -V1.3.6.1 focuses on deep optimization of data synchronization stability, while delivering comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Data Synchronization**: Enhanced Pipe SQL parameter configuration to support specifying asynchronous loading methods. -* **Data Synchronization**: Introduced syntactic sugar that automatically splits full-data Pipe creation SQL into real-time and historical synchronization components. -* **System Module**: Added a global configuration option for data-type-specific compression strategies, enabling on-demand adjustment of storage compression policies. - - -### V1.3.5.11 - -> Release Date: 2025.09.24
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.11-bin.zip
-> SHA512 Checksum: f18419e20c0d7e9316febee5a053306a97268cb07e18e6933716c2ef98520fbbe051dfa1da02a9c83e8481a839ce35525ce6c50f890f821e3d760f550c75f804 - -V1.3.5.11 primarily optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -### V1.3.5.10 - -> Release Date: 2025.08.27
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.10-bin.zip
-> SHA512 Checksum: 3aea6d2318f52b39bfb86dae9ff06fe1b719fdeceaabb39278c9a73544e1ceaf0660339f9342abb888c8281a0fb6144179dac9bb0c40ba0ecc66bac4dd7cbe80 - -V1.3.5.10 fixes certain product defects and includes comprehensive enhancements to database monitoring, performance, and stability. - -### V1.3.5.9 - -> Release Date: 2025.08.25
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.9-bin.zip
-> SHA512 Checksum: 95b7a6790e94dc88e355a81e5a54b10ee87bdadae69ba0b215273967b3422178d5ee81fa5adf1c5380a67dbb30cf9782eaa3cbfd6ec744b0fd9a91c983ee8f70 - -V1.3.5.9 optimizes memory control, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -### 1.x Other historical versions - -#### V1.3.5.8 - -> Release Date: 2025.08.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.8-bin.zip
-> SHA512 Checksum: aa9802301614e20294a7f2fc4c149ba20d58213d9b74e8f8c607e0f4860949bad164bce2851b63c1d39b7568d62975ab257c269b3a9c168a29ea3945b6d28982 - -V1.3.5.8 optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -#### V1.3.5.7 - -> Release Date: 2025.08.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.7-bin.zip
-> SHA512 Checksum: 17374a440267aed3507dcc8cf4dc8703f8136d5af30d16206a6e1101e378cbbc50eda340b1598a12df35fe87d96db20f7802f0e64033a013d4b81499198663d4 - -V1.3.5.7 optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -#### V1.3.5.6 - -> Release Date: 2025.07.16
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.6-bin.zip
-> SHA512 Checksum: 05b9fda4d98ba8a1c9313c0831362ed3d667ce07cb00acaeabcf6441a6d67dff7da27f3fda2a5e1b3c3b85d1e5c730a534f3aa2f0c731b8c03ef447203b32493 - -V1.3.5.6 introduces a new configuration switch to disable the data subscription feature. It optimizes the C++ high-availability client and addresses PIPE synchronization latency issues in normal operation, restart, and deletion scenarios, along with query performance for large TEXT objects. Comprehensive enhancements to database monitoring, performance, and stability are also included. - -#### V1.3.5.4 - -> Release Date: 2025.06.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.4-bin.zip
-> SHA512 Checksum: edac5f8b70dd67b3f84d3e693dc025a10b41565143afa15fc0c4937f8207479ffe2da787cc9384440262b1b05748c23411373c08606c6e354ea3dcdba0371778 - -V1.3.5.4 fixes several product defects and optimizes the node removal functionality. It also delivers comprehensive improvements to database monitoring, performance, and stability. - -#### V1.3.5.3 - -> Release Date: 2025.06.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.3-bin.zip
-> SHA512 Checksum: 5f807322ceec9e63a6be86108cc57e7ad4251b99a6c28baf11256ab65b2145768e9110409f89834d5f4256094a8ad995775c0e59a17224ff2627cd9354e09d82 - -V1.3.5.3 focuses on optimizing data synchronization capabilities, including persisting PIPE transmission progress and adding monitoring metrics for PIPE event transfer time. Related defects have been resolved. Additionally, the encryption algorithm for user passwords has been upgraded to SHA-256. Comprehensive enhancements to database monitoring, performance, and stability are included. - -#### V1.3.5.2 - -> Release Date: 2025.06.10
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.2-bin.zip
-> SHA512 Checksum: 4c0a5db76c6045dfd27cce303546155cdb402318024dae5f999f596000d7b038b13bbeac39068331b5c6e2c80bc1d89cd346dd0be566fe2fe865007d441d9d05 - -V1.3.5.2 primarily optimizes data synchronization features, adding support for cascading configurations via parameters and ensuring fully consistent ordering between synchronized and real-time writes. It also enables partitioned sending of historical and real-time data after system restarts. Comprehensive enhancements to database monitoring, performance, and stability are included. - -#### V1.3.5.1 - -> Release Date: 2025.05.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.1-bin.zip
-> SHA512 Checksum: 91f22bafbdd4d580126ed59ba1ba99d14209f10ce4a0a4bd7d731943ac99fdb6ebfab6e3a1e294a7cb7f46367e9fd4252b0d9ac4d4240ddedf6d85658e48f212 - -V1.3.5.1 resolves several product defects and delivers comprehensive improvements to database monitoring, performance, and stability. - -#### V1.3.4.2 - -> Release Date: 2025.04.14
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.4.2-bin.zip
-> SHA512 Checksum: 52fbd79f5e7256e7d04edc8f640bb8d918e837fedd1e64642beb2b2b25e3525b5f5a4c92235f88f6f7b59bfcdf096e4ea52ab85bfef0b69274334470017a2c5b - -V1.3.4.2 enhances the data synchronization function by supporting bi-directional active-active synchronization of data forwarded through external PIPE sources. - -#### V1.3.4.1 - -> Release Date: 2025.01.08
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.4.1-bin.zip
-> SHA512 Checksum: e9d46516f1f25732a93cc915041a8e59bca77cf8a1018c89d18ed29598540c9f2bdf1ffae9029c87425cecd9ecb5ebebea0334c7e23af11e28d78621d4a78148 - -V1.3.4.1 introduces pattern matching functions, continuously optimizes the data subscription mechanism, improves stability, and extends import-data/export-data scripts to support new data types while unifying TsFile, CSV and SQL import/export formats. Comprehensive improvements have been made to database monitoring, performance and stability. Key updates: - -* Query Module: Configurable URI-based JAR loading for UDFs, PipePlugins, Triggers and AINodes -* System Module: Extended UDF functionality with new pattern\_match function -* Data Sync: Supports specifying authentication info at sender -* Ecosystem: Kubernetes Operator support -* Scripts: import-data/export-data now supports strings, BLOBs, dates and timestamps -* Scripts: Unified import/export support for TsFile, CSV and SQL formats - -#### V1.3.3.3 - -> Release Date: 2024.10.31
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.3-bin.zip
-> SHA512 Checksum: 4a3eceda479db3980e9c8058628e71ba5a16fbfccf70894e8181aea5e014c7b89988d0093f6d42df29d478340a33878602a3924bec13f442a48611cec4e0e961 - -V1.3.3.3 improves restart recovery performance, enables DataNodes to actively monitor/load TsFiles with observability metrics, supports automatic loading at receivers when senders transfer files to specified directories, and adds Alter Source capability for Pipes. Comprehensive improvements to monitoring, performance and stability include: - -* Data Sync: Automatic type conversion for inconsistent data at receivers -* Data Sync: Enhanced observability with ops/latency metrics for internal APIs -* Data Sync: OPC-UA sink plugin supports CS mode and non-anonymous access -* Subscription: SDK supports create\_if\_not\_exists and drop\_if\_exists APIs -* Stream Processing: Alter Pipe supports Alter Source -* System: Added latency monitoring for REST module -* Scripts: Auto-loading TsFiles from specified directories -* Scripts: import-tsfile supports remote server execution -* Scripts: Kubernetes Helm support -* Scripts: Python client supports new data types (string, BLOB, date, timestamp) - -#### V1.3.3.2 - -> Release Date: 2024.08.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.2-bin.zip
-> SHA512 Checksum: 32733610da40aa965e5e9263a869d6e315c5673feaefad43b61749afcf534926398209d9ca7fff866c09deb92c09d950c583cea84be5a6aa2c315e1c7e8cfb74 - -V1.3.3.2 adds metrics for mods file reading time, merge sort memory usage and dispatch latency, supports configurable time partition origin adjustment, enables automatic subscription termination based on pipe completion markers, and improves merge memory control. Key updates: - -* Query: Explain Analyze shows mods file read time -* Query: Explain Analyze shows merge sort memory and dispatch latency -* Storage: Added configurable file splitting during compaction -* System: Configurable time partition origin -* Stream Processing: Auto-terminate subscriptions on pipe completion markers -* Data Sync: Configurable RPC compression levels -* Scripts: Export filters only root.\_\_system paths - -#### V1.3.3.1 - -> Release Date: 2024.07.12
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.1-bin.zip
-> SHA512 Checksum: 1fdffbc1f18bfabfa3463a5a6fbc4f6ba6ab686942f9e85e7e6be1840fb8700e0147e5e73fd52201656ae6adb572cc2e5ecc61bcad6fa4c5a4048c4207e3c6c0 - -V1.3.3.1 adds tiered storage throttling, supports username/password auth specification at sync senders, optimizes ambiguous WARN logs at receivers, improves restart performance, and merges configuration files. Key updates: - -* Query: Optimized Filter performance for faster aggregation/WHERE queries -* Query: Java Session evenly distributes SQL requests across nodes -* System: Merged config files into iotdb-system.properties -* Storage: Added tiered storage throttling -* Data Sync: Username/password auth specification at senders -* System: Optimized restart recovery time - -#### V1.3.2.2 - -> Release Date: 2024.06.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.2.2-bin.zip
-> SHA512 Checksum: ad73212a0b5025d18d2481163f6b2d4f604e06eb5e391cc6cba7bf4e42792e115b527ed8bfb5cd95d20a150645c8b4d56a531889dac229ce0f63139a27267322 - -V1.3.2.2 introduces EXPLAIN ANALYZE for SQL profiling, UDAF framework, automatic data deletion at disk thresholds, metadata sync, path-specific data point counting, and SQL import/export scripts. Supports rolling cluster upgrades and cluster-wide plugin distribution with comprehensive monitoring/performance improvements. Key updates: - -* Storage: Improved insertRecords performance -* Storage: SpaceTL feature for auto-deletion at disk thresholds -* Query: EXPLAIN ANALYZE for SQL stage-level profiling -* Query: New UDAF framework -* Query: New envelope demodulation analysis in UDFs -* Query: MaxBy/MinBy functions returning timestamps with values -* Query: Faster value-filtered queries -* Data Sync: Wildcard path matching -* Data Sync: Metadata synchronization (including attributes/permissions) -* Stream Processing: ALTER PIPE for hot plugin updates -* System: TsFile load statistics in data point counting -* Scripts: Local upgrade/backup via hard links -* Scripts: New export-data/import-data for CSV/TsFile/SQL formats -* Scripts: Windows window title differentiation for ConfigNode/DataNode/Cli - -#### V1.3.1.4 - -> Release Date: 2024.04.23
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.1.4-bin.zip
-> SHA512 Checksum: 8547702061d52e2707c750a624730eb2d9b605b60661efa3c8f11611ca1685aeb51b6f8a93f94c1b30bf2e8764139489c9fbb76cf598cfa8bf9c874b2a7c57eb - -V1.3.1.4 adds cluster activation status viewing, variance/stddev aggregation functions, FILL timeout settings, TsFile repair command, one-click info collection scripts, and cluster control scripts while optimizing views and stream processing. Key updates: - -* Query: FILL clause timeout threshold -* Query: REST V2 returns column types -* Data Sync: Simplified time range specification -* Data Sync: SSL support (iotdb-thrift-ssl-sink) -* System: SQL query for cluster activation status -* System: Tiered storage transfer rate control -* System: Enhanced observability (node divergence, task scheduling) -* System: Optimized default logging -* Scripts: One-click cluster control scripts (start-all/stop-all) -* Scripts: One-click info collection scripts (collect-info) - -#### V1.3.0.4 - -> Release Date: 2024.01.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.0.4-bin.zip
-> SHA512 Checksum: 3c07798f37c07e776e5cd24f758e8aaa563a2aae0fb820dad5ebf565ad8a76c765b896d44e7fdb7dad2e46ffd4262af901c765f9bf6af926bc62103118e38951 - -V1.3.0.4 introduces the AINode machine learning framework, upgrades permission granularity to time-series level, and optimizes views/stream processing for better usability and stability. Key updates: - -* Query: New AINode ML framework -* Query: Fixed slow SHOW PATH responses -* Security: Time-series granular permissions -* Security: SSL client-server encryption -* Stream Processing: New metrics monitoring -* Query: LAST queries on non-writable views -* System: Improved data point counting accuracy - -#### V1.2.0.1 - -> Release Date: 2023.06.30
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.2.0.1-bin.zip
-> SHA512 Checksum: dcf910d0c047d148a6c52fa9ee03a4d6bc3ff2a102dc31c0864695a25268ae933a274b093e5f3121689063544d7c6b3b635e5e87ae6408072e8705b3c4e20bf0 - -V1.2.0.1 introduces stream processing framework, dynamic templates, substring/replace/round functions, enhances SHOW REGION/TIMESERIES/VARIABLE statements and Session APIs while optimizing monitoring metrics. Key updates: - -* Stream Processing: New framework -* Metadata: Dynamic template expansion -* Storage: New SPRINTZ/RLBE encoding and LZMA2 compression -* Query: New CAST, ROUND, SUBSTR, REPLACE functions -* Query: New TIME\_DURATION, MODE aggregation -* Query: CASE WHEN syntax support -* Query: ORDER BY expression support -* Interface: Python API multi-node connection -* Interface: Python client write redirection -* Interface: Batch sequence creation via templates - -#### V1.1.0.1 - -> Release Date: 2023.04.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.1.0.1.zip
-> SHA512 Checksum: 58df58fc8b11afeec8436678842210ec092ac32f6308656d5356b7819acc199f1aec4b531635976b091b61d6736f0d9706badcabeaa5de50939e5c331c1dc804 - -V1.1.0.1 introduces GROUP BY VARIATION/CONDITION, DIFF/COUNT\_IF functions, and pipeline execution engine while fixing issues including: - -* Aligned sequence LAST queries with ORDER BY TIMESERIES -* LIMIT & OFFSET failures -* Post-restart metadata template errors -* Sequence creation after database deletion - -Key updates: - -* Query: ALIGN BY DEVICE supports ORDER BY TIME -* Query: SHOW QUERIES/KILL QUERY commands -* System: SHOW REGIONS per database -* System: SHOW VARIABLES for cluster parameters -* Query: GROUP BY VARIATION/CONDITION -* Query: SELECT INTO type casting -* Query: New DIFF (scalar), COUNT\_IF (aggregate) -* System: SHOW REGIONS creation time -* System: Configurable dn\_rpc\_port/address - -## 2. Workbench (Console Tool) - - -| **Version** | **Description** | **Supported IoTDB Versions** | **SHA512 checksum** | -| ----------- | ------------------------------------------------------------ | ----------------------------------- | ------------------------------------------------------------ | -| V2.1.1 | Optimize the measuring point selection on the trend interface to support scenarios without devices | V2.0 and above | aa05fd4d9f33f07c0949bc2d6546bb4b9791ed5ea94bcef27e2bf51ea141ec0206f1c12466aced7bf3449e11ad68d65378d697f3d10cb4881024a83746029a65 | -| V2.0.1-beta | The first version of the V2.x series, supporting dual models of tree and table | V2.0 and above | 0ca0d5029874ed8ada9c7d1cb562370b3a46913eed66d39c08759287ccc8bf332cf80bb8861e788614b61ae5d53a9f5605f553e1a607e856f395eb5102e7cc4d | -| V1.5.7 | Optimize the point list by splitting point names into device names and points, ensure the point selection area supports horizontal scrolling, and align the export file column order with the page display. 
| All 1.x versions from V1.3.4 onward | d3cd4a63372ca5d6217b67dddf661980c6a442b3b1564235e9ad34fc254d681febd58c2cc59c6273ffbfd8a1b003b9adb130ecfaaebe1942003b0d07427b1fcc | -| V1.5.6 | Enhanced CSV import/export: optional tags/aliases on import; support for measurement descriptions with backtick-quoted quotes on export. | All 1.x versions from V1.3.4 onward | 276ac1ea341f468bf6d29489c9109e9aa61afe2d1caaab577bc40603c6f4120efccc36b65a58a29ce6a266c21b46837aad6128f84ba5e676231ea9e6284a35e5 | -| V1.5.5 | Added server clock functionality and support for activating Enterprise Edition license databases | All 1.x versions from V1.3.4 onward | b18d01b70908d503a25866d1cc69d14e024d5b10ca6fcc536932fdbef8257c66e53204663ce3be5548479911aca238645be79dfd7ee7e65a07ab3c0f68c497f6 | -| V1.5.4 | Added authentication for Prometheus settings in Instance Management | All 1.x versions from V1.3.4 onward | adc7e13576913f9e43a9671fed02911983888da57be98ec8fbbb2593600d310f69619d32b22b569520c88e29f100d7ccae995b20eba757dbb1b2825655719335 | -| V1.5.1 | Added AI analysis and pattern matching | All 1.x versions from V1.3.2 onward | 4f2053a2a3b2b255ce195268d6cd245278f3be32ba4cf68be1552c386d78ed4424f7bdc9d8e68c6b8260b3e398c8fd23ff342439c4e88e1e777c62640d2279f9 | -| V1.4.0 | Added tree model display and English UI | All 1.x versions from V1.3.2 onward | 734077f3bb5e1719d20b319d8b554ce30718c935cb0451e02b2c9267ff770e9c2d63b958222f314f16c2e6e62bf78b643255249b574ee6f37d00e123433981e8 | -| V1.3.1 | Enhanced analysis methods and import templates | All 1.x versions from V1.3.2 onward | 134f87101cc7f159f8a22ac976ad2a3a295c5435058ee0a15160892aac46ac61dd3cfb0633b4aea9cc7415bf904d0ae65aaf77d663f027d864204d81fb34768b | -| V1.3.0 | Added DB configuration and UI refinements | All 1.x versions from V1.3.2 onward | 94a137fc5c681b211f3e076472a9c5875d59e7f0cd6d7409cb8f66bb9e4f87577a0f12dd500e2bcb99a435860c82183e4a6514b638bcb4aecfb48f184730f3f1 | -| V1.2.6 | Optimized permission controls | All 1.x versions from V1.3.1 
onward | f345b7edcbe245a561cb94ec2e4f4d40731fe205f134acadf5e391e5874c5c2477d9f75f15dbaf36c3a7cb6506823ac6fbc2a0ccce484b7c4cc71ec0fbdd9901 | -| V1.2.5 | Added "Common Templates" and caching | All 1.x versions from V1.3.0 onward | 37376b6cfbef7df8496e255fc33627de01bd68f636e50b573ed3940906b6f3da1e8e8b25260262293b8589718f5a72180fa15e5823437bf6dc51ed7da0c583f7 | -| V1.2.4 | Added import/export for calculations, time alignment field | All 1.x versions from V1.2.2 onward | 061ad1add38c109c1a90b06f1ddb7797bd45e84a34a4f77154ee48b90bdc7ecccc1e25eaa53fbbc98170d99facca93e3536192dd8d10a50ce505f59923ce6186 | -| V1.2.3 | Added activation details and analysis features | All 1.x versions from V1.2.2 onward | 254f5b7451300f6f99937d27fd7a5b20847d5293f53e0eaf045ac9235c7ea011785716b800014645ed5d2161078b37e1d04f3c59589c976614fb801c4da982e1 | -| V1.2.2 | Optimized point description display | All 1.x versions from V1.2.2 onward | 062e520d010082be852d6db0e2a3aa6de594eb26aeb608da28a212726e378cd4ea30fca5e1d2c3231ebd8de29e94ca9641f1fabc1cea46acfb650c37b7681b4e | -| V1.2.1 | Added sync monitoring panel, Prometheus hints | All 1.x versions from V1.2.2 onward | 8a3bcf87982ad5004528829b121f2d3945429deb77069917a42a8c8d2e2e2a2c24a398aaa87003920eeacc0c692f1ed39eac52a696887aa085cce011f0ddd745 | -| V1.2.0 | Major Workbench upgrade | All 1.x versions from V1.2.0 onward | ea1f7d3a4c0c6476a195479e69bbd3b3a2da08b5b2bb70b0a4aba988a28b5db5a209d4e2c697eb8095dfdf130e29f61f2ddf58c5b51d002c8d4c65cfc13106b3 | diff --git a/src/UserGuide/latest-Table/QuickStart/QuickStart_timecho.md b/src/UserGuide/latest-Table/QuickStart/QuickStart_timecho.md deleted file mode 100644 index 918214f34..000000000 --- a/src/UserGuide/latest-Table/QuickStart/QuickStart_timecho.md +++ /dev/null @@ -1,94 +0,0 @@ - - -# Quick Start - -This document will guide you through methods to get started quickly with IoTDB. - -## 1. How to Install and Deploy? - -This guide will assist you in quickly installing and deploying IoTDB. 
Use the following document links to jump to the content you need: - -1. Prepare the necessary machine resources: The deployment and operation of IoTDB require consideration of various aspects of machine resource configuration. For specific resource configurations, please refer to [Database Resource](../Deployment-and-Maintenance/Database-Resources_timecho.md) - -2. Complete system configuration preparations: IoTDB's system configuration involves multiple aspects. For an introduction to key system configurations, please see [System Requirements](../Deployment-and-Maintenance/Environment-Requirements.md) - -3. Obtain the installation package: You can contact the Timecho Team to get the IoTDB installation package to ensure you download the latest and most stable version. For the specific structure of the installation package, please refer to [Obtain TimechoDB](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) - -4. Install the database and activate it: Depending on your actual deployment architecture, you can choose from the following tutorials for installation and deployment: - - - Stand-Alone Deployment: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - - - Distributed (Cluster) Deployment: [Distributed (Cluster) Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - - - Dual-Active Deployment: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -> ❗️Note: We currently still recommend direct installation and deployment on physical/virtual machines. For Docker deployment, please refer to [Docker Deployment](../Deployment-and-Maintenance/Docker-Deployment_timecho.md) - -5.
Install supporting tools: TimechoDB provides supporting tools such as a monitoring panel. Installing them when you deploy TimechoDB is recommended, as they make the database easier to operate: - - - Monitoring Panel: Provides hundreds of database monitoring metrics for fine-grained monitoring of IoTDB and its operating system, assisting in system optimization, performance tuning, and bottleneck detection. For installation steps, please see [Monitoring-panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - -## 2. How to Use IoTDB? - -1. Database Modeling Design: Database modeling is a crucial step in creating a database system, involving the design of data structures and relationships to ensure that the organization of data meets the needs of specific applications. The following documents will help you quickly understand IoTDB's modeling design: - - - Introduction to Time Series Concepts: [Navigating Time Series Data](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - - - Introduction to Modeling Design: [Data Model and Terminology](../Background-knowledge/Data-Model-and-Terminology_timecho.md) - - - Introduction to Database: [Database Management](../Basic-Concept/Database-Management_timecho.md) - - - Introduction to Tables: [Table Management](../Basic-Concept/Table-Management_timecho.md) - -2. Data Insertion & Updates: IoTDB provides multiple methods for inserting real-time data, and supports data write-back. For basic data insertion and updating operations, please see [Write & Update Data](../Basic-Concept/Write-Updata-Data_timecho.md) - -3. Data Querying: IoTDB offers a rich set of data querying capabilities. For a basic introduction to data querying, please see [Query Data](../Basic-Concept/Query-Data.md). It includes pattern queries and window functions applicable to time-series feature analysis.
For detailed introductions, please refer to [Pattern Query](../User-Manual/Pattern-Query_timecho.md) and [Window Function](../User-Manual/Window-Function_timecho.md). - -4. Data Deletion: IoTDB supports two deletion methods: SQL-based deletion and automatic expiration deletion (TTL). - - - SQL-Based Deletion: For a basic introduction, please refer to [Delete Data](../Basic-Concept/Delete-Data.md) - - - Automatic Expiration Deletion (TTL): For a basic introduction, please see [TTL Delete Data](../Basic-Concept/TTL-Delete-Data_timecho.md) - -5. Advanced Features: In addition to common database functions such as insertion and querying, IoTDB also supports features such as data synchronization. For specific usage methods, please refer to the respective documents: - - - Data Synchronization: [Data Sync](../User-Manual/Data-Sync_timecho.md) - -6. Application Programming Interfaces (APIs): IoTDB provides various APIs that let developers interact with IoTDB from their applications. Currently supported interfaces include [Java Native API](../API/Programming-Java-Native-API_timecho.md), [Python Native API](../API/Programming-Python-Native-API_timecho.md), [JDBC](../API/Programming-JDBC_timecho.md), and more. For more programming interfaces, please refer to the [Application Programming Interfaces] section on the official website. - -## 3. What other convenient tools are available? - -In addition to its own rich features, IoTDB is complemented by a comprehensive suite of peripheral tools. This guide will help you quickly understand and use these tools: - - - Monitoring Panel: A tool for detailed monitoring of IoTDB and its operating system, covering hundreds of database monitoring indicators such as database performance and system resources, helping with system optimization and bottleneck identification. For a detailed usage introduction, please refer to [Monitor Tool](../Tools-System/Monitor-Tool_timecho.md) - - -## 4.
Want to learn more technical details? - -If you want to explore IoTDB’s internal mechanisms further, refer to the following documents: - - - Data Partitioning and Load Balancing: IoTDB is designed with a partitioning strategy and load balancing algorithm to enhance cluster availability and performance. For more details, please see [Cluster data partitioning](../Technical-Insider/Cluster-data-partitioning.md) - - - Compression & Encoding: IoTDB employs various encoding and compression techniques optimized for different data types to improve storage efficiency. For more details, please see [Encoding and Compression](../Technical-Insider/Encoding-and-Compression.md) - - diff --git a/src/UserGuide/latest-Table/Reference/System-Config-Manual_timecho.md b/src/UserGuide/latest-Table/Reference/System-Config-Manual_timecho.md deleted file mode 100644 index f96a33713..000000000 --- a/src/UserGuide/latest-Table/Reference/System-Config-Manual_timecho.md +++ /dev/null @@ -1,3374 +0,0 @@ - - -# Config Manual - -## 1. IoTDB Configuration Files - -The configuration files for IoTDB are located in the `conf` folder under the IoTDB installation directory. Key configuration files include: - -1. `confignode-env.sh` **/** `confignode-env.bat`: - 1. Environment configuration file for ConfigNode. - 2. Used to configure memory size and other environment settings for ConfigNode. -2. `datanode-env.sh` **/** `datanode-env.bat`: - 1. Environment configuration file for DataNode. - 2. Used to configure memory size and other environment settings for DataNode. -3. `iotdb-system.properties`: - 1. Main configuration file for IoTDB. - 2. Contains configurable parameters for IoTDB. -4. `iotdb-system.properties.template`: - 1. Template for the `iotdb-system.properties` file. - 2. Provides a reference for all available configuration parameters. - -## 2. 
Modify Configurations - -### 2.1 **Modify Existing Parameters**: - -- Parameters already present in the `iotdb-system.properties` file can be directly modified. - -### 2.2 **Adding New Parameters**: - -- For parameters not listed in `iotdb-system.properties`, you can find them in the `iotdb-system.properties.template` file. -- Copy the desired parameter from the template file to `iotdb-system.properties` and modify its value. - -### 2.3 Configuration Update Methods - -Different configuration parameters have different update methods, categorized as follows: - -1. **Modify before the first startup.**: - 1. These parameters can only be modified before the first startup of ConfigNode/DataNode. - 2. Modifying them after the first startup will prevent ConfigNode/DataNode from starting. -2. **Restart Required for Changes to Take Effect**: - 1. These parameters can be modified after ConfigNode/DataNode has started. - 2. However, a restart of ConfigNode/DataNode is required for the changes to take effect. -3. **Hot Reload**: - 1. These parameters can be modified while ConfigNode/DataNode is running. - 2. After modification, use the following SQL commands to apply the changes: - - `load configuration`: Reloads the configuration. - - `set configuration key1 = 'value1'`: Updates specific configuration parameters. - -## 3. Environment Parameters - -The environment configuration files (`confignode-env.sh/bat` and `datanode-env.sh/bat`) are used to configure Java environment parameters for ConfigNode and DataNode, such as JVM settings. These configurations are passed to the JVM when ConfigNode or DataNode starts. - -### 3.1 **confignode-env.sh/bat** - -- MEMORY_SIZE - -| Name | MEMORY_SIZE | -| ----------- | ------------------------------------------------------------ | -| Description | Memory size allocated when IoTDB ConfigNode starts. | -| Type | String | -| Default | Depends on the operating system and machine configuration. 
Defaults to 3/10 of the machine's memory, capped at 16G. | -| Effective | Restart required | - -- ON_HEAP_MEMORY - -| Name | ON_HEAP_MEMORY | -| ----------- | ------------------------------------------------------------ | -| Description | On-heap memory size available for IoTDB ConfigNode. Previously named `MAX_HEAP_SIZE`. | -| Type | String | -| Default | Depends on the `MEMORY_SIZE` configuration. | -| Effective | Restart required | - -- OFF_HEAP_MEMORY - -| Name | OFF_HEAP_MEMORY | -| ----------- | ------------------------------------------------------------ | -| Description | Off-heap memory size available for IoTDB ConfigNode. Previously named `MAX_DIRECT_MEMORY_SIZE`. | -| Type | String | -| Default | Depends on the `MEMORY_SIZE` configuration. | -| Effective | Restart required | - -### 3.2 **datanode-env.sh/bat** - -- MEMORY_SIZE - -| Name | MEMORY_SIZE | -| ----------- | ------------------------------------------------------------ | -| Description | Memory size allocated when IoTDB DataNode starts. | -| Type | String | -| Default | Depends on the operating system and machine configuration. Defaults to 1/2 of the machine's memory. | -| Effective | Restart required | - -- ON_HEAP_MEMORY - -| Name | ON_HEAP_MEMORY | -| ----------- | ------------------------------------------------------------ | -| Description | On-heap memory size available for IoTDB DataNode. Previously named `MAX_HEAP_SIZE`. | -| Type | String | -| Default | Depends on the `MEMORY_SIZE` configuration. | -| Effective | Restart required | - -- OFF_HEAP_MEMORY - -| Name | OFF_HEAP_MEMORY | -| ----------- | ------------------------------------------------------------ | -| Description | Off-heap memory size available for IoTDB DataNode. Previously named `MAX_DIRECT_MEMORY_SIZE`. | -| Type | String | -| Default | Depends on the `MEMORY_SIZE` configuration. | -| Effective | Restart required | - -## 4. 
System Parameters (`iotdb-system.properties.template`) - -The `iotdb-system.properties` file contains various configurations for managing IoTDB clusters, nodes, replication, directories, monitoring, SSL, connections, object storage, tier management, and REST services. Below is a detailed breakdown of the parameters: - -### 4.1 Cluster Configuration - -- cluster_name - -| Name | cluster_name | -| ----------- | --------------------------------------------------------- | -| Description | Name of the cluster. | -| Type | String | -| Default | default_cluster | -| Effective | Use CLI: `set configuration cluster_name='xxx'`. | -| Note | Changes are distributed across nodes. Changes may not propagate to all nodes in case of network issues or node failures. Nodes that fail to update must manually modify `cluster_name` in their configuration files and restart. Under normal circumstances, it is not recommended to modify `cluster_name` by manually modifying configuration files or to perform hot-loading via `load configuration` method. | - -### 4.2 Seed ConfigNode - -- cn_seed_config_node - -| Name | cn_seed_config_node | -| ----------- | ------------------------------------------------------------ | -| Description | Address of the seed ConfigNode for Confignode to join the cluster. | -| Type | String | -| Default | 127.0.0.1:10710 | -| Effective | Modify before the first startup. | - -- dn_seed_config_node - -| Name | dn_seed_config_node | -| ----------- | ------------------------------------------------------------ | -| Description | Address of the seed ConfigNode for Datanode to join the cluster. | -| Type | String | -| Default | 127.0.0.1:10710 | -| Effective | Modify before the first startup. | - -### 4.3 Node RPC Configuration - -- cn_internal_address - -| Name | cn_internal_address | -| ----------- | ---------------------------------------------- | -| Description | Internal address for ConfigNode communication. 
| -| Type | String | -| Default | 127.0.0.1 | -| Effective | Modify before the first startup. | - -- cn_internal_port - -| Name | cn_internal_port | -| ----------- | ------------------------------------------- | -| Description | Port for ConfigNode internal communication. | -| Type | Short Int : [0,65535] | -| Default | 10710 | -| Effective | Modify before the first startup. | - -- cn_consensus_port - -| Name | cn_consensus_port | -| ----------- | ----------------------------------------------------- | -| Description | Port for ConfigNode consensus protocol communication. | -| Type | Short Int : [0,65535] | -| Default | 10720 | -| Effective | Modify before the first startup. | - -- dn_rpc_address - -| Name | dn_rpc_address | -| ----------- |---------------------------------| -| Description | Address for client RPC service. | -| Type | String | -| Default | 127.0.0.1 | -| Effective | Restart required. | - -- dn_rpc_port - -| Name | dn_rpc_port | -| ----------- | ---------------------------- | -| Description | Port for client RPC service. | -| Type | Short Int : [0,65535] | -| Default | 6667 | -| Effective | Restart required. | - -- dn_internal_address - -| Name | dn_internal_address | -| ----------- | -------------------------------------------- | -| Description | Internal address for DataNode communication. | -| Type | string | -| Default | 127.0.0.1 | -| Effective | Modify before the first startup. | - -- dn_internal_port - -| Name | dn_internal_port | -| ----------- | ----------------------------------------- | -| Description | Port for DataNode internal communication. | -| Type | int | -| Default | 10730 | -| Effective | Modify before the first startup. | - -- dn_mpp_data_exchange_port - -| Name | dn_mpp_data_exchange_port | -| ----------- | -------------------------------- | -| Description | Port for MPP data exchange. | -| Type | int | -| Default | 10740 | -| Effective | Modify before the first startup. 
| - -- dn_schema_region_consensus_port - -| Name | dn_schema_region_consensus_port | -| ----------- | ------------------------------------------------------------ | -| Description | Port for Datanode SchemaRegion consensus protocol communication. | -| Type | int | -| Default | 10750 | -| Effective | Modify before the first startup. | - -- dn_data_region_consensus_port - -| Name | dn_data_region_consensus_port | -| ----------- | ------------------------------------------------------------ | -| Description | Port for Datanode DataRegion consensus protocol communication. | -| Type | int | -| Default | 10760 | -| Effective | Modify before the first startup. | - -- dn_join_cluster_retry_interval_ms - -| Name | dn_join_cluster_retry_interval_ms | -| ----------- | --------------------------------------------------- | -| Description | Interval for DataNode to retry joining the cluster. | -| Type | long | -| Default | 5000 | -| Effective | Restart required. | - -### 4.4 Replication configuration - -- config_node_consensus_protocol_class - -| Name | config_node_consensus_protocol_class | -| ----------- | ------------------------------------------------------------ | -| Description | Consensus protocol for ConfigNode replication, only supports RatisConsensus | -| Type | String | -| Default | org.apache.iotdb.consensus.ratis.RatisConsensus | -| Effective | Modify before the first startup. | - -- schema_replication_factor - -| Name | schema_replication_factor | -| ----------- | ------------------------------------------------------------ | -| Description | Default schema replication factor for databases. | -| Type | int32 | -| Default | 1 | -| Effective | Restart required. Takes effect on the new database after restarting. | - -- schema_region_consensus_protocol_class - -| Name | schema_region_consensus_protocol_class | -| ----------- | ------------------------------------------------------------ | -| Description | Consensus protocol for schema region replication. 
Only RatisConsensus is supported when multiple replicas are configured. | -| Type | String | -| Default | org.apache.iotdb.consensus.ratis.RatisConsensus | -| Effective | Modify before the first startup. | - -- data_replication_factor - -| Name | data_replication_factor | -| ----------- | ------------------------------------------------------------ | -| Description | Default data replication factor for databases. | -| Type | int32 | -| Default | 1 | -| Effective | Restart required. Takes effect for databases created after the restart. | - -- data_region_consensus_protocol_class - -| Name | data_region_consensus_protocol_class | -| ----------- | ------------------------------------------------------------ | -| Description | Consensus protocol for data region replication. With multiple replicas, either IoTConsensus or RatisConsensus can be used. | -| Type | String | -| Default | org.apache.iotdb.consensus.iot.IoTConsensus | -| Effective | Modify before the first startup. | - -### 4.5 Directory configuration - -- cn_system_dir - -| Name | cn_system_dir | -| ----------- | ----------------------------------------------------------- | -| Description | System data storage path for ConfigNode. | -| Type | String | -| Default | data/confignode/system(Windows:data\\confignode\\system) | -| Effective | Restart required | - -- cn_consensus_dir - -| Name | cn_consensus_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Consensus protocol data storage path for ConfigNode. | -| Type | String | -| Default | data/confignode/consensus(Windows:data\\confignode\\consensus) | -| Effective | Restart required | - -- cn_pipe_receiver_file_dir - -| Name | cn_pipe_receiver_file_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Directory for pipe receiver files in ConfigNode.
| -| Type | String | -| Default | data/confignode/system/pipe/receiver(Windows:data\\confignode\\system\\pipe\\receiver) | -| Effective | Restart required | - -- dn_system_dir - -| Name | dn_system_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Schema storage path for DataNode. By default, it is stored in the data directory at the same level as the sbin directory. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/system(Windows:data\\datanode\\system) | -| Effective | Restart required | - -- dn_data_dirs - -| Name | dn_data_dirs | -| ----------- | ------------------------------------------------------------ | -| Description | Data storage path for DataNode. By default, it is stored in the data directory at the same level as the sbin directory. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/data(Windows:data\\datanode\\data) | -| Effective | Restart required | - -- dn_multi_dir_strategy - -| Name | dn_multi_dir_strategy | -| ----------- | ------------------------------------------------------------ | -| Description | The strategy used by IoTDB to select directories in `data_dirs` for TsFiles. You can use either the simple class name or the fully qualified class name. The system provides the following two strategies: 1. SequenceStrategy: IoTDB selects directories sequentially, iterating through all directories in `data_dirs` in a round-robin manner. 2. MaxDiskUsableSpaceFirstStrategy IoTDB prioritizes the directory in `data_dirs` with the largest disk free space. To implement a custom strategy: 1. Inherit the `org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy `class and implement your own strategy method. 2. 
Fill in the configuration item with the fully qualified class name of your implementation (package name + class name, e.g., `UserDefineStrategyPackage`). 3. Add the JAR file containing your custom class to the project. | -| Type | String | -| Default | SequenceStrategy | -| Effective | Hot reload. | - -- dn_consensus_dir - -| Name | dn_consensus_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Consensus log storage path for DataNode. By default, it is stored in the data directory at the same level as the sbin directory. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/consensus(Windows:data\\datanode\\consensus) | -| Effective | Restart required | - -- dn_wal_dirs - -| Name | dn_wal_dirs | -| ----------- | ------------------------------------------------------------ | -| Description | Write-ahead log (WAL) storage path for DataNode. By default, it is stored in the data directory at the same level as the sbin directory. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/wal(Windows:data\\datanode\\wal) | -| Effective | Restart required | - -- dn_tracing_dir - -| Name | dn_tracing_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Tracing root directory for DataNode. By default, it is stored in the data directory at the same level as the sbin directory. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. 
| -| Type | String | -| Default | datanode/tracing(Windows:datanode\\tracing) | -| Effective | Restart required | - -- dn_sync_dir - -| Name | dn_sync_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Sync storage path for DataNode. By default, it is stored in the data directory at the same level as the sbin directory. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/sync(Windows:data\\datanode\\sync) | -| Effective | Restart required | - -- sort_tmp_dir - -| Name | sort_tmp_dir | -| ----------- | ------------------------------------------------- | -| Description | Temporary directory for sorting operations. | -| Type | String | -| Default | data/datanode/tmp(Windows:data\\datanode\\tmp) | -| Effective | Restart required | - -- dn_pipe_receiver_file_dirs - -| Name | dn_pipe_receiver_file_dirs | -| ----------- | ------------------------------------------------------------ | -| Description | Directory for pipe receiver files in DataNode. | -| Type | String | -| Default | data/datanode/system/pipe/receiver(Windows:data\\datanode\\system\\pipe\\receiver) | -| Effective | Restart required | - -- iot_consensus_v2_receiver_file_dirs - -| Name | iot_consensus_v2_receiver_file_dirs | -| ----------- | ------------------------------------------------------------ | -| Description | Directory for IoTConsensus V2 receiver files. | -| Type | String | -| Default | data/datanode/system/pipe/consensus/receiver(Windows:data\\datanode\\system\\pipe\\consensus\\receiver) | -| Effective | Restart required | - -- iot_consensus_v2_deletion_file_dir - -| Name | iot_consensus_v2_deletion_file_dir | -| ----------- | ------------------------------------------------------------ | -| Description | Directory for IoTConsensus V2 deletion files.
| -| Type | String | -| Default | data/datanode/system/pipe/consensus/deletion(Windows:data\\datanode\\system\\pipe\\consensus\\deletion) | -| Effective | Restart required | - -### 4.6 Metric Configuration - -- cn_metric_reporter_list - -| Name | cn_metric_reporter_list | -| ----------- | ----------------------------------------- | -| Description | Systems for reporting ConfigNode metrics. | -| Type | String | -| Default | None | -| Effective | Restart required. | - -- cn_metric_level - -| Name | cn_metric_level | -| ----------- | --------------------------------------- | -| Description | Level of detail for ConfigNode metrics. | -| Type | String | -| Default | IMPORTANT | -| Effective | Restart required. | - -- cn_metric_async_collect_period - -| Name | cn_metric_async_collect_period | -| ----------- | ------------------------------------------------------------ | -| Description | Period for asynchronous metric collection in ConfigNode (in seconds). | -| Type | int | -| Default | 5 | -| Effective | Restart required. | - -- cn_metric_prometheus_reporter_port - -| Name | cn_metric_prometheus_reporter_port | -| ----------- | --------------------------------------------------- | -| Description | Port for Prometheus metric reporting in ConfigNode. | -| Type | int | -| Default | 9091 | -| Effective | Restart required. | - -- dn_metric_reporter_list - -| Name | dn_metric_reporter_list | -| ----------- | --------------------------------------- | -| Description | Systems for reporting DataNode metrics. | -| Type | String | -| Default | None | -| Effective | Restart required. | - -- dn_metric_level - -| Name | dn_metric_level | -| ----------- | ------------------------------------- | -| Description | Level of detail for DataNode metrics. | -| Type | String | -| Default | IMPORTANT | -| Effective | Restart required. 
| - -- dn_metric_async_collect_period - -| Name | dn_metric_async_collect_period | -| ----------- | ------------------------------------------------------------ | -| Description | Period for asynchronous metric collection in DataNode (in seconds). | -| Type | int | -| Default | 5 | -| Effective | Restart required. | - -- dn_metric_prometheus_reporter_port - -| Name | dn_metric_prometheus_reporter_port | -| ----------- | ------------------------------------------------- | -| Description | Port for Prometheus metric reporting in DataNode. | -| Type | int | -| Default | 9092 | -| Effective | Restart required. | - -- dn_metric_internal_reporter_type - -| Name | dn_metric_internal_reporter_type | -| ----------- | ------------------------------------------------------------ | -| Description | Internal reporter types for DataNode metrics. For internal monitoring and checking that the data has been successfully written and refreshed. | -| Type | String | -| Default | IOTDB | -| Effective | Restart required. | - -### 4.7 SSL Configuration - -- enable_thrift_ssl - -| Name | enable_thrift_ssl | -| ----------- | --------------------------------------------- | -| Description | Enables SSL encryption for RPC communication. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- enable_https - -| Name | enable_https | -| ----------- | ------------------------------ | -| Description | Enables SSL for REST services. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- key_store_path - -| Name | key_store_path | -| ----------- | ---------------------------- | -| Description | Path to the SSL certificate. | -| Type | String | -| Default | None | -| Effective | Restart required. | - -- key_store_pwd - -| Name | key_store_pwd | -| ----------- | --------------------------------- | -| Description | Password for the SSL certificate. | -| Type | String | -| Default | None | -| Effective | Restart required. 
| - -### 4.8 Connection Configuration - -- cn_rpc_thrift_compression_enable - -| Name | cn_rpc_thrift_compression_enable | -| ----------- | ----------------------------------- | -| Description | Enables Thrift compression for RPC. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- cn_rpc_max_concurrent_client_num - -| Name | cn_rpc_max_concurrent_client_num | -| ----------- |-------------------------------------------| -| Description | Maximum number of concurrent RPC clients. | -| Type | int | -| Default | 3000 | -| Effective | Restart required. | - -- cn_connection_timeout_ms - -| Name | cn_connection_timeout_ms | -| ----------- | ---------------------------------------------------- | -| Description | Connection timeout for ConfigNode (in milliseconds). | -| Type | int | -| Default | 60000 | -| Effective | Restart required. | - -- cn_selector_thread_nums_of_client_manager - -| Name | cn_selector_thread_nums_of_client_manager | -| ----------- | ------------------------------------------------------------ | -| Description | Number of selector threads for client management in ConfigNode. | -| Type | int | -| Default | 1 | -| Effective | Restart required. | - -- cn_max_client_count_for_each_node_in_client_manager - -| Name | cn_max_client_count_for_each_node_in_client_manager | -| ----------- | ------------------------------------------------------ | -| Description | Maximum clients per node in ConfigNode client manager. | -| Type | int | -| Default | 300 | -| Effective | Restart required. | - -- dn_session_timeout_threshold - -| Name | dn_session_timeout_threshold | -| ----------- | ---------------------------------------- | -| Description | Maximum idle time for DataNode sessions. | -| Type | int | -| Default | 0 | -| Effective | Restart required.
| - -- dn_rpc_thrift_compression_enable - -| Name | dn_rpc_thrift_compression_enable | -| ----------- | -------------------------------------------- | -| Description | Enables Thrift compression for DataNode RPC. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- dn_rpc_advanced_compression_enable - -| Name | dn_rpc_advanced_compression_enable | -| ----------- | ----------------------------------------------------- | -| Description | Enables advanced Thrift compression for DataNode RPC. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- dn_rpc_selector_thread_count - -| Name | dn_rpc_selector_thread_count | -| ----------- | -------------------------------------------- | -| Description | Number of selector threads for DataNode RPC. | -| Type | int | -| Default | 1 | -| Effective | Restart required. | - -- dn_rpc_min_concurrent_client_num - -| Name | dn_rpc_min_concurrent_client_num | -| ----------- | ------------------------------------------------------ | -| Description | Minimum number of concurrent RPC clients for DataNode. | -| Type | Short Int : [0,65535] | -| Default | 1 | -| Effective | Restart required. | - -- dn_rpc_max_concurrent_client_num - -| Name | dn_rpc_max_concurrent_client_num | -| ----------- |--------------------------------------------------------| -| Description | Maximum number of concurrent RPC clients for DataNode. | -| Type | Short Int : [0,65535] | -| Default | 1000 | -| Effective | Restart required. | - -- dn_thrift_max_frame_size - -| Name | dn_thrift_max_frame_size | -| ----------- | ------------------------------------------------------------ | -| Description | Maximum frame size for RPC requests/responses.
| -| Type | int | -| Default | 0, which means the value is calculated automatically at startup from the DataNode JVM configuration as min(64MB, dn_alloc_memory/64). If `dn_thrift_max_frame_size` is configured manually, the user-specified value is used instead. | -| Effective | Restart required. | - -- dn_thrift_init_buffer_size - -| Name | dn_thrift_init_buffer_size | -| ----------- | ----------------------------------- | -| Description | Initial buffer size for Thrift RPC. | -| Type | long | -| Default | 1024 | -| Effective | Restart required. | - -- dn_connection_timeout_ms - -| Name | dn_connection_timeout_ms | -| ----------- | -------------------------------------------------- | -| Description | Connection timeout for DataNode (in milliseconds). | -| Type | int | -| Default | 60000 | -| Effective | Restart required. | - -- dn_selector_thread_count_of_client_manager - -| Name | dn_selector_thread_count_of_client_manager | -| ----------- | ------------------------------------------------------------ | -| Description | Number of selector threads (TAsyncClientManager) for asynchronous operations in a client manager. | -| Type | int | -| Default | 1 | -| Effective | Restart required. | - -- dn_max_client_count_for_each_node_in_client_manager - -| Name | dn_max_client_count_for_each_node_in_client_manager | -| ----------- | --------------------------------------------------- | -| Description | Maximum clients per node in DataNode client manager. | -| Type | int | -| Default | 300 | -| Effective | Restart required. | - -### 4.9 Object storage management - -- remote_tsfile_cache_dirs - -| Name | remote_tsfile_cache_dirs | -| ----------- | ---------------------------------------- | -| Description | Local cache directory for cloud storage. | -| Type | String | -| Default | data/datanode/data/cache | -| Effective | Restart required. | - -- remote_tsfile_cache_page_size_in_kb - -| Name | remote_tsfile_cache_page_size_in_kb | -| ----------- | --------------------------------------------- | -| Description | Block size for cached files in cloud storage. | -| Type | int | -| Default | 20480 | -| Effective | Restart required.
| - -- remote_tsfile_cache_max_disk_usage_in_mb - -| Name | remote_tsfile_cache_max_disk_usage_in_mb | -| ----------- | ------------------------------------------- | -| Description | Maximum disk usage for cloud storage cache. | -| Type | long | -| Default | 51200 | -| Effective | Restart required. | - -- object_storage_type - -| Name | object_storage_type | -| ----------- | ---------------------- | -| Description | Type of cloud storage. | -| Type | String | -| Default | AWS_S3 | -| Effective | Restart required. | - -- object_storage_endpoint - -| Name | object_storage_endpoint | -| ----------- | --------------------------- | -| Description | Endpoint for cloud storage. | -| Type | String | -| Default | None | -| Effective | Restart required. | - -- object_storage_bucket - -| Name | object_storage_bucket | -| ----------- | ------------------------------ | -| Description | Bucket name for cloud storage. | -| Type | String | -| Default | iotdb_data | -| Effective | Restart required. | - -- object_storage_access_key - -| Name | object_storage_access_key | -| ----------- | ----------------------------- | -| Description | Access key for cloud storage. | -| Type | String | -| Default | None | -| Effective | Restart required. | - -- object_storage_access_secret - -| Name | object_storage_access_secret | -| ----------- | -------------------------------- | -| Description | Access secret for cloud storage. | -| Type | String | -| Default | None | -| Effective | Restart required. | - -### 4.10 Tier management - -- dn_default_space_usage_thresholds - -| Name | dn_default_space_usage_thresholds | -| ----------- | ------------------------------------------------------------ | -| Description | Disk usage threshold. Data is moved to the next tier when a tier's usage exceeds this threshold. If tiered storage is enabled, separate the thresholds of different tiers with semicolons (";"). | -| Type | double | -| Default | 0.85 | -| Effective | Hot reload.
| - -- dn_tier_full_policy - -| Name | dn_tier_full_policy | -| ----------- | ------------------------------------------------------------ | -| Description | How to deal with the last tier's data when its used space has been higher than its dn_default_space_usage_thresholds. | -| Type | String | -| Default | NULL | -| Effective | Hot reload. | - -- migrate_thread_count - -| Name | migrate_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | thread pool size for migrate operation in the DataNode's data directories. | -| Type | int | -| Default | 1 | -| Effective | Hot reload. | - -- tiered_storage_migrate_speed_limit_bytes_per_sec - -| Name | tiered_storage_migrate_speed_limit_bytes_per_sec | -| ----------- | ------------------------------------------------------------ | -| Description | The migrate speed limit of different tiers can reach per second | -| Type | int | -| Default | 10485760 | -| Effective | Hot reload. | - -### 4.11 REST Service Configuration - -- enable_rest_service - -| Name | enable_rest_service | -| ----------- | --------------------------- | -| Description | Is the REST service enabled | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- rest_service_port - -| Name | rest_service_port | -| ----------- | ------------------------------------ | -| Description | the binding port of the REST service | -| Type | int32 | -| Default | 18080 | -| Effective | Restart required. | - -- enable_swagger - -| Name | enable_swagger | -| ----------- | ------------------------------------------------------------ | -| Description | Whether to display rest service interface information through swagger. eg: http://ip:port/swagger.json | -| Type | Boolean | -| Default | false | -| Effective | Restart required. 
| - -- rest_query_default_row_size_limit - -| Name | rest_query_default_row_size_limit | -| ----------- | ------------------------------------------------------------ | -| Description | the default row limit to a REST query response when the rowSize parameter is not given in request | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required. | - -- cache_expire_in_seconds - -| Name | cache_expire_in_seconds | -| ----------- | ------------------------------------------------------------ | -| Description | The expiration time of the user login information cache (in seconds) | -| Type | int32 | -| Default | 28800 | -| Effective | Restart required. | - -- cache_max_num - -| Name | cache_max_num | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of users can be stored in the user login cache. | -| Type | int32 | -| Default | 100 | -| Effective | Restart required. | - -- cache_init_num - -| Name | cache_init_num | -| ----------- | ------------------------------------------------------------ | -| Description | The initial capacity of users can be stored in the user login cache. | -| Type | int32 | -| Default | 10 | -| Effective | Restart required. | - -- client_auth - -| Name | client_auth | -| ----------- | --------------------------------- | -| Description | Is client authentication required | -| Type | boolean | -| Default | false | -| Effective | Restart required. | - -- trust_store_path - -| Name | trust_store_path | -| ----------- | -------------------- | -| Description | SSL trust store path | -| Type | String | -| Default | "" | -| Effective | Restart required. | - -- trust_store_pwd - -| Name | trust_store_pwd | -| ----------- | ------------------------- | -| Description | SSL trust store password. | -| Type | String | -| Default | "" | -| Effective | Restart required. 
| - -- idle_timeout_in_seconds - -| Name | idle_timeout_in_seconds | -| ----------- | ------------------------ | -| Description | SSL timeout (in seconds) | -| Type | int32 | -| Default | 5000 | -| Effective | Restart required. | - -### 4.12 Load balancing configuration - -- series_slot_num - -| Name | series_slot_num | -| ----------- | ------------------------------------------- | -| Description | Number of SeriesPartitionSlots per Database | -| Type | int32 | -| Default | 10000 | -| Effective | Modify before the first startup. | - -- series_partition_executor_class - -| Name | series_partition_executor_class | -| ----------- | ------------------------------------------------------------ | -| Description | SeriesPartitionSlot executor class | -| Type | String | -| Default | org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor | -| Effective | Modify before the first startup. | - -- schema_region_group_extension_policy - -| Name | schema_region_group_extension_policy | -| ----------- | ------------------------------------------------------------ | -| Description | The extension policy of SchemaRegionGroups for each Database. | -| Type | string | -| Default | AUTO | -| Effective | Restart required. | - -- default_schema_region_group_num_per_database - -| Name | default_schema_region_group_num_per_database | -| ----------- | ------------------------------------------------------------ | -| Description | When schema_region_group_extension_policy=CUSTOM, this parameter is the default number of SchemaRegionGroups for each Database. When schema_region_group_extension_policy=AUTO, this parameter is the default minimum number of SchemaRegionGroups for each Database. | -| Type | int | -| Default | 1 | -| Effective | Restart required.
| - -- schema_region_per_data_node - -| Name | schema_region_per_data_node | -| ----------- | ------------------------------------------------------------ | -| Description | Takes effect only when schema_region_group_extension_policy=AUTO. This parameter is the maximum number of SchemaRegions expected to be managed by each DataNode. | -| Type | double | -| Default | 1.0 | -| Effective | Restart required. | - -- data_region_group_extension_policy - -| Name | data_region_group_extension_policy | -| ----------- | ---------------------------------------------------------- | -| Description | The extension policy of DataRegionGroups for each Database. | -| Type | string | -| Default | AUTO | -| Effective | Restart required. | - -- default_data_region_group_num_per_database - -| Name | default_data_region_group_num_per_database | -| ----------- | ------------------------------------------------------------ | -| Description | When data_region_group_extension_policy=CUSTOM, this parameter is the default number of DataRegionGroups for each Database. When data_region_group_extension_policy=AUTO, this parameter is the default minimum number of DataRegionGroups for each Database. | -| Type | int | -| Default | 2 | -| Effective | Restart required. | - -- data_region_per_data_node - -| Name | data_region_per_data_node | -| ----------- | ------------------------------------------------------------ | -| Description | Takes effect only when data_region_group_extension_policy=AUTO. This parameter is the maximum number of DataRegions expected to be managed by each DataNode. | -| Type | double | -| Default | 5.0 | -| Effective | Restart required. | - -- enable_auto_leader_balance_for_ratis_consensus - -| Name | enable_auto_leader_balance_for_ratis_consensus | -| ----------- | ------------------------------------------------------------ | -| Description | Whether to enable auto leader balance for Ratis consensus protocol.
| -| Type | Boolean | -| Default | true | -| Effective | Restart required. | - -- enable_auto_leader_balance_for_iot_consensus - -| Name | enable_auto_leader_balance_for_iot_consensus | -| ----------- | ------------------------------------------------------------ | -| Description | Whether to enable auto leader balance for IoTConsensus protocol. | -| Type | Boolean | -| Default | true | -| Effective | Restart required. | - -### 4.13 Cluster management - -- time_partition_origin - -| Name | time_partition_origin | -| ----------- | ------------------------------------------------------------ | -| Description | Time partition origin in milliseconds, default is equal to zero. | -| Type | Long | -| Unit | ms | -| Default | 0 | -| Effective | Modify before the first startup. | - -- time_partition_interval - -| Name | time_partition_interval | -| ----------- | ------------------------------------------------------------ | -| Description | Time partition interval in milliseconds, and partitioning data inside each data region, default is equal to one week | -| Type | Long | -| Unit | ms | -| Default | 604800000 | -| Effective | Modify before the first startup. | - -- heartbeat_interval_in_ms - -| Name | heartbeat_interval_in_ms | -| ----------- | -------------------------------------- | -| Description | The heartbeat interval in milliseconds | -| Type | Long | -| Unit | ms | -| Default | 1000 | -| Effective | Restart required. | - -- disk_space_warning_threshold - -| Name | disk_space_warning_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | Disk remaining threshold at which DataNode is set to ReadOnly status | -| Type | double(percentage) | -| Default | 0.05 | -| Effective | Restart required. 
| - -### 4.14 Memory Control Configuration - -- datanode_memory_proportion - -| Name | datanode_memory_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Memory Allocation Ratio: StorageEngine, QueryEngine, SchemaEngine, Consensus, StreamingEngine and Free Memory. | -| Type | Ratio | -| Default | 3:3:1:1:1:1 | -| Effective | Restart required. | - -- schema_memory_proportion - -| Name | schema_memory_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Schema Memory Allocation Ratio: SchemaRegion, SchemaCache, and PartitionCache. | -| Type | Ratio | -| Default | 5:4:1 | -| Effective | Restart required. | - -- storage_engine_memory_proportion - -| Name | storage_engine_memory_proportion | -| ----------- | ----------------------------------------------------------- | -| Description | Memory allocation ratio in StorageEngine: Write, Compaction | -| Type | Ratio | -| Default | 8:2 | -| Effective | Restart required. | - -- write_memory_proportion - -| Name | write_memory_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Memory allocation ratio in writing: Memtable, TimePartitionInfo | -| Type | Ratio | -| Default | 19:1 | -| Effective | Restart required. | - -- primitive_array_size - -| Name | primitive_array_size | -| ----------- | --------------------------------------------------------- | -| Description | primitive array size (length of each array) in array pool | -| Type | int32 | -| Default | 64 | -| Effective | Restart required. | - -- chunk_metadata_size_proportion - -| Name | chunk_metadata_size_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Ratio of compaction memory for chunk metadata maintains in memory when doing compaction | -| Type | Double | -| Default | 0.1 | -| Effective | Restart required. 
| - -- flush_proportion - -| Name | flush_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Ratio of memtable memory at which a flush to disk is triggered, 0.4 by default. If the write load is extremely high (like batch=1000), it can be set lower than the default value, like 0.2. | -| Type | Double | -| Default | 0.4 | -| Effective | Restart required. | - -- buffered_arrays_memory_proportion - -| Name | buffered_arrays_memory_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Ratio of memtable memory allocated for buffered arrays, 0.6 by default | -| Type | Double | -| Default | 0.6 | -| Effective | Restart required. | - -- reject_proportion - -| Name | reject_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Ratio of memtable memory at which insertions are rejected, 0.8 by default. If the write load is extremely high (like batch=1000) and the physical memory size is large enough, it can be set higher than the default value, like 0.9. | -| Type | Double | -| Default | 0.8 | -| Effective | Restart required. | - -- device_path_cache_proportion - -| Name | device_path_cache_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Ratio of memtable memory for the DevicePathCache. DevicePathCache is the deviceId cache, keeping only one copy of the same deviceId in memory | -| Type | Double | -| Default | 0.05 | -| Effective | Restart required. | - -- write_memory_variation_report_proportion - -| Name | write_memory_variation_report_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | If the memory cost of a data region increases by more than this proportion of the memory allocated for writing, it is reported to the system. The default value is 0.001 | -| Type | Double | -| Default | 0.001 | -| Effective | Restart required.
| - -- check_period_when_insert_blocked - -| Name | check_period_when_insert_blocked | -| ----------- | ------------------------------------------------------------ | -| Description | When an insertion is rejected, the waiting period (in ms) before checking the system again, 50 by default. If the insertion has been rejected and the read load is low, it can be set larger. | -| Type | int32 | -| Default | 50 | -| Effective | Restart required. | - -- io_task_queue_size_for_flushing - -| Name | io_task_queue_size_for_flushing | -| ----------- | -------------------------------------------- | -| Description | Size of ioTaskQueue. The default value is 10 | -| Type | int32 | -| Default | 10 | -| Effective | Restart required. | - -- enable_query_memory_estimation - -| Name | enable_query_memory_estimation | -| ----------- | ------------------------------------------------------------ | -| Description | If true, each query's possible memory footprint is estimated before execution, and the query is denied if the estimate exceeds the current free memory | -| Type | bool | -| Default | true | -| Effective | Hot reload. | - -### 4.15 Schema Engine Configuration - -- schema_engine_mode - -| Name | schema_engine_mode | -| ----------- | ------------------------------------------------------------ | -| Description | The schema management mode of the schema engine. Currently, Memory and PBTree are supported. This setting must be the same on all DataNodes in a cluster. | -| Type | string | -| Default | Memory | -| Effective | Modify before the first startup. | - -- partition_cache_size - -| Name | partition_cache_size | -| ----------- | ------------------------- | -| Description | cache size for partition. | -| Type | Int32 | -| Default | 1000 | -| Effective | Restart required.
| - -- sync_mlog_period_in_ms - -| Name | sync_mlog_period_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | The period (in milliseconds) at which the metadata log is forced to be written to disk. If sync_mlog_period_in_ms=0, the metadata log is forced to disk after each refreshment. Setting this parameter to 0 may slow down operations on slow disks. | -| Type | Int64 | -| Default | 100 | -| Effective | Restart required. | - -- tag_attribute_flush_interval - -| Name | tag_attribute_flush_interval | -| ----------- | ------------------------------------------------------------ | -| Description | interval num for tag and attribute records when force flushing to disk | -| Type | int32 | -| Default | 1000 | -| Effective | Modify before the first startup. | - -- tag_attribute_total_size - -| Name | tag_attribute_total_size | -| ----------- | ------------------------------------------------------------ | -| Description | max size for a storage block for tags and attributes of a one-time series | -| Type | int32 | -| Default | 700 | -| Effective | Modify before the first startup. | - -- max_measurement_num_of_internal_request - -| Name | max_measurement_num_of_internal_request | -| ----------- | ------------------------------------------------------------ | -| Description | Max measurement num of an internal request. When creating timeseries with Session.createMultiTimeseries, a user input plan whose timeseries num exceeds this value is split into several plans, each with no more timeseries than this value. | -| Type | Int32 | -| Default | 10000 | -| Effective | Restart required. | - -- datanode_schema_cache_eviction_policy - -| Name | datanode_schema_cache_eviction_policy | -| ----------- | --------------------------------------- | -| Description | Policy of DataNodeSchemaCache eviction. | -| Type | String | -| Default | FIFO | -| Effective | Restart required.
| - -- cluster_timeseries_limit_threshold - -| Name | cluster_timeseries_limit_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | This configuration parameter sets the maximum number of time series allowed in the cluster. | -| Type | Int32 | -| Default | -1 | -| Effective | Restart required. | - -- cluster_device_limit_threshold - -| Name | cluster_device_limit_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | This configuration parameter sets the maximum number of devices allowed in the cluster. | -| Type | Int32 | -| Default | -1 | -| Effective | Restart required. | - -- database_limit_threshold - -| Name | database_limit_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | This configuration parameter sets the maximum number of Cluster Databases allowed. | -| Type | Int32 | -| Default | -1 | -| Effective | Restart required. | - -### 4.16 Configurations for creating schema automatically - -- enable_auto_create_schema - -| Name | enable_auto_create_schema | -| ----------- | ------------------------------------------------ | -| Description | Whether creating schema automatically is enabled | -| Value | true or false | -| Default | true | -| Effective | Restart required. | - -- default_storage_group_level - -| Name | default_storage_group_level | -| ----------- | ------------------------------------------------------------ | -| Description | Database level used when creating schema automatically is enabled. For example, given the path root.sg0.d1.s2, root.sg0 is set as the database if the database level is 1. If the incoming path is shorter than this value, the creation/insertion will fail. | -| Value | int32 | -| Default | 1 | -| Effective | Restart required.
| - -- boolean_string_infer_type - -| Name | boolean_string_infer_type | -| ----------- |------------------------------------------------------------------------------------| -| Description | register time series as which type when receiving boolean string "true" or "false" | -| Value | BOOLEAN or TEXT | -| Default | BOOLEAN | -| Effective | Hot_reload | - -- integer_string_infer_type - -| Name | integer_string_infer_type | -| ----------- |------------------------------------------------------------------------------------------------------------------| -| Description | register time series as which type when receiving an integer string and using float or double may lose precision | -| Value | INT32, INT64, FLOAT, DOUBLE, TEXT | -| Default | DOUBLE | -| Effective | Hot_reload | - -- floating_string_infer_type - -| Name | floating_string_infer_type | -| ----------- |----------------------------------------------------------------------------------| -| Description | register time series as which type when receiving a floating number string "6.7" | -| Value | DOUBLE, FLOAT or TEXT | -| Default | DOUBLE | -| Effective | Hot_reload | - -- nan_string_infer_type - -| Name | nan_string_infer_type | -| ----------- |--------------------------------------------------------------------| -| Description | register time series as which type when receiving the Literal NaN. 
| -| Value | DOUBLE, FLOAT or TEXT | -| Default | DOUBLE | -| Effective | Hot_reload | - -- default_boolean_encoding - -| Name | default_boolean_encoding | -| ----------- |----------------------------------------------------------------| -| Description | BOOLEAN encoding when creating schema automatically is enabled | -| Value | PLAIN, RLE | -| Default | RLE | -| Effective | Hot_reload | - -- default_int32_encoding - -| Name | default_int32_encoding | -| ----------- |--------------------------------------------------------------| -| Description | INT32 encoding when creating schema automatically is enabled | -| Value | PLAIN, RLE, TS_2DIFF, REGULAR, GORILLA | -| Default | TS_2DIFF | -| Effective | Hot_reload | - -- default_int64_encoding - -| Name | default_int64_encoding | -| ----------- |--------------------------------------------------------------| -| Description | INT64 encoding when creating schema automatically is enabled | -| Value | PLAIN, RLE, TS_2DIFF, REGULAR, GORILLA | -| Default | TS_2DIFF | -| Effective | Hot_reload | - -- default_float_encoding - -| Name | default_float_encoding | -| ----------- |--------------------------------------------------------------| -| Description | FLOAT encoding when creating schema automatically is enabled | -| Value | PLAIN, RLE, TS_2DIFF, GORILLA | -| Default | GORILLA | -| Effective | Hot_reload | - -- default_double_encoding - -| Name | default_double_encoding | -| ----------- |---------------------------------------------------------------| -| Description | DOUBLE encoding when creating schema automatically is enabled | -| Value | PLAIN, RLE, TS_2DIFF, GORILLA | -| Default | GORILLA | -| Effective | Hot_reload | - -- default_text_encoding - -| Name | default_text_encoding | -| ----------- |-------------------------------------------------------------| -| Description | TEXT encoding when creating schema automatically is enabled | -| Value | PLAIN | -| Default | PLAIN | -| Effective | Hot_reload | - - -* 
boolean_compressor - -| Name | boolean_compressor | -|------------------|-----------------------------------------------------------------------------------------| -| Description | BOOLEAN compression when creating schema automatically is enabled (Supports from V2.0.6) | -| Type | String | -| Default | LZ4 | -| Effective | Hot_reload | - -* int32_compressor - -| Name | int32_compressor | -|----------------------|--------------------------------------------------------------------------------------------| -| Description | INT32/DATE compression when creating schema automatically is enabled(Supports from V2.0.6) | -| Type | String | -| Default | LZ4 | -| Effective | Hot_reload | - -* int64_compressor - -| Name | int64_compressor | -|--------------------|-------------------------------------------------------------------------------------------------| -| Description | INT64/TIMESTAMP compression when creating schema automatically is enabled (Supports from V2.0.6) | -| Type | String | -| Default | LZ4 | -| Effective | Hot_reload | - -* float_compressor - -| Name | float_compressor | -|-----------------------|---------------------------------------------------------------------------------------| -| Description | FLOAT compression when creating schema automatically is enabled (Supports from V2.0.6) | -| Type | String | -| Default | LZ4 | -| Effective | Hot_reload | - -* double_compressor - -| Name | double_compressor | -|-------------------|----------------------------------------------------------------------------------------| -| Description | DOUBLE compression when creating schema automatically is enabled (Supports from V2.0.6) | -| Type | String | -| Default | LZ4 | -| Effective | Hot_reload | - -* text_compressor - -| Name | text_compressor | -|--------------------|--------------------------------------------------------------------------------------------------| -| Description | TEXT/BINARY/BLOB compression when creating schema automatically is enabled (Supports 
from V2.0.6) | -| Type | String | -| Default | LZ4 | -| Effective | Hot_reload | - - -### 4.17 Query Configurations - -- read_consistency_level - -| Name | read_consistency_level | -| ----------- | ------------------------------------------------------------ | -| Description | The read consistency level. These consistency levels are currently supported: strong (default, read from the leader replica), weak (read from a random replica). | -| Type | String | -| Default | strong | -| Effective | Restart required. | - -- meta_data_cache_enable - -| Name | meta_data_cache_enable | -| ----------- | ------------------------------------------------------------ | -| Description | Whether to cache meta data (BloomFilter, ChunkMetadata and TimeSeriesMetadata) or not. | -| Type | Boolean | -| Default | true | -| Effective | Restart required. | - -- chunk_timeseriesmeta_free_memory_proportion - -| Name | chunk_timeseriesmeta_free_memory_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | Read memory allocation ratio: BloomFilterCache : ChunkCache : TimeSeriesMetadataCache : Coordinator : Operators : DataExchange : timeIndex in TsFileResourceList : others. The parameter form is a:b:c:d:e:f:g:h, where a, b, c, d, e, f, g and h are integers, for example 1:1:1:1:1:1:1:1 or 1:100:200:50:200:200:200:50. | -| Type | String | -| Default | 1 : 100 : 200 : 300 : 400 | -| Effective | Restart required. | - -- enable_last_cache - -| Name | enable_last_cache | -| ----------- | ---------------------------- | -| Description | Whether to enable LAST cache | -| Type | Boolean | -| Default | true | -| Effective | Restart required. | - -- mpp_data_exchange_core_pool_size - -| Name | mpp_data_exchange_core_pool_size | -| ----------- | -------------------------------------------- | -| Description | Core size of ThreadPool of MPP data exchange | -| Type | int32 | -| Default | 10 | -| Effective | Restart required. 
| - -- mpp_data_exchange_max_pool_size - -| Name | mpp_data_exchange_max_pool_size | -| ----------- | ------------------------------------------- | -| Description | Max size of ThreadPool of MPP data exchange | -| Type | int32 | -| Default | 10 | -| Effective | Restart required. | - -- mpp_data_exchange_keep_alive_time_in_ms - -| Name | mpp_data_exchange_keep_alive_time_in_ms | -| ----------- | --------------------------------------- | -| Description | Max waiting time for MPP data exchange | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- driver_task_execution_time_slice_in_ms - -| Name | driver_task_execution_time_slice_in_ms | -| ----------- | -------------------------------------- | -| Description | The max execution time of a DriverTask | -| Type | int32 | -| Default | 200 | -| Effective | Restart required. | - -- max_tsblock_size_in_bytes - -| Name | max_tsblock_size_in_bytes | -| ----------- | ----------------------------- | -| Description | The max capacity of a TsBlock | -| Type | int32 | -| Default | 131072 | -| Effective | Restart required. | - -- max_tsblock_line_numbers - -| Name | max_tsblock_line_numbers | -| ----------- | ------------------------------------------- | -| Description | The max number of lines in a single TsBlock | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- slow_query_threshold - -| Name | slow_query_threshold | -| ----------- |----------------------------------------| -| Description | Time cost(ms) threshold for slow query | -| Type | long | -| Default | 3000 | -| Effective | Hot reload | - -- query_cost_stat_window - -| Name | query_cost_stat_window | -|-------------|--------------------| -| Description | Time window threshold(min) for record of history queries. 
| -| Type | Int32 | -| Default | 0 | -| Effective | Hot reload | - -- query_timeout_threshold - -| Name | query_timeout_threshold | -| ----------- | ----------------------------------------- | -| Description | The max executing time of query. unit: ms | -| Type | Int32 | -| Default | 60000 | -| Effective | Restart required. | - -- max_allowed_concurrent_queries - -| Name | max_allowed_concurrent_queries | -| ----------- | -------------------------------------------------- | -| Description | The maximum allowed concurrently executing queries | -| Type | Int32 | -| Default | 1000 | -| Effective | Restart required. | - -- query_thread_count - -| Name | query_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | How many threads can concurrently execute query statement. When <= 0, use CPU core number. | -| Type | Int32 | -| Default | 0 | -| Effective | Restart required. | - -- degree_of_query_parallelism - -| Name | degree_of_query_parallelism | -| ----------- | ------------------------------------------------------------ | -| Description | How many pipeline drivers will be created for one fragment instance. When <= 0, use CPU core number / 2. | -| Type | Int32 | -| Default | 0 | -| Effective | Restart required. | - -- mode_map_size_threshold - -| Name | mode_map_size_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | The threshold of count map size when calculating the MODE aggregation function | -| Type | Int32 | -| Default | 10000 | -| Effective | Restart required. | - -- batch_size - -| Name | batch_size | -| ----------- | ------------------------------------------------------------ | -| Description | The amount of data iterate each time in server (the number of data strips, that is, the number of different timestamps.) | -| Type | Int32 | -| Default | 100000 | -| Effective | Restart required. 
| - -- sort_buffer_size_in_bytes - -| Name | sort_buffer_size_in_bytes | -| ----------- |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | The memory for external sort in sort operator, when the data size is smaller than sort_buffer_size_in_bytes, the sort operator will use in-memory sort. | -| Type | long | -| Default | 1048576(Before V2.0.6)
0 (supported since V2.0.6). If `sort_buffer_size_in_bytes <= 0`, the default value is used: `default value = min(32MB, memory for query operators / query_thread_count / 2)`. If `sort_buffer_size_in_bytes > 0`, the specified value is used. | -| Effective | Hot_reload | - -- merge_threshold_of_explain_analyze - -| Name | merge_threshold_of_explain_analyze | -| ----------- | ------------------------------------------------------------ | -| Description | The threshold of operator count in the result set of EXPLAIN ANALYZE. If the number of operators in the result set is larger than this threshold, operators will be merged. | -| Type | int | -| Default | 10 | -| Effective | Hot reload | - -### 4.18 TTL Configuration - -- ttl_check_interval - -| Name | ttl_check_interval | -| ----------- | ------------------------------------------------------------ | -| Description | The interval of the TTL check task in each database. The TTL check task will inspect and select files with a higher volume of expired data for compaction. Default is 2 hours (7200000 ms). | -| Type | int | -| Default | 7200000 | -| Effective | Restart required. | - -- max_expired_time - -| Name | max_expired_time | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum expiring time of a device which has a TTL. Default is 1 month. If the data elapsed time (current timestamp minus the maximum data timestamp of the device in the file) of such devices exceeds this value, then the file will be cleaned by compaction. | -| Type | int | -| Default | 2592000000 | -| Effective | Restart required. | - -- expired_data_ratio - -| Name | expired_data_ratio | -| ----------- | ------------------------------------------------------------ | -| Description | The expired device ratio. If the ratio of expired devices in one file exceeds this value, then expired data of this file will be cleaned by compaction. | -| Type | float | -| Default | 0.3 | -| Effective | Restart required. 
| - -### 4.19 Storage Engine Configuration - -- timestamp_precision - -| Name | timestamp_precision | -| ----------- | ------------------------------------------------------------ | -| Description | Use this value to set timestamp precision as "ms", "us" or "ns". | -| Type | String | -| Default | ms | -| Effective | Modify before the first startup. | - -- timestamp_precision_check_enabled - -| Name | timestamp_precision_check_enabled | -| ----------- | ------------------------------------------------------------ | -| Description | When the timestamp precision check is enabled, the timestamps those are over 13 digits for ms precision, or over 16 digits for us precision are not allowed to be inserted. | -| Type | Boolean | -| Default | true | -| Effective | Modify before the first startup. | - -- max_waiting_time_when_insert_blocked - -| Name | max_waiting_time_when_insert_blocked | -| ----------- | ------------------------------------------------------------ | -| Description | When the waiting time (in ms) of an inserting exceeds this, throw an exception. 10000 by default. | -| Type | Int32 | -| Default | 10000 | -| Effective | Restart required. | - -- handle_system_error - -| Name | handle_system_error | -| ----------- | -------------------------------------------------------- | -| Description | What will the system do when unrecoverable error occurs. | -| Type | String | -| Default | CHANGE_TO_READ_ONLY | -| Effective | Restart required. | - -- enable_timed_flush_seq_memtable - -| Name | enable_timed_flush_seq_memtable | -| ----------- | --------------------------------------------------- | -| Description | Whether to timed flush sequence tsfiles' memtables. 
| -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- seq_memtable_flush_interval_in_ms - -| Name | seq_memtable_flush_interval_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | If a memTable's last update time is older than current time minus this, the memtable will be flushed to disk. | -| Type | long | -| Default | 600000 | -| Effective | Hot reload | - -- seq_memtable_flush_check_interval_in_ms - -| Name | seq_memtable_flush_check_interval_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | The interval to check whether sequence memtables need flushing. | -| Type | long | -| Default | 30000 | -| Effective | Hot reload | - -- enable_timed_flush_unseq_memtable - -| Name | enable_timed_flush_unseq_memtable | -| ----------- | ----------------------------------------------------- | -| Description | Whether to timed flush unsequence tsfiles' memtables. | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- unseq_memtable_flush_interval_in_ms - -| Name | unseq_memtable_flush_interval_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | If a memTable's last update time is older than current time minus this, the memtable will be flushed to disk. | -| Type | long | -| Default | 600000 | -| Effective | Hot reload | - -- unseq_memtable_flush_check_interval_in_ms - -| Name | unseq_memtable_flush_check_interval_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | The interval to check whether unsequence memtables need flushing. 
| -| Type | long | -| Default | 30000 | -| Effective | Hot reload | - -- tvlist_sort_algorithm - -| Name | tvlist_sort_algorithm | -| ----------- | ------------------------------------------------- | -| Description | The sort algorithms used in the memtable's TVList | -| Type | String | -| Default | TIM | -| Effective | Restart required. | - -- avg_series_point_number_threshold - -| Name | avg_series_point_number_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | When the average point number of timeseries in memtable exceeds this, the memtable is flushed to disk. | -| Type | int32 | -| Default | 100000 | -| Effective | Restart required. | - -- flush_thread_count - -| Name | flush_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | How many threads can concurrently flush. When <= 0, use CPU core number. | -| Type | int32 | -| Default | 0 | -| Effective | Restart required. | - -- enable_partial_insert - -| Name | enable_partial_insert | -| ----------- | ------------------------------------------------------------ | -| Description | In one insert (one device, one timestamp, multiple measurements), if enable partial insert, one measurement failure will not impact other measurements | -| Type | Boolean | -| Default | true | -| Effective | Restart required. | - -- recovery_log_interval_in_ms - -| Name | recovery_log_interval_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | the interval to log recover progress of each vsg when starting iotdb | -| Type | Int32 | -| Default | 5000 | -| Effective | Restart required. | - -- 0.13_data_insert_adapt - -| Name | 0.13_data_insert_adapt | -| ----------- | ------------------------------------------------------------ | -| Description | If using a v0.13 client to insert data, please set this configuration to true. 
| -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- enable_tsfile_validation - -| Name | enable_tsfile_validation | -| ----------- | ------------------------------------------------------------ | -| Description | Verify that TSfiles generated by Flush, Load, and Compaction are correct. | -| Type | boolean | -| Default | false | -| Effective | Hot reload | - -- tier_ttl_in_ms - -| Name | tier_ttl_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | Default tier TTL. When the survival time of the data exceeds the threshold, it will be migrated to the next tier. | -| Type | long | -| Default | -1 | -| Effective | Restart required. | - -- max_object_file_size_in_byte - -| Name | max_object_file_size_in_byte | -|-------------|-----------------------------------------------------------------------| -| Description | Maximum size limit for a single object file (supported since V2.0.8). | -| Type | long | -| Default | 4294967296 (4 GB in bytes) | -| Effective | Hot reload | - -- restrict_object_limit - -| Name | restrict_object_limit | -|-------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | No 
special restrictions apply by default to table names, column names, or device identifiers for the `OBJECT` type (supported since V2.0.8). When set to `true` and the table contains `OBJECT` columns, the following restrictions apply:
1. Naming Rules: Values in TAG columns, table names, and field names must not be `.` or `..` and must not contain `./` or `.\`; otherwise metadata creation will fail. Names containing characters that the filesystem does not support will cause write errors.
2. Case Sensitivity: If the underlying filesystem is case-insensitive, device identifiers like `'d1'` and `'D1'` are treated as identical. Creating identifiers that differ only in case may overwrite `OBJECT` data files, leading to data corruption.
3. Storage Path: Actual storage path format: `${dataregionid}/${tablename}/${tag1}/${tag2}/.../${field}/${timestamp}.bin` | -| Type | boolean | -| Default | false | -| Effective | Can only be modified before the first service startup. | - -### 4.20 Compaction Configurations - -- enable_seq_space_compaction - -| Name | enable_seq_space_compaction | -| ----------- | ---------------------------------------------------------- | -| Description | sequence space compaction: only compact the sequence files | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- enable_unseq_space_compaction - -| Name | enable_unseq_space_compaction | -| ----------- | ------------------------------------------------------------ | -| Description | unsequence space compaction: only compact the unsequence files | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- enable_cross_space_compaction - -| Name | enable_cross_space_compaction | -| ----------- | ------------------------------------------------------------ | -| Description | cross space compaction: compact the unsequence files into the overlapped sequence files | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- enable_auto_repair_compaction - -| Name | enable_auto_repair_compaction | -| ----------- | ---------------------------------------------- | -| Description | enable auto repair unsorted file by compaction | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- cross_selector - -| Name | cross_selector | -| ----------- | ------------------------------------------- | -| Description | the selector of cross space compaction task | -| Type | String | -| Default | rewrite | -| Effective | Restart required. 
| - -- cross_performer - -| Name | cross_performer | -| ----------- |-----------------------------------------------------------| -| Description | The compaction performer of the cross space compaction task. Options: read_point, fast | -| Type | String | -| Default | fast | -| Effective | Hot reload | - -- inner_seq_selector - -| Name | inner_seq_selector | -| ----------- |--------------------------------------------------------| -| Description | The selector of the inner sequence space compaction task. Options: size_tiered_single_target, size_tiered_multi_target | -| Type | String | -| Default | size_tiered_multi_target | -| Effective | Hot reload | - -- inner_seq_performer - -| Name | inner_seq_performer | -| ----------- |---------------------------------------------------------| -| Description | The performer of the inner sequence space compaction task. Options: read_chunk, fast | -| Type | String | -| Default | read_chunk | -| Effective | Hot reload | - -- inner_unseq_selector - -| Name | inner_unseq_selector | -| ----------- |----------------------------------------------------------| -| Description | The selector of the inner unsequence space compaction task. Options: size_tiered_single_target, size_tiered_multi_target | -| Type | String | -| Default | size_tiered_multi_target | -| Effective | Hot reload | - -- inner_unseq_performer - -| Name | inner_unseq_performer | -| ----------- |-----------------------------------------------------------| -| Description | The performer of the inner unsequence space compaction task. Options: read_point, fast | -| Type | String | -| Default | fast | -| Effective | Hot reload | - -- compaction_priority - -| Name | compaction_priority | -| ----------- | ------------------------------------------------------------ | -| Description | The priority of compaction execution. INNER_CROSS: prioritize inner space compaction, reduce the number of files first. CROSS_INNER: prioritize cross space compaction, eliminate the unsequence files first. BALANCE: 
alternate two compaction types | -| Type | String | -| Default | INNER_CROSS | -| Effective | Restart required. | - -- candidate_compaction_task_queue_size - -| Name | candidate_compaction_task_queue_size | -| ----------- | -------------------------------------------- | -| Description | The size of candidate compaction task queue. | -| Type | int32 | -| Default | 50 | -| Effective | Restart required. | - -- target_compaction_file_size - -| Name | target_compaction_file_size | -| ----------- | ------------------------------------------------------------ | -| Description | This parameter is used in two places: (1) The target tsfile size of inner space compaction. (2) The candidate size of seq tsfile in cross space compaction will be smaller than target_compaction_file_size * 1.5. In most cases, the target file size of cross compaction won't exceed this threshold, and if it does, it will not be much larger than it. | -| Type | Int64 | -| Default | 2147483648 | -| Effective | Hot reload | - -- inner_compaction_total_file_size_threshold - -| Name | inner_compaction_total_file_size_threshold | -| ----------- | ---------------------------------------------------- | -| Description | The total file size limit in inner space compaction. | -| Type | int64 | -| Default | 10737418240 | -| Effective | Hot reload | - -- inner_compaction_total_file_num_threshold - -| Name | inner_compaction_total_file_num_threshold | -| ----------- | --------------------------------------------------- | -| Description | The total file num limit in inner space compaction. 
| -| Type | int32 | -| Default | 100 | -| Effective | Hot reload | - -- max_level_gap_in_inner_compaction - -| Name | max_level_gap_in_inner_compaction | -| ----------- | ----------------------------------------------- | -| Description | The max level gap in inner compaction selection | -| Type | int32 | -| Default | 2 | -| Effective | Hot reload | - -- target_chunk_size - -| Name | target_chunk_size | -| ----------- | ------------------------------------------------------------ | -| Description | The target chunk size in flushing and compaction. If the size of a timeseries in memtable exceeds this, the data will be flushed to multiple chunks.| -| Type | Int64 | -| Default | 1600000 | -| Effective | Restart required. | - -- target_chunk_point_num - -| Name | target_chunk_point_num | -| ----------- |-----------------------------------------------------------------| -| Description | The target point nums in one chunk in flushing and compaction. If the point number of a timeseries in memtable exceeds this, the data will be flushed to multiple chunks. | -| Type | Int64 | -| Default | 100000 | -| Effective | Restart required. | - -- chunk_size_lower_bound_in_compaction - -| Name | chunk_size_lower_bound_in_compaction | -| ----------- | ------------------------------------------------------------ | -| Description | If the chunk size is lower than this threshold, it will be deserialized into points | -| Type | Int64 | -| Default | 128 | -| Effective | Restart required. | - -- chunk_point_num_lower_bound_in_compaction - -| Name | chunk_point_num_lower_bound_in_compaction | -| ----------- |------------------------------------------------------------------------------------------| -| Description | If the chunk point num is lower than this threshold, it will be deserialized into points | -| Type | Int64 | -| Default | 100 | -| Effective | Restart required. 
| - -- inner_compaction_candidate_file_num - -| Name | inner_compaction_candidate_file_num | -| ----------- | ------------------------------------------------------------ | -| Description | The file num requirement when selecting inner space compaction candidate files | -| Type | int32 | -| Default | 30 | -| Effective | Hot reload | - -- max_cross_compaction_candidate_file_num - -| Name | max_cross_compaction_candidate_file_num | -| ----------- | ------------------------------------------------------------ | -| Description | The max file when selecting cross space compaction candidate files | -| Type | int32 | -| Default | 500 | -| Effective | Hot reload | - -- max_cross_compaction_candidate_file_size - -| Name | max_cross_compaction_candidate_file_size | -| ----------- | ------------------------------------------------------------ | -| Description | The max total size when selecting cross space compaction candidate files | -| Type | Int64 | -| Default | 5368709120 | -| Effective | Hot reload | - -- min_cross_compaction_unseq_file_level - -| Name | min_cross_compaction_unseq_file_level | -| ----------- | ------------------------------------------------------------ | -| Description | The min inner compaction level of unsequence file which can be selected as candidate | -| Type | int32 | -| Default | 1 | -| Effective | Hot reload | - -- compaction_thread_count - -| Name | compaction_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | How many threads will be set up to perform compaction, 10 by default. | -| Type | int32 | -| Default | 10 | -| Effective | Hot reload | - -- compaction_max_aligned_series_num_in_one_batch - -| Name | compaction_max_aligned_series_num_in_one_batch | -| ----------- | ------------------------------------------------------------ | -| Description | How many chunk will be compacted in aligned series compaction, 10 by default. 
| -| Type | int32 | -| Default | 10 | -| Effective | Hot reload | - -- compaction_schedule_interval_in_ms - -| Name | compaction_schedule_interval_in_ms | -| ----------- | ---------------------------------------- | -| Description | The interval of compaction task schedule | -| Type | Int64 | -| Default | 60000 | -| Effective | Restart required. | - -- compaction_write_throughput_mb_per_sec - -| Name | compaction_write_throughput_mb_per_sec | -| ----------- | -------------------------------------------------------- | -| Description | The limit of write throughput merge can reach per second | -| Type | int32 | -| Default | 16 | -| Effective | Restart required. | - -- compaction_read_throughput_mb_per_sec - -| Name | compaction_read_throughput_mb_per_sec | -| ----------- | ------------------------------------------------------- | -| Description | The limit of read throughput merge can reach per second | -| Type | int32 | -| Default | 0 | -| Effective | Hot reload | - -- compaction_read_operation_per_sec - -| Name | compaction_read_operation_per_sec | -| ----------- | ------------------------------------------------------ | -| Description | The limit of read operation merge can reach per second | -| Type | int32 | -| Default | 0 | -| Effective | Hot reload | - -- sub_compaction_thread_count - -| Name | sub_compaction_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | The number of sub compaction threads to be set up to perform compaction. | -| Type | int32 | -| Default | 4 | -| Effective | Hot reload | - -- inner_compaction_task_selection_disk_redundancy - -| Name | inner_compaction_task_selection_disk_redundancy | -| ----------- | ------------------------------------------------------------ | -| Description | Redundancy value of disk availability, only use for inner compaction. 
| -| Type | double | -| Default | 0.05 | -| Effective | Hot reload | - -- inner_compaction_task_selection_mods_file_threshold - -| Name | inner_compaction_task_selection_mods_file_threshold | -| ----------- | -------------------------------------------------------- | -| Description | Mods file size threshold, only used for inner compaction. | -| Type | long | -| Default | 131072 | -| Effective | Hot reload | - -- compaction_schedule_thread_num - -| Name | compaction_schedule_thread_num | -| ----------- | ------------------------------------------------------------ | -| Description | The number of threads to be set up to select compaction task. | -| Type | int32 | -| Default | 4 | -| Effective | Hot reload | - -### 4.21 Write Ahead Log Configuration - -- wal_mode - -| Name | wal_mode | -| ----------- | ------------------------------------------------------------ | -| Description | The details of these three modes are as follows: DISABLE: the system will disable wal. SYNC: the system will submit wal synchronously, and a write request will not return until its wal is fsynced to the disk successfully. ASYNC: the system will submit wal asynchronously, and a write request will return immediately whether or not its wal has been fsynced to the disk successfully. | -| Type | String | -| Default | ASYNC | -| Effective | Restart required. | - -- max_wal_nodes_num - -| Name | max_wal_nodes_num | -| ----------- | ------------------------------------------------------------ | -| Description | Each node corresponds to one wal directory. The default value 0 means the number is determined by the system, in the range of [data region num / 2, data region num]. | -| Type | int32 | -| Default | 0 | -| Effective | Restart required. 
| - -- wal_async_mode_fsync_delay_in_ms - -| Name | wal_async_mode_fsync_delay_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | Duration a wal flush operation will wait before calling fsync in the async mode | -| Type | int32 | -| Default | 1000 | -| Effective | Hot reload | - -- wal_sync_mode_fsync_delay_in_ms - -| Name | wal_sync_mode_fsync_delay_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | Duration a wal flush operation will wait before calling fsync in the sync mode | -| Type | int32 | -| Default | 3 | -| Effective | Hot reload | - -- wal_buffer_size_in_byte - -| Name | wal_buffer_size_in_byte | -| ----------- | ---------------------------- | -| Description | Buffer size of each wal node | -| Type | int32 | -| Default | 33554432 | -| Effective | Restart required. | - -- wal_buffer_queue_capacity - -| Name | wal_buffer_queue_capacity | -| ----------- | --------------------------------- | -| Description | Buffer capacity of each wal queue | -| Type | int32 | -| Default | 500 | -| Effective | Restart required. 
| - -- wal_file_size_threshold_in_byte - -| Name | wal_file_size_threshold_in_byte | -| ----------- | ------------------------------- | -| Description | Size threshold of each wal file | -| Type | int32 | -| Default | 31457280 | -| Effective | Hot reload | - -- wal_min_effective_info_ratio - -| Name | wal_min_effective_info_ratio | -| ----------- | --------------------------------------------------- | -| Description | Minimum ratio of effective information in wal files | -| Type | double | -| Default | 0.1 | -| Effective | Hot reload | - -- wal_memtable_snapshot_threshold_in_byte - -| Name | wal_memtable_snapshot_threshold_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | MemTable size threshold for triggering MemTable snapshot in wal | -| Type | int64 | -| Default | 8388608 | -| Effective | Hot reload | - -- max_wal_memtable_snapshot_num - -| Name | max_wal_memtable_snapshot_num | -| ----------- | ------------------------------------- | -| Description | MemTable's max snapshot number in wal | -| Type | int32 | -| Default | 1 | -| Effective | Hot reload | - -- delete_wal_files_period_in_ms - -| Name | delete_wal_files_period_in_ms | -| ----------- | ----------------------------------------------------------- | -| Description | The period when outdated wal files are periodically deleted | -| Type | int64 | -| Default | 20000 | -| Effective | Hot reload | - -- wal_throttle_threshold_in_byte - -| Name | wal_throttle_threshold_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | The minimum size of wal files when throttle down in IoTConsensus | -| Type | long | -| Default | 53687091200 | -| Effective | Hot reload | - -- iot_consensus_cache_window_time_in_ms - -| Name | iot_consensus_cache_window_time_in_ms | -| ----------- | ------------------------------------------------ | -| Description | Maximum wait time of write cache in IoTConsensus | -| Type | long | 
-| Default | -1 | -| Effective | Hot reload | - -- enable_wal_compression - -| Name | enable_wal_compression | -| ----------- | ------------------------------------- | -| Description | Enable Write Ahead Log compression. | -| Type | boolean | -| Default | true | -| Effective | Hot reload | - -### 4.22 IoTConsensus Configuration - -- data_region_iot_max_log_entries_num_per_batch - -| Name | data_region_iot_max_log_entries_num_per_batch | -| ----------- | ------------------------------------------------- | -| Description | The maximum number of log entries in an IoTConsensus batch | -| Type | int32 | -| Default | 1024 | -| Effective | Restart required. | - -- data_region_iot_max_size_per_batch - -| Name | data_region_iot_max_size_per_batch | -| ----------- | -------------------------------------- | -| Description | The maximum size of an IoTConsensus batch | -| Type | int32 | -| Default | 16777216 | -| Effective | Restart required. | - -- data_region_iot_max_pending_batches_num - -| Name | data_region_iot_max_pending_batches_num | -| ----------- | ----------------------------------------------- | -| Description | The maximum number of pending batches in IoTConsensus | -| Type | int32 | -| Default | 5 | -| Effective | Restart required. | - -- data_region_iot_max_memory_ratio_for_queue - -| Name | data_region_iot_max_memory_ratio_for_queue | -| ----------- | -------------------------------------------------- | -| Description | The maximum memory ratio for the queue in IoTConsensus | -| Type | double | -| Default | 0.6 | -| Effective | Restart required. | - -- region_migration_speed_limit_bytes_per_second - -| Name | region_migration_speed_limit_bytes_per_second | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum transfer rate, in bytes per second, for region migration | -| Type | long | -| Default | 33554432 | -| Effective | Restart required. 
| - -### 4.23 TsFile Configurations - -- group_size_in_byte - -| Name | group_size_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of bytes written to disk each time the data in memory is written to disk | -| Type | int32 | -| Default | 134217728 | -| Effective | Hot reload | - -- page_size_in_byte - -| Name | page_size_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | The memory size for each series writer to pack page, default value is 64KB | -| Type | int32 | -| Default | 65536 | -| Effective | Hot reload | - -- max_number_of_points_in_page - -| Name | max_number_of_points_in_page | -| ----------- | ------------------------------------------- | -| Description | The maximum number of data points in a page | -| Type | int32 | -| Default | 10000 | -| Effective | Hot reload | - -- pattern_matching_threshold - -| Name | pattern_matching_threshold | -| ----------- | ------------------------------------------- | -| Description | The threshold for pattern matching in regex | -| Type | int32 | -| Default | 1000000 | -| Effective | Hot reload | - -- float_precision - -| Name | float_precision | -| ----------- | ------------------------------------------------------------ | -| Description | Floating-point precision of query results.Only effective for RLE and TS_2DIFF encodings.Due to the limitation of machine precision, some values may not be interpreted strictly. | -| Type | int32 | -| Default | 2 | -| Effective | Hot reload | - -- value_encoder - -| Name | value_encoder | -| ----------- | ------------------------------------------------------------ | -| Description | Encoder of value series. default value is PLAIN. | -| Type | For int, long data type, also supports TS_2DIFF and RLE(run-length encoding), GORILLA and ZIGZAG. 
| -| Default | PLAIN | -| Effective | Hot reload | - -- compressor - -| Name | compressor | -| ----------- | ------------------------------------------------------------ | -| Description | Compression configuration. It is also used as the default compressor of the time column in aligned timeseries. | -| Type | Data compression method; supports UNCOMPRESSED, SNAPPY, ZSTD, LZMA2, or LZ4. The default value is LZ4. | -| Default | LZ4 | -| Effective | Hot reload | - -- encrypt_flag - -| Name | encrypt_flag | -| ----------- | ---------------------- | -| Description | Enable data encryption | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- encrypt_type - -| Name | encrypt_type | -| ----------- | ------------------------------------- | -| Description | The method of data encryption | -| Type | String | -| Default | org.apache.tsfile.encrypt.UNENCRYPTED | -| Effective | Restart required. | - -- encrypt_key_path - -| Name | encrypt_key_path | -| ----------- | ----------------------------------- | -| Description | The path of the key for data encryption | -| Type | String | -| Default | None | -| Effective | Restart required. | - -### 4.24 Authorization Configuration - -- authorizer_provider_class - -| Name | authorizer_provider_class | -| ----------- | ------------------------------------------------------------ | -| Description | The class that provides the authorization service. | -| Type | String | -| Default | org.apache.iotdb.commons.auth.authorizer.LocalFileAuthorizer | -| Effective | Restart required. | - -- iotdb_server_encrypt_decrypt_provider - -| Name | iotdb_server_encrypt_decrypt_provider | -| ----------- | ------------------------------------------------------------ | -| Description | The encryption provider class | -| Type | String | -| Default | org.apache.iotdb.commons.security.encrypt.MessageDigestEncrypt | -| Effective | Modify before the first startup. 
| - -- iotdb_server_encrypt_decrypt_provider_parameter - -| Name | iotdb_server_encrypt_decrypt_provider_parameter | -| ----------- | ----------------------------------------------- | -| Description | encryption provided class parameter | -| Type | String | -| Default | None | -| Effective | Modify before the first startup. | - -- author_cache_size - -| Name | author_cache_size | -| ----------- | --------------------------- | -| Description | Cache size of user and role | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- author_cache_expire_time - -| Name | author_cache_expire_time | -| ----------- | ---------------------------------- | -| Description | Cache expire time of user and role | -| Type | int32 | -| Default | 30 | -| Effective | Restart required. | - -### 4.25 UDF Configuration - -- udf_initial_byte_array_length_for_memory_control - -| Name | udf_initial_byte_array_length_for_memory_control | -| ----------- | ------------------------------------------------------------ | -| Description | Used to estimate the memory usage of text fields in a UDF query.It is recommended to set this value to be slightly larger than the average length of all text records. | -| Type | int32 | -| Default | 48 | -| Effective | Restart required. | - -- udf_memory_budget_in_mb - -| Name | udf_memory_budget_in_mb | -| ----------- | ------------------------------------------------------------ | -| Description | How much memory may be used in ONE UDF query (in MB). The upper limit is 20% of allocated memory for read. | -| Type | Float | -| Default | 30.0 | -| Effective | Restart required. | - -- udf_reader_transformer_collector_memory_proportion - -| Name | udf_reader_transformer_collector_memory_proportion | -| ----------- | ------------------------------------------------------------ | -| Description | UDF memory allocation ratio.The parameter form is a:b:c, where a, b, and c are integers. 
| -| Type | String | -| Default | 1:1:1 | -| Effective | Restart required. | - -- udf_lib_dir - -| Name | udf_lib_dir | -| ----------- | ---------------------------- | -| Description | The UDF library directory | -| Type | String | -| Default | ext/udf (Windows: ext\\udf) | -| Effective | Restart required. | - -### 4.26 Trigger Configuration - -- trigger_lib_dir - -| Name | trigger_lib_dir | -| ----------- | ------------------------- | -| Description | The trigger library directory | -| Type | String | -| Default | ext/trigger | -| Effective | Restart required. | - -- stateful_trigger_retry_num_when_not_found - -| Name | stateful_trigger_retry_num_when_not_found | -| ----------- | ------------------------------------------------------------ | -| Description | The number of retries when an instance of a stateful trigger cannot be found on DataNodes | -| Type | int32 | -| Default | 3 | -| Effective | Restart required. | - -### 4.27 Select-Into Configuration - -- into_operation_buffer_size_in_byte - -| Name | into_operation_buffer_size_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum memory occupied by the data to be written when executing select-into statements. | -| Type | long | -| Default | 104857600 | -| Effective | Hot reload | - -- select_into_insert_tablet_plan_row_limit - -| Name | select_into_insert_tablet_plan_row_limit | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of rows that can be processed in an insert-tablet plan when executing select-into statements. 
| -| Type | int32 | -| Default | 10000 | -| Effective | Hot reload | - -- into_operation_execution_thread_count - -| Name | into_operation_execution_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | The number of threads in the thread pool that execute insert-tablet tasks | -| Type | int32 | -| Default | 2 | -| Effective | Restart required. | - -### 4.28 Continuous Query Configuration - -- continuous_query_submit_thread_count - -| Name | continuous_query_submit_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | The number of threads in the scheduled thread pool that submit continuous query tasks periodically | -| Type | int32 | -| Default | 2 | -| Effective | Restart required. | - -- continuous_query_min_every_interval_in_ms - -| Name | continuous_query_min_every_interval_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | The minimum value of the continuous query execution time interval | -| Type | long (duration) | -| Default | 1000 | -| Effective | Restart required. | - -### 4.29 Pipe Configuration - -- pipe_lib_dir - -| Name | pipe_lib_dir | -| ----------- | ----------------------- | -| Description | The pipe library directory. | -| Type | String | -| Default | ext/pipe | -| Effective | Modification not supported | - -- pipe_subtask_executor_max_thread_num - -| Name | pipe_subtask_executor_max_thread_num | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). | -| Type | int | -| Default | 5 | -| Effective | Restart required. 
| - -- pipe_sink_timeout_ms - -| Name | pipe_sink_timeout_ms | -| ----------- | ------------------------------------------------------------ | -| Description | The connection timeout (in milliseconds) for the thrift client. | -| Type | int | -| Default | 900000 | -| Effective | Restart required. | - -- pipe_sink_selector_number - -| Name | pipe_sink_selector_number | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of selectors that can be used in the sink.Recommend to set this value to less than or equal to pipe_sink_max_client_number. | -| Type | int | -| Default | 4 | -| Effective | Restart required. | - -- pipe_sink_max_client_number - -| Name | pipe_sink_max_client_number | -| ----------- | ----------------------------------------------------------- | -| Description | The maximum number of clients that can be used in the sink. | -| Type | int | -| Default | 16 | -| Effective | Restart required. | - -- pipe_air_gap_receiver_enabled - -| Name | pipe_air_gap_receiver_enabled | -| ----------- | ------------------------------------------------------------ | -| Description | Whether to enable receiving pipe data through air gap.The receiver can only return 0 or 1 in TCP mode to indicate whether the data is received successfully. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- pipe_air_gap_receiver_port - -| Name | pipe_air_gap_receiver_port | -| ----------- | ------------------------------------------------------------ | -| Description | The port for the server to receive pipe data through air gap. | -| Type | int | -| Default | 9780 | -| Effective | Restart required. 
| - -- pipe_all_sinks_rate_limit_bytes_per_second - -| Name | pipe_all_sinks_rate_limit_bytes_per_second | -| ----------- | ------------------------------------------------------------ | -| Description | The total bytes that all pipe sinks can transfer per second.When given a value less than or equal to 0, it means no limit. default value is -1, which means no limit. | -| Type | double | -| Default | -1 | -| Effective | Hot reload | - -### 4.30 RatisConsensus Configuration - -- config_node_ratis_log_appender_buffer_size_max - -| Name | config_node_ratis_log_appender_buffer_size_max | -| ----------- | ------------------------------------------------------------ | -| Description | max payload size for a single log-sync-RPC from leader to follower of ConfigNode (in byte, by default 16MB) | -| Type | int32 | -| Default | 16777216 | -| Effective | Restart required. | - -- schema_region_ratis_log_appender_buffer_size_max - -| Name | schema_region_ratis_log_appender_buffer_size_max | -| ----------- | ------------------------------------------------------------ | -| Description | max payload size for a single log-sync-RPC from leader to follower of SchemaRegion (in byte, by default 16MB) | -| Type | int32 | -| Default | 16777216 | -| Effective | Restart required. | - -- data_region_ratis_log_appender_buffer_size_max - -| Name | data_region_ratis_log_appender_buffer_size_max | -| ----------- | ------------------------------------------------------------ | -| Description | max payload size for a single log-sync-RPC from leader to follower of DataRegion (in byte, by default 16MB) | -| Type | int32 | -| Default | 16777216 | -| Effective | Restart required. 
| - -- config_node_ratis_snapshot_trigger_threshold - -| Name | config_node_ratis_snapshot_trigger_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | max numbers of snapshot_trigger_threshold logs to trigger a snapshot of Confignode | -| Type | int32 | -| Default | 400,000 | -| Effective | Restart required. | - -- schema_region_ratis_snapshot_trigger_threshold - -| Name | schema_region_ratis_snapshot_trigger_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | max numbers of snapshot_trigger_threshold logs to trigger a snapshot of SchemaRegion | -| Type | int32 | -| Default | 400,000 | -| Effective | Restart required. | - -- data_region_ratis_snapshot_trigger_threshold - -| Name | data_region_ratis_snapshot_trigger_threshold | -| ----------- | ------------------------------------------------------------ | -| Description | max numbers of snapshot_trigger_threshold logs to trigger a snapshot of DataRegion | -| Type | int32 | -| Default | 400,000 | -| Effective | Restart required. | - -- config_node_ratis_log_unsafe_flush_enable - -| Name | config_node_ratis_log_unsafe_flush_enable | -| ----------- | ------------------------------------------------------ | -| Description | Is confignode allowed flushing Raft Log asynchronously | -| Type | boolean | -| Default | false | -| Effective | Restart required. | - -- schema_region_ratis_log_unsafe_flush_enable - -| Name | schema_region_ratis_log_unsafe_flush_enable | -| ----------- | -------------------------------------------------------- | -| Description | Is schemaregion allowed flushing Raft Log asynchronously | -| Type | boolean | -| Default | false | -| Effective | Restart required. 
| - -- data_region_ratis_log_unsafe_flush_enable - -| Name | data_region_ratis_log_unsafe_flush_enable | -| ----------- | ------------------------------------------------------ | -| Description | Whether the DataRegion is allowed to flush the Raft log asynchronously | -| Type | boolean | -| Default | false | -| Effective | Restart required. | - -- config_node_ratis_log_segment_size_max_in_byte - -| Name | config_node_ratis_log_segment_size_max_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | Maximum capacity of a Raft log segment file of the ConfigNode (in bytes, 24MB by default) | -| Type | int32 | -| Default | 25165824 | -| Effective | Restart required. | - -- schema_region_ratis_log_segment_size_max_in_byte - -| Name | schema_region_ratis_log_segment_size_max_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | Maximum capacity of a Raft log segment file of the SchemaRegion (in bytes, 24MB by default) | -| Type | int32 | -| Default | 25165824 | -| Effective | Restart required. | - -- data_region_ratis_log_segment_size_max_in_byte - -| Name | data_region_ratis_log_segment_size_max_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | Maximum capacity of a Raft log segment file of the DataRegion (in bytes, 24MB by default) | -| Type | int32 | -| Default | 25165824 | -| Effective | Restart required. | - -- config_node_simple_consensus_log_segment_size_max_in_byte - -| Name | config_node_simple_consensus_log_segment_size_max_in_byte | -| ----------- | ------------------------------------------------------------ | -| Description | Maximum capacity of a simple consensus log segment file of the ConfigNode (in bytes, 24MB by default) | -| Type | int32 | -| Default | 25165824 | -| Effective | Restart required. 
| - -- config_node_ratis_grpc_flow_control_window - -| Name | config_node_ratis_grpc_flow_control_window | -| ----------- | ---------------------------------------------------------- | -| Description | confignode flow control window for ratis grpc log appender | -| Type | int32 | -| Default | 4194304 | -| Effective | Restart required. | - -- schema_region_ratis_grpc_flow_control_window - -| Name | schema_region_ratis_grpc_flow_control_window | -| ----------- | ------------------------------------------------------------ | -| Description | schema region flow control window for ratis grpc log appender | -| Type | int32 | -| Default | 4194304 | -| Effective | Restart required. | - -- data_region_ratis_grpc_flow_control_window - -| Name | data_region_ratis_grpc_flow_control_window | -| ----------- | ----------------------------------------------------------- | -| Description | data region flow control window for ratis grpc log appender | -| Type | int32 | -| Default | 4194304 | -| Effective | Restart required. | - -- config_node_ratis_grpc_leader_outstanding_appends_max - -| Name | config_node_ratis_grpc_leader_outstanding_appends_max | -| ----------- | ----------------------------------------------------- | -| Description | config node grpc line concurrency threshold | -| Type | int32 | -| Default | 128 | -| Effective | Restart required. | - -- schema_region_ratis_grpc_leader_outstanding_appends_max - -| Name | schema_region_ratis_grpc_leader_outstanding_appends_max | -| ----------- | ------------------------------------------------------- | -| Description | schema region grpc line concurrency threshold | -| Type | int32 | -| Default | 128 | -| Effective | Restart required. 
| - -- data_region_ratis_grpc_leader_outstanding_appends_max - -| Name | data_region_ratis_grpc_leader_outstanding_appends_max | -| ----------- | ----------------------------------------------------- | -| Description | data region grpc line concurrency threshold | -| Type | int32 | -| Default | 128 | -| Effective | Restart required. | - -- config_node_ratis_log_force_sync_num - -| Name | config_node_ratis_log_force_sync_num | -| ----------- | ------------------------------------ | -| Description | config node fsync threshold | -| Type | int32 | -| Default | 128 | -| Effective | Restart required. | - -- schema_region_ratis_log_force_sync_num - -| Name | schema_region_ratis_log_force_sync_num | -| ----------- | -------------------------------------- | -| Description | schema region fsync threshold | -| Type | int32 | -| Default | 128 | -| Effective | Restart required. | - -- data_region_ratis_log_force_sync_num - -| Name | data_region_ratis_log_force_sync_num | -| ----------- | ------------------------------------ | -| Description | data region fsync threshold | -| Type | int32 | -| Default | 128 | -| Effective | Restart required. | - -- config_node_ratis_rpc_leader_election_timeout_min_ms - -| Name | config_node_ratis_rpc_leader_election_timeout_min_ms | -| ----------- | ---------------------------------------------------- | -| Description | confignode leader min election timeout | -| Type | int32 | -| Default | 2000ms | -| Effective | Restart required. | - -- schema_region_ratis_rpc_leader_election_timeout_min_ms - -| Name | schema_region_ratis_rpc_leader_election_timeout_min_ms | -| ----------- | ------------------------------------------------------ | -| Description | schema region leader min election timeout | -| Type | int32 | -| Default | 2000ms | -| Effective | Restart required. 
| - -- data_region_ratis_rpc_leader_election_timeout_min_ms - -| Name | data_region_ratis_rpc_leader_election_timeout_min_ms | -| ----------- | ---------------------------------------------------- | -| Description | data region leader min election timeout | -| Type | int32 | -| Default | 2000ms | -| Effective | Restart required. | - -- config_node_ratis_rpc_leader_election_timeout_max_ms - -| Name | config_node_ratis_rpc_leader_election_timeout_max_ms | -| ----------- | ---------------------------------------------------- | -| Description | confignode leader max election timeout | -| Type | int32 | -| Default | 4000ms | -| Effective | Restart required. | - -- schema_region_ratis_rpc_leader_election_timeout_max_ms - -| Name | schema_region_ratis_rpc_leader_election_timeout_max_ms | -| ----------- | ------------------------------------------------------ | -| Description | schema region leader max election timeout | -| Type | int32 | -| Default | 4000ms | -| Effective | Restart required. | - -- data_region_ratis_rpc_leader_election_timeout_max_ms - -| Name | data_region_ratis_rpc_leader_election_timeout_max_ms | -| ----------- | ---------------------------------------------------- | -| Description | data region leader max election timeout | -| Type | int32 | -| Default | 4000ms | -| Effective | Restart required. | - -- config_node_ratis_request_timeout_ms - -| Name | config_node_ratis_request_timeout_ms | -| ----------- | --------------------------------------- | -| Description | confignode ratis client retry threshold | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required. | - -- schema_region_ratis_request_timeout_ms - -| Name | schema_region_ratis_request_timeout_ms | -| ----------- | ------------------------------------------ | -| Description | schema region ratis client retry threshold | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required. 
| - -- data_region_ratis_request_timeout_ms - -| Name | data_region_ratis_request_timeout_ms | -| ----------- | ---------------------------------------- | -| Description | data region ratis client retry threshold | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required. | - -- config_node_ratis_max_retry_attempts - -| Name | config_node_ratis_max_retry_attempts | -| ----------- | ------------------------------------ | -| Description | confignode ratis client retry times | -| Type | int32 | -| Default | 10 | -| Effective | Restart required. | - -- config_node_ratis_initial_sleep_time_ms - -| Name | config_node_ratis_initial_sleep_time_ms | -| ----------- | ------------------------------------------ | -| Description | confignode ratis client initial sleep time | -| Type | int32 | -| Default | 100ms | -| Effective | Restart required. | - -- config_node_ratis_max_sleep_time_ms - -| Name | config_node_ratis_max_sleep_time_ms | -| ----------- | -------------------------------------------- | -| Description | confignode ratis client max retry sleep time | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required. | - -- schema_region_ratis_max_retry_attempts - -| Name | schema_region_ratis_max_retry_attempts | -| ----------- | ------------------------------------------ | -| Description | schema region ratis client max retry times | -| Type | int32 | -| Default | 10 | -| Effective | Restart required. | - -- schema_region_ratis_initial_sleep_time_ms - -| Name | schema_region_ratis_initial_sleep_time_ms | -| ----------- | ------------------------------------------ | -| Description | schema region ratis client init sleep time | -| Type | int32 | -| Default | 100ms | -| Effective | Restart required. 
| - -- schema_region_ratis_max_sleep_time_ms - -| Name | schema_region_ratis_max_sleep_time_ms | -| ----------- | ----------------------------------------- | -| Description | schema region ratis client max sleep time | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- data_region_ratis_max_retry_attempts - -| Name | data_region_ratis_max_retry_attempts | -| ----------- | --------------------------------------------- | -| Description | data region ratis client max retry times | -| Type | int32 | -| Default | 10 | -| Effective | Restart required. | - -- data_region_ratis_initial_sleep_time_ms - -| Name | data_region_ratis_initial_sleep_time_ms | -| ----------- | ---------------------------------------- | -| Description | data region ratis client init sleep time | -| Type | int32 | -| Default | 100ms | -| Effective | Restart required. | - -- data_region_ratis_max_sleep_time_ms - -| Name | data_region_ratis_max_sleep_time_ms | -| ----------- | --------------------------------------------- | -| Description | data region ratis client max retry sleep time | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- ratis_first_election_timeout_min_ms - -| Name | ratis_first_election_timeout_min_ms | -| ----------- | ----------------------------------- | -| Description | Ratis first election min timeout | -| Type | int64 | -| Default | 50 (ms) | -| Effective | Restart required. | - -- ratis_first_election_timeout_max_ms - -| Name | ratis_first_election_timeout_max_ms | -| ----------- | ----------------------------------- | -| Description | Ratis first election max timeout | -| Type | int64 | -| Default | 150 (ms) | -| Effective | Restart required. 
| - -- config_node_ratis_preserve_logs_num_when_purge - -| Name | config_node_ratis_preserve_logs_num_when_purge | -| ----------- | ------------------------------------------------------------ | -| Description | confignode snapshot preserves certain logs when taking snapshot and purge | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- schema_region_ratis_preserve_logs_num_when_purge - -| Name | schema_region_ratis_preserve_logs_num_when_purge | -| ----------- | ------------------------------------------------------------ | -| Description | schema region snapshot preserves certain logs when taking snapshot and purge | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- data_region_ratis_preserve_logs_num_when_purge - -| Name | data_region_ratis_preserve_logs_num_when_purge | -| ----------- | ------------------------------------------------------------ | -| Description | data region snapshot preserves certain logs when taking snapshot and purge | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required. | - -- config_node_ratis_log_max_size - -| Name | config_node_ratis_log_max_size | -| ----------- | -------------------------------------- | -| Description | config node Raft Log disk size control | -| Type | int64 | -| Default | 2147483648 (2GB) | -| Effective | Restart required. | - -- schema_region_ratis_log_max_size - -| Name | schema_region_ratis_log_max_size | -| ----------- | ---------------------------------------- | -| Description | schema region Raft Log disk size control | -| Type | int64 | -| Default | 2147483648 (2GB) | -| Effective | Restart required. | - -- data_region_ratis_log_max_size - -| Name | data_region_ratis_log_max_size | -| ----------- | -------------------------------------- | -| Description | data region Raft Log disk size control | -| Type | int64 | -| Default | 21474836480 (20GB) | -| Effective | Restart required. 
| - -- config_node_ratis_periodic_snapshot_interval - -| Name | config_node_ratis_periodic_snapshot_interval | -| ----------- | -------------------------------------------- | -| Description | config node Raft periodic snapshot interval | -| Type | int64 | -| Default | 86400 (s) | -| Effective | Restart required. | - -- schema_region_ratis_periodic_snapshot_interval - -| Name | schema_region_ratis_periodic_snapshot_interval | -| ----------- | ------------------------------------------------ | -| Description | schema region Raft periodic snapshot interval | -| Type | int64 | -| Default | 86400 (s) | -| Effective | Restart required. | - -- data_region_ratis_periodic_snapshot_interval - -| Name | data_region_ratis_periodic_snapshot_interval | -| ----------- | ---------------------------------------------- | -| Description | data region Raft periodic snapshot interval | -| Type | int64 | -| Default | 86400 (s) | -| Effective | Restart required. | - -### 4.31 IoTConsensusV2 Configuration - -- iot_consensus_v2_pipeline_size - -| Name | iot_consensus_v2_pipeline_size | -| ----------- | ------------------------------------------------------------ | -| Description | Default event buffer size for connector and receiver in IoTConsensusV2 | -| Type | int | -| Default | 5 | -| Effective | Restart required. | - -- iot_consensus_v2_mode - -| Name | iot_consensus_v2_mode | -| ----------- | ------------------------------ | -| Description | IoTConsensusV2 mode. | -| Type | String | -| Default | batch | -| Effective | Restart required. | - -### 4.32 Procedure Configuration - -- procedure_core_worker_thread_count - -| Name | procedure_core_worker_thread_count | -| ----------- | ------------------------------------- | -| Description | Default number of worker threads | -| Type | int32 | -| Default | 4 | -| Effective | Restart required. 
| - -- procedure_completed_clean_interval - -| Name | procedure_completed_clean_interval | -| ----------- | ------------------------------------------------------------ | -| Description | Default interval at which the completed-procedure cleaner runs, in seconds | -| Type | int32 | -| Default | 30(s) | -| Effective | Restart required. | - -- procedure_completed_evict_ttl - -| Name | procedure_completed_evict_ttl | -| ----------- | ------------------------------------------------------- | -| Description | Default TTL of completed procedures, in seconds | -| Type | int32 | -| Default | 60(s) | -| Effective | Restart required. | - -### 4.33 MQTT Broker Configuration - -- enable_mqtt_service - -| Name | enable_mqtt_service | -| ----------- | ----------------------------------- | -| Description | Whether to enable the MQTT service. | -| Type | Boolean | -| Default | false | -| Effective | Hot reload | - -- mqtt_host - -| Name | mqtt_host | -| ----------- | ------------------------------ | -| Description | The MQTT service binding host. | -| Type | String | -| Default | 127.0.0.1 | -| Effective | Hot reload | - -- mqtt_port - -| Name | mqtt_port | -| ----------- | ------------------------------ | -| Description | The MQTT service binding port. | -| Type | int32 | -| Default | 1883 | -| Effective | Hot reload | - -- mqtt_handler_pool_size - -| Name | mqtt_handler_pool_size | -| ----------- | ---------------------------------------------------- | -| Description | The handler pool size for handling MQTT messages. | -| Type | int32 | -| Default | 1 | -| Effective | Hot reload | - -- mqtt_payload_formatter - -| Name | mqtt_payload_formatter | -| ----------- | ----------------------------------- | -| Description | The MQTT message payload formatter. 
| -| Type | String | -| Default | json | -| Effective | Hot reload | - -- mqtt_max_message_size - -| Name | mqtt_max_message_size | -| ----------- | ---------------------------------- | -| Description | max length of an mqtt message, in bytes | -| Type | int32 | -| Default | 1048576 | -| Effective | Hot reload | - -### 4.34 Audit log Configuration - -- enable_audit_log - -| Name | enable_audit_log | -| ----------- | -------------------------------- | -| Description | whether to enable the audit log. | -| Type | Boolean | -| Default | false | -| Effective | Restart required. | - -- audit_log_storage - -| Name | audit_log_storage | -| ----------- | ----------------------------- | -| Description | Output location of audit logs | -| Type | String | -| Default | IOTDB,LOGGER | -| Effective | Restart required. | - -- audit_log_operation - -| Name | audit_log_operation | -| ----------- | ------------------------------------------------------------ | -| Description | Operation types to audit. `DML`: whether to enable the audit log for DML operations on data; `DDL`: whether to enable the audit log for DDL operations on schema; `QUERY`: whether to enable the audit log for QUERY operations on data and schema. | -| Type | String | -| Default | DML,DDL,QUERY | -| Effective | Restart required. | - -- enable_audit_log_for_native_insert_api - -| Name | enable_audit_log_for_native_insert_api | -| ----------- | ---------------------------------------------- | -| Description | whether the native write API records audit logs | -| Type | Boolean | -| Default | true | -| Effective | Restart required. 
| - -### 4.35 White List Configuration - -- enable_white_list - -| Name | enable_white_list | -| ----------- | ------------------------- | -| Description | whether enable white list | -| Type | Boolean | -| Default | false | -| Effective | Hot reload | - -### 4.36 IoTDB-AI Configuration - -- model_inference_execution_thread_count - -| Name | model_inference_execution_thread_count | -| ----------- | ------------------------------------------------------------ | -| Description | The thread count which can be used for model inference operation. | -| Type | int | -| Default | 5 | -| Effective | Restart required. | - -### 4.37 Load TsFile Configuration - -- load_clean_up_task_execution_delay_time_seconds - -| Name | load_clean_up_task_execution_delay_time_seconds | -| ----------- | ------------------------------------------------------------ | -| Description | Load clean up task is used to clean up the unsuccessful loaded tsfile after a certain period of time. | -| Type | int | -| Default | 1800 | -| Effective | Hot reload | - -- load_write_throughput_bytes_per_second - -| Name | load_write_throughput_bytes_per_second | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum bytes per second of disk write throughput when loading tsfile. | -| Type | int | -| Default | -1 | -| Effective | Hot reload | - -- load_active_listening_enable - -| Name | load_active_listening_enable | -| ----------- | ------------------------------------------------------------ | -| Description | Whether to enable the active listening mode for tsfile loading. | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- load_active_listening_dirs - -| Name | load_active_listening_dirs | -| ----------- | ------------------------------------------------------------ | -| Description | The directory to be actively listened for tsfile loading.Multiple directories should be separated by a ','. 
| -| Type | String | -| Default | ext/load/pending | -| Effective | Hot reload | - -- load_active_listening_fail_dir - -| Name | load_active_listening_fail_dir | -| ----------- | ------------------------------------------------------------ | -| Description | The directory where tsfiles are moved if the active listening mode fails to load them. | -| Type | String | -| Default | ext/load/failed | -| Effective | Hot reload | - -- load_active_listening_max_thread_num - -| Name | load_active_listening_max_thread_num | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum number of threads that can be used to load tsfile actively.The default value, when this parameter is commented out or <= 0, use CPU core number. | -| Type | Long | -| Default | 0 | -| Effective | Restart required. | - -- load_active_listening_check_interval_seconds - -| Name | load_active_listening_check_interval_seconds | -| ----------- | ------------------------------------------------------------ | -| Description | The interval specified in seconds for the active listening mode to check the directory specified in `load_active_listening_dirs`.The active listening mode will check the directory every `load_active_listening_check_interval_seconds seconds`. | -| Type | Long | -| Default | 5 | -| Effective | Restart required. | - -* last_cache_operation_on_load - -|Name| last_cache_operation_on_load | -|:---:|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -|Description| The operation performed to LastCache when a TsFile is successfully loaded. 
`UPDATE`: use the data in the TsFile to update LastCache; `UPDATE_NO_BLOB`: similar to UPDATE, but will invalidate LastCache for blob series; `CLEAN_DEVICE`: invalidate LastCache of devices contained in the TsFile; `CLEAN_ALL`: clean the whole LastCache. | -|Type| String | -|Default| UPDATE_NO_BLOB | -|Effective| Effective after restart | - -* cache_last_values_for_load - -|Name| cache_last_values_for_load | -|:---:|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -|Description| Whether to cache last values before loading a TsFile. Only effective when `last_cache_operation_on_load=UPDATE_NO_BLOB` or `last_cache_operation_on_load=UPDATE`. When set to true, blob series will be ignored even with `last_cache_operation_on_load=UPDATE`. Enabling this will increase the memory footprint during loading TsFiles. | -|Type| Boolean | -|Default| true | -|Effective| Effective after restart | - -* cache_last_values_memory_budget_in_byte - -|Name| cache_last_values_memory_budget_in_byte | -|:---:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -|Description| When `cache_last_values_for_load=true`, the maximum memory that can be used to cache last values. If this value is exceeded, the cached values will be abandoned and last values will be read from the TsFile in a streaming manner. 
| -|Type| int32 | -|Default| 4194304 | -|Effective| Effective after restart | - - -### 4.38 Dispatch Retry Configuration - -- write_request_remote_dispatch_max_retry_duration_in_ms - -| Name | write_request_remote_dispatch_max_retry_duration_in_ms | -| ----------- | ------------------------------------------------------------ | -| Description | The maximum time to keep retrying the remote dispatch of a write request, in milliseconds. | -| Type | Long | -| Default | 60000 | -| Effective | Hot reload | - -- enable_retry_for_unknown_error - -| Name | enable_retry_for_unknown_error | -| ----------- | ------------------------------------ | -| Description | Whether to retry on unknown errors. | -| Type | boolean | -| Default | false | -| Effective | Hot reload | \ No newline at end of file diff --git a/src/UserGuide/latest-Table/Reference/System-Tables_timecho.md b/src/UserGuide/latest-Table/Reference/System-Tables_timecho.md deleted file mode 100644 index 8338d998d..000000000 --- a/src/UserGuide/latest-Table/Reference/System-Tables_timecho.md +++ /dev/null @@ -1,801 +0,0 @@ - - -# System Tables - -IoTDB has a built-in system database called `INFORMATION_SCHEMA`, which contains a series of system tables that store IoTDB runtime information (such as currently executing SQL statements). Currently, the `INFORMATION_SCHEMA` database only supports read operations. - -> 💡 **[V2.0.9.1 Version Update]**
-> 👉 Added one new system table: **[TABLE_DISK_USAGE](#_2-22-table-disk-usage)** (Table-level Storage Space Statistics), enhancing cluster maintenance and performance analysis. - - -## 1. System Database - -* **Name**: `INFORMATION_SCHEMA` -* **Commands**: Read-only, only supports `Show databases (DETAILS)` / `Show Tables (DETAILS)` / `Use`. Other operations will result in an error: `"The database 'information_schema' can only be queried."` -* **Attributes**: `TTL=INF`; other attributes default to `null` -* **SQL Example**: - -```sql -IoTDB> show databases -+------------------+-------+-----------------------+---------------------+---------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval| -+------------------+-------+-----------------------+---------------------+---------------------+ -|information_schema| INF| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+ - -IoTDB> show tables from information_schema -+-----------------------+-------+ -| TableName|TTL(ms)| -+-----------------------+-------+ -| columns| INF| -| config_nodes| INF| -| configurations| INF| -| connections| INF| -| current_queries| INF| -| data_nodes| INF| -| databases| INF| -| functions| INF| -| keywords| INF| -| nodes| INF| -| pipe_plugins| INF| -| pipes| INF| -| queries| INF| -|queries_costs_histogram| INF| -| regions| INF| -| services| INF| -| subscriptions| INF| -| table_disk_usage| INF| -| tables| INF| -| topics| INF| -| views| INF| -+-----------------------+-------+ -``` - -## 2. 
System Tables - -* **Names**: `DATABASES`, `TABLES`, `REGIONS`, `QUERIES`, `COLUMNS`, `PIPES`, `PIPE_PLUGINS`, `SUBSCRIPTIONS`, `TOPICS`, `VIEWS`, `MODELS`, `FUNCTIONS`, `CONFIGURATIONS`, `KEYWORDS`, `NODES`, `CONFIG_NODES`, `DATA_NODES`, `CONNECTIONS`, `CURRENT_QUERIES`, `QUERIES_COSTS_HISTOGRAM`, `SERVICES`, `TABLE_DISK_USAGE` (detailed descriptions in later sections) -* **Operations**: Read-only, only supports `SELECT`, `COUNT/SHOW DEVICES`, `DESC`. Any modifications to table structure or content are not allowed and will result in an error: `"The database 'information_schema' can only be queried."` -* **Column Names**: System table column names are all lowercase by default and separated by underscores (`_`). - -### 2.1 DATABASES - -* Contains information about all databases in the cluster. -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| --------------------------------- | ----------- | ------------- | -------------------------------- | -| `database` | STRING | TAG | Database name | -| `ttl(ms)` | STRING | ATTRIBUTE | Data retention time | -| `schema_replication_factor` | INT32 | ATTRIBUTE | Schema replica count | -| `data_replication_factor` | INT32 | ATTRIBUTE | Data replica count | -| `time_partition_interval` | INT64 | ATTRIBUTE | Time partition interval | -| `schema_region_group_num` | INT32 | ATTRIBUTE | Number of schema region groups | -| `data_region_group_num` | INT32 | ATTRIBUTE | Number of data region groups | - -* The query results only display databases for which you hold any permission, either on the database itself or on any table within it. 
-* Query Example: - -```sql -IoTDB> select * from information_schema.databases -+------------------+-------+-------------------------+-----------------------+-----------------------+-----------------------+---------------------+ -| database|ttl(ms)|schema_replication_factor|data_replication_factor|time_partition_interval|schema_region_group_num|data_region_group_num| -+------------------+-------+-------------------------+-----------------------+-----------------------+-----------------------+---------------------+ -|information_schema| INF| null| null| null| null| null| -| database1| INF| 1| 1| 604800000| 0| 0| -+------------------+-------+-------------------------+-----------------------+-----------------------+-----------------------+---------------------+ -``` - -### 2.2 TABLES - -* Contains information about all tables in the cluster. -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ------------------ | ----------- | ------------- | --------------------- | -| `database` | STRING | TAG | Database name | -| `table_name` | STRING | TAG | Table name | -| `ttl(ms)` | STRING | ATTRIBUTE | Data retention time | -| `status` | STRING | ATTRIBUTE | Status | -| `comment` | STRING | ATTRIBUTE | Description/comment | - -* Note: Possible values for `status`: `USING`, `PRE_CREATE`, `PRE_DELETE`. For details, refer to the [View Tables](../Basic-Concept/Table-Management_timecho.md#12-view-tables) in Table Management documentation -* The query results only display the collection of tables for which you have any permission. 
-* Query Example: - -```sql -IoTDB> select * from information_schema.tables -+------------------+--------------+-----------+------+-------+-----------+ -| database| table_name| ttl(ms)|status|comment| table_type| -+------------------+--------------+-----------+------+-------+-----------+ -|information_schema| databases| INF| USING| null|SYSTEM VIEW| -|information_schema| models| INF| USING| null|SYSTEM VIEW| -|information_schema| subscriptions| INF| USING| null|SYSTEM VIEW| -|information_schema| regions| INF| USING| null|SYSTEM VIEW| -|information_schema| functions| INF| USING| null|SYSTEM VIEW| -|information_schema| keywords| INF| USING| null|SYSTEM VIEW| -|information_schema| columns| INF| USING| null|SYSTEM VIEW| -|information_schema| topics| INF| USING| null|SYSTEM VIEW| -|information_schema|configurations| INF| USING| null|SYSTEM VIEW| -|information_schema| queries| INF| USING| null|SYSTEM VIEW| -|information_schema| tables| INF| USING| null|SYSTEM VIEW| -|information_schema| pipe_plugins| INF| USING| null|SYSTEM VIEW| -|information_schema| nodes| INF| USING| null|SYSTEM VIEW| -|information_schema| data_nodes| INF| USING| null|SYSTEM VIEW| -|information_schema| pipes| INF| USING| null|SYSTEM VIEW| -|information_schema| views| INF| USING| null|SYSTEM VIEW| -|information_schema| config_nodes| INF| USING| null|SYSTEM VIEW| -| database1| table1|31536000000| USING| null| BASE TABLE| -+------------------+--------------+-----------+------+-------+-----------+ -``` - -### 2.3 REGIONS - -* Contains information about all regions in the cluster. 
-* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ------------------------- | ----------- | ------------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `region_id` | INT32 | TAG | Region ID | -| `datanode_id` | INT32 | TAG | DataNode ID | -| `type` | STRING | ATTRIBUTE | Type (`SchemaRegion`/`DataRegion`) | -| `status` | STRING | ATTRIBUTE | Status (`Running`,`Unknown`, etc.) | -| `database` | STRING | ATTRIBUTE | Database name | -| `series_slot_num` | INT32 | ATTRIBUTE | Number of series slots | -| `time_slot_num` | INT64 | ATTRIBUTE | Number of time slots | -| `rpc_address` | STRING | ATTRIBUTE | RPC address | -| `rpc_port` | INT32 | ATTRIBUTE | RPC port | -| `internal_address` | STRING | ATTRIBUTE | Internal communication address | -| `role` | STRING | ATTRIBUTE | Role (`Leader`/`Follower`) | -| `create_time` | TIMESTAMP | ATTRIBUTE | Creation time | -| `tsfile_size_bytes` | INT64 | ATTRIBUTE | - For​**DataRegion with statistics ​**​: Total file size of TsFiles.
- For **DataRegion without statistics** (Unknown): `-1`.
- For​**SchemaRegion**​:`null`. | - -* Only administrators are allowed to perform query operations. -* Query Example: - -```SQL -IoTDB> select * from information_schema.regions -+---------+-----------+------------+-------+---------+---------------+-------------+-----------+--------+----------------+------+-----------------------------+-----------------+ -|region_id|datanode_id| type| status| database|series_slot_num|time_slot_num|rpc_address|rpc_port|internal_address| role| create_time|tsfile_size_bytes| -+---------+-----------+------------+-------+---------+---------------+-------------+-----------+--------+----------------+------+-----------------------------+-----------------+ -| 0| 1|SchemaRegion|Running|database1| 12| 0| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-03-31T11:19:08.485+08:00| null| -| 1| 1| DataRegion|Running|database1| 6| 6| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-03-31T11:19:09.156+08:00| 3985| -| 2| 1| DataRegion|Running|database1| 6| 6| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-03-31T11:19:09.156+08:00| 3841| -+---------+-----------+------------+-------+---------+---------------+-------------+-----------+--------+----------------+------+-----------------------------+-----------------+ -``` - -### 2.4 QUERIES - -* Contains information about all currently executing queries in the cluster. Can also be queried using the `SHOW QUERIES` syntax. 
-* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| -------------------- | ----------- | ------------- | ------------------------------------------------------------ | -| `query_id` | STRING | TAG | Query ID | -| `start_time` | TIMESTAMP | ATTRIBUTE | Query start timestamp (precision matches system precision) | -| `datanode_id` | INT32 | ATTRIBUTE | DataNode ID that initiated the query | -| `elapsed_time` | FLOAT | ATTRIBUTE | Query execution duration (in seconds) | -| `statement` | STRING | ATTRIBUTE | SQL statement of the query | -| `user` | STRING | ATTRIBUTE | User who initiated the query | - -* For regular users, the query results only display the queries executed by themselves; for administrators, all queries are displayed. -* Query Example: - -```SQL -IoTDB> select * from information_schema.queries -+-----------------------+-----------------------------+-----------+------------+----------------------------------------+----+ -| query_id| start_time|datanode_id|elapsed_time| statement|user| -+-----------------------+-----------------------------+-----------+------------+----------------------------------------+----+ -|20250331_023242_00011_1|2025-03-31T10:32:42.360+08:00| 1| 0.025|select * from information_schema.queries|root| -+-----------------------+-----------------------------+-----------+------------+----------------------------------------+----+ -``` - -### 2.5 COLUMNS - -* Contains information about all columns in tables across the cluster -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ------------------- | ----------- | ------------- | -------------------- | -| `database` | STRING | TAG | Database name | -| `table_name` | STRING | TAG | Table name | -| `column_name` | STRING | TAG | Column name | -| `datatype` | STRING | ATTRIBUTE | Column data type | -| `category` | STRING | ATTRIBUTE | Column category | -| `status` | STRING | ATTRIBUTE | Column status | -| 
`comment` | STRING | ATTRIBUTE | Column description | - -Notes: -* Possible values for `status`: `USING`, `PRE_DELETE`. For details, refer to [Viewing Table Columns](../Basic-Concept/Table-Management_timecho.html#13-view-table-columns) in Table Management documentation. -* The query results only display the column information of tables for which you have any permission. - -* Query Example: - -```SQL -IoTDB> select * from information_schema.columns where database = 'database1' -+---------+----------+------------+---------+---------+------+-------+ -| database|table_name| column_name| datatype| category|status|comment| -+---------+----------+------------+---------+---------+------+-------+ -|database1| table1| time|TIMESTAMP| TIME| USING| null| -|database1| table1| region| STRING| TAG| USING| null| -|database1| table1| plant_id| STRING| TAG| USING| null| -|database1| table1| device_id| STRING| TAG| USING| null| -|database1| table1| model_id| STRING|ATTRIBUTE| USING| null| -|database1| table1| maintenance| STRING|ATTRIBUTE| USING| null| -|database1| table1| temperature| FLOAT| FIELD| USING| null| -|database1| table1| humidity| FLOAT| FIELD| USING| null| -|database1| table1| status| BOOLEAN| FIELD| USING| null| -|database1| table1|arrival_time|TIMESTAMP| FIELD| USING| null| -+---------+----------+------------+---------+---------+------+-------+ -``` - -### 2.6 PIPES - -* Contains information about all pipes in the cluster -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ----------------------------------- | ----------- | ------------- | ---------------------------------------------------------- | -| `id` | STRING | TAG | Pipe name | -| `creation_time` | TIMESTAMP | ATTRIBUTE | Creation time | -| `state` | STRING | ATTRIBUTE | Pipe status (`RUNNING`/`STOPPED`) | -| `pipe_source` | STRING | ATTRIBUTE | Source plugin parameters | -| `pipe_processor` | STRING | ATTRIBUTE | Processor plugin parameters | -| `pipe_sink` | STRING 
| ATTRIBUTE | Sink plugin parameters | -| `exception_message` | STRING | ATTRIBUTE | Exception message | -| `remaining_event_count` | INT64 | ATTRIBUTE | Remaining event count (`-1`if Unknown) | -| `estimated_remaining_seconds` | DOUBLE | ATTRIBUTE | Estimated remaining time in seconds (`-1`if Unknown) | - -* Only administrators are allowed to perform operations. -* Query Example: - -```SQL -select * from information_schema.pipes -+----------+-----------------------------+-------+--------------------------------------------------------------------------+--------------+-----------------------------------------------------------------------+-----------------+---------------------+---------------------------+ -| id| creation_time| state| pipe_source|pipe_processor| pipe_sink|exception_message|remaining_event_count|estimated_remaining_seconds| -+----------+-----------------------------+-------+--------------------------------------------------------------------------+--------------+-----------------------------------------------------------------------+-----------------+---------------------+---------------------------+ -|tablepipe1|2025-03-31T12:25:24.040+08:00|RUNNING|{__system.sql-dialect=table, source.password=******, source.username=root}| {}|{format=hybrid, node-urls=192.168.xxx.xxx:6667, sink=iotdb-thrift-sink}| | 0| 0.0| -+----------+-----------------------------+-------+--------------------------------------------------------------------------+--------------+-----------------------------------------------------------------------+-----------------+---------------------+---------------------------+ -``` - -### 2.7 PIPE\_PLUGINS - -* Contains information about all PIPE plugins in the cluster -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ------------------- | ----------- | ------------- | ----------------------------------------------------- | -| `plugin_name` | STRING | TAG | Plugin name | -| `plugin_type` | 
STRING | ATTRIBUTE | Plugin type (`Builtin`/`External`) | -| `class_name` | STRING | ATTRIBUTE | Plugin's main class name | -| `plugin_jar` | STRING | ATTRIBUTE | Plugin's JAR file name (`null`for builtin type) | - -* Query Example: - -```SQL -IoTDB> select * from information_schema.pipe_plugins -+---------------------+-----------+-------------------------------------------------------------------------------------------------+----------+ -| plugin_name|plugin_type| class_name|plugin_jar| -+---------------------+-----------+-------------------------------------------------------------------------------------------------+----------+ -|IOTDB-THRIFT-SSL-SINK| Builtin|org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector| null| -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector| null| -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.donothing.DoNothingConnector| null| -| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.processor.donothing.DoNothingProcessor| null| -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector| null| -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.extractor.iotdb.IoTDBExtractor| null| -+---------------------+-----------+-------------------------------------------------------------------------------------------------+----------+ -``` - -### 2.8 SUBSCRIPTIONS - -* Contains information about all data subscriptions in the cluster -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ---------------------------- | ----------- | ------------- | ------------------------- | -| `topic_name` | STRING | TAG | Subscription topic name | -| `consumer_group_name` | STRING | TAG | Consumer group name | -| `subscribed_consumers` | 
STRING | ATTRIBUTE | Subscribed consumers | - -* Only administrators are allowed to perform operations. -* Query Example: - -```SQL -IoTDB> select * from information_schema.subscriptions where topic_name = 'topic_1' -+----------+-------------------+--------------------------------+ -|topic_name|consumer_group_name| subscribed_consumers| -+----------+-------------------+--------------------------------+ -| topic_1| cg1|[c3, c4, c5, c6, c7, c0, c1, c2]| -+----------+-------------------+--------------------------------+ -``` - -### 2.9 TOPICS - -* Contains information about all data subscription topics in the cluster -* Table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| --------------------- | ----------- | ------------- | -------------------------------- | -| `topic_name` | STRING | TAG | Subscription topic name | -| `topic_configs` | STRING | ATTRIBUTE | Topic configuration parameters | - -* Only administrators are allowed to perform operations. -* Query Example: - -```SQL -IoTDB> select * from information_schema.topics -+----------+----------------------------------------------------------------+ -|topic_name| topic_configs| -+----------+----------------------------------------------------------------+ -| topic|{__system.sql-dialect=table, start-time=2025-01-10T17:05:38.282}| -+----------+----------------------------------------------------------------+ -``` - -### 2.10 VIEWS - -> This system table is available starting from version V2.0.5. - -* Contains information about all table views in the database. 
-* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------------ | ----------- | ----------------- | --------------------------------- | -| database | STRING | TAG | Database name | -| table\_name | STRING | TAG | View name | -| view\_definition | STRING | ATTRIBUTE | SQL statement for view creation | - -* The query results only display the collection of views for which you have any permission. -* Query example: - -```SQL -IoTDB> select * from information_schema.views -+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------+ -| database|table_name| view_definition| -+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------+ -|database1| ln|CREATE VIEW "ln" ("device" STRING TAG,"model" STRING TAG,"status" BOOLEAN FIELD,"hardware" STRING FIELD) WITH (ttl='INF') AS root.ln.**| -+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------+ -``` - - -### 2.11 MODELS - -> This system table is available starting from version V 2.0.5 and has been discontinued since version V 2.0.8. - -* Contains information about all models in the database. 
-* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------- | ----------- | ----------------- | ------------------------------------------------------------------------------------------------ | -| model\_id | STRING | TAG | Model name | -| model\_type | STRING | ATTRIBUTE | Model type (Forecast, Anomaly Detection, Custom) | -| state | STRING | ATTRIBUTE | Model status (Available/Unavailable) | -| configs | STRING | ATTRIBUTE | String format of model hyperparameters, consistent with the output of the `show` command | -| notes | STRING | ATTRIBUTE | Model description\* Built-in model: Built-in model in IoTDB\* User-defined model: Custom model | - -* Query example: - -```SQL --- Find all built-in forecast models -IoTDB> select * from information_schema.models where model_type = 'BUILT_IN_FORECAST' -+---------------------+-----------------+------+-------+-----------------------+ -| model_id| model_type| state|configs| notes| -+---------------------+-----------------+------+-------+-----------------------+ -| _STLForecaster|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -| _NaiveForecaster|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -| _ARIMA|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -|_ExponentialSmoothing|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -| _HoltWinters|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -| _sundial|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -+---------------------+-----------------+------+-------+-----------------------+ -``` - - -### 2.12 FUNCTIONS - -> This system table is available starting from version V2.0.5. - -* Contains information about all functions in the database. 
-* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------------ | ----------- | ----------------- | -------------------------------------------------------------------------- | -| function\_name | STRING | TAG | Function name | -| function\_type | STRING | ATTRIBUTE | Function type (Built-in/User-defined, Scalar/Aggregation/Table Function) | -| class\_name(udf) | STRING | ATTRIBUTE | Class name if it is a UDF, otherwise null (tentative) | -| state | STRING | ATTRIBUTE | Availability status | - -* Query example: - -```SQL -IoTDB> select * from information_schema.functions where function_type='built-in table function' -+--------------+-----------------------+---------------+---------+ -|function_name | function_type|class_name(udf)| state| -+--------------+-----------------------+---------------+---------+ -| CUMULATE|built-in table function| null|AVAILABLE| -| SESSION|built-in table function| null|AVAILABLE| -| HOP|built-in table function| null|AVAILABLE| -| TUMBLE|built-in table function| null|AVAILABLE| -| FORECAST|built-in table function| null|AVAILABLE| -| VARIATION|built-in table function| null|AVAILABLE| -| CAPACITY|built-in table function| null|AVAILABLE| -+--------------+-----------------------+---------------+---------+ -``` - - -### 2.13 CONFIGURATIONS - -> This system table is available starting from version V2.0.5. - -* Contains all configuration properties of the database. -* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------- | ----------- | ----------------- | ------------------------------ | -| variable | STRING | TAG | Configuration property name | -| value | STRING | ATTRIBUTE | Configuration property value | - -* Only administrators are allowed to perform operations on this table. 
-* Query example: - -```SQL -IoTDB> select * from information_schema.configurations -+----------------------------------+-----------------------------------------------------------------+ -| variable| value| -+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - - -### 2.14 KEYWORDS - -> This system table is available starting from version V2.0.5. - -* Contains all keywords in the database. -* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------- | ----------- | ----------------- | ------------------------------------------------- | -| word | STRING | TAG | Keyword | -| reserved | INT32 | ATTRIBUTE | Whether it is a reserved word (1 = Yes, 0 = No) | - -* Query example: - -```SQL -IoTDB> select * from information_schema.keywords limit 10 -+----------+--------+ -| word|reserved| -+----------+--------+ -| ABSENT| 0| -|ACTIVATION| 1| -| ACTIVATE| 1| -| ADD| 0| -| ADMIN| 0| -| AFTER| 0| -| AINODES| 1| -| ALL| 0| -| ALTER| 1| -| ANALYZE| 0| -+----------+--------+ -``` - - -### 2.15 NODES - -> This system table is available starting from version V2.0.5. 
- -* Contains information about all nodes in the database cluster. -* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| -------------------------------------------- | ----------- | ----------------- | ---------------------- | -| node\_id | INT32 | TAG | Node ID | -| node\_type | STRING | ATTRIBUTE | Node type | -| status | STRING | ATTRIBUTE | Node status | -| internal\_address | STRING | ATTRIBUTE | Internal RPC address | -| internal\_port | INT32 | ATTRIBUTE | Internal port | -| version | STRING | ATTRIBUTE | Version number | -| build\_info | STRING | ATTRIBUTE | Commit ID | -| activate\_status (Enterprise Edition only) | STRING | ATTRIBUTE | Activation status | - -* Only administrators are allowed to perform operations on this table. -* Query example: - -```SQL -IoTDB> select * from information_schema.nodes -+-------+----------+-------+----------------+-------------+-------+----------+ -|node_id| node_type| status|internal_address|internal_port|version|build_info| -+-------+----------+-------+----------------+-------------+-------+----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|2.0.5.1| 58d685e| -| 1| DataNode|Running| 127.0.0.1| 10730|2.0.5.1| 58d685e| -+-------+----------+-------+----------------+-------------+-------+----------+ -``` - - -### 2.16 CONFIG\_NODES - -> This system table is available starting from version V2.0.5. - -* Contains information about all ConfigNodes in the cluster. -* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------------------- | ----------- | ----------------- | --------------------------- | -| node\_id | INT32 | TAG | Node ID | -| config\_consensus\_port | INT32 | ATTRIBUTE | ConfigNode consensus port | -| role | STRING | ATTRIBUTE | ConfigNode role | - -* Only administrators are allowed to perform operations on this table. 
-* Query example: - -```SQL -IoTDB> select * from information_schema.config_nodes -+-------+---------------------+------+ -|node_id|config_consensus_port| role| -+-------+---------------------+------+ -| 0| 10720|Leader| -+-------+---------------------+------+ -``` - - -### 2.17 DATA\_NODES - -> This system table is available starting from version V2.0.5. - -* Contains information about all DataNodes in the cluster. -* The table structure is as follows: - -| Column Name | Data Type | Column Category | Description | -| ------------------------- | ----------- | ----------------- | ----------------------------- | -| node\_id | INT32 | TAG | Node ID | -| data\_region\_num | INT32 | ATTRIBUTE | Number of DataRegions | -| schema\_region\_num | INT32 | ATTRIBUTE | Number of SchemaRegions | -| rpc\_address | STRING | ATTRIBUTE | RPC address | -| rpc\_port | INT32 | ATTRIBUTE | RPC port | -| mpp\_port | INT32 | ATTRIBUTE | MPP communication port | -| data\_consensus\_port | INT32 | ATTRIBUTE | DataRegion consensus port | -| schema\_consensus\_port | INT32 | ATTRIBUTE | SchemaRegion consensus port | - -* Only administrators are allowed to perform operations on this table. -* Query example: - -```SQL -IoTDB> select * from information_schema.data_nodes -+-------+---------------+-----------------+-----------+--------+--------+-------------------+---------------------+ -|node_id|data_region_num|schema_region_num|rpc_address|rpc_port|mpp_port|data_consensus_port|schema_consensus_port| -+-------+---------------+-----------------+-----------+--------+--------+-------------------+---------------------+ -| 1| 4| 4| 0.0.0.0| 6667| 10740| 10760| 10750| -+-------+---------------+-----------------+-----------+--------+--------+-------------------+---------------------+ -``` - -### 2.18 CONNECTIONS - -> This system table is available starting from version V 2.0.8 - -* Contains all connections in the cluster. 
-* The table structure is as follows: - -| **Column Name** | **Data Type** | **Column Type** | **Description** | -|-----------------|---------------|-----------------|------------------------| -| datanode_id | STRING | TAG | DataNode ID | -| user_id | STRING | TAG | User ID | -| session_id | STRING | TAG | Session ID | -| user_name | STRING | ATTRIBUTE | Username | -| last_active_time| TIMESTAMP | ATTRIBUTE | Last active time | -| client_ip | STRING | ATTRIBUTE | Client IP address | - -* Query example: - -```SQL -IoTDB> select * from information_schema.connections; -+-----------+-------+----------+---------+-----------------------------+---------+ -|datanode_id|user_id|session_id|user_name| last_active_time|client_ip| -+-----------+-------+----------+---------+-----------------------------+---------+ -| 1| 0| 2| root|2026-01-21T16:28:54.704+08:00|127.0.0.1| -+-----------+-------+----------+---------+-----------------------------+---------+ -``` - -### 2.19 CURRENT_QUERIES - -> This system table is available starting from version V 2.0.8 - -* Contains all queries whose execution end time falls within the range `[now() - query_cost_stat_window, now())`, including currently executing queries. The `query_cost_stat_window` parameter represents the query cost statistics window. Its default value is 0 and can be configured via the `iotdb-system.properties` configuration file. -* The table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -|--------------|-----------|-------------|-----------------------------------------------------------------------------| -| query_id | STRING | TAG | Query statement ID | -| state | STRING | FIELD | Query state: RUNNING indicates executing, FINISHED indicates completed | -| start_time | TIMESTAMP | FIELD | Query start timestamp (precision matches system timestamp precision) | -| end_time | TIMESTAMP | FIELD | Query end timestamp (precision matches system timestamp precision). 
NULL if query is not yet finished | -| datanode_id | INT32 | FIELD | DataNode from which the query was initiated | -| cost_time | FLOAT | FIELD | Query execution time in seconds. If query is not finished, shows elapsed time | -| statement | STRING | FIELD | Query SQL / concatenated query request SQL | -| user | STRING | FIELD | User who initiated the query | -| client_ip | STRING | FIELD | Client IP address that initiated the query | - -* Regular users can only view their own queries; administrators can view all queries. -* Query example: - -```SQL -IoTDB> select * from information_schema.current_queries; -+-----------------------+-------+-----------------------------+--------+-----------+---------+------------------------------------------------+----+---------+ -| query_id| state| start_time|end_time|datanode_id|cost_time| statement|user|client_ip| -+-----------------------+-------+-----------------------------+--------+-----------+---------+------------------------------------------------+----+---------+ -|20260121_085427_00013_1|RUNNING|2026-01-21T16:54:27.019+08:00| null| 1| 0.0|select * from information_schema.current_queries|root|127.0.0.1| -+-----------------------+-------+-----------------------------+--------+-----------+---------+------------------------------------------------+----+---------+ -``` - -### 2.20 QUERIES_COSTS_HISTOGRAM - -> This system table is available starting from version V 2.0.8 - -* Contains a histogram of query execution times within the past `query_cost_stat_window` period (only statistics for completed SQL queries). The `query_cost_stat_window` parameter represents the query cost statistics window. Its default value is 0 and can be configured via the `iotdb-system.properties` configuration file. 
-* The table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -|--------------|-----------|-------------|-----------------------------------------------------------------------------| -| bin | STRING | TAG | Bucket name: 61 buckets total - [0, 1), [1, 2), [2, 3), ..., [59, 60), 60+ | -| nums | INT32 | FIELD | Number of SQL queries in the bucket | -| datanode_id | INT32 | FIELD | DataNode to which this bucket belongs | - -* Only administrators can execute operations on this table. -* Query example: - -```SQL -IoTDB> select * from information_schema.queries_costs_histogram limit 10 -+------+----+-----------+ -| bin|nums|datanode_id| -+------+----+-----------+ -| [0,1)| 0| 1| -| [1,2)| 0| 1| -| [2,3)| 0| 1| -| [3,4)| 0| 1| -| [4,5)| 0| 1| -| [5,6)| 0| 1| -| [6,7)| 0| 1| -| [7,8)| 0| 1| -| [8,9)| 0| 1| -|[9,10)| 0| 1| -+------+----+-----------+ -``` - -### 2.21 SERVICES - -> This system table is available starting from version V 2.0.8.2 - -* Displays services (MQTT service, REST service) on all active DataNodes (with RUNNING or READ-ONLY status). -* Table structure: - -| Column Name | Data Type | Column Type | Description | -|---------------|-----------|-------------|---------------------------------| -| service_name | STRING | TAG | Service Name | -| datanode_id | INT32 | ATTRIBUTE | DataNode ID where service runs | -| state | STRING | ATTRIBUTE | Service status: RUNNING/STOPPED | - - -* Query example: - -```sql -IoTDB> SELECT * FROM information_schema.services -+------------+-----------+---------+ -|service_name|datanode_id|state | -+------------+-----------+---------+ -|MQTT |1 |STOPPED | -|REST |1 |RUNNING | -+------------+-----------+---------+ -``` - -### 2.22 TABLE_DISK_USAGE -> This system table is available since version V2.0.9.1 - -Used to display the disk space usage of specified tables (excluding views), including the size of ChunkGroups and the size of Metadata. 
- -Note: Statistics are based on the actual size of data in TsFiles; therefore, deletions made via mods are not considered. - -The table structure is shown below: - -| Column Name | Data Type | Column Type | Description | -|-----------------|-----------|-------------|----------------------------------| -| database | string | Field | Database name | -| table_name | string | Field | Table name | -| datanode_id | int32 | Field | DataNode node ID | -| region_id | int32 | Field | Region ID | -| time_partition | int64 | Field | Time partition ID | -| size_in_bytes | int64 | Field | Disk space occupied (in bytes) | - -**Query Examples**: - -```SQL --- Query all data; -select * from information_schema.table_disk_usage; -``` - -```Bash -+---------+-------------------+-----------+---------+--------------+-------------+ -| database| table_name|datanode_id|region_id|time_partition|size_in_bytes| -+---------+-------------------+-----------+---------+--------------+-------------+ -|database1| table1| 1| 3| 2864| 867| -|database1| table11| 1| 3| 2864| 0| -|database1| table3| 1| 3| 2864| 0| -|database1| table1| 1| 3| 2865| 1411| -|database1| table11| 1| 3| 2865| 0| -|database1| table3| 1| 3| 2865| 0| -|database1| table1| 1| 3| 2925| 590| -|database1| table11| 1| 3| 2925| 0| -|database1| table3| 1| 3| 2925| 0| -|database1| table1| 1| 4| 2864| 883| -|database1| table11| 1| 4| 2864| 0| -|database1| table3| 1| 4| 2864| 0| -|database1| table1| 1| 4| 2865| 1224| -|database1| table11| 1| 4| 2865| 0| -|database1| table3| 1| 4| 2865| 0| -|database1| table1| 1| 4| 2888| 0| -|database1| table11| 1| 4| 2888| 0| -|database1| table3| 1| 4| 2888| 205| -| etth| tab_cov_forecast| 1| 8| 0| 0| -| etth| tab_real| 1| 8| 0| 963| -| etth|tab_target_forecast| 1| 8| 0| 0| -| etth| tab_cov_forecast| 1| 9| 0| 448| -| etth| tab_real| 1| 9| 0| 0| -| etth|tab_target_forecast| 1| 9| 0| 0| -+---------+-------------------+-----------+---------+--------------+-------------+ -``` - -```SQL --- Specify query 
conditions; -select * from information_schema.table_disk_usage where region_id = 4 and table_name like '%1'; -``` - -```Bash -+---------+----------+-----------+---------+--------------+-------------+ -| database|table_name|datanode_id|region_id|time_partition|size_in_bytes| -+---------+----------+-----------+---------+--------------+-------------+ -|database1| table1| 1| 4| 2864| 883| -|database1| table11| 1| 4| 2864| 0| -|database1| table1| 1| 4| 2865| 1224| -|database1| table11| 1| 4| 2865| 0| -|database1| table1| 1| 4| 2888| 0| -|database1| table11| 1| 4| 2888| 0| -+---------+----------+-----------+---------+--------------+-------------+ -``` - - -## 3. Permission Description - -* GRANT/REVOKE operations are not supported for the `information_schema` database or any of its tables. -* All users can view `information_schema` database details via the `SHOW DATABASES` statement. -* All users can list system tables via `SHOW TABLES FROM information_schema`. -* All users can inspect system table structures using the `DESC` statement. diff --git a/src/UserGuide/latest-Table/SQL-Manual/Basis-Function_timecho.md b/src/UserGuide/latest-Table/SQL-Manual/Basis-Function_timecho.md deleted file mode 100644 index 0010eebda..000000000 --- a/src/UserGuide/latest-Table/SQL-Manual/Basis-Function_timecho.md +++ /dev/null @@ -1,2392 +0,0 @@ - - - -# Basic Functions - -## 1. Comparison Functions and Operators - -### 1.1 Basic Comparison Operators - -Comparison operators are used to compare two values and return the comparison result (`true` or `false`). - -| Operators | Description | -| :-------- | :----------------------- | -| < | Less than | -| > | Greater than | -| <= | Less than or equal to | -| >= | Greater than or equal to | -| = | Equal to | -| <> | Not equal to | -| != | Not equal to | - -#### 1.1.1 Comparison rules: - -1. All types can be compared with themselves. -2. Numeric types (INT32, INT64, FLOAT, DOUBLE, TIMESTAMP) can be compared with each other. -3. 
Character types (STRING, TEXT) can also be compared with each other. -4. Comparisons between types other than those mentioned above will result in an error. - -### 1.2 BETWEEN Operator - -1. The `BETWEEN` operator is used to determine whether a value falls within a specified range. -2. The `NOT BETWEEN` operator is used to determine whether a value does not fall within a specified range. -3. The `BETWEEN` and `NOT BETWEEN` operators can be used to evaluate any sortable type. -4. The value, minimum, and maximum parameters for `BETWEEN` and `NOT BETWEEN` must be of the same type; otherwise an error will occur. - -Syntax: - -```SQL - value BETWEEN min AND max - value NOT BETWEEN min AND max -``` - -Example 1: BETWEEN - -```SQL --- Query records where temperature is between 85.0 and 90.0 -SELECT * FROM table1 WHERE temperature BETWEEN 85.0 AND 90.0; -``` - -Example 2: NOT BETWEEN - -```SQL --- Query records where humidity is not between 35.0 and 40.0 -SELECT * FROM table1 WHERE humidity NOT BETWEEN 35.0 AND 40.0; -``` - -### 1.3 IS NULL Operator - -1. The `IS NULL` and `IS NOT NULL` operators apply to all data types. - -Example 1: Query records where temperature is NULL - -```SQL -SELECT * FROM table1 WHERE temperature IS NULL; -``` - -Example 2: Query records where humidity is not NULL - -```SQL -SELECT * FROM table1 WHERE humidity IS NOT NULL; -``` - -### 1.4 IN Operator - -1. The `IN` operator can be used in the `WHERE` clause to compare a column with a list of values. -2. These values can be provided by a static array or scalar expressions. - -Syntax: - -```SQL -... 
WHERE column [NOT] IN ('value1','value2', expression1) -``` - -Example 1: Static array: Query records where region is 'Beijing' or 'Shanghai' - -```SQL -SELECT * FROM table1 WHERE region IN ('Beijing', 'Shanghai'); --- Equivalent to -SELECT * FROM table1 WHERE region = 'Beijing' OR region = 'Shanghai'; -``` - -Example 2: Scalar expression: Query records where temperature is among specific values - -```SQL -SELECT * FROM table1 WHERE temperature IN (85.0, 90.0); -``` - -Example 3: Query records where region is not 'Beijing' or 'Shanghai' - -```SQL -SELECT * FROM table1 WHERE region NOT IN ('Beijing', 'Shanghai'); -``` - -### 1.5 GREATEST and LEAST - -The `GREATEST` function returns the maximum value from a list of arguments, while the `LEAST` function returns the minimum value. The return type matches the input data type. - -Key Behaviors: -1. NULL Handling: Returns NULL if all arguments are NULL. -2. Parameter Requirements: Requires at least 2 arguments. -3. Type Constraints: All arguments must have the same data type. -4. Supported Types: `BOOLEAN`, `FLOAT`, `DOUBLE`, `INT32`, `INT64`, `STRING`, `TEXT`, `TIMESTAMP`, `DATE` - -**Syntax:** - -```sql - greatest(value1, value2, ..., valueN) - least(value1, value2, ..., valueN) -``` - -**Examples:** - -```sql --- Retrieve the larger of `temperature` and `humidity` for each row in `table2` -SELECT GREATEST(temperature,humidity) FROM table2; - --- Retrieve the smaller of `temperature` and `humidity` for each row in `table2` -SELECT LEAST(temperature,humidity) FROM table2; -``` - -## 2. Aggregate functions - -### 2.1 Overview - -1. Aggregate functions are many-to-one functions. They perform aggregate calculations on a set of values to obtain a single aggregate result. - -2. Except for `COUNT()`, all other aggregate functions ignore null values and return null when there are no input rows or all values are null. For example, `SUM()` returns null instead of zero, and `AVG()` does not include null values in the count. 
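The null-handling rule above can be checked with a query like the following. This is a minimal sketch rather than an official example: it assumes the sample `table1` used elsewhere in this manual, and the `WHERE temperature IS NULL` filter is only an illustrative way to build an input in which every `temperature` value is null.

```SQL
-- count(*) counts every input row, while count(temperature) counts only non-null values.
-- sum and avg skip null values; because every temperature in this input is null,
-- the rule above says they should return null rather than 0.
SELECT count(*), count(temperature), sum(temperature), avg(temperature)
FROM table1
WHERE temperature IS NULL;
```

If the rule holds, `count(temperature)` is 0 while `sum(temperature)` and `avg(temperature)` are both null, which is worth remembering when the results feed further arithmetic.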
- -### 2.2 Supported Aggregate Functions - -| Function Name | Description | Allowed Input Types | Output Type | -|:--------------|:------------|:--------------------|:------------| -| COUNT | Counts the number of data points. | All types | INT64 | -| COUNT_IF | COUNT_IF(exp) counts the number of rows that satisfy a specified boolean expression. | `exp` must be a boolean expression (e.g. `count_if(temperature>20)`) | INT64 | -| APPROX_COUNT_DISTINCT | The APPROX_COUNT_DISTINCT(x[, maxStandardError]) function provides an approximation of COUNT(DISTINCT x), returning the estimated number of distinct input values. | `x`: The target column to be calculated; supports all data types.<br>`maxStandardError` (optional): Specifies the maximum standard error allowed for the function's result. Valid range is [0.0040625, 0.26]. Defaults to 0.023 if not specified. | INT64 | -| APPROX_MOST_FREQUENT | The APPROX_MOST_FREQUENT(x, k, capacity) function is used to approximately calculate the top k most frequent elements in a dataset. It returns a JSON-formatted string where the keys are the element values and the values are their corresponding approximate frequencies. (Available since V2.0.5.1) | `x`: The column to be calculated, supporting all existing data types in IoTDB;<br>`k`: The number of top-k most frequent values to return;<br>`capacity`: The number of buckets used for computation, which relates to memory usage: a larger value reduces error but consumes more memory, while a smaller value increases error but uses less memory. | STRING | -| APPROX_PERCENTILE | The APPROX_PERCENTILE function calculates the value at a specified percentile in a dataset, helping quickly understand data distribution (e.g., median, quartiles). It supports weighted percentile calculation. If the percentile does not point to an exact position, it returns a linear interpolation of adjacent values at that position. Memory usage depends on the number of centroids, and the maximum number of centroids can be limited using the compression parameter. Error can be estimated using empirical formulas. Note: This function is supported since V2.0.9.1. | Unweighted Version: APPROX_PERCENTILE(x, percentage)<br>x: Column to compute. Supports all numeric types: INT32, INT64, FLOAT, DOUBLE, TIMESTAMP.<br>percentage: Target percentile, DOUBLE type.<br>Weighted Version: APPROX_PERCENTILE(x, w, percentage)<br>x: Column to compute. Supports all numeric types: INT32, INT64, FLOAT, DOUBLE, TIMESTAMP.<br>w: Weight column, integer type (must align with the length of x; NULL or 0 means the row is ignored).<br>percentage: Target percentile, DOUBLE type. | Same as the input column x. | -| SUM | Calculates the sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| AVG | Calculates the average. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| MAX | Finds the maximum value. | All types | Same as input type | -| MIN | Finds the minimum value. | All types | Same as input type | -| FIRST | Finds the value with the smallest timestamp that is not NULL. | All types | Same as input type | -| LAST | Finds the value with the largest timestamp that is not NULL. | All types | Same as input type | -| STDDEV | Alias for STDDEV_SAMP, calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| STDDEV_POP | Calculates the population standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| STDDEV_SAMP | Calculates the sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VARIANCE | Alias for VAR_SAMP, calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VAR_POP | Calculates the population variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VAR_SAMP | Calculates the sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| EXTREME | Finds the value with the largest absolute value. If the largest absolute values of positive and negative values are equal, returns the positive value. | INT32 INT64 FLOAT DOUBLE | Same as input type | -| MODE | Finds the mode. Note: 1. There is a risk of memory exception when the number of distinct values in the input sequence is too large; 2. If all elements have the same frequency, i.e., there is no mode, a random element is returned; 3. If there are multiple modes, a random mode is returned; 4. NULL values are also counted in frequency, so even if not all values in the input sequence are NULL, the final result may still be NULL. | All types | Same as input type | -| MAX_BY | MAX_BY(x, y) finds the value of x corresponding to the maximum y in the binary input x and y. MAX_BY(time, x) returns the timestamp when x is at its maximum. | x and y can be of any type | Same as the data type of the first input x | -| MIN_BY | MIN_BY(x, y) finds the value of x corresponding to the minimum y in the binary input x and y. MIN_BY(time, x) returns the timestamp when x is at its minimum. | x and y can be of any type | Same as the data type of the first input x | -| FIRST_BY | FIRST_BY(x, y) finds the value of x in the same row when y is the first non-null value. | x and y can be of any type | Same as the data type of the first input x | -| LAST_BY | LAST_BY(x, y) finds the value of x in the same row when y is the last non-null value. | x and y can be of any type | Same as the data type of the first input x | - - -### 2.3 Examples - -#### 2.3.1 Example Data - -The [Example Data page](../Reference/Sample-Data.md) contains SQL statements for building table structures and inserting data. Download and execute these statements in the IoTDB CLI to import the data into IoTDB. You can use this data to test and execute the SQL statements in the examples and obtain the corresponding results. - -#### 2.3.2 Count - -Counts the number of rows in the entire table and the number of non-null values in the `temperature` column. - -```SQL -IoTDB> select count(*), count(temperature) from table1; -``` - -The execution result is as follows: - -> Note: Only the COUNT function can be used with *, otherwise an error will occur. - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 18| 12| -+-----+-----+ -Total line number = 1 -It costs 0.834s -``` - - -#### 2.3.3 Count_if - -Counts the records in `table2` whose `arrival_time` is not null. - -```sql -select count_if(arrival_time is not null) from table2; -``` - -The execution result is as follows: - -```sql -+-----+ -|_col0| -+-----+ -| 4| -+-----+ -Total line number = 1 -It costs 0.047s -``` - -#### 2.3.4 Approx_count_distinct - -Retrieve the number of distinct values in the `temperature` column from `table1`. 
- -```sql -IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature) as approx FROM table1; -IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature,0.006) as approx FROM table1; -``` - -The execution result is as follows: - -```sql -+------+------+ -|origin|approx| -+------+------+ -| 3| 3| -+------+------+ -Total line number = 1 -It costs 0.022s -``` - -#### 2.3.5 Approx_most_frequent - -Query the top 2 most frequent values in the `temperature` column of `table1`. - -```sql -IoTDB> select approx_most_frequent(temperature,2,100) as topk from table1; -``` - -The execution result is as follows: - -```sql -+-------------------+ -| topk| -+-------------------+ -|{"85.0":6,"90.0":5}| -+-------------------+ -Total line number = 1 -It costs 0.064s -``` - -#### 2.3.6 Approx_percentile - -Calculates the approximate 90th percentile of the `temperature` column and the approximate 50th percentile (median) of the `humidity` column in `table1`. - -```SQL -SELECT APPROX_PERCENTILE(temperature,0.9), APPROX_PERCENTILE(humidity,0.5) FROM table1; -``` - -The execution result is as follows: - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 90.0| 35.2| -+-----+-----+ -Total line number = 1 -It costs 0.206s -``` - -#### 2.3.7 First - -Finds the values with the smallest timestamp that are not NULL in the `temperature` and `humidity` columns. - -```SQL -IoTDB> select first(temperature), first(humidity) from table1; -``` - -The execution result is as follows: - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 90.0| 35.1| -+-----+-----+ -Total line number = 1 -It costs 0.170s -``` - -#### 2.3.8 Last - -Finds the values with the largest timestamp that are not NULL in the `temperature` and `humidity` columns. 
- -```SQL -IoTDB> select last(temperature), last(humidity) from table1; -``` - -The execution result is as follows: - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 90.0| 34.8| -+-----+-----+ -Total line number = 1 -It costs 0.211s -``` - -#### 2.3.9 First_by - -Finds the `time` value of the row with the smallest timestamp that is not NULL in the `temperature` column, and the `humidity` value of the row with the smallest timestamp that is not NULL in the `temperature` column. - -```SQL -IoTDB> select first_by(time, temperature), first_by(humidity, temperature) from table1; -``` - -The execution result is as follows: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-26T13:37:00.000+08:00| 35.1| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.269s -``` - -#### 2.3.10 Last_by - -Queries the `time` value of the row with the largest timestamp that is not NULL in the `temperature` column, and the `humidity` value of the row with the largest timestamp that is not NULL in the `temperature` column. - -```SQL -IoTDB> select last_by(time, temperature), last_by(humidity, temperature) from table1; -``` - -The execution result is as follows: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-30T14:30:00.000+08:00| 34.8| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.070s -``` - -#### 2.3.11 Max_by - -Queries the `time` value of the row where the `temperature` column is at its maximum, and the `humidity` value of the row where the `temperature` column is at its maximum. 
- -```SQL -IoTDB> select max_by(time, temperature), max_by(humidity, temperature) from table1; -``` - -The execution result is as follows: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-30T09:30:00.000+08:00| 35.2| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.172s -``` - -#### 2.3.12 Min_by - -Queries the `time` value of the row where the `temperature` column is at its minimum, and the `humidity` value of the row where the `temperature` column is at its minimum. - -```SQL -select min_by(time, temperature), min_by(humidity, temperature) from table1; -``` - -The execution result is as follows: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-29T10:00:00.000+08:00| null| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.244s -``` - - -## 3. Logical operators - -### 3.1 Overview - -Logical operators are used to combine conditions or negate conditions, returning a Boolean result (`true` or `false`). - -Below are the commonly used logical operators along with their descriptions: - -| Operator | Description | Example | -| :------- | :-------------------------------- | :------ | -| AND | True only if both values are true | a AND b | -| OR | True if either value is true | a OR b | -| NOT | True when the value is false | NOT a | - -### 3.2 Impact of NULL on Logical Operators - -#### 3.2.1 AND Operator - -- If one or both sides of the expression are `NULL`, the result may be `NULL`. -- If one side of the `AND` operator is `FALSE`, the expression result is `FALSE`. - -Examples: - -```SQL -NULL AND true -- null -NULL AND false -- false -NULL AND NULL -- null -``` - -#### 3.2.2 OR Operator - -- If one or both sides of the expression are `NULL`, the result may be `NULL`. -- If one side of the `OR` operator is `TRUE`, the expression result is `TRUE`. 
- -Examples: - -```SQL -NULL OR NULL -- null -NULL OR false -- null -NULL OR true -- true -``` - -##### 3.2.2.1 Truth Table - -The following truth table illustrates how `NULL` is handled in `AND` and `OR` operators: - -| a | b | a AND b | a OR b | -| :---- | :---- | :------ | :----- | -| TRUE | TRUE | TRUE | TRUE | -| TRUE | FALSE | FALSE | TRUE | -| TRUE | NULL | NULL | TRUE | -| FALSE | TRUE | FALSE | TRUE | -| FALSE | FALSE | FALSE | FALSE | -| FALSE | NULL | FALSE | NULL | -| NULL | TRUE | NULL | TRUE | -| NULL | FALSE | FALSE | NULL | -| NULL | NULL | NULL | NULL | - -#### 3.2.3 NOT Operator - -The logical negation of `NULL` remains `NULL`. - -Example: - -```SQL -NOT NULL -- null -``` - -##### 3.2.3.1 Truth Table - -The following truth table illustrates how `NULL` is handled in the `NOT` operator: - -| a | NOT a | -| :---- | :---- | -| TRUE | FALSE | -| FALSE | TRUE | -| NULL | NULL | - -## 4. Date and Time Functions and Operators - -### 4.1 now() -> Timestamp - -Returns the current timestamp. - -### 4.2 date_bin(interval, Timestamp[, Timestamp]) -> Timestamp - -The `date_bin` function rounds a timestamp (`Timestamp`) down to the boundary of a specified time interval (`interval`). - -#### **Syntax:** - -```SQL --- Divides time into intervals starting from timestamp 0 and returns the start of the interval that contains the specified timestamp. -date_bin(interval,source) - --- Divides time into intervals starting from the origin timestamp and returns the start of the interval that contains the specified timestamp. -date_bin(interval,source,origin) - --- Supported time units for interval: --- Years (y), months (mo), weeks (week), days (d), hours (h), minutes (M), seconds (s), milliseconds (ms), microseconds (µs), nanoseconds (ns). --- source: Must be of timestamp type. -``` - -#### **Parameters**: - -| Parameter | Description | -| :-------- | :----------------------------------------------------------- | -| interval | 1. Time interval 2. 
Supported units: `y`, `mo`, `week`, `d`, `h`, `M`, `s`, `ms`, `µs`, `ns`. | -| source | 1. The timestamp column or expression to be calculated. 2. Must be of timestamp type. | -| origin | The reference timestamp. | - -#### 4.2.1 Syntax Rules - -1. If `origin` is not specified, the default reference timestamp is `1970-01-01T00:00:00Z` (Beijing time: `1970-01-01 08:00:00`). -2. `interval` must be a non-negative number with a time unit. If `interval` is `0ms`, the function returns `source` directly without calculation. -3. If `origin` or `source` is negative, it represents a time point before the epoch. `date_bin` will calculate and return the relevant time period. -4. If `source` is `null`, the function returns `null`. -5. Mixing months and non-month time units (e.g., `1 MONTH 1 DAY`) is not supported due to ambiguity. - -> For example, if the starting point is **April 30, 2000**, calculating `1 DAY` first and then `1 MONTH` results in **June 1, 2000**, whereas calculating `1 MONTH` first and then `1 DAY` results in **May 31, 2000**. The resulting dates are different. - -#### 4.2.2 Examples - -##### Example Data - -The [Example Data page](../Reference/Sample-Data.md) contains SQL statements for building table structures and inserting data. Download and execute these statements in the IoTDB CLI to import the data into IoTDB. You can use this data to test and execute the SQL statements in the examples and obtain the corresponding results. 
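The syntax rules for `date_bin` can be sketched for fixed-length intervals as plain integer arithmetic. The Python below is an illustration under stated assumptions, not IoTDB's implementation: timestamps are epoch milliseconds, only fixed-length units are covered (months and years need calendar arithmetic), and the names `date_bin` and `ms` are hypothetical helpers for this sketch.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

HOUR_MS = 3_600_000
BEIJING = timezone(timedelta(hours=8))  # the examples in this guide use +08:00


def ms(*fields: int) -> int:
    """Epoch milliseconds for a +08:00 wall-clock time (helper for this sketch)."""
    return int(datetime(*fields, tzinfo=BEIJING).timestamp() * 1000)


def date_bin(interval_ms: int, source_ms: Optional[int], origin_ms: int = 0) -> Optional[int]:
    """Start of the interval (counted from origin) that contains source."""
    if source_ms is None:      # a NULL source yields NULL
        return None
    if interval_ms < 0:
        raise ValueError("interval must be non-negative")
    if interval_ms == 0:       # a zero interval returns source unchanged
        return source_ms
    # Floor division rounds toward negative infinity, so sources that lie
    # before the origin (or before the epoch) are binned consistently.
    return origin_ms + (source_ms - origin_ms) // interval_ms * interval_ms


# 09:30 falls into the 09:00 hourly bin when counting from the epoch.
print(date_bin(HOUR_MS, ms(2024, 11, 30, 9, 30)) == ms(2024, 11, 30, 9, 0))
```

With an origin of `2024-11-29T18:30:00+08:00`, the same function puts `2024-11-29T10:00:00+08:00` into the bin starting at `09:30`, because the bins are shifted to line up with the origin.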
- -#### Example 1: Without Specifying the Origin Timestamp - -```SQL -SELECT - time, - date_bin(1h,time) as time_bin -FROM - table1; -``` - -Result**:** - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:00:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:00:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T11:00:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:00:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T08:00:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T09:00:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:00:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:00:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.683s -``` - -#### Example 2: Specifying the Origin Timestamp - -```SQL -SELECT - time, - date_bin(1h, time, 2024-11-29T18:30:00.000) as time_bin -FROM - table1; -``` - -Result: - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 
-|2024-11-30T14:30:00.000+08:00|2024-11-30T14:30:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T09:30:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T10:30:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:30:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T07:30:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T08:30:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T09:30:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T10:30:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:30:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:30:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.056s -``` - -#### Example 3: Negative Origin - -```SQL -SELECT - time, - date_bin(1h, time, 1969-12-31 00:00:00.000) as time_bin -FROM - table1; -``` - -Result: - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:00:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:00:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:00:00.000+08:00| 
-|2024-11-27T16:43:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T11:00:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:00:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T08:00:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T09:00:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:00:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:00:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.203s -``` - -#### Example 4: Interval of 0 - -```SQL -SELECT - time, - date_bin(0ms, time) as time_bin -FROM - table1; -``` - -Result**:** - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:30:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:38:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:39:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:40:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:41:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:42:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:43:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:44:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T11:00:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:30:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T08:00:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T09:00:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T10:00:00.000+08:00| 
-|2024-11-28T11:00:00.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:37:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:38:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.107s -``` - -#### Example 5: Source is NULL - -```SQL -SELECT - arrival_time, - date_bin(1h,arrival_time) as time_bin -FROM - table1; -``` - -Result: - -```Plain -+-----------------------------+-----------------------------+ -| arrival_time| time_bin| -+-----------------------------+-----------------------------+ -| null| null| -|2024-11-30T14:30:17.000+08:00|2024-11-30T14:00:00.000+08:00| -|2024-11-29T10:00:13.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:37:01.000+08:00|2024-11-27T16:00:00.000+08:00| -| null| null| -|2024-11-27T16:37:03.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:37:04.000+08:00|2024-11-27T16:00:00.000+08:00| -| null| null| -| null| null| -|2024-11-27T16:37:08.000+08:00|2024-11-27T16:00:00.000+08:00| -| null| null| -|2024-11-29T18:30:15.000+08:00|2024-11-29T18:00:00.000+08:00| -|2024-11-28T08:00:09.000+08:00|2024-11-28T08:00:00.000+08:00| -| null| null| -|2024-11-28T10:00:11.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:12.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:34.000+08:00|2024-11-26T13:00:00.000+08:00| -|2024-11-26T13:38:25.000+08:00|2024-11-26T13:00:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.319s -``` - -### 4.3 Extract Function - -This function is used to extract the value of a specific part of a date. (Supported from version V2.0.6) - -#### 4.3.1 Syntax Definition - -```SQL -EXTRACT (identifier FROM expression) -``` - -* Parameter Description - * **expression**: `TIMESTAMP` type or a time constant - * **identifier**: The valid ranges and corresponding return value types are shown in the table below. 
- - | Identifier | Return Type | Return Range | - |----------------------|---------------|--------------------| - | `YEAR` | `INT64` | `/` | - | `QUARTER` | `INT64` | `1-4` | - | `MONTH` | `INT64` | `1-12` | - | `WEEK` | `INT64` | `1-53` | - | `DAY_OF_MONTH (DAY)` | `INT64` | `1-31` | - | `DAY_OF_WEEK (DOW)` | `INT64` | `1-7` | - | `DAY_OF_YEAR (DOY)` | `INT64` | `1-366` | - | `HOUR` | `INT64` | `0-23` | - | `MINUTE` | `INT64` | `0-59` | - | `SECOND` | `INT64` | `0-59` | - | `MS` | `INT64` | `0-999` | - | `US` | `INT64` | `0-999` | - | `NS` | `INT64` | `0-999` | - - -#### 4.3.2 Usage Example - -Using table1 from the [Sample Data](../Reference/Sample-Data.md) as the source data, query the average temperature for the first 12 hours of each day within a certain period. - -```SQL -IoTDB:database1> select format('%1$tY-%1$tm-%1$td',date_bin(1d,time)) as fmtdate,avg(temperature) as avgtp from table1 where time >= 2024-11-26T00:00:00 and time <= 2024-11-30T23:59:59 and extract(hour from time) <= 12 group by date_bin(1d,time) order by date_bin(1d,time) -+----------+-----+ -| fmtdate|avgtp| -+----------+-----+ -|2024-11-28| 86.0| -|2024-11-29| 85.0| -|2024-11-30| 90.0| -+----------+-----+ -Total line number = 3 -It costs 0.041s -``` - -Introduction to the `Format` function: [Format Function](../SQL-Manual/Basis-Function_timecho.md#_7-2-format-function) - -Introduction to the `Date_bin` function: [Date_bin Function](../SQL-Manual/Basis-Function_timecho.md#_4-2-date-bin-interval-timestamp-timestamp-timestamp) - - -## 5. 
Mathematical Functions and Operators - -### 5.1 Mathematical Operators - -| **Operator** | **Description** | -| :----------- | :---------------------------------------------- | -| + | Addition | -| - | Subtraction | -| * | Multiplication | -| / | Division (integer division performs truncation) | -| % | Modulus (remainder) | -| - | Negation | - -### 5.2 Mathematical functions - -| Function Name | Description | Input | Output | Usage | -|:--------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------|:-------------------| :--------- | -| sin | Sine | double, float, INT64, INT32 | double | sin(x) | -| cos | Cosine | double, float, INT64, INT32 | double | cos(x) | -| tan | Tangent | double, float, INT64, INT32 | double | tan(x) | -| asin | Inverse Sine | double, float, INT64, INT32 | double | asin(x) | -| acos | Inverse Cosine | double, float, INT64, INT32 | double | acos(x) | -| atan | Inverse Tangent | double, float, INT64, INT32 | double | atan(x) | -| sinh | Hyperbolic Sine | double, float, INT64, INT32 | double | sinh(x) | -| cosh | Hyperbolic Cosine | double, float, INT64, INT32 | double | cosh(x) | -| tanh | Hyperbolic Tangent | double, float, INT64, INT32 | double | tanh(x) | -| degrees | Converts angle `x` in radians to degrees | double, float, INT64, INT32 | double | degrees(x) | -| radians | Radian Conversion from Degrees | double, float, INT64, INT32 | double | radians(x) | -| abs | Absolute Value | double, float, INT64, INT32 | Same as input type | abs(x) | -| sign | Returns the sign of `x`: - If `x = 0`, returns `0` - If `x > 0`, returns `1` - If `x < 0`, returns `-1` For `double/float` inputs: - If `x = NaN`, returns `NaN` - If `x = +Infinity`, returns `1.0` - If `x = -Infinity`, returns `-1.0` | double, float, INT64, INT32 | Same as 
input type | sign(x) | -| ceil | Rounds `x` up to the nearest integer | double, float, INT64, INT32 | double | ceil(x) | -| floor | Rounds `x` down to the nearest integer | double, float, INT64, INT32 | double | floor(x) | -| exp | Returns `e^x` (Euler's number raised to the power of `x`) | double, float, INT64, INT32 | double | exp(x) | -| ln | Returns the natural logarithm of `x` | double, float, INT64, INT32 | double | ln(x) | -| log10 | Returns the base 10 logarithm of `x` | double, float, INT64, INT32 | double | log10(x) | -| round | Rounds `x` to the nearest integer | double, float, INT64, INT32 | double | round(x) | -| round | Rounds `x` to `d` decimal places | double, float, INT64, INT32 | double | round(x, d) | -| sqrt | Returns the square root of `x`. | double, float, INT64, INT32 | double | sqrt(x) | -| e | Returns Euler’s number `e`. | | double | e() | -| pi | Pi (π) | | double | pi() | - -## 6. Bitwise Functions - -> Supported from version V2.0.6 - -Example raw data is as follows: - -``` -IoTDB:database1> select * from bit_table -+-----------------------------+---------+------+-----+ -| time|device_id|length|width| -+-----------------------------+---------+------+-----+ -|2025-10-29T15:59:42.957+08:00| d1| 14| 12| -|2025-10-29T15:58:59.399+08:00| d3| 15| 10| -|2025-10-29T15:59:32.769+08:00| d2| 13| 12| -+-----------------------------+---------+------+-----+ - --- Table creation statement -CREATE TABLE bit_table(time TIMESTAMP TIME, device_id STRING TAG, length INT32 FIELD, width INT32 FIELD); - --- Write data -INSERT INTO bit_table values(2025-10-29 15:59:42.957, 'd1', 14, 12),(2025-10-29 15:58:59.399, 'd3', 15, 10),(2025-10-29 15:59:32.769, 'd2', 13, 12); -``` - -### 6.1 bit\_count(num, bits) - -The `bit_count(num, bits)`function is used to count the number of 1s in the binary representation of the integer `num`under the specified bit width `bits`. 
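As a rough cross-check of the description above, `bit_count` can be modelled in Python by masking the number to `bits` bits of two's complement and counting the ones. This is a sketch, not IoTDB's implementation; the range check and the error wording are modelled on the documented behaviour, and raising `ValueError` is an assumption of this illustration.

```python
def bit_count(num: int, bits: int) -> int:
    """Count the 1 bits of num in a bits-wide two's complement representation."""
    if not 2 <= bits <= 64:
        raise ValueError("bits must be between 2 and 64")
    lowest = -(1 << (bits - 1))        # smallest signed value that fits
    highest = (1 << (bits - 1)) - 1    # largest signed value that fits
    if not lowest <= num <= highest:
        raise ValueError(f"{num} cannot be represented with {bits} bits")
    # Masking turns a negative Python int into its bits-wide two's complement pattern.
    return bin(num & ((1 << bits) - 1)).count("1")


print(bit_count(2, 8), bit_count(-5, 8), bit_count(14, 8))
```

For `-5` with 8 bits the masked pattern is `11111011`, which contains seven ones; with only 2 bits the representable range is -2 to 1, so a value such as 13 is rejected.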
- -#### 6.1.1 Syntax Definition - -``` -bit_count(num, bits) -> INT64 -- The return type is Int64 -``` - -* Parameter Description - - * **num**: Any integer value (int32 or int64) - * **bits**: Integer value, with a valid range of 2 to 64 - -Note: An error will be raised if the number of `bits` is insufficient to represent `num` (using **two's complement signed representation**): `Argument exception, the scalar function num must be representable with the bits specified. [num] cannot be represented with [bits] bits.` - -* Usage Methods - - * Two specific numbers: `bit_count(9, 64)` - * Column and a number: `bit_count(column1, 64)` - * Between two columns: `bit_count(column1, column2)` - -#### 6.1.2 Usage Examples - -``` --- Two specific numbers -IoTDB:database1> select distinct bit_count(2,8) from bit_table -+-----+ -|_col0| -+-----+ -| 1| -+-----+ --- Two specific numbers -IoTDB:database1> select distinct bit_count(-5,8) from bit_table -+-----+ -|_col0| -+-----+ -| 7| -+-----+ --- Column and a number -IoTDB:database1> select length,bit_count(length,8) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 3| -| 15| 4| -| 13| 3| -+------+-----+ --- Insufficient bits -IoTDB:database1> select length,bit_count(length,2) from bit_table -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Argument exception, the scalar function num must be representable with the bits specified. 13 cannot be represented with 2 bits. -``` - -### 6.2 bitwise\_and(x, y) - -The `bitwise_and(x, y)` function performs a logical AND operation on each bit of two integers `x` and `y` based on their two's complement representation, and returns the bitwise AND operation result. 
- -#### 6.2.1 Syntax Definition - -``` -bitwise_and(x, y) -> INT64 -- The return type is Int64 -``` - -* Parameter Description - - * **x, y**: Must be integer values of data type Int32 or Int64 -* Usage Methods - - * Two specific numbers: `bitwise_and(19, 25)` - * Column and a number: `bitwise_and(column1, 25)` - * Between two columns: `bitwise_and(column1, column2)` - -#### 6.2.2 Usage Examples - -``` --- Two specific numbers -IoTDB:database1> select distinct bitwise_and(19,25) from bit_table -+-----+ -|_col0| -+-----+ -| 17| -+-----+ --- Column and a number -IoTDB:database1> select length, bitwise_and(length,25) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 8| -| 15| 9| -| 13| 9| -+------+-----+ --- Between two columns -IoTDB:database1> select length, width, bitwise_and(length, width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 12| -| 15| 10| 10| -| 13| 12| 12| -+------+-----+-----+ -``` - -### 6.3 bitwise\_not(x) - -The `bitwise_not(x)` function performs a logical NOT operation on each bit of the integer `x` based on its two's complement representation, and returns the bitwise NOT operation result. 
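Flipping every bit of a two's complement number always gives `-x - 1`, which is why the NOT of a small positive value is a negative one. The Python sketch below illustrates this for a 64-bit width; it is not IoTDB code, and the fixed 64-bit width is an assumption based on the documented `INT64` return type.

```python
MASK_64 = (1 << 64) - 1


def bitwise_not(x: int) -> int:
    """Flip all 64 bits of x and read the result back as a signed integer."""
    flipped = ~x & MASK_64             # 64-bit pattern with every bit inverted
    if flipped >= 1 << 63:             # sign bit set: the value is negative
        return flipped - (1 << 64)
    return flipped


print(bitwise_not(5), bitwise_not(14), bitwise_not(-1))
```

Because the identity holds for every input in range, the result for any sample value can be checked by hand as `-x - 1`.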
- -#### 6.3.1 Syntax Definition - -``` -bitwise_not(x) -> INT64 -- The return type is Int64 -``` - -* Parameter Description - - * **x**: Must be an integer value of data type Int32 or Int64 -* Usage Methods - - * Specific number: `bitwise_not(5)` - * Single column operation: `bitwise_not(column1)` - -#### 6.3.2 Usage Examples - -``` --- Specific number -IoTDB:database1> select distinct bitwise_not(5) from bit_table -+-----+ -|_col0| -+-----+ -| -6| -+-----+ --- Single column -IoTDB:database1> select length, bitwise_not(length) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| -15| -| 15| -16| -| 13| -14| -+------+-----+ -``` - -### 6.4 bitwise\_or(x, y) - -The `bitwise_or(x, y)` function performs a logical OR operation on each bit of two integers `x` and `y` based on their two's complement representation, and returns the bitwise OR operation result. - -#### 6.4.1 Syntax Definition - -``` -bitwise_or(x, y) -> INT64 -- The return type is Int64 -``` - -* Parameter Description - - * **x, y**: Must be integer values of data type Int32 or Int64 -* Usage Methods - - * Two specific numbers: `bitwise_or(19, 25)` - * Column and a number: `bitwise_or(column1, 25)` - * Between two columns: `bitwise_or(column1, column2)` - -#### 6.4.2 Usage Examples - -``` --- Two specific numbers -IoTDB:database1> select distinct bitwise_or(19,25) from bit_table -+-----+ -|_col0| -+-----+ -| 27| -+-----+ --- Column and a number -IoTDB:database1> select length,bitwise_or(length,25) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 31| -| 15| 31| -| 13| 29| -+------+-----+ --- Between two columns -IoTDB:database1> select length, width, bitwise_or(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 14| -| 15| 10| 15| -| 13| 12| 13| -+------+-----+-----+ -``` - -### 6.5 bitwise\_xor(x, y) - -The `bitwise_xor(x, y)` function performs a logical XOR (exclusive OR) operation on each bit of two integers x and y 
based on their two's complement representation, and returns the bitwise XOR operation result. XOR rule: same bits result in 0, different bits result in 1. - -#### 6.5.1 Syntax Definition - -``` -bitwise_xor(x, y) -> INT64 -- The return type is Int64 -``` - -* Parameter Description - - * **x, y**: Must be integer values of data type Int32 or Int64 -* Usage Methods - - * Two specific numbers: `bitwise_xor(19, 25)` - * Column and a number: `bitwise_xor(column1, 25)` - * Between two columns: `bitwise_xor(column1, column2)` - -#### 6.5.2 Usage Examples - -``` --- Two specific numbers -IoTDB:database1> select distinct bitwise_xor(19,25) from bit_table -+-----+ -|_col0| -+-----+ -| 10| -+-----+ --- Column and a number -IoTDB:database1> select length,bitwise_xor(length,25) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 23| -| 15| 22| -| 13| 20| -+------+-----+ --- Between two columns -IoTDB:database1> select length, width, bitwise_xor(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 2| -| 15| 10| 5| -| 13| 12| 1| -+------+-----+-----+ -``` - -### 6.6 bitwise\_left\_shift(value, shift) - -The `bitwise_left_shift(value, shift)` function returns the result of shifting the binary representation of integer `value` left by `shift` bits. The left shift operation moves bits towards the higher-order direction, filling the vacated lower-order bits with 0s, and discarding the higher-order bits that overflow. Equivalent to: `value << shift`. - -#### 6.6.1 Syntax Definition - -``` -bitwise_left_shift(value, shift) -> [same as value] -- The return type is the same as the data type of value -``` - -* Parameter Description - - * **value**: The integer value to shift left. Must be of data type Int32 or Int64. - * **shift**: The number of bits to shift. Must be of data type Int32 or Int64. 
-* Usage Methods - - * Two specific numbers: `bitwise_left_shift(1, 2)` - * Column and a number: `bitwise_left_shift(column1, 2)` - * Between two columns: `bitwise_left_shift(column1, column2)` - -#### 6.6.2 Usage Examples - -``` --- Two specific numbers -IoTDB:database1> select distinct bitwise_left_shift(1,2) from bit_table -+-----+ -|_col0| -+-----+ -| 4| -+-----+ --- Column and a number -IoTDB:database1> select length, bitwise_left_shift(length,2) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 56| -| 15| 60| -| 13| 52| -+------+-----+ --- Between two columns -IoTDB:database1> select length, width, bitwise_left_shift(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 0| -| 15| 10| 0| -| 13| 12| 0| -+------+-----+-----+ -``` - -### 6.7 bitwise\_right\_shift(value, shift) - -The `bitwise_right_shift(value, shift)` function returns the result of logically (unsigned) right shifting the binary representation of integer `value` by `shift` bits. The logical right shift operation moves bits towards the lower-order direction, filling the vacated higher-order bits with 0s, and discarding the lower-order bits that overflow. - -#### 6.7.1 Syntax Definition - -``` -bitwise_right_shift(value, shift) -> [same as value] -- The return type is the same as the data type of value -``` - -* Parameter Description - - * **value**: The integer value to shift right. Must be of data type Int32 or Int64. - * **shift**: The number of bits to shift. Must be of data type Int32 or Int64. 
-* Usage Methods - - * Two specific numbers: `bitwise_right_shift(8, 3)` - * Column and a number: `bitwise_right_shift(column1, 3)` - * Between two columns: `bitwise_right_shift(column1, column2)` - -#### 6.7.2 Usage Examples - -``` --- Two specific numbers -IoTDB:database1> select distinct bitwise_right_shift(8,3) from bit_table -+-----+ -|_col0| -+-----+ -| 1| -+-----+ --- Column and a number -IoTDB:database1> select length, bitwise_right_shift(length,3) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 1| -| 15| 1| -| 13| 1| -+------+-----+ --- Between two columns -IoTDB:database1> select length, width, bitwise_right_shift(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 0| -| 15| 10| 0| -| 13| 12| 0| -+------+-----+-----+ -``` - -### 6.8 bitwise\_right\_shift\_arithmetic(value, shift) - -The `bitwise_right_shift_arithmetic(value, shift)` function returns the result of arithmetically right shifting the binary representation of integer `value` by `shift` bits. The arithmetic right shift operation moves bits towards the lower-order direction, discarding the lower-order bits that overflow, and filling the vacated higher-order bits with the sign bit (0 for positive numbers, 1 for negative numbers) to preserve the sign of the number. - -#### 6.8.1 Syntax Definition - -``` -bitwise_right_shift_arithmetic(value, shift) -> [same as value] -- The return type is the same as the data type of value -``` - -* Parameter Description - - * **value**: The integer value to shift right. Must be of data type Int32 or Int64. - * **shift**: The number of bits to shift. Must be of data type Int32 or Int64. 
-* Usage Methods - - * Two specific numbers: `bitwise_right_shift_arithmetic(12, 2)` - * Column and a number: `bitwise_right_shift_arithmetic(column1, 64)` - * Between two columns: `bitwise_right_shift_arithmetic(column1, column2)` - -#### 6.8.2 Usage Examples - -``` --- Two specific numbers -IoTDB:database1> select distinct bitwise_right_shift_arithmetic(12,2) from bit_table -+-----+ -|_col0| -+-----+ -| 3| -+-----+ --- Column and a number -IoTDB:database1> select length, bitwise_right_shift_arithmetic(length,3) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 1| -| 15| 1| -| 13| 1| -+------+-----+ --- Between two columns -IoTDB:database1> select length, width, bitwise_right_shift_arithmetic(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 0| -| 15| 10| 0| -| 13| 12| 0| -+------+-----+-----+ -``` - -## 7. Binary Functions - -> Supported since V2.0.9.1 - -### 7.1 Base64 Encoding Functions -| Function Name | Description | Input Type | Output Type | -| ----------------------------- | ----------------------------------------------------------------------------- | --------------------- | ------------- | -| `to_base64(input)` | Encode input data to a standard Base64 string for binary data transmission/storage | STRING/TEXT/BLOB | STRING | -| `from_base64(input)` | Decode a standard Base64 string to raw binary data (inverse of to_base64) | STRING/TEXT | BLOB | -| `to_base64url(input)` | Encode input to a URL-safe Base64URL string (uses `-` and `_` in place of `+` and `/`, omits padding) | STRING/TEXT/BLOB | STRING | -| `from_base64url(input)` | Decode a Base64URL string to raw binary data (inverse of to_base64url) | STRING/TEXT | BLOB | -| `to_base32(input)` | Encode input to a Base32 string (case-insensitive, high readability) | STRING/TEXT/BLOB | STRING | -| `from_base32(input)` | Decode a Base32 string to raw binary data (inverse of to_base32) | STRING/TEXT | BLOB | - -**Examples** -1. 
to_base64: Encode string to standard Base64 -```SQL -SELECT DISTINCT to_base64('IoTDB Binary Test') FROM table1; -``` -``` -+----------------------------+ -| _col0| -+----------------------------+ -|SW9URELkuozov5vliLbmtYvor5U=| -+----------------------------+ -``` - -2. from_base64: Decode Base64 to binary -```SQL -SELECT DISTINCT from_base64('SW9URELkuozov5vliLbmtYvor5U=') FROM table1; -``` -``` -+------------------------------------------+ -| _col0| -+------------------------------------------+ -|0x496f544442e4ba8ce8bf9be588b6e6b58be8af95| -+------------------------------------------+ -``` - -3. to_base64url: Encode to URL-safe Base64URL -```SQL -SELECT DISTINCT to_base64url('https://iotdb.apache.org') FROM table1; -``` -``` -+--------------------------------+ -| _col0| -+--------------------------------+ -|aHR0cHM6Ly9pb3RkYi5hcGFjaGUub3Jn| -+--------------------------------+ -``` - -4. from_base64url: Decode Base64URL -```SQL -SELECT DISTINCT from_base64url('aHR0cHM6Ly9pb3RkYi5hcGFjaGUub3Jn') FROM table1; -``` -``` -+--------------------------------------------------+ -| _col0| -+--------------------------------------------------+ -|0x68747470733a2f2f696f7464622e6170616368652e6f7267| -+--------------------------------------------------+ -``` - -5. to_base32: Encode to Base32 -```SQL -SELECT DISTINCT to_base32('123456') FROM table1; -``` -``` -+----------------+ -| _col0| -+----------------+ -|GEZDGNBVGY======| -+----------------+ -``` - -6. 
from_base32: Decode Base32 -```SQL -SELECT DISTINCT from_base32('GEZDGNBVGY======') FROM table1; -``` -``` -+--------------+ -| _col0| -+--------------+ -|0x313233343536| -+--------------+ -``` - -### 7.2 Hex Encoding Functions -| Function Name | Description | Input Type | Output Type | -| ------------------------ | -------------------------------------------------- | --------------------- | ------------- | -| `TO_HEX(input)` | Convert input to hex string (raw byte view) | STRING/TEXT/BLOB | STRING | -| `FROM_HEX(input)` | Decode hex string to raw binary (inverse of TO_HEX) | STRING/TEXT | BLOB | - -**Examples** -1. TO_HEX: Convert string/binary to hex -```SQL -SELECT DISTINCT TO_HEX('test') FROM table1; -``` -``` -+--------+ -| _col0| -+--------+ -|74657374| -+--------+ -``` - -2. FROM_HEX: Decode hex to binary -```SQL -SELECT DISTINCT FROM_HEX('74657374') FROM table1; -``` -``` -+----------+ -| _col0| -+----------+ -|0x74657374| -+----------+ -``` - -### 7.3 Basic Binary Functions -| Function Name | Description | Input Type | Output Type | -| ----------------------------------- | ------------------------------------------------------------------------------------------ | -------------------------- | -------------- | -| `length(input)` | Return data length: chars for TEXT, bytes for BLOB/OBJECT | STRING/TEXT/BLOB/OBJECT | INT32 | -| `REVERSE(input)` | Reverse input: chars for TEXT, bytes for BLOB | STRING/TEXT/BLOB | Same as input | -| `LPAD(input, length, pad_bytes)` | Left-pad/truncate BLOB to target byte length | BLOB, INT32/INT64, BLOB | BLOB | -| `RPAD(input, length, pad_bytes)` | Right-pad/truncate BLOB to target byte length | BLOB, INT32/INT64, BLOB | BLOB | - -**Examples** -1. length: Get data length -```SQL -SELECT DISTINCT length('IoTDB') FROM table1; -``` -``` -+-----+ -|_col0| -+-----+ -| 5| -+-----+ -``` - -2. REVERSE: Reverse data -```SQL -SELECT DISTINCT REVERSE('12345') FROM table1; -``` -``` -+-----+ -|_col0| -+-----+ -|54321| -+-----+ -``` - -3. 
LPAD: Left-pad BLOB -```SQL -SELECT DISTINCT LPAD(FROM_HEX('74657374'),5, FROM_HEX('74657374')) FROM table1; -``` -``` -+------------+ -| _col0| -+------------+ -|0x7474657374| -+------------+ -``` - -4. RPAD: Right-pad BLOB -```SQL -SELECT DISTINCT RPAD(FROM_HEX('74657374'),5, FROM_HEX('74657374')) FROM table1; -``` -``` -+------------+ -| _col0| -+------------+ -|0x7465737474| -+------------+ -``` - -### 7.4 Integer Encoding Functions -| Function Name | Description | Input Type | Output Type | -| ------------------------------------- | ------------------------------------------------------------------------- | ------------ | ------------ | -| `to_big_endian_32(input)` | Convert INT32 to 4-byte big-endian BLOB (network byte order) | INT32 | BLOB | -| `to_big_endian_64(input)` | Convert INT64 to 8-byte big-endian BLOB | INT64 | BLOB | -| `from_big_endian_32(input)` | Decode 4-byte big-endian BLOB to INT32 | BLOB | INT32 | -| `from_big_endian_64(input)` | Decode 8-byte big-endian BLOB to INT64 | BLOB | INT64 | -| `to_little_endian_32(input)` | Convert INT32 to 4-byte little-endian BLOB (x86 architecture) | INT32 | BLOB | -| `to_little_endian_64(input)` | Convert INT64 to 8-byte little-endian BLOB | INT64 | BLOB | -| `from_little_endian_32(input)` | Decode 4-byte little-endian BLOB to INT32 | BLOB | INT32 | -| `from_little_endian_64(input)` | Decode 8-byte little-endian BLOB to INT64 | BLOB | INT64 | - -**Examples** -1. Big-endian encode/decode -```SQL -SELECT DISTINCT TO_HEX(to_big_endian_32(12345)) FROM table1; -``` -``` -+--------+ -| _col0| -+--------+ -|00003039| -+--------+ -``` - -2. 
Little-endian encode/decode -```SQL -SELECT DISTINCT TO_HEX(to_little_endian_32(12345)) FROM table1; -``` -``` -+--------+ -| _col0| -+--------+ -|39300000| -+--------+ -``` - -### 7.5 Floating-Point Encoding Functions -| Function Name | Description | Input Type | Output Type | -| ------------------------------- | ------------------------------------------------------------------------- | ------------ | ------------ | -| `to_ieee754_32(input)` | Convert FLOAT to 4-byte big-endian IEEE754 BLOB | FLOAT | BLOB | -| `to_ieee754_64(input)` | Convert DOUBLE to 8-byte big-endian IEEE754 BLOB | DOUBLE | BLOB | -| `from_ieee754_32(input)` | Decode 4-byte IEEE754 BLOB to FLOAT | BLOB | FLOAT | -| `from_ieee754_64(input)` | Decode 8-byte IEEE754 BLOB to DOUBLE | BLOB | DOUBLE | - -**Examples** -1. FLOAT encode/decode -```SQL -SELECT DISTINCT from_ieee754_32(FROM_HEX('42b40000')) FROM table1; -``` -``` -+-----+ -|_col0| -+-----+ -| 90.0| -+-----+ -``` - -2. DOUBLE encode/decode -```SQL -SELECT DISTINCT from_ieee754_64(FROM_HEX('400921fb54411744')) FROM table1; -``` -``` -+------------+ -| _col0| -+------------+ -|3.1415926535| -+------------+ -``` - -### 7.6 Hash Functions -| Function Name | Description | Input Type | Output Type | -| --------------------------------- | ------------------------------------------------------------------------- | ------------------ | ------------- | -| `sha256(input)` | SHA-256 cryptographic hash (collision-resistant) | STRING/TEXT/BLOB | BLOB(32B) | -| `SHA512(input)` | SHA-512 cryptographic hash (higher security) | STRING/TEXT/BLOB | BLOB(64B) | -| `SHA1(input)` | SHA-1 hash (not secure for cryptography) | STRING/TEXT/BLOB | BLOB(20B) | -| `MD5(input)` | MD5 hash (non-cryptographic checksum) | STRING/TEXT/BLOB | BLOB(16B) | -| `CRC32(input)` | CRC32 checksum (fast error detection) | STRING/TEXT/BLOB | INT64 | -| `spooky_hash_v2_32(input)` | 32-bit SpookyHashV2 (high-performance non-crypto) | STRING/TEXT/BLOB | BLOB(4B) | -| 
`spooky_hash_v2_64(input)` | 64-bit SpookyHashV2 | STRING/TEXT/BLOB | BLOB(8B) | -| `xxhash64(input)` | 64-bit xxHash (ultra-fast) | STRING/TEXT/BLOB | BLOB(8B) | -| `murmur3(input)` | 128-bit MurmurHash3 (uniform distribution) | STRING/TEXT/BLOB | BLOB(16B) | - -**Examples** -```SQL -SELECT DISTINCT TO_HEX(sha256('test')) FROM table1; -``` -``` -+----------------------------------------------------------------+ -| _col0| -+----------------------------------------------------------------+ -|9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08| -+----------------------------------------------------------------+ -``` - -### 7.7 HMAC Functions -| Function Name | Description | Input Type | Output Type | -| ------------------------------- | ------------------------------------------------------------------------------ | ------------------------------------- | ------------- | -| `hmac_md5(data, key)` | HMAC-MD5 message authentication code | data: STRING/TEXT/BLOB key: STRING/TEXT | BLOB(16B) | -| `hmac_sha1(data, key)` | HMAC-SHA1 authentication code | data: STRING/TEXT/BLOB key: STRING/TEXT | BLOB(20B) | -| `hmac_sha256(data, key)` | HMAC-SHA256 (industry-recommended, high security) | data: STRING/TEXT/BLOB key: STRING/TEXT | BLOB(32B) | -| `hmac_sha512(data, key)` | HMAC-SHA512 (maximum security) | data: STRING/TEXT/BLOB key: STRING/TEXT | BLOB(64B) | - -**Examples** -```SQL -SELECT DISTINCT TO_HEX(hmac_sha256('user_data_123', 'iotdb_secret_key')) FROM table1; -``` -``` -+----------------------------------------------------------------+ -| _col0| -+----------------------------------------------------------------+ -|73b6f26bbcb5192dbe2cb83745b0fc48c63418fa674b0bf62fabe7f8747f3afd| -+----------------------------------------------------------------+ -``` - - -## 8. Conditional Expressions - -### 8.1 CASE - -CASE expressions come in two forms: **Simple CASE** and **Searched CASE**. 
- -#### 8.1.1 Simple CASE - -The simple form evaluates each value expression from left to right until it finds a match with the given expression: - -```SQL -CASE expression - WHEN value THEN result - [ WHEN ... ] - [ ELSE result ] -END -``` - -If a matching value is found, the corresponding result is returned. If no match is found, the result from the `ELSE` clause (if provided) is returned; otherwise, `NULL` is returned. - -Example: - -```SQL -SELECT a, - CASE a - WHEN 1 THEN 'one' - WHEN 2 THEN 'two' - ELSE 'many' - END -``` - -#### 8.1.2 Searched CASE - -The searched form evaluates each Boolean condition from left to right until a `TRUE` condition is found, then returns the corresponding result: - -```SQL -CASE - WHEN condition THEN result - [ WHEN ... ] - [ ELSE result ] -END -``` - -If no condition evaluates to `TRUE`, the `ELSE` clause result (if provided) is returned; otherwise, `NULL` is returned. - -Example: - -```SQL -SELECT a, b, - CASE - WHEN a = 1 THEN 'aaa' - WHEN b = 2 THEN 'bbb' - ELSE 'ccc' - END -``` - -### 8.2 COALESCE - -Returns the first non-null value from the given list of parameters. - -```SQL -coalesce(value1, value2[, ...]) -``` - -### 8.3 IF Expression - -The IF expression has two forms: one that specifies only the true value, and another that specifies both the true value and the false value. - -| Form | Description | Output Type Restrictions | -| ---------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ | ------------------------ | -| `IF(condition, true_value)` | If the condition evaluates to true, `true_value` is computed and returned; otherwise, `null` is returned and `true_value` is not evaluated. | — | -| `IF(condition, true_value, false_value)` | If the condition evaluates to true, `true_value` is computed and returned; otherwise, `false_value` is computed and returned. 
| The data types of `true_value` and `false_value` **must be exactly the same**. Implicit type conversion is not supported. | - -> Supported since V2.0.9.1 - -**Examples:** - -1. Equivalent examples of IF and CASE expressions: -```SQL --- IF syntax -SELECT - device_id, - temperature, - IF(temperature > 85, 'High Value', 'Low Value') -FROM table1; - --- Equivalent CASE syntax -SELECT - device_id, - temperature, - CASE - WHEN temperature > 85 THEN 'High Value' - ELSE 'Low Value' - END -FROM table1; -``` - -2. Output type restriction examples: -```SQL --- Succeeds --- temperature (float) and humidity (float) have matching types -SELECT IF(temperature > 85, temperature, humidity) FROM table1; - --- Fails --- temperature (float) and status (boolean) have mismatched types -SELECT IF(temperature > 85, temperature, status) FROM table1; -``` - - -## 9. Conversion Functions - -### 9.1 Conversion Functions - -#### 9.1.1 cast(value AS type) → type - -Explicitly converts a value to the specified type. This can be used to convert strings (`VARCHAR`) to numeric types or numeric values to string types. Starting from V2.0.8, OBJECT type can be explicitly cast to STRING type. - -If the conversion fails, a runtime error is thrown. - -Example: - -```SQL -SELECT * - FROM table1 - WHERE CAST(time AS DATE) - IN (CAST('2024-11-27' AS DATE), CAST('2024-11-28' AS DATE)); -``` - -#### 9.1.2 try_cast(value AS type) → type - -Similar to `CAST()`. If the conversion fails, returns `NULL` instead of throwing an error. - -Example: - -```SQL -SELECT * - FROM table1 - WHERE try_cast(time AS DATE) - IN (try_cast('2024-11-27' AS DATE), try_cast('2024-11-28' AS DATE)); -``` - -### 9.2 Format Function - -This function generates and returns a formatted string based on a specified format string and input arguments. Similar to Java’s `String.format` or C’s `printf`, it allows developers to construct dynamic string templates using placeholder syntax. 
Predefined format specifiers in the template are replaced precisely with corresponding argument values, producing a complete string that adheres to specific formatting requirements. - -#### 9.2.1 Syntax - -```SQL -format(pattern, ...args) -> STRING -``` - -**Parameters** - -* `pattern`: A format string containing static text and one or more format specifiers (e.g., `%s`, `%d`), or any expression returning a `STRING`/`TEXT` type. -* `args`: Input arguments to replace format specifiers. Constraints: - * Number of arguments ≥ 1. - * Multiple arguments must be comma-separated (e.g., `arg1, arg2`). - * Total arguments can exceed the number of specifiers in `pattern` but cannot be fewer, otherwise an exception is triggered. - -**Return Value** - -* Formatted result string of type `STRING`. - -#### 9.2.2 Usage Examples - -1. Format Floating-Point Numbers - ```SQL - IoTDB:database1> SELECT format('%.5f', humidity) FROM table1 WHERE humidity = 35.4; - +--------+ - | _col0| - +--------+ - |35.40000| - +--------+ - ``` -2. Format Integers - ```SQL - IoTDB:database1> SELECT format('%03d', 8) FROM table1 LIMIT 1; - +-----+ - |_col0| - +-----+ - | 008| - +-----+ - ``` -3. 
Format Dates and Timestamps - -* Locale-Specific Date - -```SQL -IoTDB:database1> SELECT format('%1$tA, %1$tB %1$te, %1$tY', 2024-01-01) FROM table1 LIMIT 1; -+-----------------------+ -| _col0| -+-----------------------+ -|Monday, January 1, 2024| -+-----------------------+ -``` - -* Remove Timezone Information - -```SQL -IoTDB:database1> SELECT format('%1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS.%1$tL', 2024-01-01T00:00:00.000+08:00) FROM table1 LIMIT 1; -+-----------------------+ -| _col0| -+-----------------------+ -|2024-01-01 00:00:00.000| -+-----------------------+ -``` - -* Second-Level Timestamp Precision - -```SQL -IoTDB:database1> SELECT format('%1$tF %1$tT', 2024-01-01T00:00:00.000+08:00) FROM table1 LIMIT 1; -+-------------------+ -| _col0| -+-------------------+ -|2024-01-01 00:00:00| -+-------------------+ -``` - -* Date/Time Format Symbols - -| **Symbol** | **Description** | -| ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| 'H' | 24-hour format (two digits, zero-padded), i.e. 00 - 23 | -| 'I' | 12-hour format (two digits, zero-padded), i.e. 01 - 12 | -| 'k' | 24-hour format (no padding), i.e. 0 - 23 | -| 'l' | 12-hour format (no padding), i.e. 1 - 12 | -| 'M' | Minute (two digits, zero-padded), i.e. 00 - 59 | -| 'S' | Second (two digits, zero-padded; supports leap seconds), i.e. 00 - 60 | -| 'L' | Millisecond (three digits, zero-padded), i.e. 000 - 999 | -| 'N' | Nanosecond (nine digits, zero-padded), i.e. 000000000 - 999999999. | -| 'p' | Locale-specific lowercase AM/PM marker (e.g., "am", "pm"). Prefix with `T` to force uppercase (e.g., "AM"). | -| 'z' | RFC 822 timezone offset from GMT (e.g., `-0800`). Adjusts for daylight saving. Uses the JVM's default timezone for `long`/`Long`/`Date`. | -| 'Z' | Timezone abbreviation (e.g., "PST"). Adjusts for daylight saving. 
Uses the JVM's default timezone; Formatter's timezone overrides the argument's timezone if specified. | -| 's' | Seconds since Unix epoch (1970-01-01 00:00:00 UTC), i.e. Long.MIN\_VALUE/1000 to Long.MAX\_VALUE/1000. | -| 'Q' | Milliseconds since Unix epoch, i.e. Long.MIN\_VALUE to Long.MAX\_VALUE. | - -* Common Date/Time Conversion Characters - -| **Symbol** | **Description** | -| ---------------- | -------------------------------------------------------------------- | -| 'B' | Locale-specific full month name, for example "January", "February" | -| 'b' | Locale-specific abbreviated month name, for example "Jan", "Feb" | -| 'h' | Same as `b` | -| 'A' | Locale-specific full weekday name, for example "Sunday", "Monday" | -| 'a' | Locale-specific short weekday name, for example "Sun", "Mon" | -| 'C' | Year divided by 100 (two digits, zero-padded) | -| 'Y' | Year (minimum 4 digits, zero-padded) | -| 'y' | Last two digits of year (zero-padded) | -| 'j' | Day of year (three digits, zero-padded) | -| 'm' | Month (two digits, zero-padded) | -| 'd' | Day of month (two digits, zero-padded) | -| 'e' | Day of month (no padding) | - -4. Format Strings - ```SQL - IoTDB:database1> SELECT format('The measurement status is: %s', status) FROM table2 LIMIT 1; - +-------------------------------+ - | _col0| - +-------------------------------+ - |The measurement status is: true| - +-------------------------------+ - ``` -5. Format Percentage Sign - ```SQL - IoTDB:database1> SELECT format('%s%%', 99.9) FROM table1 LIMIT 1; - +-----+ - |_col0| - +-----+ - |99.9%| - +-----+ - ``` - -#### 9.2.3 Format Conversion Failure Scenarios - -1. Type Mismatch Errors - -* Timestamp Type Conflict - - If the format specifier includes time-related tokens (e.g., `%tY-%tm-%td`) but the argument: - - * Is a non-`DATE`/`TIMESTAMP` type value. - * Requires sub-day precision (e.g., `%tH`, `%tM`) but the argument is not `TIMESTAMP`. 
- -```SQL --- Example 1 -IoTDB:database1> SELECT format('%1$tA, %1$tB %1$te, %1$tY', humidity) from table2 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %1$tA, %1$tB %1$te, %1$tY (IllegalFormatConversion: A != java.lang.Float) - --- Example 2 -IoTDB:database1> SELECT format('%1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS.%1$tL', humidity) from table1 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS.%1$tL (IllegalFormatConversion: Y != java.lang.Float) -``` - -* Floating-Point Type Conflict - - Using `%f` with non-numeric arguments (e.g., strings or booleans): - -```SQL -IoTDB:database1> select format('%.5f',status) from table1 where humidity = 35.4 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %.5f (IllegalFormatConversion: f != java.lang.Boolean) -``` - -2. Argument Count Mismatch - The number of arguments must equal or exceed the number of format specifiers. - - ```SQL - IoTDB:database1> SELECT format('%.5f %03d', humidity) FROM table1 WHERE humidity = 35.4; - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %.5f %03d (MissingFormatArgument: Format specifier '%03d') - ``` -3. Invalid Invocation Errors - - Triggered if: - - * Total arguments < 2 (must include `pattern` and at least one argument). - * `pattern` is not of type `STRING`/`TEXT`. - -```SQL --- Example 1 -IoTDB:database1> select format('%s') from table1 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Scalar function format must have at least two arguments, and first argument pattern must be TEXT or STRING type. - --- Example 2 -IoTDB:database1> select format(123, humidity) from table1 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Scalar function format must have at least two arguments, and first argument pattern must be TEXT or STRING type. -``` - - -## 10. 
String Functions and Operators - -### 10.1 String operators - -#### 10.1.1 || Operator - -The `||` operator is used for string concatenation and functions the same as the `concat` function. - -#### 10.1.2 LIKE Statement - - The `LIKE` statement is used for pattern matching. For detailed usage, refer to Pattern Matching:[LIKE](#1-like-运算符). - -### 10.2 String Functions - -| Function Name | Description | Input | Output | Usage | -| :------------ |:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------| :------ | :----------------------------------------------------------- | -| `length` | Returns the number of characters in a string (not byte length). | `string` (the string whose length is to be calculated) | INT32 | length(string) | -| `upper` | Converts all letters in a string to uppercase. | string | String | upper(string) | -| `lower` | Converts all letters in a string to lowercase. | string | String | lower(string) | -| `trim` | Removes specified leading and/or trailing characters from a string. **Parameters:** - `specification` (optional): Specifies which side to trim: - `BOTH`: Removes characters from both sides (default). - `LEADING`: Removes characters from the beginning. - `TRAILING`: Removes characters from the end. - `trimcharacter` (optional): Character to be removed (default is whitespace). - `string`: The target string. | string | String | trim([ [ specification ] [ trimcharacter ] FROM ] string) Example:`trim('!' 
FROM '!foo!');` —— `'foo'` | -| `strpos` | Returns the position of the first occurrence of `subStr` in `sourceStr`. **Notes:** - Position starts at `1`. - Returns `0` if `subStr` is not found. - Positioning is based on characters, not byte arrays. | `sourceStr` (string to be searched), `subStr` (substring to find) | INT32 | strpos(sourceStr, subStr) | -| `starts_with` | Checks if `sourceStr` starts with the specified `prefix`. | `sourceStr`, `prefix` | Boolean | starts_with(sourceStr, prefix) | -| `ends_with` | Checks if `sourceStr` ends with the specified `suffix`. | `sourceStr`, `suffix` | Boolean | ends_with(sourceStr, suffix) | -| `concat` | Concatenates `string1, string2, ..., stringN`. Equivalent to the `\|\|` operator. | `string`, `text` | String | concat(str1, str2, ...) or str1 \|\| str2 ... | -| `strcmp` | Compares two strings lexicographically. **Returns:** - `-1` if `str1 < str2` - `0` if `str1 = str2` - `1` if `str1 > str2` - `NULL` if either `str1` or `str2` is `NULL` | `string1`, `string2` | INT32 | strcmp(str1, str2) | -| `replace` | Removes all occurrences of `search` in `string`. | `string`, `search` | String | replace(string, search) | -| `replace` | Replaces all occurrences of `search` in `string` with `replace`. | `string`, `search`, `replace` | String | replace(string, search, replace) | -| `substring` | Extracts a substring from `start_index` to the end of the string. **Notes:** - `start_index` starts at `1`. - Returns `NULL` if input is `NULL`. - Throws an error if `start_index` is greater than string length. | `string`, `start_index` | String | substring(string from start_index)or substring(string, start_index) | -| `substring` | Extracts a substring of `length` characters starting from `start_index`. **Notes:** - `start_index` starts at `1`. - Returns `NULL` if input is `NULL`. - Throws an error if `start_index` is greater than string length. - Throws an error if `length` is negative. 
- If `start_index + length` exceeds `int.MAX`, an overflow error may occur. | `string`, `start_index`, `length` | String | substring(string from start_index for length) or substring(string, start_index, length) | - -## 11. Pattern Matching Functions - -### 11.1 LIKE - -#### 11.1.1 Usage - -The `LIKE` operator is used to compare a value with a pattern. It is commonly used in the `WHERE` clause to match specific patterns within strings. - -#### 11.1.2 Syntax - -```SQL -... column [NOT] LIKE 'pattern' [ESCAPE 'character']; -``` - -#### 11.1.3 Match rules - -- Matching is case-sensitive. -- The pattern supports two wildcard characters: - - `_` matches any single character - - `%` matches zero or more characters - -#### 11.1.4 Notes - -- `LIKE` pattern matching applies to the entire string by default. To match a sequence anywhere within a string, the pattern must start and end with a percent sign (`%`). -- To match the escape character itself, double it (e.g., `\\` to match `\`). - -#### 11.1.5 Examples - -#### **Example 1: Match Strings Starting with a Specific Character** - -- **Description:** Find all names that start with the letter `E` (e.g., `Europe`). - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'E%'; -``` - -#### **Example 2: Exclude a Specific Pattern** - -- **Description:** Find all names that do **not** start with the letter `E`. - -```SQL -SELECT * FROM table1 WHERE continent NOT LIKE 'E%'; -``` - -#### **Example 3: Match Strings of a Specific Length** - -- **Description:** Find all names that start with `A`, end with `a`, and have exactly two characters in between (e.g., `Asia`). - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'A__a'; -``` - -#### **Example 4: Escape Special Characters** - -- **Description:** Find all names that start with `South_` (e.g., `South_America`). The underscore (`_`) is a wildcard character, so it needs to be escaped using `\`. 
- -```SQL -SELECT * FROM table1 WHERE continent LIKE 'South\_%' ESCAPE '\'; -``` - -#### **Example 5: Match the Escape Character Itself** - -- **Description:** Find all names that start with `South\`. Since `\` is the escape character, it must be escaped using `\\`. - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'South\\%' ESCAPE '\'; -``` - -### 11.2 regexp_like - -#### 11.2.1 Usage - -Evaluates whether the regular expression pattern is present within the given string. - -#### 11.2.2 Syntax - -```SQL -regexp_like(string, pattern); -``` - -#### 11.2.3 Notes - -- The pattern for `regexp_like` only needs to be contained within the string, and does not need to match the entire string. -- To match the entire string, use the `^` and `$` anchors. -- `^` signifies the "start of the string," and `$` signifies the "end of the string." -- Regular expressions use Java regular expression syntax, with the following exceptions: - - Multiline mode - 1. Enabled by: `(?m)`. - 2. Recognizes only `\n` as the line terminator. - 3. Does not support the `(?d)` flag, and its use is prohibited. - - Case-insensitive matching - 1. Enabled by: `(?i)`. - 2. Follows Unicode rules; context-dependent and locale-specific matching are not supported. - 3. Does not support the `(?u)` flag, and its use is prohibited. - - Character classes - 1. Within character classes (e.g., `[A-Z123]`), `\Q` and `\E` are not supported and are treated as literals. - - Unicode character classes (`\p{prop}`) - 1. Underscores in names: All underscores in names must be removed (e.g., `OldItalic` instead of `Old_Italic`). - 2. Scripts: Specify directly, without the need for `Is`, `script=`, or `sc=` prefixes (e.g., `\p{Hiragana}`). - 3. Blocks: Must use the `In` prefix, `block=` or `blk=` prefixes are not supported (e.g., `\p{InMongolian}`). - 4. Categories: Specify directly, without the need for `Is`, `general_category=`, or `gc=` prefixes (e.g., `\p{L}`). - 5. 
Binary properties: Specify directly, without `Is` (e.g., `\p{NoncharacterCodePoint}`). - -#### 11.2.4 Examples - -#### **Example 1: Matching strings containing a specific pattern** - -```SQL -SELECT regexp_like('1a 2b 14m', '\\d+b'); -- true -``` - -- **Explanation**: Determines whether the string `'1a 2b 14m'` contains a substring that matches the pattern `\d+b`. - - `\d+` means "one or more digits". - - `b` represents the letter b. - - In `'1a 2b 14m'`, the substring `'2b'` matches this pattern, so it returns `true`. - - -#### **Example 2: Matching the entire string** - -```SQL -SELECT regexp_like('1a 2b 14m', '^\\d+b$'); -- false -``` - -- **Explanation**: Checks if the string `'1a 2b 14m'` matches the pattern `^\d+b$` exactly. - - `\d+` means "one or more digits". - - `b` represents the letter b. - - `'1a 2b 14m'` does not match this pattern because the whole string is not just one or more digits followed by `b` (only the substring `'2b'` is), so it returns `false`. - -## 12. Timeseries Windowing Functions - -The sample data is as follows: - -```SQL -IoTDB> SELECT * FROM bid; -+-----------------------------+--------+-----+ -| time|stock_id|price| -+-----------------------------+--------+-----+ -|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+--------+-----+ - --- Create table statement -CREATE TABLE bid(time TIMESTAMP TIME, stock_id STRING TAG, price FLOAT FIELD); --- Insert data -INSERT INTO bid(time, stock_id, price) VALUES('2021-01-01T09:05:00','AAPL',100.0),('2021-01-01T09:06:00','TESL',200.0),('2021-01-01T09:07:00','AAPL',103.0),('2021-01-01T09:07:00','TESL',202.0),('2021-01-01T09:09:00','AAPL',102.0),('2021-01-01T09:15:00','TESL',195.0); -``` - -### 12.1 HOP - -#### 12.1.1 Function Description - -The HOP function segments data into overlapping 
time windows for analysis, assigning each row to all windows that overlap with its timestamp. If windows overlap (when SLIDE < SIZE), data will be duplicated across multiple windows. - -#### 12.1.2 Function Definition - -```SQL -HOP(data, timecol, size, slide[, origin]) -``` - -#### 12.1.3 Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | --------------------------------- | ------------------------- | -| DATA | Table | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar | String (default: 'time') | Time column | -| SIZE | Scalar | Long integer | Window size | -| SLIDE | Scalar | Long integer | Sliding step | -| ORIGIN | Scalar | Timestamp (default: Unix epoch) | First window start time | - - -#### 12.1.4 Returned Results - -The HOP function returns: - -* `window_start`: Window start time (inclusive) -* `window_end`: Window end time (exclusive) -* Pass-through columns: All input columns from DATA - -#### 12.1.5 Usage Example - -```SQL -IoTDB> SELECT * FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| 
-|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree mode's GROUP BY TIME when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 12.2 SESSION - -#### 12.2.1 Function Description - -The SESSION function groups data into sessions based on time intervals. 
It checks the time gap between consecutive rows—rows with gaps smaller than the threshold (GAP) are grouped into the current window, while larger gaps trigger a new window. - -#### 12.2.2 Function Definition - -```SQL -SESSION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], timecol, gap) -``` -#### 12.2.3 Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | ---------------------------- | -------------------------------------- | -| DATA | Table | SET SEMANTIC, PASS THROUGH | Input table with partition/sort keys | -| TIMECOL | Scalar | String (default: 'time') | Time column name | -| GAP | Scalar | Long integer | Session gap threshold | - -#### 12.2.4 Returned Results - -The SESSION function returns: - -* `window_start`: Time of the first row in the session -* `window_end`: Time of the last row in the session -* Pass-through columns: All input columns from DATA - -#### 12.2.5 Usage Example - -```SQL -IoTDB> SELECT * FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| 
-+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree mode's GROUP BY SESSION when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL| 201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 12.3 VARIATION - -#### 12.3.1 Function Description - -The VARIATION function groups data based on value differences. The first row becomes the baseline for the first window. Subsequent rows are compared to the baseline—if the difference is within the threshold (DELTA), they join the current window; otherwise, a new window starts with that row as the new baseline. 
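The baseline rule above can be sketched in a few lines of Python. This is only an illustration of the described semantics, not IoTDB's implementation: the function name `variation_windows` is hypothetical, and treating a difference exactly equal to DELTA as "within the threshold" is an assumption inferred from the usage example in 12.3.5 (TESL 200.0 and 202.0 share a window with DELTA => 2.0).

```python
def variation_windows(values, delta):
    """Return a window index for each value, in input order.

    The first value of a window is its baseline; a value opens a new
    window when it differs from the baseline by more than `delta`.
    """
    indexes = []
    baseline = None
    window = -1
    for value in values:
        if baseline is None or abs(value - baseline) > delta:
            window += 1        # start a new window
            baseline = value   # this row becomes the new baseline
        indexes.append(window)
    return indexes


# Prices per stock_id from the `bid` sample table, ordered by time.
print(variation_windows([100.0, 103.0, 102.0], 2.0))  # AAPL
print(variation_windows([200.0, 202.0, 195.0], 2.0))  # TESL
```

Note that 102.0 is compared with the new baseline 103.0, not with the original 100.0, which is why it stays in the second AAPL window.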
- -#### 12.3.2 Function Definition - -```sql -VARIATION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], col, delta) -``` - -#### 12.3.3 Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | ---------------------------- | -------------------------------------- | -| DATA | Table | SET SEMANTIC, PASS THROUGH | Input table with partition/sort keys | -| COL | Scalar | String | Column for difference calculation | -| DELTA | Scalar | Float | Difference threshold | - -#### 12.3.4 Returned Results - -The VARIATION function returns: - -* `window_index`: Window identifier -* Pass-through columns: All input columns from DATA - -#### 12.3.5 Usage Example - -```sql -IoTDB> SELECT * FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price',DELTA => 2.0); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 1|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- Equivalent to tree mode's GROUP BY VARIATION when combined with GROUP BY -IoTDB> SELECT first(time) as window_start, last(time) as window_end, stock_id, avg(price) as avg FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price', DELTA => 2.0) GROUP BY window_index, stock_id; -+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| 
-|2021-01-01T09:05:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:07:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.5| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 12.4 CAPACITY - -#### 12.4.1 Function Description - -The CAPACITY function groups data into fixed-size windows, where each window contains up to SIZE rows. - -#### 12.4.2 Function Definition - -```sql -CAPACITY(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], size) -``` - -#### 12.4.3 Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | ---------------------------- | -------------------------------------- | -| DATA | Table | SET SEMANTIC, PASS THROUGH | Input table with partition/sort keys | -| SIZE | Scalar | Long integer | Window size (row count) | - -#### 12.4.4 Returned Results - -The CAPACITY function returns: - -* `window_index`: Window identifier -* Pass-through columns: All input columns from DATA - -#### 12.4.5 Usage Example - -```sql -IoTDB> SELECT * FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 0|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- Equivalent to tree mode's GROUP BY COUNT when combined with GROUP BY -IoTDB> SELECT first(time) as start_time, last(time) as end_time, stock_id, avg(price) as avg FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2) GROUP BY window_index, stock_id; 
-+-----------------------------+-----------------------------+--------+-----+ -| start_time| end_time|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|101.5| -|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 12.5 TUMBLE - -#### 12.5.1 Function Description - -The TUMBLE function assigns each row to a non-overlapping, fixed-size time window based on a timestamp attribute. - -#### 12.5.2 Function Definition - -```sql -TUMBLE(data, timecol, size[, origin]) -``` -#### 12.5.3 Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | --------------------------------- | ------------------------- | -| DATA | Table | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar | String (default: 'time') | Time column | -| SIZE | Scalar | Long integer (positive) | Window size | -| ORIGIN | Scalar | Timestamp (default: Unix epoch) | First window start time | - -#### 12.5.4 Returned Results - -The TUMBLE function returns: - -* `window_start`: Window start time (inclusive) -* `window_end`: Window end time (exclusive) -* Pass-through columns: All input columns from DATA - -#### 12.5.5 Usage Example - -```SQL -IoTDB> SELECT * FROM TUMBLE( DATA => bid, TIMECOL => 'time', SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| 
-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree mode's GROUP BY TIME when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM TUMBLE(DATA => bid, TIMECOL => 'time', SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 12.6 CUMULATE - -#### 12.6.1 Function Description - -The CUMULATE function creates expanding windows from an initial window, maintaining the same start time while incrementally extending the end time by STEP until reaching SIZE. Each window contains all elements within its range. For example, with a 1-hour STEP and 24-hour SIZE, daily windows would be: `[00:00, 01:00)`, `[00:00, 02:00)`, ..., `[00:00, 24:00)`. 
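As a rough Python sketch of the expansion described above (an illustration of the semantics, not IoTDB's implementation), the windows containing one timestamp can be enumerated as follows. The function name `cumulate_windows` and the use of plain integers (for example, minutes since some origin) as timestamps are assumptions made for brevity.

```python
def cumulate_windows(t, size, step, origin=0):
    """Return the (window_start, window_end) pairs that contain timestamp t.

    All windows share the start of the enclosing SIZE-aligned window;
    the end grows by STEP until it reaches start + SIZE. Ends are exclusive.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    if size % step != 0:
        raise ValueError("size must be an integral multiple of step")
    start = origin + (t - origin) // size * size
    return [
        (start, start + k * step)
        for k in range(1, size // step + 1)
        if t < start + k * step
    ]


# STEP = 2, SIZE = 10, timestamps in minutes after 09:00.
print(cumulate_windows(6, 10, 2))   # row at 09:06
print(cumulate_windows(15, 10, 2))  # row at 09:15
```

A row close to the end of its enclosing window appears in fewer cumulative windows than a row near the start, because only windows whose exclusive end lies after the row's timestamp contain it.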
- -#### 12.6.2 Function Definition - -```sql -CUMULATE(data, timecol, size, step[, origin]) -``` - -#### 12.6.3 Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | --------------------------------- | --------------------------------------------------- | -| DATA | Table | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar | String (default: 'time') | Time column | -| SIZE | Scalar | Long integer (positive) | Window size (must be an integer multiple of STEP) | -| STEP | Scalar | Long integer (positive) | Expansion step | -| ORIGIN | Scalar | Timestamp (default: Unix epoch) | First window start time | - -> Note: An error `Cumulative table function requires size must be an integral multiple of step` occurs if SIZE is not divisible by STEP. - -#### 12.6.4 Returned Results - -The CUMULATE function returns: - -* `window_start`: Window start time (inclusive) -* `window_end`: Window end time (exclusive) -* Pass-through columns: All input columns from DATA - -#### 12.6.5 Usage Example - -```sql -IoTDB> SELECT * FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| 
-|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree mode's GROUP BY TIME when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m, SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| TESL| 201.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00| AAPL| 100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| AAPL| 101.5| 
-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` diff --git a/src/UserGuide/latest-Table/SQL-Manual/Common-Table-Expression_timecho.md b/src/UserGuide/latest-Table/SQL-Manual/Common-Table-Expression_timecho.md deleted file mode 100644 index e2ae2ef06..000000000 --- a/src/UserGuide/latest-Table/SQL-Manual/Common-Table-Expression_timecho.md +++ /dev/null @@ -1,233 +0,0 @@ - -# Common Table Expressions (CTE) - -## 1. Overview - -CTE (Common Table Expressions) supports defining one or more temporary result sets (called common tables) using the `WITH` clause. These result sets can be referenced multiple times in subsequent parts of the same query. CTE provides a clean way to construct complex queries, making SQL code more readable and maintainable. - -> Note: This feature is available since version 2.0.9.1. - -## 2. Syntax Definition - -The simplified SQL syntax for CTE is as follows: - -```sql -with_clause: - WITH cte_name [(col_name [, col_name] ...)] AS (subquery) - [, cte_name [(col_name [, col_name] ...)] AS (subquery)] ... -``` - -- Supports simple and nested CTEs: One or more CTEs can be defined in a `WITH` clause, and CTEs can reference each other in a nested way (forward references are **not** allowed, meaning a CTE cannot reference another CTE that has not yet been defined). -- Name conflict between CTE and source table: If a CTE has the same name as a source table, only the CTE is visible in the outer scope, and the source table is shadowed. -- Multiple references to CTE: The same CTE can be referenced multiple times in the outer query. -- EXPLAIN / EXPLAIN ANALYZE support: `EXPLAIN` or `EXPLAIN ANALYZE` can be used on the entire query, but **not** on the `subquery` inside a CTE definition. 
-- Column count constraint: The number of column names specified in a CTE definition must match the number of output columns from the `subquery`, otherwise an error will be thrown. -- Unused CTE: A query can still execute normally if a defined CTE is not referenced in the main query body. - -## 3. Examples - -Using tables `table1` and `table2` from the [Sample Data](../Reference/Sample-Data.md) as source tables: - -### 3.1 Simple CTE - -```sql -WITH cte1 AS (SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL), - cte2 AS (SELECT device_id, humidity FROM table2 WHERE humidity IS NOT NULL) -SELECT * FROM cte1 JOIN cte2 ON cte1.device_id = cte2.device_id LIMIT 10; -``` - -Result - -``` -+---------+-----------+---------+--------+ -|device_id|temperature|device_id|humidity| -+---------+-----------+---------+--------+ -| 100| 90.0| 100| 45.1| -| 100| 90.0| 100| 35.2| -| 100| 90.0| 100| 35.1| -| 100| 85.0| 100| 45.1| -| 100| 85.0| 100| 35.2| -| 100| 85.0| 100| 35.1| -| 100| 85.0| 100| 45.1| -| 100| 85.0| 100| 35.2| -| 100| 85.0| 100| 35.1| -| 100| 88.0| 100| 45.1| -+---------+-----------+---------+--------+ -Total line number = 10 -It costs 0.075s -``` - -### 3.2 CTE with the Same Name as Source Table - -```sql -WITH table1 AS (SELECT time, device_id, temperature FROM table1 WHERE temperature IS NOT NULL) -SELECT * FROM table1 LIMIT 5; -``` - -Result - -``` -+-----------------------------+---------+-----------+ -| time|device_id|temperature| -+-----------------------------+---------+-----------+ -|2024-11-30T09:30:00.000+08:00| 101| 90.0| -|2024-11-30T14:30:00.000+08:00| 101| 90.0| -|2024-11-29T10:00:00.000+08:00| 101| 85.0| -|2024-11-27T16:39:00.000+08:00| 101| 85.0| -|2024-11-27T16:40:00.000+08:00| 101| 85.0| -+-----------------------------+---------+-----------+ -Total line number = 5 -It costs 0.103s -``` - -### 3.3 Nested CTE - -```sql -WITH - table1 AS (SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL), - cte1 AS (SELECT 
device_id, temperature FROM table2 WHERE temperature IS NOT NULL), - table2 AS (SELECT temperature FROM table1), - cte2 AS (SELECT temperature FROM table1) -SELECT * FROM table2; -``` - -Result - -``` -+-----------+ -|temperature| -+-----------+ -| 90.0| -| 90.0| -| 85.0| -| 85.0| -| 85.0| -| 85.0| -| 90.0| -| 85.0| -| 85.0| -| 88.0| -| 90.0| -| 90.0| -+-----------+ -Total line number = 12 -It costs 0.050s -``` - -- Forward references are **not** supported - -```sql -WITH - cte2 AS (SELECT temperature FROM cte1), - cte1 AS (SELECT device_id, temperature FROM table1) -SELECT * FROM cte2; -``` - -Error message - -``` -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 550: Table 'database1.cte1' does not exist. -``` - -### 3.4 Multiple References to CTE - -```sql -WITH cte AS (SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL) -SELECT * FROM cte WHERE temperature > (SELECT avg(temperature) FROM cte); -``` - -Result - -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| 90.0| -| 101| 90.0| -| 100| 90.0| -| 100| 88.0| -| 100| 90.0| -| 100| 90.0| -+---------+-----------+ -Total line number = 6 -It costs 0.203s -``` - -### 3.5 EXPLAIN Support - -- Supported on the entire query - -```sql -EXPLAIN WITH cte AS (SELECT * FROM table1) SELECT * FROM cte; -``` - -Result - -``` -+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| distribution plan| -+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ | -| │OutputNode-7 │ | -| 
│OutputColumns-[time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time] │ | -| │OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│ | -| └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ | -| │ | -| │ | -| ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ | -| │Collect-42 │ | -| │OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│ | -| └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ | -| ┌───────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────┐ | -| │ │ | -| ┌───────────┐ ┌───────────┐ | -| │Exchange-49│ │Exchange-50│ | -| └───────────┘ └───────────┘ | -| │ │ | -| │ │ | -|┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐| -|│DeviceTableScanNode-41 │ │DeviceTableScanNode-40 │| -|│QualifiedTableName: database1.table1 │ │QualifiedTableName: database1.table1 │| -|│OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│ │OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│| -|│DeviceNumber: 3 │ │DeviceNumber: 3 │| -|│ScanOrder: ASC │ │ScanOrder: ASC │| -|│PushDownOffset: 0 │ │PushDownOffset: 0 │| -|│PushDownLimit: 0 │ │PushDownLimit: 0 │| -|│PushDownLimitToEachDevice: false │ │PushDownLimitToEachDevice: false │| -|│RegionId: 2 │ │RegionId: 1 │| 
-|└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘| -+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 29 -It costs 0.065s -``` - -- Not supported for internal queries of CTE - -```sql -WITH cte AS (EXPLAIN SELECT * FROM table1) SELECT * FROM cte; -``` - -Error message - -``` -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 700: line 1:14: mismatched input 'EXPLAIN'. Expecting: -``` \ No newline at end of file diff --git a/src/UserGuide/latest-Table/SQL-Manual/Featured-Functions_timecho.md b/src/UserGuide/latest-Table/SQL-Manual/Featured-Functions_timecho.md deleted file mode 100644 index 97cefa7a3..000000000 --- a/src/UserGuide/latest-Table/SQL-Manual/Featured-Functions_timecho.md +++ /dev/null @@ -1,861 +0,0 @@ - -# Featured Functions - -## 1. Downsampling Functions - -### 1.1 `date_bin` Function - -#### **Description** - -The `date_bin` function is a scalar function that aligns timestamps to the start of specified time intervals. It is commonly used with the `GROUP BY` clause for downsampling. - -- **Partial Intervals May Be Empty:** Only timestamps that meet the conditions are aligned; missing intervals are not filled. -- **All Intervals Return Empty:** If no data exists within the query range, the downsampling result is an empty set. - -#### **Usage Examples** - -[Sample Dataset](../Reference/Sample-Data.md): The example data page contains SQL statements for building table structures and inserting data. Download and execute these statements in the IoTDB CLI to import the data into IoTDB. 
You can use this data to test and execute the SQL statements in the examples and obtain the corresponding results. - -**Example 1: Hourly Average Temperature for Device 100** - -```SQL -SELECT date_bin(1h, time) AS hour_time, avg(temperature) AS avg_temp -FROM table1 -WHERE (time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00) - AND device_id = '100' -GROUP BY 1; -``` - -**Result** - -```Plain -+-----------------------------+--------+ -| hour_time|avg_temp| -+-----------------------------+--------+ -|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:00:00.000+08:00| 90.0| -|2024-11-28T08:00:00.000+08:00| 85.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 85.0| -|2024-11-28T11:00:00.000+08:00| 88.0| -+-----------------------------+--------+ -``` - -**Example 2: Hourly Average Temperature for Each Device** - -```SQL -SELECT date_bin(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp -FROM table1 -WHERE time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00 -GROUP BY 1, device_id; -``` - -**Result** - -```Plain -+-----------------------------+---------+--------+ -| hour_time|device_id|avg_temp| -+-----------------------------+---------+--------+ -|2024-11-29T11:00:00.000+08:00| 100| null| -|2024-11-29T18:00:00.000+08:00| 100| 90.0| -|2024-11-28T08:00:00.000+08:00| 100| 85.0| -|2024-11-28T09:00:00.000+08:00| 100| null| -|2024-11-28T10:00:00.000+08:00| 100| 85.0| -|2024-11-28T11:00:00.000+08:00| 100| 88.0| -|2024-11-29T10:00:00.000+08:00| 101| 85.0| -|2024-11-27T16:00:00.000+08:00| 101| 85.0| -+-----------------------------+---------+--------+ -``` - -**Example 3: Hourly Average Temperature for All Devices** - -```SQL -SELECT date_bin(1h, time) AS hour_time, avg(temperature) AS avg_temp - FROM table1 - WHERE time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00 - group by 1; -``` - -**Result** - -```Plain -+-----------------------------+--------+ -| hour_time|avg_temp| 
-+-----------------------------+--------+ -|2024-11-29T10:00:00.000+08:00| 85.0| -|2024-11-27T16:00:00.000+08:00| 85.0| -|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:00:00.000+08:00| 90.0| -|2024-11-28T08:00:00.000+08:00| 85.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 85.0| -|2024-11-28T11:00:00.000+08:00| 88.0| -+-----------------------------+--------+ -``` - -### 1.2 `date_bin_gapfill` Function - -#### **Description:** - -The `date_bin_gapfill` function is an extension of `date_bin` that fills in missing time intervals, returning a complete time series. - -- **Partial Intervals May Be Empty**: Aligns timestamps for data that meets the conditions and fills in missing intervals. -- **All Intervals Return Empty**: If no data exists within the query range, the result is an empty set. - -#### **Limitations:** - -1. The function must always be used with the `GROUP BY` clause. If used elsewhere, it behaves like `date_bin` without gap-filling. -2. A `GROUP BY` clause can contain only one instance of date_bin_gapfill. Multiple calls will result in an error. -3. The `GAPFILL` operation occurs after the `HAVING` clause and before the `FILL` clause. -4. The `WHERE` clause must include time filters in one of the following forms: - 1. `time >= XXX AND time <= XXX` - 2. `time > XXX AND time < XXX` - 3. `time BETWEEN XXX AND XXX` -5. If additional time filters or conditions are used, an error is raised. Time conditions and other value filters must be connected using the `AND` operator. -6. If `startTime` and `endTime` cannot be inferred from the `WHERE` clause, an error is raised. 
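The gap-filling behavior, and the reason a bounded time range is required, can be sketched in Python. This is a simplified illustration rather than IoTDB's implementation: the names `date_bin` and `gapfill_avg` are hypothetical helpers, timestamps are plain integers (hours here), and aligning the first bucket to the bin containing the range start is an assumption.

```python
def date_bin(interval, t, origin=0):
    """Align timestamp t to the start of its interval."""
    return origin + (t - origin) // interval * interval


def gapfill_avg(rows, interval, start, end):
    """Average values per bucket and emit every bucket in [start, end].

    `rows` holds (time, value_or_None) pairs already filtered to the range.
    Buckets without usable values yield None; no rows at all yields [].
    """
    buckets = {}
    for t, value in rows:
        buckets.setdefault(date_bin(interval, t), []).append(value)
    if not buckets:
        return []  # no data in range: empty result, nothing is filled
    result = []
    bucket = date_bin(interval, start)
    while bucket <= end:
        values = [v for v in buckets.get(bucket, []) if v is not None]
        result.append((bucket, sum(values) / len(values) if values else None))
        bucket += interval
    return result


# Hours of 2024-11-28 for one device; 09:00 has a row but a NULL temperature.
rows = [(8, 85.0), (9, None), (10, 85.0), (11, 88.0)]
for bucket, avg in gapfill_avg(rows, 1, 7, 16):
    print(bucket, avg)
```

Without both bounds there would be no way to decide where the filled series starts and stops, which is why `startTime` and `endTime` must be inferable from the `WHERE` clause.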
- -**Usage Examples** - -**Example 1: Fill Missing Intervals** - -```SQL -SELECT date_bin_gapfill(1h, time) AS hour_time, avg(temperature) AS avg_temp -FROM table1 -WHERE (time >= 2024-11-28 07:00:00 AND time <= 2024-11-28 16:00:00) - AND device_id = '100' -GROUP BY 1; -``` - -**Result** - -```Plain -+-----------------------------+--------+ -| hour_time|avg_temp| -+-----------------------------+--------+ -|2024-11-28T07:00:00.000+08:00| null| -|2024-11-28T08:00:00.000+08:00| 85.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 85.0| -|2024-11-28T11:00:00.000+08:00| 88.0| -|2024-11-28T12:00:00.000+08:00| null| -|2024-11-28T13:00:00.000+08:00| null| -|2024-11-28T14:00:00.000+08:00| null| -|2024-11-28T15:00:00.000+08:00| null| -|2024-11-28T16:00:00.000+08:00| null| -+-----------------------------+--------+ -``` - -**Example 2: Fill Missing Intervals with Device Grouping** - -```SQL -SELECT date_bin_gapfill(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp -FROM table1 -WHERE time >= 2024-11-28 07:00:00 AND time <= 2024-11-28 16:00:00 -GROUP BY 1, device_id; -``` - -**Result** - -```Plain -+-----------------------------+---------+--------+ -| hour_time|device_id|avg_temp| -+-----------------------------+---------+--------+ -|2024-11-28T07:00:00.000+08:00| 100| null| -|2024-11-28T08:00:00.000+08:00| 100| 85.0| -|2024-11-28T09:00:00.000+08:00| 100| null| -|2024-11-28T10:00:00.000+08:00| 100| 85.0| -|2024-11-28T11:00:00.000+08:00| 100| 88.0| -|2024-11-28T12:00:00.000+08:00| 100| null| -|2024-11-28T13:00:00.000+08:00| 100| null| -|2024-11-28T14:00:00.000+08:00| 100| null| -|2024-11-28T15:00:00.000+08:00| 100| null| -|2024-11-28T16:00:00.000+08:00| 100| null| -+-----------------------------+---------+--------+ -``` - -**Example 3: Empty Result Set for No Data in Range** - -```SQL -SELECT date_bin_gapfill(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp -FROM table1 -WHERE time >= 2024-11-27 09:00:00 AND time <= 
2024-11-27 14:00:00 -GROUP BY 1, device_id; -``` - -**Result** - -```Plain -+---------+---------+--------+ -|hour_time|device_id|avg_temp| -+---------+---------+--------+ -+---------+---------+--------+ -``` - -## 2. `DIFF` Function - -### 2.1 **Description:** - -- The `DIFF` function calculates the difference between the current row and the previous row. For the first row, it returns `NULL` since there is no previous row. - -### 2.2 **Function Definition:** - -``` -DIFF(numeric[, boolean]) -> Double -``` - -### 2.3 **Parameters:** - -#### **First Parameter (numeric):** - -- **Type**: Must be numeric (`INT32`, `INT64`, `FLOAT`, `DOUBLE`). -- **Purpose**: Specifies the column for which to calculate the difference. - -#### **Second Parameter (boolean, optional):** - -- **Type**: Boolean (`true` or `false`). -- **Default**: `true`. -- **Purpose**: - - `true`: Ignores `NULL` values and subtracts the nearest preceding non-`NULL` value from the current row. If no preceding non-`NULL` value exists, returns `NULL`. - - `false`: Does not ignore `NULL` values. If the previous row is `NULL`, the result is `NULL`. - -### 2.4 **Notes:** - -- In **tree models**, the second parameter must be specified as `'ignoreNull'='true'` or `'ignoreNull'='false'`. -- In **table models**, simply use `true` or `false`. Using `'ignoreNull'='true'` or `'ignoreNull'='false'` in table models results in a string comparison and always evaluates to `false`.
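The effect of the second parameter can be sketched in Python. This is an illustration of the described semantics only, not IoTDB's implementation; the function name `diff` and the use of `None` for `NULL` are assumptions.

```python
def diff(values, ignore_null=True):
    """Difference between each value and the previous one.

    With ignore_null=True, NULL rows are skipped when looking for the
    previous value; with ignore_null=False, a NULL previous row makes
    the result NULL. A NULL current row always yields NULL.
    """
    result = []
    previous = None
    for value in values:
        if value is None or previous is None:
            result.append(None)
        else:
            result.append(value - previous)
        # Remember NULL rows only when they are not being ignored.
        if value is not None or not ignore_null:
            previous = value
    return result


temperature = [None, 90.0, 85.0, None, 85.0, 88.0, 90.0, 90.0]
print(diff(temperature))         # DIFF(temperature)
print(diff(temperature, False))  # DIFF(temperature, false)
```

The two calls differ only for the row that follows a `NULL`: the first reports `0.0` (85.0 compared with the earlier 85.0), while the second reports `None`.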
- -### 2.5 **Usage Examples** - -#### **Example 1: Ignore NULL Values** - -```SQL -SELECT time, DIFF(temperature) AS diff_temp -FROM table1 -WHERE device_id = '100'; -``` - -**Result** - -```Plain -+-----------------------------+---------+ -| time|diff_temp| -+-----------------------------+---------+ -|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:30:00.000+08:00| null| -|2024-11-28T08:00:00.000+08:00| -5.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 0.0| -|2024-11-28T11:00:00.000+08:00| 3.0| -|2024-11-26T13:37:00.000+08:00| 2.0| -|2024-11-26T13:38:00.000+08:00| 0.0| -+-----------------------------+---------+ -``` - -#### **Example 2: Do Not Ignore NULL Values** - -```SQL -SELECT time, DIFF(temperature, false) AS diff_temp -FROM table1 -WHERE device_id = '100'; -``` - -**Result** - -```Plain -+-----------------------------+---------+ -| time|diff_temp| -+-----------------------------+---------+ -|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:30:00.000+08:00| null| -|2024-11-28T08:00:00.000+08:00| -5.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| null| -|2024-11-28T11:00:00.000+08:00| 3.0| -|2024-11-26T13:37:00.000+08:00| 2.0| -|2024-11-26T13:38:00.000+08:00| 0.0| -+-----------------------------+---------+ -``` - -#### **Example 3: Full Example** - -```SQL -SELECT time, temperature, - DIFF(temperature) AS diff_temp_1, - DIFF(temperature, false) AS diff_temp_2 -FROM table1 -WHERE device_id = '100'; -``` - -**Result** - -```Plain -+-----------------------------+-----------+-----------+-----------+ -| time|temperature|diff_temp_1|diff_temp_2| -+-----------------------------+-----------+-----------+-----------+ -|2024-11-29T11:00:00.000+08:00| null| null| null| -|2024-11-29T18:30:00.000+08:00| 90.0| null| null| -|2024-11-28T08:00:00.000+08:00| 85.0| -5.0| -5.0| -|2024-11-28T09:00:00.000+08:00| null| null| null| -|2024-11-28T10:00:00.000+08:00| 85.0| 0.0| null| -|2024-11-28T11:00:00.000+08:00| 88.0| 
3.0| 3.0| -|2024-11-26T13:37:00.000+08:00| 90.0| 2.0| 2.0| -|2024-11-26T13:38:00.000+08:00| 90.0| 0.0| 0.0| -+-----------------------------+-----------+-----------+-----------+ -``` - -## 3 Timeseries Windowing Functions - -The sample data is as follows: - -```SQL -IoTDB> SELECT * FROM bid; -+-----------------------------+--------+-----+ -| time|stock_id|price| -+-----------------------------+--------+-----+ -|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+--------+-----+ - --- Create table statement -CREATE TABLE bid(time TIMESTAMP TIME, stock_id STRING TAG, price FLOAT FIELD); --- Insert data -INSERT INTO bid(time, stock_id, price) VALUES('2021-01-01T09:05:00','AAPL',100.0),('2021-01-01T09:06:00','TESL',200.0),('2021-01-01T09:07:00','AAPL',103.0),('2021-01-01T09:07:00','TESL',202.0),('2021-01-01T09:09:00','AAPL',102.0),('2021-01-01T09:15:00','TESL',195.0); -``` - -### 3.1 HOP - -#### Function Description - -The HOP function segments data into overlapping time windows for analysis, assigning each row to all windows that overlap with its timestamp. If windows overlap (when SLIDE < SIZE), data will be duplicated across multiple windows. 
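The window assignment can be sketched in Python as follows. This is an illustration of the semantics described above, not IoTDB's implementation; the function name `hop_windows` and the integer timestamps (minutes since some origin) are assumptions.

```python
def hop_windows(t, size, slide, origin=0):
    """Return every (window_start, window_end) pair that contains t.

    Window starts are origin + k * slide; each window covers
    [start, start + size), so the end is exclusive.
    """
    if size <= 0 or slide <= 0:
        raise ValueError("size and slide must be positive")
    # Latest window start that is not after t.
    start = origin + (t - origin) // slide * slide
    windows = []
    while start + size > t:
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


# SLIDE = 5, SIZE = 10, timestamps in minutes after 09:00.
print(hop_windows(5, 10, 5))   # row at 09:05
print(hop_windows(15, 10, 5))  # row at 09:15
```

With SLIDE equal to SIZE each timestamp falls into exactly one window, which is the non-overlapping case; with SLIDE smaller than SIZE each row is emitted once per containing window.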
- -#### Function Definition - -```SQL -HOP(data, timecol, size, slide[, origin]) -``` - -#### Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | --------------------------------- | ------------------------- | -| DATA | Table | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar | String (default: 'time') | Time column | -| SIZE | Scalar | Long integer | Window size | -| SLIDE | Scalar | Long integer | Sliding step | -| ORIGIN | Scalar | Timestamp (default: Unix epoch) | First window start time | - - -#### Returned Results - -The HOP function returns: - -* `window_start`: Window start time (inclusive) -* `window_end`: Window end time (exclusive) -* Pass-through columns: All input columns from DATA - -#### Usage Example - -```SQL -IoTDB> SELECT * FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| 
-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree model's GROUP BY TIME when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 3.2 SESSION - -#### Function Description - -The SESSION function groups data into sessions based on time intervals. It checks the time gap between consecutive rows: a row whose gap from the previous row does not exceed the threshold (GAP) joins the current window, while a larger gap starts a new window. 
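
The session rule can be sketched in plain Python for a single partition. This is a hypothetical helper (`session_windows`), not IoTDB code. It assumes that a gap exactly equal to GAP keeps a row in the current session, which is what the sample output for `GAP => 2m` suggests, since rows exactly two minutes apart share a window. Timestamps are simplified to integer minutes after 09:00.

```python
def session_windows(times, gap):
    """Split timestamps into sessions; a gap larger than `gap` opens a new one."""
    sessions = []
    for t in sorted(times):
        if sessions and t - sessions[-1][-1] <= gap:
            sessions[-1].append(t)   # close enough: stay in the current session
        else:
            sessions.append([t])     # first row, or gap too large: new session
    # Report (window_start, window_end) as the first and last timestamp.
    return [(s[0], s[-1]) for s in sessions]


# One call per partition (stock_id), with GAP => 2m.
print(session_windows([5, 7, 9], gap=2))   # AAPL: [(5, 9)]
print(session_windows([6, 7, 15], gap=2))  # TESL: [(6, 7), (15, 15)]
```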
- -#### Function Definition - -```SQL -SESSION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], timecol, gap) -``` -#### Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | ---------------------------- | -------------------------------------- | -| DATA | Table | SET SEMANTIC, PASS THROUGH | Input table with partition/sort keys | -| TIMECOL | Scalar | String (default: 'time') | Time column name | -| GAP | Scalar | Long integer | Session gap threshold | - -#### Returned Results - -The SESSION function returns: - -* `window_start`: Time of the first row in the session -* `window_end`: Time of the last row in the session -* Pass-through columns: All input columns from DATA - -#### Usage Example - -```SQL -IoTDB> SELECT * FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree model's GROUP BY SESSION when combined with GROUP BY -IoTDB> SELECT window_start, window_end, 
stock_id, avg(price) as avg FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL| 201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 3.3 VARIATION - -#### Function Description - -The VARIATION function groups data based on value differences. The first row becomes the baseline for the first window. Subsequent rows are compared to the baseline—if the difference is within the threshold (DELTA), they join the current window; otherwise, a new window starts with that row as the new baseline. 
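
The baseline rule is easy to misread as a comparison with the previous row, so here is a minimal Python sketch for one partition. The helper name `variation_windows` is hypothetical, and the sketch assumes that a difference exactly equal to DELTA still counts as "within the threshold".

```python
def variation_windows(values, delta):
    """Assign a window index to each value, comparing against the window's baseline."""
    indexes = []
    baseline = None
    index = -1
    for v in values:
        # Compare with the first value of the current window, not the previous row.
        if baseline is None or abs(v - baseline) > delta:
            index += 1       # open a new window
            baseline = v     # this row becomes the new baseline
        indexes.append(index)
    return indexes


# One call per partition (stock_id), with DELTA => 2.0.
print(variation_windows([200.0, 202.0, 195.0], delta=2.0))  # TESL: [0, 0, 1]
print(variation_windows([100.0, 103.0, 102.0], delta=2.0))  # AAPL: [0, 1, 1]
```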
- -#### Function Definition - -```sql -VARIATION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], col, delta) -``` - -#### Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | ---------------------------- | -------------------------------------- | -| DATA | Table | SET SEMANTIC, PASS THROUGH | Input table with partition/sort keys | -| COL | Scalar | String | Column for difference calculation | -| DELTA | Scalar | Float | Difference threshold | - -#### Returned Results - -The VARIATION function returns: - -* `window_index`: Window identifier -* Pass-through columns: All input columns from DATA - -#### Usage Example - -```sql -IoTDB> SELECT * FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price',DELTA => 2.0); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 1|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- Equivalent to tree model's GROUP BY VARIATION when combined with GROUP BY -IoTDB> SELECT first(time) as window_start, last(time) as window_end, stock_id, avg(price) as avg FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price', DELTA => 2.0) GROUP BY window_index, stock_id; -+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| 
-|2021-01-01T09:05:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:07:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.5| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 3.4 CAPACITY - -#### Function Description - -The CAPACITY function groups data into fixed-size windows, where each window contains up to SIZE rows. - -#### Function Definition - -```sql -CAPACITY(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], size) -``` - -#### Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | ---------------------------- | -------------------------------------- | -| DATA | Table | SET SEMANTIC, PASS THROUGH | Input table with partition/sort keys | -| SIZE | Scalar | Long integer | Window size (row count) | - -#### Returned Results - -The CAPACITY function returns: - -* `window_index`: Window identifier -* Pass-through columns: All input columns from DATA - -#### Usage Example - -```sql -IoTDB> SELECT * FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 0|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- Equivalent to tree model's GROUP BY COUNT when combined with GROUP BY -IoTDB> SELECT first(time) as start_time, last(time) as end_time, stock_id, avg(price) as avg FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2) GROUP BY window_index, stock_id; -+-----------------------------+-----------------------------+--------+-----+ -| start_time| 
end_time|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|101.5| -|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 3.5 TUMBLE - -#### Function Description - -The TUMBLE function assigns each row to a non-overlapping, fixed-size time window based on a timestamp attribute. - -#### Function Definition - -```sql -TUMBLE(data, timecol, size[, origin]) -``` -#### Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | --------------------------------- | ------------------------- | -| DATA | Table | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar | String (default: 'time') | Time column | -| SIZE | Scalar | Long integer (positive) | Window size | -| ORIGIN | Scalar | Timestamp (default: Unix epoch) | First window start time | - -#### Returned Results - -The TUMBLE function returns: - -* `window_start`: Window start time (inclusive) -* `window_end`: Window end time (exclusive) -* Pass-through columns: All input columns from DATA - -#### Usage Example - -```SQL -IoTDB> SELECT * FROM TUMBLE( DATA => bid, TIMECOL => 'time', SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| 
-|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree model's GROUP BY TIME when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM TUMBLE(DATA => bid, TIMECOL => 'time', SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 3.6 CUMULATE - -#### Function Description - -The CUMULATE function creates expanding windows from an initial window, maintaining the same start time while incrementally extending the end time by STEP until reaching SIZE. Each window contains all elements within its range. For example, with a 1-hour STEP and 24-hour SIZE, daily windows would be: `[00:00, 01:00)`, `[00:00, 02:00)`, ..., `[00:00, 24:00)`. 
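
As a rough illustration of the expanding windows, the following Python sketch lists the windows that one timestamp falls into. The helper `cumulate_windows` is hypothetical and only mirrors the description above; timestamps are simplified to integer minutes after 09:00.

```python
def cumulate_windows(ts, size, step, origin=0):
    """Return the expanding [start, end) windows that contain `ts`."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    if size % step != 0:
        raise ValueError("size must be an integral multiple of step")
    # All windows in one cycle share the start of the enclosing SIZE-wide window.
    start = origin + ((ts - origin) // size) * size
    windows = []
    end = start + step
    while end <= start + size:
        if ts < end:                 # end is exclusive
            windows.append((start, end))
        end += step
    return windows


# Minutes after 09:00, with STEP => 2m and SIZE => 10m.
print(cumulate_windows(5, size=10, step=2))   # [(0, 6), (0, 8), (0, 10)]
print(cumulate_windows(15, size=10, step=2))  # [(10, 16), (10, 18), (10, 20)]
```

A row late in the cycle lands in few windows and a row early in the cycle lands in many, which is why the row count per input row varies in the query output.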
- -#### Function Definition - -```sql -CUMULATE(data, timecol, size, step[, origin]) -``` - -#### Parameter Description - -| Parameter | Type | Attributes | Description | -| ----------- | -------- | --------------------------------- | --------------------------------------------------- | -| DATA | Table | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar | String (default: 'time') | Time column | -| SIZE | Scalar | Long integer (positive) | Window size (must be an integer multiple of STEP) | -| STEP | Scalar | Long integer (positive) | Expansion step | -| ORIGIN | Scalar | Timestamp (default: Unix epoch) | First window start time | - -> Note: An error `Cumulative table function requires size must be an integral multiple of step` occurs if SIZE is not divisible by STEP. - -#### Returned Results - -The CUMULATE function returns: - -* `window_start`: Window start time (inclusive) -* `window_end`: Window end time (exclusive) -* Pass-through columns: All input columns from DATA - -#### Usage Example - -```sql -IoTDB> SELECT * FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| 
-|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Equivalent to tree model's GROUP BY TIME when combined with GROUP BY -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m, SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| TESL| 201.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00| AAPL| 100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| AAPL| 101.5| 
-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - - -## 4. Window Functions - -### 4.1 SQL Definition - -```SQL -windowDefinition - : name=identifier AS '(' windowSpecification ')' - ; - -windowSpecification - : (existingWindowName=identifier)? - (PARTITION BY partition+=expression (',' partition+=expression)*)? - (ORDER BY sortItem (',' sortItem)*)? - windowFrame? - ; - -windowFrame - : frameExtent - ; - -frameExtent - : frameType=RANGE start=frameBound - | frameType=ROWS start=frameBound - | frameType=GROUPS start=frameBound - | frameType=RANGE BETWEEN start=frameBound AND end=frameBound - | frameType=ROWS BETWEEN start=frameBound AND end=frameBound - | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound - ; - -frameBound - : UNBOUNDED boundType=PRECEDING #unboundedFrame - | UNBOUNDED boundType=FOLLOWING #unboundedFrame - | CURRENT ROW #currentRowBound - | expression boundType=(PRECEDING | FOLLOWING) #boundedFrame - ; -``` - -For more detailed introductions to the features, please refer to: [Window Function](../User-Manual/Window-Function_timecho.md) - -### 4.2 Usage Examples - -The original data of the device_flow table is as follows: - -```sql -+-----------------------------+------+-----+ -| time|device| flow| -+-----------------------------+------+-----+ -|1970-01-01T08:00:00.000+08:00| d0| 3| -|1970-01-01T08:00:00.001+08:00| d0| 5| -|1970-01-01T08:00:00.002+08:00| d0| 3| -|1970-01-01T08:00:00.003+08:00| d0| 1| -|1970-01-01T08:00:00.004+08:00| d1| 2| -|1970-01-01T08:00:00.005+08:00| d1| 4| -+-----------------------------+------+-----+ -``` - -1. Query all columns from device_flow, group the data by the device dimension, sort the records within each device group by the value of the flow field, calculate the cumulative sum of the flow field, and finally return the cumulative sum as a column named sum. 
- -SQL: - -```SQL -IoTDB> SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) as sum FROM device_flow; -``` - -Result: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` -2. Query all original columns from the device_flow table, group the data by the device dimension (device), sort the records within each device group by the value of the flow field, count the number of rows within the range of "the flow group of the current row + the previous 1 flow group", and finally return the count result as a column named count. - -SQL: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ORDER BY flow GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -Result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -3. Query all original columns from the device_flow table, group the data by device, sort the records in ascending order by the value of the flow field within each group, count the number of all rows falling within the numeric range of "the flow value of the current row minus 2" to "the flow value of the current row", and finally return the count result as a column named count. 
- -SQL: - -```SQL -IoTDB> SELECT *,count(flow) OVER(PARTITION BY device ORDER BY flow RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -Result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -## 5. Object Type Read Function - -**Description**: -Reads binary content from an `OBJECT` type column and returns a `BLOB` type (raw binary data of the object). -> Supported since V2.0.8 - -**Syntax:** -```SQL -READ_OBJECT(object [, offset, length]) -``` - -**Parameters:** -- **Required**: - `object` (OBJECT type) -- **Optional**: - - `offset` (long/INT64): Start position for reading. Default: `0`. - - Throws error if `offset < 0` or `offset >= full file length`. - - `length` (long/INT64): Number of bytes to read. Default: full file length. - - Error if `length > 2^31 - 1`. - - If `length` exceeds remaining bytes from `offset`, reads until end of file. - - If `length < 0`, reads all remaining data from `offset`. 
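
The offset and length rules listed above can be restated as a small Python sketch. It operates on an in-memory `bytes` value rather than a stored object, and the function name `read_object` and the use of `ValueError` are illustrative assumptions, not IoTDB's API.

```python
from typing import Optional

MAX_LENGTH = 2**31 - 1  # largest length accepted, per the rules above


def read_object(content: bytes, offset: int = 0, length: Optional[int] = None) -> bytes:
    """Return a slice of `content` following the offset/length rules."""
    if offset < 0 or offset >= len(content):
        raise ValueError("offset out of range")
    if length is None or length < 0:
        return content[offset:]              # default or negative: read to the end
    if length > MAX_LENGTH:
        raise ValueError("length exceeds 2^31 - 1")
    return content[offset:offset + length]   # slicing stops at the end of the data


data = b"iotdb"
print("0x" + read_object(data).hex())        # 0x696f746462
print("0x" + read_object(data, 0, 3).hex())  # 0x696f74
```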
- -**Examples:** -```sql -IoTDB:database1> SELECT READ_OBJECT(s1) FROM table1 WHERE device_id = 'tag1' -+------------+ -| _col0| -+------------+ -|0x696f746462| -+------------+ -Total line number = 1 - -IoTDB:database1> SELECT READ_OBJECT(s1, 0, 3) FROM table1 WHERE device_id = 'tag1' -+--------+ -| _col0| -+--------+ -|0x696f74| -+--------+ -Total line number = 1 -``` diff --git a/src/UserGuide/latest-Table/SQL-Manual/QuickStart-Only-Sql_timecho.md b/src/UserGuide/latest-Table/SQL-Manual/QuickStart-Only-Sql_timecho.md deleted file mode 100644 index 97569de90..000000000 --- a/src/UserGuide/latest-Table/SQL-Manual/QuickStart-Only-Sql_timecho.md +++ /dev/null @@ -1,126 +0,0 @@ - -# QuickStart Only SQL - -> **Before executing the following SQL statements, please ensure** -> -> * **IoTDB service has been successfully started** -> * **Connected to IoTDB via CLI client** -> -> Note: If your terminal does not support multi-line pasting (e.g., Windows CMD), please adjust the SQL statements to single-line format before execution. - -## 1. Database Management - -```SQL --- Create database database1 if it does not already exist (no TTL is set by this statement); -CREATE DATABASE IF NOT EXISTS database1; - --- Use database database1; -USE database1; - --- Set the database TTL to 1 week (604800000 ms); -ALTER DATABASE database1 SET PROPERTIES TTL=604800000; - --- Delete database database1; -DROP DATABASE IF EXISTS database1; -``` - -For detailed syntax description, please refer to: [Database Management](../Basic-Concept/Database-Management_timecho.md) - -## 2. 
Table Management - -```SQL --- Create table table1; -CREATE TABLE table1 ( - time TIMESTAMP TIME, - device_id STRING TAG, - maintenance STRING ATTRIBUTE COMMENT 'maintenance', - temperature FLOAT FIELD COMMENT 'temperature', - status Boolean FIELD COMMENT 'status' -); - --- View column information of table table1; -DESC table1 DETAILS; - --- Modify table; --- Add column to table table1; -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS humidity FLOAT FIELD COMMENT 'humidity'; --- Set table table1 TTL to 1 week; -ALTER TABLE table1 set properties TTL=604800000; - --- Delete table table1; -DROP TABLE table1; -``` - -For detailed syntax description, please refer to: [Table Management](../Basic-Concept/Table-Management_timecho.md) - -## 3. Data Writing - -```SQL --- Single row writing; -INSERT INTO table1(device_id, time, temperature) VALUES ('100', '2025-11-26 13:37:00', 90.0); - --- Multi-row writing; -INSERT INTO table1(device_id, maintenance, time, temperature) VALUES - ('101', '180', '2024-11-26 13:37:00', 88.0), - ('100', '180', '2024-11-26 13:38:00', 85.0), - ('101', '180', '2024-11-27 16:38:00', 80.0); -``` - -For detailed syntax description, please refer to: [Data Writing](../Basic-Concept/Write-Updata-Data_timecho.md#_1-data-insertion) - -## 4. Data Query - -```SQL --- Full table query; -SELECT * FROM table1; - --- Function query; -SELECT count(*), sum(temperature) FROM table1; - --- Query data for specified device and time period; -SELECT * -FROM table1 -WHERE time >= 2024-11-26 00:00:00 and time <= 2024-11-27 00:00:00 and device_id='101'; -``` - -For detailed syntax description, please refer to: [Data Query](../Basic-Concept/Query-Data_timecho.md) - -## 5. Data Update - -```SQL --- Update the maintenance attribute value for data where device_id is 100; -UPDATE table1 SET maintenance='45' WHERE device_id='100'; -``` - -For detailed syntax description, please refer to: [Data Update](../Basic-Concept/Write-Updata-Data_timecho.md#_2-data-updates) - -## 6. 
Data Deletion - -```SQL --- Delete data for specified device and time period; -DELETE FROM table1 WHERE time >= 2024-11-26 23:39:00 and time <= 2024-11-27 20:42:00 AND device_id='101'; - --- Full table deletion; -DELETE FROM table1; -``` - -For detailed syntax description, please refer to: [Data Deletion](../Basic-Concept/Delete-Data.md) \ No newline at end of file diff --git a/src/UserGuide/latest-Table/SQL-Manual/Row-Pattern-Recognition_timecho.md b/src/UserGuide/latest-Table/SQL-Manual/Row-Pattern-Recognition_timecho.md deleted file mode 100644 index b5136f77a..000000000 --- a/src/UserGuide/latest-Table/SQL-Manual/Row-Pattern-Recognition_timecho.md +++ /dev/null @@ -1,167 +0,0 @@ - - -# Pattern Query - -## 1. Syntax Definition - -```SQL -MATCH_RECOGNIZE ( - [ PARTITION BY column [, ...] ] - [ ORDER BY column [, ...] ] - [ MEASURES measure_definition [, ...] ] - [ ROWS PER MATCH ] - [ AFTER MATCH skip_to ] - PATTERN ( row_pattern ) - [ SUBSET subset_definition [, ...] ] - DEFINE variable_definition [, ...] -) -``` - -**Note:** - -* PARTITION BY: Optional. Used to group the input table, and each group can perform pattern matching independently. If this clause is not specified, the entire input table will be processed as a single unit. -* ORDER BY: Optional. Used to ensure that input data is processed in a specific order during matching. -* MEASURES: Optional. Used to specify which information to extract from the matched segment of data. -* ROWS PER MATCH: Optional. Used to specify the output method of the result set after successful pattern matching. -* AFTER MATCH SKIP: Optional. Used to specify which row to resume from for the next pattern match after identifying a non-empty match. -* PATTERN: Used to define the row pattern to be matched. -* SUBSET: Optional. Used to merge rows matched by multiple basic pattern variables into a single logical set. -* DEFINE: Used to define the basic pattern variables for the row pattern. 
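
To make the interplay of PATTERN and DEFINE concrete, here is a rough Python emulation of one simple pattern, `A B*`, where `A` has no condition and `B` is defined relative to the previous row. It is a sketch under stated assumptions, not the engine's algorithm: matching is assumed to be greedy, the next match is assumed to start right after the last row of the previous one, and the helper name and sample timestamps are hypothetical.

```python
def match_a_b_star(rows, define_b):
    """Emulate PATTERN (A B*): A takes any row, B takes rows where define_b(prev, cur) holds."""
    matches = []
    for row in rows:
        if matches and define_b(matches[-1][-1], row):
            matches[-1].append(row)   # row satisfies DEFINE B: extend the match
        else:
            matches.append([row])     # otherwise the row becomes A of a new match
    return matches


# Hypothetical timestamps in milliseconds, already sorted (ORDER BY time).
times_ms = [0, 60_000, 100_000_000, 100_060_000]
one_day_ms = 86_400_000
segments = match_a_b_star(times_ms, lambda prev, cur: cur - prev <= one_day_ms)

# MEASURES-style summary per match: first time, last time, row count.
print([(s[0], s[-1], len(s)) for s in segments])
# [(0, 60000, 2), (100000000, 100060000, 2)]
```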
- -For a more detailed introduction to this feature, please refer to: [Pattern Query](../User-Manual/Pattern-Query_timecho.md) - -## 2. Usage Examples - -The following examples use [Sample Data](../Reference/Sample-Data.md) as the source data. - -1. Time Segment Query - -Split the data in table1 into segments in which consecutive rows are at most 24 hours apart, and query the number of rows in each segment along with its start and end times. - -* Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table1 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (cast(B.time as INT64) - cast(PREV(B.time) as INT64)) <= 86400000 -) AS m; -``` - -* Query Results - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:38:00.000+08:00| 2| -|2024-11-27T16:38:00.000+08:00|2024-11-30T14:30:00.000+08:00| 16| -+-----------------------------+-----------------------------+---+ -Total line number = 2 -``` - -2. Difference Segment Query - -Split the data in table2 into segments in which each humidity value minus the previous one is no greater than 0.1, and query the number of rows in each segment along with its start and end times. 
- -* Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table2 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (B.humidity - PREV(B.humidity)) <= 0.1 -) AS m; -``` - -* Query Results - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-27T00:00:00.000+08:00| 2| -|2024-11-28T08:00:00.000+08:00|2024-11-29T00:00:00.000+08:00| 2| -|2024-11-29T11:00:00.000+08:00|2024-11-30T00:00:00.000+08:00| 2| -+-----------------------------+-----------------------------+---+ -Total line number = 3 -``` - -3. Event Statistics Query - -Partition the data in table1 by device ID and, for each run of consecutive rows in the Shanghai region with humidity greater than 35, return the start time, end time, and maximum humidity. - -* Query SQL - -```SQL -SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity -FROM table1 -MATCH_RECOGNIZE ( - PARTITION BY device_id - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RPR_FIRST(A.time) AS event_start, - RPR_LAST(A.time) AS event_end, - MAX(A.humidity) AS max_humidity - ONE ROW PER MATCH - PATTERN (A+) - DEFINE - A AS A.region = 'Shanghai' AND A.humidity > 35 -) AS m; -``` - -* Query Results - -```SQL -+---------+-----+-----------------------------+-----------------------------+------------+ -|device_id|match| event_start| event_end|max_humidity| -+---------+-----+-----------------------------+-----------------------------+------------+ -| 100| 1|2024-11-28T09:00:00.000+08:00|2024-11-29T18:30:00.000+08:00| 45.1| -| 101| 1|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 35.2| -+---------+-----+-----------------------------+-----------------------------+------------+ -Total line number = 2 -``` diff --git 
a/src/UserGuide/latest-Table/SQL-Manual/SQL-Authority-Management_timecho.md b/src/UserGuide/latest-Table/SQL-Manual/SQL-Authority-Management_timecho.md deleted file mode 100644 index 3528a5f31..000000000 --- a/src/UserGuide/latest-Table/SQL-Manual/SQL-Authority-Management_timecho.md +++ /dev/null @@ -1,378 +0,0 @@ - - -# Authority Management - -This document is the SQL manual for authority management starting from version V2.0.7. For detailed function usage, see [Authority Management](../User-Manual/Authority-Management-Upgrade_timecho.md). For an introduction to authority management functions before version V2.0.7, refer to [Authority Management](../User-Manual/Authority-Management_timecho.md) - -## 1. Privilege List - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

| Privilege Type | Privilege Name | Scope of Effect | Description |
|---|---|---|---|
| Global Privileges | SYSTEM | Global | Allows users to create, modify, and delete databases.<br>Allows users to create, modify, and delete tables and table views.<br>Allows users to create, delete, and view user-defined functions.<br>Allows users to create, start, stop, delete, and view PIPEs. Allows users to create, delete, and view PIPEPLUGINS.<br>Allows users to query and cancel queries. Allows users to view variables. Allows users to view cluster status.<br>Allows users to create, delete, and view deep learning models. |
| Global Privileges | SECURITY | Global | Allows users to create users.<br>Allows users to delete users.<br>Allows users to modify user passwords.<br>Allows users to view user privilege information.<br>Allows users to list all users.<br>Allows users to create roles.<br>Allows users to delete roles.<br>Allows users to view role privilege information.<br>Allows users to grant a role to a user or revoke it.<br>Allows users to list all roles. |
| Global Privileges | AUDIT | Global | Allows users to maintain audit log rules and view audit logs. |
| Data Privileges | CREATE | ANY | Allows creating any table and any database. |
| Data Privileges | CREATE | Database | Allows users to create tables under this database; allows users to create a database with this name. |
| Data Privileges | CREATE | Table | Allows users to create a table with this name. |
| Data Privileges | ALTER | ANY | Allows modifying the definition of any table and any database. |
| Data Privileges | ALTER | Database | Allows users to modify the definition of a database and the definitions of tables under that database. |
| Data Privileges | ALTER | Table | Allows users to modify the definition of a table. |
| Data Privileges | SELECT | ANY | Allows querying data from any table in any database in the system. |
| Data Privileges | SELECT | Database | Allows users to query data from any table in this database. |
| Data Privileges | SELECT | Table | Allows users to query data in this table. When executing multi-table queries, the database only displays data that the user has permission to access. |
| Data Privileges | INSERT | ANY | Allows inserting/updating data into any table in any database. |
| Data Privileges | INSERT | Database | Allows users to insert/update data into any table within the scope of this database. |
| Data Privileges | INSERT | Table | Allows users to insert/update data into this table. |
| Data Privileges | DELETE | ANY | Allows deleting data from any table. |
| Data Privileges | DELETE | Database | Allows users to delete data within the scope of this database. |
| Data Privileges | DELETE | Table | Allows users to delete data from this table. |
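
The three data-privilege scopes in the privilege list (ANY, Database, Table) correspond to three forms of the `ON` clause, while global privileges take no `ON` clause at all. The sketch below grants the same `SELECT` privilege at each scope, using the statement forms given in section 2.2.1 of this manual. The names `testdb`, `testtable`, `test_user`, and `test_role` are hypothetical and only serve as illustration.

```SQL
-- Hypothetical object names: testdb, testtable, test_user, test_role.

-- ANY scope: query every table in every database.
GRANT SELECT ON ANY TO ROLE test_role;

-- Database scope: query every table inside testdb only.
GRANT SELECT ON DATABASE testdb TO USER test_user;

-- Table scope: query a single table only.
GRANT SELECT ON testdb.testtable TO USER test_user;

-- Global privilege: no ON clause, because the scope is always global.
GRANT SECURITY TO USER test_user;
```

Granting at the narrowest scope that covers the need keeps later revocation simple, since revoking a privilege at one scope is described in section 2.2.2 as leaving privileges at narrower scopes untouched.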
- -## 2. SQL Statements - -### 2.1 User and Role Management - -1. Create User (Requires SECURITY privilege) - -```SQL -CREATE USER <USERNAME> <PASSWORD> -eg: CREATE USER user1 'Passwd@202604'; -``` - -2. Change Password - -Users can change their own passwords, but changing other users' passwords requires the SECURITY privilege. - -```SQL -ALTER USER <USERNAME> SET PASSWORD <PASSWORD> -eg: ALTER USER tempuser SET PASSWORD 'Newpwd@202604'; -``` - -3. Drop User (Requires SECURITY privilege) - -```SQL -DROP USER <USERNAME> -eg: DROP USER user1; -``` - -4. Create Role (Requires SECURITY privilege) - -```SQL -CREATE ROLE <ROLENAME> -eg: CREATE ROLE role1; -``` - -5. Drop Role (Requires SECURITY privilege) - -```SQL -DROP ROLE <ROLENAME> -eg: DROP ROLE role1; -``` - -6. Grant Role to User (Requires SECURITY privilege) - -```SQL -GRANT ROLE <ROLENAME> TO <USERNAME> -eg: GRANT ROLE admin TO user1; -``` - -7. Revoke Role from User (Requires SECURITY privilege) - -```SQL -REVOKE ROLE <ROLENAME> FROM <USERNAME> -eg: REVOKE ROLE admin FROM user1; -``` - -8. List All Users (Requires SECURITY privilege) - -```SQL -LIST USER; -``` - -9. List All Roles (Requires SECURITY privilege) - -```SQL -LIST ROLE; -``` - -10. List All Users Under a Specified Role (Requires SECURITY privilege) - -```SQL -LIST USER OF ROLE <ROLENAME> -eg: LIST USER OF ROLE roleuser; -``` - -11. List All Roles of a Specified User - -Users can list their own roles, but listing other users' roles requires the SECURITY privilege. - -```SQL -LIST ROLE OF USER <USERNAME> -eg: LIST ROLE OF USER tempuser; -``` - -12. List All Privileges of a User - -Users can list their own privilege information, but listing other users' privileges requires the SECURITY privilege. - -```SQL -LIST PRIVILEGES OF USER <USERNAME> -eg: LIST PRIVILEGES OF USER tempuser; -``` - -13. List All Privileges of a Role - -Users can list the privilege information of roles they possess, but listing other roles' privileges requires the SECURITY privilege. - -```SQL -LIST PRIVILEGES OF ROLE <ROLENAME> -eg: LIST PRIVILEGES OF ROLE actor; -``` - -### 2.2 Privilege Management - -#### 2.2.1 Grant Privileges - -1.
Grant user management privileges to a user - -```SQL -GRANT SECURITY TO USER <USERNAME> -eg: GRANT SECURITY TO USER TEST_USER; -``` - -2. Grant a user the privilege to create databases and create tables within the database scope, and allow the user to manage privileges within that scope - -```SQL -GRANT CREATE ON DATABASE <DATABASE> TO USER <USERNAME> WITH GRANT OPTION -eg: GRANT CREATE ON DATABASE TESTDB TO USER TEST_USER WITH GRANT OPTION; -``` - -3. Grant a role the privilege to query a database - -```SQL -GRANT SELECT ON DATABASE <DATABASE> TO ROLE <ROLENAME> -eg: GRANT SELECT ON DATABASE TESTDB TO ROLE TEST_ROLE; -``` - -4. Grant a user the privilege to query a table - -```SQL -GRANT SELECT ON <DATABASE>.<TABLENAME> TO USER <USERNAME> -eg: GRANT SELECT ON TESTDB.TESTTABLE TO USER TEST_USER; -``` - -5. Grant a role the privilege to query all databases and tables - -```SQL -GRANT SELECT ON ANY TO ROLE <ROLENAME> -eg: GRANT SELECT ON ANY TO ROLE TEST_ROLE; -``` - -6. ALL Syntax Sugar: ALL represents all privileges within the object scope. You can use the ALL keyword to grant privileges flexibly. - -```SQL -GRANT ALL TO USER TESTUSER; --- Grants all privileges available to the user, including global privileges and all data privileges in the ANY scope - -GRANT ALL ON ANY TO USER TESTUSER; --- Grants all data privileges available in the ANY scope to the user. After executing this statement, the user will have all data privileges on all databases - -GRANT ALL ON DATABASE TESTDB TO USER TESTUSER; --- Grants all data privileges available in the DB scope to the user. After executing this statement, the user will have all data privileges on this database - -GRANT ALL ON TABLE TESTTABLE TO USER TESTUSER; --- Grants all data privileges available in the TABLE scope to the user. After executing this statement, the user will have all data privileges on this table -``` - -#### 2.2.2 Revoke Privileges - -1. Revoke user management privileges from a user - -```SQL -REVOKE SECURITY FROM USER <USERNAME> -eg: REVOKE SECURITY FROM USER TEST_USER; -``` - -2.
Revoke a user's privilege to create databases and create tables within the database scope - -```SQL -REVOKE CREATE ON DATABASE <DATABASE> FROM USER <USERNAME> -eg: REVOKE CREATE ON DATABASE TEST_DB FROM USER TEST_USER; -``` - -3. Revoke a user's privilege to query a table - -```SQL -REVOKE SELECT ON <DATABASE>.<TABLENAME> FROM USER <USERNAME> -eg: REVOKE SELECT ON TESTDB.TESTTABLE FROM USER TEST_USER; -``` - -4. Revoke a user's privilege to query all databases and tables - -```SQL -REVOKE SELECT ON ANY FROM USER <USERNAME> -eg: REVOKE SELECT ON ANY FROM USER TEST_USER; -``` - -5. ALL Syntax Sugar: ALL represents all privileges within the object scope. You can use the ALL keyword to revoke privileges flexibly. - -```SQL -REVOKE ALL FROM USER TESTUSER; --- Revokes all global privileges and all data privileges in the ANY scope from the user - -REVOKE ALL ON ANY FROM USER TESTUSER; --- Revokes all data privileges in the ANY scope from the user, and does not affect DB-scope and TABLE-scope privileges - -REVOKE ALL ON DATABASE TESTDB FROM USER TESTUSER; --- Revokes all data privileges on the DB from the user, and does not affect TABLE privileges - -REVOKE ALL ON TABLE TESTTABLE FROM USER TESTUSER; --- Revokes all data privileges on the TABLE from the user -``` - -#### 2.2.3 View User Privileges - -```SQL -LIST PRIVILEGES OF USER <USERNAME> -eg: LIST PRIVILEGES OF USER tempuser -``` \ No newline at end of file diff --git a/src/UserGuide/latest-Table/SQL-Manual/SQL-Data-Addition-Deletion_timecho.md b/src/UserGuide/latest-Table/SQL-Manual/SQL-Data-Addition-Deletion_timecho.md deleted file mode 100644 index fc9593436..000000000 --- a/src/UserGuide/latest-Table/SQL-Manual/SQL-Data-Addition-Deletion_timecho.md +++ /dev/null @@ -1,172 +0,0 @@ - - -# Data Addition & Deletion - -## 1. Data Insertion - -**Syntax:** - -```SQL -INSERT INTO <TABLE_NAME> [(COLUMN_NAME[, COLUMN_NAME]*)]?
VALUES (COLUMN_VALUE[, COLUMN_VALUE]*) -``` - -[Detailed syntax reference](../Basic-Concept/Write-Updata-Data_timecho.md#_1-1-syntax) - -**Example 1: Specified Columns Insertion** - -```SQL -INSERT INTO table1(region, plant_id, device_id, time, temperature, humidity) VALUES ('Hamburg', '1001', '100', '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, time, temperature) VALUES ('Hamburg', '1001', '100', '2025-11-26 13:38:00', 91.0); -``` - -**Example 2: NULL Value Insertion** - -Equivalent to the example above -```SQL -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('Hamburg', '1001', '100', null, null, '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('Hamburg', '1001', '100', null, null, '2025-11-26 13:38:00', 91.0, null); -``` - -**Example 3: Multi-row Insertion** - -```SQL -INSERT INTO table1 -VALUES -('2025-11-26 13:37:00', 'Frankfurt', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('2025-11-26 13:38:00', 'Frankfurt', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:38:25'); - -INSERT INTO table1 -(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) -VALUES -('Frankfurt', '1001', '100', 'A', '180', '2025-11-26 13:37:00', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('Frankfurt', '1001', '100', 'A', '180', '2025-11-26 13:38:00', 90.0, 35.1, true, '2025-11-26 13:38:25'); -``` - -**Example 4: Query Write-back** - -```SQL -insert into target_table select time,region,device_id,temperature from table1 where region = 'bj'; - -insert into target_table(time,device_id,temperature) table table3; - -insert into target_table (select t1.time, t1.region as region, t1.device_id as device_id, t1.temperature as temperature from table1 t1 where t1.time in (select t2.time from table2 t2 where t2.region = 
'shh')); -``` - - -## 2. Data Update - -**Syntax:** - -```SQL -UPDATE <TABLE_NAME> SET updateAssignment (',' updateAssignment)* (WHERE where=booleanExpression)? - -updateAssignment - : identifier EQ expression - ; -``` - -[Detailed syntax reference](../Basic-Concept/Write-Updata-Data_timecho.md#_2-1-syntax) - -**Example:** - -```SQL -update table1 set b = a where substring(a, 1, 1) like '%'; -``` - -## 3. Data Deletion - -**Syntax:** - -```SQL -DELETE FROM <TABLE_NAME> [WHERE_CLAUSE]? - -WHERE_CLAUSE: - WHERE DELETE_CONDITION - -DELETE_CONDITION: - SINGLE_CONDITION - | DELETE_CONDITION AND DELETE_CONDITION - | DELETE_CONDITION OR DELETE_CONDITION - -SINGLE_CONDITION: - TIME_CONDITION | ID_CONDITION - -TIME_CONDITION: - time TIME_OPERATOR LONG_LITERAL - -TIME_OPERATOR: - < | > | <= | >= | = - -ID_CONDITION: - identifier = STRING_LITERAL -``` - -**Example 1: Full Table Deletion** - -```SQL -DELETE FROM table1; -``` - -**Example 2: Time-range Deletion** - -Single time range -```SQL -DELETE FROM table1 WHERE time <= 2024-11-29 00:00:00; -``` -Multiple time ranges -```sql -DELETE FROM table1 WHERE time >= 2024-11-27 00:00:00 and time <= 2024-11-29 00:00:00; -``` - -**Example 3: Device-Specific Deletion** - -Delete data for a specific device -```SQL -DELETE FROM table1 -WHERE device_id='101' AND model_id = 'B'; -``` -Delete data for a device within a time range -```sql -DELETE FROM table1 -WHERE time >= '2024-11-27 16:39:00' AND time <= '2024-11-29 16:42:00' - AND device_id='101' AND model_id = 'B'; -``` -Delete data for a specific device model -```sql -DELETE FROM table1 WHERE model_id = 'B'; -``` - -## 4. Device Deletion - -**Syntax:** - -```SQL -DELETE DEVICES FROM tableName=qualifiedName (WHERE booleanExpression)?
-``` - -**Example: Delete specified device and all associated data** - -```SQL -DELETE DEVICES FROM table1 WHERE device_id = '101'; -``` diff --git a/src/UserGuide/latest-Table/SQL-Manual/SQL-Data-Sync_timecho.md b/src/UserGuide/latest-Table/SQL-Manual/SQL-Data-Sync_timecho.md deleted file mode 100644 index 41eff7eeb..000000000 --- a/src/UserGuide/latest-Table/SQL-Manual/SQL-Data-Sync_timecho.md +++ /dev/null @@ -1,321 +0,0 @@ - - -# Data Sync - -This document lists the SQL statements for the data synchronization function. For a detailed introduction and usage instructions, see [Data Sync](../User-Manual/Data-Sync_timecho.md). - -## 1. Create Task - -**Syntax:** - -```SQL -CREATE PIPE [IF NOT EXISTS] <PipeId> -- PipeId is the name that uniquely identifies the task --- Data extraction plugin, optional -WITH SOURCE ( - [<parameter> = <value>,], -) --- Data processing plugin, optional -WITH PROCESSOR ( - [<parameter> = <value>,], -) --- Data connection plugin, required -WITH SINK ( - [<parameter> = <value>,], -) -``` - -**Example 1: Full Data Synchronization** - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -**Example 2: Partial Data Synchronization** - -```SQL -create pipe A2B -WITH SOURCE ( - 'source'= 'iotdb-source', - 'mode.streaming' = 'true', - 'database-name'='db_b.*', - 'start-time' = '2023.08.23T08:00:00+00:00', - 'end-time' = '2023.10.23T08:00:00+00:00' -) -with SINK ( - 'sink'='iotdb-thrift-async-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -**Example 3: Bidirectional Data Transmission** - -* Execute the following statement on IoTDB A - -```SQL -create pipe AB -with source ( - 'source.mode.double-living' ='true' -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -* Execute the following statement on IoTDB B - -```SQL -create pipe BA -with source ( - 'source.mode.double-living' ='true' -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -) -``` - -**Example 4:
Edge-Cloud Data Transmission** - -* Execute the following statement on IoTDB B to synchronize data from B to A - -```SQL -create pipe BA -with source ( - 'database-name'='db_b.*', - 'table-name'='.*', -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -) -``` - -* Execute the following statement on IoTDB C to synchronize data from C to A - -```SQL -create pipe CA -with source ( - 'database-name'='db_c.*', - 'table-name'='.*', -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -* Execute the following statement on IoTDB D to synchronize data from D to A - -```SQL -create pipe DA -with source ( - 'database-name'='db_d.*', - 'table-name'='.*', -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -) -``` - -**Example 5: Cascaded Data Transmission** - -* Execute the following statement on IoTDB A to synchronize data from A to B - -```SQL -create pipe AB -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -* Execute the following statement on IoTDB B to synchronize data from B to C - -```SQL -create pipe BC -with source ( -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -) -``` - -**Example 6: Cross-Gap Data Transmission** - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-air-gap-sink', - 'node-urls' = '10.53.53.53:9780', -) -``` - -**Example 7: Compressed Synchronization** - -```SQL -create pipe A2B -with sink ( - 'node-urls' = '127.0.0.1:6668', - 'compressor' = 'snappy,lz4', - 'rate-limit-bytes-per-second'='1048576' -) -``` - -**Example 8: Encrypted Synchronization** - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-ssl-sink', - 'node-urls'='127.0.0.1:6667', - 'ssl.trust-store-path'='pki/trusted', - 'ssl.trust-store-pwd'='root' -) -``` - -**Example 9: Local Export of Object Type Data** - -```SQL -CREATE PIPE tsfile_export_local -WITH SOURCE ( - 'source' = 'iotdb-source', - 'table-name' = 
'test_table' -) -WITH PROCESSOR ( - 'processor' = 'do-nothing-processor' -) -WITH SINK ( - 'sink' = 'tsfile-local-sink', - 'sink.local.target-path' = '/data/backup/export_2024', - 'sink.rate-limit-bytes-per-second' = '10485760' -); -``` - -**Example 10: Remote Transmission of Object Type Data** - -* This method requires the `tsfile_remote_sink` plugin to be registered in advance - -```SQL -CREATE PIPE tsfile_export_scp -WITH SOURCE ( - 'source' = 'iotdb-source', - 'table-name' = 'test_table' -) -WITH PROCESSOR ( - 'processor' = 'do-nothing-processor' -) -WITH SINK ( - 'sink' = 'tsfile_remote_sink', - 'sink.file-mode' = 'scp', - 'sink.scp.host' = '192.168.1.100', - 'sink.scp.port' = '22', - 'sink.scp.user' = 'backup_user', - 'sink.scp.password' = 'ComplexPass123!', - 'sink.scp.remote-path' = '/remote/archive/', - 'sink.rate-limit-bytes-per-second' = '10485760' -); -``` - -## 2. Start Task - -**Syntax:** - -```SQL -START PIPE <PipeId> -``` - -**Example:** - -```SQL -START PIPE A2B -``` - -## 3. Stop Task - -**Syntax:** - -```SQL -STOP PIPE <PipeId> -``` - -**Example:** - -```SQL -STOP PIPE A2B -``` - -## 4. Drop Task - -**Syntax:** - -```SQL -DROP PIPE [IF EXISTS] <PipeId> -``` - -**Example:** - -```SQL -DROP PIPE IF EXISTS A2B -``` - -## 5. Show Tasks - -**Syntax:** - -```SQL --- Show all tasks -SHOW PIPES --- Show a specific task -SHOW PIPE <PipeId> -``` - -**Example:** - -```SQL -SHOW PIPES - -SHOW PIPE A2B -``` - -## 6. Alter Task - -**Syntax:** - -```SQL -ALTER PIPE [IF EXISTS] <PipeId> - MODIFY/REPLACE SOURCE(...) - MODIFY/REPLACE PROCESSOR(...) - MODIFY/REPLACE SINK(...)
-``` - -**Example:** - -```SQL -ALTER PIPE A2B REPLACE SINK ('sink'='iotdb-thrift-sink', 'node-urls' = '127.0.0.1:6668'); -``` \ No newline at end of file diff --git a/src/UserGuide/latest-Table/SQL-Manual/SQL-Maintenance-Statements_timecho.md b/src/UserGuide/latest-Table/SQL-Manual/SQL-Maintenance-Statements_timecho.md deleted file mode 100644 index 7b27761cc..000000000 --- a/src/UserGuide/latest-Table/SQL-Manual/SQL-Maintenance-Statements_timecho.md +++ /dev/null @@ -1,686 +0,0 @@ - - -# Management Statements - -## 1. Status Inspection - -### 1.1 View Current Tree/Table Mode - -**Syntax:** - -```SQL -showCurrentSqlDialectStatement - : SHOW CURRENT_SQL_DIALECT - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW CURRENT_SQL_DIALECT -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 1.2 View Current User - -**Syntax:** - -```SQL -showCurrentUserStatement - : SHOW CURRENT_USER - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW CURRENT_USER -+-----------+ -|CurrentUser| -+-----------+ -| root| -+-----------+ -``` - -### 1.3 View Connected Database - -**Syntax:** - -```SQL -showCurrentDatabaseStatement - : SHOW CURRENT_DATABASE - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW CURRENT_DATABASE; -+---------------+ -|CurrentDatabase| -+---------------+ -| null| -+---------------+ - -IoTDB> USE test; - -IoTDB> SHOW CURRENT_DATABASE; -+---------------+ -|CurrentDatabase| -+---------------+ -| test| -+---------------+ -``` - -### 1.4 View Cluster Version - -**Syntax:** - -```SQL -showVersionStatement - : SHOW VERSION - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW VERSION -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.1.2| 1ca4008| -+-------+---------+ -``` - -### 1.5 View Key Cluster Parameters - -**Syntax:** - -```SQL -showVariablesStatement - : SHOW VARIABLES - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW VARIABLES 
-+----------------------------------+-----------------------------------------------------------------+ -| Variable| Value| -+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - -### 1.6 View Cluster ID - -**Syntax:** - -```SQL -showClusterIdStatement - : SHOW (CLUSTERID | CLUSTER_ID) - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW CLUSTER_ID -+------------------------------------+ -| ClusterId| -+------------------------------------+ -|40163007-9ec1-4455-aa36-8055d740fcda| -``` - -### 1.7 View Server Time - -Shows time of the DataNode server directly connected to client - -**Syntax:** - -```SQL -showCurrentTimestampStatement - : SHOW CURRENT_TIMESTAMP - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW CURRENT_TIMESTAMP -+-----------------------------+ -| CurrentTimestamp| -+-----------------------------+ -|2025-02-17T11:11:52.987+08:00| -+-----------------------------+ -``` - - -### 1.8 View Region Information - -**Description**: Displays regions' information of the current cluster. 
- -**Syntax**: - -```SQL -showRegionsStatement - : SHOW REGIONS - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW REGIONS -``` - -**Result**: - -```SQL -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -|RegionId| Type| Status| Database|SeriesSlotNum|TimeSlotNum|DataNodeId|RpcAddress|RpcPort|InternalAddress| Role| CreateTime|TsFileSize| -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -| 6|SchemaRegion|Running|tcollector| 670| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.194| | -| 7| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.196| 169.85 KB| -| 8| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.198| 161.63 KB| -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -``` - -### 1.9 View Available Nodes - -**Description**: Returns the RPC addresses and ports of all available DataNodes in the current cluster. Note: A DataNode is considered "available" if it is not in the REMOVING state. - -> This feature is supported starting from v2.0.8. 
- -**Syntax**: - -```SQL -showAvailableUrlsStatement - : SHOW AVAILABLE URLS - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW AVAILABLE URLS -``` - -**Result**: - -```SQL -+----------+-------+ -|RpcAddress|RpcPort| -+----------+-------+ -| 0.0.0.0| 6667| -+----------+-------+ -``` - -### 1.10 View Service Information - -> Supported since V2.0.8.2 - -**Syntax**: - -```sql -showServicesStatement - : SHOW SERVICES - ; -``` - -**Example**: - -```sql -IoTDB> SHOW SERVICES -IoTDB> SHOW SERVICES ON 1 -``` - -**Result**: - -```sql -+--------------+-------------+---------+ -| Service Name | DataNode ID | State | -+--------------+-------------+---------+ -| MQTT | 1 | STOPPED | -| REST | 1 | RUNNING | -+--------------+-------------+---------+ -``` - -### 1.11 View Cluster Activation Status - -**Syntax**: - -```SQL -showActivationStatement - : SHOW ACTIVATION - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW ACTIVATION -``` - -**Result**: - -```SQL -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -## 2. Status Configuration - -### 2.1 Set Connection Tree/Table Mode - -**Syntax:** - -```SQL -SET SQL_DIALECT EQ (TABLE | TREE) -``` - -**Example:** - -```SQL -IoTDB> SET SQL_DIALECT=TABLE -IoTDB> SHOW CURRENT_SQL_DIALECT -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 2.2 Update Configuration Items - -**Syntax:** - -```SQL -setConfigurationStatement - : SET CONFIGURATION propertyAssignments (ON INTEGER_VALUE)? 
- ; - -propertyAssignments - : property (',' property)* - ; - -property - : identifier EQ propertyValue - ; - -propertyValue - : DEFAULT - | expression - ; -``` - -**Example:** - -```SQL -IoTDB> SET CONFIGURATION disk_space_warning_threshold='0.05',heartbeat_interval_in_ms='1000' ON 1; -``` - -### 2.3 Load Manually Modified Configuration - -**Syntax:** - -```SQL -loadConfigurationStatement - : LOAD CONFIGURATION localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Example:** - -```SQL -IoTDB> LOAD CONFIGURATION ON LOCAL; -``` - -### 2.4 Set System Status - -**Syntax:** - -```SQL -setSystemStatusStatement - : SET SYSTEM TO (READONLY | RUNNING) localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Example:** - -```SQL -IoTDB> SET SYSTEM TO READONLY ON CLUSTER; -``` - -## 3. Data Management - -### 3.1 Flush Memory Table to Disk - -**Syntax:** - -```SQL -flushStatement - : FLUSH identifier? (',' identifier)* booleanValue? localOrClusterMode? - ; - -booleanValue - : TRUE | FALSE - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Example:** - -```SQL -IoTDB> FLUSH test_db TRUE ON LOCAL; -``` - -## 4. Data Repair - -### 4.1 Start Background TsFile Repair - -**Syntax:** - -```SQL -startRepairDataStatement - : START REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Example:** - -```SQL -IoTDB> START REPAIR DATA ON CLUSTER; -``` - -### 4.2 Pause TsFile Repair - -**Syntax:** - -```SQL -stopRepairDataStatement - : STOP REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Example:** - -```SQL -IoTDB> STOP REPAIR DATA ON CLUSTER; -``` - -## 5. Query Operations - -### 5.1 View Active Queries - -**Syntax:** - -```SQL -showQueriesStatement - : SHOW (QUERIES | QUERY PROCESSLIST) - (WHERE where=booleanExpression)? - (ORDER BY sortItem (',' sortItem)*)? 
- limitOffsetClause - ; -``` - -**Example:** - -```SQL -IoTDB> SHOW QUERIES WHERE elapsed_time > 30 -+-----------------------+-----------------------------+-----------+------------+------------+----+ -| query_id| start_time|datanode_id|elapsed_time| statement|user| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -|20250108_101015_00000_1|2025-01-08T18:10:15.935+08:00| 1| 32.283|show queries|root| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -``` - -### 5.2 Terminate Queries - -**Syntax:** - -```SQL -killQueryStatement - : KILL (QUERY queryId=string | ALL QUERIES) - ; -``` - -**Example:** - -```SQL -IoTDB> KILL QUERY '20250108_101015_00000_1'; -- terminate a specific query -IoTDB> KILL ALL QUERIES; -- terminate all queries -``` - -### 5.3 Query Performance Analysis - -#### 5.3.1 View Execution Plan - -**Syntax:** - -```SQL -EXPLAIN <SELECT_STATEMENT> -``` - -Detailed syntax reference: [EXPLAIN STATEMENT](../User-Manual/Query-Performance-Analysis.md#_1-explain-statement) - -**Example:** - -```SQL -IoTDB> explain select * from t1 -+-----------------------------------------------------------------------------------------------+ -| distribution plan| -+-----------------------------------------------------------------------------------------------+ -| ┌─────────────────────────────────────────────┐ | -| │OutputNode-4 │ | -| │OutputColumns-[time, device_id, type, speed] │ | -| │OutputSymbols: [time, device_id, type, speed]│ | -| └─────────────────────────────────────────────┘ | -| │ | -| │ | -| ┌─────────────────────────────────────────────┐ | -| │Collect-21 │ | -| │OutputSymbols: [time, device_id, type, speed]│ | -| └─────────────────────────────────────────────┘ | -| ┌───────────────────────┴───────────────────────┐ | -| │ │ | -|┌─────────────────────────────────────────────┐ ┌───────────┐ | -|│TableScan-19 │ │Exchange-28│ | -|│QualifiedTableName: test.t1 │ └───────────┘ | -|│OutputSymbols:
[time, device_id, type, speed]│ │ | -|│DeviceNumber: 1 │ │ | -|│ScanOrder: ASC │ ┌─────────────────────────────────────────────┐| -|│PushDownOffset: 0 │ │TableScan-20 │| -|│PushDownLimit: 0 │ │QualifiedTableName: test.t1 │| -|│PushDownLimitToEachDevice: false │ │OutputSymbols: [time, device_id, type, speed]│| -|│RegionId: 2 │ │DeviceNumber: 1 │| -|└─────────────────────────────────────────────┘ │ScanOrder: ASC │| -| │PushDownOffset: 0 │| -| │PushDownLimit: 0 │| -| │PushDownLimitToEachDevice: false │| -| │RegionId: 1 │| -| └─────────────────────────────────────────────┘| -+-----------------------------------------------------------------------------------------------+ -``` - -#### 5.3.2 Analyze Query Performance - -**Syntax:** - -```SQL -EXPLAIN ANALYZE [VERBOSE] -``` - -Detailed syntax reference: [EXPLAIN ANALYZE STATEMENT](../User-Manual/Query-Performance-Analysis.md#_2-explain-analyze-statement) - -**Example:** - -```SQL -IoTDB> explain analyze verbose select * from t1 -+-----------------------------------------------------------------------------------------------+ -| Explain Analyze| -+-----------------------------------------------------------------------------------------------+ -|Analyze Cost: 38.860 ms | -|Fetch Partition Cost: 9.888 ms | -|Fetch Schema Cost: 54.046 ms | -|Logical Plan Cost: 10.102 ms | -|Logical Optimization Cost: 17.396 ms | -|Distribution Plan Cost: 2.508 ms | -|Dispatch Cost: 22.126 ms | -|Fragment Instances Count: 2 | -| | -|FRAGMENT-INSTANCE[Id: 20241127_090849_00009_1.2.0][IP: 0.0.0.0][DataRegion: 2][State: FINISHED]| -| Total Wall Time: 18 ms | -| Cost of initDataQuerySource: 6.153 ms | -| Seq File(unclosed): 1, Seq File(closed): 0 | -| UnSeq File(unclosed): 0, UnSeq File(closed): 0 | -| ready queued time: 0.164 ms, blocked queued time: 0.342 ms | -| Query Statistics: | -| loadBloomFilterFromCacheCount: 0 | -| loadBloomFilterFromDiskCount: 0 | -| loadBloomFilterActualIOSize: 0 | -| loadBloomFilterTime: 0.000 | -| 
loadTimeSeriesMetadataAlignedMemSeqCount: 1 | -| loadTimeSeriesMetadataAlignedMemSeqTime: 0.246 | -| loadTimeSeriesMetadataFromCacheCount: 0 | -| loadTimeSeriesMetadataFromDiskCount: 0 | -| loadTimeSeriesMetadataActualIOSize: 0 | -| constructAlignedChunkReadersMemCount: 1 | -| constructAlignedChunkReadersMemTime: 0.294 | -| loadChunkFromCacheCount: 0 | -| loadChunkFromDiskCount: 0 | -| loadChunkActualIOSize: 0 | -| pageReadersDecodeAlignedMemCount: 1 | -| pageReadersDecodeAlignedMemTime: 0.047 | -| [PlanNodeId 43]: IdentitySinkNode(IdentitySinkOperator) | -| CPU Time: 5.523 ms | -| output: 2 rows | -| HasNext() Called Count: 6 | -| Next() Called Count: 5 | -| Estimated Memory Size: : 327680 | -| [PlanNodeId 31]: CollectNode(CollectOperator) | -| CPU Time: 5.512 ms | -| output: 2 rows | -| HasNext() Called Count: 6 | -| Next() Called Count: 5 | -| Estimated Memory Size: : 327680 | -| [PlanNodeId 29]: TableScanNode(TableScanOperator) | -| CPU Time: 5.439 ms | -| output: 1 rows | -| HasNext() Called Count: 3 -| Next() Called Count: 2 | -| Estimated Memory Size: : 327680 | -| DeviceNumber: 1 | -| CurrentDeviceIndex: 0 | -| [PlanNodeId 40]: ExchangeNode(ExchangeOperator) | -| CPU Time: 0.053 ms | -| output: 1 rows | -| HasNext() Called Count: 2 | -| Next() Called Count: 1 | -| Estimated Memory Size: : 131072 | -| | -|FRAGMENT-INSTANCE[Id: 20241127_090849_00009_1.3.0][IP: 0.0.0.0][DataRegion: 1][State: FINISHED]| -| Total Wall Time: 13 ms | -| Cost of initDataQuerySource: 5.725 ms | -| Seq File(unclosed): 1, Seq File(closed): 0 | -| UnSeq File(unclosed): 0, UnSeq File(closed): 0 | -| ready queued time: 0.118 ms, blocked queued time: 5.844 ms | -| Query Statistics: | -| loadBloomFilterFromCacheCount: 0 | -| loadBloomFilterFromDiskCount: 0 | -| loadBloomFilterActualIOSize: 0 | -| loadBloomFilterTime: 0.000 | -| loadTimeSeriesMetadataAlignedMemSeqCount: 1 | -| loadTimeSeriesMetadataAlignedMemSeqTime: 0.004 | -| loadTimeSeriesMetadataFromCacheCount: 0 | -| 
loadTimeSeriesMetadataFromDiskCount: 0 | -| loadTimeSeriesMetadataActualIOSize: 0 | -| constructAlignedChunkReadersMemCount: 1 | -| constructAlignedChunkReadersMemTime: 0.001 | -| loadChunkFromCacheCount: 0 | -| loadChunkFromDiskCount: 0 | -| loadChunkActualIOSize: 0 | -| pageReadersDecodeAlignedMemCount: 1 | -| pageReadersDecodeAlignedMemTime: 0.007 | -| [PlanNodeId 42]: IdentitySinkNode(IdentitySinkOperator) | -| CPU Time: 0.270 ms | -| output: 1 rows | -| HasNext() Called Count: 3 | -| Next() Called Count: 2 | -| Estimated Memory Size: : 327680 | -| [PlanNodeId 30]: TableScanNode(TableScanOperator) | -| CPU Time: 0.250 ms | -| output: 1 rows | -| HasNext() Called Count: 3 | -| Next() Called Count: 2 | -| Estimated Memory Size: : 327680 | -| DeviceNumber: 1 | -| CurrentDeviceIndex: 0 | -+-----------------------------------------------------------------------------------------------+ -``` diff --git a/src/UserGuide/latest-Table/SQL-Manual/SQL-Metadata-Operations_timecho.md b/src/UserGuide/latest-Table/SQL-Manual/SQL-Metadata-Operations_timecho.md deleted file mode 100644 index 7b6107b4e..000000000 --- a/src/UserGuide/latest-Table/SQL-Manual/SQL-Metadata-Operations_timecho.md +++ /dev/null @@ -1,385 +0,0 @@ - - -# Metadata Operations - -## 1. Database Management - -### 1.1 Create Database - -**Syntax:** - -```SQL -CREATE DATABASE (IF NOT EXISTS)? (WITH properties)? 
-``` - -[Detailed syntax reference](../Basic-Concept/Database-Management_timecho.md#_1-1-create-a-database) - -**Examples:** - -```SQL -CREATE DATABASE database1; -CREATE DATABASE IF NOT EXISTS database1; - --- Create database with 1-year TTL; -CREATE DATABASE IF NOT EXISTS database1 with(TTL=31536000000); -``` - -### 1.2 Use Database - -**Syntax:** - -```SQL -USE -``` - -**Examples:** - -```SQL -USE database1; -``` - -### 1.3 View Current Database - -**Syntax:** - -```SQL -SHOW CURRENT_DATABASE; -``` - -**Examples:** - -```SQL -SHOW CURRENT_DATABASE; -``` -```shell -+---------------+ -|CurrentDatabase| -+---------------+ -| null| -+---------------+ -``` -```sql -USE database1; -SHOW CURRENT_DATABASE; -``` -```shell -+---------------+ -|CurrentDatabase| -+---------------+ -| database1| -+---------------+ -``` - -### 1.4 List All Databases - -**Syntax:** - -```SQL -SHOW DATABASES (DETAILS)? -``` - -[Detailed syntax reference](../Basic-Concept/Database-Management_timecho.md#_1-4-view-all-databases) - -**Examples:** - -```SQL -show databases; -``` -```shell -+------------------+-------+-----------------------+---------------------+---------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval| -+------------------+-------+-----------------------+---------------------+---------------------+ -| database1| INF| 1| 1| 604800000| -|information_schema| INF| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+ -``` -```sql -show databases details; -``` -```shell -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|DataRegionGroupNum| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| 
database1| INF| 1| 1| 604800000| 1| 2| -|information_schema| INF| null| null| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -``` - -### 1.5 Modify Database - -**Syntax:** - -```SQL -ALTER DATABASE (IF EXISTS)? database=identifier SET PROPERTIES propertyAssignments -``` - -**Examples:** - -```SQL -ALTER DATABASE database1 SET PROPERTIES TTL=31536000000; -``` - -### 1.6 Drop Database - -**Syntax:** - -```SQL -DROP DATABASE (IF EXISTS)? -``` - -**Examples:** - -```SQL -DROP DATABASE IF EXISTS database1; -``` - -## 2. Table Management - -### 2.1 Create Table - -**Syntax:** - -```SQL -createTableStatement - : CREATE TABLE (IF NOT EXISTS)? qualifiedName - '(' (columnDefinition (',' columnDefinition)*)? ')' - charsetDesc? - comment? - (WITH properties)? - ; - -charsetDesc - : DEFAULT? (CHAR SET | CHARSET | CHARACTER SET) EQ? identifierOrString - ; - -columnDefinition - : identifier columnCategory=(TAG | ATTRIBUTE | TIME) charsetName? comment? - | identifier type (columnCategory=(TAG | ATTRIBUTE | TIME | FIELD))? charsetName? comment? 
- ; - -charsetName - : CHAR SET identifier - | CHARSET identifier - | CHARACTER SET identifier - ; - -comment - : COMMENT string - ; -``` - -[Detailed syntax reference](../Basic-Concept/Table-Management_timecho.md#_1-1-create-a-table) - -**Examples:** - -```SQL -CREATE TABLE table1 ( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE COMMENT 'maintenance', - temperature FLOAT FIELD COMMENT 'temperature', - humidity FLOAT FIELD COMMENT 'humidity', - status BOOLEAN FIELD COMMENT 'status', - arrival_time TIMESTAMP FIELD COMMENT 'arrival_time' -) COMMENT 'table1' WITH (TTL=31536000000); - -CREATE TABLE if not exists tableB (); - -CREATE TABLE tableC ( - "Site" STRING TAG, - "Temperature" int32 FIELD COMMENT 'temperature' - ) with (TTL=DEFAULT); - ``` - -Custom time column: named `time_user_defined` and placed in the second column of the table. (Supported since V2.0.8.2) - ```sql - CREATE TABLE table1 ( - region STRING TAG, - time_user_defined TIMESTAMP TIME, - temperature FLOAT FIELD - ); -``` - -Note: If your terminal does not support multi-line paste (e.g., Windows CMD), reformat the SQL statement into a single line before executing it. - - -### 2.2 List Tables - -**Syntax:** - -```SQL -SHOW TABLES (DETAILS)? ((FROM | IN) database_name)? -``` - -**Examples:** - -```SQL -show tables from database1; -``` -```shell -+---------+---------------+ -|TableName| TTL(ms)| -+---------+---------------+ -| table1| 31536000000| -+---------+---------------+ -``` -```sql -show tables details from database1; -``` -```shell -+---------------+-----------+------+-------+ -| TableName| TTL(ms)|Status|Comment| -+---------------+-----------+------+-------+ -| table1|31536000000| USING| table1| -+---------------+-----------+------+-------+ -``` - -### 2.3 Describe Table Columns - -**Syntax:** - -```SQL -(DESC | DESCRIBE) (DETAILS)? 
-``` - -**Examples:** - -```SQL -desc table1; -``` -```shell -+------------+---------+---------+ -| ColumnName| DataType| Category| -+------------+---------+---------+ -| time|TIMESTAMP| TIME| -| region| STRING| TAG| -| plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model_id| STRING|ATTRIBUTE| -| maintenance| STRING|ATTRIBUTE| -| temperature| FLOAT| FIELD| -| humidity| FLOAT| FIELD| -| status| BOOLEAN| FIELD| -|arrival_time|TIMESTAMP| FIELD| -+------------+---------+---------+ -``` -```sql -desc table1 details; -``` -```shell -+------------+---------+---------+------+------------+ -| ColumnName| DataType| Category|Status| Comment| -+------------+---------+---------+------+------------+ -| time|TIMESTAMP| TIME| USING| null| -| region| STRING| TAG| USING| null| -| plant_id| STRING| TAG| USING| null| -| device_id| STRING| TAG| USING| null| -| model_id| STRING|ATTRIBUTE| USING| null| -| maintenance| STRING|ATTRIBUTE| USING| maintenance| -| temperature| FLOAT| FIELD| USING| temperature| -| humidity| FLOAT| FIELD| USING| humidity| -| status| BOOLEAN| FIELD| USING| status| -|arrival_time|TIMESTAMP| FIELD| USING|arrival_time| -+------------+---------+---------+------+------------+ -``` - - -### 2.4 View Table Creation Statement - -**Syntax:** - -```SQL -SHOW CREATE TABLE -``` - -**Examples:** - -```SQL -show create table table1; -``` -```shell -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| Table| Create Table| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -|table1|CREATE TABLE "table1" ("region" 
STRING TAG,"plant_id" STRING TAG,"device_id" STRING TAG,"model_id" STRING ATTRIBUTE,"maintenance" STRING ATTRIBUTE,"temperature" FLOAT FIELD,"humidity" FLOAT FIELD,"status" BOOLEAN FIELD,"arrival_time" TIMESTAMP FIELD) WITH (ttl=31536000000)| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -``` - - -### 2.5 Modify Table - -**Syntax:** - -```SQL -#addColumn; -ALTER TABLE (IF EXISTS)? tableName=qualifiedName ADD COLUMN (IF NOT EXISTS)? column=columnDefinition COMMENT 'column_comment'; -#dropColumn; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName DROP COLUMN (IF EXISTS)? column=identifier; -#setTableProperties; -// set TTL can use this; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName SET PROPERTIES propertyAssignments; -| COMMENT ON TABLE tableName=qualifiedName IS 'table_comment'; -| COMMENT ON COLUMN tableName.column IS 'column_comment'; -#changeColumndatatype; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName ALTER COLUMN (IF EXISTS)? column=identifier SET DATA TYPE new_type=type; -``` - -**Examples:** - -add column -```SQL -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS a TAG COMMENT 'a'; -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS b FLOAT FIELD COMMENT 'b'; -``` -set TTL -```SQL -ALTER TABLE table1 set properties TTL=3600; -``` -set comment -```SQL -COMMENT ON TABLE table1 IS 'table1'; -COMMENT ON COLUMN table1.a IS null; -``` -alter column datatype -```SQL -ALTER TABLE table1 ALTER COLUMN IF EXISTS b SET DATA TYPE DOUBLE; -``` - -### 2.6 Drop Table - -**Syntax:** - -```SQL -DROP TABLE (IF EXISTS)? 
-``` - -**Examples:** - -```SQL -DROP TABLE table1; -DROP TABLE database1.table1; -``` - - diff --git a/src/UserGuide/latest-Table/SQL-Manual/Select-Clause_timecho.md b/src/UserGuide/latest-Table/SQL-Manual/Select-Clause_timecho.md deleted file mode 100644 index a489f8a90..000000000 --- a/src/UserGuide/latest-Table/SQL-Manual/Select-Clause_timecho.md +++ /dev/null @@ -1,491 +0,0 @@ - - -# SELECT Clauses - -**SELECT Clause** specifies the columns included in the query results. - -## 1. Syntax Overview - -```sql -SELECT setQuantifier? selectItem (',' selectItem)* - -selectItem - : expression (AS? identifier)? #selectSingle - | tableName '.' ASTERISK (AS columnAliases)? #selectAll - | ASTERISK #selectAll - ; -setQuantifier - : DISTINCT - | ALL - ; -``` - -- It supports aggregate functions (e.g., `SUM`, `AVG`, `COUNT`) and window functions, logically executed last in the query process. -- DISTINCT Keyword: `SELECT DISTINCT column_name` ensures that the values in the query results are unique, removing duplicates. -- COLUMNS Function: The COLUMNS function is supported in the SELECT clause for column filtering. It can be combined with expressions, allowing the expression's logic to apply to all columns selected by the function. - -## 2. Detailed Syntax: - -Each `selectItem` can take one of the following forms: - -1. **Expression**: `expression [[AS] column_alias]` defines a single output column and optionally assigns an alias. -2. **All Columns from a Relation**: `relation.*` selects all columns from a specified relation. Column aliases are not allowed in this case. -3. **All Columns in the Result Set**: `*` selects all columns returned by the query. Column aliases are not allowed. - -Usage scenarios for DISTINCT: - -1. **SELECT Statement**: Use DISTINCT in the SELECT statement to remove duplicate items from the query results. - -2. **Aggregate Functions**: When used with aggregate functions, DISTINCT only processes non-duplicate rows in the input dataset. - -3. 
**GROUP BY Clause**: Use ALL and DISTINCT quantifiers in the GROUP BY clause to determine whether each duplicate grouping set produces distinct output rows. - -`COLUMNS` Function: - -1. **`COLUMNS(*)`**: Matches all columns and supports combining with expressions. -2. **`COLUMNS(regexStr) ? AS identifier`**: Regular expression matching - - Selects columns whose names match the specified regular expression `regexStr` and supports combining with expressions. - - Allows renaming columns by referencing groups captured by the regular expression. If `AS` is omitted, the original column name is displayed in the format `_coln_original_name` (where `n` is the column’s position in the result table). - - Renaming Syntax: - - Use parentheses () in regexStr to define capture groups. - - Reference captured groups in identifier using `'$index'`. - - Note: The identifier must be enclosed in double quotes if it contains special characters like `$`. - -## 3. Example Data - - -The [Example Data page](../Reference/Sample-Data.md) provides SQL statements to construct table schemas and insert data. By downloading and executing these statements in the IoTDB CLI, you can import the data into IoTDB. This data can be used to test and run the example SQL queries included in this documentation, allowing you to reproduce the described results. - -### 3.1 Selection List - -#### 3.1.1 Star Expression - -The asterisk (`*`) selects all columns in a table. Note that it cannot be used with most functions, except for cases like `COUNT(*)`. - -**Example**: Selecting all columns from a table. 
- - -```sql -SELECT * FROM table1; -``` - -Results: - -```sql -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| modifytime| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-29T18:30:00.000+08:00| 上海| 3002| 100| E| 180| 90.0| 35.4| true|2024-11-29T18:30:15.000+08:00| -|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| null| null|2024-11-28T08:00:09.000+08:00| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T10:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| 35.2| null|2024-11-28T10:00:11.000+08:00| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| -|2024-11-26T13:38:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:38:25.000+08:00| -|2024-11-30T09:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 35.2| true| null| -|2024-11-30T14:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 34.8| true|2024-11-30T14:30:17.000+08:00| -|2024-11-29T10:00:00.000+08:00| 上海| 3001| 101| D| 360| 85.0| null| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T16:38:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.1| true|2024-11-26T16:37:01.000+08:00| -|2024-11-27T16:39:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| 35.3| null| null| -|2024-11-27T16:40:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:03.000+08:00| -|2024-11-27T16:41:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:04.000+08:00| -|2024-11-27T16:42:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.2| false| null| 
-|2024-11-27T16:43:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false| null| -|2024-11-27T16:44:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false|2024-11-26T16:37:08.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -Total line number = 18 -It costs 0.653s -``` - -#### 3.1.2 Aggregate Functions - -Aggregate functions summarize multiple rows into a single value. When aggregate functions are present in the `SELECT` clause, the query is treated as an **aggregate query**. All expressions in the query must either be part of an aggregate function or specified in the [GROUP BY clause](../SQL-Manual/GroupBy-Clause.md). - -**Example 1**: Total number of rows in a table. - -```sql -SELECT count(*) FROM table1; -``` - -Results: - -```sql -+-----+ -|_col0| -+-----+ -| 18| -+-----+ -Total line number = 1 -It costs 0.091s -``` - -**Example 2**: Total rows grouped by region. - -```sql -SELECT region, count(*) - FROM table1 - GROUP BY region; -``` - -Results: - -```sql -+------+-----+ -|region|_col1| -+------+-----+ -| 上海| 9| -| 北京| 9| -+------+-----+ -Total line number = 2 -It costs 0.071s -``` - -#### 3.1.3 Aliases - -The `AS` keyword assigns an alias to selected columns, improving readability by overriding existing column names. - -**Example 1**: Original table. 
- - -```sql -IoTDB> SELECT * FROM table1; -``` - -Results: - -```sql -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| modifytime| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-29T18:30:00.000+08:00| 上海| 3002| 100| E| 180| 90.0| 35.4| true|2024-11-29T18:30:15.000+08:00| -|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| null| null|2024-11-28T08:00:09.000+08:00| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T10:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| 35.2| null|2024-11-28T10:00:11.000+08:00| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| -|2024-11-26T13:38:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:38:25.000+08:00| -|2024-11-30T09:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 35.2| true| null| -|2024-11-30T14:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 34.8| true|2024-11-30T14:30:17.000+08:00| -|2024-11-29T10:00:00.000+08:00| 上海| 3001| 101| D| 360| 85.0| null| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T16:38:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.1| true|2024-11-26T16:37:01.000+08:00| -|2024-11-27T16:39:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| 35.3| null| null| -|2024-11-27T16:40:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:03.000+08:00| -|2024-11-27T16:41:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:04.000+08:00| -|2024-11-27T16:42:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.2| false| null| 
-|2024-11-27T16:43:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false| null| -|2024-11-27T16:44:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false|2024-11-26T16:37:08.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -Total line number = 18 -It costs 0.653s -``` - -**Example 2**: Assigning an alias to a single column. - -```sql -IoTDB> SELECT device_id - AS device - FROM table1; -``` - -Results: - -```sql -+------+ -|device| -+------+ -| 100| -| 100| -| 100| -| 100| -| 100| -| 100| -| 100| -| 100| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -+------+ -Total line number = 18 -It costs 0.053s -``` - -**Example 3:** Assigning aliases to all columns. - -```sql -IoTDB> SELECT table1.* - AS (timestamp, Reg, Pl, DevID, Mod, Mnt, Temp, Hum, Stat,MTime) - FROM table1; -``` - -Results: - -```sql -+-----------------------------+----+----+-----+---+---+----+----+-----+-----------------------------+ -| TIMESTAMP| REG| PL|DEVID|MOD|MNT|TEMP| HUM| STAT| MTIME| -+-----------------------------+----+----+-----+---+---+----+----+-----+-----------------------------+ -|2024-11-29T11:00:00.000+08:00|上海|3002| 100| E|180|null|45.1| true| null| -|2024-11-29T18:30:00.000+08:00|上海|3002| 100| E|180|90.0|35.4| true|2024-11-29T18:30:15.000+08:00| -|2024-11-28T08:00:00.000+08:00|上海|3001| 100| C| 90|85.0|null| null|2024-11-28T08:00:09.000+08:00| -|2024-11-28T09:00:00.000+08:00|上海|3001| 100| C| 90|null|40.9| true| null| -|2024-11-28T10:00:00.000+08:00|上海|3001| 100| C| 90|85.0|35.2| null|2024-11-28T10:00:11.000+08:00| -|2024-11-28T11:00:00.000+08:00|上海|3001| 100| C| 90|88.0|45.1| true|2024-11-28T11:00:12.000+08:00| -|2024-11-26T13:37:00.000+08:00|北京|1001| 100| A|180|90.0|35.1| true|2024-11-26T13:37:34.000+08:00| -|2024-11-26T13:38:00.000+08:00|北京|1001| 100| A|180|90.0|35.1| true|2024-11-26T13:38:25.000+08:00| -|2024-11-30T09:30:00.000+08:00|上海|3002| 101| 
F|360|90.0|35.2| true| null| -|2024-11-30T14:30:00.000+08:00|上海|3002| 101| F|360|90.0|34.8| true|2024-11-30T14:30:17.000+08:00| -|2024-11-29T10:00:00.000+08:00|上海|3001| 101| D|360|85.0|null| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T16:38:00.000+08:00|北京|1001| 101| B|180|null|35.1| true|2024-11-26T16:37:01.000+08:00| -|2024-11-27T16:39:00.000+08:00|北京|1001| 101| B|180|85.0|35.3| null| null| -|2024-11-27T16:40:00.000+08:00|北京|1001| 101| B|180|85.0|null| null|2024-11-26T16:37:03.000+08:00| -|2024-11-27T16:41:00.000+08:00|北京|1001| 101| B|180|85.0|null| null|2024-11-26T16:37:04.000+08:00| -|2024-11-27T16:42:00.000+08:00|北京|1001| 101| B|180|null|35.2|false| null| -|2024-11-27T16:43:00.000+08:00|北京|1001| 101| B|180|null|null|false| null| -|2024-11-27T16:44:00.000+08:00|北京|1001| 101| B|180|null|null|false|2024-11-26T16:37:08.000+08:00| -+-----------------------------+----+----+-----+---+---+----+----+-----+-----------------------------+ -Total line number = 18 -It costs 0.189s -``` - -#### 3.1.4 Object Type Query - -> Supported since V2.0.8 - -**Example 1: Directly querying Object type data** - -```sql -IoTDB:database1> SELECT s1 FROM table1 WHERE device_id = 'tag1'; -``` - -Results: - -```sql -+------------+ -| s1| -+------------+ -|(Object) 5 B| -+------------+ -Total line number = 1 -It costs 0.428s -``` - -**Example 2: Retrieving raw content of Object type data using `read_object` function** - -```sql -IoTDB:database1> SELECT read_object(s1) FROM table1 WHERE device_id = 'tag1' -``` - -Results: - -```sql -+------------+ -| _col0| -+------------+ -|0x696f746462| -+------------+ -Total line number = 1 -It costs 0.188s -``` - - -### 3.2 Columns Function - -1. 
Without combining expressions - -Query data from columns whose names start with 'm' -```sql -IoTDB:database1> select columns('^m.*') from table1 limit 5; -``` - -Results: - -```sql -+--------+-----------+ -|model_id|maintenance| -+--------+-----------+ -| E| 180| -| E| 180| -| C| 90| -| C| 90| -| C| 90| -+--------+-----------+ -``` - -Query columns whose names start with 'o' - throw an exception if no columns match -```sql -IoTDB:database1> select columns('^o.*') from table1 limit 5; -``` - -Results: - -```sql -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: No matching columns found that match regex '^o.*' -``` - -Query data from columns whose names start with 'm' and rename them with 'series_' prefix -```sql -IoTDB:database1> select columns('^m(.*)') AS "series_$0" from table1 limit 5; -``` - -Results: - -```sql -+---------------+------------------+ -|series_model_id|series_maintenance| -+---------------+------------------+ -| E| 180| -| E| 180| -| C| 90| -| C| 90| -| C| 90| -+---------------+------------------+ -``` - -2. 
With Expression Combination - -- Single COLUMNS Function - -Query the minimum value of all columns -```sql -IoTDB:database1> select min(columns(*)) from table1 -``` - -Results: - -```sql -+-----------------------------+------------+--------------+---------------+--------------+-----------------+-----------------+--------------+------------+-----------------------------+ -| _col0_time|_col1_region|_col2_plant_id|_col3_device_id|_col4_model_id|_col5_maintenance|_col6_temperature|_col7_humidity|_col8_status| _col9_arrival_time| -+-----------------------------+------------+--------------+---------------+--------------+-----------------+-----------------+--------------+------------+-----------------------------+ -|2024-11-26T13:37:00.000+08:00| 上海| 1001| 100| A| 180| 85.0| 34.8| false|2024-11-26T13:37:34.000+08:00| -+-----------------------------+------------+--------------+---------------+--------------+-----------------+-----------------+--------------+------------+-----------------------------+ -``` - -- Multiple COLUMNS Functions in Same Expression - -> Usage Restriction: When multiple COLUMNS functions appear in the same expression, their parameters must be identical. 
- -Query the sum of minimum and maximum values for columns starting with 'h' -```sql -IoTDB:database1> select min(columns('^h.*')) + max(columns('^h.*')) from table1 -``` - -Results: - -```sql -+--------------+ -|_col0_humidity| -+--------------+ -| 79.899994| -+--------------+ -``` - -Error Case: Non-Identical COLUMNS Functions -```sql -IoTDB:database1> select min(columns('^h.*')) + max(columns('^t.*')) from table1 -``` - -Results: - -```sql -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Multiple different COLUMNS in the same expression are not supported -``` - -- Multiple COLUMNS Functions in Different Expressions - -Query minimum of 'h'-columns and maximum of 'h'-columns separately -```sql -IoTDB:database1> select min(columns('^h.*')) , max(columns('^h.*')) from table1 -``` - -Results: - -```sql -+--------------+--------------+ -|_col0_humidity|_col1_humidity| -+--------------+--------------+ -| 34.8| 45.1| -+--------------+--------------+ -``` - -Query minimum of 'h'-columns and maximum of 'te'-columns -```sql -IoTDB:database1> select min(columns('^h.*')) , max(columns('^te.*')) from table1 -``` - -Results: - -```sql -+--------------+-----------------+ -|_col0_humidity|_col1_temperature| -+--------------+-----------------+ -| 34.8| 90.0| -+--------------+-----------------+ -``` - -3. 
In the WHERE Clause - -Query rows where every column whose name starts with 'h' is greater than 40 (with the sample schema this is equivalent to `humidity > 40`) -```sql -IoTDB:database1> select * from table1 where columns('^h.*') > 40 -``` - -Results: - -```sql -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| arrival_time| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -``` - -Equivalent query that names the column explicitly -```sql -IoTDB:database1> select * from table1 where humidity > 40 -``` - -Results: - -```sql -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| arrival_time| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -``` - -## 4. 
Column Order in the Result Set - -- **Column Order**: The order of columns in the result set matches the order specified in the `SELECT` clause. -- **Multi-column Expressions**: If a selection expression produces multiple columns, their order follows the order in the source relation. \ No newline at end of file diff --git a/src/UserGuide/latest-Table/SQL-Manual/Set-Operations_timecho.md b/src/UserGuide/latest-Table/SQL-Manual/Set-Operations_timecho.md deleted file mode 100644 index 3628b15ec..000000000 --- a/src/UserGuide/latest-Table/SQL-Manual/Set-Operations_timecho.md +++ /dev/null @@ -1,295 +0,0 @@ - -# Set Operations - -IoTDB natively supports standard SQL set operations, including three core operators: **UNION**, **INTERSECT**, and **EXCEPT**. These operations merge, compare, and filter query results from multiple time-series data sources within a single statement. - -> Note: This feature is available since version 2.0.9.1. - -## 1. UNION -### 1.1 Overview -The UNION operator combines all rows from two result sets (order not guaranteed), supporting both duplicate elimination (default) and duplicate retention modes. - -### 1.2 Syntax -```sql -query1 UNION [ALL | DISTINCT] query2 -``` - -**Description** -1. **Duplicate Handling** - - Default (`UNION` or `UNION DISTINCT`): Automatically removes duplicate rows. - - `UNION ALL`: Preserves all rows (including duplicates) with higher performance. - -2. **Input Requirements** - - The two queries must return the same number of columns. - - Corresponding columns must have compatible data types: - - Numeric compatibility: `INT32`, `INT64`, `FLOAT`, and `DOUBLE` are fully compatible with each other. - - String compatibility: `TEXT` and `STRING` are fully compatible. - - Special rule: `INT64` is compatible with `TIMESTAMP`. - -3. **Result Set Rules** - - Column names and order are inherited from the first query. 
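
The duplicate-handling and result-naming rules above can be sketched with a small runnable script. This is only an illustration of the standard SQL semantics, not IoTDB itself: it uses Python's built-in `sqlite3` module as a stand-in engine, and the tables `t1`/`t2`, the aliases `dev`/`temp`, and the helper `run` are hypothetical names chosen for the example.

```python
import sqlite3

# Assumption: SQLite is used here only as a stand-in to illustrate standard
# UNION semantics; the tables and values below are made up for the sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (device_id TEXT, temperature REAL);
    CREATE TABLE t2 (device_id TEXT, temperature REAL);
    INSERT INTO t1 VALUES ('100', 85.0), ('100', 90.0), ('101', 85.0);
    INSERT INTO t2 VALUES ('100', 85.0), ('101', 90.0);
""")

def run(sql):
    """Return (column names, rows). Rows are sorted because set operations
    do not guarantee any output order."""
    cur = conn.execute(sql)
    names = [d[0] for d in cur.description]
    return names, sorted(cur.fetchall())

# UNION (same as UNION DISTINCT): the duplicate ('100', 85.0) is removed.
distinct_names, distinct_rows = run(
    "SELECT device_id AS dev, temperature AS temp FROM t1 "
    "UNION SELECT device_id, temperature FROM t2"
)

# UNION ALL: every input row is kept, including the duplicate.
all_names, all_rows = run(
    "SELECT device_id AS dev, temperature AS temp FROM t1 "
    "UNION ALL SELECT device_id, temperature FROM t2"
)

print(distinct_names, len(distinct_rows))
print(all_names, len(all_rows))
```

In this sketch both results take their column names (`dev`, `temp`) from the first query, and the only difference between the two statements is whether the row shared by both tables appears once or twice.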
- -### 1.3 Examples -Using the [sample data](../Reference/Sample-Data.md): - -1. Get distinct non-null device and temperature records from `table1` and `table2` -```sql -SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL -UNION -SELECT device_id, temperature FROM table2 WHERE temperature IS NOT NULL; - --- Equivalent to: -SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL -UNION DISTINCT -SELECT device_id, temperature FROM table2 WHERE temperature IS NOT NULL; -``` - -Result: -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| 90.0| -| 101| 85.0| -| 100| 90.0| -| 100| 85.0| -| 100| 88.0| -+---------+-----------+ -Total line number = 5 -It costs 0.074s -``` - -2. Get all non-null device and temperature records from `table1` and `table2` (including duplicates) -```sql -SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL -UNION ALL -SELECT device_id, temperature FROM table2 WHERE temperature IS NOT NULL; -``` - -Result: -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| 90.0| -| 101| 90.0| -| 101| 85.0| -| 101| 85.0| -| 101| 85.0| -| 101| 85.0| -| 100| 90.0| -| 100| 85.0| -| 100| 85.0| -| 100| 88.0| -| 100| 90.0| -| 100| 90.0| -| 101| 90.0| -| 101| 85.0| -| 101| 85.0| -| 100| 85.0| -| 100| 90.0| -+---------+-----------+ -Total line number = 17 -It costs 0.108s -``` - -> **Notes** -> - Set operations **do not guarantee result order**; actual output may differ from examples. - - -## 2. INTERSECT -### 2.1 Overview -The INTERSECT operator returns rows that exist in both result sets (order not guaranteed), supporting both duplicate elimination (default) and duplicate retention modes. - -### 2.2 Syntax -```sql -query1 INTERSECT [ALL | DISTINCT] query2 -``` - -**Description** -1. **Duplicate Handling** - - Default (`INTERSECT` or `INTERSECT DISTINCT`): Automatically removes duplicate rows. 
- - `INTERSECT ALL`: Preserves duplicate rows, with slightly lower performance. - -2. **Precedence Rules** - - `INTERSECT` has higher precedence than `UNION` and `EXCEPT` - (e.g., `A UNION B INTERSECT C` is equivalent to `A UNION (B INTERSECT C)`). - - Evaluation is left-to-right - (e.g., `A INTERSECT B INTERSECT C` is equivalent to `(A INTERSECT B) INTERSECT C`). - -3. **Input Requirements** - - The two queries must return the same number of columns. - - Corresponding columns must have compatible data types (same rules as UNION). - - NULL values are treated as equal (`NULL IS NOT DISTINCT FROM NULL`). - - If the `time` column is not included in `SELECT`, it does not participate in comparison and will not appear in the result. - -4. **Result Set Rules** - - Column names and order are inherited from the first query. - -### 2.3 Examples -Using the [sample data](../Reference/Sample-Data.md): - -1. Get distinct common device and temperature records from `table1` and `table2` -```sql -SELECT device_id, temperature FROM table1 -INTERSECT -SELECT device_id, temperature FROM table2; - --- Equivalent to: -SELECT device_id, temperature FROM table1 -INTERSECT DISTINCT -SELECT device_id, temperature FROM table2; -``` - -Result: -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| 90.0| -| 101| 85.0| -| 100| null| -| 100| 90.0| -| 100| 85.0| -+---------+-----------+ -Total line number = 5 -It costs 0.087s -``` - -2. Get all common device and temperature records from `table1` and `table2` (including duplicates) -```sql -SELECT device_id, temperature FROM table1 -INTERSECT ALL -SELECT device_id, temperature FROM table2; -``` - -Result: -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 100| 85.0| -| 100| 90.0| -| 100| null| -| 101| 85.0| -| 101| 85.0| -| 101| 90.0| -+---------+-----------+ -Total line number = 6 -It costs 0.139s -``` - -> **Notes** -> - Set operations **do not guarantee result order**. 
-> - When mixed with `UNION`/`EXCEPT`, use parentheses to explicitly specify precedence - > (e.g., `A INTERSECT (B UNION C)`). - - -## 3. EXCEPT -### 3.1 Overview -The EXCEPT operator returns rows that exist in the first result set but **not** in the second (order not guaranteed), supporting both duplicate elimination (default) and duplicate retention modes. - -### 3.2 Syntax -```sql -query1 EXCEPT [ALL | DISTINCT] query2 -``` - -**Description** -1. **Duplicate Handling** - - Default (`EXCEPT` or `EXCEPT DISTINCT`): Automatically removes duplicate rows. - - `EXCEPT ALL`: Preserves duplicate rows, with slightly lower performance. - -2. **Precedence Rules** - - `EXCEPT` has the same precedence as `UNION`, and lower precedence than `INTERSECT` - (e.g., `A INTERSECT B EXCEPT C` is equivalent to `(A INTERSECT B) EXCEPT C`). - - Evaluation is left-to-right - (e.g., `A EXCEPT B EXCEPT C` is equivalent to `(A EXCEPT B) EXCEPT C`). - -3. **Input Requirements** - - The two queries must return the same number of columns. - - Corresponding columns must have compatible data types (same rules as UNION). - - NULL values are treated as equal (`NULL IS NOT DISTINCT FROM NULL`). - - If the `time` column is not included in `SELECT`, it does not participate in comparison and will not appear in the result. - -4. **Result Set Rules** - - Column names and order are inherited from the first query. - -### 3.3 Examples -Using the [sample data](../Reference/Sample-Data.md): - -1. Get distinct records from `table1` that do not exist in `table2` -```sql -SELECT device_id, temperature FROM table1 -EXCEPT -SELECT device_id, temperature FROM table2; - --- Equivalent to: -SELECT device_id, temperature FROM table1 -EXCEPT DISTINCT -SELECT device_id, temperature FROM table2; -``` - -Result: -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| null| -| 100| 88.0| -+---------+-----------+ -Total line number = 2 -It costs 0.173s -``` - -2. 
Get all records from `table1` that do not exist in `table2` (including duplicates) -```sql -SELECT device_id, temperature FROM table1 -EXCEPT ALL -SELECT device_id, temperature FROM table2; -``` - -Result: -``` -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 100| 85.0| -| 100| 88.0| -| 100| 90.0| -| 100| 90.0| -| 100| null| -| 101| 85.0| -| 101| 85.0| -| 101| 90.0| -| 101| null| -| 101| null| -| 101| null| -| 101| null| -+---------+-----------+ -Total line number = 12 -It costs 0.155s -``` - -> **Notes** -> - Set operations **do not guarantee result order**. -> - When mixed with `UNION`/`INTERSECT`, use parentheses to explicitly specify precedence - > (e.g., `A EXCEPT (B INTERSECT C)`). \ No newline at end of file diff --git a/src/UserGuide/latest-Table/SQL-Manual/overview_timecho.md b/src/UserGuide/latest-Table/SQL-Manual/overview_timecho.md deleted file mode 100644 index d564f44c6..000000000 --- a/src/UserGuide/latest-Table/SQL-Manual/overview_timecho.md +++ /dev/null @@ -1,53 +0,0 @@ - - -# Overview - -## 1. Syntax Overview - -```SQL -SELECT ⟨select_list⟩ - FROM ⟨tables⟩ | patternRecognition - [WHERE ⟨condition⟩] - [GROUP BY ⟨groups⟩] - [HAVING ⟨group_filter⟩] - [WINDOW windowDefinition (',' windowDefinition)*)] - [FILL ⟨fill_methods⟩] - [ORDER BY ⟨order_expression⟩] - [OFFSET ⟨n⟩] - [LIMIT ⟨n⟩]; -``` - -The IoTDB table model query syntax supports the following clauses: - -- **SELECT Clause**: Specifies the columns to be included in the result. Details: [SELECT Clause](../SQL-Manual/Select-Clause_timecho.md) -- **FROM Clause**: Indicates the data source for the query, which can be a single table, multiple tables joined using the `JOIN` clause, or a subquery. Details: [FROM & JOIN Clause](../SQL-Manual/From-Join-Clause.md) -- **WHERE Clause**: Filters rows based on specific conditions. Logically executed immediately after the `FROM` clause. 
Details: [WHERE Clause](../SQL-Manual/Where-Clause.md) -- **GROUP BY Clause**: Used for aggregating data, specifying the columns for grouping. Details: [GROUP BY Clause](../SQL-Manual/GroupBy-Clause.md) -- **HAVING Clause**: Applied after the `GROUP BY` clause to filter grouped data, similar to `WHERE` but operates after grouping. Details:[HAVING Clause](../SQL-Manual/Having-Clause.md) -- **FILL Clause**: Handles missing values in query results by specifying fill methods (e.g., previous non-null value or linear interpolation) for better visualization and analysis. Details:[FILL Clause](../SQL-Manual/Fill-Clause.md) -- **ORDER BY Clause**: Sorts query results in ascending (`ASC`) or descending (`DESC`) order, with optional handling for null values (`NULLS FIRST` or `NULLS LAST`). Details: [ORDER BY Clause](../SQL-Manual/OrderBy-Clause.md) -- **OFFSET Clause**: Specifies the starting position for the query result, skipping the first `OFFSET` rows. Often used with the `LIMIT` clause. Details: [LIMIT and OFFSET Clause](../SQL-Manual/Limit-Offset-Clause.md) -- **LIMIT Clause**: Limits the number of rows in the query result. Typically used in conjunction with the `OFFSET` clause for pagination. Details: [LIMIT and OFFSET Clause](../SQL-Manual/Limit-Offset-Clause.md) - -## 2. Clause Execution Order - -![](/img/data-query-1.png) \ No newline at end of file diff --git a/src/UserGuide/latest-Table/Tools-System/CLI_timecho.md b/src/UserGuide/latest-Table/Tools-System/CLI_timecho.md deleted file mode 100644 index 484d5b32f..000000000 --- a/src/UserGuide/latest-Table/Tools-System/CLI_timecho.md +++ /dev/null @@ -1,178 +0,0 @@ - -# CLI - -The IoTDB Command Line Interface (CLI) tool allows users to interact with the IoTDB server. Before using the CLI tool to connect to IoTDB, ensure that the IoTDB service is running correctly. This document explains how to launch the CLI and its related parameters. - -In this manual, `$IOTDB_HOME` represents the installation directory of IoTDB. 
- -## 1. CLI Launch - -The CLI client script is located in the `$IOTDB_HOME/sbin` directory. The common commands to start the CLI tool are as follows: - -#### **Linux** **MacOS** - -```Bash -Shell> bash sbin/start-cli.sh -sql_dialect table -#or -# Before version V2.0.6.x -Shell> bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x and later versions -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` - -#### **Windows** - -```Bash -# Before version V2.0.4.x -Shell> sbin\start-cli.bat -sql_dialect table -#or -Shell> sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table - -# V2.0.4.x and later versions -Shell> sbin\windows\start-cli.bat -sql_dialect table -#or -# V2.0.4.x and later versions, before version V2.0.6.x -Shell> sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x and later versions -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` - -**Parameter Explanation** - -| **Parameter** | **Type** | **Required** | **Description** | **Example** | -| -------------------------- | -------- | ------------ |---------------------------------------------------------------------------------------------------| ------------------- | -| -h `` | string | No | The IP address of the IoTDB server. (Default: 127.0.0.1) | -h 127.0.0.1 | -| -p `` | int | No | The RPC port of the IoTDB server. (Default: 6667) | -p 6667 | -| -u `` | string | No | The username to connect to the IoTDB server. (Default: root) | -u root | -| -pw `` | string | No | The password to connect to the IoTDB server. (Default: `TimechoDB@2021`,before V2.0.6 it is root) | -pw root | -| -sql_dialect `` | string | No | The data model type: tree or table. (Default: tree) | -sql_dialect table | -| -e `` | string | No | Batch operations in non-interactive mode. 
| -e "show databases" | -| -c | Flag | No | Required if rpc_thrift_compression_enable=true on the server. | -c | -| -disableISO8601 | Flag | No | If set, timestamps will be displayed as numeric values instead of ISO8601 format. | -disableISO8601 | -| -usessl `` | Boolean | No | Enable SSL connection | -usessl true | -| -ts `` | string | No | SSL certificate store path | -ts /path/to/truststore | -| -tpw `` | string | No | SSL certificate store password | -tpw myTrustPassword | -| -timeout `` | int | No | Query timeout (seconds). If not set, the server's configuration will be used. | -timeout 30 | -| -help | Flag | No | Displays help information for the CLI tool. | -help | - -The figure below indicates a successful startup: - -![](/img/Cli-01.png) - - -## 2. Example Commands - -### 2.1 **Create a Database** - -```Java -create database test -``` - -![](/img/Cli-02.png) - - -### 2.2 **Show Databases** -```Java -show databases -``` - -![](/img/Cli-03.png) - - -## 3. CLI Exit - -To exit the CLI and terminate the session, type`quit`or`exit`. - -### 3.1 Additional Notes and Shortcuts - -1. **Navigate Command History:** Use the up and down arrow keys. -2. **Auto-Complete Commands:** Use the right arrow key. -3. **Interrupt Command Execution:** Press `CTRL+C`. - -## 4. Access History Feature - -Since IoTDB **V2.0.9.1**, the access history feature is available. After a client logs in successfully, key historical access information is displayed, and the feature supports distributed deployments. Both administrators and regular users can only view their own access history. The core displayed information includes: - -- Last successful session: displays date, time, access application, IP address, and access method (not shown for first login or when no history exists). -- Most recent failed attempt: displays the date, time, access application, IP address, and access method of the latest failed login attempt immediately before the current successful login. 
-- Cumulative failed attempts: total number of failed session attempts since the last successful session was established. - -### 4.1 Enabling Access History - -You can enable or disable the access history feature by modifying the corresponding parameter in the `iotdb-system.properties` file. A restart is required for changes to take effect. For example: - -```Plain -# Controls whether the audit log feature is enabled -enable_audit_log=false -``` - -- When enabled: login information is recorded and expired data is cleaned periodically. -- When disabled: no data is recorded, displayed, or cleaned up. -- If disabled and then re-enabled, the displayed history will be the last record before the feature was disabled, which may not reflect the actual latest login. - -Usage example: - -```Bash ---------------------- -Starting IoTDB Cli ---------------------- - _____ _________ ______ ______ -|_ _| | _ _ ||_ _ `.|_ _ \ - | | .--.|_/ | | \_| | | `. \ | |_) | - | | / .'`\ \ | | | | | | | __'. - _| |_| \__. | _| |_ _| |_.' /_| |__) | -|_____|'.__.' |_____| |______.'|_______/ Enterprise version 2.0.9.1 (Build: xxxxxxx) - - ----Last Successful Session------------------ -Time: 2026-03-24T10:25:47.759+08:00 -IP Address: 127.0.0.1 ----Last Failed Session---------------------- -Time: 2026-03-24T10:27:26.314+08:00 -IP Address: 127.0.0.1 -Cumulative Failed Attempts: 1 -Successfully logged in at 127.0.0.1:6667 -IoTDB> -``` - -### 4.2 Viewing Access History - -The `root` user and users with the `AUDIT` privilege can view login history records using SQL statements. 
- -Syntax: - -```SQL -SELECT * FROM __audit.login_history; -``` - -Example: - -```SQL -IoTDB> SELECT * FROM __audit.login_history -+-----------------------------+-------+-------+--------+---------+------+ -| time|user_id|node_id|username| ip|result| -+-----------------------------+-------+-------+--------+---------+------+ -|2026-03-25T10:55:58.240+08:00| u_0| node_1| root|127.0.0.1| true| -+-----------------------------+-------+-------+--------+---------+------+ -Total line number = 1 -It costs 0.213s -``` \ No newline at end of file diff --git a/src/UserGuide/latest-Table/Tools-System/Data-Export-Tool_timecho.md b/src/UserGuide/latest-Table/Tools-System/Data-Export-Tool_timecho.md deleted file mode 100644 index f3f031d17..000000000 --- a/src/UserGuide/latest-Table/Tools-System/Data-Export-Tool_timecho.md +++ /dev/null @@ -1,252 +0,0 @@ -# Data Export - -## 1. Function Overview - -IoTDB supports two methods for data export: - -* Data Export Tool: `export-data.sh/bat` is located in the `tools` directory. It can export the query results of specified SQL statements into CSV, SQL, and TsFile (open-source time-series file format) files. -* PIPE Framework-based TsFileBackup: `tsfile-backup.sh/bat` is located in the `tools` directory. It can export specified data files into TsFile format using the PIPE framework. - - - - - - - - - - - - - - - - - - - - - - - - - -
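| File Format | IoTDB Tool | Description |
| ----------- | -------------------- | ------------------------------------------------------------------------------------------- |
| CSV | export-data.sh/bat | Plain text format for storing structured data. Must follow the CSV format specified below. |
| SQL | export-data.sh/bat | File containing custom SQL statements. |
| TsFile | export-data.sh/bat | Open-source time-series file format. |
| TsFile | tsfile-backup.sh/bat | Open-source time-series file format; this script supports the Object data type. |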
- - -## 2. Data Export Tool -### 2.1 Common Parameters -| Short | Full Parameter | Description | Required | Default | -|----------------|--------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ----------------- |-----------------------------------------------| -| `-ft` | `--file_type` | Export file type: `csv`, `sql`, `tsfile`. | ​**Yes** | - | -| `-h` | `--host` | Hostname of the IoTDB server. | No | `127.0.0.1` | -| `-p` | `--port` | Port number of the IoTDB server. | No | `6667` | -| `-u` | `--username` | Username for authentication. | No | `root` | -| `-pw` | `--password` | Password for authentication. Supported for hidden input since V2.0.9.1 | No | `TimechoDB@2021`(Before V2.0.6 it is root) | -| `-sql_dialect` | `--sql_dialect` | Select server model : tree or table | No | tree | -| `-db ` | `--database` | The target database to be exported only takes effect when `-sql_dialect` is of the table type. | Yes when `-sql_dialect = table`| - | -| `-table` | `--table` | The target table to be exported only takes effect when `-sql_dialect` is of the table type. If the `-q` parameter is specified, this parameter will not take effect. If the export type is tsfile/sql, this parameter is mandatory. | ​ No | - | -| `-start_time` | `--start_time` | The start time of the data to be exported only takes effect when `-sql_dialect` is of the table type. If `-q` is specified, this parameter will not take effect. The supported time formats are the same as those for the `-tf` parameter. |No | - | -| `-end_time` | `--end_time` | The end time of the data to be exported only takes effect when `-sql_dialect` is set to the table type. If `-q` is specified, this parameter will not take effect. | No | - | -| `-t` | `--target` | Target directory for the output files. 
If the path does not exist, it will be created. | ​**Yes** | - | -| `-pfn` | `--prefix_file_name` | Prefix for the exported file names. For example, `abc` will generate files like `abc_0.tsfile`, `abc_1.tsfile`. | No | `dump_0.tsfile` | -| `-q` | `--query` | SQL query command to execute. Starting from v2.0.8, semicolons in SQL statements are automatically removed, and query execution proceeds normally. | No | - | -| `-timeout` | `--query_timeout` | Query timeout in milliseconds (ms). | No | `-1` (before v2.0.8)
`Long.MAX_VALUE` (v2.0.8 and later)
(Range: `-1~Long.MAX_VALUE`) | -| `-help` | `--help` | Display help information. | No | - | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - -### 2.2 CSV Format -#### 2.2.1 Command - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-sql_dialect] -db -table - [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -# Windows -# Before version V2.0.4.x -> tools\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] - -# V2.0.4.x and later versions -> tools\windows\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -``` -#### 2.2.2 CSV-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ------------ | ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- |------------------------------------------| -| `-dt` | `--datatype` | Whether to include data types in the CSV file header (`true` or `false`). | No | `false` | -| `-lpf` | `--lines_per_file` | Number of rows per exported file. | No | `10000` (Range:0~Integer.Max=2147483647) | -| `-tf` | `--time_format` | Time format for the CSV file. Options: 1) Timestamp (numeric, long), 2) ISO8601 (default), 3) Custom pattern (e.g., `yyyy-MM-dd HH:mm:ss`). SQL file timestamps are unaffected by this setting. | No | `ISO8601` | -| `-tz` | `--timezone` | Timezone setting (e.g., `+08:00`, `-01:00`). | No | System default | - -#### 2.2.3 Examples - -```Shell -# Valid Example -> export-data.sh -ft csv -sql_dialect table -t /path/export/dir -db database1 -q "select * from table1" - -# Error Example -> export-data.sh -ft csv -sql_dialect table -t /path/export/dir -q "select * from table1" -Parse error: Missing required option: db -``` -### 2.3 SQL Format -#### 2.3.1 Command -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-aligned ] - -lpf - [-tf ] [-tz ] [-q ] [-timeout ] - -# Windows -# Before version V2.0.4.x -> tools\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] - -# V2.0.4.x and later versions -> tools\windows\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] -``` -#### 2.3.2 SQL-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ---------------- | ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ---------------- | -| `-aligned` | `--use_aligned` | Whether to export as aligned SQL format (`true` or `false`). | No | `true` | -| `-lpf` | `--lines_per_file` | Number of rows per exported file. | No | `10000` (Range:0~Integer.Max=2147483647) | -| `-tf` | `--time_format` | Time format for the CSV file. Options: 1) Timestamp (numeric, long), 2) ISO8601 (default), 3) Custom pattern (e.g., `yyyy-MM-dd HH:mm:ss`). SQL file timestamps are unaffected by this setting. | No | `ISO8601` | -| `-tz` | `--timezone` | Timezone setting (e.g., `+08:00`, `-01:00`). | No | System default | - -#### 2.3.3 Examples -```Shell -# Valid Example -> export-data.sh -ft sql -sql_dialect table -t /path/export/dir -db database1 -start_time 1 - -# Error Example -> export-data.sh -ft sql -sql_dialect table -t /path/export/dir -start_time 1 -Parse error: Missing required option: db -``` - -### 2.4 TsFile Format - -#### 2.4.1 Command - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# Windows -# Before version V2.0.4.x -> tools\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# V2.0.4.x and later versions -> tools\windows\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] -``` - -#### 2.4.2 TsFile-Specific Parameters - -* None - -#### 2.4.3 Examples - -```Shell -# Valid Example -> /tools/export-data.sh -ft tsfile -sql_dialect table -t /path/export/dir -db database1 -start_time 0 - -# Error Example -> /tools/export-data.sh -ft tsfile -sql_dialect table -t /path/export/dir -start_time 0 -Parse error: Missing required option: db -``` - - -## 3. TsFileBackup Based on PIPE Framework -Since **V2.0.9.2**, IoTDB supports the `tsfile-backup.sh/bat` script. This script can automatically generate and send the `CREATE PIPE` SQL command to the server, exporting specified data files to TsFile format. - -**Notes:** -1. **To use this script, contact the Timecho Team to obtain the JAR package(`tsfile-remote-sink--jar-with-dependencies.jar`), and place it in a path accessible to IoTDB (e.g., all Data Node hosts).** -2. **This script supports exporting Object-type data to TsFile files.** - - -### 3.1 Execution Commands -```Shell -# Unix/OS X -> tools/tsfile-backup.sh [-sql_dialect ] [-h ] [-p ] - [-u ] [-pw ] [-path ] [-db ] [-table -
] [-s ] [-e ] [-t ] - [-th ] [-tu ] [-tp ] - [--rate_limit] [--plugin_jar] [-help] - -# Windows -> tools\windows\tsfile-backup.bat [-sql_dialect ] [-h ] [-p ] - [-u ] [-pw ] [-path ] [-db ] [-table -
] [-s ] [-e ] [-t ] - [-th ] [-tu ] [-tp ] - [--rate_limit] [--plugin_jar] [-help] -``` - - -### 3.2 Script Parameters -| Abbreviation | Full Name | Description | Required | Default | -|-------------------------|--------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------| -------- | --------------- | -| `-sql_dialect` | `--sql_dialect` | Specifies the data model type. Valid values: `tree` (Tree Model) or `table` (Table Model). | Yes | - | -| `-h` | `--host` | Local host address (IP of the IoTDB instance where the data resides). | No | `127.0.0.1` | -| `-p` | `--port` | Port number for the IoTDB RPC service. | No | `6667` | -| `-u` | `--user` | Username for IoTDB authentication. | No | `root` | -| `-pw` | `--password` | Password for IoTDB authentication (hidden input supported). | No | `root` | -| `-t` | `--target` | Export target directory. In SCP mode, this is an absolute physical path on the remote server. TsFile and associated Object directories will be exported here. | Yes | - | -| `-db` | `--database` | Database name (optional for Table Model). | No | `.*` | -| `-table` | `--table` | Table name (optional for Table Model). | No | `.*` | -| `-s` | `--start_time` | Start time (ISO8601 format e.g. `2026-01-01T00:00:00` or millisecond timestamp). Only data from this time onwards is exported. | No | - | -| `-e` | `--end_time` | End time (same format as above). Only data before this time is exported. | No | - | -| `-th` | `--target_host` | Remote target host IP. If specified, the script automatically configures Pipe to use SCP for data transfer. | No | - | -| `-tu` | `--target_host_user` | Username for SSH/SCP login to the remote server. | No | - | -| `-tpw` | `--target_host_pw` | Password for remote authentication (hidden input supported). | No | - | -| `-tp` | `--target_host_port` | Remote SSH port. 
| No | `22` | -| `--rate_limit` | `--rate_limit` | Transfer rate limit (unit: Bytes/s) to prevent excessive bandwidth usage. | No | - | -| `--plugin_jar` | `--plugin_jar` | Path to the Pipe plugin JAR file. | No | - | -| `--object-parallelism` | `--object-parallelism` | Specifies the maximum parallelism for object file transmission. | No | - | -| `--object-batch-size` | `--object-batch-size` | Limits the total byte size of each object file upload batch, used to control memory usage and single SCP transfer size. | No | - | -| `-help` | `--help` | Show help information. | No | - | - - -### 3.3 Execution Examples - -Example 1: SCP Remote Export (Send Data to Another Server) - -```Bash -./tsfile-backup.sh -sql_dialect table -db test_db -t /remote/archive/ -th 192.168.1.100 -tu backup_user -tpw ComplexPass123! -``` - -Example 2: Remote Object Data Export with Rate Limiting - -```Bash -./tsfile-backup.sh -sql_dialect table -t /mnt/backup/ -th 10.0.0.5 -tu iot_admin -tpw Admin@2026 --rate_limit 5242880 -``` - -Example 3: Specify Pipe Plugin JAR Directory - -```Bash -./tsfile-backup.sh -sql_dialect table -db test -table .* -tu luoluoyuyu -tpw -t /tmp/backup --plugin_jar /local/lib/tsfile-remote-sink-2.0.8-SNAPSHOT-jar-with-dependencies.jar -``` - -**Note**: When exporting Object-type data in SCP mode, to avoid handshake exceptions, connection failures, or frequent Pipe restarts, it is recommended to take any of the following measures: -* Appropriately lower the configuration parameter `object-parallelism` -* Increase the `MaxStartups` value on the target machine as needed. After modification, execute `sshd reload` or `sshd restart` for the configuration to take effect. 
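
The `--rate_limit` parameter is expressed in Bytes/s, so a limit chosen in MiB/s has to be converted before it is passed to the script. The following is a minimal sketch of that arithmetic, assuming a hypothetical helper named `mib_to_bytes` (it is not part of the IoTDB tools) and a shell with POSIX arithmetic expansion:

```shell
# Hypothetical helper (not shipped with IoTDB): convert a MiB/s figure
# into the Bytes/s value expected by --rate_limit.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# 5 MiB/s corresponds to the limit used in Example 2.
RATE_LIMIT=$(mib_to_bytes 5)
echo "--rate_limit ${RATE_LIMIT}"   # prints: --rate_limit 5242880
```

The printed option can then be appended to a `tsfile-backup.sh` invocation such as the one in Example 2.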
\ No newline at end of file diff --git a/src/UserGuide/latest-Table/Tools-System/Data-Import-Tool_timecho.md b/src/UserGuide/latest-Table/Tools-System/Data-Import-Tool_timecho.md deleted file mode 100644 index 1becb0726..000000000 --- a/src/UserGuide/latest-Table/Tools-System/Data-Import-Tool_timecho.md +++ /dev/null @@ -1,375 +0,0 @@ -# Data Import - -## 1. Functional Overview - -IoTDB supports three methods for data import: -- Data Import Tool: Use the `import-data.sh/bat` script in the `tools` directory to manually import CSV, SQL, or TsFile (open-source time-series file format) data into IoTDB. -- `TsFile` Auto-Loading Feature -- Load `TsFile` SQL - -
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
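| File Format | IoTDB Tool | Description |
| ----------- | --------------------------- | --------------------------------------------------------------------------------------------------- |
| CSV | import-data.sh/bat | Can be used for single or batch import of CSV files into IoTDB |
| SQL | import-data.sh/bat | Can be used for single or batch import of SQL files into IoTDB |
| TsFile | import-data.sh/bat | Can be used for single or batch import of TsFile files into IoTDB |
| TsFile | TsFile Auto-Loading Feature | Can automatically monitor a specified directory for newly generated TsFiles and load them into IoTDB |
| TsFile | Load SQL | Can be used for single or batch import of TsFile files into IoTDB |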
- -- The table model TsFile import currently only supports local import. -- Since version V2.0.9.2, the import-data.sh/bat script supports the Object data type when importing TsFile files. - -## 2. Data Import Tool -### 2.1 Common Parameters - -| Short | Full Parameter | Description | Required | Default | -|-----------------|---------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ----------------- |-----------------------------------------------| -| `-ft` | `--file_type` | File type: `csv`, `sql`, `tsfile`. | ​**Yes** | - | -| `-h` | `--host` | IoTDB server hostname. | No | `127.0.0.1` | -| `-p` | `--port` | IoTDB server port. | No | `6667` | -| `-u` | `--username` | Username. | No | `root` | -| `-pw` | `--password` | Password. Supported for hidden input since V2.0.9.1 | No | `TimechoDB@2021`(Before V2.0.6 it is root) | -| `-sql_dialect` | `--sql_dialect` | Select server model : tree or table | No | `tree` | -| ` -db ` | `--database` | ​Target database , applies only to `-sql_dialect=table` |Yes when `-sql_dialect = table`;
Starting from version V2.0.9.2, this parameter is optional when the file format is SQL. A prompt will be issued if the target database is not explicitly specified in either the parameter or the SQL statement. | - | -| `-table` | `--table ` | Target table , required for CSV imports in table model | No | - | -| `-s` | `--source` | Local path to the file/directory to import. ​​**Supported formats**​: CSV, SQL, TsFile. Unsupported formats trigger error: `The file name must end with "csv", "sql", or "tsfile"!` | ​**Yes** | - | -| `-tn` | `--thread_num` | Maximum parallel threads | No | `8`
Range: 0 to Integer.Max(2147483647). | -| `-tz` | `--timezone` | Timezone (e.g., `+08:00`, `-01:00`). | No | System default | -| `-help` | `--help` | Display help (general or format-specific: `-help csv`). | No | - | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - - -### 2.2 CSV Format - -#### 2.2.1 Command -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-sql_dialect] -db -table - [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] - -# Windows -# Before version V2.0.4.x -> tools\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] - -# V2.0.4.x and later versions -> tools\windows\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] -``` - -#### 2.2.2 CSV-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ---------------- | ------------------------------- |----------------------------------------------------------| ---------- |-----------------| -| `-fd` | `--fail_dir` | Directory to save failed files. | No | YOUR_CSV_FILE_PATH | -| `-lpf` | `--lines_per_failed_file` | Max lines per failed file. | No | `100000`
Range: 0 to Integer.Max(2147483647). | -| `-aligned` | `--use_aligned` | Import as aligned time series. | No | `false` | -| `-batch` | `--batch_size` | Rows processed per API call. | No | `100000`
Range: 0 to Integer.Max(2147483647). | -| `-ti` | `--type_infer` | Type mapping (e.g., `BOOLEAN=text,INT=long`). | No | - | -| `-tp` | `--timestamp_precision` | Timestamp precision: `ms`, `us`, `ns`. | No | `ms` | - -#### 2.2.3 Examples - -```Shell -# Valid Example -> tools/import-data.sh -ft csv -sql_dialect table -s ./csv/dump0_0.csv -db database1 -table table1 - -# Error Example -> tools/import-data.sh -ft csv -sql_dialect table -s ./csv/dump0_1.csv -table table1 -Parse error: Missing required option: db - -> tools/import-data.sh -ft csv -sql_dialect table -s ./csv/dump0_1.csv -db database1 -table table5 -There are no tables or the target table table5 does not exist -``` - -#### 2.2.4 Import Notes - -1. CSV Import Specifications - -- Special Character Escaping Rules: If a text-type field contains special characters (e.g., commas `,`), they must be escaped using a backslash (`\`). -- Supported Time Formats: `yyyy-MM-dd'T'HH:mm:ss`, `yyyy-MM-dd HH:mm:ss`, or `yyyy-MM-dd'T'HH:mm:ss.SSSZ`. -- Timestamp Column Requirement: The timestamp column must be the first column in the data file. - -2. CSV File Example - -```sql -time,region,device,model,temperature,humidity -1970-01-01T08:00:00.001+08:00,"SH","101","F",90.0,35.2 -1970-01-01T08:00:00.002+08:00,"SH","101","F",90.0,34.8 -``` - - -### 2.3 SQL Format - -#### 2.3.1 Command - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] - -# Windows -# Before version V2.0.4.x -> tools\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] - -# V2.0.4.x and later versions -> tools\windows\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] -``` - -#### 2.3.2 SQL-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| -------------- | ------------------------------- | -------------------------------------------------------------------- | ---------- | ------------------ | -| `-fd` | `--fail_dir` | Directory to save failed files. | No |YOUR_CSV_FILE_PATH| -| `-lpf` | `--lines_per_failed_file` | Max lines per failed file. | No | `100000`
Range: 0 to Integer.Max(2147483647). | -| `-batch` | `--batch_size` | Rows processed per API call. | No | `100000`
Range: 0 to Integer.Max(2147483647). | - -#### 2.3.3 Examples - -```Shell -# Valid Example -> tools/import-data.sh -ft sql -sql_dialect table -s ./sql/dump0_0.sql -db database1 - -# Error Example -> tools/import-data.sh -ft sql -sql_dialect table -s ./sql/dump1_1.sql -db database1 -Source file or directory ./sql/dump1_1.sql does not exist - -# When the ​target table exists but metadata is incompatible or ​data is malformed, the system will generate a .failed file and log error details. -# Log Example -Fail to insert measurements '[column.name]' caused by [data type is not consistent, input '[column.value]', registered '[column.DataType]'] -``` -### 2.4 TsFile Format - -#### 2.4.1 Command - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-o ] -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] - -# Windows -# Before version V2.0.4.x -> tools\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] - -# V2.0.4.x and later versions -> tools\windows\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-o ] -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] -``` -#### 2.4.2 TsFile-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -|---------|---------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------|-----------------------| -| `-os` | `--on_success` | Action for successful files:
`none`: Do not delete the file.
`mv`: Move the successful file to the target directory.
`cp`: Create a hard link (copy) of the successful file to the target directory.
`delete`: Delete the file. | **Yes** | - | -| `-sd` | `--success_dir` | Target directory for `mv`/`cp` actions on success. Required if `-os` is `mv`/`cp`. The file name will be flattened and concatenated with the original file name. | Conditional | `${EXEC_DIR}/success` | -| `-of` | `--on_fail` | Action for failed files:
`none`: Skip the file.
`mv`: Move the failed file to the target directory.
`cp`: Create a hard link (copy) of the failed file to the target directory.
`delete`: Delete the file. | **Yes** | - | -| `-fd` | `--fail_dir` | Target directory for `mv`/`cp` actions on failure. Required if `-of` is `mv`/`cp`. The file name will be flattened and concatenated with the original file name. | Conditional | `${EXEC_DIR}/fail` | -| `-tp` | `--timestamp_precision` | TsFile timestamp precision: `ms`, `us`, `ns`.
For non-remote TsFile imports: Use `-tp` to specify the timestamp precision of the TsFile. The system checks whether this precision matches the server's. If it does not match, an error is returned.
For remote TsFile imports: Use `-tp` to specify the timestamp precision of the TsFile. The Pipe system automatically checks whether the precision matches. If it does not match, a Pipe error is returned. | No | `ms` | -| `-o` | `--object-file-paths` | Storage path for Object files.
Default mode: If this parameter is not specified, the script automatically identifies and imports Object files located in the subdirectory with the same name as the TsFile.
Absolute path mode: Explicitly specifies the external storage root directory for Object files; the tool creates an associated data index based on this path.
Note: This parameter is supported since V2.0.9.2 | No | | - -#### 2.4.3 Examples - -```Shell -# Valid Example -> tools/import-data.sh -ft tsfile -sql_dialect table -s ./tsfile -db database1 -os none -of none - -# Error Example -> tools/import-data.sh -ft tsfile -sql_dialect table -s ./tsfile -db database1 -Parse error: Missing required options: os, of -``` - - -**Object Type Import** - -1. Import Directory Structure - -* Default Mode - -```Bash -target_dir - ├── tsfile.tsfile - └── tsfile/ (matches the TsFile name) - ├── regionID/tableName/tag1/tag2/field/timestamp1.bin - ├── regionID/tableName/tag1/tag2/field/timestamp2.bin - └── regionID/tableName1/tag3/tag4/field/timestamp1.bin -``` - -* Specified Object Directory - -```Bash -target_dir - ├── tsfile.tsfile -object_dir - ├── regionID/tableName/tag1/tag2/field/timestamp1.bin - ├── regionID/tableName/tag1/tag2/field/timestamp2.bin - └── regionID/tableName1/tag3/tag4/field/timestamp1.bin -``` - -2. Command Line Examples - -* Basic Import (automatically identifies Object files in the TsFile-named directory) - -```Bash -./import-data.sh -sql_dialect table -ft tsfile -s /data/import/sensor_v1.tsfile -db database1 -os none -of none -``` - -* Batch Directory Import (specify concurrent threads and post-success action) - -```Bash -./import-data.sh -sql_dialect table -ft tsfile -s /data/raw_data/ -tn 16 -os mv -sd /data/archive/ -of none -``` - -* Table Model Associated Import (specify external Object storage path and target database) - -```Bash -./import-data.sh -sql_dialect table -ft tsfile -s /data/import/ -db factory_db -o /mnt/object_storage/ -os none -of mv -fd /data/error_log/ -``` - - -## 3. TsFile Auto-Loading - -This feature enables IoTDB to automatically monitor a specified directory for new TsFiles and load them into the database without manual intervention. 
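As a rough illustration of the workflow this feature enables, the sketch below stages a TsFile into a monitored directory. It is not part of IoTDB: `stage_tsfile` is a hypothetical helper, `PENDING_DIR` stands in for one entry of `load_active_listening_dirs` (default `ext/load/pending`, see 3.1), and the `<tsfile>.mods` naming of the modification file is an assumption. The database-named subdirectory and the "`.mods` first" ordering follow the rules given in 3.1 and 3.3.

```Bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with IoTDB): stage one TsFile for auto-loading.
# PENDING_DIR is assumed to be one of the directories in load_active_listening_dirs.
PENDING_DIR="${PENDING_DIR:-ext/load/pending}"

stage_tsfile() {
  local src="$1"   # path to the TsFile to load
  local db="$2"    # table-model database name, used as the subdirectory name
  local target="$PENDING_DIR/$db"

  mkdir -p "$target" || return 1
  # Assumed naming: the .mods file sits next to the TsFile as "<tsfile>.mods".
  # Move it first so it is already in place when the TsFile appears.
  if [ -f "$src.mods" ]; then
    mv "$src.mods" "$target/" || return 1
  fi
  mv "$src" "$target/"
}

# Example call (paths are placeholders):
# stage_tsfile /data/export/dump0.tsfile database1
```

Because successfully loaded files are deleted from the monitored directory, the helper moves files rather than copying them; keep a separate copy if the source must be retained.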
- -![](/img/Data-import2.png) - -### 3.1 Configuration - -Add the following parameters to `iotdb-system.properties` (template: `iotdb-system.properties.template`): - -| Parameter | Description | Value Range | Required | Default | Hot-Load? | -| ---------------------------------------------------- |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------| ---------- | ----------------------------- | ----------------------- | -| `load_active_listening_enable` | Enable auto-loading. | `true`/`false` | Optional | `true` | Yes | -| `load_active_listening_dirs` | Directories to monitor (subdirectories included). Multiple paths separated by commas.
Note: In the table model, the name of the directory containing the file is used as the database name. | String | Optional | `ext/load/pending` | Yes | -| `load_active_listening_fail_dir` | Directory to store failed TsFiles. Only one directory can be set. | String | Optional | `ext/load/failed` | Yes | -| `load_active_listening_max_thread_num` | Maximum threads for TsFile loading tasks. The default value for this parameter, when commented out, is max(1, CPU cores / 2). If the value set by the user falls outside the range [1, CPU cores / 2], it will be reset to the default value of max(1, CPU cores / 2). | `1` to `Long.MAX_VALUE` | Optional | `max(1, CPU_CORES / 2)` | No (restart required) | -| `load_active_listening_check_interval_seconds` | Active listening polling interval (in seconds). The active listening feature for TsFiles is implemented by polling the target directory. This configuration specifies the time interval between two consecutive checks of the `load_active_listening_dirs`. After each check, the next check will be performed after `load_active_listening_check_interval_seconds` seconds. If the polling interval set by the user is less than 1, it will be reset to the default value of 5 seconds. | `1` to `Long.MAX_VALUE` | Optional | `5` | No (restart required) | - -### 3.2 Examples - -```bash -load_active_listening_dir/ -├─sensors/ -│ ├─temperature/ -│ │ └─temperature-table.TSFILE - -``` - -- Table model TsFile - - `temperature-table.TSFILE`: will be imported into the `temperature` database (because it is located in the `sensors/temperature/` directory) - - -### 3.3 Notes - -1. **Mods Files**: If TsFiles have associated `.mods` files, move the `.mods` files to the monitored directory **before** their corresponding TsFiles. Ensure `.mods` files and TsFiles are in the same directory. -2. **Restricted Directories**: Do NOT set Pipe receiver directories, data directories, or other system paths as monitored directories. -3. 
​​**Directory Conflicts**​: Ensure `load_active_listening_fail_dir` does not overlap with `load_active_listening_dirs` or its subdirectories. -4. ​​**Permissions**​: The monitored directory must have write permissions. Files are deleted after successful loading; insufficient permissions may cause duplicate loading. - - -## 4. Load SQL - -IoTDB supports importing one or multiple TsFile files containing time series into another running IoTDB instance directly via SQL execution through the CLI. - -### 4.1 Command - -```SQL -load '' with ( - 'attribute-key1'='attribute-value1', - 'attribute-key2'='attribute-value2', -) -``` - -* `` : The path to a TsFile or a folder containing multiple TsFiles. -* ``: Optional parameters, as described below. - -| Key | Key Description | Value Type | Value Range | Value is Required | Default Value | -|--------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------|--------------------------------|-------------------|----------------------------| -| `database-level` | When the database corresponding to the TsFile does not exist, the database hierarchy level can be specified via the ` database-level` parameter. The default is the level set in `iotdb-common.properties`. For example, setting level=1 means the prefix path of level 1 in all time series in the TsFile will be used as the database. | Integer | `[1: Integer.MAX_VALUE]` | No | 1 | -| `on-success` | Action for successfully loaded TsFiles: `delete` (delete the TsFile after successful import) or `none` (retain the TsFile in the source folder). | String | `delete / none` | No | delete | -| `model` | Specifies whether the TsFile uses the `table` model or `tree` model. 
This parameter has had no effect since V2.0.2.1; the system automatically identifies whether the data model is tree-based or table-based. | String | `tree / table` | No | Aligns with `-sql_dialect` | -| `database-name` | Table model only: Target database for import. Automatically created if it does not exist. The database-name must not include the `root.` prefix (an error will occur if included). | String | `-` | No | null | -| `convert-on-type-mismatch` | Whether to perform type conversion during loading if data types in the TsFile do not match the target schema. | Boolean | `true / false` | No | true | -| `verify` | Whether to validate the schema before loading the TsFile. | Boolean | `true / false` | No | true | -| `tablet-conversion-threshold` | Size threshold (in bytes) for converting TsFiles into tablet format during loading. Default: `-1` (no conversion for any TsFile). | Integer | `[-1,0 : Integer.MAX_VALUE]` | No | -1 | -| `async` | Whether to enable asynchronous loading. If enabled, TsFiles are moved to an active-load directory and loaded into the `database-name` asynchronously. | Boolean | `true / false` | No | false | - -### 4.2 Example - -```SQL --- Create target database: database2 -IoTDB> create database database2 -Msg: The statement is executed successfully. - -IoTDB> use database2 -Msg: The statement is executed successfully. - -IoTDB:database2> show tables details -+---------+-------+------+-------+ -|TableName|TTL(ms)|Status|Comment| -+---------+-------+------+-------+ -+---------+-------+------+-------+ -Empty set. - --- Import the TsFile by executing LOAD SQL -IoTDB:database2> load '/home/dump0.tsfile' with ( 'on-success'='none', 'database-name'='database2') -Msg: The statement is executed successfully. 
- --- Verify whether the import was successful -IoTDB:database2> select * from table2 -+-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|temperature|humidity|status| arrival_time| -+-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ -|2024-11-30T00:00:00.000+08:00| 上海| 3002| 101| 90.0| 35.2| true| null| -|2024-11-29T00:00:00.000+08:00| 上海| 3001| 101| 85.0| 35.1| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T00:00:00.000+08:00| 北京| 1001| 101| 85.0| 35.1| true|2024-11-27T16:37:01.000+08:00| -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| null| 45.1| true| null| -|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| 85.0| 35.2| false|2024-11-28T08:00:09.000+08:00| -|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| -+-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ -``` diff --git a/src/UserGuide/latest-Table/Tools-System/Maintenance-Tool_timecho.md b/src/UserGuide/latest-Table/Tools-System/Maintenance-Tool_timecho.md deleted file mode 100644 index 725b758a4..000000000 --- a/src/UserGuide/latest-Table/Tools-System/Maintenance-Tool_timecho.md +++ /dev/null @@ -1,1151 +0,0 @@ - -# Cluster Management Tool - -## 1. IoTDB-OpsKit - -The IoTDB OpsKit is an easy-to-use operation and maintenance tool designed for TimechoDB (Enterprise-grade product based on Apache IoTDB). It helps address the operational and maintenance challenges of multi-node distributed IoTDB deployments by providing functionalities such as cluster deployment, start/stop management, elastic scaling, configuration updates, and data export. With one-click command execution, it simplifies the management of complex database clusters and significantly reduces operational complexity. 
- -This document provides guidance on remotely deploying, configuring, starting, and stopping IoTDB cluster instances using the cluster management tool. - -### 1.1 Prerequisites - -The IoTDB OpsKit requires GLIBC 2.17 or later, which means the minimum supported operating system version is CentOS 7. The target machines for IoTDB deployment must have the following dependencies installed: - -- JDK 8 or later -- lsof -- netstat -- unzip - -If any of these dependencies are missing, please install them manually. The last section of this document provides installation commands for reference. - -> **Note:** The IoTDB cluster management tool requires **root privileges** to execute. - -### 1.2 Deployment - -#### Download and Installation - -The IoTDB OpsKit is an auxiliary tool for TimechoDB. Please contact Timecho team to obtain the download instructions. - -To install: - -1. Navigate to the `iotdb-opskit` directory and execute: - -```Bash -bash install-iotdbctl.sh -``` - -This will activate the `iotdbctl` command in the current shell session. You can verify the installation by checking the deployment prerequisites: - -```Bash -iotdbctl cluster check example -``` - -1. Alternatively, if you prefer not to activate `iotdbctl`, you can execute commands directly using the absolute path: - -```Bash -/sbin/iotdbctl cluster check example -``` - -### 1.3 Cluster Configuration Files - -The cluster configuration files are stored in the `iotdbctl/config` directory as YAML files. - -- Each YAML file name corresponds to a cluster name. Multiple YAML files can coexist. -- A sample configuration file (`default_cluster.yaml`) is provided in the `iotdbctl/config` directory to assist users in setting up their configurations. - -#### **Structure of YAML Configuration** - -The YAML file consists of the following five sections: - -1. `global` – General settings, such as SSH credentials, installation paths, and JDK configurations. -2. 
`confignode_servers` – Configuration settings for ConfigNodes. -3. `datanode_servers` – Configuration settings for DataNodes. -4. `grafana_server` – Configuration settings for Grafana monitoring. -5. `prometheus_server` – Configuration settings for Prometheus monitoring. - -A sample YAML file (`default_cluster.yaml`) is included in the `iotdbctl/config` directory. - -- You can copy and rename it based on your cluster setup. -- All uncommented fields are mandatory. -- Commented fields are optional. - -**Example:** Checking `default_cluster.yaml` - -To validate a cluster configuration, execute: - -```SQL -iotdbctl cluster check default_cluster -``` - -For a complete list of available commands, refer to the command reference section below. - -#### Parameter Reference - -| **Parameter** | **Description** | **Mandatory** | -| ----------------------- | ------------------------------------------------------------ | ------------- | -| iotdb_zip_dir | IoTDB distribution directory. If empty, the package will be downloaded from `iotdb_download_url`. | NO | -| iotdb_download_url | IoTDB download URL. If `iotdb_zip_dir` is empty, the package will be retrieved from this address. | NO | -| jdk_tar_dir | Local path to the JDK package for uploading and deployment. | NO | -| jdk_deploy_dir | Remote deployment directory for the JDK. | NO | -| jdk_dir_name | JDK decompression directory name. Default: `jdk_iotdb`. | NO | -| iotdb_lib_dir | IoTDB library directory (or `.zip` package for upgrades). Default: commented out. | NO | -| user | SSH login username for deployment. | YES | -| password | SSH password (if omitted, key-based authentication will be used). | NO | -| pkey | SSH private key (used if `password` is not provided). | NO | -| ssh_port | SSH port number. | YES | -| deploy_dir | IoTDB deployment directory. | YES | -| iotdb_dir_name | IoTDB decompression directory name. Default: `iotdb`. | NO | -| datanode-env.sh | Corresponds to `iotdb/config/datanode-env.sh`. 
If both `global` and `datanode_servers` are configured, `datanode_servers` takes precedence. | NO | -| confignode-env.sh | Corresponds to `iotdb/config/confignode-env.sh`. If both `global` and `confignode_servers` are configured, `confignode_servers` takes precedence. | NO | -| iotdb-system.properties | Corresponds to `/config/iotdb-system.properties`. | NO | -| cn_internal_address | The inter-node communication address for ConfigNodes. This parameter defines the address of the surviving ConfigNode, which defaults to `confignode_x`. If both `global` and `confignode_servers` are configured, the value in `confignode_servers` takes precedence. Corresponds to `cn_internal_address` in `iotdb/config/iotdb-system.properties`. | YES | -| dn_internal_address | The inter-node communication address for DataNodes. This address defaults to `confignode_x`. If both `global` and `datanode_servers` are configured, the value in `datanode_servers` takes precedence. Corresponds to `dn_internal_address` in `iotdb/config/iotdb-system.properties`. | YES | - -Both `datanode-env.sh` and `confignode-env.sh` allow **extra parameters** to be appended. These parameters can be configured using the `extra_opts` field. Example from `default_cluster.yaml`: - -```YAML -datanode-env.sh: - extra_opts: | - IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC" - IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200" -``` - -#### ConfigNode Configuration - -ConfigNodes can be configured in `confignode_servers`. Multiple ConfigNodes can be deployed, with the first started ConfigNode (`node1`) serving as the Seed ConfigNode by default. - -| **Parameter** | **Description** | **Mandatory** | -| ----------------------- | ------------------------------------------------------------ | ------------- | -| name | ConfigNode name. | YES | -| deploy_dir | ConfigNode deployment directory. 
| YES | -| cn_internal_address | Inter-node communication address for ConfigNodes, corresponding to `iotdb/config/iotdb-system.properties`. | YES | -| cn_seed_config_node | The cluster configuration address points to the surviving ConfigNode. This address defaults to `confignode_x`. If both `global` and `confignode_servers` are configured, the value in `confignode_servers` takes precedence, corresponding to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| cn_internal_port | Internal communication port, corresponding to `cn_internal_port` in `iotdb/config/iotdb-system.properties`. | YES | -| cn_consensus_port | Consensus communication port, corresponding to `cn_consensus_port` in `iotdb/config/iotdb-system.properties`. | NO | -| cn_data_dir | Data directory for ConfigNodes, corresponding to `cn_data_dir` in `iotdb/config/iotdb-system.properties`. | YES | -| iotdb-system.properties | ConfigNode properties file. If `global` and `confignode_servers` are both configured, values from `confignode_servers` take precedence. | NO | - -#### DataNode Configuration - -Datanodes can be configured in `datanode_servers`. Multiple DataNodes can be deployed, each requiring unique configuration. - -| **Parameter** | **Description** | **Mandatory** | -| ----------------------- | ------------------------------------------------------------ | ------------- | -| name | DataNode name. | YES | -| deploy_dir | DataNode deployment directory. | YES | -| dn_rpc_address | RPC communication address, corresponding to `dn_rpc_address` in `iotdb/config/iotdb-system.properties`. | YES | -| dn_internal_address | Internal communication address, corresponding to `dn_internal_address` in `iotdb/config/iotdb-system.properties`. | YES | -| dn_seed_config_node | Points to the active ConfigNode. Defaults to `confignode_x`. If `global` and `datanode_servers` are both configured, values from `datanode_servers` take precedence. 
Corresponds to `dn_seed_config_node` in `iotdb/config/iotdb-system.properties`. | YES | -| dn_rpc_port | RPC port for DataNodes, corresponding to `dn_rpc_port` in `iotdb/config/iotdb-system.properties`. | YES | -| dn_internal_port | Internal communication port, corresponding to `dn_internal_port` in `iotdb/config/iotdb-system.properties`. | YES | -| iotdb-system.properties | DataNode properties file. If `global` and `datanode_servers` are both configured, values from `datanode_servers` take precedence. | NO | - -#### Grafana Configuration - -Grafana is configured in `grafana_server`, which defines the settings for deploying Grafana as a monitoring solution for IoTDB. - -| **Parameter** | **Description** | **Mandatory** | -| ---------------- | ------------------------------------------------------------ | ------------- | -| grafana_dir_name | Name of the Grafana decompression directory. Default: `grafana_iotdb`. | NO | -| host | The IP address of the machine hosting Grafana. | YES | -| grafana_port | The port Grafana listens on. Default: `3000`. | NO | -| deploy_dir | Deployment directory for Grafana. | YES | -| grafana_tar_dir | Path to the Grafana compressed package. | YES | -| dashboards | Path to pre-configured Grafana dashboards. | NO | - -#### Prometheus Configuration - -Prometheus is configured in `prometheus_server`, which defines the settings for deploying Prometheus as a monitoring solution for IoTDB. - -| **Parameter** | **Description** | **Mandatory** | -| --------------------------- | ------------------------------------------------------------ | ------------- | -| prometheus_dir_name | Name of the Prometheus decompression directory. Default: `prometheus_iotdb`. | NO | -| host | The IP address of the machine hosting Prometheus. | YES | -| prometheus_port | The port Prometheus listens on. Default: `9090`. | NO | -| deploy_dir | Deployment directory for Prometheus. | YES | -| prometheus_tar_dir | Path to the Prometheus compressed package. 
| YES | -| storage_tsdb_retention_time | Number of days data is retained. Default: `15 days`. | NO | -| storage_tsdb_retention_size | Maximum data storage size per block. Default: `512M`. Units: KB, MB, GB, TB, PB, EB. | NO | - -If metrics are enabled in `iotdb-system.properties` (in `config/xxx.yaml`), the configurations will be automatically applied to Prometheus without manual modification. - -**Special Configuration Notes** - -- **Handling Special Characters in YAML Keys**: If a YAML key value contains special characters (such as `:`), it is recommended to enclose the entire value in double quotes (`""`). -- **Avoid Spaces in File Paths**: Paths containing spaces may cause parsing errors in some configurations. - -### 1.4 Usage Scenarios - -#### Data Cleanup - -This operation deletes cluster data directories, including: - -- IoTDB data directories, -- ConfigNode directories (`cn_system_dir`, `cn_consensus_dir`), -- DataNode directories (`dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`), -- Log directories and ext directories specified in the YAML configuration. - -To clean cluster data, perform the following steps: - -```Bash -# Step 1: Stop the cluster -iotdbctl cluster stop default_cluster - -# Step 2: Clean the cluster data -iotdbctl cluster clean default_cluster -``` - -#### Cluster Destruction - -The cluster destruction process completely removes the following resources: - -- Data directories, -- ConfigNode directories (`cn_system_dir`, `cn_consensus_dir`), -- DataNode directories (`dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`), -- Log and ext directories, -- IoTDB deployment directory, -- Grafana and Prometheus deployment directories. - -To destroy a cluster, follow these steps: - -```Bash -# Step 1: Stop the cluster -iotdbctl cluster stop default_cluster - -# Step 2: Destroy the cluster -iotdbctl cluster destroy default_cluster -``` - -#### Cluster Upgrade - -To upgrade the cluster, follow these steps: - -1. 
In `config/xxx.yaml`, set **`iotdb_lib_dir`** to the path of the JAR files to be uploaded. Example: `iotdb/lib` -2. If uploading a compressed package, compress the `iotdb/lib` directory: - -```Bash -zip -r lib.zip apache-iotdb-1.2.0/lib/* -``` - -1. Execute the following commands to distribute the library and restart the cluster: - -```Bash -iotdbctl cluster dist-lib default_cluster -iotdbctl cluster restart default_cluster -``` - -#### Hot Deployment - -Hot deployment allows real-time configuration updates without restarting the cluster. - -Steps: - -1. Modify the configuration in `config/xxx.yaml`. -2. Distribute the updated configuration and reload it: - -```Bash -iotdbctl cluster dist-conf default_cluster -iotdbctl cluster reload default_cluster -``` - -#### Cluster Expansion - -To expand the cluster by adding new nodes: - -1. Add a new DataNode or ConfigNode in `config/xxx.yaml`. -2. Execute the cluster expansion command: - -```Bash -iotdbctl cluster scaleout default_cluster -``` - -#### Cluster Shrinking - -To remove a node from the cluster: - -1. Identify the node name or IP:port in `config/xxx.yaml`: - 1. ConfigNode port: `cn_internal_port` - 2. DataNode port: `rpc_port` -2. Execute the following command: - -```Bash -iotdbctl cluster scalein default_cluster -``` - -#### Managing Existing IoTDB Clusters - -To manage an existing IoTDB cluster with the OpsKit tool: - -1. Configure SSH credentials: - 1. Set `user`, `password` (or `pkey`), and `ssh_port` in `config/xxx.yaml`. -2. Modify IoTDB deployment paths: For example, if IoTDB is deployed at `/home/data/apache-iotdb-1.1.1`: - -```YAML -deploy_dir: /home/data/ -iotdb_dir_name: apache-iotdb-1.1.1 -``` - -1. Configure JDK paths: If `JAVA_HOME` is not used, set the JDK deployment path: - -```YAML -jdk_deploy_dir: /home/data/ -jdk_dir_name: jdk_1.8.2 -``` - -1. 
Set cluster addresses: - -- `cn_internal_address` and `dn_internal_address` -- In `confignode_servers` → `iotdb-system.properties`, configure: - - `cn_internal_address`, `cn_internal_port`, `cn_consensus_port`, `cn_system_dir`, `cn_consensus_dir` -- In `datanode_servers` → `iotdb-system.properties`, configure: - - `dn_rpc_address`, `dn_internal_address`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir` - -1. Execute the initialization command: - -```Bash -iotdbctl cluster init default_cluster -``` - -#### Deploying IoTDB, Grafana, and Prometheus - -To deploy an IoTDB cluster along with Grafana and Prometheus: - -1. Enable metrics: In `iotdb-system.properties`, enable the metrics interface. -2. Configure Grafana: - -- If deploying multiple dashboards, separate names with commas. -- Ensure dashboard names are unique to prevent overwriting. - -3. Configure Prometheus: - -- If the IoTDB cluster has metrics enabled, Prometheus automatically adapts without manual configuration. - -4. Start the cluster: - -```Bash -iotdbctl cluster start default_cluster -``` - -For detailed parameters, refer to the **Cluster Configuration Files** section above. - -### 1.5 Command Reference - -The basic command structure is: - -```Bash -iotdbctl cluster [params (Optional)] -``` - -- `key` – The specific command to execute. -- `cluster_name` – The name of the cluster (matches the YAML file name in `iotdbctl/config`). -- `params` – Optional parameters for the command. - -Example: Deploying the `default_cluster` cluster - -```Bash -iotdbctl cluster deploy default_cluster -``` - -#### Command Overview - -| **Command** | **Description** | **Parameters** | -| ---------- | ---------------------------------- | ---------------------------------------------------- | -| check | Check cluster readiness for deployment. | Cluster name | -| clean | Clean up cluster data. | Cluster name | -| deploy/dist-all | Deploy the cluster. 
| Cluster name, -N module (optional: iotdb, grafana, prometheus), -op force (optional) | -| list | List cluster status. | None | -| start | Start the cluster. | Cluster name, -N node name (optional: iotdb, grafana, prometheus) | -| stop | Stop the cluster. | Cluster name, -N node name (optional), -op force (optional) | -| restart | Restart the cluster. | Cluster name, -N node name (optional), -op force (optional) | -| show | View cluster details. | Cluster name, details (optional) | -| destroy | Destroy the cluster. | Cluster name, -N module (optional: iotdb, grafana, prometheus) | -| scaleout | Expand the cluster. | Cluster name | -| scalein | Shrink the cluster. | Cluster name, -N node name or IP:port | -| reload | Hot reload cluster configuration. | Cluster name | -| dist-conf | Distribute cluster configuration. | Cluster name | -| dumplog | Back up cluster logs. | Cluster name, -N node name, -h target IP, -pw target password, -p target port, -path backup path, -startdate, -enddate, -loglevel, -l transfer speed | -| dumpdata | Back up cluster data. | Cluster name, -h target IP, -pw target password, -p target port, -path backup path, -startdate, -enddate, -l transfer speed | -| dist-lib | Upgrade the IoTDB lib package. | Cluster name | -| init | Initialize the cluster configuration. | Cluster name | -| status | View process status. | Cluster name | -| activate | Activate the cluster. | Cluster name | -| health_check | Perform a health check. | Cluster name, -N node name (optional) | -| backup | Back up the cluster. | Cluster name, -N node name (optional) | -| importschema | Import metadata. | Cluster name, -N node name, -param parameters | -| exportschema | Export metadata. | Cluster name, -N node name, -param parameters | - -### 1.6 Detailed Command Execution Process - -The following examples use `default_cluster.yaml` as a reference. Users can modify the commands according to their specific cluster configuration files. 
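
When several of these commands are scripted together, it can help to assemble the argument list in one place. The following is a minimal sketch, not part of `iotdbctl`: the helper name `build_iotdbctl_command` is hypothetical, and it only covers the `-N` and `-op force` parameters listed in the Command Overview. It builds the argument list and does not execute anything.

```python
from typing import List, Optional


def build_iotdbctl_command(key: str,
                           cluster_name: Optional[str] = None,
                           node: Optional[str] = None,
                           force: bool = False) -> List[str]:
    """Build an argument list of the form: iotdbctl cluster <key> <cluster_name> [params].

    Hypothetical helper for illustration only; it does not run the command.
    """
    if not key:
        raise ValueError("key (the command to execute) is required")
    args = ["iotdbctl", "cluster", key]
    if cluster_name:
        # The `list` command takes no cluster name, so this part is optional.
        args.append(cluster_name)
    if node:
        # -N selects a node name, an IP:port, or a module such as grafana.
        args += ["-N", node]
    if force:
        # -op force is documented for deploy, stop, and restart.
        args += ["-op", "force"]
    return args


if __name__ == "__main__":
    print(" ".join(build_iotdbctl_command("stop", "default_cluster",
                                          node="datanode_1", force=True)))
```

The returned list can be passed to `subprocess.run(args, check=True)` on a machine where `iotdbctl` is on the `PATH`.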
- -#### Check Cluster Deployment Environment - -The following command checks whether the cluster environment meets the deployment requirements: - -```Bash -iotdbctl cluster check default_cluster -``` - -**Execution Steps:** - -1. Locate the corresponding YAML file (`default_cluster.yaml`) based on the cluster name. -2. Retrieve configuration information for ConfigNode and DataNode (`confignode_servers` and `datanode_servers`). -3. Verify the following conditions on the target node: - 1. SSH connectivity - 2. JDK version (must be 1.8 or above) - 3. Required system tools: unzip, lsof, netstat - -**Expected Output:** - -- If successful: `Info: example check successfully!` -- If failed: `Error: example check fail!` - -**Troubleshooting Tips:** - -- JDK version not satisfied: Specify a valid `jdk1.8+` path in the YAML file for deployment. -- Missing system tools: Install unzip, lsof, and netstat on the server. -- Port conflict: Check the error log, e.g., `Error: Server (ip:172.20.31.76) iotdb port (10713) is listening.` - -#### Deploy Cluster - -Deploy the entire cluster using the following command: - -```Bash -iotdbctl cluster deploy default_cluster -``` - -**Execution Steps:** - -1. Locate the corresponding `YAML` file based on the cluster name. -2. Upload the IoTDB and JDK compressed packages (if `jdk_tar_dir` and `jdk_deploy_dir` are configured). -3. Generate and upload the iotdb-system.properties file based on the YAML configuration. 
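
Step 3 turns per-node settings from the YAML file into an `iotdb-system.properties` file. How `iotdbctl` does this internally is not described here, so the following is only a sketch of the idea, assuming the node settings have already been loaded into a dictionary. The function name `render_properties` and the sample values are hypothetical.

```python
from typing import Mapping


def render_properties(settings: Mapping[str, object]) -> str:
    """Render key/value settings as .properties lines, sorted by key.

    Illustrative only: the real tool may order, escape, or merge
    settings differently.
    """
    lines = []
    for key in sorted(settings):
        value = settings[key]
        if value is None:
            continue  # leave unset keys out so the server default applies
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample values, not taken from a real cluster.
    print(render_properties({
        "cn_internal_address": "192.168.1.5",
        "cn_internal_port": 10710,
        "cn_consensus_port": None,
    }))
```

Sorting the keys keeps the generated file stable between runs, which makes it easier to compare two generated configurations.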
- -**Force Deployment:** To overwrite existing deployment directories and redeploy: - -```Bash -iotdbctl cluster deploy default_cluster -op force -``` - -**Deploying Individual Modules:** - -You can deploy specific components individually: - -```Bash -# Deploy Grafana module -iotdbctl cluster deploy default_cluster -N grafana - -# Deploy Prometheus module -iotdbctl cluster deploy default_cluster -N prometheus - -# Deploy IoTDB module -iotdbctl cluster deploy default_cluster -N iotdb -``` - -#### Start Cluster - -Start the cluster using the following command: - -```Bash -iotdbctl cluster start default_cluster -``` - -**Execution Steps:** - -1. Locate the `YAML` file based on the cluster name. -2. Start ConfigNodes sequentially according to the YAML order. - 1. The first ConfigNode is treated as the Seed ConfigNode. - 2. Verify startup by checking process IDs. -3. Start DataNodes sequentially and verify their process IDs. -4. After process verification, check the cluster's service health via CLI. - 1. If the CLI connection fails, retry every 10 seconds, up to 5 times. - -**Start a Single Node:** Start specific nodes by name or IP: - -```Bash -# By node name -iotdbctl cluster start default_cluster -N datanode_1 - -# By IP and port (ConfigNode uses `cn_internal_port`, DataNode uses `rpc_port`) -iotdbctl cluster start default_cluster -N 192.168.1.5:6667 - -# Start Grafana -iotdbctl cluster start default_cluster -N grafana - -# Start Prometheus -iotdbctl cluster start default_cluster -N prometheus -``` - -**Note:** The `iotdbctl` tool relies on `start-confignode.sh` and `start-datanode.sh` scripts. If startup fails, check the cluster status using the following command: - -```Bash -iotdbctl cluster status default_cluster -``` - -#### View Cluster Status - -To view the current cluster status: - -```Bash -iotdbctl cluster show default_cluster -``` - -To view detailed information: - -```Bash -iotdbctl cluster show default_cluster details -``` - -**Execution Steps:** - -1. 
Locate the YAML file and retrieve `confignode_servers` and `datanode_servers` configuration. -2. Execute `show cluster details` via CLI. -3. If one node returns successfully, the process skips checking remaining nodes. - -#### Stop Cluster - -To stop the entire cluster: - -```Bash -iotdbctl cluster stop default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file and retrieve `confignode_servers` and `datanode_servers` configuration. -2. Stop DataNodes sequentially based on the YAML configuration. -3. Stop ConfigNodes in sequence. - -**Force Stop:** To forcibly stop the cluster using `kill -9`: - -```Bash -iotdbctl cluster stop default_cluster -op force -``` - -**Stop a Single Node:** Stop nodes by name or IP: - -```Bash -# By node name -iotdbctl cluster stop default_cluster -N datanode_1 - -# By IP and port -iotdbctl cluster stop default_cluster -N 192.168.1.5:6667 - -# Stop Grafana -iotdbctl cluster stop default_cluster -N grafana - -# Stop Prometheus -iotdbctl cluster stop default_cluster -N prometheus -``` - -**Note:** If the IoTDB cluster is not fully stopped, verify its status using: - -```Bash -iotdbctl cluster status default_cluster -``` - -#### Clean Cluster Data - -To clean up cluster data, execute: - -```Bash -iotdbctl cluster clean default_cluster -``` - -**Execution Steps:** - -1. Locate the `YAML` file and retrieve `confignode_servers` and `datanode_servers` configuration. -2. Verify that no services are running. If any are active, the cleanup will not proceed. -3. Delete the following directories: - 1. IoTDB data directories, - 2. ConfigNode and DataNode system directories (`cn_system_dir`, `dn_system_dir`), - 3. Consensus directories (`cn_consensus_dir`, `dn_consensus_dir`), - 4. Logs and ext directories. - -#### Restart Cluster - -Restart the cluster using the following command: - -```Bash -iotdbctl cluster restart default_cluster -``` - -**Execution Steps:** - -1. 
Locate the YAML file and retrieve configurations for ConfigNodes, DataNodes, Grafana, and Prometheus. -2. Perform a cluster stop followed by a cluster start. - -**Force Restart:** To forcibly restart the cluster: - -```Bash -iotdbctl cluster restart default_cluster -op force -``` - -**Restart a Single Node:** Restart specific nodes by name: - -```Bash -# Restart DataNode -iotdbctl cluster restart default_cluster -N datanode_1 - -# Restart ConfigNode -iotdbctl cluster restart default_cluster -N confignode_1 - -# Restart Grafana -iotdbctl cluster restart default_cluster -N grafana - -# Restart Prometheus -iotdbctl cluster restart default_cluster -N prometheus -``` - -#### Cluster Expansion - -To add a node to the cluster: - -1. Edit `config/xxx.yaml` to add a new DataNode or ConfigNode. -2. Execute the following command: - -```Bash -iotdbctl cluster scaleout default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file and retrieve node configuration. -2. Upload the IoTDB and JDK packages (if `jdk_tar_dir` and `jdk_deploy_dir` are configured). -3. Generate and upload iotdb-system.properties. -4. Start the new node and verify success. - -Tip: Only one node expansion is supported per execution. - -#### Cluster Shrinking - -To remove a node from the cluster: - -1. Identify the node name or IP:port in `config/xxx.yaml`. -2. Execute the following command: - -```Bash -#Scale down by node name -iotdbctl cluster scalein default_cluster -N nodename - -#Scale down according to ip+port (ip+port obtains the only node according to ip+dn_rpc_port in datanode, and obtains the only node according to ip+cn_internal_port in confignode) -iotdbctl cluster scalein default_cluster -N ip:port -``` - -**Execution Steps:** - -1. Locate the YAML file and retrieve node configuration. -2. Ensure at least one ConfigNode and one DataNode remain. -3. Identify the node to remove, execute the scale-in command, and delete the node directory. 
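
The check in step 2 can be sketched as follows. This is an illustration of the rule only, assuming the cluster is described as a dictionary of node names keyed by `confignode_servers` and `datanode_servers`; the function name `can_scale_in` is hypothetical, and the actual check performed by `iotdbctl` may differ.

```python
from typing import Dict, List


def can_scale_in(cluster: Dict[str, List[str]], node_name: str) -> bool:
    """Return True if removing node_name leaves at least one ConfigNode
    and at least one DataNode in the cluster."""
    confignodes = cluster.get("confignode_servers", [])
    datanodes = cluster.get("datanode_servers", [])
    if node_name not in confignodes and node_name not in datanodes:
        return False  # unknown node: nothing to remove
    remaining_cn = [n for n in confignodes if n != node_name]
    remaining_dn = [n for n in datanodes if n != node_name]
    return len(remaining_cn) >= 1 and len(remaining_dn) >= 1


if __name__ == "__main__":
    cluster = {
        "confignode_servers": ["confignode_1"],
        "datanode_servers": ["datanode_1", "datanode_2"],
    }
    print(can_scale_in(cluster, "datanode_1"))    # one DataNode remains
    print(can_scale_in(cluster, "confignode_1"))  # would remove the last ConfigNode
```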
- -Tip: Only one node shrinking is supported per execution. - -#### Destroy Cluster - -To destroy the entire cluster: - -```Bash -iotdbctl cluster destroy default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file and retrieve node information. -2. Verify that all nodes are stopped. If any node is running, the destruction will not proceed. -3. Delete the following directories: - 1. IoTDB data directories, - 2. ConfigNode and DataNode system directories, - 3. Consensus directories, - 4. Logs, ext, and deployment directories, - 5. Grafana and Prometheus directories. - -**Destroy a Single Module:** To destroy individual modules: - -```Bash -# Destroy Grafana -iotdbctl cluster destroy default_cluster -N grafana - -# Destroy Prometheus -iotdbctl cluster destroy default_cluster -N prometheus - -# Destroy IoTDB -iotdbctl cluster destroy default_cluster -N iotdb -``` - -#### Distribute Cluster Configuration - -To distribute the cluster configuration files across nodes: - -```Bash -iotdbctl cluster dist-conf default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on `cluster-name`. -2. Retrieve configuration from `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus`. -3. Generate and upload `iotdb-system.properties` to the specified nodes. - -#### Hot Load Cluster Configuration - -To reload the cluster configuration without restarting: - -```Plain -iotdbctl cluster reload default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on `cluster-name`. -2. Execute the `load configuration` command through the CLI for each node. - -#### Cluster Node Log Backup - -To back up logs from specific nodes: - -```Bash -iotdbctl cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs' -``` - -**Execution Steps:** - -1. Locate the YAML file based on `cluster-name`. -2. 
Verify node existence (`datanode_1` and `confignode_1`). -3. Back up log data within the specified date range. -4. Save logs to `/iotdb/logs` or the default IoTDB installation path. - -| **Command** | **Description** | **Mandatory** | -| ----------- | ------------------------------------------------------------ | ------------- | -| -h | IP address of the backup server | NO | -| -u | Username for the backup server | NO | -| -pw | Password for the backup server | NO | -| -p | Backup server port (Default: `22`) | NO | -| -path | Path for backup data (Default: current path) | NO | -| -loglevel | Log level (`all`, `info`, `error`, `warn`. Default: `all`) | NO | -| -l | Speed limit (Default: unlimited; Range: 0 to 104857601 Kbit/s) | NO | -| -N | Node names (comma-separated) | YES | -| -startdate | Start date (inclusive; Default: `1970-01-01`) | NO | -| -enddate | End date (inclusive) | NO | -| -logs | IoTDB log storage path (Default: `{iotdb}/logs`) | NO | - -#### Cluster Data Backup - -To back up data from the cluster: - -```Bash -iotdbctl cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas' -``` - -This command identifies the leader node from the YAML file and backs up data within the specified date range to the `/iotdb/datas` directory on the `192.168.9.48` server. 
- -| **Command** | **Description** | **Mandatory** | -| ------------ | ------------------------------------------------------------ | ------------- | -| -h | IP address of the backup server | NO | -| -u | Username for the backup server | NO | -| -pw | Password for the backup server | NO | -| -p | Backup server port (Default: `22`) | NO | -| -path | Path for storing backup data (Default: current path) | NO | -| -granularity | Data partition granularity | YES | -| -l | Speed limit (Default: unlimited; Range: 0 to 104857601 Kbit/s) | NO | -| -startdate | Start date (inclusive) | YES | -| -enddate | End date (inclusive) | YES | - -#### Cluster Upgrade - -To upgrade the cluster: - -```Bash -iotdbctl cluster dist-lib default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name. -2. Retrieve the configuration of `confignode_servers` and `datanode_servers`. -3. Upload the library package. - -**Note:** After the upgrade, restart all IoTDB nodes for the changes to take effect. - -#### Cluster Initialization - -To initialize the cluster: - -```Bash -iotdbctl cluster init default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name. -2. Retrieve configuration details for `confignode_servers`, `datanode_servers`, `Grafana`, and `Prometheus`. -3. Initialize the cluster configuration. - -#### View Cluster Process - -To check the cluster process status: - -```Bash -iotdbctl cluster status default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name. -2. Retrieve configuration details for `confignode_servers`, `datanode_servers`, `Grafana`, and `Prometheus`. -3. Display the operational status of each node in the cluster. - -#### Cluster Authorization Activation - -**Default Activation Method:** To activate the cluster using an activation code: - -```Bash -iotdbctl cluster activate default_cluster -``` - -**Execution Steps:** - -1. 
Locate the YAML file based on the cluster name. -2. Retrieve the `confignode_servers` configuration. -3. Obtain the machine code. -4. Enter the activation code when prompted. - -Example: - -```Bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=, lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful. -``` - -**Activate a Specific Node:** To activate a specific node: - -```Bash -iotdbctl cluster activate default_cluster -N confignode1 -``` - -**Activate via License Path:** To activate using a license file: - -```Bash -iotdbctl cluster activate default_cluster -op license_path -``` - -#### Cluster Health Check - -To perform a cluster health check: - -```Bash -iotdbctl cluster health_check default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name. -2. Retrieve configuration details for `confignode_servers` and `datanode_servers`. -3. Execute `health_check.sh` on each node. - -**Single Node Health Check:** To check a specific node: - -```Bash -iotdbctl cluster health_check default_cluster -N datanode_1 -``` - -#### Cluster Shutdown Backup - -To back up the cluster during shutdown: - -```Bash -iotdbctl cluster backup default_cluster -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name. -2. Retrieve configuration details for `confignode_servers` and `datanode_servers`. -3. Execute `backup.sh` on each node. - -**Single Node Backup:** To back up a specific node: - -```Bash -iotdbctl cluster backup default_cluster -N datanode_1 -``` - -**Note:** Multi-node deployment on a single machine only supports quick mode. 
- -#### Cluster Metadata Import - -To import metadata: - -```Bash -iotdbctl cluster importschema default_cluster -N datanode1 -param "-s ./dump0.csv -fd ./failed/ -lpf 10000" -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name to retrieve `datanode_servers` configuration information. -2. Execute metadata import using `import-schema.sh` on `datanode1`. - -Parameter Descriptions for `-param`: - -| **Command** | **Description** | **Mandatory** | -| ----------- | ------------------------------------------------------------ | ------------- | -| -s | Specify the data file or directory to be imported. If a directory is specified, all files with a `.csv` extension will be imported in bulk. | YES | -| -fd | Specify a directory to store failed import files. If omitted, failed files will be saved in the source directory with the `.failed` suffix added to the original filename. | NO | -| -lpf | Specify the maximum number of lines per failed import file (Default: 10,000). | NO | - -#### Cluster Metadata Export - -To export metadata: - -```Bash -iotdbctl cluster exportschema default_cluster -N datanode1 -param "-t ./ -pf ./pattern.txt -lpf 10 -timeout 10000" -``` - -**Execution Steps:** - -1. Locate the YAML file based on the cluster name to retrieve `datanode_servers` configuration information. -2. Execute metadata export using `export-schema.sh` on `datanode1`. - -**Parameter Descriptions for** **`-param`:** - -| **Command** | **Description** | **Mandatory** | -| ----------- | ------------------------------------------------------------ | ------------- | -| -t | Specify the output path for the exported CSV file. | YES | -| -path | Specify the metadata path pattern for export. If this parameter is specified, the `-pf` parameter will be ignored. Example: `root.stock.**`. | NO | -| -pf | If `-path` is not specified, use this parameter to specify the file containing metadata paths to export. The file must be in `.txt` format, with one path per line. 
| NO | -| -lpf | Specify the maximum number of lines per exported file (Default: 10,000). | NO | -| -timeout | Specify the session query timeout in milliseconds. | NO | - -### 1.7 Introduction to Cluster Deployment Tool Samples - -In the cluster deployment tool installation directory (config/example), there are three YAML configuration examples. If needed, you can copy and modify them for your deployment. - -| **Name** | **Description** | -| -------------------- | -------------------------------------------------------- | -| default_1c1d.yaml | Example configuration for 1 ConfigNode and 1 DataNode. | -| default_3c3d.yaml | Example configuration for 3 ConfigNodes and 3 DataNodes. | -| default_3c3d_grafa_prome | Example configuration for 3 ConfigNodes, 3 DataNodes, Grafana, and Prometheus. | - -## 2. IoTDB Data Directory Overview Tool - -The IoTDB Data Directory Overview Tool provides an overview of the IoTDB data directory structure. It is located at `tools/tsfile/print-iotdb-data-dir`. - -### 2.1 Usage - -- For Windows: - -```Bash -.\print-iotdb-data-dir.bat () -``` - -- For Linux or MacOs: - -```Shell -./print-iotdb-data-dir.sh () -``` - -**Note:** If the output path is not specified, the default relative path `IoTDB_data_dir_overview.txt` will be used. - -### 2.2 Example - -Use Windows in this example: - -~~~Bash -.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data -```````````````````````` -Starting Printing the IoTDB Data Directory Overview -```````````````````````` -output save path:IoTDB_data_dir_overview.txt -data dir num:1 -143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
-|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -~~~ - -## 3. TsFile Sketch Tool - -The TsFile Sketch Tool provides a summarized view of the content within a TsFile. It is located at `tools/tsfile/print-tsfile`. - -### 3.1 Usage - -- For Windows: - -```Plain -.\print-tsfile-sketch.bat () -``` - -- For Linux or MacOs: - -```Plain -./print-tsfile-sketch.sh () -``` - -**Note:** If the output path is not specified, the default relative path `TsFile_sketch_view.txt` will be used. - -### 3.2 Example - -Use Windows in this example: - -~~~Bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
--------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] 
offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - └──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -~~~ - -Explanations: - -- The output is separated by the `|` symbol. The left side indicates the actual position within the TsFile, while the right side provides a summary of the content. -- The `"||||||||||||||||||||"` lines are added for readability and are not part of the actual TsFile data. -- The final `"IndexOfTimerseriesIndex Tree"` section reorganizes the metadata index tree at the end of the TsFile. This view aids understanding but does not represent actual stored data. - -## 4. TsFile Resource Sketch Tool - -The TsFile Resource Sketch Tool displays details about TsFile resource files. It is located at `tools/tsfile/print-tsfile-resource-files`. 
- -### 4.1 Usage - -- For Windows: - -```Bash -.\print-tsfile-resource-files.bat -``` - -- For Linux or MacOs: - -```Plain -./print-tsfile-resource-files.sh -``` - -### 4.2 Example - -Use Windows in this example: - -~~~Bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. 
-.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. -~~~ diff --git a/src/UserGuide/latest-Table/Tools-System/Monitor-Tool_timecho.md b/src/UserGuide/latest-Table/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index 418d62f82..000000000 --- a/src/UserGuide/latest-Table/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,194 +0,0 @@ - -# Monitor Tool - -## 1. **Prometheus** **Integration** - -### 1.1 **Prometheus Metric Mapping** - -The following table illustrates the mapping of IoTDB metrics to the Prometheus-compatible format. 
For a given metric with `Metric Name = name` and tags `K1=V1, ..., Kn=Vn`, the mapping follows this pattern, where `value` represents the actual measurement. - -| **Metric Type** | **Mapping** | -| ---------------- | ------------------------------------------------------------ | -| Counter | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value | -| AutoGauge, Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value | -| Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", quantile="0.5"} value<br>name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", quantile="0.99"} value | -| Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", rate="m1"} value<br>name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", rate="m5"} value<br>name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", rate="m15"} value<br>name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", rate="mean"} value | -| Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", quantile="0.5"} value<br>name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", quantile="0.99"} value | - -### 1.2 **Configuration File** - -To enable Prometheus metric collection in IoTDB, modify the configuration file as follows: - -1. Taking DataNode as an example, set the following in `iotdb-system.properties`: - -```Properties -dn_metric_reporter_list=PROMETHEUS -dn_metric_level=CORE -dn_metric_prometheus_reporter_port=9091 -``` - -1. Start the IoTDB DataNodes. -2. Use a web browser or `curl` to access `http://server_ip:9091/metrics` to retrieve metric data, such as: - -```Plain -... -# HELP file_count -# TYPE file_count gauge -file_count{name="wal",} 0.0 -file_count{name="unseq",} 0.0 -file_count{name="seq",} 2.0 -... -``` - -### 1.3 **Prometheus + Grafana** **Integration** - -IoTDB exposes monitoring data in the standard Prometheus-compatible format. Prometheus collects and stores these metrics, while Grafana is used for visualization. - -**Integration Workflow** - -The following picture describes the relationships among IoTDB, Prometheus and Grafana: - -![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png) - -IoTDB-Prometheus-Grafana Workflow - -1. IoTDB continuously collects monitoring metrics. -2. Prometheus collects metrics from IoTDB at a configurable interval. -3. Prometheus stores the collected metrics in its internal time-series database (TSDB). -4. Grafana queries Prometheus at a configurable interval and visualizes the metrics. 
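
What Prometheus collects in step 2 is the plain-text output of the `/metrics` endpoint shown in section 1.2. To inspect that output without Prometheus, the sample lines can be parsed directly. The following is a simplified sketch under stated assumptions: `parse_sample` and `parse_metrics` are hypothetical names, and the parser handles only `name{labels} value` lines, ignoring the optional timestamps and other details of the full exposition format.

```python
import re
from typing import Dict, List, Tuple

Sample = Tuple[str, Dict[str, str], float]

# metric_name, optional {label="value",...} block, then the value
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line: str) -> Sample:
    """Parse one sample line such as: file_count{name="seq",} 2.0"""
    match = _SAMPLE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metric sample line: {line!r}")
    name, label_text, value = match.groups()
    labels = dict(_LABEL.findall(label_text or ""))
    return name, labels, float(value)


def parse_metrics(text: str) -> List[Sample]:
    """Parse scraped text, skipping blank lines, comments, and '...' elisions."""
    samples = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or stripped == "...":
            continue
        samples.append(parse_sample(stripped))
    return samples


if __name__ == "__main__":
    scraped = (
        "# HELP file_count\n"
        "# TYPE file_count gauge\n"
        'file_count{name="wal",} 0.0\n'
        'file_count{name="seq",} 2.0\n'
    )
    for name, labels, value in parse_metrics(scraped):
        print(name, labels, value)
```

The input text can be obtained with `curl http://server_ip:9091/metrics` or any HTTP client, using the reporter port configured in section 1.2.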
- -**Prometheus Configuration Example** - -To configure Prometheus to collect IoTDB metrics, modify the `prometheus.yml` file as follows: - -```YAML -job_name: pull-metrics -honor_labels: true -honor_timestamps: true -scrape_interval: 15s -scrape_timeout: 10s -metrics_path: /metrics -scheme: http -follow_redirects: true -static_configs: - - targets: - - localhost:9091 -``` - -For more details, refer to: - -- Prometheus Documentation: - - [Prometheus getting_started](https://prometheus.io/docs/prometheus/latest/getting_started/) - - [Prometheus scrape metrics](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) -- Grafana Documentation: - - [Grafana getting_started](https://grafana.com/docs/grafana/latest/getting-started/getting-started/) - - [Grafana query metrics from Prometheus](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus) - -## 2. **Apache IoTDB Dashboard** - -We introduce the Apache IoTDB Dashboard, designed for unified centralized operations and management, which enables monitoring multiple clusters through a single panel. - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png) - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png) - -You can access the Dashboard's Json file in TimechoDB. - -### 2.1 **Cluster Overview** - -Including but not limited to: - -- Total number of CPU cores, memory capacity, and disk space in the cluster. -- Number of ConfigNodes and DataNodes in the cluster. -- Cluster uptime. -- Cluster write throughput. -- Current CPU, memory, and disk utilization across all nodes. -- Detailed information for individual nodes. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png) - -### 2.2 **Data Writing** - -Including but not limited to: - -- Average write latency, median latency, and the 99% percentile latency. -- Number and size of WAL files. -- WAL flush SyncBuffer latency per node. 
- -![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png) - -### 2.3 **Data Querying** - -Including but not limited to: - -- Time series metadata query load time per node. -- Time series data read duration per node. -- Time series metadata modification duration per node. -- Chunk metadata list loading time per node. -- Chunk metadata modification duration per node. -- Chunk metadata-based filtering duration per node. -- Average time required to construct a Chunk Reader. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png) - -### 2.4 **Storage Engine** - -Including but not limited to: - -- File count and size by type. -- Number and size of TsFiles at different processing stages. -- Task count and execution duration for various operations. - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png) - -### 2.5 **System Monitoring** - -Including but not limited to: - -- System memory, swap memory, and process memory usage. -- Disk space, file count, and file size statistics. -- JVM garbage collection (GC) time percentage, GC events by type, GC data volume, and heap memory utilization across generations. -- Network throughput and packet transmission rate. 
- -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png) - -### 2.6 Data Synchronization - -Including but not limited to: - -- Pipe event commit queue size, number of unassigned Pipe events -- Number of unprocessed events in the Source queue, Source event feeding rate, Processor event processing rate -- Number of untransmitted events for all Pipe Sinks/Sources, transmission event rate of Pipe connectors -- Retry queue size and pending handler count of Pipe Sinks; total data size before and after compression and compression duration of Pipe Sinks; batch size and batch interval distribution of Pipe Sinks -- Pipe memory usage and capacity, number of Pipe phantom references, quantity and total size of linked TsFiles, disk bytes read for TsFile transmission via Pipe - -![](/img/monitor-tool-pipe-1-en.png) - -![](/img/monitor-tool-pipe-2-en.png) - -![](/img/monitor-tool-pipe-3-en.png) - -![](/img/monitor-tool-pipe-4-en.png) \ No newline at end of file diff --git a/src/UserGuide/latest-Table/Tools-System/Schema-Export-Tool_timecho.md b/src/UserGuide/latest-Table/Tools-System/Schema-Export-Tool_timecho.md deleted file mode 100644 index f01e67169..000000000 --- a/src/UserGuide/latest-Table/Tools-System/Schema-Export-Tool_timecho.md +++ /dev/null @@ -1,111 +0,0 @@ - - -# Schema Export - -## 1. Overview - -The schema export tool `export-schema.sh/bat` is located in the `tools` directory. It can export schema from a specified database in IoTDB to a script file. - -## 2. 
Detailed Functionality - -### 2.1 Parameter - -| **Short Param** | **Full Param** | **Description** | Required | Default | -|-----------------|----------------------------|-----------------------------------------------------------------------| ------------------------------------- |-----------------------------------------------| -| `-h` | `--host` | Hostname | No | 127.0.0.1 | -| `-p` | `--port` | Port number | No | 6667 | -| `-u` | `--username` | Username | No | root | -| `-pw` | `--password` | Password. Hidden input is supported since V2.0.9.1 | No | `TimechoDB@2021` (`root` before V2.0.6) | -| `-sql_dialect` | `--sql_dialect` | Specifies whether the server uses the `tree` model or the `table` model | No | tree | -| `-db` | `--database` | Target database to export (only applies when `-sql_dialect=table`) | Required if `-sql_dialect=table` | - | -| `-table` | `--table` | Target table to export (only applies when `-sql_dialect=table`) | No | - | -| `-t` | `--target` | Output directory (created if it doesn't exist) | Yes | | -| `-path` | `--path_pattern` | Path pattern for metadata export | Required if `-sql_dialect=tree` | | -| `-pfn` | `--prefix_file_name` | Output filename prefix | No | `dump_dbname.sql` | -| `-lpf` | `--lines_per_file` | Maximum lines per dump file (only applies when `-sql_dialect=tree`) | No | `10000` | -| `-timeout` | `--query_timeout` | Query timeout in milliseconds (`-1` = no timeout) | No | `-1` (range: `-1` to `Long.MAX_VALUE` = 9223372036854775807) | -| `-help` | `--help` | Display help information | No | | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - -### 2.2 Command - -```Bash -# Unix/OS X -> tools/export-schema.sh [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -# Windows -# Before version V2.0.4.x -> tools\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] - -# V2.0.4.x and later versions -> tools\windows\schema\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -``` - -### 2.3 Examples - -Export schema from `database1` to `/home`: - -```Bash -./export-schema.sh -sql_dialect table -t /home/ -db database1 -``` - -Output `dump_database1.sql`: - -```sql -DROP TABLE IF EXISTS table1; -CREATE TABLE table1( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -DROP TABLE IF EXISTS table2; -CREATE TABLE table2( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -``` diff --git a/src/UserGuide/latest-Table/Tools-System/Schema-Import-Tool_timecho.md b/src/UserGuide/latest-Table/Tools-System/Schema-Import-Tool_timecho.md deleted file mode 100644 index 4b0af24c3..000000000 --- a/src/UserGuide/latest-Table/Tools-System/Schema-Import-Tool_timecho.md +++ /dev/null @@ -1,166 +0,0 @@ - - -# Schema Import - -## 1. Overview - -The schema import tool `import-schema.sh/bat` is located in `tools` directory. - -## 2. 
Detailed Functionality - -### 2.1 Parameter - -| **Short Param** | **Full Param** | **Description** | Required | Default | -|-----------------| ------------------------------- |-----------------------------------------------------------------------| ---------- |----------------------------------------------| -| `-h` | `--host` | Hostname | No | 127.0.0.1 | -| `-p` | `--port` | Port number | No | 6667 | -| `-u` | `--username` | Username | No | root | -| `-pw` | `--password` | Password. Hidden input is supported since V2.0.9.1 | No | `TimechoDB@2021` (`root` before V2.0.6) | -| `-sql_dialect` | `--sql_dialect` | Specifies whether the server uses the `tree` model or the `table` model | No | tree | -| `-db` | `--database` | Target database for import | Yes | - | -| `-table` | `--table` | Target table for import (only applies when `-sql_dialect=table`) | No | - | -| `-s` | `--source` | Local directory path containing script file(s) to import | Yes | | -| `-fd` | `--fail_dir` | Directory to save failed import files | No | | -| `-lpf` | `--lines_per_failed_file` | Maximum lines per failed file (only applies when `-sql_dialect=table`) | No | `100000` (range: `0` to `Integer.MAX_VALUE` = 2147483647) | -| `-help` | `--help` | Display help information | No | | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - -### 2.2 Command - -```Bash -# Unix/OS X -tools/import-schema.sh [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# Windows -# Before version V2.0.4.x -tools\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# V2.0.4.x and later versions -tools\windows\schema\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] -``` - -### 2.3 Examples - -Import `dump_database1.sql` from `/home` into `database2`, - -```sql --- File content (dump_database1.sql): -DROP TABLE IF EXISTS table1; -CREATE TABLE table1( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -DROP TABLE IF EXISTS table2; -CREATE TABLE table2( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -``` - -Executing the command: - -```Bash -./import-schema.sh -sql_dialect table -s /home/dump_database1.sql -db database2 - -# If database2 doesn't exist -The target database database2 does not exist - -# If database2 exists -Import completely! -``` - -Verification: - -```Bash -# Before import -IoTDB:database2> show tables -+---------+-------+ -|TableName|TTL(ms)| -+---------+-------+ -+---------+-------+ -Empty set. 
- -# After import -IoTDB:database2> show tables details -+---------+-------+------+-------+ -|TableName|TTL(ms)|Status|Comment| -+---------+-------+------+-------+ -| table2| INF| USING| null| -| table1| INF| USING| null| -+---------+-------+------+-------+ - -IoTDB:database2> desc table1 -+------------+---------+---------+ -| ColumnName| DataType| Category| -+------------+---------+---------+ -| time|TIMESTAMP| TIME| -| region| STRING| TAG| -| plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model_id| STRING|ATTRIBUTE| -| maintenance| STRING|ATTRIBUTE| -| temperature| FLOAT| FIELD| -| humidity| FLOAT| FIELD| -| status| BOOLEAN| FIELD| -|arrival_time|TIMESTAMP| FIELD| -+------------+---------+---------+ - -IoTDB:database2> desc table2 -+------------+---------+---------+ -| ColumnName| DataType| Category| -+------------+---------+---------+ -| time|TIMESTAMP| TIME| -| region| STRING| TAG| -| plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model_id| STRING|ATTRIBUTE| -| maintenance| STRING|ATTRIBUTE| -| temperature| FLOAT| FIELD| -| humidity| FLOAT| FIELD| -| status| BOOLEAN| FIELD| -|arrival_time|TIMESTAMP| FIELD| -+------------+---------+---------+ -``` diff --git a/src/UserGuide/latest-Table/User-Manual/Audit-Log_timecho.md b/src/UserGuide/latest-Table/User-Manual/Audit-Log_timecho.md deleted file mode 100644 index c2b4fcaa2..000000000 --- a/src/UserGuide/latest-Table/User-Manual/Audit-Log_timecho.md +++ /dev/null @@ -1,165 +0,0 @@ - - - -# Security Audit - -## 1. Introduction - -Audit logs serve as the record credentials of a database, enabling tracking of various operations (e.g., create, read, update, delete) to ensure information security. 
The audit log feature in IoTDB supports the following capabilities: - -* Supports enabling/disabling the audit log functionality through configuration -* Supports configuring operation types and privilege levels to be recorded via parameters -* Supports setting the storage duration of audit log files, including time-based rolling (via TTL) and space-based rolling (via SpaceTL) -* Supports configuring parameters to count slow requests (with write/query latency exceeding a threshold, default 3000 milliseconds) within any specified time period -* Audit log files are stored in encrypted format by default - -> Note: This feature is available from version V2.0.8 onwards. - -## 2. Configuration Parameters - -Edit the `iotdb-system.properties` file to enable audit logging using the following parameters: - - -* V2.0.8.1 - -| Parameter Name | Description | Data Type | Default Value | Activation Method | -|-------------------------------------------|------------------------------------------------------------------------------------------------------------|-----------|-------------------------------|-------------------| -| `enable_audit_log` | Whether to enable audit logging. true: enabled. false: disabled. | Boolean | false | Hot Reload | -| `auditable_operation_type` | Operation type selection. DML: all DML operations are logged; DDL: all DDL operations are logged; QUERY: all query operations are logged; CONTROL: all control statements are logged. | String | DML,DDL,QUERY,CONTROL | Hot Reload | -| `auditable_operation_level` | Permission level selection. global: log all audit events; object: only log events related to data instances. Containment relationship: object < global. For example: when set to global, all audit logs are recorded normally; when set to object, only operations on specific data instances are recorded. | String | global | Hot Reload | -| `auditable_operation_result` | Audit result selection. 
success: log only successful events; fail: log only failed events | String | success,fail | Hot Reload | -| `audit_log_ttl_in_days` | Audit log TTL (Time To Live). Logs older than this threshold will expire. | Double | -1.0 (never deleted) | Hot Reload | -| `audit_log_space_tl_in_GB` | Audit log SpaceTL. Logs will start rotating when total space reaches this threshold. | Double | 1.0 | Hot Reload | -| `audit_log_batch_interval_in_ms` | Batch write interval for audit logs | Long | 1000 | Hot Reload | -| `audit_log_batch_max_queue_bytes` | Maximum byte size of the queue for batch processing audit logs. Subsequent write operations will be blocked when this threshold is exceeded. | Long | 268435456 | Hot Reload | - -* V2.0.9.2 - -| Parameter Name | Description | Data Type | Default Value | Activation Method | -|-------------------------------------------|------------------------------------------------------------------------------------------------------------|-----------|-------------------------------|-------------------| -| `enable_audit_log` | Whether to enable audit logging. true: enabled. false: disabled. | Boolean | false | Hot Reload | -| `auditable_operation_type` | Operation type selection. DML: all DML operations are logged; DDL: all DDL operations are logged; QUERY: all query operations are logged; CONTROL: all control statements are logged. | String | DML,DDL,QUERY,CONTROL | Hot Reload | -| `auditable_dml_event_type` | Event types for auditing DML operations. `OBJECT_AUTHENTICATION`: object authentication, `SLOW_OPERATION`: slow operation | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | Hot Reload | -| `auditable_ddl_event_type` | Event types for auditing DDL operations. `OBJECT_AUTHENTICATION`: object authentication, `SLOW_OPERATION`: slow operation | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | Hot Reload | -| `auditable_query_event_type` | Event types for auditing query operations. 
`OBJECT_AUTHENTICATION`: object authentication, `SLOW_OPERATION`: slow operation | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | Hot Reload | -| `auditable_control_event_type` | Event types for auditing control operations. `CHANGE_AUDIT_OPTION`: audit option change, `OBJECT_AUTHENTICATION`: object authentication, `LOGIN`: login, `LOGOUT`: logout, `DN_SHUTDOWN`: data node shutdown, `SLOW_OPERATION`: slow operation | String | `CHANGE_AUDIT_OPTION`,`OBJECT_AUTHENTICATION`,`LOGIN`,`LOGOUT`,`DN_SHUTDOWN`,`SLOW_OPERATION` | Hot Reload | -| `auditable_operation_level` | Permission level selection. global: log all audit events; object: only log events related to data instances. Containment relationship: object < global. For example: when set to global, all audit logs are recorded normally; when set to object, only operations on specific data instances are recorded. | String | global | Hot Reload | -| `auditable_operation_result` | Audit result selection. success: log only successful events; fail: log only failed events | String | success,fail | Hot Reload | -| `audit_log_ttl_in_days` | Audit log TTL (Time To Live). Logs older than this threshold will expire. | Double | -1.0 (never deleted) | Hot Reload | -| `audit_log_space_tl_in_GB` | Audit log SpaceTL. Logs will start rotating when total space reaches this threshold. | Double | 1.0 | Hot Reload | -| `audit_log_batch_interval_in_ms` | Batch write interval for audit logs | Long | 1000 | Hot Reload | -| `audit_log_batch_max_queue_bytes` | Maximum byte size of the queue for batch processing audit logs. Subsequent write operations will be blocked when this threshold is exceeded. | Long | 268435456 | Hot Reload | - -**Instructions for Object Authentication and Slow Operations:** -- When the parameters `auditable_dml_event_type`, `auditable_ddl_event_type`, `auditable_query_event_type`, or `auditable_control_event_type` are set to `OBJECT_AUTHENTICATION`, the corresponding event types will be recorded in the audit log. 
-- When the parameters `auditable_dml_event_type`, `auditable_ddl_event_type`, `auditable_query_event_type`, or `auditable_control_event_type` are set to `SLOW_OPERATION`, only events of the corresponding type whose execution time exceeds the value of the `slow_query_threshold` parameter (default: 3000 ms) are recorded in the audit log. The `slow_query_threshold` parameter can be configured in the `iotdb-system.properties` file. - - -## 3. Access Methods - -Audit logs can be read directly via SQL. - -### 3.1 SQL Syntax - -```SQL -SELECT (<audit_log_field>, )* log FROM <AUDIT_LOG_PATH> WHERE whereclause ORDER BY order_expression -``` - -Where: - -* `AUDIT_LOG_PATH`: The audit log storage location, `__audit.audit_log` -* `audit_log_field`: The fields to query; see the metadata structure below -* WHERE clause filtering and ORDER BY sorting are supported - -### 3.2 Metadata Structure - -| Field | Description | Data Type | -|------------------------|--------------------------------------------------|----------------| -| `time` | The date and time when the event started | timestamp | -| `username` | User name | string | -| `cli_hostname` | Client hostname identifier | string | -| `audit_event_type` | Audit event type, e.g., WRITE_DATA, GENERATE_KEY| string | -| `operation_type` | Operation type, e.g., DML, DDL, QUERY, CONTROL | string | -| `privilege_type` | Privilege used, e.g., WRITE_DATA, MANAGE_USER | string | -| `privilege_level` | Event privilege level, global or object | string | -| `result` | Event result, success=1, fail=0 | boolean | -| `database` | Database name | string | -| `sql_string` | User's original SQL statement | string | -| `log` | Detailed event description | string | - -### 3.3 Usage Examples - -* Query the times, usernames, and host information of successfully executed DML operations: - -```SQL -IoTDB:__audit> select time,username,cli_hostname from audit_log where result = true and operation_type='DML' -+-----------------------------+--------+------------+ -|
time|username|cli_hostname| -+-----------------------------+--------+------------+ -|2026-01-23T11:43:46.697+08:00| root| 127.0.0.1| -|2026-01-23T11:45:39.950+08:00| root| 127.0.0.1| -+-----------------------------+--------+------------+ -Total line number = 2 -It costs 0.284s -``` - -* Query latest operation details: - -```SQL -IoTDB:__audit> select time,username,cli_hostname,operation_type,sql_string from audit_log order by time desc limit 1 -+-----------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------+ -| time|username|cli_hostname|operation_type| sql_string| -+-----------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------+ -|2026-01-23T11:46:31.026+08:00| root| 127.0.0.1| QUERY|select time,username,cli_hostname,operation_type,sql_string from audit_log order by time desc limit 1| -+-----------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------+ -Total line number = 1 -It costs 0.053s -``` - -* Query failed operations: - -```SQL -IoTDB:__audit> select time,database,operation_type,log from audit_log where result=false -+-----------------------------+--------+--------------+----------------------------------------------------------------------+ -| time|database|operation_type| log| -+-----------------------------+--------+--------------+----------------------------------------------------------------------+ -|2026-01-23T11:47:42.136+08:00| | CONTROL|User user1 (ID=-1) login failed with code: 804, Authentication failed.| -+-----------------------------+--------+--------------+----------------------------------------------------------------------+ -Total line number = 1 -It costs 0.011s -``` - - -* Query audit event records with types 'slow 
operation' - -```SQL -IoTDB:__audit> select * from audit_log where audit_event_type='SLOW_OPERATION' limit 3 -+-----------------------------+-------+-------+--------+------------+----------------+--------------+--------------+---------------+------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------+ -| time|node_id|user_id|username|cli_hostname|audit_event_type|operation_type|privilege_type|privilege_level|result| database| sql_string| log| -+-----------------------------+-------+-------+--------+------------+----------------+--------------+--------------+---------------+------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------+ -|2026-05-06T14:57:57.468+08:00| node_1| u_0| root| 127.0.0.1| SLOW_OPERATION| QUERY| [SELECT]| OBJECT| true| | show databases| SLOW_QUERY: cost 10 ms, show databases| -|2026-05-06T14:58:38.149+08:00| node_1| u_0| root| 127.0.0.1| SLOW_OPERATION| DML| [INSERT]| OBJECT| true|database1|INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2024-11-26 13:37:00', 90.0, 35.1, true, '2024-11-26 13:37:34'), ('北京', '1001', '100', 'A', '180', '2024-11-26 13:38:00', 90.0, 35.1, true, '2024-11-26 13:38:25'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:38:00', NULL, 35.1, true, '2024-11-27 16:37:01'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:39:00', 85.0, 35.3, NULL, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:40:00', 85.0, NULL, NULL, '2024-11-27 16:37:03'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:41:00', 85.0, NULL, NULL, '2024-11-27 16:37:04'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:42:00', NULL, 35.2, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:43:00', NULL, Null, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:44:00', NULL, Null, false, '2024-11-27 16:37:08'), ('上海', '3001', '100', 'C', '90', '2024-11-28 08:00:00', 85.0, Null, NULL, '2024-11-28 08:00:09'), ('上海', '3001', '100', 'C', '90', '2024-11-28 09:00:00', NULL, 40.9, true, NULL), ('上海', '3001', '100', 'C', '90', '2024-11-28 10:00:00', 85.0, 35.2, NULL, '2024-11-28 10:00:11'), ('上海', '3001', '100', 'C', '90', '2024-11-28 11:00:00', 88.0, 45.1, true, '2024-11-28 11:00:12'), ('上海', '3001', '101', 'D', '360', '2024-11-29 10:00:00', 85.0, NULL, NULL, '2024-11-29 10:00:13'), ('上海', '3002', '100', 'E', '180', '2024-11-29 11:00:00', NULL, 45.1, true, NULL), ('上海', '3002', '100', 'E', '180', '2024-11-29 18:30:00', 90.0, 35.4, true, '2024-11-29 18:30:15'), ('上海', '3002', 
'101', 'F', '360', '2024-11-30 09:30:00', 90.0, 35.2, true, NULL), ('上海', '3002', '101', 'F', '360', '2024-11-30 14:30:00', 90.0, 34.8, true, '2024-11-30 14:30:17')|Execution: INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2024-11-26 13:37:00', 90.0, 35.1, true, '2024-11-26 13:37:34'), ('北京', '1001', '100', 'A', '180', '2024-11-26 13:38:00', 90.0, 35.1, true, '2024-11-26 13:38:25'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:38:00', NULL, 35.1, true, '2024-11-27 16:37:01'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:39:00', 85.0, 35.3, NULL, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:40:00', 85.0, NULL, NULL, '2024-11-27 16:37:03'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:41:00', 85.0, NULL, NULL, '2024-11-27 16:37:04'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:42:00', NULL, 35.2, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:43:00', NULL, Null, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:44:00', NULL, Null, false, '2024-11-27 16:37:08'), ('上海', '3001', '100', 'C', '90', '2024-11-28 08:00:00', 85.0, Null, NULL, '2024-11-28 08:00:09'), ('上海', '3001', '100', 'C', '90', '2024-11-28 09:00:00', NULL, 40.9, true, NULL), ('上海', '3001', '100', 'C', '90', '2024-11-28 10:00:00', 85.0, 35.2, NULL, '2024-11-28 10:00:11'), ('上海', '3001', '100', 'C', '90', '2024-11-28 11:00:00', 88.0, 45.1, true, '2024-11-28 11:00:12'), ('上海', '3001', '101', 'D', '360', '2024-11-29 10:00:00', 85.0, NULL, NULL, '2024-11-29 10:00:13'), ('上海', '3002', '100', 'E', '180', '2024-11-29 11:00:00', NULL, 45.1, true, NULL), ('上海', '3002', '100', 'E', '180', '2024-11-29 18:30:00', 90.0, 35.4, true, '2024-11-29 18:30:15'), ('上海', '3002', '101', 'F', '360', '2024-11-30 09:30:00', 90.0, 35.2, true, NULL), ('上海', '3002', '101', 'F', '360', '2024-11-30 14:30:00', 90.0, 34.8, true, '2024-11-30 14:30:17') cost 329 ms, 
with status code: TSStatus(code:200, message:)| -|2026-05-06T14:58:45.534+08:00| node_1| u_0| root| 127.0.0.1| SLOW_OPERATION| QUERY| [SELECT]| OBJECT| true|database1| select * from table1| SLOW_QUERY: cost 121 ms, select * from table1| -+-----------------------------+-------+-------+--------+------------+----------------+--------------+--------------+---------------+------+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 3 -It costs 0.026s -``` \ No newline at end of file diff --git a/src/UserGuide/latest-Table/User-Manual/Authority-Management-Upgrade_timecho.md b/src/UserGuide/latest-Table/User-Manual/Authority-Management-Upgrade_timecho.md deleted file mode 100644 index 5056f75bc..000000000 --- a/src/UserGuide/latest-Table/User-Manual/Authority-Management-Upgrade_timecho.md +++ /dev/null @@ -1,495 +0,0 @@ - -# Authority Management - -IoTDB provides permission management capabilities to deliver fine-grained access control for data and cluster systems, ensuring data and system security. This document introduces the basic concepts of the permission module under the IoTDB Table Model, user specifications, permission governance, authentication logic, and practical application examples. - -## 1. Basic Concepts -### 1.1 User -A user is a legitimate database operator. Each user is identified by a unique username and authenticated with a password. To use the database, users must provide valid usernames and passwords stored in the system. - -### 1.2 Privilege -The database supports a variety of operations, but not all users are authorized to perform every action. A user is considered to have the corresponding privilege if permitted to execute a specific operation. - -### 1.3 Role -A role is a collection of privileges marked by a unique role identifier. Roles correspond to real-world job identities (e.g., traffic dispatchers), and one identity may cover multiple users. Users with identical job identities usually share the same set of permissions. Roles serve as an abstraction to realize unified permission management for such user groups. - -### 1.4 Default Users and Roles -After initialization, IoTDB provides a default user `root` with the default password `TimechoDB@2021`. 
As the administrator account, root owns all privileges by default. Its permissions cannot be granted or revoked, and the account cannot be deleted. There is only one administrator user in the database. -Newly created users and roles have no permissions by default. - -## 2. User Specifications -Users with the `SECURITY` privilege can create users and roles, and all creations must comply with the following constraints: - -### 2.1 Username Rules -- Length: 4 to 32 characters. Supports uppercase and lowercase letters, digits, and special symbols (`!@#$%^&*()_+-=`). Usernames identical to the administrator account are not allowed. -- If a username consists entirely of digits or contains special characters, it must be enclosed in double quotation marks `""` during creation. - -### 2.2 Password Rules -Length: 12 to 32 characters. A valid password must contain both uppercase and lowercase letters, at least one digit, and at least one special symbol (`!@#$%^&*()_+-=`). A password cannot be the same as the associated username. - -### 2.3 Role Name Rules -Length: 4 to 32 characters. Supports uppercase and lowercase letters, digits, and special symbols (`!@#$%^&*()_+-=`). Role names identical to the administrator account are prohibited. - -## 3. Permission Management -Under the IoTDB Table Model, permissions are divided into two major categories: global privileges and data privileges. - -### 3.1 Global Privileges -Global privileges include three types: `SYSTEM`, `SECURITY`, and `AUDIT`: -- **SYSTEM**: Covers privileges for O&M operations and Data Definition Language (DDL) tasks. -- **SECURITY**: Covers management of users and roles, as well as privilege assignment for other accounts. -- **AUDIT**: Covers audit rule maintenance and audit log viewing. - -Detailed descriptions of each global privilege are shown in the table below: - -
-<table>
-  <tbody>
-    <tr><th>Privilege Category</th><th>Original Name</th><th>Description</th></tr>
-    <tr><td rowspan="6">SYSTEM</td><td>N/A</td><td>Allows users to create, alter and drop databases.</td></tr>
-    <tr><td>N/A</td><td>Allows users to create, alter and drop tables and table views.</td></tr>
-    <tr><td>N/A</td><td>Allows users to create, drop and query user-defined functions.</td></tr>
-    <tr><td>N/A</td><td>Allows users to create, start, stop, drop and view PIPE tasks; create, drop and view PIPEPLUGINS.</td></tr>
-    <tr><td>N/A</td><td>Allows users to execute and cancel queries, view system variables, and check cluster status.</td></tr>
-    <tr><td>N/A</td><td>Allows users to create, drop and query deep learning models.</td></tr>
-    <tr><td rowspan="2">SECURITY</td><td>MANAGE_USER</td><td>Allows users to create and drop users, modify user passwords, view user privilege information, and list all users.</td></tr>
-    <tr><td>MANAGE_ROLE</td><td>Allows users to create and drop roles, view role privilege information, grant roles to users, revoke roles from users, and list all roles.</td></tr>
-    <tr><td>AUDIT</td><td>N/A</td><td>Allows users to maintain audit log rules and view audit logs.</td></tr>
-  </tbody>
-</table>
- -### 3.2 Data Privileges -Data privileges consist of privilege types and effective scopes. - -- **Privilege Types**: `CREATE`, `DROP`, `ALTER`, `SELECT`, `INSERT`, `DELETE`. -- **Scopes**: `ANY` (system-wide), `DATABASE` (database-level), `TABLE` (single table). - - Privileges with the `ANY` scope apply to all databases and tables. - - Database-level privileges apply to the specified database and all tables under it. - - Table-level privileges only take effect on the target single table. -- **Scope Matching Logic**: When performing single-table operations, the system verifies permissions by scope priority. For example, when writing data to `DATABASE1.TABLE1`, the system checks write permissions in sequence for `ANY`, `DATABASE1` and `DATABASE1.TABLE1` until a matching privilege is found or the check fails. - -The logical relationship between privilege types, scopes and capabilities is shown below: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
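-<table>
-  <tbody>
-    <tr><th>Privilege Type</th><th>Permission Scope (Level)</th><th>Capability Description</th></tr>
-    <tr><td rowspan="3">CREATE</td><td>ANY</td><td>Allows creating any databases and tables.</td></tr>
-    <tr><td>Database</td><td>Allows creating the specified database and creating tables under this database.</td></tr>
-    <tr><td>Table</td><td>Allows creating the specified table.</td></tr>
-    <tr><td rowspan="3">DROP</td><td>ANY</td><td>Allows dropping any databases and tables.</td></tr>
-    <tr><td>Database</td><td>Allows dropping the specified database and all tables under it.</td></tr>
-    <tr><td>Table</td><td>Allows dropping the specified table.</td></tr>
-    <tr><td rowspan="3">ALTER</td><td>ANY</td><td>Allows modifying the definitions of any databases and tables.</td></tr>
-    <tr><td>Database</td><td>Allows modifying database definitions and the structure of all tables under the database.</td></tr>
-    <tr><td>Table</td><td>Allows modifying the table structure and definition.</td></tr>
-    <tr><td rowspan="3">SELECT</td><td>ANY</td><td>Allows querying data from any table across all databases.</td></tr>
-    <tr><td>Database</td><td>Allows querying data from all tables in the specified database.</td></tr>
-    <tr><td>Table</td><td>Allows querying data in the specified table. For multi-table queries, only accessible data with valid permissions will be displayed.</td></tr>
-    <tr><td rowspan="3">INSERT</td><td>ANY</td><td>Allows inserting and updating data in any tables of all databases.</td></tr>
-    <tr><td>Database</td><td>Allows inserting and updating data in all tables under the specified database.</td></tr>
-    <tr><td>Table</td><td>Allows inserting and updating data in the specified table.</td></tr>
-    <tr><td rowspan="3">DELETE</td><td>ANY</td><td>Allows deleting data from any tables.</td></tr>
-    <tr><td>Database</td><td>Allows deleting data within the specified database.</td></tr>
-    <tr><td>Table</td><td>Allows deleting data in the specified table.</td></tr>
-  </tbody>
-</table>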
- -### 3.3 Privilege Granting and Revocation -IoTDB supports privilege assignment and revocation through three methods: -- Direct granting or revocation by the super administrator. -- Granting or revocation by users with the `GRANT OPTION` privilege. -- Granting or revocation via role configuration, operated by the super administrator or users with `SECURITY` privileges. - -The following rules apply to permission management in the IoTDB Table Model: -- No scope needs to be specified when granting or revoking global privileges. -- Data privilege operations require explicit privilege types and scopes. Revocation only takes effect on the designated scope and is not affected by hierarchical inclusion relationships. -- Pre-authorization is supported for databases and tables that have not been created yet. -- Repeated granting or revocation of privileges is permitted. -- **WITH GRANT OPTION**: Authorizes users to manage permissions within their granted scope, including granting and revoking privileges for other users. - -## 4. Syntax and Usage Examples -### 4.1 User and Role Management -1. **Create User** (Requires `SECURITY` privilege) -```SQL -CREATE USER --- Example -CREATE USER user1 'Passwd@202604' -``` - -2. **Modify Password** - Users can change their own passwords; modifying other users' passwords requires the `SECURITY` privilege. -```SQL -ALTER USER SET PASSWORD --- Example -ALTER USER tempuser SET PASSWORD 'Newpwd@202604' -``` - -3. **Drop User** (Requires `SECURITY` privilege) -```SQL -DROP USER --- Example -DROP USER user1 -``` - -4. **Create Role** (Requires `SECURITY` privilege) -```SQL -CREATE ROLE --- Example -CREATE ROLE role1 -``` - -5. **Drop Role** (Requires `SECURITY` privilege) -```SQL -DROP ROLE --- Example -DROP ROLE role1 -``` - -6. **Grant Role to User** (Requires `SECURITY` privilege) -```SQL -GRANT ROLE TO --- Example -GRANT ROLE admin TO user1 -``` - -7. 
**Revoke Role from User** (Requires `SECURITY` privilege) -```SQL -REVOKE ROLE FROM --- Example -REVOKE ROLE admin FROM user1 -``` - -8. **List All Users** (Requires `SECURITY` privilege) -```SQL -LIST USER -``` - -9. **List All Roles** (Requires `SECURITY` privilege) -```SQL -LIST ROLE -``` - -10. **List Users Under a Specified Role** (Requires `SECURITY` privilege) -```SQL -LIST USER OF ROLE --- Example -LIST USER OF ROLE roleuser -``` - -11. **List Roles of a Specified User** - Users can view their own roles; viewing roles of other users requires the `SECURITY` privilege. -```SQL -LIST ROLE OF USER --- Example -LIST ROLE OF USER tempuser -``` - -12. **List All Privileges of a User** - Users can view their own privileges; viewing privileges of other users requires the `SECURITY` privilege. -```SQL -LIST PRIVILEGES OF USER --- Example -LIST PRIVILEGES OF USER tempuser -``` - -13. **List All Privileges of a Role** - Users can view privileges of their bound roles; viewing privileges of other roles requires the `SECURITY` privilege. -```SQL -LIST PRIVILEGES OF ROLE --- Example -LIST PRIVILEGES OF ROLE actor -``` - -### 4.2 Granting and Revoking Privileges -#### 4.2.1 Grant Privileges -1. Grant user management privileges to a user -```SQL -GRANT SECURITY TO USER --- Example -GRANT SECURITY TO USER TEST_USER -``` - -2. Grant database and table creation privileges with independent permission management rights -```SQL -GRANT CREATE ON DATABASE TO USER WITH GRANT OPTION --- Example -GRANT CREATE ON DATABASE TESTDB TO USER TEST_USER WITH GRANT OPTION -``` - -3. Grant database query privileges to a role -```SQL -GRANT SELECT ON DATABASE TO ROLE --- Example -GRANT SELECT ON DATABASE TESTDB TO ROLE TEST_ROLE -``` - -4. Grant table query privileges to a user -```SQL -GRANT SELECT ON . TO USER --- Example -GRANT SELECT ON TESTDB.TESTTABLE TO USER TEST_USER -``` - -5. 
Grant global cross-database query privileges to a role -```SQL -GRANT SELECT ON ANY TO ROLE --- Example -GRANT SELECT ON ANY TO ROLE TEST_ROLE -``` - -6. **ALL Syntax Sugar**: `ALL` represents all available privileges within the target scope -```SQL --- Grant all global privileges and full data privileges under the ANY scope -GRANT ALL TO USER TESTUSER - --- Grant all data privileges covering the entire system scope -GRANT ALL ON ANY TO USER TESTUSER - --- Grant all data privileges for the specified database -GRANT ALL ON DATABASE TESTDB TO USER TESTUSER - --- Grant all data privileges for the specified single table -GRANT ALL ON TABLE TESTTABLE TO USER TESTUSER -``` - -#### 4.2.2 Revoke Privileges -1. Revoke user management privileges -```SQL -REVOKE SECURITY FROM USER --- Example -REVOKE SECURITY FROM USER TEST_USER -``` - -2. Revoke database and table creation privileges -```SQL -REVOKE CREATE ON DATABASE FROM USER --- Example -REVOKE CREATE ON DATABASE TEST_DB FROM USER TEST_USER -``` - -3. Revoke table query privileges -```SQL -REVOKE SELECT ON . FROM USER --- Example -REVOKE SELECT ON TESTDB.TESTTABLE FROM USER TEST_USER -``` - -4. Revoke global cross-database query privileges -```SQL -REVOKE SELECT ON ANY FROM USER --- Example -REVOKE SELECT ON ANY FROM USER TEST_USER -``` - -5. **ALL Syntax Sugar** for privilege revocation -```SQL --- Revoke all global privileges and ANY-scoped data privileges -REVOKE ALL FROM USER TESTUSER - --- Only revoke data privileges under the ANY scope -REVOKE ALL ON ANY FROM USER TESTUSER - --- Only revoke all data privileges of the specified database -REVOKE ALL ON DATABASE TESTDB FROM USER TESTUSER - --- Only revoke all data privileges of the specified table -REVOKE ALL ON TABLE TESTDB FROM USER TESTUSER -``` - -### 4.3 View User Privileges -Each user has an independent privilege list that records all authorized permissions. -Use `LIST PRIVILEGES OF USER ` to query detailed privileges of a user or role. 
The output format is as follows: - -| ROLE | SCOPE | PRIVILEGE | WITH GRANT OPTION | -|-------|---------|-----------|------------------| -| | DB1.TB1 | SELECT | FALSE | -| | | SECURITY | TRUE | -| ROLE1 | DB2.TB2 | UPDATE | TRUE | -| ROLE1 | DB3.* | DELETE | FALSE | -| ROLE1 | *.* | UPDATE | TRUE | - -- **ROLE**: Blank for self-owned user privileges; displays the role name if the permission is inherited from a role. -- **SCOPE**: Table-level scope displayed as `DB.TABLE`, database-level as `DB.*`, and global ANY scope as `*.*`. -- **PRIVILEGE**: Lists specific authorized permission types. -- **WITH GRANT OPTION**: `TRUE` means the user can grant or revoke permissions within the corresponding scope. -- Users and roles can hold permissions for both the Tree Model and Table Model simultaneously. The system only displays permissions applicable to the currently connected model, while permissions for the other model will be hidden. - -## 5. Practical Scenario Example -Based on the [Sample Data](../Reference/Sample-Data.md), data in different tables belongs to two independent data centers (bj and sh). To prevent unauthorized cross-center data access, permission isolation needs to be configured at the data center level. - -### 5.1 Create Users -Use `CREATE USER` to create new users. For example, the root administrator creates two dedicated write users for the BJ and SH data centers: `bj_write_user` and `sh_write_user`, with a unified password `write_Pwd@2026`. 
- -```SQL -CREATE USER bj_write_user 'write_Pwd@2026'; -CREATE USER sh_write_user 'write_Pwd@2026'; -``` - -Execute the user query statement: -```SQL -LIST USER -``` - -Query result: -``` -+------+-------------+-----------------+-----------------+ -|UserId| User|MaxSessionPerUser|MinSessionPerUser| -+------+-------------+-----------------+-----------------+ -| 0| root| -1| 1| -| 10000|bj_write_user| -1| -1| -| 10001|sh_write_user| -1| -1| -+------+-------------+-----------------+-----------------+ -``` - -### 5.2 Grant Privileges -Newly created users have no permissions by default and cannot perform database operations. For example, an insertion executed by `bj_write_user` will fail: -```SQL -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('Beijing', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -``` - -Error prompt: -``` -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: database is not specified -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: DATABASE database1 -``` - -Grant table write privileges to `bj_write_user` via the root account: -```SQL -GRANT INSERT ON database1.table1 TO USER bj_write_user -``` - -Retry data insertion after switching to the target database: -```SQL -IoTDB> use database1 -Msg: The statement is executed successfully. -IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('Beijing', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: The statement is executed successfully. 
-``` - -### 5.3 Revoke Privileges -Use the `REVOKE` statement to reclaim granted permissions: -```SQL -REVOKE INSERT ON database1.table1 FROM USER bj_write_user -REVOKE INSERT ON database1.table2 FROM USER sh_write_user -``` - -After revocation, `bj_write_user` no longer has write access to table1: -``` -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: No permissions for this operation, please add privilege INSERT ON database1.table1 -``` - \ No newline at end of file diff --git a/src/UserGuide/latest-Table/User-Manual/Authority-Management_timecho.md b/src/UserGuide/latest-Table/User-Manual/Authority-Management_timecho.md deleted file mode 100644 index 3ac042003..000000000 --- a/src/UserGuide/latest-Table/User-Manual/Authority-Management_timecho.md +++ /dev/null @@ -1,494 +0,0 @@ - - -# Authority Management - -IoTDB provides permission management functionality to implement fine-grained access control for data and cluster systems, ensuring data and system security. This document introduces the basic concepts, user definitions, permission management, authentication logic, and functional use cases of the permission module in IoTDB's table model. - -## 1. Basic Concepts - -### 1.1 User - -A **user** is a legitimate database user. Each user is associated with a unique username and authenticated via a password. Before accessing the database, a user must provide valid credentials (a username and password that exist in the database). - -### 1.2 Permission - -A database supports multiple operations, but not all users can perform every operation. If a user is authorized to execute a specific operation, they are said to have the **permission** for that operation. - -### 1.3 Role - -A **role** is a collection of permissions, identified by a unique role name. Roles typically correspond to real-world identities (e.g., "traffic dispatcher"), where a single identity may encompass multiple users. 
Users sharing the same real-world identity often require the same set of permissions, and roles abstract this grouping for unified management.
-
-### 1.4 Default User and Role
-
-Upon initialization, IoTDB includes a default user:
-
-* **Username**: `root`
-* **Default password**: `TimechoDB@2021` (before V2.0.6, the default password was `root`)
-
-The `root` user is the **administrator** and inherently holds all permissions. Permissions cannot be granted to or revoked from this user, and the user cannot be deleted. The database has exactly one administrator user. Newly created users and roles start with **no permissions** by default.
-
-## 2. Permission List
-
-In IoTDB's table model, there are two main types of permissions: Global Permissions and Data Permissions.
-
-### 2.1 Global Permissions
-
-Global permissions include user management and role management.
-
-The following table describes the types of global permissions:
-
-| Permission Name | Description |
-| --------------- | ----------- |
-| MANAGE\_USER | - Create users<br>- Delete users<br>- Modify user passwords<br>- View user permission details<br>- List all users |
-| MANAGE\_ROLE | - Create roles<br>- Delete roles<br>- View role permission details<br>- Grant/revoke roles to/from users<br>- List all roles |
-
-### 2.2 Data Permissions
-
-Data permissions consist of permission types and permission scopes.
-
-* Permission Types:
-  * CREATE: Permission to create resources
-  * DROP: Permission to delete resources
-  * ALTER: Permission to modify definitions
-  * SELECT: Permission to query data
-  * INSERT: Permission to insert/update data
-  * DELETE: Permission to delete data
-* Permission Scopes:
-  * ANY: System-wide (affects all databases and tables)
-  * DATABASE: Database-wide (affects the specified database and its tables)
-  * TABLE: Table-specific (affects only the specified table)
-* Scope Enforcement Logic:
-
-When performing table-level operations, the system matches user permissions against data permission scopes hierarchically. Example: If a user attempts to write data to `DATABASE1.TABLE1`, the system checks for write permissions in this order: 1. `ANY` scope → 2. `DATABASE1` scope → 3. `DATABASE1.TABLE1` scope. The check stops at the first successful match, or fails if no matching permission is found.
-
-* Permission Type-Scope-Effect Matrix
-
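-<table>
-  <tbody>
-    <tr><th>Permission Type</th><th>Scope (Hierarchy)</th><th>Effect</th></tr>
-    <tr><td rowspan="3">CREATE</td><td>ANY</td><td>Create any table/database</td></tr>
-    <tr><td>DATABASE</td><td>Create tables in the specified database; create a database with the specified name</td></tr>
-    <tr><td>TABLE</td><td>Create a table with the specified name</td></tr>
-    <tr><td rowspan="3">DROP</td><td>ANY</td><td>Delete any table/database</td></tr>
-    <tr><td>DATABASE</td><td>Delete the specified database or its tables</td></tr>
-    <tr><td>TABLE</td><td>Delete the specified table</td></tr>
-    <tr><td rowspan="3">ALTER</td><td>ANY</td><td>Modify definitions of any table/database</td></tr>
-    <tr><td>DATABASE</td><td>Modify definitions of the specified database or its tables</td></tr>
-    <tr><td>TABLE</td><td>Modify the definition of the specified table</td></tr>
-    <tr><td rowspan="3">SELECT</td><td>ANY</td><td>Query data from any table in any database</td></tr>
-    <tr><td>DATABASE</td><td>Query data from any table in the specified database</td></tr>
-    <tr><td>TABLE</td><td>Query data from the specified table</td></tr>
-    <tr><td rowspan="3">INSERT</td><td>ANY</td><td>Insert/update data in any table</td></tr>
-    <tr><td>DATABASE</td><td>Insert/update data in any table within the specified database</td></tr>
-    <tr><td>TABLE</td><td>Insert/update data in the specified table</td></tr>
-    <tr><td rowspan="3">DELETE</td><td>ANY</td><td>Delete data from any table</td></tr>
-    <tr><td>DATABASE</td><td>Delete data from tables within the specified database</td></tr>
-    <tr><td>TABLE</td><td>Delete data from the specified table</td></tr>
-  </tbody>
-</table>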
- -## 3. User and Role Management - -1. Create User (Requires `MANAGE_USER` Permission) - -```SQL -CREATE USER -eg: CREATE USER user1 'passwd' -``` - -Constraints: - -* Username: 4-32 characters (letters, numbers, special chars: `!@#$%^&*()_+-=`). Cannot duplicate the admin (`root`) username. - - If the username consists entirely of numbers or contains special characters, you need to enclose it in double quotes `""` when creating it. -* Password: 4-32 characters (letters, numbers, special chars). Stored as SHA-256 hash by default. - -2. Modify Password - -Users can modify their own passwords. Modifying others' passwords requires `MANAGE_USER`. - -```SQL -ALTER USER SET PASSWORD -eg: ALTER USER tempuser SET PASSWORD 'newpwd' -``` - -3. Delete User (Requires `MANAGE_USER`) - -```SQL -DROP USER -eg: DROP USER user1 -``` - -4. Create Role (Requires `MANAGE_ROLE`) - -```SQL -CREATE ROLE -eg: CREATE ROLE role1 -``` - -Constraints: - -* Role Name: 4-32 characters (letters, numbers, special chars). Cannot duplicate the admin role name. - -5. Delete Role (Requires `MANAGE_ROLE`) - -```SQL -DROP ROLE -eg: DROP ROLE role1 -``` - -6. Assign Role to User (Requires `MANAGE_ROLE`) - -```SQL -GRANT ROLE TO -eg: GRANT ROLE admin TO user1 -``` - -7. Revoke Role from User (Requires `MANAGE_ROLE`) - -```SQL -REVOKE ROLE FROM -eg: REVOKE ROLE admin FROM user1 -``` - -8. List All Users (Requires `MANAGE_USER`) - -```SQL -LIST USER -``` - -9. List All Roles (Requires `MANAGE_ROLE`) - -```SQL -LIST ROLE -``` - -10. List Users in a Role (Requires `MANAGE_USER`) - -```SQL -LIST USER OF ROLE -eg: LIST USER OF ROLE roleuser -``` - -11. List Roles of a User - -* Users can list their own permissions. -* Listing others' permissions requires `MANAGE_USER`. - -```SQL -LIST ROLE OF USER -eg: LIST ROLE OF USER tempuser -``` - -12. List User Permissions - -* Users can list their own permissions. -* Listing others' permissions requires `MANAGE_USER`. 
- -```SQL -LIST PRIVILEGES OF USER -eg: LIST PRIVILEGES OF USER tempuser -``` - -13. List Role Permissions - -* Users can list permissions of roles they have. -* Listing other roles' permissions requires `MANAGE_ROLE`. - -```SQL -LIST PRIVILEGES OF ROLE -eg: LIST PRIVILEGES OF ROLE actor -``` - -## 4. Permission Management - -IoTDB supports granting and revoking permissions through the following three methods: - -* Direct assignment/revocation by a super administrator -* Assignment/revocation by users with the `GRANT OPTION` privilege -* Assignment/revocation via roles (managed by super administrators or users with `MANAGE_ROLE` permissions) - -In the IoTDB Table Model, the following principles apply when granting or revoking permissions: - -* **Global permissions** can be granted/revoked without specifying a scope. -* **Data permissions** require specifying both the permission type and permission scope. When revoking, only the explicitly defined scope is affected, regardless of hierarchical inclusion relationships. -* Preemptive permission planning is allowed—permissions can be granted for databases or tables that do not yet exist. -* Repeated granting/revoking of permissions is permitted. -* `WITH GRANT OPTION`: Allows users to manage permissions within the granted scope. Users with this option can grant or revoke permissions for other users in the same scope. - -### 4.1 Granting Permissions - -1. Grant a user the permission to manage users - -```SQL -GRANT MANAGE_USER TO USER -eg: GRANT MANAGE_USER TO USER TEST_USER -``` - -2. Grant a user the permission to create databases and tables within the database, and allow them to manage permissions in that scope - -```SQL -GRANT CREATE ON DATABASE TO USER WITH GRANT OPTION -eg: GRANT CREATE ON DATABASE TESTDB TO USER TEST_USER WITH GRANT OPTION -``` - -3. Grant a role the permission to query a database - -```SQL -GRANT SELECT ON DATABASE TO ROLE -eg: GRANT SELECT ON DATABASE TESTDB TO ROLE TEST_ROLE -``` - -4. 
Grant a user the permission to query a table - -```SQL -GRANT SELECT ON . TO USER -eg: GRANT SELECT ON TESTDB.TESTTABLE TO USER TEST_USER -``` - -5. Grant a role the permission to query all databases and tables - -```SQL -GRANT SELECT ON ANY TO ROLE -eg: GRANT SELECT ON ANY TO ROLE TEST_ROLE -``` - -6. ALL Syntax Sugar: ALL represents all permissions within a given scope, allowing flexible permission granting. - -```sql -GRANT ALL TO USER TESTUSER --- Grants all possible permissions to the user, including global permissions and all data permissions under ANY scope. - -GRANT ALL ON ANY TO USER TESTUSER --- Grants all data permissions under the ANY scope. After execution, the user will have all data permissions across all databases. - -GRANT ALL ON DATABASE TESTDB TO USER TESTUSER --- Grants all data permissions within the specified database. After execution, the user will have all data permissions on that database. - -GRANT ALL ON TABLE TESTTABLE TO USER TESTUSER --- Grants all data permissions on the specified table. After execution, the user will have all data permissions on that table. -``` - -### 4.2 Revoking Permissions - -1. Revoke a user's permission to manage users - -```SQL -REVOKE MANAGE_USER FROM USER -eg: REVOKE MANAGE_USER FROM USER TEST_USER -``` - -2. Revoke a user's permission to create databases and tables within the database - -```SQL -REVOKE CREATE ON DATABASE FROM USER -eg: REVOKE CREATE ON DATABASE TEST_DB FROM USER TEST_USER -``` - -3. Revoke a user's permission to query a table - -```SQL -REVOKE SELECT ON . FROM USER -eg: REVOKE SELECT ON TESTDB.TESTTABLE FROM USER TEST_USER -``` - -4. Revoke a user's permission to query all databases and tables - -```SQL -REVOKE SELECT ON ANY FROM USER -eg: REVOKE SELECT ON ANY FROM USER TEST_USER -``` - -5. ALL Syntax Sugar: ALL represents all permissions within a given scope, allowing flexible permission revocation. 
- -```sql -REVOKE ALL FROM USER TESTUSER --- Revokes all global permissions and all data permissions under ANY scope. - -REVOKE ALL ON ANY FROM USER TESTUSER --- Revokes all data permissions under the ANY scope, without affecting DB or TABLE-level permissions. - -REVOKE ALL ON DATABASE TESTDB FROM USER TESTUSER --- Revokes all data permissions on the specified database, without affecting TABLE-level permissions. - -REVOKE ALL ON TABLE TESTTABLE FROM USER TESTUSER --- Revokes all data permissions on the specified table. -``` - -### 4.3 Viewing User Permissions - -Each user has an access control list that identifies all the permissions they have been granted. You can use the `LIST PRIVILEGES OF USER` statement to view the permissions of a specific user, or `LIST PRIVILEGES OF ROLE` for a specific role. The output format is as follows: - -| ROLE | SCOPE | PRIVILEGE | WITH GRANT OPTION | -|--------------|---------| -------------- |-------------------| -| | DB1.TB1 | SELECT | FALSE | -| | | MANAGE\_ROLE | TRUE | -| ROLE1 | DB2.TB2 | UPDATE | TRUE | -| ROLE1 | DB3.\* | DELETE | FALSE | -| ROLE1 | \*.\* | UPDATE | TRUE | - -* **ROLE column**: If empty, it indicates the user's own permissions. If not empty, it means the permission is derived from a granted role. -* **SCOPE column**: Represents the permission scope of the user/role. Table-level permissions are denoted as `DB.TABLE`, database-level permissions as `DB.*`, and ANY-level permissions as `*.*`. -* **PRIVILEGE column**: Lists the specific permission types. -* **WITH GRANT OPTION column**: If `TRUE`, it means the user can grant their own permissions to others. -* A user or role can have permissions in both the tree model and the table model, but the system will only display the permissions relevant to the currently connected model. Permissions under the other model will not be shown. - -## 5.
Example - -Using the content from the [Sample Data](../Reference/Sample-Data.md) as an example, the data in the two tables may belong to the **bj** and **sh** data centers, respectively. To prevent each center from accessing the other's database data, we need to implement permission isolation at the data center level. - -### 5.1 Creating Users - -Use the `CREATE USER` statement to create users. For example, the **root** user with all permissions can create two users for the **bj** and **sh** data centers, named **bj\_write\_user** and **sh\_write\_user**, both with the password **write\_pwd**. The SQL statements are: - -```SQL -CREATE USER bj_write_user 'write_pwd' -CREATE USER sh_write_user 'write_pwd' -``` - -To display the users, use the following SQL statement: - -```Plain -LIST USER -``` - -The result will show the two newly created users, as follows: - -```sql -+-------------+ -| User| -+-------------+ -|bj_write_user| -| root| -|sh_write_user| -+-------------+ -``` - -### 5.2 Granting User Permissions - -Although the two users have been created, they do not yet have any permissions and thus cannot perform database operations.
For example, if the **bj\_write\_user** attempts to write data to ​**table1**​, the SQL statement would be: - -```sql -IoTDB> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -``` - -The system will deny the operation and display an error: - -```sql -IoTDB> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: database is not specified -IoTDB> use database1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: DATABASE database1 -``` - -The **root** user can grant **bj\_write\_user** write permissions for **table1** using the `GRANT ON TO USER ` statement, for example: - -```sql -GRANT INSERT ON database1.table1 TO USER bj_write_user -``` - -After granting permissions, **bj\_write\_user** can successfully write data: - -```SQL -IoTDB> use database1 -Msg: The statement is executed successfully. -IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: The statement is executed successfully. -``` - -### 5.3 Revoking User Permissions - -After granting permissions, the **root** user can revoke them using the `REVOKE ON FROM USER ` statement. 
For example: - -```sql -REVOKE INSERT ON database1.table1 FROM USER bj_write_user -REVOKE INSERT ON database1.table2 FROM USER sh_write_user -``` - -Once permissions are revoked, **bj\_write\_user** will no longer have write access to ​**table1**​: - -```sql -IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: No permissions for this operation, please add privilege INSERT ON database1.table1 -``` diff --git a/src/UserGuide/latest-Table/User-Manual/Auto-Start-On-Boot_timecho.md b/src/UserGuide/latest-Table/User-Manual/Auto-Start-On-Boot_timecho.md deleted file mode 100644 index 1fbcd4c6f..000000000 --- a/src/UserGuide/latest-Table/User-Manual/Auto-Start-On-Boot_timecho.md +++ /dev/null @@ -1,213 +0,0 @@ - - -# Auto-start on Boot -## 1. Overview -TimechoDB supports registering ConfigNode, DataNode, and AINode as Linux system services via the three scripts `daemon-confignode.sh`, `daemon-datanode.sh`, and `daemon-ainode.sh`. Combined with the system-built `systemctl` command, it manages the TimechoDB cluster in daemon mode, enabling more convenient startup, shutdown, restart, and auto-start on boot operations, and improving service stability. - -> Note: This feature is available starting from version 2.0.9.1. - -## 2. Environment Requirements -| Item | Specification | -|--------------|-------------------------------------------------------------------------------| -| OS | Linux (supports the `systemctl` command) | -| User Privilege | root user | -| Environment Variable | `JAVA_HOME` must be set before deploying ConfigNode and DataNode | - -## 3. 
Service Registration -Enter the TimechoDB installation directory and execute the corresponding daemon script: - -```bash -# Register ConfigNode service -./tools/ops/daemon-confignode.sh - -# Register DataNode service -./tools/ops/daemon-datanode.sh - -# Register AINode service -./tools/ops/daemon-ainode.sh -``` - -During script execution, you will be prompted with two options: -1. Whether to start the corresponding TimechoDB service immediately (timechodb-confignode / timechodb-datanode / timechodb-ainode); -2. Whether to register the corresponding service for auto-start on boot. - -After script execution, the corresponding service files will be generated in the `/etc/systemd/system/` directory: -- `timechodb-confignode.service` -- `timechodb-datanode.service` -- `timechodb-ainode.service` - -## 4. Service Management -After service registration, you can use `systemctl` commands to start, stop, restart, check status, and configure auto-start on boot for each TimechoDB node service. All commands below must be executed as the root user. - -### 4.1 Manual Service Startup -```bash -# Start ConfigNode service -systemctl start timechodb-confignode -# Start DataNode service -systemctl start timechodb-datanode -# Start AINode service -systemctl start timechodb-ainode -``` - -### 4.2 Manual Service Shutdown -```bash -# Stop ConfigNode service -systemctl stop timechodb-confignode -# Stop DataNode service -systemctl stop timechodb-datanode -# Stop AINode service -systemctl stop timechodb-ainode -``` - -After stopping the service, check the service status. If it shows `inactive (dead)`, the service has been shut down successfully. For other statuses, check TimechoDB logs to analyze exceptions. 
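
The stop-then-check step described above can be scripted. The following is a minimal sketch, not part of the TimechoDB tooling: the function name `stop_and_verify` and the overridable `SYSTEMCTL` variable are assumptions introduced here for illustration. It relies on `systemctl is-active`, which prints the unit state (`active`, `inactive`, `failed`, and so on).

```bash
#!/usr/bin/env bash
# Hypothetical helper: stop a TimechoDB service and confirm systemd reports it as inactive.
# SYSTEMCTL is an assumption of this sketch so the command can be swapped for a dry run.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

stop_and_verify() {
  local unit="$1" state
  "$SYSTEMCTL" stop "$unit" || return 1
  # is-active exits non-zero for anything but "active", so ignore its exit code
  # and look only at the state it prints.
  state="$("$SYSTEMCTL" is-active "$unit" 2>/dev/null || true)"
  if [ "$state" = "inactive" ]; then
    echo "$unit stopped cleanly"
  else
    echo "$unit is in state '$state'; check the TimechoDB logs" >&2
    return 1
  fi
}

# Usage (as root):
#   stop_and_verify timechodb-datanode
```

Any state other than `inactive` is treated as a reason to inspect the logs, which mirrors the guidance in this section.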
- -### 4.3 Check Service Status -```bash -# Check ConfigNode service status -systemctl status timechodb-confignode -# Check DataNode service status -systemctl status timechodb-datanode -# Check AINode service status -systemctl status timechodb-ainode -``` - -Status Description: -- `active (running)`: Service is running. If this status persists for 10 minutes, the service has started successfully. -- `failed`: Service startup failed. Check TimechoDB logs for troubleshooting. - -### 4.4 Restart Service -Restarting a service is equivalent to stopping and then starting it. Commands are as follows: -```bash -# Restart ConfigNode service -systemctl restart timechodb-confignode -# Restart DataNode service -systemctl restart timechodb-datanode -# Restart AINode service -systemctl restart timechodb-ainode -``` - -### 4.5 Enable Auto-start on Boot -```bash -# Enable ConfigNode auto-start on boot -systemctl enable timechodb-confignode -# Enable DataNode auto-start on boot -systemctl enable timechodb-datanode -# Enable AINode auto-start on boot -systemctl enable timechodb-ainode -``` - -### 4.6 Disable Auto-start on Boot -```bash -# Disable ConfigNode auto-start on boot -systemctl disable timechodb-confignode -# Disable DataNode auto-start on boot -systemctl disable timechodb-datanode -# Disable AINode auto-start on boot -systemctl disable timechodb-ainode -``` - -## 5. Custom Service Configuration -### 5.1 Customization Methods -#### 5.1.1 Method 1: Modify the Script -1. Modify the `[Unit]`, `[Service]`, and `[Install]` sections in the `daemon-xxx.sh` script. For details of configuration items, refer to the next section. -2. Execute the `daemon-xxx.sh` script. - -#### 5.1.2 Method 2: Modify the Service File -1. Modify the `xx.service` file in `/etc/systemd/system`. -2. Execute `systemctl daemon-reload`. 
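
As an illustration of Method 2, the sketch below rewrites one existing `Key=Value` line in a generated service file and then reloads systemd. The helper name `set_unit_option` is hypothetical, and the sketch assumes GNU `sed` (for `-i`) and values that contain neither `|` nor `&`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: replace the value of an existing "Key=Value" line in a unit file.
# Assumes GNU sed and a value without "|" or "&" characters.
set_unit_option() {
  local file="$1" key="$2" value="$3"
  # Only lines that start with "<key>=" are rewritten; everything else is left untouched.
  sed -i "s|^${key}=.*|${key}=${value}|" "$file"
}

# Usage (as root), followed by the reload that Method 2 requires:
#   set_unit_option /etc/systemd/system/timechodb-datanode.service RestartSec 10
#   systemctl daemon-reload
```

Editing the service file directly affects only that one node service; changes made in the `daemon-xxx.sh` script are applied the next time the script is executed.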
- -### 5.2 `daemon-xxx.sh` Configuration Items -#### 5.2.1 `[Unit]` Section (Service Metadata) -| Item | Description | -|---------------|-----------------------------------------------------------------------------| -| Description | Service description | -| Documentation | Link to the official TimechoDB documentation | -| After | Ensures the service starts only after the network service has started | - -#### 5.2.2 `[Service]` Section (Service Runtime Configuration) -| Item | Meaning | -|-------------------------------------------|-----------------------------------------------------------------------------------------------------------| -| StandardOutput, StandardError | Specify storage paths for service standard output and error logs | -| LimitNOFILE=65536 | Set the maximum number of file descriptors, default value is 65536 | -| Type=simple | Service type is a simple foreground process; systemd tracks the main service process | -| User=root, Group=root | Run the service with root user and group permissions | -| ExecStart / ExecStop | Specify the paths of the service startup and shutdown scripts respectively | -| Restart=on-failure | Automatically restart the service only if it exits abnormally | -| SuccessExitStatus=143 | Treat exit code 143 (128+15, normal termination via SIGTERM) as a successful exit | -| RestartSec=5 | Interval between service restarts, default 5 seconds | -| StartLimitInterval=600s, StartLimitBurst=3 | Maximum 3 restarts within 10 minutes (600 seconds) to prevent excessive resource consumption from frequent restarts | -| RestartPreventExitStatus=SIGKILL | Do not auto-restart the service if killed by the SIGKILL signal, avoiding infinite restart of zombie processes | - -#### 5.2.3 `[Install]` Section (Installation Configuration) -| Item | Meaning | -|-----------------------|----------------------------------------------------------------------| -| WantedBy=multi-user.target | Start the service automatically when the system enters multi-user 
mode | - -### 5.3 Sample `.service` File Format -```bash -[Unit] -Description=timechodb-confignode -Documentation=https://www.timecho.com/ -After=network.target - -[Service] -StandardOutput=null -StandardError=null -LimitNOFILE=65536 -Type=simple -User=root -Group=root -Environment=JAVA_HOME=$JAVA_HOME -ExecStart=$TimechoDB_SBIN_HOME/start-confignode.sh -Restart=on-failure -SuccessExitStatus=143 -RestartSec=5 -StartLimitInterval=600s -StartLimitBurst=3 -RestartPreventExitStatus=SIGKILL - -[Install] -WantedBy=multi-user.target -``` - -Note: The above is the standard format of the `timechodb-confignode.service` file. The formats of `timechodb-datanode.service` and `timechodb-ainode.service` are similar. - -## 6. Notes -1. **Process Daemon Mechanism** - - **Auto-restart**: The system will auto-restart the service if it fails to start or exits abnormally during runtime (e.g., OOM). - - **No restart**: Normal exits (e.g., executing `kill`, `./sbin/stop-xxx.sh`, or `systemctl stop`) will not trigger auto-restart. - -2. **Log Location** - - All runtime logs are stored in the `logs` folder under the TimechoDB installation directory. Refer to this directory for troubleshooting. - -3. **Cluster Status Check** - - After service startup, execute `./sbin/start-cli.sh` and run the `show cluster` command to view the cluster status. - -4. **Fault Recovery Procedure** - - If the service status is `failed`, after fixing the issue, **you must first execute `systemctl daemon-reload`** before running `systemctl start`, otherwise startup will fail. - -5. **Configuration Activation** - - After modifying the `daemon-xxx.sh` script, execute `systemctl daemon-reload` to re-register the service for new configurations to take effect. - -6. **Startup Mode Compatibility** - - Services started via `systemctl start` can be stopped using `./sbin/stop` (no restart triggered). - - Processes started via `./sbin/start` cannot be monitored via `systemctl`. 
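
The fault recovery procedure in note 4 depends on ordering: reload first, then start. A minimal sketch of that sequence follows; `recover_failed_service` and the overridable `SYSTEMCTL` variable are names assumed here for illustration, and the root cause is expected to have been fixed before the function is called.

```bash
#!/usr/bin/env bash
# Hypothetical helper: bring a unit back after it entered the "failed" state.
# SYSTEMCTL is an assumption of this sketch so the command can be swapped for a dry run.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

recover_failed_service() {
  local unit="$1" state
  state="$("$SYSTEMCTL" is-active "$unit" 2>/dev/null || true)"
  if [ "$state" != "failed" ]; then
    echo "$unit is '$state'; nothing to recover"
    return 0
  fi
  # Order matters: daemon-reload must run before start.
  "$SYSTEMCTL" daemon-reload && "$SYSTEMCTL" start "$unit"
}

# Usage (as root, after fixing the underlying issue):
#   recover_failed_service timechodb-confignode
```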
\ No newline at end of file diff --git a/src/UserGuide/latest-Table/User-Manual/Black-White-List_timecho.md b/src/UserGuide/latest-Table/User-Manual/Black-White-List_timecho.md deleted file mode 100644 index 3aa3cb94a..000000000 --- a/src/UserGuide/latest-Table/User-Manual/Black-White-List_timecho.md +++ /dev/null @@ -1,78 +0,0 @@ - - -# Black White List - -## 1. Introduction - -IoTDB is a time-series database designed for IoT scenarios, supporting efficient data storage, query, and analysis. With the widespread application of IoT technology, data security and access control have become critical. In open environments, ensuring secure data access for legitimate users presents a key challenge. The whitelist mechanism allows only trusted IPs or users to connect, reducing the attack surface at the source. The blacklist function can block malicious IPs in real time in edge-cloud collaborative scenarios, preventing unauthorized access, SQL injection, brute-force attacks, DDoS, and other threats, thereby providing continuous and stable security for data transmission. - -> Note: This feature is available starting from version 2.0.6. - -## 2. Whitelist - -### 2.1 Function Description - -By enabling the whitelist function and configuring the whitelist, client addresses allowed to connect to IoTDB are specified. Only clients within the whitelist can access IoTDB, achieving security control. - -### 2.2 Configuration Parameters - -Administrators can enable/disable the whitelist function and add, modify, or delete whitelist IPs/IP segments in the following two ways: - -* Edit the configuration file `iotdb-system.properties`. -* Use the `set configuration` statement.
- * Table model reference: [set configuration](../SQL-Manual/SQL-Maintenance-Statements_timecho.md#_2-2-update-configuration-items) - -Related parameters are as follows: - -| Name | Description | Default Value | Effective Mode | Example | -| ----------------- | ----------- | --------------- | ---------------- | ------- | -| `enable_white_list` | Whether to enable the whitelist function. true: enable; false: disable. The value is case-insensitive. | false | Hot reload | `set enable_white_list = 'true'` | -| `white_ip_list` | Add, modify, or delete whitelist IPs/IP segments. Supports exact match and the \* wildcard. Multiple IPs are separated by commas. | empty | Hot reload | `set white_ip_list='192.168.1.200,192.168.1.201,192.168.1.*'` | - -## 3. Blacklist - -### 3.1 Function Description - -By enabling the blacklist function and configuring the blacklist, specific IP addresses are prevented from accessing the database, guarding against unauthorized access, SQL injection, brute-force attacks, DDoS attacks, and other security threats, thereby ensuring the security and stability of data transmission. - -### 3.2 Configuration Parameters - -Administrators can enable/disable the blacklist function and add, modify, or delete blacklist IPs/IP segments in the following two ways: - -* Edit the configuration file `iotdb-system.properties`. -* Use the `set configuration` statement.
- * Table model reference: [set configuration](../SQL-Manual/SQL-Maintenance-Statements_timecho.md#_2-2-update-configuration-items) - -Related parameters are as follows: - -| Name | Description | Default Value | Effective Mode | Example | -|---------------------| ----------- | --------------- | ---------------- | ------- | -| `enable_black_list` | Whether to enable the blacklist function. true: enable; false: disable. The value is case-insensitive. | false | Hot reload | `set enable_black_list = 'true'` | -| `black_ip_list` | Add, modify, or delete blacklist IPs/IP segments. Supports exact match and the \* wildcard. Multiple IPs are separated by commas. | empty | Hot reload | `set black_ip_list='192.168.1.200,192.168.1.201,192.168.1.*'` | - -## 4. Notes - -1. After the whitelist is enabled, if the list is empty, all connections are denied. If the local IP is not included, local login is denied. -2. When the same IP appears in both the whitelist and blacklist, the blacklist takes precedence. -3. The system validates the IP format. An invalid entry is reported as an error when a user connects and is then skipped; it does not affect the loading of the other valid IPs. -4. Duplicate IPs in the configuration are supported; they are automatically deduplicated in memory without notification. For manual deduplication, edit the configuration accordingly. -5. Blacklist/whitelist rules only apply to new connections. Connections established before the function was enabled are not affected; they are checked only when they reconnect.
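
The interaction of notes 1 and 2 can be modelled in a few lines. The sketch below is an illustration of the documented rules only, not IoTDB's implementation: the function names `ip_in_list` and `connection_allowed` are hypothetical, and it assumes that `*` matches any run of characters (the document does not say whether the wildcard is limited to one address segment).

```bash
#!/usr/bin/env bash
# Hypothetical model of the documented rules; not IoTDB's actual matching code.

# Return 0 if IP $1 matches any comma-separated pattern in $2.
ip_in_list() {
  local ip="$1" list="$2" pattern
  local -a patterns
  IFS=',' read -r -a patterns <<< "$list"
  for pattern in "${patterns[@]}"; do
    # Unquoted right-hand side: "*" in the pattern acts as a wildcard (assumed semantics).
    [[ "$ip" == $pattern ]] && return 0
  done
  return 1
}

# Arguments: ip, enable_white_list, white_ip_list, enable_black_list, black_ip_list
connection_allowed() {
  local ip="$1" white_on="$2" white_list="$3" black_on="$4" black_list="$5"
  # Note 2: the blacklist takes precedence over the whitelist.
  if [ "$black_on" = "true" ] && ip_in_list "$ip" "$black_list"; then
    return 1
  fi
  # Note 1: an enabled but empty whitelist matches nothing, so every connection is denied.
  if [ "$white_on" = "true" ]; then
    ip_in_list "$ip" "$white_list"
    return $?
  fi
  return 0
}

# Usage:
#   connection_allowed 192.168.1.7 true '192.168.1.*' false '' && echo allowed
```

Checking the blacklist first keeps the precedence rule explicit: an address listed in both places never reaches the whitelist check.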
diff --git a/src/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md b/src/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md deleted file mode 100644 index 8e6964a35..000000000 --- a/src/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md +++ /dev/null @@ -1,850 +0,0 @@ - -# Data Sync - -Data synchronization is a typical requirement in the Industrial Internet of Things (IIoT). Through data synchronization mechanisms, data sharing between IoTDB instances can be achieved, enabling the establishment of a complete data pipeline to meet needs such as internal and external network data exchange, edge-to-cloud synchronization, data migration, and data backup. - -## 1. Functional Overview - -### 1.1 Data Synchronization - -A data synchronization task consists of three stages: - -![](/img/data-sync-new.png) - -- Source Stage: This stage is used to extract data from the source IoTDB, defined in the `source` section of the SQL statement. -- Process Stage: This stage is used to process the data extracted from the source IoTDB, defined in the `processor` section of the SQL statement. -- Sink Stage: This stage is used to send data to the target IoTDB, defined in the `sink` section of the SQL statement. - -By declaratively configuring these three parts in an SQL statement, flexible data synchronization capabilities can be achieved. Currently, data synchronization supports the following information, and you can select the synchronization scope when creating a synchronization task (the default is `data.insert`, which synchronizes newly written data):
| Synchronization Scope | | Synchronization Content Description |
| --- | --- | --- |
| all | | All scopes |
| data | insert | Synchronize newly written data |
| | delete | Synchronize deleted data |
| schema | database | Synchronize database creation, modification or deletion operations |
| | table | Synchronize table creation, modification or deletion operations |
| | TTL | Synchronize the data retention time |
| auth | - | Synchronize user permissions and access control |
- -### 1.2 Functional Limitations and Notes - -- Data synchronization between IoTDB 1.x versions and IoTDB 2.x or later versions is not supported. -- When performing data synchronization tasks, avoid executing any deletion operations to prevent inconsistencies between the two ends. -- The `pipe` and `pipe plugins` for the tree model and the table model are designed to be isolated from each other. Before creating a `pipe`, it is recommended to first use the `show` command to query the built-in plugins available under the current `-sql_dialect` parameter configuration to ensure syntax compatibility and functional support. -- Object-type data export is supported since version V2.0.9.2. -- When a Pipe fails to write data to the sink due to field type mismatches, IoTDB automatically converts the data to the field types defined in the existing sink schema and retries the write operation to improve the synchronization success rate. This feature is controlled by the parameter `sink.exception.data.convert-on-type-mismatch`. Refer to the subsequent sink parameter table for detailed parameter descriptions. - -The conversion rules for type mismatches are as follows: - -| Source Type | Target Type | Conversion Rule | -| --------------------- | ------------ | --------------- | -| Numeric Type | Numeric Type | Convert to the target numeric type. Truncation, precision loss, or overflow may occur. | -| Numeric Type | BOOLEAN | `0` is converted to `false`; non-zero values are converted to `true`. | -| BOOLEAN | Numeric Type | `true` is converted to `1`; `false` is converted to `0`. | -| TEXT, STRING, BLOB | BOOLEAN | Parse the string into a BOOLEAN value. | -| TEXT, STRING, BLOB | Numeric Type | Parse the string into the target numeric type. If parsing fails, write the default value `0`, `0L`, or `0.0`. | -| TEXT, STRING, BLOB | TIMESTAMP | Parse the string into a TIMESTAMP value. If parsing fails, write the default value `0L`.
| -| TEXT, STRING, BLOB | DATE | Parse the string into a DATE value. If parsing fails, write the default date `1970-01-01`. | -| Invalid Numeric Value | DATE | If conversion to a valid DATE fails, write the default date `1970-01-01`. | -| DATE | TIMESTAMP | Convert to the timestamp of 00:00 (UTC) on the same day. | -| TIMESTAMP | DATE | Convert to the corresponding date in UTC. | - -> **Note**: Automatic conversion is performed based on the existing sink schema and will **not** modify the sink schema. This feature prioritizes continuous data synchronization, which may result in precision loss or writing of default values. - - - -## 2. Usage Instructions - -A data synchronization task can be in one of three states: RUNNING, STOPPED, and DROPPED. The state transitions of the task are illustrated in the diagram below: - -![](/img/Data-Sync02.png) - -After creation, the task will start directly. Additionally, if the task stops due to an exception, the system will automatically attempt to restart it. - -We provide the following SQL statements for managing the state of synchronization tasks. - -### 2.1 Create a Task - -Use the `CREATE PIPE` statement to create a data synchronization task. Among the following attributes, `PipeId` and `sink` are required, while `source` and `processor` are optional. Note that the order of the `SOURCE` and `SINK` plugins cannot be swapped when writing the SQL. - -SQL Example: - -```SQL -CREATE PIPE [IF NOT EXISTS] -- PipeId is a unique name identifying the task --- Data extraction plugin (optional) -WITH SOURCE ( - [ = ,], -) --- Data processing plugin (optional) -WITH PROCESSOR ( - [ = ,], -) --- Data transmission plugin (required) -WITH SINK ( - [ = ,], -) -``` - -**IF NOT EXISTS Semantics**: Ensures that the creation command is executed only if the specified Pipe does not exist, preventing errors caused by attempting to create an already existing Pipe. 
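
To make the `CREATE PIPE` template concrete, the sketch below prints a statement that sets only the required sink. The function name `build_create_pipe` is hypothetical; the `'sink'` and `'node-urls'` keys and the `iotdb-thrift-sink` plugin name are taken from examples elsewhere in this document. Passing the statement to the CLI with `-e` is an assumption about the CLI version in use.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a CREATE PIPE statement with only the required sink section.
build_create_pipe() {
  local pipe_id="$1" node_urls="$2"
  cat <<SQL
CREATE PIPE IF NOT EXISTS ${pipe_id}
WITH SINK (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '${node_urls}'
)
SQL
}

# Usage (assumes the CLI accepts -e for a one-off statement):
#   ./sbin/start-cli.sh -sql_dialect table -e "$(build_create_pipe a2b 127.0.0.1:6668)"
```

Because `source` and `processor` are optional, omitting them leaves the defaults in place, so only newly written data (`data.insert`) is synchronized.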
- -**Note**: - -Starting from V2.0.8, when creating a full data synchronization Pipe (e.g. PipeId: `alldatapipe`), the system will automatically split it into two independent Pipes: - -* History Pipe: The PipeId is the original name plus the suffix `_history` (e.g. `alldatapipe_history`). The source parameter carries the default configurations: `'realtime.enable'='false', 'inclusion'='data.insert', 'inclusion.exclusion'=''` -* Realtime Pipe: The PipeId is the original name plus the suffix `_realtime` (e.g. `alldatapipe_realtime`). The source parameter carries the default configuration: `'history.enable'='false'`. If metadata synchronization is configured, the Realtime Pipe will be responsible for sending the data. - -After successful creation, the original PipeId (e.g. `alldatapipe`) will no longer be a valid identifier. When performing task operations such as starting, stopping, deleting, or viewing, you must use the split independent PipeId (i.e. `*_history` or `*_realtime`). For operation examples, see the [View Tasks](./Data-Sync_timecho.md#_2-5-view-task) section. - -### 2.2 Start a Task - -After creation, the task directly enters the RUNNING state and does not require manual startup. However, if the task is stopped using the `STOP PIPE` statement, you need to manually start it using the `START PIPE` statement. If the task stops due to an exception, it will automatically restart to resume data processing: - -```SQL -START PIPE -``` - -### 2.3 Stop a Task - -To stop data processing: - -```SQL -STOP PIPE -``` - -### 2.4 Delete a Task - -To delete a specified task: - -```SQL -DROP PIPE [IF EXISTS] -``` - -**IF EXISTS Semantics**: Ensures that the deletion command is executed only if the specified Pipe exists, preventing errors caused by attempting to delete a non-existent Pipe. - -**Note**: Deleting a task does not require stopping the synchronization task first.
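
Because a full synchronization Pipe is split into `_history` and `_realtime` halves from V2.0.8 onward, start, stop, and delete operations have to be issued once per half. The sketch below generates both statements from the original name; `pipe_statements` is a hypothetical helper, and it only prints SQL rather than executing it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print one statement per half of a split Pipe.
# $1 is the action (START, STOP, or DROP); $2 is the PipeId used at creation time.
pipe_statements() {
  local action="$1" pipe_id="$2" suffix
  for suffix in history realtime; do
    echo "${action} PIPE ${pipe_id}_${suffix}"
  done
}

# Usage:
#   pipe_statements STOP alldatapipe
```

The generated statements can then be run in the CLI in the same way as the single-Pipe examples in sections 2.2 to 2.4.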
- -### 2.5 View Tasks - -To view all tasks: - -```SQL -SHOW PIPES -``` - -To view a specific task: - -```SQL -SHOW PIPE -``` - -Example Output of `SHOW PIPES`: - -```SQL -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State|PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -|59abf95db892428b9d01c5fa318014ea|2024-06-17T14:03:44.189|RUNNING| {}| {}|{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}| | 128| 1.03| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -``` - -**Column Descriptions**: - -- **ID**: Unique identifier of the synchronization task. -- **CreationTime**: Time when the task was created. -- **State**: Current state of the task. -- **PipeSource**: Source of the data stream. -- **PipeProcessor**: Processing logic applied during data transmission. -- **PipeSink**: Destination of the data stream. -- **ExceptionMessage**: Displays exception information for the task. -- **RemainingEventCount** (statistics may have delays): Number of remaining events, including data and metadata synchronization events, as well as system and user-defined events. -- **EstimatedRemainingSeconds** (statistics may have delays): Estimated remaining time to complete the transmission based on the current event count and pipe processing rate. - -Example: - -In V2.0.8 and later versions, create a full data synchronization task and view the task details. 
- -```sql -IoTDB> create pipe alldatapipe with source('inclusion'='all','exclusion'='auth') with sink('node-urls'='127.0.0.1:6668') - -IoTDB> show pipe alldatapipe_history -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_history|2025-12-18T15:06:16.697|RUNNING|{exclusion=auth, history.enable=true, inclusion=data.insert, inclusion.exclusion=, realtime.enable=false}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -IoTDB> show pipe alldatapipe_realtime -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ 
-|alldatapipe_realtime|2025-12-18T15:06:16.312|RUNNING|{exclusion=auth, history.enable=false, inclusion=all, realtime.enable=true}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -``` - - -### 2.6 Modify a Task - -The `ALTER PIPE` statement dynamically updates an existing PIPE and supports modifying or replacing the configuration of source, processor, and sink. - -```SQL -ALTER PIPE [IF EXISTS] - MODIFY/REPLACE SOURCE(...) - MODIFY/REPLACE PROCESSOR(...) - MODIFY/REPLACE SINK(...) -``` - -Description: - -* Executing this operation does not change the running state of the PIPE. It is equivalent to keeping the processing progress of the original PipeId and creating a new PIPE at the original progress position. -* The modify/replace parameters for source/processor/sink are all optional. If no modification parameter is specified, it is equivalent to deleting the current PIPE and recreating it with the original configuration and progress. -* For a plugin specified with modify, the plugin's other parameters are retained, and only the given parameters are replaced or added. -* For a plugin specified with replace, all parameters of the plugin are replaced directly. -* When the [IF EXISTS] keyword is used, execution succeeds even if no Pipe with the same name exists, but no operation is actually performed. - -Example: - -```SQL -ALTER PIPE A2B REPLACE SINK ('sink'='iotdb-thrift-sink', 'node-urls' = '127.0.0.1:6668'); -``` - -### 2.7 Synchronization Plugins - -To make the architecture more flexible and adaptable to different synchronization scenarios, IoTDB supports plugin assembly in the synchronization task framework. 
The system provides some common pre-installed plugins, and you can also customize `processor` and `sink` plugins and load them into the IoTDB system. - -To view the plugins available in the system (including custom and built-in plugins), use the following statement: - -```SQL -SHOW PIPEPLUGINS -``` - -Example Output: - -```SQL -IoTDB> SHOW PIPEPLUGINS -+---------------------+----------+-----------------------------------------------------------------------------------------+---------+----------------+ -| PluginName|PluginType| ClassName|PluginJar|ExceptionMessage| -+---------------------+----------+-----------------------------------------------------------------------------------------+---------+----------------+ -| DO-NOTHING-PROCESSOR| Builtin|org.apache.iotdb.commons.pipe.agent.plugin.builtin.processor.donothing.DoNothingProcessor| | | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.donothing.DoNothingSink| | | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.iotdb.airgap.IoTDBAirGapSink| | | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.source.iotdb.IoTDBSource| | | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.iotdb.thrift.IoTDBThriftSink| | | -|IOTDB-THRIFT-SSL-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.iotdb.thrift.IoTDBThriftSslSink| | | -| TSFILE-LOCAL-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.tsfile.PipeTsFileLocalSink| | | -| WRITE-BACK-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.writeback.WriteBackSink| | | -+---------------------+----------+-----------------------------------------------------------------------------------------+---------+----------------+ -``` - -Detailed introduction of pre-installed plugins is as follows (for detailed parameters of each plugin, please refer to the [Parameter 
Description](#reference-parameter-description)):
-
-| Type | Custom Plugin | Plugin Name | Description |
-| :--- | :--- | :--- | :--- |
-| Source Plugin | Not Supported | iotdb-source | Default extractor plugin for extracting historical or real-time data from IoTDB. |
-| Processor Plugin | Supported | do-nothing-processor | Default processor plugin that does not process incoming data. |
-| Sink Plugin | Supported | do-nothing-sink | Does not process outgoing data. |
-| Sink Plugin | Supported | iotdb-thrift-sink | Default sink plugin for data transmission between IoTDB instances (V2.0.0+). Uses Thrift RPC framework with a multi-threaded async non-blocking IO model, ideal for distributed target scenarios. |
-| Sink Plugin | Supported | iotdb-air-gap-sink | Used for cross-unidirectional data gate synchronization between IoTDB instances (V2.0.0+). Supports gate models like NARI Syskeeper 2000. |
-| Sink Plugin | Supported | iotdb-thrift-ssl-sink | Used for data transmission between IoTDB instances (V2.0.0+). Uses Thrift RPC framework with a multi-threaded sync blocking IO model, suitable for high-security scenarios. |
-| Sink Plugin | Supported | write-back-sink | A data write-back plugin for IoTDB (V2.0.2 and above) to achieve the effect of materialized views. |
-| Sink Plugin | Supported | opc-ua-sink | An OPC UA protocol data transfer plugin for IoTDB (V2.0.2 and above), supporting both Client/Server and Pub/Sub communication modes. |
-| Sink Plugin | Supported | tsfile-local-sink | Used in IoTDB (V2.0.9.2 and later) to support exporting Object data to the local file system where the IoTDB server resides. |
-| Sink Plugin | Supported | tsfile-remote-sink | Used in IoTDB (V2.0.9.2 and later) to support sending Object data to a remote server via the SSH/SCP protocol. |
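
The plugin names listed above are the values passed to the `source`, `processor`, and `sink` keys of a `CREATE PIPE` statement. The following is a minimal sketch, assuming a target IoTDB whose DataNode service port is reachable at `127.0.0.1:6668`; the pipe name `plugin_demo` is hypothetical and used only for illustration.

```SQL
-- Check which plugins are available before referencing them
SHOW PIPEPLUGINS;

-- Assemble a pipe from one plugin of each type (pipe name is illustrative)
CREATE PIPE plugin_demo
WITH SOURCE (
  'source' = 'iotdb-source'                -- built-in source plugin
)
WITH PROCESSOR (
  'processor' = 'do-nothing-processor'     -- built-in processor that passes data through unchanged
)
WITH SINK (
  'sink' = 'iotdb-thrift-sink',            -- built-in sink for IoTDB-to-IoTDB transfer
  'node-urls' = '127.0.0.1:6668'           -- assumed address of the target DataNode service port
);
```

Swapping the `sink` value for another plugin name from the list changes the transport, provided that plugin's required parameters are also supplied.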
-
-## 3. Usage Examples
-
-### 3.1 Full Data Synchronization
-
-This example demonstrates synchronizing all data from one IoTDB to another. The data pipeline is shown below:
-
-![](/img/e1.png)
-
-In this example, we create a synchronization task named `A2B` to synchronize all data from IoTDB A to IoTDB B. The `iotdb-thrift-sink` plugin (built-in) is used, and the `node-urls` parameter is configured with the URL of the DataNode service port on the target IoTDB.
-
-SQL Example:
-
-```SQL
-CREATE PIPE A2B
-WITH SINK (
-  'sink' = 'iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668' -- URL of the DataNode service port on the target IoTDB
-)
-```
-
-### 3.2 Partial Data Synchronization
-
-This example demonstrates synchronizing data within a specific historical time range (from August 23, 2023, 8:00 to October 23, 2023, 8:00) to another IoTDB. The data pipeline is shown below:
-
-![](/img/e2.png)
-
-In this example, we create a synchronization task named `A2B`. First, we define the data range in the `source` configuration. Since we are synchronizing historical data (data that existed before the task was created), we need to configure the start time (`start-time`), end time (`end-time`), and the streaming mode (`mode.streaming`). The `node-urls` parameter is configured with the URL of the DataNode service port on the target IoTDB.
-
-SQL Example:
-
-```SQL
-CREATE PIPE A2B
-WITH SOURCE (
-  'source' = 'iotdb-source',
-  'mode.streaming' = 'true',  -- Extraction mode for newly inserted data (after the pipe is created):
-                              -- Whether to extract data in streaming mode (if set to false, batch mode is used).
-  'database-name' = 'testdb.*',  -- Scope of data synchronization
-  'start-time' = '2023.08.23T08:00:00+00:00',  -- The event time at which data synchronization starts (inclusive).
-  'end-time' = '2023.10.23T08:00:00+00:00'  -- The event time at which data synchronization ends (inclusive).
-) -WITH SINK ( - 'sink' = 'iotdb-thrift-async-sink', - 'node-urls' = '127.0.0.1:6668' -- The URL of the DataNode's data service port in the target IoTDB instance. -) -``` - -### 3.3 Bidirectional Data Transmission - -This example demonstrates a scenario where two IoTDB instances act as dual-active systems. The data pipeline is shown below: - -![](/img/e3.png) - -To avoid infinite data loops, the `source.mode.double-living` parameter must be set to `true` on both IoTDB A and B, indicating that data forwarded from another pipe will not be retransmitted. - -SQL Example: On IoTDB A: - -```SQL -CREATE PIPE AB -WITH SOURCE ( - 'source.mode.double-living' = 'true' -- Do not forward data from other pipes -) -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668' -- URL of the DataNode service port on the target IoTDB -) -``` - -On IoTDB B: - -```SQL -CREATE PIPE BA -WITH SOURCE ( - 'source.mode.double-living' = 'true' -- Do not forward data from other pipes -) -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667' -- URL of the DataNode service port on the target IoTDB -) -``` - -### 3.4 Edge-to-Cloud Data Transmission - -This example demonstrates synchronizing data from multiple IoTDB clusters (B, C, D) to a central IoTDB cluster (A). The data pipeline is shown below: - -![](/img/sync_en_03.png) - -To synchronize data from clusters B, C, and D to cluster A, the `database-name` and `table-name` parameters are used to restrict the data range. 
- -SQL Example: On IoTDB B: - -```SQL -CREATE PIPE BA -WITH SOURCE ( - 'database-name' = 'db_b.*', -- Restrict the database scope - 'table-name' = '.*' -- Match all tables -) -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667' -- URL of the DataNode service port on the target IoTDB -) -``` - -On IoTDB C : - -```SQL -CREATE PIPE CA -WITH SOURCE ( - 'database-name' = 'db_c.*', -- Restrict the database scope - 'table-name' = '.*' -- Match all tables -) -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668' -- URL of the DataNode service port on the target IoTDB -) -``` - -On IoTDB D: - -```SQL -CREATE PIPE DA -WITH SOURCE ( - 'database-name' = 'db_d.*', -- Restrict the database scope - 'table-name' = '.*' -- Match all tables -) -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669' -- URL of the DataNode service port on the target IoTDB -) -``` - -### 3.5 Cascaded Data Transmission - -This example demonstrates cascading data transmission from IoTDB A to IoTDB B and then to IoTDB C. The data pipeline is shown below: - -![](/img/sync_en_04.png) - - -SQL Example: On IoTDB A: - -```SQL -CREATE PIPE AB -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668' -- URL of the DataNode service port on the target IoTDB -) -``` - -On IoTDB B: - -```SQL -CREATE PIPE BC -WITH SOURCE ( -) -WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669' -- URL of the DataNode service port on the target IoTDB -) -``` - -### 3.6 Air-Gapped Data Transmission - -This example demonstrates synchronizing data from one IoTDB to another through a unidirectional air gap. The data pipeline is shown below: - -![](/img/cross-network-gateway.png) - -In this example, the `iotdb-air-gap-sink` plugin is used (currently supports specific air gap models; contact Timecho team for details). 
After configuring the air gap, execute the following statement on IoTDB A, where `node-urls` is the URL of the DataNode service port on the target IoTDB.
-
-SQL Example:
-
-```SQL
-CREATE PIPE A2B
-WITH SINK (
-  'sink' = 'iotdb-air-gap-sink',
-  'node-urls' = '10.53.53.53:9780' -- URL of the DataNode service port on the target IoTDB
-)
-```
-
-**Note:**
-* When creating a pipe for synchronization across an air gap (data diode), make sure the target user already exists on the receiving end. If that user is missing when the pipe is created, data written before the user is later created will not be synchronized.
-* Currently supported air gap device models are listed in the table below.
-> For other air gap device models, please contact Timecho staff to confirm compatibility.
-
-| Gateway Type | Model | Return Packet Limit | Send Limit |
-| ---------------------- | ---------------------------------------------------------- | ------------------- | ---------------------- |
-| Forward Gate | NARI Syskeeper-2000 Forward Gate | All 0 / All 1 bytes | No Limit |
-| Forward Gate | XJ Self-developed Diaphragm | All 0 / All 1 bytes | No Limit |
-| Unknown | WISGAP | No Limit | No Limit |
-| Forward Gate | KEDONG StoneWall-2000 Network Security Isolation Device | No Limit | No Limit |
-| Reverse Gate | NARI Syskeeper-2000 Reverse Direction | All 0 / All 1 bytes | Meet E Language Format |
-| Unknown | DPtech ISG5000 | No Limit | No Limit |
-| Unknown | GAP XL—GAP | No Limit | No Limit |
-
-### 3.7 Compressed Synchronization
-
-IoTDB supports specifying data compression methods during synchronization. The `compressor` parameter can be configured to enable real-time data compression and transmission. Supported algorithms include `snappy`, `gzip`, `lz4`, `zstd`, and `lzma2`. Multiple algorithms can be combined and applied in the configured order.
The `rate-limit-bytes-per-second` parameter (supported in V1.3.3 and later) limits the maximum number of bytes transmitted per second (calculated after compression). If set to a value less than 0, there is no limit. - -**SQL Example**: - -```SQL -CREATE PIPE A2B -WITH SINK ( - 'node-urls' = '127.0.0.1:6668', -- URL of the DataNode service port on the target IoTDB - 'compressor' = 'snappy,lz4', -- Compression algorithms - 'rate-limit-bytes-per-second' = '1048576' -- Maximum bytes allowed per second -) -``` - -### 3.8 Encrypted Synchronization - -IoTDB supports SSL encryption during synchronization to securely transmit data between IoTDB instances. By configuring SSL-related parameters such as the certificate path (`ssl.trust-store-path`) and password (`ssl.trust-store-pwd`), data can be protected by SSL encryption during synchronization. - -**SQL Example**: - -```SQL -CREATE PIPE A2B -WITH SINK ( - 'sink' = 'iotdb-thrift-ssl-sink', - 'node-urls' = '127.0.0.1:6667', -- URL of the DataNode service port on the target IoTDB - 'ssl.trust-store-path' = 'pki/trusted', -- Path to the trust store certificate - 'ssl.trust-store-pwd' = 'root' -- Password for the trust store certificate -) -``` - -### 3.9 Object-Type Data Export -Since version V2.0.9.2, IoTDB supports exporting Object-type data. The following two methods are supported by configuring sink parameters: - -* **Local Mode**: Exports data to the local file system where the IoTDB server resides. -* **SCP Mode**: Sends data to a remote server via the SSH/SCP protocol. - -**Example 1: Local Export** - -You can directly use the built-in `tsfile-local-sink` plugin to create a PIPE statement for data export. 
For example:
-
-```SQL
-CREATE PIPE tsfile_export_local
-WITH SOURCE (
-  'source' = 'iotdb-source',
-  'table-name' = 'test_table'
-)
-WITH PROCESSOR (
-  'processor' = 'do-nothing-processor'
-)
-WITH SINK (
-  'sink' = 'tsfile-local-sink',  -- Required, specifies the Sink type
-  'sink.local.target-path' = '/data/backup/export_2024',  -- Target export path
-  'sink.rate-limit-bytes-per-second' = '10485760'  -- Rate limit: 10MB/s
-);
-```
-
-**Example 2: Remote Transfer**
-
-1. Contact the Timecho Team to obtain the JAR package related to the `tsfile-remote-sink` plugin, such as `tsfile-remote-sink--jar-with-dependencies.jar`, and place it in a path accessible to IoTDB (e.g., on all DataNode hosts).
-2. Register the plugin using the following statement:
-
-```SQL
-CREATE PIPEPLUGIN tsfile_remote_sink
-AS 'org.apache.iotdb.pipe.plugin.sink.tsfile.PipeTsFileRemoteSink'
-USING URI 'file:///path/to/tsfile-remote-sink--jar-with-dependencies.jar';
-```
-
-3. Create the PIPE statement:
-
-```SQL
-CREATE PIPE tsfile_export_scp
-WITH SOURCE (
-  'source' = 'iotdb-source',
-  'table-name' = 'test_table'
-)
-WITH PROCESSOR (
-  'processor' = 'do-nothing-processor'
-)
-WITH SINK (
-  'sink' = 'tsfile_remote_sink',
-  'sink.file-mode' = 'scp',  -- Specifies SCP mode
-  'sink.scp.host' = '192.168.1.100',  -- Remote host IP
-  'sink.scp.port' = '22',  -- SSH port
-  'sink.scp.user' = 'backup_user',  -- SSH username
-  'sink.scp.password' = 'ComplexPass123!',  -- SSH password
-  'sink.scp.remote-path' = '/remote/archive/',  -- Remote storage path
-  'sink.rate-limit-bytes-per-second' = '10485760'  -- Rate limit: 10MB/s
-);
-```
-
-**Note**: When exporting Object-type data in SCP mode, to avoid handshake exceptions, connection failures, or frequent Pipe restarts, it is recommended to take any of the following measures:
-* Appropriately lower the configuration parameter `sink.scp.object-parallelism`
-* Increase the `MaxStartups` value on the target machine as needed.
After modification, execute `sshd reload` or `sshd restart` for the configuration to take effect. - -**Sink Exported TSFile and Object Format:** - -```Bash -target_dir - ├── tsfile.tsfile - └── tsfile/ (matches the TSFile name) - ├── regionID/tableName/tag1/tag2/field/timestamp1.bin - ├── regionID/tableName/tag1/tag2/field/timestamp2.bin - └── regionID/tableName1/tag3/tag4/field/timestamp1.bin -``` - -## Reference: Notes - -You can adjust the parameters for data synchronization by modifying the IoTDB configuration file (`iotdb-system.properties`), such as the directory for storing synchronized data. The complete configuration is as follows: - -```Properties -# pipe_receiver_file_dir -# If this property is unset, system will save the data in the default relative path directory under the IoTDB folder(i.e., %IOTDB_HOME%/${cn_system_dir}/pipe/receiver). -# If it is absolute, system will save the data in the exact location it points to. -# If it is relative, system will save the data in the relative path directory it indicates under the IoTDB folder. -# Note: If pipe_receiver_file_dir is assigned an empty string(i.e.,zero-size), it will be handled as a relative path. -# effectiveMode: restart -# For windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is absolute. Otherwise, it is relative. -# pipe_receiver_file_dir=data\\confignode\\system\\pipe\\receiver -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_receiver_file_dir=data/confignode/system/pipe/receiver - -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# effectiveMode: first_start -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. 
-# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# effectiveMode: restart -# Datatype: int -pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# effectiveMode: restart -# Datatype: int -pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# effectiveMode: restart -# Datatype: int -pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# effectiveMode: restart -# Datatype: int -pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# effectiveMode: restart -# Datatype: Boolean -pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# Datatype: int -# effectiveMode: restart -pipe_air_gap_receiver_port=9780 - -# The total bytes that all pipe sinks can transfer per second. -# When given a value less than or equal to 0, it means no limit. -# default value is -1, which means no limit. 
-# effectiveMode: hot_reload -# Datatype: double -pipe_all_sinks_rate_limit_bytes_per_second=-1 -``` - -## Reference: Parameter Description - -### source parameter - -| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** | -| :----------------------- |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------| :----------- | :---------------------------------------------------------- | -| source | iotdb-source | String: iotdb-source | Yes | - | -| inclusion | Used to specify the range of data to be synchronized in the data synchronization task, including data,schema and auth | String:all, data(insert,delete), schema(database,table,ttl), auth | No | data.insert | -| inclusion.exclusion | Used to exclude specific operations from the range specified by inclusion, reducing the amount of data synchronized | String:all, data(insert,delete), schema(database,table,ttl), auth | No | - | -| 
mode.streaming | This parameter specifies the source of time-series data capture. It applies to scenarios where `mode.streaming` is set to `false`, determining the capture source for `data.insert` in `inclusion`. Two capture strategies are available: - **true**: Dynamically selects the capture type. The system adapts to downstream processing speed, choosing between capturing each write request or only capturing TsFile file sealing requests. When downstream processing is fast, write requests are prioritized to reduce latency; when processing is slow, only file sealing requests are captured to prevent processing backlogs. This mode suits most scenarios, optimizing the balance between processing latency and throughput. - **false**: Uses a fixed batch capture approach, capturing only TsFile file sealing requests. This mode is suitable for resource-constrained applications, reducing system load. **Note**: Snapshot data captured when the pipe starts will only be provided for downstream processing as files. | Boolean: true / false | No | true | -| mode.strict | Determines whether to strictly filter data when using the `time`, `path`, `database-name`, or `table-name` parameters: - **true**: Strict filtering. The system will strictly filter captured data according to the specified conditions, ensuring that only matching data is selected. - **false**: Non-strict filtering. Some extra data may be included during the selection process to optimize performance and reduce CPU and I/O consumption. | Boolean: true / false | No | true | -| mode.snapshot | This parameter determines the data capture mode, affecting the `data` in `inclusion`. Two modes are available: - **true**: Static data capture. A one-time data snapshot is taken when the pipe starts. Once the snapshot data is fully consumed, the pipe automatically terminates (executing `DROP PIPE` SQL automatically). - **false**: Dynamic data capture. 
In addition to capturing snapshot data when the pipe starts, it continuously captures subsequent data changes. The pipe remains active to process the dynamic data stream. | Boolean: true / false | No | false | -| database-name | When the user connects with `sql_dialect` set to `table`, this parameter can be specified. Determines the scope of data capture, affecting the `data` in `inclusion`. Specifies the database name to filter. It can be a specific database name or a Java-style regular expression to match multiple databases. By default, all databases are matched. | String: Database name or database regular expression pattern string, which can match uncreated or non - existent databases. | No | ".*" | -| table-name | When the user connects with `sql_dialect` set to `table`, this parameter can be specified. Determines the scope of data capture, affecting the `data` in `inclusion`. Specifies the table name to filter. It can be a specific table name or a Java-style regular expression to match multiple tables. By default, all tables are matched. | String: Data table name or data table regular expression pattern string, which can be uncreated or non - existent tables. | No | ".*" | -| start-time | Determines the scope of data capture, affecting the `data` in `inclusion`. Data with an event time **greater than or equal to** this parameter will be selected for stream processing in the pipe. | Long: [Long.MIN_VALUE, Long.MAX_VALUE](Unix bare timestamp)orString: ISO format timestamp supported by IoTDB | No | Long: [Long.MIN_VALUE, Long.MAX_VALUE](Unix bare timestamp) | -| end-time | Determines the scope of data capture, affecting the `data` in `inclusion`. Data with an event time **less than or equal to** this parameter will be selected for stream processing in the pipe. 
| Long: [Long.MIN_VALUE, Long.MAX_VALUE](Unix bare timestamp)orString: ISO format timestamp supported by IoTDB | No | Long: [Long.MIN_VALUE, Long.MAX_VALUE](Unix bare timestamp) | -| mode.double-living | Whether to enable full dual-active mode. When enabled, the system will ignore the `-sql_dialect` connection method to capture all tree-table model data and not forward data synced from another pipe (to avoid circular synchronization). | Boolean: true / false | No | false | -| mods | Same as mods.enable, whether to send the MODS file for TSFile. | Boolean: true / false | No | false | -| skipIf | Which errors can be skipped? Currently only the insufficient privileges error. | String:no-privileges | No | no-privileges | - -> 💎 **Note:** The difference between the values of true and false for the data extraction mode `mode.streaming` -> -> - True (recommended): Under this value, the task will process and send the data in real-time. Its characteristics are high timeliness and low throughput. -> - False: Under this value, the task will process and send the data in batches (according to the underlying data files). Its characteristics are low timeliness and high throughput. 
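
To make the source parameters above concrete, here is a minimal sketch of a one-time snapshot transfer. It assumes a target DataNode service port at `127.0.0.1:6668`; the pipe name `snapshot_once`, the database pattern `testdb.*`, and the table name `test_table` are hypothetical values chosen for illustration.

```SQL
CREATE PIPE snapshot_once
WITH SOURCE (
  'source' = 'iotdb-source',
  'mode.snapshot' = 'true',        -- take a one-time snapshot; per the table above, the pipe terminates once it is consumed
  'mode.strict' = 'true',          -- filter strictly by the conditions below
  'database-name' = 'testdb.*',    -- Java-style regular expression over database names (illustrative)
  'table-name' = 'test_table'      -- a specific table name (illustrative)
)
WITH SINK (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668'   -- assumed address of the target DataNode service port
);
```

Setting `mode.snapshot` to `false` instead would keep the pipe active after the snapshot, so that subsequent writes continue to be captured.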
- -### sink parameter - -#### iotdb-thrift-sink - -| **Parameter** | **Description** | Value Range | Required | Default Value | -|:-------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:---------| :------------ | -| sink | iotdb-thrift-sink or iotdb-thrift-async-sink | String: iotdb-thrift-sink or iotdb-thrift-async-sink | Yes | - | -| node-urls | URLs of the DataNode service ports on the target IoTDB. (please note that the synchronization task does not support forwarding to its own service). | String. Example:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Yes | - | -| user/username | username for connecting to the target IoTDB. Must have appropriate permissions. | String | No | root | -| password | Password for the username. | String | No | root | -| batch.enable | Enables batch mode for log transmission to improve throughput and reduce IOPS. | Boolean: true, false | No | true | -| batch.max-delay-seconds | Maximum delay (in seconds) for batch transmission. | Integer | No | 1 | -| batch.max-delay-ms | Maximum delay (in ms) for batch transmission. (Available since v2.0.5) | Integer | No | 1 | -| batch.size-bytes | Maximum batch size (in bytes) for batch transmission. | Long | No | 16*1024*1024 | -| compressor | The selected RPC compression algorithm. Multiple algorithms can be configured and will be adopted in sequence for each request. 
| String: snappy / gzip / lz4 / zstd / lzma2 | No | "" |
-| compressor.zstd.level | When the selected RPC compression algorithm is zstd, this parameter can be used to additionally configure the compression level of the zstd algorithm. | Int: [-131072, 22] | No | 3 |
-| rate-limit-bytes-per-second | The maximum number of bytes allowed to be transmitted per second, calculated after compression. If it is less than 0, there is no limit. | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | No | -1 |
-| load-tsfile-strategy | When synchronizing file data, whether the receiver waits for the local load tsfile operation to complete before responding to the sender:<br>sync: Wait for the local load tsfile operation to complete before returning the response.<br>async: Do not wait for the local load tsfile operation to complete; return the response immediately. | String: sync / async | No | sync |
-| format | The payload format for data transmission. Options:<br>- hybrid: The format depends on what is passed from the processor (either tsfile or tablet), and the sink performs no conversion.<br>- tsfile: Data is forcibly converted to tsfile format before transmission. This is suitable for scenarios like data file backup.<br>- tablet: Data is forcibly converted to tablet format before transmission. This is useful for data synchronization when the sender and receiver have incompatible data types (to minimize errors). | String: hybrid / tsfile / tablet | No | hybrid |
-| mark-as-general-write-request | This parameter controls whether data forwarded by external pipes can be synchronized between dual-active pipes (configured on the sender side of dual-active external pipes). | Boolean: true / false. True: can synchronize; False: cannot synchronize | No | False |
-| exception.data.convert-on-type-mismatch | Whether to enable automatic conversion when data types mismatch on the sink side | Boolean: true / false | No | true |
-
-#### iotdb-air-gap-sink
-
-| **Parameter** | **Description** | Value Range | Required | Default Value |
-|:--------------------------------------------| :----------------------------------------------------------- | :----------------------------------------------------------- | :------- |:---------------------------------------------|
-| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | Yes | - |
-| node-urls | URLs of the DataNode service ports on the target IoTDB. (Note that the synchronization task does not support forwarding to its own service.) | String. Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Yes | - |
-| user/username | Username for connecting to the target IoTDB. Must have appropriate permissions. | String | No | root |
-| password | Password for the username. | String | No | TimechoDB@2021 (Before V2.0.6.x it is root) |
-| compressor | The selected RPC compression algorithm. Multiple algorithms can be configured and will be adopted in sequence for each request. | String: snappy / gzip / lz4 / zstd / lzma2 | No | "" |
-| compressor.zstd.level | When the selected RPC compression algorithm is zstd, this parameter can be used to additionally configure the compression level of the zstd algorithm.
| Int: [-131072, 22] | No | 3 |
-| rate-limit-bytes-per-second | The maximum number of bytes allowed to be transmitted per second, calculated after compression. If it is less than 0, there is no limit. | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | No | -1 |
-| load-tsfile-strategy | When synchronizing file data, whether the receiver waits for the local load tsfile operation to complete before responding to the sender:<br>sync: Wait for the local load tsfile operation to complete before returning the response.<br>async: Do not wait for the local load tsfile operation to complete; return the response immediately. | String: sync / async | No | sync |
-| air-gap.handshake-timeout-ms | The timeout duration for the handshake requests when the sender and receiver attempt to establish a connection for the first time, in milliseconds. | Integer | No | 5000 |
-| exception.data.convert-on-type-mismatch | Whether to enable automatic conversion when data types mismatch on the sink side | Boolean: true / false | No | true |
-
-#### iotdb-thrift-ssl-sink
-
-| **Parameter** | **Description** | Value Range | Required | Default Value |
-|:--------------------------------------------|:-----------------------------------------------------------|:---------------------------------------------------------------------------------|:---------| :------------ |
-| sink | iotdb-thrift-ssl-sink | String: iotdb-thrift-ssl-sink | Yes | - |
-| node-urls | URLs of the DataNode service ports on the target IoTDB. (Note that the synchronization task does not support forwarding to its own service.) | String. Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Yes | - |
-| user/username | Username for connecting to the target IoTDB. Must have appropriate permissions. | String | No | root |
-| password | Password for the username. | String | No | root |
-| batch.enable | Enables batch mode for log transmission to improve throughput and reduce IOPS.
| Boolean: true, false | No | true | -| batch.max-delay-seconds | Maximum delay (in seconds) for batch transmission. | Integer | No | 1 | -| batch.max-delay-ms | Maximum delay (in ms) for batch transmission. (Available since v2.0.5) | Integer | No | 1 | -| batch.size-bytes | Maximum batch size (in bytes) for batch transmission. | Long | No | 16*1024*1024 | -| compressor | The selected RPC compression algorithm. Multiple algorithms can be configured and will be adopted in sequence for each request. | String: snappy / gzip / lz4 / zstd / lzma2 | No | "" | -| compressor.zstd.level | When the selected RPC compression algorithm is zstd, this parameter can be used to additionally configure the compression level of the zstd algorithm. | Int: [-131072, 22] | No | 3 | -| rate-limit-bytes-per-second | Maximum bytes allowed per second for transmission (calculated after compression). Set to a value less than 0 for no limit. | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | No | -1 | -| load-tsfile-strategy | When synchronizing file data, whether the receiver waits for the local load tsfile operation to complete before responding to the sender:
sync: Wait for the local load tsfile operation to complete before returning the response.
async: Do not wait for the local load tsfile operation to complete; return the response immediately. | String: sync / async | No | sync | -| ssl.trust-store-path | Path to the trust store certificate for SSL connection. | String | Yes | - | -| ssl.trust-store-pwd | Password for the trust store certificate. | Integer | Yes | - | -| format | The payload formats for data transmission include the following options:
- hybrid: The format depends on what is passed from the processor (either tsfile or tablet), and the sink performs no conversion.
- tsfile: Data is forcibly converted to tsfile format before transmission. This is suitable for scenarios like data file backup.
- tablet: Data is forcibly converted to tablet format before transmission. This is useful for data synchronization when the sender and receiver have incompatible data types (to minimize errors). | String: hybrid / tsfile / tablet | No | hybrid | -| mark-as-general-write-request | This parameter controls whether data forwarded by external pipes can be synchronized between dual-active pipes (configured on the sender side of dual-active external pipes). (Available since v2.0.5) | Boolean: true / false. True: can synchronize; False: cannot synchronize; | No | False | -| exception.data.convert-on-type-mismatch | Whether to enable automatic conversion when data types mismatch on the sink side | Boolean: true / false | No | true | - - - -#### write-back-sink - -| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** | -| ---------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- |--------------| -| sink | write-back-sink | String: write-back-sink | Yes | - | -| user/username | User used for write-back | String: username | No | root | -| password | Password used for write-back | String: password | No | root123 | -| user-id | User ID corresponding to the user | String | No | root | -| cli-hostname | CLI hostname corresponding to the user | String | No | root | -| use-event-user-name | Whether to use another user's username if the event contains one (generally not needed now because there is no external source) | Boolean: true / false | No | false | - -#### opc-ua-sink - -| **Parameter** | **Description** |Value Range | Required | Default Value | -|:-------------------------------------|:-------------------------------------------------------------------------------------------------------| :------------------------------------- 
|:-------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| sink | opc-ua-sink | String: opc-ua-sink | Yes | - | -| sink.opcua.model | OPC UA model used | String: client-server / pub-sub | No | pub-sub | -| sink.opcua.tcp.port | OPC UA's TCP port | Integer: [0, 65536] | No | 12686 | -| sink.opcua.https.port | OPC UA's HTTPS port | Integer: [0, 65536] | No | 8443 | -| sink.opcua.security.dir | Directory for OPC UA's keys and certificates | String: Path, supports absolute and relative directories | No | Opc_security folder``in the conf directory of the DataNode related to iotdb
If there is no conf directory for iotdb (such as launching DataNode in IDEA), it will be the iotdb_opc_Security folder``in the user's home directory | -| sink.opcua.enable-anonymous-access | Whether OPC UA allows anonymous access | Boolean | No | true | -| sink.user | User for OPC UA, specified in the configuration | String | No | root | -| sink.password | Password for OPC UA, specified in the configuration | String | No | TimechoDB@2021 (Before V2.0.6.x it is root) | -| sink.opcua.placeholder | A placeholder string used to substitute for null mapping paths when the value of the ID column is null | String | No | "null" | - - -#### tsfile-local-sink -| Parameter | Description | Value Range | Required | Default | -|-----------------------------------|-----------------------------------------------------------------------------|------------------------|----------|---------| -| sink | Component name | String: tsfile-local-sink | Yes | - | -| sink.local.target-path | Local target directory | String | Yes | - | -| sink.rate-limit-bytes-per-second | Rate limit threshold (unit: bytes/second). Takes effect when enabled. No limit if rate-limit <= 0 | Long | No | 0 | - -#### tsfile-remote-sink -| Parameter | Description | Value Range | Required | Default | -|------------------------------------|----------------------------------------------------------------------------|-------------------------|----------|---------| -| sink | Component name | String: tsfile-remote-sink | Yes | - | -| sink.scp.host | Remote host IP | String | Yes | - | -| sink.scp.port | Remote SSH port | Long | No | 22 | -| sink.scp.user | Remote SSH user | String | Yes | - | -| sink.scp.password | Remote SSH password | String | Yes | - | -| sink.scp.remote-path | Remote target directory | String | Yes | - | -| sink.rate-limit-bytes-per-second | Unit: bytes/second. Takes effect when enabled. 
No limit if rate-limit <= 0 | Long | No | 0 | -| sink.scp.object-parallelism | Maximum parallelism for object file transmission | Long | No |` min(cpu/4,16)` | -| sink.scp.object-batch-size-bytes | Maximum size of Object files sent per asynchronous thread, unit: MB | Long | No | 200 | - diff --git a/src/UserGuide/latest-Table/User-Manual/Maintenance-commands_timecho.md b/src/UserGuide/latest-Table/User-Manual/Maintenance-commands_timecho.md deleted file mode 100644 index c58fccfd5..000000000 --- a/src/UserGuide/latest-Table/User-Manual/Maintenance-commands_timecho.md +++ /dev/null @@ -1,932 +0,0 @@ - -# Maintenance Statement - -## 1. Status Checking - -### 1.1 Viewing the Connected Model - -**Description**: Returns the current SQL dialect model (`Tree` or `Table`). - -**Syntax**: - -```SQL -showCurrentSqlDialectStatement - : SHOW CURRENT_SQL_DIALECT - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CURRENT_SQL_DIALECT; -``` - -**Result:** - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 1.2 Viewing the Logged-in Username - -**Description**: Returns the currently logged-in username. - -**Syntax**: - -```SQL -showCurrentUserStatement - : SHOW CURRENT_USER - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CURRENT_USER; -``` - -**Result**: - -```Plain -+-----------+ -|CurrentUser| -+-----------+ -| root| -+-----------+ -``` - -### 1.3 Viewing the Connected Database Name - -**Description**: Returns the name of the currently connected database. If no `USE` statement has been executed, it returns `null`. 
- -**Syntax**: - -```SQL -showCurrentDatabaseStatement - : SHOW CURRENT_DATABASE - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CURRENT_DATABASE; -IoTDB> USE test; -IoTDB> SHOW CURRENT_DATABASE; -``` - -**Result**: - -```Plain -+---------------+ -|CurrentDatabase| -+---------------+ -| null| -+---------------+ -+---------------+ -|CurrentDatabase| -+---------------+ -| test| -+---------------+ -``` - -### 1.4 Viewing the Cluster Version - -**Description**: Returns the current cluster version. - -**Syntax**: - -```SQL -showVersionStatement - : SHOW VERSION - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW VERSION; -``` - -**Result**: - -```Plain -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.1.2| 1ca4008| -+-------+---------+ -``` - -### 1.5 Viewing Cluster Key Parameters - -**Description**: Returns key parameters of the current cluster. - -**Syntax**: - -```SQL -showVariablesStatement - : SHOW VARIABLES - ; -``` - -Key Parameters: - -1. **ClusterName**: The name of the current cluster. -2. **DataReplicationFactor**: Number of data replicas per DataRegion. -3. **SchemaReplicationFactor**: Number of schema replicas per SchemaRegion. -4. **DataRegionConsensusProtocolClass**: Consensus protocol class for DataRegions. -5. **SchemaRegionConsensusProtocolClass**: Consensus protocol class for SchemaRegions. -6. **ConfigNodeConsensusProtocolClass**: Consensus protocol class for ConfigNodes. -7. **TimePartitionOrigin**: The starting timestamp of database time partitions. -8. **TimePartitionInterval**: The interval of database time partitions (in milliseconds). -9. **ReadConsistencyLevel**: The consistency level for read operations. -10. **SchemaRegionPerDataNode**: Number of SchemaRegions per DataNode. -11. **DataRegionPerDataNode**: Number of DataRegions per DataNode. -12. **SeriesSlotNum**: Number of SeriesSlots per DataRegion. -13. **SeriesSlotExecutorClass**: Implementation class for SeriesSlots. -14. 
**DiskSpaceWarningThreshold**: Disk space warning threshold (in percentage). -15. **TimestampPrecision**: Timestamp precision. - -**Example**: - -```SQL -IoTDB> SHOW VARIABLES; -``` - -**Result**: - -```Plain -+----------------------------------+-----------------------------------------------------------------+ -| Variable| Value| -+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - -### 1.6 Viewing the Cluster ID - -**Description**: Returns the ID of the current cluster. - -**Syntax**: - -```SQL -showClusterIdStatement - : SHOW (CLUSTERID | CLUSTER_ID) - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CLUSTER_ID; -``` - -**Result**: - -```Plain -+------------------------------------+ -| ClusterId| -+------------------------------------+ -|40163007-9ec1-4455-aa36-8055d740fcda| -+------------------------------------+ -``` - -### 1.7 Viewing the Timestamp of the Connected DataNode - -**Description**: Returns the current timestamp of the DataNode process directly connected to the client. 
- -**Syntax**: - -```SQL -showCurrentTimestampStatement - : SHOW CURRENT_TIMESTAMP - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CURRENT_TIMESTAMP; -``` - -**Result**: - -```Plain -+-----------------------------+ -| CurrentTimestamp| -+-----------------------------+ -|2025-02-17T11:11:52.987+08:00| -+-----------------------------+ -``` - -### 1.8 Viewing Executing Queries - -**Description**: Displays information about all currently executing queries. - -> For more details on how to use system tables, please refer to [System Tables](../Reference/System-Tables_timecho.md) - -**Syntax**: - -```SQL -showQueriesStatement - : SHOW (QUERIES | QUERY PROCESSLIST) - (WHERE where=booleanExpression)? - (ORDER BY sortItem (',' sortItem)*)? - limitOffsetClause - ; -``` - -**Parameters**: - -1. **WHERE Clause**: Filters the result set based on specified conditions. -2. **ORDER BY Clause**: Sorts the result set based on specified columns. -3. **limitOffsetClause**: Limits the number of rows returned. - 1. Format: `LIMIT <offset>, <row_count>`. - -**Columns in QUERIES Table**: - -- **query_id**: Unique ID of the query. -- **start_time**: Timestamp when the query started. -- **datanode_id**: ID of the DataNode executing the query. -- **elapsed_time**: Time elapsed since the query started (in seconds). -- **statement**: The SQL statement being executed. -- **user**: The user who initiated the query. 
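
The optional clauses in the syntax above can be combined in one statement. The following is a minimal sketch, not a confirmed reference form: it assumes that `sortItem` accepts the usual `DESC` modifier and that a bare `LIMIT` row count is accepted, neither of which is stated in this section. It lists the five longest-running queries that have been executing for more than 30 seconds:

```SQL
-- Assumption: DESC in sortItem and a bare LIMIT row count are accepted here.
IoTDB> SHOW QUERIES WHERE elapsed_time > 30 ORDER BY elapsed_time DESC LIMIT 5;
```

The filter and the sort key use column names from the list above, such as `elapsed_time` and `datanode_id`.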
- -**Example**: - -```SQL -IoTDB> SHOW QUERIES WHERE elapsed_time > 30; -``` - -**Result**: - -```Plain -+-----------------------+-----------------------------+-----------+------------+------------+----+ -| query_id| start_time|datanode_id|elapsed_time| statement|user| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -|20250108_101015_00000_1|2025-01-08T18:10:15.935+08:00| 1| 32.283|show queries|root| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -``` - - -### 1.9 Viewing Region Information - -**Description**: Displays regions' information of the current cluster. - -**Syntax**: - -```SQL -showRegionsStatement - : SHOW REGIONS - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW REGIONS -``` - -**Result**: - -```SQL -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -|RegionId| Type| Status| Database|SeriesSlotNum|TimeSlotNum|DataNodeId|RpcAddress|RpcPort|InternalAddress| Role| CreateTime|TsFileSize| -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -| 6|SchemaRegion|Running|tcollector| 670| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.194| | -| 7| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.196| 169.85 KB| -| 8| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.198| 161.63 KB| -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -``` - -### 1.10 Viewing Available Nodes - -**Description**: Returns the RPC addresses and ports of all available DataNodes in the current cluster. 
Note: A DataNode is considered "available" if it is not in the REMOVING state. - -> This feature is supported starting from v2.0.8. - -**Syntax**: - -```SQL -showAvailableUrlsStatement - : SHOW AVAILABLE URLS - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW AVAILABLE URLS -``` - -**Result**: - -```SQL -+----------+-------+ -|RpcAddress|RpcPort| -+----------+-------+ -| 0.0.0.0| 6667| -+----------+-------+ -``` - -### 1.11 View Service Information - -**Description**: Returns service information (MQTT service, REST service) on all active DataNodes (in RUNNING or READ-ONLY state) in the current cluster. - -> Supported since V2.0.8.2 - -#### Syntax: -```sql -showServicesStatement - : SHOW SERVICES - ; -``` - -#### Examples: -```sql -IoTDB> SHOW SERVICES -IoTDB> SHOW SERVICES ON 1 -``` - -Execution result: -```sql -+--------------+-------------+---------+ -| Service Name | DataNode ID | State | -+--------------+-------------+---------+ -| MQTT | 1 | STOPPED | -| REST | 1 | RUNNING | -+--------------+-------------+---------+ -``` - -### 1.12 View Cluster Activation Status - -**Description**:Returns the activation status of the current cluster. - -#### Syntax: - -```SQL -showActivationStatement - : SHOW ACTIVATION - ; -``` - -#### Examples: - -```SQL -IoTDB> SHOW ACTIVATION -``` - -Execution result: - -```SQL -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -### 1.13 View Node Configuration -**Description**: By default, returns the effective configuration items from the configuration file of the specified node (identified by `node_id`). 
If `node_id` is not specified, returns the configuration of the directly connected DataNode. -Adding the `all` parameter returns all configuration items (the `value` of unconfigured items is `null`). -Adding the `with desc` parameter returns configuration items with descriptions. - -> Supported since version 2.0.9.1 - -#### Syntax: -```SQL -showConfigurationStatement - : SHOW (ALL)? CONFIGURATION (ON nodeId=INTEGER_VALUE)? (WITH DESC)? - ; -``` - -#### Result Set Description -| Column Name | Column Type | Description | -|---------------|-------------|---------------------------------| -| name | string | Configuration name | -| value | string | Configuration value | -| default_value | string | Default value of the configuration | -| description | string | Configuration description (optional) | - -#### Examples: -1. View configuration of the directly connected DataNode -```SQL -SHOW CONFIGURATION; -``` - -```Bash -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| name| value| default_value| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| cn_internal_address| 127.0.0.1| 127.0.0.1| -| cn_internal_port| 10710| 10710| -| cn_consensus_port| 10720| 10720| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| -| dn_rpc_port| 6667| 6667| -| dn_internal_address| 127.0.0.1| 127.0.0.1| -| dn_internal_port| 10730| 10730| -| dn_mpp_data_exchange_port| 10740| 10740| -| dn_schema_region_consensus_port| 10750| 10750| -| dn_data_region_consensus_port| 10760| 10760| -| schema_replication_factor| 1| 1| -|schema_region_consensus_protocol_class| 
org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| -| data_replication_factor| 1| 1| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| -| cn_metric_prometheus_reporter_port| 9091| 9091| -| dn_metric_prometheus_reporter_port| 9092| 9092| -| series_slot_num| 1000| 1000| -| series_partition_executor_class|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| time_partition_origin| 0| 0| -| time_partition_interval| 604800000| 604800000| -| disk_space_warning_threshold| 0.05| 0.05| -| schema_engine_mode| Memory| Memory| -| tag_attribute_total_size| 700| 700| -| read_consistency_level| strong| strong| -| timestamp_precision| ms| ms| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -Total line number = 28 -It costs 0.013s -``` - -2. 
View configuration of the node with a specific node ID -```SQL -SHOW CONFIGURATION ON 1; -``` - -```Bash -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| name| value| default_value| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| cn_internal_address| 127.0.0.1| 127.0.0.1| -| cn_internal_port| 10710| 10710| -| cn_consensus_port| 10720| 10720| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| -| dn_rpc_port| 6667| 6667| -| dn_internal_address| 127.0.0.1| 127.0.0.1| -| dn_internal_port| 10730| 10730| -| dn_mpp_data_exchange_port| 10740| 10740| -| dn_schema_region_consensus_port| 10750| 10750| -| dn_data_region_consensus_port| 10760| 10760| -| schema_replication_factor| 1| 1| -|schema_region_consensus_protocol_class| org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| -| data_replication_factor| 1| 1| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| -| cn_metric_prometheus_reporter_port| 9091| 9091| -| dn_metric_prometheus_reporter_port| 9092| 9092| -| series_slot_num| 1000| 1000| -| series_partition_executor_class|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| time_partition_origin| 0| 0| -| time_partition_interval| 604800000| 604800000| -| disk_space_warning_threshold| 0.05| 0.05| -| schema_engine_mode| Memory| Memory| -| tag_attribute_total_size| 700| 700| -| read_consistency_level| strong| strong| -| timestamp_precision| ms| ms| 
-+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -Total line number = 28 -It costs 0.004s -``` - -3. View all configurations -```SQL -SHOW ALL CONFIGURATION; -``` - -```Bash -+---------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| name| value| default_value| -+---------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| cn_internal_address| 127.0.0.1| 127.0.0.1| -| cn_internal_port| 10710| 10710| -| cn_consensus_port| 10720| 10720| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| -| dn_rpc_port| 6667| 6667| -| dn_internal_address| 127.0.0.1| 127.0.0.1| -| dn_internal_port| 10730| 10730| -| dn_mpp_data_exchange_port| 10740| 10740| -| dn_schema_region_consensus_port| 10750| 10750| -| dn_data_region_consensus_port| 10760| 10760| -| dn_join_cluster_retry_interval_ms| null| 5000| -| config_node_consensus_protocol_class| null| org.apache.iotdb.consensus.ratis.RatisConsensus| -| schema_replication_factor| 1| 1| -| schema_region_consensus_protocol_class| org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| -| data_replication_factor| 1| 1| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| -| cn_system_dir| null| data/confignode/system| -| cn_consensus_dir| null| data/confignode/consensus| -| cn_pipe_receiver_file_dir| null| data/confignode/system/pipe/receiver| -... 
-+---------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -Total line number = 412 -It costs 0.006s -``` - -4. View configuration items with descriptions -```SQL -SHOW CONFIGURATION ON 1 WITH DESC; -``` - -```Bash -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| name| value| default_value| description| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| Used to indicate the cluster name and distinguish different clusters. 
To modify the cluster name, use the SQL statement 'set configuration "cluster_name=xxx"'. Manually modifying the configuration file is not recommended because it may cause node restart failure. effectiveMode: hot_reload. Datatype: string| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710|For the first ConfigNode to start, cn_seed_config_node points to its own cn_internal_address:cn_internal_port. For other ConfigNodes that to join the cluster, cn_seed_config_node points to any running ConfigNode's cn_internal_address:cn_internal_port. Note: After this ConfigNode successfully joins the cluster for the first time, this parameter is no longer used. Each node automatically maintains the list of ConfigNodes and traverses connections when restarting. Format: address:port e.g. 127.0.0.1:10710.effectiveMode: first_start.Datatype: String| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| dn_seed_config_node points to any running ConfigNode's cn_internal_address:cn_internal_port. Note: After this DataNode successfully joins the cluster for the first time, this parameter is no longer used. Each node automatically maintains the list of ConfigNodes and traverses connections when restarting. Format: address:port e.g. 127.0.0.1:10710.effectiveMode: first_start.Datatype: String| -| cn_internal_address| 127.0.0.1| 127.0.0.1| Used for RPC communication inside cluster. 
Could set 127.0.0.1(for local test) or ipv4 address.effectiveMode: first_start.Datatype: String| -| cn_internal_port| 10710| 10710| Used for RPC communication inside cluster.effectiveMode: first_start.Datatype: int| -| cn_consensus_port| 10720| 10720| Used for consensus communication among ConfigNodes inside cluster.effectiveMode: first_start.Datatype: int| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| Used for connection of IoTDB native clients(Session) Could set 127.0.0.1(for local test) or ipv4 address.effectiveMode: restart.Datatype: String| -| dn_rpc_port| 6667| 6667| Used for connection of IoTDB native clients(Session) Bind with dn_rpc_address.effectiveMode: restart.Datatype: int| -| dn_internal_address| 127.0.0.1| 127.0.0.1| Used for communication inside cluster. could set 127.0.0.1(for local test) or ipv4 address.effectiveMode: first_start.Datatype: String| -| dn_internal_port| 10730| 10730| Used for communication inside cluster. Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| dn_mpp_data_exchange_port| 10740| 10740| Port for data exchange among DataNodes inside cluster Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| dn_schema_region_consensus_port| 10750| 10750| port for consensus's communication for schema region inside cluster. Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| dn_data_region_consensus_port| 10760| 10760| port for consensus's communication for data region inside cluster. Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| schema_replication_factor| 1| 1| Default number of schema replicas.effectiveMode: first_start.Datatype: int| -|schema_region_consensus_protocol_class| org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| SchemaRegion consensus protocol type. This parameter is unmodifiable after ConfigNode starts for the first time. These consensus protocols are currently supported: 1. 
org.apache.iotdb.consensus.ratis.RatisConsensus 2. org.apache.iotdb.consensus.simple.SimpleConsensus (The schema_replication_factor can only be set to 1).effectiveMode: first_start.Datatype: string| -| data_replication_factor| 1| 1| Default number of data replicas.effectiveMode: first_start.Datatype: int| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| DataRegion consensus protocol type. This parameter is unmodifiable after ConfigNode starts for the first time. These consensus protocols are currently supported: 1. org.apache.iotdb.consensus.simple.SimpleConsensus (The data_replication_factor can only be set to 1) 2. org.apache.iotdb.consensus.iot.IoTConsensus 3. org.apache.iotdb.consensus.ratis.RatisConsensus 4. org.apache.iotdb.consensus.iot.IoTConsensusV2.effectiveMode: first_start.Datatype: string| -| cn_metric_prometheus_reporter_port| 9091| 9091| The port of prometheus reporter of metric module.effectiveMode: restart.Datatype: int| -| dn_metric_prometheus_reporter_port| 9092| 9092| The port of prometheus reporter of metric module.effectiveMode: restart.Datatype: int| -| series_slot_num| 1000| 1000| All parameters in Partition configuration is unmodifiable after ConfigNode starts for the first time. And these parameters should be consistent within the ConfigNodeGroup. Number of SeriesPartitionSlots per Database.effectiveMode: first_start.Datatype: Integer| -| series_partition_executor_class|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| SeriesPartitionSlot executor class These hashing algorithms are currently supported: 1. BKDRHashExecutor(Default) 2. APHashExecutor 3. JSHashExecutor 4. 
SDBMHashExecutor Also, if you want to implement your own SeriesPartition executor, you can inherit the SeriesPartitionExecutor class and modify this parameter to correspond to your Java class.effectiveMode: first_start.Datatype: String| -| time_partition_origin| 0| 0| Time partition origin in milliseconds, default is equal to zero. This origin is set by default to the beginning of Unix time, which is January 1, 1970, at 00:00 UTC (Coordinated Universal Time). This point is known as the Unix epoch, and its timestamp is 0. If you want to specify a different time partition origin, you can set this value to a specific Unix timestamp in milliseconds.effectiveMode: first_start.Datatype: long| -| time_partition_interval| 604800000| 604800000| Time partition interval in milliseconds, and partitioning data inside each data region, default is equal to one week.effectiveMode: first_start.Datatype: long| -| disk_space_warning_threshold| 0.05| 0.05| Disk remaining threshold at which DataNode is set to ReadOnly status.effectiveMode: restart.Datatype: double(percentage)| -| schema_engine_mode| Memory| Memory| The schema management mode of schema engine. Currently, support Memory and PBTree. This config of all DataNodes in one cluster must keep same.effectiveMode: first_start.Datatype: string| -| tag_attribute_total_size| 700| 700| max size for a storage block for tags and attributes of one time series. If the combined size of tags and attributes exceeds the tag_attribute_total_size, a new storage block will be allocated to continue storing the excess data. the unit is byte.effectiveMode: first_start.Datatype: int| -| read_consistency_level| strong| strong| The read consistency level These consistency levels are currently supported: 1. strong(Default, read from the leader replica) 2. weak(Read from a random replica).effectiveMode: restart.Datatype: string| -| timestamp_precision| ms| ms| Use this value to set timestamp precision as "ms", "us" or "ns". 
Once the precision has been set, it can not be changed.effectiveMode: first_start.Datatype: String| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 28 -It costs 0.010s -``` - -## 2. Status Setting - -### 2.1 Setting the Connected Model - -**Description**: Sets the current SQL dialect model to `Tree` or `Table` which can be used in both tree and table models. - -**Syntax**: - -```SQL -SET SQL_DIALECT = (TABLE | TREE); -``` - -**Example**: - -```SQL -IoTDB> SET SQL_DIALECT=TABLE; -IoTDB> SHOW CURRENT_SQL_DIALECT; -``` - -**Result**: - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 2.2 Updating Configuration Items - -**Description**: Updates configuration items. Changes take effect immediately without restarting if the items support hot modification. - -**Syntax**: - -```SQL -setConfigurationStatement - : SET CONFIGURATION propertyAssignments (ON INTEGER_VALUE)? - ; - -propertyAssignments - : property (',' property)* - ; - -property - : identifier EQ propertyValue - ; - -propertyValue - : DEFAULT - | expression - ; -``` - -**Parameters**: - -1. **propertyAssignments**: A list of properties to update. - 1. Format: `property (',' property)*`. - 2. Values: - - `DEFAULT`: Resets the configuration to its default value. 
- - `expression`: A specific value (must be a string). -2. **ON INTEGER_VALUE** **(Optional):** Specifies the node ID to update. - 1. If not specified or set to a negative value, updates all ConfigNodes and DataNodes. - -**Example**: - -```SQL -IoTDB> SET CONFIGURATION disk_space_warning_threshold='0.05',heartbeat_interval_in_ms='1000' ON 1; -``` - -### 2.3 Loading Manually Modified Configuration Files - -**Description**: Loads manually modified configuration files and hot-loads the changes. Configuration items that support hot modification take effect immediately. - -**Syntax**: - -```SQL -loadConfigurationStatement - : LOAD CONFIGURATION localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** **(Optional):** - 1. Specifies the scope of configuration loading. - 2. Default: `CLUSTER`. - 3. Values: - - `LOCAL`: Loads configuration only on the DataNode directly connected to the client. - - `CLUSTER`: Loads configuration on all DataNodes in the cluster. - -**Example**: - -```SQL -IoTDB> LOAD CONFIGURATION ON LOCAL; -``` - -### 2.4 Setting the System Status - -**Description**: Sets the system status to either `READONLY` or `RUNNING`. - -**Syntax**: - -```SQL -setSystemStatusStatement - : SET SYSTEM TO (READONLY | RUNNING) localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **RUNNING |** **READONLY**: - 1. **RUNNING**: Sets the system to running mode, allowing both read and write operations. - 2. **READONLY**: Sets the system to read-only mode, allowing only read operations and prohibiting writes. -2. **localOrClusterMode** **(Optional):** - 1. **LOCAL**: Applies the status change only to the DataNode directly connected to the client. - 2. **CLUSTER**: Applies the status change to all DataNodes in the cluster. - 3. **Default**: `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> SET SYSTEM TO READONLY ON CLUSTER; -``` - -## 3. 
Data Management - -### 3.1 Flushing Data from Memory to Disk - -**Description**: Flushes data from the memory table to disk. - -**Syntax**: - -```SQL -flushStatement - : FLUSH identifier? (',' identifier)* booleanValue? localOrClusterMode? - ; - -booleanValue - : TRUE | FALSE - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **identifier** **(Optional):** - 1. Specifies the name of the database to flush. - 2. If not specified, all databases are flushed. - 3. **Multiple Databases**: Multiple database names can be specified, separated by commas (e.g., `FLUSH test_db1, test_db2`). -2. **booleanValue** **(Optional):** - 1. Specifies the type of data to flush. - 2. **TRUE**: Flushes only the sequential memory table. - 3. **FALSE**: Flushes only the unsequential memory table. - 4. **Default**: Flushes both sequential and unsequential memory tables. -3. **localOrClusterMode** **(Optional):** - 1. **ON LOCAL**: Flushes only the memory tables on the DataNode directly connected to the client. - 2. **ON CLUSTER**: Flushes memory tables on all DataNodes in the cluster. - 3. **Default:** `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> FLUSH test_db TRUE ON LOCAL; -``` - - -## 4. Data Repair - -### 4.1 Starting Background Scan and Repair of TsFiles - -**Description**: Starts a background task to scan and repair TsFiles, fixing issues such as timestamp disorder within data files. - -**Syntax**: - -```SQL -startRepairDataStatement - : START REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** **(Optional):** - 1. **ON LOCAL**: Executes the repair task only on the DataNode directly connected to the client. - 2. **ON CLUSTER**: Executes the repair task on all DataNodes in the cluster. - 3. **Default:** `ON CLUSTER`. 
- -**Example**: - -```SQL -IoTDB> START REPAIR DATA ON CLUSTER; -``` - -### 4.2 Pausing Background TsFile Repair Task - -**Description**: Pauses the background repair task. The paused task can be resumed by executing the `START REPAIR DATA` command again. - -**Syntax**: - -```SQL -stopRepairDataStatement - : STOP REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** **(Optional):** - 1. **ON LOCAL**: Executes the pause command only on the DataNode directly connected to the client. - 2. **ON CLUSTER**: Executes the pause command on all DataNodes in the cluster. - 3. **Default:** `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> STOP REPAIR DATA ON CLUSTER; -``` - -## 5. Query Termination - -### 5.1 Terminating Queries - -**Description**: Terminates one or more running queries. - -**Syntax**: - -```SQL -killQueryStatement - : KILL (QUERY queryId=string | ALL QUERIES) - ; -``` - -**Parameters**: - -1. **QUERY** **queryId:** Specifies the ID of the query to terminate. - -- To obtain the `queryId`, use the `SHOW QUERIES` command. - -2. **ALL QUERIES:** Terminates all currently running queries. - -**Example**: - -Terminate a specific query: - -```SQL -IoTDB> KILL QUERY 20250108_101015_00000_1; -``` - -Terminate all queries: - -```SQL -IoTDB> KILL ALL QUERIES; -``` - -## 6. Query Debugging -### 6.1 DEBUG SQL - -**Definition**: Add the `DEBUG` keyword at the beginning of an SQL query statement. During execution, debug logs will be output, including the underlying file scan information involved in the query. - -> Supported since V2.0.9.1 - -#### Syntax: -```sql -debugSQLStatement - : DEBUG ? query - ; -``` - -**Description**: -* Log output path: `logs/log_datanode_query_debug.log` - -#### Example: -1. Execute the following SQL for DEBUG query -```sql -DEBUG SELECT * FROM table3; -``` - -2. 
Check the log content in `log_datanode_query_debug.log` to view the file scan information involved in the query. - -```bash -2026-03-24 10:10:41,515 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.t.TsFileResource:1098 - Path: table3.d1 file /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2864/1769139940009-1-0-0.tsfile is not satisfied because of no device! -2026-03-24 10:10:41,515 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.t.TsFileResource:1098 - Path: table3.d1 file /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2865/1769139940010-1-0-0.tsfile is not satisfied because of no device! -2026-03-24 10:10:41,516 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:159 - Cache miss: table3.d1. in file: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile -2026-03-24 10:10:41,516 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:160 - Device: table3.d1, all sensors: [, temperature] -2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.BloomFilterCache:110 - get bloomFilter from cache where filePath is: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile -2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:227 - Get timeseries: table3.d1. 
metadata in file: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile from cache: TimeseriesMetadata{timeSeriesMetadataType=-128, chunkMetaDataListDataSize=8, measurementId='', dataType=VECTOR, statistics=startTime: 1747065600001 endTime: 1747065601002 count: 2, modified=false, isSeq=true, chunkMetadataList=[measurementId: , datatype: VECTOR, version: 0, Statistics: startTime: 1747065600001 endTime: 1747065601002 count: 2, deleteIntervalList: null]}. -2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:227 - Get timeseries: table3.d1.temperature metadata in file: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile from cache: TimeseriesMetadata{timeSeriesMetadataType=64, chunkMetaDataListDataSize=8, measurementId='temperature', dataType=FLOAT, statistics=startTime: 1747065600001 endTime: 1747065601002 count: 2 [minValue:85.0,maxValue:90.0,firstValue:90.0,lastValue:85.0,sumValue:175.0], modified=false, isSeq=true, chunkMetadataList=[measurementId: temperature, datatype: FLOAT, version: 0, Statistics: startTime: 1747065600001 endTime: 1747065601002 count: 2 [minValue:85.0,maxValue:90.0,firstValue:90.0,lastValue:85.0,sumValue:175.0], deleteIntervalList: null]}. 
-2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:110 - Modifications size is 1 for file Path: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:114 - [] -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:125 - After modification Chunk meta data list is: -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:126 - org.apache.tsfile.file.metadata.TableDeviceChunkMetadata@2e11291f -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.ChunkCache:167 - get chunk from cache whose key is: ChunkCacheKey{filePath='/home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile', regionId=4, timePartitionId=2888, tsFileVersion=1, compactionVersion=0, offsetOfChunkHeader=19} -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.ChunkCache:167 - get chunk from cache whose key is: ChunkCacheKey{filePath='/home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile', regionId=4, timePartitionId=2888, tsFileVersion=1, compactionVersion=0, offsetOfChunkHeader=46} -2026-03-24 10:10:41,519 [pool-69-IoTDB-ClientRPC-Processor-1$20260324_021041_00068_1] INFO o.a.i.d.q.p.Coordinator:902 - debug select * from table3 -``` diff --git a/src/UserGuide/latest-Table/User-Manual/Pattern-Query_timecho.md b/src/UserGuide/latest-Table/User-Manual/Pattern-Query_timecho.md deleted file mode 100644 index f97a66db5..000000000 --- a/src/UserGuide/latest-Table/User-Manual/Pattern-Query_timecho.md +++ /dev/null @@ -1,1137 +0,0 @@ - - -# 
Pattern Query - -For time-series feature analysis, IoTDB provides pattern query, which delivers a flexible and efficient solution for in-depth mining and complex computation over time-series data. The following sections describe the feature in detail. - -## 1. Overview - -Pattern query captures segments of continuous data by defining the recognition logic of pattern variables and regular expressions, and then performs analysis and calculation on each captured segment. It is suitable for business scenarios such as identifying specific patterns in time-series data (as shown in the figure below) and detecting specific events. - -![](/img/timeseries-featured-analysis-1.png) - -> Note: This feature is available starting from version V2.0.5. - -## 2. Function Introduction -### 2.1 Syntax Format - -```SQL -MATCH_RECOGNIZE ( - [ PARTITION BY column [, ...] ] - [ ORDER BY column [, ...] ] - [ MEASURES measure_definition [, ...] ] - [ ROWS PER MATCH ] - [ AFTER MATCH skip_to ] - PATTERN ( row_pattern ) - [ SUBSET subset_definition [, ...] ] - DEFINE variable_definition [, ...] -) -``` - -**Note:** - -* PARTITION BY: Optional. Used to group the input table; pattern matching is performed independently within each group. If this clause is not specified, the entire input table is processed as a single unit. -* ORDER BY: Optional. Used to ensure that input data is processed in a specific order during matching. -* MEASURES: Optional. Used to specify which information to extract from the matched segment of data. -* ROWS PER MATCH: Optional. Used to specify the output method of the result set after successful pattern matching. -* AFTER MATCH SKIP: Optional. Used to specify which row to resume from for the next pattern match after identifying a non-empty match. -* PATTERN: Used to define the row pattern to be matched. -* SUBSET: Optional. Used to merge rows matched by multiple basic pattern variables into a single logical set. 
-* DEFINE: Used to define the basic pattern variables for the row pattern. - -**Original Data for Syntax Examples:** - -```SQL -IoTDB:database3> select * from t -+-----------------------------+------+----------+ -| time|device|totalprice| -+-----------------------------+------+----------+ -|2025-01-01T00:01:00.000+08:00| d1| 90| -|2025-01-01T00:02:00.000+08:00| d1| 80| -|2025-01-01T00:03:00.000+08:00| d1| 70| -|2025-01-01T00:04:00.000+08:00| d1| 80| -|2025-01-01T00:05:00.000+08:00| d1| 70| -|2025-01-01T00:06:00.000+08:00| d1| 80| -+-----------------------------+------+----------+ - --- Creation Statement -create table t(device tag, totalprice int32 field) - -insert into t(time,device,totalprice) values(2025-01-01T00:01:00, 'd1', 90),(2025-01-01T00:02:00, 'd1', 80),(2025-01-01T00:03:00, 'd1', 70),(2025-01-01T00:04:00, 'd1', 80),(2025-01-01T00:05:00, 'd1', 70),(2025-01-01T00:06:00, 'd1', 80) -``` - -### 2.2 DEFINE Clause - -Used to specify the judgment condition for each basic pattern variable in pattern recognition. These variables are usually represented by identifiers (e.g., `A`, `B`), and the Boolean expressions in this clause precisely define which rows meet the requirements of the variable. - -* During pattern matching execution, a row is only marked as the variable (and thus included in the current matching group) if the Boolean expression returns TRUE. - -```SQL --- A row can only be identified as B if its totalprice value is less than the totalprice value of the previous row. -DEFINE B AS totalprice < PREV(totalprice) -``` - -* Variables not **explicitly** defined in this clause have an implicitly set condition of always true (TRUE), meaning they can be successfully matched on any input row. - -### 2.3 SUBSET Clause - -Used to merge rows matched by multiple basic pattern variables (e.g., `A`, `B`) into a combined pattern variable (e.g., `U`), allowing these rows to be treated as a single logical set for operations. 
It can be used in the `MEASURES`, `DEFINE`, and `AFTER MATCH SKIP` clauses. - -```SQL -SUBSET U = (A, B) -``` -For example, for the pattern `PATTERN ((A | B){5} C+)`, it is impossible to determine whether the 5th repetition matches the basic pattern variable A or B during matching. Therefore: - -1. In the `MEASURES` clause, if you need to reference the last row matched in this phase, you can do so by defining the combined pattern variable `SUBSET U = (A, B)`. At this point, the expression `RPR_LAST(U.totalprice)` will directly return the `totalprice` value of the target row. -2. In the `AFTER MATCH SKIP` clause, if the matching result does not include the basic pattern variable A or B, executing `AFTER MATCH SKIP TO LAST B` or `AFTER MATCH SKIP TO LAST A` will fail to jump due to missing anchors. However, by introducing the combined pattern variable `SUBSET U = (A, B)`, using `AFTER MATCH SKIP TO LAST U` is always valid. - -### 2.4 PATTERN Clause - -Used to define the row pattern to be matched, whose basic building block is a row pattern variable. - -```SQL -PATTERN ( row_pattern ) -``` - -#### 2.4.1 Pattern Types - -| Row Pattern | Syntax Format | Description | -|-----------------------|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Pattern Concatenation | `A B+ C+ D+` | Composed of subpatterns without any operators, matching all subpatterns in the declared order sequentially. | -| Pattern Alternation | `A \| B \| C` | Composed of multiple subpatterns separated by `\|`, matching only one of them. 
If multiple subpatterns can be matched, the leftmost one is selected. | -| Pattern Permutation | `PERMUTE(A, B, C)` | Equivalent to performing alternation matching on all different orders of the subpattern elements. It requires that A, B, and C must all be matched, but their order of appearance is not fixed. If multiple matching orders are possible, the priority is determined by the **lexicographical order** based on the definition sequence of elements in the PERMUTE list. For example, A B C has the highest priority, while C B A has the lowest. | -| Pattern Grouping | `(A B C)` | Encloses subpatterns in parentheses to treat them as a single unit, which can be used with other operators. For example, `(A B C)+` indicates a pattern where a group of `(A B C)` appears consecutively. | -| Empty Pattern | `()` | Represents an empty match that does not contain any rows. | -| Pattern Exclusion | `{- row_pattern -}` | Used to specify the matched part to be excluded from the output. Usually used with the `ALL ROWS PER MATCH` option to output rows of interest. For example, `PATTERN (A {- B+ C+ -} D+)` with ALL ROWS PER MATCH will only output the first row `(corresponding to A)` and the trailing rows `(corresponding to D+)` of the match. | - -#### 2.4.2 Partition Start/End Anchor - -* `^A` indicates matching a pattern that starts with A as the partition beginning - * When the value of the PATTERN clause is `^A`, the match must start from the first row of the partition, and this row must satisfy the definition of `A`. - * When the value of the PATTERN clause is `^A^` or `A^`, the output result is empty. -* `A$` indicates matching a pattern that ends with A as the partition end - * When the value of the PATTERN clause is `A$`, the match must end at the end of the partition, and this row must satisfy the definition of `A`. - * When the value of the PATTERN clause is `$A` or `$A$`, the output result is empty. 
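As a concrete sketch of the start anchor, the query below places `^` before `A`. It assumes the sample table `t` from Section 2.1 and uses only clauses described in this document; it is an illustration, not a definitive reference query.

```SQL
-- Sketch only: assumes the sample table t(time, device, totalprice) from Section 2.1.
SELECT m.time, m.match, m.price, m.label
FROM t
MATCH_RECOGNIZE (
    ORDER BY time
    MEASURES
        MATCH_NUMBER() AS match,
        RUNNING RPR_LAST(totalprice) AS price,
        CLASSIFIER() AS label
    ALL ROWS PER MATCH
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (^A)      -- ^ requires the match to start at the first row of the partition
    DEFINE A AS true  -- A accepts every row, so only the anchor restricts the match
) AS m;
```

Replacing `(^A)` with `(A$)` anchors the match to the last row of the partition instead.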
-**Examples** - -* Query SQL - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - AFTER MATCH SKIP PAST LAST ROW - PATTERN %s -- PATTERN clause - DEFINE A AS true -) AS m; -``` - -* Results - * When the PATTERN clause is specified as PATTERN (^A) - - ![](/img/timeseries-featured-analysis-2.png) - - Actual Return - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * When the PATTERN clause is specified as PATTERN (^A^), the output result is empty. This is because it is impossible to match an A starting from the beginning of a partition and then return to the beginning of the partition again. - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. - ``` - - * When the PATTERN clause is specified as PATTERN (A\$) - - ![](/img/timeseries-featured-analysis-3.png) - - Actual Return - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:06:00.000+08:00| 1| 80| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * When the PATTERN clause is specified as PATTERN (\$A\$), the output result is empty. - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. - ``` - - -#### 2.4.3 Quantifiers - -Quantifiers specify how many times a subpattern repeats and are placed after the corresponding subpattern (e.g., `(A | B)*`). 
- -Common quantifiers are as follows: - -| Quantifier | Description | -| -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `*` | Zero or more repetitions | -| `+` | One or more repetitions | -| `?` | Zero or one repetition | -| `{n}` | Exactly n repetitions | -| `{m, n}` | Repetitions between m and n times (m and n are non-negative integers). \* If the left bound is omitted, the default starts from 0; \* If the right bound is omitted, there is no upper limit on the number of repetitions (e.g., {5,} is equivalent to "at least five times"); \* If both left and right bounds are omitted (i.e., {,}), it is equivalent to `*`. | - -* The matching preference can be changed by adding `?` after the quantifier. - * `{3,5}`: Prefers 5 times, least prefers 3 times; `{3,5}?`: Prefers 3 times, least prefers 5 times. - * `?`: Prefers 1 time; `??`: Prefers 0 times. - -### 2.5 AFTER MATCH SKIP Clause - -Used to specify which row to start the next pattern match from after identifying a non-empty match. - -| Jump Strategy | Description | Allows Overlapping Matches? | -| ------------------------------------------------------------- | -------------------------------------------------------------------------------- | ----------------------------- | -| `AFTER MATCH SKIP PAST LAST ROW` | Default behavior. Starts from the row after the last row of the current match. | No | -| `AFTER MATCH SKIP TO NEXT ROW` | Starts from the second row in the current match. | Yes | -| `AFTER MATCH SKIP TO [ FIRST \| LAST ] pattern_variable` | Jumps to start from the [ first row \| last row ] of a pattern variable. 
| Yes | - -* Among all possible configurations, only when `ALL ROWS PER MATCH WITH UNMATCHED ROWS` is used in combination with `AFTER MATCH SKIP PAST LAST ROW` can the system ensure that exactly one output record is generated for each input row. - -**Examples** - -* Query SQL - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - %s -- AFTER MATCH SKIP clause - PATTERN (A B+ C+ D?) - SUBSET U = (C, D) - DEFINE - B AS B.totalprice < PREV (B.totalprice), - C AS C.totalprice > PREV (C.totalprice), - D AS false -- never matches -) AS m; -``` - -* Results - * When AFTER MATCH SKIP PAST LAST ROW is specified - - ![](/img/timeseries-featured-analysis-4-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: According to the semantics of `AFTER MATCH SKIP PAST LAST ROW`, matching resumes from row 5, and no valid match can be found - * This pattern will never have overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 4 - ``` - - * When AFTER MATCH SKIP TO NEXT ROW is specified - - ![](/img/timeseries-featured-analysis-5-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: According to the semantics of `AFTER MATCH SKIP TO NEXT ROW`, matching resumes from row 2 and matches rows 2, 3, 4 - * Next attempt: Starts from row 3 and fails, so no match number is assigned - * Third match: Starts from row 4 and matches rows 4, 5, 6 - * This pattern allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - 
+-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:02:00.000+08:00| 2| 80| A| - |2025-01-01T00:03:00.000+08:00| 2| 70| B| - |2025-01-01T00:04:00.000+08:00| 2| 80| C| - |2025-01-01T00:04:00.000+08:00| 3| 80| A| - |2025-01-01T00:05:00.000+08:00| 3| 70| B| - |2025-01-01T00:06:00.000+08:00| 3| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 10 - ``` - - * When AFTER MATCH SKIP TO FIRST C - - ![](/img/timeseries-featured-analysis-6-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: Starts from the first C (i.e., row 4), matches rows 4, 5, 6 - * This pattern allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * When AFTER MATCH SKIP TO LAST B or AFTER MATCH SKIP TO B - - ![](/img/timeseries-featured-analysis-7-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: Attempts to start from the last B (i.e., row 3), fails - * Third match: Attempts to start from row 4, successfully matches rows 4, 5, 6 - * This pattern allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - 
|2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * When AFTER MATCH SKIP TO U - - ![](/img/timeseries-featured-analysis-8-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: `SKIP TO U` means jumping to the last C or D; D can never match successfully, so it jumps to the last C (i.e., row 4), successfully matching rows 4, 5, 6 - * This pattern allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * When AFTER MATCH SKIP TO A, you cannot jump to the first row of the match, otherwise it will cause an infinite loop - - ```SQL - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: cannot skip to first row of match - ``` - - * When AFTER MATCH SKIP TO B, you cannot jump to a pattern variable that does not exist in the match group - - ```SQL - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: pattern variable is not present in match - ``` - - -### 2.6 ROWS PER MATCH Clause - -Used to specify the output method of the result set after a successful pattern match, including the following two main options: - -| Output Method | Rule Description | Output Result | Handling Logic for **Empty Matches/Unmatched Rows** | -| -------------------- | 
-------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| ONE ROW PER MATCH | Generates one output row for each successful match. | \* Columns in the PARTITION BY clause\* Expressions defined in the MEASURES clause. | Outputs empty matches; skips unmatched rows. | -| ALL ROWS PER MATCH | Each row in a match generates an output record, unless the row is excluded via exclusion syntax. | \* Columns in the PARTITION BY clause\* Columns in the ORDER BY clause\* Expressions defined in the MEASURES clause\* Remaining columns in the input table | \* Default: Outputs empty matches; skips unmatched rows.\* ALL ROWS PER MATCH​**SHOW EMPTY MATCHES**​: Outputs empty matches by default; skips unmatched rows.\* ALL ROWS PER MATCH​**OMIT EMPTY MATCHES**​: Does not output empty matches; skips unmatched rows.\* ALL ROWS PER MATCH​**WITH UNMATCHED ROWS**​: Outputs empty matches and generates an additional output record for each unmatched row. | - -### 2.7 MEASURES Clause - -Used to specify which information to extract from a matched set of data. This clause is optional; if not explicitly specified, some input columns will become the output results of pattern recognition based on the settings of the ROWS PER MATCH clause. - -SQL - -```SQL -MEASURES measure_expression AS measure_name [, ...] 
-``` - -* A `measure_expression` is a scalar value calculated from the matched set of data. - -| Usage Example | Description | -| ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `A.totalprice AS starting_price` | Returns the price from the first row in the matched group (i.e., the only row associated with variable A) as the starting price. | -| `RPR_LAST(B.totalprice) AS bottom_price` | Returns the price from the last row associated with variable B, representing the lowest price in the "V" shape pattern (corresponding to the end of the downward segment). | -| `RPR_LAST(U.totalprice) AS top_price` | Returns the highest price in the matched group, corresponding to the last row associated with variable C or D (i.e., the end of the entire matched group). [Assuming SUBSET U = (C, D)] | - -* Each `measure_expression` defines an output column, which can be referenced by its specified `measure_name`. - -### 2.8 Row Pattern Recognition Expressions - -Expressions used in the MEASURES and DEFINE clauses are **scalar expressions**, evaluated in the row-level context of the input table. In addition to supporting standard SQL syntax, **scalar expressions** also support special extended functions for row pattern recognition. - -#### 2.8.1 Pattern Variable References - -```SQL -A.totalprice -U.orderdate -orderstatus -``` - -* When a column name is prefixed with a **basic pattern variable** or a **combined pattern variable**, it refers to the corresponding column values of all rows matched by that variable. -* If a column name has no prefix, it is equivalent to using the "**global combined pattern variable**" (i.e., the union of all basic pattern variables) as the prefix, referring to the column values of all rows in the current match. 
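The two reference styles above can be combined in a single query. The following is a minimal sketch, not taken from the original guide: it assumes the sample table `t` (columns `time`, `totalprice`) from the syntax examples, and the measure names `start_price`, `last_down_price`, and `last_price` are hypothetical names chosen for illustration.

```SQL
-- Sketch only: assumes the sample table t used by the syntax examples.
-- start_price, last_down_price and last_price are illustrative names.
SELECT m.time, m.start_price, m.last_down_price, m.last_price
FROM t
MATCH_RECOGNIZE (
    ORDER BY time
    MEASURES
        A.totalprice           AS start_price,      -- prefixed: only the row mapped to A
        RPR_LAST(B.totalprice) AS last_down_price,  -- prefixed: last row mapped to B so far
        RPR_LAST(totalprice)   AS last_price        -- no prefix: any row of the current match
    ALL ROWS PER MATCH
    PATTERN (A B+)
    DEFINE B AS B.totalprice < PREV(B.totalprice)   -- B: price lower than the previous row
) AS m;
```

In this sketch, `A.totalprice` only looks at the row mapped to `A`, `RPR_LAST(B.totalprice)` only looks at rows mapped to `B`, and the unprefixed `totalprice` ranges over every row of the current match, which is the behavior the two bullets describe.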
- -> Using table names as column name prefixes in pattern recognition expressions is not allowed. - -#### 2.8.2 Extended Functions - -| Function Name | Function Syntax | Description | -| ------------------------------- | ----------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| `MATCH_NUMBER` Function | `MATCH_NUMBER()` | Returns the sequence number of the current match within the partition, starting from 1. Empty matches occupy match sequence numbers just like non-empty matches. | -| `CLASSIFIER` Function | `CLASSIFIER(option)` | 1. Returns the name of the basic pattern variable mapped by the current row. 2. `option` is an optional parameter: a basic pattern variable `CLASSIFIER(A)` or a combined pattern variable `CLASSIFIER(U)` can be passed in to limit the function's scope; for rows outside the scope, NULL is returned directly. When used with a combined pattern variable, it can be used to distinguish which basic pattern variable in the union the row is mapped to. | -| Logical Navigation Functions | `RPR_FIRST(expr, k)` | 1. Indicates locating the first row satisfying `expr` in the ​**current match group**​, then searching for the k-th occurrence of the row corresponding to the same pattern variable towards the end of the group, and returning the specified column value of that row. If the k-th matching row is not found in the specified direction, the function returns NULL. 2. 
`k` is an optional parameter, defaulting to 0 (only locating the first row satisfying the condition); if explicitly specified, it must be a non-negative integer. | -| Logical Navigation Functions | `RPR_LAST(expr, k)` | 1. Indicates locating the last row satisfying `expr` in the ​**current match group**​, then searching for the k-th occurrence of the row corresponding to the same pattern variable towards the start of the group, and returning the specified column value of that row. If the k-th matching row is not found in the specified direction, the function returns NULL. 2. `k` is an optional parameter, defaulting to 0 (only locating the last row satisfying the condition); if explicitly specified, it must be a non-negative integer. | -| Physical Navigation Functions | `PREV(expr, k)` | 1. Indicates offsetting k rows towards the start from the last row matched to the given pattern variable, and returning the corresponding column value. If navigation exceeds the ​**partition boundary**​, the function returns NULL. 2. `k` is an optional parameter, defaulting to 1; if explicitly specified, it must be a non-negative integer. | -| Physical Navigation Functions | `NEXT(expr, k)` | 1. Indicates offsetting k rows towards the end from the last row matched to the given pattern variable, and returning the corresponding column value. If navigation exceeds the ​**partition boundary**​, the function returns NULL. 2. `k` is an optional parameter, defaulting to 1; if explicitly specified, it must be a non-negative integer. | -| Aggregate Functions | COUNT, SUM, AVG, MAX, MIN Functions | Can be used to calculate data in the current match. Aggregate functions and navigation functions are not allowed to be nested within each other. (Supported from version V2.0.6) | -| Nested Functions | `PREV/NEXT(CLASSIFIER())` | Nesting of physical navigation functions and the CLASSIFIER function. 
Used to obtain the pattern variables corresponding to the previous and next matching rows of the current row. | -| Nested Functions | `PREV/NEXT(RPR_FIRST/RPR_LAST(expr, k))` | **Logical functions are allowed to be nested** inside physical functions; **physical functions are not allowed to be nested** inside logical functions. Used to perform logical offset first, then physical offset. | - -**Examples** - -1. CLASSIFIER Function - -* Query sql - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* Analysis - - ![](/img/timeseries-featured-analysis-9-en.png) - -* Result - -```SQL -+-----------------------------+-----+-----+---------------+-----+ -| time|match|price|lower_or_higher|label| -+-----------------------------+-----+-----+---------------+-----+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| -+-----------------------------+-----+-----+---------------+-----+ -Total line number = 6 -``` - -2. 
Logical Navigation Functions - -* Query sql - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- MEASURES clause - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` - -* Results - * When the value is totalprice, RPR\_LAST(totalprice), RUNNING RPR\_LAST(totalprice) - - ![](/img/timeseries-featured-analysis-10.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is FINAL RPR\_LAST(totalprice) - - ![](/img/timeseries-featured-analysis-11.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is RPR\_FIRST(totalprice), RUNNING RPR\_FIRST(totalprice), FINAL RPR\_FIRST(totalprice) - - ![](/img/timeseries-featured-analysis-12.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 90| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 90| - |2025-01-01T00:05:00.000+08:00| 90| - |2025-01-01T00:06:00.000+08:00| 90| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is RPR\_LAST(totalprice, 2) - - 
![](/img/timeseries-featured-analysis-13.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| null| - |2025-01-01T00:02:00.000+08:00| null| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is FINAL RPR\_LAST(totalprice, 2) - - ![](/img/timeseries-featured-analysis-14.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is RPR\_FIRST(totalprice, 2) and FINAL RPR\_FIRST(totalprice, 2) - - ![](/img/timeseries-featured-analysis-15.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 70| - |2025-01-01T00:02:00.000+08:00| 70| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 6 - ``` - -3. 
Physical Navigation Functions - -* Query sql - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- MEASURES clause - ALL ROWS PER MATCH - PATTERN (B) - DEFINE B AS B.totalprice >= PREV(B.totalprice) -) AS m; -``` - -* Results - * When the value is `PREV(totalprice)` - - ![](/img/timeseries-featured-analysis-16.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `PREV(B.totalprice, 2)` - - ![](/img/timeseries-featured-analysis-17.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `PREV(B.totalprice, 4)` - - ![](/img/timeseries-featured-analysis-18.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| null| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `NEXT(totalprice)` or `NEXT(B.totalprice, 1)` - - ![](/img/timeseries-featured-analysis-19.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| null| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `NEXT(B.totalprice, 2)` - - ![](/img/timeseries-featured-analysis-20.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - 
+-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| null| - +-----------------------------+-------+ - Total line number = 2 - ``` - -4. Aggregate Functions - -* Query sql - -```SQL -SELECT m.time, m.count, m.avg, m.sum, m.min, m.max -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - COUNT(*) AS count, - AVG(totalprice) AS avg, - SUM(totalprice) AS sum, - MIN(totalprice) AS min, - MAX(totalprice) AS max - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* Analysis (Taking MIN(totalprice) as an Example) - -![](/img/timeseries-featured-analysis-21.png) - -* Result - -```SQL -+-----------------------------+-----+-----------------+-----+---+---+ -| time|count| avg| sum|min|max| -+-----------------------------+-----+-----------------+-----+---+---+ -|2025-01-01T00:01:00.000+08:00| 1| 90.0| 90.0| 90| 90| -|2025-01-01T00:02:00.000+08:00| 2| 85.0|170.0| 80| 90| -|2025-01-01T00:03:00.000+08:00| 3| 80.0|240.0| 70| 90| -|2025-01-01T00:04:00.000+08:00| 4| 80.0|320.0| 70| 90| -|2025-01-01T00:05:00.000+08:00| 5| 78.0|390.0| 70| 90| -|2025-01-01T00:06:00.000+08:00| 6|78.33333333333333|470.0| 70| 90| -+-----------------------------+-----+-----------------+-----+---+---+ -Total line number = 6 -``` - -5. 
Nested Functions - -Example 1 - -* Query sql - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label, m.prev_label, m.next_label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label, - PREV(CLASSIFIER(W)) AS prev_label, - NEXT(CLASSIFIER(W)) AS next_label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* Analysis - -![](/img/timeseries-featured-analysis-22-en.png) - -* Result - -```SQL -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -| time|match|price|lower_or_higher|label|prev_label|next_label| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| null| A| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| H| null| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| null| A| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| L| null| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| null| A| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| L| null| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -Total line number = 6 -``` - -Example 2 - -* Query sql - -```SQL -SELECT m.time, m.prev_last_price, m.next_first_price -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - PREV(RPR_LAST(totalprice), 2) AS prev_last_price, - NEXT(RPR_FIRST(totalprice), 2) as next_first_price - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* Analysis - -![](/img/timeseries-featured-analysis-23.png) - -* Result - -```SQL -+-----------------------------+---------------+----------------+ -| time|prev_last_price|next_first_price| -+-----------------------------+---------------+----------------+ -|2025-01-01T00:01:00.000+08:00| null| 70| 
-|2025-01-01T00:02:00.000+08:00| null| 70| -|2025-01-01T00:03:00.000+08:00| 90| 70| -|2025-01-01T00:04:00.000+08:00| 80| 70| -|2025-01-01T00:05:00.000+08:00| 70| 70| -|2025-01-01T00:06:00.000+08:00| 80| 70| -+-----------------------------+---------------+----------------+ -Total line number = 6 -``` - -#### 2.8.3 RUNNING and FINAL Semantics - -1. Definition - -* `RUNNING`: Indicates the calculation scope is from the start row of the current match group to the row currently being processed (i.e., up to the current row). -* `FINAL`: Indicates the calculation scope is from the start row of the current match group to the final row of the group (i.e., the entire match group). - -2. Scope of Application - -* The DEFINE clause uses RUNNING semantics by default. -* The MEASURES clause uses RUNNING semantics by default and supports specifying FINAL semantics. When using the ONE ROW PER MATCH output mode, all expressions are calculated from the last row position of the match group, and at this time, RUNNING semantics are equivalent to FINAL semantics. - -3. Syntax Constraints - -* RUNNING and FINAL need to be written before **logical navigation functions** or aggregate functions, and cannot directly act on **column references.** - * Valid: `RUNNING RPR_LAST(A.totalprice)`, `FINAL RPR_LAST(A.totalprice)` - * Invalid: `RUNNING A.totalprice`, `FINAL A.totalprice`, `RUNNING PREV(A.totalprice)` - -## 3. Scenario Examples - -Using [Sample Data](../Reference/Sample-Data.md) as the source data - -### 3.1 Time Segment Query - -Segment the data in table1 by time intervals less than or equal to 24 hours, and query the total number of data entries in each segment, as well as the start and end times. 
- -Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table1 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (cast(B.time as INT64) - cast(PREV(B.time) as INT64)) <= 86400000 -) AS m -``` - -Results - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:38:00.000+08:00| 2| -|2024-11-27T16:38:00.000+08:00|2024-11-30T14:30:00.000+08:00| 16| -+-----------------------------+-----------------------------+---+ -Total line number = 2 -``` - -### 3.2 Difference Segment Query - -Segment the data in table2 by humidity value differences less than 0.1, and query the total number of data entries in each segment, as well as the start and end times. - -* Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table2 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (B.humidity - PREV(B.humidity )) <=0.1 -) AS m; -``` - -* Results - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-27T00:00:00.000+08:00| 2| -|2024-11-28T08:00:00.000+08:00|2024-11-29T00:00:00.000+08:00| 2| -|2024-11-29T11:00:00.000+08:00|2024-11-30T00:00:00.000+08:00| 2| -+-----------------------------+-----------------------------+---+ -Total line number = 3 -``` - -### 3.3 Event Statistics Query - -Group the data in table1 by device ID, and count the start and end times and maximum humidity value where the humidity in the Shanghai area is greater than 35. 
- -* Query SQL - -```SQL -SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity -FROM table1 -MATCH_RECOGNIZE ( - PARTITION BY device_id - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RPR_FIRST(A.time) AS event_start, - RPR_LAST(A.time) AS event_end, - MAX(A.humidity) AS max_humidity - ONE ROW PER MATCH - PATTERN (A+) - DEFINE - A AS A.region= '上海' AND A.humidity> 35 -) AS m -``` - -* Results - -```SQL -+---------+-----+-----------------------------+-----------------------------+------------+ -|device_id|match| event_start| event_end|max_humidity| -+---------+-----+-----------------------------+-----------------------------+------------+ -| 100| 1|2024-11-28T09:00:00.000+08:00|2024-11-29T18:30:00.000+08:00| 45.1| -| 101| 1|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 35.2| -+---------+-----+-----------------------------+-----------------------------+------------+ -Total line number = 2 -``` - -## 4. Practical Cases - -### 4.1 Altitude Monitoring - -* **Business Background** - -During oil product transportation, environmental pressure is directly affected by altitude: higher altitude means lower atmospheric pressure, which increases oil evaporation risks. To accurately assess natural oil loss, BeiDou positioning data must identify altitude anomalies to support loss evaluation. - -* **Data Structure** - -Monitoring table contains these core fields: - -| **ColumnName** | DataType | Category | Comment | -| ---------------------- | ----------- | ---------- | ------------------------ | -| time | TIMESTAMP | TIME | Data collection timestamp | -| device\_id | STRING | TAG | Vehicle device ID (partition key) | -| department | STRING | FIELD | Affiliated department | -| altitude | DOUBLE | FIELD | Altitude (unit: meters) | - -* **Business Requirements** - -Identify altitude anomaly events: When vehicle altitude exceeds 500m and later drops below 500m, it constitutes a complete anomaly event. 
Calculate core metrics: - -* Event start time (first timestamp exceeding 500m) -* Event end time (last timestamp above 500m) -* Maximum altitude during event - -![](/img/pattern-query-altitude.png) - -* **Implementation Method** - -```SQL -SELECT * -FROM beidou -MATCH_RECOGNIZE ( - PARTITION BY device_id -- Partition by vehicle device ID - ORDER BY time -- Chronological ordering - MEASURES - FIRST(A.time) AS ts_s, -- Event start timestamp - LAST(A.time) AS ts_e, -- Event end timestamp - MAX(A.altitude) AS max_a -- Maximum altitude during event - PATTERN (A+) -- Match consecutive records above 500m - DEFINE - A AS A.altitude > 500 -- Define A as altitude > 500m -) -``` - -### 4.2 Safety Injection Operation Identification - -* **Business Background** - -Nuclear power plants require periodic safety tests (e.g., PT1RPA010 "Safety Injection Logic Test with 1 RPA 601KC") to verify equipment integrity. These tests cause characteristic flow pattern changes. The control system must identify these patterns to detect anomalies and ensure equipment safety. - -* **Data Structure** - -Sensor table contains these core fields: - -| **ColumnName** | DataType | Category | Comment | -| ---------------------- | ----------- | ---------- | ------------------------ | -| time | TIMESTAMP | TIME | Data collection timestamp | -| pipe\_id | STRING | TAG | Pipe ID (partition key) | -| pressure | DOUBLE | FIELD | Pipe pressure | -| flow\_rate | DOUBLE | FIELD | Pipe flow rate (key metric) | - -* **Business Requirements** - -Identify PT1RPA010 flow pattern: Normal flow → Continuous decline → Extremely low flow (<0.5) → Continuous recovery → Normal flow. 
Extract core metrics: - -* Pattern start time (initial normal flow timestamp) -* Pattern end time (recovered normal flow timestamp) -* Extremely low phase start/end times -* Minimum flow rate during extremely low phase - -![](/img/pattern-query-flow.png) - -* **Implementation Method** - -```SQL -SELECT * FROM sensor MATCH_RECOGNIZE( - PARTITION BY pipe_id -- Partition by pipe ID - ORDER BY time -- Chronological ordering - MEASURES - A.time AS start_ts, -- Pattern start timestamp - E.time AS end_ts, -- Pattern end timestamp - FIRST(C.time) AS low_start_ts, -- Extremely low phase start - LAST(C.time) AS low_end_ts, -- Extremely low phase end - MIN(C.flow_rate) AS min_low_flow -- Minimum flow during low phase - ONE ROW PER MATCH -- Output one row per match - PATTERN(A B+? C+ D+? E) -- Match normal→decline→extremely low→recovery→normal - DEFINE - A AS flow_rate BETWEEN 2 AND 2.5, -- Initial normal flow - B AS flow_rate < PREV(B.flow_rate), -- Continuous decline - C AS flow_rate < 0.5, -- Extremely low threshold - D AS flow_rate > PREV(D.flow_rate), -- Continuous recovery - E AS flow_rate BETWEEN 2 AND 2.5 -- Normal recovery -); -``` - -### 4.3 Extreme Operational Gust (Sombrero Wind) Identification - -* **Business Background** - -In wind power generation, "extreme operational gusts (sombrero wind)" are short-duration (≈10s) sinusoidal gusts with prominent peaks that can cause physical turbine damage. Identifying these gusts and calculating their frequency helps assess turbine damage risks and guide maintenance. 
- -* **Data Structure** - -Turbine sensor table contains: - -| **ColumnName** | DataType | Category | Comment | -| ---------------------- | ----------- | ---------- | ------------------------ | -| time | TIMESTAMP | TIME | Wind speed timestamp | -| speed | DOUBLE | FIELD | Wind speed (key metric) | - -* **Business Requirements** - -Identify sombrero wind pattern: Gradual speed decline → Sharp increase → Sharp decrease → Gradual recovery to initial value (≈10s total). Primary goal: count gust occurrences for risk assessment. - -![](/img/pattern-query-speed.png) - -* **Implementation Method** - -```SQL -SELECT COUNT(*) -- Count extreme gust occurrences -FROM sensor -MATCH_RECOGNIZE( - ORDER BY time -- Chronological ordering - MEASURES - FIRST(B.time) AS ts_s, -- Gust start timestamp - LAST(D.time) AS ts_e -- Gust end timestamp - PATTERN (B+ R+? F+? D+? E) -- Match sombrero wind pattern - DEFINE - -- Phase B: Gradual decline, initial speed>9, delta<2.5 - B AS speed <= AVG(B.speed) - AND FIRST(B.speed) > 9 - AND (FIRST(B.speed) - LAST(B.speed)) < 2.5, - -- Phase R: Sharp increase (above phase average) - R AS speed >= AVG(R.speed), - -- Phase F: Sharp decrease, peak>16 (crest threshold) - F AS speed <= AVG(F.speed) - AND MAX(F.speed) > 16, - -- Phase D: Gradual recovery, delta<2.5 - D AS speed >= AVG(D.speed) - AND (LAST(D.speed) - FIRST(D.speed)) < 2.5, - -- Phase E: Recovery to ±0.2 of initial value, total duration <11s - E AS speed - FIRST(B.speed) BETWEEN -0.2 AND 0.2 - AND time - FIRST(B.time) < 11 -); -``` \ No newline at end of file diff --git a/src/UserGuide/latest-Table/User-Manual/Tiered-Storage_timecho.md b/src/UserGuide/latest-Table/User-Manual/Tiered-Storage_timecho.md deleted file mode 100644 index 36d72f18b..000000000 --- a/src/UserGuide/latest-Table/User-Manual/Tiered-Storage_timecho.md +++ /dev/null @@ -1,102 +0,0 @@ - -# Tiered Storage - -## 1. 
Overview - -The **tiered storage** feature enables users to manage multiple types of storage media efficiently. Users can configure different storage media types within IoTDB and classify them into distinct storage tiers. In IoTDB, tiered storage is implemented by managing multiple directories. Users can group multiple storage directories into the same category and designate them as a **storage tier**. Additionally, data can be classified based on its "hotness" or "coldness" and stored accordingly in designated tiers. - -Currently, IoTDB supports hot and cold data classification based on the **Time-To-Live (****TTL****)** parameter. When data in a tier no longer meets the defined TTL rules, it is automatically migrated to the next tier. - -## 2. **Parameter Definitions** - -To enable multi-level storage in IoTDB, the following configurations are required: - -1. Configure data directories and assign them into different tiers -2. Set TTL for each Tier to distinguish hot and cold data managed by different tiers. -3. Configure minimum remaining storage space ratio for each tier (Optional). If the available space in a tier falls below the defined threshold, data will be migrated to the next tier automatically. - -The specific parameter definitions and their descriptions are as follows. 
- -| **Parameter** | **Default Value** | **Required** | **Description** | **Constraints** | -| :------------------------------------------- | :------------------------- | --- | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `dn_data_dirs` | `data/datanode/data` | Yes | Specifies storage directories grouped into tiers. | Tiers are separated by `;`, directories within the same tier are separated by `,`.
Cloud storage (e.g., AWS S3) can only be the last tier.
Use `OBJECT_STORAGE` to denote cloud storage.
Only one cloud storage bucket is allowed. | -| `tier_ttl_in_ms` | `-1` | Yes | Defines the TTL (in milliseconds) for each tier to determine the data range it manages. | Tiers are separated by `;`.
The number of tiers must match `dn_data_dirs`.
`-1` means "no limit". | -| `dn_default_space_usage_thresholds` | `0.85` | Yes | Defines the maximum storage usage threshold ratio for each tier of data directories. When the used space exceeds this ratio, the data will be automatically migrated to the next tier. If the storage usage of the last tier surpasses this threshold, the system will be set to READ_ONLY mode. | Tiers are separated by `;`. The number of tiers must match `dn_data_dirs`. | -| `object_storage_type` | `AWS_S3` | Required when using remote storage | Cloud storage type. | Only `AWS_S3` is supported. | -| `object_storage_bucket` | `iotdb_data` | Required when using remote storage | Cloud storage bucket name. | | -| `object_storage_endpoint` | (Empty) | Required when using remote storage | Cloud storage endpoint. | | -| `object_storage_region` | (Empty) | Required when using remote storage | Cloud storage region. | | -| `object_storage_access_key` | (Empty) | Required when using remote storage | Cloud storage access key. | | -| `object_storage_access_secret` | (Empty) | Required when using remote storage | Cloud storage access secret. | | -| `enable_path_style_access` | false | No | Whether to enable path style access for object storage service. | | -| `remote_tsfile_cache_dirs` | `data/datanode/data/cache` | No | Local cache directory for cloud storage. | | -| `remote_tsfile_cache_page_size_in_kb` | `20480` | No | Page size (in KB) for cloud storage local cache. | | -| `remote_tsfile_cache_max_disk_usage_in_mb` | `51200` | No | Maximum disk space (in MB) allocated for cloud storage local cache. | | - -## 3. 
Local Tiered Storage Example - -The following is an example of a **two-tier local storage configuration**: - -```Properties -# Mandatory configurations -dn_data_dirs=/data1/data;/data2/data,/data3/data -tier_ttl_in_ms=86400000;-1 -dn_default_space_usage_thresholds=0.2;0.1 -``` - -**Tier Details:** - -| **Tier** | **Storage Directories** | **Data Range** | **Remaining Space Threshold** | -| :------- | :--------------------------- | :-------------------- | :---------------------------- | -| Tier 1 | `/data1/data` | Last 1 day of data | 20% | -| Tier 2 | `/data2/data`, `/data3/data` | Data older than 1 day | 10% | - -## 4. Cloud-based Tiered Storage Example - -The following is an example of a **three-tier configuration with cloud storage**: - -```Properties -# Mandatory configurations -dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE -tier_ttl_in_ms=86400000;864000000;-1 -dn_default_space_usage_thresholds=0.2;0.15;0.1 -object_storage_type=AWS_S3 -object_storage_bucket=iotdb -object_storage_region= -object_storage_endpoint= -object_storage_access_key= -object_storage_access_secret= - -# Optional configurations -enable_path_style_access=false -remote_tsfile_cache_dirs=data/datanode/data/cache -remote_tsfile_cache_page_size_in_kb=20971520 -remote_tsfile_cache_max_disk_usage_in_mb=53687091200 -``` - -**Tier Details:** - -| **Tier** | **Storage Directories** | **Data Range** | **Remaining Space Threshold** | -| :------- | :--------------------------- | :----------------------------- | :---------------------------- | -| Tier 1 | `/data1/data` | Last 1 day of data | 20% | -| Tier 2 | `/data2/data`, `/data3/data` | Data from 1 day to 10 days ago | 15% | -| Tier 3 | S3 Cloud Storage | Data older than 10 days | 10% | \ No newline at end of file diff --git a/src/UserGuide/latest-Table/User-Manual/Timeseries-Featured-Analysis_timecho.md b/src/UserGuide/latest-Table/User-Manual/Timeseries-Featured-Analysis_timecho.md deleted file mode 100644 index 
90ddd39de..000000000 --- a/src/UserGuide/latest-Table/User-Manual/Timeseries-Featured-Analysis_timecho.md +++ /dev/null @@ -1,1728 +0,0 @@ - - -# Timeseries Featured Analysis - -For time-series data feature analysis scenarios, IoTDB provides two core capabilities: pattern query and window functions. These capabilities deliver a flexible and efficient solution for in-depth mining and complex computation of time-series data. The following sections will elaborate on the two features in detail. - -## 1. Pattern Query - -### 1.1 Overview - -Pattern query enables capturing a segment of continuous data by defining the recognition logic of pattern variables and regular expressions, and performing analysis and calculation on each captured data segment. It is suitable for business scenarios such as identifying specific patterns in time-series data (as shown in the figure below) and detecting specific events. - -![](/img/timeseries-featured-analysis-1.png) - -> Note: This feature is available starting from version V2.0.5. - -### 1.2 Function Introduction -#### 1.2.1 Syntax Format - -```SQL -MATCH_RECOGNIZE ( - [ PARTITION BY column [, ...] ] - [ ORDER BY column [, ...] ] - [ MEASURES measure_definition [, ...] ] - [ ROWS PER MATCH ] - [ AFTER MATCH skip_to ] - PATTERN ( row_pattern ) - [ SUBSET subset_definition [, ...] ] - DEFINE variable_definition [, ...] -) -``` - -**Note:** - -* PARTITION BY: Optional. Used to group the input table, and each group can perform pattern matching independently. If this clause is not specified, the entire input table will be processed as a single unit. -* ORDER BY: Optional. Used to ensure that input data is processed in a specific order during matching. -* MEASURES: Optional. Used to specify which information to extract from the matched segment of data. -* ROWS PER MATCH: Optional. Used to specify the output method of the result set after successful pattern matching. -* AFTER MATCH SKIP: Optional. 
Used to specify which row to resume from for the next pattern match after identifying a non-empty match. -* PATTERN: Used to define the row pattern to be matched. -* SUBSET: Optional. Used to merge rows matched by multiple basic pattern variables into a single logical set. -* DEFINE: Used to define the basic pattern variables for the row pattern. - -**Original Data for Syntax Examples:** - -```SQL -IoTDB:database3> select * from t -+-----------------------------+------+----------+ -| time|device|totalprice| -+-----------------------------+------+----------+ -|2025-01-01T00:01:00.000+08:00| d1| 90| -|2025-01-01T00:02:00.000+08:00| d1| 80| -|2025-01-01T00:03:00.000+08:00| d1| 70| -|2025-01-01T00:04:00.000+08:00| d1| 80| -|2025-01-01T00:05:00.000+08:00| d1| 70| -|2025-01-01T00:06:00.000+08:00| d1| 80| -+-----------------------------+------+----------+ - --- Creation Statement -create table t(device tag, totalprice int32 field) - -insert into t(time,device,totalprice) values(2025-01-01T00:01:00, 'd1', 90),(2025-01-01T00:02:00, 'd1', 80),(2025-01-01T00:03:00, 'd1', 70),(2025-01-01T00:04:00, 'd1', 80),(2025-01-01T00:05:00, 'd1', 70),(2025-01-01T00:06:00, 'd1', 80) -``` - -#### 1.2.2 DEFINE Clause - -Used to specify the judgment condition for each basic pattern variable in pattern recognition. These variables are usually represented by identifiers (e.g., `A`, `B`), and the Boolean expressions in this clause precisely define which rows meet the requirements of the variable. - -* During pattern matching execution, a row is only marked as the variable (and thus included in the current matching group) if the Boolean expression returns TRUE. - -```SQL --- A row can only be identified as B if its totalprice value is less than the totalprice value of the previous row. 
-DEFINE B AS totalprice < PREV(totalprice) -``` - -* Variables not **explicitly** defined in this clause have an implicitly set condition of always true (TRUE), meaning they can be successfully matched on any input row. - -#### 1.2.3 SUBSET Clause - -Used to merge rows matched by multiple basic pattern variables (e.g., `A`, `B`) into a combined pattern variable (e.g., `U`), allowing these rows to be treated as a single logical set for operations. It can be used in the `MEASURES`, `DEFINE`, and `AFTER MATCH SKIP` clauses. - -```SQL -SUBSET U = (A, B) -``` -For example, for the pattern `PATTERN ((A | B){5} C+)`, it is impossible to determine whether the 5th repetition matches the basic pattern variable A or B during matching. Therefore: - -1. In the `MEASURES` clause, if you need to reference the last row matched in this phase, you can do so by defining the combined pattern variable `SUBSET U = (A, B)`. At this point, the expression `RPR_LAST(U.totalprice)` will directly return the `totalprice` value of the target row. -2. In the `AFTER MATCH SKIP` clause, if the matching result does not include the basic pattern variable A or B, executing `AFTER MATCH SKIP TO LAST B` or `AFTER MATCH SKIP TO LAST A` will fail to jump due to missing anchors. However, by introducing the combined pattern variable `SUBSET U = (A, B)`, using `AFTER MATCH SKIP TO LAST U` is always valid. - -#### 1.2.4 PATTERN Clause - -Used to define the row pattern to be matched, whose basic building block is a row pattern variable. 
- -```SQL -PATTERN ( row_pattern ) -``` - -##### 1.2.4.1 Pattern Types - -| Row Pattern | Syntax Format | Description | -|-----------------------|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Pattern Concatenation | `A B+ C+ D+` | Composed of subpatterns without any operators, matching all subpatterns in the declared order sequentially. | -| Pattern Alternation | `A \| B \| C` | Composed of multiple subpatterns separated by `\|`, matching only one of them. If multiple subpatterns can be matched, the leftmost one is selected. | -| Pattern Permutation | `PERMUTE(A, B, C)` | Equivalent to performing alternation matching on all different orders of the subpattern elements. It requires that A, B, and C must all be matched, but their order of appearance is not fixed. If multiple matching orders are possible, the priority is determined by the **lexicographical order** based on the definition sequence of elements in the PERMUTE list. For example, A B C has the highest priority, while C B A has the lowest. | -| Pattern Grouping | `(A B C)` | Encloses subpatterns in parentheses to treat them as a single unit, which can be used with other operators. For example, `(A B C)+` indicates a pattern where a group of `(A B C)` appears consecutively. | -| Empty Pattern | `()` | Represents an empty match that does not contain any rows. | -| Pattern Exclusion | `{- row_pattern -}` | Used to specify the matched part to be excluded from the output. Usually used with the `ALL ROWS PER MATCH` option to output rows of interest. 
For example, `PATTERN (A {- B+ C+ -} D+)` with ALL ROWS PER MATCH will only output the first row `(corresponding to A)` and the trailing rows `(corresponding to D+)` of the match. | - -##### 1.2.4.2 Partition Start/End Anchor - -* `^A` matches a pattern in which A starts at the beginning of the partition - * When the value of the PATTERN clause is `^A`, the match must start from the first row of the partition, and this row must satisfy the definition of `A`. - * When the value of the PATTERN clause is `^A^` or `A^`, the output result is empty. -* `A$` matches a pattern in which A ends at the end of the partition - * When the value of the PATTERN clause is `A$`, the match must end at the last row of the partition, and this row must satisfy the definition of `A`. - * When the value of the PATTERN clause is `$A` or `$A$`, the output result is empty. - -**Examples** - -* Query SQL - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - AFTER MATCH SKIP PAST LAST ROW - PATTERN %s -- placeholder: substitute the PATTERN clause under test - DEFINE A AS true -) AS m; -``` - -* Results - * When the PATTERN clause is specified as PATTERN (^A) - - ![](/img/timeseries-featured-analysis-2.png) - - Actual Return - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * When the PATTERN clause is specified as PATTERN (^A^), the output result is empty. This is because it is impossible to match an A starting from the beginning of a partition and then return to the beginning of the partition again. - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. 
- ``` - - * When the PATTERN clause is specified as PATTERN (A\$) - - ![](/img/timeseries-featured-analysis-3.png) - - Actual Return - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:06:00.000+08:00| 1| 80| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * When the PATTERN clause is specified as PATTERN (\$A\$), the output result is empty. - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. - ``` - - -##### 1.2.4.3 Quantifiers - -Quantifiers are used to specify the number of times a subpattern repeats, placed after the corresponding subpattern (e.g., `(A | B)*`). - -Common quantifiers are as follows: - -| Quantifier | Description | -| -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `*` | Zero or more repetitions | -| `+` | One or more repetitions | -| `?` | Zero or one repetition | -| `{n}` | Exactly n repetitions | -| `{m, n}` | Repetitions between m and n times (m and n are non-negative integers). \* If the left bound is omitted, the default starts from 0; \* If the right bound is omitted, there is no upper limit on the number of repetitions (e.g., {5,} is equivalent to "at least five times"); \* If both left and right bounds are omitted (i.e., {,}), it is equivalent to `*`. | - -* The matching preference can be changed by adding `?` after the quantifier. - * `{3,5}`: Prefers 5 times, least prefers 3 times; `{3,5}?`: Prefers 3 times, least prefers 5 times. - * `?`: Prefers 1 time; `??`: Prefers 0 times. 
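The greedy versus reluctant preference described above can be sketched against the sample table `t` from the syntax examples. This is an illustrative query, not output captured from a running instance: with `A{2,4}?` each match would be expected to take the minimum of 2 rows, while removing the trailing `?` would be expected to let the first match take 4 rows. The bounds `2` and `4` are arbitrary values chosen for the illustration.

```SQL
-- Sketch only: compares a reluctant and a greedy quantifier on the sample table t.
-- Assumes the table t (time, device, totalprice) created in the syntax examples.
SELECT m.time, m.match, m.label
FROM t
MATCH_RECOGNIZE (
    ORDER BY time
    MEASURES
        MATCH_NUMBER() AS match,
        CLASSIFIER() AS label
    ALL ROWS PER MATCH
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (A{2,4}?)   -- reluctant: prefers 2 rows per match; use A{2,4} to prefer 4
    DEFINE A AS true
) AS m;
```

Comparing the `match` column of the two variants shows how the `?` suffix changes where each match ends without changing which rows are eligible.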
- -#### 1.2.5 AFTER MATCH SKIP Clause - -Used to specify which row to start the next pattern match from after identifying a non-empty match. - -| Jump Strategy | Description | Allows Overlapping Matches? | -| ------------------------------------------------------------- | -------------------------------------------------------------------------------- | ----------------------------- | -| `AFTER MATCH SKIP PAST LAST ROW` | Default behavior. Starts from the row after the last row of the current match. | No | -| `AFTER MATCH SKIP TO NEXT ROW` | Starts from the second row in the current match. | Yes | -| `AFTER MATCH SKIP TO [ FIRST \| LAST ] pattern_variable` | Jumps to start from the [ first row \| last row ] of a pattern variable. | Yes | - -* Among all possible configurations, only the combination of `ALL ROWS PER MATCH WITH UNMATCHED ROWS` and `AFTER MATCH SKIP PAST LAST ROW` guarantees exactly one output record for each input row. - -**Examples** - -* Query SQL - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - %s -- placeholder: substitute the AFTER MATCH SKIP clause under test - PATTERN (A B+ C+ D?) 
- SUBSET U = (C, D) - DEFINE - B AS B.totalprice < PREV (B.totalprice), - C AS C.totalprice > PREV (C.totalprice), - D AS false -- never matches -) AS m; -``` - -* Results - * When AFTER MATCH SKIP PAST LAST ROW is specified - - ![](/img/timeseries-featured-analysis-4-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: According to the semantics of `AFTER MATCH SKIP PAST LAST ROW`, starting from row 5, no valid match can be found - * This pattern will never have overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 4 - ``` - - * When AFTER MATCH SKIP TO NEXT ROW is specified - - ![](/img/timeseries-featured-analysis-5-en.png) - - * - * First attempt: Matches rows 1, 2, 3, 4 (match 1) - * Second attempt: According to the semantics of `AFTER MATCH SKIP TO NEXT ROW`, starts from row 2 and matches rows 2, 3, 4 (match 2) - * Third attempt: Starts from row 3, fails - * Fourth attempt: Starts from row 4, succeeds, matches rows 4, 5, 6 (match 3) - * This pattern allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:02:00.000+08:00| 2| 80| A| - |2025-01-01T00:03:00.000+08:00| 2| 70| B| - |2025-01-01T00:04:00.000+08:00| 2| 80| C| - |2025-01-01T00:04:00.000+08:00| 3| 80| A| - |2025-01-01T00:05:00.000+08:00| 3| 70| B| - |2025-01-01T00:06:00.000+08:00| 3| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 10 - ``` - - * 
When AFTER MATCH SKIP TO FIRST C - - ![](/img/timeseries-featured-analysis-6-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: Starts from the first C (i.e., row 4), matches rows 4, 5, 6 - * This pattern allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * When AFTER MATCH SKIP TO LAST B or AFTER MATCH SKIP TO B - - ![](/img/timeseries-featured-analysis-7-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: Attempts to start from the last B (i.e., row 3), fails - * Third match: Attempts to start from row 4, successfully matches rows 4, 5, 6 - * This pattern allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * When AFTER MATCH SKIP TO U - - ![](/img/timeseries-featured-analysis-8-en.png) - - * - * First match: Rows 1, 2, 3, 4 - * Second match: `SKIP TO U` means jumping to the last C or D; D can never match successfully, so it jumps to the last C (i.e., row 4), successfully matching rows 4, 5, 6 - * This pattern allows overlapping 
matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * When AFTER MATCH SKIP TO A is specified, the query fails: skipping to the first row of the match is not allowed, because it would cause an infinite loop - - ```SQL - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: cannot skip to first row of match - ``` - - * When AFTER MATCH SKIP TO D is specified, the query fails: D never matches in this example, and you cannot jump to a pattern variable that does not exist in the match group - - ```SQL - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: pattern variable is not present in match - ``` - - -#### 1.2.6 ROWS PER MATCH Clause - -Used to specify the output method of the result set after a successful pattern match, including the following two main options: - -| Output Method | Rule Description | Output Result | Handling Logic for **Empty Matches/Unmatched Rows** | -| -------------------- | -------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ | 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| ONE ROW PER MATCH | Generates one output row for each successful match. | \* Columns in the PARTITION BY clause\* Expressions defined in the MEASURES clause. | Outputs empty matches; skips unmatched rows. | -| ALL ROWS PER MATCH | Each row in a match generates an output record, unless the row is excluded via exclusion syntax. | \* Columns in the PARTITION BY clause\* Columns in the ORDER BY clause\* Expressions defined in the MEASURES clause\* Remaining columns in the input table | \* Default: Outputs empty matches; skips unmatched rows.\* ALL ROWS PER MATCH​**SHOW EMPTY MATCHES**​: Outputs empty matches by default; skips unmatched rows.\* ALL ROWS PER MATCH​**OMIT EMPTY MATCHES**​: Does not output empty matches; skips unmatched rows.\* ALL ROWS PER MATCH​**WITH UNMATCHED ROWS**​: Outputs empty matches and generates an additional output record for each unmatched row. | - -#### 1.2.7 MEASURES Clause - -Used to specify which information to extract from a matched set of data. This clause is optional; if not explicitly specified, some input columns will become the output results of pattern recognition based on the settings of the ROWS PER MATCH clause. - -SQL - -```SQL -MEASURES measure_expression AS measure_name [, ...] -``` - -* A `measure_expression` is a scalar value calculated from the matched set of data. 
- -| Usage Example | Description | -| ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `A.totalprice AS starting_price` | Returns the price from the first row in the matched group (i.e., the only row associated with variable A) as the starting price. | -| `RPR_LAST(B.totalprice) AS bottom_price` | Returns the price from the last row associated with variable B, representing the lowest price in the "V" shape pattern (corresponding to the end of the downward segment). | -| `RPR_LAST(U.totalprice) AS top_price` | Returns the highest price in the matched group, corresponding to the last row associated with variable C or D (i.e., the end of the entire matched group). [Assuming SUBSET U = (C, D)] | - -* Each `measure_expression` defines an output column, which can be referenced by its specified `measure_name`. - -#### 1.2.8 Row Pattern Recognition Expressions - -Expressions used in the MEASURES and DEFINE clauses are ​**scalar expressions**​, evaluated in the row-level context of the input table. In addition to supporting standard SQL syntax, **scalar expressions** also support special extended functions for row pattern recognition. - -##### 1.2.8.1 Pattern Variable References - -```SQL -A.totalprice -U.orderdate -orderstatus -``` - -* When a column name is prefixed with a **basic pattern variable** or a ​**combined pattern variable**​, it refers to the corresponding column values of all rows matched by that variable. -* If a column name has no prefix, it is equivalent to using the "​**global combined pattern variable**​" (i.e., the union of all basic pattern variables) as the prefix, referring to the column values of all rows in the current match. - -> Using table names as column name prefixes in pattern recognition expressions is not allowed. 
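The three reference styles above can be sketched in a single `MEASURES` clause. The query below is an illustration under stated assumptions: it reuses the sample table `t`, the "V" shape pattern `A B+ C+`, and a hypothetical combined variable `U = (B, C)`; the measure names are made up for the example, and no output is shown because it has not been captured from a running instance.

```SQL
-- Sketch only: basic, combined, and unprefixed pattern variable references.
-- Assumes the sample table t; U and the measure names are hypothetical.
SELECT m.starting_price, m.bottom_price, m.last_bc_price, m.last_price
FROM t
MATCH_RECOGNIZE (
    ORDER BY time
    MEASURES
        A.totalprice AS starting_price,          -- basic pattern variable A
        RPR_LAST(B.totalprice) AS bottom_price,  -- last row mapped to B
        RPR_LAST(U.totalprice) AS last_bc_price, -- combined variable U: last row mapped to B or C
        RPR_LAST(totalprice) AS last_price       -- no prefix: all rows of the current match
    ONE ROW PER MATCH
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (A B+ C+)
    SUBSET U = (B, C)
    DEFINE
        B AS B.totalprice < PREV(B.totalprice),
        C AS C.totalprice > PREV(C.totalprice)
) AS m;
```

Note that none of the references use the table name `t` as a prefix, in line with the restriction stated above.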
- -##### 1.2.8.2 Extended Functions - -| Function Name | Function Syntax | Description | -| ------------------------------- | ----------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| `MATCH_NUMBER` Function | `MATCH_NUMBER()` | Returns the sequence number of the current match within the partition, starting from 1. Empty matches occupy match sequence numbers just like non-empty matches. | -| `CLASSIFIER` Function | `CLASSIFIER(option)` | 1. Returns the name of the basic pattern variable mapped by the current row. 2. `option` is an optional parameter: a basic pattern variable `CLASSIFIER(A)` or a combined pattern variable `CLASSIFIER(U)` can be passed in to limit the function's scope; for rows outside the scope, NULL is returned directly. When used with a combined pattern variable, it can be used to distinguish which basic pattern variable in the union the row is mapped to. | -| Logical Navigation Functions | `RPR_FIRST(expr, k)` | 1. Indicates locating the first row satisfying `expr` in the ​**current match group**​, then searching for the k-th occurrence of the row corresponding to the same pattern variable towards the end of the group, and returning the specified column value of that row. If the k-th matching row is not found in the specified direction, the function returns NULL. 2. 
`k` is an optional parameter, defaulting to 0 (only locating the first row satisfying the condition); if explicitly specified, it must be a non-negative integer. | -| Logical Navigation Functions | `RPR_LAST(expr, k)` | 1. Indicates locating the last row satisfying `expr` in the ​**current match group**​, then searching for the k-th occurrence of the row corresponding to the same pattern variable towards the start of the group, and returning the specified column value of that row. If the k-th matching row is not found in the specified direction, the function returns NULL. 2. `k` is an optional parameter, defaulting to 0 (only locating the last row satisfying the condition); if explicitly specified, it must be a non-negative integer. | -| Physical Navigation Functions | `PREV(expr, k)` | 1. Indicates offsetting k rows towards the start from the last row matched to the given pattern variable, and returning the corresponding column value. If navigation exceeds the ​**partition boundary**​, the function returns NULL. 2. `k` is an optional parameter, defaulting to 1; if explicitly specified, it must be a non-negative integer. | -| Physical Navigation Functions | `NEXT(expr, k)` | 1. Indicates offsetting k rows towards the end from the last row matched to the given pattern variable, and returning the corresponding column value. If navigation exceeds the ​**partition boundary**​, the function returns NULL. 2. `k` is an optional parameter, defaulting to 1; if explicitly specified, it must be a non-negative integer. | -| Aggregate Functions | COUNT, SUM, AVG, MAX, MIN Functions | Can be used to calculate data in the current match. Aggregate functions and navigation functions are not allowed to be nested within each other. (Supported from version V2.0.6) | -| Nested Functions | `PREV/NEXT(CLASSIFIER())` | Nesting of physical navigation functions and the CLASSIFIER function. 
Used to obtain the pattern variables corresponding to the previous and next matching rows of the current row. | -| Nested Functions | `PREV/NEXT(RPR_FIRST/RPR_LAST(expr, k))` | **Logical functions are allowed to be nested** inside physical functions; **physical functions are not allowed to be nested** inside logical functions. Used to perform logical offset first, then physical offset. | - -**Examples** - -1. CLASSIFIER Function - -* Query SQL - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* Analysis - - ![](/img/timeseries-featured-analysis-9-en.png) - -* Result - -```SQL -+-----------------------------+-----+-----+---------------+-----+ -| time|match|price|lower_or_higher|label| -+-----------------------------+-----+-----+---------------+-----+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| -+-----------------------------+-----+-----+---------------+-----+ -Total line number = 6 -``` - -2. 
Logical Navigation Functions - -* Query SQL - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- placeholder: substitute the measure expression under test - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` - -* Results - * When the value is totalprice, RPR\_LAST(totalprice), RUNNING RPR\_LAST(totalprice) - - ![](/img/timeseries-featured-analysis-10.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is FINAL RPR\_LAST(totalprice) - - ![](/img/timeseries-featured-analysis-11.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is RPR\_FIRST(totalprice), RUNNING RPR\_FIRST(totalprice), FINAL RPR\_FIRST(totalprice) - - ![](/img/timeseries-featured-analysis-12.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 90| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 90| - |2025-01-01T00:05:00.000+08:00| 90| - |2025-01-01T00:06:00.000+08:00| 90| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is RPR\_LAST(totalprice, 2) - - 
![](/img/timeseries-featured-analysis-13.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| null| - |2025-01-01T00:02:00.000+08:00| null| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is FINAL RPR\_LAST(totalprice, 2) - - ![](/img/timeseries-featured-analysis-14.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the value is RPR\_FIRST(totalprice, 2) and FINAL RPR\_FIRST(totalprice, 2) - - ![](/img/timeseries-featured-analysis-15.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 70| - |2025-01-01T00:02:00.000+08:00| 70| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 6 - ``` - -3. 
Physical Navigation Functions - -* Query SQL - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- placeholder: substitute the measure expression under test - ALL ROWS PER MATCH - PATTERN (B) - DEFINE B AS B.totalprice >= PREV(B.totalprice) -) AS m; -``` - -* Results - * When the value is `PREV(totalprice)` - - ![](/img/timeseries-featured-analysis-16.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `PREV(B.totalprice, 2)` - - ![](/img/timeseries-featured-analysis-17.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `PREV(B.totalprice, 4)` - - ![](/img/timeseries-featured-analysis-18.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| null| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `NEXT(totalprice)` or `NEXT(B.totalprice, 1)` - - ![](/img/timeseries-featured-analysis-19.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| null| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the value is `NEXT(B.totalprice, 2)` - - ![](/img/timeseries-featured-analysis-20.png) - - Actual Return - - ```SQL - +-----------------------------+-------+ - | time|measure| - 
+-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| null| - +-----------------------------+-------+ - Total line number = 2 - ``` - -4. Aggregate Functions - -* Query sql - -```SQL -SELECT m.time, m.count, m.avg, m.sum, m.min, m.max -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - COUNT(*) AS count, - AVG(totalprice) AS avg, - SUM(totalprice) AS sum, - MIN(totalprice) AS min, - MAX(totalprice) AS max - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* Analysis (Taking MIN(totalprice) as an Example) - -![](/img/timeseries-featured-analysis-21.png) - -* Result - -```SQL -+-----------------------------+-----+-----------------+-----+---+---+ -| time|count| avg| sum|min|max| -+-----------------------------+-----+-----------------+-----+---+---+ -|2025-01-01T00:01:00.000+08:00| 1| 90.0| 90.0| 90| 90| -|2025-01-01T00:02:00.000+08:00| 2| 85.0|170.0| 80| 90| -|2025-01-01T00:03:00.000+08:00| 3| 80.0|240.0| 70| 90| -|2025-01-01T00:04:00.000+08:00| 4| 80.0|320.0| 70| 90| -|2025-01-01T00:05:00.000+08:00| 5| 78.0|390.0| 70| 90| -|2025-01-01T00:06:00.000+08:00| 6|78.33333333333333|470.0| 70| 90| -+-----------------------------+-----+-----------------+-----+---+---+ -Total line number = 6 -``` - -5. 
Nested Functions - -Example 1 - -* Query sql - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label, m.prev_label, m.next_label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label, - PREV(CLASSIFIER(W)) AS prev_label, - NEXT(CLASSIFIER(W)) AS next_label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* Analysis - -![](/img/timeseries-featured-analysis-22-en.png) - -* Result - -```SQL -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -| time|match|price|lower_or_higher|label|prev_label|next_label| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| null| A| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| H| null| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| null| A| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| L| null| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| null| A| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| L| null| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -Total line number = 6 -``` - -Example 2 - -* Query sql - -```SQL -SELECT m.time, m.prev_last_price, m.next_first_price -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - PREV(RPR_LAST(totalprice), 2) AS prev_last_price, - NEXT(RPR_FIRST(totalprice), 2) as next_first_price - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* Analysis - -![](/img/timeseries-featured-analysis-23.png) - -* Result - -```SQL -+-----------------------------+---------------+----------------+ -| time|prev_last_price|next_first_price| -+-----------------------------+---------------+----------------+ -|2025-01-01T00:01:00.000+08:00| null| 70| 
-|2025-01-01T00:02:00.000+08:00| null| 70| -|2025-01-01T00:03:00.000+08:00| 90| 70| -|2025-01-01T00:04:00.000+08:00| 80| 70| -|2025-01-01T00:05:00.000+08:00| 70| 70| -|2025-01-01T00:06:00.000+08:00| 80| 70| -+-----------------------------+---------------+----------------+ -Total line number = 6 -``` - -##### 1.2.8.3 RUNNING and FINAL Semantics - -1. Definition - -* `RUNNING`: Indicates the calculation scope is from the start row of the current match group to the row currently being processed (i.e., up to the current row). -* `FINAL`: Indicates the calculation scope is from the start row of the current match group to the final row of the group (i.e., the entire match group). - -2. Scope of Application - -* The DEFINE clause uses RUNNING semantics by default. -* The MEASURES clause uses RUNNING semantics by default and supports specifying FINAL semantics. When using the ONE ROW PER MATCH output mode, all expressions are calculated from the last row position of the match group, so in that mode RUNNING semantics are equivalent to FINAL semantics. - -3. Syntax Constraints - -* RUNNING and FINAL must be written before **logical navigation functions** or aggregate functions, and cannot be applied directly to **column references.** - * Valid: `RUNNING RPR_LAST(A.totalprice)`, `FINAL RPR_LAST(A.totalprice)` - * Invalid: `RUNNING A.totalprice`, `FINAL A.totalprice`, `RUNNING PREV(A.totalprice)` - -### 1.3 Scenario Examples - -The examples below use [Sample Data](../Reference/Sample-Data.md) as the source data. - -#### 1.3.1 Time Segment Query - -Segment the data in table1 by time intervals less than or equal to 24 hours, and query the total number of data entries in each segment, as well as the start and end times. 
- -Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table1 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (cast(B.time as INT64) - cast(PREV(B.time) as INT64)) <= 86400000 -) AS m -``` - -Results - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:38:00.000+08:00| 2| -|2024-11-27T16:38:00.000+08:00|2024-11-30T14:30:00.000+08:00| 16| -+-----------------------------+-----------------------------+---+ -Total line number = 2 -``` - -#### 1.3.2 Difference Segment Query - -Segment the data in table2 by humidity value differences less than 0.1, and query the total number of data entries in each segment, as well as the start and end times. - -* Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table2 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (B.humidity - PREV(B.humidity )) <=0.1 -) AS m; -``` - -* Results - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-27T00:00:00.000+08:00| 2| -|2024-11-28T08:00:00.000+08:00|2024-11-29T00:00:00.000+08:00| 2| -|2024-11-29T11:00:00.000+08:00|2024-11-30T00:00:00.000+08:00| 2| -+-----------------------------+-----------------------------+---+ -Total line number = 3 -``` - -#### 1.3.3 Event Statistics Query - -Group the data in table1 by device ID, and count the start and end times and maximum humidity value where the humidity in the Shanghai area is greater than 35. 
- -* Query SQL - -```SQL -SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity -FROM table1 -MATCH_RECOGNIZE ( - PARTITION BY device_id - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RPR_FIRST(A.time) AS event_start, - RPR_LAST(A.time) AS event_end, - MAX(A.humidity) AS max_humidity - ONE ROW PER MATCH - PATTERN (A+) - DEFINE - A AS A.region= '上海' AND A.humidity> 35 -) AS m -``` - -* Results - -```SQL -+---------+-----+-----------------------------+-----------------------------+------------+ -|device_id|match| event_start| event_end|max_humidity| -+---------+-----+-----------------------------+-----------------------------+------------+ -| 100| 1|2024-11-28T09:00:00.000+08:00|2024-11-29T18:30:00.000+08:00| 45.1| -| 101| 1|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 35.2| -+---------+-----+-----------------------------+-----------------------------+------------+ -Total line number = 2 -``` - - -## 2. Window Functions - -### 2.1 Function Overview - -Window Functions perform calculations on each row based on a specific set of rows related to the current row (called a "window"). It combines grouping operations (`PARTITION BY`), sorting (`ORDER BY`), and definable calculation ranges (window frame `FRAME`), enabling complex cross-row calculations without collapsing the original data rows. It is commonly used in data analysis scenarios such as ranking, cumulative sums, moving averages, etc. - -> Note: This feature is available starting from version V 2.0.5. - -For example, in a scenario where you need to query the cumulative power consumption values of different devices, you can achieve this using window functions. 
- -```SQL --- Original data -+-----------------------------+------+-----+ -| time|device| flow| -+-----------------------------+------+-----+ -|1970-01-01T08:00:00.000+08:00| d0| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| -|1970-01-01T08:00:02.000+08:00| d0| 3| -|1970-01-01T08:00:03.000+08:00| d0| 1| -|1970-01-01T08:00:04.000+08:00| d1| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| -+-----------------------------+------+-----+ - --- Create table and insert data -CREATE TABLE device_flow(device String tag, flow INT32 FIELD); -insert into device_flow(time, device ,flow ) values ('1970-01-01T08:00:00.000+08:00','d0',3),('1970-01-01T08:00:01.000+08:00','d0',5),('1970-01-01T08:00:02.000+08:00','d0',3),('1970-01-01T08:00:03.000+08:00','d0',1),('1970-01-01T08:00:04.000+08:00','d1',2),('1970-01-01T08:00:05.000+08:00','d1',4); - - --- Execute window function query -SELECT *, sum(flow) OVER(PARTITION BY device ORDER BY flow) as sum FROM device_flow; -``` - -After grouping, sorting, and calculation (the steps are broken down in the figure below), - -![](/img/window-function-1.png) - -the expected results can be obtained: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -### 2.2 Function Definition - -#### 2.2.1 SQL Definition - -```SQL -windowDefinition - : name=identifier AS '(' windowSpecification ')' - ; - -windowSpecification - : (existingWindowName=identifier)? - (PARTITION BY partition+=expression (',' partition+=expression)*)? - (ORDER BY sortItem (',' sortItem)*)? - windowFrame? 
- ; - -windowFrame - : frameExtent - ; - -frameExtent - : frameType=RANGE start=frameBound - | frameType=ROWS start=frameBound - | frameType=GROUPS start=frameBound - | frameType=RANGE BETWEEN start=frameBound AND end=frameBound - | frameType=ROWS BETWEEN start=frameBound AND end=frameBound - | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound - ; - -frameBound - : UNBOUNDED boundType=PRECEDING #unboundedFrame - | UNBOUNDED boundType=FOLLOWING #unboundedFrame - | CURRENT ROW #currentRowBound - | expression boundType=(PRECEDING | FOLLOWING) #boundedFrame - ; -``` - -#### 2.2.2 Window Definition - -##### 2.2.2.1 Partition - -`PARTITION BY` is used to divide data into multiple independent, unrelated "groups". Window functions can only access and operate on data within their respective groups, and cannot access data from other groups. This clause is optional; if not explicitly specified, all data is divided into the same group by default. It is worth noting that unlike `GROUP BY` which aggregates a group of data into a single row, the window function with `PARTITION BY` **does not affect the number of rows within the group.** - -* Example - -Query statement: - -```SQL -IoTDB> SELECT *, count(flow) OVER (PARTITION BY device) as count FROM device_flow; -``` - -Disassembly steps: - -![](/img/window-function-2.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 4| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -|1970-01-01T08:00:02.000+08:00| d0| 3| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 4| -+-----------------------------+------+----+-----+ -``` - -##### 2.2.2.2 Ordering - -`ORDER BY` is used to sort data within a partition. After sorting, rows with equal values are called peers. 
Peers affect the behavior of window functions; for example, different rank functions handle peers differently, and different frame division methods also handle peers differently. This clause is optional. - -* Example - -Query statement: - -```SQL -IoTDB> SELECT *, rank() OVER (PARTITION BY device ORDER BY flow) as rank FROM device_flow; -``` - -Disassembly steps: - -![](/img/window-function-3.png) - -Query result: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -##### 2.2.2.3 Framing - -For each row in a partition, the window function evaluates on a corresponding set of rows called a Frame (i.e., the input domain of the Window Function on each row). The Frame can be specified manually, involving two attributes when specified, as detailed below. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Frame Attribute | Attribute Value | Value Description |
|------------------------|-----------------------|-------------------|
| Type | `ROWS` | Divide the frame by row number |
| | `GROUPS` | Divide the frame by peers, i.e., rows with the same value are regarded as equivalent. All rows in peers are grouped into one group called a peer group |
| | `RANGE` | Divide the frame by value |
| Start and End Position | `UNBOUNDED PRECEDING` | The first row of the entire partition |
| | `offset PRECEDING` | Represents the row with an "offset" distance from the current row in the preceding direction |
| | `CURRENT ROW` | The current row |
| | `offset FOLLOWING` | Represents the row with an "offset" distance from the current row in the following direction |
| | `UNBOUNDED FOLLOWING` | The last row of the entire partition |
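
As a minimal sketch of how a frame type combines with a start and end position, the query below computes a two-row moving sum per device on the `device_flow` table used in this section. It relies only on clauses from the grammar in 2.2.1; the alias `moving_sum` is an illustrative name, and ordering by `time` is an assumption chosen so that "the previous row" means the previous reading.

```SQL
-- Sketch only: the frame is the previous row plus the current row,
-- evaluated separately for each device in time order.
SELECT *,
       sum(flow) OVER (
         PARTITION BY device
         ORDER BY time
         ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
       ) AS moving_sum  -- hypothetical alias, not part of the original examples
FROM device_flow;
```

For device d0, whose flow values in time order are 3, 5, 3, 1, the frames are {3}, {3, 5}, {5, 3} and {3, 1}, so the sums would be 3, 8, 8 and 4. Replacing `1 PRECEDING` with `UNBOUNDED PRECEDING` would make the frame grow from the first row of the partition instead.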
- -Among them, the meanings of `CURRENT ROW`, `PRECEDING N`, and `FOLLOWING N` vary with the type of frame, as shown in the following table: - -| | `ROWS` | `GROUPS` | `RANGE` | -|--------------------|------------|------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------| -| `CURRENT ROW` | Current row | Since a peer group contains multiple rows, this option differs depending on whether it acts on frame_start and frame_end: * frame_start: the first row of the peer group; * frame_end: the last row of the peer group. | Same as GROUPS, differing depending on whether it acts on frame_start and frame_end: * frame_start: the first row of the peer group; * frame_end: the last row of the peer group. | -| `offset PRECEDING` | The previous offset rows | The previous offset peer groups; | Rows whose value difference from the current row in the preceding direction is less than or equal to offset are grouped into one frame | -| `offset FOLLOWING` | The following offset rows | The following offset peer groups. | Rows whose value difference from the current row in the following direction is less than or equal to offset are grouped into one frame | - -The syntax format is as follows: - -```SQL --- Specify both frame_start and frame_end -{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end --- Specify only frame_start, frame_end is CURRENT ROW -{ RANGE | ROWS | GROUPS } frame_start -``` - -If the Frame is not specified manually, the default Frame division rules are as follows: - -* When the window function uses ORDER BY: The default Frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (i.e., from the first row of the window to the current row). For example: In RANK() OVER(PARTITION BY COL1 ORDER BY COL2), the Frame defaults to include the current row and all preceding rows in the partition. 
-* When the window function does not use ORDER BY: The default Frame is RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING (i.e., all rows in the entire window). For example: In AVG(COL2) OVER(PARTITION BY col1), the Frame defaults to include all rows in the partition, calculating the average of the entire partition. - -It should be noted that when the Frame type is GROUPS or RANGE, `ORDER BY` must be specified. The difference is that ORDER BY in GROUPS can involve multiple fields, while RANGE requires calculation and thus can only specify one field. - -* Example - -1. Frame type is ROWS - -Query statement: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ROWS 1 PRECEDING) as count FROM device_flow; -``` - -Disassembly steps: - -* Take the previous row and the current row as the Frame - * For the first row of the partition, since there is no previous row, the entire Frame has only this row, returning 1; - * For other rows of the partition, the entire Frame includes the current row and its previous row, returning 2: - -![](/img/window-function-4.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 2| -+-----------------------------+------+----+-----+ -``` - -2. Frame type is GROUPS - -Query statement: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ORDER BY flow GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -Disassembly steps: - -* Take the previous peer group and the current peer group as the Frame. 
Taking the partition with device d0 as an example (same for d1), for the count of rows: - * For the peer group with flow 1, since there are no peer groups smaller than it, the entire Frame has only this row, returning 1; - * For the peer group with flow 3, it itself contains 2 rows, and the previous peer group is the one with flow 1 (1 row), so the entire Frame has 3 rows, returning 3; - * For the peer group with flow 5, it itself contains 1 row, and the previous peer group is the one with flow 3 (2 rows), so the entire Frame has 3 rows, returning 3. - -![](/img/window-function-5.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -3. Frame type is RANGE - -Query statement: - -```SQL -IoTDB> SELECT *,count(flow) OVER(PARTITION BY device ORDER BY flow RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -Disassembly steps: - -* Group rows whose data is **less than or equal to 2** compared to the current row into the same Frame. Taking the partition with device d0 as an example (same for d1), for the count of rows: - * For the row with flow 1, since it is the smallest row, the entire Frame has only this row, returning 1; - * For the row with flow 3, note that CURRENT ROW exists as frame_end, so it is the last row of the entire peer group. 
There is 1 row smaller than it that meets the requirement, and the peer group has 2 rows, so the entire Frame has 3 rows, returning 3; - * For the row with flow 5, it itself contains 1 row, and there are 2 rows smaller than it that meet the requirement, so the entire Frame has 3 rows, returning 3. - -![](/img/window-function-6.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -### 2.3 Built-in Window Functions - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Window Function Category | Window Function Name | Function Definition | Supports FRAME Clause |
|--------------------------|----------------------------------|---------------------|-----------------------|
| Aggregate Function | All built-in aggregate functions | Aggregate a set of values to get a single aggregated result. | Yes |
| Value Function | first_value | Return the first value of the frame; if IGNORE NULLS is specified, skip leading NULLs | Yes |
| | last_value | Return the last value of the frame; if IGNORE NULLS is specified, skip trailing NULLs | Yes |
| | nth_value | Return the nth element of the frame (note that n starts from 1); if IGNORE NULLS is specified, skip NULLs | Yes |
| | lead | Return the element offset rows after the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default | No |
| | lag | Return the element offset rows before the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default | No |
| Rank Function | rank | Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there may be gaps between sequence numbers | No |
| | dense_rank | Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there are no gaps between sequence numbers | No |
| | row_number | Return the row number of the current row in the entire partition; note that the row number starts from 1 | No |
| | percent_rank | Return the sequence number of the current row's value in the entire partition as a percentage; i.e., (rank() - 1) / (n - 1), where n is the number of rows in the entire partition | No |
| | cume_dist | Return the sequence number of the current row's value in the entire partition as a percentage; i.e., (number of rows less than or equal to it) / n | No |
| | ntile | Divide the rows of the partition into n buckets as evenly as possible and number each row with its bucket, from 1 to n | No |
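
To make the differences between the rank functions easier to see, the following sketch evaluates three of them side by side over one named window on `device_flow`. It assumes that several window functions may share a single `WINDOW` definition in one `SELECT`; the aliases `rnk`, `dense_rnk` and `row_num` are illustrative names only.

```SQL
-- Sketch only: compare how the rank functions treat peers (rows with equal flow).
SELECT *,
       rank()       OVER w AS rnk,        -- hypothetical alias
       dense_rank() OVER w AS dense_rnk,  -- hypothetical alias
       row_number() OVER w AS row_num     -- hypothetical alias
FROM device_flow
WINDOW w AS (PARTITION BY device ORDER BY flow);
```

For device d0 (flow values 1, 3, 3, 5 after sorting), the individual results shown in 2.3.3 correspond to `rnk` = 1, 2, 2, 4, `dense_rnk` = 1, 2, 2, 3 and `row_num` = 1, 2, 3, 4: only `rank` leaves a gap after the tied rows, and only `row_number` separates them.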
- -#### 2.3.1 Aggregate Function - -All built-in aggregate functions such as `sum()`, `avg()`, `min()`, `max()` can be used as Window Functions. - -> Note: Unlike GROUP BY, each row has a corresponding output in the Window Function - -Example: - -```SQL -IoTDB> SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) as sum FROM device_flow; -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -#### 2.3.2 Value Function - -1. `first_value` - -* Function name: `first_value(value) [IGNORE NULLS]` -* Definition: Return the first value of the frame; if IGNORE NULLS is specified, skip leading NULLs; -* Example: - -```SQL -IoTDB> SELECT *, first_value(flow) OVER w as first_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+-----------+ -| time|device|flow|first_value| -+-----------------------------+------+----+-----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----------+ -``` - -2. 
`last_value` - -* Function name: `last_value(value) [IGNORE NULLS]` -* Definition: Return the last value of the frame; if IGNORE NULLS is specified, skip trailing NULLs; -* Example: - -```SQL -IoTDB> SELECT *, last_value(flow) OVER w as last_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|last_value| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 5| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -3. `nth_value` - -* Function name: `nth_value(value, n) [IGNORE NULLS]` -* Definition: Return the nth element of the frame (note that n starts from 1); if IGNORE NULLS is specified, skip NULLs; -* Example: - -```SQL -IoTDB> SELECT *, nth_value(flow, 2) OVER w as nth_values FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|nth_values| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -4. 
lead - -* Function name: `lead(value[, offset[, default]]) [IGNORE NULLS]` -* Definition: Return the element offset rows after the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default; the default value of offset is 1, and the default value of default is NULL. -* The lead function requires an ORDER BY window clause -* Example: - -```SQL -IoTDB> SELECT *, lead(flow) OVER w as lead FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time); -+-----------------------------+------+----+----+ -| time|device|flow|lead| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4|null| -|1970-01-01T08:00:00.000+08:00| d0| 3| 5| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 1| -|1970-01-01T08:00:03.000+08:00| d0| 1|null| -+-----------------------------+------+----+----+ -``` - -5. lag - -* Function name: `lag(value[, offset[, default]]) [IGNORE NULLS]` -* Definition: Return the element offset rows before the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default; the default value of offset is 1, and the default value of default is NULL. 
-* The lag function requires an ORDER BY window clause -* Example: - -```SQL -IoTDB> SELECT *, lag(flow) OVER w as lag FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time); -+-----------------------------+------+----+----+ -| time|device|flow| lag| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2|null| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3|null| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 5| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -+-----------------------------+------+----+----+ -``` - -#### 2.3.3 Rank Function - -1. rank - -* Function name: `rank()` -* Definition: Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there may be gaps between sequence numbers; -* Example: - -```SQL -IoTDB> SELECT *, rank() OVER w as rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -2. dense_rank - -* Function name: `dense_rank()` -* Definition: Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there are no gaps between sequence numbers. 
-* Example: - -```SQL -IoTDB> SELECT *, dense_rank() OVER w as dense_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|dense_rank| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+----------+ -``` - -3. row_number - -* Function name: `row_number()` -* Definition: Return the row number of the current row in the entire partition; note that the row number starts from 1; -* Example: - -```SQL -IoTDB> SELECT *, row_number() OVER w as row_number FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|row_number| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----------+ -``` - -4. 
percent_rank - -* Function name: `percent_rank()` -* Definition: Return the sequence number of the current row's value in the entire partition as a percentage; i.e., **(rank() - 1) / (n - 1)**, where n is the number of rows in the entire partition; -* Example: - -```SQL -IoTDB> SELECT *, percent_rank() OVER w as percent_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+------------------+ -| time|device|flow| percent_rank| -+-----------------------------+------+----+------------------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.0| -|1970-01-01T08:00:00.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:02.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+------------------+ -``` - -5. cume_dist - -* Function name: `cume_dist` -* Definition: Return the sequence number of the current row's value in the entire partition as a percentage; i.e., **(number of rows less than or equal to it) / n**. -* Example: - -```SQL -IoTDB> SELECT *, cume_dist() OVER w as cume_dist FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+---------+ -| time|device|flow|cume_dist| -+-----------------------------+------+----+---------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.5| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.25| -|1970-01-01T08:00:00.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:02.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+---------+ -``` - -6. ntile - -* Function name: `ntile` -* Definition: Specify n to number each row from 1 to n. 
- * If the number of rows in the entire partition is less than n, each row is numbered by its row index; - * If the number of rows in the entire partition is greater than n: - * If the number of rows is divisible by n, the rows are split evenly. For example, if the number of rows is 4 and n is 2, the numbers are 1, 1, 2, 2; - * If the number of rows is not divisible by n, the extra rows are assigned to the first few groups. For example, if the number of rows is 5 and n is 3, the numbers are 1, 1, 2, 2, 3; -* Example: - -```SQL -IoTDB> SELECT *, ntile(2) OVER w as ntile FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+-----+ -| time|device|flow|ntile| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -+-----------------------------+------+----+-----+ -``` - -### 2.4 Scenario Examples - -1. Multi-device diff function - -For each row of each device, calculate the difference from the previous row: - -```SQL -SELECT - *, - measurement - lag(measurement) OVER (PARTITION BY device ORDER BY time) -FROM data -WHERE timeCondition; -``` - -For each row of each device, calculate the difference from the next row: - -```SQL -SELECT - *, - measurement - lead(measurement) OVER (PARTITION BY device ORDER BY time) -FROM data -WHERE timeCondition; -``` - -For each row of a single device, calculate the difference from the previous row (same for the next row): - -```SQL -SELECT - *, - measurement - lag(measurement) OVER (ORDER BY time) -FROM data -WHERE device='d1' - AND timeCondition; -``` - -2. Multi-device TOP_K/BOTTOM_K - -Use rank to get the sequence number, then retain the desired order in the outer query. 
- -(Note: Window functions are evaluated after the HAVING clause, so filtering on their result requires a subquery) - -```SQL -SELECT * -FROM( - SELECT - *, - rank() OVER (PARTITION BY device ORDER BY time DESC) AS rank - FROM data - WHERE timeCondition -) -WHERE rank <= 3; -``` - -In addition to sorting by time, you can also sort by the value of the measurement point: - -```SQL -SELECT * -FROM( - SELECT - *, - rank() OVER (PARTITION BY device ORDER BY measurement DESC) AS rank - FROM data - WHERE timeCondition -) -WHERE rank <= 3; -``` - -3. Multi-device CHANGE_POINTS - -This query removes consecutive identical values from the input sequence, using lead and a subquery: - -```SQL -SELECT - time, - device, - measurement -FROM( - SELECT - time, - device, - measurement, - LEAD(measurement) OVER (PARTITION BY device ORDER BY time) AS next - FROM data - WHERE timeCondition -) -WHERE measurement != next OR next IS NULL; -``` diff --git a/src/UserGuide/latest-Table/User-Manual/Tree-to-Table_timecho.md b/src/UserGuide/latest-Table/User-Manual/Tree-to-Table_timecho.md deleted file mode 100644 index 729c7ada4..000000000 --- a/src/UserGuide/latest-Table/User-Manual/Tree-to-Table_timecho.md +++ /dev/null @@ -1,619 +0,0 @@ - - -# Tree-to-Table Mapping - -## 1. Functional Overview - -IoTDB introduces a tree-to-table function, which enables the creation of table views from existing tree-model data. This allows querying via table views, achieving collaborative processing of both tree and table models for the same dataset: - -* During the data writing phase, the tree-model syntax is used, supporting flexible data ingestion and expansion. -* During the data analysis phase, the table-model syntax is adopted, allowing complex data analysis through standard SQL queries. - -![](/img/tree-to-table-en-1.png) - -> * This feature is supported since version V2.0.5. -> * Table views are read-only, so data cannot be written through them. - - ## 2. 
Feature Description -### 2.1 Creating a Table View -#### 2.1.1 Syntax Definition - -```SQL --- create (or replace) view on tree -CREATE - [OR REPLACE] - VIEW view_name ([viewColumnDefinition (',' viewColumnDefinition)*]) - [comment] - [RESTRICT] - [WITH properties] - AS prefixPath - -viewColumnDefinition - : column_name [dataType] TAG [comment] # tagColumn - | column_name [dataType] TIME [comment] # timeColumn - | column_name [dataType] FIELD [FROM original_measurement] [comment] # fieldColumn - ; - -comment - : COMMENT string - ; -``` - -> Note: Columns only support tags, fields, or time; attributes are not supported. - -#### 2.1.2 Syntax Explanation -1. **`prefixPath`** - -Corresponds to the path in the tree model. The last level of the path must be `**`, and no other levels can contain `*`or `**`. This path determines the subtree corresponding to the VIEW. - -2. **`view_name`** - -The name of the view, which follows the same rules as a table name (for specific constraints, refer to [Create Table](../Basic-Concept/Table-Management_timecho.md#\_1-1-create-a-table)), e.g., `db.view`. - -3. **`viewColumnDefinition`** - -* `TAG`: Each TAG column corresponds, in order, to the path nodes at the levels following the `prefixPath`. -* `FIELD`: A FIELD column corresponds to a measurement (leaf node) in the tree model. - * If a FIELD column is specified, the column name uses the declared `column_name`. - * If `original_measurement`is declared, it maps directly to that measurement in the tree model. Otherwise, the lowercase `column_name`is used as the measurement name for mapping. - * Mapping multiple FIELD columns to the same measurement name in the tree model is not supported. - * If the `dataType`for a FIELD column is not specified, the system defaults to the data type of the mapped measurement in the tree model. 
- * If a device in the tree model does not contain certain declared FIELD columns, or if their data types are inconsistent with the declared FIELD columns, the value for that FIELD column will always be `NULL`when querying that device. - * If no FIELD columns are specified, the system automatically scans for all measurements under the `prefixPath`subtree (including all ordinary sequence measurements and measurements defined in any templates whose mounted paths overlap with the `prefixPath`) during creation. The column names will use the measurement names from the tree model. - * The tree model cannot have measurements with the same name (case-insensitive) but different data types. -* `TIME`: When creating a view, you do not need to specify a time column. IoTDB automatically adds a column named "time" and places it as the first column. Since version V2.0.8.2, views support **custom naming of the time column** during creation. The order of the custom time column in the view is determined by the order in the creation SQL. The related constraints are as follows: - * When the column category is set to `TIME`, the data type must be `TIMESTAMP`. - * Each view allows at most one time column (columnCategory = TIME). - * If no time column is explicitly defined, no other column can use `time` as its name to avoid conflicts with the system's default time column naming. - - -4. **`WITH properties`** - -Currently, only TTL is supported. It indicates that data older than TTL (in milliseconds) will not be displayed in query results, i.e., effectively `WHERE time > now() - TTL`. If a TTL is also set in the tree model, the query uses the smaller value of the two. - -> Note: The table view's TTL does not affect the actual TTL of the devices in the tree model. When data reaches the TTL set in the tree model, it will be physically deleted by the system. - -5. **`OR REPLACE`** - -A table and a view cannot have the same name. 
If a table with the same name already exists during creation, an error will be reported. If a view with the same name already exists, it will be replaced. - -6. **`RESTRICT`** - -This constrains the number of levels of the tree model devices that are matched (starting from the level below the `prefixPath`). If the `RESTRICT`keyword is present, only devices whose level count exactly equals the number of TAG columns are matched. Otherwise, devices whose level count is less than or equal to the number of TAG columns are matched. The default behavior is non-RESTRICT, meaning devices with a level count less than or equal to the number of TAG columns are matched. - -#### 2.1.3 Usage Example -1. Tree Model and Table View Schema - -![](/img/tree-to-table-en-2.png) - -2. Creating the Table View - -* Creation Statement: - -```SQL -CREATE OR REPLACE VIEW viewdb."wind_turbine" - (wind_turbine_group String TAG, - wind_turbine_number String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD - ) -with (ttl=604800000) -AS root.db.** -``` - -* Detailed Explanation - -This statement creates a view named `viewdb.wind_turbine`(an error will occur if `viewdb`does not exist). If the view already exists, it will be replaced. - -* It creates a table view for the time series mounted under the tree model path `root.db.**`. -* It has two `TAG` columns, `wind_turbine_group `and `wind_turbine_number`, so the table view will only include devices from the 3rd level of the original tree model. -* It has two `FIELD`columns, `voltage` and `current`. Here, these `FIELD` columns correspond to measurement names in the tree model that are also `voltage` and `current`, and only select time series of type `DOUBLE`. 
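Once the view exists, it can be queried with ordinary table-model SQL. The following is a minimal sketch, assuming the view from the creation statement above has been created; the tag value `'group1'` is hypothetical and only illustrates the shape of such a query:

```SQL
-- Assumes viewdb."wind_turbine" was created as shown above.
-- 'group1' is a hypothetical tag value, not taken from real data.
SELECT time, wind_turbine_group, wind_turbine_number, voltage, current
FROM viewdb.wind_turbine
WHERE wind_turbine_group = 'group1'
ORDER BY time DESC
LIMIT 10;
```

Because table views are read-only, only queries of this kind apply; data still has to be written through the tree model.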
- -**Renaming measurement requirement:** - -If the measurement name in the tree model is `current_new`, but you want the corresponding `FIELD` column name in the table view to be `current`, the SQL should be changed as follows: - -```SQL -CREATE OR REPLACE VIEW viewdb."wind_turbine" - (wind_turbine_group String TAG, - wind_turbine_number String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD FROM current_new - ) -with (ttl=604800000) -AS root.db.** -``` - -When customizing the time column (supported since V2.0.8.2), the SQL changes are as follows: - -```SQL -CREATE OR REPLACE VIEW viewdb."wind_turbine" - (wind_turbine_group String TAG, - wind_turbine_number String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD, - time_user_defined TIMESTAMP TIME - ) -with (ttl=604800000) -AS root.db.** -``` - - -### 2.2 Modifying a Table View -#### 2.2.1 Syntax Definition - -The ALTER VIEW function supports modifying the view name, adding columns, renaming columns, modifying FIELD column data type (supported since V2.0.8.2), deleting columns, setting the view's TTL property, and adding comments via COMMENT. 
- -```SQL --- Rename view -ALTER VIEW [IF EXISTS] viewName RENAME TO to=identifier - --- Add a column to the view -ALTER VIEW [IF EXISTS] viewName ADD COLUMN [IF NOT EXISTS] viewColumnDefinition -viewColumnDefinition - : column_name [dataType] TAG # tagColumn - | column_name [dataType] FIELD [FROM original_measurement] # fieldColumn - --- Rename a column in the view -ALTER VIEW [IF EXISTS] viewName RENAME COLUMN [IF EXISTS] oldName TO newName - --- Modify the data type of a FIELD column -ALTER VIEW [IF EXISTS] viewName ALTER COLUMN [IF EXISTS] columnName SET DATA TYPE new_type - --- Delete a column from the view -ALTER VIEW [IF EXISTS] viewName DROP COLUMN [IF EXISTS] columnName - --- Modify the view's TTL -ALTER VIEW [IF EXISTS] viewName SET PROPERTIES propertyAssignments - --- Add comments -COMMENT ON VIEW qualifiedName IS (string | NULL) #commentView -COMMENT ON COLUMN qualifiedName '.' column=identifier IS (string | NULL) #commentColumn -``` - -#### 2.2.2 Syntax Explanation -1. The `SET PROPERTIES`operation currently only supports configuring the TTL property for the table view. -2. The `DROP COLUMN`function only supports deleting FIELD columns; TAG columns cannot be deleted. -3. Modifying the comment will overwrite the original comment. If set to `null`, the previous comment will be erased. -4. When modifying the data type of a FIELD column, the new data type must be compatible with the original type. 
The supported conversions are listed in the following table: - -| Original Type | Convertible To Type | -|---------------|----------------------------------------------| -| INT32 | INT64, FLOAT, DOUBLE, TIMESTAMP, STRING, TEXT | -| INT64 | TIMESTAMP, DOUBLE, STRING, TEXT | -| FLOAT | DOUBLE, STRING, TEXT | -| DOUBLE | STRING, TEXT | -| BOOLEAN | STRING, TEXT | -| TEXT | BLOB, STRING | -| STRING | TEXT, BLOB | -| BLOB | STRING, TEXT | -| DATE | STRING, TEXT | -| TIMESTAMP | INT64, DOUBLE, STRING, TEXT | - -#### 2.2.3 Usage Examples - -```SQL --- Rename view -ALTER VIEW IF EXISTS tableview1 RENAME TO tableview - --- Add a column to the view -ALTER VIEW IF EXISTS tableview ADD COLUMN IF NOT EXISTS temperature float field - --- Rename a column in the view -ALTER VIEW IF EXISTS tableview RENAME COLUMN IF EXISTS temperature TO temp - --- Modify the data type of a FIELD column -ALTER VIEW IF EXISTS tableview ALTER COLUMN IF EXISTS temp SET DATA TYPE double - --- Delete a column from the view -ALTER VIEW IF EXISTS tableview DROP COLUMN IF EXISTS temp - --- Modify the view's TTL -ALTER VIEW IF EXISTS tableview SET PROPERTIES TTL=3600 - --- Add comments -COMMENT ON VIEW tableview IS 'Tree to Table' -COMMENT ON COLUMN tableview.status is Null -``` - -### 2.3 Deleting a Table View -#### 2.3.1 Syntax Definition - -```SQL -DROP VIEW [IF EXISTS] viewName -``` - -#### 2.3.2 Usage Example - -```SQL -DROP VIEW IF EXISTS tableview -``` - -### 2.4 Viewing Table Views -#### 2.4.1 **`Show Tables`** -1. Syntax Definition - -```SQL -SHOW TABLES (DETAILS)? ((FROM | IN) database_name)? -``` - -2. 
Syntax Explanation - -The `SHOW TABLES (DETAILS)` statement displays the type information of tables or views through the `TABLE_TYPE` field in the result set: - -| Type | `TABLE_TYPE` Field Value | -| -------------------------------------------- | ----------------------------- | -| Ordinary Table (Table) | `BASE TABLE` | -| Tree-to-Table View (Tree View) | `VIEW FROM TREE` | -| System Table (Information\_schema.Tables) | `SYSTEM VIEW` | - -3. Usage Examples - -```SQL -IoTDB> show tables details from database1 -+-----------+-----------+------+---------------+--------------+ -| TableName| TTL(ms)|Status| Comment| TableType| -+-----------+-----------+------+---------------+--------------+ -| tableview| INF| USING| Tree to Table |VIEW FROM TREE| -| table1|31536000000| USING| null| BASE TABLE| -| table2|31536000000| USING| null| BASE TABLE| -+-----------+-----------+------+---------------+--------------+ - -IoTDB> show tables details from information_schema -+--------------+-------+------+-------+-----------+ -| TableName|TTL(ms)|Status|Comment| TableType| -+--------------+-------+------+-------+-----------+ -| columns| INF| USING| null|SYSTEM VIEW| -| config_nodes| INF| USING| null|SYSTEM VIEW| -|configurations| INF| USING| null|SYSTEM VIEW| -| data_nodes| INF| USING| null|SYSTEM VIEW| -| databases| INF| USING| null|SYSTEM VIEW| -| functions| INF| USING| null|SYSTEM VIEW| -| keywords| INF| USING| null|SYSTEM VIEW| -| models| INF| USING| null|SYSTEM VIEW| -| nodes| INF| USING| null|SYSTEM VIEW| -| pipe_plugins| INF| USING| null|SYSTEM VIEW| -| pipes| INF| USING| null|SYSTEM VIEW| -| queries| INF| USING| null|SYSTEM VIEW| -| regions| INF| USING| null|SYSTEM VIEW| -| subscriptions| INF| USING| null|SYSTEM VIEW| -| tables| INF| USING| null|SYSTEM VIEW| -| topics| INF| USING| null|SYSTEM VIEW| -| views| INF| USING| null|SYSTEM VIEW| -+--------------+-------+------+-------+-----------+ -``` - -#### 2.4.2 **`Show Create View`** -1. 
Syntax Definition - -```SQL -SHOW CREATE VIEW viewname; -``` - -2. Syntax Explanation - -* This statement retrieves the complete definition of a table or view. -* It automatically fills in all default values omitted during creation, so the statement shown in the result may differ from the original CREATE statement. -* This statement does not support system tables. - -3. Usage Examples - -```SQL -IoTDB> show create view tableview -+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| View| Create View| -+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+ -|tableview|CREATE VIEW "tableview" ("device" STRING TAG,"model" STRING TAG,"status" BOOLEAN FIELD,"hardware" STRING FIELD) COMMENT '表视图' WITH (ttl=INF) AS root.ln.**| -+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+ -``` - -> Additionally, you can also use the `SHOW CREATE TABLE` statement to view the creation information of table views. For more details, see [show create table](../Basic-Concept/Table-Management_timecho.md#_1-4-view-table-creation-statement) - - -### 2.5 Query Differences Between Non-aligned and Aligned Devices - -Queries on tree-to-table views may yield different results compared to equivalent tree model `ALIGN BY DEVICE`queries when dealing with null values in aligned and non-aligned devices. - -* Aligned Devices - * Tree Model Query Behavior:Rows where all selected time series have null values are not retained. - * Table View Query Behavior:Consistent with the table model, rows where all selected fields are null are retained. 
-* Non-aligned Devices - * Tree Model Query Behavior:Rows where all selected time series have null values are not retained. - * Table View Query Behavior:Consistent with the tree model, rows where all selected fields are null are not retained. -* Explanation Example - * Aligned - - ```SQL - -- Write data in tree model (aligned) - CREATE ALIGNED TIMESERIES root.db.battery.b1(voltage INT32, current FLOAT) - INSERT INTO root.db.battery.b1(time, voltage, current) aligned values (1, 1, 1) - INSERT INTO root.db.battery.b1(time, voltage, current) aligned values (2, null, 1) - - -- Create VIEW statement - CREATE VIEW view1 (battery_id TAG, voltage INT32 FIELD, current FLOAT FIELD) as root.db.battery.** - - -- Query - IoTDB> select voltage from view1 - +-------+ - |voltage| - +-------+ - | 1| - | null| - +-------+ - Total line number = 2 - ``` - - * Non-aligned - - ```SQL - -- Write data in tree model (non-aligned) - CREATE TIMESERIES root.db.battery.b1.voltage INT32 - CREATE TIMESERIES root.db.battery.b1.current FLOAT - INSERT INTO root.db.battery.b1(time, voltage, current) values (1, 1, 1) - INSERT INTO root.db.battery.b1(time, voltage, current) values (2, null, 1) - - -- Create VIEW statement - CREATE VIEW view1 (battery_id TAG, voltage INT32 FIELD, current FLOAT FIELD) as root.db.battery.** - - -- Query - IoTDB> select voltage from view1 - +-------+ - |voltage| - +-------+ - | 1| - +-------+ - Total line number = 1 - - -- Can only ensure all rows are retrieved if the query specifies all FIELD columns, or only non-FIELD columns - IoTDB> select voltage,current from view1 - +-------+-------+ - |voltage|current| - +-------+-------+ - | 1| 1.0| - | null| 1.0| - +-------+-------+ - Total line number = 2 - - IoTDB> select battery_id from view1 - +-----------+ - |battery_id| - +-----------+ - | b1| - | b1| - +-----------+ - Total line number = 2 - - -- If the query involves only some FIELD columns, the final number of rows depends on the number of rows after aligning the 
specified FIELD columns by timestamp. - IoTDB> select time,voltage from view1 - +-----------------------------+-------+ - | time|voltage| - +-----------------------------+-------+ - |1970-01-01T08:00:00.001+08:00| 1| - +-----------------------------+-------+ - Total line number = 1 - ``` - -## 3. Scenario Examples -### 3.1 Managing Multiple Device Types in the Original Tree Model - -* The scenario involves managing different types of devices, each with its own hierarchical path and set of measurements. -* During Data Writing: Create branches under the database node according to device type. Each device type can have a different measurement structure. -* During Querying: Create a separate table for each device type. Each table will have different tags and sets of measurements. - -![](/img/tree-to-table-en-3.png) - -**SQL for Creating a Table View:** - -```SQL --- Wind Turbine Table -CREATE VIEW viewdb.wind_turbine - (wind_turbine_group String TAG, - wind_turbine_number String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD - ) -AS root.db.wind_turbine.** - --- Motor Table -CREATE VIEW viewdb.motor - ( motor_group String TAG, - motor_number String TAG, - power FLOAT FIELD, - electricity FLOAT FIELD, - temperature FLOAT FIELD - ) -AS root.db.motor.** -``` - -### 3.2 Original Tree Model Contains Only Measurements, No Devices - -This scenario occurs in systems like station monitoring where each measurement has a unique identifier but cannot be mapped to specific physical devices. - -> Wide Table Form - -![](/img/tree-to-table-en-4.png) - -**SQL for Creating a Table View:** - -```SQL -CREATE VIEW viewdb.machine - (DCS_PIT_02105A DOUBLE FIELD, - DCS_PIT_02105B DOUBLE FIELD, - DCS_PIT_02105C DOUBLE FIELD, - ... 
- DCS_XI_02716A DOUBLE FIELD - ) -AS root.db.** -``` - -### 3.3 Original Tree Model Where a Device Has Both Sub-devices and Measurements - -This scenario is common in energy storage systems where each hierarchical level requires monitoring of parameters like voltage and current. - -* Writing Phase: Model according to physical monitoring points at each hierarchical level -* Querying Phase: Create multiple tables based on device categories to manage information at each structural level - -![](/img/tree-to-table-en-5.png) - -**SQL for Creating a Table View:** - -```SQL --- Battery Compartment -CREATE VIEW viewdb.battery_compartment - (station String TAG, - battery_compartment String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD - ) -RESTRICT -AS root.db.** - --- Battery Stack -CREATE VIEW viewdb.battery_stack - (station String TAG, - battery_compartment String TAG, - battery_stack String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD - ) -RESTRICT -AS root.db.** - --- Battery Cluster -CREATE VIEW viewdb.battery_cluster - (station String TAG, - battery_compartment String TAG, - battery_stack String TAG, - battery_cluster String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD - ) -RESTRICT -AS root.db.** - --- Battery Cell -CREATE VIEW viewdb.battery_cell - (station String TAG, - battery_compartment String TAG, - battery_stack String TAG, - battery_cluster String TAG, - battery_cell String TAG, - voltage DOUBLE FIELD, - current DOUBLE FIELD - ) -RESTRICT -AS root.db.** -``` - -### 3.4 Original Tree Model Where a Device Has Only One Measurement Under It - -> Narrow Table Form - -#### 3.4.1 All Measurements Have the Same Data Type - -![](/img/tree-to-table-en-6.png) - -**SQL for Creating a Table View:** - -```SQL -CREATE VIEW viewdb.machine - ( - sensor_id STRING TAG, - value DOUBLE FIELD - ) -AS root.db.** -``` - -#### 3.4.2 Measurements Have Different Data Types -##### 3.4.2.1 Create a Narrow Table View for Each Data Type of Measurement - -**Advantage: 
​**The number of table views is constant, only related to the data types in the system. - -**Disadvantage: ​**When querying the value of a specific measurement, its data type must be known in advance to determine which table view to query. - -![](/img/tree-to-table-en-7.png) - -**SQL for Creating a Table View:** - -```SQL -CREATE VIEW viewdb.machine_float - ( - sensor_id STRING TAG, - value FLOAT FIELD - ) -AS root.db.** - -CREATE VIEW viewdb.machine_double - ( - sensor_id STRING TAG, - value DOUBLE FIELD - ) -AS root.db.** - -CREATE VIEW viewdb.machine_int32 - ( - sensor_id STRING TAG, - value INT32 FIELD - ) -AS root.db.** - -CREATE VIEW viewdb.machine_int64 - ( - sensor_id STRING TAG, - value INT64 FIELD - ) -AS root.db.** - -... -``` - -##### 3.4.2.2 Create a Table for Each Measurement - -**Advantage: ​**When querying the value of a specific measurement, there's no need to first check its data type to determine which table to query, making the process simple and convenient. - -**Disadvantage: ​**When there are a large number of measurements, it will introduce too many table views, requiring the writing of a large number of view creation statements. - -![](/img/tree-to-table-en-8.png) - -**SQL for Creating a Table View:** - -```SQL -CREATE VIEW viewdb.DCS_PIT_02105A - ( - value FLOAT FIELD - ) -AS root.db.DCS_PIT_02105A.** - -CREATE VIEW viewdb.DCS_PIT_02105B - ( - value DOUBLE FIELD - ) -AS root.db.DCS_PIT_02105B.** - -CREATE VIEW viewdb.DCS_XI_02716A - ( - value INT64 FIELD - ) -AS root.db.DCS_XI_02716A.** - -...... 
-``` diff --git a/src/UserGuide/latest-Table/User-Manual/Window-Function_timecho.md b/src/UserGuide/latest-Table/User-Manual/Window-Function_timecho.md deleted file mode 100644 index 11675903e..000000000 --- a/src/UserGuide/latest-Table/User-Manual/Window-Function_timecho.md +++ /dev/null @@ -1,759 +0,0 @@ - - -# Window Functions - -For time-series data feature analysis scenarios, IoTDB provides the capability of window functions, which deliver a flexible and efficient solution for in-depth mining and complex computation of time-series data. The following sections will elaborate on the feature in detail. - -## 1. Function Overview - -Window Functions perform calculations on each row based on a specific set of rows related to the current row (called a "window"). It combines grouping operations (`PARTITION BY`), sorting (`ORDER BY`), and definable calculation ranges (window frame `FRAME`), enabling complex cross-row calculations without collapsing the original data rows. It is commonly used in data analysis scenarios such as ranking, cumulative sums, moving averages, etc. - -> Note: This feature is available starting from version V 2.0.5. - -For example, in a scenario where you need to query the cumulative power consumption values of different devices, you can achieve this using window functions. 
- -```SQL --- Original data -+-----------------------------+------+-----+ -| time|device| flow| -+-----------------------------+------+-----+ -|1970-01-01T08:00:00.000+08:00| d0| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| -|1970-01-01T08:00:02.000+08:00| d0| 3| -|1970-01-01T08:00:03.000+08:00| d0| 1| -|1970-01-01T08:00:04.000+08:00| d1| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| -+-----------------------------+------+-----+ - --- Create table and insert data -CREATE TABLE device_flow(device String tag, flow INT32 FIELD); -insert into device_flow(time, device ,flow ) values ('1970-01-01T08:00:00.000+08:00','d0',3),('1970-01-01T08:00:01.000+08:00','d0',5),('1970-01-01T08:00:02.000+08:00','d0',3),('1970-01-01T08:00:03.000+08:00','d0',1),('1970-01-01T08:00:04.000+08:00','d1',2),('1970-01-01T08:00:05.000+08:00','d1',4); - - --- Execute window function query -SELECT *, sum(flow) OVER(PARTITION BY device ORDER BY flow) as sum FROM device_flow; -``` - -After grouping, sorting, and calculation (the steps are broken down in the figure below), - -![](/img/window-function-1.png) - -the expected results can be obtained: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -## 2. Function Definition - -### 2.1 SQL Definition - -```SQL -windowDefinition - : name=identifier AS '(' windowSpecification ')' - ; - -windowSpecification - : (existingWindowName=identifier)? - (PARTITION BY partition+=expression (',' partition+=expression)*)? - (ORDER BY sortItem (',' sortItem)*)? - windowFrame? 
- ; - -windowFrame - : frameExtent - ; - -frameExtent - : frameType=RANGE start=frameBound - | frameType=ROWS start=frameBound - | frameType=GROUPS start=frameBound - | frameType=RANGE BETWEEN start=frameBound AND end=frameBound - | frameType=ROWS BETWEEN start=frameBound AND end=frameBound - | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound - ; - -frameBound - : UNBOUNDED boundType=PRECEDING #unboundedFrame - | UNBOUNDED boundType=FOLLOWING #unboundedFrame - | CURRENT ROW #currentRowBound - | expression boundType=(PRECEDING | FOLLOWING) #boundedFrame - ; -``` - -### 2.2 Window Definition - -#### 2.2.1 Partition - -`PARTITION BY` is used to divide data into multiple independent, unrelated "groups". Window functions can only access and operate on data within their respective groups, and cannot access data from other groups. This clause is optional; if not explicitly specified, all data is divided into the same group by default. It is worth noting that unlike `GROUP BY` which aggregates a group of data into a single row, the window function with `PARTITION BY` **does not affect the number of rows within the group.** - -* Example - -Query statement: - -```SQL -IoTDB> SELECT *, count(flow) OVER (PARTITION BY device) as count FROM device_flow; -``` - -Disassembly steps: - -![](/img/window-function-2.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 4| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -|1970-01-01T08:00:02.000+08:00| d0| 3| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 4| -+-----------------------------+------+----+-----+ -``` - -#### 2.2.2 Ordering - -`ORDER BY` is used to sort data within a partition. After sorting, rows with equal values are called peers. 
Peers affect the behavior of window functions; for example, different rank functions handle peers differently, and different frame division methods also handle peers differently. This clause is optional. - -* Example - -Query statement: - -```SQL -IoTDB> SELECT *, rank() OVER (PARTITION BY device ORDER BY flow) as rank FROM device_flow; -``` - -Disassembly steps: - -![](/img/window-function-3.png) - -Query result: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -#### 2.2.3 Framing - -For each row in a partition, the window function evaluates on a corresponding set of rows called a Frame (i.e., the input domain of the Window Function on each row). The Frame can be specified manually, involving two attributes when specified, as detailed below. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Frame Attribute | Attribute Value | Value Description |
|------------------------|---------------------|-------------------|
| Type | ROWS | Divide the frame by row number |
| | GROUPS | Divide the frame by peers, i.e., rows with the same value are regarded as equivalent. All rows in peers are grouped into one group called a peer group |
| | RANGE | Divide the frame by value |
| Start and End Position | UNBOUNDED PRECEDING | The first row of the entire partition |
| | offset PRECEDING | Represents the row with an "offset" distance from the current row in the preceding direction |
| | CURRENT ROW | The current row |
| | offset FOLLOWING | Represents the row with an "offset" distance from the current row in the following direction |
| | UNBOUNDED FOLLOWING | The last row of the entire partition |
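As an illustration of how the frame type and the start and end positions from the table combine, the sketch below uses the `device_flow` table from the overview. The aliases `moving_sum` and `partition_sum` are illustrative names only. No output is shown, because the relative order of rows with equal `flow` values inside a partition is not guaranteed.

```SQL
-- ROWS frame: the previous row, the current row, and the next row
SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow
                          ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS moving_sum
FROM device_flow;

-- Frame covering the whole partition, from its first row to its last row
SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow
                          ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS partition_sum
FROM device_flow;
```

The first query gives a small moving window around each row; the second returns the same total for every row of a partition, since every frame spans the full partition.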
- -Among them, the meanings of `CURRENT ROW`, `PRECEDING N`, and `FOLLOWING N` vary with the type of frame, as shown in the following table: - -| | `ROWS` | `GROUPS` | `RANGE` | -|--------------------|------------|------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------| -| `CURRENT ROW` | Current row | Since a peer group contains multiple rows, this option differs depending on whether it acts on frame_start and frame_end: * frame_start: the first row of the peer group; * frame_end: the last row of the peer group. | Same as GROUPS, differing depending on whether it acts on frame_start and frame_end: * frame_start: the first row of the peer group; * frame_end: the last row of the peer group. | -| `offset PRECEDING` | The previous offset rows | The previous offset peer groups; | Rows whose value difference from the current row in the preceding direction is less than or equal to offset are grouped into one frame | -| `offset FOLLOWING` | The following offset rows | The following offset peer groups. | Rows whose value difference from the current row in the following direction is less than or equal to offset are grouped into one frame | - -The syntax format is as follows: - -```SQL --- Specify both frame_start and frame_end -{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end --- Specify only frame_start, frame_end is CURRENT ROW -{ RANGE | ROWS | GROUPS } frame_start -``` - -If the Frame is not specified manually, the default Frame division rules are as follows: - -* When the window function uses ORDER BY: The default Frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (i.e., from the first row of the window to the current row). For example: In RANK() OVER(PARTITION BY COL1 ORDER BY COL2), the Frame defaults to include the current row and all preceding rows in the partition. 
-* When the window function does not use ORDER BY: The default Frame is RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING (i.e., all rows in the entire window). For example: In AVG(COL2) OVER(PARTITION BY col1), the Frame defaults to include all rows in the partition, calculating the average of the entire partition. - -It should be noted that when the Frame type is GROUPS or RANGE, `ORDER BY` must be specified. The difference is that ORDER BY in GROUPS can involve multiple fields, while RANGE requires calculation and thus can only specify one field. - -* Example - -1. Frame type is ROWS - -Query statement: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ROWS 1 PRECEDING) as count FROM device_flow; -``` - -Disassembly steps: - -* Take the previous row and the current row as the Frame - * For the first row of the partition, since there is no previous row, the entire Frame has only this row, returning 1; - * For other rows of the partition, the entire Frame includes the current row and its previous row, returning 2: - -![](/img/window-function-4.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 2| -+-----------------------------+------+----+-----+ -``` - -2. Frame type is GROUPS - -Query statement: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ORDER BY flow GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -Disassembly steps: - -* Take the previous peer group and the current peer group as the Frame. 
Taking the partition with device d0 as an example (same for d1), for the count of rows: - * For the peer group with flow 1, since there are no peer groups smaller than it, the entire Frame has only this row, returning 1; - * For the peer group with flow 3, it itself contains 2 rows, and the previous peer group is the one with flow 1 (1 row), so the entire Frame has 3 rows, returning 3; - * For the peer group with flow 5, it itself contains 1 row, and the previous peer group is the one with flow 3 (2 rows), so the entire Frame has 3 rows, returning 3. - -![](/img/window-function-5.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -3. Frame type is RANGE - -Query statement: - -```SQL -IoTDB> SELECT *,count(flow) OVER(PARTITION BY device ORDER BY flow RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -Disassembly steps: - -* Group rows whose data is **less than or equal to 2** compared to the current row into the same Frame. Taking the partition with device d0 as an example (same for d1), for the count of rows: - * For the row with flow 1, since it is the smallest row, the entire Frame has only this row, returning 1; - * For the row with flow 3, note that CURRENT ROW exists as frame_end, so it is the last row of the entire peer group. 
There is 1 row smaller than it that meets the requirement, and the peer group has 2 rows, so the entire Frame has 3 rows, returning 3; - * For the row with flow 5, it itself contains 1 row, and there are 2 rows smaller than it that meet the requirement, so the entire Frame has 3 rows, returning 3. - -![](/img/window-function-6.png) - -Query result: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -## 3. Built-in Window Functions - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Window Function Category | Window Function Name | Function Definition | Supports FRAME Clause |
| ------------------------ | -------------------- | ------------------- | --------------------- |
| Aggregate Function | All built-in aggregate functions | Aggregate a set of values to get a single aggregated result. | Yes |
| Value Function | first_value | Return the first value of the frame; if IGNORE NULLS is specified, skip leading NULLs | Yes |
| Value Function | last_value | Return the last value of the frame; if IGNORE NULLS is specified, skip trailing NULLs | Yes |
| Value Function | nth_value | Return the nth element of the frame (note that n starts from 1); if IGNORE NULLS is specified, skip NULLs | Yes |
| Value Function | lead | Return the element offset rows after the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default | No |
| Value Function | lag | Return the element offset rows before the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default | No |
| Rank Function | rank | Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there may be gaps between sequence numbers | No |
| Rank Function | dense_rank | Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there are no gaps between sequence numbers | No |
| Rank Function | row_number | Return the row number of the current row in the entire partition; note that the row number starts from 1 | No |
| Rank Function | percent_rank | Return the sequence number of the current row's value in the entire partition as a percentage; i.e., (rank() - 1) / (n - 1), where n is the number of rows in the entire partition | No |
| Rank Function | cume_dist | Return the sequence number of the current row's value in the entire partition as a percentage; i.e., (number of rows less than or equal to it) / n | No |
| Rank Function | ntile | Specify n to number each row from 1 to n. | No |
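
The rank function definitions in the table can be checked by hand with a small sketch. The Python below is only an illustration of the formulas as stated, not IoTDB's implementation; the helper name `rank_functions` and its dictionary layout are hypothetical. It assumes the input is a single partition already sorted in ascending `ORDER BY` order.

```python
from bisect import bisect_left, bisect_right


def rank_functions(sorted_values, buckets):
    """Evaluate the rank-style window functions for one partition.

    sorted_values: the ORDER BY column of one partition, ascending.
    buckets: the n passed to ntile(n).
    Returns one dict per row, in input order.
    """
    n = len(sorted_values)
    distinct = sorted(set(sorted_values))
    # ntile: every bucket gets `base` rows, the first `extra` buckets get one more.
    base, extra = divmod(n, buckets)
    boundary = extra * (base + 1)  # index of the first row in a "small" bucket

    rows = []
    for i, value in enumerate(sorted_values):
        rank = bisect_left(sorted_values, value) + 1    # rows strictly before + 1
        dense_rank = bisect_left(distinct, value) + 1   # distinct values before + 1
        rows_le = bisect_right(sorted_values, value)    # rows <= current value
        if i < boundary:
            ntile = i // (base + 1) + 1
        else:
            ntile = extra + (i - boundary) // base + 1
        rows.append({
            "value": value,
            "rank": rank,
            "dense_rank": dense_rank,
            "row_number": i + 1,
            "percent_rank": (rank - 1) / (n - 1) if n > 1 else 0.0,
            "cume_dist": rows_le / n,
            "ntile": ntile,
        })
    return rows


# The d0 partition used in the examples of this chapter has flow values 1, 3, 3, 5.
d0 = rank_functions([1, 3, 3, 5], buckets=2)
for row in d0:
    print(row)
```

For the d0 partition this yields ranks 1, 2, 2, 4, dense ranks 1, 2, 2, 3, and `cume_dist` values 0.25, 0.75, 0.75, 1.0, which agree with the query results shown for each function in section 3.3.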
- -### 3.1 Aggregate Function - -All built-in aggregate functions such as `sum()`, `avg()`, `min()`, `max()` can be used as Window Functions. - -> Note: Unlike GROUP BY, each row has a corresponding output in the Window Function - -Example: - -```SQL -IoTDB> SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) as sum FROM device_flow; -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -### 3.2 Value Function - -1. `first_value` - -* Function name: `first_value(value) [IGNORE NULLS]` -* Definition: Return the first value of the frame; if IGNORE NULLS is specified, skip leading NULLs; -* Example: - -```SQL -IoTDB> SELECT *, first_value(flow) OVER w as first_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+-----------+ -| time|device|flow|first_value| -+-----------------------------+------+----+-----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----------+ -``` - -2. 
`last_value` - -* Function name: `last_value(value) [IGNORE NULLS]` -* Definition: Return the last value of the frame; if IGNORE NULLS is specified, skip trailing NULLs; -* Example: - -```SQL -IoTDB> SELECT *, last_value(flow) OVER w as last_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|last_value| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 5| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -3. `nth_value` - -* Function name: `nth_value(value, n) [IGNORE NULLS]` -* Definition: Return the nth element of the frame (note that n starts from 1); if IGNORE NULLS is specified, skip NULLs; -* Example: - -```SQL -IoTDB> SELECT *, nth_value(flow, 2) OVER w as nth_values FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|nth_values| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -4. 
lead - -* Function name: `lead(value[, offset[, default]]) [IGNORE NULLS]` -* Definition: Return the element offset rows after the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default; the default value of offset is 1, and the default value of default is NULL. -* The lead function requires an ORDER BY window clause -* Example: - -```SQL -IoTDB> SELECT *, lead(flow) OVER w as lead FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time); -+-----------------------------+------+----+----+ -| time|device|flow|lead| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4|null| -|1970-01-01T08:00:00.000+08:00| d0| 3| 5| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 1| -|1970-01-01T08:00:03.000+08:00| d0| 1|null| -+-----------------------------+------+----+----+ -``` - -5. lag - -* Function name: `lag(value[, offset[, default]]) [IGNORE NULLS]` -* Definition: Return the element offset rows before the current row (if IGNORE NULLS is specified, NULLs are not considered); if no such element exists (exceeding the partition range), return default; the default value of offset is 1, and the default value of default is NULL. -* The lag function requires an ORDER BY window clause -* Example: - -```SQL -IoTDB> SELECT *, lag(flow) OVER w as lag FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time); -+-----------------------------+------+----+----+ -| time|device|flow| lag| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2|null| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3|null| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 5| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -+-----------------------------+------+----+----+ -``` - -### 3.3 Rank Function - -1.
rank - -* Function name: `rank()` -* Definition: Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there may be gaps between sequence numbers; -* Example: - -```SQL -IoTDB> SELECT *, rank() OVER w as rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -2. dense_rank - -* Function name: `dense_rank()` -* Definition: Return the sequence number of the current row in the entire partition; rows with the same value have the same sequence number, and there are no gaps between sequence numbers. -* Example: - -```SQL -IoTDB> SELECT *, dense_rank() OVER w as dense_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|dense_rank| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+----------+ -``` - -3. 
row_number - -* Function name: `row_number()` -* Definition: Return the row number of the current row in the entire partition; note that the row number starts from 1; -* Example: - -```SQL -IoTDB> SELECT *, row_number() OVER w as row_number FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|row_number| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----------+ -``` - -4. percent_rank - -* Function name: `percent_rank()` -* Definition: Return the sequence number of the current row's value in the entire partition as a percentage; i.e., **(rank() - 1) / (n - 1)**, where n is the number of rows in the entire partition; -* Example: - -```SQL -IoTDB> SELECT *, percent_rank() OVER w as percent_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+------------------+ -| time|device|flow| percent_rank| -+-----------------------------+------+----+------------------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.0| -|1970-01-01T08:00:00.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:02.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+------------------+ -``` - -5. cume_dist - -* Function name: `cume_dist` -* Definition: Return the sequence number of the current row's value in the entire partition as a percentage; i.e., **(number of rows less than or equal to it) / n**. 
-* Example: - -```SQL -IoTDB> SELECT *, cume_dist() OVER w as cume_dist FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+---------+ -| time|device|flow|cume_dist| -+-----------------------------+------+----+---------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.5| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.25| -|1970-01-01T08:00:00.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:02.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+---------+ -``` - -6. ntile - -* Function name: `ntile(n)` -* Definition: Divide the rows of the partition into n buckets and number each row with its bucket, from 1 to n. - * If the partition has fewer rows than n, each row's number is its row index; - * If the partition has at least n rows: - * If the number of rows is divisible by n, every bucket gets the same number of rows. For example, with 4 rows and n = 2, the numbers are 1, 1, 2, 2; - * If the number of rows is not divisible by n, the remaining rows go to the first buckets, one each. For example, with 5 rows and n = 3, the numbers are 1, 1, 2, 2, 3; -* Example: - -```SQL -IoTDB> SELECT *, ntile(2) OVER w as ntile FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+-----+ -| time|device|flow|ntile| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -+-----------------------------+------+----+-----+ -``` - -## 4. Scenario Examples - -1.
Multi-device diff function - -For each row of each device, calculate the difference from the previous row: - -```SQL -SELECT - *, - measurement - lag(measurement) OVER (PARTITION BY device ORDER BY time) -FROM data -WHERE timeCondition; -``` - -For each row of each device, calculate the difference from the next row: - -```SQL -SELECT - *, - measurement - lead(measurement) OVER (PARTITION BY device ORDER BY time) -FROM data -WHERE timeCondition; -``` - -For each row of a single device, calculate the difference from the previous row (same for the next row): - -```SQL -SELECT - *, - measurement - lag(measurement) OVER (ORDER BY time) -FROM data -WHERE device='d1' - AND timeCondition; -``` - -2. Multi-device TOP_K/BOTTOM_K - -Use rank to number the rows of each device, then keep only the desired ranks in the outer query. - -(Note: Window functions are evaluated after the HAVING clause, so the filter on rank must be applied in an outer query) - -```SQL -SELECT * -FROM( - SELECT - *, - rank() OVER (PARTITION BY device ORDER BY time DESC) AS rank - FROM data - WHERE timeCondition -) -WHERE rank <= 3; -``` - -In addition to sorting by time, you can also sort by the value of the measurement point: - -```SQL -SELECT * -FROM( - SELECT - *, - rank() OVER (PARTITION BY device ORDER BY measurement DESC) AS rank - FROM data - WHERE timeCondition -) -WHERE rank <= 3; -``` - -3.
Multi-device CHANGE_POINTS - -This SQL is used to remove consecutive identical values in the input sequence, which can be achieved with lead + subquery: - -```SQL -SELECT - time, - device, - measurement -FROM( - SELECT - time, - device, - measurement, - LEAD(measurement) OVER (PARTITION BY device ORDER BY time) AS next - FROM data - WHERE timeCondition -) -WHERE measurement != next OR next IS NULL; -``` diff --git a/src/UserGuide/latest/AI-capability/AINode_Upgrade_timecho.md b/src/UserGuide/latest/AI-capability/AINode_Upgrade_timecho.md deleted file mode 100644 index 54f4d63b6..000000000 --- a/src/UserGuide/latest/AI-capability/AINode_Upgrade_timecho.md +++ /dev/null @@ -1,663 +0,0 @@ - - -# AINode - -AINode is a native IoTDB node that supports the registration, management, and invocation of time series related models, with built-in industry-leading self-developed time series large models such as the Tsinghua University's Timer series. It can be invoked through standard SQL statements to achieve millisecond-level real-time inference on time series data, supporting applications such as time series trend prediction, missing value imputation, and anomaly detection. - -The system architecture is shown in the following diagram: - -![](/img/AINode-0-en.png) - -The responsibilities of the three nodes are as follows: - -* **ConfigNode**: Responsible for distributed node management and load balancing. -* **DataNode**: Responsible for receiving and parsing user SQL requests; responsible for storing time series data; responsible for data preprocessing calculations. -* **AINode**: Responsible for managing and using time series models. - -## 1. Advantages - -Compared to building machine learning services separately, it has the following advantages: - -* **Simple and easy to use**: No need to use Python or Java programming, SQL statements can be used to complete the entire process of machine learning model management and inference. 
For example, creating a model can use the CREATE MODEL statement, and using a model for inference can use the CALL INFERENCE (...) statement, etc., which is simpler and more convenient to use. -* **Avoid data migration**: Using IoTDB native machine learning can directly apply time series data stored in IoTDB to machine learning model inference, without moving data to a separate machine learning service platform, thus accelerating data processing, improving security, and reducing costs. - -![](/img/AInode1.png) - -* **Built-in advanced algorithms**: Supports industry-leading machine learning analysis algorithms, covering typical time series analysis tasks, empowering time series databases with native data analysis capabilities. Such as: - * **Time Series Forecasting**: Learn change patterns from past time series to output the most likely prediction of future sequences based on given past observations. - * **Anomaly Detection for Time Series**: Detect and identify anomalies in given time series data to help discover abnormal behavior in time series. - -## 2. Basic Concepts - -* **Model**: Machine learning model, which takes time series data as input and outputs the results or decisions of the analysis task. The model is the basic management unit of AINode, supporting the addition (registration), deletion, query, modification (fine-tuning), and use (inference) of models. -* **Create**: Load external designed or trained model files or algorithms into AINode, managed and used by IoTDB. -* **Inference**: Use the created model to complete the time series analysis task applicable to the model on the specified time series data. -* **Built-in**: AINode comes with common time series analysis scenario (e.g., prediction and anomaly detection) machine learning algorithms or self-developed models. - -![](/img/AInode2.png) - -## 3. 
Installation and Deployment - -AINode deployment can be referenced in the documentation [AINode Deployment](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md). - -## 4. Usage Guide - -TimechoDB-AINode supports three major functions: model inference, model fine-tuning, and model management (registration, viewing, deletion, loading, unloading, etc.). The following sections will explain them in detail. - -### 4.1 Model Inference - -SQL syntax as follows: - -```SQL -call inference(,inputSql,(=)*) -``` - -After completing the model registration (built-in model inference does not require a registration process), the inference function of the model can be used by calling the inference function with the call keyword. The corresponding parameter descriptions are as follows: - -* **model\_id**: Corresponds to an already registered model -* **sql**: SQL query statement, the result of the query is used as the input for model inference. The dimensions of the rows and columns in the query result need to match the size specified in the specific model config. (It is not recommended to use the `SELECT *` clause in this sql, because in IoTDB, `*` does not sort columns, so the column order is undefined. It is recommended to use `SELECT ot` to ensure that the column order matches the expected input of the model.) -* **parameterName/parameterValue**: currently supported: - - | Parameter Name | Parameter Type | Parameter Description | Default Value | - | ---------------- | -------------- | ----------------------- | -------------- | - | **generateTime** | boolean | Whether to include a timestamp column in the result | false | - | **outputLength** | int | Specifies the output length of the result | 96 | - -Notes: -1. The prerequisite for using built-in time series large models for inference is that the local machine has the corresponding model weights, located at `/TIMECHODB_AINODE_HOME/data/ainode/models/builtin/model_id/`. 
If the local machine does not have model weights, it will automatically pull from HuggingFace. Please ensure that the local machine can directly access HuggingFace. -2. In deep learning applications, it is common to use time-derived features (the time column in the data) as covariates and input them into the model together with the data to improve model performance. However, the time column is generally not included in the model's output results. To ensure universality, the model inference result only corresponds to the model's true output. If the model does not output a time column, the result will not contain it. - -**Example** - -Sample data [ETTh-tree](/img/ETTh-tree.csv) - -Below is an example of using the sundial model for inference. The input is 96 rows, and the output is 48 rows. We use SQL to perform the inference. - -```SQL -IoTDB> select OT from root.db.** -+-----------------------------+---------------+ -| Time|root.db.etth.OT| -+-----------------------------+---------------+ -|2016-07-01T00:00:00.000+08:00| 30.531| -|2016-07-01T01:00:00.000+08:00| 27.787| -|2016-07-01T02:00:00.000+08:00| 27.787| -|2016-07-01T03:00:00.000+08:00| 25.044| -|2016-07-01T04:00:00.000+08:00| 21.948| -| ...... | ...... | -|2016-07-04T19:00:00.000+08:00| 29.546| -|2016-07-04T20:00:00.000+08:00| 29.475| -|2016-07-04T21:00:00.000+08:00| 29.264| -|2016-07-04T22:00:00.000+08:00| 30.953| -|2016-07-04T23:00:00.000+08:00| 31.726| -+-----------------------------+---------------+ -Total line number = 96 - -IoTDB> call inference(sundial,"select OT from root.db.**", generateTime=True, outputLength=48) -+-----------------------------+------------------+ -| Time| output| -+-----------------------------+------------------+ -|2016-07-04T23:00:00.000+08:00|30.537494659423828| -|2016-07-04T23:59:22.500+08:00|29.619892120361328| -|2016-07-05T00:58:45.000+08:00|28.815832138061523| -|2016-07-05T01:58:07.500+08:00| 27.91131019592285| -|2016-07-05T02:57:30.000+08:00|26.893848419189453| -| ...... 
| ...... | -|2016-07-06T17:33:07.500+08:00| 24.40607261657715| -|2016-07-06T18:32:30.000+08:00| 25.00441551208496| -|2016-07-06T19:31:52.500+08:00|24.907312393188477| -|2016-07-06T20:31:15.000+08:00|25.156436920166016| -|2016-07-06T21:30:37.500+08:00|25.335433959960938| -+-----------------------------+------------------+ -Total line number = 48 -``` - -### 4.2 Model Fine-Tuning - -AINode supports model fine-tuning through SQL. - -**SQL Syntax** - -```SQL -createModel - | CREATE MODEL modelId=identifier (WITH HYPERPARAMETERS LR_BRACKET hparamPair (COMMA hparamPair)* RR_BRACKET)? FROM MODEL existingModelId=identifier ON DATASET LR_BRACKET trainingData RR_BRACKET - ; - -trainingData - : dataElement(COMMA dataElement)* - ; - -dataElement - : pathPatternElement (LR_BRACKET timeRange RR_BRACKET)? - ; - -pathPatternElement - : PATH path=prefixPath - ; -``` - -**Parameter Description** - -| Name | Description | -| ------ |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| modelId | The unique identifier of the fine-tuned model | -| hparamPair | Key-value pairs of hyperparameters used for fine-tuning, currently supported:
`train_epochs`: int type, number of fine-tuning epochs
`iter_per_epoch`: int type, number of iterations per epoch
`learning_rate`: double type, learning rate | -| existingModelId | The base model used for fine-tuning | -| trainingData | The dataset used for fine-tuning | - -**Example** - -1. Select the data of the measurement point root.db.etth.ot within a specified time range as the fine-tuning dataset, and create the model sundialv2 based on sundial. - -```SQL -IoTDB> CREATE MODEL sundialv2 FROM MODEL sundial ON DATASET (PATH root.db.etth.OT([1467302400000, 1467644400000))) -Msg: The statement is executed successfully. -IoTDB> show models -+---------------------+---------+-----------+---------+ -| ModelId|ModelType| Category| State| -+---------------------+---------+-----------+---------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -| sundialv2| sundial| fine_tuned| training| -+---------------------+---------+-----------+---------+ -``` - -2. Fine-tuning tasks are started asynchronously in the background, and logs can be seen in the AINode process; after fine-tuning is completed, query and use the new model. 
- -```SQL -IoTDB> show models -+---------------------+---------+-----------+---------+ -| ModelId|ModelType| Category| State| -+---------------------+---------+-----------+---------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -| sundialv2| sundial| fine_tuned| active| -+---------------------+---------+-----------+---------+ -``` - -### 4.3 Register Custom Models - -**Transformers models that meet the following requirements can be registered to AINode:** - -1. AINode currently uses transformers version v4.56.2, so when building the model, it is necessary to **avoid inheriting low-version (<4.50) interfaces**; -2. The model needs to inherit a type of AINode inference task pipeline (currently supports the forecasting pipeline): - * iotdb-core/ainode/iotdb/ainode/core/inference/pipeline/basic\_pipeline.py - - **Before V2.0.9.3** - ```Python - class BasicPipeline(ABC): - def __init__(self, model_id, **model_kwargs): - self.model_info = model_info - self.device = model_kwargs.get("device", "cpu") - self.model = load_model(model_info, device_map=self.device, **model_kwargs) - - @abstractmethod - def preprocess(self, inputs, **infer_kwargs): - """ - Preprocess the input data before the inference task starts, including shape validation and numerical conversion. - """ - pass - - @abstractmethod - def postprocess(self, output, **infer_kwargs): - """ - Postprocess the output results after the inference task is completed. 
- """ - pass - - - class ForecastPipeline(BasicPipeline): - def __init__(self, model_info, **model_kwargs): - super().__init__(model_info, model_kwargs=model_kwargs) - - def preprocess(self, inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], **infer_kwargs): - """ - Preprocess the input data before passing it to the model for inference, validating the shape and type of the input data. - - Args: - inputs (list[dict]): - Input data, a list of dictionaries, each dictionary contains: - - 'targets': Tensor with shape (input_length,) or (target_count, input_length). - - 'past_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - 'future_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - infer_kwargs (dict, optional): Additional keyword arguments for inference, such as: - - `output_length`(int): Used to validate the validity of 'future_covariates' if provided. - - Raises: - ValueError: If the input format is invalid (e.g., missing keys, invalid tensor shapes). - - Returns: - Preprocessed and validated input data that can be directly used for model inference. - """ - pass - - def forecast(self, inputs, **infer_kwargs): - """ - Perform forecasting on the given inputs. - - Parameters: - inputs: Input data for forecasting. The type and structure depend on the specific model implementation. - **infer_kwargs: Additional inference parameters, e.g.: - - `output_length`(int): The number of time points the model should generate. - - Returns: - Forecast output, the specific form depends on the specific model implementation. - """ - pass - - def postprocess(self, outputs: list[torch.Tensor], **infer_kwargs) -> list[torch.Tensor]: - """ - Postprocess the model outputs after inference, validating the shape of the output data and ensuring it meets the expected dimensions. - - Args: - outputs: - Model outputs, a list of 2D tensors, each tensor with shape `[target_count, output_length]`. 
- - Raises: - InferenceModelInternalException: If the output tensor shape is invalid (e.g., incorrect dimensions). - ValueError: If the output format is incorrect. - - Returns: - list[torch.Tensor]: - Postprocessed outputs, which will be a list of 2D tensors. - """ - pass - ``` - - **From V2.0.9.3 onwards** - ```Python - class BasicPipeline(ABC): - def __init__(self, model_info, **model_kwargs): - self.model_info = model_info - self.device = model_kwargs.get("device", "cpu") - self.model = load_model(model_info, device_map=self.device, **model_kwargs) - - @abstractmethod - def preprocess(self, inputs, **infer_kwargs): - """ - Preprocess the input data before the inference task starts, including shape validation and numerical conversion. - """ - pass - - @abstractmethod - def postprocess(self, output, **infer_kwargs): - """ - Postprocess the output results after the inference task is completed. - """ - pass - - - class ForecastPipeline(BasicPipeline): - def __init__(self, model_info, **model_kwargs): - super().__init__(model_info, **model_kwargs) - - def _preprocess( - self, - inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], - **infer_kwargs, - ): - """ - Preprocess the input data before passing it to the model for inference, validating the shape and type of the input data. - - Args: - inputs (list[dict[str, dict[str, torch.Tensor] | torch.Tensor]]): - Input data, a list of dictionaries, each dictionary contains: - - 'targets': Tensor with shape (input_length,) or (target_count, input_length). - - 'past_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - 'future_covariates': Optional, dictionary of tensors, each tensor with shape (input_length,). - - infer_kwargs (dict, optional): Additional keyword arguments for inference, such as: - - `output_length`(int): Used to validate 'future_covariates' if provided. 
- - Raises: - ValueError: If the input format is invalid (e.g., missing keys, invalid tensor shapes). - - Returns: - Preprocessed and validated input data that can be directly used for model inference. - """ - pass - - def forecast(self, inputs, **infer_kwargs): - """ - Perform forecasting on the given inputs. - - Parameters: - inputs: Input data for forecasting. The type and structure depend on the specific model implementation. - **infer_kwargs: Additional inference parameters, e.g.: - - `output_length`(int): The number of time points the model should generate. - - Returns: - Forecast output, the specific form depends on the specific model implementation. - """ - pass - - def _postprocess(self, outputs, **infer_kwargs) -> list[torch.Tensor]: - """ - Postprocess the model outputs after inference, validating the shape of the output data and ensuring it meets the expected dimensions. - - Args: - outputs: - Model outputs, a list of 2D tensors, each tensor with shape `[target_count, output_length]`. - - Raises: - InferenceModelInternalException: If the output tensor shape is invalid (e.g., incorrect dimensions). - ValueError: If the output format is incorrect. - - Returns: - list[torch.Tensor]: - Postprocessed outputs, which will be a list of 2D tensors. - """ - pass - ``` - -3. 
Modify the model configuration file `config.json` to ensure it contains the following fields: - - **Before V2.0.9.3** - ```JSON - { - "auto_map": { - "AutoConfig": "config.Chronos2CoreConfig", // Specify the model Config class - "AutoModelForCausalLM": "model.Chronos2Model" // Specify the model class - }, - "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // Specify the inference pipeline for the model - "model_type": "custom_t5", // Specify the model type - } - ``` - * The model Config class and model class **must** be specified via `auto_map`; - * The inference pipeline class **must** be inherited and specified; - * For built-in and user-defined models managed by AINode, `model_type` also serves as a unique non-duplicable identifier. That is, the model type to be registered must not duplicate any existing model types; models created via fine-tuning will inherit the model type of the original model. - - **From V2.0.9.3 onwards** - > The `model_type` parameter is **not required** - ```JSON - { - "auto_map": { - "AutoConfig": "config.Chronos2CoreConfig", // Specify the model Config class - "AutoModelForCausalLM": "model.Chronos2Model" // Specify the model class - }, - "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // Specify the inference pipeline for the model - } - ``` - * The model Config class and model class **must** be specified via `auto_map`; - * The inference pipeline class **must** be inherited and specified; - -4. Ensure that the model directory to be registered contains the following files, and the model configuration file name and weight file name are not customizable: - * Model configuration file: config.json; - * Model weight file: model.safetensors; - * Model code: other .py files. 
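
Before running `CREATE MODEL`, the requirements above can be checked locally. The following is a minimal sketch, not part of AINode: the helper name `check_model_dir` is hypothetical, the `auto_map` key names are taken from the example `config.json` above, and only the fields required from V2.0.9.3 onwards are checked. It also assumes `config.json` is plain JSON; Python's `json` module rejects `//` comments and trailing commas, so the annotations shown in the examples would have to be removed first.

```Python
import json
from pathlib import Path

# File names fixed by AINode, per the list above.
REQUIRED_FILES = ("config.json", "model.safetensors")
# Key names taken from the example config.json (an assumption of this sketch).
REQUIRED_AUTO_MAP_KEYS = ("AutoConfig", "AutoModelForCausalLM")


def check_model_dir(model_dir):
    """Return a list of problems found; an empty list means all checks passed."""
    root = Path(model_dir)
    problems = [
        f"missing file: {name}"
        for name in REQUIRED_FILES
        if not (root / name).is_file()
    ]
    # The model code is expected as additional .py files.
    if not any(root.glob("*.py")):
        problems.append("no .py model code found")

    config_path = root / "config.json"
    if not config_path.is_file():
        return problems

    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        problems.append(f"config.json is not valid JSON: {exc}")
        return problems

    auto_map = config.get("auto_map") if isinstance(config, dict) else None
    if not isinstance(auto_map, dict):
        problems.append("config.json: 'auto_map' must be an object")
    else:
        for key in REQUIRED_AUTO_MAP_KEYS:
            if key not in auto_map:
                problems.append(f"config.json: auto_map lacks '{key}'")

    if not isinstance(config, dict) or "pipeline_cls" not in config:
        problems.append("config.json: 'pipeline_cls' is missing")
    return problems
```

Calling `check_model_dir("/path/to/chronos2")` returns an empty list when the directory layout and the configuration fields look complete. It does not verify that the referenced classes can actually be imported.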
- -**The SQL syntax for registering a custom model is as follows:** - -```SQL -CREATE MODEL <model_id> USING URI <uri> -``` - -**Parameter Description:** - -* **model\_id**: The unique identifier of the custom model. It must not duplicate an existing model ID and is subject to the following constraints: - * Allowed characters: [0-9 a-z A-Z \_ ] (letters, digits, and underscores; digits and underscores are not allowed as the first character) - * Length limit: 2-64 characters - * Case-sensitive -* **uri**: The local URI of the directory containing the model code and weights. - -**Registration Example:** - -Upload a custom Transformers model from a local path. AINode will copy the folder to the user\_defined directory. - -```SQL -CREATE MODEL chronos2 USING URI 'file:///path/to/chronos2' -``` - -After the SQL is executed, registration proceeds asynchronously. The registration status of the model can be checked with `SHOW MODELS` (see section 4.4 View Models). Once the model is registered, it can be invoked for inference through normal queries. - -### 4.4 View Models - -Registered models can be viewed with the following command. - -```SQL -SHOW MODELS -``` - -In addition to displaying all model information, you can specify a `model_id` to view the information of a specific model. - -```SQL -SHOW MODELS <model_id> -- Only display specific model -``` - -The result contains the following columns: - -| **ModelId** | **ModelType** | **Category** | **State** | -| ------------------- | --------------------- | -------------------- | ----------------- | -| Model ID | Model Type | Model Category | Model State | - -The transitions of the State column are shown in the following diagram: - -![](/img/ainode-upgrade-state-timecho-en.png) - -State machine flow explanation: - -1. After AINode starts, the `show models` command displays only **system built-in (BUILTIN)** models. -2. 
Users can import their own models, which are identified as **user-defined (USER_DEFINED)**; AINode will attempt to parse the model type (ModelType) from the model configuration file; if parsing fails, this field will display as empty. -3. Time series large models (built-in models) weight files are not packaged with AINode, AINode automatically downloads them when starting. - 1. During download, it is ACTIVATING, and after successful download, it becomes ACTIVE, failure becomes INACTIVE. -4. After users start a model fine-tuning task, the model state is TRAINING, and after successful training, it becomes ACTIVE, failure becomes FAILED. -5. If the fine-tuning task is successful, after fine-tuning, the model will automatically rename the best checkpoint (training file) based on the best metric and become the user-specified model\_id. - -**View Example** - -```SQL -IoTDB> show models -+---------------------+--------------+--------------+-------------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------+--------------+-------------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| custom| | user_defined| active| -| timer_xl| timer| builtin| activating| -| sundial| sundial| builtin| active| -| sundialx_1| sundial| fine_tuned| active| -| sundialx_4| sundial| fine_tuned| training| -| sundialx_5| sundial| fine_tuned| failed| -| chronos2| t5| builtin| inactive| -+---------------------+--------------+--------------+-------------+ -``` - -Built-in traditional time series models are introduced as follows: - -| Model Name | Core Concept | Applicable Scenario | Main Features | -|------------|--------------|---------------------|---------------| -| **ARIMA** (Autoregressive 
Integrated Moving Average) | Combines autoregression (AR), differencing (I), and moving average (MA), used for predicting stationary time series or data that can be made stationary through differencing. | Univariate time series prediction, such as stock prices, sales, economic indicators. | 1. Suitable for linear trends and weak seasonality data. 2. Need to select parameters (p,d,q). 3. Sensitive to missing values. | -| **Holt-Winters** (Three-parameter exponential smoothing) | Based on exponential smoothing, introduces three components: level, trend, and seasonality, suitable for data with trend and seasonality. | Time series with clear seasonality and trend, such as monthly sales, power demand. | 1. Can handle additive or multiplicative seasonality. 2. Gives higher weight to recent data. 3. Simple to implement. | -| **Exponential Smoothing** | Uses weighted average of historical data, with weights decreasing exponentially over time, emphasizing the importance of recent observations. | Time series without obvious seasonality but with trend, such as short-term demand prediction. | 1. Few parameters, simple calculation. 2. Suitable for stationary or slowly changing sequences. 3. Can be extended to double or triple exponential smoothing. | -| **Naive Forecaster** | Uses the observation value from the previous period as the prediction for the next period, the simplest baseline model. | As a comparison baseline for other models, or simple prediction when data has no obvious pattern. | 1. No training required. 2. Sensitive to sudden changes. 3. Seasonal naive variant can use the value from the same period of the previous season for prediction. | -| **STL Forecaster** (Seasonal-Trend Decomposition) | Based on STL decomposition of time series, predict trend, seasonal, and residual components separately and combine. | Time series with complex seasonality, trend, and non-linear patterns, such as climate data, traffic flow. | 1. Can handle non-fixed seasonality. 2. 
Robust to outliers. 3. After decomposition, other models can be combined to predict each component. | -| **Gaussian HMM** (Gaussian Hidden Markov Model) | Assumes that observed data is generated by hidden states, with each state's observed probability following a Gaussian distribution. | State sequence prediction or classification, such as speech recognition, financial state recognition. | 1. Suitable for modeling time series state. 2. Assumes observed values are independent given the state. 3. Need to specify the number of hidden states. | -| **GMM HMM** (Gaussian Mixture Hidden Markov Model) | An extension of Gaussian HMM, where the observed probability of each state is described by a Gaussian Mixture Model, capturing more complex observed distributions. | Scenarios requiring multi-modal observed distributions, such as complex action recognition, bio-signal analysis. | 1. More flexible than single Gaussian. 2. More parameters, higher computational complexity. 3. Need to train the number of GMM components. | -| **STRAY** (Singular Value-based Anomaly Detection) | Detects anomalies in high-dimensional data through Singular Value Decomposition (SVD), commonly used for time series anomaly detection. | High-dimensional time series anomaly detection, such as sensor networks, IT system monitoring. | 1. No distribution assumption required. 2. Can handle high-dimensional data. 3. Sensitive to global anomalies, may miss local anomalies. | - -Built-in time series large models are introduced as follows: - -| Model Name | Core Concept | Applicable Scenario | Main Features | -|------------|--------------|---------------------|---------------| -| **Timer-XL** | Time series large model supporting ultra-long context, enhanced generalization ability through large-scale industrial data pre-training. | Complex industrial prediction requiring extremely long historical data, such as energy, aerospace, transportation. | 1. 
Ultra-long context support, can handle tens of thousands of time points as input. 2. Multi-scenario coverage, supports non-stationary, multi-variable, and covariate prediction. 3. Pre-trained on trillions of high-quality industrial time series data. | -| **Timer-Sundial** | A generative foundational model using "Transformer + TimeFlow" architecture, focused on probabilistic prediction. | Zero-shot prediction scenarios requiring quantification of uncertainty, such as finance, supply chain, new energy power generation. | 1. Strong zero-shot generalization ability, supports point prediction and probabilistic prediction. 2. Flexible analysis of any statistical characteristics of the prediction distribution. 3. Innovative generative architecture, achieving efficient non-deterministic sample generation. | -| **Chronos-2** | A universal time series foundational model based on discrete tokenization paradigm, transforming prediction into language modeling tasks. | Fast zero-shot univariate prediction, and scenarios that can leverage covariates (e.g., promotions, weather) to improve results. | 1. Strong zero-shot probabilistic prediction ability. 2. Supports covariate unified modeling, but has strict input requirements: a. The set of names of future covariates must be a subset of the set of names of historical covariates; b. The length of each historical covariate must equal the length of the target variable; c. The length of each future covariate must equal the prediction length; 3. Uses an efficient encoder-based structure, balancing performance and inference speed. | - -### 4.5 Delete Models - -Registered models can be deleted through SQL. AINode will delete the corresponding model folder in the user\_defined directory. The SQL syntax is as follows: - -```SQL -DROP MODEL <model_id> -``` - -Specify the `model_id` of a successfully registered model to delete it. 
Since model deletion involves model data cleanup, the operation will not be completed immediately, and the model status becomes DROPPING, and the model in this state cannot be used for model inference. Note that this feature does not support deleting built-in models. - -### 4.6 Load/Unload Models - -To adapt to different scenarios, AINode provides the following two model loading strategies: - -* **On-demand loading**: Load the model temporarily when inference is performed, and release resources after completion. Suitable for testing or low-load scenarios. -* **Persistent loading**: Load the model persistently in memory (CPU) or GPU memory, to support high-concurrency inference. Users only need to specify the model to load or unload through SQL, and AINode will automatically manage the number of instances. The status of the persistent model can also be viewed at any time. - -The following sections will detail the loading/unloading model content: - -1. Configuration parameters - -Support editing the following configuration items to set persistent loading related parameters. - -```Properties -# The proportion of device memory/GPU memory available for AINode inference -# Datatype: Float -ain_inference_memory_usage_ratio=0.4 - -# The proportion of memory that each loaded model instance needs to occupy, i.e., model occupancy * this value -# Datatype: Float -ain_inference_extra_memory_ratio=1.2 -``` - -2. Show available devices - -Support viewing all available device IDs through the following SQL command. - -```SQL -SHOW AI_DEVICES -``` - -Example - -```SQL -IoTDB> show ai_devices -+-------------+ -| DeviceId| -+-------------+ -| cpu| -| 0| -| 1| -+-------------+ -``` - -3. Load model - -Support loading models through the following SQL command, and the system will **automatically balance** the number of model instances based on hardware resource usage. 
- -```SQL -LOAD MODEL <existing_model_id> TO DEVICES <device_id>(, <device_id>)* -``` - -Parameter requirements - -* **existing\_model\_id:** The ID of the model to load. The current version only supports timer\_xl and sundial. -* **device\_id:** The location where the model is loaded. - * **cpu:** Load to the memory of the AINode server. - * **gpu\_id:** Load to the corresponding GPU of the AINode server, e.g., "0, 1" means load to the two GPUs numbered 0 and 1. - -Example - -```SQL -LOAD MODEL sundial TO DEVICES 'cpu,0,1' -``` - -4. Unload model - -Specified models can be unloaded with the following SQL command; the system will **reallocate** the freed resources to other models. - -```SQL -UNLOAD MODEL <existing_model_id> FROM DEVICES <device_id>(, <device_id>)* -``` - -Parameter requirements - -* **existing\_model\_id:** The ID of the model to unload. The current version only supports timer\_xl and sundial. -* **device\_id:** The location from which the model is unloaded. - * **cpu:** Attempt to unload the specified model from the memory of the AINode server. - * **gpu\_id:** Attempt to unload the specified model from the corresponding GPU of the AINode server, e.g., "0, 1" means attempt to unload from the two GPUs numbered 0 and 1. - -Example - -```SQL -UNLOAD MODEL sundial FROM DEVICES 'cpu,0,1' -``` - -5. Show loaded models - -Manually loaded models can be viewed with the following SQL command; optionally restrict the output to specific devices via `device_id`. 
- -```SQL -SHOW LOADED MODELS -SHOW LOADED MODELS <device_id>(, <device_id>)* -- View models in specified devices -``` - -Example: the sundial model is loaded in memory (cpu) and on GPU 0 and GPU 1 - -```SQL -IoTDB> show loaded models -+-------------+--------------+------------------+ -| DeviceId| ModelId| Count(instances)| -+-------------+--------------+------------------+ -| cpu| sundial| 4| -| 0| sundial| 6| -| 1| sundial| 6| -+-------------+--------------+------------------+ -``` - -Explanation: -* DeviceId: Device ID -* ModelId: Loaded model ID -* Count(instances): Number of model instances on each device (automatically assigned by the system) - -### 4.7 Introduction to Time Series Large Models - -AINode currently supports multiple time series large models. For an introduction and deployment instructions, please refer to [Time Series Large Models](../AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md) - -## 5. Permission Management - -AINode features rely on IoTDB's own authentication for permission management. Users can use the model management features only if they have the USE_MODEL permission. To use the inference feature, users also need permission to read the source time series referenced by the SQL that supplies the model input. 
- -| Permission Name | Permission Scope | Administrator User (Default ROOT) | Ordinary User | Path-related | -| --------------- | ----------------- | ------------------------------- | -------------- | ------------ | -| USE_MODEL | create model / show models / drop model | √ | √ | x | -| READ_DATA | call inference | √ | √ | √ | \ No newline at end of file diff --git a/src/UserGuide/latest/AI-capability/AINode_timecho.md b/src/UserGuide/latest/AI-capability/AINode_timecho.md deleted file mode 100644 index 7c8672e82..000000000 --- a/src/UserGuide/latest/AI-capability/AINode_timecho.md +++ /dev/null @@ -1,693 +0,0 @@ - - -# AINode - -AINode is a native IoTDB node that supports the registration, management, and invocation of time-series-related models. It comes with built-in industry-leading self-developed time-series large models, such as the Timer series developed by Tsinghua University. These models can be invoked through standard SQL statements, enabling real-time inference of time series data at the millisecond level, and supporting application scenarios such as trend forecasting, missing value imputation, and anomaly detection for time series data. - -> Available since V2.0.5.1 - -The system architecture is shown below: -::: center - -::: - -The responsibilities of the three nodes are as follows: - -- **ConfigNode:** - - Manages distributed nodes and handles load balancing across the system. -- **DataNode:** - - Receives and parses user SQL queries. - - Stores time-series data. - - Performs preprocessing computations on raw data. -- **AINode:** - - Manages and utilizes time-series models (including training/inference). - - Supports deep learning and machine learning workflows. - -## 1. 
Advantageous features - -Compared with building a machine learning service alone, it has the following advantages: - -- **Simple and easy to use**: no need to use Python or Java programming, the complete process of machine learning model management and inference can be completed using SQL statements. Creating a model can be done using the CREATE MODEL statement, and using a model for inference can be done using the CALL INFERENCE (...) statement, making it simpler and more convenient to use. - -- **Avoid Data Migration**: With IoTDB native machine learning, data stored in IoTDB can be directly applied to the inference of machine learning models without having to move the data to a separate machine learning service platform, which accelerates data processing, improves security, and reduces costs. - -![](/img/AInode1.png) - -- **Built-in Advanced Algorithms**: supports industry-leading machine learning analytics algorithms covering typical timing analysis tasks, empowering the timing database with native data analysis capabilities. Such as: - - **Time Series Forecasting**: learns patterns of change from past time series; thus outputs the most likely prediction of future series based on observations at a given past time. - - **Anomaly Detection for Time Series**: detects and identifies outliers in a given time series data, helping to discover anomalous behaviour in the time series. - - **Annotation for Time Series (Time Series Annotation)**: Adds additional information or markers, such as event occurrence, outliers, trend changes, etc., to each data point or specific time period to better understand and analyse the data. - - - -## 2. Basic Concepts - -- **Model**: A machine learning model takes time series data as input and outputs analysis task results or decisions. Models are the basic management units of AINode, supporting model operations such as creation (registration), deletion, query, modification (fine-tuning), and usage (inference). 
-- **Create**: Load externally designed or trained model files/algorithms into AINode for unified management and usage by IoTDB. -- **Inference**: Use the created model to complete time series analysis tasks applicable to the model on specified time series data. -- **Built-in Capabilities**: AINode comes with machine learning algorithms or self-developed models for common time series analysis scenarios (e.g., forecasting and anomaly detection). - -::: center - -::: - -## 3. Installation and Deployment - -The deployment of AINode can be found in the document [AINode Deployment](../Deployment-and-Maintenance/AINode_Deployment_apache.md). - -## 4. Usage Guidelines - -AINode provides model creation and deletion functions for time series models. Built-in models do not require creation and can be used directly. - - -### 4.1 Registering Models - - -Trained deep learning models can be registered by specifying their input and output vector dimensions for inference. - -Models that meet the following criteria can be registered with AINode: - -1. AINode currently supports models trained with PyTorch 2.4.0. Features introduced after version 2.4.0 should be avoided. -2. AINode supports models stored using PyTorch JIT (`model.pt`), which must include both the model structure and weights. -3. The model input sequence can include single or multiple columns. If multi-column, it must match the model capabilities and configuration file. -4. Model configuration parameters must be clearly defined in the `config.yaml` file. When using the model, the input and output dimensions defined in `config.yaml` must be strictly followed. Mismatches with the configuration file will cause errors. - -The SQL syntax for model registration is defined as follows: - -```SQL -create model <model_id> using uri <uri> -``` - -Detailed meanings of SQL parameters: - -- **model_id**: The globally unique identifier of the model; it must not duplicate an existing one. 
Model names have the following constraints: - - Allowed characters: [0-9 a-z A-Z _] (letters, digits (not at the beginning), underscores (not at the beginning)) - - Length: 2-64 characters - - Case-sensitive -- **uri**: The resource path of the model registration files, which should include the **model structure and weight file `model.pt` and the model configuration file `config.yaml`** - - - **Model structure and weight file**: The weight file generated after model training, currently supporting `.pt` files from PyTorch training. - - - **Model configuration file**: Parameters related to the model structure provided during registration, which must include input and output dimensions for inference: - - | **Parameter Name** | **Description** | **Example** | - | ------------------ | -------------------------------- | ----------- | - | input_shape | Rows and columns of model input | [96,2] | - | output_shape | Rows and columns of model output | [48,2] | - - In addition to inference, data types of input and output can also be specified: - - | **Parameter Name** | **Description** | **Example** | - | ------------------ | ------------------------- | ---------------------- | - | input_type | Data type of model input | ['float32', 'float32'] | - | output_type | Data type of model output | ['float32', 'float32'] | - - Additional notes can be specified for model management display: - - | **Parameter Name** | **Description** | **Example** | - | ------------------ | --------------------------------------------- | -------------------------------------------- | - | attributes | Optional notes set by users for model display | 'model_type': 'dlinear', 'kernel_size': '25' | - -In addition to registering local model files, remote resource paths can be specified via URIs for registration, using open-source model repositories (e.g., HuggingFace). 
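
The `model_id` constraints listed above (letters, digits, and underscores; no digit or underscore in first position; 2 to 64 characters; case-sensitive) can be checked on the client side before issuing the SQL. The following is a small sketch under that reading of the rules; the helper name `is_valid_model_id` is hypothetical and not an AINode API.

```Python
import re

# First character: a letter. Remaining 1 to 63 characters: letters, digits or
# underscores. This gives a total length of 2 to 64 characters.
MODEL_ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{1,63}")


def is_valid_model_id(model_id: str) -> bool:
    """Return True if model_id satisfies the documented naming constraints."""
    return MODEL_ID_PATTERN.fullmatch(model_id) is not None
```

Because identifiers are case-sensitive, the pattern is applied without any case folding: `Dlinear` and `dlinear` are both accepted and refer to different models.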
- - -#### Example - -The [example folder](https://github.com/apache/iotdb/tree/master/integration-test/src/test/resources/ainode-example) contains model.pt (trained model) and config.yaml with the following content: - -```YAML -configs: - # Required - input_shape: [96, 2] # Model accepts 96 rows x 2 columns of data - output_shape: [48, 2] # Model outputs 48 rows x 2 columns of data - - # Optional (default to all float32, column count matches shape) - input_type: ["int64", "int64"] # Data types of inputs, must match input column count - output_type: ["text", "int64"] # Data types of outputs, must match output column count - -attributes: # Optional user-defined notes - 'model_type': 'dlinear' - 'kernel_size': '25' -``` - -Register the model by specifying this folder as the loading path: - -```SQL -IoTDB> create model dlinear_example using uri "file://./example" -``` - -After SQL execution, registration proceeds asynchronously. The registration status can be checked with `show models` (see section 4.2 Viewing Models). How long registration takes depends mainly on the model file size. - -Once registered, the model can be invoked for inference through normal query syntax. - -### 4.2 Viewing Models - -Registered models can be queried using the `show models` command. The SQL definitions are: - -```SQL -show models - -show models <model_id> -``` - -In addition to displaying all models, specifying a `model_id` shows details of a specific model. The display includes: - -| **ModelId** | **ModelType** | **Category** | **State** | -|-------------|---------------|----------------|-------------| -| Model ID | Model Type | Model Category | Model State | - -- Model State Transition Diagram - -![](/img/AINode-State-en.png) - -**Instructions:** - -1. Initialization: - - When AINode starts, `show models` only displays BUILT-IN models. -2. Custom Model Import: - - Users can import custom models (marked as USER-DEFINED). - - The system attempts to parse the `ModelType` from the config file. 
- - If parsing fails, the field remains empty. -3. Foundation Model Weights: - - Time-series foundation model weights are not bundled with AINode. - - AINode automatically downloads them during startup. - - While the download is in progress, the state is LOADING. -4. Download Outcomes: - - Success → State changes to ACTIVE. - - Failure → State changes to INACTIVE. -5. Fine-Tuning Process: - - When fine-tuning starts: State becomes TRAINING. - - Successful training → State transitions to ACTIVE. - - Training failure → State changes to FAILED. - -**Example** - -```SQL -IoTDB> show models -+---------------------+--------------------+--------------+---------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+--------------+---------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| custom| | USER-DEFINED| ACTIVE| -| timerxl| Timer-XL| BUILT-IN| LOADING| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| sundialx_1| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx_2| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx_4| Timer-Sundial| FINE-TUNED| TRAINING| -| sundialx_5| Timer-Sundial| FINE-TUNED| FAILED| -+---------------------+--------------------+--------------+---------+ -``` - -### 4.3 Deleting Models - -Registered models can be deleted via SQL, which removes all related files under AINode: - -```SQL -drop model <model_id> -``` - -Specify the registered `model_id` to delete the model. Since deletion involves data cleanup, the operation is not immediate, and the model state becomes `DROPPING`, during which it cannot be used for inference. **Note:** Built-in models cannot be deleted. 
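
Taken together, the states described in sections 4.2 and 4.3 form a small state machine. The sketch below restates those transitions as a lookup table purely for illustration. It is not AINode code: the event names are invented, and the document does not say from which states a model may enter `DROPPING`, so the single `ACTIVE` to `DROPPING` entry is an assumption.

```Python
# (current state, event) -> next state.
# Event names are hypothetical labels for the situations described in the text.
TRANSITIONS = {
    ("LOADING", "download_succeeded"): "ACTIVE",
    ("LOADING", "download_failed"): "INACTIVE",
    ("TRAINING", "training_succeeded"): "ACTIVE",
    ("TRAINING", "training_failed"): "FAILED",
    # Assumption: only the ACTIVE -> DROPPING path is modelled here.
    ("ACTIVE", "drop_requested"): "DROPPING",
}


def next_state(state: str, event: str) -> str:
    """Return the state that follows `state` when `event` occurs."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state} on {event}") from None
```

A table like this is convenient in client-side tooling that polls `show models` and wants to decide whether a model is still making progress or has reached an end state such as `FAILED` or `INACTIVE`.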
- -### 4.4 Inference with Built-in Models - -The SQL syntax is as follows: - - -```SQL -call inference(<built_in_model_name>,inputSql,(<parameterName>=<parameterValue>)*) - -window_function: - head(window_size) - tail(window_size) - count(window_size,sliding_step) -``` - -Built-in models do not need to be registered before use. Inference is invoked through the `call` keyword, with the following parameters: - -- **built_in_model_name**: built-in model name -- **parameterName**: parameter name -- **parameterValue**: parameter value - -- **Note**: To use a built-in time series large model for inference, the corresponding model weights must be stored locally in the directory `/IOTDB_AINODE_HOME/data/ainode/models/weights/model_id/`. If the weights are not present locally, they will be automatically downloaded from HuggingFace. Ensure your environment has direct access to HuggingFace. - -#### Built-in Models and Parameter Descriptions - -The following machine learning models are currently built in; refer to the links below for detailed parameter descriptions. 
- -| Model | built_in_model_name | Task type | Parameter description | -| -------------------- | --------------------- | -------- |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Arima | _Arima | Forecast | [Arima Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.arima.ARIMA.html?highlight=Arima) | -| STLForecaster | _STLForecaster | Forecast | [STLForecaster Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.trend.STLForecaster.html#sktime.forecasting.trend.STLForecaster) | -| NaiveForecaster | _NaiveForecaster | Forecast | [NaiveForecaster Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.naive.NaiveForecaster.html#naiveforecaster) | -| ExponentialSmoothing | _ExponentialSmoothing | Forecast | [ExponentialSmoothing Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.exp_smoothing.ExponentialSmoothing.html) | -| GaussianHMM | _GaussianHMM | Annotation | [GaussianHMMParameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.detection.hmm_learn.gaussian.GaussianHMM.html) | -| GMMHMM | _GMMHMM | Annotation | [GMMHMM Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.detection.hmm_learn.gmm.GMMHMM.html) | -| Stray | _Stray | Anomaly detection | [Stray Parameter description](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.detection.stray.STRAY.html) | - - -After completing the registration of the model, the inference function can be used by calling the inference function through the call keyword, and its corresponding parameters are described as follows: - -- **model_id**: corresponds to a registered model -- **sql**: sql query 
statement whose result is used as the model input for inference. The numbers of rows and columns in the query result must match the input size specified in the model config. (Using `SELECT *` here is not recommended: in IoTDB, `*` does not sort the columns, so their order is undefined. Use an explicit list such as `SELECT s0,s1` so that the column order matches what the model expects.) -- **window_function**: A window function applied to the query result during inference. Three window functions are currently provided: - - **head(window_size)**: Takes the first window_size points of the data for model inference; this window can be used for data cropping. - ![](/img/AINode-call1.png) - - - **tail(window_size)**: Takes the last window_size points of the data for model inference; this window can be used for data cropping. - ![](/img/AINode-call2.png) - - - **count(window_size, sliding_step)**: A sliding window based on the number of points. The data in each window is passed through the model separately. In the example below, a window function with window_size 2 divides the input dataset into three windows, and inference runs on each window to produce its own result. This window can be used for continuous inference. - ![](/img/AINode-call3.png) - -**Explanation 1**: A window can crop rows when the number of rows returned by the sql query does not match the number of input rows the model requires. Note that when the number of columns does not match, or the number of rows is less than the model requires, inference cannot proceed and an error message is returned. 
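The row selection performed by the three window functions can be sketched in plain Python. This is only an illustrative model of the behaviour described above, not AINode's implementation: the names `head_window`, `tail_window` and `count_windows` are hypothetical, and dropping a trailing partial window in the `count` case is an assumption.

```python
from typing import List, Sequence, TypeVar

Row = TypeVar("Row")


def _check(rows: Sequence[Row], window_size: int) -> None:
    # Mirrors the documented rule: too few rows means inference cannot proceed.
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    if len(rows) < window_size:
        raise ValueError("fewer rows than the model requires")


def head_window(rows: Sequence[Row], window_size: int) -> List[Row]:
    """Keep the first `window_size` rows (the earliest points)."""
    _check(rows, window_size)
    return list(rows[:window_size])


def tail_window(rows: Sequence[Row], window_size: int) -> List[Row]:
    """Keep the last `window_size` rows (the latest points)."""
    _check(rows, window_size)
    return list(rows[len(rows) - window_size:])


def count_windows(
    rows: Sequence[Row], window_size: int, sliding_step: int
) -> List[List[Row]]:
    """Split rows into point-count windows, advancing by `sliding_step`.

    Assumption: a trailing window with fewer than `window_size` rows is dropped.
    """
    _check(rows, window_size)
    if sliding_step <= 0:
        raise ValueError("sliding_step must be positive")
    last_start = len(rows) - window_size
    return [
        list(rows[start:start + window_size])
        for start in range(0, last_start + 1, sliding_step)
    ]


if __name__ == "__main__":
    rows = list(range(96))  # stand-in for 96 query result rows
    print(len(head_window(rows, 24)))        # 24
    print(tail_window(rows, 24)[0])          # 72
    print(len(count_windows(rows, 24, 24)))  # 4
```

Under these assumptions, 96 input rows with a window of 24 and a step of 24 produce four windows, which is consistent with the four result rows in the `count(24,24)` example later in this section.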
- -**Explanation 2**: In deep learning applications, timestamp-derived features (the time columns in the data) are often fed to the model as covariates in generative tasks to improve its results, but time columns are generally not part of the model's output. To keep the implementation general, the inference results contain only what the model actually outputs: if the model does not output a time column, the results do not include one. - - -#### Example - -The following example runs inference with a deep learning model via SQL, using the `dlinear` prediction model mentioned above, which takes input `[96,2]` and produces output `[48,2]`. - -```Shell -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... 
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**", generateTime=True) -+-----------------------------+--------------------------------------------+-----------------------------+ -| Time| _result_0| _result_1| -+-----------------------------+--------------------------------------------+-----------------------------+ -|1990-04-06T00:00:00.000+08:00| 0.726302981376648| 1.6549958229064941| -|1990-04-08T00:00:00.000+08:00| 0.7354921698570251| 1.6482787370681763| -|1990-04-10T00:00:00.000+08:00| 0.7238251566886902| 1.6278168201446533| -...... -|1990-07-07T00:00:00.000+08:00| 0.7692174911499023| 1.654654049873352| -|1990-07-09T00:00:00.000+08:00| 0.7685555815696716| 1.6625318765640259| -|1990-07-11T00:00:00.000+08:00| 0.7856493592262268| 1.6508299350738525| -+-----------------------------+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### Example of using the tail/head window function - -When the amount of data varies and you want to run inference on the latest 96 rows, use the `tail` window function. The `head` function is used in the same way, except that it takes the earliest 96 points. - -```Shell -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1988-01-01T00:00:00.000+08:00| 0.7355| 1.211| -...... 
-|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... -|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 996 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**", generateTime=True, window=tail(96)) -+-----------------------------+--------------------------------------------+-----------------------------+ -| Time| _result_0| _result_1| -+-----------------------------+--------------------------------------------+-----------------------------+ -|1990-04-06T00:00:00.000+08:00| 0.726302981376648| 1.6549958229064941| -|1990-04-08T00:00:00.000+08:00| 0.7354921698570251| 1.6482787370681763| -|1990-04-10T00:00:00.000+08:00| 0.7238251566886902| 1.6278168201446533| -...... -|1990-07-07T00:00:00.000+08:00| 0.7692174911499023| 1.654654049873352| -|1990-07-09T00:00:00.000+08:00| 0.7685555815696716| 1.6625318765640259| -|1990-07-11T00:00:00.000+08:00| 0.7856493592262268| 1.6508299350738525| -+-----------------------------+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### Example of using the count window function - -This window is mainly used for computational tasks. 
When the model for a task can only handle a fixed number of rows at a time, but the desired outcome is multiple sets of prediction results, this window function can be used to run continuous inference over a sliding window of points. Suppose we have an anomaly detection model `anomaly_example(input: [24,2], output: [1,1])`, which generates a 0/1 label for every 24 rows of data. An example of its use is as follows: - -```Shell -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... 
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(anomaly_example,"select s0,s1 from root.**", generateTime=True, window=count(24,24)) -+-----------------------------+-------------------------+ -| Time| _result_0| -+-----------------------------+-------------------------+ -|1990-04-06T00:00:00.000+08:00| 0| -|1990-04-30T00:00:00.000+08:00| 1| -|1990-05-24T00:00:00.000+08:00| 1| -|1990-06-17T00:00:00.000+08:00| 0| -+-----------------------------+-------------------------+ -Total line number = 4 -``` - -In the result set, each row's label corresponds to the output of the anomaly detection model after inputting each group of 24 rows of data. - -### 4.5 Fine-tuning Built-in Models -> Only Timer-XL and Timer-Sundial support fine-tuning. - - -The SQL syntax is as follows: - - -```SQL -create model (with hyperparameters -(=(, =)*))? -from model -on dataset (PATH ([timeRange])?) -``` - -#### Examples - -1. Select the first 80% of data from the measurement point `root.db.etth.ot` as the fine-tuning dataset, and create the model `sundialv2` based on `sundial`. - -```SQL -IoTDB> CREATE MODEL sundialv2 FROM MODEL sundial ON DATASET (PATH root.db.etth.OT([1467302400000, 1517468400001))) -Msg: The statement is executed successfully. 
-IoTDB> show models -+---------------------+--------------------+----------+--------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+--------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| timer_xl| Timer-XL| BUILT-IN| ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED|TRAINING| -+---------------------+--------------------+----------+--------+ -``` - -2. The fine-tuning task starts asynchronously in the background, and logs can be viewed in the AINode process. After fine-tuning is complete, query and use the new model - -```SQL -IoTDB> show models -+---------------------+--------------------+----------+------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+------+ -| arima| Arima| BUILT-IN|ACTIVE| -| holtwinters| HoltWinters| BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN|ACTIVE| -| stray| Stray| BUILT-IN|ACTIVE| -| sundial| Timer-Sundial| BUILT-IN|ACTIVE| -| timer_xl| Timer-XL| BUILT-IN|ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED|ACTIVE| -+---------------------+--------------------+----------+------+ -``` - -### 4.6 TimeSeries Large Models Import Steps - -The deployment of AINode can be found in the document [AINode Deployment](../Deployment-and-Maintenance/AINode_Deployment_timecho.md) . - - -## 5. 
Privilege Management - -AINode relies on IoTDB's own authentication for permission management. Users can use the model management functions only when they have the USE_MODEL permission. To use the inference function, a user also needs permission to read the source series referenced by the SQL that feeds the model. - -| Privilege Name | Privilege Scope | Administrator User (default ROOT) | Normal User | Path Related | -| --------- | --------------------------------- | ---------------------- | -------- | -------- | -| USE_MODEL | create model/show models/drop model | √ | √ | x | -| READ_DATA| call inference | √ | √|√ | - -## 6. Practical Examples - -### 6.1 Power Load Prediction - -In some industrial scenarios, power loads need to be predicted; the predictions can be used to optimise power supply, conserve energy and resources, support planning and expansion, and enhance power system reliability. - -The test set used here is [ETTh1](/img/ETTh1.csv). - - -It contains power data collected at 1h intervals. Each record consists of six load values and the oil temperature: High UseFul Load, High UseLess Load, Middle UseFul Load, Middle UseLess Load, Low UseFul Load, Low UseLess Load, and Oil Temperature. - -On this dataset, the model inference function of IoTDB-ML can predict the oil temperature over a future period from the past values of the high, middle and low loads and the oil temperature at the corresponding timestamps, which supports the automatic regulation and monitoring of grid transformers. - -#### Step 1: Data Import - -Users can import the ETT dataset into IoTDB using `import-data.sh` in the tools folder: - -```Bash -bash ./import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw root -s /path/ETTh1.csv -``` - -#### Step 2: Model Import - -Enter the following SQL in iotdb-cli to pull a trained model from Hugging Face and register it for subsequent inference. 
- -```SQL -create model dlinear using uri 'https://huggingface.co/hvlgo/dlinear/tree/main' -``` - -This model is trained on the lighter weight deep model DLinear, which is able to capture as many trends within a sequence and relationships between variables as possible with relatively fast inference, making it more suitable for fast real-time prediction than other deeper models. - -#### Step 3: Model inference - -```Shell -IoTDB> select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth LIMIT 96 -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -| Time|root.eg.etth.s0|root.eg.etth.s1|root.eg.etth.s2|root.eg.etth.s3|root.eg.etth.s4|root.eg.etth.s5|root.eg.etth.s6| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -|2017-10-20T00:00:00.000+08:00| 10.449| 3.885| 8.706| 2.025| 2.041| 0.944| 8.864| -|2017-10-20T01:00:00.000+08:00| 11.119| 3.952| 8.813| 2.31| 2.071| 1.005| 8.442| -|2017-10-20T02:00:00.000+08:00| 9.511| 2.88| 7.533| 1.564| 1.949| 0.883| 8.16| -|2017-10-20T03:00:00.000+08:00| 9.645| 2.21| 7.249| 1.066| 1.828| 0.914| 7.949| -...... 
-|2017-10-23T20:00:00.000+08:00| 8.105| 0.938| 4.371| -0.569| 3.533| 1.279| 9.708| -|2017-10-23T21:00:00.000+08:00| 7.167| 1.206| 4.087| -0.462| 3.107| 1.432| 8.723| -|2017-10-23T22:00:00.000+08:00| 7.1| 1.34| 4.015| -0.32| 2.772| 1.31| 8.864| -|2017-10-23T23:00:00.000+08:00| 9.176| 2.746| 7.107| 1.635| 2.65| 1.097| 9.004| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example, "select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth", generateTime=True, window=head(96)) -+-----------------------------+-----------+----------+----------+------------+---------+----------+----------+ -| Time| output0| output1| output2| output3| output4| output5| output6| -+-----------------------------+-----------+----------+----------+------------+---------+----------+----------+ -|2017-10-23T23:00:00.000+08:00| 10.319546| 3.1450553| 7.877341| 1.5723765|2.7303758| 1.1362307| 8.867775| -|2017-10-24T01:00:00.000+08:00| 10.443649| 3.3286757| 7.8593454| 1.7675098| 2.560634| 1.1177158| 8.920919| -|2017-10-24T03:00:00.000+08:00| 10.883752| 3.2341104| 8.47036| 1.6116762|2.4874182| 1.1760603| 8.798939| -...... -|2017-10-26T19:00:00.000+08:00| 8.0115595| 1.2995274| 6.9900327|-0.098746896| 3.04923| 1.176214| 9.548782| -|2017-10-26T21:00:00.000+08:00| 8.612427| 2.5036244| 5.6790237| 0.66474205|2.8870275| 1.2051733| 9.330128| -|2017-10-26T22:00:00.000+08:00| 10.096699| 3.399722| 6.9909| 1.7478468|2.7642853| 1.1119363| 9.541455| -+-----------------------------+-----------+----------+----------+------------+---------+----------+----------+ -Total line number = 48 -``` - -We compare the results of the prediction of the oil temperature with the real results, and we can get the following image. 
- -The data before 10/24 00:00 is the past data fed to the model. After 10/24 00:00, the blue line is the oil temperature forecast given by the model, and the red line is the actual oil temperature from the dataset (used for comparison). - -![](/img/AINode-analysis1.png) - -The model uses the six load series and the oil temperature over the past 96 hours (4 days), together with the inter-series relationships it learned previously, to forecast the oil temperature for the next 48 hours (2 days). The plot shows that the predicted curve closely follows the trend of the actual data. - -### 6.2 Power Prediction - -Substations need to monitor current, voltage and power data to detect potential grid problems, identify faults in the power system, manage grid loads effectively, and analyse power system performance and trends. - -We built a dataset from the current, voltage and power data of a substation in a real scenario. It contains data such as A-phase voltage, B-phase voltage, and C-phase voltage, collected every 5 to 6 s over nearly four months. - -The test set is [data](/img/data.csv). - -On this dataset, the model inference function of IoTDB-ML can predict the C-phase voltage over a future period from the previous values and timestamps of the A-phase, B-phase and C-phase voltages, which supports monitoring and management of the substation. - -#### Step 1: Data Import - -Users can import the dataset using `import-data.sh` in the tools folder: - -```Bash -bash ./import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw root -s /path/data.csv -``` - -#### Step 2: Model Import - -Select a built-in model or a registered model in the IoTDB CLI for subsequent inference. 
- -We use the built-in model STLForecaster for prediction. STLForecaster is a time series forecasting method based on the STL implementation in the statsmodels library. - -#### Step 3: Model Inference - -```Shell -IoTDB> select * from root.eg.voltage limit 96 -+-----------------------------+------------------+------------------+------------------+ -| Time|root.eg.voltage.s0|root.eg.voltage.s1|root.eg.voltage.s2| -+-----------------------------+------------------+------------------+------------------+ -|2023-02-14T20:38:32.000+08:00| 2038.0| 2028.0| 2041.0| -|2023-02-14T20:38:38.000+08:00| 2014.0| 2005.0| 2018.0| -|2023-02-14T20:38:44.000+08:00| 2014.0| 2005.0| 2018.0| -...... -|2023-02-14T20:47:52.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:47:57.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:48:03.000+08:00| 2024.0| 2016.0| 2027.0| -+-----------------------------+------------------+------------------+------------------+ -Total line number = 96 - -IoTDB> call inference(_STLForecaster, "select s0,s1,s2 from root.eg.voltage", generateTime=True, window=head(96),predict_length=48) -+-----------------------------+---------+---------+---------+ -| Time| output0| output1| output2| -+-----------------------------+---------+---------+---------+ -|2023-02-14T20:48:03.000+08:00|2026.3601|2018.2953|2029.4257| -|2023-02-14T20:48:09.000+08:00|2019.1538|2011.4361|2022.0888| -|2023-02-14T20:48:15.000+08:00|2025.5074|2017.4522|2028.5199| -...... - -|2023-02-14T20:52:15.000+08:00|2022.2336|2015.0290|2025.1023| -|2023-02-14T20:52:21.000+08:00|2015.7241|2008.8975|2018.5085| -|2023-02-14T20:52:27.000+08:00|2022.0777|2014.9136|2024.9396| -|2023-02-14T20:52:33.000+08:00|2015.5682|2008.7821|2018.3458| -+-----------------------------+---------+---------+---------+ -Total line number = 48 -``` - -Comparing the predicted results of the C-phase voltage with the real results, we can get the following image. 
- -The data before 02/14 20:48 is the past data fed to the model. After 02/14 20:48, the blue line is the phase C voltage predicted by the model, and the red line is the actual phase C voltage from the dataset (used for comparison). - -![](/img/AINode-analysis2.png) - -The model uses the voltage data from the past 10 minutes, together with the previously learned inter-sequence relationships, to forecast the phase C voltage for the next 5 minutes. The forecast curve follows the trend of the actual data to a certain degree. - -### 6.3 Anomaly Detection - -In the civil aviation and transport industry, there is a need to detect anomalies in the number of passengers travelling by air. The detection results can guide adjustments to flight scheduling and make operations more efficient. - -Airline Passengers is a time-series dataset that records the number of international air passengers between 1949 and 1960, sampled at one-month intervals. It contains a single time series. The dataset is [airline](/img/airline.csv). -On this dataset, the model inference function of IoTDB-ML can capture the changing patterns of the series and detect anomalies at individual time points. 
- -#### Step 1: Data Import - -Users can import the dataset using `import-data.sh` in the tools folder: - -```Bash -bash ./import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw root -s /path/data.csv -``` - -#### Step 2: Model Inference - -IoTDB has some built-in machine learning algorithms that can be used directly. A sample run of one of the anomaly detection algorithms is shown below: - -```Shell -IoTDB> select * from root.eg.airline -+-----------------------------+------------------+ -| Time|root.eg.airline.s0| -+-----------------------------+------------------+ -|1949-01-31T00:00:00.000+08:00| 224.0| -|1949-02-28T00:00:00.000+08:00| 118.0| -|1949-03-31T00:00:00.000+08:00| 132.0| -|1949-04-30T00:00:00.000+08:00| 129.0| -...... -|1960-09-30T00:00:00.000+08:00| 508.0| -|1960-10-31T00:00:00.000+08:00| 461.0| -|1960-11-30T00:00:00.000+08:00| 390.0| -|1960-12-31T00:00:00.000+08:00| 432.0| -+-----------------------------+------------------+ -Total line number = 144 - -IoTDB> call inference(_Stray, "select s0 from root.eg.airline", generateTime=True, k=2) -+-----------------------------+-------+ -| Time|output0| -+-----------------------------+-------+ -|1960-12-31T00:00:00.000+08:00| 0| -|1961-01-31T08:00:00.000+08:00| 0| -|1961-02-28T08:00:00.000+08:00| 0| -|1961-03-31T08:00:00.000+08:00| 0| -...... -|1972-06-30T08:00:00.000+08:00| 1| -|1972-07-31T08:00:00.000+08:00| 1| -|1972-08-31T08:00:00.000+08:00| 0| -|1972-09-30T08:00:00.000+08:00| 0| -|1972-10-31T08:00:00.000+08:00| 0| -|1972-11-30T08:00:00.000+08:00| 0| -+-----------------------------+-------+ -Total line number = 144 -``` - -Plotting the detected anomalies gives the following image, where the blue curve is the original time series and the red dots mark the time points that the algorithm detects as anomalies. 
- -![](/img/s6.png) - -It can be seen that the Stray model has modelled the input sequence changes and successfully detected the time points where anomalies occur. diff --git a/src/UserGuide/latest/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md b/src/UserGuide/latest/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md deleted file mode 100644 index c7fde0bf2..000000000 --- a/src/UserGuide/latest/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md +++ /dev/null @@ -1,157 +0,0 @@ - -# Time Series Large Models - -## 1. Introduction - -Time Series Large Models are foundational models specifically designed for time series data analysis. The IoTDB team has been developing the Timer, a self-researched foundational time series model, which is based on the Transformer architecture and pre-trained on massive multi-domain time series data, supporting downstream tasks such as time series forecasting, anomaly detection, and time series imputation. The AINode platform developed by the team also supports the integration of cutting-edge time series foundational models from the industry, providing users with diverse model options. Unlike traditional time series analysis techniques, these large models possess universal feature extraction capabilities and can serve a wide range of analytical tasks through zero-shot analysis, fine-tuning, and other services. - -All technical achievements in the field of time series large models related to this paper (including both the team's self-researched models and industry-leading directions) have been published in top international machine learning conferences, with specific details in the appendix. - -## 2. Application Scenarios - -* **Time Series Forecasting**: Providing time series data forecasting services for industrial production, natural environments, and other fields to help users understand future trends in advance. 
-* **Data Imputation**: Performing context-based filling for missing segments in time series to enhance the continuity and integrity of the dataset. -* **Anomaly Detection**: Using autoregressive analysis technology to monitor time series data in real-time, promptly alerting potential anomalies. - -![](/img/LargeModel10.png) - -## 3. Timer-1 Model - -The Timer model (non-built-in model) not only demonstrates excellent few-shot generalization and multi-task adaptability, but also acquires a rich knowledge base through pre-training, endowing it with universal capabilities to handle diverse downstream tasks, with the following characteristics: - -* **Generalizability**: The model can achieve industry-leading deep model prediction results through fine-tuning with only a small number of samples. -* **Universality**: The model design is flexible, capable of adapting to various different task requirements, and supports variable input and output lengths, enabling it to function effectively in various application scenarios. -* **Scalability**: As the number of model parameters increases or the scale of pre-training data expands, the model's performance will continue to improve, ensuring that the model can continuously optimize its prediction effectiveness as time and data volume grow. - -![](/img/model01.png) - -## 4. Timer-XL Model - -Timer-XL further extends and upgrades the network structure based on Timer, achieving comprehensive breakthroughs in multiple dimensions: - -* **Ultra-Long Context Support**: This model breaks through the limitations of traditional time series forecasting models, supporting the processing of inputs with thousands of Tokens (equivalent to tens of thousands of time points), effectively solving the context length bottleneck problem. 
-* **Coverage of Multi-Variable Forecasting Scenarios**: Supports various forecasting scenarios, including the prediction of non-stationary time series, multi-variable prediction tasks, and predictions involving covariates, meeting diversified business needs. -* **Large-Scale Industrial Time Series Dataset**: Pre-trained on a trillion-scale time series dataset from the industrial IoT field, the dataset possesses important characteristics such as massive scale, excellent quality, and rich domain coverage, covering multiple fields including energy, aerospace, steel, and transportation. - -![](/img/model02.png) - -## 5. Timer-Sundial Model - -Timer-Sundial is a series of generative foundational models focused on time series forecasting. The base version has 128 million parameters and has been pre-trained on 1 trillion time points, with the following core characteristics: - -* **Strong Generalization Performance**: Possesses zero-shot forecasting capabilities and can support both point forecasting and probabilistic forecasting simultaneously. -* **Flexible Prediction Distribution Analysis**: Not only can it predict means or quantiles, but it can also evaluate any statistical properties of the prediction distribution through the raw samples generated by the model. -* **Innovative Generative Architecture**: Employs a "Transformer + TimeFlow" collaborative architecture - the Transformer learns the autoregressive representations of time segments, while the TimeFlow module transforms random noise into diverse prediction trajectories based on the flow-matching framework (Flow-Matching), achieving efficient generation of non-deterministic samples. - -![](/img/model03.png) - -## 6. Chronos-2 Model - -Chronos-2 is a universal time series foundational model developed by the Amazon Web Services (AWS) research team, evolved from the Chronos discrete token modeling paradigm. This model is suitable for both zero-shot univariate forecasting and covariate forecasting. 
Its main characteristics include: - -* **Probabilistic Forecasting Capability**: The model outputs multi-step prediction results in a generative manner, supporting quantile or distribution-level forecasting to characterize future uncertainty. -* **Zero-Shot General Forecasting**: Leveraging the contextual learning ability acquired through pre-training, it can directly execute forecasting on unseen datasets without retraining or parameter updates. -* **Unified Modeling of Multi-Variable and Covariates**: Supports joint modeling of multiple related time series and their covariates under the same architecture to improve prediction performance for complex tasks. However, it has strict input requirements: - * The set of names of future covariates must be a subset of the set of names of historical covariates; - * The length of each historical covariate must equal the length of the target variable; - * The length of each future covariate must equal the prediction length; -* **Efficient Inference and Deployment**: The model adopts a compact encoder-only structure, maintaining strong generalization capabilities while ensuring inference efficiency. - -![](/img/timeseries-large-model-chronos2.png) - -## 7. Performance Showcase - -Time Series Large Models can adapt to real time series data from various different domains and scenarios, demonstrating excellent processing capabilities across various tasks. The following shows the actual performance on different datasets: - -**Time Series Forecasting:** - -Leveraging the forecasting capabilities of Time Series Large Models, future trends of time series can be accurately predicted. The blue curve in the following figure represents the predicted trend, while the red curve represents the actual trend, with both curves highly consistent. - -![](/img/LargeModel03.png) - -**Data Imputation:** - -Using Time Series Large Models to fill missing data segments through predictive imputation. 
- -![](/img/timeseries-large-model-data-imputation.png) - -**Anomaly Detection:** - -Using Time Series Large Models to accurately identify outliers that deviate significantly from the normal trend. - -![](/img/LargeModel05.png) - -## 8. Deployment and Usage - -1. Open the IoTDB CLI console and check that the ConfigNode, DataNode, and AINode nodes are all Running. - -```Plain -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| -+------+----------+-------+---------------+------------+--------------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| 2.0.5.1| 069354f| -| 1| DataNode|Running| 127.0.0.1| 10730| 2.0.5.1| 069354f| -| 2| AINode|Running| 127.0.0.1| 10810| 2.0.5.1|069354f-dev| -+------+----------+-------+---------------+------------+--------------+-----------+ -Total line number = 3 -It costs 0.140s -``` - -2. In an online environment, the first startup of the AINode node will automatically pull the Timer-XL, Sundial, and Chronos2 models. - - > Note: - > - > * The AINode installation package does not include model weight files. - > * The automatic pull feature depends on the deployment environment having HuggingFace network access capability. - > * AINode supports manual upload of model weight files. For specific operation methods, refer to [Importing Weight Files](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md#_3-3-importing-built-in-weight-files). - -3. Check if the models are available. 
- -```Bash -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### Appendix - -**[1]** Timer: Generative Pre-trained Transformers Are Large Time Series Models, Yong Liu, Haoran Zhang, Chenyu Li, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ Back]() - -**[2]** TIMER-XL: LONG-CONTEXT TRANSFORMERS FOR UNIFIED TIME SERIES FORECASTING, Yong Liu, Guo Qin, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ Back]() - -**[3]** Sundial: A Family of Highly Capable Time Series Foundation Models, Yong Liu, Guo Qin, Zhiyuan Shi, Zhi Chen, Caiyin Yang, Xiangdong Huang, Jianmin Wang, Mingsheng Long, **ICML 2025 spotlight**. [↩ Back]() - -**[4]** Chronos-2: From Univariate to Universal Forecasting, Abdul Fatir Ansari, Oleksandr Shchur, Jaris Küken, Andreas Auer, Boran Han, Pedro Mercado, Syama Sundar Rangapuram, Huibin Shen, Lorenzo Stella, Xiyuan Zhang, Mononito Goswami, Shubham Kapoor, Danielle C. Maddix, Pablo Guerron, Tony Hu, Junming Yin, Nick Erickson, Prateek Mutalik Desai, Hao Wang, Huzefa Rangwala, George Karypis, Yuyang Wang, Michael Bohlke-Schneider, **arXiv:2510.15821**. 
[↩ Back]() \ No newline at end of file diff --git a/src/UserGuide/latest/API/Programming-Data-Subscription_timecho.md b/src/UserGuide/latest/API/Programming-Data-Subscription_timecho.md deleted file mode 100644 index 730ff8dc3..000000000 --- a/src/UserGuide/latest/API/Programming-Data-Subscription_timecho.md +++ /dev/null @@ -1,260 +0,0 @@ - - - - -# Data Subscription API - -IoTDB provides a data subscription feature that lets users receive newly written data from IoTDB in real time through the subscription APIs. For the functional definition and an introduction, see [Data subscription](../User-Manual/Data-subscription_timecho). - -## 1. Core Steps - -1. Create Topic: Create a topic that covers the measurement points you wish to subscribe to. -2. Subscribe to Topic: The topic must already exist before a consumer subscribes to it; otherwise the subscription fails. Consumers in the same consumer group share the data evenly among themselves. -3. Consume Data: A consumer receives data from a topic only after explicitly subscribing to that topic. -4. Unsubscribe: When a consumer is closed, it leaves its consumer group and all of its existing subscriptions are cancelled. - - -## 2. Detailed Steps - -This section illustrates the core development process and does not demonstrate all parameters and interfaces. For all features and parameters, please refer to: [Java Native API](../API/Programming-Java-Native-API_timecho#_3-native-interface-description) - - -### 2.1 Create a Maven project - -Create a Maven project and import the following dependencies (JDK >= 1.8, Maven >= 3.6): - -```xml -<dependencies> - <dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-session</artifactId> - - <version>${project.version}</version> - </dependency> -</dependencies> -``` -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors.
- -### 2.2 Code Example - -#### 2.2.1 Topic operations - -```java -import java.util.Optional; -import java.util.Properties; -import java.util.Set; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.rpc.subscription.config.TopicConstant; -import org.apache.iotdb.session.subscription.SubscriptionSession; -import org.apache.iotdb.session.subscription.model.Topic; - -public class DataConsumerExample { - - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - try (SubscriptionSession session = new SubscriptionSession("127.0.0.1", 6667, "root", "TimechoDB@2021", 67108864)) { //Before V2.0.6.x the default password is root - // 1. open session - session.open(); - - // 2. create a topic of all data - Properties sessionConfig = new Properties(); - sessionConfig.put(TopicConstant.PATH_KEY, "root.**"); - - session.createTopic("allData", sessionConfig); - - // 3. show all topics - Set topics = session.getTopics(); - System.out.println(topics); - - // 4. 
show a specific topic - Optional<Topic> allData = session.getTopic("allData"); - System.out.println(allData.get()); - } - } -} -``` - -#### 2.2.2 Data Consumption - -##### Scenario-1: Subscribing to newly added real-time data in IoTDB (for scenarios such as dashboard or configuration display) - -```java -import java.io.IOException; -import java.util.List; -import java.util.Properties; -import org.apache.iotdb.rpc.subscription.config.ConsumerConstant; -import org.apache.iotdb.rpc.subscription.config.TopicConstant; -import org.apache.iotdb.session.subscription.consumer.SubscriptionPullConsumer; -import org.apache.iotdb.session.subscription.payload.SubscriptionMessage; -import org.apache.iotdb.session.subscription.payload.SubscriptionMessageType; -import org.apache.iotdb.session.subscription.payload.SubscriptionSessionDataSet; -import org.apache.tsfile.read.common.RowRecord; - -public class DataConsumerExample { - - public static void main(String[] args) throws IOException { - - // 5. create a pull consumer; the subscription is cancelled automatically when the try-with-resources block completes - Properties consumerConfig = new Properties(); - consumerConfig.put(ConsumerConstant.CONSUMER_ID_KEY, "c1"); - consumerConfig.put(ConsumerConstant.CONSUMER_GROUP_ID_KEY, "cg1"); - consumerConfig.put(ConsumerConstant.USERNAME_KEY, "root"); - consumerConfig.put(ConsumerConstant.PASSWORD_KEY, "TimechoDB@2021"); //Before V2.0.6.x the default password is root - try (SubscriptionPullConsumer pullConsumer = new SubscriptionPullConsumer(consumerConfig)) { - pullConsumer.open(); - pullConsumer.subscribe("topic_all"); - while (true) { - List<SubscriptionMessage> messages = pullConsumer.poll(10000); - for (final SubscriptionMessage message : messages) { - final short messageType = message.getMessageType(); - if (SubscriptionMessageType.isValidatedMessageType(messageType)) { - for (final SubscriptionSessionDataSet dataSet : message.getSessionDataSetsHandler()) { - while (dataSet.hasNext()) { - final RowRecord
record = dataSet.next(); - System.out.println(record); - } - } - } - } - } - } - } -} - - -``` - -##### Scenario-2: Subscribing to newly added TsFiles (for scenarios such as regular data backup) - -Prerequisite: The format of the topic to be consumed must be `TsFileHandler`. For example: `create topic topic_all_tsfile with ('path'='root.**','format'='TsFileHandler')` - -```java -import java.io.IOException; -import java.util.List; -import java.util.Properties; -import org.apache.iotdb.rpc.subscription.config.ConsumerConstant; -import org.apache.iotdb.rpc.subscription.config.TopicConstant; -import org.apache.iotdb.session.subscription.consumer.SubscriptionPullConsumer; -import org.apache.iotdb.session.subscription.payload.SubscriptionMessage; - - -public class DataConsumerExample { - - public static void main(String[] args) throws IOException { - // 1. create a pull consumer; the subscription is cancelled automatically when the try-with-resources block completes - Properties consumerConfig = new Properties(); - consumerConfig.put(ConsumerConstant.CONSUMER_ID_KEY, "c1"); - consumerConfig.put(ConsumerConstant.CONSUMER_GROUP_ID_KEY, "cg1"); - // 2. set the credentials and the directory in which subscribed TsFiles are saved - consumerConfig.put(ConsumerConstant.USERNAME_KEY, "root"); - consumerConfig.put(ConsumerConstant.PASSWORD_KEY, "TimechoDB@2021"); //Before V2.0.6.x the default password is root - consumerConfig.put(ConsumerConstant.FILE_SAVE_DIR_KEY, "/Users/iotdb/Downloads"); - try (SubscriptionPullConsumer pullConsumer = new SubscriptionPullConsumer(consumerConfig)) { - pullConsumer.open(); - pullConsumer.subscribe("topic_all_tsfile"); - while (true) { - List<SubscriptionMessage> messages = pullConsumer.poll(10000); - for (final SubscriptionMessage message : messages) { - message.getTsFileHandler().copyFile("/Users/iotdb/Downloads/1.tsfile"); - } - } - } - } -} -``` - - - - -## 3.
Java Native API Description - -### 3.1 Parameter List - -Consumer-related parameters are set through a `Properties` object. The specific parameters are as follows: - -#### SubscriptionConsumer - - -| **Parameter** | **required or optional with default** | **Parameter Meaning** | -| :---------------------- |:-------------------------------------------------------------------------------------| :----------------------------------------------------------- | -| host | optional: 127.0.0.1 | `String`: The RPC host of a DataNode in IoTDB | -| port | optional: 6667 | `Integer`: The RPC port of a DataNode in IoTDB | -| node-urls | optional: 127.0.0.1:6667 | `List<String>`: The RPC addresses of all DataNodes in IoTDB; multiple addresses may be given. Specify either host:port or node-urls. If both are specified, the **union** of host:port and node-urls is used as the effective node-urls | -| username | optional: root | `String`: The username of the DataNode in IoTDB | -| password | optional: TimechoDB@2021 (before V2.0.6.x the default password is root) | `String`: The password of the DataNode in IoTDB | -| groupId | optional | `String`: The consumer group id. If not specified, a random id is assigned (creating a new consumer group), and the ids of different consumer groups are guaranteed to differ | -| consumerId | optional | `String`: The consumer client id. If not specified, a random id is assigned, and each consumer client id within the same consumer group is guaranteed to differ | -| heartbeatIntervalMs | optional: 30000 (min: 1000) | `Long`: The interval at which the consumer sends periodic heartbeat requests to the IoTDB DataNode, in **ms** | -| endpointsSyncIntervalMs | optional: 120000 (min: 5000) | `Long`: The interval at which the consumer detects the expansion or contraction of IoTDB cluster nodes and adjusts the subscription connection, in **ms** | -| fileSaveDir | optional: Paths.get(System.getProperty("user.dir"),
"iotdb-subscription").toString() | `String`: The temporary directory path where the consumer stores the subscribed TsFile files | -| fileSaveFsync | optional: false | `Boolean`: Whether the consumer actively calls fsync during the subscription of TsFiles | - -Special configurations in `SubscriptionPushConsumer` : - -| **Parameter** | **required or optional with default** | **Parameter Meaning** | -| :----------------- | :------------------------------------ | :----------------------------------------------------------- | -| ackStrategy | optional: `ACKStrategy.AFTER_CONSUME` | The acknowledgment mechanism for consumption progress includes the following options: `ACKStrategy.BEFORE_CONSUME`(the consumer submits the consumption progress immediately upon receiving the data, before `onReceive` )`ACKStrategy.AFTER_CONSUME`(the consumer submits the consumption progress after consuming the data, after `onReceive` ) | -| consumeListener | optional | The callback function for consuming data, which needs to implement the `ConsumeListener` interface, defining the processing logic for consuming `SessionDataSetsHandler` and `TsFileHandler` formatted data | -| autoPollIntervalMs | optional: 5000 (min: 500) | Long: The time interval at which the consumer automatically pulls data, in **ms** | -| autoPollTimeoutMs | optional: 10000 (min: 1000) | Long: The timeout duration for the consumer to pull data each time, in **ms** | - -Special configurations in `SubscriptionPullConsumer` : - -| **Parameter** | **required or optional with default** | **Parameter Meaning** | -| :----------------- | :------------------------------------ | :----------------------------------------------------------- | -| autoCommit | optional: true | Boolean: Whether to automatically commit the consumption progress. 
If this parameter is set to false, a commit method (`commitSync` or `commitAsync`) needs to be called manually to submit the consumption progress | -| autoCommitInterval | optional: 5000 (min: 500) | Long: The time interval for automatically committing the consumption progress, in **ms**. This parameter only takes effect when the `autoCommit` parameter is set to true | - - -### 3.2 Function List - -#### Data subscription - -##### SubscriptionPullConsumer - -| **Function name** | **Description** | **Parameter** | -| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -| `open()` | Opens the consumer connection and starts message consumption. If `autoCommit` is enabled, it will start the automatic commit worker. | None | -| `close()` | Closes the consumer connection. If `autoCommit` is enabled, it will commit all uncommitted messages before closing. | None | -| `poll(final Duration timeout)` | Pulls messages with a specified timeout. | `timeout`: The timeout duration. | -| `poll(final long timeoutMs)` | Pulls messages with a specified timeout in milliseconds. | `timeoutMs`: The timeout duration in milliseconds. | -| `poll(final Set<String> topicNames, final Duration timeout)` | Pulls messages from specified topics with a specified timeout. | `topicNames`: The set of topics to pull messages from. `timeout`: The timeout duration. | -| `poll(final Set<String> topicNames, final long timeoutMs)` | Pulls messages from specified topics with a specified timeout in milliseconds. | `topicNames`: The set of topics to pull messages from. `timeoutMs`: The timeout duration in milliseconds. | -| `commitSync(final SubscriptionMessage message)` | Synchronously commits a single message. | `message`: The message object to be committed. | -| `commitSync(final Iterable<SubscriptionMessage> messages)` | Synchronously commits multiple messages. | `messages`: The collection of message objects to be committed.
| -| `commitAsync(final SubscriptionMessage message)` | Asynchronously commits a single message. | `message`: The message object to be committed. | -| `commitAsync(final Iterable<SubscriptionMessage> messages)` | Asynchronously commits multiple messages. | `messages`: The collection of message objects to be committed. | -| `commitAsync(final SubscriptionMessage message, final AsyncCommitCallback callback)` | Asynchronously commits a single message with a specified callback. | `message`: The message object to be committed. `callback`: The callback function to be executed after asynchronous commit. | -| `commitAsync(final Iterable<SubscriptionMessage> messages, final AsyncCommitCallback callback)` | Asynchronously commits multiple messages with a specified callback. | `messages`: The collection of message objects to be committed. `callback`: The callback function to be executed after asynchronous commit. | - -##### SubscriptionPushConsumer - -| **Function name** | **Description** | **Parameter** | -| -------------------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | -| `open()` | Opens the consumer connection, starts message consumption, and submits the automatic polling worker. | None | -| `close()` | Closes the consumer connection and stops message consumption. | None | -| `toString()` | Returns the core configuration information of the consumer object. | None | -| `coreReportMessage()` | Obtains the key-value representation of the consumer's core configuration. | None | -| `allReportMessage()` | Obtains the key-value representation of all the consumer's configurations. | None | -| `buildPushConsumer()` | Builds a `SubscriptionPushConsumer` instance through the `Builder` | None | -| `ackStrategy(final AckStrategy ackStrategy)` | Configures the message acknowledgment strategy for the consumer. | `ackStrategy`: The specified message acknowledgment strategy.
| -| `consumeListener(final ConsumeListener consumeListener)` | Configures the message consumption logic for the consumer. | `consumeListener`: The processing logic when the consumer receives messages. | -| `autoPollIntervalMs(final long autoPollIntervalMs)` | Configures the interval for automatic polling. | `autoPollIntervalMs`: The interval for automatic polling, in milliseconds. | -| `autoPollTimeoutMs(final long autoPollTimeoutMs)` | Configures the timeout for automatic polling. | `autoPollTimeoutMs`: The timeout for automatic polling, in milliseconds. | \ No newline at end of file diff --git a/src/UserGuide/latest/API/Programming-JDBC_timecho.md b/src/UserGuide/latest/API/Programming-JDBC_timecho.md deleted file mode 100644 index 5ed6af0b8..000000000 --- a/src/UserGuide/latest/API/Programming-JDBC_timecho.md +++ /dev/null @@ -1,296 +0,0 @@ - - -# JDBC - -**Note**: The current JDBC implementation is intended only for connecting with third-party tools. JDBC is suitable for queries, but we do not recommend it for insert statements because it cannot provide high-performance writing. -For writes, please use the [Java Native API](./Programming-Java-Native-API_timecho) instead. - -## 1. Dependencies - -* JDK >= 1.8 -* Maven >= 3.9 - -## 2. Installation - -In root directory: - -```shell -mvn clean install -pl iotdb-client/jdbc -am -DskipTests -``` - -## 3. Use IoTDB JDBC with Maven - -```xml -<dependencies> - <dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-jdbc</artifactId> - <version>${project.version}</version> - </dependency> -</dependencies> -``` -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. - -## 4. Coding Examples - -This chapter provides an example of how to open a database connection, execute an SQL query, and display the results. - -It requires including the packages containing the JDBC classes needed for database programming.
- -**NOTE: For faster insertion, the insertTablet() in Session is recommended.** - -```java -import java.sql.*; -import org.apache.iotdb.jdbc.IoTDBSQLException; - -public class JDBCExample { - /** - * Before executing a SQL statement with a Statement object, you need to create a Statement object using the createStatement() method of the Connection object. - * After creating a Statement object, you can use its execute() method to execute a SQL statement - * Finally, remember to close the 'statement' and 'connection' objects by using their close() method - * For statements with query results, we can use the getResultSet() method of the Statement object to get the result set. - */ - public static void main(String[] args) throws SQLException { - Connection connection = getConnection(); - if (connection == null) { - System.out.println("get connection defeat"); - return; - } - Statement statement = connection.createStatement(); - //Create database - try { - statement.execute("CREATE DATABASE root.demo"); - }catch (IoTDBSQLException e){ - System.out.println(e.getMessage()); - } - - - //SHOW DATABASES - statement.execute("SHOW DATABASES"); - outputResult(statement.getResultSet()); - - //Create time series - //Different data type has different encoding methods. 
Here use INT32 as an example - try { - statement.execute("CREATE TIMESERIES root.demo.s0 WITH DATATYPE=INT32,ENCODING=RLE;"); - }catch (IoTDBSQLException e){ - System.out.println(e.getMessage()); - } - //Show time series - statement.execute("SHOW TIMESERIES root.demo"); - outputResult(statement.getResultSet()); - //Show devices - statement.execute("SHOW DEVICES"); - outputResult(statement.getResultSet()); - //Count time series - statement.execute("COUNT TIMESERIES root"); - outputResult(statement.getResultSet()); - //Count nodes at the given level - statement.execute("COUNT NODES root LEVEL=3"); - outputResult(statement.getResultSet()); - //Count timeseries group by each node at the given level - statement.execute("COUNT TIMESERIES root GROUP BY LEVEL=3"); - outputResult(statement.getResultSet()); - - - //Execute insert statements in batch - statement.addBatch("INSERT INTO root.demo(timestamp,s0) VALUES(1,1);"); - statement.addBatch("INSERT INTO root.demo(timestamp,s0) VALUES(1,1);"); - statement.addBatch("INSERT INTO root.demo(timestamp,s0) VALUES(2,15);"); - statement.addBatch("INSERT INTO root.demo(timestamp,s0) VALUES(2,17);"); - statement.addBatch("INSERT INTO root.demo(timestamp,s0) values(4,12);"); - statement.executeBatch(); - statement.clearBatch(); - - //Full query statement - String sql = "SELECT * FROM root.demo"; - ResultSet resultSet = statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Exact query statement - sql = "SELECT s0 FROM root.demo WHERE time = 4;"; - resultSet= statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Time range query - sql = "SELECT s0 FROM root.demo WHERE time >= 2 AND time < 5;"; - resultSet = statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Aggregate query - sql = "SELECT COUNT(s0) FROM root.demo;"; - resultSet = statement.executeQuery(sql); - System.out.println("sql: " + sql); - 
outputResult(resultSet); - - //Delete time series - statement.execute("DELETE timeseries root.demo.s0"); - - //close connection - statement.close(); - connection.close(); - } - - public static Connection getConnection() { - // JDBC driver name and database URL - String driver = "org.apache.iotdb.jdbc.IoTDBDriver"; - String url = "jdbc:iotdb://127.0.0.1:6667/"; - // set rpc compress mode - // String url = "jdbc:iotdb://127.0.0.1:6667?rpc_compress=true"; - - // Database credentials - String username = "root"; - String password = "TimechoDB@2021"; //Before V2.0.6.x the default password is root - - Connection connection = null; - try { - Class.forName(driver); - connection = DriverManager.getConnection(url, username, password); - } catch (ClassNotFoundException e) { - e.printStackTrace(); - } catch (SQLException e) { - e.printStackTrace(); - } - return connection; - } - - /** - * This is an example of outputting the results in the ResultSet - */ - private static void outputResult(ResultSet resultSet) throws SQLException { - if (resultSet != null) { - System.out.println("--------------------------"); - final ResultSetMetaData metaData = resultSet.getMetaData(); - final int columnCount = metaData.getColumnCount(); - for (int i = 0; i < columnCount; i++) { - System.out.print(metaData.getColumnLabel(i + 1) + " "); - } - System.out.println(); - while (resultSet.next()) { - for (int i = 1; ; i++) { - System.out.print(resultSet.getString(i)); - if (i < columnCount) { - System.out.print(", "); - } else { - System.out.println(); - break; - } - } - } - System.out.println("--------------------------\n"); - } - } -} -``` - -The parameter `version` can be used in the url: -````java -String url = "jdbc:iotdb://127.0.0.1:6667?version=V_1_0"; -```` -The parameter `version` represents the SQL semantic version used by the client, which is used in order to be compatible with the SQL semantics of `0.12` when upgrading to `0.13`. -The possible values are: `V_0_12`, `V_0_13`, `V_1_0`. 
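
As a minimal sketch of how the `version` parameter could be attached to a connection URL, the helper below checks the value against the three versions listed above before building the string. The class name `JdbcVersionUrlSketch` and the method `urlWithVersion` are hypothetical names introduced for illustration only; they are not part of the IoTDB JDBC driver.

```java
import java.util.Arrays;
import java.util.List;

public class JdbcVersionUrlSketch {

  // The values documented above as possible settings for the `version` parameter.
  private static final List<String> KNOWN_VERSIONS = Arrays.asList("V_0_12", "V_0_13", "V_1_0");

  /**
   * Hypothetical helper (not a driver API): builds a JDBC URL carrying the
   * `version` parameter, rejecting values that are not in the documented list.
   */
  public static String urlWithVersion(String host, int port, String version) {
    if (!KNOWN_VERSIONS.contains(version)) {
      throw new IllegalArgumentException("Unknown version: " + version);
    }
    return "jdbc:iotdb://" + host + ":" + port + "?version=" + version;
  }

  public static void main(String[] args) {
    // Prints the URL that would then be passed to DriverManager.getConnection(url, user, password).
    System.out.println(urlWithVersion("127.0.0.1", 6667, "V_1_0"));
  }
}
```

Validating the value on the client side is a design choice of this sketch; it surfaces a typo in the version string before any connection attempt is made.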
- -In addition, IoTDB provides additional interfaces in JDBC for users to read and write the database using different character sets (e.g., GB18030) in the connection. -The default character set for IoTDB is UTF-8. When users want to use a character set other than UTF-8, they need to specify the charset property in the JDBC connection. For example: -1. Create a connection using the GB18030 charset: -```java -DriverManager.getConnection("jdbc:iotdb://127.0.0.1:6667?charset=GB18030", "root", "TimechoDB@2021"); //Before V2.0.6.x the default password is root -``` -2. When executing SQL with the `IoTDBStatement` interface, the SQL can be provided as a `byte[]` array, and it will be parsed into a string according to the specified charset. -```java -public boolean execute(byte[] sql) throws SQLException; -``` -3. When outputting query results, the `getBytes` method of `ResultSet` can be used to get `byte[]`, which will be encoded using the charset specified in the connection. -```java -System.out.print(resultSet.getString(i) + " (" + new String(resultSet.getBytes(i), charset) + ")"); -``` -Here is a complete example: -```java -public class JDBCCharsetExample { - - private static final Logger LOGGER = LoggerFactory.getLogger(JDBCCharsetExample.class); - - public static void main(String[] args) throws Exception { - Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); - - try (final Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?charset=GB18030", "root", "TimechoDB@2021"); //Before V2.0.6.x the default password is root - final IoTDBStatement statement = (IoTDBStatement) connection.createStatement()) { - - final String insertSQLWithGB18030 = - "insert into root.测试(timestamp, 维语, 彝语, 繁体, 蒙文, 简体, 标点符号, 藏语) values(1, 'ئۇيغۇر تىلى', 'ꆈꌠꉙ', \"繁體\", 'ᠮᠣᠩᠭᠣᠯ ᠬᠡᠯᠡ', '简体', '——?!', \"བོད་སྐད།\");"; - final byte[] insertSQLWithGB18030Bytes = insertSQLWithGB18030.getBytes("GB18030"); - statement.execute(insertSQLWithGB18030Bytes); - } catch 
(IoTDBSQLException e) { - LOGGER.error("IoTDB Jdbc example error", e); - } - - outputResult("GB18030"); - outputResult("UTF-8"); - outputResult("UTF-16"); - outputResult("GBK"); - outputResult("ISO-8859-1"); - } - - private static void outputResult(String charset) throws SQLException { - System.out.println("[Charset: " + charset + "]"); - try (final Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?charset=" + charset, "root", "TimechoDB@2021"); //Before V2.0.6.x the default password is root - final IoTDBStatement statement = (IoTDBStatement) connection.createStatement()) { - outputResult(statement.executeQuery("select ** from root"), Charset.forName(charset)); - } catch (IoTDBSQLException e) { - LOGGER.error("IoTDB Jdbc example error", e); - } - } - - private static void outputResult(ResultSet resultSet, Charset charset) throws SQLException { - if (resultSet != null) { - System.out.println("--------------------------"); - final ResultSetMetaData metaData = resultSet.getMetaData(); - final int columnCount = metaData.getColumnCount(); - for (int i = 0; i < columnCount; i++) { - System.out.print(metaData.getColumnLabel(i + 1) + " "); - } - System.out.println(); - - while (resultSet.next()) { - for (int i = 1; ; i++) { - System.out.print( - resultSet.getString(i) + " (" + new String(resultSet.getBytes(i), charset) + ")"); - if (i < columnCount) { - System.out.print(", "); - } else { - System.out.println(); - break; - } - } - } - System.out.println("--------------------------\n"); - } - } -} -``` \ No newline at end of file diff --git a/src/UserGuide/latest/API/Programming-Java-Native-API_timecho.md b/src/UserGuide/latest/API/Programming-Java-Native-API_timecho.md deleted file mode 100644 index a746bceaa..000000000 --- a/src/UserGuide/latest/API/Programming-Java-Native-API_timecho.md +++ /dev/null @@ -1,625 +0,0 @@ - - -# Java Native API - -In the native API of IoTDB, the `Session` is the core interface for interacting with the 
database. It integrates a rich set of methods that support data writing, querying, and metadata operations. By instantiating a `Session`, you can establish a connection to the IoTDB server and perform various database operations over that connection. A `Session` is not thread-safe and should not be called simultaneously by multiple threads. - -`SessionPool` is a connection pool for `Session`, and it is recommended to use `SessionPool` for programming. In scenarios with multi-threaded concurrency, `SessionPool` can manage and allocate connection resources effectively, thereby improving system performance and resource utilization. - -## 1. Overview of Steps - -1. Create a Connection Pool Instance: Initialize a SessionPool object to manage multiple Session instances. -2. Perform Operations: Obtain a Session instance directly from the SessionPool and execute database operations, without opening and closing a connection each time. -3. Close Connection Pool Resources: When database operations are no longer needed, close the SessionPool to release all related resources. - - -## 2. Detailed Steps - -This section provides an overview of the core development process and does not demonstrate all parameters and interfaces. For a complete list of functionalities and parameters, please refer to [Java Native API](./Programming-Java-Native-API_timecho#_3-native-interface-description) or check the [Source Code](https://github.com/apache/iotdb/tree/rc/2.0.1/example/session/src/main/java/org/apache/iotdb). - -### 2.1 Create a Maven Project - -Create a Maven project and add the following dependencies to the pom.xml file (JDK >= 1.8, Maven >= 3.6): - -```xml -<dependencies> - <dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-session</artifactId> - - <version>${project.version}</version> - </dependency> -</dependencies> -``` -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors.
- -### 2.2 Creating a Connection Pool Instance - - -```java -import java.util.ArrayList; -import java.util.List; -import org.apache.iotdb.session.pool.SessionPool; - -public class IoTDBSessionPoolExample { - private static SessionPool sessionPool; - - public static void main(String[] args) { - // Using nodeUrls ensures that when one node goes down, other nodes are automatically connected to retry - List<String> nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") //Before V2.0.6.x the default password is root - .maxSize(3) - .build(); - } -} -``` - -### 2.3 Performing Database Operations - -#### 2.3.1 Data Insertion - -In industrial scenarios, data insertion falls into two types: inserting multiple rows of data, and inserting multiple rows of data for a single device. Below, we introduce the insertion interface for each scenario. - -##### Multi-Row Data Insertion Interface - -Interface Description: Supports inserting multiple rows of data at once, where each row corresponds to multiple measurement values for a device at a specific timestamp. - - -Interface List: - -| **Interface Name** | **Function Description** | -| ------------------------------------------------------------ | ------------------------------------------------------------ | -| `insertRecords(List<String> deviceIds, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList)` | Inserts multiple rows of data, suitable for scenarios where measurements are independently collected.
| - -Code Example: - -```java -import java.util.ArrayList; -import java.util.List; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; -import org.apache.tsfile.enums.TSDataType; - -public class SessionPoolExample { - private static SessionPool sessionPool; - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. execute insert data - insertRecordsExample(); - // 3. close SessionPool - closeSessionPool(); - } - - private static void constructSessionPool() { - // Using nodeUrls ensures that when one node goes down, other nodes are automatically connected to retry - List<String> nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") //Before V2.0.6.x the default password is root - .maxSize(3) - .build(); - } - - public static void insertRecordsExample() throws IoTDBConnectionException, StatementExecutionException { - String deviceId = "root.sg1.d1"; - List<String> measurements = new ArrayList<>(); - measurements.add("s1"); - measurements.add("s2"); - measurements.add("s3"); - List<String> deviceIds = new ArrayList<>(); - List<List<String>> measurementsList = new ArrayList<>(); - List<List<Object>> valuesList = new ArrayList<>(); - List<Long> timestamps = new ArrayList<>(); - List<List<TSDataType>> typesList = new ArrayList<>(); - - for (long time = 0; time < 500; time++) { - List<Object> values = new ArrayList<>(); - List<TSDataType> types = new ArrayList<>(); - values.add(1L); - values.add(2L); - values.add(3L); - types.add(TSDataType.INT64); - types.add(TSDataType.INT64); - types.add(TSDataType.INT64); - - deviceIds.add(deviceId); - measurementsList.add(measurements); - valuesList.add(values); - typesList.add(types); - timestamps.add(time); - if (time != 0 && time % 100 == 0) { - try { -
sessionPool.insertRecords(deviceIds, timestamps, measurementsList, typesList, valuesList); - } catch (IoTDBConnectionException | StatementExecutionException e) { - // solve exception - } - deviceIds.clear(); - measurementsList.clear(); - valuesList.clear(); - typesList.clear(); - timestamps.clear(); - } - } - try { - sessionPool.insertRecords(deviceIds, timestamps, measurementsList, typesList, valuesList); - } catch (IoTDBConnectionException | StatementExecutionException e) { - // solve exception - } - } - - public static void closeSessionPool(){ - sessionPool.close(); - } -} -``` - -##### Single-Device Multi-Row Data Insertion Interface - -Interface Description: Supports inserting multiple rows of data for a single device at once, where each row corresponds to multiple measurement values for a specific timestamp. - -Interface List: - -| **Interface Name** | **Function Description** | -| ----------------------------- | ------------------------------------------------------------ | -| `insertTablet(Tablet tablet)` | Inserts multiple rows of data for a single device, suitable for scenarios where measurements are independently collected. | - -Code Example: - -```java -import java.util.ArrayList; -import java.util.List; -import java.util.Random; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; -import org.apache.tsfile.enums.TSDataType; -import org.apache.tsfile.write.record.Tablet; -import org.apache.tsfile.write.schema.IMeasurementSchema; -import org.apache.tsfile.write.schema.MeasurementSchema; - -public class SessionPoolExample { - private static SessionPool sessionPool; - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. execute insert data - insertTabletExample(); - // 3. 
close SessionPool - closeSessionPool(); - } - - private static void constructSessionPool() { - // With multiple nodeUrls, the client automatically retries on another node when one goes down - List<String> nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - //nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") // Before V2.0.6.x the default password is root - .maxSize(3) - .build(); - } - - private static void insertTabletExample() throws IoTDBConnectionException, StatementExecutionException { - /* - * A Tablet example: - * device1 - * time s1, s2, s3 - * 1, 1, 1, 1 - * 2, 2, 2, 2 - * 3, 3, 3, 3 - */ - // The schema of measurements of one device - // only the measurementId and data type in MeasurementSchema take effect in a Tablet - List<IMeasurementSchema> schemaList = new ArrayList<>(); - schemaList.add(new MeasurementSchema("s1", TSDataType.INT64)); - schemaList.add(new MeasurementSchema("s2", TSDataType.INT64)); - schemaList.add(new MeasurementSchema("s3", TSDataType.INT64)); - - Tablet tablet = new Tablet("root.sg.d1", schemaList, 100); - - // Method 1 to add tablet data - long timestamp = System.currentTimeMillis(); - - Random random = new Random(); - for (long row = 0; row < 100; row++) { - int rowIndex = tablet.getRowSize(); - tablet.addTimestamp(rowIndex, timestamp); - for (int s = 0; s < 3; s++) { - long value = random.nextLong(); - tablet.addValue(schemaList.get(s).getMeasurementName(), rowIndex, value); - } - if (tablet.getRowSize() == tablet.getMaxRowNumber()) { - sessionPool.insertTablet(tablet); - tablet.reset(); - } - timestamp++; - } - if (tablet.getRowSize() != 0) { - sessionPool.insertTablet(tablet); - tablet.reset(); - } - } - - public static void closeSessionPool(){ - sessionPool.close(); - } -} -``` - -#### 2.3.2 SQL Operations - -SQL operations are divided into two categories: queries and non-queries. 
The corresponding interfaces are `executeQueryStatement` and `executeNonQueryStatement`. The former executes a query statement and returns a result set, while the latter performs insert, delete, and update operations and does not return a result set. - -```java -import java.util.ArrayList; -import java.util.List; -import org.apache.iotdb.isession.SessionDataSet.DataIterator; -import org.apache.iotdb.isession.pool.SessionDataSetWrapper; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; - -public class SessionPoolExample { - private static SessionPool sessionPool; - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. executes a query SQL statement and prints the result set. - executeQueryExample(); - // 3. executes non-query SQL statements, such as DDL or DML commands. - executeNonQueryExample(); - // 4. close SessionPool - closeSessionPool(); - } - - private static void executeNonQueryExample() throws IoTDBConnectionException, StatementExecutionException { - // 1. create a nonAligned time series - sessionPool.executeNonQueryStatement("create timeseries root.test.d1.s1 with dataType = int32"); - // 2. set ttl - sessionPool.executeNonQueryStatement("set TTL to root.test.** 10000"); - // 3. delete time series - sessionPool.executeNonQueryStatement("delete timeseries root.test.d1.s1"); - } - - private static void executeQueryExample() throws IoTDBConnectionException, StatementExecutionException { - // 1. 
execute normal query - try (SessionDataSetWrapper wrapper = sessionPool.executeQueryStatement("select s1 from root.sg1.d1 limit 10")) { - // get DataIterator like JDBC - DataIterator dataIterator = wrapper.iterator(); - System.out.println(wrapper.getColumnNames()); - System.out.println(wrapper.getColumnTypes()); - while (dataIterator.next()) { - StringBuilder builder = new StringBuilder(); - for (String columnName : wrapper.getColumnNames()) { - builder.append(dataIterator.getString(columnName) + " "); - } - System.out.println(builder); - } - } - // 2. execute aggregate query - try (SessionDataSetWrapper wrapper = sessionPool.executeQueryStatement("select count(s1) from root.sg1.d1 group by ([0, 40), 5ms) ")) { - // get DataIterator like JDBC - DataIterator dataIterator = wrapper.iterator(); - System.out.println(wrapper.getColumnNames()); - System.out.println(wrapper.getColumnTypes()); - while (dataIterator.next()) { - StringBuilder builder = new StringBuilder(); - for (String columnName : wrapper.getColumnNames()) { - builder.append(dataIterator.getString(columnName) + " "); - } - System.out.println(builder); - } - } - } - - private static void constructSessionPool() { - // With multiple nodeUrls, the client automatically retries on another node when one goes down - List<String> nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") // Before V2.0.6.x the default password is root - .maxSize(3) - .build(); - } - - public static void closeSessionPool(){ - sessionPool.close(); - } -} -``` - - -For more on reading result sets through the `SessionDataSet.DataIterator` class, refer to the following example (note: the `getBlob` and `getDate` interfaces have been supported since version V2.0.4): - -```java -import org.apache.iotdb.isession.SessionDataSet; -import 
org.apache.iotdb.isession.pool.SessionDataSetWrapper; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; - -import org.apache.tsfile.enums.TSDataType; -import org.apache.tsfile.utils.Binary; -import org.apache.tsfile.utils.DateUtils; -import org.apache.tsfile.write.record.Tablet; -import org.apache.tsfile.write.schema.MeasurementSchema; -import org.junit.Assert; - -import java.sql.Timestamp; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.List; - -public class SessionExample { - private static SessionPool sessionPool; - - public static void main(String[] args) - throws IoTDBConnectionException, StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. inserts one row, then executes a query and checks each column of the result set. - executeQueryExample(); - // 3. close SessionPool - closeSessionPool(); - } - - private static void executeQueryExample() - throws IoTDBConnectionException, StatementExecutionException { - Tablet tablet = - new Tablet( - "root.sg.d1", - Arrays.asList( - new MeasurementSchema("s1", TSDataType.INT32), - new MeasurementSchema("s2", TSDataType.INT64), - new MeasurementSchema("s3", TSDataType.FLOAT), - new MeasurementSchema("s4", TSDataType.DOUBLE), - new MeasurementSchema("s5", TSDataType.TEXT), - new MeasurementSchema("s6", TSDataType.BOOLEAN), - new MeasurementSchema("s7", TSDataType.TIMESTAMP), - new MeasurementSchema("s8", TSDataType.BLOB), - new MeasurementSchema("s9", TSDataType.STRING), - new MeasurementSchema("s10", TSDataType.DATE), - new MeasurementSchema("s11", TSDataType.TIMESTAMP)), - 10); - tablet.addTimestamp(0, 0L); - tablet.addValue("s1", 0, 1); - tablet.addValue("s2", 0, 1L); - tablet.addValue("s3", 0, 0f); - tablet.addValue("s4", 0, 0d); - tablet.addValue("s5", 0, "text_value"); - tablet.addValue("s6", 0, true); - tablet.addValue("s7", 0, 1L); - tablet.addValue("s8", 0, new 
Binary(new byte[] {1})); - tablet.addValue("s9", 0, "string_value"); - tablet.addValue("s10", 0, DateUtils.parseIntToLocalDate(20250403)); - tablet.initBitMaps(); - tablet.bitMaps[10].mark(0); - tablet.rowSize = 1; - sessionPool.insertAlignedTablet(tablet); - - try (SessionDataSetWrapper dataSet = - sessionPool.executeQueryStatement("select * from root.sg.d1")) { - SessionDataSet.DataIterator iterator = dataSet.iterator(); - int count = 0; - while (iterator.next()) { - count++; - Assert.assertFalse(iterator.isNull("root.sg.d1.s1")); - Assert.assertEquals(1, iterator.getInt("root.sg.d1.s1")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s2")); - Assert.assertEquals(1L, iterator.getLong("root.sg.d1.s2")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s3")); - Assert.assertEquals(0, iterator.getFloat("root.sg.d1.s3"), 0.01); - Assert.assertFalse(iterator.isNull("root.sg.d1.s4")); - Assert.assertEquals(0, iterator.getDouble("root.sg.d1.s4"), 0.01); - Assert.assertFalse(iterator.isNull("root.sg.d1.s5")); - Assert.assertEquals("text_value", iterator.getString("root.sg.d1.s5")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s6")); - Assert.assertTrue(iterator.getBoolean("root.sg.d1.s6")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s7")); - Assert.assertEquals(new Timestamp(1), iterator.getTimestamp("root.sg.d1.s7")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s8")); - Assert.assertEquals(new Binary(new byte[] {1}), iterator.getBlob("root.sg.d1.s8")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s9")); - Assert.assertEquals("string_value", iterator.getString("root.sg.d1.s9")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s10")); - Assert.assertEquals( - DateUtils.parseIntToLocalDate(20250403), iterator.getDate("root.sg.d1.s10")); - Assert.assertTrue(iterator.isNull("root.sg.d1.s11")); - Assert.assertNull(iterator.getTimestamp("root.sg.d1.s11")); - - Assert.assertEquals(new Timestamp(0), iterator.getTimestamp("Time")); - 
Assert.assertFalse(iterator.isNull("Time")); - } - Assert.assertEquals(tablet.rowSize, count); - } - } - - private static void constructSessionPool() { - List<String> nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("root") - .maxSize(3) - .build(); - } - - public static void closeSessionPool() { - sessionPool.close(); - } -} -``` - - - -### 3. Native Interface Description - -#### 3.1 Parameter List - -The Session class has the following fields, which can be set through the constructor or the Session.Builder method: - -| **Field Name** | **Type** | **Description** | -| -------------------------------- | ----------------------------------- | ------------------------------------------------------------ | -| `nodeUrls` | `List<String>` | List of URLs for database nodes, supporting multiple node connections | -| `username` | `String` | Username | -| `password` | `String` | Password | -| `fetchSize` | `int` | Default batch size for query results | -| `useSSL` | `boolean` | Whether to enable SSL | -| `trustStore` | `String` | Path to the trust store | -| `trustStorePwd` | `String` | Password for the trust store | -| `queryTimeoutInMs` | `long` | Query timeout in milliseconds. Default value: -1. A negative value means the server default configuration is used, and 0 disables query timeout. 
| -| `enableRPCCompression` | `boolean` | Whether to enable RPC compression | -| `connectionTimeoutInMs` | `int` | Connection timeout in milliseconds | -| `zoneId` | `ZoneId` | Time zone setting for the session | -| `thriftDefaultBufferSize` | `int` | Default buffer size for Thrift | -| `thriftMaxFrameSize` | `int` | Maximum frame size for Thrift | -| `defaultEndPoint` | `TEndPoint` | Default database endpoint information | -| `defaultSessionConnection` | `SessionConnection` | Default session connection object | -| `isClosed` | `boolean` | Whether the current session is closed | -| `enableRedirection` | `boolean` | Whether to enable redirection | -| `enableRecordsAutoConvertTablet` | `boolean` | Whether to automatically convert inserted records into a Tablet | -| `deviceIdToEndpoint` | `Map` | Mapping of device IDs to database endpoints | -| `endPointToSessionConnection` | `Map` | Mapping of database endpoints to session connections | -| `executorService` | `ScheduledExecutorService` | Thread pool for periodically updating the node list | -| `availableNodes` | `INodeSupplier` | Supplier of available nodes | -| `enableQueryRedirection` | `boolean` | Whether to enable query redirection | -| `version` | `Version` | Client version number, used to check compatibility with the server | -| `enableAutoFetch` | `boolean` | Whether to enable automatic fetching | -| `maxRetryCount` | `int` | Maximum number of retries | -| `retryIntervalInMs` | `long` | Retry interval in milliseconds | - - - -#### 3.2 Interface List - -##### 3.2.1 Metadata Management - -| **Method Name** | **Function Description** | **Parameter Explanation** | -| ------------------------------------------------------------ | ---------------------------------------------- | ------------------------------------------------------------ | -| `createDatabase(String database)` | Create a database | `database`: The name of the database to be created | -| `deleteDatabase(String 
database)` | Delete a specified database | `database`: The name of the database to be deleted | -| `deleteDatabases(List<String> databases)` | Batch delete databases | `databases`: A list of database names to be deleted | -| `createTimeseries(String path, TSDataType dataType, TSEncoding encoding, CompressionType compressor)` | Create a single time series | `path`: The path of the time series, `dataType`: The data type, `encoding`: The encoding type, `compressor`: The compression type | -| `createAlignedTimeseries(...)` | Create aligned time series | Device ID, list of measurement points, list of data types, list of encodings, list of compression types | -| `createMultiTimeseries(...)` | Batch create time series | Multiple paths, data types, encodings, compression types, properties, tags, aliases, etc. | -| `deleteTimeseries(String path)` | Delete a time series | `path`: The path of the time series to be deleted | -| `deleteTimeseries(List<String> paths)` | Batch delete time series | `paths`: A list of time series paths to be deleted | -| `setSchemaTemplate(String templateName, String prefixPath)` | Set a schema template | `templateName`: The name of the template, `prefixPath`: The path where the template is applied | -| `createSchemaTemplate(Template template)` | Create a schema template | `template`: The template object | -| `dropSchemaTemplate(String templateName)` | Delete a schema template | `templateName`: The name of the template to be deleted | -| `addAlignedMeasurementsInTemplate(...)` | Add aligned measurements to a template | Template name, list of measurement paths, data type, encoding type, compression type | -| `addUnalignedMeasurementsInTemplate(...)` | Add unaligned measurements to a template | Same as above | -| `deleteNodeInTemplate(String templateName, String path)` | Delete a node in a template | `templateName`: The name of the template, `path`: The path to be deleted | -| `countMeasurementsInTemplate(String name)` | Count the number of measurements in a template | `name`: The 
name of the template | -| `isMeasurementInTemplate(String templateName, String path)` | Check if a measurement exists in a template | `templateName`: The name of the template, `path`: The path of the measurement | -| `isPathExistInTemplate(String templateName, String path)` | Check if a path exists in a template | Same as above | -| `showMeasurementsInTemplate(String templateName)` | Show measurements in a template | `templateName`: The name of the template | -| `showMeasurementsInTemplate(String templateName, String pattern)` | Show measurements in a template by pattern | `templateName`: The name of the template, `pattern`: The matching pattern | -| `showAllTemplates()` | Show all templates | No parameters | -| `showPathsTemplateSetOn(String templateName)` | Show paths where a template is set | `templateName`: The name of the template | -| `showPathsTemplateUsingOn(String templateName)` | Show actual paths using a template | Same as above | -| `unsetSchemaTemplate(String prefixPath, String templateName)` | Unset the template setting for a path | `prefixPath`: The path, `templateName`: The name of the template | - - -##### 3.2.2 Data Insertion - -| **Method Name** | **Function Description** | **Parameter Explanation** | -| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -| `insertRecord(String deviceId, long time, List<String> measurements, List<TSDataType> types, Object... 
values)` | Insert a single record | `deviceId`: Device ID, `time`: Timestamp, `measurements`: List of measurement points, `types`: List of data types, `values`: List of values | -| `insertRecord(String deviceId, long time, List<String> measurements, List<String> values)` | Insert a single record | `deviceId`: Device ID, `time`: Timestamp, `measurements`: List of measurement points, `values`: List of values | -| `insertRecords(List<String> deviceIds, List<Long> times, List<List<String>> measurementsList, List<List<String>> valuesList)` | Insert multiple records | `deviceIds`: List of device IDs, `times`: List of timestamps, `measurementsList`: List of lists of measurement points, `valuesList`: List of lists of values | -| `insertRecords(List<String> deviceIds, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList)` | Insert multiple records | Same as above, plus `typesList`: List of lists of data types | -| `insertRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList)` | Insert multiple records for a single device | `deviceId`: Device ID, `times`: List of timestamps, `measurementsList`: List of lists of measurement points, `typesList`: List of lists of types, `valuesList`: List of lists of values | -| `insertRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList, boolean haveSorted)` | Insert sorted multiple records for a single device | Same as above, plus `haveSorted`: Whether the data is already sorted | -| `insertStringRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<String>> valuesList)` | Insert string-formatted records for a single device | `deviceId`: Device ID, `times`: List of timestamps, `measurementsList`: List of lists of measurement points, `valuesList`: List of lists of values | -| `insertStringRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<String>> valuesList, boolean haveSorted)` | Insert sorted string-formatted records for a single device | Same as above, plus `haveSorted`: Whether the 
data is already sorted | -| `insertAlignedRecord(String deviceId, long time, List<String> measurements, List<TSDataType> types, List<Object> values)` | Insert a single aligned record | `deviceId`: Device ID, `time`: Timestamp, `measurements`: List of measurement points, `types`: List of types, `values`: List of values | -| `insertAlignedRecord(String deviceId, long time, List<String> measurements, List<String> values)` | Insert a single string-formatted aligned record | `deviceId`: Device ID, `time`: Timestamp, `measurements`: List of measurement points, `values`: List of values | -| `insertAlignedRecords(List<String> deviceIds, List<Long> times, List<List<String>> measurementsList, List<List<String>> valuesList)` | Insert multiple aligned records | `deviceIds`: List of device IDs, `times`: List of timestamps, `measurementsList`: List of lists of measurement points, `valuesList`: List of lists of values | -| `insertAlignedRecords(List<String> deviceIds, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList)` | Insert multiple aligned records | Same as above, plus `typesList`: List of lists of data types | -| `insertAlignedRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList)` | Insert multiple aligned records for a single device | Same as above | -| `insertAlignedRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList, boolean haveSorted)` | Insert sorted multiple aligned records for a single device | Same as above, plus `haveSorted`: Whether the data is already sorted | -| `insertAlignedStringRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<String>> valuesList)` | Insert string-formatted aligned records for a single device | `deviceId`: Device ID, `times`: List of timestamps, `measurementsList`: List of lists of measurement points, `valuesList`: List of lists of values | -| `insertAlignedStringRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<String>> valuesList, boolean haveSorted)` | Insert sorted 
string-formatted aligned records for a single device | Same as above, plus `haveSorted`: Whether the data is already sorted | -| `insertTablet(Tablet tablet)` | Insert a single Tablet | `tablet`: The Tablet data to be inserted | -| `insertTablet(Tablet tablet, boolean sorted)` | Insert a sorted Tablet | Same as above, plus `sorted`: Whether the data is already sorted | -| `insertAlignedTablet(Tablet tablet)` | Insert an aligned Tablet | `tablet`: The Tablet data to be inserted | -| `insertAlignedTablet(Tablet tablet, boolean sorted)` | Insert a sorted aligned Tablet | Same as above, plus `sorted`: Whether the data is already sorted | -| `insertTablets(Map<String, Tablet> tablets)` | Insert multiple Tablets in batch | `tablets`: Mapping from device IDs to Tablet data | -| `insertTablets(Map<String, Tablet> tablets, boolean sorted)` | Insert multiple sorted Tablets in batch | Same as above, plus `sorted`: Whether the data is already sorted | -| `insertAlignedTablets(Map<String, Tablet> tablets)` | Insert multiple aligned Tablets in batch | `tablets`: Mapping from device IDs to Tablet data | -| `insertAlignedTablets(Map<String, Tablet> tablets, boolean sorted)` | Insert multiple sorted aligned Tablets in batch | Same as above, plus `sorted`: Whether the data is already sorted | - -##### 3.2.3 Data Deletion - -| **Method Name** | **Function Description** | **Parameter Explanation** | -| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------- | -| `deleteTimeseries(String path)` | Delete a single time series | `path`: The path of the time series | -| `deleteTimeseries(List<String> paths)` | Batch delete time series | `paths`: A list of time series paths | -| `deleteData(String path, long endTime)` | Delete historical data for a specified path | `path`: The path, `endTime`: The end timestamp | -| `deleteData(List<String> paths, long endTime)` | Batch delete historical data for specified 
paths | `paths`: A list of paths, `endTime`: The end timestamp | -| `deleteData(List<String> paths, long startTime, long endTime)` | Delete historical data within a time range for specified paths | Same as above, plus `startTime`: The start timestamp | - - -##### 3.2.4 Data Query - -| **Method Name** | **Function Description** | **Parameter Explanation** | -| ------------------------------------------------------------ | -------------------------------------------------------- | ------------------------------------------------------------ | -| `executeQueryStatement(String sql)` | Execute a query statement | `sql`: The query SQL statement | -| `executeQueryStatement(String sql, long timeoutInMs)` | Execute a query statement with timeout | `sql`: The query SQL statement, `timeoutInMs`: The query timeout in milliseconds; defaults to the server configuration, which is 60s | -| `executeRawDataQuery(List<String> paths, long startTime, long endTime)` | Query raw data for specified paths | `paths`: A list of query paths, `startTime`: The start timestamp, `endTime`: The end timestamp | -| `executeRawDataQuery(List<String> paths, long startTime, long endTime, long timeOut)` | Query raw data for specified paths (with timeout) | Same as above, plus `timeOut`: The timeout | -| `executeLastDataQuery(List<String> paths)` | Query the latest data | `paths`: A list of query paths | -| `executeLastDataQuery(List<String> paths, long lastTime)` | Query the latest data at a specified time | `paths`: A list of query paths, `lastTime`: The specified timestamp | -| `executeLastDataQuery(List<String> paths, long lastTime, long timeOut)` | Query the latest data at a specified time (with timeout) | Same as above, plus `timeOut`: The timeout | -| `executeLastDataQueryForOneDevice(String db, String device, List<String> sensors, boolean isLegalPathNodes)` | Query the latest data for a single device | `db`: The database name, `device`: The device name, `sensors`: A list of sensors, `isLegalPathNodes`: Whether the path nodes are legal | -| 
`executeAggregationQuery(List<String> paths, List<TAggregationType> aggregations)` | Execute an aggregation query | `paths`: A list of query paths, `aggregations`: A list of aggregation types | -| `executeAggregationQuery(List<String> paths, List<TAggregationType> aggregations, long startTime, long endTime)` | Execute an aggregation query with a time range | Same as above, plus `startTime`: The start timestamp, `endTime`: The end timestamp | -| `executeAggregationQuery(List<String> paths, List<TAggregationType> aggregations, long startTime, long endTime, long interval)` | Execute an aggregation query with a time interval | Same as above, plus `interval`: The time interval | -| `executeAggregationQuery(List<String> paths, List<TAggregationType> aggregations, long startTime, long endTime, long interval, long slidingStep)` | Execute a sliding window aggregation query | Same as above, plus `slidingStep`: The sliding step | -| `fetchAllConnections()` | Get information of all active connections | No parameters | - -##### 3.2.5 System Status and Backup - -| **Method Name** | **Function Description** | **Parameter Explanation** | -| -------------------------- | ----------------------------------------- | ------------------------------------------ | -| `getBackupConfiguration()` | Get backup configuration information | No parameters | -| `fetchAllConnections()` | Get information of all active connections | No parameters | -| `getSystemStatus()` | Get the system status | Deprecated, returns `SystemStatus.NORMAL` | diff --git a/src/UserGuide/latest/API/Programming-MQTT_timecho.md b/src/UserGuide/latest/API/Programming-MQTT_timecho.md deleted file mode 100644 index 0f2e6a14d..000000000 --- a/src/UserGuide/latest/API/Programming-MQTT_timecho.md +++ /dev/null @@ -1,294 +0,0 @@ - -# MQTT Protocol - -## 1. Overview - -MQTT (Message Queuing Telemetry Transport) is a lightweight messaging protocol designed for IoT and low-bandwidth environments. It operates on a Publish/Subscribe (Pub/Sub) model, enabling efficient and reliable bidirectional communication between devices. 
Its core objectives are low power consumption, minimal bandwidth usage, and high real-time performance, making it ideal for unstable networks or resource-constrained scenarios (e.g., sensors, mobile devices). - -IoTDB provides deep integration with the MQTT protocol, fully compliant with MQTT v3.1 (OASIS International Standard). The IoTDB server includes a built-in high-performance MQTT Broker module, eliminating the need for third-party middleware. Devices can directly write time-series data into the IoTDB storage engine via MQTT messages. - - - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the MQTT service JAR file by default. Please contact the Timecho team to obtain the JAR file before using this service, and place it in the `timechodb_home/lib` or `timechodb_home/ext/external_service` directory. - -## 2. Built-in MQTT Service -The built-in MQTT service lets clients connect to IoTDB directly through MQTT. It listens for messages published by MQTT clients - and writes the data into storage immediately. -The MQTT topic corresponds to an IoTDB timeseries. -The message payload is converted into events by a `PayloadFormatter`, which is loaded through Java SPI; the default implementation is `JSONPayloadFormatter`. -The default `json` formatter supports two JSON formats, as well as JSON arrays of them. The following is an MQTT message payload example: - -```json - { - "device":"root.sg.d1", - "timestamp":1586076045524, - "measurements":["s1","s2"], - "values":[0.530635,0.530635] - } -``` -or -```json - { - "device":"root.sg.d1", - "timestamps":[1586076045524,1586076065526], - "measurements":["s1","s2"], - "values":[[0.530635,0.530635], [0.530655,0.530695]] - } -``` -or a JSON array of the above two. - - - -## 3. MQTT Configurations -The IoTDB MQTT service loads its configuration from `${IOTDB_HOME}/${IOTDB_CONF}/iotdb-system.properties` by default. 
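
As a rough illustration of the settings involved, the sketch below parses a properties-format fragment with the standard `java.util.Properties` class and reads the keys `enable_mqtt_service`, `mqtt_host`, and `mqtt_port` from this section's configuration table. The class name `MqttConfigSketch`, the inline `SAMPLE` text, and the helper methods are assumptions made for this example only; this is not how IoTDB itself loads its configuration, and a real deployment edits `iotdb-system.properties` directly.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class MqttConfigSketch {
  // Hypothetical sample content; a real file lives under ${IOTDB_HOME}/${IOTDB_CONF}.
  static final String SAMPLE =
      "enable_mqtt_service=true\n" + "mqtt_host=127.0.0.1\n" + "mqtt_port=1883\n";

  // Parses properties-format text into a Properties object.
  static Properties parse(String text) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  // Returns whether the MQTT service is switched on; falls back to the documented default (false).
  static boolean isEnabled(String text) {
    return Boolean.parseBoolean(parse(text).getProperty("enable_mqtt_service", "false"));
  }

  // Returns "host:port", falling back to the documented defaults 127.0.0.1 and 1883.
  static String endpoint(String text) {
    Properties props = parse(text);
    String host = props.getProperty("mqtt_host", "127.0.0.1");
    int port = Integer.parseInt(props.getProperty("mqtt_port", "1883"));
    return host + ":" + port;
  }

  public static void main(String[] args) {
    System.out.println("MQTT enabled: " + isEnabled(SAMPLE));
    System.out.println("MQTT endpoint: " + endpoint(SAMPLE));
  }
}
```

A check like this can be useful in a deployment script to confirm that the service is enabled and bound to the expected address before clients are pointed at it.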
- -Configurations are as follows: - -| **Property** | **Description** | **Default** | -| ------------------------ | -------------------------------------------------------------------------------------------------------------------- | ------------------- | -| `enable_mqtt_service` | Enable/disable the MQTT service. | FALSE | -| `mqtt_host` | Host address bound to the MQTT service. | 127.0.0.1 | -| `mqtt_port` | Port bound to the MQTT service. | 1883 | -| `mqtt_handler_pool_size` | Thread pool size for processing MQTT messages. | 1 | -| **`mqtt_payload_formatter`** | **Formatting method for MQTT message payloads. Options: `json` (tree mode), `line` (table mode).** | **json** | -| `mqtt_max_message_size` | Maximum allowed MQTT message size (bytes). | 1048576 | - -## 4. Coding Examples -The following example shows an MQTT client sending messages to the IoTDB server. - -```java -MQTT mqtt = new MQTT(); -mqtt.setHost("127.0.0.1", 1883); -mqtt.setUserName("root"); -mqtt.setPassword("root"); - -BlockingConnection connection = mqtt.blockingConnection(); -connection.connect(); - -Random random = new Random(); -for (int i = 0; i < 10; i++) { - String payload = String.format("{\n" + - "\"device\":\"root.sg.d1\",\n" + - "\"timestamp\":%d,\n" + - "\"measurements\":[\"s1\"],\n" + - "\"values\":[%f]\n" + - "}", System.currentTimeMillis(), random.nextDouble()); - - connection.publish("root.sg.d1.s1", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -} - -connection.disconnect(); - -``` - -## 5. Customize your MQTT Message Format - -In a production environment, each device typically has its own MQTT client, and the message formats of these clients have been pre-defined. Adopting the MQTT message format supported by IoTDB would require upgrading all existing clients, which would incur significant costs. 
However, the MQTT message format can be customized with a small amount of code, without modifying the clients. -An example can be found in the [example/mqtt-customize](https://github.com/apache/iotdb/tree/rc/2.0.1/example/mqtt-customize) project. - -Assuming the MQTT client sends the following message format: -```json - { - "time":1586076045523, - "deviceID":"car_1", - "deviceType":"Gasoline car", - "point":"Fuel level", - "value":10.0 -} -``` -Or in the form of a JSON array: -```json -[ - { - "time":1586076045523, - "deviceID":"car_1", - "deviceType":"Gasoline car", - "point":"Fuel level", - "value":10.0 - }, - { - "time":1586076045524, - "deviceID":"car_2", - "deviceType":"NEV (new energy vehicle)", - "point":"Speed", - "value":80.0 - } -] -``` - -Then you can set up the custom MQTT message format through the following steps: - -1. Create a Java project and add the dependency: -```xml -<dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-server</artifactId> - <version>2.0.4-SNAPSHOT</version> -</dependency> -``` -2. Define a class that implements `org.apache.iotdb.db.protocol.mqtt.PayloadFormatter` -e.g., - -```java -package org.apache.iotdb.mqtt.server; - -import org.apache.iotdb.db.protocol.mqtt.Message; -import org.apache.iotdb.db.protocol.mqtt.PayloadFormatter; -import org.apache.iotdb.db.protocol.mqtt.TableMessage; - -import com.google.common.collect.Lists; -import com.google.gson.Gson; -import com.google.gson.GsonBuilder; -import com.google.gson.JsonArray; -import com.google.gson.JsonElement; -import com.google.gson.JsonObject; -import com.google.gson.JsonParseException; -import io.netty.buffer.ByteBuf; -import org.apache.commons.lang3.NotImplementedException; -import org.apache.tsfile.enums.TSDataType; - -import java.nio.charset.StandardCharsets; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.List; - -/** - * The Customized JSON payload formatter.
One JSON format is supported: { "time":1586076045523, - * "deviceID":"car_1", "deviceType":"NEV", "point":"Speed", "value":80.0 } - */ -public class CustomizedJsonPayloadFormatter implements PayloadFormatter { - private static final String JSON_KEY_TIME = "time"; - private static final String JSON_KEY_DEVICEID = "deviceID"; - private static final String JSON_KEY_DEVICETYPE = "deviceType"; - private static final String JSON_KEY_POINT = "point"; - private static final String JSON_KEY_VALUE = "value"; - private static final Gson GSON = new GsonBuilder().create(); - - @Override - public List<Message> format(String topic, ByteBuf payload) { - if (payload == null) { - return new ArrayList<>(); - } - String txt = payload.toString(StandardCharsets.UTF_8); - JsonElement jsonElement = GSON.fromJson(txt, JsonElement.class); - if (jsonElement.isJsonObject()) { - JsonObject jsonObject = jsonElement.getAsJsonObject(); - return formatTableRow(topic, jsonObject); - } else if (jsonElement.isJsonArray()) { - JsonArray jsonArray = jsonElement.getAsJsonArray(); - List<Message> messages = new ArrayList<>(); - for (JsonElement element : jsonArray) { - JsonObject jsonObject = element.getAsJsonObject(); - messages.addAll(formatTableRow(topic, jsonObject)); - } - return messages; - } - throw new JsonParseException("payload is invalid"); - } - - @Override - @Deprecated - public List<Message> format(ByteBuf payload) { - throw new NotImplementedException(); - } - - private List<Message> formatTableRow(String topic, JsonObject jsonObject) { - TableMessage message = new TableMessage(); - String database = !topic.contains("/") ?
topic : topic.substring(0, topic.indexOf("/")); - String table = "test_table"; - - // Parsing Database Name - message.setDatabase(database); - - // Parsing Table Name - message.setTable(table); - - // Parsing Tags - List<String> tagKeys = new ArrayList<>(); - tagKeys.add(JSON_KEY_DEVICEID); - List<Object> tagValues = new ArrayList<>(); - tagValues.add(jsonObject.get(JSON_KEY_DEVICEID).getAsString()); - message.setTagKeys(tagKeys); - message.setTagValues(tagValues); - - // Parsing Attributes - List<String> attributeKeys = new ArrayList<>(); - List<Object> attributeValues = new ArrayList<>(); - attributeKeys.add(JSON_KEY_DEVICETYPE); - attributeValues.add(jsonObject.get(JSON_KEY_DEVICETYPE).getAsString()); - message.setAttributeKeys(attributeKeys); - message.setAttributeValues(attributeValues); - - // Parsing Fields - List<String> fields = Arrays.asList(JSON_KEY_POINT); - List<TSDataType> dataTypes = Arrays.asList(TSDataType.FLOAT); - List<Object> values = Arrays.asList(jsonObject.get(JSON_KEY_VALUE).getAsFloat()); - message.setFields(fields); - message.setDataTypes(dataTypes); - message.setValues(values); - - // Parsing timestamp - message.setTimestamp(jsonObject.get(JSON_KEY_TIME).getAsLong()); - return Lists.newArrayList(message); - } - - @Override - public String getName() { - // set the value of mqtt_payload_formatter in iotdb-system.properties to the following string: - return "CustomizedJson2Table"; - } - - @Override - public String getType() { - return PayloadFormatter.TABLE_TYPE; - } -} -``` -3. Modify the file `src/main/resources/META-INF/services/org.apache.iotdb.db.protocol.mqtt.PayloadFormatter`: - clear the file and put the fully qualified name of your implementation class into it. - In this example, the content is: `org.apache.iotdb.mqtt.server.CustomizedJsonPayloadFormatter` -4. Compile your implementation as a jar file: `mvn package -DskipTests` - - -Then, on your server: -1. Create the `${IOTDB_HOME}/ext/mqtt/` folder, and put the jar into this folder. -2. Update the configuration to enable the MQTT service.
(`enable_mqtt_service=true` in `conf/iotdb-system.properties`) -3. Set the value of `mqtt_payload_formatter` in `conf/iotdb-system.properties` to the value returned by `getName()` in your implementation; - in this example, the value is `CustomizedJson2Table` -4. Launch the IoTDB server. -5. Now IoTDB will use your implementation to parse the MQTT message. - -More: the message format can be anything you want. For example, if it is a binary format, -just use `payload.forEachByte()` or `payload.array()` to get the byte content. - - -## 6. Caution - -To avoid compatibility issues caused by a default client_id, always explicitly supply a unique, non-empty client_id in every MQTT client. -Behavior varies when the client_id is missing or empty. Common examples: -1. Explicitly sending an empty string - • MQTTX: When client_id="", IoTDB silently discards the message. - • mosquitto_pub: When client_id="", IoTDB receives the message normally. -2. Omitting client_id entirely - • MQTTX: IoTDB accepts the message. - • mosquitto_pub: IoTDB rejects the connection. - Therefore, explicitly assigning a unique, non-empty client_id is the simplest way to eliminate these discrepancies and ensure reliable message delivery. diff --git a/src/UserGuide/latest/API/Programming-ODBC_timecho.md b/src/UserGuide/latest/API/Programming-ODBC_timecho.md deleted file mode 100644 index bea1a4eb2..000000000 --- a/src/UserGuide/latest/API/Programming-ODBC_timecho.md +++ /dev/null @@ -1,994 +0,0 @@ - - -# ODBC - -## 1. Feature Introduction -The IoTDB ODBC driver provides the ability to interact with the database via the standard ODBC interface, supporting data management in time-series databases through ODBC connections. It currently supports database connection, data query, data insertion, data modification, and data deletion operations, and is compatible with various applications and toolchains that support the ODBC protocol. - -> Note: This feature is supported starting from V2.0.8.2. - -## 2.
Usage Method -It is recommended to install using the pre-compiled binary package. There is no need to compile it yourself; simply use the script to complete the driver installation and system registration. Currently, only Windows systems are supported. - -### 2.1 Environment Requirements -Only the ODBC Driver Manager dependency at the operating system level is required; no compilation environment configuration is needed: - -| **Operating System** | **Requirements and Installation Method** | -| :--- | :--- | -| Windows | 1. **Windows 10/11, Server 2016/2019/2022**: Comes with ODBC Driver Manager version 17/18 built-in; no extra installation needed.
2. **Windows 8.1/Server 2012 R2**: Requires manual installation of the corresponding version of the ODBC Driver Manager. | - -### 2.2 Installation Steps -1. Contact the Timecho team to obtain the pre-compiled binary package. - Binary package directory structure: - ```Plain - ├── bin/ - │ ├── apache_iotdb_odbc.dll - │ └── install_driver.exe - ├── install.bat - └── registry.bat - ``` -2. Open a command line tool (CMD/PowerShell) with **Administrator privileges** and run the following command (the path can be replaced with any absolute path): - ```Bash - install.bat "C:\Program Files\Apache IoTDB ODBC Driver" - ``` - The script automatically completes the following operations: - * Creates the installation directory (if it does not exist). - * Copies `bin\apache_iotdb_odbc.dll` to the specified installation directory. - * Calls `install_driver.exe` to register the driver to the system via the ODBC standard API (`SQLInstallDriverEx`). -3. Verify installation: Open "ODBC Data Source Administrator". If you can see `Apache IoTDB ODBC Driver` in the "Drivers" tab, the registration was successful. - ![](/img/odbc-1-en.png) - -### 2.3 Uninstallation Steps -1. Open Command Prompt as Administrator and `cd` into the project root directory. -2. Run the uninstallation script: - ```Bash - uninstall.bat - ``` - The script will call `install_driver.exe` to unregister the driver from the system via the ODBC standard API (`SQLRemoveDriver`). The DLL files in the installation directory will not be automatically deleted; please delete them manually if cleanup is required. - -### 2.4 Connection Configuration -After installing the driver, you need to configure a Data Source Name (DSN) to allow applications to connect to the database using the DSN. The IoTDB ODBC driver supports two methods for configuring connection parameters: via Data Source and via Connection String. - -#### 2.4.1 Configuring Data Source -**Configure via ODBC Data Source Administrator** -1.
Open "ODBC Data Source Administrator", switch to the "User DSN" tab, and click the "Add" button. - ![](/img/odbc-2-en.png) -2. Select "Apache IoTDB ODBC Driver" from the pop-up driver list and click "Finish". - ![](/img/odbc-3-en.png) -3. The data source configuration dialog will appear. Fill in the connection parameters and click OK: - ![](/img/odbc-4-en.png) - The meaning of each field in the dialog box is as follows: - - | **Area** | **Field** | **Description** | - | :--- | :--- | :--- | - | Data Source | DSN Name | Data Source Name; applications refer to this data source by this name. | - | Data Source | Description | Data Source description (optional). | - | Connection | Server | IoTDB server IP address, default 127.0.0.1. | - | Connection | Port | IoTDB Session API port, default 6667. | - | Connection | User | Username, default root. | - | Connection | Password | Password, default root. | - | Options | Table Model | Check to use Table Model; uncheck to use Tree Model. | - | Options | Database | Database name. Only available in Table Model mode; grayed out in Tree Model. | - | Options | Log Level | Log level (0-4): 0=OFF, 1=ERROR, 2=WARN, 3=INFO, 4=TRACE. | - | Options | Session Timeout | Session timeout time (milliseconds); 0 means no timeout. Note: The server-side `queryTimeoutThreshold` defaults to 60000ms; exceeding this value requires modifying server configuration. | - | Options | Batch Size | Number of rows fetched per batch, default 1000. Setting to 0 resets to the default value. | - -4. After filling in the details, you can click the "Test Connection" button to test the connection. Testing will attempt to connect to the IoTDB server using the current parameters and execute a `SHOW VERSION` query. If successful, the server version information will be displayed; if failed, the specific error reason will be shown. -5. Once parameters are confirmed correct, click "OK" to save. 
The data source will appear in the "User DSN" list, as shown in the example below with the name "123". - ![](/img/odbc-5-en.png) - To modify the configuration of an existing data source, select it in the list and click the "Configure" button to edit again. - -#### 2.4.2 Connection String -The connection string format is **semicolon-separated key-value pairs**, for example: -```Bash -Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;isTableModel=false;loglevel=2 -``` -Specific field attributes are introduced in the table below: - -| **Field Name** | **Description** | **Optional Values** | **Default Value** | -| :--- | :--- |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :--- | -| DSN | Data Source Name | Custom data source name | - | -| uid | Database username | Any string | root | -| pwd | Database password | Any string | root | -| server | IoTDB server address | IP address | 127.0.0.1 | -| port | IoTDB server port | Port number | 6667 | -| database | Database name (only effective in Table Model mode) | Any string | Empty string | -| loglevel | Log level | Integer value (0-4) | 4 (LOG_LEVEL_TRACE) | -| isTableModel / tablemodel | Whether to enable Table Model mode | Boolean type, supports multiple representations:
1. 0, false, no, off: set to false;
2. 1, true, yes, on: set to true;
3. Other values default to true. | true | -| sessiontimeoutms | Session timeout time (milliseconds) | 64-bit integer, defaults to `LLONG_MAX`; setting to `0` will be replaced with `LLONG_MAX`. Note: The server has a timeout setting: `private long queryTimeoutThreshold = 60000;` this item needs to be modified to get a timeout time exceeding 60 seconds. | LLONG_MAX | -| batchsize | Batch size for fetching data each time | 64-bit integer, defaults to `1000`; setting to `0` will be replaced with `1000` | 1000 | - -Notes: -* Field names are case-insensitive (automatically converted to lowercase for comparison). -* Connection string format is semicolon-separated key-value pairs, e.g., `Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;isTableModel=false;loglevel=2`. -* For boolean fields (`isTableModel`), multiple representation methods are supported. -* All fields are optional; if not specified, default values are used. -* Unsupported fields will be ignored and a warning logged, but will not affect the connection. -* The default server interface port 6667 is the default port used by IoTDB's C++ Session interface. This ODBC driver uses the C++ Session interface to transfer data with IoTDB. If the C++ Session interface on the IoTDB server uses a non-default port, corresponding changes must be made in the ODBC connection string. - -#### 2.4.3 Relationship between Data Source Configuration and Connection String -Configurations saved in the ODBC Data Source Administrator are written into the system's ODBC data source configuration as key-value pairs (corresponding to the registry `HKEY_CURRENT_USER\SOFTWARE\ODBC\ODBC.INI` under Windows). When an application uses `SQLConnect` or specifies `DSN=DataSourceName` in the connection string, the driver reads these parameters from the system configuration. - -**The priority of the connection string is higher than the configuration saved in the DSN.** Specific rules are as follows: -1. 
If the connection string contains `DSN=xxx` and does not contain `DRIVER=...`, the driver first loads all parameters of that DSN from the system configuration as base values. -2. Then, parameters explicitly specified in the connection string will override parameters with the same name in the DSN. -3. If the connection string contains `DRIVER=...`, no DSN parameters will be read from the system configuration; it will rely entirely on the connection string. - -For example: If the DSN is configured with `Server=192.168.1.100` and `Port=6667`, but the connection string is `DSN=MyDSN;Server=127.0.0.1`, then the actual connection will use `Server=127.0.0.1` (overridden by connection string) and `Port=6667` (from DSN). - -### 2.5 Logging -Log output during driver runtime is divided into "Driver Self-Logs" and "ODBC Manager Tracing Logs". Note the impact of log levels on performance. - -#### 2.5.1 Driver Self-Logs -* Output location: `apache_iotdb_odbc.log` in the user's home directory. -* Log level: Configured via the `loglevel` parameter in the connection string (0-4; higher levels produce more detailed output). -* Performance impact: High log levels will significantly reduce driver performance; recommended for debugging only. - -#### 2.5.2 ODBC Manager Tracing Logs -* How to enable: Open "ODBC Data Source Administrator" → "Tracing" → "Start Tracing Now". -* Precautions: Enabling this will greatly reduce driver performance; use only for troubleshooting. - -## 3. 
Interface Support - -### 3.1 Method List -The driver's support status for standard ODBC APIs is as follows: - -| ODBC/Setup API | Function | Parameter List | Parameter Description | -|:------------------|:---------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :--- | -| SQLAllocHandle | Allocate ODBC handle | (SQLSMALLINT HandleType, SQLHANDLE InputHandle, SQLHANDLE *OutputHandle) | HandleType: Type of handle to allocate (ENV/DBC/STMT/DESC);
InputHandle: Parent context handle;
OutputHandle: Pointer to the returned new handle. | -| SQLBindCol | Bind column to result buffer | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN *StrLen_or_Ind) | StatementHandle: Statement handle;
ColumnNumber: Column number;
TargetType: C data type;
TargetValue: Data buffer; BufferLength: Buffer length;
StrLen_or_Ind: Returns data length or NULL indicator. | -| SQLColAttribute | Get column attribute information | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLUSMALLINT FieldIdentifier, SQLPOINTER CharacterAttribute, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength, SQLLEN *NumericAttribute) | StatementHandle: Statement handle;
ColumnNumber: Column number;
FieldIdentifier: Attribute ID;
CharacterAttribute: Character attribute output;
BufferLength: Buffer length;
StringLength: Returned length;
NumericAttribute: Numeric attribute output. | -| SQLColumns | Query table column information | (SQLHSTMT StatementHandle, SQLCHAR *CatalogName, SQLSMALLINT NameLength1, SQLCHAR *SchemaName, SQLSMALLINT NameLength2, SQLCHAR *TableName, SQLSMALLINT NameLength3, SQLCHAR *ColumnName, SQLSMALLINT NameLength4) | StatementHandle: Statement handle;
Catalog/Schema/Table/ColumnName: Query object names;
NameLength*: Corresponding name lengths. | -| SQLConnect | Establish database connection | (SQLHDBC ConnectionHandle, SQLCHAR *ServerName, SQLSMALLINT NameLength1, SQLCHAR *UserName, SQLSMALLINT NameLength2, SQLCHAR *Authentication, SQLSMALLINT NameLength3) | ConnectionHandle: Connection handle;
ServerName: Data source name;
UserName: Username;
Authentication: Password;
NameLength*: String lengths. | -| SQLDescribeCol | Describe columns in result set | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLCHAR *ColumnName, SQLSMALLINT BufferLength, SQLSMALLINT *NameLength, SQLSMALLINT *DataType, SQLULEN *ColumnSize, SQLSMALLINT *DecimalDigits, SQLSMALLINT *Nullable) | StatementHandle: Statement handle;
ColumnNumber: Column number;
ColumnName: Column name output;
BufferLength: Buffer length;
NameLength: Returned column name length;
DataType: SQL type;
ColumnSize: Column size;
DecimalDigits: Decimal digits;
Nullable: Whether nullable. | -| SQLDisconnect | Disconnect database connection | (SQLHDBC ConnectionHandle) | ConnectionHandle: Connection handle. | -| SQLDriverConnect | Establish connection using connection string | (SQLHDBC ConnectionHandle, SQLHWND WindowHandle, SQLCHAR *InConnectionString, SQLSMALLINT StringLength1, SQLCHAR *OutConnectionString, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength2, SQLUSMALLINT DriverCompletion) | ConnectionHandle: Connection handle;
WindowHandle: Window handle; InConnectionString: Input connection string;
StringLength1: Input length;
OutConnectionString: Output connection string;
BufferLength: Output buffer;
StringLength2: Returned length;
DriverCompletion: Connection prompt method. | -| SQLEndTran | Commit or rollback transaction | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT CompletionType) | HandleType: Handle type;
Handle: Connection or environment handle;
CompletionType: Commit or rollback transaction. | -| SQLExecDirect | Execute SQL statement directly | (SQLHSTMT StatementHandle, SQLCHAR *StatementText, SQLINTEGER TextLength) | StatementHandle: Statement handle;
StatementText: SQL text; TextLength: SQL length. | -| SQLFetch | Fetch next row in result set | (SQLHSTMT StatementHandle) | StatementHandle: Statement handle. | -| SQLFreeHandle | Free ODBC handle | (SQLSMALLINT HandleType, SQLHANDLE Handle) | HandleType: Handle type;
Handle: Handle to free. | -| SQLFreeStmt | Free statement-related resources | (SQLHSTMT StatementHandle, SQLUSMALLINT Option) | StatementHandle: Statement handle;
Option: Free option (close cursor/reset parameters, etc.). | -| SQLGetConnectAttr | Get connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER *StringLength) | ConnectionHandle: Connection handle;
Attribute: Attribute ID;
Value: Returned attribute value;
BufferLength: Buffer length;
StringLength: Returned length. | -| SQLGetData | Get result data | (SQLHSTMT StatementHandle, SQLUSMALLINT Col_or_Param_Num, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN *StrLen_or_Ind) | StatementHandle: Statement handle;
Col_or_Param_Num: Column number;
TargetType: C type;
TargetValue: Data buffer;
BufferLength: Buffer size;
StrLen_or_Ind: Returned length or NULL flag. | -| SQLGetDiagField | Get diagnostic field | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLSMALLINT DiagIdentifier, SQLPOINTER DiagInfo, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength) | HandleType: Handle type;
Handle: Handle;
RecNumber: Record number;
DiagIdentifier: Diagnostic field ID;
DiagInfo: Output info;
BufferLength: Buffer;
StringLength: Returned length. | -| SQLGetDiagRec | Get diagnostic record | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLCHAR *Sqlstate, SQLINTEGER *NativeError, SQLCHAR *MessageText, SQLSMALLINT BufferLength, SQLSMALLINT *TextLength) | HandleType: Handle type;
Handle: Handle;
RecNumber: Record number;
Sqlstate: SQL state code;
NativeError: Native error code;
MessageText: Error message;
BufferLength: Buffer;
TextLength: Returned length. | -| SQLGetInfo | Get database information | (SQLHDBC ConnectionHandle, SQLUSMALLINT InfoType, SQLPOINTER InfoValue, SQLSMALLINT BufferLength, SQLSMALLINT *StringLength) | ConnectionHandle: Connection handle;
InfoType: Information type;
InfoValue: Return value;
BufferLength: Buffer length;
StringLength: Returned length. | -| SQLGetStmtAttr | Get statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER *StringLength) | StatementHandle: Statement handle;
Attribute: Attribute ID;
Value: Return value;
BufferLength: Buffer;
StringLength: Returned length. | -| SQLGetTypeInfo | Get data type information | (SQLHSTMT StatementHandle, SQLSMALLINT DataType) | StatementHandle: Statement handle;
DataType: SQL data type. | -| SQLMoreResults | Get more result sets | (SQLHSTMT StatementHandle) | StatementHandle: Statement handle. | -| SQLNumResultCols | Get number of columns in result set | (SQLHSTMT StatementHandle, SQLSMALLINT *ColumnCount) | StatementHandle: Statement handle;
ColumnCount: Returned column count. | -| SQLRowCount | Get number of affected rows | (SQLHSTMT StatementHandle, SQLLEN *RowCount) | StatementHandle: Statement handle;
RowCount: Returned number of affected rows. | -| SQLSetConnectAttr | Set connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | ConnectionHandle: Connection handle;
Attribute: Attribute ID;
Value: Attribute value;
StringLength: Attribute value length. | -| SQLSetEnvAttr | Set environment attribute | (SQLHENV EnvironmentHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | EnvironmentHandle: Environment handle;
Attribute: Attribute ID;
Value: Attribute value;
StringLength: Length. | -| SQLSetStmtAttr | Set statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | StatementHandle: Statement handle;
Attribute: Attribute ID;
Value: Attribute value;
StringLength: Length. | -| SQLTables | Query table information | (SQLHSTMT StatementHandle, SQLCHAR *CatalogName, SQLSMALLINT NameLength1, SQLCHAR *SchemaName, SQLSMALLINT NameLength2, SQLCHAR *TableName, SQLSMALLINT NameLength3, SQLCHAR *TableType, SQLSMALLINT NameLength4) | StatementHandle: Statement handle;
Catalog/Schema/TableName: Table names;
TableType: Table type;
NameLength*: Corresponding lengths. | - -### 3.2 Data Type Conversion -The mapping relationship between IoTDB data types and standard ODBC data types is as follows: - -| **IoTDB Data Type** | **ODBC Data Type** | -| :--- | :--- | -| BOOLEAN | SQL_BIT | -| INT32 | SQL_INTEGER | -| INT64 | SQL_BIGINT | -| FLOAT | SQL_REAL | -| DOUBLE | SQL_DOUBLE | -| TEXT | SQL_VARCHAR | -| STRING | SQL_VARCHAR | -| BLOB | SQL_LONGVARBINARY | -| TIMESTAMP | SQL_BIGINT | -| DATE | SQL_DATE | - -## 4. Operation Examples -This chapter mainly introduces full-type operation examples for **C#**, **Python**, **C++**, **PowerBI**, and **Excel**, covering core operations such as data query, insertion, and deletion. - -### 4.1 C# Example - -```C# -/******* -Note: When the output contains Chinese characters, it may cause garbled text. -This is because the table.Write() function cannot output strings in UTF-8 encoding -and can only output using GB2312 (or another system default encoding). This issue -may not occur in software like Power BI; it also does not occur when using the Console.WriteLine function. -This is an issue with the ConsoleTable package. 
-*****/ - -using System.Data.Common; -using System.Data.Odbc; -using System.Reflection.PortableExecutable; -using ConsoleTables; -using System; - -/// Executes a SELECT query and outputs the results of root.full.fulldevice in table format -void Query(OdbcConnection dbConnection) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - dbCommand.CommandText = "SELECT * FROM root.full.fulldevice WHERE time >= 1735689600000 AND time <= 1735690790000"; - using (OdbcDataReader dbReader = dbCommand.ExecuteReader()) - { - var fCount = dbReader.FieldCount; - Console.WriteLine($"fCount = {fCount}"); - - // Output header row - var columns = new string[fCount]; - for (var i = 0; i < fCount; i++) - { - var fName = dbReader.GetName(i); - if (fName.Contains('.')) - { - fName = fName.Substring(fName.LastIndexOf('.') + 1); - } - columns[i] = fName; - } - - // Output content rows - var table = new ConsoleTable(columns); - while (dbReader.Read()) - { - var row = new object[fCount]; - for (var i = 0; i < fCount; i++) - { - if (dbReader.IsDBNull(i)) - { - row[i] = null; - continue; - } - row[i] = dbReader.GetValue(i); - } - table.AddRow(row); - } - table.Write(); - Console.WriteLine(); - } - } - } - catch (Exception ex) - { - Console.WriteLine(ex.ToString()); - } -} - -/// Executes non-query SQL statements (such as INSERT; Tree Model INSERT will auto-create paths) -void Execute(OdbcConnection dbConnection, string command) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - try - { - dbCommand.CommandText = command; - Console.WriteLine($"Execute command: {command}"); - dbCommand.ExecuteNonQuery(); - } - catch (Exception ex) - { - Console.WriteLine($"CommandText error: {ex.Message}"); - } - } - } - catch (OdbcException ex) - { - Console.WriteLine($"Database error: {ex.Message}"); - } - catch (Exception ex) - { - Console.WriteLine($"Unknown error occurred: {ex.Message}"); - } -} - -var dsn = "Apache IoTDB DSN"; -var user = "root"; 
-var password = "root"; -var server = "127.0.0.1"; -var connectionString = $"DSN={dsn};Server={server};UID={user};PWD={password};loglevel=4;istablemodel=0"; - -using (OdbcConnection dbConnection = new OdbcConnection(connectionString)) -{ - Console.WriteLine($"Start"); - try - { - dbConnection.Open(); - } - catch (Exception ex) - { - Console.WriteLine($"Login failed: {ex.Message}"); - Console.WriteLine($"Stack Trace: {ex.StackTrace}"); - dbConnection.Dispose(); - return; - } - - string[] insertStatements = new string[] - { - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, 'Device operating normally', 'DeviceA-Room1', 1735689600000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, 'Device operating normally', 'DeviceA-Room1', 1735689660000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, 'Device operating normally', 'DeviceA-Room1', 1735689720000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, 'Device temperature high alarm', 'DeviceA-Room1', 1735689780000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, 'Device status returned to normal', 'DeviceA-Room1', 1735689840000, '2026-01-04')", - "INSERT INTO 
root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, 'Device operating normally', 'DeviceB-Room2', 1735689900000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, 'Device operating normally', 'DeviceB-Room2', 1735689960000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, 'Device humidity low alarm', 'DeviceB-Room2', 1735690020000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, 'Device status returned to normal', 'DeviceB-Room2', 1735690080000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 'Device operating normally', 'DeviceC-Room3', 1735690140000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, 'Device operating normally', 'DeviceC-Room3', 1735690200000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, 'Device voltage unstable alarm', 'DeviceC-Room3', 1735690260000, 
'2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, 'Device status returned to normal', 'DeviceC-Room3', 1735690320000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, 'Device operating normally', 'DeviceD-Room4', 1735690380000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, 'Device operating normally', 'DeviceD-Room4', 1735690440000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, 'Device operating normally', 'DeviceD-Room4', 1735690500000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, 'Device signal interrupted alarm', 'DeviceD-Room4', 1735690560000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, 'Device operating normally', 'DeviceE-Room5', 1735690620000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, 'Device operating normally', 
'DeviceE-Room5', 1735690680000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', 1735690740000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', 1735690740000, '2026-01-04')" - }; - - foreach (var insert in insertStatements) - { - Execute(dbConnection, insert); - } - Console.WriteLine($"[DEBUG] Inserted {insertStatements.Length} rows. Begin to query."); - - Query(dbConnection); // Execute query and output results -} -``` - - -### 4.2 Python Example -1. To access ODBC via Python, install the `pyodbc` package: - ```Plain - pip install pyodbc - ``` -2. Full Code: - - -```Python -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -""" -Apache IoTDB ODBC Python Example - Tree Model -Uses pyodbc to connect to the IoTDB ODBC driver, using istablemodel=0 for Tree Model. -Functionality references examples/BasicTest/TreeTest/TreeTest.cs and examples/cpp-example/TreeTest.cpp. 
-""" - -import pyodbc - -def execute(conn: pyodbc.Connection, command: str) -> None: - """Executes non-query SQL statements (such as INSERT; Tree Model INSERT will auto-create paths)""" - try: - with conn.cursor() as cursor: - cursor.execute(command) - cmd_upper = command.strip().upper() - if cmd_upper.startswith(("INSERT", "UPDATE", "DELETE")): - conn.commit() - print(f"Execute command: {command}") - except pyodbc.Error as ex: - print(f"CommandText error: {ex}") - -def query(conn: pyodbc.Connection, sql: str) -> None: - """Executes a SELECT query and outputs the results of root.full.fulldevice in table format""" - try: - with conn.cursor() as cursor: - cursor.execute(sql) - col_count = len(cursor.description) - print(f"fCount = {col_count}") - - if col_count <= 0: - return - - columns = [] - for i in range(col_count): - col_name = cursor.description[i][0] or f"Column{i}" - if "." in str(col_name): - col_name = str(col_name).split(".")[-1] - columns.append(str(col_name)) - - rows = cursor.fetchall() - - # Calculate column widths - col_widths = [max(len(str(col)), 4) for col in columns] - for row in rows: - for j, val in enumerate(row): - if j < len(col_widths): - col_widths[j] = max(col_widths[j], len(str(val) if val is not None else "NULL")) - - # Print header - header = " | ".join(str(c).ljust(col_widths[i]) for i, c in enumerate(columns)) - print(header) - print("-" * len(header)) - - # Print rows - for row in rows: - values = [] - for i, val in enumerate(row): - if val is None: - cell = "NULL" - else: - cell = str(val) - values.append(cell.ljust(col_widths[i]) if i < len(col_widths) else cell) - print(" | ".join(values)) - - print() - except pyodbc.Error as ex: - print(f"Query error: {ex}") - -def main() -> None: - dsn = "Apache IoTDB DSN" - user = "root" - password = "root" - server = "127.0.0.1" - connection_string = ( - f"DSN={dsn};Server={server};UID={user};PWD={password};" - f"loglevel=4;istablemodel=0" - ) - - print("Start") - try: - conn = 
pyodbc.connect(connection_string) - except pyodbc.Error as ex: - print(f"Login failed: {ex}") - return - - try: - driver_name = conn.getinfo(6) # SQL_DRIVER_NAME - print(f"Successfully opened connection. driver = {driver_name}") - except Exception: - print("Successfully opened connection.") - - try: - insert_statements = [ - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689600001, true, 100, 10000000000, 36.5, 128.689, 'Device operating normally', 'DeviceA-Room1', 1735689600000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, 'Device operating normally', 'DeviceA-Room1', 1735689660000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, 'Device operating normally', 'DeviceA-Room1', 1735689720000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, 'Device temperature high alarm', 'DeviceA-Room1', 1735689780000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, 'Device status returned to normal', 'DeviceA-Room1', 1735689840000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, 'Device 
operating normally', 'DeviceB-Room2', 1735689900000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, 'Device operating normally', 'DeviceB-Room2', 1735689960000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, 'Device humidity low alarm', 'DeviceB-Room2', 1735690020000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, 'Device status returned to normal', 'DeviceB-Room2', 1735690080000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 'Device operating normally', 'DeviceC-Room3', 1735690140000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, 'Device operating normally', 'DeviceC-Room3', 1735690200000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, 'Device voltage unstable alarm', 'DeviceC-Room3', 1735690260000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690320000, true, 112, 
10000000012, 37.7, 129.889, 'Device status returned to normal', 'DeviceC-Room3', 1735690320000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, 'Device operating normally', 'DeviceD-Room4', 1735690380000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, 'Device operating normally', 'DeviceD-Room4', 1735690440000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, 'Device operating normally', 'DeviceD-Room4', 1735690500000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, 'Device signal interrupted alarm', 'DeviceD-Room4', 1735690560000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, 'Device operating normally', 'DeviceE-Room5', 1735690620000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, 'Device operating normally', 'DeviceE-Room5', 1735690680000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) 
VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', 1735690740000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', 1735690740000, '2026-01-04')", - ] - - for insert_sql in insert_statements: - execute(conn, insert_sql) - - print(f"[DEBUG] Inserted {len(insert_statements)} rows. Begin to query.") - - query_sql = "SELECT * FROM root.full.fulldevice WHERE time >= 1735689600000 AND time <= 1735690790000" - query(conn, query_sql) - print("Query ok") - finally: - conn.close() - -if __name__ == "__main__": - main() -``` - - -### 4.3 C++ Example - - -```C++ -#define WIN32_LEAN_AND_MEAN -#include - -#include -#include -#include -#include -#include -#include -#include - -#ifndef SQL_DIAG_COLUMN_SIZE -#define SQL_DIAG_COLUMN_SIZE 33L -#endif - -// Helper function to check ODBC errors -void CheckOdbcError(SQLRETURN retCode, SQLSMALLINT handleType, SQLHANDLE handle, const char* functionName) { - if (retCode == SQL_SUCCESS || retCode == SQL_SUCCESS_WITH_INFO) { - return; - } - - SQLCHAR sqlState[6]; - SQLCHAR message[SQL_MAX_MESSAGE_LENGTH]; - SQLINTEGER nativeError; - SQLSMALLINT textLength; - SQLRETURN errRet; - errRet = SQLGetDiagRec(handleType, handle, 1, sqlState, &nativeError, message, sizeof(message), &textLength); - - std::cerr << "ODBC Error in " << functionName << ":\n"; - std::cerr << " SQL State: " << sqlState << "\n"; - std::cerr << " Native Error: " << nativeError << "\n"; - std::cerr << " Message: " << message << "\n"; - std::cerr << " SQLGetDiagRec Return: " << errRet << "\n"; - - if (retCode == SQL_ERROR || retCode == SQL_INVALID_HANDLE) { - exit(1); - } -} - -// Helper function to print a simple table -void PrintSimpleTable(const std::vector<std::string>& headers, - const std::vector<std::vector<std::string>>& rows) { -
for (size_t i = 0; i < headers.size(); i++) { - std::cout << headers[i]; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - for (size_t i = 0; i < headers.size(); i++) { - std::cout << "----------------"; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - for (const auto& row : rows) { - for (size_t i = 0; i < row.size(); i++) { - std::cout << row[i]; - if (i < row.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - } - std::cout << std::endl; -} - -/// Executes a SELECT query and outputs the results of root.full.fulldevice in table format -void Query(SQLHDBC hDbc) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN ret = SQL_SUCCESS; - - try { - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - return; - } - - const std::string sqlQuery = "SELECT * FROM root.full.fulldevice WHERE time >= 1735689600000 AND time <= 1735690790000"; - std::cout << "Execute query: " << sqlQuery << std::endl; - - ret = SQLExecDirect(hStmt, reinterpret_cast<SQLCHAR*>(const_cast<char*>(sqlQuery.c_str())), SQL_NTS); - if (!SQL_SUCCEEDED(ret)) { - if (ret != SQL_NO_DATA) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect(SELECT)"); - } - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - SQLSMALLINT colCount = 0; - ret = SQLNumResultCols(hStmt, &colCount); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLNumResultCols"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::cout << "Column count = " << colCount << std::endl; - - if (colCount <= 0) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::vector<std::string> columnNames; - std::vector<SQLSMALLINT> columnTypes(colCount); - std::vector<SQLULEN> columnSizes(colCount); - std::vector<SQLSMALLINT> decimalDigits(colCount); - std::vector<SQLSMALLINT> nullable(colCount); - - // Get basic column information - for (SQLSMALLINT i = 1; i <= colCount; i++) { -
SQLSMALLINT nameLength = 0; - ret = SQLDescribeCol(hStmt, i, NULL, 0, &nameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get length)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::vector<SQLCHAR> colNameBuffer(nameLength + 1); - SQLSMALLINT actualNameLength = 0; - - ret = SQLDescribeCol(hStmt, i, colNameBuffer.data(), nameLength + 1, - &actualNameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get name)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::string fullName(reinterpret_cast<char*>(colNameBuffer.data())); - - size_t pos = fullName.find_last_of('.'); - if (pos != std::string::npos) { - columnNames.push_back(fullName.substr(pos + 1)); - } else { - columnNames.push_back(fullName); - } - - ret = SQLDescribeCol(hStmt, i, NULL, 0, NULL, &columnTypes[i-1], - &columnSizes[i-1], &decimalDigits[i-1], &nullable[i-1]); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get type info)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - } - - std::vector<std::vector<std::string>> tableRows; - - int rowCount = 0; - // Fetch data for every row - while (true) { - ret = SQLFetch(hStmt); - if (ret == SQL_NO_DATA) { - break; - } - - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLFetch"); - break; - } - - std::vector<std::string> row; - - for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLLEN indicator = 0; - std::string valueStr; - - SQLSMALLINT cType; - size_t bufferSize; - bool isCharacterType = false; - const int maxBufferSize = 32768; - - switch (columnTypes[i-1]) { - case SQL_CHAR: - case SQL_VARCHAR: - case SQL_LONGVARCHAR: - case SQL_WCHAR: - case SQL_WVARCHAR: - case SQL_WLONGVARCHAR: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast<int>(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; -
} - isCharacterType = true; - break; - - case SQL_DECIMAL: - case SQL_NUMERIC: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast<int>(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_INTEGER: - case SQL_SMALLINT: - case SQL_TINYINT: - case SQL_BIGINT: - cType = SQL_C_SBIGINT; - bufferSize = sizeof(SQLBIGINT); - break; - - case SQL_REAL: - case SQL_FLOAT: - case SQL_DOUBLE: - cType = SQL_C_DOUBLE; - bufferSize = sizeof(double); - break; - - case SQL_BIT: - cType = SQL_C_BIT; - bufferSize = sizeof(SQLCHAR); - break; - - case SQL_DATE: - case SQL_TYPE_DATE: - cType = SQL_C_DATE; - bufferSize = sizeof(SQL_DATE_STRUCT); - break; - - case SQL_TIME: - case SQL_TYPE_TIME: - cType = SQL_C_TIME; - bufferSize = sizeof(SQL_TIME_STRUCT); - break; - - case SQL_TIMESTAMP: - case SQL_TYPE_TIMESTAMP: - cType = SQL_C_TIMESTAMP; - bufferSize = sizeof(SQL_TIMESTAMP_STRUCT); - break; - - default: - cType = SQL_C_CHAR; - bufferSize = 256; - isCharacterType = true; - break; - } - - std::vector<SQLCHAR> buffer(bufferSize); - - ret = SQLGetData(hStmt, i, cType, buffer.data(), bufferSize, &indicator); - - if (indicator == SQL_NULL_DATA) { - valueStr = "NULL"; - } - else if (ret != SQL_SUCCESS) { - valueStr = "ERR_CONV"; - } - else { - if (cType == SQL_C_CHAR) { - valueStr = reinterpret_cast<char*>(buffer.data()); - } - else if (cType == SQL_C_SBIGINT) { - SQLBIGINT intVal = *reinterpret_cast<SQLBIGINT*>(buffer.data()); - valueStr = std::to_string(intVal); - } - else if (cType == SQL_C_DOUBLE) { - double doubleVal = *reinterpret_cast<double*>(buffer.data()); - valueStr = std::to_string(doubleVal); - } - else if (cType == SQL_C_BIT) { - valueStr = (*buffer.data() != 0) ? 
"TRUE" : "FALSE"; - } - else if (cType == SQL_C_DATE) { - SQL_DATE_STRUCT* date = reinterpret_cast<SQL_DATE_STRUCT*>(buffer.data()); - char dateStr[20]; - snprintf(dateStr, sizeof(dateStr), "%04d-%02d-%02d", - date->year, date->month, date->day); - valueStr = dateStr; - } - else if (cType == SQL_C_TIME) { - SQL_TIME_STRUCT* time = reinterpret_cast<SQL_TIME_STRUCT*>(buffer.data()); - char timeStr[15]; - snprintf(timeStr, sizeof(timeStr), "%02d:%02d:%02d", - time->hour, time->minute, time->second); - valueStr = timeStr; - } - else if (cType == SQL_C_TIMESTAMP) { - SQL_TIMESTAMP_STRUCT* ts = reinterpret_cast<SQL_TIMESTAMP_STRUCT*>(buffer.data()); - char tsStr[30]; - snprintf(tsStr, sizeof(tsStr), "%04d-%02d-%02d %02d:%02d:%02d.%06d", - ts->year, ts->month, ts->day, - ts->hour, ts->minute, ts->second, - ts->fraction / 1000); - valueStr = tsStr; - } - else { - valueStr = "UNKNOWN_TYPE"; - } - - if (isCharacterType && ret == SQL_SUCCESS_WITH_INFO) { - SQLLEN actualSize = 0; - SQLGetDiagField(SQL_HANDLE_STMT, hStmt, 0, SQL_DIAG_COLUMN_SIZE, - &actualSize, SQL_IS_INTEGER, NULL); - - if (indicator > 0 && static_cast<size_t>(indicator) > bufferSize - 1) { - valueStr += "..."; - } - } - - } - - row.push_back(valueStr); - } - - tableRows.push_back(row); - } - - if (!tableRows.empty()) { - PrintSimpleTable(columnNames, tableRows); - } - - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (const std::exception& ex) { - std::cerr << "Exception: " << ex.what() << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } - catch (...) 
{ - std::cerr << "Unknown exception occurred" << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -/// Executes non-query SQL statements (such as INSERT; Tree Model INSERT will auto-create paths) -void Execute(SQLHDBC hDbc, const std::string& command) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN ret; - - try { - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - - ret = SQLExecDirect(hStmt, (SQLCHAR*)command.c_str(), SQL_NTS); - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect"); - } - - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (...) { - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -int main() { - SQLHENV hEnv = SQL_NULL_HENV; - SQLHDBC hDbc = SQL_NULL_HDBC; - SQLRETURN ret; - - try { - std::cout << "Start" << std::endl; - - ret = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &hEnv); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_ENV)"); - - ret = SQLSetEnvAttr(hEnv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLSetEnvAttr"); - - ret = SQLAllocHandle(SQL_HANDLE_DBC, hEnv, &hDbc); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_DBC)"); - - std::string dsn = "Apache IoTDB DSN"; - std::string user = "root"; - std::string password = "root"; - std::string server = "127.0.0.1"; - - std::string connectionString = "DSN=" + dsn + ";Server=" + server + - ";UID=" + user + ";PWD=" + password + - ";loglevel=4;istablemodel=0"; - std::cout << "Using connection string: " << connectionString << std::endl; - - SQLCHAR outConnStr[1024]; - SQLSMALLINT outConnStrLen; - - ret = SQLDriverConnect(hDbc, NULL, - (SQLCHAR*)connectionString.c_str(), SQL_NTS, - outConnStr, sizeof(outConnStr), - &outConnStrLen, 
SQL_DRIVER_COMPLETE); - - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - std::cerr << "Login failed" << std::endl; - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLDriverConnect"); - return 1; - } - - SQLCHAR driverName[256]; - SQLSMALLINT nameLength; - ret = SQLGetInfo(hDbc, SQL_DRIVER_NAME, driverName, sizeof(driverName), &nameLength); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLGetInfo"); - - std::cout << "Successfully opened connection. driver = " << driverName << std::endl; - - const char* insertStatements[] = { - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, 'Device operating normally', 'DeviceA-Room1', 1735689600000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, 'Device operating normally', 'DeviceA-Room1', 1735689660000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, 'Device operating normally', 'DeviceA-Room1', 1735689720000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, 'Device temperature high alarm', 'DeviceA-Room1', 1735689780000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, 'Device status returned to normal', 'DeviceA-Room1', 1735689840000, '2026-01-04')",
- "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, 'Device operating normally', 'DeviceB-Room2', 1735689900000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, 'Device operating normally', 'DeviceB-Room2', 1735689960000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, 'Device humidity low alarm', 'DeviceB-Room2', 1735690020000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, 'Device status returned to normal', 'DeviceB-Room2', 1735690080000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 'Device operating normally', 'DeviceC-Room3', 1735690140000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, 'Device operating normally', 'DeviceC-Room3', 1735690200000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, 'Device voltage unstable alarm', 'DeviceC-Room3', 
1735690260000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, 'Device status returned to normal', 'DeviceC-Room3', 1735690320000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, 'Device operating normally', 'DeviceD-Room4', 1735690380000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, 'Device operating normally', 'DeviceD-Room4', 1735690440000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, 'Device operating normally', 'DeviceD-Room4', 1735690500000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, 'Device signal interrupted alarm', 'DeviceD-Room4', 1735690560000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, 'Device operating normally', 'DeviceE-Room5', 1735690620000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, 'Device 
operating normally', 'DeviceE-Room5', 1735690680000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', 1735690740000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, 'Device operating normally', 'DeviceE-Room5', 1735690740000, '2026-01-04')" - }; - for (const char* sql : insertStatements) { - Execute(hDbc, sql); - } - // Report the real statement count instead of a hard-coded number - std::cout << "[DEBUG] Inserted " << (sizeof(insertStatements) / sizeof(insertStatements[0])) << " rows. Begin to query." << std::endl; - Query(hDbc); - std::cout << "Query ok" << std::endl; - - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - - return 0; - } - catch (...) { - if (hDbc != SQL_NULL_HDBC) { - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - } - if (hEnv != SQL_NULL_HENV) { - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - } - - std::cerr << "Unexpected error!" << std::endl; - return 1; - } -} -``` - -### 4.4 PowerBI Example -1. Open PowerBI Desktop and create a new project. -2. Click "Home" → "Get Data" → "More..." → "ODBC" → Click the "Connect" button. -3. Data Source Selection: In the pop-up window, select "Data Source Name (DSN)" and choose `Apache IoTDB DSN` from the dropdown. -4. Advanced Configuration: - * Click "Advanced options" and fill in the configuration in the "Connection string" input box (example): - ```Plain - server=127.0.0.1;port=6667;isTableModel=false;loglevel=4 - ``` - * Notes: - * The `dsn` item is optional; the connection works whether or not it is filled in. - * `loglevel` ranges from 0 to 4: level 0 (ERROR) produces the fewest logs and level 4 (TRACE) the most detailed; set as needed.
- * `server`/`dsn`/`loglevel` are case-insensitive (e.g., can be written as `Server`). - * If relevant information is configured in the DSN, you do not need to fill in any configuration information; the Driver Manager will automatically use the configuration filled in the DSN. -5. Authentication: Enter the username (default `root`) and password (default `root`), then click "Connect". -6. Data Loading: Click "Load" to view the data. - -### 4.5 Excel Example -1. Open Excel and create a blank workbook. -2. Click the "Data" tab → "From Other Sources" → "From Data Connection Wizard". -3. Data Source Selection: Select "ODBC DSN" → Next → Select `Apache IoTDB DSN` → Next. -4. Connection Configuration: - * The input process for connection string, username, and password is exactly the same as in PowerBI. Reference format for connection string: - ```Plain - server=127.0.0.1;port=6667;isTableModel=false;loglevel=4 - ``` - * If relevant information is configured in the DSN, you do not need to fill in any configuration information; the Driver Manager will automatically use the configuration filled in the DSN. -5. Save Connection: Customize settings for the data connection file name, connection description, etc., then click "Finish". -6. Import Data: Select the location to import the data into the worksheet (e.g., cell A1 of "Existing Worksheet"), click "OK" to complete data loading. \ No newline at end of file diff --git a/src/UserGuide/latest/API/Programming-OPC-DA_timecho.md b/src/UserGuide/latest/API/Programming-OPC-DA_timecho.md deleted file mode 100644 index 861d88e98..000000000 --- a/src/UserGuide/latest/API/Programming-OPC-DA_timecho.md +++ /dev/null @@ -1,209 +0,0 @@ - - -# OPC DA Protocol - -## 1. OPC DA - -OPC DA (OPC Data Access) is a communication protocol standard in the field of industrial automation and a core part of the classic OPC (OLE for Process Control) technology. 
Its primary goal is to enable real-time data exchange between industrial devices and software (such as SCADA, HMI, and databases) in a Windows environment. OPC DA is implemented based on COM/DCOM and is a lightweight protocol with two roles: server and client. - -* **Server:** Can be regarded as a pool of items, storing the latest data and status of each instance. All items can only be managed on the server side; clients can only read and write data and have no authority to manipulate metadata. - -![](/img/opc-da-1-1.png) - -* **Client:** After connecting to the server, the client needs to define a custom group (this group is only relevant to the client) and create items with the same names as those on the server. The client can then read and write the items it has created. - -![](/img/opc-da-1-2-en.png) - -## 2. OPC DA Sink - -IoTDB (available since V2.0.5.1 for V2.x) provides an OPC DA Sink that supports pushing tree-model data to a local COM server plugin. It encapsulates the OPC DA interface specifications and their inherent complexity, significantly simplifying the integration process. The data flow diagram for the OPC DA Sink is shown below. - -![](/img/opc-da-2-1-en.png) - -### 2.1 SQL Syntax - -```SQL ----- Note: The clsID here needs to be replaced with your own clsID -create pipe opc ( - 'sink'='opc-da-sink', - --- 'opcda.progid'='opcserversim.Instance.1' - 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C' -); -``` - -### 2.2 Parameter Description - -| ​**​Parameter​**​ | ​**​Description​**​ | ​**​Value Range​**​ | ​**​Required​**​ | -| ----------------------------- | ----------------------------------------------------------------------------------------------------------- | ------------------------------- | ----------------------------------------- | -| sink | OPC DA Sink | String: opc-da-sink | Yes | -| sink.opcda.clsid | The ClsID (unique identifier string) of the OPC Server. It is recommended to use clsID instead of progID. 
| String | Either clsID or progID must be provided | -| sink.opcda.progid | The ProgID of the OPC Server. If clsID is available, it is preferred over progID. | String | Either clsID or progID must be provided | - - -### 2.3 Mapping Specifications - -When used, IoTDB will push the latest data from its tree model to the server. The itemID for the data is the full path of the time series in the tree model, such as root.a.b.c.d. Note that, according to the OPC DA standard, clients cannot directly create items on the server side. Therefore, the server must pre-create items corresponding to IoTDB's time series with the itemID and the appropriate data type. - -* Data type correspondence is as follows: - -| IoTDB | OPC-DA Server | -| ----------- | ----------------------------------------------------------- | -| INT32 | VT\_I4 | -| INT64 | VT\_I8 | -| FLOAT | VT\_R4 | -| DOUBLE | VT\_R8 | -| TEXT | VT\_BSTR | -| BOOLEAN | VT\_BOOL | -| DATE | VT\_DATE | -| TIMESTAMP | VT\_DATE | -| BLOB | VT_BSTR (Variant does not support VT_BLOB, so VT_BSTR is used as a substitute) | -| STRING | VT\_BSTR | - -### 2.4 Common Error Codes - -| Symbol | Error Code | Description | -| ----------------------------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| OPC\_E\_BADTYPE | 0xC0040004 | The server cannot convert the data between the specified format/requested data type and the canonical data type. This means the server's data type does not match IoTDB's registered type. | -| OPC\_E\_UNKNOWNITEMID | 0xC0040007 | The item ID is not defined in the server's address space (when adding or validating), or the item ID no longer exists in the server's address space (when reading or writing). 
This means IoTDB's measurement point does not have a corresponding itemID on the server. | -| OPC\_E\_INVALIDITEMID | 0xC0040008 | The itemID does not conform to the server's syntax specifications. | -| REGDB\_E\_CLASSNOTREG | 0x80040154 | Class not registered | -| RPC\_S\_SERVER\_UNAVAILABLE | 0x800706BA | RPC server unavailable | -| DISP\_E\_OVERFLOW | 0x8002000A | Exceeds the maximum value of the type | -| DISP\_E\_TYPEMISMATCH | 0x80020005 | Type mismatch | - - -### 2.5 Usage Limitations - -* Only supports COM and can only be used on Windows. -* A small amount of old data may be pushed after restarting, but new data will eventually be pushed. -* Currently, only tree-model data is supported. - -## 3. Usage Steps -### 3.1 Prerequisites -1. Windows environment, version >= 8. -2. IoTDB is installed and running normally. -3. OPC DA Server is installed. - -* Using Simple OPC Server Simulator as an example: - -![](/img/opc-da-3-1.png) - -* Double-click an item to modify its name (itemID), data, data type, and other information. -* Right-click an item to delete it, update its value, or create a new item. - -![](/img/opc-da-3-2.png) - -4. OPC DA Client is installed. -* Using KepwareServerEX's quickClient as an example: -* In Kepware, the OPC DA Client can be opened as follows: - -![](/img/opc-da-3-3-en.png) - -![](/img/opc-da-3-4-en.png) - - -### 3.2 Configuration Modifications - -Modify the server configuration to prevent IoTDB's write client and Kepware's read client from connecting to two different instances, which would make debugging impossible. - -* First, press Win+R, type `dcomcnfg` in the Run dialog, and open the DCOM component configuration: - -![](/img/opc-da-3-5-en.png) - -* Navigate to Component Services -> Computers -> My Computer -> DCOM Config, find AGG Software Simple OPC Server Simulator, right-click, and select "Properties": - -![](/img/opc-da-3-6-en.png) - -* Under Identity, change the user account to Interactive User. 
Note: Do not use Launching User, as this may cause the two clients to start different server instances. - -![](/img/opc-da-3-7-en.png) - -### 3.3 Obtaining clsID -1. Method 1: Obtain via DCOM Configuration -* Press Win+R, type `dcomcnfg` in the Run dialog, and open the DCOM component configuration. -* Navigate to Component Services -> Computers -> My Computer -> DCOM Config, find AGG Software Simple OPC Server Simulator, right-click, and select "Properties". -* Under General, you can obtain the application's clsID, which will be used for the opc-da-sink connection later. Note: Do not include the curly braces. - -![](/img/opc-da-3-8-en.png) - -2. Method 2: clsID and progID can also be obtained directly from the server. - -* Click `Help` > `Show OPC Server Info` - -![](/img/opc-da-3-9.png) - -* The pop-up window will display the information. - -![](/img/opc-da-3-10-en.png) - -### 3.4 Writing Data -#### 3.4.1 DA Server -1. Create a new item in the DA Server with the same name and type as the item to be written in IoTDB. - -![](/img/opc-da-3-11.png) - -2. Connect to the server in Kepware: - -![](/img/opc-da-3-12-en.png) - -3. Right-click the server to create a new group (the group name can be arbitrary): - -![](/img/opc-da-3-13-en.png) - -![](/img/opc-da-3-14-en.png) - -4. Right-click to create a new item with the same name as the one created earlier. - -![](/img/opc-da-3-15-en.png) - -![](/img/opc-da-3-16-en.png) - -![](/img/opc-da-3-17-en.png) - -#### 3.4.2 IoTDB - -1. Start IoTDB. -2. Create a Pipe. - -```SQL -create pipe opc ('sink'='opc-da-sink', 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C') -``` - -* Note: If the creation fails with the error `Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 1107: Failed to connect to server, error code: 0x80040154`, refer to this solution: https://opcexpert.com/support/0x80040154-class-not-registered/. - -3. Create a time series (if automatic metadata creation is enabled, this step can be skipped). 
- -```SQL -create timeseries root.a.b.c.r string; -``` - -4. Insert data. - -```SQL -insert into root.a.b.c (time, r) values(10000, "SomeString") -``` - -### 3.5 Verifying Data - -Check the data in Quick Client; it should have been updated. - -![](/img/opc-da-3-18-en.png) \ No newline at end of file diff --git a/src/UserGuide/latest/API/Programming-OPC-UA_timecho.md b/src/UserGuide/latest/API/Programming-OPC-UA_timecho.md deleted file mode 100644 index 7d139aea2..000000000 --- a/src/UserGuide/latest/API/Programming-OPC-UA_timecho.md +++ /dev/null @@ -1,373 +0,0 @@ - - -# OPC UA Protocol - -## 1. Overview - -This document describes two independent operational modes for IoTDB's integration with the OPC UA protocol. Choose the mode based on your business scenario: - -* **Mode 1: Data Subscription Service (IoTDB as OPC UA Server)**: IoTDB starts an embedded OPC UA server to passively allow external clients (e.g., UAExpert) to connect and subscribe to its internal data. This is the traditional usage. -* **Mode 2: Data Push (IoTDB as OPC UA Client)**: IoTDB acts as a client to actively synchronize data and metadata to one or more independently deployed external OPC UA servers. - > Note: This mode is supported starting from V2.0.8. - -**Note: Modes are mutually exclusive** -When the Pipe configuration specifies the `node-urls` parameter (Mode 2), IoTDB will **not** start the embedded OPC UA server (Mode 1). These two modes **cannot be used simultaneously** within the same Pipe. - -## 2. Data Subscription - -This mode supports users subscribing to data from IoTDB using the OPC UA protocol, with communication modes supporting both Client/Server and Pub/Sub. - -Note: This feature does **not** involve collecting data from external OPC Servers into IoTDB. - -![](/img/opc-ua-new-1-en.png) - -### 2.1 OPC Service Startup - -#### 2.1.1 Syntax - -Syntax for starting OPC UA protocol: - -```SQL -CREATE PIPE p1 - WITH SOURCE (...) - WITH PROCESSOR (...) 
- WITH SINK ('sink' = 'opc-ua-sink', - 'sink.opcua.tcp.port' = '12686', - 'sink.opcua.https.port' = '8443', - 'sink.user' = 'root', - 'sink.password' = 'TimechoDB@2021', // Default password was 'root' before V2.0.6.x - 'sink.opcua.security.dir' = '...' - ) -``` - -#### 2.1.2 Parameters - -| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** | -| ------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |-------------------------------------------------------------------------------------------------------------------------------------| -------------------- |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| sink | OPC UA SINK | String: opc-ua-sink | Required | | -| sink.opcua.model | OPC UA operational mode | String: client-server / pub-sub | Optional | client-server | -| sink.opcua.tcp.port | OPC UA TCP port | Integer: [0, 65536] | Optional | 12686 | -| sink.opcua.https.port | OPC UA HTTPS port | Integer: [0, 65536] | Optional | 8443 | -| sink.opcua.security.dir | OPC UA key and certificate directory | String: Path (supports absolute/relative paths) | Optional | 1. `opc_security` folder under IoTDB's DataNode conf directory `/`. 2. 
User home directory's `iotdb_opc_security` folder `/` if no IoTDB conf directory exists (e.g., when starting DataNode in IDEA) | -| opcua.security-policy | Security policy used for OPC UA connections (case-insensitive). Multiple policies can be configured and separated by commas. After configuring one policy, clients can only connect using that policy. Default implementation supports `None` and `Basic256Sha256`. Should be set to a non-`None` policy by default. `None` policy is only for debugging (convenient but insecure; not recommended for production). Note: Supported since V2.0.8, only for client-server mode. | String (security level increases):`None`,`Basic128Rsa15`,`Basic256`,`Basic256Sha256`,`Aes128_Sha256_RsaOaep`,`Aes256_Sha256_RsaPss` | Optional | `Basic256Sha256,Aes128_Sha256_RsaOaep,Aes256_Sha256_RsaPss` | -| sink.opcua.enable-anonymous-access | Whether OPC UA allows anonymous access | Boolean | Optional | true | -| sink.user | User (OPC UA allowed user) | String | Optional | root | -| sink.password | Password (OPC UA allowed password) | String | Optional | TimechoDB@2021 (Default was 'root' before V2.0.6.x) | -| opcua.with-quality | Whether OPC UA publishes data in value + quality mode. When enabled, system processes data as follows:1. Both value and quality present → Push directly to OPC UA Server.2. Only value present → Quality automatically filled as UNCERTAIN (default, configurable).3. Only quality present → Ignore write (no processing).4. Non-value/quality fields present → Ignore data and log warning (configurable log frequency to avoid high-frequency interference).5. Quality type restriction: Only boolean type supported (true = GOOD, false = BAD).**Note**: Supported since V2.0.8, only for client-server mode | Boolean | Optional | false | -| opcua.value-name | Effective when `with-quality` = true, specifies the name of the value point. 
**Note**: Supported since V2.0.8, only for client-server mode | String | Optional | value | -| opcua.quality-name | Effective when `with-quality` = true, specifies the name of the quality point. **Note**: Supported since V2.0.8, only for client-server mode | String | Optional | quality | -| opcua.default-quality | When no quality is provided, specify `GOOD`/`UNCERTAIN`/`BAD` via SQL parameter. **Note**: Supported since V2.0.8, only for client-server mode | String: `GOOD`/`UNCERTAIN`/`BAD` | Optional | `UNCERTAIN` | -| opcua.timeout-seconds | Client connection timeout in seconds (effective only when IoTDB acts as client). **Note**: Supported since V2.0.8, only for client-server mode | Long | Optional | 10L | - -#### 2.1.3 Example - -```SQL -CREATE PIPE p1 - WITH SINK ('sink' = 'opc-ua-sink', - 'sink.user' = 'root', - 'sink.password' = 'TimechoDB@2021'); -- Default password was 'root' before V2.0.6.x -START PIPE p1; -``` - -#### 2.1.4 Usage Restrictions - -1. Data must be written after the protocol starts in order to establish the connection. Only data written *after* the connection is established can be subscribed to. -2. Recommended for single-node mode. In distributed mode, each IoTDB DataNode acts as an independent OPC Server; separate subscriptions are required for each. - -### 2.2 Example of Two Communication Modes - -#### 2.2.1 Client/Server Mode - -In this mode, IoTDB's stream processing engine establishes a connection with the OPC UA Server via OPC UA Sink. The OPC UA Server maintains data in its address space, and IoTDB can request and retrieve this data. Other OPC UA clients can also access the server's data. - -* **Features**: - * OPC UA organizes device information received from the Sink into folders under the Objects folder, in a tree structure. - * Each point is recorded as a variable node with the latest value in the current database. - * OPC UA cannot delete data or change data type settings. - -##### 2.2.1.1 Preparation - -1. 
Example using UAExpert client: Download UAExpert client from https://www.unified-automation.com/downloads/opc-ua-clients.html -2. Install UAExpert and configure certificate information. - -##### 2.2.1.2 Quick Start -###### 2.2.1.2.1 Scenarios Supporting the None Security Policy -1. Start OPC UA service using SQL (detailed syntax see [IoTDB OPC Server Syntax](./Programming-OPC-UA_timecho.md#_2-1-语法)): - -```SQL -create pipe p1 with sink ('sink'='opc-ua-sink', 'opcua.security-policy'='AES128_SHA256_RSAOAEP, AES256_SHA256_RSAPSS, BASIC256SHA256, NONE'); -``` -Note: Since version V2.0.8.1, None is no longer supported by default. To use it, you must manually enable it via the security-policy parameter as shown above. - -2. Write some data: - -```SQL -INSERT INTO root.test.db(time, s2) VALUES(NOW(), 2); -``` - -3. Configure UAExpert to connect to IoTDB (password matches `sink.password` configured above, e.g., root/TimechoDB@2021): - - ::: center - - - - ::: - - ::: center - - - - ::: - -4. Trust the server certificate, then view written data under Objects folder on the left: - - ::: center - - - - ::: - - ::: center - - - - ::: - - Note: Since the SecurityPolicy is set to None, mutual certificate trust is not required. For production environments, it is recommended to use a non-None SecurityPolicy for connection, which requires mutual certificate trust. For operations, refer to the Pub/Sub mode section below. In the Client/Server certificate directory (search for the keyword keyStore in the printed logs), move the contents in reject to trusted/certs. Follow the sequence: connect → move server directory → connect → move client directory → connect. - - -5. Drag left nodes to the middle to display latest value: - - ::: center - - - - ::: - -###### 2.2.1.2.2 Scenarios Not Supporting the None Security Policy -1. Use the following SQL to create and start the OPC UA service. 
- ```SQL - create pipe p1 with sink ('sink'='opc-ua-sink'); - ``` - - Note: Since version V2.0.8.1, OpcUaSink no longer supports None mode by default for security considerations. - -2. Insert some test data. - ```SQL - insert into root.test.db(time, s2) values(now(), 2); - ``` - -3. Configure the IoTDB connection in UAExpert: - - - Do not access the URL directly; endpoints must be discovered using the Discover method - - The client first sends a GetEndpoints request with the None policy to retrieve the endpoint list - - It then selects the corresponding encrypted endpoint based on the configured Basic256Sha256 + SignAndEncrypt to establish an encrypted connection - - ![](/img/opc-ua-un-none-1.png) - -4. Use the same username and password configuration as above. After selecting the relevant connection mode (Sign / Sign & Encrypt), if the following prompt appears, click Ignore to connect directly. - - ![](/img/opc-ua-un-none-2.png) - - -#### 2.2.2 Pub/Sub Mode - -In this mode, IoTDB's stream processing engine sends data change events to the OPC UA Server (Server) via OPC UA Sink. These events are published to the server's message queue and managed via Event Nodes. Other OPC UA clients (Clients) can subscribe to these Event Nodes to receive notifications when data changes. - -* **Features**: - - * Each point is packaged as an Event Node (EventNode) by OPC UA. - * Related fields and meanings: - - | Field | Meaning | Type (Milo) | Example | - | ------------ | ------------------ | -------------- | ----------------------- | - | Time | Timestamp | DateTime | 1698907326198 | - | SourceName | Full path of point | String | root.test.opc.sensor0 | - | SourceNode | Data type of point | NodeId | Int32 | - | Message | Data | LocalizedText | 3.0 | - - - -- Events are sent only to currently subscribed clients. Unconnected clients ignore events. -- Deleted data cannot be pushed to clients. 
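
To make the field table above concrete, the sketch below models one event as a plain Python record. It assumes the `Time` value is milliseconds since the Unix epoch (consistent with the example `1698907326198`) and that the four fields have already been extracted into a dict by whatever OPC UA client you use. `PointEvent` and `from_fields` are hypothetical names used for illustration only; they are not part of IoTDB or of any OPC UA library.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Reference point for converting epoch milliseconds (assumed unit of "Time").
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


@dataclass(frozen=True)
class PointEvent:
    time: datetime    # Time: timestamp of the data point
    source_name: str  # SourceName: full path of the point
    data_type: str    # SourceNode: data type of the point
    message: str      # Message: the data, carried as text


def from_fields(fields):
    """Build a PointEvent from a dict keyed by the field names in the table."""
    return PointEvent(
        time=EPOCH + timedelta(milliseconds=int(fields["Time"])),
        source_name=str(fields["SourceName"]),
        data_type=str(fields["SourceNode"]),
        message=str(fields["Message"]),
    )


# The example row from the table above.
event = from_fields({
    "Time": 1698907326198,
    "SourceName": "root.test.opc.sensor0",
    "SourceNode": "Int32",
    "Message": "3.0",
})
print(event.time.isoformat(), event.source_name, event.message)
```

Keeping `message` as text mirrors the `LocalizedText` type in the table; converting it to a number is left to the caller, who knows the point's data type.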
- -##### 2.2.2.1 Preparation - -Code located in `example/pipe-opc-ua-sink/src/main/java/org/apache/iotdb/opcua` of iotdb-example package. - -Contains: - -- Main class (`ClientTest`) -- Client certificate logic (`IoTDBKeyStoreLoaderClient`) -- Client configuration and startup logic (`ClientExampleRunner`) -- Parent class for `ClientTest` (`ClientExample`) - -##### 2.2.2.2 Quick Start - -1. Open IoTDB and write some data: - -```SQL -INSERT INTO root.a.b(time, c, d) VALUES(NOW(), 1, 2); // Auto-creates metadata -``` - -2. Create and start Pub/Sub mode OPC UA Sink: - -```SQL -CREATE PIPE p1 WITH SINK ('sink'='opc-ua-sink', 'sink.opcua.model'='pub-sub'); -START PIPE p1; -``` - -3. Observe server creates OPC certificate directory under conf: -4. Run Client to connect, but server rejects Client certificate: -5. Enter server's `sink.opcua.security.dir` → `pki` → `rejected` directory, find Client's certificate: -6. Move Client's certificate (not copy) to `trusted/certs` directory: -7. Reopen Client → server certificate rejected by Client: -8. Enter client's `/client/security` → `pki` → `rejected` → move server's certificate (not copy) to `trusted`: -9. Open Client → successful bidirectional trust, connection established. -10. Write data to server → Client prints received data: - -#### 2.2.3 Notes - -1. **Single-node vs Cluster**: Recommend single-node (1C1D). In cluster with multiple DataNodes, data may be distributed across nodes, preventing full data subscription. -2. **No root certificate operations**: No need to handle IoTDB's root security directory `iotdb-server.pfx` or client security directory `example-client.pfx`. During bidirectional connection, root certificates are exchanged. New certificates are placed in `rejected` directory; if in `trusted/certs`, they're trusted. -3. **Recommended Java 17+**: JDK 8 may have key size restrictions causing "Illegal key size" errors. 
For specific versions (e.g., jdk.1.8u151+), add `Security.setProperty("crypto.policy", "unlimited");` in `ClientExampleRunner.createClient()`, or replace `JDK/jre/lib/security/local_policy.jar` and `US_export_policy.jar` with unlimited versions from https://www.oracle.com/java/technologies/javase-jce8-downloads.html. -4. **Connection issues**: If error is "Unknown host", modify `/etc/hosts` on IoTDB DataNode machine to add target machine's URL and hostname. - -## 3. Data Push - -In this mode, IoTDB acts as an OPC UA client via Pipe to actively push selected data (including quality code) to one or more external OPC UA servers. External servers automatically create directory trees and nodes based on IoTDB's metadata. - -![](/img/opc-ua-data-push-en.png) - -### 3.1 OPC Service Startup - -#### 3.1.1 Syntax - -Syntax for starting OPC UA protocol: - -```SQL -CREATE PIPE p1 - WITH SOURCE (...) - WITH PROCESSOR (...) - WITH SINK ('sink' = 'opc-ua-sink', - 'opcua.node-url' = '127.0.0.1:12686', - 'opcua.historizing' = 'true', - 'opcua.with-quality' = 'true' - ) -``` - -#### 3.1.2 Parameters - -| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** | -|-----------------------| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |-------------------------------------------------------------------------------------------------------------------------------------| -------------------- | -------------------- | -| sink | OPC UA SINK | String: opc-ua-sink | Required | | -| opcua.node-url | Comma-separated OPC UA TCP ports. 
When specified, IoTDB **does not** start local server but sends data to configured OPC UA Server. | String | Optional | `''` | -| opcua.historizing | When automatically creating directories and leaf nodes, whether to store historical data in new nodes. | Boolean | Optional | false | -| opcua.with-quality | Whether OPC UA publishes data in value + quality mode. When enabled, system processes data as follows:1. Both value and quality present → Push directly to OPC UA Server.2. Only value present → Quality automatically filled as UNCERTAIN (default, configurable).3. Only quality present → Ignore write (no processing).4. Non-value/quality fields present → Ignore data and log warning (configurable log frequency).5. Quality type restriction: Only boolean type supported (true = GOOD, false = BAD). | Boolean | Optional | false | -| opcua.value-name | Effective when `with-quality` = true, specifies the name of the value point. | String | Optional | value | -| opcua.quality-name | Effective when `with-quality` = true, specifies the name of the quality point. | String | Optional | quality | -| opcua.default-quality | When no quality is provided, specify `GOOD`/`UNCERTAIN`/`BAD` via SQL parameter. | String: `GOOD`/`UNCERTAIN`/`BAD` | Optional | `UNCERTAIN` | -| opcua.security-policy | OPC UA client security policy (case-insensitive), URL format: `http://opcfoundation.org/UA/SecurityPolicy#`, e.g., `http://opcfoundation.org/UA/SecurityPolicy#Aes128_Sha256_RsaOaep` | String (security level increases):`None`,`Basic128Rsa15`,`Basic256`,`Basic256Sha256`,`Aes128_Sha256_RsaOaep`,`Aes256_Sha256_RsaPss` | Optional | `Basic256Sha256` | -| opcua.timeout-seconds | Client connection timeout in seconds (effective only when IoTDB acts as client) | Long | Optional | 10L | - -> **Parameter Naming Note**: All parameters support omitting `opcua.` prefix (e.g., `node-urls` and `opcua.node-urls` are equivalent). 
-> -> **Support Note**: All `opcua.` parameters are supported starting from V2.0.8, and only for `client-server` mode. - -#### 3.1.3 Example - -```SQL -CREATE PIPE p1 - WITH SOURCE (...) - WITH PROCESSOR (...) - WITH SINK ('sink' = 'opc-ua-sink', - 'node-urls' = '127.0.0.1:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ) -``` - -#### 3.1.4 Usage Restrictions - -1. This mode **only supports `client-server` mode and tree model data**. -2. Do not configure multiple DataNodes on one machine to avoid port conflicts. -3. Pushing `OBJECT` type data is **not supported**. -4. When a time series is renamed, OPC UA Sink automatically deletes the old path and pushes data to the new path. -5. In production, it is **strongly recommended** to use a non-`None` security policy (e.g., `Basic256Sha256`) with proper bidirectional certificate trust. - -### 3.2 External OPC UA Server Project - -IoTDB supports a standalone external Server project. This Server implements the same configuration as IoTDB's embedded Server but requires additional support for dynamically creating directories and leaf nodes based on IoTDB's metadata. - -Configuration is injected via command-line args when starting the Server (no YAML/XML support). Parameter keys match IoTDB OPC Server configuration items, with dots (`.`) and hyphens (`-`) replaced by underscores (`_`). - -Example: - -```Bash -./start-IoTDB-opc-server.sh -enable_anonymous_access true -u root -pw root -https_port 8443 -``` - -Where `user` and `password` can be abbreviated as `-u` and `-pw`. All other parameter keys match configuration items. Note: `userName` is **not** a valid parameter key; only `user` is supported. - -### 3.3 Scenario Example - -**Goal**: Aggregate data from multiple sources to 3 external OPC Servers for unified monitoring center access. - -![](/img/opc-ua-data-push-example-en.png) - -1. **Preparation**: Start external OPC UA Server (port 12686) on three servers (`ip1`, `ip2`, `ip3`). -2. 
**Configure Pipes**: Create 3 Pipes in IoTDB, using `processor` or `source` path patterns to filter data by region and push to corresponding Servers. - ```SQL - -- Start IoTDB - ./start-standalone.sh - - -- Start three OPC UA Servers (on ip1, ip2, ip3) - ./start-IoTDB-external-opc-server.sh -enable-anonymous-access true -u root -pw root - - -- Create three Pipes - ./start-cli.sh - CREATE PIPE p1 - WITH SOURCE () - WITH PROCESSOR (...) - WITH SINK ('sink' = 'opc-ua-sink', - 'node-urls' = 'ip1:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ); - CREATE PIPE p2 - WITH SOURCE () - WITH PROCESSOR (...) - WITH SINK ('sink' = 'opc-ua-sink', - 'node-urls' = 'ip2:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ); - CREATE PIPE p3 - WITH SOURCE () - WITH PROCESSOR (...) - WITH SINK ('sink' = 'opc-ua-sink', - 'node-urls' = 'ip3:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ); - ``` -3. **Result**: The monitoring center only needs to connect to `ip1`, `ip2`, and `ip3` to access the complete data view from all regions, with quality information attached. diff --git a/src/UserGuide/latest/API/Programming-Python-Native-API_timecho.md b/src/UserGuide/latest/API/Programming-Python-Native-API_timecho.md deleted file mode 100644 index 9ece00588..000000000 --- a/src/UserGuide/latest/API/Programming-Python-Native-API_timecho.md +++ /dev/null @@ -1,875 +0,0 @@ - - -# Python Native API - -## 1. Requirements - -You have to install thrift (>=0.13) before using the package. - - - -## 2. How to use (Example) - -First, download the package: `pip3 install apache-iotdb>=2.0` - -Note: Do not use a newer client to connect to an older server, as this may cause connection failures or unexpected errors. 
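
The version note above can be checked before a session is opened. The following is a minimal sketch, assuming the client was installed from the `apache-iotdb` distribution named in the `pip3` command and that you already know the server's version string (for example from your deployment records). `parse_version`, `client_is_newer`, and `installed_client_version` are hypothetical helpers, not part of the IoTDB client, and comparing only the major and minor components is an assumption made for illustration.

```python
from importlib import metadata


def parse_version(text):
    """Turn a version string such as 'V2.0.8.1' into a tuple of ints (2, 0, 8, 1)."""
    parts = []
    for piece in text.strip().lstrip("vV").split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # keep only the leading digits, e.g. '1b0' -> '1'
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev'
        parts.append(int(digits))
    return tuple(parts)


def client_is_newer(client_version, server_version):
    """Return True when the client's major.minor is ahead of the server's."""
    return parse_version(client_version)[:2] > parse_version(server_version)[:2]


def installed_client_version():
    """Version of the installed client package, or None if it is not installed."""
    try:
        return metadata.version("apache-iotdb")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    client = installed_client_version()
    server = "2.0.5"  # placeholder: replace with your server's actual version
    if client is not None and client_is_newer(client, server):
        print(f"Warning: client {client} is newer than server {server}")
```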
- -You can find an example of using the package to read and write data here: [Session Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/session_example.py) - -An example of aligned timeseries: [Aligned Timeseries Session Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/session_aligned_timeseries_example.py) - -(you need to add `import iotdb` at the top of the file) - -Or: - -```python -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021"  # Before V2.0.6.x the default password is root -session = Session(ip, port_, username_, password_) -session.open(False) -zone = session.get_time_zone() -session.close() -``` - -## 3. Initialization - -* Initialize a Session - -```python -session = Session( - ip="127.0.0.1", - port="6667", - user="root", - password="TimechoDB@2021",  # Before V2.0.6.x the default password is root - fetch_size=1024, - zone_id="UTC+8", - enable_redirection=True -) -``` - -* Initialize a Session to connect multiple nodes - -```python -session = Session.init_from_node_urls( - node_urls=["127.0.0.1:6667", "127.0.0.1:6668", "127.0.0.1:6669"], - user="root", - password="TimechoDB@2021",  # Before V2.0.6.x the default password is root - fetch_size=1024, - zone_id="UTC+8", - enable_redirection=True -) -``` - -* Open a session, with a parameter to specify whether to enable RPC compression - -```python -session.open(enable_rpc_compression=False) -``` - -Note: the client's RPC compression setting must match that of the IoTDB server. - -* Close a Session - -```python -session.close() -``` - -## 4. Managing Session through SessionPool - -Utilizing SessionPool to manage sessions eliminates the need to worry about session reuse. When the number of session connections reaches the maximum capacity of the pool, requests for acquiring a session will be blocked, and you can set the blocking wait time through parameters. 
After using a session, it should be returned to the SessionPool using the `put_back` method for proper management. - -### 4.1 Create SessionPool - -```python -from iotdb.SessionPool import PoolConfig, SessionPool - -pool_config = PoolConfig(host=ip, port=port, user_name=username, - password=password, fetch_size=1024, - time_zone="UTC+8", max_retry=3) -max_pool_size = 5 -wait_timeout_in_ms = 3000 - -# Create the connection pool -session_pool = SessionPool(pool_config, max_pool_size, wait_timeout_in_ms) -``` -### 4.2 Create a SessionPool using distributed nodes -```python -pool_config = PoolConfig(node_urls=["127.0.0.1:6667", "127.0.0.1:6668", "127.0.0.1:6669"], user_name=username, - password=password, fetch_size=1024, - time_zone="UTC+8", max_retry=3) -max_pool_size = 5 -wait_timeout_in_ms = 3000 -``` -### 4.3 Acquiring a session through SessionPool and manually calling put_back after use - -```python -session = session_pool.get_session() -session.set_storage_group(STORAGE_GROUP_NAME) -session.create_time_series( - TIMESERIES_PATH, TSDataType.BOOLEAN, TSEncoding.PLAIN, Compressor.SNAPPY -) -# After usage, return the session using put_back -session_pool.put_back(session) -# When closing the sessionPool, all managed sessions will be closed as well -session_pool.close() -``` -### 4.4 SSL Connection - -#### 4.4.1 Server Certificate Configuration - -In the `conf/iotdb-system.properties` configuration file, locate or add the following configuration items: - -```properties -enable_thrift_ssl=true -key_store_path=/path/to/your/server_keystore.jks -key_store_pwd=your_keystore_password -``` - -#### 4.4.2 Configure Python Client Certificate - -- Set `use_ssl` to True to enable SSL. -- Specify the client certificate path using the `ca_certs` parameter. - -```python -use_ssl = True -ca_certs = "/path/to/your/server.crt" # or ca_certs = "/path/to/your/ca_cert.pem" -``` -**Example Code: Using SSL to Connect to IoTDB** - -```python -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. 
See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, -# software distributed under the License is distributed on an -# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -# KIND, either express or implied. See the License for the -# specific language governing permissions and limitations -# under the License. -# - -from iotdb.SessionPool import PoolConfig, SessionPool -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021"  # Before V2.0.6.x the default password is root -# Enable SSL -use_ssl = True -# Configure certificate path -ca_certs = "/path/server.crt" - - -def get_data(): - session = Session( - ip, port_, username_, password_, use_ssl=use_ssl, ca_certs=ca_certs - ) - session.open(False) - with session.execute_query_statement("select * from root.eg.etth") as result: - df = result.todf() - df.rename(columns={"Time": "date"}, inplace=True) - session.close() - return df - - -def get_data2(): - pool_config = PoolConfig( - host=ip, - port=port_, - user_name=username_, - password=password_, - fetch_size=1024, - time_zone="UTC+8", - max_retry=3, - use_ssl=use_ssl, - ca_certs=ca_certs, - ) - max_pool_size = 5 - wait_timeout_in_ms = 3000 - session_pool = SessionPool(pool_config, max_pool_size, wait_timeout_in_ms) - session = session_pool.get_session() - with session.execute_query_statement("select * from root.eg.etth") as result: - df = result.todf() - df.rename(columns={"Time": "date"}, inplace=True) - session_pool.put_back(session) - session_pool.close() - return df - - -if __name__ == "__main__": - df = get_data() -``` - -## 5. 
Data Definition Interface (DDL Interface) - -### 5.1 Database Management - -* CREATE DATABASE - -```python -session.set_storage_group(group_name) -``` - -* Delete one or several databases - -```python -session.delete_storage_group(group_name) -session.delete_storage_groups(group_name_lst) -``` -### 5.2 Timeseries Management - -* Create one or multiple timeseries - -```python -session.create_time_series(ts_path, data_type, encoding, compressor, - props=None, tags=None, attributes=None, alias=None) - -session.create_multi_time_series( - ts_path_lst, data_type_lst, encoding_lst, compressor_lst, - props_lst=None, tags_lst=None, attributes_lst=None, alias_lst=None -) -``` - -* Create aligned timeseries - -```python -session.create_aligned_time_series( - device_id, measurements_lst, data_type_lst, encoding_lst, compressor_lst -) -``` - -Attention: Alias of measurements are **not supported** currently. - -* Delete one or several timeseries - -```python -session.delete_time_series(paths_list) -``` - -* Check whether the specific timeseries exists - -```python -session.check_time_series_exists(path) -``` - -## 6. Data Manipulation Interface (DML Interface) - -### 6.1 Insert - -It is recommended to use insertTablet to help improve write efficiency. - -* Insert a Tablet,which is multiple rows of a device, each row has the same measurements - * **Better Write Performance** - * **Support null values**: fill the null value with any value, and then mark the null value via BitMap (from v0.13) - - -We have two implementations of Tablet in Python API. 
- -* Normal Tablet - -```python -values_ = [ - [False, 10, 11, 1.1, 10011.1, "test01"], - [True, 100, 11111, 1.25, 101.0, "test02"], - [False, 100, 1, 188.1, 688.25, "test03"], - [True, 0, 0, 0, 6.25, "test04"], -] -timestamps_ = [1, 2, 3, 4] -tablet_ = Tablet( - device_id, measurements_, data_types_, values_, timestamps_ -) -session.insert_tablet(tablet_) - -values_ = [ - [None, 10, 11, 1.1, 10011.1, "test01"], - [True, None, 11111, 1.25, 101.0, "test02"], - [False, 100, None, 188.1, 688.25, "test03"], - [True, 0, 0, 0, None, None], -] -timestamps_ = [16, 17, 18, 19] -tablet_ = Tablet( - device_id, measurements_, data_types_, values_, timestamps_ -) -session.insert_tablet(tablet_) -``` -* Numpy Tablet - -Comparing with Tablet, Numpy Tablet is using [numpy.ndarray](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html) to record data. -With less memory footprint and time cost of serialization, the insert performance will be better. - -**Notice** -1. time and numerical value columns in Tablet is ndarray -2. recommended to use the specific dtypes to each ndarray, see the example below - (if not, the default dtypes are also ok). 
- -```python -import numpy as np -data_types_ = [ - TSDataType.BOOLEAN, - TSDataType.INT32, - TSDataType.INT64, - TSDataType.FLOAT, - TSDataType.DOUBLE, - TSDataType.TEXT, -] -np_values_ = [ - np.array([False, True, False, True], TSDataType.BOOLEAN.np_dtype()), - np.array([10, 100, 100, 0], TSDataType.INT32.np_dtype()), - np.array([11, 11111, 1, 0], TSDataType.INT64.np_dtype()), - np.array([1.1, 1.25, 188.1, 0], TSDataType.FLOAT.np_dtype()), - np.array([10011.1, 101.0, 688.25, 6.25], TSDataType.DOUBLE.np_dtype()), - np.array(["test01", "test02", "test03", "test04"], TSDataType.TEXT.np_dtype()), -] -np_timestamps_ = np.array([1, 2, 3, 4], TSDataType.INT64.np_dtype()) -np_tablet_ = NumpyTablet( - device_id, measurements_, data_types_, np_values_, np_timestamps_ -) -session.insert_tablet(np_tablet_) - -# insert one numpy tablet with None into the database. -np_values_ = [ - np.array([False, True, False, True], TSDataType.BOOLEAN.np_dtype()), - np.array([10, 100, 100, 0], TSDataType.INT32.np_dtype()), - np.array([11, 11111, 1, 0], TSDataType.INT64.np_dtype()), - np.array([1.1, 1.25, 188.1, 0], TSDataType.FLOAT.np_dtype()), - np.array([10011.1, 101.0, 688.25, 6.25], TSDataType.DOUBLE.np_dtype()), - np.array(["test01", "test02", "test03", "test04"], TSDataType.TEXT.np_dtype()), -] -np_timestamps_ = np.array([98, 99, 100, 101], TSDataType.INT64.np_dtype()) -np_bitmaps_ = [] -for i in range(len(measurements_)): - np_bitmaps_.append(BitMap(len(np_timestamps_))) -np_bitmaps_[0].mark(0) -np_bitmaps_[1].mark(1) -np_bitmaps_[2].mark(2) -np_bitmaps_[4].mark(3) -np_bitmaps_[5].mark(3) -np_tablet_with_none = NumpyTablet( - device_id, measurements_, data_types_, np_values_, np_timestamps_, np_bitmaps_ -) -session.insert_tablet(np_tablet_with_none) -``` - -* Insert multiple Tablets - -```python -session.insert_tablets(tablet_lst) -``` - -* Insert a Record - -```python -session.insert_record(device_id, timestamp, measurements_, data_types_, values_) -``` - -* Insert multiple Records 
- -```python -session.insert_records( - device_ids_, time_list_, measurements_list_, data_type_list_, values_list_ -) -``` - -* Insert multiple Records that belong to the same device. - With type info the server does not need to do type inference, which leads to better performance - - -```python -session.insert_records_of_one_device(device_id, time_list, measurements_list, data_types_list, values_list) -``` - -### 6.2 Insert with type inference - -When the data is of String type, we can use the following interface to let the server infer the type from the value itself. For example, the value "true" is automatically inferred as a boolean, and the value "3.2" is automatically inferred as a float. Without type information, the server has to do type inference, which may cost some time. - -* Insert a Record, which contains multiple measurement values of a device at a timestamp - -```python -session.insert_str_record(device_id, timestamp, measurements, string_values) -``` - -### 6.3 Insert of Aligned Timeseries - -Inserting aligned timeseries uses interfaces named insert_aligned_XXX; otherwise they work like the interfaces above: - -* insert_aligned_record -* insert_aligned_records -* insert_aligned_records_of_one_device -* insert_aligned_tablet -* insert_aligned_tablets - - -## 7. IoTDB-SQL Interface - -* Execute query statement - -```python -session.execute_query_statement(sql) -``` - -* Execute non query statement - -```python -session.execute_non_query_statement(sql) -``` - -* Execute statement - -```python -session.execute_statement(sql) -``` - -## 8. Schema Template -### 8.1 Create Schema Template -The steps for creating a schema template are as follows -1. Create the template class -2. Add MeasurementNode objects -3. 
Execute create schema template function - -```python -template = Template(name=template_name, share_time=True) - -m_node_x = MeasurementNode("x", TSDataType.FLOAT, TSEncoding.RLE, Compressor.SNAPPY) -m_node_y = MeasurementNode("y", TSDataType.FLOAT, TSEncoding.RLE, Compressor.SNAPPY) -m_node_z = MeasurementNode("z", TSDataType.FLOAT, TSEncoding.RLE, Compressor.SNAPPY) - -template.add_template(m_node_x) -template.add_template(m_node_y) -template.add_template(m_node_z) - -session.create_schema_template(template) -``` -### 8.2 Modify Schema Template measurements -Modify the measurements in a template. The template must already have been created. The following functions add or delete measurement nodes. -* add node in template -```python -session.add_measurements_in_template(template_name, measurements_path, data_types, encodings, compressors, is_aligned) -``` - -* delete node in template -```python -session.delete_node_in_template(template_name, path) -``` - -### 8.3 Set Schema Template -```python -session.set_schema_template(template_name, prefix_path) -``` - -### 8.4 Unset Schema Template -```python -session.unset_schema_template(template_name, prefix_path) -``` - -### 8.5 Show Schema Template -* Show all schema templates -```python -session.show_all_templates() -``` -* Count all measurements in templates -```python -session.count_measurements_in_template(template_name) -``` - -* Check whether a path in the template is a measurement. The path must be in the template -```python -session.is_measurement_in_template(template_name, path) -``` - -* Check whether a path exists in the template. The path may not belong to the template -```python -session.is_path_exist_in_template(template_name, path) -``` - -* Show nodes in a schema template -```python -session.show_measurements_in_template(template_name) -``` - -* Show the path prefix where a schema template is set -```python -session.show_paths_template_set_on(template_name) -``` - -* Show the path 
prefix where a schema template is used (i.e. the time series has been created) -```python -session.show_paths_template_using_on(template_name) -``` - -### 8.6 Drop Schema Template -Delete an existing schema template. Dropping a template that is still set on a path is not supported -```python -session.drop_schema_template("template_python") -``` - - -## 9. Pandas Support - -To easily transform a query result to a [Pandas Dataframe](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html) -the SessionDataSet has a method `.todf()` which consumes the dataset and transforms it to a pandas dataframe. - -Example: - -```python -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021"  # Before V2.0.6.x the default password is root -session = Session(ip, port_, username_, password_) -session.open(False) -with session.execute_query_statement("SELECT ** FROM root") as result: - # Transform to Pandas DataFrame - df = result.todf() - -session.close() - -# Now you can work with the dataframe -df = ... -``` - - -**Since V2.0.8.2**, `SessionDataSet` provides methods for batch DataFrame retrieval to efficiently handle large-volume queries: - -```python -# Batch DataFrame retrieval -has_next = result.has_next_df() -if has_next: - df = result.next_df() - # Process DataFrame -``` - -**Method Details:** -- `has_next_df()`: Returns `True`/`False` indicating whether more data exists -- `next_df()`: Returns a `DataFrame` or `None`. 
Each call returns `fetchSize` rows (default: 5000 rows, controlled by Session's `fetch_size` parameter): - - If remaining data ≥ `fetchSize`: returns `fetchSize` rows - - If remaining data < `fetchSize`: returns all remaining rows - - If traversal completes: returns `None` -- Session validates `fetchSize` at initialization: if ≤0, resets to 5000 and logs warning: `fetch_size xxx is illegal, use default fetch_size 5000` - -**Note:** Avoid mixing different traversal methods (e.g., combining `todf()` with `next_df()`), which may cause unexpected errors. - -**Usage Example:** - -```python -from iotdb.Session import Session - -# Initialize session with fetch_size=2 -session = Session( - host="127.0.0.1", port="6667", fetch_size=2 -) -session.open(False) -session.execute_non_query_statement("CREATE DATABASE root.device0") - -# Insert 3 records -session.insert_str_record("root.device0", 123, "pressure", "15.0") -session.insert_str_record("root.device0", 124, "pressure", "15.0") -session.insert_str_record("root.device0", 125, "pressure", "15.0") - -# Query and batch retrieve -with session.execute_query_statement("SELECT * FROM root.device0") as session_data_set: - while session_data_set.has_next_df(): - df = session_data_set.next_df() - # Outputs two DataFrames: first with 2 rows, second with 1 row - print(df) - -session.close() -``` - - -## 10. IoTDB Testcontainer - -The Test Support is based on the lib `testcontainers` (https://testcontainers-python.readthedocs.io/en/latest/index.html) which you need to install in your project if you want to use the feature. 
- -To start (and stop) an IoTDB Database in a Docker container simply do: -```python -class MyTestCase(unittest.TestCase): - - def test_something(self): - with IoTDBContainer() as c: - session = Session("localhost", c.get_exposed_port(6667), "root", "TimechoDB@2021")  # Before V2.0.6.x the default password is root - session.open(False) - with session.execute_query_statement("SHOW TIMESERIES") as result: - print(result) - session.close() -``` - -By default it loads the image `apache/iotdb:latest`. If you want a specific version, pass it explicitly, e.g. `IoTDBContainer("apache/iotdb:0.12.0")` to get version `0.12.0` running. - -## 11. IoTDB DBAPI - -IoTDB DBAPI implements the Python DB API 2.0 specification (https://peps.python.org/pep-0249/), which defines a common -interface for accessing databases in Python. - -### 11.1 Examples -+ Initialization - -The initialization parameters are the same as for the session (except for `sqlalchemy_mode`). -```python -from iotdb.dbapi import connect - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021"  # Before V2.0.6.x the default password is root -conn = connect(ip, port_, username_, password_, fetch_size=1024, zone_id="UTC+8", sqlalchemy_mode=False) -cursor = conn.cursor() -``` -+ simple SQL statement execution -```python -cursor.execute("SELECT ** FROM root") -for row in cursor.fetchall(): - print(row) -``` - -+ execute SQL with parameter - -IoTDB DBAPI supports pyformat style parameters -```python -cursor.execute("SELECT ** FROM root WHERE time < %(time)s", {"time": "2017-11-01T00:08:00.000"}) -for row in cursor.fetchall(): - print(row) -``` - -+ execute SQL with parameter sequences -```python -seq_of_parameters = [ - {"timestamp": 1, "temperature": 1}, - {"timestamp": 2, "temperature": 2}, - {"timestamp": 3, "temperature": 3}, - {"timestamp": 4, "temperature": 4}, - {"timestamp": 5, "temperature": 5}, -] -sql = "insert into root.cursor(timestamp,temperature) 
values(%(timestamp)s,%(temperature)s)" -cursor.executemany(sql,seq_of_parameters) -``` - -+ close the connection and cursor -```python -cursor.close() -conn.close() -``` - -## 12. IoTDB SQLAlchemy Dialect (Experimental) -The SQLAlchemy dialect of IoTDB is written to adapt to Apache Superset. -This part is still being improved. -Please do not use it in the production environment! -### 12.1 Mapping of the metadata -The data model used by SQLAlchemy is a relational data model, which describes the relationships between different entities through tables. -While the data model of IoTDB is a hierarchical data model, which organizes the data through a tree structure. -In order to adapt IoTDB to the dialect of SQLAlchemy, the original data model in IoTDB needs to be reorganized. -Converting the data model of IoTDB into the data model of SQLAlchemy. - -The metadata in the IoTDB are: - -1. Database -2. Path -3. Entity -4. Measurement - -The metadata in the SQLAlchemy are: -1. Schema -2. Table -3. Column - -The mapping relationship between them is: - -| The metadata in the SQLAlchemy | The metadata in the IoTDB | -| -------------------- | -------------------------------------------- | -| Schema | Database | -| Table | Path ( from database to entity ) + Entity | -| Column | Measurement | - -The following figure shows the relationship between the two more intuitively: - -![sqlalchemy-to-iotdb](/img/UserGuide/API/IoTDB-SQLAlchemy/sqlalchemy-to-iotdb.png?raw=true) - -### 12.2 Data type mapping -| data type in IoTDB | data type in SQLAlchemy | -|--------------------|-------------------------| -| BOOLEAN | Boolean | -| INT32 | Integer | -| INT64 | BigInteger | -| FLOAT | Float | -| DOUBLE | Float | -| TEXT | Text | -| LONG | BigInteger | - -### 12.3 Example - -+ execute statement - -```python -from sqlalchemy import create_engine - -engine = create_engine("iotdb://root:root@127.0.0.1:6667") -connect = engine.connect() -result = connect.execute("SELECT ** FROM root") -for row in 
result.fetchall(): - print(row) -``` - -+ ORM (now only simple queries are supported) - -```python -from sqlalchemy import create_engine, Column, Float, BigInteger, MetaData -from sqlalchemy.ext.declarative import declarative_base -from sqlalchemy.orm import sessionmaker - -metadata = MetaData( - schema='root.factory' -) -Base = declarative_base(metadata=metadata) - - -class Device(Base): - __tablename__ = "room2.device1" - Time = Column(BigInteger, primary_key=True) - temperature = Column(Float) - status = Column(Float) - - -# The "@" in the password is URL-encoded as %40 so the engine URL parses correctly. -# Before V2.0.6.x the default password is root -engine = create_engine("iotdb://root:TimechoDB%402021@127.0.0.1:6667") - -DbSession = sessionmaker(bind=engine) -session = DbSession() - -res = session.query(Device.status).filter(Device.temperature > 1) - -for row in res: - print(row) -``` - - -## 13. Developers - -### 13.1 Introduction - -This is an example of how to connect to IoTDB with Python, using the Thrift RPC interfaces. Things are almost the same on Windows and Linux, but pay attention to differences such as the path separator. - - - -### 13.2 Prerequisites - -Python 3.7 or later is preferred. - -You have to install Thrift (0.11.0 or later) to compile our thrift file into Python code. Below is the official installation tutorial. After following it, you should have a `thrift` executable. - -``` -http://thrift.apache.org/docs/install/ -``` - -Before starting you need to install `requirements_dev.txt` in your python environment, e.g. by calling -```shell -pip install -r requirements_dev.txt -``` - - - -### 13.3 Compile the thrift library and Debug - -In the root of IoTDB's source code folder, run `mvn clean generate-sources -pl iotdb-client/client-py -am`. - -This will automatically delete and repopulate the folder `iotdb/thrift` with the generated thrift files. -This folder is ignored by git and should **never be pushed to git!** - -**Notice** Do not upload `iotdb/thrift` to the git repo. 
- - - - -### 13.4 Session Client & Example - -We packed up the Thrift interface in `client-py/src/iotdb/Session.py` (similar to its Java counterpart), and also provided an example file `client-py/src/SessionExample.py` showing how to use the session module. Please read it carefully. - - -Or, another simple example: - -```python -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021"  # Before V2.0.6.x the default password is root -session = Session(ip, port_, username_, password_) -session.open(False) -zone = session.get_time_zone() -session.close() -``` - - - -### 13.5 Tests - -Please add your custom tests in the `tests` folder. - -To run all defined tests just type `pytest .` in the root folder. - -**Notice** Some tests need docker to be started on your system as a test instance is started in a docker container using [testcontainers](https://testcontainers-python.readthedocs.io/en/latest/index.html). - - - -### 13.6 Further Tools - -[black](https://pypi.org/project/black/) and [flake8](https://pypi.org/project/flake8/) are installed for autoformatting and linting. -They can be run with `black .` and `flake8 .` respectively. - - - -## 14. Releasing - -To do a release just ensure that you have the right set of generated thrift files. -Then run linting and auto-formatting. -Then, ensure that all tests work (via `pytest .`). -Then you are good to go to do a release! - - - -### 14.1 Preparing your environment - -First, install all necessary dev dependencies via `pip install -r requirements_dev.txt`. - - - -### 14.2 Doing the Release - -There is a convenient script `release.sh` to do all steps for a release. 
-Namely, these are - -* Remove all transient directories from last release (if exists) -* (Re-)generate all generated sources via mvn -* Run Linting (flake8) -* Run Tests via pytest -* Build -* Release to pypi - diff --git a/src/UserGuide/latest/API/RestServiceV1_timecho.md b/src/UserGuide/latest/API/RestServiceV1_timecho.md deleted file mode 100644 index 376b21c6b..000000000 --- a/src/UserGuide/latest/API/RestServiceV1_timecho.md +++ /dev/null @@ -1,931 +0,0 @@ - - -# REST API V1(Not Recommend) -IoTDB's RESTful services can be used for query, write, and management operations, using the OpenAPI standard to define interfaces and generate frameworks. - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the REST service JAR file by default. Please contact the Timecho team to obtain the corresponding JAR file before using this service, and place it in the `timechodb_home/lib` or `timechodb_home/ext/external_service` directory. - -## 1. Enable RESTful Services - -All RESTful services require **Basic authentication** except the health check interface `/ping`. An `Authorization` header must be carried in all requests. - -1. Authentication Format -``` -Authorization: Basic -``` -Where `` is the Base64 encoding result of the string formatted as `username:password`. Quick generation methods are as follows: - -* Linux/macOS -```bash -echo -n "your_username:your_password" | base64 -Example: echo -n "root:TimechoDB@2021" | base64 -``` - -* Windows -```powershell -# PowerShell -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("username:password")) -Example: [Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("root:TimechoDB@2021")) -``` - -```cmd -# CMD -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"username:password\"))" -Example: powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"root:TimechoDB@2021\"))" -``` - -2. 
Authentication Example - -Default username: `root`, default password: `TimechoDB@2021`: -- Concatenated string: `root:TimechoDB@2021` -- Base64 encoded result: `cm9vdDpUaW1lY2hvREJAMjAyMQ==` -- Final Request Header: -``` -Authorization: Basic cm9vdDpUaW1lY2hvREJAMjAyMQ== -``` - -3. Error Description -- Incorrect username or password: Returns HTTP status code `600` with response content: -```json -{"code":600,"message":"WRONG_LOGIN_PASSWORD_ERROR"} -``` - -- Missing `Authorization` header: Returns HTTP status code `603` with response content: -```json -{"code":603,"message":"UNINITIALIZED_AUTH_ERROR"} -``` - -## 3. Interface - -### 3.1 ping - -The `/ping` API can be used for service liveness probing. - -Request method: `GET` - -Request path: `http://ip:port/ping` - -The user name used in the example is: root, password: root - -Example request: - -```shell -$ curl http://127.0.0.1:18080/ping -``` - -Response status codes: - -- `200`: The service is alive. -- `503`: The service cannot accept any requests now. - -Response parameters: - -|parameter name |parameter type |parameter describe| -|:--- | :--- | :---| -|code | integer | status code | -| message | string | message | - -Sample response: - -- With HTTP status code `200`: - - ```json - { - "code": 200, - "message": "SUCCESS_STATUS" - } - ``` - -- With HTTP status code `503`: - - ```json - { - "code": 500, - "message": "thrift service is unavailable" - } - ``` - -> `/ping` can be accessed without authorization. - -### 3.2 query - -The query interface can be used to handle data queries and metadata queries. 
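
As a minimal Python sketch (not part of the REST service itself), the snippet below builds the Basic `Authorization` header described in section 1 and a JSON body carrying a `sql` field, which is what the query interface expects. The helper names `basic_auth_header` and `build_query_request` are hypothetical and exist only for illustration. Sending the request is left to whichever HTTP client you prefer.

```python
import base64
import json


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of the Authorization header for Basic authentication."""
    # Base64-encode "username:password", as described in the authentication section.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_query_request(sql: str, username: str, password: str, row_limit=None):
    """Build the headers and JSON body for a POST to /rest/v1/query (hypothetical helper)."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": basic_auth_header(username, password),
    }
    payload = {"sql": sql}
    if row_limit is not None:
        # rowLimit is optional; omit it to fall back to the server-side default.
        payload["rowLimit"] = row_limit
    return headers, json.dumps(payload)


if __name__ == "__main__":
    headers, body = build_query_request("show child paths root", "root", "TimechoDB@2021")
    print(headers["Authorization"])
    print(body)
```

With the default credentials `root` / `TimechoDB@2021`, the header value matches the one shown in the authentication example; with `root` / `root` it matches the `cm9vdDpyb290` token used in the curl examples.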
- -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v1/query` - -Parameter Description: - -| parameter name | parameter type | required | parameter description | -|----------------| -------------- | -------- | ------------------------------------------------------------ | -| sql | string | yes | | -| rowLimit | integer | no | The maximum number of rows in the result set that can be returned by a query.
If this parameter is not set, the `rest_query_default_row_size_limit` of the configuration file will be used as the default value.
When the number of rows in the returned result set exceeds the limit, the status code `411` will be returned. | - -Response parameters: - -| parameter name | parameter type | parameter description | -|----------------| -------------- | ------------------------------------------------------------ | -| expressions | array | Array of result set column names for data query, `null` for metadata query | -| columnNames | array | Array of column names for metadata query result set, `null` for data query | -| timestamps | array | Timestamp column, `null` for metadata query | -| values | array | A two-dimensional array, the first dimension has the same length as the result set column name array, and the second dimension array represents a column of the result set | - -**Examples:** - -Tip: Statements like `select * from root.xx.**` are not recommended because those statements may cause OOM. - -**Expression query** - - ```shell - curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select s3, s4, s3 + 1 from root.sg27 limit 2"}' http://127.0.0.1:18080/rest/v1/query - ```` -Response instance - ```json - { - "expressions": [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg27.s3 + 1" - ], - "columnNames": null, - "timestamps": [ - 1635232143960, - 1635232153960 - ], - "values": [ - [ - 11, - null - ], - [ - false, - true - ], - [ - 12.0, - null - ] - ] - } - ``` - -**Show child paths** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show child paths root"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "child paths" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ] - ] -} -``` - -**Show child nodes** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show child nodes root"}' http://127.0.0.1:18080/rest/v1/query -``` - 
-```json -{ - "expressions": null, - "columnNames": [ - "child nodes" - ], - "timestamps": null, - "values": [ - [ - "sg27", - "sg28" - ] - ] -} -``` - -**Show all ttl** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show all ttl"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - null, - null - ] - ] -} -``` - -**Show ttl** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show ttl on root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27" - ], - [ - null - ] - ] -} -``` - -**Show functions** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show functions"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "function name", - "function type", - "class name (UDF)" - ], - "timestamps": null, - "values": [ - [ - "ABS", - "ACOS", - "ASIN", - ... - ], - [ - "built-in UDTF", - "built-in UDTF", - "built-in UDTF", - ... - ], - [ - "org.apache.iotdb.db.query.udf.builtin.UDTFAbs", - "org.apache.iotdb.db.query.udf.builtin.UDTFAcos", - "org.apache.iotdb.db.query.udf.builtin.UDTFAsin", - ... 
- ] - ] -} -``` - -**Show timeseries** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show timeseries"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg28.s3", - "root.sg28.s4" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg27", - "root.sg27", - "root.sg28", - "root.sg28" - ], - [ - "INT32", - "BOOLEAN", - "INT32", - "BOOLEAN" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -**Show latest timeseries** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show latest timeseries"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg28.s4", - "root.sg27.s4", - "root.sg28.s3", - "root.sg27.s3" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg28", - "root.sg27", - "root.sg28", - "root.sg27" - ], - [ - "BOOLEAN", - "BOOLEAN", - "INT32", - "INT32" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -**Count timeseries** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"count timeseries root.**"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "count" - ], - 
"timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -**Count nodes** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"count nodes root.** level=2"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -**Show devices** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show devices"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "devices", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -**Show devices with database** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show devices with database"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "devices", - "database", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -**List user** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"list user"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "user" - ], - "timestamps": null, - "values": [ - [ - "root" - ] - ] -} -``` - -**Aggregation** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(*) from root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "columnNames": null, - "timestamps": [ - 0 - ], - "values": [ - [ - 1 - ], - [ - 2 - ] - ] -} -``` - -**Group 
by level** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(*) from root.** group by level = 1"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "count(root.sg27.*)", - "count(root.sg28.*)" - ], - "timestamps": null, - "values": [ - [ - 3 - ], - [ - 3 - ] - ] -} -``` - -**Group by** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(*) from root.sg27 group by([1635232143960,1635232153960),1s)"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "columnNames": null, - "timestamps": [ - 1635232143960, - 1635232144960, - 1635232145960, - 1635232146960, - 1635232147960, - 1635232148960, - 1635232149960, - 1635232150960, - 1635232151960, - 1635232152960 - ], - "values": [ - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ], - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ] - ] -} -``` - -**Last** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select last s3 from root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "expressions": null, - "columnNames": [ - "timeseries", - "value", - "dataType" - ], - "timestamps": [ - 1635232143960 - ], - "values": [ - [ - "root.sg27.s3" - ], - [ - "11" - ], - [ - "INT32" - ] - ] -} -``` - -**Disable align** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select * from root.sg27 disable align"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "code": 407, - "message": "disable align clauses are not supported." 
-} -``` - -**Align by device** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(s3) from root.sg27 align by device"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "code": 407, - "message": "align by device clauses are not supported." -} -``` - -**Select into** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select s3, s4 into root.sg29.s1, root.sg29.s2 from root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -```json -{ - "code": 407, - "message": "select into clauses are not supported." -} -``` - -### 3.3 nonQuery - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v1/nonQuery` - -Parameter Description: - -|parameter name |parameter type |parameter describe| -|:--- | :--- | :---| -| sql | string | query content | - -Example request: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"CREATE DATABASE root.ln"}' http://127.0.0.1:18080/rest/v1/nonQuery -``` - -Response parameters: - -|parameter name |parameter type |parameter describe| -|:--- | :--- | :---| -| code | integer | status code | -| message | string | message | - -Sample response: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - - -### 3.4 insertTablet - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v1/insertTablet` - -Parameter Description: - -| parameter name |parameter type |is required|parameter describe| -|:---------------| :--- | :---| :---| -| timestamps | array | yes | Time column | -| measurements | array | yes | The name of the measuring point | -| dataTypes | array | yes | The data type | -| values | array | yes | Value columns, the values in each column can be `null` | -| isAligned | boolean | yes | Whether to align the timeseries | -| deviceId | 
string | yes | Device name | - -Example request: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"timestamps":[1635232143960,1635232153960],"measurements":["s3","s4"],"dataTypes":["INT32","BOOLEAN"],"values":[[11,null],[false,true]],"isAligned":false,"deviceId":"root.sg27"}' http://127.0.0.1:18080/rest/v1/insertTablet -``` - -Response parameters: - -|parameter name |parameter type |parameter description| -|:--- | :--- | :---| -| code | integer | status code | -| message | string | message | - -Sample response: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - -## 4. Configuration - -The configuration is located in `iotdb-system.properties`. - -* Set `enable_rest_service` to `true` to enable the module, or `false` to disable it. The default is `false`. - -```properties -enable_rest_service=true -``` - -* Set `rest_service_port` to a number (1025 to 65535) to customize the REST service socket port. This parameter takes effect only when `enable_rest_service=true`. The default is 18080. - -```properties -rest_service_port=18080 -``` - -* Set `enable_swagger` to `true` to display REST service interface information through Swagger, or `false` to hide it. The default is `false`. - -```properties -enable_swagger=false -``` - -* The maximum number of rows in the result set that can be returned by a query. When the number of rows in the returned result set exceeds the limit, the status code `411` is returned. 
- -```properties -rest_query_default_row_size_limit=10000 -``` - -* Expiration time for cached user login information (used to speed up user authentication, in seconds, 8 hours by default) - -```properties -cache_expire=28800 -``` - - -* Maximum number of users stored in the cache (default: 100) - -```properties -cache_max_num=100 -``` - -* Initial cache size (default: 10) - -```properties -cache_init_num=10 -``` - -* Whether the REST service enables SSL. Set `enable_https` to `true` to enable it, or `false` to disable it. The default is `false`. - -```properties -enable_https=false -``` - -* keyStore location path (optional) - -```properties -key_store_path= -``` - - -* keyStore password (optional) - -```properties -key_store_pwd= -``` - - -* trustStore location path (optional) - -```properties -trust_store_path= -``` - -* trustStore password (optional) - -```properties -trust_store_pwd= -``` - - -* SSL timeout period, in seconds - -```properties -idle_timeout=5000 -``` diff --git a/src/UserGuide/latest/API/RestServiceV2_timecho.md b/src/UserGuide/latest/API/RestServiceV2_timecho.md deleted file mode 100644 index 6a852c489..000000000 --- a/src/UserGuide/latest/API/RestServiceV2_timecho.md +++ /dev/null @@ -1,983 +0,0 @@ - - -# REST API V2 -IoTDB's RESTful services can be used for query, write, and management operations, using the OpenAPI standard to define interfaces and generate frameworks. - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the REST service JAR file by default. Please contact the Timecho team to obtain the corresponding JAR file before using this service, and place it in the `timechodb_home/lib` or `timechodb_home/ext/external_service` directory. - -## 1. Enable RESTful Services - -RESTful services are disabled by default. 
- -Find the `conf/iotdb-system.properties` file under the IoTDB installation directory and set `enable_rest_service` to `true` to enable the module. - - ```properties - enable_rest_service=true - ``` - -## 2. Authentication - -All RESTful services require **Basic authentication** except the health check interface `/ping`. Every other request must carry an `Authorization` header. - -1. Authentication Format -``` -Authorization: Basic <base64-credentials> -``` -Where `<base64-credentials>` is the Base64 encoding result of the string formatted as `username:password`. Quick generation methods are as follows: - -* Linux/macOS -```bash -echo -n "your_username:your_password" | base64 -# Example: -echo -n "root:TimechoDB@2021" | base64 -``` - -* Windows -```powershell -# PowerShell -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("username:password")) -# Example: -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("root:TimechoDB@2021")) -``` - -```cmd -REM CMD -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"username:password\"))" -REM Example: -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"root:TimechoDB@2021\"))" -``` - -2. Authentication Example - -Default username: `root`, default password: `TimechoDB@2021`: -- Concatenated string: `root:TimechoDB@2021` -- Base64 encoded result: `cm9vdDpUaW1lY2hvREJAMjAyMQ==` -- Final Request Header: -``` -Authorization: Basic cm9vdDpUaW1lY2hvREJAMjAyMQ== -``` - -3. Error Description -- Incorrect username or password: Returns HTTP status code `600` with response content: -```json -{"code":600,"message":"WRONG_LOGIN_PASSWORD_ERROR"} -``` - -- Missing `Authorization` header: Returns HTTP status code `603` with response content: -```json -{"code":603,"message":"UNINITIALIZED_AUTH_ERROR"} -``` - - - -## 3. Interface - -### 3.1 ping - -The `/ping` API can be used for service liveness probing. 
- -Request method: `GET` - -Request path: `http://ip:port/ping` - -The examples in this document use the username `root` and the password `TimechoDB@2021`. - -Example request: - -```shell -curl http://127.0.0.1:18080/ping -``` - -Response status codes: - -- `200`: The service is alive. -- `503`: The service cannot accept any requests now. - -Response parameters: - -|parameter name |parameter type |parameter description| -|:--- | :--- | :---| -|code | integer | status code | -| message | string | message | - -Sample response: - -- With HTTP status code `200`: - - ```json - { - "code": 200, - "message": "SUCCESS_STATUS" - } - ``` - -- With HTTP status code `503`: - - ```json - { - "code": 500, - "message": "thrift service is unavailable" - } - ``` - -> `/ping` can be accessed without authorization. - -### 3.2 query - -The query interface can be used to handle data queries and metadata queries. - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v2/query` - -Parameter Description: - -| parameter name | parameter type | required | parameter description | -|----------------| -------------- | -------- | ------------------------------------------------------------ | -| sql | string | yes | | -| row_limit | integer | no | The maximum number of rows in the result set that can be returned by a query.
If this parameter is not set, the `rest_query_default_row_size_limit` of the configuration file will be used as the default value.
When the number of rows in the returned result set exceeds the limit, the status code `411` will be returned. | - -Response parameters: - -| parameter name | parameter type | parameter description | -|----------------| -------------- | ------------------------------------------------------------ | -| expressions | array | Array of result set column names for data query, `null` for metadata query | -| column_names | array | Array of column names for metadata query result set, `null` for data query | -| timestamps | array | Timestamp column, `null` for metadata query | -| values | array | A two-dimensional array, the first dimension has the same length as the result set column name array, and the second dimension array represents a column of the result set | - -**Examples:** - -Tip: Statements like `select * from root.xx.**` are not recommended because those statements may cause OOM. - -**Expression query** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select s3, s4, s3 + 1 from root.sg27 limit 2"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg27.s3 + 1" - ], - "column_names": null, - "timestamps": [ - 1635232143960, - 1635232153960 - ], - "values": [ - [ - 11, - null - ], - [ - false, - true - ], - [ - 12.0, - null - ] - ] -} -``` - -**Show child paths** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show child paths root"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "child paths" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ] - ] -} -``` - -**Show child nodes** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show child nodes root"}' 
http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "child nodes" - ], - "timestamps": null, - "values": [ - [ - "sg27", - "sg28" - ] - ] -} -``` - -**Show all ttl** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show all ttl"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - null, - null - ] - ] -} -``` - -**Show ttl** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show ttl on root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27" - ], - [ - null - ] - ] -} -``` - -**Show functions** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show functions"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "function name", - "function type", - "class name (UDF)" - ], - "timestamps": null, - "values": [ - [ - "ABS", - "ACOS", - "ASIN", - ... - ], - [ - "built-in UDTF", - "built-in UDTF", - "built-in UDTF", - ... - ], - [ - "org.apache.iotdb.db.query.udf.builtin.UDTFAbs", - "org.apache.iotdb.db.query.udf.builtin.UDTFAcos", - "org.apache.iotdb.db.query.udf.builtin.UDTFAsin", - ... 
- ] - ] -} -``` - -**Show timeseries** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show timeseries"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg28.s3", - "root.sg28.s4" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg27", - "root.sg27", - "root.sg28", - "root.sg28" - ], - [ - "INT32", - "BOOLEAN", - "INT32", - "BOOLEAN" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -**Show latest timeseries** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show latest timeseries"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg28.s4", - "root.sg27.s4", - "root.sg28.s3", - "root.sg27.s3" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg28", - "root.sg27", - "root.sg28", - "root.sg27" - ], - [ - "BOOLEAN", - "BOOLEAN", - "INT32", - "INT32" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -**Count timeseries** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"count timeseries root.**"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - 
"expressions": null, - "column_names": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -**Count nodes** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"count nodes root.** level=2"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -**Show devices** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show devices"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "devices", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -**Show devices with database** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show devices with database"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "devices", - "database", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -**List user** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"list user"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "user" - ], - "timestamps": null, - "values": [ - [ - "root" - ] - ] -} -``` - -**Aggregation** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(*) from root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": [ - "count(root.sg27.s3)", 
- "count(root.sg27.s4)" - ], - "column_names": null, - "timestamps": [ - 0 - ], - "values": [ - [ - 1 - ], - [ - 2 - ] - ] -} -``` - -**Group by level** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(*) from root.** group by level = 1"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "count(root.sg27.*)", - "count(root.sg28.*)" - ], - "timestamps": null, - "values": [ - [ - 3 - ], - [ - 3 - ] - ] -} -``` - -**Group by** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(*) from root.sg27 group by([1635232143960,1635232153960),1s)"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "column_names": null, - "timestamps": [ - 1635232143960, - 1635232144960, - 1635232145960, - 1635232146960, - 1635232147960, - 1635232148960, - 1635232149960, - 1635232150960, - 1635232151960, - 1635232152960 - ], - "values": [ - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ], - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ] - ] -} -``` - -**Last** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select last s3 from root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "expressions": null, - "column_names": [ - "timeseries", - "value", - "dataType" - ], - "timestamps": [ - 1635232143960 - ], - "values": [ - [ - "root.sg27.s3" - ], - [ - "11" - ], - [ - "INT32" - ] - ] -} -``` - -**Disable align** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select * from root.sg27 disable align"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "code": 407, - "message": "disable 
align clauses are not supported." -} -``` - -**Align by device** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(s3) from root.sg27 align by device"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "code": 407, - "message": "align by device clauses are not supported." -} -``` - -**Select into** - -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select s3, s4 into root.sg29.s1, root.sg29.s2 from root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -```json -{ - "code": 407, - "message": "select into clauses are not supported." -} -``` - -### 3.3 nonQuery - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v2/nonQuery` - -Parameter Description: - -|parameter name |parameter type |parameter describe| -|:--- | :--- | :---| -| sql | string | query content | - -Example request: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"CREATE DATABASE root.ln"}' http://127.0.0.1:18080/rest/v2/nonQuery -``` - -Response parameters: - -|parameter name |parameter type |parameter describe| -|:--- | :--- | :---| -| code | integer | status code | -| message | string | message | - -Sample response: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - - -### 3.4 insertTablet - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v2/insertTablet` - -Parameter Description: - -| parameter name |parameter type |is required|parameter describe| -|:---------------| :--- | :---| :---| -| timestamps | array | yes | Time column | -| measurements | array | yes | The name of the measuring point | -| data_types | array | yes | The data type | -| values | array | yes | Value columns, the values in each column can be `null` | 
-| is_aligned | boolean | yes | Whether to align the timeseries | -| device | string | yes | Device name | - -Example request: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"timestamps":[1635232143960,1635232153960],"measurements":["s3","s4"],"data_types":["INT32","BOOLEAN"],"values":[[11,null],[false,true]],"is_aligned":false,"device":"root.sg27"}' http://127.0.0.1:18080/rest/v2/insertTablet -``` - -Response parameters: - -|parameter name |parameter type |parameter description| -|:--- | :--- | :---| -| code | integer | status code | -| message | string | message | - -Sample response: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - -### 3.5 insertRecords - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v2/insertRecords` - -Parameter Description: - -| parameter name |parameter type |is required|parameter description| -|:------------------| :--- | :---| :---| -| timestamps | array | yes | Time column | -| measurements_list | array | yes | The names of the measuring points, one array per record | -| data_types_list | array | yes | The data types, one array per record | -| values_list | array | yes | Value columns, the values in each column can be `null` | -| devices | array | yes | Device names, one per record | -| is_aligned | boolean | yes | Whether to align the timeseries | - -Example request: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"timestamps":[1635232113960,1635232151960,1635232143960,1635232143960],"measurements_list":[["s33","s44"],["s55","s66"],["s77","s88"],["s771","s881"]],"data_types_list":[["INT32","INT64"],["FLOAT","DOUBLE"],["FLOAT","DOUBLE"],["BOOLEAN","TEXT"]],"values_list":[[1,11],[2.1,2],[4,6],[false,"cccccc"]],"is_aligned":false,"devices":["root.s1","root.s1","root.s1","root.s3"]}' http://127.0.0.1:18080/rest/v2/insertRecords -``` - -Response parameters: - -|parameter name |parameter type 
|parameter description| -|:--- | :--- | :---| -| code | integer | status code | -| message | string | message | - -Sample response: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - -## 4. Configuration - -The configuration is located in `iotdb-system.properties`. - -* Set `enable_rest_service` to `true` to enable the module, or `false` to disable it. The default is `false`. - -```properties -enable_rest_service=true -``` - -* Set `rest_service_port` to a number (1025 to 65535) to customize the REST service socket port. This parameter takes effect only when `enable_rest_service=true`. The default is 18080. - -```properties -rest_service_port=18080 -``` - -* Set `enable_swagger` to `true` to display REST service interface information through Swagger, or `false` to hide it. The default is `false`. - -```properties -enable_swagger=false -``` - -* The maximum number of rows in the result set that can be returned by a query. When the number of rows in the returned result set exceeds the limit, the status code `411` is returned. - -```properties -rest_query_default_row_size_limit=10000 -``` - -* Expiration time for cached user login information (used to speed up user authentication, in seconds, 8 hours by default) - -```properties -cache_expire=28800 -``` - - -* Maximum number of users stored in the cache (default: 100) - -```properties -cache_max_num=100 -``` - -* Initial cache size (default: 10) - -```properties -cache_init_num=10 -``` - -* Whether the REST service enables SSL. Set `enable_https` to `true` to enable it, or `false` to disable it. The default is `false`. 
- -```properties -enable_https=false -``` - -* keyStore location path (optional) - -```properties -key_store_path= -``` - - -* keyStore password (optional) - -```properties -key_store_pwd= -``` - - -* trustStore location path (optional) - -```properties -trust_store_path= -``` - -* trustStore password (optional) - -```properties -trust_store_pwd= -``` - - -* SSL timeout period, in seconds - -```properties -idle_timeout=5000 -``` diff --git a/src/UserGuide/latest/Background-knowledge/Cluster-Concept_timecho.md b/src/UserGuide/latest/Background-knowledge/Cluster-Concept_timecho.md deleted file mode 100644 index 147d0c5ed..000000000 --- a/src/UserGuide/latest/Background-knowledge/Cluster-Concept_timecho.md +++ /dev/null @@ -1,118 +0,0 @@ - - -# Common Concepts - -## 1. Sql_dialect Related Concepts - -| Concept | Meaning | -| ----------------------- | ------------------------------------------------------------ | -| sql_dialect | IoTDB supports two time-series data mode (SQL dialects), both managing devices and measurement points. Tree: Manages data in a hierarchical path manner, where one path corresponds to one measurement point of a device. Table: Manages data in a relational table manner, where one table corresponds to a category of devices. | -| Schema | Schema is the data mode information of the database, i.e., tree structure or table structure. It includes definitions such as the names and data types of measurement points. | -| Device | Corresponds to a physical device in an actual scenario, usually containing multiple measurement points. | -| Timeseries | Also known as: physical quantity, time series, timeline, point location, semaphore, indicator, measurement value, etc. It is a time series formed by arranging multiple data points in ascending order of timestamps. Usually, a Timeseries represents a collection point that can periodically collect physical quantities of the environment it is in. 
| -| Encoding | Encoding is a compression technique that represents data in binary form to improve storage efficiency. IoTDB supports various encoding methods for different types of data. For more detailed information, please refer to: [Encoding-and-Compression](../Technical-Insider/Encoding-and-Compression.md) | -| Compression | After data encoding, IoTDB uses compression technology to further compress binary data to enhance storage efficiency. IoTDB supports multiple compression methods. For more detailed information, please refer to: [Encoding-and-Compression](../Technical-Insider/Encoding-and-Compression.md) | - -## 2. Distributed Related Concepts - -The following figure shows a common IoTDB 3C3D (3 ConfigNodes, 3 DataNodes) cluster deployment pattern: - - - -An IoTDB cluster involves the following common concepts: - -- Nodes (ConfigNode, DataNode, AINode) -- Regions (SchemaRegion, DataRegion) -- Replica Groups - -These concepts are introduced in the sections below. - - -### 2.1 Nodes - -An IoTDB cluster includes three types of nodes (processes): ConfigNode (management node), DataNode (data node), and AINode (analysis node), as shown below: - -- ConfigNode: Manages cluster node information, configuration information, user permissions, metadata, partition information, etc., and is responsible for the scheduling of distributed operations and load balancing. All ConfigNodes are full backups of each other, as shown by ConfigNode-1, ConfigNode-2, and ConfigNode-3 in the figure above. -- DataNode: Serves client requests and is responsible for data storage and computation, as shown by DataNode-1, DataNode-2, and DataNode-3 in the figure above. -- AINode: Provides machine learning capabilities, supports the registration of trained machine learning models, and allows model inference through SQL calls. It ships with built-in self-developed time-series large models and common machine learning algorithms (such as prediction and anomaly detection). 
- -### 2.2 Data Partitioning - -In IoTDB, both metadata and data are divided into small partitions, namely Regions, which are managed by various DataNodes in the cluster. - -- SchemaRegion: Metadata partition, managing the metadata of a part of devices and measurement points. SchemaRegions with the same RegionID on different DataNodes are mutual replicas, as shown in SchemaRegion-1 in the figure above, which has three replicas located on DataNode-1, DataNode-2, and DataNode-3. -- DataRegion: Data partition, managing the data of a part of devices for a certain period of time. DataRegions with the same RegionID on different DataNodes are mutual replicas, as shown in DataRegion-2 in the figure above, which has two replicas located on DataNode-1 and DataNode-2. -- For specific partitioning algorithms, please refer to: [Data Partitioning](../Technical-Insider/Cluster-data-partitioning.md) - -### 2.3 Replica Groups - -The number of replicas for data and metadata can be configured. The recommended configurations for different deployment modes are as follows, where multi-replication can provide high-availability services. - -| Category | Parameter | Stand-Alone Recommended Configuration | Cluster Recommended Configuration | -| :----- | :------------------------ | :----------- | :----------- | -| Schema | schema_replication_factor | 1 | 3 | -| Data | data_replication_factor | 1 | 2 | - - -## 3. Deployment Related Concepts - -IoTDB has three operating modes: Stand-Alone mode, Cluster mode, and Dual-Active mode. - -### 3.1 Stand-Alone Mode - -An IoTDB Stand-Alone instance includes 1 ConfigNode and 1 DataNode, i.e., 1C1D. - - -- **Features**: Easy for developers to install and deploy, with low deployment and maintenance costs and convenient operations. -- **Applicable Scenarios**: Scenarios with limited resources or low requirements for high availability, such as edge-side servers. 
-- **Deployment Method**: [Stand-Alone-Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -### 3.2 Dual-Active Mode - -Dual-active deployment is a feature of TimechoDB Enterprise Edition, which refers to two independent instances performing bidirectional synchronization and can provide services simultaneously. When one instance is restarted after a shutdown, the other instance will resume transmission of the missing data. - - -> An IoTDB dual-active instance usually consists of 2 single-machine nodes, i.e., 2 sets of 1C1D. Each instance can also be a cluster. - -- **Features**: The most resource-efficient high-availability solution. -- **Applicable Scenarios**: Scenarios with limited resources (only two servers) but requiring high-availability capabilities. -- **Deployment Method**: [Dual-Active-Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -### 3.3 Cluster Mode - -An IoTDB cluster instance consists of 3 ConfigNodes and no less than 3 DataNodes, usually 3 DataNodes, i.e., 3C3D; when some nodes fail, the remaining nodes can still provide services, ensuring the high availability of the database service, and the database performance can be improved with the addition of nodes. - -- **Features**: High availability and scalability, and the system performance can be improved by adding DataNodes. -- **Applicable Scenarios**: Enterprise-level application scenarios requiring high availability and reliability. -- **Deployment Method**: [Cluster-Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - -### 3.4 Summary of Features - -| Dimension | Stand-Alone Mode | Dual-Active Mode | Cluster Mode | -| ------------ | ---------------------------- | ------------------------ | ------------------------ | -| Applicable Scenarios | Edge-side deployment, scenarios with low requirements for high availability | High-availability business, disaster recovery scenarios, etc. 
| High-availability business, disaster recovery scenarios, etc. | -| Number of Machines Required | 1 | 2 | ≥3 | -| Security and Reliability | Cannot tolerate single-point failures | High, can tolerate single-point failures | High, can tolerate single-point failures | -| Scalability | Can expand DataNodes to improve performance | Each instance can be expanded as needed | Can expand DataNodes to improve performance | -| Performance | Can be expanded with the number of DataNodes | Same as the performance of one of the instances | Can be expanded with the number of DataNodes | - -- The deployment steps for single-machine mode and cluster mode are similar (adding ConfigNodes and DataNodes one by one), with only the number of replicas and the minimum number of nodes that can provide services being different. \ No newline at end of file diff --git a/src/UserGuide/latest/Background-knowledge/Data-Model-and-Terminology_timecho.md b/src/UserGuide/latest/Background-knowledge/Data-Model-and-Terminology_timecho.md deleted file mode 100644 index 6b6e2018d..000000000 --- a/src/UserGuide/latest/Background-knowledge/Data-Model-and-Terminology_timecho.md +++ /dev/null @@ -1,393 +0,0 @@ - - -# Modeling Scheme Design - -This section introduces how to transform time series data application scenarios into IoTDB time series mode. - -## 1. Time Series Data Mode - -Before designing an IoTDB data mode, it's essential to understand time series data and its underlying structure. For more details, refer to: [Time Series Data Mode](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - -## 2. Tree-Table Twin Mode in IoTDB - -IoTDB offers Tree-table twin mode, each with its distinct characteristics as follows: - -**Tree Mode**: It manages data points as objects, with each data point corresponding to a time series. 
The data point names, segmented by dots, form a tree-like directory structure that corresponds one-to-one with the physical world, making the read and write operations on data points straightforward and intuitive. - -> 1. When performing data mode, to meet sufficient performance requirements, it is recommended that the penultimate layer node (corresponding to the number of devices) in the data path (Path) contains no fewer than 1,000 entries. The number of devices is linked to concurrent processing capability—a higher number of devices ensures more efficient concurrent read and write operations. - In scenarios where "the number of devices is small but each device contains a large number of data points" (e.g., only 3 devices, each with 10,000 data points), it is advisable to add a .value level at the end of the path. This increases the total number of nodes in the penultimate layer. Example: root.db.device01.metric.value. -> 2. When constructing tree mode [paths](../Basic-Concept/Operate-Metadata_timecho.md#4-path-query), if node naming may include non-standard characters or special symbols, it is recommended to implement a backtick encapsulation strategy for all hierarchical nodes. This approach effectively mitigates issues such as probe registration failures and data write interruptions caused by character parsing errors, ensuring the accuracy of path identifiers in syntax parsing. - -**Table Mode**: It is recommended to create a table for each type of device. The collection of physical quantities from devices of the same type shares certain commonalities (such as the collection of temperature and humidity physical quantities), allowing for flexible and rich data analysis. - -### 2.1 Mode Characteristics - -Tree-table twin mode syntaxes have their own applicable scenarios. - -The following table compares the tree mode and the table mode from various dimensions, including applicable scenarios and typical operations. 
Users can choose the appropriate mode based on their specific usage requirements to achieve efficient data storage and management. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Dimension | Tree Mode | Table Mode |
| --- | --- | --- |
| Applicable Scenarios | Measurements management, monitoring scenarios | Device management, analysis scenarios |
| Typical Operations | Read and write operations by specifying data point paths | Data filtering and analysis through tags |
| Structural Characteristics | Flexible addition and deletion, similar to a file system | Template-based management, facilitating data governance |
| Syntax Characteristics | Concise and flexible | Rich analysis |
| Performance Comparison | Similar | Similar |
- -**Notes:** - -- Both mode spaces can coexist within the same cluster instance. Each mode follows distinct syntax and database naming conventions, and they remain isolated by default. - - -## 2.2 Model Selection - -IoTDB supports model selection through various client tools. The configuration methods for different clients are as follows: - -1. [Command-Line Interface (CLI)](../Tools-System/CLI_timecho.md) - -When connecting via CLI, specify the model using the `sql_dialect` parameter (default: tree model). - -```bash -# Tree model -start-cli.sh(bat) -start-cli.sh(bat) -sql_dialect tree - -# Table model -start-cli.sh(bat) -sql_dialect table -``` - -2. [SQL](../User-Manual/Maintenance-commands_timecho.md#_2-1-setting-the-connected-model) - -Use the `SET` statement to switch models in SQL: - -```sql --- Tree model -IoTDB> SET SQL_DIALECT=TREE - --- Table model -IoTDB> SET SQL_DIALECT=TABLE -``` - -3. Application Programming Interfaces (APIs) - -For multi-language APIs, create connections via model-specific session/session pool classes. 
Examples: - -* [Java Native API](../API/Programming-Java-Native-API_timecho.md) - -```java -// Tree model -SessionPool sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(3) - .build(); - -// Table model -ITableSessionPool tableSessionPool = - new TableSessionPoolBuilder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(1) - .build(); -``` - -* [Python Native API](../API/Programming-Python-Native-API_timecho.md) - -```python -# Tree model -session = Session( - ip=ip, - port=port, - user=username, - password=password, - fetch_size=1024, - zone_id="UTC+8", - enable_redirection=True -) - -# Table model -config = TableSessionPoolConfig( - node_urls=node_urls, - username=username, - password=password, - database=database, - max_pool_size=max_pool_size, - fetch_size=fetch_size, - wait_timeout_in_ms=wait_timeout_in_ms, -) -session_pool = TableSessionPool(config) -``` - -* [C++ Native API](../API/Programming-Cpp-Native-API.md) - -```cpp -// Tree model -session = new Session(hostip, port, username, password); - -// Table model -session = (new TableSessionBuilder()) - ->host(ip) - ->rpcPort(port) - ->username(username) - ->password(password) - ->build(); -``` - -* [Go Native API](../API/Programming-Go-Native-API.md) - -```go -// Tree model -config := &client.PoolConfig{ - Host: host, - Port: port, - UserName: user, - Password: password, -} -sessionPool = client.NewSessionPool(config, 3, 60000, 60000, false) -defer sessionPool.Close() - -// Table model -config := &client.PoolConfig{ - Host: host, - Port: port, - UserName: user, - Password: password, - Database: dbname, -} -sessionPool := client.NewTableSessionPool(config, 3, 60000, 4000, false) -defer sessionPool.Close() -``` - -* [C# Native API](../API/Programming-CSharp-Native-API.md) - -```csharp -// Tree model -var session_pool = new SessionPool(host, port, pool_size); - -// Table model -var tableSessionPool = new 
TableSessionPool.Builder() - .SetNodeUrls(nodeUrls) - .SetUsername(username) - .SetPassword(password) - .SetFetchSize(1024) - .Build(); -``` - -* [JDBC](../API/Programming-JDBC_timecho.md) - -For the table model, include `sql_dialect=table` in the JDBC URL: - -```java -// Tree model -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection connection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667/", username, password); - -// Table model -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection connection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table", username, password); -``` - -## 2.3 Tree-to-Table Conversion - -IoTDB supports **tree-to-table conversion**, as shown in the figure below: - -![](/img/tree-to-table-en-1.png) - -This feature allows existing tree-model data to be transformed into table views. Users can then query the same dataset using either model. Detailed instructions are available in [Tree-to-Table View](../../latest-Table/User-Manual/Tree-to-Table_timecho.md). **Note**: SQL statements for creating tree-to-table views **must be executed in table mode**. - - -## 3. Application Scenarios - -The application scenarios mainly include three categories: - -- Scenario 1: Using the tree mode for data reading and writing. - -- Scenario 2: Using the table mode for data reading and writing. - -- Scenario 3: Sharing the same dataset, using the tree mode for data reading and writing, and the table mode for data analysis. - -### 3.1 Scenario 1: Tree Mode - -#### 3.1.1 Characteristics - -- Simple and intuitive, corresponding one-to-one with monitoring points in the physical world. - -- Flexible like a file system, allowing the design of any branch structure. - -- Suitable for industrial monitoring scenarios such as DCS and SCADA. 
- -#### 3.1.2 Basic Concepts - -| **Concept** | **Definition** | -| ---------------------------- | ------------------------------------------------------------ | -| **Database** | **Definition**: A path prefixed with `root.`.
**Naming Recommendation**: Only include the next level node under `root`, such as `root.db`.
**Quantity Recommendation**: The upper limit is related to memory. A single database can fully utilize machine resources; there is no need to create multiple databases for performance reasons.
**Creation Method**: Recommended to create manually, but can also be created automatically when a time series is created (defaults to the next level node under `root`). | -| **Time Series (Data Point)** | **Definition**:
A path prefixed with the database path, segmented by `.`, and can contain any number of levels, such as `root.db.turbine.device1.metric1`.
Each time series can have different data types.
**Naming Recommendation**:
Only include unique identifiers (similar to a composite primary key) in the path, generally not exceeding 10 levels.
Typically, place tags with low cardinality (fewer distinct values) at the front to facilitate system compression of common prefixes.
**Quantity Recommendation**:
The total number of time series manageable by the cluster is related to total memory; refer to the resource recommendation section.
There is no limit to the number of child nodes at any level.
**Creation Method**: Can be created manually or automatically during data writing. | -| **Device** | **Definition**: The second-to-last level is the device, such as `device1` in `root.db.turbine.device1.metric1`.
**Creation Method**: A device cannot be created on its own; it comes into existence when its time series are created. | - -#### 3.1.3 Mode Examples - -##### 3.1.3.1 How to model when managing multiple types of devices? - -- If different types of devices in the scenario have different hierarchical paths and data point sets, create branches under the database node by device type. Each device type can have a different data point structure. - -
- -
- -##### 3.1.3.2 How to model when there are no devices, only data points? - -- For example, in a monitoring system for a station, each data point has a unique number but does not correspond to any specific device. - -
- -
- -##### 3.1.3.3 How to model when a device has both sub-devices and data points? - -- For example, in an energy storage scenario, each layer of the structure monitors its own voltage and current. The following modeling approach can be used. - -
- -
- - -### 3.2 Scenario 2: Table Mode - -#### 3.2.1 Characteristics - -- Models and manages device time series data using time series tables, facilitating analysis with standard SQL. - -- Suitable for device data analysis or migrating data from other databases to IoTDB. - -#### 3.2.2 Basic Concepts - -- Database: Can manage multiple types of devices. - -- Time Series Table: Corresponds to a type of device. - -| **Category** | **Definition** | -| -------------------------------- | ------------------------------------------------------------ | -| **Time Column (TIME)** | Each time series table must have a time column named `time`, with the data type `TIMESTAMP`. | -| **Tag Column (TAG)** | Unique identifiers (composite primary key) for devices; a table can have zero or more tag columns.
Tag information cannot be modified or deleted but can be added.
It is recommended to arrange tags from coarse to fine granularity. | -| **Data Point Column (FIELD)** | A device can collect one or more data points, with values changing over time.
There is no limit to the number of data point columns; it can reach hundreds of thousands. | -| **Attribute Column (ATTRIBUTE)** | Supplementary descriptions of devices, not changing over time.
Device attribute information can range from 0 to multiple and can be updated or added.
A small number of static attributes that may need modification can be stored here. | - -**Data Filtering Efficiency**: Time Column = Tag Column > Attribute Column > Data Point Column. - -#### 3.2.3 Mode Examples - -##### 3.2.3.1 How to model when managing multiple types of devices? - -- It is recommended to create a table for each type of device, with each table having its own tags and data point sets. - -- Even if devices are related or have hierarchical relationships, it is recommended to create a table for each type of device. - -
- -
- -##### 3.2.3.2 How to model when there are no device identifier columns or attribute columns? - -- There is no limit to the number of columns; it can reach hundreds of thousands. - -
- -
- -##### 3.2.3.3 How to model when a device has both sub-devices and data points? - -- Each device has multiple sub-devices and data point information. It is recommended to create a table for each type of device for management. - -
- -
- -### 3.3 Scenario 3: Dual-Mode Integration - -#### 3.3.1 Characteristics - -- Combines the advantages of the tree mode and table mode: both share the same dataset, with flexible writing and rich querying. - -- During the data writing phase, the tree mode syntax is used, supporting flexible data access and expansion. - -- During the data analysis phase, the table mode syntax is used, allowing users to perform complex data analysis using standard SQL queries. - -#### 3.3.2 Mode Examples - -##### 3.3.2.1 How to model when managing multiple types of devices? - -- Different types of devices in the scenario have different hierarchical paths and data point sets. - -- **Tree Mode**: Create branches under the database node by device type, with each device type having a different data point structure. - -- **Table View**: Create a table view for each type of device, with each table view having different tags and data point sets. - -
- -
- -##### 3.3.2.2 How to model when there are no device identifier columns or attribute columns? - -- **Tree Mode**: Each data point has a unique number but does not correspond to any specific device. -- **Table View**: Place all data points into a single table. There is no limit to the number of data point columns; it can reach hundreds of thousands. If data points have the same data type, they can be treated as the same type of device. - -
- -
- -##### 3.3.2.3 How to model when a device has both sub-devices and data points? - -- **Tree Mode**: Model each layer of the structure according to the monitoring points in the physical world. -- **Table View**: Create multiple tables to manage each layer of structural information according to device classification. - -
- -
diff --git a/src/UserGuide/latest/Background-knowledge/Navigating_Time_Series_Data_timecho.md b/src/UserGuide/latest/Background-knowledge/Navigating_Time_Series_Data_timecho.md deleted file mode 100644 index dc29b26d4..000000000 --- a/src/UserGuide/latest/Background-knowledge/Navigating_Time_Series_Data_timecho.md +++ /dev/null @@ -1,65 +0,0 @@ - -# Timeseries Data Model - -## 1. What Is Time Series Data? - -In today's era of the Internet of Things, various scenarios such as the Internet of Things and industrial scenarios are undergoing digital transformation. People collect various states of devices by installing sensors on them. If the motor collects voltage and current, the blade speed, angular velocity, and power generation of the fan; Vehicle collection of latitude and longitude, speed, and fuel consumption; The vibration frequency, deflection, displacement, etc. of the bridge. The data collection of sensors has penetrated into various industries. - -![](/img/time-series-data-en-01.png) - -Generally speaking, we refer to each collection point as a measurement point (also known as a physical quantity, time series, timeline, signal quantity, indicator, measurement value, etc.). Each measurement point continuously collects new data information over time, forming a time series. In the form of a table, each time series is a table formed by two columns: time and value; In a graphical way, each time series is a trend chart formed over time, which can also be vividly referred to as the device's electrocardiogram. - -![](/img/time-series-data-en-02.png) - -The massive time series data generated by sensors is the foundation of digital transformation in various industries, so our modeling of time series data mainly focuses on equipment and sensors. - -## 2. Key Concepts of Time Series Data -The main concepts involved in time-series data can be divided from bottom to top: data points, measurement points, and equipment. 
- -![](/img/time-series-data-en-04.png) - -### 2.1 Data Point - -- Definition: Consists of a timestamp and a value, where the timestamp is of type long and the value can be of various types such as BOOLEAN, FLOAT, INT32, etc. -- Example: A row of a time series in the form of a table in the above figure, or a point of a time series in the form of a graph, is a data point. - -![](/img/time-series-data-en-03.png) - -### 2.2 Measurement Points - -- Definition: It is a time series formed by multiple data points arranged in increments according to timestamps. Usually, a measuring point represents a collection point and can regularly collect physical quantities of the environment it is located in. -- Also known as: physical quantity, time series, timeline, semaphore, indicator, measurement value, etc -- Example: - - Electricity scenario: current, voltage - - Energy scenario: wind speed, rotational speed - - Vehicle networking scenarios: fuel consumption, vehicle speed, longitude, dimensions - - Factory scenario: temperature, humidity -- In the tree model, the total number of measurement points equals the number of leaf nodes under the entire path pattern. 
For detailed statistics methods, refer to [Count Timeseries](../Basic-Concept/Operate-Metadata_timecho.md#_2-7-count-timeseries) - -### 2.3 Device - -- Definition: Corresponding to a physical device in an actual scene, usually a collection of measurement points, identified by one to multiple labels -- Example: - - Vehicle networking scenario: Vehicles identified by vehicle identification code (VIN) - - Factory scenario: robotic arm, unique ID identification generated by IoT platform - - Energy scenario: Wind turbines, identified by region, station, line, model, instance, etc - - Monitoring scenario: CPU, identified by machine room, rack, Hostname, device type, etc \ No newline at end of file diff --git a/src/UserGuide/latest/Basic-Concept/Operate-Metadata_timecho.md b/src/UserGuide/latest/Basic-Concept/Operate-Metadata_timecho.md deleted file mode 100644 index 138a5b4a4..000000000 --- a/src/UserGuide/latest/Basic-Concept/Operate-Metadata_timecho.md +++ /dev/null @@ -1,1283 +0,0 @@ - - -# Data Modeling - -## 1. Database Management - -### 1.1 Create Database - -According to the storage model we can set up the corresponding database. Two SQL statements are supported for creating databases, as follows: - -```sql -create database root.ln; -create database root.sgcc; -``` - -We can thus create two databases using the above two SQL statements. - -It is worth noting that 1 database is recommended. - -When the path itself or the parent/child layer of the path is already created as database, the path is then not allowed to be created as database. - -For example, when the databases root.ln and root.sgcc already exist, creating the database root.ln.wf01 is not allowed. 
The system will return the corresponding error message as shown below: - -```sql -CREATE DATABASE root.ln.wf01; -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 501: root.ln has already been created as a database -``` - -Similarly, when the database root.db.test already exists, creating the database root.db is not allowed either. The system will return the corresponding error message as shown below: - -```sql -CREATE DATABASE root.db; -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 529: Some children of root.db have already been created as datab -``` - -Database Node Naming Rules: -1. Node names may contain: **Chinese/English letters, Digits (0-9), Underscore(\_)、Period (.)、Backtick(\`)** -2. The entire name must be enclosed in **backticks (\`)** if: - - It consists solely of digits (e.g., 12345) - - It contains special characters (. or \_) that may cause ambiguity (e.g., db.01, \_temp) -3. Escaping Backticks: - If the node name itself contains a backtick (\`), use **two consecutive backticks(\`\`)** to represent a single backtick. Example: To name a node as \`db123\`\` (containing one backtick), write it as \`db123\`\`\`. - -Besides, if deploy on Windows or macOS system, the LayerName is case-insensitive, which means it's not allowed to create databases `root.ln` and `root.LN` at the same time. - -### 1.2 Show Databases - -After creating the database, we can use the [SHOW DATABASES](../SQL-Manual/SQL-Manual_timecho) statement and [SHOW DATABASES \](../SQL-Manual/SQL-Manual_timecho) to view the databases. 
The SQL statements are as follows: - -```sql -SHOW DATABASES; -SHOW DATABASES root.**; -``` - -The result is as follows: - -``` -+-------------+----+-------------------------+-----------------------+-----------------------+ -|database| ttl|schema_replication_factor|data_replication_factor|time_partition_interval| -+-------------+----+-------------------------+-----------------------+-----------------------+ -| root.sgcc|null| 2| 2| 604800| -| root.ln|null| 2| 2| 604800| -+-------------+----+-------------------------+-----------------------+-----------------------+ -Total line number = 2 -It costs 0.060s -``` - -### 1.3 Delete Database - -User can use the `DELETE DATABASE ` statement to delete all databases matching the pathPattern. Please note the data in the database will also be deleted. - -```sql -DELETE DATABASE root.ln; -DELETE DATABASE root.sgcc; -// delete all data, all timeseries and all databases; -DELETE DATABASE root.**; -``` - -### 1.4 Count Databases - -User can use the `COUNT DATABASE ` statement to count the number of databases. It is allowed to specify `PathPattern` to count the number of databases matching the `PathPattern`. 
- -SQL statement is as follows: - -```sql -count databases; -count databases root.*; -count databases root.sgcc.*; -count databases root.sgcc; -``` - -The result is as follows: - -``` -+-------------+ -| database| -+-------------+ -| root.sgcc| -| root.turbine| -| root.ln| -+-------------+ -Total line number = 3 -It costs 0.003s - -+-------------+ -| database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.003s - -+-------------+ -| database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| database| -+-------------+ -| 0| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| database| -+-------------+ -| 1| -+-------------+ -Total line number = 1 -It costs 0.002s -``` - -### 1.5 Setting up heterogeneous databases (Advanced operations) - -Under the premise of familiar with IoTDB metadata modeling, -users can set up heterogeneous databases in IoTDB to cope with different production needs. - -Currently, the following database heterogeneous parameters are supported: - -| Parameter | Type | Description | -| ------------------------- | ------- | --------------------------------------------- | -| TTL | Long | TTL of the Database | -| SCHEMA_REPLICATION_FACTOR | Integer | The schema replication number of the Database | -| DATA_REPLICATION_FACTOR | Integer | The data replication number of the Database | -| SCHEMA_REGION_GROUP_NUM | Integer | The SchemaRegionGroup number of the Database | -| DATA_REGION_GROUP_NUM | Integer | The DataRegionGroup number of the Database | - -Note the following when configuring heterogeneous parameters: - -+ TTL and TIME_PARTITION_INTERVAL must be positive integers. -+ SCHEMA_REPLICATION_FACTOR and DATA_REPLICATION_FACTOR must be smaller than or equal to the number of deployed DataNodes. 
-+ The function of SCHEMA_REGION_GROUP_NUM and DATA_REGION_GROUP_NUM are related to the parameter `schema_region_group_extension_policy` and `data_region_group_extension_policy` in iotdb-common.properties configuration file. Take DATA_REGION_GROUP_NUM as an example: - If `data_region_group_extension_policy=CUSTOM` is set, DATA_REGION_GROUP_NUM serves as the number of DataRegionGroups owned by the Database. - If `data_region_group_extension_policy=AUTO`, DATA_REGION_GROUP_NUM is used as the lower bound of the DataRegionGroup quota owned by the Database. That is, when the Database starts writing data, it will have at least this number of DataRegionGroups. - -Users can set any heterogeneous parameters when creating a Database, or adjust some heterogeneous parameters during a stand-alone/distributed IoTDB run. - -#### Set heterogeneous parameters when creating a Database - -The user can set any of the above heterogeneous parameters when creating a Database. The SQL statement is as follows: - -```sql -CREATE DATABASE prefixPath (WITH databaseAttributeClause (COMMA? databaseAttributeClause)*)? -``` - -For example: - -```sql -CREATE DATABASE root.db WITH SCHEMA_REPLICATION_FACTOR=1, DATA_REPLICATION_FACTOR=3, SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -#### Adjust heterogeneous parameters at run time - -Users can adjust some heterogeneous parameters during the IoTDB runtime, as shown in the following SQL statement: - -```sql -ALTER DATABASE prefixPath WITH databaseAttributeClause (COMMA? 
databaseAttributeClause)*; -``` - -For example: - -```sql -ALTER DATABASE root.db WITH SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -Note that only the following heterogeneous parameters can be adjusted at runtime: - -+ SCHEMA_REGION_GROUP_NUM -+ DATA_REGION_GROUP_NUM - -#### Show heterogeneous databases - -The user can query the specific heterogeneous configuration of each Database, and the SQL statement is as follows: - -```sql -SHOW DATABASES DETAILS prefixPath? -``` - -For example: - -```sql -SHOW DATABASES DETAILS -``` -``` -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|Database| TTL|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|MinSchemaRegionGroupNum|MaxSchemaRegionGroupNum|DataRegionGroupNum|MinDataRegionGroupNum|MaxDataRegionGroupNum| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|root.db1| null| 1| 3| 604800000| 0| 1| 1| 0| 2| 2| -|root.db2|86400000| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -|root.db3| null| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -Total line number = 3 -It costs 0.058s -``` - -The query results in each column are as follows: - -+ The name of the Database -+ The TTL of the Database -+ The schema replication number of the Database -+ The data replication number of the Database -+ The time partition interval of the Database -+ The current SchemaRegionGroup number of the Database -+ The required minimum SchemaRegionGroup number of the 
Database -+ The permitted maximum SchemaRegionGroup number of the Database -+ The current DataRegionGroup number of the Database -+ The required minimum DataRegionGroup number of the Database -+ The permitted maximum DataRegionGroup number of the Database - -### 1.6 TTL - -IoTDB supports setting data retention time (TTL) at the device level, allowing the system to automatically and periodically delete old data to effectively control disk space and maintain high query performance and low memory usage. TTL is set in milliseconds by default. Once data expires, it cannot be queried or written, but physical deletion is delayed until compaction. Please note that changes to TTL may temporarily affect data queryability, and if TTL is reduced or removed, previously invisible data due to TTL may reappear. - -Important notes: -- TTL is set in milliseconds and is not affected by the time precision in the configuration file. -- Changes to TTL may affect data queryability. -- The system will eventually remove expired data, but there may be a delay. -- TTL determines data expiration based on the data point timestamp, not the ingestion time. -- The system supports setting up to 1000 TTL rules. When the limit is reached, existing rules must be removed before new ones can be added. - -#### TTL Path Rule -The path can only be prefix paths (i.e., the path cannot contain \* , except \*\* in the last level). -This path will match devices and also allows users to specify paths without asterisks as specific databases or devices. -When the path does not contain asterisks, the system will check if it matches a database; if it matches a database, both the path and path.\*\* will be set at the same time. Note: Device TTL settings do not verify the existence of metadata, i.e., it is allowed to set TTL for a non-existent device. 
-``` -qualified paths: -root.** -root.db.** -root.db.group1.** -root.db -root.db.group1.d1 - -unqualified paths: -root.*.db -root.**.db.* -root.db.* -``` -#### TTL Applicable Rules -When a device is subject to multiple TTL rules, the more precise and longer rule takes priority. For example, for the device "root.bj.hd.dist001.turbine001", the rule "root.bj.hd.dist001.turbine001" takes precedence over "root.bj.hd.dist001.\*\*", and the rule "root.bj.hd.dist001.\*\*" takes precedence over "root.bj.hd.\*\*". -#### Set TTL -The set ttl operation sets a TTL rule. For example, setting ttl to root.sg.group1.\*\* is equivalent to mounting the TTL on all devices that match this path pattern. -The unset ttl operation unmounts the TTL for the corresponding path pattern; if there is no corresponding TTL, nothing is done. -To make the TTL infinite, use the INF keyword. -The SQL statement for setting TTL is as follows: - -```sql -set ttl to pathPattern 360000; -``` -This sets the TTL of pathPattern to 360,000 milliseconds. The pathPattern is used to match devices; it must not contain a wildcard (\*) in the middle, and if it uses a wildcard it must end with a double asterisk (\*\*). -To maintain compatibility with older SQL syntax, if the user-provided pathPattern matches a database (db), the path pattern is automatically expanded to include all sub-paths denoted by path.\*\*. -For instance, writing "set ttl to root.sg 360000" is automatically transformed into "set ttl to root.sg.\*\* 360000", which sets the TTL for all devices under root.sg. However, if the specified pathPattern does not match a database, this expansion does not apply. For example, writing "set ttl to root.sg.group 360000" is not expanded to "root.sg.group.\*\*" since root.sg.group does not match a database. -It is also permissible to specify a particular device without a wildcard (\*).
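The precedence and expansion rules described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not IoTDB's implementation: the helper names (`expand_set_ttl`, `rule_matches`, `effective_ttl`) are hypothetical, `**` is assumed to match one or more trailing levels, and paths are split on `.` without handling backtick-quoted node names.

```python
INF = float("inf")  # stand-in for the INF keyword: data never expires


def expand_set_ttl(path: str, databases: set[str]) -> list[str]:
    """Return the rule paths created by `set ttl to <path> ...`.

    Per the text above, a wildcard-free path that names a database
    also covers `<path>.**`; any other path is kept as written.
    """
    if "*" not in path and path in databases:
        return [path, path + ".**"]
    return [path]


def rule_matches(rule: str, device: str) -> bool:
    """Check whether a TTL rule path applies to a device path."""
    rule_nodes = rule.split(".")
    device_nodes = device.split(".")
    if rule_nodes[-1] == "**":
        prefix = rule_nodes[:-1]
        # Assumption: ** stands for one or more trailing levels.
        return (len(device_nodes) > len(prefix)
                and device_nodes[:len(prefix)] == prefix)
    return rule_nodes == device_nodes


def effective_ttl(rules: dict[str, float], device: str) -> float:
    """Pick the TTL of the longest, most precise matching rule."""
    matching = [rule for rule in rules if rule_matches(rule, device)]
    if not matching:
        return INF  # every device has a TTL; INF when no rule applies
    # Longer rules win; at equal length an exact path beats a ** rule.
    best = max(matching,
               key=lambda r: (len(r.split(".")), not r.endswith(".**")))
    return rules[best]


rules = {
    "root.bj.hd.**": 1000,
    "root.bj.hd.dist001.**": 2000,
    "root.bj.hd.dist001.turbine001": 3000,
}
print(effective_ttl(rules, "root.bj.hd.dist001.turbine001"))  # 3000
print(effective_ttl(rules, "root.bj.hd.dist001.turbine002"))  # 2000
print(expand_set_ttl("root.sg", {"root.sg"}))  # ['root.sg', 'root.sg.**']
```

The sort key mirrors the example in the text: the exact device path outranks `root.bj.hd.dist001.**`, which in turn outranks the shorter `root.bj.hd.**`.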
-#### Unset TTL - -To unset TTL, use the following SQL statement: - -```sql -unset ttl from root.ln -``` - -After the TTL is unset, all data will be accepted in `root.ln`. - -```sql -unset ttl from root.sgcc.** -``` - -Unset the TTL in the `root.sgcc` path. - -New syntax - -```sql -unset ttl from root.** -``` - -Old syntax - -```sql -unset ttl to root.** -``` -There is no functional difference between the old and new syntax, and they are compatible with each other. -The new syntax is just more conventional in terms of wording. - -These statements unset the TTL setting for all path patterns. - -#### Show TTL - -To show TTL, use the following SQL statements: - -show all ttl - -```sql -SHOW ALL TTL; -``` -``` -+--------------+--------+ -| path| TTL| -| root.**|55555555| -| root.sg2.a.**|44440000| -+--------------+--------+ -``` - -show ttl on pathPattern -```sql -SHOW TTL ON root.db.**; -``` -``` -+--------------+--------+ -| path| TTL| -| root.db.**|55555555| -| root.db.a.**|44440000| -+--------------+--------+ -``` - -The SHOW ALL TTL example gives the TTL for all path patterns. -The SHOW TTL ON pathPattern example shows the TTL for the specified path pattern. - -Display devices' ttl -```sql -show devices; -``` -``` -+---------------+---------+---------+ -| Device|IsAligned| TTL| -+---------------+---------+---------+ -|root.sg.device1| false| 36000000| -|root.sg.device2| true| INF| -+---------------+---------+---------+ -``` -Every device always has a TTL value; it is never null. INF represents infinity. - - -## 2. Timeseries Management - -### 2.1 Create Timeseries - -According to the storage model selected before, we can create corresponding timeseries in the two databases respectively.
The SQL statements for creating timeseries are as follows: - -```sql -create timeseries root.ln.wf01.wt01.status with datatype=BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature with datatype=FLOAT; -create timeseries root.ln.wf02.wt02.hardware with datatype=TEXT; -create timeseries root.ln.wf02.wt02.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature with datatype=FLOAT; -``` - -From v0.13, you can use a simplified version of the SQL statements to create timeseries: - -```sql -create timeseries root.ln.wf01.wt01.status BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature FLOAT; -create timeseries root.ln.wf02.wt02.hardware TEXT; -create timeseries root.ln.wf02.wt02.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature FLOAT; -``` - -When creating a timeseries, the system will automatically assign default encoding and compression methods, requiring no manual specification. 
If your business scenario requires custom adjustments, you may refer to the following example: - -```sql -create timeseries root.sgcc.wf03.wt01.temperature FLOAT encoding=PLAIN compressor=SNAPPY; -``` - -Note that if you manually specify an encoding method that is incompatible with the data type, the system will return an error message, as shown below: - -```sql -create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF; -error: encoding TS_2DIFF does not support BOOLEAN -``` - -For a full list of supported data types and corresponding encoding methods, please refer to [Compression & Encoding](../Technical-Insider/Encoding-and-Compression.md). - - -### 2.2 Create Aligned Timeseries - -The SQL statement for creating a group of aligned timeseries is as follows: - -```sql -CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT, longitude FLOAT); -``` - -You can set a different datatype, encoding, and compression for each timeseries in a group of aligned timeseries. - -Setting an alias, tags, and attributes for aligned timeseries is also supported. - - -### 2.3 Modifying Timeseries Data Types - -Starting from version V2.0.8.2, modifying the data type of a timeseries via SQL statements is supported. - -Syntax definition: - -```SQL -ALTER TIMESERIES fullPath SET DATA TYPE newType=type -``` - -Notes: - -* If the timeseries is concurrently deleted during the modification process, an error will be reported. - -* The new data type must be compatible with the original type.
The specific compatibility is shown in the following table: - -| Original Type |Convertible To Type | -| ----------- | ----------------------------------------------- | -| INT32 | INT64, FLOAT, DOUBLE, TIMESTAMP, STRING, TEXT | -| INT64 | TIMESTAMP, DOUBLE, STRING, TEXT | -| FLOAT | DOUBLE, STRING, TEXT | -| DOUBLE | STRING, TEXT | -| BOOLEAN | STRING, TEXT | -| TEXT | BLOB, STRING | -| STRING | TEXT, BLOB | -| BLOB | STRING, TEXT | -| DATE | STRING, TEXT | -| TIMESTAMP | INT64, DOUBLE, STRING, TEXT | - -Usage example: - -```SQL -ALTER TIMESERIES root.ln.wf01.wt01.temperature set data type DOUBLE -``` - -### 2.4 Modifying Timeseries Name - -Since version V2.0.8.2, it has been supported to modify the full path name of a timeseries through SQL statements. After a successful modification, the original name becomes invalid but is still retained in the metadata storage. - -Syntax definition: - -```sql --- Supports modifying the full path of a certain sequence to another full path -ALTER TIMESERIES RENAME TO -``` - -Usage instructions: - -- This statement takes effect immediately upon successful execution, and the tags/attributes/alias of the original sequence will be migrated to the new sequence. -- The invalidated sequence (original sequence) no longer supports write, query, delete, or other operations. The name of the invalidated sequence will be retained by the system, and creating a new sequence with the same name is not allowed. This ensures the uniqueness and traceability of the original sequence name: it supports viewing the original sequence through the `SHOW INVALID TIMESERIES` statement, preventing the loss of original sequence information due to frequent modifications, significantly improving data traceability and problem localization efficiency. -- The new sequence supports creating views, but the original sequence does not support creating views. 
When modifying the encoding, compression, sequence type, tags, attributes, or alias of the new sequence, the original sequence will not be modified; deleting the new sequence will also modify the original sequence. -- If the new sequence path or the alias of the original sequence under the target device already exists (including real sequences, views, invalid sequences, and their aliases), the system will report an error. - -Usage example: - -```sql -ALTER TIMESERIES root.ln.wf01.wt01.temperature RENAME TO root.newln.newwf.newwt.temperature -``` - - -### 2.5 Delete Timeseries - -To delete the timeseries created earlier, use the `(DELETE | DROP) TimeSeries ` statement. - -Usage is as follows: - -```sql -delete timeseries root.ln.wf01.wt01.status; -delete timeseries root.ln.wf01.wt01.temperature, root.ln.wf02.wt02.hardware; -delete timeseries root.ln.wf02.*; -drop timeseries root.ln.wf02.*; -``` - -### 2.6 Show Timeseries - -* SHOW LATEST? TIMESERIES pathPattern? whereClause? limitClause? - - SHOW TIMESERIES accepts four optional clauses and returns timeseries information. - -Timeseries information includes: timeseries path, alias of measurement, database it belongs to, data type, encoding type, compression type, tags and attributes. - -Examples: - -* SHOW TIMESERIES - - presents all timeseries information in JSON form - -* SHOW TIMESERIES <`PathPattern`> - - returns all timeseries information matching the given <`PathPattern`>.
SQL statements are as follows: - -```sql -show timeseries root.**; -show timeseries root.ln.**; -``` - -The results are shown below respectively: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.016s - -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression|tags|attributes|deadband|deadband parameters| 
-+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -|root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY|null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -Total line number = 4 -It costs 0.004s -``` - -* SHOW TIMESERIES LIMIT INT OFFSET INT - - returns all the timeseries information start from the offset and limit the number of series returned. For example, - -```sql -show timeseries root.ln.** limit 10 offset 10 -``` - -* SHOW TIMESERIES WHERE TIMESERIES contains 'containStr' - - The query result set is filtered by string fuzzy matching based on the names of the timeseries. 
For example: - -```sql -show timeseries root.ln.** where timeseries contains 'wf01.wt' -``` - -The result is shown below: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 2 -It costs 0.016s -``` - -* SHOW TIMESERIES WHERE DataType=type - - The query result set is filtered by data type. 
For example: - -```sql -show timeseries root.ln.** where dataType=FLOAT -``` - -The result is shown below: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 3 -It costs 0.016s - -``` - - -* SHOW TIMESERIES WHERE TAGS(KEY) = VALUE -* SHOW TIMESERIES WHERE TAGS(KEY) CONTAINS VALUE - - The query result set is filtered by tags. 
For example: - -```sql -show timeseries root.ln.** where TAGS(unit)='c'; -show timeseries root.ln.** where TAGS(description) contains 'test1'; -``` - -The query results are as follows: - -``` -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s - -``` - - -* SHOW LATEST TIMESERIES - - all the returned timeseries information should be sorted in descending order of the last timestamp of timeseries - -It is worth noting that when the queried path does not exist, the system will return no timeseries. - -- SHOW INVALID TIMESERIES - - Since version V2.0.8.2, this SQL statement is supported to display the invalidated timeseries after a successful full path name modification. 
- -```sql -IoTDB> show invalid timeSeries -+-----------------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+----------------------------------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| NewPath| -+-----------------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+----------------------------------+ -|root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| GORILLA| LZ4|null| null| null| null| BASE|root.newln.newwf.newwt.temperature| -+-----------------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+----------------------------------+ -``` - -Explanation: The last column, "NewPath," in the returned result displays the new sequence corresponding to the invalidated sequence. This serves scenarios such as view construction and cluster migration (Load + rename). - - -### 2.7 Count Timeseries - -IoTDB is able to use `COUNT TIMESERIES ` to count the number of timeseries matching the path. SQL statements are as follows: - -* `WHERE` condition could be used to fuzzy match a time series name with the following syntax: `COUNT TIMESERIES WHERE TIMESERIES contains 'containStr'`. -* `WHERE` condition could be used to filter result by data type with the syntax: `COUNT TIMESERIES WHERE DataType='`. -* `WHERE` condition could be used to filter result by tags with the syntax: `COUNT TIMESERIES WHERE TAGS(key)='value'` or `COUNT TIMESERIES WHERE TAGS(key) contains 'value'`. -* `LEVEL` could be defined to show count the number of timeseries of each node at the given level in current Metadata Tree. This could be used to query the number of sensors under each device. The grammar is: `COUNT TIMESERIES GROUP BY LEVEL=`. 
- - -```sql -COUNT TIMESERIES root.**; -COUNT TIMESERIES root.ln.**; -COUNT TIMESERIES root.ln.*.*.status; -COUNT TIMESERIES root.ln.wf01.wt01.status; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' ; -COUNT TIMESERIES root.** WHERE DATATYPE = INT64; -COUNT TIMESERIES root.** WHERE TAGS(unit) contains 'c' ; -COUNT TIMESERIES root.** WHERE TAGS(unit) = 'c' ; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' group by level = 1; -``` - -For example, if there are several timeseries (use `show timeseries` to show all timeseries): - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| {"unit":"c"}| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| {"description":"test1"}| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| 
-+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.004s -``` - -Then the Metadata Tree will be as below: - -
- -As can be seen, `root` is considered as `LEVEL=0`. So when you enter statements such as: - -```sql -COUNT TIMESERIES root.** GROUP BY LEVEL=1; -COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2; -COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2; -``` - -You will get the following results: - -``` -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -| root.sgcc| 2| -|root.turbine| 1| -| root.ln| 4| -+------------+-----------------+ -Total line number = 3 -It costs 0.002s - -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -|root.ln.wf02| 2| -|root.ln.wf01| 2| -+------------+-----------------+ -Total line number = 2 -It costs 0.002s - -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -|root.ln.wf01| 2| -+------------+-----------------+ -Total line number = 1 -It costs 0.002s -``` - -> Note: The path of timeseries is just a filter condition, which has no relationship with the definition of level. - -### 2.8 Active Timeseries Query -By adding a WHERE time filter condition to the existing SHOW/COUNT TIMESERIES statements, we can obtain the time series that have data within the specified time range. - -It is important to note that in metadata queries with time filters, views are not considered; only the time series actually stored in the TsFile are taken into account.
- -An example usage is as follows: -```sql -insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -show timeseries; -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -show timeseries where time >= 15000 and time < 16000; -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| 
-+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -count timeseries where time >= 15000 and time < 16000; -+-----------------+ -|count(timeseries)| -+-----------------+ -| 4| -+-----------------+ -``` -Regarding the definition of active time series, data that can be queried normally is considered active, meaning time series that have been inserted but deleted are not included. -### 2.9 Tag and Attribute Management - -We can also add an alias, extra tag and attribute information while creating one timeseries. - -The differences between tag and attribute are: - -* Tag could be used to query the path of timeseries, we will maintain an inverted index in memory on the tag: Tag -> Timeseries -* Attribute could only be queried by timeseries path : Timeseries -> Attribute - -The SQL statements for creating timeseries with extra tag and attribute information are extended as follows: - -```sql -create timeseries root.turbine.d1.s1(temprature) with datatype=FLOAT tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2); -``` - -The `temprature` in the brackets is an alias for the sensor `s1`. So we can use `temprature` to replace `s1` anywhere. - -> IoTDB also supports using AS function to set alias. The difference between the two is: the alias set by the AS function is used to replace the whole time series name, temporary and not bound with the time series; while the alias mentioned above is only used as the alias of the sensor, which is bound with it and can be used equivalent to the original sensor name. - -> Notice that the size of the extra tag and attribute information shouldn't exceed the `tag_attribute_total_size`. 
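To illustrate the two kinds of alias described above, here is a short sketch that assumes the series `root.turbine.d1.s1` was created with the `temprature` alias exactly as in the statement above (the alias spelling is kept as written there). The output name `temperature` in the last query is a hypothetical label chosen for this example.

```sql
-- The sensor name and its bound alias should address the same series.
select s1 from root.turbine.d1;
select temprature from root.turbine.d1;
-- AS only renames the column in this one result set; it is not stored with the series.
select s1 as temperature from root.turbine.d1;
```

The bound alias is persisted in metadata and shows up in the `alias` column of `show timeseries`, whereas an `AS` alias exists only for the duration of the query.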
- -We can update the tag and attribute information after creation as follows: - -* Rename the tag/attribute key - -```sql -ALTER timeseries root.turbine.d1.s1 RENAME tag1 TO newTag1 -``` - -* Reset the tag/attribute value - -```sql -ALTER timeseries root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1 -``` - -* Delete the existing tag/attribute - -```sql -ALTER timeseries root.turbine.d1.s1 DROP tag1, tag2 -``` - -* Add new tags - -```sql -ALTER timeseries root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4 -``` - -* Add new attributes - -```sql -ALTER timeseries root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4 -``` - -* Upsert alias, tags and attributes - -> Adds the alias or key-value pair if it doesn't exist; otherwise, updates the existing one with the new value. - -```sql -ALTER timeseries root.turbine.d1.s1 UPSERT ALIAS=newAlias TAGS(tag3=v3, tag4=v4) ATTRIBUTES(attr3=v3, attr4=v4) -``` - -* Show timeseries using tags. Use TAGS(tagKey) to identify the tag used as the filter key - -```sql -SHOW TIMESERIES (<`PathPattern`>)? timeseriesWhereClause -``` - -Returns all timeseries information that satisfies the where condition and matches the pathPattern.
SQL statements are as follows: - -```sql -ALTER timeseries root.ln.wf02.wt02.hardware ADD TAGS unit=c; -ALTER timeseries root.ln.wf02.wt02.status ADD TAGS description=test1; -show timeseries root.ln.** where TAGS(unit)='c'; -show timeseries root.ln.** where TAGS(description) contains 'test1'; -``` - -The results are shown below, respectively: - -``` -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s -``` - -- count timeseries using tags - -```sql -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause; -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause GROUP BY LEVEL=; -``` - -Returns the number of timeseries that satisfy the where condition and match the pathPattern.
SQL statements are as follows: - -```sql -count timeseries; -count timeseries root.** where TAGS(unit)='c'; -count timeseries root.** where TAGS(unit)='c' group by level = 2; -``` - -The results are shown below, respectively: - -```sql -count timeseries; -+-----------------+ -|count(timeseries)| -+-----------------+ -| 6| -+-----------------+ -Total line number = 1 -It costs 0.019s -count timeseries root.** where TAGS(unit)='c'; -+-----------------+ -|count(timeseries)| -+-----------------+ -| 2| -+-----------------+ -Total line number = 1 -It costs 0.020s -count timeseries root.** where TAGS(unit)='c' group by level = 2; -+--------------+-----------------+ -| column|count(timeseries)| -+--------------+-----------------+ -| root.ln.wf02| 2| -| root.ln.wf01| 0| -|root.sgcc.wf03| 0| -+--------------+-----------------+ -Total line number = 3 -It costs 0.011s -``` - -> Note that the where clause supports only one condition: either an equality filter or a `contains` filter. In both cases, the property in the where condition must be a tag.
- -create aligned timeseries - -```sql -create aligned timeseries root.sg1.d1(s1 INT32 tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2), s2 DOUBLE tags(tag3=v3, tag4=v4) attributes(attr3=v3, attr4=v4)) -``` - -The execution result is as follows: - -```sql -show timeseries -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -|root.sg1.d1.s2| null| root.sg1| DOUBLE| GORILLA| SNAPPY|{"tag4":"v4","tag3":"v3"}|{"attr4":"v4","attr3":"v3"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -Support query: - -```sql -show timeseries where TAGS(tag1)='v1' -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -The above operations are supported for timeseries tag, attribute updates, etc. - -## 3. 
Path query - -### 3.1 Path - -A `path` is an expression that conforms to the following constraints: - -```sql -path - : nodeName ('.' nodeName)* - ; - -nodeName - : wildcard? identifier wildcard? - | wildcard - ; - -wildcard - : '*' - | '**' - ; -``` - -### 3.2 NodeName - -- The parts of a path separated by `.` are called node names (`nodeName`). -- For example, `root.a.b.c` is a path with a depth of 4 levels, where `root`, `a`, `b`, and `c` are all node names. - -#### Constraints - -- **Reserved keyword `root`**: `root` is a reserved keyword that is only allowed at the beginning of a path. If `root` appears in any other level, the system will fail to parse it and report an error. -- **Character support**: Except for the first level (`root`), other levels support the following characters: - - Letters (`a-z`, `A-Z`) - - Digits (`0-9`) - - Underscores (`_`) - - UNICODE Chinese characters (`\u2E80` to `\u9FFF`) -- **Case sensitivity**: On Windows systems, path node names in the database are case-insensitive. For example, `root.ln` and `root.LN` are considered the same path. - -### 3.3 Special Characters (Backquote) - -If special characters (such as spaces or punctuation marks) are needed in a `nodeName`, you can enclose the node name in Backquote (`). For more information on the use of backticks, please refer to [Backquote](../SQL-Manual/Syntax-Rule.md#reverse-quotation-marks). - -### 3.4 Path Pattern - -To make it more convenient and efficient to express multiple time series, IoTDB provides paths with wildcards `*` and `**`. Wildcards can appear in any level of a path. - -- **Single-level wildcard (`\*`)**: Represents one level in a path. - - For example, `root.vehicle.*.sensor1` represents paths with a depth of 4 levels, prefixed by `root.vehicle` and suffixed by `sensor1`. -- **Multi-level wildcard (`\**`)**: Represents one or more levels (`*`+). 
- - For example: - - `root.vehicle.device1.**` represents all paths with a depth of 4 or more levels, prefixed by `root.vehicle.device1`. - - `root.vehicle.**.sensor1` represents paths with a depth of 4 or more levels, prefixed by `root.vehicle` and suffixed by `sensor1`. - -**Note**: `*` and `**` cannot be placed at the beginning of a path. - -### 3.5 Show Child Paths - -```sql -SHOW CHILD PATHS pathPattern -``` - -Return all child paths and their node types of all the paths matching pathPattern. - -node types: ROOT -> DB INTERNAL -> DATABASE -> INTERNAL -> DEVICE -> TIMESERIES - - -Example: - -* return the child paths of root.ln:show child paths root.ln - -``` -+------------+----------+ -| child paths|node types| -+------------+----------+ -|root.ln.wf01| INTERNAL| -|root.ln.wf02| INTERNAL| -+------------+----------+ -Total line number = 2 -It costs 0.002s -``` - -> get all paths in form of root.xx.xx.xx:show child paths root.xx.xx - -### 3.6 Show Child Nodes - -```sql -SHOW CHILD NODES pathPattern -``` - -Return all child nodes of the pathPattern. - -Example: - -* return the child nodes of root:show child nodes root - -``` -+------------+ -| child nodes| -+------------+ -| ln| -+------------+ -``` - -* return the child nodes of root.ln:show child nodes root.ln - -``` -+------------+ -| child nodes| -+------------+ -| wf01| -| wf02| -+------------+ -``` - -### 3.7 Count Nodes - -IoTDB is able to use `COUNT NODES LEVEL=` to count the number of nodes at - the given level in current Metadata Tree considering a given pattern. IoTDB will find paths that - match the pattern and counts distinct nodes at the specified level among the matched paths. - This could be used to query the number of devices with specified measurements. 
The usage is as - follows: - -```sql -COUNT NODES root.** LEVEL=2; -COUNT NODES root.ln.** LEVEL=2; -COUNT NODES root.ln.wf01.** LEVEL=3; -COUNT NODES root.**.temperature LEVEL=3; -``` - -For the above examples and Metadata tree, you get the following results: - -``` -+------------+ -|count(nodes)| -+------------+ -| 4| -+------------+ -Total line number = 1 -It costs 0.003s - -+------------+ -|count(nodes)| -+------------+ -| 2| -+------------+ -Total line number = 1 -It costs 0.002s - -+------------+ -|count(nodes)| -+------------+ -| 1| -+------------+ -Total line number = 1 -It costs 0.002s - -+------------+ -|count(nodes)| -+------------+ -| 2| -+------------+ -Total line number = 1 -It costs 0.002s -``` - -> Note: The path of timeseries is just a filter condition, which has no relationship with the definition of level. - -### 3.8 Show Devices - -* SHOW DEVICES pathPattern? (WITH DATABASE)? devicesWhereClause? limitClause? - -Similar to `Show Timeseries`, IoTDB also supports two ways of viewing devices: - -* `SHOW DEVICES` statement presents all devices' information, which is equal to `SHOW DEVICES root.**`. -* `SHOW DEVICES <PathPattern>` statement specifies the `PathPattern` and returns the information of the devices matching the pathPattern and under the given level. -* `WHERE` condition supports `DEVICE contains 'xxx'` to do a fuzzy query based on the device name.
- -SQL statement is as follows: - -```sql -show devices; -show devices root.ln.**; -show devices root.ln.** where device contains 't'; -``` - -You can get results below: - -``` -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.ln.wf01.wt01| false| -| root.ln.wf02.wt02| false| -|root.sgcc.wf03.wt01| false| -| root.turbine.d1| false| -+-------------------+---------+ -Total line number = 4 -It costs 0.002s - -+-----------------+---------+ -| devices|isAligned| -+-----------------+---------+ -|root.ln.wf01.wt01| false| -|root.ln.wf02.wt02| false| -+-----------------+---------+ -Total line number = 2 -It costs 0.001s -``` - -`isAligned` indicates whether the timeseries under the device are aligned. - -To view devices' information with database, we can use `SHOW DEVICES WITH DATABASE` statement. - -* `SHOW DEVICES WITH DATABASE` statement presents all devices' information with their database. -* `SHOW DEVICES WITH DATABASE` statement specifies the `PathPattern` and returns the - devices' information under the given level with their database information. 
- -SQL statement is as follows: - -```sql -show devices with database; -show devices root.ln.** with database; -``` - -You can get results below: - -``` -+-------------------+-------------+---------+ -| devices| database|isAligned| -+-------------------+-------------+---------+ -| root.ln.wf01.wt01| root.ln| false| -| root.ln.wf02.wt02| root.ln| false| -|root.sgcc.wf03.wt01| root.sgcc| false| -| root.turbine.d1| root.turbine| false| -+-------------------+-------------+---------+ -Total line number = 4 -It costs 0.003s - -+-----------------+-------------+---------+ -| devices| database|isAligned| -+-----------------+-------------+---------+ -|root.ln.wf01.wt01| root.ln| false| -|root.ln.wf02.wt02| root.ln| false| -+-----------------+-------------+---------+ -Total line number = 2 -It costs 0.001s -``` - -### 3.9 Count Devices - -* COUNT DEVICES `` - -The above statement is used to count the number of devices. At the same time, it is allowed to specify `PathPattern` to count the number of devices matching the `PathPattern`. - -SQL statement is as follows: - -```sql -show devices; -count devices; -count devices root.ln.**; -``` - -You can get results below: - -``` -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -|root.sgcc.wf03.wt03| false| -| root.turbine.d1| false| -| root.ln.wf02.wt02| false| -| root.ln.wf01.wt01| false| -+-------------------+---------+ -Total line number = 4 -It costs 0.024s - -+--------------+ -|count(devices)| -+--------------+ -| 4| -+--------------+ -Total line number = 1 -It costs 0.004s - -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -Total line number = 1 -It costs 0.004s -``` - -### 3.10 Active Device Query -Similar to active timeseries query, we can add time filter conditions to device viewing and statistics to query active devices that have data within a certain time range. The definition of active here is the same as for active time series. 
An example usage is as follows: -```sql -insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -show devices; -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -| root.sg.data3| false| -+-------------------+---------+ - -show devices where time >= 15000 and time < 16000; -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -+-------------------+---------+ - -count devices where time >= 15000 and time < 16000; -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -``` \ No newline at end of file diff --git a/src/UserGuide/latest/Basic-Concept/Query-Data_timecho.md b/src/UserGuide/latest/Basic-Concept/Query-Data_timecho.md deleted file mode 100644 index 91c407083..000000000 --- a/src/UserGuide/latest/Basic-Concept/Query-Data_timecho.md +++ /dev/null @@ -1,3054 +0,0 @@ - -# Query Data -## 1. OVERVIEW - -### 1.1 Syntax Definition - -In IoTDB, `SELECT` statement is used to retrieve data from one or more selected time series. Here is the syntax definition of `SELECT` statement: - -```sql -SELECT [LAST] selectExpr [, selectExpr] ... - [INTO intoItem [, intoItem] ...] - FROM prefixPath [, prefixPath] ... - [WHERE whereCondition] - [GROUP BY { - ([startTime, endTime), interval [, slidingStep]) | - LEVEL = levelNum [, levelNum] ... | - TAGS(tagKey [, tagKey] ... ) | - VARIATION(expression[,delta][,ignoreNull=true/false]) | - CONDITION(expression,[keep>/>=/=/ 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is "status" and "temperature". 
The SQL statement requires that the status and temperature sensor values between the time point of "2017-11-01T00:05:00.000" and "2017-11-01T00:12:00.000" be selected. - -The execution result of this SQL statement is as follows: - -``` -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -|2017-11-01T00:08:00.000+08:00| false| 22.58| -|2017-11-01T00:09:00.000+08:00| false| 20.98| -|2017-11-01T00:10:00.000+08:00| true| 25.52| -|2017-11-01T00:11:00.000+08:00| false| 22.91| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 6 -It costs 0.018s -``` - -#### Select Multiple Columns of Data for the Same Device According to Multiple Time Intervals - -IoTDB supports specifying multiple time interval conditions in a query. Users can combine time interval conditions at will according to their needs. For example, the SQL statement is: - -```sql -select status,temperature from root.ln.wf01.wt01 where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is "status" and "temperature"; the statement specifies two different time intervals, namely "2017-11-01T00:05:00.000 to 2017-11-01T00:12:00.000" and "2017-11-01T16:35:00.000 to 2017-11-01T16:37:00.000". The SQL statement requires that the values of selected timeseries satisfying any time interval be selected. 
- -The execution result of this SQL statement is as follows: - -``` -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -|2017-11-01T00:08:00.000+08:00| false| 22.58| -|2017-11-01T00:09:00.000+08:00| false| 20.98| -|2017-11-01T00:10:00.000+08:00| true| 25.52| -|2017-11-01T00:11:00.000+08:00| false| 22.91| -|2017-11-01T16:35:00.000+08:00| true| 23.44| -|2017-11-01T16:36:00.000+08:00| false| 21.98| -|2017-11-01T16:37:00.000+08:00| false| 21.93| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 9 -It costs 0.018s -``` - - -#### Choose Multiple Columns of Data for Different Devices According to Multiple Time Intervals - -The system supports the selection of data in any column in a query, i.e., the selected columns can come from different devices. For example, the SQL statement is: - -```sql -select wf01.wt01.status,wf02.wt02.hardware from root.ln where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` - -which means: - -The selected timeseries are "the power supply status of ln group wf01 plant wt01 device" and "the hardware version of ln group wf02 plant wt02 device"; the statement specifies two different time intervals, namely "2017-11-01T00:05:00.000 to 2017-11-01T00:12:00.000" and "2017-11-01T16:35:00.000 to 2017-11-01T16:37:00.000". The SQL statement requires that the values of selected timeseries satisfying any time interval be selected. 
- -The execution result of this SQL statement is as follows: - -``` -+-----------------------------+------------------------+--------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf02.wt02.hardware| -+-----------------------------+------------------------+--------------------------+ -|2017-11-01T00:06:00.000+08:00| false| v1| -|2017-11-01T00:07:00.000+08:00| false| v1| -|2017-11-01T00:08:00.000+08:00| false| v1| -|2017-11-01T00:09:00.000+08:00| false| v1| -|2017-11-01T00:10:00.000+08:00| true| v2| -|2017-11-01T00:11:00.000+08:00| false| v1| -|2017-11-01T16:35:00.000+08:00| true| v2| -|2017-11-01T16:36:00.000+08:00| false| v1| -|2017-11-01T16:37:00.000+08:00| false| v1| -+-----------------------------+------------------------+--------------------------+ -Total line number = 9 -It costs 0.014s -``` - -#### Order By Time Query - -IoTDB supports the 'order by time' statement since 0.11, it's used to display results in descending order by time. -For example, the SQL statement is: - -```sql -select * from root.ln.** where time > 1 order by time desc limit 10; -``` - -The execution result of this SQL statement is as follows: - -``` -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -| Time|root.ln.wf02.wt02.hardware|root.ln.wf02.wt02.status|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -|2017-11-07T23:59:00.000+08:00| v1| false| 21.07| false| -|2017-11-07T23:58:00.000+08:00| v1| false| 22.93| false| -|2017-11-07T23:57:00.000+08:00| v2| true| 24.39| true| -|2017-11-07T23:56:00.000+08:00| v2| true| 24.44| true| -|2017-11-07T23:55:00.000+08:00| v2| true| 25.9| true| -|2017-11-07T23:54:00.000+08:00| v1| false| 22.52| false| -|2017-11-07T23:53:00.000+08:00| v2| true| 24.58| true| -|2017-11-07T23:52:00.000+08:00| v1| false| 
20.18| false| -|2017-11-07T23:51:00.000+08:00| v1| false| 22.24| false| -|2017-11-07T23:50:00.000+08:00| v2| true| 23.7| true| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -Total line number = 10 -It costs 0.016s -``` - -### 1.4 Execution Interface - -In IoTDB, there are two ways to execute data query: - -- Execute queries using IoTDB-SQL. -- Efficient execution interfaces for common queries, including time-series raw data query, last query, and aggregation query. - -#### Execute queries using IoTDB-SQL - -Data query statements can be used in SQL command-line terminals, JDBC, JAVA / C++ / Python / Go and other native APIs, and RESTful APIs. - -- Execute the query statement in the SQL command line terminal: start the SQL command line terminal, and directly enter the query statement to execute, see [SQL command line terminal](../Tools-System/CLI_timecho.md). - -- Execute query statements in JDBC, see [JDBC](../API/Programming-JDBC_timecho.md) for details. - -- Execute query statements in native APIs such as JAVA / C++ / Python / Go. For details, please refer to the relevant documentation in the Application Programming Interface chapter. The interface prototype is as follows: - - ````java - SessionDataSet executeQueryStatement(String sql) - ```` - -- Used in RESTful API, see [HTTP API V1](../API/RestServiceV1_timecho.md) or [HTTP API V2](../API/RestServiceV2_timecho.md) for details. - -#### Efficient execution interfaces - -The native APIs provide efficient execution interfaces for commonly used queries, which can save time-consuming operations such as SQL parsing. include: - -* Time-series raw data query with time range: - - The specified query time range is a left-closed right-open interval, including the start time but excluding the end time. 
- -```java -SessionDataSet executeRawDataQuery(List<String> paths, long startTime, long endTime); -``` - -* Last query: - - Query the last data point whose timestamp is greater than or equal to `LastTime`. - -```java -SessionDataSet executeLastDataQuery(List<String> paths, long LastTime); -``` - -* Aggregation query: - - Support specified query time range: The specified query time range is a left-closed right-open interval, including the start time but not the end time. - - Support GROUP BY TIME. - -```java -SessionDataSet executeAggregationQuery(List<String> paths, List<TAggregationType> aggregations); - -SessionDataSet executeAggregationQuery( - List<String> paths, List<TAggregationType> aggregations, long startTime, long endTime); - -SessionDataSet executeAggregationQuery( - List<String> paths, - List<TAggregationType> aggregations, - long startTime, - long endTime, - long interval); - -SessionDataSet executeAggregationQuery( - List<String> paths, - List<TAggregationType> aggregations, - long startTime, - long endTime, - long interval, - long slidingStep); -``` - -## 2. `SELECT` CLAUSE -The `SELECT` clause specifies the output of the query, consisting of several `selectExpr`. Each `selectExpr` defines one or more columns in the query result. For select expression details, see document [Operator-and-Expression](../SQL-Manual/Operator-and-Expression.md). - -- Example 1: - -```sql -select temperature from root.ln.wf01.wt01 -``` - -- Example 2: - -```sql -select status, temperature from root.ln.wf01.wt01 -``` - -### 2.1 Last Query - -The last query is a special type of query in Apache IoTDB. It returns the data point with the largest timestamp of the specified time series. In other words, it returns the latest state of a time series. This feature is especially important in IoT data analysis scenarios. To meet the performance requirement of real-time device monitoring systems, Apache IoTDB caches the latest values of all time series to achieve microsecond read latency. - -The last query returns the most recent data point of the given timeseries in a four-column format.
- -The SQL syntax is defined as: - -```sql -select last <Path> [COMMA <Path>]* from < PrefixPath > [COMMA < PrefixPath >]* <WhereClause> [ORDER BY TIMESERIES (DESC | ASC)?] -``` - -which means: Query and return the last data points of timeseries prefixPath.path. - -- Only time filter is supported in \<WhereClause\>. Any other filter given in the \<WhereClause\> will raise an exception. When the cached most recent data point does not satisfy the criterion specified by the filter, IoTDB will have to get the result from the external storage, which may cause a decrease in performance. - -- The result will be returned in a four-column table format. - - ``` - | Time | timeseries | value | dataType | - ``` - - **Note:** The `value` column always returns the value as `string` and thus also has `TSDataType.TEXT`. Therefore, the column `dataType` is also returned; it contains the _real_ type by which the value should be interpreted. - -- We can use `TIME/TIMESERIES/VALUE/DATATYPE (DESC | ASC)` to specify that the result set is sorted in descending/ascending order based on a particular column. When the value column contains multiple types of data, the sorting is based on the string representation of the values.
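
Sorting by the string representation can give a different order than a numeric sort would. The Java sketch below only illustrates that effect with plain lexicographic string comparison; the class name, the method name and the sample values are hypothetical, it does not call any IoTDB API, and the exact comparator IoTDB applies is an assumption here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LastValueOrdering {
    // Sorts the values as text, character by character (assumed to mimic
    // "sorting based on the string representation of the values").
    static List<String> sortAsStrings(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical mixed-type values as they would appear in the `value` column.
        List<String> values = List.of("9", "10.5", "false", "21.067368");
        System.out.println(sortAsStrings(values));
    }
}
```

With these sample values, `"10.5"` sorts before `"9"` because the comparison looks at the first character, not at the numeric magnitude.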
- -**Example 1:** get the last point of root.ln.wf01.wt01.status: - -```sql -select last status from root.ln.wf01.wt01; -``` -``` -+-----------------------------+------------------------+-----+--------+ -| Time| timeseries|value|dataType| -+-----------------------------+------------------------+-----+--------+ -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.status|false| BOOLEAN| -+-----------------------------+------------------------+-----+--------+ -Total line number = 1 -It costs 0.000s -``` - -**Example 2:** get the last status and temperature points of root.ln.wf01.wt01, whose timestamp larger or equal to 2017-11-07T23:50:00。 - -```sql -select last status, temperature from root.ln.wf01.wt01 where time >= 2017-11-07T23:50:00; -``` -``` -+-----------------------------+-----------------------------+---------+--------+ -| Time| timeseries| value|dataType| -+-----------------------------+-----------------------------+---------+--------+ -|2017-11-07T23:59:00.000+08:00| root.ln.wf01.wt01.status| false| BOOLEAN| -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.temperature|21.067368| DOUBLE| -+-----------------------------+-----------------------------+---------+--------+ -Total line number = 2 -It costs 0.002s -``` - -**Example 3:** get the last points of all sensor in root.ln.wf01.wt01, and order the result by the timeseries column in descending order - -```sql -select last * from root.ln.wf01.wt01 order by timeseries desc; -``` -``` -+-----------------------------+-----------------------------+---------+--------+ -| Time| timeseries| value|dataType| -+-----------------------------+-----------------------------+---------+--------+ -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.temperature|21.067368| DOUBLE| -|2017-11-07T23:59:00.000+08:00| root.ln.wf01.wt01.status| false| BOOLEAN| -+-----------------------------+-----------------------------+---------+--------+ -Total line number = 2 -It costs 0.002s -``` - -**Example 4:** get the last points of all sensor in 
root.ln.wf01.wt01, and order the result by the dataType column in descending order - -```sql -select last * from root.ln.wf01.wt01 order by dataType desc; -``` -``` -+-----------------------------+-----------------------------+---------+--------+ -| Time| timeseries| value|dataType| -+-----------------------------+-----------------------------+---------+--------+ -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.temperature|21.067368| DOUBLE| -|2017-11-07T23:59:00.000+08:00| root.ln.wf01.wt01.status| false| BOOLEAN| -+-----------------------------+-----------------------------+---------+--------+ -Total line number = 2 -It costs 0.002s -``` - -**Note:** The requirement to query the latest data point with other filtering conditions can be implemented through function composition. For example: -```sql -select max_time(*), last_value(*) from root.ln.wf01.wt01 where time >= 2017-11-07T23:50:00 and status = false align by device; -``` -``` -+-----------------+---------------------+----------------+-----------------------+------------------+ -| Device|max_time(temperature)|max_time(status)|last_value(temperature)|last_value(status)| -+-----------------+---------------------+----------------+-----------------------+------------------+ -|root.ln.wf01.wt01| 1510077540000| 1510077540000| 21.067368| false| -+-----------------+---------------------+----------------+-----------------------+------------------+ -Total line number = 1 -It costs 0.021s -``` - - -## 3. `WHERE` CLAUSE - -In IoTDB query statements, two filter conditions, **time filter** and **value filter**, are supported. - -The supported operators are as follows: - -- Comparison operators: greater than (`>`), greater than or equal ( `>=`), equal ( `=` or `==`), not equal ( `!=` or `<>`), less than or equal ( `<=`), less than ( `<`). -- Logical operators: and ( `AND` or `&` or `&&`), or ( `OR` or `|` or `||`), not ( `NOT` or `!`). -- Range contains operator: contains ( `IN` ). 
-- String matches operator: `LIKE`, `REGEXP`. - -### 3.1 Time Filter - -Use time filters to filter data for a specific time range. For supported formats of timestamps, please refer to [Timestamp](../Background-knowledge/Data-Type.md). - -An example is as follows: - -1. Select data with timestamp greater than 2022-01-01T00:05:00.000: - - ```sql - select s1 from root.sg1.d1 where time > 2022-01-01T00:05:00.000; - ```` - -2. Select data with timestamp equal to 2022-01-01T00:05:00.000: - - ```sql - select s1 from root.sg1.d1 where time = 2022-01-01T00:05:00.000; - ```` - -3. Select the data in the time interval [2017-11-01T00:05:00.000, 2017-11-01T00:12:00.000): - - ```sql - select s1 from root.sg1.d1 where time >= 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; - ```` - -Note: In the above example, `time` can also be written as `timestamp`. - -### 3.2 Value Filter - -Use value filters to filter data whose values meet certain criteria. A time series that is not selected in the select clause **can** also be used as a value filter. - -An example is as follows: - -1. Select data with a value greater than 36.5: - - ```sql - select temperature from root.sg1.d1 where temperature > 36.5; - ```` - -2. Select data with value equal to true: - - ```sql - select status from root.sg1.d1 where status = true; - ```` - -3. Select data inside or outside the interval [36.5,40]: - - ```sql - select temperature from root.sg1.d1 where temperature between 36.5 and 40; - ```` - - ```sql - select temperature from root.sg1.d1 where temperature not between 36.5 and 40; - ```` - -4. Select data with values within a specific range: - - ```sql - select code from root.sg1.d1 where code in ('200', '300', '400', '500'); - ```` - -5. Select data with values outside a certain range: - - ```sql - select code from root.sg1.d1 where code not in ('200', '300', '400', '500'); - ```` - -6. Select data whose value is null: - - ```sql - select code from root.sg1.d1 where temperature is null; - ```` - -7. 
Select data whose value is not null: - - ```sql - select code from root.sg1.d1 where temperature is not null; - ```` - -### 3.3 Fuzzy Query - -Fuzzy query is divided into Like statement and Regexp statement, both of which support fuzzy matching of TEXT type and STRING type data. - -#### Fuzzy matching using `Like` - -In the value filter condition, for TEXT type data, use `Like` and `Regexp` operators to perform fuzzy matching on data. - -**Matching rules:** - -- The percent (`%`) wildcard matches any string of zero or more characters. -- The underscore (`_`) wildcard matches any single character. - -**Example 1:** Query data containing `'cc'` in `value` under `root.sg.d1`. - -```sql -select * from root.sg.d1 where value like '%cc%' -``` -``` -+-----------------------------+----------------+ -| Time|root.sg.d1.value| -+-----------------------------+----------------+ -|2017-11-01T00:00:00.000+08:00| aabbccdd| -|2017-11-01T00:00:01.000+08:00| cc| -+-----------------------------+----------------+ -Total line number = 2 -It costs 0.002s -``` - -**Example 2:** Query data that consists of 3 characters and whose second character is `'b'` in `value` under `root.sg.d1`. - -```sql -select * from root.sg.d1 where value like '_b_'; -``` -``` -+-----------------------------+----------------+ -| Time|root.sg.d1.value| -+-----------------------------+----------------+ -|2017-11-01T00:00:02.000+08:00| abc| -+-----------------------------+----------------+ -Total line number = 1 -It costs 0.002s -``` - -#### Fuzzy matching using `Regexp` - -The filter condition is a regular expression in the style of the Java standard library.
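
Because the pattern follows Java regular-expression syntax, a candidate expression can be tried out locally with `java.util.regex` before it is used in a `Regexp` filter. The sketch below is only an illustration: the class and method names are hypothetical, and the use of `find()` (so that `^` and `$` in the pattern decide how much of the value must match) is an assumption about the matching semantics, not a statement of how IoTDB evaluates the filter.

```java
import java.util.regex.Pattern;

class RegexpFilterSketch {
    // Returns true when the value contains a match for the pattern.
    // Anchors (^ and $) inside the pattern force a whole-string match.
    static boolean matches(String regex, String value) {
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        // Only English letters: accepted.
        System.out.println(matches("^[A-Za-z]+$", "aabbccdd"));
        // Contains digits: rejected by the anchored letter-only pattern.
        System.out.println(matches("^[A-Za-z]+$", "abc123"));
        // Length between 3 and 20: a two-character value is rejected.
        System.out.println(matches("^.{3,20}$", "ab"));
        // Must begin with the letter a.
        System.out.println(matches("^a.*", "abc"));
    }
}
```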
- -**Examples of common regular matching:** - -``` -All characters with a length of 3-20: ^.{3,20}$ -Uppercase English characters: ^[A-Z]+$ -Numbers and English characters: ^[A-Za-z0-9]+$ -Beginning with a: ^a.* -``` - -**Example 1:** Query the values under root.sg.d1 that are strings composed only of English letters - -```sql -select * from root.sg.d1 where value regexp '^[A-Za-z]+$' -``` -``` -+-----------------------------+----------------+ -| Time|root.sg.d1.value| -+-----------------------------+----------------+ -|2017-11-01T00:00:00.000+08:00| aabbccdd| -|2017-11-01T00:00:01.000+08:00| cc| -+-----------------------------+----------------+ -Total line number = 2 -It costs 0.002s -``` - -**Example 2:** Query root.sg.d1 where the value is a string composed only of lowercase English letters and the time is greater than 100 - -```sql -select * from root.sg.d1 where value regexp '^[a-z]+$' and time > 100 -``` -``` -+-----------------------------+----------------+ -| Time|root.sg.d1.value| -+-----------------------------+----------------+ -|2017-11-01T00:00:00.000+08:00| aabbccdd| -|2017-11-01T00:00:01.000+08:00| cc| -+-----------------------------+----------------+ -Total line number = 2 -It costs 0.002s -``` - -## 4. `GROUP BY` CLAUSE - -IoTDB supports using the `GROUP BY` clause to aggregate time series by segment and by group. - -Segmented aggregation refers to segmenting data in the row direction according to the time dimension, aiming at the time relationship between different data points in the same time series, and obtaining an aggregated value for each segment. Currently only **group by time**, **group by variation**, **group by condition**, **group by session** and **group by count** are supported, and more segmentation methods will be supported in the future. - -Group aggregation refers to grouping different time series by their potential business attributes. Each group contains several time series, and each group gets an aggregated value.
Two grouping methods are supported: **group by path level** and **group by tag**. - -### 4.1 Aggregate By Segment - -#### Aggregate By Time - -Aggregate by time is a typical query method for time series data. Data is collected at high frequency and needs to be aggregated and calculated at certain time intervals. For example, to calculate the daily average temperature, the sequence of temperature needs to be segmented by day, and then the average value of each segment is calculated. - -Aggregate by time refers to a query method that uses a lower frequency than the frequency of data collection, and is a special case of segmented aggregation. For example, the frequency of data collection is one second. If you want to display the data at one-minute granularity, you need to use time aggregation. - -This section mainly introduces examples of time aggregation using the `GROUP BY` clause. IoTDB supports partitioning result sets according to time interval and customized sliding step. By default, results are sorted by time in ascending order. - -The GROUP BY statement provides users with three types of specified parameters: - -* Parameter 1: The display window on the time axis -* Parameter 2: Time interval for dividing the time axis (should be positive) -* Parameter 3: Time sliding step (optional; defaults to the time interval if not set) - -The actual meanings of the three types of parameters are shown in the figure below. -Among them, parameter 3 is optional.
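
The three parameters can also be read as a simple loop that produces the aggregation windows. The Java sketch below is a minimal illustration under stated assumptions, not IoTDB's implementation: the class and method names are hypothetical, all values use one fixed time unit (so natural-month intervals such as `1mo` are not covered), and cutting the last window off at the end of the display window is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class GroupByTimeWindows {
    // Returns the left-closed, right-open windows [start, end) produced by
    // a display window [startTime, endTime), an interval and a sliding step.
    static List<long[]> windows(long startTime, long endTime, long interval, long slidingStep) {
        if (interval <= 0 || slidingStep <= 0) {
            throw new IllegalArgumentException("interval and slidingStep must be positive");
        }
        List<long[]> result = new ArrayList<>();
        for (long start = startTime; start < endTime; start += slidingStep) {
            // Assumption: a window never extends past the display window.
            long end = Math.min(start + interval, endTime);
            result.add(new long[] {start, end});
        }
        return result;
    }

    public static void main(String[] args) {
        // Hours as the unit: display window [0, 10), interval 4, sliding step 2.
        for (long[] w : windows(0, 10, 4, 2)) {
            System.out.println("[" + w[0] + ", " + w[1] + ")");
        }
    }
}
```

Passing the interval as the sliding step reproduces the default behaviour, where consecutive windows touch without overlapping.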
-
- - -There are three typical examples of frequency reduction aggregation: - -##### Aggregate By Time without Specifying the Sliding Step Length - -The SQL statement is: - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d); -``` - -which means: - -Since the sliding step length is not specified, the `GROUP BY` statement by default sets the sliding step to the time interval, which is `1d`. - -The first parameter of the `GROUP BY` statement above is the display window parameter, which determines the final display range is [2017-11-01T00:00:00, 2017-11-07T23:00:00). - -The second parameter of the `GROUP BY` statement above is the time interval for dividing the time axis. Taking this parameter (1d) as time interval and startTime of the display window as the dividing origin, the time axis is divided into several continuous intervals, which are [0,1d), [1d, 2d), [2d, 3d), etc. - -Then the system will use the time and value filtering condition in the `WHERE` clause and the first parameter of the `GROUP BY` statement as the data filtering condition to obtain the data satisfying the filtering condition (which in this case is the data in the range of [2017-11-01T00:00:00, 2017-11-07T23:00:00)), and map these data to the previously segmented time axis (in this case there are mapped data in every 1-day period from 2017-11-01T00:00:00 to 2017-11-07T23:00:00).
- -Since there is data for each time period in the result range to be displayed, the execution result of the SQL statement is shown below: - -``` -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-01T00:00:00.000+08:00| 1440| 26.0| -|2017-11-02T00:00:00.000+08:00| 1440| 26.0| -|2017-11-03T00:00:00.000+08:00| 1440| 25.99| -|2017-11-04T00:00:00.000+08:00| 1440| 26.0| -|2017-11-05T00:00:00.000+08:00| 1440| 26.0| -|2017-11-06T00:00:00.000+08:00| 1440| 25.99| -|2017-11-07T00:00:00.000+08:00| 1380| 26.0| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 7 -It costs 0.024s -``` - -##### Aggregate By Time Specifying the Sliding Step Length - -The SQL statement is: - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d); -``` - -which means: - -Since the user specifies the sliding step parameter as 1d, the `GROUP BY` statement will move the time interval `1 day` long instead of `3 hours` as default. - -That means we want to fetch all the data of 00:00:00 to 02:59:59 every day from 2017-11-01 to 2017-11-07. - -The first parameter of the `GROUP BY` statement above is the display window parameter, which determines the final display range is [2017-11-01T00:00:00, 2017-11-07T23:00:00). - -The second parameter of the `GROUP BY` statement above is the time interval for dividing the time axis. 
Taking this parameter (3h) as time interval and the startTime of the display window as the dividing origin, the time axis is divided into several continuous intervals, which are [2017-11-01T00:00:00, 2017-11-01T03:00:00), [2017-11-02T00:00:00, 2017-11-02T03:00:00), [2017-11-03T00:00:00, 2017-11-03T03:00:00), etc. - -The third parameter of the `GROUP BY` statement above is the sliding step for each time interval moving. - -Then the system will use the time and value filtering condition in the `WHERE` clause and the first parameter of the `GROUP BY` statement as the data filtering condition to obtain the data satisfying the filtering condition (which in this case is the data in the range of [2017-11-01T00:00:00, 2017-11-07T23:00:00]), and map these data to the previously segmented time axis (in this case there are mapped data in every 3-hour period for each day from 2017-11-01T00:00:00 to 2017-11-07T23:00:00:00). - -Since there is data for each time period in the result range to be displayed, the execution result of the SQL statement is shown below: - -``` -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-01T00:00:00.000+08:00| 180| 25.98| -|2017-11-02T00:00:00.000+08:00| 180| 25.98| -|2017-11-03T00:00:00.000+08:00| 180| 25.96| -|2017-11-04T00:00:00.000+08:00| 180| 25.96| -|2017-11-05T00:00:00.000+08:00| 180| 26.0| -|2017-11-06T00:00:00.000+08:00| 180| 25.85| -|2017-11-07T00:00:00.000+08:00| 180| 25.99| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 7 -It costs 0.006s -``` - -The sliding step can be smaller than the interval, in which case there is overlapping time between the aggregation windows (similar to a sliding window). 
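As a rough sketch of how an interval and a sliding step interact, the following Python snippet enumerates the windows for a display range of [00:00, 10:00) with a 4-hour interval and a 2-hour step. The helper name `sliding_windows` is hypothetical, and the clipping of the last window to the display range is an assumption of this sketch rather than a statement about IoTDB internals.

```python
from datetime import datetime, timedelta

def sliding_windows(start, end, interval, step):
    """Return [w_start, w_end) windows whose starts advance by `step` inside [start, end)."""
    windows = []
    cur = start
    while cur < end:
        windows.append((cur, min(cur + interval, end)))
        cur += step
    return windows

day = datetime(2017, 11, 1)
overlapping = sliding_windows(
    day, day + timedelta(hours=10), timedelta(hours=4), timedelta(hours=2)
)
for w_start, w_end in overlapping:
    print(w_start.strftime("%H:%M"), "->", w_end.strftime("%H:%M"))
```

Each window starts two hours after the previous one but spans up to four hours, so neighbouring windows share two hours of data. With a step larger than the interval (for example 3h and 1d), the same loop produces windows separated by gaps instead.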
- -The SQL statement is: - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-01 10:00:00), 4h, 2h); -``` - -The execution result of the SQL statement is shown below: - -``` -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-01T00:00:00.000+08:00| 180| 25.98| -|2017-11-01T02:00:00.000+08:00| 180| 25.98| -|2017-11-01T04:00:00.000+08:00| 180| 25.96| -|2017-11-01T06:00:00.000+08:00| 180| 25.96| -|2017-11-01T08:00:00.000+08:00| 180| 26.0| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 5 -It costs 0.006s -``` - -##### Aggregate by Natural Month - -The SQL statement is: - -```sql -select count(status) from root.ln.wf01.wt01 group by([2017-11-01T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo); -``` - -which means: - -Since the user specifies the sliding step parameter as `2mo`, the `GROUP BY` statement will move the time interval `2 months` long instead of `1 month` as default. - -The first parameter of the `GROUP BY` statement above is the display window parameter, which determines the final display range is [2017-11-01T00:00:00, 2019-11-07T23:00:00). - -The start time is 2017-11-01T00:00:00. The sliding step will increment monthly based on the start date, and the 1st day of the month will be used as the time interval's start time. - -The second parameter of the `GROUP BY` statement above is the time interval for dividing the time axis. 
Taking this parameter (1mo) as time interval and the startTime of the display window as the dividing origin, the time axis is divided into several intervals, which are [2017-11-01T00:00:00, 2017-12-01T00:00:00), [2018-01-01T00:00:00, 2018-02-01T00:00:00), [2018-03-01T00:00:00, 2018-04-01T00:00:00), etc. - -The third parameter of the `GROUP BY` statement above is the sliding step for each time interval moving. - -Then the system will use the time and value filtering condition in the `WHERE` clause and the first parameter of the `GROUP BY` statement as the data filtering condition to obtain the data satisfying the filtering condition (which in this case is the data in the range of [2017-11-01T00:00:00, 2019-11-07T23:00:00)), and map these data to the previously segmented time axis (in this case there are mapped data of the first month in every two month period from 2017-11-01T00:00:00 to 2019-11-07T23:00:00). - -The SQL execution result is: - -``` -+-----------------------------+-------------------------------+ -| Time|count(root.ln.wf01.wt01.status)| -+-----------------------------+-------------------------------+ -|2017-11-01T00:00:00.000+08:00| 259| -|2018-01-01T00:00:00.000+08:00| 250| -|2018-03-01T00:00:00.000+08:00| 259| -|2018-05-01T00:00:00.000+08:00| 251| -|2018-07-01T00:00:00.000+08:00| 242| -|2018-09-01T00:00:00.000+08:00| 225| -|2018-11-01T00:00:00.000+08:00| 216| -|2019-01-01T00:00:00.000+08:00| 207| -|2019-03-01T00:00:00.000+08:00| 216| -|2019-05-01T00:00:00.000+08:00| 207| -|2019-07-01T00:00:00.000+08:00| 199| -|2019-09-01T00:00:00.000+08:00| 181| -|2019-11-01T00:00:00.000+08:00| 60| -+-----------------------------+-------------------------------+ -``` - -The SQL statement is: - -```sql -select count(status) from root.ln.wf01.wt01 group by([2017-10-31T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo); -``` - -which means: - -Since the user specifies the sliding step parameter as `2mo`, the `GROUP BY` statement will move the time interval `2 months`
long instead of `1 month` as default. - -The first parameter of the `GROUP BY` statement above is the display window parameter, which determines that the final display range is [2017-10-31T00:00:00, 2019-11-07T23:00:00). - -Different from the previous example, the start time is set to 2017-10-31T00:00:00. The sliding step will increment monthly based on the start date, and the 31st day of the month, meaning the last day of the month, will be used as the time interval's start time. If the start time is set to the 30th, the sliding step will use the 30th or the last day of the month. - -The second parameter of the `GROUP BY` statement above is the time interval for dividing the time axis. Taking this parameter (1mo) as time interval and the startTime of the display window as the dividing origin, the time axis is divided into several intervals, which are [2017-10-31T00:00:00, 2017-11-30T00:00:00), [2017-12-31T00:00:00, 2018-01-31T00:00:00), [2018-02-28T00:00:00, 2018-03-31T00:00:00), etc. - -The third parameter of the `GROUP BY` statement above is the sliding step for each time interval moving. - -Then the system will use the time and value filtering condition in the `WHERE` clause and the first parameter of the `GROUP BY` statement as the data filtering condition to obtain the data satisfying the filtering condition (which in this case is the data in the range of [2017-10-31T00:00:00, 2019-11-07T23:00:00)), and map these data to the previously segmented time axis (in this case there are mapped data of the first month in every two month period from 2017-10-31T00:00:00 to 2019-11-07T23:00:00).
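The month stepping described above (keep the day of the start date, or fall back to the last day of a shorter month) can be sketched as plain calendar arithmetic. This is an illustrative sketch, not IoTDB's code: the helpers `add_months` and `month_window_starts` are hypothetical names, and computing every window start as an offset from the original start time is an assumption chosen because it reproduces the window start times shown in the results of this section.

```python
import calendar
from datetime import datetime

def add_months(origin, months):
    """Shift `origin` by whole calendar months, clamping the day to the month's last day."""
    total = origin.year * 12 + (origin.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return origin.replace(year=year, month=month, day=min(origin.day, last_day))

def month_window_starts(origin, end, step_months):
    """Start times of the windows inside [origin, end), stepping by `step_months`."""
    starts = []
    k = 0
    while True:
        candidate = add_months(origin, k * step_months)
        if candidate >= end:
            break
        starts.append(candidate)
        k += 1
    return starts

starts = month_window_starts(datetime(2017, 10, 31), datetime(2019, 11, 7, 23), 2)
print([s.strftime("%Y-%m-%d") for s in starts[:4]])
# ['2017-10-31', '2017-12-31', '2018-02-28', '2018-04-30']
print(len(starts))  # 13
```

Because each start is derived from the original date rather than from the previous window, a short month such as February does not permanently pull later windows back to the 28th.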
- -The SQL execution result is: - -``` -+-----------------------------+-------------------------------+ -| Time|count(root.ln.wf01.wt01.status)| -+-----------------------------+-------------------------------+ -|2017-10-31T00:00:00.000+08:00| 251| -|2017-12-31T00:00:00.000+08:00| 250| -|2018-02-28T00:00:00.000+08:00| 259| -|2018-04-30T00:00:00.000+08:00| 250| -|2018-06-30T00:00:00.000+08:00| 242| -|2018-08-31T00:00:00.000+08:00| 225| -|2018-10-31T00:00:00.000+08:00| 216| -|2018-12-31T00:00:00.000+08:00| 208| -|2019-02-28T00:00:00.000+08:00| 216| -|2019-04-30T00:00:00.000+08:00| 208| -|2019-06-30T00:00:00.000+08:00| 199| -|2019-08-31T00:00:00.000+08:00| 181| -|2019-10-31T00:00:00.000+08:00| 69| -+-----------------------------+-------------------------------+ -``` - -##### Left Open And Right Close Range - -The SQL statement is: - -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d); -``` - -In this sql, the time interval is left open and right close, so we won't include the value of timestamp 2017-11-01T00:00:00 and instead we will include the value of timestamp 2017-11-07T23:00:00. - -We will get the result like following: - -``` -+-----------------------------+-------------------------------+ -| Time|count(root.ln.wf01.wt01.status)| -+-----------------------------+-------------------------------+ -|2017-11-02T00:00:00.000+08:00| 1440| -|2017-11-03T00:00:00.000+08:00| 1440| -|2017-11-04T00:00:00.000+08:00| 1440| -|2017-11-05T00:00:00.000+08:00| 1440| -|2017-11-06T00:00:00.000+08:00| 1440| -|2017-11-07T00:00:00.000+08:00| 1440| -|2017-11-07T23:00:00.000+08:00| 1380| -+-----------------------------+-------------------------------+ -Total line number = 7 -It costs 0.004s -``` - -#### Aggregation By Variation - -IoTDB supports grouping by continuous stable values through the `GROUP BY VARIATION` statement. 
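Before the formal syntax, a small Python sketch may help build intuition for what "grouping by continuous stable values" means. It is a simplified model under stated assumptions, not IoTDB's implementation: the function names are hypothetical, the first point of each group serves as the base value, a point joins the group while its difference from the base is at most `delta`, and null handling follows the rules listed later in this section. The input reuses the `s6` column of the sample data shown below, with times given as millisecond offsets.

```python
def same_group(base, value, delta):
    """A value stays in the group if it is within delta of the base; nulls only match nulls."""
    if base is None or value is None:
        return base is None and value is None
    return abs(value - base) <= delta

def group_by_variation(rows, delta=0.0, ignore_null=True):
    """rows: list of (time, value) pairs. Returns (start_time, end_time) per group."""
    groups = []  # each entry: [start_time, end_time, base_value]
    for time, value in rows:
        if value is None and ignore_null:
            continue  # null rows are skipped entirely
        if groups and same_group(groups[-1][2], value, delta):
            groups[-1][1] = time
        else:
            groups.append([time, time, value])
    return [(start, end) for start, end, _ in groups]

s6 = [(0, 8.25), (10, 8.25), (20, None), (30, None), (40, 8.25), (50, 6.25),
      (60, None), (70, 3.25), (80, 3.25), (90, 3.25), (150, 9.25)]

print(group_by_variation(s6))             # [(0, 40), (50, 50), (70, 90), (150, 150)]
print(group_by_variation(s6, delta=4.0))  # [(0, 50), (70, 90), (150, 150)]
print(group_by_variation(s6, ignore_null=False))
# [(0, 10), (20, 30), (40, 40), (50, 50), (60, 60), (70, 90), (150, 150)]
```

The three calls correspond to `variation(s6)`, `variation(s6, 4)` and `variation(s6, ignoreNull=false)`, and the group boundaries agree with the `Time` and `__endTime` columns of the example results in this section.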
- -Group-By-Variation will set the first point in a group as the base point, -then if the difference between the new data and the base point is smaller than or equal to delta, -the data point will be grouped together and the aggregation query is executed (the calculation of the difference and the meaning of delta are introduced below). The groups won't overlap and there is no fixed start time and end time. -The syntax of the clause is as follows: - -```sql -group by variation(controlExpression[,delta][,ignoreNull=true/false]) -``` - -The different parameters mean: - -* controlExpression - -The value that is used to calculate the difference. It can be any column or an expression of columns. - -* delta - -The threshold that is used when grouping. The difference of controlExpression between the first data point and the new data point should be less than or equal to delta. -When delta is zero, all the continuous data with equal expression value will be grouped into the same group. - -* ignoreNull - -Used to specify how to deal with the data when the value of controlExpression is null. When ignoreNull is false, null will be treated as a new value and when ignoreNull is true, the data point will be directly skipped. - -The supported return types of controlExpression and how to deal with null value when ignoreNull is false are shown in the following table: - -| delta | Return Type Supported By controlExpression | The Handling of null when ignoreNull is False | -| -------- | ------------------------------------------ | ------------------------------------------------------------ | -| delta!=0 | INT32, INT64, FLOAT, DOUBLE | If the processing group doesn't contain null, null value should be treated as infinity/infinitesimal and will end current group.
Continuous null values are treated as stable values and assigned to the same group. | -| delta=0 | TEXT, BINARY, INT32, INT64, FLOAT, DOUBLE | Null is treated as a new value in a new group and continuous nulls belong to the same group. | - -groupByVariation - -##### Precautions for Use - -1. The result of controlExpression should be a unique value. If multiple columns appear after using wildcard stitching, an error will be reported. -2. For a group in resultSet, the time column outputs the start time of the group by default. __endTime can be used in the select clause to output the endTime of groups in resultSet. -3. Each device is grouped separately when used with `ALIGN BY DEVICE`. -4. Delta is zero and ignoreNull is true by default. -5. Currently `GROUP BY VARIATION` is not supported with `GROUP BY LEVEL`. - -Using the raw data below, several examples of `GROUP BY VARIATION` queries will be given. - -``` -+-----------------------------+-------+-------+-------+--------+-------+-------+ -| Time| s1| s2| s3| s4| s5| s6| -+-----------------------------+-------+-------+-------+--------+-------+-------+ -|1970-01-01T08:00:00.000+08:00| 4.5| 9.0| 0.0| 45.0| 9.0| 8.25| -|1970-01-01T08:00:00.010+08:00| null| 19.0| 10.0| 145.0| 19.0| 8.25| -|1970-01-01T08:00:00.020+08:00| 24.5| 29.0| null| 245.0| 29.0| null| -|1970-01-01T08:00:00.030+08:00| 34.5| null| 30.0| 345.0| null| null| -|1970-01-01T08:00:00.040+08:00| 44.5| 49.0| 40.0| 445.0| 49.0| 8.25| -|1970-01-01T08:00:00.050+08:00| null| 59.0| 50.0| 545.0| 59.0| 6.25| -|1970-01-01T08:00:00.060+08:00| 64.5| 69.0| 60.0| 645.0| 69.0| null| -|1970-01-01T08:00:00.070+08:00| 74.5| 79.0| null| null| 79.0| 3.25| -|1970-01-01T08:00:00.080+08:00| 84.5| 89.0| 80.0| 845.0| 89.0| 3.25| -|1970-01-01T08:00:00.090+08:00| 94.5| 99.0| 90.0| 945.0| 99.0| 3.25| -|1970-01-01T08:00:00.150+08:00| 66.5| 77.0| 90.0| 945.0| 99.0| 9.25| -+-----------------------------+-------+-------+-------+--------+-------+-------+ -``` - -##### delta = 0 - -The sql is shown
below: - -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6) -``` - -Get the result below which ignores the row with null value in `s6`. - -``` -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.040+08:00| 24.5| 3| 50.0| -|1970-01-01T08:00:00.050+08:00|1970-01-01T08:00:00.050+08:00| null| 1| 50.0| -|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.090+08:00| 84.5| 3| 170.0| -|1970-01-01T08:00:00.150+08:00|1970-01-01T08:00:00.150+08:00| 66.5| 1| 90.0| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -``` - -when ignoreNull is false, the row with null value in `s6` will be considered. - -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, ignoreNull=false) -``` - -Get the following result. 
- -``` -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.010+08:00| 4.5| 2| 10.0| -|1970-01-01T08:00:00.020+08:00|1970-01-01T08:00:00.030+08:00| 29.5| 1| 30.0| -|1970-01-01T08:00:00.040+08:00|1970-01-01T08:00:00.040+08:00| 44.5| 1| 40.0| -|1970-01-01T08:00:00.050+08:00|1970-01-01T08:00:00.050+08:00| null| 1| 50.0| -|1970-01-01T08:00:00.060+08:00|1970-01-01T08:00:00.060+08:00| 64.5| 1| 60.0| -|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.090+08:00| 84.5| 3| 170.0| -|1970-01-01T08:00:00.150+08:00|1970-01-01T08:00:00.150+08:00| 66.5| 1| 90.0| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -``` - -##### delta !=0 - -The sql is shown below: - -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, 4) -``` - -Get the result below: - -``` -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.050+08:00| 24.5| 4| 100.0| -|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.090+08:00| 84.5| 3| 170.0| -|1970-01-01T08:00:00.150+08:00|1970-01-01T08:00:00.150+08:00| 66.5| 1| 90.0| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -``` - -The sql is shown below: - -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6+s5, 10) -``` - -Get the result 
below: - -``` -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.010+08:00| 4.5| 2| 10.0| -|1970-01-01T08:00:00.040+08:00|1970-01-01T08:00:00.050+08:00| 44.5| 2| 90.0| -|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.080+08:00| 79.5| 2| 80.0| -|1970-01-01T08:00:00.090+08:00|1970-01-01T08:00:00.150+08:00| 80.5| 2| 180.0| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -``` - -#### Aggregation By Condition - -When you need to filter the data according to a specific condition and group the continuous rows that meet it for an aggregation query, -`GROUP BY CONDITION` is suitable for you. The rows which don't meet the given condition will be simply ignored because they don't belong to any group. -Its syntax is defined below: - -```sql -group by condition(predict,[keep>/>=/=/<=/<]threshold[,ignoreNull=true/false]) -``` - -* predict - -Any legal expression that returns a boolean value, used for filtering in grouping. - -* [keep>/>=/=/<=/<]threshold - -Keep expression is used to specify the number of continuous rows that meet the `predict` condition to form a group. The result of a group is output only if the number of rows in the group satisfies the keep condition. -Keep expression consists of a 'keep' string and a threshold of type `long`, or a single value of type `long`. - -* ignoreNull=true/false - -Used to specify how to handle data rows whose predict evaluates to null: the row is skipped when it's true, and the current group is ended when it's false. - -##### Precautions for Use - -1. The keep condition is required in the query, but you can omit the 'keep' string and give only a `long` number, which defaults to the 'keep=long number' condition. -2.
IgnoreNull defaults to true. -3. For a group in resultSet, the time column output the start time of the group by default. __endTime can be used in select clause to output the endTime of groups in resultSet. -4. Each device is grouped separately when used with `ALIGN BY DEVICE`. -5. Currently `GROUP BY CONDITION` is not supported with `GROUP BY LEVEL`. - -For the following raw data, several query examples are given below: - -``` -+-----------------------------+-------------------------+-------------------------------------+------------------------------------+ -| Time|root.sg.beijing.car01.soc|root.sg.beijing.car01.charging_status|root.sg.beijing.car01.vehicle_status| -+-----------------------------+-------------------------+-------------------------------------+------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 14.0| 1| 1| -|1970-01-01T08:00:00.002+08:00| 16.0| 1| 1| -|1970-01-01T08:00:00.003+08:00| 16.0| 0| 1| -|1970-01-01T08:00:00.004+08:00| 16.0| 0| 1| -|1970-01-01T08:00:00.005+08:00| 18.0| 1| 1| -|1970-01-01T08:00:00.006+08:00| 24.0| 1| 1| -|1970-01-01T08:00:00.007+08:00| 36.0| 1| 1| -|1970-01-01T08:00:00.008+08:00| 36.0| null| 1| -|1970-01-01T08:00:00.009+08:00| 45.0| 1| 1| -|1970-01-01T08:00:00.010+08:00| 60.0| 1| 1| -+-----------------------------+-------------------------+-------------------------------------+------------------------------------+ -``` - -The sql statement to query data with at least two continuous row shown below: - -```sql -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoringNull=true) -``` - -Get the result below: - -``` -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -| Time|max_time(root.sg.beijing.car01.charging_status)|count(root.sg.beijing.car01.vehicle_status)|last_value(root.sg.beijing.car01.soc)| 
-+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 2| 2| 16.0| -|1970-01-01T08:00:00.005+08:00| 10| 5| 60.0| -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -``` - -When ignoreNull is false, the null value will be treated as a row that doesn't meet the condition. - -```sql -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoringNull=false) -``` - -Get the result below, the original group is split. - -``` -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -| Time|max_time(root.sg.beijing.car01.charging_status)|count(root.sg.beijing.car01.vehicle_status)|last_value(root.sg.beijing.car01.soc)| -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 2| 2| 16.0| -|1970-01-01T08:00:00.005+08:00| 7| 3| 36.0| -|1970-01-01T08:00:00.009+08:00| 10| 2| 60.0| -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -``` - -#### Aggregation By Session - -`GROUP BY SESSION` can be used to group data according to the interval of the time. Data with a time interval less than or equal to the given threshold will be assigned to the same group. -For example, in industrial scenarios, devices don't always run continuously, `GROUP BY SESSION` will group the data generated by each access session of the device. 
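The session rule above can be sketched as a single pass over sorted timestamps. This is a minimal illustration, not IoTDB's implementation: the function name `group_by_session` is hypothetical, and timestamps are plain second offsets. A new session starts whenever the gap to the previous point exceeds the threshold.

```python
def group_by_session(times, threshold):
    """Split sorted timestamps into sessions; a gap larger than `threshold` opens a new one.

    Returns a list of (start_time, end_time) tuples.
    """
    sessions = []
    for t in times:
        if sessions and t - sessions[-1][1] <= threshold:
            sessions[-1][1] = t  # gap is small enough: extend the current session
        else:
            sessions.append([t, t])
    return [tuple(s) for s in sessions]

# Second offsets loosely modelled on the first part of the sample data in this section.
times = [1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 100, 150, 200, 260, 320]
print(group_by_session(times, 50))  # [(1, 200), (260, 260), (320, 320)]
```

A gap of exactly 50 seconds keeps two points in the same session because the comparison is "less than or equal to", while the 60-second gaps before 260 and 320 each start a new session.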
-Its syntax is defined as follows: - -```sql -group by session(timeInterval) -``` - -* timeInterval - -A given interval threshold to create a new group of data when the difference between the time of data is greater than the threshold. - -The figure below is a grouping diagram under `GROUP BY SESSION`. - -groupBySession - -##### Precautions for Use - -1. For a group in resultSet, the time column output the start time of the group by default. __endTime can be used in select clause to output the endTime of groups in resultSet. -2. Each device is grouped separately when used with `ALIGN BY DEVICE`. -3. Currently `GROUP BY SESSION` is not supported with `GROUP BY LEVEL`. - -For the raw data below, a few query examples are given: - -``` -+-----------------------------+-----------------+-----------+--------+------+ -| Time| Device|temperature|hardware|status| -+-----------------------------+-----------------+-----------+--------+------+ -|1970-01-01T08:00:01.000+08:00|root.ln.wf02.wt01| 35.7| 11| false| -|1970-01-01T08:00:02.000+08:00|root.ln.wf02.wt01| 35.8| 22| true| -|1970-01-01T08:00:03.000+08:00|root.ln.wf02.wt01| 35.4| 33| false| -|1970-01-01T08:00:04.000+08:00|root.ln.wf02.wt01| 36.4| 44| false| -|1970-01-01T08:00:05.000+08:00|root.ln.wf02.wt01| 36.8| 55| false| -|1970-01-01T08:00:10.000+08:00|root.ln.wf02.wt01| 36.8| 110| false| -|1970-01-01T08:00:20.000+08:00|root.ln.wf02.wt01| 37.8| 220| true| -|1970-01-01T08:00:30.000+08:00|root.ln.wf02.wt01| 37.5| 330| false| -|1970-01-01T08:00:40.000+08:00|root.ln.wf02.wt01| 37.4| 440| false| -|1970-01-01T08:00:50.000+08:00|root.ln.wf02.wt01| 37.9| 550| false| -|1970-01-01T08:01:40.000+08:00|root.ln.wf02.wt01| 38.0| 110| false| -|1970-01-01T08:02:30.000+08:00|root.ln.wf02.wt01| 38.8| 220| true| -|1970-01-01T08:03:20.000+08:00|root.ln.wf02.wt01| 38.6| 330| false| -|1970-01-01T08:04:20.000+08:00|root.ln.wf02.wt01| 38.4| 440| false| -|1970-01-01T08:05:20.000+08:00|root.ln.wf02.wt01| 38.3| 550| false| 
-|1970-01-01T08:06:40.000+08:00|root.ln.wf02.wt01| null| 0| null| -|1970-01-01T08:07:50.000+08:00|root.ln.wf02.wt01| null| 0| null| -|1970-01-01T08:08:00.000+08:00|root.ln.wf02.wt01| null| 0| null| -|1970-01-02T08:08:01.000+08:00|root.ln.wf02.wt01| 38.2| 110| false| -|1970-01-02T08:08:02.000+08:00|root.ln.wf02.wt01| 37.5| 220| true| -|1970-01-02T08:08:03.000+08:00|root.ln.wf02.wt01| 37.4| 330| false| -|1970-01-02T08:08:04.000+08:00|root.ln.wf02.wt01| 36.8| 440| false| -|1970-01-02T08:08:05.000+08:00|root.ln.wf02.wt01| 37.4| 550| false| -+-----------------------------+-----------------+-----------+--------+------+ -``` - -TimeInterval can be set by different time units, the sql is shown below: - -```sql -select __endTime,count(*) from root.** group by session(1d) -``` - -Get the result: - -``` -+-----------------------------+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -| Time| __endTime|count(root.ln.wf02.wt01.temperature)|count(root.ln.wf02.wt01.hardware)|count(root.ln.wf02.wt01.status)| -+-----------------------------+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -|1970-01-01T08:00:01.000+08:00|1970-01-01T08:08:00.000+08:00| 15| 18| 15| -|1970-01-02T08:08:01.000+08:00|1970-01-02T08:08:05.000+08:00| 5| 5| 5| -+-----------------------------+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -``` - -It can be also used with `HAVING` and `ALIGN BY DEVICE` clauses. 
- -```sql -select __endTime,sum(hardware) from root.ln.wf02.wt01 group by session(50s) having sum(hardware)>0 align by device -``` - -Get the result below: - -``` -+-----------------------------+-----------------+-----------------------------+-------------+ -| Time| Device| __endTime|sum(hardware)| -+-----------------------------+-----------------+-----------------------------+-------------+ -|1970-01-01T08:00:01.000+08:00|root.ln.wf02.wt01|1970-01-01T08:03:20.000+08:00| 2475.0| -|1970-01-01T08:04:20.000+08:00|root.ln.wf02.wt01|1970-01-01T08:04:20.000+08:00| 440.0| -|1970-01-01T08:05:20.000+08:00|root.ln.wf02.wt01|1970-01-01T08:05:20.000+08:00| 550.0| -|1970-01-02T08:08:01.000+08:00|root.ln.wf02.wt01|1970-01-02T08:08:05.000+08:00| 1650.0| -+-----------------------------+-----------------+-----------------------------+-------------+ -``` - -#### Aggregation By Count - -`GROUP BY COUNT` can aggregate the data points according to the number of points. It groups a fixed number of continuous data points together for an aggregation query. -Its syntax is defined as follows: - -```sql -group by count(controlExpression, size[,ignoreNull=true/false]) -``` - -* controlExpression - -The object to count during processing. It can be any column or an expression of columns. - -* size - -The number of data points in a group. Every `size` continuous points are assigned to the same group. - -* ignoreNull=true/false - -Whether to ignore the data points with null in `controlExpression`. When ignoreNull is true, data points whose `controlExpression` is null will be skipped during counting. - -##### Precautions for Use - -1. For a group in resultSet, the time column outputs the start time of the group by default. __endTime can be used in the select clause to output the endTime of groups in resultSet. -2. Each device is grouped separately when used with `ALIGN BY DEVICE`. -3. Currently `GROUP BY COUNT` is not supported with `GROUP BY LEVEL`. -4.
When the final number of data points in a group is less than `size`, the result of the group will not be output. - -For the data below, some examples will be given. - -``` -+-----------------------------+-----------+-----------------------+ -| Time|root.sg.soc|root.sg.charging_status| -+-----------------------------+-----------+-----------------------+ -|1970-01-01T08:00:00.001+08:00| 14.0| 1| -|1970-01-01T08:00:00.002+08:00| 16.0| 1| -|1970-01-01T08:00:00.003+08:00| 16.0| 0| -|1970-01-01T08:00:00.004+08:00| 16.0| 0| -|1970-01-01T08:00:00.005+08:00| 18.0| 1| -|1970-01-01T08:00:00.006+08:00| 24.0| 1| -|1970-01-01T08:00:00.007+08:00| 36.0| 1| -|1970-01-01T08:00:00.008+08:00| 36.0| null| -|1970-01-01T08:00:00.009+08:00| 45.0| 1| -|1970-01-01T08:00:00.010+08:00| 60.0| 1| -+-----------------------------+-----------+-----------------------+ -``` - -The sql is shown below: - -```sql -select count(charging_status), first_value(soc) from root.sg group by count(charging_status,5) -``` - -Get the result below. The second group, from 1970-01-01T08:00:00.006+08:00 to 1970-01-01T08:00:00.010+08:00, contains only four points, which is fewer than `size`, so it is not output. - -``` -+-----------------------------+-----------------------------+--------------------------------------+ -| Time| __endTime|first_value(root.sg.beijing.car01.soc)| -+-----------------------------+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.001+08:00|1970-01-01T08:00:00.005+08:00| 14.0| -+-----------------------------+-----------------------------+--------------------------------------+ -``` - -When `ignoreNull=false` is used to take null values into account,
there will be two groups with 5 points in the resultSet, as shown below: - -```sql -select count(charging_status), first_value(soc) from root.sg group by count(charging_status,5,ignoreNull=false) -``` - -Get the results: - -``` -+-----------------------------+-----------------------------+--------------------------------------+ -| Time| __endTime|first_value(root.sg.beijing.car01.soc)| -+-----------------------------+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.001+08:00|1970-01-01T08:00:00.005+08:00| 14.0| -|1970-01-01T08:00:00.006+08:00|1970-01-01T08:00:00.010+08:00| 24.0| -+-----------------------------+-----------------------------+--------------------------------------+ -``` - -### 4.2 Aggregate By Group - -#### Aggregation By Level - -The aggregation by level statement is used to group the query results whose names are the same at the given level. - -- Keyword `LEVEL` is used to specify the level that needs to be grouped. By convention, `level=0` represents the *root* level. -- All aggregation functions are supported. When using the five aggregations sum, avg, min_value, max_value and extreme, please make sure all the aggregated series have exactly the same data type. Otherwise, a syntax error will be generated. - -**Example 1:** there are multiple series named `status` under different databases, like "root.ln.wf01.wt01.status", "root.ln.wf02.wt02.status", and "root.sgcc.wf03.wt01.status".
If you need to count the number of data points of the `status` sequence under different databases, use the following query: - -```sql -select count(status) from root.** group by level = 1 -``` - -Result: - -``` -+-------------------------+---------------------------+ -|count(root.ln.*.*.status)|count(root.sgcc.*.*.status)| -+-------------------------+---------------------------+ -| 20160| 10080| -+-------------------------+---------------------------+ -Total line number = 1 -It costs 0.003s -``` - -**Example 2:** If you need to count the number of data points under different devices, you can specify level = 3: - -```sql -select count(status) from root.** group by level = 3 -``` - -Result: - -``` -+---------------------------+---------------------------+ -|count(root.*.*.wt01.status)|count(root.*.*.wt02.status)| -+---------------------------+---------------------------+ -| 20160| 10080| -+---------------------------+---------------------------+ -Total line number = 1 -It costs 0.003s -``` - -**Example 3:** Note that the devices named `wt01` under databases `ln` and `sgcc` are grouped together, since they are regarded as devices with the same name.
If you need to further count the number of data points in different devices under different databases, you can use the following query: - -```sql -select count(status) from root.** group by level = 1, 3 -``` - -Result: - -``` -+----------------------------+----------------------------+------------------------------+ -|count(root.ln.*.wt01.status)|count(root.ln.*.wt02.status)|count(root.sgcc.*.wt01.status)| -+----------------------------+----------------------------+------------------------------+ -| 10080| 10080| 10080| -+----------------------------+----------------------------+------------------------------+ -Total line number = 1 -It costs 0.003s -``` - -**Example 4:** To query the maximum value of the `temperature` sensor across all time series, you can use the following query statement: - -```sql -select max_value(temperature) from root.** group by level = 0 -``` - -Result: - -``` -+---------------------------------+ -|max_value(root.*.*.*.temperature)| -+---------------------------------+ -| 26.0| -+---------------------------------+ -Total line number = 1 -It costs 0.013s -``` - -**Example 5:** The queries above target a single sensor. In particular, **if you want to query the total number of data points owned by all sensors at a certain level**, you need to select `*` explicitly. - -```sql -select count(*) from root.ln.** group by level = 2 -``` - -Result: - -``` -+----------------------+----------------------+ -|count(root.*.wf01.*.*)|count(root.*.wf02.*.*)| -+----------------------+----------------------+ -| 20160| 20160| -+----------------------+----------------------+ -Total line number = 1 -It costs 0.013s -``` - -##### Aggregate By Time with Level Clause - -A level can be specified to count the number of points of each node at the given level in the current metadata tree. - -This can be used to query the number of points under each device. - -The SQL statement is: - -Get time aggregation by level.
 - -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d), level=1; -``` - -Result: - -``` -+-----------------------------+-------------------------+ -| Time|COUNT(root.ln.*.*.status)| -+-----------------------------+-------------------------+ -|2017-11-02T00:00:00.000+08:00| 1440| -|2017-11-03T00:00:00.000+08:00| 1440| -|2017-11-04T00:00:00.000+08:00| 1440| -|2017-11-05T00:00:00.000+08:00| 1440| -|2017-11-06T00:00:00.000+08:00| 1440| -|2017-11-07T00:00:00.000+08:00| 1440| -|2017-11-07T23:00:00.000+08:00| 1380| -+-----------------------------+-------------------------+ -Total line number = 7 -It costs 0.006s -``` - -Time aggregation with sliding step and by level. - -```sql -select count(status) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d), level=1; -``` - -Result: - -``` -+-----------------------------+-------------------------+ -| Time|COUNT(root.ln.*.*.status)| -+-----------------------------+-------------------------+ -|2017-11-01T00:00:00.000+08:00| 180| -|2017-11-02T00:00:00.000+08:00| 180| -|2017-11-03T00:00:00.000+08:00| 180| -|2017-11-04T00:00:00.000+08:00| 180| -|2017-11-05T00:00:00.000+08:00| 180| -|2017-11-06T00:00:00.000+08:00| 180| -|2017-11-07T00:00:00.000+08:00| 180| -+-----------------------------+-------------------------+ -Total line number = 7 -It costs 0.004s -``` - -#### Aggregation By Tags - -IoTDB also allows you to run aggregation queries over the tags defined on timeseries through the `GROUP BY TAGS` clause. - -First, we put the following example data into IoTDB; it is used throughout the rest of this feature introduction. - -The data are temperature readings from workshops that belong to the factory `factory1` and are located in different cities. The time range is `[1000, 10000)`. - -The device node of the timeseries path is the ID of the device. The city and workshop information is modelled in the tags `city` and `workshop`.
-The devices `d1` and `d2` belong to the workshop `w1` in `Beijing`. -`d3` and `d4` belong to the workshop `w2` in `Beijing`. -`d5` and `d6` belong to the workshop `w1` in `Shanghai`. -`d7` belongs to the workshop `w2` in `Shanghai`. -`d8` and `d9` are under maintenance and don't belong to any workshop, so they have no tags. - - -```SQL -CREATE DATABASE root.factory1; -create timeseries root.factory1.d1.temperature with datatype=FLOAT tags(city=Beijing, workshop=w1); -create timeseries root.factory1.d2.temperature with datatype=FLOAT tags(city=Beijing, workshop=w1); -create timeseries root.factory1.d3.temperature with datatype=FLOAT tags(city=Beijing, workshop=w2); -create timeseries root.factory1.d4.temperature with datatype=FLOAT tags(city=Beijing, workshop=w2); -create timeseries root.factory1.d5.temperature with datatype=FLOAT tags(city=Shanghai, workshop=w1); -create timeseries root.factory1.d6.temperature with datatype=FLOAT tags(city=Shanghai, workshop=w1); -create timeseries root.factory1.d7.temperature with datatype=FLOAT tags(city=Shanghai, workshop=w2); -create timeseries root.factory1.d8.temperature with datatype=FLOAT; -create timeseries root.factory1.d9.temperature with datatype=FLOAT; - -insert into root.factory1.d1(time, temperature) values(1000, 104.0); -insert into root.factory1.d1(time, temperature) values(3000, 104.2); -insert into root.factory1.d1(time, temperature) values(5000, 103.3); -insert into root.factory1.d1(time, temperature) values(7000, 104.1); - -insert into root.factory1.d2(time, temperature) values(1000, 104.4); -insert into root.factory1.d2(time, temperature) values(3000, 103.7); -insert into root.factory1.d2(time, temperature) values(5000, 103.3); -insert into root.factory1.d2(time, temperature) values(7000, 102.9); - -insert into root.factory1.d3(time, temperature) values(1000, 103.9); -insert into root.factory1.d3(time, temperature) values(3000, 103.8); -insert into root.factory1.d3(time, temperature) values(5000, 102.7); 
-insert into root.factory1.d3(time, temperature) values(7000, 106.9); - -insert into root.factory1.d4(time, temperature) values(1000, 103.9); -insert into root.factory1.d4(time, temperature) values(5000, 102.7); -insert into root.factory1.d4(time, temperature) values(7000, 106.9); - -insert into root.factory1.d5(time, temperature) values(1000, 112.9); -insert into root.factory1.d5(time, temperature) values(7000, 113.0); - -insert into root.factory1.d6(time, temperature) values(1000, 113.9); -insert into root.factory1.d6(time, temperature) values(3000, 113.3); -insert into root.factory1.d6(time, temperature) values(5000, 112.7); -insert into root.factory1.d6(time, temperature) values(7000, 112.3); - -insert into root.factory1.d7(time, temperature) values(1000, 101.2); -insert into root.factory1.d7(time, temperature) values(3000, 99.3); -insert into root.factory1.d7(time, temperature) values(5000, 100.1); -insert into root.factory1.d7(time, temperature) values(7000, 99.8); - -insert into root.factory1.d8(time, temperature) values(1000, 50.0); -insert into root.factory1.d8(time, temperature) values(3000, 52.1); -insert into root.factory1.d8(time, temperature) values(5000, 50.1); -insert into root.factory1.d8(time, temperature) values(7000, 50.5); - -insert into root.factory1.d9(time, temperature) values(1000, 50.3); -insert into root.factory1.d9(time, temperature) values(3000, 52.1); -``` - -##### Aggregation query by one single tag - -If a user wants to know the average temperature of each city, they can use the following query: - -```SQL -SELECT AVG(temperature) FROM root.factory1.** GROUP BY TAGS(city); -``` - -The query calculates the average temperature over those timeseries that have the same tag value for the key `city`.
-The results are - -``` -+--------+------------------+ -| city| avg(temperature)| -+--------+------------------+ -| Beijing|104.04666697184244| -|Shanghai|107.85000076293946| -| NULL| 50.84999910990397| -+--------+------------------+ -Total line number = 3 -It costs 0.231s -``` - -From the results we can see that an aggregation query by tags differs from an aggregation query by time or level in the following ways: - -1. An aggregation query by tags no longer expands wildcards into individual raw timeseries; instead, it aggregates over the data of all timeseries that have the same tag value. -2. Besides the aggregate result column, the result set contains a key-value column for the grouped tag. The column name is the tag key, and the values in the column are the tag values present in the searched timeseries. - If some of the searched timeseries don't have the grouped tag, the key-value column of the grouped tag contains a `NULL` value, which represents the aggregation of all timeseries lacking that tag key. - -##### Aggregation query by multiple tags - -In addition to aggregation by one single tag, aggregation by multiple tags in a particular order is allowed as well. - -For example, a user wants to know the average temperature of the devices in each workshop. -As workshops in different cities may share the same name, it's not correct to aggregate by the tag `workshop` directly. -So the aggregation by the tag `city` should be done first, and then by the tag `workshop`.
 - -SQL - -```SQL -SELECT avg(temperature) FROM root.factory1.** GROUP BY TAGS(city, workshop); -``` - -The results - -``` -+--------+--------+------------------+ -| city|workshop| avg(temperature)| -+--------+--------+------------------+ -| NULL| NULL| 50.84999910990397| -|Shanghai| w1|113.01666768391927| -| Beijing| w2| 104.4000004359654| -|Shanghai| w2|100.10000038146973| -| Beijing| w1|103.73750019073486| -+--------+--------+------------------+ -Total line number = 5 -It costs 0.027s -``` - -We can see that in an aggregation query by multiple tags, the result set outputs the key-value columns of all the grouped tag keys, in the same order as they appear in `GROUP BY TAGS`. - -##### Downsampling Aggregation by tags based on Time Window - -Downsampling aggregation by time window is one of the most popular features in a time series database. IoTDB supports aggregation queries by tags based on time windows. - -For example, a user wants to know the average temperature of the devices in each workshop, for every 5 seconds, in the time range `[1000, 10000)`.
 - -SQL - -```SQL -SELECT avg(temperature) FROM root.factory1.** GROUP BY ([1000, 10000), 5s), TAGS(city, workshop); -``` - -The results - -``` -+-----------------------------+--------+--------+------------------+ -| Time| city|workshop| avg(temperature)| -+-----------------------------+--------+--------+------------------+ -|1970-01-01T08:00:01.000+08:00| NULL| NULL| 50.91999893188476| -|1970-01-01T08:00:01.000+08:00|Shanghai| w1|113.20000076293945| -|1970-01-01T08:00:01.000+08:00| Beijing| w2| 103.4| -|1970-01-01T08:00:01.000+08:00|Shanghai| w2| 100.1999994913737| -|1970-01-01T08:00:01.000+08:00| Beijing| w1|103.81666692097981| -|1970-01-01T08:00:06.000+08:00| NULL| NULL| 50.5| -|1970-01-01T08:00:06.000+08:00|Shanghai| w1| 112.6500015258789| -|1970-01-01T08:00:06.000+08:00| Beijing| w2| 106.9000015258789| -|1970-01-01T08:00:06.000+08:00|Shanghai| w2| 99.80000305175781| -|1970-01-01T08:00:06.000+08:00| Beijing| w1| 103.5| -+-----------------------------+--------+--------+------------------+ -``` - -Compared to pure tag aggregation, this kind of aggregation first divides the data according to the time window specification, and then aggregates by the multiple tags within each time window. -The result set also contains a time column, which has the same meaning as the time column in the result of a downsampling aggregation query by time window. - -##### Limitation of Aggregation by Tags - -As this feature is still under development, some queries are not supported yet and will be supported in the future. - -> 1. The `HAVING` clause is temporarily not supported for filtering the results. -> 2. Ordering by tag values is temporarily not supported. -> 3. `LIMIT`, `OFFSET`, `SLIMIT` and `SOFFSET` are temporarily not supported. -> 4. `ALIGN BY DEVICE` is temporarily not supported. -> 5. Expressions as aggregation function parameters, e.g. `count(s+1)`, are temporarily not supported. -> 6. Value filters are not supported, as is also the case for `GROUP BY LEVEL` queries. - -## 5.
`HAVING` CLAUSE - -If you want to filter the results of aggregate queries, -you can use the `HAVING` clause after the `GROUP BY` clause. - -> NOTE: -> -> 1.The expression in HAVING clause must consist of aggregate values; the original sequence cannot appear alone. -> The following usages are incorrect: -> -> ```sql -> select count(s1) from root.** group by ([1,3),1ms) having sum(s1) > s1; -> select count(s1) from root.** group by ([1,3),1ms) having s1 > 1; -> ``` -> -> 2.When filtering the `GROUP BY LEVEL` result, the PATH in `SELECT` and `HAVING` can only have one node. -> The following usages are incorrect: -> -> ```sql -> select count(s1) from root.** group by ([1,3),1ms), level=1 having sum(d1.s1) > 1; -> select count(d1.s1) from root.** group by ([1,3),1ms), level=1 having sum(s1) > 1; -> ``` - -Here are a few examples of using the 'HAVING' clause to filter aggregate results. - -Aggregation result 1: - -``` -+-----------------------------+---------------------+---------------------+ -| Time|count(root.test.*.s1)|count(root.test.*.s2)| -+-----------------------------+---------------------+---------------------+ -|1970-01-01T08:00:00.001+08:00| 4| 4| -|1970-01-01T08:00:00.003+08:00| 1| 0| -|1970-01-01T08:00:00.005+08:00| 2| 4| -|1970-01-01T08:00:00.007+08:00| 3| 2| -|1970-01-01T08:00:00.009+08:00| 4| 4| -+-----------------------------+---------------------+---------------------+ -``` - -Aggregation result filtering query 1: - -```sql - select count(s1) from root.** group by ([1,11),2ms), level=1 having count(s2) > 1; -``` - -Filtering result 1: - -``` -+-----------------------------+---------------------+ -| Time|count(root.test.*.s1)| -+-----------------------------+---------------------+ -|1970-01-01T08:00:00.001+08:00| 4| -|1970-01-01T08:00:00.005+08:00| 2| -|1970-01-01T08:00:00.009+08:00| 4| -+-----------------------------+---------------------+ -``` - -Aggregation result 2: - -``` -+-----------------------------+-------------+---------+---------+ -| Time| 
Device|count(s1)|count(s2)| -+-----------------------------+-------------+---------+---------+ -|1970-01-01T08:00:00.001+08:00|root.test.sg1| 1| 2| -|1970-01-01T08:00:00.003+08:00|root.test.sg1| 1| 0| -|1970-01-01T08:00:00.005+08:00|root.test.sg1| 1| 2| -|1970-01-01T08:00:00.007+08:00|root.test.sg1| 2| 1| -|1970-01-01T08:00:00.009+08:00|root.test.sg1| 2| 2| -|1970-01-01T08:00:00.001+08:00|root.test.sg2| 2| 2| -|1970-01-01T08:00:00.003+08:00|root.test.sg2| 0| 0| -|1970-01-01T08:00:00.005+08:00|root.test.sg2| 1| 2| -|1970-01-01T08:00:00.007+08:00|root.test.sg2| 1| 1| -|1970-01-01T08:00:00.009+08:00|root.test.sg2| 2| 2| -+-----------------------------+-------------+---------+---------+ -``` - -Aggregation result filtering query 2: - -```sql - select count(s1), count(s2) from root.** group by ([1,11),2ms) having count(s2) > 1 align by device; -``` - -Filtering result 2: - -``` -+-----------------------------+-------------+---------+---------+ -| Time| Device|count(s1)|count(s2)| -+-----------------------------+-------------+---------+---------+ -|1970-01-01T08:00:00.001+08:00|root.test.sg1| 1| 2| -|1970-01-01T08:00:00.005+08:00|root.test.sg1| 1| 2| -|1970-01-01T08:00:00.009+08:00|root.test.sg1| 2| 2| -|1970-01-01T08:00:00.001+08:00|root.test.sg2| 2| 2| -|1970-01-01T08:00:00.005+08:00|root.test.sg2| 1| 2| -|1970-01-01T08:00:00.009+08:00|root.test.sg2| 2| 2| -+-----------------------------+-------------+---------+---------+ -``` - -## 6. `FILL` CLAUSE - -### 6.1 Introduction - -When executing some queries, there may be no data for some columns in some rows, and data in these locations will be null, but this kind of null value is not conducive to data visualization and analysis, and the null value needs to be filled. - -In IoTDB, users can use the FILL clause to specify the fill mode when data is missing. 
Null filling lets the user fill the null values in any query result according to a specified method, such as taking the previous non-null value or linear interpolation. A query result with its null values filled can better reflect the data distribution, which helps users perform data analysis. - -### 6.2 Syntax Definition - -**The following is the syntax definition of the `FILL` clause:** - -```sql -FILL '(' PREVIOUS | LINEAR | constant ')'; -``` - -**Note:** - -- Only one fill method can be specified in the `FILL` clause, and this method applies to all columns of the result set. -- Null value fill is not compatible with version 0.13 and earlier; the previous syntax (`FILL((<data_type>[<fill_method>(, <before_range>, <after_range>)?])+)`) is no longer supported. - -### 6.3 Fill Methods - -**IoTDB supports the following three fill methods:** - -- `PREVIOUS`: Fill with the previous non-null value of the column. -- `LINEAR`: Fill the column with a linear interpolation of the previous non-null value and the next non-null value of the column. -- Constant: Fill with the specified constant. - -**The following table lists the data types and their supported fill methods.** - -| Data Type | Supported Fill Methods | -| :-------- | :------------------------- | -| boolean | previous, constant | -| int32 | previous, linear, constant | -| int64 | previous, linear, constant | -| float | previous, linear, constant | -| double | previous, linear, constant | -| text | previous, constant | - -**Note:** For columns whose data type does not support the specified fill method, IoTDB neither fills them nor throws an exception; the column is kept as it is.
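
The behavior of `PREVIOUS` and `LINEAR` can be sketched outside the database. The Python below is only an illustrative model of the semantics described in this section, not IoTDB's implementation; the function names `fill_previous` and `fill_linear` and the `max_gap` parameter are hypothetical. `max_gap` models the optional time interval of `FILL(PREVIOUS, interval)` covered under `PREVIOUS` Fill, and timestamps are plain integers in any consistent unit.

```python
from typing import List, Optional, Sequence

Value = Optional[float]


def fill_previous(times: Sequence[int], values: Sequence[Value],
                  max_gap: Optional[int] = None) -> List[Value]:
    """Model of FILL(PREVIOUS[, interval]) for a single column."""
    out: List[Value] = []
    last_time: Optional[int] = None
    last_value: Value = None
    for t, v in zip(times, values):
        if v is not None:
            # Remember the most recent non-null point.
            last_time, last_value = t, v
            out.append(v)
        elif last_time is not None and (max_gap is None or t - last_time <= max_gap):
            # A previous non-null value exists and is close enough in time.
            out.append(last_value)
        else:
            # Leading nulls, or the gap exceeds the interval: keep null.
            out.append(None)
    return out


def fill_linear(times: Sequence[int], values: Sequence[Value]) -> List[Value]:
    """Model of FILL(LINEAR) for a single numeric column."""
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    # Only nulls that lie between two non-null values are interpolated.
    for left, right in zip(known, known[1:]):
        span = times[right] - times[left]
        for i in range(left + 1, right):
            ratio = (times[i] - times[left]) / span
            out[i] = values[left] + (values[right] - values[left]) * ratio
    return out


# Sample values for illustration only (minutes as timestamps).
minutes = list(range(40, 50))
avg_s1 = [None, 1.0, None, None, None, None, 2.0, None, 3.0, None]
print(fill_previous(minutes, avg_s1, max_gap=2))
print(fill_linear([37, 38, 39, 40], [21.93, None, 22.23, 23.43]))
```

In this model a leading null stays null under both methods, and a trailing null stays null under linear fill because there is no later non-null value to interpolate towards.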
- -**For examples:** - -If we don't use any fill methods: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000; -``` - -the original result will be like: - -``` -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| null| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| null| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| null| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -#### `PREVIOUS` Fill - -**For null values in the query result set, fill with the previous non-null value of the column.** - -**Note:** If the first value of this column is null, we will keep first value as null and won't fill it until we meet first non-null value - -For example, with `PREVIOUS` fill, the SQL is as follows: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(previous); -``` - -result will be like: - -``` -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ 
-|2017-11-01T16:38:00.000+08:00| 21.93| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| false| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -**While using `FILL(PREVIOUS)`, you can specify a time interval. If the interval between the timestamp of the current null value and the timestamp of the previous non-null value exceeds the specified time interval, no filling will be performed.** - -> 1. In the case of FILL(LINEAR) and FILL(CONSTANT), if the second parameter is specified, an exception will be thrown -> 2. The interval parameter only supports integers - For example, the raw data looks like this: - -```sql -select s1 from root.db.d1 -``` -``` -+-----------------------------+-------------+ -| Time|root.db.d1.s1| -+-----------------------------+-------------+ -|2023-11-08T16:41:50.008+08:00| 1.0| -+-----------------------------+-------------+ -|2023-11-08T16:46:50.011+08:00| 2.0| -+-----------------------------+-------------+ -|2023-11-08T16:48:50.011+08:00| 3.0| -+-----------------------------+-------------+ -``` - -We want to group the data by 1 min time interval: - -```sql -select avg(s1) - from root.db.d1 - group by([2023-11-08T16:40:00.008+08:00, 2023-11-08T16:50:00.008+08:00), 1m) -``` -``` -+-----------------------------+------------------+ -| Time|avg(root.db.d1.s1)| -+-----------------------------+------------------+ -|2023-11-08T16:40:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:41:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:42:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:43:00.008+08:00| null| 
-+-----------------------------+------------------+ -|2023-11-08T16:44:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:45:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:46:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:47:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:48:00.008+08:00| 3.0| -+-----------------------------+------------------+ -|2023-11-08T16:49:00.008+08:00| null| -+-----------------------------+------------------+ -``` - -After grouping, we want to fill the null value: - -```sql -select avg(s1) - from root.db.d1 - group by([2023-11-08T16:40:00.008+08:00, 2023-11-08T16:50:00.008+08:00), 1m) - FILL(PREVIOUS); -``` -``` -+-----------------------------+------------------+ -| Time|avg(root.db.d1.s1)| -+-----------------------------+------------------+ -|2023-11-08T16:40:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:41:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:42:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:43:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:44:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:45:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:46:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:47:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:48:00.008+08:00| 3.0| -+-----------------------------+------------------+ -|2023-11-08T16:49:00.008+08:00| 3.0| -+-----------------------------+------------------+ -``` - -we also don't want the null value to be filled if it keeps null for 2 min. 
- -```sql -select avg(s1) -from root.db.d1 -group by([2023-11-08T16:40:00.008+08:00, 2023-11-08T16:50:00.008+08:00), 1m) - FILL(PREVIOUS, 2m); -``` -``` -+-----------------------------+------------------+ -| Time|avg(root.db.d1.s1)| -+-----------------------------+------------------+ -|2023-11-08T16:40:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:41:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:42:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:43:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:44:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:45:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:46:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:47:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:48:00.008+08:00| 3.0| -+-----------------------------+------------------+ -|2023-11-08T16:49:00.008+08:00| 3.0| -+-----------------------------+------------------+ -``` - -#### `LINEAR` Fill - -**For null values in the query result set, fill the column with a linear interpolation of the previous non-null value and the next non-null value of the column.** - -**Note:** - -- If all the values before current value are null or all the values after current value are null, we will keep current value as null and won't fill it. -- If the column's data type is boolean/text, we neither fill it nor throw exception, just keep it as it is. - -Here we give an example of filling null values using the linear method. 
The SQL statement is as follows: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(linear); -``` - -result will be like: - -``` -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| 22.08| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| null| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| null| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -#### Constant Fill - -**For null values in the query result set, fill with the specified constant.** - -**Note:** - -- When using constant fill, if the data type of a column is not compatible with the input constant, IoTDB neither fills that column nor throws an exception; the column is kept as it is. - - | Constant Value Data Type | Supported Data Types | - | :----------------------- | :-------------------------------------- | - | `BOOLEAN` | `BOOLEAN` `TEXT` | - | `INT64` | `INT32` `INT64` `FLOAT` `DOUBLE` `TEXT` | - | `DOUBLE` | `FLOAT` `DOUBLE` `TEXT` | - | `TEXT` | `TEXT` | - -- If the constant value is larger than Integer.MAX_VALUE and the data type of a column is int32, IoTDB neither fills that column nor throws an exception; the column is kept as it is.
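
The compatibility rules above can be read as a lookup table plus one overflow check. The following Python sketch models them under stated assumptions: the names `SUPPORTED`, `constant_type`, `can_fill` and `fill_constant` are hypothetical, Python types stand in for the constant's data type, and the constant is substituted unchanged (any conversion IoTDB performs, for example when a numeric constant fills a `TEXT` column, is not modeled).

```python
from typing import Any, List, Sequence

INT32_MAX = 2**31 - 1  # Integer.MAX_VALUE

# Constant data type -> column data types it may fill (from the table above).
SUPPORTED = {
    "BOOLEAN": {"BOOLEAN", "TEXT"},
    "INT64": {"INT32", "INT64", "FLOAT", "DOUBLE", "TEXT"},
    "DOUBLE": {"FLOAT", "DOUBLE", "TEXT"},
    "TEXT": {"TEXT"},
}


def constant_type(constant: Any) -> str:
    """Map a Python literal to the constant's data type (assumed mapping)."""
    if isinstance(constant, bool):  # must come before int: bool is an int subclass
        return "BOOLEAN"
    if isinstance(constant, int):
        return "INT64"
    if isinstance(constant, float):
        return "DOUBLE"
    if isinstance(constant, str):
        return "TEXT"
    raise TypeError(f"unsupported constant: {constant!r}")


def can_fill(constant: Any, column_type: str) -> bool:
    """Return True if a column of column_type would be filled by constant."""
    ctype = constant_type(constant)
    if column_type not in SUPPORTED[ctype]:
        return False
    # An integer constant above Integer.MAX_VALUE does not fill INT32 columns.
    if ctype == "INT64" and column_type == "INT32" and constant > INT32_MAX:
        return False
    return True


def fill_constant(values: Sequence[Any], column_type: str, constant: Any) -> List[Any]:
    """Replace nulls with constant, or leave the column untouched if incompatible."""
    if not can_fill(constant, column_type):
        return list(values)
    return [constant if v is None else v for v in values]


# Sample columns for illustration only.
temperature = [21.93, None, 22.23, 23.43]  # FLOAT column
status = [True, False, None, None]         # BOOLEAN column
print(fill_constant(temperature, "FLOAT", 2.0))
print(fill_constant(status, "BOOLEAN", 2.0))
```

With a floating-point constant, only the numeric column changes; the boolean column is returned as it was, mirroring the "neither fills nor throws" rule.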
- -For example, with `FLOAT` constant fill, the SQL is as follows: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(2.0); -``` - -result will be like: - -``` -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| 2.0| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| null| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| null| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -For example, with `BOOLEAN` constant fill, the SQL is as follows: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(true); -``` - -result will be like: - -``` -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| null| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| true| 
-+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| true| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -## 7. `LIMIT` and `SLIMIT` CLAUSES (PAGINATION) - -When the query result set has a large amount of data, it is not conducive to display on one page. You can use the `LIMIT/SLIMIT` clause and the `OFFSET/SOFFSET` clause to control paging. - -- The `LIMIT` and `SLIMIT` clauses are used to control the number of rows and columns of query results. -- The `OFFSET` and `SOFFSET` clauses are used to control the starting position of the result display. - -### 7.1 Row Control over Query Results - -By using LIMIT and OFFSET clauses, users control the query results in a row-related manner. We demonstrate how to use LIMIT and OFFSET clauses through the following examples. - -* Example 1: basic LIMIT clause - -The SQL statement is: - -```sql -select status, temperature from root.ln.wf01.wt01 limit 10 -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is "status" and "temperature". The SQL statement requires the first 10 rows of the query result. 
- -The result is shown below: - -``` -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:00:00.000+08:00| true| 25.96| -|2017-11-01T00:01:00.000+08:00| true| 24.36| -|2017-11-01T00:02:00.000+08:00| false| 20.09| -|2017-11-01T00:03:00.000+08:00| false| 20.18| -|2017-11-01T00:04:00.000+08:00| false| 21.13| -|2017-11-01T00:05:00.000+08:00| false| 22.72| -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -|2017-11-01T00:08:00.000+08:00| false| 22.58| -|2017-11-01T00:09:00.000+08:00| false| 20.98| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 10 -It costs 0.000s -``` - -* Example 2: LIMIT clause with OFFSET - -The SQL statement is: - -```sql -select status, temperature from root.ln.wf01.wt01 limit 5 offset 3 -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is "status" and "temperature". The SQL statement requires rows 3 to 7 of the query result be returned (with the first row numbered as row 0). 
 - -The result is shown below: - -``` -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:03:00.000+08:00| false| 20.18| -|2017-11-01T00:04:00.000+08:00| false| 21.13| -|2017-11-01T00:05:00.000+08:00| false| 22.72| -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 5 -It costs 0.342s -``` - -* Example 3: LIMIT clause combined with WHERE clause - -The SQL statement is: - -```sql -select status,temperature from root.ln.wf01.wt01 where time > 2024-07-07T00:05:00.000 and time < 2024-07-12T00:12:00.000 limit 5 offset 3 -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is "status" and "temperature". The SQL statement requires rows 3 to 7 of the status and temperature sensor values between the time points "2024-07-07T00:05:00.000" and "2024-07-12T00:12:00.000" be returned (with the first row numbered as row 0).
- -The result is shown below: - -``` -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2024-07-09T17:32:11.943+08:00| true| 24.941973| -|2024-07-09T17:32:12.944+08:00| true| 20.05108| -|2024-07-09T17:32:13.945+08:00| true| 20.541632| -|2024-07-09T17:32:14.945+08:00| null| 23.09016| -|2024-07-09T17:32:14.946+08:00| true| null| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 5 -It costs 0.070s -``` - -* Example 4: LIMIT clause combined with GROUP BY clause - -The SQL statement is: - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) limit 5 offset 3 -``` - -which means: - -The SQL statement clause requires rows 3 to 7 of the query result be returned (with the first row numbered as row 0). - -The result is shown below: - -``` -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-04T00:00:00.000+08:00| 1440| 26.0| -|2017-11-05T00:00:00.000+08:00| 1440| 26.0| -|2017-11-06T00:00:00.000+08:00| 1440| 25.99| -|2017-11-07T00:00:00.000+08:00| 1380| 26.0| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 4 -It costs 0.016s -``` - -### 7.2 Column Control over Query Results - -By using SLIMIT and SOFFSET clauses, users can control the query results in a column-related manner. We will demonstrate how to use SLIMIT and SOFFSET clauses through the following examples. 
- -* Example 1: basic SLIMIT clause - -The SQL statement is: - -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1 -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is the first column under this device, i.e., the power supply status. The SQL statement requires the status sensor values between the time point of "2017-11-01T00:05:00.000" and "2017-11-01T00:12:00.000" be selected. - -The result is shown below: - -``` -+-----------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.temperature| -+-----------------------------+-----------------------------+ -|2017-11-01T00:06:00.000+08:00| 20.71| -|2017-11-01T00:07:00.000+08:00| 21.45| -|2017-11-01T00:08:00.000+08:00| 22.58| -|2017-11-01T00:09:00.000+08:00| 20.98| -|2017-11-01T00:10:00.000+08:00| 25.52| -|2017-11-01T00:11:00.000+08:00| 22.91| -+-----------------------------+-----------------------------+ -Total line number = 6 -It costs 0.000s -``` - -* Example 2: SLIMIT clause with SOFFSET - -The SQL statement is: - -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1 soffset 1 -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is the second column under this device, i.e., the temperature. The SQL statement requires the temperature sensor values between the time point of "2017-11-01T00:05:00.000" and "2017-11-01T00:12:00.000" be selected. 
- -The result is shown below: - -``` -+-----------------------------+------------------------+ -| Time|root.ln.wf01.wt01.status| -+-----------------------------+------------------------+ -|2017-11-01T00:06:00.000+08:00| false| -|2017-11-01T00:07:00.000+08:00| false| -|2017-11-01T00:08:00.000+08:00| false| -|2017-11-01T00:09:00.000+08:00| false| -|2017-11-01T00:10:00.000+08:00| true| -|2017-11-01T00:11:00.000+08:00| false| -+-----------------------------+------------------------+ -Total line number = 6 -It costs 0.003s -``` - -* Example 3: SLIMIT clause combined with GROUP BY clause - -The SQL statement is: - -```sql -select max_value(*) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) slimit 1 soffset 1 -``` - -The result is shown below: - -``` -+-----------------------------+-----------------------------------+ -| Time|max_value(root.ln.wf01.wt01.status)| -+-----------------------------+-----------------------------------+ -|2017-11-01T00:00:00.000+08:00| true| -|2017-11-02T00:00:00.000+08:00| true| -|2017-11-03T00:00:00.000+08:00| true| -|2017-11-04T00:00:00.000+08:00| true| -|2017-11-05T00:00:00.000+08:00| true| -|2017-11-06T00:00:00.000+08:00| true| -|2017-11-07T00:00:00.000+08:00| true| -+-----------------------------+-----------------------------------+ -Total line number = 7 -It costs 0.000s -``` - -### 7.3 Row and Column Control over Query Results - -In addition to row or column control over query results, IoTDB allows users to control both rows and columns of query results. Here is a complete example with both LIMIT clauses and SLIMIT clauses. - -The SQL statement is: - -```sql -select * from root.ln.wf01.wt01 limit 10 offset 100 slimit 2 soffset 0 -``` - -which means: - -The selected device is ln group wf01 plant wt01 device; the selected timeseries is columns 0 to 1 under this device (with the first column numbered as column 0). 
The SQL statement clause requires rows 100 to 109 of the query result be returned (with the first row numbered as row 0). - -The result is shown below: - -``` -+-----------------------------+-----------------------------+------------------------+ -| Time|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status| -+-----------------------------+-----------------------------+------------------------+ -|2017-11-01T01:40:00.000+08:00| 21.19| false| -|2017-11-01T01:41:00.000+08:00| 22.79| false| -|2017-11-01T01:42:00.000+08:00| 22.98| false| -|2017-11-01T01:43:00.000+08:00| 21.52| false| -|2017-11-01T01:44:00.000+08:00| 23.45| true| -|2017-11-01T01:45:00.000+08:00| 24.06| true| -|2017-11-01T01:46:00.000+08:00| 22.6| false| -|2017-11-01T01:47:00.000+08:00| 23.78| true| -|2017-11-01T01:48:00.000+08:00| 24.72| true| -|2017-11-01T01:49:00.000+08:00| 24.68| true| -+-----------------------------+-----------------------------+------------------------+ -Total line number = 10 -It costs 0.009s -``` - -### 7.4 Error Handling - -If the parameter N/SN of LIMIT/SLIMIT exceeds the size of the result set, IoTDB returns all the results as expected. 
For example, the query result of the original SQL statement consists of six rows, and we select the first 100 rows through the LIMIT clause: - -```sql -select status,temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 limit 100 -``` - -The result is shown below: - -``` -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -|2017-11-01T00:08:00.000+08:00| false| 22.58| -|2017-11-01T00:09:00.000+08:00| false| 20.98| -|2017-11-01T00:10:00.000+08:00| true| 25.52| -|2017-11-01T00:11:00.000+08:00| false| 22.91| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 6 -It costs 0.005s -``` - -If the parameter N/SN of the LIMIT/SLIMIT clause exceeds the allowable maximum value (N/SN is of type int64), the system reports an error. For example, executing the following SQL statement: - -```sql -select status,temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 limit 9223372036854775808; -``` - -The SQL statement will not be executed and the corresponding error prompt is given as follows: - -``` -Msg: 416: Out of range. LIMIT : N should be Int64. -``` - -If the parameter N/SN of the LIMIT/SLIMIT clause is not a positive integer, the system reports an error. For example, executing the following SQL statement: - -```sql -select status,temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 limit 13.1; -``` - -The SQL statement will not be executed and the corresponding error prompt is given as follows: - -``` -Msg: 401: line 1:129 mismatched input '.'
expecting {, ';'} -``` - -If the parameter OFFSET of LIMIT clause exceeds the size of the result set, IoTDB will return an empty result set. For example, executing the following SQL statement: - -```sql -select status,temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 limit 2 offset 6; -``` - -The result is shown below: - -``` -+----+------------------------+-----------------------------+ -|Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+----+------------------------+-----------------------------+ -+----+------------------------+-----------------------------+ -Empty set. -It costs 0.005s -``` - -If the parameter SOFFSET of SLIMIT clause is not smaller than the number of available timeseries, the system prompts errors. For example, executing the following SQL statement: - -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1 soffset 2; -``` - -The SQL statement will not be executed and the corresponding error prompt is given as follows: - -``` -Msg: 411: Meet error in query process: The value of SOFFSET (2) is equal to or exceeds the number of sequences (2) that can actually be returned. -``` - -## 8. `ORDER BY` CLAUSE - -### 8.1 Order by in ALIGN BY TIME mode - -The result set of IoTDB is in ALIGN BY TIME mode by default and `ORDER BY TIME` clause can also be used to specify the ordering of timestamp. 
The SQL statement is: - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time desc; -``` - -Results: - -``` -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -| Time|root.ln.wf02.wt02.hardware|root.ln.wf02.wt02.status|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -|2017-11-01T00:01:00.000+08:00| v2| true| 24.36| true| -|2017-11-01T00:00:00.000+08:00| v2| true| 25.96| true| -|1970-01-01T08:00:00.002+08:00| v2| false| null| null| -|1970-01-01T08:00:00.001+08:00| v1| true| null| null| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -``` - -### 8.2 Order by in ALIGN BY DEVICE mode - -When querying in ALIGN BY DEVICE mode, `ORDER BY` clause can be used to specify the ordering of result set. - -ALIGN BY DEVICE mode supports four kinds of clauses with two sort keys which are `Device` and `Time`. - -1. ``ORDER BY DEVICE``: sort by the alphabetical order of the device name. The devices with the same column names will be clustered in a group view. - -2. ``ORDER BY TIME``: sort by the timestamp, the data points from different devices will be shuffled according to the timestamp. - -3. ``ORDER BY DEVICE,TIME``: sort by the alphabetical order of the device name. The data points with the same device name will be sorted by timestamp. - -4. ``ORDER BY TIME,DEVICE``: sort by timestamp. The data points with the same time will be sorted by the alphabetical order of the device name. - -> To make the result set more legible, when `ORDER BY` clause is not used, default settings will be provided. -> The default ordering clause is `ORDER BY DEVICE,TIME` and the default ordering is `ASC`. 
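
The four ordering variants listed above amount to a multi-key sort over `Device` and `Time`. The sketch below models that in Python under stated assumptions: the helper `order_by` and the dictionary rows are hypothetical, timestamps are kept as ISO strings so that string order matches time order, and nothing here reflects how IoTDB implements sorting internally.

```python
def order_by(rows, keys):
    """Sort rows by (field, descending) pairs, highest-priority key first."""
    result = list(rows)
    # Python's sort is stable, so sorting from the lowest-priority key up
    # to the highest-priority key produces the combined multi-key order.
    for field, descending in reversed(keys):
        result.sort(key=lambda row: row[field], reverse=descending)
    return result


rows = [
    {"time": "2017-11-01T00:00:00.000", "device": "root.ln.wf01.wt01"},
    {"time": "2017-11-01T00:01:00.000", "device": "root.ln.wf01.wt01"},
    {"time": "1970-01-01T08:00:00.001", "device": "root.ln.wf02.wt02"},
    {"time": "1970-01-01T08:00:00.002", "device": "root.ln.wf02.wt02"},
    {"time": "2017-11-01T00:00:00.000", "device": "root.ln.wf02.wt02"},
    {"time": "2017-11-01T00:01:00.000", "device": "root.ln.wf02.wt02"},
]

# ORDER BY DEVICE DESC, TIME ASC: device is the main sort key.
by_device_desc = order_by(rows, [("device", True), ("time", False)])

# ORDER BY TIME ASC, DEVICE DESC: time is the main sort key.
by_time = order_by(rows, [("time", False), ("device", True)])

# Default when no ORDER BY is given: ORDER BY DEVICE ASC, TIME ASC.
default = order_by(rows, [("device", False), ("time", False)])
```

Swapping the order of the two pairs is all that distinguishes `ORDER BY DEVICE,TIME` from `ORDER BY TIME,DEVICE` in this model.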
- -When `Device` is the main sort key, the result set is sorted by device name first, then by timestamp in the group with the same device name, the SQL statement is: - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by device desc,time asc align by device; -``` - -The result shows below: - -``` -+-----------------------------+-----------------+--------+------+-----------+ -| Time| Device|hardware|status|temperature| -+-----------------------------+-----------------+--------+------+-----------+ -|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02| v1| true| null| -|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02| v2| false| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01| null| true| 25.96| -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| true| 24.36| -+-----------------------------+-----------------+--------+------+-----------+ -``` - -When `Time` is the main sort key, the result set is sorted by timestamp first, then by device name in data points with the same timestamp. 
The SQL statement is: - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time asc,device desc align by device; -``` - -The result shows below: - -``` -+-----------------------------+-----------------+--------+------+-----------+ -| Time| Device|hardware|status|temperature| -+-----------------------------+-----------------+--------+------+-----------+ -|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02| v1| true| null| -|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02| v2| false| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01| null| true| 25.96| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| true| 24.36| -+-----------------------------+-----------------+--------+------+-----------+ -``` - -When `ORDER BY` clause is not used, sort in default way, the SQL statement is: - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -``` - -The result below indicates `ORDER BY DEVICE ASC,TIME ASC` is the clause in default situation. -`ASC` can be omitted because it's the default ordering. 
- -``` -+-----------------------------+-----------------+--------+------+-----------+ -| Time| Device|hardware|status|temperature| -+-----------------------------+-----------------+--------+------+-----------+ -|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01| null| true| 25.96| -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| true| 24.36| -|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02| v1| true| null| -|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02| v2| false| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -+-----------------------------+-----------------+--------+------+-----------+ -``` - -Besides,`ALIGN BY DEVICE` and `ORDER BY` clauses can be used with aggregate query,the SQL statement is: - -```sql -select count(*) from root.ln.** group by ((2017-11-01T00:00:00.000+08:00,2017-11-01T00:03:00.000+08:00],1m) order by device asc,time asc align by device; -``` - -The result shows below: - -``` -+-----------------------------+-----------------+---------------+-------------+------------------+ -| Time| Device|count(hardware)|count(status)|count(temperature)| -+-----------------------------+-----------------+---------------+-------------+------------------+ -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| 1| 1| -|2017-11-01T00:02:00.000+08:00|root.ln.wf01.wt01| null| 0| 0| -|2017-11-01T00:03:00.000+08:00|root.ln.wf01.wt01| null| 0| 0| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| 1| 1| null| -|2017-11-01T00:02:00.000+08:00|root.ln.wf02.wt02| 0| 0| null| -|2017-11-01T00:03:00.000+08:00|root.ln.wf02.wt02| 0| 0| null| -+-----------------------------+-----------------+---------------+-------------+------------------+ -``` - -### 8.3 Order by arbitrary expressions - -In addition to the predefined keywords "Time" and "Device" in IoTDB, `ORDER BY` can also be used to sort by any expressions. 
- -When sorting, `ASC` or `DESC` can be used to specify the sorting order, and `NULLS` syntax is supported to specify the priority of NULL values in the sorting. `NULLS FIRST` places NULL values at the top of the result, and `NULLS LAST` places them at the end. If neither is specified in the clause, the default order is ASC with NULLS LAST. - -Here are several examples of queries for sorting arbitrary expressions using the following data: - -``` -+-----------------------------+-------------+-------+-------+--------+-------+ -| Time| Device| base| score| bonus| total| -+-----------------------------+-------------+-------+-------+--------+-------+ -|1970-01-01T08:00:00.000+08:00| root.one| 12| 50.0| 45.0| 107.0| -|1970-01-02T08:00:00.000+08:00| root.one| 10| 50.0| 45.0| 105.0| -|1970-01-03T08:00:00.000+08:00| root.one| 8| 50.0| 45.0| 103.0| -|1970-01-01T08:00:00.010+08:00| root.two| 9| 50.0| 15.0| 74.0| -|1970-01-01T08:00:00.020+08:00| root.two| 8| 10.0| 15.0| 33.0| -|1970-01-01T08:00:00.010+08:00| root.three| 9| null| 24.0| 33.0| -|1970-01-01T08:00:00.020+08:00| root.three| 8| null| 22.5| 30.5| -|1970-01-01T08:00:00.030+08:00| root.three| 7| null| 23.5| 30.5| -|1970-01-01T08:00:00.010+08:00| root.four| 9| 32.0| 45.0| 86.0| -|1970-01-01T08:00:00.020+08:00| root.four| 8| 32.0| 45.0| 85.0| -|1970-01-01T08:00:00.030+08:00| root.five| 7| 53.0| 44.0| 104.0| -|1970-01-01T08:00:00.040+08:00| root.five| 6| 54.0| 42.0| 102.0| -+-----------------------------+-------------+-------+-------+--------+-------+ -``` - -When you need to sort the results based on the score, you can use the following SQL: - -```Sql -select score from root.** order by score desc align by device; -``` - -This will give you the following results: - -``` -+-----------------------------+---------+-----+ -| Time| Device|score| -+-----------------------------+---------+-----+ -|1970-01-01T08:00:00.040+08:00|root.five| 54.0| 
-|1970-01-01T08:00:00.030+08:00|root.five| 53.0| -|1970-01-01T08:00:00.000+08:00| root.one| 50.0| -|1970-01-02T08:00:00.000+08:00| root.one| 50.0| -|1970-01-03T08:00:00.000+08:00| root.one| 50.0| -|1970-01-01T08:00:00.000+08:00| root.two| 50.0| -|1970-01-01T08:00:00.010+08:00| root.two| 50.0| -|1970-01-01T08:00:00.010+08:00|root.four| 32.0| -|1970-01-01T08:00:00.020+08:00|root.four| 32.0| -|1970-01-01T08:00:00.020+08:00| root.two| 10.0| -+-----------------------------+---------+-----+ -``` - -If you want to sort the results based on the total score, you can use an expression in the `ORDER BY` clause to perform the calculation: - -```Sql -select score,total from root.one order by base+score+bonus desc -``` - -This SQL is equivalent to: - -```Sql -select score,total from root.one order by total desc -``` - -Here are the results: - -``` -+-----------------------------+--------------+--------------+ -| Time|root.one.score|root.one.total| -+-----------------------------+--------------+--------------+ -|1970-01-01T08:00:00.000+08:00| 50.0| 107.0| -|1970-01-02T08:00:00.000+08:00| 50.0| 105.0| -|1970-01-03T08:00:00.000+08:00| 50.0| 103.0| -+-----------------------------+--------------+--------------+ -``` - -If you want to sort the results based on the total score and, in case of tied scores, sort by score, base, bonus, and submission time in descending order, you can specify multiple layers of sorting using multiple expressions: - -```Sql -select base, score, bonus, total from root.** order by total desc NULLS Last, - score desc NULLS Last, - bonus desc NULLS Last, - time desc align by device; -``` - -Here are the results: - -``` -+-----------------------------+----------+----+-----+-----+-----+ -| Time| Device|base|score|bonus|total| -+-----------------------------+----------+----+-----+-----+-----+ -|1970-01-01T08:00:00.000+08:00| root.one| 12| 50.0| 45.0|107.0| -|1970-01-02T08:00:00.000+08:00| root.one| 10| 50.0| 45.0|105.0| -|1970-01-01T08:00:00.030+08:00| root.five| 
7| 53.0| 44.0|104.0| -|1970-01-03T08:00:00.000+08:00| root.one| 8| 50.0| 45.0|103.0| -|1970-01-01T08:00:00.040+08:00| root.five| 6| 54.0| 42.0|102.0| -|1970-01-01T08:00:00.010+08:00| root.four| 9| 32.0| 45.0| 86.0| -|1970-01-01T08:00:00.020+08:00| root.four| 8| 32.0| 45.0| 85.0| -|1970-01-01T08:00:00.010+08:00| root.two| 9| 50.0| 15.0| 74.0| -|1970-01-01T08:00:00.000+08:00| root.two| 9| 50.0| 15.0| 74.0| -|1970-01-01T08:00:00.020+08:00| root.two| 8| 10.0| 15.0| 33.0| -|1970-01-01T08:00:00.010+08:00|root.three| 9| null| 24.0| 33.0| -|1970-01-01T08:00:00.030+08:00|root.three| 7| null| 23.5| 30.5| -|1970-01-01T08:00:00.020+08:00|root.three| 8| null| 22.5| 30.5| -+-----------------------------+----------+----+-----+-----+-----+ -``` - -In the `ORDER BY` clause, you can also use aggregate query expressions. For example: - -```Sql -select min_value(total) from root.** order by min_value(total) asc align by device; -``` - -This will give you the following results: - -``` -+----------+----------------+ -| Device|min_value(total)| -+----------+----------------+ -|root.three| 30.5| -| root.two| 33.0| -| root.four| 85.0| -| root.five| 102.0| -| root.one| 103.0| -+----------+----------------+ -``` - -When specifying multiple columns in the query, the unsorted columns will change order along with the rows and sorted columns. The order of rows when the sorting columns are the same may vary depending on the specific implementation (no fixed order). 
For example: - -```Sql -select min_value(total),max_value(base) from root.** order by max_value(total) desc align by device; -``` - -This will give you the following results: - -``` -+----------+----------------+---------------+ -| Device|min_value(total)|max_value(base)| -+----------+----------------+---------------+ -| root.one| 103.0| 12| -| root.five| 102.0| 7| -| root.four| 85.0| 9| -| root.two| 33.0| 9| -|root.three| 30.5| 9| -+----------+----------------+---------------+ -``` - -You can use both `ORDER BY DEVICE,TIME` and `ORDER BY EXPRESSION` together. For example: - -```Sql -select score from root.** order by device asc, score desc, time asc align by device; -``` - -This will give you the following results: - -``` -+-----------------------------+---------+-----+ -| Time| Device|score| -+-----------------------------+---------+-----+ -|1970-01-01T08:00:00.040+08:00|root.five| 54.0| -|1970-01-01T08:00:00.030+08:00|root.five| 53.0| -|1970-01-01T08:00:00.010+08:00|root.four| 32.0| -|1970-01-01T08:00:00.020+08:00|root.four| 32.0| -|1970-01-01T08:00:00.000+08:00| root.one| 50.0| -|1970-01-02T08:00:00.000+08:00| root.one| 50.0| -|1970-01-03T08:00:00.000+08:00| root.one| 50.0| -|1970-01-01T08:00:00.000+08:00| root.two| 50.0| -|1970-01-01T08:00:00.010+08:00| root.two| 50.0| -|1970-01-01T08:00:00.020+08:00| root.two| 10.0| -+-----------------------------+---------+-----+ -``` - -## 9. `ALIGN BY` CLAUSE - -In addition, IoTDB supports another result set format: `ALIGN BY DEVICE`. - -### 9.1 Align by Device - -`ALIGN BY DEVICE` indicates that the deviceId is treated as a column. Therefore, the number of columns in the result set is limited. - -> NOTE: -> -> 1. You can view the result of `align by device` as one relational table whose primary key is `Time + Device`. -> -> 2. The result is ordered by `Device` first, and then by `Time`. 
- -The SQL statement is: - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -``` - -The result shows below: - -``` -+-----------------------------+-----------------+-----------+------+--------+ -| Time| Device|temperature|status|hardware| -+-----------------------------+-----------------+-----------+------+--------+ -|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01| 25.96| true| null| -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| 24.36| true| null| -|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02| null| true| v1| -|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02| null| false| v2| -|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02| null| true| v2| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| null| true| v2| -+-----------------------------+-----------------+-----------+------+--------+ -Total line number = 6 -It costs 0.012s -``` - -### 9.2 Ordering in ALIGN BY DEVICE - -ALIGN BY DEVICE mode arranges according to the device first, and sort each device in ascending order according to the timestamp. The ordering and priority can be adjusted through `ORDER BY` clause. - -## 10. `INTO` CLAUSE (QUERY WRITE-BACK) - -The `SELECT INTO` statement copies data from query result set into target time series. - -The application scenarios are as follows: - -- **Implement IoTDB internal ETL**: ETL the original data and write a new time series. -- **Query result storage**: Persistently store the query results, which acts like a materialized view. -- **Non-aligned time series to aligned time series**: Rewrite non-aligned time series into another aligned time series. - -### 10.1 SQL Syntax - -#### Syntax Definition - -**The following is the syntax definition of the `select` statement:** - -```sql -selectIntoStatement -: SELECT - resultColumn [, resultColumn] ... - INTO intoItem [, intoItem] ... - FROM prefixPath [, prefixPath] ... 
- [WHERE whereCondition] - [GROUP BY groupByTimeClause, groupByLevelClause] - [FILL {PREVIOUS | LINEAR | constant}] - [LIMIT rowLimit OFFSET rowOffset] - [ALIGN BY DEVICE] -; - -intoItem -: [ALIGNED] intoDevicePath '(' intoMeasurementName [',' intoMeasurementName]* ')' - ; -``` - -#### `INTO` Clause - -The `INTO` clause consists of one or more `intoItem`. - -Each `intoItem` consists of a target device and a list of target measurements (similar to the `INTO` clause in an `INSERT` statement). - -Each target measurement and device form a target time series, and an `intoItem` contains a series of time series. For example: `root.sg_copy.d1(s1, s2)` specifies two target time series `root.sg_copy.d1.s1` and `root.sg_copy.d1.s2`. - -The target time series specified by the `INTO` clause must correspond one-to-one with the columns of the query result set. The specific rules are as follows: - -- **Align by time** (default): The number of target time series contained in all `intoItem` must be consistent with the number of columns in the query result set (except the time column) and correspond one-to-one in the order from left to right in the header. -- **Align by device** (using `ALIGN BY DEVICE`): the number of target devices specified in all `intoItem` must be the same as the number of devices queried (i.e., the number of devices matched by the path pattern in the `FROM` clause), and the target devices correspond one-to-one to the queried devices in the output order of the devices in the result set. -
The number of measurements specified for each target device should be consistent with the number of columns in the query result set (except for the time and device columns). It should be in one-to-one correspondence from left to right in the header. - -For examples: - -- **Example 1** (aligned by time) - -```sql -select s1, s2 into root.sg_copy.d1(t1), root.sg_copy.d2(t1, t2), root.sg_copy.d1(t2) from root.sg.d1, root.sg.d2; -``` -``` -+--------------+-------------------+--------+ -| source column| target timeseries| written| -+--------------+-------------------+--------+ -| root.sg.d1.s1| root.sg_copy.d1.t1| 8000| -+--------------+-------------------+--------+ -| root.sg.d2.s1| root.sg_copy.d2.t1| 10000| -+--------------+-------------------+--------+ -| root.sg.d1.s2| root.sg_copy.d2.t2| 12000| -+--------------+-------------------+--------+ -| root.sg.d2.s2| root.sg_copy.d1.t2| 10000| -+--------------+-------------------+--------+ -Total line number = 4 -It costs 0.725s -``` - -This statement writes the query results of the four time series under the `root.sg` database to the four specified time series under the `root.sg_copy` database. Note that `root.sg_copy.d2(t1, t2)` can also be written as `root.sg_copy.d2(t1), root.sg_copy.d2(t2)`. - -We can see that the writing of the `INTO` clause is very flexible as long as the combined target time series is not repeated and corresponds to the query result column one-to-one. - -> In the result set displayed by `CLI`, the meaning of each column is as follows: -> -> - The `source column` column represents the column name of the query result. -> - `target timeseries` represents the target time series for the corresponding column to write. -> - `written` indicates the amount of data expected to be written. 
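
The one-to-one rule for the align-by-time case can be sketched as flattening the `intoItem` list and pairing it with the result columns from left to right. This is a minimal Python model, not IoTDB code: the function `map_targets` and the tuple representation of an `intoItem` are assumptions for illustration, and the source column order is copied from the result table of Example 1 rather than derived.

```python
def map_targets(source_columns, into_items):
    """Pair each query result column with one target time series, left to right.

    into_items is a list of (target_device, [target_measurement, ...]) tuples.
    """
    targets = [
        f"{device}.{measurement}"
        for device, measurements in into_items
        for measurement in measurements
    ]
    if len(targets) != len(source_columns):
        raise ValueError("number of target time series must match the result columns")
    if len(set(targets)) != len(targets):
        raise ValueError("target time series must not be repeated")
    return dict(zip(source_columns, targets))


# Column order as shown in the result table of Example 1.
columns = ["root.sg.d1.s1", "root.sg.d2.s1", "root.sg.d1.s2", "root.sg.d2.s2"]
items = [
    ("root.sg_copy.d1", ["t1"]),
    ("root.sg_copy.d2", ["t1", "t2"]),
    ("root.sg_copy.d1", ["t2"]),
]
mapping = map_targets(columns, items)
for source, target in mapping.items():
    print(source, "->", target)
```

Splitting `root.sg_copy.d2(t1, t2)` into two single-measurement items produces the same flattened list, which is why both spellings are interchangeable.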
- - -- **Example 2** (aligned by time) - -```sql -select count(s1 + s2), last_value(s2) into root.agg.count(s1_add_s2), root.agg.last_value(s2) from root.sg.d1 group by ([0, 100), 10ms); -``` -``` -+--------------------------------------+-------------------------+--------+ -| source column| target timeseries| written| -+--------------------------------------+-------------------------+--------+ -| count(root.sg.d1.s1 + root.sg.d1.s2)| root.agg.count.s1_add_s2| 10| -+--------------------------------------+-------------------------+--------+ -| last_value(root.sg.d1.s2)| root.agg.last_value.s2| 10| -+--------------------------------------+-------------------------+--------+ -Total line number = 2 -It costs 0.375s -``` - -This statement stores the results of an aggregated query into the specified time series. - -- **Example 3** (aligned by device) - -```sql -select s1, s2 into root.sg_copy.d1(t1, t2), root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -``` -``` -+--------------+--------------+-------------------+--------+ -| source device| source column| target timeseries| written| -+--------------+--------------+-------------------+--------+ -| root.sg.d1| s1| root.sg_copy.d1.t1| 8000| -+--------------+--------------+-------------------+--------+ -| root.sg.d1| s2| root.sg_copy.d1.t2| 11000| -+--------------+--------------+-------------------+--------+ -| root.sg.d2| s1| root.sg_copy.d2.t1| 12000| -+--------------+--------------+-------------------+--------+ -| root.sg.d2| s2| root.sg_copy.d2.t2| 9000| -+--------------+--------------+-------------------+--------+ -Total line number = 4 -It costs 0.625s -``` - -This statement also writes the query results of the four time series under the `root.sg` database to the four specified time series under the `root.sg_copy` database. However, in ALIGN BY DEVICE, the number of `intoItem` must be the same as the number of queried devices, and each queried device corresponds to one `intoItem`. 
- -> When aligning the query by device, the result set displayed by `CLI` has one more column, the `source device` column indicating the queried device. - -- **Example 4** (aligned by device) - -```sql -select s1 + s2 into root.expr.add(d1s1_d1s2), root.expr.add(d2s1_d2s2) from root.sg.d1, root.sg.d2 align by device; -``` -``` -+--------------+--------------+------------------------+--------+ -| source device| source column| target timeseries| written| -+--------------+--------------+------------------------+--------+ -| root.sg.d1| s1 + s2| root.expr.add.d1s1_d1s2| 10000| -+--------------+--------------+------------------------+--------+ -| root.sg.d2| s1 + s2| root.expr.add.d2s1_d2s2| 10000| -+--------------+--------------+------------------------+--------+ -Total line number = 2 -It costs 0.532s -``` - -This statement stores the result of evaluating an expression into the specified time series. - -#### Using variable placeholders - -In particular, we can use variable placeholders to describe the correspondence between the target and query time series, simplifying the statement. The following two variable placeholders are currently supported: - -- Suffix duplication character `::`: Copies the suffix (or measurement) of the query device. From this level to the last level of the device (or for the measurement), the node names (or measurement) of the target device are the same as those of the queried device. -- Single-level node matcher `${i}`: Indicates that the current level node name of the target sequence is the same as the i-th level node name of the query sequence. For example, for the path `root.sg1.d1.s1`, `${1}` means `sg1`, `${2}` means `d1`, and `${3}` means `s1`. - -When using variable placeholders, there must be no ambiguity in the correspondence between `intoItem` and the columns of the query result set. 
The specific cases are classified as follows: - -##### ALIGN BY TIME (default) - -> Note: The variable placeholder **can only describe the correspondence between time series**. If the query includes aggregation or expression calculation, the columns in the query result do not correspond to a time series, so neither the target device nor the target measurement can use variable placeholders. - -###### (1) The target device does not use variable placeholders & the target measurement list uses variable placeholders - -**Limitations:** - -1. In each `intoItem`, the length of the target measurement list must be 1.
(If the length were allowed to exceed 1, e.g. `root.sg1.d1(::, s1)`, it would not be possible to determine which columns match `::`.) -2. The number of `intoItem` is 1, or the same as the number of columns in the query result set.
(When the length of each target measurement list is 1: if there is only one `intoItem`, all query time series are written to the same device; if the number of `intoItem` equals the number of query time series, each query time series gets its own target device; if the number of `intoItem` is greater than one but less than the number of query time series, no one-to-one correspondence with the query time series is possible.) - -**Matching method:** Each query time series specifies the target device, and the target measurement is generated from the variable placeholder. - -**Example:** - -```sql -select s1, s2 -into root.sg_copy.d1(::), root.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::) -from root.sg.d1, root.sg.d2; -``` - -This statement is equivalent to: - -```sql -select s1, s2 -into root.sg_copy.d1(s1), root.sg_copy.d2(s1), root.sg_copy.d1(s2), root.sg_copy.d2(s2) -from root.sg.d1, root.sg.d2; -``` - -As you can see, placeholders simplify the statement very little in this case. - -###### (2) The target device uses variable placeholders & the target measurement list does not use variable placeholders - -**Limitations:** The total number of target measurements across all `intoItem` is the same as the number of columns in the query result set. - -**Matching method:** The target measurement is specified for each query time series, and the target device is generated from the target device placeholder of the `intoItem` that contains the corresponding target measurement. - -**Example:** - -```sql -select d1.s1, d1.s2, d2.s3, d3.s4 -into ::(s1_1, s2_2), root.sg.d2_2(s3_3), root.${2}_copy.::(s4) -from root.sg; -``` - -###### (3) The target device uses variable placeholders & the target measurement list uses variable placeholders - -**Limitations:** There is only one `intoItem`, and the length of the target measurement list is 1. - -**Matching method:** Each query time series can get a target time series according to the variable placeholder. 
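
The placeholder rules above can be illustrated with a small Python sketch. This is only a simplified model for reasoning about which target path a placeholder produces, not IoTDB's implementation: the function names `resolve_node` and `resolve_target` are hypothetical, paths are split on `.` (so back-quoted node names containing dots are not handled), and level 0 is assumed to be `root`, as in the `${i}` description.

```python
import re


def resolve_node(template_node, source_nodes):
    """Replace each ${i} in one node name with the i-th node of the source path."""
    return re.sub(
        r"\$\{(\d+)\}",
        lambda match: source_nodes[int(match.group(1))],
        template_node,
    )


def resolve_target(source_path, device_template, measurement_template):
    """Return the target time series for one source time series (simplified model).

    source_path          -- full source path, e.g. "root.sg.d1.s1"
    device_template      -- target device of an intoItem, may contain :: or ${i}
    measurement_template -- one target measurement, may be :: or contain ${i}
    """
    source_nodes = source_path.split(".")
    source_device_nodes = source_nodes[:-1]
    source_measurement = source_nodes[-1]

    target_nodes = []
    for level, node in enumerate(device_template.split(".")):
        if node == "::":
            # Copy the source device suffix from this level to its last level.
            target_nodes.extend(source_device_nodes[level:])
            break
        target_nodes.append(resolve_node(node, source_nodes))

    if measurement_template == "::":
        measurement = source_measurement
    else:
        measurement = resolve_node(measurement_template, source_nodes)

    return ".".join(target_nodes + [measurement])


print(resolve_target("root.sg.d1.s1", "root.sg_bk.::", "::"))
print(resolve_target("root.sg.d3.s4", "root.${2}_copy.::", "s4"))
```

Under these assumptions, the first call yields `root.sg_bk.d1.s1`, that is, the device name suffix and the measurement are kept unchanged under the new prefix.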
- -**Example:** - -```sql -select * into root.sg_bk.::(::) from root.sg.**; -```` - -Write the query results of all time series under `root.sg` to `root.sg_bk`, the device name suffix and measurement remain unchanged. - -##### ALIGN BY DEVICE - -> Note: The variable placeholder **can only describe the correspondence between time series**. If the query includes aggregation and expression calculation, the columns in the query result cannot correspond to a specific physical quantity, so the target measurement cannot use variable placeholders. - -###### (1) The target device does not use variable placeholders & the target measurement list uses variable placeholders - -**Limitations:** In each `intoItem`, if the list of measurement uses variable placeholders, the length of the list must be 1. - -**Matching method:** Each query time series specifies the target device, and the target measurement is generated from the variable placeholder. - -**Example:** - -```sql -select s1, s2, s3, s4 -into root.backup_sg.d1(s1, s2, s3, s4), root.backup_sg.d2(::), root.sg.d3(backup_${4}) -from root.sg.d1, root.sg.d2, root.sg.d3 -align by device; -```` - -###### (2) The target device uses variable placeholders & the target measurement list does not use variable placeholders - -**Limitations:** There is only one `intoItem`. (If there are multiple `intoItem` with placeholders, we will not know which source devices each `intoItem` needs to match) - -**Matching method:** Each query device obtains a target device according to the variable placeholder, and the target measurement written in each column of the result set under each device is specified by the target measurement list. 
- -**Example:** - -```sql -select avg(s1), sum(s2) + sum(s3), count(s4) -into root.agg_${2}.::(avg_s1, sum_s2_add_s3, count_s4) -from root.** -align by device; -```` - -###### (3) The target device uses variable placeholders & the target measurement list uses variable placeholders - -**Limitations:** There is only one `intoItem` and the length of the target measurement list is 1. - -**Matching method:** Each query time series can get a target time series according to the variable placeholder. - -**Example:** - -```sql -select * into ::(backup_${4}) from root.sg.** align by device; -```` - -Write the query result of each time series in `root.sg` to the same device, and add `backup_` before the measurement. - -#### Specify the target time series as the aligned time series - -We can use the `ALIGNED` keyword to specify the target device for writing to be aligned, and each `intoItem` can be set independently. - -**Example:** - -```sql -select s1, s2 into root.sg_copy.d1(t1, t2), aligned root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -``` - -This statement specifies that `root.sg_copy.d1` is an unaligned device and `root.sg_copy.d2` is an aligned device. - -#### Unsupported query clauses - -- `SLIMIT`, `SOFFSET`: The query columns are uncertain, so they are not supported. -- `LAST`, `GROUP BY TAGS`, `DISABLE ALIGN`: The table structure is inconsistent with the writing structure, so it is not supported. - -#### Other points to note - -- For general aggregation queries, the timestamp is meaningless, and the convention is to use 0 to store. -- When the target time-series exists, the data type of the source column and the target time-series must be compatible. About data type compatibility, see the document [Data Type](../Background-knowledge/Data-Type.md). -- When the target time series does not exist, the system automatically creates it (including the database). 
-- When the queried time series does not exist, or the queried sequence does not have data, the target time series will not be created automatically. - -### 10.2 Application examples - -#### Implement IoTDB internal ETL - -ETL the original data and write a new time series. - -```sql -SELECT preprocess_udf(s1, s2) INTO ::(preprocessed_s1, preprocessed_s2) FROM root.sg.* ALIGN BY DEVICE; -``` -``` -+--------------+-------------------+---------------------------+--------+ -| source device| source column| target timeseries| written| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d1| preprocess_udf(s1)| root.sg.d1.preprocessed_s1| 8000| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d1| preprocess_udf(s2)| root.sg.d1.preprocessed_s2| 10000| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d2| preprocess_udf(s1)| root.sg.d2.preprocessed_s1| 11000| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d2| preprocess_udf(s2)| root.sg.d2.preprocessed_s2| 9000| -+--------------+-------------------+---------------------------+--------+ -``` - -#### Query result storage - -Persistently store the query results, which acts like a materialized view. 
- -```sql -SELECT count(s1), last_value(s1) INTO root.sg.agg_${2}(count_s1, last_value_s1) FROM root.sg1.d1 GROUP BY ([0, 10000), 10ms); -``` -``` -+--------------------------+-----------------------------+--------+ -| source column| target timeseries| written| -+--------------------------+-----------------------------+--------+ -| count(root.sg.d1.s1)| root.sg.agg_d1.count_s1| 1000| -+--------------------------+-----------------------------+--------+ -| last_value(root.sg.d1.s2)| root.sg.agg_d1.last_value_s2| 1000| -+--------------------------+-----------------------------+--------+ -Total line number = 2 -It costs 0.115s -``` - -#### Non-aligned time series to aligned time series - -Rewrite non-aligned time series into another aligned time series. - -**Note:** It is recommended to use the `LIMIT & OFFSET` clause or the `WHERE` clause (time filter) to batch data to prevent excessive data volume in a single operation. - -```sql -SELECT s1, s2 INTO ALIGNED root.sg1.aligned_d(s1, s2) FROM root.sg1.non_aligned_d WHERE time >= 0 and time < 10000; -``` -``` -+--------------------------+----------------------+--------+ -| source column| target timeseries| written| -+--------------------------+----------------------+--------+ -| root.sg1.non_aligned_d.s1| root.sg1.aligned_d.s1| 10000| -+--------------------------+----------------------+--------+ -| root.sg1.non_aligned_d.s2| root.sg1.aligned_d.s2| 10000| -+--------------------------+----------------------+--------+ -Total line number = 2 -It costs 0.375s -``` - -### 10.3 User Permission Management - -The user must have the following permissions to execute a query write-back statement: - -* All `WRITE_SCHEMA` permissions for the source series in the `select` clause. -* All `WRITE_DATA` permissions for the target series in the `into` clause. - -For more user permissions related content, please refer to [Account Management Statements](../User-Manual/Authority-Management_timecho.md). 
- -### 10.4 Configurable Properties - -* `select_into_insert_tablet_plan_row_limit`: The maximum number of rows can be processed in one insert-tablet-plan when executing select-into statements. 10000 by default. diff --git a/src/UserGuide/latest/Basic-Concept/Write-Data_timecho.md b/src/UserGuide/latest/Basic-Concept/Write-Data_timecho.md deleted file mode 100644 index 54927276d..000000000 --- a/src/UserGuide/latest/Basic-Concept/Write-Data_timecho.md +++ /dev/null @@ -1,202 +0,0 @@ - - - -# Write Data -## 1. CLI INSERT - -IoTDB provides users with a variety of ways to insert real-time data, such as directly inputting [INSERT SQL statement](../SQL-Manual/SQL-Manual_timecho#insert-data) in [Client/Shell tools](../Tools-System/CLI.md), or using [Java JDBC](../API/Programming-JDBC_timecho) to perform single or batch execution of [INSERT SQL statement](../SQL-Manual/SQL-Manual_timecho). - -NOTE: This section mainly introduces the use of [INSERT SQL statement](../SQL-Manual/SQL-Manual_timecho#insert-data) for real-time data import in the scenario. - -When writing data with duplicate timestamps, the existing data with the same timestamp will be overwritten directly, which is equivalent to data update; however, if the written value is NULL, the operation will not take effect and the original field value will not be overwritten. - -### 1.1 Use of INSERT Statements - -The [INSERT SQL statement](../SQL-Manual/SQL-Manual_timecho#insert-data) statement is used to insert data into one or more specified timeseries created. For each point of data inserted, it consists of a [timestamp](../Basic-Concept/Operate-Metadata.md) and a sensor acquisition value (see [Data Type](../Background-knowledge/Data-Type.md)). - -In the scenario of this section, take two timeseries `root.ln.wf02.wt02.status` and `root.ln.wf02.wt02.hardware` as an example, and their data types are BOOLEAN and TEXT, respectively. 
- -The sample code for single column data insertion is as follows: - -``` -IoTDB > insert into root.ln.wf02.wt02(timestamp,status) values(1,true) -IoTDB > insert into root.ln.wf02.wt02(timestamp,hardware) values(1, "v1") -``` - -The above example code inserts the long integer timestamp and the value "true" into the timeseries `root.ln.wf02.wt02.status` and inserts the long integer timestamp and the value "v1" into the timeseries `root.ln.wf02.wt02.hardware`. When the execution is successful, cost time is shown to indicate that the data insertion has been completed. - -> Note: In IoTDB, TEXT type data can be represented by single and double quotation marks. The insertion statement above uses double quotation marks for TEXT type data. The following example will use single quotation marks for TEXT type data. - -The INSERT statement can also support the insertion of multi-column data at the same time point. The sample code of inserting the values of the two timeseries at the same time point '2' is as follows: - -```sql -IoTDB > insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (2, false, 'v2') -``` - -In addition, The INSERT statement support insert multi-rows at once. The sample code of inserting two rows as follows: - -```sql -IoTDB > insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4') -``` - -When writing data to the tree model, both timestamp and time can be used as time column identifiers in INSERT statements, and there is no need to deliberately distinguish between them when writing statements. However, in query results, the time column is uniformly displayed as Time (a fixed name) to ensure a consistent result format. - -After inserting the data, we can simply query the inserted data using the SELECT statement: - -```sql -IoTDB > select * from root.ln.wf02.wt02 where time < 5 -``` - -The result is shown below. 
The query result shows that the insertion statements of single column and multi column data are performed correctly. - -``` -+-----------------------------+--------------------------+------------------------+ -| Time|root.ln.wf02.wt02.hardware|root.ln.wf02.wt02.status| -+-----------------------------+--------------------------+------------------------+ -|1970-01-01T08:00:00.001+08:00| v1| true| -|1970-01-01T08:00:00.002+08:00| v2| false| -|1970-01-01T08:00:00.003+08:00| v3| false| -|1970-01-01T08:00:00.004+08:00| v4| true| -+-----------------------------+--------------------------+------------------------+ -Total line number = 4 -It costs 0.004s -``` - -In addition, we can omit the timestamp column, and the system will use the current system timestamp as the timestamp of the data point. The sample code is as follows: - -```sql -IoTDB > insert into root.ln.wf02.wt02(status, hardware) values (false, 'v2') -``` - -**Note:** Timestamps must be specified when inserting multiple rows of data in a SQL. - -### 1.2 Insert Data Into Aligned Timeseries - -To insert data into a group of aligned time series, we only need to add the `ALIGNED` keyword in SQL, and others are similar. - -The sample code is as follows: - -```sql -IoTDB > create aligned timeseries root.sg1.d1(s1 INT32, s2 DOUBLE) -IoTDB > insert into root.sg1.d1(time, s1, s2) aligned values(1, 1, 1) -IoTDB > insert into root.sg1.d1(time, s1, s2) aligned values(2, 2, 2), (3, 3, 3) -IoTDB > select * from root.sg1.d1 -``` - -The result is shown below. The query result shows that the insertion statements are performed correctly. 
- -``` -+-----------------------------+--------------+--------------+ -| Time|root.sg1.d1.s1|root.sg1.d1.s2| -+-----------------------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 1| 1.0| -|1970-01-01T08:00:00.002+08:00| 2| 2.0| -|1970-01-01T08:00:00.003+08:00| 3| 3.0| -+-----------------------------+--------------+--------------+ -Total line number = 3 -It costs 0.004s -``` - -## 2. NATIVE API WRITE - -The Native API ( Session ) is the most widely used series of APIs of IoTDB, including multiple APIs, adapted to different data collection scenarios, with high performance and multi-language support. - -### 2.1 Multi-language API write - -#### Java - -Before writing via the Java API, you need to establish a connection, refer to [Java Native API](../API/Programming-Java-Native-API_timecho). -then refer to [ JAVA Data Manipulation Interface (DML) ](../API/Programming-Java-Native-API_timecho#insert) - -#### Python - -Refer to [ Python Data Manipulation Interface (DML) ](../API/Programming-Python-Native-API_timecho#insert) - -#### C++ - -Refer to [ C++ Data Manipulation Interface (DML) ](../API/Programming-Cpp-Native-API.md#insert) - -#### Go - -Refer to [Go Native API](../API/Programming-Go-Native-API.md) - -## 3. REST API WRITE - -Refer to [insertTablet (v1)](../API/RestServiceV1_timecho#inserttablet) or [insertTablet (v2)](../API/RestServiceV2_timecho#inserttablet) - -Example: - -```JSON -{ -      "timestamps": [ -            1, -            2, -            3 -      ], -      "measurements": [ -            "temperature", -            "status" -      ], -      "data_types": [ -            "FLOAT", -            "BOOLEAN" -      ], -      "values": [ -            [ -                  1.1, -                  2.2, -                  3.3 -            ], -            [ -                  false, -                  true, -                  true -            ] -      ], -      "is_aligned": false, -      "device": "root.ln.wf01.wt01" -} -``` - -## 4. 
MQTT WRITE - -Refer to [Built-in MQTT Service](../API/Programming-MQTT_timecho.md#_2-built-in-mqtt-service) - -## 5. BATCH DATA LOAD - -In different scenarios, the IoTDB provides a variety of methods for importing data in batches. This section describes the two most common methods for importing data in CSV format and TsFile format. - -### 5.1 TsFile Batch Load - -TsFile is the file format of time series used in IoTDB. You can directly import one or more TsFile files with time series into another running IoTDB instance through tools such as CLI. For details, see [Data Import](../Tools-System/Data-Import-Tool_timecho). - -### 5.2 CSV Batch Load - -CSV stores table data in plain text. You can write multiple formatted data into a CSV file and import the data into the IoTDB in batches. Before importing data, you are advised to create the corresponding metadata in the IoTDB. Don't worry if you forget to create one, the IoTDB can automatically infer the data in the CSV to its corresponding data type, as long as you have a unique data type for each column. In addition to a single file, the tool supports importing multiple CSV files as folders and setting optimization parameters such as time precision. For details, see [Data Import](../Tools-System/Data-Import-Tool_timecho). - -## 6. SCHEMALESS WRITING -In IoT scenarios, the types and quantities of devices may dynamically increase or decrease over time, and different devices may generate data with varying fields (e.g., temperature, humidity, status codes). Additionally, businesses often require rapid deployment and flexible integration of new devices without cumbersome predefined processes. Therefore, unlike traditional time-series databases that typically require predefining data models, IoTDB supports schema-less writing, where the database automatically identifies and registers the necessary metadata during data writing, enabling automatic modeling. 
- -Users can either use CLI `INSERT` statements or native APIs to write data in real-time, either in batches or row-by-row, for single or multiple devices. Alternatively, they can import historical data in formats such as CSV or TsFile using import tools, during which metadata like time series, data types, and compression encoding methods are automatically created. - - - diff --git a/src/UserGuide/latest/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md b/src/UserGuide/latest/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md deleted file mode 100644 index b5e5ff055..000000000 --- a/src/UserGuide/latest/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md +++ /dev/null @@ -1,267 +0,0 @@ - -# AINode Deployment - -## 1. AINode Introduction - -### 1.1 Capability Introduction - -AINode is the third type of endogenous node provided by TimechoDB after ConfigNode and DataNode. By interacting with the DataNodes and ConfigNodes of an TimechoDB cluster, this node extends the capability for machine learning analysis on time series. AINode integrates model management, training, and inference within the database engine. It supports performing time series analysis tasks on specified time series data using registered models through simple SQL statements and also supports registering and using custom machine learning models. AINode currently integrates machine learning algorithms and self-developed models for common time series analysis scenarios (e.g., forecasting). - -### 1.2 Deployment Modes - -AINode is an additional component outside the TimechoDB cluster and is deployed using a separate installation package. - -
- - -
- -## 2. Installation Preparation - -### 2.1 Installation Package Acquisition - -The key directory structure after extracting the AINode installation package (`timechodb--ainode-bin.zip`) is as follows: - -| Directory | Type | Description | -| :--- | :--- | :--- | -| lib | Folder | Executable programs and dependencies for AINode | -| sbin | Folder | Operation scripts for AINode, used to start or stop AINode | -| conf | Folder | Configuration files and version declaration file for AINode | - -### 2.2 Pre-installation Verification - -To ensure the AINode installation package you obtained is complete and correct, it is recommended to perform an SHA512 verification before installation and deployment. - -**Preparation:** - -- Obtain the official SHA512 checksum: Please contact Timecho staff. - -**Verification Steps (using Linux as an example):** - -1. Open a terminal, navigate to the directory containing the installation package (e.g., `/data/ainode`): - -```bash -cd /data/ainode -``` - -2. Execute the following command to calculate the hash value: - -```bash -sha512sum timechodb-{version}-ainode-bin.zip -``` - -3. The terminal will output the result (left side is the SHA512 checksum, right side is the filename): - -```SQL -(base) root@hadoop@1:/data/ainode (0.664s) -sha512sum timechodb-2.0.6.1-ainode-bin.zip -4d5a6a64935b4f0459bc9ed214c4563aa7a6a5941024336e9416212424707f27bdfdfc70f4c528b51b812687d660014adc1b8add699498ea67ff17c7e619a6f0 timechodb-2.0.6.1-ainode-bin.zip -``` - -4. Compare the output with the official SHA512 checksum. If they match, you can proceed with the AINode installation and deployment steps below. - -**Notes:** - -- If the verification results do not match, please contact Timecho staff to obtain a new installation package. -- If you encounter a "file not found" prompt during verification, check if the file path is correct or if the installation package was downloaded completely. 
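
On systems where `sha512sum` is not available, the same check can be sketched with Python's standard `hashlib` module. The helper names `sha512_of_file` and `matches_checksum` below are illustrative only and are not part of the AINode package; the expected checksum is still the one obtained from Timecho staff.

```python
import hashlib
import hmac
import sys


def sha512_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as package:
        for chunk in iter(lambda: package.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare the file's digest with the expected checksum, ignoring case and surrounding whitespace."""
    return hmac.compare_digest(sha512_of_file(path), expected_hex.strip().lower())


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python verify_sha512.py <package.zip> <expected checksum>
    package_path, expected = sys.argv[1], sys.argv[2]
    print("OK" if matches_checksum(package_path, expected) else "MISMATCH")
```

Reading the file in chunks keeps memory usage flat even for large installation packages.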
- -### 2.3 Environment Requirements - -- Recommended operating environment: Linux, macOS. -- TimechoDB Version: >= V2.0.8. - -## 3. Installation, Deployment, and Usage - -### 3.1 Installing AINode - -Download the AINode installation package, import it into a dedicated folder, switch to that folder, and extract the package. - -```bash -unzip timechodb--ainode-bin.zip -``` - -### 3.2 Modifying Configuration Items - -AINode supports modifying some necessary parameters. You can find the following parameters in the `/TIMECHODB_AINODE_HOME/conf/iotdb-ainode.properties` file and make persistent modifications: - -| Name | Description | Type | Default Value | -| :--- | :--- | :--- | :--- | -| `cluster_name` | The cluster identifier the AINode is to join | String | `defaultCluster` | -| `ain_seed_config_node` | The ConfigNode address for AINode registration upon startup | String | `127.0.0.1:10710` | -| `ain_cluster_ingress_address` | The rpc address of the DataNode from which AINode pulls data | String | `127.0.0.1` | -| `ain_cluster_ingress_port` | The rpc port of the DataNode from which AINode pulls data | Integer | `6667` | -| `ain_cluster_ingress_username` | The client username for the DataNode from which AINode pulls data | String | `root` | -| `ain_cluster_ingress_password` | The client password for the DataNode from which AINode pulls data | String | `root` | -| `ain_rpc_address` | The address for AINode service provision and communication (internal service communication interface) | String | `127.0.0.1` | -| `ain_rpc_port` | The port for AINode service provision and communication | String | `10810` | -| `ain_system_dir` | AINode metadata storage path. The starting directory for relative paths is OS-dependent; using an absolute path is recommended. | String | `data/AINode/system` | -| `ain_models_dir` | AINode model file storage path. The starting directory for relative paths is OS-dependent; using an absolute path is recommended. 
| String | `data/AINode/models` | -| `ain_thrift_compression_enabled` | Whether to enable Thrift compression mechanism for AINode. 0-disable, 1-enable. | Boolean | `0` | - -### 3.3 Importing Built-in Weight Files - -*If the deployment environment has network connectivity and can access HuggingFace, the system will automatically pull the built-in model weight files. This step can be skipped.* -*For offline environments, contact Timecho staff to obtain the model weight folder and place it under the `/TIMECHODB_AINODE_HOME/data/ainode/models/builtin` directory.* -**NOTE:** Pay attention to the directory hierarchy. The parent directory for all built-in model weights should be `builtin`. - -### 3.4 Starting AINode - -After completing the deployment of ConfigNodes, you can add an AINode to support time series model management and inference functionality. After specifying the TimechoDB cluster information in the configuration items, you can execute the corresponding command to start the AINode and join the TimechoDB cluster. - -```bash -# Startup command -# Linux and macOS systems -bash sbin/start-ainode.sh - -# Windows system -sbin\start-ainode.bat - -# Background startup command (recommended for long-term operation) -# Linux and macOS systems -bash sbin/start-ainode.sh -d - -# Windows system -sbin\start-ainode.bat -d -``` - -### 3.5 Activating AINode - -1. Refer to TimechoDB Activation: [Activation Method](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md#_2-6-activate-database) - -2. You can verify AINode activation as follows. When the status shows `ACTIVATED`, it indicates successful activation. 
- -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -Total line number = 3 -It costs 0.002s -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2025-07-16T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| AiNodeLimit| 1| 1| -| CpuLimit| 11| Unlimited| -| DeviceLimit| 0| Unlimited| -|TimeSeriesLimit| 0| 9,999| -+---------------+---------+-----------------------------+ -Total line number = 7 -It costs 0.013s -``` - -### 3.6 Checking AINode Node Status - -During startup, AINode automatically joins the TimechoDB cluster. After starting AINode, you can enter an SQL query in the command line. Seeing the AINode node in the cluster with a `Running` status (as shown below) indicates a successful join. 
- -```sql -TimechoDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -Additionally, you can check the model status using the `show models` command. If the model status is incorrect, please verify the weight file path. - -```sql -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### 3.7 Stopping AINode - -If you need to stop a running AINode node, execute the corresponding shutdown script. It supports specifying the port via the `-p` parameter, which corresponds to the `ain_rpc_port` configuration item. - -```bash -# Linux / macOS -bash sbin/stop-ainode.sh -bash sbin/stop-ainode.sh -p # Specify port - -# Windows -sbin\stop-ainode.bat -sbin\stop-ainode.bat -p # Specify port -``` - -After stopping AINode, you can still see the AINode node in the cluster, but its status will be `UNKNOWN` (as shown below). 
AINode functionality will be unavailable at this time. - -```sql -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|UNKNOWN| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -If you need to restart the node, re-execute the startup script. - -### 3.8 Upgrading AINode -If you need to upgrade the version of the current AINode, follow these steps: - -1. Stop the current AINode service - - Run the stop command and ensure the service has completely exited before proceeding with subsequent operations. - - ```bash - # Linux / MacOS - bash sbin/stop-ainode.sh - bash sbin/stop-ainode.sh -p # Specify port - - # Windows - sbin\stop-ainode.bat - sbin\stop-ainode.bat -p # Specify port - ``` - -2. Replace core files - - Delete the `lib` and `sbin` directories of the current version, then copy the `lib` and `sbin` directories from the new version to the corresponding locations. - - Back up the modified configuration files in the `conf` directory, then replace the `conf` folder and synchronize your modified configurations to the corresponding files. - -3. Update built-in model weights (optional) - - If the new version includes updates to built-in models, relevant information will be announced in the [Release History](../IoTDB-Introduction/Release-history_timecho.md). You may contact Timecho staff to obtain the latest weight package, and replace it in the `data/ainode/models/builtin` directory. - -4. 
A \ No newline at end of file diff --git a/src/UserGuide/latest/Deployment-and-Maintenance/AINode_Deployment_timecho.md b/src/UserGuide/latest/Deployment-and-Maintenance/AINode_Deployment_timecho.md deleted file mode 100644 index 8981ee335..000000000 --- a/src/UserGuide/latest/Deployment-and-Maintenance/AINode_Deployment_timecho.md +++ /dev/null @@ -1,319 +0,0 @@ - -# AINode Deployment - -## 1. AINode Introduction - -### 1.1 Capability Introduction - - AINode is the third type of endogenous node provided by IoTDB after the Configurable Node and DataNode. This node extends its ability to perform machine learning analysis on time series by interacting with the DataNode and Configurable Node of the IoTDB cluster. It supports the introduction of existing machine learning models from external sources for registration and the use of registered models to complete time series analysis tasks on specified time series data through simple SQL statements. The creation, management, and inference of models are integrated into the database engine. Currently, machine learning algorithms or self-developed models are available for common time series analysis scenarios, such as prediction and anomaly detection. - -### 1.2 Delivery Method - AINode is an additional package outside the IoTDB cluster, with independent installation. - -### 1.3 Deployment mode -
- - -
- -## 2. Installation preparation - -### 2.1 Get installation package - - Unzip the installation package - `(timechodb-{version}-ainode-bin.zip)`. The directory structure after unpacking is as follows: - -| **Directory** | **Type** | **Description** | -| ----------- | -------- |-----------------------------------------------------------------------| -| lib | folder | Python package files for AINode | -| sbin | folder | Scripts to start, remove, and stop AINode | -| conf | folder | Configuration files for AINode, and runtime environment setup scripts | -| LICENSE | file | License | -| NOTICE | file | Notices | -| README_ZH.md | file | Instructions (Chinese version, Markdown format) | -| README.md | file | Instructions | - -### 2.2 Pre-installation Check - -To ensure the AINode installation package you obtained is complete and valid, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: please contact the Timecho team to obtain it. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/ainode): - ```Bash - cd /data/ainode - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-ainode-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-06.png) - -4. Compare the output result with the official SHA512 checksum. Once you have confirmed that they match, you can proceed with the installation and deployment of AINode as per the procedures below. - -#### Notes: - -- If the verification results do not match, please contact the Timecho team to re-obtain the installation package. 
-- If a "file not found" prompt appears during verification, check whether the file path is correct and whether the installation package has been fully downloaded. - -### 2.3 Environment Preparation - -1. Recommended operating systems: Ubuntu, MacOS -2. IoTDB version: >= V2.0.5.1 -3. Runtime environment - - Python version between 3.9 and 3.12, with the pip and venv tools installed. - -## 3. Installation steps - -### 3.1 Install AINode - -1. Ensure the Python version is between 3.9 and 3.12: -```shell -python --version -# or -python3 --version -``` - -2. Download the AINode package into a dedicated folder, switch to the folder, and unzip the package: -```shell - unzip timechodb-{version}-ainode-bin.zip - ``` -3. Activate AINode: - -- Enter the IoTDB CLI - -```shell -# For Linux or macOS -./start-cli.sh - -# For Windows -./start-cli.bat -``` - -- Run the following command to retrieve the machine code required for activation: - -```sql -show system info -``` - -- Copy the returned machine code and send it to the Timecho team: - -```sql -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -``` - -- Enter the activation code provided by the Timecho team in the CLI using the following format. Wrap the activation code in single quotes ('): - -```sql -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZK' -``` - -- To verify the activation, run the commands below; a status of ACTIVATED indicates successful activation. 
- -```sql -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ - -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2025-07-16T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| AiNodeLimit| 1| 1| -| CpuLimit| 11| Unlimited| -| DeviceLimit| 0| Unlimited| -|TimeSeriesLimit| 0| 9,999| -+---------------+---------+-----------------------------+ - -``` - -### 3.2 Configuration item modification - -AINode supports modifying some necessary parameters. 
You can find the following parameters in the `conf/iotdb-ainode.properties` file and make persistent modifications to them: - -| **Name** | **Description** | **Type** | **Default Value** | -| ------------------------------ | ------------------------------------------------------------ | -------- | ------------------ | -| cluster_name | Identifier of the cluster AINode joins | string | defaultCluster | -| ain_seed_config_node | Address of the ConfigNode registered when AINode starts | String | 127.0.0.1:10710 | -| ain_cluster_ingress_address | RPC address of the DataNode for AINode to pull data | String | 127.0.0.1 | -| ain_cluster_ingress_port | RPC port of the DataNode for AINode to pull data | Integer | 6667 | -| ain_cluster_ingress_username | Client username for AINode to pull data from the DataNode | String | root | -| ain_cluster_ingress_password | Client password for AINode to pull data from the DataNode | String | root | -| ain_cluster_ingress_time_zone | Client time zone for AINode to pull data from the DataNode | String | UTC+8 | -| ain_inference_rpc_address | Address for AINode to provide services and communication (internal interface) | String | 127.0.0.1 | -| ain_inference_rpc_port | Port for AINode to provide services and communication | String | 10810 | -| ain_system_dir | Metadata storage path for AINode (relative path starts from OS-dependent directory; absolute path is recommended) | String | data/AINode/system | -| ain_models_dir | Path to store model files for AINode (relative path starts from OS-dependent directory; absolute path is recommended) | String | data/AINode/models | -| ain_thrift_compression_enabled | Whether to enable Thrift compression for AINode (0=disabled, 1=enabled) | Boolean | 0 | - -### 3.3 Importing Weight Files - -> Offline environment only (Online environments can skip this step) -> -Contact Timecho team to obtain the model weight files, then place them in the /IOTDB_AINODE_HOME/data/ainode/models/weights/ directory. 
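As a rough illustration of the parameter table in section 3.2, the following Python sketch parses a properties-style file and checks a few of the listed keys before AINode is started. The helper names `load_properties` and `check_ainode_config` are hypothetical and not part of AINode; the key names come from the table, and the plain `key=value` parsing is an assumption about the file format rather than a description of how AINode itself reads it.

```python
# Keys taken from the parameter table; the checks themselves are illustrative.
REQUIRED_KEYS = ("cluster_name", "ain_seed_config_node", "ain_inference_rpc_port")


def load_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_ainode_config(props: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in props]
    port = props.get("ain_inference_rpc_port", "")
    if port and not (port.isdigit() and 0 < int(port) < 65536):
        problems.append(f"invalid port: {port}")
    seed = props.get("ain_seed_config_node", "")
    if seed and ":" not in seed:
        problems.append(f"seed ConfigNode should look like host:port, got: {seed}")
    return problems
```

A typical use would be to read `conf/iotdb-ainode.properties`, pass its text to `load_properties`, and print whatever `check_ainode_config` returns before running the startup script.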
- - -### 3.4 Start AINode - - After the seed ConfigNode has been deployed, adding an AINode enables model registration and inference. Once the IoTDB cluster information has been specified in the configuration file, run the following command to start AINode and join it to the IoTDB cluster. - -- Networking environment startup - -Start command - -```shell - # Start command - # Linux and MacOS systems - bash sbin/start-ainode.sh - - # Windows systems - sbin\start-ainode.bat - - # Backend startup command (recommended for long-term running) - # Linux and MacOS systems - nohup bash sbin/start-ainode.sh > myout.file 2>&1 & - - # Windows systems - nohup bash sbin\start-ainode.bat > myout.file 2>&1 & - ``` - -### 3.5 Detecting the status of AINode nodes - -During startup, the new AINode is automatically added to the IoTDB cluster. After starting AINode, you can query the cluster from the CLI. If an AINode appears in the cluster with a status of Running (as shown below), it has joined successfully. - - -```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|Running| 127.0.0.1| 10810|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` - -### 3.6 Stop AINode - -If you need to stop a running AINode node, execute the corresponding shutdown script. 
- - Stop command - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh - - # Windows - sbin\stop-ainode.bat - ``` - -After AINode is stopped, it still appears in the cluster with a status of UNKNOWN (as shown below), and AINode functionality is unavailable. - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|UNKNOWN| 127.0.0.1| 10810|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -To restart the node, execute the startup script again. - -## 4. Common problems - -### 4.1 An error occurs when starting AINode stating that the venv module cannot be found - - When AINode is started using the default method, a Python virtual environment is created in the installation package directory and dependencies are installed into it, so the venv module must be available. Python 3.10 and later generally ship with venv, but the Python environment bundled with some systems does not include it. There are two solutions when this error occurs (choose one): - - Install the venv module locally. On Ubuntu, for example, you can run the following command to install it, or install a Python version that includes venv from the official Python website. - - ```shell -apt-get install python3.10-venv -``` -Then create a virtual environment with Python 3.10.0 in the AINode path: 
- - ```shell -../Python-3.10.0/python -m venv venv(Folder Name) -``` - When running the startup script, use ` -i ` to specify an existing Python interpreter path as the running environment for AINode, eliminating the need to create a new virtual environment. - - ### 4.2 The SSL module in Python is not properly installed and configured to handle HTTPS resources -WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available. -You can install OpenSSL and then rebuild Python to solve this problem -> Currently Python versions 3.6 to 3.9 are compatible with OpenSSL 1.0.2, 1.1.0, and 1.1.1. - - Python requires OpenSSL to be installed on the system; the installation method is described in [this Stack Overflow answer](https://stackoverflow.com/questions/56552390/how-to-fix-ssl-module-in-python-is-not-available-in-centos) - - ```shell -sudo apt-get install build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev uuid-dev lzma-dev liblzma-dev -sudo -E ./configure --with-ssl -make -sudo make install -``` - - ### 4.3 pip version is too low - - A compilation error similar to "error: Microsoft Visual C++14.0 or greater is required..." appears on Windows - -This error occurs during installation and compilation, usually because the C++ build tools or the setuptools version are out of date. 
You can upgrade pip and setuptools with the following commands: - - ```shell -./python -m pip install --upgrade pip -./python -m pip install --upgrade setuptools -``` - - - ### 4.4 Install and compile Python - - Use the following commands to download the source package from the official website and extract it: - ```shell -wget https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tar.xz -tar Jxf Python-3.10.0.tar.xz -``` - Compile and install Python: - ```shell -cd Python-3.10.0 -./configure --prefix=/usr/local/python3 -make -sudo make install -python3 --version -``` \ No newline at end of file diff --git a/src/UserGuide/latest/Deployment-and-Maintenance/Cluster-Deployment_timecho.md b/src/UserGuide/latest/Deployment-and-Maintenance/Cluster-Deployment_timecho.md deleted file mode 100644 index 516cf88b1..000000000 --- a/src/UserGuide/latest/Deployment-and-Maintenance/Cluster-Deployment_timecho.md +++ /dev/null @@ -1,640 +0,0 @@ - -# Cluster Deployment - -This guide describes how to manually deploy a cluster instance consisting of 3 ConfigNodes and 3 DataNodes (commonly referred to as a 3C3D cluster). - -
- -
- - - -## 1. Prerequisites - -1. [System configuration](./Environment-Requirements.md): Ensure the system has been configured according to the preparation guidelines. - -2. **IP Configuration**: It is recommended to use hostnames for IP configuration to prevent issues caused by IP address changes. Configure the `/etc/hosts` file on each server. For example, if the local IP is `11.101.17.224` and the hostname is `iotdb-1`, use the following command to set the hostname: - - ``` shell - echo "11.101.17.224 iotdb-1" >> /etc/hosts - ``` - - Use the hostname for `cn_internal_address` and `dn_internal_address` in IoTDB configuration. - -3. **Unmodifiable Parameters**: Some parameters cannot be changed after the first startup. Refer to the Parameter Configuration section. - -4. **Installation Path**: Ensure the installation path contains no spaces or non-ASCII characters to prevent runtime issues. - -5. **User Permissions**: Choose one of the following permissions during installation and deployment: - - - **Root User (Recommended)**: This avoids permission-related issues. - - **Non-Root User**: - - Use the same user for all operations, including starting, activating, and stopping services. - - Avoid using `sudo`, which can cause permission conflicts. - -6. **Monitoring Panel**: Deploy a monitoring panel to track key performance metrics. Contact the Timecho team for access and refer to the "[Monitoring Panel Deployment](./Monitoring-panel-deployment.md)" guide. - -7. **Health Check Tool**: Before installation, the health check tool can inspect the operating environment of IoTDB nodes and report detailed results. For usage, see the [Health Check Tool](../Tools-System/Health-Check-Tool.md) guide. - - -## 2. Preparation - -1. Obtain the TimechoDB installation package: `timechodb-{version}-bin.zip` following [IoTDB-Package](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) - -2. 
Configure the operating system environment according to [Environment Requirement](../Deployment-and-Maintenance/Environment-Requirements.md) - -### 2.1 Pre-installation Check - -To ensure the IoTDB Enterprise Edition installation package you obtained is complete and authentic, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Find the "SHA512 Checksum" corresponding to each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/iotdb): - ```Bash - cd /data/iotdb - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-02.png) - -4. Compare the output result with the official SHA512 checksum. Once you have confirmed that they match, you can proceed with the installation and deployment in accordance with the procedures below. - -#### Notes: - -- If the verification results do not match, please contact the Timecho team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct and whether the installation package has been fully downloaded. - -## 3. 
Installation Steps - -Taking a cluster with three Linux servers with the following information as example: - -| Node IP | Host Name | Service | -| ------------- | --------- | -------------------- | -| 11.101.17.224 | iotdb-1 | ConfigNode、DataNode | -| 11.101.17.225 | iotdb-2 | ConfigNode、DataNode | -| 11.101.17.226 | iotdb-3 | ConfigNode、DataNode | - -### 3.1 Configure Hostnames - -On all three servers, configure the hostnames by editing the `/etc/hosts` file. Use the following commands: - -```Bash -echo "11.101.17.224 iotdb-1" >> /etc/hosts -echo "11.101.17.225 iotdb-2" >> /etc/hosts -echo "11.101.17.226 iotdb-3" >> /etc/hosts -``` - -### 3.2 Extract Installation Package - -Unzip the installation package and enter the installation directory: - -```Plain -unzip timechodb-{version}-bin.zip -cd timechodb-{version}-bin -``` - -### 3.3 Parameters Configuration - -- #### Memory Configuration - - Edit the following files for memory allocation: - - - **ConfigNode**: `./conf/confignode-env.sh` (or `.bat` for Windows) - - | **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | - | :------------ | :--------------------------------- | :---------- | :-------------- | :-------------------------------------- | - | MEMORY_SIZE | Total memory allocated to the node | Automatically calculated based on system memory, defaulting to 30% of the system memory. | As needed | Save changes without immediate execution; modifications take effect after service restart. | - - - **DataNode**: `./conf/datanode-env.sh` (or `.bat` for Windows) - - | **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | - | :------------ | :--------------------------------- |:-----------------------------------------------------------------------------------------| :-------------- | :-------------------------------------- | - | MEMORY_SIZE | Total memory allocated to the node | Automatically calculated based on system memory, defaulting to 50% of the system memory. 
| As needed | Save changes without immediate execution; modifications take effect after service restart. | - - -**General Configuration** - -Set the following parameters in `./conf/iotdb-system.properties`. Refer to `./conf/iotdb-system.properties.template` for a complete list. - -**Cluster-Level Parameters**: - -| **Parameter** | **Description** | **11.101.17.224** | **11.101.17.225** | **11.101.17.226** | -| :------------------------ | :----------------------------------------------------------- | :---------------- | :---------------- | :---------------- | -| cluster_name | Name of the cluster | defaultCluster | defaultCluster | defaultCluster | -| schema_replication_factor | Metadata replication factor; DataNode count shall not be fewer than this value | 3 | 3 | 3 | -| data_replication_factor | Data replication factor; DataNode count shall not be fewer than this value | 2 | 2 | 2 | - -#### ConfigNode Parameters - -| **Parameter** | **Description** | **Default** | **Recommended** | **11.101.17.224** | **11.101.17.225** | **11.101.17.226** | **Notes** | -| :------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------------------- | :---------------- | :---------------- | :---------------- | :--------------------------------------------------------- | -| cn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. | iotdb-1 | iotdb-2 | iotdb-3 | This parameter cannot be modified after the first startup. | -| cn_internal_port | Port used for internal communication within the cluster | 10710 | 10710 | 10710 | 10710 | 10710 | This parameter cannot be modified after the first startup. 
| -| cn_consensus_port | Port used for consensus protocol communication among ConfigNode replicas | 10720 | 10720 | 10720 | 10720 | 10720 | This parameter cannot be modified after the first startup. | -| cn_seed_config_node | Address of the ConfigNode for registering and joining the cluster. (e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Address and port of the seed ConfigNode (e.g., `cn_internal_address:cn_internal_port`) | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | This parameter cannot be modified after the first startup. | - -#### DataNode Parameters - -| **Parameter** | **Description** | **Default** | **Recommended** | **11.101.17.224** | **11.101.17.225** | **11.101.17.226** | **Notes** | -| :------------------------------ | :----------------------------------------------------------- |:----------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :---------------- | :---------------- | :---------------- | :--------------------------------------------------------- | -| dn_rpc_address | Address for the client RPC service | 127.0.0.1 | By default, the local machine can directly access it. For non-local access, please modify this configuration item to the IPv4 address or hostname of the server where it is located. It is recommended to use the IPv4 address of the server where it is located. | iotdb-1 | iotdb-2 | iotdb-3 | Effective after restarting the service. | -| dn_rpc_port | Port for the client RPC service | 6667 | 6667 | 6667 | 6667 | 6667 | Effective after restarting the service. | -| dn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. 
| iotdb-1 | iotdb-2 | iotdb-3 | This parameter cannot be modified after the first startup. | -| dn_internal_port | Port used for internal communication within the cluster | 10730 | 10730 | 10730 | 10730 | 10730 | This parameter cannot be modified after the first startup. | -| dn_mpp_data_exchange_port | Port used for receiving data streams | 10740 | 10740 | 10740 | 10740 | 10740 | This parameter cannot be modified after the first startup. | -| dn_data_region_consensus_port | Port used for data replica consensus protocol communication | 10750 | 10750 | 10750 | 10750 | 10750 | This parameter cannot be modified after the first startup. | -| dn_schema_region_consensus_port | Port used for metadata replica consensus protocol communication | 10760 | 10760 | 10760 | 10760 | 10760 | This parameter cannot be modified after the first startup. | -| dn_seed_config_node | Address of the ConfigNode for registering and joining the cluster.(e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Address of the first ConfigNode | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | This parameter cannot be modified after the first startup. | - -**Note:** Ensure files are saved after editing. Tools like VSCode Remote do not save changes automatically. - -### 3.4 Start ConfigNode Instances - -1. Start the first ConfigNode (`iotdb-1`) as the seed node - -```Bash -# Unix/OS X -cd sbin -./start-confignode.sh -d #"- d" parameter will start in the background - -# Windows -# Before version V2.0.4.x -.\start-confignode.bat - -# V2.0.4.x and later versions -.\windows\start-confignode.bat -``` - -2. Start the remaining ConfigNodes (`iotdb-2` and `iotdb-3`) in sequence. - - If the startup fails, refer to the [Common Questions](#common-questions) section below for troubleshooting. 
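Before starting the remaining ConfigNodes, it can help to confirm that the seed ConfigNode is accepting connections on its internal port. The sketch below shows one way to do that with a plain TCP probe. `wait_for_port` is a hypothetical helper, not an IoTDB tool; the host `iotdb-1` and port `10710` are taken from the example configuration above. A successful TCP connection only shows that the port is open, not that the node has finished initializing.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 60.0, interval_s: float = 1.0) -> bool:
    """Poll host:port until a TCP connection succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)


# Example (assumed host and port from the table above):
#   wait_for_port("iotdb-1", 10710, timeout_s=30.0)
```

Polling with a deadline rather than probing once is a deliberate choice here, since the seed node may take some time to open its port after the start script returns.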
- -### 3.5 Start DataNode Instances - -On each server, navigate to the `sbin` directory and start the DataNode: - -```Bash -# Unix/OS X -cd sbin -./start-datanode.sh -d # the "-d" parameter starts the node in the background - -# Windows -# Before version V2.0.4.x -.\start-datanode.bat - -# V2.0.4.x and later versions -.\windows\start-datanode.bat -``` - -### 3.6 Activate Database - -#### Option 1: Command-Based Activation - -1. Enter the IoTDB CLI on any node of the cluster: - -The Linux and MacOS system startup commands are as follows: - -```shell -# Before version V2.0.6.x -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x and later versions -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -The Windows system startup commands are as follows: - -```shell -# Before version V2.0.4.x -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.4.x and later versions, before version V2.0.6.x -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x and later versions -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -2. 
Execute the following command to obtain the machine code required for activation: - -```SQL -IoTDB> show system info -``` -```shell -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -|01-TE5NLES4-UDDWCMYE,01-GG5NLES4-XXDWCMYE,01-FF5NLES4-WWWWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -3. Execute the following statement to obtain the version number of the database to be activated: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.9.2| 5ea21bc| -+-------+---------+ -Total line number = 1 -``` - -4. Provide the obtained machine code and version number to the Timecho team. - -5. Enter the activation codes provided by the Timecho team in the CLI in sequence using the following format. Wrap the activation code in single quotes ('): - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - -- Note : The activation operation only needs to be performed once on any machine in the cluster. - -#### Option 2: File-Based Activation - -1. Start all ConfigNodes and DataNodes. -2. Copy the `system_info` file from the `activation` directory on each server and send them to the Timecho team. 
-3. Place the license files provided by the Timecho team into the corresponding `activation` folder for each node. - - -### 3.7 Verify Activation - -In the CLI, you can check the activation status by running the `show activation` command; the example below shows a status of ACTIVATED, indicating successful activation. - -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - - -### 3.8 One-click Cluster Start and Stop - -#### 3.8.1 Overview - -Within the root directory of IoTDB, the `sbin `subdirectory houses the `start-all.sh` and `stop-all.sh` scripts, which work in concert with the `iotdb-cluster.properties` configuration file located in the `conf` subdirectory. This synergy enables the one-click initiation or termination of all nodes within the cluster from a single node. This approach facilitates efficient management of the IoTDB cluster's lifecycle, streamlining the deployment and operational maintenance processes. - -This following section will introduce the specific configuration items in the `iotdb-cluster.properties` file. - -#### 3.8.2 Configuration Items - -> Note: -> -> * When the cluster changes, this configuration file needs to be manually updated. -> * If the `iotdb-cluster.properties` configuration file is not set up and the `start-all.sh` or `stop-all.sh` scripts are executed, the scripts will, by default, start or stop the ConfigNode and DataNode nodes located in the IOTDB\_HOME directory where the scripts reside. 
-> * It is recommended to configure SSH passwordless login: If not configured, the script will prompt for the server password after execution to facilitate subsequent start, stop, or destroy operations. If already configured, there is no need to enter the server password during script execution. - -* confignode\_address\_list - -| **Name** | **confignode\_address\_list** | -| :----------------: |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | A list of IP addresses or hostname of the hosts where the ConfigNodes to be started/stopped are located. If there are multiple, they should be separated by commas. | -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* datanode\_address\_list - -| **Name** | **datanode\_address\_list** | -| :----------------: |:------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | A list of IP addresses or hostname of the hosts where the DataNodes to be started/stopped are located. If there are multiple, they should be separated by commas. | -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* ssh\_account - -| **Name** | **ssh\_account** | -| :----------------: | :------------------------------------------------------------------------------------------------- | -| Description | The username used to log in to the target hosts via SSH. All hosts must have the same username. | -| Type | String | -| Default | root | -| Effective | After restarting the system | - -* ssh\_port - -| **Name** | **ssh\_port** | -| :----------------: |:--------------------------------------------------------------------------------------| -| Description | The SSH port exposed by the target hosts. All hosts must have the same SSH port. 
| -| Type | int | -| Default | 22 | -| Effective | After restarting the system | - -* confignode\_deploy\_path - -| **Name** | **confignode\_deploy\_path** | -| :----------------: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Description | The path on the target hosts where all ConfigNodes to be started/stopped are located. All ConfigNodes must be in the same directory on their respective hosts. eg: `/data/demo/apache-iotdb-1.3.1-all-bin`| -| Type | String | -| Default | None | -| Effective | After restarting the system | - -* datanode\_deploy\_path - -| **Name** | **datanode\_deploy\_path** | -| :----------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| Description | The path on the target hosts where all DataNodes to be started/stopped are located. All DataNodes must be in the same directory on their respective hosts. eg: `/data/demo/apache-iotdb-1.3.1-all-bin`| -| Type | String | -| Default | None | -| Effective | After restarting the system | - - -#### 3.8.3 Quick Example - -1. Configuration File: `iotdb-cluster.properties` -```properties -# Configure ConfigNode node addresses, separated by commas -confignode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# Configure DataNode node addresses, separated by commas -datanode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# SSH login username for target deployment servers -ssh_account=root - -# SSH service port number -ssh_port=22 - -# IoTDB installation directory (the program will be deployed into this path on remote nodes) -confignode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -datanode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -``` - -2. 
Run `./start-all.sh` to launch cluster and verify status - Connect to IoTDB CLI and execute `show cluster`. A successful output is shown below: -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -| 0|ConfigNode|Running| 172.xx.xx.16| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 1|ConfigNode|Running| 172.xx.xx.18| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 2|ConfigNode|Running| 172.xx.xx.17| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 3| DataNode|Running| 172.xx.xx.18| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 4| DataNode|Running| 172.xx.xx.17| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 5| DataNode|Running| 172.xx.xx.16| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -``` - - -## 4. Maintenance - -### 4.1 ConfigNode Maintenance - -ConfigNode maintenance includes adding and removing ConfigNodes. Common use cases include: - -- **Cluster Expansion:** If the cluster contains only 1 ConfigNode, adding 2 more ConfigNodes enhances high availability, resulting in a total of 3 ConfigNodes. -- **Cluster Fault Recovery:** If a ConfigNode's machine fails and it cannot function normally, remove the faulty ConfigNode and add a new one to the cluster. - -**Note:** After completing ConfigNode maintenance, ensure that the cluster contains either 1 or 3 active ConfigNodes. Two ConfigNodes do not provide high availability, and more than three ConfigNodes can degrade performance. - -#### Adding a ConfigNode - -**Linux /** **MacOS**: - -```Bash -sbin/start-confignode.sh -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\start-confignode.bat - -# V2.0.4.x and later versions -sbin\windows\start-confignode.bat -``` - -#### Removing a ConfigNode - -1. 
Connect to the cluster using the CLI and confirm the internal address and port of the ConfigNode to be removed: - - ```Plain - show confignodes; - ``` - -Example output: - -```Plain -IoTDB> show confignodes -+------+-------+---------------+------------+--------+ -|NodeID| Status|InternalAddress|InternalPort| Role| -+------+-------+---------------+------------+--------+ -| 0|Running| 127.0.0.1| 10710| Leader| -| 1|Running| 127.0.0.1| 10711|Follower| -| 2|Running| 127.0.0.1| 10712|Follower| -+------+-------+---------------+------------+--------+ -Total line number = 3 -It costs 0.030s -``` - -2. Remove the ConfigNode using the script: - -**Linux /** **MacOS**: - -```Bash -sbin/remove-confignode.sh [confignode_id] -# Or: -sbin/remove-confignode.sh [cn_internal_address:cn_internal_port] -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\remove-confignode.bat [confignode_id] -# Or: -sbin\remove-confignode.bat [cn_internal_address:cn_internal_port] - -# V2.0.4.x and later versions -sbin\windows\remove-confignode.bat [confignode_id] -# Or: -sbin\windows\remove-confignode.bat [cn_internal_address:cn_internal_port] -``` - -### 4.2 DataNode Maintenance - -DataNode maintenance includes adding and removing DataNodes. Common use cases include: - -- **Cluster Expansion:** Add new DataNodes to increase cluster capacity. -- **Cluster Fault Recovery:** If a DataNode's machine fails and it cannot function normally, remove the faulty DataNode and add a new one to the cluster. - -**Note:** During and after DataNode maintenance, ensure that the number of active DataNodes is not fewer than the data replication factor (usually 2) or the schema replication factor (usually 3). 
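
The note above can be turned into a quick pre-check before a DataNode is removed. The following is only a minimal sketch: `can_remove_datanode` is a hypothetical helper, not an IoTDB script, and the node count and replication factors are assumed to be supplied by the operator (for example from `show datanodes` and the cluster configuration).

```shell
# Hypothetical helper (not shipped with IoTDB): succeeds only if removing one
# DataNode leaves at least as many active DataNodes as both replication factors.
# Usage: can_remove_datanode ACTIVE_DATANODES DATA_FACTOR SCHEMA_FACTOR
can_remove_datanode() {
  remaining=$(( $1 - 1 ))
  [ "$remaining" -ge "$2" ] && [ "$remaining" -ge "$3" ]
}

# Example values (assumptions): 4 active DataNodes, data factor 2, schema factor 3.
if can_remove_datanode 4 2 3; then
  echo "safe to remove one DataNode"
else
  echo "add a replacement DataNode before removing one"
fi
```

If the check fails, adding the replacement DataNode first and removing the faulty one afterwards keeps the cluster above both replication factors throughout the maintenance.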
- -#### Adding a DataNode - -**Linux /** **MacOS**: - -```Bash -sbin/start-datanode.sh -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\start-datanode.bat - -# V2.0.4.x and later versions -sbin\windows\start-datanode.bat -``` - -**Note:** After adding a DataNode, the cluster load will gradually balance across all nodes as new writes arrive and old data expires (if TTL is set). - -#### Removing a DataNode - -1. Connect to the cluster using the CLI and confirm the RPC address and port of the DataNode to be removed: - -```Plain -show datanodes; -``` - -Example output: - -```Plain -IoTDB> show datanodes -+------+-------+----------+-------+-------------+---------------+ -|NodeID| Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum| -+------+-------+----------+-------+-------------+---------------+ -| 1|Running| 0.0.0.0| 6667| 0| 0| -| 2|Running| 0.0.0.0| 6668| 1| 1| -| 3|Running| 0.0.0.0| 6669| 1| 0| -+------+-------+----------+-------+-------------+---------------+ -Total line number = 3 -It costs 0.110s -``` - -2. Remove the DataNode using the script: - -**Linux / MacOS:** - -```Bash -sbin/remove-datanode.sh [dn_rpc_address:dn_rpc_port] -``` - -**Windows:** - -```Bash -# Before version V2.0.4.x -sbin\remove-datanode.bat [dn_rpc_address:dn_rpc_port] - -# V2.0.4.x and later versions -sbin\windows\remove-datanode.bat [dn_rpc_address:dn_rpc_port] -``` - -### 4.3 Cluster Maintenance - -For more details on cluster maintenance, please refer to: [Cluster Maintenance](../User-Manual/Load-Balance.md) - -## 5. Common Questions - -1. Activation Fails Repeatedly - - Use the `ls -al` command to verify that the ownership of the installation directory matches the current user. - - Check the ownership of all files in the `./activation` directory to ensure they belong to the current user. -2. ConfigNode Fails to Start - - Review the startup logs to check if any parameters, which cannot be modified after the first startup, were changed. 
- - Check the logs for any other errors. If unresolved, contact technical support for assistance. - - If the deployment is fresh or data can be discarded, clean the environment and redeploy using the following steps: - **Clean the Environment** - - - Stop all ConfigNode and DataNode processes: - ```Bash - sbin/stop-standalone.sh - ``` - - - Check for any remaining processes: - ```Bash - jps - # or - ps -ef | grep iotdb - ``` - - - If processes remain, terminate them manually: - ```Bash - kill -9 <pid> - - # For systems with a single IoTDB instance, you can clean up residual processes with: - ps -ef | grep iotdb | grep -v grep | tr -s ' ' ' ' | cut -d ' ' -f2 | xargs kill -9 - ``` - - - Delete the `data` and `logs` directories: - ```Bash - cd /data/iotdb - rm -rf data logs - ``` - -## 6. Appendix - -### 6.1 ConfigNode Parameters - -| Parameter | Description | Required | -| :-------- | :---------------------------------------------------------- | :------------- | -| -d | Starts the process in daemon mode (runs in the background). | No | - -### 6.2 DataNode Parameters - -| Parameter | Description | Required | -| :-------- | :----------------------------------------------------------- | :------- | -| -v | Displays version information. | No | -| -f | Runs the script in the foreground without backgrounding it. | No | -| -d | Starts the process in daemon mode (runs in the background). | No | -| -p | Specifies a file to store the process ID for process management. | No | -| -c | Specifies the path to the configuration folder; the script loads configuration files from this location. | No | -| -g | Prints detailed garbage collection (GC) information. | No | -| -H | Specifies the path for the Java heap dump file, used during JVM memory overflow. | No | -| -E | Specifies the file for JVM error logs. | No | -| -D | Defines system properties in the format `key=value`. | No | -| -X | Passes `-XX` options directly to the JVM. | No | -| -h | Displays the help instructions.
| No | diff --git a/src/UserGuide/latest/Deployment-and-Maintenance/Deployment-form_timecho.md b/src/UserGuide/latest/Deployment-and-Maintenance/Deployment-form_timecho.md deleted file mode 100644 index b2daee47f..000000000 --- a/src/UserGuide/latest/Deployment-and-Maintenance/Deployment-form_timecho.md +++ /dev/null @@ -1,63 +0,0 @@ - -# Deployment form - -IoTDB has two operation modes: standalone mode and cluster mode. - -## 1. Standalone Mode - -An IoTDB standalone instance includes 1 ConfigNode and 1 DataNode, i.e., 1C1D. - -- **Features**: Easy for developers to install and deploy, with low deployment and maintenance costs and convenient operations. -- **Use Cases**: Scenarios with limited resources or low high-availability requirements, such as edge servers. -- **Deployment Method**: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -## 2. Dual-Active Mode - -Dual-Active Deployment is a feature of TimechoDB, where two independent instances synchronize bidirectionally and can provide services simultaneously. If one instance stops and restarts, the other instance will resume data transfer from the breakpoint. - -> An IoTDB Dual-Active instance typically consists of 2 standalone nodes, i.e., 2 sets of 1C1D. Each instance can also be a cluster. - -- **Features**: The high-availability solution with the lowest resource consumption. -- **Use Cases**: Scenarios with limited resources (only two servers) but requiring high availability. -- **Deployment Method**: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -## 3. Cluster Mode - -An IoTDB cluster instance consists of 3 ConfigNodes and no fewer than 3 DataNodes, typically 3 DataNodes, i.e., 3C3D. If some nodes fail, the remaining nodes can still provide services, ensuring high availability of the database. Performance can be improved by adding DataNodes. 
- -- **Features**: High availability, high scalability, and improved system performance by adding DataNodes. -- **Use Cases**: Enterprise-level application scenarios requiring high availability and reliability. -- **Deployment Method**: [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - - - -## 4. Feature Summary - -| **Dimension** | **Stand-Alone Mode** | **Dual-Active Mode** | **Cluster Mode** | -| :-------------------------- | :------------------------------------------------------- | :------------------------------------------------------ | :------------------------------------------------------ | -| Use Cases | Edge-side deployment, low high-availability requirements | High-availability services, disaster recovery scenarios | High-availability services, disaster recovery scenarios | -| Number of Machines Required | 1 | 2 | ≥3 | -| Security and Reliability | Cannot tolerate single-point failure | High, can tolerate single-point failure | High, can tolerate single-point failure | -| Scalability | Can expand DataNodes to improve performance | Each instance can be scaled as needed | Can expand DataNodes to improve performance | -| Performance | Can scale with the number of DataNodes | Same as one of the instances | Can scale with the number of DataNodes | - -- The deployment steps for Stand-Alone Mode and Cluster Mode are similar (adding ConfigNodes and DataNodes one by one), with differences only in the number of replicas and the minimum number of nodes required to provide services. \ No newline at end of file diff --git a/src/UserGuide/latest/Deployment-and-Maintenance/Docker-Deployment_timecho.md b/src/UserGuide/latest/Deployment-and-Maintenance/Docker-Deployment_timecho.md deleted file mode 100644 index 5d1d89d05..000000000 --- a/src/UserGuide/latest/Deployment-and-Maintenance/Docker-Deployment_timecho.md +++ /dev/null @@ -1,496 +0,0 @@ - -# Docker Deployment - -## 1. 
Environmental Preparation - -### 1.1 Docker Installation - -```Bash -#Taking Ubuntu as an example; for other operating systems, consult the corresponding installation instructions -#step1: Install the required system tools -sudo apt-get update -sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common -#step2: Install the GPG certificate -curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add - -#step3: Add the software source -sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" -#step4: Update and install Docker CE -sudo apt-get -y update -sudo apt-get -y install docker-ce -#step5: Enable Docker to start automatically on boot -sudo systemctl enable docker -#step6: Verify that Docker was installed successfully -docker --version #Version information is displayed if the installation succeeded -``` - -### 1.2 Docker-compose Installation - -```Bash -#Installation command -curl -L "https://github.com/docker/compose/releases/download/v2.20.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose -chmod +x /usr/local/bin/docker-compose -ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose -#Verify that the installation was successful -docker-compose --version #Version information is displayed if the installation succeeded -``` - -### 1.3 Install The Dmidecode Plugin - -dmidecode is usually installed on Linux servers by default. If it is not, install it with the following command. - -```Bash -sudo apt-get install dmidecode -``` - -After installing dmidecode, look up its installation path with `whereis dmidecode`. Assuming the result is `/usr/sbin/dmidecode`, remember this path, as it will be used later in the docker-compose yml file. - -### 1.4 Get Container Image Of IoTDB - -You can contact business or technical support to obtain container images for IoTDB Enterprise Edition. - -## 2.
Stand-Alone Deployment - -This section demonstrates how to deploy a standalone Docker version of 1C1D. - -### 2.1 Load Image File - -For example, the container image file name of IoTDB obtained here is: `iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz` - -Load image: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -View image: - -```Bash -docker images -``` - -![](/img/%E5%8D%95%E6%9C%BA-%E6%9F%A5%E7%9C%8B%E9%95%9C%E5%83%8F.png) - -### 2.2 Create Docker Bridge Network - -```Bash -docker network create --driver=bridge --subnet=172.18.0.0/16 --gateway=172.18.0.1 iotdb -``` - -### 2.3 Write The Yml File For docker-compose - -In this example, the IoTDB installation directory and the yml file are both placed in the `/docker-iotdb` folder: - -The file directory structure is: `/docker-iotdb/iotdb`, `/docker-iotdb/docker-compose-standalone.yml` - -```Bash -docker-iotdb: -├── iotdb #IoTDB installation directory -│── docker-compose-standalone.yml #YML file for standalone Docker Compose -``` - -The complete docker-compose-standalone.yml content is as follows: - -```Bash -version: "3" -services: - iotdb-service: - image: timecho/timechodb:2.0.2.1-standalone #The image used - hostname: iotdb - container_name: iotdb - restart: always - ports: - - "6667:6667" - environment: - - cn_internal_address=iotdb - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb:10710 - - dn_rpc_address=iotdb - - dn_internal_address=iotdb - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - dn_seed_config_node=iotdb:10710 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - networks: - iotdb: - ipv4_address: 172.18.0.6 - # Note: Some environments set an extremely high
container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -networks: - iotdb: - external: true -``` - -### 2.4 First Launch - -Use the following command to start: - -```Bash -cd /docker-iotdb -docker-compose -f docker-compose-standalone.yml up -``` - -Because the instance has not been activated yet, it is normal for it to exit immediately on the first startup. The purpose of the first startup is to generate the machine code file used in the subsequent activation process. - -![](/img/%E5%8D%95%E6%9C%BA-%E6%BF%80%E6%B4%BB.png) - -### 2.5 Apply For Activation - -- After the first startup, a system_info file is generated in the physical machine directory `/docker-iotdb/iotdb/activation`. Send this file to the Timecho staff. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- After receiving the license file returned by the staff, copy it to the `/docker-iotdb/iotdb/activation` folder. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -### 2.6 Restart IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -![](/img/%E5%90%AF%E5%8A%A8iotdb.png) - -### 2.7 Validate Deployment - -- View the log; the following message indicates a successful startup - - ```Bash - docker logs -f iotdb #View log command (the container is named iotdb in the yml above) - 2024-07-19 12:02:32,608 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself!
- ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B21.png) - -- Enter the container to view the service running status and activation information - - View the launched container - - ```Bash - docker ps - ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B22.png) - - Enter the container, log in to the database through CLI, and use the `show cluster` command to view the service status and activation status - - ```Bash - docker exec -it iotdb /bin/bash #Enter the container - ./start-cli.sh -h iotdb #Log in to the database - IoTDB> show cluster #View status - ``` - - You can see that all services are running and the activation status shows as activated. - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B23.png) - -### 2.8 Map /conf Directory (optional) - -If you want to modify the configuration files directly on the physical machine in the future, you can map the `/conf` folder in the container in three steps: - -Step 1: Copy the `/conf` directory from the container to `/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -Step 2: Add mappings in docker-compose-standalone.yml - -```Bash - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -Step 3: Restart IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -## 3. Cluster Deployment - -This section describes how to manually deploy an instance that includes 3 ConfigNodes and 3 DataNodes, commonly known as a 3C3D cluster. - -
- -
- -**Note: The cluster version currently only supports host and overlay networks, and does not support bridge networks.** - -Taking the host network as an example, we will demonstrate how to deploy a 3C3D cluster. - -### 3.1 Set Host Name - -Assuming there are currently three Linux servers, the IP addresses and service role assignments are as follows: - -| Node IP | Host Name | Service | -| ----------- | --------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode, DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode, DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode, DataNode | - -Configure the host names on each of the three machines. To set the host names, configure `/etc/hosts` on the target server using the following command: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### 3.2 Load Image File - -For example, the container image file name obtained for IoTDB is: `iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz` - -Execute the load image command on each of the three servers: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -View image: - -```Bash -docker images -``` - -![](/img/%E9%95%9C%E5%83%8F%E5%8A%A0%E8%BD%BD.png) - -### 3.3 Write The Yml File For Docker Compose - -In this example, the IoTDB installation directory and the yml files are all placed in the `/docker-iotdb` folder: - -The file directory structure is: `/docker-iotdb/iotdb`, `/docker-iotdb/confignode.yml`, `/docker-iotdb/datanode.yml` - -```Bash -docker-iotdb: -├── confignode.yml #Yml file of confignode -├── datanode.yml #Yml file of datanode -└── iotdb #IoTDB installation directory -``` - -On each server, two yml files need to be written, namely `confignode.yml` and `datanode.yml`.
The example of yml is as follows: - -**confignode.yml:** - -```Bash -#confignode.yml -version: "3" -services: - iotdb-confignode: - image: iotdb-enterprise:2.0.x.x-standalone #The image used - hostname: iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - container_name: iotdb-confignode - command: ["bash", "-c", "entrypoint.sh confignode"] - restart: always - environment: - - cn_internal_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb-1:10710 #The default first node is the seed node - - schema_replication_factor=3 #Number of metadata copies - - data_replication_factor=2 #Number of data replicas - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #Using the host network - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -**datanode.yml:** - -```Bash -#datanode.yml -version: "3" -services: - iotdb-datanode: - image: iotdb-enterprise:2.0.x.x-standalone #The image used - hostname: iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - container_name: iotdb-datanode - command: ["bash", "-c", "entrypoint.sh datanode"] - restart: always - ports: - - "6667:6667" - privileged: true - environment: - - dn_rpc_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - dn_internal_address=iotdb-1|iotdb-2|iotdb-3 #Choose from three options based on the actual situation - - dn_seed_config_node=iotdb-1:10710 #The default first node is the seed node - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - schema_replication_factor=3 #Number of metadata copies - - data_replication_factor=2 #Number of data replicas - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #Using the host network - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -### 3.4 Starting Confignode For The First Time - -First, start configNodes on each of the three servers to obtain the machine code. Pay attention to the startup order, start the first iotdb-1 first, then start iotdb-2 and iotdb-3. 
- -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d #Background startup -``` - -### 3.5 Apply For Activation - -- After the three ConfigNodes start for the first time, a system_info file is generated in the `/docker-iotdb/iotdb/activation` directory on each physical machine. Send the system_info files of all three servers to the Timecho staff; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- Put each of the three license files into the `/docker-iotdb/iotdb/activation` folder of the corresponding ConfigNode; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -- After the license is placed in the corresponding activation folder, the ConfigNode is activated automatically; no restart is required - -### 3.6 Start Datanode - -Start a DataNode on each of the 3 servers - -```Bash -cd /docker-iotdb -docker-compose -f datanode.yml up -d #Background startup -``` - -![](/img/%E9%9B%86%E7%BE%A4%E7%89%88-dn%E5%90%AF%E5%8A%A8.png) - -### 3.7 Validate Deployment - -- View the logs; the following message indicates that the DataNode has started successfully - - ```Bash - docker logs -f iotdb-datanode #View log command - 2024-07-20 16:50:48,937 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself!
- ``` - - ![](/img/dn%E5%90%AF%E5%8A%A8.png) - -- Enter any container to view the service running status and activation information - - View the launched container - - ```Bash - docker ps - ``` - - ![](/img/%E6%9F%A5%E7%9C%8B%E5%AE%B9%E5%99%A8.png) - - Enter the container, log in to the database through CLI, and use the `show cluster` command to view the service status and activation status - - ```Bash - docker exec -it iotdb-datanode /bin/bash #Enter the container - ./start-cli.sh -h iotdb-1 #Log in to the database - IoTDB> show cluster #View status - ``` - - You can see that all services are running and the activation status shows as activated. - - ![](/img/%E9%9B%86%E7%BE%A4-%E6%BF%80%E6%B4%BB.png) - -### 3.8 Map /conf Directory (optional) - -If you want to modify the configuration files directly on the physical machine in the future, you can map the `/conf` folder in the container in three steps: - -Step 1: Copy the `/conf` directory from the container to `/docker-iotdb/iotdb/conf` on each of the three servers - -```Bash -docker cp iotdb-confignode:/iotdb/conf /docker-iotdb/iotdb/conf -# or -docker cp iotdb-datanode:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -Step 2: Add `/conf` directory mapping in `confignode.yml` and `datanode.yml` on 3 servers - -```Bash -#confignode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - -#datanode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #Add mapping for this /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -Step 3: Restart IoTDB on 3 servers - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d -docker-compose -f datanode.yml up -d -``` - diff --git a/src/UserGuide/latest/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md b/src/UserGuide/latest/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md deleted file mode 100644 index 8bf34b405..000000000 --- a/src/UserGuide/latest/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md +++ /dev/null @@ -1,204 +0,0 @@ - -# Dual Active Deployment - -## 1. What is a dual active version? - -Dual active usually refers to two independent machines (or clusters) that perform real-time mirror synchronization. Their configurations are completely independent and both can receive external writes at the same time. Each independent machine (or cluster) synchronizes the data written to it to the other machine (or cluster), so that the data of the two machines (or clusters) reaches eventual consistency. - -- Two standalone machines (or clusters) can form a high availability group: when one of the standalone machines (or clusters) stops serving, the other standalone machine (or cluster) will not be affected. When the machine (or cluster) that stopped serving is restarted, the other machine (or cluster) synchronizes the newly written data to it.
Applications can connect to both standalone machines (or clusters) for read and write operations, thereby achieving high availability. -- The dual active deployment scheme allows for high availability with fewer than 3 physical nodes and has certain advantages in deployment costs. At the same time, the physical supply isolation of two sets of single machines (or clusters) can be achieved through the dual ring network of power and network, ensuring the stability of operation. -- At present, the dual active capability is a feature of the enterprise version. - -![](/img/20240731104336.png) - -## 2. Note - -1. It is recommended to use a `hostname` rather than an IP address in the configuration during deployment, to avoid database failures caused by changing the host IP at a later stage. To set the hostname, you need to configure `/etc/hosts` on the target server. If the local IP is 192.168.1.3 and the hostname is iotdb-1, you can use the following command to set the server's hostname, and then configure IoTDB's `cn_internal_address` and `dn_internal_address` using the hostname. - - ```Bash - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -2. Some parameters cannot be modified after the first startup; please refer to the "Installation Steps" section below to set them. - -3. It is recommended to deploy a monitoring panel, which can monitor important operational indicators and keep track of database operation status at any time. The monitoring panel can be obtained by contacting the business department. For the deployment steps, refer to [Monitoring Panel Deployment](https://www.timecho.com/docs/UserGuide/latest/Deployment-and-Maintenance/Monitoring-panel-deployment.html) - -## 3. Installation Steps - -Taking the dual active version IoTDB built by two single machines A and B as an example, the IP addresses of A and B are 192.168.1.3 and 192.168.1.4, respectively. Here, we use hostname to represent different hosts.
The plan is as follows: - -| Machine | Machine IP | Host Name | -| ------- | ----------- | --------- | -| A | 192.168.1.3 | iotdb-1 | -| B | 192.168.1.4 | iotdb-2 | - -### 3.1 Install Two Independent IoTDBs Separately - -Install IoTDB on the two machines separately. For the standalone version, refer to [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md). For the cluster version, refer to [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md). **It is recommended that the configurations of clusters A and B remain consistent to achieve the best dual active effect** - -### 3.2 Create A Data Synchronization Task On Machine A To Machine B - -- Create a data synchronization process on machine A, where the data on machine A is automatically synchronized to machine B. Use the cli tool in the sbin directory to connect to the IoTDB database on machine A: - - ```Bash - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-1 - - # Windows - # Before version V2.0.4.x - .\sbin\start-cli.bat -h iotdb-1 - - # V2.0.4.x and later versions - .\sbin\windows\start-cli.bat -h iotdb-1 - ``` - -- Create and start the pipe with the following SQL: - - ```Bash - create pipe AB - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-2', - 'sink.port'='6667' - ) - ``` - -- Note: To avoid infinite data loops, set the parameter `source.forwarding-pipe-requests` to `false` on both A and B, so that data received from another pipe is not forwarded again. - -### 3.3 Create A Data Synchronization Task On Machine B To Machine A - -- Create a data synchronization process on machine B, where the data on machine B is automatically synchronized to machine A.
Use the cli tool in the sbin directory to connect to the IoTDB database on machine B: - - ```Bash - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-2 - - # Windows - # Before version V2.0.4.x - .\sbin\start-cli.bat -h iotdb-2 - - # V2.0.4.x and later versions - .\sbin\windows\start-cli.bat -h iotdb-2 - ``` - -- Create and start the pipe with the following SQL: - - ```SQL - create pipe BA - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-1', - 'sink.port'='6667' - ) - ``` - -- Note: To avoid infinite data loops, the parameter `source.forwarding-pipe-requests` must be set to `false` on both A and B, indicating that data transmitted from another pipe will not be forwarded. - -### 3.4 Validate Deployment - -After the above data synchronization pipes are created, the dual active cluster can be started. - -#### Check the running status of the cluster - -```Bash -# Execute the show cluster command on both nodes to check the status of the IoTDB service -show cluster -``` - -**Machine A**: - -![](/img/%E5%8F%8C%E6%B4%BB-A.png) - -**Machine B**: - -![](/img/%E5%8F%8C%E6%B4%BB-B.png) - -Ensure that every ConfigNode and DataNode is in the Running state. - -#### Check synchronization status - -- Check the synchronization status on machine A - -```Bash -show pipes -``` - -![](/img/show%20pipes-A.png) - -- Check the synchronization status on machine B - -```Bash -show pipes -``` - -![](/img/show%20pipes-B.png) - -Ensure that every pipe is in the RUNNING state.
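
The pipe status check can also be scripted. The following is a minimal sketch rather than an official tool: it assumes that `show pipes` prints the usual `|`-delimited table whose first `|` row is the column header, and that the CLI accepts `-e` to run a single statement and exit in your version. The function name `all_pipes_running` is hypothetical.

```shell
# Reads the text output of "show pipes" on stdin.
# Succeeds only if at least one pipe row exists and every row shows RUNNING.
all_pipes_running() {
  awk '
    /^[|]/ {
      seen++
      if (seen == 1) next                      # first "|" row is the column header
      rows++
      if ($0 !~ /[|][ ]*RUNNING[ ]*[|]/) bad++ # row without a RUNNING cell
    }
    END { if (rows == 0 || bad > 0) exit 1 }
  '
}

# Possible usage on machine A (assumes the -e option is supported):
#   ./sbin/start-cli.sh -h iotdb-1 -e "show pipes" | all_pipes_running && echo "pipes OK"
```

Run the same check against `iotdb-2` to cover the pipe in the opposite direction.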
- -### 3.5 Stop Dual Active Version IoTDB - -- Execute the following command on machine A: - - ```SQL - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-1 #Log in to CLI - IoTDB> stop pipe AB #Stop the data synchronization process - ./sbin/stop-standalone.sh #Stop database service - - # Windows - # Before version V2.0.4.x - .\sbin\start-cli.bat -h iotdb-1 - IoTDB> stop pipe AB - .\sbin\stop-standalone.bat - - # V2.0.4.x and later versions - .\sbin\windows\start-cli.bat -h iotdb-1 - IoTDB> stop pipe AB - .\sbin\windows\stop-standalone.bat - ``` - -- Execute the following command on machine B: - - ```SQL - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-2 #Log in to CLI - IoTDB> stop pipe BA #Stop the data synchronization process - ./sbin/stop-standalone.sh #Stop database service - - # Windows - # Before version V2.0.4.x - .\sbin\start-cli.bat -h iotdb-2 - IoTDB> stop pipe BA - .\sbin\stop-standalone.bat - - # V2.0.4.x and later versions - .\sbin\windows\start-cli.bat -h iotdb-2 - IoTDB> stop pipe BA - .\sbin\windows\stop-standalone.bat - ``` - diff --git a/src/UserGuide/latest/Deployment-and-Maintenance/IoTDB-Package_timecho.md b/src/UserGuide/latest/Deployment-and-Maintenance/IoTDB-Package_timecho.md deleted file mode 100644 index c2bffcf22..000000000 --- a/src/UserGuide/latest/Deployment-and-Maintenance/IoTDB-Package_timecho.md +++ /dev/null @@ -1,48 +0,0 @@ - -# Obtain TimechoDB - -## 1. How to obtain TimechoDB - -The TimechoDB installation package can be obtained through product trial application or by directly contacting the Timecho team. - -## 2. 
Installation Package Structure - -After unpacking the installation package (`iotdb-enterprise-{version}-bin.zip`), you will see the following directory structure: - -| **Catalogue** | **Type** | **Description** | -| :--------------- | :------- | :----------------------------------------------------------- | -| activation | Folder | Directory for activation files, including the generated machine code and the TimechoDB activation code obtained from Timecho staff. *(This directory is generated after starting the ConfigNode, enabling you to obtain the activation code.)* | -| conf | Folder | Configuration files directory, containing ConfigNode, DataNode, JMX, and logback configuration files. | -| data | Folder | Default data file directory, containing data files for ConfigNode and DataNode. *(This directory is generated after starting the program.)* | -| lib | Folder | Library files directory. | -| licenses | Folder | Directory for open-source license certificates. | -| logs | Folder | Default log file directory, containing log files for ConfigNode and DataNode. *(This directory is generated after starting the program.)* | -| sbin | Folder | Main scripts directory, containing scripts for starting, stopping, and managing the database. | -| tools | Folder | Tools directory. | -| ext | Folder | Directory for pipe, trigger, and UDF plugin-related files. | -| LICENSE | File | Open-source license file. | -| NOTICE | File | Open-source notice file. | -| README_ZH.md | File | User manual (Chinese version). | -| README.md | File | User manual (English version). | -| RELEASE_NOTES.md | File | Release notes. | - -Note: As of version V2.0.8.2, the TimechoDB installation package does not include the MQTT service and REST service JAR files by default. If you need to use them, please contact the Timecho team to obtain them.
\ No newline at end of file diff --git a/src/UserGuide/latest/Deployment-and-Maintenance/Kubernetes_timecho.md b/src/UserGuide/latest/Deployment-and-Maintenance/Kubernetes_timecho.md deleted file mode 100644 index da93a7072..000000000 --- a/src/UserGuide/latest/Deployment-and-Maintenance/Kubernetes_timecho.md +++ /dev/null @@ -1,445 +0,0 @@ - - -# Kubernetes - -## 1. Environment Preparation - -### 1.1 Prepare a Kubernetes Cluster - -Ensure that you have an available Kubernetes cluster (minimum recommended version: Kubernetes 1.24) as the foundation for deploying the IoTDB cluster. - -Kubernetes Version Requirement: The recommended version is Kubernetes 1.24 or above. - -IoTDB Version Requirement: The version of TimechoDB must not be lower than v1.3.3.2. - -## 2. Create Namespace - -### 2.1 Create Namespace - -> Note: Before executing the namespace creation operation, verify that the specified namespace name has not been used in the Kubernetes cluster. If the namespace already exists, the creation command will fail, which may lead to errors during the deployment process. - -```Bash -kubectl create ns iotdb-ns -``` - -### 2.2 View Namespace - -```Bash -kubectl get ns -``` - -## 3. Create PersistentVolume (PV) - -### 3.1 Create PV Configuration File - -PV is used for persistent storage of IoTDB's ConfigNode and DataNode data. You need to create one PV for each node. - -> Note: One ConfigNode and one DataNode count as two nodes, requiring two PVs. - -For example, with 3 ConfigNodes and 3 DataNodes: - -1. Create a `pv.yaml` file and make six copies, renaming them to `pv01.yaml` through `pv06.yaml`. - -```Bash -# Create a directory to store YAML files -# Create pv.yaml file -touch pv.yaml -``` - -2. Modify the `name` and `path` in each file to ensure consistency. 
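
The copy-and-rename steps above can be sketched as a small shell function. This is only an illustration under stated assumptions: the template `pv.yaml` uses the suffix `iotdb-pv-01` in both `metadata.name` and `hostPath.path`, and the function name `make_pv_files` is hypothetical.

```shell
# Generate pv01.yaml ... pvNN.yaml next to the template,
# replacing the "iotdb-pv-01" suffix with a zero-padded index in each copy.
make_pv_files() {
  pv_template="$1"   # path to the template pv.yaml
  pv_count="$2"      # number of PV files to generate
  pv_dir=$(dirname "$pv_template")
  pv_i=1
  while [ "$pv_i" -le "$pv_count" ]; do
    pv_suffix=$(printf '%02d' "$pv_i")
    sed "s/iotdb-pv-01/iotdb-pv-${pv_suffix}/g" "$pv_template" > "${pv_dir}/pv${pv_suffix}.yaml"
    pv_i=$((pv_i + 1))
  done
}

# Possible usage for 3 ConfigNodes and 3 DataNodes:
#   make_pv_files ./pv.yaml 6
```

Review each generated file before applying it, since storage capacity or storage class may need to differ per node.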
- -**pv.yaml Example:** - -```YAML -# pv.yaml -apiVersion: v1 -kind: PersistentVolume -metadata: - name: iotdb-pv-01 -spec: - capacity: - storage: 10Gi # Storage capacity - accessModes: # Access modes - - ReadWriteOnce - persistentVolumeReclaimPolicy: Retain # Reclaim policy - # Storage class name, if using local static storage, do not configure; if using dynamic storage, this must be set - storageClassName: local-storage - # Add the corresponding configuration based on your storage type - hostPath: # If using a local path - path: /data/k8s-data/iotdb-pv-01 - type: DirectoryOrCreate # If this line is not configured, you need to manually create the directory -``` - -### 3.2 Apply PV Configuration - -```Bash -kubectl apply -f pv01.yaml -kubectl apply -f pv02.yaml -... -``` - -### 3.3 View PV - -```Bash -kubectl get pv -``` - - -### 3.4 Manually Create Directories - -> Note: If the type in the hostPath of the YAML file is not configured, you need to manually create the corresponding directories. - -Create the corresponding directories on all Kubernetes nodes: -```Bash -mkdir -p /data/k8s-data/iotdb-pv-01 -mkdir -p /data/k8s-data/iotdb-pv-02 -... -``` - -## 4. Install Helm - -For installation steps, please refer to the [Helm official website](https://helm.sh/zh/docs/intro/install/). - -## 5. Configure IoTDB Helm Chart - -### 5.1 Clone IoTDB Kubernetes Deployment Code - -Please contact Timecho staff to obtain the IoTDB Helm Chart.
If you encounter proxy issues, disable the proxy settings: - -### 5.2 Modify YAML Files - -> Ensure that the version used is supported (>=1.3.3.2): - -**values.yaml Example:** - -```YAML -nameOverride: "iotdb" -fullnameOverride: "iotdb" # Name after installation - -image: - repository: nexus.infra.timecho.com:8143/timecho/iotdb-enterprise - pullPolicy: IfNotPresent - tag: 1.3.3.2-standalone # Repository and version used - -storage: - # Storage class name, if using local static storage, do not configure; if using dynamic storage, this must be set - className: local-storage - -datanode: - name: datanode - nodeCount: 3 # Number of DataNode nodes - enableRestService: true - storageCapacity: 10Gi # Available space for DataNode - resources: - requests: - memory: 2Gi # Initial memory size for DataNode - cpu: 1000m # Initial CPU size for DataNode - limits: - memory: 4Gi # Maximum memory size for DataNode - cpu: 1000m # Maximum CPU size for DataNode - -confignode: - name: confignode - nodeCount: 3 # Number of ConfigNode nodes - storageCapacity: 10Gi # Available space for ConfigNode - resources: - requests: - memory: 512Mi # Initial memory size for ConfigNode - cpu: 1000m # Initial CPU size for ConfigNode - limits: - memory: 1024Mi # Maximum memory size for ConfigNode - cpu: 2000m # Maximum CPU size for ConfigNode - configNodeConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - schemaReplicationFactor: 3 - schemaRegionConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - dataReplicationFactor: 2 - dataRegionConsensusProtocolClass: org.apache.iotdb.consensus.iot.IoTConsensus -``` - -## 6. Configure Private Repository Information or Pre-Pull Images - -Configure private repository information on k8s as a prerequisite for the next `helm install` step. - -Option one is to pull the IoTDB image from the private repository during `helm install`, while option two is to import the IoTDB image into containerd in advance.
- -### 6.1 [Option 1] Pull Image from Private Repository - -#### 6.1.1 Create a Secret to Allow k8s to Access the IoTDB Helm Private Repository - -Replace xxxxxx with the IoTDB private repository account, password, and email. - - - -```Bash -# Note the single quotes -kubectl create secret docker-registry timecho-nexus \ - --docker-server='nexus.infra.timecho.com:8143' \ - --docker-username='xxxxxx' \ - --docker-password='xxxxxx' \ - --docker-email='xxxxxx' \ - -n iotdb-ns - -# View the secret -kubectl get secret timecho-nexus -n iotdb-ns -# View and output as YAML -kubectl get secret timecho-nexus --output=yaml -n iotdb-ns -# View and decode (the secret data is base64-encoded, not encrypted) -kubectl get secret timecho-nexus --output="jsonpath={.data.\.dockerconfigjson}" -n iotdb-ns | base64 --decode -``` - -#### 6.1.2 Load the Secret as a Patch to the Namespace iotdb-ns - -```Bash -# Add a patch to include login information for nexus in this namespace -kubectl patch serviceaccount default -n iotdb-ns -p '{"imagePullSecrets": [{"name": "timecho-nexus"}]}' - -# View the information in this namespace -kubectl get serviceaccounts -n iotdb-ns -o yaml -``` - -### 6.2 [Option 2] Import Image - -This step is for scenarios where the customer cannot connect to the private repository and requires assistance from company implementation staff. - -#### 6.2.1 Pull the Image: - -```Bash -ctr images pull --user xxxxxxxx nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.2 View and Export the Image: - -```Bash -# View -ctr images ls - -# Export -ctr images export iotdb-enterprise:1.3.3.2-standalone.tar nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.3 Import into the k8s Namespace: - -> Note that k8s.io is the namespace for ctr in the example environment; importing to other namespaces will not work.
- -```Bash -# Import into the k8s namespace -ctr -n k8s.io images import iotdb-enterprise:1.3.3.2-standalone.tar -``` - -#### 6.2.4 View the Image: - -```Bash -ctr --namespace k8s.io images list | grep 1.3.3.2 -``` - -## 7. Install IoTDB - -### 7.1 Install IoTDB - -```Bash -# Enter the directory -cd iotdb-cluster-k8s/helm - -# Install IoTDB -helm install iotdb ./ -n iotdb-ns -``` - -### 7.2 View Helm Installation List - -```Bash -# helm list -helm list -n iotdb-ns -``` - -### 7.3 View Pods - -```Bash -# View IoTDB pods -kubectl get pods -n iotdb-ns -o wide -``` - -After executing the command, if the output shows 6 Pods with confignode and datanode labels (3 each), it indicates a successful installation. Note that not all Pods may be in the Running state initially; inactive datanode Pods may keep restarting but will normalize after activation. - -### 7.4 Troubleshooting - -```Bash -# View k8s creation logs -kubectl get events -n iotdb-ns -watch kubectl get events -n iotdb-ns - -# Get detailed information -kubectl describe pod confignode-0 -n iotdb-ns -kubectl describe pod datanode-0 -n iotdb-ns - -# View ConfigNode logs -kubectl logs -n iotdb-ns confignode-0 -f -``` - -## 8. Activate IoTDB - -### 8.1 Option 1: Activate Directly in the Pod (Quickest) - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-1 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-2 -- /iotdb/sbin/start-activate.sh -# Obtain the machine code and proceed with activation -``` - -### 8.2 Option 2: Activate Inside the ConfigNode Container - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /bin/bash -cd /iotdb/sbin -/bin/bash start-activate.sh -# Obtain the machine code and proceed with activation -# Exit the container -``` - -### 8.3 Option 3: Manual Activation - -1. 
View ConfigNode details to determine the node: - -```Bash -kubectl describe pod confignode-0 -n iotdb-ns | grep -e "Node:" -e "Path:" - -# Example output: -# Node: a87/172.20.31.87 -# Path: /data/k8s-data/env/confignode/.env -``` - -2. View PVC and find the corresponding Volume for ConfigNode to determine the path: - -```Bash -kubectl get pvc -n iotdb-ns | grep "confignode-0" -# Example output: -# map-confignode-confignode-0 Bound iotdb-pv-04 10Gi RWO local-storage 8h - -# To view multiple ConfigNodes, use the following: -for i in {0..2}; do echo confignode-$i; kubectl describe pod confignode-${i} -n iotdb-ns | grep -e "Node:" -e "Path:"; done -``` - -3. View the detailed information of the corresponding Volume to determine the physical directory location: - - -```Bash -kubectl describe pv iotdb-pv-04 | grep "Path:" - -# Example output: -# Path: /data/k8s-data/iotdb-pv-04 -``` - -4. On the corresponding node, locate the system-info file in that directory and use it as the machine code to generate an activation code. Then create a new file named license in the same directory and write the activation code into it. - -## 9. Verify IoTDB - -### 9.1 Check the Status of Pods within the Namespace - -View the IP, status, and other information of the pods in the iotdb-ns namespace to ensure they are all running normally.
- -```Bash -kubectl get pods -n iotdb-ns -o wide - -# Example output: -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` - -### 9.2 Check the Port Mapping within the Namespace - -```Bash -kubectl get svc -n iotdb-ns - -# Example output: -# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE -# confignode-svc NodePort 10.10.226.151 80:31026/TCP 7d8h -# datanode-svc NodePort 10.10.194.225 6667:31563/TCP 7d8h -# jdbc-balancer LoadBalancer 10.10.191.209 6667:31895/TCP 7d8h -``` - -### 9.3 Start the CLI Script on Any Server to Verify the IoTDB Cluster Status - -Use the port of jdbc-balancer and the IP of any k8s node. - -```Bash -start-cli.sh -h 172.20.31.86 -p 31895 -start-cli.sh -h 172.20.31.87 -p 31895 -start-cli.sh -h 172.20.31.88 -p 31895 -``` - - - -## 10. Scaling - -### 10.1 Add New PV - -Add a new PV; scaling is only possible with available PVs. - - - -**Note: DataNode cannot join the cluster after restart** - -**Reason**:The static storage hostPath mode is configured, and the script modifies the `iotdb-system.properties` file to set `dn_data_dirs` to `/iotdb6/iotdb_data,/iotdb7/iotdb_data`. However, the default storage path `/iotdb/data` is not mounted, leading to data loss upon restart. -**Solution**:Mount the `/iotdb/data` directory as well, and ensure this setting is applied to both ConfigNode and DataNode to maintain data integrity and cluster stability. - -### 10.2 Scale ConfigNode - -Example: Scale from 3 ConfigNodes to 4 ConfigNodes - -Modify the values.yaml file in iotdb-cluster-k8s/helm to change the number of ConfigNodes from 3 to 4. - -```Shell -helm upgrade iotdb . 
-n iotdb-ns -``` - - - - -### 10.3 Scale DataNode - -Example: Scale from 3 DataNodes to 4 DataNodes - -Modify the values.yaml file in iotdb-cluster-k8s/helm to change the number of DataNodes from 3 to 4. - -```Shell -helm upgrade iotdb . -n iotdb-ns -``` - -### 10.4 Verify IoTDB Status - -```Shell -kubectl get pods -n iotdb-ns -o wide - -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -# datanode-3 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` diff --git a/src/UserGuide/latest/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md b/src/UserGuide/latest/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md deleted file mode 100644 index 608491c6f..000000000 --- a/src/UserGuide/latest/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md +++ /dev/null @@ -1,320 +0,0 @@ - -# Stand-Alone Deployment - -This guide introduces how to set up a standalone TimechoDB instance, which includes one ConfigNode and one DataNode (commonly referred to as 1C1D). - -## 1. Prerequisites - -1. [System configuration](./Environment-Requirements.md): Ensure the system has been configured according to the preparation guidelines. - -2. **IP Configuration**: It is recommended to use hostnames for IP configuration to prevent issues caused by IP address changes. Set the hostname by editing the `/etc/hosts` file. For example, if the local IP is `192.168.1.3` and the hostname is `iotdb-1`, run: - - ```shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - - Use the hostname for `cn_internal_address` and `dn_internal_address` in IoTDB configuration. - -3. 
**Unmodifiable Parameters**: Some parameters cannot be changed after the first startup. Refer to the Parameter Configuration section. - -4. **Installation Path**: Ensure the installation path contains no spaces or non-ASCII characters to prevent runtime issues. - -5. **User Permissions**: Choose one of the following permissions during installation and deployment: - - **Root User (Recommended)**: This avoids permission-related issues. - - **Non-Root User**: - - Use the same user for all operations, including starting, activating, and stopping services. - - Avoid using `sudo`, which can cause permission conflicts. - -6. **Monitoring Panel**: Deploy a monitoring panel to track key performance metrics. Contact the Timecho team for access and refer to the "[Monitoring Board Install and Deploy](./Monitoring-panel-deployment.md)" guide. - -7. **Health Check Tool**: Before installation, the health check tool can inspect the operating environment of IoTDB nodes and produce detailed inspection results. For usage, see [Health Check Tool](../Tools-System/Health-Check-Tool.md). - - -## 2. Installation Steps - -### 2.1 Pre-installation Check - -To ensure the IoTDB Enterprise Edition installation package you obtained is complete and authentic, we recommend performing a SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Find the "SHA512 Checksum" corresponding to each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/iotdb): - ```Bash - cd /data/iotdb - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. 
The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-02.png) - -4. Compare the output result with the official SHA512 checksum. Once confirmed that they match, you can proceed with the installation and deployment operations in accordance with the procedures below. - -#### Notes: - -- If the verification results do not match, please contact Timecho Team to re-obtain the installation package. -- If a "file not found" prompt appears during verification, check whether the file path is correct or if the installation package has been fully downloaded. - -### 2.2 Extract Installation Package - -Unzip the installation package and navigate to the directory: - -```Plain -unzip timechodb-{version}-bin.zip -cd timechodb-{version}-bin -``` - -### 2.3 Parameter Configuration - -#### Memory Configuration - -Edit the following files for memory allocation: - -- **ConfigNode**: `conf/confignode-env.sh` (or `.bat` for Windows) - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------ | :---------------------------------- | :---------- | :-------------- | :---------------------- | -| MEMORY_SIZE | Total memory allocated for the node | Automatically calculated based on system memory, defaulting to 30% of the system memory. | As needed | Save changes without immediate execution; modifications take effect after service restart. | - -- **DataNode**: `conf/datanode-env.sh` (or `.bat` for Windows) - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------ | :---------------------------------- |:-----------------------------------------------------------------------------------------| :-------------- | :---------------------- | -| MEMORY_SIZE | Total memory allocated for the node | Automatically calculated based on system memory, defaulting to 50% of the system memory. 
| As needed | Save changes without immediate execution; modifications take effect after service restart. | - - -#### General Configuration - -Set the following parameters in `conf/iotdb-system.properties`. Refer to `conf/iotdb-system.properties.template` for a complete list. - -**Cluster-Level Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------------------ | :-------------------------- | :------------- | :-------------- | :----------------------------------------------------------- | -| cluster_name | Name of the cluster | defaultCluster | Customizable | Support hot loading, but it is not recommended to change the cluster name by manually modifying the configuration file | -| schema_replication_factor | Number of metadata replicas | 1 | 1 | In standalone mode, set this to 1. This value cannot be modified after the first startup. | -| data_replication_factor | Number of data replicas | 1 | 1 | In standalone mode, set this to 1. This value cannot be modified after the first startup. | - -**ConfigNode Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------------------- | :--------------------------------------------------------- | -| cn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. | This parameter cannot be modified after the first startup. | -| cn_internal_port | Port used for internal communication within the cluster | 10710 | 10710 | This parameter cannot be modified after the first startup. | -| cn_consensus_port | Port used for consensus protocol communication among ConfigNode replicas | 10720 | 10720 | This parameter cannot be modified after the first startup. 
| -| cn_seed_config_node | Address of the ConfigNode for registering and joining the cluster. (e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Use `cn_internal_address:cn_internal_port` | This parameter cannot be modified after the first startup. | - -**DataNode** **Parameters**: - -| **Parameter** | **Description** | **Default** | **Recommended** | **Notes** | -| :------------------------------ | :----------------------------------------------------------- | :-------------- |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :--------------------------------------------------------- | -| dn_rpc_address | Address for the client RPC service | 127.0.0.1 | By default, the local machine can directly access it. For non-local access, please modify this configuration item to the IPv4 address or hostname of the server where it is located. It is recommended to use the IPv4 address of the server where it is located. | Effective after restarting the service. | -| dn_rpc_port | Port for the client RPC service | 6667 | 6667 | Effective after restarting the service. | -| dn_internal_address | Address used for internal communication within the cluster | 127.0.0.1 | Server's IPv4 address or hostname. Use hostname to avoid issues when the IP changes. | This parameter cannot be modified after the first startup. | -| dn_internal_port | Port used for internal communication within the cluster | 10730 | 10730 | This parameter cannot be modified after the first startup. | -| dn_mpp_data_exchange_port | Port used for receiving data streams | 10740 | 10740 | This parameter cannot be modified after the first startup. 
| -| dn_data_region_consensus_port | Port used for data replica consensus protocol communication | 10750 | 10750 | This parameter cannot be modified after the first startup. | -| dn_schema_region_consensus_port | Port used for metadata replica consensus protocol communication | 10760 | 10760 | This parameter cannot be modified after the first startup. | -| dn_seed_config_node | Address of the ConfigNode for registering and joining the cluster. (e.g.,`cn_internal_address:cn_internal_port`) | 127.0.0.1:10710 | Use `cn_internal_address:cn_internal_port` | This parameter cannot be modified after the first startup. | - -### 2.4 Start ConfigNode - -Navigate to the `sbin` directory and start ConfigNode: - -```Bash -# Unix/OS X -./sbin/start-confignode.sh -d # The "-d" flag starts the process in the background. - -# Windows -# Before version V2.0.4.x -.\sbin\start-confignode.bat - -# V2.0.4.x and later versions -.\sbin\windows\start-confignode.bat -``` - - If the startup fails, refer to the [**Common Problem**](#Common Problem) section below for troubleshooting. - -### 2.5 Start DataNode - -Navigate to the `sbin` directory of IoTDB and start the DataNode: - -````shell -# Unix/OS X -./sbin/start-datanode.sh -d # The "-d" flag starts the process in the background. - -# Windows -# Before version V2.0.4.x -.\sbin\start-datanode.bat - -# V2.0.4.x and later versions -.\sbin\windows\start-datanode.bat -```` - -### 2.6 Activate Database - -#### Option 1: Command-Based Activation - -1. Enter the IoTDB CLI. 
- -The Linux and MacOS system startup commands are as follows: - -```shell -# Before version V2.0.6.x -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x and later versions -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -The Windows system startup commands are as follows: - -```shell -# Before version V2.0.4.x -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.4.x and later versions, before version V2.0.6.x -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x and later versions -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -2. Execute the following command to obtain the machine code required for activation: - -```SQL -show system info -``` -```Bash -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -3. Execute the following statement to obtain the version number of the database to be activated: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.9.2| 5ea21bc| -+-------+---------+ -Total line number = 1 -``` - -4. Provide the obtained machine code and version number to the Timecho team. - -5. Enter the activation codes provided by the Timecho team in the CLI in sequence using the following format. Wrap the activation code in single quotes ('): - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - -#### Option 2: File-Based Activation - -1. 
After starting the ConfigNode and DataNode, enter the `activation` folder and send the `system_info` file to the Timecho team. -2. Receive the `license` file returned by the staff. -3. Place the `license` file into the `activation` folder of the corresponding node. - -### 2.7 Verify Activation - -In the CLI, you can check the activation status by running the `show activation` command. Check the `ClusterActivationStatus` field. If it shows `ACTIVATED`, the database has been successfully activated. - -![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81.png) - -## 3. Common Problem - -1. Activation Fails Repeatedly - 1. Use the `ls -al` command to verify that the ownership of the installation directory matches the current user. - 2. Check the ownership of all files in the `./activation` directory to ensure they belong to the current user. -2. ConfigNode Fails to Start - 1. Review the startup logs to check whether any parameter that cannot be modified after the first startup was changed. - 2. Check the logs for any other errors. If unresolved, contact technical support for assistance. - 3. If the deployment is fresh or data can be discarded, clean the environment and redeploy using the following steps: - - **Clean the Environment** - -1. Stop all ConfigNode and DataNode processes: - -```Bash -sbin/stop-standalone.sh -``` - -2. Check for any remaining processes: - -```Bash -jps -# or -ps -ef | grep iotdb -``` - -3. If processes remain, terminate them manually: - -```Bash -kill -9 <pid> - -# For systems with a single IoTDB instance, you can clean up residual processes with: -ps -ef | grep iotdb | grep -v grep | tr -s ' ' ' ' | cut -d ' ' -f2 | xargs kill -9 -``` - -4. Delete the `data` and `logs` directories: - -```Bash -cd /data/iotdb -rm -rf data logs -``` - -## 4. 
Appendix - -### 4.1 ConfigNode Parameters - -| Parameter | Description | **Is it required** | -| :-------- | :---------------------------------------------------------- | :----------------- | -| -d | Starts the process in daemon mode (runs in the background). | No | - -### 4.2 DataNode Parameters - -| Parameter | Description | Required | -| :-------- | :----------------------------------------------------------- | :------- | -| -v | Displays version information. | No | -| -f | Runs the script in the foreground without backgrounding it. | No | -| -d | Starts the process in daemon mode (runs in the background). | No | -| -p | Specifies a file to store the process ID for process management. | No | -| -c | Specifies the path to the configuration folder; the script loads configuration files from this location. | No | -| -g | Prints detailed garbage collection (GC) information. | No | -| -H | Specifies the path for the Java heap dump file, used during JVM memory overflow. | No | -| -E | Specifies the file for JVM error logs. | No | -| -D | Defines system properties in the format `key=value`. | No | -| -X | Passes `-XX` options directly to the JVM. | No | -| -h | Displays the help instructions. | No | diff --git a/src/UserGuide/latest/Deployment-and-Maintenance/workbench-deployment_timecho.md b/src/UserGuide/latest/Deployment-and-Maintenance/workbench-deployment_timecho.md deleted file mode 100644 index ab236c0d1..000000000 --- a/src/UserGuide/latest/Deployment-and-Maintenance/workbench-deployment_timecho.md +++ /dev/null @@ -1,273 +0,0 @@ - -# Workbench Deployment - -The visualization console is one of the supporting tools for IoTDB (similar to Navicat for MySQL). It is an official application tool system used for database deployment implementation, operation and maintenance management, and application development stages, making the use, operation, and management of databases simpler and more efficient, truly achieving low-cost management and operation of databases. 
This document will assist you in installing Workbench. - -
- -The instructions for using the visualization console tool can be found in the [Instructions](../Tools-System/Monitor-Tool.md) section of the document. - -## 1. Installation Preparation - -| Preparation Content | Name | Version Requirements | Link | -| :----------------------: | :-------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | -| Operating System | Windows or Linux | - | - | -| Installation Environment | JDK | v1.5.4 and below require ≥ 1.8; v1.5.5 and above require ≥ 17. Choose the ARM or x64 installer according to your system. | https://www.oracle.com/java/technologies/downloads/ | -| Related Software | Prometheus | Requires installation of V2.30.3 and above. | https://prometheus.io/download/ | -| Database | IoTDB | Requires V1.2.0 Enterprise Edition and above | You can contact business or technical support to obtain | -| Console | IoTDB-Workbench-`` | - | You can choose according to the appendix version comparison table and contact business or technical support to obtain it | - -### Pre-installation Check - -To ensure the Workbench installation package you obtained is complete and valid, we recommend performing an SHA512 verification before proceeding with the installation and deployment. - -#### Preparation: - -- Obtain the officially released SHA512 checksum: Contact the Timecho Team to get it. - -#### Verification Steps (Linux as an Example): - -1. Open the terminal and navigate to the directory where the installation package is stored (e.g., /data/workbench): - ```Bash - cd /data/workbench - ``` -2. Execute the following command to calculate the hash value: - ```Bash - sha512sum IoTDB-Workbench-``.zip - ``` -3. The terminal will output a result (the left part is the SHA512 checksum, and the right part is the file name): - -![img](/img/sha512-04.png) - -4. Compare the output result with the official SHA512 checksum. 
Once you have confirmed that they match, proceed with the installation and deployment steps below.
-
-#### Notes:
-
-- If the checksums do not match, contact the Timecho Team to obtain the installation package again.
-- If a "file not found" message appears during verification, check whether the file path is correct and whether the installation package was fully downloaded.
-
-## 2. Installation Steps
-
-### 2.1 IoTDB enables monitoring indicator collection
-
-1. Enable the monitoring configuration items. Monitoring-related configuration items in IoTDB are disabled by default. Before deploying the monitoring panel, enable the following configuration items (the service must be restarted after the monitoring configuration is enabled).
-
-| Configuration | Located in the configuration file | Description |
-| :--- | :--- | :--- |
-| cn_metric_reporter_list | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to PROMETHEUS |
-| cn_metric_level | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to IMPORTANT |
-| cn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this configuration item to the configuration file and keep the default value of 9091. If you set a different port, make sure it does not conflict with other ports |
-| dn_metric_reporter_list | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to PROMETHEUS |
-| dn_metric_level | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to IMPORTANT |
-| dn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this configuration item to the configuration file and keep the default value of 9092. If you set a different port, make sure it does not conflict with other ports |
-| dn_metric_internal_reporter_type | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to IOTDB |
-| enable_audit_log | conf/iotdb-system.properties | Add this configuration item to the configuration file and set the value to true |
-| audit_log_storage | conf/iotdb-system.properties | Add this configuration item to the configuration file, with values set to IOTDB and LOGGER |
-| audit_log_operation | conf/iotdb-system.properties | Add this configuration item to the configuration file, with values set to DML,DDL,QUERY |
-
-2. Restart all nodes. After modifying the monitoring configuration on each node, restart the ConfigNode and DataNode services on all nodes:
-
-    ```shell
-    # Unix/OS X
-    ./sbin/stop-standalone.sh      # Stop ConfigNode and DataNode first
-    ./sbin/start-confignode.sh -d  # Start ConfigNode
-    ./sbin/start-datanode.sh -d    # Start DataNode
-
-    # Windows
-    # Before version V2.0.4.x
-    .\sbin\stop-standalone.bat
-    .\sbin\start-confignode.bat
-    .\sbin\start-datanode.bat
-
-    # V2.0.4.x and later versions
-    .\sbin\windows\stop-standalone.bat
-    .\sbin\windows\start-confignode.bat
-    .\sbin\windows\start-datanode.bat
-    ```
-
-3. After restarting, confirm the running status of each node through the client. If every node's status is Running, the configuration succeeded:
-
-    ![](/img/%E5%90%AF%E5%8A%A8.png)
-
-### 2.2 Install and configure Prometheus
-
-1. Download the Prometheus installation package (V2.30.3 or later is required). You can download it from the Prometheus official website (https://prometheus.io/docs/introduction/first_steps/).
-2. Unzip the installation package and enter the unzipped folder:
-
-    ```Shell
-    tar xvfz prometheus-*.tar.gz
-    cd prometheus-*
-    ```
-
-3. Modify the configuration file `prometheus.yml` as follows:
-    1. Add a `confignode` job to collect monitoring data from the ConfigNodes.
-    2. Add a `datanode` job to collect monitoring data from the DataNodes.
-
-    ```yaml
-    global:
-      scrape_interval: 15s
-      evaluation_interval: 15s
-    scrape_configs:
-      - job_name: "prometheus"
-        static_configs:
-          - targets: ["localhost:9090"]
-      - job_name: "confignode"
-        static_configs:
-          - targets: ["iotdb-1:9091","iotdb-2:9091","iotdb-3:9091"]
-        honor_labels: true
-      - job_name: "datanode"
-        static_configs:
-          - targets: ["iotdb-1:9092","iotdb-2:9092","iotdb-3:9092"]
-        honor_labels: true
-    ```
-
-4. Start Prometheus. The default expiration time for Prometheus monitoring data is 15 days.
In production environments, it is recommended to increase the retention to 180 days or more so that historical monitoring data can be tracked over a longer period. The startup command is as follows:
-
-    ```Shell
-    ./prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=180d
-    ```
-
-5. Confirm successful startup. Open `http://IP:port` in a browser to access Prometheus, then open the Targets page under Status. If the State of every target is Up, the configuration succeeded and the nodes are reachable.
-
- - -
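Target health can also be checked from a script instead of the Targets page. The sketch below is a minimal illustration, assuming Prometheus is reachable at `http://localhost:9090` (the address used in the `prometheus` job above; adjust as needed) and reading the standard `/api/v1/targets` HTTP endpoint. The function names `unhealthy_targets` and `fetch_targets` are hypothetical, and the sample payload is hand-written for illustration rather than captured output.

```python
import json
import urllib.request


def unhealthy_targets(payload):
    """Return (job, scrape URL, health) for every active target that is not "up"."""
    problems = []
    for target in payload.get("data", {}).get("activeTargets", []):
        health = target.get("health", "unknown")
        if health != "up":
            job = target.get("labels", {}).get("job", "?")
            problems.append((job, target.get("scrapeUrl", "?"), health))
    return problems


def fetch_targets(base_url="http://localhost:9090"):
    """Query the Prometheus targets API. base_url is an assumed address."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)


# Hand-written sample in the shape of the targets API response (illustrative only).
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "confignode"},
             "scrapeUrl": "http://iotdb-1:9091/metrics", "health": "up"},
            {"labels": {"job": "datanode"},
             "scrapeUrl": "http://iotdb-1:9092/metrics", "health": "down"},
        ]
    },
}

for job, url, health in unhealthy_targets(sample):
    print(f"{job}: {url} is {health}")
```

Against a live server, `unhealthy_targets(fetch_targets())` returning an empty list would correspond to every target showing Up.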
- - -### 2.3 Install Workbench - -1. Enter the config directory of iotdb Workbench -`` - -2. Modify Workbench configuration file: Go to the `config` folder and modify the configuration file `application-prod.properties`. If you are installing it locally, there is no need to modify it. If you are deploying it on a server, you need to modify the IP address - > Workbench can be deployed on a local or cloud server as long as it can connect to IoTDB - - | Configuration | Before Modification | After modification | - | ---------------- | ----------------------------------- | ----------------------------------------------- | - | pipe.callbackUrl | pipe.callbackUrl=`http://127.0.0.1` | pipe.callbackUrl=`http://` | - - ![](/img/workbench-conf-1.png) - -3. Startup program: Please execute the startup command in the sbin folder of IoTDB Workbench -`` - Windows: - ```shell - # Start Workbench in the background - start.bat -d - ``` - Linux: - ```shell - # Start Workbench in the background - ./start.sh -d - ``` -4. You can use the `jps` command to check if the startup was successful, as shown in the figure: - - ![](/img/windows-jps.png) - -5. Verification successful: Open "`http://Server IP: Port in configuration file`" in the browser to access, for example:"`http://127.0.0.1:9190`" When the login interface appears, it is considered successful - - ![](/img/workbench-en.png) - - -### 2.4 Configure Instance Information - -1. 
Configure instance information: You only need to fill in the following information to connect to the instance - - ![](/img/workbench-en-1.jpeg) - - - | Field Name | Is It A Required Field | Field Meaning | Default Value | - | --------------- | ---------------------- | ------------------------------------------------------------ | ------ | - | Connection Type | Yes | The content filled in for different connection types varies, and supports selecting "single machine, cluster, dual active" | - | - | Instance Name | Yes | You can distinguish different instances based on their names, with a maximum input of 50 characters | - | - | Instance | Yes | Fill in the database address (`dn_rpc_address` field in the `iotdb/conf/iotdb-system.properties` file) and port number (`dn_rpc_port` field). Note: For clusters and dual active devices, clicking the "+" button supports entering multiple instance information | - | - | Prometheus | No | Fill in `http://:/app/v1/query` to view some monitoring information on the homepage. We recommend that you configure and use it | - | - | Username | Yes | Fill in the username for IoTDB, supporting input of 4 to 32 characters, including uppercase and lowercase letters, numbers, and special characters (! @ # $% ^&* () _+-=) | root | - | Enter Password | No | Fill in the password for IoTDB. To ensure the security of the database, we will not save the password. Please fill in the password yourself every time you connect to the instance or test | root | - -2. Test the accuracy of the information filled in: You can perform a connection test on the instance information by clicking the "Test" button - - ![](/img/workbench-en-2.png) - -## 3. 
Appendix: IoTDB and Workbench Version Comparison Table - -| Version | Description | Supported IoTDB Versions | -|---------|-----------------------------------------------------------------------------------------------------------------------------|-------------------------------------| -| V2.0.1-beta | The first version of the V2.x series, supporting dual models of tree and table | V2.0 and above, The AI analysis module only supports versions above 2.0.5. | -| V1.5.7 | Optimize the point list by splitting point names into device names and points, ensure the point selection area supports horizontal scrolling, and align the export file column order with the page display. | All 1.x versions from V1.3.4 onward | -| V1.5.6 | Enhanced CSV import/export: optional tags/aliases on import; support for measurement descriptions with backtick-quoted quotes on export. | All 1.x versions from V1.3.4 onward | -| V1.5.5 | Added server clock functionality and support for activating Enterprise Edition license databases | All 1.x versions from V1.3.4 onward | -| V1.5.4 | Added authentication for Prometheus settings in Instance Management | All 1.x versions from V1.3.4 onward | -| V1.5.1 | Added AI analysis and pattern matching | All 1.x versions from V1.3.2 onward | -| V1.4.0 | Added tree model display and English UI | All 1.x versions from V1.3.2 onward | -| V1.3.1 | Enhanced analysis methods and import templates | All 1.x versions from V1.3.2 onward | -| V1.3.0 | Added DB configuration and UI refinements | All 1.x versions from V1.3.2 onward | -| V1.2.6 | Optimized permission controls | All 1.x versions from V1.3.1 onward | -| V1.2.5 | Added "Common Templates" and caching | All 1.x versions from V1.3.0 onward | -| V1.2.4 | Added import/export for calculations, time alignment field | All 1.x versions from V1.2.2 onward | -| V1.2.3 | Added activation details and analysis features | All 1.x versions from V1.2.2 onward | -| V1.2.2 | Optimized point description display | All 1.x 
versions from V1.2.2 onward | -| V1.2.1 | Added sync monitoring panel, Prometheus hints | All 1.x versions from V1.2.2 onward | -| V1.2.0 | Major Workbench upgrade | All 1.x versions from V1.2.0 onward | diff --git a/src/UserGuide/latest/Ecosystem-Integration/Ecosystem-Overview_timecho.md b/src/UserGuide/latest/Ecosystem-Integration/Ecosystem-Overview_timecho.md deleted file mode 100644 index 742d3ca85..000000000 --- a/src/UserGuide/latest/Ecosystem-Integration/Ecosystem-Overview_timecho.md +++ /dev/null @@ -1,53 +0,0 @@ - - -# Overview - -IoTDB Ecosystem Integration Bridges the Full Pipeline of Time-Series Data: -- Through data collection, it enables second-level device connectivity. -- Via data integration, it constructs cross-cloud pipelines. -- Leveraging programming frameworks, it accelerates business logic development. -- With computing engines, it accomplishes distributed processing. -- Through visualization and SQL development, it implements analytical strategies. -- Finally, by interfacing with IoT platforms, it achieves edge-cloud synergy—building a complete intelligent closed loop from the physical world to digital decision-making. 
-
-![](/img/eco-overview-n-en.png)
-
-The following documentation will help you quickly and comprehensively understand the usage of various integration tools at each stage:
-
-- Data Acquisition
-    - Telegraf [Telegraf Plugin](./Telegraf.md)
-- Data Integration
-    - NiFi [Apache NiFi](./NiFi-IoTDB.md)
-    - Kafka [Kafka](./Programming-Kafka.md)
-- Computing Engine
-    - Flink [Flink](./Flink-IoTDB.md)
-    - Spark [Spark](./Spark-IoTDB.md)
-- Visual Analytics
-    - Zeppelin [Zeppelin](./Zeppelin-IoTDB.md)
-    - Grafana [Grafana](./Grafana-Connector.md)
-    - Grafana Plugin [Grafana Plugin](./Grafana-Plugin.md)
-    - DataEase [DataEase](./DataEase.md)
-- SQL Development
-    - DBeaver [DBeaver](./DBeaver.md)
-- IoT Platform
-    - Ignition [Ignition](./Ignition-IoTDB-plugin_timecho.md)
-    - Thingsboard [Thingsboard](./Thingsboard.md)
\ No newline at end of file
diff --git a/src/UserGuide/latest/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md b/src/UserGuide/latest/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md
deleted file mode 100644
index ac82207e8..000000000
--- a/src/UserGuide/latest/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md
+++ /dev/null
@@ -1,275 +0,0 @@
-
-
-# Ignition
-
-## 1. Product Overview
-
-1. Introduction to Ignition
-
-    Ignition is a web-based monitoring and data acquisition tool (SCADA) - an open and scalable universal platform. Ignition allows you to more easily control, track, display, and analyze all data of your enterprise, enhancing business capabilities. For more details, please refer to the [Ignition Official Website](https://docs.inductiveautomation.com/docs/8.1/getting-started/introducing-ignition).
-
-2. Introduction to the Ignition-IoTDB Connector
-
-    The Ignition-IoTDB Connector is divided into two modules: Ignition-IoTDB Connector and Ignition-IoTDB With JDBC. Among them:
-
-    - Ignition-IoTDB Connector: Provides the ability to store data collected by Ignition into IoTDB, and also supports data reading in Components.
It injects script interfaces such as `system.iotdb.insert` and `system.iotdb.query` to facilitate programming in Ignition.
-    - Ignition-IoTDB With JDBC: Can be used in the `Transaction Groups` module and is not applicable to the `Tag Historian` module. It can be used for custom writing and querying.
-
-    The relationship between the two modules and Ignition is shown in the following figure.
-
-    ![](/img/20240703114443.png)
-
-## 2. Installation Requirements
-
-| **Preparation Content** | Version Requirements |
-| ------------------------------- | ------------------------------------------------------------ |
-| IoTDB | Version 1.3.1 or above must be installed. For installation, please refer to the IoTDB [Deployment Guidance](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) |
-| Ignition | Version 8.1 (8.1.37 or above) must be installed. For installation, please refer to the Ignition official website [Installation Guidance](https://docs.inductiveautomation.com/docs/8.1/getting-started/installing-and-upgrading) (for compatibility with other versions, please contact the business department) |
-| Ignition-IoTDB Connector module | Please contact Business to obtain |
-| Ignition-IoTDB With JDBC module | Download address: https://repo1.maven.org/maven2/org/apache/iotdb/iotdb-jdbc/ |
-
-## 3. Instruction Manual For Ignition-IoTDB Connector
-
-### 3.1 Introduce
-
-The Ignition-IoTDB Connector module can store data in a database connection associated with the historical database provider. The data is stored directly in a table in the SQL database according to its data type, together with a millisecond timestamp. Data is stored only when a value changes, according to the value mode and deadband settings on each tag, which avoids duplicate and unnecessary data storage.
-
-The Ignition-IoTDB Connector provides the ability to store the data collected by Ignition into IoTDB.
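The change-based storage described above can be illustrated with a simple absolute deadband filter. This is a minimal sketch of the general technique, not Ignition's implementation; the function name `deadband_filter` and the sample readings are hypothetical.

```python
def deadband_filter(samples, deadband):
    """Keep a (timestamp, value) sample only when its value differs from the
    last stored value by more than `deadband` (absolute deadband)."""
    stored = []
    last_value = None
    for timestamp, value in samples:
        if last_value is None or abs(value - last_value) > deadband:
            stored.append((timestamp, value))
            last_value = value  # compare later samples against what was stored
    return stored


# Hypothetical readings: (millisecond timestamp, value)
readings = [(1000, 20.0), (2000, 20.05), (3000, 20.3), (4000, 20.31), (5000, 19.9)]
print(deadband_filter(readings, deadband=0.1))
# prints [(1000, 20.0), (3000, 20.3), (5000, 19.9)]
```

Comparing against the last stored value, rather than the previous raw sample, keeps slow drift from being dropped indefinitely.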
- -### 3.2 Installation Steps - -Step 1: Enter the `Configuration` - `System` - `Modules` module and click on the `Install or Upgrade a Module` button at the bottom - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-1.png) - -Step 2: Select the obtained `modl`, select the file and upload it, click `Install`, and trust the relevant certificate. - -![](/img/20240703-151030.png) - -Step 3: After installation is completed, you can see the following content - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-3.png) - -Step 4: Enter the `Configuration` - `Tags` - `History` module and click on `Create new Historical Tag Provider` below - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-4.png) - -Step 5: Select `IoTDB` and fill in the configuration information - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-5.png) - -The configuration content is as follows: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-| Name | Description | Default Value | Notes |
-| :--- | :--- | :--- | :--- |
-| **Main** | | | |
-| Provider Name | Provider Name | - | |
-| Enabled | | true | The provider can only be used when it is true |
-| Description | Description | - | |
-| **IoTDB Settings** | | | |
-| Host Name | The address of the target IoTDB instance | - | |
-| Port Number | The port of the target IoTDB instance | 6667 | |
-| Username | The username of the target IoTDB | - | |
-| Password | Password for target IoTDB | - | |
-| Database Name | The database name to be stored, starting with root, such as root.db | - | |
-| Pool Size | Size of SessionPool | 50 | Can be configured as needed |
-| Store and Forward Settings | Just keep it as default | | |
-
-### 3.3 Instructions
-
-#### Configure Historical Data Storage
-
-- After configuring the `Provider`, you can use the `IoTDB Tag Historian` in the `Designer`, just like any other `Provider`. Right-click the corresponding `Tag` and select `Edit Tag(s)`, then select the History category in the Tag Editor.
-
-    ![](/img/ignition-7.png)
-
-- Set `History Enabled` to `true`, set `Storage Provider` to the `Provider` created in the previous step, configure other parameters as needed, click `OK`, and then save the project. From this point on, data is continuously stored in the `IoTDB` instance according to these settings.
-
-    ![](/img/ignition-8.png)
-
-#### Read Data
-
-- You can directly select the tags stored in IoTDB under the Data tab of the Report
-
-    ![](/img/ignition-9.png)
-
-- You can also directly browse relevant data in Components
-
-    ![](/img/ignition-10.png)
-
-#### Script module: This function can interact with IoTDB
-
-1. system.iotdb.insert:
-
-
-- Script Description: Write data to an IoTDB instance
-
-- Script Definition:
-
-    `system.iotdb.insert(historian, deviceId, timestamps, measurementNames, measurementValues)`
-
-- Parameter:
-
-    - `str historian`: The name of the corresponding IoTDB Tag Historian Provider
-    - `str deviceId`: The deviceId to write to, excluding the configured database, such as Sine
-    - `long[] timestamps`: List of timestamps of the data points to write
-    - `str[] measurementNames`: List of names of the measurements to write
-    - `str[][] measurementValues`: The data point values to write, corresponding to the timestamp list and the measurement name list
-
-- Return Value: None
-
-- Available Range: Client, Designer, Gateway
-
-- Usage example:
-
-    ```Python
-    system.iotdb.insert("IoTDB", "Sine", [system.date.now()],["measure1","measure2"],[["val1","val2"]])
-    ```
-
-2.
system.iotdb.query:
-
-
-- Script Description: Query the data written to the IoTDB instance
-
-- Script Definition:
-
-    `system.iotdb.query(historian, sql)`
-
-- Parameter:
-
-    - `str historian`: The name of the corresponding IoTDB Tag Historian Provider
-    - `str sql`: SQL statement to be queried
-
-- Return Value:
-    Query Results: `List>`
-
-- Available Range: Client, Designer, Gateway
-
-- Usage example:
-
-    ```Python
-    system.iotdb.query("IoTDB", "select * from root.db.Sine where time > 1709563427247")
-    ```
-
-## 4. Ignition-IoTDB With JDBC
-
-### 4.1 Introduce
-
-Ignition-IoTDB With JDBC provides a JDBC driver that allows users to connect to and query IoTDB from Ignition using standard JDBC APIs.
-
-### 4.2 Installation Steps
-
-Step 1: Enter the `Configuration` - `Databases` - `Drivers` module and create the `Translator`
-
-![](/img/Ignition-IoTDBWithJDBC-1.png)
-
-Step 2: Enter the `Configuration` - `Databases` - `Drivers` module, create a `JDBC Driver`, select the `Translator` configured in the previous step, and upload the downloaded `IoTDB JDBC`. Set the Classname to `org.apache.iotdb.jdbc.IoTDBDriver`
-
-![](/img/Ignition-IoTDBWithJDBC-2.png)
-
-Step 3: Enter the `Configuration` - `Databases` - `Connections` module, create a new `Connection`, select the `IoTDB Driver` created in the previous step for `JDBC Driver`, configure the relevant information, and save it for use
-
-![](/img/Ignition-IoTDBWithJDBC-3.png)
-
-### 4.3 Instructions
-
-#### Data Writing
-
-Select the previously created `Connection` from the `Data Source` in the `Transaction Groups`
-
-- `Table name` needs to be set to the complete device path starting from root
-- Uncheck `Automatically create table`
-- Set `Store timestamp to` to time
-
-Do not select other options. Set the fields, and once the group is `enabled`, the data is stored in the corresponding IoTDB according to these settings.
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%86%99%E5%85%A5-1.png)
-
-#### Query
-
-- Select `Data Source` in the `Database Query Browser` and select the previously created `Connection` to write an SQL statement to query the data in IoTDB
-
-![](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2-ponz.png)
-
diff --git a/src/UserGuide/latest/Ecosystem-Integration/SeaTunnel_timecho.md b/src/UserGuide/latest/Ecosystem-Integration/SeaTunnel_timecho.md
deleted file mode 100644
index f804b8c0b..000000000
--- a/src/UserGuide/latest/Ecosystem-Integration/SeaTunnel_timecho.md
+++ /dev/null
@@ -1,190 +0,0 @@
-
-
-
-# Apache SeaTunnel
-
-## 1. Overview
-
-SeaTunnel is a distributed integration platform designed for massive data. Leveraging its high performance and elastic scaling capabilities, it connects multi-source heterogeneous data links through standardized Connectors (composed of Source and Sink). The platform uniformly abstracts various data sources into the SeaTunnelRow format via Source. After dynamic resource scheduling and batch processing optimization, it efficiently writes data to different storage systems through Sink.
Through the deep integration of the IoTDB Connector with SeaTunnel, it not only addresses core challenges in time-series data scenarios such as **high-throughput writing, multi-source governance, and complex analysis**, but also helps enterprises quickly build **low-cost, highly reliable, and easily scalable** data infrastructure in fields like the Internet of Things and industrial internet, leveraging the out-of-the-box connector ecosystem and automated operation and maintenance capabilities. - -## 2. Usage Steps - -### 2.1 Environment Preparation - -#### 2.1.1 Software Requirements - -| Software | Version | Installation Reference | -| ------------- | ------------- |-----------------------------------------------------------| -| IoTDB | >= 2.0.5 | [Quick Start](../QuickStart/QuickStart_timecho.md) | -| SeaTunnel | 2.3.12 | [Official Website](https://seatunnel.apache.org/download) | - -* Thrift Version Conflict Resolution (Only required for Spark engine): - -```Bash -# Remove older Thrift from Spark -rm -f $SPARK_HOME/jars/libthrift* -# Copy IoTDB's Thrift library to Spark classpath -cp $IOTDB_HOME/lib/libthrift* $SPARK_HOME/jars/ -``` - -#### 2.1.2 Dependency Configuration - -1. JDBC - -* Spark/Flink Engine: Place the [JDBC driver JAR](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) into the `${SEATUNNEL_HOME}/plugins/` directory. -* SeaTunnel Zeta Engine: Place the [JDBC driver JAR](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) into the `${SEATUNNEL_HOME}/lib/` directory. - -2. Connector - -Place the corresponding version of the [SeaTunnel Connector](https://mvnrepository.com/artifact/org.apache.seatunnel/connector-iotdb) into the `${SEATUNNEL_HOME}/plugins/` directory. 
- -### 2.2 Reading Data (IoTDB Source Connector) - -#### 2.2.1 Configuration Parameters - -| **Parameter** | **Type** | **Required** | **Default** | **Description** | -| -------------------------- | -------- | ------------ | ----------- | --------------------------------------------------------------------------------------------------------------------------------------- | -| `node_urls` | string | yes | - | IoTDB cluster address, format: `"host1:port"` or `"host1:port,host2:port"` | -| `username` | string | yes | - | IoTDB username | -| `password` | string | yes | - | IoTDB password | -| `sql_dialect` | string | no | tree | IoTDB model: `tree` for tree model; `table` for table model | -| `sql` | string | yes | - | SQL query statement to execute | -| `database` | string | no | - | Database name, only effective in table model | -| `schema` | config | yes | - | Data schema definition | -| `fetch_size` | int | no | - | Number of data rows fetched per request from IoTDB during query execution | -| `lower_bound` | long | no | - | Lower bound of time range (used for data partitioning by time column) | -| `upper_bound` | long | no | - | Upper bound of time range (used for data partitioning by time column) | -| `num_partitions` | int | no | - | Number of partitions (used when partitioning by time column):
1 partition: uses the full time range
If partitions < (upper_bound - lower_bound), the difference is used as actual partitions | -| `thrift_default_buffer_size`| int | no | - | Thrift protocol buffer size | -| `thrift_max_frame_size` | int | no | - | Thrift maximum frame size | -| `enable_cache_leader` | boolean | no | - | Whether to enable leader node caching | -| `version` | string | no | - | Client SQL semantic version (`V_0_12` / `V_0_13`) | - -#### 2.2.2 Configuration Example - -1. Create a new file `iotdb_source_example.conf` in the `${SEATUNNEL_HOME}/config/` directory: - -```bash -env { - parallelism = 2 # Parallelism set to 2 - job.mode = "BATCH" # Batch mode -} - -source { - IoTDB { - node_urls = "localhost:6667" - username = "root" - password = "root" - sql = "SELECT temperature, humidity, status FROM root.testdb.seatunnel.source.device align by device" - schema { - fields { - ts = timestamp - device_name = string - temperature = double - humidity = double - status = boolean - } - } - } -} - -sink { - Console { - } # Output to console -} -``` - -2. Run SeaTunnel with the following command: - -```Bash -./bin/seatunnel.sh --config config/iotdb_source_example.conf -e local -``` - -3. For more details, please refer to the official Apache SeaTunnel documentation on [IoTDB Source Connector](https://seatunnel.apache.org/docs/2.3.12/connector-v2/source/IoTDB). 
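The roles of `lower_bound`, `upper_bound`, and `num_partitions` can be illustrated by splitting a time range into contiguous sub-ranges, each of which could then be read by a separate `WHERE time >= ... AND time < ...` query. The sketch below shows only the general idea of time-range partitioning and is not the connector's actual algorithm; the helper name `split_time_range` is hypothetical, and capping the partition count when the range is narrower than the requested count is an assumption of this sketch.

```python
def split_time_range(lower_bound, upper_bound, num_partitions):
    """Split the inclusive range [lower_bound, upper_bound] into contiguous
    half-open (start, end) ranges of near-equal size."""
    if upper_bound < lower_bound:
        raise ValueError("upper_bound must not be smaller than lower_bound")
    span = upper_bound - lower_bound + 1       # both bounds are inclusive
    parts = max(1, min(num_partitions, span))  # assumption: no empty partitions
    base, extra = divmod(span, parts)
    ranges = []
    start = lower_bound
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first ranges
        ranges.append((start, start + size))
        start += size
    return ranges


for start, end in split_time_range(0, 9, 3):
    print(f"WHERE time >= {start} AND time < {end}")
```

With a single partition the helper returns one range covering the whole interval, which matches the "1 partition: uses the full time range" case in the table.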
- -### 2.3 Writing Data (IoTDB Sink Connector) - -#### 2.3.1 Configuration Parameters - -| **Parameter** | **Type** | **Required** | **Default** | **Description** | -| ----------------------------- | --------- | ------------ | ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `node_urls` | Array | yes | - | IoTDB cluster address, format: `["host1:port"]` or `["host1:port","host2:port"]` | -| `username` | String | yes | - | IoTDB username | -| `password` | String | yes | - | IoTDB password | -| `sql_dialect` | String | no | tree | IoTDB model: `tree` for tree model; `table` for table model | -| `storage_group` | String | yes | - | IoTDB tree model: specifies the storage group for devices (path prefix) e.g., deviceId = \${storage_group} + "." 
+ \${key_device}; IoTDB table model: specifies the database | -| `key_device` | String | yes | - | IoTDB tree model: field name in SeaTunnelRow that specifies the IoTDB device ID; IoTDB table model: field name in SeaTunnelRow that specifies the IoTDB table name | -| `key_timestamp` | String | no | processing time | IoTDB tree model: field name in SeaTunnelRow that specifies the IoTDB timestamp (if not specified, processing time is used as timestamp); IoTDB table model: field name in SeaTunnelRow that specifies the IoTDB time column (if not specified, processing time is used as timestamp) | -| `key_measurement_fields` | Array | no | See description | IoTDB tree model: field names in SeaTunnelRow that specify the list of IoTDB measurements (if not specified, includes all fields except `key_device` and `key_timestamp`); IoTDB table model: field names in SeaTunnelRow that specify the IoTDB field columns (if not specified, includes all fields except `key_device`, `key_timestamp`, `key_tag_fields`, `key_attribute_fields`) | -| `key_tag_fields` | Array | no | - | IoTDB tree model: not applicable; IoTDB table model: field names in SeaTunnelRow that specify the IoTDB tag columns | -| `key_attribute_fields` | Array | no | - | IoTDB tree model: not applicable; IoTDB table model: field names in SeaTunnelRow that specify the IoTDB attribute columns | -| `batch_size` | Integer | no | 1024 | For batch writing, data is flushed to IoTDB when the buffer reaches `batch_size` or when the time reaches `batch_interval_ms` | -| `max_retries` | Integer | no | - | Number of retries on failed flush | -| `retry_backoff_multiplier_ms` | Integer | no | - | Multiplier used to generate the next backoff delay | -| `max_retry_backoff_ms` | Integer | no | - | Maximum wait time before retrying a request to IoTDB | -| `default_thrift_buffer_size` | Integer | no | - | Initial buffer size for Thrift client in IoTDB | -| `max_thrift_frame_size` | Integer | no | - | Maximum frame size for Thrift client 
in IoTDB | -| `zone_id` | String | no | - | IoTDB client `java.time.ZoneId` | -| `enable_rpc_compression` | Boolean | no | - | Enable RPC compression in IoTDB client | -| `connection_timeout_in_ms` | Integer | no | - | Maximum time (in milliseconds) to wait when connecting to IoTDB | - -#### 2.3.2 Configuration Example - -1. Create a new file `iotdb_sink_example.conf` in the `${SEATUNNEL_HOME}/config/` directory: - -```bash -# Define runtime environment -env { - parallelism = 4 - job.mode = "BATCH" -} - -source { - Jdbc { - url = "jdbc:mysql://localhost:3306/demo_db?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true" - driver = "com.mysql.cj.jdbc.Driver" - connection_check_timeout_sec = 100 - user = "root" - password = "IoTDB@2024" - query = "select * from device" - } -} - -sink { - IoTDB { - node_urls = ["localhost:6667"] - username = "root" - password = "root" - key_device = "id" # Use the `id` field of each row as the IoTDB device ID - key_timestamp = "intime" - storage_group = "root.mysql" - } -} -``` - -2. Run SeaTunnel with the following command: - -```bash -./bin/seatunnel.sh --config config/iotdb_sink_example.conf -e local -``` - -3. For more configuration parameters and examples, please refer to the official Apache SeaTunnel documentation on [IoTDB Sink Connector](https://seatunnel.apache.org/docs/2.3.12/connector-v2/sink/IoTDB). diff --git a/src/UserGuide/latest/IoTDB-Introduction/IoTDB-Introduction_timecho.md b/src/UserGuide/latest/IoTDB-Introduction/IoTDB-Introduction_timecho.md deleted file mode 100644 index 10fa4c4d4..000000000 --- a/src/UserGuide/latest/IoTDB-Introduction/IoTDB-Introduction_timecho.md +++ /dev/null @@ -1,267 +0,0 @@ - - -# IoTDB Introduction - -TimechoDB is a low-cost, high-performance native time-series database for the Internet of Things, offered by Timecho as a commercial product built on the Apache IoTDB community version. 
It can solve various problems encountered by enterprises when building IoT big data platforms to manage time-series data, such as complex application scenarios, large data volumes, high sampling frequencies, large amounts of unaligned data, long data processing times, diverse analysis requirements, and high storage and operation costs. - -Based on TimechoDB, Timecho provides a broader range of product features, stronger performance and stability, and a richer set of utility tools. It also offers comprehensive enterprise services, giving commercial customers more powerful product capabilities and a better development, operations, and usage experience. - -- Download, Deployment and Usage: [QuickStart](../QuickStart/QuickStart_timecho.md) - - -## 1. Product Components - -The Timecho product suite is composed of several components that cover the entire time-series data lifecycle, from data collection and data management to data analysis and application, helping users efficiently manage and analyze the massive amount of time-series data generated by the IoT. - -
- Introduction-en-timecho-new.png - -
- -1. **Time-series database (TimechoDB, a commercial product based on Apache IoTDB provided by the original team)**: The core component of time-series data storage, which can provide users with high-compression storage capabilities, rich time-series query capabilities, real-time stream processing capabilities, while also having high availability of data and high scalability of clusters, and providing security protection. At the same time, TimechoDB also provides users with a variety of application tools for easy management of the system; multi-language API and external system application integration capabilities, making it convenient for users to build applications based on TimechoDB. - -2. **Time-series data standard file format (Apache TsFile, led and contributed by core team members of Timecho)**: This file format is a storage format specifically designed for time-series data, which can efficiently store and query massive amounts of time-series data. Currently, the underlying storage files of Timecho's collection, storage, and intelligent analysis modules are all supported by Apache TsFile. TsFile can be efficiently loaded into TimechoDB and can also be migrated out. Through TsFile, users can use the same file format for data management in the stages of collection, management, application & analysis, greatly simplifying the entire process from data collection to analysis, and improving the efficiency and convenience of time-series data management. - -3. **Time-series model training and inference integrated engine (AINode)**: For intelligent analysis scenarios, TimechoDB provides the AINode time-series model training and inference integrated engine, which offers a complete set of time-series data analysis tools, with the underlying model training engine supporting training tasks and data management, including machine learning, deep learning, etc. With these tools, users can conduct in-depth analysis of the data stored in TimechoDB and mine its value. - -4. 
**Data collection**: To integrate more easily with various industrial collection scenarios, Timecho provides data collection and access services that support multiple protocols and formats and can ingest data generated by various sensors and devices, with features such as resumable transfer after interruption and network barrier penetration. These services are adapted to the difficult configuration, slow transmission, and weak networks typical of industrial field collection, making data collection simpler and more efficient. - -## 2. Product Features - -TimechoDB has the following advantages and characteristics: - -- Flexible deployment methods: Supports one-click cloud deployment, out-of-the-box use after unzipping on the terminal side, and seamless connection between terminal and cloud (data cloud synchronization tool). - -- Low hardware cost storage solution: Supports high compression ratio disk storage, with no need to distinguish between historical and real-time databases, enabling unified data management. - -- Hierarchical sensor organization and management: Supports modeling in the system according to the actual hierarchical relationship of devices to align with the industrial sensor management structure, and supports directory viewing, search, and other capabilities for hierarchical structures. - -- High-throughput data reading and writing: Supports access by millions of devices, high-speed data reading and writing, and complex industrial read/write scenarios such as out-of-order and multi-frequency acquisition. - -- Rich time series query semantics: Provides a native computation engine for time series data, supports timestamp alignment during queries, provides nearly a hundred built-in aggregation and time series calculation functions, and supports time series feature analysis and AI capabilities. 
- -- Highly available distributed system: Supports an HA distributed architecture in which the system provides uninterrupted 24/7 real-time database services, and the failure of a physical node or a network fault does not affect normal operation. When physical nodes are added, removed, or overloaded, the system automatically rebalances computing and storage resources. Heterogeneous environments are supported: servers of different types and performance levels can form a cluster, and the system balances load automatically according to the configuration of each physical machine. - -- Extremely low usage and operation threshold: Supports an SQL-like language, provides native multi-language interfaces for secondary development, and has a complete tool system, including a console. - -- Rich ecosystem integration: Supports integration with big data ecosystem components such as Hadoop and Spark, as well as equipment management and visualization tools such as Grafana, ThingsBoard, and DataEase. - -## 3. Enterprise characteristics - -### 3.1 Higher level product features - -Based on Apache IoTDB, TimechoDB offers a range of advanced product features, with native upgrades and optimizations at the kernel level for industrial production scenarios. These include multi-level storage, cloud-edge collaboration, visualization tools, and security enhancements, allowing users to focus more on business development without worrying too much about underlying logic. This simplifies and enhances industrial production, bringing more economic benefits to enterprises. For example: - - -- Dual Active Deployment: Dual active usually refers to two independent single machines (or clusters) that perform real-time mirror synchronization. Their configurations are completely independent and can simultaneously receive external writes. 
Each independent machine (or cluster) can synchronize the data written to it to the other machine (or cluster), and the data on the two sides eventually becomes consistent. - -- Data Synchronisation: Through the database's built-in synchronization module, data can be aggregated from stations to a center, supporting scenarios such as full aggregation, partial aggregation, and hierarchical aggregation. Both real-time and batch data synchronization modes are supported. Multiple built-in plugins are also provided to meet requirements such as gateway penetration, encrypted transmission, and compressed transmission in enterprise data synchronization applications. - -- Tiered Storage: By upgrading the underlying storage capability, data can be divided into levels such as cold, warm, and hot based on factors such as access frequency and data importance, and stored on different media (such as SSD, mechanical hard drive, cloud storage, etc.). The system also performs data scheduling during queries, reducing customer data storage costs while ensuring data access speed. - -- Security Enhancements: Features like whitelists and audit logs strengthen internal management and reduce the risk of data breaches. - -The detailed functional comparison is as follows: - -
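-| Category | Function | Apache IoTDB | TimechoDB |
-| --- | --- | --- | --- |
-| Deployment Mode | Stand-Alone Deployment | | |
-| Deployment Mode | Distributed Deployment | | |
-| Deployment Mode | Dual Active Deployment | - | |
-| Deployment Mode | Container Deployment | Partial support | |
-| Database Functionality | Sensor Management | | |
-| Database Functionality | Write Data | | |
-| Database Functionality | Query Data | | |
-| Database Functionality | Continuous Query | | |
-| Database Functionality | Trigger | | |
-| Database Functionality | User Defined Function | | |
-| Database Functionality | Permission Management | | |
-| Database Functionality | Data Synchronisation | Only file synchronization, no built-in plugins | Real time synchronization+file synchronization, enriched with built-in plugins |
-| Database Functionality | Stream Processing | Only framework, no built-in plugins | Framework+rich built-in plugins |
-| Database Functionality | Tiered Storage | - | |
-| Database Functionality | View | - | |
-| Database Functionality | White List | - | |
-| Database Functionality | Audit Log | - | |
-| Supporting Tools | Workbench | - | |
-| Supporting Tools | Cluster Management Tool | - | |
-| Supporting Tools | System Monitor Tool | - | |
-| Localization | Localization Compatibility Certification | - | |
-| Technical Support | Expert Support | - | |
-| Technical Support | Use Training | - | |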
- -### 3.2 More efficient/stable product performance - -TimechoDB improves on the stability and performance of the open source version. With enterprise technical support, it can achieve a performance improvement of more than 10x and offers timely fault recovery. - -### 3.3 More User-Friendly Tool System - -TimechoDB provides users with a simpler and more user-friendly tool system. Products such as the Cluster Monitoring Panel (Grafana), Database Console (Workbench), and Cluster Management Tool (Deploy Tool, abbreviated as IoTD) help users quickly deploy, manage, and monitor database clusters, reducing the workload and learning cost for operations staff and making operation and maintenance simpler and more efficient. - -- Cluster Monitoring Panel: Designed to address the monitoring needs of TimechoDB and its operating system. It covers operating system resource monitoring, TimechoDB performance monitoring, and hundreds of kernel monitoring indicators, helping users monitor cluster health and perform cluster tuning and operation. - -
-[Figure panels: Overall Overview | Operating System Resource Monitoring | TimechoDB Performance Monitoring]
- -- Database Console: A low-threshold database interaction tool with a graphical console that helps users perform metadata management, data insertion, deletion, modification, and query, permission management, system management, and other operations in a concise and clear manner, making the database easier to use and improving efficiency. - - -
-[Figure panels: Home Page | Operate Metadata | SQL Query]
- - -- Cluster Management Tool: Addresses the operational difficulties of multi-node distributed systems. It provides cluster deployment, cluster start and stop, elastic scaling, configuration updates, data export, and other functions, so that commands can be issued to complex database clusters with one click, greatly reducing management difficulty. - - -
- -### 3.4 More professional enterprise technical services - -TimechoDB customers receive comprehensive services directly from the original vendor, including but not limited to on-site installation and training, expert consulting, on-site emergency assistance, software upgrades, online self-service, remote support, and guidance on using the latest development version. In addition, to make TimechoDB better suited to industrial production scenarios, we recommend modeling solutions, optimize read-write performance and compression ratios, recommend database configurations, and provide other technical support based on the enterprise's actual data structure and read-write load. For industrial customization scenarios that the product does not cover, Timecho provides customized development tools based on user characteristics. - -Compared to the open source version, TimechoDB has a faster release cadence, with a release every 2-3 months. It also offers day-level exclusive fixes for urgent customer issues to keep production environments stable. - -### 3.5 More compatible localization adaptation - -The TimechoDB code is self-developed and controllable. It is compatible with most mainstream information technology application innovation (Xinchuang) products (CPU, operating system, etc.) and has completed compatibility certification with multiple manufacturers to ensure product compliance and security. \ No newline at end of file diff --git a/src/UserGuide/latest/IoTDB-Introduction/Release-history_timecho.md b/src/UserGuide/latest/IoTDB-Introduction/Release-history_timecho.md deleted file mode 100644 index 6c5df5236..000000000 --- a/src/UserGuide/latest/IoTDB-Introduction/Release-history_timecho.md +++ /dev/null @@ -1,701 +0,0 @@ - -# Release History - -## 1. TimechoDB (Database Core) - -### V2.0.9.4 -> Release Date: 2026.06.10
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.4-bin.zip
-> SHA512 Checksum: 040ebdd9e45d93535e9628cf377003d560be83cec9737f5a5fbd0c3a93a12810814094752eac3eacdfec5cddcf433fa83e76edc14be34c73c1a54d9b937ea1b5 - -Version 2.0.9.4 primarily optimizes table model AINode inference, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- AINode: Table model covariate inference models adaptively support filling null values - - -### V2.0.9.3 -> Release Date: 2026.05.14
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.3-bin.zip
-> SHA512 Checksum: f6c5d50cbf8902503289884f073593c650ffdc8edbebfabf27f6ab4499630749331aa4ed09dd34627a39fa8dee27b4d7e2689d0ed1cf23c76dd9c7270f9fae2a - -Version 2.0.9.3 of AINode newly supports registering multiple models by using the same model code with different model weights. It also includes enhancements and bug fixes for previous versions, with comprehensive improvements to database monitoring, performance and stability. Details are as follows: - -- AINode: [Supports registering custom models with the same model code and different model weights](../AI-capability/AINode_Upgrade_timecho.md#_4-3-register-custom-models) - - -### V2.0.9.2 -> Release Date: 2026.05.11
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.2-bin.zip
-> SHA512 Checksum: 10d3f34b6e65ad5c09b1cf3538ee27e181cc38c5fedf6acfd7d7053797ca23c76245683536275b69bd478aa1e43364351eceef1948832ab663a7398665af9eff - -Version 2.0.9.2 adds import and export capabilities for the Object data type, and introduces the new `tsfile-backup` script (currently supported only for table model scenarios). It also brings optimizations and bug fixes for previous versions, with overall upgrades to database monitoring, performance and stability. Details are as follows: - -- Scripts & Tools: [The `import-data` script for TsFile format](../../latest-Table/Tools-System/Data-Import-Tool_timecho.md#_2-4-tsfile-format) supports Object type data import for table models -- Scripts & Tools: New [`tsfile-backup` script](../../latest-Table/Tools-System/Data-Export-Tool_timecho.md#_3-tsfilebackup-based-on-pipe-framework) added for table models -- Stream Processing Module: PIPE for table models supports [local export and remote transmission of Object type data](../../latest-Table/User-Manual/Data-Sync_timecho.md#_3-9-object-type-data-export) -- System Module: [Audit logs](../User-Manual/Audit-Log_timecho.md) support statistics on the number of slow requests - -### V2.0.9.1 -> Release Date: 2026.05.11
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.9.1-bin.zip
-> SHA512 Checksum: 18ff3801ba58550e06ef0aa4bf4465e8ce1b31d1aecb9c6899eb843f5d9187d3cc575e930ee38d96b87b17067e2b21f1852ab5127eac7480cf5051c20a68894b - -Version 2.0.9.1 adds covariate classification inference to AINode and supports schema-level and table-level storage space statistics. It adds set operations, CTE and multiple built-in functions for data query, enables SQL debugging via DEBUG statements, and supports configuring auto-start on boot. This version also contains improvements and bug fixes for previous versions, and comprehensive enhancements to database monitoring, performance and stability. Details are as follows: - -- AINode: Table models support [time series data classification inference](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md#_4-1-model-inference) -- Query Module: Table models support [set operations (UNION/INTERSECT/EXCEPT)](../../latest-Table/SQL-Manual/Set-Operations_timecho.md) and [Common Table Expressions (CTE)](../../latest-Table/SQL-Manual/Common-Table-Expression_timecho.md) -- Query Module: Newly added [IF scalar function](../../latest-Table/SQL-Manual/Basis-Function_timecho.md#_8-3-if-expression), [binary functions](../../latest-Table/SQL-Manual/Basis-Function_timecho.md#_7-binary-functions) and [APPROX_PERCENTILE aggregate function](../../latest-Table/SQL-Manual/Basis-Function_timecho.md#_2-aggregate-functions) for table models -- Query Module: Supports [DEBUG SQL](../User-Manual/Maintenance-commands_timecho.md#_6-query-debugging) for query debugging and optimizes the result set of [Explain Analyze](../User-Manual/Query-Performance-Analysis.md) -- Query Module: Supports [schema-level](../User-Manual/Maintenance-commands_timecho.md#_1-10-view-disk-space-usage) and [table-level](../../latest-Table/Reference/System-Tables_timecho.md#_2-22-table-disk-usage) storage space occupancy statistics; the [`SHOW CONFIGURATION` statement](../../latest-Table/User-Manual/Maintenance-commands_timecho.md#_1-13-view-node-configuration) 
is available to view cluster configuration information -- Scripts & Tools: Data and metadata import/export tools support the SSL protocol -- Scripts & Tools: Command-line tool adds access [history display](../Tools-System/CLI_timecho.md#_5-access-history-feature) capability -- System Module: Supports [system auto-start](../User-Manual/Auto-Start-On-Boot_timecho.md) configuration -- Others: Fixed security vulnerability CVE-2026-28564 - - -### V2.0.8.3 -> Release Date: 2026.04.21
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.8.3-bin.zip
-> SHA512 Checksum: 4b95bea87cc375bc455897dcf4cec80692421fa5c3eee746e1095b94288611d4afdd94aa8dad70340757d041757758924701cbdb2b73b49fb8730c4caac2a126 - -Version 2.0.8.3 enables reading and writing Object type data via Python. It also includes optimizations and bug fixes for previous versions, with comprehensive upgrades to database monitoring, performance and stability. Details are as follows: - -- Interface Module: [Python Native API](../../latest-Table/API/Programming-Python-Native-API_timecho.md) supports reading and writing Object type data for table models - - -### V2.0.8.2 - -> Release Date: 2026.03.31
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.8.2-bin.zip
-> SHA512 Checksum: 02ab10e3e94786dd5676e0a69609eef192afd90d87f4d8d7bd44e7e9cbc8a18d61ba5668bae56cb8e4416ac71a877f760963b72ca7838d7c39ae10f1ed321d89 - -Version 2.0.8.2 adds support for modifying the full path of time series in the tree model, customizing the Time column name in the table model, and changing data types in both tree and table models. It also adds the ODBC Driver, among other features, and introduces improvements and bug fixes for earlier versions, with comprehensive enhancements to database monitoring, performance, and stability. The detailed release notes are as follows: - -- Storage Module: The tree model supports [modifying the full name of time series](../Basic-Concept/Operate-Metadata_timecho.md#_2-4-修改时间序列名称) and [changing the data type of time series](../Basic-Concept/Operate-Metadata_timecho.md#_2-3-修改时间序列数据类型). -- Storage Module: The table model supports [modifying column data types](../../latest-Table/Basic-Concept/Table-Management_timecho.md#_1-5-修改表) and [customizing the Time column name](../../latest-Table/Basic-Concept/Table-Management_timecho.md#_1-1-创建表). -- Interface Module: Adds support for the [ODBC Driver](../API/Programming-ODBC_timecho.md); the Python SessionDataset supports fetching DataFrames in batches; the MQTT service is externalized, and a new system table named Services is added for service queries. -- AINode: The table model supports adaptive [covariate inference](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md#_4-1-模型推理). -- Stream Processing Module: The tree model data synchronization PIPE statement supports specifying multiple precise paths. - - -### V2.0.8.1 - -> Release Date: 2026.02.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.8.1-bin.zip
-> SHA512 Checksum: 49d97cbf488443f8e8e73cc39f6f320b3bc84b194aed90af695ebd5771650b5e5b6a3abb0fb68059bd01827260485b903c035657b337442f4fdd32c877f2aca3 - -V2.0.8.1 introduces the **Object data type** to table models, significantly enhances audit logging capabilities, optimizes the tree model’s **OPC UA protocol**, adds **covariate-based forecasting** support in AINode, and enables **concurrent inference** in AINode. Additionally, comprehensive improvements have been made to database monitoring, performance, and stability. The detailed release notes are as follows: - -- **Query Module**: Added a list view of available DataNode instances, allowing users to [view each node's RPC address and port](../User-Manual/Maintenance-commands_timecho.md#_1-7-viewing-available-nodes). -- **Query Module**: Introduced a new system table for [statistical query latency analysis](../../latest-Table/Reference/System-Tables_timecho.md#_2-20-queries-costs-histogram). -- **Storage Module**: Added SQL support to retrieve the full definition statements for [tables](../../latest-Table/Basic-Concept/Table-Management_timecho.md#_1-4-view-table-creation-statement) and [views](../../latest-Table/User-Manual/Tree-to-Table_timecho.md#_2-4-viewing-table-views). -- **Storage Module**: Optimized the tree model’s [OPC UA protocol](../API/Programming-OPC-UA_timecho.md). -- **System Module**: Added support for the [Object data type](../../latest-Table/Background-knowledge/Data-Type_timecho.md) in table models. -- **System Module**: Significantly enhanced and upgraded the [audit log](../User-Manual/Audit-Log_timecho.md) functionality. -- **System Module**: Added a new system table to monitor [DataNode connection status](../../latest-Table/Reference/System-Tables_timecho.md#_2-18-connections). -- **AINode**: Integrated the built-in **Chronos-2** model, supporting [covariate-based forecasting](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md). 
-- **AINode**: Built-in models **Timer-XL** and **Sundial** now support [concurrent inference](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md). -- **Stream Processing Module**: When creating a full-data synchronization pipe, it will be [automatically split](../User-Manual/Data-Sync_timecho.md#_2-1-create-a-task) into two independent pipes—one for real-time data and one for historical data—whose remaining event counts can be monitored separately via the `SHOW PIPES` statement. -- **Others**: Fixed security vulnerabilities **CVE-2025-12183**, **CVE-2025-66566**, and **CVE-2025-11226**. - -### V2.0.6.6 - -> Release Date: 2026.01.20
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.6.6-bin.zip
-> SHA512 Checksum: d12e60b8119690d63c501d0c2afcd527e39df8a8786198e35b53338e21939e1a9244805e710d81cbb62d02c2739909d7e8227c029660a0cd9ea7ca718cf9bdf6 - -V2.0.6.6 primarily optimizes query performance for time series in the tree mode, while delivering comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Query Module**: Improved query performance for `SHOW/COUNT TIMESERIES/DEVICES` statements. - - -### V2.0.6.4 - -> Release Date: 2025.11.17
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: timechodb-2.0.6.4-bin.zip
-> SHA512 Checksum: 57b9998cc14632862c32b6781c70db1c52caf8172b5d45d27cc214cab50d3afd4230ed0754e1c1a4ed825666bf971dc81fbb7d3b93261e57e9dabc20e794a2b8 - -V2.0.6.4 focuses on enhancements to the storage and AINode modules, resolves several product defects, and provides comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Storage Module**: Added support for modifying the encoding and compression methods of time series in the tree mode. -* **AINode**: Introduced one-click deployment and optimized model inference capabilities. - - -### V2.0.6.1 - -> Release Date: 2025.09.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.6.1-bin.zip
-> SHA512 Checksum: c88e3e2c0dbd06578bd0697ca9992880b300baee2c4906ba1f952134e37ae2fa803a6af236f4541d318b75f43a498b5d5bfbbc7c445783271076c36e696e4dd0 - -V2.0.6.1 introduces the new table model query write-back function, access control blacklist/whitelist function, bitwise operation functions (built-in scalar functions), and push-downable time functions. Comprehensive enhancements to database monitoring, performance, and stability are also included. Key updates: - -* **Query Module:** - * Supports the table model query write-back function - * The table model row pattern recognition supports the use of aggregate functions to capture continuous data for analytical calculation - * The table model adds built-in scalar functions - bitwise operation functions - * The table model adds push-downable EXTRACT time functions -* **System Module:** - * Adds access control, supporting users to customize and configure blacklist/whitelist functions -* **Others:** - * The default user password is updated to "TimechoDB@2021" with higher security strength - -### V2.0.5.2 - -> Release Date: 2025.08.08
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.5.2-bin.zip
-> SHA512 Checksum: a00a4075c9937b7749c454f71d2480fea5e9ff9659c0628b132e30e2f256c7c537cd91dca4f6be924db0274bb180946a1b88e460c025bf82fdb994a3c2c7b91e - -V2.0.5.2 addresses certain product defects and optimizes the data synchronization function. Comprehensive enhancements to database monitoring, performance, and stability are also included. - - -### V2.0.5.1 - -> Release Date: 2025.07.14
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.5.1-bin.zip
-> SHA512 Checksum: aa724755b659bf89a60da6f2123dfa91fe469d2e330ed9bd029e8f36dd49212f3d83b1025e9da26cb69315e02f65c7e9a93922e40df4f2aa4c7f8da8da2a4cea - -V2.0.5.1 introduces **tree-to-table view**, **window functions** and the **approx\_most\_frequent** aggregate function for the table model, along with support for **LEFT & RIGHT JOIN** and **ASOF LEFT JOIN**. AINode adds two built-in models: **Timer-XL** and **Timer-Sundial**, supporting inference and fine-tuning for tree and table models. Comprehensive enhancements to database monitoring, performance, and stability are also included. Key updates: - -* **Query Module:** - * Supports manually creating tree-to-table views - * Adds window functions for table model - * Adds approx\_most\_frequent aggregate function - * Extends JOIN support: LEFT/RIGHT JOIN, ASOF LEFT JOIN - * Enables row pattern recognition (captures continuous data for analysis) - * New system tables: VIEWS (view metadata), MODELS (model info), etc. -* **System Module:** - * Adds TsFile data encryption -* **AI Module:** - * New built-in models: Timer-XL and Timer-Sundial - * Supports inference/fine-tuning for tree and table models -* **Others:** - * Enables data publishing via OPC DA protocol - -### 2.x Other historical versions - -#### V2.0.4.2 - -> Release Date: 2025.06.21
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.4.2-bin.zip
-> SHA512 Checksum: 31f26473ac90988ce970dac8d0950671bde918f9af6f2f6a6c2bf99a53aa1c0a459c53a137b18ff0b28e70952e9c4b6acb50029e0b2e38837b969eb8f78f2939 - -V2.0.4.2 adds support for passing TOPIC to custom MQTT plugins. Includes comprehensive improvements to monitoring, performance, and stability. - -#### V2.0.4.1 - -> Release Date: 2025.06.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.4.1-bin.zip
-> SHA512 Checksum: 93ac08bfae06aff6db04849f474458433026f66778f4f5c402eb22f1a7cb14d8096daf0a9e9cc365ddfefd4f8ca4443b2a9fb6461906f056b1e6a344990beb3a - -V2.0.4.1 introduces **User-Defined Table Functions (UDTF)** and multiple built-in table functions for the table model, adds the **approx\_count\_distinct** aggregate function, and enables **ASOF INNER JOIN on timestamp columns**. Script tools are categorized, with Windows-specific scripts separated out. Key updates: - -* **Query Module:** - * Adds UDTFs and built-in table functions - * Supports ASOF INNER JOIN on timestamps - * Adds approx\_count\_distinct aggregate function -* **Stream Processing:** - * Supports asynchronous TsFile loading via SQL -* **System Module:** - * Disaster-aware load balancing strategy for replica selection during downsizing - * Compatibility with Windows Server 2025 -* **Scripts & Tools:** - * Categorized scripts; isolated Windows-specific tools - -#### V2.0.3.4 - -> Release Date: 2025.06.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.3.4-bin.zip
-> SHA512 Checksum: d80d34b7d3890def75b17c491fc4c13efc36153a5950a9b23744755d04d6adb5d6ab9ec970101183fef7bfeb8a559ef92fce90d2d22f7b7fd5795cd5589461bb - -V2.0.3.4 upgrades the user password encryption algorithm to ​**​SHA-256​**​. Includes comprehensive monitoring, performance, and stability improvements. - -#### V2.0.3.3 - -> Release Date: 2025.05.16
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.3.3-bin.zip
-> SHA512 Checksum: f47e3fb45f869dbe690e7cfaa93f95e5e08a462b362aa9d7ccac7ee5b55022dc8f62db12009dfde055f278f3003ff9ea7c22849d52a3ef2c25822f01ade78591 - -V2.0.3.3 introduces ​**​metadata import/export scripts for table models​**​, ​**​Spark ecosystem integration​**​, and adds ​**​timestamps to AINode results​**​. New aggregate/scalar functions are added. Key updates: - -* ​**​Query Module:​**​ - * New aggregate function: count\_if; scalar functions: greatest/least - * Significant optimization for full-table count(\*) queries -* ​**​AI Module:​**​ - * Timestamps added to AINode results -* ​**​System Module:​**​ - * Optimized metadata performance for table model - * Active monitoring & loading of TsFiles - * New metrics: TsFile parsing time, Tablet conversion count -* ​**​Ecosystem Integration:​**​ - * Spark integration for table model -* ​**​Scripts & Tools:​**​ - * import-schema/export-schema scripts support table model metadata - -#### V2.0.3.2 - -> Release Date: 2025.05.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.3.2-bin.zip
-> SHA512 Checksum: 76bd294de4b01782e5dd621a996aeb448e4581f98c70fb5b72b17dc392c2e1227c0d26bd3df5533669a80f217a83a566bc6ec926b7efd21ce7a89b894cd33e19 - -V2.0.3.2 resolves product defects, optimizes node removal, and enhances monitoring, performance, and stability. - -#### V2.0.2.1 - -> Release Date: 2025.04.07
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.2.1-bin.zip
-> SHA512 Checksum: a41be3f8c57e6a39ac165f1d6ab92c9ed790b0712528f31662c58617f4c94e6bfc9392a9c1ef2fc5bdd8c7ca79901389f368cbdbec3e5b1d5c1ce155b2f1a457 - -V2.0.2.1 adds ​**​table model permission management​**​, ​**​user management​**​, and ​**​operation authentication​**​, alongside UDFs, system tables, and nested queries. Data subscription mechanisms are optimized. Key updates: - -* ​**​Query Module:​**​ - * Added UDF management: User-Defined Scalar Functions (UDSF) & Aggregate Functions (UDAF) - * Configurable URI-based loading for UDF/PipePlugin/Trigger/AINode JARs - * Permission/user management with operation authentication - * New system tables and maintenance statements -* ​**​System Module:​**​ - * CSharp client supports table model - * New C++ Session write APIs for table model - * Multi-tier storage supports S3-compliant non-AWS object storage - * New pattern\_match function -* ​**​Data Sync:​**​ - * Table model metadata sync and delete propagation - -#### V2.0.1.2 - -> Release Date: 2025.01.25
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: timechodb-2.0.1.2-bin.zip
-> SHA512 Checksum: 51c2fa5da2974a8a3c8871dec1c49bd98e5d193a13ef33ac7801adb833a1e360d74f0160bcdf33c7ffb23a5c5e0f376e26a4315cf877f1459483356285b85349 - -V2.0.1.2 officially implements ​**​dual-model configuration (tree + table)​**​. The table model supports ​**​standard SQL queries​**​, diverse functions/operators, stream processing, and Benchmarking. Python client adds four new data types, and script tools support TsFile/CSV/SQL import/export. Key updates: - -* ​**​Time-Series Table Mode:​**​ - * Standard SQL: SELECT, WHERE, JOIN, GROUP BY, ORDER BY, LIMIT, nested queries -* ​**​Query Module:​**​ - * Logical operators, math functions, time-series functions (e.g., DIFF) - * Configurable URI-based JAR loading -* ​**​Storage Module:​**​ - * Session API writes with auto-metadata creation - * Python client supports: String, Blob, Date, Timestamp - * Optimized compaction task priority -* ​**​Stream Processing:​**​ - * Auth info specification on sender side - * TsFile Load for table model - * Plugin adaptation for table model -* ​**​System Module:​**​ - * Enhanced DataNode downsizing stability - * Supports DROP DATABASE in read-only mode -* ​**​Scripts & Tools:​**​ - * Benchmark adapted for table model - * Support for String/Blob/Date/Timestamp in Benchmark - * import-data/export-data: Universal support for TsFile/CSV/SQL -* ​**​Ecosystem Integration:​**​ - * Kubernetes Operator support - - -### V1.3.7.3 - -> Release Date: 2026.06.02
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.7.3-bin.zip
-> SHA512 Checksum: 8e6cde061421a552b9855f39f9cccd4838c820dc15ef0ad2a7c23a54cd6cc4f06c35190c1f428784e6a4d5463dd1b794f58ff5cdf891f27f6d0be4d3ab00bf6f - -V1.3.7.3 primarily optimizes query module and data synchronization capabilities, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- Query Module: Optimized `Last` queries, aligned series queries, reverse-order time filter queries, and other scenarios. -- Metadata Module: Optimized device creation validation for activated series and their child paths. -- Data Synchronization: Optimized the retry mechanism after synchronization failures. -- Data Synchronization: Cross-network-gateway synchronization plugin supports configuring the real-time write transmission timeout. -- Interface Module: Added error code validation to the Go client write interface. -- Interface Module: Optimized C# client connection pool management. - - -### V1.3.7.2 - -> Release Date: 2026.04.07
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.7.2-bin.zip
-> SHA512 Checksum: 787766af64992069f0db0ac8b250b461d799307b3ce06b0782fc25752c8c5307fa2205c9e3a38a41685b81bb6b4b5c1ec9f71a395bfad285caf90de7b8224783 - -V1.3.7.2 primarily optimizes data synchronization and query module capabilities, fixes several product defects, and provides comprehensive improvements to database monitoring, performance, and stability. Specific release contents are as follows: - -- Data Synchronization: Optimized distribution performance for Pipe complex path matching scenarios. -- Query Module: The `SHOW QUERIES` statement now includes client IP, query timeout, server wait time, and other information. -- Ecosystem Integration: Supports IoTDB pushing data to an external OPC Server in OPC Client mode. - - -### V1.3.6.6 - -> Release Date: 2026.01.20
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.6-bin.zip
-> SHA512 Checksum: 590d3ead053298c6df0ede637572ba598b9b684f8b35ab874bd4452f765e1421938f4cca2cf0423af2e806592aa8b15bdd25b41df7de809435a4d0239fc04790 - -V1.3.6.6 enhances data read/write capabilities, resolves several product defects, and delivers comprehensive improvements in database monitoring, performance, and stability. - - -### V1.3.6.3 - -> Release Date: 2026.01.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.3-bin.zip
-> SHA512 Checksum: 43719a1384f59f63cb0029cdda0aba433383cd1a0f5ebc142e54f8aa6623cc30a7efb3e3aef7f3d485d5e07bec91be215c92ed21b5201613d5cc44044251c978 - -V1.3.6.3 focuses on deep optimizations in two core areas—query performance and memory management—while comprehensively enhancing database monitoring, performance, and stability. Specific release contents are as follows: - -* **Query Module**: Optimized query performance across multiple scenarios, including multi-series `Last` queries. -* **Query Module**: Added a new `FastLastQuery` interface in the Java SDK for more efficient `Last` query operations. -* **Query Module**: Modified the tree model’s `fetchSchema` to return results in segmented streaming mode, improving response speed under large-data-volume conditions. -* **Storage Module**: Enhanced memory management to mitigate memory leak risks and ensure long-term system stability. -* **Storage Module**: Optimized the file compaction mechanism to improve compaction efficiency and reduce storage resource consumption. -* **Others**: Fixed security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226. - -### V1.3.6.1 - -> Release Date: 2025.12.09
-> Download Link: Please contact Timecho Team to obtain the download link
-> Package Name: iotdb-enterprise-1.3.6.1-bin.zip
-> SHA512 Checksum: 9fb6a6870aa2133bfc40508324a7d97ee078d0d44895beef7b0a331edd203419119fb02b933f585b6c4a6fe9b59708a053d7cf65206b22b1a4f01a5fe518424c - -V1.3.6.1 focuses on deep optimization of data synchronization stability, while delivering comprehensive improvements in database monitoring, performance, and stability. Specific release contents are as follows: - -* **Data Synchronization**: Enhanced Pipe SQL parameter configuration to support specifying asynchronous loading methods. -* **Data Synchronization**: Introduced syntactic sugar that automatically splits full-data Pipe creation SQL into real-time and historical synchronization components. -* **System Module**: Added a global configuration option for data-type-specific compression strategies, enabling on-demand adjustment of storage compression policies. - - -### V1.3.5.11 - -> Release Date: 2025.09.24
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.11-bin.zip
-> SHA512 Checksum: f18419e20c0d7e9316febee5a053306a97268cb07e18e6933716c2ef98520fbbe051dfa1da02a9c83e8481a839ce35525ce6c50f890f821e3d760f550c75f804 - -V1.3.5.11 version primarily optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -### V1.3.5.10 - -> Release Date: 2025.08.27
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.10-bin.zip
-> SHA512 Checksum: 3aea6d2318f52b39bfb86dae9ff06fe1b719fdeceaabb39278c9a73544e1ceaf0660339f9342abb888c8281a0fb6144179dac9bb0c40ba0ecc66bac4dd7cbe80 - -V1.3.5.10 version fixes certain product defects and includes comprehensive enhancements to database monitoring, performance, and stability. - -### V1.3.5.9 - -> Release Date: 2025.08.25
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.9-bin.zip
-> SHA512 Checksum: 95b7a6790e94dc88e355a81e5a54b10ee87bdadae69ba0b215273967b3422178d5ee81fa5adf1c5380a67dbb30cf9782eaa3cbfd6ec744b0fd9a91c983ee8f70 - -V1.3.5.9 version optimizes memory control, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -### 1.x Other historical versions - -#### V1.3.5.8 - -> Release Date: 2025.08.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.8-bin.zip
-> SHA512 Checksum: aa9802301614e20294a7f2fc4c149ba20d58213d9b74e8f8c607e0f4860949bad164bce2851b63c1d39b7568d62975ab257c269b3a9c168a29ea3945b6d28982 - -V1.3.5.8 version optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -#### V1.3.5.7 - -> Release Date: 2025.08.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.7-bin.zip
-> SHA512 Checksum: 17374a440267aed3507dcc8cf4dc8703f8136d5af30d16206a6e1101e378cbbc50eda340b1598a12df35fe87d96db20f7802f0e64033a013d4b81499198663d4 - -V1.3.5.7 version optimizes the data synchronization function, fixes certain product defects, and includes comprehensive enhancements to database monitoring, performance, and stability. - -#### V1.3.5.6 - -> Release Date: 2025.07.16
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.6-bin.zip
-> SHA512 Checksum: 05b9fda4d98ba8a1c9313c0831362ed3d667ce07cb00acaeabcf6441a6d67dff7da27f3fda2a5e1b3c3b85d1e5c730a534f3aa2f0c731b8c03ef447203b32493 - -V1.3.5.6 introduces a new configuration switch to disable the data subscription feature. It optimizes the C++ high-availability client and addresses PIPE synchronization latency issues in normal operation, restart, and deletion scenarios, along with query performance for large TEXT objects. Comprehensive enhancements to database monitoring, performance, and stability are also included. - -#### V1.3.5.4 - -> Release Date: 2025.06.19
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.4-bin.zip
-> SHA512 Checksum: edac5f8b70dd67b3f84d3e693dc025a10b41565143afa15fc0c4937f8207479ffe2da787cc9384440262b1b05748c23411373c08606c6e354ea3dcdba0371778 - -V1.3.5.4 fixes several product defects and optimizes the node removal functionality. It also delivers comprehensive improvements to database monitoring, performance, and stability. - -#### V1.3.5.3 - -> Release Date: 2025.06.13
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.3-bin.zip
-> SHA512 Checksum: 5f807322ceec9e63a6be86108cc57e7ad4251b99a6c28baf11256ab65b2145768e9110409f89834d5f4256094a8ad995775c0e59a17224ff2627cd9354e09d82 - -V1.3.5.3 focuses on optimizing data synchronization capabilities, including persisting PIPE transmission progress and adding monitoring metrics for PIPE event transfer time. Related defects have been resolved. Additionally, the encryption algorithm for user passwords has been upgraded to SHA-256. Comprehensive enhancements to database monitoring, performance, and stability are included. - -#### V1.3.5.2 - -> Release Date: 2025.06.10
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.2-bin.zip
-> SHA512 Checksum: 4c0a5db76c6045dfd27cce303546155cdb402318024dae5f999f596000d7b038b13bbeac39068331b5c6e2c80bc1d89cd346dd0be566fe2fe865007d441d9d05 - -V1.3.5.2 primarily optimizes data synchronization features, adding support for cascading configurations via parameters and ensuring fully consistent ordering between synchronized and real-time writes. It also enables partitioned sending of historical and real-time data after system restarts. Comprehensive enhancements to database monitoring, performance, and stability are included. - -#### V1.3.5.1 - -> Release Date: 2025.05.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.5.1-bin.zip
-> SHA512 Checksum: 91f22bafbdd4d580126ed59ba1ba99d14209f10ce4a0a4bd7d731943ac99fdb6ebfab6e3a1e294a7cb7f46367e9fd4252b0d9ac4d4240ddedf6d85658e48f212 - -V1.3.5.1 resolves several product defects and delivers comprehensive improvements to database monitoring, performance, and stability. - -#### V1.3.4.2 - -> Release Date: 2025.04.14
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.4.2-bin.zip
-> SHA512 Checksum: 52fbd79f5e7256e7d04edc8f640bb8d918e837fedd1e64642beb2b2b25e3525b5f5a4c92235f88f6f7b59bfcdf096e4ea52ab85bfef0b69274334470017a2c5b - -V1.3.4.2 enhances the data synchronization function by supporting bi-directional active-active synchronization of data forwarded through external PIPE sources. - -#### V1.3.4.1 - -> Release Date: 2025.01.08
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.4.1-bin.zip
-> SHA512 Checksum: e9d46516f1f25732a93cc915041a8e59bca77cf8a1018c89d18ed29598540c9f2bdf1ffae9029c87425cecd9ecb5ebebea0334c7e23af11e28d78621d4a78148 - -V1.3.4.1 introduces pattern matching functions, continuously optimizes the data subscription mechanism, improves stability, and extends import-data/export-data scripts to support new data types while unifying TsFile, CSV and SQL import/export formats. Comprehensive improvements have been made to database monitoring, performance and stability. Key updates: - -* Query Module: Configurable URI-based JAR loading for UDFs, PipePlugins, Triggers and AINodes -* System Module: Extended UDF functionality with new pattern\_match function -* Data Sync: Supports specifying authentication info at sender -* Ecosystem: Kubernetes Operator support -* Scripts: import-data/export-data now supports strings, BLOBs, dates and timestamps -* Scripts: Unified import/export support for TsFile, CSV and SQL formats - -#### V1.3.3.3 - -> Release Date: 2024.10.31
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.3-bin.zip
-> SHA512 Checksum: 4a3eceda479db3980e9c8058628e71ba5a16fbfccf70894e8181aea5e014c7b89988d0093f6d42df29d478340a33878602a3924bec13f442a48611cec4e0e961 - -V1.3.3.3 improves restart recovery performance, enables DataNodes to actively monitor/load TsFiles with observability metrics, supports automatic loading at receivers when senders transfer files to specified directories, and adds Alter Source capability for Pipes. Comprehensive improvements to monitoring, performance and stability include: - -* Data Sync: Automatic type conversion for inconsistent data at receivers -* Data Sync: Enhanced observability with ops/latency metrics for internal APIs -* Data Sync: OPC-UA sink plugin supports CS mode and non-anonymous access -* Subscription: SDK supports create\_if\_not\_exists and drop\_if\_exists APIs -* Stream Processing: Alter Pipe supports Alter Source -* System: Added latency monitoring for REST module -* Scripts: Auto-loading TsFiles from specified directories -* Scripts: import-tsfile supports remote server execution -* Scripts: Kubernetes Helm support -* Scripts: Python client supports new data types (string, BLOB, date, timestamp) - -#### V1.3.3.2 - -> Release Date: 2024.08.15
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.2-bin.zip
-> SHA512 Checksum: 32733610da40aa965e5e9263a869d6e315c5673feaefad43b61749afcf534926398209d9ca7fff866c09deb92c09d950c583cea84be5a6aa2c315e1c7e8cfb74 - -V1.3.3.2 adds metrics for mods file reading time, merge sort memory usage and dispatch latency, supports configurable time partition origin adjustment, enables automatic subscription termination based on pipe completion markers, and improves merge memory control. Key updates: - -* Query: Explain Analyze shows mods file read time -* Query: Explain Analyze shows merge sort memory and dispatch latency -* Storage: Added configurable file splitting during compaction -* System: Configurable time partition origin -* Stream Processing: Auto-terminate subscriptions on pipe completion markers -* Data Sync: Configurable RPC compression levels -* Scripts: Export filters only root.\_\_system paths - -#### V1.3.3.1 - -> Release Date: 2024.07.12
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.3.1-bin.zip
-> SHA512 Checksum: 1fdffbc1f18bfabfa3463a5a6fbc4f6ba6ab686942f9e85e7e6be1840fb8700e0147e5e73fd52201656ae6adb572cc2e5ecc61bcad6fa4c5a4048c4207e3c6c0 - -V1.3.3.1 adds tiered storage throttling, supports username/password auth specification at sync senders, optimizes ambiguous WARN logs at receivers, improves restart performance, and merges configuration files. Key updates: - -* Query: Optimized Filter performance for faster aggregation/WHERE queries -* Query: Java Session evenly distributes SQL requests across nodes -* System: Merged config files into iotdb-system.properties -* Storage: Added tiered storage throttling -* Data Sync: Username/password auth specification at senders -* System: Optimized restart recovery time - -#### V1.3.2.2 - -> Release Date: 2024.06.04
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.2.2-bin.zip
-> SHA512 Checksum: ad73212a0b5025d18d2481163f6b2d4f604e06eb5e391cc6cba7bf4e42792e115b527ed8bfb5cd95d20a150645c8b4d56a531889dac229ce0f63139a27267322 - -V1.3.2.2 introduces EXPLAIN ANALYZE for SQL profiling, UDAF framework, automatic data deletion at disk thresholds, metadata sync, path-specific data point counting, and SQL import/export scripts. Supports rolling cluster upgrades and cluster-wide plugin distribution with comprehensive monitoring/performance improvements. Key updates: - -* Storage: Improved insertRecords performance -* Storage: SpaceTL feature for auto-deletion at disk thresholds -* Query: EXPLAIN ANALYZE for SQL stage-level profiling -* Query: New UDAF framework -* Query: New envelope demodulation analysis in UDFs -* Query: MaxBy/MinBy functions returning timestamps with values -* Query: Faster value-filtered queries -* Data Sync: Wildcard path matching -* Data Sync: Metadata synchronization (including attributes/permissions) -* Stream Processing: ALTER PIPE for hot plugin updates -* System: TsFile load statistics in data point counting -* Scripts: Local upgrade/backup via hard links -* Scripts: New export-data/import-data for CSV/TsFile/SQL formats -* Scripts: Windows window title differentiation for ConfigNode/DataNode/Cli - -#### V1.3.1.4 - -> Release Date: 2024.04.23
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.1.4-bin.zip
-> SHA512 Checksum: 8547702061d52e2707c750a624730eb2d9b605b60661efa3c8f11611ca1685aeb51b6f8a93f94c1b30bf2e8764139489c9fbb76cf598cfa8bf9c874b2a7c57eb - -V1.3.1.4 adds cluster activation status viewing, variance/stddev aggregation functions, FILL timeout settings, TsFile repair command, one-click info collection scripts, and cluster control scripts while optimizing views and stream processing. Key updates: - -* Query: FILL clause timeout threshold -* Query: REST V2 returns column types -* Data Sync: Simplified time range specification -* Data Sync: SSL support (iotdb-thrift-ssl-sink) -* System: SQL query for cluster activation status -* System: Tiered storage transfer rate control -* System: Enhanced observability (node divergence, task scheduling) -* System: Optimized default logging -* Scripts: One-click cluster control scripts (start-all/stop-all) -* Scripts: One-click info collection scripts (collect-info) - -#### V1.3.0.4 - -> Release Date: 2024.01.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.3.0.4-bin.zip
-> SHA512 Checksum: 3c07798f37c07e776e5cd24f758e8aaa563a2aae0fb820dad5ebf565ad8a76c765b896d44e7fdb7dad2e46ffd4262af901c765f9bf6af926bc62103118e38951 - -V1.3.0.4 introduces the AINode machine learning framework, upgrades permission granularity to time-series level, and optimizes views/stream processing for better usability and stability. Key updates: - -* Query: New AINode ML framework -* Query: Fixed slow SHOW PATH responses -* Security: Time-series granular permissions -* Security: SSL client-server encryption -* Stream Processing: New metrics monitoring -* Query: LAST queries on non-writable views -* System: Improved data point counting accuracy - -#### V1.2.0.1 - -> Release Date: 2023.06.30
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.2.0.1-bin.zip
-> SHA512 Checksum: dcf910d0c047d148a6c52fa9ee03a4d6bc3ff2a102dc31c0864695a25268ae933a274b093e5f3121689063544d7c6b3b635e5e87ae6408072e8705b3c4e20bf0 - -V1.2.0.1 introduces stream processing framework, dynamic templates, substring/replace/round functions, enhances SHOW REGION/TIMESERIES/VARIABLE statements and Session APIs while optimizing monitoring metrics. Key updates: - -* Stream Processing: New framework -* Metadata: Dynamic template expansion -* Storage: New SPRINTZ/RLBE encoding and LZMA2 compression -* Query: New CAST, ROUND, SUBSTR, REPLACE functions -* Query: New TIME\_DURATION, MODE aggregation -* Query: CASE WHEN syntax support -* Query: ORDER BY expression support -* Interface: Python API multi-node connection -* Interface: Python client write redirection -* Interface: Batch sequence creation via templates - -#### V1.1.0.1 - -> Release Date: 2023.04.03
-> Download Link: Please contact Timecho Team to obtain the download link
-> Installation Package Name: iotdb-enterprise-1.1.0.1.zip
-> SHA512 Checksum: 58df58fc8b11afeec8436678842210ec092ac32f6308656d5356b7819acc199f1aec4b531635976b091b61d6736f0d9706badcabeaa5de50939e5c331c1dc804 - -V1.1.0.1 introduces GROUP BY VARIATION/CONDITION, DIFF/COUNT\_IF functions, and pipeline execution engine while fixing issues including: - -* Aligned sequence LAST queries with ORDER BY TIMESERIES -* LIMIT & OFFSET failures -* Post-restart metadata template errors -* Sequence creation after database deletion - -Key updates: - -* Query: ALIGN BY DEVICE supports ORDER BY TIME -* Query: SHOW QUERIES/KILL QUERY commands -* System: SHOW REGIONS per database -* System: SHOW VARIABLES for cluster parameters -* Query: GROUP BY VARIATION/CONDITION -* Query: SELECT INTO type casting -* Query: New DIFF (scalar), COUNT\_IF (aggregate) -* System: SHOW REGIONS creation time -* System: Configurable dn\_rpc\_port/address - -## 2. Workbench (Console Tool) - -| **Version** | **Description** | **Supported IoTDB Versions** | **SHA512 checksum** | -| ----------- | ------------------------------------------------------------ | ----------------------------------- | ------------------------------------------------------------ | -| V2.1.1 | Optimize the measuring point selection on the trend interface to support scenarios without devices | V2.0 and above | aa05fd4d9f33f07c0949bc2d6546bb4b9791ed5ea94bcef27e2bf51ea141ec0206f1c12466aced7bf3449e11ad68d65378d697f3d10cb4881024a83746029a65 | -| V2.0.1-beta | The first version of the V2.x series, supporting dual models of tree and table | V2.0 and above | 0ca0d5029874ed8ada9c7d1cb562370b3a46913eed66d39c08759287ccc8bf332cf80bb8861e788614b61ae5d53a9f5605f553e1a607e856f395eb5102e7cc4d | -| V1.5.7 | Optimize the point list by splitting point names into device names and points, ensure the point selection area supports horizontal scrolling, and align the export file column order with the page display. 
| All 1.x versions from V1.3.4 onward | d3cd4a63372ca5d6217b67dddf661980c6a442b3b1564235e9ad34fc254d681febd58c2cc59c6273ffbfd8a1b003b9adb130ecfaaebe1942003b0d07427b1fcc | -| V1.5.6 | Enhanced CSV import/export: optional tags/aliases on import; support for measurement descriptions with backtick-quoted quotes on export. | All 1.x versions from V1.3.4 onward | 276ac1ea341f468bf6d29489c9109e9aa61afe2d1caaab577bc40603c6f4120efccc36b65a58a29ce6a266c21b46837aad6128f84ba5e676231ea9e6284a35e5 | -| V1.5.5 | Added server clock functionality and support for activating Enterprise Edition license databases | All 1.x versions from V1.3.4 onward | b18d01b70908d503a25866d1cc69d14e024d5b10ca6fcc536932fdbef8257c66e53204663ce3be5548479911aca238645be79dfd7ee7e65a07ab3c0f68c497f6 | -| V1.5.4 | Added authentication for Prometheus settings in Instance Management | All 1.x versions from V1.3.4 onward | adc7e13576913f9e43a9671fed02911983888da57be98ec8fbbb2593600d310f69619d32b22b569520c88e29f100d7ccae995b20eba757dbb1b2825655719335 | -| V1.5.1 | Added AI analysis and pattern matching | All 1.x versions from V1.3.2 onward | 4f2053a2a3b2b255ce195268d6cd245278f3be32ba4cf68be1552c386d78ed4424f7bdc9d8e68c6b8260b3e398c8fd23ff342439c4e88e1e777c62640d2279f9 | -| V1.4.0 | Added tree model display and English UI | All 1.x versions from V1.3.2 onward | 734077f3bb5e1719d20b319d8b554ce30718c935cb0451e02b2c9267ff770e9c2d63b958222f314f16c2e6e62bf78b643255249b574ee6f37d00e123433981e8 | -| V1.3.1 | Enhanced analysis methods and import templates | All 1.x versions from V1.3.2 onward | 134f87101cc7f159f8a22ac976ad2a3a295c5435058ee0a15160892aac46ac61dd3cfb0633b4aea9cc7415bf904d0ae65aaf77d663f027d864204d81fb34768b | -| V1.3.0 | Added DB configuration and UI refinements | All 1.x versions from V1.3.2 onward | 94a137fc5c681b211f3e076472a9c5875d59e7f0cd6d7409cb8f66bb9e4f87577a0f12dd500e2bcb99a435860c82183e4a6514b638bcb4aecfb48f184730f3f1 | -| V1.2.6 | Optimized permission controls | All 1.x versions from V1.3.1 
onward | f345b7edcbe245a561cb94ec2e4f4d40731fe205f134acadf5e391e5874c5c2477d9f75f15dbaf36c3a7cb6506823ac6fbc2a0ccce484b7c4cc71ec0fbdd9901 | -| V1.2.5 | Added "Common Templates" and caching | All 1.x versions from V1.3.0 onward | 37376b6cfbef7df8496e255fc33627de01bd68f636e50b573ed3940906b6f3da1e8e8b25260262293b8589718f5a72180fa15e5823437bf6dc51ed7da0c583f7 | -| V1.2.4 | Added import/export for calculations, time alignment field | All 1.x versions from V1.2.2 onward | 061ad1add38c109c1a90b06f1ddb7797bd45e84a34a4f77154ee48b90bdc7ecccc1e25eaa53fbbc98170d99facca93e3536192dd8d10a50ce505f59923ce6186 | -| V1.2.3 | Added activation details and analysis features | All 1.x versions from V1.2.2 onward | 254f5b7451300f6f99937d27fd7a5b20847d5293f53e0eaf045ac9235c7ea011785716b800014645ed5d2161078b37e1d04f3c59589c976614fb801c4da982e1 | -| V1.2.2 | Optimized point description display | All 1.x versions from V1.2.2 onward | 062e520d010082be852d6db0e2a3aa6de594eb26aeb608da28a212726e378cd4ea30fca5e1d2c3231ebd8de29e94ca9641f1fabc1cea46acfb650c37b7681b4e | -| V1.2.1 | Added sync monitoring panel, Prometheus hints | All 1.x versions from V1.2.2 onward | 8a3bcf87982ad5004528829b121f2d3945429deb77069917a42a8c8d2e2e2a2c24a398aaa87003920eeacc0c692f1ed39eac52a696887aa085cce011f0ddd745 | -| V1.2.0 | Major Workbench upgrade | All 1.x versions from V1.2.0 onward | ea1f7d3a4c0c6476a195479e69bbd3b3a2da08b5b2bb70b0a4aba988a28b5db5a209d4e2c697eb8095dfdf130e29f61f2ddf58c5b51d002c8d4c65cfc13106b3 | diff --git a/src/UserGuide/latest/QuickStart/QuickStart_timecho.md b/src/UserGuide/latest/QuickStart/QuickStart_timecho.md deleted file mode 100644 index cb58bf5f5..000000000 --- a/src/UserGuide/latest/QuickStart/QuickStart_timecho.md +++ /dev/null @@ -1,108 +0,0 @@ - - - -# Quick Start - -This document will guide you through methods to get started quickly with IoTDB. - -## 1. How to Install and Deploy? - -This guide will assist you in quickly installing and deploying IoTDB. 
Use the links below to jump to the content you need: - -1. Prepare the necessary machine resources: Deploying and running IoTDB requires appropriately sized machine resources. For specific resource configurations, please refer to [Database Resource](../Deployment-and-Maintenance/Database-Resources.md) - -2. Complete system configuration preparations: IoTDB's system configuration involves multiple aspects. For an introduction to key system configurations, please see [System Requirements](../Deployment-and-Maintenance/Environment-Requirements.md) - -3. Obtain the installation package: Contact the Timecho Team to get the IoTDB installation package, which ensures you receive the latest stable version. For the structure of the installation package, please refer to [Obtain TimechoDB](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) - -4. Install the database and activate it: Depending on your deployment architecture, choose one of the following tutorials: - - - Stand-Alone Deployment: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - - - Distributed (Cluster) Deployment: [Distributed (Cluster) Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - - - Dual-Active Deployment: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -> ❗️Note: We currently still recommend direct installation and deployment on physical/virtual machines. For Docker deployment, please refer to [Docker Deployment](../Deployment-and-Maintenance/Docker-Deployment_timecho.md) - -5.
Install database supporting tools: The enterprise version provides supporting tools such as a monitoring panel and Workbench. It is recommended to install them when deploying the enterprise version, as they make IoTDB more convenient to use: - - - Monitoring panel: Provides over a hundred database monitoring metrics for detailed monitoring of IoTDB and its operating system, enabling system optimization, performance optimization, bottleneck discovery, and more. For installation steps, see [Monitoring panel](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - Workbench: The visual interface of IoTDB. It supports metadata operations, data queries, data visualization, and other functions through an interactive interface, helping users use the database easily and efficiently. For installation steps, see [Workbench Deployment](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - -## 2. How to Use IoTDB? - -1. Database Modeling Design: Database modeling is a crucial step in creating a database system, involving the design of data structures and relationships to ensure that the organization of data meets the needs of specific applications. The following documents will help you quickly understand IoTDB's modeling design: - - - Introduction to Time Series Concepts: [Navigating Time Series Data](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - - - Introduction to Modeling Design: [Data Model and Terminology](../Background-knowledge/Data-Model-and-Terminology_timecho.md) - - - Introduction to SQL syntax: [SQL syntax](../Basic-Concept/Operate-Metadata_timecho.md) - -2. Write Data: IoTDB provides multiple ways to insert real-time data. For the basic data writing operations, see [Write Data](../Basic-Concept/Write-Data_timecho.md) - -3. Query Data: IoTDB provides rich data query functions.
See the basic introduction to data queries: [Query Data](../Basic-Concept/Query-Data_timecho.md) - -4. Other advanced features: In addition to common database functions such as writing and querying, IoTDB also supports Data Synchronisation, Stream Framework, Security Management, Database Administration, AI Capability, and other features. For specific usage, see the corresponding documents: - - - Data Synchronisation: [Data Synchronisation](../User-Manual/Data-Sync_timecho.md) - - - Stream Framework: [Stream Framework](../User-Manual/Streaming_timecho.md) - - - Security Management: [Security Management](../User-Manual/Black-White-List_timecho.md) - - - Database Administration: [Database Administration](../User-Manual/Authority-Management_timecho.md) - - - AI Capability: [AI Capability](../AI-capability/AINode_timecho.md) - -5. API: IoTDB provides multiple application programming interfaces (APIs) for developers to interact with IoTDB in their applications. It currently supports the [Java Native API](../API/Programming-Java-Native-API_timecho.md), [Python Native API](../API/Programming-Python-Native-API_timecho.md), [C++ Native API](../API/Programming-Cpp-Native-API.md), and [Go Native API](../API/Programming-Go-Native-API.md). For more APIs, please refer to the API chapters on the official website. - -## 3. What other convenient tools are available? - -In addition to its rich features, IoTDB also has a comprehensive set of peripheral tools. This section will help you quickly get started with them: - - - Workbench: Workbench is a visual interface for IoTDB that supports interactive operations. It offers intuitive features for metadata management, data querying, and data visualization, enhancing the convenience and efficiency of user database operations.
For detailed usage instructions, please refer to: [Workbench](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - - - Monitor Tool: This is a tool for meticulous monitoring of IoTDB and its host operating system, covering hundreds of database monitoring metrics including database performance and system resources, which aids in system optimization and bottleneck identification. For detailed usage instructions, please refer to: [Monitor Tool](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - Benchmark Tool: IoT benchmark is a time series database benchmark testing tool developed based on Java and big data environments, developed and open sourced by the School of Software at Tsinghua University. It supports multiple writing and querying methods, can store test information and results for further query or analysis, and supports integration with Tableau to visualize test results. For specific usage instructions, please refer to: [Benchmark Tool](../Tools-System/Benchmark.md) - - - Data Import Script: For different scenarios, IoTDB provides users with multiple ways to batch import data. For specific usage instructions, please refer to: [Data Import](../Tools-System/Data-Import-Tool_timecho.md) - - - Data Export Script: For different scenarios, IoTDB provides users with multiple ways to batch export data. For specific usage instructions, please refer to: [Data Export](../Tools-System/Data-Export-Tool_timecho.md) - - -## 4. Want to Learn More About the Technical Details? - -If you are interested in delving deeper into the technical aspects of IoTDB, you can refer to the following documents: - - - Research Paper: IoTDB features columnar storage, data encoding, pre-calculation, and indexing technologies, along with a SQL-like interface and high-performance data processing capabilities. It also integrates seamlessly with Apache Hadoop, MapReduce, and Apache Spark. 
For related research papers, please refer to: [Research Paper](../Technical-Insider/Publication.md) - - - Compression & Encoding: IoTDB optimizes storage efficiency for different data types through a variety of encoding and compression techniques. To learn more, please refer to: [Compression & Encoding](../Technical-Insider/Encoding-and-Compression.md) - - - Data Partitioning and Load Balancing: IoTDB has meticulously designed data partitioning strategies and load balancing algorithms based on the characteristics of time series data, enhancing the availability and performance of the cluster. For more information, please refer to: [Data Partitioning & Load Balancing](../Technical-Insider/Cluster-data-partitioning.md) - -## 5. Encountering problems during use? - -If you encounter difficulties during installation or use, see [Frequently Asked Questions](../FAQ/Frequently-asked-questions.md). \ No newline at end of file diff --git a/src/UserGuide/latest/Reference/DataNode-Config-Manual_timecho.md b/src/UserGuide/latest/Reference/DataNode-Config-Manual_timecho.md deleted file mode 100644 index f3f5f4f75..000000000 --- a/src/UserGuide/latest/Reference/DataNode-Config-Manual_timecho.md +++ /dev/null @@ -1,632 +0,0 @@ - - -# DataNode Config Manual - -The IoTDB DataNode and the Standalone version use the same configuration files, all located in the `conf` directory. - -* `datanode-env.sh/bat`: Environment configurations, in which you can set the memory allocation of DataNode and Standalone. - -* `iotdb-system.properties`: IoTDB system configurations. - -## 1. Hot Modification Configuration - -For convenience, IoTDB supports hot modification: some configuration parameters in `iotdb-system.properties` can be modified while the system is running and take effect immediately. -Among the parameters described below, those whose `Effective` value is `hot-load` support hot modification.
- -Trigger way: The client sends the command(sql) `load configuration` or `set configuration` to the IoTDB server. - -## 2. Environment Configuration File(datanode-env.sh/bat) - -The environment configuration file is mainly used to configure the Java environment related parameters when DataNode is running, such as JVM related configuration. This part of the configuration is passed to the JVM when the DataNode starts. - -The details of each parameter are as follows: - -* MEMORY\_SIZE - -|Name|MEMORY\_SIZE| -|:---:|:---| -|Description|The minimum heap memory size that IoTDB DataNode will use when startup | -|Type|String| -|Default| The default is a half of the memory.| -|Effective|After restarting system| - -* ON\_HEAP\_MEMORY - -|Name|ON\_HEAP\_MEMORY| -|:---:|:---| -|Description|The heap memory size that IoTDB DataNode can use, Former Name: MAX\_HEAP\_SIZE | -|Type|String| -|Default| Calculate based on MEMORY\_SIZE.| -|Effective|After restarting system| - -* OFF\_HEAP\_MEMORY - -|Name|OFF\_HEAP\_MEMORY| -|:---:|:---| -|Description|The direct memory that IoTDB DataNode can use, Former Name: MAX\_DIRECT\_MEMORY\_SIZE| -|Type|String| -|Default| Calculate based on MEMORY\_SIZE.| -|Effective|After restarting system| - -* JMX\_LOCAL - -|Name|JMX\_LOCAL| -|:---:|:---| -|Description|JMX monitoring mode, configured as yes to allow only local monitoring, no to allow remote monitoring| -|Type|Enum String: "true", "false"| -|Default|true| -|Effective|After restarting system| - -* JMX\_PORT - -|Name|JMX\_PORT| -|:---:|:---| -|Description|JMX listening port. Please confirm that the port is not a system reserved port and is not occupied| -|Type|Short Int: [0,65535]| -|Default|31999| -|Effective|After restarting system| - -* JMX\_IP - -|Name|JMX\_IP| -|:---:|:---| -|Description|JMX listening address. Only take effect if JMX\_LOCAL=false. 0.0.0.0 is never allowed| -|Type|String| -|Default|127.0.0.1| -|Effective|After restarting system| - -## 3. 
JMX Authorization - -We **STRONGLY RECOMMEND** that you change the password for the JMX remote connection. - -The usernames and passwords are in ${IOTDB\_CONF}/conf/jmx.password. - -The permission definitions are in ${IOTDB\_CONF}/conf/jmx.access. - -## 4. DataNode/Standalone Configuration File (iotdb-system.properties) - -### 4.1 Data Node RPC Configuration - -* dn\_rpc\_address - -|Name| dn\_rpc\_address | -|:---:|:-----------------------------------------------| -|Description| The address on which the client RPC service listens. | -|Type| String | -|Default| 127.0.0.1 | -|Effective| After restarting system | - -* dn\_rpc\_port - -|Name| dn\_rpc\_port | -|:---:|:---| -|Description| The port on which the client RPC service listens.| -|Type|Short Int : [0,65535]| -|Default| 6667 | -|Effective|After restarting system| - -* dn\_internal\_address - -|Name| dn\_internal\_address | -|:---:|:---| -|Description| DataNode internal service host/IP | -|Type| string | -|Default| 127.0.0.1 | -|Effective|Only allowed to be modified in first start up| - -* dn\_internal\_port - -|Name| dn\_internal\_port | -|:---:|:-------------------------------| -|Description| DataNode internal service port | -|Type| int | -|Default| 10730 | -|Effective| Only allowed to be modified in first start up | - -* dn\_mpp\_data\_exchange\_port - -|Name| dn\_mpp\_data\_exchange\_port | -|:---:|:---| -|Description| MPP data exchange port | -|Type| int | -|Default| 10740 | -|Effective|Only allowed to be modified in first start up| - -* dn\_schema\_region\_consensus\_port - -|Name| dn\_schema\_region\_consensus\_port | -|:---:|:---| -|Description| DataNode Schema replica communication port for consensus | -|Type| int | -|Default| 10750 | -|Effective|Only allowed to be modified in first start up| - -* dn\_data\_region\_consensus\_port - -|Name| dn\_data\_region\_consensus\_port | -|:---:|:---| -|Description| DataNode Data replica communication port for consensus | -|Type| int | -|Default| 10760 | -|Effective|Only allowed to be
modified in first start up| - -* dn\_join\_cluster\_retry\_interval\_ms - -|Name| dn\_join\_cluster\_retry\_interval\_ms | -|:---:|:--------------------------------------------------------------------------| -|Description| The time of data node waiting for the next retry to join into the cluster | -|Type| long | -|Default| 5000 | -|Effective| After restarting system | - -### 4.2 SSL Configuration - -* enable\_thrift\_ssl - -|Name| enable\_thrift\_ssl | -|:---:|:---------------------------| -|Description|When enable\_thrift\_ssl is configured as true, SSL encryption will be used for communication through dn\_rpc\_port | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* enable\_https - -|Name| enable\_https | -|:---:|:-------------------------| -|Description| REST Service Specifies whether to enable SSL configuration | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* key\_store\_path - -|Name| key\_store\_path | -|:---:|:-----------------| -|Description| SSL certificate path | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* key\_store\_pwd - -|Name| key\_store\_pwd | -|:---:|:----------------| -|Description| SSL certificate password | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -### 4.3 SeedConfigNode - -* dn\_seed\_config\_node - -|Name| dn\_seed\_config\_node | -|:---:|:------------------------------------------------| -|Description| ConfigNode Address for DataNode to join cluster. 
This parameter corresponds to dn\_target\_config\_node\_list before V1.2.2 | -|Type| String | -|Default| 127.0.0.1:10710 | -|Effective| Only allowed to be modified in first start up | - -### 4.4 Connection Configuration - -* dn\_rpc\_thrift\_compression\_enable - -|Name| dn\_rpc\_thrift\_compression\_enable | -|:---:|:---| -|Description| Whether to enable thrift's compression (using GZIP).| -|Type|Boolean| -|Default| false | -|Effective|After restarting system| - -* dn\_rpc\_advanced\_compression\_enable - -|Name| dn\_rpc\_advanced\_compression\_enable | -|:---:|:---| -|Description| Whether to enable thrift's advanced compression.| -|Type|Boolean| -|Default| false | -|Effective|After restarting system| - -* dn\_rpc\_selector\_thread\_count - -|Name| dn\_rpc\_selector\_thread\_count | -|:---:|:-----------------------------------| -|Description| The number of RPC selector threads. | -|Type| int | -|Default| false | -|Effective| After restarting system | - -* dn\_rpc\_min\_concurrent\_client\_num - -|Name| dn\_rpc\_min\_concurrent\_client\_num | -|:---:|:-----------------------------------| -|Description| Minimum concurrent RPC connections | -|Type| Short Int : [0,65535] | -|Default| 1 | -|Effective| After restarting system | - -* dn\_rpc\_max\_concurrent\_client\_num - -|Name| dn\_rpc\_max\_concurrent\_client\_num | -|:---:|:--------------------------------------| -|Description| Maximum concurrent RPC connections | -|Type| Short Int : [0,65535] | -|Default| 1000 | -|Effective| After restarting system | - -* dn\_thrift\_max\_frame\_size - -|Name| dn\_thrift\_max\_frame\_size | -|:---:|:---| -|Description| Maximum size in bytes of each thrift RPC request/response | -|Type| int | -| Default | Defaults to 0,
which means the value is automatically calculated based on the DN JVM configuration parameters at startup:
a. min(64MB, dn_alloc_memory/64)
b. If the user manually configures `dn_thrift_max_frame_size`, the user-specified value will be used instead. | -|Effective| After restarting system | - -* dn\_thrift\_init\_buffer\_size - -|Name| dn\_thrift\_init\_buffer\_size | -|:---:|:---| -|Description| Initial size of bytes of buffer that thrift used | -|Type| long | -|Default| 1024 | -|Effective|After restarting system| - -* dn\_connection\_timeout\_ms - -| Name | dn\_connection\_timeout\_ms | -|:-----------:|:---------------------------------------------------| -| Description | Thrift socket and connection timeout between nodes | -| Type | int | -| Default | 60000 | -| Effective | After restarting system | - -* dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager | -|:------------:|:--------------------------------------------------------------| -| Description | Number of core clients routed to each node in a ClientManager | -| Type | int | -| Default | 200 | -| Effective | After restarting system | - -* dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager | -|:--------------:|:-------------------------------------------------------------| -| Description | Number of max clients routed to each node in a ClientManager | -| Type | int | -| Default | 300 | -| Effective | After restarting system | - -### 4.5 Dictionary Configuration - -* dn\_system\_dir - -| Name | dn\_system\_dir | -|:-----------:|:----------------------------------------------------------------------------| -| Description | The directories of system files. It is recommended to use an absolute path. 
| -| Type | String | -| Default | data/datanode/system (Windows: data\\datanode\\system) | -| Effective | After restarting system | - -* dn\_data\_dirs - -| Name | dn\_data\_dirs | -|:-----------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | The directories of data files. Multiple directories are separated by comma. The starting directory of the relative path is related to the operating system. It is recommended to use an absolute path. If the path does not exist, the system will automatically create it. | -| Type | String[] | -| Default | data/datanode/data (Windows: data\\datanode\\data) | -| Effective | After restarting system | - -* dn\_multi\_dir\_strategy - -| Name | dn\_multi\_dir\_strategy | -|:-----------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Description | IoTDB's 
strategy for selecting directories for TsFile in tsfile\_dir. You can use either the simple class name or the fully qualified class name. The system provides the following strategies:
1. SequenceStrategy: IoTDB selects the directory from tsfile\_dir in order, traverses all the directories in tsfile\_dir in turn, and keeps counting;
2. MaxDiskUsableSpaceFirstStrategy: IoTDB first selects the directory with the largest free disk space in tsfile\_dir;
You can implement a user-defined strategy with the following steps:
1. Extend the org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy class and implement your own strategy method;
2. Set this configuration item to the full class name of the implemented class (package name plus class name, UserDefineStrategyPackage);
3. Add the jar file to the project. | -| Type | String | -| Default | SequenceStrategy | -| Effective | hot-load | - -* dn\_consensus\_dir - -| Name | dn\_consensus\_dir | -|:-----------:|:-------------------------------------------------------------------------------| -| Description | The directories of consensus files. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/consensus | -| Effective | After restarting system | - -* dn\_wal\_dirs - -| Name | dn\_wal\_dirs | -|:-----------:|:-------------------------------------------------------------------------| -| Description | Write Ahead Log storage path. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/wal | -| Effective | After restarting system | - -* dn\_tracing\_dir - -| Name | dn\_tracing\_dir | -|:-----------:|:----------------------------------------------------------------------------| -| Description | The tracing root directory path. It is recommended to use an absolute path. | -| Type | String | -| Default | datanode/tracing | -| Effective | After restarting system | - -* dn\_sync\_dir - -| Name | dn\_sync\_dir | -|:-----------:|:--------------------------------------------------------------------------| -| Description | The directories of sync files. It is recommended to use an absolute path. | -| Type | String | -| Default | data/datanode/sync | -| Effective | After restarting system | - -### 4.6 Metric Configuration - -* dn\_metric\_reporter\_list - -| Name | dn\_metric\_reporter\_list | -|:-----------:|:--------------------------------------| -| Description | Systems for reporting DataNode metrics. | -| Type | String | -| Default | None | -| Effective | After restarting system | - -* dn\_metric\_level - -| Name | dn\_metric\_level | -|:-----------:|:------------------------------------| -| Description | Level of detail for DataNode metrics. 
| -| Type | String | -| Default | IMPORTANT | -| Effective | After restarting system | - -* dn\_metric\_async\_collect\_period - -| Name | dn\_metric\_async\_collect\_period | -|:-----------:|:------------------------------------------------------------| -| Description | Period for asynchronous metric collection in DataNode (in seconds). | -| Type | int | -| Default | 5 | -| Effective | After restarting system | - -* dn\_metric\_prometheus\_reporter\_port - -| Name | dn\_metric\_prometheus\_reporter\_port | -|:-----------:|:------------------------------------------| -| Description | Port for Prometheus metric reporting in DataNode. | -| Type | int | -| Default | 9092 | -| Effective | After restarting system | - -* dn\_metric\_internal\_reporter\_type - -| Name | dn\_metric\_internal\_reporter\_type | -|:-----------:|:------------------------------------------------------------| -| Description | Internal reporter types for DataNode metrics. For internal monitoring and checking that the data has been successfully written and refreshed. | -| Type | String | -| Default | IOTDB | -| Effective | After restarting system | - -## 5. Enable GC log - -GC log is off by default. -For performance tuning, you may want to collect the GC info. - -To enable GC log, just add a parameter "printgc" when you start the DataNode. - -```bash -nohup sbin/start-datanode.sh printgc >/dev/null 2>&1 & -``` -Or -```bash -# Before version V2.0.4.x -sbin\start-datanode.bat printgc - -# V2.0.4.x and later versions -sbin\windows\start-datanode.bat printgc -``` - -GC log is stored at `IOTDB_HOME/logs/gc.log`. -There will be at most 10 gc.log.* files and each one can reach to 10MB. 
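To check how much disk space the GC logs currently take, a small shell helper can count the `gc.log*` files and sum their sizes. This is a minimal sketch, not part of the IoTDB scripts: the function name `gc_log_summary` is hypothetical, and it assumes the logs live under `$IOTDB_HOME/logs` as stated above. By the figures above, the total should stay around 10 × 10MB = 100MB.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with IoTDB): summarize GC log files.
# Usage: gc_log_summary [log_dir]   (log_dir defaults to $IOTDB_HOME/logs)
gc_log_summary() {
  local log_dir="${1:-${IOTDB_HOME:-.}/logs}"
  local count=0 total=0 f size
  for f in "$log_dir"/gc.log*; do
    [ -f "$f" ] || continue        # skip when the glob matches nothing
    size=$(wc -c < "$f")           # size of this file in bytes
    count=$((count + 1))
    total=$((total + size))
  done
  echo "files=$count bytes=$total"
}
```

Calling `gc_log_summary` with no argument inspects `$IOTDB_HOME/logs`; pass a directory explicitly if the logs are stored elsewhere.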
- -### 5.1 REST Service Configuration - -* enable\_rest\_service - -|Name| enable\_rest\_service | -|:---:|:--------------------------------------| -|Description| Whether to enable the REST service | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* rest\_service\_port - -|Name| rest\_service\_port | -|:---:|:------------------| -|Description| The port on which the REST service listens | -|Type| int32 | -|Default| 18080 | -|Effective| After restarting system | - -* enable\_swagger - -|Name| enable\_swagger | -|:---:|:-----------------------| -|Description| Whether to enable Swagger to display REST interface information | -|Type| Boolean | -|Default| false | -|Effective| After restarting system | - -* rest\_query\_default\_row\_size\_limit - -|Name| rest\_query\_default\_row\_size\_limit | -|:---:|:------------------------------------------------------------------------------------------| -|Description| The maximum number of rows in a result set that can be returned by a query | -|Type| int32 | -|Default| 10000 | -|Effective| After restarting system | - -* cache\_expire - -|Name| cache\_expire | -|:---:|:--------------------------------------------------------| -|Description| Expiration time for cached user login information | -|Type| int32 | -|Default| 28800 | -|Effective| After restarting system | - -* cache\_max\_num - -|Name| cache\_max\_num | -|:---:|:--------------| -|Description| The maximum number of users stored in the cache | -|Type| int32 | -|Default| 100 | -|Effective| After restarting system | - -* cache\_init\_num - -|Name| cache\_init\_num | -|:---:|:---------------| -|Description| Initial cache capacity | -|Type| int32 | -|Default| 10 | -|Effective| After restarting system | - - -* trust\_store\_path - -|Name| trust\_store\_path | -|:---:|:---------------| -|Description| trustStore path (optional) | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* trust\_store\_pwd - -|Name|
trust\_store\_pwd | -|:---:|:---------------------------------| -|Description| trustStore Password (Optional) | -|Type| String | -|Default| "" | -|Effective| After restarting system | - -* idle\_timeout - -|Name| idle\_timeout | -|:---:|:--------------| -|Description| SSL timeout duration, expressed in seconds | -|Type| int32 | -|Default| 5000 | -|Effective| After restarting system | - - -#### Storage engine configuration - - -* dn\_default\_space\_usage\_thresholds - -|Name| dn\_default\_space\_usage\_thresholds | -|:---:|:--------------| -|Description| Define the minimum remaining space ratio for each tier data catalogue; when the remaining space is less than this ratio, the data will be automatically migrated to the next tier; when the remaining storage space of the last tier falls below this threshold, the system will be set to READ_ONLY | -|Type| double | -|Default| 0.85 | -|Effective| hot-load | - -* remote\_tsfile\_cache\_dirs - -|Name| remote\_tsfile\_cache\_dirs | -|:---:|:--------------| -|Description| Cache directory stored locally in the cloud | -|Type| string | -|Default| data/datanode/data/cache | -|Effective| After restarting system | - -* remote\_tsfile\_cache\_page\_size\_in\_kb - -|Name| remote\_tsfile\_cache\_page\_size\_in\_kb | -|:---:|:--------------| -|Description| Block size of locally cached files stored in the cloud | -|Type| int | -|Default| 20480 | -|Effective| After restarting system | - -* remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb - -|Name| remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb | -|:---:|:--------------| -|Description| Maximum Disk Occupancy Size for Cloud Storage Local Cache | -|Type| long | -|Default| 51200 | -|Effective| After restarting system | - -* object\_storage\_type - -|Name| object\_storage\_type | -|:---:|:--------------| -|Description| Cloud Storage Type | -|Type| string | -|Default| AWS_S3 | -|Effective| After restarting system | - -* object\_storage\_bucket - -|Name| object\_storage\_bucket | 
-|:---:|:--------------| -|Description| Name of cloud storage bucket | -|Type| string | -|Default| iotdb_data | -|Effective| After restarting system | - -* object\_storage\_endpoint - -|Name| object\_storage\_endpoint | -|:---:|:--------------------------------| -|Description| endpoint of cloud storage | -|Type| string | -|Default| None | -|Effective| After restarting system | - -* object\_storage\_access\_key - -|Name| object\_storage\_access\_key | -|:---:|:--------------| -|Description| Authentication information stored in the cloud: key | -|Type| string | -|Default| None | -|Effective| After restarting system | - -* object\_storage\_access\_secret - -|Name| object\_storage\_access\_secret | -|:---:|:--------------| -|Description| Authentication information stored in the cloud: secret | -|Type| string | -|Default| None | -|Effective| After restarting system | diff --git a/src/UserGuide/latest/SQL-Manual/QuickStart-Only-Sql_timecho.md b/src/UserGuide/latest/SQL-Manual/QuickStart-Only-Sql_timecho.md deleted file mode 100644 index a0a58fba3..000000000 --- a/src/UserGuide/latest/SQL-Manual/QuickStart-Only-Sql_timecho.md +++ /dev/null @@ -1,111 +0,0 @@ - - -# QuickStart Only SQL - -> **Before executing the following SQL statements, please ensure** -> -> * **IoTDB service has been successfully started** -> * **Connected to IoTDB via Cli client** -> -> Note: If your terminal does not support multi-line pasting (e.g., Windows CMD), please adjust the SQL statements to single-line format before execution. - -## 1. Database Management - -```SQL --- Create database; -CREATE DATABASE root.ln; - --- View database; -SHOW DATABASES root.**; - --- Delete database; -DELETE DATABASE root.ln; - --- Count database; -COUNT DATABASES root.**; -``` - -For detailed syntax description, please refer to: [Database Management](../Basic-Concept/Operate-Metadata_timecho.md#_1-database-management) - -## 2. 
Time Series Management - -```SQL --- Create time series; -CREATE TIMESERIES root.ln.wf01.wt01.status BOOLEAN; -CREATE TIMESERIES root.ln.wf01.wt01.temperature FLOAT; - --- Create aligned time series; -CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT, longitude FLOAT); - --- Delete time series; -DELETE TIMESERIES root.ln.wf01.wt01.status; - --- View time series; -SHOW TIMESERIES root.ln.**; - --- Count time series; -COUNT TIMESERIES root.ln.**; -``` - -For detailed syntax description, please refer to: [Time Series Management](../Basic-Concept/Operate-Metadata_timecho.md#_2-timeseries-management) - -## 3. Data Writing - -```SQL --- Single column writing; -INSERT INTO root.ln.wf01.wt01(timestamp, temperature) VALUES(1, 23.0),(2, 42.6); - --- Multi-column writing; -INSERT INTO root.ln.wf01.wt01(timestamp, status, temperature) VALUES (3, false, 33.1),(4, true, 24.6); -``` - -For detailed syntax description, please refer to: [Data Writing](../Basic-Concept/Write-Data_timecho.md) - -## 4. Data Query - -```SQL --- Time filter query; -SELECT * from root.ln.** where time > 1; - --- Value filter query; -SELECT temperature FROM root.ln.wf01.wt01 where temperature > 36.5; - --- Function query; -SELECT count(temperature) FROM root.ln.wf01.wt01; - --- Latest point query; -SELECT LAST status FROM root.ln.wf01.wt01; -``` - -For detailed syntax description, please refer to: [Data Query](../Basic-Concept/Query-Data_timecho.md) - -## 5. 
Data Deletion - -```SQL --- Single column deletion; -DELETE FROM root.ln.wf01.wt01.status WHERE time >= 20; - --- Multi-column deletion; -DELETE FROM root.ln.wf01.wt01.* where time <= 10; -``` - -For detailed syntax description, please refer to: [Data Deletion](../Basic-Concept/Delete-Data.md) \ No newline at end of file diff --git a/src/UserGuide/latest/SQL-Manual/SQL-Manual_timecho.md b/src/UserGuide/latest/SQL-Manual/SQL-Manual_timecho.md deleted file mode 100644 index b791e680e..000000000 --- a/src/UserGuide/latest/SQL-Manual/SQL-Manual_timecho.md +++ /dev/null @@ -1,1697 +0,0 @@ - - -# SQL Manual - -## 1. DATABASE MANAGEMENT - -For more details, see document [Operate-Metadata](../Basic-Concept/Operate-Metadata_timecho.md). - -### 1.1 Create Database - -```sql -create database root.ln; -create database root.sgcc; -``` - -### 1.2 Show Databases - -```sql -SHOW DATABASES; -SHOW DATABASES root.**; -``` - -### 1.3 Delete Database - -```sql -DELETE DATABASE root.ln; -DELETE DATABASE root.sgcc; -// delete all data, all timeseries and all databases; -DELETE DATABASE root.**; -``` - -### 1.4 Count Databases - -```sql -count databases; -count databases root.*; -count databases root.sgcc.*; -count databases root.sgcc; -``` - -### 1.5 Setting up heterogeneous databases (Advanced operations) - -#### Set heterogeneous parameters when creating a Database - -```sql -CREATE DATABASE root.db WITH SCHEMA_REPLICATION_FACTOR=1, DATA_REPLICATION_FACTOR=3, SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -#### Adjust heterogeneous parameters at run time - -```sql -ALTER DATABASE root.db WITH SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -#### Show heterogeneous databases - -```sql -SHOW DATABASES DETAILS; -``` - -### 1.6 TTL - -#### Set TTL - -```sql -set ttl to root.ln 3600000; -set ttl to root.sgcc.** 3600000; -set ttl to root.** 3600000; -``` - -#### Unset TTL - -```sql -unset ttl from root.ln; -unset ttl from root.sgcc.**; -unset ttl from root.**; -``` - 
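The `set ttl` statements in this section take the TTL as a plain number. Assuming the default millisecond timestamp precision, `3600000` corresponds to one hour. The sketch below derives such a value in the shell; the helper name `ttl_ms_from_hours` is hypothetical and only illustrates the arithmetic.

```bash
# Hypothetical helper: convert a retention period in hours to a TTL in
# milliseconds (assumes the default millisecond timestamp precision).
ttl_ms_from_hours() {
  echo $(( $1 * 60 * 60 * 1000 ))
}

# Print a statement equivalent to the first `set ttl` example (1 hour).
echo "set ttl to root.ln $(ttl_ms_from_hours 1);"
```

The printed statement can then be pasted into the CLI; for a one-day retention, `ttl_ms_from_hours 24` yields `86400000`.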
-#### Show TTL - -```sql -SHOW ALL TTL; -SHOW TTL ON StorageGroupNames; -SHOW DEVICES; -``` - -## 2. TIMESERIES MANAGEMENT - -For more details, see document [Operate-Metadata](../Basic-Concept/Operate-Metadata_timecho.md). - -### 2.1 Create Timeseries - -```sql -create timeseries root.ln.wf01.wt01.status with datatype=BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature with datatype=FLOAT; -create timeseries root.ln.wf02.wt02.hardware with datatype=TEXT; -create timeseries root.ln.wf02.wt02.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature with datatype=FLOAT; -``` - -- From v0.13, you can use a simplified version of the SQL statements to create timeseries: - -```sql -create timeseries root.ln.wf01.wt01.status BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature FLOAT; -create timeseries root.ln.wf02.wt02.hardware TEXT; -create timeseries root.ln.wf02.wt02.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature FLOAT; -``` - -- Note that if the encoding method in a CREATE TIMESERIES statement conflicts with the data type, the system returns an error such as the following: - -```sql -create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF; -error: encoding TS_2DIFF does not support BOOLEAN -``` - -### 2.2 Create Aligned Timeseries - -```sql -CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT, longitude FLOAT); -``` - -### 2.3 Modify Timeseries Data Type -> Supported since V2.0.8.2 - -```SQL -ALTER TIMESERIES root.ln.wf01.wt01.temperature set data type DOUBLE; -``` - -### 2.4 Modify Timeseries Name -> This statement is supported from V2.0.8.2 onwards - -```sql -ALTER TIMESERIES root.ln.wf01.wt01.temperature RENAME TO root.newln.newwf.newwt.temperature; -``` - 
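As a sketch of how the rename above might be checked, the following pairs it with `show timeseries` queries on the old and new paths. It assumes that `root.ln.wf01.wt01.temperature` already exists and that the server is V2.0.8.2 or later; the target path is simply the one used in the example above, and the expected listings are an assumption rather than verified output.

```sql
-- Rename the series (per the note above, supported from V2.0.8.2 onwards)
ALTER TIMESERIES root.ln.wf01.wt01.temperature RENAME TO root.newln.newwf.newwt.temperature;

-- Assumption: the old path is no longer listed under the original device
show timeseries root.ln.wf01.wt01.*;

-- Assumption: the renamed series now appears under the new prefix
show timeseries root.newln.**;
```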
-### 2.5 Delete Timeseries - -```sql -delete timeseries root.ln.wf01.wt01.status; -delete timeseries root.ln.wf01.wt01.temperature, root.ln.wf02.wt02.hardware; -delete timeseries root.ln.wf02.*; -drop timeseries root.ln.wf02.*; -``` - -### 2.6 Show Timeseries - -```sql -show timeseries root.**; -show timeseries root.ln.**; -show timeseries root.ln.** limit 10 offset 10; -show timeseries root.ln.** where timeseries contains 'wf01.wt'; -show timeseries root.ln.** where dataType=FLOAT; -show timeseries root.ln.** where time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -show latest timeseries; -show invalid timeseries; -- This statement is supported from V2.0.8.2 onwards; -``` - -### 2.7 Count Timeseries - -```sql -COUNT TIMESERIES root.**; -COUNT TIMESERIES root.ln.**; -COUNT TIMESERIES root.ln.*.*.status; -COUNT TIMESERIES root.ln.wf01.wt01.status; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc'; -COUNT TIMESERIES root.** WHERE DATATYPE = INT64; -COUNT TIMESERIES root.** WHERE TAGS(unit) contains 'c'; -COUNT TIMESERIES root.** WHERE TAGS(unit) = 'c'; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' group by level = 1; -COUNT TIMESERIES root.** GROUP BY LEVEL=1; -COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2; -COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2; -``` - -### 2.8 Tag and Attribute Management - -```sql -create timeseries root.turbine.d1.s1(temperature) with datatype=FLOAT tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2); -``` - -* Rename the tag/attribute key - -```SQL -ALTER timeseries root.turbine.d1.s1 RENAME tag1 TO newTag1; -``` - -* Reset the tag/attribute value - -```SQL -ALTER timeseries root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1; -``` - -* Delete the existing tag/attribute - -```SQL -ALTER timeseries root.turbine.d1.s1 DROP tag1, tag2; -``` - -* Add new tags - -```SQL -ALTER timeseries root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4; -``` - -* Add new attributes - -```SQL -ALTER timeseries root.turbine.d1.s1 ADD 
ATTRIBUTES attr3=v3, attr4=v4; -``` - -* Upsert alias, tags and attributes - -> Adds the alias or a new key-value pair if the alias or key doesn't exist; otherwise, updates the old one with the new value. - -```SQL -ALTER timeseries root.turbine.d1.s1 UPSERT ALIAS=newAlias TAGS(tag3=v3, tag4=v4) ATTRIBUTES(attr3=v3, attr4=v4); -``` - -* Show timeseries using tags. Use TAGS(tagKey) to identify the tags used as filter key - -```SQL -SHOW TIMESERIES (<`PathPattern`>)? timeseriesWhereClause; -``` - -This returns the information of all timeseries that satisfy the where condition and match the pathPattern. SQL statements are as follows: - -```SQL -ALTER timeseries root.ln.wf02.wt02.hardware ADD TAGS unit=c; -ALTER timeseries root.ln.wf02.wt02.status ADD TAGS description=test1; -show timeseries root.ln.** where TAGS(unit)='c'; -show timeseries root.ln.** where TAGS(description) contains 'test1'; -``` - -- Count timeseries using tags - -```SQL -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause; -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause GROUP BY LEVEL=<INTEGER>; -``` - -This returns the number of timeseries that satisfy the where condition and match the pathPattern. 
SQL statements are as follows: - -```SQL -count timeseries; -count timeseries root.** where TAGS(unit)='c'; -count timeseries root.** where TAGS(unit)='c' group by level = 2; -``` - -create aligned timeseries - -```SQL -create aligned timeseries root.sg1.d1(s1 INT32 tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2), s2 DOUBLE tags(tag3=v3, tag4=v4) attributes(attr3=v3, attr4=v4)); -``` - -The execution result is as follows: - -```SQL -show timeseries; -``` -```shell -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -|root.sg1.d1.s2| null| root.sg1| DOUBLE| GORILLA| SNAPPY|{"tag4":"v4","tag3":"v3"}|{"attr4":"v4","attr3":"v3"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -Support query: - -```SQL -show timeseries where TAGS(tag1)='v1'; -``` -```shell -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| 
-+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -The above operations are supported for timeseries tag, attribute updates, etc. - -## 3. NODE MANAGEMENT - -For more details, see document [Operate-Metadata](../Basic-Concept/Operate-Metadata_timecho.md). - -### 3.1 Show Child Paths - -```SQL -SHOW CHILD PATHS pathPattern; -``` - -### 3.2 Show Child Nodes - -```SQL -SHOW CHILD NODES pathPattern; -``` - -### 3.3 Count Nodes - -```SQL -COUNT NODES root.** LEVEL=2; -COUNT NODES root.ln.** LEVEL=2; -COUNT NODES root.ln.wf01.** LEVEL=3; -COUNT NODES root.**.temperature LEVEL=3; -``` - -### 3.4 Show Devices - -```SQL -show devices; -show devices root.ln.**; -show devices root.ln.** where device contains 't'; -show devices with database; -show devices root.ln.** with database; -``` - -### 3.5 Count Devices - -```SQL -show devices; -count devices; -count devices root.ln.**; -``` - -## 4. INSERT & LOAD DATA - -### 4.1 Insert Data - -For more details, see document [Write-Data](../Basic-Concept/Write-Data_timecho). 
- -#### Use of INSERT Statements - -- Insert Single Timeseries - -```sql -insert into root.ln.wf02.wt02(timestamp,status) values(1,true); -insert into root.ln.wf02.wt02(timestamp,hardware) values(1, 'v1'); -``` - -- Insert Multiple Timeseries - -```sql -insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (2, false, 'v2'); -insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4'); -``` - -- Use the Current System Timestamp as the Timestamp of the Data Point - -```SQL -insert into root.ln.wf02.wt02(status, hardware) values (false, 'v2'); -``` - -#### Insert Data Into Aligned Timeseries - -```SQL -create aligned timeseries root.sg1.d1(s1 INT32, s2 DOUBLE); -insert into root.sg1.d1(time, s1, s2) aligned values(1, 1, 1); -insert into root.sg1.d1(time, s1, s2) aligned values(2, 2, 2), (3, 3, 3); -select * from root.sg1.d1; -``` - -### 4.2 Load External TsFile Tool - -For more details, see document [Data Import](../Tools-System/Data-Import-Tool_timecho). - -#### Load with SQL - -1. Load a single tsfile by specifying a file path (absolute path). - -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile'` -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile' sglevel=1` -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile' onSuccess=delete` -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile' sglevel=1 onSuccess=delete` - - -2. Load a batch of files by specifying a folder path (absolute path). - -- `load '/Users/Desktop/data'` -- `load '/Users/Desktop/data' sglevel=1` -- `load '/Users/Desktop/data' onSuccess=delete` -- `load '/Users/Desktop/data' sglevel=1 onSuccess=delete` - -#### Load with Script - -```sql -./load-rewrite.bat -f D:\IoTDB\data -h 192.168.0.101 -p 6667 -u root -pw root -``` - -## 5. DELETE DATA - -For more details, see document [Write-Delete-Data](../Basic-Concept/Write-Data_timecho). 
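Before the individual deletion forms below, a minimal write-then-delete round trip may help orient. It is a sketch built only from statements that appear in this manual, and it assumes the series `root.ln.wf02.wt02.status` already exists (or can be created automatically); the comment on the remaining data states an expectation, not verified output.

```sql
-- Write two points at timestamps 1 and 20
insert into root.ln.wf02.wt02(timestamp,status) values(1,true);
insert into root.ln.wf02.wt02(timestamp,status) values(20,false);

-- Delete everything at or before timestamp 10 (the where clause takes time conditions only)
delete from root.ln.wf02.wt02.status where time <= 10;

-- Expectation: only the point at timestamp 20 remains
select status from root.ln.wf02.wt02;
```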
- -### 5.1 Delete Single Timeseries - -```sql -delete from root.ln.wf02.wt02.status where time<=2017-11-01T16:26:00; -delete from root.ln.wf02.wt02.status where time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -delete from root.ln.wf02.wt02.status where time < 10; -delete from root.ln.wf02.wt02.status where time <= 10; -delete from root.ln.wf02.wt02.status where time < 20 and time > 10; -delete from root.ln.wf02.wt02.status where time <= 20 and time >= 10; -delete from root.ln.wf02.wt02.status where time > 20; -delete from root.ln.wf02.wt02.status where time >= 20; -delete from root.ln.wf02.wt02.status where time = 20; -delete from root.ln.wf02.wt02.status where time > 4 or time < 0; -Msg: 303: Check metadata error: For delete statement, where clause can only contain atomic; -expressions like : time > XXX, time <= XXX, or two atomic expressions connected by 'AND'; -delete from root.ln.wf02.wt02.status; -``` - -### 5.2 Delete Multiple Timeseries - -```sql -delete from root.ln.wf02.wt02 where time <= 2017-11-01T16:26:00; -delete from root.ln.wf02.wt02.* where time <= 2017-11-01T16:26:00; -delete from root.ln.wf03.wt02.status where time < now(); -Msg: The statement is executed successfully. -``` - -### 5.3 Delete Time Partition (experimental) - -```sql -DELETE PARTITION root.ln 0,1,2; -``` - -## 6. QUERY DATA - -For more details, see document [Query-Data](../Basic-Concept/Query-Data_timecho). - -```sql -SELECT [LAST] selectExpr [, selectExpr] ... - [INTO intoItem [, intoItem] ...] - FROM prefixPath [, prefixPath] ... - [WHERE whereCondition] - [GROUP BY { - ([startTime, endTime), interval [, slidingStep]) | - LEVEL = levelNum [, levelNum] ... | - TAGS(tagKey [, tagKey] ... 
) | - VARIATION(expression[,delta][,ignoreNull=true/false]) | - CONDITION(expression,[keep>/>=/=/ 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; -``` - -#### Select Multiple Columns of Data for the Same Device According to Multiple Time Intervals - -```sql -select status,temperature from root.ln.wf01.wt01 where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` - -#### Choose Multiple Columns of Data for Different Devices According to Multiple Time Intervals - -```sql -select wf01.wt01.status,wf02.wt02.hardware from root.ln where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` - -#### Order By Time Query - -```sql -select * from root.ln.** where time > 1 order by time desc limit 10; -``` - -### 6.2 `SELECT` CLAUSE - -#### Use Alias - -```sql -select s1 as temperature, s2 as speed from root.ln.wf01.wt01; -``` - -#### Nested Expressions - -##### Nested Expressions with Time Series Query - -```sql -select a, - b, - ((a + 1) * 2 - 1) % 2 + 1.5, - sin(a + sin(a + sin(b))), - -(a + b) * (sin(a + b) * sin(a + b) + cos(a + b) * cos(a + b)) + 1 -from root.sg1; - -select (a + b) * 2 + sin(a) from root.sg; - -select (a + *) / 2 from root.sg1; - -select (a + b) * 3 from root.sg, root.ln; -``` - -##### Nested Expressions query with aggregations - -```sql -select avg(temperature), - sin(avg(temperature)), - avg(temperature) + 1, - -sum(hardware), - avg(temperature) + sum(hardware) -from root.ln.wf01.wt01; - -select avg(*), - (avg(*) + 1) * 3 / 2 -1 -from root.sg1; - -select avg(temperature), - sin(avg(temperature)), - avg(temperature) + 1, - -sum(hardware), - avg(temperature) + sum(hardware) as custom_sum -from root.ln.wf01.wt01 -GROUP BY([10, 90), 10ms); -``` - -#### Last Query - -```sql -select last status from root.ln.wf01.wt01; -select last status, temperature from 
root.ln.wf01.wt01 where time >= 2017-11-07T23:50:00; -select last * from root.ln.wf01.wt01 order by timeseries desc; -select last * from root.ln.wf01.wt01 order by dataType desc; -``` - -### 6.3 `WHERE` CLAUSE - -#### Time Filter - -```sql -select s1 from root.sg1.d1 where time > 2022-01-01T00:05:00.000; -select s1 from root.sg1.d1 where time = 2022-01-01T00:05:00.000; -select s1 from root.sg1.d1 where time >= 2022-01-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; -``` - -#### Value Filter - -```sql -select temperature from root.sg1.d1 where temperature > 36.5; -select status from root.sg1.d1 where status = true; -select temperature from root.sg1.d1 where temperature between 36.5 and 40; -select temperature from root.sg1.d1 where temperature not between 36.5 and 40; -select code from root.sg1.d1 where code in ('200', '300', '400', '500'); -select code from root.sg1.d1 where code not in ('200', '300', '400', '500'); -select code from root.sg1.d1 where temperature is null; -select code from root.sg1.d1 where temperature is not null; -``` - -#### Fuzzy Query - -- Fuzzy matching using `Like` - -```sql -select * from root.sg.d1 where value like '%cc%'; -select * from root.sg.device where value like '_b_'; -``` - -- Fuzzy matching using `Regexp` - -```sql -select * from root.sg.d1 where value regexp '^[A-Za-z]+$'; -select * from root.sg.d1 where value regexp '^[a-z]+$' and time > 100; -``` - -### 6.4 `GROUP BY` CLAUSE - -- Aggregate By Time without Specifying the Sliding Step Length - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d); -``` - -- Aggregate By Time Specifying the Sliding Step Length - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d); -``` - -- Aggregate by Natural Month - -```sql -select count(status) from root.ln.wf01.wt01 group by([2017-11-01T00:00:00, 2019-11-07T23:00:00), 1mo, 
2mo); -select count(status) from root.ln.wf01.wt01 group by([2017-10-31T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo); -``` - -- Left-Open and Right-Closed Range - -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d); -``` - -- Aggregation By Variation - -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6); -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, ignoreNull=false); -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, 4); -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6+s5, 10); -``` - -- Aggregation By Condition - -```sql -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoringNull=true); -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoringNull=false); -``` - -- Aggregation By Session - -```sql -select __endTime,count(*) from root.** group by session(1d); -select __endTime,sum(hardware) from root.ln.wf02.wt01 group by session(50s) having sum(hardware)>0 align by device; -``` - -- Aggregation By Count - -```sql -select count(charging_status), first_value(soc) from root.sg group by count(charging_status,5); -select count(charging_status), first_value(soc) from root.sg group by count(charging_status,5,ignoreNull=false); -``` - -- Aggregation By Level - -```sql -select count(status) from root.** group by level = 1; -select count(status) from root.** group by level = 3; -select count(status) from root.** group by level = 1, 3; -select max_value(temperature) from root.** group by level = 0; -select count(*) from root.ln.** group by level = 2; -``` - -- Aggregate By Time with Level Clause - -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d), 
level=1; -select count(status) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d), level=1; -``` - -- Aggregation query by one single tag - -```sql -SELECT AVG(temperature) FROM root.factory1.** GROUP BY TAGS(city); -``` - -- Aggregation query by multiple tags - -```sql -SELECT avg(temperature) FROM root.factory1.** GROUP BY TAGS(city, workshop); -``` - -- Downsampling Aggregation by tags based on Time Window - -```sql -SELECT avg(temperature) FROM root.factory1.** GROUP BY ([1000, 10000), 5s), TAGS(city, workshop); -``` - -### 6.5 `HAVING` CLAUSE - -Correct: - -```sql -select count(s1) from root.** group by ([1,11),2ms), level=1 having count(s2) > 1; -select count(s1), count(s2) from root.** group by ([1,11),2ms) having count(s2) > 1 align by device; -``` - -Incorrect: - -```sql -select count(s1) from root.** group by ([1,3),1ms) having sum(s1) > s1; -select count(s1) from root.** group by ([1,3),1ms) having s1 > 1; -select count(s1) from root.** group by ([1,3),1ms), level=1 having sum(d1.s1) > 1; -select count(d1.s1) from root.** group by ([1,3),1ms), level=1 having sum(s1) > 1; -``` - -### 6.6 `FILL` CLAUSE - -#### `PREVIOUS` Fill - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(previous); -``` - -#### `PREVIOUS` FILL and specify the fill timeout threshold -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(previous, 2m); -``` - -#### `LINEAR` Fill - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(linear); -``` - -#### Constant Fill - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(2.0); -select temperature, status from root.sgcc.wf03.wt01 where time >= 
2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(true); -``` - -### 6.7 `LIMIT` and `SLIMIT` CLAUSES (PAGINATION) - -#### Row Control over Query Results - -```sql -select status, temperature from root.ln.wf01.wt01 limit 10; -select status, temperature from root.ln.wf01.wt01 limit 5 offset 3; -select status,temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time< 2017-11-01T00:12:00.000 limit 2 offset 3; -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) limit 5 offset 3; -``` - -#### Column Control over Query Results - -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1; -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1 soffset 1; -select max_value(*) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) slimit 1 soffset 1; -``` - -#### Row and Column Control over Query Results - -```sql -select * from root.ln.wf01.wt01 limit 10 offset 100 slimit 2 soffset 0; -``` - -### 6.8 `ORDER BY` CLAUSE - -#### Order by in ALIGN BY TIME mode - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time desc; -``` - -#### Order by in ALIGN BY DEVICE mode - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by device desc,time asc align by device; -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time asc,device desc align by device; -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -select count(*) from root.ln.** group by ((2017-11-01T00:00:00.000+08:00,2017-11-01T00:03:00.000+08:00],1m) order by device asc,time asc align by device; -``` - -#### Order by arbitrary expressions - -```sql -select score from root.** order by score desc align by device; -select score,total from root.one order by base+score+bonus desc; 
-select score,total from root.one order by total desc; -select base, score, bonus, total from root.** order by total desc NULLS Last, - score desc NULLS Last, - bonus desc NULLS Last, - time desc align by device; -select min_value(total) from root.** order by min_value(total) asc align by device; -select min_value(total),max_value(base) from root.** order by max_value(total) desc align by device; -select score from root.** order by device asc, score desc, time asc align by device; -``` - -### 6.9 `ALIGN BY` CLAUSE - -#### Align by Device - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -``` - -### 6.10 `INTO` CLAUSE (QUERY WRITE-BACK) - -```sql -select s1, s2 into root.sg_copy.d1(t1), root.sg_copy.d2(t1, t2), root.sg_copy.d1(t2) from root.sg.d1, root.sg.d2; -select count(s1 + s2), last_value(s2) into root.agg.count(s1_add_s2), root.agg.last_value(s2) from root.sg.d1 group by ([0, 100), 10ms); -select s1, s2 into root.sg_copy.d1(t1, t2), root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -select s1 + s2 into root.expr.add(d1s1_d1s2), root.expr.add(d2s1_d2s2) from root.sg.d1, root.sg.d2 align by device; -``` - -- Using variable placeholders: - -```sql -select s1, s2 -into root.sg_copy.d1(::), root.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::) -from root.sg.d1, root.sg.d2; - -select d1.s1, d1.s2, d2.s3, d3.s4 -into ::(s1_1, s2_2), root.sg.d2_2(s3_3), root.${2}_copy.::(s4) -from root.sg; - -select * into root.sg_bk.::(::) from root.sg.**; - -select s1, s2, s3, s4 -into root.backup_sg.d1(s1, s2, s3, s4), root.backup_sg.d2(::), root.sg.d3(backup_${4}) -from root.sg.d1, root.sg.d2, root.sg.d3 -align by device; - -select avg(s1), sum(s2) + sum(s3), count(s4) -into root.agg_${2}.::(avg_s1, sum_s2_add_s3, count_s4) -from root.** -align by device; - -select * into ::(backup_${4}) from root.sg.** align by device; - -select s1, s2 into root.sg_copy.d1(t1, t2), aligned root.sg_copy.d2(t1, t2) from root.sg.d1, 
root.sg.d2 align by device; -``` - -## 7. MAINTENANCE -Generate the corresponding query plan: -```sql -explain select s1,s2 from root.sg.d1; -``` -Execute the SQL statement and output an analysis of its execution: -```sql -explain analyze select s1,s2 from root.sg.d1 order by s1; -``` - -For more maintenance commands, please refer to [Maintenance commands](../User-Manual/Maintenance-commands_timecho.md). - -## 8. OPERATOR - -For more details, see document [Operator-and-Expression](./Operator-and-Expression.md). - -### 8.1 Arithmetic Operators - -For details and examples, see the document [Arithmetic Operators and Functions](./Operator-and-Expression.md#_1-1-arithmetic-operators). - -```sql -select s1, - s1, s2, + s2, s1 + s2, s1 - s2, s1 * s2, s1 / s2, s1 % s2 from root.sg.d1; -``` - -### 8.2 Comparison Operators - -For details and examples, see the document [Comparison Operators and Functions](./Operator-and-Expression.md#_1-2-comparison-operators). - -```sql -# Basic comparison operators; -select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test; - -# `BETWEEN ... 
AND ...` operator; -select temperature from root.sg1.d1 where temperature between 36.5 and 40; -select temperature from root.sg1.d1 where temperature not between 36.5 and 40; - -# Fuzzy matching operator: Use `Like` for fuzzy matching; -select * from root.sg.d1 where value like '%cc%'; -select * from root.sg.device where value like '_b_'; - -# Fuzzy matching operator: Use `Regexp` for fuzzy matching; -select * from root.sg.d1 where value regexp '^[A-Za-z]+$'; -select * from root.sg.d1 where value regexp '^[a-z]+$' and time > 100; -select b, b like '1%', b regexp '[0-2]' from root.test; - -# `IS NULL` operator; -select code from root.sg1.d1 where temperature is null; -select code from root.sg1.d1 where temperature is not null; - -# `IN` operator; -select code from root.sg1.d1 where code in ('200', '300', '400', '500'); -select code from root.sg1.d1 where code not in ('200', '300', '400', '500'); -select a, a in (1, 2) from root.test; -``` - -### 8.3 Logical Operators - -For details and examples, see the document [Logical Operators](./Operator-and-Expression.md#_1-3-logical-operators). - -```sql -select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test; -``` - -## 9. BUILT-IN FUNCTIONS - -For more details, see document [Operator-and-Expression](./Operator-and-Expression.md#_2-built-in-functions). - -### 9.1 Aggregate Functions - -For details and examples, see the document [Aggregate Functions](./Operator-and-Expression.md#_2-1-aggregate-functions). - -```sql -select count(status) from root.ln.wf01.wt01; - -select count_if(s1=0 & s2=0, 3), count_if(s1=1 & s2=0, 3) from root.db.d1; -select count_if(s1=0 & s2=0, 3, 'ignoreNull'='false'), count_if(s1=1 & s2=0, 3, 'ignoreNull'='false') from root.db.d1; - -select time_duration(s1) from root.db.d1; -``` - -### 9.2 Arithmetic Functions - -For details and examples, see the document [Arithmetic Operators and Functions](./Operator-and-Expression.md#_2-2-arithmetic-functions). 
- -```sql -select s1, sin(s1), cos(s1), tan(s1) from root.sg1.d1 limit 5 offset 1000; -select s4,round(s4),round(s4,2),round(s4,-1) from root.sg1.d1; -``` - -### 9.3 Comparison Functions - -For details and examples, see the document [Comparison Operators and Functions](./Operator-and-Expression.md#_2-3-comparison-functions). - -```sql -select ts, on_off(ts, 'threshold'='2') from root.test; -select ts, in_range(ts, 'lower'='2', 'upper'='3.1') from root.test; -``` - -### 9.4 String Processing Functions - -For details and examples, see the document [String Processing](./Operator-and-Expression.md#_2-4-string-processing-functions). - -```sql -select s1, string_contains(s1, 's'='warn') from root.sg1.d4; -select s1, string_matches(s1, 'regex'='[^\\s]+37229') from root.sg1.d4; -select s1, length(s1) from root.sg1.d1; -select s1, locate(s1, "target"="1") from root.sg1.d1; -select s1, locate(s1, "target"="1", "reverse"="true") from root.sg1.d1; -select s1, startswith(s1, "target"="1") from root.sg1.d1; -select s1, endswith(s1, "target"="1") from root.sg1.d1; -select s1, s2, concat(s1, s2, "target1"="IoT", "target2"="DB") from root.sg1.d1; -select s1, s2, concat(s1, s2, "target1"="IoT", "target2"="DB", "series_behind"="true") from root.sg1.d1; -select s1, substring(s1 from 1 for 2) from root.sg1.d1; -select s1, replace(s1, 'es', 'tt') from root.sg1.d1; -select s1, upper(s1) from root.sg1.d1; -select s1, lower(s1) from root.sg1.d1; -select s3, trim(s3) from root.sg1.d1; -select s1, s2, strcmp(s1, s2) from root.sg1.d1; -select strreplace(s1, "target"=",", "replace"="/", "limit"="2") from root.test.d1; -select strreplace(s1, "target"=",", "replace"="/", "limit"="1", "offset"="1", "reverse"="true") from root.test.d1; -select regexmatch(s1, "regex"="\d+\.\d+\.\d+\.\d+", "group"="0") from root.test.d1; -select regexreplace(s1, "regex"="192\.168\.0\.(\d+)", "replace"="cluster-$1", "limit"="1") from root.test.d1; -select regexsplit(s1, "regex"=",", "index"="-1") from root.test.d1; 
-select regexsplit(s1, "regex"=",", "index"="3") from root.test.d1; -``` - -### 9.5 Data Type Conversion Function - -For details and examples, see the document [Data Type Conversion Function](./Operator-and-Expression.md#_2-5-data-type-conversion-function). - -```sql -SELECT cast(s1 as INT32) from root.sg; -``` - -### 9.6 Constant Timeseries Generating Functions - -For details and examples, see the document [Constant Timeseries Generating Functions](./Operator-and-Expression.md#_2-6-constant-timeseries-generating-functions). - -```sql -select s1, s2, const(s1, 'value'='1024', 'type'='INT64'), pi(s2), e(s1, s2) from root.sg1.d1; -``` - -### 9.7 Selector Functions - -For details and examples, see the document [Selector Functions](./Operator-and-Expression.md#_2-7-selector-functions). - -```sql -select s1, top_k(s1, 'k'='2'), bottom_k(s1, 'k'='2') from root.sg1.d2 where time > 2020-12-10T20:36:15.530+08:00; -``` - -### 9.8 Continuous Interval Functions - -For details and examples, see the document [Continuous Interval Functions](./Operator-and-Expression.md#_2-8-continuous-interval-functions). - -```sql -select s1, zero_count(s1), non_zero_count(s2), zero_duration(s3), non_zero_duration(s4) from root.sg.d2; -``` - -### 9.9 Variation Trend Calculation Functions - -For details and examples, see the document [Variation Trend Calculation Functions](./Operator-and-Expression.md#_2-9-variation-trend-calculation-functions). - -```sql -select s1, time_difference(s1), difference(s1), non_negative_difference(s1), derivative(s1), non_negative_derivative(s1) from root.sg1.d1 limit 5 offset 1000; - -SELECT DIFF(s1), DIFF(s2) from root.test; -SELECT DIFF(s1, 'ignoreNull'='false'), DIFF(s2, 'ignoreNull'='false') from root.test; -``` - -### 9.10 Sample Functions - -For details and examples, see the document [Sample Functions](./Operator-and-Expression.md#_2-10-sample-functions). 
- -```sql -select equal_size_bucket_random_sample(temperature,'proportion'='0.1') as random_sample from root.ln.wf01.wt01; -select equal_size_bucket_agg_sample(temperature, 'type'='avg','proportion'='0.1') as agg_avg, equal_size_bucket_agg_sample(temperature, 'type'='max','proportion'='0.1') as agg_max, equal_size_bucket_agg_sample(temperature,'type'='min','proportion'='0.1') as agg_min, equal_size_bucket_agg_sample(temperature, 'type'='sum','proportion'='0.1') as agg_sum, equal_size_bucket_agg_sample(temperature, 'type'='extreme','proportion'='0.1') as agg_extreme, equal_size_bucket_agg_sample(temperature, 'type'='variance','proportion'='0.1') as agg_variance from root.ln.wf01.wt01; -select equal_size_bucket_m4_sample(temperature, 'proportion'='0.1') as M4_sample from root.ln.wf01.wt01; -select equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='avg', 'number'='2') as outlier_avg_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='stendis', 'number'='2') as outlier_stendis_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='cos', 'number'='2') as outlier_cos_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='prenextdis', 'number'='2') as outlier_prenextdis_sample from root.ln.wf01.wt01; - -select M4(s1,'timeInterval'='25','displayWindowBegin'='0','displayWindowEnd'='100') from root.vehicle.d1; -select M4(s1,'windowSize'='10') from root.vehicle.d1; -``` - -### 9.11 Change Points Function - -For details and examples, see the document [Time-Series](./Operator-and-Expression.md#_2-11-change-points-function). - -```sql -select change_points(s1), change_points(s2), change_points(s3), change_points(s4), change_points(s5), change_points(s6) from root.testChangePoints.d1; -``` - -## 10. DATA QUALITY FUNCTION LIBRARY - -For more details, see document [Operator-and-Expression](../SQL-Manual/UDF-Libraries.md). 
- -### 10.1 Data Quality - -For details and examples, see the document [Data-Quality](../SQL-Manual/UDF-Libraries.md#data-quality). - -```sql -# Completeness; -select completeness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select completeness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Consistency; -select consistency(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select consistency(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Timeliness; -select timeliness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select timeliness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Validity; -select Validity(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select Validity(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Accuracy; -select Accuracy(t1,t2,t3,m1,m2,m3) from root.test; -``` - -### 10.2 Data Profiling - -For details and examples, see the document [Data-Profiling](../SQL-Manual/UDF-Libraries.md#data-profiling). 
- -```sql -# ACF; -select acf(s1) from root.test.d1 where time <= 2020-01-01 00:00:05; - -# Distinct; -select distinct(s2) from root.test.d2; - -# Histogram; -select histogram(s1,"min"="1","max"="20","count"="10") from root.test.d1; - -# Integral; -select integral(s1) from root.test.d1 where time <= 2020-01-01 00:00:10; -select integral(s1, "unit"="1m") from root.test.d1 where time <= 2020-01-01 00:00:10; - -# IntegralAvg; -select integralavg(s1) from root.test.d1 where time <= 2020-01-01 00:00:10; - -# Mad; -select mad(s0) from root.test; -select mad(s0, "error"="0.01") from root.test; - -# Median; -select median(s0, "error"="0.01") from root.test; - -# MinMax; -select minmax(s1) from root.test; - -# Mode; -select mode(s2) from root.test.d2; - -# MvAvg; -select mvavg(s1, "window"="3") from root.test; - -# PACF; -select pacf(s1, "lag"="5") from root.test; - -# Percentile; -select percentile(s0, "rank"="0.2", "error"="0.01") from root.test; - -# Quantile; -select quantile(s0, "rank"="0.2", "K"="800") from root.test; - -# Period; -select period(s1) from root.test.d3; - -# QLB; -select QLB(s1) from root.test.d1; - -# Resample; -select resample(s1,'every'='5m','interp'='linear') from root.test.d1; -select resample(s1,'every'='30m','aggr'='first') from root.test.d1; -select resample(s1,'every'='30m','start'='2021-03-06 15:00:00') from root.test.d1; - -# Sample; -select sample(s1,'method'='reservoir','k'='5') from root.test.d1; -select sample(s1,'method'='isometric','k'='5') from root.test.d1; - -# Segment; -select segment(s1, "error"="0.1") from root.test; - -# Skew; -select skew(s1) from root.test.d1; - -# Spline; -select spline(s1, "points"="151") from root.test; - -# Spread; -select spread(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; - -# Stddev; -select stddev(s1) from root.test.d1; - -# ZScore; -select zscore(s1) from root.test; -``` - -### 10.3 Anomaly Detection - -For details and examples, see the document 
[Anomaly-Detection](../SQL-Manual/UDF-Libraries.md#anomaly-detection). - -```sql -# IQR; -select iqr(s1) from root.test; - -# KSigma; -select ksigma(s1,"k"="1.0") from root.test.d1 where time <= 2020-01-01 00:00:30; - -# LOF; -select lof(s1,s2) from root.test.d1 where time<1000; -select lof(s1, "method"="series") from root.test.d1 where time<1000; - -# MissDetect; -select missdetect(s2,'minlen'='10') from root.test.d2; - -# Range; -select range(s1,"lower_bound"="101.0","upper_bound"="125.0") from root.test.d1 where time <= 2020-01-01 00:00:30; - -# TwoSidedFilter; -select TwoSidedFilter(s0, 'len'='5', 'threshold'='0.3') from root.test; - -# Outlier; -select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test; - -# MasterTrain; -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test; - -# MasterDetect; -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test; -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test; -``` - -### 10.4 Frequency Domain - -For details and examples, see the document [Frequency-Domain](../SQL-Manual/UDF-Libraries.md#frequency-domain-analysis). 
- -```sql -# Conv; -select conv(s1,s2) from root.test.d2; - -# Deconv; -select deconv(s3,s2) from root.test.d2; -select deconv(s3,s2,'result'='remainder') from root.test.d2; - -# DWT; -select dwt(s1,"method"="haar") from root.test.d1; - -# FFT; -select fft(s1) from root.test.d1; -select fft(s1, 'result'='real', 'compress'='0.99'), fft(s1, 'result'='imag','compress'='0.99') from root.test.d1; - -# HighPass; -select highpass(s1,'wpass'='0.45') from root.test.d1; - -# IFFT; -select ifft(re, im, 'interval'='1m', 'start'='2021-01-01 00:00:00') from root.test.d1; - -# LowPass; -select lowpass(s1,'wpass'='0.45') from root.test.d1; - -# Envelope; -select envelope(s1) from root.test.d1; -``` - -### 10.5 Data Matching - -For details and examples, see the document [Data-Matching](../SQL-Manual/UDF-Libraries.md#data-matching). - -```sql -# Cov; -select cov(s1,s2) from root.test.d2; - -# DTW; -select dtw(s1,s2) from root.test.d2; - -# Pearson; -select pearson(s1,s2) from root.test.d2; - -# PtnSym; -select ptnsym(s4, 'window'='5', 'threshold'='0') from root.test.d1; - -# XCorr; -select xcorr(s1, s2) from root.test.d1 where time <= 2020-01-01 00:00:05; -``` - -### 10.6 Data Repairing - -For details and examples, see the document [Data-Repairing](../SQL-Manual/UDF-Libraries.md#data-repairing). 
- -```sql -# TimestampRepair; -select timestamprepair(s1,'interval'='10000') from root.test.d2; -select timestamprepair(s1) from root.test.d2; - -# ValueFill; -select valuefill(s1) from root.test.d2; -select valuefill(s1,"method"="previous") from root.test.d2; - -# ValueRepair; -select valuerepair(s1) from root.test.d2; -select valuerepair(s1,'method'='LsGreedy') from root.test.d2; - -# MasterRepair; -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test; - -# SeasonalRepair; -select seasonalrepair(s1,'period'=3,'k'=2) from root.test.d2; -select seasonalrepair(s1,'method'='improved','period'=3) from root.test.d2; -``` - -### 10.7 Series Discovery - -For details and examples, see the document [Series-Discovery](../SQL-Manual/UDF-Libraries.md#series-discovery). - -```sql -# ConsecutiveSequences; -select consecutivesequences(s1,s2,'gap'='5m') from root.test.d1; -select consecutivesequences(s1,s2) from root.test.d1; - -# ConsecutiveWindows; -select consecutivewindows(s1,s2,'length'='10m') from root.test.d1; -``` - -### 10.8 Machine Learning - -For details and examples, see the document [Machine-Learning](../SQL-Manual/UDF-Libraries.md#machine-learning). - -```sql -# AR; -select ar(s0,"p"="2") from root.test.d0; - -# Representation; -select representation(s0,"tb"="3","vb"="2") from root.test.d0; - -# RM; -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0; -``` - - -## 11. CONDITIONAL EXPRESSION - -For details and examples, see the document [Conditional Expressions](../SQL-Manual/UDF-Libraries.md#conditional-expressions). 
- -```sql -select T, P, case -when 1000=1050 then "bad temperature" -when P<=1000000 or P>=1100000 then "bad pressure" -end as `result` -from root.test1; - -select str, case -when str like "%cc%" then "has cc" -when str like "%dd%" then "has dd" -else "no cc and dd" end as `result` -from root.test2; - -select -count(case when x<=1 then 1 end) as `(-∞,1]`, -count(case when 1 -[RESAMPLE - [EVERY ] - [BOUNDARY ] - [RANGE [, end_time_offset]] -] -[TIMEOUT POLICY BLOCKED|DISCARD] -BEGIN - SELECT CLAUSE - INTO CLAUSE - FROM CLAUSE - [WHERE CLAUSE] - [GROUP BY([, ]) [, level = ]] - [HAVING CLAUSE] - [FILL ({PREVIOUS | LINEAR | constant} (, interval=DURATION_LITERAL)?)] - [LIMIT rowLimit OFFSET rowOffset] - [ALIGN BY DEVICE] -END; -``` - -### 13.1 Configuring execution intervals - -```sql -CREATE CONTINUOUS QUERY cq1 -RESAMPLE EVERY 20s -BEGIN -SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) -END; -``` - -### 13.2 Configuring time range for resampling - -```sql -CREATE CONTINUOUS QUERY cq2 -RESAMPLE RANGE 40s -BEGIN - SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) -END; -``` - -### 13.3 Configuring execution intervals and CQ time ranges - -```sql -CREATE CONTINUOUS QUERY cq3 -RESAMPLE EVERY 20s RANGE 40s -BEGIN - SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) - FILL(100.0) -END; -``` - -### 13.4 Configuring end_time_offset for CQ time range - -```sql -CREATE CONTINUOUS QUERY cq4 -RESAMPLE EVERY 20s RANGE 40s, 20s -BEGIN - SELECT max_value(temperature) - INTO 
root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) - FILL(100.0) -END; -``` - -### 13.5 CQ without group by clause - -```sql -CREATE CONTINUOUS QUERY cq5 -RESAMPLE EVERY 20s -BEGIN - SELECT temperature + 1 - INTO root.precalculated_sg.::(temperature) - FROM root.ln.*.* - align by device -END; -``` - -### 13.6 CQ Management - -#### Listing continuous queries - -```sql -SHOW (CONTINUOUS QUERIES | CQS) -``` - -#### Dropping continuous queries - -```sql -DROP (CONTINUOUS QUERY | CQ) -``` - -#### Altering continuous queries - -CQs can't be altered once they're created. To change a CQ, you must `DROP` and re`CREATE` it with the updated settings. - -## 14. USER-DEFINED FUNCTION (UDF) - -For more details, see document [UDF Libraries](../SQL-Manual/UDF-Libraries.md). - -### 14.1 UDF Registration - -```sql -CREATE FUNCTION AS (USING URI URI-STRING)? -``` - -### 14.2 UDF Deregistration - -```sql -DROP FUNCTION -``` - -### 14.3 UDF Queries - -```sql -SELECT example(*) from root.sg.d1; -SELECT example(s1, *) from root.sg.d1; -SELECT example(*, *) from root.sg.d1; - -SELECT example(s1, 'key1'='value1', 'key2'='value2'), example(*, 'key3'='value3') FROM root.sg.d1; -SELECT example(s1, s2, 'key1'='value1', 'key2'='value2') FROM root.sg.d1; - -SELECT s1, s2, example(s1, s2) FROM root.sg.d1; -SELECT *, example(*) FROM root.sg.d1 DISABLE ALIGN; -SELECT s1 * example(* / s1 + s2) FROM root.sg.d1; -SELECT s1, s2, s1 + example(s1, s2), s1 - example(s1 + example(s1, s2) / s2) FROM root.sg.d1; -``` - -### 14.4 Show All Registered UDFs - -```sql -SHOW FUNCTIONS; -``` - -## 15. ADMINISTRATION MANAGEMENT - -For more details, see document [Authority Management](../User-Manual/Authority-Management_timecho.md). 
- -### 15.1 SQL Statements - -- Create user (Requires MANAGE_USER permission) - -```sql -CREATE USER ; -eg: CREATE USER user1 'passwd'; -``` - -- Delete user (Requires MANAGE_USER permission) - -```sql -DROP USER ; -eg: DROP USER user1; -``` - -- Create role (Requires MANAGE_ROLE permission) - -```sql -CREATE ROLE ; -eg: CREATE ROLE role1; -``` - -- Delete role (Requires MANAGE_ROLE permission) - -```sql -DROP ROLE ; -eg: DROP ROLE role1; -``` - -- Grant role to user (Requires MANAGE_ROLE permission) - -```sql -GRANT ROLE TO ; -eg: GRANT ROLE admin TO user1; -``` - -- Revoke role from user (Requires MANAGE_ROLE permission) - -```sql -REVOKE ROLE FROM ; -eg: REVOKE ROLE admin FROM user1; -``` - -- List all users (Requires MANAGE_USER permission) - -```sql -LIST USER; -``` - -- List all roles (Requires MANAGE_ROLE permission) - -```sql -LIST ROLE; -``` - -- List all users granted a specific role (Requires MANAGE_USER permission) - -```sql -LIST USER OF ROLE ; -eg: LIST USER OF ROLE roleuser; -``` - -- List all roles granted to a specific user 
- -```sql -LIST ROLE OF USER ; -eg: LIST ROLE OF USER tempuser; -``` - -- List all privileges of a user - -```sql -LIST PRIVILEGES OF USER ; -eg: LIST PRIVILEGES OF USER tempuser; -``` - -- List all privileges of a role - -```sql -LIST PRIVILEGES OF ROLE ; -eg: LIST PRIVILEGES OF ROLE actor; -``` - -- Modify password - -```sql -ALTER USER SET PASSWORD ; -eg: ALTER USER tempuser SET PASSWORD 'newpwd'; -``` - -### 15.2 Authorization and Deauthorization - - -```sql -GRANT ON TO ROLE/USER [WITH GRANT OPTION]; -eg: GRANT READ ON root.** TO ROLE role1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.** TO USER user1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.**,root.t2.** TO USER user1; -eg: GRANT MANAGE_ROLE ON root.** TO USER user1 WITH GRANT OPTION; -eg: GRANT ALL ON root.** TO USER user1 WITH GRANT OPTION; -``` - -```sql -REVOKE ON FROM ROLE/USER ; -eg: REVOKE READ ON root.** FROM ROLE role1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.** FROM USER user1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.**, root.t2.** FROM USER user1; -eg: REVOKE MANAGE_ROLE ON root.** FROM USER user1; -eg: REVOKE ALL ON root.** FROM USER user1; -``` - - -#### Delete Time Partition (experimental) - -```sql -eg: DELETE PARTITION root.ln 0,1,2; -``` - -#### Continuous Query (CQ) - -```sql -eg: CREATE CONTINUOUS QUERY cq1 BEGIN SELECT max_value(temperature) INTO temperature_max FROM root.ln.*.* GROUP BY time(10s) END; -``` - -#### Maintenance Commands - -- FLUSH - -```sql -eg: flush; -``` - -- MERGE - -```sql -eg: MERGE; -eg: FULL MERGE; -``` - -- CLEAR CACHE - -```sql -eg: CLEAR CACHE; -``` - -- START REPAIR DATA - -```sql -eg: START REPAIR DATA; -``` - -- STOP REPAIR DATA - -```sql -eg: STOP REPAIR DATA; -``` - -- SET SYSTEM TO READONLY / WRITABLE - -```sql -eg: SET SYSTEM TO READONLY / WRITABLE; -``` - -- Abort a query - -```sql -eg: KILL QUERY 1; -``` \ No newline at end of file diff --git a/src/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md 
b/src/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md deleted file mode 100644 index b4f54e05c..000000000 --- a/src/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md +++ /dev/null @@ -1,5063 +0,0 @@ - - -# UDF Libraries - -Built on the user-defined function (UDF) mechanism, IoTDB provides a series of functions for time series data processing, covering data quality, data profiling, anomaly detection, frequency domain analysis, data matching, data repairing, series discovery, and machine learning, to meet the needs of industrial time series data processing. - -> Note: The functions in the current UDF library support only millisecond-level timestamp precision. - -## 1. Installation steps - -1. Obtain the compressed package of the UDF library JAR that is compatible with your IoTDB version. - - | UDF installation package | Supported IoTDB versions | Download link | - | --------------- | ----------------- | ------------------------------------------------------------ | - | TimechoDB-UDF-1.3.3.zip | V1.3.3 and above | Please contact Timecho for assistance | - | TimechoDB-UDF-1.3.2.zip | V1.0.0~V1.3.2 | Please contact Timecho for assistance | - -2. Place the `library-udf.jar` file from the obtained compressed package into the `/ext/udf` directory of every node in the IoTDB cluster. -3. In the IoTDB SQL command line terminal (CLI) or the SQL operation interface of the visualization console (Workbench), execute the registration statement of each function you need, as listed below. -4. 
Batch registration: there are two methods, a registration script or the full set of SQL statements. -- Registration script - - Copy the registration script (`register-UDF.sh` or `register-UDF.bat`) from the compressed package to the `tools` directory of IoTDB as needed, and modify the parameters in the script (defaults: host=127.0.0.1, rpcPort=6667, user=root, pass=root); - - Start the IoTDB service, then run the registration script to register the UDFs in batch. - -- All SQL statements - - Open the SQL file in the compressed package, copy all of its SQL statements, and execute them in the IoTDB SQL command line terminal (CLI) or the SQL operation interface of the visualization console (Workbench) to register the UDFs in batch. - -## 2. Data Quality - -### 2.1 Completeness - -#### Registration statement - -```sql -create function completeness as 'org.apache.iotdb.library.dquality.UDTFCompleteness' -``` - -#### Usage - -This function calculates the completeness of a time series, which measures the presence or absence of missing values in the time series data. The function divides the input time series data into consecutive non-overlapping time windows, computes the data completeness for each window individually, and outputs the timestamp of the first data point in the window along with the completeness result. - -**Name:** COMPLETENESS - -**Input Series:** Supports only a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window. It is either a positive integer or a positive number with a unit. The former is the number of data points in each window; the last window may contain fewer points. The latter is the time length of the window. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. -+ `downtime`: Whether the downtime exception is considered in the calculation of completeness. It is 'true' or 'false' (default). 
When considering the downtime exception, long-term missing data will be considered as downtime exception without any influence on completeness. - -**Output Series:** Output a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** Only when the number of data points in the window exceeds 10, the calculation will be performed. Otherwise, the window will be ignored and nothing will be output. - -#### Examples - -##### Default Parameters - -With default parameters, this function will regard all input data as the same window. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select completeness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-----------------------------+ -| Time|completeness(root.test.d1.s1)| -+-----------------------------+-----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -+-----------------------------+-----------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
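Besides a point count, the `window` parameter described above can be given as a duration, and `downtime` can be enabled explicitly. The statements below are a minimal sketch under stated assumptions: they reuse the `root.test.d1.s1` series from these examples, the `30s` window length is an illustrative value that does not appear in the original, and no output is shown because the result was not verified against a running instance. The `--` lines are explanatory comments; drop them if your client does not accept SQL comments.

```sql
-- Assumption: '30s' is an illustrative duration; each window then spans 30 seconds of data.
select completeness(s1,"window"="30s") from root.test.d1 where time <= 2020-01-01 00:01:00
-- Assumption: with "downtime"="true", long-term missing data is treated as downtime and does not affect completeness.
select completeness(s1,"downtime"="true") from root.test.d1 where time <= 2020-01-01 00:01:00
```

The worked example that follows uses a count-based window of 15 points instead.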
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select completeness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------+ -| Time|completeness(root.test.d1.s1, "window"="15")| -+-----------------------------+--------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+--------------------------------------------+ -``` - -### 2.2 Consistency - -#### Registration statement - -```sql -create function 
consistency as 'org.apache.iotdb.library.dquality.UDTFConsistency' -``` - -#### Usage - -This function calculates the consistency of a time series, which measures whether the changes in the time series data are stable and follow uniform patterns. The function divides the input time series data into consecutive non-overlapping time windows, computes the data consistency for each window individually, and outputs the timestamp of the first data point in the window along with the consistency result. - -**Name:** CONSISTENCY - -**Input Series:** Supports only a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window. It is either a positive integer or a positive number with a unit. The former is the number of data points in each window; the last window may contain fewer points. The latter is the time length of the window. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. - -**Output Series:** Output a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** The calculation is performed only when the number of data points in the window exceeds 10. Otherwise, the window is ignored and nothing is output. - -#### Examples - -##### Default Parameters - -With default parameters, this function treats all input data as a single window. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select consistency(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|consistency(root.test.d1.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+----------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select consistency(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------+ -| Time|consistency(root.test.d1.s1, "window"="15")| -+-----------------------------+-------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+-------------------------------------------+ -``` - -### 2.3 Timeliness - -#### Registration statement - -```sql -create 
function timeliness as 'org.apache.iotdb.library.dquality.UDTFTimeliness' -``` - -#### Usage - -This function calculates the timeliness of a time series, which measures whether the time series data is collected and reported on schedule. The function divides the input time series data into consecutive non-overlapping time windows, computes the data timeliness for each window individually, and outputs the timestamp of the first data point in the window along with the timeliness result. - -**Name:** TIMELINESS - -**Input Series:** Supports only a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window. It is either a positive integer or a positive number with a unit. The former is the number of data points in each window; the last window may contain fewer points. The latter is the time length of the window. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. - -**Output Series:** Output a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** The calculation is performed only when the number of data points in the window exceeds 10. Otherwise, the window is ignored and nothing is output. - -#### Examples - -##### Default Parameters - -With default parameters, this function treats all input data as a single window. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timeliness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------+ -| Time|timeliness(root.test.d1.s1)| -+-----------------------------+---------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+---------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timeliness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------+ -| Time|timeliness(root.test.d1.s1, "window"="15")| -+-----------------------------+------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+------------------------------------------+ -``` - -### 2.4 Validity - -#### Registration statement - -```sql -create function 
validity as 'org.apache.iotdb.library.dquality.UDTFValidity' -``` - -#### Usage - -This function calculates the validity of a time series, which measures whether the time series data is normal, usable, and free of outliers. The function divides the input time series data into consecutive non-overlapping time windows, computes the data validity for each window individually, and outputs the timestamp of the first data point in the window along with the validity result. - -**Name:** VALIDITY - -**Input Series:** Supports only a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The size of each window. It is either a positive integer or a positive number with a unit. The former is the number of data points in each window; the last window may contain fewer points. The latter is the time length of the window. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window. - -**Output Series:** Output a single series. The type is DOUBLE. The range of each value is [0,1]. - -**Note:** The calculation is performed only when the number of data points in the window exceeds 10. Otherwise, the window is ignored and nothing is output. - -#### Examples - -##### Default Parameters - -With default parameters, this function treats all input data as a single window. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select Validity(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|validity(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -+-----------------------------+-------------------------+ -``` - -##### Specific Window Size - -When the window size is given, this function will divide the input data as multiple windows. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select Validity(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|validity(root.test.d1.s1, "window"="15")| -+-----------------------------+----------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - - - - - -## 3. 
Data Profiling - -### 3.1 ACF - -#### Registration statement - -```sql -create function acf as 'org.apache.iotdb.library.dprofile.UDTFACF' -``` - -#### Usage - -This function is used to calculate the auto-correlation factor of the input time series, -which equals to cross correlation between the same series. -For more information, please refer to [XCorr](#XCorr) function. - -**Name:** ACF - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. -There are $2N-1$ data points in the series, and the values are interpreted in details in [XCorr](#XCorr) function. - -**Note:** - -+ `null` and `NaN` values in the input series will be ignored and treated as 0. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| null| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select acf(s1) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|acf(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 6.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -|1970-01-01T08:00:00.004+08:00| 7.0| -|1970-01-01T08:00:00.005+08:00| 0.0| -|1970-01-01T08:00:00.006+08:00| 3.6| -|1970-01-01T08:00:00.007+08:00| 0.0| -|1970-01-01T08:00:00.008+08:00| 1.0| -+-----------------------------+--------------------+ -``` - -### 3.2 Distinct - -#### Registration statement - -```sql -create function distinct as 'org.apache.iotdb.library.dprofile.UDTFDistinct' -``` - -#### 
Usage - -This function returns all unique values in a time series. - -**Name:** DISTINCT - -**Input Series:** Only supports a single input series. The type is arbitrary. - -**Output Series:** Output a single series. The type is the same as the input. - -**Note:** - -+ The timestamp of the output series is meaningless. The output order is arbitrary. -+ Missing points and null points in the input series will be ignored, but `NaN` will not. -+ String comparison is case-sensitive. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2020-01-01T08:00:00.001+08:00| Hello| -|2020-01-01T08:00:00.002+08:00| hello| -|2020-01-01T08:00:00.003+08:00| Hello| -|2020-01-01T08:00:00.004+08:00| World| -|2020-01-01T08:00:00.005+08:00| World| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select distinct(s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|distinct(root.test.d2.s2)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.001+08:00| Hello| -|1970-01-01T08:00:00.002+08:00| hello| -|1970-01-01T08:00:00.003+08:00| World| -+-----------------------------+-------------------------+ -``` - -### 3.3 Histogram - -#### Registration statement - -```sql -create function histogram as 'org.apache.iotdb.library.dprofile.UDTFHistogram' -``` - -#### Usage - -This function is used to calculate the distribution histogram of a single column of numerical data. - -**Name:** HISTOGRAM - -**Input Series:** Only supports a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `min`: The lower limit of the requested data range. The default value is -Double.MAX_VALUE. -+ `max`: The upper limit of the requested data range. The default value is Double.MAX_VALUE. The value of `min` must be less than or equal to the value of `max`. 
-+ `count`: The number of buckets of the histogram, the default value is 1. It must be a positive integer. - -**Output Series:** The value of the bucket of the histogram, where the lower bound represented by the i-th bucket (index starts from 1) is $min+ (i-1)\cdot\frac{max-min}{count}$ and the upper bound is $min + i \cdot \frac{max-min}{count}$. - -**Note:** - -+ If the value is lower than `min`, it will be put into the 1st bucket. If the value is larger than `max`, it will be put into the last bucket. -+ Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 11.0| -|2020-01-01T00:00:11.000+08:00| 12.0| -|2020-01-01T00:00:12.000+08:00| 13.0| -|2020-01-01T00:00:13.000+08:00| 14.0| -|2020-01-01T00:00:14.000+08:00| 15.0| -|2020-01-01T00:00:15.000+08:00| 16.0| -|2020-01-01T00:00:16.000+08:00| 17.0| -|2020-01-01T00:00:17.000+08:00| 18.0| -|2020-01-01T00:00:18.000+08:00| 19.0| -|2020-01-01T00:00:19.000+08:00| 20.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select histogram(s1,"min"="1","max"="20","count"="10") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------+ -| Time|histogram(root.test.d1.s1, "min"="1", "max"="20", "count"="10")| -+-----------------------------+---------------------------------------------------------------+ 
-|1970-01-01T08:00:00.000+08:00| 2| -|1970-01-01T08:00:00.001+08:00| 2| -|1970-01-01T08:00:00.002+08:00| 2| -|1970-01-01T08:00:00.003+08:00| 2| -|1970-01-01T08:00:00.004+08:00| 2| -|1970-01-01T08:00:00.005+08:00| 2| -|1970-01-01T08:00:00.006+08:00| 2| -|1970-01-01T08:00:00.007+08:00| 2| -|1970-01-01T08:00:00.008+08:00| 2| -|1970-01-01T08:00:00.009+08:00| 2| -+-----------------------------+---------------------------------------------------------------+ -``` - -### 3.4 Integral - -#### Registration statement - -```sql -create function integral as 'org.apache.iotdb.library.dprofile.UDAFIntegral' -``` - -#### Usage - -This function is used to calculate the integration of time series, -which equals to the area under the curve with time as X-axis and values as Y-axis. - -**Name:** INTEGRAL - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `unit`: The unit of time used when computing the integral. - The value should be chosen from "1S", "1s", "1m", "1H", "1d"(case-sensitive), - and each represents taking one millisecond / second / minute / hour / day as 1.0 while calculating the area and integral. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the integration. - -**Note:** - -+ The integral value equals to the sum of the areas of right-angled trapezoids consisting of each two adjacent points and the time-axis. - Choosing different `unit` implies different scaling of time axis, thus making it apparent to convert the value among those results with constant coefficient. - -+ `NaN` values in the input series will be ignored. The curve or trapezoids will skip these points and use the next valid point. - -#### Examples - -##### Default Parameters - -With default parameters, this function will take one second as 1.0. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| 2| -|2020-01-01T00:00:03.000+08:00| 5| -|2020-01-01T00:00:04.000+08:00| 6| -|2020-01-01T00:00:05.000+08:00| 7| -|2020-01-01T00:00:08.000+08:00| 8| -|2020-01-01T00:00:09.000+08:00| NaN| -|2020-01-01T00:00:10.000+08:00| 10| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select integral(s1) from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|integral(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.000+08:00| 57.5| -+-----------------------------+-------------------------+ -``` - -Calculation expression: -$$\frac{1}{2}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 57.5$$ - -##### Specific time unit - -With time unit specified as "1m", this function will take one minute as 1.0. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select integral(s1, "unit"="1m") from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|integral(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.958| -+-----------------------------+-------------------------+ -``` - -Calculation expression: -$$\frac{1}{2\times 60}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 0.958$$ - -### 3.5 IntegralAvg - -#### Registration statement - -```sql -create function integralavg as 'org.apache.iotdb.library.dprofile.UDAFIntegralAvg' -``` - -#### Usage - -This function is used to calculate the function average of time series. 
-The output equals the area under the curve divided by the time interval, both measured in the same time `unit`. -For more information on the area under the curve, please refer to the `Integral` function. - -**Name:** INTEGRALAVG - -**Input Series:** Only supports a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the time-weighted average. - -**Note:** - -+ The time-weighted average equals the integral value computed with any `unit`, divided by the time interval of the input series measured in that same unit. - The result therefore does not depend on the time unit used in the integral, and by default it is consistent with the timestamp precision of IoTDB. - -+ `NaN` values in the input series will be ignored. The curve or trapezoids will skip these points and use the next valid point. - -+ If the input series is empty, the output value will be 0.0. If there is only one data point, the output equals the value of that point. 
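The relationship between the trapezoidal integral and the time-weighted average can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the UDF's implementation: the function names `integral` and `integral_avg` are hypothetical, timestamps are given in seconds, and the time interval is assumed to run from the first to the last valid (non-`NaN`) point. The data is the series used in the `Integral` example.

```python
import math


def integral(points, unit_seconds=1.0):
    """Trapezoidal area under (time_in_seconds, value) points, skipping NaN values."""
    valid = [(t, v) for t, v in points if not math.isnan(v)]
    area = 0.0
    for (t0, v0), (t1, v1) in zip(valid, valid[1:]):
        # Each pair of adjacent valid points forms a right-angled trapezoid.
        area += (v0 + v1) * (t1 - t0) / 2.0
    return area / unit_seconds


def integral_avg(points):
    """Time-weighted average: area divided by the covered time interval."""
    valid = [(t, v) for t, v in points if not math.isnan(v)]
    if not valid:
        return 0.0                      # empty input
    if len(valid) == 1:
        return float(valid[0][1])       # single point: the value itself
    span = valid[-1][0] - valid[0][0]   # assumption: first to last valid point
    return integral(valid) / span


series = [(1, 1), (2, 2), (3, 5), (4, 6), (5, 7), (8, 8), (9, float("nan")), (10, 10)]
print(integral(series))                     # area with one second as 1.0
print(integral(series, unit_seconds=60.0))  # same area with one minute as 1.0
print(integral_avg(series))                 # area / 9 seconds
```

With this series the area is 57.5 and the interval is 9 seconds (from 00:00:01 to 00:00:10), so the average is 57.5 / 9, roughly 6.389.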
- -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| 2| -|2020-01-01T00:00:03.000+08:00| 5| -|2020-01-01T00:00:04.000+08:00| 6| -|2020-01-01T00:00:05.000+08:00| 7| -|2020-01-01T00:00:08.000+08:00| 8| -|2020-01-01T00:00:09.000+08:00| NaN| -|2020-01-01T00:00:10.000+08:00| 10| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select integralavg(s1) from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|integralavg(root.test.d1.s1)| -+-----------------------------+----------------------------+ -|1970-01-01T08:00:00.000+08:00| 6.388888888888889| -+-----------------------------+----------------------------+ -``` - -Calculation expression (the area 57.5 divided by the 9-second interval between the first and last points): -$$\frac{1}{2}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] / 9 \approx 6.389$$ - -### 3.6 Mad - -#### Registration statement - -```sql -create function mad as 'org.apache.iotdb.library.dprofile.UDAFMad' -``` - -#### Usage - -The function is used to compute the exact or approximate median absolute deviation (MAD) of a numeric time series. MAD is the median of the absolute deviations of the elements from their median. - -Take a dataset $\{1,3,3,5,5,6,7,8,9\}$ as an instance. Its median is 5 and the absolute deviations of the elements from the median, sorted in ascending order, are $\{0,0,1,2,2,2,3,4,4\}$, whose median is 2. Therefore, the MAD of the original dataset is 2. - -**Name:** MAD - -**Input Series:** Only supports a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `error`: The relative error of the approximate MAD. It should be within [0,1) and the default value is 0. 
Taking `error`=0.01 as an instance, suppose the exact MAD is $a$ and the approximate MAD is $b$, we have $0.99a \le b \le 1.01a$. With `error`=0, the output is the exact MAD. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the MAD. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -##### Approximate Query - -By setting `error` within (0,1), the function queries the approximate MAD. - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -............ 
-Total line number = 20 -``` - -SQL for query: - -```sql -select mad(s1, "error"="0.01") from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -| Time|mad(root.test.s1, "error"="0.01")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9900000000000001| -+-----------------------------+---------------------------------+ -``` - -### 3.7 Median - -#### Registration statement - -```sql -create function median as 'org.apache.iotdb.library.dprofile.UDAFMedian' -``` - -#### Usage - -The function is used to compute the exact or approximate median of a numeric time series. Median is the value separating the higher half from the lower half of a data sample. - -**Name:** MEDIAN - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `error`: The rank error of the approximate median. It should be within [0,1) and the default value is 0. For instance, a median with `error`=0.01 is the value of the element with rank percentage 0.49~0.51. With `error`=0, the output is the exact median. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the median. 
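One way to read the `error` parameter is as a tolerance on rank rather than on value. The Python sketch below illustrates that reading and is not the UDF's algorithm: the helper `within_rank_error` is hypothetical, and it assumes that a returned value is acceptable when the range of ranks it occupies in the sorted data overlaps the rank percentages from `0.5 - error` to `0.5 + error`. The data matches the 20-point series used in the example for this function.

```python
import bisect
import statistics


def within_rank_error(values, candidate, error):
    """True if candidate's rank range overlaps [(0.5 - error) * n, (0.5 + error) * n]."""
    ordered = sorted(values)
    n = len(ordered)
    below = bisect.bisect_left(ordered, candidate)      # elements strictly smaller
    at_most = bisect.bisect_right(ordered, candidate)   # elements smaller or equal
    return below <= (0.5 + error) * n and at_most >= (0.5 - error) * n


values = [0.0, 0.0, 1.0, -1.0, 0.0, 0.0, -2.0, 2.0, 0.0, 0.0,
          1.0, -1.0, -1.0, 1.0, 0.0, 0.0, 10.0, 2.0, -2.0, 0.0]

print(statistics.median(values))              # exact median, as with error = 0
print(within_rank_error(values, 0.0, 0.01))   # 0.0 sits at ranks 0.49 to 0.51
print(within_rank_error(values, 1.0, 0.01))   # 1.0 lies outside that rank band
```

Under this reading, only 0.0 is an acceptable answer for `error`=0.01 on this series, which agrees with the exact median.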
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -Total line number = 20 -``` - -SQL for query: - -```sql -select median(s1, "error"="0.01") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|median(root.test.s1, "error"="0.01")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -+-----------------------------+------------------------------------+ -``` - -### 3.8 MinMax - -#### Registration statement - -```sql -create function minmax as 'org.apache.iotdb.library.dprofile.UDTFMinMax' -``` - -#### Usage - -This function is used to normalize the input series with min-max scaling. The minimum value is transformed to 0; the maximum value is transformed to 1. - -**Name:** MINMAX - -**Input Series:** Only supports a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `compute`: When set to "batch", the transformation is performed after all data points have been imported; when set to "stream", the minimum and maximum values must be provided. 
The default method is "batch". -+ `min`: The minimum value when `compute` is set to "stream". -+ `max`: The maximum value when `compute` is set to "stream". - -**Output Series:** Output a single series. The type is DOUBLE. - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select minmax(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|minmax(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.200+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.300+08:00| 0.25| -|1970-01-01T08:00:00.400+08:00| 0.08333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.700+08:00| 0.0| -|1970-01-01T08:00:00.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.900+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.000+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.100+08:00| 0.25| 
-|1970-01-01T08:00:01.200+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.300+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.25| -|1970-01-01T08:00:01.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.700+08:00| 1.0| -|1970-01-01T08:00:01.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.900+08:00| 0.0| -|1970-01-01T08:00:02.000+08:00| 0.16666666666666666| -+-----------------------------+--------------------+ -``` - - -### 3.9 MvAvg - -#### Registration statement - -```sql -create function mvavg as 'org.apache.iotdb.library.dprofile.UDTFMvAvg' -``` - -#### Usage - -This function is used to calculate moving average of input series. - -**Name:** MVAVG - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `window`: Length of the moving window. Default value is 10. - -**Output Series:** Output a single series. The type is DOUBLE. - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql 
-select mvavg(s1, "window"="3") from root.test -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -| Time|mvavg(root.test.s1, "window"="3")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.300+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.400+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.700+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.100+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.200+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.300+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.0| -|1970-01-01T08:00:01.500+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.600+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.700+08:00| 3.0| -|1970-01-01T08:00:01.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.900+08:00| -0.6666666666666666| -|1970-01-01T08:00:02.000+08:00| -3.3333333333333335| -+-----------------------------+---------------------------------+ -``` - -### 3.10 PACF - -#### Registration statement - -```sql -create function pacf as 'org.apache.iotdb.library.dprofile.UDTFPACF' -``` - -#### Usage - -This function is used to calculate partial autocorrelation of input series by solving Yule-Walker equation. For some cases, the equation may not be solved, and NaN will be output. - -**Name:** PACF - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `lag`: Maximum lag of pacf to calculate. The default value is $\min(10\log_{10}n,n-1)$, where $n$ is the number of data points. - -**Output Series:** Output a single series. The type is DOUBLE. 
- -#### Examples - -##### Assigning maximum lag - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select pacf(s1, "lag"="5") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------+ -| Time|pacf(root.test.d1.s1, "lag"="5")| -+-----------------------------+--------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| -0.5744680851063829| -|2020-01-01T00:00:03.000+08:00| 0.3172297297297296| -|2020-01-01T00:00:04.000+08:00| -0.2977686586304181| -|2020-01-01T00:00:05.000+08:00| -2.0609033521065867| -+-----------------------------+--------------------------------+ -``` - -### 3.11 Percentile - -#### Registration statement - -```sql -create function percentile as 'org.apache.iotdb.library.dprofile.UDAFPercentile' -``` - -#### Usage - -The function is used to compute the exact or approximate percentile of a numeric time series. A percentile is value of element in the certain rank of the sorted series. - -**Name:** PERCENTILE - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `rank`: The rank percentage of the percentile. It should be (0,1] and the default value is 0.5. For instance, a percentile with `rank`=0.5 is the median. -+ `error`: The rank error of the approximate percentile. It should be within [0,1) and the default value is 0. For instance, a 0.5-percentile with `error`=0.01 is the value of the element with rank percentage 0.49~0.51. With `error`=0, the output is the exact percentile. - -**Output Series:** Output a single series. 
The type is the same as the input series. If `error`=0, the series contains a single data point whose value is the exact percentile and whose timestamp is that of the first input point holding this value; otherwise, the timestamp of the only data point is 0. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+-------------+ -| Time|root.test2.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+-------------+ -Total line number = 20 -``` - -SQL for query: - -```sql -select percentile(s1, "rank"="0.2", "error"="0.01") from root.test2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------------+ -| Time|percentile(root.test2.s1, "rank"="0.2", "error"="0.01")| -+-----------------------------+-------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| -1.0| -+-----------------------------+-------------------------------------------------------+ -``` - -### 3.12 Quantile - -#### Registration statement - -```sql -create function quantile as 'org.apache.iotdb.library.dprofile.UDAFQuantile' -``` - -#### Usage - 
-The function is used to compute the approximate quantile of a numeric time series. A quantile is value of element in the certain rank of the sorted series. - -**Name:** QUANTILE - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -+ `rank`: The rank of the quantile. It should be (0,1] and the default value is 0.5. For instance, a quantile with `rank`=0.5 is the median. -+ `K`: The size of KLL sketch maintained in the query. It should be within [100,+inf) and the default value is 800. For instance, the 0.5-quantile computed by a KLL sketch with K=800 items is a value with rank quantile 0.49~0.51 with a confidence of at least 99%. The result will be more accurate as K increases. - -**Output Series:** Output a single series. The type is the same as input series. The timestamp of the only data point is 0. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+-------------+ -| Time|root.test1.s1| -+-----------------------------+-------------+ -|2021-03-17T10:32:17.054+08:00| 7| -|2021-03-17T10:32:18.054+08:00| 15| -|2021-03-17T10:32:19.054+08:00| 36| -|2021-03-17T10:32:20.054+08:00| 39| -|2021-03-17T10:32:21.054+08:00| 40| -|2021-03-17T10:32:22.054+08:00| 41| -|2021-03-17T10:32:23.054+08:00| 20| -|2021-03-17T10:32:24.054+08:00| 18| -+-----------------------------+-------------+ -............ 
-Total line number = 8 -``` - -SQL for query: - -```sql -select quantile(s1, "rank"="0.2", "K"="800") from root.test1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------+ -| Time|quantile(root.test1.s1, "rank"="0.2", "K"="800")| -+-----------------------------+------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 7.000000000000001| -+-----------------------------+------------------------------------------------+ -``` - -### 3.13 Period - -#### Registration statement - -```sql -create function period as 'org.apache.iotdb.library.dprofile.UDAFPeriod' -``` - -#### Usage - -The function is used to compute the period of a numeric time series. - -**Name:** PERIOD - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is INT32. There is only one data point in the series, whose timestamp is 0 and value is the period. 
 - -#### Examples - -Input series: - - -``` -+-----------------------------+---------------+ -| Time|root.test.d3.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| -|1970-01-01T08:00:00.002+08:00| 2.0| -|1970-01-01T08:00:00.003+08:00| 3.0| -|1970-01-01T08:00:00.004+08:00| 1.0| -|1970-01-01T08:00:00.005+08:00| 2.0| -|1970-01-01T08:00:00.006+08:00| 3.0| -|1970-01-01T08:00:00.007+08:00| 1.0| -|1970-01-01T08:00:00.008+08:00| 2.0| -|1970-01-01T08:00:00.009+08:00| 3.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select period(s1) from root.test.d3 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time|period(root.test.d3.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 3| -+-----------------------------+-----------------------+ -``` - -### 3.14 QLB - -#### Registration statement - -```sql -create function qlb as 'org.apache.iotdb.library.dprofile.UDTFQLB' -``` - -#### Usage - -This function is used to calculate the Ljung-Box statistic $Q_{LB}$ of a time series and convert it to a p-value. - -**Name:** QLB - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `lag`: The maximum lag to calculate. It must be an integer from 1 to n-2, where n is the number of samples. The default value is n-2. - -**Output Series:** Output a single series. The type is DOUBLE. Each value in the output series is a p-value, and its timestamp is the corresponding lag. - -**Note:** To obtain the Ljung-Box statistic $Q_{LB}$ itself rather than the p-value, you may use the ACF function. 
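
The statistic behind this function can be sketched outside IoTDB. The Python snippet below is a minimal illustration, assuming the textbook definition $Q_{LB}(h) = n(n+2)\sum_{k=1}^{h} \hat{\rho}_k^2/(n-k)$, where $\hat{\rho}_k$ is the sample autocorrelation at lag $k$. The helper names `autocorrelation` and `ljung_box_q` are hypothetical, and this is not the library's implementation. Under the usual test, the p-value is the upper-tail probability of a chi-square distribution with $h$ degrees of freedom; that step is omitted here.

```python
def autocorrelation(x, k):
    """Sample autocorrelation of x at lag k (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    return num / denom


def ljung_box_q(x, max_lag):
    """Return [Q(1), Q(2), ..., Q(max_lag)] for the series x."""
    n = len(x)
    running = 0.0
    result = []
    for k in range(1, max_lag + 1):
        # Accumulate rho_k^2 / (n - k), then scale by n * (n + 2).
        running += autocorrelation(x, k) ** 2 / (n - k)
        result.append(n * (n + 2) * running)
    return result
```

Because each term added to the sum is non-negative, the returned statistics never decrease as the lag grows.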
- -#### Examples - -##### Using Default Parameter - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T00:00:00.100+08:00| 1.22| -|1970-01-01T00:00:00.200+08:00| -2.78| -|1970-01-01T00:00:00.300+08:00| 1.53| -|1970-01-01T00:00:00.400+08:00| 0.70| -|1970-01-01T00:00:00.500+08:00| 0.75| -|1970-01-01T00:00:00.600+08:00| -0.72| -|1970-01-01T00:00:00.700+08:00| -0.22| -|1970-01-01T00:00:00.800+08:00| 0.28| -|1970-01-01T00:00:00.900+08:00| 0.57| -|1970-01-01T00:00:01.000+08:00| -0.22| -|1970-01-01T00:00:01.100+08:00| -0.72| -|1970-01-01T00:00:01.200+08:00| 1.34| -|1970-01-01T00:00:01.300+08:00| -0.25| -|1970-01-01T00:00:01.400+08:00| 0.17| -|1970-01-01T00:00:01.500+08:00| 2.51| -|1970-01-01T00:00:01.600+08:00| 1.42| -|1970-01-01T00:00:01.700+08:00| -1.34| -|1970-01-01T00:00:01.800+08:00| -0.01| -|1970-01-01T00:00:01.900+08:00| -0.49| -|1970-01-01T00:00:02.000+08:00| 1.63| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select QLB(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+---------------------+ -| Time| QLB(root.test.d1.s1)| -+-----------------------------+---------------------+ -|1970-01-01T08:00:00.021+08:00| -0.31671| -|1970-01-01T08:00:00.001+08:00| 0.12748561639660716| -|1970-01-01T08:00:00.022+08:00| -0.17051499999999997| -|1970-01-01T08:00:00.002+08:00| 0.21941409592365868| -|1970-01-01T08:00:00.023+08:00| -0.11341499999999997| -|1970-01-01T08:00:00.003+08:00| 0.3384920824593398| -|1970-01-01T08:00:00.024+08:00| 0.26146| -|1970-01-01T08:00:00.004+08:00| 0.26293189359893154| -|1970-01-01T08:00:00.025+08:00| 0.06431999999999996| -|1970-01-01T08:00:00.005+08:00| 0.37265953802871943| -|1970-01-01T08:00:00.026+08:00| 0.036919999999999994| -|1970-01-01T08:00:00.006+08:00| 0.4923218142923832| -|1970-01-01T08:00:00.027+08:00|-0.009294999999999993| -|1970-01-01T08:00:00.007+08:00| 
0.609628728420623| -|1970-01-01T08:00:00.028+08:00| 0.12271499999999999| -|1970-01-01T08:00:00.008+08:00| 0.6510708392264906| -|1970-01-01T08:00:00.029+08:00| 0.008480000000000033| -|1970-01-01T08:00:00.009+08:00| 0.7430561964288097| -|1970-01-01T08:00:00.030+08:00| -0.21764500000000003| -|1970-01-01T08:00:00.010+08:00| 0.6236738200492055| -|1970-01-01T08:00:00.031+08:00| 0.35853999999999997| -|1970-01-01T08:00:00.011+08:00| 0.21487390993160937| -|1970-01-01T08:00:00.032+08:00| 0.18115499999999998| -|1970-01-01T08:00:00.012+08:00| 0.18479562182870324| -|1970-01-01T08:00:00.033+08:00| -0.27745499999999995| -|1970-01-01T08:00:00.013+08:00| 0.07329862193377235| -|1970-01-01T08:00:00.034+08:00| -0.22418500000000002| -|1970-01-01T08:00:00.014+08:00| 0.038000864459751926| -|1970-01-01T08:00:00.035+08:00| 0.31609000000000004| -|1970-01-01T08:00:00.015+08:00| 0.004052989734200874| -|1970-01-01T08:00:00.036+08:00| -0.06078500000000001| -|1970-01-01T08:00:00.016+08:00| 0.005663787468609627| -|1970-01-01T08:00:00.037+08:00| 0.19219499999999998| -|1970-01-01T08:00:00.017+08:00|0.0016316380755082571| -|1970-01-01T08:00:00.038+08:00| -0.25646| -|1970-01-01T08:00:00.018+08:00|2.0047954405910673E-5| -+-----------------------------+---------------------+ -``` - -### 3.15 Resample - -#### Registration statement - -```sql -create function re_sample as 'org.apache.iotdb.library.dprofile.UDTFResample' -``` - -#### Usage - -This function is used to resample the input series according to a given frequency, -including up-sampling and down-sampling. -Currently, the supported up-sampling methods are -NaN (filling with `NaN`), -FFill (filling with previous value), -BFill (filling with next value) and -Linear (filling with linear interpolation). -Down-sampling relies on group aggregation, -which supports Max, Min, First, Last, Mean and Median. - -**Name:** RESAMPLE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. 
 - -**Parameters:** - - -+ `every`: The frequency of resampling, which is a positive number with a unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. This parameter is required. -+ `interp`: The interpolation method of up-sampling, which is 'NaN', 'FFill', 'BFill' or 'Linear'. By default, NaN is used. -+ `aggr`: The aggregation method of down-sampling, which is 'Max', 'Min', 'First', 'Last', 'Mean' or 'Median'. By default, Mean is used. -+ `start`: The start time (inclusive) of resampling with the format 'yyyy-MM-dd HH:mm:ss'. By default, it is the timestamp of the first valid data point. -+ `end`: The end time (exclusive) of resampling with the format 'yyyy-MM-dd HH:mm:ss'. By default, it is the timestamp of the last valid data point. - -**Output Series:** Output a single series. The type is DOUBLE. It is strictly equispaced with the frequency `every`. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -##### Up-sampling - -When the frequency of resampling is higher than the original frequency, up-sampling is applied. 
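
To make the 'Linear' method concrete, here is a rough Python sketch of linear up-sampling on an equispaced output grid. It assumes timestamps are plain numbers (minutes in the usage below) and that the grid runs from the first to the last input point; `upsample_linear` is a hypothetical helper, not the UDF's code.

```python
def upsample_linear(times, values, every):
    """Linearly interpolate (times, values) onto a grid with step `every`.

    `times` must be ascending and contain at least two points.
    """
    out = []
    i = 0
    t = times[0]
    while t <= times[-1]:
        # Move to the input segment [times[i], times[i + 1]] that contains t.
        while i + 2 < len(times) and times[i + 1] <= t:
            i += 1
        t0, t1 = times[i], times[i + 1]
        v0, v1 = values[i], values[i + 1]
        out.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
        t += every
    return out
```

With readings of 3.09, 3.53, 3.5, 3.51 and 3.41 taken every 15 minutes and `every=5`, the sketch produces 13 points, and the value 5 minutes after the first reading is about 3.2367.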
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2021-03-06T16:00:00.000+08:00| 3.09| -|2021-03-06T16:15:00.000+08:00| 3.53| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:45:00.000+08:00| 3.51| -|2021-03-06T17:00:00.000+08:00| 3.41| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select resample(s1,'every'='5m','interp'='linear') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="5m", "interp"="linear")| -+-----------------------------+----------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:05:00.000+08:00| 3.2366665999094644| -|2021-03-06T16:10:00.000+08:00| 3.3833332856496177| -|2021-03-06T16:15:00.000+08:00| 3.5299999713897705| -|2021-03-06T16:20:00.000+08:00| 3.5199999809265137| -|2021-03-06T16:25:00.000+08:00| 3.509999990463257| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:35:00.000+08:00| 3.503333330154419| -|2021-03-06T16:40:00.000+08:00| 3.506666660308838| -|2021-03-06T16:45:00.000+08:00| 3.509999990463257| -|2021-03-06T16:50:00.000+08:00| 3.4766666889190674| -|2021-03-06T16:55:00.000+08:00| 3.443333387374878| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+----------------------------------------------------------+ -``` - -##### Down-sampling - -When the frequency of resampling is lower than the original frequency, down-sampling starts. 
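
Down-sampling by group aggregation can be pictured with the following Python sketch. It assumes numeric timestamps and buckets of width `every` aligned to the first input point; the `AGGREGATORS` table and the `downsample` helper are hypothetical names used only for illustration. Unlike the UDF's equispaced output, this sketch simply skips empty buckets.

```python
from statistics import mean, median

# One aggregation function per supported `aggr` value.
AGGREGATORS = {
    "max": max,
    "min": min,
    "first": lambda vs: vs[0],
    "last": lambda vs: vs[-1],
    "mean": mean,
    "median": median,
}


def downsample(times, values, every, aggr="mean"):
    """Group points into buckets of width `every` and aggregate each bucket."""
    start = times[0]
    buckets = {}
    for t, v in zip(times, values):
        # Left edge of the bucket that contains timestamp t.
        key = start + (t - start) // every * every
        buckets.setdefault(key, []).append(v)
    fn = AGGREGATORS[aggr]
    return [(key, fn(vs)) for key, vs in sorted(buckets.items())]
```

For five readings taken every 15 minutes and `every=30`, `aggr="first"` keeps the readings at minutes 0, 30 and 60, while the default `"mean"` averages the two readings in each full bucket.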
- -Input series is the same as above, the SQL for query is shown below: - -```sql -select resample(s1,'every'='30m','aggr'='first') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "aggr"="first")| -+-----------------------------+--------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+--------------------------------------------------------+ -``` - - - -##### Specify the time period - -The time period of resampling can be specified with `start` and `end`. -The period outside the actual time range will be interpolated. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select resample(s1,'every'='30m','start'='2021-03-06 15:00:00') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "start"="2021-03-06 15:00:00")| -+-----------------------------+-----------------------------------------------------------------------+ -|2021-03-06T15:00:00.000+08:00| NaN| -|2021-03-06T15:30:00.000+08:00| NaN| -|2021-03-06T16:00:00.000+08:00| 3.309999942779541| -|2021-03-06T16:30:00.000+08:00| 3.5049999952316284| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+-----------------------------------------------------------------------+ -``` - -### 3.16 Sample - -#### Registration statement - -```sql -create function sample as 'org.apache.iotdb.library.dprofile.UDTFSample' -``` - -#### Usage - -This function is used to sample the input series, -that is, select a specified number of data points from the input series and output them. 
-Currently, three sampling methods are supported: -**Reservoir sampling** randomly selects data points. -All of the points have the same probability of being sampled. -**Isometric sampling** selects data points at equal index intervals. -**Triangle sampling** assigns data points to buckets based on the number of samples. -Within each bucket, it calculates the triangle area associated with each point and selects the point with the largest area. -For more details, see this [paper](http://skemman.is/stream/get/1946/15343/37285/3/SS_MSthesis.pdf). - -**Name:** SAMPLE - -**Input Series:** Only support a single input series. The type is arbitrary. - -**Parameters:** - -+ `method`: The method of sampling, which is 'reservoir', 'isometric' or 'triangle'. By default, reservoir sampling is used. -+ `k`: The number of samples, which is a positive integer. By default, it is 1. - -**Output Series:** Output a single series. The type is the same as the input. The length of the output series is `k`. Each data point in the output series comes from the input series. - -**Note:** If `k` is greater than the length of input series, all data points in the input series will be output. - -#### Examples - -##### Reservoir Sampling - -When `method` is 'reservoir' or the default, reservoir sampling is used. -Due to the randomness of this method, the output series shown below is only a possible result. 
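
As background, reservoir sampling can be sketched in a few lines of Python. The version below is the classic "Algorithm R", shown as an illustration under the assumption that points are `(timestamp, value)` tuples; `reservoir_sample` is a hypothetical helper and may differ from what the UDF does internally.

```python
import random


def reservoir_sample(points, k, rng=None):
    """Pick k points uniformly at random in a single pass over `points`."""
    rng = rng or random.Random()
    reservoir = []
    for i, point in enumerate(points):
        if i < k:
            # Fill the reservoir with the first k points.
            reservoir.append(point)
        else:
            # Keep the new point with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = point
    # Restore time order, since replacements happen at random slots.
    return sorted(reservoir)
```

When `k` is at least the number of points, the reservoir never overflows and every point is returned, which agrees with the note above.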
- - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| 2.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:04.000+08:00| 4.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -|2020-01-01T00:00:10.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select sample(s1,'method'='reservoir','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="reservoir", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - -##### Isometric Sampling - -When `method` is 'isometric', isometric sampling is used. 
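
One plausible reading of "equal index intervals" is shown in the Python sketch below. It assumes the sampled index for the i-th output point is `floor(i * n / k)`; this rule and the `isometric_sample` name are assumptions made for illustration, so the UDF may round indices differently.

```python
def isometric_sample(points, k):
    """Select k points whose indices are spread evenly over the input."""
    n = len(points)
    if k >= n:
        # Nothing to thin out: return every point.
        return list(points)
    return [points[i * n // k] for i in range(k)]
```

For ten points with values 1.0 to 10.0 and `k=5`, this picks the values 1.0, 3.0, 5.0, 7.0 and 9.0.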
 - -The input series is the same as above. The SQL for query is shown below: - -```sql -select sample(s1,'method'='isometric','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="isometric", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - -### 3.17 Segment - -#### Registration statement - -```sql -create function segment as 'org.apache.iotdb.library.dprofile.UDTFSegment' -``` - -#### Usage - -This function is used to segment a time series into subsequences according to linear trend, and returns the linearly fitted value of either the first point of each subsequence or every data point. - -**Name:** SEGMENT - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `output`: "all" to output all fitted points; "first" to output the first fitted point of each subsequence. - -+ `error`: The error allowed in linear regression. It is defined as the mean absolute error of a subsequence. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note:** This function treats the input series as sampled at equal intervals. All data are loaded, so down-sample the input series first if there are too many data points. 
- -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:00.300+08:00| 2.0| -|1970-01-01T08:00:00.400+08:00| 3.0| -|1970-01-01T08:00:00.500+08:00| 4.0| -|1970-01-01T08:00:00.600+08:00| 5.0| -|1970-01-01T08:00:00.700+08:00| 6.0| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 8.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:01.100+08:00| 9.1| -|1970-01-01T08:00:01.200+08:00| 9.2| -|1970-01-01T08:00:01.300+08:00| 9.3| -|1970-01-01T08:00:01.400+08:00| 9.4| -|1970-01-01T08:00:01.500+08:00| 9.5| -|1970-01-01T08:00:01.600+08:00| 9.6| -|1970-01-01T08:00:01.700+08:00| 9.7| -|1970-01-01T08:00:01.800+08:00| 9.8| -|1970-01-01T08:00:01.900+08:00| 9.9| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:02.100+08:00| 8.0| -|1970-01-01T08:00:02.200+08:00| 6.0| -|1970-01-01T08:00:02.300+08:00| 4.0| -|1970-01-01T08:00:02.400+08:00| 2.0| -|1970-01-01T08:00:02.500+08:00| 0.0| -|1970-01-01T08:00:02.600+08:00| -2.0| -|1970-01-01T08:00:02.700+08:00| -4.0| -|1970-01-01T08:00:02.800+08:00| -6.0| -|1970-01-01T08:00:02.900+08:00| -8.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.100+08:00| 10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -|1970-01-01T08:00:03.300+08:00| 10.0| -|1970-01-01T08:00:03.400+08:00| 10.0| -|1970-01-01T08:00:03.500+08:00| 10.0| -|1970-01-01T08:00:03.600+08:00| 10.0| -|1970-01-01T08:00:03.700+08:00| 10.0| -|1970-01-01T08:00:03.800+08:00| 10.0| -|1970-01-01T08:00:03.900+08:00| 10.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select segment(s1, "error"="0.1") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|segment(root.test.s1, "error"="0.1")| 
-+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -+-----------------------------+------------------------------------+ -``` - -### 3.18 Skew - -#### Registration statement - -```sql -create function skew as 'org.apache.iotdb.library.dprofile.UDAFSkew' -``` - -#### Usage - -This function is used to calculate the population skewness. - -**Name:** SKEW - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the population skewness. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 10.0| -|2020-01-01T00:00:11.000+08:00| 10.0| -|2020-01-01T00:00:12.000+08:00| 10.0| -|2020-01-01T00:00:13.000+08:00| 10.0| -|2020-01-01T00:00:14.000+08:00| 10.0| -|2020-01-01T00:00:15.000+08:00| 10.0| -|2020-01-01T00:00:16.000+08:00| 10.0| -|2020-01-01T00:00:17.000+08:00| 10.0| -|2020-01-01T00:00:18.000+08:00| 10.0| -|2020-01-01T00:00:19.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select skew(s1) 
from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time| skew(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| -0.9998427402292644| -+-----------------------------+-----------------------+ -``` - -### 3.19 Spline - -#### Registration statement - -```sql -create function spline as 'org.apache.iotdb.library.dprofile.UDTFSpline' -``` - -#### Usage - -This function is used to calculate cubic spline interpolation of input series. - -**Name:** SPLINE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `points`: Number of resampling points. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note**: Output series retains the first and last timestamps of input series. Interpolation points are selected at equal intervals. The function computes a result only when the input series contains at least 4 points. 
- -#### Examples - -##### Assigning number of interpolation points - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select spline(s1, "points"="151") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|spline(root.test.s1, "points"="151")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.010+08:00| 0.04870000251134237| -|1970-01-01T08:00:00.020+08:00| 0.09680000495910646| -|1970-01-01T08:00:00.030+08:00| 0.14430000734329226| -|1970-01-01T08:00:00.040+08:00| 0.19120000966389972| -|1970-01-01T08:00:00.050+08:00| 0.23750001192092896| -|1970-01-01T08:00:00.060+08:00| 0.2832000141143799| -|1970-01-01T08:00:00.070+08:00| 0.32830001624425253| -|1970-01-01T08:00:00.080+08:00| 0.3728000183105469| -|1970-01-01T08:00:00.090+08:00| 0.416700020313263| -|1970-01-01T08:00:00.100+08:00| 0.4600000222524008| -|1970-01-01T08:00:00.110+08:00| 0.5027000241279602| -|1970-01-01T08:00:00.120+08:00| 0.5448000259399414| -|1970-01-01T08:00:00.130+08:00| 0.5863000276883443| -|1970-01-01T08:00:00.140+08:00| 0.627200029373169| -|1970-01-01T08:00:00.150+08:00| 0.6675000309944153| -|1970-01-01T08:00:00.160+08:00| 0.7072000325520833| -|1970-01-01T08:00:00.170+08:00| 0.7463000340461731| -|1970-01-01T08:00:00.180+08:00| 0.7848000354766846| -|1970-01-01T08:00:00.190+08:00| 0.8227000368436178| 
-|1970-01-01T08:00:00.200+08:00| 0.8600000381469728| -|1970-01-01T08:00:00.210+08:00| 0.8967000393867494| -|1970-01-01T08:00:00.220+08:00| 0.9328000405629477| -|1970-01-01T08:00:00.230+08:00| 0.9683000416755676| -|1970-01-01T08:00:00.240+08:00| 1.0032000427246095| -|1970-01-01T08:00:00.250+08:00| 1.037500043710073| -|1970-01-01T08:00:00.260+08:00| 1.071200044631958| -|1970-01-01T08:00:00.270+08:00| 1.1043000454902647| -|1970-01-01T08:00:00.280+08:00| 1.1368000462849934| -|1970-01-01T08:00:00.290+08:00| 1.1687000470161437| -|1970-01-01T08:00:00.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:00.310+08:00| 1.2307000483103594| -|1970-01-01T08:00:00.320+08:00| 1.2608000489139557| -|1970-01-01T08:00:00.330+08:00| 1.2903000494873524| -|1970-01-01T08:00:00.340+08:00| 1.3192000500233967| -|1970-01-01T08:00:00.350+08:00| 1.3475000505149364| -|1970-01-01T08:00:00.360+08:00| 1.3752000509548186| -|1970-01-01T08:00:00.370+08:00| 1.402300051335891| -|1970-01-01T08:00:00.380+08:00| 1.4288000516510009| -|1970-01-01T08:00:00.390+08:00| 1.4547000518929958| -|1970-01-01T08:00:00.400+08:00| 1.480000052054723| -|1970-01-01T08:00:00.410+08:00| 1.5047000521290301| -|1970-01-01T08:00:00.420+08:00| 1.5288000521087646| -|1970-01-01T08:00:00.430+08:00| 1.5523000519867738| -|1970-01-01T08:00:00.440+08:00| 1.575200051755905| -|1970-01-01T08:00:00.450+08:00| 1.597500051409006| -|1970-01-01T08:00:00.460+08:00| 1.619200050938924| -|1970-01-01T08:00:00.470+08:00| 1.6403000503385066| -|1970-01-01T08:00:00.480+08:00| 1.660800049600601| -|1970-01-01T08:00:00.490+08:00| 1.680700048718055| -|1970-01-01T08:00:00.500+08:00| 1.7000000476837158| -|1970-01-01T08:00:00.510+08:00| 1.7188475466453037| -|1970-01-01T08:00:00.520+08:00| 1.7373800457262996| -|1970-01-01T08:00:00.530+08:00| 1.7555825448831923| -|1970-01-01T08:00:00.540+08:00| 1.7734400440724702| -|1970-01-01T08:00:00.550+08:00| 1.790937543250622| -|1970-01-01T08:00:00.560+08:00| 1.8080600423741364| -|1970-01-01T08:00:00.570+08:00| 
1.8247925413995016| -|1970-01-01T08:00:00.580+08:00| 1.8411200402832066| -|1970-01-01T08:00:00.590+08:00| 1.8570275389817397| -|1970-01-01T08:00:00.600+08:00| 1.8725000374515897| -|1970-01-01T08:00:00.610+08:00| 1.8875225356492449| -|1970-01-01T08:00:00.620+08:00| 1.902080033531194| -|1970-01-01T08:00:00.630+08:00| 1.9161575310539258| -|1970-01-01T08:00:00.640+08:00| 1.9297400281739288| -|1970-01-01T08:00:00.650+08:00| 1.9428125248476913| -|1970-01-01T08:00:00.660+08:00| 1.9553600210317021| -|1970-01-01T08:00:00.670+08:00| 1.96736751668245| -|1970-01-01T08:00:00.680+08:00| 1.9788200117564232| -|1970-01-01T08:00:00.690+08:00| 1.9897025062101101| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.710+08:00| 2.0097024933913334| -|1970-01-01T08:00:00.720+08:00| 2.0188199867081615| -|1970-01-01T08:00:00.730+08:00| 2.027367479995188| -|1970-01-01T08:00:00.740+08:00| 2.0353599732971155| -|1970-01-01T08:00:00.750+08:00| 2.0428124666586482| -|1970-01-01T08:00:00.760+08:00| 2.049739960124489| -|1970-01-01T08:00:00.770+08:00| 2.056157453739342| -|1970-01-01T08:00:00.780+08:00| 2.06207994754791| -|1970-01-01T08:00:00.790+08:00| 2.067522441594897| -|1970-01-01T08:00:00.800+08:00| 2.072499935925006| -|1970-01-01T08:00:00.810+08:00| 2.07702743058294| -|1970-01-01T08:00:00.820+08:00| 2.081119925613404| -|1970-01-01T08:00:00.830+08:00| 2.0847924210611| -|1970-01-01T08:00:00.840+08:00| 2.0880599169707317| -|1970-01-01T08:00:00.850+08:00| 2.0909374133870027| -|1970-01-01T08:00:00.860+08:00| 2.0934399103546166| -|1970-01-01T08:00:00.870+08:00| 2.0955824079182768| -|1970-01-01T08:00:00.880+08:00| 2.0973799061226863| -|1970-01-01T08:00:00.890+08:00| 2.098847405012549| -|1970-01-01T08:00:00.900+08:00| 2.0999999046325684| -|1970-01-01T08:00:00.910+08:00| 2.1005574051201332| -|1970-01-01T08:00:00.920+08:00| 2.1002599065303778| -|1970-01-01T08:00:00.930+08:00| 2.0991524087846245| -|1970-01-01T08:00:00.940+08:00| 2.0972799118041947| -|1970-01-01T08:00:00.950+08:00| 
2.0946874155104105| -|1970-01-01T08:00:00.960+08:00| 2.0914199198245944| -|1970-01-01T08:00:00.970+08:00| 2.0875224246680673| -|1970-01-01T08:00:00.980+08:00| 2.083039929962151| -|1970-01-01T08:00:00.990+08:00| 2.0780174356281687| -|1970-01-01T08:00:01.000+08:00| 2.0724999415874406| -|1970-01-01T08:00:01.010+08:00| 2.06653244776129| -|1970-01-01T08:00:01.020+08:00| 2.060159954071038| -|1970-01-01T08:00:01.030+08:00| 2.053427460438006| -|1970-01-01T08:00:01.040+08:00| 2.046379966783517| -|1970-01-01T08:00:01.050+08:00| 2.0390624730288924| -|1970-01-01T08:00:01.060+08:00| 2.031519979095454| -|1970-01-01T08:00:01.070+08:00| 2.0237974849045237| -|1970-01-01T08:00:01.080+08:00| 2.015939990377423| -|1970-01-01T08:00:01.090+08:00| 2.0079924954354746| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.110+08:00| 1.9907018211101906| -|1970-01-01T08:00:01.120+08:00| 1.9788509124245144| -|1970-01-01T08:00:01.130+08:00| 1.9645127287932083| -|1970-01-01T08:00:01.140+08:00| 1.9477527250665083| -|1970-01-01T08:00:01.150+08:00| 1.9286363560946513| -|1970-01-01T08:00:01.160+08:00| 1.9072290767278735| -|1970-01-01T08:00:01.170+08:00| 1.8835963418164114| -|1970-01-01T08:00:01.180+08:00| 1.8578036062105014| -|1970-01-01T08:00:01.190+08:00| 1.8299163247603802| -|1970-01-01T08:00:01.200+08:00| 1.7999999523162842| -|1970-01-01T08:00:01.210+08:00| 1.7623635841923329| -|1970-01-01T08:00:01.220+08:00| 1.7129696477516976| -|1970-01-01T08:00:01.230+08:00| 1.6543635959181928| -|1970-01-01T08:00:01.240+08:00| 1.5890908816156328| -|1970-01-01T08:00:01.250+08:00| 1.5196969577678319| -|1970-01-01T08:00:01.260+08:00| 1.4487272772986044| -|1970-01-01T08:00:01.270+08:00| 1.3787272931317647| -|1970-01-01T08:00:01.280+08:00| 1.3122424581911272| -|1970-01-01T08:00:01.290+08:00| 1.251818225400506| -|1970-01-01T08:00:01.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:01.310+08:00| 1.1548000470995912| -|1970-01-01T08:00:01.320+08:00| 1.1130667107899999| -|1970-01-01T08:00:01.330+08:00| 
1.0756000393033045| -|1970-01-01T08:00:01.340+08:00| 1.043200033187868| -|1970-01-01T08:00:01.350+08:00| 1.016666692992053| -|1970-01-01T08:00:01.360+08:00| 0.9968000192642223| -|1970-01-01T08:00:01.370+08:00| 0.9844000125527389| -|1970-01-01T08:00:01.380+08:00| 0.9802666734059655| -|1970-01-01T08:00:01.390+08:00| 0.9852000023722649| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.410+08:00| 1.023999999165535| -|1970-01-01T08:00:01.420+08:00| 1.0559999990463256| -|1970-01-01T08:00:01.430+08:00| 1.0959999996423722| -|1970-01-01T08:00:01.440+08:00| 1.1440000009536744| -|1970-01-01T08:00:01.450+08:00| 1.2000000029802322| -|1970-01-01T08:00:01.460+08:00| 1.264000005722046| -|1970-01-01T08:00:01.470+08:00| 1.3360000091791153| -|1970-01-01T08:00:01.480+08:00| 1.4160000133514405| -|1970-01-01T08:00:01.490+08:00| 1.5040000182390214| -|1970-01-01T08:00:01.500+08:00| 1.600000023841858| -+-----------------------------+------------------------------------+ -``` - -### 3.20 Spread - -#### Registration statement - -```sql -create function spread as 'org.apache.iotdb.library.dprofile.UDAFSpread' -``` - -#### Usage - -This function is used to calculate the spread of time series, that is, the maximum value minus the minimum value. - -**Name:** SPREAD - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is the same as the input. There is only one data point in the series, whose timestamp is 0 and value is the spread. - -**Note:** Missing points, null points and `NaN` in the input series will be ignored. 
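
The definition above, together with the rule that invalid points are ignored, amounts to very little code. The Python sketch below is only an illustration; `spread` is a hypothetical helper that models missing or null points as `None`.

```python
import math


def spread(values):
    """Return max - min over the valid values, skipping None and NaN."""
    valid = [v for v in values if v is not None and not math.isnan(v)]
    return max(valid) - min(valid)
```

For a series whose valid values range from 100.0 to 126.0 and that ends with a `NaN`, the sketch returns 26.0.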
 - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select spread(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time|spread(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 26.0| -+-----------------------------+-----------------------+ -``` - - - -### 3.21 ZScore - -#### Registration statement - -```sql -create function zscore as 'org.apache.iotdb.library.dprofile.UDTFZScore' -``` - -#### Usage - -This function is used to standardize the input series with z-score. - -**Name:** ZSCORE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `compute`: When set to "batch", standardization is computed after importing all data points; when set to "stream", the mean and standard deviation must be provided. The default is "batch". -+ `avg`: Mean value when `compute` is set to "stream". -+ `sd`: Standard deviation when `compute` is set to "stream". - -**Output Series:** Output a single series. The type is DOUBLE. 
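
The two modes can be contrasted with a short Python sketch. It assumes the batch mode divides by the population standard deviation (divisor n); `zscore_batch` and `zscore_stream` are hypothetical helper names, and the code is an illustration rather than the library's implementation.

```python
import math


def zscore_batch(values):
    """Standardize using the mean and population standard deviation of the data."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def zscore_stream(values, avg, sd):
    """Standardize using a mean and standard deviation supplied by the caller."""
    return [(v - avg) / sd for v in values]
```

The stream variant needs no pass over the data to estimate statistics, which is why the mean and standard deviation have to be supplied up front.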
- -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select zscore(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|zscore(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.200+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.300+08:00| 0.20672455764868078| -|1970-01-01T08:00:00.400+08:00| -0.6201736729460423| -|1970-01-01T08:00:00.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.700+08:00| -1.033622788243404| -|1970-01-01T08:00:00.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:00.900+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.000+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.100+08:00| 0.20672455764868078| -|1970-01-01T08:00:01.200+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.300+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.400+08:00| 0.20672455764868078| 
-|1970-01-01T08:00:01.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.700+08:00| 3.9277665953249348| -|1970-01-01T08:00:01.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:01.900+08:00| -1.033622788243404| -|1970-01-01T08:00:02.000+08:00|-0.20672455764868078| -+-----------------------------+--------------------+ -``` - - -## 4. Anomaly Detection - -### 4.1 IQR - -#### Registration statement - -```sql -create function iqr as 'org.apache.iotdb.library.anomaly.UDTFIQR' -``` - -#### Usage - -This function is used to detect anomalies based on IQR. Points distributing beyond 1.5 times IQR are selected. - -**Name:** IQR - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `method`: When set to "batch", anomaly test is conducted after importing all data points; when set to "stream", it is required to provide upper and lower quantiles. The default method is "batch". -+ `q1`: The lower quantile when method is set to "stream". -+ `q3`: The upper quantile when method is set to "stream". - -**Output Series:** Output a single series. The type is DOUBLE. 
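
The 1.5 × IQR rule can be sketched in Python as follows. The quartile estimator (`statistics.quantiles` with `method="inclusive"`) is an assumption made for illustration; the UDF may estimate quartiles differently, which can shift the fences slightly. The function name `iqr_anomalies` is hypothetical, while the optional `q1` and `q3` arguments mirror the "stream" parameters.

```python
import statistics


def iqr_anomalies(values, q1=None, q3=None):
    """Return the values lying beyond 1.5 * IQR from the quartiles."""
    if q1 is None or q3 is None:
        # Batch mode: estimate the quartiles from the data itself.
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Values taken from the IQR batch computing example.
s1 = [0.0, 0.0, 1.0, -1.0, 0.0, 0.0, -2.0, 2.0, 0.0, 0.0,
      1.0, -1.0, -1.0, 1.0, 0.0, 0.0, 10.0, 2.0, -2.0, 0.0]
print(iqr_anomalies(s1))
```

With these quartile estimates, only the point with value 10.0 falls outside the fences.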
- -**Note:** $IQR=Q_3-Q_1$ - -#### Examples - -##### Batch computing - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select iqr(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+-----------------+ -| Time|iqr(root.test.s1)| -+-----------------------------+-----------------+ -|1970-01-01T08:00:01.700+08:00| 10.0| -+-----------------------------+-----------------+ -``` - -### 4.2 KSigma - -#### Registration statement - -```sql -create function ksigma as 'org.apache.iotdb.library.anomaly.UDTFKSigma' -``` - -#### Usage - -This function is used to detect anomalies based on the Dynamic K-Sigma Algorithm. -Within a sliding window, the input value with a deviation of more than k times the standard deviation from the average will be output as anomaly. - -**Name:** KSIGMA - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `k`: How many times to multiply on standard deviation to define anomaly, the default value is 3. 
-+ `window`: The window size of the Dynamic K-Sigma Algorithm. The default value is 10000. - -**Output Series:** Output a single series. The type is the same as the input series. - -**Note:** Anomaly detection is performed only when `k` is larger than 0. Otherwise, nothing is output. - -#### Examples - -##### Assigning k - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:04.000+08:00| 100.0| -|2020-01-01T00:00:06.000+08:00| 150.0| -|2020-01-01T00:00:08.000+08:00| 200.0| -|2020-01-01T00:00:10.000+08:00| 200.0| -|2020-01-01T00:00:14.000+08:00| 200.0| -|2020-01-01T00:00:15.000+08:00| 200.0| -|2020-01-01T00:00:16.000+08:00| 200.0| -|2020-01-01T00:00:18.000+08:00| 200.0| -|2020-01-01T00:00:20.000+08:00| 150.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ksigma(s1,"k"="1.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -|Time |ksigma(root.test.d1.s1,"k"="1.0")| -+-----------------------------+---------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -+-----------------------------+---------------------------------+ -``` - -### 4.3 LOF - -#### Registration statement - -```sql -create function LOF as 'org.apache.iotdb.library.anomaly.UDTFLOF' -``` - -#### Usage - -This function is used to detect density anomalies in a time series. 
According to the k-th distance calculation parameter and the local outlier factor (LOF) threshold, the function judges whether a set of input values is a density anomaly, and a bool mark of anomaly values will be output. - -**Name:** LOF - -**Input Series:** Multiple input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `method`: The detection method. The default value is "default", used when the input data has multiple dimensions. The alternative is "series", in which a single input series is transformed into a higher-dimensional representation. -+ `k`: Use the k-th distance to calculate the LOF. The default value is 3. -+ `window`: The size of the window used to split the original data points. The default value is 10000. -+ `windowsize`: The dimension that the series is transformed into when `method` is "series". The default value is 5. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note:** Incomplete rows will be ignored. They are neither calculated nor marked as anomalies. - -#### Examples - -##### Using default parameters - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| 1.0| -|1970-01-01T08:00:00.300+08:00| 1.0| 1.0| -|1970-01-01T08:00:00.400+08:00| 1.0| 0.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -1.0| -|1970-01-01T08:00:00.600+08:00| -1.0| -1.0| -|1970-01-01T08:00:00.700+08:00| -1.0| 0.0| -|1970-01-01T08:00:00.800+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1,s2) from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|lof(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.100+08:00| 3.8274824267668244| 
-|1970-01-01T08:00:00.200+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.300+08:00| 2.838155437762879| -|1970-01-01T08:00:00.400+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.500+08:00| 2.73518261244453| -|1970-01-01T08:00:00.600+08:00| 2.371440975708148| -|1970-01-01T08:00:00.700+08:00| 2.73518261244453| -|1970-01-01T08:00:00.800+08:00| 1.7561416374270742| -+-----------------------------+-------------------------------------+ -``` - -##### Diagnosing 1d timeseries - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 1.0| -|1970-01-01T08:00:00.200+08:00| 2.0| -|1970-01-01T08:00:00.300+08:00| 3.0| -|1970-01-01T08:00:00.400+08:00| 4.0| -|1970-01-01T08:00:00.500+08:00| 5.0| -|1970-01-01T08:00:00.600+08:00| 6.0| -|1970-01-01T08:00:00.700+08:00| 7.0| -|1970-01-01T08:00:00.800+08:00| 8.0| -|1970-01-01T08:00:00.900+08:00| 9.0| -|1970-01-01T08:00:01.000+08:00| 10.0| -|1970-01-01T08:00:01.100+08:00| 11.0| -|1970-01-01T08:00:01.200+08:00| 12.0| -|1970-01-01T08:00:01.300+08:00| 13.0| -|1970-01-01T08:00:01.400+08:00| 14.0| -|1970-01-01T08:00:01.500+08:00| 15.0| -|1970-01-01T08:00:01.600+08:00| 16.0| -|1970-01-01T08:00:01.700+08:00| 17.0| -|1970-01-01T08:00:01.800+08:00| 18.0| -|1970-01-01T08:00:01.900+08:00| 19.0| -|1970-01-01T08:00:02.000+08:00| 20.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1, "method"="series") from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|lof(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 3.77777777777778| -|1970-01-01T08:00:00.200+08:00| 4.32727272727273| -|1970-01-01T08:00:00.300+08:00| 4.85714285714286| -|1970-01-01T08:00:00.400+08:00| 5.40909090909091| -|1970-01-01T08:00:00.500+08:00| 5.94999999999999| -|1970-01-01T08:00:00.600+08:00| 
6.43243243243243| -|1970-01-01T08:00:00.700+08:00| 6.79999999999999| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 7.0| -|1970-01-01T08:00:01.000+08:00| 6.79999999999999| -|1970-01-01T08:00:01.100+08:00| 6.43243243243243| -|1970-01-01T08:00:01.200+08:00| 5.94999999999999| -|1970-01-01T08:00:01.300+08:00| 5.40909090909091| -|1970-01-01T08:00:01.400+08:00| 4.85714285714286| -|1970-01-01T08:00:01.500+08:00| 4.32727272727273| -|1970-01-01T08:00:01.600+08:00| 3.77777777777778| -+-----------------------------+--------------------+ -``` - -### 4.4 MissDetect - -#### Registration statement - -```sql -create function missdetect as 'org.apache.iotdb.library.anomaly.UDTFMissDetect' -``` - -#### Usage - -This function is used to detect missing anomalies. -In some datasets, missing values are filled by linear interpolation. -Thus, there are several long perfect linear segments. -By discovering these perfect linear segments, -missing anomalies are detected. - -**Name:** MISSDETECT - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameter:** - -`minlen`: The minimum length of the detected missing anomalies, which is an integer greater than or equal to 10. By default, it is 10. - -**Output Series:** Output a single series. The type is BOOLEAN. Each data point that belongs to a missing anomaly is labeled as true. 
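
The idea of finding long, perfectly linear segments can be sketched as below. This is a simplified stand-in, not the UDF's algorithm: it assumes evenly spaced points and flags every maximal run of at least `minlen` points whose consecutive differences are exactly equal. The function name `miss_detect` is hypothetical.

```python
def miss_detect(values, minlen=10):
    """Flag points that lie on a perfectly linear run of >= minlen points."""
    n = len(values)
    flags = [False] * n
    start = 0  # index of the first point of the current linear run
    for i in range(2, n + 1):
        # The run ends when the data ends or the slope changes.
        if i == n or (values[i] - values[i - 1]
                      != values[start + 1] - values[start]):
            if i - start >= minlen:
                for j in range(start, i):
                    flags[j] = True
            # The last point of the old run starts the next one.
            start = i - 1
    return flags


# Values taken from the MissDetect example (one point per second).
s2 = [0.0, 1.0, 0.0, 1.0] + [0.0] * 12 + [1.0, 0.0, 1.0, 0.0, 1.0]
print(miss_detect(s2, minlen=10))
```

On the example data, the 12 consecutive zeros form the only run that reaches the minimum length, so only those points are flagged.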
- -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 0.0| -|2021-07-01T12:00:01.000+08:00| 1.0| -|2021-07-01T12:00:02.000+08:00| 0.0| -|2021-07-01T12:00:03.000+08:00| 1.0| -|2021-07-01T12:00:04.000+08:00| 0.0| -|2021-07-01T12:00:05.000+08:00| 0.0| -|2021-07-01T12:00:06.000+08:00| 0.0| -|2021-07-01T12:00:07.000+08:00| 0.0| -|2021-07-01T12:00:08.000+08:00| 0.0| -|2021-07-01T12:00:09.000+08:00| 0.0| -|2021-07-01T12:00:10.000+08:00| 0.0| -|2021-07-01T12:00:11.000+08:00| 0.0| -|2021-07-01T12:00:12.000+08:00| 0.0| -|2021-07-01T12:00:13.000+08:00| 0.0| -|2021-07-01T12:00:14.000+08:00| 0.0| -|2021-07-01T12:00:15.000+08:00| 0.0| -|2021-07-01T12:00:16.000+08:00| 1.0| -|2021-07-01T12:00:17.000+08:00| 0.0| -|2021-07-01T12:00:18.000+08:00| 1.0| -|2021-07-01T12:00:19.000+08:00| 0.0| -|2021-07-01T12:00:20.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select missdetect(s2,'minlen'='10') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------+ -| Time|missdetect(root.test.d2.s2, "minlen"="10")| -+-----------------------------+------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| false| -|2021-07-01T12:00:01.000+08:00| false| -|2021-07-01T12:00:02.000+08:00| false| -|2021-07-01T12:00:03.000+08:00| false| -|2021-07-01T12:00:04.000+08:00| true| -|2021-07-01T12:00:05.000+08:00| true| -|2021-07-01T12:00:06.000+08:00| true| -|2021-07-01T12:00:07.000+08:00| true| -|2021-07-01T12:00:08.000+08:00| true| -|2021-07-01T12:00:09.000+08:00| true| -|2021-07-01T12:00:10.000+08:00| true| -|2021-07-01T12:00:11.000+08:00| true| -|2021-07-01T12:00:12.000+08:00| true| -|2021-07-01T12:00:13.000+08:00| true| -|2021-07-01T12:00:14.000+08:00| true| -|2021-07-01T12:00:15.000+08:00| true| -|2021-07-01T12:00:16.000+08:00| 
false| -|2021-07-01T12:00:17.000+08:00| false| -|2021-07-01T12:00:18.000+08:00| false| -|2021-07-01T12:00:19.000+08:00| false| -|2021-07-01T12:00:20.000+08:00| false| -+-----------------------------+------------------------------------------+ -``` - -### 4.5 Range - -#### Registration statement - -```sql -create function range as 'org.apache.iotdb.library.anomaly.UDTFRange' -``` - -#### Usage - -This function is used to detect range anomalies in a time series. Given the upper bound and lower bound parameters, the function judges whether each input value is out of range (a range anomaly) and outputs a new time series containing the anomalies. - -**Name:** RANGE - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `lower_bound`: The lower bound of range anomaly detection. -+ `upper_bound`: The upper bound of range anomaly detection. - -**Output Series:** Output a single series. The type is the same as the input. - -**Note:** Anomaly detection is performed only when `upper_bound` is larger than `lower_bound`. Otherwise, nothing is output. 
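
A minimal Python sketch of the range check is shown below. It assumes strict comparisons (a value equal to a bound is not an anomaly), which is consistent with the example in this section; the function name `range_anomalies` is hypothetical.

```python
def range_anomalies(values, lower_bound, upper_bound):
    """Return the values that fall outside [lower_bound, upper_bound]."""
    if upper_bound <= lower_bound:
        # Mirrors the note above: nothing is output for an invalid range.
        return []
    # NaN compares false against both bounds, so it is never reported.
    return [v for v in values if v < lower_bound or v > upper_bound]


# Values taken from the Range example in this section.
s1 = [100.0, 101.0, 102.0, 104.0, 126.0, 108.0, 112.0, 113.0, 114.0,
      116.0, 118.0, 120.0, 124.0, 126.0, float("nan")]
print(range_anomalies(s1, 101.0, 125.0))
```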
- - - -#### Examples - -##### Assigning Lower and Upper Bound - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select range(s1,"lower_bound"="101.0","upper_bound"="125.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------------------+ -|Time |range(root.test.d1.s1,"lower_bound"="101.0","upper_bound"="125.0")| -+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -+-----------------------------+------------------------------------------------------------------+ -``` - -### 4.6 TwoSidedFilter - -#### Registration statement - -```sql -create function twosidedfilter as 'org.apache.iotdb.library.anomaly.UDTFTwoSidedFilter' -``` - -#### Usage - -The function is used to filter anomalies of a numeric time series based on two-sided window detection. - -**Name:** TWOSIDEDFILTER - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE - -**Output Series:** Output a single series. 
The type is the same as the input. It is the input without anomalies. - -**Parameter:** - -- `len`: The size of the window, which is a positive integer. By default, it's 5. When `len`=3, the algorithm detects forward window and backward window with length 3 and calculates the outlierness of the current point. - -- `threshold`: The threshold of outlierness, which is a floating number in (0,1). By default, it's 0.3. The strict standard of detecting anomalies is in proportion to the threshold. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:00:37.000+08:00| 1484.0| -|1970-01-01T08:00:38.000+08:00| 1055.0| -|1970-01-01T08:00:39.000+08:00| 1050.0| -|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select TwoSidedFilter(s0, 'len'='5', 'threshold'='0.3') from root.test -``` - -Output series: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| 
-+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -### 4.7 Outlier - -#### Registration statement - -```sql -create function outlier as 'org.apache.iotdb.library.anomaly.UDTFOutlier' -``` - -#### Usage - -This function is used to detect distance-based outliers. For each point in the current window, if the number of its neighbors within the neighbor distance threshold is less than the neighbor count threshold, the point is detected as an outlier. - -**Name:** OUTLIER - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -+ `r`: The neighbor distance threshold. -+ `k`: The neighbor count threshold. -+ `w`: The window size. -+ `s`: The slide size. - -**Output Series:** Output a single series. The type is the same as the input. 
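
The distance-based rule can be illustrated with a brute-force Python sketch. It assumes count-based windows of `w` points that advance by `s` points, a neighbor test of `|x_i - x_j| <= r`, and that a point does not count as its own neighbor. These details are assumptions for illustration, and the function name `distance_outliers` is hypothetical; the UDF's own windowing and efficiency optimizations may differ.

```python
def distance_outliers(values, r, k, w, s):
    """Return indices of points with fewer than k neighbors within r."""
    outliers = set()
    last_start = max(len(values) - w, 0)
    for start in range(0, last_start + 1, s):
        window = range(start, min(start + w, len(values)))
        for i in window:
            neighbors = sum(
                1 for j in window
                if j != i and abs(values[i] - values[j]) <= r
            )
            if neighbors < k:
                outliers.add(i)
    return sorted(outliers)


# Values taken from the Outlier example in this section.
s1 = [56.0, 55.1, 54.2, 56.3, 59.0, 60.0, 60.5, 64.5, 69.0, 64.2,
      62.3, 58.0, 58.9, 52.0, 62.3, 61.0, 64.2, 61.8, 64.0, 63.0]
idx = distance_outliers(s1, r=5.0, k=4, w=10, s=5)
print([s1[i] for i in idx])
```

A point that appears in several overlapping windows is reported once if it is an outlier in any of them.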
- -#### Examples - -##### Assigning Parameters of Queries - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|2020-01-04T23:59:55.000+08:00| 56.0| -|2020-01-04T23:59:56.000+08:00| 55.1| -|2020-01-04T23:59:57.000+08:00| 54.2| -|2020-01-04T23:59:58.000+08:00| 56.3| -|2020-01-04T23:59:59.000+08:00| 59.0| -|2020-01-05T00:00:00.000+08:00| 60.0| -|2020-01-05T00:00:01.000+08:00| 60.5| -|2020-01-05T00:00:02.000+08:00| 64.5| -|2020-01-05T00:00:03.000+08:00| 69.0| -|2020-01-05T00:00:04.000+08:00| 64.2| -|2020-01-05T00:00:05.000+08:00| 62.3| -|2020-01-05T00:00:06.000+08:00| 58.0| -|2020-01-05T00:00:07.000+08:00| 58.9| -|2020-01-05T00:00:08.000+08:00| 52.0| -|2020-01-05T00:00:09.000+08:00| 62.3| -|2020-01-05T00:00:10.000+08:00| 61.0| -|2020-01-05T00:00:11.000+08:00| 64.2| -|2020-01-05T00:00:12.000+08:00| 61.8| -|2020-01-05T00:00:13.000+08:00| 64.0| -|2020-01-05T00:00:14.000+08:00| 63.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|outlier(root.test.s1,"r"="5.0","k"="4","w"="10","s"="5")| -+-----------------------------+--------------------------------------------------------+ -|2020-01-05T00:00:03.000+08:00| 69.0| -|2020-01-05T00:00:08.000+08:00| 52.0| -+-----------------------------+--------------------------------------------------------+ -``` - -## 5. Frequency Domain Analysis - -### 5.1 Conv - -#### Registration statement - -```sql -create function conv as 'org.apache.iotdb.library.frequency.UDTFConv' -``` - -#### Usage - -This function is used to calculate the convolution, i.e. polynomial multiplication. - -**Name:** CONV - -**Input:** Only support two input series. 
The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output:** Output a single series. The type is DOUBLE. It is the result of convolution whose timestamps starting from 0 only indicate the order. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| 7.0| -|1970-01-01T08:00:00.001+08:00| 0.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select conv(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------+ -| Time|conv(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 7.0| -|1970-01-01T08:00:00.001+08:00| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| -|1970-01-01T08:00:00.003+08:00| 2.0| -+-----------------------------+--------------------------------------+ -``` - -### 5.2 Deconv - -#### Registration statement - -```sql -create function deconv as 'org.apache.iotdb.library.frequency.UDTFDeconv' -``` - -#### Usage - -This function is used to calculate the deconvolution, i.e. polynomial division. - -**Name:** DECONV - -**Input:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `result`: The result of deconvolution, which is 'quotient' or 'remainder'. By default, the quotient will be output. - -**Output:** Output a single series. The type is DOUBLE. It is the result of deconvolving the second series from the first series (dividing the first series by the second series) whose timestamps starting from 0 only indicate the order. - -**Note:** `NaN` in the input series will be ignored. 
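
Convolution and deconvolution as polynomial multiplication and division can be sketched in plain Python. The sketch assumes that each series lists polynomial coefficients from the lowest order to the highest, which is consistent with the CONV and DECONV examples in this chapter; the function names `conv` and `deconv` are illustrative, not the UDFs' implementation.

```python
def conv(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def deconv(dividend, divisor):
    """Polynomial long division; returns (quotient, remainder)."""
    n, m = len(dividend), len(divisor)
    if m == 0 or divisor[-1] == 0 or n < m:
        raise ValueError("divisor must be non-empty, have a non-zero "
                         "leading coefficient, and not exceed the dividend")
    rem = [float(v) for v in dividend]
    quot = [0.0] * (n - m + 1)
    # Eliminate the highest remaining term first, as in long division.
    for i in range(n - m, -1, -1):
        q = rem[i + m - 1] / divisor[-1]
        quot[i] = q
        for j in range(m):
            rem[i + j] -= q * divisor[j]
    return quot, rem


print(conv([1.0, 0.0, 1.0], [7.0, 2.0]))
print(deconv([8.0, 2.0, 7.0, 2.0], [7.0, 2.0]))
```

The remainder is returned at the same length as the dividend, padded with zeros, matching the layout of the 'remainder' output shown in the DECONV example.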
- -#### Examples - - -##### Calculate the quotient - -When `result` is 'quotient' or the default, this function calculates the quotient of the deconvolution. - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s3|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 8.0| 7.0| -|1970-01-01T08:00:00.001+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| null| -|1970-01-01T08:00:00.003+08:00| 2.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select deconv(s3,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2)| -+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - -##### Calculate the remainder - -When `result` is 'remainder', this function calculates the remainder of the deconvolution. 
- -Input series is the same as above, the SQL for query is shown below: - - -```sql -select deconv(s3,s2,'result'='remainder') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2, "result"="remainder")| -+-----------------------------+--------------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 0.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -+-----------------------------+--------------------------------------------------------------+ -``` - -### 5.3 DWT - -#### Registration statement - -```sql -create function dwt as 'org.apache.iotdb.library.frequency.UDTFDWT' -``` - -#### Usage - -This function is used to calculate 1d discrete wavelet transform of a numerical series. - -**Name:** DWT - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of wavelet. May select 'Haar', 'DB4', 'DB6', 'DB8', where DB means Daubechies. User may offer coefficients of wavelet transform and ignore this parameter. Case ignored. -+ `coef`: Coefficients of wavelet transform. When providing this parameter, use comma ',' to split them, and leave no spaces or other punctuations. -+ `layer`: Times to transform. The number of output vectors equals $layer+1$. Default is 1. - -**Output:** Output a single series. The type is DOUBLE. The length is the same as the input. - -**Note:** The length of input series must be an integer number power of 2. 
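
For intuition, a single-level Haar transform can be sketched as follows. The output layout (all approximation coefficients first, then all detail coefficients) is inferred from the Haar example in this section and is an assumption; the function name `haar_dwt` is hypothetical, and other wavelets or multiple layers are not covered.

```python
import math


def haar_dwt(values):
    """One-level Haar DWT: approximations followed by details."""
    n = len(values)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of 2 (at least 2)")
    root2 = math.sqrt(2.0)
    pairs = [(values[i], values[i + 1]) for i in range(0, n, 2)]
    approx = [(a + b) / root2 for a, b in pairs]
    detail = [(a - b) / root2 for a, b in pairs]
    return approx + detail


# Values taken from the Haar wavelet transform example.
s1 = [0.0, 0.2, 1.5, 1.2, 0.6, 1.7, 0.8, 2.0,
      2.5, 2.1, 0.0, 2.0, 1.8, 1.2, 1.0, 1.6]
for c in haar_dwt(s1):
    print(c)
```

The printed coefficients agree with the documented output up to small floating-point differences.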
- -#### Examples - - -##### Haar wavelet transform - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.100+08:00| 0.2| -|1970-01-01T08:00:00.200+08:00| 1.5| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.600+08:00| 0.8| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.800+08:00| 2.5| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select dwt(s1,"method"="haar") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|dwt(root.test.d1.s1, "method"="haar")| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.14142135834465192| -|1970-01-01T08:00:00.100+08:00| 1.909188342921157| -|1970-01-01T08:00:00.200+08:00| 1.6263456473052773| -|1970-01-01T08:00:00.300+08:00| 1.9798989957517026| -|1970-01-01T08:00:00.400+08:00| 3.252691126023161| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776479437628| -|1970-01-01T08:00:00.800+08:00| -0.14142135834465192| -|1970-01-01T08:00:00.900+08:00| 0.21213200063848547| -|1970-01-01T08:00:01.000+08:00| -0.7778174761639416| -|1970-01-01T08:00:01.100+08:00| -0.8485281289944873| -|1970-01-01T08:00:01.200+08:00| 0.2828427799095765| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426400127697095| -|1970-01-01T08:00:01.500+08:00| 
-0.42426408557066786| -+-----------------------------+-------------------------------------+ -``` - - -### 5.4 IDWT - -#### Registration statement - -```sql -create function idwt as 'org.apache.iotdb.library.frequency.UDTFIDWT' -``` - -#### Usage - -This function performs one-dimensional inverse discrete wavelet transform on the input series, reconstructing the original data from DWT decomposed wavelet coefficients. - -**Name:** IDWT - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of wavelet. May select 'Haar', 'DB4', 'DB6', 'DB8', where DB means Daubechies. User may offer coefficients of wavelet transform and ignore this parameter. Case ignored. -+ `coef`: Coefficients of wavelet transform. When providing this parameter, use comma ',' to split them, and leave no spaces or other punctuations. -+ `layer`: Times to transform. The number of output vectors equals $layer+1$. Default is 1. - -**Output:** Output a single series. The type is DOUBLE. The length is the same as the input. - -**Note:** -* The length of input series must be an integer number power of 2. -* The parameter settings of the IDWT function (method/coef/layer) should be consistent with the corresponding DWT transformation to correctly reconstruct the original data. -* Typically, the input of IDWT is the output result of the DWT function. 
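
The reconstruction step can be sketched for the single-level Haar case. The sketch assumes the input holds all approximation coefficients followed by all detail coefficients, as in the Haar examples of this chapter; the function name `haar_idwt` is hypothetical and other wavelets or layer counts are out of scope.

```python
import math


def haar_idwt(coeffs):
    """Invert a one-level Haar DWT (approximations, then details)."""
    n = len(coeffs)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of 2 (at least 2)")
    half = n // 2
    root2 = math.sqrt(2.0)
    out = []
    for a, d in zip(coeffs[:half], coeffs[half:]):
        out.append((a + d) / root2)  # even-indexed sample
        out.append((a - d) / root2)  # odd-indexed sample
    return out


# Coefficients taken from the IDWT Haar example.
s2 = [0.1414213562373095, 1.909188309203678, 1.6263455967290592,
      1.979898987322333, 3.2526911934581184, 1.414213562373095,
      2.1213203435596424, 1.8384776310850235, -0.1414213562373095,
      0.21213203435596428, -0.7778174593052022, -0.8485281374238569,
      0.2828427124746189, -1.414213562373095, 0.42426406871192857,
      -0.42426406871192857]
for v in haar_idwt(s2):
    print(v)
```

Up to rounding, this recovers the original series 0.0, 0.2, 1.5, 1.2, and so on, which is why the IDWT parameters must match those used for the forward transform.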
- -#### Examples - -##### Haar wavelet transform - -Input series: - -``` -+-----------------------------+--------------------+ -| Time| root.test.d1.s2| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 0.1414213562373095| -|1970-01-01T08:00:00.100+08:00| 1.909188309203678| -|1970-01-01T08:00:00.200+08:00| 1.6263455967290592| -|1970-01-01T08:00:00.300+08:00| 1.979898987322333| -|1970-01-01T08:00:00.400+08:00| 3.2526911934581184| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776310850235| -|1970-01-01T08:00:00.800+08:00| -0.1414213562373095| -|1970-01-01T08:00:00.900+08:00| 0.21213203435596428| -|1970-01-01T08:00:01.000+08:00| -0.7778174593052022| -|1970-01-01T08:00:01.100+08:00| -0.8485281374238569| -|1970-01-01T08:00:01.200+08:00| 0.2828427124746189| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426406871192857| -|1970-01-01T08:00:01.500+08:00|-0.42426406871192857| -+-----------------------------+--------------------+ -``` - -SQL for query: - -```sql -select idwt(s2,"method"="haar") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------+ -| Time|idwt(root.test.d1.s2, "method"="haar")| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.100+08:00| 0.19999999999999998| -|1970-01-01T08:00:00.200+08:00| 1.4999999999999996| -|1970-01-01T08:00:00.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.6999999999999997| -|1970-01-01T08:00:00.600+08:00| 0.7999999999999998| -|1970-01-01T08:00:00.700+08:00| 1.9999999999999996| -|1970-01-01T08:00:00.800+08:00| 2.4999999999999996| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.9999999999999996| 
-|1970-01-01T08:00:01.200+08:00| 1.7999999999999998| -|1970-01-01T08:00:01.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:01.400+08:00| 0.9999999999999998| -|1970-01-01T08:00:01.500+08:00| 1.5999999999999999| -+-----------------------------+--------------------------------------+ -``` - - - -### 5.5 FFT - -#### Registration statement - -```sql -create function fft as 'org.apache.iotdb.library.frequency.UDTFFFT' -``` - -#### Usage - -This function is used to calculate the fast Fourier transform (FFT) of a numerical series. - -**Name:** FFT - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of FFT, which is 'uniform' (by default) or 'nonuniform'. If the value is 'uniform', the timestamps will be ignored and all data points will be regarded as equidistant. Thus, the equidistant fast Fourier transform algorithm will be applied. If the value is 'nonuniform' (TODO), the non-equidistant fast Fourier transform algorithm will be applied based on timestamps. -+ `result`: The result of FFT, which is 'real', 'imag', 'abs' or 'angle', corresponding to the real part, imaginary part, magnitude and phase angle. By default, the magnitude will be output. -+ `compress`: The parameter of compression, which is within (0,1]. It is the reserved energy ratio of lossy compression. By default, there is no compression. - - -**Output:** Output a single series. The type is DOUBLE. The length is the same as the input. The timestamps starting from 0 only indicate the order. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - - -##### Uniform FFT - -With the default `method`, uniform FFT is applied. 
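To make the default magnitude output easier to interpret, here is a minimal Python sketch that computes the same quantity with a naive O(N²) discrete Fourier transform, which yields the same values as an FFT. It is an illustration only, not the IoTDB implementation. The helper name `dft_magnitudes` is hypothetical, and the test signal is an assumption: a sum of two sinusoids with periods 4 and 5 over 20 equidistant points.

```python
import cmath
import math


def dft_magnitudes(values):
    """Return |X[k]| for k = 0..N-1, treating the samples as equidistant."""
    n = len(values)
    mags = []
    for k in range(n):
        # X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/N)
        acc = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                  for t, v in enumerate(values))
        mags.append(abs(acc))
    return mags


# Assumed test signal: y = sin(2*pi*t/4) + 2*sin(2*pi*t/5), t = 0..19.
signal = [math.sin(2 * math.pi * t / 4) + 2 * math.sin(2 * math.pi * t / 5)
          for t in range(20)]
mags = dft_magnitudes(signal)
print(round(mags[4], 6), round(mags[5], 6))  # 20.0 10.0
```

A sinusoid of amplitude A that completes k full cycles in N samples contributes a magnitude of A·N/2 at index k and at its mirror index N-k, which is why the peaks appear at k=4 and k=5 and again at k=16 and k=15.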
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select fft(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------+ -| Time| fft(root.test.d1.s1)| -+-----------------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 1.2727111142703152E-8| -|1970-01-01T08:00:00.002+08:00| 2.385520799101839E-7| -|1970-01-01T08:00:00.003+08:00| 8.723291723972645E-8| -|1970-01-01T08:00:00.004+08:00| 19.999999960195904| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| -|1970-01-01T08:00:00.006+08:00| 3.2260694930700566E-7| -|1970-01-01T08:00:00.007+08:00| 8.723291605373329E-8| -|1970-01-01T08:00:00.008+08:00| 1.108657103979944E-7| -|1970-01-01T08:00:00.009+08:00| 1.2727110997246171E-8| -|1970-01-01T08:00:00.010+08:00|1.9852334701272664E-23| -|1970-01-01T08:00:00.011+08:00| 1.2727111194499847E-8| -|1970-01-01T08:00:00.012+08:00| 1.108657103979944E-7| 
-|1970-01-01T08:00:00.013+08:00| 8.723291785769131E-8| -|1970-01-01T08:00:00.014+08:00| 3.226069493070057E-7| -|1970-01-01T08:00:00.015+08:00| 9.999999850988388| -|1970-01-01T08:00:00.016+08:00| 19.999999960195904| -|1970-01-01T08:00:00.017+08:00| 8.723291747109068E-8| -|1970-01-01T08:00:00.018+08:00| 2.3855207991018386E-7| -|1970-01-01T08:00:00.019+08:00| 1.2727112069910878E-8| -+-----------------------------+----------------------+ -``` - -Note: The input is $y=sin(2\pi t/4)+2sin(2\pi t/5)$ with a length of 20. Thus, there are peaks in $k=4$ and $k=5$ of the output. - -##### Uniform FFT with Compression - -Input series is the same as above, the SQL for query is shown below: - -```sql -select fft(s1, 'result'='real', 'compress'='0.99'), fft(s1, 'result'='imag','compress'='0.99') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| fft(root.test.d1.s1,| fft(root.test.d1.s1,| -| | "result"="real",| "result"="imag",| -| | "compress"="0.99")| "compress"="0.99")| -+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - -Note: Based on the conjugation of the Fourier transform result, only the first half of the compression result is reserved. -According to the given parameter, data points are reserved from low frequency to high frequency until the reserved energy ratio exceeds it. 
-The last data point is reserved to indicate the length of the series. - -### 5.6 HighPass - -#### Registration statement - -```sql -create function highpass as 'org.apache.iotdb.library.frequency.UDTFHighPass' -``` - -#### Usage - -This function performs high-pass filtering on the input series and extracts components above the cutoff frequency. -The timestamps of input will be ignored and all data points will be regarded as equidistant. - -**Name:** HIGHPASS - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `wpass`: The normalized cutoff frequency, which must be within (0,1). This parameter is required. - -**Output:** Output a single series. The type is DOUBLE. It is the input after filtering. The length and timestamps of output are the same as the input. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql 
-select highpass(s1,'wpass'='0.45') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------+ -| Time|highpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9999999534830373| -|1970-01-01T08:00:01.000+08:00| 1.7462829277628608E-8| -|1970-01-01T08:00:02.000+08:00| -0.9999999593178128| -|1970-01-01T08:00:03.000+08:00| -4.1115269056426626E-8| -|1970-01-01T08:00:04.000+08:00| 0.9999999925494194| -|1970-01-01T08:00:05.000+08:00| 3.328126513330016E-8| -|1970-01-01T08:00:06.000+08:00| -1.0000000183304454| -|1970-01-01T08:00:07.000+08:00| 6.260191433311374E-10| -|1970-01-01T08:00:08.000+08:00| 1.0000000018134796| -|1970-01-01T08:00:09.000+08:00| -3.097210911744423E-17| -|1970-01-01T08:00:10.000+08:00| -1.0000000018134794| -|1970-01-01T08:00:11.000+08:00| -6.260191627862097E-10| -|1970-01-01T08:00:12.000+08:00| 1.0000000183304454| -|1970-01-01T08:00:13.000+08:00| -3.328126501424346E-8| -|1970-01-01T08:00:14.000+08:00| -0.9999999925494196| -|1970-01-01T08:00:15.000+08:00| 4.111526915498874E-8| -|1970-01-01T08:00:16.000+08:00| 0.9999999593178128| -|1970-01-01T08:00:17.000+08:00| -1.7462829341296528E-8| -|1970-01-01T08:00:18.000+08:00| -0.9999999534830369| -|1970-01-01T08:00:19.000+08:00| -1.035237222742873E-16| -+-----------------------------+-----------------------------------------+ -``` - -Note: The input is $y=sin(2\pi t/4)+2sin(2\pi t/5)$ with a length of 20. Thus, the output is $y=sin(2\pi t/4)$ after high-pass filtering. - -### 5.7 IFFT - -#### Registration statement - -```sql -create function ifft as 'org.apache.iotdb.library.frequency.UDTFIFFT' -``` - -#### Usage - -This function treats the two input series as the real and imaginary part of a complex series, performs an inverse fast Fourier transform (IFFT), and outputs the real part of the result. 
-For the input format, please refer to the output format of `FFT` function. -Moreover, the compressed output of `FFT` function is also supported. - -**Name:** IFFT - -**Input:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `start`: The start time of the output series with the format 'yyyy-MM-dd HH:mm:ss'. By default, it is '1970-01-01 08:00:00'. -+ `interval`: The interval of the output series, which is a positive number with an unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, it is 1s. - -**Output:** Output a single series. The type is DOUBLE. It is strictly equispaced. The values are the results of IFFT. - -**Note:** If a row contains null points or `NaN`, it will be ignored. - -#### Examples - - -Input series: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| root.test.d1.re| root.test.d1.im| -+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - - -SQL for query: - -```sql -select ifft(re, im, 'interval'='1m', 'start'='2021-01-01 00:00:00') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------------+ -| Time|ifft(root.test.d1.re, root.test.d1.im, "interval"="1m",| -| | "start"="2021-01-01 00:00:00")| 
-+-----------------------------+-------------------------------------------------------+ -|2021-01-01T00:00:00.000+08:00| 2.902112992431231| -|2021-01-01T00:01:00.000+08:00| 1.1755704705132448| -|2021-01-01T00:02:00.000+08:00| -2.175570513757101| -|2021-01-01T00:03:00.000+08:00| -1.9021130389094498| -|2021-01-01T00:04:00.000+08:00| 0.9999999925494194| -|2021-01-01T00:05:00.000+08:00| 1.902113046743454| -|2021-01-01T00:06:00.000+08:00| 0.17557053610884188| -|2021-01-01T00:07:00.000+08:00| -1.1755704886020932| -|2021-01-01T00:08:00.000+08:00| -0.9021130371347148| -|2021-01-01T00:09:00.000+08:00| 3.552713678800501E-16| -|2021-01-01T00:10:00.000+08:00| 0.9021130371347154| -|2021-01-01T00:11:00.000+08:00| 1.1755704886020932| -|2021-01-01T00:12:00.000+08:00| -0.17557053610884144| -|2021-01-01T00:13:00.000+08:00| -1.902113046743454| -|2021-01-01T00:14:00.000+08:00| -0.9999999925494196| -|2021-01-01T00:15:00.000+08:00| 1.9021130389094498| -|2021-01-01T00:16:00.000+08:00| 2.1755705137571004| -|2021-01-01T00:17:00.000+08:00| -1.1755704705132448| -|2021-01-01T00:18:00.000+08:00| -2.902112992431231| -|2021-01-01T00:19:00.000+08:00| -3.552713678800501E-16| -+-----------------------------+-------------------------------------------------------+ -``` - -### 5.8 LowPass - -#### Registration statement - -```sql -create function lowpass as 'org.apache.iotdb.library.frequency.UDTFLowPass' -``` - -#### Usage - -This function performs low-pass filtering on the input series and extracts components below the cutoff frequency. -The timestamps of input will be ignored and all data points will be regarded as equidistant. - -**Name:** LOWPASS - -**Input:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `wpass`: The normalized cutoff frequency, which must be within (0,1). This parameter is required. - -**Output:** Output a single series. The type is DOUBLE. It is the input after filtering. 
The length and timestamps of output are the same as the input. - -**Note:** `NaN` in the input series will be ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select lowpass(s1,'wpass'='0.45') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|lowpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.9021130073323922| -|1970-01-01T08:00:01.000+08:00| 1.1755704705132448| -|1970-01-01T08:00:02.000+08:00| -1.1755705286582614| -|1970-01-01T08:00:03.000+08:00| -1.9021130389094498| -|1970-01-01T08:00:04.000+08:00| 7.450580419288145E-9| -|1970-01-01T08:00:05.000+08:00| 1.902113046743454| -|1970-01-01T08:00:06.000+08:00| 1.1755705212076808| -|1970-01-01T08:00:07.000+08:00| -1.1755704886020932| -|1970-01-01T08:00:08.000+08:00| -1.9021130222335536| 
-|1970-01-01T08:00:09.000+08:00| 3.552713678800501E-16| -|1970-01-01T08:00:10.000+08:00| 1.9021130222335536| -|1970-01-01T08:00:11.000+08:00| 1.1755704886020932| -|1970-01-01T08:00:12.000+08:00| -1.1755705212076801| -|1970-01-01T08:00:13.000+08:00| -1.902113046743454| -|1970-01-01T08:00:14.000+08:00| -7.45058112983088E-9| -|1970-01-01T08:00:15.000+08:00| 1.9021130389094498| -|1970-01-01T08:00:16.000+08:00| 1.1755705286582616| -|1970-01-01T08:00:17.000+08:00| -1.1755704705132448| -|1970-01-01T08:00:18.000+08:00| -1.9021130073323924| -|1970-01-01T08:00:19.000+08:00| -2.664535259100376E-16| -+-----------------------------+----------------------------------------+ -``` - -Note: The input is $y=sin(2\pi t/4)+2sin(2\pi t/5)$ with a length of 20. Thus, the output is $y=2sin(2\pi t/5)$ after low-pass filtering. - - -### 5.9 Envelope - -#### Registration statement - -```sql -create function envelope as 'org.apache.iotdb.library.frequency.UDFEnvelopeAnalysis' -``` - -#### Usage - -This function achieves signal demodulation and envelope extraction by inputting a one-dimensional floating-point array and a user specified modulation frequency. The goal of demodulation is to extract the parts of interest from complex signals, making them easier to understand. For example, demodulation can be used to find the envelope of the signal, that is, the trend of amplitude changes. - -**Name:** Envelope - -**Input:** Only supports a single input sequence, with types INT32/INT64/FLOAT/DOUBLE - - -**Parameters:** - -+ `frequency`: Frequency (optional, positive number. If this parameter is not filled in, the system will infer the frequency based on the time interval corresponding to the sequence). -+ `amplification`: Amplification factor (optional, positive integer. The output of the Time column is a set of positive integers and does not output decimals. When the frequency is less than 1, this parameter can be used to amplify the frequency to display normal results). 
- -**Output:** -+ `Time`: The meaning of the value returned by this column is frequency rather than time. If the output format is time format (e.g. 1970-01-01T08:00:19.000+08:00), please convert it to a timestamp value. - - -+ `Envelope(Path, 'frequency'='{frequency}')`: Output a single sequence of type DOUBLE, which is the result of envelope analysis. - -**Note:** When the values of the demodulated original sequence are discontinuous, this function treats the sequence as continuous. It is recommended that the analyzed time series be a complete time series of values. It is also recommended to specify a start time and an end time. - -#### Examples - -Input series: - - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:01.000+08:00| 1.0 | -|1970-01-01T08:00:02.000+08:00| 2.0 | -|1970-01-01T08:00:03.000+08:00| 3.0 | -|1970-01-01T08:00:04.000+08:00| 4.0 | -|1970-01-01T08:00:05.000+08:00| 5.0 | -|1970-01-01T08:00:06.000+08:00| 6.0 | -|1970-01-01T08:00:07.000+08:00| 7.0 | -|1970-01-01T08:00:08.000+08:00| 8.0 | -|1970-01-01T08:00:09.000+08:00| 9.0 | -|1970-01-01T08:00:10.000+08:00| 10.0 | -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -set time_display_type=long; -select envelope(s1),envelope(s1,'frequency'='1000'),envelope(s1,'amplification'='10') from root.test.d1; -``` - -Output series: - - -``` -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -|Time|envelope(root.test.d1.s1)|envelope(root.test.d1.s1, "frequency"="1000")|envelope(root.test.d1.s1, "amplification"="10")| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -| 0| 6.284350808484124| 6.284350808484124| 6.284350808484124| -| 100| 1.5581923657404393| 1.5581923657404393| null| -| 200| 0.8503211038340728| 0.8503211038340728| 
null| -| 300| 0.512808785945551| 0.512808785945551| null| -| 400| 0.26361156774506744| 0.26361156774506744| null| -|1000| null| null| 1.5581923657404393| -|2000| null| null| 0.8503211038340728| -|3000| null| null| 0.512808785945551| -|4000| null| null| 0.26361156774506744| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ - -``` - - -## 6. Data Matching - -### 6.1 Cov - -#### Registration statement - -```sql -create function cov as 'org.apache.iotdb.library.dmatch.UDAFCov' -``` - -#### Usage - -This function is used to calculate the population covariance. - -**Name:** COV - -**Input Series:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the population covariance. - -**Note:** - -+ If a row contains missing points, null points or `NaN`, it will be ignored; -+ If all rows are ignored, `NaN` will be output. 
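The population covariance and the row-skipping rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the IoTDB implementation: the function name `population_cov` is hypothetical, `None` stands in for a null or missing point, and the data is transcribed from the example input in this section.

```python
import math


def population_cov(xs, ys):
    """Population covariance over rows where both values are usable."""
    # Skip rows that contain a null (None here) or NaN in either column.
    pairs = [(x, y) for x, y in zip(xs, ys)
             if x is not None and y is not None
             and not math.isnan(x) and not math.isnan(y)]
    if not pairs:
        return math.nan  # every row was ignored
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Divide by n (population), not n - 1 (sample).
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n


s1 = [100.0, 101.0, 102.0, 104.0, 126.0, 108.0, None, 112.0,
      113.0, 114.0, 116.0, 118.0, 100.0, 124.0, 126.0, math.nan]
s2 = [101.0, None, 101.0, 102.0, 102.0, 103.0, 103.0, 104.0,
      None, 104.0, 105.0, 105.0, 106.0, 108.0, 108.0, 108.0]
print(population_cov(s1, s2))  # approximately 12.2916667
```

Twelve of the sixteen rows survive the filter, and dividing by 12 rather than 11 is what makes this the population covariance rather than the sample covariance.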
- - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select cov(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|cov(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 12.291666666666666| -+-----------------------------+-------------------------------------+ -``` - -### 6.2 DTW - -#### Registration statement - -```sql -create function dtw as 'org.apache.iotdb.library.dmatch.UDAFDtw' -``` - -#### Usage - -This function is used to calculate the DTW distance between two input series. - -**Name:** DTW - -**Input Series:** Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the DTW distance. 
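For readers unfamiliar with dynamic time warping, the following Python sketch shows the classic dynamic-programming formulation. It is not the IoTDB implementation. The function name `dtw_distance` is hypothetical, and using the absolute difference as the per-point cost is an assumption that happens to be consistent with the example in this section.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute difference as the cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


print(dtw_distance([1.0] * 20, [2.0] * 20))  # 20.0
```

With two constant series of 20 points that differ by 1, the cheapest warping path is the diagonal of 20 steps, each costing 1, which gives 20.0. In this sketch two empty inputs yield 0.0.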
- -**Note:** - -+ If a row contains missing points, null points or `NaN`, it will be ignored; -+ If all rows are ignored, `0` will be output. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.003+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.004+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.005+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.006+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.007+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.008+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.009+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.010+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.011+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.012+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.013+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.014+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.015+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.016+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.017+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.018+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.019+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.020+08:00| 1.0| 2.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select dtw(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|dtw(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 20.0| -+-----------------------------+-------------------------------------+ -``` - -### 6.3 Pearson - -#### Registration statement - -```sql -create function pearson as 'org.apache.iotdb.library.dmatch.UDAFPearson' -``` - -#### Usage - -This function is used to calculate the Pearson Correlation Coefficient. - -**Name:** PEARSON - -**Input Series:** Only support two input series. 
The types are both INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the Pearson Correlation Coefficient. - -**Note:** - -+ If a row contains missing points, null points or `NaN`, it will be ignored; -+ If all rows are ignored, `NaN` will be output. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select pearson(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------+ -| Time|pearson(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.5630881927754872| -+-----------------------------+-----------------------------------------+ -``` - -### 6.4 PtnSym - -#### Registration statement - -```sql -create function ptnsym as 'org.apache.iotdb.library.dmatch.UDTFPtnSym' -``` - -#### Usage - -This 
function is used to find all symmetric subseries in the input whose degree of symmetry is less than the threshold. -The degree of symmetry is calculated by DTW. -The smaller the degree, the more symmetrical the series is. - -**Name:** PATTERNSYMMETRIC - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE - -**Parameter:** - -+ `window`: The length of the symmetric subseries. It's a positive integer and the default value is 10. -+ `threshold`: The threshold of the degree of symmetry. It's non-negative. Only the subseries whose degree of symmetry is below it will be output. By default, all subseries will be output. - - -**Output Series:** Output a single series. The type is DOUBLE. Each data point in the output series corresponds to a symmetric subseries. The output timestamp is the starting timestamp of the subseries and the output value is the degree of symmetry. - -#### Example - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s4| -+-----------------------------+---------------+ -|2021-01-01T12:00:00.000+08:00| 1.0| -|2021-01-01T12:00:01.000+08:00| 2.0| -|2021-01-01T12:00:02.000+08:00| 3.0| -|2021-01-01T12:00:03.000+08:00| 2.0| -|2021-01-01T12:00:04.000+08:00| 1.0| -|2021-01-01T12:00:05.000+08:00| 1.0| -|2021-01-01T12:00:06.000+08:00| 1.0| -|2021-01-01T12:00:07.000+08:00| 1.0| -|2021-01-01T12:00:08.000+08:00| 2.0| -|2021-01-01T12:00:09.000+08:00| 3.0| -|2021-01-01T12:00:10.000+08:00| 2.0| -|2021-01-01T12:00:11.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ptnsym(s4, 'window'='5', 'threshold'='0') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|ptnsym(root.test.d1.s4, "window"="5", "threshold"="0")| -+-----------------------------+------------------------------------------------------+ -|2021-01-01T12:00:00.000+08:00| 0.0| 
-|2021-01-01T12:00:07.000+08:00| 0.0| -+-----------------------------+------------------------------------------------------+ -``` - -### 6.5 XCorr - -#### Registration statement - -```sql -create function xcorr as 'org.apache.iotdb.library.dmatch.UDTFXCorr' -``` - -#### Usage - -This function is used to calculate the cross correlation function of given two time series. -For discrete time series, cross correlation is given by -$$CR(n) = \frac{1}{N} \sum_{m=1}^N S_1[m]S_2[m+n]$$ -which represent the similarities between two series with different index shifts. - -**Name:** XCORR - -**Input Series:** Only support two input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series with DOUBLE as datatype. -There are $2N-1$ data points in the series, the center of which represents the cross correlation -calculated with pre-aligned series(that is $CR(0)$ in the formula above), -and the previous(or post) values represent those with shifting the latter series forward(or backward otherwise) -until the two series are no longer overlapped(not included). -In short, the values of output series are given by(index starts from 1) -$$OS[i] = CR(-N+i) = \frac{1}{N} \sum_{m=1}^{i} S_1[m]S_2[N-i+m],\ if\ i <= N$$ -$$OS[i] = CR(i-N) = \frac{1}{N} \sum_{m=1}^{2N-i} S_1[i-N+m]S_2[m],\ if\ i > N$$ - -**Note:** - -+ `null` and `NaN` values in the input series will be ignored and treated as 0. 
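The two output formulas above can be transcribed directly into Python. The sketch below is an illustration rather than the IoTDB implementation: the function name `xcorr` mirrors the UDF name for readability only, `None` stands in for `null`, and the sample data is transcribed from the example in this section.

```python
import math


def xcorr(s1, s2):
    """Cross correlation of two equal-length series, 2N-1 output points."""
    def clean(v):
        # null (None here) and NaN are treated as 0, as the note states.
        return 0.0 if v is None or math.isnan(v) else float(v)

    a = [clean(v) for v in s1]
    b = [clean(v) for v in s2]
    n = len(a)
    out = []
    for i in range(1, 2 * n):  # i is the 1-based output index
        if i <= n:
            lag = n - i        # OS[i] = CR(-N+i)
            total = sum(a[m] * b[m + lag] for m in range(i))
        else:
            lag = i - n        # OS[i] = CR(i-N)
            total = sum(a[m + lag] * b[m] for m in range(2 * n - i))
        out.append(total / n)
    return out


s1 = [None, 2, 3, 4, 5]
s2 = [6, 7, math.nan, 9, 10]
print(xcorr(s1, s2))
# approximately [0.0, 4.0, 9.6, 13.4, 20.0, 15.6, 9.2, 11.8, 6.0]
```

The middle value (index N) is the correlation of the aligned series, and every value is divided by N regardless of how many points overlap.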
- -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| null| 6| -|2020-01-01T00:00:02.000+08:00| 2| 7| -|2020-01-01T00:00:03.000+08:00| 3| NaN| -|2020-01-01T00:00:04.000+08:00| 4| 9| -|2020-01-01T00:00:05.000+08:00| 5| 10| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select xcorr(s1, s2) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -Output series: - -``` -+-----------------------------+---------------------------------------+ -| Time|xcorr(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+---------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 4.0| -|1970-01-01T08:00:00.002+08:00| 9.6| -|1970-01-01T08:00:00.003+08:00| 13.4| -|1970-01-01T08:00:00.004+08:00| 20.0| -|1970-01-01T08:00:00.005+08:00| 15.6| -|1970-01-01T08:00:00.006+08:00| 9.2| -|1970-01-01T08:00:00.007+08:00| 11.8| -|1970-01-01T08:00:00.008+08:00| 6.0| -+-----------------------------+---------------------------------------+ -``` - -### 6.6 Pattern\_match - -#### Registration statement - -```SQL -create function pattern_match as 'org.apache.iotdb.library.match.UDAFPatternMatch' -``` - -#### Usage - -This function performs pattern matching between an input time series and a predefined pattern. A match is considered successful if the similarity measure (distance) is less than or equal to a specified threshold. The results are output as a JSON list. - -**Name:** PATTERN\_MATCH - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE / BOOLEAN. - -**Parameter:** - -* `timePattern`: A comma-separated string of timestamps (e.g., `"t1,t2,t3"`). Length must be **greater than 1**. Required. 
-* `valuePattern`: A comma-separated string of numerical values corresponding to `timePattern`. Length must **match** `timePattern` and be greater than 1. Required. - -> For boolean values: Use `1` for `true` and `0` for `false`. - -* `threshold`: Float-type similarity threshold. Required. - -**Output Series:** A JSON list containing all successfully matched segments. Each entry includes: start timestamp `startTime`, end timestamp `endTime`, calculated similarity value `distance`. - -#### Example - -1. Linear Data - -Input series: - -```SQL -IoTDB> select s0 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s0| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.1| -|1970-01-01T08:00:00.003+08:00| 1.2| -|1970-01-01T08:00:00.004+08:00| 1.3| -|1970-01-01T08:00:00.005+08:00| 0.0| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+--------------------------------------------------------------------------------------------------+ -| match_result| -+--------------------------------------------------------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]| -+--------------------------------------------------------------------------------------------------+ -``` - -2. 
Boolean Data - -Input series: - -```SQL -IoTDB> select s1 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| true| -|1970-01-01T08:00:00.002+08:00| true| -|1970-01-01T08:00:00.003+08:00| true| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| false| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", "threshold"="0.5") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+-------------------------------------------------+ -| match_result| -+-------------------------------------------------+ -|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+-------------------------------------------------+ -``` - -3. V-shaped Data - -Input series: - -```SQL -IoTDB> select s2 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s2| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| -1.0| -|1970-01-01T08:00:00.003+08:00| -2.0| -|1970-01-01T08:00:00.004+08:00| -3.0| -|1970-01-01T08:00:00.005+08:00| -2.0| -|1970-01-01T08:00:00.006+08:00| -1.0| -|1970-01-01T08:00:00.007+08:00| -0.0| -|1970-01-01T08:00:00.008+08:00| -0.0| -|1970-01-01T08:00:00.009+08:00| -0.0| -|1970-01-01T08:00:00.010+08:00| -0.0| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s2, "timePattern"="1,2,3,4,5,6,7", "valuePattern"="0.0,-1.0,-2.0,-3.0,-2.0,-1.0,-0.0", "threshold"="10") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+----------------------------------------------+ -| match_result| -+----------------------------------------------+ -|[{"distance":0.53,"startTime":1,"endTime":10}]| -+----------------------------------------------+ -``` - -4. 
Multiple Matching Pattern - -Input series: - -```SQL -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------+-------------+ -| Time|root.db.d0.s0|root.db.d0.s1| -+-----------------------------+-------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| true| -|1970-01-01T08:00:00.002+08:00| 1.1| true| -|1970-01-01T08:00:00.003+08:00| 1.2| true| -|1970-01-01T08:00:00.004+08:00| 1.3| false| -|1970-01-01T08:00:00.005+08:00| 0.0| false| -+-----------------------------+-------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result1, pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", - "threshold"="0.5") as match_result2 from root.db.d0 -``` - -Output series: - -```SQL -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -| match_result1| match_result2| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -``` - - -## 7. Data Repairing - -### 7.1 TimestampRepair - -#### Registration statement - -```sql -create function timestamprepair as 'org.apache.iotdb.library.drepair.UDTFTimestampRepair' -``` - -#### Usage - -This function is used for timestamp repair. -According to the given standard time interval, -the method of minimizing the repair cost is adopted. -By fine-tuning the timestamps, -the original data with unstable timestamp interval is repaired to strictly equispaced data. 
-If no standard time interval is given, -this function will use the **median**, **mode** or **cluster** of the time interval to estimate the standard time interval. - -**Name:** TIMESTAMPREPAIR - -**Input Series:** Only support a single input series. The data type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `interval`: The standard time interval whose unit is millisecond. It is a positive integer. By default, it will be estimated according to the given method. -+ `method`: The method to estimate the standard time interval, which is 'median', 'mode' or 'cluster'. This parameter is only valid when `interval` is not given. By default, median will be used. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. - -#### Examples - -##### Manually Specify the Standard Time Interval - -When `interval` is given, this function repairs according to the given standard time interval. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:19.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:01.000+08:00| 7.0| -|2021-07-01T12:01:11.000+08:00| 8.0| -|2021-07-01T12:01:21.000+08:00| 9.0| -|2021-07-01T12:01:31.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timestamprepair(s1,'interval'='10000') from root.test.d2 -``` - -Output series: - - -``` -+-----------------------------+----------------------------------------------------+ -| Time|timestamprepair(root.test.d2.s1, "interval"="10000")| -+-----------------------------+----------------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| 
-|2021-07-01T12:00:20.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:00.000+08:00| 7.0| -|2021-07-01T12:01:10.000+08:00| 8.0| -|2021-07-01T12:01:20.000+08:00| 9.0| -|2021-07-01T12:01:30.000+08:00| 10.0| -|2021-07-01T12:01:40.000+08:00| NaN| -+-----------------------------+----------------------------------------------------+ -``` - -##### Automatically Estimate the Standard Time Interval - -When `interval` is default, this function estimates the standard time interval. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select timestamprepair(s1) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------+ -| Time|timestamprepair(root.test.d2.s1)| -+-----------------------------+--------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:20.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:00.000+08:00| 7.0| -|2021-07-01T12:01:10.000+08:00| 8.0| -|2021-07-01T12:01:20.000+08:00| 9.0| -|2021-07-01T12:01:30.000+08:00| 10.0| -|2021-07-01T12:01:40.000+08:00| NaN| -+-----------------------------+--------------------------------+ -``` - -### 7.2 ValueFill - -#### Registration statement - -```sql -create function valuefill as 'org.apache.iotdb.library.drepair.UDTFValueFill' -``` - -#### Usage - -This function is used to impute time series. Several methods are supported. - -**Name**: ValueFill -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: {"mean", "previous", "linear", "likelihood", "AR", "MA", "SCREEN"}, default "linear". - Method to use for imputation in series. 
"mean": use global mean value to fill holes; "previous": propagate last valid observation forward to next valid; "linear": simplest interpolation method; "likelihood": maximum likelihood estimation based on the normal distribution of speed; "AR": auto regression; "MA": moving average; "SCREEN": speed constraint. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. - -**Note:** The AR method uses an AR(1) model. The input values should be autocorrelated; otherwise the function outputs a single point (0, 0.0). - -#### Examples - -##### Fill with linear - -When `method` is "linear" or the default, the linear method is used to impute. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| NaN| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| NaN| -|2020-01-01T00:00:22.000+08:00| NaN| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select valuefill(s1) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------+ -| Time|valuefill(root.test.d2.s1)| -+-----------------------------+--------------------------+ -|2020-01-01T00:00:02.000+08:00| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 
-|2020-01-01T00:00:14.000+08:00| 110.5| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.66666666666667| -|2020-01-01T00:00:22.000+08:00| 121.33333333333333| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+--------------------------+ -``` - -##### Previous Fill - -When `method` is "previous", previous method is used. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select valuefill(s1,"method"="previous") from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------------+ -| Time|valuefill(root.test.d2.s1, "method"="previous")| -+-----------------------------+-----------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 108.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 116.0| -|2020-01-01T00:00:22.000+08:00| 116.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-----------------------------------------------+ -``` - -### 7.3 ValueRepair - -#### Registration statement - -```sql -create function valuerepair as 'org.apache.iotdb.library.drepair.UDTFValueRepair' -``` - -#### Usage - -This function is used to repair the value of the time series. 
-Currently, two methods are supported: -**Screen** is a method based on speed threshold, which makes all speeds meet the threshold requirements under the premise of minimum changes; -**LsGreedy** is a method based on speed change likelihood, which models speed changes as Gaussian distribution, and uses a greedy algorithm to maximize the likelihood. - - -**Name:** VALUEREPAIR - -**Input Series:** Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The method used to repair, which is 'Screen' or 'LsGreedy'. By default, Screen is used. -+ `minSpeed`: This parameter is only valid with Screen. It is the speed threshold. Speeds below it will be regarded as outliers. By default, it is the median minus 3 times of median absolute deviation. -+ `maxSpeed`: This parameter is only valid with Screen. It is the speed threshold. Speeds above it will be regarded as outliers. By default, it is the median plus 3 times of median absolute deviation. -+ `center`: This parameter is only valid with LsGreedy. It is the center of the Gaussian distribution of speed changes. By default, it is 0. -+ `sigma`: This parameter is only valid with LsGreedy. It is the standard deviation of the Gaussian distribution of speed changes. By default, it is the median absolute deviation. - -**Output Series:** Output a single series. The type is the same as the input. This series is the input after repairing. - -**Note:** `NaN` will be filled with linear interpolation before repairing. - -#### Examples - -##### Repair with Screen - -When `method` is 'Screen' or the default, Screen method is used. 
- -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select valuerepair(s1) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|valuerepair(root.test.d2.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+----------------------------+ -``` - -##### Repair with LsGreedy - -When `method` is 'LsGreedy', LsGreedy method is used. 
- -Input series is the same as above, the SQL for query is shown below: - -```sql -select valuerepair(s1,'method'='LsGreedy') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|valuerepair(root.test.d2.s1, "method"="LsGreedy")| -+-----------------------------+-------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-------------------------------------------------+ -``` - -## 8. Series Discovery - -### 8.1 ConsecutiveSequences - -#### Registration statement - -```sql -create function consecutivesequences as 'org.apache.iotdb.library.series.UDTFConsecutiveSequences' -``` - -#### Usage - -This function is used to find locally longest consecutive subsequences in strictly equispaced multidimensional data. - -Strictly equispaced data is the data whose time intervals are strictly equal. Missing data, including missing rows and missing values, is allowed in it, while data redundancy and timestamp drift is not allowed. - -Consecutive subsequence is the subsequence that is strictly equispaced with the standard time interval without any missing data. If a consecutive subsequence is not a proper subsequence of any consecutive subsequence, it is locally longest. - -**Name:** CONSECUTIVESEQUENCES - -**Input Series:** Support multiple input series. 
The type is arbitrary but the data is strictly equispaced. - -**Parameters:** - -+ `gap`: The standard time interval, which is a positive number with a unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, it will be estimated by the mode of time intervals. - -**Output Series:** Output a single series. The type is INT32. Each data point in the output series corresponds to a locally longest consecutive subsequence. The output timestamp is the starting timestamp of the subsequence and the output value is the number of data points in the subsequence. - -**Note:** For an input series that is not strictly equispaced, there is no guarantee on the output. - -#### Examples - -##### Manually Specify the Standard Time Interval - -The standard time interval can be specified manually with the parameter `gap`. Note that an incorrect `gap` value leads to incorrect output. - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| -|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select consecutivesequences(s1,s2,'gap'='5m') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2, "gap"="5m")| 
-+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| -+-----------------------------+------------------------------------------------------------------+ -``` - - -##### Automatically Estimate the Standard Time Interval - -When `gap` is default, this function estimates the standard time interval by the mode of time intervals and gets the same results. Therefore, this usage is more recommended. - -Input series is the same as above, the SQL for query is shown below: - -```sql -select consecutivesequences(s1,s2) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| -+-----------------------------+------------------------------------------------------+ -``` - -### 8.2 ConsecutiveWindows - -#### Registration statement - -```sql -create function consecutivewindows as 'org.apache.iotdb.library.series.UDTFConsecutiveWindows' -``` - -#### Usage - -This function is used to find consecutive windows of specified length in strictly equispaced multidimensional data. - -Strictly equispaced data is the data whose time intervals are strictly equal. Missing data, including missing rows and missing values, is allowed in it, while data redundancy and timestamp drift is not allowed. - -Consecutive window is the subsequence that is strictly equispaced with the standard time interval without any missing data. - -**Name:** CONSECUTIVEWINDOWS - -**Input Series:** Support multiple input series. The type is arbitrary but the data is strictly equispaced. 
- -**Parameters:** - -+ `gap`: The standard time interval, which is a positive number with a unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, it will be estimated by the mode of time intervals. -+ `length`: The length of the window, which is a positive number with a unit. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. This parameter is required. - -**Output Series:** Output a single series. The type is INT32. Each data point in the output series corresponds to a consecutive window. The output timestamp is the starting timestamp of the window and the output value is the number of data points in the window. - -**Note:** For an input series that is not strictly equispaced, there is no guarantee on the output. - -#### Examples - - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| -|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select consecutivewindows(s1,s2,'length'='10m') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------------+ -| Time|consecutivewindows(root.test.d1.s1, root.test.d1.s2, "length"="10m")| -+-----------------------------+--------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| 
-|2020-01-01T00:20:00.000+08:00| 4| -+-----------------------------+--------------------------------------------------------------------+ -``` - - - -## 9. Machine Learning - -### 9.1 AR - -#### Registration statement - -```sql -create function ar as 'org.apache.iotdb.library.dlearn.UDTFAR' -``` - -#### Usage - -This function is used to learn the coefficients of the autoregressive models for a time series. - -**Name:** AR - -**Input Series:** Only support a single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -- `p`: The order of the autoregressive model. Its default value is 1. - -**Output Series:** Output a single series. The type is DOUBLE. The first line corresponds to the first order coefficient, and so on. - -**Note:** - -- Parameter `p` should be a positive integer. -- Most points in the series should be sampled at a constant time interval. -- Linear interpolation is applied for the missing points in the series. - -#### Examples - -##### Assigning Model Order - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ar(s0,"p"="2") from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+---------------------------+ -| Time|ar(root.test.d0.s0,"p"="2")| -+-----------------------------+---------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.9429| -|1970-01-01T08:00:00.002+08:00| -0.2571| -+-----------------------------+---------------------------+ -``` - -### 9.2 Cluster - 
-#### Registration statement - -```sql -create function cluster as 'org.apache.iotdb.library.dlearn.UDTFCluster' -``` - -#### Usage - -This function takes a **single input time series**, splits it into **non-overlapping** contiguous subsequences (windows) of fixed length `l`, and clusters those subsequences into `k` groups. - -**Name:** Cluster - -**Input Series:** Only support single input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE. Points are read in time order; trailing samples that do not fill a full window are dropped (only `⌊n/l⌋` windows are used, where `n` is the number of valid points). - -**Parameters:** - -| Name | Meaning | Default | Notes | -|------|---------|---------|--------| -| `l` | Subsequence (window) length | (required) | Positive integer; each window has `l` consecutive samples. | -| `k` | Number of clusters | (required) | Integer ≥ 2. | -| `method` | Clustering algorithm | `kmeans` | Optional: `kmeans`, `kshape`, `medoidshape` (case-insensitive). Defaults to k-means if omitted. | -| `norm` | Z-score normalize each subsequence | `true` | Boolean; if `true`, each subsequence is standardized before clustering. | -| `maxiter` | Maximum iterations | `200` | Positive integer. | -| `output` | Output mode | `label` | `label`: one cluster id per window; `centroid`: concatenate the `k` centroid vectors in cluster order. | -| `sample_rate` | Greedy sampling rate | `0.3` | Used only when **`method` = `medoidshape`**; must be in `(0, 1]`. | - - -**`method` details:** - -- **kmeans**: k-means in Euclidean space (optionally after per-window normalization). -- **kshape**: Assign by shape-based distance (SBD from normalized cross-correlation, NCC); centroids updated via SVD on the cluster matrix. -- **medoidshape**: Coarsely cluster, then greedy selection of `k` representative subsequences; `sample_rate` controls how many candidates are sampled each round. 
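The windowing and shape-based distance described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the UDF's actual implementation: the helper names `split_windows`, `z_normalize`, and `sbd` are hypothetical, the normalization is assumed to use the population standard deviation, and SBD is computed with a plain loop over shifts instead of an FFT.

```python
import math


def split_windows(values, l):
    """Split values into non-overlapping windows of length l.

    Trailing samples that do not fill a full window are dropped,
    so only floor(n / l) windows are returned.
    """
    count = len(values) // l
    return [values[i * l:(i + 1) * l] for i in range(count)]


def z_normalize(window):
    """Standardize one window (assumption: population standard deviation)."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    if std == 0:
        # A constant window has no shape; map it to all zeros.
        return [0.0] * len(window)
    return [(v - mean) / std for v in window]


def sbd(x, y):
    """Shape-based distance: 1 minus the maximum normalized cross-correlation
    over all relative shifts of two equal-length sequences."""
    norm = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    if norm == 0:
        return 1.0
    n = len(x)
    best = -math.inf
    for shift in range(-(n - 1), n):
        # Cross-correlation of x with y shifted by `shift` positions.
        cc = sum(x[i] * y[i - shift] for i in range(n) if 0 <= i - shift < n)
        best = max(best, cc / norm)
    return 1.0 - best


samples = [1, 2, 3, 10, 20, 30, 1, 5, 1]
windows = split_windows(samples, 3)
a, b, c = (z_normalize(w) for w in windows)
print(windows)    # [[1, 2, 3], [10, 20, 30], [1, 5, 1]]
print(sbd(a, b))  # close to 0: same shape after normalization
print(sbd(a, c))  # clearly larger: different shape
```

After normalization, `{1,2,3}` and `{10,20,30}` have the same shape, so their distance is near zero, while `{1,5,1}` is farther from both. That is why a shape-based method would tend to group the first two windows together.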
- -**Output Series:** Controlled by `output`: - -- **`output` = `label` (default):** One output series, type **INT32**. Number of points = number of full windows, `⌊n/l⌋`. Timestamp of each point = **time of the first sample** in that window; value = cluster id **0 … k−1**. -- **`output` = `centroid`:** One output series, type **DOUBLE**. Number of points = **`k × l`**: for clusters **0 → k−1**, emit the `l` components of each centroid in order (concatenated). Timestamps are `0, 1, 2, …` (placeholders only, no physical time meaning). - -**Note:** - -- Require valid point count `n ≥ l` and window count `⌊n/l⌋ ≥ k`. - -#### Examples - -##### KShape: window length 3, k = 2 - -Nine samples `{1,2,3,10,20,30,1,5,1}` form three non-overlapping windows `{1,2,3}`, `{10,20,30}`, `{1,5,1}`. With **`method` = `kshape`** (default `norm` = `true`), each output row is the cluster id for one window; timestamps are the window start times. Resulting labels: **0, 0, 1**. - -Input Series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| 2.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:04.000+08:00| 10.0| -|2020-01-01T00:00:05.000+08:00| 20.0| -|2020-01-01T00:00:06.000+08:00| 30.0| -|2020-01-01T00:00:07.000+08:00| 1.0| -|2020-01-01T00:00:08.000+08:00| 5.0| -|2020-01-01T00:00:09.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select cluster(s0, "l"="3", "k"="2", "method"="kshape", "output"="label") -from root.test.d0 -``` - -Output Series: - -``` -+-----------------------------+----------------------------------------------------------------------------+ -| Time|cluster(root.test.d0.s0,"l"="3","k"="2","method"="kshape","output"="label")| -+-----------------------------+----------------------------------------------------------------------------+ 
-|2020-01-01T00:00:01.000+08:00| 0| -|2020-01-01T00:00:04.000+08:00| 0| -|2020-01-01T00:00:07.000+08:00| 1| -+-----------------------------+----------------------------------------------------------------------------+ -``` diff --git a/src/UserGuide/latest/Tools-System/CLI_timecho.md b/src/UserGuide/latest/Tools-System/CLI_timecho.md deleted file mode 100644 index ad512f4ae..000000000 --- a/src/UserGuide/latest/Tools-System/CLI_timecho.md +++ /dev/null @@ -1,275 +0,0 @@ - - -# Command Line Interface (CLI) - - -IoTDB provides Cli/shell tools for users to interact with IoTDB server in command lines. This document shows how Cli/shell tool works and the meaning of its parameters. - -> Note: In this document, \$IOTDB\_HOME represents the path of the IoTDB installation directory. - -## 1. Running Cli - -After installation, there is a default user in IoTDB: `root`, and the -default password is `TimechoDB@2021`(Before V2.0.6 it is `root`). Users can use this username to try IoTDB Cli/Shell tool. The cli startup script is the `start-cli` file under the \$IOTDB\_HOME/bin folder. When starting the script, you need to specify the IP and PORT. (Make sure the IoTDB cluster is running properly when you use Cli/Shell tool to connect to it.) - -Here is an example where the cluster is started locally and the user has not changed the running port. The default rpc port is -6667
-If you need to connect to a remote DataNode or have changed -the RPC port number of the running DataNode, set the specific IP and RPC port with -h and -p.
-You also can set your own environment variable at the front of the start script - -The Linux and MacOS system startup commands are as follows: - -```shell -# Before version V2.0.6.x -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x and later versions -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -The Windows system startup commands are as follows: - -```shell -# Before version V2.0.4.x -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.4.x and later versions, before version V2.0.6.x -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x and later versions -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -After operating these commands, the cli can be started successfully. The successful status will be as follows: - -``` - _____ _________ ______ ______ -|_ _| | _ _ ||_ _ `.|_ _ \ - | | .--.|_/ | | \_| | | `. \ | |_) | - | | / .'`\ \ | | | | | | | __'. - _| |_| \__. | _| |_ _| |_.' /_| |__) | -|_____|'.__.' |_____| |______.'|_______/ version - - -Successfully login at 127.0.0.1:6667 -IoTDB> -``` - -Enter ```quit``` or `exit` can exit Cli. - -## 2. Cli Parameters - -| **Parameter** | **Type** | **Required** | **Description** | **Example** | -| -------------------------- | -------- | ------------ |-----------------------------------------------------------------------------------| ------------------- | -| -h `` | string | No | The IP address of the IoTDB server. (Default: 127.0.0.1) | -h 127.0.0.1 | -| -p `` | int | No | The RPC port of the IoTDB server. (Default: 6667) | -p 6667 | -| -u `` | string | No | The username to connect to the IoTDB server. (Default: root) | -u root | -| -pw `` | string | No | The password to connect to the IoTDB server. (Default: root) | -pw root | -| -sql_dialect `` | string | No | The data model type: tree or table. 
(Default: tree) | -sql_dialect table | -| -e `` | string | No | Batch operations in non-interactive mode. | -e "show databases" | -| -c | Flag | No | Required if rpc_thrift_compression_enable=true on the server. | -c | -| -disableISO8601 | Flag | No | If set, timestamps will be displayed as numeric values instead of ISO8601 format. | -disableISO8601 | -| -usessl `` | Boolean | No | Enable SSL connection | -usessl true | -| -ts `` | string | No | SSL certificate store path | -ts /path/to/truststore | -| -tpw `` | string | No | SSL certificate store password | -tpw myTrustPassword | -| -timeout `` | int | No | Query timeout (seconds). If not set, the server's configuration will be used. | -timeout 30 | -| -help | Flag | No | Displays help information for the CLI tool. | -help | - -Following is a cli command which connects the host with IP -10.129.187.21, rpc port 6667, username "root", password "root", and prints the timestamp in digital form. The maximum number of lines displayed on the IoTDB command line is 10. - -The Linux and MacOS system startup commands are as follows: - -```shell -Shell > bash sbin/start-cli.sh -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 -``` - -The Windows system startup commands are as follows: - -```shell -# Before version V2.0.4.x -Shell > sbin\start-cli.bat -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 - -# # V2.0.4.x and later versions -Shell > sbin\windows\start-cli.bat -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 -``` - -## 3. CLI Special Command - -Special commands of Cli are below. - -| Command | Description / Example | -| :-------------------------- | :------------------------------------------------------ | -| `set time_display_type=xxx` | eg. long, default, ISO8601, yyyy-MM-dd HH:mm:ss | -| `show time_display_type` | show time display type | -| `set time_zone=xxx` | eg. 
+08:00, Asia/Shanghai | -| `show time_zone` | show cli time zone | -| `set fetch_size=xxx` | set fetch size when querying data from server | -| `show fetch_size` | show fetch size | -| `set max_display_num=xxx` | set the maximum number of lines the cli outputs; -1 means unlimited | -| `help` | Get hints for CLI special commands | -| `exit/quit` | Exit CLI | - - -## 4. Batch Operation of Cli - -The -e parameter is designed for the Cli/shell tool for cases where you want to operate on IoTDB in batches through scripts. With the -e parameter, you can operate IoTDB without entering the cli's interactive input mode. - -To avoid confusion between statements and other parameters, the current version only supports the -e parameter as the last parameter. - -The usage of the -e parameter for Cli/shell is as follows: - -The Linux and MacOS system commands: - -```shell -Shell > bash sbin/start-cli.sh -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} -``` - -The Windows system commands: - -```shell -# Before version V2.0.4.x -Shell > sbin\start-cli.bat -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} - -# V2.0.4.x and later versions -Shell > sbin\windows\start-cli.bat -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} -``` - -In the Windows environment, the SQL statement of the -e parameter must be enclosed in ` `` ` instead of `" "`. - -To better illustrate the use of the -e parameter, consider the following example (on a Linux system). - -Suppose you want to create a database root.demo in a newly launched IoTDB, create a timeseries root.demo.s1, and insert three data points into it.
With the -e parameter, you could write a shell script like this: - -```shell -#!/bin/bash - -host=127.0.0.1 -rpcPort=6667 -user=root -pass=root - -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "create database root.demo" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "create timeseries root.demo.s1 WITH DATATYPE=INT32, ENCODING=RLE" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(1,10)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(2,11)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(3,12)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "select s1 from root.demo" -``` - -The results are shown below and are consistent with those of the Cli and JDBC operations. - -```shell - Shell > bash ./shell.sh -+-----------------------------+------------+ -| Time|root.demo.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.001+08:00| 10| -|1970-01-01T08:00:00.002+08:00| 11| -|1970-01-01T08:00:00.003+08:00| 12| -+-----------------------------+------------+ -Total line number = 3 -It costs 0.267s -``` - -Note that when the -e parameter is used in shell scripts, special characters in the SQL statement must be escaped. - -## 5. Access History Feature - -Since IoTDB **V2.0.9.1**, the access history feature is supported. After a client logs in successfully, key historical access information is displayed, and the feature supports distributed scenarios. Both administrators and regular users can only view their own access history. The core displayed information includes: - -- Last successful session: displays date, time, access application, IP address, and access method (not shown for first login or when no history exists).
-- Most recent failed attempt: displays the date, time, access application, IP address, and access method of the latest failed login attempt before the current successful login. -- Cumulative failed attempts: total number of failed session attempts since the last successful session was established. - -### 5.1 Enabling Access History - -You can enable or disable the access history feature by modifying relevant parameters in the `iotdb-system.properties` file. A restart is required for changes to take effect. For example: - -```Plain -# Controls whether the audit log feature is enabled -enable_audit_log=false -``` - -- When enabled: login information is recorded and expired data is cleaned periodically. -- When disabled: no data is recorded, displayed, or cleaned. -- If disabled and then re-enabled, the displayed history will be the last record before disabling, which may not represent the actual latest login. - -Usage example: - -```Bash ---------------------- -Starting IoTDB Cli ---------------------- - _____ _________ ______ ______ -|_ _| | _ _ ||_ _ `.|_ _ \ - | | .--.|_/ | | \_| | | `. \ | |_) | - | | / .'`\ \ | | | | | | | __'. - _| |_| \__. | _| |_ _| |_.' /_| |__) | -|_____|'.__.' |_____| |______.'|_______/ Enterprise version 2.0.9.1 (Build: xxxxxxx) - - ----Last Successful Session------------------ -Time: 2026-03-24T10:25:47.759+08:00 -IP Address: 127.0.0.1 ----Last Failed Session---------------------- -Time: 2026-03-24T10:27:26.314+08:00 -IP Address: 127.0.0.1 -Cumulative Failed Attempts: 1 -Successfully login at 127.0.0.1:6667 -IoTDB> -``` - -### 5.2 Viewing Access History - -The `root` user and users with the `AUDIT` privilege can view access history records using SQL statements. - -Syntax: - -```SQL -SELECT * FROM root.__audit.login.u_{userid}.** -``` - -The `userid` can be obtained using the `LIST USER` statement. 
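For scripted checks, the per-user query can also be issued without entering the interactive CLI. The following is a minimal sketch, assuming the `-e` option of `start-cli.sh` accepts this statement like any other SQL. The function names `audit_query` and `show_login_history` are hypothetical helpers, and the host, port, and credentials are placeholders to replace with your own.

```shell
#!/bin/bash

# Hypothetical helper: build the access-history query for one user id.
# The user id is the value reported by LIST USER (0 for root in the example output).
audit_query() {
  local userid="$1"
  printf '%s' "SELECT * FROM root.__audit.login.u_${userid}.**"
}

# Hypothetical helper: run the query through the CLI in non-interactive mode.
# -e is passed last, as the CLI requires. Connection values are placeholders.
show_login_history() {
  local userid="$1"
  bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 \
    -e "$(audit_query "${userid}")"
}

# Example: show_login_history 0
```

Keeping the statement builder separate from the CLI call makes it possible to inspect the generated SQL before running it against a server.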
- -Example: - -```SQL -IoTDB> SELECT * FROM root.__audit.login.** -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -| Time|root.__audit.login.u_0.node_1.result|root.__audit.login.u_0.node_1.ip|root.__audit.login.u_0.node_1.username| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -|2026-03-25T10:55:58.240+08:00| true| 127.0.0.1| root| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -Total line number = 1 -It costs 0.039s - -IoTDB> SELECT * FROM root.__audit.login.u_0.** -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -| Time|root.__audit.login.u_0.node_1.result|root.__audit.login.u_0.node_1.ip|root.__audit.login.u_0.node_1.username| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -|2026-03-25T10:55:58.240+08:00| true| 127.0.0.1| root| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -Total line number = 1 -It costs 0.020s -``` \ No newline at end of file diff --git a/src/UserGuide/latest/Tools-System/Data-Export-Tool_timecho.md b/src/UserGuide/latest/Tools-System/Data-Export-Tool_timecho.md deleted file mode 100644 index c03ac9270..000000000 --- a/src/UserGuide/latest/Tools-System/Data-Export-Tool_timecho.md +++ /dev/null @@ -1,166 +0,0 @@ -# Data Export - -## 1. 
Overview -The data export tool, export-data.sh (Unix/OS X) or export-data.bat (Windows), located in the tools directory, allows users to export query results from specified SQL statements into CSV, SQL, or TsFile (open-source time-series file format) formats. The specific functionalities are as follows: - - - - - - - - - - - - - - - - - - - - - -
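-| File Format | IoTDB Tool | Description |
-| ----------- | ------------------ | ----------- |
-| CSV | export-data.sh/bat | Plain text format for storing structured data. Must follow the CSV format specified below. |
-| SQL | export-data.sh/bat | File containing custom SQL statements. |
-| TsFile | export-data.sh/bat | Open-source time-series file format. |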
- - -## 2. Detailed Functionality -### 2.1 Common Parameters -| Short | Full Parameter | Description | Required | Default | -|------------------|--------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------| ----------------- |-----------------------------------------------------------------------------------------------------| -| `-ft` | `--file_type` | Export file type: `csv`, `sql`, `tsfile`. | ​**Yes** | - | -| `-h` | `--host` | Hostname of the IoTDB server. | No | `127.0.0.1` | -| `-p` | `--port` | Port number of the IoTDB server. | No | `6667` | -| `-u` | `--username` | Username for authentication. | No | `root` | -| `-pw` | `--password` | Password for authentication. Supported for hidden input since V2.0.9.1 | No | `TimechoDB@2021`(Before V2.0.6 it is `root` ) | -| `-t` | `--target` | Target directory for the output files. If the path does not exist, it will be created. | ​**Yes** | - | -| `-pfn` | `--prefix_file_name` | Prefix for the exported file names. For example, `abc` will generate files like `abc_0.tsfile`, `abc_1.tsfile`. | No | `dump_0.tsfile` | -| `-q` | `--query` | SQL query command to execute. Starting from v2.0.8, semicolons in SQL statements are automatically removed, and query execution proceeds normally. | No | - | -| `-timeout` | `--query_timeout` | Query timeout in milliseconds (ms). | No | `-1` (before v2.0.8)
`Long.MAX_VALUE` (v2.0.8 and later)
(Range: `-1~Long.MAX_VALUE`) | -| `-help` | `--help` | Display help information. | No | - | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - -### 2.2 CSV Format -#### 2.2.1 Command - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-h ] [-p ] [-u ] [-pw ] -t - [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -# Windows -# Before version V2.0.4.x -> tools\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] -t - [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] - -# V2.0.4.x and later versions -> tools\windows\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] -t - [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -``` -#### 2.2.2 CSV-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ------------ | ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- |------------------------------------------| -| `-dt` | `--datatype` | Whether to include data types in the CSV file header (`true` or `false`). | No | `false` | -| `-lpf` | `--lines_per_file` | Number of rows per exported file. | No | `10000` (Range:0~Integer.Max=2147483647) | -| `-tf` | `--time_format` | Time format for the CSV file. Options: 1) Timestamp (numeric, long), 2) ISO8601 (default), 3) Custom pattern (e.g., `yyyy-MM-dd HH:mm:ss`). SQL file timestamps are unaffected by this setting. | No | `ISO8601` | -| `-tz` | `--timezone` | Timezone setting (e.g., `+08:00`, `-01:00`). 
| No | System default | - -#### 2.2.3 Examples - -```Shell -# Valid Example -> tools/export-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -t /path/export/dir - -pfn exported-data.csv -dt true -lpf 1000 -tf "yyyy-MM-dd HH:mm:ss" - -tz +08:00 -q "SELECT * FROM root.ln" -timeout 20000 - -# Error Example -> tools/export-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -Parse error: Missing required option: t - -# Note: Before version V2.0.6, the default value for the -pw parameter was root. -``` -### 2.3 SQL Format -#### 2.3.1 Command -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-aligned ] - -lpf - [-tf ] [-tz ] [-q ] [-timeout ] - -# Windows -# Before version V2.0.4.x -> tools\export-data.bat -ft [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] - -# V2.0.4.x and later versions -> tools\windows\export-data.bat -ft [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] -``` -#### 2.3.2 SQL-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ---------------- | ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ---------------- | -| `-aligned` | `--use_aligned` | Whether to export as aligned SQL format (`true` or `false`). | No | `true` | -| `-lpf` | `--lines_per_file` | Number of rows per exported file. | No | `10000` (Range:0~Integer.Max=2147483647) | -| `-tf` | `--time_format` | Time format for the CSV file. Options: 1) Timestamp (numeric, long), 2) ISO8601 (default), 3) Custom pattern (e.g., `yyyy-MM-dd HH:mm:ss`). SQL file timestamps are unaffected by this setting. | No | `ISO8601` | -| `-tz` | `--timezone` | Timezone setting (e.g., `+08:00`, `-01:00`).
| No | System default | - -#### 2.3.3 Examples -```Shell -# Valid Example -> tools/export-data.sh -ft sql -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -t /path/export/dir - -pfn exported-data.csv -aligned true -lpf 1000 -tf "yyyy-MM-dd HH:mm:ss" - -tz +08:00 -q "SELECT * FROM root.ln" -timeout 20000 - -# Error Example -> tools/export-data.sh -ft sql -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -Parse error: Missing required option: t - -# Note: Before version V2.0.6, the default value for the -pw parameter was root. -``` - -### 2.4 TsFile Format - -#### 2.4.1 Command - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# Windows -# Before version V2.0.4.x -> tools\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# V2.0.4.x and later versions -> tools\windows\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] -``` - -#### 2.4.2 TsFile-Specific Parameters - -* None - -#### 2.4.3 Examples - -```Shell -# Valid Example -> tools/export-data.sh -ft tsfile -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -t /path/export/dir - -pfn export-data.tsfile -q "SELECT * FROM root.ln" -timeout 10000 - -# Error Example -> tools/export-data.sh -ft tsfile -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -Parse error: Missing required option: t - -# Note: Before version V2.0.6, the default value for the -pw parameter was root. -``` diff --git a/src/UserGuide/latest/Tools-System/Data-Import-Tool_timecho.md b/src/UserGuide/latest/Tools-System/Data-Import-Tool_timecho.md deleted file mode 100644 index 32bd2a129..000000000 --- a/src/UserGuide/latest/Tools-System/Data-Import-Tool_timecho.md +++ /dev/null @@ -1,328 +0,0 @@ -# Data Import - -## 1. Overview -IoTDB supports three methods for data import: -- Data Import Tool: Use the `import-data.sh/bat` script in the `tools` directory to manually import CSV, SQL, or TsFile (open-source time-series file format) data into IoTDB. 
-- `TsFile` Auto-Loading Feature -- Load `TsFile` SQL - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-| File Format | IoTDB Tool | Description |
-| ----------- | --------------------------- | ----------- |
-| CSV | import-data.sh/bat | Can be used for single or batch import of CSV files into IoTDB |
-| SQL | import-data.sh/bat | Can be used for single or batch import of SQL files into IoTDB |
-| TsFile | import-data.sh/bat | Can be used for single or batch import of TsFile files into IoTDB |
-| TsFile | TsFile Auto-Loading Feature | Can automatically monitor a specified directory for newly generated TsFiles and load them into IoTDB |
-| TsFile | Load SQL | Can be used for single or batch import of TsFile files into IoTDB |
- -## 2. Data Import Tool -### 2.1 Common Parameters - -| Short | Full Parameter | Description | Required | Default | -|-----------------|--------------------------| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------- |-----------------------------------------------| -| `-ft` | `--file_type` | File type: `csv`, `sql`, `tsfile`. | ​**Yes** | - | -| `-h` | `--host` | IoTDB server hostname. | No | `127.0.0.1` | -| `-p` | `--port` | IoTDB server port. | No | `6667` | -| `-u` | `--username` | Username. | No | `root` | -| `-pw` | `--password` | Password. Supported for hidden input since V2.0.9.1 | No | `TimechoDB@2021`(Before V2.0.6 it is `root` ) | -| `-s` | `--source` | Local path to the file/directory to import. ​​**Supported formats**​: CSV, SQL, TsFile. Unsupported formats trigger error: `The file name must end with "csv", "sql", or "tsfile"!` | ​**Yes** | - | -| `-tn` | `--thread_num` | Maximum parallel threads | No | `8`
Range: 0 to Integer.Max(2147483647). | -| `-tz` | `--timezone` | Timezone (e.g., `+08:00`, `-01:00`). | No | System default | -| `-help` | `--help` | Display help (general or format-specific: `-help csv`). | No | - | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - -### 2.2 CSV Format - -#### 2.2.1 Command -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] - -# Windows -# Before version V2.0.4.x -> tools\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] - -# V2.0.4.x and later versions -> tools\windows\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] -``` - -#### 2.2.2 CSV-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ---------------- | ------------------------------- |----------------------------------------------------------| ---------- |-----------------| -| `-fd` | `--fail_dir` | Directory to save failed files. | No | YOUR_CSV_FILE_PATH | -| `-lpf` | `--lines_per_failed_file` | Max lines per failed file. | No | `100000`
Range: 0 to Integer.Max(2147483647). | -| `-aligned` | `--use_aligned` | Import as aligned time series. | No | `false` | -| `-batch` | `--batch_size` | Rows processed per API call. | No | `100000`
Range: 0 to Integer.Max(2147483647). | -| `-ti` | `--type_infer` | Type mapping (e.g., `BOOLEAN=text,INT=long`). | No | - | -| `-tp` | `--timestamp_precision` | Timestamp precision: `ms`, `us`, `ns`. | No | `ms` | - -#### 2.2.3 Examples - -```Shell -# Valid Example -> tools/import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -s /path/sql - -fd /path/failure/dir -lpf 100 -aligned true -ti "BOOLEAN=text,INT=long,FLOAT=double" - -tp ms -tz +08:00 -batch 5000 -tn 4 - -# Error Example -> tools/import-data.sh -ft csv -s /non_path -error: Source file or directory /non_path does not exist - -> tools/import-data.sh -ft csv -s /path/sql -tn 0 -error: Invalid thread number '0'. Please set a positive integer. - -# Note: Before version V2.0.6, the default value for the -pw parameter was root. -``` - -#### 2.2.4 Import Notes - -1. CSV Import Specifications - -- Special Character Escaping Rules: If a text-type field contains special characters (e.g., commas ,), they must be escaped using a backslash (\). -- Supported Time Formats: yyyy-MM-dd'T'HH:mm:ss, yyyy-MM-dd HH:mm:ss, or yyyy-MM-dd'T'HH:mm:ss.SSSZ. -- Timestamp Column Requirement: The timestamp column must be the first column in the data file. - -2. 
CSV File Example - -- Time Alignment - -```sql --- Headers without data types -Time,root.test.t1.str,root.test.t2.str,root.test.t2.var -1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100 -1970-01-01T08:00:00.002+08:00,"123",, - --- Headers with data types (Text-type data supports both quoted and unquoted formats) -Time,root.test.t1.str(TEXT),root.test.t2.str(TEXT),root.test.t2.var(INT32) -1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100 -1970-01-01T08:00:00.002+08:00,123,hello world,123 -1970-01-01T08:00:00.003+08:00,"123",, -1970-01-01T08:00:00.004+08:00,123,,12 -``` - -- Device Alignment - -```sql --- Headers without data types -Time,Device,str,var -1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world", -1970-01-01T08:00:00.002+08:00,root.test.t1,"123", -1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100 - --- Headers with data types (Text-type data supports both quoted and unquoted formats) -Time,Device,str(TEXT),var(INT32) -1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world", -1970-01-01T08:00:00.002+08:00,root.test.t1,"123", -1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100 -1970-01-01T08:00:00.002+08:00,root.test.t1,hello world,123 -``` - - -### 2.3 SQL Format - -#### 2.3.1 Command - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] - -# Windows -# Before version V2.0.4.x -> tools\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] - -# V2.0.4.x and later versions -> tools\windows\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] -``` - -#### 2.3.2 SQL-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| -------------- | ------------------------------- | -------------------------------------------------------------------- | ---------- | ------------------ | -| `-fd` | `--fail_dir` | Directory to save failed files. 
| No |YOUR_CSV_FILE_PATH| -| `-lpf` | `--lines_per_failed_file` | Max lines per failed file. | No | `100000`
Range: 0 to Integer.Max(2147483647). | -| `-batch` | `--batch_size` | Rows processed per API call. | No | `100000`
Range: 0 to Integer.Max(2147483647). | - -#### 2.3.3 Examples - -```Shell -# Valid Example -> tools/import-data.sh -ft sql -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -s /path/sql - -fd /path/failure/dir -lpf 500 -tz +08:00 - -batch 100000 -tn 4 - -# Error Example -> tools/import-data.sh -ft sql -s /path/sql -fd /non_path -error: Source file or directory /path/sql does not exist - - -> tools/import-data.sh -ft sql -s /path/sql -tn 0 -error: Invalid thread number '0'. Please set a positive integer. - -# Note: Before version V2.0.6, the default value for the -pw parameter was root. -``` -### 2.4 TsFile Format - -#### 2.4.1 Command - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-h ] [-p ] [-u ] [-pw ] - -s -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] - -# Windows -# Before version V2.0.4.x -> tools\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -s -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] - -# V2.0.4.x and later versions -> tools\windows\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -s -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] -``` -#### 2.4.2 TsFile-Specific Parameters - -| Short | Full Parameter | Description | Required | Default | -| ----------- | ----------------------------- |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ----------------- | --------------------------- | -| `-os` | `--on_success` | Action for successful files:
`none`: Do not delete the file.
`mv`: Move the successful file to the target directory.
`cp`:Create a hard link (copy) of the successful file to the target directory.
`delete`:Delete the file. | ​**Yes** | - | -| `-sd` | `--success_dir` | Target directory for `mv`/`cp` actions on success. Required if `-os` is `mv`/`cp`. The file name will be flattened and concatenated with the original file name. | Conditional | `${EXEC_DIR}/success` | -| `-of` | `--on_fail` | Action for failed files:
`none`:Skip the file.
`mv`:Move the failed file to the target directory.
`cp`:Create a hard link (copy) of the failed file to the target directory.
`delete`:Delete the file.. | ​**Yes** | - | -| `-fd` | `--fail_dir` | Target directory for `mv`/`cp` actions on failure. Required if `-of` is `mv`/`cp`. The file name will be flattened and concatenated with the original file name. | Conditional | `${EXEC_DIR}/fail` | -| `-tp` | `--timestamp_precision` | TsFile timestamp precision: `ms`, `us`, `ns`.
For non-remote TsFile imports: Use -tp to specify the timestamp precision of the TsFile. The system will manually verify if the timestamp precision matches the server. If it does not match, an error will be returned.
​For remote TsFile imports: Use -tp to specify the timestamp precision of the TsFile. The Pipe system will automatically verify if the timestamp precision matches. If it does not match, a Pipe error will be returned. | No | `ms` | - -#### 2.4.3 Examples - -```Shell -# Valid Example -> tools/import-data.sh -ft tsfile -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 - -s /path/sql -os mv -of cp -sd /path/success/dir -fd /path/failure/dir - -tn 8 -tz +08:00 -tp ms - -# Error Example -> tools/import-data.sh -ft tsfile -s /path/sql -os mv -of cp - -fd /path/failure/dir -tn 8 -error: Missing option --success_dir (or -sd) when --on_success is 'mv' or 'cp' - -> tools/import-data.sh -ft tsfile -s /path/sql -os mv -of cp - -sd /path/success/dir -fd /path/failure/dir -tn 0 -error: Invalid thread number '0'. Please set a positive integer. - -# Note: Before version V2.0.6, the default value for the -pw parameter was root. -``` - - -## 3. TsFile Auto-Loading - -This feature enables IoTDB to automatically monitor a specified directory for new TsFiles and load them into the database without manual intervention. - -![](/img/Data-import2.png) - -### 3.1 Configuration - -Add the following parameters to `iotdb-system.properties` (template: `iotdb-system.properties.template`): - -| Parameter | Description | Value Range | Required | Default | Hot-Load? | -| ---------------------------------------------------- |---------------------------------------------------------------------------------------| --------------------------------- | ---------- | ----------------------------- | ----------------------- | -| `load_active_listening_enable` | Enable auto-loading. | `true`/`false` | Optional | `true` | Yes | -| `load_active_listening_dirs` | Directories to monitor (subdirectories included). Multiple paths separated by commas. | String | Optional | `ext/load/pending` | Yes | -| `load_active_listening_fail_dir` | Directory to store failed TsFiles. Only can set one. 
| String | Optional | `ext/load/failed` | Yes | -| `load_active_listening_max_thread_num` | Maximum Threads for TsFile Loading Tasks:The default value for this parameter, when commented out, is max(1, CPU cores / 2). If the value set by the user falls outside the range [1, CPU cores / 2], it will be reset to the default value of max(1, CPU cores / 2). | `1` to `Long.MAX_VALUE` | Optional | `max(1, CPU_CORES / 2)` | No (restart required) | -| `load_active_listening_check_interval_seconds` | Active Listening Polling Interval (in seconds):The active listening feature for TsFiles is implemented through polling the target directory. This configuration specifies the time interval between two consecutive checks of the `load_active_listening_dirs`. After each check, the next check will be performed after `load_active_listening_check_interval_seconds` seconds. If the polling interval set by the user is less than 1, it will be reset to the default value of 5 seconds. | `1` to `Long.MAX_VALUE` | Optional | `5` | No (restart required) | - -### 3.2 Notes - -1. ​​**Mods Files**​: If TsFiles have associated `.mods` files, move `.mods` files to the monitored directory ​**before** their corresponding TsFiles. Ensure `.mods` and TsFiles are in the same directory. -2. ​​**Restricted Directories**​: Do NOT set Pipe receiver directories, data directories, or other system paths as monitored directories. -3. ​​**Directory Conflicts**​: Ensure `load_active_listening_fail_dir` does not overlap with `load_active_listening_dirs` or its subdirectories. -4. ​​**Permissions**​: The monitored directory must have write permissions. Files are deleted after successful loading; insufficient permissions may cause duplicate loading. - -## 4. Load SQL - -IoTDB supports importing one or multiple TsFile files containing time series into another running IoTDB instance directly via SQL execution through the CLI. 
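Because `load` is an ordinary SQL statement, it can presumably also be issued from a script through the CLI's `-e` option. The following is a minimal sketch under that assumption, using the statement syntax described in section 4.1. The function names `build_load_sql` and `load_tsfile` are hypothetical helpers, the connection values are placeholders, and the helper defaults `on-success` to `none` to keep the source file, unlike the statement's own default of `delete`.

```shell
#!/bin/bash

# Hypothetical helper: build a LOAD statement for one TsFile or directory.
# $1: path to a TsFile or a folder containing TsFiles
# $2: on-success action, delete or none (this sketch defaults to none)
build_load_sql() {
  local path="$1"
  local on_success="${2:-none}"
  printf '%s' "load '${path}' with ('on-success'='${on_success}')"
}

# Hypothetical helper: run the statement through the CLI in non-interactive mode.
# -e is passed last, as the CLI requires. Connection values are placeholders.
load_tsfile() {
  bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 \
    -e "$(build_load_sql "$@")"
}

# Example: load_tsfile /home/dump1.tsfile none
```

Further attributes such as `verify` or `async` could be appended to the `with (...)` clause in the same way.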
- -### 4.1 Command - -```SQL -load '' with ( - 'attribute-key1'='attribute-value1', - 'attribute-key2'='attribute-value2', -) -``` - -* `` : The path to a TsFile or a folder containing multiple TsFiles. -* ``: Optional parameters, as described below. - -| Key | Key Description | Value Type | Value Range | Value is Required | Default Value | -|--------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------|--------------------------------|-------------------|----------------------------| -| `database-level` | When the database corresponding to the TsFile does not exist, the database hierarchy level can be specified via the ` database-level` parameter. The default is the level set in `iotdb-common.properties`. For example, setting level=1 means the prefix path of level 1 in all time series in the TsFile will be used as the database. | Integer | `[1: Integer.MAX_VALUE]` | No | 1 | -| `on-success` | Action for successfully loaded TsFiles: `delete` (delete the TsFile after successful import) or `none` (retain the TsFile in the source folder). | String | `delete / none` | No | delete | -| `model` | Specifies whether the TsFile uses the `table` model or `tree` model. This parameter becomes invalid starting from V2.0.2.1. The system automatically identifies whether the data model is tree-based or table-based. | String | `tree / table` | No | Aligns with `-sql_dialect` | -| `database-name` | Table model only: Target database for import. Automatically created if it does not exist. The database-name must not include the `root.` prefix (an error will occur if included). 
| String | `-` | No | null | -| `convert-on-type-mismatch` | Whether to perform type conversion during loading if data types in the TsFile mismatch the target schema. | Boolean | `true / false` | No | true | -| `verify` | Whether to validate the schema before loading the TsFile. | Boolean | `true / false` | No | true | -| `tablet-conversion-threshold` | Size threshold (in bytes) for converting TsFiles into tablet format during loading. Default: `-1` (no conversion for any TsFile). | Integer | `[-1,0 :`​`Integer.MAX_VALUE]` | No | -1 | -| `async` | Whether to enable asynchronous loading. If enabled, TsFiles are moved to an active-load directory and loaded into the `database-name` asynchronously. | Boolean | `true / false` | No | false | - -### 4.2 Example - -```SQL --- Before import -IoTDB> show databases -+-------------+-----------------------+---------------------+-------------------+---------------------+ -| Database|SchemaReplicationFactor|DataReplicationFactor|TimePartitionOrigin|TimePartitionInterval| -+-------------+-----------------------+---------------------+-------------------+---------------------+ -|root.__system| 1| 1| 0| 604800000| -+-------------+-----------------------+---------------------+-------------------+---------------------+ - --- Import tsfile by excuting load sql -IoTDB> load '/home/dump1.tsfile' with ( 'on-success'='none') -Msg: The statement is executed successfully. 
- --- Verify whether the import was successful -IoTDB> select * from root.testdb.** -+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -| Time|root.testdb.device.model.temperature|root.testdb.device.model.humidity|root.testdb.device.model.status| -+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -|2025-04-17T10:35:47.218+08:00| 22.3| 19.4| true| -+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -``` \ No newline at end of file diff --git a/src/UserGuide/latest/Tools-System/Maintenance-Tool_timecho.md b/src/UserGuide/latest/Tools-System/Maintenance-Tool_timecho.md deleted file mode 100644 index 5b36a9294..000000000 --- a/src/UserGuide/latest/Tools-System/Maintenance-Tool_timecho.md +++ /dev/null @@ -1,960 +0,0 @@ - -# Cluster Management Tool - -## 1. IoTDB-OpsKit - -The IoTDB OpsKit is an easy-to-use operation and maintenance tool (enterprise version tool). -It is designed to solve the operation and maintenance problems of multiple nodes in the IoTDB distributed system. -It mainly includes cluster deployment, cluster start and stop, elastic expansion, configuration update, data export and other functions, enabling one-click command issuance for complex database clusters and greatly reducing management difficulty. -This document explains how to remotely deploy, configure, start and stop IoTDB cluster instances with the cluster management tool. - -### 1.1 Environment dependencies - -This tool is a supporting tool for TimechoDB (Enterprise Edition based on IoTDB). You can contact your sales representative to obtain the tool download method. - -The machine where IoTDB is to be deployed requires JDK 8 or above, lsof, netstat, and unzip. If any of these is missing, please install it yourself. 
You can refer to the installation commands required for the environment in the last section of the document. - -Tip: The IoTDB cluster management tool requires an account with root privileges - -### 1.2 Deployment method - -#### Download and install - -This tool is a supporting tool for TimechoDB(Enterprise Edition based on IoTDB). You can contact your salesperson to obtain the tool download method. - -Note: Since the binary package only supports GLIBC2.17 and above, the minimum version is Centos7. - -* After entering the following commands in the iotdb-opskit directory: - -```bash -bash install-iotdbctl.sh -``` - -The iotdbctl keyword can be activated in the subsequent shell, such as checking the environment instructions required before deployment as follows: - -```bash -iotdbctl cluster check example -``` - -* You can also directly use <iotdbctl absolute path>/sbin/iotdbctl without activating iotdbctl to execute commands, such as checking the environment required before deployment: - -```bash -/sbin/iotdbctl cluster check example -``` - -### 1.3 Introduction to cluster configuration files - -* There is a cluster configuration yaml file in the `iotdbctl/config` directory. The yaml file name is the cluster name. There can be multiple yaml files. In order to facilitate users to configure yaml files, a `default_cluster.yaml` example is provided under the iotdbctl/config directory. -* The yaml file configuration consists of five major parts: `global`, `confignode_servers`, `datanode_servers`, `grafana_server`, and `prometheus_server` -* `global` is a general configuration that mainly configures machine username and password, IoTDB local installation files, Jdk configuration, etc. A `default_cluster.yaml` sample data is provided in the `iotdbctl/config` directory, - Users can copy and modify it to their own cluster name and refer to the instructions inside to configure the IoTDB cluster. 
In the `default_cluster.yaml` sample, all uncommented items are required, and those that have been commented are non-required. - -例如要执行`default_cluster.yaml`检查命令则需要执行命令`iotdbctl cluster check default_cluster`即可, -更多详细命令请参考下面命令列表。 - - -| parameter name | parameter describe | required | -|-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------| -| iotdb\_zip\_dir | IoTDB deployment distribution directory, if the value is empty, it will be downloaded from the address specified by `iotdb_download_url` | NO | -| iotdb\_download\_url | IoTDB download address, if `iotdb_zip_dir` has no value, download from the specified address | NO | -| jdk\_tar\_dir | jdk local directory, you can use this jdk path to upload and deploy to the target node. | NO | -| jdk\_deploy\_dir | jdk remote machine deployment directory, jdk will be deployed to this directory, and the following `jdk_dir_name` parameter forms a complete jdk deployment directory, that is, `/` | NO | -| jdk\_dir\_name | The directory name after jdk decompression defaults to jdk_iotdb | NO | -| iotdb\_lib\_dir | The IoTDB lib directory or the IoTDB lib compressed package only supports .zip format and is only used for IoTDB upgrade. It is in the comment state by default. If you need to upgrade, please open the comment and modify the path. If you use a zip file, please use the zip command to compress the iotdb/lib directory, such as zip -r lib.zip apache-iotdb-1.2.0/lib/* d | NO | -| user | User name for ssh login deployment machine | YES | -| password | The password for ssh login. 
If no password is specified, pkey is used to log in; in that case, please ensure that key-based ssh login between nodes has been configured. | NO | -| pkey | Key login: If password has a value, password is used first, otherwise pkey is used to log in. | NO | -| ssh\_port | ssh port | YES | -| deploy\_dir | IoTDB deployment directory, IoTDB will be deployed to this directory and the following `iotdb_dir_name` parameter will form a complete IoTDB deployment directory, that is, `/` | YES | -| iotdb\_dir\_name | The directory name after decompression of IoTDB is iotdb by default. | NO | -| datanode-env.sh | Corresponding to `iotdb/config/datanode-env.sh`, when `global` and `datanode_servers` are configured at the same time, the value in `datanode_servers` is used first | NO | -| confignode-env.sh | Corresponding to `iotdb/config/confignode-env.sh`, when `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first | NO | -| iotdb-system.properties | Corresponds to `/config/iotdb-system.properties` | NO | -| cn\_internal\_address | The cluster configuration address points to the surviving ConfigNode, and it points to confignode_x by default. When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first, corresponding to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_internal\_address | The cluster configuration address points to the surviving ConfigNode, and points to confignode_x by default. When configuring values for `global` and `datanode_servers` at the same time, the value in `datanode_servers` is used first, corresponding to `dn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | - -Among them, datanode-env.sh and confignode-env.sh can be configured with the extra parameter extra_opts. When this parameter is configured, the corresponding values are appended to the end of datanode-env.sh and confignode-env.sh. 
Refer to default_cluster.yaml for configuration examples as follows: -datanode-env.sh: -extra_opts: | -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC" -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200" - -* `confignode_servers` is the configuration for deploying IoTDB ConfigNodes, in which multiple ConfigNodes can be configured. - By default, the first started ConfigNode node node1 is regarded as the Seed-ConfigNode - -| parameter name | parameter description | required | -|-----------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------| -| name | Confignode name | YES | -| deploy\_dir | IoTDB config node deployment directory | YES | -| cn\_internal\_address | Internal communication address, corresponding to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| cn_internal_address | The cluster configuration address points to the surviving ConfigNode, and it points to confignode_x by default. When `global` and `confignode_servers` are configured at the same time, the value in `confignode_servers` is used first, corresponding to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| cn\_internal\_port | Internal communication port, corresponding to `cn_internal_port` in `iotdb/config/iotdb-system.properties` | YES | -| cn\_consensus\_port | Corresponds to `cn_consensus_port` in `iotdb/config/iotdb-system.properties` | NO | -| cn\_data\_dir | Corresponds to `cn_data_dir` in `iotdb/config/iotdb-system.properties` | YES | -| iotdb-system.properties | Corresponding to `iotdb/config/iotdb-system.properties`, when configuring values in `global` and `confignode_servers` at the same time, the value in `confignode_servers` will be used first. 
| NO | - -* datanode_servers 是部署IoTDB Datanodes配置,里面可以配置多个Datanode - -| parameter name | parameter describe | required | -|-------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------| -| name | Datanode name | YES | -| deploy\_dir | IoTDB data node deployment directory | YES | -| dn\_rpc\_address | The datanode rpc address corresponds to `dn_rpc_address` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_internal\_address | Internal communication address, corresponding to `dn_internal_address` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_seed\_config\_node | The cluster configuration address points to the surviving ConfigNode, and points to confignode_x by default. When configuring values for `global` and `datanode_servers` at the same time, the value in `datanode_servers` is used first, corresponding to `dn_seed_config_node` in `iotdb/config/iotdb-system.properties`. | YES | -| dn\_rpc\_port | Datanode rpc port address, corresponding to `dn_rpc_port` in `iotdb/config/iotdb-system.properties` | YES | -| dn\_internal\_port | Internal communication port, corresponding to `dn_internal_port` in `iotdb/config/iotdb-system.properties` | YES | -| iotdb-system.properties | Corresponding to `iotdb/config/iotdb-system.properties`, when configuring values in `global` and `datanode_servers` at the same time, the value in `datanode_servers` will be used first. 
| NO | - -* `grafana_server` is the configuration related to deploying Grafana - -| parameter name | parameter description | required | -|--------------------|-------------------------------------------------------------|-----------| -| grafana\_dir\_name | Grafana decompression directory name (default grafana_iotdb) | NO | -| host | IP of the server where Grafana is deployed | YES | -| grafana\_port | The port of grafana deployment machine, default 3000 | NO | -| deploy\_dir | grafana deployment server directory | YES | -| grafana\_tar\_dir | Grafana compressed package location | YES | -| dashboards | dashboards directory | NO | - -* `prometheus_server` is the configuration related to deploying Prometheus - -| parameter name | parameter description | required | -|--------------------------------|----------------------------------------------------|----------| -| prometheus\_dir\_name | prometheus decompression directory name, default prometheus_iotdb | NO | -| host | IP of the server where Prometheus is deployed | YES | -| prometheus\_port | The port of prometheus deployment machine, default 9090 | NO | -| deploy\_dir | prometheus deployment server directory | YES | -| prometheus\_tar\_dir | prometheus compressed package path | YES | -| storage\_tsdb\_retention\_time | The number of days to save data is 15 days by default | NO | -| storage\_tsdb\_retention\_size | The data size that can be saved by the specified block defaults to 512M. Please note the units are KB, MB, GB, TB, PB, and EB. | NO | - -If metrics are configured in `iotdb-system.properties` and `iotdb-system.properties` of config/xxx.yaml, the configuration will be automatically put into Prometheus without manual modification. - -Note: If the value of a yaml key contains special characters (such as `:`), it is recommended to wrap the entire value in double quotes. Also avoid file paths that contain spaces, to prevent parsing problems. 
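The copy-and-rename workflow described in this section (the yaml file name is the cluster name, and `default_cluster.yaml` is the sample to copy) can be sketched as a pair of small shell helpers. This is a minimal sketch, not part of the tool: the function names `cluster_name_from_yaml` and `new_cluster_config` and the cluster name `my_cluster` are hypothetical, and only `cp` and `basename` are used.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive the cluster name that iotdbctl expects
# from a yaml file path (the yaml file name is the cluster name).
cluster_name_from_yaml() {
  basename "$1" .yaml
}

# Hypothetical helper: copy the sample config under a new cluster name
# and print the path of the new file, which can then be edited.
new_cluster_config() {
  local config_dir="$1" new_name="$2"
  cp "${config_dir}/default_cluster.yaml" "${config_dir}/${new_name}.yaml"
  echo "${config_dir}/${new_name}.yaml"
}

# Example usage (assumes the iotdbctl/config directory from this section):
#   new_cluster_config iotdbctl/config my_cluster
#   iotdbctl cluster check my_cluster
```

After editing the copied file, the new name is what gets passed to `iotdbctl cluster check` and the other commands in place of `default_cluster`.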
- -### 1.4 Usage scenarios - -#### Clean data - -* The clean cluster data scenario deletes the data directory in the IoTDB cluster and the `cn_system_dir`, `cn_consensus_dir`, - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories configured in the yaml file. -* First execute the stop cluster command, and then execute the cluster cleanup command. - -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster clean default_cluster -``` - -#### Cluster destruction - -* The cluster destruction scenario will delete `data`, `cn_system_dir`, `cn_consensus_dir`, in the IoTDB cluster - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, `ext`, `IoTDB` deployment directory, - grafana deployment directory and prometheus deployment directory. -* First execute the stop cluster command, and then execute the cluster destruction command. - - -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster destroy default_cluster -``` - -#### Cluster upgrade - -* To upgrade the cluster, you first need to configure `iotdb_lib_dir` in config/xxx.yaml as the directory path where the jar to be uploaded to the server is located (for example, iotdb/lib). -* If you use zip files to upload, please use the zip command to compress the iotdb/lib directory, such as zip -r lib.zip apache-iotdb-1.2.0/lib/* -* Execute the upload command and then execute the restart IoTDB cluster command to complete the cluster upgrade. - -```bash -iotdbctl cluster dist-lib default_cluster -iotdbctl cluster restart default_cluster -``` - -#### Hot deployment - -* First modify the configuration in config/xxx.yaml. 
-* Execute the distribution command, and then execute the hot deployment command to complete the hot deployment of the cluster configuration - -```bash -iotdbctl cluster dist-conf default_cluster -iotdbctl cluster reload default_cluster -``` - -#### Cluster expansion - -* First modify and add a datanode or confignode node in config/xxx.yaml. -* Execute the cluster expansion command - -```bash -iotdbctl cluster scaleout default_cluster -``` - -#### Cluster shrink - -* First find the node name or ip+port to shrink in config/xxx.yaml (where confignode port is cn_internal_port, datanode port is rpc_port) -* Execute cluster shrink command - -```bash -iotdbctl cluster scalein default_cluster -``` - -#### Using cluster management tools to manipulate existing IoTDB clusters - -* Configure the server's `user`, `password` or `pkey`, `ssh_port` -* Modify the IoTDB deployment path in config/xxx.yaml, `deploy_dir` (IoTDB deployment directory), `iotdb_dir_name` (IoTDB decompression directory name, the default is iotdb) - For example, if the full path of IoTDB deployment is `/home/data/apache-iotdb-1.1.1`, you need to modify the yaml files `deploy_dir:/home/data/` and `iotdb_dir_name:apache-iotdb-1.1.1` -* If the server is not using java_home, modify `jdk_deploy_dir` (jdk deployment directory) and `jdk_dir_name` (the directory name after jdk decompression, the default is jdk_iotdb). If java_home is used, there is no need to modify the configuration. - For example, the full path of jdk deployment is `/home/data/jdk_1.8.2`, you need to modify the yaml files `jdk_deploy_dir:/home/data/`, `jdk_dir_name:jdk_1.8.2` -* Configure `cn_internal_address`, `dn_internal_address` -* Configure `cn_internal_address`, `cn_internal_port`, `cn_consensus_port`, `cn_system_dir`, in `iotdb-system.properties` in `confignode_servers` - If the values in `cn_consensus_dir` and `iotdb-system.properties` are not the default for IoTDB, they need to be configured, otherwise there is no need to configure them. 
-* Configure `dn_rpc_address`, `dn_internal_address`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir` in `iotdb-system.properties` -* Execute initialization command - -```bash -iotdbctl cluster init default_cluster -``` - -#### Deploy IoTDB, Grafana and Prometheus - -* Configure `iotdb-system.properties` to open the metrics interface -* Configure the Grafana configuration. If there are multiple `dashboards`, separate them with commas. The names cannot be repeated or they will be overwritten. -* Configure the Prometheus configuration. If the IoTDB cluster is configured with metrics, there is no need to manually modify the Prometheus configuration. The Prometheus configuration will be automatically modified according to which node is configured with metrics. -* Start the cluster - -```bash -iotdbctl cluster start default_cluster -``` - -For more detailed parameters, please refer to the cluster configuration file introduction above - -### 1.5 Command - -The basic usage of this tool is: -```bash -iotdbctl cluster [params (Optional)] -``` -* key indicates a specific command. - -* cluster name indicates the cluster name (that is, the name of the yaml file in the `iotdbctl/config` file). - -* params indicates the required parameters of the command (optional). 
- -* For example, the command format to deploy the default_cluster cluster is: - -```bash -iotdbctl cluster deploy default_cluster -``` - -* The functions and parameters of the cluster are listed as follows: - -| command | description | parameter | -|-----------------|-----------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| check | check whether the cluster can be deployed | Cluster name list | -| clean | cleanup-cluster | cluster-name | -| deploy/dist-all | deploy cluster | Cluster name, -N, module name (optional for iotdb, grafana, prometheus), -op force (optional) | -| list | cluster status list | None | -| start | start cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional) | -| stop | stop cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional), -op force (nodename, grafana, prometheus optional) | -| restart | restart cluster | Cluster name, -N, node name (nodename, grafana, prometheus optional), -op force (nodename, grafana, prometheus optional) | -| show | view cluster information. The details field indicates the details of the cluster information. 
| Cluster name, details (optional) | -| destroy | destroy cluster | Cluster name, -N, module name (iotdb, grafana, prometheus optional) | -| scaleout | cluster expansion | Cluster name | -| scalein | cluster shrink | Cluster name, -N, cluster node name or cluster node ip+port | -| reload | hot loading of cluster configuration files | Cluster name | -| dist-conf | cluster configuration file distribution | Cluster name | -| dumplog | Back up specified cluster logs | Cluster name, -N, cluster node name -h Back up to target machine ip -pw Back up to target machine password -p Back up to target machine port -path Backup directory -startdate Start time -enddate End time -loglevel Log type -l transfer speed | -| dumpdata | Backup cluster data | Cluster name, -h backup to target machine ip -pw backup to target machine password -p backup to target machine port -path backup directory -startdate start time -enddate end time -l transmission speed | -| dist-lib | lib package upgrade | Cluster name | -| init | When an existing cluster uses the cluster deployment tool, initialize the cluster configuration | Cluster name | -| status | View process status | Cluster name | -| activate | Activate cluster | Cluster name | -| health_check | health check | Cluster name, -N, nodename (optional) | -| backup | Back up cluster | Cluster name, -N nodename (optional) | -| importschema | Import schema | Cluster name, -N nodename -param parameters | -| exportschema | Export schema | Cluster name, -N nodename -param parameters | - - - -### 1.6 Detailed command execution process - -The following commands are executed using default_cluster.yaml as an example, and users can modify them to their own cluster files to execute - -#### Check cluster deployment environment commands - -```bash -iotdbctl cluster check default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* 
Verify that the target node is able to log in via SSH - -* Verify whether the JDK version on the corresponding node meets IoTDB jdk1.8 and above, and whether the server is installed with unzip, lsof, and netstat. - -* If you see the following prompt `Info:example check successfully!`, it proves that the server has already met the installation requirements. - If `Error:example check fail!` is output, it proves that some conditions do not meet the requirements. You can check the Error log output above (for example: `Error:Server (ip:172.20.31.76) iotdb port(10713) is listening`) to make repairs. , - If the jdk check does not meet the requirements, we can configure a jdk1.8 or above version in the yaml file ourselves for deployment without affecting subsequent use. - If checking lsof, netstat or unzip does not meet the requirements, you need to install it on the server yourself. - -#### Deploy cluster command - -```bash -iotdbctl cluster deploy default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Upload IoTDB compressed package and jdk compressed package according to the node information in `confignode_servers` and `datanode_servers` (if `jdk_tar_dir` and `jdk_deploy_dir` values ​​are configured in yaml) - -* Generate and upload `iotdb-system.properties` according to the yaml file node configuration information - -```bash -iotdbctl cluster deploy default_cluster -op force -``` - -Note: This command will force the deployment, and the specific process will delete the existing deployment directory and redeploy - -*deploy a single module* -```bash -# Deploy grafana module -iotdbctl cluster deploy default_cluster -N grafana -# Deploy the prometheus module -iotdbctl cluster deploy default_cluster -N prometheus -# Deploy the iotdb module -iotdbctl cluster deploy default_cluster -N iotdb -``` - -#### Start cluster command - -```bash -iotdbctl 
cluster start default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Start the confignodes sequentially according to the order in `confignode_servers` in the yaml configuration file and check whether each confignode is normal according to the process id. The first confignode is the Seed-ConfigNode - -* Start the datanode in sequence according to the order in `datanode_servers` in the yaml configuration file and check whether the datanode is normal according to the process id. - -* After checking the existence of the process according to the process id, check whether each service in the cluster list is normal through the cli. If the cli connection fails, retry every 10s until it succeeds, up to 5 times - - -*Start a single node command* -```bash -#Start according to the IoTDB node name -iotdbctl cluster start default_cluster -N datanode_1 -#Start according to IoTDB cluster ip+port, where port corresponds to cn_internal_port of confignode and rpc_port of datanode. -iotdbctl cluster start default_cluster -N 192.168.1.5:6667 -#Start grafana -iotdbctl cluster start default_cluster -N grafana -#Start prometheus -iotdbctl cluster start default_cluster -N prometheus -``` - -* Find the yaml file in the default location based on cluster-name - -* Find the node location information based on the provided node name or ip:port. If the started node is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file. 
- If the started node is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - -* start the node - -Note: Since the cluster deployment tool only calls the start-confignode.sh and start-datanode.sh scripts in the IoTDB cluster, -When the actual output result fails, it may be that the cluster has not started normally. It is recommended to use the status command to check the current cluster status (iotdbctl cluster status xxx) - - -#### View IoTDB cluster status command - -```bash -iotdbctl cluster show default_cluster -#View IoTDB cluster details -iotdbctl cluster show default_cluster details -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Execute `show cluster details` through cli on datanode in turn. If one node is executed successfully, it will not continue to execute cli on subsequent nodes and return the result directly. - -#### Stop cluster command - - -```bash -iotdbctl cluster stop default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* According to the datanode node information in `datanode_servers`, stop the datanode nodes in order according to the configuration. 
- -* Based on the confignode node information in `confignode_servers`, stop the confignode nodes in sequence according to the configuration - -*force stop cluster command* - -```bash -iotdbctl cluster stop default_cluster -op force -``` -Will directly execute the kill -9 pid command to forcibly stop the cluster - -*Stop single node command* - -```bash -#Stop by IoTDB node name -iotdbctl cluster stop default_cluster -N datanode_1 -#Stop according to IoTDB cluster ip+port (ip+port is to get the only node according to ip+dn_rpc_port in datanode or ip+cn_internal_port in confignode to get the only node) -iotdbctl cluster stop default_cluster -N 192.168.1.5:6667 -#Stop grafana -iotdbctl cluster stop default_cluster -N grafana -#Stop prometheus -iotdbctl cluster stop default_cluster -N prometheus -``` - -* Find the yaml file in the default location based on cluster-name - -* Find the corresponding node location information based on the provided node name or ip:port. If the stopped node is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file. - If the stopped node is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - -* stop the node - -Note: Since the cluster deployment tool only calls the stop-confignode.sh and stop-datanode.sh scripts in the IoTDB cluster, in some cases the iotdb cluster may not be stopped. - - -#### Clean cluster data command - -```bash -iotdbctl cluster clean default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Based on the information in `confignode_servers` and `datanode_servers`, check whether there are still services running, - If any service is running, the cleanup command will not be executed. 
- -* Delete the data directory in the IoTDB cluster and the `cn_system_dir`, `cn_consensus_dir`, configured in the yaml file - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories. - - - -#### Restart cluster command - -```bash -iotdbctl cluster restart default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` - -* Execute the above stop cluster command (stop), and then execute the start cluster command (start). For details, refer to the above start and stop commands. - -*Force restart cluster command* - -```bash -iotdbctl cluster restart default_cluster -op force -``` -Will directly execute the kill -9 pid command to force stop the cluster, and then start the cluster - - -*Restart a single node command* - -```bash -#Restart datanode_1 according to the IoTDB node name -iotdbctl cluster restart default_cluster -N datanode_1 -#Restart confignode_1 according to the IoTDB node name -iotdbctl cluster restart default_cluster -N confignode_1 -#Restart grafana -iotdbctl cluster restart default_cluster -N grafana -#Restart prometheus -iotdbctl cluster restart default_cluster -N prometheus -``` - -#### Cluster shrink command - -```bash -#Scale down by node name -iotdbctl cluster scalein default_cluster -N nodename -#Scale down according to ip+port (ip+port obtains the only node according to ip+dn_rpc_port in datanode, and obtains the only node according to ip+cn_internal_port in confignode) -iotdbctl cluster scalein default_cluster -N ip:port -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Determine whether there is only one confignode node and datanode to be reduced. If there is only one left, the reduction cannot be performed. 
- -* Then get the information of the node to shrink according to ip:port or nodename, execute the shrink command, and then destroy the node directory. If the shrinking node is `data_node`, the ip uses `dn_rpc_address` in the yaml file, and the port uses `dn_rpc_port` in datanode_servers in the yaml file. - If the shrinking node is `config_node`, the ip uses `cn_internal_address` in confignode_servers in the yaml file, and the port uses `cn_internal_port` - - -Tip: Currently, only one node can be scaled in at a time - -#### Cluster expansion command - -```bash -iotdbctl cluster scaleout default_cluster -``` -* Modify the config/xxx.yaml file to add a datanode node or confignode node - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Find the node to be expanded, upload the IoTDB compressed package and jdk package (if the `jdk_tar_dir` and `jdk_deploy_dir` values are configured in yaml) and decompress them - -* Generate and upload `iotdb-system.properties` according to the yaml file node configuration information - -* Execute the command to start the node and verify whether the node is started successfully - -Tip: Currently, only one node can be scaled out at a time - -#### Destroy cluster command -```bash -iotdbctl cluster destroy default_cluster -``` - -* Find the yaml file in the default location according to cluster-name - -* Check whether any node is still running based on the node information in `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus`. 
- Stop the destroy command if any node is running - -* Delete `data` in the IoTDB cluster and `cn_system_dir`, `cn_consensus_dir` configured in the yaml file - `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, `ext`, `IoTDB` deployment directory, - grafana deployment directory and prometheus deployment directory - -*Destroy a single module* - -```bash -# Destroy grafana module -iotdbctl cluster destroy default_cluster -N grafana -# Destroy prometheus module -iotdbctl cluster destroy default_cluster -N prometheus -# Destroy iotdb module -iotdbctl cluster destroy default_cluster -N iotdb -``` - -#### Distribute cluster configuration commands - -```bash -iotdbctl cluster dist-conf default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` - -* Generate and upload `iotdb-system.properties` to the specified node according to the node configuration information of the yaml file - -#### Hot load cluster configuration command - -```bash -iotdbctl cluster reload default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Execute `load configuration` in the cli according to the node configuration information of the yaml file. 
- -#### Cluster node log backup -```bash -iotdbctl cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs' -``` - -* Find the yaml file in the default location based on cluster-name - -* This command will verify the existence of datanode_1 and confignode_1 according to the yaml file, and then back up the log data of the specified node datanode_1 and confignode_1 to the specified service `192.168.9.48` port 36000 according to the configured start and end dates (startdate<=logtime<=enddate) The data backup path is `/iotdb/logs`, and the IoTDB log storage path is `/root/data/db/iotdb/logs` (not required, if you do not fill in -logs xxx, the default is to backup logs from the IoTDB installation path /logs ) - -| command | description | required | -|------------|-------------------------------------------------------------------------|----------| -| -h | backup data server ip | NO | -| -u | backup data server username | NO | -| -pw | backup data machine password | NO | -| -p | backup data machine port(default 22) | NO | -| -path | path to backup data (default current path) | NO | -| -loglevel | Log levels include all, info, error, warn (default is all) | NO | -| -l | speed limit (default 1024 speed limit range 0 to 104857601 unit Kbit/s) | NO | -| -N | multiple configuration file cluster names are separated by commas. 
| YES | -| -startdate | start time (including default 1970-01-01) | NO | -| -enddate | end time (included) | NO | -| -logs | IoTDB log storage path, the default is ({iotdb}/logs)) | NO | - -#### Cluster data backup -```bash -iotdbctl cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas' -``` -* This command will obtain the leader node based on the yaml file, and then back up the data to the /iotdb/datas directory on the 192.168.9.48 service based on the start and end dates (startdate<=logtime<=enddate) - -| command | description | required | -|--------------|-------------------------------------------------------------------------|----------| -| -h | backup data server ip | NO | -| -u | backup data server username | NO | -| -pw | backup data machine password | NO | -| -p | backup data machine port(default 22) | NO | -| -path | path to backup data (default current path) | NO | -| -granularity | partition | YES | -| -l | speed limit (default 1024 speed limit range 0 to 104857601 unit Kbit/s) | NO | -| -startdate | start time (including default 1970-01-01) | YES | -| -enddate | end time (included) | YES | - -#### Cluster upgrade -```bash -iotdbctl cluster dist-lib default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers` and `datanode_servers` - -* Upload lib package - -Note that after performing the upgrade, please restart IoTDB for it to take effect. 
- -#### Cluster initialization -```bash -iotdbctl cluster init default_cluster -``` -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` -* Initialize cluster configuration - -#### View cluster process status -```bash -iotdbctl cluster status default_cluster -``` - -* Find the yaml file in the default location according to cluster-name and obtain the configuration information of `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` -* Display the survival status of each node in the cluster - -#### Cluster authorization activation - -By default, the cluster is activated by entering an activation code; it can also be activated through a license path by using `-op license_path` - -* Default activation method -```bash -iotdbctl cluster activate default_cluster -``` -* Find the yaml file in the default location based on `cluster-name` and obtain the `confignode_servers` configuration information -* Obtain the machine code -* Wait for activation code input - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* Activate a node - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -``` - -* Activate through license path - -```bash -iotdbctl cluster activate default_cluster -op license_path -``` -* Find the yaml file in the default location based on `cluster-name` and obtain the `confignode_servers` configuration information -* Obtain the machine code -* Wait for activation code input - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== 
-JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* Activate a node - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -op license_path -``` - -#### Cluster Health Check -```bash -iotdbctl cluster health_check default_cluster -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve confignode_servers and datanode_servers configuration information. -* Execute health_check.sh on each node. -* Single Node Health Check -```bash -iotdbctl cluster health_check default_cluster -N datanode_1 -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute health_check.sh on datanode1. - -#### Cluster Shutdown Backup - -```bash -iotdbctl cluster backup default_cluster -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve confignode_servers and datanode_servers configuration information. -* Execute backup.sh on each node - -* Single Node Backup - -```bash -iotdbctl cluster backup default_cluster -N datanode_1 -``` - -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute backup.sh on datanode1. -Note: Multi-node deployment on a single machine only supports quick mode. - -#### Cluster Metadata Import -```bash -iotdbctl cluster importschema default_cluster -N datanode1 -param "-s ./dump0.csv -fd ./failed/ -lpf 10000" -``` -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute metadata import with import-schema.sh on datanode1. 
-* Parameters for -param are as follows: - -| command | description | required | -|------------|-------------------------------------------------------------------------|----------| -| -s | Specify the data file to be imported. You can specify a file or a directory. If a directory is specified, all files with a .csv extension in the directory will be imported in bulk. | YES | -| -fd | Specify a directory to store failed import files. If this parameter is not specified, failed files will be saved in the source data directory with the extension .failed added to the original filename. | NO | -| -lpf | Specify the number of lines written to each failed import file. The default is 10000.| NO | - -#### Cluster Metadata Export - -```bash -iotdbctl cluster exportschema default_cluster -N datanode1 -param "-t ./ -pf ./pattern.txt -lpf 10 -timeout 10000" -``` - -* Locate the yaml file in the default location based on the cluster-name to retrieve datanode_servers configuration information. -* Execute metadata export with export-schema.sh on datanode1. -* Parameters for -param are as follows: - -| command | description | required | -|-------------|-------------------------------------------------------------------------|----------| -| -t | Specify the output path for the exported CSV file. | YES | -| -path | Specify the path pattern for exporting metadata. If this parameter is specified, the -s parameter will be ignored. Example: root.stock.** | NO | -| -pf | If -path is not specified, this parameter must be specified. It designates the file path containing the metadata paths to be exported, supporting txt file format. Each path to be exported is on a new line.| NO | -| -lpf | Specify the maximum number of lines for the exported dump file. 
The default is 10000.| NO | -| -timeout | Specify the timeout for session queries in milliseconds.| NO | - - - -### 1.7 Introduction to Cluster Deployment Tool Samples - -In the cluster deployment tool installation directory config/example, there are three yaml examples. If necessary, you can copy them to config and modify them. - -| name | description | -|-----------------------------|------------------------------------------------| -| default\_1c1d.yaml | 1 confignode and 1 datanode configuration example | -| default\_3c3d.yaml | 3 confignode and 3 datanode configuration samples | -| default\_3c3d\_grafa\_prome | 3 confignode and 3 datanode, Grafana, Prometheus configuration examples | - - -## 2. IoTDB Data Directory Overview Tool - -IoTDB data directory overview tool is used to print an overview of the IoTDB data directory structure. The location is tools/tsfile/print-iotdb-data-dir. - -### 2.1 Usage - -- For Windows: - -```bash -.\print-iotdb-data-dir.bat () -``` - -- For Linux or MacOs: - -```shell -./print-iotdb-data-dir.sh () -``` - -Note: if the storage path of the output overview file is not set, the default relative path "IoTDB_data_dir_overview.txt" will be used. - -### 2.2 Example - -Use Windows in this example: - -`````````````````````````bash -.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data -```````````````````````` -Starting Printing the IoTDB Data Directory Overview -```````````````````````` -output save path:IoTDB_data_dir_overview.txt -data dir num:1 -143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
-|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -````````````````````````` - -## 3. TsFile Sketch Tool - -TsFile sketch tool is used to print the content of a TsFile in sketch mode. The location is tools/tsfile/print-tsfile. - -### 3.1 Usage - -- For Windows: - -``` -.\print-tsfile-sketch.bat () -``` - -- For Linux or MacOs: - -``` -./print-tsfile-sketch.sh () -``` - -Note: if the storage path of the output sketch file is not set, the default relative path "TsFile_sketch_view.txt" will be used. - -### 3.2 Example - -Use Windows in this example: - -`````````````````````````bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
--------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] 
offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - └──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -````````````````````````` - -Explanations: - -- Separated by "|", the left is the actual position in the TsFile, and the right is the summary content. -- "||||||||||||||||||||" is the guide information added to enhance readability, not the actual data stored in TsFile. -- The last printed "IndexOfTimerseriesIndex Tree" is a reorganization of the metadata index tree at the end of the TsFile, which is convenient for intuitive understanding, and again not the actual data stored in TsFile. - -## 4. TsFile Resource Sketch Tool - -TsFile resource sketch tool is used to print the content of a TsFile resource file. The location is tools/tsfile/print-tsfile-resource-files. 
- -### 4.1 Usage - -- For Windows: - -```bash -.\print-tsfile-resource-files.bat -``` - -- For Linux or MacOs: - -``` -./print-tsfile-resource-files.sh -``` - -### 4.2 Example - -Use Windows in this example: - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. 
-````````````````````````` - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. -````````````````````````` diff --git a/src/UserGuide/latest/Tools-System/Monitor-Tool_timecho.md b/src/UserGuide/latest/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index 197ea27e2..000000000 --- a/src/UserGuide/latest/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,200 +0,0 @@ - - -# Monitor Tool - -The deployment of monitoring tools can refer to the document [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) section. - -## 1. 
Prometheus - -### 1.1 The mapping from metric type to prometheus format - -> For metrics whose Metric Name is name and Tags are K1=V1, ..., Kn=Vn, the mapping is as follows, where value is a -> specific value - -| Metric Type | Mapping | -| ---------------- | ------------------------------------------------------------ | -| Counter | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value | -| AutoGauge、Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value | -| Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.5"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.99"} value | -| Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m1"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m5"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="m15"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", rate="mean"} value | -| Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.5"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", k1="V1", ..., Kn="Vn", quantile="0.99"} value | - -### 1.2 Config File - -1) Taking DataNode as an example, modify the iotdb-system.properties configuration file as follows: - -```properties -dn_metric_reporter_list=PROMETHEUS -dn_metric_level=CORE -dn_metric_prometheus_reporter_port=9091 -``` - -Then you can get metrics data as follows - -2) Start IoTDB DataNodes -3) Open a browser or use ```curl``` to visit ```http://server_ip:9091/metrics```, you can get the following metric - data: - -``` -... -# HELP file_count -# TYPE file_count gauge -file_count{name="wal",} 0.0 -file_count{name="unseq",} 0.0 -file_count{name="seq",} 2.0 -... -``` - -### 1.3 Prometheus + Grafana - -As shown above, IoTDB exposes monitoring metrics data in the standard Prometheus format to the outside world. Prometheus -can be used to collect and store monitoring indicators, and Grafana can be used to visualize monitoring indicators. - -The following picture describes the relationships among IoTDB, Prometheus and Grafana - -![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png) - -1. While running, IoTDB collects its metrics continuously. -2. Prometheus scrapes metrics from IoTDB at a constant interval (can be configured). -3. Prometheus saves these metrics to its internal TSDB. -4. Grafana queries metrics from Prometheus at a constant interval (can be configured) and then presents them on the - graph. - -So, we need to do some additional work to configure and deploy Prometheus and Grafana. 
- -For instance, you can config your Prometheus as follows to get metrics data from IoTDB: - -```yaml -job_name: pull-metrics -honor_labels: true -honor_timestamps: true -scrape_interval: 15s -scrape_timeout: 10s -metrics_path: /metrics -scheme: http -follow_redirects: true -static_configs: - - targets: - - localhost:9091 -``` - -The following documents may help you have a good journey with Prometheus and Grafana. - -[Prometheus getting_started](https://prometheus.io/docs/prometheus/latest/getting_started/) - -[Prometheus scrape metrics](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) - -[Grafana getting_started](https://grafana.com/docs/grafana/latest/getting-started/getting-started/) - -[Grafana query metrics from Prometheus](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus) - -## 2. Apache IoTDB Dashboard - -We introduce the Apache IoTDB Dashboard, designed for unified centralized operations and management. With it, multiple clusters can be monitored through a single panel. - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png) - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png) - - -You can access the Dashboard's Json file in the enterprise edition. - -### 2.1 Cluster Overview - -Including but not limited to: - -- Total cluster CPU cores, memory space, and hard disk space. -- Number of ConfigNodes and DataNodes in the cluster. -- Cluster uptime duration. -- Cluster write speed. -- Current CPU, memory, and disk usage across all nodes in the cluster. -- Information on individual nodes. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png) - - -### 2.2 Data Writing - -Including but not limited to: - -- Average write latency, median latency, and the 99% percentile latency. -- Number and size of WAL files. -- Node WAL flush SyncBuffer latency. 
- -![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png) - -### 2.3 Data Querying - -Including but not limited to: - -- Node query load times for time series metadata. -- Node read duration for time series. -- Node edit duration for time series metadata. -- Node query load time for Chunk metadata list. -- Node edit duration for Chunk metadata. -- Node filtering duration based on Chunk metadata. -- Average time to construct a Chunk Reader. - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png) - -### 2.4 Storage Engine - -Including but not limited to: - -- File count and sizes by type. -- The count and size of TsFiles at various stages. -- Number and duration of various tasks. - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png) - -### 2.5 System Monitoring - -Including but not limited to: - -- System memory, swap memory, and process memory. -- Disk space, file count, and file sizes. -- JVM GC time percentage, GC occurrences by type, GC volume, and heap memory usage across generations. 
-- Network transmission rate, packet sending rate - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png) - -### 2.6 Data Synchronization - -Including but not limited to: - -- Pipe event commit queue size, number of unassigned Pipe events -- Number of unprocessed events in the Source queue, Source event feeding rate, Processor event processing rate -- Number of untransmitted events for all Pipe Sinks/Sources, transmission event rate of Pipe connectors -- Retry queue size and pending handler count of Pipe Sinks; total data size before and after compression and compression duration of Pipe Sinks; batch size and batch interval distribution of Pipe Sinks -- Pipe memory usage and capacity, number of Pipe phantom references, quantity and total size of linked TsFiles, disk bytes read for TsFile transmission via Pipe - -![](/img/monitor-tool-pipe-1-en.png) - -![](/img/monitor-tool-pipe-2-en.png) - -![](/img/monitor-tool-pipe-3-en.png) - -![](/img/monitor-tool-pipe-4-en.png) \ No newline at end of file diff --git a/src/UserGuide/latest/Tools-System/Schema-Export-Tool_timecho.md b/src/UserGuide/latest/Tools-System/Schema-Export-Tool_timecho.md deleted file mode 100644 index e68bfc67b..000000000 --- a/src/UserGuide/latest/Tools-System/Schema-Export-Tool_timecho.md +++ /dev/null @@ -1,85 +0,0 @@ - - -# Schema Export - -## 1. Overview - -The schema export tool `export-schema.sh/bat` is located in the `tools` directory. It can export schema from a specified database in IoTDB to a script file. - -## 2. 
Detailed Functionality - -### 2.1 Parameter - -| **Short Param** | **Full Param** | **Description** | Required | Default | -|------------------|--------------------------| ------------------------------------------------------------------------ | ------------------------------------- |-----------------------------------------------| -| `-h` | `--host` | Hostname | No | 127.0.0.1 | -| `-p` | `--port` | Port number | No | 6667 | -| `-u` | `--username` | Username | No | root | -| `-pw` | `--password` | Password. Hidden input is supported since V2.0.9.1 | No | TimechoDB@2021 (`root` before V2.0.6) | -| `-sql_dialect` | `--sql_dialect` | Specifies whether the server uses the `tree` model or the `table` model | No | tree | -| `-db` | `--database` | Target database to export (only applies when `-sql_dialect=table`) | Required if `-sql_dialect=table` | - | -| `-table` | `--table` | Target table to export (only applies when `-sql_dialect=table`) | No | - | -| `-t` | `--target` | Output directory (created if it doesn't exist) | Yes | | -| `-path` | `--path_pattern` | Path pattern for metadata export | Required if `-sql_dialect=tree` | | -| `-pfn` | `--prefix_file_name` | Output filename prefix | No | `dump_dbname.sql` | -| `-lpf` | `--lines_per_file` | Maximum lines per dump file (only applies when `-sql_dialect=tree`) | No | `10000` | -| `-timeout` | `--query_timeout` | Query timeout in milliseconds (`-1` = no timeout) | No | `-1`. Range: `-1` to Long max (9223372036854775807) | -| `-help` | `--help` | Display help information | No | | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. 
Supported since V2.0.9.1 | No | - | - -### 2.2 Command - -```Bash -Shell -# Unix/OS X -> tools/export-schema.sh [-sql_dialect] -db -table - [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -# Windows -# Before version V2.0.4.x -> tools\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] - -# V2.0.4.x and later versions -> tools\windows\schema\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -``` - -### 2.3 Examples - - -```Bash -# Export schema under root.treedb -./export-schema.sh -sql_dialect tree -t /home/ -path "root.treedb.**" - -# Output -Timeseries,Alias,DataType,Encoding,Compression -root.treedb.device.temperature,,DOUBLE,GORILLA,LZ4 -root.treedb.device.humidity,,DOUBLE,GORILLA,LZ4 -``` \ No newline at end of file diff --git a/src/UserGuide/latest/Tools-System/Schema-Import-Tool_timecho.md b/src/UserGuide/latest/Tools-System/Schema-Import-Tool_timecho.md deleted file mode 100644 index a41e516d4..000000000 --- a/src/UserGuide/latest/Tools-System/Schema-Import-Tool_timecho.md +++ /dev/null @@ -1,90 +0,0 @@ - - -# Schema Import - -## 1. Overview - -The schema import tool `import-schema.sh/bat` is located in `tools` directory. - -## 2. Detailed Functionality - -### 2.1 Parameter - -| **Short Param** | **Full Param** | **Description** | Required | Default | -|-----------------| ------------------------------- |-----------------------------------------------------------------------| ---------- |-------------------------------------------| -| `-h` | `--host` | Hostname | No | 127.0.0.1 | -| `-p` | `--port` | Port number | No | 6667 | -| `-u` | `--username` | Username | No | root | -| `-pw` | `--password` | Password, Supported for hidden input since V2.0.9.1 | No | TimechoDB@2021(Before V2.0.6 it is root) | -| `-sql_dialect` | `--sql_dialect` | Specifies whether the server uses`tree `model or`table `model | No | tree | -| `-db` | `--database` | Target database for import | Yes | - | -| `-table` | `--table` | Target table for import (only applies when`-sql_dialect=table`) | No | - | -| `-s` | `--source` | Local directory path containing script file(s) to import | Yes | | -| `-fd` | `--fail_dir` | Directory to save failed import files | No | | -| `-lpf` | `--lines_per_failed_file` | Maximum lines per failed file (only applies when`-sql_dialect=table`) | No | 100000Range:`0 to 
Integer.Max=2147483647` | -| `-help` | `--help` | Display help information | No | | -| `-usessl` | `--use_ssl` | Use SSL protocol. Supported since V2.0.9.1 | No | - | -| `-ts` | `--trust_store` | Trust store. Supports hidden input. Supported since V2.0.9.1 | No | - | -| `-tpw` | `--trust_store_password` | Trust store password. Supports hidden input. Supported since V2.0.9.1 | No | - | - -### 2.2 Command - -```Bash -# Unix/OS X -tools/import-schema.sh [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# Windows -# Before version V2.0.4.x -tools\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# V2.0.4.x and later versions -tools\windows\schema\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] -``` - -### 2.3 Examples - -```Bash -# Before import -IoTDB> show timeseries root.treedb.** -+----------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -+----------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -# Execution -./import-schema.sh -sql_dialect tree -s /home/dump0_0.csv -db root.treedb - -# Verification -IoTDB> show timeseries root.treedb.** -+------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias| Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.treedb.device.temperature| null|root.treedb| DOUBLE| GORILLA| LZ4|null| null| null| null| BASE| -| root.treedb.device.humidity| null|root.treedb| DOUBLE| GORILLA| LZ4|null| null| null| null| BASE| -+------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+ -``` diff --git a/src/UserGuide/latest/Tools-System/Workbench_timecho.md b/src/UserGuide/latest/Tools-System/Workbench_timecho.md deleted file mode 100644 index cb0d17f1e..000000000 --- a/src/UserGuide/latest/Tools-System/Workbench_timecho.md +++ /dev/null @@ -1,33 +0,0 @@ -# WorkBench - -The deployment of the visualization console can refer to the document [Workbench Deployment](../Deployment-and-Maintenance/workbench-deployment_timecho.md) chapter. - -## 1. 
Product Introduction -IoTDB Visualization Console is an extension component developed for industrial scenarios based on the IoTDB Enterprise Edition time series database. It integrates real-time data collection, storage, and analysis, aiming to provide users with efficient and reliable real-time data storage and query solutions. It features lightweight, high performance, and ease of use, seamlessly integrating with the Hadoop and Spark ecosystems. It is suitable for high-speed writing and complex analytical queries of massive time series data in industrial IoT applications. - -## 2. Instructions for Use -| **Functional Module** | **Functional Description** | -| ---------------------- | ------------------------------------------------------------ | -| Instance Management | Support unified management of connected instances, support creation, editing, and deletion, while visualizing the relationships between multiple instances, helping customers manage multiple database instances more clearly | -| Home | Support viewing the service running status of each node in the database instance (such as activation status, running status, IP information, etc.), support viewing the running monitoring status of clusters, ConfigNodes, and DataNodes, monitor the operational health of the database, and determine if there are any potential operational issues with the instance. | -| Measurement Point List | Support directly viewing the measurement point information in the instance, including database information (such as database name, data retention time, number of devices, etc.), and measurement point information (measurement point name, data type, compression encoding, etc.), while also supporting the creation, export, and deletion of measurement points either individually or in batches. | -| Data Model | Support viewing hierarchical relationships and visually displaying the hierarchical model. 
| -| Data Query | Support interface-based query interactions for common data query scenarios, and enable batch import and export of queried data. | -| Statistical Query | Support interface-based query interactions for common statistical data scenarios, such as outputting results for maximum, minimum, average, and sum values. | -| SQL Operations | Support interactive SQL operations on the database through a graphical user interface, allowing for the execution of single or multiple statements, and displaying and exporting the results. | -| Trend | Support one-click visualization to view the overall trend of data, draw real-time and historical data for selected measurement points, and observe the real-time and historical operational status of the measurement points. | -| Analysis | Support visualizing data through different analysis methods (such as FFT) for visualization. | -| View | Support viewing information such as view name, view description, result measuring points, and expressions through the interface. Additionally, enable users to quickly create, edit, and delete views through interactive interfaces. | -| Data synchronization | Support the intuitive creation, viewing, and management of data synchronization tasks between databases. Enable direct viewing of task running status, synchronized data, and target addresses. Users can also monitor changes in synchronization status in real-time through the interface. | -| Permission management | Support interface-based control of permissions for managing and controlling database user access and operations. | -| Audit logs | Support detailed logging of user operations on the database, including Data Definition Language (DDL), Data Manipulation Language (DML), and query operations. Assist users in tracking and identifying potential security threats, database errors, and misuse behavior. 
| - -Main feature showcase -* Home -![首页.png](/img/%E9%A6%96%E9%A1%B5.png) -* Measurement Point List -![测点列表.png](/img/workbench-en-bxzk.png) -* Data Query -![数据查询.png](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2.png) -* Trend -![历史趋势.png](/img/%E5%8E%86%E5%8F%B2%E8%B6%8B%E5%8A%BF.png) \ No newline at end of file diff --git a/src/UserGuide/latest/User-Manual/Audit-Log_timecho.md b/src/UserGuide/latest/User-Manual/Audit-Log_timecho.md deleted file mode 100644 index f63be18b4..000000000 --- a/src/UserGuide/latest/User-Manual/Audit-Log_timecho.md +++ /dev/null @@ -1,165 +0,0 @@ - - - -# Security Audit - -## 1. Introduction - -Audit logs serve as the record credentials of a database, enabling tracking of various operations (e.g., create, read, update, delete) to ensure information security. The audit log feature in IoTDB supports the following capabilities: - -* Supports enabling/disabling the audit log functionality through configuration -* Supports configuring operation types and privilege levels to be recorded via parameters -* Supports setting the storage duration of audit log files, including time-based rolling (via TTL) and space-based rolling (via SpaceTL) -* Supports configuring parameters to count slow requests (with write/query latency exceeding a threshold, default 3000 milliseconds) within any specified time period -* Audit log files are stored in encrypted format by default - -> Note: This feature is available from version V2.0.8 onwards. - -## 2. Configuration Parameters - -Edit the `iotdb-system.properties` file to enable audit logging using the following parameters: - -* V2.0.8.1 - -| Parameter Name | Description | Data Type | Default Value | Activation Method | -|-------------------------------------------|------------------------------------------------------------------------------------------------------------|-----------|-------------------------------|-------------------| -| `enable_audit_log` | Whether to enable audit logging. true: enabled. 
false: disabled. | Boolean | false | Hot Reload | -| `auditable_operation_type` | Operation type selection. DML: all DML operations are logged; DDL: all DDL operations are logged; QUERY: all query operations are logged; CONTROL: all control statements are logged. | String | DML,DDL,QUERY,CONTROL | Hot Reload | -| `auditable_operation_level` | Permission level selection. global: log all audit events; object: only log events related to data instances. Containment relationship: object < global. For example: when set to global, all audit logs are recorded normally; when set to object, only operations on specific data instances are recorded. | String | global | Hot Reload | -| `auditable_operation_result` | Audit result selection. success: log only successful events; fail: log only failed events | String | success,fail | Hot Reload | -| `audit_log_ttl_in_days` | Audit log TTL (Time To Live). Logs older than this threshold will expire. | Double | -1.0 (never deleted) | Hot Reload | -| `audit_log_space_tl_in_GB` | Audit log SpaceTL. Logs will start rotating when total space reaches this threshold. | Double | 1.0 | Hot Reload | -| `audit_log_batch_interval_in_ms` | Batch write interval for audit logs | Long | 1000 | Hot Reload | -| `audit_log_batch_max_queue_bytes` | Maximum byte size of the queue for batch processing audit logs. Subsequent write operations will be blocked when this threshold is exceeded. | Long | 268435456 | Hot Reload | - -* V2.0.9.2 - -| Parameter Name | Description | Data Type | Default Value | Activation Method | -|-------------------------------------------|------------------------------------------------------------------------------------------------------------|-----------|-------------------------------|-------------------| -| `enable_audit_log` | Whether to enable audit logging. true: enabled. false: disabled. | Boolean | false | Hot Reload | -| `auditable_operation_type` | Operation type selection. 
DML: all DML operations are logged; DDL: all DDL operations are logged; QUERY: all query operations are logged; CONTROL: all control statements are logged. | String | DML,DDL,QUERY,CONTROL | Hot Reload | -| `auditable_dml_event_type` | Event types for auditing DML operations. `OBJECT_AUTHENTICATION`: object authentication, `SLOW_OPERATION`: slow operation | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | Hot Reload | -| `auditable_ddl_event_type` | Event types for auditing DDL operations. `OBJECT_AUTHENTICATION`: object authentication, `SLOW_OPERATION`: slow operation | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | Hot Reload | -| `auditable_query_event_type` | Event types for auditing query operations. `OBJECT_AUTHENTICATION`: object authentication, `SLOW_OPERATION`: slow operation | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | Hot Reload | -| `auditable_control_event_type` | Event types for auditing control operations. `CHANGE_AUDIT_OPTION`: audit option change, `OBJECT_AUTHENTICATION`: object authentication, `LOGIN`: login, `LOGOUT`: logout, `DN_SHUTDOWN`: data node shutdown, `SLOW_OPERATION`: slow operation | String | `CHANGE_AUDIT_OPTION`,`OBJECT_AUTHENTICATION`,`LOGIN`,`LOGOUT`,`DN_SHUTDOWN`,`SLOW_OPERATION` | Hot Reload | -| `auditable_operation_level` | Permission level selection. global: log all audit events; object: only log events related to data instances. Containment relationship: object < global. For example: when set to global, all audit logs are recorded normally; when set to object, only operations on specific data instances are recorded. | String | global | Hot Reload | -| `auditable_operation_result` | Audit result selection. success: log only successful events; fail: log only failed events | String | success,fail | Hot Reload | -| `audit_log_ttl_in_days` | Audit log TTL (Time To Live). Logs older than this threshold will expire. | Double | -1.0 (never deleted) | Hot Reload | -| `audit_log_space_tl_in_GB` | Audit log SpaceTL. 
Logs will start rotating when total space reaches this threshold. | Double | 1.0 | Hot Reload | -| `audit_log_batch_interval_in_ms` | Batch write interval for audit logs | Long | 1000 | Hot Reload | -| `audit_log_batch_max_queue_bytes` | Maximum byte size of the queue for batch processing audit logs. Subsequent write operations will be blocked when this threshold is exceeded. | Long | 268435456 | Hot Reload | - -**Instructions for Object Authentication and Slow Operations:** -- When the parameters `auditable_dml_event_type`, `auditable_ddl_event_type`, `auditable_query_event_type`, or `auditable_control_event_type` are set to `OBJECT_AUTHENTICATION`, the corresponding event types will be recorded in the audit log. -- When the parameters `auditable_dml_event_type`, `auditable_ddl_event_type`, `auditable_query_event_type`, or `auditable_control_event_type` are set to `SLOW_OPERATION`, only the corresponding event types whose execution time exceeds the value of the `slow_query_threshold` parameter (default: 3000 ms) will be recorded in the audit log. The value of the `slow_query_threshold` parameter can be configured in the `iotdb-system.properties` file. - - -## 3. Access Methods - -Supports direct reading of audit logs via SQL. 
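
As a quick illustration of the slow-operation setting described above, the following sketch lists slow query events recorded on all nodes. It assumes audit logging is enabled and that `SLOW_OPERATION` is included in `auditable_query_event_type` (a V2.0.9.2 parameter). The field names come from the metadata structure in this section, and the `--` line is an annotation that may need to be removed before running the statement in a client.

```SQL
-- Sketch: list query operations recorded as slow (execution time above slow_query_threshold)
select username, cli_hostname, sql_string, log from root.__audit.log.** where audit_event_type='SLOW_OPERATION' and operation_type='QUERY' align by device
```

Narrowing the path to a single node and user, in the same form as the paths shown in the usage examples, restricts the result to one account.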
-
-### 3.1 SQL Syntax
-
-```SQL
-SELECT (<audit_log_field>, )* log FROM <AUDIT_LOG_PATH> WHERE whereclause ORDER BY order_expression
-```
-
-* `AUDIT_LOG_PATH`: Audit log storage location, `root.__audit.log.<node_id>.<user_id>`
-* `audit_log_field`: Fields to query; see the metadata structure below
-* Supports WHERE clause filtering and ORDER BY sorting
-
-### 3.2 Metadata Structure
-
-| Field | Description | Data Type |
-|------------------------|--------------------------------------------------|----------------|
-| `time` | The date and time when the event started | timestamp |
-| `username` | User name | string |
-| `cli_hostname` | Client hostname identifier | string |
-| `audit_event_type` | Audit event type, e.g., WRITE_DATA, GENERATE_KEY | string |
-| `operation_type` | Operation type, e.g., DML, DDL, QUERY, CONTROL | string |
-| `privilege_type` | Privilege used, e.g., WRITE_DATA, MANAGE_USER | string |
-| `privilege_level` | Event privilege level, global or object | string |
-| `result` | Event result, success=1, fail=0 | boolean |
-| `database` | Database name | string |
-| `sql_string` | User's original SQL statement | string |
-| `log` | Detailed event description | string |
-
-### 3.3 Usage Examples
-
-* Query times, usernames and host information for successfully executed queries:
-
-```SQL
-IoTDB> select username,cli_hostname from root.__audit.log.** where operation_type='QUERY' and result=true align by device
-+-----------------------------+---------------------------+--------+------------+
-|                         Time|                     Device|username|cli_hostname|
-+-----------------------------+---------------------------+--------+------------+
-|2026-01-23T10:39:21.563+08:00|root.__audit.log.node_1.u_0|    root|   127.0.0.1|
-|2026-01-23T10:39:33.746+08:00|root.__audit.log.node_1.u_0|    root|   127.0.0.1|
-|2026-01-23T10:42:15.032+08:00|root.__audit.log.node_1.u_0|    root|   127.0.0.1|
-+-----------------------------+---------------------------+--------+------------+
-Total line number = 3
-It costs 0.036s
-```
-
-* Query latest operation
details: - -```SQL -IoTDB> select username,cli_hostname,operation_type,sql_string from root.__audit.log.** order by time desc limit 1 align by device -+-----------------------------+---------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------------------+ -| Time| Device|username|cli_hostname|operation_type| sql_string| -+-----------------------------+---------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------------------+ -|2026-01-23T10:42:32.795+08:00|root.__audit.log.node_1.u_0| root| 127.0.0.1| QUERY|select username,cli_hostname from root.__audit.log.** where operation_type='QUERY' and result=true align by device| -+-----------------------------+---------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------------------+ -Total line number = 1 -It costs 0.033s -``` - -* Query failed operations: - -```SQL -IoTDB> select database,operation_type,log from root.__audit.log.** where result=false align by device -+-----------------------------+-------------------------------+-----------+--------------+---------------------------------------------------------------------------------+ -| Time| Device| database|operation_type| log| -+-----------------------------+-------------------------------+-----------+--------------+---------------------------------------------------------------------------------+ -|2026-01-23T10:49:55.159+08:00|root.__audit.log.node_1.u_10000| | CONTROL| User user1 (ID=10000) login failed with code: 801, Authentication failed.| -|2026-01-23T10:52:04.579+08:00|root.__audit.log.node_1.u_10000| [root.**]| QUERY| User user1 (ID=10000) requests authority on object [root.**] with result false| 
-|2026-01-23T10:52:43.412+08:00|root.__audit.log.node_1.u_10000|root.userdb| DDL| User user1 (ID=10000) requests authority on object root.userdb with result false| -|2026-01-23T10:52:48.075+08:00|root.__audit.log.node_1.u_10000| null| QUERY|User user1 (ID=10000) requests authority on object root.__audit with result false| -+-----------------------------+-------------------------------+-----------+--------------+---------------------------------------------------------------------------------+ -Total line number = 4 -It costs 0.024s -``` - -* Query audit records for user 'u_0' on node 'node_1' with event types 'SLOW_OPERATION' - -```SQL -IoTDB> select * from root.__audit.log.node_1.u_0 where audit_event_type='SLOW_OPERATION' align by device -+-----------------------------+---------------------------+------+---------------+--------------+--------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------+----------------+------------+--------+ -| Time| Device|result|privilege_level|privilege_type|database|operation_type| log| sql_string|audit_event_type|cli_hostname|username| -+-----------------------------+---------------------------+------+---------------+--------------+--------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------+----------------+------------+--------+ -|2026-05-06T14:43:55.088+08:00|root.__audit.log.node_1.u_0| true| OBJECT| [READ_DATA]| | QUERY| SLOW_QUERY: cost 60 ms, select * from root.__audit.log.node_1.u_0 where 
audit_event_type='SLOW_OPERATION' or audit_event_type='LOGIN'limit 1 align by device|select * from root.__audit.log.node_1.u_0 where audit_event_type='SLOW_OPERATION' or audit_event_type='LOGIN'limit 1 align by device| SLOW_OPERATION| 127.0.0.1| root| -|2026-05-06T14:44:08.715+08:00|root.__audit.log.node_1.u_0| true| OBJECT| [WRITE_DATA]| | DML| Execution: insert into root.ln.wf02.wt02(timestamp, status, hardware) values (2, false, 'v2') cost 290 ms, with status code: TSStatus(code:200, message:)| insert into root.ln.wf02.wt02(timestamp, status, hardware) values (2, false, 'v2')| SLOW_OPERATION| 127.0.0.1| root| -|2026-05-06T14:44:11.684+08:00|root.__audit.log.node_1.u_0| true| OBJECT| [WRITE_DATA]| | DML|Execution: insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4') cost 6 ms, with status code: TSStatus(code:200, message:)| insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4')| SLOW_OPERATION| 127.0.0.1| root| -+-----------------------------+---------------------------+------+---------------+--------------+--------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------+----------------+------------+--------+ -Total line number = 3 -It costs 0.010s -``` diff --git a/src/UserGuide/latest/User-Manual/Authority-Management-Upgrade_timecho.md b/src/UserGuide/latest/User-Manual/Authority-Management-Upgrade_timecho.md deleted file mode 100644 index 83690fdeb..000000000 --- a/src/UserGuide/latest/User-Manual/Authority-Management-Upgrade_timecho.md +++ /dev/null @@ -1,410 +0,0 @@ - -# Authority Management - -IoTDB provides comprehensive permission management features to control access to data and cluster resources, ensuring 
data and system security. This document introduces the core concepts of the IoTDB permission module, user specifications, permission governance, authentication logic, and practical application examples. - -## 1. Core Concepts -### 1.1 User -A user refers to a legitimate database operator. Each user is identified by a unique username and authenticated by a password. To access the database, users must log in with valid usernames and passwords stored in the system. - -### 1.2 Privilege -The database supports a wide range of operations, but not all users are authorized to perform every action. A user is granted a corresponding privilege if permitted to execute a specific operation. Each privilege is bounded by a designated path. Flexible permission management can be implemented via [Path Pattern](../Basic-Concept/Operate-Metadata_timecho.md). - -### 1.3 Role -A role is a collection of privileges identified by a unique role name. Roles correspond to actual job identities (e.g., traffic dispatchers), and multiple users may share the same identity with identical permission sets. Roles enable unified and centralized management of permissions for user groups with consistent access requirements. - -### 1.4 Default Users and Roles -After initialization, IoTDB provides one default user: `root`, with the default password `TimechoDB@2021`. As the built-in super administrator, the root user permanently owns all privileges. Its permissions cannot be granted, revoked, or deleted, and it is the sole administrator account in the database. - -Newly created users and roles have no permissions by default. - -## 2. User Specifications -Users with the `SECURITY` privilege are authorized to create users and roles, subject to the following constraints: - -### 2.1 Username Rules -Usernames must be 4 to 32 characters long, including uppercase and lowercase letters, digits, and special symbols (`!@#$%^&*()_+-=`). Creation of usernames identical to the administrator account is prohibited. 
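
To make the username rule concrete, here is a small sketch using the `CREATE USER` statement from the syntax section of this document. The account names are hypothetical, the password reuses the sample value that appears later in this document, and the accept/reject outcomes are what the rule implies rather than captured server output. The `--` lines are annotations and may need to be removed before running the statements in a client.

```SQL
-- Expected to be accepted: 10 characters, using only letters, digits and underscore
CREATE USER ops_user01 'Passwd@202604'
-- Expected to be rejected: only 3 characters (the minimum is 4)
CREATE USER abc 'Passwd@202604'
-- Expected to be rejected: identical to the administrator account
CREATE USER root 'Passwd@202604'
```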
- -### 2.2 Password Rules -Passwords must be 12 to 32 characters long, containing both uppercase and lowercase letters, at least one digit, and at least one special symbol (`!@#$%^&*()_+-=`). Passwords cannot be the same as the associated username. - -### 2.3 Role Name Rules -Role names must be 4 to 32 characters long, including uppercase and lowercase letters, digits, and special symbols (`!@#$%^&*()_+-=`). Creation of role names identical to the administrator account is prohibited. - -## 3. Permission Management -Based on its tree data model, IoTDB classifies permissions into two major categories: global privileges and time series privileges. - -### 3.1 Global Privileges -Global privileges include three types: `SYSTEM`, `SECURITY`, and `AUDIT`: -- **SYSTEM**: Governs O&M operations and Data Definition Language (DDL) operations. -- **SECURITY**: Governs user and role management, as well as privilege granting for other accounts. -- **AUDIT**: Governs audit rule maintenance and audit log viewing. - -Detailed descriptions of each global privilege are shown in the table below: - -
-| Privilege Name | Original Privilege Name | Description |
-| -------------- | ----------------------- | ----------- |
-| SYSTEM | MANAGE_DATABASE | Allows users to create and drop databases. |
-| SYSTEM | USE_TRIGGER | Allows users to create, drop and query triggers. |
-| SYSTEM | USE_UDF | Allows users to create, drop and query user-defined functions. |
-| SYSTEM | USE_PIPE | Allows users to create, start, stop, drop and query PIPE tasks; allows users to create, drop and query PIPEPLUGINS. |
-| SYSTEM | USE_CQ | Allows users to register, start, stop, uninstall and query stream processing tasks; allows users to register, uninstall and query stream processing plugins. |
-| SYSTEM | MAINTAIN | Allows users to execute and cancel queries, view system variables, and check cluster status. |
-| SYSTEM | USE_MODEL | Allows users to create, drop and query deep learning models. |
-| SECURITY | MANAGE_USER | Allows users to create, drop, modify and query users. |
-| SECURITY | MANAGE_ROLE | Allows users to create, drop and query roles; grant roles to other users or revoke roles from other users. |
-| AUDIT | N/A | Allows users to maintain audit log rules and view audit logs. |
- -### 3.2 Time Series Privileges -Time series privileges control the scope and mode of user data access. They support authorization for absolute paths and prefix matching paths, and take effect at the time series granularity. - -Definitions of all time series privileges are listed in the table below: - -| Privilege Name | Description | -| --------------- |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| READ_DATA | Allows reading time series data under authorized paths. | -| WRITE_DATA | Allows reading time series data under authorized paths.
Allows inserting and deleting time series data under authorized paths.
Supports data import and loading within authorized paths. Data import requires the `WRITE_DATA` privilege for target paths; automatic creation of databases and time series additionally requires `SYSTEM` and `WRITE_SCHEMA` privileges. | -| READ_SCHEMA | Allows viewing detailed metadata tree information under authorized paths, including databases, sub-paths, child nodes, devices, time series, templates, views and other metadata. | -| WRITE_SCHEMA | Allows viewing metadata tree information under authorized paths.
Allows creating, dropping and modifying time series, templates and views under authorized paths.
When creating or modifying a view, the system checks the `WRITE_SCHEMA` privilege for the view path and the `READ_SCHEMA` privilege for data sources.
Querying and inserting data into a view requires the `READ_DATA` and `WRITE_DATA` privileges for the view path.
Allows configuring, canceling and querying TTL settings under authorized paths.
Allows mounting and unmounting templates under authorized paths.
Supports renaming the full path of time series (supported since V2.0.8.2). | - -### 3.3 Privilege Granting and Revocation -In IoTDB, users can obtain permissions through three methods: -1. Granted by the super administrator, who has full control over all user permissions. -2. Granted by common users with authorization permission, who have been assigned the `GRANT OPTION` keyword for specific privileges. -3. Assigned via roles granted by the super administrator or users with the `SECURITY` privilege. - -Permissions can be revoked through the following methods: -1. Revoked by the super administrator. -2. Revoked by common users with authorization permission, who have been assigned the `GRANT OPTION` keyword for specific privileges. -3. Revoked by the super administrator or users with the `SECURITY` privilege by removing specific roles from target users. - -- A valid path must be specified for all authorization operations. Global privileges require the path `root.**`, while time series privileges must use absolute paths or prefix paths ending with double wildcards. -- The `WITH GRANT OPTION` keyword can be specified when granting privileges to roles, enabling grantees to regrant or revoke corresponding privileges within the authorized path scope. For example, if User A is granted read access to `Group1.Company1.**` with the `GRANT OPTION`, User A can authorize or revoke read permissions for all sub-nodes and time series under `Group1.Company1`. -- During revocation, the system matches the revocation statement with all existing permission paths of the target user and clears all matched permissions. For instance, if User A owns the read privilege for `Group1.Company1.Factory1`, revoking the read privilege for `Group1.Company1.**` will clear the permission for the sub-path as well. - -## 4. 
Syntax and Usage Examples -IoTDB provides combined privilege aliases to simplify authorization configuration: - -| Privilege Name | Scope of Authority | -| ---------- | ---------------------------- | -| ALL | All privileges | -| READ | READ_SCHEMA, READ_DATA | -| WRITE | WRITE_SCHEMA, WRITE_DATA | - -Combined privileges are shorthand aliases rather than independent privilege types; granting one is identical to granting each of its individual privileges separately. - -The following examples demonstrate common permission management SQL statements. Non-administrator users need the privileges noted for each operation before they can execute it. - -### 4.1 User and Role Management -- **Create User** (Requires `SECURITY` privilege) -```SQL -CREATE USER <userName> <password> -eg: CREATE USER user1 'Passwd@202604' -``` - -- **Drop User** (Requires `SECURITY` privilege) -```SQL -DROP USER <userName> -eg: DROP USER user1 -``` - -- **Create Role** (Requires `SECURITY` privilege) -```SQL -CREATE ROLE <roleName> -eg: CREATE ROLE role1 -``` - -- **Drop Role** (Requires `SECURITY` privilege) -```SQL -DROP ROLE <roleName> -eg: DROP ROLE role1 -``` - -- **Grant Role to User** (Requires `SECURITY` privilege) -```SQL -GRANT ROLE <roleName> TO <userName> -eg: GRANT ROLE admin TO user1 -``` - -- **Revoke Role from User** (Requires `SECURITY` privilege) -```SQL -REVOKE ROLE <roleName> FROM <userName> -eg: REVOKE ROLE admin FROM user1 -``` - -- **List All Users** (Requires `SECURITY` privilege) -```SQL -LIST USER -``` - -- **List All Roles** (Requires `SECURITY` privilege) -```SQL -LIST ROLE -``` - -- **List All Users Under a Specified Role** (Requires `SECURITY` privilege) -```SQL -LIST USER OF ROLE <roleName> -eg: LIST USER OF ROLE roleuser -``` - -- **List Roles of a Specified User** - Users can view their own roles; viewing other users' roles requires the `SECURITY` privilege.
-```SQL -LIST ROLE OF USER <userName> -eg: LIST ROLE OF USER tempuser -``` - -- **List All Privileges of a Specified User** - Users can view their own privileges; viewing other users' privileges requires the `SECURITY` privilege. -```SQL -LIST PRIVILEGES OF USER <userName>; -eg: LIST PRIVILEGES OF USER tempuser; -``` - -- **List All Privileges of a Specified Role** - Users can view privileges of their assigned roles; viewing other roles' privileges requires the `SECURITY` privilege. -```SQL -LIST PRIVILEGES OF ROLE <roleName>; -eg: LIST PRIVILEGES OF ROLE actor; -``` - -- **Modify Password** - Users can update their own passwords; modifying other users' passwords requires the `SECURITY` privilege. -```SQL -ALTER USER <userName> SET PASSWORD <password>; -eg: ALTER USER tempuser SET PASSWORD 'Newpwd@202604'; -``` - -### 4.2 Privilege Granting and Revocation -#### Grant Syntax -```SQL -GRANT <PRIVILEGES> ON <PATHS> TO ROLE/USER <NAME> [WITH GRANT OPTION]; -eg: GRANT READ ON root.** TO ROLE role1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.** TO USER user1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.**,root.t2.** TO USER user1; -eg: GRANT SECURITY ON root.** TO USER user1 WITH GRANT OPTION; -eg: GRANT ALL ON root.** TO USER user1 WITH GRANT OPTION; -``` - -#### Revoke Syntax -```SQL -REVOKE <PRIVILEGES> ON <PATHS> FROM ROLE/USER <NAME>; -eg: REVOKE READ ON root.** FROM ROLE role1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.** FROM USER user1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.**, root.t2.** FROM USER user1; -eg: REVOKE SECURITY ON root.** FROM USER user1; -eg: REVOKE ALL ON root.** FROM USER user1; -``` - -- Non-administrator users must hold the target privileges with the `WITH GRANT OPTION` attribute for the specified paths to execute grant or revoke operations. -- When granting or revoking global privileges (or statements containing global privileges, as `ALL` includes global privileges), the path must be set to `root.**`.
- -**Valid Examples of Grant & Revoke** -```SQL -GRANT SECURITY ON root.** TO USER user1; -GRANT SECURITY ON root.** TO ROLE role1 WITH GRANT OPTION; -GRANT ALL ON root.** TO role role1 WITH GRANT OPTION; -REVOKE SECURITY ON root.** FROM USER user1; -REVOKE SECURITY ON root.** FROM ROLE role1; -REVOKE ALL ON root.** FROM ROLE role1; -``` - -**Invalid Statements** -```SQL -GRANT READ, SECURITY ON root.t1.** TO USER user1; -GRANT ALL ON root.t1.t2 TO USER user1 WITH GRANT OPTION; -REVOKE ALL ON root.t1.t2 FROM USER user1; -REVOKE READ, SECURITY ON root.t1.t2 FROM ROLE ROLE1; -``` - -- Valid path formats include complete absolute paths and paths ending with double wildcards: -```SQL --- Valid Paths -root.** -root.t1.t2.** -root.t1.t2.t3 -``` - -```SQL --- Invalid Paths -root.t1.* -root.t1.**.t2 -root.t1*.t2.t3 -``` - -## 5. Practical Scenario Example -Based on the [sample data](https://github.com/thulab/iotdb/files/4438687/OtherMaterial-Sample.Data.txt), IoTDB sample data belongs to multiple power generation groups such as ln and sgcc. To prevent cross-group data access, strict permission isolation at the group level is required. - -### 5.1 Create Users -Use the `CREATE USER` statement to create new users. For example, the root administrator creates two write users for the ln and sgcc groups with the unified password `write_Pwd@2026`. It is recommended to wrap usernames with backticks (`). - -```SQL -CREATE USER `ln_write_user` 'write_Pwd@2026'; -CREATE USER `sgcc_write_user` 'write_Pwd@2026'; -``` - -Execute the following statement to query all users: -```SQL -LIST USER; -``` - -Query result: -``` -IoTDB> CREATE USER `ln_write_user` 'write_Pwd@2026'; -Msg: The statement is executed successfully. -IoTDB> CREATE USER `sgcc_write_user` 'write_Pwd@2026'; -Msg: The statement is executed successfully. 
-IoTDB> LIST USER; -+------+---------------+-----------------+-----------------+ -|UserId| User|MaxSessionPerUser|MinSessionPerUser| -+------+---------------+-----------------+-----------------+ -| 0| root| -1| 1| -| 10000| ln_write_user| -1| -1| -| 10001|sgcc_write_user| -1| -1| -+------+---------------+-----------------+-----------------+ -Total line number = 3 -It costs 0.005s -``` - -### 5.2 Grant Permissions -Newly created users have no permissions by default and cannot perform any database operations. For example, an insertion executed by `ln_write_user` will fail: -```SQL -INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true); -``` - -Error message: -``` -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true); -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -Grant targeted write permissions to each user via the root account: -```SQL -GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user`; -GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user`; -``` - -Execution result: -``` -IoTDB> GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user`; -Msg: The statement is executed successfully. -IoTDB> GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user`; -Msg: The statement is executed successfully. -``` - -Retry data insertion with `ln_write_user`: -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true); -Msg: The statement is executed successfully. -``` - -### 5.3 Revoke Permissions -Use the `REVOKE` statement to reclaim granted permissions: -```SQL -REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user` -REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user` -``` - -Execution result: -``` -IoTDB> REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user` -Msg: The statement is executed successfully. 
-IoTDB> REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user` -Msg: The statement is executed successfully. -``` - -After permission revocation, `ln_write_user` loses write access to the `root.ln.**` path: -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true) -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -## 6. Authentication & Supplementary Instructions -### 6.1 Authentication Mechanism -User permissions consist of three core elements: effective path scope, privilege type, and the `WITH GRANT OPTION` tag. -```Plain -userTest1 : - root.t1.** - read_schema, read_data - with grant option - root.** - write_schema, write_data - with grant option -``` - -Each user has an independent permission list recording all authorized privileges, which can be queried via `LIST PRIVILEGES OF USER `. - -During authentication, the system matches the target operation path with authorized paths in sequence. When verifying the `read_schema` privilege for `root.t1.t2`, the system first matches the path rule `root.t1.**`. If matched, it checks whether the required privilege is included; otherwise, it continues matching until a valid rule is found or all rules are traversed. - -- For multi-path query tasks, the system only returns data accessible to the current user and filters out unauthorized content. -- For multi-path write tasks, the operation requires valid write permissions for **all** target time series. - -**Operations Requiring Combined Privileges** -1. With automatic time series creation enabled, inserting data into non-existent time series requires both `WRITE_DATA` and metadata modification privileges. -2. The `SELECT INTO` statement requires read privileges for source paths and write privileges for target paths. Insufficient source permissions lead to incomplete data; insufficient target permissions will terminate the task and throw an error. -3. 
View permissions are independent of underlying data sources. Read and write operations on views only verify view-specific permissions without checking privileges of the original data paths. - -### 6.2 Supplementary Notes -A role is a collection of privileges, while users have two types of attributes: independent individual privileges and inherited role privileges. A single role can contain multiple privileges, and a single user can be assigned multiple roles and independent permissions. - -No conflicting permissions exist in IoTDB. A user’s final effective permissions are the **union** of personal privileges and all privileges from assigned roles. An operation is permitted if either the user’s individual privileges or inherited role privileges contain the required authorization. Duplicate permissions between personal and role settings do not affect normal usage. - -Key notes: -If a user holds an independent privilege for Operation A and obtains the same privilege via a role, revoking only the user’s individual privilege cannot restrict the operation. Administrators must revoke the privilege from the corresponding role or remove the role from the user to disable the operation completely. Similarly, revoking privileges only from roles cannot restrict users with independent permissions. - -Modifications to role permissions take effect in real time for all bound users. Adding privileges to a role immediately grants access to all associated users, and removing privileges will revoke corresponding access unless users hold independent overriding permissions. 
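The union rule described above can be modelled in a few lines of Python. This is only an illustrative sketch, not IoTDB's implementation: `Principal`, `path_matches`, and `is_permitted` are hypothetical names, and the matcher assumes that a pattern ending in `.**` covers every path strictly below its prefix while any other pattern must match the path exactly.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


def path_matches(pattern: str, path: str) -> bool:
    """Assumed rule: exact match, or a prefix pattern ending in `.**`."""
    if pattern.endswith(".**"):
        # "root.ln.**"[:-2] == "root.ln." so only paths below the prefix match.
        return path.startswith(pattern[:-2])
    return pattern == path


@dataclass
class Principal:
    """Hypothetical model of a user or a role: path pattern -> privilege names."""
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def allows(self, privilege: str, path: str) -> bool:
        return any(
            privilege in privileges and path_matches(pattern, path)
            for pattern, privileges in self.grants.items()
        )


def is_permitted(user: Principal, roles: Iterable[Principal],
                 privilege: str, path: str) -> bool:
    """Effective permission is the union of the user's own and role grants."""
    return user.allows(privilege, path) or any(
        role.allows(privilege, path) for role in roles
    )


# The same privilege is held both directly and through a role.
user = Principal({"root.ln.**": {"WRITE_DATA"}})
role = Principal({"root.ln.**": {"WRITE_DATA"}})
target = "root.ln.wf01.wt01.status"

# Revoking only the user's own grant does not block the operation.
user.grants["root.ln.**"].discard("WRITE_DATA")
still_allowed = is_permitted(user, [role], "WRITE_DATA", target)

# The operation is blocked only once the role stops granting it as well.
role.grants["root.ln.**"].discard("WRITE_DATA")
finally_blocked = not is_permitted(user, [role], "WRITE_DATA", target)

print(still_allowed, finally_blocked)
```

The sketch mirrors the key note in this section: a privilege has to be removed from every source (the user and each assigned role) before the operation is actually denied.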
\ No newline at end of file diff --git a/src/UserGuide/latest/User-Manual/Authority-Management_timecho.md b/src/UserGuide/latest/User-Manual/Authority-Management_timecho.md deleted file mode 100644 index 54bee9150..000000000 --- a/src/UserGuide/latest/User-Manual/Authority-Management_timecho.md +++ /dev/null @@ -1,519 +0,0 @@ - - -# Authority Management - -IoTDB provides permission management operations, offering users the ability to manage permissions for data and cluster systems, ensuring data and system security. - -This article introduces the basic concepts of the permission module in IoTDB, including user definition, permission management, authentication logic, and use cases. In the JAVA programming environment, you can use the [JDBC API](https://chat.openai.com/API/Programming-JDBC.md) to execute permission management statements individually or in batches. - -## 1. Basic Concepts - -### 1.1 User - -A user is a legitimate user of the database. Each user corresponds to a unique username and has a password as a means of authentication. Before using the database, a person must provide a valid (i.e., stored in the database) username and password for a successful login. - -### 1.2 Permission - -The database provides various operations, but not all users can perform all operations. If a user can perform a certain operation, they are said to have permission to execute that operation. Permissions are typically limited in scope by a path, and [path patterns](https://chat.openai.com/Basic-Concept/Data-Model-and-Terminology.md) can be used to manage permissions flexibly. - -### 1.3 Role - -A role is a collection of multiple permissions and has a unique role name as an identifier. Roles often correspond to real-world identities (e.g., a traffic dispatcher), and a real-world identity may correspond to multiple users. Users with the same real-world identity often have the same permissions, and roles are abstractions for unified management of such permissions. 
- -### 1.4 Default Users and Roles - -After installation and initialization, IoTDB includes a default user, root, with the default password TimechoDB@2021 (before V2.0.6.x, the default password is root). This user is an administrator with fixed permissions, which cannot be granted or revoked, and the user cannot be deleted. There is only one administrator user in the database. - -A newly created user or role does not have any permissions initially. - -## 2. User Definition - -Administrators and users with the MANAGE_USER and MANAGE_ROLE permissions can create users or roles. New users and roles must meet the following constraints. - -### 2.1 Username Constraints - -4 to 32 characters, supports the use of uppercase and lowercase English letters, numbers, and special characters (`!@#$%^&*()_+-=`). - -Users cannot create users with the same name as the administrator. - -### 2.2 Password Constraints - -4 to 32 characters, can use uppercase and lowercase letters, numbers, and special characters (`!@#$%^&*()_+-=`). Passwords are encrypted by default using SHA-256. - -### 2.3 Role Name Constraints - -4 to 32 characters, supports the use of uppercase and lowercase English letters, numbers, and special characters (`!@#$%^&*()_+-=`). - -Users cannot create roles with the same name as the administrator. - - - -## 3. Permission Management - -IoTDB primarily has two types of permissions: series permissions and global permissions. - -### 3.1 Series Permissions - -Series permissions constrain the scope and manner in which users access data. IoTDB supports authorization for both absolute paths and prefix-matching paths, and a permission can take effect at the granularity of a single time series.
- -The table below describes the types and scope of these permissions: - - - -| Permission Name | Description | -|-----------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| READ_DATA | Allows reading time series data under the authorized path. | -| WRITE_DATA | Allows reading time series data under the authorized path.
Allows inserting and deleting time series data under the authorized path.
Allows importing and loading data under the authorized path. When importing data, you need the WRITE_DATA permission for the corresponding path. When automatically creating databases or time series, you need MANAGE_DATABASE and WRITE_SCHEMA permissions. | -| READ_SCHEMA | Allows obtaining detailed information about the metadata tree under the authorized path,
including databases, child paths, child nodes, devices, time series, templates, views, etc. | -| WRITE_SCHEMA | Allows obtaining detailed information about the metadata tree under the authorized path.
Allows creating, deleting, and modifying time series, templates, views, etc. under the authorized path. When creating or modifying views, it checks the WRITE_SCHEMA permission for the view path and READ_SCHEMA permission for the data source. When querying and inserting data into views, it checks the READ_DATA and WRITE_DATA permissions for the view path.
Allows setting, unsetting, and viewing TTL under the authorized path.
Allows attaching or detaching templates under the authorized path.
Allowed to modify the full path name of a timeseries under an authorized path. -- Supported from V2.0.8.2 onwards | - - -### 3.2 Global Permissions - -Global permissions constrain the database functions that users can use and restrict commands that change the system and task state. Once a user obtains global authorization, they can manage the database. -The table below describes the types of system permissions: - - -| Permission Name | Description | -|:---------------:|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| MANAGE_DATABASE | Allow users to create and delete databases. | -| MANAGE_USER | Allow users to create, delete, modify, and view users. | -| MANAGE_ROLE | Allow users to create, delete, modify, and view roles.
Allow users to grant/revoke roles to/from other users. | -| USE_TRIGGER | Allow users to create, delete, and view triggers.
Independent of data source permission checks for triggers. | -| USE_UDF | Allow users to create, delete, and view user-defined functions.
Independent of data source permission checks for user-defined functions. | -| USE_CQ | Allow users to create, delete, and view continuous queries.
Independent of data source permission checks for continuous queries. | -| USE_PIPE | Allow users to create, start, stop, delete, and view pipelines.
Allow users to create, delete, and view pipeline plugins.
Independent of data source permission checks for pipelines. | -| EXTEND_TEMPLATE | Permission to automatically create templates. | -| MAINTAIN | Allow users to query and cancel queries.
Allow users to view variables.
Allow users to view cluster status. | -| USE_MODEL | Allow users to create, delete, and view deep learning models. | -Regarding template permissions: - -1. Only administrators are allowed to create, delete, modify, query, mount, and unmount templates. -2. To activate a template, you need WRITE_SCHEMA permission for the activation path. -3. If automatic creation is enabled, writing to a non-existent path that has a template mounted will automatically extend the template and insert data. This requires both EXTEND_TEMPLATE permission and WRITE_DATA permission for the target series. -4. To deactivate a template, WRITE_SCHEMA permission for the mounted template path is required. -5. To query paths that use a specific metadata template, you need READ_SCHEMA permission for the paths; otherwise, the query returns empty results. - - - -### 3.3 Granting and Revoking Permissions - -In IoTDB, users can obtain permissions through three methods: - -1. Granted by the administrator, who has control over the permissions of other users. -2. Granted by a user who is allowed to authorize permissions, that is, a user who was assigned the grant option keyword when obtaining the permission. -3. Granted a role by the administrator or a user with MANAGE_ROLE, thereby obtaining the permissions of that role. - -A user's permissions can be revoked through the following methods: - -1. Revoked by the administrator. -2. Revoked by a user who is allowed to authorize permissions, that is, a user who was assigned the grant option keyword when obtaining the permission. -3. Removed from a role by the administrator or a user with MANAGE_ROLE, thereby losing the permissions of that role. - -- When granting permissions, a path must be specified. Global permissions must be granted on root.**, while series-specific permissions must use absolute paths or prefix paths ending with a double wildcard.
-- When granting user/role permissions, you can specify the "with grant option" keyword for that permission, which means that the user can grant permissions on their authorized paths and can also revoke permissions on other users' authorized paths. For example, if User A is granted read permission for `group1.company1.**` with the grant option keyword, then A can grant read permissions to others on any node or series below `group1.company1`, and can also revoke read permissions on any node below `group1.company1` for other users. -- When revoking permissions, the revocation statement will match against all of the user's permission paths and clear the matched permission paths. For example, if User A has read permission for `group1.company1.factory1`, when revoking read permission for `group1.company1.**`, it will remove A's read permission for `group1.company1.factory1`. - - - -## 4. Authentication - -User permissions mainly consist of three parts: permission scope (path), permission type, and the "with grant option" flag: - -``` -userTest1: - root.t1.** - read_schema, read_data - with grant option - root.** - write_schema, write_data - with grant option -``` - -Each user has such a permission access list, identifying all the permissions they have acquired. You can view their permissions by using the command `LIST PRIVILEGES OF USER `. - -When authorizing a path, the database will match the path with the permissions. For example, when checking the read_schema permission for `root.t1.t2`, it will first match with the permission access list `root.t1.**`. If it matches successfully, it will then check if that path contains the permission to be authorized. If not, it continues to the next path-permission match until a match is found or all matches are exhausted. - -When performing authorization for multiple paths, such as executing a multi-path query task, the database will only present data for which the user has permissions. 
Data for which the user does not have permissions will not be included in the results, and information about these paths without permissions will be output to the alert messages. - -Please note that the following operations require checking multiple permissions: - -1. Enabling the automatic sequence creation feature requires not only write permission for the corresponding sequence when a user inserts data into a non-existent sequence but also metadata modification permission for the sequence. - -2. When executing the "select into" statement, it is necessary to check the read permission for the source sequence and the write permission for the target sequence. It should be noted that the source sequence data may only be partially accessible due to insufficient permissions, and if the target sequence has insufficient write permissions, an error will occur, terminating the task. - -3. View permissions and data source permissions are independent. Performing read and write operations on a view will only check the permissions of the view itself and will not perform permission validation on the source path. - - -## 5. Function Syntax and Examples - -IoTDB provides composite permissions for user authorization: - -| Permission Name | Permission Scope | -|-----------------|--------------------------| -| ALL | All permissions | -| READ | READ_SCHEMA, READ_DATA | -| WRITE | WRITE_SCHEMA, WRITE_DATA | - -Composite permissions are not specific permissions themselves but a shorthand way to denote a combination of permissions, with no difference from directly specifying the corresponding permission names. - -The following series of specific use cases will demonstrate the usage of permission statements. Non-administrator users executing the following statements require obtaining the necessary permissions, which are indicated after the operation description. 
- -### 5.1 User and Role Related - -- Create user (Requires MANAGE_USER permission) - -```SQL -CREATE USER <userName> <password> -eg: CREATE USER user1 'passwd' -``` - -- Delete user (Requires MANAGE_USER permission) - -```sql -DROP USER <userName> -eg: DROP USER user1 -``` - -- Create role (Requires MANAGE_ROLE permission) - -```sql -CREATE ROLE <roleName> -eg: CREATE ROLE role1 -``` - -- Delete role (Requires MANAGE_ROLE permission) - -```sql -DROP ROLE <roleName> -eg: DROP ROLE role1 -``` - -- Grant role to user (Requires MANAGE_ROLE permission) - -```sql -GRANT ROLE <roleName> TO <userName> -eg: GRANT ROLE admin TO user1 -``` - -- Revoke role from user (Requires MANAGE_ROLE permission) - -```sql -REVOKE ROLE <roleName> FROM <userName> -eg: REVOKE ROLE admin FROM user1 -``` - -- List all users (Requires MANAGE_USER permission) - -```sql -LIST USER -``` - -- List all roles (Requires MANAGE_ROLE permission) - -```sql -LIST ROLE -``` - -- List all users granted a specific role (Requires MANAGE_USER permission) - -```sql -LIST USER OF ROLE <roleName> -eg: LIST USER OF ROLE roleuser -``` - -- List all roles granted to a specific user - - Users can list their own roles, but listing roles of other users requires the MANAGE_ROLE permission. - -```sql -LIST ROLE OF USER <userName> -eg: LIST ROLE OF USER tempuser -``` - -- List all privileges of a user - -Users can list their own privileges, but listing privileges of other users requires the MANAGE_USER permission. - -```sql -LIST PRIVILEGES OF USER <userName>; -eg: LIST PRIVILEGES OF USER tempuser; -``` - -- List all privileges of a role - -Users can list the privileges of roles assigned to them, but listing privileges of other roles requires the MANAGE_ROLE permission. - -```sql -LIST PRIVILEGES OF ROLE <roleName>; -eg: LIST PRIVILEGES OF ROLE actor; -``` - -- Modify password - -Users can modify their own password, but modifying passwords of other users requires the MANAGE_USER permission.
- -```sql -ALTER USER <userName> SET PASSWORD <password>; -eg: ALTER USER tempuser SET PASSWORD 'newpwd'; -``` - -### 5.2 Authorization and Deauthorization - -Users can use authorization statements to grant permissions to other users. The syntax is as follows: - -```sql -GRANT <PRIVILEGES> ON <PATHS> TO ROLE/USER <NAME> [WITH GRANT OPTION]; -eg: GRANT READ ON root.** TO ROLE role1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.** TO USER user1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.**,root.t2.** TO USER user1; -eg: GRANT MANAGE_ROLE ON root.** TO USER user1 WITH GRANT OPTION; -eg: GRANT ALL ON root.** TO USER user1 WITH GRANT OPTION; -``` - -Users can use deauthorization statements to revoke permissions from others. The syntax is as follows: - -```sql -REVOKE <PRIVILEGES> ON <PATHS> FROM ROLE/USER <NAME>; -eg: REVOKE READ ON root.** FROM ROLE role1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.** FROM USER user1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.**, root.t2.** FROM USER user1; -eg: REVOKE MANAGE_ROLE ON root.** FROM USER user1; -eg: REVOKE ALL ON root.** FROM USER user1; -``` - -- **When non-administrator users execute authorization/deauthorization statements, they need to have \<PRIVILEGES\> permissions on \<PATHS\>, and these permissions must be marked with WITH GRANT OPTION.** - -- When granting or revoking global permissions, or when the statement contains global permissions (expanding ALL includes global permissions), you must specify the path as root.**.
For example, the following authorization/deauthorization statements are valid: - - ```sql - GRANT MANAGE_USER ON root.** TO USER user1; - GRANT MANAGE_ROLE ON root.** TO ROLE role1 WITH GRANT OPTION; - GRANT ALL ON root.** TO ROLE role1 WITH GRANT OPTION; - REVOKE MANAGE_USER ON root.** FROM USER user1; - REVOKE MANAGE_ROLE ON root.** FROM ROLE role1; - REVOKE ALL ON root.** FROM ROLE role1; - ``` - - The following statements are invalid: - - ```sql - GRANT READ, MANAGE_ROLE ON root.t1.** TO USER user1; - GRANT ALL ON root.t1.t2 TO USER user1 WITH GRANT OPTION; - REVOKE ALL ON root.t1.t2 FROM USER user1; - REVOKE READ, MANAGE_ROLE ON root.t1.t2 FROM ROLE ROLE1; - ``` - -- \<PATHS\> must be a full path or a matching path ending with a double wildcard. The following paths are valid: - - ```sql - root.** - root.t1.t2.** - root.t1.t2.t3 - ``` - - The following paths are invalid: - - ```sql - root.t1.* - root.t1.**.t2 - root.t1*.t2.t3 - ``` - - - -## 6. Examples - - In the [sample data](https://github.com/thulab/iotdb/files/4438687/OtherMaterial-Sample.Data.txt), IoTDB's data may belong to different power generation groups such as ln and sgcc. Each group does not want other groups to access its data, so data must be isolated at the group level. - -#### Create Users -Use `CREATE USER <userName> <password>` to create users. For example, the root user, who has all permissions, can create one user each for the ln and sgcc groups, named ln_write_user and sgcc_write_user. It is recommended to enclose the username in backticks. The SQL statements are as follows: -```SQL -CREATE USER `ln_write_user` 'write_pwd' -CREATE USER `sgcc_write_user` 'write_pwd' -``` - -Now use the following SQL statement to list the users: - -```sql -LIST USER -``` - -We can see that these two users have been created, and the result is as follows: - -```sql -IoTDB> CREATE USER `ln_write_user` 'write_pwd' -Msg: The statement is executed successfully.
-IoTDB> CREATE USER `sgcc_write_user` 'write_pwd' -Msg: The statement is executed successfully. -IoTDB> LIST USER; -+---------------+ -| user| -+---------------+ -| ln_write_user| -| root| -|sgcc_write_user| -+---------------+ -Total line number = 3 -It costs 0.012s -``` - -#### Granting Permissions to Users - -At this point, although two users have been created, they do not have any permissions, so they cannot operate on the database. For example, if we use the ln_write_user to write data to the database, the SQL statement is as follows: - -```sql -INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true) -``` - -At this point, the system does not allow this operation, and an error is displayed: - -```sql -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true) -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -Now, we will grant each user write permissions to the corresponding paths using the root user. - -We use the `GRANT ON TO USER ` statement to grant permissions to users, for example: - -```sql -GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user` -GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user` -``` - -The execution status is as follows: - -```sql -IoTDB> GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user` -Msg: The statement is executed successfully. -IoTDB> GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user` -Msg: The statement is executed successfully. -``` - -Then, using ln_write_user, try to write data again: - -```sql -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true) -Msg: The statement is executed successfully. -``` - -#### Revoking User Permissions - -After granting user permissions, we can use the `REVOKE ON FROM USER ` to revoke the permissions granted to users. 
For example, using the root user to revoke the permissions of ln_write_user and sgcc_write_user: - -```sql -REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user` -REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user` -``` - - -The execution status is as follows: - -```sql -IoTDB> REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user` -Msg: The statement is executed successfully. -IoTDB> REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user` -Msg: The statement is executed successfully. -``` - -After revoking the permissions, ln_write_user no longer has the permission to write data to root.ln.**: - -```sql -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true) -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -## 7. Other Explanations - -Roles are collections of permissions, and both permissions and roles are attributes of users. In other words, a role can have multiple permissions, and a user can have multiple roles and permissions (referred to as the user's self-permissions). - -Currently, in IoTDB, there are no conflicting permissions. Therefore, the actual permissions a user has are the union of their self-permissions and the permissions of all their roles. In other words, to determine if a user can perform a certain operation, it's necessary to check whether their self-permissions or the permissions of all their roles allow that operation. Self-permissions, role permissions, and the permissions of multiple roles a user has may contain the same permission, but this does not have any impact. - -It's important to note that if a user has a certain permission (corresponding to operation A) on their own, and one of their roles has the same permission, revoking the permission from the user alone will not prevent the user from performing operation A. 
To prevent the user from performing operation A, you need to revoke the permission from both the user and the role, or remove the user from the role that has the permission. Similarly, revoking the permission only from the role does not prevent the user from performing operation A if they hold the same permission themselves.
-
-Changes to a role take effect immediately for all users who have that role. For example, adding a permission to a role immediately grants it to all of those users, and removing a permission causes them to lose it (unless a user also holds it as a self-permission).
-
-
-
-## 8. Upgrading from a Previous Version
-
-Before version 1.3, there were many different permission types. Version 1.3 streamlines the permission types.
-
-Permission paths in version 1.3 must be either full paths or matching paths ending with a double wildcard (`**`). During a system upgrade, invalid permission paths and unsupported permission types are converted automatically: the first invalid node on the path is replaced with `**`, and each unsupported permission type is mapped to a permission supported by the current system.
-
-| Permission | Path | Mapped-Permission | Mapped-Path |
-|-------------------|-----------------|-------------------|---------------|
-| CREATE_DATABASE | root.db.t1.* | MANAGE_DATABASE | root.** |
-| INSERT_TIMESERIES | root.db.t2.*.t3 | WRITE_DATA | root.db.t2.** |
-| CREATE_TIMESERIES | root.db.t2*c.t3 | WRITE_SCHEMA | root.db.** |
-| LIST_ROLE | root.** | (ignore) | |
-
-
-
-You can refer to the table below for a comparison of permission types between the old and new versions (where "-- IGNORE" indicates that the new version ignores that permission):
-
-| Permission Name | Path-Related | New Permission Name | Path-Related |
-|---------------------------|--------------|---------------------|--------------|
-| CREATE_DATABASE | YES | MANAGE_DATABASE | NO |
-| INSERT_TIMESERIES | YES | WRITE_DATA | YES |
-| UPDATE_TIMESERIES | YES | WRITE_DATA | YES |
-| READ_TIMESERIES | YES | READ_DATA | YES |
-| CREATE_TIMESERIES | YES | WRITE_SCHEMA | YES |
-| DELETE_TIMESERIES | YES | WRITE_SCHEMA | YES |
-| CREATE_USER | NO | MANAGE_USER | NO |
-| DELETE_USER | NO | MANAGE_USER | NO |
-| MODIFY_PASSWORD | NO | -- IGNORE | |
-| LIST_USER | NO | -- IGNORE | |
-| GRANT_USER_PRIVILEGE | NO | -- IGNORE | |
-| REVOKE_USER_PRIVILEGE | NO | -- IGNORE | |
-| GRANT_USER_ROLE | NO | MANAGE_ROLE | NO |
-| REVOKE_USER_ROLE | NO | MANAGE_ROLE | NO |
-| CREATE_ROLE | NO | MANAGE_ROLE | NO |
-| DELETE_ROLE | NO | MANAGE_ROLE | NO |
-| LIST_ROLE | NO | -- IGNORE | |
-| GRANT_ROLE_PRIVILEGE | NO | -- IGNORE | |
-| REVOKE_ROLE_PRIVILEGE | NO | -- IGNORE | |
-| CREATE_FUNCTION | NO | USE_UDF | NO |
-| DROP_FUNCTION | NO | USE_UDF | NO |
-| CREATE_TRIGGER | YES | USE_TRIGGER | NO |
-| DROP_TRIGGER | YES | USE_TRIGGER | NO |
-| START_TRIGGER | YES | USE_TRIGGER | NO |
-| STOP_TRIGGER | YES | USE_TRIGGER | NO |
-| CREATE_CONTINUOUS_QUERY | NO | USE_CQ | NO |
-| DROP_CONTINUOUS_QUERY | NO | USE_CQ | NO |
-| ALL | NO | All privileges | |
-| DELETE_DATABASE | YES | MANAGE_DATABASE | NO |
-| 
ALTER_TIMESERIES | YES | WRITE_SCHEMA | YES | -| UPDATE_TEMPLATE | NO | -- IGNORE | | -| READ_TEMPLATE | NO | -- IGNORE | | -| APPLY_TEMPLATE | YES | WRITE_SCHEMA | YES | -| READ_TEMPLATE_APPLICATION | NO | -- IGNORE | | -| SHOW_CONTINUOUS_QUERIES | NO | -- IGNORE | | -| CREATE_PIPEPLUGIN | NO | USE_PIPE | NO | -| DROP_PIPEPLUGINS | NO | USE_PIPE | NO | -| SHOW_PIPEPLUGINS | NO | -- IGNORE | | -| CREATE_PIPE | NO | USE_PIPE | NO | -| START_PIPE | NO | USE_PIPE | NO | -| STOP_PIPE | NO | USE_PIPE | NO | -| DROP_PIPE | NO | USE_PIPE | NO | -| SHOW_PIPES | NO | -- IGNORE | | -| CREATE_VIEW | YES | WRITE_SCHEMA | YES | -| ALTER_VIEW | YES | WRITE_SCHEMA | YES | -| RENAME_VIEW | YES | WRITE_SCHEMA | YES | -| DELETE_VIEW | YES | WRITE_SCHEMA | YES | diff --git a/src/UserGuide/latest/User-Manual/Auto-Start-On-Boot_timecho.md b/src/UserGuide/latest/User-Manual/Auto-Start-On-Boot_timecho.md deleted file mode 100644 index 1fbcd4c6f..000000000 --- a/src/UserGuide/latest/User-Manual/Auto-Start-On-Boot_timecho.md +++ /dev/null @@ -1,213 +0,0 @@ - - -# Auto-start on Boot -## 1. Overview -TimechoDB supports registering ConfigNode, DataNode, and AINode as Linux system services via the three scripts `daemon-confignode.sh`, `daemon-datanode.sh`, and `daemon-ainode.sh`. Combined with the system-built `systemctl` command, it manages the TimechoDB cluster in daemon mode, enabling more convenient startup, shutdown, restart, and auto-start on boot operations, and improving service stability. - -> Note: This feature is available starting from version 2.0.9.1. - -## 2. Environment Requirements -| Item | Specification | -|--------------|-------------------------------------------------------------------------------| -| OS | Linux (supports the `systemctl` command) | -| User Privilege | root user | -| Environment Variable | `JAVA_HOME` must be set before deploying ConfigNode and DataNode | - -## 3. 
Service Registration -Enter the TimechoDB installation directory and execute the corresponding daemon script: - -```bash -# Register ConfigNode service -./tools/ops/daemon-confignode.sh - -# Register DataNode service -./tools/ops/daemon-datanode.sh - -# Register AINode service -./tools/ops/daemon-ainode.sh -``` - -During script execution, you will be prompted with two options: -1. Whether to start the corresponding TimechoDB service immediately (timechodb-confignode / timechodb-datanode / timechodb-ainode); -2. Whether to register the corresponding service for auto-start on boot. - -After script execution, the corresponding service files will be generated in the `/etc/systemd/system/` directory: -- `timechodb-confignode.service` -- `timechodb-datanode.service` -- `timechodb-ainode.service` - -## 4. Service Management -After service registration, you can use `systemctl` commands to start, stop, restart, check status, and configure auto-start on boot for each TimechoDB node service. All commands below must be executed as the root user. - -### 4.1 Manual Service Startup -```bash -# Start ConfigNode service -systemctl start timechodb-confignode -# Start DataNode service -systemctl start timechodb-datanode -# Start AINode service -systemctl start timechodb-ainode -``` - -### 4.2 Manual Service Shutdown -```bash -# Stop ConfigNode service -systemctl stop timechodb-confignode -# Stop DataNode service -systemctl stop timechodb-datanode -# Stop AINode service -systemctl stop timechodb-ainode -``` - -After stopping the service, check the service status. If it shows `inactive (dead)`, the service has been shut down successfully. For other statuses, check TimechoDB logs to analyze exceptions. 
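The status check described above can be scripted. The following is a minimal sketch, assuming the three service names registered earlier; `classify_stop_state` is a hypothetical helper and not part of TimechoDB. It relies on `systemctl is-active`, which prints the unit's active state (for example `inactive` or `failed`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: map the state printed by `systemctl is-active`
# to a short verdict for a service that was just stopped.
classify_stop_state() {
  case "$1" in
    inactive) echo "stopped" ;;
    *)        echo "check-logs" ;;
  esac
}

for svc in timechodb-confignode timechodb-datanode timechodb-ainode; do
  # `is-active` exits non-zero for any state other than active, hence `|| true`.
  state="$(systemctl is-active "$svc" 2>/dev/null || true)"
  echo "$svc: ${state:-unknown} -> $(classify_stop_state "$state")"
done
```

Any verdict other than `stopped` is a prompt to look at the TimechoDB logs, as the text above recommends.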
- -### 4.3 Check Service Status -```bash -# Check ConfigNode service status -systemctl status timechodb-confignode -# Check DataNode service status -systemctl status timechodb-datanode -# Check AINode service status -systemctl status timechodb-ainode -``` - -Status Description: -- `active (running)`: Service is running. If this status persists for 10 minutes, the service has started successfully. -- `failed`: Service startup failed. Check TimechoDB logs for troubleshooting. - -### 4.4 Restart Service -Restarting a service is equivalent to stopping and then starting it. Commands are as follows: -```bash -# Restart ConfigNode service -systemctl restart timechodb-confignode -# Restart DataNode service -systemctl restart timechodb-datanode -# Restart AINode service -systemctl restart timechodb-ainode -``` - -### 4.5 Enable Auto-start on Boot -```bash -# Enable ConfigNode auto-start on boot -systemctl enable timechodb-confignode -# Enable DataNode auto-start on boot -systemctl enable timechodb-datanode -# Enable AINode auto-start on boot -systemctl enable timechodb-ainode -``` - -### 4.6 Disable Auto-start on Boot -```bash -# Disable ConfigNode auto-start on boot -systemctl disable timechodb-confignode -# Disable DataNode auto-start on boot -systemctl disable timechodb-datanode -# Disable AINode auto-start on boot -systemctl disable timechodb-ainode -``` - -## 5. Custom Service Configuration -### 5.1 Customization Methods -#### 5.1.1 Method 1: Modify the Script -1. Modify the `[Unit]`, `[Service]`, and `[Install]` sections in the `daemon-xxx.sh` script. For details of configuration items, refer to the next section. -2. Execute the `daemon-xxx.sh` script. - -#### 5.1.2 Method 2: Modify the Service File -1. Modify the `xx.service` file in `/etc/systemd/system`. -2. Execute `systemctl daemon-reload`. 
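Method 2 can be sketched as the sequence below. This is an illustration rather than an official procedure: the `run` helper and the `DRY_RUN` switch are hypothetical and exist only so the commands can be previewed safely, and the final restart is an assumption based on general systemd behavior (a running service keeps its old settings until it is restarted).

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: print each command; execute it only when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

SERVICE="timechodb-confignode"   # or timechodb-datanode / timechodb-ainode

# 1. Edit /etc/systemd/system/${SERVICE}.service by hand (for example, change RestartSec).
# 2. Ask systemd to re-read its unit files.
run systemctl daemon-reload
# 3. Assumption: restart so the running service picks up the new settings.
run systemctl restart "$SERVICE"
# 4. Confirm the result.
run systemctl status "$SERVICE" --no-pager
```

Running the script as is only prints the commands; setting `DRY_RUN=0` as the root user executes them.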
- -### 5.2 `daemon-xxx.sh` Configuration Items -#### 5.2.1 `[Unit]` Section (Service Metadata) -| Item | Description | -|---------------|-----------------------------------------------------------------------------| -| Description | Service description | -| Documentation | Link to the official TimechoDB documentation | -| After | Ensures the service starts only after the network service has started | - -#### 5.2.2 `[Service]` Section (Service Runtime Configuration) -| Item | Meaning | -|-------------------------------------------|-----------------------------------------------------------------------------------------------------------| -| StandardOutput, StandardError | Specify storage paths for service standard output and error logs | -| LimitNOFILE=65536 | Set the maximum number of file descriptors, default value is 65536 | -| Type=simple | Service type is a simple foreground process; systemd tracks the main service process | -| User=root, Group=root | Run the service with root user and group permissions | -| ExecStart / ExecStop | Specify the paths of the service startup and shutdown scripts respectively | -| Restart=on-failure | Automatically restart the service only if it exits abnormally | -| SuccessExitStatus=143 | Treat exit code 143 (128+15, normal termination via SIGTERM) as a successful exit | -| RestartSec=5 | Interval between service restarts, default 5 seconds | -| StartLimitInterval=600s, StartLimitBurst=3 | Maximum 3 restarts within 10 minutes (600 seconds) to prevent excessive resource consumption from frequent restarts | -| RestartPreventExitStatus=SIGKILL | Do not auto-restart the service if killed by the SIGKILL signal, avoiding infinite restart of zombie processes | - -#### 5.2.3 `[Install]` Section (Installation Configuration) -| Item | Meaning | -|-----------------------|----------------------------------------------------------------------| -| WantedBy=multi-user.target | Start the service automatically when the system enters multi-user 
mode | - -### 5.3 Sample `.service` File Format -```bash -[Unit] -Description=timechodb-confignode -Documentation=https://www.timecho.com/ -After=network.target - -[Service] -StandardOutput=null -StandardError=null -LimitNOFILE=65536 -Type=simple -User=root -Group=root -Environment=JAVA_HOME=$JAVA_HOME -ExecStart=$TimechoDB_SBIN_HOME/start-confignode.sh -Restart=on-failure -SuccessExitStatus=143 -RestartSec=5 -StartLimitInterval=600s -StartLimitBurst=3 -RestartPreventExitStatus=SIGKILL - -[Install] -WantedBy=multi-user.target -``` - -Note: The above is the standard format of the `timechodb-confignode.service` file. The formats of `timechodb-datanode.service` and `timechodb-ainode.service` are similar. - -## 6. Notes -1. **Process Daemon Mechanism** - - **Auto-restart**: The system will auto-restart the service if it fails to start or exits abnormally during runtime (e.g., OOM). - - **No restart**: Normal exits (e.g., executing `kill`, `./sbin/stop-xxx.sh`, or `systemctl stop`) will not trigger auto-restart. - -2. **Log Location** - - All runtime logs are stored in the `logs` folder under the TimechoDB installation directory. Refer to this directory for troubleshooting. - -3. **Cluster Status Check** - - After service startup, execute `./sbin/start-cli.sh` and run the `show cluster` command to view the cluster status. - -4. **Fault Recovery Procedure** - - If the service status is `failed`, after fixing the issue, **you must first execute `systemctl daemon-reload`** before running `systemctl start`, otherwise startup will fail. - -5. **Configuration Activation** - - After modifying the `daemon-xxx.sh` script, execute `systemctl daemon-reload` to re-register the service for new configurations to take effect. - -6. **Startup Mode Compatibility** - - Services started via `systemctl start` can be stopped using `./sbin/stop` (no restart triggered). - - Processes started via `./sbin/start` cannot be monitored via `systemctl`. 
\ No newline at end of file
diff --git a/src/UserGuide/latest/User-Manual/Black-White-List_timecho.md b/src/UserGuide/latest/User-Manual/Black-White-List_timecho.md
deleted file mode 100644
index 2692edd4a..000000000
--- a/src/UserGuide/latest/User-Manual/Black-White-List_timecho.md
+++ /dev/null
@@ -1,78 +0,0 @@
-
-
-# Black White List
-
-## 1. Introduction
-
-IoTDB is a time-series database designed for IoT scenarios, supporting efficient data storage, query, and analysis. With the widespread application of IoT technology, data security and access control have become critical. In open environments, ensuring secure data access for legitimate users is a key challenge. The whitelist mechanism allows only trusted IPs or users to connect, reducing the attack surface at the source. The blacklist function can block malicious IPs in real time in edge-cloud collaborative scenarios, preventing unauthorized access, SQL injection, brute-force attacks, DDoS, and other threats, thereby providing continuous and stable security for data transmission.
-
-> Note: This feature is available starting from version 2.0.6.
-
-## 2. Whitelist
-
-### 2.1 Function Description
-
-Enabling the whitelist function and configuring the whitelist specifies which client addresses are allowed to connect to IoTDB. Only clients on the whitelist can access IoTDB, which provides security control.
-
-### 2.2 Configuration Parameters
-
-Administrators can enable or disable the whitelist function and add, modify, or delete whitelist IPs/IP segments in the following two ways:
-
-* Edit the configuration file `iotdb-system.properties`.
-* Use the `set configuration` statement.
-  * Tree model reference: [set configuration](../Reference/Modify-Config-Manual.md)
-
-Related parameters are as follows:
-
-| Name | Description | Default Value | Effective Mode | Example |
-| ----------------- | ----------- | ------------- | -------------- | ------- |
-| `enable_white_list` | Whether to enable the whitelist function. true: enable; false: disable. The value is case-insensitive. | false | Hot reload | `set enable_white_list = 'true'` |
-| `white_ip_list` | Add, modify, or delete whitelist IPs/IP segments. Supports exact match and the \* wildcard. Multiple IPs are separated by commas. | empty | Hot reload | `set white_ip_list='192.168.1.200,192.168.1.201,192.168.1.*'` |
-
-## 3. Blacklist
-
-### 3.1 Function Description
-
-Enabling the blacklist function and configuring the blacklist prevents specific IP addresses from accessing the database. This guards against unauthorized access, SQL injection, brute-force attacks, DDoS attacks, and other security threats, and helps keep data transmission secure and stable.
-
-### 3.2 Configuration Parameters
-
-Administrators can enable or disable the blacklist function and add, modify, or delete blacklist IPs/IP segments in the following two ways:
-
-* Edit the configuration file `iotdb-system.properties`.
-* Use the `set configuration` statement.
-  * Tree model reference: [set configuration](../Reference/Modify-Config-Manual.md)
-
-Related parameters are as follows:
-
-| Name | Description | Default Value | Effective Mode | Example |
-| ------------------- | ----------- | ------------- | -------------- | ------- |
-| `enable_black_list` | Whether to enable the blacklist function. true: enable; false: disable. The value is case-insensitive. | false | Hot reload | `set enable_black_list = 'true'` |
-| `black_ip_list` | Add, modify, or delete blacklist IPs/IP segments. Supports exact match and the \* wildcard. Multiple IPs are separated by commas. | empty | Hot reload | `set black_ip_list='192.168.1.200,192.168.1.201,192.168.1.*'` |
-
-## 4. Notes
-
-1. After the whitelist is enabled, all connections are denied if the list is empty, and local login is denied if the local IP is not included.
-2. When the same IP appears in both the whitelist and the blacklist, the blacklist takes precedence.
-3. The system validates the IP format. An invalid entry is reported as an error when a user connects and is then skipped; it does not affect the loading of the other, valid entries.
-4. Duplicate IPs in the configuration are allowed; they are deduplicated automatically in memory without notification. To remove duplicates from the configuration itself, edit it manually.
-5. Blacklist/whitelist rules apply only to new connections. Connections established before the function is enabled are not affected; they are checked only when they reconnect.
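Notes 1 and 2 can be read as a small decision procedure. The sketch below is illustrative only: `matches_list` and `is_allowed` are hypothetical names, and treating `*` as a shell-style glob is an assumption about how wildcard entries are matched, not IoTDB's actual implementation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if IP $1 matches any entry of the
# comma-separated list $2. Each entry is used as a glob pattern,
# so 192.168.1.* matches 192.168.1.42 (assumed wildcard semantics).
matches_list() {
  ip="$1"
  old_ifs="$IFS"
  IFS=','
  set -f                      # keep '*' literal while splitting the list
  for pattern in $2; do
    case "$ip" in
      $pattern) IFS="$old_ifs"; set +f; return 0 ;;
    esac
  done
  IFS="$old_ifs"
  set +f
  return 1
}

# Decide whether a new connection from IP $1 is accepted.
# $2 = whitelist enabled (true/false), $3 = white_ip_list,
# $4 = blacklist enabled (true/false), $5 = black_ip_list.
is_allowed() {
  # Note 2: the blacklist takes precedence over the whitelist.
  if [ "$4" = "true" ] && matches_list "$1" "$5"; then
    return 1
  fi
  # Note 1: with the whitelist enabled, an empty list denies everyone.
  if [ "$2" = "true" ] && ! matches_list "$1" "$3"; then
    return 1
  fi
  return 0
}

is_allowed 192.168.1.42 true '192.168.1.*' true '192.168.1.200' && echo "192.168.1.42 accepted"
is_allowed 192.168.1.200 true '192.168.1.*' true '192.168.1.200' || echo "192.168.1.200 rejected"
```

Checking the blacklist first is what makes it win when an address appears in both lists; the sketch does not model format validation or deduplication (notes 3 and 4).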
diff --git a/src/UserGuide/latest/User-Manual/Data-Sync_timecho.md b/src/UserGuide/latest/User-Manual/Data-Sync_timecho.md deleted file mode 100644 index b3bf1debd..000000000 --- a/src/UserGuide/latest/User-Manual/Data-Sync_timecho.md +++ /dev/null @@ -1,744 +0,0 @@ - - -# Data Sync - -Data synchronization is a typical requirement in industrial Internet of Things (IoT). Through data synchronization mechanisms, it is possible to achieve data sharing between IoTDB, and to establish a complete data link to meet the needs for internal and external network data interconnectivity, edge-cloud synchronization, data migration, and data backup. - -## 1. Function Overview - -### 1.1 Data Synchronization - -A data synchronization task consists of three stages: - -![](/img/data-sync-new.png) - -- Source Stage:This part is used to extract data from the source IoTDB, defined in the source section of the SQL statement. -- Process Stage:This part is used to process the data extracted from the source IoTDB, defined in the processor section of the SQL statement. -- Sink Stage:This part is used to send data to the target IoTDB, defined in the sink section of the SQL statement. - -By declaratively configuring the specific content of the three parts through SQL statements, flexible data synchronization capabilities can be achieved. Currently, data synchronization supports the synchronization of the following information, and you can select the synchronization scope when creating a synchronization task (the default is data.insert, which means synchronizing newly written data): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-| Synchronization Scope | | Synchronization Content Description |
-|-----------------------|------------|-------------------------------------|
-| all | | All scopes |
-| data (Data) | insert | Synchronize newly written data |
-| | delete | Synchronize deleted data |
-| schema | database | Synchronize database creation, modification or deletion operations |
-| | timeseries | Synchronize the definition and attributes of time series |
-| | TTL | Synchronize the data retention time |
-| auth | - | Synchronize user permissions and access control |
- -### 1.2 Functional limitations and instructions - -1. The schema and auth synchronization functions have the following limitations: - -- When using schema synchronization, it is required that the consensus protocol for `Schema region` and `ConfigNode` must be the default ratis protocol. This means that the `iotdb-system.properties` configuration file should contain the settings `config_node_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus` and `schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus`. If these are not specified, the default ratis protocol is used. - -- To prevent potential conflicts, please disable the automatic creation of schema on the receiving end when enabling schema synchronization. This can be done by setting the `enable_auto_create_schema` configuration in the `iotdb-system.properties` file to false. - -- When schema synchronization is enabled, the use of custom plugins is not supported. - -- In a dual-active cluster, schema synchronization should avoid simultaneous operations on both ends. - -- During data synchronization tasks, please avoid performing any deletion operations to prevent inconsistent states between the two ends. - -2. Pipe Permission Control Specifications - -- When creating a pipe, a username and password can be specified for the extraction/write‑back plugins. If the password is incorrect, creation is prohibited. If not specified, the current user is used for synchronization by default. - -- During data/metadata synchronization, filtering is first performed according to the path pattern (pattern/path) configured in the Pipe, followed by authentication based on the user’s read permissions: - - If the permission scope is greater than or equal to the write path: full synchronization. - - If the permission scope has no intersection with the write path: no synchronization. 
- - If the permission scope is smaller than the write path or overlaps partially: synchronize only the intersecting part. - -- When encountering data for which the user lacks permission: - - If the sender’s skipIf=no‑privileges, the unauthorized data is skipped. - - If skipIf is left empty (unconfigured), the task reports an error (Error 803). - - Note: This skipIf configuration is independent of the receiver’s skipIf setting (which defaults to empty). - -- Data under root.__system and root.__audit will not be synchronized. - -3. Automatic Type Conversion for Pipe Sink - -When Pipe fails to write data to the sink due to field type mismatches, IoTDB automatically converts the data to the field types defined in the existing sink schema and retries the write operation to improve synchronization success rate. This feature is controlled by the parameter `sink.exception.data.convert-on-type-mismatch`. Refer to the subsequent sink parameter table for detailed parameter descriptions. - -The conversion rules for type mismatches are as follows: - -| Source Type | Target Type | Conversion Rule | -|---------------------|-------------|---------------------------------------------------------------------------------| -| Numeric Type | Numeric Type| Convert to the target numeric type. Truncation, precision loss or overflow may occur. | -| Numeric Type | BOOLEAN | `0` is converted to `false`; non-zero values are converted to `true`. | -| BOOLEAN | Numeric Type| `true` is converted to `1`; `false` is converted to `0`. | -| TEXT, STRING, BLOB | BOOLEAN | Parse the string into a BOOLEAN value. | -| TEXT, STRING, BLOB | Numeric Type| Parse the string into the target numeric type. If parsing fails, write the default value `0`, `0L` or `0.0`. | -| TEXT, STRING, BLOB | TIMESTAMP | Parse the string into a TIMESTAMP value. If parsing fails, write the default value `0L`. | -| TEXT, STRING, BLOB | DATE | Parse the string into a DATE value. 
If parsing fails, write the default date `1970-01-01`. | -| Invalid Numeric Value | DATE | If conversion to a valid DATE fails, write the default date `1970-01-01`. | -| DATE | TIMESTAMP | Convert to the timestamp of 00:00 (UTC) on the same day. | -| TIMESTAMP | DATE | Convert to the corresponding date in UTC. | - -> **Note**: Automatic conversion is performed based on the existing sink schema and will **not** modify the sink schema. This feature prioritizes continuous data synchronization, which may result in precision loss or writing of default values. - - -## 2. Usage Instructions - -Data synchronization tasks have three states: RUNNING, STOPPED, and DROPPED. The task state transitions are shown in the following diagram: - -![](/img/Data-Sync02.png) - -After creation, the task will start directly, and when the task stops abnormally, the system will automatically attempt to restart the task. - -Provide the following SQL statements for state management of synchronization tasks. - -### 2.1 Create Task - -Use the `CREATE PIPE` statement to create a data synchronization task. The `PipeId` and `sink` attributes are required, while `source` and `processor` are optional. When entering the SQL, note that the order of the `SOURCE` and `SINK` plugins cannot be swapped. - -The SQL example is as follows: - -```SQL -CREATE PIPE [IF NOT EXISTS] -- PipeId is the name that uniquely identifies the task. --- Data extraction plugin, optional plugin -WITH SOURCE ( - [ = ,], -) --- Data processing plugin, optional plugin -WITH PROCESSOR ( - [ = ,], -) --- Data connection plugin, required plugin -WITH SINK ( - [ = ,], -) -``` - -**IF NOT EXISTS semantics**: Used in creation operations to ensure that the create command is executed when the specified Pipe does not exist, preventing errors caused by attempting to create an existing Pipe. - -**Note**: - -Starting from V2.0.8, when creating a full data synchronization Pipe (e.g. 
PipeId: `alldatapipe`), the system automatically splits it into two independent Pipes:
-
-* History Pipe: The PipeId is the original name plus the suffix `_history` (e.g. `alldatapipe_history`). The source parameter carries the default configurations: `'realtime.enable'='false', 'inclusion'='data.insert', 'inclusion.exclusion'=''`
-* Realtime Pipe: The PipeId is the original name plus the suffix `_realtime` (e.g. `alldatapipe_realtime`). The source parameter carries the default configuration: `'history.enable'='false'`. If metadata synchronization is configured, the Realtime Pipe will be responsible for sending the data.
-
-After successful creation, the original PipeId (e.g. `alldatapipe`) is no longer a valid identifier. When starting, stopping, deleting, or viewing the task, you must use one of the split PipeIds (i.e. `*_history` or `*_realtime`). For operation examples, see the [View Task](./Data-Sync_timecho.md#_2-5-view-task) section.
-
-### 2.2 Start Task
-
-Start processing data:
-
-```SQL
-START PIPE <PipeId>
-```
-
-### 2.3 Stop Task
-
-Stop processing data:
-
-```SQL
-STOP PIPE <PipeId>
-```
-
-### 2.4 Delete Task
-
-Delete the specified task:
-
-```SQL
-DROP PIPE [IF EXISTS] <PipeId>
-```
-**IF EXISTS semantics**: Used in deletion operations so that the delete command is executed only when the specified Pipe exists, preventing errors caused by attempting to delete a non-existent Pipe.
-
-Deleting a task does not require stopping the synchronization task first.
-
-### 2.5 View Task
-
-View all tasks:
-
-```SQL
-SHOW PIPES
-```
-
-To view a specified task:
-
-```SQL
-SHOW PIPE <PipeId>
-```
-
-Example `SHOW PIPES` output for a single pipe:
-
-```SQL
-+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+
-|                              ID|           CreationTime|  State|PipeSource|PipeProcessor|                                                   PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds|
-+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+
-|59abf95db892428b9d01c5fa318014ea|2024-06-17T14:03:44.189|RUNNING|        {}|           {}|{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}|                |                128|                     1.03|
-+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+
-```
-
-The meaning of each column is as follows:
-
-- **ID**: The unique identifier for the synchronization task
-- **CreationTime**: The time when the synchronization task was created
-- **State**: The state of the synchronization task
-- **PipeSource**: The source of the synchronized data stream
-- **PipeProcessor**: The processing logic of the synchronized data stream during transmission
-- **PipeSink**: The destination of the synchronized data stream
-- **ExceptionMessage**: Displays the exception information of the synchronization task
-- **RemainingEventCount (Statistics with Delay)**: The number of remaining events, which is the total count of all events in the current data synchronization task, including data and schema synchronization events, as well as system and user-defined events.
-- **EstimatedRemainingSeconds (Statistics with Delay)**: The estimated remaining time, based on the current number of events and the rate at the pipe, to complete the transfer. - -Example: - -In V2.0.8 and later versions, create a full data synchronization task and view the task details. - -```sql -IoTDB> create pipe alldatapipe with source('inclusion'='all','exclusion'='auth') with sink('node-urls'='127.0.0.1:6668') - -IoTDB> show pipe alldatapipe_history -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_history|2025-12-18T15:06:16.697|RUNNING|{exclusion=auth, history.enable=true, inclusion=data.insert, inclusion.exclusion=, realtime.enable=false}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -IoTDB> show pipe alldatapipe_realtime -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| 
-+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_realtime|2025-12-18T15:06:16.312|RUNNING|{exclusion=auth, history.enable=false, inclusion=all, realtime.enable=true}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -``` - - -### 2.6 Modify Task - -The `ALTER PIPE` statement dynamically updates an existing PIPE and supports modifying or replacing the configuration of source, processor, and sink. - -```SQL -ALTER PIPE [IF EXISTS] - MODIFY/REPLACE SOURCE(...) - MODIFY/REPLACE PROCESSOR(...) - MODIFY/REPLACE SINK(...) -``` - -Description: - -* Executing this operation does not change the running state of the PIPE. It is equivalent to keeping the processing progress of the original PipeId and creating a new PIPE at the original progress position. -* The modify/replace parameters for source/processor/sink are all optional. If no modification parameter is specified, it is equivalent to deleting the current PIPE and recreating it with the original configuration and progress. -* For a plugin specified with modify, the plugin's other parameters are retained, and only the given parameters are replaced or added. -* For a plugin specified with replace, all parameters of the plugin are replaced directly. -* When the [IF EXISTS] keyword is used, execution succeeds even if no Pipe with the same name exists, but no operation is actually performed. 
- -Example: - -```SQL -ALTER PIPE A2B REPLACE SINK ('sink'='iotdb-thrift-sink', 'node-urls' = '127.0.0.1:6668'); -``` - -### 2.7 Synchronization Plugins - -To make the overall architecture more flexible to match different synchronization scenario requirements, we support plugin assembly within the synchronization task framework. The system comes with some pre-installed common plugins that you can use directly. At the same time, you can also customize processor plugins and Sink plugins, and load them into the IoTDB system for use. You can view the plugins in the system (including custom and built-in plugins) with the following statement: - -```SQL -SHOW PIPEPLUGINS -``` - -The return result is as follows (version 1.3.2): - -```SQL -IoTDB> SHOW PIPEPLUGINS -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| PluginName|PluginType| ClassName| PluginJar| -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.processor.donothing.DoNothingProcessor| | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.donothing.DoNothingConnector| | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector| | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.extractor.iotdb.IoTDBExtractor| | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector| | -| IOTDB-THRIFT-SSL-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector| | 
-+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ - -``` - -Detailed introduction of pre-installed plugins is as follows (for detailed parameters of each plugin, please refer to the [Parameter Description](#reference-parameter-description) section): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-| Type | Custom Plugin | Plugin Name | Description | Applicable Version |
-|:-----|:--------------|:------------|:------------|:-------------------|
-| source plugin | Not Supported | iotdb-source | The default extractor plugin, used to extract historical or real-time data from IoTDB | 1.2.x |
-| processor plugin | Supported | do-nothing-processor | The default processor plugin, which does not process the incoming data | 1.2.x |
-| sink plugin | Supported | do-nothing-sink | Does not process the data that is sent out | 1.2.x |
-| sink plugin | Supported | iotdb-thrift-sink | The default sink plugin (V1.3.1+), used for data transfer between IoTDB (V1.2.0+) and IoTDB (V1.2.0+). It uses the Thrift RPC framework to transfer data, with a multi-threaded async non-blocking IO model and high transfer performance, especially suitable for scenarios where the target end is distributed | 1.2.x |
-| sink plugin | Supported | iotdb-air-gap-sink | Used for data synchronization across unidirectional data diodes from IoTDB (V1.2.0+) to IoTDB (V1.2.0+). Supported diode models include Nanrui Syskeeper 2000, etc. | 1.2.x |
-| sink plugin | Supported | iotdb-thrift-ssl-sink | Used for data transfer between IoTDB (V1.3.1+) and IoTDB (V1.2.0+). It uses the Thrift RPC framework to transfer data, with a single-threaded sync blocking IO model, suitable for scenarios with higher security requirements | 1.3.1+ |
-
-
-For importing custom plugins, please refer to the [Stream Processing](./Streaming_timecho.md#custom-stream-processing-plugin-management) section.
-
-## 3. Use examples
-
-### 3.1 Full data synchronisation
-
-This example demonstrates synchronizing all data from one IoTDB to another, with the data link shown below:
-
-![](/img/pipe1.jpg)
-
-In this example, we create a synchronization task named A2B to synchronize the full data from IoTDB A to IoTDB B. The sink must use the built-in iotdb-thrift-sink plugin, and `node-urls` must be set to the URL of the data service port of a DataNode on the target IoTDB, as shown in the following statement:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668' -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### 3.2 Partial data synchronization
-
-This example demonstrates synchronizing data from a specific historical time range (08:00 on 23 August 2023 to 08:00 on 23 October 2023, UTC) to another IoTDB. The data link is shown below:
-
-![](/img/pipe2.jpg)
-
-In this example, we create a synchronization task named A2B. First, we define the range of data to be transferred in the source. Since the data being transferred is historical data (data that existed before the synchronization task was created), we need to configure the `start-time` and `end-time` of the data and the transfer mode. The URL of the data service port of a DataNode on the target IoTDB is configured through `node-urls`.
-
-The detailed statements are as follows:
-
-```SQL
-create pipe A2B
-WITH SOURCE (
-  'source'= 'iotdb-source',
-  'realtime.mode' = 'stream', -- The extraction mode for newly inserted data (after pipe creation)
-  'path' = 'root.vehicle.**', -- Scope of data synchronization
-  'start-time' = '2023.08.23T08:00:00+00:00', -- The start event time of the data to synchronize (inclusive)
-  'end-time' = '2023.10.23T08:00:00+00:00' -- The end event time of the data to synchronize (inclusive)
-)
-with SINK (
-  'sink'='iotdb-thrift-async-sink',
-  'node-urls' = '127.0.0.1:6668' -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### 3.3 Bidirectional data transfer
-
-This example demonstrates the scenario where two IoTDB instances act as an active-active pair, with the data link shown in the figure below:
-
-![](/img/pipe3.jpg)
-
-In this example, to avoid infinite data loops, the `forwarding-pipe-requests` parameter on both A and B must be set to `false`, so that data received from another pipe is not forwarded again. To keep the data consistent on both sides, the pipes must also be configured with `inclusion=all` to synchronize full data and metadata.
-
-The detailed statements are as follows:
-
-On A IoTDB, execute the following statement:
-
-```SQL
-create pipe AB
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'forwarding-pipe-requests' = 'false' -- Do not forward data written by other Pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668' -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-On B IoTDB, execute the following statement:
-
-```SQL
-create pipe BA
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'forwarding-pipe-requests' = 'false' -- Do not forward data written by other Pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6667' -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### 3.4 Edge-cloud data transfer
-
-This example demonstrates the scenario where data from multiple IoTDB clusters is transferred to the cloud, with data from clusters B, C, and D all synchronized to cluster A, as shown in the figure below:
-
-![](/img/sync_en_03.png)
-
-In this example, to synchronize the data from clusters B, C, and D to A, the pipes BA, CA, and DA each configure `path` to limit the range. To keep the edge and cloud data consistent, each pipe is also configured with `inclusion=all` to synchronize full data and metadata.
The detailed statements are as follows:
-
-On B IoTDB, execute the following statement to synchronize data from B to A:
-
-```SQL
-create pipe BA
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'path'='root.db.**' -- Limit the range
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668' -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-On C IoTDB, execute the following statement to synchronize data from C to A:
-
-```SQL
-create pipe CA
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'path'='root.db.**' -- Limit the range
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668' -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-On D IoTDB, execute the following statement to synchronize data from D to A:
-
-```SQL
-create pipe DA
-with source (
-  'inclusion'='all', -- Indicates synchronization of full data, schema, and auth
-  'path'='root.db.**' -- Limit the range
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668' -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### 3.5 Cascading data transfer
-
-This example demonstrates the scenario where data is transferred in a cascading manner between multiple IoTDB clusters, with data from cluster A synchronized to cluster B, and then to cluster C, as shown in the figure below:
-
-![](/img/sync_en_04.png)
-
-In this example, to synchronize the data from cluster A to C, `forwarding-pipe-requests` must be set to `true` on the pipe from B to C.
The detailed statements are as follows:
-
-On A IoTDB, execute the following statement to synchronize data from A to B:
-
-```SQL
-create pipe AB
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668' -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-On B IoTDB, execute the following statement to synchronize data from B to C:
-
-```SQL
-create pipe BC
-with source (
-  'forwarding-pipe-requests' = 'true' -- Whether to forward data written by other Pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6669' -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-
-### 3.6 Cross-gate data transfer
-
-This example demonstrates the scenario where data from one IoTDB is synchronized to another IoTDB through a unidirectional gateway, as shown in the figure below:
-
-![](/img/cross-network-gateway.png)
-
-
-In this example, the sink must use the iotdb-air-gap-sink plugin. After configuring the gateway, execute the following statement on A IoTDB. Set `node-urls` to the URL of the data service port of the DataNode on the target IoTDB, as configured by the gateway:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-air-gap-sink',
-  'node-urls' = '10.53.53.53:9780' -- The URL of the data service port of the DataNode node on the target IoTDB
-)
-```
-**Note:**
-* When creating a pipe for synchronization across a network gap (data diode), you must ensure that the target user on the receiving end already exists. If the receiving-end user is missing when the pipe is created, data transferred before that user is created will not be synchronized.
-* Currently supported network gap device models are listed in the table below.
-> For other models of network gateway devices, please contact timechodb staff to confirm compatibility.
-
-| Gateway Type | Model | Return Packet Limit | Send Limit |
-|--------------|-------|---------------------|------------|
-| Forward Gate | NARI Syskeeper-2000 Forward Gate | All 0 / All 1 bytes | No Limit |
-| Forward Gate | XJ Self-developed Diaphragm | All 0 / All 1 bytes | No Limit |
-| Unknown | WISGAP | No Limit | No Limit |
-| Forward Gate | KEDONG StoneWall-2000 Network Security Isolation Device | No Limit | No Limit |
-| Reverse Gate | NARI Syskeeper-2000 Reverse Direction | All 0 / All 1 bytes | Meet E Language Format |
-| Unknown | DPtech ISG5000 | No Limit | No Limit |
-| Unknown | GAP XL—GAP | No Limit | No Limit |
-
-
-### 3.7 Compression Synchronization (V1.3.3+)
-
-IoTDB supports specifying a data compression method during synchronization. Configuring the `compressor` parameter enables real-time compression of the transmitted data. `compressor` currently supports five algorithms: snappy / gzip / lz4 / zstd / lzma2. Multiple algorithms can be combined, and they are applied in the configured order. `rate-limit-bytes-per-second` (supported in V1.3.3 and later versions) is the maximum number of bytes allowed to be transmitted per second, measured after compression. If it is less than 0, there is no limit.
-
-For example, to create a synchronization task named A2B:
-
-```SQL
-create pipe A2B
-with sink (
-  'node-urls' = '127.0.0.1:6668', -- The URL of the data service port of the DataNode node on the target IoTDB
-  'compressor' = 'snappy,lz4' -- Compression algorithms
-)
-```
-
-### 3.8 Encrypted Synchronization (V1.3.1+)
-
-IoTDB supports the use of SSL encryption during the synchronization process, ensuring the secure transfer of data between different IoTDB instances.
By configuring the SSL-related parameters, namely the trust store certificate path (`ssl.trust-store-path`) and password (`ssl.trust-store-pwd`), data can be protected by SSL encryption during the synchronization process.
-
-For example, to create a synchronization task named A2B:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-ssl-sink',
-  'node-urls'='127.0.0.1:6667', -- The URL of the data service port of the DataNode node on the target IoTDB
-  'ssl.trust-store-path'='pki/trusted', -- The trust store certificate path required to connect to the target DataNode
-  'ssl.trust-store-pwd'='root' -- The trust store certificate password required to connect to the target DataNode
-)
-```
-
-## 4. Reference: Notes
-
-You can adjust the parameters for data synchronization by modifying the IoTDB configuration file (`iotdb-system.properties`), such as the directory for storing synchronized data. The complete configuration is as follows:
-
-V1.3.3+:
-
-```Properties
-# pipe_receiver_file_dir
-# If this property is unset, system will save the data in the default relative path directory under the IoTDB folder(i.e., %IOTDB_HOME%/${cn_system_dir}/pipe/receiver).
-# If it is absolute, system will save the data in the exact location it points to.
-# If it is relative, system will save the data in the relative path directory it indicates under the IoTDB folder.
-# Note: If pipe_receiver_file_dir is assigned an empty string(i.e.,zero-size), it will be handled as a relative path.
-# effectiveMode: restart
-# For windows platform
-# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is absolute. Otherwise, it is relative.
-# pipe_receiver_file_dir=data\\confignode\\system\\pipe\\receiver
-# For Linux platform
-# If its prefix is "/", then the path is absolute. Otherwise, it is relative.
-pipe_receiver_file_dir=data/confignode/system/pipe/receiver - -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# effectiveMode: first_start -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# effectiveMode: restart -# Datatype: int -pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# effectiveMode: restart -# Datatype: int -pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# effectiveMode: restart -# Datatype: int -pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# effectiveMode: restart -# Datatype: int -pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# effectiveMode: restart -# Datatype: Boolean -pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# Datatype: int -# effectiveMode: restart -pipe_air_gap_receiver_port=9780 - -# The total bytes that all pipe sinks can transfer per second. -# When given a value less than or equal to 0, it means no limit. -# default value is -1, which means no limit. 
-# effectiveMode: hot_reload
-# Datatype: double
-pipe_all_sinks_rate_limit_bytes_per_second=-1
-```
-
-## 5. Reference: parameter description
-
-### 5.1 source parameter(V1.3.3)
-
-| Key | Value | Value range | Required | Default value |
-|:----|:------|:------------|:---------|:--------------|
-| source | iotdb-source | String: iotdb-source | Yes | - |
-| inclusion | Used to specify the range of data to be synchronized in the data synchronization task, including data, schema, and auth | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | No | data.insert |
-| inclusion.exclusion | Used to exclude specific operations from the range specified by inclusion, reducing the amount of data synchronized | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | No | - |
-| mode.streaming | Specifies the capture source for time-series data writes. Applicable when mode.snapshot is false, determining the source for capturing the data.insert specified in inclusion.
Offers two strategies: - true: Dynamic capture selection. The system adaptively chooses between capturing individual write requests or only TsFile sealing requests based on downstream processing speed. Prioritizes capturing write requests for lower latency when processing is fast; captures only file sealing requests to avoid backlog when slow. Suitable for most scenarios, balancing latency and throughput optimally. - false: Fixed batch capture. Captures only TsFile sealing requests. Suitable for resource-constrained scenarios to reduce system load. Note: The snapshot data captured upon pipe startup is only provided to downstream processing in file format. | Boolean: true / false | No | true |
-| mode.strict | Determines the strictness when filtering data using the time / path / database-name / table-name parameters: - true: Strict filtering. The system strictly filters captured data according to the given conditions, ensuring only matching data is selected. - false: Non-strict filtering. The system may include some extra data during filtering. Suitable for performance-sensitive scenarios to reduce CPU and I/O consumption. | Boolean: true / false | No | true |
-| mode.snapshot | Determines the capture mode for time-series data, affecting the data specified in inclusion. Offers two modes: - true: Static data capture. Upon pipe startup, a one-time data snapshot is captured. The pipe will automatically terminate (DROP PIPE SQL is executed automatically) after the snapshot data is fully consumed. - false: Dynamic data capture. In addition to capturing a snapshot upon startup, the pipe continuously captures subsequent data changes. The pipe runs continuously to handle the dynamic data stream. | Boolean: true / false | No | false |
-| path | Can be specified when the user connects with sql_dialect set to tree. For upgraded user pipes, the default sql_dialect is tree.
This parameter determines the capture scope for time-series data, affecting the data specified in inclusion, as well as some sequence-related metadata. Data is selected into the streaming pipe if its tree model path matches the specified path.
Starting from version V2.0.8.2, this parameter supports specifying multiple exact paths in a single pipe, e.g., `'path'='root.test.d0.s1,root.test.d0.s2,root.test.d0.s3'`. | String: IoTDB-standard tree path pattern, wildcards allowed | No | root.** | -| start-time | The start event time for synchronizing all data, including start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | No | Long.MIN_VALUE | -| end-time | The end event time for synchronizing all data, including end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | No | Long.MAX_VALUE | -| forwarding-pipe-requests | Whether to forward data written by other Pipes (usually data synchronization) | Boolean: true / false | No | true | -| mods | Same as mods.enable, whether to send the MODS file for TSFile. | Boolean: true / false | No | false | -| skipIf | Which errors can be skipped? Currently only the insufficient privileges error. | String:no-privileges | No | no-privileges | - -> 💎 **Note:** The difference between the values of true and false for the data extraction mode `mode.streaming` -> -> - True (recommended): Under this value, the task will process and send the data in real-time. Its characteristics are high timeliness and low throughput. -> - False: Under this value, the task will process and send the data in batches (according to the underlying data files). Its characteristics are low timeliness and high throughput. 
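To make the source parameters above concrete, here is a minimal sketch of a one-off snapshot transfer. The pipe name `snapshot_pipe`, the path pattern, and the target address are placeholders chosen for illustration, not values from a real deployment; only parameter keys listed in the table are used.

```SQL
create pipe snapshot_pipe
with source (
  'source' = 'iotdb-source',
  'inclusion' = 'data.insert',   -- only inserted data (the documented default)
  'mode.snapshot' = 'true',      -- capture one snapshot; the pipe is dropped automatically once it is consumed
  'mode.strict' = 'true',        -- filter strictly by the path below
  'path' = 'root.vehicle.**'     -- hypothetical tree-model path pattern
)
with sink (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668' -- placeholder address of a DataNode on the target IoTDB
)
```

Setting `mode.snapshot` to `false` instead would describe a long-running pipe that keeps capturing changes after the initial snapshot, in which case `mode.streaming` decides how new writes are captured.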
- -### 5.2 sink parameter - -#### iotdb-thrift-sink - -| **Parameter** | **Description** | Value Range | Required | Default Value | -|:------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :----------------------------------------------------------- | :------- |:---------------------------------------------| -| sink | iotdb-thrift-sink or iotdb-thrift-async-sink | String: iotdb-thrift-sink or iotdb-thrift-async-sink | Yes | - | -| node-urls | URLs of the DataNode service ports on the target IoTDB. (please note that the synchronization task does not support forwarding to its own service). | String. Example:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Yes | - | -| user/username | Username for connecting to the target IoTDB. Must have appropriate permissions. | String | No | root | -| password | Password for the username. | String | No | TimechoDB@2021 (Before V2.0.6.x it is root) | -| batch.enable | Enables batch mode for log transmission to improve throughput and reduce IOPS. | Boolean: true, false | No | true | -| batch.max-delay-seconds | Maximum delay (in seconds) for batch transmission. | Integer | No | 1 | -| batch.max-delay-ms | Maximum delay (in ms) for batch transmission. (Available since v2.0.5) | Integer | No | 1 | -| batch.size-bytes | Maximum batch size (in bytes) for batch transmission. | Long | No | 16*1024*1024 | -| compressor | The selected RPC compression algorithm. 
Multiple algorithms can be configured and will be adopted in sequence for each request. | String: snappy / gzip / lz4 / zstd / lzma2 | No | "" |
-| compressor.zstd.level | When the selected RPC compression algorithm is zstd, this parameter can be used to additionally configure the compression level of the zstd algorithm. | Int: [-131072, 22] | No | 3 |
-| rate-limit-bytes-per-second | The maximum number of bytes allowed to be transmitted per second, calculated after compression. If it is less than 0, there is no limit. | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | No | -1 |
-| load-tsfile-strategy | When synchronizing file data, whether the receiver waits for the local load tsfile operation to complete before responding to the sender:
sync: Wait for the local load tsfile operation to complete before returning the response.
​​async​​: Do not wait for the local load tsfile operation to complete; return the response immediately. | String: sync / async | No | sync | -| format | The payload formats for data transmission include the following options:
- hybrid: The format depends on what is passed from the processor (either tsfile or tablet), and the sink performs no conversion.
- tsfile: Data is forcibly converted to tsfile format before transmission. This is suitable for scenarios like data file backup.
- tablet: Data is forcibly converted to tablet format before transmission. This is useful for data synchronization when the sender and receiver have incompatible data types (to minimize errors). | String: hybrid / tsfile / tablet | No | hybrid |
-| exception.data.convert-on-type-mismatch | Whether to enable automatic conversion when data types mismatch on the sink side | Boolean: true / false | No | true |
-
-#### iotdb-air-gap-sink
-
-| Key | Value | Value Range | Required | Default Value |
-|:----|:------|:------------|:---------|:--------------|
-| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | Required | - |
-| node-urls | The URL of the data service port of any DataNode on the target IoTDB | String. Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - |
-| user/username | Username for connecting to the target IoTDB. Must have appropriate permissions. | String | No | root |
-| password | Password for the username. | String | No | TimechoDB@2021 (Before V2.0.6.x it is root) |
-| compressor | The selected RPC compression algorithm. Multiple algorithms can be configured and will be adopted in sequence for each request. | String: snappy / gzip / lz4 / zstd / lzma2 | No | "" |
-| compressor.zstd.level | When the selected RPC compression algorithm is zstd, this parameter can be used to additionally configure the compression level of the zstd algorithm.
| Int: [-131072, 22] | No | 3 | -| rate-limit-bytes-per-second | Maximum bytes allowed per second for transmission (calculated after compression). Set to a value less than 0 for no limit. | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | No | -1 | -| load-tsfile-strategy | When synchronizing file data, ​​whether the receiver waits for the local load tsfile operation to complete before responding to the sender​​:
sync: Wait for the local load tsfile operation to complete before returning the response.
​​async​​: Do not wait for the local load tsfile operation to complete; return the response immediately. | String: sync / async | No | sync | -| air-gap.handshake-timeout-ms | The timeout duration of the handshake request when the sender and receiver first attempt to establish a connection, unit: ms | Integer | No | 5000 | -| exception.data.convert-on-type-mismatch | Whether to enable automatic conversion when data types mismatch on the sink side | Boolean: true / false | No | true | - -#### iotdb-thrift-ssl-sink - -| **Parameter** | **Description** | Value Range | Required | Default Value | -|:-------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:---------|:---------------------------------------------| -| sink | iotdb-thrift-ssl-sink | String: iotdb-thrift-ssl-sink | Yes | - | -| node-urls | URLs of the DataNode service ports on the target IoTDB. (please note that the synchronization task does not support forwarding to its own service). | String. Example:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Yes | - | -| user/username | Username for connecting to the target IoTDB. Must have appropriate permissions. | String | No | root | -| password | Password for the username. | String | No | TimechoDB@2021 (Before V2.0.6.x it is root) | -| batch.enable | Enables batch mode for log transmission to improve throughput and reduce IOPS. 
| Boolean: true, false | No | true | -| batch.max-delay-seconds | Maximum delay (in seconds) for batch transmission. | Integer | No | 1 | -| batch.max-delay-ms | Maximum delay (in ms) for batch transmission. (Available since v2.0.5) | Integer | No | 1 | -| batch.size-bytes | Maximum batch size (in bytes) for batch transmission. | Long | No | 16*1024*1024 | -| compressor | The selected RPC compression algorithm. Multiple algorithms can be configured and will be adopted in sequence for each request. | String: snappy / gzip / lz4 / zstd / lzma2 | No | "" | -| compressor.zstd.level | When the selected RPC compression algorithm is zstd, this parameter can be used to additionally configure the compression level of the zstd algorithm. | Int: [-131072, 22] | No | 3 | -| rate-limit-bytes-per-second | Maximum bytes allowed per second for transmission (calculated after compression). Set to a value less than 0 for no limit. | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | No | -1 | -| load-tsfile-strategy | When synchronizing file data, ​​whether the receiver waits for the local load tsfile operation to complete before responding to the sender​​:
sync: Wait for the local load tsfile operation to complete before returning the response.
async: Do not wait for the local load tsfile operation to complete; return the response immediately. | String: sync / async | No | sync |
-| ssl.trust-store-path | Path to the trust store certificate for SSL connection. | String. Example: 'pki/trusted' | Yes | - |
-| ssl.trust-store-pwd | Password for the trust store certificate. | String | Yes | - |
-| format | The payload formats for data transmission include the following options:
- hybrid: The format depends on what is passed from the processor (either tsfile or tablet), and the sink performs no conversion.
- tsfile: Data is forcibly converted to tsfile format before transmission. This is suitable for scenarios like data file backup.
- tablet: Data is forcibly converted to tablet format before transmission. This is useful for data synchronization when the sender and receiver have incompatible data types (to minimize errors). | String: hybrid / tsfile / tablet | No | hybrid | -| exception.data.convert-on-type-mismatch | Whether to enable automatic conversion when data types mismatch on the sink side | Boolean: true / false | No | true | - diff --git a/src/UserGuide/latest/User-Manual/Data-subscription_timecho.md b/src/UserGuide/latest/User-Manual/Data-subscription_timecho.md deleted file mode 100644 index 2012b92a8..000000000 --- a/src/UserGuide/latest/User-Manual/Data-subscription_timecho.md +++ /dev/null @@ -1,147 +0,0 @@ -# Data Subscription - -## 1. Feature Introduction - -The IoTDB data subscription module (also known as the IoTDB subscription client) is a feature supported after IoTDB V1.3.3, which provides users with a streaming data consumption method that is different from data queries. It draws on the basic concepts and logic of message queue products such as Kafka, **providing data subscription and consumption interfaces**, but it is not intended to completely replace these message queue products. Instead, it offers more convenient data subscription services for scenarios where simple streaming data acquisition is needed. - -Using the IoTDB Subscription Client to consume data has significant advantages in the following application scenarios: - -1. **Continuously obtaining the latest data**: Compared with scheduled queries, subscription delivers data closer to real time, makes application code simpler, and places a lower burden on the system; - -2. **Simplifying data push to third-party systems**: There is no need to develop a data push component inside IoTDB for each target system; data can be consumed as a stream within the third-party system, making it easier to send data to systems such as Flink, Kafka, DataX, Camel, MySQL, PG, etc. - -## 2. 
Key Concepts - -The IoTDB Subscription Client encompasses three core concepts: Topic, Consumer, and Consumer Group. The specific relationships are illustrated in the diagram below: - -
- -
- -1. **Topic**: A Topic is the data space of IoTDB, represented by paths and time ranges (such as the full time range of `root.**`). Consumers can subscribe to data on these topics (both data that already exists and data written in the future). Unlike Kafka, IoTDB can create topics after data is stored, and the output format can be either Message or TsFile. - -2. **Consumer**: A Consumer is an IoTDB subscription client responsible for receiving and processing data published to specific topics. Consumers retrieve data from the queue and process it accordingly. There are two types of Consumers available in the IoTDB subscription client: - - `SubscriptionPullConsumer`, which corresponds to the pull consumption model in message queues, where user code needs to actively invoke data retrieval logic. - - `SubscriptionPushConsumer`, which corresponds to the push consumption model in message queues, where user code is triggered by newly arriving data events. - - -3. **Consumer Group**: A Consumer Group is a collection of Consumers that share the same Consumer Group ID. The Consumer Group has the following characteristics: - - Consumer Group and Consumer are in a one-to-many relationship. That is, there can be any number of consumers in a consumer group, but a consumer is not allowed to join multiple consumer groups simultaneously. - - A Consumer Group can have different types of Consumers (`SubscriptionPullConsumer` and `SubscriptionPushConsumer`). - - It is not necessary for all consumers in a Consumer Group to subscribe to the same topic. - - When different Consumers in the same Consumer Group subscribe to the same Topic, each piece of data under that Topic will only be processed by one Consumer within the group, ensuring that data is not processed repeatedly. - -## 3. SQL Statements - -### 3.1 Topic Management - -IoTDB supports the creation, deletion, and viewing of Topics through SQL statements. The status changes of Topics are illustrated in the diagram below: - -
- -
- -#### 3.1.1 Create Topic - -The SQL statement is as follows: - -```SQL - CREATE TOPIC [IF NOT EXISTS] <topicName> - WITH ( - [<key> = <value>,], - ); -``` - -**IF NOT EXISTS semantics**: Used in creation operations to ensure that the create command is executed only when the specified topic does not exist, preventing errors caused by attempting to create an existing topic. - -Detailed explanation of each parameter is as follows: - -| Key | Required or Optional with Default | Description | -| :-------------------------------------------- | :--------------------------------- | :----------------------------------------------------------- | -| **path** | optional: `root.**` | The path of the time series data corresponding to the topic, representing a set of time series to be subscribed. | -| **start-time** | optional: `MIN_VALUE` | The start time (event time) of the time series data corresponding to the topic. Can be in ISO format, such as 2011-12-03T10:15:30 or 2011-12-03T10:15:30+01:00, or a long value representing a raw timestamp consistent with the database's timestamp precision. Supports the special value `now`, which means the creation time of the topic. When start-time is `now` and end-time is MAX_VALUE, it indicates that only real-time data is subscribed. | -| **end-time** | optional: `MAX_VALUE` | The end time (event time) of the time series data corresponding to the topic. Can be in ISO format, such as 2011-12-03T10:15:30 or 2011-12-03T10:15:30+01:00, or a long value representing a raw timestamp consistent with the database's timestamp precision. Supports the special value `now`, which means the creation time of the topic. When end-time is `now` and start-time is MIN_VALUE, it indicates that only historical data is subscribed. 
| -| **processor** | optional: `do-nothing-processor` | The name and parameter configuration of the processor plugin, representing the custom processing logic applied to the original subscribed data, which can be specified in a similar way to pipe processor plugins. | -| **format** | optional: `SessionDataSetsHandler` | Represents the form in which data is subscribed from the topic. Currently supports the following two forms of data: `SessionDataSetsHandler`: Data subscribed from the topic is obtained using `SubscriptionSessionDataSetsHandler`, and consumers can consume each piece of data row by row. `TsFileHandler`: Data subscribed from the topic is obtained using `SubscriptionTsFileHandler`, and consumers can directly subscribe to the TsFile storing the corresponding data. | -| **mode** **(supported in versions 1.3.3.2 and later)** | option: `live` | The subscription mode corresponding to the topic, with two options: `live`: When subscribing to this topic, the subscribed dataset mode is a dynamic dataset, which means that you can continuously consume the latest data. `snapshot`: When the consumer subscribes to this topic, the subscribed dataset mode is a static dataset, which means the snapshot of the data at the moment the consumer group subscribes to the topic (not the moment the topic is created); the formed static dataset after subscription does not support TTL.| -| **loose-range** **(supported in versions 1.3.3.2 and later)** | option: `""` | String: Whether to strictly filter the data corresponding to this topic according to the path and time range, for example: "": Strictly filter the data corresponding to this topic according to the path and time range. `"time"`: Do not strictly filter the data corresponding to this topic according to the time range (rough filter); strictly filter the data corresponding to this topic according to the path. 
`"path"`: Do not strictly filter the data corresponding to this topic according to the path (rough filter); strictly filter the data corresponding to this topic according to the time range. `"time, path"` / `"path, time"` / `"all"`: Do not strictly filter the data corresponding to this topic according to the path and time range (rough filter).| - -Examples are as follows: - -```SQL --- Full subscription -CREATE TOPIC root_all; - --- Custom subscription -CREATE TOPIC IF NOT EXISTS db_timerange -WITH ( - 'path' = 'root.db.**', - 'start-time' = '2023-01-01', - 'end-time' = '2023-12-31' -); -``` - -#### 3.1.2 Delete Topic - -A Topic can only be deleted if it is not subscribed to. When a Topic is deleted, its related consumption progress will be cleared. - -```SQL -DROP TOPIC [IF EXISTS] ; -``` -**IF EXISTS semantics**: Used in deletion operations to ensure that the delete command is executed when a specified topic exists, preventing errors caused by attempting to delete non-existent topics. - -#### 3.1.3 View Topic - -```SQL -SHOW TOPICS; -SHOW TOPIC ; -``` - -Result set: - -```SQL -[TopicName|TopicConfigs] -``` - -- TopicName: Topic ID -- TopicConfigs: Topic configurations - -### 3.2 Check Subscription Status - -View all subscription relationships: - -```SQL --- Query the subscription relationships between all topics and consumer groups -SHOW SUBSCRIPTIONS --- Query all subscriptions under a specific topic -SHOW SUBSCRIPTIONS ON -``` - -Result set: - -```SQL -[TopicName|ConsumerGroupName|SubscribedConsumers] -``` - -- TopicName: The ID of the topic. -- ConsumerGroupName: The ID of the consumer group specified in the user's code. -- SubscribedConsumers: All client IDs in the consumer group that have subscribed to the topic. - -## 4. API interface - -In addition to SQL statements, IoTDB also supports using data subscription features through Java native interfaces, more details see([link](../API/Programming-Java-Native-API_timecho)). - - -## 5. 
Frequently Asked Questions - -### 5.1 What is the difference between IoTDB data subscription and Kafka? - -1. Consumption Orderliness - -- **Kafka guarantees that messages within a single partition are ordered**,when a topic corresponds to only one partition and only one consumer subscribes to this topic, the order in which the consumer (single-threaded) consumes the topic data is the same as the order in which the data is written. -- The IoTDB subscription client **does not guarantee** that the order in which the consumer consumes the data is the same as the order in which the data is written, but it will try to reflect the order of data writing. - -2. Message Delivery Semantics - -- Kafka can achieve Exactly once semantics for both Producers and Consumers through configuration. -- The IoTDB subscription client currently cannot provide Exactly once semantics for Consumers. diff --git a/src/UserGuide/latest/User-Manual/IoTDB-View_timecho.md b/src/UserGuide/latest/User-Manual/IoTDB-View_timecho.md deleted file mode 100644 index 111bf1f01..000000000 --- a/src/UserGuide/latest/User-Manual/IoTDB-View_timecho.md +++ /dev/null @@ -1,547 +0,0 @@ - - -# View - -## 1. Sequence View Application Background - -## 2. Application Scenario 1 Time Series Renaming (PI Asset Management) - -In practice, the equipment collecting data may be named with identification numbers that are difficult to be understood by human beings, which brings difficulties in querying to the business layer. - -The Sequence View, on the other hand, is able to re-organise the management of these sequences and access them using a new model structure without changing the original sequence content and without the need to create new or copy sequences. - -**For example**: a cloud device uses its own NIC MAC address to form entity numbers and stores data by writing the following time sequence:`root.db.0800200A8C6D.xvjeifg`. - -It is difficult for the user to understand. 
However, at this point, the user is able to rename it using the sequence view feature, map it to a sequence view, and use `root.view.device001.temperature` to access the captured data. - -### 2.1 Application Scenario 2 Simplifying business layer query logic - -Sometimes users have a large number of devices that manage a large number of time series. When conducting a certain business, the user wants to deal with only some of these sequences. At this time, the focus of attention can be picked out by the sequence view function, which is convenient for repeated querying and writing. - -**For example**: Users manage a product assembly line with a large number of time series for each segment of the equipment. The temperature inspector only needs to focus on the temperature of the equipment, so he can extract the temperature-related sequences and compose the sequence view. - -### 2.2 Application Scenario 3 Auxiliary Rights Management - -In the production process, different operations are generally responsible for different scopes. For security reasons, it is often necessary to restrict the access scope of the operations staff through permission management. - -**For example**: The safety management department now only needs to monitor the temperature of each device in a production line, but these data are stored in the same database with other confidential data. At this point, it is possible to create a number of new views that contain only temperature-related time series on the production line, and then to give the security officer access to only these sequence views, thus achieving the purpose of permission restriction. - -### 2.3 Motivation for designing sequence view functionality - -Combining the above two types of usage scenarios, the motivations for designing sequence view functionality, are: - -1. time series renaming. -2. to simplify the query logic at the business level. -3. Auxiliary rights management, open data to specific users through the view. - -## 3. 
Sequence View Concepts - -### 3.1 Terminology Concepts - -Concept: If not specified, the views specified in this document are **Sequence Views**, and new features such as device views may be introduced in the future. - -### 3.2 Sequence view - -A sequence view is a way of organising the management of time series. - -In traditional relational databases, data must all be stored in a table, whereas in time series databases such as IoTDB, it is the sequence that is the storage unit. Therefore, the concept of sequence views in IoTDB is also built on sequences. - -A sequence view is a virtual time series, and each virtual time series is like a soft link or shortcut that maps to a sequence or some kind of computational logic external to a certain view. In other words, a virtual sequence either maps to some defined external sequence or is computed from multiple external sequences. - -Users can create views using complex SQL queries, where the sequence view acts as a stored query statement, and when data is read from the view, the stored query statement is used as the source of the data in the FROM clause. - -### 3.3 Alias Sequences - -There is a special class of beings in a sequence view that satisfy all of the following conditions: - -1. the data source is a single time series -2. there is no computational logic -3. no filtering conditions (e.g., no WHERE clause restrictions). - -Such a sequence view is called an **alias sequence**, or alias sequence view. A sequence view that does not fully satisfy all of the above conditions is called a non-alias sequence view. The difference between them is that only aliased sequences support write functionality. - -** All sequence views, including aliased sequences, do not currently support Trigger functionality. ** - -### 3.4 Nested Views - -A user may want to select a number of sequences from an existing sequence view to form a new sequence view, called a nested view. 
- -**The current version does not support the nested view feature**. - -### 3.5 Some constraints on sequence views in IoTDB - -#### Restriction 1 A sequence view must depend on one or several time series - -A sequence view has two possible forms of existence: - -1. it maps to a time series -2. it is computed from one or more time series. - -The former has been exemplified in the previous section and is easy to understand; the latter exists because the sequence view allows for computational logic. - -For example, the user has installed two thermometers in the same boiler and now needs to calculate the average of the two temperature values as a measurement. The user has captured the following two sequences: `root.db.d01.temperature01`, `root.db.d01.temperature02`. - -At this point, the user can use the average of the two sequences as one sequence in the view: `root.db.d01.avg_temperature`. - -This example is expanded in detail in section 4.1, under "Creating views with computational logic". - -#### Restriction 2 Non-alias sequence views are read-only - -Writing to non-alias sequence views is not allowed. - -Only aliased sequence views are supported for writing. - -#### Restriction 3 Nested views are not allowed - -It is not possible to select certain columns in an existing sequence view to create a sequence view, either directly or indirectly. - -An example of this restriction is given in section 4.1, under "Nested sequence views not supported". - -#### Restriction 4 A sequence view and a time series cannot share the same name - -Both sequence views and time series are located under the same tree, so they cannot have the same name. - -The name (path) of any sequence should be uniquely determined. - -#### Restriction 5 Sequence views share time series data with time series, but metadata such as tags is not shared - -Sequence views are mappings pointing to time series, so they fully share time series data, with the time series being responsible for persistent storage. - -However, their metadata such as tags and attributes are not shared. 
- -This is because the business query, view-oriented users are concerned about the structure of the current view, and if you use group by tag and other ways to do the query, obviously want to get the view contains the corresponding tag grouping effect, rather than the time series of the tag grouping effect (the user is not even aware of those time series). - -## 4. Sequence view functionality - -### 4.1 Creating a view - -Creating a sequence view is similar to creating a time series, the difference is that you need to specify the data source, i.e., the original sequence, through the AS keyword. - -#### SQL for creating a view - -User can select some sequences to create a view: - -```SQL -CREATE VIEW root.view.device.status -AS - SELECT s01 - FROM root.db.device -``` - -It indicates that the user has selected the sequence `s01` from the existing device `root.db.device`, creating the sequence view `root.view.device.status`. - -The sequence view can exist under the same entity as the time series, for example: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device -``` - -Thus, there is a virtual copy of `s01` under `root.db.device`, but with a different name `status`. - -It can be noticed that the sequence views in both of the above examples are aliased sequences, and we are giving the user a more convenient way of creating a sequence for that sequence: - -```SQL -CREATE VIEW root.view.device.status -AS - root.db.device.s01 -``` - -#### Creating views with computational logic - -Following the example in section 2.2 Limitations 1: - -> A user has installed two thermometers in the same boiler and now needs to calculate the average of the two temperature values as a measurement. The user has captured the following two sequences: `root.db.d01.temperature01`, `root.db.d01.temperature02`. -> -> At this point, the user can use the two sequences averaged as one sequence in the view: `root.view.device01.avg_temperature`. 
- -If the view is not used, the user can query the average of the two temperatures like this: - -```SQL -SELECT (temperature01 + temperature02) / 2 -FROM root.db.d01 -``` - -And if using a sequence view, the user can create a view this way to simplify future queries: - -```SQL -CREATE VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02) / 2 - FROM root.db.d01 -``` - -The user can then query it like this: - -```SQL -SELECT avg_temperature FROM root.db.d01 -``` - -#### Nested sequence views not supported - -Continuing with the previous example, the user now wants to create a new view using the sequence view `root.db.d01.avg_temperature`, which is not allowed. Nested views are currently not supported, whether or not the view is an aliased sequence. - -For example, the following SQL statement will report an error: - -```SQL -CREATE VIEW root.view.device.avg_temp_copy -AS - root.db.d01.avg_temperature -- Not supported. Nested views are not allowed -``` - -#### Creating multiple sequence views at once - -Specifying only one sequence view at a time can be inconvenient, so multiple sequence views can be created in a single statement, for example: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - SELECT s01, s02 - FROM root.db.device -``` - -In addition, the statement above can be simplified: - -```SQL -CREATE VIEW root.db.device(status, sub.hardware) -AS - SELECT s01, s02 - FROM root.db.device -``` - -Both statements above are equivalent to the following: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device; - -CREATE VIEW root.db.device.sub.hardware -AS - SELECT s02 - FROM root.db.device -``` - -They are also equivalent to the following: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - root.db.device.s01, root.db.device.s02 - --- or - -CREATE VIEW root.db.device(status, sub.hardware) -AS - root.db.device(s01, s02) -``` - -##### The mapping relationships between all sequences 
are statically stored - -Sometimes, the SELECT clause may contain a number of statements that can only be determined at runtime, such as below: - -```SQL -SELECT s01, s02 -FROM root.db.d01, root.db.d02 -``` - -The number of sequences that can be matched by the above statement is uncertain and is related to the state of the system. Even so, the user can use it to create views. - -However, it is important to note that the mapping relationship between all sequences is stored statically (fixed at creation)! Consider the following example: - -The current database contains only three sequences `root.db.d01.s01`, `root.db.d02.s01`, `root.db.d02.s02`, and then the view is created: - -```SQL -CREATE VIEW root.view.d(alpha, beta, gamma) -AS - SELECT s01, s02 - FROM root.db.d01, root.db.d02 -``` - -The mapping relationship between time series is as follows: - -| sequence number | time series | sequence view | -| ---- | ----------------- | ----------------- | -| 1 | `root.db.d01.s01` | root.view.d.alpha | -| 2 | `root.db.d02.s01` | root.view.d.beta | -| 3 | `root.db.d02.s02` | root.view.d.gamma | - -After that, if the user adds the sequence `root.db.d01.s02`, it does not correspond to any view; then, if the user deletes `root.db.d01.s01`, the query for `root.view.d.alpha` will report an error directly, and it will not correspond to `root.db.d01.s02` either. - -Please always note that inter-sequence mapping relationships are stored statically and solidly. - -#### Batch Creation of Sequence Views - -There are several existing devices, each with a temperature value, for example: - -1. root.db.d1.temperature -2. root.db.d2.temperature -3. ... - -There may be many other sequences stored under these devices (e.g. `root.db.d1.speed`), but for now it is possible to create a view that contains only the temperature values for these devices, without relation to the other sequences:. 
- -```SQL -CREATE VIEW root.db.view(${2}_temperature) -AS - SELECT temperature FROM root.db.* -``` - -This is modelled on the query writeback (`SELECT INTO`) convention for naming rules, which uses variable placeholders to specify naming rules. See also: [QUERY WRITEBACK (SELECT INTO)](../Basic-Concept/Query-Data_timecho#into-clause-query-write-back) - -Here `root.db.*.temperature` specifies what time series will be included in the view; and `${2}` specifies from which node in the time series the name is extracted to name the sequence view. - -Here, `${2}` refers to level 2 (starting at 0) of `root.db.*.temperature`, which is the result of the `*` match; and `${2}_temperature` is the result of the match and `temperature` spliced together with underscores to make up the node names of the sequences under the view. - -The above statement for creating a view is equivalent to the following writeup: - -```SQL -CREATE VIEW root.db.view(${2}_${3}) -AS - SELECT temperature from root.db.* -``` - -The final view contains these sequences: - -1. root.db.view.d1_temperature -2. root.db.view.d2_temperature -3. ... - -Created using wildcards, only static mapping relationships at the moment of creation will be stored. - -#### SELECT clauses are somewhat limited when creating views - -The SELECT clause used when creating a serial view is subject to certain restrictions. The main restrictions are as follows: - -1. the `WHERE` clause cannot be used. -2. `GROUP BY` clause cannot be used. -3. `MAX_VALUE` and other aggregation functions cannot be used. - -Simply put, after `AS` you can only use `SELECT ... FROM ... ` and the results of this query must form a time series. - -### 4.2 View Data Queries - -For the data query functions that can be supported, the sequence view and time series can be used indiscriminately with identical behaviour when performing time series data queries. - -**The types of queries that are not currently supported by the sequence view are as follows:** - -1. 
**align by device query** -2. **group by tags query** - -Users can also mix time series and sequence view queries in the same SELECT statement, for example: - -```SQL -SELECT temperature01, temperature02, avg_temperature -FROM root.db.d01 -WHERE temperature01 < temperature02 -``` - -However, if the user queries the metadata of a sequence, such as tags and attributes, the result is the metadata of the sequence view, not that of the time series referenced by the sequence view. - -In addition, for aliased sequences, if the user wants to get information about the underlying time series such as tags and attributes, the user needs to query the mapping of the view columns to find the corresponding time series, and then query that time series for its tags and attributes. The method of querying the mapping of the view columns is explained in section 4.5. - -### 4.3 Modify Views - -The modification operations supported by the view include: modifying its calculation logic, modifying tags/attributes, and deleting. 
- -#### Modify view data source - -```SQL -ALTER VIEW root.view.device.status -AS - SELECT s01 - FROM root.ln.wf.d01 -``` - -#### Modify the view's calculation logic - -```SQL -ALTER VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02 + temperature03) / 3 - FROM root.db.d01 -``` - -#### Tag and attribute management - -- Add a new tag - -```SQL -ALTER VIEW root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4 -``` - -- Add a new attribute - -```SQL -ALTER VIEW root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4 -``` - -- Rename a tag or attribute - -```SQL -ALTER VIEW root.turbine.d1.s1 RENAME tag1 TO newTag1 -``` - -- Reset the value of a tag or attribute - -```SQL -ALTER VIEW root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1 -``` - -- Delete an existing tag or attribute - -```SQL -ALTER VIEW root.turbine.d1.s1 DROP tag1, tag2 -``` - -- Upsert tags and attributes - -> If the tag or attribute did not exist before, insert it; otherwise, update the old value with the new one. - -```SQL -ALTER VIEW root.turbine.d1.s1 UPSERT TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4) -``` - -#### Deleting Views - -Since a view is a sequence, a view can be deleted as if it were a time series. - -```SQL -DELETE VIEW root.view.device.avg_temperature -``` - -### 4.4 View Synchronisation - -#### If the dependent original sequence is deleted - -When the sequence view is queried (when the sequence is parsed), **an empty result set** is returned if the dependent time series does not exist. - -This is similar to the feedback for querying a non-existent sequence, but with a difference: if the dependent time series cannot be parsed, the empty result set still contains the table header, as a reminder to the user that the view is problematic. - -Additionally, when the dependent time series is deleted, no attempt is made to find out whether any view depends on it, and the user receives no warning. 
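The behaviour described in this section, together with the static mapping noted under view creation, can be sketched with a small Python toy model. This is not IoTDB code: `create_views` and `query_view` are hypothetical helpers, and the `Time` header column is an assumption made purely for illustration.

```python
# Toy model only (not IoTDB internals): a view stores the source path
# that was matched at creation time, and that pairing never changes.

def create_views(view_names, matched_sources):
    """Pair each view name with the source sequence matched at creation."""
    if len(view_names) != len(matched_sources):
        raise ValueError("view names and matched sequences must pair one-to-one")
    return dict(zip(view_names, matched_sources))


def query_view(mapping, existing_series, view_name):
    """Return (header, rows); rows are empty when the source no longer exists."""
    source = mapping[view_name]
    header = ["Time", view_name]  # assumed header layout, for illustration
    if source not in existing_series:
        return header, []  # header only: a hint that the view is broken
    return header, existing_series[source]


# State of the database when the views are created.
series = {
    "root.db.d01.s01": [(1, 10)],
    "root.db.d02.s01": [(1, 20)],
    "root.db.d02.s02": [(1, 30)],
}
views = create_views(
    ["root.view.d.alpha", "root.view.d.beta", "root.view.d.gamma"],
    sorted(series),
)

series["root.db.d01.s02"] = [(1, 40)]  # added later: belongs to no view
del series["root.db.d01.s01"]          # deleted later: alpha now dangles

print(query_view(views, series, "root.view.d.alpha"))
print(query_view(views, series, "root.view.d.beta"))
```

In this sketch the view on the deleted source yields a header with no rows, while the other views are unaffected and the newly added sequence is not picked up by any view.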
- -#### Data Writes to Non-Aliased Sequences Not Supported - -Writes to non-alias sequences are not supported. - -Please refer to the previous section 2.1.6 Restrictions2 for more details. - -#### Metadata for sequences is not shared - -Please refer to the previous section 2.1.6 Restriction 5 for details. - -### 4.5 View Metadata Queries - -View metadata query specifically refers to querying the metadata of the view itself (e.g., how many columns the view has), as well as information about the views in the database (e.g., what views are available). - -#### Viewing Current View Columns - -The user has two ways of querying: - -1. a query using `SHOW TIMESERIES`, which contains both time series and series views. This query contains both the time series and the sequence view. However, only some of the attributes of the view can be displayed. -2. a query using `SHOW VIEW`, which contains only the sequence view. It displays the complete properties of the sequence view. - -Example: - -```Shell -IoTDB> show timeseries; -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.device.s01 | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.view.status | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp01 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| 
-+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp02 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.avg_temp| null| root.db| FLOAT| null| null|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -Total line number = 5 -It costs 0.789s -IoTDB> -``` - -The last column `ViewType` shows the type of the sequence, the time series is BASE and the sequence view is VIEW. - -In addition, some of the sequence view properties will be missing, for example `root.db.d01.avg_temp` is calculated from temperature averages, so the `Encoding` and `Compression` properties are null values. - -In addition, the query results of the `SHOW TIMESERIES` statement are divided into two main parts. - -1. information about the timing data, such as data type, compression, encoding, etc. -2. other metadata information, such as tag, attribute, database, etc. - -For the sequence view, the temporal data information presented is the same as the original sequence or null (e.g., the calculated average temperature has a data type but no compression method); the metadata information presented is the content of the view. - -To learn more about the view, use `SHOW ``VIEW`. The `SHOW ``VIEW` shows the source of the view's data, etc. 
- -```Shell -IoTDB> show VIEW root.**; -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -| Timeseries|Database|DataType|Tags|Attributes|ViewType| SOURCE| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.view.status | root.db| INT32|null| null| VIEW| root.db.device.s01| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.d01.avg_temp| root.db| FLOAT|null| null| VIEW|(root.db.d01.temp01+root.db.d01.temp02)/2| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -Total line number = 2 -It costs 0.789s -IoTDB> -``` - -The last column, `SOURCE`, shows the data source for the sequence view, listing the SQL statement that created the sequence. - -##### About Data Types - -Both of the above queries involve the data type of the view. The data type of a view is inferred from the original time series type of the query statement or alias sequence that defines the view. This data type is computed in real time based on the current state of the system, so the data type queried at different moments may be changing. - -## 5. FAQ - -#### Q1: I want the view to implement the function of type conversion. For example, a time series of type int32 was originally placed in the same view as other series of type int64. I now want all the data queried through the view to be automatically converted to int64 type. - -> Ans: This is not the function of the sequence view. But the conversion can be done using `CAST`, for example: - -```SQL -CREATE VIEW root.db.device.int64_status -AS - SELECT CAST(s1, 'type'='INT64') from root.db.device -``` - -> This way, a query for `root.view.status` will yield a result of type int64. 
-> -> Note that in the above example the data of the sequence view is obtained through `CAST` conversion, so `root.db.device.int64_status` is not an aliased sequence and therefore **does not support writes**. - -#### Q2: Is default naming supported? That is, if I select several time series and create a view without specifying a name for each series, will the database name them automatically? - -> Ans: Not supported. Users must specify the names explicitly. - -#### Q3: In the original system, creating the time series `root.db.device.s01` automatically creates the database `root.db` and the device `root.db.device`. Deleting the time series `root.db.device.s01` then automatically deletes `root.db.device`, while `root.db` remains. Will this mechanism be followed for creating views? What are the considerations? - -> Ans: The original behaviour is unchanged; introducing the view feature does not alter this logic. - -#### Q4: Does it support sequence view renaming? - -> Ans: Renaming is not supported in the current version. As a workaround, create a new view with the desired name. \ No newline at end of file diff --git a/src/UserGuide/latest/User-Manual/Maintenance-commands_timecho.md b/src/UserGuide/latest/User-Manual/Maintenance-commands_timecho.md deleted file mode 100644 index 537f01174..000000000 --- a/src/UserGuide/latest/User-Manual/Maintenance-commands_timecho.md +++ /dev/null @@ -1,693 +0,0 @@ - -# Maintenance Statement - -## 1. Status Checking - -### 1.1 Viewing the Connected Model - -**Description**: Returns the current SQL dialect mode (`Tree` or `Table`).
- -**Syntax**: - -```SQL -showCurrentSqlDialectStatement - : SHOW CURRENT_SQL_DIALECT - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CURRENT_SQL_DIALECT; -``` - -**Result:** - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TREE| -+-----------------+ -``` - -### 1.2 Viewing the Cluster Version - -**Description**: Returns the current cluster version. - -**Syntax**: - -```SQL -showVersionStatement - : SHOW VERSION - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW VERSION; -``` - -**Result**: - -```Plain -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.1.2| 1ca4008| -+-------+---------+ -``` - -### 1.3 Viewing Cluster Key Parameters - -**Description**: Returns key parameters of the current cluster. - -**Syntax**: - -```SQL -showVariablesStatement - : SHOW VARIABLES - ; -``` - -Key Parameters: - -1. **ClusterName**: The name of the current cluster. -2. **DataReplicationFactor**: Number of data replicas per DataRegion. -3. **SchemaReplicationFactor**: Number of schema replicas per SchemaRegion. -4. **DataRegionConsensusProtocolClass**: Consensus protocol class for DataRegions. -5. **SchemaRegionConsensusProtocolClass**: Consensus protocol class for SchemaRegions. -6. **ConfigNodeConsensusProtocolClass**: Consensus protocol class for ConfigNodes. -7. **TimePartitionOrigin**: The starting timestamp of database time partitions. -8. **TimePartitionInterval**: The interval of database time partitions (in milliseconds). -9. **ReadConsistencyLevel**: The consistency level for read operations. -10. **SchemaRegionPerDataNode**: Number of SchemaRegions per DataNode. -11. **DataRegionPerDataNode**: Number of DataRegions per DataNode. -12. **SeriesSlotNum**: Number of SeriesSlots per DataRegion. -13. **SeriesSlotExecutorClass**: Implementation class for SeriesSlots. -14. **DiskSpaceWarningThreshold**: Disk space warning threshold (in percentage). -15. **TimestampPrecision**: Timestamp precision. 
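
As a rough illustration of how `TimePartitionOrigin` and `TimePartitionInterval` relate, the Java sketch below maps a timestamp to a time partition ID. The formula (floor of `(timestamp - origin) / interval`) and the helper name `partitionId` are assumptions made for illustration, not an IoTDB API; the interval used is the `604800000` ms (7 days) value that appears in the sample `SHOW VARIABLES` output.

```java
/** Illustrative helper, not part of IoTDB: maps a timestamp to a time partition ID. */
final class TimePartitionSketch {

    // Assumption: partition ID = floor((timestamp - origin) / interval).
    static long partitionId(long timestampMs, long originMs, long intervalMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        // floorDiv keeps timestamps earlier than the origin in negative partitions
        // instead of rounding them towards partition 0.
        return Math.floorDiv(timestampMs - originMs, intervalMs);
    }

    public static void main(String[] args) {
        long origin = 0L;               // TimePartitionOrigin in the sample output
        long interval = 604_800_000L;   // TimePartitionInterval in the sample output (7 days)
        long timestamp = 1_773_824_951_259L; // an example millisecond timestamp
        System.out.println(partitionId(timestamp, origin, interval));
    }
}
```

Under these assumptions, the example timestamp `1773824951259` lands in partition `2932`.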
- -**Example**: - -```SQL -IoTDB> SHOW VARIABLES; -``` - -**Result**: - -```Plain -+----------------------------------+-----------------------------------------------------------------+ -| Variable| Value| -+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - -### 1.4 Viewing the Current Timestamp of Database - -**Description**: Returns the current timestamp of the database. - -**Syntax**: - -```SQL -showCurrentTimestampStatement - : SHOW CURRENT_TIMESTAMP - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW CURRENT_TIMESTAMP; -``` - -**Result**: - -```Plain -+-----------------------------+ -| CurrentTimestamp| -+-----------------------------+ -|2025-02-17T11:11:52.987+08:00| -+-----------------------------+ -``` - -### 1.5 Viewing Executing Queries - -**Description**: Displays information about all currently executing queries. - -**Syntax**: - -```SQL -showQueriesStatement - : SHOW (QUERIES | QUERY PROCESSLIST) - (WHERE where=booleanExpression)? - (ORDER BY sortItem (',' sortItem)*)? - limitOffsetClause - ; -``` - -**Parameters**: - -1. **WHERE Clause**: Filters the result set based on specified conditions. -2. 
**ORDER BY Clause**: Sorts the result set based on specified columns. -3. **limitOffsetClause**: Limits the number of rows returned. - 1. Format: `LIMIT , `. - -**Columns in QUERIES Table**: - -- **time**: Timestamp when the query started. -- **queryid**: Unique ID of the query. -- **datanodeid**: ID of the DataNode executing the query. -- **elapsedtime**: Time elapsed since the query started (in seconds). -- **statement**: The SQL statement being executed. - -**Example**: - -```SQL -IoTDB> SHOW QUERIES WHERE elapsedtime > 0.003 -``` - -**Result**: - -```SQL -+-----------------------------+-----------------------+----------+-----------+--------------------------------------+ -| Time| QueryId|DataNodeId|ElapsedTime| Statement| -+-----------------------------+-----------------------+----------+-----------+--------------------------------------+ -|2025-05-09T15:16:01.293+08:00|20250509_071601_00015_1| 1| 0.006|SHOW QUERIES WHERE elapsedtime > 0.003| -+-----------------------------+-----------------------+----------+-----------+--------------------------------------+ -``` - -### 1.6 Viewing Region Information - -**Description**: Displays regions' information of the current cluster. 
- -**Syntax**: - -```SQL -showRegionsStatement - : SHOW REGIONS - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW REGIONS -``` - -**Result**: - -```SQL -+--------+------------+-------+-------------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -|RegionId| Type| Status| Database|SeriesSlotNum|TimeSlotNum|DataNodeId|RpcAddress|RpcPort|InternalAddress| Role| CreateTime|TsFileSize| -+--------+------------+-------+-------------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -| 9|SchemaRegion|Running|root.__system| 21| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.555| | -| 10| DataRegion|Running|root.__system| 21| 21| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.556| 8.27 KB| -| 65|SchemaRegion|Running| root.ln| 1| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-25T14:46:50.113| | -| 66| DataRegion|Running| root.ln| 1| 1| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-25T14:46:50.425| 524 B| -+--------+------------+-------+-------------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -``` - -### 1.7 Viewing Available Nodes - -**Description**: Returns the RPC addresses and ports of all available DataNodes in the current cluster. Note: A DataNode is considered "available" if it is not in the REMOVING state. - -> This feature is supported starting from v2.0.8. - -**Syntax**: - -```SQL -showAvailableUrlsStatement - : SHOW AVAILABLE URLS - ; -``` - -**Example**: - -```SQL -IoTDB> SHOW AVAILABLE URLS -``` - -**Result**: - -```SQL -+----------+-------+ -|RpcAddress|RpcPort| -+----------+-------+ -| 0.0.0.0| 6667| -+----------+-------+ -``` - -### 1.8 View Service Information - -**Description**: Returns service information (MQTT service, REST service) on all active DataNodes (in RUNNING or READ-ONLY state) in the current cluster. 
- -> This feature is supported starting from v2.0.8.2. - -#### Syntax: -```sql -showServicesStatement - : SHOW SERVICES - ; -``` - -#### Examples: -```sql -IoTDB> SHOW SERVICES -IoTDB> SHOW SERVICES ON 1 -``` - -Execution result: -```sql -+--------------+-------------+---------+ -| Service Name | DataNode ID | State | -+--------------+-------------+---------+ -| MQTT | 1 | STOPPED | -| REST | 1 | RUNNING | -+--------------+-------------+---------+ -``` - - -### 1.9 View Cluster Activation Status - -**Description**:Returns the activation status of the current cluster. - -#### Syntax: - -```SQL -showActivationStatement - : SHOW ACTIVATION - ; -``` - -#### Examples: - -```SQL -IoTDB> SHOW ACTIVATION -``` - -Execution result: - -```SQL -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -### 1.10 View Disk Space Usage -**Description**: Returns the disk space usage of the specified `pattern`, including the size of ChunkGroups and the size of Metadata. - -**Note**: Statistics are based on the actual size of data in TsFiles; therefore, deletions made via `mods` are not considered. - -> Supported since version 2.0.9.1 - -#### Syntax: -```sql -showDiskUsageStatement - : SHOW DISK_USAGE FROM pathPattern - whereClause? - orderByClause? - rowPaginationClause? - ; -pathPattern - : ROOT (DOT nodeName)* - ; -``` - -**Explanation**: The `pattern` is used to match devices, must start with `ROOT`, and intermediate nodes in the path support `*` or `**`. 
- -#### Result Set -| Column Name | Column Type | Description | -|---------------|-------------|----------------------------------| -| Database | string | Database name | -| DataNodeId | int32 | DataNode node ID | -| RegionId | int32 | Region ID | -| TimePartition | int64 | Time partition ID | -| SizeInBytes | int64 | Disk space occupied (in bytes) | - -#### Example: -```sql -SHOW DISK_USAGE FROM root.ln.**; -``` - -**Execution Result**: -```bash -+--------+----------+--------+-------------+-----------+ -|Database|DataNodeId|RegionId|TimePartition|SizeInBytes| -+--------+----------+--------+-------------+-----------+ -| root.ln| 1| 13| 2932| 203| -+--------+----------+--------+-------------+-----------+ -``` - -## 2. Status Setting - -### 2.1 Setting the Connected Model - -**Description**: Sets the current SQL dialect mode to `Tree` or `Table` which can be used in both tree and table modes. - -**Syntax**: - -```SQL -SET SQL_DIALECT = (TABLE | TREE); -``` - -**Example**: - -```SQL -IoTDB> SET SQL_DIALECT=TREE; -IoTDB> SHOW CURRENT_SQL_DIALECT; -``` - -**Result**: - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TREE| -+-----------------+ -``` - -### 2.2 Updating Configuration Items - -**Description**: Updates configuration items. Changes take effect immediately without restarting if the items support hot modification. - -**Syntax**: - -```SQL -setConfigurationStatement - : SET CONFIGURATION propertyAssignments (ON INTEGER_VALUE)? - ; - -propertyAssignments - : property (',' property)* - ; - -property - : identifier EQ propertyValue - ; - -propertyValue - : DEFAULT - | expression - ; -``` - -**Parameters**: - -1. **propertyAssignments**: A list of properties to update. - 1. Format: `property (',' property)*`. - 2. Values: - - `DEFAULT`: Resets the configuration to its default value. - - `expression`: A specific value (must be a string). -2. **ON INTEGER_VALUE** **(Optional):** Specifies the node ID to update. - 1. 
If not specified or set to a negative value, updates all ConfigNodes and DataNodes. - -**Example**: - -```SQL -IoTDB> SET CONFIGURATION 'disk_space_warning_threshold'='0.05','heartbeat_interval_in_ms'='1000' ON 1; -``` - -### 2.3 Loading Manually Modified Configuration Files - -**Description**: Loads manually modified configuration files and hot-loads the changes. Configuration items that support hot modification take effect immediately. - -**Syntax**: - -```SQL -loadConfigurationStatement - : LOAD CONFIGURATION localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** **(Optional):** - 1. Specifies the scope of configuration loading. - 2. Default: `CLUSTER`. - 3. Values: - - `LOCAL`: Loads configuration only on the DataNode directly connected to the client. - - `CLUSTER`: Loads configuration on all DataNodes in the cluster. - -**Example**: - -```SQL -IoTDB> LOAD CONFIGURATION ON LOCAL; -``` - -### 2.4 Setting the System Status - -**Description**: Sets the system status to either `READONLY` or `RUNNING`. - -**Syntax**: - -```SQL -setSystemStatusStatement - : SET SYSTEM TO (READONLY | RUNNING) localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **RUNNING |** **READONLY**: - 1. **RUNNING**: Sets the system to running mode, allowing both read and write operations. - 2. **READONLY**: Sets the system to read-only mode, allowing only read operations and prohibiting writes. -2. **localOrClusterMode** **(Optional):** - 1. **LOCAL**: Applies the status change only to the DataNode directly connected to the client. - 2. **CLUSTER**: Applies the status change to all DataNodes in the cluster. - 3. **Default**: `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> SET SYSTEM TO READONLY ON CLUSTER; -``` - -## 3. Data Management - -### 3.1 Flushing Data from Memory to Disk - -**Description**: Flushes data from the memory table to disk.
- -**Syntax**: - -```SQL -flushStatement - : FLUSH identifier? (',' identifier)* booleanValue? localOrClusterMode? - ; - -booleanValue - : TRUE | FALSE - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **identifier** **(Optional):** - 1. Specifies the name of the path to flush. - 2. If not specified, all paths are flushed. - 3. **Multiple Paths**: Multiple path names can be specified, separated by commas (e.g., `FLUSH root.ln, root.lnm.**`). -2. **booleanValue** **(Optional):** - 1. Specifies the type of data to flush. - 2. **TRUE**: Flushes only the sequential memory table. - 3. **FALSE**: Flushes only the unsequential memory table. - 4. **Default**: Flushes both sequential and unsequential memory tables. -3. **localOrClusterMode** **(Optional):** - 1. **ON LOCAL**: Flushes only the memory tables on the DataNode directly connected to the client. - 2. **ON CLUSTER**: Flushes memory tables on all DataNodes in the cluster. - 3. **Default:** `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> FLUSH root.ln TRUE ON LOCAL; -``` - -## 4. Data Repair - -### 4.1 Starting Background Scan and Repair of TsFiles - -**Description**: Starts a background task to scan and repair TsFiles, fixing issues such as timestamp disorder within data files. - -**Syntax**: - -```SQL -startRepairDataStatement - : START REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** **(Optional):** - 1. **ON LOCAL**: Executes the repair task only on the DataNode directly connected to the client. - 2. **ON CLUSTER**: Executes the repair task on all DataNodes in the cluster. - 3. **Default:** `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> START REPAIR DATA ON CLUSTER; -``` - -### 4.2 Pausing Background TsFile Repair Task - -**Description**: Pauses the background repair task. The paused task can be resumed by executing the `START REPAIR DATA` command again.
- -**Syntax**: - -```SQL -stopRepairDataStatement - : STOP REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** **(Optional):** - 1. **ON LOCAL**: Executes the pause command only on the DataNode directly connected to the client. - 2. **ON CLUSTER**: Executes the pause command on all DataNodes in the cluster. - 3. **Default:** `ON CLUSTER`. - -**Example**: - -```SQL -IoTDB> STOP REPAIR DATA ON CLUSTER; -``` - -## 5. Query Termination - -### 5.1 Terminating Queries - -**Description**: Terminates one or more running queries. - -**Syntax**: - -```SQL -killQueryStatement - : KILL (QUERY queryId=string | ALL QUERIES) - ; -``` - -**Parameters**: - -1. **QUERY** **queryId:** Specifies the ID of the query to terminate. - -- To obtain the `queryId`, use the `SHOW QUERIES` command. - -2. **ALL QUERIES:** Terminates all currently running queries. - -**Example**: - -Terminate a specific query: - -```SQL -IoTDB> KILL QUERY 20250108_101015_00000_1; -``` - -Terminate all queries: - -```SQL -IoTDB> KILL ALL QUERIES; -``` - -## 6. Query Debugging - -### 6.1 DEBUG SQL - -**Definition**: Add the `DEBUG` keyword at the beginning of an SQL query statement. During execution, debug logs will be output, including underlying file scan information involved in the query. - -> Supported since V2.0.9.1 - -#### Syntax: -```sql -debugSQLStatement - : DEBUG ? query - ; -``` - -**Description**: -* Log output directory: `logs/log_datanode_query_debug.log` - -#### Example: -1. Execute the following SQL for a DEBUG query -```sql -DEBUG SELECT * FROM root.ln.**; -``` - -2. Check the log content in `log_datanode_query_debug.log` to view the file scan information involved in the query. 
- -```bash -2026-03-24 10:06:18,755 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:159 - Cache miss: root.ln.wf01.wt01.temperature in file: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile -2026-03-24 10:06:18,757 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:160 - Device: root.ln.wf01.wt01, all sensors: [temperature] -2026-03-24 10:06:18,758 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.BloomFilterCache:110 - get bloomFilter from cache where filePath is: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile -2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:227 - Get timeseries: root.ln.wf01.wt01.temperature metadata in file: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile from cache: TimeseriesMetadata{timeSeriesMetadataType=0, chunkMetaDataListDataSize=8, measurementId='temperature', dataType=DOUBLE, statistics=startTime: 1773824951259 endTime: 1773824951259 count: 1 [minValue:12.9,maxValue:12.9,firstValue:12.9,lastValue:12.9,sumValue:12.9], modified=false, isSeq=true, chunkMetadataList=[measurementId: temperature, datatype: DOUBLE, version: 0, Statistics: startTime: 1773824951259 endTime: 1773824951259 count: 1 [minValue:12.9,maxValue:12.9,firstValue:12.9,lastValue:12.9,sumValue:12.9], deleteIntervalList: null]}. 
-2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskChunkMetadataLoader:97 - Modifications size is 0 for file Path: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile -2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskChunkMetadataLoader:109 - After modification Chunk meta data list is: -2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskChunkMetadataLoader:110 - measurementId: temperature, datatype: DOUBLE, version: 0, Statistics: startTime: 1773824951259 endTime: 1773824951259 count: 1 [minValue:12.9,maxValue:12.9,firstValue:12.9,lastValue:12.9,sumValue:12.9], deleteIntervalList: null -2026-03-24 10:06:18,760 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.ChunkCache:167 - get chunk from cache whose key is: ChunkCacheKey{filePath='/home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile', regionId=13, timePartitionId=2932, tsFileVersion=1, compactionVersion=0, offsetOfChunkHeader=27} -2026-03-24 10:06:18,761 [pool-69-IoTDB-ClientRPC-Processor-1$20260324_020618_00052_1] INFO o.a.i.d.q.p.Coordinator:902 - debug select * from root.ln.** -``` diff --git a/src/UserGuide/latest/User-Manual/Streaming_timecho.md b/src/UserGuide/latest/User-Manual/Streaming_timecho.md deleted file mode 100644 index 7cad70485..000000000 --- a/src/UserGuide/latest/User-Manual/Streaming_timecho.md +++ /dev/null @@ -1,854 +0,0 @@ - - -# Stream Computing Framework - -The IoTDB stream processing framework allows users to implement customized stream processing logic, which can monitor and capture storage engine changes, transform changed data, and push transformed data outward. - -We call a data flow processing task a Pipe. 
A stream processing task (Pipe) contains three subtasks: - -- Source task -- Processor task -- Sink task - -The stream processing framework allows users to customize the processing logic of the three subtasks in Java and to process data in a UDF-like manner. -In a Pipe, the three subtasks are executed by three plugins, and the data is processed by these plugins in turn: -Pipe Source extracts data, Pipe Processor processes data, and Pipe Sink sends data; the final data is sent to an external system. - -**The model of the Pipe task is as follows:** - -![pipe.png](/img/1706778988482.jpg) - -Describing a data flow processing task essentially means describing the properties of the Pipe Source, Pipe Processor and Pipe Sink plugins. -Users can declaratively configure the specific attributes of the three subtasks through SQL statements, and achieve flexible data ETL capabilities by combining different attributes. - -Using the stream processing framework, a complete data link can be built to meet the needs of end-side-cloud synchronization, off-site disaster recovery, and distributing read/write load across multiple databases. - -## 1. Custom stream processing plugin development - -### 1.1 Programming development dependencies - -It is recommended to use Maven to build the project and add the following dependency to `pom.xml`. Make sure the dependency version matches the IoTDB server version. - -```xml -<dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>pipe-api</artifactId> - <version>1.3.1</version> - <scope>provided</scope> -</dependency> -``` - -### 1.2 Event-driven programming model - -The user programming interface of the stream processing plugin follows the general design of the event-driven programming model. Events are data abstractions in the user programming interface, and the programming interface is decoupled from the specific execution method. Users only need to describe how the system should process an event (data) once it reaches the system.
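
The event-driven idea described above can be sketched in a few lines of plain Java. This is a minimal illustration only: the class `EventDrivenSketch` and its `run` method are hypothetical and are not part of `pipe-api`; standard `java.util.function` interfaces stand in for the source, processor and sink plugins, and the loop simply stops when the source returns `null`, whereas a real engine would keep polling.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical three-stage pipeline: user code only describes what happens per event. */
final class EventDrivenSketch {

    /** Pulls events from the source until it returns null, then processes and sends each one. */
    static <E, R> int run(Supplier<E> source, Function<E, R> processor, Consumer<R> sink) {
        int handled = 0;
        E event;
        while ((event = source.get()) != null) {
            sink.accept(processor.apply(event)); // source -> processor -> sink, in that order
            handled++;
        }
        return handled;
    }

    public static void main(String[] args) {
        Queue<String> writes = new ArrayDeque<>(List.of("s1=1", "s2=2"));
        List<String> sent = new ArrayList<>();

        Supplier<String> source = writes::poll;                  // poll() returns null when empty
        Function<String, String> processor = w -> "processed(" + w + ")";
        Consumer<String> sink = sent::add;

        int handled = run(source, processor, sink);
        System.out.println(handled + " events -> " + sent);
    }
}
```

The caller never decides when or how events are scheduled; it only supplies the three pieces of per-event logic, which is the decoupling the paragraph above refers to.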
- -In the user programming interface of the stream processing plugin, an event is an abstraction of a database write operation. Events are captured by the stand-alone stream processing engine and passed to the PipeSource plugin, PipeProcessor plugin, and PipeSink plugin in sequence, following the three-stage stream processing flow, triggering the user logic in each of the three plugins in turn. - -To achieve both low latency under low load and high throughput under high load on the end side, the stream processing engine dynamically chooses whether to process operation logs or data files. Therefore, the user programming interface of stream processing requires users to provide processing logic for the following two types of events: the operation log writing event TabletInsertionEvent and the data file writing event TsFileInsertionEvent. - -#### **Operation log writing event (TabletInsertionEvent)** - -The operation log writing event (TabletInsertionEvent) is a high-level data abstraction for user write requests. It lets users manipulate the underlying data of write requests through a unified operation interface. - -The underlying storage structure behind an operation log writing event depends on how the database is deployed. In a stand-alone deployment, the operation log writing event encapsulates write-ahead log (WAL) entries; in a distributed deployment, it encapsulates the consensus protocol operation log entries of a single node. - -For write operations generated by different write request interfaces in the database, the request data structure corresponding to the operation log writing event is also different.
IoTDB provides numerous writing interfaces such as InsertRecord, InsertRecords, InsertTablet, InsertTablets, etc. Each writing request uses a completely different serialization method, and the generated binary entries are also different. - -Operation log writing events give users a unified view of data operations. This hides the implementation differences of the underlying data structures, lowers the barrier to programming, and makes the feature easier to use. - -```java -/** TabletInsertionEvent is used to define the event of data insertion. */ -public interface TabletInsertionEvent extends Event { - - /** - * The consumer processes the data row by row and collects the results by RowCollector. - * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processRowByRow(BiConsumer<Row, RowCollector> consumer); - - /** - * The consumer processes the Tablet directly and collects the results by RowCollector. - * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processTablet(BiConsumer<Tablet, RowCollector> consumer); -} -``` - -#### **Data file writing event (TsFileInsertionEvent)** - -The data file writing event (TsFileInsertionEvent) is a high-level abstraction of the database file writing operation. It is a data collection of several operation log writing events (TabletInsertionEvent). - -The storage engine of IoTDB is LSM structured. When data is written, the write operation is first appended to a log-structured file, and the written data is kept in memory at the same time. When memory usage reaches its configured upper limit, a flush to disk is triggered: the data in memory is converted into a database file, and the previously written operation log is deleted.
When the data in memory is converted into data in the database file, it goes through two compression steps: encoding compression and general compression. Therefore, the data in the database file takes up less space than the original data in memory. - -In extreme network conditions, directly transmitting data files is more economical than transmitting data writing operations: it occupies less network bandwidth and achieves faster transmission speeds. Of course, there is no free lunch. Computing and processing data in files requires additional file I/O compared to directly computing and processing data in memory. However, it is precisely the existence of these two structures, disk data files and in-memory write operations, each with its own advantages and disadvantages, that gives the system the opportunity to make dynamic trade-offs and adjustments. It is based on this observation that the data file writing event is introduced into the plugin's event model. - -To sum up, the data file writing event appears in the event stream of the stream processing plugin in two situations: - -(1) Historical data extraction: Before a stream processing task starts, all written data that has been flushed to disk exists in the form of TsFile. After the task starts and collects historical data, that data is abstracted as TsFileInsertionEvent; - -(2) Real-time data extraction: While a stream processing task is running, if operation log writing events in the data stream are processed more slowly than write requests arrive, the events that cannot be processed in time are eventually persisted to disk in the form of TsFile. When this data is extracted by the stream processing engine, it is abstracted as TsFileInsertionEvent. - -```java -/** - * TsFileInsertionEvent is used to define the event of writing TsFile.
The event data is stored on disk, - * compressed and encoded, so computational processing of it incurs I/O cost. - */ -public interface TsFileInsertionEvent extends Event { - - /** - * The method is used to convert the TsFileInsertionEvent into several TabletInsertionEvents. - * - * @return {@code Iterable<TabletInsertionEvent>} the list of TabletInsertionEvent - */ - Iterable<TabletInsertionEvent> toTabletInsertionEvents(); -} -``` - -### 1.3 Custom stream processing plugin programming interface definition - -Based on the custom stream processing plugin programming interface, users can easily write data extraction plugins, data processing plugins and data sending plugins, so that the stream processing function can be flexibly adapted to various industrial scenarios. - -#### Data extraction plugin interface - -Data extraction is the first stage of the three-stage process of stream processing, which includes data extraction, data processing, and data sending. The data extraction plugin (PipeSource) serves as a bridge between the stream processing engine and the storage engine. It captures various data write events by listening to the behavior of the storage engine. - -```java -/** - * PipeSource - * - *

PipeSource is responsible for capturing events from sources. - * - *

Various data sources can be supported by implementing different PipeSource classes. - * - *

The lifecycle of a PipeSource is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH SOURCE` clause in SQL are - * parsed and the validation method {@link PipeSource#validate(PipeParameterValidator)} will - * be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} will be called to - * configure the runtime behavior of the PipeSource. - *
  • Then the method {@link PipeSource#start()} will be called to start the PipeSource. - *
  • While the collaboration task is in progress, the method {@link PipeSource#supply()} will be - * called to capture events from sources and then the events will be passed to the - * PipeProcessor. - *
  • The method {@link PipeSource#close()} will be called when the collaboration task is - * cancelled (the `DROP PIPE` command is executed). - *
- */ -public interface PipeSource extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSource. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeSourceRuntimeConfiguration. - *
- * - *

This method is called after the method {@link PipeSource#validate(PipeParameterValidator)} - * is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSource - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSourceRuntimeConfiguration configuration) - throws Exception; - - /** - * Start the Source. After this method is called, events should be ready to be supplied by - * {@link PipeSource#supply()}. This method is called after {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @throws Exception the user can throw errors if necessary - */ - void start() throws Exception; - - /** - * Supply single event from the Source and the caller will send the event to the processor. - * This method is called after {@link PipeSource#start()} is called. - * - * @return the event to be supplied. the event may be null if the Source has no more events at - * the moment, but the Source is still running for more events. - * @throws Exception the user can throw errors if necessary - */ - Event supply() throws Exception; -} -``` - -#### Data processing plugin interface - -Data processing is the second stage of the three-stage process of stream processing, which includes data extraction, data processing, and data sending. The data processing plugin (PipeProcessor) is primarily used for filtering and transforming the various events captured by the data extraction plugin (PipeSource). - -```java -/** - * PipeProcessor - * - *

PipeProcessor is used to filter and transform the Event formed by the PipeSource. - * - *

The lifecycle of a PipeProcessor is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH PROCESSOR` clause in SQL are - * parsed and the validation method {@link PipeProcessor#validate(PipeParameterValidator)} - * will be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} will be called - * to configure the runtime behavior of the PipeProcessor. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeSource captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the events and then passes them to the PipeSink. The - * following 3 methods will be called: {@link - * PipeProcessor#process(TabletInsertionEvent, EventCollector)}, {@link - * PipeProcessor#process(TsFileInsertionEvent, EventCollector)} and {@link - * PipeProcessor#process(Event, EventCollector)}. - *
    • PipeSink serializes the events into binaries and sends them to sinks. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeProcessor#close() } method will be called. - *
- */ -public interface PipeProcessor extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeProcessor. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeProcessorRuntimeConfiguration. - *
- * - *

This method is called after the method {@link - * PipeProcessor#validate(PipeParameterValidator)} is called and before the beginning of the - * events processing. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeProcessor - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeProcessorRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is called to process the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(TabletInsertionEvent tabletInsertionEvent, EventCollector eventCollector) - throws Exception; - - /** - * This method is called to process the TsFileInsertionEvent. - * - * @param tsFileInsertionEvent TsFileInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - default void process(TsFileInsertionEvent tsFileInsertionEvent, EventCollector eventCollector) - throws Exception { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - process(tabletInsertionEvent, eventCollector); - } - } - - /** - * This method is called to process the Event. - * - * @param event Event to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(Event event, EventCollector eventCollector) throws Exception; -} -``` - -#### Data sending plugin interface - -Data sending is the third stage of the three-stage process of stream processing, which includes data extraction, data processing, and data sending. 
The data sending plugin (PipeSink) is responsible for sending the various events processed by the data processing plugin (PipeProcessor). It serves as the network implementation layer of the stream processing framework and should support multiple real-time communication protocols and connectors in its interface. - -```java -/** - * PipeSink - * - *

PipeSink is responsible for sending events to sinks. - * - *

Various network protocols can be supported by implementing different PipeSink classes. - * - *

The lifecycle of a PipeSink is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH SINK` clause in SQL are - * parsed and the validation method {@link PipeSink#validate(PipeParameterValidator)} will be - * called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link PipeSink#customize(PipeParameters, - * PipeSinkRuntimeConfiguration)} will be called to configure the runtime behavior of the - * PipeSink and the method {@link PipeSink#handshake()} will be called to create a connection - * with sink. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeSource captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the events and then passes them to the PipeSink. - *
    • PipeSink serializes the events into binaries and sends them to sinks. The following 3 - * methods will be called: {@link PipeSink#transfer(TabletInsertionEvent)}, {@link - * PipeSink#transfer(TsFileInsertionEvent)} and {@link PipeSink#transfer(Event)}. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeSink#close() } method will be called. - *
- * - *

In addition, the method {@link PipeSink#heartbeat()} will be called periodically to check - * whether the connection with sink is still alive. The method {@link PipeSink#handshake()} will be - * called to create a new connection with the sink when the method {@link PipeSink#heartbeat()} - * throws exceptions. - */ -public interface PipeSink extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSink. In this method, the user can do the following - * things: - * - *

    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeSinkRuntimeConfiguration. - *
- * - *

This method is called after the method {@link PipeSink#validate(PipeParameterValidator)} is - * called and before the method {@link PipeSink#handshake()} is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSink - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSinkRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is used to create a connection with sink. This method will be called after the - * method {@link PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called or - * will be called when the method {@link PipeSink#heartbeat()} throws exceptions. - * - * @throws Exception if the connection is failed to be created - */ - void handshake() throws Exception; - - /** - * This method will be called periodically to check whether the connection with sink is still - * alive. - * - * @throws Exception if the connection dies - */ - void heartbeat() throws Exception; - - /** - * This method is used to transfer the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(TabletInsertionEvent tabletInsertionEvent) throws Exception; - - /** - * This method is used to transfer the TsFileInsertionEvent. 
- * - * @param tsFileInsertionEvent TsFileInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - default void transfer(TsFileInsertionEvent tsFileInsertionEvent) throws Exception { - try { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - transfer(tabletInsertionEvent); - } - } finally { - tsFileInsertionEvent.close(); - } - } - - /** - * This method is used to transfer the generic events, including HeartbeatEvent. - * - * @param event Event to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(Event event) throws Exception; -} -``` - -## 2. Custom stream processing plugin management - -In order to ensure the flexibility and ease of use of user-defined plugins in actual production, the system also needs to provide the ability to dynamically and uniformly manage plugins. -The stream processing plugin management statements introduced in this chapter provide an entry point for dynamic unified management of plugins. - -### 2.1 Load plugin statement - -In IoTDB, if you want to dynamically load a user-defined plugin in the system, you first need to implement a specific plugin class based on PipeSource, PipeProcessor or PipeSink. -Then the plugin class needs to be compiled and packaged into a jar executable file, and finally the plugin is loaded into IoTDB using the management statement for loading the plugin. - -The syntax of the management statement for loading the plugin is shown in the figure. - -```sql -CREATE PIPEPLUGIN [IF NOT EXISTS] -AS -USING -``` -**IF NOT EXISTS semantics**: Used in creation operations to ensure that the create command is executed when the specified Pipe Plugin does not exist, preventing errors caused by attempting to create an existing Pipe Plugin. 
- -Example: Suppose you have implemented a data processing plugin named edu.tsinghua.iotdb.pipe.ExampleProcessor, packaged it as pipe-plugin.jar, and want to use it in the stream processing engine under the alias example. There are two ways to provide the plugin package: upload it to a URI server, or place it in a local directory of the cluster. - -Method 1: Upload to the URI server - -Preparation: To register in this way, you need to upload the JAR package to the URI server in advance and ensure that the IoTDB instance that executes the registration statement can access the URI server. For example https://example.com:8080/iotdb/pipe-plugin.jar . - -SQL: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI 'https://example.com:8080/iotdb/pipe-plugin.jar' -``` - -Method 2: Place the JAR package in a local directory of the cluster - -Preparation: To register in this way, you need to place the JAR package in any path on the machine where the DataNode is located. We recommend the /ext/pipe directory under the IoTDB installation path (the directory already exists in the installation package, so you do not need to create it). For example: iotdb-1.x.x-bin/ext/pipe/pipe-plugin.jar. **(Note: If you are using a cluster, you need to place the JAR package under the same path on the machine of every DataNode)** - -SQL: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI -``` - -### 2.2 Delete plugin statement - -When a user no longer wants to use a plugin and needs to uninstall it from the system, they can use the delete plugin statement as shown in the figure.
- -```sql -DROP PIPEPLUGIN [IF EXISTS] -``` - -**IF EXISTS semantics**: Used in deletion operations to ensure that when a specified Pipe Plugin exists, the delete command is executed to prevent errors caused by attempting to delete a non-existent Pipe Plugin. - -### 2.3 View plugin statements - -Users can also view plugins in the system on demand. View the statement of the plugin as shown in the figure. -```sql -SHOW PIPEPLUGINS -``` - -## 3. System preset stream processing plugin - -### 3.1 Pre-built Source Plugin - -#### iotdb-source - -Function: Extract historical or realtime data inside IoTDB into pipe. - - -| key | value | value range | required or optional with default | -|---------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------|-----------------------------------| -| source | iotdb-source | String: iotdb-source | required | -| source.pattern | path prefix for filtering time series | String: any time series prefix | optional: root | -| source.history.start-time | start of synchronizing historical data event time,including start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| source.history.end-time | end of synchronizing historical data event time,including end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| source.forwarding-pipe-requests | Whether to forward data written by another Pipe (usually Data Sync) | Boolean: true, false | optional:true | -| start-time(V1.3.1+) | start of synchronizing all data event time,including start-time. Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| end-time(V1.3.1+) | end of synchronizing all data event time,including end-time. 
Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| source.realtime.mode | Extraction mode for real-time data | String: hybrid, stream, batch | optional: hybrid | - -> 🚫 **source.pattern Parameter Description** -> -> * Pattern should use backquotes to escape illegal characters or illegal path nodes. For example, if you want to filter root.\`a@b\` or root.\`123\`, you should set the pattern to root.\`a@b\` or root.\`123\` (Refer specifically to [Timing of single and double quotes and backquotes](https://iotdb.apache.org/Download/)) -> * In the underlying implementation, when pattern is detected as root (default value) or a database name, synchronization efficiency is higher; any other format will reduce performance. -> * The path prefix does not need to form a complete path. For example, when creating a pipe with the parameter 'source.pattern'='root.aligned.1': - > - > * root.aligned.1TS - > * root.aligned.1TS.\`1\` - > * root.aligned.100TS - > - > the data will be synchronized; - > - > * root.aligned.\`1\` - > * root.aligned.\`123\` - > - > the data will not be synchronized. - -> ❗️**start-time, end-time parameter description of source** -> -> * start-time, end-time should be in ISO format, such as 2011-12-03T10:15:30 or 2011-12-03T10:15:30+01:00. However, version 1.3.1+ also supports timestamp format like 1706704494000. - -> ✅ **A piece of data from production to IoTDB contains two key concepts of time** -> -> * **event time:** The time when the data is actually produced (or the generation time assigned to the data by the data production system, which is the time item in the data point). -> * **arrival time:** The time when data arrives in the IoTDB system.
-> -> The out-of-order data we often refer to is data whose **event time** is far behind the current system time (or the maximum **event time** that has been persisted to disk) when the data arrives. On the other hand, whether it is out-of-order data or sequential data, as long as it newly arrives in the system, its **arrival time** increases with the order in which the data arrives at IoTDB. - -> 💎 **The work of iotdb-source can be split into two stages** -> -> 1. Historical data extraction: All data with **arrival time** < **current system time** when the pipe is created is called historical data -> 2. Realtime data extraction: All data with **arrival time** >= **current system time** when the pipe is created is called realtime data -> -> The historical data transmission phase and the realtime data transmission phase are executed serially. The realtime data transmission phase starts only after the historical data transmission phase is completed. - -> 📌 **source.realtime.mode: Data extraction mode** -> -> * log: In this mode, the task only uses the operation log for data processing and sending. -> * file: In this mode, the task only uses data files for data processing and sending. -> * hybrid: This mode combines the two approaches. Sending operation log entries one by one gives low latency but low throughput, while sending data files in batches gives high throughput but high latency. Hybrid mode automatically switches to the appropriate extraction method under different write loads: it first uses operation-log-based extraction to keep sending latency low; when a data backlog occurs, it automatically switches to data-file-based extraction to keep sending throughput high; when the backlog is cleared, it automatically switches back to operation-log-based extraction. This avoids the difficulty of balancing sending latency and throughput with a single data extraction algorithm.
- -> 🍕 **source.forwarding-pipe-requests: Whether to allow forwarding data transmitted from another pipe** -> -> * If you want to use pipe to build data synchronization of A -> B -> C, then the pipe of B -> C needs to set this parameter to true, so that the data written by A to B through the pipe in A -> B can be forwarded correctly to C -> * If you want to use pipe to build two-way data synchronization (dual-active) of A \<-> B, then the pipes of A -> B and B -> A need to set this parameter to false, otherwise the data will be forwarded endlessly back and forth between the clusters - -### 3.2 Preset processor plugin - -#### do-nothing-processor - -Function: No processing is done on the events passed in by the source. - - -| key | value | value range | required or optional with default | -|-----------|----------------------|------------------------------|-----------------------------------| -| processor | do-nothing-processor | String: do-nothing-processor | required | - -### 3.3 Preset sink plugin - -#### do-nothing-sink - -Function: No processing is done on the events passed in by the processor. - -| key | value | value range | required or optional with default | -|------|-----------------|-------------------------|-----------------------------------| -| sink | do-nothing-sink | String: do-nothing-sink | required | - -## 4. Stream processing task management - -### 4.1 Create a stream processing task - -Use the `CREATE PIPE` statement to create a stream processing task.
Taking the creation of a data synchronization stream processing task as an example, the sample SQL statement is as follows: - -```sql -CREATE PIPE -- PipeId is the name that uniquely identifies the sync task -WITH SOURCE ( - -- Default IoTDB Data Extraction Plugin - 'source' = 'iotdb-source', - -- Path prefix, only data that can match the path prefix will be extracted for subsequent processing and delivery - 'source.pattern' = 'root.timecho', - -- Whether to extract historical data - 'source.history.enable' = 'true', - -- Describes the time range of the historical data being extracted, indicating the earliest possible time - 'source.history.start-time' = '2011.12.03T10:15:30+01:00', - -- Describes the time range of the extracted historical data, indicating the latest time - 'source.history.end-time' = '2022.12.03T10:15:30+01:00', - -- Whether to extract realtime data - 'source.realtime.enable' = 'true', -) -WITH PROCESSOR ( - -- Default data processing plugin, means no processing - 'processor' = 'do-nothing-processor', -) -WITH SINK ( - -- IoTDB data sending plugin with target IoTDB - 'sink' = 'iotdb-thrift-sink', - -- Data service IP of one of the DataNode nodes of the target IoTDB - 'sink.ip' = '127.0.0.1', - -- Data service port of one of the DataNode nodes of the target IoTDB - 'sink.port' = '6667', -) -``` - -**When creating a stream processing task, you need to configure the PipeId and the parameters of the three plugin parts:** - -| Configuration | Description | Required or not | Default implementation | Default implementation description | Allow custom implementation | -|---------------|-----------------------------------------------------------------------------------------------------|---------------------------------|------------------------|---------------------------------------------------------------------------------------------------------------------------|------------------------------------| -| PipeId | A globally unique name that
identifies a stream processing | Required | - | - | - | -| source | Pipe Source plugin, responsible for extracting stream processing data at the bottom of the database | Optional | iotdb-source | Integrate the full historical data of the database and subsequent real-time data arriving into the stream processing task | No | -| processor | Pipe Processor plugin, responsible for processing data | Optional | do-nothing-processor | Does not do any processing on the incoming data | Yes | -| sink | Pipe Sink plugin, responsible for sending data | Required | - | - | Yes | - -In the example, the iotdb-source, do-nothing-processor and iotdb-thrift-sink plugins are used to build the data flow processing task. IoTDB also has other built-in stream processing plugins, **please check the "System Preset Stream Processing plugin" section**. - -**A simplest example of the CREATE PIPE statement is as follows:** - -```sql -CREATE PIPE -- PipeId is a name that uniquely identifies the stream processing task -WITH SINK ( - -- IoTDB data sending plugin, the target is IoTDB - 'sink' = 'iotdb-thrift-sink', - --The data service IP of one of the DataNode nodes in the target IoTDB - 'sink.ip' = '127.0.0.1', - -- The data service port of one of the DataNode nodes in the target IoTDB - 'sink.port' = '6667', -) -``` - -The semantics expressed are: synchronize all historical data in this database instance and subsequent real-time data arriving to the IoTDB instance with the target 127.0.0.1:6667. - -**Notice:** - -- SOURCE and PROCESSOR are optional configurations. If you do not fill in the configuration parameters, the system will use the corresponding default implementation. -- SINK is a required configuration and needs to be configured declaratively in the CREATE PIPE statement -- SINK has self-reuse capability. 
For different stream processing tasks, if their SINKs have the same KV attributes (the values corresponding to all keys are the same), then the system will create only one SINK instance in the end, so that connection resources are reused. - - - For example, there are the following declarations of two stream processing tasks, pipe1 and pipe2: - - ```sql - CREATE PIPE pipe1 - WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'sink.ip' = 'localhost', - 'sink.port' = '9999', - ) - - CREATE PIPE pipe2 - WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'sink.port' = '9999', - 'sink.ip' = 'localhost', - ) - ``` - -- Because their declarations of SINK are exactly the same (**even if the order of declaration of some attributes is different**), the framework will automatically reuse the SINKs they declared, and ultimately the SINKs of pipe1 and pipe2 will be the same instance. -- When the source is the default iotdb-source, and source.forwarding-pipe-requests is the default value true, please do not build an application scenario that includes cyclic data synchronization (it will cause an infinite loop): - - - IoTDB A -> IoTDB B -> IoTDB A - - IoTDB A -> IoTDB A - -### 4.2 Start the stream processing task - -After the CREATE PIPE statement is successfully executed, the stream processing task-related instance will be created, but the running status of the entire stream processing task will be set to STOPPED (V1.3.0), that is, the stream processing task will not process data immediately. In version 1.3.1 and later, the status of the task will be set to RUNNING after CREATE.
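
The version-dependent initial state, together with the effect of the START PIPE, STOP PIPE and DROP PIPE statements, can be read as a small state machine. The following Java sketch is illustrative only: the class `PipeStateSketch`, its `State` enum and its method names are hypothetical and are not part of the IoTDB API. It also assumes that START on a RUNNING task and STOP on a STOPPED task are no-ops, and that no statement is valid once a task is DROPPED; the text does not state either point.

```java
import java.util.Objects;

/** Illustrative model of the pipe task states described in this guide; not IoTDB code. */
final class PipeStateSketch {

  /** The three states named in the documentation. */
  enum State { RUNNING, STOPPED, DROPPED }

  /**
   * State right after a successful CREATE PIPE.
   * V1.3.0 leaves the task STOPPED; V1.3.1 and later set it to RUNNING.
   */
  static State afterCreate(boolean isV131OrLater) {
    return isV131OrLater ? State.RUNNING : State.STOPPED;
  }

  /** START PIPE: the task begins processing data. */
  static State start(State current) {
    requireNotDropped(current);
    return State.RUNNING; // assumption: starting a RUNNING task changes nothing
  }

  /** STOP PIPE: the task stops processing data. */
  static State stop(State current) {
    requireNotDropped(current);
    return State.STOPPED; // assumption: stopping a STOPPED task changes nothing
  }

  /** DROP PIPE: allowed from RUNNING or STOPPED, no prior STOP is required. */
  static State drop(State current) {
    requireNotDropped(current);
    return State.DROPPED;
  }

  /** Assumption: a dropped task cannot be started, stopped or dropped again. */
  private static void requireNotDropped(State current) {
    Objects.requireNonNull(current, "current");
    if (current == State.DROPPED) {
      throw new IllegalStateException("pipe has already been dropped");
    }
  }

  public static void main(String[] args) {
    State s = afterCreate(false);
    System.out.println("after CREATE PIPE on V1.3.0: " + s);
    s = start(s);
    System.out.println("after START PIPE: " + s);
    s = drop(s);
    System.out.println("after DROP PIPE: " + s);
  }
}
```

On V1.3.0 an explicit START PIPE is therefore needed before any data flows, while on V1.3.1 and later the same statement is only needed after a STOP PIPE or an automatic stop caused by an error.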
- -You can use the START PIPE statement to cause a stream processing task to start processing data: - -```sql -START PIPE -``` - -### 4.3 Stop the stream processing task - -Use the STOP PIPE statement to stop the stream processing task from processing data: - -```sql -STOP PIPE -``` - -### 4.4 Delete stream processing tasks - -Use the DROP PIPE statement to stop the stream processing task from processing data (when the stream processing task status is RUNNING), and then delete the entire stream processing task: - -```sql -DROP PIPE -``` - -Users do not need to perform a STOP operation before deleting the stream processing task. - -### 4.5 Display stream processing tasks - -Use the SHOW PIPES statement to view all stream processing tasks: - -```sql -SHOW PIPES -``` - -The query results are as follows: - -```sql -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -| ID| CreationTime| State|PipeSource|PipeProcessor|PipeSink|ExceptionMessage| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -|iotdb-kafka|2022-03-30T20:58:30.689|RUNNING| ...| ...| ...| {}| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -|iotdb-iotdb|2022-03-31T12:55:28.129|STOPPED| ...| ...| ...| TException: ...| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -``` - -You can use `` to specify the status of a stream processing task you want to see: - -```sql -SHOW PIPE -``` - -You can also use the where clause to determine whether the Pipe Sink used by a certain \ is reused. 
- -```sql -SHOW PIPES -WHERE SINK USED BY -``` - -### 4.6 Stream processing task running status migration - -A stream processing pipe will pass through various states during its managed life cycle: - -- **RUNNING:** pipe is working properly - - When a pipe is successfully created, its initial state is RUNNING.(V1.3.1+) -- **STOPPED:** The pipe is stopped. When the pipeline is in this state, there are several possibilities: - - When a pipe is successfully created, its initial state is STOPPED.(V1.3.0) - - The user manually pauses a pipe that is in normal running status, and its status will passively change from RUNNING to STOPPED. - - When an unrecoverable error occurs during the running of a pipe, its status will automatically change from RUNNING to STOPPED -- **DROPPED:** The pipe task was permanently deleted - -The following diagram shows all states and state transitions: - -![State migration diagram](/img/%E7%8A%B6%E6%80%81%E8%BF%81%E7%A7%BB%E5%9B%BE.png) - -## 5. authority management - -### 5.1 Stream processing tasks - - -| Permission name | Description | -|-----------------|------------------------------------------------------------| -| USE_PIPE | Register a stream processing task. The path is irrelevant. | -| USE_PIPE | Start the stream processing task. The path is irrelevant. | -| USE_PIPE | Stop the stream processing task. The path is irrelevant. | -| USE_PIPE | Offload stream processing tasks. The path is irrelevant. | -| USE_PIPE | Query stream processing tasks. The path is irrelevant. | - -### 5.2 Stream processing task plugin - - -| Permission name | Description | -|-----------------|----------------------------------------------------------------------| -| USE_PIPE | Register stream processing task plugin. The path is irrelevant. | -| USE_PIPE | Uninstall the stream processing task plugin. The path is irrelevant. | -| USE_PIPE | Query stream processing task plugin. The path is irrelevant. | - -## 6. 
Configuration parameters - -In iotdb-system.properties: - -V1.3.0+: -```Properties -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_connector_timeout_ms=900000 - -# The maximum number of selectors that can be used in the async connector. -# pipe_async_connector_selector_number=1 - -# The core number of clients that can be used in the async connector. -# pipe_async_connector_core_client_number=8 - -# The maximum number of clients that can be used in the async connector. -# pipe_async_connector_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` - -V1.3.1+: -```Properties -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. 
-# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` diff --git a/src/UserGuide/latest/User-Manual/Tiered-Storage_timecho.md b/src/UserGuide/latest/User-Manual/Tiered-Storage_timecho.md deleted file mode 100644 index 5d4c4a1dd..000000000 --- a/src/UserGuide/latest/User-Manual/Tiered-Storage_timecho.md +++ /dev/null @@ -1,100 +0,0 @@ - - -# Tiered Storage -## 1. Overview - -The Tiered storage functionality allows users to define multiple layers of storage, spanning across multiple types of storage media (Memory mapped directory, SSD, rotational hard discs or cloud storage). While memory and cloud storage is usually singular, the local file system storages can consist of multiple directories joined together into one tier. Meanwhile, users can classify data based on its hot or cold nature and store data of different categories in specified "tier". Currently, IoTDB supports the classification of hot and cold data through TTL (Time to live / age) of data. 
When the data in a tier no longer satisfies the TTL rule defined for that tier, it is automatically migrated to the next tier. - -## 2. Parameter Definition - -To enable tiered storage in IoTDB, you need to configure the following aspects: - -1. Configure the data directories and divide them into tiers. -2. Configure the TTL of the data managed by each tier, to distinguish the hot and cold data categories held in different tiers. -3. Configure the minimum remaining storage space ratio for each tier so that when the storage space of the tier triggers the threshold, the data of the tier will be automatically migrated to the next tier (optional). - -The specific parameter definitions and their descriptions are as follows. - -| Configuration | Default | Required | Description | Constraint | -| --------------------------------------- | ------------------------ | --- | ------------------------------------------------------------ | ------------------------------------------------------------ | -| dn_data_dirs | data/datanode/data | Yes | Specify the storage directories and divide them into tiers | Tiers are separated by semicolons, and directories within a single tier are separated by commas; cloud storage (OBJECT_STORAGE) can only be used as the last tier and cannot be used as the first tier; at most one cloud storage tier is allowed; the remote storage directory is denoted by OBJECT_STORAGE | -| tier_ttl_in_ms | -1 | Yes | Define the maximum age of data for which each tier is responsible | Tiers are separated by semicolons; the number of tiers should match the number of tiers defined by dn_data_dirs; "-1" means "unlimited". | -| dn_default_space_usage_thresholds | 0.85 | Yes | Define the maximum storage usage threshold ratio for each tier of data directories. When the used space exceeds this ratio, the data will be automatically migrated to the next tier. 
If the storage usage of the last tier surpasses this threshold, the system will be set to READ_ONLY mode. | Tiers are separated by semicolons; the number of tiers should match the number of tiers defined by dn_data_dirs | -| object_storage_type | `AWS_S3` | Required when using remote storage | Cloud storage type. | Currently only `AWS_S3` is supported. | -| object_storage_bucket | iotdb_data | Required when using remote storage | Name of the cloud storage bucket | Bucket definition in AWS S3 | -| object_storage_endpoint | | Required when using remote storage | Endpoint of the cloud storage | Endpoint of AWS S3 | -| object_storage_region | (Empty) | Required when using remote storage | Cloud storage region. | | -| object_storage_access_key | | Required when using remote storage | Cloud storage credential: access key | AWS S3 credential key | -| object_storage_access_secret | | Required when using remote storage | Cloud storage credential: access secret | AWS S3 credential secret | -| enable_path_style_access | false | No | Whether to enable path style access for the object storage service. | | -| remote_tsfile_cache_dirs | data/datanode/data/cache | No | Local cache directory for files stored in cloud storage | | -| remote_tsfile_cache_page_size_in_kb | 20480 | No | Block size of the local cache for files stored in cloud storage | | -| remote_tsfile_cache_max_disk_usage_in_mb | 51200 | No | Maximum disk usage of the local cache for cloud storage | | - -## 3. Local tiered storage configuration example - -The following is an example of a local two-level storage configuration. 
- -```JavaScript -//Required configuration items -dn_data_dirs=/data1/data;/data2/data,/data3/data; -tier_ttl_in_ms=86400000;-1 -dn_default_space_usage_thresholds=0.2;0.1 -``` - -In this example, two levels of storage are configured, specifically: - -| **tier** | **data path** | **data range** | **threshold for minimum remaining disk space** | -| -------- | -------------------------------------- | --------------- | ------------------------ | -| tier 1 | path 1:/data1/data | data from the last 1 day | 20% | -| tier 2 | path 1:/data2/data path 2:/data3/data | data older than 1 day | 10% | - -## 4. Remote tiered storage configuration example - -The following takes three-level storage as an example: - -```JavaScript -//Required configuration items -dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE -tier_ttl_in_ms=86400000;864000000;-1 -dn_default_space_usage_thresholds=0.2;0.15;0.1 -object_storage_type=AWS_S3 -object_storage_bucket=iotdb -object_storage_region= -object_storage_endpoint= -object_storage_access_key= -object_storage_access_secret= - -// Optional configuration items -enable_path_style_access=false -remote_tsfile_cache_dirs=data/datanode/data/cache -remote_tsfile_cache_page_size_in_kb=20971520 -remote_tsfile_cache_max_disk_usage_in_mb=53687091200 -``` - -In this example, a total of three levels of storage are configured, specifically: - -| **tier** | **data path** | **data range** | **threshold for minimum remaining disk space** | -| -------- | -------------------------------------- | ---------------------------- | ------------------------ | -| tier 1 | path 1:/data1/data | data from the last 1 day | 20% | -| tier 2 | path 1:/data2/data path 2:/data3/data | data between 1 day and 10 days old | 15% | -| tier 3 | S3 Cloud Storage | data older than 10 days | 10% | diff --git a/src/UserGuide/latest/User-Manual/User-defined-function_timecho.md b/src/UserGuide/latest/User-Manual/User-defined-function_timecho.md deleted file mode 100644 index 5d5623569..000000000 
--- a/src/UserGuide/latest/User-Manual/User-defined-function_timecho.md +++ /dev/null @@ -1,953 +0,0 @@ -# UDF - -## 1. UDF Introduction - -UDF (User Defined Function) refers to user-defined functions. IoTDB provides a variety of built-in time series processing functions and also supports extending custom functions to meet more computing needs. - -In IoTDB, you can expand two types of UDF: - - - - - - - - - - - - - - - - - - - - - - -

| UDF Class | AccessStrategy | Description |
| --- | --- | --- |
| UDTF | MAPPABLE_ROW_BY_ROW | Custom scalar function. The input is k columns of time series and 1 row of data; the output is 1 column of time series and 1 row of data. It can be used in any clause and expression where a scalar function can appear, such as the SELECT clause and the WHERE clause. |
| UDTF | ROW_BY_ROW, SLIDING_TIME_WINDOW, SLIDING_SIZE_WINDOW, SESSION_TIME_WINDOW, STATE_WINDOW | Custom time series generation function. The input is k columns of time series and m rows of data; the output is 1 column of time series and n rows of data. The number of input rows m can differ from the number of output rows n. It can only be used in SELECT clauses. |
| UDAF | - | Custom aggregation function. The input is k columns of time series and m rows of data; the output is 1 column of time series and 1 row of data. It can be used in any clause and expression where an aggregation function can appear, such as the SELECT clause and the HAVING clause. |
- -### 1.1 UDF usage - -The usage of UDF is similar to that of regular built-in functions, and can be directly used in SELECT statements like calling regular functions. - -#### 1.Basic SQL syntax support - -* Support `SLIMIT` / `SOFFSET` -* Support `LIMIT` / `OFFSET` -* Support queries with value filters -* Support queries with time filters - - -#### 2. Queries with * in SELECT Clauses - -Assume that there are 2 time series (`root.sg.d1.s1` and `root.sg.d1.s2`) in the system. - -* **`SELECT example(*) from root.sg.d1`** - -Then the result set will include the results of `example (root.sg.d1.s1)` and `example (root.sg.d1.s2)`. - -* **`SELECT example(s1, *) from root.sg.d1`** - -Then the result set will include the results of `example(root.sg.d1.s1, root.sg.d1.s1)` and `example(root.sg.d1.s1, root.sg.d1.s2)`. - -* **`SELECT example(*, *) from root.sg.d1`** - -Then the result set will include the results of `example(root.sg.d1.s1, root.sg.d1.s1)`, `example(root.sg.d1.s2, root.sg.d1.s1)`, `example(root.sg.d1.s1, root.sg.d1.s2)` and `example(root.sg.d1.s2, root.sg.d1.s2)`. - -#### 3. Queries with Key-value Attributes in UDF Parameters - -You can pass any number of key-value pair parameters to the UDF when constructing a UDF query. The key and value in the key-value pair need to be enclosed in single or double quotes. Note that key-value pair parameters can only be passed in after all time series have been passed in. Here is a set of examples: - - Example: -``` sql -SELECT example(s1, 'key1'='value1', 'key2'='value2'), example(*, 'key3'='value3') FROM root.sg.d1; -SELECT example(s1, s2, 'key1'='value1', 'key2'='value2') FROM root.sg.d1; -``` - -#### 4. Nested Queries - - Example: -``` sql -SELECT s1, s2, example(s1, s2) FROM root.sg.d1; -SELECT *, example(*) FROM root.sg.d1 DISABLE ALIGN; -SELECT s1 * example(* / s1 + s2) FROM root.sg.d1; -SELECT s1, s2, s1 + example(s1, s2), s1 - example(s1 + example(s1, s2) / s2) FROM root.sg.d1; -``` - - -## 2. 
UDF management - -### 2.1 UDF Registration - -The process of registering a UDF in IoTDB is as follows: - -1. Implement a complete UDF class, assuming the full class name of this class is `org.apache.iotdb.udf.ExampleUDTF`. -2. Convert the project into a JAR package. If using Maven to manage the project, you can refer to the [Maven project example](https://github.com/apache/iotdb/tree/master/example/udf) above. -3. Make preparations for registration according to the registration mode. For details, see the following example. -4. You can use following SQL to register UDF. - -```sql -CREATE FUNCTION AS (USING URI URI-STRING) -``` - -#### Example: register UDF named `example`, you can choose either of the following two registration methods - -#### Method 1: Manually place the jar package - -Prepare: -When registering using this method, it is necessary to place the JAR package in advance in the `ext/udf` directory of all nodes in the cluster (which can be configured). - -Registration statement: - -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' -``` - -#### Method 2: Cluster automatically installs jar packages through URI - -Prepare: -When registering using this method, it is necessary to upload the JAR package to the URI server in advance and ensure that the IoTDB instance executing the registration statement can access the URI server. - -Registration statement: - -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' USING URI 'http://jar/example.jar' -``` - -IoTDB will download JAR packages and synchronize them to the entire cluster. - -#### Note - -1. Since UDF instances are dynamically loaded through reflection technology, you do not need to restart the server during the UDF registration process. - -2. UDF function names are not case-sensitive. - -3. Please ensure that the function name given to the UDF is different from all built-in function names. A UDF with the same name as a built-in function cannot be registered. - -4. 
We recommend that you do not use classes that have the same class name but different function logic in different JAR packages. For example, in `UDF(UDAF/UDTF): udf1, udf2`, the JAR package of udf1 is `udf1.jar` and the JAR package of udf2 is `udf2.jar`. Assume that both JAR packages contain the `org.apache.iotdb.udf.ExampleUDTF` class. If you use two UDFs in the same SQL statement at the same time, the system will randomly load either of them and may cause inconsistency in UDF execution behavior. - -### 2.2 UDF Deregistration - -The SQL syntax is as follows: - -```sql -DROP FUNCTION -``` - -Example: Uninstall the UDF from the above example: - -```sql -DROP FUNCTION example -``` - -Note: For functions registered using USING URI, you need to remove the UDF's JAR files from the cluster-wide node path (`installation_package/ext/udf/install`). - -### 2.3 Show All Registered UDFs - -``` sql -SHOW FUNCTIONS -``` - -### 2.4 UDF configuration - -- UDF configuration allows configuring the storage directory of UDF in `iotdb-system.properties` - ``` Properties -# UDF lib dir - -udf_lib_dir=ext/udf -``` - -- -When using custom functions, there is a message indicating insufficient memory. Change the following configuration parameters in `iotdb-system.properties` and restart the service. - - ``` Properties - -# Used to estimate the memory usage of text fields in a UDF query. -# It is recommended to set this value to be slightly larger than the average length of all text -# effectiveMode: restart -# Datatype: int -udf_initial_byte_array_length_for_memory_control=48 - -# How much memory may be used in ONE UDF query (in MB). -# The upper limit is 20% of allocated memory for read. -# effectiveMode: restart -# Datatype: float -udf_memory_budget_in_mb=30.0 - -# UDF memory allocation ratio. -# The parameter form is a:b:c, where a, b, and c are integers. 
-# effectiveMode: restart -udf_reader_transformer_collector_memory_proportion=1:1:1 -``` - -### 2.5 UDF User Permissions - - -When users use UDF, they will be involved in the `USE_UDF` permission, and only users with this permission are allowed to perform UDF registration, uninstallation, and query operations. - -For more user permissions related content, please refer to [Account Management Statements](../User-Manual/Authority-Management_timecho). - - -## 3. UDF Libraries - -Based on the ability of user-defined functions, IoTDB provides a series of functions for temporal data processing, including data quality, data profiling, anomaly detection, frequency domain analysis, data matching, data repairing, sequence discovery, machine learning, etc., which can meet the needs of industrial fields for temporal data processing. - -You can refer to the [UDF Libraries](../SQL-Manual/UDF-Libraries_timecho.md)document to find the installation steps and registration statements for each function, to ensure that all required functions are registered correctly. - - -## 4. UDF development - -### 4.1 UDF Development Dependencies - -If you use [Maven](http://search.maven.org/), you can search for the development dependencies listed below from the [Maven repository](http://search.maven.org/) . Please note that you must select the same dependency version as the target IoTDB server version for development. - -``` xml - - org.apache.iotdb - udf-api - 1.0.0 - provided - -``` - -### 4.2 UDTF(User Defined Timeseries Generating Function) - -To write a UDTF, you need to inherit the `org.apache.iotdb.udf.api.UDTF` class, and at least implement the `beforeStart` method and a `transform` method. 
- -#### Interface Description: - -| Interface definition | Description | Required to Implement | -| :----------------------------------------------------------- | :----------------------------------------------------------- | ----------------------------------------------------- | -| void validate(UDFParameterValidator validator) throws Exception | This method is mainly used to validate `UDFParameters` and it is executed before `beforeStart(UDFParameters, UDTFConfigurations)` is called. | Optional | -| void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception | The initialization method to call the user-defined initialization behavior before a UDTF processes the input data. Every time a user executes a UDTF query, the framework will construct a new UDF instance, and `beforeStart` will be called. | Required | -| Object transform(Row row) throws Exception | This method is called by the framework. This data processing method will be called when you choose to use the `MappableRowByRowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `Row`, and the transformation result should be returned. | Required to implement at least one `transform` method | -| void transform(Column[] columns, ColumnBuilder builder) throws Exception | This method is called by the framework. This data processing method will be called when you choose to use the `MappableRowByRowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `Column[]`, and the transformation result should be output by `ColumnBuilder`. You need to call the data collection method provided by `builder` to determine the output data. | Required to implement at least one `transform` method | -| void transform(Row row, PointCollector collector) throws Exception | This method is called by the framework. 
This data processing method will be called when you choose to use the `RowByRowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `Row`, and the transformation result should be output by `PointCollector`. You need to call the data collection method provided by `collector` to determine the output data. | Required to implement at least one `transform` method | -| void transform(RowWindow rowWindow, PointCollector collector) throws Exception | This method is called by the framework. This data processing method will be called when you choose to use the `SlidingSizeWindowAccessStrategy` or `SlidingTimeWindowAccessStrategy` strategy (set in `beforeStart`) to consume raw data. Input data is passed in by `RowWindow`, and the transformation result should be output by `PointCollector`. You need to call the data collection method provided by `collector` to determine the output data. | Required to implement at least one `transform` method | -| void terminate(PointCollector collector) throws Exception | This method is called by the framework. This method will be called once after all `transform` calls have been executed. In a single UDF query, this method will and will only be called once. You need to call the data collection method provided by `collector` to determine the output data. | Optional | -| void beforeDestroy() | This method is called by the framework after the last input data is processed, and will only be called once in the life cycle of each UDF instance. | Optional | - -In the life cycle of a UDTF instance, the calling sequence of each method is as follows: - -1. void validate(UDFParameterValidator validator) throws Exception -2. void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception -3. 
`Object transform(Row row) throws Exception` or `void transform(Column[] columns, ColumnBuilder builder) throws Exception` or `void transform(Row row, PointCollector collector) throws Exception` or `void transform(RowWindow rowWindow, PointCollector collector) throws Exception` -4. void terminate(PointCollector collector) throws Exception -5. void beforeDestroy() - -> Note that every time the framework executes a UDTF query, a new UDF instance will be constructed. When the query ends, the corresponding instance will be destroyed. Therefore, the internal data of the instances in different UDTF queries (even in the same SQL statement) are isolated. You can maintain some state data in the UDTF without considering the influence of concurrency and other factors. - -#### Detailed interface introduction: - -1. **void validate(UDFParameterValidator validator) throws Exception** - -The `validate` method is used to validate the parameters entered by the user. - -In this method, you can limit the number and types of input time series, check the attributes of user input, or perform any custom verification. - -Please refer to the [Javadoc](https://github.com/apache/iotdb/blob/rc/2.0.4/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/parameter/UDFParameterValidator.java) for the usage of `UDFParameterValidator`. - - -2. **void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception** - -This method is mainly used to customize UDTF. In this method, the user can do the following things: - -1. Use UDFParameters to get the time series paths and parse key-value pair attributes entered by the user. -2. Set the strategy to access the raw data and set the output data type in UDTFConfigurations. -3. Create resources, such as establishing external connections, opening files, etc. - - -2.1 **UDFParameters** - -`UDFParameters` is used to parse UDF parameters in SQL statements (the part in parentheses after the UDF function name in SQL). 
The input parameters have two parts. The first part is data types of the time series that the UDF needs to process, and the second part is the key-value pair attributes for customization. Only the second part can be empty. - - -Example: - -``` sql -SELECT UDF(s1, s2, 'key1'='iotdb', 'key2'='123.45') FROM root.sg.d; -``` - -Usage: - -``` java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - String stringValue = parameters.getString("key1"); // iotdb - Float floatValue = parameters.getFloat("key2"); // 123.45 - Double doubleValue = parameters.getDouble("key3"); // null - int intValue = parameters.getIntOrDefault("key4", 678); // 678 - // do something - - // configurations - // ... -} -``` - - -2.2 **UDTFConfigurations** - -You must use `UDTFConfigurations` to specify the strategy used by UDF to access raw data and the type of output sequence. - -Usage: - -``` java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - // parameters - // ... - - // configurations - configurations - .setAccessStrategy(new RowByRowAccessStrategy()) - .setOutputDataType(Type.INT32); -} -``` - -The `setAccessStrategy` method is used to set the UDF's strategy for accessing the raw data, and the `setOutputDataType` method is used to set the data type of the output sequence. - - 2.2.1 **setAccessStrategy** - - -Note that the raw data access strategy you set here determines which `transform` method the framework will call. Please implement the `transform` method corresponding to the raw data access strategy. Of course, you can also dynamically decide which strategy to set based on the attribute parameters parsed by `UDFParameters`. Therefore, two `transform` methods are also allowed to be implemented in one UDF. 
- -The following are the strategies you can set: - -| Access strategy | Description | The `transform` Method to Call | -| :-------------------------------- | :----------------------------------------------------------- | ------------------------------------------------------------ | -| MappableRowByRowAccessStrategy | Custom scalar function.<br>
The framework calls the `transform` method once for each input row: the input is 1 row of data across k time series columns, and the output is 1 row of data in 1 time series column. It can be used in any clause and expression where scalar functions can appear, such as SELECT clauses and WHERE clauses. | void transform(Column[] columns, ColumnBuilder builder) throws Exception or Object transform(Row row) throws Exception | -| RowByRowAccessStrategy | Custom time series generation function that processes raw data row by row.<br>
The framework calls the `transform` method once for each input row: the input is 1 row of data across k time series columns, and the output is n rows of data in 1 time series column.<br>
With a single input series, each row is one data point of that series.<br>
With multiple input series, the series are first aligned by timestamp, and each row holds the aligned values at one timestamp.<br>
(Individual columns in a row may be `null`, but a row never has all columns `null`.) | void transform(Row row, PointCollector collector) throws Exception | -| SlidingTimeWindowAccessStrategy | Custom time series generation function that processes raw data in sliding time windows.<br>
The framework will call the `transform` method once for each raw data input window, input k columns of time series m rows of data, and output 1 column of time series n rows of data.
A window may contain multiple rows of data, and after aligning the input sequence in time, each window serves as a data point for the input sequence.
(Each window may have i rows, and each row of data may have a column with a `null` value, but not all of them are `null`) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception | -| SlidingSizeWindowAccessStrategy | Customize the time series generation function to process raw data in a fixed number of rows, meaning that each data processing window will contain a fixed number of rows of data (except for the last window).
The framework will call the `transform` method once for each raw data input window, input k columns of time series m rows of data, and output 1 column of time series n rows of data.
A window may contain multiple rows of data, and after aligning the input sequence in time, each window serves as a data point for the input sequence.
(Each window may have i rows, and each row of data may have a column with a `null` value, but not all of them are `null`) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception | -| SessionTimeWindowAccessStrategy | Customize time series generation functions to process raw data in a session window format.
The framework will call the `transform` method once for each raw data input window, input k columns of time series m rows of data, and output 1 column of time series n rows of data.
A window may contain multiple rows of data, and after aligning the input sequence in time, each window serves as a data point for the input sequence.
(Each window may have i rows, and each row of data may have a column with a `null` value, but not all of them are `null`) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception | -| StateWindowAccessStrategy | Customize time series generation functions to process raw data in a state window format.
The framework will call the `transform` method once for each raw data input window, inputting 1 column of time series m rows of data and outputting 1 column of time series n rows of data.<br>
A window may contain multiple rows of data, and currently only supports opening windows for one physical quantity, which is one column of data. | void transform(RowWindow rowWindow, PointCollector collector) throws Exception | - - -#### Interface Description: - -- `MappableRowByRowStrategy` and `RowByRowAccessStrategy`: The construction of `RowByRowAccessStrategy` does not require any parameters. - -- `SlidingTimeWindowAccessStrategy` - -Window opening diagram: - - - -`SlidingTimeWindowAccessStrategy`: `SlidingTimeWindowAccessStrategy` has many constructors, you can pass 3 types of parameters to them: - -- Parameter 1: The display window on the time axis - -The first type of parameters are optional. If the parameters are not provided, the beginning time of the display window will be set to the same as the minimum timestamp of the query result set, and the ending time of the display window will be set to the same as the maximum timestamp of the query result set. - -- Parameter 2: Time interval for dividing the time axis (should be positive) -- Parameter 3: Time sliding step (not required to be greater than or equal to the time interval, but must be a positive number) - -The sliding step parameter is also optional. If the parameter is not provided, the sliding step will be set to the same as the time interval for dividing the time axis. - -The relationship between the three types of parameters can be seen in the figure below. Please see the [Javadoc](https://github.com/apache/iotdb/blob/rc/2.0.4/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/SlidingTimeWindowAccessStrategy.java) for more details. - -
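The relationship between the display window, the time interval, and the sliding step can be sketched in plain Java without any IoTDB dependency. This is a minimal illustration and not the framework's implementation: the class `SlidingTimeWindowSketch` and its `windows` method are hypothetical names, and the sketch assumes that each window is half-open, `[start, start + interval)`, and is clipped at the end of the display window.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the IoTDB UDF API.
final class SlidingTimeWindowSketch {

  /**
   * Computes the [start, end) bounds of each sliding time window.
   * Assumption: windows start at displayBegin, advance by step, and are
   * clipped at displayEnd, so the last windows may be shorter than interval.
   */
  static List<long[]> windows(long displayBegin, long displayEnd, long interval, long step) {
    if (interval <= 0 || step <= 0) {
      throw new IllegalArgumentException("interval and step must be positive");
    }
    List<long[]> result = new ArrayList<>();
    for (long start = displayBegin; start < displayEnd; start += step) {
      long end = Math.min(start + interval, displayEnd);
      result.add(new long[] {start, end});
    }
    return result;
  }

  public static void main(String[] args) {
    // Display window [0, 10), interval 4, step 3: consecutive windows overlap.
    for (long[] w : windows(0, 10, 4, 3)) {
      System.out.println("[" + w[0] + ", " + w[1] + ")");
    }
  }
}
```

With a display window of `[0, 10)`, an interval of 4, and a step of 3, the sketch yields `[0, 4)`, `[3, 7)`, `[6, 10)`, and `[9, 10)`. The step is smaller than the interval, so the windows overlap, and the final window is shorter than the requested interval.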

- -> Note that the actual time interval of some of the last time windows may be less than the specified time interval parameter. In addition, there may be cases where the number of data rows in some time windows is 0. In these cases, the framework will also call the `transform` method for the empty windows. - -- `SlidingSizeWindowAccessStrategy` - -Window opening diagram: - - - -`SlidingSizeWindowAccessStrategy`: `SlidingSizeWindowAccessStrategy` has many constructors, you can pass 2 types of parameters to them: - -* Parameter 1: Window size. This parameter specifies the number of data rows contained in a data processing window. Note that the number of data rows in some of the last time windows may be less than the specified number of data rows. -* Parameter 2: Sliding step. This parameter means the number of rows between the first point of the next window and the first point of the current window. (This parameter is not required to be greater than or equal to the window size, but must be a positive number) - -The sliding step parameter is optional. If the parameter is not provided, the sliding step will be set to the same as the window size. - -- `SessionTimeWindowAccessStrategy` - -Window opening diagram: **Time intervals less than or equal to the given minimum time interval `sessionGap` are assigned in one group.** - - - -`SessionTimeWindowAccessStrategy`: `SessionTimeWindowAccessStrategy` has many constructors, you can pass 2 types of parameters to them: - -- Parameter 1: The display window on the time axis. -- Parameter 2: The minimum time interval `sessionGap` of two adjacent windows. - -- `StateWindowAccessStrategy` - -Window opening diagram: **For numerical data, if the state difference is less than or equal to the given threshold `delta`, it will be assigned in one group.** - - - -`StateWindowAccessStrategy` has four constructors. 
- -- Constructor 1: For numerical data, there are 3 parameters: the start time and end time of the display window on the time axis, and the threshold `delta` for the allowable change within a single window. -- Constructor 2: For text data and boolean data, there are 2 parameters: the start time and end time of the display window on the time axis. For both data types, all data within a single window is the same, so there is no need to provide an allowable change threshold. -- Constructor 3: For numerical data, there is 1 parameter: the threshold `delta` for the allowable change within a single window. The start time of the display window is set to the smallest timestamp in the entire query result set, and the end time of the display window is set to the largest timestamp in the entire query result set. -- Constructor 4: For text data and boolean data, no parameters are needed. The start and end timestamps are determined as explained in Constructor 3. - -`StateWindowAccessStrategy` can only take one column as input for now. - -Please see the [Javadoc](https://github.com/apache/iotdb/blob/rc/2.0.4/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/StateWindowAccessStrategy.java) for more details. - - 2.2.2 **setOutputDataType** - -Note that the type of output sequence you set here determines the type of data that the `PointCollector` can actually receive in the `transform` method. 
The relationship between the output data type set in `setOutputDataType` and the actual data output type that `PointCollector` can receive is as follows: - -| Output Data Type Set in `setOutputDataType` | Data Type that `PointCollector` Can Receive | -| :------------------------------------------ | :----------------------------------------------------------- | -| INT32 | int | -| INT64 | long | -| FLOAT | float | -| DOUBLE | double | -| BOOLEAN | boolean | -| TEXT | `java.lang.String` and `org.apache.iotdb.udf.api.type.Binary` | - -The type of output time series of a UDTF is determined at runtime, which means that a UDTF can dynamically determine the type of output time series according to the type of input time series. -Here is a simple example: - -```java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - // do something - // ... - - configurations - .setAccessStrategy(new RowByRowAccessStrategy()) - .setOutputDataType(parameters.getDataType(0)); -} -``` - -3. **Object transform(Row row) throws Exception** - -You need to implement this method or `transform(Column[] columns, ColumnBuilder builder) throws Exception` when you specify the strategy of UDF to read the original data as `MappableRowByRowAccessStrategy`. - -This method processes the raw data one row at a time. The raw data is input from `Row` and output by its return object. You must return exactly one object for each input data point in a single `transform` method call, i.e., input and output are one-to-one. It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements the `Object transform(Row row) throws Exception` method. It is an adder that receives two columns of time series as input. 
- -```java -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - private Type dataType; - - @Override - public void validate(UDFParameterValidator validator) throws Exception { - validator - .validateInputSeriesNumber(2) - .validateInputSeriesDataType(0, Type.INT64) - .validateInputSeriesDataType(1, Type.INT64); - } - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - dataType = parameters.getDataType(0); - configurations - .setAccessStrategy(new MappableRowByRowAccessStrategy()) - .setOutputDataType(dataType); - } - - @Override - public Object transform(Row row) throws Exception { - return row.getLong(0) + row.getLong(1); - } -} -``` - - - -4. **void transform(Column[] columns, ColumnBuilder builder) throws Exception** - -You need to implement this method or `Object transform(Row row) throws Exception` when you specify the strategy of UDF to read the original data as `MappableRowByRowAccessStrategy`. - -This method processes the raw data multiple rows at a time. After performance tests, we found that UDTF that process multiple rows at once perform better than those UDTF that process one data point at a time. The raw data is input from `Column[]` and output by `ColumnBuilder`. You must output a corresponding data point based on each input data point in a single `transform` method call, i.e., input and output are still one-to-one. 
It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements the `void transform(Column[] columns, ColumnBuilder builder) throws Exception` method. It is an adder that receives two columns of time series as input. - -```java -import org.apache.iotdb.tsfile.read.common.block.column.Column; -import org.apache.iotdb.tsfile.read.common.block.column.ColumnBuilder; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - private Type type; - - @Override - public void validate(UDFParameterValidator validator) throws Exception { - validator - .validateInputSeriesNumber(2) - .validateInputSeriesDataType(0, Type.INT64) - .validateInputSeriesDataType(1, Type.INT64); - } - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - type = parameters.getDataType(0); - configurations.setAccessStrategy(new MappableRowByRowAccessStrategy()).setOutputDataType(type); - } - - @Override - public void transform(Column[] columns, ColumnBuilder builder) throws Exception { - long[] inputs1 = columns[0].getLongs(); - long[] inputs2 = columns[1].getLongs(); - - int count = columns[0].getPositionCount(); - for (int i = 0; i < count; i++) { - builder.writeLong(inputs1[i] + inputs2[i]); - } - } -} -``` - -5. 
**void transform(Row row, PointCollector collector) throws Exception** - -You need to implement this method when you specify the strategy of UDF to read the original data as `RowByRowAccessStrategy`. - -This method processes the raw data one row at a time. The raw data is input from `Row` and output by `PointCollector`. You can output any number of data points in one `transform` method call. It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements the `void transform(Row row, PointCollector collector) throws Exception` method. It is an adder that receives two columns of time series as input. When both data points in a row are not `null`, this UDF outputs the algebraic sum of these two data points. - -```java -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT64) - .setAccessStrategy(new RowByRowAccessStrategy()); - } - - @Override - public void transform(Row row, PointCollector collector) throws Exception { - if (row.isNull(0) || row.isNull(1)) { - return; - } - collector.putLong(row.getTime(), row.getLong(0) + row.getLong(1)); - } -} -``` - -6. 
**void transform(RowWindow rowWindow, PointCollector collector) throws Exception** - -You need to implement this method when you specify the strategy of UDF to read the original data as `SlidingTimeWindowAccessStrategy` or `SlidingSizeWindowAccessStrategy`. - -This method processes a batch of data in a fixed number of rows or a fixed time interval each time, and we call the container containing this batch of data a window. The raw data is input from `RowWindow` and output by `PointCollector`. `RowWindow` can help you access a batch of `Row`, it provides a set of interfaces for random access and iterative access to this batch of `Row`. You can output any number of data points in one `transform` method call. It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -Below is a complete UDF example that implements the `void transform(RowWindow rowWindow, PointCollector collector) throws Exception` method. It is a counter that receives any number of time series as input, and its function is to count and output the number of data rows in each time window within a specified time range. 
- -```java -import java.io.IOException; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.access.RowWindow; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.SlidingTimeWindowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Counter implements UDTF { - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT32) - .setAccessStrategy(new SlidingTimeWindowAccessStrategy( - parameters.getLong("time_interval"), - parameters.getLong("sliding_step"), - parameters.getLong("display_window_begin"), - parameters.getLong("display_window_end"))); - } - - @Override - public void transform(RowWindow rowWindow, PointCollector collector) { - if (rowWindow.windowSize() != 0) { - collector.putInt(rowWindow.windowStartTime(), rowWindow.windowSize()); - } - } -} -``` - -7. **void terminate(PointCollector collector) throws Exception** - -In some scenarios, a UDF needs to traverse all the original data to calculate the final output data points. The `terminate` interface provides support for those scenarios. - -This method is called after all `transform` calls are executed and before the `beforeDestroy` method is executed. You can implement the `transform` method to perform pure data processing (without outputting any data points), and implement the `terminate` method to output the processing results. - -The processing results need to be output by the `PointCollector`. You can output any number of data points in one `terminate` method call. 
It should be noted that the type of output data points must be the same as you set in the `beforeStart` method, and the timestamps of output data points must be strictly monotonically increasing. - -Below is a complete UDF example that implements the `void terminate(PointCollector collector) throws Exception` method. It takes one time series whose data type is `INT32` as input, and outputs the maximum value point of the series. - -```java -import java.io.IOException; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Max implements UDTF { - - private Long time; - private int value; - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT32) - .setAccessStrategy(new RowByRowAccessStrategy()); - } - - @Override - public void transform(Row row, PointCollector collector) { - if (row.isNull(0)) { - return; - } - int candidateValue = row.getInt(0); - if (time == null || value < candidateValue) { - time = row.getTime(); - value = candidateValue; - } - } - - @Override - public void terminate(PointCollector collector) throws IOException { - if (time != null) { - collector.putInt(time, value); - } - } -} -``` - -8. **void beforeDestroy()** - -The method for terminating a UDF. - -This method is called by the framework. For a UDF instance, `beforeDestroy` will be called after the last record is processed. In the entire life cycle of the instance, `beforeDestroy` will only be called once. - - - -### 4.3 UDAF (User Defined Aggregation Function) - -A complete definition of UDAF involves two classes, `State` and `UDAF`. 
- -#### State Class - -To write your own `State`, you need to implement the `org.apache.iotdb.udf.api.State` interface. - -#### Interface Description: - -| Interface Definition | Description | Required to Implement | -| -------------------------------- | ------------------------------------------------------------ | --------------------- | -| void reset() | To reset the `State` object to its initial state, you need to fill in the initial values of the fields in the `State` class within this method as if you were writing a constructor. | Required | -| byte[] serialize() | Serializes `State` to binary data. This method is used for IoTDB internal `State` passing. Note that the order of serialization must be consistent with the following deserialization methods. | Required | -| void deserialize(byte[] bytes) | Deserializes binary data to `State`. This method is used for IoTDB internal `State` passing. Note that the order of deserialization must be consistent with the serialization method above. | Required | - -#### Detailed interface introduction: - -1. **void reset()** - -This method resets the `State` to its initial state, you need to fill in the initial values of the fields in the `State` object in this method. For optimization reasons, IoTDB reuses `State` as much as possible internally, rather than creating a new `State` for each group, which would introduce unnecessary overhead. When `State` has finished updating the data in a group, this method is called to reset to the initial state as a way to process the next group. - -In the case of `State` for averaging (aka `avg`), for example, you would need the sum of the data, `sum`, and the number of entries in the data, `count`, and initialize both to 0 in the `reset()` method. - -```java -class AvgState implements State { - double sum; - - long count; - - @Override - public void reset() { - sum = 0; - count = 0; - } - - // other methods -} -``` - -2. 
**byte[] serialize()/void deserialize(byte[] bytes)** - -These methods serialize the `State` into binary data, and deserialize the `State` from the binary data. IoTDB, as a distributed database, involves passing data among different nodes, so you need to write these two methods to enable the passing of the `State` among different nodes. Note that the order of serialization and deserialization must be consistent. - -In the case of `State` for averaging (aka `avg`), for example, you can convert the content of `State` to a `byte[]` array and read the content of `State` back from the `byte[]` array in any way you want. The following shows the code for serialization/deserialization using `java.nio.ByteBuffer`: - -```java -@Override -public byte[] serialize() { - ByteBuffer buffer = ByteBuffer.allocate(Double.BYTES + Long.BYTES); - buffer.putDouble(sum); - buffer.putLong(count); - - return buffer.array(); -} - -@Override -public void deserialize(byte[] bytes) { - ByteBuffer buffer = ByteBuffer.wrap(bytes); - sum = buffer.getDouble(); - count = buffer.getLong(); -} -``` - - - -#### UDAF Classes - -To write a UDAF, you need to implement the `org.apache.iotdb.udf.api.UDAF` interface. - -#### Interface Description: - -| Interface definition | Description | Required to Implement | -| ------------------------------------------------------------ | ------------------------------------------------------------ | --------------------- | -| void validate(UDFParameterValidator validator) throws Exception | This method is mainly used to validate `UDFParameters` and it is executed before `beforeStart(UDFParameters, UDAFConfigurations)` is called. | Optional | -| void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception | Initialization method that invokes user-defined initialization behavior before UDAF processes the input data. Unlike UDTF, configuration is of type `UDAFConfigurations`. 
| Required | -| State createState() | To create a `State` object, usually just call the default constructor and modify the default initial value as needed. | Required | -| void addInput(State state, Column[] columns, BitMap bitMap) | Update `State` object according to the incoming data `Column[]` in batch, note that last column `columns[columns.length - 1]` always represents the time column. In addition, `BitMap` represents the data that has been filtered out before, you need to manually determine whether the corresponding data has been filtered out when writing this method. | Required | -| void combineState(State state, State rhs) | Merge `rhs` state into `state` state. In a distributed scenario, the same set of data may be distributed on different nodes, IoTDB generates a `State` object for the partial data on each node, and then calls this method to merge it into the complete `State`. | Required | -| void outputFinal(State state, ResultValue resultValue) | Computes the final aggregated result based on the data in `State`. Note that according to the semantics of the aggregation, only one value can be output per group. | Required | -| void beforeDestroy() | This method is called by the framework after the last input data is processed, and will only be called once in the life cycle of each UDF instance. | Optional | - -In the life cycle of a UDAF instance, the calling sequence of each method is as follows: - -1. State createState() -2. void validate(UDFParameterValidator validator) throws Exception -3. void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception -4. void addInput(State state, Column[] columns, BitMap bitMap) -5. void combineState(State state, State rhs) -6. void outputFinal(State state, ResultValue resultValue) -7. void beforeDestroy() - -Similar to UDTF, every time the framework executes a UDAF query, a new UDF instance will be constructed. When the query ends, the corresponding instance will be destroyed. 
Therefore, the internal data of the instances in different UDAF queries (even in the same SQL statement) are isolated. You can maintain some state data in the UDAF without considering the influence of concurrency and other factors. - -#### Detailed interface introduction: - - -1. **void validate(UDFParameterValidator validator) throws Exception** - -Same as UDTF, the `validate` method is used to validate the parameters entered by the user. - -In this method, you can limit the number and types of input time series, check the attributes of user input, or perform any custom verification. - -2. **void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception** - - The `beforeStart` method does the same thing as in the UDTF: - -1. Use UDFParameters to get the time series paths and parse key-value pair attributes entered by the user. -2. Set the output data type in UDAFConfigurations. -3. Create resources, such as establishing external connections, opening files, etc. - -The role of the `UDFParameters` type can be seen above. - -2.2 **UDAFConfigurations** - -The difference from UDTF is that UDAF uses `UDAFConfigurations` as the type of `configuration` object. - -Currently, this class only supports setting the type of output data. - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // parameters - // ... 
- - // configurations - configurations - .setOutputDataType(Type.INT32); -} -``` - -The relationship between the output type set in `setOutputDataType` and the type of data output that `ResultValue` can actually receive is as follows: - -| The output type set in `setOutputDataType` | The output type that `ResultValue` can actually receive | -| ------------------------------------------ | ------------------------------------------------------- | -| INT32 | int | -| INT64 | long | -| FLOAT | float | -| DOUBLE | double | -| BOOLEAN | boolean | -| TEXT | org.apache.iotdb.udf.api.type.Binary | - -The output type of the UDAF is determined at runtime. You can dynamically determine the output sequence type based on the input type. - -Here is a simple example: - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // do something - // ... - - configurations - .setOutputDataType(parameters.getDataType(0)); -} -``` - -3. **State createState()** - - -This method creates and initializes a `State` object for UDAF. Due to the limitations of the Java language, you can only call the default constructor for the `State` class. The default constructor assigns a default initial value to all the fields in the class, and if that initial value does not meet your requirements, you need to initialize them manually within this method. - -The following is an example that includes manual initialization. Suppose you want to implement an aggregate function that multiplies all numbers in the group. The initial `State` value should then be 1, but the default constructor initializes it to 0, so you need to initialize `State` manually after calling the default constructor: - -```java -public State createState() { - MultiplyState state = new MultiplyState(); - state.result = 1; - return state; -} -``` - -4. **void addInput(State state, Column[] columns, BitMap bitMap)** - -This method updates the `State` object with the raw input data. 
For performance reasons, and to align with the IoTDB vectorized query engine, the raw input data is no longer a single data point but an array of columns, `Column[]`. Note that the last column (i.e. `columns[columns.length - 1]`) is always the time column, so you can also do different operations in UDAF depending on the time. - -Since the input is not a single data point but multiple columns, you need to manually skip the data that has been filtered out, which is why the third parameter, `BitMap`, exists. It identifies which rows in these columns have been filtered out, so that you can exclude them from the aggregation. - -Here's an example of `addInput()` that counts the number of items (aka count). It shows how you can use `BitMap` to ignore data that has been filtered out. Note that due to the limitations of the Java language, you need to explicitly cast the `State` object from the type defined in the interface to your custom `State` type at the beginning of the method, otherwise you won't be able to use the `State` object. - -```java -public void addInput(State state, Column[] columns, BitMap bitMap) { - CountState countState = (CountState) state; - - int count = columns[0].getPositionCount(); - for (int i = 0; i < count; i++) { - if (bitMap != null && !bitMap.isMarked(i)) { - continue; - } - if (!columns[0].isNull(i)) { - countState.count++; - } - } -} -``` - -5. **void combineState(State state, State rhs)** - - -This method combines two `State`s, or more precisely, updates the first `State` object with the second `State` object. IoTDB is a distributed database, and the data of the same group may be distributed on different nodes. For performance reasons, IoTDB will first aggregate some of the data on each node into `State`, and then merge the `State`s on different nodes that belong to the same group, which is what `combineState` does. - -Here's an example of `combineState()` for averaging (aka avg). 
Similar to `addInput`, you need to do an explicit type conversion for the two `State`s at the beginning. Also note that you are updating the value of the first `State` with the contents of the second `State`. - -```java -public void combineState(State state, State rhs) { - AvgState avgState = (AvgState) state; - AvgState avgRhs = (AvgState) rhs; - - avgState.count += avgRhs.count; - avgState.sum += avgRhs.sum; -} -``` - -6. **void outputFinal(State state, ResultValue resultValue)** - -This method calculates the final result from `State`. You need to access the various fields in `State`, derive the final result, and set the final result into the `ResultValue` object. IoTDB internally calls this method once at the end for each group. Note that according to the semantics of aggregation, the final result can only be one value. - -Here is another `outputFinal` example for averaging (aka avg). In addition to the forced type conversion at the beginning, you will also see a specific use of the `ResultValue` object, where the final result is set by `setXXX` (where `XXX` is the type name). - -```java -public void outputFinal(State state, ResultValue resultValue) { - AvgState avgState = (AvgState) state; - - if (avgState.count != 0) { - resultValue.setDouble(avgState.sum / avgState.count); - } else { - resultValue.setNull(); - } -} -``` - -7. **void beforeDestroy()** - - -The method for terminating a UDF. - -This method is called by the framework. For a UDF instance, `beforeDestroy` will be called after the last record is processed. In the entire life cycle of the instance, `beforeDestroy` will only be called once. - - -### 4.4 Maven Project Example - -If you use Maven, you can build your own UDF project referring to our **udf-example** module. You can find the project [here](https://github.com/apache/iotdb/tree/master/example/udf). - - -## 5. 
Contribute universal built-in UDF functions to IoTDB - -This part mainly introduces how external users can contribute their own UDFs to the IoTDB community. - -### 5.1 Prerequisites - -1. UDFs must be universal. - - "Universal" here means that the UDF can be widely used in some scenarios. In other words, the UDF must have reuse value and may be directly used by other users in the community. - - If you are not sure whether the UDF you want to contribute is universal, you can send an email to `dev@iotdb.apache.org` or create an issue to initiate a discussion. - -2. The UDF you are going to contribute has been well tested and can run normally in the production environment. - - -### 5.2 What you need to prepare - -1. UDF source code -2. Test cases -3. Instructions - -### 5.3 Contribution Content - -#### 5.3.1 UDF Source Code - -1. Create the UDF main class and related classes in `iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin` or in its subfolders. -2. Register your UDF in `iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin/BuiltinTimeSeriesGeneratingFunction.java`. - -#### 5.3.2 Test Cases - -At a minimum, you need to write integration tests for the UDF. - -You can add a test class in `integration-test/src/test/java/org/apache/iotdb/db/it/udf`. - - -#### 5.3.3 Instructions - -The instructions need to include: the name and the function of the UDF, the attribute parameters that must be provided when the UDF is executed, the applicable scenarios, and the usage examples, etc. - -The instructions should be provided in both Chinese and English. They should be added separately in `docs/zh/UserGuide/Operation Manual/DML Data Manipulation Language.md` and `docs/UserGuide/Operation Manual/DML Data Manipulation Language.md`. 
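- -#### 5.3.4 Submit a PR - -When you have prepared the UDF source code, test cases, and instructions, you are ready to submit a Pull Request (PR) on [GitHub](https://github.com/apache/iotdb). You can refer to our code contribution guide to submit a PR: [Development Guide](https://iotdb.apache.org/Community/Development-Guide.html). - - -After the PR is reviewed, approved, and merged, your UDF will have been contributed to the IoTDB community! - -## 6. Common Problems - -Q1: How to modify the registered UDF? - -A1: Assume that the name of the UDF is `example` and the full class name is `org.apache.iotdb.udf.ExampleUDTF`, which is introduced by `example.jar`. - -1. Unload the registered function by executing `DROP FUNCTION example`. -2. Delete `example.jar` under `iotdb-server-2.0.x-all-bin/ext/udf`. -3. Modify the logic in `org.apache.iotdb.udf.ExampleUDTF` and repackage it. The name of the JAR package can still be `example.jar`. -4. Upload the new JAR package to `iotdb-server-2.0.x-all-bin/ext/udf`. -5. Load the new UDF by executing `CREATE FUNCTION example AS "org.apache.iotdb.udf.ExampleUDTF"`. - diff --git a/src/zh/UserGuide/Master/Table/AI-capability/AINode_Upgrade_timecho.md b/src/zh/UserGuide/Master/Table/AI-capability/AINode_Upgrade_timecho.md deleted file mode 100644 index 72b244f2b..000000000 --- a/src/zh/UserGuide/Master/Table/AI-capability/AINode_Upgrade_timecho.md +++ /dev/null @@ -1,941 +0,0 @@ - - -# AINode - -AINode is a native IoTDB node that supports the registration, management, and invocation of time series related models. It includes industry-leading self-developed time series large models, such as the Timer series models developed by Tsinghua University. Models can be invoked using standard SQL statements, enabling millisecond-level real-time inference on time series data, and supporting application scenarios such as time series trend prediction, missing value filling, and anomaly value detection. - -The system architecture is shown in the following figure: - -![](/img/AINode-0.png) - -The responsibilities of the three nodes are as follows: - -* **ConfigNode**: Responsible for distributed node management and load balancing. -* **DataNode**: Responsible for receiving and parsing user SQL requests; responsible for storing time series data; responsible for data preprocessing calculations. -* **AINode**: Responsible for managing and using time series models. - -## 1. Advantages and Features - -Compared to building a machine learning service separately, it has the following advantages: - -* **Simple and Easy to Use**: No Python or Java programming is needed; the entire process of machine learning model management and inference can be completed with SQL statements. For example, a model can be created with the CREATE MODEL statement, and inference with a model can be run with the ` SELECT * FROM FORECAST (...) 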
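` statement, which makes it simpler and more convenient. -* **Avoid Data Migration**: With IoTDB-native machine learning, data stored in IoTDB can be applied directly to machine learning model inference without being moved to a separate machine learning service platform, which accelerates data processing, improves security, and reduces costs. - -![](/img/h1.png) - -* **Built-in Advanced Algorithms**: Supports industry-leading machine learning analysis algorithms, covering typical time series analysis tasks and giving the time series database native data analysis capabilities. For example: - * **Time Series Forecasting**: Learns change patterns from past time series data, and outputs the most likely predictions for the future sequence based on the given past observations. - * **Anomaly Detection for Time Series**: Detects and identifies abnormal values in the given time series data to help discover abnormal behavior in the time series. - -## 2. Basic Concepts - -* **Model**: A machine learning model that takes time series data as input and outputs the results or decisions of the analysis task. The model is the basic management unit of AINode, which supports the creation (registration), deletion, query, modification (fine-tuning), and use (inference) of models. -* **Create**: Load an externally designed or trained model file or algorithm into AINode, to be managed and used uniformly by IoTDB. -* **Inference**: Use a created model to complete the time series analysis task it is suited for on the specified time series data. -* **Built-in**: AINode ships with machine learning algorithms or self-developed models for common time series analysis scenarios (e.g., forecasting and anomaly detection). - -![](/img/AINode-new.png) - -## 3. Installation and Deployment - -For AINode deployment, refer to the documentation [AINode Deployment](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md). - -## 4. Usage Guide - -TimechoDB-AINode supports three major functions: model inference, model fine-tuning, and model management (registration, viewing, deletion, loading, unloading, etc.). The following sections describe them in detail. - -### 4.1 Model Inference - -In the table model, TimechoDB-AINode supports two inference capabilities: time series forecasting and time series data classification. - -#### 4.1.1 Time Series Forecasting - -The time series forecasting capabilities provided by AINode in the table model include: - -* **Univariate forecasting**: Supports forecasting a single target variable. -* **Covariate forecasting**: Supports joint forecasting of multiple target variables at the same time, and supports introducing covariates into the forecast to improve its accuracy. - -The following describes the syntax definition, parameters, and usage examples of the forecasting inference function in detail. - -1. **SQL Syntax** - -```SQL -SELECT * FROM FORECAST( - MODEL_ID, - TARGETS, -- SQL for obtaining the target variables - [HISTORY_COVS, -- String, SQL for obtaining the historical covariates - FUTURE_COVS, -- String, SQL for obtaining the future covariates - OUTPUT_START_TIME, - OUTPUT_LENGTH, - OUTPUT_INTERVAL, - TIMECOL, - PRESERVE_INPUT, - AUTO_ADAPT, -- Boolean, whether to enable auto-adaptation - MODEL_OPTIONS]? 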
-) -``` - -* 内置模型推理无需注册流程,通过 forecast 函数,指定 model\_id 就可以使用模型的推理功能。 -* 参数介绍 - -| 参数名 | 参数类型 | 参数属性 | 描述 | 是否必填 | 备注 | -|---------------------|-------|----------------------------------------------------|-----------------------------------------------------------------------------------------| ---------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| model\_id | 标量参数 | 字符串类型 | 预测所用模型的唯一标识 | 是| | -| targets | 表参数 | SET SEMANTIC | 待预测目标变量的输入数据。IoTDB会自动将数据按时间升序排序再交给AINode 。 | 是 | 使用 SQL 描述带预测目标变量的输入数据,输入的 SQL 不合法时会有对应的查询报错。 | -| history\_covs | 标量参数 | 字符串类型(合法的表模型查询 SQL)默认:无 | 指定此次预测任务的协变量的历史数据,这些数据用于辅助目标变量的预测,AINode 不会对历史协变量输出预测结果。在将数据给予模型前,AINode 会自动将数据按时间升序排序。 | 否 | 1. 查询结果只能包含 FIELD 列;
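2. Other: individual models may impose their own requirements; a corresponding error is thrown when they are not met. |
-| future\_covs | Scalar parameter | String (a valid table-model query SQL). Default: none | Future data for some of the covariates of this forecast task, used to assist forecasting of the target variables. AINode automatically sorts the data in ascending time order before passing it to the model. | No | 1. Can be specified only when history\_covs is set;
2. The covariate names involved must be a subset of those in history\_covs;
3. The query result may contain only FIELD columns;
4. Other: individual models may impose their own requirements; a corresponding error is thrown when they are not met. |
-| auto\_adapt | Scalar parameter | Boolean. Default: true | Whether to enable auto-adaptation for covariate inference. (Supported since V2.0.8.2) | No | When auto-adaptation is enabled:
1. If the future covariate set future\_covs is not a subset of the historical covariate set history\_covs, future covariates that do not belong to the historical covariates are discarded automatically.
2. If the length of a historical covariate differs from the length of the input target variable: a. if shorter, it is padded with 0 at the head; b. if longer, its earliest data is discarded automatically.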
3. 若某个未来协变量的长度不等于预测长度output\_length: a. 小于时,在其尾部补 0;b. 大于时,自动丢弃其最新的数据。 | -| output\_start\_time | 标量参数 | 时间戳类型。 默认值:目标变量最后一个时间戳 + output\_interval | 输出的预测点的起始时间戳 【即起报时间】 | 否 | 必须大于目标变量时间戳的最大值 | -| output\_length | 标量参数 | INT32 类型。 默认值:96 | 输出窗口大小 | 否 | 必须大于 0 | -| output\_interval | 标量参数 | 时间间隔类型。 默认值:(输入数据的最后一个时间戳 - 输入数据的第一个时间戳) / n - 1 | 输出的预测点之间的时间间隔 支持的单位是 ns、us、ms、s、m、h、d、w | 否 | 必须大于 0 | -| timecol | 标量参数 | 字符串类型。 默认值:time | 时间列名 | 否 | 必须为存在于 targets 中的且数据类型为 TIMESTAMP 的列 | -| preserve\_input | 标量参数 | 布尔类型。 默认值:false | 是否在输出结果集中保留目标变量输入的所有原始行 | 否 | | -| model\_options | 标量参数 | 字符串类型。 默认值:空字符串 | 模型相关的 key-value 对,比如是否需要对输入进行归一化等。不同的 key-value 对以 ';' 间隔 | 否 | | - -说明: - -* **默认行为**:预测 targets 的所有列。当前仅支持 INT32、INT64、FLOAT、DOUBLE 类型。 -* **输入数据要求**: - * 必须包含时间列。 - * 行数要求:不足最低行数会报错,超过最大行数则自动截取末尾数据。 - * 列数要求:单变量模型仅支持单列,多列将报错;协变量模型通常无限制,除非模型自身有明确约束。 - * 协变量预测时,SQL 语句中需明确指定 DATABASE。 -* **输出结果**: - * 包含所有目标变量列,数据类型与原表一致。 - * 若指定 `preserve_input=true`,会额外增加 `is_input` 列来标识原始数据行。 -* **时间戳生成**: - * 使用 `OUTPUT_START_TIME`(可选)作为预测起始时间点,并以此划分历史与未来数据。 - * 使用 `OUTPUT_INTERVAL`(可选,默认为输入数据的采样间隔)作为输出时间间隔。第 N 行的时间戳计算公式为:`OUTPUT_START_TIME + (N - 1) * OUTPUT_INTERVAL`。 - -2. **使用示例** - -**示例一:单变量预测** - -提前创建数据库 etth 及表 eg - -```SQL -create database etth; -create table eg (hufl FLOAT FIELD, hull FLOAT FIELD, mufl FLOAT FIELD, mull FLOAT FIELD, lufl FLOAT FIELD, lull FLOAT FIELD, ot FLOAT FIELD) -``` - -准备原始数据 [ETTh1-tab](/img/ETTh1-tab.csv),可通过 [import-data](../Tools-System/Data-Import-Tool_timecho.md#_2-2-csv-格式) 脚本导入原始数据,例如 - -```bash -./tools/import-data.sh -ft csv -sql_dialect table -db etth -table eg -s ~/Desktop/model-compare-html/ETTh1-tab.csv -``` - -使用表 eg 中测点 ot 已知的 1440 行数据,预测其未来的 96 行数据. 
- -```SQL -IoTDB:etth> select Time, HUFL,HULL,MUFL,MULL,LUFL,LULL,OT from eg LIMIT 1440 -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -| Time| HUFL| HULL| MUFL| MULL| LUFL| LULL| OT| -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -|2016-07-01T00:00:00.000+08:00| 5.827|2.009|1.599|0.462|4.203| 1.34|30.531| -|2016-07-01T01:00:00.000+08:00| 5.693|2.076|1.492|0.426|4.142|1.371|27.787| -|2016-07-01T02:00:00.000+08:00| 5.157|1.741|1.279|0.355|3.777|1.218|27.787| -|2016-07-01T03:00:00.000+08:00| 5.09|1.942|1.279|0.391|3.807|1.279|25.044| -...... -Total line number = 1440 -It costs 0.119s - -IoTDB:etth> select * from forecast( - model_id => 'sundial', - targets => (select Time, ot from etth.eg where time >= 2016-08-07T18:00:00.000+08:00 limit 1440) order BY time, - output_length => 96 -) -+-----------------------------+---------+ -| time| ot| -+-----------------------------+---------+ -|2016-10-06T18:00:00.000+08:00|20.733124| -|2016-10-06T19:00:00.000+08:00|20.258146| -|2016-10-06T20:00:00.000+08:00|20.022043| -|2016-10-06T21:00:00.000+08:00|19.789446| -...... 
-Total line number = 96 -It costs 1.615s -``` - -**示例二:协变量预测** - -提前创建表 tab\_real(存储原始真实数据) - -```SQL -create table tab_real (target1 DOUBLE FIELD, target2 DOUBLE FIELD, cov1 DOUBLE FIELD, cov2 DOUBLE FIELD, cov3 DOUBLE FIELD); -``` - -准备原始数据 - -```SQL ---写入语句 -IoTDB:etth> INSERT INTO tab_real (time, target1, target2, cov1, cov2, cov3) VALUES -(1, 1.0, 1.0, 1.0, 1.0, 1.0), -(2, 2.0, 2.0, 2.0, 2.0, 2.0), -(3, 3.0, 3.0, 3.0, 3.0, 3.0), -(4, 4.0, 4.0, 4.0, 4.0, 4.0), -(5, 5.0, 5.0, 5.0, 5.0, 5.0), -(6, 6.0, 6.0, 6.0, 6.0, 6.0), -(7, NULL, NULL, NULL, NULL, 7.0), -(8, NULL, NULL, NULL, NULL, 8.0); - -IoTDB:etth> SELECT * FROM tab_real -+-----------------------------+-------+-------+----+----+----+ -| time|target1|target2|cov1|cov2|cov3| -+-----------------------------+-------+-------+----+----+----+ -|1970-01-01T08:00:00.001+08:00| 1.0| 1.0| 1.0| 1.0| 1.0| -|1970-01-01T08:00:00.002+08:00| 2.0| 2.0| 2.0| 2.0| 2.0| -|1970-01-01T08:00:00.003+08:00| 3.0| 3.0| 3.0| 3.0| 3.0| -|1970-01-01T08:00:00.004+08:00| 4.0| 4.0| 4.0| 4.0| 4.0| -|1970-01-01T08:00:00.005+08:00| 5.0| 5.0| 5.0| 5.0| 5.0| -|1970-01-01T08:00:00.006+08:00| 6.0| 6.0| 6.0| 6.0| 6.0| -|1970-01-01T08:00:00.007+08:00| null| null|null|null| 7.0| -|1970-01-01T08:00:00.008+08:00| null| null|null|null| 8.0| -+-----------------------------+-------+-------+----+----+----+ -``` - -* 预测任务一:使用历史协变量 cov1,cov2 和 cov3 辅助预测目标变量 target1 和 target2。 - - ![](/img/ainode-upgrade-table-forecast-timecho-1.png) - - * 使用表 tab\_real 中 cov1,cov2,cov3,target1,target2 的 前 6 行历史数据,预测目标变量 target1 和 target2 未来的 2 行数据 - ```SQL - IoTDB:etth> SELECT * FROM FORECAST ( - MODEL_ID => 'chronos2', - TARGETS => ( - SELECT TIME, target1, target2 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6) ORDER BY TIME, - HISTORY_COVS => ' - SELECT TIME, cov1, cov2, cov3 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6', - OUTPUT_LENGTH => 2 - ) - +-----------------------------+-----------------+-----------------+ - | time| 
target1| target2| - +-----------------------------+-----------------+-----------------+ - |1970-01-01T08:00:00.007+08:00|7.338330268859863|7.338330268859863| - |1970-01-01T08:00:00.008+08:00| 8.02529525756836| 8.02529525756836| - +-----------------------------+-----------------+-----------------+ - Total line number = 2 - It costs 0.315s - ``` -* 预测任务二:使用相同表中的历史协变量 cov1,cov2 和已知协变量 cov3 辅助预测目标变量 target1 和 target2。 - - ![](/img/ainode-upgrade-table-forecast-timecho-2.png) - - * 使用表 tab\_real 中 cov1,cov2,cov3,target1,target2 的 前 6 行历史数据,以及同表中已知协变量 cov3 在未来的 2 行数据来预测目标变量 target1 和 target2 未来的 2 行数据 - ```SQL - IoTDB:etth> SELECT * FROM FORECAST ( - MODEL_ID => 'chronos2', - TARGETS => ( - SELECT TIME, target1, target2 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6) ORDER BY TIME, - HISTORY_COVS => ' - SELECT TIME, cov1, cov2, cov3 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6', - FUTURE_COVS => ' - SELECT TIME, cov3 - FROM etth.tab_real - WHERE TIME >= 7 - LIMIT 2', - OUTPUT_LENGTH => 2 - ) - +-----------------------------+-----------------+-----------------+ - | time| target1| target2| - +-----------------------------+-----------------+-----------------+ - |1970-01-01T08:00:00.007+08:00|7.244050025939941|7.244050025939941| - |1970-01-01T08:00:00.008+08:00|7.907227516174316|7.907227516174316| - +-----------------------------+-----------------+-----------------+ - Total line number = 2 - It costs 0.291s - ``` -* 预测任务三:使用不同表中的历史协变量 cov1,cov2 和已知协变量 cov3 辅助预测目标变量 target1 和 target2。 - - ![](/img/ainode-upgrade-table-forecast-timecho-3.png) - - * 提前创建表 tab\_cov\_forecast(存储已知协变量 cov3 的预测值 ),并准备相关数据。 - ```SQL - create table tab_cov_forecast (cov3 DOUBLE FIELD); - - --写入语句 - INSERT INTO tab_cov_forecast (time, cov3) VALUES (7, 7.0),(8, 8.0); - - IoTDB:etth> SELECT * FROM tab_cov_forecast - +----+----+ - |time|cov3| - +----+----+ - | 7| 7.0| - | 8| 8.0| - +----+----+ - ``` - * 使用表 tab\_real 中 cov1,cov2,cov3,target1,target2 已知的前 6 
行数据,以及表 tab\_cov\_forecast 中已知协变量 cov3 在未来的 2 行数据来预测目标变量 target1 和 target2 未来的 2 行数据 - ```SQL - IoTDB:etth> SELECT * FROM FORECAST ( - MODEL_ID => 'chronos2', - TARGETS => ( - SELECT TIME, target1, target2 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6) ORDER BY TIME, - HISTORY_COVS => ' - SELECT TIME, cov1, cov2, cov3 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6', - FUTURE_COVS => ' - SELECT TIME, cov3 - FROM etth.tab_cov_forecast - WHERE TIME >= 7 - LIMIT 2', - OUTPUT_LENGTH => 2 - ) - +-----------------------------+-----------------+-----------------+ - | time| target1| target2| - +-----------------------------+-----------------+-----------------+ - |1970-01-01T08:00:00.007+08:00|7.244050025939941|7.244050025939941| - |1970-01-01T08:00:00.008+08:00|7.907227516174316|7.907227516174316| - +-----------------------------+-----------------+-----------------+ - Total line number = 2 - It costs 0.351s - ``` - - -#### 4.1.2 时序分类 - -时序分类是时序预测之外的重要能力,在工业界具有广泛应用。其典型范式是输入多个测点的近期采样值,综合判断设备整体运行状态,输出当前状态的分类标签。例如:可用于新能源电池组设备的运行状态分类等场景。 - -AINode 表模型支持通过调用协变量分类模型执行时序数据的分类任务。 - -> 注意:该功能从 V2.0.9.1 版本开始提供。 - -1. **SQL 语法** - -```SQL -SELECT * FROM CLASSIFY( - MODEL_ID, - INPUTS -- 获取输入变量的 SQL - [TIMECOL, - MODEL_OPTIONS]? 
-) -``` - -* 参数介绍 - -| 参数名 | 参数类型 | 参数属性 | 描述 | 是否必填 | 备注 | -|-----------------|-------|-------------------| ----------------------------------------------------------------------------------------- | ---------- | ------------------------------------------------------------------------ | -| model\_id | 标量参数 | 字符串类型 | 分类所用模型的唯一标识| 是| | -| inputs | 表参数 | SET SEMANTIC | 输入的待分类数据。IoTDB 会自动将数据按时间升序排序再交给 AINode 。 | 是 | 使用 SQL 描述输入的待分类数据,输入的 SQL 不合法时会有对应的查询报错。| -| timecol | 标量参数 | 字符串类型,默认值:time | 时间列名| 否 | 存在于 inputs 中的,数据类型为 TIMESTAMP 的列,否则报错。 | -| model\_options | 标量参数 | 字符串类型默认值:空字符串。 | 模型相关的 key-value 对,比如是否需要对输入进行归一化等。不同的 key-value 对以 ';' 间隔 | 否| 指定某个模型不支持参数,并不会报错,只会被忽略 | - -说明: - -* ​**输入数据要求**​: - * 类型约束:目前仅支持 INT32、INT64、FLOAT、DOUBLE 类型。 - * 行数要求:不同模型要求不同。对于有行数限制的模型,低于最小行数或高于最多行数时将报错。 - * 列数要求:必须包含时间列。单变量分类模型仅支持单列,多列将报错;多变量分类模型通常无限制,除非模型自身有明确约束。 - * 顺序要求:多变量零样本分类模型通常无限制,除非模型自身有明确约束。 -* ​**输出结果**​: - * 返回结果是由时序数据的分类结果组成的表,其规格取决于模型的具体实现。 - -2. **使用示例** - -假设某项目的时序数据的变量数为10,输入长度为192。以自定义的 mantis\_custom 模型为例进行时序数据分类推理。 - -![](/img/ainode-classify-table-timecho.png) - -* 注册模型 - -```SQL -CREATE MODEL mantis_custom USING URI 'file:///path/to/mantis' -``` - -注册自定义模型的详细步骤说明可参考 [4.3 小节](#_4-3-注册自定义模型)。 - -* 运行SQL - -```SQL -IoTDB:etth> SELECT * FROM CLASSIFY ( - MODEL_ID => 'mantis_custom', - INPUTS => ( - SELECT Time, HUFL,HULL,MUFL,MULL,LUFL,LULL,OT,UT,MT,LT - FROM eg - WHERE TIME < 2016-07-09 00:00:00 - ORDER BY TIME DESC - LIMIT 192) ORDER BY TIME -) -``` - -* 执行结果 - -```SQL -+--------+ -|category| -+--------+ -| 4| -+--------+ -``` - - -### 4.2 模型微调 - -AINode 支持通过 SQL 进行模型微调任务。 - -**SQL 语法** - -```SQL -createModelStatement - | CREATE MODEL modelId=identifier (WITH HYPERPARAMETERS '(' hparamPair (',' hparamPair)* ')')? 
FROM MODEL existingModelId=identifier ON DATASET '(' targetData=string ')' - ; -hparamPair - : hparamKey=identifier '=' hyparamValue=primaryExpression - ; -``` - -**参数说明** - -| 名称 | 描述 | -| ----------------- |----------------------------------------------------------------------------------------------------------------------------------------| -| modelId | 微调出的模型的唯一标识 | -| hparamPair | 微调使用的超参数 key-value 对,目前支持如下:
`train_epochs`: int, the number of fine-tuning epochs
`iter_per_epoch`: int, the number of iterations per fine-tuning epoch
`learning_rate`: double 类型,学习率 | -| existingModelId | 微调使用的基座模型 | -| targetData | 用于获取微调使用的数据集的 SQL | - -**示例** - -1. 选择测点 ot 中指定时间范围的数据作为微调数据集,基于 sundial 创建模型 sundialv3。 - -```SQL -IoTDB> set sql_dialect=table -Msg: The statement is executed successfully. -IoTDB> CREATE MODEL sundialv3 FROM MODEL sundial ON DATASET ('SELECT time, ot from etth.eg where 1467302400000 <= time and time < 1517468400001') -Msg: The statement is executed successfully. -IoTDB> show models -+---------------------+---------+-----------+---------+ -| ModelId|ModelType| Category| State| -+---------------------+---------+-----------+---------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -| sundialv2| sundial| fine_tuned| active| -| sundialv3| sundial| fine_tuned| training| -+---------------------+---------+-----------+---------+ -``` - -2. 
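The fine-tuning task starts asynchronously in the background, and its log can be seen in the AINode process. After fine-tuning completes, query and use the new model.
-
-```SQL
-IoTDB> show models
-+---------------------+---------+-----------+---------+
-|              ModelId|ModelType|   Category|    State|
-+---------------------+---------+-----------+---------+
-|                arima|   sktime|    builtin|   active|
-|          holtwinters|   sktime|    builtin|   active|
-|exponential_smoothing|   sktime|    builtin|   active|
-|     naive_forecaster|   sktime|    builtin|   active|
-|       stl_forecaster|   sktime|    builtin|   active|
-|         gaussian_hmm|   sktime|    builtin|   active|
-|              gmm_hmm|   sktime|    builtin|   active|
-|                stray|   sktime|    builtin|   active|
-|             timer_xl|    timer|    builtin|   active|
-|              sundial|  sundial|    builtin|   active|
-|             chronos2|       t5|    builtin|   active|
-|            sundialv2|  sundial| fine_tuned|   active|
-|            sundialv3|  sundial| fine_tuned|   active|
-+---------------------+---------+-----------+---------+
-```
-
-### 4.3 Registering Custom Models
-
-**Transformers models that meet the following requirements can be registered in AINode:**
-
-1. AINode currently uses transformers v4.56.2; when building a model, **avoid inheriting interfaces from older versions (<4.50)**;
-2. The model must inherit one of AINode's inference task pipelines (currently the forecast pipeline is supported):
-   * iotdb-core/ainode/iotdb/ainode/core/inference/pipeline/basic\_pipeline.py
-
-     **Before V2.0.9.3**
-     ```Python
-     from abc import ABC, abstractmethod
-
-     import torch
-
-
-     class BasicPipeline(ABC):
-         def __init__(self, model_info, **model_kwargs):
-             self.model_info = model_info
-             self.device = model_kwargs.get("device", "cpu")
-             self.model = load_model(model_info, device_map=self.device, **model_kwargs)
-
-         @abstractmethod
-         def preprocess(self, inputs, **infer_kwargs):
-             """
-             Preprocess the input data before the inference task starts, including shape validation and value conversion.
-             """
-             pass
-
-         @abstractmethod
-         def postprocess(self, output, **infer_kwargs):
-             """
-             Postprocess the output after the inference task finishes.
-             """
-             pass
-
-
-     class ForecastPipeline(BasicPipeline):
-         def __init__(self, model_info, **model_kwargs):
-             # Forward the keyword arguments unpacked so that "device" reaches BasicPipeline.
-             super().__init__(model_info, **model_kwargs)
-
-         def preprocess(self, inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], **infer_kwargs):
-             """
-             Preprocess the input data before passing it to the model for inference, validating its shape and type.
-
-             Args:
-                 inputs (list[dict]):
-                     Input data, a list of dicts, each containing:
-                         - 'targets': tensor of shape (input_length,) or (target_count, input_length).
-                         - 'past_covariates': optional, dict of tensors, each of shape (input_length,).
-                         - 'future_covariates': optional, dict of tensors, each of shape (input_length,).
-
-                 infer_kwargs (dict, optional): additional keyword arguments for inference, such as:
-                     - `output_length`(int): used to validate 'future_covariates' if they are provided.
-
-             Raises:
-                 ValueError: if the input format is incorrect (e.g. missing keys, invalid tensor shapes).
-
-             Returns:
-                 The preprocessed and validated input data, ready for model inference.
-             """
-             pass
-
-         def forecast(self, inputs, **infer_kwargs):
-             """
-             Run forecasting on the given inputs.
-
-             Parameters:
-                 inputs: input data used for forecasting. Its type and structure depend on the specific model implementation.
-                 **infer_kwargs: additional inference parameters, for example:
-                     - `output_length`(int): number of time points the model should generate.
-
-             Returns:
-                 The forecast output, whose form depends on the specific model implementation.
-             """
-             pass
-
-         def postprocess(self, outputs: list[torch.Tensor], **infer_kwargs) -> list[torch.Tensor]:
-             """
-             Postprocess the model outputs after inference, validating their shape and ensuring they match the expected dimensions.
-
-             Args:
-                 outputs:
-                     Model outputs, a list of 2D tensors, each of shape `[target_count, output_length]`.
-
-             Raises:
-                 InferenceModelInternalException: if an output tensor has an invalid shape (e.g. wrong number of dimensions).
-                 ValueError: if the output format is incorrect.
-
-             Returns:
-                 list[torch.Tensor]:
-                     The postprocessed outputs, a list of 2D tensors.
-             """
-             pass
-     ```
-
-     **From V2.0.9.3**
-     ```Python
-     from abc import ABC, abstractmethod
-
-     import torch
-
-
-     class BasicPipeline(ABC):
-         def __init__(self, model_info, **model_kwargs):
-             self.model_info = model_info
-             self.device = model_kwargs.get("device", "cpu")
-             self.model = load_model(model_info, device_map=self.device, **model_kwargs)
-
-         @abstractmethod
-         def preprocess(self, inputs, **infer_kwargs):
-             """
-             Preprocess the input data before the inference task starts, including shape validation and value conversion.
-             """
-             pass
-
-         @abstractmethod
-         def postprocess(self, output, **infer_kwargs):
-             """
-             Postprocess the output after the inference task finishes.
-             """
-             pass
-
-
-     class ForecastPipeline(BasicPipeline):
-         def __init__(self, model_info, **model_kwargs):
-             # Forward the keyword arguments unpacked so that "device" reaches BasicPipeline.
-             super().__init__(model_info, **model_kwargs)
-
-         def _preprocess(
-             self,
-             inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]],
-             **infer_kwargs,
-         ):
-             """
-             Preprocess the input data before passing it to the model for inference, validating its shape and type.
-
-             Args:
-                 inputs (list[dict[str, dict[str, torch.Tensor] | torch.Tensor]]):
-                     Input data, a list of dicts, each containing:
-                         - 'targets': tensor of shape (input_length,) or (target_count, input_length).
-                         - 'past_covariates': optional, dict of tensors, each of shape (input_length,).
-                         - 'future_covariates': optional, dict of tensors, each of shape (input_length,).
-
-                 infer_kwargs (dict, optional): additional keyword arguments for inference, such as:
-                     - `output_length`(int): used to validate 'future_covariates' if they are provided.
-
-             Raises:
-                 ValueError: if the input format is incorrect (e.g. missing keys, invalid tensor shapes).
-
-             Returns:
-                 The preprocessed and validated input data, ready for model inference.
-             """
-             pass
-
-         def forecast(self, inputs, **infer_kwargs):
-             """
-             Run forecasting on the given inputs.
-
-             Parameters:
-                 inputs: input data used for forecasting. Its type and structure depend on the specific model implementation.
-                 **infer_kwargs: additional inference parameters, for example:
-                     - `output_length`(int): number of time points the model should generate.
-
-             Returns:
-                 The forecast output, whose form depends on the specific model implementation.
-             """
-             pass
-
-         def _postprocess(self, outputs, **infer_kwargs) -> list[torch.Tensor]:
-             """
-             Postprocess the model outputs after inference, validating their shape and ensuring they match the expected dimensions.
-
-             Args:
-                 outputs:
-                     Model outputs, a list of 2D tensors, each of shape `[target_count, output_length]`.
-
-             Raises:
-                 InferenceModelInternalException: if an output tensor has an invalid shape (e.g. wrong number of dimensions).
-                 ValueError: if the output format is incorrect.
-
-             Returns:
-                 list[torch.Tensor]:
-                     The postprocessed outputs, a list of 2D tensors.
-             """
-             pass
-     ```
-
-3. Modify the model configuration file config.json so that it contains the following fields:
-
-   **Before V2.0.9.3**
-   ```JSON
-   {
-     "auto_map": {
-       "AutoConfig": "config.Chronos2CoreConfig", // the model's Config class
-       "AutoModelForCausalLM": "model.Chronos2Model" // the model class
-     },
-     "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // the model's inference pipeline
-     "model_type": "custom_t5" // the model type
-   }
-   ```
-
-   * The model's Config class and model class must be specified through auto\_map;
-   * An inference pipeline class must be implemented and specified;
-   * For built-in (builtin) and custom (user\_defined) models managed by AINode, the model type (model\_type) also serves as a unique, non-repeatable identifier. That is, the model type to be registered must not duplicate any existing model type; models created by fine-tuning inherit the model type of the original model.
-
-   **From V2.0.9.3**
-   > The model_type parameter is optional
-   ```JSON
-   {
-     "auto_map": {
-       "AutoConfig": "config.Chronos2CoreConfig", // the model's Config class
-       "AutoModelForCausalLM": "model.Chronos2Model" // the model class
-     },
-     "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline" // the model's inference pipeline
-   }
-   ```
-   * The model's Config class and model class must be specified through auto\_map;
-   * An inference pipeline class must be implemented and specified;
-
-
-4. Make sure the model directory to be registered contains the following files; the names of the configuration file and the weight file cannot be customized:
-   * Model configuration file: config.json;
-   * Model weight file: model.safetensors;
-   * Model code: other .py files.
-
-**The SQL syntax for registering a custom model is as follows:**
-
-```SQL
-CREATE MODEL <model_id> USING URI <uri>
-```
-
-**Parameter description:**
-
-* **model\_id**: the unique identifier of the custom model; it must not be duplicated and is subject to the following constraints:
-  * Allowed characters: [ 0-9 a-z A-Z \_ ] (letters, digits (not as the first character), underscores (not as the first character))
-  * Length is limited to 2-64 characters
-  * Case sensitive
-* **uri**: the local URI of the directory containing the model code and weights.
-
-**Registration example:**
-
-Upload a custom Transformers model from a local path; AINode copies the folder into the user\_defined directory.
-
-```SQL
-CREATE MODEL chronos2 USING URI 'file:///path/to/chronos2'
-```
-
-After the SQL is executed, registration proceeds asynchronously; the registration status can be checked by showing the models (see the Viewing Models section). Once registration is complete, the model can be used for inference by calling the corresponding function in an ordinary query.
-
-### 4.4 Viewing Models
-
-Details of successfully registered models can be queried with the show command.
-
-```SQL
-SHOW MODELS
-```
-
-Besides showing the information of all models, a `model_id` can be specified to view the information of one particular model.
-
-```SQL
-SHOW MODELS <model_id> -- show only the specified model
-```
-
-The result contains the following columns:
-
-| **ModelId** | **ModelType** | **Category** | **State** |
-| ----------- | ------------- | ------------ | --------- |
-| Model ID    | Model type    | Model category | Model state |
-
-The state machine of the State column is shown in the following figure:
-
-![](/img/ainode-upgrade-state-timecho.png)
-
-State machine description:
-
-1. After AINode starts, running `show models` shows only the **built-in (BUILTIN)** models.
-2. Users can import their own models; the source of such models is marked **user-defined (USER\_DEFINED)**. AINode tries to parse the model type (ModelType) from the model configuration file; if parsing fails, the field is shown empty.
-3. Weight files of the time series large models (built-in models) are not packaged with AINode; they are downloaded automatically when AINode starts.
-   1. The state is ACTIVATING during download, becomes ACTIVE on success, and INACTIVE on failure.
-4. After a user starts a fine-tuning task, the model being trained is in state TRAINING; it becomes ACTIVE when training succeeds and FAILED when it fails.
-5. 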
若微调任务成功,微调结束后会统计所有 ckpt (训练文件)中指标最佳的文件并自动重命名,变成用户指定的 model\_id。 - -**查看示例** - -```SQL -IoTDB> show models -+---------------------+--------------+--------------+-------------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------+--------------+-------------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| custom| | user_defined| active| -| timer_xl| timer| builtin| activating| -| sundial| sundial| builtin| active| -| sundialx_1| sundial| fine_tuned| active| -| sundialx_4| sundial| fine_tuned| training| -| sundialx_5| sundial| fine_tuned| failed| -| chronos2| t5| builtin| inactive| -+---------------------+--------------+--------------+-------------+ -``` - -内置传统时序模型介绍如下: - -| 模型名称 | 核心概念 | 适用场景 | 主要特点 | -|----------------------------------| ----------------------------------------------------------------------------------------- | ---------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | -| **ARIMA**(自回归整合移动平均模型) | 结合自回归(AR)、差分(I)和移动平均(MA),用于预测平稳时间序列或可通过差分变为平稳的数据。 | 单变量时间序列预测,如股票价格、销量、经济指标等。| 1. 适用于线性趋势和季节性较弱的数据。2. 需要选择参数 (p,d,q)。3. 对缺失值敏感。 | -| **Holt-Winters**(三参数指数平滑) | 基于指数平滑,引入水平、趋势和季节性三个分量,适用于具有趋势和季节性的数据。 | 有明显季节性和趋势的时间序列,如月度销售额、电力需求等。 | 1. 可处理加性或乘性季节性。2. 对近期数据赋予更高权重。3. 简单易实现。 | -| **Exponential Smoothing**(指数平滑) | 通过加权平均历史数据,权重随时间指数递减,强调近期观测值的重要性。 | 无显著季节性但存在趋势的数据,如短期需求预测。 | 1. 参数少,计算简单。2. 适合平稳或缓慢变化序列。3. 可扩展为双指数或三指数平滑。 | -| **Naive Forecaster**(朴素预测器) | 使用最近一期的观测值作为下一期的预测值,是最简单的基准模型。 | 作为其他模型的比较基准,或数据无明显模式时的简单预测。 | 1. 无需训练。2. 对突发变化敏感。3. 
季节性朴素变体可用前一季节同期值预测。 | -| **STL Forecaster**(季节趋势分解预测) | 基于STL分解时间序列,分别预测趋势、季节性和残差分量后组合。 | 具有复杂季节性、趋势和非线性模式的数据,如气候数据、交通流量。 | 1. 能处理非固定季节性。2. 对异常值稳健。3. 分解后可结合其他模型预测各分量。 | -| **Gaussian HMM**(高斯隐马尔可夫模型) | 假设观测数据由隐藏状态生成,每个状态的观测概率服从高斯分布。 | 状态序列预测或分类,如语音识别、金融状态识别。 | 1. 适用于时序数据的状态建模。2. 假设观测值在给定状态下独立。3. 需指定隐藏状态数量。 | -| **GMM HMM** (高斯混合隐马尔可夫模型) | 扩展Gaussian HMM,每个状态的观测概率由高斯混合模型描述,可捕捉更复杂的观测分布。 | 需要多模态观测分布的场景,如复杂动作识别、生物信号分析。 | 1. 比单一高斯更灵活。2. 参数更多,计算复杂度高。3. 需训练GMM成分数。 | -| **STRAY**(基于奇异值的异常检测) | 通过奇异值分解(SVD)检测高维数据中的异常点,常用于时间序列异常检测。 | 高维时间序列的异常检测,如传感器网络、IT系统监控。 | 1. 无需分布假设。2. 可处理高维数据。3. 对全局异常敏感,局部异常可能漏检。 | - -内置时序大模型介绍如下: - -| 模型名称 | 核心概念 | 适用场景 | 主要特点 | -|---------------| ---------------------------------------------------------------------- | ------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| **Timer-XL** | 支持超长上下文的时序大模型,通过大规模工业数据预训练增强泛化能力。 | 需利用极长历史数据的复杂工业预测,如能源、航空航天、交通等领域。 | 1. 超长上下文支持,可处理数万时间点输入。2. 多场景覆盖,支持非平稳、多变量及协变量预测。3. 基于万亿级高质量工业时序数据预训练。 | -| **Timer-Sundial** | 采用“Transformer + TimeFlow”架构的生成式基础模型,专注于概率预测。 | 需要量化不确定性的零样本预测场景,如金融、供应链、新能源发电预测。 | 1. 强大的零样本泛化能力,支持点预测与概率预测 2. 可灵活分析预测分布的任意统计特性。3. 创新生成架构,实现高效的非确定性样本生成。 | -| **Chronos-2** | 基于离散词元化范式的通用时序基础模型,将预测转化为语言建模任务。 | 快速零样本单变量预测,以及可借助协变量(如促销、天气)提升效果的场景。 | 1. 强大的零样本概率预测能力。2. 支持协变量统一建模,但对输入有严格要求:a. 未来协变量的名称组成的集合必须是历史协变量的名称组成的集合的子集;b. 每个历史协变量的长度必须等于目标变量的长度; c. 每个未来协变量的长度必须等于预测长度;3. 
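An efficient encoder-style architecture that balances performance and inference speed. |
-
-
-### 4.5 Deleting Models
-
-A successfully registered model can be deleted through SQL; AINode deletes the entire corresponding model folder under the user\_defined directory. The SQL syntax is:
-
-```SQL
-DROP MODEL <model_id>
-```
-
-The model\_id of a successfully registered model must be specified to delete it. Because deletion involves cleaning up model data, the operation does not complete immediately; during this time the model state is DROPPING, and a model in this state cannot be used for inference. Note that built-in models cannot be deleted.
-
-### 4.6 Loading/Unloading Models
-
-To suit different scenarios, AINode provides the following two model loading strategies:
-
-* On-demand loading: the model is loaded temporarily at inference time and its resources are released afterwards. Suitable for testing or low-load scenarios.
-* Resident loading: the model is kept loaded in memory (CPU) or GPU memory (GPU) to support high-concurrency inference. Users only need to specify the model to load or unload through SQL; AINode manages the number of instances automatically. The status of currently resident models can also be viewed at any time.
-
-The following describes loading and unloading models in detail:
-
-1. Configuration parameters
-
-Resident-loading parameters can be set by editing the following configuration items.
-
-```Properties
-# Ratio of the device memory/GPU memory that AINode may use for inference, relative to the total
-# Datatype: Float
-ain_inference_memory_usage_ratio=0.4
-
-# Memory ratio each loaded model instance needs to occupy, i.e. model footprint * this value
-# Datatype: Float
-ain_inference_extra_memory_ratio=1.2
-```
-
-2. Show available devices
-
-All available device IDs can be viewed with the following SQL command
-
-```SQL
-SHOW AI_DEVICES
-```
-
-Example
-
-```SQL
-IoTDB> show ai_devices
-+-------------+
-|     DeviceId|
-+-------------+
-|          cpu|
-|            0|
-|            1|
-+-------------+
-```
-
-3. Load a model
-
-Models can be loaded manually with the following SQL command; the system **automatically balances** the number of model instances according to hardware resource usage.
-
-```SQL
-LOAD MODEL <existing_model_id> TO DEVICES <device_id>(, <device_id>)*
-```
-
-Parameter requirements
-
-* **existing\_model\_id:** the specified model id; the current version supports only timer\_xl and sundial.
-* **device\_id:** where the model is loaded.
-  * **cpu:** load into the memory of the server where AINode runs.
-  * **gpu\_id:** load onto the corresponding GPUs of the server where AINode runs; for example, "0, 1" means loading onto the two GPUs numbered 0 and 1.
-
-Example
-
-```SQL
-LOAD MODEL sundial TO DEVICES 'cpu,0,1'
-```
-
-4. Unload a model
-
-All instances of a specified model can be unloaded manually with the following SQL command; the system **reallocates** the freed resources to other models
-
-```SQL
-UNLOAD MODEL <existing_model_id> FROM DEVICES <device_id>(, <device_id>)*
-```
-
-Parameter requirements
-
-* **existing\_model\_id:** the specified model id; the current version supports only timer\_xl and sundial.
-* **device\_id:** where the model is loaded.
-  * **cpu:** try to unload the specified model from the memory of the server where AINode runs.
-  * **gpu\_id:** try to unload the specified model from the corresponding GPUs of the server where AINode runs; for example, "0, 1" means trying to unload it from the two GPUs numbered 0 and 1.
-
-Example
-
-```SQL
-UNLOAD MODEL sundial FROM DEVICES 'cpu,0,1'
-```
-
-5. Show loaded models
-
-Manually loaded model instances can be viewed with the following SQL command; `device_id` can be used to specify devices.
-
-```SQL
-SHOW LOADED MODELS
-SHOW LOADED MODELS <device_id>(, <device_id>)* -- show model instances on the specified devices
-```
-
-Example: the sundial model has been loaded into memory and onto the two GPUs gpu\_0 and gpu\_1
-
-```SQL
-IoTDB> show loaded models
-+-------------+--------------+------------------+
-|     DeviceId|       ModelId|  Count(instances)|
-+-------------+--------------+------------------+
-|          cpu|       sundial|                 4|
-|            0|       sundial|                 6|
-|            1|       sundial|                 6|
-+-------------+--------------+------------------+
-```
-
-Notes:
-
-* DeviceId: the device ID
-* ModelId: the ID of the loaded model
-* Count(instances): the number of model instances on each device (assigned automatically by the system)
-
-### 4.7 Introduction to Time Series Large Models
-
-AINode currently supports several time series large models. For an introduction and deployment instructions, see [Time Series Large Models](../AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md)
-
-## 5. Permission Management
-
-When using AINode-related features, IoTDB's own authentication can be used for permission management: users can use the model management features only if they have the USE\_MODEL permission. When using inference, users need permission to access the source series referenced by the SQL that feeds the model.
-
-| **Permission name** | **Permission scope** | **Administrator (ROOT by default)** | **Ordinary user** |
-| ------------------------- | ----------------------------------------- | ---------------------------------- | -------------------- |
-| USE\_MODEL | create model / show models / drop model | √ | √ |
-| READ\_SCHEMA&READ\_DATA | forecast | √ | √ |
diff --git a/src/zh/UserGuide/Master/Table/AI-capability/AINode_timecho.md b/src/zh/UserGuide/Master/Table/AI-capability/AINode_timecho.md
deleted file mode 100644
index d28e42606..000000000
--- a/src/zh/UserGuide/Master/Table/AI-capability/AINode_timecho.md
+++ /dev/null
@@ -1,453 +0,0 @@
-
-
-# AINode
-
-AINode is a native IoTDB node that supports the registration, management, and invocation of time series related models. It includes industry-leading self-developed time series large models, such as the Timer series models developed by Tsinghua University. Models can be invoked using standard SQL statements, enabling millisecond-level real-time inference on time series data and supporting application scenarios such as time series trend prediction, missing value filling, and anomaly detection.
-
-> Supported in V2.0.5.1 and later
-
-The system architecture is shown in the following figure:
-
-![](/img/AINode-0.png)
-
-The responsibilities of the three node types are as follows:
-
-- **ConfigNode**: Responsible for distributed node management and load balancing.
-- **DataNode**: Responsible for receiving and parsing user SQL requests, storing time series data, and performing data preprocessing calculations.
-- **AINode**: Responsible for managing and using time series models.
-
-## 1. Advantages and Features
-
-Compared to building a separate machine learning service, AINode has the following advantages:
-
-- **Simple and Easy to Use**: No Python or Java programming is needed; the entire process of machine learning model management and inference can be completed with SQL statements. For example, a model can be created with the CREATE MODEL statement, and inference can be run with the SELECT * FROM FORECAST (...) statement, which makes usage simpler and more convenient.
-
-- **Avoid Data Migration**: IoTDB-native machine learning applies data stored in IoTDB directly to model inference without moving the data to a separate machine learning platform, which accelerates data processing, improves security, and reduces costs.
-
-![](/img/h1.png)
-
-- **Built-in Advanced Algorithms**: Supports industry-leading machine learning analysis algorithms that cover typical time series analysis tasks, giving the time series database native data analysis capabilities. For example:
-  - **Time Series Forecasting**: Learning change patterns from past time series data and, based on given past observations, outputting the most likely predictions for future sequences.
-  - **Anomaly Detection for Time Series**: Detecting and identifying abnormal values in given time series data to help discover abnormal behavior in time series.
-  - **Time Series Annotation**: Adding extra information or labels to each data point or specific time period, such as event occurrences, anomalies, and trend changes, to help understand and analyze the data.
-
-
-## 2. Basic Concepts
-
-- **Model**: A machine learning model that takes time series data as input and outputs the results or decisions of an analysis task. The model is the basic management unit of AINode, which supports creating (registering), deleting, querying, modifying (fine-tuning), and using (inference) models.
-- **Create**: Load an externally designed or trained model file or algorithm into AINode, to be managed and used uniformly by IoTDB.
-- **Inference**: Use a created model to complete the time series analysis task it is suited for on the specified time series data.
-- **Built-in**: AINode ships with machine learning algorithms or self-developed models for common time series analysis scenarios (e.g., forecasting and anomaly detection).
-
-![](/img/AINode-new.png)
-
-## 3. Installation and Deployment
-
-For AINode deployment, see [AINode Deployment](../Deployment-and-Maintenance/AINode_Deployment_timecho.md).
-
-## 4. Usage Guide
-
-AINode provides model creation and deletion for time series models; built-in models need no creation and can be used directly.
-
-### 4.1 Registering Models
-
-By specifying the vector dimensions of a model's input and output, a trained deep learning model can be registered and then used for inference.
-
-Models that meet the following conditions can be registered in AINode:
- 1. AINode currently supports models trained with PyTorch 2.4.0; features introduced after version 2.4.0 should be avoided.
- 2. AINode supports models saved with PyTorch JIT (`model.pt`); the model file must contain both the model structure and the weights.
- 3. The model input series may contain one or more columns; if there are multiple columns, they must match the model's capability and the model configuration file.
- 4. The model's configuration parameters must be clearly defined in the `config.yaml` configuration file. When using the model, the input and output dimensions defined in `config.yaml` must be strictly followed. If the number of input or output columns does not match the configuration file, an error results.
-
-The SQL syntax for model registration is defined below.
-
-```SQL
-create model <model_id> using uri <uri>
-```
-
-The parameters in the SQL have the following meanings:
-
-- model_id: the globally unique identifier of the model; it must not be duplicated. Model names are subject to the following constraints:
-
-  - Allowed characters: [ 0-9 a-z A-Z _ ] (letters, digits (not as the first character), underscores (not as the first character))
-  - Length is limited to 2-64 characters
-  - Case sensitive
-
-- uri: the resource path of the model registration files; the path should contain **the model structure and weight file model.pt and the model configuration file config.yaml**
-
-  - Model structure and weight file: the weight file obtained after model training; currently .pt files trained with PyTorch are supported
-
-  - Model configuration file: parameters related to the model structure that must be provided at registration; it must contain the model's input and output dimensions for inference:
-
-    | **Parameter** | **Description** | **Example** |
-    | ------------ | ---------------------------- | -------- |
-    | input_shape | Rows and columns of the model input, used for inference | [96,2] |
-    | output_shape | Rows and columns of the model output, used for inference | [48,2] |
-
-    Besides the parameters used for inference, the data types of the model input and output can also be specified:
-
-    | **Parameter** | **Description** | **Example** |
-    | ----------- | ------------------ | --------------------- |
-    | input_type | Data types of the model input | ['float32','float32'] |
-    | output_type | Data types of the model output | ['float32','float32'] |
-
-    In addition, notes can be specified for display in model management:
-
-    | **Parameter** | **Description** | **Example** |
-    | ---------- | ---------------------------------------------- | ------------------------------------------- |
-    | attributes | Optional; user-defined notes about the model, used for model display | 'model_type': 'dlinear','kernel_size': '25' |
-
-
-Besides registering local model files, a model can also be registered by specifying a remote resource path through the URI, using an open-source model repository (for example, HuggingFace).
-
-#### Example
-
-The [example folder](https://github.com/apache/iotdb/tree/master/integration-test/src/test/resources/ainode-example) contains the files model.pt and config.yaml. model.pt is obtained from training, and the content of config.yaml is as follows:
-
-```YAML
-configs:
-    # Required
-    input_shape: [96, 2]      # the model accepts data with 96 rows x 2 columns
-    output_shape: [48, 2]     # the model outputs data with 48 rows x 2 columns
-
-    # Optional; defaults to float32 for all columns, with the column count given by the shape
-    input_type: ["int64","int64"]   # data types of the input; must match the number of input columns
-    output_type: ["text","int64"]   # data types of the output; must match the number of output columns
-
-attributes:           # Optional; user-defined notes
-    'model_type': 'dlinear'
-    'kernel_size': '25'
-```
-
-Specify this folder as the load path to register the model
-
-```SQL
-IoTDB> create model dlinear_example using uri "file://./example"
-```
-
-After the SQL is executed, registration proceeds asynchronously; the registration status can be checked by showing the models (see the Viewing Models section). The time needed for a successful registration depends mainly on the size of the model files.
-
-Once registration is complete, the model can be used for inference by calling the corresponding function in an ordinary query.
-
-### 4.2 Viewing Models
-
-Details of successfully registered models can be queried with the show models command. Its SQL definition is:
-
-```SQL
-show models
-
-show models <model_id>
-```
-
-Besides showing the information of all models, a model id can be specified to view the information of one particular model. The result contains the following columns:
-
-| **ModelId** | **ModelType** | **Category** | **State** |
-|-------------|-----------|--------------|----------------|
-| Model ID    | Model type | Model category | Model state |
-
-- The model state machine is shown in the following figure
-
-![](/img/AINode-State.png)
-
-**Notes:**
-
-1. After AINode starts, show models shows only BUILT-IN models
-2. Users can import their own models, whose source is USER-DEFINED; AINode tries to parse ModelType from the configuration file and leaves it empty if parsing fails
-3. Weights of the time series large models are not packaged with AINode; they are downloaded automatically when AINode starts, and the state is LOADING during download
-4. On success the state becomes ACTIVE; on failure it becomes INACTIVE
-5. When a user starts fine-tuning, the model being trained is in state TRAINING; it becomes ACTIVE when training succeeds and FAILED when it fails
-
-**Example**
-
-```SQL
-IoTDB> show models
-+---------------------+--------------------+--------------+---------+
-|              ModelId|           ModelType|      Category|    State|
-+---------------------+--------------------+--------------+---------+
-|                arima|               Arima|      BUILT-IN|   ACTIVE|
-|          holtwinters|         HoltWinters|      BUILT-IN|   ACTIVE|
-|exponential_smoothing|ExponentialSmoothing|      BUILT-IN|   ACTIVE|
-|     naive_forecaster|     NaiveForecaster|      BUILT-IN|   ACTIVE|
-|       stl_forecaster|       StlForecaster|      BUILT-IN|   ACTIVE|
-|         gaussian_hmm|         GaussianHmm|      BUILT-IN|   ACTIVE|
-|              gmm_hmm|              GmmHmm|      BUILT-IN|   ACTIVE|
-|                stray|               Stray|      BUILT-IN|   ACTIVE|
-|               custom|                    |  USER-DEFINED|   ACTIVE|
-|              timerxl|            Timer-XL|      BUILT-IN|  LOADING|
-|              sundial|       Timer-Sundial|      BUILT-IN|   ACTIVE|
-|           sundialx_1|       Timer-Sundial|    FINE-TUNED|   ACTIVE|
-|           sundialx_2|       Timer-Sundial|    FINE-TUNED|   ACTIVE|
-|             sundialx|       Timer-Sundial|    FINE-TUNED|   ACTIVE|
-|           sundialx_4|       Timer-Sundial|    FINE-TUNED| TRAINING|
-|           sundialx_5|       Timer-Sundial|    FINE-TUNED|   FAILED|
-+---------------------+--------------------+--------------+---------+
-```
-
-### 4.3 Deleting Models
-
-A successfully registered model can be deleted through SQL; this operation deletes the related model files on all AINodes. The SQL is:
-
-```SQL
-drop model <model_id>
-```
-
-The model_id of a successfully registered model must be specified to delete it. Because deletion involves cleaning up model data, the operation does not complete immediately; during this time the model state is DROPPING, and a model in this state cannot be used for inference. Note that built-in models cannot be deleted.
-
-### 4.4 Inference with Built-in Models
-
-The SQL syntax is as follows:
-
-
-```SQL
-SELECT * FROM forecast(
-   input,
-   model_id,
-   [output_length,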
output_start_time, - output_interval, - timecol, - preserve_input, - model_options]? -) -``` - -内置模型推理无需注册流程,通过 forecast 函数,指定 model_id 就可以使用模型的推理功能 - - - 请注意,使用内置时序大模型进行推理的前提条件是本地存有对应模型权重,目录为 /IOTDB_AINODE_HOME/data/ainode/models/weights/model_id/。若本地没有模型权重,则会自动从 HuggingFace 拉取,请保证本地能直接访问 HuggingFace。 - -- 参数介绍如下: - -| 参数名 | 参数类型 | 参数属性 | 描述 | 是否必填 | 备注 | -| :---------------- | :------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :----------------------------------------------------------- | -| input | 表参数 | SET SEMANTIC | 待预测的输入数据 | 是 | | -| model_id | 标量参数 | 字符串类型 | 需要选择的model名 | 是 | 只能为非空,且内置的模型,否则报错:空字符串:MODEL_ID should never be null or empty不存在的模型:model [%s] has not been created模型不可用:model [%s] is not available | -| output_length | 标量参数 | INT32类型默认值:96 | 输出窗口大小 | 否 | 必须大于 0,否则报错:OUTPUT_LENGTH should be greater than 0 | -| output_start_time | 标量参数 | 时间戳类型默认值:输入数据的最后一个时间戳加 output_interval | 输出的预测点的起始时间戳 | 否 | 可以为负数,表示1970年1月1号之前的时间戳 | -| output_interval | 标量参数 | 时间间隔类型默认值:0(输入数据的采样间隔) | 输出的预测点之间的时间间隔支持的单位是 ns、us、ms、s、m、h、d、w | 否 | 大于 0 时,采用用户指定的输出间隔;小于等于 0 时,根据输入数据自动推测 | -| timecol | 标量参数 | 字符串类型默认值:time | 时间列名 | 否 | 存在于 input 中的,数据类型为 TIMESTAMP 的列,否则报错:若数据类型不为 TIMESTAMP: The type of the column [%s] is not as expected.若列不存在:Required column [%s] not found in the source table argument. | -| preserve_input | 标量参数 | 布尔类型默认值:false | 是否在输出结果集中保留输入的所有原始行 | 否 | | -| model_options | 标量参数 | 字符串类型默认值:空字符串 | 模型相关的key-value对,比如是否需要对输入进行归一化等。不同的key-value对以';'间隔 | 否 | 指定某个模型不支持参数,并不会报错,只会被忽略;AINode 中内置的模型支持的常见参数详见文末附录说明。 | - -**说明:** - -1. forecast 函数默认对输入表中所有列进行预测(不包含time列和partition by 的列)。 -2. forecast 函数对于输入数据无顺序性要求,默认对输入数据按照时间戳(由 TIMECOL 参数指定时间戳的列名)做升序排序后,再调用模型进行预测。 -3. 不同模型对于输入数据的行数要求不同,输入数据少于最低行数要求时会报错。 - - 在当前的 AINdoe 内置模型中,Timer-XL 模型至少需要输入 96 行数据,Timer-Sundial 模型至少需要输入 16 行数据。 -4. 
The result of the forecast function contains all input columns of the input table, with the same data types as in the original table. If preserve_input = true, it also contains an is_input column (indicating whether the current row is an input row) - - Currently only INT32, INT64, FLOAT, and DOUBLE columns can be forecast; otherwise an error is reported: The type of the column [%s] is [%s], only INT32, INT64, FLOAT, DOUBLE is allowed -5. output_start_time and output_interval only affect how the timestamp column of the result set is generated; both are optional. - - output_start_time defaults to the last timestamp of the input data plus output_interval - - output_interval defaults to the sampling interval of the input data, computed as (last timestamp of the input data - first timestamp of the input data) / (n - 1), where n is the number of input rows - - The timestamp of the N-th output row is output_start_time + (N - 1) * output_interval - -**Example: the database and table must be created in advance** - -```sql -create database etth -create table eg (hufl FLOAT FIELD, hull FLOAT FIELD, mufl FLOAT FIELD, mull FLOAT FIELD, lufl FLOAT FIELD, lull FLOAT FIELD, ot FLOAT FIELD) -``` - -The test data used here is [ETTh1-tab](/img/ETTh1-tab.csv). - -**View the currently supported models** - -```Bash -IoTDB:etth> show models -+---------------------+--------------------+--------+------+ -| ModelId| ModelType|Category| State| -+---------------------+--------------------+--------+------+ -| arima| Arima|BUILT-IN|ACTIVE| -| holtwinters| HoltWinters|BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing|BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster|BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster|BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm|BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm|BUILT-IN|ACTIVE| -| stray| Stray|BUILT-IN|ACTIVE| -| sundial| Timer-Sundial|BUILT-IN|ACTIVE| -| timer_xl| Timer-XL|BUILT-IN|ACTIVE| -+---------------------+--------------------+--------+------+ -Total line number = 10 -It costs 0.004s -``` - -**Table-model inference (using sundial as an example)** - -```Bash -IoTDB:etth> select Time, HUFL,HULL,MUFL,MULL,LUFL,LULL,OT from eg LIMIT 96 -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -| Time| HUFL| HULL| MUFL| MULL| LUFL| LULL| OT| -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -|2016-07-01T00:00:00.000+08:00| 5.827|2.009|1.599|0.462|4.203| 1.34|30.531| -|2016-07-01T01:00:00.000+08:00| 5.693|2.076|1.492|0.426|4.142|1.371|27.787| 
-|2016-07-01T02:00:00.000+08:00| 5.157|1.741|1.279|0.355|3.777|1.218|27.787| -|2016-07-01T03:00:00.000+08:00| 5.09|1.942|1.279|0.391|3.807|1.279|25.044| -...... -Total line number = 96 -It costs 0.119s - -IoTDB:etth> select * from forecast( - model_id => 'sundial', - input => (select Time, ot from etth.eg where time >= 2016-08-07T18:00:00.000+08:00 limit 1440) order BY time, - output_length => 96 -) -+-----------------------------+---------+ -| time| ot| -+-----------------------------+---------+ -|2016-10-06T18:00:00.000+08:00|20.781654| -|2016-10-06T19:00:00.000+08:00|20.252121| -|2016-10-06T20:00:00.000+08:00|19.960138| -|2016-10-06T21:00:00.000+08:00|19.662334| -...... -Total line number = 96 -It costs 1.615s -``` -### 4.5 使用内置模型微调 - -> 仅 Timer-XL、Timer-Sundial 可以进行微调操作。 - -SQL语法如下: - - -```SQL -create model (with hyperparameters -(=(, =)*))? -from model -on dataset (inputSql) -``` - -#### 示例 - -1. 选择测点 ot 中前 80% 数据作为微调数据集,基于 sundial 创建模型 sundialv3。 - -```SQL -IoTDB> set sql_dialect=table -Msg: The statement is executed successfully. -IoTDB> CREATE MODEL sundialv3 FROM MODEL sundial ON DATASET ('SELECT time, ot from etth.eg where 1467302400000 <= time and time < 1517468400001') -Msg: The statement is executed successfully. 
-IoTDB> show models -+---------------------+--------------------+----------+--------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+--------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| timer_xl| Timer-XL| BUILT-IN| ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED| ACTIVE| -| sundialv3| Timer-Sundial|FINE-TUNED|TRAINING| -+---------------------+--------------------+----------+--------+ -``` - -2. 微调任务后台异步启动,可在 AINode 进程看到 log;微调完成后,查询并使用新的模型 - -```SQL -IoTDB> show models -+---------------------+--------------------+----------+------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+------+ -| arima| Arima| BUILT-IN|ACTIVE| -| holtwinters| HoltWinters| BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN|ACTIVE| -| stray| Stray| BUILT-IN|ACTIVE| -| sundial| Timer-Sundial| BUILT-IN|ACTIVE| -| timer_xl| Timer-XL| BUILT-IN|ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED|ACTIVE| -| sundialv3| Timer-Sundial|FINE-TUNED|ACTIVE| -+---------------------+--------------------+----------+------+ -``` - -### 4.6 时序大模型导入步骤 - -AINode 目前支持多种时序大模型,部署使用请参考[时序大模型](../AI-capability/TimeSeries-Large-Model.md) - -## 5 权限管理 - -使用AINode相关的功能时,可以使用IoTDB本身的鉴权去做一个权限管理,用户只有在具备 USE_MODEL 权限时,才可以使用模型管理的相关功能。当使用推理功能时,用户需要有访问输入模型的SQL对应的源序列的权限。 - -| **权限名称** | **权限范围** | **管理员用户(默认ROOT)** | **普通用户** | **路径相关** | -| :----------- | 
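:-------------------------------------- | :------------------------- | :----------- | :----------- | -| USE_MODEL | create model / show models / drop model | √ | √ | x | -| READ_DATA | call inference | √ | √ | √ | - -## 6 Appendix - -**Arima** - -| Supported parameter | Meaning | Default | -| :---------------------- | :----------------------------------------------------------- | :-------- | -| order | Order of the ARIMA model `(p, d, q)`: p is the autoregressive order, d the differencing order, q the moving-average order. | (1,0,0) | -| seasonal_order | Order of the seasonal ARIMA `(P, D, Q, s)`: the seasonal autoregressive, differencing, and moving-average orders; s is the seasonal period (e.g. 12 for monthly data). | (0,0,0,0) | -| method | Optimizer; options: 'newton', 'nm', 'bfgs', 'lbfgs', 'powell', 'cg', 'ncg', 'basinhopping'. | 'lbfgs' | -| maxiter | Maximum number of iterations or function evaluations. | 50 | -| out_of_sample_size | Number of samples at the tail of the series held out for validation; the model is not fit on them. | 0 | -| scoring | Scoring function used for validation; the string must name a metric importable from sklearn, or a user-defined function. | 'mse' | -| trend | Trend configuration; if with_intercept=True and this is None, 'c' (constant term) is used by default. | None | -| with_intercept | Whether to include an intercept term. | True | -| time_varying_regression | Whether regression coefficients may vary over time. | False | -| enforce_stationarity | Whether to enforce stationarity of the AR component. | True | -| enforce_invertibility | Whether to enforce invertibility of the MA component. | True | -| simple_differencing | Whether to estimate on differenced data (sacrificing the first few rows for a simpler state space). | False | -| measurement_error | Whether to assume the observations contain measurement error. | False | -| mle_regression | Whether to estimate regression coefficients by maximum likelihood; must be False if `time_varying_regression=True`. | True | -| hamilton_representation | Whether to use the Hamilton representation (Harvey is used by default). | False | -| concentrate_scale | Whether to concentrate the error variance out of the likelihood, reducing the number of estimated parameters (the standard error of the error variance is then unavailable). | False | - -**NaiveForecaster** - -| Supported parameter | Meaning | Default | -| ---------- | ------------------------------------------------------------ | ------ | -| strategy | Forecasting strategy: • `"last"`: forecast the last value of the training series; if a seasonal period is set (`sp`>1), forecast the last value of each season separately. Robust to NaN values. • `"mean"`: forecast the mean of the last window; if `sp`>1, compute the mean for each season separately. Robust to NaN values. • `"drift"`: fit a line through the first and last points of the last window and extrapolate. Not robust to NaN values. | "last" | -| sp | Seasonal period. `None` is equivalent to `1`, meaning no seasonality; 12 means one period every 12 units (e.g. months). | 1 | - -**STLForecaster** - -| Supported parameter | Meaning | Default | -| :------------ | :----------------------------------------------------------- | :----- | -| sp | Length of the seasonal period (number of units per period); passed to STL in statsmodels. 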
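| 2 | -| seasonal | Length of the seasonal smoothing window; must be an odd integer ≥3, and ≥7 is usually recommended. | 7 | -| seasonal_deg | Polynomial degree of the seasonal LOESS (0 for constant, 1 for linear). | 1 | -| trend_deg | Polynomial degree of the trend LOESS (0 or 1). | 1 | -| low_pass_deg | Polynomial degree of the low-pass LOESS (0 or 1). | 1 | -| seasonal_jump | Interpolation step of the seasonal LOESS fit: the fit is evaluated every n points and interpolated in between. Larger values speed up estimation. | 1 | -| trend_jump | Interpolation step of the trend fit; larger values are faster but may reduce accuracy. | 1 | -| low_pass_jump | Interpolation step of the low-pass fit; same as above. | 1 | - -**ExponentialSmoothing (HoltWinters)** - -| Supported parameter | Meaning | Default | -| :-------------------- | :----------------------------------------------------------- | :---------- | -| damped_trend | Whether to use a damped trend (the trend flattens gradually instead of growing indefinitely). | True | -| initialization_method | Initialization method: • `"estimated"`: estimate the initial state by fitting • `"heuristic"`: estimate the initial level/trend/season heuristically • `"known"`: the user provides all initial values explicitly • `"legacy-heuristic"`: compatibility with older versions | "estimated" | -| optimized | Whether to optimize the parameters by maximizing the log-likelihood. | True | -| remove_bias | Whether to remove bias so that the mean residual between forecast and fitted values is 0. | False | -| use_brute | Whether to use brute force (grid search) to find the initial parameters; otherwise heuristic initial values are used. | | \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md b/src/zh/UserGuide/Master/Table/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md deleted file mode 100644 index 3d86bfc4f..000000000 --- a/src/zh/UserGuide/Master/Table/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md +++ /dev/null @@ -1,157 +0,0 @@ - -# Time Series Large Models - -## 1. Introduction - -Time series large models are foundation models designed specifically for time series analysis. The IoTDB team has long been developing its own time series foundation model, Timer. Built on the Transformer architecture and pre-trained on massive multi-domain time series data, it supports downstream tasks such as forecasting, anomaly detection, and imputation. The AINode platform built by the team also supports integrating cutting-edge time series foundation models from industry, giving users a range of choices. Unlike traditional time series analysis techniques, these large models have general feature-extraction capabilities and can serve a wide range of analysis tasks through zero-shot analysis, fine-tuning, and similar techniques. - -The technical results on time series large models covered here (both the team's own work and cutting-edge work from industry) were all published at top international machine learning conferences; see the appendix for details. - -## 2. Application Scenarios - -* **Time series forecasting**: Provides forecasting services for time series data in industrial production, the natural environment, and other fields, helping users anticipate future trends. -* **Data imputation**: Fills missing segments of a time series from context to improve the continuity and completeness of the dataset. -* **Anomaly detection**: Uses autoregressive analysis to monitor time series data in real time and give early warning of potential anomalies. - -![](/img/LargeModel09.png) - -## 3. 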
Timer-1 模型 - -Timer[1] 模型(非内置模型)不仅展现了出色的少样本泛化和多任务适配能力,还通过预训练获得了丰富的知识库,赋予了它处理多样化下游任务的通用能力,拥有以下特点: - -* **泛化性**:模型能够通过使用少量样本进行微调,达到行业内领先的深度模型预测效果。 -* **通用性**:模型设计灵活,能够适配多种不同的任务需求,并且支持变化的输入和输出长度,使其在各种应用场景中都能发挥作用。 -* **可扩展性**:随着模型参数数量的增加或预训练数据规模的扩大,模型效果会持续提升,确保模型能够随着时间和数据量的增长而不断优化其预测效果。 - -![](/img/model01.png) - -## 4. Timer-XL 模型 - -Timer-XL[2]基于 Timer 进一步扩展升级了网络结构,在多个维度全面突破: - -* **超长上下文支持**:该模型突破了传统时序预测模型的限制,支持处理数千个 Token(相当于数万个时间点)的输入,有效解决了上下文长度瓶颈问题。 -* **多变量预测场景覆盖**:支持多种预测场景,包括非平稳时间序列的预测、涉及多个变量的预测任务以及包含协变量的预测,满足多样化的业务需求。 -* **大规模工业时序数据集:**采用万亿大规模工业物联网领域的时序数据集进行预训练,数据集兼有庞大的体量、卓越的质量和丰富的领域等重要特质,覆盖能源、航空航天、钢铁、交通等多领域。 - -![](/img/model02.png) - -## 5. Timer-Sundial 模型 - -Timer-Sundial[3]是一个专注于时间序列预测的生成式基础模型系列,其基础版本拥有 1.28 亿参数,并在 1 万亿个时间点上进行了大规模预训练,其核心特性包括: - -* **强大的泛化性能:**具备零样本预测能力,可同时支持点预测和概率预测。 -* **灵活预测分布分析:**不仅能预测均值或分位数,还可通过模型生成的原始样本评估预测分布的任意统计特性。 -* **创新生成架构:** 采用 “Transformer + TimeFlow” 协同架构——Transformer 学习时间片段的自回归表征,TimeFlow 模块基于流匹配框架 (Flow-Matching) 将随机噪声转化为多样化预测轨迹,实现高效的非确定性样本生成。 - -![](/img/model03.png) - -## 6. Chronos-2 模型 - -Chronos-2 [4]是由 Amazon Web Services (AWS) 研究团队开发的,基于 Chronos 离散词元建模范式发展起来的通用时间序列基础模型,该模型同时适用于零样本单变量预测和协变量预测。其主要特性包括: - -* **概率性预测能力**:模型以生成式方式输出多步预测结果,支持分位数或分布级预测,从而刻画未来不确定性。 -* **零样本通用预测**:依托预训练获得的上下文学习能力,可直接对未见过的数据集执行预测,无需重新训练或参数更新。 -* **多变量与协变量统一建模**:支持在同一架构下联合建模多条相关时间序列及其协变量,以提升复杂任务的预测效果。但对输入有严格要求: - * 未来协变量的名称组成的集合必须是历史协变量的名称组成的集合的子集; - * 每个历史协变量的长度必须等于目标变量的长度; - * 每个未来协变量的长度必须等于预测长度; -* **高效推理与部署**:模型采用紧凑的编码器式(encoder-only)结构,在保持强泛化能力的同时兼顾推理效率。 - -![](/img/timeseries-large-model-chronos2.png) - -## 7. 效果展示 - -时序大模型能够适应多种不同领域和场景的真实时序数据,在各种任务上拥有优异的处理效果,以下是在不同数据上的真实表现: - -**时序预测:** - -利用时序大模型的预测能力,能够准确预测时间序列的未来变化趋势,如下图蓝色曲线代表预测趋势,红色曲线为实际趋势,两曲线高度吻合。 - -![](/img/LargeModel03.png) - -**数据填补**: - -利用时序大模型对缺失数据段进行预测式填补。 - -![](/img/timeseries-large-model-data-imputation.png) - -**异常检测**: - -利用时序大模型精准识别与正常趋势偏离过大的异常值。 - -![](/img/LargeModel05.png) - -## 8. 部署使用 - -1. 
打开 IoTDB cli 控制台,检查 ConfigNode、DataNode、AINode 节点确保均为 Running。 - -```Plain -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| -+------+----------+-------+---------------+------------+--------------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| 2.0.5.1| 069354f| -| 1| DataNode|Running| 127.0.0.1| 10730| 2.0.5.1| 069354f| -| 2| AINode|Running| 127.0.0.1| 10810| 2.0.5.1|069354f-dev| -+------+----------+-------+---------------+------------+--------------+-----------+ -Total line number = 3 -It costs 0.140s -``` - -2. 联网环境下首次启动 AINode 节点会自动拉取 Timer-XL、Sundial、Chronos2 模型。 - - > 注意: - > - > * AINode 安装包不包含模型权重文件 - > * 自动拉取功能依赖部署环境具备 HuggingFace 网络访问能力 - > * AINode 支持手动上传模型权重文件,具体操作方法可参考[导入权重文件](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md#_3-3-导入内置权重文件) - -3. 检查模型是否可用。 - -```Bash -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### 附录 - -**[1]** Timer- Generative Pre-trained Transformers Are Large Time Series Models, Yong Liu, Haoran Zhang, Chenyu Li, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ 返回](#ref1) - -**[2]** TIMER-XL- LONG-CONTEXT TRANSFORMERS FOR UNIFIED TIME SERIES FORECASTING ,Yong Liu, Guo Qin, Xiangdong Huang, Jianmin Wang, Mingsheng Long. 
[↩ Back](#ref2) - -**[3]** Sundial- A Family of Highly Capable Time Series Foundation Models, Yong Liu, Guo Qin, Zhiyuan Shi, Zhi Chen, Caiyin Yang, Xiangdong Huang, Jianmin Wang, Mingsheng Long, **ICML 2025 spotlight**. [↩ Back](#ref3) - -**[4]** Chronos-2: From Univariate to Universal Forecasting, Abdul Fatir Ansari, Oleksandr Shchur, Jaris Küken, Andreas Auer, Boran Han, Pedro Mercado, Syama Sundar Rangapuram, Huibin Shen, Lorenzo Stella, Xiyuan Zhang, Mononito Goswami, Shubham Kapoor, Danielle C. Maddix, Pablo Guerron, Tony Hu, Junming Yin, Nick Erickson, Prateek Mutalik Desai, Hao Wang, Huzefa Rangwala, George Karypis, Yuyang Wang, Michael Bohlke-Schneider, **arXiv:2510.15821.** [↩ Back](#ref4) diff --git a/src/zh/UserGuide/Master/Table/API/Programming-CSharp-Native-API_timecho.md b/src/zh/UserGuide/Master/Table/API/Programming-CSharp-Native-API_timecho.md deleted file mode 100644 index 0d1c44c6d..000000000 --- a/src/zh/UserGuide/Master/Table/API/Programming-CSharp-Native-API_timecho.md +++ /dev/null @@ -1,403 +0,0 @@ - -# C# Native API - -## 1. Overview - -IoTDB provides a native C# client driver and a corresponding connection pool. It offers an object-oriented interface that lets you assemble time series objects directly for writing, without building SQL strings. Using the connection pool with multiple threads operating on the database in parallel is recommended. - -## 2. Usage - -**Requirements:** - -* .NET SDK >= 5.0 or .NET Framework 4.x -* Thrift >= 0.14.1 -* NLog >= 4.7.9 - -**Installing dependencies:** - -Installation is supported through NuGet Package Manager, the .NET CLI, and similar tools. The .NET CLI is used as the example here. - -If you are using the .NET 5.0 SDK or later, run the following command to install the latest NuGet package - -```Plain -dotnet add package Apache.IoTDB -``` -Note: Do not use a newer client version to connect to an older server version. - -## 3. 
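Read and Write Operations - -### 3.1 TableSessionPool - -#### 3.1.1 Description - -TableSessionPool defines the basic operations for interacting with IoTDB, such as inserting data, running queries, and closing sessions. It is also a connection pool, which helps reuse connections efficiently and clean up resources properly when they are no longer needed. The interface defines how to obtain sessions from the pool and how to close the pool. - -#### 3.1.2 Method List - -The methods defined in TableSessionPool are described below: - -| Method | Description | Parameters | Return value | -| ---------------------------------------------------------------- | ---------------------------------------------------------------- |-------------------------------------------------------------------|--------------------| -| `Open(bool enableRpcCompression)` | Opens the session connection with a custom `enableRpcCompression` setting | `enableRpcCompression`: whether to enable `RpcCompression`; this must be used together with the matching server-side configuration | `Task` | -| `Open()` | Opens the session connection without `RpcCompression` | None | `Task` | -| `InsertAsync(Tablet tablet)` | Inserts a Tablet object containing time series data into the database | tablet: the Tablet object to insert | `Task` | -| `ExecuteNonQueryStatementAsync(string sql)` | Executes a non-query SQL statement, such as a DDL (data definition language) or DML (data manipulation language) command | sql: the SQL statement to execute. | `Task` | -| `ExecuteQueryStatementAsync(string sql)` | Executes a query SQL statement and returns a SessionDataSet object containing the results | sql: the SQL statement to execute. | `Task` | -| `ExecuteQueryStatementAsync(string sql, long timeoutInMs)` | Executes a query SQL statement with a query timeout (in milliseconds) | sql: the query SQL statement to execute.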
timeoutInMs: 查询超时时间(毫秒) | `Task` | -| `Close()` | 关闭会话,释放所持有的资源 | 无 | `Task` | - -#### 3.1.3 接口展示 - -```C# -public async Task Open(bool enableRpcCompression, CancellationToken cancellationToken = default) - - public async Task Open(CancellationToken cancellationToken = default) - - public async Task InsertAsync(Tablet tablet) - - public async Task ExecuteNonQueryStatementAsync(string sql) - - public async Task ExecuteQueryStatementAsync(string sql) - - public async Task ExecuteQueryStatementAsync(string sql, long timeoutInMs) - - public async Task Close() -``` - -### 3.2 TableSessionPool.Builder - -#### 3.2.1 功能描述 - -TableSessionPool.Builder 是 TableSessionPool的构造器,用于配置和创建 TableSessionPool 的实例。允许开发者配置连接参数、会话参数和池化行为等。 - -#### 3.2.2 配置选项 - -以下是 TableSessionPool.Builder 类的可用配置选项及其默认值: - -| **配置项** | **描述** | **默认值** | -| ---------------------------------------------------- | ---------------------------------------------------------------- |---------------------| -| `SetHost(string host)` | 设置IoTDB的节点 host | `localhost` | -| `SetPort(int port)` | 设置IoTDB的节点端口 | `6667` | -| `SetNodeUrls(List nodeUrls)` | 设置IoTDB集群的节点URL列表,当设置此项时会忽略SetHost和SetPort | `无 ` | -| `SetUsername(string username)` | 设置连接的用户名 | `"root"` | -| `SetPassword(string password)` | 设置连接的密码 | `"TimechoDB@2021"` //V2.0.6.x 之前默认密码是root | -| `SetFetchSize(int fetchSize)` | 设置查询结果的获取大小 | `1024 ` | -| `SetZoneId(string zoneId)` | 设置时区相关的ZoneId | `UTC+08:00` | -| `SetPoolSize(int poolSize)` | 设置会话池的最大大小,即池中允许的最大会话数 | `8 ` | -| `SetEnableRpcCompression(bool enableRpcCompression)` | 是否启用RPC压缩 | `false` | -| `SetConnectionTimeoutInMs(int timeout)` | 设置连接超时时间(毫秒) | `500` | -| `SetDatabase(string database)` | 设置目标数据库名称 | ` "" ` | - -#### 3.2.3 接口展示 - -```c# -public Builder SetHost(string host) - { - _host = host; - return this; - } - - public Builder SetPort(int port) - { - _port = port; - return this; - } - - public Builder SetUsername(string username) - { - _username = username; - return this; - } - - public 
Builder SetPassword(string password) - { - _password = password; - return this; - } - - public Builder SetFetchSize(int fetchSize) - { - _fetchSize = fetchSize; - return this; - } - - public Builder SetZoneId(string zoneId) - { - _zoneId = zoneId; - return this; - } - - public Builder SetPoolSize(int poolSize) - { - _poolSize = poolSize; - return this; - } - - public Builder SetEnableRpcCompression(bool enableRpcCompression) - { - _enableRpcCompression = enableRpcCompression; - return this; - } - - public Builder SetConnectionTimeoutInMs(int timeout) - { - _connectionTimeoutInMs = timeout; - return this; - } - - public Builder SetNodeUrls(List nodeUrls) - { - _nodeUrls = nodeUrls; - return this; - } - - protected internal Builder SetSqlDialect(string sqlDialect) - { - _sqlDialect = sqlDialect; - return this; - } - - public Builder SetDatabase(string database) - { - _database = database; - return this; - } - - public Builder() - { - _host = "localhost"; - _port = 6667; - _username = "root"; - _password = "TimechoDB@2021"; //V2.0.6.x 之前默认密码是root - _fetchSize = 1024; - _zoneId = "UTC+08:00"; - _poolSize = 8; - _enableRpcCompression = false; - _connectionTimeoutInMs = 500; - _sqlDialect = IoTDBConstant.TABLE_SQL_DIALECT; - _database = ""; - } - - public TableSessionPool Build() - { - SessionPool sessionPool; - // if nodeUrls is not empty, use nodeUrls to create session pool - if (_nodeUrls.Count > 0) - { - sessionPool = new SessionPool(_nodeUrls, _username, _password, _fetchSize, _zoneId, _poolSize, _enableRpcCompression, _connectionTimeoutInMs, _sqlDialect, _database); - } - else - { - sessionPool = new SessionPool(_host, _port, _username, _password, _fetchSize, _zoneId, _poolSize, _enableRpcCompression, _connectionTimeoutInMs, _sqlDialect, _database); - } - return new TableSessionPool(sessionPool); - } -``` - -## 4. 
示例代码 - -完整示例:[samples/Apache.IoTDB.Samples/TableSessionPoolTest.cs](https://github.com/apache/iotdb-client-csharp/blob/main/samples/Apache.IoTDB.Samples/TableSessionPoolTest.cs) - -```c# -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. 
- */ - -using System; -using System.Collections.Generic; -using System.Threading.Tasks; -using Apache.IoTDB.DataStructure; - -namespace Apache.IoTDB.Samples; - -public class TableSessionPoolTest -{ - private readonly SessionPoolTest sessionPoolTest; - - public TableSessionPoolTest(SessionPoolTest sessionPoolTest) - { - this.sessionPoolTest = sessionPoolTest; - } - - public async Task Test() - { - await TestCleanup(); - - await TestSelectAndInsert(); - await TestUseDatabase(); - // await TestCleanup(); - } - - - public async Task TestSelectAndInsert() - { - var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(sessionPoolTest.nodeUrls) - .SetUsername(sessionPoolTest.username) - .SetPassword(sessionPoolTest.password) - .SetFetchSize(1024) - .Build(); - - await tableSessionPool.Open(false); - - if (sessionPoolTest.debug) tableSessionPool.OpenDebugMode(); - - - await tableSessionPool.ExecuteNonQueryStatementAsync("CREATE DATABASE test1"); - await tableSessionPool.ExecuteNonQueryStatementAsync("CREATE DATABASE test2"); - - await tableSessionPool.ExecuteNonQueryStatementAsync("use test2"); - - // or use full qualified table name - await tableSessionPool.ExecuteNonQueryStatementAsync( - "create table test1.table1(region_id STRING TAG, plant_id STRING TAG, device_id STRING TAG, model STRING ATTRIBUTE, temperature FLOAT FIELD, humidity DOUBLE FIELD) with (TTL=3600000)"); - - await tableSessionPool.ExecuteNonQueryStatementAsync( - "create table table2(region_id STRING TAG, plant_id STRING TAG, color STRING ATTRIBUTE, temperature FLOAT FIELD, speed DOUBLE FIELD) with (TTL=6600000)"); - - // show tables from current database - var res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - // show tables by specifying another database - // using SHOW tables FROM - res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES FROM test1"); - 
res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - var tableName = "testTable1"; - List<string> columnNames = - new List<string> { - "region_id", - "plant_id", - "device_id", - "model", - "temperature", - "humidity" }; - List<TSDataType> dataTypes = - new List<TSDataType>{ - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.FLOAT, - TSDataType.DOUBLE}; - List<ColumnCategory> columnCategories = - new List<ColumnCategory>{ - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.ATTRIBUTE, - ColumnCategory.FIELD, - ColumnCategory.FIELD}; - var values = new List<List<object>> { }; - var timestamps = new List<long> { }; - for (long timestamp = 0; timestamp < 100; timestamp++) - { - timestamps.Add(timestamp); - values.Add(new List<object> { "1", "5", "3", "A", 1.23F + timestamp, 111.1 + timestamp }); - } - var tablet = new Tablet(tableName, columnNames, columnCategories, dataTypes, values, timestamps); - - await tableSessionPool.InsertAsync(tablet); - - - res = await tableSessionPool.ExecuteQueryStatementAsync("select * from testTable1 " - + "where region_id = '1' and plant_id in ('3', '5') and device_id = '3'"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - await tableSessionPool.Close(); - } - - - public async Task TestUseDatabase() - { - var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(sessionPoolTest.nodeUrls) - .SetUsername(sessionPoolTest.username) - .SetPassword(sessionPoolTest.password) - .SetDatabase("test1") - .SetFetchSize(1024) - .Build(); - - await tableSessionPool.Open(false); - - if (sessionPoolTest.debug) tableSessionPool.OpenDebugMode(); - - - // show tables from current database - var res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - await tableSessionPool.ExecuteNonQueryStatementAsync("use test2"); - - // show tables from 
current database - res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - await tableSessionPool.Close(); - } - - public async Task TestCleanup() - { - var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(sessionPoolTest.nodeUrls) - .SetUsername(sessionPoolTest.username) - .SetPassword(sessionPoolTest.password) - .SetFetchSize(1024) - .Build(); - - await tableSessionPool.Open(false); - - if (sessionPoolTest.debug) tableSessionPool.OpenDebugMode(); - - await tableSessionPool.ExecuteNonQueryStatementAsync("drop database test1"); - await tableSessionPool.ExecuteNonQueryStatementAsync("drop database test2"); - - await tableSessionPool.Close(); - } -} -``` diff --git a/src/zh/UserGuide/Master/Table/API/Programming-Cpp-Native-API_timecho.md b/src/zh/UserGuide/Master/Table/API/Programming-Cpp-Native-API_timecho.md deleted file mode 100644 index be499eba7..000000000 --- a/src/zh/UserGuide/Master/Table/API/Programming-Cpp-Native-API_timecho.md +++ /dev/null @@ -1,462 +0,0 @@ - - -# C++ Native API - -## 1. Dependencies - -- Java 8+ -- Flex -- Bison 2.7+ -- Boost 1.56+ -- OpenSSL 1.0+ -- GCC 5.5.0+ - - -## 2. Installation - -### 2.1 Installing Dependencies - -- **MAC** - 1. Install Bison: - - Install bison with the following brew command: - ```shell - brew install bison - ``` - - 2. Install Boost: make sure the latest Boost version is installed. - - ```shell - brew install boost - ``` - - 3. Check OpenSSL: make sure the openssl library is installed. The default openssl header path is "/usr/local/opt/openssl/include" - - If the build fails because openssl cannot be found, try adding `-Dopenssl.include.dir=""` - - -- **Ubuntu 16.04+ or other Debian-family distributions** - - Install the dependencies with the following commands: - - ```shell - sudo apt-get update - sudo apt-get install gcc g++ bison flex libboost-all-dev libssl-dev - ``` - - -- **CentOS 7.7+/Fedora/Rocky Linux or other Red Hat-family distributions** - - Install the dependencies with yum: - - ```shell - sudo yum update - sudo yum install gcc gcc-c++ boost-devel bison flex openssl-devel - ``` - - -- **Windows** - -1. 
Set up the build environment - - Install MS Visual Studio (2019 or later recommended): during installation, select Visual Studio C/C++ IDE and compiler (supporting CMake, Clang, MinGW) - - Download and install [CMake](https://cmake.org/download/). - -2. Download and install Flex and Bison - - Download [Win_Flex_Bison](https://sourceforge.net/projects/winflexbison/) - - After downloading, rename the executables to flex.exe and bison.exe so that the build can find them, and add their directory to the PATH environment variable - -3. Install the Boost library - - Download [Boost](https://www.boost.org/users/download/) - - Build Boost locally: run bootstrap.bat and then b2.exe - - Add the Boost installation directory to the PATH environment variable, for example `C:\Program Files (x86)\boost_1_78_0` - -4. Install OpenSSL - - Download and install [OpenSSL](http://slproweb.com/products/Win32OpenSSL.html) - - Add the include directory under OpenSSL to the PATH environment variable - - -### 2.2 Building - -Clone the source code from git: -```shell -git clone https://github.com/apache/iotdb.git -``` - -The default branch is master. To use a specific release, switch to its branch (for example, version 2.0.6): -```shell -git checkout rc/2.0.6 -``` -Note: Do not use a newer client version to connect to an older server version. - -Run the maven build in the IoTDB root directory: - -- Mac, or Linux with glibc version >= 2.32 - ```shell - ./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp - ``` - -- Linux with glibc version >= 2.31 - ```shell - ./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Diotdb-tools-thrift.version=0.14.1.1-old-glibc-SNAPSHOT - ``` - -- Linux with glibc version >= 2.17 - ```shell - ./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Diotdb-tools-thrift.version=0.14.1.1-glibc223-SNAPSHOT - ``` - -- Windows with Visual Studio 2022 - ```batch - .\mvnw.cmd clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp - ``` - -- Windows with Visual Studio 2019 - ```batch - .\mvnw.cmd clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Dcmake.generator="Visual Studio 16 2019" -Diotdb-tools-thrift.version=0.14.1.1-msvc142-SNAPSHOT - ``` - - If the Boost library path has not been added to the PATH environment variable, the build command also needs the corresponding parameters, for example: `-DboostIncludeDir="C:\Program Files (x86)\boost_1_78_0" -DboostLibraryDir="C:\Program Files (x86)\boost_1_78_0\stage\lib"` - -After a successful build, the packaged library files are located in `iotdb-client/client-cpp/target`, and the compiled example program can be found under `example/client-cpp-example/target`. - -### 2.3 Build Q&A - -Q: What are the environment requirements on Linux? - -A: -- The minimum known glibc version (x86_64) is 2.17, and the minimum GCC version is 5.5 -- The minimum known glibc version (ARM) is 2.31, and the minimum GCC version is 10.2 -- If these requirements are not met, you can try building Thrift locally - - Download the code from https://github.com/apache/iotdb-bin-resources/tree/iotdb-tools-thrift-v0.14.1.0/iotdb-tools-thrift - - Run `./mvnw clean install` - - Go back to the iotdb source directory and run `./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp` - - -Q: How do I handle the Linux build error `undefined reference to '__libc_single_threaded'`? - -A: -- This problem occurs because the default precompiled Thrift depends on a glibc version that is too new -- Try adding `-Diotdb-tools-thrift.version=0.14.1.1-glibc223-SNAPSHOT` or `-Diotdb-tools-thrift.version=0.14.1.1-old-glibc-SNAPSHOT` to the maven build command - -Q: What should I do if I need to build with Visual Studio 2017 or earlier on Windows? - -A: -- You can try building Thrift locally before building the client - - Download the code from https://github.com/apache/iotdb-bin-resources/tree/iotdb-tools-thrift-v0.14.1.0/iotdb-tools-thrift - - Run `.\mvnw.cmd clean install` - - Go back to the iotdb source directory and run `.\mvnw.cmd clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Dcmake.generator="Visual Studio 15 2017"` - -Q: How do I fix garbled characters when building with Visual Studio on Windows? - -A: -- The project uses utf-8 encoding throughout, while some Windows system files used during the build are not utf-8 encoded (the system encoding follows the region by default; in China it is GBK) -- Change the system locale in Control Panel: open Control Panel -> Clock and Region -> Region, switch to the Administrative tab, choose Change system locale under "Language for non-Unicode programs", check `Beta: Use Unicode UTF-8 for worldwide language support`, and restart the computer. (Details may vary by Windows version; detailed guides can be found online) -- Note that changing the Windows system encoding may cause other programs that use GBK encoding to display garbled text. Reverting the system locale restores them, so weigh this before changing it. - -## 3. Usage - -### 3.1 The TableSession Class - -All operations of the C++ client go through the TableSession class. The methods defined in the TableSession interface are described below. - -#### 3.1.1 Method List - -1. `insert(Tablet& tablet, bool sorted = false)`: inserts a Tablet object containing time series data into the database; the sorted parameter indicates whether the rows in the tablet are already sorted by time. -2. `executeNonQueryStatement(string& sql)`: executes a non-query SQL statement, such as a DDL (data definition language) or DML (data manipulation language) command. -3. 
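`executeQueryStatement(string& sql)`: executes a query SQL statement and returns a SessionDataSet object containing the results; an optional timeoutInMs parameter sets the query timeout. - * Note: When calling `SessionDataSet::next()` to fetch result rows, store the returned `std::shared_ptr` object in a local variable (for example `auto row = dataSet->next();`) so that the data stays alive. Accessing it directly through `.get()` or a raw pointer (such as `dataSet->next().get()`) lets the smart pointer's reference count drop to zero, the data is released immediately, and any later access is undefined behavior. -4. `open(bool enableRPCCompression = false)`: opens the connection and sets whether RPC compression is enabled (the client setting must match the server; disabled by default). -5. `close()`: closes the connection. - -#### 3.1.2 Interface - -```cpp -class TableSession { -private: - Session* session; -public: - TableSession(Session* session) { - this->session = session; - } - void insert(Tablet& tablet, bool sorted = false); - void executeNonQueryStatement(const std::string& sql); - unique_ptr<SessionDataSet> executeQueryStatement(const std::string& sql); - unique_ptr<SessionDataSet> executeQueryStatement(const std::string& sql, int64_t timeoutInMs); - string getDatabase(); // returns the currently selected database; executeNonQueryStatement can be used instead - void open(bool enableRPCCompression = false); - void close(); -}; -``` - -### 3.2 The TableSessionBuilder Class - -TableSessionBuilder is a builder used to configure and create instances of TableSession. It makes it easy to set connection parameters, query parameters, and so on when creating an instance. - -#### 3.2.1 Usage Example - -```cpp -// Set the IP, port, username, and password of the connection -// The setters can be called in any order as long as build() is called last; the created instance has already been connected via open() -session = (new TableSessionBuilder()) - ->host("127.0.0.1") - ->rpcPort(6667) - ->username("root") - ->password("TimechoDB@2021") // before V2.0.6.x the default password is root - ->build(); -``` - -#### 3.2.2 Configurable Parameters - -| **Parameter** | **Description** | **Default** | -| :---: | :---: |:-------------------------:| -| host | IP of the node to connect to | "127.0.0.1" ("localhost") | -| rpcPort | Port of the node to connect to | 6667 | -| username | Username for the connection | "root" | -| password | Password for the connection | "TimechoDB@2021" (before V2.0.6.x the default password is root) | -| zoneId | ZoneId used for time zone handling | "" | -| fetchSize | Fetch size for query results | 10000 | -| database | Name of the target database | "" | - - -## 4. 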
Example Code
-
-Source code of the example project:
-
-- `example/client-cpp-example/src/TableModelSessionExample.cpp` : [TableModelSessionExample](https://github.com/apache/iotdb/blob/master/example/client-cpp-example/src/TableModelSessionExample.cpp)
-
-After a successful build, the example project is located under `example/client-cpp-example/target`
-
-```cpp
-#include "TableSession.h"
-#include "TableSessionBuilder.h"
-
-using namespace std;
-
-shared_ptr<TableSession> session;
-
-void insertRelationalTablet() {
-
-    vector<pair<string, TSDataType::TSDataType>> schemaList {
-        make_pair("region_id", TSDataType::TEXT),
-        make_pair("plant_id", TSDataType::TEXT),
-        make_pair("device_id", TSDataType::TEXT),
-        make_pair("model", TSDataType::TEXT),
-        make_pair("temperature", TSDataType::FLOAT),
-        make_pair("humidity", TSDataType::DOUBLE)
-    };
-
-    vector<ColumnCategory> columnTypes = {
-        ColumnCategory::TAG,
-        ColumnCategory::TAG,
-        ColumnCategory::TAG,
-        ColumnCategory::ATTRIBUTE,
-        ColumnCategory::FIELD,
-        ColumnCategory::FIELD
-    };
-
-    Tablet tablet("table1", schemaList, columnTypes, 100);
-
-    for (int row = 0; row < 100; row++) {
-        int rowIndex = tablet.rowSize++;
-        tablet.timestamps[rowIndex] = row;
-
-        // The index-based API is more efficient than looking up columns by name.
-        // Recommended: tablet.addValue(0, rowIndex, "1");
-        // Avoid:       tablet.addValue("region_id", rowIndex, "1");
-        tablet.addValue(0, rowIndex, "1"); // region_id
-        tablet.addValue(1, rowIndex, "5"); // plant_id
-        tablet.addValue(2, rowIndex, "3"); // device_id
-        tablet.addValue(3, rowIndex, "A"); // model
-        tablet.addValue(4, rowIndex, 37.6F); // temperature
-        tablet.addValue(5, rowIndex, 111.1); // humidity
-        if (tablet.rowSize == tablet.maxRowNumber) {
-            session->insert(tablet);
-            tablet.reset();
-        }
-    }
-
-    if (tablet.rowSize != 0) {
-        session->insert(tablet);
-        tablet.reset();
-    }
-}
-
-void Output(unique_ptr<SessionDataSet> &dataSet) {
-    for (const string &name: dataSet->getColumnNames()) {
-        cout << name << " ";
-    }
-    cout << endl;
-    while (dataSet->hasNext()) {
-        cout << dataSet->next()->toString();
-    }
-    cout << endl;
-}
-
-void OutputWithType(unique_ptr<SessionDataSet> &dataSet) {
-    for (const string &name: dataSet->getColumnNames()) {
-        cout << name << " ";
-    }
-    cout << endl;
-    for (const string &type: dataSet->getColumnTypeList()) {
-        cout << type << " ";
-    }
-    cout << endl;
-    while (dataSet->hasNext()) {
-        cout << dataSet->next()->toString();
-    }
-    cout << endl;
-}
-
-int main() {
-    try {
-        session = (new TableSessionBuilder())
-            ->host("127.0.0.1")
-            ->rpcPort(6667)
-            ->username("root")
-            ->password("root")
-            ->build();
-
-        cout << "[Create Database db1,db2]\n" << endl;
-        try {
-            session->executeNonQueryStatement("CREATE DATABASE IF NOT EXISTS db1");
-            session->executeNonQueryStatement("CREATE DATABASE IF NOT EXISTS db2");
-        } catch (IoTDBException &e) {
-            cout << e.what() << endl;
-        }
-
-        cout << "[Use db1 as database]\n" << endl;
-        try {
-            session->executeNonQueryStatement("USE db1");
-        } catch (IoTDBException &e) {
-            cout << e.what() << endl;
-        }
-
-        cout << "[Create Table table1,table2]\n" << endl;
-        try {
-            session->executeNonQueryStatement("create table db1.table1(region_id STRING TAG, plant_id STRING TAG, device_id STRING TAG, model STRING ATTRIBUTE, temperature FLOAT FIELD, humidity DOUBLE FIELD) with (TTL=3600000)");
-            session->executeNonQueryStatement("create table db2.table2(region_id STRING TAG, plant_id STRING TAG, color STRING ATTRIBUTE, temperature FLOAT FIELD, speed DOUBLE FIELD) with (TTL=6600000)");
-        } catch (IoTDBException &e) {
-            cout << e.what() << endl;
-        }
-
-        cout << "[Show Tables]\n" << endl;
-        try {
-            unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SHOW TABLES");
-            Output(dataSet);
-        } catch (IoTDBException &e) {
-            cout << e.what() << endl;
-        }
-
-        cout << "[Show tables from specific database]\n" << endl;
-        try {
-            unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SHOW TABLES FROM db1");
-            Output(dataSet);
-        } catch (IoTDBException &e) {
-            cout << e.what() << endl;
-        }
-
-        cout << "[InsertTablet]\n" << endl;
-        try {
-            insertRelationalTablet();
-        } catch (IoTDBException &e) {
-            cout << e.what() << endl;
-        }
-
-        cout << "[Query Table Data]\n" << endl;
-        try {
-            unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SELECT * FROM table1"
-                " where region_id = '1' and plant_id in ('3', '5') and device_id = '3'");
-            OutputWithType(dataSet);
-        } catch (IoTDBException &e) {
-            cout << e.what() << endl;
-        }
-
-        session->close();
-
-        // specify database in constructor
-        session = (new TableSessionBuilder())
-            ->host("127.0.0.1")
-            ->rpcPort(6667)
-            ->username("root")
-            ->password("root")
-            ->database("db1")
-            ->build();
-
-        cout << "[Show tables from current database(db1)]\n" << endl;
-        try {
-            unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SHOW TABLES");
-            Output(dataSet);
-        } catch (IoTDBException &e) {
-            cout << e.what() << endl;
-        }
-
-        cout << "[Change database to db2]\n" << endl;
-        try {
-            session->executeNonQueryStatement("USE db2");
-        } catch (IoTDBException &e) {
-            cout << e.what() << endl;
-        }
-
-        cout << "[Show tables from current database(db2)]\n" << endl;
-        try {
-            unique_ptr<SessionDataSet> dataSet = session->executeQueryStatement("SHOW TABLES");
-            Output(dataSet);
-        } catch (IoTDBException &e) {
-            cout << e.what() << endl;
-        }
-
-        cout << "[Drop Database db1,db2]\n" << endl;
-        try {
-            session->executeNonQueryStatement("DROP DATABASE db1");
-            session->executeNonQueryStatement("DROP DATABASE db2");
-        } catch (IoTDBException &e) {
-            cout << e.what() << endl;
-        }
-
-        cout << "session close\n" << endl;
-        session->close();
-
-        cout << "finished!\n" << endl;
-    } catch (IoTDBConnectionException &e) {
-        cout << e.what() << endl;
-    } catch (IoTDBException &e) {
-        cout << e.what() << endl;
-    }
-    return 0;
-}
-```
-
-## 5. FAQ
-
-### 5.1 Thrift Build Issues
-
-1. MAC: if building Thrift locally with Maven hits the problem described at the link below, try downgrading the xcode-commandline version from 12 to 11.5
-   https://stackoverflow.com/questions/63592445/ld-unsupported-tapi-file-type-tapi-tbd-in-yaml-file/65518087#65518087
-
-
-2. 
Windows: building Thrift with Maven uses wget to download remote files, and the following error may occur:
-   ```
-   Failed to delete cached file C:\Users\Administrator\.m2\repository\.cache\download-maven-plugin\index.ser
-   ```
-
-   Solution:
-   - Try deleting the ".m2\repository\\.cache\" directory and retry.
-   - Add "\<skipCache\>true\</skipCache\>" to the download-maven-plugin configuration in the corresponding pom file.
diff --git a/src/zh/UserGuide/Master/Table/API/Programming-Go-Native-API_timecho.md b/src/zh/UserGuide/Master/Table/API/Programming-Go-Native-API_timecho.md
deleted file mode 100644
index ad109f414..000000000
--- a/src/zh/UserGuide/Master/Table/API/Programming-Go-Native-API_timecho.md
+++ /dev/null
@@ -1,598 +0,0 @@
-
-
-# Go Native API
-
-## 1. Usage
-### 1.1 Dependencies
-
-* golang >= 1.13
-* make >= 3.0
-* curl >= 7.1.1
-* thrift 0.15.0
-* Linux, macOS, or another Unix-like system
-* Windows+bash (downloading the IoTDB Go client requires git; WSL, cygwin, or Git Bash all work)
-
-### 1.2 Installation
-
-* Using go mod
-
-```sh
-# Switch to the HOME path of GOPATH and enable Go Modules
-export GO111MODULE=on
-
-# Configure the GOPROXY environment variable
-export GOPROXY=https://goproxy.io
-
-# Create a named directory and switch to it
-mkdir session_example && cd session_example
-
-# Download the example file (-L follows redirects)
-curl -o session_example.go -L https://github.com/apache/iotdb-client-go/raw/main/example/session_example.go
-
-# Initialize the go module environment
-go mod init session_example
-
-# Download dependencies
-go mod tidy
-
-# Build and run the program
-go run session_example.go
-```
-
-* Using GOPATH
-
-```sh
-# get thrift 0.13.0
-go get github.com/apache/thrift@0.13.0
-
-# Create the directory recursively
-mkdir -p $GOPATH/src/iotdb-client-go-example/session_example
-
-# Switch to that directory
-cd $GOPATH/src/iotdb-client-go-example/session_example
-
-# Download the example file (-L follows redirects)
-curl -o session_example.go -L https://github.com/apache/iotdb-client-go/raw/main/example/session_example.go
-
-# Initialize the go module environment
-go mod init
-
-# Download dependencies
-go mod tidy
-
-# Build and run the program
-go run session_example.go
-```
-* Note: do not use a newer client version to connect to an older server version.
-
-## 2. 
ITableSession Interface
-### 2.1 Description
-
-The ITableSession interface defines the basic operations for interacting with IoTDB: inserting data, executing queries, and closing the session. It is not thread-safe.
-
-### 2.2 Method List
-
-The methods defined in the ITableSession interface are described below:
-
-| **Method** | **Description** | **Parameters** | **Return value** | **Error** |
-| --- | --- | --- | --- | --- |
-| `Insert(tablet *Tablet)` | Inserts a Tablet containing time series data into the database | `tablet`: the `Tablet` to insert | `TSStatus`: the status of the execution, defined in the common package. | `error`: an error raised during the operation (such as a connection failure). |
-| `ExecuteNonQueryStatement(sql string)` | Executes a non-query SQL statement, such as a DDL (data definition language) or DML (data manipulation language) command | `sql`: the SQL statement to execute. | Same as above | Same as above |
-| `ExecuteQueryStatement(sql string, timeoutInMs *int64)` | Executes a query SQL statement and returns the result set | `sql`: the query SQL statement to execute. `timeoutInMs`: the query timeout in milliseconds | `SessionDataSet`: the query result data set. | `error`: an error raised during the operation (such as a connection failure). |
-| `Close()` | Closes the session and releases the resources it holds | None | None | `error`: an error raised while closing the connection |
-
-### 2.3 Interface
-1. ITableSession
-
-```go
-// ITableSession defines an interface for interacting with IoTDB tables.
-// It supports operations such as data insertion, executing queries, and closing the session.
-// Implementations of this interface are expected to manage connections and ensure
-// proper resource cleanup.
-//
-// Each method may return an error to indicate issues such as connection errors
-// or execution failures.
-//
-// Since this interface includes a Close method, it is recommended to use
-// defer to ensure the session is properly closed.
-type ITableSession interface {
-
-    // Insert inserts a Tablet into the database.
-    //
-    // Parameters:
-    //  - tablet: A pointer to a Tablet containing time-series data to be inserted.
-    //
-    // Returns:
-    //  - r: A pointer to TSStatus indicating the execution result.
-    //  - err: An error if an issue occurs during the operation, such as a connection error or execution failure.
-    Insert(tablet *Tablet) (r *common.TSStatus, err error)
-
-    // ExecuteNonQueryStatement executes a non-query SQL statement, such as a DDL or DML command.
-    //
-    // Parameters:
-    //  - sql: The SQL statement to execute.
-    //
-    // Returns:
-    //  - r: A pointer to TSStatus indicating the execution result.
-    //  - err: An error if an issue occurs during the operation, such as a connection error or execution failure.
-    ExecuteNonQueryStatement(sql string) (r *common.TSStatus, err error)
-
-    // ExecuteQueryStatement executes a query SQL statement and returns the result set.
-    //
-    // Parameters:
-    //  - sql: The SQL query statement to execute.
-    //  - timeoutInMs: A pointer to the timeout duration in milliseconds for the query execution.
-    //
-    // Returns:
-    //  - result: A pointer to SessionDataSet containing the query results.
-    //  - err: An error if an issue occurs during the operation, such as a connection error or execution failure.
-    ExecuteQueryStatement(sql string, timeoutInMs *int64) (*SessionDataSet, error)
-
-    // Close closes the session, releasing any held resources.
-    //
-    // Returns:
-    //  - err: An error if there is an issue with closing the IoTDB connection.
-    Close() (err error)
-}
-```
-
-2. Constructing a TableSession
-
-* You do not need to set sqlDialect in Config manually; just use the corresponding NewSession function.
-
-```Go
-type Config struct {
-    Host            string
-    Port            string
-    UserName        string
-    Password        string
-    FetchSize       int32
-    TimeZone        string
-    ConnectRetryMax int
-    sqlDialect      string
-    Version         Version
-    Database        string
-}
-
-type ClusterConfig struct {
-    NodeUrls        []string //ip:port
-    UserName        string
-    Password        string
-    FetchSize       int32
-    TimeZone        string
-    ConnectRetryMax int
-    sqlDialect      string
-    Database        string
-}
-
-// NewTableSession creates a new TableSession instance using the provided configuration.
-//
-// Parameters:
-//   - config: The configuration for the session.
-//   - enableRPCCompression: A boolean indicating whether RPC compression is enabled.
-//   - connectionTimeoutInMs: The timeout in milliseconds for establishing a connection.
-//
-// Returns:
-//   - An ITableSession instance if the session is successfully created.
-//   - An error if there is an issue during session initialization.
-func NewTableSession(config *Config, enableRPCCompression bool, connectionTimeoutInMs int) (ITableSession, error)
-
-// NewClusterTableSession creates a new TableSession instance for a cluster setup.
-//
-// Parameters:
-//   - clusterConfig: The configuration for the cluster session.
-//   - enableRPCCompression: A boolean indicating whether RPC compression is enabled.
-//
-// Returns:
-//   - An ITableSession instance if the session is successfully created.
-//   - An error if there is an issue during session initialization.
-func NewClusterTableSession(clusterConfig *ClusterConfig, enableRPCCompression bool) (ITableSession, error)
-```
-
-> Note:
->
-> A TableSession obtained through *NewTableSession or NewClusterTableSession* already has an established connection; no additional open operation is needed.
-
-### 2.4 Example Code
-
-```Go
-package main
-
-import (
-    "flag"
-    "log"
-    "math/rand"
-    "strconv"
-    "time"
-
-    "github.com/apache/iotdb-client-go/v2/client"
-)
-
-func main() {
-    flag.Parse()
-    config := &client.Config{
-        Host:     "127.0.0.1",
-        Port:     "6667",
-        UserName: "root",
-        Password: "root",
-        Database: "test_session",
-    }
-    session, err := client.NewTableSession(config, false, 0)
-    if err != nil {
-        log.Fatal(err)
-    }
-    defer session.Close()
-
-    checkError(session.ExecuteNonQueryStatement("create database test_db"))
-    checkError(session.ExecuteNonQueryStatement("use test_db"))
-    checkError(session.ExecuteNonQueryStatement("create table t1 (tag1 string tag, tag2 string tag, s1 text field, s2 text field)"))
-    insertRelationalTablet(session)
-    showTables(session)
-    query(session)
-}
-
-func getTextValueFromDataSet(dataSet *client.SessionDataSet, columnName string) string {
-    if isNull, err := dataSet.IsNull(columnName); err != nil {
-        log.Fatal(err)
-    } else if isNull {
-        return "null"
-    }
-    v, err := dataSet.GetString(columnName)
-    if err != nil {
-        log.Fatal(err)
-    }
-    return v
-}
-
-func insertRelationalTablet(session client.ITableSession) {
-    tablet, err := client.NewRelationalTablet("t1", []*client.MeasurementSchema{
-        {
-            Measurement: "tag1",
-            DataType:    client.STRING,
-        },
-        {
-            Measurement: "tag2",
-            DataType:    client.STRING,
-        },
-        {
-            Measurement: "s1",
-            DataType:    client.TEXT,
-        },
-        {
-            Measurement: "s2",
-            DataType:    client.TEXT,
-        },
-    }, []client.ColumnCategory{client.TAG, client.TAG, client.FIELD, client.FIELD}, 1024)
-    if err != nil {
-        log.Fatal("Failed to create relational tablet: ", err)
-    }
-    ts := time.Now().UTC().UnixNano() / 1000000
-    for row := 0; row < 16; row++ {
-        ts++
-        tablet.SetTimestamp(ts, row)
-        tablet.SetValueAt("tag1_value_"+strconv.Itoa(row), 0, row)
-        tablet.SetValueAt("tag2_value_"+strconv.Itoa(row), 1, row)
-        tablet.SetValueAt("s1_value_"+strconv.Itoa(row), 2, row)
-        tablet.SetValueAt("s2_value_"+strconv.Itoa(row), 3, row)
-        tablet.RowSize++
-    }
-    checkError(session.Insert(tablet))
-
-    tablet.Reset()
-
-    for row := 0; row < 16; row++ {
-        ts++
-        tablet.SetTimestamp(ts, row)
-        tablet.SetValueAt("tag1_value_1", 0, row)
-        tablet.SetValueAt("tag2_value_1", 1, row)
-        tablet.SetValueAt("s1_value_"+strconv.Itoa(row), 2, row)
-        tablet.SetValueAt("s2_value_"+strconv.Itoa(row), 3, row)
-
-        nullValueColumn := rand.Intn(4)
-        tablet.SetValueAt(nil, nullValueColumn, row)
-        tablet.RowSize++
-    }
-    checkError(session.Insert(tablet))
-}
-
-func showTables(session client.ITableSession) {
-    timeout := int64(2000)
-    dataSet, err := session.ExecuteQueryStatement("show tables", &timeout)
-    if err != nil {
-        log.Fatal(err)
-    }
-    // Defer Close only after the error check, once dataSet is known to be valid.
-    defer dataSet.Close()
-    for {
-        hasNext, err := dataSet.Next()
-        if err != nil {
-            log.Fatal(err)
-        }
-        if !hasNext {
-            break
-        }
-        value, err := dataSet.GetString("TableName")
-        if err != nil {
-            log.Fatal(err)
-        }
-        log.Printf("tableName is %v", value)
-    }
-}
-
-func query(session client.ITableSession) {
-    timeout := int64(2000)
-    dataSet, err := session.ExecuteQueryStatement("select * from t1", &timeout)
-    if err != nil {
-        log.Fatal(err)
-    }
-    // Defer Close only after the error check, once dataSet is known to be valid.
-    defer dataSet.Close()
-    for {
-        hasNext, err := dataSet.Next()
-        if err != nil {
-            log.Fatal(err)
-        }
-        if !hasNext {
-            break
-        }
-        log.Printf("%v %v %v %v", getTextValueFromDataSet(dataSet, "tag1"), getTextValueFromDataSet(dataSet, "tag2"), getTextValueFromDataSet(dataSet, "s1"), getTextValueFromDataSet(dataSet, "s2"))
-    }
-}
-
-func checkError(err error) {
-    if err != nil {
-        log.Fatal(err)
-    }
-}
-```
-
-## 3. TableSessionPool Interface
-### 3.1 Description
-
-TableSessionPool is a pool that manages ITableSession instances. The pool helps reuse connections efficiently and cleans up resources properly when they are no longer needed. The interface defines the basic operations for acquiring a session from the pool and for closing the pool.
-
-### 3.2 Method List
-
-| **Method** | **Description** | **Return value** | **Error** |
-| --- | --- | --- | --- |
-| `GetSession()` | Acquires an ITableSession instance from the pool for interacting with IoTDB. | `ITableSession` instance | `error`: the reason the session could not be acquired |
-| `Close()` | Closes the session pool and releases any held resources. After closing, no new sessions can be acquired from the pool. | None | None |
-
-### 3.3 Interface
-1. TableSessionPool
-
-```Go
-// TableSessionPool manages a pool of ITableSession instances, enabling efficient
-// reuse and management of resources. It provides methods to acquire a session
-// from the pool and to close the pool, releasing all held resources.
-//
-// This implementation ensures proper lifecycle management of sessions,
-// including efficient reuse and cleanup of resources.
-
-// GetSession acquires an ITableSession instance from the pool.
-//
-// Returns:
-//   - A usable ITableSession instance for interacting with IoTDB.
-//   - An error if a session cannot be acquired.
-func (spool *TableSessionPool) GetSession() (ITableSession, error) {
-    return spool.sessionPool.getTableSession()
-}
-
-// Close closes the TableSessionPool, releasing all held resources.
-// Once closed, no further sessions can be acquired from the pool.
-func (spool *TableSessionPool) Close()
-```
-
-2. 
Constructing a TableSessionPool
-
-```Go
-type PoolConfig struct {
-    Host            string
-    Port            string
-    NodeUrls        []string
-    UserName        string
-    Password        string
-    FetchSize       int32
-    TimeZone        string
-    ConnectRetryMax int
-    Database        string
-    sqlDialect      string
-}
-
-// NewTableSessionPool creates a new TableSessionPool with the specified configuration.
-//
-// Parameters:
-//   - conf: PoolConfig defining the configuration for the pool.
-//   - maxSize: The maximum number of sessions the pool can hold.
-//   - connectionTimeoutInMs: Timeout for establishing a connection in milliseconds.
-//   - waitToGetSessionTimeoutInMs: Timeout for waiting to acquire a session in milliseconds.
-//   - enableCompression: A boolean indicating whether to enable compression.
-//
-// Returns:
-//   - A TableSessionPool instance.
-func NewTableSessionPool(conf *PoolConfig, maxSize, connectionTimeoutInMs, waitToGetSessionTimeoutInMs int,
-    enableCompression bool) TableSessionPool
-```
-
-> Note:
->
-> * For a TableSession obtained from a TableSessionPool, if a Database was specified when the pool was created, you do not need to specify the Database again when using the session.
-> * If another database is selected with `use database` during use, the session is automatically reset to the pool's database when it is closed and returned to the TableSessionPool.
-
-### 3.4 Example Code
-
-```go
-package main
-
-import (
-    "log"
-    "strconv"
-    "sync"
-    "sync/atomic"
-    "time"
-
-    "github.com/apache/iotdb-client-go/v2/client"
-)
-
-func main() {
-    sessionPoolWithSpecificDatabaseExample()
-    sessionPoolWithoutSpecificDatabaseExample()
-    putBackToSessionPoolExample()
-}
-
-func putBackToSessionPoolExample() {
-    // should create database test_db before executing
-    config := &client.PoolConfig{
-        Host:     "127.0.0.1",
-        Port:     "6667",
-        UserName: "root",
-        Password: "root",
-        Database: "test_db",
-    }
-    sessionPool := client.NewTableSessionPool(config, 3, 60000, 4000, false)
-    defer sessionPool.Close()
-
-    num := 4
-    successGetSessionNum := int32(0)
-    var wg sync.WaitGroup
-    wg.Add(num)
-    for i := 0; i < num; i++ {
-        dbName := "db" + strconv.Itoa(i)
-        go func() {
-            defer wg.Done()
-            session, err := sessionPool.GetSession()
-            if err != nil {
-                log.Println("Failed to create database "+dbName+" because", err)
-                return
-            }
-            atomic.AddInt32(&successGetSessionNum, 1)
-            defer func() {
-                time.Sleep(6 * time.Second)
-                // put back to session pool
-                session.Close()
-            }()
-            checkError(session.ExecuteNonQueryStatement("create database " + dbName))
-            checkError(session.ExecuteNonQueryStatement("use " + dbName))
-            checkError(session.ExecuteNonQueryStatement("create table table_of_" + dbName + " (tag1 string tag, tag2 string tag, s1 text field, s2 text field)"))
-        }()
-    }
-    wg.Wait()
-    log.Println("success num is", successGetSessionNum)
-
-    log.Println("All session's database have been reset.")
-    // the using database will automatically reset to session pool's database after the session closed
-    wg.Add(5)
-    for i := 0; i < 5; i++ {
-        go func() {
-            defer wg.Done()
-            session, err := sessionPool.GetSession()
-            if err != nil {
-                log.Println("Failed to get session because", err)
-                // Without a session there is nothing to query or close.
-                return
-            }
-            defer session.Close()
-            timeout := int64(3000)
-            dataSet, err := session.ExecuteQueryStatement("show tables", &timeout)
-            if err != nil {
-                log.Println("Failed to show tables because", err)
-                return
-            }
-            for {
-                hasNext, err := dataSet.Next()
-                if err != nil {
-                    log.Fatal(err)
-                }
-                if !hasNext {
-                    break
-                }
-                value, err := dataSet.GetString("TableName")
-                if err != nil {
-                    log.Fatal(err)
-                }
-                log.Println("table is", value)
-            }
-            dataSet.Close()
-        }()
-    }
-    wg.Wait()
-}
-
-func sessionPoolWithSpecificDatabaseExample() {
-    // should create database test_db before executing
-    config := &client.PoolConfig{
-        Host:     "127.0.0.1",
-        Port:     "6667",
-        UserName: "root",
-        Password: "root",
-        Database: "test_db",
-    }
-    sessionPool := client.NewTableSessionPool(config, 3, 60000, 8000, false)
-    defer sessionPool.Close()
-    num := 10
-    var wg sync.WaitGroup
-    wg.Add(num)
-    for i := 0; i < num; i++ {
-        tableName := "t" + strconv.Itoa(i)
-        go func() {
-            defer wg.Done()
-            session, err := sessionPool.GetSession()
-            if err != nil {
-                log.Println("Failed to create table "+tableName+" because", err)
-                return
-            }
-            defer session.Close()
-            checkError(session.ExecuteNonQueryStatement("create table " + tableName + " (tag1 string tag, tag2 string tag, s1 text field, s2 text field)"))
-        }()
-    }
-    wg.Wait()
-}
-
-func sessionPoolWithoutSpecificDatabaseExample() {
-    config := &client.PoolConfig{
-        Host:     "127.0.0.1",
-        Port:     "6667",
-        UserName: "root",
-        Password: "root",
-    }
-    sessionPool := client.NewTableSessionPool(config, 3, 60000, 8000, false)
-    defer sessionPool.Close()
-    num := 10
-    var wg sync.WaitGroup
-    wg.Add(num)
-    for i := 0; i < num; i++ {
-        dbName := "db" + strconv.Itoa(i)
-        go func() {
-            defer wg.Done()
-            session, err := sessionPool.GetSession()
-            if err != nil {
-                log.Println("Failed to create database ", dbName, err)
-                return
-            }
-            defer session.Close()
-            checkError(session.ExecuteNonQueryStatement("create database " + dbName))
-            checkError(session.ExecuteNonQueryStatement("use " + dbName))
-            checkError(session.ExecuteNonQueryStatement("create table t1 (tag1 string tag, tag2 string tag, s1 text field, s2 text field)"))
-        }()
-    }
-    wg.Wait()
-}
-
-func checkError(err error) {
-    if err != nil {
-        log.Fatal(err)
-    }
-}
-```
-
diff --git a/src/zh/UserGuide/Master/Table/API/Programming-JDBC_timecho.md b/src/zh/UserGuide/Master/Table/API/Programming-JDBC_timecho.md
deleted file mode 100644
index 9484025d3..000000000
--- a/src/zh/UserGuide/Master/Table/API/Programming-JDBC_timecho.md
+++ /dev/null
@@ -1,192 +0,0 @@
-
-
-# JDBC
-
-## 1. Overview
-
-The IoTDB JDBC interface provides a standard way to interact with the IoTDB database, allowing users to execute SQL statements from Java programs to manage databases and time series data. It supports connecting to the database, creating, querying, updating, and deleting, as well as batch insertion and querying of time series data.
-
-**Note**: the current JDBC implementation is intended only for connecting third-party tools. JDBC cannot provide high-performance writes (when executing insert statements).
-
-For Java applications, we recommend the Java native API.
-
-## 2. Usage
-
-**Requirements:**
-
-- JDK >= 1.8
-- Maven >= 3.6
-
-**Add the dependency in Maven:**
-
-```XML
-<dependencies>
-    <dependency>
-      <groupId>com.timecho.iotdb</groupId>
-      <artifactId>iotdb-jdbc</artifactId>
-      <version>2.0.1.1</version>
-    </dependency>
-</dependencies>
-```
-Note: do not use a newer client version to connect to an older server version.
-
-## 3. 
Read and Write Operations
-
-### 3.1 Description
-
-- **Write operations**: use the execute method to insert data, create databases, create time series, and so on.
-- **Read operations**: use the executeQuery method to run queries, and read the results through the ResultSet object.
-
-### 3.2 **Method List**
-
-| **Method** | **Description** | **Parameters** | **Return value** |
-| --- | --- | --- | --- |
-| Class.forName(String driver) | Loads the JDBC driver class | driver: name of the JDBC driver class | Class: the loaded class object |
-| DriverManager.getConnection(String url, String username, String password) | Establishes a database connection | url: database URL; username: database username; password: database password | Connection: database connection object |
-| Connection.createStatement() | Creates a Statement object for executing SQL statements | None | Statement: SQL statement execution object |
-| Statement.execute(String sql) | Executes an SQL statement; used for non-query statements | sql: the SQL statement to execute | boolean: indicates whether a ResultSet object is returned |
-| Statement.executeQuery(String sql) | Executes a query SQL statement and returns the result set | sql: the query SQL statement to execute | ResultSet: query result set |
-| ResultSet.getMetaData() | Gets the metadata of the result set | None | ResultSetMetaData: result set metadata object |
-| ResultSet.next() | Moves to the next row of the result set | None | boolean: whether the move to the next row succeeded |
-| ResultSet.getString(int columnIndex) | Gets the string value of the specified column | columnIndex: column index (starting from 1) | String: the string value of the column |
-
-## 4. Example Code
-
-**Note: to use the table model, the sql_dialect parameter in the url must be set to table.**
-
-```Java
-String url = "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table";
-```
-
-JDBC example code: [src/main/java/org/apache/iotdb/TableModelJDBCExample.java](https://github.com/apache/iotdb/blob/rc/2.0.1/example/jdbc/src/main/java/org/apache/iotdb/TableModelJDBCExample.java)
-
-
-```Java
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *   http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- */
-
-package org.apache.iotdb;
-
-import org.apache.iotdb.jdbc.IoTDBSQLException;
-
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.sql.Connection;
-import java.sql.DriverManager;
-import java.sql.ResultSet;
-import java.sql.ResultSetMetaData;
-import java.sql.SQLException;
-import java.sql.Statement;
-
-public class TableModelJDBCExample {
-
-  private static final Logger LOGGER = LoggerFactory.getLogger(TableModelJDBCExample.class);
-
-  public static void main(String[] args) throws ClassNotFoundException, SQLException {
-    Class.forName("org.apache.iotdb.jdbc.IoTDBDriver");
-
-    // don't specify database in url
-    try (Connection connection =
-            DriverManager.getConnection(
-                "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table", "root", "TimechoDB@2021"); // the default password was root before V2.0.6.x
-        Statement statement = connection.createStatement()) {
-
-      statement.execute("CREATE DATABASE test1");
-      statement.execute("CREATE DATABASE test2");
-
-      statement.execute("use test2");
-
-      // or use full qualified table name
-      statement.execute(
-          "create table test1.table1(region_id STRING TAG, plant_id STRING TAG, device_id STRING TAG, model STRING ATTRIBUTE, temperature FLOAT FIELD, humidity DOUBLE FIELD) with (TTL=3600000)");
-
-      statement.execute(
-          "create table table2(region_id STRING TAG, plant_id STRING TAG, color STRING ATTRIBUTE, temperature FLOAT FIELD, speed DOUBLE FIELD) with (TTL=6600000)");
-
-      // show tables from current database
-      try (ResultSet resultSet = statement.executeQuery("SHOW TABLES")) {
-        ResultSetMetaData metaData = resultSet.getMetaData();
-        System.out.println(metaData.getColumnCount());
-        while (resultSet.next()) {
-          System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2));
-        }
-      }
-
-      // show tables by specifying another database
-      // using SHOW tables FROM
-      try (ResultSet resultSet = statement.executeQuery("SHOW TABLES FROM test1")) {
-        ResultSetMetaData metaData = resultSet.getMetaData();
-        System.out.println(metaData.getColumnCount());
-        while (resultSet.next()) {
-          System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2));
-        }
-      }
-
-    } catch (IoTDBSQLException e) {
-      LOGGER.error("IoTDB Jdbc example error", e);
-    }
-
-    // specify database in url
-    try (Connection connection =
-            DriverManager.getConnection(
-                "jdbc:iotdb://127.0.0.1:6667/test1?sql_dialect=table", "root", "TimechoDB@2021"); // the default password was root before V2.0.6.x
-        Statement statement = connection.createStatement()) {
-      // show tables from current database test1
-      try (ResultSet resultSet = statement.executeQuery("SHOW TABLES")) {
-        ResultSetMetaData metaData = resultSet.getMetaData();
-        System.out.println(metaData.getColumnCount());
-        while (resultSet.next()) {
-          System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2));
-        }
-      }
-
-      // change database to test2
-      statement.execute("use test2");
-
-      try (ResultSet resultSet = statement.executeQuery("SHOW TABLES")) {
-        ResultSetMetaData metaData = resultSet.getMetaData();
-        System.out.println(metaData.getColumnCount());
-        while (resultSet.next()) {
-          System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2));
-        }
-      }
-    }
-  }
-}
-```
\ No newline at end of file
diff --git a/src/zh/UserGuide/Master/Table/API/Programming-Java-Native-API_timecho.md b/src/zh/UserGuide/Master/Table/API/Programming-Java-Native-API_timecho.md
deleted file mode 100644
index a17dab3c0..000000000
--- a/src/zh/UserGuide/Master/Table/API/Programming-Java-Native-API_timecho.md
+++ /dev/null
@@ -1,855 +0,0 @@
-
-
-# Java Native API
-
-## 1. 
功能介绍 - -IoTDB具备Java原生客户端驱动和对应的连接池,提供对象化接口,可以直接组装时序对象进行写入,无需拼装 SQL。推荐使用连接池,多线程并行操作数据库。 - -## 2. 使用方式 - -**环境要求:** - -- JDK >= 1.8 -- Maven >= 3.6 - -**在maven中添加依赖:** - -```XML - - - com.timecho.iotdb - iotdb-session - - ${project.version} - - -``` -* 可从[此处](https://repo1.maven.org/maven2/com/timecho/iotdb/iotdb-session/)查看`iotdb-session`最新版本 -* 注意:请勿使用高版本客户端连接低版本服务。 - -## 3. 读写操作 - -### 3.1 ITableSession接口 - -#### 3.1.1 功能描述 - -ITableSession接口定义了与IoTDB交互的基本操作,可以执行数据插入、查询操作以及关闭会话等,非线程安全。 - -#### 3.1.2 方法列表 - -以下是ITableSession接口中定义的方法及其详细说明: - -| **方法名** | **描述** | **参数** | **返回值** | **返回异常** | -| --------------------------------------------------- | ------------------------------------------------------------ | ----------------------------------------------------------- | -------------- | --------------------------------------------------- | -| insert(Tablet tablet) | 将一个包含时间序列数据的Tablet 对象插入到数据库中 | tablet: 要插入的Tablet对象 | 无 | StatementExecutionExceptionIoTDBConnectionException | -| executeNonQueryStatement(String sql) | 执行非查询SQL语句,如DDL(数据定义语言)或DML(数据操作语言)命令 | sql: 要执行的SQL语句。 | 无 | StatementExecutionExceptionIoTDBConnectionException | -| executeQueryStatement(String sql) | 执行查询SQL语句,并返回包含查询结果的SessionDataSet对象 | sql: 要执行的查询SQL语句。 | SessionDataSet | StatementExecutionExceptionIoTDBConnectionException | -| executeQueryStatement(String sql, long timeoutInMs) | 执行查询SQL语句,并设置查询超时时间(以毫秒为单位) | sql: 要执行的查询SQL语句。timeoutInMs: 查询超时时间(毫秒) | SessionDataSet | StatementExecutionException | -| close() | 关闭会话,释放所持有的资源 | 无 | 无 | IoTDBConnectionException | - -**关于 Object 数据类型的说明:** - -自 V2.0.8 起,`iTableSession.insert(Tablet tablet)`接口支持将单个 Object 类文件拆成多段后按顺序分段写入。当 Tablet 数据结构中列数据类型为 **`TSDataType.Object`​ ​**时,需要使用如下方法向 Tablet 填值。 - -```Java -/* -rowIndex:tablet 行位置 -columnIndex:tablet 列位置 -isEOF:本次写入内容是否为 Object 文件的最后一部分 -offset:本次写入的内容在 Object 文件中的起始偏移量 -content:本次写入的文件内容 -写入时需要确保分段的多个 byte[] 总长度与原始 Object 大小相等,否则会导致写入的数据大小不正确 -*/ -void addValue(int rowIndex, int columnIndex, boolean 
isEOF, long offset, byte[] content) -``` - -查询时,支持通过`Field.getStringValue`、`Field.getObjectValue`、`SessionDataSet.DataIterator.getObject`、`SessionDataSet.DataIterator.getString` 四种方法进行获取,其返回内容均为以 (Object) 开头且以对象大小结尾的字符串(String 类型),形如:(Object) XX.XX KB 。 - - -#### 3.1.3 接口展示 - -``` java -/** - * This interface defines a session for interacting with IoTDB tables. - * It supports operations such as data insertion, executing queries, and closing the session. - * Implementations of this interface are expected to manage connections and ensure - * proper resource cleanup. - * - *

Each method may throw exceptions to indicate issues such as connection errors or - * execution failures. - * - *

Since this interface extends {@link AutoCloseable}, it is recommended to use - * try-with-resources to ensure the session is properly closed. - */ -public interface ITableSession extends AutoCloseable { - - /** - * Inserts a {@link Tablet} into the database. - * - * @param tablet the tablet containing time-series data to be inserted. - * @throws StatementExecutionException if an error occurs while executing the statement. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. - */ - void insert(Tablet tablet) throws StatementExecutionException, IoTDBConnectionException; - - /** - * Executes a non-query SQL statement, such as a DDL or DML command. - * - * @param sql the SQL statement to execute. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. - * @throws StatementExecutionException if an error occurs while executing the statement. - */ - void executeNonQueryStatement(String sql) throws IoTDBConnectionException, StatementExecutionException; - - /** - * Executes a query SQL statement and returns the result set. - * - * @param sql the SQL query statement to execute. - * @return a {@link SessionDataSet} containing the query results. - * @throws StatementExecutionException if an error occurs while executing the statement. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. - */ - SessionDataSet executeQueryStatement(String sql) - throws StatementExecutionException, IoTDBConnectionException; - - /** - * Executes a query SQL statement with a specified timeout and returns the result set. - * - * @param sql the SQL query statement to execute. - * @param timeoutInMs the timeout duration in milliseconds for the query execution. - * @return a {@link SessionDataSet} containing the query results. - * @throws StatementExecutionException if an error occurs while executing the statement. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. 
-   */
-  SessionDataSet executeQueryStatement(String sql, long timeoutInMs)
-      throws StatementExecutionException, IoTDBConnectionException;
-
-  /**
-   * Closes the session, releasing any held resources.
-   *
-   * @throws IoTDBConnectionException if there is an issue with closing the IoTDB connection.
-   */
-  @Override
-  void close() throws IoTDBConnectionException;
-}
-```
-
-### 3.2 TableSessionBuilder Class
-
-#### 3.2.1 Description
-
-`TableSessionBuilder` is a builder for configuring and creating instances of the `ITableSession` interface. It lets developers set connection parameters, query parameters, and security features.
-
-#### 3.2.2 Configuration Options
-
-The configuration options available in `TableSessionBuilder`, with their default values:
-
-| **Option** | **Description** | **Default** |
-| ---------------------------------------------------- | ---------------------------------------- |---------------------------------------------|
-| nodeUrls(List`<String>` nodeUrls) | Sets the list of node URLs of the IoTDB cluster | Collections.singletonList("localhost:6667") |
-| username(String username) | Sets the username for the connection | "root" |
-| password(String password) | Sets the password for the connection | "TimechoDB@2021" (the default was "root" before V2.0.6.x) |
-| database(String database) | Sets the target database name | null |
-| queryTimeoutInMs(long queryTimeoutInMs) | Sets the query timeout (milliseconds) | 60000 (1 minute) |
-| fetchSize(int fetchSize) | Sets the fetch size for query results | 5000 |
-| zoneId(ZoneId zoneId) | Sets the ZoneId used for time zone handling | ZoneId.systemDefault() |
-| thriftDefaultBufferSize(int thriftDefaultBufferSize) | Sets the default buffer size of the Thrift client (bytes) | 1024 (1 KB) |
-| thriftMaxFrameSize(int thriftMaxFrameSize) | Sets the maximum frame size of the Thrift client (bytes) | 64 * 1024 * 1024 (64 MB) |
-| enableRedirection(boolean enableRedirection) | Whether to enable redirection for cluster nodes | true |
-| enableAutoFetch(boolean enableAutoFetch) | Whether to automatically fetch available DataNodes | true |
-| maxRetryCount(int maxRetryCount) | Sets the maximum number of retries for connection attempts | 60 |
-| retryIntervalInMs(long retryIntervalInMs) | Sets the interval between retries (milliseconds) | 500 |
-| useSSL(boolean useSSL) | Whether to enable SSL for secure connections | false |
-| trustStore(String keyStore) | Sets the trust store path for SSL connections | null |
-| trustStorePwd(String keyStorePwd) | Sets the trust store password for SSL connections | null |
-| enableCompression(boolean enableCompression) | Whether to enable RPC compression | false |
-| connectionTimeoutInMs(int connectionTimeoutInMs) | Sets the connection timeout (milliseconds) | 0 (no timeout) |
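To make a few of the defaults in the table above concrete, the sketch below works out what they imply. It uses only the JDK and does not call the IoTDB client. `SessionDefaults`, `retryWindow`, and `splitNodeUrl` are hypothetical helper names introduced for illustration, and the idea that every retry waits the full `retryIntervalInMs` is an assumption of this sketch, not something the table states.

```java
import java.time.Duration;

/** Illustrative helper, not part of the IoTDB client API. */
final class SessionDefaults {

  // Default values copied from the configuration table above.
  static final int MAX_RETRY_COUNT = 60;
  static final long RETRY_INTERVAL_MS = 500L;
  static final int THRIFT_MAX_FRAME_SIZE = 64 * 1024 * 1024;

  private SessionDefaults() {}

  /**
   * Upper bound on the time spent waiting between connection retries,
   * assuming every retry waits the full interval (an assumption of this sketch).
   */
  public static Duration retryWindow(int maxRetryCount, long retryIntervalInMs) {
    return Duration.ofMillis(Math.multiplyExact((long) maxRetryCount, retryIntervalInMs));
  }

  /** Splits a "host:port" node URL such as "localhost:6667" into host and port. */
  public static String[] splitNodeUrl(String nodeUrl) {
    int idx = nodeUrl.lastIndexOf(':');
    if (idx <= 0 || idx == nodeUrl.length() - 1) {
      throw new IllegalArgumentException("expected host:port, got: " + nodeUrl);
    }
    String host = nodeUrl.substring(0, idx);
    String port = nodeUrl.substring(idx + 1);
    Integer.parseInt(port); // throws NumberFormatException if the port is not numeric
    return new String[] {host, port};
  }

  public static void main(String[] args) {
    System.out.println(retryWindow(MAX_RETRY_COUNT, RETRY_INTERVAL_MS).getSeconds() + " s");
    System.out.println(THRIFT_MAX_FRAME_SIZE + " bytes");
    System.out.println(String.join(" / ", splitNodeUrl("localhost:6667")));
  }
}
```

Under that assumption, the default `maxRetryCount` of 60 and `retryIntervalInMs` of 500 give a retry window of 30 seconds, and the default Thrift frame limit is 67,108,864 bytes. If connection failures need to surface faster, lowering either retry setting is the lever to try first.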
-
-#### 3.2.3 Interface Definition
-
-``` java
-/**
- * A builder class for constructing instances of {@link ITableSession}.
- *
- * <p>This builder provides a fluent API for configuring various options such as connection
- * settings, query parameters, and security features.
- *
- * <p>All configurations have reasonable default values, which can be overridden as needed.
- */
-public class TableSessionBuilder {
-
-  /**
-   * Builds and returns a configured {@link ITableSession} instance.
-   *
-   * @return a fully configured {@link ITableSession}.
-   * @throws IoTDBConnectionException if an error occurs while establishing the connection.
-   */
-  public ITableSession build() throws IoTDBConnectionException;
-
-  /**
-   * Sets the list of node URLs for the IoTDB cluster.
-   *
-   * @param nodeUrls a list of node URLs.
-   * @return the current {@link TableSessionBuilder} instance.
-   * @defaultValue Collections.singletonList("localhost:6667")
-   */
-  public TableSessionBuilder nodeUrls(List<String> nodeUrls);
-
-  /**
-   * Sets the username for the connection.
-   *
-   * @param username the username.
-   * @return the current {@link TableSessionBuilder} instance.
-   * @defaultValue "root"
-   */
-  public TableSessionBuilder username(String username);
-
-  /**
-   * Sets the password for the connection.
-   *
-   * @param password the password.
-   * @return the current {@link TableSessionBuilder} instance.
-   * @defaultValue "TimechoDB@2021" (the default was "root" before V2.0.6.x)
-   */
-  public TableSessionBuilder password(String password);
-
-  /**
-   * Sets the target database name.
-   *
-   * @param database the database name.
-   * @return the current {@link TableSessionBuilder} instance.
-   * @defaultValue null
-   */
-  public TableSessionBuilder database(String database);
-
-  /**
-   * Sets the query timeout in milliseconds.
-   *
-   * @param queryTimeoutInMs the query timeout in milliseconds.
-   * @return the current {@link TableSessionBuilder} instance.
-   * @defaultValue 60000 (1 minute)
-   */
-  public TableSessionBuilder queryTimeoutInMs(long queryTimeoutInMs);
-
-  /**
-   * Sets the fetch size for query results.
-   *
-   * @param fetchSize the fetch size.
-   * @return the current {@link TableSessionBuilder} instance.
- * @defaultValue 5000 - */ - public TableSessionBuilder fetchSize(int fetchSize); - - /** - * Sets the {@link ZoneId} for timezone-related operations. - * - * @param zoneId the {@link ZoneId}. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue ZoneId.systemDefault() - */ - public TableSessionBuilder zoneId(ZoneId zoneId); - - /** - * Sets the default init buffer size for the Thrift client. - * - * @param thriftDefaultBufferSize the buffer size in bytes. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 1024 (1 KB) - */ - public TableSessionBuilder thriftDefaultBufferSize(int thriftDefaultBufferSize); - - /** - * Sets the maximum frame size for the Thrift client. - * - * @param thriftMaxFrameSize the maximum frame size in bytes. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 64 * 1024 * 1024 (64 MB) - */ - public TableSessionBuilder thriftMaxFrameSize(int thriftMaxFrameSize); - - /** - * Enables or disables redirection for cluster nodes. - * - * @param enableRedirection whether to enable redirection. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue true - */ - public TableSessionBuilder enableRedirection(boolean enableRedirection); - - /** - * Enables or disables automatic fetching of available DataNodes. - * - * @param enableAutoFetch whether to enable automatic fetching. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue true - */ - public TableSessionBuilder enableAutoFetch(boolean enableAutoFetch); - - /** - * Sets the maximum number of retries for connection attempts. - * - * @param maxRetryCount the maximum retry count. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 60 - */ - public TableSessionBuilder maxRetryCount(int maxRetryCount); - - /** - * Sets the interval between retries in milliseconds. - * - * @param retryIntervalInMs the interval in milliseconds. 
-   * @return the current {@link TableSessionBuilder} instance.
-   * @defaultValue 500 milliseconds
-   */
-  public TableSessionBuilder retryIntervalInMs(long retryIntervalInMs);
-
-  /**
-   * Enables or disables SSL for secure connections.
-   *
-   * @param useSSL whether to enable SSL.
-   * @return the current {@link TableSessionBuilder} instance.
-   * @defaultValue false
-   */
-  public TableSessionBuilder useSSL(boolean useSSL);
-
-  /**
-   * Sets the trust store path for SSL connections.
-   *
-   * @param keyStore the trust store path.
-   * @return the current {@link TableSessionBuilder} instance.
-   * @defaultValue null
-   */
-  public TableSessionBuilder trustStore(String keyStore);
-
-  /**
-   * Sets the trust store password for SSL connections.
-   *
-   * @param keyStorePwd the trust store password.
-   * @return the current {@link TableSessionBuilder} instance.
-   * @defaultValue null
-   */
-  public TableSessionBuilder trustStorePwd(String keyStorePwd);
-
-  /**
-   * Enables or disables rpc compression for the connection.
-   *
-   * @param enableCompression whether to enable compression.
-   * @return the current {@link TableSessionBuilder} instance.
-   * @defaultValue false
-   */
-  public TableSessionBuilder enableCompression(boolean enableCompression);
-
-  /**
-   * Sets the connection timeout in milliseconds.
-   *
-   * @param connectionTimeoutInMs the connection timeout in milliseconds.
-   * @return the current {@link TableSessionBuilder} instance.
-   * @defaultValue 0 (no timeout)
-   */
-  public TableSessionBuilder connectionTimeoutInMs(int connectionTimeoutInMs);
-}
-```
-
-> Note: when creating a table through the native API, table or column names that contain special characters or Chinese characters do not need to be wrapped in extra double quotes; if quotes are added, they become part of the name.
-
-## 4. Client Connection Pool
-
-### 4.1 ITableSessionPool Interface
-
-#### 4.1.1 Description
-
-`ITableSessionPool` is a pool that manages `ITableSession` instances. It allows connections to be reused efficiently and resources to be cleaned up properly when they are no longer needed. The interface defines the basic operations for acquiring a session from the pool and for closing the pool.
-
-#### 4.1.2 Methods
-
-| **Method** | **Description** | **Return value** | **Exceptions** |
-| ------------ | ------------------------------------------------------------ | ------------------ | ------------------------ |
-| getSession() | Acquires an ITableSession instance from the pool for interacting with IoTDB. | ITableSession instance | IoTDBConnectionException |
-| close() | Closes the session pool and releases any held resources. After closing, no new sessions can be acquired from the pool. | None | None |
-
-#### 4.1.3 Interface Definition
-
-```Java
-/**
- * This interface defines a pool for managing {@link ITableSession} instances.
- * It provides methods to acquire a session from the pool and to close the pool.
- *
- * <p>The implementation should handle the lifecycle of sessions, ensuring efficient
- * reuse and proper cleanup of resources.
- */
-public interface ITableSessionPool {
-
-  /**
-   * Acquires an {@link ITableSession} instance from the pool.
-   *
-   * @return an {@link ITableSession} instance for interacting with the IoTDB.
-   * @throws IoTDBConnectionException if there is an issue obtaining a session from the pool.
-   */
-  ITableSession getSession() throws IoTDBConnectionException;
-
-  /**
-   * Closes the session pool, releasing any held resources.
-   *
-   * <p>Once the pool is closed, no further sessions can be acquired.
-   */
-  void close();
-}
-```
-
-### 4.2 TableSessionPoolBuilder Class
-
-#### 4.2.1 Description
-
-A builder for `TableSessionPool`, used to configure and create `ITableSessionPool` instances. It lets developers configure connection parameters, session parameters, and pooling behavior.
-
-#### 4.2.2 Configuration Options
-
-The configuration options available in `TableSessionPoolBuilder`, with their default values:
-
-| **Option** | **Description** | **Default** |
-| ------------------------------------------------------------ | -------------------------------------------- |---------------------------------------------|
-| nodeUrls(List`<String>` nodeUrls) | Sets the list of node URLs of the IoTDB cluster | Collections.singletonList("localhost:6667") |
-| maxSize(int maxSize) | Sets the maximum size of the session pool, i.e. the maximum number of sessions allowed in the pool | 5 |
-| user(String user) | Sets the username for the connection | "root" |
-| password(String password) | Sets the password for the connection | "TimechoDB@2021" (the default was "root" before V2.0.6.x) |
-| database(String database) | Sets the target database name | "root" |
-| queryTimeoutInMs(long queryTimeoutInMs) | Sets the query timeout (milliseconds) | 60000 (1 minute) |
-| fetchSize(int fetchSize) | Sets the fetch size for query results | 5000 |
-| zoneId(ZoneId zoneId) | Sets the ZoneId used for time zone handling | ZoneId.systemDefault() |
-| waitToGetSessionTimeoutInMs(long waitToGetSessionTimeoutInMs) | Sets the timeout for acquiring a session from the pool (milliseconds) | 30000 (30 seconds) |
-| thriftDefaultBufferSize(int thriftDefaultBufferSize) | Sets the default buffer size of the Thrift client (bytes) | 1024 (1 KB) |
-| thriftMaxFrameSize(int thriftMaxFrameSize) | Sets the maximum frame size of the Thrift client (bytes) | 64 * 1024 * 1024 (64 MB) |
-| enableCompression(boolean enableCompression) | Whether to enable compression for the connection | false |
-| enableRedirection(boolean enableRedirection) | Whether to enable redirection for cluster nodes | true |
-| connectionTimeoutInMs(int connectionTimeoutInMs) | Sets the connection timeout (milliseconds) | 10000 (10 seconds) |
-| enableAutoFetch(boolean enableAutoFetch) | Whether to automatically fetch available DataNodes | true |
-| maxRetryCount(int maxRetryCount) | Sets the maximum number of retries for connection attempts | 60 |
-| retryIntervalInMs(long retryIntervalInMs) | Sets the interval between retries (milliseconds) | 500 |
-| useSSL(boolean useSSL) | Whether to enable SSL for secure connections | false |
-| trustStore(String keyStore) | Sets the trust store path for SSL connections | null |
-| trustStorePwd(String keyStorePwd) | Sets the trust store password for SSL connections | null |
-
-#### 4.2.3 Interface Definition
-
-```Java
-/**
- * A builder class for constructing instances of {@link
ITableSessionPool}.
- *
- * <p>This builder provides a fluent API for configuring a session pool, including
- * connection settings, session parameters, and pool behavior.
- *
- * <p>All configurations have reasonable default values, which can be overridden as needed.
- */
-public class TableSessionPoolBuilder {
-
-  /**
-   * Builds and returns a configured {@link ITableSessionPool} instance.
-   *
-   * @return a fully configured {@link ITableSessionPool}.
-   */
-  public ITableSessionPool build();
-
-  /**
-   * Sets the list of node URLs for the IoTDB cluster.
-   *
-   * @param nodeUrls a list of node URLs.
-   * @return the current {@link TableSessionPoolBuilder} instance.
-   * @defaultValue Collections.singletonList("localhost:6667")
-   */
-  public TableSessionPoolBuilder nodeUrls(List<String> nodeUrls);
-
-  /**
-   * Sets the maximum size of the session pool.
-   *
-   * @param maxSize the maximum number of sessions allowed in the pool.
-   * @return the current {@link TableSessionPoolBuilder} instance.
-   * @defaultValue 5
-   */
-  public TableSessionPoolBuilder maxSize(int maxSize);
-
-  /**
-   * Sets the username for the connection.
-   *
-   * @param user the username.
-   * @return the current {@link TableSessionPoolBuilder} instance.
-   * @defaultValue "root"
-   */
-  public TableSessionPoolBuilder user(String user);
-
-  /**
-   * Sets the password for the connection.
-   *
-   * @param password the password.
-   * @return the current {@link TableSessionPoolBuilder} instance.
-   * @defaultValue "TimechoDB@2021" (the default was "root" before V2.0.6.x)
-   */
-  public TableSessionPoolBuilder password(String password);
-
-  /**
-   * Sets the target database name.
-   *
-   * @param database the database name.
-   * @return the current {@link TableSessionPoolBuilder} instance.
-   * @defaultValue "root"
-   */
-  public TableSessionPoolBuilder database(String database);
-
-  /**
-   * Sets the query timeout in milliseconds.
-   *
-   * @param queryTimeoutInMs the query timeout in milliseconds.
-   * @return the current {@link TableSessionPoolBuilder} instance.
-   * @defaultValue 60000 (1 minute)
-   */
-  public TableSessionPoolBuilder queryTimeoutInMs(long queryTimeoutInMs);
-
-  /**
-   * Sets the fetch size for query results.
- * - * @param fetchSize the fetch size. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 5000 - */ - public TableSessionPoolBuilder fetchSize(int fetchSize); - - /** - * Sets the {@link ZoneId} for timezone-related operations. - * - * @param zoneId the {@link ZoneId}. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue ZoneId.systemDefault() - */ - public TableSessionPoolBuilder zoneId(ZoneId zoneId); - - /** - * Sets the timeout for waiting to acquire a session from the pool. - * - * @param waitToGetSessionTimeoutInMs the timeout duration in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 30000 (30 seconds) - */ - public TableSessionPoolBuilder waitToGetSessionTimeoutInMs(long waitToGetSessionTimeoutInMs); - - /** - * Sets the default buffer size for the Thrift client. - * - * @param thriftDefaultBufferSize the buffer size in bytes. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 1024 (1 KB) - */ - public TableSessionPoolBuilder thriftDefaultBufferSize(int thriftDefaultBufferSize); - - /** - * Sets the maximum frame size for the Thrift client. - * - * @param thriftMaxFrameSize the maximum frame size in bytes. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 64 * 1024 * 1024 (64 MB) - */ - public TableSessionPoolBuilder thriftMaxFrameSize(int thriftMaxFrameSize); - - /** - * Enables or disables compression for the connection. - * - * @param enableCompression whether to enable compression. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue false - */ - public TableSessionPoolBuilder enableCompression(boolean enableCompression); - - /** - * Enables or disables redirection for cluster nodes. - * - * @param enableRedirection whether to enable redirection. - * @return the current {@link TableSessionPoolBuilder} instance. 
- * @defaultValue true - */ - public TableSessionPoolBuilder enableRedirection(boolean enableRedirection); - - /** - * Sets the connection timeout in milliseconds. - * - * @param connectionTimeoutInMs the connection timeout in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 10000 (10 seconds) - */ - public TableSessionPoolBuilder connectionTimeoutInMs(int connectionTimeoutInMs); - - /** - * Enables or disables automatic fetching of available DataNodes. - * - * @param enableAutoFetch whether to enable automatic fetching. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue true - */ - public TableSessionPoolBuilder enableAutoFetch(boolean enableAutoFetch); - - /** - * Sets the maximum number of retries for connection attempts. - * - * @param maxRetryCount the maximum retry count. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 60 - */ - public TableSessionPoolBuilder maxRetryCount(int maxRetryCount); - - /** - * Sets the interval between retries in milliseconds. - * - * @param retryIntervalInMs the interval in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 500 milliseconds - */ - public TableSessionPoolBuilder retryIntervalInMs(long retryIntervalInMs); - - /** - * Enables or disables SSL for secure connections. - * - * @param useSSL whether to enable SSL. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue false - */ - public TableSessionPoolBuilder useSSL(boolean useSSL); - - /** - * Sets the trust store path for SSL connections. - * - * @param keyStore the trust store path. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue null - */ - public TableSessionPoolBuilder trustStore(String keyStore); - - /** - * Sets the trust store password for SSL connections. - * - * @param keyStorePwd the trust store password. 
-   * @return the current {@link TableSessionPoolBuilder} instance.
-   * @defaultValue null
-   */
-  public TableSessionPoolBuilder trustStorePwd(String keyStorePwd);
-}
-```
-
-## 5. Example Code
-
-Session example: [src/main/java/org/apache/iotdb/TableModelSessionExample.java](https://github.com/apache/iotdb/blob/master/example/session/src/main/java/org/apache/iotdb/TableModelSessionExample.java)
-
-SessionPool example: [src/main/java/org/apache/iotdb/TableModelSessionPoolExample.java](https://github.com/apache/iotdb/blob/master/example/session/src/main/java/org/apache/iotdb/TableModelSessionPoolExample.java)
-
-```Java
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements. See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership. The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License. You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied. See the License for the
- * specific language governing permissions and limitations
- * under the License.
- */
-
-package org.apache.iotdb;
-
-import org.apache.iotdb.isession.ITableSession;
-import org.apache.iotdb.isession.SessionDataSet;
-import org.apache.iotdb.isession.pool.ITableSessionPool;
-import org.apache.iotdb.rpc.IoTDBConnectionException;
-import org.apache.iotdb.rpc.StatementExecutionException;
-import org.apache.iotdb.session.pool.TableSessionPoolBuilder;
-
-import org.apache.tsfile.enums.ColumnCategory;
-import org.apache.tsfile.enums.TSDataType;
-import org.apache.tsfile.write.record.Tablet;
-
-import java.util.ArrayList;
-import java.util.Arrays;
-import java.util.Collections;
-import java.util.List;
-
-import static org.apache.iotdb.SessionExample.printDataSet;
-
-public class TableModelSessionPoolExample {
-
-  private static final String LOCAL_URL = "127.0.0.1:6667";
-
-  public static void main(String[] args) {
-
-    // don't specify database in constructor
-    ITableSessionPool tableSessionPool =
-        new TableSessionPoolBuilder()
-            .nodeUrls(Collections.singletonList(LOCAL_URL))
-            .user("root")
-            .password("TimechoDB@2021") // the default password was "root" before V2.0.6.x
-            .maxSize(1)
-            .build();
-
-    try (ITableSession session = tableSessionPool.getSession()) {
-
-      session.executeNonQueryStatement("CREATE DATABASE test1");
-      session.executeNonQueryStatement("CREATE DATABASE test2");
-
-      session.executeNonQueryStatement("use test2");
-
-      // or use a fully qualified table name
-      session.executeNonQueryStatement(
-          "create table test1.table1("
-              + "region_id STRING TAG, "
-              + "plant_id STRING TAG, "
-              + "device_id STRING TAG, "
-              + "model STRING ATTRIBUTE, "
-              + "temperature FLOAT FIELD, "
-              + "humidity DOUBLE FIELD) with (TTL=3600000)");
-
-      session.executeNonQueryStatement(
-          "create table table2("
-              + "region_id STRING TAG, "
-              + "plant_id STRING TAG, "
-              + "color STRING ATTRIBUTE, "
-              + "temperature FLOAT FIELD, "
-              + "speed DOUBLE FIELD) with (TTL=6600000)");
-
-      // show tables from current database
-      try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) {
-        printDataSet(dataSet);
-      }
-
-      // show tables by specifying another database
-      // using SHOW tables FROM
-      try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES FROM test1")) {
-        printDataSet(dataSet);
-      }
-
-      // insert table data by tablet
-      List<String> columnNameList =
-          Arrays.asList("region_id", "plant_id", "device_id", "model", "temperature", "humidity");
-      List<TSDataType> dataTypeList =
-          Arrays.asList(
-              TSDataType.STRING,
-              TSDataType.STRING,
-              TSDataType.STRING,
-              TSDataType.STRING,
-              TSDataType.FLOAT,
-              TSDataType.DOUBLE);
-      List<ColumnCategory> columnTypeList =
-          new ArrayList<>(
-              Arrays.asList(
-                  ColumnCategory.TAG,
-                  ColumnCategory.TAG,
-                  ColumnCategory.TAG,
-                  ColumnCategory.ATTRIBUTE,
-                  ColumnCategory.FIELD,
-                  ColumnCategory.FIELD));
-      Tablet tablet = new Tablet("test1", columnNameList, dataTypeList, columnTypeList, 100);
-      for (long timestamp = 0; timestamp < 100; timestamp++) {
-        int rowIndex = tablet.getRowSize();
-        tablet.addTimestamp(rowIndex, timestamp);
-        tablet.addValue("region_id", rowIndex, "1");
-        tablet.addValue("plant_id", rowIndex, "5");
-        tablet.addValue("device_id", rowIndex, "3");
-        tablet.addValue("model", rowIndex, "A");
-        tablet.addValue("temperature", rowIndex, 37.6F);
-        tablet.addValue("humidity", rowIndex, 111.1);
-        if (tablet.getRowSize() == tablet.getMaxRowNumber()) {
-          session.insert(tablet);
-          tablet.reset();
-        }
-      }
-      if (tablet.getRowSize() != 0) {
-        session.insert(tablet);
-        tablet.reset();
-      }
-
-      // query table data
-      try (SessionDataSet dataSet =
-          session.executeQueryStatement(
-              "select * from test1 "
-                  + "where region_id = '1' and plant_id in ('3', '5') and device_id = '3'")) {
-        printDataSet(dataSet);
-      }
-
-    } catch (IoTDBConnectionException e) {
-      e.printStackTrace();
-    } catch (StatementExecutionException e) {
-      e.printStackTrace();
-    } finally {
-      tableSessionPool.close();
-    }
-
-    // specify database in constructor
-    tableSessionPool =
-        new TableSessionPoolBuilder()
-            .nodeUrls(Collections.singletonList(LOCAL_URL))
-            .user("root")
-            .password("TimechoDB@2021") // the default password was "root" before V2.0.6.x
-            .maxSize(1)
-            .database("test1")
-            .build();
-
-    try (ITableSession session = tableSessionPool.getSession()) {
-
-      // show tables from current database
-      try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) {
-        printDataSet(dataSet);
-      }
-
-      // change database to test2
-      session.executeNonQueryStatement("use test2");
-
-      // show tables from the new current database (test2)
-      try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) {
-        printDataSet(dataSet);
-      }
-
-    } catch (IoTDBConnectionException e) {
-      e.printStackTrace();
-    } catch (StatementExecutionException e) {
-      e.printStackTrace();
-    }
-
-    try (ITableSession session = tableSessionPool.getSession()) {
-
-      // show tables from default database test1
-      try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) {
-        printDataSet(dataSet);
-      }
-
-    } catch (IoTDBConnectionException e) {
-      e.printStackTrace();
-    } catch (StatementExecutionException e) {
-      e.printStackTrace();
-    } finally {
-      tableSessionPool.close();
-    }
-  }
-}
-```
\ No newline at end of file
diff --git a/src/zh/UserGuide/Master/Table/API/Programming-MQTT_timecho.md b/src/zh/UserGuide/Master/Table/API/Programming-MQTT_timecho.md
deleted file mode 100644
index 9d9073502..000000000
--- a/src/zh/UserGuide/Master/Table/API/Programming-MQTT_timecho.md
+++ /dev/null
@@ -1,264 +0,0 @@
-
-# MQTT Protocol
-
-## 1. Overview
-
-MQTT is a lightweight messaging protocol designed for the Internet of Things (IoT) and low-bandwidth environments. It is based on the publish/subscribe (Pub/Sub) model and supports efficient, reliable two-way communication between devices. Its core goals are low power consumption, low bandwidth usage, and low latency, which makes it well suited to unstable networks and resource-constrained devices such as sensors and mobile devices.
-
-IoTDB integrates MQTT natively and is fully compatible with MQTT v3.1 (an OASIS international standard). The IoTDB server has a built-in high-performance MQTT broker, so no third-party middleware is needed, and devices can write time series data directly into the IoTDB storage engine through MQTT messages.
-
-![](/img/mqtt-table-1.png)
-
-Note: starting from V2.0.8.2, the TimechoDB installation package no longer includes the MQTT service JAR by default. Before using the service, contact the Timecho team to obtain the JAR and place it under timechodb_home/lib or timechodb_home/ext/external_service.
-
-## 2. 
Configuration
-
-By default, the IoTDB MQTT service loads its configuration from `${IOTDB_HOME}/${IOTDB_CONF}/iotdb-system.properties`.
-
-The configuration items are:
-
-| **Name** | **Description** | **Default** |
-|---------------------------| ------------------------------------------------------------ | ---------------- |
-| `enable_mqtt_service` | Whether to enable the MQTT service | FALSE |
-| `mqtt_host` | Host the MQTT service binds to | 127.0.0.1 |
-| `mqtt_port` | Port the MQTT service binds to | 1883 |
-| `mqtt_handler_pool_size` | Size of the handler pool that processes MQTT messages | 1 |
-| **`mqtt_payload_formatter`** | **MQTT message payload formatter. Options: `json` (tree model only), `line` (table model only).** | **json** |
-| `mqtt_max_message_size` | Maximum MQTT message length (bytes) | 1048576 |
-
-## 3. Write Protocol
-
-* Line protocol syntax
-
-```JavaScript
-<measurement>[,<tag_key>=<tag_value>[,<tag_key>=<tag_value>]][ <attribute_key>=<attribute_value>[,<attribute_key>=<attribute_value>]] <field_key>=<field_value>[,<field_key>=<field_value>] [<timestamp>]
-```
-
-* Example
-
-```JavaScript
-myMeasurement,tag1=value1,tag2=value2 attr1=value1,attr2=value2 fieldKey="fieldValue" 1556813561098000000
-```
-
-![](/img/mqtt-table-2.png)
-
-
-## 4. Naming Conventions
-
-* Database name
-
-The MQTT topic name is split on `/`, and the first segment is used as the database name.
-
-```Properties
-topic: stock/Legacy
-databaseName: stock
-
-
-topic: stock/Legacy/#
-databaseName:stock
-```
-
-* Table name
-
-The table name is the `<measurement>` in the line protocol.
-
-* Type identifiers
-
-| Field value | IoTDB data type |
-|--------------------------------------------------------------| ---------------- |
-| 1<br>1.12 | DOUBLE |
-| 1`f`<br>1.12`f` | FLOAT |
-| 1`i`<br>123`i` | INT64 |
-| 1`u`<br>123`u` | INT64 |
-| 1`i32`<br>123`i32` | INT32 |
-| `"xxx"` | TEXT |
-| `t`,`T`,`true`,`True`,`TRUE`<br>`f`,`F`,`false`,`False`,`FALSE` | BOOLEAN |
-
-
-## 5. Code Example
-The following example shows an MQTT client sending messages to the IoTDB server.
-
-```java
-MQTT mqtt = new MQTT();
-mqtt.setHost("127.0.0.1", 1883);
-mqtt.setUserName("root");
-mqtt.setPassword("root");
-
-BlockingConnection connection = mqtt.blockingConnection();
-String DATABASE = "myMqttTest";
-connection.connect();
-
-String payload =
-    "test1,tag1=t1,tag2=t2 attr3=a5,attr4=a4 field1=\"fieldValue1\",field2=1i,field3=1u 1";
-connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false);
-Thread.sleep(10);
-
-payload = "test1,tag1=t1,tag2=t2 field4=2,field5=2i32,field6=2f 2";
-connection.publish(DATABASE, payload.getBytes(), QoS.AT_LEAST_ONCE, false);
-Thread.sleep(10);
-
-payload = "# It's a remark\n " + "test1,tag1=t1,tag2=t2 field4=2,field5=2i32,field6=2f 6";
-connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false);
-Thread.sleep(10);
-
-// batch write example
-payload =
-    "test1,tag1=t1,tag2=t2 field7=t,field8=T,field9=true 3 \n "
-        + "test1,tag1=t1,tag2=t2 field7=f,field8=F,field9=FALSE 4";
-connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false);
-Thread.sleep(10);
-
-// batch write example
-payload =
-    "test1,tag1=t1,tag2=t2 attr1=a1,attr2=a2 field1=\"fieldValue1\",field2=1i,field3=1u 4 \n "
-        + "test1,tag1=t1,tag2=t2 field4=2,field5=2i32,field6=2f 5";
-connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false);
-Thread.sleep(10);
-
-connection.disconnect();
-```
-
-
-## 6. Customizing the MQTT Message Format
-
-The MQTT message format can be customized with a small amount of code.
-A simple example is available in the [example/mqtt-customize](https://github.com/apache/iotdb/tree/master/example/mqtt-customize) project of the source tree.
-
-Steps:
-1. Create a Java project and add the following dependency
-```xml
-<dependency>
-    <groupId>org.apache.iotdb</groupId>
-    <artifactId>iotdb-server</artifactId>
-    <version>2.0.4-SNAPSHOT</version>
-</dependency>
-```
-2. 
Create a class that implements the `org.apache.iotdb.db.protocol.mqtt.PayloadFormatter` interface
-
-```java
-package org.apache.iotdb.mqtt.server;
-
-import io.netty.buffer.ByteBuf;
-import org.apache.iotdb.db.protocol.mqtt.Message;
-import org.apache.iotdb.db.protocol.mqtt.PayloadFormatter;
-// The two imports below were missing; their packages are assumed to match
-// the Message import above and the session example's TSDataType import.
-import org.apache.iotdb.db.protocol.mqtt.TableMessage;
-import org.apache.tsfile.enums.TSDataType;
-
-import java.nio.charset.StandardCharsets;
-import java.util.ArrayList;
-import java.util.Arrays;
-import java.util.List;
-
-public class CustomizedLinePayloadFormatter implements PayloadFormatter {
-
-  @Override
-  public List<Message> format(String topic, ByteBuf payload) {
-    // Suppose the payload is a line format
-    if (payload == null) {
-      return null;
-    }
-
-    String line = payload.toString(StandardCharsets.UTF_8);
-    // parse data from the line and generate Messages and put them into List<Message> ret
-    List<Message> ret = new ArrayList<>();
-    // this is just an example, so we just generate some Messages directly
-    for (int i = 0; i < 3; i++) {
-      long ts = i;
-      TableMessage message = new TableMessage();
-
-      // Parsing Database Name
-      message.setDatabase("db" + i);
-
-      // Parsing Table Names
-      message.setTable("t" + i);
-
-      // Parsing Tags
-      List<String> tagKeys = new ArrayList<>();
-      tagKeys.add("tag1" + i);
-      tagKeys.add("tag2" + i);
-      List<Object> tagValues = new ArrayList<>();
-      tagValues.add("t_value1" + i);
-      tagValues.add("t_value2" + i);
-      message.setTagKeys(tagKeys);
-      message.setTagValues(tagValues);
-
-      // Parsing Attributes
-      List<String> attributeKeys = new ArrayList<>();
-      List<Object> attributeValues = new ArrayList<>();
-      attributeKeys.add("attr1" + i);
-      attributeKeys.add("attr2" + i);
-      attributeValues.add("a_value1" + i);
-      attributeValues.add("a_value2" + i);
-      message.setAttributeKeys(attributeKeys);
-      message.setAttributeValues(attributeValues);
-
-      // Parsing Fields
-      List<String> fields = Arrays.asList("field1" + i, "field2" + i);
-      List<TSDataType> dataTypes = Arrays.asList(TSDataType.FLOAT, TSDataType.FLOAT);
-      List<Object> values = Arrays.asList("4.0" + i, "5.0" + i);
-      message.setFields(fields);
-      message.setDataTypes(dataTypes);
-      message.setValues(values);
-
-      // Parsing timestamp
-      message.setTimestamp(ts);
-      ret.add(message);
-    }
-    return ret;
-  }
-
-  @Override
-  public String getName() {
-    // set the value of mqtt_payload_formatter in iotdb-system.properties as the following string:
-    return "CustomizedLine";
-  }
-}
-```
-
-
-3. Edit the `src/main/resources/META-INF/services/org.apache.iotdb.db.protocol.mqtt.PayloadFormatter` file in the project:
-   clear the existing content and write the fully qualified name (package.class) of the implementation class into it. Note that the file contains only this one line.
-   In this example, the file content is: `org.apache.iotdb.mqtt.server.CustomizedLinePayloadFormatter`
-4. Build the project into a jar: `mvn package -DskipTests`
-
-
-On the IoTDB server:
-1. Create the ${IOTDB_HOME}/ext/mqtt/ folder and put the jar into it.
-2. Enable the MQTT service (`enable_mqtt_service=true` in `conf/iotdb-system.properties`).
-3. Set `mqtt_payload_formatter` in `conf/iotdb-system.properties` to the value returned by `getName()` in the implementation class; in this example, `CustomizedLine`.
-4. Start IoTDB.
-5. Done.
-
-More: MQTT message payloads are not limited to the line format; any binary data can be used. The raw bytes are available through
-`payload.forEachByte()` or `payload.array()`.
-
-
-## 7. Notes
-
-To avoid compatibility problems caused by a default client_id, it is strongly recommended that every MQTT client always provide a unique, non-empty client_id explicitly.
-Clients behave differently when the client_id is missing or empty. Common examples:
-1. An empty string is passed explicitly
-• MQTTX: with client_id="", IoTDB silently discards the message;
-• mosquitto_pub: with client_id="", IoTDB receives the message normally.
-2. No client_id is passed at all
-• MQTTX: IoTDB receives the message normally;
-• mosquitto_pub: IoTDB rejects the connection.
-Explicitly specifying a unique, non-empty client_id is therefore the simplest way to remove these differences and ensure reliable message delivery.
\ No newline at end of file
diff --git a/src/zh/UserGuide/Master/Table/API/Programming-ODBC_timecho.md b/src/zh/UserGuide/Master/Table/API/Programming-ODBC_timecho.md
deleted file mode 100644
index 6326752d8..000000000
--- a/src/zh/UserGuide/Master/Table/API/Programming-ODBC_timecho.md
+++ /dev/null
@@ -1,1083 +0,0 @@
-
-
-# ODBC
-
-## 1. Introduction
-
-The IoTDB ODBC driver lets applications interact with the database through the standard ODBC interface and manage data in the time series database over an ODBC connection. It currently supports connecting to the database and querying, inserting, updating, and deleting data, and it works with applications and toolchains that support the ODBC protocol.
-
-> Note: this feature is supported from V2.0.8.2 onward.
-
-## 2. 
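Usage
-
-Installing from the precompiled binary package is recommended: no compilation is needed, and a script handles driver installation and system registration. Only Windows is currently supported.
-
-### 2.1 Requirements
-
-Only the operating system's ODBC driver manager is required; no build environment is needed:
-
-| **Operating system** | **Requirements and installation** |
-| -------------------- |------------------------------------------------------------------------------------------------------------------------------------|
-| Windows | 1. **Windows 10/11, Server 2016/2019/2022**: ships with the ODBC 17/18 driver manager; no extra installation needed<br>2. **Windows 8.1/Server 2012 R2**: the corresponding ODBC driver manager must be installed manually |
-
-### 2.2 Installation
-
-1. Contact the Timecho team to obtain the precompiled binary package
-
-Directory layout of the binary package:
-
-```Plain
-├── bin/
-│   ├── apache_iotdb_odbc.dll
-│   └── install_driver.exe
-├── install.bat
-└── registry.bat
-```
-
-2. Open a command-line tool (CMD/PowerShell) **as administrator** and run the following command (the path can be replaced with any absolute path):
-
-```Bash
-install.bat "C:\Program Files\Apache IoTDB ODBC Driver"
-```
-
-The script does the following automatically:
-
-* Creates the installation directory if it does not exist
-* Copies `bin\apache_iotdb_odbc.dll` to the specified installation directory
-* Calls `install_driver.exe` to register the driver with the system through the standard ODBC API (`SQLInstallDriverEx`)
-
-3. Verify the installation: open the "ODBC Data Source Administrator". If `Apache IoTDB ODBC Driver` is listed on the "Drivers" tab, registration succeeded.
-
-![](/img/odbc-1.png)
-
-### 2.3 Uninstallation
-
-1. Open a command prompt as administrator and `cd` into the project root directory.
-2. Run the uninstall script:
-
-```Bash
-uninstall.bat
-```
-
-The script calls `install_driver.exe` to unregister the driver from the system through the standard ODBC API (`SQLRemoveDriver`). The DLL in the installation directory is not deleted automatically; remove it manually if needed.
-
-### 2.4 Connection Configuration
-
-After the driver is installed, a data source (DSN) must be configured before applications can connect to the database by DSN name. The IoTDB ODBC driver supports two ways of supplying connection parameters: a data source or a connection string.
-
-#### 2.4.1 Configuring a Data Source
-
-**Using the ODBC Data Source Administrator**
-
-1. Open the "ODBC Data Source Administrator", switch to the "User DSN" tab, and click "Add".
-
-![](/img/odbc-2.png)
-
-2. In the driver list that appears, select "Apache IoTDB ODBC Driver" and click "Finish".
-
-![](/img/odbc-3.png)
-
-3. In the data source configuration dialog, fill in the connection parameters, then click OK:
-
-![](/img/odbc-4.png)
-
-The fields in the dialog have the following meanings:
-
-| **Area** | **Field** | **Description** |
-| ---------------- | ----------------- | ----------------------------------------------------------------------------------------------------------------- |
-| Data Source | DSN Name | Name of the data source; applications refer to the data source by this name |
-| Data Source | Description | Description of the data source (optional) |
-| Connection | Server | IP address of the IoTDB server, default 127.0.0.1 |
-| Connection | Port | IoTDB Session API port, default 6667 |
-| Connection | User | Username, default root |
-| Connection | Password | Password, default root |
-| Options | Table Model | When checked, the table model is used; when unchecked, the tree model is used |
-| Options | Database | Database name. Available only in table model mode; greyed out and not editable in tree model mode |
-| Options | Log Level | Log level (0-4): 0=OFF, 1=ERROR, 2=WARN, 3=INFO, 4=TRACE |
-| Options | Session Timeout | Session timeout in milliseconds; 0 means no timeout. Note that the server-side queryTimeoutThreshold defaults to 60000 ms, so a longer timeout also requires changing the server configuration |
-| Options | Batch Size | Number of rows fetched per request, default 1000. A value of 0 is reset to the default |
-
-4. After filling in the fields, click "Test Connection" to test the connection. The test uses the parameters currently entered to connect to the IoTDB server and run a `SHOW VERSION` query. On success the server version is displayed; on failure the specific error is shown.
-5. Once the parameters are confirmed, click "OK" to save. The data source then appears in the "User DSN" list, such as the data source named 123 in the figure below.
-
-![](/img/odbc-5.png)
-
-To change an existing data source, select it in the list and click "Configure" to edit it again.
-
-#### 2.4.2 Connection String
-
-The connection string is a list of **semicolon-separated key-value pairs**, for example:
-
-```Bash
-Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;database=testdb;isTableModel=true;loglevel=2
-```
-
-The fields are described in the following table:
-
-| **Field** | **Description** | **Allowed values** | **Default** |
-| --------------------------- | ---------------------------------- |------------------------------------------------------------------------------------------------------------------------------| --------------------------------- |
-| DSN | Data source name | Custom data source name | - |
-| uid | Database username | Any string | root |
-| pwd | Database password | Any string | root |
-| server | IoTDB server address | IP address | 127.0.0.1 |
-| port | IoTDB server port | Port number | 6667 |
-| database | Database name (effective only in table model mode) | Any string | Empty string |
-| loglevel | Log level | Integer (0-4) | 4 (LOG\_LEVEL\_TRACE) |
-| isTableModel / tablemodel | Whether to enable table model mode | Boolean, accepts several spellings:<br>1. 0, false, no, off: set to false;<br>2. 1, true, yes, on: set to true;<br>3. any other value is treated as true. | true |
-| sessiontimeoutms | Session timeout (milliseconds) | 64-bit integer, default `LLONG_MAX`; a value of `0` is replaced with `LLONG_MAX`. Note that the server has its own timeout setting, `private long queryTimeoutThreshold = 60000;`, which must be changed to obtain a timeout longer than 60 seconds. | LLONG\_MAX |
-| batchsize | Number of rows fetched per batch | 64-bit integer, default `1000`; a value of `0` is replaced with `1000` | 1000 |
-
-Notes:
-
-* Field names are case-insensitive (they are converted to lowercase before comparison)
-* The connection string is a list of semicolon-separated key-value pairs, in the format shown above
-* Boolean fields (isTableModel) accept several spellings
-* All fields are optional; defaults are used for any field that is not specified
-* Unsupported fields are ignored and a warning is written to the log; they do not affect the connection
-* The default port 6667 is the default port of IoTDB's C++ Session interface, which this ODBC driver uses to exchange data with IoTDB. If the IoTDB server exposes the C++ Session interface on a different port, change the port in the ODBC connection string accordingly.
-
-#### 2.4.3 How Data Source Settings and the Connection String Interact
-
-Settings saved in the ODBC Data Source Administrator are written as key-value pairs into the system's ODBC data source configuration (on Windows, the registry key `HKEY_CURRENT_USER\SOFTWARE\ODBC\ODBC.INI`). When an application uses `SQLConnect`, or specifies `DSN=<data source name>` in the connection string, the driver reads these parameters from the system configuration.
-
-**The connection string takes precedence over the settings saved in the DSN.** The rules are:
-
-1. If the connection string contains `DSN=xxx` and does not contain `DRIVER=...`, the driver first loads all parameters of that DSN from the system configuration as base values.
-2. Parameters given explicitly in the connection string then override the DSN parameters of the same name.
-3. If the connection string contains `DRIVER=...`, no DSN parameters are read from the system configuration and the connection string alone is used.
-
-For example, if the DSN is configured with `Server=192.168.1.100` and `Port=6667` but the connection string is `DSN=MyDSN;Server=127.0.0.1`, the connection uses `Server=127.0.0.1` (overridden by the connection string) and `Port=6667` (from the DSN).
-
-### 2.5 Logging
-
-Runtime log output falls into two categories, the driver's own log and the ODBC manager trace log. Keep in mind the performance impact of the log level.
-
-#### 2.5.1 Driver Log
-
-* Location: `apache_iotdb_odbc.log` in the user's home directory;
-* Log level: set with `loglevel` in the connection string (0-4; higher levels produce more detail);
-* Performance impact: high log levels noticeably slow the driver down, so use them only for debugging.
-
-#### 2.5.2 ODBC Manager Trace Log
-
-* How to enable: open the "ODBC Data Source Administrator" → "Tracing" → "Start Tracing Now";
-* Caution: enabling tracing greatly reduces driver performance; use it only for troubleshooting.
-
-## 3. 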
Interface Support
-
-### 3.1 Method List
-
-The driver supports the following ODBC standard APIs:
-
-| ODBC/Setup API | Function | Parameter list | Parameter description |
-| -------------- | -------- | -------------- | --------------------- |
-| SQLAllocHandle | Allocates an ODBC handle | (SQLSMALLINT HandleType, SQLHANDLE InputHandle, SQLHANDLE \*OutputHandle) | HandleType: type of handle to allocate (ENV/DBC/STMT/DESC);<br>InputHandle: parent context handle;<br>OutputHandle: pointer that receives the new handle |
-| SQLBindCol | Binds a column to a result buffer | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN \*StrLen\_or\_Ind) | StatementHandle: statement handle;<br>ColumnNumber: column number;<br>TargetType: C data type;<br>TargetValue: data buffer;<br>BufferLength: buffer length;<br>StrLen\_or\_Ind: returned data length or NULL indicator |
-| SQLColAttribute | Gets column attribute information | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLUSMALLINT FieldIdentifier, SQLPOINTER CharacterAttribute, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength, SQLLEN \*NumericAttribute) | StatementHandle: statement handle;<br>ColumnNumber: column number;<br>FieldIdentifier: attribute ID;<br>CharacterAttribute: character attribute output;<br>BufferLength: buffer length;<br>StringLength: returned length;<br>NumericAttribute: numeric attribute output |
-| SQLColumns | Queries table column information | (SQLHSTMT StatementHandle, SQLCHAR \*CatalogName, SQLSMALLINT NameLength1, SQLCHAR \*SchemaName, SQLSMALLINT NameLength2, SQLCHAR \*TableName, SQLSMALLINT NameLength3, SQLCHAR \*ColumnName, SQLSMALLINT NameLength4) | StatementHandle: statement handle;<br>Catalog/Schema/Table/ColumnName: names of the objects to query;<br>NameLength\*: length of the corresponding name |
-| SQLConnect | Establishes a database connection | (SQLHDBC ConnectionHandle, SQLCHAR \*ServerName, SQLSMALLINT NameLength1, SQLCHAR \*UserName, SQLSMALLINT NameLength2, SQLCHAR \*Authentication, SQLSMALLINT NameLength3) | ConnectionHandle: connection handle;<br>ServerName: data source name;<br>UserName: user name;<br>Authentication: password;<br>NameLength\*: string lengths |
-| SQLDescribeCol | Describes a column in the result set | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLCHAR \*ColumnName, SQLSMALLINT BufferLength, SQLSMALLINT \*NameLength, SQLSMALLINT \*DataType, SQLULEN \*ColumnSize, SQLSMALLINT \*DecimalDigits, SQLSMALLINT \*Nullable) | StatementHandle: statement handle;<br>ColumnNumber: column number;<br>ColumnName: column name output;<br>BufferLength: buffer length;<br>NameLength: returned column name length;<br>DataType: SQL type;<br>ColumnSize: column size;<br>DecimalDigits: decimal digits;<br>Nullable: whether NULL is allowed |
-| SQLDisconnect | Closes the database connection | (SQLHDBC ConnectionHandle) | ConnectionHandle: connection handle |
-| SQLDriverConnect | Establishes a connection using a connection string | (SQLHDBC ConnectionHandle, SQLHWND WindowHandle, SQLCHAR \*InConnectionString, SQLSMALLINT StringLength1, SQLCHAR \*OutConnectionString, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength2, SQLUSMALLINT DriverCompletion) | ConnectionHandle: connection handle;<br>WindowHandle: window handle;<br>InConnectionString: input connection string;<br>StringLength1: input length;<br>OutConnectionString: output connection string;<br>BufferLength: output buffer length;<br>StringLength2: returned length;<br>DriverCompletion: connection prompt mode |
-| SQLEndTran | Commits or rolls back a transaction | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT CompletionType) | HandleType: handle type;<br>Handle: connection or environment handle;<br>CompletionType: commit or roll back |
-| SQLExecDirect | Executes an SQL statement directly | (SQLHSTMT StatementHandle, SQLCHAR \*StatementText, SQLINTEGER TextLength) | StatementHandle: statement handle;<br>StatementText: SQL text;<br>TextLength: length of the SQL text |
-| SQLFetch | Fetches the next row of the result set | (SQLHSTMT StatementHandle) | StatementHandle: statement handle |
-| SQLFreeHandle | Frees an ODBC handle | (SQLSMALLINT HandleType, SQLHANDLE Handle) | HandleType: handle type;<br>Handle: handle to free |
-| SQLFreeStmt | Frees statement-related resources | (SQLHSTMT StatementHandle, SQLUSMALLINT Option) | StatementHandle: statement handle;<br>Option: release option (close cursor, reset parameters, etc.) |
-| SQLGetConnectAttr | Gets a connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER \*StringLength) | ConnectionHandle: connection handle;<br>Attribute: attribute ID;<br>Value: returned attribute value;<br>BufferLength: buffer length;<br>StringLength: returned length |
-| SQLGetData | Retrieves result data | (SQLHSTMT StatementHandle, SQLUSMALLINT Col\_or\_Param\_Num, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN \*StrLen\_or\_Ind) | StatementHandle: statement handle;<br>Col\_or\_Param\_Num: column number;<br>TargetType: C type;<br>TargetValue: data buffer;<br>BufferLength: buffer size;<br>StrLen\_or\_Ind: returned length or NULL indicator |
-| SQLGetDiagField | Gets a diagnostic field | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLSMALLINT DiagIdentifier, SQLPOINTER DiagInfo, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength) | HandleType: handle type;<br>Handle: handle;<br>RecNumber: record number;<br>DiagIdentifier: diagnostic field ID;<br>DiagInfo: output information;<br>BufferLength: buffer length;<br>StringLength: returned length |
-| SQLGetDiagRec | Gets a diagnostic record | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLCHAR \*Sqlstate, SQLINTEGER \*NativeError, SQLCHAR \*MessageText, SQLSMALLINT BufferLength, SQLSMALLINT \*TextLength) | HandleType: handle type;<br>Handle: handle;<br>RecNumber: record number;<br>Sqlstate: SQLSTATE code;<br>NativeError: native error code;<br>MessageText: error message;<br>BufferLength: buffer length;<br>TextLength: returned length |
-| SQLGetInfo | Gets database information | (SQLHDBC ConnectionHandle, SQLUSMALLINT InfoType, SQLPOINTER InfoValue, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength) | ConnectionHandle: connection handle;<br>InfoType: information type;<br>InfoValue: returned value;<br>BufferLength: buffer length;<br>StringLength: returned length |
-| SQLGetStmtAttr | Gets a statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER \*StringLength) | StatementHandle: statement handle;<br>Attribute: attribute ID;<br>Value: returned value;<br>BufferLength: buffer length;<br>StringLength: returned length |
-| SQLGetTypeInfo | Gets data type information | (SQLHSTMT StatementHandle, SQLSMALLINT DataType) | StatementHandle: statement handle;<br>DataType: SQL data type |
-| SQLMoreResults | Checks for more result sets | (SQLHSTMT StatementHandle) | StatementHandle: statement handle |
-| SQLNumResultCols | Gets the number of columns in the result set | (SQLHSTMT StatementHandle, SQLSMALLINT \*ColumnCount) | StatementHandle: statement handle;<br>ColumnCount: returned column count |
-| SQLRowCount | Gets the number of affected rows | (SQLHSTMT StatementHandle, SQLLEN \*RowCount) | StatementHandle: statement handle;<br>RowCount: returned number of affected rows |
-| SQLSetConnectAttr | Sets a connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | ConnectionHandle: connection handle;<br>Attribute: attribute ID;<br>Value: attribute value;<br>StringLength: length of the attribute value |
-| SQLSetEnvAttr | Sets an environment attribute | (SQLHENV EnvironmentHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | EnvironmentHandle: environment handle;<br>Attribute: attribute ID;<br>Value: attribute value;<br>StringLength: length |
-| SQLSetStmtAttr | Sets a statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | StatementHandle: statement handle;<br>Attribute: attribute ID;<br>Value: attribute value;<br>StringLength: length |
-| SQLTables | Queries table information | (SQLHSTMT StatementHandle, SQLCHAR \*CatalogName, SQLSMALLINT NameLength1, SQLCHAR \*SchemaName, SQLSMALLINT NameLength2, SQLCHAR \*TableName, SQLSMALLINT NameLength3, SQLCHAR \*TableType, SQLSMALLINT NameLength4) | StatementHandle: statement handle;<br>Catalog/Schema/TableName: table name;<br>TableType: table type;<br>NameLength\*: corresponding lengths |
-
-### 3.2 Data Type Conversion
-
-IoTDB data types map to ODBC standard data types as follows:
-
-| **IoTDB data type** | **ODBC data type** |
-| ------------------- | ------------------ |
-| BOOLEAN | SQL\_BIT |
-| INT32 | SQL\_INTEGER |
-| INT64 | SQL\_BIGINT |
-| FLOAT | SQL\_REAL |
-| DOUBLE | SQL\_DOUBLE |
-| TEXT | SQL\_VARCHAR |
-| STRING | SQL\_VARCHAR |
-| BLOB | SQL\_LONGVARBINARY |
-| TIMESTAMP | SQL\_BIGINT |
-| DATE | SQL\_DATE |
-
-## 4. Examples
-
-This chapter provides examples for **C#**, **Python**, **C++**, **PowerBI**, and **Excel** that cover all data types and core operations such as querying, inserting, and deleting data.
-
-### 4.1 C# Example
-
-```C#
-/*******
-Note: When the output contains Chinese characters, it may cause garbled text.
-This is because the table.Write() function cannot output strings in UTF-8 encoding
-and can only output using GB2312 (or another system default encoding). This issue
-may not occur in software like Power BI; it also does not occur when using the
-Console.WriteLine function. This is an issue with the ConsoleTables package.
-*****/
-using System.Data.Common;
-using System.Data.Odbc;
-using System.Reflection.PortableExecutable;
-using ConsoleTables;
-using System;
-
-/// Runs a SELECT query and prints the contents of fulltable as a table
-void Query(OdbcConnection dbConnection)
-{
-    try
-    {
-        using (OdbcCommand dbCommand = dbConnection.CreateCommand())
-        {
-            dbCommand.CommandText = "select * from fulltable";
-            using (OdbcDataReader dbReader = dbCommand.ExecuteReader())
-            {
-                var fCount = dbReader.FieldCount;
-                Console.WriteLine($"fCount = {fCount}");
-                // Build the header (keep only the last segment of dotted names)
-                var columns = new string[fCount];
-                for (var i = 0; i < fCount; i++)
-                {
-                    var fName = dbReader.GetName(i);
-                    if (fName.Contains('.'))
-                    {
-                        fName = fName.Substring(fName.LastIndexOf('.') + 1);
-                    }
-                    columns[i] = fName;
-                }
-                // Collect the rows
-                var table = new ConsoleTable(columns);
-                while (dbReader.Read())
-                {
-                    var row = new object[fCount];
-                    for (var i = 0; i < fCount; i++)
-                    {
-                        if (dbReader.IsDBNull(i))
-                        {
-                            row[i] = null;
-                            continue;
-                        }
-                        row[i] = dbReader.GetValue(i);
-                    }
-                    table.AddRow(row);
-                }
-                
table.Write();
-                Console.WriteLine();
-            }
-        }
-    }
-    catch (Exception ex)
-    {
-        Console.WriteLine(ex.ToString());
-    }
-}
-
-/// Runs a non-query SQL statement (CREATE DATABASE, CREATE TABLE, INSERT, etc.)
-void Execute(OdbcConnection dbConnection, string command)
-{
-    try
-    {
-        using (OdbcCommand dbCommand = dbConnection.CreateCommand())
-        {
-            try
-            {
-                dbCommand.CommandText = command;
-                Console.WriteLine($"Execute command: {command}");
-                dbCommand.ExecuteNonQuery();
-            }
-            catch (Exception ex)
-            {
-                Console.WriteLine($"CommandText error: {ex.Message}");
-            }
-        }
-    }
-    catch (OdbcException ex)
-    {
-        Console.WriteLine($"Database error: {ex.Message}");
-    }
-    catch (Exception ex)
-    {
-        Console.WriteLine($"Unknown error: {ex.Message}");
-    }
-}
-
-var dsn = "Apache IoTDB DSN";
-var user = "root";
-var password = "root";
-var server = "127.0.0.1";
-var database = "test";
-var connectionString = $"DSN={dsn};Server={server};UID={user};PWD={password};Database={database};loglevel=4";
-
-using (OdbcConnection dbConnection = new OdbcConnection(connectionString))
-{
-    Console.WriteLine($"Start");
-    try
-    {
-        dbConnection.Open();
-    }
-    catch (Exception ex)
-    {
-        Console.WriteLine($"Login failed: {ex.Message}");
-        Console.WriteLine($"Stack Trace: {ex.StackTrace}");
-        dbConnection.Dispose();
-        return;
-    }
-    // OdbcConnection.Driver is the name of the ODBC driver in use, not the database
-    Console.WriteLine($"Successfully opened connection. driver = {dbConnection.Driver}");
-    Execute(dbConnection, "CREATE DATABASE IF NOT EXISTS test");
-    Execute(dbConnection, "use test");
-    Console.WriteLine("use test Execute complete. 
Begin to setup fulltable."); - - Execute(dbConnection, "CREATE TABLE IF NOT EXISTS fullTable (time TIMESTAMP TIME, bool_col BOOLEAN FIELD, int32_col INT32 FIELD, int64_col INT64 FIELD, float_col FLOAT FIELD, double_col DOUBLE FIELD, text_col TEXT FIELD, string_col STRING FIELD, blob_col BLOB FIELD, timestamp_col TIMESTAMP FIELD, date_col DATE FIELD) WITH (TTL=315360000000)"); - string[] insertStatements = new string[] - { - "INSERT INTO fulltable VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689600000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689660000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689720000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, '设备温度偏高告警', '设备A-机房1', '0x506C616E7444617462', 1735689780000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, '设备状态恢复正常', '设备A-机房1', '0x506C616E7444617461', 1735689840000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, '设备运行状态正常', '设备B-机房2', '0x506C616E7444617463', 1735689900000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, '设备运行状态正常', '设备B-机房2', '0x506C616E7444617463', 1735689960000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, '设备湿度偏低告警', '设备B-机房2', '0x506C616E7444617464', 1735690020000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, '设备状态恢复正常', '设备B-机房2', '0x506C616E7444617463', 1735690080000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690140000, false, 109, 10000000009, 37.4, 
129.589, '设备运行状态正常', '设备C-机房3', '0x506C616E7444617465', 1735690140000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, '设备运行状态正常', '设备C-机房3', '0x506C616E7444617465', 1735690200000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, '设备电压不稳告警', '设备C-机房3', '0x506C616E7444617466', 1735690260000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, '设备状态恢复正常', '设备C-机房3', '0x506C616E7444617465', 1735690320000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690380000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690440000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690500000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, '设备信号中断告警', '设备D-机房4', '0x506C616E7444617468', 1735690560000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690620000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690680000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690740000, '2026-01-04')" - }; - foreach (var insert in insertStatements) - { - Execute(dbConnection, insert); - } - 
Console.WriteLine("fulltable setup complete. Begin to query.");
-    Query(dbConnection); // Run the query and print the result
-}
-
-```
-
-### 4.2 Python Example
-
-1. Accessing ODBC from Python requires the pyodbc package:
-
-```Plain
-pip install pyodbc
-```
-
-2. Complete code:
-
-```Python
-#!/usr/bin/env python3
-# -*- coding: utf-8 -*-
-"""
-Apache IoTDB ODBC Python example
-Use pyodbc to connect to the IoTDB ODBC driver and perform operations such as query and insert.
-For reference, see examples/cpp-example/test.cpp and examples/BasicTest/BasicTest/Program.cs
-"""
-
-import pyodbc
-
-def execute(conn: pyodbc.Connection, command: str) -> None:
-    """Run a non-query SQL statement (USE, CREATE, INSERT, DELETE, etc.)."""
-    try:
-        with conn.cursor() as cursor:
-            cursor.execute(command)
-            # INSERT/UPDATE/DELETE require commit; session commands such as USE do not.
-            cmd_upper = command.strip().upper()
-            if cmd_upper.startswith(("INSERT", "UPDATE", "DELETE")):
-                conn.commit()
-        print(f"Execute command: {command}")
-    except pyodbc.Error as ex:
-        print(f"CommandText error: {ex}")
-
-def query(conn: pyodbc.Connection, sql: str) -> None:
-    """Run a SELECT query and print the result as a table."""
-    try:
-        with conn.cursor() as cursor:
-            cursor.execute(sql)
-            # cursor.description is None when the statement returns no result set
-            col_count = len(cursor.description or [])
-            print(f"fCount = {col_count}")
-
-            if col_count <= 0:
-                return
-
-            # Get column names (if the name contains '.', take the last segment, consistent with C++/C# samples).
-            columns = []
-            for i in range(col_count):
-                col_name = cursor.description[i][0] or f"Column{i}"
-                if "." in str(col_name):
-                    col_name = str(col_name).split(".")[-1]
-                columns.append(str(col_name))
-
-            # Fetch data rows
-            rows = cursor.fetchall()
-
-            # Simple table output: size each column to its widest cell
-            col_widths = [max(len(str(col)), 4) for col in columns]
-            for i, row in enumerate(rows):
-                for j, val in enumerate(row):
-                    if j < len(col_widths):
-                        col_widths[j] = max(col_widths[j], len(str(val) if val is not None else "NULL"))
-
-            # Print header
-            header = " | ".join(str(c).ljust(col_widths[i]) for i, c in enumerate(columns))
-            print(header)
-            print("-" * len(header))
-
-            # Print data rows
-            for row in rows:
-                values = []
-                for i, val in enumerate(row):
-                    if val is None:
-                        cell = "NULL"
-                    else:
-                        cell = str(val)
-                    values.append(cell.ljust(col_widths[i]) if i < len(col_widths) else cell)
-                print(" | ".join(values))
-
-            print()
-
-    except pyodbc.Error as ex:
-        print(f"Query error: {ex}")
-
-def main() -> None:
-    dsn = "Apache IoTDB DSN"
-    user = "root"
-    password = "root"
-    server = "127.0.0.1"
-    database = "test"
-    connection_string = (
-        f"DSN={dsn};Server={server};UID={user};PWD={password};"
-        f"Database={database};loglevel=4"
-    )
-
-    print("Start")
-
-    try:
-        conn = pyodbc.connect(connection_string)
-    except pyodbc.Error as ex:
-        print(f"Login failed: {ex}")
-        return
-
-    try:
-        driver_name = conn.getinfo(6)  # SQL_DRIVER_NAME
-        print(f"Successfully opened connection. driver = {driver_name}")
-    except Exception:
-        print("Successfully opened connection.")
-
-    try:
-        execute(conn, "CREATE DATABASE IF NOT EXISTS test")
-        execute(conn, "use test")
-        print("use test Execute complete. 
Begin to setup fulltable.") - - # Create the fulltable table and insert test data - execute( - conn, - "CREATE TABLE IF NOT EXISTS fullTable (time TIMESTAMP TIME, bool_col BOOLEAN FIELD, " - "int32_col INT32 FIELD, int64_col INT64 FIELD, float_col FLOAT FIELD, " - "double_col DOUBLE FIELD, text_col TEXT FIELD, string_col STRING FIELD, " - "blob_col BLOB FIELD, timestamp_col TIMESTAMP FIELD, date_col DATE FIELD) " - "WITH (TTL=315360000000)", - ) - insert_statements = [ - "INSERT INTO fulltable VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689600000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689660000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689720000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, '设备温度偏高告警', '设备A-机房1', '0x506C616E7444617462', 1735689780000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, '设备状态恢复正常', '设备A-机房1', '0x506C616E7444617461', 1735689840000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, '设备运行状态正常', '设备B-机房2', '0x506C616E7444617463', 1735689900000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, '设备运行状态正常', '设备B-机房2', '0x506C616E7444617463', 1735689960000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, '设备湿度偏低告警', '设备B-机房2', '0x506C616E7444617464', 1735690020000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, '设备状态恢复正常', '设备B-机房2', '0x506C616E7444617463', 1735690080000, '2026-01-04')", - "INSERT INTO fulltable VALUES 
(1735690140000, false, 109, 10000000009, 37.4, 129.589, '设备运行状态正常', '设备C-机房3', '0x506C616E7444617465', 1735690140000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, '设备运行状态正常', '设备C-机房3', '0x506C616E7444617465', 1735690200000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, '设备电压不稳告警', '设备C-机房3', '0x506C616E7444617466', 1735690260000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, '设备状态恢复正常', '设备C-机房3', '0x506C616E7444617465', 1735690320000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690380000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690440000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690500000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, '设备信号中断告警', '设备D-机房4', '0x506C616E7444617468', 1735690560000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690620000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690680000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - ] - for insert_sql in insert_statements: - 
execute(conn, insert_sql) - print("fulltable setup complete. Begin to query.") - query(conn, "select * from fulltable") - print("Query ok") - finally: - conn.close() - -if __name__ == "__main__": - main() -``` - -### 4.3 C++ 示例 - -```C++ -#define WIN32_LEAN_AND_MEAN -#include - -#include -#include -#include -#include -#include -#include -#include - -#ifndef SQL_DIAG_COLUMN_SIZE -#define SQL_DIAG_COLUMN_SIZE 33L -#endif - -// 错误处理函数(保持核心功能) -void CheckOdbcError(SQLRETURN retCode, SQLSMALLINT handleType, SQLHANDLE handle, const char* functionName) { - if (retCode == SQL_SUCCESS || retCode == SQL_SUCCESS_WITH_INFO) { - return; - } - - SQLCHAR sqlState[6]; - SQLCHAR message[SQL_MAX_MESSAGE_LENGTH]; - SQLINTEGER nativeError; - SQLSMALLINT textLength; - SQLRETURN errRet; - errRet = SQLGetDiagRec(handleType, handle, 1, sqlState, &nativeError, message, sizeof(message), &textLength); - - std::cerr << "ODBC Error in " << functionName << ":\n"; - std::cerr << " SQL State: " << sqlState << "\n"; - std::cerr << " Native Error: " << nativeError << "\n"; - std::cerr << " Message: " << message << "\n"; - std::cerr << " SQLGetDiagRec Return: " << errRet << "\n"; - - if (retCode == SQL_ERROR || retCode == SQL_INVALID_HANDLE) { - exit(1); - } -} - -// 简化版表格输出 - 仅展示基本数据 -void PrintSimpleTable(const std::vector& headers, - const std::vector>& rows) { - // 打印表头 - for (size_t i = 0; i < headers.size(); i++) { - std::cout << headers[i]; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - // 打印分隔线 - for (size_t i = 0; i < headers.size(); i++) { - std::cout << "----------------"; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - // 打印数据行 - for (const auto& row : rows) { - for (size_t i = 0; i < row.size(); i++) { - std::cout << row[i]; - if (i < row.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - } - std::cout << std::endl; -} - -/// 执行 SELECT 查询并以表格形式输出 fulltable 的结果 -void Query(SQLHDBC hDbc) { - SQLHSTMT hStmt 
= SQL_NULL_HSTMT; - SQLRETURN ret = SQL_SUCCESS; - - try { - // 分配语句句柄 - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - return; - } - - // 执行查询 - const std::string sqlQuery = "select * from fulltable"; - std::cout << "Execute query: " << sqlQuery << std::endl; - - ret = SQLExecDirect(hStmt, reinterpret_cast(const_cast(sqlQuery.c_str())), SQL_NTS); - if (!SQL_SUCCEEDED(ret)) { - if (ret != SQL_NO_DATA) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect(SELECT)"); - } - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - // 获取列数量 - SQLSMALLINT colCount = 0; - ret = SQLNumResultCols(hStmt, &colCount); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLNumResultCols"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::cout << "Column count = " << colCount << std::endl; - - // 如果没有列,直接返回 - if (colCount <= 0) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - // 获取列名和类型信息 - std::vector columnNames; - std::vector columnTypes(colCount); - std::vector columnSizes(colCount); - std::vector decimalDigits(colCount); - std::vector nullable(colCount); - - // Get basic column information - for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLSMALLINT nameLength = 0; - ret = SQLDescribeCol(hStmt, i, NULL, 0, &nameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get length)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::vector colNameBuffer(nameLength + 1); - SQLSMALLINT actualNameLength = 0; - - ret = SQLDescribeCol(hStmt, i, colNameBuffer.data(), nameLength + 1, - &actualNameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get name)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::string 
fullName(reinterpret_cast(colNameBuffer.data())); - - size_t pos = fullName.find_last_of('.'); - if (pos != std::string::npos) { - columnNames.push_back(fullName.substr(pos + 1)); - } else { - columnNames.push_back(fullName); - } - - ret = SQLDescribeCol(hStmt, i, NULL, 0, NULL, &columnTypes[i-1], - &columnSizes[i-1], &decimalDigits[i-1], &nullable[i-1]); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get type info)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - } - - std::vector> tableRows; - - int rowCount = 0; - // Get data front every row - while (true) { - ret = SQLFetch(hStmt); - if (ret == SQL_NO_DATA) { - break; - } - - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLFetch"); - break; - } - - std::vector row; - - for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLLEN indicator = 0; - std::string valueStr; - - SQLSMALLINT cType; - size_t bufferSize; - bool isCharacterType = false; - const int maxBufferSize = 32768; - - switch (columnTypes[i-1]) { - case SQL_CHAR: - case SQL_VARCHAR: - case SQL_LONGVARCHAR: - case SQL_WCHAR: - case SQL_WVARCHAR: - case SQL_WLONGVARCHAR: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_DECIMAL: - case SQL_NUMERIC: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_INTEGER: - case SQL_SMALLINT: - case SQL_TINYINT: - case SQL_BIGINT: - cType = SQL_C_SBIGINT; - bufferSize = sizeof(SQLBIGINT); - break; - - case SQL_REAL: - case SQL_FLOAT: - case SQL_DOUBLE: - cType = SQL_C_DOUBLE; - bufferSize = sizeof(double); - break; - - case SQL_BIT: - cType = SQL_C_BIT; - bufferSize = sizeof(SQLCHAR); - break; - - case 
SQL_DATE: - case SQL_TYPE_DATE: - cType = SQL_C_DATE; - bufferSize = sizeof(SQL_DATE_STRUCT); - break; - - case SQL_TIME: - case SQL_TYPE_TIME: - cType = SQL_C_TIME; - bufferSize = sizeof(SQL_TIME_STRUCT); - break; - - case SQL_TIMESTAMP: - case SQL_TYPE_TIMESTAMP: - cType = SQL_C_TIMESTAMP; - bufferSize = sizeof(SQL_TIMESTAMP_STRUCT); - break; - - default: - cType = SQL_C_CHAR; - bufferSize = 256; - isCharacterType = true; - break; - } - - std::vector buffer(bufferSize); - - ret = SQLGetData(hStmt, i, cType, buffer.data(), bufferSize, &indicator); - - if (indicator == SQL_NULL_DATA) { - valueStr = "NULL"; - } - else if (ret != SQL_SUCCESS) { - valueStr = "ERR_CONV"; - } - else { - if (cType == SQL_C_CHAR) { - valueStr = reinterpret_cast(buffer.data()); - } - else if (cType == SQL_C_SBIGINT) { - SQLBIGINT intVal = *reinterpret_cast(buffer.data()); - valueStr = std::to_string(intVal); - } - else if (cType == SQL_C_DOUBLE) { - double doubleVal = *reinterpret_cast(buffer.data()); - valueStr = std::to_string(doubleVal); - } - else if (cType == SQL_C_BIT) { - valueStr = (*buffer.data() != 0) ? 
"TRUE" : "FALSE"; - } - else if (cType == SQL_C_DATE) { - SQL_DATE_STRUCT* date = reinterpret_cast(buffer.data()); - char dateStr[20]; - snprintf(dateStr, sizeof(dateStr), "%04d-%02d-%02d", - date->year, date->month, date->day); - valueStr = dateStr; - } - else if (cType == SQL_C_TIME) { - SQL_TIME_STRUCT* time = reinterpret_cast(buffer.data()); - char timeStr[15]; - snprintf(timeStr, sizeof(timeStr), "%02d:%02d:%02d", - time->hour, time->minute, time->second); - valueStr = timeStr; - } - else if (cType == SQL_C_TIMESTAMP) { - SQL_TIMESTAMP_STRUCT* ts = reinterpret_cast(buffer.data()); - char tsStr[30]; - snprintf(tsStr, sizeof(tsStr), "%04d-%02d-%02d %02d:%02d:%02d.%06d", - ts->year, ts->month, ts->day, - ts->hour, ts->minute, ts->second, - ts->fraction / 1000); - valueStr = tsStr; - } - else { - valueStr = "UNKNOWN_TYPE"; - } - - if (isCharacterType && ret == SQL_SUCCESS_WITH_INFO) { - SQLLEN actualSize = 0; - SQLGetDiagField(SQL_HANDLE_STMT, hStmt, 0, SQL_DIAG_COLUMN_SIZE, - &actualSize, SQL_IS_INTEGER, NULL); - - if (indicator > 0 && static_cast(indicator) > bufferSize - 1) { - valueStr += "..."; - } - } - - } - - row.push_back(valueStr); - } - - tableRows.push_back(row); - } - - if (!tableRows.empty()) { - PrintSimpleTable(columnNames, tableRows); - } - - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (const std::exception& ex) { - std::cerr << "Exception: " << ex.what() << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } - catch (...) 
{ - std::cerr << "Unknown exception occurred" << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -/// 执行非查询 SQL 语句(如 CREATE DATABASE、CREATE TABLE、INSERT 等) -void Execute(SQLHDBC hDbc, const std::string& command) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN ret; - - try { - // 分配语句句柄 - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - - // 执行命令 - ret = SQLExecDirect(hStmt, (SQLCHAR*)command.c_str(), SQL_NTS); - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect"); - } - - // 释放语句句柄 - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (...) { - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -int main() { - SQLHENV hEnv = SQL_NULL_HENV; - SQLHDBC hDbc = SQL_NULL_HDBC; - SQLRETURN ret; - - try { - std::cout << "Start" << std::endl; - - // 1. 初始化ODBC环境 - ret = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &hEnv); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_ENV)"); - - ret = SQLSetEnvAttr(hEnv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLSetEnvAttr"); - - // 2. 
Establish the connection - ret = SQLAllocHandle(SQL_HANDLE_DBC, hEnv, &hDbc); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_DBC)"); - - // Connection string - std::string dsn = "Apache IoTDB DSN"; - std::string user = "root"; - std::string password = "root"; - std::string server = "127.0.0.1"; - std::string database = "test"; - - std::string connectionString = "DSN=" + dsn + ";Server=" + server + - ";UID=" + user + ";PWD=" + password + - ";Database=" + database + ";loglevel=4"; - std::cout << "Using connection string: " << connectionString << std::endl; - - SQLCHAR outConnStr[1024]; - SQLSMALLINT outConnStrLen; - - ret = SQLDriverConnect(hDbc, NULL, - (SQLCHAR*)connectionString.c_str(), SQL_NTS, - outConnStr, sizeof(outConnStr), - &outConnStrLen, SQL_DRIVER_COMPLETE); - - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - std::cerr << "Login failed" << std::endl; - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLDriverConnect"); - return 1; - } - - // Get the driver name - SQLCHAR driverName[256]; - SQLSMALLINT nameLength; - ret = SQLGetInfo(hDbc, SQL_DRIVER_NAME, driverName, sizeof(driverName), &nameLength); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLGetInfo"); - - std::cout << "Successfully opened connection. driver name = " << driverName << std::endl; - - // 3. Execute operations - Execute(hDbc, "CREATE DATABASE IF NOT EXISTS test"); - Execute(hDbc, "use test"); - std::cout << "use test Execute complete. Begin to setup fulltable." 
<< std::endl; - - // 创建 fulltable 表并插入测试数据 - Execute(hDbc, "CREATE TABLE IF NOT EXISTS fullTable (time TIMESTAMP TIME, bool_col BOOLEAN FIELD, int32_col INT32 FIELD, int64_col INT64 FIELD, float_col FLOAT FIELD, double_col DOUBLE FIELD, text_col TEXT FIELD, string_col STRING FIELD, blob_col BLOB FIELD, timestamp_col TIMESTAMP FIELD, date_col DATE FIELD) WITH (TTL=315360000000)"); - const char* insertStatements[] = { - "INSERT INTO fulltable VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689600000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689660000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689720000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, '设备温度偏高告警', '设备A-机房1', '0x506C616E7444617462', 1735689780000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, '设备状态恢复正常', '设备A-机房1', '0x506C616E7444617461', 1735689840000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, '设备运行状态正常', '设备B-机房2', '0x506C616E7444617463', 1735689900000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, '设备运行状态正常', '设备B-机房2', '0x506C616E7444617463', 1735689960000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, '设备湿度偏低告警', '设备B-机房2', '0x506C616E7444617464', 1735690020000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, '设备状态恢复正常', '设备B-机房2', '0x506C616E7444617463', 1735690080000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 
'设备运行状态正常', '设备C-机房3', '0x506C616E7444617465', 1735690140000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, '设备运行状态正常', '设备C-机房3', '0x506C616E7444617465', 1735690200000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, '设备电压不稳告警', '设备C-机房3', '0x506C616E7444617466', 1735690260000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, '设备状态恢复正常', '设备C-机房3', '0x506C616E7444617465', 1735690320000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690380000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690440000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690500000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, '设备信号中断告警', '设备D-机房4', '0x506C616E7444617468', 1735690560000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690620000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690680000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690740000, '2026-01-04')" - }; - for (const char* sql : insertStatements) { - Execute(hDbc, sql); - } - std::cout << "fulltable 
setup complete. Begin to query." << std::endl; - Query(hDbc); - std::cout << "Query ok" << std::endl; - - // 4. 清理资源 - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - - return 0; - } - catch (...) { - // 异常清理 - if (hDbc != SQL_NULL_HDBC) { - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - } - if (hEnv != SQL_NULL_HENV) { - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - } - - std::cerr << "Unexpected error!" << std::endl; - return 1; - } -} -``` - -### 4.4 PowerBI 示例 - -1. 打开 PowerBI Desktop,创建新项目; -2. 点击「主页」→「获取数据」→「更多...」→「ODBC」→ 点击「连接」按钮; -3. 数据源选择:在弹出窗口中选择「数据源名称 (DSN)」,下拉选择 `Apache IoTDB DSN`; -4. 高级配置: - -* 点击「高级选项」,在「连接字符串」输入框填写配置(样例): - -```Plain -server=127.0.0.1;port=6667;database=test;isTableModel=true;loglevel=4 -``` - -* 说明: - - * `dsn` 项可选,填写 / 不填写均不影响连接; - * `loglevel` 分为 0-4 等级:0 级(ERROR)日志最少,4 级(TRACE)日志最详细,按需设置; - - `server/database/dsn/loglevel` 大小写不敏感(如可写为 `Server/DATABASE`); - * 如果在DSN中配置了相关信息,则可以不填写任何配置信息,驱动管理器会自动使用在DSN中填入的配置信息。 - -5. 身份验证:输入用户名(默认 `root`)和密码(默认 `root`),点击「连接」; -6. 数据加载:在界面中选择需要调用的表(如 `fulltable/table1`),点击「加载」即可查看数据。 - -### 4.5 Excel 示例 - -1. 打开 Excel,创建空白工作簿; -2. 点击「数据」选项卡 →「自其他来源」→「来自数据连接向导」; -3. 数据源选择:选择「ODBC DSN」→ 下一步 → 选择 `Apache IoTDB DSN` → 下一步; -4. 连接配置: -* 连接字符串、用户名、密码的输入流程与 PowerBI 完全一致,连接字符串格式参考: - -```Plain -server=127.0.0.1;port=6667;database=test;isTableModel=true;loglevel=4 -``` - -* 如果在DSN中配置了相关信息,则可以不填写任何配置信息,驱动管理器会自动使用在DSN中填入的配置信息。 -5. 表选择:选择需要访问的数据库和表(如 fulltable),点击「下一步」; -6. 保存连接:自定义设置数据连接文件名、连接描述等信息,点击「完成」; -7. 导入数据:选择数据导入到工作表的位置(如「现有工作表」的 A1 单元格),点击「确定」,完成数据加载。 diff --git a/src/zh/UserGuide/Master/Table/API/Programming-Python-Native-API_timecho.md b/src/zh/UserGuide/Master/Table/API/Programming-Python-Native-API_timecho.md deleted file mode 100644 index b927a1827..000000000 --- a/src/zh/UserGuide/Master/Table/API/Programming-Python-Native-API_timecho.md +++ /dev/null @@ -1,757 +0,0 @@ - - -# Python 原生接口 - -## 1. 
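Usage - - Install the dependency package: - - ```shell -pip3 install "apache-iotdb>=2.0" -``` -Note: Do not use a newer client version to connect to an older server version. - -## 2. Read and Write Operations - -### 2.1 TableSession - -#### 2.1.1 Description - -TableSession is a core IoTDB class for interacting with the IoTDB database. Through it, users can execute SQL statements, insert data, and manage database sessions. - -#### 2.1.2 Method List - -| **Method** | **Description** | **Parameter Type** | **Return Type** | -| --------------------------- | ---------------------------------- | ---------------------------------- | -------------- | -| insert | Writes data | tablet: Union[Tablet, NumpyTablet] | None | -| execute_non_query_statement | Executes a non-query SQL statement, such as DDL or DML | sql: str | None | -| execute_query_statement | Executes a query SQL statement and returns the result set | sql: str | SessionDataSet | -| close | Closes the session and releases resources | None | None | - -Since V2.0.8.2, SessionDataSet provides methods for fetching DataFrames in batches, which handle large query results efficiently: - -```python -# Fetch a DataFrame batch -has_next = result.has_next_df() -if has_next: - df = result.next_df() - # Process the DataFrame -``` - -**Method notes:** -- `has_next_df()`: returns `True`/`False`, indicating whether more data is available -- `next_df()`: returns a `DataFrame` or `None`; each call returns `fetchSize` rows (5000 by default, controlled by the Session's `fetch_size` parameter) - - When at least `fetchSize` rows remain, returns `fetchSize` rows - - When fewer than `fetchSize` rows remain, returns all remaining rows - - When all data has been traversed, returns `None` -- `fetchSize` is checked when the Session is initialized; if it is ≤ 0, it is reset to 5000 and a warning is logged - -**Note:** Do not mix different traversal methods (for example, the todf function together with next_df); doing so causes unexpected errors. - - -Since V2.0.8.3, the Python client supports `TSDataType.OBJECT` in `Tablet` batch writes and in `Session` value serialization; query results are read through `Field`. The related interfaces are defined as follows: - -| Function | Purpose | Parameters | Return Value | -| ------------------------------------- | ------------------------------------------------------------ | --------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- | -| encode\_object\_cell | Encodes one OBJECT cell into wire-format bytes | is\_eof: bool, offset: int, content: bytes | bytes: \|[eof 1B]\|[offset 8B BE]\|[payload]\| | -| decode\_object\_cell | Parses one wire-format cell back into eof, offset, and payload | cell: bytes (length ≥ 9) | Tuple[bool, int, bytes]: (is\_eof, offset, payload) | -| Tablet.add\_value\_object | Writes one OBJECT cell at the given row and column (calls encode\_object\_cell internally) | 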
row\_index: int,column\_index: int,is\_eof: bool,offset: int,content: bytes | None | -| Tablet.add\_value\_object\_by\_name | 同上,按列名定位列 | column\_name: str,row\_index: int,is\_eof: bool,offset: int,content: bytes | None | -| NumpyTablet.add\_value\_object | 与 Tablet.add\_value\_object 相同语义,列数据为 ndarray | 同上(row\_index、column\_index、…) | None | -| Field.get\_object\_value | 按「目标类型」把 value 转成 Python 值 | data\_type: TSDataType | 随类型:OBJECT 时为 self.value 整段 UTF-8 解码 得到的 str(见[Field.py](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/iotdb/utils/Field.py)) | -| Field.get\_string\_value | 字符串化展示 | 无 | str;OBJECT 时为 self.value.decode("utf-8") | -| Field.get\_binary\_value | 取 TEXT/STRING/BLOB 的二进制 | 无 | bytes 或 None;OBJECT 列会抛错,不应调用 | - - -#### 2.1.3 接口展示 - -**TableSession:** - - -```Python -class TableSession(object): -def insert(self, tablet: Union[Tablet, NumpyTablet]): - """ - Insert data into the database. - - Parameters: - tablet (Tablet | NumpyTablet): The tablet containing the data to be inserted. - Accepts either a `Tablet` or `NumpyTablet`. - - Raises: - IoTDBConnectionException: If there is an issue with the database connection. - """ - pass - -def execute_non_query_statement(self, sql: str): - """ - Execute a non-query SQL statement. - - Parameters: - sql (str): The SQL statement to execute. Typically used for commands - such as INSERT, DELETE, or UPDATE. - - Raises: - IoTDBConnectionException: If there is an issue with the database connection. - """ - pass - -def execute_query_statement(self, sql: str, timeout_in_ms: int = 0) -> "SessionDataSet": - """ - Execute a query SQL statement and return the result set. - - Parameters: - sql (str): The SQL query to execute. - timeout_in_ms (int, optional): Timeout for the query in milliseconds. Defaults to 0, - which means no timeout. - - Returns: - SessionDataSet: The result set of the query. - - Raises: - IoTDBConnectionException: If there is an issue with the database connection. 
- """ - pass - -def close(self): - """ - Close the session and release resources. - - Raises: - IoTDBConnectionException: If there is an issue closing the connection. - """ - pass -``` - -### 2.2 TableSessionConfig - -#### 2.2.1 功能描述 - -TableSessionConfig是一个配置类,用于设置和创建TableSession 实例。它定义了连接到IoTDB数据库所需的各种参数。 - -#### 2.2.2 配置选项 - -| **配置项** | **描述** | **类型** | **默认值** | -| ------------------ | ------------------------- | -------- |-----------------------------------------| -| node_urls | 数据库连接的节点 URL 列表 | list | ["localhost:6667"] | -| username | 数据库连接用户名 | str | "root" | -| password | 数据库连接密码 | str | "TimechoDB@2021" //V2.0.6.x 之前默认密码是root | -| database | 要连接的目标数据库 | str | None | -| fetch_size | 每次查询获取的行数 | int | 5000 | -| time_zone | 会话的默认时区 | str | Session.DEFAULT_ZONE_ID | -| enable_compression | 是否启用数据压缩 | bool | False | - -#### 2.2.3 接口展示 - -```Python -class TableSessionConfig(object): - """ - Configuration class for a TableSession. - - This class defines various parameters for connecting to and interacting - with the IoTDB tables. - """ - - def __init__( - self, - node_urls: list = None, - username: str = Session.DEFAULT_USER, - password: str = Session.DEFAULT_PASSWORD, - database: str = None, - fetch_size: int = 5000, - time_zone: str = Session.DEFAULT_ZONE_ID, - enable_compression: bool = False, - ): - """ - Initialize a TableSessionConfig object with the provided parameters. - - Parameters: - node_urls (list, optional): A list of node URLs for the database connection. - Defaults to ["localhost:6667"]. - username (str, optional): The username for the database connection. - Defaults to "root". - password (str, optional): The password for the database connection. - Defaults to "TimechoDB@2021". //V2.0.6.x 之前默认密码是root - database (str, optional): The target database to connect to. Defaults to None. - fetch_size (int, optional): The number of rows to fetch per query. Defaults to 5000. - time_zone (str, optional): The default time zone for the session. 
- Defaults to Session.DEFAULT_ZONE_ID. - enable_compression (bool, optional): Whether to enable data compression. - Defaults to False. - """ -``` - -**注意事项:** - -在使用完 TableSession 后,务必调用 close 方法来释放资源。 - -## 3. 客户端连接池 - -### 3.1 TableSessionPool - -#### 3.1.1 功能描述 - -TableSessionPool 是一个会话池管理类,用于管理 TableSession 实例的创建和销毁。它提供了从池中获取会话和关闭会话池的功能。 - -#### 3.1.2 方法列表 - -| **方法名称** | **描述** | **返回类型** | **异常** | -| ------------ | ---------------------------------------- | ------------ | -------- | -| get_session | 从会话池中检索一个新的 TableSession 实例 | TableSession | 无 | -| close | 关闭会话池并释放所有资源 | None | 无 | - -#### 3.1.3 接口展示 - -**TableSessionPool:** - -```Python -def get_session(self) -> TableSession: - """ - Retrieve a new TableSession instance. - - Returns: - TableSession: A new session object configured with the session pool. - - Notes: - The session is initialized with the underlying session pool for managing - connections. Ensure proper usage of the session's lifecycle. - """ - -def close(self): - """ - Close the session pool and release all resources. - - This method closes the underlying session pool, ensuring that all - resources associated with it are properly released. - - Notes: - After calling this method, the session pool cannot be used to retrieve - new sessions, and any attempt to do so may raise an exception. 
- """ -``` - -### 3.2 TableSessionPoolConfig - -#### 3.2.1 功能描述 - -TableSessionPoolConfig是一个配置类,用于设置和创建 TableSessionPool 实例。它定义了初始化和管理 IoTDB 数据库会话池所需的参数。 - -#### 3.2.2 配置选项 - -| **配置项** | **描述** | **类型** | **默认值** | -| ------------------ | ------------------------------ | -------- | ------------------------ | -| node_urls | 数据库连接的节点 URL 列表 | list | None | -| max_pool_size | 会话池中的最大会话数 | int | 5 | -| username | 数据库连接用户名 | str | Session.DEFAULT_USER | -| password | 数据库连接密码 | str | Session.DEFAULT_PASSWORD | -| database | 要连接的目标数据库 | str | None | -| fetch_size | 每次查询获取的行数 | int | 5000 | -| time_zone | 会话池的默认时区 | str | Session.DEFAULT_ZONE_ID | -| enable_redirection | 是否启用重定向 | bool | False | -| enable_compression | 是否启用数据压缩 | bool | False | -| wait_timeout_in_ms | 等待会话可用的最大时间(毫秒) | int | 10000 | -| max_retry | 操作的最大重试次数 | int | 3 | - -#### 3.2.3 接口展示 - - -```Python -class TableSessionPoolConfig(object): - """ - Configuration class for a TableSessionPool. - - This class defines the parameters required to initialize and manage - a session pool for interacting with the IoTDB database. - """ - def __init__( - self, - node_urls: list = None, - max_pool_size: int = 5, - username: str = Session.DEFAULT_USER, - password: str = Session.DEFAULT_PASSWORD, - database: str = None, - fetch_size: int = 5000, - time_zone: str = Session.DEFAULT_ZONE_ID, - enable_redirection: bool = False, - enable_compression: bool = False, - wait_timeout_in_ms: int = 10000, - max_retry: int = 3, - ): - """ - Initialize a TableSessionPoolConfig object with the provided parameters. - - Parameters: - node_urls (list, optional): A list of node URLs for the database connection. - Defaults to None. - max_pool_size (int, optional): The maximum number of sessions in the pool. - Defaults to 5. - username (str, optional): The username for the database connection. - Defaults to Session.DEFAULT_USER. - password (str, optional): The password for the database connection. - Defaults to Session.DEFAULT_PASSWORD. 
- database (str, optional): The target database to connect to. Defaults to None. - fetch_size (int, optional): The number of rows to fetch per query. Defaults to 5000. - time_zone (str, optional): The default time zone for the session pool. - Defaults to Session.DEFAULT_ZONE_ID. - enable_redirection (bool, optional): Whether to enable redirection. - Defaults to False. - enable_compression (bool, optional): Whether to enable data compression. - Defaults to False. - wait_timeout_in_ms (int, optional): The maximum time (in milliseconds) to wait for a session - to become available. Defaults to 10000. - max_retry (int, optional): The maximum number of retry attempts for operations. Defaults to 3. - - """ -``` -### 3.3 SSL 连接 - -#### 3.3.1 服务器端配置证书 - -`conf/iotdb-system.properties` 配置文件中查找或添加以下配置项: - -``` -enable_thrift_ssl=true -key_store_path=/path/to/your/server_keystore.jks -key_store_pwd=your_keystore_password -``` - -#### 3.3.2 配置 python 客户端证书 - -- 设置 use_ssl 为 True 以启用 SSL。 -- 指定客户端证书路径,使用 ca_certs 参数。 - -``` -use_ssl = True -ca_certs = "/path/to/your/server.crt" # 或 ca_certs = "/path/to/your//ca_cert.pem" -``` -**示例代码:使用 SSL 连接 IoTDB** - -```Python -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, -# software distributed under the License is distributed on an -# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -# KIND, either express or implied. See the License for the -# specific language governing permissions and limitations -# under the License. 
-# - -from iotdb.SessionPool import PoolConfig, SessionPool -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021" # Before V2.0.6.x the default password was root -# Configure SSL enabled -use_ssl = True -# Configure certificate path -ca_certs = "/path/server.crt" - - -def get_data(): - session = Session( - ip, port_, username_, password_, use_ssl=use_ssl, ca_certs=ca_certs - ) - session.open(False) - with session.execute_query_statement("SHOW DATABASES") as session_data_set: - print(session_data_set.get_column_names()) - while session_data_set.has_next(): - print(session_data_set.next()) - - session.close() - - -def get_data2(): - pool_config = PoolConfig( - host=ip, - port=port_, - user_name=username_, - password=password_, - fetch_size=1024, - time_zone="UTC+8", - max_retry=3, - use_ssl=use_ssl, - ca_certs=ca_certs, - ) - max_pool_size = 5 - wait_timeout_in_ms = 3000 - session_pool = SessionPool(pool_config, max_pool_size, wait_timeout_in_ms) - session = session_pool.get_session() - with session.execute_query_statement("SHOW DATABASES") as session_data_set: - print(session_data_set.get_column_names()) - while session_data_set.has_next(): - print(session_data_set.next()) - session_pool.put_back(session) - session_pool.close() - - -if __name__ == "__main__": - get_data() -``` - -## 4. Example Code - -Session example: [Session Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/table_model_session_example.py) - -SessionPool example: [SessionPool Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/table_model_session_pool_example.py) - -```Python -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. 
The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, -# software distributed under the License is distributed on an -# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -# KIND, either express or implied. See the License for the -# specific language governing permissions and limitations -# under the License. -# -import threading - -import numpy as np - -from iotdb.table_session_pool import TableSessionPool, TableSessionPoolConfig -from iotdb.utils.IoTDBConstants import TSDataType -from iotdb.utils.NumpyTablet import NumpyTablet -from iotdb.utils.Tablet import ColumnType, Tablet - - -def prepare_data(): - print("create database") - # Get a session from the pool - session = session_pool.get_session() - session.execute_non_query_statement("CREATE DATABASE IF NOT EXISTS db1") - session.execute_non_query_statement('USE "db1"') - session.execute_non_query_statement( - "CREATE TABLE table0 (id1 string tag, attr1 string attribute, " - + "m1 double " - + "field)" - ) - session.execute_non_query_statement( - "CREATE TABLE table1 (id1 string tag, attr1 string attribute, " - + "m1 double " - + "field)" - ) - - print("now the tables are:") - # show result - with session.execute_query_statement("SHOW TABLES") as res: - while res.has_next(): - print(res.next()) - - session.close() - - -def insert_data(num: int): - print("insert data for table" + str(num)) - # Get a session from the pool - session = session_pool.get_session() - column_names = [ - "id1", - "attr1", - "m1", - ] - data_types = [ - TSDataType.STRING, - TSDataType.STRING, - TSDataType.DOUBLE, - ] - column_types = [ColumnType.TAG, ColumnType.ATTRIBUTE, ColumnType.FIELD] - timestamps = [] - values = [] - for row in range(15): - timestamps.append(row) - 
values.append(["id:" + str(row), "attr:" + str(row), row * 1.0]) - tablet = Tablet( - "table" + str(num), column_names, data_types, values, timestamps, column_types - ) - session.insert(tablet) - session.execute_non_query_statement("FLush") - - np_timestamps = np.arange(15, 30, dtype=np.dtype(">i8")) - np_values = [ - np.array(["id:{}".format(i) for i in range(15, 30)]), - np.array(["attr:{}".format(i) for i in range(15, 30)]), - np.linspace(15.0, 29.0, num=15, dtype=TSDataType.DOUBLE.np_dtype()), - ] - - np_tablet = NumpyTablet( - "table" + str(num), - column_names, - data_types, - np_values, - np_timestamps, - column_types=column_types, - ) - session.insert(np_tablet) - session.close() - - -def query_data(): - # Get a session from the pool - session = session_pool.get_session() - - print("get data from table0") - with session.execute_query_statement("select * from table0") as res: - while res.has_next(): - print(res.next()) - - print("get data from table1") - with session.execute_query_statement("select * from table1") as res: - while res.has_next(): - print(res.next()) - - # 使用分批DataFrame方式查询表数据(推荐大数据量场景) - print("get data from table0 using batch DataFrame") - with session.execute_query_statement("select * from table0") as res: - while res.has_next_df(): - print(res.next_df()) - - session.close() - - -def delete_data(): - session = session_pool.get_session() - session.execute_non_query_statement("drop database db1") - print("data has been deleted. 
now the databases are:") - with session.execute_query_statement("show databases") as res: - while res.has_next(): - print(res.next()) - session.close() - - -# Create a session pool -username = "root" -password = "TimechoDB@2021" # Before V2.0.6.x the default password was root -node_urls = ["127.0.0.1:6667", "127.0.0.1:6668", "127.0.0.1:6669"] -fetch_size = 1024 -database = "db1" -max_pool_size = 5 -wait_timeout_in_ms = 3000 -config = TableSessionPoolConfig( - node_urls=node_urls, - username=username, - password=password, - database=database, - max_pool_size=max_pool_size, - fetch_size=fetch_size, - wait_timeout_in_ms=wait_timeout_in_ms, -) -session_pool = TableSessionPool(config) - -prepare_data() - -insert_thread1 = threading.Thread(target=insert_data, args=(0,)) -insert_thread2 = threading.Thread(target=insert_data, args=(1,)) - -insert_thread1.start() -insert_thread2.start() - -insert_thread1.join() -insert_thread2.join() - -query_data() -delete_data() -session_pool.close() -print("example is finished!") -``` - -OBJECT type usage example: - -```Python -import os - -import numpy as np -import pytest - -from iotdb.utils.IoTDBConstants import TSDataType -from iotdb.utils.NumpyTablet import NumpyTablet -from iotdb.utils.Tablet import Tablet, ColumnType -from iotdb.utils.object_column import decode_object_cell - - -def _require_thrift(): - pytest.importorskip("iotdb.thrift.common.ttypes") - - -def _session_endpoint(): - host = os.environ.get("IOTDB_HOST", "127.0.0.1") - port = int(os.environ.get("IOTDB_PORT", "6667")) - return host, port - - -@pytest.fixture(scope="module") -def table_session(): - _require_thrift() - from iotdb.Session import Session - from iotdb.table_session import TableSession, TableSessionConfig - - host, port = _session_endpoint() - cfg = TableSessionConfig( - node_urls=[f"{host}:{port}"], - username=os.environ.get("IOTDB_USER", Session.DEFAULT_USER), - password=os.environ.get("IOTDB_PASSWORD", Session.DEFAULT_PASSWORD), - ) - ts = TableSession(cfg) - yield ts - ts.close() - - -def 
test_table_numpy_tablet_object_columns(table_session): - """ - Table model: Tablet.add_value_object / add_value_object_by_name, - NumpyTablet.add_value_object, insert + query Field + decode_object_cell; - 另含同一 time 上分两段写入 OBJECT(先 is_eof=False/offset=0,再 is_eof=True/offset=首段长度), - 并用 read_object(f1) 校验拼接后的完整字节。 - """ - db = "test_py_object_e2e" - table = "obj_tbl" - table_session.execute_non_query_statement(f"create database if not exists {db}") - table_session.execute_non_query_statement(f"use {db}") - table_session.execute_non_query_statement(f"drop table if exists {table}") - table_session.execute_non_query_statement( - f"create table {table}(" - "device STRING TAG, temp FLOAT FIELD, f1 OBJECT FIELD, f2 OBJECT FIELD)" - ) - - column_names = ["device", "temp", "f1", "f2"] - data_types = [ - TSDataType.STRING, - TSDataType.FLOAT, - TSDataType.OBJECT, - TSDataType.OBJECT, - ] - column_types = [ - ColumnType.TAG, - ColumnType.FIELD, - ColumnType.FIELD, - ColumnType.FIELD, - ] - timestamps = [100, 200] - values = [ - ["d1", 1.5, None, None], - ["d1", 2.5, None, None], - ] - - tablet = Tablet( - table, column_names, data_types, values, timestamps, column_types - ) - tablet.add_value_object(0, 2, True, 0, b"first-row-obj") - # 整对象单段写入:is_eof=True 且 offset=0;分段续写需满足服务端 offset/长度校验 - tablet.add_value_object_by_name("f2", 0, True, 0, b"seg") - tablet.add_value_object(1, 2, True, 0, b"second-f1") - tablet.add_value_object(1, 3, True, 0, b"second-f2") - table_session.insert(tablet) - - ts_arr = np.array([300, 400], dtype=TSDataType.INT64.np_dtype()) - np_vals = [ - np.array(["d1", "d1"]), - np.array([1.0, 2.0], dtype=np.float32), - np.array([None, None], dtype=object), - np.array([None, None], dtype=object), - ] - np_tab = NumpyTablet( - table, column_names, data_types, np_vals, ts_arr, column_types=column_types - ) - np_tab.add_value_object(0, 2, True, 0, b"np-r0-f1") - np_tab.add_value_object(0, 3, True, 0, b"np-r0-f2") - np_tab.add_value_object(1, 2, True, 0, 
b"np-r1-f1") - np_tab.add_value_object(1, 3, True, 0, b"np-r1-f2") - table_session.insert(np_tab) - - # 分段 OBJECT:先 is_eof=False(续传),再 is_eof=True(末段);offset 为已写入字节长度 - chunk0 = bytes((i % 256) for i in range(512)) - chunk1 = b"\xab" * 64 - expected_segmented = chunk0 + chunk1 - seg1 = Tablet( - table, - column_names, - data_types, - [["d1", 3.0, None, None]], - [500], - column_types, - ) - seg1.add_value_object(0, 2, False, 0, chunk0) - seg1.add_value_object(0, 3, True, 0, b"f2-seg") - table_session.insert(seg1) - seg2 = Tablet( - table, - column_names, - data_types, - [["d1", 3.0, None, None]], - [500], - column_types, - ) - seg2.add_value_object(0, 2, True, 512, chunk1) - seg2.add_value_object(0, 3, True, 0, b"f2-seg") - table_session.insert(seg2) - - with table_session.execute_query_statement( - f"select read_object(f1) from {table} where time = 500" - ) as ds: - assert ds.has_next() - row = ds.next() - blob = row.get_fields()[0].get_binary_value() - assert blob == expected_segmented - assert not ds.has_next() - - seen = 0 - with table_session.execute_query_statement( - f"select device, temp, f1, f2 from {table} order by time" - ) as ds: - while ds.has_next(): - row = ds.next() - fields = row.get_fields() - assert fields[0].get_object_value(TSDataType.STRING) == "d1" - assert fields[1].get_object_value(TSDataType.FLOAT) is not None - for j in (2, 3): - raw = fields[j].value - assert isinstance(raw, (bytes, bytearray)) - eof, off, body = decode_object_cell(bytes(raw)) - assert isinstance(eof, bool) and isinstance(off, int) - assert isinstance(body, bytes) - fields[j].get_string_value() - fields[j].get_object_value(TSDataType.OBJECT) - seen += 1 - assert seen == 5 - - -if __name__ == "__main__": - pytest.main([__file__, "-v", "-rs"]) -``` diff --git a/src/zh/UserGuide/Master/Table/API/RestServiceV1_timecho.md b/src/zh/UserGuide/Master/Table/API/RestServiceV1_timecho.md deleted file mode 100644 index 117ba16ba..000000000 --- 
a/src/zh/UserGuide/Master/Table/API/RestServiceV1_timecho.md +++ /dev/null @@ -1,367 +0,0 @@ - - -# RestAPI V1 - -IoTDB's RESTful service can be used for query, write, and management operations. It uses the OpenAPI standard to define its interfaces and generate the framework. - -Note: Since V2.0.8.2, the TimechoDB installation package does not include the REST service JAR by default. Before using the service, contact the Timecho team to obtain the JAR and place it under timechodb_home/lib or timechodb_home/ext/external_service. - -## 1. Enabling the RESTful Service - -The RESTful service is disabled by default. To enable it, open the `conf/iotdb-system.properties` file in the IoTDB installation directory, set `enable_rest_service` to `true`, and restart the DataNode process. - -```Plain - enable_rest_service=true -``` - -## 2. Authentication - -Except for the health-check endpoint `/ping`, the RESTful service uses Basic authentication; every request must carry `Authorization` information in its header. - -1. Authentication format - -```JSON -Authorization: Basic <base64-credentials> -``` - -Here `<base64-credentials>` is the result of Base64-encoding `username:password` directly. It can be generated quickly as follows: - -* Linux/macOS - -```Bash -echo -n "your_username:your_password" | base64 -# Example: -echo -n "root:TimechoDB@2021" | base64 -``` - -* Windows - -```Bash -# PowerShell -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("username:password")) -# Example: -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("root:TimechoDB@2021")) - -# CMD -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"username:password\"))" -# Example: -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"root:TimechoDB@2021\"))" -``` - -2. Authentication example - -With the default username `root` and password `TimechoDB@2021`: - -* Concatenated string: `root:TimechoDB@2021` -* After Base64 encoding: `cm9vdDpUaW1lY2hvREJAMjAyMQ==` -* Final header: - -```JSON -Authorization: Basic cm9vdDpUaW1lY2hvREJAMjAyMQ== -``` - -3. Error responses -* Wrong username or password: returns HTTP status code `801` with the body: - -```JSON -{"code":801,"message":"WRONG_LOGIN_PASSWORD"} -``` - -* `Authorization` not set: returns HTTP status code `800` with the body: - -```JSON -{"code":800,"message":"INIT_AUTH_ERROR"} -``` - - -## 3. 
接口定义 - -### 3.1 ping - -ping 接口可以用于线上服务检活。 - -请求方式:`GET` - -请求路径:http://ip:port/ping - -请求示例: - -```Plain -curl http://127.0.0.1:18080/ping -``` - -返回的 HTTP 状态码: - -- `200`:当前服务工作正常,可以接收外部请求。 -- `503`:当前服务出现异常,不能接收外部请求。 - -响应参数: - -| 参数名称 | 参数类型 | 参数描述 | -| -------- | -------- | -------- | -| code | integer | 状态码 | -| message | string | 信息提示 | - -响应示例: - -- HTTP 状态码为 `200` 时: - -```JSON -{ "code": 200, "message": "SUCCESS_STATUS"} -``` - -- HTTP 状态码为 `503` 时: - -```JSON -{ "code": 500, "message": "thrift service is unavailable"} -``` - -注意:`ping` 接口访问不需要鉴权。 - -### 3.2 查询接口 - -- 请求地址:`/rest/table/v1/query` - -- 请求方式:post - -- Request 格式 - -请求头:`application/json` - -请求参数说明 - -| 参数名称 | 参数类型 | 是否必填 | 参数描述 | -| --------- | -------- | -------- | ------------------------------------------------------------ | -| database | string | 是 | 数据库名称 | -| sql | string | 是 | | -| row_limit | int | 否 | 一次查询能返回的结果集的最大行数。 如果不设置该参数,将使用配置文件的 `rest_query_default_row_size_limit` 作为默认值。 当返回结果集的行数超出限制时,将返回状态码 `411`。 | - -- Response 格式 - -| 参数名称 | 参数类型 | 参数描述 | -| ------------ | -------- | ------------------------------------------------------------ | -| column_names | array | 列名 | -| data_types | array | 每一列的类型 | -| values | array | 二维数组,第一维与结果集所有行,第二维数组代表结果集的每一行,每一个元素为一列,长度与column_names的长度相同。 | - -- 请求示例 - -```JSON -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"database":"test","sql":"select s1,s2,s3 from test_table"}' http://127.0.0.1:18080/rest/table/v1/query -``` - -- 响应示例: - -```JSON -{ - "column_names": [ - "s1", - "s2", - "s3" - ], - "data_types": [ - "STRING", - "BOOLEAN", - "INT32" - ], - "values": [ - [ - "a11", - true, - 2024 - ], - [ - "a11", - false, - 2025 - ] - ] -} -``` - -### 3.3 非查询接口 - -- 请求地址:`/rest/table/v1/nonQuery` - -- 请求方式:post - -- Request 格式 - - - 请求头:`application/json` - - - 请求参数说明 - -| 参数名称 | 参数类型 | 是否必填 | 参数描述 | -| -------- | -------- | -------- | -------- | -| sql | string | 是 | | -| database | string | 否 | 数据库名 
| - -- Response 格式 - -| 参数名称 | 参数类型 | 参数描述 | -| -------- | -------- | -------- | -| code | integer | 状态码 | -| message | string | 信息提示 | - -- 请求示例 - -```JSON -#创建数据库 -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"create database test","database":""}' http://127.0.0.1:18080/rest/table/v1/nonQuery -#在test库中创建表test_table -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"CREATE TABLE table1 (time TIMESTAMP TIME,region STRING TAG,plant_id STRING TAG,device_id STRING TAG,model_id STRING ATTRIBUTE,maintenance STRING ATTRIBUTE,temperature FLOAT FIELD,humidity FLOAT FIELD,status Boolean FIELD,arrival_time TIMESTAMP FIELD) WITH (TTL=31536000000)","database":"test"}' http://127.0.0.1:18080/rest/table/v1/nonQuery -``` - -- 响应示例: - -```JSON -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - -### 3.4 批量写入接口 - -- 请求地址:`/rest/table/v1/insertTablet` - -- 请求方式:post - -- Request 格式 - -请求头:`application/json` - -请求参数说明 - -| 参数名称 | 参数类型 | 是否必填 | 参数描述 | -| ----------------- | -------- | -------- | ------------------------------------------------------------ | -| database | string | 是 | 数据库名称 | -| table | string | 是 | 表名 | -| column_names | array | 是 | 列名 | -| column_categories | array | 是 | 列类别(TAG,FIELD,*ATTRIBUTE*) | -| data_types | array | 是 | 数据类型 | -| timestamps | array | 是 | 时间列 | -| values | array | 是 | 值列,每一列中的值可以为 `null`二维数组第一层长度跟timestamps长度相同。第二层长度跟column_names长度相同 | - -- Response 格式 - -响应参数: - -| 参数名称 | 参数类型 | 参数描述 | -| -------- | -------- | -------- | -| code | integer | 状态码 | -| message | string | 信息提示 | - -- 请求示例 - -```JSON -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data 
'{"database":"test","column_categories":["TAG","FIELD","FIELD"],"timestamps":[1739702535000,1739789055000],"column_names":["s1","s2","s3"],"data_types":["STRING","BOOLEAN","INT32"],"values":[["a11",true,2024],["a11",false,2025]],"table":"test_table"}' http://127.0.0.1:18080/rest/table/v1/insertTablet -``` - -- 响应示例 - -```JSON -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - -## 4. 配置 - -配置文件位于 `iotdb-system.properties` 中。 - -- 将 `enable_rest_service` 设置为 `true` 表示启用该模块,而将 `false` 表示禁用该模块。默认情况下,该值为 `false`。 - -```Plain -enable_rest_service=true -``` - -- 仅在 `enable_rest_service=true` 时生效。将 `rest_service_port `设置为数字(1025~65535),以自定义REST服务套接字端口,默认情况下,值为 `18080`。 - -```Plain -rest_service_port=18080 -``` - -- 将 'enable_swagger' 设置 'true' 启用swagger来展示rest接口信息, 而设置为 'false' 关闭该功能. 默认情况下,该值为 `false`。 - -```Plain -enable_swagger=false -``` - -- 一次查询能返回的结果集最大行数。当返回结果集的行数超出参数限制时,您只会得到在行数范围内的结果集,且将得到状态码`411`。 - -```Plain -rest_query_default_row_size_limit=10000 -``` - -- 缓存客户登录信息的过期时间(用于加速用户鉴权的速度,单位为秒,默认是8个小时) - -```Plain -cache_expire_in_seconds=28800 -``` - -- 缓存中存储的最大用户数量(默认是100) - -```Plain -cache_max_num=100 -``` - -- 缓存初始容量(默认是10) - -```Plain -cache_init_num=10 -``` - -- REST Service 是否开启 SSL 配置,将 `enable_https` 设置为 `true` 以启用该模块,而将 `false` 设置为禁用该模块。默认情况下,该值为 `false`。 - -```Plain -enable_https=false -``` - -- keyStore 所在路径(非必填) - -```Plain -key_store_path= -``` - -- keyStore 密码(非必填) - -```Plain -key_store_pwd= -``` - -- trustStore 所在路径(非必填) - -```Plain -trust_store_path="" -``` - -- trustStore 密码(非必填) - -```Plain -trust_store_pwd="" -``` - -- SSL 超时时间,单位为秒 - -```Plain -idle_timeout_in_seconds=5000 -``` \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/Background-knowledge/Cluster-Concept_timecho.md b/src/zh/UserGuide/Master/Table/Background-knowledge/Cluster-Concept_timecho.md deleted file mode 100644 index b0462f0dd..000000000 --- a/src/zh/UserGuide/Master/Table/Background-knowledge/Cluster-Concept_timecho.md +++ /dev/null @@ -1,132 +0,0 
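@@
-
-
-# Common Concepts
-
-## 1. Data Model Concepts
-
-### 1.1 Data Model (sql_dialect)
-
-IoTDB supports two time series data models (SQL dialects), both of which manage devices and measurement points:
-
-- Tree: data is organized as hierarchical paths; each path corresponds to one measurement point of one device.
-- Table: data is organized as relational tables; each table corresponds to one class of devices.
-
-### 1.2 Metadata (Schema)
-
-Metadata is the data model information of the database, that is, the tree structure or table structure, including definitions such as measurement point names and data types.
-
-### 1.3 Device
-
-A device corresponds to a physical device in a real-world scenario and usually contains multiple measurement points.
-
-### 1.4 Measurement Point (Timeseries)
-
-Also known as: physical quantity, time series, timeline, point, signal, metric, measured value, and so on.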
-测点是多个数据点按时间戳递增排列形成的一个时间序列。通常一个测点代表一个采集点位,能够定期采集所在环境的物理量。 - -### 1.5 编码(Encoding) - -编码是一种压缩技术,将数据以二进制的形式进行表示,可以提高存储效率。IoTDB 支持多种针对不同类型的数据的编码方法,详细信息请查看:[压缩和编码](../Technical-Insider/Encoding-and-Compression.md)。 - -### 1.6 压缩(Compression) - -IoTDB 在数据编码后,使用压缩技术进一步压缩二进制数据,提升存储效率。IoTDB 支持多种压缩方法,详细信息请查看:[压缩和编码](../Technical-Insider/Encoding-and-Compression.md)。 - -## 2. 分布式相关概念 - -下图展示了一个常见的 IoTDB 3C3D(3 个 ConfigNode、3 个 DataNode)的集群部署模式: - - - -IoTDB 的集群包括如下常见概念: - -- 节点(ConfigNode、DataNode、AINode) -- Region(SchemaRegion、DataRegion) -- 多副本 - -下文将对以上概念进行介绍。 - - -### 2.1 节点 - -IoTDB 集群包括三种节点(进程):ConfigNode(管理节点),DataNode(数据节点)和 AINode(分析节点),如下所示: - -- ConfigNode:管理集群的节点信息、配置信息、用户权限、元数据、分区信息等,负责分布式操作的调度和负载均衡,所有 ConfigNode 之间互为全量备份,如上图中的 ConfigNode-1,ConfigNode-2 和 ConfigNode-3 所示。 -- DataNode:服务客户端请求,负责数据的存储和计算,如上图中的 DataNode-1,DataNode-2 和 DataNode-3 所示。 -- AINode:负责提供机器学习能力,支持注册已训练好的机器学习模型,并通过 SQL 调用模型进行推理,目前已内置自研时序大模型和常见的机器学习算法(如预测与异常检测)。 - -### 2.2 数据分区 - -在 IoTDB 中,元数据和数据都被分为小的分区,即 Region,由集群的各个 DataNode 进行管理。 - -- SchemaRegion:元数据分区,管理一部分设备和测点的元数据。不同 DataNode 相同 RegionID 的 SchemaRegion 互为副本,如上图中 SchemaRegion-1 拥有三个副本,分别放置于 DataNode-1,DataNode-2 和 DataNode-3。 -- DataRegion:数据分区,管理一部分设备的一段时间的数据。不同 DataNode 相同 RegionID 的 DataRegion 互为副本,如上图中 DataRegion-2 拥有两个副本,分别放置于 DataNode-1 和 DataNode-2。 -- 具体分区算法可参考:[数据分区](../Technical-Insider/Cluster-data-partitioning.md) - -### 2.3 多副本 - -数据和元数据的副本数可配置,不同部署模式下的副本数推荐如下配置,其中多副本时可提供高可用服务。 - -| 类别 | 配置项 | 单机推荐配置 | 集群推荐配置 | -| :----- | :------------------------ | :----------- | :----------- | -| 元数据 | schema_replication_factor | 1 | 3 | -| 数据 | data_replication_factor | 1 | 2 | - - -## 3. 
部署相关概念 - -IoTDB 有三种运行模式:单机模式、集群模式和双活模式。 - -### 3.1 单机模式 - -IoTDB单机实例包括 1 个ConfigNode、1个DataNode,即1C1D; - -- **特点**:便于开发者安装部署,部署和维护成本较低,操作方便。 -- **适用场景**:资源有限或对高可用要求不高的场景,例如边缘端服务器。 -- **部署方法**:[单机版部署](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -### 3.2 双活模式 - -双活版部署为 TimechoDB 企业版功能,是指两个独立的实例进行双向同步,能同时对外提供服务。当一台停机重启后,另一个实例会将缺失数据断点续传。 - -> IoTDB 双活实例通常为2个单机节点,即2套1C1D。每个实例也可以为集群。 - -- **特点**:资源占用最低的高可用解决方案。 -- **适用场景**:资源有限(仅有两台服务器),但希望获得高可用能力。 -- **部署方法**:[双活版部署](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -### 3.3 集群模式 - -IoTDB 集群实例为 3 个ConfigNode 和不少于 3 个 DataNode,通常为 3 个 DataNode,即3C3D;当部分节点出现故障时,剩余节点仍然能对外提供服务,保证数据库服务的高可用性,且可随节点增加提升数据库性能。 - -- **特点**:具有高可用性、高扩展性,可通过增加 DataNode 提高系统性能。 -- **适用场景**:需要提供高可用和可靠性的企业级应用场景。 -- **部署方法**:[集群版部署](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - -### 3.4 特点总结 - -| 维度 | 单机模式 | 双活模式 | 集群模式 | -| ------------ | ---------------------------- | ------------------------ | ------------------------ | -| 适用场景 | 边缘侧部署、对高可用要求不高 | 高可用性业务、容灾场景等 | 高可用性业务、容灾场景等 | -| 所需机器数量 | 1 | 2 | ≥3 | -| 安全可靠性 | 无法容忍单点故障 | 高,可容忍单点故障 | 高,可容忍单点故障 | -| 扩展性 | 可扩展 DataNode 提升性能 | 每个实例可按需扩展 | 可扩展 DataNode 提升性能 | -| 性能 | 可随 DataNode 数量扩展 | 与其中一个实例性能相同 | 可随 DataNode 数量扩展 | - -- 单机模式和集群模式,部署步骤类似(逐个增加 ConfigNode 和 DataNode),仅副本数和可提供服务的最少节点数不同。 \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/Background-knowledge/Data-Model-and-Terminology_timecho.md b/src/zh/UserGuide/Master/Table/Background-knowledge/Data-Model-and-Terminology_timecho.md deleted file mode 100644 index 4bd0c054a..000000000 --- a/src/zh/UserGuide/Master/Table/Background-knowledge/Data-Model-and-Terminology_timecho.md +++ /dev/null @@ -1,391 +0,0 @@ - - -# 建模方案设计 - -本章节主要介绍如何将时序数据应用场景转化为IoTDB时序建模。 - -## 1. 时序数据模型 - -在构建IoTDB建模方案前,需要先了解时序数据和时序数据模型,详细内容见此页面:[时序数据模型](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - -## 2. 
IoTDB 的树表孪生模型 - -IoTDB 提供了树表孪生模型的方式,其特点分别如下: - -**树模型**:以测点为对象进行管理,每个测点对应一条时间序列,测点名按`.`分割可形成一个树形目录结构,与物理世界一一对应,对测点的读写操作简单直观。 - -**表模型**:推荐为每类设备创建一张表,同类设备的物理量采集都具备一定共性(如都采集温度和湿度物理量),数据分析灵活丰富。 - -### 2.1 模型特点 - -树表孪生模型有各自的适用场景。 - -以下表格从适用场景、典型操作等多个维度对树模型和表模型进行了对比。用户可以根据具体的使用需求,选择适合的模型,从而实现数据的高效存储和管理。 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
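-| Dimension | Tree model | Table model |
-| --- | --- | --- |
-| Applicable scenarios | Measurement point management, monitoring scenarios | Device management, analysis scenarios |
-| Typical operations | Read and write by specifying the path of a point | Filter and analyze data by tags |
-| Structural features | Flexible to add and remove nodes, like a file system | Template-based management that eases data governance |
-| Syntax features | Simple and flexible | Rich analysis capabilities |
-| Performance | Same | Same |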
- -**注意:** -- 同一个集群实例中可以存在两种模型空间,不同模型的语法、数据库命名方式不同,默认不互相可见。 - - -### 2.2 模型选择 - -IoTDB 支持通过多种客户端工具与数据库建立连接,不同客户端下进行模型选择的方式说明如下: - -1. [命令行工具 CLI](../Tools-System/CLI_timecho.md) - -通过 CLI 建立连接时,需要通过 `sql_dialect` 参数指定使用的模型(默认使用树模型)。 - -```Bash -# 树模型 -start-cli.sh(bat) -start-cli.sh(bat) -sql_dialect tree - -# 表模型 -start-cli.sh(bat) -sql_dialect table -``` - -2. [SQL](../User-Manual/Maintenance-statement_timecho.md#_2-1-设置连接的模型) - -在使用 SQL 语言进行数据操作时,可通过 set 语句切换使用的模型。 - -```SQL --- 指定为树模型 -IoTDB> SET SQL_DIALECT=TREE - --- 指定为表模型 -IoTDB> SET SQL_DIALECT=TABLE -``` - -3. 应用编程接口 - -通过多语言应用编程接口建立连接时,可通过模型对应的 session/sessionpool 创建连接池实例,简单示例如下: - -* [Java 原生接口](../API/Programming-Java-Native-API_timecho.md) - -```Java -// 树模型 -SessionPool sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(3) - .build(); - -//表模型 - ITableSessionPool tableSessionPool = - new TableSessionPoolBuilder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(1) - .build(); -``` - -* [Python 原生接口](../API/Programming-Python-Native-API_timecho.md) - -```Python -# 树模型 -session = Session( -​ ip=ip, -​ port=port, -​ user=username, -​ password=password, -​ fetch_size=1024, -​ zone_id="UTC+8", -​ enable_redirection=True -) - -# 表模型 -config = TableSessionPoolConfig( -​ node_urls=node_urls, -​ username=username, -​ password=password, -​ database=database, -​ max_pool_size=max_pool_size, -​ fetch_size=fetch_size, -​ wait_timeout_in_ms=wait_timeout_in_ms, -) -session_pool = TableSessionPool(config) -``` - -* [C++ 原生接口](../API/Programming-Cpp-Native-API_timecho.md) - -```C++ -// 树模型 -session = new Session(hostip, port, username, password); - -// 表模型 -session = (new TableSessionBuilder()) - ->host(ip) - ->rpcPort(port) - ->username(username) - ->password(password) - ->build(); -``` - -* [GO 原生接口](../API/Programming-Go-Native-API_timecho.md) - -```Go -//树模型 -config := &client.PoolConfig{ - Host: host, - Port: port, - 
UserName: user, - Password: password, -} -sessionPool = client.NewSessionPool(config, 3, 60000, 60000, false) -defer sessionPool.Close() - -//表模型 -config := &client.PoolConfig{ - Host: host, - Port: port, - UserName: user, - Password: password, - Database: dbname, -} -sessionPool := client.NewTableSessionPool(config, 3, 60000, 4000, false) -defer sessionPool.Close() -``` - -* [C# 原生接口](../API/Programming-CSharp-Native-API_timecho.md) - -```C# -//树模型 -var session_pool = new SessionPool(host, port, pool_size); - -//表模型 -var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(nodeUrls) - .SetUsername(username) - .SetPassword(password) - .SetFetchSize(1024) - .Build(); -``` - -* [JDBC](../API/Programming-JDBC_timecho.md) - -使用表模型,必须在 url 中指定 sql\_dialect 参数为 table。 - -```Java -// 树模型 -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection connection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667/", username, password); - -// 表模型 -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection connection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table", username, password); -``` - -### 2.3 树转表 - -IoTDB 提供了树转表功能,如下图所示: - -![](/img/tree-to-table-1.png) - -该功能支持通过创建表视图的方式,将已存在的树模型数据转化为表视图,进而通过表视图进行查询,实现了对同一份数据的树模型和表模型协同处理。更详细的功能介绍可参考[树转表视图](../User-Manual/Tree-to-Table_timecho.md),需要注意的是:​**创建树转表视图的 SQL 语句只允许在表模型下执行**​。 - - -## 3. 应用场景 - -应用场景主要包括三类: - -- 场景一:使用树模型进行数据的读写 - -- 场景二:使用表模型进行数据的读写 - -- 场景三:共用一份数据,使用树模型进行数据读写、使用表模型进行数据分析 - -### 3.1 场景一:树模型 - -#### 3.1.1 特点 - -- 简单直观,和物理世界的监测点位一一对应 - -- 类似文件系统一样灵活,可以设计任意分支结构 - -- 适用 DCS、SCADA 等工业监控场景 - -#### 3.1.2 基础概念 - -| **概念** | **定义** | -| -------------------- | ------------------------------------------------------------ | -| **数据库** | 定义:一个以 root. 为前缀的路径
命名推荐:仅包含 root 的下一级节点,如 root.db
数量推荐:上限和内存相关,一个数据库也可以充分利用机器资源,无需为性能原因创建多个数据库
创建方式:推荐手动创建,也可创建时间序列时自动创建(默认为 root 的下一级节点) | -| **时间序列(测点)** | 定义:
1. 一个以数据库路径为前缀的、由 . 分割的路径,可包含任意多个层级,如 root.db.turbine.device1.metric1
2. 每个时间序列可以有不同的数据类型。
命名推荐:
1. 仅将唯一定位时间序列的标签(类似联合主键)放入路径中,一般不超过10层
2. 通常将基数(不同的取值数量)少的标签放在前面,便于系统将公共前缀进行压缩
数量推荐:
1. 集群可管理的时间序列总量和总内存相关,可参考资源推荐章节
2. 任一层级的子节点数量没有限制
创建方式:可手动创建或在数据写入时自动创建。 | -| **设备** | 定义:倒数第二级为设备,如 root.db.turbine.**device1**.metric1中的“device1”这一层级即为设备
创建方式:无法仅创建设备,随时间序列创建而存在 | - -#### 3.1.3 建模示例 - -##### 3.1.3.1 有多种类型的设备需要管理,如何建模? - -- 如场景中不同类型的设备具备不同的层级路径和测点集合,可以在数据库节点下按设备类型创建分支。每种设备下可以有不同的测点结构。 - -
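The path conventions from section 3.1.2 can be sketched in a few lines of Python. This is only an illustration, not an IoTDB API: the helper names and the example paths (`turbine`, `pump`, `device7`, and so on) are hypothetical, and the sketch assumes plain node names separated by `.` with no quoted nodes that contain dots themselves.

```python
from collections import defaultdict


def split_series_path(path):
    """Split a tree-model series path into (device_path, measurement).

    Assumption: nodes are separated by "." and the last node is the
    measurement, so everything before it is the device path.
    """
    nodes = path.split(".")
    if len(nodes) < 3 or nodes[0] != "root":
        raise ValueError(f"not a full series path: {path!r}")
    return ".".join(nodes[:-1]), nodes[-1]


def group_by_device(paths):
    """Map each device path to the measurements found under it."""
    devices = defaultdict(list)
    for path in paths:
        device, measurement = split_series_path(path)
        devices[device].append(measurement)
    return dict(devices)


# Hypothetical paths: two device types, each with its own branch
# under the database node and its own set of measurements.
paths = [
    "root.db.turbine.device1.speed",
    "root.db.turbine.device1.power",
    "root.db.pump.device7.pressure",
]
devices = group_by_device(paths)
```

Each device type gets its own branch under the database node, so the two branches can carry different measurement sets without affecting each other.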
- -
- -##### 3.1.3.2 如果场景中没有设备,只有测点,如何建模? - -- 如场站的监控系统中,每个测点都有唯一编号,但无法对应到某些设备。 - -
- -
- -##### 3.1.3.3 如果在一个设备下,既有子设备,也有测点,如何建模? - -- 如在储能场景中,每一层结构都要监控其电压和电流,可以采用如下建模方式。 - -
- -
- - -### 3.2 场景二:表模型 - -#### 3.2.1 特点 - -- 以时序表建模管理设备时序数据,便于使用标准 SQL 进行分析 - -- 适用于设备数据分析或从其他数据库迁移至 IoTDB 的场景 - -#### 3.2.2 基础概念 - -- 数据库:可管理多类设备 - -- 时序表:对应一类设备 - -| **列类别** | **定义** | -| --------------------------- | ------------------------------------------------------------ | -| **时间列(TIME)** | 每个时序表必须有一个时间列,且列名必须为 time,数据类型为 TIMESTAMP | -| **标签列(TAG)** | 设备的唯一标识(联合主键),可以为 0 至多个
标签信息不可修改和删除,但允许增加
推荐按粒度由大到小进行排列 | -| **测点列(FIELD)** | 一个设备采集的测点可以有1个至多个,值随时间变化
表的测点列没有数量限制,可以达到数十万以上 | -| **属性列(ATTRIBUTE)** | 对设备的补充描述,**不随时间变化**
设备属性信息可以有0个或多个,可以更新或新增
少量希望修改的静态属性可以存至此列 | - - -数据筛选效率:时间列=标签列>属性列>测点列 - -#### 3.2.3 建模示例 - -##### 3.2.3.1 有多种类型的设备需要管理,如何建模? - -- 推荐为每一类型的设备建立一张表,每个表可以具有不同的标签和测点集合。 -- 即使设备之间有联系,或有层级关系,也推荐为每一类设备建一张表。 - -
- -
- -##### 3.2.3.2 如果没有设备标识列和属性列,如何建模? - -- 列数没有数量限制,可以达到数十万以上。 - -
- -
- -##### 3.2.3.3 如果在一个设备下,既有子设备,也有测点,如何建模? - -- 每个设备有多个子设备及测点信息,推荐为每类设备建一个表进行管理。 - -
- -
- -### 3.3 场景三:双模型结合 - -#### 3.3.1 特点 - -- 巧妙融合了树模型与表模型的优点,共用一份数据,写入灵活,查询丰富。 - -- 数据写入阶段,采用树模型语法,支持数据灵活接入和扩展。 - -- 数据分析阶段,采用表模型语法,允许用户通过标准 SQL 查询语言,执行复杂的数据分析。 - -#### 3.3.2 建模示例 - -##### 3.3.2.1 有多种类型的设备需要管理,如何建模? - -- 场景中不同类型的设备具备不同的层级路径和测点集合。 - -- 树模型:在数据库节点下按设备类型创建分支,每种设备下可以有不同的测点结构。 - -- 表视图:为每种类型的设备建立一张表视图,每个表视图具有不同的标签和测点集合。 - -
- -
- -##### 3.3.2.2 如果没有设备标识列和属性列,如何建模? - -- 树模型:每个测点都有唯一编号,但无法对应到某些设备。 - -- 表视图:将所有测点放入一张表中,测点列数没有数量限制,可以达到数十万以上。若测点具有相同的数据类型,可将测点作为同一类设备。 - -
- -
- -##### 3.3.2.3 如果在一个设备下,既有子设备,也有测点,如何建模? - -- 树模型:按照物理世界的监测点,对每一层结构进行建模。 - -- 表视图:按照设备分类,建立多个表对每一层结构信息进行管理。 - -
- -
diff --git a/src/zh/UserGuide/Master/Table/Background-knowledge/Data-Type_timecho.md b/src/zh/UserGuide/Master/Table/Background-knowledge/Data-Type_timecho.md deleted file mode 100644 index 42ee3009a..000000000 --- a/src/zh/UserGuide/Master/Table/Background-knowledge/Data-Type_timecho.md +++ /dev/null @@ -1,189 +0,0 @@ - - -# 数据类型 - -## 1. 基本数据类型 - -IoTDB 支持以下十一种数据类型: - -* BOOLEAN(布尔值) -* INT32(整型) -* INT64(长整型) -* FLOAT(单精度浮点数) -* DOUBLE(双精度浮点数) -* TEXT(长字符串,不推荐使用) -* STRING(字符串) -* BLOB(大二进制对象) -* OBJECT(大二进制对象) - > V2.0.8 版本起支持 -* TIMESTAMP(时间戳) -* DATE(日期) - -其中: -1. STRING 和 TEXT 类型的区别在于,STRING 类型具有更多的统计信息,能够用于优化值过滤查询。TEXT 类型适合用于存储长字符串。 -2. OBJECT 和 BLOB 类型的区别如下: - - | | **OBJECT** | **BLOB** | - | ---------------------- |-------------------------------------------------------------------------------------------------------------------------| -------------------------------------------- | - | 写放大(越低越好) | 低(写放大系数永远为 1) | 高(写放大系数为 2 + 合并次数) | - | 空间放大(越低越好) | 低(merge & release on write) | 高(merge on read and release on compact) | - | 查询结果 | 默认查询 OBJECT 列时,返回结果如`(Object) XX.XX KB)`。
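The actual OBJECT data is stored under `${data_dir}/object_data`, and its real content can be read with the `READ_OBJECT` function | Returns the actual binary content directly |
-
-
-### 1.1 Data Type Compatibility
-
-When the type of the written data differs from the data type registered for the series:
-- If the series data type is not compatible with the written data type, the system reports an error.
-- If the series data type is compatible with the written data type, the system automatically converts the written data to the registered series type.
-
-The compatibility of each data type is shown in the following table:
-
-| Series data type | Supported write data types |
-|-----------|------------------------------------|
-| BOOLEAN | BOOLEAN |
-| INT32 | INT32 |
-| INT64 | INT32 INT64 TIMESTAMP |
-| FLOAT | INT32 FLOAT |
-| DOUBLE | INT32 INT64 FLOAT DOUBLE TIMESTAMP |
-| TEXT | TEXT STRING |
-| STRING | TEXT STRING |
-| BLOB | TEXT STRING BLOB |
-| OBJECT | OBJECT |
-| TIMESTAMP | INT32 INT64 TIMESTAMP |
-| DATE | DATE |
-
-## 2. Timestamp Types
-
-A timestamp is the point in time at which a data point arrives. Timestamps are either absolute or relative.
-
-### 2.1 Absolute Timestamps
-
-IoTDB has two kinds of absolute timestamps: the LONG type and the DATETIME type (which has two subtypes, DATETIME-INPUT and DATETIME-DISPLAY).
-
-When entering a timestamp, you can use either a LONG timestamp or a DATETIME-INPUT timestamp. The formats supported by DATETIME-INPUT are listed in the following table:
-
-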
-
-**Formats supported by DATETIME-INPUT**
-
-
-| format |
-| :--------------------------- |
-| yyyy-MM-dd HH:mm:ss |
-| yyyy/MM/dd HH:mm:ss |
-| yyyy.MM.dd HH:mm:ss |
-| yyyy-MM-dd HH:mm:ssZZ |
-| yyyy/MM/dd HH:mm:ssZZ |
-| yyyy.MM.dd HH:mm:ssZZ |
-| yyyy/MM/dd HH:mm:ss.SSS |
-| yyyy-MM-dd HH:mm:ss.SSS |
-| yyyy.MM.dd HH:mm:ss.SSS |
-| yyyy-MM-dd HH:mm:ss.SSSZZ |
-| yyyy/MM/dd HH:mm:ss.SSSZZ |
-| yyyy.MM.dd HH:mm:ss.SSSZZ |
-| ISO8601 standard time format |
-
-
-
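To make the relationship between the two input forms concrete, the sketch below converts a DATETIME-INPUT style string into a LONG timestamp in plain Python. It is an illustration under stated assumptions, not IoTDB's own parser: the function name `to_epoch_ms` is hypothetical, millisecond precision is assumed, text without an offset is interpreted in an assumed default zone (UTC here), and the generic ISO8601 row of the table is not covered.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(text, default_tz=timezone.utc):
    """Convert a DATETIME-INPUT style string to a millisecond timestamp."""
    for sep in "-/.":
        for fraction in ("", ".%f"):
            for zone in ("", "%z"):
                fmt = f"%Y{sep}%m{sep}%d %H:%M:%S{fraction}{zone}"
                try:
                    parsed = datetime.strptime(text, fmt)
                except ValueError:
                    continue  # try the next candidate format
                if parsed.tzinfo is None:
                    # No offset in the text: use the assumed default zone.
                    parsed = parsed.replace(tzinfo=default_tz)
                # Integer division avoids float rounding on the millisecond part.
                return (parsed - _EPOCH) // timedelta(milliseconds=1)
    raise ValueError(f"unsupported timestamp format: {text!r}")
```

The three nested loops enumerate the twelve explicit patterns in the table (three date separators, with or without milliseconds, with or without a zone offset), so the same instant written in different formats maps to the same LONG value.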
- - -IoTDB 在显示时间戳时可以支持 LONG 类型以及 DATETIME-DISPLAY 类型,其中 DATETIME-DISPLAY 类型可以支持用户自定义时间格式。自定义时间格式的语法如表所示: - -
- -**DATETIME-DISPLAY 自定义时间格式的语法** - - -| Symbol | Meaning | Presentation | Examples | -| :----: | :-------------------------: | :----------: | :--------------------------------: | -| G | era | era | era | -| C | century of era (>=0) | number | 20 | -| Y | year of era (>=0) | year | 1996 | -| | | | | -| x | weekyear | year | 1996 | -| w | week of weekyear | number | 27 | -| e | day of week | number | 2 | -| E | day of week | text | Tuesday; Tue | -| | | | | -| y | year | year | 1996 | -| D | day of year | number | 189 | -| M | month of year | month | July; Jul; 07 | -| d | day of month | number | 10 | -| | | | | -| a | halfday of day | text | PM | -| K | hour of halfday (0~11) | number | 0 | -| h | clockhour of halfday (1~12) | number | 12 | -| | | | | -| H | hour of day (0~23) | number | 0 | -| k | clockhour of day (1~24) | number | 24 | -| m | minute of hour | number | 30 | -| s | second of minute | number | 55 | -| S | fraction of second | millis | 978 | -| | | | | -| z | time zone | text | Pacific Standard Time; PST | -| Z | time zone offset/id | zone | -0800; -08:00; America/Los_Angeles | -| | | | | -| ' | escape for text | delimiter | | -| '' | single quote | literal | ' | - -
- -### 2.2 相对时间戳 - - 相对时间是指与服务器时间```now()```和```DATETIME```类型时间相差一定时间间隔的时间。 - 形式化定义为: - - ``` - Duration = (Digit+ ('Y'|'MO'|'W'|'D'|'H'|'M'|'S'|'MS'|'US'|'NS'))+ - RelativeTime = (now() | DATETIME) ((+|-) Duration)+ - ``` - -
-
- **The syntax of the duration unit**
-
-
- | Symbol | Meaning | Presentation | Examples |
- | :----: | :---------: | :----------------------: | :------: |
- | y | year | 1y=365 days | 1y |
- | mo | month | 1mo=30 days | 1mo |
- | w | week | 1w=7 days | 1w |
- | d | day | 1d=1 day | 1d |
- | | | | |
- | h | hour | 1h=3600 seconds | 1h |
- | m | minute | 1m=60 seconds | 1m |
- | s | second | 1s=1 second | 1s |
- | | | | |
- | ms | millisecond | 1ms=1000_000 nanoseconds | 1ms |
- | us | microsecond | 1us=1000 nanoseconds | 1us |
- | ns | nanosecond | 1ns=1 nanosecond | 1ns |
-
-
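The `Duration` grammar and the unit table above can be sketched as a small Python parser. This is a minimal illustration rather than IoTDB's implementation: the function name `duration_to_ns` is hypothetical, unit lengths are the fixed values from the table (1y = 365 days, 1mo = 30 days), and unit names are matched case-insensitively on the assumption that the uppercase grammar and the lowercase table describe the same units.

```python
import re

_SECOND = 1_000_000_000  # nanoseconds per second

# Unit lengths in nanoseconds, taken from the table above.
_UNIT_NS = {
    "ns": 1,
    "us": 1_000,
    "ms": 1_000_000,
    "s": _SECOND,
    "m": 60 * _SECOND,
    "h": 3_600 * _SECOND,
    "d": 86_400 * _SECOND,
    "w": 7 * 86_400 * _SECOND,
    "mo": 30 * 86_400 * _SECOND,
    "y": 365 * 86_400 * _SECOND,
}

# Two-letter units come first so that "mo" and "ms" are not read as "m".
_TOKEN = re.compile(r"(\d+)(mo|ms|us|ns|y|w|d|h|m|s)", re.IGNORECASE)


def duration_to_ns(text):
    """Sum every (Digit+ unit) pair of a duration such as "1d2h"."""
    position, total = 0, 0
    while position < len(text):
        match = _TOKEN.match(text, position)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += int(match.group(1)) * _UNIT_NS[match.group(2).lower()]
        position = match.end()
    if position == 0:
        raise ValueError("empty duration")
    return total
```

Because a duration is a sequence of number and unit pairs, a compound value such as `1d2h` is the sum of its parts, which is how a relative timestamp like `now() - 1d2h` is resolved against the server time.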
- - 例子: - - ``` - now() - 1d2h //比服务器时间早 1 天 2 小时的时间 - now() - 1w //比服务器时间早 1 周的时间 - ``` - - > 注意:'+'和'-'的左右两边必须有空格 \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/Background-knowledge/Navigating_Time_Series_Data_timecho.md b/src/zh/UserGuide/Master/Table/Background-knowledge/Navigating_Time_Series_Data_timecho.md deleted file mode 100644 index 0f7011af7..000000000 --- a/src/zh/UserGuide/Master/Table/Background-knowledge/Navigating_Time_Series_Data_timecho.md +++ /dev/null @@ -1,45 +0,0 @@ - -# 时序数据模型 - -## 1. 什么叫时序数据? - -万物互联的今天,物联网场景、工业场景等各类场景都在进行数字化转型,人们通过在各类设备上安装传感器对设备的各类状态进行采集。如电机采集电压、电流,风机的叶片转速、角速度、发电功率;车辆采集经纬度、速度、油耗;桥梁的振动频率、挠度、位移量等。传感器的数据采集,已经渗透在各个行业中。 - -![](/img/%E6%97%B6%E5%BA%8F%E6%95%B0%E6%8D%AE%E4%BB%8B%E7%BB%8D.png) - - -通常来说,我们把每个采集点位叫做一个**测点( 也叫物理量、时间序列、时间线、信号量、指标、测量值等)**,每个测点都在随时间的推移不断收集到新的数据信息,从而构成了一条**时间序列**。用表格的方式,每个时间序列就是一个由时间、值两列形成的表格;用图形化的方式,每个时间序列就是一个随时间推移形成的走势图,也可以形象的称之为设备的“心电图”。 - -![](/img/%E5%BF%83%E7%94%B5%E5%9B%BE1.png) - -传感器产生的海量时序数据是各行各业数字化转型的基础,因此我们对时序数据的模型梳理主要围绕设备、传感器展开。 - -## 2. 时序数据中的关键概念有哪些? - -时序数据中主要涉及的概念如下。 - -| **设备(Device)** | 也称为实体、装备等,是实际场景中拥有物理量的设备或装置。在 IoTDB 中,实体是管理一组时间序列的集合,可以是一个物理设备、测量装置、传感器集合等。如:能源场景:风机,由区域、场站、线路、机型、实例等标识工厂场景:机械臂,由物联网平台生成的唯一 ID 标识车联网场景:车辆,由车辆识别代码 VIN 标识监控场景:CPU,由机房、机架、Hostname、设备类型等标识 | -| ------------------------------- |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| **测点(FIELD)** | 也称物理量、信号量、指标、点位、工况等,是在实际场景中检测装置记录的测量信息。通常一个物理量代表一个采集点位,能够定期采集所在环境的物理量。如:能源电力场景:电流、电压、风速、转速车联网场景:油量、车速、经度、维度工厂场景:温度、湿度
_表模型下**测点数量**等于所有表的测点数之和(每张表的测点数 = device 数量 * field 的列数),具体统计方法可参考_[元数据查询](../Basic-Concept/Table-Management_timecho.md#_1-7-元数据查询) | -| **数据点(Data Point)** | 由一个时间戳和一个数值组成,其中时间戳为 long 类型,数值可以为 BOOLEAN、FLOAT、INT32 等各种类型。如下图表格形式的时间序列的一行,或图形形式的时间序列的一个点,就是一个数据点。
| -| **采集频率(Frequency)** | 指物理量在一定时间内产生数据的次数。例如,一个温度传感器可能每秒钟采集一次温度数据,那么它的采集频率就是1Hz(赫兹),即每秒1次。 | -| **数据保存时间(TTL)** | TTL 指定表中数据的保存时间,超过 TTL 的数据将自动删除。IoTDB 支持对不同的表设定不同的数据存活时间,便于 IoTDB 定期、自动地删除一定时间之前的数据。合理使用 TTL 可以控制 IoTDB 占用的总磁盘空间,避免磁盘写满等异常,并维持较高的查询性能和减少内存资源占用。 | \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/Basic-Concept/Database-Management_timecho.md b/src/zh/UserGuide/Master/Table/Basic-Concept/Database-Management_timecho.md deleted file mode 100644 index 43b014c57..000000000 --- a/src/zh/UserGuide/Master/Table/Basic-Concept/Database-Management_timecho.md +++ /dev/null @@ -1,176 +0,0 @@ - - -# 数据库管理 - -## 1. 数据库管理 - -### 1.1 创建数据库 - -用于创建数据库。 - -**语法:** - -```SQL - CREATE DATABASE (IF NOT EXISTS)? (WITH properties)? -``` - -**说明:** - -1. 数据库名称,具有以下特性: - - 大小写不敏感,创建成功后,统一显示为小写 - - 名称的长度不得超过 64 个字符。 - - 名称中包含下划线(_)、数字(非开头)、英文字母可以直接创建 - - 名称中包含特殊字符(如`)、中文字符、数字开头时,必须用双引号 "" 括起来。 - -2. WITH properties 子句可配置如下属性: - -> 注:属性的大小写不敏感,有关详细信息[大小写敏感规则](../SQL-Manual/Identifier.md#大小写敏感性)。 - -| 属性 | 含义 | 默认值 | -| ------------------------- | ---------------------------------------- | --------- | -| `TTL` | 数据自动过期删除,单位 ms | INF | -| `TIME_PARTITION_INTERVAL` | 数据库的时间分区间隔,单位 ms | 604800000 | -| `SCHEMA_REGION_GROUP_NUM` | 数据库的元数据副本组数量,一般不需要修改 | 1 | -| `DATA_REGION_GROUP_NUM` | 数据库的数据副本组数量,一般不需要修改 | 2 | - -**示例:** - -```SQL -CREATE DATABASE IF NOT EXISTS database1 with(TTL=31536000000); -``` - -### 1.2 使用数据库 - -用于指定当前数据库作为表的命名空间。 - -**语法:** - -```SQL -USE -``` - -**示例:** - -```SQL -USE database1; -``` - -### 1.3 查看当前数据库 - -返回当前会话所连接的数据库名称,若未执行过 `use`语句指定数据库,则默认为 `null`。 - -**语法:** - -```SQL -SHOW CURRENT_DATABASE -``` - -**示例:** - -```SQL -USE database1; -SHOW CURRENT_DATABASE; -``` -```shell -+---------------+ -|CurrentDatabase| -+---------------+ -| database1| -+---------------+ -``` - -### 1.4 查看所有数据库 - -用于查看所有数据库和数据库的属性信息。 - -**语法:** - -```SQL -SHOW DATABASES (DETAILS)? 
-``` - -**语句返回列含义如下:** - -| 列名 | 含义 | -| ----------------------- |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| database | database名称。 | -| TTL | 数据保留周期。如果在创建数据库的时候指定TTL,则TTL对该数据库下所有表的TTL生效。也可以再通过 [create table](../Basic-Concept/Table-Management_timecho.md#11-创建表) 、[alter table](../Basic-Concept/Table-Management_timecho.md#14-修改表) 来设置或更新表的TTL时间。 | -| SchemaReplicationFactor | 元数据副本数,用于确保元数据的高可用性。可以在`iotdb-system.properties`中修改`schema_replication_factor`配置项。 | -| DataReplicationFactor | 数据副本数,用于确保数据的高可用性。可以在`iotdb-system.properties`中修改`data_replication_factor`配置项。 | -| TimePartitionInterval | 时间分区间隔,决定了数据在磁盘上按多长时间进行目录分组,通常采用默认1周即可。 | -| SchemaRegionGroupNum | 使用`DETAILS`语句会返回此列,展示数据库的元数据副本组数量,一般不需要修改 | -| DataRegionGroupNum | 使用`DETAILS`语句会返回此列,展示数据库的数据副本组数量,一般不需要修改 | - -**示例:** - -```SQL -SHOW DATABASES DETAILS; -``` -```shell -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|DataRegionGroupNum| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| database1| INF| 1| 1| 604800000| 1| 2| -|information_schema| INF| null| null| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -``` - -### 1.5 修改数据库 - -用于修改数据库中的部分属性。 - -**语法:** - -```SQL -ALTER DATABASE (IF EXISTS)? database=identifier SET PROPERTIES propertyAssignments -``` - -**说明:** - -1. 
`ALTER DATABASE`操作目前仅支持对数据库的`SCHEMA_REGION_GROUP_NUM`、`DATA_REGION_GROUP_NUM`以及`TTL`属性进行修改。 - -**示例:** - -```SQL -ALTER DATABASE database1 SET PROPERTIES TTL=31536000000; -``` - -### 1.6 删除数据库 - -用于删除数据库。 - -**语法:** - -```SQL -DROP DATABASE (IF EXISTS)? -``` - -**说明:** - -1. 数据库已被设置为当前使用(use)的数据库,仍然可以被删除(drop)。 -2. 删除数据库将导致所选数据库及其内所有表连同其存储的数据一并被删除。 - -**示例:** - -```SQL -DROP DATABASE IF EXISTS database1; -``` diff --git a/src/zh/UserGuide/Master/Table/Basic-Concept/Query-Data_timecho.md b/src/zh/UserGuide/Master/Table/Basic-Concept/Query-Data_timecho.md deleted file mode 100644 index 48c47bc3d..000000000 --- a/src/zh/UserGuide/Master/Table/Basic-Concept/Query-Data_timecho.md +++ /dev/null @@ -1,592 +0,0 @@ - - -# 数据查询 - -## 1. 语法概览 - -```SQL -SELECT ⟨select_list⟩ - FROM ⟨tables⟩ | patternRecognition - [WHERE ⟨condition⟩] - [GROUP BY ⟨groups⟩] - [HAVING ⟨group_filter⟩] - [WINDOW windowDefinition (',' windowDefinition)*)] - [FILL ⟨fill_methods⟩] - [ORDER BY ⟨order_expression⟩] - [OFFSET ⟨n⟩] - [LIMIT ⟨n⟩]; -``` - -IoTDB 查询语法提供以下子句: - -- SELECT 子句:查询结果应包含的列。详细语法见:[SELECT子句](../SQL-Manual/Select-Clause_timecho.md) -- FROM 子句:指出查询的数据源,可以是单个表、多个通过 `JOIN` 子句连接的表,或者是一个子查询。详细语法见:[FROM & JOIN 子句](../SQL-Manual/From-Join-Clause.md) -- WHERE 子句:用于过滤数据,只选择满足特定条件的数据行。这个子句在逻辑上紧跟在 FROM 子句之后执行。详细语法见:[WHERE 子句](../SQL-Manual/Where-Clause.md) -- GROUP BY 子句:当需要对数据进行聚合时使用,指定了用于分组的列。详细语法见:[GROUP BY 子句](../SQL-Manual/GroupBy-Clause.md) -- HAVING 子句:在 GROUP BY 子句之后使用,用于对已经分组的数据进行过滤。与 WHERE 子句类似,但 HAVING 子句在分组后执行。详细语法见:[HAVING 子句](../SQL-Manual/Having-Clause.md) -- FILL 子句:用于处理查询结果中的空值,用户可以使用 FILL 子句来指定数据缺失时的填充模式(如前一个非空值或线性插值)来填充 null 值,以便于数据可视化和分析。 详细语法见:[FILL 子句](../SQL-Manual/Fill-Clause.md) -- ORDER BY 子句:对查询结果进行排序,可以指定升序(ASC)或降序(DESC),以及 NULL 值的处理方式(NULLS FIRST 或 NULLS LAST)。详细语法见:[ORDER BY 子句](../SQL-Manual/OrderBy-Clause.md) -- OFFSET 子句:用于指定查询结果的起始位置,即跳过前 OFFSET 行。与 LIMIT 子句配合使用。详细语法见:[LIMIT 和 OFFSET 子句](../SQL-Manual/Limit-Offset-Clause.md) -- LIMIT 子句:限制查询结果的行数,通常与 OFFSET 
子句一起使用以实现分页功能。详细语法见:[LIMIT 和 OFFSET 子句](../SQL-Manual/Limit-Offset-Clause.md) - -## 2. 子句执行顺序 - -![](/img/data-query-1.png) - - - -## 3. 常见查询示例 - -### 3.1 示例数据 - -在[示例数据页面](../Reference/Sample-Data.md)中,包含了用于构建表结构和插入数据的SQL语句,下载并在IoTDB CLI中执行这些语句,即可将数据导入IoTDB,您可以使用这些数据来测试和执行示例中的SQL语句,并获得相应的结果。 - -### 3.2 原始数据查询 - -**示例1:根据时间过滤** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE time >= 2024-11-27 00:00:00 and time <= 2024-11-29 00:00:00; -``` - -执行结果如下: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-28T08:00:00.000+08:00| 85.0| null| -|2024-11-28T09:00:00.000+08:00| null| 40.9| -|2024-11-28T10:00:00.000+08:00| 85.0| 35.2| -|2024-11-28T11:00:00.000+08:00| 88.0| 45.1| -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:40:00.000+08:00| 85.0| null| -|2024-11-27T16:41:00.000+08:00| 85.0| null| -|2024-11-27T16:42:00.000+08:00| null| 35.2| -|2024-11-27T16:43:00.000+08:00| null| null| -|2024-11-27T16:44:00.000+08:00| null| null| -+-----------------------------+-----------+--------+ -Total line number = 11 -It costs 0.075s -``` - -**示例2:根据值过滤** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE temperature > 89.0; -``` - -执行结果如下: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-29T18:30:00.000+08:00| 90.0| 35.4| -|2024-11-26T13:37:00.000+08:00| 90.0| 35.1| -|2024-11-26T13:38:00.000+08:00| 90.0| 35.1| -|2024-11-30T09:30:00.000+08:00| 90.0| 35.2| -|2024-11-30T14:30:00.000+08:00| 90.0| 34.8| -+-----------------------------+-----------+--------+ -Total line number = 5 -It costs 0.156s -``` - -**示例3:根据属性过滤** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE model_id ='B'; -``` - -执行结果如下: - -```SQL 
-+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:40:00.000+08:00| 85.0| null| -|2024-11-27T16:41:00.000+08:00| 85.0| null| -|2024-11-27T16:42:00.000+08:00| null| 35.2| -|2024-11-27T16:43:00.000+08:00| null| null| -|2024-11-27T16:44:00.000+08:00| null| null| -+-----------------------------+-----------+--------+ -Total line number = 7 -It costs 0.106s -``` - -**示例4:多设备按时间对齐查询** - -```SQL -IoTDB> SELECT date_bin_gapfill(1d, TIME) AS a_time, - device_id, - AVG(temperature) AS avg_temp - FROM table1 - WHERE TIME >= 2024-11-26 13:00:00 - AND TIME <= 2024-11-27 17:00:00 - GROUP BY 1, device_id FILL METHOD PREVIOUS; -``` - -执行结果如下: - -```SQL -+-----------------------------+---------+--------+ -| a_time|device_id|avg_temp| -+-----------------------------+---------+--------+ -|2024-11-26T08:00:00.000+08:00| 100| 90.0| -|2024-11-27T08:00:00.000+08:00| 100| 90.0| -|2024-11-26T08:00:00.000+08:00| 101| 90.0| -|2024-11-27T08:00:00.000+08:00| 101| 85.0| -+-----------------------------+---------+--------+ -Total line number = 4 -It costs 0.048s -``` - -### 3.3 聚合查询 - -**示例:查询计算了在指定时间范围内,每个`device_id`的平均温度、最高温度和最低温度。** - -```SQL -IoTDB> SELECT device_id, AVG(temperature) as avg_temp, MAX(temperature) as max_temp, MIN(temperature) as min_temp - FROM table1 - WHERE time >= 2024-11-26 00:00:00 AND time <= 2024-11-29 00:00:00 - GROUP BY device_id; -``` - -执行结果如下: - -```SQL -+---------+--------+--------+--------+ -|device_id|avg_temp|max_temp|min_temp| -+---------+--------+--------+--------+ -| 100| 87.6| 90.0| 85.0| -| 101| 85.0| 85.0| 85.0| -+---------+--------+--------+--------+ -Total line number = 2 -It costs 0.278s -``` - -### 3.4 最新点查询 - -**示例:查询表中每个 `device_id` 返回最后一条记录,包含该记录的温度值以及在该设备中基于时间和温度排序的最后一条记录。** - -```SQL -IoTDB> SELECT device_id,last(time),last_by(temperature,time) - FROM 
table1 - GROUP BY device_id; -``` - -The result is as follows: - -```SQL -+---------+-----------------------------+-----+ -|device_id| _col1|_col2| -+---------+-----------------------------+-----+ -| 100|2024-11-29T18:30:00.000+08:00| 90.0| -| 101|2024-11-30T14:30:00.000+08:00| 90.0| -+---------+-----------------------------+-----+ -Total line number = 2 -It costs 0.090s -``` - -### 3.5 Downsampling Query (date_bin Function) - -**Example: Group the data by day and compute the daily average temperature of each device.** - -```SQL -IoTDB> SELECT device_id,date_bin(1d ,time) as day_time, AVG(temperature) as avg_temp - FROM table1 - WHERE time >= 2024-11-26 00:00:00 AND time <= 2024-11-30 00:00:00 - GROUP BY device_id,date_bin(1d ,time); -``` - -The result is as follows: - -```SQL -+---------+-----------------------------+--------+ -|device_id| day_time|avg_temp| -+---------+-----------------------------+--------+ -| 100|2024-11-29T08:00:00.000+08:00| 90.0| -| 100|2024-11-28T08:00:00.000+08:00| 86.0| -| 100|2024-11-26T08:00:00.000+08:00| 90.0| -| 101|2024-11-29T08:00:00.000+08:00| 85.0| -| 101|2024-11-27T08:00:00.000+08:00| 85.0| -+---------+-----------------------------+--------+ -Total line number = 5 -It costs 0.066s -``` - -### 3.6 Downsampled Alignment Query Across Multiple Devices - -#### 3.6.1 Same Sampling Frequency, Different Timestamps - -**table1: sampling interval: 1s** - -| Time | device_id | temperature | -| ------------ | --------- | ----------- | -| 00:00:00.001 | d1 | 90.0 | -| 00:00:01.002 | d1 | 85.0 | -| 00:00:02.101 | d1 | 85.0 | -| 00:00:03.201 | d1 | null | -| 00:00:04.105 | d1 | 90.0 | -| 00:00:05.023 | d1 | 85.0 | -| 00:00:06.129 | d1 | 90.0 | - -**table2: sampling interval: 1s** - -| Time | device_id | humidity | -| ------------ | --------- | -------- | -| 00:00:00.003 | d1 | 35.1 | -| 00:00:01.012 | d1 | 37.2 | -| 00:00:02.031 | d1 | null | -| 00:00:03.134 | d1 | 35.2 | -| 00:00:04.201 | d1 | 38.2 | -| 00:00:05.091 | d1 | 35.4 | -| 00:00:06.231 | d1 | 35.1 | - -**Example: Query the downsampled data of `table1`:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS a_time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND 
TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**Result:** - -```SQL -+-----------------------------+-------+ -| a_time|a_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| -|2025-05-13T00:00:01.000+08:00| 85.0| -|2025-05-13T00:00:02.000+08:00| 85.0| -|2025-05-13T00:00:03.000+08:00| 85.0| -|2025-05-13T00:00:04.000+08:00| 90.0| -|2025-05-13T00:00:05.000+08:00| 85.0| -|2025-05-13T00:00:06.000+08:00| 90.0| -+-----------------------------+-------+ -``` - -**Example: Query the downsampled data of `table2`:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS b_time, - first(humidity) AS b_value - FROM table2 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**Result:** - -```SQL -+-----------------------------+-------+ -| b_time|b_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 35.1| -|2025-05-13T00:00:01.000+08:00| 37.2| -|2025-05-13T00:00:02.000+08:00| 37.2| -|2025-05-13T00:00:03.000+08:00| 35.2| -|2025-05-13T00:00:04.000+08:00| 38.2| -|2025-05-13T00:00:05.000+08:00| 35.4| -|2025-05-13T00:00:06.000+08:00| 35.1| -+-----------------------------+-------+ -``` - -**Example: Align multiple series on whole-second boundaries:** - -```SQL -IoTDB> SELECT time, - a_value, - b_value - FROM - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) A - JOIN - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(humidity) AS b_value - FROM table2 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) B - USING (time) -``` - -**Result:** - -```SQL -+-----------------------------+-------+-------+ -| time|a_value|b_value| -+-----------------------------+-------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| 35.1| 
-|2025-05-13T00:00:01.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:02.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:03.000+08:00| 85.0| 35.2| -|2025-05-13T00:00:04.000+08:00| 90.0| 38.2| -|2025-05-13T00:00:05.000+08:00| 85.0| 35.4| -|2025-05-13T00:00:06.000+08:00| 90.0| 35.1| -+-----------------------------+-------+-------+ -``` - -- Retaining null values: when `NULL` itself carries meaning, or you want to keep the null values in the data, omit `FILL METHOD PREVIOUS` so that no filling is performed. -**Example:** - -```SQL -IoTDB> SELECT time, - a_value, - b_value - FROM - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1) A - JOIN - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(humidity) AS b_value - FROM table2 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1) B - USING (time) -``` - -**Result:** - -```SQL -+-----------------------------+-------+-------+ -| time|a_value|b_value| -+-----------------------------+-------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| 35.1| -|2025-05-13T00:00:01.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:02.000+08:00| 85.0| null| -|2025-05-13T00:00:03.000+08:00| null| 35.2| -|2025-05-13T00:00:04.000+08:00| 90.0| 38.2| -|2025-05-13T00:00:05.000+08:00| 85.0| 35.4| -|2025-05-13T00:00:06.000+08:00| 90.0| 35.1| -+-----------------------------+-------+-------+ -``` -#### 3.6.2 Different Sampling Frequencies, Different Timestamps - -**table1: sampling interval: 1s** - -| Time | device_id | temperature | -| ------------ | --------- | ----------- | -| 00:00:00.001 | d1 | 90.0 | -| 00:00:01.002 | d1 | 85.0 | -| 00:00:02.101 | d1 | 85.0 | -| 00:00:03.201 | d1 | null | -| 00:00:04.105 | d1 | 90.0 | -| 00:00:05.023 | d1 | 85.0 | -| 00:00:06.129 | d1 | 90.0 | - -**table3: sampling interval: 2s** - -| Time | device_id | humidity | -| ------------ | --------- | -------- | -| 00:00:00.005 | d1 | 35.1 | -| 00:00:02.106 | d1 | 37.2 | -| 00:00:04.187 | d1 | null | -| 00:00:06.156 | d1 | 35.1 | - 
-**Example: Query the downsampled data of `table1`:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS a_time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**Result:** - -```SQL -+-----------------------------+-------+ -| a_time|a_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| -|2025-05-13T00:00:01.000+08:00| 85.0| -|2025-05-13T00:00:02.000+08:00| 85.0| -|2025-05-13T00:00:03.000+08:00| 85.0| -|2025-05-13T00:00:04.000+08:00| 90.0| -|2025-05-13T00:00:05.000+08:00| 85.0| -|2025-05-13T00:00:06.000+08:00| 90.0| -+-----------------------------+-------+ -``` -**Example: Query the downsampled data of `table3`:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS c_time, - first(humidity) AS c_value - FROM table3 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**Result:** - -```SQL -+-----------------------------+-------+ -| c_time|c_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 35.1| -|2025-05-13T00:00:01.000+08:00| 35.1| -|2025-05-13T00:00:02.000+08:00| 37.2| -|2025-05-13T00:00:03.000+08:00| 37.2| -|2025-05-13T00:00:04.000+08:00| 37.2| -|2025-05-13T00:00:05.000+08:00| 37.2| -|2025-05-13T00:00:06.000+08:00| 35.1| -+-----------------------------+-------+ -``` - -**Example: Align on the higher sampling frequency:** - -```SQL -IoTDB> SELECT time, - a_value, - c_value - FROM - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) A - JOIN - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(humidity) AS c_value - FROM table3 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) C - USING (time) -``` - 
-**Result:** - -```SQL -+-----------------------------+-------+-------+ -| time|a_value|c_value| -+-----------------------------+-------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| 35.1| -|2025-05-13T00:00:01.000+08:00| 85.0| 35.1| -|2025-05-13T00:00:02.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:03.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:04.000+08:00| 90.0| 37.2| -|2025-05-13T00:00:05.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:06.000+08:00| 90.0| 35.1| -+-----------------------------+-------+-------+ -``` - -### 3.7 Data Filling - -**Example: Query the records within the specified time range for the device with `region` '北京', `plant_id` '1001', and `device_id` '101'; any missing data point is filled with the previous non-null value.** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE time >= 2024-11-26 00:00:00 and time <= 2024-11-30 11:00:00 - AND region='北京' AND plant_id='1001' AND device_id='101' - FILL METHOD PREVIOUS; -``` - -The result is as follows: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:40:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:41:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:42:00.000+08:00| 85.0| 35.2| -|2024-11-27T16:43:00.000+08:00| 85.0| 35.2| -|2024-11-27T16:44:00.000+08:00| 85.0| 35.2| -+-----------------------------+-----------+--------+ -Total line number = 7 -It costs 0.101s -``` - -### 3.8 Sorting & Pagination - -**Example: Query the records in the table sorted by humidity in descending order with null values (NULL) placed last, skip the first 2 records, and return only the next 10 records.** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - ORDER BY humidity desc NULLS LAST - OFFSET 2 - LIMIT 10; -``` - -The result is as follows: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-28T09:00:00.000+08:00| null| 40.9| -|2024-11-29T18:30:00.000+08:00| 90.0| 35.4| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| -|2024-11-28T10:00:00.000+08:00| 85.0| 35.2| -|2024-11-30T09:30:00.000+08:00| 90.0| 35.2| 
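-|2024-11-27T16:42:00.000+08:00| null| 35.2| -|2024-11-26T13:38:00.000+08:00| 90.0| 35.1| -|2024-11-26T13:37:00.000+08:00| 90.0| 35.1| -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-30T14:30:00.000+08:00| 90.0| 34.8| -+-----------------------------+-----------+--------+ -Total line number = 10 -It costs 0.093s -``` diff --git a/src/zh/UserGuide/Master/Table/Basic-Concept/TTL-Delete-Data_timecho.md b/src/zh/UserGuide/Master/Table/Basic-Concept/TTL-Delete-Data_timecho.md deleted file mode 100644 index 3e6cb9778..000000000 --- a/src/zh/UserGuide/Master/Table/Basic-Concept/TTL-Delete-Data_timecho.md +++ /dev/null @@ -1,142 +0,0 @@ - - -# Data Retention Time - -## 1. Overview - -IoTDB supports setting a data retention time (TTL) at the table level, allowing the system to periodically delete old data automatically, which keeps disk usage under control and helps maintain high query performance and low memory usage. TTL is expressed in milliseconds by default. Once data expires it can no longer be queried and writes to it are rejected, but physical deletion is deferred until compaction. Note that changing the TTL may briefly change which data is queryable. - -**Notes:** - -1. TTL is specified in milliseconds and is not affected by the time precision set in the configuration file. -2. Changing the TTL may affect whether data is queryable. -3. The system eventually removes expired data, but with some delay. -4. TTL expiration is determined by the timestamp of the data point, not by the time it was written. - -## 2. Setting TTL - -In the table model, TTL takes effect at table granularity. You can set the TTL directly on a table or at the database level. When a TTL is set at the database level, the system uses that value as the default for newly created tables, but each table can still set or override it independently. - -Note that modifying the database-level TTL does not directly affect the TTL of tables that already exist. - -### 2.1 Setting TTL for a Table - -If a TTL is set on the table through the SQL statement at creation time, the table's TTL takes precedence. For details on the CREATE TABLE statement, see [Table Management](../Basic-Concept/Table-Management_timecho.md) - -Example 1: Set the TTL when creating a table - -```SQL -CREATE TABLE test3 ("场站" string id, "温度" int32) with (TTL=3600); -``` - -Example 2: Set the TTL with an ALTER TABLE statement: - -```SQL -ALTER TABLE tableB SET PROPERTIES TTL=3600; -``` - -Example 3: If the TTL is not specified or is set to the default, it equals the database's TTL, which is 'INF' (infinite) by default: - -```SQL -CREATE TABLE test3 ("场站" string id, "温度" int32) with (TTL=DEFAULT); -CREATE TABLE test3 ("场站" string id, "温度" int32); -ALTER TABLE tableB set properties TTL=DEFAULT; -``` - -### 2.2 Setting TTL for a Database - -If no TTL is set on a table, it inherits the database's TTL. For details on the CREATE DATABASE statement, see [Database Management](../Basic-Concept/Database-Management_timecho.md) - -Example 4: With the database set to ttl=3600000, a table with ttl=3600000 is created: - -```SQL -CREATE DATABASE db WITH (ttl=3600000); -use db; -CREATE TABLE test3 ("场站" string id, "温度" int32); -``` - -Example 5: With no ttl set on the database, a table without a ttl is created: - -```SQL -CREATE DATABASE db; -use db; 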
-CREATE TABLE test3 ("场站" string id, "温度" int32); -``` - -Example 6: If the database has a ttl but you want a table explicitly without a TTL, set the table's TTL to 'INF': - -```SQL -CREATE DATABASE db WITH (ttl=3600000); -use db; -CREATE TABLE test3 ("场站" string id, "温度" int32) with (ttl='INF'); -``` - -## 3. Removing TTL - -To remove a TTL, set the table's TTL to 'INF'. Currently, IoTDB does not support modifying a database's TTL. - -```SQL -ALTER TABLE tableB set properties TTL='INF'; -``` - -## 4. Viewing TTL Information - -The "SHOW DATABASES" and "SHOW TABLES" commands display the TTL of databases and tables directly. For details on database and table management statements, see [Database Management](../Basic-Concept/Database-Management_timecho.md) and [Table Management](../Basic-Concept/Table-Management_timecho.md) - -> Note that the TTL of tree-model databases is displayed as well. - -```SQL -IoTDB> show databases; -+---------+-------+-----------------------+---------------------+---------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval| -+---------+-------+-----------------------+---------------------+---------------------+ -|test_prop| 300| 1| 3| 100000| -| test2| 300| 1| 1| 604800000| -+---------+-------+-----------------------+---------------------+---------------------+ - -IoTDB> show databases details; -+---------+-------+-----------------------+---------------------+---------------------+-----+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|Model| -+---------+-------+-----------------------+---------------------+---------------------+-----+ -|test_prop| 300| 1| 3| 100000|TABLE| -| test2| 300| 1| 1| 604800000| TREE| -+---------+-------+-----------------------+---------------------+---------------------+-----+ - -IoTDB> show tables; -+---------+-------+ -|TableName|TTL(ms)| -+---------+-------+ -| grass| 1000| -| bamboo| 300| -| flower| INF| -+---------+-------+ - -IoTDB> show tables details; -+---------+-------+----------+ -|TableName|TTL(ms)| Status| -+---------+-------+----------+ -| bean| 300|PRE_CREATE| -| grass| 1000| USING| -| bamboo| 300| USING| -| flower| INF| USING| -+---------+-------+----------+ -``` \ No newline at end of file diff --git 
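a/src/zh/UserGuide/Master/Table/Basic-Concept/Table-Management_timecho.md b/src/zh/UserGuide/Master/Table/Basic-Concept/Table-Management_timecho.md deleted file mode 100644 index c2dca4beb..000000000 --- a/src/zh/UserGuide/Master/Table/Basic-Concept/Table-Management_timecho.md +++ /dev/null @@ -1,339 +0,0 @@ - - -# Table Management - -Before using the table management features, we recommend reviewing the following background material so that you can better understand and apply them: -* [Time Series Data Model](../Background-knowledge/Navigating_Time_Series_Data_timecho.md): covers the basic concepts and characteristics of time series data, providing a foundation for modeling. -* [Modeling Scheme Design](../Background-knowledge/Data-Model-and-Terminology_timecho.md): covers the IoTDB time series models and their applicable scenarios, providing a design basis for table management. - -## 1. Table Management - -### 1.1 Creating a Table - -#### 1.1.1 Creating a Table Manually with the CREATE Statement - -Creates a table in the current database. A table can also be created in any specified database by using the form `database_name.table_name`. - -**Syntax:** - -```SQL -createTableStatement - : CREATE TABLE (IF NOT EXISTS)? qualifiedName - '(' (columnDefinition (',' columnDefinition)*)? ')' - charsetDesc? - comment? - (WITH properties)? - ; - -charsetDesc - : DEFAULT? (CHAR SET | CHARSET | CHARACTER SET) EQ? identifierOrString - ; - -columnDefinition - : identifier columnCategory=(TAG | ATTRIBUTE | TIME) charsetName? comment? - | identifier type (columnCategory=(TAG | ATTRIBUTE | TIME | FIELD))? charsetName? comment? - ; - -charsetName - : CHAR SET identifier - | CHARSET identifier - | CHARACTER SET identifier - ; - -comment - : COMMENT string - ; -``` - -**Notes:** - -1. The time column (TIME) may be omitted when creating a table; IoTDB then adds it automatically, names it "time", and places it first. All other columns can be added by enabling the `enable_auto_create_schema` option in the database configuration, or through the session interface, which creates or alters the table automatically. -2. Since V2.0.8.2, the time column can be given a custom name when creating a table; its position in the table is determined by its position in the CREATE statement. The constraints are as follows: -- When the column category (columnCategory) is TIME, the data type (dataType) must be TIMESTAMP. -- Each table may define at most 1 time column (columnCategory = TIME). -- When no time column is explicitly defined, no other column may be named time; otherwise it conflicts with the name of the system default time column. -3. The column category may be omitted and defaults to `FIELD`. When the column category is `TAG` or `ATTRIBUTE`, the data type must be `STRING` (and may be omitted). -4. A table's TTL defaults to the TTL of its database. To use the default, omit this property or set it to `default`. -5. 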
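Table names have the following characteristics: - - Case-insensitive; after creation they are displayed uniformly in lowercase - - Names may contain special characters, such as `~!`"%` - - Table names containing special characters or Chinese characters must be enclosed in double quotes "" when created. - - Note: in SQL, table names containing special or Chinese characters require double quotes. The native API requires no extra quotes; if they are added, the quote characters become part of the table name. - - When naming a table, the outermost double quotes (`""`) do not appear in the table name that is actually created. - - - ```shell - -- In SQL - "a""b" --> a"b - """""" --> "" - -- In the API - "a""b" --> "a""b" - ``` -6. Column names in columnDefinition have the same characteristics as table names and may additionally contain the special character `.`. -7. COMMENT adds a comment to the table. - -**Example:** - -```SQL -CREATE TABLE table1 ( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE COMMENT 'maintenance', - temperature FLOAT FIELD COMMENT 'temperature', - humidity FLOAT FIELD COMMENT 'humidity', - status Boolean FIELD COMMENT 'status', - arrival_time TIMESTAMP FIELD COMMENT 'arrival_time' -) COMMENT 'table1' WITH (TTL=31536000000); -``` - -Note: if your terminal does not support multi-line paste (for example, Windows CMD), reformat the SQL statement onto a single line before executing it. - -### 1.2 Viewing Tables - -Shows all tables in the current database or in a specified database, along with their properties. - -**Syntax:** - -```SQL -SHOW TABLES (DETAILS)? ((FROM | IN) database_name)? -``` - -**Notes:** - -1. When a `FROM` or `IN` clause is specified, the system shows all tables in the specified database. -2. When no `FROM` or `IN` clause is specified, the system shows all tables in the currently selected database. If no database has been selected with `use`, the system reports an error. -3. When detailed information is requested (`DETAILS` is specified), the system also shows the current status of each table: - - `USING`: the table is in a normal, usable state. - - `PRE_CREATE`: the table is being created or its creation failed; the table is unavailable. - - `PRE_DELETE`: the table is being deleted or its deletion failed; such a table is permanently unavailable. - -**Example:** - -```sql -show tables details from database1; -``` -```shell -+---------------+-----------+------+-------+ -| TableName| TTL(ms)|Status|Comment| -+---------------+-----------+------+-------+ -| table1|31536000000| USING| table1| -+---------------+-----------+------+-------+ -``` - -### 1.3 Viewing the Columns of a Table - -Shows the column names, data types, categories, and statuses of a table. - -**Syntax:** - -```SQL -(DESC | DESCRIBE) <TABLE_NAME> (DETAILS)? 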
-``` - -**Notes:** - -- If the `DETAILS` option is set, the system shows the detailed status of each column: - - `USING`: the column is in normal use. - - `PRE_DELETE`: the column is being deleted or its deletion failed; the column is permanently unusable. - -**Example:** - -```sql -desc table1 details; -``` -```shell -+------------+---------+---------+------+------------+ -| ColumnName| DataType| Category|Status| Comment| -+------------+---------+---------+------+------------+ -| time|TIMESTAMP| TIME| USING| null| -| region| STRING| TAG| USING| null| -| plant_id| STRING| TAG| USING| null| -| device_id| STRING| TAG| USING| null| -| model_id| STRING|ATTRIBUTE| USING| null| -| maintenance| STRING|ATTRIBUTE| USING| maintenance| -| temperature| FLOAT| FIELD| USING| temperature| -| humidity| FLOAT| FIELD| USING| humidity| -| status| BOOLEAN| FIELD| USING| status| -|arrival_time|TIMESTAMP| FIELD| USING|arrival_time| -+------------+---------+---------+------+------------+ -``` - - -### 1.4 Viewing a Table's Creation Statement - -Returns the complete definition statement of a table or view in the table model. All default values omitted at creation time are filled in automatically, so the statement shown in the result set may differ from the original CREATE statement. - -> Supported since V2.0.5 - -**Syntax:** - -```SQL -SHOW CREATE TABLE <TABLE_NAME> -``` - -**Notes:** - -1. 
This statement does not support querying system tables - -**Example:** - -```SQL -show create table table1; -``` -```shell -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| Table| Create Table| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -|table1|CREATE TABLE "table1" ("region" STRING TAG,"plant_id" STRING TAG,"device_id" STRING TAG,"model_id" STRING ATTRIBUTE,"maintenance" STRING ATTRIBUTE,"temperature" FLOAT FIELD,"humidity" FLOAT FIELD,"status" BOOLEAN FIELD,"arrival_time" TIMESTAMP FIELD) WITH (ttl=31536000000)| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -``` - - -### 1.5 Altering a Table - -Alters a table: adding columns, dropping columns, changing a column's data type (V2.0.8.2), and setting table properties. - -**Syntax:** - -```SQL -#addColumn; -ALTER TABLE (IF EXISTS)? tableName=qualifiedName ADD COLUMN (IF NOT EXISTS)? column=columnDefinition COMMENT 'column_comment'; -#dropColumn; -ALTER TABLE (IF EXISTS)? tableName=qualifiedName DROP COLUMN (IF EXISTS)? column=identifier; -#setTableProperties; -// set TTL can use this; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName SET PROPERTIES propertyAssignments; -| COMMENT ON TABLE tableName=qualifiedName IS 'table_comment'; -| COMMENT ON COLUMN tableName.column IS 'column_comment'; -#changeColumndatatype; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName ALTER COLUMN (IF EXISTS)? 
column=identifier SET DATA TYPE new_type=type; -``` - -**Notes:** - -1. The `SET PROPERTIES` operation currently supports configuring only the TTL property of a table. -2. Dropping columns is supported only for attribute columns (ATTRIBUTE) and field columns (FIELD); tag columns (TAG) cannot be dropped. -3. A modified comment overwrites the existing one; specifying null erases the previous comment. -4. Since V2.0.8.2, the data type of a column can be changed; currently only columns whose category is FIELD are supported. - - If the time series is deleted concurrently during the change, an error is reported. - - The new type must be compatible with the original type. The compatibility rules are listed in the table below: - -| Original Type | Can Be Changed To | -| ----------- | ----------------------------------------------- | -| INT32 | INT64, FLOAT, DOUBLE, TIMESTAMP, STRING, TEXT | -| INT64 | TIMESTAMP, DOUBLE, STRING, TEXT | -| FLOAT | DOUBLE, STRING, TEXT | -| DOUBLE | STRING, TEXT | -| BOOLEAN | STRING, TEXT | -| TEXT | BLOB, STRING | -| STRING | TEXT, BLOB | -| BLOB | STRING, TEXT | -| DATE | STRING, TEXT | -| TIMESTAMP | INT64, DOUBLE, STRING, TEXT | - - -**Examples:** - -Add TAG column a to table table1 -```SQL -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS a TAG COMMENT 'a'; -``` -Add FIELD column b to table table1 -```SQL -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS b FLOAT FIELD COMMENT 'b'; -``` -Change the TTL of table table1 -```SQL -ALTER TABLE table1 set properties TTL=3600; -``` -Add a comment to table table1 -```SQL -COMMENT ON TABLE table1 IS 'table1'; -``` -Remove the comment from column a of table table1 -```SQL -COMMENT ON COLUMN table1.a IS null; -``` -Change the data type of column b of table table1 -```SQL -ALTER TABLE table1 ALTER COLUMN IF EXISTS b SET DATA TYPE DOUBLE; -``` - -### 1.6 Dropping a Table - -Drops a table. - -**Syntax:** - -```SQL -DROP TABLE (IF EXISTS)? <TABLE_NAME> 
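-``` - -**Example:** - -```SQL -DROP TABLE table1; -``` - - -### 1.7 Metadata Query - -In the table model, the **number of measurement points** equals the sum of the measurement points of all tables. Currently, the count for a single table is given by the formula **measurement points per table = number of devices × number of FIELD columns**. Querying the measurement point count of the table model directly through SQL will be supported in a later release. - -Take table table1 from the [sample data](../Reference/Sample-Data.md) as an example. - -The sample schema contains three TAG columns (region is the region, plant_id is the plant, device_id is the machine) and four FIELD columns (temperature, humidity, status, and arrival_time). - -A device is uniquely identified by the combination of all TAG columns: every distinct combination of region + plant_id + device_id represents one independent device. - -The sample data defines 2 regions, 北京 and 上海: - -* Region 北京 contains 1 plant, numbered 1001; - * This plant has 2 devices, numbered 100 and 101; -* Region 上海 contains 2 plants, numbered 3001 and 3002; - * Plant 3001 has 2 devices: 100 and 101; - * Plant 3002 has 2 devices: 100 and 101. - -In total, the table contains 6 unique TAG combinations, corresponding to 6 independent devices. - -**Complete example of computing the measurement points of a single table:** - -1. Query the number of devices - -```sql -count devices from table1; -``` -```shell -+--------------+ -|count(devices)| -+--------------+ -| 6| -+--------------+ -``` - -2. Compute the number of measurement points of the table -- Number of devices: 6 -- Number of FIELD columns: 4 -- Total measurement points of the table: 6 × 4 = 24 - diff --git a/src/zh/UserGuide/Master/Table/Basic-Concept/Write-Updata-Data_timecho.md b/src/zh/UserGuide/Master/Table/Basic-Concept/Write-Updata-Data_timecho.md deleted file mode 100644 index 58ea9299d..000000000 --- a/src/zh/UserGuide/Master/Table/Basic-Concept/Write-Updata-Data_timecho.md +++ /dev/null @@ -1,383 +0,0 @@ - - -# Write & Update - -## 1. SQL Write - -### 1.1 Syntax - -In IoTDB, data writes follow this general syntax: - -```SQL -INSERT INTO <TABLE_NAME> [(COLUMN_NAME[, COLUMN_NAME]*)]? VALUES (COLUMN_VALUE[, COLUMN_VALUE]*) -``` - -The basic constraints are: - -1. Writing through an insert statement cannot create a table automatically. -2. Tag columns that are not specified are automatically filled with `null`. -3. If no timestamp is included, the system fills it with the current time, `now()`. -4. If the current device (located by its tag values) has no value for an attribute column, the write replaces the existing null value (NULL) with the written data. -5. If the current device (located by its tag values) already has a value for an attribute column, writing the same attribute column again updates it to the new value. -6. When a duplicate timestamp is written, the values of the columns at the original timestamp are updated. -7. If the INSERT statement does not specify column names (for example, INSERT INTO table VALUES (...)), the values in VALUES must strictly follow the physical order of the table's columns (the order can be checked with the DESC table command). - -Because attributes generally do not change over time, updating attribute values separately with update is recommended; see [Data Update](#_3-数据更新) below. - -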
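The constraints above can be sketched with a few statements against the sample table `table1`. This is a minimal illustration, not verified output: it assumes the sample schema from this guide is already loaded in the current database, and the literal values (such as 91.5) are made up for the example.

```SQL
-- Assumption: table1 from the sample data exists in the current database.
-- First write for the device (北京, 1001, 100) at a given timestamp.
INSERT INTO table1(region, plant_id, device_id, time, temperature) VALUES ('北京', '1001', '100', '2025-11-26 13:37:00', 90.0);
-- Same tags and same timestamp again: per constraint 6, the stored temperature
-- for that timestamp is expected to be updated to 91.5 rather than duplicated.
INSERT INTO table1(region, plant_id, device_id, time, temperature) VALUES ('北京', '1001', '100', '2025-11-26 13:37:00', 91.5);
-- No time column listed: per constraint 3, the row is stamped with now().
INSERT INTO table1(region, plant_id, device_id, humidity) VALUES ('北京', '1001', '100', 35.1);
```

Listing the column names explicitly, as done here, also sidesteps constraint 7, since the values no longer have to follow the table's physical column order.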
- -
 - - -### 1.2 Writing Specified Columns - -A write may specify only some of the columns; nothing is written to the unspecified columns (that is, they are set to `null`). - -**Example:** - -```SQL -INSERT INTO table1(region, plant_id, device_id, time, temperature, humidity) VALUES ('北京', '1001', '100', '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, time, temperature) VALUES ('北京', '1001', '100', '2025-11-26 13:38:00', 91.0); -``` - -### 1.3 Writing Null Values - -Tag columns, attribute columns, and field columns may be given a null value (`null`), which means nothing is written. - -**Example (equivalent to the example above):** - -```SQL -# The partial-column writes above are equivalent to the following writes with explicit nulls; -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('北京', '1001', '100', null, null, '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('北京', '1001', '100', null, null, '2025-11-26 13:38:00', 91.0, null); -``` - -When data is written to a table that has no tag columns, the system creates by default a device whose tag column values are all null. - -> Note that this not only fills the table's existing tag columns with null automatically; tag columns added in the future are filled with null as well. - -### 1.4 Multi-Row Write - -Multiple rows can be written in a single statement, which improves write efficiency. - -**Example:** - -```SQL -INSERT INTO table1 -VALUES -('2025-11-26 13:37:00', '北京', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('2025-11-26 13:38:00', '北京', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:38:25'); - -INSERT INTO table1 -(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) -VALUES -('北京', '1001', '100', 'A', '180', '2025-11-26 13:37:00', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('北京', '1001', '100', 'A', '180', '2025-11-26 13:38:00', 90.0, 35.1, true, '2025-11-26 13:38:25'); -``` - -#### Notes - -- If the SQL statement references a column that does not exist in the table, IoTDB returns the error code `COLUMN_NOT_EXIST(616)`. -- If the data type of a written value does not match the data type of the table column, the error `DATA_TYPE_MISMATCH(507)` is reported. - -### 1.5 Query Write-Back - -The IoTDB table model supports appending query results to a table through the `INSERT INTO QUERY` statement, which writes the result of a query into a table that **already exists**. - -> Note: this feature is available since V 2.0.6. - -#### 1.5.1 Syntax Definition - -```SQL -INSERT INTO table_name [ ( column [, ... 
] ) ] query -``` - -Here, **query** supports three forms, which are illustrated by the examples below. - -Using the [sample data](../Reference/Sample-Data.md) as the source data, first create the target table - -```SQL -IoTDB:database1> CREATE TABLE target_table ( time TIMESTAMP TIME, region STRING TAG, device_id STRING TAG, temperature FLOAT FIELD ); -Msg: The statement is executed successfully. -``` - -1. Write-back through a standard query statement - -That is, query is a query executed directly as `select ... from ...`. - -For example: use a standard query statement to write the time, region, device\_id, and temperature data of region 北京 in table1 back into target\_table - -```SQL -insert into target_table select time,region,device_id,temperature from table1 where region = '北京'; -Msg: The statement is executed successfully. -``` -```sql -select * from target_table where region='北京'; -``` -```shell -+-----------------------------+------+---------+-----------+ -| time|region|device_id|temperature| -+-----------------------------+------+---------+-----------+ -|2024-11-26T13:37:00.000+08:00| 北京| 100| 90.0| -|2024-11-26T13:38:00.000+08:00| 北京| 100| 90.0| -|2024-11-27T16:38:00.000+08:00| 北京| 101| null| -|2024-11-27T16:39:00.000+08:00| 北京| 101| 85.0| -|2024-11-27T16:40:00.000+08:00| 北京| 101| 85.0| -|2024-11-27T16:41:00.000+08:00| 北京| 101| 85.0| -|2024-11-27T16:42:00.000+08:00| 北京| 101| null| -|2024-11-27T16:43:00.000+08:00| 北京| 101| null| -|2024-11-27T16:44:00.000+08:00| 北京| 101| null| -+-----------------------------+------+---------+-----------+ -Total line number = 9 -It costs 0.029s -``` - -2. Write-back through a table reference - -That is, query takes the table reference form `table source_table`. - -For example: use a table reference to write the data in table3 back into target\_table - -```SQL -insert into target_table(time,device_id,temperature) table table3; -Msg: The statement is executed successfully. 
-``` -```sql -select * from target_table where region is null; -``` -```shell -+-----------------------------+------+---------+-----------+ -| time|region|device_id|temperature| -+-----------------------------+------+---------+-----------+ -|2025-05-13T00:00:00.001+08:00| null| d1| 90.0| -|2025-05-13T00:00:01.002+08:00| null| d1| 85.0| -|2025-05-13T00:00:02.101+08:00| null| d1| 85.0| -|2025-05-13T00:00:03.201+08:00| null| d1| null| -|2025-05-13T00:00:04.105+08:00| null| d1| 90.0| -|2025-05-13T00:00:05.023+08:00| null| d1| 85.0| -|2025-05-13T00:00:06.129+08:00| null| d1| 90.0| -+-----------------------------+------+---------+-----------+ -Total line number = 7 -It costs 0.015s -``` - -3. Write-back through a subquery - -That is, query is a parenthesized subquery. - -For example: use a subquery to write back into target\_table the time, region, device\_id, and temperature of the rows in table1 whose time matches a record of region 上海 in table2 - -```SQL -insert into target_table (select t1.time, t1.region as region, t1.device_id as device_id, t1.temperature as temperature from table1 t1 where t1.time in (select t2.time from table2 t2 where t2.region = '上海')); -Msg: The statement is executed successfully. 
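-``` -```sql -select * from target_table where region = '上海'; -``` -```shell -+-----------------------------+------+---------+-----------+ -| time|region|device_id|temperature| -+-----------------------------+------+---------+-----------+ -|2024-11-28T08:00:00.000+08:00| 上海| 100| 85.0| -|2024-11-29T11:00:00.000+08:00| 上海| 100| null| -+-----------------------------+------+---------+-----------+ -Total line number = 2 -It costs 0.014s -``` - -#### 1.5.2 Notes - -* The source table in query and the target table table\_name may be the same table, for example: `INSERT INTO testtb SELECT * FROM testtb`. -* The target table must already exist; otherwise the error `550: Table 'xxx.xxx' does not exist` is reported. -* The columns returned by the query must match the target table columns exactly in number and type. The Object type is currently not supported, and compatible types are not converted; if the types do not match, the error `701: Insert query has mismatched column types` is reported. -* Only some of the target table's columns may be specified. When specifying target column names, the following rules apply: - * The timestamp column must be included; otherwise the error `701: time column can not be null` is reported - * At least one FIELD column must be included; otherwise the error `701: No Field column present` is reported - * TAG columns may be omitted - * Fewer columns than the target table has may be specified; the missing columns are filled with NULL automatically -* Java supports executing `INSERT INTO QUERY` through the [executeNonQueryStatement](../API/Programming-Java-Native-API_timecho.md#_3-1-itablesession接口) method. -* REST supports executing `INSERT INTO QUERY` through the [/rest/table/v1/nonQuery](../API/RestServiceV1.md#_3-3-非查询接口) API. -* `INSERT INTO QUERY` does not support Explain or Explain Analyze. -* To execute a query write-back statement, the user must have the following privileges: - * The `SELECT` privilege on the source tables in the query. - * The `WRITE` privilege on the target table. - * For more on user privileges, see [Authority Management](../User-Manual/Authority-Management_timecho.md). - - -### 1.6 Writing the Object Type - -To keep a single oversized Object from producing an oversized write request, a value of the Object type can be split and written in ordered segments. In SQL, use the `to_object(isEOF, offset, content)` function to supply the value. - -> Supported since V2.0.8 - -**Syntax:** - -```SQL -insert into tableName(time, columnName) values(timeValue, to_object(isEOF, offset, content)); -``` - -**Parameters:** - -| Name | Data Type | Description | -| --------- | ------------------- | ---------------------------------------- | -| isEOF | boolean | Whether this write is the last part of the Object | -| offset | int64 | Starting offset of this write's content within the Object | -| content | hexadecimal (hex) format | The Object content of this write | - -**Example:** - -Add an object-type column s1 to table table1 - -```SQL -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS s1 OBJECT FIELD COMMENT 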
'object类型'; -``` - -1. Write without segmentation - -```SQL -insert into table1(time, device_id, s1) values(now(), 'tag1', to_object(true, 0, X'696F746462')); -``` - -2. Write in segments - -```SQL ---Write object data in segments; ---First write: to_object(false, 0, X'696F'); -insert into table1(time, device_id, s1) values(1, 'tag1', to_object(false, 0, X'696F')); ---Second write: to_object(false, 2, X'7464'); -insert into table1(time, device_id, s1) values(1, 'tag1', to_object(false, 2, X'7464')); ---Third write: to_object(true, 4, X'62'); -insert into table1(time, device_id, s1) values(1, 'tag1', to_object(true, 4, X'62')); -``` - -**Notes:** - -1. If only some segments of an Object value have been written, querying that value returns null; the data becomes queryable only after the value has been written completely -2. During a segmented write, if the offset of a write does not equal the size of the Object written so far, that write fails with an error -3. If part of the data has already been written and a write arrives with offset 0, that write clears the previously written part and starts writing the new data from scratch - - -## 2. Schemaless Write - -When writing data through a Session, IoTDB supports schemaless writes: the table does not have to be created manually in advance. The system builds the table schema automatically from the information in the write request and then performs the write. - -**Example:** - -```Java -try (ITableSession session = - new TableSessionBuilder() - .nodeUrls(Collections.singletonList("127.0.0.1:6667")) - .username("root") - .password("root") - .build()) { - - session.executeNonQueryStatement("CREATE DATABASE db1"); - session.executeNonQueryStatement("use db1"); - - // Write data directly without creating the table first - List<String> columnNameList = - Arrays.asList("region_id", "plant_id", "device_id", "model", "temperature", "humidity"); - List<TSDataType> dataTypeList = - Arrays.asList( - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.FLOAT, - TSDataType.DOUBLE); - List<ColumnCategory> columnTypeList = - new ArrayList<>( - Arrays.asList( - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.ATTRIBUTE, - ColumnCategory.FIELD, - ColumnCategory.FIELD)); - Tablet tablet = new Tablet("table1", columnNameList, dataTypeList, columnTypeList, 100); - for (long timestamp = 0; timestamp < 100; timestamp++) { - int rowIndex = tablet.getRowSize(); - tablet.addTimestamp(rowIndex, timestamp); - tablet.addValue("region_id", rowIndex, "1"); - tablet.addValue("plant_id", rowIndex, "5"); - 
tablet.addValue("device_id", rowIndex, "3"); - tablet.addValue("model", rowIndex, "A"); - tablet.addValue("temperature", rowIndex, 37.6F); - tablet.addValue("humidity", rowIndex, 111.1); - if (tablet.getRowSize() == tablet.getMaxRowNumber()) { - session.insert(tablet); - tablet.reset(); - } - } - if (tablet.getRowSize() != 0) { - session.insert(tablet); - tablet.reset(); - } -} -``` - -在代码执行完成后,可以通过下述语句确认表已成功创建,其中包含了时间列、标签列、属性列以及测点列等各类信息。 - -```SQL -desc table1; -``` -```shell -+-----------+---------+-----------+ -| ColumnName| DataType| Category| -+-----------+---------+-----------+ -| time|TIMESTAMP| TIME| -| region_id| STRING| TAG| -| plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model| STRING| ATTRIBUTE| -|temperature| FLOAT| FIELD| -| humidity| DOUBLE| FIELD| -+-----------+---------+-----------+ -``` - - -## 3. 数据更新 - -### 3.1 语法 - -```SQL -UPDATE SET updateAssignment (',' updateAssignment)* (WHERE where=booleanExpression)? - -updateAssignment - : identifier EQ expression - ; -``` - -1. `update`语句仅允许修改属性(ATTRIBUTE)列的值。 -2. `WHERE` 的规则: - - 范围仅限于标签列(TAG)和属性列(ATTRIBUTE),不允许涉及测点列(FIELD)和时间列(TIME)。 - - 不允许使用聚合函数 -3. 执行 SET 操作后,赋值表达式的结果应当是字符串类型,且其使用的限制应与 WHERE 子句中的表达式相同。 -4. 属性(ATTRIBUTE)列以及测点(FIELD)列的值也可通过`insert`语句来实现指定行的更新。 - -**示例:** - -```SQL -update table1 set b = a where substring(a, 1, 1) like '%'; -``` diff --git a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md b/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md deleted file mode 100644 index 7b865ecc3..000000000 --- a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md +++ /dev/null @@ -1,294 +0,0 @@ - -# AINode 部署 - -## 1. 
AINode 介绍 - -### 1.1 能力介绍 - -AINode 是 TimechoDB 在 ConfigNode、DataNode 后提供的第三种内生节点,该节点通过与 TimechoDB 集群的 DataNode、ConfigNode 交互,扩展了对时间序列进行机器学习分析的能力。AINode 将模型的管理、训练及推理融合在数据库引擎中,支持使用注册的模型在指定时序数据上通过简单 SQL 语句完成时序分析任务,还支持注册并使用自定义机器学习模型。AINode 目前已集成常见时序分析场景(例如预测)的机器学习算法和自研模型。 - -### 1.2 部署模式 - -AINode 是 TimechoDB 集群外的额外套件,采用独立安装包部署。 - -
- - -
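-
-## 2. Installation Preparation
-
-### 2.1 Obtaining the Installation Package
-
-After the AINode installation package (`timechodb--ainode-bin.zip`) is extracted, the key directories are:
-
-| **Directory**    | **Type**         | **Description**                                          |
-| ---------------- | ---------------- | -------------------------------------------------------- |
-| lib              | Folder           | AINode executables and dependencies                      |
-| sbin             | Folder           | AINode run scripts, used to start or stop AINode         |
-| conf             | Folder           | AINode configuration files and version declaration file  |
-
-### 2.2 Pre-installation Check
-
-To make sure the AINode installation package you obtained is complete and correct, we recommend verifying its SHA512 checksum before installing.
-
-**Preparation:**
-
-* Obtain the officially published SHA512 checksum: contact Timecho staff
-
-**Verification steps (Linux as an example):**
-
-1. Open a terminal and change to the directory that holds the package (for example `/data/ainode`):
-   ```Bash
-   cd /data/ainode
-   ```
-2. Run the following command to compute the hash:
-   ```Bash
-   sha512sum timechodb-{version}-ainode-bin.zip
-   ```
-3. The terminal prints the result (SHA512 checksum on the left, file name on the right):
-
-```Bash
-(base) root@hadoop@1:/data/ainode (0.664s)
-sha512sum timechodb-2.0.6.1-ainode-bin.zip
-4d5a6a64935b4f0459bc9ed214c4563aa7a6a5941024336e9416212424707f27bdfdfc70f4c528b51b812687d660014adc1b8add699498ea67ff17c7e619a6f0  timechodb-2.0.6.1-ainode-bin.zip
-```
-
-4. Compare the output with the official SHA512 checksum. Once they match, proceed with the AINode installation and deployment steps below.
-
-**Notes:**
-
-* If the checksums do not match, contact Timecho staff to obtain the package again
-* If a "file does not exist" message appears during verification, check that the file path is correct and that the package was downloaded completely
-
-### 2.3 Environment Requirements
-
-* Recommended operating systems: Linux, macOS;
-* TimechoDB version: >= V 2.0.8;
-
-#### 2.3.1 Recommended Resource Configuration
-
-> Note: the recommendations in this section apply only to **model inference tasks**. Recommendations for model training tasks will be added in a later release.
-
-The following is a baseline resource configuration for running model inference on a single NVIDIA 4090 (24 GB of VRAM). Inference in AINode can raise overall throughput by scaling out the number of GPUs; machines are usually provisioned in one of four sizes: 1, 2, 4, or 8 GPUs.
-
-The inference tasks used in the benchmark are specified as follows:
-
-* **Univariate inference**: history length 2880, prediction length 720;
-* **Covariate inference**: history length 2880, prediction length 720, with 20 known covariates.
-
-
-| GPUs (NVIDIA 4090, 24 GB VRAM) | Recommended CPU cores | Recommended memory (GB) | Supported univariate inference throughput (QPS) | Supported covariate inference throughput (QPS) |
-| ------------------------------ | --------------------- | ----------------------- | ----------------------------------------------- | ---------------------------------------------- |
-| 1 GPU                          | 16 cores              | 24 GB                   | 100                                             | 10                                             |
-| 2 GPUs                         | 32 cores              | 48 GB                   | 200                                             | 20                                             |
-| 4 GPUs                         | 64 cores              | 96 GB                   | 400                                             | 40                                             |
-| 8 GPUs                         | 128 cores             | 192 GB                  | 800                                             | 80                                             |
-
-**Notes**:
-
-* The CPU and memory figures in the table follow one general rule: 16 CPU cores per GPU, and memory capacity equal to VRAM capacity (1:1)
-* The throughput figures are benchmark reference values; actual performance may vary with model type, data complexity, and deployment environment
-* Throughput for univariate and covariate inference tasks should be evaluated separately as needed; the two figures cannot simply be added together
-
-
-## 3.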
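Installation, Deployment, and Usage
-
-### 3.1 Install AINode
-
-Download the AINode package into a dedicated folder, change to that folder, and extract the package:
-
-```Shell
-unzip timechodb--ainode-bin.zip
-```
-
-### 3.2 Modifying Configuration Items
-
-AINode allows some essential parameters to be changed. The following parameters can be found and persistently modified in the `/TIMECHO_AINODE_HOME/conf/iotdb-ainode.properties` file:
-
-| **Name**                          | **Description**                                                                                                             | **Type** | **Default**        |
-|-----------------------------------|-----------------------------------------------------------------------------------------------------------------------------| -------- | ------------------ |
-| cluster\_name                     | Identifier of the cluster that AINode joins                                                                                 | String   | defaultCluster     |
-| ain\_seed\_config\_node           | Address of the ConfigNode that AINode registers with at startup                                                             | String   | 127.0.0.1:10710    |
-| ain\_cluster\_ingress\_address    | RPC address of the DataNode from which AINode pulls data                                                                    | String   | 127.0.0.1          |
-| ain\_cluster\_ingress\_port       | RPC port of the DataNode from which AINode pulls data                                                                       | Integer  | 6667               |
-| ain\_cluster\_ingress\_username   | Client username for the DataNode from which AINode pulls data                                                               | String   | root               |
-| ain\_cluster\_ingress\_password   | Client password for the DataNode from which AINode pulls data                                                               | String   | root               |
-| ain\_rpc\_address                 | Address on which AINode provides service and communication; internal service communication interface                        | String   | 127.0.0.1          |
-| ain\_rpc\_port                    | Port on which AINode provides service and communication                                                                     | String   | 10810              |
-| ain\_system\_dir                  | AINode metadata storage path; the base directory of a relative path depends on the operating system, so an absolute path is recommended | String   | data/AINode/system |
-| ain\_models\_dir                  | Path where AINode stores model files; the base directory of a relative path depends on the operating system, so an absolute path is recommended | String   | data/AINode/models |
-| ain\_thrift\_compression\_enabled | Whether AINode enables thrift compression: 0 = disabled, 1 = enabled                                                        | Boolean  | 0                  |
-
-### 3.3 Importing Built-in Weight Files
-
-If the deployment environment has network access and can reach HuggingFace, the system pulls the built-in model weight files automatically and this step can be skipped.
-
-In an offline environment, contact Timecho staff to obtain the model weight folder and place it under the `/TIMECHO_AINODE_HOME/data/ainode/models/builtin` directory.
-
-**NOTE:** Mind the directory hierarchy: the parent directory of all built-in model weights must end up being `builtin`.
-
-### 3.4 Start AINode
-
-After the ConfigNode has been deployed, an AINode can be added to support time series model management and inference. Once the TimechoDB cluster information is specified in the configuration items, run the corresponding command to start AINode and join the TimechoDB cluster.
-
-```Shell
-# Start commands
-  # Linux and macOS
-  bash sbin/start-ainode.sh
-
-  # Windows
-  sbin\start-ainode.bat
-
-  # Start in the background (recommended for long-running use)
-  # Linux and macOS
-  bash sbin/start-ainode.sh -d
-
-  # Windows
-  sbin\start-ainode.bat -d
-```
-
-### 3.5 Activate AINode
-
-1.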
参考 TimechoDB 激活:[激活方式](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md#_2-6-激活数据库) - -2. 可通过如下方式验证 AINode 激活,当看到状态显示为 ACTIVATED 表示激活成功。 - -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -Total line number = 3 -It costs 0.002s -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2025-07-16T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| AiNodeLimit| 1| 1| -| CpuLimit| 11| Unlimited| -| DeviceLimit| 0| Unlimited| -|TimeSeriesLimit| 0| 9,999| -+---------------+---------+-----------------------------+ -Total line number = 7 -It costs 0.013s -``` - - -### 3.6 检测 AINode 节点状态 - -AINode 启动过程中会自动将新的 AINode 加入 TimechoDB 集群。启动 AINode 后可以在命令行中输入 SQL 来查询,集群中看到 AINode 节点,其运行状态为 Running(如下展示)表示加入成功。 - -```Shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| 
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -除此之外,还可以通过 show models 命令来查看模型状态。如果模型状态不对,请检查权重文件路径是否正确。 - -```Bash -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### 3.7 停止 AINode - -如果需要停止正在运行的 AINode 节点,则执行相应的关停脚本,且支持通过参数 -p 指定端口,该端口为配置项中的 `ain_rpc_port`。 - -```Shell -# Linux / MacOS - bash sbin/stop-ainode.sh - bash sbin/stop-ainode.sh -p # 指定端口 - - #Windows - sbin\stop-ainode.bat - sbin\stop-ainode.bat -p # 指定端口 -``` - -停止 AINode 后,还可以在集群中看到 AINode 节点,其运行状态为 UNKNOWN(如下展示),此时无法使用 AINode 功能。 - -```Shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|UNKNOWN| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -如果需要重新启动该节点,需重新执行启动脚本。 - -### 3.8 升级 AINode - -如果需要对当前 AINode 进行版本升级,可参考如下步骤: - -1. 
停止当前 AINode 服务 - - * 执行停止命令,确保服务完全退出后再进行后续操作 - - ```Shell - # Linux / MacOS - bash sbin/stop-ainode.sh - bash sbin/stop-ainode.sh -p # 指定端口 - - #Windows - sbin\stop-ainode.bat - sbin\stop-ainode.bat -p # 指定端口 - ``` -2. 替换核心文件 - - * 删除当前版本的`lib` 和 `sbin`目录,并将新版本的 `lib` 和 `sbin` 复制到对应位置 - * 备份 conf 目录下已修改的配置文件,然后替换 conf 文件夹,并将修改的配置同步到对应位置 -3. 更新内置模型权重(可选) - - * 若新版本涉及内置模型更新,相关信息将在[发布历史](../IoTDB-Introduction/Release-history\_timecho.md)中同步。可联系天谋工作人员获取最新权重包,并将权重包替换至 `data/ainode/models/builtin` 目录 -4. 升级完毕后,可启动 AINode 服务,并查看节点状态,具体命令可参考【3.4】和【3.6】小节。 - diff --git a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/AINode_Deployment_timecho.md b/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/AINode_Deployment_timecho.md deleted file mode 100644 index a3259a987..000000000 --- a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/AINode_Deployment_timecho.md +++ /dev/null @@ -1,340 +0,0 @@ - -# AINode 部署 - -## 1. AINode介绍 - -### 1.1 能力介绍 - -AINode 是 IoTDB 在 ConfigNode、DataNode 后提供的第三种内生节点,该节点通过与 IoTDB 集群的 DataNode、ConfigNode 的交互,扩展了对时间序列进行机器学习分析的能力,支持从外部引入已有机器学习模型进行注册,并使用注册的模型在指定时序数据上通过简单 SQL 语句完成时序分析任务的过程,将模型的创建、管理及推理融合在数据库引擎中。目前已提供常见时序分析场景(例如预测与异常检测)的机器学习算法或自研模型。 - -### 1.2 交付方式 -AINode 是 IoTDB 集群外的额外套件,独立安装包。 - -### 1.3 部署模式 -
- - -
- -## 2. 安装准备 - -### 2.1 安装包获取 - -AINode 安装包(`timechodb--ainode-bin.zip`),安装包解压后目录结构如下: - -| **目录** | **类型** | **说明** | -| ------------ | -------- | ------------------------------------------------ | -| lib | 文件夹 | AINode 的 python 包文件 | -| sbin | 文件夹 | AINode的运行脚本,可以启动,移除和停止AINode | -| conf | 文件夹 | AINode 的配置文件和运行环境设置脚本 | -| LICENSE | 文件 | 证书 | -| NOTICE | 文件 | 提示 | -| README_ZH.md | 文件 | markdown格式的中文版说明 | -| README.md | 文件 | 使用说明 | - -### 2.2 前置检查 - -为确保您获取的 AINode 安装包完整且正确,在执行安装部署前建议您进行SHA512校验。 - -#### 准备工作: - -- 获取官方发布的 SHA512 校验码:请联系天谋工作人员获取 - -#### 校验步骤(以 linux 为例): - -1. 打开终端,进入安装包所在目录(如`/data/ainode`): - ```Bash - cd /data/ainode - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum timechodb-{version}-ainode-bin.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -![img](/img/sha512-06.png) - -4. 对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行 AINode 的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - -### 2.3 环境准备 - -1. 建议操作环境: Ubuntu, MacOS -2. IoTDB 版本:>= V 2.0.5.1 -3. 运行环境 - - Python 版本在 3.9 ~3.12,且带有 pip 和 venv 工具; - - -## 3. 安装部署及使用 - -### 3.1 安装 AINode - -1. 保证 Python 版本介于 3.9 ~3.12 - -```shell -python --version -# 或 -python3 --version -``` -2. 下载导入 AINode 到专用文件夹,切换到专用文件夹并解压安装包 - -```shell - unzip timechodb--ainode-bin.zip -``` - -3. 
Activate AINode:
-
-- Enter the IoTDB CLI
-
-```shell
-  # Linux or macOS
-  ./start-cli.sh -sql_dialect table
-
-  # Windows
-  ./start-cli.bat -sql_dialect table
-```
-
-- Run the following statement to obtain the machine code required for activation:
-
-```sql
-show system info
-```
-
-- Copy the returned machine code and send it to Timecho staff:
-
-```sql
-+--------------------------------------------------------------+
-|                                                    SystemInfo|
-+--------------------------------------------------------------+
-|                                          01-TE5NLES4-UDDWCMYE|
-+--------------------------------------------------------------+
-```
-
-- Enter the activation code returned by the staff into the CLI as follows
-  - Note: the activation code must be enclosed in `'` characters, as shown below
-
-```sql
-IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZK'
-```
-
-- Activation can be verified as follows; a status of ACTIVATED means activation succeeded
-
-```sql
-IoTDB> show cluster
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-|NodeID|  NodeType| Status|InternalAddress|InternalPort|       Version|  BuildInfo|ActivateStatus|
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-|     0|ConfigNode|Running|      127.0.0.1|       10710|              |    xxxxxxx|     ACTIVATED|
-|     1|  DataNode|Running|      127.0.0.1|       10730|              |    xxxxxxx|     ACTIVATED|
-|     2|    AINode|Running|      127.0.0.1|       10810|              |    xxxxxxx|     ACTIVATED|
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-
-IoTDB> show activation
-+---------------+---------+-----------------------------+
-|    LicenseInfo|    Usage|                        Limit|
-+---------------+---------+-----------------------------+
-|         Status|ACTIVATED|                            -|
-|    ExpiredTime|        -|2025-07-16T00:00:00.000+08:00|
-|  DataNodeLimit|        1|                    Unlimited|
-|    AiNodeLimit|        1|                            1|
-|       CpuLimit|       11|                    Unlimited|
-|    DeviceLimit|        0|                    Unlimited|
-|TimeSeriesLimit|        0|                        9,999|
-+---------------+---------+-----------------------------+
-
-```
-
-### 3.2 Modifying Configuration Items
-AINode allows some essential parameters to be changed. The following parameters can be found and persistently modified in the `conf/iotdb-ainode.properties` file:
-
-| **Name** | **Description** | **Type** | **Default** |
-| ------------------------------ | ------------------------------------------------------------ | 
-------- | ------------------ | -| cluster_name | AINode 要加入集群的标识 | string | defaultCluster | -| ain_seed_config_node | AINode 启动时注册的 ConfigNode 地址 | String | 127.0.0.1:10710 | -| ain_cluster_ingress_address | AINode 拉取数据的 DataNode 的 rpc 地址 | String | 127.0.0.1 | -| ain_cluster_ingress_port | AINode 拉取数据的 DataNode 的 rpc 端口 | Integer | 6667 | -| ain_cluster_ingress_username | AINode 拉取数据的 DataNode 的客户端用户名 | String | root | -| ain_cluster_ingress_password | AINode 拉取数据的 DataNode 的客户端密码 | String | root | -| ain_cluster_ingress_time_zone | AINode 拉取数据的 DataNode 的客户端时区 | String | UTC+8 | -| ain_inference_rpc_address | AINode 提供服务与通信的地址 ,内部服务通讯接口 | String | 127.0.0.1 | -| ain_inference_rpc_port | AINode 提供服务与通信的端口 | String | 10810 | -| ain_system_dir | AINode 元数据存储路径,相对路径的起始目录与操作系统相关,建议使用绝对路径 | String | data/AINode/system | -| ain_models_dir | AINode 存储模型文件的路径,相对路径的起始目录与操作系统相关,建议使用绝对路径 | String | data/AINode/models | -| ain_thrift_compression_enabled | AINode 是否启用 thrift 的压缩机制,0-不启动、1-启动 | Boolean | 0 | - -### 3.3 导入权重文件 - -> 仅离线环境,在线环境可忽略本步骤 -> - 联系天谋工作人员获取模型权重文件,并放置到/IOTDB_AINODE_HOME/data/ainode/models/weights/目录下。 - -### 3.4 启动 AINode - - 在完成 Seed-ConfigNode 的部署后,可以通过添加 AINode 节点来支持模型的注册和推理功能。在配置项中指定 IoTDB 集群的信息后,可以执行相应的指令来启动 AINode,加入 IoTDB 集群。 - -- 联网环境启动 - -启动命令 - -```shell - # 启动命令 - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh - - # Windows 系统 - sbin\start-ainode.bat - - # 后台启动命令(长期运行推荐) - # Linux 和 MacOS 系统 - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - - # Windows 系统 - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -### 3.5 检测 AINode 节点状态 - -AINode 启动过程中会自动将新的 AINode 加入 IoTDB 集群。启动 AINode 后可以在 命令行中输入 SQL 来查询,集群中看到 AINode 节点,其运行状态为 Running(如下展示)表示加入成功。 - -```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 
0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|Running| 127.0.0.1| 10810|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` - -除此之外,还可以通过 show models 命令来查看模型状态。如果模型状态不对,请检查权重文件路径是否正确。 - -```sql -IoTDB:etth> show models -+---------------------+--------------------+--------+------+ -| ModelId| ModelType|Category| State| -+---------------------+--------------------+--------+------+ -| arima| Arima|BUILT-IN|ACTIVE| -| holtwinters| HoltWinters|BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing|BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster|BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster|BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm|BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm|BUILT-IN|ACTIVE| -| stray| Stray|BUILT-IN|ACTIVE| -| sundial| Timer-Sundial|BUILT-IN|ACTIVE| -| timer_xl| Timer-XL|BUILT-IN|ACTIVE| -+---------------------+--------------------+--------+------+ -``` - -### 3.6 停止 AINode - -如果需要停止正在运行的 AINode 节点,则执行相应的关闭脚本。 - -- 停止命令 - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh - - #Windows - sbin\stop-ainode.bat - ``` - -停止 AINode 后,还可以在集群中看到 AINode 节点,其运行状态为 UNKNOWN(如下展示),此时无法使用 AINode 功能。 - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|UNKNOWN| 127.0.0.1| 10790|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -如果需要重新启动该节点,需重新执行启动脚本。 - - -## 4. 
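FAQ
-
-### 4.1 "venv module not found" error when starting AINode
-
-When AINode is started in the default way, it creates a Python virtual environment under the installation package directory and installs its dependencies there, so the venv module must be installed. Python 3.10 and later normally ship with venv, but some system-provided Python environments may not meet this requirement. When this error appears there are two solutions (choose one):
-
-Install the venv module locally. On Ubuntu, for example, the following command installs the venv module that ships with Python. Alternatively, install a Python build that includes venv from the official Python website.
-
-```shell
-apt-get install python3.10-venv
-```
-To set up a version 3.10.0 venv inside AINode, run the following from the AINode path:
-
-```shell
-# the trailing "venv" is the name of the folder to create
-../Python-3.10.0/python -m venv venv
-```
-When running the start script, use `-i` to specify the path of an existing Python interpreter as the AINode runtime environment, so that no new virtual environment needs to be created.
-
-### 4.2 The SSL module in Python is not installed and configured correctly, so HTTPS resources cannot be handled
-WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
-This can be resolved by installing OpenSSL and then rebuilding Python.
-> Currently Python versions 3.6 to 3.9 are compatible with OpenSSL 1.0.2, 1.1.0, and 1.1.1.
-
-Python requires OpenSSL to be installed on the system; for installation details see this [link](https://stackoverflow.com/questions/56552390/how-to-fix-ssl-module-in-python-is-not-available-in-centos)
-
-```shell
-sudo apt-get install build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev uuid-dev lzma-dev liblzma-dev
-sudo -E ./configure --with-ssl
-make
-sudo make install
-```
-
-### 4.3 pip version is too old
-
-On Windows, a build error similar to "error:Microsoft Visual C++ 14.0 or greater is required..." appears.
-
-This error usually means the C++ build tools or setuptools are out of date. Upgrading pip and setuptools may resolve it:
-
-```shell
-./python -m pip install --upgrade pip
-./python -m pip install --upgrade setuptools
-```
-
-
-### 4.4 Building and installing Python
-
-Use the following commands to download the source package from the official website and extract it:
-```shell
-wget https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tar.xz
-tar Jxf Python-3.10.0.tar.xz
-```
-Build and install the corresponding Python:
-```shell
-cd Python-3.10.0
-./configure --prefix=/usr/local/python3
-make
-sudo make install
-python3 --version
-```
\ No newline at end of file
diff --git a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md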
b/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md deleted file mode 100644 index 0fa3c8625..000000000 --- a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md +++ /dev/null @@ -1,573 +0,0 @@ - -# 集群版部署指导 - -本小节描述如何手动部署包括3个ConfigNode和3个DataNode的实例,即通常所说的3C3D集群。 - -
- -
- -## 1. 注意事项 - -1. 安装前请确认系统已参照[系统配置](../Deployment-and-Maintenance/Environment-Requirements.md)准备完成。 - -2. 推荐使用`hostname`进行IP配置,可避免后期修改主机ip导致数据库无法启动的问题。设置hostname需要在服务器上配`/etc/hosts`,如本机ip是11.101.17.224,hostname是iotdb-1,则可以使用以下命令设置服务器的 hostname,并使用hostname配置IoTDB的`cn_internal_address`、`dn_internal_address`。 - - ```shell - echo "11.101.17.224 iotdb-1" >> /etc/hosts - ``` - -3. 有些参数首次启动后不能修改,请参考下方的[参数配置](#参数配置)章节来进行设置。 - -4. 无论是在linux还是windows中,请确保IoTDB的安装路径中不含空格和中文,避免软件运行异常。 - -5. 请注意,安装部署(包括激活和使用软件)IoTDB时,您可以: - -- 使用 root 用户(推荐):可以避免权限等问题。 - -- 使用固定的非 root 用户: - - - 使用同一用户操作:确保在启动、激活、停止等操作均保持使用同一用户,不要切换用户。 - - - 避免使用 sudo:使用 sudo 命令会以 root 用户权限执行命令,可能会引起权限混淆或安全问题。 - -6. 推荐部署监控面板,可以对重要运行指标进行监控,随时掌握数据库运行状态,监控面板可以联系商务获取,部署监控面板步骤可以参考:[监控面板部署](./Monitoring-panel-deployment.md) - -7. 在安装部署数据库前,可以使用健康检查工具检测 IoTDB 节点运行环境,并获取详细的检查结果。 IoTDB 健康检查工具使用方法可以参考:[健康检查工具](../Tools-System/Health-Check-Tool.md)。 - - -## 2. 准备步骤 - -1. 准备IoTDB数据库安装包 :timechodb-{version}-bin.zip(安装包获取见:[链接](./IoTDB-Package_timecho.md)) -2. 按环境要求配置好操作系统环境(系统环境配置见:[链接](./Environment-Requirements.md)) - -### 2.1 前置检查 - -为确保您获取的IoTDB企业版安装包完整且正确,在执行安装部署前建议您进行SHA512校验。 - -#### 准备工作: - -- 获取官方发布的 SHA512 校验码:[发布历史](../IoTDB-Introduction/Release-history_timecho.md)文档中各版本对应的"SHA512校验码" - -#### 校验步骤(以 linux 为例): - -1. 打开终端,进入安装包所在目录(如`/data/iotdb`): - ```Bash - cd /data/iotdb - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -![img](/img/sha512-01.png) - -4. 对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行IoTDB企业版的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - -## 3. 
安装步骤 - -假设现在有3台linux服务器,IP地址和服务角色分配如下: - -| 节点ip | 主机名 | 服务 | -| ------------- | ------- | -------------------- | -| 11.101.17.224 | iotdb-1 | ConfigNode、DataNode | -| 11.101.17.225 | iotdb-2 | ConfigNode、DataNode | -| 11.101.17.226 | iotdb-3 | ConfigNode、DataNode | - -### 3.1 设置主机名 - -在3台机器上分别配置主机名,设置主机名需要在目标服务器上配置/etc/hosts,使用如下命令: - -```shell -echo "11.101.17.224 iotdb-1" >> /etc/hosts -echo "11.101.17.225 iotdb-2" >> /etc/hosts -echo "11.101.17.226 iotdb-3" >> /etc/hosts -``` - -### 3.2 参数配置 - -解压安装包并进入安装目录 - -```shell -unzip timechodb-{version}-bin.zip -cd timechodb-{version}-bin -``` - -#### 3.2.1 环境脚本配置 - -- ./conf/confignode-env.sh配置 - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------- | :------------------------------------- | :--------- | :----------------------------------------------- | :----------- | -| MEMORY_SIZE | IoTDB ConfigNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的30% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -- ./conf/datanode-env.sh配置 - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------- | :----------------------------------- |:-----------------------| :----------------------------------------------- | :----------- | -| MEMORY_SIZE | IoTDB DataNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的50% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -#### 3.2.2 通用配置(./conf/iotdb-system.properties) - -- 集群配置 - -| 配置项 | 说明 | 11.101.17.224 | 11.101.17.225 | 11.101.17.226 | -| ------------------------- | ---------------------------------------- | -------------- | -------------- | -------------- | -| cluster_name | 集群名称 | defaultCluster | defaultCluster | defaultCluster | -| schema_replication_factor | 元数据副本数,DataNode数量不应少于此数目 | 3 | 3 | 3 | -| data_replication_factor | 数据副本数,DataNode数量不应少于此数目 | 2 | 2 | 2 | - -#### 3.2.3 ConfigNode 配置 - -| 配置项 | 说明 | 默认 | 推荐值 | 11.101.17.224 | 11.101.17.225 | 11.101.17.226 | 备注 | -| ------------------- | ------------------------------------------------------------ | --------------- | 
------------------------------------------------------- | ------------- | ------------- | ------------- | ------------------ | -| cn_internal_address | ConfigNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | iotdb-1 | iotdb-2 | iotdb-3 | 首次启动后不能修改 | -| cn_internal_port | ConfigNode在集群内部通讯使用的端口 | 10710 | 10710 | 10710 | 10710 | 10710 | 首次启动后不能修改 | -| cn_consensus_port | ConfigNode副本组共识协议通信使用的端口 | 10720 | 10720 | 10720 | 10720 | 10720 | 首次启动后不能修改 | -| cn_seed_config_node | 节点注册加入集群时连接的ConfigNode 的地址,cn_internal_address:cn_internal_port | 127.0.0.1:10710 | 第一个CongfigNode的cn_internal_address:cn_internal_port | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | 首次启动后不能修改 | - -#### 3.2.4 DataNode 配置 - -| 配置项 | 说明 | 默认 | 推荐值 | 11.101.17.224 | 11.101.17.225 | 11.101.17.226 | 备注 | -| ------------------------------- | ------------------------------------------------------------ | --------------- | --------------------------------------------------- | ------------- | ------------- | ------------- | ------------------ | -| dn_rpc_address | 客户端 RPC 服务的地址 | 127.0.0.1 | 默认本机可直接访问。非本机访问,请修改此配置项为所在服务器的IPV4地址或hostname,推荐使用所在服务器的IPV4地址。 | iotdb-1 |iotdb-2 | iotdb-3 | 重启服务生效 | -| dn_rpc_port | 客户端 RPC 服务的端口 | 6667 | 6667 | 6667 | 6667 | 6667 | 重启服务生效 | -| dn_internal_address | DataNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | iotdb-1 | iotdb-2 | iotdb-3 | 首次启动后不能修改 | -| dn_internal_port | DataNode在集群内部通信使用的端口 | 10730 | 10730 | 10730 | 10730 | 10730 | 首次启动后不能修改 | -| dn_mpp_data_exchange_port | DataNode用于接收数据流使用的端口 | 10740 | 10740 | 10740 | 10740 | 10740 | 首次启动后不能修改 | -| dn_data_region_consensus_port | DataNode用于数据副本共识协议通信使用的端口 | 10750 | 10750 | 10750 | 10750 | 10750 | 首次启动后不能修改 | -| dn_schema_region_consensus_port | DataNode用于元数据副本共识协议通信使用的端口 | 10760 | 10760 | 10760 | 10760 | 10760 | 首次启动后不能修改 | -| dn_seed_config_node | 节点注册加入集群时连接的ConfigNode地址,即cn_internal_address:cn_internal_port | 127.0.0.1:10710 | 
 cn_internal_address:cn_internal_port of the first ConfigNode | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | Cannot be modified after first startup |
-
-> ❗️Note: editors such as VSCode Remote do not save configuration automatically. Make sure modified files are persisted to disk; otherwise the configuration items will not take effect
-
-### 3.3 Start the ConfigNodes
-
-Start the ConfigNode on iotdb-1 first so that the seed ConfigNode is up before the others, then start the second and third ConfigNodes in turn
-
-```shell
-# Unix/OS X
-cd sbin
-./start-confignode.sh    -d      # the "-d" option starts the node in the background
-
-# Windows
-# Before V2.0.4.x
-.\start-confignode.bat
-
-# V2.0.4.x and later
-.\windows\start-confignode.bat
-```
-
-If startup fails, see the [FAQ](#常见问题)
-
-### 3.4 Start the DataNodes
-
-On each machine, enter the sbin directory of IoTDB and start the 3 DataNodes in turn:
-
-```shell
-# Unix/OS X
-cd sbin
-./start-datanode.sh   -d   # the "-d" option starts the node in the background
-
-# Windows
-# Before V2.0.4.x
-.\start-datanode.bat
-
-# V2.0.4.x and later
-.\windows\start-datanode.bat
-```
-
-### 3.5 Activate the Database
-
-#### Method 1: Activate through the CLI
-
-- Enter the CLI on any node of the cluster
-
-```shell
-# Start commands for Linux and macOS:
-# Before V2.0.6.x
-Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table
-
-# V2.0.6.x and later
-Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table
-
-# Start commands for Windows:
-# Before V2.0.4.x
-Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table
-
-# V2.0.4.x and later, before V2.0.6.x
-Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table
-
-# V2.0.6.x and later
-Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table
-```
-
-- Run the following statement to obtain the machine codes required for activation:
-
-```SQL
-IoTDB> show system info
-```
-```shell
-+--------------------------------------------------------------+
-|                                                    SystemInfo|
-+--------------------------------------------------------------+
-|01-TE5NLES4-UDDWCMYE,01-GG5NLES4-XXDWCMYE,01-FF5NLES4-WWWWCMYE|
-+--------------------------------------------------------------+
-Total
line number = 1 -``` - -- 将获取到的机器码与版本号,一同提供给天谋工作人员。 - -- 工作人员会返回激活码,正常是与提供的机器码的顺序对应的,请将整串激活码粘贴到CLI中进行激活,此激活操作只需在集群中的任意一台机器上执行一次即可。 - - - 注:激活码前后需要用`'`符号进行标注,如下所示 - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - -#### 方式二:激活文件拷贝激活 - -- 依次启动3个Confignode、Datanode节点后,每台机器各自的activation文件夹, 分别拷贝每台机器的system_info文件给天谋工作人员; -- 工作人员将返回每个ConfigNode、Datanode节点的license文件,这里会返回3个license文件; -- 将3个license文件分别放入对应的ConfigNode节点的activation文件夹下; - - -### 3.6 验证激活 - -可在 CLI 中通过执行 `show activation` 命令查看激活状态,示例如下,状态显示为 ACTIVATED 表示激活成功 - -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -### 3.7 一键启停集群 - -#### 3.7.1 概述 - -在 IoTDB 的根目录中,`sbin` 子目录包含的 `start-all.sh` 和 `stop-all.sh` 脚本,与 `conf` 子目录中的 `iotdb-cluster.properties` 配置文件协同工作,可通过单一节点实现一键启动或停止集群所有节点的功能。通过这种方式,可以高效地管理 IoTDB 集群的生命周期,简化了部署和运维流程。 -下文将介绍`iotdb-cluster.properties` 文件中的具体配置项。 - -#### 3.7.2 配置项 - - -> 注意: -> -> * 当集群变更时,需要手动更新此配置文件。 -> * 如果在未配置 
`iotdb-cluster.properties` 配置文件的情况下执行 `start-all.sh` 或者 `stop-all.sh` 脚本,则默认会启停当前脚本所在 IOTDB\_HOME 目录下的 ConfigNode 与 DataNode 节点。 -> * 推荐配置 ssh 免密登录:如果未配置,启动脚本后会提示输入服务器密码以便于后续启动/停止/销毁操作。如果已配置,则无需在执行脚本过程中输入服务器密码。 - -* confignode\_address\_list - -| 名字 | confignode\_address\_list | -| :--------------: | :------------------------------------------------------------------------------ | -| 描述 | 待启动/停止的 ConfigNode 节点所在主机的 IP 或主机名列表,如果有多个需要用“,”分隔。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* datanode\_address\_list - -| 名字 | datanode\_address\_list | -| :----------------: | :---------------------------------------------------------------------------- | -| 描述 | 待启动/停止的 DataNode 节点所在主机的 IP 或主机名列表,如果有多个需要用“,”分隔。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* ssh\_account - -| 名字 | ssh\_account | -| :----------------: | :------------------------------------------------------------- | -| 描述 | 通过 SSH 登陆目标主机的用户名,需要所有的主机的用户名都相同 | -| 类型 | String | -| 默认值 | root | -| 改后生效方式 | 重启服务生效 | - -* ssh\_port - -| 名字 | ssh\_port | -| :----------------: | :--------------------------------------------------------- | -| 描述 | 目标主机对外暴露的 SSH 端口,需要所有的主机的端口都相同 | -| 类型 | int | -| 默认值 | 22 | -| 改后生效方式 | 重启服务生效 | - -* confignode\_deploy\_path - -| 名字 | confignode\_deploy\_path | -| :----------------: | :---------------------------------------------------------------------------------------------------------------- | -| 描述 | 待启动/停止的所有 ConfigNode 所在目标主机的路径,需要所有待启动/停止的 ConfigNode 节点在目标主机的相同目录下。例如:`/data/demo/apache-iotdb-1.3.1-all-bin` | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* datanode\_deploy\_path - -| 名字 | datanode\_deploy\_path | -| :----------------: | :------------------------------------------------------------------------------------------------------------ | -| 描述 | 待启动/停止的所有 DataNode 所在目标主机的路径,需要所有待启动/停止的 DataNode 节点在目标主机的相同目录下。例如:`/data/demo/apache-iotdb-1.3.1-all-bin` | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - - -#### 3.7.3 简单示例 - -1. 
配置文件 `iotdb-cluster.properties` - -```properties -# Configure ConfigNodes machine addresses separated by , -confignode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# Configure DataNodes machine addresses separated by , -datanode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# User name for logging in to the deployment machine using ssh -ssh_account=root - -# ssh login port -ssh_port=22 - -# iotdb deployment directory (iotdb will be deployed to the target node in this folder) -confignode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -datanode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -``` - -2. 执行 ./start-all.sh 命令验证启动结果,在 cli 中执行 show cluster,可看到类似如下结果 -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -| 0|ConfigNode|Running| 172.xx.xx.16| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 1|ConfigNode|Running| 172.xx.xx.18| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 2|ConfigNode|Running| 172.xx.xx.17| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 3| DataNode|Running| 172.xx.xx.18| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 4| DataNode|Running| 172.xx.xx.17| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 5| DataNode|Running| 172.xx.xx.16| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -``` - - -## 4. 
节点维护步骤 - -### 4.1 ConfigNode节点维护 - -ConfigNode节点维护分为ConfigNode添加和移除两种操作,有两个常见使用场景: - -- 集群扩展:如集群中只有1个ConfigNode时,希望增加ConfigNode以提升ConfigNode节点高可用性,则可以添加2个ConfigNode,使得集群中有3个ConfigNode。 -- 集群故障恢复:1个ConfigNode所在机器发生故障,使得该ConfigNode无法正常运行,此时可以移除该ConfigNode,然后添加一个新的ConfigNode进入集群。 - -> ❗️注意,在完成ConfigNode节点维护后,需要保证集群中有1或者3个正常运行的ConfigNode。2个ConfigNode不具备高可用性,超过3个ConfigNode会导致性能损失。 - -#### 4.1.1 添加ConfigNode节点 - -脚本命令: - -```shell -# Linux / MacOS -# 首先切换到IoTDB根目录 -sbin/start-confignode.sh - -# Windows -# 首先切换到IoTDB根目录 -# V2.0.4.x 版本之前 -sbin\start-confignode.bat - -# V2.0.4.x 版本及之后 -sbin\windows\start-confignode.bat -``` - -#### 4.1.2 移除ConfigNode节点 - -首先通过CLI连接集群,通过`show confignodes`确认想要移除ConfigNode的NodeID: - -```shell -IoTDB> show confignodes -+------+-------+---------------+------------+--------+ -|NodeID| Status|InternalAddress|InternalPort| Role| -+------+-------+---------------+------------+--------+ -| 0|Running| 127.0.0.1| 10710| Leader| -| 1|Running| 127.0.0.1| 10711|Follower| -| 2|Running| 127.0.0.1| 10712|Follower| -+------+-------+---------------+------------+--------+ -Total line number = 3 -It costs 0.030s -``` - -然后使用SQL将ConfigNode移除,SQL命令: - -```Bash -remove confignode [confignode_id] -``` - -### 4.2 DataNode节点维护 - -DataNode节点维护有两个常见场景: - -- 集群扩容:出于集群能力扩容等目的,添加新的DataNode进入集群 -- 集群故障恢复:一个DataNode所在机器出现故障,使得该DataNode无法正常运行,此时可以移除该DataNode,并添加新的DataNode进入集群 - -> ❗️注意,为了使集群能正常工作,在DataNode节点维护过程中以及维护完成后,正常运行的DataNode总数不得少于数据副本数(通常为2),也不得少于元数据副本数(通常为3)。 - -#### 4.2.1 添加DataNode节点 - -脚本命令: - -```Bash -# Linux / MacOS -# 首先切换到IoTDB根目录 -sbin/start-datanode.sh - -#Windows -# 首先切换到IoTDB根目录 -# V2.0.4.x 版本之前 -sbin\start-datanode.bat - -# V2.0.4.x 版本及之后 -sbin\windows\start-datanode.bat -``` - -说明:在添加DataNode后,随着新的写入到来(以及旧数据过期,如果设置了TTL),集群负载会逐渐向新的DataNode均衡,最终在所有节点上达到存算资源的均衡。 - -#### 4.2.2 移除DataNode节点 - -首先通过CLI连接集群,通过`show datanodes`确认想要移除的DataNode的NodeID: - -```Bash -IoTDB> show datanodes -+------+-------+----------+-------+-------------+---------------+ -|NodeID| 
Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum| -+------+-------+----------+-------+-------------+---------------+ -| 1|Running| 0.0.0.0| 6667| 0| 0| -| 2|Running| 0.0.0.0| 6668| 1| 1| -| 3|Running| 0.0.0.0| 6669| 1| 0| -+------+-------+----------+-------+-------------+---------------+ -Total line number = 3 -It costs 0.110s -``` - -然后使用SQL将DataNode移除,SQL命令: - -```Bash -remove datanode [datanode_id] -``` - -### 4.3 集群维护 - -更多关于集群维护的介绍可参考:[集群维护](../User-Manual/Load-Balance.md) - -## 5. 常见问题 - -1. 部署过程中多次提示激活失败 - - 使用 `ls -al` 命令:使用 `ls -al` 命令检查安装包根目录的所有者信息是否为当前用户。 - - 检查激活目录:检查 `./activation` 目录下的所有文件,所有者信息是否为当前用户。 -2. Confignode节点启动失败 - - 步骤 1: 请查看启动日志,检查是否修改了某些首次启动后不可改的参数。 - - 步骤 2: 请查看启动日志,检查是否出现其他异常。日志中若存在异常现象,请联系天谋技术支持人员咨询解决方案。 - - 步骤 3: 如果是首次部署或者数据可删除,也可按下述步骤清理环境,重新部署后,再次启动。 - - 清理环境: - - 1. 结束所有 ConfigNode 和 DataNode 进程。 -```Bash - # 1. 停止 ConfigNode 和 DataNode 服务 - # Unix/OS X - sbin/stop-standalone.sh - - # Windows - # V2.0.4.x 版本之前 - sbin\stop-standalone.bat - - # V2.0.4.x 版本及之后 - sbin\windows\stop-standalone.bat - - # 2. 检查是否还有进程残留 - jps - # 或者 - ps -ef|grep iotdb - - # 3. 如果有进程残留,则手动kill - kill -9 - # 如果确定机器上仅有1个iotdb,可以使用下面命令清理残留进程 - ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9 - ``` - - 2. 删除 data 和 logs 目录。 - - 说明:删除 data 目录是必要的,删除 logs 目录是为了纯净日志,非必需。 - ```shell - cd /data/iotdb rm -rf data logs - ``` -## 6. 
附录 - -### 6.1 Confignode节点参数介绍 - -| 参数 | 描述 | 是否为必填项 | -| :--- | :------------------------------- | :----------- | -| -d | 以守护进程模式启动,即在后台运行 | 否 | - -### 6.2 Datanode节点参数介绍 - -| 缩写 | 描述 | 是否为必填项 | -| :--- | :--------------------------------------------- | :----------- | -| -v | 显示版本信息 | 否 | -| -f | 在前台运行脚本,不将其放到后台 | 否 | -| -d | 以守护进程模式启动,即在后台运行 | 否 | -| -p | 指定一个文件来存放进程ID,用于进程管理 | 否 | -| -c | 指定配置文件夹的路径,脚本会从这里加载配置文件 | 否 | -| -g | 打印垃圾回收(GC)的详细信息 | 否 | -| -H | 指定Java堆转储文件的路径,当JVM内存溢出时使用 | 否 | -| -E | 指定JVM错误日志文件的路径 | 否 | -| -D | 定义系统属性,格式为 key=value | 否 | -| -X | 直接传递 -XX 参数给 JVM | 否 | -| -h | 帮助指令 | 否 | - diff --git a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Database-Resources_timecho.md b/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Database-Resources_timecho.md deleted file mode 100644 index 4da0bf727..000000000 --- a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Database-Resources_timecho.md +++ /dev/null @@ -1,209 +0,0 @@ - -# 资源规划 - -## 1. CPU - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
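| Number of series (sampling frequency <= 1 Hz) | CPU | Nodes (standalone) | Nodes (dual-active) | Nodes (distributed) |
| ---- | ---- | ---- | ---- | ---- |
| Within 100,000 | 2-4 cores | 1 | 2 | 3 |
| Within 300,000 | 4-8 cores | 1 | 2 | 3 |
| Within 500,000 | 8-16 cores | 1 | 2 | 3 |
| Within 1,000,000 | 16-32 cores | 1 | 2 | 3 |
| Within 2,000,000 | 32-48 cores | 1 | 2 | 3 |
| Within 10,000,000 | 48 cores | 1 | 2 | Contact Timecho sales for consultation |
| Over 10,000,000 | Contact Timecho sales for consultation | | | |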

> Supported CPU models: Kunpeng, Phytium, Sunway, Hygon, Zhaoxin, Loongson

## 2. Memory

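| Number of series (sampling frequency <= 1 Hz) | Memory | Nodes (standalone) | Nodes (dual-active) | Nodes (distributed) |
| ---- | ---- | ---- | ---- | ---- |
| Within 100,000 | 2G-4G | 1 | 2 | 3 |
| Within 300,000 | 6G-12G | 1 | 2 | 3 |
| Within 500,000 | 12G-24G | 1 | 2 | 3 |
| Within 1,000,000 | 24G-48G | 1 | 2 | 3 |
| Within 2,000,000 | 24G-96G | 1 | 2 | 3 |
| Within 10,000,000 | 128G | 1 | 2 | Contact Timecho sales for consultation |
| Over 10,000,000 | Contact Timecho sales for consultation | | | |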
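
> Memory allocation is configurable and can be adjusted in the datanode-env file. For details and a configuration guide, see [datanode-env](../Reference/System-Config-Manual_timecho.md#_2-2-datanode-env-sh-bat)

**Note:** For hardware ratios and throughput references specific to AI model inference, see section [2.3.1 Resource Configuration Recommendations](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md#_2-3-1-资源配置建议) of the AINode deployment document.

## 3. Storage (Disk)
### 3.1 Storage Space

Storage can be estimated with the disk resource evaluator: [Disk Resource Evaluator](https://www.timecho.com/docs/zh/ResourceEvaluator.html)

Formula (raw bytes written per second, for a single uncompressed replica): number of measurement points * sampling frequency (Hz) * size of each data point (bytes; varies by data type, see the table below)
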
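**Data point size calculation**

| Data type | Timestamp (bytes) | Value (bytes) | Total data point size (bytes) |
| ---- | ---- | ---- | ---- |
| Boolean | 8 | 1 | 9 |
| Integer (INT32) / single-precision float (FLOAT) | 8 | 4 | 12 |
| Long integer (INT64) / double-precision float (DOUBLE) | 8 | 8 | 16 |
| String (TEXT) | 8 | a on average | 8+a |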
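
The per-second formula and the data point sizes above can be combined into a rough total-storage estimate. The following is a minimal sketch, not an official tool: the function and parameter names are made up for illustration, and the defaults (365 days of retention, 3 replicas, a 10:1 compression ratio) are assumptions taken from the worked example in this section rather than guaranteed values.

```python
# Illustrative storage estimate; all names here are hypothetical helpers.
# Data point sizes (timestamp + value, in bytes) follow the table above.
BYTES_PER_POINT = {
    "BOOLEAN": 9,
    "INT32": 12,
    "FLOAT": 12,
    "INT64": 16,
    "DOUBLE": 16,
}

SECONDS_PER_DAY = 86400


def text_point_size(avg_value_bytes):
    """Size of a TEXT data point: 8-byte timestamp plus the average value length."""
    return 8 + avg_value_bytes


def estimate_storage_bytes(series_count, bytes_per_point, frequency_hz=1.0,
                           retention_days=365, replicas=3,
                           compression_ratio=10.0):
    """Estimate on-disk bytes for the given workload.

    compression_ratio=10.0 is an assumption borrowed from the worked example;
    real ratios depend on the data and the encoding used.
    """
    bytes_per_second = series_count * frequency_hz * bytes_per_point
    raw_bytes = bytes_per_second * SECONDS_PER_DAY * retention_days
    return raw_bytes * replicas / compression_ratio


def to_tib(num_bytes):
    """Convert bytes to TiB (1024^4 bytes)."""
    return num_bytes / 1024 ** 4


if __name__ == "__main__":
    # 1000 devices x 100 measurement points, INT32, 1 Hz, 1 year, 3 replicas.
    estimate = estimate_storage_bytes(1000 * 100, BYTES_PER_POINT["INT32"])
    print(f"Estimated storage: {to_tib(estimate):.2f} TiB")
```

For the workload in the worked example this comes out slightly above 10 TiB, which is consistent with the rounded-up figure of 11T quoted there. For TEXT series, pass `text_point_size(a)` as `bytes_per_point`, with `a` being the average string length in bytes.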
- -示例:1000设备,每个设备100 测点,共 100000 序列,INT32 类型。采样频率1Hz(每秒一次),存储1年,3副本。 -- 完整计算公式:1000设备 * 100测点 * 12字节每数据点 * 86400秒每天 * 365天每年 * 3副本/10压缩比 / 1024 / 1024 / 1024 / 1024 =11T -- 简版计算公式:1000 * 100 * 12 * 86400 * 365 * 3 / 10 / 1024 / 1024 / 1024 / 1024 = 11T -### 3.2 存储配置 -1000w 点位以上或查询负载较大,推荐配置 SSD。 -## 4. 网络(网卡) -在写入吞吐不超过1000万点/秒时,需配置千兆网卡;当写入吞吐超过 1000万点/秒时,需配置万兆网卡。 -| **写入吞吐(数据点/秒)** | **网卡速率** | -| ------------------- | ------------- | -| <1000万 | 1Gbps(千兆) | -| >=1000万 | 10Gbps(万兆) | -## 5. 其他说明 -IoTDB 具有集群秒级扩容能力,扩容节点数据可不迁移,因此您无需担心按现有数据情况估算的集群能力有限,未来您可在需要扩容时为集群加入新的节点。 \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Deployment-form_timecho.md b/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Deployment-form_timecho.md deleted file mode 100644 index d49674d07..000000000 --- a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Deployment-form_timecho.md +++ /dev/null @@ -1,61 +0,0 @@ - -# 部署形态 - -IoTDB 有三种运行模式:单机模式、集群模式和双活模式。 - -## 1. 单机模式 - -IoTDB单机实例包括 1 个ConfigNode、1个DataNode,即1C1D; - -- **特点**:便于开发者安装部署,部署和维护成本较低,操作方便。 -- **适用场景**:资源有限或对高可用要求不高的场景,例如边缘端服务器。 -- **部署方法**:[单机版部署](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -## 2. 双活模式 - -双活版部署为 TimechoDB 企业版功能,是指两个独立的实例进行双向同步,能同时对外提供服务。当一台停机重启后,另一个实例会将缺失数据断点续传。 - -> IoTDB 双活实例通常为2个单机节点,即2套1C1D。每个实例也可以为集群。 - -- **特点**:资源占用最低的高可用解决方案。 -- **适用场景**:资源有限(仅有两台服务器),但希望获得高可用能力。 -- **部署方法**:[双活版部署](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -## 3. 集群模式 - -IoTDB 集群实例为 3 个ConfigNode 和不少于 3 个 DataNode,通常为 3 个 DataNode,即3C3D;当部分节点出现故障时,剩余节点仍然能对外提供服务,保证数据库服务的高可用性,且可随节点增加提升数据库性能。 - -- **特点**:具有高可用性、高扩展性,可通过增加 DataNode 提高系统性能。 -- **适用场景**:需要提供高可用和可靠性的企业级应用场景。 -- **部署方法**:[集群版部署](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - -## 4. 
特点总结 - -| 维度 | 单机模式 | 双活模式 | 集群模式 | -| ------------ | ---------------------------- | ------------------------ | ------------------------ | -| 适用场景 | 边缘侧部署、对高可用要求不高 | 高可用性业务、容灾场景等 | 高可用性业务、容灾场景等 | -| 所需机器数量 | 1 | 2 | ≥3 | -| 安全可靠性 | 无法容忍单点故障 | 高,可容忍单点故障 | 高,可容忍单点故障 | -| 扩展性 | 可扩展 DataNode 提升性能 | 每个实例可按需扩展 | 可扩展 DataNode 提升性能 | -| 性能 | 可随 DataNode 数量扩展 | 与其中一个实例性能相同 | 可随 DataNode 数量扩展 | - -- 单机模式和集群模式,部署步骤类似(逐个增加 ConfigNode 和 DataNode),仅副本数和可提供服务的最少节点数不同。 \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Docker-Deployment_timecho.md b/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Docker-Deployment_timecho.md deleted file mode 100644 index 80a847eaf..000000000 --- a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Docker-Deployment_timecho.md +++ /dev/null @@ -1,495 +0,0 @@ - -# Docker部署指导 - -## 1. 环境准备 - -### 1.1 Docker安装 - -```Bash -#以ubuntu为例,其他操作系统可以自行搜索安装方法 -#step1: 安装一些必要的系统工具 -sudo apt-get update -sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common -#step2: 安装GPG证书 -curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add - -#step3: 写入软件源信息 -sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" -#step4: 更新并安装Docker-CE -sudo apt-get -y update -sudo apt-get -y install docker-ce -#step5: 设置docker开机自启动 -sudo systemctl enable docker -#step6: 验证docker是否安装成功 -docker --version #显示版本信息,即安装成功 -``` - -### 1.2 docker-compose安装 - -```Bash -#安装命令 -curl -L "https://github.com/docker/compose/releases/download/v2.20.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose -chmod +x /usr/local/bin/docker-compose -ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose -#验证是否安装成功 -docker-compose --version #显示版本信息即安装成功 -``` - -### 1.3 安装dmidecode插件 - -默认情况下,linux服务器应该都已安装,如果没有安装的话,可以使用下面的命令安装。 - -```Bash -sudo apt-get install dmidecode -``` - -dmidecode 
安装后,查找安装路径:`whereis dmidecode`,这里假设结果为`/usr/sbin/dmidecode`,记住该路径,后面的docker-compose的yml文件会用到。 - -### 1.4 获取IoTDB的容器镜像 - -关于IoTDB企业版的容器镜像您可联系商务或技术支持获取。 - -## 2. 单机版部署 - -本节演示如何部署1C1D的docker单机版。 - -### 2.1 load 镜像文件 - -比如这里获取的IoTDB的容器镜像文件名是:`iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz` - -load镜像: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -查看镜像: - -```Bash -docker images -``` - -![](/img/%E5%8D%95%E6%9C%BA-%E6%9F%A5%E7%9C%8B%E9%95%9C%E5%83%8F.png) - -### 2.2 创建docker bridge网络 - -```Bash -docker network create --driver=bridge --subnet=172.18.0.0/16 --gateway=172.18.0.1 iotdb -``` - -### 2.3 编写docker-compose的yml文件 - -这里我们以把IoTDB安装目录和yml文件统一放在`/docker-iotdb` 文件夹下为例: - -文件目录结构为:`/docker-iotdb/iotdb`, `/docker-iotdb/docker-compose-standalone.yml ` - -```Bash -docker-iotdb: -├── iotdb #iotdb安装目录 -│── docker-compose-standalone.yml #单机版docker-compose的yml文件 -``` - -完整的`docker-compose-standalone.yml`内容如下: - -```Bash -version: "3" -services: - iotdb-service: - image: timecho/timechodb:2.0.2.1-standalone #使用的镜像 - hostname: iotdb - container_name: iotdb - restart: always - ports: - - "6667:6667" - environment: - - cn_internal_address=iotdb - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb:10710 - - dn_rpc_address=iotdb - - dn_internal_address=iotdb - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - dn_seed_config_node=iotdb:10710 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - networks: - iotdb: - ipv4_address: 172.18.0.6 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -networks: - iotdb: - external: true -``` - -### 2.4 首次启动 - -使用下面的命令启动: - -```Bash -cd /docker-iotdb -docker-compose -f docker-compose-standalone.yml up -``` - -由于没有激活,首次启动时会直接退出,属于正常现象,首次启动是为了获取机器码文件,用于后面的激活流程。 - -![](/img/%E5%8D%95%E6%9C%BA-%E6%BF%80%E6%B4%BB.png) - -### 2.5 申请激活 - -- 首次启动后,在物理机目录`/docker-iotdb/iotdb/activation`下会生成一个 `system_info`文件,将这个文件拷贝给天谋工作人员。 - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- 收到工作人员返回的license文件,将license文件拷贝到`/docker-iotdb/iotdb/activation`文件夹下。 - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -### 2.6 再次启动IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -![](/img/%E5%90%AF%E5%8A%A8iotdb.png) - -### 2.7 验证部署 - -- 查看日志,有如下字样,表示启动成功 - -```Bash -docker logs -f iotdb-datanode #查看日志命令 -2024-07-19 12:02:32,608 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
-``` - -![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B21.png) - -- 进入容器,查看服务运行状态及激活信息 - - 查看启动的容器 - - ```Bash - docker ps - ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B22.png) - - 进入容器, 通过cli登录数据库, 使用show cluster命令查看服务状态及激活状态 - - ```Bash - docker exec -it iotdb /bin/bash #进入容器 - ./start-cli.sh -h iotdb #登录数据库 - IoTDB> show cluster #查看状态 - ``` - - 可以看到服务都是running,激活状态显示已激活。 - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B23.png) - -### 2.8 映射/conf目录(可选) - -后续如果想在物理机中直接修改配置文件,可以把容器中的/conf文件夹映射出来,分三步: - -步骤一:拷贝容器中的/conf目录到`/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -步骤二:在docker-compose-standalone.yml中添加映射 - -```Bash - volumes: - - ./iotdb/conf:/iotdb/conf #增加这个/conf文件夹的映射 - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -步骤三:重新启动IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -## 3. 集群版部署 - -本小节描述如何手动部署包括3个ConfigNode和3个DataNode的实例,即通常所说的3C3D集群。 - -
- -**注意:集群版目前只支持host网络和overlay 网络,不支持bridge网络。** - -下面以host网络为例演示如何部署3C3D集群。 - -### 3.1 设置主机名 - -假设现在有3台linux服务器,IP地址和服务角色分配如下: - -| 节点ip | 主机名 | 服务 | -| ----------- | ------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode、DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode、DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode、DataNode | - -在3台机器上分别配置主机名,设置主机名需要在目标服务器上配置/etc/hosts,使用如下命令: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### 3.2 load镜像文件 - -比如获取的IoTDB的容器镜像文件名是:`iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz` - -在3台服务器上分别执行load镜像命令: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -查看镜像: - -```Bash -docker images -``` - -![](/img/%E9%95%9C%E5%83%8F%E5%8A%A0%E8%BD%BD.png) - -### 3.3 编写docker-compose的yml文件 - -这里我们以把IoTDB安装目录和yml文件统一放在/docker-iotdb文件夹下为例: - -文件目录结构为:`/docker-iotdb/iotdb`,`/docker-iotdb/confignode.yml`,`/docker-iotdb/datanode.yml` - -```Bash -docker-iotdb: -├── confignode.yml #confignode的yml文件 -├── datanode.yml #datanode的yml文件 -└── iotdb #IoTDB安装目录 -``` - -在每台服务器上都要编写2个yml文件,即`confignode.yml`和`datanode.yml`,yml示例如下: - -**confignode.yml:** - -```Bash -#confignode.yml -version: "3" -services: - iotdb-confignode: - image: iotdb-enterprise:2.0.x.x-standalone #使用的镜像 - hostname: iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - container_name: iotdb-confignode - command: ["bash", "-c", "entrypoint.sh confignode"] - restart: always - environment: - - cn_internal_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb-1:10710 #默认第一台为seed节点 - - schema_replication_factor=3 #元数据副本数 - - data_replication_factor=2 #数据副本数 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" 
#使用host网络 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -**datanode.yml:** - -```Bash -#datanode.yml -version: "3" -services: - iotdb-datanode: - image: iotdb-enterprise:2.0.x.x-standalone #使用的镜像 - hostname: iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - container_name: iotdb-datanode - command: ["bash", "-c", "entrypoint.sh datanode"] - restart: always - ports: - - "6667:6667" - privileged: true - environment: - - dn_rpc_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - dn_internal_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - dn_seed_config_node=iotdb-1:10710 #默认第1台为seed节点 - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - schema_replication_factor=3 #元数据副本数 - - data_replication_factor=2 #数据副本数 - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #使用host网络 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -### 3.4 首次启动confignode - -先在3台服务器上分别启动confignode, 用来获取机器码,注意启动顺序,先启动第1台iotdb-1,再启动iotdb-2和iotdb-3。 - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d #后台启动 -``` - -### 3.5 申请激活 - -- 首次启动3个confignode后,在每个物理机目录`/docker-iotdb/iotdb/activation`下都会生成一个`system_info`文件,将3个服务器的`system_info`文件拷贝给天谋工作人员; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- 将3个license文件分别放入对应的ConfigNode节点的`/docker-iotdb/iotdb/activation`文件夹下; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -- license放入对应的activation文件夹后,confignode会自动激活,不用重启confignode - -### 3.6 启动datanode - -在3台服务器上分别启动datanode - -```Bash -cd /docker-iotdb -docker-compose -f datanode.yml up -d #后台启动 -``` - -![](/img/%E9%9B%86%E7%BE%A4%E7%89%88-dn%E5%90%AF%E5%8A%A8.png) - -### 3.7 验证部署 - -- 查看日志,有如下字样,表示datanode启动成功 - - ```Bash - docker logs -f iotdb-datanode #查看日志命令 - 2024-07-20 16:50:48,937 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
- ``` - - ![](/img/dn%E5%90%AF%E5%8A%A8.png) - -- 进入任意一个容器,查看服务运行状态及激活信息 - - 查看启动的容器 - - ```Bash - docker ps - ``` - - ![](/img/%E6%9F%A5%E7%9C%8B%E5%AE%B9%E5%99%A8.png) - - 进入容器,通过cli登录数据库,使用`show cluster`命令查看服务状态及激活状态 - - ```Bash - docker exec -it iotdb-datanode /bin/bash #进入容器 - ./start-cli.sh -h iotdb-1 #登录数据库 - IoTDB> show cluster #查看状态 - ``` - - 可以看到服务都是running,激活状态显示已激活。 - - ![](/img/%E9%9B%86%E7%BE%A4-%E6%BF%80%E6%B4%BB.png) - -### 3.8 映射/conf目录(可选) - -后续如果想在物理机中直接修改配置文件,可以把容器中的/conf文件夹映射出来,分三步: - -步骤一:在3台服务器中分别拷贝容器中的/conf目录到`/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb-confignode:/iotdb/conf /docker-iotdb/iotdb/conf -或者 -docker cp iotdb-datanode:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -步骤二:在3台服务器的`confignode.yml`和`datanode.yml`中添加/conf目录映射 - -```Bash -#confignode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #增加这个/conf文件夹的映射 - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - -#datanode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #增加这个/conf文件夹的映射 - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -步骤三:在3台服务器上重新启动IoTDB - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d -docker-compose -f datanode.yml up -d -``` \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md b/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md deleted file mode 100644 index 1a21357a0..000000000 --- a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md +++ /dev/null @@ -1,203 +0,0 @@ - -# 双活版部署指导 - -## 1. 什么是双活版? 
- -双活通常是指两个独立的单机(或集群),实时进行镜像同步,它们的配置完全独立,可以同时接收外界的写入,每一个独立的单机(或集群)都可以将写入到自己的数据同步到另一个单机(或集群)中,两个单机(或集群)的数据可达到最终一致。 - -- 两个单机(或集群)可构成一个高可用组:当其中一个单机(或集群)停止服务时,另一个单机(或集群)不会受到影响。当停止服务的单机(或集群)再次启动时,另一个单机(或集群)会将新写入的数据同步过来。业务可以绑定两个单机(或集群)进行读写,从而达到高可用的目的。 -- 双活部署方案允许在物理节点少于 3 的情况下实现高可用,在部署成本上具备一定优势。同时可以通过电力、网络的双环网,实现两套单机(或集群)的物理供应隔离,保障运行的稳定性。 -- 目前双活能力为企业版功能。 - -![](/img/%E5%8F%8C%E6%B4%BB%E5%90%8C%E6%AD%A5.png) - -## 2. 注意事项 - -1. 部署时推荐优先使用`hostname`进行IP配置,可避免后期修改主机ip导致数据库无法启动的问题。设置hostname需要在目标服务器上配置`/etc/hosts`,如本机ip是192.168.1.3,hostname是iotdb-1,则可以使用以下命令设置服务器的 hostname,并使用hostname配置IoTDB的`cn_internal_address`、`dn_internal_address`。 - - ```Bash - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -2. 有些参数首次启动后不能修改,请参考下方的"安装步骤"章节来进行设置。 - -3. 推荐部署监控面板,可以对重要运行指标进行监控,随时掌握数据库运行状态,监控面板可以联系商务获取,部署监控面板步骤可以参考[文档](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - -## 3. 安装步骤 - -我们以两台单机A和B构建的双活版IoTDB为例,A和B的ip分别是192.168.1.3 和 192.168.1.4 ,这里用hostname来表示不同的主机,规划如下: - -| 机器 | 机器ip | 主机名 | -| ---- | ----------- | ------- | -| A | 192.168.1.3 | iotdb-1 | -| B | 192.168.1.4 | iotdb-2 | - -### Step1:分别安装两套独立的 IoTDB - -在2个机器上分别安装 IoTDB,单机版部署文档可参考[文档](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md),集群版部署文档可参考[文档](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md)。**推荐 A、B 集群的各项配置保持一致,以实现最佳的双活效果。** - -### Step2:在机器A上创建数据同步任务至机器B - -- 在机器A上创建数据同步流程,即机器A上的数据自动同步到机器B,使用sbin目录下的cli工具连接A上的IoTDB数据库: - - ```Bash - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-1 - - # Windows - # V2.0.4.x 版本之前 - .\sbin\start-cli.bat -h iotdb-1 - - # V2.0.4.x 版本及之后 - .\sbin\windows\start-cli.bat -h iotdb-1 - ``` - -- 创建并启动数据同步命令,SQL 如下: - - ```Bash - create pipe AB - with source ( - 'source.mode.double-living' ='true' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-2', - 'sink.port'='6667' - ) - ``` - -- 注意:为了避免数据无限循环,需要将A和B上的参数`source.mode.double-living` 均设置为 `true`,表示不转发从另一pipe传输而来的数据。 - -### Step3:在机器B上创建数据同步任务至机器A - - - 
在机器B上创建数据同步流程,即机器B上的数据自动同步到机器A,使用sbin目录下的cli工具连接B上的IoTDB数据库: - - ```Bash - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-2 - - # Windows - # V2.0.4.x 版本之前 - .\sbin\start-cli.bat -h iotdb-2 - - # V2.0.4.x 版本及之后 - .\sbin\windows\start-cli.bat -h iotdb-2 - ``` - - 创建并启动pipe,SQL 如下: - - ```Bash - create pipe BA - with source ( - 'source.mode.double-living' ='true' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-1', - 'sink.port'='6667' - ) - ``` - -- 注意:为了避免数据无限循环,需要将A和B上的参数`source.mode.double-living` 均设置为 `true`,表示不转发从另一pipe传输而来的数据。 - -### Step4:验证部署 - -上述数据同步流程创建完成后,即可启动双活集群。 - -#### 检查集群运行状态 - -```Bash -#在2个节点分别执行show cluster命令检查IoTDB服务状态 -show cluster -``` - -**机器A**: - -![](/img/%E5%8F%8C%E6%B4%BB-A.png) - -**机器B**: - -![](/img/%E5%8F%8C%E6%B4%BB-B.png) - -确保每一个 ConfigNode 和 DataNode 都处于 Running 状态。 - -#### 检查同步状态 - -- 机器A上检查同步状态 - -```Bash -show pipes -``` - -![](/img/show%20pipes-A.png) - -- 机器B上检查同步状态 - -```Bash -show pipes -``` - -![](/img/show%20pipes-B.png) - -确保每一个 pipe 都处于 RUNNING 状态。 - -### Step5:停止双活版 IoTDB - -- 在机器A的执行下列命令: - - ```SQL - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-1 #登录cli - IoTDB> stop pipe AB #停止数据同步流程 - ./sbin/stop-standalone.sh #停止数据库服务 - - # Windows - # V2.0.4.x 版本之前 - .\sbin\start-cli.bat -h iotdb-1 - IoTDB> stop pipe AB - .\sbin\stop-standalone.bat - - # V2.0.4.x 版本及之后 - .\sbin\windows\start-cli.bat -h iotdb-1 - IoTDB> stop pipe AB - .\sbin\windows\stop-standalone.bat - ``` - -- 在机器B的执行下列命令: - - ```SQL - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-2 #登录cli - IoTDB> stop pipe BA #停止数据同步流程 - ./sbin/stop-standalone.sh #停止数据库服务 - - # Windows - # V2.0.4.x 版本之前 - .\sbin\start-cli.bat -h iotdb-2 - IoTDB> stop pipe BA - .\sbin\stop-standalone.bat - - # V2.0.4.x 版本及之后 - .\sbin\windows\start-cli.bat -h iotdb-2 - IoTDB> stop pipe BA - .\sbin\windows\stop-standalone.bat - ``` diff --git a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/IoTDB-Package_timecho.md 
b/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/IoTDB-Package_timecho.md deleted file mode 100644 index f6bd4cb1a..000000000 --- a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/IoTDB-Package_timecho.md +++ /dev/null @@ -1,47 +0,0 @@ - -# 安装包获取 -## 1. 获取方式 - -企业版安装包可通过产品试用申请,或直接联系与您对接的工作人员获取。 - -## 2. 安装包结构 - -安装包解压后目录结构如下: - -| **目录** | **类型** | **说明** | -| :--------------- | :------- | :----------------------------------------------------------- | -| activation | 文件夹 | 激活文件所在目录,包括生成的机器码以及从天谋工作人员获取的企业版激活码(启动ConfigNode后才会生成该目录,即可获取激活码) | -| conf | 文件夹 | 配置文件目录,包含 ConfigNode、DataNode、JMX 和 logback 等配置文件 | -| data | 文件夹 | 默认的数据文件目录,包含 ConfigNode 和 DataNode 的数据文件。(启动程序后才会生成该目录) | -| lib | 文件夹 | 库文件目录 | -| licenses | 文件夹 | 开源协议证书文件目录 | -| logs | 文件夹 | 默认的日志文件目录,包含 ConfigNode 和 DataNode 的日志文件(启动程序后才会生成该目录) | -| sbin | 文件夹 | 主要脚本目录,包含数据库启、停等脚本 | -| tools | 文件夹 | 工具目录 | -| ext | 文件夹 | pipe,trigger,udf插件的相关文件 | -| LICENSE | 文件 | 开源许可证文件 | -| NOTICE | 文件 | 开源声明文件 | -| README_ZH.md | 文件 | 使用说明(中文版) | -| README.md | 文件 | 使用说明(英文版) | -| RELEASE_NOTES.md | 文件 | 版本说明 | - -注意:自 V2.0.8.2 版本起,TimechoDB 安装包中默认不包含 MQTT 服务 和 REST 服务的 JAR 包。如需使用,请联系天谋团队获取。 \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md b/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md deleted file mode 100644 index f5b5ff6d8..000000000 --- a/src/zh/UserGuide/Master/Table/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md +++ /dev/null @@ -1,306 +0,0 @@ - -# 单机版部署指导 - -本章将介绍如何启动IoTDB单机实例,IoTDB单机实例包括 1 个ConfigNode 和1个DataNode(即通常所说的1C1D)。 - -## 1. 注意事项 - -1. 安装前请确认系统已参照[系统配置](../Deployment-and-Maintenance/Environment-Requirements.md)准备完成。 -2. 
推荐使用`hostname`进行IP配置,可避免后期修改主机ip导致数据库无法启动的问题。设置hostname需要在服务器上配置`/etc/hosts`,如本机ip是192.168.1.3,hostname是iotdb-1,则可以使用以下命令设置服务器的 hostname,并使用hostname配置IoTDB的 `cn_internal_address`、`dn_internal_address`。 - - ```shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -3. 部分参数首次启动后不能修改,请参考下方的[参数配置](#2参数配置)章节进行设置。 -4. 无论是在linux还是windows中,请确保IoTDB的安装路径中不含空格和中文,避免软件运行异常。 -5. 请注意,安装部署(包括激活和使用软件)IoTDB时,您可以: - - 使用 root 用户(推荐):可以避免权限等问题。 - - 使用固定的非 root 用户: - - 使用同一用户操作:确保在启动、激活、停止等操作均保持使用同一用户,不要切换用户。 - - 避免使用 sudo:使用 sudo 命令会以 root 用户权限执行命令,可能会引起权限混淆或安全问题。 -6. 推荐部署监控面板,可以对重要运行指标进行监控,随时掌握数据库运行状态,监控面板可以联系工作人员获取,部署监控面板步骤可以参考:[监控面板部署](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - -7. 在安装部署数据库前,可以使用健康检查工具检测 IoTDB 节点运行环境,并获取详细的检查结果。 IoTDB 健康检查工具使用方法可以参考:[健康检查工具](../Tools-System/Health-Check-Tool.md)。 - - -## 2. 安装步骤 - -### 2.1 前置检查 - -为确保您获取的IoTDB企业版安装包完整且正确,在执行安装部署前建议您进行SHA512校验。 - -#### 准备工作: - -- 获取官方发布的 SHA512 校验码:[发布历史](../IoTDB-Introduction/Release-history_timecho.md)文档中各版本对应的"SHA512校验码" - -#### 校验步骤(以 linux 为例): - -1. 打开终端,进入安装包所在目录(如`/data/iotdb`): - ```Bash - cd /data/iotdb - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -![img](/img/sha512-01.png) - -4. 
对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行IoTDB企业版的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - -### 2.2 解压安装包并进入安装目录 - -```Plain -unzip timechodb-{version}-bin.zip -cd timechodb-{version}-bin -``` - -### 2.3 参数配置 - -#### 2.3.1 内存配置 - -- conf/confignode-env.sh(或 .bat) - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------- | :------------------------------------- | :--------- | :----------------------------------------------- | :----------- | -| MEMORY_SIZE | IoTDB ConfigNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的30% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -- conf/datanode-env.sh(或 .bat) - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------- | :----------------------------------- |:-----------------------| :----------------------------------------------- | :----------- | -| MEMORY_SIZE | IoTDB DataNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的50% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -#### 2.3.2 功能配置 - -系统实际生效的参数在文件 conf/iotdb-system.properties 中,启动需设置以下参数,可以从 conf/iotdb-system.properties.template 文件中查看全部参数 - -集群级功能配置 - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :------------------------ | :------------------------------- | :------------- | :----------------------------------------------- |:-------------------------| -| cluster_name | 集群名称 | defaultCluster | 可根据需要设置集群名称,如无特殊需要保持默认即可 | 支持热加载,但不建议手动修改该参数 | -| schema_replication_factor | 元数据副本数,单机版此处设置为 1 | 1 | 1 | 默认1,首次启动后不可修改 | -| data_replication_factor | 数据副本数,单机版此处设置为 1 | 1 | 1 | 默认1,首次启动后不可修改 | - -ConfigNode 配置 - -| **配置项** | **说明** | **默认** | 推荐值 | **备注** | -| :------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------- | :----------------- | -| cn_internal_address | ConfigNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | 首次启动后不能修改 | -| cn_internal_port | ConfigNode在集群内部通讯使用的端口 | 10710 | 10710 | 首次启动后不能修改 | -| 
cn_consensus_port | ConfigNode副本组共识协议通信使用的端口 | 10720 | 10720 | 首次启动后不能修改 | -| cn_seed_config_node | 节点注册加入集群时连接的ConfigNode 的地址,cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | 首次启动后不能修改 | - -DataNode 配置 - -| **配置项** | **说明** | **默认** | 推荐值 | **备注** | -| :------------------------------ | :----------------------------------------------------------- | :-------------- |:----------------------------------------| :----------------- | -| dn_rpc_address | 客户端 RPC 服务的地址 | 127.0.0.1 | 默认本机可直接访问。非本机访问,请修改此配置项为所在服务器的IPV4地址或hostname,推荐使用所在服务器的IPV4地址。 | 重启服务生效 | -| dn_rpc_port | 客户端 RPC 服务的端口 | 6667 | 6667 | 重启服务生效 | -| dn_internal_address | DataNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | 首次启动后不能修改 | -| dn_internal_port | DataNode在集群内部通信使用的端口 | 10730 | 10730 | 首次启动后不能修改 | -| dn_mpp_data_exchange_port | DataNode用于接收数据流使用的端口 | 10740 | 10740 | 首次启动后不能修改 | -| dn_data_region_consensus_port | DataNode用于数据副本共识协议通信使用的端口 | 10750 | 10750 | 首次启动后不能修改 | -| dn_schema_region_consensus_port | DataNode用于元数据副本共识协议通信使用的端口 | 10760 | 10760 | 首次启动后不能修改 | -| dn_seed_config_node | 节点注册加入集群时连接的ConfigNode地址,即cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | 首次启动后不能修改 | - -### 2.4 启动 ConfigNode 节点 - -进入iotdb的sbin目录下,启动confignode - -```shell -# Unix/OS X -./sbin/start-confignode.sh -d #“-d”参数将在后台进行启动 - -# Windows -# V2.0.4.x 版本之前 -.\sbin\start-confignode.bat - -# V2.0.4.x 版本及之后 -.\sbin\windows\start-confignode.bat -``` - -如果启动失败,请参考下方[常见问题](#常见问题)。 - -### 2.5 启动 DataNode 节点 - - 进入iotdb的sbin目录下,启动datanode: - -```shell -# Unix/OS X -./sbin/start-datanode.sh -d #“-d”参数将在后台进行启动 - -# Windows -# V2.0.4.x 版本之前 -.\sbin\start-datanode.bat - -# V2.0.4.x 版本及之后 -.\sbin\windows\start-datanode.bat -``` - -### 2.6 激活数据库 - -#### 方式一:命令激活 - -- 进入 IoTDB CLI - -```shell -# Linux 系统与 MacOS 系统启动命令如下: -# V2.0.6.x 版本之前 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table - -# 
V2.0.6.x 版本及之后 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table - -# Windows 系统启动命令如下: -# V2.0.4.x 版本之前 -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table - -# V2.0.4.x 版本及之后, V2.0.6.x 版本之前 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table - -# V2.0.6.x 版本及之后 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` - -- 执行以下语句获取激活所需机器码: - -```SQL -IoTDB> show system info -``` -```shell -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -- 执行以下语句获取待激活数据库的版本号: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.x.x| xxxxxxx| -+-------+---------+ -Total line number = 1 -``` - -- 将获取到的机器码与版本号,一同提供给天谋工作人员。 - -- 将工作人员返回的激活码输入到 CLI 中进行激活操作,请注意激活码前后需要用`'`符号进行标注,如下所示 - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - - -#### 方式二:文件激活 - -- 启动Confignode、Datanode节点后,进入activation文件夹, 将 system_info文件复制给天谋工作人员 -- 收到工作人员返回的 license文件 -- 将license文件放入对应节点的activation文件夹下; - - -### 2.7 验证激活 - -可在 CLI 中通过执行 `show activation` 命令查看激活状态,当看到“ClusterActivationStatus”字段状态显示为 ACTIVATED 表示激活成功 - -![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81.png) - -## 3. 常见问题 - -1. 部署过程中多次提示激活失败 - - 使用 `ls -al` 命令:使用 `ls -al` 命令检查安装包根目录的所有者信息是否为当前用户。 - - 检查激活目录:检查 `./activation` 目录下的所有文件,所有者信息是否为当前用户。 -2. 
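ConfigNode fails to start
-    - Step 1: Check the startup log to see whether any parameter that cannot be changed after the first startup has been modified.
-    - Step 2: Check the startup log for other exceptions. If the log shows any abnormal behavior, contact Timecho technical support for a solution.
-    - Step 3: If this is the first deployment, or the data can be deleted, you can also clean up the environment as described below, redeploy, and start again.
-    - Clean up the environment:
-    1. Stop all ConfigNode and DataNode processes.
-    ```Bash
-      # 1. Stop the ConfigNode and DataNode services
-      # Unix/OS X
-      sbin/stop-standalone.sh
-
-      # Windows
-      # Before V2.0.4.x
-      sbin\stop-standalone.bat
-
-      # V2.0.4.x and later
-      sbin\windows\stop-standalone.bat
-
-      # 2. Check whether any process is still running
-      jps
-      # or
-      ps -ef|grep iotdb
-
-      # 3. If any process remains, kill it manually (replace <pid> with the process ID found above)
-      kill -9 <pid>
-      # If you are sure only one IoTDB instance runs on this machine, the following command cleans up the remaining processes
-      ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9
-    ```
-
-    2. Delete the data and logs directories.
-    - Note: deleting the data directory is required; deleting the logs directory only gives you clean logs and is optional.
-    ```shell
-      cd /data/iotdb
-      rm -rf data logs
-    ```
-
-## 4. Appendix
-
-### 4.1 ConfigNode Parameters
-
-| Parameter | Description | Required |
-| :--- | :------------------------------- | :----------- |
-| -d | Start in daemon mode, i.e. run in the background | No |
-
-### 4.2 DataNode Parameters
-
-| Abbreviation | Description | Required |
-| :--- | :--------------------------------------------- | :----------- |
-| -v | Show version information | No |
-| -f | Run the script in the foreground instead of putting it in the background | No |
-| -d | Start in daemon mode, i.e. run in the background | No |
-| -p | Specify a file that stores the process ID, for process management | No |
-| -c | Specify the path of the configuration folder; the script loads configuration files from there | No |
-| -g | Print detailed garbage collection (GC) information | No |
-| -H | Specify the path of the Java heap dump file, used when the JVM runs out of memory | No |
-| -E | Specify the path of the JVM error log file | No |
-| -D | Define a system property, in the form key=value | No |
-| -X | Pass -XX options directly to the JVM | No |
-| -h | Show help | No |
-
diff --git a/src/zh/UserGuide/Master/Table/Ecosystem-Integration/Ecosystem-Overview_timecho.md b/src/zh/UserGuide/Master/Table/Ecosystem-Integration/Ecosystem-Overview_timecho.md
deleted file mode 100644
index a27cd20bd..000000000
--- a/src/zh/UserGuide/Master/Table/Ecosystem-Integration/Ecosystem-Overview_timecho.md
+++ /dev/null
@@ -1,38 +0,0 @@
-
-
-# Overview
-
-IoTDB ecosystem integration connects the full time series data pipeline: data collection brings devices online within seconds, data integration builds cross-cloud pipelines, programming frameworks speed up business logic development, compute engines handle distributed processing, visualization and SQL development tools implement analysis strategies, and integration with IoT platforms completes edge-cloud collaboration, forming a complete intelligent loop from the physical world to digital decision-making.
-
-![](/img/eco-overview-n.png)
-
-The documents below explain in detail how to use the integration tools available at each stage:
-
-- Compute engines
-  - Spark [Spark](./Spark-IoTDB.md)
-- SQL development
-  - DBeaver 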
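[DBeaver](./DBeaver.md)
-  - DataGrip [DataGrip](./DataGrip.md)
-- Programming frameworks
-  - Spring Boot Starter [Spring Boot Starter](./Spring-Boot-Starter.md)
-  - Mybatis Generator [Mybatis Generator](./Mybatis-Generator.md)
-  - MyBatisPlus Generator [MyBatisPlus Generator](./MyBatisPlus-Generator.md)
\ No newline at end of file
diff --git a/src/zh/UserGuide/Master/Table/Ecosystem-Integration/SeaTunnel_timecho.md b/src/zh/UserGuide/Master/Table/Ecosystem-Integration/SeaTunnel_timecho.md
deleted file mode 100644
index 8eae5ac09..000000000
--- a/src/zh/UserGuide/Master/Table/Ecosystem-Integration/SeaTunnel_timecho.md
+++ /dev/null
@@ -1,193 +0,0 @@
-
-
-# Apache SeaTunnel
-
-## 1. Overview
-
-SeaTunnel is a distributed integration platform designed for massive data volumes. With high performance and elastic scalability, it connects heterogeneous multi-source data pipelines through standardized connectors, each consisting of a Source and a Sink. The platform uses Sources to abstract all kinds of data sources into the unified SeaTunnelRow format and, after dynamic resource scheduling and batch-processing optimization, writes the data efficiently to different storage systems through Sinks. The deep integration of the IoTDB Connector with SeaTunnel addresses core challenges of time series scenarios such as high-throughput writes, multi-source governance, and complex analysis. Its out-of-the-box connector ecosystem and automated operations capabilities also help enterprises in the IoT, industrial internet, and similar fields quickly build low-cost, highly reliable, and easily scalable data infrastructure.
-
-## 2. Usage Steps
-
-### 2.1 Environment Preparation
-
-#### 2.1.1 Software Requirements
-
-| Software | Version | Installation reference |
-| ----------- | ---------- |-----------------------------------------------|
-| IoTDB | >= 2.0.5 | [Quick Start](../QuickStart/QuickStart_timecho.md) |
-| SeaTunnel | 2.3.12 | [Official website](https://seatunnel.apache.org/download) |
-
-* Resolving the Thrift version conflict (required only for the Spark engine):
-
-```Bash
-# Remove the old Thrift version bundled with Spark
-rm -f $SPARK_HOME/jars/libthrift*
-# Copy IoTDB's Thrift libraries into Spark
-cp $IOTDB_HOME/lib/libthrift* $SPARK_HOME/jars/
-```
-
-#### 2.1.2 Dependency Configuration
-
-1. JDBC
-
-* Spark/Flink engine: place the [JDBC driver JAR](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) in the `${SEATUNNEL_HOME}/plugins/` directory
-* SeaTunnel Zeta engine: place the [JDBC driver JAR](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) in the `${SEATUNNEL_HOME}/lib/` directory
-
-2. 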
Connector
-
-Place the matching version of the [SeaTunnel Connector](https://mvnrepository.com/artifact/org.apache.seatunnel/connector-iotdb) in the `${SEATUNNEL_HOME}/plugins/` directory
-
-### 2.2 Reading Data (IoTDB Source Connector)
-
-#### 2.2.1 Configuration Parameters
-
-| **Parameter** | **Type** | **Required** | **Default** | **Description** |
-| ---------------------------------- | ---------------- | ---------------- | ------------------ |-----------------------------------------------------------------------|
-| `node_urls` | string | Yes | - | IoTDB cluster address, in the format `"host1:port"` or `"host1:port,host2:port"` |
-| `username` | string | Yes | - | IoTDB username |
-| `password` | string | Yes | - | IoTDB password |
-| `sql_dialect` | string | No | tree | IoTDB model: tree for the tree model, table for the table model |
-| `sql` | string | Yes | - | SQL query statement to execute |
-| `database` | string | No | - | Database name; only takes effect in the table model |
-| `schema` | config | Yes | - | Data schema definition |
-| `fetch_size` | int | No | - | Fetch size: the amount of data fetched from IoTDB per request during a query |
-| `lower_bound` | long | No | - | Lower bound of the time range (used when sharding data by the time column) |
-| `upper_bound` | long | No | - | Upper bound of the time range (used when sharding data by the time column) |
-| `num_partitions` | int | No | - | Number of partitions (used when sharding data by the time column):<br>1 partition: the full time range is used<br>If (upper bound - lower bound) < number of partitions, the difference is used as the actual number of partitions |
-| `thrift_default_buffer_size` | int | No | - | Thrift protocol buffer size |
-| `thrift_max_frame_size` | int | No | - | Thrift maximum frame size |
-| `enable_cache_leader` | boolean | No | - | Whether to enable leader node caching |
-| `version` | string | No | - | Client SQL semantic version `(V_0_12/V_0_13)` |
-
-#### 2.2.2 Configuration Example
-
-1. Create `iotdb_source_example.conf` in the `${SEATUNNEL_HOME}/config/` directory
-
-```SQL
-env {
-  parallelism = 2     # parallelism of 2
-  job.mode = "BATCH"  # batch mode
-}
-
-source {
-  IoTDB {
-    node_urls = "localhost:6667"
-    username = "root"
-    password = "root"
-    sql_dialect = "table"
-    sql = "SELECT time,device_id,city,s1,s2,s3,s4 FROM tcollector.table1"
-    schema {
-      fields {
-        time = timestamp
-        device_id = string
-        city = string
-        s1 = int
-        s2 = bigint
-        s3 = float
-        s4 = double
-      }
-    }
-  }
-}
-
-sink {
-  Console {
-  } # print to the console
-}
-```
-
-2. Run SeaTunnel with the following command
-
-```Bash
-./bin/seatunnel.sh --config config/iotdb_source_example.conf -e local
-```
-
-3. For more details, see the [IoTDB Source Connector](https://seatunnel.incubator.apache.org/zh-CN/docs/2.3.12/connector-v2/source/IoTDB) documentation on the Apache SeaTunnel website
-
-### 2.3 Writing Data (IoTDB Sink Connector)
-
-#### 2.3.1 Configuration Parameters
-
-| **Name** | **Type** | **Required** | **Default** | **Description** |
-|-------------------------------|---------| ---------------------- |------------------| ------------------------------------------------------------ |
-| `node_urls` | Array | Yes | - | `IoTDB` cluster address, in the format `["host1:port"]` or `["host1:port","host2:port"]` |
-| `username` | String | Yes | - | Username of the `IoTDB` user |
-| `password` | String | Yes | - | Password of the `IoTDB` user |
-| `sql_dialect` | String | No | tree | `IoTDB` model: tree for the tree model, table for the table model |
-| `storage_group` | String | Yes | - | `IoTDB` tree model: the storage group (path prefix) of the device, e.g. deviceId = \${storage\_group} + "." + \${key\_device}; `IoTDB` table model: the database |
-| `key_device` | String | Yes | - | `IoTDB` tree model: the field name in SeaTunnelRow that holds the `IoTDB` device ID; `IoTDB` table model: the field name in SeaTunnelRow that holds the `IoTDB` table name |
-| `key_timestamp` | String | No | processing time | `IoTDB` tree model: the field name in SeaTunnelRow that holds the `IoTDB` timestamp (if not specified, the processing time is used as the timestamp); `IoTDB` table model: the field name in SeaTunnelRow that holds the IoTDB time column (if not specified, the processing time is used as the timestamp) |
-| `key_measurement_fields` | Array | No | See description | `IoTDB` tree model: the field names in SeaTunnelRow that hold the `IoTDB` measurement list (if not specified, all fields remaining after excluding `key_device`&`key_timestamp` are used); `IoTDB` table model: the field names in SeaTunnelRow that hold the `IoTDB` measurement (FIELD) columns (if not specified, all fields remaining after excluding `key_device`&`key_timestamp`&`key_tag_fields`&`key_attribute_fields` are used) |
-| `key_tag_fields` | Array | No | - | `IoTDB` tree model: not applicable; `IoTDB` table model: the field names in SeaTunnelRow that hold the `IoTDB` tag (TAG) columns |
-| `key_attribute_fields` | Array | No | - | `IoTDB` tree model: not applicable; `IoTDB` table model: the field names in SeaTunnelRow that hold the `IoTDB` attribute (ATTRIBUTE) columns |
-| `batch_size` | Integer | No | 1024 | For batch writes, data is flushed to IoTDB when the number of buffered records reaches `batch_size` or the elapsed time reaches `batch_interval_ms` |
-| `max_retries` | Integer | No | - | Number of retries when a flush fails |
-| `retry_backoff_multiplier_ms` | Integer | No | - | Multiplier used to generate the next backoff delay |
-| `max_retry_backoff_ms` | Integer | No | - | Amount of time to wait before retrying a request to `IoTDB` |
-| `default_thrift_buffer_size` | Integer | No | - | Thrift initial buffer size in the `IoTDB` client |
-| `max_thrift_frame_size` | Integer | No | - | Thrift maximum frame size in the `IoTDB` client |
-| `zone_id` | string | No | - | `java.time.ZoneId` used by the `IoTDB` client |
-| `enable_rpc_compression` | Boolean | No | - | Enable RPC compression in the `IoTDB` client |
-| `connection_timeout_in_ms` | Integer | No | - | Maximum time (in milliseconds) to wait when connecting to `IoTDB` |
-
-#### 2.3.2 Configuration Example
-
-1. Create `iotdb_sink_example.conf` in the `${SEATUNNEL_HOME}/config/` directory
-
-```Bash
-# Define the runtime environment
-env {
-  parallelism = 4
-  job.mode = "BATCH"
-}
-source{
-  Jdbc {
-    url = "jdbc:mysql://localhost:3306/demo_db?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true"
-    driver = "com.mysql.cj.jdbc.Driver"
-    connection_check_timeout_sec = 100
-    user = "root"
-    password = "IoTDB@2024"
-    query = "select * from device"
-  }
-}
-sink {
-  IoTDB {
-    node_urls = ["localhost:6667"]
-    username = "root"
-    password = "root"
-    sql_dialect = "table"
-    storage_group = "seatunnel"
-    key_device = "id"
-    key_timestamp = "intime"
-  }
-}
-```
-
-2. Run SeaTunnel with the following command
-
-```Bash
-./bin/seatunnel.sh --config config/iotdb_sink_example.conf -e local
-```
-
-3. For more configuration parameters and examples, see the [IoTDB Sink Connector](https://seatunnel.incubator.apache.org/zh-CN/docs/2.3.12/connector-v2/sink/IoTDB) documentation on the Apache SeaTunnel website
-
-
diff --git a/src/zh/UserGuide/Master/Table/IoTDB-Introduction/IoTDB-Introduction_timecho.md b/src/zh/UserGuide/Master/Table/IoTDB-Introduction/IoTDB-Introduction_timecho.md
deleted file mode 100644
index a82e4ed7f..000000000
--- a/src/zh/UserGuide/Master/Table/IoTDB-Introduction/IoTDB-Introduction_timecho.md
+++ /dev/null
@@ -1,271 +0,0 @@
-
-
-# Product Introduction
-
-TimechoDB is a low-cost, high-performance, IoT-native time series database. It is the commercial product that Timecho, the original vendor, provides on top of the Apache IoTDB community edition. It addresses the problems enterprises face when building an IoT big data platform to manage time series data: complex application scenarios, large data volumes, high sampling frequencies, frequent out-of-order data, long processing times, diverse analysis needs, and high storage and operations costs.
-
-Based on TimechoDB, Timecho offers more diverse product features, stronger performance and stability, and a richer set of productivity tools, together with comprehensive enterprise services. Commercial customers thus get more powerful product capabilities and a better development, operations, and usage experience.
-
-- Download, deployment, and usage: [Quick Start](../QuickStart/QuickStart_timecho.md)
-
-## 1. Product System
-
-The Timecho product system consists of several components that cover the full time series data lifecycle, from data collection through data management to data analysis and application. It provides an integrated "collect, store, use" time series data solution that helps users efficiently manage and analyze the massive time series data generated by the IoT.
-
-
-[Figure: Introduction-zh-timecho.png]
- - -其中: - -1. **时序数据库(TimechoDB,基于 Apache IoTDB 提供的原厂商业化产品)**:时序数据存储的核心组件,其能够为用户提供高压缩存储能力、丰富时序查询能力、实时流处理能力,同时具备数据的高可用和集群的高扩展性,并在安全层面提供全方位保障。同时 TimechoDB 还为用户提供多种应用工具,方便用户配置和管理系统;多语言API和外部系统应用集成能力,方便用户在 TimechoDB 基础上构建业务应用。 -2. **时序数据标准文件格式(Apache TsFile,多位天谋科技核心团队成员主导&贡献代码)**:该文件格式是一种专为时序数据设计的存储格式,可以高效地存储和查询海量时序数据。目前 Timecho 采集、存储、智能分析等模块的底层存储文件均由 Apache TsFile 进行支撑。TsFile 可以被高效地加载至 IoTDB 中,也能够被迁移出来。通过 TsFile,用户可以在采集、管理、应用&分析阶段统一使用相同的文件格式进行数据管理,极大简化了数据采集到分析的整个流程,提高时序数据管理的效率和便捷度。 -3. **时序模型训推一体化引擎(AINode)**:针对智能分析场景,TimechoDB 提供 AINode 时序模型训推一体化引擎,它提供了一套完整的时序数据分析工具,底层为模型训练引擎,支持训练任务与数据管理,与包括机器学习、深度学习等。通过这些工具,用户可以对存储在 TimechoDB 中的数据进行深入分析,挖掘出其中的价值。 -4. **数据采集**:为了更加便捷的对接各类工业采集场景, 天谋科技提供数据采集接入服务,支持多种协议和格式,可以接入各种传感器、设备产生的数据,同时支持断点续传、网闸穿透等特性。更加适配工业领域采集过程中配置难、传输慢、网络弱的特点,让用户的数采变得更加简单、高效。 - -## 2. TimechoDB 整体架构 - -下图展示了一个常见的 IoTDB 3C3D(3 个 ConfigNode、3 个 DataNode)的集群部署模式: - - - -## 3. 产品特性 - -TimechoDB 具备以下优势和特性: - -- 灵活的部署方式:支持云端一键部署、终端解压即用、终端-云端无缝连接(数据云端同步工具) - -- 低硬件成本的存储解决方案:支持高压缩比的磁盘存储,无需区分历史库与实时库,数据统一管理 - -- 层级化的测点组织管理方式:支持在系统中根据设备实际层级关系进行建模,以实现与工业测点管理结构的对齐,同时支持针对层级结构的目录查看、检索等能力 - -- 高通量的数据读写:支持百万级设备接入、数据高速读写、乱序/多频采集等复杂工业读写场景 - -- 丰富的时间序列查询语义:支持时序数据原生计算引擎,支持查询时时间戳对齐,提供近百种内置聚合与时序计算函数,支持面向时序特征分析和AI能力 - -- 高可用的分布式系统:支持HA分布式架构,系统提供7*24小时不间断的实时数据库服务,一个物理节点宕机或网络故障,不会影响系统的正常运行;支持物理节点的增加、删除或过热,系统会自动进行计算/存储资源的负载均衡处理;支持异构环境,不同类型、不同性能的服务器可以组建集群,系统根据物理机的配置,自动负载均衡 - -- 极低的使用&运维门槛:支持类 SQL 语言、提供多语言原生二次开发接口、具备控制台等完善的工具体系 - -- 丰富的生态环境对接:支持Hadoop、Spark等大数据生态系统组件对接,支持Grafana、Thingsboard、DataEase等设备管理和可视化工具 - -## 4. 
企业特性 - -### 4.1 更高阶的产品功能 - -TimechoDB 在 Apache IoTDB 基础上提供了更多高阶产品功能,在内核层面针对工业生产场景进行原生升级和优化,如多级存储、云边协同、可视化工具、安全增强等功能,能够让用户无需过多关注底层逻辑,将精力聚焦在业务开发中,让工业生产更简单更高效,为企业带来更多的经济效益。如: - -- 双活部署:双活通常是指两个独立的单机(或集群),实时进行镜像同步,它们的配置完全独立,可以同时接收外界的写入,每一个独立的单机(或集群)都可以将写入到自己的数据同步到另一个单机(或集群)中,两个单机(或集群)的数据可达到最终一致。 - -- 数据同步:通过数据库内置的同步模块,支持数据由场站向中心汇聚,支持全量汇聚、部分汇聚、级联汇聚等各类场景,可支持实时数据同步与批量数据同步两种模式。同时提供多种内置插件,支持企业数据同步应用中的网闸穿透、加密传输、压缩传输等相关要求。 - -- 多级存储:通过升级底层存储能力,支持根据访问频率和数据重要性等因素将数据划分为冷、温、热等不同层级的数据,并将其存储在不同介质中(如 SSD、机械硬盘、云存储等),同时在查询过程中也由系统进行数据调度。从而在保证数据访问速度的同时,降低客户数据存储成本。 - -- 安全增强:通过白名单、审计日志等功能加强企业内部管理,降低数据泄露风险。 - -详细功能对比如下: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
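-| Category | Feature | Apache IoTDB | TimechoDB |
-| :--- | :--- | :--- | :--- |
-| Deployment mode | Standalone deployment | | |
-| Deployment mode | Distributed deployment | | |
-| Deployment mode | Dual-active deployment | - | |
-| Deployment mode | Container deployment | Partially supported | |
-| Database features | Measurement point management | | |
-| Database features | Data writing | | |
-| Database features | Data query | | |
-| Database features | Continuous query | | |
-| Database features | Trigger | | |
-| Database features | User-defined function | | |
-| Database features | Permission management | | |
-| Database features | Data synchronization | File sync only, no built-in plugins | Real-time sync + file sync, rich built-in plugins |
-| Database features | Stream processing | Framework only, no built-in plugins | Framework + rich built-in plugins |
-| Database features | Tiered storage | - | |
-| Database features | View | - | |
-| Database features | Whitelist | - | |
-| Database features | Audit log | - | |
-| Supporting tools | Visual console | - | |
-| Supporting tools | Cluster management tool | - | |
-| Supporting tools | System monitoring tool | - | |
-| Domestic adaptation | Domestic compatibility certification | - | |
-| Technical support | Expert services | - | |
-| Technical support | Usage training | - | |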
- -### 4.2 更高效/稳定的产品性能 - -TimechoDB 在 Apache IoTDB 的基础上优化了稳定性与性能,经过企业版技术支持,能够实现10倍以上性能提升,并具有故障及时恢复的性能优势。 - -### 4.3 更用户友好的工具体系 - -TimechoDB 将为用户提供更简单、易用的工具体系,通过集群监控面板(IoTDB Grafana)、数据库控制台(IoTDB Workbench)、集群管理工具(IoTDB Deploy Tool,简称 IoTD)等产品帮助用户快速部署、管理、监控数据库集群,降低运维人员工作/学习成本,简化数据库运维工作,使运维过程更加方便、快捷。 - -- 集群监控面板:旨在解决 IoTDB 及其所在操作系统的监控问题,主要包括:操作系统资源监控、IoTDB 性能监控,及上百项内核监控指标,从而帮助用户监控集群健康状态,并进行集群调优和运维。 - -
-[Figures: Overview; Operating system resource monitoring; IoTDB performance monitoring]
- -- 数据库控制台:旨在提供低门槛的数据库交互工具,通过提供界面化的控制台帮助用户简洁明了的进行元数据管理、数据增删改查、权限管理、系统管理等操作,简化数据库使用难度,提高数据库使用效率。 - - -
-[Figures: Home page; Metadata management; SQL query]
- - -- 集群管理工具:旨在解决分布式系统多节点的运维难题,主要包括集群部署、集群启停、弹性扩容、配置更新、数据导出等功能,从而实现对复杂数据库集群的一键式指令下发,极大降低管理难度。 - - -
- -### 4.4 更专业的企业技术服务 - -TimechoDB 客户提供强大的原厂服务,包括但不限于现场安装及培训、专家顾问咨询、现场紧急救助、软件升级、在线自助服务、远程支持、最新开发版使用指导等服务。同时,为了使 IoTDB 更契合工业生产场景,我们会根据企业实际数据结构和读写负载,进行建模方案推荐、读写性能调优、压缩比调优、数据库配置推荐及其他的技术支持。如遇到部分产品未覆盖的工业化定制场景,TimechoDB 将根据用户特点提供定制化开发工具。 - -相较于 Apache IoTDB,每 2-3 个月一个发版周期,TimechoDB 提供周期更快的发版频率,同时针对客户现场紧急问题,提供天级别的专属修复,确保生产环境稳定。 - - -### 4.5 更兼容的国产化适配 - -TimechoDB 代码自研可控,同时兼容大部分主流信创产品(CPU、操作系统等),并完成与多个厂家的兼容认证,确保产品的合规性和安全性。 \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/IoTDB-Introduction/Release-history_timecho.md b/src/zh/UserGuide/Master/Table/IoTDB-Introduction/Release-history_timecho.md deleted file mode 100644 index 729b1f060..000000000 --- a/src/zh/UserGuide/Master/Table/IoTDB-Introduction/Release-history_timecho.md +++ /dev/null @@ -1,675 +0,0 @@ - -# 发布历史 - -## 1. TimechoDB(数据库内核) - -### V2.0.9.4 - -> 发版时间:2026.06.10
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.9.4-bin.zip
-> SHA512 校验码:040ebdd9e45d93535e9628cf377003d560be83cec9737f5a5fbd0c3a93a12810814094752eac3eacdfec5cddcf433fa83e76edc14be34c73c1a54d9b937ea1b5 - -V2.0.9.4 版本主要优化了表模型 AINode 的推理功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- AINode:表模型协变量推理模型自适应支持填充空值 - - -### V2.0.9.3 - -> 发版时间:2026.05.14
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.9.3-bin.zip
-> SHA512 校验码:f6c5d50cbf8902503289884f073593c650ffdc8edbebfabf27f6ab4499630749331aa4ed09dd34627a39fa8dee27b4d7e2689d0ed1cf23c76dd9c7270f9fae2a - -V2.0.9.3 版本 AINode 新增支持同一套模型代码搭配不同模型权重分别注册为模型的功能,同时对历史版本进行改进和 bug 修复,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- AINode:[支持同一套模型代码搭配不同模型权重分别注册为模型](../AI-capability/AINode_Upgrade_timecho.md#_4-3-注册自定义模型) - - -### V2.0.9.2 - -> 发版时间:2026.05.11
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.9.2-bin.zip
-> SHA512 校验码:10d3f34b6e65ad5c09b1cf3538ee27e181cc38c5fedf6acfd7d7053797ca23c76245683536275b69bd478aa1e43364351eceef1948832ab663a7398665af9eff - -V2.0.9.2 版本 新增 Object 类型导入导出功能,新增脚本 tsfile-backup(目前仅支持表模型),同时对历史版本进行改进和 bug 修复,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 脚本与工具:表模型[import-data 脚本 TsFile 格式](../Tools-System/Data-Import-Tool_timecho.md#_2-4-tsfile-格式)支持 object 类型数据导入 -- 脚本与工具:表模型新增 [tsfile-backup 脚本](../Tools-System/Data-Export-Tool_timecho.md#_3-基于-pipe-框架的-tsfilebackup) -- 流处理模块:表模型 PIPE 支持 [Object 类型数据本地导出和远程传输](../User-Manual/Data-Sync_timecho.md#_3-9-object-类型数据导出) -- 系统模块:[审计日志](../User-Manual/Audit-Log_timecho.md)支持慢请求个数统计 - - -### V2.0.9.1 - -> 发版时间:2026.05.11
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.9.1-bin.zip
-> SHA512 校验码:18ff3801ba58550e06ef0aa4bf4465e8ce1b31d1aecb9c6899eb843f5d9187d3cc575e930ee38d96b87b17067e2b21f1852ab5127eac7480cf5051c20a68894b - -V2.0.9.1 版本新增 AINode 协变量分类推理能力,支持 schema级/表级存储空间统计功能,数据查询新增集合操作、CTE 及多个内置函数,支持通过 DEBUG SQL 调试查询,支持配置开机自启等,同时对历史版本进行改进和 bug 修复,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- AINode:表模型支持[时序数据分类推理](../AI-capability/AINode_Upgrade_timecho.md#_4-1-模型推理) -- 查询模块:表模型支持[集合操作(UNION/INTERSECT/EXCEPT)](../SQL-Manual/Set-Operations_timecho.md)及 [共用表表达式(CTE)](../SQL-Manual/Common-Table-Expression_timecho.md) -- 查询模块:表模型新增 [IF 标量函数](../SQL-Manual/Basis-Function_timecho.md#_8-3-if-表达式)、[二进制函数](../SQL-Manual/Basis-Function_timecho.md#_7-二进制函数)、[APPROX_PERCENTILE 聚合函数](../SQL-Manual/Basis-Function_timecho.md#_2-聚合函数) -- 查询模块:支持 [DEBUG SQL](../User-Manual/Maintenance-statement_timecho.md#_6-调试查询),优化 [Explain Analyze](../User-Manual/Query-Performance-Analysis.md) 结果集 -- 查询模块:支持 [schema级](../../latest/User-Manual/Maintenance-statement_timecho.md#_1-10-查看磁盘空间占用情况)/[表级](../Reference/System-Tables_timecho.md#_2-22-table-disk-usage-表)存储空间统计,支持 [show configuration 语句](../User-Manual/Maintenance-statement_timecho.md#_1-13-查看节点配置信息)查看集群配置信息 -- 脚本与工具:数据/元数据导入导出工具支持 SSL 协议 -- 脚本与工具:命令行工具支持展示[访问历史功能](../Tools-System/CLI_timecho.md#_4-访问历史功能) -- 系统模块:支持配置[开机自启](../User-Manual/Auto-Start-On-Boot_timecho.md) -- 其他:修复安全漏洞 CVE-2026-28564 - - -### V2.0.8.3 - -> 发版时间:2026.04.21
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.8.3-bin.zip
-> SHA512 校验码:4b95bea87cc375bc455897dcf4cec80692421fa5c3eee746e1095b94288611d4afdd94aa8dad70340757d041757758924701cbdb2b73b49fb8730c4caac2a126 - -V2.0.8.3 版本新增 Python 读写 Object 类型数据的能力,同时对历史版本进行改进和 bug 修复,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 接口模块:表模型[Python 原生接口](../API/Programming-Python-Native-API_timecho.md)支持读写 Object 类型数据 - - -### V2.0.8.2 - -> 发版时间:2026.03.31
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.8.2-bin.zip
-> SHA512 校验码:02ab10e3e94786dd5676e0a69609eef192afd90d87f4d8d7bd44e7e9cbc8a18d61ba5668bae56cb8e4416ac71a877f760963b72ca7838d7c39ae10f1ed321d89 - -V2.0.8.2 版本新增树模型修改序列全名功能,表模型支持自定义 Time 列列名,树、表双模型支持更改数据类型,ODBC Driver等,同时对历史版本进行改进和 bug 修复,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 存储模块:树模型支持[修改序列全名](../../latest/Basic-Concept/Operate-Metadata_timecho.md#_2-4-修改时间序列名称),支持[更改序列数据类型](../../latest/Basic-Concept/Operate-Metadata_timecho.md#_2-3-修改时间序列数据类型) -- 存储模块:表模型支持[更改列数据类型](../Basic-Concept/Table-Management_timecho.md#_1-5-修改表),支持[自定义 Time 列列名](../Basic-Concept/Table-Management_timecho.md#_1-1-创建表) -- 接口模块:支持 [ODBC Driver](../API/Programming-ODBC_timecho.md), Python SessionDataset 支持分批获取 DataFrame,MQTT 服务外置并新增系统表 Services 提供服务查询 -- AINode:表模型支持自适应[协变量推理](../AI-capability/AINode_Upgrade_timecho.md#_4-1-模型推理) -- 流处理模块:树模型数据同步 pipe 语句中支持填写多个精确路径的 path - -### V2.0.8.1 - -> 发版时间:2026.02.04
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.8.1-bin.zip
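-> SHA512 checksum: 49d97cbf488443f8e8e73cc39f6f320b3bc84b194aed90af695ebd5771650b5e5b6a3abb0fb68059bd01827260485b903c035657b337442f4fdd32c877f2aca3
-
-V2.0.8.1 adds the Object data type to the table model, strengthens and upgrades the audit log feature, optimizes the tree-model OPC UA protocol, and adds covariate forecasting and concurrent inference to AINode, along with across-the-board improvements to database monitoring, performance, and stability. Release details:
-
-- Query module: adds a list of available DataNodes, so you can [view each node's RPC address and port](../User-Manual/Maintenance-statement_timecho.md#_1-7-查看可用节点)
-- Query module: the table model adds a [system table that records query latency statistics](../Reference/System-Tables_timecho.md#_2-20-queries-costs-histogram-表)
-- Storage module: supports viewing, via SQL, the full definition statement used to create a [table](../Basic-Concept/Table-Management_timecho.md#_1-4-查看表的创建信息) or a [view](../User-Manual/Tree-to-Table_timecho.md#_2-4-查看表视图)
-- Storage module: optimizes the tree-model [OPC UA protocol](../../latest/API/Programming-OPC-UA_timecho.md)
-- System module: the table model adds the [Object data type](../Background-knowledge/Data-Type_timecho.md)
-- System module: strengthens and upgrades the [audit log](../User-Manual/Audit-Log_timecho.md) feature
-- System module: the table model adds a system table for DataNode [connection status](../Reference/System-Tables_timecho.md#_2-18-connections-表)
-- AINode: ships the built-in chronos-2 model and supports [covariate forecasting](../AI-capability/AINode_Upgrade_timecho.md)
-- AINode: the built-in Timer-XL and Sundial models support [concurrent inference](../AI-capability/AINode_Upgrade_timecho.md)
-- Stream processing module: creating a full-sync pipe now [automatically splits](../User-Manual/Data-Sync_timecho.md#_2-1-创建任务) it into two independent pipes, one real-time and one historical; the remaining event count of each can be checked with the show pipes statement
-- Other: fixes security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226
-
-### V2.0.6.6
-
-> Release date: 2026.01.20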
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.6.6-bin.zip
-> SHA512 校验码:d12e60b8119690d63c501d0c2afcd527e39df8a8786198e35b53338e21939e1a9244805e710d81cbb62d02c2739909d7e8227c029660a0cd9ea7ca718cf9bdf6 - -V2.0.6.6 版本主要优化了树模型时间序列的查询性能,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 查询模块:优化了 show/count timeseries/devices 的查询性能 - -### V2.0.6.4 - -> 发版时间:2025.11.17
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.6.4-bin.zip
-> SHA512 校验码:57b9998cc14632862c32b6781c70db1c52caf8172b5d45d27cc214cab50d3afd4230ed0754e1c1a4ed825666bf971dc81fbb7d3b93261e57e9dabc20e794a2b8 - -V2.0.6.4 版本主要优化了存储以及 AINode 模块的相关功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 存储模块:支持树模型修改时间序列的编码及压缩方式 -* AINode:支持一键部署,优化了模型推理功能 - -### V2.0.6.1 - -> 发版时间:2025.09.19
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.6.1-bin.zip
-> SHA512 校验码:c88e3e2c0dbd06578bd0697ca9992880b300baee2c4906ba1f952134e37ae2fa803a6af236f4541d318b75f43a498b5d5bfbbc7c445783271076c36e696e4dd0 - -V2.0.6.1 版本新增表模型查询写回功能,新增访问控制黑白名单功能,新增位操作函数(内置标量函数)以及可下推的时间函数,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 查询模块:支持表模型查询写回功能 -* 查询模块:表模型行模式识别支持使用聚合函数,捕获连续数据进行分析计算 -* 查询模块:表模型新增内置标量函数-位操作函数 -* 查询模块:表模型新增可下推的 EXTRACT 时间函数 -* 系统模块:新增访问控制,支持用户自定义配置黑白名单功能 -* 其他:用户默认密码更新为安全强度更高的“TimechoDB@2021” - -### V2.0.5.2 - -> 发版时间:2025.08.08
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.5.2-bin.zip
-> SHA512 校验码:a00a4075c9937b7749c454f71d2480fea5e9ff9659c0628b132e30e2f256c7c537cd91dca4f6be924db0274bb180946a1b88e460c025bf82fdb994a3c2c7b91e - -V2.0.5.2 版本修复了部分产品缺陷,优化了数据同步功能,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V2.0.5.1 - -> 发版时间:2025.07.14
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.5.1-bin.zip
-> SHA512 校验码:aa724755b659bf89a60da6f2123dfa91fe469d2e330ed9bd029e8f36dd49212f3d83b1025e9da26cb69315e02f65c7e9a93922e40df4f2aa4c7f8da8da2a4cea - -V2.0.5.1 版本新增树转表视图、表模型窗口函数、聚合函数 approx\_most\_frequent,并支持 LEFT & RIGHT JOIN、ASOF LEFT JOIN;AINode 新增 Timer-XL、Timer-Sundial 两种内置模型,支持树、表模型推理及微调功能,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 查询模块:支持手动创建树转表视图 -* 查询模块:表模型新增窗口函数 -* 查询模块:表模型新增聚合函数 approx\_most\_frequent -* 查询模块:表模型 JOIN 功能扩展,支持 LEFT & RIGHT JOIN、ASOF LEFT JOIN -* 查询模块:表模型支持行模式识别,可捕获连续数据进行分析计算 -* 查询模块:表模型新增多个系统表,例如:VIEWS(表视图信息)、MODELS(模型信息)等 -* 系统模块:新增 TsFile 数据文件加密功能 -* AI 模块:AINode 新增 Timer-XL、Timer-Sundial 两种内置模型 -* AI 模块:AINode 支持树模型、表模型的推理及微调功能 -* 其他模块:支持通过 OPC DA 协议发布数据 - -### 2.x 其他历史版本 - -#### V2.0.4.2 - -> 发版时间:2025.06.21
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.4.2-bin.zip
-> SHA512 校验码:31f26473ac90988ce970dac8d0950671bde918f9af6f2f6a6c2bf99a53aa1c0a459c53a137b18ff0b28e70952e9c4b6acb50029e0b2e38837b969eb8f78f2939 - -V2.0.4.2 版本支持了传递 TOPIC 给 MQTT 自定义插件,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V2.0.4.1 - -> 发版时间:2025.06.03
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.4.1-bin.zip
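-> SHA512 checksum: 93ac08bfae06aff6db04849f474458433026f66778f4f5c402eb22f1a7cb14d8096daf0a9e9cc365ddfefd4f8ca4443b2a9fb6461906f056b1e6a344990beb3a
-
-V2.0.4.1 adds user-defined table functions (UDTF) and several built-in table functions to the table model, adds the aggregate function approx\_count\_distinct, adds support for ASOF INNER JOIN on the time column, and reorganizes the script tools by category, moving the Windows-specific scripts into their own directory, along with across-the-board improvements to database monitoring, performance, and stability. Release details:
-
-* Query module: the table model adds user-defined table functions (UDTF) and several built-in table functions
-* Query module: the table model supports ASOF INNER JOIN on the time column
-* Query module: the table model adds the aggregate function approx\_count\_distinct
-* Stream processing: supports loading TsFile asynchronously via SQL
-* System module: when scaling in, replica selection supports a disaster-tolerant load-balancing strategy
-* System module: adds support for Windows Server 2025
-* Scripts and tools: reorganizes the script tools by category and separates out the Windows-specific scripts
-
-#### V2.0.3.4
-
-> Release date: 2025.06.13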
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.3.4-bin.zip
-> SHA512 校验码:d80d34b7d3890def75b17c491fc4c13efc36153a5950a9b23744755d04d6adb5d6ab9ec970101183fef7bfeb8a559ef92fce90d2d22f7b7fd5795cd5589461bb - -V2.0.3.4版本将用户密码的加密算法变更为 SHA-256,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V2.0.3.3 - -> 发版时间:2025.05.16
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.3.3-bin.zip
-> SHA512 校验码:f47e3fb45f869dbe690e7cfaa93f95e5e08a462b362aa9d7ccac7ee5b55022dc8f62db12009dfde055f278f3003ff9ea7c22849d52a3ef2c25822f01ade78591 - -V2.0.3.3 版本新增元数据导入导出脚本适配表模型、Spark 生态集成(表模型)、AINode 返回结果新增时间戳,表模型新增部分聚合函数和标量函数,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 查询模块:表模型新增聚合函数 count\_if 和标量函数 greatest / least -* 查询模块:表模型全表 count(\*) 查询性能显著提升 -* AI 模块:AINode 返回结果新增时间戳 -* 系统模块:表模型元数据模块性能优化 -* 系统模块:表模型支持主动监听并加载 TsFile 功能 -* 系统模块:新增 TsFile 解析转换时间、TsFile 转 Tablet 数量等监控指标 -* 生态集成:表模型生态拓展集成 Spark -* 脚本与工具:import-schema、export-schema 脚本支持表模型元数据导入导出 - -#### V2.0.3.2 - -> 发版时间:2025.05.15
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.3.2-bin.zip
-> SHA512 校验码:76bd294de4b01782e5dd621a996aeb448e4581f98c70fb5b72b17dc392c2e1227c0d26bd3df5533669a80f217a83a566bc6ec926b7efd21ce7a89b894cd33e19 - -V2.0.3.2版本修复了部分产品缺陷,优化了节点移除功能,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V2.0.2.1 - -> 发版时间:2025.04.07
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.2.1-bin.zip
-> SHA512 校验码:a41be3f8c57e6a39ac165f1d6ab92c9ed790b0712528f31662c58617f4c94e6bfc9392a9c1ef2fc5bdd8c7ca79901389f368cbdbec3e5b1d5c1ce155b2f1a457 - -V2.0.2.1 版本新增了表模型权限管理、用户管理以及相关操作鉴权,并新增了表模型 UDF、系统表和嵌套查询等功能。此外,持续优化数据订阅机制,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 查询模块:新增表模型 UDF 的管理、用户自定义标量函数(UDSF)和用户自定义聚合函数(UDAF) -* 查询模块:用户可通过配置项控制 UDF、PipePlugin、Trigger 和 AINode 通过 URI 加载 jar 包 -* 查询模块:表模型支持权限管理、用户管理以及相关操作鉴权 -* 查询模块:新增系统表及多种运维语句,优化系统管理 -* 系统模块:CSharp 客户端支持表模型 -* 系统模块:新增表模型 C++ Session 写入接口 -* 系统模块:多级存储支持符合 S3 协议的非 AWS 对象存储系统 -* 系统模块:UDF 函数拓展,新增 pattern\_match 模式匹配函数 -* 数据同步:表模型支持元数据同步及同步删除操作 - -#### V2.0.1.2 - -> 发版时间:2025.01.25
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.1.2-bin.zip
-> SHA512 校验码:51c2fa5da2974a8a3c8871dec1c49bd98e5d193a13ef33ac7801adb833a1e360d74f0160bcdf33c7ffb23a5c5e0f376e26a4315cf877f1459483356285b85349 - -V2.0.1.2 版本正式实现树表双模型配置,并配合表模型支持标准 SQL 查询语法、多种函数和运算符、流处理、Benchmark 等功能。此外,该版本更新还包括:Python 客户端支持四种新数据类型,支持只读模式下的数据库删除操作,脚本工具同时兼容 TsFile、CSV 和 SQL 数据的导入导出,对 Kubernetes Operator 的生态集成等功能,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 时序表模型:IoTDB 支持了时序表模型,提供的 SQL 语法包括 SELECT、WHERE、JOIN、GROUP BY、ORDER BY、LIMIT 子句和嵌套查询 -* 查询模块:表模型支持多种函数和运算符,包括逻辑运算符、数学函数以及时序特色函数 DIFF 等 -* 查询模块:用户可通过配置项控制 UDF、PipePlugin、Trigger 和 AINode 通过 URI 加载 jar 包 -* 存储模块:表模型支持通过 Session 接口进行数据写入,Session 接口支持元数据自动创建 -* 存储模块:Python 客户端新增支持四种新数据类型:`String`、`Blob`、`Date` 和 `Timestamp` -* 存储模块:优化同种类合并任务优先级的比较规则 -* 流处理模块:支持在发送端指定接收端鉴权信息 -* 流处理模块:TsFile Load 支持表模型 -* 流处理模块:流处理插件适配表模型 -* 系统模块:增强了 DataNode 缩容的稳定性 -* 系统模块:在 readonly 状态下,支持用户进行 drop database 操作 -* 脚本与工具:Benchmark 工具适配表模型 -* 脚本与工具: Benchmark 工具支持四种新数据类型:`String`、`Blob`、`Date` 和 `Timestamp` -* 脚本与工具:data/export-data 脚本扩展,支持新数据类型(字符串、大二进制对象、日期、时间戳) -* 脚本与工具:import-data/export-data 脚本迭代,同时兼容 TsFile、CSV 和 SQL 三种类型数据的导入导出 -* 生态集成:支持 Kubernetes Operator - - -### V1.3.7.3 - -> 发版时间:2026.06.02
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.7.3-bin.zip
-> SHA512 校验码:8e6cde061421a552b9855f39f9cccd4838c820dc15ef0ad2a7c23a54cd6cc4f06c35190c1f428784e6a4d5463dd1b794f58ff5cdf891f27f6d0be4d3ab00bf6f - -V1.3.7.3 版本主要优化了查询模块和数据同步等功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 查询模块:优化 Last 查询、对齐序列查询、倒序时间过滤查询等场景 -- 元数据模块:优化已激活序列及其子路径下的设备创建校验 -- 数据同步:优化同步失败后的重试机制 -- 数据同步:跨网闸同步插件支持配置实时写入传输超时时间 -- 接口模块:Go 客户端写入接口增加错误码校验 -- 接口模块:优化 C# 客户端连接池管理 - - -### V1.3.7.2 - -> 发版时间:2026.04.07
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.7.2-bin.zip
-> SHA512 校验码:787766af64992069f0db0ac8b250b461d799307b3ce06b0782fc25752c8c5307fa2205c9e3a38a41685b81bb6b4b5c1ec9f71a395bfad285caf90de7b8224783 - -V1.3.7.2 版本主要优化了数据同步和查询模块的相关功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 数据同步:优化 Pipe 复杂路径匹配场景下的分发性能 -- 查询模块:Show Queries 语句新增客户端 IP、查询超时时间、服务端等待时间等信息 -- 生态集成:支持 IoTDB 以 OPC Client 模式向外部 OPC Server 推送数据 - - -### V1.3.6.6 - -> 发版时间:2026.01.20
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.6.6-bin.zip
-> SHA512 校验码:590d3ead053298c6df0ede637572ba598b9b684f8b35ab874bd4452f765e1421938f4cca2cf0423af2e806592aa8b15bdd25b41df7de809435a4d0239fc04790 - -V1.3.6.6 版本优化了数据的读写功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V1.3.6.3 - -> 发版时间:2026.01.04
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.6.3-bin.zip
-> SHA512 校验码:43719a1384f59f63cb0029cdda0aba433383cd1a0f5ebc142e54f8aa6623cc30a7efb3e3aef7f3d485d5e07bec91be215c92ed21b5201613d5cc44044251c978 - -V1.3.6.3 版本主要围绕查询性能、内存管理机制两大核心方向进行了深度优化,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 查询模块:优化多种场景的查询性能,包括多序列 Last 查询等 -* 查询模块:Java SDK 新增 FastLastQuery 接口,支持更高效的 Last 查询操作 -* 查询模块:树模型 fetchSchema 调整为分段流式返回,提升大数据量场景下的响应速度 -* 存储模块:优化内存管理,避免内存泄漏风险,保障系统长期稳定运行 -* 存储模块:优化文件合并机制,提升合并处理效率,优化系统存储资源占用 -* 其他:修复安全漏洞 CVE-2025-12183,CVE-2025-66566 and CVE-2025-11226 - - -### V1.3.6.1 - -> 发版时间:2025.12.09
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.6.1-bin.zip
-> SHA512 校验码:9fb6a6870aa2133bfc40508324a7d97ee078d0d44895beef7b0a331edd203419119fb02b933f585b6c4a6fe9b59708a053d7cf65206b22b1a4f01a5fe518424c - -V1.3.6.1 版本主要围绕数据同步稳定性这一核心方向进行了深度优化,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 数据同步:优化 Pipe SQL 参数配置,支持指定异步加载方式 -* 数据同步:新增语法糖功能,可将全量 Pipe 创建 SQL 自动拆分为实时同步与历史同步两类 -* 系统模块:新增全局数据类型压缩方式配置项,支持按需调整存储压缩策略 - - -### V1.3.5.11 - -> 发版时间:2025.09.24
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.11-bin.zip
-> SHA512 校验码:f18419e20c0d7e9316febee5a053306a97268cb07e18e6933716c2ef98520fbbe051dfa1da02a9c83e8481a839ce35525ce6c50f890f821e3d760f550c75f804 - -V1.3.5.11 版本主要优化了数据同步功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V1.3.5.10 - -> 发版时间:2025.08.27
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.10-bin.zip
-> SHA512 校验码:3aea6d2318f52b39bfb86dae9ff06fe1b719fdeceaabb39278c9a73544e1ceaf0660339f9342abb888c8281a0fb6144179dac9bb0c40ba0ecc66bac4dd7cbe80 - -V1.3.5.10 版本修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V1.3.5.9 - -> 发版时间:2025.08.25
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.9-bin.zip
-> SHA512 校验码:95b7a6790e94dc88e355a81e5a54b10ee87bdadae69ba0b215273967b3422178d5ee81fa5adf1c5380a67dbb30cf9782eaa3cbfd6ec744b0fd9a91c983ee8f70 - -V1.3.5.9 版本优化了内存控制,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 -### 1.x 其他历史版本 - -#### V1.3.5.8 - -> 发版时间:2025.08.19
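-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.8-bin.zip
-> SHA512 checksum: aa9802301614e20294a7f2fc4c149ba20d58213d9b74e8f8c607e0f4860949bad164bce2851b63c1d39b7568d62975ab257c269b3a9c168a29ea3945b6d28982
-
-V1.3.5.8 optimizes data synchronization, fixes some product defects, and comprehensively improves database monitoring, performance, and stability.
-
-#### V1.3.5.7
-
-> Release date: 2025.08.13
-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.7-bin.zip
-> SHA512 checksum: 17374a440267aed3507dcc8cf4dc8703f8136d5af30d16206a6e1101e378cbbc50eda340b1598a12df35fe87d96db20f7802f0e64033a013d4b81499198663d4
-
-V1.3.5.7 optimizes data synchronization, fixes some product defects, and comprehensively improves database monitoring, performance, and stability.
-
-#### V1.3.5.6
-
-> Release date: 2025.07.16
-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.6-bin.zip
-> SHA512 checksum: 05b9fda4d98ba8a1c9313c0831362ed3d667ce07cb00acaeabcf6441a6d67dff7da27f3fda2a5e1b3c3b85d1e5c730a534f3aa2f0c731b8c03ef447203b32493
-
-V1.3.5.6 adds a configuration switch for disabling the data subscription feature. It also optimizes the C++ high-availability client, PIPE synchronization latency in three scenarios (normal operation, restart, and deletion), and queries involving large TEXT objects, and comprehensively improves database monitoring, performance, and stability.
-
-#### V1.3.5.4
-
-> Release date: 2025.06.19
-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.4-bin.zip
-> SHA512 checksum: edac5f8b70dd67b3f84d3e693dc025a10b41565143afa15fc0c4937f8207479ffe2da787cc9384440262b1b05748c23411373c08606c6e354ea3dcdba0371778
-
-V1.3.5.4 fixes some product defects, optimizes node removal, and comprehensively improves database monitoring, performance, and stability.
-
-#### V1.3.5.3
-
-> Release date: 2025.06.13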
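-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.3-bin.zip
-> SHA512 checksum: 5f807322ceec9e63a6be86108cc57e7ad4251b99a6c28baf11256ab65b2145768e9110409f89834d5f4256094a8ad995775c0e59a17224ff2627cd9354e09d82
-
-V1.3.5.3 mainly optimizes data synchronization: it persists PIPE sending progress, adds a monitoring metric for PIPE event transfer time, and fixes related defects. It also changes the encryption algorithm for user passwords to SHA-256 and comprehensively improves database monitoring, performance, and stability.
-
-#### V1.3.5.2
-
-> Release date: 2025.06.10
-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.2-bin.zip
-> SHA512 checksum: 4c0a5db76c6045dfd27cce303546155cdb402318024dae5f999f596000d7b038b13bbeac39068331b5c6e2c80bc1d89cd346dd0be566fe2fe865007d441d9d05
-
-V1.3.5.2 mainly optimizes data synchronization: cascading can be configured through parameters, the synchronization order is kept fully consistent with the real-time write order, and historical and real-time data are sent separately after a system restart. It also comprehensively improves database monitoring, performance, and stability.
-
-#### V1.3.5.1
-
-> Release date: 2025.05.15
-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.1-bin.zip
-> SHA512 checksum: 91f22bafbdd4d580126ed59ba1ba99d14209f10ce4a0a4bd7d731943ac99fdb6ebfab6e3a1e294a7cb7f46367e9fd4252b0d9ac4d4240ddedf6d85658e48f212
-
-V1.3.5.1 fixes some product defects and comprehensively improves database monitoring, performance, and stability.
-
-#### V1.3.4.2
-
-> Release date: 2025.04.14
-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.4.2-bin.zip
-> SHA512 checksum: 52fbd79f5e7256e7d04edc8f640bb8d918e837fedd1e64642beb2b2b25e3525b5f5a4c92235f88f6f7b59bfcdf096e4ea52ab85bfef0b69274334470017a2c5b2
-
-V1.3.4.2 optimizes data synchronization: data forwarded from an external PIPE can now be synchronized between dual-active clusters.
-
-
-#### V1.3.4.1
-
-> Release date: 2025.01.08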
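-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.4.1-bin.zip
-> SHA512 checksum: e9d46516f1f25732a93cc915041a8e59bca77cf8a1018c89d18ed29598540c9f2bdf1ffae9029c87425cecd9ecb5ebebea0334c7e23af11e28d78621d4a78148
-
-V1.3.4.1 adds a pattern matching function, continues to optimize the data subscription mechanism for better stability, extends the import-data/export-data scripts to new data types, and merges those scripts so that they handle TsFile, CSV, and SQL data alike. It also comprehensively improves database monitoring, performance, and stability. The release includes:
-
-- Query module: A configuration item controls whether UDF, PipePlugin, Trigger, and AINode may load jar packages via URI
-- System module: Extended UDF functions with the new pattern_match pattern matching function
-- Data sync: The sender can specify authentication information for the receiver
-- Ecosystem integration: Supports Kubernetes Operator
-- Scripts and tools: The import-data/export-data scripts support new data types (string, large binary object, date, timestamp)
-- Scripts and tools: The import-data/export-data scripts now handle import and export of TsFile, CSV, and SQL data
-
-#### V1.3.3.3
-
-> Release date: 2024.10.31
-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.3.3-bin.zip
-> SHA512 checksum: 4a3eceda479db3980e9c8058628e71ba5a16fbfccf70894e8181aea5e014c7b89988d0093f6d42df29d478340a33878602a3924bec13f442a48611cec4e0e961
-
-V1.3.3.3 optimizes restart recovery to reduce startup time, lets DataNode actively watch for and load TsFiles (with added observability metrics), lets the receiver automatically load files into IoTDB after the sender transfers them to a specified directory, and adds Alter Source support to Alter Pipe. It also comprehensively improves database monitoring, performance, and stability. The release includes:
-
-- Data sync: The receiver automatically converts inconsistent data types
-- Data sync: Enhanced receiver observability, with ops/latency statistics for multiple internal interfaces
-- Data sync: The opc-ua-sink plugin supports CS mode access and non-anonymous access
-- Data subscription: The SDK supports create if not exists and drop if exists interfaces
-- Stream processing: Alter Pipe supports Alter Source
-- System module: Added latency monitoring for the rest module
-- Scripts and tools: Supports automatically loading TsFile files from a specified directory
-- Scripts and tools: Extended the import-tsfile script so that it can run on a different server from the IoTDB server
-- Scripts and tools: Added support for Kubernetes Helm
-- Scripts and tools: The Python client supports new data types (string, large binary object, date, timestamp)
-
-#### V1.3.3.2
-
-> Release date: 2024.8.15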
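-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.3.2-bin.zip
-> SHA512 checksum: 32733610da40aa965e5e9263a869d6e315c5673feaefad43b61749afcf534926398209d9ca7fff866c09deb92c09d950c583cea84be5a6aa2c315e1c7e8cfb74
-
-V1.3.3.2 reports the time spent reading mods files, the maximum memory used by sequence/unsequence merge sort, and the dispatch time; allows the time-partition origin to be adjusted through a parameter; ends a subscription automatically at the pipe's end-of-historical-data marker; and improves the memory-control performance of the compaction module. The release includes:
-
-- Query module: Explain Analyze reports the time spent reading mods files
-- Query module: Explain Analyze reports the maximum memory used by sequence/unsequence merge sort and the dispatch time
-- Storage module: Added splitting of compaction target files, with new configuration file parameters
-- System module: The time-partition origin can be adjusted through a parameter
-- Stream processing: Data subscription ends automatically at the pipe's end-of-historical-data marker
-- Data sync: RPC compression supports specifying the compression level
-- Scripts and tools: Data/metadata export filters out only root.__system and no longer filters data under paths that merely begin with it, such as root.__systema
-
-#### V1.3.3.1
-
-> Release date: 2024.7.12
-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.3.1-bin.zip
-> SHA512 checksum: 1fdffbc1f18bfabfa3463a5a6fbc4f6ba6ab686942f9e85e7e6be1840fb8700e0147e5e73fd52201656ae6adb572cc2e5ecc61bcad6fa4c5a4048c4207e3c6c0
-
-V1.3.3.1 adds rate limiting to tiered storage, lets the sender-side sink in data sync specify username/password authentication for the receiver, clarifies some ambiguous WARN logs on the data sync receiver, improves restart recovery to reduce startup time, and merges script contents. The release includes:
-
-- Query module: Optimized Filter performance, speeding up aggregation queries and queries with where conditions
-- Query module: The Java Session client distributes SQL query requests evenly across all nodes
-- System module: Merged the configuration files "iotdb-confignode.properties, iotdb-datanode.properties, iotdb-common.properties" into "iotdb-system.properties"
-- Storage module: Added rate limiting to tiered storage
-- Data sync: The sender-side sink can specify username/password authentication for the receiver
-- System module: Optimized restart recovery to reduce startup time
-
-#### V1.3.2.2
-
-> Release date: 2024.6.4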
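-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.2.2-bin.zip
-> SHA512 checksum: ad73212a0b5025d18d2481163f6b2d4f604e06eb5e391cc6cba7bf4e42792e115b527ed8bfb5cd95d20a150645c8b4d56a531889dac229ce0f63139a27267322
-
-V1.3.2.2 adds the explain analyze statement for analyzing the time cost of a single SQL query, a UDAF (user-defined aggregate function) framework, automatic data deletion when disk usage reaches a configured threshold, metadata synchronization, counting of data points under a specified path, and SQL import/export scripts. The cluster management tool now supports rolling upgrades and uploading plugins to the whole cluster. The release also comprehensively improves database monitoring, performance, and stability. It includes:
-
-- Storage module: Improved write performance of the insertRecords interface
-- Storage module: Added SpaceTL, which automatically deletes data when disk usage reaches a configured threshold
-- Query module: Added the Explain Analyze statement (monitors the time spent in each stage of a single SQL execution)
-- Query module: Added the UDAF user-defined aggregate function framework
-- Query module: Added envelope demodulation analysis to UDFs
-- Query module: Added the MaxBy/MinBy functions, which return the timestamp corresponding to the maximum/minimum value
-- Query module: Improved value-filter query performance
-- Data sync: Path matching supports wildcards
-- Data sync: Supports metadata synchronization (time series and related attributes, permissions, and other settings)
-- Stream processing: Added the Alter Pipe statement, supporting hot updates of Pipe task plugins
-- System module: System data point statistics now include data imported via load TsFile
-- Scripts and tools: Added a local upgrade backup tool (backs up existing data through hard links)
-- Scripts and tools: Added the export-data/import-data scripts, which export data as CSV, TsFile, or SQL statements
-- Scripts and tools: On Windows, ConfigNode, DataNode, and Cli can be distinguished by window title
-
-#### V1.3.1.4
-
-> Release date: 2024.4.23
-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.1.4-bin.zip
-> SHA512 checksum: 8547702061d52e2707c750a624730eb2d9b605b60661efa3c8f11611ca1685aeb51b6f8a93f94c1b30bf2e8764139489c9fbb76cf598cfa8bf9c874b2a7c57eb
-
-V1.3.1 adds viewing of system activation status, built-in variance/standard deviation aggregate functions, a timeout setting for the built-in Fill clause, and a tsfile repair command. It also adds scripts for one-click collection of instance information and one-click cluster start/stop, and optimizes views, stream processing, and other features to improve usability and performance. The release includes:
-
-- Query module: The Fill clause supports a fill timeout threshold; values beyond the threshold are not filled
-- Query module: The Rest interface (V2) now returns column types
-- Data sync: Simplified time range specification; start and end times can be set directly
-- Data sync: Supports the SSL transport protocol (iotdb-thrift-ssl-sink plugin)
-- System module: Cluster activation information can be queried with SQL
-- System module: Tiered storage adds transfer rate control during migration
-- System module: Improved system observability (divergence monitoring of cluster nodes, observability of the distributed task scheduling framework)
-- System module: Optimized the default log output policy
-- Scripts and tools: Added one-click cluster start/stop scripts (start-all/stop-all.sh & start-all/stop-all.bat)
-- Scripts and tools: Added one-click instance information collection scripts (collect-info.sh & collect-info.bat)
-
-#### V1.3.0.4
-
-> Release date: 2024.1.3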
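-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.0.4-bin.zip
-> SHA512 checksum: 3c07798f37c07e776e5cd24f758e8aaa563a2aae0fb820dad5ebf565ad8a76c765b896d44e7fdb7dad2e46ffd4262af901c765f9bf6af926bc62103118e38951
-
-V1.3.0.4 releases AINode, a new native machine learning framework, fully upgrades the permission module to support granting permissions at time series granularity, and refines many details of views, stream processing, and other features, further improving usability, stability, and overall performance. The release includes:
-
-- Query module: Added the AINode native machine learning module
-- Query module: Fixed the long response time of the show path statement
-- Security module: Upgraded the permission module to support permission settings at time series granularity
-- Security module: Supports SSL-encrypted communication between client and server
-- Stream processing: Added multiple metrics to the stream processing module
-- Query module: Non-writable view series support LAST queries
-- System module: Improved the accuracy of data point monitoring statistics
-
-#### V1.2.0.1
-
-> Release date: 2023.6.30
-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.2.0.1-bin.zip
-> SHA512 checksum: dcf910d0c047d148a6c52fa9ee03a4d6bc3ff2a102dc31c0864695a25268ae933a274b093e5f3121689063544d7c6b3b635e5e87ae6408072e8705b3c4e20bf0
-
-V1.2.0.1 mainly adds a stream processing framework, dynamic templates, and built-in query functions such as substring/replace/round. It enhances built-in statements such as show region, show timeseries, and show variable as well as the Session interface, optimizes built-in metrics and their implementation, and fixes some product bugs and performance issues.
-
-- Stream processing: Added the stream processing framework
-- Metadata module: Added dynamic template extension
-- Storage module: Added SPRINTZ and RLBE encodings and the LZMA2 compression algorithm
-- Query module: Added the built-in scalar functions cast, round, substr, and replace
-- Query module: Added the built-in aggregate functions time_duration and mode
-- Query module: SQL statements support the case when syntax
-- Query module: SQL statements support order by expressions
-- Interface module: The Python API supports connecting to multiple distributed nodes
-- Interface module: The Python client supports write redirection
-- Interface module: The Session API adds an interface for creating series in batches from templates
-
-#### V1.1.0.1
-
-> Release date: 2023.04.03
-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.1.0.1.zip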
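-> SHA512 checksum: 58df58fc8b11afeec8436678842210ec092ac32f6308656d5356b7819acc199f1aec4b531635976b091b61d6736f0d9706badcabeaa5de50939e5c331c1dc804
-
-V1.1.0.1 mainly adds new features, such as the GROUP BY VARIATION and GROUP BY CONDITION segmentation methods and the DIFF and COUNT_IF utility functions, and introduces a pipeline execution engine to further speed up queries. It also fixes issues including order by timeseries and LIMIT & OFFSET not taking effect in last queries on aligned series, metadata template errors after restart, and errors when creating series after all databases have been deleted.
-
-- Query module: The align by device statement supports order by time
-- Query module: Supports the Show Queries command
-- Query module: Supports the kill query command
-- System module: show regions supports specifying a particular database
-- System module: Added the SQL statement show variables, which displays the current cluster parameters
-- Query module: Aggregation queries support GROUP BY VARIATION
-- Query module: SELECT INTO supports casting to specific data types
-- Query module: Implemented the built-in scalar function DIFF
-- System module: show regions displays the creation time
-- Query module: Implemented the built-in aggregate function COUNT_IF
-- Query module: Aggregation queries support GROUP BY CONDITION
-- System module: Supports modifying dn_rpc_port and dn_rpc_address
-
-
-## 2. Workbench (Console Tool)
-
-| **Workbench version** | **Description** | **Supported IoTDB versions** | **SHA512 checksum** |
-| ---------------- | ---------------- | ---------------- | ---------------- |
-| V2.1.1 | Optimized measurement point selection on the trend page; supports scenarios without devices | V2.0 and above | aa05fd4d9f33f07c0949bc2d6546bb4b9791ed5ea94bcef27e2bf51ea141ec0206f1c12466aced7bf3449e11ad68d65378d697f3d10cb4881024a83746029a65 |
-| V2.0.1-beta | First release of the V2.x series; supports both the tree and table models | V2.0 and above | 0ca0d5029874ed8ada9c7d1cb562370b3a46913eed66d39c08759287ccc8bf332cf80bb8861e788614b61ae5d53a9f5605f553e1a607e856f395eb5102e7cc4d |
-| V1.5.7 | In the measurement point list, the point name is split into device name and measurement point; the point selection area supports horizontal scrolling; the column order of exported files matches the page | V1.3.4 and later 1.x versions | d3cd4a63372ca5d6217b67dddf661980c6a442b3b1564235e9ad34fc254d681febd58c2cc59c6273ffbfd8a1b003b9adb130ecfaaebe1942003b0d07427b1fcc |
-| V1.5.6 | Optimized CSV import/export: on import, tags and aliases are optional; on export, measurement point descriptions with quotes wrapped in backticks are supported | V1.3.4 and later 1.x versions | 276ac1ea341f468bf6d29489c9109e9aa61afe2d1caaab577bc40603c6f4120efccc36b65a58a29ce6a266c21b46837aad6128f84ba5e676231ea9e6284a35e5 |
-| V1.5.5 | Added a server clock; supports activating the database for the Enterprise Edition | V1.3.4 and later 1.x versions | b18d01b70908d503a25866d1cc69d14e024d5b10ca6fcc536932fdbef8257c66e53204663ce3be5548479911aca238645be79dfd7ee7e65a07ab3c0f68c497f6 |
-| V1.5.4 | Added authentication for the Prometheus settings in instance management | V1.3.4 and later 1.x versions | adc7e13576913f9e43a9671fed02911983888da57be98ec8fbbb2593600d310f69619d32b22b569520c88e29f100d7ccae995b20eba757dbb1b2825655719335 |
-| V1.5.1 | Added AI analysis and pattern matching | V1.3.2 and later 1.x versions | 4f2053a2a3b2b255ce195268d6cd245278f3be32ba4cf68be1552c386d78ed4424f7bdc9d8e68c6b8260b3e398c8fd23ff342439c4e88e1e777c62640d2279f9 |
-| V1.4.0 | Added tree model display and an English version | V1.3.2 and later 1.x versions | 734077f3bb5e1719d20b319d8b554ce30718c935cb0451e02b2c9267ff770e9c2d63b958222f314f16c2e6e62bf78b643255249b574ee6f37d00e123433981e8 |
-| V1.3.1 | Added analysis methods to the analysis feature; optimized template import and other features | V1.3.2 and later 1.x versions | 134f87101cc7f159f8a22ac976ad2a3a295c5435058ee0a15160892aac46ac61dd3cfb0633b4aea9cc7415bf904d0ae65aaf77d663f027d864204d81fb34768b |
-| V1.3.0 | Added database configuration; refined some version details | V1.3.2 and later 1.x versions | 94a137fc5c681b211f3e076472a9c5875d59e7f0cd6d7409cb8f66bb9e4f87577a0f12dd500e2bcb99a435860c82183e4a6514b638bcb4aecfb48f184730f3f1 |
-| V1.2.6 | Optimized permission control in each module | V1.3.1 and later 1.x versions | f345b7edcbe245a561cb94ec2e4f4d40731fe205f134acadf5e391e5874c5c2477d9f75f15dbaf36c3a7cb6506823ac6fbc2a0ccce484b7c4cc71ec0fbdd9901 |
-| V1.2.5 | Added the "common templates" concept to visualization; added page caching and other optimizations to all pages | V1.3.0 and later 1.x versions | 37376b6cfbef7df8496e255fc33627de01bd68f636e50b573ed3940906b6f3da1e8e8b25260262293b8589718f5a72180fa15e5823437bf6dc51ed7da0c583f7 |
-| V1.2.4 | Added "import/export" to the calculation feature; added a "time alignment" field to the measurement point list | V1.2.2 and later 1.x versions | 061ad1add38c109c1a90b06f1ddb7797bd45e84a34a4f77154ee48b90bdc7ecccc1e25eaa53fbbc98170d99facca93e3536192dd8d10a50ce505f59923ce6186 |
-| V1.2.3 | Added "activation details" to the home page; added analysis and other features | V1.2.2 and later 1.x versions | 254f5b7451300f6f99937d27fd7a5b20847d5293f53e0eaf045ac9235c7ea011785716b800014645ed5d2161078b37e1d04f3c59589c976614fb801c4da982e1 |
-| V1.2.2 | Optimized the display of "measurement point description" and other features | V1.2.2 and later 1.x versions | 062e520d010082be852d6db0e2a3aa6de594eb26aeb608da28a212726e378cd4ea30fca5e1d2c3231ebd8de29e94ca9641f1fabc1cea46acfb650c37b7681b4e |
-| V1.2.1 | Added a "monitoring panel" to the data sync page; optimized Prometheus prompts | V1.2.2 and later 1.x versions | 8a3bcf87982ad5004528829b121f2d3945429deb77069917a42a8c8d2e2e2a2c24a398aaa87003920eeacc0c692f1ed39eac52a696887aa085cce011f0ddd745 |
-| V1.2.0 | Brand-new Workbench version upgrade | V1.2.0 and later 1.x versions | ea1f7d3a4c0c6476a195479e69bbd3b3a2da08b5b2bb70b0a4aba988a28b5db5a209d4e2c697eb8095dfdf130e29f61f2ddf58c5b51d002c8d4c65cfc13106b3 |
diff --git a/src/zh/UserGuide/Master/Table/QuickStart/QuickStart_timecho.md b/src/zh/UserGuide/Master/Table/QuickStart/QuickStart_timecho.md
deleted file mode 100644
index 89ef3eef6..000000000
--- a/src/zh/UserGuide/Master/Table/QuickStart/QuickStart_timecho.md
+++ /dev/null
@@ -1,93 +0,0 @@
-
-
-# Quick Start
-
-This document helps you get started with IoTDB quickly.
-
-## 1. How to Install and Deploy?
-
-This document helps you install and deploy IoTDB quickly. Use the links below to jump to the content you need:
-
-1. Prepare machine resources: Deploying and running IoTDB involves several aspects of machine resource configuration. For details, see [Resource Planning](../Deployment-and-Maintenance/Database-Resources_timecho.md)
-
-2. Prepare the system configuration: IoTDB's system configuration covers several aspects. For the key settings, see [System Configuration](../Deployment-and-Maintenance/Environment-Requirements.md)
-
-3. Obtain the installation package: Contact Timecho sales to obtain the IoTDB installation package and make sure you get the latest stable version. For the package structure, see [Obtaining the Installation Package](../Deployment-and-Maintenance/IoTDB-Package_timecho.md)
-
-4. Install and activate the database: Choose one of the following tutorials according to your deployment architecture:
-
-   - Stand-alone: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md)
-
-   - Distributed (cluster): [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md)
-
-   - Dual-active: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md)
-
-> ❗️Note: We still recommend installing directly on physical or virtual machines. If you need a Docker deployment, see [Docker Deployment](../Deployment-and-Maintenance/Docker-Deployment_timecho.md)
-
-5. Install supporting tools: The Enterprise Edition provides supporting tools such as the monitoring panel. We recommend installing them when you deploy the Enterprise Edition, as they make IoTDB easier to use:
-
-   - Monitoring panel: Provides more than a hundred database metrics for detailed monitoring of IoTDB and its operating system, supporting system tuning, performance tuning, and bottleneck discovery. For installation steps, see [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md)
-
-
-## 2. How to Use?
-
-1. Database modeling: Modeling is an important step in building a database system. It involves designing the structure and relationships of data so that its organization meets the needs of a specific application. The following documents introduce IoTDB modeling:
-
-   - Time series concepts: [Time Series Data Model](../Background-knowledge/Navigating_Time_Series_Data_timecho.md)
-
-   - Modeling design: [Modeling Scheme Design](../Background-knowledge/Data-Model-and-Terminology_timecho.md)
-
-   - Databases: [Database Management](../Basic-Concept/Database-Management_timecho.md)
-
-   - Tables: [Table Management](../Basic-Concept/Table-Management_timecho.md)
-
-2. Data writing & updating: IoTDB provides multiple ways to insert real-time data and supports writing query results back as appended data. For an introduction, see [Data Writing & Updating](../Basic-Concept/Write-Updata-Data_timecho.md)
-
-3. Data query: IoTDB provides rich query capabilities. For an introduction, see [Data Query](../Basic-Concept/Query-Data.md). These include pattern queries and window functions suited to time series analysis; for details, see [Pattern Query](../User-Manual/Pattern-Query_timecho.md) and [Window Function](../User-Manual/Window-Function_timecho.md)
-
-4. Data deletion: IoTDB provides two deletion methods: deletion by SQL statement and automatic expiration (TTL)
-
-   - Deletion by SQL statement: see [Data Deletion](../Basic-Concept/Delete-Data.md)
-   - Automatic expiration (TTL): see [Automatic Expiration](../Basic-Concept/TTL-Delete-Data_timecho.md)
-
-5. Other advanced features: Besides common database features such as writing and querying, IoTDB supports features such as "data sync". For usage, see the corresponding document:
-
-   - Data sync: [Data Sync](../User-Manual/Data-Sync_timecho.md)
-
-6. Application programming interfaces: IoTDB provides multiple APIs for interacting with IoTDB from applications. It currently supports the [Java Native API](../API/Programming-Java-Native-API_timecho.md), the [Python Native API](../API/Programming-Python-Native-API_timecho.md), [JDBC](../API/Programming-JDBC_timecho.md), and others. For more interfaces, see the other chapters under [Application Programming Interfaces] on the official website
-
-## 3. What Convenient Peripheral Tools Are Available?
-
-Besides its own rich features, IoTDB has a comprehensive set of peripheral tools. This section helps you get started with them:
-
-   - Monitoring panel: A tool for detailed monitoring of IoTDB and its operating system. It covers more than a hundred metrics, including database performance and system resources, and helps with system tuning and bottleneck identification. For usage, see [Monitoring Panel Deployment](../Tools-System/Monitor-Tool_timecho.md)
-
-
-## 4. Want to Learn More Technical Details?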
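-
-If you want to learn more about IoTDB internals, see the following documents:
-
-   - Data partitioning and load balancing: Based on the characteristics of time series data, IoTDB uses carefully designed data partitioning strategies and load balancing algorithms that improve cluster availability and performance. To learn more, see [Data Partitioning and Load Balancing](../Technical-Insider/Cluster-data-partitioning.md)
-
-   - Compression & encoding: IoTDB uses a variety of encoding and compression techniques to optimize storage efficiency for different data types. To learn more, see [Compression & Encoding](../Technical-Insider/Encoding-and-Compression.md)
-
-
diff --git a/src/zh/UserGuide/Master/Table/Reference/System-Config-Manual_timecho.md b/src/zh/UserGuide/Master/Table/Reference/System-Config-Manual_timecho.md
deleted file mode 100644
index a7bffe3a7..000000000
--- a/src/zh/UserGuide/Master/Table/Reference/System-Config-Manual_timecho.md
+++ /dev/null
@@ -1,3354 +0,0 @@
-
-
-# Configuration Parameters
-
-The IoTDB configuration files are located in the `conf` folder of the IoTDB installation directory.
-
-- `confignode-env.sh/bat`: Environment configuration file; sets the memory size of the ConfigNode.
-- `datanode-env.sh/bat`: Environment configuration file; sets the memory size of the DataNode.
-- `iotdb-system.properties`: The IoTDB configuration file.
-- `iotdb-system.properties.template`: Template of the IoTDB configuration file.
-
-## 1. Modifying the Configuration
-
-Parameters that already exist in `iotdb-system.properties` can be modified directly. For parameters not listed in `iotdb-system.properties`, find them in the `iotdb-system.properties.template` template, copy them into `iotdb-system.properties`, and modify them there.
-
-### 1.1 How Changes Take Effect
-
-Configuration parameters take effect in one of three ways:
-
-- Modifiable only before the first startup: Once the ConfigNode/DataNode has been started for the first time, the parameter must not be modified; modifying it prevents the ConfigNode/DataNode from starting.
-- Takes effect after restart: The parameter can be modified after the ConfigNode/DataNode has started, but the change takes effect only after the ConfigNode/DataNode is restarted.
-- Hot reload: The parameter can be modified while the ConfigNode/DataNode is running. After modifying it, send the `load configuration` or `set configuration key1 = 'value1'` command (SQL) to IoTDB through a Session or the Cli to apply the change.
-
-## 2. Environment Configuration Items
-
-### 2.1 confignode-env.sh/bat
-
-These items configure the Java environment in which the ConfigNode runs, such as JVM settings. They are passed to the JVM when the ConfigNode starts. The items are described below:
-
-- MEMORY_SIZE
-
-| Name | MEMORY_SIZE |
-| ------------ | ------------ |
-| Description | Memory size allocated when the IoTDB ConfigNode starts |
-| Type | String |
-| Default | Depends on the operating system and machine configuration. Defaults to three tenths of the machine memory, capped at 16G. |
-| Effective | Takes effect after restart |
-
-- ON_HEAP_MEMORY
-
-| Name | ON_HEAP_MEMORY |
-| ------------ | ------------ |
-| Description | On-heap memory size available to the IoTDB ConfigNode; formerly named MAX_HEAP_SIZE |
-| Type | String |
-| Default | Depends on the MEMORY_SIZE setting. |
-| Effective | Takes effect after restart |
-
-- OFF_HEAP_MEMORY
-
-| Name | OFF_HEAP_MEMORY |
-| ------------ | ------------ |
-| Description | Off-heap memory size available to the IoTDB ConfigNode; formerly named MAX_DIRECT_MEMORY_SIZE |
-| Type | String |
-| Default | Depends on the MEMORY_SIZE setting. |
-| Effective | Takes effect after restart |
-
-### 2.2 datanode-env.sh/bat
-
-These items configure the Java environment in which the DataNode runs, such as JVM settings. They are passed to the JVM when the DataNode/Standalone starts. The items are described below:
-
-- MEMORY_SIZE
-
-| Name | MEMORY_SIZE |
-| ------------ | ------------ |
-| Description | Memory size allocated when the IoTDB DataNode starts |
-| Type | String |
-| Default | Depends on the operating system and machine configuration. Defaults to half of the machine memory. |
-| Effective | Takes effect after restart |
-
-- ON_HEAP_MEMORY
-
-| Name | ON_HEAP_MEMORY |
-| ------------ | ------------ |
-| Description | On-heap memory size available to the IoTDB DataNode; formerly named MAX_HEAP_SIZE |
-| Type | String |
-| Default | Depends on the MEMORY_SIZE setting. |
-| Effective | Takes effect after restart |
-
-- OFF_HEAP_MEMORY
-
-| Name | OFF_HEAP_MEMORY |
-| ------------ | ------------ |
-| Description | Off-heap memory size available to the IoTDB DataNode; formerly named MAX_DIRECT_MEMORY_SIZE |
-| Type | String |
-| Default | Depends on the MEMORY_SIZE setting. |
-| Effective | Takes effect after restart |
-
-
-## 3. 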
System Configuration Items (iotdb-system.properties.template)
-
-### 3.1 Cluster Management
-
-- cluster_name
-
-| Name | cluster_name |
-| -------- | -------- |
-| Description | Cluster name |
-| Type | String |
-| Default | default_cluster |
-| How to modify | Execute `set configuration cluster_name = 'xxx'` in the CLI (xxx is the new cluster name) |
-| Note | The change is distributed to every node over the network. If the network is unstable or a node is down, the change is not guaranteed to succeed on all nodes. A node where the change failed cannot rejoin the cluster on restart; in that case, manually edit cluster_name in that node's configuration file and restart it. Under normal conditions, changing the cluster name by editing configuration files manually is not recommended, nor is hot-loading it via `load configuration`. |
-
-### 3.2 SeedConfigNode Configuration
-
-- cn_seed_config_node
-
-| Name | cn_seed_config_node |
-| ------------ | ------------ |
-| Description | Address of the target ConfigNode through which this ConfigNode joins the cluster; using the SeedConfigNode is recommended. Named cn_target_config_node_list in V1.2.2 and earlier |
-| Type | String |
-| Default | 127.0.0.1:10710 |
-| Effective | Modifiable only before the first startup |
-
-- dn_seed_config_node
-
-| Name | dn_seed_config_node |
-| ------------ | ------------ |
-| Description | ConfigNode address through which the DataNode joins the cluster at startup; using the SeedConfigNode is recommended. Named dn_target_config_node_list in V1.2.2 and earlier |
-| Type | String |
-| Default | 127.0.0.1:10710 |
-| Effective | Modifiable only before the first startup |
-
-### 3.3 Node RPC Configuration
-
-- cn_internal_address
-
-| Name | cn_internal_address |
-| ------------ | ------------ |
-| Description | Internal cluster address of the ConfigNode |
-| Type | String |
-| Default | 127.0.0.1 |
-| Effective | Modifiable only before the first startup |
-
-- cn_internal_port
-
-| Name | cn_internal_port |
-| ------------ | ------------ |
-| Description | Listening port of the ConfigNode cluster service |
-| Type | Short Int : [0,65535] |
-| Default | 10710 |
-| Effective | Modifiable only before the first startup |
-
-- cn_consensus_port
-
-| Name | cn_consensus_port |
-| ------------ | ------------ |
-| Description | Consensus protocol communication port of the ConfigNode |
-| Type | Short Int : [0,65535] |
-| Default | 10720 |
-| Effective | Modifiable only before the first startup |
-
-- dn_rpc_address
-
-| Name | dn_rpc_address |
-| ------------ | ------------ |
-| Description | Listening address of the client RPC service |
-| Type | String |
-| Default | 127.0.0.1 |
-| Effective | Takes effect after restart |
-
-- dn_rpc_port
-
-| Name | dn_rpc_port |
-| ------------ | ------------ |
-| Description | Listening port of the client RPC service |
-| Type | Short Int : [0,65535] |
-| Default | 6667 |
-| Effective | Takes effect after restart |
-
-- dn_internal_address
-
-| Name | dn_internal_address |
-| ------------ | ------------ |
-| Description | Internal network communication address of the DataNode |
-| Type | String |
-| Default | 127.0.0.1 |
-| Effective | Modifiable only before the first startup |
-
-- dn_internal_port
-
-| Name | dn_internal_port |
-| ------------ | ------------ |
-| Description | Internal network communication port of the DataNode |
-| Type | int |
-| Default | 10730 |
-| Effective | Modifiable only before the first startup |
-
-- dn_mpp_data_exchange_port
-
-| Name | dn_mpp_data_exchange_port |
-| ------------ | ------------ |
-| Description | MPP data exchange port |
-| Type | int |
-| Default | 10740 |
-| Effective | Modifiable only before the first startup |
-
-- dn_schema_region_consensus_port
-
-| Name | dn_schema_region_consensus_port |
-| ------------ | ------------ |
-| Description | Consensus protocol communication port for DataNode schema replicas |
-| Type | int |
-| Default | 10750 |
-| Effective | Modifiable only before the first startup |
-
-- dn_data_region_consensus_port
-
-| Name | dn_data_region_consensus_port |
-| ------------ | ------------ |
-| Description | Consensus protocol communication port for DataNode data replicas |
-| Type | int |
-| Default | 10760 |
-| Effective | Modifiable only before the first startup |
-
-- dn_join_cluster_retry_interval_ms
-
-| Name | dn_join_cluster_retry_interval_ms |
-| ------------ | ------------ |
-| Description | Time the DataNode waits before retrying to join the cluster |
-| Type | long |
-| Default | 5000 |
-| Effective | Takes effect after restart |
-
-### 3.4 Replication Configuration
-
-- config_node_consensus_protocol_class
-
-| Name | config_node_consensus_protocol_class |
-| ------------ | ------------ |
-| Description | Consensus protocol for ConfigNode replicas; only RatisConsensus is supported |
-| Type | String |
-| Default | org.apache.iotdb.consensus.ratis.RatisConsensus |
-| Effective | Modifiable only before the first startup |
-
-- schema_replication_factor
-
-| Name | schema_replication_factor |
-| ------------ | ------------ |
-| Description | Default number of schema replicas for a Database |
-| Type | int32 |
-| Default | 1 |
-| Effective | Takes effect for **new Databases** after restart |
-
-- schema_region_consensus_protocol_class
-
-| Name | schema_region_consensus_protocol_class |
-| ------------ | ------------ |
-| Description | Consensus protocol for schema replicas; only RatisConsensus can be used with multiple replicas |
-| Type | String |
-| Default | org.apache.iotdb.consensus.ratis.RatisConsensus |
-| Effective | Modifiable only before the first startup |
-
-- data_replication_factor
-
-| Name | data_replication_factor |
-| ------------ | ------------ |
-| Description | Default number of data replicas for a Database |
-| Type | int32 |
-| Default | 1 |
-| Effective | Takes effect for **new Databases** after restart |
-
-- data_region_consensus_protocol_class
-
-| Name | data_region_consensus_protocol_class |
-| ------------ | ------------ |
-| Description | Consensus protocol for data replicas; IoTConsensus or RatisConsensus can be used with multiple replicas |
-| Type | String |
-| Default | org.apache.iotdb.consensus.iot.IoTConsensus |
-| Effective | Modifiable only before the first startup |
-
-### 3.5 Directory Configuration
-
-- cn_system_dir
-
-| Name | cn_system_dir |
-| ------------ | ------------ |
-| Description | Storage path for ConfigNode system data |
-| Type | String |
-| Default | data/confignode/system (Windows: data\\confignode\\system) |
-| Effective | Takes effect after restart |
-
-- cn_consensus_dir
-
-| Name | cn_consensus_dir |
-| ------------ | ------------ |
-| Description | Storage path for ConfigNode consensus protocol data |
-| Type | String |
-| Default | data/confignode/consensus (Windows: data\\confignode\\consensus) |
-| Effective | Takes effect after restart |
-
-- cn_pipe_receiver_file_dir
-
-| Name | cn_pipe_receiver_file_dir |
-| ------------ | ------------ |
-| Description | Directory in which the pipe receiver on the ConfigNode stores files. |
-| Type | String |
-| Default | data/confignode/system/pipe/receiver (Windows: data\\confignode\\system\\pipe\\receiver) |
-| Effective | Takes effect after restart |
-
-- dn_system_dir
-
-| Name | dn_system_dir |
-| ------------ | ------------ |
-| Description | Storage path for IoTDB metadata; by default it is under the data directory at the same level as the sbin directory. The base directory of a relative path depends on the operating system, so an absolute path is recommended. |
-| Type | String |
-| Default | data/datanode/system (Windows: data\\datanode\\system) |
-| Effective | Takes effect after restart |
-
-- dn_data_dirs
-
-| Name | dn_data_dirs |
-| ------------ | ------------ |
-| Description | Storage path for IoTDB data; by default it is under the data directory at the same level as the sbin directory. The base directory of a relative path depends on the operating system, so an absolute path is recommended. |
-| Type | String |
-| Default | data/datanode/data (Windows: data\\datanode\\data) |
-| Effective | Takes effect after restart |
-
-- dn_multi_dir_strategy
-
-| Name | dn_multi_dir_strategy |
-| ------------ | ------------ |
-| Description | Strategy IoTDB uses to choose a directory in data_dirs for a TsFile. Either the simple class name or the fully qualified class name can be used. The system provides the following strategies: 1. SequenceStrategy: IoTDB chooses directories in order, iterating over all directories in data_dirs in a round-robin fashion; 2. MaxDiskUsableSpaceFirstStrategy: IoTDB prefers the directory in data_dirs whose disk has the most free space. To implement a user-defined strategy: 1. Extend the org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy class and implement your own strategy method; 2. Set this item to the fully qualified name of the implemented class (package name plus class name, UserDefineStrategyPackage); 3. Add the jar containing the class to the project. |
-| Type | String |
-| Default | SequenceStrategy |
-| Effective | Hot reload |
-
-- dn_consensus_dir
-
-| Name | dn_consensus_dir |
-| ------------ | ------------ |
-| Description | Storage path for IoTDB consensus layer logs; by default it is under the data directory at the same level as the sbin directory. The base directory of a relative path depends on the operating system, so an absolute path is recommended. |
-| Type | String |
-| Default | data/datanode/consensus (Windows: data\\datanode\\consensus) |
-| Effective | Takes effect after restart |
-
-- dn_wal_dirs
-
-| Name | dn_wal_dirs |
-| ------------ | ------------ |
-| Description | Storage path for the IoTDB write-ahead log; by default it is under the data directory at the same level as the sbin directory. The base directory of a relative path depends on the operating system, so an absolute path is recommended. |
-| Type | String |
-| Default | data/datanode/wal (Windows: data\\datanode\\wal) |
-| Effective | Takes effect after restart |
-
-- dn_tracing_dir
-
-| Name | dn_tracing_dir |
-| ------------ | ------------ |
-| Description | Root directory for IoTDB tracing; by default it is under the data directory at the same level as the sbin directory. The base directory of a relative path depends on the operating system, so an absolute path is recommended. |
-| Type | String |
-| Default | datanode/tracing (Windows: datanode\\tracing) |
-| Effective | Takes effect after restart |
-
-- dn_sync_dir
-
-| Name | dn_sync_dir |
-| ------------ | ------------ |
-| Description | Storage path for IoTDB sync; by default it is under the data directory at the same level as the sbin directory. The base directory of a relative path depends on the operating system, so an absolute path is recommended. |
-| Type | String |
-| Default | data/datanode/sync (Windows: data\\datanode\\sync) |
-| Effective | Takes effect after restart |
-
-- sort_tmp_dir
-
-| Name | sort_tmp_dir |
-| ------------ | ------------ |
-| Description | Temporary directory used by sort operations. |
-| Type | String |
-| Default | data/datanode/tmp (Windows: data\\datanode\\tmp) |
-| Effective | Takes effect after restart |
-
-- dn_pipe_receiver_file_dirs
-
-| Name | dn_pipe_receiver_file_dirs |
-| ------------ | ------------ |
-| Description | Directories in which the pipe receiver on the DataNode stores files. |
-| Type | String |
-| Default | data/datanode/system/pipe/receiver (Windows: data\\datanode\\system\\pipe\\receiver) |
-| Effective | Takes effect after restart |
-
-- iot_consensus_v2_receiver_file_dirs
-
-| Name | iot_consensus_v2_receiver_file_dirs |
-| ------------ | 
------------------------------------------------------------ | -| 描述 | IoTConsensus V2中接收者用于存储文件的目录路径。 | -| 类型 | String | -| 默认值 | data/datanode/system/pipe/consensus/receiver(Windows:data\\datanode\\system\\pipe\\consensus\\receiver) | -| 改后生效方式 | 重启服务生效 | - -- iot_consensus_v2_deletion_file_dir - -| 名字 | iot_consensus_v2_deletion_file_dir | -| ------------ | ------------------------------------------------------------ | -| 描述 | IoTConsensus V2中删除操作用于存储文件的目录路径。 | -| 类型 | String | -| 默认值 | data/datanode/system/pipe/consensus/deletion(Windows:data\\datanode\\system\\pipe\\consensus\\deletion) | -| 改后生效方式 | 重启服务生效 | - -### 3.6 监控配置 - -- cn_metric_reporter_list - -| 名字 | cn_metric_reporter_list | -| ------------ | -------------------------------------------------- | -| 描述 | confignode中用于配置监控模块的数据需要报告的系统。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -- cn_metric_level - -| 名字 | cn_metric_level | -| ------------ | ------------------------------------------ | -| 描述 | confignode中控制监控模块收集数据的详细程度 | -| 类型 | String | -| 默认值 | IMPORTANT | -| 改后生效方式 | 重启服务生效 | - -- cn_metric_async_collect_period - -| 名字 | cn_metric_async_collect_period | -| ------------ | -------------------------------------------------- | -| 描述 | confignode中某些监控数据异步收集的周期,单位是秒。 | -| 类型 | int | -| 默认值 | 5 | -| 改后生效方式 | 重启服务生效 | - -- cn_metric_prometheus_reporter_port - -| 名字 | cn_metric_prometheus_reporter_port | -| ------------ | ------------------------------------------------------ | -| 描述 | confignode中Prometheus报告者用于监控数据报告的端口号。 | -| 类型 | int | -| 默认值 | 9091 | -| 改后生效方式 | 重启服务生效 | - -- dn_metric_reporter_list - -| 名字 | dn_metric_reporter_list | -| ------------ | ------------------------------------------------ | -| 描述 | DataNode中用于配置监控模块的数据需要报告的系统。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -- dn_metric_level - -| 名字 | dn_metric_level | -| ------------ | ---------------------------------------- | -| 描述 | DataNode中控制监控模块收集数据的详细程度 | -| 类型 | String | -| 默认值 | IMPORTANT | -| 改后生效方式 | 重启服务生效 
| - -- dn_metric_async_collect_period - -| 名字 | dn_metric_async_collect_period | -| ------------ | ------------------------------------------------ | -| 描述 | DataNode中某些监控数据异步收集的周期,单位是秒。 | -| 类型 | int | -| 默认值 | 5 | -| 改后生效方式 | 重启服务生效 | - -- dn_metric_prometheus_reporter_port - -| 名字 | dn_metric_prometheus_reporter_port | -| ------------ | ---------------------------------------------------- | -| 描述 | DataNode中Prometheus报告者用于监控数据报告的端口号。 | -| 类型 | int | -| 默认值 | 9092 | -| 改后生效方式 | 重启服务生效 | - -- dn_metric_internal_reporter_type - -| 名字 | dn_metric_internal_reporter_type | -| ------------ | ------------------------------------------------------------ | -| 描述 | DataNode中监控模块内部报告者的种类,用于内部监控和检查数据是否已经成功写入和刷新。 | -| 类型 | String | -| 默认值 | IOTDB | -| 改后生效方式 | 重启服务生效 | - -### 3.7 SSL 配置 - -- enable_thrift_ssl - -| 名字 | enable_thrift_ssl | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当enable_thrift_ssl配置为true时,将通过dn_rpc_port使用 SSL 加密进行通信 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- enable_https - -| 名字 | enable_https | -| ------------ | ------------------------------ | -| 描述 | REST Service 是否开启 SSL 配置 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- key_store_path - -| 名字 | key_store_path | -| ------------ | -------------- | -| 描述 | ssl证书路径 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -- key_store_pwd - -| 名字 | key_store_pwd | -| ------------ | ------------- | -| 描述 | ssl证书密码 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -### 3.8 连接配置 - -- cn_rpc_thrift_compression_enable - -| 名字 | cn_rpc_thrift_compression_enable | -| ------------ | -------------------------------- | -| 描述 | 是否启用 thrift 的压缩机制。 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- cn_rpc_max_concurrent_client_num - -| 名字 | cn_rpc_max_concurrent_client_num | -| ------------ |---------------------------------| -| 描述 | 最大连接数。 | -| 类型 | int | -| 默认值 | 3000 | -| 改后生效方式 | 重启服务生效 | - -- cn_connection_timeout_ms - 
-| 名字 | cn_connection_timeout_ms | -| ------------ | ------------------------ | -| 描述 | 节点连接超时时间 | -| 类型 | int | -| 默认值 | 60000 | -| 改后生效方式 | 重启服务生效 | - -- cn_selector_thread_nums_of_client_manager - -| 名字 | cn_selector_thread_nums_of_client_manager | -| ------------ | ----------------------------------------- | -| 描述 | 客户端异步线程管理的选择器线程数量 | -| 类型 | int | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -- cn_max_client_count_for_each_node_in_client_manager - -| 名字 | cn_max_client_count_for_each_node_in_client_manager | -| ------------ | --------------------------------------------------- | -| 描述 | 单 ClientManager 中路由到每个节点的最大 Client 个数 | -| 类型 | int | -| 默认值 | 300 | -| 改后生效方式 | 重启服务生效 | - -- dn_session_timeout_threshold - -| 名字 | dn_session_timeout_threshold | -| ------------ | ---------------------------- | -| 描述 | 最大的会话空闲时间 | -| 类型 | int | -| 默认值 | 0 | -| 改后生效方式 | 重启服务生效 | - -- dn_rpc_thrift_compression_enable - -| 名字 | dn_rpc_thrift_compression_enable | -| ------------ | -------------------------------- | -| 描述 | 是否启用 thrift 的压缩机制 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- dn_rpc_advanced_compression_enable - -| 名字 | dn_rpc_advanced_compression_enable | -| ------------ | ---------------------------------- | -| 描述 | 是否启用 thrift 的自定制压缩机制 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- dn_rpc_selector_thread_count - -| 名字 | rpc_selector_thread_count | -| ------------ | ------------------------- | -| 描述 | rpc 选择器线程数量 | -| 类型 | int | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -- dn_rpc_min_concurrent_client_num - -| 名字 | rpc_min_concurrent_client_num | -| ------------ | ----------------------------- | -| 描述 | 最小连接数 | -| 类型 | Short Int : [0,65535] | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -- dn_rpc_max_concurrent_client_num - -| 名字 | dn_rpc_max_concurrent_client_num | -| ------------ |----------------------------------| -| 描述 | 最大连接数 | -| 类型 | Short Int : [0,65535] | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- dn_thrift_max_frame_size - -| 名字 | 
dn_thrift_max_frame_size | -| ------------ |------------------------------------------------------------| -| Description | Maximum size, in bytes, of an RPC request or response | -| Type | int | -| Default | 0, meaning the value is computed automatically from the DataNode JVM settings at startup:
a. min(64MB, dn_alloc_memory/64)
b. If dn_thrift_max_frame_size is set manually, the user-specified size is used | -| Effective mode | After restarting the service | - -- dn_thrift_init_buffer_size - -| Name | dn_thrift_init_buffer_size | -| ------------ | -------------------------- | -| Description | Size in bytes | -| Type | long | -| Default | 1024 | -| Effective mode | After restarting the service | - -- dn_connection_timeout_ms - -| Name | dn_connection_timeout_ms | -| ------------ | ------------------------ | -| Description | Connection timeout between nodes, in milliseconds | -| Type | int | -| Default | 60000 | -| Effective mode | After restarting the service | - -- dn_selector_thread_count_of_client_manager - -| Name | dn_selector_thread_count_of_client_manager | -| ------------ | ------------------------------------------------------------ | -| Description | Number of selector threads (TAsyncClientManager) for asynchronous threads in a ClientManager | -| Type | int | -| Default | 1 | -| Effective mode | After restarting the service | - -- dn_max_client_count_for_each_node_in_client_manager - -| Name | dn_max_client_count_for_each_node_in_client_manager | -| ------------ | --------------------------------------------------- | -| Description | Maximum number of clients routed to each node in a single ClientManager | -| Type | int | -| Default | 300 | -| Effective mode | After restarting the service | - -### 3.9 Object Storage Management - -- remote_tsfile_cache_dirs - -| Name | remote_tsfile_cache_dirs | -| ------------ | ------------------------ | -| Description | Local cache directory for cloud storage | -| Type | String | -| Default | data/datanode/data/cache | -| Effective mode | After restarting the service | - -- remote_tsfile_cache_page_size_in_kb - -| Name | remote_tsfile_cache_page_size_in_kb | -| ------------ | ----------------------------------- | -| Description | Block size of the local cache files for cloud storage, in KB | -| Type | int | -| Default | 20480 | -| Effective mode | After restarting the service | - -- remote_tsfile_cache_max_disk_usage_in_mb - -| Name | remote_tsfile_cache_max_disk_usage_in_mb | -| ------------ | ---------------------------------------- | -| Description | Maximum disk usage of the local cache for cloud storage, in MB | -| Type | long | -| Default | 51200 | -| Effective mode | After restarting the service | - -- object_storage_type - -| Name | object_storage_type | -| ------------ | ------------------- | -| Description | Cloud storage type | -| Type | String | -| Default | AWS_S3 | -| Effective mode | After restarting the service | - -- object_storage_endpoint - -| Name | 
object_storage_endpoint | -| ------------ | ----------------------- | -| 描述 | 云端存储的 endpoint | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -- object_storage_bucket - -| 名字 | object_storage_bucket | -| ------------ | ---------------------- | -| 描述 | 云端存储 bucket 的名称 | -| 类型 | String | -| 默认值 | iotdb_data | -| 改后生效方式 | 重启服务生效 | - -- object_storage_access_key - -| 名字 | object_storage_access_key | -| ------------ | ------------------------- | -| 描述 | 云端存储的验证信息 key | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -- object_storage_access_secret - -| 名字 | object_storage_access_secret | -| ------------ | ---------------------------- | -| 描述 | 云端存储的验证信息 secret | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -### 3.10 多级管理 - -- dn_default_space_usage_thresholds - -| 名字 | dn_default_space_usage_thresholds | -| ------------ | ------------------------------------------------------------ | -| 描述 | 定义每个层级数据目录的最小剩余空间比例;当剩余空间少于该比例时,数据会被自动迁移至下一个层级;当最后一个层级的剩余存储空间到低于此阈值时,会将系统置为 READ_ONLY | -| 类型 | double | -| 默认值 | 0.85 | -| 改后生效方式 | 热加载 | - -- dn_tier_full_policy - -| 名字 | dn_tier_full_policy | -| ------------ | ------------------------------------------------------------ | -| 描述 | 如何处理最后一层数据,当其已用空间高于其dn_default_space_usage_threshold时。| -| 类型 | String | -| 默认值 | NULL | -| 改后生效方式 | 热加载 | - -- migrate_thread_count - -| 名字 | migrate_thread_count | -| ------------ | ---------------------------------------- | -| 描述 | DataNode数据目录中迁移操作的线程池大小。 | -| 类型 | int | -| 默认值 | 1 | -| 改后生效方式 | 热加载 | - -- tiered_storage_migrate_speed_limit_bytes_per_sec - -| 名字 | tiered_storage_migrate_speed_limit_bytes_per_sec | -| ------------ | ------------------------------------------------ | -| 描述 | 限制不同存储层级之间的数据迁移速度。 | -| 类型 | int | -| 默认值 | 10485760 | -| 改后生效方式 | 热加载 | - -### 3.11 REST服务配置 - -- enable_rest_service - -| 名字 | enable_rest_service | -| ------------ | ------------------- | -| 描述 | 是否开启Rest服务。 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- rest_service_port - -| 
Name | rest_service_port | -| ------------ | ------------------ | -| Description | Listening port of the REST service | -| Type | int32 | -| Default | 18080 | -| Effective mode | After restarting the service | - -- enable_swagger - -| Name | enable_swagger | -| ------------ | --------------------------------- | -| Description | Whether to enable Swagger to display REST API information | -| Type | Boolean | -| Default | false | -| Effective mode | After restarting the service | - -- rest_query_default_row_size_limit - -| Name | rest_query_default_row_size_limit | -| ------------ | --------------------------------- | -| Description | Maximum number of rows in the result set returned by a single query | -| Type | int32 | -| Default | 10000 | -| Effective mode | After restarting the service | - -- cache_expire_in_seconds - -| Name | cache_expire_in_seconds | -| ------------ | -------------------------------- | -| Description | Expiration time of cached user login information, in seconds | -| Type | int32 | -| Default | 28800 | -| Effective mode | After restarting the service | - -- cache_max_num - -| Name | cache_max_num | -| ------------ | ------------------------ | -| Description | Maximum number of users stored in the cache | -| Type | int32 | -| Default | 100 | -| Effective mode | After restarting the service | - -- cache_init_num - -| Name | cache_init_num | -| ------------ | -------------- | -| Description | Initial cache capacity | -| Type | int32 | -| Default | 10 | -| Effective mode | After restarting the service | - -- client_auth - -| Name | client_auth | -| ------------ | ---------------------- | -| Description | Whether client authentication is required | -| Type | boolean | -| Default | false | -| Effective mode | After restarting the service | - -- trust_store_path - -| Name | trust_store_path | -| ------------ | ----------------------- | -| Description | trustStore path (optional) | -| Type | String | -| Default | "" | -| Effective mode | After restarting the service | - -- trust_store_pwd - -| Name | trust_store_pwd | -| ------------ | ------------------------- | -| Description | trustStore password (optional) | -| Type | String | -| Default | "" | -| Effective mode | After restarting the service | - -- idle_timeout_in_seconds - -| Name | idle_timeout_in_seconds | -| ------------ | ----------------------- | -| Description | SSL timeout, in seconds | -| Type | int32 | -| Default | 5000 | -| Effective mode | After restarting the service | - -### 3.12 Load Balancing Configuration - -- series_slot_num - -| Name | series_slot_num | -| ------------ | ---------------------------- | -| Description | Number of series partition slots | -| Type | int32 | -| Default | 10000 | -| Effective mode | Can only be modified before the first service startup | - -- series_partition_executor_class - -| 
名字 | series_partition_executor_class | -| ------------ | ------------------------------------------------------------ | -| 描述 | 序列分区哈希函数 | -| 类型 | String | -| 默认值 | org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- schema_region_group_extension_policy - -| 名字 | schema_region_group_extension_policy | -| ------------ | ------------------------------------ | -| 描述 | SchemaRegionGroup 的扩容策略 | -| 类型 | string | -| 默认值 | AUTO | -| 改后生效方式 | 重启服务生效 | - -- default_schema_region_group_num_per_database - -| 名字 | default_schema_region_group_num_per_database | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当选用 CUSTOM-SchemaRegionGroup 扩容策略时,此参数为每个 Database 拥有的 SchemaRegionGroup 数量;当选用 AUTO-SchemaRegionGroup 扩容策略时,此参数为每个 Database 最少拥有的 SchemaRegionGroup 数量 | -| 类型 | int | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -- schema_region_per_data_node - -| 名字 | schema_region_per_data_node | -| ------------ | -------------------------------------------------- | -| 描述 | 期望每个 DataNode 可管理的 SchemaRegion 的最大数量 | -| 类型 | double | -| 默认值 | 1.0 | -| 改后生效方式 | 重启服务生效 | - -- data_region_group_extension_policy - -| 名字 | data_region_group_extension_policy | -| ------------ | ---------------------------------- | -| 描述 | DataRegionGroup 的扩容策略 | -| 类型 | string | -| 默认值 | AUTO | -| 改后生效方式 | 重启服务生效 | - -- default_data_region_group_num_per_database - -| 名字 | default_data_region_group_per_database | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当选用 CUSTOM-DataRegionGroup 扩容策略时,此参数为每个 Database 拥有的 DataRegionGroup 数量;当选用 AUTO-DataRegionGroup 扩容策略时,此参数为每个 Database 最少拥有的 DataRegionGroup 数量 | -| 类型 | int | -| 默认值 | 2 | -| 改后生效方式 | 重启服务生效 | - -- data_region_per_data_node - -| 名字 | data_region_per_data_node | -| ------------ | ------------------------------------------------ | -| 描述 | 期望每个 DataNode 可管理的 DataRegion 的最大数量 | -| 类型 | double | -| 默认值 | CPU 核心数的一半 | -| 改后生效方式 | 重启服务生效 | - -- 
enable_auto_leader_balance_for_ratis_consensus - -| 名字 | enable_auto_leader_balance_for_ratis_consensus | -| ------------ | ---------------------------------------------- | -| 描述 | 是否为 Ratis 共识协议开启自动均衡 leader 策略 | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 重启服务生效 | - -- enable_auto_leader_balance_for_iot_consensus - -| 名字 | enable_auto_leader_balance_for_iot_consensus | -| ------------ | -------------------------------------------- | -| 描述 | 是否为 IoT 共识协议开启自动均衡 leader 策略 | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 重启服务生效 | - -### 3.13 集群管理 - -- time_partition_origin - -| 名字 | time_partition_origin | -| ------------ | ------------------------------------------------------------ | -| 描述 | Database 数据时间分区的起始点,即从哪个时间点开始计算时间分区。 | -| 类型 | Long | -| 单位 | 毫秒 | -| 默认值 | 0 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- time_partition_interval - -| 名字 | time_partition_interval | -| ------------ | ------------------------------- | -| 描述 | Database 默认的数据时间分区间隔 | -| 类型 | Long | -| 单位 | 毫秒 | -| 默认值 | 604800000 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- heartbeat_interval_in_ms - -| 名字 | heartbeat_interval_in_ms | -| ------------ | ------------------------ | -| 描述 | 集群节点间的心跳间隔 | -| 类型 | Long | -| 单位 | ms | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- disk_space_warning_threshold - -| 名字 | disk_space_warning_threshold | -| ------------ | ---------------------------- | -| 描述 | DataNode 磁盘剩余阈值 | -| 类型 | double(percentage) | -| 默认值 | 0.05 | -| 改后生效方式 | 重启服务生效 | - -### 3.14 内存控制配置 - -- datanode_memory_proportion - -| 名字 | datanode_memory_proportion | -| ------------ | ---------------------------------------------------- | -| 描述 | 存储引擎、查询引擎、元数据、共识、流处理引擎和空闲内存比例 | -| 类型 | Ratio | -| 默认值 | 3:3:1:1:1:1 | -| 改后生效方式 | 重启服务生效 | - -- schema_memory_proportion - -| 名字 | schema_memory_proportion | -| ------------ | ------------------------------------------------------------ | -| 描述 | Schema 相关的内存如何在 SchemaRegion、SchemaCache 和 PartitionCache 之间分配 | -| 类型 | Ratio | -| 默认值 | 5:4:1 | -| 改后生效方式 | 重启服务生效 | - -- 
storage_engine_memory_proportion - -| 名字 | storage_engine_memory_proportion | -| ------------ | -------------------------------- | -| 描述 | 写入和合并占存储内存比例 | -| 类型 | Ratio | -| 默认值 | 8:2 | -| 改后生效方式 | 重启服务生效 | - -- write_memory_proportion - -| 名字 | write_memory_proportion | -| ------------ | -------------------------------------------- | -| 描述 | Memtable 和 TimePartitionInfo 占写入内存比例 | -| 类型 | Ratio | -| 默认值 | 19:1 | -| 改后生效方式 | 重启服务生效 | - -- primitive_array_size - -| 名字 | primitive_array_size | -| ------------ | ---------------------------------------- | -| 描述 | 数组池中的原始数组大小(每个数组的长度) | -| 类型 | int32 | -| 默认值 | 64 | -| 改后生效方式 | 重启服务生效 | - -- chunk_metadata_size_proportion - -| 名字 | chunk_metadata_size_proportion | -| ------------ | -------------------------------------------- | -| 描述 | 在数据压缩过程中,用于存储块元数据的内存比例 | -| 类型 | Double | -| 默认值 | 0.1 | -| 改后生效方式 | 重启服务生效 | - -- flush_proportion - -| 名字 | flush_proportion | -| ------------ | ------------------------------------------------------------ | -| 描述 | 调用flush disk的写入内存比例,默认0.4,若有极高的写入负载力(比如batch=1000),可以设置为低于默认值,比如0.2 | -| 类型 | Double | -| 默认值 | 0.4 | -| 改后生效方式 | 重启服务生效 | - -- buffered_arrays_memory_proportion - -| 名字 | buffered_arrays_memory_proportion | -| ------------ | --------------------------------------- | -| 描述 | 为缓冲数组分配的写入内存比例,默认为0.6 | -| 类型 | Double | -| 默认值 | 0.6 | -| 改后生效方式 | 重启服务生效 | - -- reject_proportion - -| 名字 | reject_proportion | -| ------------ | ------------------------------------------------------------ | -| 描述 | 拒绝插入的写入内存比例,默认0.8,若有极高的写入负载力(比如batch=1000)并且物理内存足够大,它可以设置为高于默认值,如0.9 | -| 类型 | Double | -| 默认值 | 0.8 | -| 改后生效方式 | 重启服务生效 | - -- device_path_cache_proportion - -| 名字 | device_path_cache_proportion | -| ------------ | --------------------------------------------------- | -| 描述 | 在内存中分配给设备路径缓存(DevicePathCache)的比例 | -| 类型 | Double | -| 默认值 | 0.05 | -| 改后生效方式 | 重启服务生效 | - -- write_memory_variation_report_proportion - -| 名字 | write_memory_variation_report_proportion | -| ------------ | 
------------------------------------------------------------ | -| 描述 | 如果 DataRegion 的内存增加超过写入可用内存的一定比例,则向系统报告。默认值为0.001 | -| 类型 | Double | -| 默认值 | 0.001 | -| 改后生效方式 | 重启服务生效 | - -- check_period_when_insert_blocked - -| 名字 | check_period_when_insert_blocked | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当插入被拒绝时,等待时间(以毫秒为单位)去再次检查系统,默认为50。若插入被拒绝,读取负载低,可以设置大一些。 | -| 类型 | int32 | -| 默认值 | 50 | -| 改后生效方式 | 重启服务生效 | - -- io_task_queue_size_for_flushing - -| 名字 | io_task_queue_size_for_flushing | -| ------------ | -------------------------------- | -| 描述 | ioTaskQueue 的大小。默认值为10。 | -| 类型 | int32 | -| 默认值 | 10 | -| 改后生效方式 | 重启服务生效 | - -- enable_query_memory_estimation - -| 名字 | enable_query_memory_estimation | -| ------------ | ------------------------------------------------------------ | -| 描述 | 开启后会预估每次查询的内存使用量,如果超过可用内存,会拒绝本次查询 | -| 类型 | bool | -| 默认值 | true | -| 改后生效方式 | 热加载 | - -### 3.15 元数据引擎配置 - -- schema_engine_mode - -| 名字 | schema_engine_mode | -| ------------ | ------------------------------------------------------------ | -| 描述 | 元数据引擎的运行模式,支持 Memory 和 PBTree;PBTree 模式下支持将内存中暂时不用的序列元数据实时置换到磁盘上,需要使用时再加载进内存;此参数在集群中所有的 DataNode 上务必保持相同。 | -| 类型 | string | -| 默认值 | Memory | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- partition_cache_size - -| 名字 | partition_cache_size | -| ------------ | ------------------------------ | -| 描述 | 分区信息缓存的最大缓存条目数。 | -| 类型 | Int32 | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- sync_mlog_period_in_ms - -| 名字 | sync_mlog_period_in_ms | -| ------------ | ------------------------------------------------------------ | -| 描述 | mlog定期刷新到磁盘的周期,单位毫秒。如果该参数为0,则表示每次对元数据的更新操作都会被立即写到磁盘上。 | -| 类型 | Int64 | -| 默认值 | 100 | -| 改后生效方式 | 重启服务生效 | - -- tag_attribute_flush_interval - -| 名字 | tag_attribute_flush_interval | -| ------------ | -------------------------------------------------- | -| 描述 | 标签和属性记录的间隔数,达到此记录数量时将强制刷盘 | -| 类型 | int32 | -| 默认值 | 1000 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- tag_attribute_total_size - -| 名字 | 
tag_attribute_total_size | -| ------------ | ---------------------------------------- | -| 描述 | 每个时间序列标签和属性的最大持久化字节数 | -| 类型 | int32 | -| 默认值 | 700 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- max_measurement_num_of_internal_request - -| 名字 | max_measurement_num_of_internal_request | -| ------------ | ------------------------------------------------------------ | -| 描述 | 一次注册序列请求中若物理量过多,在系统内部执行时将被拆分为若干个轻量级的子请求,每个子请求中的物理量数目不超过此参数设置的最大值。 | -| 类型 | Int32 | -| 默认值 | 10000 | -| 改后生效方式 | 重启服务生效 | - -- datanode_schema_cache_eviction_policy - -| 名字 | datanode_schema_cache_eviction_policy | -| ------------ | ----------------------------------------------------- | -| 描述 | 当 Schema 缓存达到其最大容量时,Schema 缓存的淘汰策略 | -| 类型 | String | -| 默认值 | FIFO | -| 改后生效方式 | 重启服务生效 | - -- cluster_timeseries_limit_threshold - -| 名字 | cluster_timeseries_limit_threshold | -| ------------ | ---------------------------------- | -| 描述 | 集群中可以创建的时间序列的最大数量 | -| 类型 | Int32 | -| 默认值 | -1 | -| 改后生效方式 | 重启服务生效 | - -- cluster_device_limit_threshold - -| 名字 | cluster_device_limit_threshold | -| ------------ | ------------------------------ | -| 描述 | 集群中可以创建的最大设备数量 | -| 类型 | Int32 | -| 默认值 | -1 | -| 改后生效方式 | 重启服务生效 | - -- database_limit_threshold - -| 名字 | database_limit_threshold | -| ------------ | ------------------------------ | -| 描述 | 集群中可以创建的最大数据库数量 | -| 类型 | Int32 | -| 默认值 | -1 | -| 改后生效方式 | 重启服务生效 | - -### 3.16 自动推断数据类型 - -- enable_auto_create_schema - -| 名字 | enable_auto_create_schema | -| ------------ | -------------------------------------- | -| 描述 | 当写入的序列不存在时,是否自动创建序列 | -| 取值 | true or false | -| 默认值 | true | -| 改后生效方式 | 重启服务生效 | - -- default_storage_group_level - -| 名字 | default_storage_group_level | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当写入的数据不存在且自动创建序列时,若需要创建相应的 database,将序列路径的哪一层当做 database。例如,如果我们接到一个新序列 root.sg0.d1.s2, 并且 level=1, 那么 root.sg0 被视为database(因为 root 是 level 0 层) | -| 取值 | int32 | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -- 
boolean_string_infer_type - -| 名字 | boolean_string_infer_type | -| ------------ | ------------------------------------------ | -| 描述 | "true" 或者 "false" 字符串被推断的数据类型 | -| 取值 | BOOLEAN 或者 TEXT | -| 默认值 | BOOLEAN | -| 改后生效方式 | 热加载 | - -- integer_string_infer_type - -| 名字 | integer_string_infer_type | -| ------------ | --------------------------------- | -| 描述 | 整型字符串推断的数据类型 | -| 取值 | INT32, INT64, FLOAT, DOUBLE, TEXT | -| 默认值 | DOUBLE | -| 改后生效方式 | 热加载 | - -- floating_string_infer_type - -| 名字 | floating_string_infer_type | -| ------------ | ----------------------------- | -| 描述 | "6.7"等字符串被推断的数据类型 | -| 取值 | DOUBLE, FLOAT or TEXT | -| 默认值 | DOUBLE | -| 改后生效方式 | 热加载 | - -- nan_string_infer_type - -| 名字 | nan_string_infer_type | -| ------------ | ---------------------------- | -| 描述 | "NaN" 字符串被推断的数据类型 | -| 取值 | DOUBLE, FLOAT or TEXT | -| 默认值 | DOUBLE | -| 改后生效方式 | 热加载 | - -- default_boolean_encoding - -| 名字 | default_boolean_encoding | -| ------------ | ------------------------ | -| 描述 | BOOLEAN 类型编码格式 | -| 取值 | PLAIN, RLE | -| 默认值 | RLE | -| 改后生效方式 | 热加载 | - -- default_int32_encoding - -| 名字 | default_int32_encoding | -| ------------ | -------------------------------------- | -| 描述 | int32 类型编码格式 | -| 取值 | PLAIN, RLE, TS_2DIFF, REGULAR, GORILLA | -| 默认值 | TS_2DIFF | -| 改后生效方式 | 热加载 | - -- default_int64_encoding - -| 名字 | default_int64_encoding | -| ------------ | -------------------------------------- | -| 描述 | int64 类型编码格式 | -| 取值 | PLAIN, RLE, TS_2DIFF, REGULAR, GORILLA | -| 默认值 | TS_2DIFF | -| 改后生效方式 | 热加载 | - -- default_float_encoding - -| 名字 | default_float_encoding | -| ------------ | ----------------------------- | -| 描述 | float 类型编码格式 | -| 取值 | PLAIN, RLE, TS_2DIFF, GORILLA | -| 默认值 | GORILLA | -| 改后生效方式 | 热加载 | - -- default_double_encoding - -| 名字 | default_double_encoding | -| ------------ | ----------------------------- | -| 描述 | double 类型编码格式 | -| 取值 | PLAIN, RLE, TS_2DIFF, GORILLA | -| 默认值 | GORILLA | -| 改后生效方式 | 热加载 | - -- default_text_encoding - -| 名字 | 
default_text_encoding | -| ------------ | --------------------- | -| 描述 | text 类型编码格式 | -| 取值 | PLAIN | -| 默认值 | PLAIN | -| 改后生效方式 | 热加载 | - -* boolean_compressor - -| 名字 | boolean_compressor | -| -------------- | ----------------------------------------------------------------------- | -| 描述 | 启用自动创建模式时,BOOLEAN 数据类型的压缩方式 (V2.0.6 版本开始支持) | -| 类型 | String | -| 默认值 | LZ4 | -| 改后生效方式 | 热加载 | - -* int32_compressor - -| 名字 | int32_compressor | -| -------------- | ------------------------------------------------------------------------- | -| 描述 | 启用自动创建模式时,INT32/DATE 数据类型的压缩方式(V2.0.6 版本开始支持) | -| 类型 | String | -| 默认值 | LZ4 | -| 改后生效方式 | 热加载 | - -* int64_compressor - -| 名字 | int64_compressor | -| -------------- | ------------------------------------------------------------------------------ | -| 描述 | 启用自动创建模式时,INT64/TIMESTAMP 数据类型的压缩方式(V2.0.6 版本开始支持) | -| 类型 | String | -| 默认值 | LZ4 | -| 改后生效方式 | 热加载 | - -* float_compressor - -| 名字 | float_compressor | -| -------------- | -------------------------------------------------------------------- | -| 描述 | 启用自动创建模式时,FLOAT 数据类型的压缩方式(V2.0.6 版本开始支持) | -| 类型 | String | -| 默认值 | LZ4 | -| 改后生效方式 | 热加载 | - -* double_compressor - -| 名字 | double_compressor | -| -------------- | --------------------------------------------------------------------- | -| 描述 | 启用自动创建模式时,DOUBLE 数据类型的压缩方式(V2.0.6 版本开始支持) | -| 类型 | String | -| 默认值 | LZ4 | -| 改后生效方式 | 热加载 | - -* text_compressor - -| 名字 | text_compressor | -| -------------- | -------------------------------------------------------------------------------- | -| 描述 | 启用自动创建模式时,TEXT/BINARY/BLOB 数据类型的压缩方式(V2.0.6 版本开始支持 ) | -| 类型 | String | -| 默认值 | LZ4 | -| 改后生效方式 | 热加载 | - - - -### 3.17 查询配置 - -- read_consistency_level - -| 名字 | read_consistency_level | -| ------------ | ------------------------------------------------------------ | -| 描述 | 查询一致性等级,取值 “strong” 时从 Leader 副本查询,取值 “weak” 时随机查询一个副本。 | -| 类型 | String | -| 默认值 | strong | -| 改后生效方式 | 重启服务生效 | - -- meta_data_cache_enable - -| 名字 | 
meta_data_cache_enable | -| ------------ | ------------------------------------------------------------ | -| 描述 | 是否缓存元数据(包括 BloomFilter、Chunk Metadata 和 TimeSeries Metadata。) | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 重启服务生效 | - -- chunk_timeseriesmeta_free_memory_proportion - -| 名字 | chunk_timeseriesmeta_free_memory_proportion | -| ------------ | ------------------------------------------------------------ | -| 描述 | 读取内存分配比例,BloomFilterCache、ChunkCache、TimeseriesMetadataCache、数据集查询的内存和可用内存的查询。参数形式为a : b : c : d : e,其中a、b、c、d、e为整数。 例如“1 : 1 : 1 : 1 : 1” ,“1 : 100 : 200 : 300 : 400” 。 | -| 类型 | String | -| 默认值 | 1 : 100 : 200 : 300 : 400 | -| 改后生效方式 | 重启服务生效 | - -- enable_last_cache - -| 名字 | enable_last_cache | -| ------------ | ------------------ | -| 描述 | 是否开启最新点缓存 | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 重启服务生效 | - -- mpp_data_exchange_core_pool_size - -| 名字 | mpp_data_exchange_core_pool_size | -| ------------ | -------------------------------- | -| 描述 | MPP 数据交换线程池核心线程数 | -| 类型 | int32 | -| 默认值 | 10 | -| 改后生效方式 | 重启服务生效 | - -- mpp_data_exchange_max_pool_size - -| 名字 | mpp_data_exchange_max_pool_size | -| ------------ | ------------------------------- | -| 描述 | MPP 数据交换线程池最大线程数 | -| 类型 | int32 | -| 默认值 | 10 | -| 改后生效方式 | 重启服务生效 | - -- mpp_data_exchange_keep_alive_time_in_ms - -| 名字 | mpp_data_exchange_keep_alive_time_in_ms | -| ------------ | --------------------------------------- | -| 描述 | MPP 数据交换最大等待时间 | -| 类型 | int32 | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- driver_task_execution_time_slice_in_ms - -| 名字 | driver_task_execution_time_slice_in_ms | -| ------------ | -------------------------------------- | -| 描述 | 单个 DriverTask 最长执行时间(ms) | -| 类型 | int32 | -| 默认值 | 200 | -| 改后生效方式 | 重启服务生效 | - -- max_tsblock_size_in_bytes - -| 名字 | max_tsblock_size_in_bytes | -| ------------ | ------------------------------- | -| 描述 | 单个 TsBlock 的最大容量(byte) | -| 类型 | int32 | -| 默认值 | 131072 | -| 改后生效方式 | 重启服务生效 | - -- max_tsblock_line_numbers - -| 名字 | 
max_tsblock_line_numbers | -| ------------ | ------------------------ | -| 描述 | 单个 TsBlock 的最大行数 | -| 类型 | int32 | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- slow_query_threshold - -| 名字 | slow_query_threshold | -| ------------ |----------------------| -| 描述 | 慢查询的时间阈值。单位:毫秒。 | -| 类型 | long | -| 默认值 | 3000 | -| 改后生效方式 | 热加载 | - -- query_cost_stat_window - -| 名字 | query_cost_stat_window | -| ------------ |--------------------| -| 描述 | 查询耗时统计的窗口,单位为分钟。 | -| 类型 | Int32 | -| 默认值 | 0 | -| 改后生效方式 | 热加载 | - -- query_timeout_threshold - -| 名字 | query_timeout_threshold | -| ------------ | -------------------------------- | -| 描述 | 查询的最大执行时间。单位:毫秒。 | -| 类型 | Int32 | -| 默认值 | 60000 | -| 改后生效方式 | 重启服务生效 | - -- max_allowed_concurrent_queries - -| 名字 | max_allowed_concurrent_queries | -| ------------ | ------------------------------ | -| 描述 | 允许的最大并发查询数量。 | -| 类型 | Int32 | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- query_thread_count - -| 名字 | query_thread_count | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当 IoTDB 对内存中的数据进行查询时,最多启动多少个线程来执行该操作。如果该值小于等于 0,那么采用机器所安装的 CPU 核的数量。 | -| 类型 | Int32 | -| 默认值 | 0 | -| 改后生效方式 | 重启服务生效 | - -- degree_of_query_parallelism - -| 名字 | degree_of_query_parallelism | -| ------------ | ------------------------------------------------------------ | -| 描述 | 设置单个查询片段实例将创建的 pipeline 驱动程序数量,也就是查询操作的并行度。 | -| 类型 | Int32 | -| 默认值 | 0 | -| 改后生效方式 | 重启服务生效 | - -- mode_map_size_threshold - -| 名字 | mode_map_size_threshold | -| ------------ | ---------------------------------------------- | -| 描述 | 计算 MODE 聚合函数时,计数映射可以增长到的阈值 | -| 类型 | Int32 | -| 默认值 | 10000 | -| 改后生效方式 | 重启服务生效 | - -- batch_size - -| 名字 | batch_size | -| ------------ | ---------------------------------------------------------- | -| 描述 | 服务器中每次迭代的数据量(数据条目,即不同时间戳的数量。) | -| 类型 | Int32 | -| 默认值 | 100000 | -| 改后生效方式 | 重启服务生效 | - -- sort_buffer_size_in_bytes - -| 名字 | sort_buffer_size_in_bytes | -| ------------ 
|------------------------------------------------------------| -| Description | Size of the memory buffer used by external sort operations | -| Type | long | -| Default | 1048576 (before V2.0.6)
0 (V2.0.6 and later). When the value is less than or equal to 0, the system computes it automatically using: `sort_buffer_size_in_bytes = Math.min(32 * 1024 * 1024, heap memory * query engine memory proportion * query execution memory proportion / number of query threads / 2)` | -| Effective mode | Hot reload | - -- merge_threshold_of_explain_analyze - -| Name | merge_threshold_of_explain_analyze | -| ------------ | ------------------------------------------------------------ | -| Description | Merge threshold for the number of operators in the result set of an `EXPLAIN ANALYZE` statement. | -| Type | int | -| Default | 10 | -| Effective mode | Hot reload | - -### 3.18 TTL Configuration - -- ttl_check_interval - -| Name | ttl_check_interval | -| ------------ | -------------------------------------- | -| Description | Interval of the TTL check task, in ms; the default is 2h | -| Type | int | -| Default | 7200000 | -| Effective mode | After restarting the service | - -- max_expired_time - -| Name | max_expired_time | -| ------------ | ------------------------------------------------------------ | -| Description | If a file contains a device that has been expired for longer than this time, the file is compacted immediately. Unit: ms; the default is one month | -| Type | int | -| Default | 2592000000 | -| Effective mode | After restarting the service | - -- expired_data_ratio - -| Name | expired_data_ratio | -| ------------ | ------------------------------------------------------------ | -| Description | Ratio of expired devices. If the ratio of expired devices in a file exceeds this value, the expired data in that file is cleaned up through compaction. | -| Type | float | -| Default | 0.3 | -| Effective mode | After restarting the service | - -### 3.19 Storage Engine Configuration - -- timestamp_precision - -| Name | timestamp_precision | -| ------------ | ---------------------------- | -| Description | Timestamp precision; supports ms, us, and ns | -| Type | String | -| Default | ms | -| Effective mode | Can only be modified before the first service startup | - -- timestamp_precision_check_enabled - -| Name | timestamp_precision_check_enabled | -| ------------ | --------------------------------- | -| Description | Controls whether the timestamp precision check is enabled | -| Type | Boolean | -| Default | true | -| Effective mode | Can only be modified before the first service startup | - -- max_waiting_time_when_insert_blocked - -| Name | max_waiting_time_when_insert_blocked | -| ------------ | ----------------------------------------------- | -| Description | An exception is thrown when an insert request waits longer than this time, in ms | -| Type | Int32 | -| Default | 10000 | -| Effective mode | After restarting the service | - -- handle_system_error - -| Name | handle_system_error | -| ------------ | ------------------------------------ | -| Description | How the system handles an unrecoverable error | -| Type 
| String | -| 默认值 | CHANGE_TO_READ_ONLY | -| 改后生效方式 | 重启服务生效 | - -- enable_timed_flush_seq_memtable - -| 名字 | enable_timed_flush_seq_memtable | -| ------------ | ------------------------------- | -| 描述 | 是否开启定时刷盘顺序 memtable | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 热加载 | - -- seq_memtable_flush_interval_in_ms - -| 名字 | seq_memtable_flush_interval_in_ms | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当 memTable 的创建时间小于当前时间减去该值时,该 memtable 需要被刷盘 | -| 类型 | long | -| 默认值 | 600000 | -| 改后生效方式 | 热加载 | - -- seq_memtable_flush_check_interval_in_ms - -| 名字 | seq_memtable_flush_check_interval_in_ms | -| ------------ | ---------------------------------------- | -| 描述 | 检查顺序 memtable 是否需要刷盘的时间间隔 | -| 类型 | long | -| 默认值 | 30000 | -| 改后生效方式 | 热加载 | - -- enable_timed_flush_unseq_memtable - -| 名字 | enable_timed_flush_unseq_memtable | -| ------------ | --------------------------------- | -| 描述 | 是否开启定时刷新乱序 memtable | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 热加载 | - -- unseq_memtable_flush_interval_in_ms - -| 名字 | unseq_memtable_flush_interval_in_ms | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当 memTable 的创建时间小于当前时间减去该值时,该 memtable 需要被刷盘 | -| 类型 | long | -| 默认值 | 600000 | -| 改后生效方式 | 热加载 | - -- unseq_memtable_flush_check_interval_in_ms - -| 名字 | unseq_memtable_flush_check_interval_in_ms | -| ------------ | ----------------------------------------- | -| 描述 | 检查乱序 memtable 是否需要刷盘的时间间隔 | -| 类型 | long | -| 默认值 | 30000 | -| 改后生效方式 | 热加载 | - -- tvlist_sort_algorithm - -| 名字 | tvlist_sort_algorithm | -| ------------ | ------------------------ | -| 描述 | memtable中数据的排序方法 | -| 类型 | String | -| 默认值 | TIM | -| 改后生效方式 | 重启服务生效 | - -- avg_series_point_number_threshold - -| 名字 | avg_series_point_number_threshold | -| ------------ | ------------------------------------------------ | -| 描述 | 内存中平均每个时间序列点数最大值,达到触发 flush | -| 类型 | int32 | -| 默认值 | 100000 | -| 改后生效方式 | 重启服务生效 | - -- flush_thread_count - 
-| 名字 | flush_thread_count | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当 IoTDB 将内存中的数据写入磁盘时,最多启动多少个线程来执行该操作。如果该值小于等于 0,那么采用机器所安装的 CPU 核的数量。默认值为 0。 | -| 类型 | int32 | -| 默认值 | 0 | -| 改后生效方式 | 重启服务生效 | - -- enable_partial_insert - -| 名字 | enable_partial_insert | -| ------------ | ------------------------------------------------------------ | -| 描述 | 在一次 insert 请求中,如果部分测点写入失败,是否继续写入其他测点。 | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 重启服务生效 | - -- recovery_log_interval_in_ms - -| 名字 | recovery_log_interval_in_ms | -| ------------ | ----------------------------------------- | -| 描述 | data region的恢复过程中打印日志信息的间隔 | -| 类型 | Int32 | -| 默认值 | 5000 | -| 改后生效方式 | 重启服务生效 | - -- 0.13_data_insert_adapt - -| 名字 | 0.13_data_insert_adapt | -| ------------ | ------------------------------------------------------- | -| 描述 | 如果 0.13 版本客户端进行写入,需要将此配置项设置为 true | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- enable_tsfile_validation - -| 名字 | enable_tsfile_validation | -| ------------ | -------------------------------------- | -| 描述 | Flush, Load 或合并后验证 tsfile 正确性 | -| 类型 | boolean | -| 默认值 | false | -| 改后生效方式 | 热加载 | - -- tier_ttl_in_ms - -| 名字 | tier_ttl_in_ms | -| ------------ | ----------------------------------------- | -| 描述 | 定义每个层级负责的数据范围,通过 TTL 表示 | -| 类型 | long | -| 默认值 | -1 | -| 改后生效方式 | 重启服务生效 | - -* max_object_file_size_in_byte - -| 名字 | max\_object\_file\_size\_in\_byte | -| -------------- |------------------------------| -| 描述 | 单对象文件的最大尺寸限制 (V2.0.8 版本起支持) | -| 类型 | long | -| 默认值 | 4294967296 | -| 改后生效方式 | 热加载 | - -* restrict_object_limit - -| 名字 | restrict\_object\_limit | 
-|----------|------------------------------------------------------------| -| Description | By default, there are no special restrictions on table names, column names, or device names for the OBJECT type. (Supported since V2.0.8.) When set to true and the table contains an OBJECT column, the following restrictions apply:
1. Naming rules: TAG column values, table names, and field names must not be "." or "..", and must not contain "./" or ".\"; otherwise metadata creation fails. If a name contains characters that the file system does not support, an error is reported when data is written.
2. Case sensitivity: if the underlying file system is case-insensitive, device identifiers such as 'd1' and 'D1' are treated as the same. In that case, creating devices with such similar names may cause their OBJECT data files to overwrite each other, resulting in data errors.
3. Storage path: the actual storage path of OBJECT data has the format `${dataregionid}/${tablename}/${tag1}/${tag2}/.../${field}/${timestamp}.bin`. | -| Type | boolean | -| Default | false | -| Effective mode | Can only be modified before the first service startup | - - -### 3.20 Compaction Configuration - -- enable_seq_space_compaction - -| Name | enable_seq_space_compaction | -| ------------ | -------------------------------------- | -| Description | Inner sequence-space compaction: enables compaction among sequence files | -| Type | Boolean | -| Default | true | -| Effective mode | Hot reload | - -- enable_unseq_space_compaction - -| Name | enable_unseq_space_compaction | -| ------------ | -------------------------------------- | -| Description | Inner unsequence-space compaction: enables compaction among unsequence files | -| Type | Boolean | -| Default | true | -| Effective mode | Hot reload | - -- enable_cross_space_compaction - -| Name | enable_cross_space_compaction | -| ------------ | ------------------------------------------ | -| Description | Cross-space compaction: enables compacting unsequence files into sequence files | -| Type | Boolean | -| Default | true | -| Effective mode | Hot reload | - -- enable_auto_repair_compaction - -| Name | enable_auto_repair_compaction | -| ------------ | ----------------------------- | -| Description | Enables automatic repair of unsorted files through compaction | -| Type | Boolean | -| Default | true | -| Effective mode | Hot reload | - -- cross_selector - -| Name | cross_selector | -| ------------ |----------------| -| Description | Selector for cross-space compaction tasks | -| Type | String | -| Default | rewrite | -| Effective mode | After restarting the service | - -- cross_performer - -| Name | cross_performer | -| ------------ |-----------------------------------| -| Description | Performer for cross-space compaction tasks; options: read_point and fast | -| Type | String | -| Default | fast | -| Effective mode | Hot reload | - -- inner_seq_selector - -| Name | inner_seq_selector | -| ------------ |------------------------------------------------------------------------| -| Description | Selector for inner sequence-space compaction tasks; options: size_tiered_single_target, size_tiered_multi_target | -| Type | String | -| Default | size_tiered_multi_target | -| Effective mode | Hot reload | - -- inner_seq_performer - -| Name | inner_seq_performer | -| ------------ |--------------------------------------| -| Description | Performer for inner sequence-space compaction tasks; options: read_chunk and fast | -| Type | String | -| Default | read_chunk | -| Effective mode | Hot reload | - -- inner_unseq_selector - -| Name | inner_unseq_selector | -| 
------------ | ------------ | -| Description | Selector for inner-space compaction tasks in the unsequence space. Options: size_tiered_single_target, size_tiered_multi_target | -| Type | String | -| Default | size_tiered_multi_target | -| Effective | Hot reload | - -- inner_unseq_performer - -| Name | inner_unseq_performer | -| ------------ | ------------ | -| Description | Performer for inner-space compaction tasks in the unsequence space. Options: read_point, fast | -| Type | String | -| Default | fast | -| Effective | Hot reload | - -- compaction_priority - -| Name | compaction_priority | -| ------------ | ------------ | -| Description | Compaction priority. INNER_CROSS: run inner-space compaction first, prioritizing a reduction in file count; CROSS_INNER: run cross-space compaction first, prioritizing the cleanup of unsequence files; BALANCE: alternate between the two compaction types. | -| Type | String | -| Default | INNER_CROSS | -| Effective | Restart required | - -- candidate_compaction_task_queue_size - -| Name | candidate_compaction_task_queue_size | -| ------------ | ------------ | -| Description | Capacity of the candidate compaction task queue | -| Type | int32 | -| Default | 50 | -| Effective | Restart required | - -- target_compaction_file_size - -| Name | target_compaction_file_size | -| ------------ | ------------ | -| Description | This parameter applies in two scenarios: 1. The target file size of inner-space compaction. 2. 
In cross-space compaction, the size of a candidate sequence file must be smaller than target_compaction_file_size * 1.5. In most cases the target file size of cross-space compaction does not exceed this threshold, and when it does, it is not by much. Default: 2 GB; unit: bytes | -| Type | Long | -| Default | 2147483648 | -| Effective | Hot reload | - -- inner_compaction_total_file_size_threshold - -| Name | inner_compaction_total_file_size_threshold | -| ------------ | ------------ | -| Description | Threshold on the total size of files in an inner-space compaction, in bytes | -| Type | Long | -| Default | 10737418240 | -| Effective | Hot reload | - -- inner_compaction_total_file_num_threshold - -| Name | inner_compaction_total_file_num_threshold | -| ------------ | ------------ | -| Description | Threshold on the total number of files in an inner-space compaction | -| Type | int32 | -| Default | 100 | -| Effective | Hot reload | - -- max_level_gap_in_inner_compaction - -| Name | max_level_gap_in_inner_compaction | -| ------------ | ------------ | -| Description | Maximum level gap allowed when selecting files for inner-space compaction | -| Type | int32 | -| Default | 2 | -| Effective | Hot reload | - -- target_chunk_size - -| Name | target_chunk_size | -| ------------ | ------------ | -| Description | Target chunk size for flush and compaction. If the data of a time series in the memtable exceeds this size, it is flushed into multiple chunks | -| Type | Long | -| Default | 1600000 | -| Effective | Restart required | - -- target_chunk_point_num - -| Name | target_chunk_point_num | -| ------------ | ------------ | -| Description | Target number of points per chunk for flush and compaction. If the number of points of a time series in the memtable exceeds this value, the data is flushed into multiple chunks | -| Type | Long | -| Default | 100000 | -| Effective | Restart required | - -- chunk_size_lower_bound_in_compaction - -| Name | chunk_size_lower_bound_in_compaction | -| ------------ | ------------ | -| Description | A chunk smaller than this threshold is deserialized into data points. The default is 128 bytes | -| Type | Long | -| Default | 128 | -| Effective | Restart required | - -- chunk_point_num_lower_bound_in_compaction - -| Name | chunk_point_num_lower_bound_in_compaction | -| ------------ | ------------ | -| Description | A chunk containing fewer data points than this threshold is deserialized into data points | -| Type | Long | -| Default | 100 | -| Effective | Restart required | - -- inner_compaction_candidate_file_num - -| Name | inner_compaction_candidate_file_num | -| ------------ | 
------------ | -| Description | Number of files required when selecting candidate files for inner-space compaction | -| Type | int32 | -| Default | 30 | -| Effective | Hot reload | - -- max_cross_compaction_candidate_file_num - -| Name | max_cross_compaction_candidate_file_num | -| ------------ | ------------ | -| Description | Upper limit on the number of candidate files selected for cross-space compaction | -| Type | int32 | -| Default | 500 | -| Effective | Hot reload | - -- max_cross_compaction_candidate_file_size - -| Name | max_cross_compaction_candidate_file_size | -| ------------ | ------------ | -| Description | Upper limit on the total size of candidate files selected for cross-space compaction | -| Type | Long | -| Default | 5368709120 | -| Effective | Hot reload | - -- min_cross_compaction_unseq_file_level - -| Name | min_cross_compaction_unseq_file_level | -| ------------ | ------------ | -| Description | Minimum inner-space compaction level an unsequence file must have to be selected as a candidate | -| Type | int32 | -| Default | 1 | -| Effective | Hot reload | - -- compaction_thread_count - -| Name | compaction_thread_count | -| ------------ | ------------ | -| Description | Number of threads that execute compaction tasks | -| Type | int32 | -| Default | 10 | -| Effective | Hot reload | - -- compaction_max_aligned_series_num_in_one_batch - -| Name | compaction_max_aligned_series_num_in_one_batch | -| ------------ | ------------ | -| Description | Number of value columns processed in one batch when compacting aligned series | -| Type | int32 | -| Default | 10 | -| Effective | Hot reload | - -- compaction_schedule_interval_in_ms - -| Name | compaction_schedule_interval_in_ms | -| ------------ | ------------ | -| Description | Interval between compaction scheduling runs, in ms | -| Type | Long | -| Default | 60000 | -| Effective | Restart required | - -- compaction_write_throughput_mb_per_sec - -| Name | compaction_write_throughput_mb_per_sec | -| ------------ | ------------ | -| Description | Upper limit on the write throughput of compaction per second. A value less than or equal to 0 means no limit | -| Type | int32 | -| Default | 16 | -| Effective | Restart required | - -- compaction_read_throughput_mb_per_sec - -| Name | compaction_read_throughput_mb_per_sec | -| --------- | ------------ | -| Description | Limit on the read throughput of compaction per second, in megabytes. A value less than or equal to 0 means no limit | -| Type | int32 | -| Default | 0 | -| Effective | Hot reload | 
- -- compaction_read_operation_per_sec - -| Name | compaction_read_operation_per_sec | -| --------- | ------------ | -| Description | Limit on the number of read operations of compaction per second. A value less than or equal to 0 means no limit | -| Type | int32 | -| Default | 0 | -| Effective | Hot reload | - -- sub_compaction_thread_count - -| Name | sub_compaction_thread_count | -| ------------ | ------------ | -| Description | Number of sub-task threads per compaction task. Applies only to cross-space compaction and inner-space compaction of the unsequence space | -| Type | int32 | -| Default | 4 | -| Effective | Hot reload | - -- inner_compaction_task_selection_disk_redundancy - -| Name | inner_compaction_task_selection_disk_redundancy | -| ------------ | ------------ | -| Description | Redundancy ratio of available disk space. Used only for inner-space compaction | -| Type | double | -| Default | 0.05 | -| Effective | Hot reload | - -- inner_compaction_task_selection_mods_file_threshold - -| Name | inner_compaction_task_selection_mods_file_threshold | -| ------------ | ------------ | -| Description | Threshold on the size of mods files. Used only for inner-space compaction | -| Type | long | -| Default | 131072 | -| Effective | Hot reload | - -- compaction_schedule_thread_num - -| Name | compaction_schedule_thread_num | -| ------------ | ------------ | -| Description | Number of threads that select compaction tasks | -| Type | int32 | -| Default | 4 | -| Effective | Hot reload | - -### 3.21 Write-Ahead Log Configuration - -- wal_mode - -| Name | wal_mode | -| ------------ | ------------ | -| Description | Write mode of the write-ahead log. 
DISABLE turns the write-ahead log off; in SYNC mode a write request returns only after the data has been written to disk; in ASYNC mode a write request may return before the data has been written to disk. | -| Type | String | -| Default | ASYNC | -| Effective | Restart required | - -- max_wal_nodes_num - -| Name | max_wal_nodes_num | -| ------------ | ------------ | -| Description | Maximum number of write-ahead log nodes. The default value 0 means the number is controlled by the system. | -| Type | int32 | -| Default | 0 | -| Effective | Restart required | - -- wal_async_mode_fsync_delay_in_ms - -| Name | wal_async_mode_fsync_delay_in_ms | -| ------------ | ------------ | -| Description | Time the write-ahead log waits before calling fsync in async mode | -| Type | int32 | -| Default | 1000 | -| Effective | Hot reload | - -- wal_sync_mode_fsync_delay_in_ms - -| Name | wal_sync_mode_fsync_delay_in_ms | -| ------------ | ------------ | -| Description | Time the write-ahead log waits before calling fsync in sync mode | -| Type | int32 | -| Default | 3 | -| Effective | Hot reload | - -- wal_buffer_size_in_byte - -| Name | wal_buffer_size_in_byte | -| ------------ | ------------ | -| Description | Buffer size of the write-ahead log | -| Type | int32 | -| Default | 33554432 | -| Effective | Restart required | - -- wal_buffer_queue_capacity - -| Name | wal_buffer_queue_capacity | -| ------------ | ------------ | -| Description | Upper limit on the size of the write-ahead log blocking queue | -| Type | int32 | -| Default | 500 | -| Effective | Restart required | - -- wal_file_size_threshold_in_byte - -| Name | wal_file_size_threshold_in_byte | -| ------------ | ------------ | -| Description | Size threshold at which a write-ahead log file is sealed | -| Type | int32 | -| Default | 31457280 | -| Effective | Hot reload | - -- wal_min_effective_info_ratio - -| Name | wal_min_effective_info_ratio | -| ------------ | ------------ | -| Description | Minimum ratio of effective information in the write-ahead log | -| Type | double | -| Default | 0.1 | -| Effective | Hot reload | - -- wal_memtable_snapshot_threshold_in_byte - -| Name | wal_memtable_snapshot_threshold_in_byte | -| ------------ | ------------ | -| Description | Memtable size threshold that triggers a memtable snapshot in the write-ahead log | -| Type | int64 | -| Default | 8388608 | -| Effective | Hot reload | - -- max_wal_memtable_snapshot_num - -| Name | max_wal_memtable_snapshot_num | -| ------------ | ------------ | -| Description | Upper limit on the number of memtables in the write-ahead log | -| Type 
| int32 | -| Default | 1 | -| Effective | Hot reload | - -- delete_wal_files_period_in_ms - -| Name | delete_wal_files_period_in_ms | -| ------------ | ------------ | -| Description | Interval between checks for write-ahead log files to delete | -| Type | int64 | -| Default | 20000 | -| Effective | Hot reload | - -- wal_throttle_threshold_in_byte - -| Name | wal_throttle_threshold_in_byte | -| ------------ | ------------ | -| Description | In IoTConsensus, once the total size of WAL files reaches this threshold, writes are throttled to control the write rate. | -| Type | long | -| Default | 53687091200 | -| Effective | Hot reload | - -- iot_consensus_cache_window_time_in_ms - -| Name | iot_consensus_cache_window_time_in_ms | -| ------------ | ------------ | -| Description | In IoTConsensus, the maximum wait time of the write cache. | -| Type | long | -| Default | -1 | -| Effective | Hot reload | - -- enable_wal_compression - -| Name | enable_wal_compression | -| ------------ | ------------ | -| Description | Controls whether WAL compression is enabled. | -| Type | boolean | -| Default | true | -| Effective | Hot reload | - -### 3.22 IoT Consensus Protocol Configuration - -The following settings take effect only when the Region is configured to use the IoTConsensus protocol. - -- data_region_iot_max_log_entries_num_per_batch - -| Name | data_region_iot_max_log_entries_num_per_batch | -| ------------ | ------------ | -| Description | Maximum number of log entries in an IoTConsensus batch | -| Type | int32 | -| Default | 1024 | -| Effective | Restart required | - -- data_region_iot_max_size_per_batch - -| Name | data_region_iot_max_size_per_batch | -| ------------ | ------------ | -| Description | Maximum size of an IoTConsensus batch | -| Type | int32 | -| Default | 16777216 | -| Effective | Restart required | - -- data_region_iot_max_pending_batches_num - -| Name | data_region_iot_max_pending_batches_num | -| ------------ | ------------ | -| Description | Pipeline concurrency threshold for IoTConsensus batches | -| Type | int32 | -| Default | 5 | -| Effective | Restart required | - -- data_region_iot_max_memory_ratio_for_queue - -| Name | data_region_iot_max_memory_ratio_for_queue | -| ------------ | ------------ | -| Description | Memory allocation ratio for the IoTConsensus queue | -| Type | double | 
-| Default | 0.6 | -| Effective | Restart required | - -- region_migration_speed_limit_bytes_per_second - -| Name | region_migration_speed_limit_bytes_per_second | -| ------------ | ------------ | -| Description | Maximum data transfer rate during region migration | -| Type | long | -| Default | 33554432 | -| Effective | Restart required | - -### 3.23 TsFile Configuration - -- group_size_in_byte - -| Name | group_size_in_byte | -| ------------ | ------------ | -| Description | Maximum number of bytes written each time in-memory data is written to disk | -| Type | int32 | -| Default | 134217728 | -| Effective | Hot reload | - -- page_size_in_byte - -| Name | page_size_in_byte | -| ------------ | ------------ | -| Description | Maximum size of a single page when a column in memory is written out, in bytes | -| Type | int32 | -| Default | 65536 | -| Effective | Hot reload | - -- max_number_of_points_in_page - -| Name | max_number_of_points_in_page | -| ------------ | ------------ | -| Description | Maximum number of data points (timestamp-value pairs) in a page | -| Type | int32 | -| Default | 10000 | -| Effective | Hot reload | - -- pattern_matching_threshold - -| Name | pattern_matching_threshold | -| ------------ | ------------ | -| Description | Maximum number of matching attempts during regular expression matching | -| Type | int32 | -| Default | 1000000 | -| Effective | Hot reload | - -- float_precision - -| Name | float_precision | -| ------------ | ------------ | -| Description | Floating-point precision, expressed as the number of digits after the decimal point | -| Type | int32 | -| Default | 2 digits. Note: a 32-bit float has 7 decimal digits of precision and a 64-bit float has 15. A setting beyond the machine precision has no practical effect. | -| Effective | Hot reload | - -- value_encoder - -| Name | value_encoder | -| ------------ | ------------ | -| Description | Encoding of the value column | -| Type | Enum String: "TS_2DIFF", "PLAIN", "RLE" | -| Default | PLAIN | -| Effective | Hot reload | - -- compressor - -| Name | compressor | -| ------------ | ------------ | -| Description | Data compression method; also the compression method of the time column in aligned series | -| Type | Enum String: "UNCOMPRESSED", "SNAPPY", "LZ4", "ZSTD", "LZMA2" | -| Default | LZ4 | -| Effective | Hot reload | - -- encrypt_flag - -| Name | encrypt_flag | -| ------------ | 
------------ | -| Description | Turns data encryption on or off. | -| Type | Boolean | -| Default | false | -| Effective | Restart required | - -- encrypt_type - -| Name | encrypt_type | -| ------------ | ------------ | -| Description | Data encryption method. | -| Type | String | -| Default | org.apache.tsfile.encrypt.UNENCRYPTED | -| Effective | Restart required | - -- encrypt_key_path - -| Name | encrypt_key_path | -| ------------ | ------------ | -| Description | Path of the key source used for data encryption. | -| Type | String | -| Default | None | -| Effective | Restart required | - -### 3.24 Authorization Configuration - -- authorizer_provider_class - -| Name | authorizer_provider_class | -| ------------ | ------------ | -| Description | Class name of the authorization service | -| Type | String | -| Default | org.apache.iotdb.commons.auth.authorizer.LocalFileAuthorizer | -| Effective | Restart required | - -- iotdb_server_encrypt_decrypt_provider - -| Name | iotdb_server_encrypt_decrypt_provider | -| ------------ | ------------ | -| Description | Class used to encrypt user passwords | -| Type | String | -| Default | org.apache.iotdb.commons.security.encrypt.MessageDigestEncrypt | -| Effective | Can only be modified before the first startup | - -- iotdb_server_encrypt_decrypt_provider_parameter - -| Name | iotdb_server_encrypt_decrypt_provider_parameter | -| ------------ | ------------ | -| Description | Parameter used to initialize the user password encryption class | -| Type | String | -| Default | None | -| Effective | Can only be modified before the first startup | - -- author_cache_size - -| Name | author_cache_size | -| ------------ | ------------ | -| Description | Size of the user cache and the role cache | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required | - -- author_cache_expire_time - -| Name | author_cache_expire_time | -| ------------ | ------------ | -| Description | Expiration time of the user cache and the role cache, in minutes | -| Type | int32 | -| Default | 30 | -| Effective | Restart required | - -### 3.25 UDF Configuration - -- udf_initial_byte_array_length_for_memory_control - -| Name | udf_initial_byte_array_length_for_memory_control | -| ------------ | ------------ | -| Description | 
Used to estimate the memory usage of text fields in UDF queries. It is recommended to set this value slightly larger than the average length of all text records. | -| Type | int32 | -| Default | 48 | -| Effective | Restart required | - -- udf_memory_budget_in_mb - -| Name | udf_memory_budget_in_mb | -| ------------ | ------------ | -| Description | Amount of memory, in MB, that a single UDF query may use. The upper limit is 20% of the memory allocated for reads. | -| Type | Float | -| Default | 30.0 | -| Effective | Restart required | - -- udf_reader_transformer_collector_memory_proportion - -| Name | udf_reader_transformer_collector_memory_proportion | -| ------------ | ------------ | -| Description | UDF memory allocation ratio. The parameter has the form a : b : c, where a, b, and c are integers. | -| Type | String | -| Default | 1:1:1 | -| Effective | Restart required | - -- udf_lib_dir - -| Name | udf_lib_dir | -| ------------ | ------------ | -| Description | Storage path for UDF logs and JAR files | -| Type | String | -| Default | ext/udf (Windows: ext\\udf) | -| Effective | Restart required | - -### 3.26 Trigger Configuration - -- trigger_lib_dir - -| Name | trigger_lib_dir | -| ------------ | ------------ | -| Description | Directory where trigger JAR packages are stored | -| Type | String | -| Default | ext/trigger | -| Effective | Restart required | - -- stateful_trigger_retry_num_when_not_found - -| Name | stateful_trigger_retry_num_when_not_found | -| ------------ | ------------ | -| Description | Number of retries when a stateful trigger fires but its trigger instance cannot be found | -| Type | Int32 | -| Default | 3 | -| Effective | Restart required | - -### 3.27 SELECT-INTO Configuration - -- into_operation_buffer_size_in_byte - -| Name | into_operation_buffer_size_in_byte | -| ------------ | ------------ | -| Description | Maximum memory, in bytes, occupied by data waiting to be written when executing a select-into statement | -| Type | long | -| Default | 104857600 | -| Effective | Hot reload | - -- select_into_insert_tablet_plan_row_limit - -| Name | select_into_insert_tablet_plan_row_limit | -| ------------ | ------------ | -| Description | Maximum number of rows a single insert-tablet-plan can handle when executing a select-into statement | -| Type | int32 | -| Default | 10000 | -| Effective | Hot reload | - -- into_operation_execution_thread_count - -| Name | into_operation_execution_thread_count | -| ------------ | 
------------ | -| Description | Number of threads in the thread pool that executes write tasks for SELECT INTO | -| Type | int32 | -| Default | 2 | -| Effective | Restart required | - -### 3.28 Continuous Query Configuration -- continuous_query_submit_thread_count - -| Name | continuous_query_execution_thread | -| ------------ | ------------ | -| Description | Number of threads in the thread pool that executes continuous query tasks | -| Type | int32 | -| Default | 2 | -| Effective | Restart required | - -- continuous_query_min_every_interval_in_ms - -| Name | continuous_query_min_every_interval_in_ms | -| ------------ | ------------ | -| Description | Minimum execution interval of a continuous query | -| Type | long (duration) | -| Default | 1000 | -| Effective | Restart required | - -### 3.29 PIPE Configuration - -- pipe_lib_dir - -| Name | pipe_lib_dir | -| ------------ | ------------ | -| Description | Directory where custom Pipe plugins are stored | -| Type | string | -| Default | ext/pipe | -| Effective | Modification not supported yet | - -- pipe_subtask_executor_max_thread_num - -| Name | pipe_subtask_executor_max_thread_num | -| ------------ | ------------ | -| Description | Maximum number of threads that the processor and the sink of a pipe subtask can each use. The actual value is min(pipe_subtask_executor_max_thread_num, max(1, number of CPU cores / 2)). | -| Type | int | -| Default | 5 | -| Effective | Restart required | - -- pipe_sink_timeout_ms - -| Name | pipe_sink_timeout_ms | -| ------------ | ------------ | -| Description | Connection timeout of the thrift client, in milliseconds. | -| Type | int | -| Default | 900000 | -| Effective | Restart required | - -- pipe_sink_selector_number - -| Name | pipe_sink_selector_number | -| ------------ | ------------ | -| Description | Maximum number of threads that process execution results in the iotdb-thrift-async-sink plugin. It is recommended to set this value less than or equal to pipe_sink_max_client_number. | -| Type | int | -| Default | 4 | -| Effective | Restart required | - -- pipe_sink_max_client_number - -| Name | pipe_sink_max_client_number | -| ------------ | ------------ | -| Description | Maximum number of clients that can be used in the iotdb-thrift-async-sink plugin. | -| Type | int | -| Default | 16 | -| Effective | Restart required | - -- pipe_air_gap_receiver_enabled - -| Name | pipe_air_gap_receiver_enabled | -| 
------------ | ------------ | -| Description | Whether to enable receiving pipe data through an air gap. In tcp mode the receiver can only return 0 or 1 to indicate whether the data was received successfully. | -| Type | Boolean | -| Default | false | -| Effective | Restart required | - -- pipe_air_gap_receiver_port - -| Name | pipe_air_gap_receiver_port | -| ------------ | ------------ | -| Description | Port on which the server receives pipe data through an air gap. | -| Type | int | -| Default | 9780 | -| Effective | Restart required | - -- pipe_all_sinks_rate_limit_bytes_per_second - -| Name | pipe_all_sinks_rate_limit_bytes_per_second | -| ------------ | ------------ | -| Description | Total number of bytes that all pipe sinks can transfer per second. A value less than or equal to 0 means no limit. The default is -1, meaning no limit. | -| Type | double | -| Default | -1 | -| Effective | Hot reload | - -### 3.30 Ratis Consensus Protocol Configuration - -The following settings take effect only when the Region is configured to use the RatisConsensus protocol. - -- config_node_ratis_log_appender_buffer_size_max - -| Name | config_node_ratis_log_appender_buffer_size_max | -| ------------ | ------------ | -| Description | Maximum number of bytes transferred by one log-synchronization RPC of the confignode | -| Type | int32 | -| Default | 16777216 | -| Effective | Restart required | - -- schema_region_ratis_log_appender_buffer_size_max - -| Name | schema_region_ratis_log_appender_buffer_size_max | -| ------------ | ------------ | -| Description | Maximum number of bytes transferred by one log-synchronization RPC of the schema region | -| Type | int32 | -| Default | 16777216 | -| Effective | Restart required | - -- data_region_ratis_log_appender_buffer_size_max - -| Name | data_region_ratis_log_appender_buffer_size_max | -| ------------ | ------------ | -| Description | Maximum number of bytes transferred by one log-synchronization RPC of the data region | -| Type | int32 | -| Default | 16777216 | -| Effective | Restart required | - -- config_node_ratis_snapshot_trigger_threshold - -| Name | config_node_ratis_snapshot_trigger_threshold | -| ------------ | ------------ | -| Description | Number of log entries required to trigger a snapshot on the confignode | -| Type | int32 | -| Default | 400000 | -| Effective | Restart required | - -- schema_region_ratis_snapshot_trigger_threshold - -| Name | schema_region_ratis_snapshot_trigger_threshold 
| -| ------------ | ------------ | -| Description | Number of log entries required to trigger a snapshot on the schema region | -| Type | int32 | -| Default | 400000 | -| Effective | Restart required | - -- data_region_ratis_snapshot_trigger_threshold - -| Name | data_region_ratis_snapshot_trigger_threshold | -| ------------ | ------------ | -| Description | Number of log entries required to trigger a snapshot on the data region | -| Type | int32 | -| Default | 400000 | -| Effective | Restart required | - -- config_node_ratis_log_unsafe_flush_enable - -| Name | config_node_ratis_log_unsafe_flush_enable | -| ------------ | ------------ | -| Description | Whether the confignode allows asynchronous flushing of the Raft log | -| Type | boolean | -| Default | false | -| Effective | Restart required | - -- schema_region_ratis_log_unsafe_flush_enable - -| Name | schema_region_ratis_log_unsafe_flush_enable | -| ------------ | ------------ | -| Description | Whether the schema region allows asynchronous flushing of the Raft log | -| Type | boolean | -| Default | false | -| Effective | Restart required | - -- data_region_ratis_log_unsafe_flush_enable - -| Name | data_region_ratis_log_unsafe_flush_enable | -| ------------ | ------------ | -| Description | Whether the data region allows asynchronous flushing of the Raft log | -| Type | boolean | -| Default | false | -| Effective | Restart required | - -- config_node_ratis_log_segment_size_max_in_byte - -| Name | config_node_ratis_log_segment_size_max_in_byte | -| ------------ | ------------ | -| Description | Size of a single RaftLog segment file on the confignode | -| Type | int32 | -| Default | 25165824 | -| Effective | Restart required | - -- schema_region_ratis_log_segment_size_max_in_byte - -| Name | schema_region_ratis_log_segment_size_max_in_byte | -| ------------ | ------------ | -| Description | Size of a single RaftLog segment file on the schema region | -| Type | int32 | -| Default | 25165824 | -| Effective | Restart required | - -- data_region_ratis_log_segment_size_max_in_byte - -| Name | data_region_ratis_log_segment_size_max_in_byte | -| ------------ | ------------ | -| Description | Size of a single RaftLog segment file on the data region | -| Type | int32 
| -| Default | 25165824 | -| Effective | Restart required | - -- config_node_simple_consensus_log_segment_size_max_in_byte - -| Name | config_node_simple_consensus_log_segment_size_max_in_byte | -| ------------ | ------------ | -| Description | Size of a single log segment file for the confignode simple consensus protocol | -| Type | int32 | -| Default | 25165824 | -| Effective | Restart required | - -- config_node_ratis_grpc_flow_control_window - -| Name | config_node_ratis_grpc_flow_control_window | -| ------------ | ------------ | -| Description | Size of the gRPC streaming flow-control window on the confignode | -| Type | int32 | -| Default | 4194304 | -| Effective | Restart required | - -- schema_region_ratis_grpc_flow_control_window - -| Name | schema_region_ratis_grpc_flow_control_window | -| ------------ | ------------ | -| Description | Size of the gRPC streaming flow-control window on the schema region | -| Type | int32 | -| Default | 4194304 | -| Effective | Restart required | - -- data_region_ratis_grpc_flow_control_window - -| Name | data_region_ratis_grpc_flow_control_window | -| ------------ | ------------ | -| Description | Size of the gRPC streaming flow-control window on the data region | -| Type | int32 | -| Default | 4194304 | -| Effective | Restart required | - -- config_node_ratis_grpc_leader_outstanding_appends_max - -| Name | config_node_ratis_grpc_leader_outstanding_appends_max | -| ------------ | ------------ | -| Description | gRPC pipeline concurrency threshold on the confignode | -| Type | int32 | -| Default | 128 | -| Effective | Restart required | - -- schema_region_ratis_grpc_leader_outstanding_appends_max - -| Name | schema_region_ratis_grpc_leader_outstanding_appends_max | -| ------------ | ------------ | -| Description | gRPC pipeline concurrency threshold on the schema region | -| Type | int32 | -| Default | 128 | -| Effective | Restart required | - -- data_region_ratis_grpc_leader_outstanding_appends_max - -| Name | data_region_ratis_grpc_leader_outstanding_appends_max | -| ------------ | ------------ | -| Description | gRPC pipeline concurrency threshold on the data region | -| Type | int32 | -| Default | 128 | -| Effective | Restart required | - -- 
config_node_ratis_log_force_sync_num - -| Name | config_node_ratis_log_force_sync_num | -| ------------ | ------------ | -| Description | fsync threshold on the confignode | -| Type | int32 | -| Default | 128 | -| Effective | Restart required | - -- schema_region_ratis_log_force_sync_num - -| Name | schema_region_ratis_log_force_sync_num | -| ------------ | ------------ | -| Description | fsync threshold on the schema region | -| Type | int32 | -| Default | 128 | -| Effective | Restart required | - -- data_region_ratis_log_force_sync_num - -| Name | data_region_ratis_log_force_sync_num | -| ------------ | ------------ | -| Description | fsync threshold on the data region | -| Type | int32 | -| Default | 128 | -| Effective | Restart required | - -- config_node_ratis_rpc_leader_election_timeout_min_ms - -| Name | config_node_ratis_rpc_leader_election_timeout_min_ms | -| ------------ | ------------ | -| Description | Minimum leader election timeout on the confignode | -| Type | int32 | -| Default | 2000ms | -| Effective | Restart required | - -- schema_region_ratis_rpc_leader_election_timeout_min_ms - -| Name | schema_region_ratis_rpc_leader_election_timeout_min_ms | -| ------------ | ------------ | -| Description | Minimum leader election timeout on the schema region | -| Type | int32 | -| Default | 2000ms | -| Effective | Restart required | - -- data_region_ratis_rpc_leader_election_timeout_min_ms - -| Name | data_region_ratis_rpc_leader_election_timeout_min_ms | -| ------------ | ------------ | -| Description | Minimum leader election timeout on the data region | -| Type | int32 | -| Default | 2000ms | -| Effective | Restart required | - -- config_node_ratis_rpc_leader_election_timeout_max_ms - -| Name | config_node_ratis_rpc_leader_election_timeout_max_ms | -| ------------ | ------------ | -| Description | Maximum leader election timeout on the confignode | -| Type | int32 | -| Default | 4000ms | -| Effective | Restart required | - -- schema_region_ratis_rpc_leader_election_timeout_max_ms - -| Name | schema_region_ratis_rpc_leader_election_timeout_max_ms | -| ------------ | 
------------ | -| Description | Maximum leader election timeout on the schema region | -| Type | int32 | -| Default | 4000ms | -| Effective | Restart required | - -- data_region_ratis_rpc_leader_election_timeout_max_ms - -| Name | data_region_ratis_rpc_leader_election_timeout_max_ms | -| ------------ | ------------ | -| Description | Maximum leader election timeout on the data region | -| Type | int32 | -| Default | 4000ms | -| Effective | Restart required | - -- config_node_ratis_request_timeout_ms - -| Name | config_node_ratis_request_timeout_ms | -| ------------ | ------------ | -| Description | Retry timeout of the Raft client on the confignode | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required | - -- schema_region_ratis_request_timeout_ms - -| Name | schema_region_ratis_request_timeout_ms | -| ------------ | ------------ | -| Description | Retry timeout of the Raft client on the schema region | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required | - -- data_region_ratis_request_timeout_ms - -| Name | data_region_ratis_request_timeout_ms | -| ------------ | ------------ | -| Description | Retry timeout of the Raft client on the data region | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required | - -- config_node_ratis_max_retry_attempts - -| Name | config_node_ratis_max_retry_attempts | -| ------------ | ------------ | -| Description | Maximum number of retries of the Raft client on the confignode | -| Type | int32 | -| Default | 10 | -| Effective | Restart required | - -- config_node_ratis_initial_sleep_time_ms - -| Name | config_node_ratis_initial_sleep_time_ms | -| ------------ | ------------ | -| Description | Initial retry sleep time of the Raft client on the confignode | -| Type | int32 | -| Default | 100ms | -| Effective | Restart required | - -- config_node_ratis_max_sleep_time_ms - -| Name | config_node_ratis_max_sleep_time_ms | -| ------------ | ------------ | -| Description | Maximum retry sleep time of the Raft client on the confignode | -| Type | int32 | -| Default | 10000 | -| Effective | Restart required | - -- schema_region_ratis_max_retry_attempts - -| Name | schema_region_ratis_max_retry_attempts | -| ------------ | 
------------ | -| Description | Maximum number of retries of the Raft client on the schema region | -| Type | int32 | -| Default | 10 | -| Effective | Restart required | - -- schema_region_ratis_initial_sleep_time_ms - -| Name | schema_region_ratis_initial_sleep_time_ms | -| ------------ | ------------ | -| Description | Initial retry sleep time of the Raft client on the schema region | -| Type | int32 | -| Default | 100ms | -| Effective | Restart required | - -- schema_region_ratis_max_sleep_time_ms - -| Name | schema_region_ratis_max_sleep_time_ms | -| ------------ | ------------ | -| Description | Maximum retry sleep time of the Raft client on the schema region | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required | - -- data_region_ratis_max_retry_attempts - -| Name | data_region_ratis_max_retry_attempts | -| ------------ | ------------ | -| Description | Maximum number of retries of the Raft client on the data region | -| Type | int32 | -| Default | 10 | -| Effective | Restart required | - -- data_region_ratis_initial_sleep_time_ms - -| Name | data_region_ratis_initial_sleep_time_ms | -| ------------ | ------------ | -| Description | Initial retry sleep time of the Raft client on the data region | -| Type | int32 | -| Default | 100ms | -| Effective | Restart required | - -- data_region_ratis_max_sleep_time_ms - -| Name | data_region_ratis_max_sleep_time_ms | -| ------------ | ------------ | -| Description | Maximum retry sleep time of the Raft client on the data region | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required | - -- ratis_first_election_timeout_min_ms - -| Name | ratis_first_election_timeout_min_ms | -| ------------ | ------------ | -| Description | Minimum timeout of the first election in the Ratis protocol | -| Type | int64 | -| Default | 50 (ms) | -| Effective | Restart required | - -- ratis_first_election_timeout_max_ms - -| Name | ratis_first_election_timeout_max_ms | -| ------------ | ------------ | -| Description | Maximum timeout of the first election in the Ratis protocol | -| Type | int64 | -| Default | 150 (ms) | -| Effective | Restart required | - -- config_node_ratis_preserve_logs_num_when_purge - -| Name | config_node_ratis_preserve_logs_num_when_purge | -| ------------ | ------------ | -| Description 
| Number of log entries the confignode keeps undeleted after a snapshot | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required | - -- schema_region_ratis_preserve_logs_num_when_purge - -| Name | schema_region_ratis_preserve_logs_num_when_purge | -| ------------ | ------------ | -| Description | Number of log entries the schema region keeps undeleted after a snapshot | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required | - -- data_region_ratis_preserve_logs_num_when_purge - -| Name | data_region_ratis_preserve_logs_num_when_purge | -| ------------ | ------------ | -| Description | Number of log entries the data region keeps undeleted after a snapshot | -| Type | int32 | -| Default | 1000 | -| Effective | Restart required | - -- config_node_ratis_log_max_size - -| Name | config_node_ratis_log_max_size | -| ------------ | ------------ | -| Description | Maximum disk space occupied by the Raft log on the confignode | -| Type | int64 | -| Default | 2147483648 (2GB) | -| Effective | Restart required | - -- schema_region_ratis_log_max_size - -| Name | schema_region_ratis_log_max_size | -| ------------ | ------------ | -| Description | Maximum disk space occupied by the Raft log on the schema region | -| Type | int64 | -| Default | 2147483648 (2GB) | -| Effective | Restart required | - -- data_region_ratis_log_max_size - -| Name | data_region_ratis_log_max_size | -| ------------ | ------------ | -| Description | Maximum disk space occupied by the Raft log on the data region | -| Type | int64 | -| Default | 21474836480 (20GB) | -| Effective | Restart required | - -- config_node_ratis_periodic_snapshot_interval - -| Name | config_node_ratis_periodic_snapshot_interval | -| ------------ | ------------ | -| Description | Interval between periodic snapshots on the confignode | -| Type | int64 | -| Default | 86400 (seconds) | -| Effective | Restart required | - -- schema_region_ratis_periodic_snapshot_interval - -| Name | schema_region_ratis_periodic_snapshot_interval | -| ------------ | ------------ | -| Description | Interval between periodic snapshots on the schema region | -| Type | int64 | -| Default | 86400 (seconds) | -| Effective | Restart required | - -- data_region_ratis_periodic_snapshot_interval - -| Name | 
data_region_ratis_periodic_snapshot_interval |
-| ------------ | ---------------------------------------------- |
-| Description | Interval between periodic snapshots of the data region |
-| Type | int64 |
-| Default | 86400 (seconds) |
-| Effective | After restarting the service |
-
-### 3.31 IoTConsensusV2 Configuration
-
-- iot_consensus_v2_pipeline_size
-
-| Name | iot_consensus_v2_pipeline_size |
-| ------------ | ------------------------------------------------------------ |
-| Description | Default event buffer size of the connector and the receiver in IoTConsensus V2. |
-| Type | int |
-| Default | 5 |
-| Effective | After restarting the service |
-
-- iot_consensus_v2_mode
-
-| Name | iot_consensus_v2_mode |
-| ------------ | ----------------------------------- |
-| Description | Consensus protocol mode used by IoTConsensus V2. |
-| Type | String |
-| Default | batch |
-| Effective | After restarting the service |
-
-### 3.32 Procedure Configuration
-
-- procedure_core_worker_thread_count
-
-| Name | procedure_core_worker_thread_count |
-| ------------ | ---------------------------------- |
-| Description | Number of worker threads |
-| Type | int32 |
-| Default | 4 |
-| Effective | After restarting the service |
-
-- procedure_completed_clean_interval
-
-| Name | procedure_completed_clean_interval |
-| ------------ | ---------------------------------- |
-| Description | Interval at which completed procedures are cleaned up |
-| Type | int32 |
-| Default | 30(s) |
-| Effective | After restarting the service |
-
-- procedure_completed_evict_ttl
-
-| Name | procedure_completed_evict_ttl |
-| ------------ | --------------------------------- |
-| Description | Retention time for the data of completed procedures |
-| Type | int32 |
-| Default | 60(s) |
-| Effective | After restarting the service |
-
-### 3.33 MQTT Broker Configuration
-
-- enable_mqtt_service
-
-| Name | enable_mqtt_service |
-| ------------ | --------------------- |
-| Description | Whether to enable the MQTT service |
-| Type | Boolean |
-| Default | false |
-| Effective | Hot reload |
-
-- mqtt_host
-
-| Name | mqtt_host |
-| ------------ | -------------------- |
-| Description | Host that the MQTT service binds to. |
-| Type | String |
-| Default | 127.0.0.1 |
-| Effective | Hot reload |
-
-- mqtt_port
-
-| Name | mqtt_port |
-| ------------ | -------------------- |
-| Description | Port that the MQTT service binds to. |
-| Type | int32 |
-| Default | 1883 |
-| Effective | Hot reload |
-
-- mqtt_handler_pool_size
-
-| Name | mqtt_handler_pool_size |
-| ------------ |
---------------------------------- |
-| Description | Size of the handler pool used to process MQTT messages. |
-| Type | int32 |
-| Default | 1 |
-| Effective | Hot reload |
-
-- mqtt_payload_formatter
-
-| Name | mqtt_payload_formatter |
-| ------------ | ---------------------------- |
-| Description | Formatter for MQTT message payloads. |
-| Type | String |
-| Default | json |
-| Effective | Hot reload |
-
-- mqtt_max_message_size
-
-| Name | mqtt_max_message_size |
-| ------------ | ------------------------------------ |
-| Description | Maximum length of an MQTT message, in bytes. |
-| Type | int32 |
-| Default | 1048576 |
-| Effective | Hot reload |
-
-### 3.34 Audit Log Configuration
-
-- enable_audit_log
-
-| Name | enable_audit_log |
-| ------------ | ------------------------------ |
-| Description | Controls whether the audit log feature is enabled. |
-| Type | Boolean |
-| Default | false |
-| Effective | After restarting the service |
-
-- audit_log_storage
-
-| Name | audit_log_storage |
-| ------------ | -------------------------- |
-| Description | Defines where audit logs are written. |
-| Type | String |
-| Default | IOTDB,LOGGER |
-| Effective | After restarting the service |
-
-- audit_log_operation
-
-| Name | audit_log_operation |
-| ------------ | -------------------------------------- |
-| Description | Defines which types of operations are recorded in the audit log. |
-| Type | String |
-| Default | DML,DDL,QUERY |
-| Effective | After restarting the service |
-
-- enable_audit_log_for_native_insert_api
-
-| Name | enable_audit_log_for_native_insert_api |
-| ------------ | -------------------------------------- |
-| Description | Controls whether the native insert API records audit logs. |
-| Type | Boolean |
-| Default | true |
-| Effective | After restarting the service |
-
-### 3.35 Whitelist Configuration
-
-- enable_white_list
-
-| Name | enable_white_list |
-| ------------ | ----------------- |
-| Description | Whether to enable the whitelist. |
-| Type | Boolean |
-| Default | false |
-| Effective | Hot reload |
-
-### 3.36 IoTDB-AI Configuration
-
-- model_inference_execution_thread_count
-
-| Name | model_inference_execution_thread_count |
-| ------------ | -------------------------------------- |
-| Description | Number of threads used for model inference operations. |
-| Type | int |
-| Default | 5 |
-| Effective | After restarting the service |
-
-### 3.37 TsFile Active Listening and Loading Configuration
-
-- load_clean_up_task_execution_delay_time_seconds
-
-| Name | load_clean_up_task_execution_delay_time_seconds |
-| ------------ | ------------------------------------------------------------ |
-| Description | How long the system waits, after a TsFile fails to load, before running the cleanup task that removes the TsFiles that failed to load. |
-| Type | int |
-| Default | 1800 |
-| Effective | Hot reload |
-
-- load_write_throughput_bytes_per_second
-
-| Name | load_write_throughput_bytes_per_second |
-| ------------ | -------------------------------------- |
-| Description | Maximum number of bytes written to disk per second while loading TsFiles. |
-| Type | int |
-| Default | -1 |
-| Effective | Hot reload |
-
-- load_active_listening_enable
-
-| Name | load_active_listening_enable |
-| ------------ | ------------------------------------------------------------ |
-| Description | Whether the DataNode actively listens for and loads tsfiles (enabled by default). |
-| Type | Boolean |
-| Default | true |
-| Effective | Hot reload |
-
-- load_active_listening_dirs
-
-| Name | load_active_listening_dirs |
-| ------------ | ------------------------------------------------------------ |
-| Description | Directories to listen on (subdirectories are included automatically). Separate multiple directories with ",". The default directory is ext/load/pending (hot loading is supported). |
-| Type | String |
-| Default | ext/load/pending |
-| Effective | Hot reload |
-
-- load_active_listening_fail_dir
-
-| Name | load_active_listening_fail_dir |
-| ------------ | ---------------------------------------------------------- |
-| Description | Directory that tsfile files are moved to after a load fails. Only one directory can be configured. |
-| Type | String |
-| Default | ext/load/failed |
-| Effective | Hot reload |
-
-- load_active_listening_max_thread_num
-
-| Name | load_active_listening_max_thread_num |
-| ------------ | ------------------------------------------------------------ |
-| Description | Maximum number of threads that run tsfile loading tasks concurrently. When the parameter is commented out, the default is max(1, CPU cores / 2). If the configured value falls outside the range [1, CPU cores / 2], it is reset to the default max(1, CPU cores / 2). |
-| Type | Long |
-| Default | 0 |
-| Effective | After restarting the service |
-
-- load_active_listening_check_interval_seconds
-
-| Name | load_active_listening_check_interval_seconds |
-| ------------ | ------------------------------------------------------------ |
-| Description | Polling interval for active listening, in seconds. Active listening for tsfiles works by polling the directories. This setting specifies the interval between two checks of load_active_listening_dirs: the next check starts load_active_listening_check_interval_seconds seconds after the previous one completes. If the configured interval is less than 1, it is reset to the default of 5 seconds. |
-| Type | Long |
-| Default | 5 |
-| Effective | After restarting the service |
-
-
-* last_cache_operation_on_load
-
-|Name| last_cache_operation_on_load |
-|:---:|:---|
-|Description| Operation performed on LastCache when a TsFile is loaded successfully. `UPDATE`: update LastCache with the data in the TsFile; `UPDATE_NO_BLOB`: like UPDATE, but invalidates the LastCache of blob series; `CLEAN_DEVICE`: invalidate the LastCache of the devices contained in the TsFile; `CLEAN_ALL`: clear the entire LastCache. |
-|Type| String |
-|Default| UPDATE_NO_BLOB |
-|Effective| After restart |
-
-* cache_last_values_for_load
-
-|Name| cache_last_values_for_load |
-|:---:|:---|
-|Description| Whether to cache the last values before loading a TsFile. Only takes effect when `last_cache_operation_on_load=UPDATE_NO_BLOB` or `last_cache_operation_on_load=UPDATE`. When set to true, blob series are ignored even if `last_cache_operation_on_load=UPDATE`. Enabling this option increases memory usage while TsFiles are being loaded. |
-|Type| Boolean |
-|Default| true |
-|Effective| After restart |
-
-* cache_last_values_memory_budget_in_byte
-
-|Name| cache_last_values_memory_budget_in_byte |
-|:---:|:---|
-|Description| Maximum memory, in bytes, used to cache last values when `cache_last_values_for_load=true`. If this limit is exceeded, the cached values are discarded and the last values are read directly from the TsFile in a streaming fashion. |
-|Type| int32 |
-|Default| 4194304 |
-|Effective| After restart |
-
-
-### 3.38 Dispatch Retry Configuration
-
-- write_request_remote_dispatch_max_retry_duration_in_ms
-
-| Name | write_request_remote_dispatch_max_retry_duration_in_ms |
-| ------------ | ------------------------------------------------------------ |
-| Description | Maximum retry duration, in milliseconds, for remote dispatch of write requests when an unknown error occurs. |
-| Type | Long |
-| Default | 60000 |
-| Effective | Hot reload |
-
-- enable_retry_for_unknown_error
-
-| Name | enable_retry_for_unknown_error |
-| ------------ | -------------------------------- |
-| Description | Controls whether unknown errors are retried. |
-| Type | boolean |
-| Default | false |
-| Effective | Hot reload |
\ No newline at end of file
diff --git
a/src/zh/UserGuide/Master/Table/Reference/System-Tables_timecho.md b/src/zh/UserGuide/Master/Table/Reference/System-Tables_timecho.md
deleted file mode 100644
index 108cd8d0a..000000000
--- a/src/zh/UserGuide/Master/Table/Reference/System-Tables_timecho.md
+++ /dev/null
@@ -1,793 +0,0 @@
-
-
-# System Tables
-
-IoTDB has a built-in system database, `INFORMATION_SCHEMA`, which contains a set of system tables that store IoTDB runtime information (for example, the SQL statements currently being executed). At present the `INFORMATION_SCHEMA` database supports read operations only.
-
-> 💡 **[V2.0.9.1 Update]**
-> 👉 Added one system table: **[TABLE_DISK_USAGE](#_2-22-table-disk-usage-表)** (table-level storage space statistics), to support cluster operations and performance analysis.
-
-## 1. System Database
-
-* Name: `INFORMATION_SCHEMA`
-* Commands: read-only. Only `Show databases (DETAILS) / Show Tables (DETAILS) / Use` are supported; any other operation fails with the error `"The database 'information_schema' can only be queried"`
-* Properties: `TTL=INF`; all other properties default to `null`
-* SQL example:
-
-```sql
-IoTDB> show databases
-+------------------+-------+-----------------------+---------------------+---------------------+
-|          Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|
-+------------------+-------+-----------------------+---------------------+---------------------+
-|information_schema|    INF|                   null|                 null|                 null|
-+------------------+-------+-----------------------+---------------------+---------------------+
-
-IoTDB> show tables from information_schema
-+-----------------------+-------+
-|              TableName|TTL(ms)|
-+-----------------------+-------+
-|                columns|    INF|
-|           config_nodes|    INF|
-|         configurations|    INF|
-|            connections|    INF|
-|        current_queries|    INF|
-|             data_nodes|    INF|
-|              databases|    INF|
-|              functions|    INF|
-|               keywords|    INF|
-|                  nodes|    INF|
-|           pipe_plugins|    INF|
-|                  pipes|    INF|
-|                queries|    INF|
-|queries_costs_histogram|    INF|
-|                regions|    INF|
-|               services|    INF|
-|          subscriptions|    INF|
-|       table_disk_usage|    INF|
-|                 tables|    INF|
-|                 topics|    INF|
-|                  views|    INF|
-+-----------------------+-------+
-```
-
-## 2.
System Tables
-
-* Names: `DATABASES`, `TABLES`, `REGIONS`, `QUERIES`, `COLUMNS`, `PIPES`, `PIPE_PLUGINS`, `SUBSCRIPTIONS`, `TOPICS`, `VIEWS`, `MODELS`, `FUNCTIONS`, `CONFIGURATIONS`, `KEYWORDS`, `NODES`, `CONFIG_NODES`, `DATA_NODES`, `CONNECTIONS`, `CURRENT_QUERIES`, `QUERIES_COSTS_HISTOGRAM`, `SERVICES`, `TABLE_DISK_USAGE` (each is described in the subsections that follow)
-* Operations: read-only. Only `SELECT`, `COUNT/SHOW DEVICES`, and `DESC` are supported. Any modification of table structure or content is rejected with the error `"The database 'information_schema' can only be queried"`
-* Column names: system table column names are lowercase by default and separated by `_`
-
-### 2.1 DATABASES Table
-
-* Contains information about all databases in the cluster
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ----------------------------- | ---------- | ----------- | ---------------- |
-| database | STRING | TAG | Database name |
-| ttl(ms) | STRING | ATTRIBUTE | Data retention time |
-| schema\_replication\_factor | INT32 | ATTRIBUTE | Number of schema replicas |
-| data\_replication\_factor | INT32 | ATTRIBUTE | Number of data replicas |
-| time\_partition\_interval | INT64 | ATTRIBUTE | Time partition interval |
-| schema\_region\_group\_num | INT32 | ATTRIBUTE | Number of schema region groups |
-| data\_region\_group\_num | INT32 | ATTRIBUTE | Number of data region groups |
-
-* Query results only include databases on which the current user holds any privilege, either on the database itself or on any table in it
-* Query example:
-
-```sql
-IoTDB> select * from information_schema.databases
-+------------------+-------+-------------------------+-----------------------+-----------------------+-----------------------+---------------------+
-|          database|ttl(ms)|schema_replication_factor|data_replication_factor|time_partition_interval|schema_region_group_num|data_region_group_num|
-+------------------+-------+-------------------------+-----------------------+-----------------------+-----------------------+---------------------+
-|information_schema|    INF|                     null|                   null|                   null|                   null|                 null|
-|         database1|    INF|                        1|                      1|              604800000|                      0|                    0|
-+------------------+-------+-------------------------+-----------------------+-----------------------+-----------------------+---------------------+
-```
-
-### 2.2 TABLES Table
-
-* Contains information about all tables in the cluster
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ------------- | ---------- | ----------- | -------------- |
-| database | STRING |
TAG | Database name |
-| table\_name | STRING | TAG | Table name |
-| ttl(ms) | STRING | ATTRIBUTE | Data retention time |
-| status | STRING | ATTRIBUTE | Status |
-| comment | STRING | ATTRIBUTE | Comment |
-
-* Note: status can be `USING`/`PRE_CREATE`/`PRE_DELETE`; see the description of [viewing tables](../Basic-Concept/Table-Management_timecho.md#12-查看表) in Table Management for details
-* Query results only include tables on which the current user holds any privilege
-* Query example:
-
-```sql
-IoTDB> select * from information_schema.tables
-+------------------+--------------+-----------+------+-------+-----------+
-|          database|    table_name|    ttl(ms)|status|comment| table_type|
-+------------------+--------------+-----------+------+-------+-----------+
-|information_schema|     databases|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|        models|        INF| USING|   null|SYSTEM VIEW|
-|information_schema| subscriptions|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|       regions|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|     functions|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|      keywords|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|       columns|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|        topics|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|configurations|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|       queries|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|        tables|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|  pipe_plugins|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|         nodes|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|    data_nodes|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|         pipes|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|         views|        INF| USING|   null|SYSTEM VIEW|
-|information_schema|  config_nodes|        INF| USING|   null|SYSTEM VIEW|
-|         database1|        table1|31536000000| USING|   null| BASE TABLE|
-+------------------+--------------+-----------+------+-------+-----------+
-```
-
-### 2.3 REGIONS Table
-
-* Contains information about all `Region`s in the cluster
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| --------------------- | ----------- | ----------- |
----------------------------------------------------------------------------------------------------------- |
-| region\_id | INT32 | TAG | Region ID |
-| datanode\_id | INT32 | TAG | DataNode ID |
-| type | STRING | ATTRIBUTE | Type (SchemaRegion / DataRegion) |
-| status | STRING | ATTRIBUTE | Status (Running, Unknown, etc.) |
-| database | STRING | ATTRIBUTE | Database name |
-| series\_slot\_num | INT32 | ATTRIBUTE | Number of series slots |
-| time\_slot\_num | INT64 | ATTRIBUTE | Number of time slots |
-| rpc\_address | STRING | ATTRIBUTE | RPC address |
-| rpc\_port | INT32 | ATTRIBUTE | RPC port |
-| internal\_address | STRING | ATTRIBUTE | Internal communication address |
-| role | STRING | ATTRIBUTE | Leader / Follower |
-| create\_time | TIMESTAMP | ATTRIBUTE | Creation time |
-| tsfile\_size\_bytes | INT64 | ATTRIBUTE | For a DataRegion whose size can be counted: total size of its TsFiles; for a DataRegion that cannot be counted (Unknown): -1; for a SchemaRegion: null |
-
-* Only administrators can query this table
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.regions
-+---------+-----------+------------+-------+---------+---------------+-------------+-----------+--------+----------------+------+-----------------------------+-----------------+
-|region_id|datanode_id|        type| status| database|series_slot_num|time_slot_num|rpc_address|rpc_port|internal_address|  role|                  create_time|tsfile_size_bytes|
-+---------+-----------+------------+-------+---------+---------------+-------------+-----------+--------+----------------+------+-----------------------------+-----------------+
-|        0|          1|SchemaRegion|Running|database1|             12|            0|    0.0.0.0|    6667|       127.0.0.1|Leader|2025-03-31T11:19:08.485+08:00|             null|
-|        1|          1|  DataRegion|Running|database1|              6|            6|    0.0.0.0|    6667|       127.0.0.1|Leader|2025-03-31T11:19:09.156+08:00|             3985|
-|        2|          1|  DataRegion|Running|database1|              6|            6|    0.0.0.0|    6667|       127.0.0.1|Leader|2025-03-31T11:19:09.156+08:00|             3841|
-+---------+-----------+------------+-------+---------+---------------+-------------+-----------+--------+----------------+------+-----------------------------+-----------------+
-```
-
-### 2.4 QUERIES
Table
-
-* Contains information about all queries currently executing in the cluster. The `SHOW QUERIES` statement returns the same information.
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| --------------- | ----------- | ----------- | ------------------------------------------------ |
-| query\_id | STRING | TAG | ID |
-| start\_time | TIMESTAMP | ATTRIBUTE | Timestamp at which the query started, in the system's timestamp precision |
-| datanode\_id | INT32 | ATTRIBUTE | ID of the DataNode that issued the query |
-| elapsed\_time | FLOAT | ATTRIBUTE | Query execution time, in seconds |
-| statement | STRING | ATTRIBUTE | SQL of the query |
-| user | STRING | ATTRIBUTE | User who issued the query |
-
-* Regular users see only their own queries; administrators see all queries.
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.queries
-+-----------------------+-----------------------------+-----------+------------+----------------------------------------+----+
-|               query_id|                   start_time|datanode_id|elapsed_time|                               statement|user|
-+-----------------------+-----------------------------+-----------+------------+----------------------------------------+----+
-|20250331_023242_00011_1|2025-03-31T10:32:42.360+08:00|          1|       0.025|select * from information_schema.queries|root|
-+-----------------------+-----------------------------+-----------+------------+----------------------------------------+----+
-```
-
-### 2.5 COLUMNS Table
-
-* Contains information about the columns of all tables in the cluster
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| -------------- | ---------- | ----------- | -------------- |
-| database | STRING | TAG | Database name |
-| table\_name | STRING | TAG | Table name |
-| column\_name | STRING | TAG | Column name |
-| datatype | STRING | ATTRIBUTE | Data type of the column |
-| category | STRING | ATTRIBUTE | Column type |
-| status | STRING | ATTRIBUTE | Column status |
-| comment | STRING | ATTRIBUTE | Column comment |
-
-Notes:
-* status can be `USING`/`PRE_DELETE`; see the description of [viewing table columns](../Basic-Concept/Table-Management_timecho.md#13-查看表的列) in Table Management for details
-* Query results only include columns of tables on which the current user holds any privilege
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.columns where database = 'database1'
-+---------+----------+------------+---------+---------+------+-------+
-| database|table_name| column_name| datatype| category|status|comment|
-+---------+----------+------------+---------+---------+------+-------+
-|database1|    table1|        time|TIMESTAMP|     TIME| USING|   null|
-|database1|    table1|      region|   STRING|      TAG| USING|   null|
-|database1|    table1|    plant_id|   STRING|      TAG| USING|   null|
-|database1|    table1|   device_id|   STRING|      TAG| USING|   null|
-|database1|    table1|    model_id|   STRING|ATTRIBUTE| USING|   null|
-|database1|    table1| maintenance|   STRING|ATTRIBUTE| USING|   null|
-|database1|    table1| temperature|    FLOAT|    FIELD| USING|   null|
-|database1|    table1|    humidity|    FLOAT|    FIELD| USING|   null|
-|database1|    table1|      status|  BOOLEAN|    FIELD| USING|   null|
-|database1|    table1|arrival_time|TIMESTAMP|    FIELD| USING|   null|
-+---------+----------+------------+---------+---------+------+-------+
-```
-
-### 2.6 PIPES Table
-
-* Contains information about all PIPEs in the cluster
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ------------------------------- | ----------- | ----------- | --------------------------------------- |
-| id | STRING | TAG | Pipe name |
-| creation\_time | TIMESTAMP | ATTRIBUTE | Creation time |
-| state | STRING | ATTRIBUTE | Pipe state (RUNNING/STOPPED) |
-| pipe\_source | STRING | ATTRIBUTE | Source plugin parameters |
-| pipe\_processor | STRING | ATTRIBUTE | Processor plugin parameters |
-| pipe\_sink | STRING | ATTRIBUTE | Sink plugin parameters |
-| exception\_message | STRING | ATTRIBUTE | Exception message |
-| remaining\_event\_count | INT64 | ATTRIBUTE | Number of remaining events; -1 if unknown |
-| estimated\_remaining\_seconds | DOUBLE | ATTRIBUTE | Estimated remaining time; -1 if unknown |
-
-* Only administrators can query this table
-* Query example:
-
-```SQL
-select * from information_schema.pipes
-+----------+-----------------------------+-------+--------------------------------------------------------------------------+--------------+-----------------------------------------------------------------------+-----------------+---------------------+---------------------------+
-|        id|                creation_time|  state|                                                               pipe_source|pipe_processor|                                                              pipe_sink|exception_message|remaining_event_count|estimated_remaining_seconds|
-+----------+-----------------------------+-------+--------------------------------------------------------------------------+--------------+-----------------------------------------------------------------------+-----------------+---------------------+---------------------------+
-|tablepipe1|2025-03-31T12:25:24.040+08:00|RUNNING|{__system.sql-dialect=table, source.password=******, source.username=root}|            {}|{format=hybrid, node-urls=192.168.xxx.xxx:6667, sink=iotdb-thrift-sink}|                 |                    0|                        0.0|
-+----------+-----------------------------+-------+--------------------------------------------------------------------------+--------------+-----------------------------------------------------------------------+-----------------+---------------------+---------------------------+
-```
-
-### 2.7 PIPE\_PLUGINS Table
-
-* Contains information about all PIPE plugins in the cluster
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| -------------- | ---------- | ----------- | ----------------------------------------------- |
-| plugin\_name | STRING | TAG | Plugin name |
-| plugin\_type | STRING | ATTRIBUTE | Plugin type (Builtin/External) |
-| class\_name | STRING | ATTRIBUTE | Main class name of the plugin |
-| plugin\_jar | STRING | ATTRIBUTE | Name of the plugin's jar file; null for builtin plugins |
-
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.pipe_plugins
-+---------------------+-----------+-------------------------------------------------------------------------------------------------+----------+
-|          plugin_name|plugin_type|                                                                                       class_name|plugin_jar|
-+---------------------+-----------+-------------------------------------------------------------------------------------------------+----------+
-|IOTDB-THRIFT-SSL-SINK|    Builtin|org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector|      null|
-|   IOTDB-AIR-GAP-SINK|    Builtin|   org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector|      null|
-|      DO-NOTHING-SINK|    Builtin|
org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.donothing.DoNothingConnector|      null|
-| DO-NOTHING-PROCESSOR|    Builtin|        org.apache.iotdb.commons.pipe.agent.plugin.builtin.processor.donothing.DoNothingProcessor|      null|
-|    IOTDB-THRIFT-SINK|    Builtin|   org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector|      null|
-|         IOTDB-SOURCE|    Builtin|                org.apache.iotdb.commons.pipe.agent.plugin.builtin.extractor.iotdb.IoTDBExtractor|      null|
-+---------------------+-----------+-------------------------------------------------------------------------------------------------+----------+
-```
-
-### 2.8 SUBSCRIPTIONS Table
-
-* Contains information about all data subscriptions in the cluster
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ----------------------- | ---------- | ----------- | -------------- |
-| topic\_name | STRING | TAG | Subscription topic name |
-| consumer\_group\_name | STRING | TAG | Consumer group name |
-| subscribed\_consumers | STRING | ATTRIBUTE | Subscribed consumers |
-
-* Only administrators can query this table
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.subscriptions where topic_name = 'topic_1'
-+----------+-------------------+--------------------------------+
-|topic_name|consumer_group_name|            subscribed_consumers|
-+----------+-------------------+--------------------------------+
-|   topic_1|                cg1|[c3, c4, c5, c6, c7, c0, c1, c2]|
-+----------+-------------------+--------------------------------+
-```
-
-### 2.9 TOPICS Table
-
-* Contains information about all data subscription topics in the cluster
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ---------------- | ---------- | ----------- | -------------- |
-| topic\_name | STRING | TAG | Subscription topic name |
-| topic\_configs | STRING | ATTRIBUTE | Subscription topic configuration |
-
-* Only administrators can query this table
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.topics
-+----------+----------------------------------------------------------------+
-|topic_name|                                                   topic_configs|
-+----------+----------------------------------------------------------------+
-|     topic|{__system.sql-dialect=table, start-time=2025-01-10T17:05:38.282}|
-+----------+----------------------------------------------------------------+
-```
-
-### 2.10 VIEWS Table
-
-> This system table is available since V2.0.5
-
-* Contains information about all table views in the database
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ------------------ | ---------- | ----------- | ---------------- |
-| database | STRING | TAG | Database name |
-| table\_name | STRING | TAG | View name |
-| view\_definition | STRING | ATTRIBUTE | Statement that created the view |
-
-* Query results only include views on which the current user holds any privilege
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.views
-+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------+
-| database|table_name|                                                                                                                        view_definition|
-+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------+
-|database1|        ln|CREATE VIEW "ln" ("device" STRING TAG,"model" STRING TAG,"status" BOOLEAN FIELD,"hardware" STRING FIELD) WITH (ttl='INF') AS root.ln.**|
-+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------+
-```
-
-### 2.11 MODELS Table
-
-> This system table is available since V2.0.5 and is no longer provided as of V2.0.8
-
-* Contains information about all models in the database
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ------------- | ---------- | ----------- | ----------------------------------------------------------------------- |
-| model\_id | STRING | TAG | Model name |
-| model\_type | STRING | ATTRIBUTE | Model type (forecasting, anomaly detection, user-defined) |
-| state | STRING | ATTRIBUTE | Model state (whether it is available) |
-| configs | STRING | ATTRIBUTE | Model hyperparameters as a string, the same as the regular show output |
-| notes | STRING | ATTRIBUTE | Model notes. Built-in models: `Built-in model in IoTDB`; user models: user-defined |
-
-* Query example:
-
-```SQL
--- Find all models whose type is built-in forecast
-IoTDB> select * from information_schema.models where model_type = 'BUILT_IN_FORECAST'
-+---------------------+-----------------+------+-------+-----------------------+
-|             model_id|       model_type| state|configs|                  notes|
-+---------------------+-----------------+------+-------+-----------------------+
-|       _STLForecaster|BUILT_IN_FORECAST|ACTIVE|   null|Built-in model in IoTDB|
-|     _NaiveForecaster|BUILT_IN_FORECAST|ACTIVE|   null|Built-in model in IoTDB|
-|               _ARIMA|BUILT_IN_FORECAST|ACTIVE|   null|Built-in model in IoTDB|
-|_ExponentialSmoothing|BUILT_IN_FORECAST|ACTIVE|   null|Built-in model in IoTDB|
-|         _HoltWinters|BUILT_IN_FORECAST|ACTIVE|   null|Built-in model in IoTDB|
-|             _sundial|BUILT_IN_FORECAST|ACTIVE|   null|Built-in model in IoTDB|
-+---------------------+-----------------+------+-------+-----------------------+
-```
-
-### 2.12 FUNCTIONS Table
-
-> This system table is available since V2.0.5
-
-* Contains information about all functions in the database
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ------------------ | ---------- | ----------- | ----------------------------------------- |
-| function\_name | STRING | TAG | Function name |
-| function\_type | STRING | ATTRIBUTE | Function type (built-in or external; scalar, aggregate, or table function) |
-| class\_name(udf) | STRING | ATTRIBUTE | Class name if the function is a UDF, otherwise null (tentative) |
-| state | STRING | ATTRIBUTE | Whether the function is available |
-
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.functions where function_type='built-in table function'
-+--------------+-----------------------+---------------+---------+
-|function_table|          function_type|class_name(udf)|    state|
-+--------------+-----------------------+---------------+---------+
-|      CUMULATE|built-in table function|           null|AVAILABLE|
-|       SESSION|built-in table function|           null|AVAILABLE|
-|           HOP|built-in table function|           null|AVAILABLE|
-|        TUMBLE|built-in table function|           null|AVAILABLE|
-|      FORECAST|built-in table function|           null|AVAILABLE|
-|     VARIATION|built-in table function|           null|AVAILABLE|
-|      CAPACITY|built-in table function|           null|AVAILABLE|
-+--------------+-----------------------+---------------+---------+
-```
-
-### 2.13 CONFIGURATIONS Table
-
-> This system table is available since V2.0.5
-
-* Contains information about all configuration properties of the database
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ---------- | ---------- | ----------- | -------- |
-| variable | STRING | TAG | Property name |
-| value | STRING |
ATTRIBUTE | Property value |
-
-* Only administrators can query this table
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.configurations
-+----------------------------------+-----------------------------------------------------------------+
-|                          variable|                                                            value|
-+----------------------------------+-----------------------------------------------------------------+
-|                       ClusterName|                                                   defaultCluster|
-|             DataReplicationFactor|                                                                1|
-|           SchemaReplicationFactor|                                                                1|
-|  DataRegionConsensusProtocolClass|                      org.apache.iotdb.consensus.iot.IoTConsensus|
-|SchemaRegionConsensusProtocolClass|                  org.apache.iotdb.consensus.ratis.RatisConsensus|
-|  ConfigNodeConsensusProtocolClass|                  org.apache.iotdb.consensus.ratis.RatisConsensus|
-|               TimePartitionOrigin|                                                                0|
-|             TimePartitionInterval|                                                        604800000|
-|              ReadConsistencyLevel|                                                           strong|
-|           SchemaRegionPerDataNode|                                                                1|
-|             DataRegionPerDataNode|                                                                0|
-|                     SeriesSlotNum|                                                             1000|
-|           SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor|
-|         DiskSpaceWarningThreshold|                                                             0.05|
-|                TimestampPrecision|                                                               ms|
-+----------------------------------+-----------------------------------------------------------------+
-```
-
-### 2.14 KEYWORDS Table
-
-> This system table is available since V2.0.5
-
-* Contains information about all keywords in the database
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ---------- | ---------- | ----------- | -------------------------------- |
-| word | STRING | TAG | Keyword |
-| reserved | INT32 | ATTRIBUTE | Whether the keyword is reserved: 1 means yes, 0 means no |
-
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.keywords limit 10
-+----------+--------+
-|      word|reserved|
-+----------+--------+
-|    ABSENT|       0|
-|ACTIVATION|       1|
-|  ACTIVATE|       1|
-|       ADD|       0|
-|     ADMIN|       0|
-|     AFTER|       0|
-|   AINODES|       1|
-|       ALL|       0|
-|     ALTER|       1|
-|   ANALYZE|       0|
-+----------+--------+
-```
-
-### 2.15 NODES Table
-
-> This system table is available since V2.0.5
-
-* Contains information about all nodes in the database
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ------------------------------ | ---------- | ----------- | --------------- |
-| node\_id | INT32 | TAG | Node ID |
-| node\_type | STRING | ATTRIBUTE | Node type |
-| status | STRING |
ATTRIBUTE | Node status |
-| internal\_address | STRING | ATTRIBUTE | Internal RPC address |
-| internal\_port | INT32 | ATTRIBUTE | Internal port |
-| version | STRING | ATTRIBUTE | Version number |
-| build\_info | STRING | ATTRIBUTE | Commit ID |
-| activate\_status (Enterprise Edition only) | STRING | ATTRIBUTE | Activation status |
-
-* Only administrators can query this table
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.nodes
-+-------+----------+-------+----------------+-------------+-------+----------+
-|node_id| node_type| status|internal_address|internal_port|version|build_info|
-+-------+----------+-------+----------------+-------------+-------+----------+
-|      0|ConfigNode|Running|       127.0.0.1|        10710|2.0.5.1|   58d685e|
-|      1|  DataNode|Running|       127.0.0.1|        10730|2.0.5.1|   58d685e|
-+-------+----------+-------+----------------+-------------+-------+----------+
-```
-
-### 2.16 CONFIG\_NODES Table
-
-> This system table is available since V2.0.5
-
-* Contains information about all ConfigNodes in the database
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ------------------------- | ---------- | ----------- | --------------------- |
-| node\_id | INT32 | TAG | Node ID |
-| config\_consensus\_port | INT32 | ATTRIBUTE | ConfigNode consensus port |
-| role | STRING | ATTRIBUTE | ConfigNode role |
-
-* Only administrators can query this table
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.config_nodes
-+-------+---------------------+------+
-|node_id|config_consensus_port|  role|
-+-------+---------------------+------+
-|      0|                10720|Leader|
-+-------+---------------------+------+
-```
-
-### 2.17 DATA\_NODES Table
-
-> This system table is available since V2.0.5
-
-* Contains information about all DataNodes in the database
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ------------------------ | ---------- | ----------- | ----------------------- |
-| node\_id | INT32 | TAG | Node ID |
-| data\_region\_num | INT32 | ATTRIBUTE | Number of DataRegions |
-| schema\_region\_num | INT32 | ATTRIBUTE | Number of SchemaRegions |
-| rpc\_address | STRING | ATTRIBUTE | RPC address |
-| rpc\_port | INT32 | ATTRIBUTE | RPC port |
-| mpp\_port | INT32 | ATTRIBUTE | MPP communication port |
-| data\_consensus\_port | INT32 | ATTRIBUTE | DataRegion consensus port |
-|
schema\_consensus\_port | INT32 | ATTRIBUTE | SchemaRegion consensus port |
-
-* Only administrators can query this table
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.data_nodes
-+-------+---------------+-----------------+-----------+--------+--------+-------------------+---------------------+
-|node_id|data_region_num|schema_region_num|rpc_address|rpc_port|mpp_port|data_consensus_port|schema_consensus_port|
-+-------+---------------+-----------------+-----------+--------+--------+-------------------+---------------------+
-|      1|              4|                4|    0.0.0.0|    6667|   10740|              10760|                10750|
-+-------+---------------+-----------------+-----------+--------+--------+-------------------+---------------------+
-```
-
-### 2.18 CONNECTIONS Table
-
-> This system table is available since V2.0.8
-
-* Contains all connections in the cluster.
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| -------------------- | -------------------- | ------------------ | ---------------- |
-| datanode\_id | STRING | TAG | DataNode ID |
-| user\_id | STRING | TAG | User ID |
-| session\_id | STRING | TAG | Session ID |
-| user\_name | STRING | ATTRIBUTE | User name |
-| last\_active\_time | TIMESTAMP | ATTRIBUTE | Time of last activity |
-| client\_ip | STRING | ATTRIBUTE | Client IP |
-
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.connections;
-+-----------+-------+----------+---------+-----------------------------+---------+
-|datanode_id|user_id|session_id|user_name|             last_active_time|client_ip|
-+-----------+-------+----------+---------+-----------------------------+---------+
-|          1|      0|         2|     root|2026-01-21T16:28:54.704+08:00|127.0.0.1|
-+-----------+-------+----------+---------+-----------------------------+---------+
-```
-
-### 2.19 CURRENT\_QUERIES Table
-
-> This system table is available since V2.0.8
-
-* Contains every query whose end time falls within `[now() - query_cost_stat_window, now())`, as well as queries that are currently executing. `query_cost_stat_window` is the window used for query cost statistics; it defaults to 0 and can be set in the configuration file `iotdb-system.properties`.
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| -------------- | ----------- | -------- |
------------------------------------------------------------------------------- | -| query\_id | STRING | TAG | ID of the query statement | -| state | STRING | FIELD | Query state: RUNNING means the query is executing, FINISHED means it has ended | -| start\_time | TIMESTAMP | FIELD | Timestamp at which the query started, in the system's timestamp precision | -| end\_time | TIMESTAMP | FIELD | Timestamp at which the query ended, in the system's timestamp precision. NULL if the query has not finished yet | -| datanode\_id | INT32 | FIELD | The DataNode from which the query was issued | -| cost\_time | FLOAT | FIELD | Execution time of the query, in seconds. If the query has not finished yet, this is the time it has been running so far | -| statement | STRING | FIELD | SQL of the query, or the SQL assembled from the query request | -| user | STRING | FIELD | User who issued the query | -| client\_ip | STRING | FIELD | IP of the client that issued the query | - -* Ordinary users see only the queries they issued themselves; administrators see all queries. -* Query example: - -```SQL -IoTDB> select * from information_schema.current_queries; -+-----------------------+-------+-----------------------------+--------+-----------+---------+------------------------------------------------+----+---------+ -| query_id| state| start_time|end_time|datanode_id|cost_time| statement|user|client_ip| -+-----------------------+-------+-----------------------------+--------+-----------+---------+------------------------------------------------+----+---------+ -|20260121_085427_00013_1|RUNNING|2026-01-21T16:54:27.019+08:00| null| 1| 0.0|select * from information_schema.current_queries|root|127.0.0.1| -+-----------------------+-------+-----------------------------+--------+-----------+---------+------------------------------------------------+----+---------+ -``` - -### 2.20 QUERIES\_COSTS\_HISTOGRAM Table - -> This system table is available since V2.0.8. - -* Contains a histogram of query costs over the past `query_cost_stat_window` (only SQL statements that have finished executing are counted). `query_cost_stat_window` is the window used for query cost statistics; it defaults to 0 and can be set in the configuration file `iotdb-system.properties`. -* The table structure is as follows: - -| Column | Data Type | Column Type | Description | -| -------------- | ---------- | -------- | -------------------------------------------------------------------- | -| bin | STRING | TAG | Bucket name. There are 61 buckets in total: [0, 1), [1, 2), [2, 3), ..., [59, 60), 60+ | -| nums | INT32 | FIELD | Number of SQL statements in the bucket | -| datanode\_id | INT32 | FIELD | The DataNode the bucket belongs to | - -* Only administrators can perform this operation. -* Query example: - -```SQL 
-IoTDB> select * from information_schema.queries_costs_histogram limit 10 -+------+----+-----------+ -| bin|nums|datanode_id| -+------+----+-----------+ -| [0,1)| 0| 1| -| [1,2)| 0| 1| -| [2,3)| 0| 1| -| [3,4)| 0| 1| -| [4,5)| 0| 1| -| [5,6)| 0| 1| -| [6,7)| 0| 1| -| [7,8)| 0| 1| -| [8,9)| 0| 1| -|[9,10)| 0| 1| -+------+----+-----------+ -``` - -### 2.21 SERVICES Table - -> This system table is available since V2.0.8.2. - -* Shows the services (MQTT service, REST service) on every DataNode that is working normally (RUNNING or READ-ONLY). -* The table structure is as follows: - -| Column | Data Type | Column Type | Description | -| --------------- | ---------- | ----------- | ------------------------------ | -| service\_name | STRING | TAG | Service name | -| datanode\_id | INT32 | ATTRIBUTE | ID of the DataNode hosting the service | -| state | STRING | ATTRIBUTE | Service state: RUNNING / STOPPED | - -* Query example: - -```SQL -IoTDB> select * from information_schema.services -+------------+-----------+-------+ -|service_name|datanode_id| state| -+------------+-----------+-------+ -| MQTT| 1|STOPPED| -| REST| 1|RUNNING| -+------------+-----------+-------+ -``` - -### 2.22 TABLE\_DISK\_USAGE Table - -> This system table is available since V2.0.9.1. - -Shows the disk space used by the specified tables (views are not included), covering both the ChunkGroup size and the metadata size. - -Note: the statistics are based on the actual size of the data in the TsFiles, so deletions recorded in mods files are not taken into account. - -The table structure is as follows: - -| Column | Data Type | Column Type | Description | -| ----------------- | ---------- | -------- | -------------------- | -| database | STRING | FIELD | Database name | -| table\_name | STRING | FIELD | Table name | -| datanode\_id | INT32 | FIELD | DataNode ID | -| region\_id | INT32 | FIELD | Region ID | -| time\_partition | INT64 | FIELD | Time partition ID | -| size\_in\_bytes | INT64 | FIELD | Disk space used (bytes) | - -Query examples: - -```SQL --- Query all rows -select * from information_schema.table_disk_usage; -``` - -```Bash -+---------+-------------------+-----------+---------+--------------+-------------+ -| database| table_name|datanode_id|region_id|time_partition|size_in_bytes| -+---------+-------------------+-----------+---------+--------------+-------------+ -|database1| table1| 1| 3| 2864| 867| -|database1| table11| 1| 3| 2864| 0| -|database1| table3| 1| 3| 
2864| 0| -|database1| table1| 1| 3| 2865| 1411| -|database1| table11| 1| 3| 2865| 0| -|database1| table3| 1| 3| 2865| 0| -|database1| table1| 1| 3| 2925| 590| -|database1| table11| 1| 3| 2925| 0| -|database1| table3| 1| 3| 2925| 0| -|database1| table1| 1| 4| 2864| 883| -|database1| table11| 1| 4| 2864| 0| -|database1| table3| 1| 4| 2864| 0| -|database1| table1| 1| 4| 2865| 1224| -|database1| table11| 1| 4| 2865| 0| -|database1| table3| 1| 4| 2865| 0| -|database1| table1| 1| 4| 2888| 0| -|database1| table11| 1| 4| 2888| 0| -|database1| table3| 1| 4| 2888| 205| -| etth| tab_cov_forecast| 1| 8| 0| 0| -| etth| tab_real| 1| 8| 0| 963| -| etth|tab_target_forecast| 1| 8| 0| 0| -| etth| tab_cov_forecast| 1| 9| 0| 448| -| etth| tab_real| 1| 9| 0| 0| -| etth|tab_target_forecast| 1| 9| 0| 0| -+---------+-------------------+-----------+---------+--------------+-------------+ -``` - -```SQL --- 指定查询条件; -select * from information_schema.table_disk_usage where region_id = 4 and table_name like '%1'; -``` - -```Bash -+---------+----------+-----------+---------+--------------+-------------+ -| database|table_name|datanode_id|region_id|time_partition|size_in_bytes| -+---------+----------+-----------+---------+--------------+-------------+ -|database1| table1| 1| 4| 2864| 883| -|database1| table11| 1| 4| 2864| 0| -|database1| table1| 1| 4| 2865| 1224| -|database1| table11| 1| 4| 2865| 0| -|database1| table1| 1| 4| 2888| 0| -|database1| table11| 1| 4| 2888| 0| -+---------+----------+-----------+---------+--------------+-------------+ -``` - - -## 3. 
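Permissions - -* `GRANT/REVOKE` statements cannot be used to manage privileges on the `information_schema` database or on any table in it. -* Any user can view information about the `information_schema` database with the `show databases` statement. -* Any user can view information about all system tables with the `show tables from information_schema` statement. -* Any user can view any system table with the `desc` statement. diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function_timecho.md b/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function_timecho.md deleted file mode 100644 index 6a0a3e041..000000000 --- a/src/zh/UserGuide/Master/Table/SQL-Manual/Basis-Function_timecho.md +++ /dev/null @@ -1,2670 +0,0 @@ - - -# Basic Functions - -## 1. Comparison Functions and Operators - -### 1.1 Basic Comparison Operators - -Comparison operators compare two values and return the result of the comparison (true or false). - -| Operator | Description | -| ------ | ---------- | -| < | Less than | -| > | Greater than | -| <= | Less than or equal to | -| >= | Greater than or equal to | -| = | Equal to | -| <> | Not equal to | -| != | Not equal to | - -#### 1.1.1 Comparison Rules - -1. Every type can be compared with itself. -2. Numeric types (INT32, INT64, FLOAT, DOUBLE, TIMESTAMP) can be compared with one another. -3. Character types (STRING, TEXT) can also be compared with one another. -4. Comparing types in any other combination raises an error. - -### 1.2 BETWEEN Operator - -1. The `BETWEEN` operator tests whether a value lies within a specified range. -2. The `NOT BETWEEN` operator tests whether a value lies outside a specified range. -3. `BETWEEN` and `NOT BETWEEN` can be used with any orderable type. -4. The value, minimum, and maximum arguments of `BETWEEN` and `NOT BETWEEN` must all be of the same type; otherwise an error is raised. - -**Syntax:** - -```SQL - value BETWEEN min AND max - value NOT BETWEEN min AND max -``` - -Example 1: BETWEEN - -```SQL --- Query the records whose temperature is between 85.0 and 90.0 -SELECT * FROM table1 WHERE temperature BETWEEN 85.0 AND 90.0; -``` - -Example 2: NOT BETWEEN - -```SQL --- Query the records whose humidity is not between 35.0 and 40.0 -SELECT * FROM table1 WHERE humidity NOT BETWEEN 35.0 AND 40.0; -``` - -### 1.3 IS NULL Operator - -1. The `IS NULL` and `IS NOT NULL` operators test whether a value is NULL. -2. Both operators apply to all data types. - -Example 1: query the records whose temperature is NULL - -```SQL -SELECT * FROM table1 WHERE temperature IS NULL; -``` - -Example 2: query the records whose humidity is not NULL - -```SQL -SELECT * FROM table1 WHERE humidity IS NOT NULL; -``` - -### 1.4 IN Operator - -1. The `IN` operator can be used in a `WHERE` clause to compare a column against a set of values. -2. These values can be given as a static array or as scalar expressions. - -**Syntax:** - -```SQL -... 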
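WHERE column [NOT] IN ('value1','value2', expression1) -``` - -Example 1: static array: query the records whose region is '北京' or '上海' - -```SQL -SELECT * FROM table1 WHERE region IN ('北京', '上海'); --- equivalent to -SELECT * FROM table1 WHERE region = '北京' OR region = '上海'; -``` - -Example 2: scalar expressions: query the records whose temperature is one of the given values - -```SQL -SELECT * FROM table1 WHERE temperature IN (85.0, 90.0); -``` - -Example 3: query the records whose region is neither '北京' nor '上海' - -```SQL -SELECT * FROM table1 WHERE region NOT IN ('北京', '上海'); -``` - -### 1.5 GREATEST and LEAST - -The `Greatest` function returns the largest value in its argument list and the `Least` function returns the smallest; the return type is the same as the input type. -1. NULL handling: if all arguments are NULL, the result is NULL. -2. Argument count: at least 2 arguments must be provided. -3. Type constraint: only arguments of the same data type can be compared. -4. Supported types: `BOOLEAN`, `FLOAT`, `DOUBLE`, `INT32`, `INT64`, `STRING`, `TEXT`, `TIMESTAMP`, `DATE` - -**Syntax:** - -```sql - greatest(value1, value2, ..., valueN) - least(value1, value2, ..., valueN) -``` - -**Examples:** - -```sql --- For each row of table2, return the larger of temperature and humidity -SELECT GREATEST(temperature,humidity) FROM table2; - --- For each row of table2, return the smaller of temperature and humidity -SELECT LEAST(temperature,humidity) FROM table2; -``` - - -## 2. Aggregate Functions - -### 2.1 Overview - -1. Aggregate functions are many-to-one functions: they perform an aggregate computation over a set of values and produce a single result. -2. 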
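All aggregate functions except `COUNT()` ignore null values, and return null when there are no input rows or when all values are null. For example, `SUM()` returns null rather than zero, and `AVG()` does not include null values in its count. - -### 2.2 Supported Aggregate Functions - -| Function | Description | Allowed Input Types | Output Type | -|-----------------------|--------------------|--------------------|------------------| -| COUNT | Counts the number of data points. | All types | INT64 | -| COUNT_IF | COUNT_IF(exp) counts the rows that satisfy the given boolean expression. | `exp` must be a boolean expression, for example count_if(temperature>20) | INT64 | -| APPROX_COUNT_DISTINCT | APPROX_COUNT_DISTINCT(x[,maxStandardError]) returns an approximation of COUNT(DISTINCT x), that is, the approximate number of distinct input values. | `x`: the column to compute on, any type;<br>`maxStandardError`: the maximum standard error the function may produce, in the range [0.0040625, 0.26]; defaults to 0.023 when not specified. | INT64 | -| APPROX_MOST_FREQUENT | APPROX_MOST_FREQUENT(x, k, capacity) approximates the k most frequent elements of a data set. It returns a JSON string in which each key is an element value and each value is that element's approximate frequency. (Supported since V2.0.5.1) | `x`: the column to compute on, any data type currently available in IoTDB;<br>`k`: the number of most frequent values to return;<br>`capacity`: the number of buckets used for the computation, which determines memory use: a larger value gives a smaller error but uses more memory, a smaller value gives a larger error but uses less memory. | STRING | -| APPROX_PERCENTILE | APPROX_PERCENTILE computes the value at a given percentile of a data set, which helps to understand the data distribution quickly (median, quartiles, and so on). Weighted percentiles are supported. If the percentile does not fall on an exact position, the result is the linear interpolation of the neighboring values at that position. Memory use depends on the number of centroids; the maximum number of centroids can be limited with the compression parameter, and the error can be estimated with an empirical formula. Note: supported since V2.0.9.1 | Unweighted form: APPROX_PERCENTILE (x, percentage)<br>x: the column to compute on, any numeric type such as INT32, INT64, FLOAT, DOUBLE, TIMESTAMP;<br>percentage: the target percentile, of type DOUBLE.<br>Weighted form: APPROX_PERCENTILE (x, w, percentage)<br>x: the column to compute on, any numeric type such as INT32, INT64, FLOAT, DOUBLE, TIMESTAMP;<br>w: the weight column, an integer type (aligned row by row with the computed column; a row whose weight is Null or 0 is ignored);<br>percentage: the target percentile, of type DOUBLE. | Same as the type of the computed column x | -| SUM | Sum. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| AVG | Average. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| MAX | Maximum. | All types | Same as the input type | -| MIN | Minimum. | All types | Same as the input type | -| FIRST | The non-NULL value with the smallest timestamp. | All types | Same as the input type | -| LAST | The non-NULL value with the largest timestamp. | All types | Same as the input type | -| STDDEV | Alias of STDDEV_SAMP; sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| STDDEV_POP | Population standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| STDDEV_SAMP | Sample standard deviation. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VARIANCE | Alias of VAR_SAMP; sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VAR_POP | Population variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VAR_SAMP | Sample variance. | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| EXTREME | The value with the largest absolute value. If the positive and negative values with the largest absolute value are equal in magnitude, the positive value is returned. | INT32 INT64 FLOAT DOUBLE | Same as the input type | -| MODE | The mode. Notes: 1. when the input contains too many distinct values there is a risk of memory exceptions; 2. if all elements occur equally often (there is no mode), a random element is returned; 3. if there are several modes, a random one of them is returned; 4. NULL values are counted as well, so the result may be NULL even when the input is not entirely NULL. | All types | Same as the input type | -| MAX_BY | MAX_BY(x, y) returns the value of x in the row where y is largest. MAX_BY(time, x) returns the timestamp at which x reaches its maximum. | x and y can be of any type | Same as the data type of the first input x | -| MIN_BY | MIN_BY(x, y) returns the value of x in the row where y is smallest. MIN_BY(time, x) returns the timestamp at which x reaches its minimum. | x and y can be of any type | Same as the data type of the first input x | -| FIRST_BY | FIRST_BY(x, y) returns the value of x in the row where y has its first non-NULL value. | x and y can be of any type | Same as the data type of the first input x | -| LAST_BY | LAST_BY(x, y) returns the value of x in the row where y has its last non-NULL value. | x and y can be of any type | Same as the data type of the first input x | - - -### 2.3 Examples - -#### 2.3.1 Sample Data - -The [Sample Data page](../Reference/Sample-Data.md) provides the SQL statements that create the tables and insert the data. Download them and run them in the IoTDB CLI to import the data into IoTDB; you can then use this data to run the SQL statements in the examples and obtain the corresponding results. - -#### 2.3.2 Count - -Counts the rows of the whole table and the non-NULL values of the `temperature` column. - -```SQL -IoTDB> select count(*), count(temperature) from table1; -``` - -The result is: - -> Note: only the COUNT function can be used with `*`; any other function raises an error. - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 18| 12| -+-----+-----+ -Total line number = 1 -It costs 0.834s -``` - - -#### 2.3.3 Count_if - -Counts the rows of `table2` whose arrival time `arrival_time` is not `null`. - -```sql -IoTDB> select 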
count_if(arrival_time is not null) from table2; -``` - -The result is: - -```sql -+-----+ -|_col0| -+-----+ -| 4| -+-----+ -Total line number = 1 -It costs 0.047s -``` - -#### 2.3.4 Approx_count_distinct - -Queries the number of distinct values in the `temperature` column of `table1`. - -```sql -IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature) as approx FROM table1; -IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature,0.006) as approx FROM table1; -``` - -The result is: - -```sql -+------+------+ -|origin|approx| -+------+------+ -| 3| 3| -+------+------+ -Total line number = 1 -It costs 0.022s -``` - -#### 2.3.5 Approx_most_frequent - -Queries the 2 most frequent values in the `temperature` column of `table1`. - -```sql -IoTDB> select approx_most_frequent(temperature,2,100) as topk from table1; -``` - -The result is: - -```sql -+-------------------+ -| topk| -+-------------------+ -|{"85.0":6,"90.0":5}| -+-------------------+ -Total line number = 1 -It costs 0.064s -``` - -#### 2.3.6 Approx_Percentile - -Computes, from `table1`, the 90th percentile of the `temperature` column and the 50th percentile (median) of the `humidity` column, and returns the two approximate percentile values. - -```SQL -SELECT APPROX_PERCENTILE(temperature,0.9), APPROX_PERCENTILE(humidity,0.5) FROM table1; -``` - -The result is: - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 90.0| 35.2| -+-----+-----+ -Total line number = 1 -It costs 0.206s -``` - - -#### 2.3.7 First - -Queries the non-NULL values with the smallest timestamp in the `temperature` and `humidity` columns. - -```SQL -IoTDB> select first(temperature), first(humidity) from table1; -``` - -The result is: - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 90.0| 35.1| -+-----+-----+ -Total line number = 1 -It costs 0.170s -``` - -#### 2.3.8 Last - -Queries the non-NULL values with the largest timestamp in the `temperature` and `humidity` columns. - -```SQL -IoTDB> select last(temperature), last(humidity) from table1; -``` - -The result is: - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 90.0| 34.8| -+-----+-----+ -Total line number = 1 -It costs 0.211s -``` - -#### 2.3.9 First_by - -Takes the row with the smallest timestamp among the rows whose `temperature` is non-NULL, and returns that row's `time` value and its `humidity` value. - -```SQL -IoTDB> select first_by(time, temperature), first_by(humidity, temperature) from table1; -``` - -The result is: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-26T13:37:00.000+08:00| 35.1| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.269s -``` - -#### 2.3.10 Last_by - -Takes the row with the largest timestamp among the rows whose `temperature` is non-NULL, and returns that row's `time` value and its `humidity` value. - -```SQL -IoTDB> select last_by(time, temperature), last_by(humidity, temperature) from table1; -``` - -The result is: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-30T14:30:00.000+08:00| 34.8| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.070s -``` - -#### 2.3.11 Max_by - -Returns the `time` value and the `humidity` value of the row in which `temperature` reaches its maximum. - -```SQL -IoTDB> select max_by(time, temperature), max_by(humidity, temperature) from table1; -``` - -The result is: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-30T09:30:00.000+08:00| 35.2| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.172s -``` - -#### 2.3.12 Min_by - -Returns the `time` value and the `humidity` value of the row in which `temperature` reaches its minimum. - -```SQL -IoTDB> select min_by(time, temperature), min_by(humidity, temperature) from table1; -``` - -The result is: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-29T10:00:00.000+08:00| null| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.244s -``` - - -## 3. 
Logical Operators - -### 3.1 Overview - -Logical operators combine or negate conditions and return a boolean result (`true` or `false`). - -The commonly used logical operators are described below: - -| Operator | Description | Example | -| ------ | ----------------------------- | ------- | -| AND | True only when both values are true | a AND b | -| OR | True when either value is true | a OR b | -| NOT | True when the value is false | NOT a | - -### 3.2 Effect of NULL on Logical Operators - -#### 3.2.1 AND Operator - -- If one or both sides of the expression are `NULL`, the result may be `NULL`. -- If one side of the `AND` operator is `FALSE`, the expression evaluates to `FALSE`. - -Examples: - -```SQL -NULL AND true -- null -NULL AND false -- false -NULL AND NULL -- null -``` - -#### 3.2.2 OR Operator - -- If one or both sides of the expression are `NULL`, the result may be `NULL`. -- If one side of the `OR` operator is `TRUE`, the expression evaluates to `TRUE`. - -Examples: - -```SQL -NULL OR NULL -- null -NULL OR false -- null -NULL OR true -- true -``` - -##### 3.2.2.1 Truth Table - -The following truth table shows how `NULL` is handled by the `AND` and `OR` operators: - -| a | b | a AND b | a OR b | -| ----- | ----- | ------- | ------ | -| TRUE | TRUE | TRUE | TRUE | -| TRUE | FALSE | FALSE | TRUE | -| TRUE | NULL | NULL | TRUE | -| FALSE | TRUE | FALSE | TRUE | -| FALSE | FALSE | FALSE | FALSE | -| FALSE | NULL | FALSE | NULL | -| NULL | TRUE | NULL | TRUE | -| NULL | FALSE | FALSE | NULL | -| NULL | NULL | NULL | NULL | - -#### 3.2.3 NOT Operator - -The logical negation of NULL is still NULL. - -Example: - -```SQL -NOT NULL -- null -``` - -##### 3.2.3.1 Truth Table - -The following truth table shows how `NULL` is handled by the `NOT` operator: - -| a | NOT a | -| ----- | ----- | -| TRUE | FALSE | -| FALSE | TRUE | -| NULL | NULL | - - -## 4. 
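Date and Time Functions and Operators - -### 4.1 now() -> Timestamp - -Returns the timestamp of the current time. - -### 4.2 date_bin(interval, Timestamp[, Timestamp]) -> Timestamp - -The `date_bin` function processes time data: it rounds a timestamp (Timestamp) down to the boundary of the specified time interval (interval). - -**Syntax:** - -```SQL --- Count intervals starting from timestamp 0 and return the start of the interval that the given timestamp falls into -date_bin(interval,source) - --- Count intervals starting from timestamp origin and return the start of the interval that the given timestamp falls into -date_bin(interval,source,origin) - --- Time units supported by interval: --- year y, month mo, week week, day d, hour h, minute M, second s, millisecond ms, microsecond µs, nanosecond ns. --- source must be of a timestamp type. -``` - -**Parameters:** - -| Parameter | Meaning | -| -------- | ------------------------------------------------------------ | -| interval | The time interval. Supported time units: year y, month mo, week week, day d, hour h, minute M, second s, millisecond ms, microsecond µs, nanosecond ns. | -| source | The time column to compute on; it can also be an expression. It must be of a timestamp type. | -| origin | The starting timestamp | - -#### 4.2.1 Syntax Conventions - -1. When `origin` is not passed, intervals are counted from 1970-01-01T00:00:00Z (1970-01-01 08:00:00 Beijing time). -2. `interval` is a non-negative number and must carry a time unit. When `interval` is 0ms, no computation is performed and `source` is returned as is. -3. A negative `origin` or `source` denotes a point in time before the epoch; `date_bin` computes normally and returns the interval boundary for that point in time. -4. If a value in `source` is `null`, the result is `null`. -5. Month and non-month time units cannot be mixed, for example `1 MONTH 1 DAY`, because such an interval is ambiguous. - -> Suppose the computation starts at April 30, 2000. After one interval, adding the DAY first and then the MONTH gives June 1, 2000, while adding the MONTH first and then the DAY gives May 31, 2000; the two orders produce different dates. - -#### 4.2.2 Examples - -##### Sample Data - -The [Sample Data page](../Reference/Sample-Data.md) provides the SQL statements that create the tables and insert the data. Download them and run them in the IoTDB CLI to import the data into IoTDB; you can then use this data to run the SQL statements in the examples and obtain the corresponding results. - -Example 1: without an origin timestamp - -```SQL -SELECT - time, - date_bin(1h,time) as time_bin -FROM - table1; -``` - -Result: - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:00:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:00:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:00:00.000+08:00| 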
-|2024-11-27T16:41:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T11:00:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:00:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T08:00:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T09:00:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:00:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:00:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.683s -``` - -示例 2:指定起始时间戳 - -```SQL -SELECT - time, - date_bin(1h, time, 2024-11-29T18:30:00.000) as time_bin -FROM - table1; -``` - -结果: - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:30:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T09:30:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T10:30:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:30:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T07:30:00.000+08:00| 
-|2024-11-28T09:00:00.000+08:00|2024-11-28T08:30:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T09:30:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T10:30:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:30:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:30:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.056s -``` - -示例 3:`origin` 为负数的情况 - -```SQL -SELECT - time, - date_bin(1h, time, 1969-12-31 00:00:00.000) as time_bin -FROM - table1; -``` - -结果: - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:00:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:00:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T11:00:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:00:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T08:00:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T09:00:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:00:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:00:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.203s -``` - -示例 4:`interval` 为 0 的情况 - -```SQL 
-SELECT - time, - date_bin(0ms, time) as time_bin -FROM - table1; -``` - -结果: - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:30:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:38:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:39:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:40:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:41:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:42:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:43:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:44:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T11:00:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:30:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T08:00:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T09:00:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:37:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:38:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.107s -``` - -示例 5:`source` 为 null 的情况 - -```SQL -SELECT - arrival_time, - date_bin(1h,arrival_time) as time_bin -FROM - table1; -``` - -结果: - -```Plain -+-----------------------------+-----------------------------+ -| arrival_time| time_bin| -+-----------------------------+-----------------------------+ -| null| null| -|2024-11-30T14:30:17.000+08:00|2024-11-30T14:00:00.000+08:00| -|2024-11-29T10:00:13.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:37:01.000+08:00|2024-11-27T16:00:00.000+08:00| -| null| null| 
-|2024-11-27T16:37:03.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:37:04.000+08:00|2024-11-27T16:00:00.000+08:00| -| null| null| -| null| null| -|2024-11-27T16:37:08.000+08:00|2024-11-27T16:00:00.000+08:00| -| null| null| -|2024-11-29T18:30:15.000+08:00|2024-11-29T18:00:00.000+08:00| -|2024-11-28T08:00:09.000+08:00|2024-11-28T08:00:00.000+08:00| -| null| null| -|2024-11-28T10:00:11.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:12.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:34.000+08:00|2024-11-26T13:00:00.000+08:00| -|2024-11-26T13:38:25.000+08:00|2024-11-26T13:00:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.319s -``` - -### 4.3 Extract 函数 - -该函数用于提取日期对应部分的值。(V2.0.6 版本起支持) - -#### 4.3.1 语法定义 - -```SQL -EXTRACT (identifier FROM expression) -``` -* 参数说明 - * **expression**: `TIMESTAMP` 类型或时间常量 - * **identifier** :取值范围及对应的返回值见下表 - - | 取值范围 | 返回值类型 | 返回值范围 | - | -------------------------- | ------------- | ------------- | - | `YEAR` | `INT64` | `/` | - | `QUARTER` | `INT64` | `1-4` | - | `MONTH` | `INT64` | `1-12` | - | `WEEK` | `INT64` | `1-53` | - | `DAY_OF_MONTH (DAY)` | `INT64` | `1-31` | - | `DAY_OF_WEEK (DOW)` | `INT64` | `1-7` | - | `DAY_OF_YEAR (DOY)` | `INT64` | `1-366` | - | `HOUR` | `INT64` | `0-23` | - | `MINUTE` | `INT64` | `0-59` | - | `SECOND` | `INT64` | `0-59` | - | `MS` | `INT64` | `0-999` | - | `US` | `INT64` | `0-999` | - | `NS` | `INT64` | `0-999` | - - -#### 4.3.2 使用示例 - -以[示例数据](../Reference/Sample-Data.md)中的 table1 为源数据,查询某段时间每天前12个小时的温度平均值 - -```SQL -IoTDB:database1> select format('%1$tY-%1$tm-%1$td',date_bin(1d,time)) as fmtdate,avg(temperature) as avgtp from table1 where time >= 2024-11-26T00:00:00 and time <= 2024-11-30T23:59:59 and extract(hour from time) <= 12 group by date_bin(1d,time) order by date_bin(1d,time) -+----------+-----+ -| fmtdate|avgtp| -+----------+-----+ -|2024-11-28| 86.0| -|2024-11-29| 85.0| -|2024-11-30| 90.0| 
-+----------+-----+ -Total line number = 3 -It costs 0.041s -``` - -`Format` 函数介绍:[Format 函数](../SQL-Manual/Basis-Function_timecho.md#_7-2-format-函数) - -`Date_bin` 函数介绍:[Date_bin 函数](../SQL-Manual/Basis-Function_timecho.md#_4-2-date-bin-interval-timestamp-timestamp-timestamp) - - -## 5. 数学函数和运算符 - -### 5.1 数学运算符 - -| **运算符** | **描述** | -| ---------- | ------------------------ | -| + | 加法 | -| - | 减法 | -| * | 乘法 | -| / | 除法(整数除法执行截断) | -| % | 模(余数) | -| - | 取反 | - -### 5.2 数学函数 - -| 函数名 | 描述 | 输入 | 输出 | 用法 | -|-------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------| ---------------------- | ---------- | -| sin | 正弦函数 | double、float、INT64、INT32 | double | sin(x) | -| cos | 余弦函数 | double、float、INT64、INT32 | double | cos(x) | -| tan | 正切函数 | double、float、INT64、INT32 | double | tan(x) | -| asin | 反正弦函数 | double、float、INT64、INT32 | double | asin(x) | -| acos | 反余弦函数 | double、float、INT64、INT32 | double | acos(x) | -| atan | 反正切函数 | double、float、INT64、INT32 | double | atan(x) | -| sinh | 双曲正弦函数 | double、float、INT64、INT32 | double | sinh(x) | -| cosh | 双曲余弦函数 | double、float、INT64、INT32 | double | cosh(x) | -| tanh | 双曲正切函数 | double、float、INT64、INT32 | double | tanh(x) | -| degrees | 将弧度角 x 转换为度 | double、float、INT64、INT32 | double | degrees(x) | -| radians | 将度转换为弧度 | double、float、INT64、INT32 | double | radians(x) | -| abs | 绝对值 | double、float、INT64、INT32 | 返回与输入类型相同的值 | abs(x) | -| sign | 返回 x 的符号函数,即:如果参数为 0,则返回 0,如果参数大于 0,则返回 1,如果参数小于 0,则返回 -1。对于 double/float 类型的参数,函数还会返回:如果参数为 NaN,则返回 NaN,如果参数为 +Infinity,则返回 1.0,如果参数为 -Infinity,则返回 -1.0。 | double、float、INT64、INT32 | 返回与输入类型相同的值 | sign(x) | -| ceil | 返回 x 向上取整到最近的整数。 | double、float、INT64、INT32 | double | ceil(x) | -| floor | 返回 x 向下取整到最近的整数。 | double、float、INT64、INT32 | double | floor(x) | -| exp | 返回欧拉数 e 的 x 次幂。 | double、float、INT64、INT32 | double | exp(x) | -| ln | 
Returns the natural logarithm of x. | double、float、INT64、INT32 | double | ln(x) | -| log10 | Returns the base-10 logarithm of x. | double、float、INT64、INT32 | double | log10(x) | -| round | Returns x rounded to the nearest integer. | double、float、INT64、INT32 | double | round(x) | -| round | Returns x rounded to d decimal places. | double、float、INT64、INT32 | double | round(x, d) | -| sqrt | Returns the square root of x. | double、float、INT64、INT32 | double | sqrt(x) | -| e | Euler's number e | | double | e() | -| pi | π | | double | pi() | - - -## 6. Bitwise Functions - -> Supported since V2.0.6 - -The raw data used in the examples is as follows: - -```SQL -IoTDB:database1> select * from bit_table -+-----------------------------+---------+------+-----+ -| time|device_id|length|width| -+-----------------------------+---------+------+-----+ -|2025-10-29T15:59:42.957+08:00| d1| 14| 12| -|2025-10-29T15:58:59.399+08:00| d3| 15| 10| -|2025-10-29T15:59:32.769+08:00| d2| 13| 12| -+-----------------------------+---------+------+-----+ - --- Create the table -CREATE TABLE bit_table(time TIMESTAMP TIME, device_id STRING TAG, length INT32 FIELD, width INT32 FIELD); - --- Insert the data -INSERT INTO bit_table values(2025-10-29 15:59:42.957, 'd1', 14, 12),(2025-10-29 15:58:59.399, 'd3', 15, 10),(2025-10-29 15:59:32.769, 'd2', 13, 12); -``` - -### 6.1 bit\_count(num, bits) - -The `bit_count(num, bits)` function counts the 1 bits in the binary representation of the integer `num` at the specified bit width `bits`. - -#### 6.1.1 Syntax - -```SQL -bit_count(num, bits) -> INT64 -- the result type is Int64 -``` - -* Parameters - * **num**: any integer value (int32 or int64) - * **bits**: an integer value in the range 2\~64 - -Note: if `bits` is too small to represent `num`, an error is raised (the representation is **signed two's complement**): `Argument exception, the scalar function num must be representable with the bits specified. 
[num] cannot be represented with [bits] bits.` - -* Usage - * Two literal values: `bit_count(9, 64)` - * A column and a literal: `bit_count(column1, 64)` - * Two columns: `bit_count(column1, column2)` - -#### 6.1.2 Examples - -```SQL --- Two literal values -IoTDB:database1> select distinct bit_count(2,8) from bit_table -+-----+ -|_col0| -+-----+ -| 1| -+-----+ --- Two literal values -IoTDB:database1> select distinct bit_count(-5,8) from bit_table -+-----+ -|_col0| -+-----+ -| 7| -+-----+ --- A column and a literal -IoTDB:database1> select length,bit_count(length,8) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 3| -| 15| 4| -| 13| 3| -+------+-----+ --- bits is too small -IoTDB:database1> select length,bit_count(length,2) from bit_table -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Argument exception, the scalar function num must be representable with the bits specified. 13 cannot be represented with 2 bits. -``` - -### 6.2 bitwise\_and(x, y) - -The `bitwise_and(x, y)` function works on the two's complement representation: it applies a logical AND to each pair of bits of the two integers x and y and returns the bitwise AND result. - -#### 6.2.1 Syntax - -```SQL -bitwise_and(x, y) -> INT64 -- the result type is Int64 -``` - -* Parameters - * **x, y**: must be integer values of type Int32 or Int64 -* Usage - * Two literal values: `bitwise_and(19, 25)` - * A column and a literal: `bitwise_and(column1, 25)` - * Two columns: `bitwise_and(column1, column2)` - -#### 6.2.2 Examples - -```SQL --- Two literal values -IoTDB:database1> select distinct bitwise_and(19,25) from bit_table -+-----+ -|_col0| -+-----+ -| 17| -+-----+ --- A column and a literal -IoTDB:database1> select length, bitwise_and(length,25) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 8| -| 15| 9| -| 13| 9| -+------+-----+ --- Two columns -IoTDB:database1> select length, width, bitwise_and(length, width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 12| -| 15| 10| 10| -| 13| 12| 12| -+------+-----+-----+ -``` - -### 6.3 bitwise\_not(x) - -The `bitwise_not(x)` function works on the two's complement representation: it applies a logical NOT to each bit of the integer x and returns the bitwise NOT result. - -#### 6.3.1 Syntax - -```SQL -bitwise_not(x) -> INT64 -- the result type is Int64 -``` - -* Parameters - * **x**: must be an integer value of type Int32 or Int64 -* Usage - * A literal value: `bitwise_not(5)` - * A single column: `bitwise_not(column1)` - -#### 6.3.2 Examples - -```SQL --- A literal value -IoTDB:database1> select distinct bitwise_not(5) from bit_table -+-----+ -|_col0| -+-----+ -| -6| -+-----+ --- A single column -IoTDB:database1> select length, bitwise_not(length) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| -15| -| 15| -16| -| 13| -14| -+------+-----+ -``` - -### 6.4 bitwise\_or(x, y) - -The `bitwise_or(x,y)` function works on the two's complement representation: it applies a logical OR to each pair of bits of the two integers x and y and returns the bitwise OR result. - -#### 6.4.1 Syntax - -```SQL -bitwise_or(x, y) -> INT64 -- the result type is Int64 -``` - -* Parameters - * **x, y**: must be integer values of type Int32 or Int64 -* Usage - * Two literal values: `bitwise_or(19, 25)` - * A column and a literal: `bitwise_or(column1, 25)` - * Two columns: `bitwise_or(column1, column2)` - -#### 6.4.2 Examples - -```SQL --- Two literal values -IoTDB:database1> select distinct bitwise_or(19,25) from bit_table -+-----+ -|_col0| -+-----+ -| 27| -+-----+ --- A column and a literal -IoTDB:database1> select length,bitwise_or(length,25) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 31| -| 15| 31| -| 13| 29| -+------+-----+ --- Two columns -IoTDB:database1> select length, width, bitwise_or(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 14| -| 15| 10| 15| -| 13| 12| 13| -+------+-----+-----+ -``` - -### 6.5 bitwise\_xor(x, y) - -The `bitwise_xor(x,y)` function works on the two's complement representation: it applies a logical XOR to each pair of bits of the two integers x and y and returns the bitwise XOR result. XOR rule: equal bits give 0, different bits give 1. - -#### 6.5.1 Syntax - -```SQL -bitwise_xor(x, y) -> INT64 -- the result type is Int64 -``` - -* Parameters - * **x, y**: must be integer values of type Int32 or Int64 -* Usage - * Two literal values: `bitwise_xor(19, 25)` - * A column and a literal: `bitwise_xor(column1, 25)` - * Two columns: `bitwise_xor(column1, column2)` - -#### 6.5.2 Examples - -```SQL --- Two literal values -IoTDB:database1> select distinct bitwise_xor(19,25) from bit_table -+-----+ -|_col0| -+-----+ -| 10| -+-----+ --- A column and a literal -IoTDB:database1> select length,bitwise_xor(length,25) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 23| -| 15| 22| -| 13| 
20| -+------+-----+ --- 两列之间 -IoTDB:database1> select length, width, bitwise_xor(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 2| -| 15| 10| 5| -| 13| 12| 1| -+------+-----+-----+ -``` - -### 6.6 bitwise\_left\_shift(value, shift) - -`bitwise_left_shift(value, shift)` 函数返回将整数 `value`的二进制表示左移 `shift`位后的结果。左移操作将二进制位向高位方向移动,右侧空出的位用 0 填充,左侧溢出的位直接丢弃。等价于: `value << shift`。 - -#### 6.6.1 语法定义 - -```SQL -bitwise_left_shift(value, shift) -> [same as value] --返回结果类型与value数据类型相同 -``` - -* 参数说明 - * ​**value**​: 要左移的整数值,必须是 Int32 或 Int64 数据类型 - * ​**shift**​: 左移的位数,必须是 Int32 或 Int64 数据类型 -* 调用方式 - * 两个具体数值:`bitwise_left_shift(1, 2)` - * 列与数值:`bitwise_left_shift(column1, 2)` - * 两列之间:`bitwise_left_shift(column1, column2)` - -#### 6.6.2 使用示例 - -```SQL ---两个具体数值 -IoTDB:database1> select distinct bitwise_left_shift(1,2) from bit_table -+-----+ -|_col0| -+-----+ -| 4| -+-----+ --- 列与数值 -IoTDB:database1> select length, bitwise_left_shift(length,2) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 56| -| 15| 60| -| 13| 52| -+------+-----+ --- 两列之间 -IoTDB:database1> select length, width, bitwise_left_shift(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 0| -| 15| 10| 0| -| 13| 12| 0| -+------+-----+-----+ -``` - -### 6.7 bitwise\_right\_shift(value, shift) - -`bitwise_right_shift(value, shift)`函数返回将整数 `value`的二进制表示逻辑右移(无符号右移) `shift`位后的结果。逻辑右移操作将二进制位向低位方向移动,左侧空出的高位用 0 填充,右侧溢出的低位直接丢弃。 - -#### 6.7.1 语法定义 - -```SQL -bitwise_right_shift(value, shift) -> [same as value] --返回结果类型与value数据类型相同 -``` - -* 参数说明 - * ​**value**​: 要右移的整数值,必须是 Int32 或 Int64 数据类型 - * ​**shift**​: 右移的位数,必须是 Int32 或 Int64 数据类型 -* 调用方式 - * 两个具体数值:`bitwise_right_shift(8, 3)` - * 列与数值:`bitwise_right_shift(column1, 3)` - * 两列之间:`bitwise_right_shift(column1, column2)` - -#### 6.7.2 使用示例 - -```SQL ---两个具体数值 -IoTDB:database1> select distinct bitwise_right_shift(8,3) from bit_table -+-----+ 
-|_col0| -+-----+ -| 1| -+-----+ ---Column and literal value -IoTDB:database1> select length, bitwise_right_shift(length,3) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 1| -| 15| 1| -| 13| 1| -+------+-----+ ---Two columns -IoTDB:database1> select length, width, bitwise_right_shift(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 0| -| 15| 10| 0| -| 13| 12| 0| -+------+-----+-----+ -``` - -### 6.8 bitwise\_right\_shift\_arithmetic(value, shift) - -The `bitwise_right_shift_arithmetic(value, shift)` function returns the result of arithmetically right-shifting the binary representation of the integer `value` by `shift` bits. An arithmetic right shift moves the bits toward the low-order end, discards the low-order bits shifted out on the right, and fills the vacated high-order bits on the left with the sign bit (0 for non-negative values, 1 for negative values), so the sign of the value is preserved. - -#### 6.8.1 Syntax - -```SQL -bitwise_right_shift_arithmetic(value, shift) -> [same as value] -- the return type is the same as the type of value -``` - -* Parameters - * **value**: the integer to shift right; must be of type Int32 or Int64 - * **shift**: the number of bits to shift; must be of type Int32 or Int64 -* Invocation - * Two literal values: `bitwise_right_shift_arithmetic(12, 2)` - * Column and literal value: `bitwise_right_shift_arithmetic(column1, 64)` - * Two columns: `bitwise_right_shift_arithmetic(column1, column2)` - -#### 6.8.2 Examples - -```SQL ---Two literal values -IoTDB:database1> select distinct bitwise_right_shift_arithmetic(12,2) from bit_table -+-----+ -|_col0| -+-----+ -| 3| -+-----+ --- Column and literal value -IoTDB:database1> select length, bitwise_right_shift_arithmetic(length,3) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 1| -| 15| 1| -| 13| 1| -+------+-----+ ---Two columns -IoTDB:database1> select length, width, bitwise_right_shift_arithmetic(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 0| -| 15| 10| 0| -| 13| 12| 0| -+------+-----+-----+ -``` - -## 7. 
Binary Functions - -> Supported since V2.0.9.1 - -### 7.1 Base64 Encoding Functions - -| Function | Description | Input type | Output type | -| --- | --- | --- | --- | -| `to_base64(input)` | Encodes the input as a standard Base64 string, for compatibility when transmitting or storing binary data | STRING/TEXT/BLOB | STRING | -| `from_base64(input)` | Decodes a standard Base64 string into the original binary data; inverse of to\_base64 | STRING/TEXT | BLOB | -| `to_base64url(input)` | Encodes the input as a URL-safe Base64URL string, using `-` and `_` in place of `+` and `/` and omitting padding | STRING/TEXT/BLOB | STRING | -| `from_base64url(input)` | Decodes a Base64URL string into the original binary data; inverse of to\_base64url | STRING/TEXT | BLOB | -| `to_base32(input)` | Encodes the input as a Base32 string; the alphabet avoids easily confused characters and is case-insensitive, so the output is easy to read | STRING/TEXT/BLOB | STRING | -| `from_base32(input)` | Decodes a Base32 string into the original binary data; inverse of to\_base32 | STRING/TEXT | BLOB | - -**Examples** - -1. to\_base64: encode a string as standard Base64 - -```SQL -SELECT DISTINCT to_base64('IoTDB二进制测试') FROM table1; -``` - -```Bash -+----------------------------+ -| _col0| -+----------------------------+ -|SW9URELkuozov5vliLbmtYvor5U=| -+----------------------------+ -``` - -2. from\_base64: decode a Base64 string into binary - -```SQL -SELECT DISTINCT from_base64('SW9URELkuozov5vliLbmtYvor5U=') FROM table1; -``` - -```Bash -+------------------------------------------+ -| _col0| -+------------------------------------------+ -|0x496f544442e4ba8ce8bf9be588b6e6b58be8af95| -+------------------------------------------+ -``` - -3. to\_base64url: encode as URL-safe Base64URL (no `+` or `/`, no `=` padding) - -```SQL -SELECT DISTINCT to_base64url('https://iotdb.apache.org') FROM table1; -``` - -```Bash -+--------------------------------+ -| _col0| -+--------------------------------+ -|aHR0cHM6Ly9pb3RkYi5hcGFjaGUub3Jn| -+--------------------------------+ -``` - -4. 
from\_base64url: decode a Base64URL string - -```SQL -SELECT DISTINCT from_base64url('aHR0cHM6Ly9pb3RkYi5hcGFjaGUub3Jn') FROM table1; -``` - -```Bash -+--------------------------------------------------+ -| _col0| -+--------------------------------------------------+ -|0x68747470733a2f2f696f7464622e6170616368652e6f7267| -+--------------------------------------------------+ -``` - -5. to\_base32: encode as a Base32 string - -```SQL -SELECT DISTINCT to_base32('123456') FROM table1; -``` - -```Bash -+----------------+ -| _col0| -+----------------+ -|GEZDGNBVGY======| -+----------------+ -``` - -6. from\_base32: decode a Base32 string - -```SQL -SELECT DISTINCT from_base32('GEZDGNBVGY======') FROM table1; -``` - -```Bash -+--------------+ -| _col0| -+--------------+ -|0x313233343536| -+--------------+ -``` - -### 7.2 Hexadecimal Encoding Functions - -| Function | Description | Input type | Output type | -| --- | --- | --- | --- | -| `TO_HEX(input)` | Converts the input to a hexadecimal string that directly reflects the underlying byte values, which is convenient for debugging | STRING/TEXT/BLOB | STRING | -| `FROM_HEX(input)` | Decodes a hexadecimal string into the original binary data; inverse of TO\_HEX | STRING/TEXT | BLOB | - -**Examples** - -1. TO\_HEX: convert a string or binary value to hexadecimal - -```SQL -SELECT DISTINCT TO_HEX('test') FROM table1; -``` - -```Bash -+--------+ -| _col0| -+--------+ -|74657374| -+--------+ -``` - -2. 
FROM\_HEX:将十六进制字符串解码为二进制 - -```SQL -SELECT DISTINCT FROM_HEX('74657374') FROM table1; -``` - -```Bash -+----------+ -| _col0| -+----------+ -|0x74657374| -+----------+ -``` - -### 7.3 二进制基础函数 - -| 函数名称 | 功能描述 | 输入参数类型 | 输出参数类型 | -| -------------------------------------- | ------------------------------------------------------------------------------------------- | ------------------------- | ---------------- | -| `length(input)` | 返回输入数据长度,文本类型返字符数,BLOB类型返回字节数,OBJECT 类型返回对象二进制字节大小 | STRING/TEXT/BLOB/OBJECT | INT32 | -| `REVERSE(input)`| 反转输入数据顺序,文本类型反转字符,BLOB类型反转字节 | STRING/TEXT/BLOB | 与输入类型一致 | -| `LPAD(input, length, pad_bytes)` | 对BLOB进行字节级左填充/截断,使最终字节长度等于指定值 | BLOB、INT32/INT64、BLOB | BLOB | -| `RPAD(input, length, pad_bytes)` | 对BLOB进行字节级右填充/截断,使最终字节长度等于指定值 | BLOB、INT32/INT64、BLOB | BLOB | - -**使用示例** - -1. length:获取数据长度 - -```SQL -SELECT DISTINCT length('IoTDB') FROM table1; -``` - -```Bash -+-----+ -|_col0| -+-----+ -| 5| -+-----+ -``` - -2. REVERSE:反转数据 - -```SQL -SELECT DISTINCT REVERSE('12345') FROM table1; -``` - -```Bash -+-----+ -|_col0| -+-----+ -|54321| -+-----+ -``` - -3. LPAD:左填充/截断BLOB(参数:原BLOB、目标长度、填充字节) - -```SQL -SELECT DISTINCT LPAD(FROM_HEX('74657374'),5, FROM_HEX('74657374')) FROM table1; -``` - -```Bash -+------------+ -| _col0| -+------------+ -|0x7474657374| -+------------+ -``` - -4. 
RPAD:右填充/截断BLOB - -```SQL -SELECT DISTINCT RPAD(FROM_HEX('74657374'),5, FROM_HEX('74657374')) FROM table1; -``` - -```Bash -+------------+ -| _col0| -+------------+ -|0x7465737474| -+------------+ -``` - -### 7.4 整数编码函数 - -| 函数名称 | 功能描述 | 输入参数类型 | 输出参数类型 | -| ------------------------------------ | ------------------------------------------------------------------ | -------------- | -------------- | -| `to_big_endian_32(input)` | 将INT32整数转换为4字节大端序BLOB,符合网络字节序标准 | INT32 | BLOB | -| `to_big_endian_64(input)` | 将INT64整数转换为8字节大端序BLOB,符合网络字节序标准 | INT64 | BLOB | -| `from_big_endian_32(input)` | 将4字节大端序BLOB解码为INT32整数,为to\_big\_endian\_32逆操作 | BLOB | INT32 | -| `from_big_endian_64(input)` | 将8字节大端序BLOB解码为INT64整数,为to\_big\_endian\_64逆操作 | BLOB | INT64 | -| `to_little_endian_32(input)` | 将INT32整数转换为4字节小端序BLOB,适配x86等主流架构 | INT32 | BLOB | -| `to_little_endian_64(input)` | 将INT64整数转换为8字节小端序BLOB,适配x86等主流架构 | INT64 | BLOB | -| `from_little_endian_32(input)` | 将4字节小端序BLOB解码为INT32整数,为to\_little\_endian\_32逆操作 | BLOB | INT32 | -| `from_little_endian_64(input)` | 将8字节小端序BLOB解码为INT64整数,为to\_little\_endian\_64逆操作 | BLOB | INT64 | - -**使用示例** - -1. 大端序编码/解码 - -```SQL -SELECT DISTINCT TO_HEX(to_big_endian_32(12345)) FROM table1; -``` - -```Bash -+--------+ -| _col0| -+--------+ -|00003039| -+--------+ -``` - -```SQL -SELECT DISTINCT from_big_endian_32(FROM_HEX('00003039')) FROM table1; -``` - -```Bash -+-----+ -|_col0| -+-----+ -|12345| -+-----+ -``` - -```SQL -SELECT DISTINCT TO_HEX(to_big_endian_64(1234567890123)) FROM table1; -``` - -```Bash -+----------------+ -| _col0| -+----------------+ -|0000011f71fb04cb| -+----------------+ -``` - -```SQL -SELECT DISTINCT from_big_endian_64(FROM_HEX('0000011f71fb04cb')) FROM table1; -``` - -```Bash -+-------------+ -| _col0| -+-------------+ -|1234567890123| -+-------------+ -``` - -2. 
小端序编码/解码 - -```SQL -SELECT DISTINCT TO_HEX(to_little_endian_32(12345)) FROM table1; -``` - -```Bash -+--------+ -| _col0| -+--------+ -|39300000| -+--------+ -``` - -```SQL -SELECT DISTINCT from_little_endian_32(FROM_HEX('39300000')) FROM table1; -``` - -```Bash -+-----+ -|_col0| -+-----+ -|12345| -+-----+ -``` - -```SQL -SELECT DISTINCT TO_HEX(to_little_endian_64(1234567890123)) FROM table1; -``` - -```Bash -+----------------+ -| _col0| -+----------------+ -|cb04fb711f010000| -+----------------+ -``` - -```SQL -SELECT DISTINCT from_little_endian_64(FROM_HEX('cb04fb711f010000')) FROM table1; -``` - -```Bash -+-------------+ -| _col0| -+-------------+ -|1234567890123| -+-------------+ -``` - -### 7.5 浮点型编码函数 - -| 函数名称 | 功能描述 | 输入参数类型 | 输出参数类型 | -| ------------------------------ | ------------------------------------------------------------------- | -------------- | -------------- | -| `to_ieee754_32(input)` | 将FLOAT单精度浮点数转换为4字节大端序IEEE754标准BLOB | FLOAT | BLOB | -| `to_ieee754_64(input)` | 将DOUBLE双精度浮点数转换为8字节大端序IEEE754标准BLOB | DOUBLE | BLOB | -| `from_ieee754_32(input)` | 将4字节IEEE754标准BLOB解码为FLOAT浮点数,为to\_ieee754\_32逆操作 | BLOB | FLOAT | -| `from_ieee754_64(input)` | 将8字节IEEE754标准BLOB解码为DOUBLE浮点数,为to\_ieee754\_64逆操作 | BLOB | DOUBLE | - -**使用示例** - -1. 单精度浮点数(FLOAT)编码/解码 - -```SQL -SELECT DISTINCT TO_HEX(to_ieee754_32(temperature)) FROM table1 where time = 2024-11-26 13:37:00; -``` - -```Bash -+--------+ -| _col0| -+--------+ -|42b40000| -+--------+ -``` - -```SQL -SELECT DISTINCT from_ieee754_32(FROM_HEX('42b40000')) FROM table1; -``` - -```Bash -+-----+ -|_col0| -+-----+ -| 90.0| -+-----+ -``` - -2. 
Double-precision (DOUBLE) encoding/decoding - -```SQL -SELECT DISTINCT TO_HEX(to_ieee754_64(3.1415926535)) FROM table1; -``` - -```Bash -+----------------+ -| _col0| -+----------------+ -|400921fb54411744| -+----------------+ -``` - -```SQL -SELECT DISTINCT from_ieee754_64(FROM_HEX('400921fb54411744')) FROM table1; -``` - -```Bash -+------------+ -| _col0| -+------------+ -|3.1415926535| -+------------+ -``` - -### 7.6 Hash Functions - -| Function | Description | Input type | Output type | -| --- | --- | --- | --- | -| `sha256(input)` | Computes the SHA-256 cryptographic hash of the input; one-way and collision-resistant | STRING, TEXT, BLOB | BLOB (32 bytes) | -| `SHA512(input)` | Computes the SHA-512 cryptographic hash of the input; stronger than SHA-256 | STRING, TEXT, BLOB | BLOB (64 bytes) | -| `SHA1(input)` | Computes the SHA-1 hash of the input; collision resistance is weak, so it is not recommended for security-sensitive use | STRING, TEXT, BLOB | BLOB (20 bytes) | -| `MD5(input)` | Computes the MD5 hash of the input; not cryptographically secure, use only for non-security checksums | STRING, TEXT, BLOB | BLOB (16 bytes) | -| `CRC32(input)` | Computes the CRC32 cyclic redundancy check of the input; efficient for detecting accidental (non-malicious) data errors | STRING, TEXT, BLOB | INT64 | -| `spooky_hash_v2_32(input)` | Computes the 32-bit SpookyHashV2 non-cryptographic hash of the input; fast, with a low collision rate | STRING, TEXT, BLOB | BLOB (4 bytes) | -| `spooky_hash_v2_64(input)` | Computes the 64-bit SpookyHashV2 non-cryptographic hash of the input; fast, with a low collision rate | STRING, TEXT, BLOB | BLOB (8 bytes) | -| `xxhash64(input)` | Computes the 64-bit xxHash non-cryptographic hash of the input; extremely fast | STRING, TEXT, BLOB | BLOB (8 bytes) | -| `murmur3(input)` | Computes the 128-bit MurmurHash3 non-cryptographic hash of the input; well distributed and widely used | STRING, TEXT, BLOB | BLOB (16 bytes) | - -**Examples** - -1. 
密码学哈希函数 - -```SQL -SELECT DISTINCT TO_HEX(sha256('test')) FROM table1; -``` - -```Bash -+----------------------------------------------------------------+ -| _col0| -+----------------------------------------------------------------+ -|9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08| -+----------------------------------------------------------------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(SHA512('test')) FROM table1; -``` - -```Bash -+--------------------------------------------------------------------------------------------------------------------------------+ -| _col0| -+--------------------------------------------------------------------------------------------------------------------------------+ -|ee26b0dd4af7e749aa1a8ee3c10ae9923f618980772e473f8819a5d4940e0db27ac185f8a0e1d5f84f88bc887fd67b143732c304cc5fa9ad8e6f57f50028a8ff| -+--------------------------------------------------------------------------------------------------------------------------------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(SHA1('test')) FROM table1; -``` - -```Bash -+----------------------------------------+ -| _col0| -+----------------------------------------+ -|a94a8fe5ccb19ba61c4c0873d391e987982fbbd3| -+----------------------------------------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(MD5('test')) FROM table1; -``` - -```Bash -+--------------------------------+ -| _col0| -+--------------------------------+ -|098f6bcd4621d373cade4e832627b4f6| -+--------------------------------+ -``` - -2. 
Checksum and non-cryptographic hash functions - -```SQL -SELECT DISTINCT CRC32('test') FROM table1; -``` - -```Bash -+----------+ -| _col0| -+----------+ -|3632233996| -+----------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(spooky_hash_v2_32('test')) FROM table1; -``` - -```Bash -+--------+ -| _col0| -+--------+ -|ec0d8b75| -+--------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(spooky_hash_v2_64('test')) FROM table1; -``` - -```Bash -+----------------+ -| _col0| -+----------------+ -|7b01e8bcec0d8b75| -+----------------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(xxhash64('test')) FROM table1; -``` - -```Bash -+----------------+ -| _col0| -+----------------+ -|4fdcca5ddb678139| -+----------------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(murmur3('test')) FROM table1; -``` - -```Bash -+--------------------------------+ -| _col0| -+--------------------------------+ -|9de1bd74cc287dac824dbdf93182129a| -+--------------------------------+ -``` - -### 7.7 HMAC Functions - -| Function | Description | Input type | Output type | -| --- | --- | --- | --- | -| `hmac_md5(data, key)` | Computes an HMAC message authentication code from MD5 and a key to verify data integrity and origin; intended for compatibility with legacy systems | data: STRING/TEXT/BLOB; key: STRING/TEXT | BLOB (16 bytes) | -| `hmac_sha1(data, key)` | Computes an HMAC message authentication code from SHA-1 and a key to verify data integrity and origin | data: STRING/TEXT/BLOB; key: STRING/TEXT | BLOB (20 bytes) | -| `hmac_sha256(data, key)` | Computes an HMAC message authentication code from SHA-256 and a key; the industry-recommended standard, with high security strength | data: STRING/TEXT/BLOB; key: STRING/TEXT | BLOB (32 bytes) | -| `hmac_sha512(data, key)` | Computes an HMAC message authentication code from SHA-512 and a key; commercial-grade, with the highest security strength | data: STRING/TEXT/BLOB; key: STRING/TEXT | BLOB (64 bytes) | - -**Examples** - -* Shared key: 'iotdb\_secret\_key' -* Data to authenticate: 'user\_data\_123' - -1. hmac\_md5 - -```SQL -SELECT DISTINCT TO_HEX(hmac_md5('user_data_123', 'iotdb_secret_key')) FROM table1; -``` - -```Bash -+--------------------------------+ -| _col0| -+--------------------------------+ -|8ee863080ceb3b43b5ffdc7a937e7f28| -+--------------------------------+ -``` - -2. 
hmac\_sha1 - -```SQL -SELECT DISTINCT TO_HEX(hmac_sha1('user_data_123', 'iotdb_secret_key')) FROM table1; -``` - -```Bash -+----------------------------------------+ -| _col0| -+----------------------------------------+ -|b5b7ae1a495745299ec3bd236c511c13540481ce| -+----------------------------------------+ -``` - -3. hmac\_sha256(推荐使用) - -```SQL -SELECT DISTINCT TO_HEX(hmac_sha256('user_data_123', 'iotdb_secret_key')) FROM table1; -``` - -```Bash -+----------------------------------------------------------------+ -| _col0| -+----------------------------------------------------------------+ -|73b6f26bbcb5192dbe2cb83745b0fc48c63418fa674b0bf62fabe7f8747f3afd| -+----------------------------------------------------------------+ -``` - -4. hmac\_sha512 - -```SQL -SELECT DISTINCT TO_HEX(hmac_sha512('user_data_123', 'iotdb_secret_key')) FROM table1; -``` - -```Bash -+--------------------------------------------------------------------------------------------------------------------------------+ -| _col0| -+--------------------------------------------------------------------------------------------------------------------------------+ -|2fed4ec5a0535e3349798b371d6525255ee85d9eae0ddcbdecf89db84f943151f5febf0ffd9c01ae9661278504aba186cf6f732ae5f42d63be58aadee2baccc2| -+--------------------------------------------------------------------------------------------------------------------------------+ -``` - - - -## 8. 条件表达式 - -### 8.1 CASE 表达式 - -CASE 表达式有两种形式:简单形式、搜索形式 - -#### 8.1.1 简单形式 - -简单形式从左到右搜索每个值表达式,直到找到一个与表达式相等的值: - -```SQL -CASE expression - WHEN value THEN result - [ WHEN ... ] - [ ELSE result ] -END -``` - -如果找到匹配的值,则返回相应的结果。如果没有找到匹配项,则返回 ELSE 子句中的结果(如果存在),否则返回 null。例如: - -```SQL -SELECT a, - CASE a - WHEN 1 THEN 'one' - WHEN 2 THEN 'two' - ELSE 'many' - END -``` - -#### 8.1.2 搜索形式 - -搜索形式从左到右评估每个布尔条件,直到找到一个为真的条件,并返回相应的结果: - -```SQL -CASE - WHEN condition THEN result - [ WHEN ... 
] - [ ELSE result ] -END -``` - -如果没有条件为真,则返回 ELSE 子句中的结果(如果存在),否则返回 null。例如: - -```SQL -SELECT a, b, - CASE - WHEN a = 1 THEN 'aaa' - WHEN b = 2 THEN 'bbb' - ELSE 'ccc' - END -``` - -### 8.2 COALESCE 函数 - -返回参数列表中的第一个非空值。 - -```SQL -coalesce(value1, value2[, ...]) -``` - -### 8.3 IF 表达式 - -IF 表达式有两种形式:一种仅指定真值(true\_value),另一种同时指定真值和假值(false\_value)。 - -| 形式 | 说明 | 输出类型限制 | -| ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- | -| `if(condition, true_value)` | 若条件(condition)为真,则计算并返回`true_value`;否则返回`null`,且`true_value`不会被计算。 | | -| `if(condition, true_value, false_value)` | 若条件(condition)为真,则计算并返回`true_value`;否则计算并返回`false_value`。 | `true_value`和`false_value`的数据类型​**必须完全一致**​,不支持隐式类型转换。 | - -> V 2.0.9.1 版本起支持 - -**示例:** - -1. IF 表达式和 CASE 表达式等价示例: - -```SQL --- IF 写法 -SELECT - device_id, - temperature, - IF(temperature > 85, 'High Value', 'Low Value') -FROM table1; - --- CASE 等价写法 -SELECT - device_id, - temperature, - CASE - WHEN temperature > 85 THEN 'High Value' - ELSE 'Low Value' - END -FROM table1; -``` - -2. 输出类型限制示例: - -```SQL --- 成功 --- temperature(float) 和 humidity(float) 类型一致 -select if(temperature > 85, temperature, humidity) from table1 - --- 失败 --- temperature(float) 和 status(boolean) 类型不一致 -select if(temperature > 85, temperature, status) from table1 -``` - - - -## 9. 转换函数 - -### 9.1 转换函数 - -#### 9.1.1 cast(value AS type) → type - -1. 显式地将一个值转换为指定类型。 -2. 可以用于将字符串(varchar)转换为数值类型,或数值转换为字符串类型,V2.0.8 版本起支持 OBJECT 类型强转成 STRING 类型。 -3. 如果转换失败,将抛出运行时错误。 - -示例: - -```SQL -SELECT * - FROM table1 - WHERE CAST(time AS DATE) - IN (CAST('2024-11-27' AS DATE), CAST('2024-11-28' AS DATE)); -``` - -#### 9.1.2 try_cast(value AS type) → type - -1. 与 `cast()` 类似。 -2. 
如果转换失败,则返回 `null`。 - -示例: - -```SQL -SELECT * - FROM table1 - WHERE try_cast(time AS DATE) - IN (try_cast('2024-11-27' AS DATE), try_cast('2024-11-28' AS DATE)); -``` - -### 9.2 Format 函数 -该函数基于指定的格式字符串与输入参数,生成并返回格式化后的字符串输出。其功能与 Java 语言中的`String.format` 方法及 C 语言中的`printf`函数相类似,支持开发者通过占位符语法构建动态字符串模板,其中预设的格式标识符将被传入的对应参数值精准替换,最终形成符合特定格式要求的完整字符串。 - -#### 9.2.1 语法介绍 - -```SQL -format(pattern,...args) -> String -``` - -**参数定义** - -* `pattern`: 格式字符串,可包含静态文本及一个或多个格式说明符(如 `%s`, `%d` 等),或任意返回类型为 `STRING/TEXT` 的表达式。 -* `args`: 用于替换格式说明符的输入参数。需满足以下条件: - * 参数数量 ≥ 1 - * 若存在多个参数,以逗号`,`分隔(如 `arg1,arg2`) - * 参数总数可多于 `pattern` 中的占位符数量,但不可少于,否则触发异常 - -**返回值** - -* 类型为 `STRING` 的格式化结果字符串 - -#### 9.2.2 使用示例 - -1. 格式化浮点数 - -```SQL -IoTDB:database1> select format('%.5f',humidity) from table1 where humidity = 35.4 -+--------+ -| _col0| -+--------+ -|35.40000| -+--------+ -``` - -2. 格式化整数 - -```SQL -IoTDB:database1> select format('%03d',8) from table1 limit 1 -+-----+ -|_col0| -+-----+ -| 008| -+-----+ -``` - -3. 
格式化日期和时间戳 - -* Locale-specific日期 - -```SQL -IoTDB:database1> SELECT format('%1$tA, %1$tB %1$te, %1$tY', 2024-01-01) from table1 limit 1 -+--------------------+ -| _col0| -+--------------------+ -|星期一, 一月 1, 2024| -+--------------------+ -``` - -* 去除时区信息 - -```SQL -IoTDB:database1> SELECT format('%1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS.%1$tL', 2024-01-01T00:00:00.000+08:00) from table1 limit 1 -+-----------------------+ -| _col0| -+-----------------------+ -|2024-01-01 00:00:00.000| -+-----------------------+ -``` - -* 获取秒级时间戳精度 - -```SQL -IoTDB:database1> SELECT format('%1$tF %1$tT', 2024-01-01T00:00:00.000+08:00) from table1 limit 1 -+-------------------+ -| _col0| -+-------------------+ -|2024-01-01 00:00:00| -+-------------------+ -``` - -* 日期符号说明如下 - -| **符号** | **​ 描述** | -| ---------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| 'H' | 24 小时制的小时数,格式为两位数,必要时加上前导零,i.e. 00 - 23。 | -| 'I' | 12 小时制的小时数,格式为两位数,必要时加上前导零,i.e. 01 - 12。 | -| 'k' | 24 小时制的小时数,i.e. 0 - 23。 | -| 'l' | 12 小时制的小时数,i.e. 1 - 12。 | -| 'M' | 小时内的分钟,格式为两位数,必要时加上前导零,i.e. 00 - 59。 | -| 'S' | 分钟内的秒数,格式为两位数,必要时加上前导零,i.e. 00 - 60(“60 ”是支持闰秒所需的特殊值)。 | -| 'L' | 秒内毫秒,格式为三位数,必要时加前导零,i.e. 000 - 999。 | -| 'N' | 秒内的纳秒,格式为九位数,必要时加前导零,i.e. 
000000000 - 999999999。 | -| 'p' | 当地特定的[上午或下午](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/text/DateFormatSymbols.html#getAmPmStrings())标记,小写,如 “am ”或 “pm”。使用转换前缀 “T ”会强制输出为大写。 | -| 'z' | 从格林尼治标准时间偏移的[RFC 822](http://www.ietf.org/rfc/rfc0822.txt)式数字时区,例如 -0800。该值将根据夏令时的需要进行调整。对于 long、[Long](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/lang/Long.html)和[Date](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/util/Date.html),使用的时区是 Java 虚拟机此实例的[默认时区](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/util/TimeZone.html#getDefault())。 | -| 'Z' | 表示时区缩写的字符串。该值将根据夏令时的需要进行调整。对于 long、[Long](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/lang/Long.html)和[Date](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/util/Date.html),使用的时区是此 Java 虚拟机实例的[默认时区](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/util/TimeZone.html#getDefault())。Formatter 的时区将取代参数的时区(如果有)。 | -| 's' | 自 1970 年 1 月 1 日 00:00:00 UTC 开始的纪元起的秒数,i.e. Long.MIN\_VALUE/1000 至 Long.MAX\_VALUE/1000。 | -| 'Q' | 自 1970 年 1 月 1 日 00:00:00 UTC 开始的纪元起的毫秒数,i.e. 
Long.MIN\_VALUE to Long.MAX\_VALUE. | - -* The conversion characters used to format common date components are as follows - -| **Symbol** | **Description** | -| --- | --- | -| 'B' | Locale-specific [full month name](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/text/DateFormatSymbols.html#getMonths()), e.g. "January", "February". | -| 'b' | Locale-specific abbreviated month name, e.g. "Jan", "Feb". | -| 'h' | Same as 'b'. | -| 'A' | Locale-specific full name of the day of the week, e.g. "Sunday", "Monday". | -| 'a' | Locale-specific short name of the day of the week, e.g. "Sun", "Mon". | -| 'C' | Four-digit year divided by 100, formatted as two digits with a leading zero as necessary, i.e. 00 - 99. | -| 'Y' | Year, formatted as at least four digits with leading zeros as necessary, e.g. 0092 equals 92 CE in the Gregorian calendar. | -| 'y' | Last two digits of the year, formatted with a leading zero as necessary, i.e. 00 - 99. | -| 'j' | Day of the year, formatted as three digits with leading zeros as necessary, e.g. 001 - 366 in the Gregorian calendar. | -| 'm' | Month, formatted as two digits with a leading zero as necessary, i.e. 01 - 13. | -| 'd' | Day of the month, formatted as two digits with a leading zero as necessary, i.e. 01 - 31. | -| 'e' | Day of the month, formatted as up to two digits without a leading zero, i.e. 1 - 31. | - -4. Formatting strings - -```SQL -IoTDB:database1> SELECT format('The measurement status is :%s',status) FROM table2 limit 1 -+-------------------------------+ -| _col0| -+-------------------------------+ -|The measurement status is :true| -+-------------------------------+ -``` - -5. Formatting a percent sign - -```SQL -IoTDB:database1> SELECT format('%s%%', 99.9) from table1 limit 1 -+-----+ -|_col0| -+-----+ -|99.9%| -+-----+ -``` - -#### 9.2.3 **Format Conversion Failure Scenarios** - -1. 
Type mismatch errors - -* Timestamp type conflict: if the format string contains date/time conversions (such as `%tY-%tm-%td`), a type error is raised when: - * the argument is not of type `DATE` or `TIMESTAMP`, or - * the conversion involves a finer-grained time unit (such as `%tH` for hours or `%tM` for minutes) and the argument is not of type `TIMESTAMP`, the only type supported in that case - -```SQL --- Example 1 -IoTDB:database1> SELECT format('%1$tA, %1$tB %1$te, %1$tY', humidity) from table2 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %1$tA, %1$tB %1$te, %1$tY (IllegalFormatConversion: A != java.lang.Float) - --- Example 2 -IoTDB:database1> SELECT format('%1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS.%1$tL', humidity) from table1 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS.%1$tL (IllegalFormatConversion: Y != java.lang.Float) -``` - -* Floating-point type conflict: if a floating-point format specifier such as `%f` is used but the argument is not numeric (for example a string or a boolean), a type conversion error is raised - -```SQL -IoTDB:database1> select format('%.5f',status) from table1 where humidity = 35.4 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %.5f (IllegalFormatConversion: f != java.lang.Boolean) -``` - -2. Argument count mismatch errors - -* The number of arguments supplied must be greater than or equal to the number of format specifiers in the format string -* If fewer arguments than format specifiers are supplied, an error is raised (reported as `MissingFormatArgument` in the example below) - -```SQL -IoTDB:database1> select format('%.5f %03d', humidity) from table1 where humidity = 35.4 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %.5f %03d (MissingFormatArgument: Format specifier '%03d') -``` - -3. Invalid call errors - -* A call is invalid when either of the following holds: - * The total number of arguments is less than 2 (a format string and at least one argument are required) - * The format string (`pattern`) is not of type `STRING/TEXT` - -```SQL --- Example 1 -IoTDB:database1> select format('%s') from table1 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Scalar function format must have at least two arguments, and first argument pattern must be TEXT or STRING type. - ---Example 2 -IoTDB:database1> select format(123, humidity) from table1 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Scalar function format must have at least two arguments, and first argument pattern must be TEXT or STRING type. -``` - - - -## 10. 
String Functions and Operators - -### 10.1 String Operators - -#### 10.1.1 The || Operator - -The `||` operator concatenates strings and behaves the same as the `concat` function. - -#### 10.1.2 The LIKE Statement - -The `LIKE` statement is used for pattern matching; see [Pattern Matching: LIKE](#1-like-运算符) for details. - -### 10.2 String Functions - -| Function | Description | Input | Output | Usage | -| --- | --- | --- | --- | --- | -| length | Returns the length of the string in characters, not the length of the underlying character array. | One argument of type STRING or TEXT. **string**: the string whose length is computed. | INT32 | length(string) | -| upper | Converts the letters in the string to uppercase. | One argument of type STRING or TEXT. **string**: the string to convert. | String | upper(string) | -| lower | Converts the letters in the string to lowercase. | One argument of type STRING or TEXT. **string**: the string to convert. | String | lower(string) | -| trim | Removes the specified leading and/or trailing characters from the source string. | Three arguments. **specification (optional)**: which side to trim; `BOTH` trims both ends (default), `LEADING` trims only the start, `TRAILING` trims only the end. **trimcharacter (optional)**: the character to remove; defaults to a space. **string**: the string to process. | String | trim([ [ specification ] [ trimcharacter ] FROM ] string) Example: `trim('!' FROM '!foo!');` returns `'foo'` | -| strpos | Returns the starting position of the first occurrence of the substring in the string. Positions start at 1; returns 0 if the substring is not found. Note: the position is counted in characters, not bytes. | Exactly two arguments of type STRING or TEXT. **sourceStr**: the string to search. **subStr**: the substring to find. | INT32 | strpos(sourceStr, subStr) | -| starts_with | Tests whether the substring is a prefix of the string. | Two arguments of type STRING or TEXT. **sourceStr**: the string to check. **prefix**: the prefix substring. | Boolean | starts_with(sourceStr, prefix) | -| ends_with | Tests whether the string ends with the specified suffix. | Two arguments of type STRING or TEXT. **sourceStr**: the string to check. **suffix**: the suffix substring. | Boolean | ends_with(sourceStr, suffix) | -| concat | Returns the concatenation of `string1`, `string2`, ..., `stringN`. Behaves the same as the concatenation operator `\|\|`. | At least two arguments, all of type STRING or TEXT. | String | concat(str1, str2, ...) or str1 \|\| str2 ... 
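| -| strcmp | Compares two strings lexicographically. | Two arguments, both of type STRING or TEXT. **string1**: the first string to compare. **string2**: the second string to compare. | INT32. Returns `-1` if `str1 < str2`; `0` if `str1 = str2`; `1` if `str1 > str2`; `NULL` if `str1` or `str2` is `NULL`. | strcmp(str1, str2) | -| replace | Removes all instances of `search` from the string. | Two arguments of type STRING or TEXT. **string**: the original string to remove content from. **search**: the substring to remove. | String | replace(string, search) | -| replace | Replaces all instances of `search` in the string with `replace`. | Three arguments of type STRING or TEXT. **string**: the original string to replace content in. **search**: the substring to be replaced. **replace**: the new string to substitute. | String | replace(string, search, replace) | -| substring | Extracts the characters from the given position to the end of the string. The start position is counted in characters, not bytes, and `start_index` starts at 1. | Two arguments. **string**: the source string, of type STRING or TEXT. **start_index**: the index at which extraction starts, counted from 1. | String: all characters from `start_index` to the end of the string. **Notes**: `start_index` is 1-based, so the first character (array position 0) has index 1. Returns `null` if an argument is null. Raises an error if `start_index` is greater than the length of the string. | substring(string from start_index) or substring(string, start_index) | -| substring | Extracts a substring of the given length starting at the given position. The start position and length are counted in characters, not bytes; `start_index` starts at 1, and the length is counted from `start_index`. | Three arguments. **string**: the source string, of type STRING or TEXT. **start_index**: the index at which extraction starts, counted from 1. **length**: the length of the substring to extract. | String: `length` characters starting at `start_index`. **Notes**: Returns `null` if an argument is null. Raises an error if `start_index` is greater than the length of the string. Raises an error if `length` is less than 0. In the extreme case where `start_index + length` overflows `int.MAX` and becomes negative, the result is unexpected. | substring(string from start_index for length) or substring(string, start_index, length) | - -## 11. Pattern Matching Functions - -### 11.1 LIKE Operator - -#### 11.1.1 Purpose - -The `LIKE` operator compares a value against a pattern. It is typically used in a `WHERE` clause to match a specific pattern in a string. - -#### 11.1.2 Syntax - -```SQL -... 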
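column [NOT] LIKE 'pattern' ESCAPE 'character'; -``` - -#### 11.1.3 Matching Rules - -- Matching is case-sensitive. -- Patterns support two wildcards: - - `_`: matches any single character. - - `%`: matches zero or more characters. - -#### 11.1.4 Notes - -- A `LIKE` pattern always has to match the entire string. To match at any position within a string, the pattern must start and end with `%`. -- To match `%` or `_` as a literal character, you must use an escape character. - -#### 11.1.5 Examples - -Example 1: Match strings that start with a specific character - -- **Description**: Find all names that start with the letter `E`, such as `Europe`. - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'E%'; -``` - -Example 2: Exclude a specific pattern - -- **Description**: Find all names that do not start with the letter `E`. - -```SQL -SELECT * FROM table1 WHERE continent NOT LIKE 'E%'; -``` - -Example 3: Match strings of a specific length - -- **Description**: Find all names that start with `A`, end with `a`, and have two characters in between, such as `Asia`. - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'A__a'; -``` - -Example 4: Escape special characters - -- **Description**: Find all names that start with `South_`, such as `South_America`. The escape character `\` is used here to escape special characters such as `_`. - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'South\_%' ESCAPE '\'; -``` - -Example 5: Match the escape character itself - -- **Description**: To match the escape character itself, use a doubled escape character `\\`. - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'South\\%' ESCAPE '\'; -``` - -### 11.2 regexp_like Function - -#### 11.2.1 Purpose - -The `regexp_like` function evaluates a regular expression pattern and determines whether the pattern is contained in the string. - -#### 11.2.2 Syntax - -```SQL -regexp_like(string, pattern); -``` - -#### 11.2.3 Notes - -- The `regexp_like` pattern only needs to be contained in the string; it does not need to match the whole string. -- To match the whole string, use the regular expression anchors `^` and `$`. -- `^` means "start of the string" and `$` means "end of the string". -- Regular expressions use the Java regular expression syntax, with the following exceptions: - - **Multiline mode** - 1. Enabled with `(?m)`. - 2. Only `\n` is recognized as a line terminator. - 3. The `(?d)` flag is not supported and must not be used. - - **Case-insensitive matching** - 1. Enabled with `(?i)`. - 2. Based on Unicode rules; context-sensitive and locale-sensitive matching are not supported. - 3. The `(?u)` flag is not supported and must not be used. - - **Character classes** - 1. Within character classes (such as `[A-Z123]`), `\Q` and `\E` are not supported and are treated as ordinary literals. - - **Unicode character classes (**`\p{prop}`**)** - 1. **Underscores in names**: all underscores in a name must be removed (for example, `OldItalic` rather than `Old_Italic`). - 2. **Scripts**: specified directly, without the `Is`, `script=`, or `sc=` prefix (for example, `\p{Hiragana}`). - 3. **Blocks**: must use the `In` prefix; the `block=` and `blk=` prefixes are not supported (for example, `\p{InMongolian}`). - 4. **Categories**: specified directly, without the `Is`, `general_category=`, or `gc=` prefix (for example, `\p{L}`). - 5.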
**Binary Properties**: specified directly, without `Is` (for example, `\p{NoncharacterCodePoint}`). - -#### 11.2.4 Examples - -Example 1: Match a string that contains a specific pattern - -```SQL -SELECT regexp_like('1a 2b 14m', '\\d+b'); -- true -``` - -- **Description**: Checks whether the string `'1a 2b 14m'` contains the pattern `\d+b`. - - `\d+` means "one or more digits". - - `b` means the letter `b`. - - In `'1a 2b 14m'`, `2b` matches the pattern, so the result is `true`. - -Example 2: Match the whole string - -```SQL -SELECT regexp_like('1a 2b 14m', '^\\d+b$'); -- false -``` - -- **Description**: Checks whether the string `'1a 2b 14m'` matches the pattern `^\\d+b$` in its entirety. - - `\d+` means "one or more digits". - - `b` means the letter `b`. - - `'1a 2b 14m'` does not match: the anchors require the whole string to consist of one or more digits followed by `b`, but the string continues with `a 2b 14m` after its leading digit and ends with `m`, so the result is `false`. - -## 12. Time-Series Windowing Functions - -The original sample data is as follows: - -```SQL -IoTDB> SELECT * FROM bid; -+-----------------------------+--------+-----+ -| time|stock_id|price| -+-----------------------------+--------+-----+ -|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+--------+-----+ - --- Create the table -CREATE TABLE bid(time TIMESTAMP TIME, stock_id STRING TAG, price FLOAT FIELD); --- Insert the data -INSERT INTO bid(time, stock_id, price) VALUES('2021-01-01T09:05:00','AAPL',100.0),('2021-01-01T09:06:00','TESL',200.0),('2021-01-01T09:07:00','AAPL',103.0),('2021-01-01T09:07:00','TESL',202.0),('2021-01-01T09:09:00','AAPL',102.0),('2021-01-01T09:15:00','TESL',195.0); -``` - -### 12.1 HOP - -#### 12.1.1 Description - -The HOP function performs windowed analysis over time segments and identifies the time windows that each row belongs to. Given a fixed window size (SIZE) and a window slide step (SLIDE), it assigns each row to every window that overlaps the row's timestamp. If windows overlap (slide < size), a row is automatically duplicated into multiple windows. - -#### 12.1.2 Function Definition - -```SQL -HOP(data, timecol, size, slide[, origin]) -``` - -#### 12.1.3 Parameters - -| Parameter | Parameter Type | Attributes | Description | -| --------- | ---------- | --------------------------------- | -------------------- | -| DATA | Table parameter | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar parameter | String; default: time | Time column | -| SIZE | Scalar parameter | Long integer | Window size | -| SLIDE | Scalar parameter | Long integer | Window slide step | -| ORIGIN | Scalar parameter |
时间戳类型默认值:Unix 纪元时间 | 第一个窗口起始时间 | - -#### 12.1.4 返回结果 - -HOP 函数的返回结果列包含: - -* window\_start: 窗口开始时间(闭区间) -* window\_end: 窗口结束时间(开区间) -* 映射列:DATA 参数的所有输入列 - -#### 12.1.5 使用示例 - -```SQL -IoTDB> SELECT * FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY TIME -IoTDB> SELECT window_start, window_end, stock_id, 
avg(price) as avg FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 12.2 SESSION - -#### 12.2.1 功能描述 - -SESSION 函数用于按会话间隔对数据进行分窗。系统逐行检查与前一行的时间间隔,小于阈值(GAP)则归入当前窗口,超过则归入下一个窗口。 - -#### 12.2.2 函数定义 - -```SQL -SESSION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], timecol, gap) -``` -#### 12.2.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| --------- | ---------- | -------------------------- | ---------------------------------------- | -| DATA | 表参数 | SET SEMANTICPASS THROUGH | 输入表通过 pkeys、okeys 指定分区和排序 | -| TIMECOL | 标量参数 | 字符串类型默认值:'time' | 时间列名 | -| GAP | 标量参数 | 长整数类型 | 会话间隔阈值 | - -#### 12.2.4 返回结果 - -SESSION 函数的返回结果列包含: - -* window\_start: 会话窗口内的第一条数据的时间 -* window\_end: 会话窗口内的最后一条数据的时间 -* 映射列:DATA 参数的所有输入列 - -#### 12.2.5 使用示例 - -```SQL -IoTDB> SELECT * FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ 
-|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY SESSION -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL| 201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 12.3 VARIATION - -#### 12.3.1 功能描述 - -VARIATION 函数用于按数据差值分窗,将第一条数据作为首个窗口的基准值,每个数据点会与基准值进行差值运算,如果差值小于给定的阈值(delta)则加入当前窗口;如果超过阈值,则分为下一个窗口,将该值作为下一个窗口的基准值。 - -#### 12.3.2 函数定义 - -```sql -VARIATION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], col, delta) -``` - -#### 12.3.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| -------- | ---------- | -------------------------- | ---------------------------------------- | -| DATA | 表参数 | SET 
SEMANTICPASS THROUGH | 输入表通过 pkeys、okeys 指定分区和排序 | -| COL | 标量参数 | 字符串类型 | 标识对哪一列计算差值 | -| DELTA | 标量参数 | 浮点数类型 | 差值阈值 | - -#### 12.3.4 返回结果 - -VARIATION 函数的返回结果列包含: - -* window\_index: 窗口编号 -* 映射列:DATA 参数的所有输入列 - -#### 12.3.5 使用示例 - -```sql -IoTDB> SELECT * FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price',DELTA => 2.0); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 1|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY VARIATION -IoTDB> SELECT first(time) as window_start, last(time) as window_end, stock_id, avg(price) as avg FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price', DELTA => 2.0) GROUP BY window_index, stock_id; -+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:07:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.5| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 12.4 CAPACITY - -#### 12.4.1 功能描述 - -CAPACITY 函数用于按数据点数(行数)分窗,每个窗口最多有 SIZE 行数据。 - -#### 12.4.2 函数定义 - -```sql -CAPACITY(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], size) -``` - -#### 12.4.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| -------- | ---------- | 
-------------------------- | ---------------------------------------- | -| DATA | Table parameter | SET SEMANTIC, PASS THROUGH | Input table; partitioning and ordering are specified through pkeys and okeys | -| SIZE | Scalar parameter | Long integer | Window size | - -#### 12.4.4 Return Value - -The result columns of the CAPACITY function are: - -* window\_index: the window number -* Mapped columns: all input columns of the DATA parameter - -#### 12.4.5 Usage Example - -```sql -IoTDB> SELECT * FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 0|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- Combined with GROUP BY, this is equivalent to GROUP BY COUNT in the tree model -IoTDB> SELECT first(time) as start_time, last(time) as end_time, stock_id, avg(price) as avg FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2) GROUP BY window_index, stock_id; -+-----------------------------+-----------------------------+--------+-----+ -| start_time| end_time|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|101.5| -|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 12.5 TUMBLE - -#### 12.5.1 Description - -The TUMBLE function assigns each row to exactly one window based on a time attribute column. Tumbling windows have a fixed size and do not overlap. - -#### 12.5.2 Function Definition - -```sql -TUMBLE(data, timecol, size[, origin]) -``` -#### 12.5.3 Parameters - -| Parameter | Parameter Type | Attributes | Description | -| --------- | ---------- | --------------------------------- |
-------------------- | -| DATA | Table parameter | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar parameter | String; default: time | Time column | -| SIZE | Scalar parameter | Long integer | Window size; must be positive | -| ORIGIN | Scalar parameter | Timestamp; default: Unix epoch | Start time of the first window | - -#### 12.5.4 Return Value - -The result columns of the TUMBLE function are: - -* window\_start: window start time (inclusive) -* window\_end: window end time (exclusive) -* Mapped columns: all input columns of the DATA parameter - -#### 12.5.5 Usage Example - -```SQL -IoTDB> SELECT * FROM TUMBLE( DATA => bid, TIMECOL => 'time', SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Combined with GROUP BY, this is equivalent to GROUP BY TIME in the tree model -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM TUMBLE(DATA => bid, TIMECOL => 'time', SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0|
-|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 12.6 CUMULATE - -#### 12.6.1 功能描述 - -Cumulate 函数用于从初始的窗口开始,创建相同窗口开始但窗口结束步长不同的窗口,直到达到最大的窗口大小。每个窗口包含其区间内的元素。例如:1小时步长,24小时大小的累计窗口,每天可以获得如下这些窗口:`[00:00, 01:00)`,`[00:00, 02:00)`,`[00:00, 03:00)`, …, `[00:00, 24:00)` - -#### 12.6.2 函数定义 - -```sql -CUMULATE(data, timecol, size, step[, origin]) -``` - -#### 12.6.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| --------- | ---------- | --------------------------------- | -------------------------------------------- | -| DATA | 表参数 | ROW SEMANTICPASS THROUGH | 输入表 | -| TIMECOL | 标量参数 | 字符串类型默认值:time | 时间列 | -| SIZE | 标量参数 | 长整数类型 | 窗口大小,SIZE必须是STEP的整数倍,需为正数 | -| STEP | 标量参数 | 长整数类型 | 窗口步长,需为正数 | -| ORIGIN | 标量参数 | 时间戳类型默认值:Unix 纪元时间 | 第一个窗口起始时间 | - -> 注意:size 如果不是 step 的整数倍,则会报错`Cumulative table function requires size must be an integral multiple of step` - -#### 12.6.4 返回结果 - -CUMULATE函数的返回结果列包含: - -* window\_start: 窗口开始时间(闭区间) -* window\_end: 窗口结束时间(开区间) -* 映射列:DATA 参数的所有输入列 - -#### 12.6.5 使用示例 - -```sql -IoTDB> SELECT * FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| 
-|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY TIME -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m, SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| TESL| 201.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00| AAPL| 100.0| 
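-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| AAPL| 101.5| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/Common-Table-Expression_timecho.md b/src/zh/UserGuide/Master/Table/SQL-Manual/Common-Table-Expression_timecho.md deleted file mode 100644 index b3dc653b2..000000000 --- a/src/zh/UserGuide/Master/Table/SQL-Manual/Common-Table-Expression_timecho.md +++ /dev/null @@ -1,234 +0,0 @@ - - -# Common Table Expressions (CTE) - -## 1. Overview - -The CTE (Common Table Expressions) feature lets you define one or more temporary result sets (common tables) with the `WITH` clause. These result sets can be referenced multiple times in later parts of the same query. CTEs provide a clear way to build complex queries and make SQL code easier to read and maintain. - -> Note: This feature is available starting from V 2.0.9.1. - -## 2. Syntax - -The simplified SQL syntax for CTEs is as follows: - -```SQL -with_clause: - WITH cte_name [(col_name [, col_name] ...)] AS (subquery) - [, cte_name [(col_name [, col_name] ...)] AS (subquery)] ... -``` - -* Simple and nested CTEs: one or more CTEs can be defined in a `WITH` clause, and CTEs can reference one another. Forward references are not allowed, that is, a CTE cannot be used before it is defined. -* CTE with the same name as a source table: if a CTE has the same name as a source table, only the CTE is visible in the outer scope and the source table is shadowed. -* Multiple references to a CTE: the same CTE can be referenced multiple times in the outer query. -* Explain / ExplainAnalyze support: `Explain` or `ExplainAnalyze` is supported on the whole query, but not on the `subquery` inside a CTE definition. -* Column name restriction: the number of column names specified in a CTE definition must equal the number of output columns of the `subquery`; otherwise an error is raised. -* Unused CTEs: if a defined CTE is not used in the main query, the query still executes normally. - -## 3.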
Usage Examples - -The examples use the tables `table1` and `table2` from the [sample data](../Reference/Sample-Data.md) as source tables: - -### 3.1 Simple CTE - -```SQL -WITH cte1 AS (SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL), - cte2 AS (SELECT device_id, humidity FROM table2 WHERE humidity IS NOT NULL) -SELECT * FROM cte1 join cte2 on cte1.device_id = cte2.device_id limit 10; -``` - -Result - -```Bash -+---------+-----------+---------+--------+ -|device_id|temperature|device_id|humidity| -+---------+-----------+---------+--------+ -| 100| 90.0| 100| 45.1| -| 100| 90.0| 100| 35.2| -| 100| 90.0| 100| 35.1| -| 100| 85.0| 100| 45.1| -| 100| 85.0| 100| 35.2| -| 100| 85.0| 100| 35.1| -| 100| 85.0| 100| 45.1| -| 100| 85.0| 100| 35.2| -| 100| 85.0| 100| 35.1| -| 100| 88.0| 100| 45.1| -+---------+-----------+---------+--------+ -Total line number = 10 -It costs 0.075s -``` - -### 3.2 CTE with the Same Name as a Source Table - -```SQL -WITH table1 AS (SELECT time, device_id, temperature FROM table1 WHERE temperature IS NOT NULL) -SELECT * FROM table1 limit 5; -``` - -Result - -```Bash -+-----------------------------+---------+-----------+ -| time|device_id|temperature| -+-----------------------------+---------+-----------+ -|2024-11-30T09:30:00.000+08:00| 101| 90.0| -|2024-11-30T14:30:00.000+08:00| 101| 90.0| -|2024-11-29T10:00:00.000+08:00| 101| 85.0| -|2024-11-27T16:39:00.000+08:00| 101| 85.0| -|2024-11-27T16:40:00.000+08:00| 101| 85.0| -+-----------------------------+---------+-----------+ -Total line number = 5 -It costs 0.103s -``` - -### 3.3 Nested CTE - -```SQL -WITH - table1 AS (select device_id, temperature from table1 WHERE temperature IS NOT NULL), - cte1 AS (select device_id, temperature from table2 WHERE temperature IS NOT NULL), - table2 AS (select temperature from table1), - cte2 AS (SELECT temperature FROM table1) -SELECT * FROM table2; -``` - -Result - -```Bash -+-----------+ -|temperature| -+-----------+ -| 90.0| -| 90.0| -| 85.0| -| 85.0| -| 85.0| -| 85.0| -| 90.0| -| 85.0| -| 85.0| -| 88.0| -| 90.0| -| 90.0| -+-----------+
-Total line number = 12 -It costs 0.050s -``` - -* 不支持前向引用 - -```SQL -WITH - cte2 AS (SELECT temperature FROM cte1), - cte1 AS (select device_id, temperature from table1) -SELECT * FROM cte2; -``` - -错误信息 - -```Bash -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 550: Table 'database1.cte1' does not exist. -``` - -### 3.4 CTE 的多次引用 - -```SQL -WITH cte AS (select device_id, temperature from table1 WHERE temperature IS NOT NULL) -SELECT * FROM cte WHERE temperature > (SELECT avg(temperature ) FROM cte); -``` - -执行结果 - -```Bash -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| 90.0| -| 101| 90.0| -| 100| 90.0| -| 100| 88.0| -| 100| 90.0| -| 100| 90.0| -+---------+-----------+ -Total line number = 6 -It costs 0.241s -``` - -### 3.5 Explain 支持 - -* 支持整个查询 - -```SQL -EXPLAIN WITH cte AS (SELECT * FROM table1) SELECT * FROM cte; -``` - -执行结果 - -```Bash -+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| distribution plan| -+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ | -| │OutputNode-7 │ | -| │OutputColumns-[time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time] │ | -| │OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│ | -| └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ | -| │ | -| │ | -| 
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ | -| │Collect-42 │ | -| │OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│ | -| └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ | -| ┌───────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────┐ | -| │ │ | -| ┌───────────┐ ┌───────────┐ | -| │Exchange-49│ │Exchange-50│ | -| └───────────┘ └───────────┘ | -| │ │ | -| │ │ | -|┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐| -|│DeviceTableScanNode-41 │ │DeviceTableScanNode-40 │| -|│QualifiedTableName: database1.table1 │ │QualifiedTableName: database1.table1 │| -|│OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│ │OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│| -|│DeviceNumber: 3 │ │DeviceNumber: 3 │| -|│ScanOrder: ASC │ │ScanOrder: ASC │| -|│PushDownOffset: 0 │ │PushDownOffset: 0 │| -|│PushDownLimit: 0 │ │PushDownLimit: 0 │| -|│PushDownLimitToEachDevice: false │ │PushDownLimitToEachDevice: false │| -|│RegionId: 2 │ │RegionId: 1 │| -|└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘| 
-+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 29 -It costs 0.065s -``` - -* Not supported on the query inside a CTE - -```SQL -WITH cte AS (EXPLAIN SELECT * FROM table1) SELECT * FROM cte; -``` - -Error message - -```Bash -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 700: line 1:14: mismatched input 'EXPLAIN'. Expecting: -``` diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/Featured-Functions_timecho.md b/src/zh/UserGuide/Master/Table/SQL-Manual/Featured-Functions_timecho.md deleted file mode 100644 index 308c527a4..000000000 --- a/src/zh/UserGuide/Master/Table/SQL-Manual/Featured-Functions_timecho.md +++ /dev/null @@ -1,862 +0,0 @@ - - -# Featured Functions - -## 1. Downsampling Functions - -### 1.1 `date_bin` Function - -#### 1.1.1 Description - -`date_bin` is a scalar function that aligns a timestamp to the start of the specified time interval. Combined with a `GROUP BY` clause, it implements downsampling. - -- When some intervals have no data: only the timestamps of rows that satisfy the conditions are aligned; missing intervals are not filled. -- When all intervals have no data: if no data in the whole query range satisfies the conditions, downsampling returns an empty result set. - -#### 1.1.2 Usage Examples - -##### Sample Data - -The [sample data page](../Reference/Sample-Data.md) contains the SQL statements for creating the table schema and inserting data. Download and execute these statements in the IoTDB CLI to import the data into IoTDB. You can then use this data to run the SQL statements in the examples and obtain the corresponding results. - -Example 1: Get the hourly average temperature of device `100` over a time range - -```SQL -SELECT date_bin(1h, time) AS hour_time, avg(temperature) AS avg_temp -FROM table1 -WHERE (time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00) - AND device_id = '100' -GROUP BY 1; -``` - -Result: - -```Plain -+-----------------------------+--------+ -| hour_time|avg_temp| -+-----------------------------+--------+ -|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:00:00.000+08:00| 90.0| -|2024-11-28T08:00:00.000+08:00| 85.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 85.0| -|2024-11-28T11:00:00.000+08:00| 88.0| -+-----------------------------+--------+ -``` - -Example 2: Get the hourly average temperature of each device over a time range - -```SQL -SELECT date_bin(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp -FROM
table1 -WHERE time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00 -GROUP BY 1, device_id; -``` - -Result: - -```Plain -+-----------------------------+---------+--------+ -| hour_time|device_id|avg_temp| -+-----------------------------+---------+--------+ -|2024-11-29T11:00:00.000+08:00| 100| null| -|2024-11-29T18:00:00.000+08:00| 100| 90.0| -|2024-11-28T08:00:00.000+08:00| 100| 85.0| -|2024-11-28T09:00:00.000+08:00| 100| null| -|2024-11-28T10:00:00.000+08:00| 100| 85.0| -|2024-11-28T11:00:00.000+08:00| 100| 88.0| -|2024-11-29T10:00:00.000+08:00| 101| 85.0| -|2024-11-27T16:00:00.000+08:00| 101| 85.0| -+-----------------------------+---------+--------+ -``` - -Example 3: Get the hourly average temperature across all devices over a time range - -```SQL -SELECT date_bin(1h, time) AS hour_time, avg(temperature) AS avg_temp - FROM table1 - WHERE time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00 - group by 1; -``` - -Result: - -```Plain -+-----------------------------+--------+ -| hour_time|avg_temp| -+-----------------------------+--------+ -|2024-11-29T10:00:00.000+08:00| 85.0| -|2024-11-27T16:00:00.000+08:00| 85.0| -|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:00:00.000+08:00| 90.0| -|2024-11-28T08:00:00.000+08:00| 85.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 85.0| -|2024-11-28T11:00:00.000+08:00| 88.0| -+-----------------------------+--------+ -``` - -### 1.2 `date_bin_gapfill` Function - -#### 1.2.1 Description - -`date_bin_gapfill` is an extension of `date_bin` that fills in missing time intervals and therefore returns a complete time series. - -- When some intervals have no data: the timestamps of rows that satisfy the conditions are aligned, and missing intervals are filled. -- When all intervals have no data: if there is no data in the whole query range, `date_bin_gapfill` returns an empty result set. - -#### 1.2.2 Restrictions - -- **`date_bin_gapfill` must be used together with a `GROUP BY` clause.** If it is used in any other clause, no error is raised, but gap filling is not performed and the effect is the same as using `date_bin`. -- **Each `GROUP BY` clause may contain only one `date_bin_gapfill`.** If more than one `date_bin_gapfill` appears, an error is raised: multiple date_bin_gapfill calls not allowed -- **Execution order of `date_bin_gapfill`**: gap filling happens after the `HAVING` clause is executed and before the `FILL` clause is executed. -- **When `date_bin_gapfill` is used, the time filter in the `WHERE` clause must take one of the following forms:**
- - `time >= XXX AND time <= XXX` - - `time > XXX AND time < XXX` - - `time BETWEEN XXX AND XXX` -- **When `date_bin_gapfill` is used, any other form of time filter** raises an error. The time filter and other value filters can only be combined with `AND`. -- **If startTime and endTime cannot be inferred from the WHERE clause, an error is raised**: could not infer startTime or endTime from WHERE clause. - -#### 1.2.3 Usage Examples - -Example 1: Fill missing time intervals - -```SQL -SELECT date_bin_gapfill(1h, time) AS hour_time, avg(temperature) AS avg_temp -FROM table1 -WHERE (time >= 2024-11-28 07:00:00 AND time <= 2024-11-28 16:00:00) - AND device_id = '100' -GROUP BY 1; -``` - -Result: - -```Plain -+-----------------------------+--------+ -| hour_time|avg_temp| -+-----------------------------+--------+ -|2024-11-28T07:00:00.000+08:00| null| -|2024-11-28T08:00:00.000+08:00| 85.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 85.0| -|2024-11-28T11:00:00.000+08:00| 88.0| -|2024-11-28T12:00:00.000+08:00| null| -|2024-11-28T13:00:00.000+08:00| null| -|2024-11-28T14:00:00.000+08:00| null| -|2024-11-28T15:00:00.000+08:00| null| -|2024-11-28T16:00:00.000+08:00| null| -+-----------------------------+--------+ -``` - -Example 2: Fill missing time intervals with grouping by device - -```SQL -SELECT date_bin_gapfill(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp -FROM table1 -WHERE time >= 2024-11-28 07:00:00 AND time <= 2024-11-28 16:00:00 -GROUP BY 1, device_id; -``` - -Result: - -```Plain -+-----------------------------+---------+--------+ -| hour_time|device_id|avg_temp| -+-----------------------------+---------+--------+ -|2024-11-28T07:00:00.000+08:00| 100| null| -|2024-11-28T08:00:00.000+08:00| 100| 85.0| -|2024-11-28T09:00:00.000+08:00| 100| null| -|2024-11-28T10:00:00.000+08:00| 100| 85.0| -|2024-11-28T11:00:00.000+08:00| 100| 88.0| -|2024-11-28T12:00:00.000+08:00| 100| null| -|2024-11-28T13:00:00.000+08:00| 100| null| -|2024-11-28T14:00:00.000+08:00| 100| null| -|2024-11-28T15:00:00.000+08:00| 100| null| -|2024-11-28T16:00:00.000+08:00| 100| null| -+-----------------------------+---------+--------+ -``` - -Example
3:查询范围内没有数据返回空结果集 - -```SQL -SELECT date_bin_gapfill(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp -FROM table1 -WHERE time >= 2024-11-27 09:00:00 AND time <= 2024-11-27 14:00:00 -GROUP BY 1, device_id; -``` - -结果: - -```Plain -+---------+---------+--------+ -|hour_time|device_id|avg_temp| -+---------+---------+--------+ -+---------+---------+--------+ -``` - -## 2. DIFF函数 - -### 2.1 功能概述 - -`DIFF` 函数用于计算当前行与上一行的差值。对于第一行,由于没有前一行数据,因此永远返回 `NULL`。 - -### 2.2 函数定义 - -``` -DIFF(numberic[, boolean]) -> Double -``` - -### 2.3 参数说明 - -- 第一个参数:数值类型 - - - **类型**:必须是数值类型(`INT32`、`INT64`、`FLOAT`、`DOUBLE`) - - **作用**:指定要计算差值的列。 - -- 第二个参数:布尔类型(可选) - - **类型**:布尔类型(`true` 或 `false`)。 - - **默认值**:`true`。 - - **作用**: - - **`true`**:忽略 `NULL` 值,向前找到第一个非 `NULL` 值进行差值计算。如果前面没有非 `NULL` 值,则返回 `NULL`。 - - **`false`**:不忽略 `NULL` 值,如果前一行为 `NULL`,则差值结果为 `NULL`。 - -### 2.4 注意事项 - -- 在树模型中,第二个参数需要指定为 `'ignoreNull'='true'` 或 `'ignoreNull'='false'`,但在表模型中,只需指定为 `true` 或 `false`。 -- 如果用户写成 `'ignoreNull'='true'` 或 `'ignoreNull'='false'`,表模型会将其视为对两个字符串常量进行等号比较,返回布尔值,但结果总是 `false`,等价于指定第二个参数为 `false`。 - -### 2.5 使用示例 - -示例 1:忽略 `NULL` 值 - -```SQL -SELECT time, DIFF(temperature) AS diff_temp -FROM table1 -WHERE device_id = '100'; -``` - -结果: - -```Plain -+-----------------------------+---------+ -| time|diff_temp| -+-----------------------------+---------+ -|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:30:00.000+08:00| null| -|2024-11-28T08:00:00.000+08:00| -5.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 0.0| -|2024-11-28T11:00:00.000+08:00| 3.0| -|2024-11-26T13:37:00.000+08:00| 2.0| -|2024-11-26T13:38:00.000+08:00| 0.0| -+-----------------------------+---------+ -``` - -示例 2:不忽略 `NULL` 值 - -```SQL -SELECT time, DIFF(temperature, false) AS diff_temp -FROM table1 -WHERE device_id = '100'; -``` - -结果: - -```Plain -+-----------------------------+---------+ -| time|diff_temp| -+-----------------------------+---------+ 
-|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:30:00.000+08:00| null| -|2024-11-28T08:00:00.000+08:00| -5.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| null| -|2024-11-28T11:00:00.000+08:00| 3.0| -|2024-11-26T13:37:00.000+08:00| 2.0| -|2024-11-26T13:38:00.000+08:00| 0.0| -+-----------------------------+---------+ -``` - -示例 3:完整示例 - -```SQL -SELECT time, temperature, - DIFF(temperature) AS diff_temp_1, - DIFF(temperature, false) AS diff_temp_2 -FROM table1 -WHERE device_id = '100'; -``` - -结果: - -```Plain -+-----------------------------+-----------+-----------+-----------+ -| time|temperature|diff_temp_1|diff_temp_2| -+-----------------------------+-----------+-----------+-----------+ -|2024-11-29T11:00:00.000+08:00| null| null| null| -|2024-11-29T18:30:00.000+08:00| 90.0| null| null| -|2024-11-28T08:00:00.000+08:00| 85.0| -5.0| -5.0| -|2024-11-28T09:00:00.000+08:00| null| null| null| -|2024-11-28T10:00:00.000+08:00| 85.0| 0.0| null| -|2024-11-28T11:00:00.000+08:00| 88.0| 3.0| 3.0| -|2024-11-26T13:37:00.000+08:00| 90.0| 2.0| 2.0| -|2024-11-26T13:38:00.000+08:00| 90.0| 0.0| 0.0| -+-----------------------------+-----------+-----------+-----------+ -``` - -## 3. 
时序分窗函数 - -原始示例数据如下: - -```SQL -IoTDB> SELECT * FROM bid; -+-----------------------------+--------+-----+ -| time|stock_id|price| -+-----------------------------+--------+-----+ -|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+--------+-----+ - --- 创建语句 -CREATE TABLE bid(time TIMESTAMP TIME, stock_id STRING TAG, price FLOAT FIELD); --- 插入数据 -INSERT INTO bid(time, stock_id, price) VALUES('2021-01-01T09:05:00','AAPL',100.0),('2021-01-01T09:06:00','TESL',200.0),('2021-01-01T09:07:00','AAPL',103.0),('2021-01-01T09:07:00','TESL',202.0),('2021-01-01T09:09:00','AAPL',102.0),('2021-01-01T09:15:00','TESL',195.0); -``` - -### 3.1 HOP - -#### 3.1.1 功能描述 - -HOP 函数用于按时间分段分窗分析,识别每一行数据所属的时间窗口。该函数通过指定固定窗口大小(size)和窗口滑动步长(SLIDE),将数据按时间戳分配到所有与其时间戳重叠的窗口中。若窗口之间存在重叠(步长 < 窗口大小),数据会自动复制到多个窗口。 - -#### 3.1.2 函数定义 - -```SQL -HOP(data, timecol, size, slide[, origin]) -``` - -#### 3.1.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| --------- | ---------- | --------------------------------- | -------------------- | -| DATA | 表参数 | ROW SEMANTICPASS THROUGH | 输入表 | -| TIMECOL | 标量参数 | 字符串类型默认值:time | 时间列 | -| SIZE | 标量参数 | 长整数类型 | 窗口大小 | -| SLIDE | 标量参数 | 长整数类型 | 窗口滑动步长 | -| ORIGIN | 标量参数 | 时间戳类型默认值:Unix 纪元时间 | 第一个窗口起始时间 | - -#### 3.1.4 返回结果 - -HOP 函数的返回结果列包含: - -* window\_start: 窗口开始时间(闭区间) -* window\_end: 窗口结束时间(开区间) -* 映射列:DATA 参数的所有输入列 - -#### 3.1.5 使用示例 - -```SQL -IoTDB> SELECT * FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ 
-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY TIME -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 201.0| 
-|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 3.2 SESSION - -#### 3.2.1 功能描述 - -SESSION 函数用于按会话间隔对数据进行分窗。系统逐行检查与前一行的时间间隔,小于阈值(GAP)则归入当前窗口,超过则归入下一个窗口。 - -#### 3.2.2 函数定义 - -```SQL -SESSION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], timecol, gap) -``` -#### 3.2.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| --------- | ---------- | -------------------------- | ---------------------------------------- | -| DATA | 表参数 | SET SEMANTICPASS THROUGH | 输入表通过 pkeys、okeys 指定分区和排序 | -| TIMECOL | 标量参数 | 字符串类型默认值:'time' | 时间列名| -| GAP | 标量参数 | 长整数类型 | 会话间隔阈值 | - -#### 3.2.4 返回结果 - -SESSION 函数的返回结果列包含: - -* window\_start: 会话窗口内的第一条数据的时间 -* window\_end: 会话窗口内的最后一条数据的时间 -* 映射列:DATA 参数的所有输入列 - -#### 3.2.5 使用示例 - -```SQL -IoTDB> SELECT * FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:07:00.000+08:00| 
AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY SESSION -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL| 201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 3.3 VARIATION - -#### 3.3.1 功能描述 - -VARIATION 函数用于按数据差值分窗,将第一条数据作为首个窗口的基准值,每个数据点会与基准值进行差值运算,如果差值小于给定的阈值(delta)则加入当前窗口;如果超过阈值,则分为下一个窗口,将该值作为下一个窗口的基准值。 - -#### 3.3.2 函数定义 - -```sql -VARIATION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], col, delta) -``` - -#### 3.3.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| -------- | ---------- | -------------------------- | ---------------------------------------- | -| DATA | 表参数 | SET SEMANTICPASS THROUGH | 输入表通过 pkeys、okeys 指定分区和排序 | -| COL | 标量参数 | 字符串类型 | 标识对哪一列计算差值 | -| DELTA | 标量参数 | 浮点数类型 | 差值阈值 | - -#### 3.3.4 返回结果 - -VARIATION 函数的返回结果列包含: - -* window\_index: 窗口编号 -* 映射列:DATA 参数的所有输入列 - -#### 3.3.5 使用示例 - -```sql -IoTDB> SELECT * FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price',DELTA => 2.0); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 
0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 1|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY VARIATION -IoTDB> SELECT first(time) as window_start, last(time) as window_end, stock_id, avg(price) as avg FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price', DELTA => 2.0) GROUP BY window_index, stock_id; -+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:07:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.5| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 3.4 CAPACITY - -#### 3.4.1 功能描述 - -CAPACITY 函数用于按数据点数(行数)分窗,每个窗口最多有 SIZE 行数据。 - -#### 3.4.2 函数定义 - -```sql -CAPACITY(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], size) -``` - -#### 3.4.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| -------- | ---------- | -------------------------- | ---------------------------------------- | -| DATA | 表参数 | SET SEMANTICPASS THROUGH | 输入表通过 pkeys、okeys 指定分区和排序 | -| SIZE | 标量参数 | 长整数类型 | 窗口大小 | - -#### 3.4.4 返回结果 - -CAPACITY 函数的返回结果列包含: - -* window\_index: 窗口编号 -* 映射列:DATA 参数的所有输入列 - -#### 3.4.5 使用示例 - -```sql -IoTDB> SELECT * FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| 
-+------------+-----------------------------+--------+-----+ -| 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 0|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- Combined with a GROUP BY clause; equivalent to GROUP BY COUNT in the tree model -IoTDB> SELECT first(time) as start_time, last(time) as end_time, stock_id, avg(price) as avg FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2) GROUP BY window_index, stock_id; -+-----------------------------+-----------------------------+--------+-----+ -| start_time| end_time|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|101.5| -|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 3.5 TUMBLE - -#### 3.5.1 Description - -The TUMBLE function assigns each row to exactly one window based on a time attribute column. Tumbling windows have a fixed size and do not overlap. - -#### 3.5.2 Function Definition - -```sql -TUMBLE(data, timecol, size[, origin]) -``` - -#### 3.5.3 Parameters - -| Parameter | Parameter Type | Attributes | Description | -| --------- | ---------- | --------------------------------- | -------------------- | -| DATA | Table parameter | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar parameter | String; default: time | Time column | -| SIZE | Scalar parameter | Long integer | Window size; must be positive | -| ORIGIN | Scalar parameter | Timestamp; default: Unix epoch | Start time of the first window | - -#### 3.5.4 Return Value - -The result columns of the TUMBLE function are: - -* window\_start: window start time (inclusive) -* window\_end: window end time (exclusive) -* Pass-through columns: all input columns of the DATA parameter - -#### 3.5.5 Example - -```SQL -IoTDB> SELECT * FROM TUMBLE(DATA => bid, TIMECOL => 'time', SIZE => 10m); 
-+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Combined with a GROUP BY clause; equivalent to GROUP BY TIME in the tree model -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM TUMBLE(DATA => bid, TIMECOL => 'time', SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 3.6 CUMULATE - -#### 3.6.1 Description - -The CUMULATE function starts from an initial window and creates windows that share the same start time but whose end times grow by one step each, until the maximum window size is reached. Each window contains the elements that fall within its interval. For example, a cumulative window with a 1-hour step and a 24-hour size produces the following windows every day: `[00:00, 01:00)`, `[00:00, 02:00)`, `[00:00, 03:00)`, …, `[00:00, 24:00)` - 
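
The cumulative-window assignment described above can be sketched in a few lines of Python. This is only a model of the semantics, not IoTDB's implementation: the function name `cumulate_windows`, the use of plain integers as timestamps, and the default `origin=0` are assumptions made for illustration.

```python
def cumulate_windows(ts, size, step, origin=0):
    """Return the (window_start, window_end) pairs that contain ts.

    All arguments are integers in the same time unit (for example minutes).
    Windows are half-open: window_start <= ts < window_end.
    This is an illustrative model, not the IoTDB implementation.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    if size % step != 0:
        raise ValueError("size must be an integral multiple of step")
    # Every cumulative window in one cycle starts at the same size-aligned boundary.
    start = origin + (ts - origin) // size * size
    # Smallest window end on a step boundary that is strictly greater than ts.
    first_end = start + ((ts - start) // step + 1) * step
    # Window ends grow by one step until the full size is reached.
    return [(start, end) for end in range(first_end, start + size + 1, step)]


# 09:05 expressed as minutes since midnight, with STEP => 2m and SIZE => 10m
print(cumulate_windows(9 * 60 + 5, size=10, step=2))
# [(540, 546), (540, 548), (540, 550)]
```

With minutes since midnight as the unit, the printed pairs correspond to `[09:00, 09:06)`, `[09:00, 09:08)`, and `[09:00, 09:10)`: a row is assigned to every cumulative window of its cycle whose end lies after the row's timestamp.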
-#### 3.6.2 函数定义 - -```sql -CUMULATE(data, timecol, size, step[, origin]) -``` - -#### 3.6.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| --------- | ---------- | --------------------------------- | -------------------------------------------- | -| DATA | 表参数 | ROW SEMANTICPASS THROUGH | 输入表 | -| TIMECOL | 标量参数 | 字符串类型默认值:time | 时间列 | -| SIZE | 标量参数 | 长整数类型 | 窗口大小,SIZE必须是STEP的整数倍,需为正数 | -| STEP | 标量参数 | 长整数类型 | 窗口步长,需为正数 | -| ORIGIN | 标量参数 | 时间戳类型默认值:Unix 纪元时间 | 第一个窗口起始时间 | - -> 注意:size 如果不是 step 的整数倍,则会报错`Cumulative table function requires size must be an integral multiple of step` - -#### 3.6.4 返回结果 - -CUMULATE函数的返回结果列包含: - -* window\_start: 窗口开始时间(闭区间) -* window\_end: 窗口结束时间(开区间) -* 映射列:DATA 参数的所有输入列 - -#### 3.6.5 使用示例 - -```sql -IoTDB> SELECT * FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| 
-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY TIME -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m, SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| TESL| 201.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00| AAPL| 100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| AAPL| 101.5| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -## 4. 窗口函数 - -### 4.1 语法定义 - -```SQL -windowDefinition - : name=identifier AS '(' windowSpecification ')' - ; - -windowSpecification - : (existingWindowName=identifier)? 
- (PARTITION BY partition+=expression (',' partition+=expression)*)? - (ORDER BY sortItem (',' sortItem)*)? - windowFrame? - ; - -windowFrame - : frameExtent - ; - -frameExtent - : frameType=RANGE start=frameBound - | frameType=ROWS start=frameBound - | frameType=GROUPS start=frameBound - | frameType=RANGE BETWEEN start=frameBound AND end=frameBound - | frameType=ROWS BETWEEN start=frameBound AND end=frameBound - | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound - ; - -frameBound - : UNBOUNDED boundType=PRECEDING #unboundedFrame - | UNBOUNDED boundType=FOLLOWING #unboundedFrame - | CURRENT ROW #currentRowBound - | expression boundType=(PRECEDING | FOLLOWING) #boundedFrame - ; -``` - -更多详细功能介绍请参考:[窗口函数](../User-Manual/Window-Function_timecho.md) - -### 4.2 使用示例 - -表 device_flow 原始数据如下 - -```sql -+-----------------------------+------+-----+ -| time|device| flow| -+-----------------------------+------+-----+ -|1970-01-01T08:00:00.000+08:00| d0| 3| -|1970-01-01T08:00:00.001+08:00| d0| 5| -|1970-01-01T08:00:00.002+08:00| d0| 3| -|1970-01-01T08:00:00.003+08:00| d0| 1| -|1970-01-01T08:00:00.004+08:00| d1| 2| -|1970-01-01T08:00:00.005+08:00| d1| 4| -+-----------------------------+------+-----+ -``` - -1. 从 device_flow 中查询所有列,并按 device 维度分组,在每个设备分组内按 flow 字段值排序,计算 flow 字段的累计求和,最终将累计和命名为 sum 列返回。 - -查询语句: - -```SQL -IoTDB> SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) as sum FROM device_flow; -``` - -查询结果: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` -2. 
从 device_flow 表查询所有原始列,按 device 设备分组,每个设备分组内按 flow 字段值排序,统计「当前行所在的 flow 分组 + 前 1 个 flow 分组」范围内的行数(计数),最终将计数结果命名为 count 列返回。 - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ORDER BY flow GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -3. 从 device_flow 表查询所有原始列,按 device 分组,每个分组内按 flow 字段值升序排序,统计「当前行 flow 值 - 2」到「当前行 flow 值」这个数值区间内的所有行的数量,最终将计数结果命名为 count 列返回。 - -查询语句: - -```SQL -IoTDB> SELECT *,count(flow) OVER(PARTITION BY device ORDER BY flow RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -## 5. 
Object Type Read Function - -Description: Reads the binary content of an OBJECT value. Returns a BLOB (the binary content of the object). -> Supported since V2.0.8 - -Syntax: - -```SQL -READ_OBJECT(object [, offset, length]) -``` - -Parameters: - -* Required parameter: `object`, of type OBJECT -* Optional parameters: - * `offset`, of type long (int64): the read offset; defaults to 0. An exception is thrown if offset is less than 0 or greater than or equal to the total file length - * `length`, of type long (int64): the number of bytes to read; defaults to the total file length - * An error is raised when length is greater than 2^31 - 1 - * When length is greater than the remaining file length starting at offset, the content from offset to the end of the file is returned - * When length is less than 0, all remaining data of the object starting at offset is read - -Example: - -```SQL -IoTDB:database1> select READ_OBJECT(s1) from table1 where device_id = 'tag1' -+------------+ -| _col0| -+------------+ -|0x696f746462| -+------------+ -Total line number = 1 - - -IoTDB:database1> select READ_OBJECT(s1, 0, 3) from table1 where device_id = 'tag1' -+--------+ -| _col0| -+--------+ -|0x696f74| -+--------+ -Total line number = 1 -``` diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/QuickStart-Only-Sql_timecho.md b/src/zh/UserGuide/Master/Table/SQL-Manual/QuickStart-Only-Sql_timecho.md deleted file mode 100644 index 26e1dac30..000000000 --- a/src/zh/UserGuide/Master/Table/SQL-Manual/QuickStart-Only-Sql_timecho.md +++ /dev/null @@ -1,127 +0,0 @@ - - -# Quick SQL Tour - -> **Before executing the following SQL statements, make sure that** -> -> * **the IoTDB service has started successfully** -> * **you are connected to IoTDB through the Cli client** -> -> Note: if your terminal does not support multi-line paste (for example Windows CMD), rewrite each SQL statement on a single line before executing it. - -## 1. Database Management - -```SQL --- Create database database1; -CREATE DATABASE IF NOT EXISTS database1; - --- Use database database1; -USE database1; - --- Change the database TTL to 1 week; -ALTER DATABASE database1 SET PROPERTIES TTL=604800000; - --- Drop database database1; -DROP DATABASE IF EXISTS database1; -``` - -For detailed syntax, see [Database Management](../Basic-Concept/Database-Management_timecho.md) - -## 2. 
表管理 - -```SQL ---创建表 table1; -CREATE TABLE table1 ( - time TIMESTAMP TIME, - device_id STRING TAG, - maintenance STRING ATTRIBUTE COMMENT 'maintenance', - temperature FLOAT FIELD COMMENT 'temperature', - status Boolean FIELD COMMENT 'status' -); - --- 查看表 table1 的列信息; -DESC table1 DETAILS; - --- 修改表; --- 表 table1 增加列; -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS humidity FLOAT FIELD COMMENT 'humidity'; --- 表 table1 TTL 设置为1周; -ALTER TABLE table1 set properties TTL=604800000; - ---删除表 table1; -DROP TABLE table1; -``` - -详细语法说明可参考:[表管理](../Basic-Concept/Table-Management_timecho.md) - -## 3. 数据写入 - -```SQL ---单行写入; -INSERT INTO table1(device_id, time, temperature) VALUES ('100', '2025-11-26 13:37:00', 90.0); - ---多行写入; -INSERT INTO table1(device_id, maintenance, time, temperature) VALUES - ('101', '180', '2024-11-26 13:37:00', 88.0), - ('100', '180', '2024-11-26 13:38:00', 85.0), - ('101', '180', '2024-11-27 16:38:00', 80.0); -``` - -详细语法说明可参考:[数据写入](../Basic-Concept/Write-Updata-Data_timecho.md#_1-数据写入) - -## 4. 数据查询 - -```SQL ---全表查询; -SELECT * FROM table1; - ---函数查询; -SELECT count(*), sum(temperature) FROM table1; - ---查询指定设备及时间段的数据; -SELECT * -FROM table1 -WHERE time >= 2024-11-26 00:00:00 and time <= 2024-11-27 00:00:00 and device_id='101'; -``` - -详细语法说明可参考:[数据查询](../Basic-Concept/Query-Data_timecho.md) - -## 5. 数据更新 - -```SQL --- 更新 device_id 是 100 的数据的属性 maintenance 值; -UPDATE table1 SET maintenance='45' WHERE device_id='100'; -``` - -详细语法说明可参考:[数据更新](../Basic-Concept/Write-Updata-Data_timecho.md#_2-数据更新) - -## 6. 
Data Deletion - -```SQL --- Delete the data of a given device within a time range; -DELETE FROM table1 WHERE time >= 2024-11-26 23:39:00 and time <= 2024-11-27 20:42:00 AND device_id='101'; - --- Delete all data in the table; -DELETE FROM table1; -``` - -For detailed syntax, see [Data Deletion](../Basic-Concept/Delete-Data.md) diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/Row-Pattern-Recognition_timecho.md b/src/zh/UserGuide/Master/Table/SQL-Manual/Row-Pattern-Recognition_timecho.md deleted file mode 100644 index 5e6ca2f9a..000000000 --- a/src/zh/UserGuide/Master/Table/SQL-Manual/Row-Pattern-Recognition_timecho.md +++ /dev/null @@ -1,155 +0,0 @@ - - -# Pattern Query - -## 1. Syntax Definition - -```SQL -MATCH_RECOGNIZE ( - [ PARTITION BY column [, ...] ] - [ ORDER BY column [, ...] ] - [ MEASURES measure_definition [, ...] ] - [ ROWS PER MATCH ] - [ AFTER MATCH skip_to ] - PATTERN ( row_pattern ) - [ SUBSET subset_definition [, ...] ] - DEFINE variable_definition [, ...] -) -``` - -**Notes:** - -* PARTITION BY: Optional. Groups the input table so that pattern matching runs independently within each group. If this clause is omitted, the whole input table is processed as a single unit. -* ORDER BY: Optional. Ensures that the input data is matched in a given order. -* MEASURES: Optional. Specifies which information to extract from a matched run of rows. -* ROWS PER MATCH: Optional. Specifies how the result set is output after a successful pattern match. -* AFTER MATCH SKIP: Optional. Specifies the row at which the next pattern match should resume after a non-empty match has been found. -* PATTERN: Defines the row pattern to match. -* SUBSET: Optional. Merges the rows matched by several primary pattern variables into one logical set. -* DEFINE: Defines the primary pattern variables of the row pattern. - -For more details, see [Pattern Query](../User-Manual/Pattern-Query_timecho.md) - -## 2. Examples - -Using the [sample data](../Reference/Sample-Data.md) as the source data - -1. 
时间分段查询 - -将 table1 中的数据按照时间间隔小于等于 24 小时分段,查询每段中的数据总条数,以及开始、结束时间。 - -查询SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table1 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (cast(B.time as INT64) - cast(PREV(B.time) as INT64)) <= 86400000 -) AS m -``` - -查询结果 - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:38:00.000+08:00| 2| -|2024-11-27T16:38:00.000+08:00|2024-11-30T14:30:00.000+08:00| 16| -+-----------------------------+-----------------------------+---+ -Total line number = 2 -``` - -2. 差值分段查询 - -将 table2 中的数据按照 humidity 湿度值差值小于 0.1 分段,查询每段中的数据总条数,以及开始、结束时间。 - -* 查询sql - -```SQL -SELECT start_time, end_time, cnt -FROM table2 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (B.humidity - PREV(B.humidity )) <=0.1 -) AS m; -``` - -* 查询结果 - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-27T00:00:00.000+08:00| 2| -|2024-11-28T08:00:00.000+08:00|2024-11-29T00:00:00.000+08:00| 2| -|2024-11-29T11:00:00.000+08:00|2024-11-30T00:00:00.000+08:00| 2| -+-----------------------------+-----------------------------+---+ -Total line number = 3 -``` - -3. 
事件统计查询 - -将 table1 中数据按照设备号分组,统计上海地区湿度大于 35 的开始、结束时间及最大湿度值。 - -* 查询sql - -```SQL -SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity -FROM table1 -MATCH_RECOGNIZE ( - PARTITION BY device_id - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RPR_FIRST(A.time) AS event_start, - RPR_LAST(A.time) AS event_end, - MAX(A.humidity) AS max_humidity - ONE ROW PER MATCH - PATTERN (A+) - DEFINE - A AS A.region= '上海' AND A.humidity> 35 -) AS m -``` - -* 查询结果 - -```SQL -+---------+-----+-----------------------------+-----------------------------+------------+ -|device_id|match| event_start| event_end|max_humidity| -+---------+-----+-----------------------------+-----------------------------+------------+ -| 100| 1|2024-11-28T09:00:00.000+08:00|2024-11-29T18:30:00.000+08:00| 45.1| -| 101| 1|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 35.2| -+---------+-----+-----------------------------+-----------------------------+------------+ -Total line number = 2 -``` diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Authority-Management_timecho.md b/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Authority-Management_timecho.md deleted file mode 100644 index 34fc1bf54..000000000 --- a/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Authority-Management_timecho.md +++ /dev/null @@ -1,377 +0,0 @@ - -# 权限管理 - -本文档为 V2.0.7 版本起权限管理的 SQL 手册,详细功能使用可见[权限管理](../User-Manual/Authority-Management-Upgrade_timecho.md),如需查阅 V2.0.7 版本之前权限管理的功能介绍可参考[权限管理](../User-Manual/Authority-Management_timecho.md) - -## 1. 权限列表 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
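-| Privilege type | Privilege name | Scope | Description |
-| --- | --- | --- | --- |
-| Global privilege | SYSTEM | Global | Allows the user to create, alter, and drop databases. |
-| Global privilege | SYSTEM | Global | Allows the user to create, alter, and drop tables and table views. |
-| Global privilege | SYSTEM | Global | Allows the user to create, drop, and view user-defined functions. |
-| Global privilege | SYSTEM | Global | Allows the user to create, start, stop, drop, and view PIPEs. Allows the user to create, drop, and view PIPEPLUGINS. |
-| Global privilege | SYSTEM | Global | Allows the user to view and cancel queries. Allows the user to view variables. Allows the user to view the cluster status. |
-| Global privilege | SYSTEM | Global | Allows the user to create, drop, and view deep learning models. |
-| Global privilege | SECURITY | Global | Allows the user to create users. |
-| Global privilege | SECURITY | Global | Allows the user to drop users. |
-| Global privilege | SECURITY | Global | Allows the user to change user passwords. |
-| Global privilege | SECURITY | Global | Allows the user to view the privilege information of users. |
-| Global privilege | SECURITY | Global | Allows the user to list all users. |
-| Global privilege | SECURITY | Global | Allows the user to create roles. |
-| Global privilege | SECURITY | Global | Allows the user to drop roles. |
-| Global privilege | SECURITY | Global | Allows the user to view the privilege information of roles. |
-| Global privilege | SECURITY | Global | Allows the user to grant a role to a user or revoke it. |
-| Global privilege | SECURITY | Global | Allows the user to list all roles. |
-| Global privilege | AUDIT | Global | Allows the user to maintain audit log rules. Allows the user to view audit logs. |
-| Data privilege | CREATE | ANY | Allows creating any table and any database. |
-| Data privilege | CREATE | Database | Allows the user to create tables in this database; allows the user to create a database with this name. |
-| Data privilege | CREATE | Table | Allows the user to create a table with this name. |
-| Data privilege | ALTER | ANY | Allows altering the definition of any table and any database. |
-| Data privilege | ALTER | Database | Allows the user to alter the definition of the database and of the tables in it. |
-| Data privilege | ALTER | Table | Allows the user to alter the definition of the table. |
-| Data privilege | SELECT | ANY | Allows querying data in any table of any database in the system. |
-| Data privilege | SELECT | Database | Allows the user to query data in any table of this database. |
-| Data privilege | SELECT | Table | Allows the user to query data in this table. In a multi-table query, the database returns only the data the user is permitted to access. |
-| Data privilege | INSERT | ANY | Allows inserting or updating data in any table of any database. |
-| Data privilege | INSERT | Database | Allows the user to insert or update data in any table within this database. |
-| Data privilege | INSERT | Table | Allows the user to insert or update data in this table. |
-| Data privilege | DELETE | ANY | Allows deleting data from any table. |
-| Data privilege | DELETE | Database | Allows the user to delete data within this database. |
-| Data privilege | DELETE | Table | Allows the user to delete data in this table. |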
-
-## 2. SQL Statements
-
-### 2.1 User and Role Management
-
-1. Create a user (requires the SECURITY privilege)
-
-```SQL
-CREATE USER
-eg: CREATE USER user1 'Passwd@202604';
-```
-
-2. Change a password
-
-Users can change their own password; changing another user's password requires the SECURITY privilege.
-
-```SQL
-ALTER USER SET PASSWORD
-eg: ALTER USER tempuser SET PASSWORD 'Newpwd@202604';
-```
-
-3. Drop a user (requires the SECURITY privilege)
-
-```SQL
-DROP USER
-eg: DROP USER user1;
-```
-
-4. Create a role (requires the SECURITY privilege)
-
-```SQL
-CREATE ROLE
-eg: CREATE ROLE role1;
-```
-
-5. Drop a role (requires the SECURITY privilege)
-
-```SQL
-DROP ROLE
-eg: DROP ROLE role1;
-```
-
-6. Grant a role to a user (requires the SECURITY privilege)
-
-```SQL
-GRANT ROLE TO
-eg: GRANT ROLE admin TO user1;
-```
-
-7. Revoke a role from a user (requires the SECURITY privilege)
-
-```SQL
-REVOKE ROLE FROM
-eg: REVOKE ROLE admin FROM user1;
-```
-
-8. List all users (requires the SECURITY privilege)
-
-```SQL
-LIST USER;
-```
-
-9. List all roles (requires the SECURITY privilege)
-
-```SQL
-LIST ROLE;
-```
-
-10. List all users that hold a given role (requires the SECURITY privilege)
-
-```SQL
-LIST USER OF ROLE
-eg: LIST USER OF ROLE roleuser;
-```
-
-11. List all roles of a given user
-
-Users can list their own roles; listing another user's roles requires the SECURITY privilege.
-
-```SQL
-LIST ROLE OF USER
-eg: LIST ROLE OF USER tempuser;
-```
-
-12. List all privileges of a user
-
-Users can list their own privileges; listing another user's privileges requires the SECURITY privilege.
-
-```SQL
-LIST PRIVILEGES OF USER
-eg: LIST PRIVILEGES OF USER tempuser;
-```
-
-13. List all privileges of a role
-
-Users can list the privileges of roles they hold; listing the privileges of other roles requires the SECURITY privilege.
-
-```SQL
-LIST PRIVILEGES OF ROLE
-eg: LIST PRIVILEGES OF ROLE actor;
-```
-
-### 2.2 Privilege Management
-
-#### 2.2.1 Grant Privileges
-
-1. Grant a user the privilege to manage users
-
-```SQL
-GRANT SECURITY TO USER
-eg: GRANT SECURITY TO USER TEST_USER;
-```
-
-2. Grant a user the privilege to create a database and to create tables within that database, and allow the user to manage privileges within that scope
-
-```SQL
-GRANT CREATE ON DATABASE TO USER WITH GRANT OPTION
-eg: GRANT CREATE ON DATABASE TESTDB TO USER TEST_USER WITH GRANT OPTION;
-```
-
-3. Grant a role the privilege to query a database
-
-```SQL
-GRANT SELECT ON DATABASE TO ROLE
-eg: GRANT SELECT ON DATABASE TESTDB TO ROLE TEST_ROLE;
-```
-
-4. Grant a user the privilege to query a table
-
-```SQL
-GRANT SELECT ON . TO USER
-eg: GRANT SELECT ON TESTDB.TESTTABLE TO USER TEST_USER;
-```
-
-5. Grant a role the privilege to query all databases and tables
-
-```SQL
-GRANT SELECT ON ANY TO ROLE
-eg: GRANT SELECT ON ANY TO ROLE TEST_ROLE;
-```
-
-6. 
ALL syntactic sugar: ALL stands for all privileges within the object scope, so the ALL keyword can be used to grant privileges flexibly.
-
-```SQL
-GRANT ALL TO USER TESTUSER;
--- Grants the user every privilege that can be granted, including the global privileges and all data privileges at ANY scope
-
-GRANT ALL ON ANY TO USER TESTUSER;
--- Grants the user every privilege available at ANY scope; afterwards the user holds all data privileges on all databases
-
-GRANT ALL ON DATABASE TESTDB TO USER TESTUSER;
--- Grants the user every privilege available at DB scope; afterwards the user holds all data privileges on that database
-
-GRANT ALL ON TABLE TESTTABLE TO USER TESTUSER;
--- Grants the user every privilege available at TABLE scope; afterwards the user holds all data privileges on that table
-```
-
-#### 2.2.2 Revoke Privileges
-
-1. Revoke a user's privilege to manage users
-
-```SQL
-REVOKE SECURITY FROM USER
-eg: REVOKE SECURITY FROM USER TEST_USER;
-```
-
-2. Revoke a user's privilege to create a database and to create tables within that database
-
-```SQL
-REVOKE CREATE ON DATABASE FROM USER
-eg: REVOKE CREATE ON DATABASE TEST_DB FROM USER TEST_USER;
-```
-
-3. Revoke a user's privilege to query a table
-
-```SQL
-REVOKE SELECT ON . FROM USER
-eg: REVOKE SELECT ON TESTDB.TESTTABLE FROM USER TEST_USER;
-```
-
-4. Revoke a user's privilege to query all databases and tables
-
-```SQL
-REVOKE SELECT ON ANY FROM USER
-eg: REVOKE SELECT ON ANY FROM USER TEST_USER;
-```
-
-5. ALL syntactic sugar: ALL stands for all privileges within the object scope, so the ALL keyword can be used to revoke privileges flexibly.
-
-```SQL
-REVOKE ALL FROM USER TESTUSER;
--- Revokes all of the user's global privileges and all data privileges at ANY scope
-
-REVOKE ALL ON ANY FROM USER TESTUSER;
--- Revokes all of the user's data privileges at ANY scope; privileges at DB scope and TABLE scope are not affected
-
-REVOKE ALL ON DATABASE TESTDB FROM USER TESTUSER;
--- Revokes all of the user's data privileges on the DB; TABLE privileges are not affected
-
-REVOKE ALL ON TABLE TESTTABLE FROM USER TESTUSER;
--- Revokes all of the user's data privileges on the TABLE
-```
-
-#### 2.2.3 View User Privileges
-
-```SQL
-LIST PRIVILEGES OF USER
-eg: LIST PRIVILEGES OF USER tempuser
-```
diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Data-Addition-Deletion_timecho.md b/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Data-Addition-Deletion_timecho.md
deleted file mode 100644
index df4b0fcc0..000000000
--- a/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Data-Addition-Deletion_timecho.md
+++ /dev/null
@@ -1,171 +0,0 @@
-
-
-# Data Insertion and Deletion
-
-## 1. Data Insertion
-
-**Syntax:**
-
-```SQL
-INSERT INTO [(COLUMN_NAME[, COLUMN_NAME]*)]? 
VALUES (COLUMN_VALUE[, COLUMN_VALUE]*)
-```
-
-For more details on the syntax, see [Write Syntax](../Basic-Concept/Write-Updata-Data_timecho.md#_1-1-语法)
-
-**Example 1: Write to specified columns**
-
-```SQL
-INSERT INTO table1(region, plant_id, device_id, time, temperature, humidity) VALUES ('北京', '1001', '100', '2025-11-26 13:37:00', 90.0, 35.1);
-
-INSERT INTO table1(region, plant_id, device_id, time, temperature) VALUES ('北京', '1001', '100', '2025-11-26 13:38:00', 91.0);
-```
-
-**Example 2: Write null values**
-
-The partial-column writes above are equivalent to the following writes with explicit nulls
-```SQL
-INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('北京', '1001', '100', null, null, '2025-11-26 13:37:00', 90.0, 35.1);
-
-INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('北京', '1001', '100', null, null, '2025-11-26 13:38:00', 91.0, null);
-```
-
-**Example 3: Write multiple rows**
-
-```SQL
-INSERT INTO table1
-VALUES
-('2025-11-26 13:37:00', '北京', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:37:34'),
-('2025-11-26 13:38:00', '北京', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:38:25');
-
-INSERT INTO table1
-(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time)
-VALUES
-('北京', '1001', '100', 'A', '180', '2025-11-26 13:37:00', 90.0, 35.1, true, '2025-11-26 13:37:34'),
-('北京', '1001', '100', 'A', '180', '2025-11-26 13:38:00', 90.0, 35.1, true, '2025-11-26 13:38:25');
-```
-
-**Example 4: Write back query results**
-
-```SQL
-insert into target_table select time,region,device_id,temperature from table1 where region = '北京';
-
-insert into target_table(time,device_id,temperature) table table3;
-
-insert into target_table (select t1.time, t1.region as region, t1.device_id as device_id, t1.temperature as temperature from table1 t1 where t1.time in (select t2.time from table2 t2 where t2.region = '上海'));
-```
-
-## 2. Data Update
-
-**Syntax:**
-
-```SQL
-UPDATE SET updateAssignment (',' updateAssignment)* (WHERE where=booleanExpression)? 
-
-updateAssignment
-    : identifier EQ expression
-    ;
-```
-
-For more details on the syntax, see [Update Syntax](../Basic-Concept/Write-Updata-Data_timecho.md#_2-1-语法)
-
-**Example:**
-
-```SQL
-update table1 set b = a where substring(a, 1, 1) like '%';
-```
-
-## 3. Data Deletion
-
-**Syntax:**
-
-```SQL
-DELETE FROM [WHERE_CLAUSE]?
-
-WHERE_CLAUSE:
-    WHERE DELETE_CONDITION
-
-DELETE_CONDITION:
-    SINGLE_CONDITION
-    | DELETE_CONDITION AND DELETE_CONDITION
-    | DELETE_CONDITION OR DELETE_CONDITION
-
-SINGLE_CONDITION:
-    TIME_CONDITION | ID_CONDITION
-
-TIME_CONDITION:
-    time TIME_OPERATOR LONG_LITERAL
-
-TIME_OPERATOR:
-    < | > | <= | >= | =
-
-ID_CONDITION:
-    identifier = STRING_LITERAL
-```
-
-**Example 1: Delete all data in a table**
-
-Delete the whole table's data
-```SQL
-DELETE FROM table1;
-```
-
-**Example 2: Delete data within a time range**
-
-Delete data up to a given time
-```SQL
-DELETE FROM table1 WHERE time <= 2024-11-29 00:00:00;
-```
-Delete data between two times
-```SQL
-DELETE FROM table1 WHERE time >= 2024-11-27 00:00:00 and time <= 2024-11-29 00:00:00;
-```
-
-**Example 3: Delete the data of a specific device**
-
-Delete the data of a specific device
-```SQL
-DELETE FROM table1 WHERE device_id='101' and model_id = 'B';
-```
-Delete the data of a specific device within a time range
-```SQL
-DELETE FROM table1
-    WHERE time >= 2024-11-27 16:39:00 and time <= 2024-11-29 16:42:00
-    AND device_id='101' and model_id = 'B';
-```
-Delete the data of all devices of a given model
-```SQL
-DELETE FROM table1 WHERE model_id = 'B';
-```
-
-## 4. Device Deletion
-
-**Syntax:**
-
-```SQL
-DELETE DEVICES FROM tableName=qualifiedName (WHERE booleanExpression)?
-```
-
-**Example: Delete a specific device and all data related to it**
-
-```SQL
-DELETE DEVICES FROM table1 WHERE device_id = '101';
-```
diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Data-Sync_timecho.md b/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Data-Sync_timecho.md
deleted file mode 100644
index e272c9053..000000000
--- a/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Data-Sync_timecho.md
+++ /dev/null
@@ -1,320 +0,0 @@
-
-# Data Synchronization
-
-This document lists the SQL statements of the data synchronization feature. For a detailed description and usage instructions, see [Data Synchronization](../User-Manual/Data-Sync_timecho.md)
-
-## 1. 
Create a Task
-
-**Syntax:**
-
-```SQL
-CREATE PIPE [IF NOT EXISTS] -- PipeId is a name that uniquely identifies the task
--- Data extraction plugin, optional
-WITH SOURCE (
-  [ = ,],
-)
--- Data processing plugin, optional
-WITH PROCESSOR (
-  [ = ,],
-)
--- Data connection plugin, required
-WITH SINK (
-  [ = ,],
-)
-```
-
-**Example 1: Full data synchronization**
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668',
-)
-```
-
-**Example 2: Partial data synchronization**
-
-```SQL
-create pipe A2B
-WITH SOURCE (
-  'source'= 'iotdb-source',
-  'mode.streaming' = 'true',
-  'database-name'='db_b.*',
-  'start-time' = '2023.08.23T08:00:00+00:00',
-  'end-time' = '2023.10.23T08:00:00+00:00'
-)
-with SINK (
-  'sink'='iotdb-thrift-async-sink',
-  'node-urls' = '127.0.0.1:6668',
-)
-```
-
-**Example 3: Bidirectional data transfer**
-
-* Execute the following statement on IoTDB A
-
-```SQL
-create pipe AB
-with source (
-  'source.mode.double-living' ='true'
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668',
-)
-```
-
-* Execute the following statement on IoTDB B
-
-```SQL
-create pipe BA
-with source (
-  'source.mode.double-living' ='true'
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6667',
-)
-```
-
-**Example 4: Edge-cloud data transfer**
-
-* Execute the following statement on IoTDB B to synchronize the data in B to A
-
-```SQL
-create pipe BA
-with source (
-  'database-name'='db_b.*',
-  'table-name'='.*',
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6667',
-)
-```
-
-* Execute the following statement on IoTDB C to synchronize the data in C to A
-
-```SQL
-create pipe CA
-with source (
-  'database-name'='db_c.*',
-  'table-name'='.*',
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668',
-)
-```
-
-* Execute the following statement on IoTDB D to synchronize the data in D to A
-
-```SQL
-create pipe DA
-with source (
-  'database-name'='db_d.*',
-  'table-name'='.*',
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6669',
-)
-```
-
-**Example 5: Cascading data transfer**
-
-* Execute the following statement on IoTDB A to synchronize the data in A to B
-
-```SQL
-create pipe AB
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668',
-)
-```
-
-* Execute the following statement on IoTDB B to synchronize the data in B to C
-
-```SQL
-create pipe BC
-with source (
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6669',
-)
-```
-
-**Example 6: Data transfer across a network gateway (air gap)**
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-air-gap-sink',
-  'node-urls' = '10.53.53.53:9780',
-)
-```
-
-**Example 7: Compressed synchronization**
-
-```SQL
-create pipe A2B
-with sink (
-  'node-urls' = '127.0.0.1:6668',
-  'compressor' = 'snappy,lz4',
-  'rate-limit-bytes-per-second'='1048576'
-)
-```
-
-**Example 8: Encrypted synchronization**
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-ssl-sink',
-  'node-urls'='127.0.0.1:6667',
-  'ssl.trust-store-path'='pki/trusted',
-  'ssl.trust-store-pwd'='root'
-)
-```
-
-**Example 9: Export Object-type data locally**
-
-```SQL
-CREATE PIPE tsfile_export_local
-WITH SOURCE (
-  'source' = 'iotdb-source',
-  'table-name' = 'test_table'
-)
-WITH PROCESSOR (
-  'processor' = 'do-nothing-processor'
-)
-WITH SINK (
-  'sink' = 'tsfile-local-sink',
-  'sink.local.target-path' = '/data/backup/export_2024',
-  'sink.rate-limit-bytes-per-second' = '10485760'
-);
-```
-
-**Example 10: Transfer Object-type data remotely**
-
-* This approach requires the `tsfile_remote_sink` plugin to be registered in advance
-
-```SQL
-CREATE PIPE tsfile_export_scp
-WITH SOURCE (
-  'source' = 'iotdb-source',
-  'table-name' = 'test_table'
-)
-WITH PROCESSOR (
-  'processor' = 'do-nothing-processor'
-)
-WITH SINK (
-  'sink' = 'tsfile_remote_sink',
-  'sink.file-mode' = 'scp',
-  'sink.scp.host' = '192.168.1.100',
-  'sink.scp.port' = '22',
-  'sink.scp.user' = 'backup_user',
-  'sink.scp.password' = 'ComplexPass123!',
-  'sink.scp.remote-path' = '/remote/archive/',
-  'sink.rate-limit-bytes-per-second' = '10485760'
-);
-```
-
-## 2. Start a Task
-
-**Syntax:**
-
-```SQL
-START PIPE
-```
-
-**Example:**
-
-```SQL
-START PIPE A2B
-```
-
-## 3. Stop a Task
-
-**Syntax:**
-
-```SQL
-STOP PIPE
-```
-
-**Example:**
-
-```SQL
-STOP PIPE A2B
-```
-
-## 4. Drop a Task
-
-**Syntax:**
-
-```SQL
-DROP PIPE [IF EXISTS]
-```
-
-**Example:**
-
-```SQL
-DROP PIPE IF EXISTS A2B
-```
-
-## 5. View Tasks
-
-**Syntax:**
-
-```SQL
--- View all tasks
-SHOW PIPES
--- View a specific task
-SHOW PIPE
-```
-
-**Example:**
-
-```SQL
-SHOW PIPES
-
-SHOW PIPE A2B
-```
-
-## 6. 
修改任务 - -**语法:** - -```SQL -ALTER PIPE [IF EXISTS] - MODIFY/REPLACE SOURCE(...) - MODIFY/REPLACE PROCESSOR(...) - MODIFY/REPLACE SINK(...) -``` - -**示例:** - -```SQL -ALTER PIPE A2B REPLACE SINK ('sink'='iotdb-thrift-sink', 'node-urls' = '127.0.0.1:6668'); -``` diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Maintenance-Statements_timecho.md b/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Maintenance-Statements_timecho.md deleted file mode 100644 index cd2206af9..000000000 --- a/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Maintenance-Statements_timecho.md +++ /dev/null @@ -1,663 +0,0 @@ - - -# 运维语句 - -## 1. 状态查看 - -### 1.1 查看当前的树/表模型 - -**语法:** - -```SQL -showCurrentSqlDialectStatement - : SHOW CURRENT_SQL_DIALECT - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW CURRENT_SQL_DIALECT -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 1.2 查看登录的用户名 - -**语法:** - -```SQL -showCurrentUserStatement - : SHOW CURRENT_USER - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW CURRENT_USER -+-----------+ -|CurrentUser| -+-----------+ -| root| -+-----------+ -``` - -### 1.3 查看连接的数据库名 - -**语法:** - -```SQL -showCurrentDatabaseStatement - : SHOW CURRENT_DATABASE - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW CURRENT_DATABASE; -+---------------+ -|CurrentDatabase| -+---------------+ -| null| -+---------------+ - -IoTDB> USE test; - -IoTDB> SHOW CURRENT_DATABASE; -+---------------+ -|CurrentDatabase| -+---------------+ -| test| -+---------------+ -``` - -### 1.4 查看集群版本 - -**语法:** - -```SQL -showVersionStatement - : SHOW VERSION - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW VERSION -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.1.2| 1ca4008| -+-------+---------+ -``` - -### 1.5 查看集群关键参数 - -**语法:** - -```SQL -showVariablesStatement - : SHOW VARIABLES - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW VARIABLES -+----------------------------------+-----------------------------------------------------------------+ -| Variable| Value| 
-+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - -### 1.6 查看集群ID - -**语法:** - -```SQL -showClusterIdStatement - : SHOW (CLUSTERID | CLUSTER_ID) - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW CLUSTER_ID -+------------------------------------+ -| ClusterId| -+------------------------------------+ -|40163007-9ec1-4455-aa36-8055d740fcda| -``` - -### 1.7 查看服务器的时间 - -查看客户端直连的 DataNode 进程所在的服务器的时间 - -**语法:** - -```SQL -showCurrentTimestampStatement - : SHOW CURRENT_TIMESTAMP - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW CURRENT_TIMESTAMP -+-----------------------------+ -| CurrentTimestamp| -+-----------------------------+ -|2025-02-17T11:11:52.987+08:00| -+-----------------------------+ -``` - -### 1.8 查看分区信息 - -**语法:** - -```SQL -showRegionsStatement - : SHOW REGIONS - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW REGIONS -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -|RegionId| Type| Status| Database|SeriesSlotNum|TimeSlotNum|DataNodeId|RpcAddress|RpcPort|InternalAddress| Role| CreateTime|TsFileSize| 
-+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -| 6|SchemaRegion|Running|tcollector| 670| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.194| | -| 7| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.196| 169.85 KB| -| 8| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.198| 161.63 KB| -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -``` - -### 1.9 查看可用节点 - -> V2.0.8 起支持该功能 - -**语法:** - -```SQL -showAvailableUrlsStatement - : SHOW AVAILABLE URLS - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW AVAILABLE URLS -+----------+-------+ -|RpcAddress|RpcPort| -+----------+-------+ -| 0.0.0.0| 6667| -+----------+-------+ -``` - -### 1.10 查看服务信息 - -> V2.0.8.2 起支持该功能 - -**语法:** - -```SQL -showServicesStatement - : SHOW SERVICES - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW SERVICES -IoTDB> SHOW SERVICES ON 1 -+------------+-----------+-------+ -|service_name|datanode_id| state| -+------------+-----------+-------+ -| MQTT| 1|STOPPED| -| REST| 1|RUNNING| -+------------+-----------+-------+ -``` - -### 1.11 查看集群激活状态 - -**语法:** - -```SQL -showActivationStatement - : SHOW ACTIVATION - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW ACTIVATION -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - - -## 2. 
状态设置 - -### 2.1 设置连接的树/表模型 - -**语法:** - -```SQL -SET SQL_DIALECT EQ (TABLE | TREE) -``` - -**示例:** - -```SQL -IoTDB> SET SQL_DIALECT=TABLE -IoTDB> SHOW CURRENT_SQL_DIALECT -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 2.2 更新配置项 - -**语法:** - -```SQL -setConfigurationStatement - : SET CONFIGURATION propertyAssignments (ON INTEGER_VALUE)? - ; - -propertyAssignments - : property (',' property)* - ; - -property - : identifier EQ propertyValue - ; - -propertyValue - : DEFAULT - | expression - ; -``` - -**示例:** - -```SQL -IoTDB> SET CONFIGURATION disk_space_warning_threshold='0.05',heartbeat_interval_in_ms='1000' ON 1; -``` - -### 2.3 读取手动修改的配置文件 - -**语法:** - -```SQL -loadConfigurationStatement - : LOAD CONFIGURATION localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**示例:** - -```SQL -IoTDB> LOAD CONFIGURATION ON LOCAL; -``` - -### 2.4 设置系统的状态 - -**语法:** - -```SQL -setSystemStatusStatement - : SET SYSTEM TO (READONLY | RUNNING) localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**示例:** - -```SQL -IoTDB> SET SYSTEM TO READONLY ON CLUSTER; -``` - -## 3. 数据管理 - -### 3.1 将内存表中的数据刷到磁盘 - -**语法:** - -```SQL -flushStatement - : FLUSH identifier? (',' identifier)* booleanValue? localOrClusterMode? - ; - -booleanValue - : TRUE | FALSE - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**示例:** - -```SQL -IoTDB> FLUSH test_db TRUE ON LOCAL; -``` - - -## 4. 数据修复 - -### 4.1 启动后台扫描并修复 tsfile 任务 - -**语法:** - -```SQL -startRepairDataStatement - : START REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**示例:** - -```SQL -IoTDB> START REPAIR DATA ON CLUSTER; -``` - -### 4.2 暂停后台修复 tsfile 任务 - -**语法:** - -```SQL -stopRepairDataStatement - : STOP REPAIR DATA localOrClusterMode? 
- ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**示例:** - -```SQL -IoTDB> STOP REPAIR DATA ON CLUSTER; -``` - -## 5. 查询相关 - -### 5.1 查看正在执行的查询 - -**语法:** - -```SQL -showQueriesStatement - : SHOW (QUERIES | QUERY PROCESSLIST) - (WHERE where=booleanExpression)? - (ORDER BY sortItem (',' sortItem)*)? - limitOffsetClause - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW QUERIES WHERE elapsed_time > 30 -+-----------------------+-----------------------------+-----------+------------+------------+----+ -| query_id| start_time|datanode_id|elapsed_time| statement|user| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -|20250108_101015_00000_1|2025-01-08T18:10:15.935+08:00| 1| 32.283|show queries|root| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -``` - -### 5.2 主动终止查询 - -**语法:** - -```SQL -killQueryStatement - : KILL (QUERY queryId=string | ALL QUERIES) - ; -``` - -**示例:** - -```SQL -IoTDB> KILL QUERY 20250108_101015_00000_1; -- 终止指定query -IoTDB> KILL ALL QUERIES; -- 终止所有query -``` - -### 5.3 查询性能分析 - -#### 5.3.1 查看执行计划 - -**语法:** - -```SQL -EXPLAIN -``` - -更多详细语法说明请参考:[EXPLAIN 语句](../User-Manual/Query-Performance-Analysis.md#_1-explain-语句) - -**示例:** - -```SQL -IoTDB> explain select * from t1 -+-----------------------------------------------------------------------------------------------+ -| distribution plan| -+-----------------------------------------------------------------------------------------------+ -| ┌─────────────────────────────────────────────┐ | -| │OutputNode-4 │ | -| │OutputColumns-[time, device_id, type, speed] │ | -| │OutputSymbols: [time, device_id, type, speed]│ | -| └─────────────────────────────────────────────┘ | -| │ | -| │ | -| ┌─────────────────────────────────────────────┐ | -| │Collect-21 │ | -| │OutputSymbols: [time, device_id, type, speed]│ | -| └─────────────────────────────────────────────┘ | -| 
┌───────────────────────┴───────────────────────┐ | -| │ │ | -|┌─────────────────────────────────────────────┐ ┌───────────┐ | -|│TableScan-19 │ │Exchange-28│ | -|│QualifiedTableName: test.t1 │ └───────────┘ | -|│OutputSymbols: [time, device_id, type, speed]│ │ | -|│DeviceNumber: 1 │ │ | -|│ScanOrder: ASC │ ┌─────────────────────────────────────────────┐| -|│PushDownOffset: 0 │ │TableScan-20 │| -|│PushDownLimit: 0 │ │QualifiedTableName: test.t1 │| -|│PushDownLimitToEachDevice: false │ │OutputSymbols: [time, device_id, type, speed]│| -|│RegionId: 2 │ │DeviceNumber: 1 │| -|└─────────────────────────────────────────────┘ │ScanOrder: ASC │| -| │PushDownOffset: 0 │| -| │PushDownLimit: 0 │| -| │PushDownLimitToEachDevice: false │| -| │RegionId: 1 │| -| └─────────────────────────────────────────────┘| -+-----------------------------------------------------------------------------------------------+ -``` - -#### 5.3.2 查询性能分析 - -**语法:** - -```SQL -EXPLAIN ANALYZE [VERBOSE] -``` - -更多详细语法说明请参考:[EXPLAIN ANALYZE 语句](../User-Manual/Query-Performance-Analysis.md#_2-explain-analyze-语句) - -**示例:** - -```SQL -IoTDB> explain analyze verbose select * from t1 -+-----------------------------------------------------------------------------------------------+ -| Explain Analyze| -+-----------------------------------------------------------------------------------------------+ -|Analyze Cost: 38.860 ms | -|Fetch Partition Cost: 9.888 ms | -|Fetch Schema Cost: 54.046 ms | -|Logical Plan Cost: 10.102 ms | -|Logical Optimization Cost: 17.396 ms | -|Distribution Plan Cost: 2.508 ms | -|Dispatch Cost: 22.126 ms | -|Fragment Instances Count: 2 | -| | -|FRAGMENT-INSTANCE[Id: 20241127_090849_00009_1.2.0][IP: 0.0.0.0][DataRegion: 2][State: FINISHED]| -| Total Wall Time: 18 ms | -| Cost of initDataQuerySource: 6.153 ms | -| Seq File(unclosed): 1, Seq File(closed): 0 | -| UnSeq File(unclosed): 0, UnSeq File(closed): 0 | -| ready queued time: 0.164 ms, blocked queued time: 0.342 ms | -| Query 
Statistics: | -| loadBloomFilterFromCacheCount: 0 | -| loadBloomFilterFromDiskCount: 0 | -| loadBloomFilterActualIOSize: 0 | -| loadBloomFilterTime: 0.000 | -| loadTimeSeriesMetadataAlignedMemSeqCount: 1 | -| loadTimeSeriesMetadataAlignedMemSeqTime: 0.246 | -| loadTimeSeriesMetadataFromCacheCount: 0 | -| loadTimeSeriesMetadataFromDiskCount: 0 | -| loadTimeSeriesMetadataActualIOSize: 0 | -| constructAlignedChunkReadersMemCount: 1 | -| constructAlignedChunkReadersMemTime: 0.294 | -| loadChunkFromCacheCount: 0 | -| loadChunkFromDiskCount: 0 | -| loadChunkActualIOSize: 0 | -| pageReadersDecodeAlignedMemCount: 1 | -| pageReadersDecodeAlignedMemTime: 0.047 | -| [PlanNodeId 43]: IdentitySinkNode(IdentitySinkOperator) | -| CPU Time: 5.523 ms | -| output: 2 rows | -| HasNext() Called Count: 6 | -| Next() Called Count: 5 | -| Estimated Memory Size: : 327680 | -| [PlanNodeId 31]: CollectNode(CollectOperator) | -| CPU Time: 5.512 ms | -| output: 2 rows | -| HasNext() Called Count: 6 | -| Next() Called Count: 5 | -| Estimated Memory Size: : 327680 | -| [PlanNodeId 29]: TableScanNode(TableScanOperator) | -| CPU Time: 5.439 ms | -| output: 1 rows | -| HasNext() Called Count: 3 -| Next() Called Count: 2 | -| Estimated Memory Size: : 327680 | -| DeviceNumber: 1 | -| CurrentDeviceIndex: 0 | -| [PlanNodeId 40]: ExchangeNode(ExchangeOperator) | -| CPU Time: 0.053 ms | -| output: 1 rows | -| HasNext() Called Count: 2 | -| Next() Called Count: 1 | -| Estimated Memory Size: : 131072 | -| | -|FRAGMENT-INSTANCE[Id: 20241127_090849_00009_1.3.0][IP: 0.0.0.0][DataRegion: 1][State: FINISHED]| -| Total Wall Time: 13 ms | -| Cost of initDataQuerySource: 5.725 ms | -| Seq File(unclosed): 1, Seq File(closed): 0 | -| UnSeq File(unclosed): 0, UnSeq File(closed): 0 | -| ready queued time: 0.118 ms, blocked queued time: 5.844 ms | -| Query Statistics: | -| loadBloomFilterFromCacheCount: 0 | -| loadBloomFilterFromDiskCount: 0 | -| loadBloomFilterActualIOSize: 0 | -| loadBloomFilterTime: 0.000 | -| 
loadTimeSeriesMetadataAlignedMemSeqCount: 1 | -| loadTimeSeriesMetadataAlignedMemSeqTime: 0.004 | -| loadTimeSeriesMetadataFromCacheCount: 0 | -| loadTimeSeriesMetadataFromDiskCount: 0 | -| loadTimeSeriesMetadataActualIOSize: 0 | -| constructAlignedChunkReadersMemCount: 1 | -| constructAlignedChunkReadersMemTime: 0.001 | -| loadChunkFromCacheCount: 0 | -| loadChunkFromDiskCount: 0 | -| loadChunkActualIOSize: 0 | -| pageReadersDecodeAlignedMemCount: 1 | -| pageReadersDecodeAlignedMemTime: 0.007 | -| [PlanNodeId 42]: IdentitySinkNode(IdentitySinkOperator) | -| CPU Time: 0.270 ms | -| output: 1 rows | -| HasNext() Called Count: 3 | -| Next() Called Count: 2 | -| Estimated Memory Size: : 327680 | -| [PlanNodeId 30]: TableScanNode(TableScanOperator) | -| CPU Time: 0.250 ms | -| output: 1 rows | -| HasNext() Called Count: 3 | -| Next() Called Count: 2 | -| Estimated Memory Size: : 327680 | -| DeviceNumber: 1 | -| CurrentDeviceIndex: 0 | -+-----------------------------------------------------------------------------------------------+ -``` diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Metadata-Operations_timecho.md b/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Metadata-Operations_timecho.md deleted file mode 100644 index 662677066..000000000 --- a/src/zh/UserGuide/Master/Table/SQL-Manual/SQL-Metadata-Operations_timecho.md +++ /dev/null @@ -1,412 +0,0 @@ - - -# 元数据操作 - -## 1. 数据库管理 - -### 1.1 创建数据库 - -**语法:** - -```SQL -CREATE DATABASE (IF NOT EXISTS)? (WITH properties)? 
-``` - -更多详细语法说明请参考:[创建数据库](../Basic-Concept/Database-Management_timecho.md#_1-1-创建数据库) - -**示例:** - -创建一个名为 database1 的数据库, 数据库的 TTL 时间默认永久。 -```SQL -CREATE DATABASE database1; -CREATE DATABASE IF NOT EXISTS database1; -``` - -创建一个名为 database1 的数据库,并将数据库的 TTL 时间设置为1年。 -```SQL -CREATE DATABASE IF NOT EXISTS database1 with(TTL=31536000000); -``` - -### 1.2 使用数据库 - -**语法:** - -```SQL -USE -``` - -**示例:** - -```SQL -USE database1; -``` - -### 1.3 查看当前数据库 - -**语法:** - -```SQL -SHOW CURRENT_DATABASE; -``` - -**示例:** - -未执行过 `use`语句指定数据库 -```SQL -SHOW CURRENT_DATABASE; -``` -```shell -+---------------+ -|CurrentDatabase| -+---------------+ -| null| -+---------------+ -``` -执行 `use`语句指定数据库 database1 -```sql -USE database1; -SHOW CURRENT_DATABASE; -``` -```shell -+---------------+ -|CurrentDatabase| -+---------------+ -| database1| -+---------------+ -``` - -### 1.4 查看所有数据库 - -**语法:** - -```SQL -SHOW DATABASES (DETAILS)? -``` - -更多返回结果详细说明请参考:[查看所有数据库](../Basic-Concept/Database-Management_timecho.md#_1-4-查看所有数据库) - -**示例:** - -查看所有数据库 -```SQL -SHOW DATABASES; -``` -```shell -+------------------+-------+-----------------------+---------------------+---------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval| -+------------------+-------+-----------------------+---------------------+---------------------+ -| database1| INF| 1| 1| 604800000| -|information_schema| INF| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+ -``` - -查看所有数据库详情 -```sql -SHOW DATABASES DETAILS; -``` -```shell -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|DataRegionGroupNum| 
-+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| database1| INF| 1| 1| 604800000| 1| 2| -|information_schema| INF| null| null| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -``` - -### 1.5 修改数据库 - -**语法:** - -```SQL -ALTER DATABASE (IF EXISTS)? database=identifier SET PROPERTIES propertyAssignments; -``` - -**示例:** - -修改数据库 database1 的 TTL 时间为1年 -```SQL -ALTER DATABASE database1 SET PROPERTIES TTL=31536000000; -``` - -### 1.6 删除数据库 - -**语法:** - -```SQL -DROP DATABASE (IF EXISTS)? ; -``` - -**示例:** - -删除数据库 database1 -```SQL -DROP DATABASE IF EXISTS database1; -``` - -## 2. 表管理 - -### 2.1 创建表 - -**语法:** - -```SQL -createTableStatement - : CREATE TABLE (IF NOT EXISTS)? qualifiedName - '(' (columnDefinition (',' columnDefinition)*)? ')' - charsetDesc? - comment? - (WITH properties)? - ; - -charsetDesc - : DEFAULT? (CHAR SET | CHARSET | CHARACTER SET) EQ? identifierOrString - ; - -columnDefinition - : identifier columnCategory=(TAG | ATTRIBUTE | TIME) charsetName? comment? - | identifier type (columnCategory=(TAG | ATTRIBUTE | TIME | FIELD))? charsetName? comment? 
-    ;
-
-charsetName
-    : CHAR SET identifier
-    | CHARSET identifier
-    | CHARACTER SET identifier
-    ;
-
-comment
-    : COMMENT string
-    ;
-```
-
-For more details on the syntax, see [Create Table](../Basic-Concept/Table-Management_timecho.md#_1-1-创建表)
-
-**Example:**
-
-Create table table1 and set its TTL to 1 year
-```SQL
-CREATE TABLE table1 (
-  time TIMESTAMP TIME,
-  region STRING TAG,
-  plant_id STRING TAG,
-  device_id STRING TAG,
-  model_id STRING ATTRIBUTE,
-  maintenance STRING ATTRIBUTE COMMENT 'maintenance',
-  temperature FLOAT FIELD COMMENT 'temperature',
-  humidity FLOAT FIELD COMMENT 'humidity',
-  status Boolean FIELD COMMENT 'status',
-  arrival_time TIMESTAMP FIELD COMMENT 'arrival_time'
-) COMMENT 'table1' WITH (TTL=31536000000);
-```
-
-Create an empty table tableB
-```SQL
-CREATE TABLE if not exists tableB ();
-```
-
-Create table tableC
-```SQL
-CREATE TABLE tableC (
-  station STRING TAG,
-  temperature int32 FIELD COMMENT 'temperature'
-  ) with (TTL=DEFAULT);
-```
-
-Create table table1 with a user-defined time column named time_user_defined, placed as the second column of the table (supported since V2.0.8)
-```SQL
-CREATE TABLE table1 (
-  region STRING TAG,
-  time_user_defined TIMESTAMP TIME,
-  temperature FLOAT FIELD
-);
-```
-
-Note: If your terminal does not support multi-line paste (for example, Windows CMD), rewrite the SQL statement as a single line before executing it.
-
-### 2.2 View Tables
-
-**Syntax:**
-
-```SQL
-SHOW TABLES (DETAILS)? ((FROM | IN) database_name)?
-```
-
-**Example:**
-
-View all tables in database database1
-```SQL
-show tables from database1;
-```
-```shell
-+---------+---------------+
-|TableName|        TTL(ms)|
-+---------+---------------+
-|   table1|    31536000000|
-+---------+---------------+
-```
-
-View all tables in database database1 together with their properties
-```sql
-show tables details from database1;
-```
-```shell
-+---------------+-----------+------+-------+
-|      TableName|    TTL(ms)|Status|Comment|
-+---------------+-----------+------+-------+
-|         table1|31536000000| USING| table1|
-+---------------+-----------+------+-------+
-```
-
-### 2.3 View Table Columns
-
-**Syntax:**
-
-```SQL
-(DESC | DESCRIBE) (DETAILS)? 
-``` - -**示例:** - -查看表 table1 的列信息 -```SQL -desc table1; -``` -```shell -+------------+---------+---------+ -| ColumnName| DataType| Category| -+------------+---------+---------+ -| time|TIMESTAMP| TIME| -| region| STRING| TAG| -| plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model_id| STRING|ATTRIBUTE| -| maintenance| STRING|ATTRIBUTE| -| temperature| FLOAT| FIELD| -| humidity| FLOAT| FIELD| -| status| BOOLEAN| FIELD| -|arrival_time|TIMESTAMP| FIELD| -+------------+---------+---------+ -``` -查看表 table1 的列详细信息 -```sql -desc table1 details; -``` -```shell -+------------+---------+---------+------+------------+ -| ColumnName| DataType| Category|Status| Comment| -+------------+---------+---------+------+------------+ -| time|TIMESTAMP| TIME| USING| null| -| region| STRING| TAG| USING| null| -| plant_id| STRING| TAG| USING| null| -| device_id| STRING| TAG| USING| null| -| model_id| STRING|ATTRIBUTE| USING| null| -| maintenance| STRING|ATTRIBUTE| USING| maintenance| -| temperature| FLOAT| FIELD| USING| temperature| -| humidity| FLOAT| FIELD| USING| humidity| -| status| BOOLEAN| FIELD| USING| status| -|arrival_time|TIMESTAMP| FIELD| USING|arrival_time| -+------------+---------+---------+------+------------+ -``` - -### 2.4 查看表的创建信息 - -**语法:** - -```SQL -SHOW CREATE TABLE -``` - -**示例:** - -查看表 table1 的创建信息 -```SQL -show create table table1; -``` -```shell -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| Table| Create Table| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -|table1|CREATE TABLE 
"table1" ("region" STRING TAG,"plant_id" STRING TAG,"device_id" STRING TAG,"model_id" STRING ATTRIBUTE,"maintenance" STRING ATTRIBUTE,"temperature" FLOAT FIELD,"humidity" FLOAT FIELD,"status" BOOLEAN FIELD,"arrival_time" TIMESTAMP FIELD) WITH (ttl=31536000000)| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -``` - - -### 2.5 修改表 - -**语法:** - -```SQL -#addColumn; -ALTER TABLE (IF EXISTS)? tableName=qualifiedName ADD COLUMN (IF NOT EXISTS)? column=columnDefinition COMMENT 'column_comment'; -#dropColumn; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName DROP COLUMN (IF EXISTS)? column=identifier; -#setTableProperties; -// set TTL can use this; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName SET PROPERTIES propertyAssignments; -| COMMENT ON TABLE tableName=qualifiedName IS 'table_comment'; -| COMMENT ON COLUMN tableName.column IS 'column_comment'; -#changeColumndatatype; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName ALTER COLUMN (IF EXISTS)? column=identifier SET DATA TYPE new_type=type; -``` - -**示例:** - -表 table1 增加 tag 列 a -```SQL -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS a TAG COMMENT 'a'; -``` -表 table1 增加 field 列 b -```SQL -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS b FLOAT FIELD COMMENT 'b'; -``` -修改表 table1 的 TTL -```SQL -ALTER TABLE table1 set properties TTL=3600; -``` -表 table1 增加注释 -```SQL -COMMENT ON TABLE table1 IS 'table1'; -``` -表 table1 的 a 列去掉注释 -```SQL -COMMENT ON COLUMN table1.a IS null; -``` -修改表 table1 的 b 列的数据类型 -```SQL -ALTER TABLE table1 ALTER COLUMN IF EXISTS b SET DATA TYPE DOUBLE; -``` - -### 2.6 删除表 - -**语法:** - -```SQL -DROP TABLE (IF EXISTS)? 
-``` - -**示例:** - -```SQL -DROP TABLE table1; -DROP TABLE database1.table1; -``` - - diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/Select-Clause_timecho.md b/src/zh/UserGuide/Master/Table/SQL-Manual/Select-Clause_timecho.md deleted file mode 100644 index d5568b600..000000000 --- a/src/zh/UserGuide/Master/Table/SQL-Manual/Select-Clause_timecho.md +++ /dev/null @@ -1,461 +0,0 @@ - - -# SELECT 子句 - -## 1. 语法概览 - -```sql -SELECT setQuantifier? selectItem (',' selectItem)* - -selectItem - : expression (AS? identifier)? #selectSingle - | tableName '.' ASTERISK (AS columnAliases)? #selectAll - | ASTERISK #selectAll - ; -setQuantifier - : DISTINCT - | ALL - ; -``` - -- **SELECT 子句**: 指定了查询结果应包含的列,包含聚合函数(如 SUM、AVG、COUNT 等)以及窗口函数,在逻辑上最后执行。 -- **DISTINCT 关键字**: `SELECT DISTINCT column_name` 确保查询结果中的值是唯一的,去除重复项。 -- **COLUMNS 函数**:SELECT 子句中支持使用 COLUMNS 函数进行列筛选,并支持和表达式结合使用,使表达式的效果对所有筛选出的列生效。 - -## 2. 语法详释: - -每个 `selectItem` 可以是以下形式之一: - -- **表达式**: `expression [ [ AS ] column_alias ]` 定义单个输出列,可以指定列别名。 -- **选择某个关系的所有列**: `relation.*` 选择某个关系的所有列,不允许使用列别名。 -- **选择结果集中的所有列**: `*` 选择查询的所有列,不允许使用列别名。 - -`DISTINCT` 的使用场景: - -- **SELECT 语句**:在 SELECT 语句中使用 DISTINCT,查询结果去除重复项。 -- **聚合函数**:与聚合函数一起使用时,DISTINCT 只处理输入数据集中的非重复行。 -- **GROUP BY 子句**:在 GROUP BY 子句中使用 ALL 和 DISTINCT 量词,决定是否每个重复的分组集产生不同的输出行。 - -`COLUMNS` 函数: -- **`COLUMNS(*)`**: 匹配所有列,支持结合表达式进行使用。 -- **`COLUMNS(regexStr) ? AS identifier`**:正则匹配 - - 匹配所有列名满足正则表达式的列,支持结合表达式进行使用。 - - 支持引用正则表达式捕获到的 groups 对列进行重命名,不写 AS 时展示原始列名(即 _coln_原始列名,其中 n 为列在结果表中的 position)。 - - 重命名用法简述: - - regexStr 中使用圆括号设置要捕获的组; - - 在 identifier 中使用 `'$index'` 引用捕获到的组。 - - 注意:使用该功能时,identifier 中会包含特殊字符 '$',所以整个 identifier 要用双引号引起来。 - -## 3. 
示例数据 - -在[示例数据页面](../Reference/Sample-Data.md)中,包含了用于构建表结构和插入数据的SQL语句,下载并在IoTDB CLI中执行这些语句,即可将数据导入IoTDB,您可以使用这些数据来测试和执行示例中的SQL语句,并获得相应的结果。 - -### 3.1 选择列表 - -#### 3.1.1 星表达式 - -使用星号(*)可以选取表中的所有列,**注意**,星号表达式不能被大多数函数转换,除了`count(*)`的情况。 - -示例:从表中选择所有列 - -```sql -SELECT * FROM table1; -``` - -执行结果如下: - -```shell -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| modifytime| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-29T18:30:00.000+08:00| 上海| 3002| 100| E| 180| 90.0| 35.4| true|2024-11-29T18:30:15.000+08:00| -|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| null| null|2024-11-28T08:00:09.000+08:00| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T10:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| 35.2| null|2024-11-28T10:00:11.000+08:00| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| -|2024-11-26T13:38:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:38:25.000+08:00| -|2024-11-30T09:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 35.2| true| null| -|2024-11-30T14:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 34.8| true|2024-11-30T14:30:17.000+08:00| -|2024-11-29T10:00:00.000+08:00| 上海| 3001| 101| D| 360| 85.0| null| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T16:38:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.1| true|2024-11-26T16:37:01.000+08:00| -|2024-11-27T16:39:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| 35.3| null| null| -|2024-11-27T16:40:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| 
null| null|2024-11-26T16:37:03.000+08:00| -|2024-11-27T16:41:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:04.000+08:00| -|2024-11-27T16:42:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.2| false| null| -|2024-11-27T16:43:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false| null| -|2024-11-27T16:44:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false|2024-11-26T16:37:08.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -Total line number = 18 -It costs 0.653s -``` - -#### 3.1.2 聚合函数 - -聚合函数将多行数据汇总为单个值。当 SELECT 子句中存在聚合函数时,查询将被视为聚合查询。在聚合查询中,所有表达式必须是聚合函数的一部分或由[GROUP BY子句](../SQL-Manual/GroupBy-Clause.md)指定的分组的一部分。 - -示例1:返回地址表中的总行数: - -```sql -SELECT count(*) FROM table1; -``` - -执行结果如下: - -```shell -+-----+ -|_col0| -+-----+ -| 18| -+-----+ -Total line number = 1 -It costs 0.091s -``` - -示例2:返回按城市分组的地址表中的总行数: - -```sql -SELECT region, count(*) - FROM table1 - GROUP BY region; -``` - -执行结果如下: - -```shell -+------+-----+ -|region|_col1| -+------+-----+ -| 上海| 9| -| 北京| 9| -+------+-----+ -Total line number = 2 -It costs 0.071s -``` - -#### 3.1.3 别名 - -关键字`AS`:为选定的列指定别名,别名将覆盖已存在的列名,以提高查询结果的可读性。 - -示例1:原始表格: - -```sql -IoTDB> SELECT * FROM table1; -``` - -执行结果如下: - -```shell -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| modifytime| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-29T18:30:00.000+08:00| 上海| 3002| 100| E| 180| 90.0| 35.4| true|2024-11-29T18:30:15.000+08:00| -|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| null| null|2024-11-28T08:00:09.000+08:00| 
-|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T10:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| 35.2| null|2024-11-28T10:00:11.000+08:00| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| -|2024-11-26T13:38:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:38:25.000+08:00| -|2024-11-30T09:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 35.2| true| null| -|2024-11-30T14:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 34.8| true|2024-11-30T14:30:17.000+08:00| -|2024-11-29T10:00:00.000+08:00| 上海| 3001| 101| D| 360| 85.0| null| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T16:38:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.1| true|2024-11-26T16:37:01.000+08:00| -|2024-11-27T16:39:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| 35.3| null| null| -|2024-11-27T16:40:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:03.000+08:00| -|2024-11-27T16:41:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:04.000+08:00| -|2024-11-27T16:42:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.2| false| null| -|2024-11-27T16:43:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false| null| -|2024-11-27T16:44:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false|2024-11-26T16:37:08.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -Total line number = 18 -It costs 0.653s -``` - -示例2:单列设置别名: - -```sql -IoTDB> SELECT device_id - AS device - FROM table1; -``` - -执行结果如下: - -```shell -+------+ -|device| -+------+ -| 100| -| 100| -| 100| -| 100| -| 100| -| 100| -| 100| -| 100| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -+------+ -Total line number = 18 -It costs 0.053s -``` - -示例3:所有列的别名: - -```sql -IoTDB> SELECT 
table1.* - AS (timestamp, Reg, Pl, DevID, Mod, Mnt, Temp, Hum, Stat,MTime) - FROM table1; -``` - -执行结果如下: - -```shell -+-----------------------------+----+----+-----+---+---+----+----+-----+-----------------------------+ -| TIMESTAMP| REG| PL|DEVID|MOD|MNT|TEMP| HUM| STAT| MTIME| -+-----------------------------+----+----+-----+---+---+----+----+-----+-----------------------------+ -|2024-11-29T11:00:00.000+08:00|上海|3002| 100| E|180|null|45.1| true| null| -|2024-11-29T18:30:00.000+08:00|上海|3002| 100| E|180|90.0|35.4| true|2024-11-29T18:30:15.000+08:00| -|2024-11-28T08:00:00.000+08:00|上海|3001| 100| C| 90|85.0|null| null|2024-11-28T08:00:09.000+08:00| -|2024-11-28T09:00:00.000+08:00|上海|3001| 100| C| 90|null|40.9| true| null| -|2024-11-28T10:00:00.000+08:00|上海|3001| 100| C| 90|85.0|35.2| null|2024-11-28T10:00:11.000+08:00| -|2024-11-28T11:00:00.000+08:00|上海|3001| 100| C| 90|88.0|45.1| true|2024-11-28T11:00:12.000+08:00| -|2024-11-26T13:37:00.000+08:00|北京|1001| 100| A|180|90.0|35.1| true|2024-11-26T13:37:34.000+08:00| -|2024-11-26T13:38:00.000+08:00|北京|1001| 100| A|180|90.0|35.1| true|2024-11-26T13:38:25.000+08:00| -|2024-11-30T09:30:00.000+08:00|上海|3002| 101| F|360|90.0|35.2| true| null| -|2024-11-30T14:30:00.000+08:00|上海|3002| 101| F|360|90.0|34.8| true|2024-11-30T14:30:17.000+08:00| -|2024-11-29T10:00:00.000+08:00|上海|3001| 101| D|360|85.0|null| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T16:38:00.000+08:00|北京|1001| 101| B|180|null|35.1| true|2024-11-26T16:37:01.000+08:00| -|2024-11-27T16:39:00.000+08:00|北京|1001| 101| B|180|85.0|35.3| null| null| -|2024-11-27T16:40:00.000+08:00|北京|1001| 101| B|180|85.0|null| null|2024-11-26T16:37:03.000+08:00| -|2024-11-27T16:41:00.000+08:00|北京|1001| 101| B|180|85.0|null| null|2024-11-26T16:37:04.000+08:00| -|2024-11-27T16:42:00.000+08:00|北京|1001| 101| B|180|null|35.2|false| null| -|2024-11-27T16:43:00.000+08:00|北京|1001| 101| B|180|null|null|false| null| -|2024-11-27T16:44:00.000+08:00|北京|1001| 101| 
B|180|null|null|false|2024-11-26T16:37:08.000+08:00| -+-----------------------------+----+----+-----+---+---+----+----+-----+-----------------------------+ -Total line number = 18 -It costs 0.189s -``` - -#### 3.1.4 Object 类型查询 - -> V2.0.8 版本起支持 - -示例一:直接查询 object 类型数据 - -```SQL -IoTDB:database1> select s1 from table1 where device_id = 'tag1'; -``` - -执行结果如下: -```shell -+------------+ -| s1| -+------------+ -|(Object) 5 B| -+------------+ -Total line number = 1 -It costs 0.428s -``` - -示例二:通过 read\_object 函数查询 Object 类型数据的真实内容 - -```SQL -IoTDB:database1> select read_object(s1) from table1 where device_id = 'tag1'; -``` - -执行结果如下: -```shell -+------------+ -| _col0| -+------------+ -|0x696f746462| -+------------+ -Total line number = 1 -It costs 0.188s -``` - - -### 3.2 Columns 函数 - -1. 不结合表达式 - -查询列名以 'm' 开头的列的数据 -```sql -IoTDB:database1> select columns('^m.*') from table1 limit 5; -``` - -执行结果如下: -```shell -+--------+-----------+ -|model_id|maintenance| -+--------+-----------+ -| E| 180| -| E| 180| -| C| 90| -| C| 90| -| C| 90| -+--------+-----------+ -``` - -查询列名以 'o' 开头的列,未匹配到任何列,抛出异常 -```SQL -IoTDB:database1> select columns('^o.*') from table1 limit 5; -``` -执行结果如下: -```shell -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: No matching columns found that match regex '^o.*' -``` - -查询列名以 'm' 开头的列的数据,并重命名以 'series_' 开头 -```SQL -IoTDB:database1> select columns('^m(.*)') AS "series_$0" from table1 limit 5; -``` -执行结果如下: -```shell -+---------------+------------------+ -|series_model_id|series_maintenance| -+---------------+------------------+ -| E| 180| -| E| 180| -| C| 90| -| C| 90| -| C| 90| -+---------------+------------------+ -``` - -2. 
结合表达式 - -- 单个 COLUMNS 函数 - -查询所有列的最小值 -```sql -IoTDB:database1> select min(columns(*)) from table1; -``` -执行结果如下: -```shell -+-----------------------------+------------+--------------+---------------+--------------+-----------------+-----------------+--------------+------------+-----------------------------+ -| _col0_time|_col1_region|_col2_plant_id|_col3_device_id|_col4_model_id|_col5_maintenance|_col6_temperature|_col7_humidity|_col8_status| _col9_arrival_time| -+-----------------------------+------------+--------------+---------------+--------------+-----------------+-----------------+--------------+------------+-----------------------------+ -|2024-11-26T13:37:00.000+08:00| 上海| 1001| 100| A| 180| 85.0| 34.8| false|2024-11-26T13:37:34.000+08:00| -+-----------------------------+------------+--------------+---------------+--------------+-----------------+-----------------+--------------+------------+-----------------------------+ -``` - -- 多个 COLUMNS 函数,出现在同一表达式 - -> 使用限制:出现多个 COLUMNS 函数时,多个 COLUMNS 函数的参数要完全相同 - -查询 'h' 开头列的最小值和最大值之和 -```sql -IoTDB:database1> select min(columns('^h.*')) + max(columns('^h.*')) from table1; -``` -执行结果如下: -```shell -+--------------+ -|_col0_humidity| -+--------------+ -| 79.899994| -+--------------+ -``` - -错误查询,两个 COLUMNS 函数不完全相同 -```SQL -IoTDB:database1> select min(columns('^h.*')) + max(columns('^t.*')) from table1; -``` -执行结果如下: -```shell -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Multiple different COLUMNS in the same expression are not supported -``` - -- 多个 COLUMNS 函数,出现在不同表达式 - -分别查询 'h' 开头列的最小值和最大值 -```sql -IoTDB:database1> select min(columns('^h.*')) , max(columns('^h.*')) from table1; -``` -执行结果如下: -```shell -+--------------+--------------+ -|_col0_humidity|_col1_humidity| -+--------------+--------------+ -| 34.8| 45.1| -+--------------+--------------+ -``` -分别查询 'h' 开头列的最小值和 'te'开头列的最大值 -```SQL -IoTDB:database1> select min(columns('^h.*')) , max(columns('^te.*')) from table1; -``` -执行结果如下: -```shell 
-+--------------+-----------------+ -|_col0_humidity|_col1_temperature| -+--------------+-----------------+ -| 34.8| 90.0| -+--------------+-----------------+ -``` - -3. 在 WHERE 子句中使用 - -查询数据,所有 'h' 开头列的数据必须要大于 40 -```sql -IoTDB:database1> select * from table1 where columns('^h.*') > 40; -``` -执行结果如下: -```shell -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| arrival_time| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -``` -等价于 -```SQL -IoTDB:database1> select * from table1 where humidity > 40; -``` -执行结果如下: -```shell -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| arrival_time| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -``` - -## 4. 
Result Set Column Order
-
-- **Column order**: Columns in the result set appear in the same order as specified in the SELECT clause.
-- **Multi-column ordering**: If a select expression returns multiple columns, they are ordered as in the source relation.
\ No newline at end of file
diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/Set-Operations_timecho.md b/src/zh/UserGuide/Master/Table/SQL-Manual/Set-Operations_timecho.md
deleted file mode 100644
index 07d8ff1fb..000000000
--- a/src/zh/UserGuide/Master/Table/SQL-Manual/Set-Operations_timecho.md
+++ /dev/null
@@ -1,322 +0,0 @@
-
-
-# Set Operations
-
-IoTDB natively supports the standard SQL set operations through three operators: UNION, INTERSECT, and EXCEPT (set difference). They can be used to merge, compare, and filter the results of queries over multiple time series data sources.
-
-> Note: This feature is available starting from V2.0.9.1.
-
-## 1. UNION
-### 1.1 Overview
-
-UNION combines all rows from two query result sets (result order is not guaranteed). It supports two modes: deduplication (the default) and keeping duplicates.
-
-### 1.2 Syntax
-
-```SQL
-query UNION [ALL | DISTINCT] query
-```
-
-**Notes:**
-
-1. **Deduplication rules:**
-
-   1. Default (`UNION` or `UNION DISTINCT`): duplicate rows are removed automatically.
-   2. `UNION ALL`: keeps all rows, including duplicates, and performs better.
-2. **Input requirements:**
-
-   1. Both query results must have the same number of columns.
-   2. The data types of corresponding columns must be compatible, according to these rules:
-      * Numeric types: `INT32`, `INT64`, `FLOAT`, and `DOUBLE` are fully compatible with each other.
-      * String types: `TEXT` and `STRING` are fully compatible.
-      * Special rule: `INT64` is compatible with `TIMESTAMP`.
-3. **Result set rules:**
-
-   1. Column names and column order are inherited from the first query.
-
-### 1.3 Examples
-
-The examples use the [sample data](../Reference/Sample-Data.md).
-
-1. Get the non-null device and temperature data from table1 and table2 (deduplicated)
-
-```SQL
-select device_id,temperature from table1 where temperature is not null
-union
-select device_id,temperature from table2 where temperature is not null;
-
--- equivalent to
-select device_id,temperature from table1 where temperature is not null
-union distinct
-select device_id,temperature from table2 where temperature is not null;
-```
-
-Result:
-
-```Bash
-+---------+-----------+
-|device_id|temperature|
-+---------+-----------+
-|      101|       90.0|
-|      101|       85.0|
-|      100|       90.0|
-|      100|       85.0|
-|      100|       88.0|
-+---------+-----------+
-Total line number = 5
-It costs 0.074s
-```
-
-2. Get the non-null device and temperature data from table1 and table2 (keeping duplicates)
-
-```SQL
-select device_id,temperature from table1 where temperature is not null
-union all
-select device_id,temperature from table2 where temperature is not null;
-```
-
-Result:
-
-```Bash
-+---------+-----------+
-|device_id|temperature|
-+---------+-----------+
-|      101|       90.0|
-|      101|       90.0|
-|      101|       85.0|
-|      101|       85.0|
-|      101|       85.0|
-|      101|       85.0|
-|      100|       90.0|
-|      100|       85.0|
-|      100|       85.0|
-|      100|       88.0|
-|      100|       90.0|
-|      100|       90.0|
-|      101|       90.0|
-|      101|       85.0|
-|      101|       85.0|
-|      100|       85.0|
-|      100|       90.0|
-+---------+-----------+
-Total line number = 17
-It costs 0.108s
-```
-
-> **Note**:
->
-> * Set operations **do not guarantee result order**; the actual output order may differ from the examples.
-
-## 2. INTERSECT
-### 2.1 Overview
-
-INTERSECT returns the rows that exist in both query result sets (result order is not guaranteed). It supports two modes: deduplication (the default) and keeping duplicates.
-
-### 2.2 Syntax
-
-```SQL
-query1 INTERSECT [ALL | DISTINCT] query2
-```
-
-**Notes**:
-
-1. **Deduplication rules**:
-
-   1. Default (`INTERSECT` or `INTERSECT DISTINCT`): duplicate rows are removed automatically.
-   2. `INTERSECT ALL`: keeps duplicate rows; performance is slightly lower.
-2. **Precedence rules**:
-
-   1. `INTERSECT` has higher precedence than `UNION` and `EXCEPT` (for example, `A UNION B INTERSECT C` is equivalent to `A UNION (B INTERSECT C)`).
-   2. Evaluation is left to right (`A INTERSECT B INTERSECT C` is equivalent to `(A INTERSECT B) INTERSECT C`).
-3. **Input requirements**:
-
-   1. Both query results must have the same number of columns.
-   2. The data types of corresponding columns must be compatible (same rules as for UNION):
-      * Numeric types: `INT32`, `INT64`, `FLOAT`, and `DOUBLE` are fully compatible with each other.
-      * String types: `TEXT` and `STRING` are fully compatible.
-      * Special rule: `INT64` is compatible with `TIMESTAMP`.
-   3. NULL values are treated as equal (`NULL IS NOT DISTINCT FROM NULL`).
-   4. If `SELECT` does not include the `time` column, `time` does not take part in the comparison and the result set has no `time` column.
-4. **Result set rules**:
-
-   1. Column names and column order are inherited from the first query.
-
-### 2.3 Examples
-
-Based on the [sample data](../Reference/Sample-Data.md):
-
-1. Get the device and temperature data common to table1 and table2 (deduplicated)
-
-   ```SQL
-   select device_id, temperature from table1
-   intersect
-   select device_id, temperature from table2;
-
-   -- equivalent to
-   select device_id, temperature from table1
-   intersect distinct
-   select device_id, temperature from table2;
-   ```
-
-   Result:
-
-   ```Bash
-   +---------+-----------+
-   |device_id|temperature|
-   +---------+-----------+
-   |      101|       90.0|
-   |      101|       85.0|
-   |      100|       null|
-   |      100|       90.0|
-   |      100|       85.0|
-   +---------+-----------+
-   Total line number = 5
-   It costs 0.087s
-   ```
-2. Get the device and temperature data common to table1 and table2 (keeping duplicates)
-
-   ```SQL
-   select device_id, temperature from table1
-   intersect all
-   select device_id, temperature from table2;
-   ```
-
-   Result:
-
-   ```Bash
-   +---------+-----------+
-   |device_id|temperature|
-   +---------+-----------+
-   |      100|       85.0|
-   |      100|       90.0|
-   |      100|       null|
-   |      101|       85.0|
-   |      101|       85.0|
-   |      101|       90.0|
-   +---------+-----------+
-   Total line number = 6
-   It costs 0.139s
-   ```
-
-> **Note**:
->
-> * Set operations **do not guarantee result order**; the actual output order may differ from the examples.
-> * When mixing with `UNION`/`EXCEPT`, use parentheses to make precedence explicit (for example, `A INTERSECT (B UNION C)`).
-
-## 3. EXCEPT
-### 3.1 Overview
-
-EXCEPT returns the rows that exist in the first query result set but not in the second (result order is not guaranteed). It supports two modes: deduplication (the default) and keeping duplicates.
-
-### 3.2 Syntax
-
-```SQL
-query1 EXCEPT [ALL | DISTINCT] query2
-```
-
-**Notes**:
-
-1. **Deduplication rules**:
-
-   1. Default (`EXCEPT` or `EXCEPT DISTINCT`): duplicate rows are removed automatically.
-   2. `EXCEPT ALL`: keeps duplicate rows; performance is slightly lower.
-2. **Precedence rules**:
-
-   1. `EXCEPT` has the same precedence as `UNION` and lower precedence than `INTERSECT` (for example, `A INTERSECT B EXCEPT C` is equivalent to `(A INTERSECT B) EXCEPT C`).
-   2. Evaluation is left to right (`A EXCEPT B EXCEPT C` is equivalent to `(A EXCEPT B) EXCEPT C`).
-3. **Input requirements**:
-
-   1. Both query results must have the same number of columns.
-   2. The data types of corresponding columns must be compatible (same rules as for UNION):
-      * Numeric types: `INT32`, `INT64`, `FLOAT`, and `DOUBLE` are fully compatible with each other.
-      * String types: `TEXT` and `STRING` are fully compatible.
-      * Special rule: `INT64` is compatible with `TIMESTAMP`.
-   3. NULL values are treated as equal (`NULL IS NOT DISTINCT FROM NULL`).
-   4. If `SELECT` does not include the `time` column, `time` does not take part in the comparison and the result set has no `time` column.
-4. **Result set rules**:
-
-   1. Column names and column order are inherited from the first query.
-
-### 3.3 Examples
-
-Based on the [sample data](../Reference/Sample-Data.md):
-
-1. Get the device and temperature data that exists in table1 but not in table2 (deduplicated)
-
-   ```SQL
-   select device_id, temperature from table1
-   except
-   select device_id, temperature from table2;
-
-   -- equivalent to
-   select device_id, temperature from table1
-   except distinct
-   select device_id, temperature from table2;
-   ```
-
-   Result:
-
-   ```Bash
-   +---------+-----------+
-   |device_id|temperature|
-   +---------+-----------+
-   |      101|       null|
-   |      100|       88.0|
-   +---------+-----------+
-   Total line number = 2
-   It costs 0.173s
-   ```
-2. Get the device and temperature data that exists in table1 but not in table2 (keeping duplicates)
-
-   ```SQL
-   select device_id, temperature from table1
-   except all
-   select device_id, temperature from table2;
-   ```
-
-   Result:
-
-   ```Bash
-   +---------+-----------+
-   |device_id|temperature|
-   +---------+-----------+
-   |      100|       85.0|
-   |      100|       88.0|
-   |      100|       90.0|
-   |      100|       90.0|
-   |      100|       null|
-   |      101|       85.0|
-   |      101|       85.0|
-   |      101|       90.0|
-   |      101|       null|
-   |      101|       null|
-   |      101|       null|
-   |      101|       null|
-   +---------+-----------+
-   Total line number = 12
-   It costs 0.155s
-   ```
-
-> **Note**:
->
-> * Set operations **do not guarantee result order**; the actual output order may differ from the examples.
-> * When mixing with `UNION`/`INTERSECT`, use parentheses to make precedence explicit (for example, `A EXCEPT (B INTERSECT C)`).
diff --git a/src/zh/UserGuide/Master/Table/SQL-Manual/overview_timecho.md b/src/zh/UserGuide/Master/Table/SQL-Manual/overview_timecho.md
deleted file mode 100644
index 0cb0fd1d9..000000000
--- a/src/zh/UserGuide/Master/Table/SQL-Manual/overview_timecho.md
+++ /dev/null
@@ -1,54 +0,0 @@
-
-
-# Overview
-
-## 1. Syntax Overview
-
-```SQL
-SELECT ⟨select_list⟩
-    FROM ⟨tables⟩ | patternRecognition
-    [WHERE ⟨condition⟩]
-    [GROUP BY ⟨groups⟩]
-    [HAVING ⟨group_filter⟩]
-    [WINDOW windowDefinition (',' windowDefinition)*]
-    [FILL ⟨fill_methods⟩]
-    [ORDER BY ⟨order_expression⟩]
-    [OFFSET ⟨n⟩]
-    [LIMIT ⟨n⟩];
-```
-
-The IoTDB query syntax provides the following clauses:
-
-- SELECT clause: the columns the query result should contain. See [SELECT Clause](../SQL-Manual/Select-Clause_timecho.md)
-- FROM clause: the data source of the query, which can be a single table, multiple tables connected with `JOIN`, or a subquery. See [FROM & JOIN Clause](../SQL-Manual/From-Join-Clause.md)
-- WHERE clause: filters data, keeping only the rows that satisfy the given condition. Logically, it is executed immediately after the FROM clause. See [WHERE Clause](../SQL-Manual/Where-Clause.md)
-- GROUP BY clause: used when data needs to be aggregated; specifies the columns to group by. See [GROUP BY Clause](../SQL-Manual/GroupBy-Clause.md)
-- HAVING clause: used after GROUP BY to filter data that has already been grouped. It is similar to WHERE, but is executed after grouping. See [HAVING Clause](../SQL-Manual/Having-Clause.md)
-- FILL clause: handles null values in the query result. It specifies the fill mode (such as the previous non-null value or linear interpolation) used to fill nulls where data is missing, which helps with visualization and analysis. See [FILL Clause](../SQL-Manual/Fill-Clause.md)
-- ORDER BY clause: sorts the query result in ascending (ASC) or descending (DESC) order and controls how NULL values are placed (NULLS FIRST or NULLS LAST). See [ORDER BY Clause](../SQL-Manual/OrderBy-Clause.md)
-- OFFSET clause: specifies the starting position of the query result, that is, it skips the first OFFSET rows. Used together with LIMIT. See [LIMIT and OFFSET Clauses](../SQL-Manual/Limit-Offset-Clause.md)
-- LIMIT clause: limits the number of rows in the query result, usually together with OFFSET to implement pagination. See [LIMIT and OFFSET Clauses](../SQL-Manual/Limit-Offset-Clause.md)
-
-## 2. Clause Execution Order
-
-
-![](/img/data-query-1.png)
diff --git a/src/zh/UserGuide/Master/Table/Tools-System/CLI_timecho.md b/src/zh/UserGuide/Master/Table/Tools-System/CLI_timecho.md
deleted file mode 100644
index 0e6ed52c4..000000000
--- a/src/zh/UserGuide/Master/Table/Tools-System/CLI_timecho.md
+++ /dev/null
@@ -1,193 +0,0 @@
-
-
-# Command-Line Tool
-
-IoTDB provides a CLI tool for interacting with the server. Before connecting to IoTDB with the CLI, make sure the IoTDB service has started normally. The following describes how to run the CLI and its parameters.
-
-> In this document, $IoTDB_HOME denotes the path of the IoTDB installation directory.
-
-## 1. Starting the CLI
-
-The CLI client script is the `start-cli` script in the $IoTDB_HOME/sbin folder. The startup commands are:
-
-- Common startup commands on Linux/MacOS:
-
-```Shell
-Shell> bash sbin/start-cli.sh -sql_dialect table
-# or
-# Before V2.0.6.x
-Shell> bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table
-# V2.0.6.x and later
-Shell> bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table
-```
-
-- Common startup commands on Windows:
-
-```Shell
-# Before V2.0.4.x
-Shell> sbin\start-cli.bat -sql_dialect table
-# or
-Shell> sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table
-
-# V2.0.4.x and later
-Shell> sbin\windows\start-cli.bat -sql_dialect table
-# or
-# V2.0.4.x and later, before V2.0.6.x
-Shell> sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table
-# V2.0.6.x and later
-Shell> sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table
-```
-
-Where:
-
-- -h and -p are the IP address and RPC port of IoTDB (127.0.0.1 and 6667 by default on a local machine where they have not been changed)
-- -u and -pw are the username and password used to log in to IoTDB (after installation IoTDB has one default user with username `root` and password `TimechoDB@2021`; before V2.0.6 the password was `root`)
-- -sql_dialect is the data model to log in with (table model or tree model); `table` here means entering table-model mode
-
-More parameters:
-
-| **Parameter** | **Type** | **Required** | **Description** | **Example** |
-|:--------------|:---------|:-------------|:----------------|:------------|
-| -h `` | string | No | IP address of the IoTDB server that the client connects to. Default: 127.0.0.1. | -h 127.0.0.1 |
-| -p `` | int | No | Port of the server that the client connects to. IoTDB uses 6667 by default. | -p 6667 |
-| -u `` | string | No | Username used to connect to the server. Default: root. | -u root |
-| -pw `` | string | No | Password used to connect to the server. Default: TimechoDB@2021 (root before V2.0.6). | -pw root |
-| -sql_dialect `` | string | No | Currently tree (tree model) or table (table model). Default: tree | -sql_dialect table |
-| -e `` | string | No | Run operations against IoTDB in batch without entering the interactive client mode. | -e "show databases" |
-| -c | none | No | If the server sets rpc_thrift_compression_enable=true, the CLI must use -c | -c |
-| -disableISO8601 | none | No | If set, IoTDB prints timestamps as numbers. | -disableISO8601 |
-| -usessl `` | Boolean | No | Whether to enable an SSL connection | -usessl true |
-| -ts `` | string | No | Path of the SSL certificate store | -ts /path/to/truststore |
-| -tpw `` | string | No | Password of the SSL certificate store | -tpw myTrustPassword |
-| -timeout `` | int | No | Query timeout in seconds. If not set, the server's configuration is used. | -timeout 30 |
-| -help | none | No | Print IoTDB help information. | -help |
-
-If the prompt shown below appears after startup, the CLI has started successfully.
-
-![](/img/Cli-01.png)
-
-## 2. Using the CLI
-
-### 2.1 Executing Statements
-
-After entering the CLI, you can type SQL statements directly at the prompt. For example:
-
-- Create a database
-
-```SQL
-create database test
-```
-
-![](/img/Cli-02.png)
-
-
-- List databases
-
-```SQL
-show databases
-```
-
-![](/img/Cli-03.png)
-
-
-### 2.2 Command Tips
-
-Tips for using commands in the CLI:
-
-(1) Switch quickly through command history: up and down arrow keys
-
-(2) Auto-complete from command history: right arrow key
-
-(3) Interrupt a running command: CTRL+C
-
-## 3. Exiting the CLI
-
-Enter `quit` or `exit` in the CLI to exit and end the current session.
-
-## 4. Access History
-
-Starting from **V2.0.9.1**, IoTDB supports an access history feature: after a client logs in successfully, key historical access information is displayed. Distributed deployments are supported. Administrators and regular users can each view only their own access history. The main items displayed are:
-
-* Last successful session: date, time, accessing application, IP address, and access method (not shown on first login or when there is no history).
-* Most recent failed attempt: date, time, accessing application, IP address, and access method of the failed attempt closest in time to the current successful login.
-* Cumulative failed attempts: the total number of session attempts that failed since the last successful session was established.
-
-### 4.1 Enabling Access History
-
-Whether access history is enabled is controlled by the relevant parameter in the `iotdb-system.properties` file. A restart is required for the change to take effect. For example:
-
-```Plain
-# Controls whether the audit log feature is enabled
-enable_audit_log=false
-```
-
-* When enabled, login information is recorded and expired data is cleaned up periodically;
-* When disabled, nothing is recorded, displayed, or cleaned up;
-* If the switch is turned off and then on again, the history shown is the last record from before it was turned off, which may not reflect the actual most recent login.
-
-Usage example:
-
-```Bash
----------------------
-Starting IoTDB Cli
----------------------
- _____       _________  ______   ______
-|_   _|     |  _   _  ||_   _ `.|_   _ \
-  | |   .--.|_/ | | \_|  | | `. \ | |_) |
-  | | / .'`\ \  | |      | |  | | |  __'.
- _| |_| \__. | _| |_    _| |_.' /_| |__) |
-|_____|'.__.' |_____|  |______.'|_______/  Enterprise version 2.0.9.1 (Build: xxxxxxx)
-
-
----Last Successful Session------------------
-Time: 2026-03-24T10:25:47.759+08:00
-IP Address: 127.0.0.1
----Last Failed Session----------------------
-Time: 2026-03-24T10:27:26.314+08:00
-IP Address: 127.0.0.1
-Cumulative Failed Attempts: 1
-Successfully login at 127.0.0.1:6667
-IoTDB>
-```
-
-### 4.2 Viewing Access History
-
-The root user and users with the AUDIT privilege can view access history records with an SQL statement.
-
-Syntax:
-
-```SQL
-select * from __audit.login_history;
-```
-
-Example:
-
-```SQL
-IoTDB> select * from __audit.login_history
-+-----------------------------+-------+-------+--------+---------+------+
-|                         time|user_id|node_id|username|       ip|result|
-+-----------------------------+-------+-------+--------+---------+------+
-|2026-03-25T10:55:58.240+08:00|    u_0| node_1|    root|127.0.0.1|  true|
-+-----------------------------+-------+-------+--------+---------+------+
-Total line number = 1
-It costs 0.213s
-```
diff --git a/src/zh/UserGuide/Master/Table/Tools-System/Data-Export-Tool_timecho.md b/src/zh/UserGuide/Master/Table/Tools-System/Data-Export-Tool_timecho.md
deleted file mode 100644
index df86ca8d4..000000000
--- a/src/zh/UserGuide/Master/Table/Tools-System/Data-Export-Tool_timecho.md
+++ /dev/null
@@ -1,260 +0,0 @@
-# Data Export
-
-## 1. Overview
-
-IoTDB supports two ways to export data:
-
-* Data export tool: `export-data.sh/bat`, located in the `tools` directory, exports the result of a specified SQL query to CSV, SQL, or TsFile (an open-source time series file format).
-* TsFileBackup based on the PIPE framework: `tsfile-backup.sh/bat`, located in the `tools` directory, uses PIPE to export the specified data files in TsFile format.
-
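-| File format | IoTDB tool | Description |
-| ----------- | ---------- | ----------- |
-| CSV | export-data.sh/bat | Plain-text format that stores formatted data; it must be constructed according to the CSV format specified below |
-| SQL | export-data.sh/bat | File containing custom SQL statements |
-| TsFile | export-data.sh/bat | Open-source time series data file format |
-| TsFile | tsfile-backup.sh/bat | Open-source time series data file format; supports the Object data type |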
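To make the `export-data.sh` entry in the overview more concrete, here is a minimal sketch for the table model. It only assembles and prints a command line; it does not call the tool. It relies on the rules stated in section 2: `-ft` and `-t` are required, and `-db` is required when `-sql_dialect` is `table`. The function name `build_export_cmd` and the `EXPORT_TOOL` variable are hypothetical helpers introduced for illustration and are not part of IoTDB.

```shell
#!/usr/bin/env bash
# Sketch only: builds an export-data.sh command line for the table model.
# build_export_cmd and EXPORT_TOOL are illustrative names, not part of IoTDB.

EXPORT_TOOL="${EXPORT_TOOL:-tools/export-data.sh}"

# Usage: build_export_cmd <csv|sql|tsfile> <target_dir> <database> [query]
build_export_cmd() {
  local file_type="${1:-}" target_dir="${2:-}" database="${3:-}" query="${4:-}"

  # -ft accepts only the three documented file types.
  case "$file_type" in
    csv|sql|tsfile) ;;
    *) echo "unsupported file type: $file_type" >&2; return 1 ;;
  esac

  # -t is always required; -db is required when -sql_dialect is table.
  if [ -z "$target_dir" ] || [ -z "$database" ]; then
    echo "target directory (-t) and database (-db) are required" >&2
    return 1
  fi

  local cmd="$EXPORT_TOOL -ft $file_type -sql_dialect table -t $target_dir -db $database"
  # -q is optional; when present it is passed as one quoted argument.
  if [ -n "$query" ]; then
    cmd="$cmd -q \"$query\""
  fi
  echo "$cmd"
}

build_export_cmd csv /path/export/dir database1 "select * from table1"
```

The printed line is meant to be reviewed and then run by hand. Because it is built as a plain string, paths containing spaces would need extra quoting before use.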
- - -## 2. 数据导出工具 - -### 2.1 公共参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -|--------------|-------------------------|--------------------------------------------------------------------------------------| -------------- |-----------------------------------------------------------------------------| -| -ft | --file\_type | 导出文件的类型,可以选择:csv、sql、tsfile | √ | | -| -h | -- host | 主机名 | 否 | 127.0.0.1 | -| -p | --port | 端口号 | 否 | 6667 | -| -u | --username | 用户名 | 否 | root | -| -pw | --password | 密码,自 V2.0.9.1 起支持隐藏输入 | 否 | TimechoDB@2021 (V2.0.6 版本之前为 root) | -| -sql_dialect | --sql_dialect | 选择 server 是树模型还是表模型,当前支持 tree 和 table 类型 | 否 | tree | -| -db | --database | ​将要导出的目标数据库,只在`-sql_dialect`为 table 类型下生效。 | `-sql_dialect`为 table 时必填| - | -| -table | --table | 将要导出的目标表,只在`-sql_dialect`为 table 类型下生效。如果指定了`-q`参数则此参数不生效,如果导出类型为 tsfile/sql 则此参数必填。 | ​ 否 | - | -| -start_time | --start_time | 将要导出的数据起始时间,只有`-sql_dialect`为 table 类型时生效。如果填写了`-q`,则此参数不生效。支持的时间类型同`-tf`参数。 |否 | - | -| -end_time | --end_time | 将要导出的数据的终止时间,只有`-sql_dialect`为 table 类型时生效。如果填写了`-q`,则此参数不生效。 | 否 | - | -| -t | --target | 指定输出文件的目标文件夹,如果路径不存在新建文件夹 | √ | | -| -pfn | --prefix\_file\_name | 指定导出文件的名称。例如:abc,生成的文件是abc\_0.tsfile、abc\_1.tsfile | 否 | dump\_0.tsfile | -| -q | --query | 要执行的查询语句。自 V2.0.8 起,SQL 语句中的分号将被自动移除,查询执行保持正常。 | 否 | 无 | -| -timeout | --query\_timeout | 会话查询的超时时间(ms) | 否 | `-1`(V2.0.8 之前)
`Long.MAX_VALUE`(V2.0.8 及之后)
范围:`-1~Long.MAX_VALUE` | -| -help | --help | 显示帮助信息 | 否 | | -| -usessl | --use_ssl | 使用 SSL 协议,自 V2.0.9.1 起支持 | 否 | - | -| -ts | --trust_store | 信任库。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | -| -tpw | --trust_store_password | 信任库密码。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | - -### 2.2 CSV 格式 - -#### 2.2.1 运行命令 - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-sql_dialect] -db -table - [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -# Windows -# V2.0.4.x 版本之前 -> tools\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] - -# V2.0.4.x 版本及之后 -> tools\windows\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -``` - -#### 2.2.2 私有参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -| ---------- | -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- |--------------------------------------| -| -dt | --datatype | 是否在 CSV 文件的表头输出时间序列的数据类型,可以选择`true`或`false` | 否 | false | -| -lpf | --lines\_per\_file | 每个转储文件的行数 | 否 | 10000
范围:0~Integer.Max=2147483647 | -| -tf | --time\_format | 指定 CSV 文件中的时间格式。可以选择:1) 时间戳(数字、长整型);2) ISO8601(默认);3) 用户自定义模式,如`yyyy-MM-dd HH:mm:ss`(默认为ISO8601)。SQL 文件中的时间戳输出不受时间格式设置影响 | 否| ISO8601 | -| -tz | --timezone | 设置时区,例如`+08:00`或`-01:00` | 否 | 本机系统时间 | - -#### 2.2.3 运行示例: - -```Shell -# 正确示例 -> export-data.sh -ft csv -sql_dialect table -t /path/export/dir -db database1 -q "select * from table1" - -# 异常示例 -> export-data.sh -ft csv -sql_dialect table -t /path/export/dir -q "select * from table1" -Parse error: Missing required option: db -``` - -### 2.3 SQL 格式 - -#### 2.3.1 运行命令 - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-aligned ] - -lpf - [-tf ] [-tz ] [-q ] [-timeout ] - -# Windows -# V2.0.4.x 版本之前 -> tools\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] - -# V2.0.4.x 版本及之后 -> tools\windows\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] -``` - -#### 2.3.2 私有参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -| ---------- | -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- | -------------------------------------- | -| -aligned | --use\_aligned | 是否导出为对齐的 SQL 格式 | 否 | true | -| -lpf | --lines\_per\_file | 每个转储文件的行数 | 否 | 10000
范围:0~Integer.Max=2147483647 | -| -tf | --time\_format | 指定 CSV 文件中的时间格式。可以选择:1) 时间戳(数字、长整型);2) ISO8601(默认);3) 用户自定义模式,如`yyyy-MM-dd HH:mm:ss`(默认为ISO8601)。SQL 文件中的时间戳输出不受时间格式设置影响 | 否| ISO8601| -| -tz | --timezone | 设置时区,例如`+08:00`或`-01:00` | 否 | 本机系统时间 | - -#### 2.3.3 运行示例: - -```Shell -# 正确示例 -> export-data.sh -ft sql -sql_dialect table -t /path/export/dir -db database1 -start_time 1 - -# 异常示例 -> export-data.sh -ft sql -sql_dialect table -t /path/export/dir -start_time 1 -Parse error: Missing required option: db -``` - -### 2.4 TsFile 格式 - -#### 2.4.1 运行命令 - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# Windows -# V2.0.4.x 版本之前 -> tools\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# V2.0.4.x 版本及之后 -> tools\windows\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] -``` - -#### 2.4.2 私有参数 - -* 无 - -#### 2.4.3 运行示例: - -```Shell -# 正确示例 -> /tools/export-data.sh -ft tsfile -sql_dialect table -t /path/export/dir -db database1 -start_time 0 - -# 异常示例 -> /tools/export-data.sh -ft tsfile -sql_dialect table -t /path/export/dir -start_time 0 -Parse error: Missing required option: db -``` - -## 3. 基于 PIPE 框架的 TsFileBackup - -IoTDB 自 **V2.0.9.2** 版本起支持 `tsfile-backup.sh/bat` 脚本,该脚本能够自动生成并向服务端发送 `CREATE PIPE` SQL 指令,将指定的数据文件导出为 TsFile 格式。 - -**注意:** - -1. **使用该脚本需联系天谋团队获取相关的 jar 包(`tsfile-remote-sink--jar-with-dependencies.jar`),并放至 IoTDB 可访问的路径(例如所有数据节点主机)。** -2. **该脚本支持 Object 类型数据导出为 TsFile 文件。** - -### 3.1 运行命令 - -```Shell -# Unix/OS X -> tools/tsfile-backup.sh [-sql_dialect ] [-h ] [-p ] - [-u ] [-pw ] [-path ] [-db ] [-table -
] [-s ] [-e ] [-t ] - [-th ] [-tu ] [-tp ] - [--rate_limit] [--plugin_jar] [-help] -# Windows -> tools\windows>tsfile-backup.bat [-sql_dialect ] [-h ] [-p ] - [-u ] [-pw ] [-path ] [-db ] [-table -
] [-s ] [-e ] [-t ] - [-th ] [-tu ] [-tp ] - [--rate_limit] [--plugin_jar] [-help] -``` - -### 3.2 脚本参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -|------------------------|------------------------|----------------------------------------------------------------| -------------- |-------------| -| `-sql_dialect` | `--sql_dialect` | 指定数据模型类型,可选值:`tree`(树模型) 或`table`(表模型)。 | 是 | - | -| `-h` | `--host` | 本地主机地址。指当前数据所在的 IoTDB 实例 IP。 | 否 | `127.0.0.1` | -| `-p` | `--port` | 端口号,IoTDB RPC 服务端口。 | 否 | `6667` | -| `-u` | `--user` | 用户名,用于登录 IoTDB 验证。 | 否 | `root` | -| `-pw` | `--password` | 密码,对应用户的IoTDB密码,支持隐藏输入。 | 否 | `root` | -| `-t` | `--target` | 导出目标目录。在 SCP 模式下,此路径指远程服务器上的绝对物理路径。TsFile 和关联的 Object 目录将导出至此。 | 是 | - | -| `-db` | `--database` | 数据库名称 (表模型可选) | 否 | `.*` | -| `-table` | `--table` | 表名 (表模型可选) | 否 | `.*` | -| `-s` | `--start_time` | 起始时间。支持 ISO8601 格式(如 2026-01-01T00:00:00)或毫秒时间戳。仅导出该时间点及之后的数据。 | 否 | - | -| `-e` | `--end_time` | 截止时间。格式同上。仅导出该时间点之前的数据。 | 否 | - | -| `-th` | `--target_host` | 远程目标主机 IP,默认自动识别启动脚本的IP。指定此参数后,脚本将自动配置 Pipe 使用 SCP 模式进行数据传输。 | 否 | - | -| `-tu` | `--target_host_user` | 远程主机用户名。用于 SSH/SCP 登录目标服务器。 | 否 | - | -| `-tpw` | `--target_host_pw` | 远程主机密码。用于远程身份验证,支持隐藏输入。 | 否 | - | -| `-tp` | `--target_host_port` | 远程 SSH 端口。 | 否 | `22` | -| `--rate_limit` | `--rate_limit` | 发送速率限制。单位:字节/秒 (Bytes/s)。防止导出任务占用过多网络带宽。 | 否 | - | -| `--plugin_jar` | `--plugin_jar` | 指定 Pipe 插件的Jar包路径 | 否 | - | -| `--object-parallelism` | `--object-parallelism` | 指定object文件发送最大并行度 | 否 | - | -| `--object-batch-size` | `--object-batch-size` | 限制每个对象文件上传批次的总字节数,用于控制内存占用和单次 SCP 传输大小 | 否 | - | -| `-help` | `--help` | 查看帮助 | 否 | - | - -### 3.3 运行示例 - -示例一:SCP 远程导出(将数据发送到另一台服务器) - -```Bash -./tsfile-backup.sh -sql_dialect table -db test_db -t /remote/archive/ -th 192.168.1.100 -tu backup_user -tpw ComplexPass123! 
-``` - -示例二:带限速的远程 Object 数据导出 - -```Bash -./tsfile-backup.sh -sql_dialect table -t /mnt/backup/ -th 10.0.0.5 -tu iot_admin -tpw Admin@2026 --rate_limit 5242880 -``` - -示例三:指定 Pipe jar 目录 - -```Bash -./tsfile-backup.sh -sql_dialect table -db test -table .* -tu luoluoyuyu -tpw -t /tmp/backup --plugin_jar /local/lib/tsfile-remote-sink-2.0.8-SNAPSHOT-jar-with-dependencies.jar -``` - -注意:SCP 模式导出 Object 类型数据时,为避免出现握手异常、连接失败或 Pipe 频繁启停问题,建议采取以下任一措施: -* 适当调低配置参数 object-parallelism -* 按需调大目标机的 MaxStartups,修改后执行 sshd reload 或 sshd restart 使配置生效 \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/Tools-System/Data-Import-Tool_timecho.md b/src/zh/UserGuide/Master/Table/Tools-System/Data-Import-Tool_timecho.md deleted file mode 100644 index b35dccc0e..000000000 --- a/src/zh/UserGuide/Master/Table/Tools-System/Data-Import-Tool_timecho.md +++ /dev/null @@ -1,378 +0,0 @@ -# 数据导入 - -## 1. 功能概述 - -IoTDB 支持三种方式进行数据导入: -- 数据导入工具 :`import-data.sh/bat` 位于 `tools` 目录下,可以将 `CSV`、`SQL`、及`TsFile`(开源时序文件格式)的数据导入 `IoTDB`。 -- `TsFile` 自动加载功能。 -- `Load SQL` 导入 `TsFile` 。 - -
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
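-| File format | IoTDB tool | Description |
-| ----------- | ---------- | ----------- |
-| CSV | import-data.sh/bat | Batch-imports a single CSV file or a directory of CSV files into IoTDB |
-| SQL | import-data.sh/bat | Batch-imports a single SQL file or a directory of SQL files into IoTDB |
-| TsFile | import-data.sh/bat | Batch-imports a single TsFile or a directory of TsFile files into IoTDB |
-| TsFile | TsFile auto-loading | Listens for newly generated TsFile files under the specified path and loads them into IoTDB |
-| TsFile | Load SQL | Batch-imports a single TsFile or a directory of TsFile files into IoTDB |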
- -- **表模型 TsFile 导入暂时只支持本地导入。** -- 自 V2.0.9.2 版本起,import-data.sh/bat 脚本导入 tsfile 文件时支持 Object 数据类型。 - -## 2. 数据导入工具 - -### 2.1 公共参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -|--------------|-------------------------|-------------------------------------------------------------------------------------------------------------------------------| -------------- |-------------------------------------| -| -ft | --file\_type | 导入文件的类型,可以选择:csv、sql、tsfile | √ | -| -h | -- host | 主机名 | 否 | 127.0.0.1 | -| -p | --port | 端口号 | 否 | 6667 | -| -u | --username | 用户名 | 否 | root | -| -pw | --password | 密码,自 V2.0.9.1 起支持隐藏输入 | 否 | TimechoDB@2021 (V2.0.6 版本之前为 root) | -| -s | --source | 待加载的脚本文件(夹)的本地目录路径
如果为 csv sql tsfile 这三个支持的格式,直接导入
不支持的格式,报错提示`The file name must end with "csv" or "sql"or "tsfile"!` | √ | -| -sql_dialect | --sql_dialect | 选择 server 是树模型还是表模型,当前支持 tree 和 table 类型 | 否 | tree | -| -db | --database | 数据将要导入的目标库,只在 `-sql_dialect` 为 table 类型下生效。 |-sql_dialect 为 table 时必填;
V2.0.9.2 版本起,当文件格式为 SQL 时,该参数为可选参数,若参数或 SQL 中均未显式指定目标数据库时会进行提示。 | - | -| -table | --table | 数据将要导入的目标表,只在 `-sql_dialect` 为 table 类型且文件类型为 csv 条件下生效且必填。 | 否 | - | -| -tn | --thread\_num | 最大并行线程数 | 否 | 8
范围:0~Integer.Max=2147483647 | -| -tz | --timezone | 时区设置,例如`+08:00`或`-01:00` | 否 | 本机系统时间 | -| -help | --help | 显示帮助信息,支持分开展示和全部展示`-help`或`-help csv` | 否 | -| -usessl | --use_ssl | 使用 SSL 协议,自 V2.0.9.1 起支持 | 否 | - | -| -ts | --trust_store | 信任库。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | -| -tpw | --trust_store_password | 信任库密码。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | - - - -### 2.2 CSV 格式 - -#### 2.2.1 运行命令 - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-sql_dialect] -db -table - [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] - -# Windows -# V2.0.4.x 版本之前 -> tools\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] - -# V2.0.4.x 版本及之后 -> tools\windows\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] -``` - -#### 2.2.2 私有参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -| ---------- | ---------------------------- | ----------------------------------------------------------------------------------- |-------------------------------------------|---------------------------------------| -| -fd | --fail\_dir | 指定保存失败文件的目录 | 否 | YOUR\_CSV\_FILE\_PATH | -| -lpf | --lines\_per\_failed\_file | 指定失败文件最大写入数据的行数 | 否 | 100000
范围:0~Integer.Max=2147483647 | -| -aligned | --use\_aligned | 是否导入为对齐序列 | 否 | false | -| -batch | --batch\_size | 指定每调用一次接口处理的数据行数(最小值为1,最大值为Integer.​*MAX\_VALUE*​) | 否 | 100000
范围:0~Integer.Max=2147483647 | -| -ti | --type\_infer | 通过选项定义类型信息,例如`"boolean=text,int=long, ..."` | 否 | 无 | -| -tp | --timestamp\_precision | 时间戳精度 | 否:
1. ms(毫秒)
2. us(微秒)
3. ns(纳秒) | ms -| - -#### 2.2.3 运行示例 - -```Shell -# 正确示例 -> tools/import-data.sh -ft csv -sql_dialect table -s ./csv/dump0_0.csv -db database1 -table table1 - -# 异常示例 -> tools/import-data.sh -ft csv -sql_dialect table -s ./csv/dump0_1.csv -table table1 -Parse error: Missing required option: db - -> tools/import-data.sh -ft csv -sql_dialect table -s ./csv/dump0_1.csv -db database1 -table table5 -There are no tables or the target table table5 does not exist -``` - -#### 2.2.4 导入说明 - -1. CSV 导入规范 - - - 特殊字符转义规则:若Text类型的字段中包含特殊字符(例如逗号`,`),需使用反斜杠(`\`)​进行转义处理。 - - 支持的时间格式:`yyyy-MM-dd'T'HH:mm:ss`, `yyy-MM-dd HH:mm:ss`, 或者 `yyyy-MM-dd'T'HH:mm:ss.SSSZ` 。 - - 时间戳列​必须作为数据文件的首列存在。 - -2. CSV 文件示例 - -```sql -time,region,device,model,temperature,humidity -1970-01-01T08:00:00.001+08:00,"上海","101","F",90.0,35.2 -1970-01-01T08:00:00.002+08:00,"上海","101","F",90.0,34.8 -``` - - -### 2.3 SQL 格式 - -#### 2.3.1 运行命令 - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] - -# Windows -# V2.0.4.x 版本之前 -> tools\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] - -# V2.0.4.x 版本及之后 -> tools\windows\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] -``` - -#### 2.3.2 私有参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -| ---------- | ---------------------------- | ----------------------------------------------------------------------------------- | -------------- |---------------------------------------| -| -fd | --fail\_dir | 指定保存失败文件的目录 | 否 | YOUR\_CSV\_FILE\_PATH | -| -lpf | --lines\_per\_failed\_file | 指定失败文件最大写入数据的行数 | 否 | 100000
范围:0~Integer.Max=2147483647 | -| -batch | --batch\_size | 指定每调用一次接口处理的数据行数(最小值为1,最大值为Integer.​*MAX\_VALUE*​) | 否 | 100000
范围:0~Integer.Max=2147483647 | - -#### 2.3.3 运行示例 - -```Shell -# 正确示例 -> tools/import-data.sh -ft sql -sql_dialect table -s ./sql/dump0_0.sql -db database1 - -# 异常示例 -> tools/import-data.sh -ft sql -sql_dialect table -s ./sql/dump1_1.sql -db database1 -Source file or directory ./sql/dump1_1.sql does not exist - -# 目标表存在但是元数据不适配/数据异常:生成.failed异常文件记录该条信息,日志打印错误信息如下 -Fail to insert measurements '[column.name]' caused by [data type is not consistent, input '[column.value]', registered '[column.DataType]'] -``` - -### 2.4 TsFile 格式 - -#### 2.4.1 运行命令 - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-o ] -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] - -# Windows -# V2.0.4.x 版本之前 -> tools\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] - -# V2.0.4.x 版本及之后 -> tools\windows\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-o ] -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] -``` - -#### 2.4.2 私有参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -|------|------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------| -------------------- | -| -os | --on\_succcess | 1. none:不删除
2. mv:移动成功的文件到目标文件夹
3. cp:硬连接(拷贝)成功的文件到目标文件夹
4. delete:删除 | √ || -| -sd | --success\_dir | 当`--on_succcess`为 mv 或 cp 时,mv 或 cp 的目标文件夹。文件的文件名变为文件夹打平后拼接原有文件名 | 当`--on_succcess`为mv或cp时需要填写 | `${EXEC_DIR}/success`| -| -of | --on\_fail | 1. none:跳过
2. mv:移动失败的文件到目标文件夹
3. cp:硬连接(拷贝)失败的文件到目标文件夹
4. delete:删除 | √ || -| -fd | --fail\_dir | 当`--on_fail`指定为 mv 或 cp 时,mv 或 cp 的目标文件夹。文件的文件名变为文件夹打平后拼接原有文件名 | 当`--on_fail`指定为 mv 或 cp 时需要填写 | `${EXEC_DIR}/fail` | -| -tp | --timestamp\_precision | 时间戳精度
tsfile 非远程导入:-tp 指定 tsfile 文件的时间精度 手动校验和服务器的时间戳是否一致 不一致返回报错信息
远程导入:-tp 指定 tsfile 文件的时间精度 pipe 自动校验时间戳精度是否一致 不一致返回 pipe 报错信息 | 否:
1. ms(毫秒)
2. us(微秒)
3. ns(纳秒) | ms| -| -o | --object-file-paths | Object 文件存储路径。
默认模式:若不指定此参数,脚本将自动识别并导入位于 `/` 同名子目录下的 Object 文件。
绝对路径模式:显式指定 Object 文件的外部存储根目录,工具将基于此路径建立数据的关联索引。
注意:该参数自 V2.0.9.2 版本起支持 | 否 | | - - -#### 2.4.3 运行示例 - -```Shell -# 正确示例 -> tools/import-data.sh -ft tsfile -sql_dialect table -s ./tsfile -db database1 -os none -of none - -# 异常示例 -> tools/import-data.sh -ft tsfile -sql_dialect table -s ./tsfile -db database1 -Parse error: Missing required options: os, of -``` - -**Object 类型导入** - -1. 导入格式 - -* 默认 - -```Bash -target_dir - ├── tsfile.tsfile - └── tsfile/ (对应TSFile名字) - ├── regionID/tableName/tag1/tag2/field/timestamp1.bin - ├── regionID/tableName/tag1/tag2/field/timestamp2.bin - └── regionID/tableName1/tag3/tag4/field/timestamp1.bin -``` - -* 指定 Object 目录 - -```Bash -target_dir - ├── tsfile.tsfile -object_dir - ├── regionID/tableName/tag1/tag2/field/timestamp1.bin - ├── regionID/tableName/tag1/tag2/field/timestamp2.bin - └── regionID/tableName1/tag3/tag4/field/timestamp1.bin -``` - -2. 命令行示例 - -* 基础导入(自动识别 TsFile 同名目录下的 Object 文件) - -```Bash -./import-data.sh -sql_dialect table -ft tsfile -s /data/import/sensor_v1.tsfile -db database1 -os none -of none -``` - -* 批量导入目录(指定并发线程数与成功后的处理动作) - -```Bash -./import-data.sh -sql_dialect table -ft tsfile -s /data/raw_data/ -tn 16 -os mv -sd /data/archive/ -``` - -* 表模型关联导入(指定外部 Object 存储路径与目标数据库) - -```Bash -./import-data.sh -sql_dialect table -ft tsfile -s /data/import/ -db factory_db -o /mnt/object_storage/ -of mv -fd /data/error_log/ -``` - - -## 3. 
TsFile 自动加载功能 - -本功能允许 IoTDB 主动监听指定目录下的新增 TsFile,并将 TsFile 自动加载至 IoTDB 中。通过此功能,IoTDB 能自动检测并加载 TsFile,无需手动执行任何额外的加载操作。 - -![](/img/Data-import1.png) - -### 3.1 配置参数 - -可通过从配置文件模版 `iotdb-system.properties.template` 中找到下列参数,添加到 IoTDB 配置文件 `iotdb-system.properties` 中开启 TsFile 自动加载功能。完整配置如下: - -| **配置参数** | **参数说明** | **value 取值范围** | **是否必填** | **默认值** | **加载方式** | -| --------------------------------------------------- |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ---------------------------- | -------------------- | ------------------------ | -------------------- | -| load\_active\_listening\_enable | 是否开启 DataNode 主动监听并且加载 tsfile 的功能(默认开启)。 | Boolean: true,false | 选填 | true | 热加载 | -| load\_active\_listening\_dirs | 需要监听的目录(自动包括目录中的子目录),如有多个使用 “,“ 隔开;
默认的目录为 `ext/load/pending`;
支持热装载;
**注意:表模型中,文件所在的目录名会作为 database**; | String: 一个或多个文件目录 | 选填 | `ext/load/pending` | 热加载 | -| load\_active\_listening\_fail\_dir | 执行加载 tsfile 文件失败后将文件转存的目录,只能配置一个 | String: 一个文件目录 | 选填 | `ext/load/failed` | 热加载 | -| load\_active\_listening\_max\_thread\_num | 同时执行加载 tsfile 任务的最大线程数,参数被注释掉时的默值为 max(1, CPU 核心数 / 2),当用户设置的值不在这个区间[1, CPU核心数 /2]内时,会设置为默认值 (1, CPU 核心数 / 2) | Long: [1, Long.MAX\_VALUE] | 选填 | max(1, CPU 核心数 / 2) | 重启后生效 | -| load\_active\_listening\_check\_interval\_seconds | 主动监听轮询间隔,单位秒。主动监听 tsfile 的功能是通过轮询检查文件夹实现的。该配置指定了两次检查 `load_active_listening_dirs` 的时间间隔,每次检查完成 `load_active_listening_check_interval_seconds` 秒后,会执行下一次检查。当用户设置的轮询间隔小于 1 时,会被设置为默认值 5 秒 | Long: [1, Long.MAX\_VALUE] | 选填 | 5 | 重启后生效 | - -### 3.2 示例说明 - -```bash -load_active_listening_dir/ -├─sensors/ -│ ├─temperature/ -│ │ └─temperature-table.TSFILE - -``` - -- 表模型 TsFile - - `temperature-table.TSFILE`: 会被导入到 `temperature` database 下(因为它位于`sensors/temperature/` 目录下) - -### 3.3 注意事项 - -1. 如果待加载的文件中,存在 mods 文件,应优先将 mods 文件移动到监听目录下面,然后再移动 tsfile 文件,且 mods 文件应和对应的 tsfile 文件处于同一目录。防止加载到 tsfile 文件时,加载不到对应的 mods 文件 -2. 禁止设置 Pipe 的 receiver 目录、存放数据的 data 目录等作为监听目录 -3. 禁止 `load_active_listening_fail_dir` 与 `load_active_listening_dirs` 存在相同的目录,或者互相嵌套 -4. 保证 `load_active_listening_dirs` 目录有足够的权限,在加载成功之后,文件将会被删除,如果没有删除权限,则会重复加载 - -## 4. 
Load SQL - -IoTDB 支持通过 CLI 执行 SQL 直接将存有时间序列的一个或多个 TsFile 文件导入到另外一个正在运行的 IoTDB 实例中。 - -### 4.1 运行命令 - -```SQL -load '' with ( - 'attribute-key1'='attribute-value1', - 'attribute-key2'='attribute-value2', -) -``` - -* `` :文件本身,或是包含若干文件的文件夹路径 -* ``:可选参数,具体如下表所示 - -| Key | Key 描述 | Value 类型 | Value 取值范围 | Value 是否必填 | Value 默认值 | -| --------------------------------------- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ------------ | ----------------------------------------- | ---------------- | -------------------------- | -| `database-level` | 当 tsfile 对应的 database 不存在时,可以通过` database-level`参数的值来制定 database 的级别,默认为`iotdb-common.properties`中设置的级别。
例如当设置 level 参数为 1 时表明此 tsfile 中所有时间序列中层级为1的前缀路径是 database。 | Integer | `[1: Integer.MAX_VALUE]` | 否 | 1 | -| `on-success` | 表示对于成功载入的 tsfile 的处置方式:默认为`delete`,即tsfile 成功加载后将被删除;`none `表明 tsfile 成功加载之后依然被保留在源文件夹, | String | `delete / none` | 否 | delete | -| `model` | 指定写入的 tsfile 是表模型还是树模型,该参数在V2.0.2.1后无效(系统会自动识别是树模型还是表模型) | String | `tree / table` | 否 | 与`-sql_dialect`一致 | -| `database-name` | **仅限表模型有效**: 文件导入的目标 database,不存在时会自动创建,`database-name`中不允许包括"`root.`"前缀,如果包含,将会报错。 | String | `-` | 否 | null | -| `convert-on-type-mismatch` | 加载 tsfile 时,如果数据类型不一致,是否进行转换 | Boolean | `true / false` | 否 | true | -| `verify` | 加载 tsfile 前是否校验 schema | Boolean | `true / false` | 否 | true | -| `tablet-conversion-threshold` | 转换为 tablet 形式的 tsfile 大小阈值,针对小文件 tsfile 加载,采用将其转换为 tablet 形式进行写入:默认值为 -1,即任意大小 tsfile 都不进行转换 | Integer | `[-1,0 :`​`Integer.MAX_VALUE]` | 否 | -1 | -| `async` | 是否开启异步加载 tsfile,将文件移到 active load 目录下面,所有的 tsfile 都 load 到`database-name`下. | Boolean | `true / false` | 否 | false | - -### 4.2 运行示例 - -```SQL --- 准备目标数据库 database2 -IoTDB> create database database2 -Msg: The statement is executed successfully. - -IoTDB> use database2 -Msg: The statement is executed successfully. - -IoTDB:database2> show tables details -+---------+-------+------+-------+ -|TableName|TTL(ms)|Status|Comment| -+---------+-------+------+-------+ -+---------+-------+------+-------+ -Empty set. - ---通过执行load sql 导入tsfile -IoTDB:database2> load '/home/dump0.tsfile' with ( 'on-success'='none', 'database-name'='database2') -Msg: The statement is executed successfully. 
- --- 验证数据导入成功 -IoTDB:database2> select * from table2 -+-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|temperature|humidity|status| arrival_time| -+-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ -|2024-11-30T00:00:00.000+08:00| 上海| 3002| 101| 90.0| 35.2| true| null| -|2024-11-29T00:00:00.000+08:00| 上海| 3001| 101| 85.0| 35.1| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T00:00:00.000+08:00| 北京| 1001| 101| 85.0| 35.1| true|2024-11-27T16:37:01.000+08:00| -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| null| 45.1| true| null| -|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| 85.0| 35.2| false|2024-11-28T08:00:09.000+08:00| -|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| -+-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ -``` diff --git a/src/zh/UserGuide/Master/Table/Tools-System/Maintenance-Tool_timecho.md b/src/zh/UserGuide/Master/Table/Tools-System/Maintenance-Tool_timecho.md deleted file mode 100644 index f31b5cfcf..000000000 --- a/src/zh/UserGuide/Master/Table/Tools-System/Maintenance-Tool_timecho.md +++ /dev/null @@ -1,1013 +0,0 @@ - - -# 集群管理工具 - -## 1. 
集群管理工具 - -IoTDB 集群管理工具是一款易用的运维工具(企业版工具)。旨在解决 IoTDB 分布式系统多节点的运维难题,主要包括集群部署、集群启停、弹性扩容、配置更新、数据导出等功能,从而实现对复杂数据库集群的一键式指令下发,极大降低管理难度。本文档将说明如何用集群管理工具远程部署、配置、启动和停止 IoTDB 集群实例。 - -### 1.1 环境准备 - -本工具为 TimechoDB(基于IoTDB的企业版数据库)配套工具,您可以联系您的销售获取工具下载方式。 - -IoTDB 要部署的机器需要依赖jdk 8及以上版本、lsof、netstat、unzip功能如果没有请自行安装,可以参考文档最后的一节环境所需安装命令。 - -提示:IoTDB集群管理工具需要使用有root权限的账号 - -### 1.2 部署方法 - -#### 下载安装 - -本工具为TimechoDB(基于IoTDB的企业版数据库)配套工具,您可以联系您的销售获取工具下载方式。 - -注意:由于二进制包仅支持GLIBC2.17 及以上版本,因此最低适配Centos7版本 - -* 在iotd目录内输入以下指令后: - -```bash -bash install-iotdbctl.sh -``` - -即可在之后的 shell 内激活 iotdbctl 关键词,如检查部署前所需的环境指令如下所示: - -```bash -iotdbctl cluster check example -``` - -* 也可以不激活iotd直接使用 <iotdbctl absolute path>/sbin/iotdbctl 来执行命令,如检查部署前所需的环境: - -```bash -/sbin/iotdbctl cluster check example -``` - -### 1.3 系统结构 - -IoTDB集群管理工具主要由config、logs、doc、sbin目录组成。 - -* `config`存放要部署的集群配置文件如果要使用集群部署工具需要修改里面的yaml文件。 -* `logs` 存放部署工具日志,如果想要查看部署工具执行日志请查看`logs/iotd_yyyy_mm_dd.log`。 -* `sbin` 存放集群部署工具所需的二进制包。 -* `doc` 存放用户手册、开发手册和推荐部署手册。 - - -### 1.4 集群配置文件介绍 - -* 在`iotdbctl/config` 目录下有集群配置的yaml文件,yaml文件名字就是集群名字yaml 文件可以有多个,为了方便用户配置yaml文件在iotd/config目录下面提供了`default_cluster.yaml`示例。 -* yaml 文件配置由`global`、`confignode_servers`、`datanode_servers`、`grafana_server`、`prometheus_server`四大部分组成 -* global 是通用配置主要配置机器用户名密码、IoTDB本地安装文件、Jdk配置等。在`iotdbctl/config`目录中提供了一个`default_cluster.yaml`样例数据, - 用户可以复制修改成自己集群名字并参考里面的说明进行配置IoTDB集群,在`default_cluster.yaml`样例中没有注释的均为必填项,已经注释的为非必填项。 - -例如要执行`default_cluster.yaml`检查命令则需要执行命令`iotdbctl cluster check default_cluster`即可, -更多详细命令请参考下面命令列表。 - - - -| 参数 | 说明 | 是否必填 | -|-------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------| -| iotdb_zip_dir | IoTDB 部署分发目录,如果值为空则从`iotdb_download_url`指定地址下载 | 非必填 | -| iotdb_download_url | IoTDB 下载地址,如果`iotdb_zip_dir` 没有值则从指定地址下载 | 非必填 | -| jdk_tar_dir | jdk 本地目录,可使用该 jdk 路径进行上传部署至目标节点。 | 非必填 | -| 
jdk_deploy_dir | jdk 远程机器部署目录,会将 jdk 部署到该目录下面,与下面的`jdk_dir_name`参数构成完整的jdk部署目录即 `/` | 非必填 | -| jdk_dir_name | jdk 解压后的目录名称默认是jdk_iotdb | 非必填 | -| iotdb_lib_dir | IoTDB lib 目录或者IoTDB 的lib 压缩包仅支持.zip格式 ,仅用于IoTDB升级,默认处于注释状态,如需升级请打开注释修改路径即可。如果使用zip文件请使用zip 命令压缩iotdb/lib目录例如 zip -r lib.zip apache\-iotdb\-1.2.0/lib/* | 非必填 | -| user | ssh登陆部署机器的用户名 | 必填 | -| password | ssh登录的密码, 如果password未指定使用pkey登陆, 请确保已配置节点之间ssh登录免密钥 | 非必填 | -| pkey | 密钥登陆如果password有值优先使用password否则使用pkey登陆 | 非必填 | -| ssh_port | ssh登录端口 | 必填 | -| iotdb_admin_user | iotdb服务用户名默认root | 非必填 | -| iotdb_admin_password | iotdb服务密码默认root | 非必填 | -| deploy_dir | IoTDB 部署目录,会把 IoTDB 部署到该目录下面与下面的`iotdb_dir_name`参数构成完整的IoTDB 部署目录即 `/` | 必填 | -| iotdb_dir_name | IoTDB 解压后的目录名称默认是iotdb | 非必填 | -| datanode-env.sh | 对应`iotdb/config/datanode-env.sh` ,在`global`与`confignode_servers`同时配置值时优先使用`confignode_servers`中的值 | 非必填 | -| confignode-env.sh | 对应`iotdb/config/confignode-env.sh`,在`global`与`datanode_servers`同时配置值时优先使用`datanode_servers`中的值 | 非必填 | -| iotdb-common.properties | 对应`iotdb/config/iotdb-common.properties` | 非必填 | -| cn_seed_config_node | 集群配置地址指向存活的ConfigNode,默认指向confignode_x,在`global`与`confignode_servers`同时配置值时优先使用`confignode_servers`中的值,对应`iotdb/config/iotdb-system.properties`中的`cn_seed_config_node` | 必填 | -| dn_seed_config_node | 集群配置地址指向存活的ConfigNode,默认指向confignode_x,在`global`与`datanode_servers`同时配置值时优先使用`datanode_servers`中的值,对应`iotdb/config/iotdb-system.properties`中的`dn_seed_config_node` | 必填 | - -其中datanode-env.sh 和confignode-env.sh 可以配置额外参数extra_opts,当该参数配置后会在datanode-env.sh 和confignode-env.sh 后面追加对应的值,可参考default_cluster.yaml,配置示例如下: -datanode-env.sh: -extra_opts: | -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC" -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200" - - -* confignode_servers 是部署IoTDB Confignodes配置,里面可以配置多个Confignode - 默认将第一个启动的ConfigNode节点node1当作Seed-ConfigNode - -| 参数 | 说明 | 是否必填 | 
-|-----------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------| -| name | Confignode 名称 | 必填 | -| deploy_dir | IoTDB config node 部署目录 | 必填| | -| cn_internal_address | 对应iotdb/内部通信地址,对应`iotdb/config/iotdb-system.properties`中的`cn_internal_address` | 必填 | -| cn_seed_config_node | 集群配置地址指向存活的ConfigNode,默认指向confignode_x,在`global`与`confignode_servers`同时配置值时优先使用`confignode_servers`中的值,对应`iotdb/config/iotdb-confignode.properties`中的`cn_seed_config_node` | 必填 | -| cn_internal_port | 内部通信端口,对应`iotdb/config/iotdb-system.properties`中的`cn_internal_port` | 必填 | -| cn_consensus_port | 对应`iotdb/config/iotdb-system.properties`中的`cn_consensus_port` | 非必填 | -| cn_data_dir | 对应`iotdb/config/iotdb-system.properties`中的`cn_data_dir` | 必填 | -| iotdb-system.properties | 对应`iotdb/config/iotdb-system.properties`在`global`与`confignode_servers`同时配置值优先使用confignode_servers中的值 | 非必填 | - -* datanode_servers 是部署IoTDB Datanodes配置,里面可以配置多个Datanode - -| 参数 | 说明 | 是否必填 | -| -------------------------- | ------------------------------------------------------------ | -------- | -| name | Datanode 名称 | 必填 | -| deploy_dir | IoTDB data node 部署目录,注:该目录不能与下面的IoTDB config node部署目录相同 | 必填 | -| dn_rpc_address | datanode rpc 地址对应`iotdb/config/iotdb-system.properties`中的`dn_rpc_address` | 必填 | -| dn_internal_address | 内部通信地址,对应`iotdb/config/iotdb-system.properties`中的`dn_internal_address` | 必填 | -| dn_seed_config_node | 集群配置地址指向存活的ConfigNode,默认指向confignode_x,在`global`与`datanode_servers`同时配置值时优先使用`datanode_servers`中的值,对应`iotdb/config/iotdb-datanode.properties`中的`dn_seed_config_node`,推荐使用 SeedConfigNode | 必填 | -| dn_rpc_port | datanode rpc端口地址,对应`iotdb/config/iotdb-system.properties`中的`dn_rpc_port` | 必填 | -| dn_internal_port | 内部通信端口,对应`iotdb/config/iotdb-system.properties`中的`dn_internal_port` | 必填 | -| iotdb-system.properties | 
对应`iotdb/config/iotdb-system.properties`在`global`与`datanode_servers`同时配置值优先使用`datanode_servers`中的值 | 非必填 | - - -| 参数 | 说明 |是否必填| -|---------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|--- | -| name | Datanode 名称 |必填| -| deploy_dir | IoTDB data node 部署目录 |必填| -| dn_rpc_address | datanode rpc 地址对应`iotdb/config/iotdb-system.properties`中的`dn_rpc_address` |必填| -| dn_internal_address | 内部通信地址,对应`iotdb/config/iotdb-system.properties`中的`dn_internal_address` |必填| -| dn_seed_config_node | 集群配置地址指向存活的ConfigNode,默认指向confignode_x,在`global`与`datanode_servers`同时配置值时优先使用`datanode_servers`中的值,对应`iotdb/config/iotdb-system.properties`中的`dn_seed_config_node` |必填| -| dn_rpc_port | datanode rpc端口地址,对应`iotdb/config/iotdb-system.properties`中的`dn_rpc_port` |必填| -| dn_internal_port | 内部通信端口,对应`iotdb/config/iotdb-system.properties`中的`dn_internal_port` |必填| -| iotdb-system.properties | 对应`iotdb/config/iotdb-common.properties`在`global`与`datanode_servers`同时配置值优先使用`datanode_servers`中的值 |非必填| - -* grafana_server 是部署Grafana 相关配置 - -| 参数 | 说明 | 是否必填 | -|--------------------|------------------|-------------------| -| grafana_dir_name | grafana 解压目录名称 | 非必填默认grafana_iotdb | -| host | grafana 部署的服务器ip | 必填 | -| grafana_port | grafana 部署机器的端口 | 非必填,默认3000 | -| deploy_dir | grafana 部署服务器目录 | 必填 | -| grafana_tar_dir | grafana 压缩包位置 | 必填 | -| dashboards | dashboards 所在的位置 | 非必填,多个用逗号隔开 | - -* prometheus_server 是部署Prometheus 相关配置 - -| 参数 | 说明 | 是否必填 | -|--------------------------------|------------------|-----------------------| -| prometheus_dir_name | prometheus 解压目录名称 | 非必填默认prometheus_iotdb | -| host | prometheus 部署的服务器ip | 必填 | -| prometheus_port | prometheus 部署机器的端口 | 非必填,默认9090 | -| deploy_dir | prometheus 部署服务器目录 | 必填 | -| prometheus_tar_dir | prometheus 压缩包位置 | 必填 | -| storage_tsdb_retention_time | 默认保存数据天数 默认15天 | 非必填 | -| storage_tsdb_retention_size | 
指定block可以保存的数据大小默认512M ,注意单位KB, MB, GB, TB, PB, EB | 非必填 | - -如果在config/xxx.yaml的`iotdb-system.properties`和`iotdb-system.properties`中配置了metrics,则会自动把配置放入到promethues无需手动修改 - -注意:如何配置yaml key对应的值包含特殊字符如:等建议整个value使用双引号,对应的文件路径中不要使用包含空格的路径,防止出现识别出现异常问题。 - -### 1.5 使用场景 - -#### 清理数据场景 - -* 清理集群数据场景会删除IoTDB集群中的data目录以及yaml文件中配置的`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`和`ext`目录。 -* 首先执行停止集群命令、然后在执行集群清理命令。 -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster clean default_cluster -``` - -#### 集群销毁场景 - -* 集群销毁场景会删除IoTDB集群中的`data`、`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`、`ext`、`IoTDB`部署目录、 - grafana部署目录和prometheus部署目录。 -* 首先执行停止集群命令、然后在执行集群销毁命令。 - - -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster destroy default_cluster -``` - -#### 集群升级场景 - -* 集群升级首先需要在config/xxx.yaml中配置`iotdb_lib_dir`为要上传到服务器的jar所在目录路径(例如iotdb/lib)。 -* 如果使用zip文件上传请使用zip 命令压缩iotdb/lib目录例如 zip -r lib.zip apache-iotdb-1.2.0/lib/* -* 执行上传命令、然后执行重启IoTDB集群命令即可完成集群升级 - -```bash -iotdbctl cluster dist-lib default_cluster -iotdbctl cluster restart default_cluster -``` - -#### 集群配置文件的热部署场景 - -* 首先修改在config/xxx.yaml中配置。 -* 执行分发命令、然后执行热部署命令即可完成集群配置的热部署 - -```bash -iotdbctl cluster dist-conf default_cluster -iotdbctl cluster reload default_cluster -``` - -#### 集群扩容场景 - -* 首先修改在config/xxx.yaml中添加一个datanode 或者confignode 节点。 -* 执行集群扩容命令 -```bash -iotdbctl cluster scaleout default_cluster -``` - -#### 集群缩容场景 - -* 首先在config/xxx.yaml中找到要缩容的节点名字或者ip+port(其中confignode port 是cn_internal_port、datanode port 是rpc_port) -* 执行集群缩容命令 -```bash -iotdbctl cluster scalein default_cluster -``` - -#### 已有IoTDB集群,使用集群部署工具场景 - -* 配置服务器的`user`、`passwod`或`pkey`、`ssh_port` -* 修改config/xxx.yaml中IoTDB 部署路径,`deploy_dir`(IoTDB 部署目录)、`iotdb_dir_name`(IoTDB解压目录名称,默认是iotdb) - 例如IoTDB 部署完整路径是`/home/data/apache-iotdb-1.1.1`则需要修改yaml文件`deploy_dir:/home/data/`、`iotdb_dir_name:apache-iotdb-1.1.1` -* 
如果服务器不是使用的java_home则修改`jdk_deploy_dir`(jdk 部署目录)、`jdk_dir_name`(jdk解压后的目录名称,默认是jdk_iotdb),如果使用的是java_home 则不需要修改配置 - 例如jdk部署完整路径是`/home/data/jdk_1.8.2`则需要修改yaml文件`jdk_deploy_dir:/home/data/`、`jdk_dir_name:jdk_1.8.2` -* 配置`cn_seed_config_node`、`dn_seed_config_node` -* 配置`confignode_servers`中`iotdb-system.properties`里面的`cn_internal_address`、`cn_internal_port`、`cn_consensus_port`、`cn_system_dir`、 - `cn_consensus_dir`里面的值不是IoTDB默认的则需要配置否则可不必配置 -* 配置`datanode_servers`中`iotdb-system.properties`里面的`dn_rpc_address`、`dn_internal_address`、`dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`等 -* 执行初始化命令 - -```bash -iotdbctl cluster init default_cluster -``` - -#### 一键部署IoTDB、Grafana和Prometheus 场景 - -* 配置`iotdb-system.properties` 打开metrics接口 -* 配置Grafana 配置,如果`dashboards` 有多个就用逗号隔开,名字不能重复否则会被覆盖。 -* 配置Prometheus配置,IoTDB 集群配置了metrics 则无需手动修改Prometheus 配置会根据哪个节点配置了metrics,自动修改Prometheus 配置。 -* 启动集群 - -```bash -iotdbctl cluster start default_cluster -``` - -更加详细参数请参考上方的 集群配置文件介绍 - - -### 1.6 命令格式 - -本工具的基本用法为: -```bash -iotdbctl cluster [params (Optional)] -``` -* key 表示了具体的命令。 - -* cluster name 表示集群名称(即`iotdbctl/config` 文件中yaml文件名字)。 - -* params 表示了命令的所需参数(选填)。 - -* 例如部署default_cluster集群的命令格式为: - -```bash -iotdbctl cluster deploy default_cluster -``` - -* 集群的功能及参数列表如下: - -| 命令 | 功能 | 参数 | -|-----------------|-------------------------------|-------------------------------------------------------------------------------------------------------------------------| -| check | 检测集群是否可以部署 | 集群名称列表 | -| clean | 清理集群 | 集群名称 | -| deploy/dist-all | 部署集群 | 集群名称 ,-N,模块名称(iotdb、grafana、prometheus可选),-op force(可选) | -| list | 打印集群及状态列表 | 无 | -| start | 启动集群 | 集群名称,-N,节点名称(nodename、grafana、prometheus可选) | -| stop | 关闭集群 | 集群名称,-N,节点名称(nodename、grafana、prometheus可选) ,-op force(nodename、grafana、prometheus可选) | -| restart | 重启集群 | 集群名称,-N,节点名称(nodename、grafana、prometheus可选),-op force(强制停止)/rolling(滚动重启) | -| show | 查看集群信息,details字段表示展示集群信息细节 | 集群名称, details(可选) | -| destroy | 销毁集群 | 
集群名称,-N,模块名称(iotdb、grafana、prometheus可选) | -| scaleout | 集群扩容 | 集群名称 | -| scalein | 集群缩容 | 集群名称,-N,集群节点名字或集群节点ip+port | -| reload | 集群热加载 | 集群名称 | -| dist-conf | 集群配置文件分发 | 集群名称 | -| dumplog | 备份指定集群日志 | 集群名称,-N,集群节点名字 -h 备份至目标机器ip -pw 备份至目标机器密码 -p 备份至目标机器端口 -path 备份的目录 -startdate 起始时间 -enddate 结束时间 -loglevel 日志类型 -l 传输速度 | -| dumpdata | 备份指定集群数据 | 集群名称, -h 备份至目标机器ip -pw 备份至目标机器密码 -p 备份至目标机器端口 -path 备份的目录 -startdate 起始时间 -enddate 结束时间 -l 传输速度 | -| dist-lib | lib 包升级 | 集群名字(升级完后请重启) | -| init | 已有集群使用集群部署工具时,初始化集群配置 | 集群名字,初始化集群配置 | -| status | 查看进程状态 | 集群名字 | -| acitvate | 激活集群 | 集群名字 | -| dist-plugin | 上传plugin(udf,trigger,pipe)到集群 | 集群名字,-type 类型 U(udf)/T(trigger)/P(pipe) -file /xxxx/trigger.jar,上传完成后需手动执行创建udf、pipe、trigger命令 | -| upgrade | 滚动升级 | 集群名字 | -| health_check | 健康检查 | 集群名字,-N,节点名称(可选) | -| backup | 停机备份 | 集群名字,-N,节点名称(可选) | -| importschema | 元数据导入 | 集群名字,-N,节点名称(必填) -param 参数 | -| exportschema | 元数据导出 | 集群名字,-N,节点名称(必填) -param 参数 | - - -### 1.7 详细命令执行过程 - -下面的命令都是以default_cluster.yaml 为示例执行的,用户可以修改成自己的集群文件来执行 - -#### 检查集群部署环境命令 - -```bash -iotdbctl cluster check default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 验证目标节点是否能够通过 SSH 登录 - -* 验证对应节点上的 JDK 版本是否满足IoTDB jdk1.8及以上版本、服务器是否按照unzip、是否安装lsof 或者netstat - -* 如果看到下面提示`Info:example check successfully!` 证明服务器已经具备安装的要求, - 如果输出`Error:example check fail!` 证明有部分条件没有满足需求可以查看上面的输出的Error日志(例如:`Error:Server (ip:172.20.31.76) iotdb port(10713) is listening`)进行修复, - 如果检查jdk没有满足要求,我们可以自己在yaml 文件中配置一个jdk1.8 及以上版本的进行部署不影响后面使用, - 如果检查lsof、netstat或者unzip 不满足要求需要在服务器上自行安装。 - -#### 部署集群命令 - -```bash -iotdbctl cluster deploy default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 根据`confignode_servers` 和`datanode_servers`中的节点信息上传IoTDB压缩包和jdk压缩包(如果yaml中配置`jdk_tar_dir`和`jdk_deploy_dir`值) - -* 根据yaml文件节点配置信息生成并上传`iotdb-system.properties` - -```bash -iotdbctl cluster deploy default_cluster -op force -``` 
-注意:该命令会强制执行部署,具体过程会删除已存在的部署目录重新部署 - -*部署单个模块* -```bash -# 部署grafana模块 -iotdbctl cluster deploy default_cluster -N grafana -# 部署prometheus模块 -iotdbctl cluster deploy default_cluster -N prometheus -# 部署iotdb模块 -iotdbctl cluster deploy default_cluster -N iotdb -``` - -#### 启动集群命令 - -```bash -iotdbctl cluster start default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 启动confignode,根据yaml配置文件中`confignode_servers`中的顺序依次启动同时根据进程id检查confignode是否正常,第一个confignode 为seek config - -* 启动datanode,根据yaml配置文件中`datanode_servers`中的顺序依次启动同时根据进程id检查datanode是否正常 - -* 如果根据进程id检查进程存在后,通过cli依次检查集群列表中每个服务是否正常,如果cli链接失败则每隔10s重试一次直到成功最多重试5次 - - -*启动单个节点命令* -```bash -#按照IoTDB 节点名称启动 -iotdbctl cluster start default_cluster -N datanode_1 -#按照IoTDB 集群ip+port启动,其中port对应confignode的cn_internal_port、datanode的rpc_port -iotdbctl cluster start default_cluster -N 192.168.1.5:6667 -#启动grafana -iotdbctl cluster start default_cluster -N grafana -#启动prometheus -iotdbctl cluster start default_cluster -N prometheus -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件 - -* 根据提供的节点名称或者ip:port找到对于节点位置信息,如果启动的节点是`data_node`则ip使用yaml 文件中的`dn_rpc_address`、port 使用的是yaml文件中datanode_servers 中的`dn_rpc_port`。 - 如果启动的节点是`config_node`则ip使用的是yaml文件中confignode_servers 中的`cn_internal_address` 、port 使用的是`cn_internal_port` - -* 启动该节点 - -说明:由于集群部署工具仅是调用了IoTDB集群中的start-confignode.sh和start-datanode.sh 脚本, -在实际输出结果失败时有可能是集群还未正常启动,建议使用status命令进行查看当前集群状态(iotdbctl cluster status xxx) - - -#### 查看IoTDB集群状态命令 - -```bash -iotdbctl cluster show default_cluster -#查看IoTDB集群详细信息 -iotdbctl cluster show default_cluster details -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 依次在datanode通过cli执行`show cluster details` 如果有一个节点执行成功则不会在后续节点继续执行cli直接返回结果 - - -#### 停止集群命令 - - -```bash -iotdbctl cluster stop default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 
根据`datanode_servers`中datanode节点信息,按照配置先后顺序依次停止datanode节点 - -* 根据`confignode_servers`中confignode节点信息,按照配置依次停止confignode节点 - -*强制停止集群命令* - -```bash -iotdbctl cluster stop default_cluster -op force -``` -会直接执行kill -9 pid 命令强制停止集群 - -*停止单个节点命令* - -```bash -#按照IoTDB 节点名称停止 -iotdbctl cluster stop default_cluster -N datanode_1 -#按照IoTDB 集群ip+port停止(ip+port是按照datanode中的ip+dn_rpc_port获取唯一节点或confignode中的ip+cn_internal_port获取唯一节点) -iotdbctl cluster stop default_cluster -N 192.168.1.5:6667 -#停止grafana -iotdbctl cluster stop default_cluster -N grafana -#停止prometheus -iotdbctl cluster stop default_cluster -N prometheus -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件 - -* 根据提供的节点名称或者ip:port找到对应节点位置信息,如果停止的节点是`data_node`则ip使用yaml 文件中的`dn_rpc_address`、port 使用的是yaml文件中datanode_servers 中的`dn_rpc_port`。 - 如果停止的节点是`config_node`则ip使用的是yaml文件中confignode_servers 中的`cn_internal_address` 、port 使用的是`cn_internal_port` - -* 停止该节点 - -说明:由于集群部署工具仅是调用了IoTDB集群中的stop-confignode.sh和stop-datanode.sh 脚本,在某些情况下有可能iotdb集群并未停止。 - - -#### 清理集群数据命令 - -```bash -iotdbctl cluster clean default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`配置信息 - -* 根据`confignode_servers`、`datanode_servers`中的信息,检查是否还有服务正在运行, - 如果有任何一个服务正在运行则不会执行清理命令 - -* 删除IoTDB集群中的data目录以及yaml文件中配置的`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`和`ext`目录。 - - - -#### 重启集群命令 - -```bash -iotdbctl cluster restart default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 - -* 执行上述的停止集群命令(stop),然后执行启动集群命令(start) 具体参考上面的start 和stop 命令 - -*强制重启集群命令* - -```bash -iotdbctl cluster restart default_cluster -op force -``` -会直接执行kill -9 pid 命令强制停止集群,然后启动集群 - -*重启单个节点命令* - -```bash -#按照IoTDB 节点名称重启datanode_1 -iotdbctl cluster restart default_cluster -N datanode_1 -#按照IoTDB 节点名称重启confignode_1 -iotdbctl cluster restart default_cluster -N confignode_1 -#重启grafana -iotdbctl cluster restart default_cluster -N 
grafana -#重启prometheus -iotdbctl cluster restart default_cluster -N prometheus -``` - -#### 集群缩容命令 - -```bash -#按照节点名称缩容 -iotdbctl cluster scalein default_cluster -N nodename -#按照ip+port缩容(ip+port按照datanode中的ip+dn_rpc_port获取唯一节点,confignode中的ip+cn_internal_port获取唯一节点) -iotdbctl cluster scalein default_cluster -N ip:port -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 判断要缩容的confignode节点和datanode是否只剩一个,如果只剩一个则不能执行缩容 - -* 然后根据ip:port或者nodename 获取要缩容的节点信息,执行缩容命令,然后销毁该节点目录,如果缩容的节点是`data_node`则ip使用yaml 文件中的`dn_rpc_address`、port 使用的是yaml文件中datanode_servers 中的`dn_rpc_port`。 - 如果缩容的节点是`config_node`则ip使用的是yaml文件中confignode_servers 中的`cn_internal_address` 、port 使用的是`cn_internal_port` - - -提示:目前一次仅支持一个节点缩容 - -#### 集群扩容命令 - -```bash -iotdbctl cluster scaleout default_cluster -``` -* 修改config/xxx.yaml 文件添加一个datanode 节点或者confignode节点 - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 找到要扩容的节点,执行上传IoTDB压缩包和jdb包(如果yaml中配置`jdk_tar_dir`和`jdk_deploy_dir`值)并解压 - -* 根据yaml文件节点配置信息生成并上传`iotdb-system.properties` - -* 执行启动该节点命令并校验节点是否启动成功 - -提示:目前一次仅支持一个节点扩容 - -#### 销毁集群命令 -```bash -iotdbctl cluster destroy default_cluster -``` - -* cluster-name 找到默认位置的 yaml 文件 - -* 根据`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`中node节点信息,检查是否节点还在运行, - 如果有任何一个节点正在运行则停止销毁命令 - -* 删除IoTDB集群中的`data`以及yaml文件配置的`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`、`ext`、`IoTDB`部署目录、 - grafana部署目录和prometheus部署目录 - -*销毁单个模块* -```bash -# 销毁grafana模块 -iotdbctl cluster destroy default_cluster -N grafana -# 销毁prometheus模块 -iotdbctl cluster destroy default_cluster -N prometheus -# 销毁iotdb模块 -iotdbctl cluster destroy default_cluster -N iotdb -``` - -#### 分发集群配置命令 -```bash -iotdbctl cluster dist-conf default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 - -* 根据yaml文件节点配置信息生成并依次上传`iotdb-system.properties`到指定节点 - 
-#### 热加载集群配置命令 -```bash -iotdbctl cluster reload default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 根据yaml文件节点配置信息依次在cli中执行`load configuration` - -#### 集群节点日志备份 -```bash -iotdbctl cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs' -``` -* 根据 cluster-name 找到默认位置的 yaml 文件 - -* 该命令会根据yaml文件校验datanode_1,confignode_1 是否存在,然后根据配置的起止日期(startdate<=logtime<=enddate)备份指定节点datanode_1,confignode_1 的日志数据到指定服务`192.168.9.48` 端口`36000` 数据备份路径是 `/iotdb/logs` ,IoTDB日志存储路径在`/root/data/db/iotdb/logs`(非必填,如果不填写-logs xxx 默认从IoTDB安装路径/logs下面备份日志) - -| 命令 | 功能 | 是否必填 | -|------------|------------------------------------| ---| -| -h | 存放备份数据机器ip |否| -| -u | 存放备份数据机器用户名 |否| -| -pw | 存放备份数据机器密码 |否| -| -p | 存放备份数据机器端口(默认22) |否| -| -path | 存放备份数据的路径(默认当前路径) |否| -| -loglevel | 日志基本有all、info、error、warn(默认是全部) |否| -| -l | 限速(默认不限速范围0到104857601 单位Kbit/s) |否| -| -N | 配置文件集群名称多个用逗号隔开 |是| -| -startdate | 起始时间(包含默认1970-01-01) |否| -| -enddate | 截止时间(包含) |否| -| -logs | IoTDB 日志存放路径,默认是({iotdb}/logs) |否| - -#### 集群节点数据备份 -```bash -iotdbctl cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas' -``` -* 该命令会根据yaml文件获取leader 节点,然后根据起止日期(startdate<=logtime<=enddate)备份数据到192.168.9.48 服务上的/iotdb/datas 目录下 - -| 命令 | 功能 | 是否必填 | -| ---|---------------------------------| ---| -|-h| 存放备份数据机器ip |否| -|-u| 存放备份数据机器用户名 |否| -|-pw| 存放备份数据机器密码 |否| -|-p| 存放备份数据机器端口(默认22) |否| -|-path| 存放备份数据的路径(默认当前路径) |否| -|-granularity| 类型partition |是| -|-l| 限速(默认不限速范围0到104857601 单位Kbit/s) |否| -|-startdate| 起始时间(包含) |是| -|-enddate| 截止时间(包含) |是| - -#### 集群lib包上传(升级) -```bash -iotdbctl cluster dist-lib default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 上传lib包 - 
-注意执行完升级后请重启IoTDB 才能生效 - -#### 集群初始化 -```bash -iotdbctl cluster init default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 -* 初始化集群配置 - -#### 查看集群进程状态 -```bash -iotdbctl cluster status default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 -* 展示集群的存活状态 - -#### 集群授权激活 - -集群激活默认是通过输入激活码激活,也可以通过-op license_path 通过license路径激活 - -* 默认激活方式 -```bash -iotdbctl cluster activate default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`配置信息 -* 读取里面的机器码 -* 等待输入激活码 - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* 激活单个节点 - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -``` - -* 通过license路径方式激活 - -```bash -iotdbctl cluster activate default_cluster -op license_path -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`配置信息 -* 读取里面的机器码 -* 等待输入激活码 - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* 激活单个节点 - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -op license_path -``` - -* 通过license路径方式激活 - -```bash -iotdbctl cluster activate default_cluster -op license_path -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`配置信息 -* 读取里面的机器码 -* 等待输入激活码 - -### 1.8 集群plugin分发 -```bash -#分发udf -iotdbctl cluster dist-plugin default_cluster -type U -file /xxxx/udf.jar -#分发trigger -iotdbctl cluster dist-plugin default_cluster 
-type T -file /xxxx/trigger.jar -#分发pipe -iotdbctl cluster dist-plugin default_cluster -type P -file /xxxx/pipe.jar -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取 `datanode_servers`配置信息 - -* 上传udf/trigger/pipe jar包 - -上传完成后需要手动执行创建udf/trigger/pipe命令 - -### 1.9 集群滚动升级 -```bash -iotdbctl cluster upgrade default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 上传lib包 -* confignode 执行停止、替换lib包、启动,然后datanode执行停止、替换lib包、启动 - - - -### 1.10 集群健康检查 -```bash -iotdbctl cluster health_check default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 -* 每个节点执行health_check.sh - -* 单个节点健康检查 -```bash -iotdbctl cluster health_check default_cluster -N datanode_1 -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`datanode_servers`配置信息 -* datanode1 执行health_check.sh - - -### 1.11 集群停机备份 -```bash -iotdbctl cluster backup default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 -* 每个节点执行backup.sh - -* 单个节点健康检查 -```bash -iotdbctl cluster backup default_cluster -N datanode_1 -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`datanode_servers`配置信息 -* datanode1 执行backup.sh - -说明:多个节点部署到单台机器,只支持 quick 模式 - -### 1.12 集群元数据导入 - -```bash -iotdbctl cluster importschema default_cluster -N datanode1 -param "-s ./dump0.csv -fd ./failed/ -lpf 10000" -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`datanode_servers`配置信息 -* datanode1 执行元数据导入import-schema.sh - -其中 -param的参数如下: - -| 命令 | 功能 | 是否必填 | -|-----|---------------------------------|------| -| -s |指定想要导入的数据文件,这里可以指定文件或者文件夹。如果指定的是文件夹,将会把文件夹中所有的后缀为csv的文件进行批量导入。 | 是 | -| -fd |指定一个目录来存放导入失败的文件,如果没有指定这个参数,失败的文件将会被保存到源数据的目录中,文件名为是源文件名加上.failed的后缀。 | 否 | -| -lpf |用于指定每个导入失败文件写入数据的行数,默认值为10000 | 否 | - - - -### 1.13 集群元数据导出 - -```bash -iotdbctl cluster exportschema default_cluster -N datanode1 -param "-t ./ -pf ./pattern.txt -lpf 10 -t 10000" -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`datanode_servers`配置信息 -* datanode1 执行元数据导入export-schema.sh - 
-其中 -param的参数如下: - -| 命令 | 功能 | 是否必填 | -|-----|------------------------------------------------------------|------| -| -t | 为导出的CSV文件指定输出路径 | 是 | -| -path |指定导出元数据的path pattern,指定该参数后会忽略-s参数例如:root.stock.** | 否 | -| -pf |如果未指定-path,则需指定该参数,指定查询元数据路径所在文件路径,支持 txt 文件格式,每个待导出的路径为一行。 | 否 | -| -lpf |指定导出的dump文件最大行数,默认值为10000。 | 否 | -| -timeout |指定session查询时的超时时间,单位为ms | 否 | - - - -### 1.14 集群部署工具样例介绍 -在集群部署工具安装目录中config/example 下面有3个yaml样例,如果需要可以复制到config 中进行修改即可 - -| 名称 | 说明 | -|-----------------------------|------------------------------------------------| -| default_1c1d.yaml | 1个confignode和1个datanode 配置样例 | -| default_3c3d.yaml | 3个confignode和3个datanode 配置样例 | -| default_3c3d_grafa_prome | 3个confignode和3个datanode、Grafana、Prometheus配置样例 | - -## 2. 数据文件夹概览工具 - -IoTDB数据文件夹概览工具用于打印出数据文件夹的结构概览信息,工具位置为 tools/tsfile/print-iotdb-data-dir。 - -### 2.1 用法 - -- Windows: - -```bash -.\print-iotdb-data-dir.bat (<输出结果的存储路径>) -``` - -- Linux or MacOs: - -```shell -./print-iotdb-data-dir.sh (<输出结果的存储路径>) -``` - -注意:如果没有设置输出结果的存储路径, 将使用相对路径"IoTDB_data_dir_overview.txt"作为默认值。 - -### 2.2 示例 - -以Windows系统为例: - -`````````````````````````bash -.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data -```````````````````````` -Starting Printing the IoTDB Data Directory Overview -```````````````````````` -output save path:IoTDB_data_dir_overview.txt -data dir num:1 -143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
-|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -````````````````````````` - -## 3. TsFile概览工具 - -TsFile概览工具用于以概要模式打印出一个TsFile的内容,工具位置为 tools/tsfile/print-tsfile。 - -### 3.1 用法 - -- Windows: - -```bash -.\print-tsfile-sketch.bat (<输出结果的存储路径>) -``` - -- Linux or MacOs: - -```shell -./print-tsfile-sketch.sh (<输出结果的存储路径>) -``` - -注意:如果没有设置输出结果的存储路径, 将使用相对路径"TsFile_sketch_view.txt"作为默认值。 - -### 3.2 示例 - -以Windows系统为例: - -`````````````````````````bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
--------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] 
offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - └──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -````````````````````````` - -解释: - -- 以"|"为分隔,左边是在TsFile文件中的实际位置,右边是梗概内容。 -- "|||||||||||||||||||||"是为增强可读性而添加的导引信息,不是TsFile中实际存储的数据。 -- 最后打印的"IndexOfTimerseriesIndex Tree"是对TsFile文件末尾的元数据索引树的重新整理打印,便于直观理解,不是TsFile中存储的实际数据。 - -## 4. 
TsFile Resource概览工具 - -TsFile resource概览工具用于打印出TsFile resource文件的内容,工具位置为 tools/tsfile/print-tsfile-resource-files。 - -### 4.1 用法 - -- Windows: - -```bash -.\print-tsfile-resource-files.bat -``` - -- Linux or MacOs: - -``` -./print-tsfile-resource-files.sh -``` - -### 4.2 示例 - -以Windows系统为例: - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. 
-````````````````````````` - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. -````````````````````````` diff --git a/src/zh/UserGuide/Master/Table/Tools-System/Monitor-Tool_timecho.md b/src/zh/UserGuide/Master/Table/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index c1d2e3080..000000000 --- a/src/zh/UserGuide/Master/Table/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,187 +0,0 @@ - - - -# 监控工具 - -监控工具的部署可参考文档 [监控面板部署](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) 章节。 - -## 1. 
Prometheus Mapping of Monitoring Metrics
-
-> For a monitoring metric whose Metric Name is name and whose Tags are K1=V1, ..., Kn=Vn, the following mappings apply, where value is the concrete value.
-
-| Metric Type      | Mapping |
-| ---------------- | ------------------------------------------------------------ |
-| Counter          | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value |
-| AutoGauge, Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value |
-| Histogram        | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value <br> name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value <br> name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value <br> name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", quantile="0.5"} value <br> name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", quantile="0.99"} value |
-| Rate             | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value <br> name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", rate="m1"} value <br> name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", rate="m5"} value <br> name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", rate="m15"} value <br> name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", rate="mean"} value |
-| Timer            | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value <br> name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value <br> name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value <br> name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", quantile="0.5"} value <br> name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", quantile="0.99"} value |
-
-## 2. Modify the Configuration File
-
-1) Taking DataNode as an example, modify the iotdb-system.properties configuration file as follows:
-
-```properties
-dn_metric_reporter_list=PROMETHEUS
-dn_metric_level=CORE
-dn_metric_prometheus_reporter_port=9091
-```
-
-2) Start the IoTDB DataNode.
-
-3) Open a browser or use ```curl``` to access ```http://server_ip:9091/metrics```; it returns metric data like the following:
-
-```
-...
-# HELP file_count
-# TYPE file_count gauge
-file_count{name="wal",} 0.0
-file_count{name="unseq",} 0.0
-file_count{name="seq",} 2.0
-...
-```
-
-## 3. Prometheus + Grafana
-
-As shown above, IoTDB exposes monitoring metrics in the standard Prometheus format, so Prometheus can collect and store the metrics and Grafana can visualize them.
-
-The relationship among IoTDB, Prometheus, and Grafana is shown in the figure below:
-
-![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png)
-
-1. IoTDB continuously collects monitoring metrics while it is running.
-2. Prometheus pulls the metrics from IoTDB's HTTP interface at a fixed (configurable) interval.
-3. Prometheus stores the pulled metrics in its own TSDB.
-4. Grafana queries the metrics from Prometheus at a fixed (configurable) interval and plots them.
-
-As the interaction flow shows, some additional work is needed to deploy and configure Prometheus and Grafana.
-
-For example, you can configure Prometheus as follows (some parameters can be adjusted as needed) to fetch monitoring data from IoTDB:
-
-```yaml
-job_name: pull-metrics
-honor_labels: true
-honor_timestamps: true
-scrape_interval: 15s
-scrape_timeout: 10s
-metrics_path: /metrics
-scheme: http
-follow_redirects: true
-static_configs:
-  - targets:
-      - localhost:9091
-```
-
-For more details, refer to the following documents:
-
-[Prometheus installation and usage documentation](https://prometheus.io/docs/prometheus/latest/getting_started/)
-
-[Prometheus configuration for pulling metrics from an HTTP interface](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config)
-
-[Grafana installation and usage documentation](https://grafana.com/docs/grafana/latest/getting-started/getting-started/)
-
-[Grafana documentation on querying and plotting data from Prometheus](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus)
-
-## 4. Apache IoTDB Dashboard
-
-We provide the Apache IoTDB Dashboard, which supports unified, centralized operations management and can monitor multiple clusters from a single monitoring panel.
-
-![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png)
-
-![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png)
-
-The Dashboard JSON file is available in the Enterprise Edition.
-
-### 4.1 Cluster Overview
-
-Items that can be monitored include, but are not limited to:
-- Total CPU cores, total memory, and total disk space of the cluster
-- Number of ConfigNodes and DataNodes in the cluster
-- Cluster uptime
-- Cluster write speed
-- Current CPU, memory, and disk usage of each node in the cluster
-- Per-node information
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png)
-
-### 4.2 Data Writing
-
-Items that can be monitored include, but are not limited to:
-- Average write latency, median latency, and 99th-percentile latency
-- Number and size of WAL files
-- Node WAL flush SyncBuffer latency
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png)
-
-### 4.3 Data Query
-
-Items that can be monitored include, but are not limited to:
-- Time spent loading time series metadata in node queries
-- Time spent reading time series in node queries
-- Time spent modifying time series metadata in node queries
-- Time spent loading the Chunk metadata list in node queries
-- Time spent modifying Chunk metadata in node queries
-- Time spent filtering by Chunk metadata in node queries
-- Average time spent constructing the Chunk Reader in node queries
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png)
-
-### 4.4 Storage Engine
-
-Items that can be monitored include, but are not limited to:
-- Number and size of files by type
-- Number and size of TsFiles at each stage
-- Number and duration of each type of task
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png)
-
-### 4.5 System Monitoring
-
-Items that can be monitored include, but are not limited to:
-- System memory, swap memory, and process memory
-- Disk space, number of files, and file size
-- JVM GC time ratio, GC counts by type, GC data volume, and heap memory usage of each generation
-- Network transfer rate and packet send rate
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png)
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png)
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png)
-
-### 4.6 Data Synchronization
-
-Items that can be monitored include, but are not limited to:
-- Pipe event commit queue size and number of unassigned Pipe events.
-- Number of unprocessed events in the Source queue, Source event supply rate, and Processor event processing rate.
-- Number of untransferred events for each type of Pipe sink/source, and Pipe connector event transfer rate.
-- Pipe sink retry queue size and pending handler count, cumulative size before and after Pipe sink compression and compression time, and distribution of Pipe sink batch size and batch interval.
-- Pipe memory usage and capacity, number of Pipe phantom references, number and size of linked TsFiles, and bytes read from disk when Pipe sends TsFiles.
-
-![](/img/monitor-tool-pipe-1.png)
-
-![](/img/monitor-tool-pipe-2.png)
-
-![](/img/monitor-tool-pipe-3.png)
-
-![](/img/monitor-tool-pipe-4.png)
\ No newline at end of file
diff --git a/src/zh/UserGuide/Master/Table/Tools-System/Schema-Export-Tool_timecho.md 
b/src/zh/UserGuide/Master/Table/Tools-System/Schema-Export-Tool_timecho.md deleted file mode 100644 index 8dc2e3d89..000000000 --- a/src/zh/UserGuide/Master/Table/Tools-System/Schema-Export-Tool_timecho.md +++ /dev/null @@ -1,111 +0,0 @@ - - -# 元数据导出 - -## 1. 功能概述 - -元数据导出工具 `export-schema.sh/bat` 位于tools 目录下,能够将 IoTDB 中指定数据库下的元数据导出为脚本文件。 - -## 2. 功能详解 - -### 2.1 参数介绍 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -|----------------|--------------------------|-----------------------------------------------| ----------------------------------- |---------------------------------------| -| `-h` | `-- host` | 主机名 | 否 | 127.0.0.1 | -| `-p` | `--port` | 端口号 | 否 | 6667 | -| `-u` | `--username` | 用户名 | 否 | root | -| `-pw` | `--password` | 密码,自 V2.0.9.1 起支持隐藏输入 | 否 | TimechoDB@2021 (V2.0.6 版本之前为 root) | -| `-sql_dialect` | `--sql_dialect` | 选择 server 是树模型还是表模型,当前支持 tree 和 table 类型 | 否 | tree | -| `-db` | `--database` | 将要导出的目标数据库,只在`-sql_dialect`为 table 类型下生效。 | `-sql_dialect`为 table 时必填 | - | -| `-table` | `--table` | 将要导出的目标表,只在`-sql_dialect`为 table 类型下生效。 | 否 | - | -| `-t` | `--target` | 指定输出文件的目标文件夹,如果路径不存在新建文件夹 | 是 | | -| `-path` | `--path_pattern` | 指定导出元数据的path pattern | `-sql_dialect`为 tree 时必填 | | -| `-pfn` | `--prefix_file_name` | 指定导出文件的名称。 | 否 | dump\_dbname.sql | -| `-lpf` | `--lines_per_file` | 指定导出的dump文件最大行数,只在`-sql_dialect`为 tree 类型下生效。 | 否 | `10000` | -| `-timeout` | `--query_timeout` | 会话查询的超时时间(ms) | 否 | -1范围:-1~Long. max=9223372036854775807 | -| `-help` | `--help` | 显示帮助信息 | 否 | | -| `-usessl` | `--use_ssl` | 使用 SSL 协议,自 V2.0.9.1 起支持 | 否 | - | -| `-ts` | `--trust_store` | 信任库。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | -| `-tpw` | `--trust_store_password` | 信任库密码。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | - -### 2.2 运行命令 - -```Bash -Shell -# Unix/OS X -> tools/export-schema.sh [-sql_dialect] -db -table
 - [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -# Windows -# Before V2.0.4.x -> tools\export-schema.bat [-sql_dialect] -db -table
 - [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] - -# V2.0.4.x and later -> tools\windows\schema\export-schema.bat [-sql_dialect] -db -table
 - [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -``` - -### 2.3 Example - -Export the schema of `database1` to `/home`: - -```Bash -./export-schema.sh -sql_dialect table -t /home/ -db database1 -``` - -The exported file is `dump_database1.sql`, with content in the following format: - -```sql -DROP TABLE IF EXISTS table1; -CREATE TABLE table1( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -DROP TABLE IF EXISTS table2; -CREATE TABLE table2( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -``` diff --git a/src/zh/UserGuide/Master/Table/Tools-System/Schema-Import-Tool_timecho.md b/src/zh/UserGuide/Master/Table/Tools-System/Schema-Import-Tool_timecho.md deleted file mode 100644 index 72bc81ca2..000000000 --- a/src/zh/UserGuide/Master/Table/Tools-System/Schema-Import-Tool_timecho.md +++ /dev/null @@ -1,166 +0,0 @@ - - -# Schema Import - -## 1. Overview - -The schema import tool `import-schema.sh/bat` is located in the tools directory. It imports the schema-creation script files found under a specified path into IoTDB. - -## 2. 
功能详解 - -### 2.1 参数介绍 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -|----------------|------------------------------|------------------------------------------------| -------------- |-------------------------------------| -| `-h` | `-- host` | 主机名 | 否 | 127.0.0.1 | -| `-p` | `--port` | 端口号 | 否 | 6667 | -| `-u` | `--username` | 用户名 | 否 | root | -| `-pw` | `--password` | 密码,自 V2.0.9.1 起支持隐藏输入 | 否 | TimechoDB@2021 (V2.0.6 版本之前为 root) | -| `-sql_dialect` | `--sql_dialect` | 选择 server 是树模型还是表模型,当前支持 tree 和 table 类型 | 否 | tree | -| `-db` | `--database` | 将要导入的目标数据库 | `是` | - | -| `-table` | `--table` | 将要导入的目标表,只在`-sql_dialect`为 table 类型下生效。 | 否 | - | -| `-s` | `--source` | 待加载的脚本文件(夹)的本地目录路径。 | 是 | | -| `-fd` | `--fail_dir` | 指定保存失败文件的目录 | 否 | | -| `-lpf` | `--lines_per_failed_file` | 指定失败文件最大写入数据的行数,只在`-sql_dialect`为 table 类型下生效。 | 否 | 100000范围:0~Integer.Max=2147483647 | -| `-help` | `--help` | 显示帮助信息 | 否 | | -| `-usessl` | `--use_ssl` | 使用 SSL 协议,自 V2.0.9.1 起支持 | 否 | - | -| `-ts` | `--trust_store` | 信任库。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | -| `-tpw` | `--trust_store_password` | 信任库密码。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | - -### 2.2 运行命令 - -```Bash -# Unix/OS X -tools/import-schema.sh [-sql_dialect] -db -table
 - [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# Windows -# Before V2.0.4.x -tools\import-schema.bat [-sql_dialect] -db -table
 - [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# V2.0.4.x and later -tools\windows\schema\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] -``` - -### 2.3 运行示例 - - -将 `/home `下的文件 `dump_database1.sql` 导入到 `database2 `中,文件内容如下: - -```sql -DROP TABLE IF EXISTS table1; -CREATE TABLE table1( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -DROP TABLE IF EXISTS table2; -CREATE TABLE table2( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -``` - -执行脚本: - -```Bash -./import-schema.sh -sql_dialect table -s /home/dump_database1.sql -db database2 - -# database2 不存在时,提示错误信息如下 -The target database database2 does not exist - -# database2 存在时,提示成功 -Import completely! -``` - -验证导入元数据: - -```Bash -# 导入前 -IoTDB:database2> show tables -+---------+-------+ -|TableName|TTL(ms)| -+---------+-------+ -+---------+-------+ -Empty set. 
- -# 导入后 -IoTDB:database2> show tables details -+---------+-------+------+-------+ -|TableName|TTL(ms)|Status|Comment| -+---------+-------+------+-------+ -| table2| INF| USING| null| -| table1| INF| USING| null| -+---------+-------+------+-------+ - -IoTDB:database2> desc table1 -+------------+---------+---------+ -| ColumnName| DataType| Category| -+------------+---------+---------+ -| time|TIMESTAMP| TIME| -| region| STRING| TAG| -| plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model_id| STRING|ATTRIBUTE| -| maintenance| STRING|ATTRIBUTE| -| temperature| FLOAT| FIELD| -| humidity| FLOAT| FIELD| -| status| BOOLEAN| FIELD| -|arrival_time|TIMESTAMP| FIELD| -+------------+---------+---------+ - -IoTDB:database2> desc table2 -+------------+---------+---------+ -| ColumnName| DataType| Category| -+------------+---------+---------+ -| time|TIMESTAMP| TIME| -| region| STRING| TAG| -| plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model_id| STRING|ATTRIBUTE| -| maintenance| STRING|ATTRIBUTE| -| temperature| FLOAT| FIELD| -| humidity| FLOAT| FIELD| -| status| BOOLEAN| FIELD| -|arrival_time|TIMESTAMP| FIELD| -+------------+---------+---------+ -``` diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Audit-Log_timecho.md b/src/zh/UserGuide/Master/Table/User-Manual/Audit-Log_timecho.md deleted file mode 100644 index 0f0df7463..000000000 --- a/src/zh/UserGuide/Master/Table/User-Manual/Audit-Log_timecho.md +++ /dev/null @@ -1,163 +0,0 @@ - - - -# 安全审计 - -## 1. 引言 - -审计日志是数据库的记录凭证,通过审计日志功能可以查询数据库中增删改查等各项操作,以保证信息安全。IoTDB 审计日志功能支持以下特性: - -* 可通过配置决定是否开启审计日志功能 -* 可通过参数设置审计日志记录的操作类型和权限级别 -* 可通过参数设置审计日志文件的存储周期,包括基于 TTL 实现时间滚动和基于 SpaceTL 实现空间滚动。 -* 可通过参数设置统计任意时间段内写入和查询延时大于阈值(默认3000毫秒)的慢请求个数。 -* 审计日志文件默认加密存储 - -> 注意:该功能从 V2.0.8 版本开始提供。 - -## 2. 
配置参数 - -通过编辑配置文件 `iotdb-system.properties` 中如下参数来启动审计日志功能。 - -* V2.0.8.1 - - -| 参数名称 | 参数描述 | 数据类型 | 默认值 | 生效方式 | -|-----------------------------------|------------------------------------------------------------------------------------------------------| ---------- | ------------------------ | ---------- | -| `enable_audit_log` | 是否开启审计日志。 true:启用。false:禁用。 | Boolean | false | 热加载 | -| `auditable_operation_type` | 操作类型选择。 DML :所有 DML 都会记录审计日志; DDL :所有 DDL 都会记录审计日志; QUERY :所有 QUERY 都会记录审计日志; CONTROL:所有控制语句都会记录审计日志; | String | DML,DDL,QUERY,CONTROL | 热加载 | -| `auditable_operation_level` | 权限级别选择。 global :记录全部的审计日志; object:仅针对数据实例的事件的审计日志会被记录; 包含关系:object < global。 例如:设置为 global 时,所有审计日志正常记录;设置为 object 时,仅记录对具体数据实例的操作。 | String | global | 热加载 | -| `auditable_operation_result` | 审计结果选择。 success:只记录成功事件的审计日志; fail:只记录失败事件的审计日志; | String | success, fail | 热加载 | -| `audit_log_ttl_in_days` | 审计日志的 TTL,生成审计日志的时间达到该阈值后过期。 | Double | -1.0(永远不会被删除) | 热加载 | -| `audit_log_space_tl_in_GB` | 审计日志的 SpaceTL,审计日志总空间达到该阈值后开始轮转删除。 | Double | 1.0| 热加载| -| `audit_log_batch_interval_in_ms` | 审计日志批量写入的时间间隔 | Long | 1000 | 热加载 | -| `audit_log_batch_max_queue_bytes` | 用于批量处理审计日志的队列最大字节数。当队列大小超过此值时,后续的写入操作将被阻塞。 | Long | 268435456 | 热加载 | - - -* V2.0.9.2 - -| 参数名称 | 参数描述 | 数据类型 | 默认值 | 生效方式 | -|-----------------------------------|------------------------------------------------------------------------------------------------------| ---------- | ------------------------ | ---------- | -| `enable_audit_log` | 是否开启审计日志。 true:启用。false:禁用。 | Boolean | false | 热加载 | -| `auditable_operation_type` | 操作类型选择。 DML :所有 DML 都会记录审计日志; DDL :所有 DDL 都会记录审计日志; QUERY :所有 QUERY 都会记录审计日志; CONTROL:所有控制语句都会记录审计日志; | String | DML,DDL,QUERY,CONTROL | 热加载 | -| `auditable_dml_event_type` | 审计DML操作时的事件类型。`OBJECT_AUTHENTICATION`:对象鉴权,`SLOW_OPERATION`:慢操作 | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | 热加载 | -| `auditable_ddl_event_type` | 审计DDL操作时的事件类型。`OBJECT_AUTHENTICATION`:对象鉴权,`SLOW_OPERATION`:慢操作 | String | 
`OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | 热加载 | -| `auditable_query_event_type` | 审计查询操作时的事件类型。`OBJECT_AUTHENTICATION`:对象鉴权,`SLOW_OPERATION`:慢操作 | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | 热加载 | -| `auditable_control_event_type` | 审计控制操作时的事件类型。`CHANGE_AUDIT_OPTION`:审计选项变更,`OBJECT_AUTHENTICATION`:对象鉴权,`LOGIN`:登录,`LOGOUT`:退出登录,`DN_SHUTDOWN`:数据节点关机,`SLOW_OPERATION`:慢操作 | String | `CHANGE_AUDIT_OPTION`,`OBJECT_AUTHENTICATION`,`LOGIN`,`LOGOUT`,`DN_SHUTDOWN`,`SLOW_OPERATION` | 热加载 | -| `auditable_operation_level` | 权限级别选择。 global :记录全部的审计日志; object:仅针对数据实例的事件的审计日志会被记录; 包含关系:object < global。 例如:设置为 global 时,所有审计日志正常记录;设置为 object 时,仅记录对具体数据实例的操作。 | String | global | 热加载 | -| `auditable_operation_result` | 审计结果选择。 success:只记录成功事件的审计日志; fail:只记录失败事件的审计日志; | String | success, fail | 热加载 | -| `audit_log_ttl_in_days` | 审计日志的 TTL,生成审计日志的时间达到该阈值后过期。 | Double | -1.0(永远不会被删除) | 热加载 | -| `audit_log_space_tl_in_GB` | 审计日志的 SpaceTL,审计日志总空间达到该阈值后开始轮转删除。 | Double | 1.0| 热加载| -| `audit_log_batch_interval_in_ms` | 审计日志批量写入的时间间隔 | Long | 1000 | 热加载 | -| `audit_log_batch_max_queue_bytes` | 用于批量处理审计日志的队列最大字节数。当队列大小超过此值时,后续的写入操作将被阻塞。 | Long | 268435456 | 热加载 | - -**关于对象鉴权和慢操作的说明:** -* 当 `auditable_dml_event_type` 、`auditable_ddl_event_type`、`auditable_query_event_type`、`auditable_control_event_type` 参数值设置为 `OBJECT_AUTHENTICATION`(对象鉴权)时,则对应的事件类型会被记录审计日志。 -* 当 `auditable_dml_event_type` 、`auditable_ddl_event_type`、`auditable_query_event_type`、`auditable_control_event_type` 参数值设置为 `SLOW_OPERATION`(慢操作),则操作时间大于 `slow_query_threshold` 参数值(默认 3000 ms)的对应事件类型才会被记录审计日志。`slow_query_threshold` 参数值可通过 iotdb-system.properties 文件进行配置。 - -## 3. 
查阅方法 - -支持通过 SQL 直接阅读、获取审计日志相关信息。 - -### 3.1 SQL 语法 - -```SQL -SELECT (, )* log FROM WHERE whereclause ORDER BY order_expression -``` - -其中: - -* `AUDIT_LOG_PATH` :审计日志存储位置`__audit.audit_log`; -* `audit_log_field`:查询字段请参考下一小节元数据结构。 -* 支持 Where 条件搜索和 Order By 排序。 - -### 3.2 元数据结构 - -| 字段 | 含义 | 类型 | -| ------------------------ |------------------------------------------------------| ----------- | -| `time` | 事件开始的的日期和时间 | timestamp | -| `username` | 用户名称 | string | -| `cli_hostname` | 用户主机标识 | string | -| `audit_event_type` | 审计事件类型,WRITE\_DATA, GENERATE\_KEY, SLOW\_OPERATION 等 | string | -| `operation_type` | 审计事件的操作类型,DML, DDL, QUERY, CONTROL | string | -| `privilege_type` | 审计事件使用的权限,WRITE\_DATA, MANAGE\_USER 等 | string | -| `privilege_level` | 事件的权限级别,global, object | string | -| `result` | 事件结果,success=1, fail=0 | boolean | -| `database` | 数据库名称 | string | -| `sql_string` | 用户的原始 SQL | string | -| `log` | 具体的事件描述 | string | - -### 3.3 使用示例 - -* 查询成功执行了DML操作的时间、用户名及主机信息 - -```SQL -IoTDB:__audit> select time,username,cli_hostname from audit_log where result = true and operation_type='DML' -+-----------------------------+--------+------------+ -| time|username|cli_hostname| -+-----------------------------+--------+------------+ -|2026-01-23T11:43:46.697+08:00| root| 127.0.0.1| -|2026-01-23T11:45:39.950+08:00| root| 127.0.0.1| -+-----------------------------+--------+------------+ -Total line number = 2 -It costs 0.284s -``` - -* 查询最近一次操作的时间、用户名、主机信息、操作类型以及原始 SQL - -```SQL -IoTDB:__audit> select time,username,cli_hostname,operation_type,sql_string from audit_log order by time desc limit 1 -+-----------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------+ -| time|username|cli_hostname|operation_type| sql_string| 
-+-----------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------+ -|2026-01-23T11:46:31.026+08:00| root| 127.0.0.1| QUERY|select time,username,cli_hostname,operation_type,sql_string from audit_log order by time desc limit 1| -+-----------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------+ -Total line number = 1 -It costs 0.053s -``` - -* 查询所有事件结果为false的操作数据库、操作类型及日志信息 - -```SQL -IoTDB:__audit> select time,database,operation_type,log from audit_log where result=false -+-----------------------------+--------+--------------+----------------------------------------------------------------------+ -| time|database|operation_type| log| -+-----------------------------+--------+--------------+----------------------------------------------------------------------+ -|2026-01-23T11:47:42.136+08:00| | CONTROL|User user1 (ID=-1) login failed with code: 804, Authentication failed.| -+-----------------------------+--------+--------------+----------------------------------------------------------------------+ -Total line number = 1 -It costs 0.011s -``` - -* 设置 slow_query_threshold = 1 (ms),查询审计事件类型为慢操作的记录 -```SQL -IoTDB:__audit> select * from audit_log where audit_event_type='SLOW_OPERATION' limit 3 
-+-----------------------------+-------+-------+--------+------------+----------------+--------------+--------------+---------------+------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| 
time|node_id|user_id|username|cli_hostname|audit_event_type|operation_type|privilege_type|privilege_level|result| database| sql_string| log| -+-----------------------------+-------+-------+--------+------------+----------------+--------------+--------------+---------------+------+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------+ -|2026-05-06T14:57:57.468+08:00| node_1| u_0| root| 127.0.0.1| SLOW_OPERATION| QUERY| [SELECT]| OBJECT| true| | show databases| SLOW_QUERY: cost 10 ms, show databases| -|2026-05-06T14:58:38.149+08:00| node_1| u_0| root| 127.0.0.1| SLOW_OPERATION| DML| [INSERT]| OBJECT| true|database1|INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2024-11-26 13:37:00', 90.0, 35.1, true, '2024-11-26 13:37:34'), ('北京', '1001', '100', 'A', '180', '2024-11-26 13:38:00', 90.0, 35.1, true, '2024-11-26 13:38:25'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:38:00', NULL, 35.1, true, '2024-11-27 16:37:01'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:39:00', 85.0, 35.3, NULL, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:40:00', 85.0, NULL, NULL, '2024-11-27 16:37:03'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:41:00', 85.0, NULL, NULL, '2024-11-27 16:37:04'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:42:00', NULL, 35.2, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:43:00', NULL, Null, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:44:00', NULL, Null, false, '2024-11-27 16:37:08'), ('上海', '3001', '100', 'C', '90', '2024-11-28 08:00:00', 85.0, Null, NULL, '2024-11-28 08:00:09'), ('上海', '3001', '100', 'C', '90', '2024-11-28 09:00:00', NULL, 40.9, true, NULL), ('上海', '3001', '100', 'C', '90', '2024-11-28 10:00:00', 85.0, 35.2, NULL, '2024-11-28 10:00:11'), ('上海', '3001', '100', 'C', '90', '2024-11-28 11:00:00', 88.0, 45.1, true, '2024-11-28 11:00:12'), ('上海', '3001', '101', 'D', '360', '2024-11-29 10:00:00', 85.0, NULL, NULL, '2024-11-29 10:00:13'), ('上海', '3002', '100', 'E', '180', '2024-11-29 11:00:00', NULL, 45.1, true, NULL), ('上海', '3002', '100', 'E', '180', '2024-11-29 18:30:00', 90.0, 35.4, true, '2024-11-29 18:30:15'), ('上海', '3002', '101', 'F', '360', 
'2024-11-30 09:30:00', 90.0, 35.2, true, NULL), ('上海', '3002', '101', 'F', '360', '2024-11-30 14:30:00', 90.0, 34.8, true, '2024-11-30 14:30:17')|Execution: INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2024-11-26 13:37:00', 90.0, 35.1, true, '2024-11-26 13:37:34'), ('北京', '1001', '100', 'A', '180', '2024-11-26 13:38:00', 90.0, 35.1, true, '2024-11-26 13:38:25'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:38:00', NULL, 35.1, true, '2024-11-27 16:37:01'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:39:00', 85.0, 35.3, NULL, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:40:00', 85.0, NULL, NULL, '2024-11-27 16:37:03'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:41:00', 85.0, NULL, NULL, '2024-11-27 16:37:04'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:42:00', NULL, 35.2, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:43:00', NULL, Null, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:44:00', NULL, Null, false, '2024-11-27 16:37:08'), ('上海', '3001', '100', 'C', '90', '2024-11-28 08:00:00', 85.0, Null, NULL, '2024-11-28 08:00:09'), ('上海', '3001', '100', 'C', '90', '2024-11-28 09:00:00', NULL, 40.9, true, NULL), ('上海', '3001', '100', 'C', '90', '2024-11-28 10:00:00', 85.0, 35.2, NULL, '2024-11-28 10:00:11'), ('上海', '3001', '100', 'C', '90', '2024-11-28 11:00:00', 88.0, 45.1, true, '2024-11-28 11:00:12'), ('上海', '3001', '101', 'D', '360', '2024-11-29 10:00:00', 85.0, NULL, NULL, '2024-11-29 10:00:13'), ('上海', '3002', '100', 'E', '180', '2024-11-29 11:00:00', NULL, 45.1, true, NULL), ('上海', '3002', '100', 'E', '180', '2024-11-29 18:30:00', 90.0, 35.4, true, '2024-11-29 18:30:15'), ('上海', '3002', '101', 'F', '360', '2024-11-30 09:30:00', 90.0, 35.2, true, NULL), ('上海', '3002', '101', 'F', '360', '2024-11-30 14:30:00', 90.0, 34.8, true, '2024-11-30 14:30:17') cost 329 ms, with status code: 
TSStatus(code:200, message:)| -|2026-05-06T14:58:45.534+08:00| node_1| u_0| root| 127.0.0.1| SLOW_OPERATION| QUERY| [SELECT]| OBJECT| true|database1| select * from table1| SLOW_QUERY: cost 121 ms, select * from table1| -+-----------------------------+-------+-------+--------+------------+----------------+--------------+--------------+---------------+------+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 3 -It costs 0.026s -``` \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Authority-Management-Upgrade_timecho.md b/src/zh/UserGuide/Master/Table/User-Manual/Authority-Management-Upgrade_timecho.md deleted file mode 100644 index 94334ce70..000000000 --- a/src/zh/UserGuide/Master/Table/User-Manual/Authority-Management-Upgrade_timecho.md +++ /dev/null @@ -1,539 +0,0 @@ - - -# 权限管理 - -IoTDB 提供了权限管理功能,用于对数据和集群系统执行精细的访问控制,保障数据与系统安全。本篇介绍了 IoTDB 表模型中权限模块的基本概念、用户定义、权限管理、鉴权逻辑与功能用例。 - -## 1. 基本概念 - -### 1.1 用户 - -用户即数据库的合法使用者。一个用户与一个唯一的用户名相对应,并且拥有密码作为身份验证的手段。一个人在使用数据库之前,必须先提供合法的(即存于数据库中的)用户名与密码。 - -### 1.2 权限 - -数据库提供多种操作,但并非所有的用户都能执行所有操作。如果一个用户可以执行某项操作,则称该用户有执行该操作的权限。 - -### 1.3 角色 - -角色是若干权限的集合,并且有一个唯一的角色名作为标识符。角色通常和一个现实身份相对应(例如交通调度员),而一个现实身份可能对应着多个用户。这些具有相同现实身份的用户往往具有相同的一些权限,角色就是为了能对这样的权限进行统一的管理的抽象。 - -### 1.4 默认用户与角色 - -安装初始化后的 IoTDB 中有一个默认用户 root,默认密码为 TimechoDB@2021。该用户为管理员用户,拥有所有权限,无法被赋予、撤销权限,也无法被删除,数据库内仅有一个管理员用户。一个新创建的用户或角色不具备任何权限。 - -## 2. 用户定义 - -拥有 SECURITY 的用户可以创建用户或者角色,需要满足以下约束: - -### 2.1 用户名限制 - -* 4\~32个字符,支持使用英文大小写字母、数字、特殊字符`(!@#$%^&*()_+-=)`,用户无法创建和管理员用户同名的用户。 -* 如果用户名全是数字或包含特殊字符,则创建时需要使用双引号`""`括起来。 - -### 2.2 密码限制 - -12~32个字符,必须包含大写小写字母、至少一个数字、至少一个特殊字符(`!@#$%^&*()_+-=`)且不能与用户名相同。 - -### 2.3 角色名限制 - -4\~32个字符,支持使用英文大小写字母、数字、特殊字符(`!@#$%^&*()_+-=`)。用户无法创建和管理员用户同名的角色。 - -## 3. 权限管理 - -IoTDB 表模型主要有两类权限:全局权限、数据权限。 - -### 3.1 全局权限 - -全局权限包含 SYSTEM、SECURITY、AUDIT 三种类型: - -* SYSTEM:只具备运维操作、DDL(Data Definition Language)相关的权限。 -* SECURITY:只具备管理角色(Role)或用户(User)以及为其他账号授予权限的权限。 -* AUDIT :只具备维护审计规则、查看审计日志的权限。 - -各权限详细描述见下表: - -
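| Privilege | Former privilege name | Description |
| --- | --- | --- |
| SYSTEM | | Allows users to create, alter, and drop databases. |
| SYSTEM | | Allows users to create, alter, and drop tables and table views. |
| SYSTEM | | Allows users to create, drop, and view user-defined functions. |
| SYSTEM | | Allows users to create, start, stop, drop, and view PIPEs. Allows users to create, drop, and view PIPEPLUGINS. |
| SYSTEM | | Allows users to query and cancel queries. Allows users to view variables. Allows users to view the cluster status. |
| SYSTEM | | Allows users to create, drop, and query deep learning models. |
| SECURITY | MANAGE_USER | Allows users to create and drop users, change user passwords, view users' privilege information, and list all users. |
| SECURITY | MANAGE_ROLE | Allows users to create and drop roles, view roles' privilege information, grant a role to a user or revoke it, and list all roles. |
| AUDIT | | Allows users to maintain audit log rules and view audit logs. |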
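 - -### 3.2 Data Privileges - -A data privilege consists of a privilege type and a scope. - -* Privilege types: CREATE (create), DROP (drop), ALTER (alter), SELECT (query data), INSERT (insert/update data), and DELETE (delete data). -* Scopes: ANY (system-wide), DATABASE (a single database), and TABLE (a single table). - * A privilege on ANY applies to all databases and tables. - * A privilege on a database applies to that database and all of its tables. - * A privilege on a table applies to that table only. -* How scopes take effect: for a single-table operation, the database matches the user's privileges against the data privilege scopes. For example, when a user tries to write data to DATABASE1.TABLE1, the system checks in turn whether the user holds the write privilege on ANY, on DATABASE1, or on DATABASE1.TABLE1, and stops at the first match or fails if nothing matches. -* The following table shows how privilege types, scopes, and effects relate: - -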
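-| Privilege type | Scope (level) | Effect |
-| --- | --- | --- |
-| CREATE | ANY | Allows creating any table and any database. |
-| | Database | Allows the user to create tables in that database; allows the user to create a database with that name. |
-| | Table | Allows the user to create a table with that name. |
-| DROP | ANY | Allows dropping any table and any database. |
-| | Database | Allows the user to drop that database; allows the user to drop tables in that database. |
-| | Table | Allows the user to drop that table. |
-| ALTER | ANY | Allows altering the definition of any table and any database. |
-| | Database | Allows the user to alter the definition of that database and of the tables in it. |
-| | Table | Allows the user to alter the definition of that table. |
-| SELECT | ANY | Allows querying data in any table of any database in the system. |
-| | Database | Allows the user to query data in any table of that database. |
-| | Table | Allows the user to query data in that table. In a multi-table query, the database returns only the data the user is permitted to access. |
-| INSERT | ANY | Allows inserting/updating data in any table of any database. |
-| | Database | Allows the user to insert/update data in any table within that database. |
-| | Table | Allows the user to insert/update data in that table. |
-| DELETE | ANY | Allows deleting data from any table. |
-| | Database | Allows the user to delete data within that database. |
-| | Table | Allows the user to delete data from that table. |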
- -### 3.3 权限授予与取消 - -IoTDB支持通过如下三种途径进行用户授权和撤销权限: - -* 超级管理员直接授予或撤销 -* 拥有GRANT OPTION权限的用户授予或撤销 -* 通过角色授予或撤销(由超级管理员或具备 SECURITY 权限的用户操作角色) - -在IoTDB 表模型中,授权或撤销权限时需遵循以下原则: - -* 授权/撤销全局权限时,无需指定权限的范围。 -* 授予/撤销数据权限时,需要指定权限类型和权限范围。在撤销权限时只会撤销指定的权限范围,不会受权限范围包含关系的影响。 -* 允许对尚未创建的数据库或表提前进行权限规划和授权。 -* 允许重复授权/撤销权限。 -* WITH GRANT OPTION: 允许用户在授权范围内管理权限。用户可以授予或撤销其他用户在该范围内的权限。 - -## 4. 功能语法与示例 - -### 4.1 用户与角色相关 - -1. 创建用户(需 SECURITY 权限) - -```SQL -CREATE USER -eg: CREATE USER user1 'Passwd@202604'; -``` - -2. 修改密码 - -用户可以修改自己的密码,但修改其他用户密码需要具备 SECURITY 权限。 - -```SQL -ALTER USER SET PASSWORD -eg: ALTER USER tempuser SET PASSWORD 'Newpwd@202604'; -``` - -3. 删除用户(需 SECURITY 权限) - -```SQL -DROP USER -eg: DROP USER user1; -``` - -4. 创建角色 (需 SECURITY 权限) - -```SQL -CREATE ROLE -eg: CREATE ROLE role1; -``` - -5. 删除角色 (需 SECURITY 权限) - -```SQL -DROP ROLE -eg: DROP ROLE role1; -``` - -6. 赋予用户角色 (需 SECURITY 权限) - -```SQL -GRANT ROLE TO -eg: GRANT ROLE admin TO user1; -``` - -7. 移除用户角色 (需 SECURITY 权限) - -```SQL -REVOKE ROLE FROM -eg: REVOKE ROLE admin FROM user1; -``` - -8. 列出所有用户(需 SECURITY 权限) - -```SQL -LIST USER; -``` - -9. 列出所有的角色 (需 SECURITY 权限) - -```SQL -LIST ROLE; -``` - -10. 列出指定角色下所有用户(需 SECURITY 权限) - -```SQL -LIST USER OF ROLE -eg: LIST USER OF ROLE roleuser; -``` - -11. 列出指定用户下的所有角色 - -用户可以列出自己的角色,但列出其他用户的角色需要拥有 SECURITY 权限。 - -```SQL -LIST ROLE OF USER -eg: LIST ROLE OF USER tempuser; -``` - -12. 列出用户所有权限 - -用户可以列出自己的权限信息,但列出其他用户的权限需要拥有 SECURITY 权限。 - -```SQL -LIST PRIVILEGES OF USER -eg: LIST PRIVILEGES OF USER tempuser; -``` - -13. 列出角色所有权限 - -用户可以列出自己具有的角色的权限信息,列出其他角色的权限需要有 SECURITY 权限。 - -```SQL -LIST PRIVILEGES OF ROLE -eg: LIST PRIVILEGES OF ROLE actor; -``` - -### 4.2 授权与取消授权 - -#### 4.2.1 授予权限 - -1. 给用户授予管理用户的权限 - -```SQL -GRANT SECURITY TO USER -eg: GRANT SECURITY TO USER TEST_USER; -``` - -2. 
给用户授予创建数据库及在数据库范围内创建表的权限,且允许用户在该范围内管理权限 - -```SQL -GRANT CREATE ON DATABASE TO USER WITH GRANT OPTION -eg: GRANT CREATE ON DATABASE TESTDB TO USER TEST_USER WITH GRANT OPTION; -``` - -3. 给角色授予查询数据库的权限 - -```SQL -GRANT SELECT ON DATABASE TO ROLE -eg: GRANT SELECT ON DATABASE TESTDB TO ROLE TEST_ROLE; -``` - -4. 给用户授予查询表的权限 - -```SQL -GRANT SELECT ON . TO USER -eg: GRANT SELECT ON TESTDB.TESTTABLE TO USER TEST_USER; -``` - -5. 给角色授予查询所有数据库及表的权限 - -```SQL -GRANT SELECT ON ANY TO ROLE -eg: GRANT SELECT ON ANY TO ROLE TEST_ROLE; -``` - -6. ALL 语法糖:ALL 表示对象范围内所有权限,可以使用 ALL 字段灵活地授予权限。 - -```SQL -GRANT ALL TO USER TESTUSER; --- 将用户可以获取的所有权限授予给用户,包括全局权限和 ANY 范围的所有数据权限 - -GRANT ALL ON ANY TO USER TESTUSER; --- 将 ANY 范围内可以获取的所有权限授予给用户,执行该语句后,用户将拥有在所有数据库上的所有数据权限 - -GRANT ALL ON DATABASE TESTDB TO USER TESTUSER; --- 将 DB 范围内可以获取的所有权限授予给用户,执行该语句后,用户将拥有在该数据库上的所有数据权限 - -GRANT ALL ON TABLE TESTTABLE TO USER TESTUSER; --- 将 TABLE 范围内可以获取的所有权限授予给用户,执行该语句后,用户将拥有在该表上的所有数据权限 -``` - -#### 4.2.2 撤销权限 - -1. 取消用户管理用户的权限 - -```SQL -REVOKE SECURITY FROM USER -eg: REVOKE SECURITY FROM USER TEST_USER; -``` - -2. 取消用户创建数据库及在数据库范围内创建表的权限 - -```SQL -REVOKE CREATE ON DATABASE FROM USER -eg: REVOKE CREATE ON DATABASE TEST_DB FROM USER TEST_USER; -``` - -3. 取消用户查询表的权限 - -```SQL -REVOKE SELECT ON . FROM USER -eg: REVOKE SELECT ON TESTDB.TESTTABLE FROM USER TEST_USER; -``` - -4. 取消用户查询所有数据库及表的权限 - -```SQL -REVOKE SELECT ON ANY FROM USER -eg: REVOKE SELECT ON ANY FROM USER TEST_USER; -``` - -5. 
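ALL shorthand: ALL stands for every privilege within the given object scope, so it can be used to revoke privileges flexibly.
-
-```SQL
-REVOKE ALL FROM USER TESTUSER;
--- Revokes all of the user's global privileges and all data privileges at ANY scope
-
-REVOKE ALL ON ANY FROM USER TESTUSER;
--- Revokes all of the user's data privileges at ANY scope; database-scope and table-scope privileges are not affected
-
-REVOKE ALL ON DATABASE TESTDB FROM USER TESTUSER;
--- Revokes all of the user's data privileges on the database; table-scope privileges are not affected
-
-REVOKE ALL ON TABLE TESTTABLE FROM USER TESTUSER;
--- Revokes all of the user's data privileges on the table
-```
-
-### 4.3 Viewing user privileges
-
-Every user has a privilege access list that records all privileges granted to it. Use the `LIST PRIVILEGES OF USER ` statement to view the privileges of a user or role. The output format is as follows:
-
-| ROLE    | SCOPE   | PRIVILEGE  | WITH GRANT OPTION   |
-| ------- |---------|------------| ------------------- |
-|         | DB1.TB1 | SELECT     | FALSE               |
-|         |         | SECURITY   | TRUE                |
-| ROLE1   | DB2.TB2 | UPDATE     | TRUE                |
-| ROLE1   | DB3.\*  | DELETE     | FALSE               |
-| ROLE1   | \*.\*   | UPDATE     | TRUE                |
-
-Where:
-
-* `ROLE` column: if empty, the privilege belongs to the user directly; otherwise the privilege comes from a role granted to the user.
-* `SCOPE` column: the scope of the user's or role's privilege. Table scope is shown as `DB.TABLE`, database scope as `DB.*`, and ANY scope as `*.*`.
-* `PRIVILEGE` column: the specific privilege type.
-* `WITH GRANT OPTION` column: if TRUE, the user can grant this privilege to others.
-* A user or role can hold both tree-model and table-model privileges, but the system shows only the privileges of the model used by the current connection; privileges under the other model are not displayed.
-
-## 5. 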
场景示例 - -以 [示例数据](../Reference/Sample-Data.md) 内容为例,两个表的数据可能分别属于 bj、sh 两个数据中心,彼此间不希望对方获取自己的数据库数据,因此我们需要将不同的数据在数据中心层进行权限隔离。 - -### 5.1 创建用户 - -使用 `CREATE USER ` 创建用户。例如,可以使用具有所有权限的root用户为 ln 和 sgcc 集团创建两个用户角色,名为 `bj_write_user`, `sh_write_user`,密码均为 write_Pwd@2026。SQL 语句为: - -```SQL -CREATE USER bj_write_user 'write_Pwd@2026'; -CREATE USER sh_write_user 'write_Pwd@2026'; -``` - -使用展示用户的 SQL 语句: - -```SQL -LIST USER; -``` - -可以看到这两个已经被创建的用户,结果如下: - -```SQL -+------+-------------+-----------------+-----------------+ -|UserId| User|MaxSessionPerUser|MinSessionPerUser| -+------+-------------+-----------------+-----------------+ -| 0| root| -1| 1| -| 10000|bj_write_user| -1| -1| -| 10001|sh_write_user| -1| -1| -+------+-------------+-----------------+-----------------+ -``` - -### 5.2 赋予用户权限 - -虽然两个用户已经创建,但是不具有任何权限,因此并不能对数据库进行操作,例如使用 `bj_write_user` 用户对 table1 中的数据进行写入,SQL 语句为: - -```SQL -IoTDB> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -``` - -系统不允许用户进行此操作,会提示错误: - -```SQL -IoTDB> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: database is not specified -IoTDB> use database1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: DATABASE database1 -``` - -root 用户使用 `GRANT ON TO USER ` 语句赋予用户`bj_write_user`对 table1 的写入权限,例如: - -```SQL -GRANT INSERT ON database1.table1 TO USER bj_write_user -``` - -使用`bj_write_user`再尝试写入数据 - -```SQL -IoTDB> use database1 -Msg: The statement is executed successfully. 
-IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: The statement is executed successfully. -``` - -### 5.3 撤销用户权限 - -授予用户权限后,可以使用 `REVOKE ON FROM USER `来撤销已经授予用户的权限。例如,用root用户撤销`bj_write_user`和`sh_write_user`的权限: - -```SQL -REVOKE INSERT ON database1.table1 FROM USER bj_write_user -REVOKE INSERT ON database1.table2 FROM USER sh_write_user -``` - -撤销权限后,`bj_write_user`就没有向table1写入数据的权限了。 - -```SQL -IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: No permissions for this operation, please add privilege INSERT ON database1.table1 -``` diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Authority-Management_timecho.md b/src/zh/UserGuide/Master/Table/User-Manual/Authority-Management_timecho.md deleted file mode 100644 index a34e76a8d..000000000 --- a/src/zh/UserGuide/Master/Table/User-Manual/Authority-Management_timecho.md +++ /dev/null @@ -1,484 +0,0 @@ - - -# 权限管理 - -IoTDB 提供了权限管理功能,用于对数据和集群系统执行精细的访问控制,保障数据与系统安全。本篇介绍了 IoTDB 表模型中权限模块的基本概念、用户定义、权限管理、鉴权逻辑与功能用例。 - -## 1. 基本概念 - -### 1.1 用户 - -用户即数据库的合法使用者。一个用户与一个唯一的用户名相对应,并且拥有密码作为身份验证的手段。一个人在使用数据库之前,必须先提供合法的(即存于数据库中的)用户名与密码。 - -### 1.2 权限 - -数据库提供多种操作,但并非所有的用户都能执行所有操作。如果一个用户可以执行某项操作,则称该用户有执行该操作的权限。 - -### 1.3 角色 - -角色是若干权限的集合,并且有一个唯一的角色名作为标识符。角色通常和一个现实身份相对应(例如交通调度员),而一个现实身份可能对应着多个用户。这些具有相同现实身份的用户往往具有相同的一些权限,角色就是为了能对这样的权限进行统一的管理的抽象。 - -### 1.4 默认用户与角色 - -安装初始化后的 IoTDB 中有一个默认用户 root,默认密码为 TimechoDB@2021(V2.0.6.x 之前为 root)。该用户为管理员用户,拥有所有权限,无法被赋予、撤销权限,也无法被删除,数据库内仅有一个管理员用户。一个新创建的用户或角色不具备任何权限。 - - -## 2. 
权限列表 - -IoTDB 表模型主要有两类权限:全局权限、数据权限。 - -### 2.1 全局权限 - -全局权限包含用户管理和角色管理。 - -下表描述了全局权限的种类: - -| 权限名称 | 描述 | -| ----------------- |----------------------------------------------------------------------------------------| -| MANAGE\_USER | - 允许用户创建用户
- 允许用户删除用户
- 允许用户修改用户密码
- 允许用户查看用户的权限信息
- 允许用户列出所有用户 | -| MANAGE\_ROLE | - 允许用户创建角色
- 允许用户删除角色
- 允许用户查看角色的权限信息
- 允许用户将角色授予某个用户或撤销
- 允许用户列出所有角色 | - - -### 2.2 数据权限 - -数据权限由权限类型和范围组成。 - -* 权限类型包括:CREATE(创建权限),DROP(删除权限),ALTER(修改权限),SELECT(查询数据权限),INSERT(插入/更新数据权限),DELETE(删除数据权限)。 - -* 范围包括:ANY(系统范围内),DATABASE(数据库范围内),TABLE(单个表)。 - - 作用于 ANY 的权限会影响所有数据库和表。 - - 作用于数据库的权限会影响该数据库及其所有表。 - - 作用于表的权限仅影响该表。 - -* 范围生效说明:执行单表操作时,数据库会匹配用户权限与数据权限范围。例如,用户尝试向 DATABASE1.TABLE1 写入数据时,系统会依次检查用户是否有对 ANY、DATABASE1或 DATABASE1.TABLE1 的写入权限,直到匹配成功或者匹配失败。 - -* 权限类型、范围及效果逻辑关系如下表所示: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
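-| Privilege type | Scope (level) | Effect |
-| --- | --- | --- |
-| CREATE | ANY | Allows creating any table and any database. |
-| | Database | Allows the user to create tables in that database; allows the user to create a database with that name. |
-| | Table | Allows the user to create a table with that name. |
-| DROP | ANY | Allows dropping any table and any database. |
-| | Database | Allows the user to drop that database; allows the user to drop tables in that database. |
-| | Table | Allows the user to drop that table. |
-| ALTER | ANY | Allows altering the definition of any table and any database. |
-| | Database | Allows the user to alter the definition of that database and of the tables in it. |
-| | Table | Allows the user to alter the definition of that table. |
-| SELECT | ANY | Allows querying data in any table of any database in the system. |
-| | Database | Allows the user to query data in any table of that database. |
-| | Table | Allows the user to query data in that table. In a multi-table query, the database returns only the data the user is permitted to access. |
-| INSERT | ANY | Allows inserting/updating data in any table of any database. |
-| | Database | Allows the user to insert/update data in any table within that database. |
-| | Table | Allows the user to insert/update data in that table. |
-| DELETE | ANY | Allows deleting data from any table. |
-| | Database | Allows the user to delete data within that database. |
-| | Table | Allows the user to delete data from that table. |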
- -## 3. 用户、角色管理 -1. 创建用户(需 MANAGE_USER 权限) - -```SQL -CREATE USER -eg: CREATE USER user1 'passwd' -``` - -- 用户名约束:4~32个字符,支持使用英文大小写字母、数字、特殊字符`(!@#$%^&*()_+-=)`,用户无法创建和管理员用户同名的用户。 - - 如果用户名全是数字或包含特殊字符,则创建时需要使用双引号`""`括起来。 -- 密码约束:4~32个字符,可使用大写小写字母、数字、特殊字符`(!@#$%^&*()_+-=)`,密码默认采用 SHA-256 进行加密。 - -2. 修改密码 - -用户可以修改自己的密码,但修改其他用户密码需要具备 MANAGE_USER 权限。 - -```SQL -ALTER USER SET PASSWORD -eg: ALTER USER tempuser SET PASSWORD 'newpwd' -``` - -3. 删除用户(需 MANAGE_USER 权限) - -```SQL -DROP USER -eg: DROP USER user1 -``` - -4. 创建角色 (需 MANAGE_ROLE 权限) - -```SQL -CREATE ROLE -eg: CREATE ROLE role1 -``` - -角色名约束:4~32个字符,支持使用英文大小写字母、数字、特殊字符`(!@#$%^&*()_+-=)`,用户无法创建和管理员用户同名的角色。 - -5. 删除角色 (需 MANAGE_ROLE 权限) - -```SQL -DROP ROLE -eg: DROP ROLE role1 -``` - -6. 赋予用户角色 (需 MANAGE_ROLE 权限) - -```SQL -GRANT ROLE TO -eg: GRANT ROLE admin TO user1 -``` - -7. 移除用户角色 (需 MANAGE_ROLE 权限) - -```SQL -REVOKE ROLE FROM -eg: REVOKE ROLE admin FROM user1 -``` - -8. 列出所有用户(需 MANAGE_USER 权限) - -```SQL -LIST USER -``` - -9. 列出所有的角色 (需 MANAGE_ROLE 权限) - -```SQL -LIST ROLE -``` - -10. 列出指定角色下所有用户(需 MANAGE_USER 权限) - -```SQL -LIST USER OF ROLE -eg: LIST USER OF ROLE roleuser -``` - -11. 列出指定用户下的所有角色 - -用户可以列出自己的角色,但列出其他用户的角色需要拥有 MANAGE_ROLE 权限。 - -```SQL -LIST ROLE OF USER -eg: LIST ROLE OF USER tempuser -``` - -12. 列出用户所有权限 - -用户可以列出自己的权限信息,但列出其他用户的权限需要拥有 MANAGE_USER 权限。 - -```SQL -LIST PRIVILEGES OF USER -eg: LIST PRIVILEGES OF USER tempuser -``` - -13. 列出角色所有权限 - -用户可以列出自己具有的角色的权限信息,列出其他角色的权限需要有 MANAGE_ROLE 权限。 - -```SQL -LIST PRIVILEGES OF ROLE -eg: LIST PRIVILEGES OF ROLE actor -``` - -## 4. 权限管理 - -IoTDB支持通过如下三种途径进行用户授权和撤销权限: - -- 超级管理员直接授予或撤销 - -- 拥有GRANT OPTION权限的用户授予或撤销 - -- 通过角色授予或撤销(由超级管理员或具备MANAGE_ROLE权限的用户操作角色) - -在IoTDB 表模型中,授权或撤销权限时需遵循以下原则: - -- 授权/撤销全局权限时,无需指定权限的范围。 - -- 授予/撤销数据权限时,需要指定权限类型和权限范围。在撤销权限时只会撤销指定的权限范围,不会受权限范围包含关系的影响。 - -- 允许对尚未创建的数据库或表提前进行权限规划和授权。 - -- 允许重复授权/撤销权限。 - -- WITH GRANT OPTION: 允许用户在授权范围内管理权限。用户可以授予或撤销其他用户在该范围内的权限。 - -### 4.1 授予权限 - -1. 
给用户授予管理用户的权限 - -```SQL -GRANT MANAGE_USER TO USER -eg: GRANT MANAGE_USER TO USER TEST_USER -``` - -2. 给用户授予创建数据库及在数据库范围内创建表的权限,且允许用户在该范围内管理权限 - -```SQL -GRANT CREATE ON DATABASE TO USER WITH GRANT OPTION -eg: GRANT CREATE ON DATABASE TESTDB TO USER TEST_USER WITH GRANT OPTION -``` - -3. 给角色授予查询数据库的权限 - -```SQL -GRANT SELECT ON DATABASE TO ROLE -eg: GRANT SELECT ON DATABASE TESTDB TO ROLE TEST_ROLE -``` - -4. 给用户授予查询表的权限 - -```SQL -GRANT SELECT ON . TO USER -eg: GRANT SELECT ON TESTDB.TESTTABLE TO USER TEST_USER -``` - -5. 给角色授予查询所有数据库及表的权限 - -```SQL -GRANT SELECT ON ANY TO ROLE -eg: GRANT SELECT ON ANY TO ROLE TEST_ROLE -``` - -6. ALL 语法糖:ALL 表示对象范围内所有权限,可以使用 ALL 字段灵活地授予权限。 - -```sql -GRANT ALL TO USER TESTUSER --- 将用户可以获取的所有权限授予给用户,包括全局权限和 ANY 范围的所有数据权限 - -GRANT ALL ON ANY TO USER TESTUSER --- 将 ANY 范围内可以获取的所有权限授予给用户,执行该语句后,用户将拥有在所有数据库上的所有数据权限 - -GRANT ALL ON DATABASE TESTDB TO USER TESTUSER --- 将 DB 范围内可以获取的所有权限授予给用户,执行该语句后,用户将拥有在该数据库上的所有数据权限 - -GRANT ALL ON TABLE TESTTABLE TO USER TESTUSER --- 将 TABLE 范围内可以获取的所有权限授予给用户,执行该语句后,用户将拥有在该表上的所有数据权限 -``` - -### 4.2 撤销权限 - -1. 取消用户管理用户的权限 - -```SQL -REVOKE MANAGE_USER FROM USER -eg: REVOKE MANAGE_USER FROM USER TEST_USER -``` - -2. 取消用户创建数据库及在数据库范围内创建表的权限 - -```SQL -REVOKE CREATE ON DATABASE FROM USER -eg: REVOKE CREATE ON DATABASE TEST_DB FROM USER TEST_USER -``` - -3. 取消用户查询表的权限 - -```SQL -REVOKE SELECT ON . FROM USER -eg: REVOKE SELECT ON TESTDB.TESTTABLE FROM USER TEST_USER -``` - -4. 取消用户查询所有数据库及表的权限 - -```SQL -REVOKE SELECT ON ANY FROM USER -eg: REVOKE SELECT ON ANY FROM USER TEST_USER -``` - -5. 
ALL 语法糖:ALL 表示对象范围内所有权限,可以使用 ALL 字段灵活地撤销权限。 - -```sql -REVOKE ALL FROM USER TESTUSER --- 取消用户所有的全局权限以及 ANY 范围的所有数据权限 - -REVOKE ALL ON ANY FROM USER TESTUSER --- 取消用户 ANY 范围的所有数据权限,不会影响 DB 范围和 TABLE 范围的权限 - -REVOKE ALL ON DATABASE TESTDB FROM USER TESTUSER --- 取消用户在 DB 上的所有数据权限,不会影响 TABLE 权限 - -REVOKE ALL ON TABLE TESTDB FROM USER TESTUSER --- 取消用户在 TABLE 上的所有数据权限 -``` - -### 4.3 查看用户权限 - -每个用户都有一个权限访问列表,标识其获得的所有权限。可使用 `LIST PRIVILEGES OF USER ` 语句查看某个用户或角色的权限信息,输出格式如下: - -| ROLE | SCOPE | PRIVIVLEGE | WITH GRANT OPTION | -|--------------|---------| -------------- |-------------------| -| | DB1.TB1 | SELECT | FALSE | -| | | MANAGE\_ROLE | TRUE | -| ROLE1 | DB2.TB2 | UPDATE | TRUE | -| ROLE1 | DB3.\* | DELETE | FALSE | -| ROLE1 | \*.\* | UPDATE | TRUE | - -其中: -- `ROLE` 列:如果为空,则表示为该用户的自身权限。如果不为空,则表示该权限来源于被授予的角色。 -- `SCOPE` 列:表示该用户/角色的权限范围,表范围的权限表示为`DB.TABLE`,数据库范围的权限表示为`DB.*`, ANY 范围的权限表示为`*.*`。 -- `PRIVIVLEGE` 列:列出具体的权限类型。 -- `WITH GRANT OPTION` 列:如果为 TRUE,表示用户可以将自己的权限授予他人。 -- 用户或者角色可以同时具有树模型和表模型的权限,但系统会根据当前连接的模型来显示相应的权限,另一种模型下的权限则不会显示。 - -## 5. 
示例 - -以 [示例数据](../Reference/Sample-Data.md) 内容为例,两个表的数据可能分别属于 bj、sh 两个数据中心,彼此间不希望对方获取自己的数据库数据,因此我们需要将不同的数据在数据中心层进行权限隔离。 - -### 5.1 创建用户 - -使用 `CREATE USER ` 创建用户。例如,可以使用具有所有权限的root用户为 ln 和 sgcc 集团创建两个用户角色,名为 `bj_write_user`, `sh_write_user`,密码均为 `write_pwd`。SQL 语句为: - -```SQL -CREATE USER bj_write_user 'write_pwd' -CREATE USER sh_write_user 'write_pwd' -``` - -使用展示用户的 SQL 语句: - -```Plain -LIST USER -``` - -可以看到这两个已经被创建的用户,结果如下: - -```sql -+-------------+ -| User| -+-------------+ -|bj_write_user| -| root| -|sh_write_user| -+-------------+ -``` - -### 5.2 赋予用户权限 - -虽然两个用户已经创建,但是不具有任何权限,因此并不能对数据库进行操作,例如使用 `bj_write_user` 用户对 table1 中的数据进行写入,SQL 语句为: - -```sql -IoTDB> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -``` - -系统不允许用户进行此操作,会提示错误: - -```sql -IoTDB> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: database is not specified -IoTDB> use database1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: DATABASE database1 -``` - -root 用户使用 `GRANT ON TO USER ` 语句赋予用户`bj_write_user`对 table1 的写入权限,例如: - -```sql -GRANT INSERT ON database1.table1 TO USER bj_write_user -``` - -使用`bj_write_user`再尝试写入数据 - -```SQL -IoTDB> use database1 -Msg: The statement is executed successfully. -IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: The statement is executed successfully. 
-``` - -### 5.3 撤销用户权限 - -授予用户权限后,可以使用 `REVOKE ON FROM USER `来撤销已经授予用户的权限。例如,用root用户撤销`bj_write_user`和`sh_write_user`的权限: - -```sql -REVOKE INSERT ON database1.table1 FROM USER bj_write_user -REVOKE INSERT ON database1.table2 FROM USER sh_write_user -``` - -撤销权限后,`bj_write_user`就没有向table1写入数据的权限了。 - -```sql -IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: No permissions for this operation, please add privilege INSERT ON database1.table1 -``` diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Auto-Start-On-Boot_timecho.md b/src/zh/UserGuide/Master/Table/User-Manual/Auto-Start-On-Boot_timecho.md deleted file mode 100644 index 06a0ddba6..000000000 --- a/src/zh/UserGuide/Master/Table/User-Manual/Auto-Start-On-Boot_timecho.md +++ /dev/null @@ -1,243 +0,0 @@ - - -# 开机自启 - -## 1.概述 - -TimechoDB 支持通过 `daemon-confignode.sh`、`daemon-datanode.sh`、`daemon-ainode.sh` 三个脚本,将ConfigNode、DataNode、AINode 注册为 Linux 系统服务,结合系统自带的 `systemctl `命令,以守护进程方式管理 TimechoDB 集群,实现更便捷的启动、停止、重启及开机自启等操作,提升服务稳定性。 - -> 注意:该功能从 V 2.0.9.1 版本开始提供。 - -## 2. 环境要求 - -| 操作系统 | Linux(支持`systemctl`命令) | -| ---------- |:-----------------------------------------------------:| -| 用户权限 | root 用户 | -| 环境变量 | 部署 ConfigNode 和 DataNode 前需设置`JAVA_HOME` | - -## 3. 服务注册 - -进入 TimechoDB 安装目录,执行对应的守护进程脚本: - -```Bash -# 注册 ConfigNode 服务 -./tools/ops/daemon-confignode.sh - -# 注册 DataNode 服务 -./tools/ops/daemon-datanode.sh - -# 注册 AINode 服务 -./tools/ops/daemon-ainode.sh -``` - -执行脚本时将提示以下两个选择项: - -1. 是否本次直接启动对应 TimechoDB 服务(timechodb-confignode/timechodb-datanode/timechodb-ainode); -2. 是否将对应服务注册为开机自启服务。 - -脚本执行完成后,将在 `/etc/systemd/system/` 目录生成对应的服务文件: - -* `timechodb-confignode.service` -* `timechodb-datanode.service` -* `timechodb-ainode.service` - -## 4. 
服务管理 - -服务注册完成后,可通过 systemctl 命令对 TimechoDB 各节点服务进行启动、停止、重启、查看状态及配置开机自启等操作,以下命令均需使用 root 用户执行。 - -### 4.1 手动启动服务 - -```bash -# 启动 ConfigNode 服务 -systemctl start timechodb-confignode -# 启动 DataNode 服务 -systemctl start timechodb-datanode -# 启动 AINode 服务 -systemctl start timechodb-ainode -``` - -### 4.2 手动停止服务 - -```bash -# 停止 ConfigNode 服务 -systemctl stop timechodb-confignode -# 停止 DataNode 服务 -systemctl stop timechodb-datanode -# 停止 AINode 服务 -systemctl stop timechodb-ainode -``` - -停止服务后,通过查看服务状态,若显示为 inactive(dead),则说明服务关闭成功;若为其他状态,需查看 TimechoDB 日志,分析异常原因。 - -### 4.3 查看服务状态 - -```bash -# 查看 ConfigNode 服务状态 -systemctl status timechodb-confignode -# 查看 DataNode 服务状态 -systemctl status timechodb-datanode -# 查看 AINode 服务状态 -systemctl status timechodb-ainode -``` - -状态说明: - -* active(running):服务正在运行,若该状态持续 10 分钟,说明服务启动成功; -* failed:服务启动失败,需查看 TimechoDB 日志排查问题。 - -### 4.4 重启服务 - -重启服务相当于先执行停止操作,再执行启动操作,命令如下: - -```bash -# 重启 ConfigNode 服务 -systemctl restart timechodb-confignode -# 重启 DataNode 服务 -systemctl restart timechodb-datanode -# 重启 AINode 服务 -systemctl restart timechodb-ainode -``` - -### 4.5 配置开机自启 - -```bash -# 配置 ConfigNode 开机自启 -systemctl enable timechodb-confignode -# 配置 DataNode 开机自启 -systemctl enable timechodb-datanode -# 配置 AINode 开机自启 -systemctl enable timechodb-ainode -``` - -### 4.6 取消开机自启 - -```bash -# 取消 ConfigNode 开机自启 -systemctl disable timechodb-confignode -# 取消 DataNode 开机自启 -systemctl disable timechodb-datanode -# 取消 AINode 开机自启 -systemctl disable timechodb-ainode -``` - -## 5. 自定义服务配置 - -### 5.1 自定义方式 - -#### 5.1.1 方案一:修改脚本 - -1. 修改 `daemon-xxx.sh` 中的[Unit]、[Service]、[Install]区域配置项,具体配置项的含义参考下一小节 -2. 执行 `daemon-xxx.sh` 脚本 - -#### 5.1.2 方案二:修改服务文件 - -1. 修改 `/etc/systemd/system` 中的 `xx.service` 文件 -2. 
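Run `systemctl daemon-reload`
-
-### 5.2 `daemon-xxx.sh` configuration items
-
-#### 5.2.1 [Unit] section (service metadata)
-
-| Item          | Description                                                   |
-| ------------- | ------------------------------------------------------------- |
-| Description   | Description of the service                                    |
-| Documentation | Points to the official TimechoDB documentation                |
-| After         | Ensures the service starts only after the network service is up |
-
-#### 5.2.2 [Service] section (runtime configuration)
-
-| Item                                       | Meaning                                                                                          |
-| ------------------------------------------ | ------------------------------------------------------------------------------------------------ |
-| StandardOutput, StandardError              | Where the service's standard output and standard error are written                               |
-| LimitNOFILE=65536                          | Upper limit on file descriptors; the default is 65536                                            |
-| Type=simple                                | The service is a simple foreground process; systemd tracks its main process                      |
-| User=root, Group=root                      | Runs the service with the privileges of the root user and root group                             |
-| ExecStart/ExecStop                         | Paths of the service's start script and stop script                                              |
-| Restart=on-failure                         | Restarts the service automatically only when it exits abnormally                                 |
-| SuccessExitStatus=143                      | Treats exit code 143 (128+15, that is, normal termination by SIGTERM) as a successful exit       |
-| RestartSec=5                               | Interval before a restart; the default is 5 seconds                                              |
-| StartLimitInterval=600s, StartLimitBurst=3 | Allows at most 3 restarts within 10 minutes (600 seconds), so frequent restarts do not waste system resources |
-| RestartPreventExitStatus=SIGKILL           | Does not restart the service after it is killed by SIGKILL, avoiding endless restarts of zombie processes |
-
-#### 5.2.3 [Install] section (installation configuration)
-
-| Item                       | Meaning                                                              |
-| -------------------------- | -------------------------------------------------------------------- |
-| WantedBy=multi-user.target | Starts the service automatically when the system enters multi-user mode |
-
-### 5.3 Example .service file
-
-```bash
-[Unit]
-Description=timechodb-confignode
-Documentation=https://www.timecho.com/
-After=network.target
-
-[Service]
-StandardOutput=null
-StandardError=null
-LimitNOFILE=65536
-Type=simple
-User=root
-Group=root
-Environment=JAVA_HOME=$JAVA_HOME
-ExecStart=$TimechoDB_SBIN_HOME/start-confignode.sh
-Restart=on-failure
-SuccessExitStatus=143
-RestartSec=5
-StartLimitInterval=600s
-StartLimitBurst=3
-RestartPreventExitStatus=SIGKILL
-
-[Install]
-WantedBy=multi-user.target
-```
-
-Note: the above is the standard format of the timechodb-confignode.service file; the timechodb-datanode.service and timechodb-ainode.service files are similar.
-
-## 6. Notes
-
-1. **Process supervision**
-
-* **Automatic restart**: if the service fails to start or exits abnormally while running (for example, because of OOM), the system restarts it automatically.
-* **No restart**: a normal exit (for example, via `kill`, `./sbin/stop-xxx.sh`, or `systemctl stop`) does not trigger an automatic restart.
-
-2. 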
**日志位置** - -* 所有运行日志均存储在 TimechoDB 安装目录下的 `logs` 文件夹中,排查问题时请查阅该目录。 - -3. **集群状态查看** - -* 服务启动后,执行 `./sbin/start-cli.sh` 并输入 `show cluster` 命令,即可查看集群状态。 - -4. **故障恢复流程** - -* 若服务状态为 `failed`,修复问题后**必须**先执行 `systemctl daemon-reload`,然后再执行 `systemctl start`,否则启动将失败。 - -5. **配置生效** - -* 修改 `daemon-xxx.sh` 脚本内容后,需执行 `systemctl daemon-reload` 重新注册服务,新配置方可生效。 - -6. **启动方式兼容** - -* `systemctl start`启动的服务,可用`./sbin/stop` 停止(不重启)。 -* `./sbin/start` 启动的进程,无法通过 `systemctl` 监控状态。 diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Black-White-List_timecho.md b/src/zh/UserGuide/Master/Table/User-Manual/Black-White-List_timecho.md deleted file mode 100644 index 483be04cd..000000000 --- a/src/zh/UserGuide/Master/Table/User-Manual/Black-White-List_timecho.md +++ /dev/null @@ -1,78 +0,0 @@ - - -# 黑白名单 - -## 1. 引言 - -IoTDB 是一款针对物联网场景设计的时间序列数据库,支持高效的数据存储、查询和分析。随着物联网技术的广泛应用,数据安全性和访问控制变得至关重要。在开放环境中,如何保证合法用户对数据的安全访问成为了一项关键挑战。白名单机制仅允许可信 IP 或用户接入,从源头缩小攻击面;黑名单功能则能在边缘与云端协同场景下实时拦截恶意 IP,阻断非法访问、SQL 注入、暴力破解及 DDoS 等威胁,为数据传输提供持续、稳定的安全保障。 - -> 注意:该功能从 V2.0.6 版本开始提供。 - -## 2. 白名单 - -### 2.1 功能描述 - -通过开启白名单功能、配置白名单列表,指定允许连接 IoTDB 的客户端地址,来限制仅在白名单范围内的客户端才能够访问 IoTDB,从而实现安全控制。 - -### 2.2 配置参数 - -管理员可以通过以下两种方式来启用/禁用白名单功能以及添加、修改、删除白名单ip/ip段。 - -* 编辑配置文件 `iotdb-system.properties`进行维护 -* 通过 set configuration 语句进行维护 - * 表模型请参考:[set configuration](../SQL-Manual/SQL-Maintenance-Statements_timecho.md#_2-2-更新配置项) - -相关参数如下: - -| 名称 | 描述 | 默认值 | 生效方式 | 示例 | -| ------------------------- | ----------------------------------------------------------------------------------- | -------- | ---------- | ------------------------------------------------------------------- | -| `enable_white_list` | 是否启用白名单功能。true:启用;false:禁用。字段值不区分大小写。 | false | 热加载 | `set enable_white_list = 'true' ` | -| `white_ip_list` | 添加、修改、删除白名单ip/ip段。支持精确匹配,支持\*通配符,多个ip之间以逗号分隔。 | 空 | 热加载 | `set white_ip_list='192.168.1.200,192.168.1.201,192.168.1.*`' | - -## 3. 
黑名单 - -### 3.1 功能描述 - -通过开启黑名单功能、配置黑名单列表,阻止某些特定 IP 地址访问数据库,来防止非法访问、SQL注入、暴力破解、DDoS攻击等安全威胁,从而确保数据传输过程中的安全性和稳定性。 - -### 3.2 配置参数 - -管理员可以通过以下两种方式来启用/禁用黑名单功能以及添加、修改、删除黑名单 ip/ip 段。 - -* 编辑配置文件 `iotdb-system.properties`进行维护 -* 通过 set configuration 语句进行维护 - * 表模型请参考:[set configuration](../SQL-Manual/SQL-Maintenance-Statements_timecho.md#_2-2-更新配置项) - -相关参数如下: - -| 名称 | 描述 | 默认值 | 生效方式 | 示例 | -| ------------------------- | ----------------------------------------------------------------------------------- | -------- | ---------- | ------------------------------------------------------------------- | -| `enable_black_list` | 是否启用黑名单功能。true:启用;false:禁用。字段值不区分大小写。 | false | 热加载 | `set enable_black_list = 'true' ` | -| `black_ip_list` | 添加、修改、删除黑名单ip/ip段。支持精确匹配,支持\*通配符,多个ip之间以逗号分隔。 | 空 | 热加载 | `set black_ip_list='192.168.1.200,192.168.1.201,192.168.1.*`' | - -## 4. 注意事项 - -1. 开启白名单后,若列表为空将拒绝所有连接,若未包含本机 IP 则拒绝本机登录。 -2. 当同一 IP 同时存在于黑白名单时,黑名单优先级更高。 -3. 系统会校验 IP 格式,无效条目将在用户连接时报错并被跳过,不影响其他有效IP的加载。 -4. 配置支持重复IP,内存中会自动去重且无提示。如需去重请手动修改。 -5. 黑/白名单规则仅对新连接生效,功能开启前的现有连接不受影响,其后续重连才会被拦截。 diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md b/src/zh/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md deleted file mode 100644 index e05d9c999..000000000 --- a/src/zh/UserGuide/Master/Table/User-Manual/Data-Sync_timecho.md +++ /dev/null @@ -1,843 +0,0 @@ - - -# 数据同步 -数据同步是工业物联网的典型需求,通过数据同步机制,可实现 IoTDB 之间的数据共享,搭建完整的数据链路来满足内网外网数据互通、端边云同步、数据迁移、数据备份等需求。 - -## 1. 功能概述 - -### 1.1 数据同步 - -一个数据同步任务包含 3 个阶段: - -![](/img/data-sync-new.png) - -- 抽取(Source)阶段:该部分用于从源 IoTDB 抽取数据,在 SQL 语句中的 source 部分定义 -- 处理(Process)阶段:该部分用于处理从源 IoTDB 抽取出的数据,在 SQL 语句中的 processor 部分定义 -- 发送(Sink)阶段:该部分用于向目标 IoTDB 发送数据,在 SQL 语句中的 sink 部分定义 - -通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。目前数据同步支持以下信息的同步,您可以在创建同步任务时对同步范围进行选择(默认选择 data.insert,即同步新写入的数据): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
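-| Sync scope | Sync content | Description |
-| --- | --- | --- |
-| all | | All scopes |
-| data | insert (incremental) | Syncs newly written data |
-| | delete | Syncs deleted data |
-| schema (metadata) | database | Syncs database create, alter, and drop operations |
-| | table | Syncs table create, alter, and drop operations |
-| | TTL (data expiry time) | Syncs the data's time to live |
-| auth (privileges) | - | Syncs user privileges and access control |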
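
As a rough illustration of how a sync scope is selected, the statement below restates the `inclusion`/`exclusion` example that appears later in this guide in a multi-line form. The pipe name `sync_all_but_auth` is hypothetical, and `127.0.0.1:6668` is simply the sample receiver address used elsewhere in this document; adjust both for a real deployment.

```sql
-- Hypothetical pipe name; the sink address is the guide's sample value.
-- 'inclusion' = 'all' selects every scope in the table above,
-- and 'exclusion' = 'auth' then leaves out the privilege scope.
CREATE PIPE IF NOT EXISTS sync_all_but_auth
WITH SOURCE (
  'inclusion' = 'all',
  'exclusion' = 'auth'
)
WITH SINK (
  'node-urls' = '127.0.0.1:6668'
);
```

If no `inclusion` is given, the guide states that the default is `data.insert`, so only newly written data is synced.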
- -### 1.2 功能限制及说明 - -- 不支持 1.x 系列版本 IoTDB 与 2.x 以及以上系列版本的 IoTDB 之间进行数据同步。 -- 在进行数据同步任务时,请避免执行任何删除操作,防止两端状态不一致。 -- 树模型与表模型的`pipe`及`pipe plugins`在设计上相互隔离,建议在创建`pipe`前先通过`show`命令查询当前`-sql_dialect`参数配置下可用的内置插件,以确保语法兼容性和功能支持。 -- 自 V2.0.9.2 版本起支持 Object 类型数据导出。 -- 当 Pipe 向接收端写入数据因字段类型不匹配而失败时,IoTDB 可按照目标端已有 schema 的字段类型对数据进行转换,并重试写入,以提高同步成功率。该能力由 `sink.exception.data.convert-on-type-mismatch` 控制,参数说明见后文 sink 参数表。 - - * 类型不匹配时的转换规则如下: - - | 源类型 | 目标类型 | 转换规则 | - | -------------------- | ----------- | -------------------------------------------------------------------------------- | - | 数值类型 | 数值类型 | 按目标数值类型进行转换,可能发生截断、精度损失或溢出。 | - | 数值类型 | BOOLEAN | `0`转换为`false`,非`0`转换为`true`。 | - | BOOLEAN | 数值类型 | `true`转换为`1`,`false`转换为`0`。 | - | TEXT、STRING、BLOB | BOOLEAN | 按字符串解析为 BOOLEAN。 | - | TEXT、STRING、BLOB | 数值类型 | 按字符串解析为目标数值类型;解析失败时写入默认值`0`、`0L`或`0.0`。 | - | TEXT、STRING、BLOB | TIMESTAMP | 按字符串解析为 TIMESTAMP;解析失败时写入默认值`0L`。 | - | TEXT、STRING、BLOB | DATE | 按字符串解析为 DATE;解析失败时写入默认日期`1970-01-01`。 | - | 非法数值 | DATE | 若无法转换为合法 DATE,则写入默认日期`1970-01-01`。 | - | DATE | TIMESTAMP | 按 UTC 转换为当天零点对应的时间戳。 | - | TIMESTAMP | DATE | 按 UTC 转换为对应日期。 | - - > 注意:自动转换基于目标端已有 schema 执行,不会自动修改目标端 schema;该能力优先保证同步继续进行,可能导致精度损失或默认值写入。 - -## 2. 
使用说明 - -数据同步任务有三种状态:RUNNING、STOPPED 和 DROPPED。任务状态转换如下图所示: - -![](/img/Data-Sync01.png) - -创建后任务会直接启动,同时当任务发生异常停止后,系统会自动尝试重启任务。 - -提供以下 SQL 语句对同步任务进行状态管理。 - -### 2.1 创建任务 - -使用 `CREATE PIPE` 语句来创建一条数据同步任务,下列属性中`PipeId`和`sink`必填,`source`和`processor`为选填项,输入 SQL 时注意 `SOURCE`与 `SINK` 插件顺序不能替换。 - -SQL 示例如下: - -```SQL -CREATE PIPE [IF NOT EXISTS] -- PipeId 是能够唯一标定任务的名字 --- 数据抽取插件,可选插件 -WITH SOURCE ( - [ = ,], -) --- 数据处理插件,可选插件 -WITH PROCESSOR ( - [ = ,], -) --- 数据连接插件,必填插件 -WITH SINK ( - [ = ,], -) -``` - -**IF NOT EXISTS 语义**:用于创建操作中,确保当指定 Pipe 不存在时,执行创建命令,防止因尝试创建已存在的 Pipe 而导致报错。 - -**注意**:V2.0.8 起,创建一个全量数据同步 Pipe (例如 Pipeid : `alldatapipe`)时,系统会自动将其拆分为两个独立的 Pipe: - -* 历史 Pipe:PipeId 为原名称加 _history后缀(如 `alldatapipe_history`),source 参数默认携带 `'realtime.enable'='false', 'inclusion'='data.insert', 'inclusion.exclusion'=''` - -* 实时 Pipe:PipeId 为原名称加 _realtime后缀(如 `alldatapipe_realtime`),source 参数默认携带 `'history.enable'='false'` ,若配置了元数据同步,则由实时 Pipe 负责发送 - -创建成功后,原 PipeId(如 `alldatapipe`)将不再作为有效标识符。在进行启动、停止、删除、查看等任务操作时,必须使用拆分后的独立 PipeId(即 `*_history`或 `*_realtime`)。操作示例见[查看任务](./Data-Sync_timecho.md#_2-5-查看任务)小节 - -### 2.2 开始任务 - -创建之后,任务直接进入运行状态,不需要执行启动任务。当使用`STOP PIPE`语句停止任务时需手动使用`START PIPE`语句来启动任务,PIPE 发生异常情况停止后会自动重新启动任务,从而开始处理数据: - -```SQL -START PIPE -``` - -### 2.3 停止任务 - -停止处理数据: - -```SQL -STOP PIPE -``` - -### 2.4 删除任务 - -删除指定任务: - -```SQL -DROP PIPE [IF EXISTS] -``` - -**IF EXISTS 语义**:用于删除操作中,确保当指定 Pipe 存在时,执行删除命令,防止因尝试删除不存在的 Pipe 而导致报错。 - -删除任务不需要先停止同步任务。 - -### 2.5 查看任务 - -查看全部任务: - -```SQL -SHOW PIPES -``` - -查看指定任务: - -```SQL -SHOW PIPE -``` - - pipe 的 show pipes 结果示例: - -```SQL -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State|PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| 
-+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -|59abf95db892428b9d01c5fa318014ea|2024-06-17T14:03:44.189|RUNNING| {}| {}|{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}| | 128| 1.03| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -``` - -其中各列含义如下: - -- **ID**:同步任务的唯一标识符 -- **CreationTime**:同步任务的创建的时间 -- **State**:同步任务的状态 -- **PipeSource**:同步数据流的来源 -- **PipeProcessor**:同步数据流在传输过程中的处理逻辑 -- **PipeSink**:同步数据流的目的地 -- **ExceptionMessage**:显示同步任务的异常信息 -- **RemainingEventCount(统计存在延迟)**:剩余 event 数,当前数据同步任务中的所有 event 总数,包括数据同步的 event,以及系统和用户自定义的 event。 -- **EstimatedRemainingSeconds(统计存在延迟)**:剩余时间,基于当前 event 个数和 pipe 处速率,预估完成传输的剩余时间。 - -示例: - -在 V2.0.8 及之后的版本中,创建一个全量数据同步任务,并查看该任务详情 - -```sql -IoTDB> create pipe alldatapipe with source('inclusion'='all','exclusion'='auth') with sink('node-urls'='127.0.0.1:6668') - -IoTDB> show pipe alldatapipe_history -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_history|2025-12-18T15:06:16.697|RUNNING|{exclusion=auth, history.enable=true, inclusion=data.insert, inclusion.exclusion=, 
realtime.enable=false}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -IoTDB> show pipe alldatapipe_realtime -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_realtime|2025-12-18T15:06:16.312|RUNNING|{exclusion=auth, history.enable=false, inclusion=all, realtime.enable=true}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -``` - -### 2.6 修改任务 - -`ALTER PIPE` 语句用于动态更新已存在的 PIPE,支持修改或替换 source、processor 及 sink 的配置。 - -```SQL -ALTER PIPE [IF EXISTS] - MODIFY/REPLACE SOURCE(...) - MODIFY/REPLACE PROCESSOR(...) - MODIFY/REPLACE SINK(...) 
-``` - -说明: - -* 执行操作不会改变 PIPE 的运行状态,等价于保留原 PipeId 的处理进度,在原进度位置创建新 PIPE。 -* source/processor/sink 的 modify/replace 参数均为非必填;若未指定任何修改参数,等价于删除当前 PIPE 后,按原配置和进度重新创建。 -* 对于指定 modify 的插件,保留该插件其他参数,仅替换或新增给定的参数。 -* 对于指定 replace 的插件,直接替换该插件所有参数。 -* 当使用 [IF EXISTS] 关键字时,即使不存在同名的 Pipe 也会返回执行成功,但是实际未执行任何操作。 - -示例: - -```SQL -ALTER PIPE A2B REPLACE SINK ('sink'='iotdb-thrift-sink', 'node-urls' = '127.0.0.1:6668'); -``` - -### 2.7 同步插件 - -为了使得整体架构更加灵活以匹配不同的同步场景需求,我们支持在同步任务框架中进行插件组装。系统为您预置了一些常用插件可直接使用,同时您也可以自定义 processor 插件 和 Sink 插件,并加载至 IoTDB 系统进行使用。查看系统中的插件(含自定义与内置插件)可以用以下语句: - -```SQL -SHOW PIPEPLUGINS -``` - -返回结果如下: - -```SQL -IoTDB> SHOW PIPEPLUGINS -+---------------------+----------+-----------------------------------------------------------------------------------------+---------+----------------+ -| PluginName|PluginType| ClassName|PluginJar|ExceptionMessage| -+---------------------+----------+-----------------------------------------------------------------------------------------+---------+----------------+ -| DO-NOTHING-PROCESSOR| Builtin|org.apache.iotdb.commons.pipe.agent.plugin.builtin.processor.donothing.DoNothingProcessor| | | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.donothing.DoNothingSink| | | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.iotdb.airgap.IoTDBAirGapSink| | | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.source.iotdb.IoTDBSource| | | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.iotdb.thrift.IoTDBThriftSink| | | -|IOTDB-THRIFT-SSL-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.iotdb.thrift.IoTDBThriftSslSink| | | -| TSFILE-LOCAL-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.tsfile.PipeTsFileLocalSink| | | -| WRITE-BACK-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.writeback.WriteBackSink| | | 
-+---------------------+----------+-----------------------------------------------------------------------------------------+---------+----------------+ -``` - -预置插件详细介绍如下(各插件的详细参数可参考本文[参数说明](#参考参数说明)): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
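-| Type | Custom plugins | Plugin name | Description |
-| --- | --- | --- | --- |
-| source plugin | Not supported | iotdb-source | The default extractor plugin; extracts historical or real-time data from IoTDB |
-| processor plugin | Supported | do-nothing-processor | The default processor plugin; performs no processing on incoming data |
-| sink plugin | Supported | do-nothing-sink | Performs no processing on outgoing data |
-| sink plugin | Supported | iotdb-thrift-sink | The default sink plugin, for data transfer between IoTDB instances (V2.0.0 and above). Transfers data over the Thrift RPC framework with a multi-threaded async non-blocking IO model; offers high transfer performance and is especially suited to a distributed target |
-| sink plugin | Supported | iotdb-air-gap-sink | For synchronizing data from IoTDB to IoTDB (V2.0.0 and above) across a unidirectional data gateway (air gap). Supported gateway models include 南瑞 (NARI) Syskeeper 2000 |
-| sink plugin | Supported | iotdb-thrift-ssl-sink | For data transfer between IoTDB instances (V2.0.0 and above). Transfers data over the Thrift RPC framework with a multi-threaded sync blocking IO model; suited to scenarios with higher security requirements |
-| sink plugin | Supported | write-back-sink | A data write-back plugin for IoTDB (V2.0.2 and above) that serves the role of a materialized view |
-| sink plugin | Supported | opc-ua-sink | A data transfer plugin for IoTDB (V2.0.2 and above) that supports the OPC UA protocol, in both Client/Server and Pub/Sub communication modes |
-| sink plugin | Supported | tsfile-local-sink | For IoTDB (V2.0.9.2 and above); exports Object data to the local file system of the server hosting IoTDB |
-| sink plugin | Supported | tsfile-remote-sink | For IoTDB (V2.0.9.2 and above); sends Object data to a remote server over SSH/SCP |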
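-
-## 3. Usage Examples
-
-### 3.1 Full Data Synchronization
-
-This example synchronizes all data from one IoTDB instance to another. The data link is shown below:
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png)
-
-Here we create a synchronization task named A2B that synchronizes the full data set from IoTDB A to IoTDB B. It uses the built-in iotdb-thrift-sink sink plugin, and node-urls must be set to the URL of the data service port of a DataNode in the target IoTDB, as in the following statement:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### 3.2 Partial Data Synchronization
-
-This example synchronizes the data of a historical time range (08:00 on August 23, 2023 to 08:00 on October 23, 2023) to another IoTDB. The data link is shown below:
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png)
-
-Here we create a synchronization task named A2B. First, the range of data to transfer is defined in the source. Because the data is historical (that is, data that existed before the synchronization task was created), the source must specify the start and end times, start-time and end-time, as well as the transfer mode, mode.streaming. node-urls is set to the URL of the data service port of a DataNode in the target IoTDB.
-
-The full statement is as follows:
-
-```SQL
-create pipe A2B
-WITH SOURCE (
-  'source'= 'iotdb-source',
-  'mode.streaming' = 'true',  -- extraction mode for data inserted after the pipe is created: stream (true) or batch (false)
-  'database-name'='testdb.*',  -- scope of the data to synchronize
-  'start-time' = '2023.08.23T08:00:00+00:00',  -- start event time of the data to synchronize, inclusive
-  'end-time' = '2023.10.23T08:00:00+00:00'  -- end event time of the data to synchronize, inclusive
-)
-with SINK (
-  'sink'='iotdb-thrift-async-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### 3.3 Bidirectional Data Transfer
-
-This example shows two IoTDB instances acting as an active-active (double-living) pair. The data link is shown below:
-
-![](/img/1706698592139.jpg)
-
-To keep data from circulating endlessly, the parameter `source.mode.double-living` must be set to `true` on both A and B, which means data that arrived through another pipe is not forwarded again.
-
-The full statements are as follows:
-
-Execute the following statement on IoTDB A:
-
-```SQL
-create pipe AB
-with source (
-  'source.mode.double-living' ='true'  -- do not forward data written by other pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-Execute the following statement on IoTDB B:
-
-```SQL
-create pipe BA
-with source (
-  'source.mode.double-living' ='true'  -- do not forward data written by other pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6667', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```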
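The loop-avoidance rule in the bidirectional setup can be illustrated with a small model. The following is only a sketch under stated assumptions, not IoTDB code: the `Node` class, the `from_pipe` flag, and the `hops_left` safety cap are all hypothetical names invented for this illustration. It shows why two pipes pointing at each other keep re-sending the same row unless each side declines to forward data that arrived through a pipe.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for one IoTDB instance with a single outgoing pipe."""
    name: str
    double_living: bool
    rows: List[int] = field(default_factory=list)
    peer: Optional["Node"] = None

    def write(self, value: int, from_pipe: bool = False, hops_left: int = 10) -> None:
        self.rows.append(value)
        # With double-living enabled, data that arrived through a pipe is not forwarded.
        if from_pipe and self.double_living:
            return
        # hops_left is only a safety cap so the looping case terminates in this sketch.
        if self.peer is not None and hops_left > 0:
            self.peer.write(value, from_pipe=True, hops_left=hops_left - 1)


def simulate(double_living: bool) -> tuple:
    """Write one row to A and return how many copies A and B end up holding."""
    a = Node("A", double_living)
    b = Node("B", double_living)
    a.peer, b.peer = b, a
    a.write(42)  # a client writes one row to A
    return len(a.rows), len(b.rows)


if __name__ == "__main__":
    print("double-living on :", simulate(True))
    print("double-living off:", simulate(False))
```

With the flag on, each side holds exactly one copy of the row. With it off, the row bounces between the two nodes until the artificial hop cap stops it, which is the endless circulation the `source.mode.double-living` setting is meant to prevent.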
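-### 3.4 Edge-Cloud Data Transfer
-
-This example shows edge-cloud data transfer among multiple IoTDB instances: data from clusters B, C, and D is each synchronized to cluster A. The data link is shown below:
-
-![](/img/dataSync03.png)
-
-To synchronize the data of clusters B, C, and D to A, the pipes BA, CA, and DA must each restrict their scope with database-name and table-name. The full statements are as follows:
-
-Execute the following statement on IoTDB B to synchronize data from B to A:
-
-```SQL
-create pipe BA
-with source (
-  'database-name'='db_b.*',  -- restrict the scope
-  'table-name'='.*',         -- optionally match all tables
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6667', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-Execute the following statement on IoTDB C to synchronize data from C to A:
-
-```SQL
-create pipe CA
-with source (
-  'database-name'='db_c.*',  -- restrict the scope
-  'table-name'='.*',         -- optionally match all tables
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-Execute the following statement on IoTDB D to synchronize data from D to A:
-
-```SQL
-create pipe DA
-with source (
-  'database-name'='db_d.*',  -- restrict the scope
-  'table-name'='.*',         -- optionally match all tables
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6669', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### 3.5 Cascaded Data Transfer
-
-This example shows cascaded data transfer among multiple IoTDB instances: data is synchronized from cluster A to cluster B, and then on to cluster C. The data link is shown below:
-
-![](/img/1706698610134.jpg)
-
-
-Execute the following statement on IoTDB A to synchronize data from A to B:
-
-```SQL
-create pipe AB
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-Execute the following statement on IoTDB B to synchronize data from B to C:
-
-```SQL
-create pipe BC
-with source (
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6669', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### 3.6 Cross-Gateway Data Transfer
-
-This example synchronizes data from one IoTDB to another through a unidirectional gateway (air gap). The data link is shown below:
-
-![](/img/cross-network-gateway.png)
-
-This scenario uses the iotdb-air-gap-sink sink plugin. After configuring the gateway, execute the following statement on IoTDB A, where node-urls is the URL, as configured on the gateway, of the data service port of a DataNode in the target IoTDB:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-air-gap-sink',
-  'node-urls' = '10.53.53.53:9780', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-**Note:**
-* When creating a pipe for cross-gateway synchronization, make sure the target user already exists on the receiver. If the user is missing on the receiver when the pipe is created, the earlier data cannot be synchronized even if the user is created afterwards.
-* The currently supported gateway models are listed in the table below.
-> For other gateway models, contact Timecho (天谋) sales to confirm whether they are supported.
-
-| Gateway type | Gateway model | Response packet restriction | Send restriction |
-| ------------ | -------------------------------------------- | ----------------- | --------------- |
-| Forward | 南瑞 (NARI) Syskeeper-2000, forward type | All-0 / all-1 bytes | Unrestricted |
-| Forward | 许继 in-house gateway | All-0 / all-1 bytes | Unrestricted |
-| Direction not marked | 威努特 Security Isolation and Information Exchange System | Unrestricted | Unrestricted |
-| Forward | 科东 StoneWall-2000 network security isolation device (forward type) | Unrestricted | Unrestricted |
-| Reverse | 南瑞 (NARI) Syskeeper-2000, reverse type | All-0 / all-1 bytes | Must conform to the E-language format |
-| Direction not marked | 迪普科技 ISG5000 | Unrestricted | Unrestricted |
-| Direction not marked | 熙羚 Security Isolation and Information Exchange System XL—GAP | Unrestricted | Unrestricted |
-
-### 3.7 Compressed Synchronization
-
-IoTDB supports specifying a data compression method during synchronization. Setting the `compressor` parameter compresses data in real time as it is transferred. `compressor` currently supports five algorithms: snappy / gzip / lz4 / zstd / lzma2. Several algorithms can be combined, in which case they are applied in the configured order. `rate-limit-bytes-per-second` (supported in V1.3.3 and later) is the maximum number of bytes allowed to be transferred per second, measured after compression; a value less than 0 means no limit.
-
-For example, create a synchronization task named A2B:
-
-```SQL
-create pipe A2B
-with sink (
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-  'compressor' = 'snappy,lz4',  -- compression algorithms, applied in the configured order
-  'rate-limit-bytes-per-second'='1048576'  -- maximum number of bytes allowed to be transferred per second
-)
-```
-
-### 3.8 Encrypted Synchronization
-
-IoTDB supports SSL encryption during synchronization, so that data is transferred securely between IoTDB instances. Configuring the SSL-related parameters, namely the trust store path (`ssl.trust-store-path`) and password (`ssl.trust-store-pwd`), ensures that data is protected by SSL encryption while it is synchronized.
-
-For example, create a synchronization task named A2B:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-ssl-sink',
-  'node-urls'='127.0.0.1:6667',  -- URL of the data service port of a DataNode in the target IoTDB
-  'ssl.trust-store-path'='pki/trusted',  -- path of the trust store certificate required to connect to the target DataNode
-  'ssl.trust-store-pwd'='root'  -- password of the trust store certificate required to connect to the target DataNode
-)
-```
-
-### 3.9 Exporting Object-Type Data
-
-Starting from V2.0.9.2, IoTDB supports exporting Object-type data. Two modes are available through the sink parameters:
-
-* Local mode (local export): exports the data to the local file system of the server hosting IoTDB.
-* SCP mode (remote transfer): sends the data to a remote server over SSH/SCP.
-
-**Example 1: Local export**
-
-Use the built-in `tsfile-local-sink` plugin directly in a CREATE PIPE statement to export the data, for example:
-
-```SQL
-CREATE PIPE tsfile_export_local
-WITH SOURCE (
-  'source' = 'iotdb-source',
-  'table-name' = 'test_table'
-)
-WITH PROCESSOR (
-  'processor' = 'do-nothing-processor'
-)
-WITH SINK (
-  'sink' = 'tsfile-local-sink',  -- required; specifies the sink type
-  'sink.local.target-path' = '/data/backup/export_2024',  -- export target path
-  'sink.rate-limit-bytes-per-second' = '10485760'  -- rate limit of 10 MB/s
-);
-```
-
-**Example 2: Remote transfer**
-
-1. Contact the Timecho team to obtain the jar package for the `tsfile-remote-sink` plugin, for example `tsfile-remote-sink-<version>-jar-with-dependencies.jar`, and place it in a path that IoTDB can access (for example, on every DataNode host).
-2. Register the plugin with the following statement:
-
-```SQL
-CREATE PIPEPLUGIN tsfile_remote_sink
-AS 'org.apache.iotdb.pipe.plugin.sink.tsfile.PipeTsFileRemoteSink'
-USING URI 'file:///path/to/tsfile-remote-sink-<version>-jar-with-dependencies.jar';
-```
-
-3. Create the pipe:
-
-```SQL
-CREATE PIPE tsfile_export_scp
-WITH SOURCE (
-  'source' = 'iotdb-source',
-  'table-name' = 'test_table'
-)
-WITH PROCESSOR (
-  'processor' = 'do-nothing-processor'
-)
-WITH SINK (
-  'sink' = 'tsfile_remote_sink',
-  'sink.file-mode' = 'scp',  -- use SCP mode
-  'sink.scp.host' = '192.168.1.100',  -- remote host IP
-  'sink.scp.port' = '22',  -- SSH port
-  'sink.scp.user' = 'backup_user',  -- SSH user name
-  'sink.scp.password' = 'ComplexPass123!',  -- SSH password
-  'sink.scp.remote-path' = '/remote/archive/',  -- remote storage path
-  'sink.rate-limit-bytes-per-second' = '10485760'  -- rate limit of 10 MB/s
-);
-```
-
-Note: when exporting Object-type data remotely, take either of the following measures to avoid handshake errors, connection failures, or frequent pipe restarts:
-* Lower the sink.scp.object-parallelism parameter as appropriate.
-* Raise MaxStartups on the target machine as needed, then run sshd reload or sshd restart for the change to take effect.
-
-**TSFile and Object layout exported by the sink:**
-
-```Bash
-target_dir
-  ├── tsfile.tsfile
-  └── tsfile/ (matches the TSFile name)
-      ├── regionID/tableName/tag1/tag2/field/timestamp1.bin
-      ├── regionID/tableName/tag1/tag2/field/timestamp2.bin
-      └── regionID/tableName1/tag3/tag4/field/timestamp1.bin
-```
-
-
-## Reference: Notes
-
-The data synchronization parameters, such as the directory where synchronized data is stored, can be adjusted in the IoTDB configuration file (`iotdb-system.properties`). The complete configuration is as follows:
-
-```Properties
-# pipe_receiver_file_dir
-# If this property is unset, system will save the data in the default relative path directory under the IoTDB folder(i.e., %IOTDB_HOME%/${cn_system_dir}/pipe/receiver).
-# If it is absolute, system will save the data in the exact location it points to.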
-# If it is relative, system will save the data in the relative path directory it indicates under the IoTDB folder. -# Note: If pipe_receiver_file_dir is assigned an empty string(i.e.,zero-size), it will be handled as a relative path. -# effectiveMode: restart -# For windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is absolute. Otherwise, it is relative. -# pipe_receiver_file_dir=data\\confignode\\system\\pipe\\receiver -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_receiver_file_dir=data/confignode/system/pipe/receiver - -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# effectiveMode: first_start -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# effectiveMode: restart -# Datatype: int -pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# effectiveMode: restart -# Datatype: int -pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# effectiveMode: restart -# Datatype: int -pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# effectiveMode: restart -# Datatype: int -pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. 
-# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# effectiveMode: restart -# Datatype: Boolean -pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# Datatype: int -# effectiveMode: restart -pipe_air_gap_receiver_port=9780 - -# The total bytes that all pipe sinks can transfer per second. -# When given a value less than or equal to 0, it means no limit. -# default value is -1, which means no limit. -# effectiveMode: hot_reload -# Datatype: double -pipe_all_sinks_rate_limit_bytes_per_second=-1 -``` - -## 参考:参数说明 - -### source 参数 - -| **参数** | **描述** | **value 取值范围** | **是否必填** | **默认取值** | -|-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------| ------------ | ------------------------------- | -| source | iotdb-source | String: iotdb-source | 必填 | - | -| inclusion | 用于指定数据同步任务中需要同步范围,分为数据,元数据和权限 | String:all, data(insert,delete), schema(database,table,ttl),auth | 选填 | data.insert | -| inclusion.exclusion | 用于从 inclusion 指定的同步范围内排除特定的操作,减少同步的数据量 | String:all, data(insert,delete), schema(database,table,ttl), auth | 选填 | 空字符串 | -| mode.streaming | 此参数指定时序数据写入的捕获来源。适用于 `mode.streaming`为 `false` 模式下的场景,决定`inclusion`中`data.insert`数据的捕获来源。提供两种捕获策略:true: 动态选择捕获的类型。系统将根据下游处理速度,自适应地选择是捕获每个写入请求还是仅捕获 TsFile 文件的封口请求。当下游处理速度快时,优先捕获写入请求以减少延迟;当处理速度慢时,仅捕获文件封口请求以避免处理堆积。这种模式适用于大多数场景,能够实现处理延迟和吞吐量的最优平衡。false:固定按批捕获方式。仅捕获 TsFile 文件的封口请求,适用于资源受限的应用场景,以降低系统负载。注意,pipe 启动时捕获的快照数据只会以文件的方式供下游处理。 | Boolean: true / false | 否 | true | -| mode.strict | 在使用 time / path / database-name / table-name 
参数过滤数据时,是否需要严格按照条件筛选:`true`: 严格筛选。系统将完全按照给定条件过滤筛选被捕获的数据,确保只有符合条件的数据被选中。`false`:非严格筛选。系统在筛选被捕获的数据时可能会包含一些额外的数据,适用于性能敏感的场景,可降低 CPU 和 IO 消耗。 | Boolean: true / false | 否 | true | -| mode.snapshot | 此参数决定时序数据的捕获方式,影响`inclusion`中的`data`数据。提供两种模式:`true`:静态数据捕获。启动 pipe 时,会进行一次性的数据快照捕获。当快照数据被完全消费后,**pipe 将自动终止(DROP PIPE SQL 会自动执行)**。`false`:动态数据捕获。除了在 pipe 启动时捕获快照数据外,还会持续捕获后续的数据变更。pipe 将持续运行以处理动态数据流。 | Boolean: true / false | 否 | false | -| database-name | 当用户连接指定的 sql_dialect 为 table 时可以指定。此参数决定时序数据的捕获范围,影响`inclusion`中的`data`数据。表示要过滤的数据库的名称。它可以是具体的数据库名,也可以是 Java 风格正则表达式来匹配多个数据库。默认情况下,匹配所有的库。 | String:数据库名或数据库正则模式串,可以匹配未创建的、不存在的库 | 否 | ".*" | -| table-name | 当用户连接指定的 sql_dialect 为 table 时可以指定。此参数决定时序数据的捕获范围,影响`inclusion`中的`data`数据。表示要过滤的表的名称。它可以是具体的表名,也可以是 Java 风格正则表达式来匹配多个表。默认情况下,匹配所有的表。 | String:数据表名或数据表正则模式串,可以是未创建的、不存在的表 | 否 | ".*" | -| start-time | 此参数决定时序数据的捕获范围,影响`inclusion`中的`data`数据。当数据的 event time 大于等于该参数时,数据会被筛选出来进入流处理 pipe。 | Long: [Long.MIN_VALUE, Long.MAX_VALUE] (unix 裸时间戳)或 String:IoTDB 支持的 ISO 格式时间戳 | 否 | Long.MIN_VALUE(unix 裸时间戳) | -| end-time | 此参数决定时序数据的捕获范围,影响`inclusion`中的`data`数据。当数据的 event time 小于等于该参数时,数据会被筛选出来进入流处理 pipe。 | Long: [Long.MIN_VALUE, Long.MAX_VALUE](unix 裸时间戳)或String:IoTDB 支持的 ISO 格式时间戳 | 否 | Long.MAX_VALUE(unix 裸时间戳) | -| mode.double-living | 是否开启全量双活模式,开启后将忽略`-sql_dialect`连接方式,树表模型数据均会被捕获,且不会转发由另一pipe同步而来的数据。 | Boolean: true / false | 否 | false | -| mods | 同 mods.enable,是否发送 tsfile 的 mods 文件 | Boolean: true / false | 选填 | false | -| skipIf | 出现哪些错误可以跳过,当前只有无权限的错误 | String:no-privileges | 选填 | no-privileges | - -> 💎 **说明:数据抽取模式 mode.streaming 取值 true 和 false 的差异** -> - **true(推荐)**:该取值下,任务将对数据进行实时处理、发送,其特点是高时效、低吞吐 -> - **false**:该取值下,任务将对数据进行批量(按底层数据文件)处理、发送,其特点是低时效、高吞吐 - - -### sink 参数 - -#### iotdb-thrift-sink - -| **参数** | **描述** | **value 取值范围** | **是否必填** | **默认取值** | 
-| --- | --- | --- | --- | --- |
-| sink | iotdb-thrift-sink or iotdb-thrift-async-sink | String: iotdb-thrift-sink or iotdb-thrift-async-sink | Required | - |
-| node-urls | URLs of the data service ports of any number of DataNodes in the target IoTDB (note that a synchronization task cannot forward to its own service) | String. Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - |
-| user/username | User name used to connect to the receiver; the user must hold the corresponding operation privileges | String | Optional | root |
-| password | Password for the user name used to connect to the receiver; the user must hold the corresponding operation privileges | String | Optional | root |
-| batch.enable | Whether to enable batched log sending, which raises transfer throughput and lowers IOPS | Boolean: true, false | Optional | true |
-| batch.max-delay-seconds | Takes effect when batched log sending is enabled; the longest a batch waits before being sent (unit: s) | Integer | Optional | 1 |
-| batch.max-delay-ms | Takes effect when batched log sending is enabled; the longest a batch waits before being sent (unit: ms) (supported in V2.0.5 and later) | Integer | Optional | 1 |
-| batch.size-bytes | Takes effect when batched log sending is enabled; the maximum size of one batch (unit: byte) | Long | Optional | 16*1024*1024 |
-| compressor | The RPC compression algorithms to use; several can be configured and are applied to each request in order | String: snappy / gzip / lz4 / zstd / lzma2 | Optional | "" |
-| compressor.zstd.level | When the selected RPC compression algorithm is zstd, sets the zstd compression level | Int: [-131072, 22] | Optional | 3 |
-| rate-limit-bytes-per-second | Maximum number of bytes allowed to be transferred per second, measured after compression (if compression is used); a value less than 0 means no limit | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | Optional | -1 |
-| load-tsfile-strategy | When data is synchronized as files, whether the receiver waits for the result of its local load tsfile before responding to the sender. sync: wait for the local load tsfile result; async: do not wait for the local load tsfile result. | String: sync / async | Optional | sync |
-| format | Payload format for data transfer. Options: hybrid: keeps whatever format the processor passes on (tsfile or tablet), with no conversion by the sink; tsfile: forces conversion to tsfile before sending, useful for scenarios such as data file backup; tablet: forces conversion to tablet before sending, useful when the data types of sender and receiver are not fully compatible (to reduce errors). | String: hybrid / tsfile / tablet | Optional | hybrid |
-| mark-as-general-write-request | Controls whether data forwarded by an external pipe can be synchronized between double-living pipes (set on the sender of the pipe external to the double-living pair) (supported in V2.0.5 and later) | Boolean: true / false. True: can be synchronized; False: cannot be synchronized | Optional | False |
-| exception.data.convert-on-type-mismatch | Whether to convert data when the type on the receiver differs | Boolean: true / false | Optional | true |
-
-
-
-#### iotdb-air-gap-sink
-
-| **Parameter** | **Description** | **Value range** | **Required** | **Default** |
-| --- | --- | --- | --- | --- |
-| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | Required | - |
-| node-urls | URLs of the data service ports of any number of DataNodes in the target IoTDB | String. Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - |
-| user/username | User name used to connect to the receiver; the user must hold the corresponding operation privileges | String | Optional | root |
-| password | Password for the user name used to connect to the receiver; the user must hold the corresponding operation privileges | String | Optional | TimechoDB@2021 (root before V2.0.6.x) |
-| compressor | The RPC compression algorithms to use; several can be configured and are applied to each request in order | String: snappy / gzip / lz4 / zstd / lzma2 | Optional | "" |
-| compressor.zstd.level | When the selected RPC compression algorithm is zstd, sets the zstd compression level | Int: [-131072, 22] | Optional | 3 |
-| rate-limit-bytes-per-second | Maximum number of bytes allowed to be transferred per second, measured after compression (if compression is used); a value less than 0 means no limit | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | Optional | -1 |
-| load-tsfile-strategy | When data is synchronized as files, whether the receiver waits for the result of its local load tsfile before responding to the sender. sync: wait for the local load tsfile result; async: do not wait for the local load tsfile result. | String: sync / async | Optional | sync |
-| air-gap.handshake-timeout-ms | Timeout of the handshake request when the sender and receiver first try to establish a connection (unit: ms) | Integer | Optional | 5000 |
-| exception.data.convert-on-type-mismatch | Whether to convert data when the type on the receiver differs | Boolean: true / false | Optional | true |
-
-#### iotdb-thrift-ssl-sink
-
-| **Parameter** | **Description** | **Value range** | **Required** | **Default** |
sync:等待本地的 load tsfile 执行结果返回;
async:不等待本地的 load tsfile 执行结果返回。 | String: sync / async | 选填 | sync | -| ssl.trust-store-path | 连接目标端 DataNode 所需的 trust store 证书路径 | String.Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 必填 | - | -| ssl.trust-store-pwd | 连接目标端 DataNode 所需的 trust store 证书密码 | Integer | 必填 | - | -| format | 数据传输的payload格式, 可选项包括:
- hybrid: 取决于 processor 传递过来的格式(tsfile或tablet),sink不做任何转换。
- tsfile:强制转换成tsfile发送,可用于数据文件备份等场景。
- tablet:强制转换成tsfile发送,可用于发送端/接收端数据类型不完全兼容时的数据同步(以减少报错)。 | String: hybrid / tsfile / tablet | 选填 | hybrid | -| mark-as-general-write-request | 该参数可控制双活 pipe 之间能否同步外部 pipe 转发的数据(配置到双活外部 pipe 的发送端) | Boolean: true / false。True:能同步;False:不能同步; | 选填 | False | -| exception.data.convert-on-type-mismatch | 接收端类型不同时是否转换 | Boolean: true / false | 选填 | true | - -#### write-back-sink - -| **参数** | **描述** | **value 取值范围** | **是否必填** | **默认取值** | -| ---------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | -------- | -| sink | write-back-sink | String: write-back-sink | 必填 | - | -| user/username | 用于写回的用户 | String:用户名 | 选填 | root | -| password | 用于写回的密码 | String:密码 | 选填 | root123 | -| user-id | 用户对应的 userId | String | 选填 | root | -| cli-hostname | 用户对应的 cli 主机名 | String | 选填 | root | -| use-event-user-name | 如果 event 中含有另一个用户的用户名,是否使用该用户名(现在没有 external source 基本不需要) | Boolean: true / false | 选填 | false | - -#### opc-ua-sink - -| **参数** | **描述** | **value 取值范围** | **是否必填** | **默认值** | -|------------------------------------|----------------------------------|-----------------------------| ---------- |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| sink | OPC UA SINK | String: opc-ua-sink | 必填 | - | -| sink.opcua.model | OPC UA 使用的模式 | String: client-server / pub-sub | 选填 | pub-sub | -| sink.opcua.tcp.port | OPC UA 的 TCP 端口 | Integer: [0, 65536] | 选填 | 12686 | -| sink.opcua.https.port | OPC UA 的 HTTPS 端口 | Integer: [0, 65536] | 选填 | 8443 | -| sink.opcua.security.dir | OPC UA 的密钥及证书目录 | String: Path,支持绝对及相对目录 | 选填 | iotdb 相关 DataNode 的 conf 目录下的 opc_security 文件夹 ``。
如无 iotdb 的 conf 目录(例如 IDEA 中启动 DataNode),则为用户主目录下的 iotdb_opc_security 文件夹 `` | -| sink.opcua.enable-anonymous-access | OPC UA 是否允许匿名访问 | Boolean | 选填 | true | -| sink.user | 用户,这里指 OPC UA 的允许用户 | String | 选填 | root | -| sink.password | 密码,这里指 OPC UA 的允许密码 | String | 选填 | TimechoDB@2021,V2.0.6.x之前为 root | -| sink.opcua.placeholder | 当ID列的值出现null时,用于替代null映射路径的占位字符串 | String | 选填 | "null" | - -#### tsfile-local-sink - -| **参数** | **描述** | **value 取值范围** | **是否必填** | **默认值** | -|-----------------------------------|-----------------------------------------|---------------------------|------|---------| -| sink | 组件名称 | String: tsfile-local-sink | 是 | - | -| sink.local.target-path | 本地目标目录 | String | 是 | - | -| sink.rate-limit-bytes-per-second | 限速阈值。单位:字节/秒。开启限速时生效。rate-limit<=0不限速 | Long | 否 | 0 | - -#### tsfile-remote-sink - -| **参数** | **描述** | **value 取值范围** | **是否必填** | **默认值** | -|-----------------------------------|--------------------------------------|----------------------------|----------|--------------| -| sink | 组件名称 | String: tsfile-remote-sink | 是 | - | -| sink.scp.host | 远程主机 IP | String | 是 | - | -| sink.scp.port | 远程 SSH 端口 | Long | 否 | 22 | -| sink.scp.user | 远程 SSH 用户 | String | 是 | - | -| sink.scp.password | 远程 SSH 密码 | String | 是 | - | -| sink.scp.remote-path | 远程目标目录 | String | 是 | - | -| sink.rate-limit-bytes-per-second | 单位:字节/秒。开启限速时生效。rate-limit<=0不限速 | Long | 否 | 0 | -| sink.scp.object-parallelism | object文件发送最大并行度 | Long | 否 | `min(cpu/4,16)` | -| sink.scp.object-batch-size-bytes | 单次异步线程发送的最大Object文件大小, 单位 MB | Long | 否 | 200 | diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Maintenance-statement_timecho.md b/src/zh/UserGuide/Master/Table/User-Manual/Maintenance-statement_timecho.md deleted file mode 100644 index c829350db..000000000 --- a/src/zh/UserGuide/Master/Table/User-Manual/Maintenance-statement_timecho.md +++ /dev/null @@ -1,957 +0,0 @@ - -# 运维语句 - -## 1. 
状态查看 - -### 1.1 查看连接的模型 - -**含义**:返回当前连接的 sql_dialect 是树模型/表模型。 - -#### 语法: - -```SQL -showCurrentSqlDialectStatement - : SHOW CURRENT_SQL_DIALECT - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW CURRENT_SQL_DIALECT -``` - -执行结果如下: - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` -### 1.2 查看登录的用户名 - -**含义**:返回当前登录的用户名。 - -#### 语法: - -```SQL -showCurrentUserStatement - : SHOW CURRENT_USER - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW CURRENT_USER -``` - -执行结果如下: - -```SQL -+-----------+ -|CurrentUser| -+-----------+ -| root| -+-----------+ -``` - -### 1.3 查看连接的数据库名 - -**含义**:返回当前连接的数据库名,若没有执行过 use 语句,则为 null。 - -#### 语法: - -```SQL -showCurrentDatabaseStatement - : SHOW CURRENT_DATABASE - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW CURRENT_DATABASE; - -IoTDB> USE test; - -IoTDB> SHOW CURRENT_DATABASE; -``` - -执行结果如下: - -```SQL -+---------------+ -|CurrentDatabase| -+---------------+ -| null| -+---------------+ - -+---------------+ -|CurrentDatabase| -+---------------+ -| test| -+---------------+ -``` - -### 1.4 查看集群版本 - -**含义**:返回当前集群的版本。 - -#### 语法: - -```SQL -showVersionStatement - : SHOW VERSION - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW VERSION -``` - -执行结果如下: - -```SQL -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.1.2| 1ca4008| -+-------+---------+ -``` - -### 1.5 查看集群关键参数 - -**含义**:返回当前集群的关键参数。 - -#### 语法: - -```SQL -showVariablesStatement - : SHOW VARIABLES - ; -``` - -关键参数如下: - -1. **ClusterName**:当前集群的名称。 -2. **DataReplicationFactor**:数据副本的数量,表示每个数据分区(DataRegion)的副本数。 -3. **SchemaReplicationFactor**:元数据副本的数量,表示每个元数据分区(SchemaRegion)的副本数。 -4. **DataRegionConsensusProtocolClass**:数据分区(DataRegion)使用的共识协议类。 -5. **SchemaRegionConsensusProtocolClass**:元数据分区(SchemaRegion)使用的共识协议类。 -6. **ConfigNodeConsensusProtocolClass**:配置节点(ConfigNode)使用的共识协议类。 -7. **TimePartitionOrigin**:数据库时间分区的起始时间戳。 -8. **TimePartitionInterval**:数据库的时间分区间隔(单位:毫秒)。 -9. **ReadConsistencyLevel**:读取操作的一致性级别。 -10. 
**SchemaRegionPerDataNode**:数据节点(DataNode)上的元数据分区(SchemaRegion)数量。 -11. **DataRegionPerDataNode**:数据节点(DataNode)上的数据分区(DataRegion)数量。 -12. **SeriesSlotNum**:数据分区(DataRegion)的序列槽(SeriesSlot)数量。 -13. **SeriesSlotExecutorClass**:序列槽的实现类。 -14. **DiskSpaceWarningThreshold**:磁盘空间告警阈值(单位:百分比)。 -15. **TimestampPrecision**:时间戳精度。 - -#### 示例: - -```SQL -IoTDB> SHOW VARIABLES -``` - -执行结果如下: - -```SQL -+----------------------------------+-----------------------------------------------------------------+ -| Variable| Value| -+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - -### 1.6 查看集群ID - -**含义**:返回当前集群的ID。 - -#### 语法: - -```SQL -showClusterIdStatement - : SHOW (CLUSTERID | CLUSTER_ID) - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW CLUSTER_ID -``` - -执行结果如下: - -```SQL -+------------------------------------+ -| ClusterId| -+------------------------------------+ -|40163007-9ec1-4455-aa36-8055d740fcda| -``` - -### 1.7 查看客户端直连的 DataNode 进程所在服务器的时间 - -#### 语法: - -**含义**:返回当前客户端直连的 DataNode 进程所在服务器的时间。 - -```SQL -showCurrentTimestampStatement - : SHOW CURRENT_TIMESTAMP - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW CURRENT_TIMESTAMP -``` - -执行结果如下: - 
-```SQL
-+-----------------------------+
-| CurrentTimestamp|
-+-----------------------------+
-|2025-02-17T11:11:52.987+08:00|
-+-----------------------------+
-```
-
-### 1.8 View Running Queries
-
-**Meaning**: Displays information about all queries that are currently executing.
-
-> For more on using system tables, see [System Tables](../Reference/System-Tables_timecho.md)
-
-#### Syntax:
-
-```SQL
-showQueriesStatement
-    : SHOW (QUERIES | QUERY PROCESSLIST)
-        (WHERE where=booleanExpression)?
-        (ORDER BY sortItem (',' sortItem)*)?
-        limitOffsetClause
-    ;
-```
-
-**Parameters**:
-
-1. **WHERE** clause: the column being filtered must be a column that exists in the result set.
-2. **ORDER BY** clause: `sortKey` must be a column that exists in the result set.
-3. **limitOffsetClause**:
-   - **Meaning**: limits the number of rows returned in the result set.
-   - **Format**: `LIMIT <offset>, <limit>`, where `<offset>` is the offset and `<limit>` is the number of rows to return.
-4. Columns of the **QUERIES** table:
-   - **query_id**: ID of the query statement
-   - **start_time**: timestamp at which the query started, in the same precision as the system timestamp precision
-   - **datanode_id**: ID of the DataNode that issued the query
-   - **elapsed_time**: execution time of the query, in seconds
-   - **statement**: SQL text of the query
-   - **user**: user who issued the query
-
-#### Example:
-
-```SQL
-IoTDB> SHOW QUERIES WHERE elapsed_time > 30
-```
-
-The result is as follows:
-
-```SQL
-+-----------------------+-----------------------------+-----------+------------+------------+----+
-| query_id| start_time|datanode_id|elapsed_time| statement|user|
-+-----------------------+-----------------------------+-----------+------------+------------+----+
-|20250108_101015_00000_1|2025-01-08T18:10:15.935+08:00| 1| 32.283|show queries|root|
-+-----------------------+-----------------------------+-----------+------------+------------+----+
-```
-
-### 1.9 View Region Information
-
-**Meaning**: Returns the region information of the current cluster.
-
-#### Syntax:
-
-```SQL
-showRegionsStatement
-    : SHOW REGIONS
-    ;
-```
-
-#### Example:
-
-```SQL
-IoTDB> SHOW REGIONS
-```
-
-The result is as follows:
-
-```SQL
-+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+
-|RegionId| Type| Status| Database|SeriesSlotNum|TimeSlotNum|DataNodeId|RpcAddress|RpcPort|InternalAddress| Role| CreateTime|TsFileSize|
-+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -| 6|SchemaRegion|Running|tcollector| 670| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.194| | -| 7| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.196| 169.85 KB| -| 8| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.198| 161.63 KB| -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -``` - -### 1.10 查看可用节点 - -**含义**:返回当前集群所有可用的 DataNode 的 RPC 地址和端口。注意:这里对于“可用”的定义为:处于非 REMOVING 状态的 DN 节点。 - -> V2.0.8 起支持该功能 - -#### 语法: - -```SQL -showAvailableUrlsStatement - : SHOW AVAILABLE URLS - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW AVAILABLE URLS -``` - -执行结果如下: - -```SQL -+----------+-------+ -|RpcAddress|RpcPort| -+----------+-------+ -| 0.0.0.0| 6667| -+----------+-------+ -``` - -### 1.11 查看服务信息 - -**含义**:返回当前集群所有正常工作(RUNNING 或 READ-ONLY) DN 上的服务信息(MQTT 服务、REST 服务)。 - -> V2.0.8.2 起支持该功能 - -#### 语法: - -```SQL -showServicesStatement - : SHOW SERVICES - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW SERVICES -IoTDB> SHOW SERVICES ON 1 -``` - -执行结果如下: - -```SQL -+------------+-----------+-------+ -|service_name|datanode_id| state| -+------------+-----------+-------+ -| MQTT| 1|STOPPED| -| REST| 1|RUNNING| -+------------+-----------+-------+ -``` - - -### 1.12 查看集群激活状态 - -**含义**:返回当前集群的激活状态。 - -#### 语法: - -```SQL -showActivationStatement - : SHOW ACTIVATION - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW ACTIVATION -``` - -执行结果如下: - -```SQL -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| 
-| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -### 1.13 查看节点配置信息 - -**含义**:默认返回指定节点(通过 `node_id` 指定)的配置文件中已生效的配置项;若未指定 `node_id`,则返回客户端直连的 DataNode 配置。 添加 `all` 参数返回所有配置项(未配置项的 `value` 为 `null`);添加 `with desc` 参数返回配置项含描述信息。 - -> V2.0.9.1 起支持该功能 - -#### 语法: - -```SQL -showConfigurationStatement - : SHOW (ALL)? CONFIGURATION (ON nodeId=INTEGER_VALUE)? (WITH DESC)? - ; -``` - -#### 结果集说明 - -| 列名 | 列类型 | 含义 | -| ---------------- | -------- | ------------------ | -| name | string | 参数名 | -| value | string | 参数值 | -| default\_value | string | 参数默认值 | -| description | string | 参数描述(可选) | - -#### 示例: - -1. 查看客户端直连 DataNode 的配置信息 - -```SQL -show configuration; -``` - -```Bash -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| name| value| default_value| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| cn_internal_address| 127.0.0.1| 127.0.0.1| -| cn_internal_port| 10710| 10710| -| cn_consensus_port| 10720| 10720| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| -| dn_rpc_port| 6667| 6667| -| dn_internal_address| 127.0.0.1| 127.0.0.1| -| dn_internal_port| 10730| 10730| -| dn_mpp_data_exchange_port| 10740| 10740| -| dn_schema_region_consensus_port| 10750| 10750| -| dn_data_region_consensus_port| 10760| 10760| -| schema_replication_factor| 1| 1| -|schema_region_consensus_protocol_class| org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| -| data_replication_factor| 1| 1| -| data_region_consensus_protocol_class| 
org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| -| cn_metric_prometheus_reporter_port| 9091| 9091| -| dn_metric_prometheus_reporter_port| 9092| 9092| -| series_slot_num| 1000| 1000| -| series_partition_executor_class|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| time_partition_origin| 0| 0| -| time_partition_interval| 604800000| 604800000| -| disk_space_warning_threshold| 0.05| 0.05| -| schema_engine_mode| Memory| Memory| -| tag_attribute_total_size| 700| 700| -| read_consistency_level| strong| strong| -| timestamp_precision| ms| ms| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -Total line number = 28 -It costs 0.013s -``` - -2. 查看指定 node id 的节点配置信息 - -```Bash -show configuration on 1; -``` - -```Bash -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| name| value| default_value| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| cn_internal_address| 127.0.0.1| 127.0.0.1| -| cn_internal_port| 10710| 10710| -| cn_consensus_port| 10720| 10720| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| -| dn_rpc_port| 6667| 6667| -| dn_internal_address| 127.0.0.1| 127.0.0.1| -| dn_internal_port| 10730| 10730| -| dn_mpp_data_exchange_port| 10740| 10740| -| dn_schema_region_consensus_port| 10750| 10750| -| dn_data_region_consensus_port| 10760| 10760| -| schema_replication_factor| 1| 1| -|schema_region_consensus_protocol_class| 
org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| -| data_replication_factor| 1| 1| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| -| cn_metric_prometheus_reporter_port| 9091| 9091| -| dn_metric_prometheus_reporter_port| 9092| 9092| -| series_slot_num| 1000| 1000| -| series_partition_executor_class|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| time_partition_origin| 0| 0| -| time_partition_interval| 604800000| 604800000| -| disk_space_warning_threshold| 0.05| 0.05| -| schema_engine_mode| Memory| Memory| -| tag_attribute_total_size| 700| 700| -| read_consistency_level| strong| strong| -| timestamp_precision| ms| ms| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -Total line number = 28 -It costs 0.004s -``` - -3. 
查看所有配置信息 - -```Bash -show all configuration; -``` - -```Bash -+---------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| name| value| default_value| -+---------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| cn_internal_address| 127.0.0.1| 127.0.0.1| -| cn_internal_port| 10710| 10710| -| cn_consensus_port| 10720| 10720| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| -| dn_rpc_port| 6667| 6667| -| dn_internal_address| 127.0.0.1| 127.0.0.1| -| dn_internal_port| 10730| 10730| -| dn_mpp_data_exchange_port| 10740| 10740| -| dn_schema_region_consensus_port| 10750| 10750| -| dn_data_region_consensus_port| 10760| 10760| -| dn_join_cluster_retry_interval_ms| null| 5000| -| config_node_consensus_protocol_class| null| org.apache.iotdb.consensus.ratis.RatisConsensus| -| schema_replication_factor| 1| 1| -| schema_region_consensus_protocol_class| org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| -| data_replication_factor| 1| 1| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| -| cn_system_dir| null| data/confignode/system| -| cn_consensus_dir| null| data/confignode/consensus| -| cn_pipe_receiver_file_dir| null| data/confignode/system/pipe/receiver| -... -+---------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -Total line number = 412 -It costs 0.006s -``` - -4. 
查看配置项描述信息 - -```Bash -show configuration on 1 with desc; -``` - -```Bash -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| name| value| default_value| description| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| Used for indicate cluster name and distinguish different cluster. If you need to modify the cluster name, it's recommended to use 'set configuration "cluster_name=xxx"' sql. 
Manually modifying configuration file is not recommended, which may cause node restart fail.effectiveMode: hot_reload.Datatype: string| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710|For the first ConfigNode to start, cn_seed_config_node points to its own cn_internal_address:cn_internal_port. For other ConfigNodes that to join the cluster, cn_seed_config_node points to any running ConfigNode's cn_internal_address:cn_internal_port. Note: After this ConfigNode successfully joins the cluster for the first time, this parameter is no longer used. Each node automatically maintains the list of ConfigNodes and traverses connections when restarting. Format: address:port e.g. 127.0.0.1:10710.effectiveMode: first_start.Datatype: String| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| dn_seed_config_node points to any running ConfigNode's cn_internal_address:cn_internal_port. Note: After this DataNode successfully joins the cluster for the first time, this parameter is no longer used. Each node automatically maintains the list of ConfigNodes and traverses connections when restarting. Format: address:port e.g. 127.0.0.1:10710.effectiveMode: first_start.Datatype: String| -| cn_internal_address| 127.0.0.1| 127.0.0.1| Used for RPC communication inside cluster. 
Could set 127.0.0.1(for local test) or ipv4 address.effectiveMode: first_start.Datatype: String| -| cn_internal_port| 10710| 10710| Used for RPC communication inside cluster.effectiveMode: first_start.Datatype: int| -| cn_consensus_port| 10720| 10720| Used for consensus communication among ConfigNodes inside cluster.effectiveMode: first_start.Datatype: int| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| Used for connection of IoTDB native clients(Session) Could set 127.0.0.1(for local test) or ipv4 address.effectiveMode: restart.Datatype: String| -| dn_rpc_port| 6667| 6667| Used for connection of IoTDB native clients(Session) Bind with dn_rpc_address.effectiveMode: restart.Datatype: int| -| dn_internal_address| 127.0.0.1| 127.0.0.1| Used for communication inside cluster. could set 127.0.0.1(for local test) or ipv4 address.effectiveMode: first_start.Datatype: String| -| dn_internal_port| 10730| 10730| Used for communication inside cluster. Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| dn_mpp_data_exchange_port| 10740| 10740| Port for data exchange among DataNodes inside cluster Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| dn_schema_region_consensus_port| 10750| 10750| port for consensus's communication for schema region inside cluster. Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| dn_data_region_consensus_port| 10760| 10760| port for consensus's communication for data region inside cluster. Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| schema_replication_factor| 1| 1| Default number of schema replicas.effectiveMode: first_start.Datatype: int| -|schema_region_consensus_protocol_class| org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| SchemaRegion consensus protocol type. This parameter is unmodifiable after ConfigNode starts for the first time. These consensus protocols are currently supported: 1. 
org.apache.iotdb.consensus.ratis.RatisConsensus 2. org.apache.iotdb.consensus.simple.SimpleConsensus (The schema_replication_factor can only be set to 1).effectiveMode: first_start.Datatype: string| -| data_replication_factor| 1| 1| Default number of data replicas.effectiveMode: first_start.Datatype: int| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| DataRegion consensus protocol type. This parameter is unmodifiable after ConfigNode starts for the first time. These consensus protocols are currently supported: 1. org.apache.iotdb.consensus.simple.SimpleConsensus (The data_replication_factor can only be set to 1) 2. org.apache.iotdb.consensus.iot.IoTConsensus 3. org.apache.iotdb.consensus.ratis.RatisConsensus 4. org.apache.iotdb.consensus.iot.IoTConsensusV2.effectiveMode: first_start.Datatype: string| -| cn_metric_prometheus_reporter_port| 9091| 9091| The port of prometheus reporter of metric module.effectiveMode: restart.Datatype: int| -| dn_metric_prometheus_reporter_port| 9092| 9092| The port of prometheus reporter of metric module.effectiveMode: restart.Datatype: int| -| series_slot_num| 1000| 1000| All parameters in Partition configuration is unmodifiable after ConfigNode starts for the first time. And these parameters should be consistent within the ConfigNodeGroup. Number of SeriesPartitionSlots per Database.effectiveMode: first_start.Datatype: Integer| -| series_partition_executor_class|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| SeriesPartitionSlot executor class These hashing algorithms are currently supported: 1. BKDRHashExecutor(Default) 2. APHashExecutor 3. JSHashExecutor 4. 
SDBMHashExecutor Also, if you want to implement your own SeriesPartition executor, you can inherit the SeriesPartitionExecutor class and modify this parameter to correspond to your Java class.effectiveMode: first_start.Datatype: String| -| time_partition_origin| 0| 0| Time partition origin in milliseconds, default is equal to zero. This origin is set by default to the beginning of Unix time, which is January 1, 1970, at 00:00 UTC (Coordinated Universal Time). This point is known as the Unix epoch, and its timestamp is 0. If you want to specify a different time partition origin, you can set this value to a specific Unix timestamp in milliseconds.effectiveMode: first_start.Datatype: long| -| time_partition_interval| 604800000| 604800000| Time partition interval in milliseconds, and partitioning data inside each data region, default is equal to one week.effectiveMode: first_start.Datatype: long| -| disk_space_warning_threshold| 0.05| 0.05| Disk remaining threshold at which DataNode is set to ReadOnly status.effectiveMode: restart.Datatype: double(percentage)| -| schema_engine_mode| Memory| Memory| The schema management mode of schema engine. Currently, support Memory and PBTree. This config of all DataNodes in one cluster must keep same.effectiveMode: first_start.Datatype: string| -| tag_attribute_total_size| 700| 700| max size for a storage block for tags and attributes of one time series. If the combined size of tags and attributes exceeds the tag_attribute_total_size, a new storage block will be allocated to continue storing the excess data. the unit is byte.effectiveMode: first_start.Datatype: int| -| read_consistency_level| strong| strong| The read consistency level These consistency levels are currently supported: 1. strong(Default, read from the leader replica) 2. weak(Read from a random replica).effectiveMode: restart.Datatype: string| -| timestamp_precision| ms| ms| Use this value to set timestamp precision as "ms", "us" or "ns". 
Once the precision has been set, it can not be changed.effectiveMode: first_start.Datatype: String| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 28 -It costs 0.010s -``` - - - -## 2. 状态设置 - -### 2.1 设置连接的模型 - -**含义**:将当前连接的 sql_dialect 置为树模型/表模型,在树模型和表模型中均可使用该命令。 - -#### 语法: - -```SQL -SET SQL_DIALECT EQ (TABLE | TREE) -``` - -#### 示例: - -```SQL -IoTDB> SET SQL_DIALECT=TABLE -IoTDB> SHOW CURRENT_SQL_DIALECT -``` - -执行结果如下: - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 2.2 更新配置项 - -**含义**:用于更新配置项,执行完成后会进行配置项的热加载,对于支持热修改的配置项会立即生效。 - -#### 语法: - -```SQL -setConfigurationStatement - : SET CONFIGURATION propertyAssignments (ON INTEGER_VALUE)? - ; - -propertyAssignments - : property (',' property)* - ; - -property - : identifier EQ propertyValue - ; - -propertyValue - : DEFAULT - | expression - ; -``` - -**参数解释**: - -1. **propertyAssignments** - - **含义**:更新的配置列表,由多个 `property` 组成。 - - 可以更新多个配置列表,用逗号分隔。 - - **取值**: - - `DEFAULT`:将配置项恢复为默认值。 - - `expression`:具体的值,必须是一个字符串。 -2. 
**ON INTEGER_VALUE** - - **含义**:指定要更新配置的节点 ID。 - - **可选性**:可选。如果不指定或指定的值小于 0,则更新所有 ConfigNode 和 DataNode 的配置。 - -#### 示例: - -```SQL -IoTDB> SET CONFIGURATION disk_space_warning_threshold='0.05',heartbeat_interval_in_ms='1000' ON 1; -``` - -### 2.3 读取手动修改的配置文件 - -**含义**:用于读取手动修改过的配置文件,并对配置项进行热加载,对于支持热修改的配置项会立即生效。 - -#### 语法: - -```SQL -loadConfigurationStatement - : LOAD CONFIGURATION localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**参数解释**: - -1. **localOrClusterMode** - - **含义**:指定配置热加载的范围。 - - **可选性**:可选。默认值为 `CLUSTER`。 - - **取值**: - - `LOCAL`:只对客户端直连的 DataNode 进行配置热加载。 - - `CLUSTER`:对集群中所有 DataNode 进行配置热加载。 - -#### 示例: - -```SQL -IoTDB> LOAD CONFIGURATION ON LOCAL; -``` - -### 2.4 设置系统的状态 - -**含义**:用于设置系统的状态。 - -#### 语法: - -```SQL -setSystemStatusStatement - : SET SYSTEM TO (READONLY | RUNNING) localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**参数解释**: - -1. **RUNNING | READONLY** - - **含义**:指定系统的新状态。 - - **取值**: - - `RUNNING`:将系统设置为运行状态,允许读写操作。 - - `READONLY`:将系统设置为只读状态,只允许读取操作,禁止写入操作。 -2. **localOrClusterMode** - - **含义**:指定状态变更的范围。 - - **可选性**:可选。默认值为 `CLUSTER`。 - - **取值**: - - `LOCAL`:仅对客户端直连的 DataNode 生效。 - - `CLUSTER`:对集群中所有 DataNode 生效。 - -#### 示例: - -```SQL -IoTDB> SET SYSTEM TO READONLY ON CLUSTER; -``` - - -## 3. 数据管理 - -### 3.1 刷写内存表中的数据到磁盘 - -**含义**:将内存表中的数据刷写到磁盘上。 - -#### 语法: - -```SQL -flushStatement - : FLUSH identifier? (',' identifier)* booleanValue? localOrClusterMode? - ; - -booleanValue - : TRUE | FALSE - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**参数解释**: - -1. **identifier** - - **含义**:指定要刷写的数据库名称。 - - **可选性**:可选。如果不指定,则默认刷写所有数据库。 - - **多个数据库**:可以指定多个数据库名称,用逗号分隔。例如:`FLUSH test_db1, test_db2`。 -2. **booleanValue** - - **含义**:指定刷写的内容。 - - **可选性**:可选。如果不指定,则默认同时刷写顺序空间和乱序空间的内存表。 - - **取值**: - - `TRUE`:只刷写顺序空间的内存表。 - - `FALSE`:只刷写乱序空间的内存表。 -3. 
**localOrClusterMode** - - **含义**:指定刷写的范围。 - - **可选性**:可选。默认值为 `CLUSTER`。 - - **取值**: - - `ON LOCAL`:只刷写客户端直连的 DataNode 上的内存表。 - - `ON CLUSTER`:刷写集群中所有 DataNode 上的内存表。 - -#### 示例: - -```SQL -IoTDB> FLUSH test_db TRUE ON LOCAL; -``` - -## 4. 数据修复 - -### 4.1 启动后台扫描并修复 tsfile 任务 - -**含义**:启动一个后台任务,开始扫描并修复 tsfile,能够修复数据文件内的时间戳乱序类异常。 - -#### 语法: - -```SQL -startRepairDataStatement - : START REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**参数解释**: - -1. **localOrClusterMode** - - **含义**:指定数据修复的范围。 - - **可选性**:可选。默认值为 `CLUSTER`。 - - **取值**: - - `ON LOCAL`:仅对客户端直连的 DataNode 执行。 - - `ON CLUSTER`:对集群中所有 DataNode 执行。 - -#### 示例: - -```SQL -IoTDB> START REPAIR DATA ON CLUSTER; -``` - -### 4.2 暂停后台修复 tsfile 任务 - -**含义**:暂停后台的修复任务,暂停中的任务可通过再次执行 `START REPAIR DATA` 命令恢复。 - -#### 语法: - -```SQL -stopRepairDataStatement - : STOP REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**参数解释**: - -1. **localOrClusterMode** - - **含义**:指定数据修复的范围。 - - **可选性**:可选。默认值为 `CLUSTER`。 - - **取值**: - - `ON LOCAL`:仅对客户端直连的 DataNode 执行。 - - `ON CLUSTER`:对集群中所有 DataNode 执行。 - -#### 示例: - -```SQL -IoTDB> STOP REPAIR DATA ON CLUSTER; -``` - -## 5. 终止查询 - -### 5.1 主动终止查询 - -**含义**:使用该命令主动地终止查询。 - -#### 语法: - -```SQL -killQueryStatement - : KILL (QUERY queryId=string | ALL QUERIES) - ; -``` - -**参数解释**: - -1. **QUERY queryId=string** - - **含义**:指定要终止的查询的 ID。`queryId` 是正在执行的查询的唯一标识符。 - - **获取查询 ID**:可以通过 `SHOW QUERIES` 命令获取所有正在执行的查询及其 ID。 -2. **ALL QUERIES** - - **含义**:终止所有正在执行的查询。 - -#### 示例: - -通过指定 `queryId` 可以终止指定的查询。正在执行的查询 ID 可通过 `SHOW QUERIES` 命令获取,该命令会列出所有正在执行的查询。 - -```SQL -IoTDB> KILL QUERY 20250108_101015_00000_1; -- 终止指定查询 -IoTDB> KILL ALL QUERIES; -- 终止所有查询 -``` - -## 6. 调试查询 - -### 6.1 DEBUG SQL - - -**含义**:在 SQL 查询语句开头添加 `DEBUG` 关键字,执行时将输出 debug 日志,包括涉及到的底层文件 scan 信息。 - -> V2.0.9.1 起支持该功能 - -#### 语法: - -```SQL -debugSQLStatement - : DEBUG ? 
query - ; -``` - -**说明:** - -* 日志输出目录为: `logs/log_datanode_query_debug.log` - -#### 示例: - -1. 执行以下 SQL 进行 DEBUG 查询 - -```SQL -debug select * from table3; -``` - -2. 观察`log_datanode_query_debug.log` 的日志内容,查看查询涉及到的文件 scan 信息。 - -```Bash -2026-03-24 10:10:41,515 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.t.TsFileResource:1098 - Path: table3.d1 file /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2864/1769139940009-1-0-0.tsfile is not satisfied because of no device! -2026-03-24 10:10:41,515 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.t.TsFileResource:1098 - Path: table3.d1 file /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2865/1769139940010-1-0-0.tsfile is not satisfied because of no device! -2026-03-24 10:10:41,516 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:159 - Cache miss: table3.d1. in file: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile -2026-03-24 10:10:41,516 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:160 - Device: table3.d1, all sensors: [, temperature] -2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.BloomFilterCache:110 - get bloomFilter from cache where filePath is: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile -2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:227 - Get timeseries: table3.d1. 
metadata in file: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile from cache: TimeseriesMetadata{timeSeriesMetadataType=-128, chunkMetaDataListDataSize=8, measurementId='', dataType=VECTOR, statistics=startTime: 1747065600001 endTime: 1747065601002 count: 2, modified=false, isSeq=true, chunkMetadataList=[measurementId: , datatype: VECTOR, version: 0, Statistics: startTime: 1747065600001 endTime: 1747065601002 count: 2, deleteIntervalList: null]}. -2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:227 - Get timeseries: table3.d1.temperature metadata in file: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile from cache: TimeseriesMetadata{timeSeriesMetadataType=64, chunkMetaDataListDataSize=8, measurementId='temperature', dataType=FLOAT, statistics=startTime: 1747065600001 endTime: 1747065601002 count: 2 [minValue:85.0,maxValue:90.0,firstValue:90.0,lastValue:85.0,sumValue:175.0], modified=false, isSeq=true, chunkMetadataList=[measurementId: temperature, datatype: FLOAT, version: 0, Statistics: startTime: 1747065600001 endTime: 1747065601002 count: 2 [minValue:85.0,maxValue:90.0,firstValue:90.0,lastValue:85.0,sumValue:175.0], deleteIntervalList: null]}. 
-2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:110 - Modifications size is 1 for file Path: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:114 - [] -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:125 - After modification Chunk meta data list is: -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:126 - org.apache.tsfile.file.metadata.TableDeviceChunkMetadata@2e11291f -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.ChunkCache:167 - get chunk from cache whose key is: ChunkCacheKey{filePath='/home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile', regionId=4, timePartitionId=2888, tsFileVersion=1, compactionVersion=0, offsetOfChunkHeader=19} -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.ChunkCache:167 - get chunk from cache whose key is: ChunkCacheKey{filePath='/home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile', regionId=4, timePartitionId=2888, tsFileVersion=1, compactionVersion=0, offsetOfChunkHeader=46} -2026-03-24 10:10:41,519 [pool-69-IoTDB-ClientRPC-Processor-1$20260324_021041_00068_1] INFO o.a.i.d.q.p.Coordinator:902 - debug select * from table3 -``` diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Pattern-Query_timecho.md b/src/zh/UserGuide/Master/Table/User-Manual/Pattern-Query_timecho.md deleted file mode 100644 index 8f8df33fa..000000000 --- a/src/zh/UserGuide/Master/Table/User-Manual/Pattern-Query_timecho.md +++ /dev/null @@ -1,1135 +0,0 @@ 
- - -# 模式查询 - -IoTDB 针对时序数据的特色分析场景,提供了模式查询能力,为时序数据的深度挖掘与复杂计算提供了灵活高效的解决方案。下文将对该功能进行详细的介绍。 - -## 1. 概述 - -模式查询支持通过定义模式变量的识别逻辑以及正则表达式来捕获一段连续的数据,并对每一段捕获的数据进行分析计算,适用于识别时序数据中的特定模式(如下图所示)、检测特定事件等业务场景。 - -![](/img/timeseries-featured-analysis-1.png) - -> 注意:该功能从 V 2.0.5 版本开始提供。 - -## 2. 功能介绍 -### 2.1 语法格式 - -```SQL -MATCH_RECOGNIZE ( - [ PARTITION BY column [, ...] ] - [ ORDER BY column [, ...] ] - [ MEASURES measure_definition [, ...] ] - [ ROWS PER MATCH ] - [ AFTER MATCH skip_to ] - PATTERN ( row_pattern ) - [ SUBSET subset_definition [, ...] ] - DEFINE variable_definition [, ...] -) -``` - -**说明:** - -* PARTITION BY : 可选,用于对输入表进行分组,每个分组能独立进行模式匹配。如果未声明该子句,则整个输入表将作为一个整体进行处理。 -* ORDER BY :可选,用于确保输入数据按某种顺序进行匹配处理。 -* MEASURES :可选,用于指定从匹配到的一段数据中提取哪些信息。 -* ROWS PER MATCH :可选,用于指定模式匹配成功后结果集的输出方式。 -* AFTER MATCH SKIP :可选,用于指定在识别到一个非空匹配后,下一次模式匹配应从哪一行继续进行。 -* PATTERN :用于定义需要匹配的行模式。 -* SUBSET :可选,用于将多个基本模式变量所匹配的行合并为一个逻辑集合。 -* DEFINE :用于定义行模式的基本模式变量。 - -**语法示例原始数据:** - -```SQL -IoTDB:database3> select * from t -+-----------------------------+------+----------+ -| time|device|totalprice| -+-----------------------------+------+----------+ -|2025-01-01T00:01:00.000+08:00| d1| 90| -|2025-01-01T00:02:00.000+08:00| d1| 80| -|2025-01-01T00:03:00.000+08:00| d1| 70| -|2025-01-01T00:04:00.000+08:00| d1| 80| -|2025-01-01T00:05:00.000+08:00| d1| 70| -|2025-01-01T00:06:00.000+08:00| d1| 80| -+-----------------------------+------+----------+ - --- 创建语句 -create table t(device tag, totalprice int32 field) - -insert into t(time,device,totalprice) values(2025-01-01T00:01:00, 'd1', 90),(2025-01-01T00:02:00, 'd1', 80),(2025-01-01T00:03:00, 'd1', 70),(2025-01-01T00:04:00, 'd1', 80),(2025-01-01T00:05:00, 'd1', 70),(2025-01-01T00:06:00, 'd1', 80) -``` - -### 2.2 DEFINE 子句 - -用于为模式识别中的每个基本模式变量指定其判断条件。这些变量通常由标识符(如 `A`, `B`)代表,并通过该子句中的布尔表达式精确定义哪些行符合该变量的要求。 - -* 在模式匹配执行过程中,仅当布尔表达式返回 TRUE 时,才会将当前行标记为该变量,从而将其纳入到当前匹配分组中。 - -```SQL --- 只有在当前行的 totalprice 值小于前一行 totalprice 值的情况下,当前行才可以被识别为 B。 -DEFINE B AS 
totalprice < PREV(totalprice) -``` - -* **未**在子句中**显式**定义的变量,其匹配条件隐含为恒真(TRUE),即可在任何输入行上成功匹配。 - -### 2.3 SUBSET 子句 - -用于将多个基本模式变量(如 `A`、`B`)匹配到的行合并成一个联合模式变量(如 `U`),使这些行可以被视为同一个逻辑集合进行操作。可用于`MEASURES`、`DEFINE `和`AFTER MATCH SKIP`子句。 - -```SQL -SUBSET U = (A, B) -``` - -例如,对于模式 `PATTERN ((A | B){5} C+)` ,在匹配过程中无法确定第五次重复时具体匹配的是基本模式变量 A 还是 B,因此 - -1. 在 `MEASURES `子句中,若需要引用该阶段最后一次匹配到的行,则可通过定义联合模式变量 `SUBSET U = (A, B)`实现。此时表达式 `RPR_LAST(U.totalprice)` 将直接返回该目标行的 `totalprice` 值。 -2. 在 `AFTER MATCH SKIP` 子句中,若匹配结果中未包含基本模式变量 A 或 B 时,执行 `AFTER MATCH SKIP TO LAST B` 或 `AFTER MATCH SKIP TO LAST A` 会因锚点缺失跳转失败;而通过引入联合模式变量 `SUBSET U = (A, B)`,使用 `AFTER MATCH SKIP TO LAST U` 则始终有效。 - -### 2.4 PATTERN 子句 - -用于定义需要匹配的行模式,其基本构成单元是**基本模式变量。** - -```SQL -PATTERN ( row_pattern ) -``` - -#### 2.4.1 模式种类 - -| 行模式 | 语法格式 | 描述 | -| ----------------------------------- |---------------------| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| 模式连接(Pattern Concatenation) | `A B+ C+ D+` | 由不带任何运算符的子模式组成,按声明顺序依次匹配所有子模式。| -| 模式选择(Pattern Alternation) | `A \| B \| C` | 由以`\|`分隔的多个子模式组成,仅匹配其中一个。当多个子模式均可匹配时,选择最左侧的子模式匹配。 | -| 模式排列(Pattern Permutation) | `PERMUTE(A, B, C)` | 该模式等价于对所有子模式元素的不同顺序进行选择匹配,即要求 A、B、C 三者均须匹配,但其出现顺序不固定。当多种匹配顺序均可成功时,依据 PERMUTE 列表中元素的定义先后顺序,按**字典序原则**确定优先级。例如,A B C 为最高优先,C B A 则为最低优先。 | -| 模式分组(Pattern Grouping) | `(A B C)` | 用圆括号将子模式括起,视作一个整体对待,可与其他运算符配合使用。如`(A B C)+`表示连续出现一组`(A B C)`的模式。 | -| 空模式(Empty Pattern) | `()` | 表示一个不包含任何行的空匹配 | -| 模式排除(Pattern Exclusion) | `{- row_pattern -}` | 用于指定在输出中需要排除的匹配部分。通常与`ALL ROWS PER MATCH`选项结合使用,用于输出感兴趣的行。如`PATTERN (A {- B+ C+ -} D+)`,并使用`ALL ROWS PER MATCH`时,输出将仅包含匹配的首行(`A`对应行)与尾部行(`D+`对应行)。 | - -#### 2.4.2 分区起始/结束锚点(Partition Start/End Anchor) - -* `^A` 表示匹配以 A 为分区开始的模式 - * 当 PATTERN 子句的取值为 `^A` 时,要求匹配必须从分区的首行开始,且这一行要满足 `A` 的定义 - * 当 PATTERN 
子句的取值为 `^A^` 或 `A^` 时,输出结果为空 -* `A$` 表示匹配以 A 为分区结束的模式 - * 当 PATTERN 子句的取值为 `A$` 时,要求必须在分区的结束位置匹配,并且这一行要满足 `A`的定义 - * 当 PATTERN 子句的取值为 `$A` 或 `$A$` 时,输出结果为空 - -**示例说明** - -* 查询 sql - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - AFTER MATCH SKIP PAST LAST ROW - PATTERN %s -- PATTERN 子句 - DEFINE A AS true -) AS m; -``` - -* 查询结果 - * 当 PATTERN 子句为 PATTERN (^A) 时 - - ![](/img/timeseries-featured-analysis-2.png) - - 实际返回 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * 当 PATTERN 子句为 PATTERN (^A^) 时,输出的结果为空,因为不可能从分区的起始位置开始匹配了一个 A 之后,又回到分区的起始位置 - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. - ``` - - * 当 PATTERN 子句为 PATTERN (A\$) 时 - - ![](/img/timeseries-featured-analysis-3.png) - - 实际返回 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:06:00.000+08:00| 1| 80| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * 当 PATTERN 子句为 PATTERN (\$A\$) 时,输出的结果为空 - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. 
- ``` - - -#### 2.4.3 量词(Quantifiers) - -量词用于指定子模式重复出现的次数,置于相应子模式之后,如 `(A | B)*`。 - -常用量词如下: - -| 量词 | 描述 | -| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `*` | 零次或多次重复 | -| `+` | 一次或多次重复 | -| `?` | 零次或一次重复 | -| `{n}` | 恰好重复 n 次 | -| `{m, n}` | 重复次数在 m 到 n 之间(m、n 为非负整数)。若省略左界,则默认从 0 开始;若省略右界,则重复次数不设上限(如 `{5,}` 等同于“至少重复五次”);若同时省略左右界,即 `{,}`,则与 `*` 等价。 | - -* 可通过在量词后加 `?` 改变匹配偏好。 - * `{3,5}`:偏好 5 次,最不偏好 3 次;`{3,5}?`:偏好 3 次,最不偏好 5 次 - * `?`:偏好 1 次;`??`:偏好 0 次 - -### 2.5 AFTER MATCH SKIP 子句 - -用于指定在识别到一个非空匹配后,下一次模式匹配应从哪一行继续进行。 - -| 跳转策略 | 描述 | 是否允许识别重叠匹配项 | -| ------------------------------------------------------------- | --------------------------------------------------- | ------------------------ | -| `AFTER MATCH SKIP PAST LAST ROW` | 默认行为。在当前匹配的最后一行之后的下一行开始。 | 否 | -| `AFTER MATCH SKIP TO NEXT ROW` | 在当前匹配中的第二行开始。 | 是 | -| `AFTER MATCH SKIP TO [ FIRST \| LAST ] pattern_variable` | 跳转到某个模式变量的 [ 第一行 \| 最后一行 ] 开始。 | 是 | - -* 在所有可能的配置中,仅当 `ALL ROWS PER MATCH WITH UNMATCHED ROWS` 与 `AFTER MATCH SKIP PAST LAST ROW` 联合使用时,系统才能确保对每个输入行恰好生成一条输出记录。 - -**示例说明** - -* 查询 sql - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - %s -- AFTER MATCH SKIP 子句 - PATTERN (A B+ C+ D?) 
- SUBSET U = (C, D) - DEFINE - B AS B.totalprice < PREV (B.totalprice), - C AS C.totalprice > PREV (C.totalprice), - D AS false -- 永远不会匹配成功 -) AS m; -``` - -* 查询结果 - * 当 AFTER MATCH SKIP PAST LAST ROW 时 - - ![](/img/timeseries-featured-analysis-4.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:根据 `AFTER MATCH SKIP PAST LAST ROW` 语义,从第 5 行开始,无法再找寻到一个合法匹配 - * 此模式一定不会出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 4 - ``` - - * 当 AFTER MATCH SKIP TO NEXT ROW 时 - - ![](/img/timeseries-featured-analysis-5.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:根据 `AFTER MATCH SKIP TO NEXT ROW` 语义,从第 2 行开始,匹配:第 2、3、4 行 - * 第三次匹配:尝试从第 3 行开始,失败 - * 第三次匹配:尝试从第 4 行开始,成功,匹配第 4、5、6行 - * 此模式允许出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:02:00.000+08:00| 2| 80| A| - |2025-01-01T00:03:00.000+08:00| 2| 70| B| - |2025-01-01T00:04:00.000+08:00| 2| 80| C| - |2025-01-01T00:04:00.000+08:00| 3| 80| A| - |2025-01-01T00:05:00.000+08:00| 3| 70| B| - |2025-01-01T00:06:00.000+08:00| 3| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 10 - ``` - - * 当 AFTER MATCH SKIP TO FIRST C 时 - - ![](/img/timeseries-featured-analysis-6.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:从第一个 C (也就是第 4 行)处开始,匹配第4、5、6行 - * 此模式允许出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - 
+-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * 当 AFTER MATCH SKIP TO LAST B 或 AFTER MATCH SKIP TO B 时 - - ![](/img/timeseries-featured-analysis-7.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:尝试从最后一个 B (也就是第 3 行)处开始,失败 - * 第二次匹配:尝试从第 4 行开始,成功匹配第4、5、6行 - * 此模式允许出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * 当 AFTER MATCH SKIP TO U 时 - - ![](/img/timeseries-featured-analysis-8.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:`SKIP TO U` 表示跳转到最后一个 C 或 D,D 永远不可能匹配成功,所以就是跳转到最后一个 C(也就是第 4 行),成功匹配第4、5、6行 - * 此模式允许出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * 当 AFTER MATCH SKIP TO A 时,报错。因为不能跳转到匹配的第一行, 
否则会造成死循环。 - - ```SQL - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: cannot skip to first row of match - ``` - - * 当 AFTER MATCH SKIP TO B 时,报错。因为不能跳转到匹配分组中不存在的模式变量。 - - ```SQL - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: pattern variable is not present in match - ``` - - -### 2.6 ROWS PER MATCH 子句 - -用于指定模式匹配成功后结果集的输出方式,主要包括以下两种选项: - -| 输出方式 | 规则描述 | 输出结果 | **空匹配/未匹配行**处理逻辑 | -| -------------------- | ----------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| ONE ROW PER MATCH | 每一次成功匹配,产生一行输出结果。 | * PARTITION BY 子句中的列* MEASURES 子句中定义的表达式。 | 输出空匹配;跳过未匹配行。 | -| ALL ROWS PER MATCH | 每一次匹配中的每一行都将产生一条输出记录,除非该行通过 exclusion 语法排除。 | * PARTITION BY 子句中的列* ORDER BY 子句中的列* MEASURES 子句中定义的表达式* 输入表中的其余列 | * 默认:输出空匹配;跳过未匹配行。* ALL ROWS PER MATCH​**SHOW EMPTY MATCHES**​:默认输出空匹配,跳过未匹配行* ALL ROWS PER MATCH​**OMIT EMPTY MATCHES**​:不输出空匹配,跳过未匹配行* ALL ROWS PER MATCH​**WITH UNMATCHED ROWS**​:输出空匹配,并为每一条未匹配行额外生成一条输出记录| - -### 2.7 MEASURES 子句 - -用于指定从匹配到的一段数据中提取哪些信息。该子句为可选项,如果未显式指定,则根据 ROWS PER MATCH 子句的设置,部分输入列会成为模式识别的输出结果。 - -```SQL -MEASURES measure_expression AS measure_name [, ...] 
-``` - -* `measure_expression` 是根据匹配的一段数据计算出的标量值。 - -| 用法示例 | 说明 | -| ---------------------------------------------- | -------------------------------------------------------------------------------------------------------------- | -| `A.totalprice AS starting_price` | 返回匹配分组中第一行(即与变量 A 关联的唯一一行)中的价格,作为起始价格。 | -| `RPR_LAST(B.totalprice) AS bottom_price` | 返回与变量 B 关联的最后一行中的价格,代表“V”形模式中最低点的价格,对应下降区段的末尾。 | -| `RPR_LAST(U.totalprice) AS top_price` | 返回匹配分组中的最高价格,对应变量 C 或 D 所关联的最后一行,即整个匹配分组的末尾。【假设 SUBSET U = (C, D)】 | - -* 每个 `measure_expression `都会定义一个输出列,该列可通过其指定的 `measure_name `进行引用。 - -### 2.8 模式查询表达式 - -在 MEASURES 与 DEFINE 子句中使用的表达式为​**标量表达式**​,用于在输入表的行级上下文中求值。**标量表达式**除了支持标准 SQL 语法外,还支持针对模式查询的特殊扩展函数。 - -#### 2.8.1 模式变量引用 - -```SQL -A.totalprice -U.orderdate -orderstatus -``` - -* 当列名前缀为某**基本模式变量**或**联合模式变量**时,表示引用该变量所匹配的所有行的对应列值。 -* 若列名不带前缀,则等同于使用“​**全局联合模式变量**​”(即所有基本模式变量的并集)作前缀,表示引用当前匹配中所有行的该列值。 - -> 不允许在模式识别表达式中使用表名作列名前缀。 - -#### 2.8.2 扩展函数 - -| 函数名 | 函数式 | 描述 | -|------------------| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `MATCH_NUMBER`函数 | `MATCH_NUMBER()` | 返回当前匹配在分区内的序号,从 1 开始计数。空匹配与非空匹配一致,也占用匹配序号。 | -| `CLASSIFIER `函数 | `CLASSIFIER(option)`| 1. 返回当前行所映射的基本模式变量名称。1. `option`是一个可选参数:可以传入基本模式变量`CLASSIFIER(A)`或联合模式变量`CLASSIFIER(U)`,用于限定函数作用范围,对于不在范围内的行,直接返回 NULL。在对联合模式变量使用时,可用于辨别该行究竟映射至并集中哪一个基本模式变量。 | -| 逻辑导航函数 | `RPR_FIRST(expr, k)` | 1. 
表示从**当前匹配分组**中,定位至第一个满足 expr 的行,在此基础上再向分组尾部方向搜索到第 k 次出现的同一模式变量对应行,返回该行的指定列值。如果在指定方向上未能找到第 k 次匹配行,则函数返回 NULL。 2. 其中 k 是可选参数,默认为 0,表示仅定位至首个满足条件的行;若显式指定,必须为非负整数。 | -| 逻辑导航函数 | `RPR_LAST(expr, k)` | 1. 表示从**当前匹配分组**中,定位至最后一个满足 expr 的行,在此基础上再向分组开头方向搜索到第 k 次出现的同一模式变量对应行,返回该行的指定列值。如果在指定方向上未能找到第 k 次匹配行,则函数返回 NULL。 2. 其中 k 是可选参数,默认为 0,表示仅定位至末个满足条件的行;若显式指定,必须为非负整数。 | -| 物理导航函数 | `PREV(expr, k)` | 1. 表示从最后一次匹配至给定模式变量的行开始,向开头方向偏移 k 行,返回对应列值。若导航超出**分区边界**,则函数返回 NULL。 2. 其中 k 是可选参数,默认为 1;若显式指定,必须为非负整数。 | -| 物理导航函数 | `NEXT(expr, k)` | 1. 表示从最后一次匹配至给定模式变量的行开始,向尾部方向偏移 k 行,返回对应列值。若导航超出**分区边界**,则函数返回 NULL。 2. 其中 k 是可选参数,默认为 1;若显式指定,必须为非负整数。 | -| 聚合函数 | COUNT、SUM、AVG、MAX、MIN 函数 | 可用于对当前匹配中的数据进行计算。聚合函数与导航函数不允许互相嵌套。(V 2.0.6 版本起支持) | -| 嵌套函数 | `PREV/NEXT(CLASSIFIER())` | 物理导航函数与 CLASSIFIER 函数嵌套。用于获取当前行的前一个和后一个匹配行所对应的模式变量。 | -| 嵌套函数 | `PREV/NEXT(RPR_FIRST/RPR_LAST(expr, k))` | 物理函数内部**允许嵌套**逻辑函数,逻辑函数内部**不允许嵌套**物理函数。用于先进行逻辑偏移,再进行物理偏移。 | - -**示例说明** - -1. CLASSIFIER 函数 - -* 查询 sql - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* 分析过程 - - ![](/img/timeseries-featured-analysis-9.png) - -* 查询结果 - -```SQL -+-----------------------------+-----+-----+---------------+-----+ -| time|match|price|lower_or_higher|label| -+-----------------------------+-----+-----+---------------+-----+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| -+-----------------------------+-----+-----+---------------+-----+ 
-Total line number = 6 -``` - -2. 逻辑导航函数 - -* 查询 sql - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- MEASURES 子句 - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` - -* 查询结果 - * 当取值为 totalprice、RPR\_LAST(totalprice)、RUNNING RPR\_LAST(totalprice) 时 - - ![](/img/timeseries-featured-analysis-10.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * 当取值为 FINAL RPR\_LAST(totalprice) 时 - - ![](/img/timeseries-featured-analysis-11.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * 当取值为 RPR\_FIRST(totalprice)、 RUNNING RPR\_FIRST(totalprice)、FINAL RPR\_FIRST(totalprice)时 - - ![](/img/timeseries-featured-analysis-12.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 90| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 90| - |2025-01-01T00:05:00.000+08:00| 90| - |2025-01-01T00:06:00.000+08:00| 90| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * 当取值为 RPR\_LAST(totalprice, 2) 时 - - ![](/img/timeseries-featured-analysis-13.png) - - 实际返回 - - ```SQL - 
+-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| null| - |2025-01-01T00:02:00.000+08:00| null| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * 当取值为 FINAL RPR\_LAST(totalprice, 2) 时 - - ![](/img/timeseries-featured-analysis-14.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * 当取值为 RPR\_FIRST(totalprice, 2) 和 FINAL RPR\_FIRST(totalprice, 2) 时 - - ![](/img/timeseries-featured-analysis-15.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 70| - |2025-01-01T00:02:00.000+08:00| 70| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 6 - ``` - -3. 
物理导航函数 - -* 查询 sql - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- MEASURES 子句 - ALL ROWS PER MATCH - PATTERN (B) - DEFINE B AS B.totalprice >= PREV(B.totalprice) -) AS m; -``` - -* 查询结果 - * 当取值为 `PREV(totalprice)` 时 - - ![](/img/timeseries-featured-analysis-16.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * 当取值为 `PREV(B.totalprice, 2)` 时 - - ![](/img/timeseries-featured-analysis-17.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * 当取值为 `PREV(B.totalprice, 4)` 时 - - ![](/img/timeseries-featured-analysis-18.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| null| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * 当取值为 `NEXT(totalprice)` 或 `NEXT(B.totalprice, 1)` 时 - - ![](/img/timeseries-featured-analysis-19.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| null| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * `当取值为 NEXT(B.totalprice, 2)` 时 - - ![](/img/timeseries-featured-analysis-20.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| null| - 
+-----------------------------+-------+ - Total line number = 2 - ``` - -4. 聚合函数 - -* 查询 sql - -```SQL -SELECT m.time, m.count, m.avg, m.sum, m.min, m.max -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - COUNT(*) AS count, - AVG(totalprice) AS avg, - SUM(totalprice) AS sum, - MIN(totalprice) AS min, - MAX(totalprice) AS max - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* 分析过程(以 MIN(totalprice)为例) - -![](/img/timeseries-featured-analysis-21.png) - -* 查询结果 - -```SQL -+-----------------------------+-----+-----------------+-----+---+---+ -| time|count| avg| sum|min|max| -+-----------------------------+-----+-----------------+-----+---+---+ -|2025-01-01T00:01:00.000+08:00| 1| 90.0| 90.0| 90| 90| -|2025-01-01T00:02:00.000+08:00| 2| 85.0|170.0| 80| 90| -|2025-01-01T00:03:00.000+08:00| 3| 80.0|240.0| 70| 90| -|2025-01-01T00:04:00.000+08:00| 4| 80.0|320.0| 70| 90| -|2025-01-01T00:05:00.000+08:00| 5| 78.0|390.0| 70| 90| -|2025-01-01T00:06:00.000+08:00| 6|78.33333333333333|470.0| 70| 90| -+-----------------------------+-----+-----------------+-----+---+---+ -Total line number = 6 -``` - -5. 
嵌套函数 - -示例一 - -* 查询 sql - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label, m.prev_label, m.next_label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label, - PREV(CLASSIFIER(W)) AS prev_label, - NEXT(CLASSIFIER(W)) AS next_label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* 分析过程 - -![](/img/timeseries-featured-analysis-22.png) - -* 查询结果 - -```SQL -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -| time|match|price|lower_or_higher|label|prev_label|next_label| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| null| A| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| H| null| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| null| A| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| L| null| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| null| A| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| L| null| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -Total line number = 6 -``` - -示例二 - -* 查询 sql - -```SQL -SELECT m.time, m.prev_last_price, m.next_first_price -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - PREV(RPR_LAST(totalprice), 2) AS prev_last_price, - NEXT(RPR_FIRST(totalprice), 2) as next_first_price - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* 分析过程 - -![](/img/timeseries-featured-analysis-23.png) - -* 查询结果 - -```SQL -+-----------------------------+---------------+----------------+ -| time|prev_last_price|next_first_price| -+-----------------------------+---------------+----------------+ -|2025-01-01T00:01:00.000+08:00| null| 70| -|2025-01-01T00:02:00.000+08:00| null| 70| 
-|2025-01-01T00:03:00.000+08:00| 90| 70| -|2025-01-01T00:04:00.000+08:00| 80| 70| -|2025-01-01T00:05:00.000+08:00| 70| 70| -|2025-01-01T00:06:00.000+08:00| 80| 70| -+-----------------------------+---------------+----------------+ -Total line number = 6 -``` - -#### 2.8.3 RUNNING 和 FINAL 语义 -1. 定义 - -* `RUNNING`: 表示计算范围为当前匹配分组内,从分组的起始行到当前正在处理的行(即到当前行为止)。 -* `FINAL`: 表示计算范围为当前匹配分组内,从分组的起始行到分组的最终行(即整个匹配分组)。 - -2. 作用范围 - -* DEFINE 子句默认采用 RUNNING 语义。 -* MEASURES 子句默认采用 RUNNING 语义,支持指定 FINAL 语义。当采用 ONE ROW PER MATCH 输出模式时,所有表达式都从匹配分组的末行位置进行计算,此时 RUNNING 语义与 FINAL 语义等价。 - -3. 语法约束 - -* RUNNING 和 FINAL 需要写在**逻辑导航函数**或聚合函数之前,不能直接作用于**列引用**,也不能作用于物理导航函数。 - * 合法:`RUNNING RPR_LAST(A.totalprice)`、`FINAL RPR_LAST(A.totalprice)` - * 非法:`RUNNING A.totalprice`、`FINAL A.totalprice`、`RUNNING PREV(A.totalprice)` - -## 3. 场景示例 - -以[示例数据](../Reference/Sample-Data.md)为源数据 - -### 3.1 时间分段查询 - -将 table1 中的数据按照时间间隔小于等于 24 小时分段,查询每段中的数据总条数,以及开始、结束时间。 - -查询SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table1 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (cast(B.time as INT64) - cast(PREV(B.time) as INT64)) <= 86400000 -) AS m -``` - -查询结果 - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:38:00.000+08:00| 2| -|2024-11-27T16:38:00.000+08:00|2024-11-30T14:30:00.000+08:00| 16| -+-----------------------------+-----------------------------+---+ -Total line number = 2 -``` - -### 3.2 差值分段查询 - -将 table2 中的数据按照 humidity 湿度值与前一行的差值不超过 0.1 分段,查询每段中的数据总条数,以及开始、结束时间。 - -* 查询sql - -```SQL -SELECT start_time, end_time, cnt -FROM table2 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (B.humidity - PREV(B.humidity)) <= 0.1 -) AS m; 
-``` - -* 查询结果 - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-27T00:00:00.000+08:00| 2| -|2024-11-28T08:00:00.000+08:00|2024-11-29T00:00:00.000+08:00| 2| -|2024-11-29T11:00:00.000+08:00|2024-11-30T00:00:00.000+08:00| 2| -+-----------------------------+-----------------------------+---+ -Total line number = 3 -``` - -### 3.3 事件统计查询 - -将 table1 中数据按照设备号分组,统计上海地区湿度大于 35 的开始、结束时间及最大湿度值。 - -* 查询sql - -```SQL -SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity -FROM table1 -MATCH_RECOGNIZE ( - PARTITION BY device_id - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RPR_FIRST(A.time) AS event_start, - RPR_LAST(A.time) AS event_end, - MAX(A.humidity) AS max_humidity - ONE ROW PER MATCH - PATTERN (A+) - DEFINE - A AS A.region= '上海' AND A.humidity> 35 -) AS m -``` - -* 查询结果 - -```SQL -+---------+-----+-----------------------------+-----------------------------+------------+ -|device_id|match| event_start| event_end|max_humidity| -+---------+-----+-----------------------------+-----------------------------+------------+ -| 100| 1|2024-11-28T09:00:00.000+08:00|2024-11-29T18:30:00.000+08:00| 45.1| -| 101| 1|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 35.2| -+---------+-----+-----------------------------+-----------------------------+------------+ -Total line number = 2 -``` - -## 4. 
实际案例 - -### 4.1 海拔高度监测 - -* **业务背景** - -石油运输车辆在油品运输过程中,海拔高度会直接影响环境气压:海拔越高,气压越低,油品挥发风险越高。为精准评估油品自然损耗情况,需通过北斗定位数据识别海拔异常事件,为损耗评估提供数据支撑。 - -* **数据结构** - -监测数据表包含以下核心字段: - -| **ColumnName** | DataType | Category | Comment | -| ---------------------- | ----------- | ---------- | ------------------------ | -| time | TIMESTAMP | TIME | 数据采集时间 | -| device\_id | STRING | TAG | 车辆设备编号(分区键) | -| department | STRING | FIELD | 所属部门 | -| altitude | DOUBLE | FIELD | 海拔高度(单位:米) | - -* **业务需求** - -识别运输车辆的海拔异常事件:当车辆海拔高度超过 500 米,后续又降至 500 米以下时,视为一个完整的异常事件。需计算每个事件的核心指标: - -* 事件起始时间(海拔首次超过 500 米的时间); -* 事件结束时间(海拔最后一次高于 500 米的时间); -* 事件期间该车辆的最大海拔值。 - -![](/img/pattern-query-altitude.png) - -* **实现方法** - -```SQL -SELECT * -FROM beidou -MATCH_RECOGNIZE ( - PARTITION BY device_id -- 按车辆设备分区 - ORDER BY time -- 按时间排序 - MEASURES - RPR_FIRST(A.time) AS ts_s, -- 事件起始时间 - RPR_LAST(A.time) AS ts_e, -- 事件结束时间 - MAX(A.altitude) AS max_a -- 事件最大海拔 - PATTERN (A+) -- 匹配连续的海拔超500米的记录 - DEFINE - A AS A.altitude > 500 -- 定义A为海拔高于500米的记录 -) -``` - -### 4.2 安全注入操作识别 - -* **业务背景** - -核电站需定期执行安全检测试验(如 PT1RPA010《用 1 RPA 601KC 进行安全注入逻辑试验》),以验证发电设备无损伤。该类试验会导致水管流量呈现特征性变化,中控系统需识别该流量模式,及时汇报异常行为,保障设备安全。 - -* **数据结构** - -传感器数据表包含以下核心字段: - -| **ColumnName** | DataType | Category | Comment | -| ---------------------- | ----------- | ---------- | ------------------------ | -| time | TIMESTAMP | TIME | 数据采集时间 | -| pipe\_id | STRING | TAG | 水管编号(分区键) | -| pressure | DOUBLE | FIELD | 水管压力 | -| flow\_rate | DOUBLE | FIELD | 水管流量(核心监测值) | - -* **业务需求** - -识别 PT1RPA010 试验对应的流量特征模式:正常流量→持续下降→极低流量(<0.5)→持续回升→恢复正常流量。需提取该模式的核心指标: - -* 模式整体起始时间(初始正常流量的时间); -* 模式整体终止时间(恢复正常流量的时间); -* 极低流量阶段的起始 / 结束时间; -* 极低流量阶段的最小流量值。 - -![](/img/pattern-query-flow.png) - -* **实现方法** - -```SQL -SELECT * FROM sensor MATCH_RECOGNIZE( - PARTITION BY pipe_id -- 按水管编号分区 - ORDER BY time -- 按时间排序 - MEASURES - A.time AS start_ts, -- 模式整体起始时间 - E.time AS end_ts, -- 模式整体终止时间 - RPR_FIRST(C.time) AS low_start_ts, -- 极低流量起始时间 - RPR_LAST(C.time) AS low_end_ts, -- 极低流量结束时间 - 
MIN(C.flow_rate) AS min_low_flow -- 极低流量最小值 - ONE ROW PER MATCH -- 每个匹配模式仅输出1行结果 - PATTERN(A B+? C+ D+? E) -- 匹配正常→下降→极低→回升→正常的流量模式 - DEFINE - A AS flow_rate BETWEEN 2 AND 2.5, -- 初始正常流量 - B AS flow_rate < PREV(B.flow_rate), -- 流量持续下降 - C AS flow_rate < 0.5, -- 极低流量阈值 - D AS flow_rate > PREV(D.flow_rate), -- 流量持续回升 - E AS flow_rate BETWEEN 2 AND 2.5 -- 恢复正常流量 -); -``` - -### 4.3 极端运行阵风(草帽风)识别 - -* **业务背景** - -风力发电场景中,“极端运行阵风(草帽风)” 是一种短时间(约 10 秒)、波峰显著的正弦形阵风,这类阵风会对风机造成物理损伤。识别该类阵风并统计发生频率,可有效评估风机受损风险,指导设备维护。 - -* **数据结构** - -风机传感器数据表核心字段: - -| **ColumnName** | DataType | Category | Comment | -| ---------------------- | ----------- | ---------- | ------------------------ | -| time | TIMESTAMP | TIME | 风速采集时间 | -| speed | DOUBLE | FIELD | 风机处风速(核心指标) | - -* **业务需求** - -识别 “草帽风” 的特征模式:风力缓慢下降→急剧增加→急剧减少→缓慢增加至初始值(全程约 10 秒)。核心目标是统计该类阵风的发生次数,为风机风险评估提供依据。 - -![](/img/pattern-query-speed.png) - -* **实现方法** - -```SQL -SELECT COUNT(*) -- 统计极端阵风发生次数 -FROM sensor -MATCH_RECOGNIZE( - ORDER BY time -- 按时间排序 - MEASURES - RPR_FIRST(B.time) AS ts_s, -- 阵风起始时间 - RPR_LAST(D.time) AS ts_e -- 阵风结束时间 - PATTERN (B+ R+? F+? D+? E) -- 匹配草帽风的风速变化模式 - DEFINE - -- B阶段:风速缓慢下降,初始风速>9,首尾风速差<2.5 - B AS speed <= AVG(B.speed) - AND RPR_FIRST(B.speed) > 9 - AND (RPR_FIRST(B.speed) - RPR_LAST(B.speed)) < 2.5, - -- R阶段:风速急剧增加(高于阶段平均风速) - R AS speed >= AVG(R.speed), - -- F阶段:风速急剧减少,阶段最大风速>16(波峰阈值) - F AS speed <= AVG(F.speed) - AND MAX(F.speed) > 16, - -- D阶段:风速缓慢增加,首尾风速差<2.5 - D AS speed >= AVG(D.speed) - AND (RPR_LAST(D.speed) - RPR_FIRST(D.speed)) < 2.5, - -- E阶段:风速恢复至初始值±0.2,全程时长<11秒 - E AS speed - RPR_FIRST(B.speed) BETWEEN -0.2 AND 0.2 - AND time - RPR_FIRST(B.time) < 11 -); -``` diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Tiered-Storage_timecho.md b/src/zh/UserGuide/Master/Table/User-Manual/Tiered-Storage_timecho.md deleted file mode 100644 index 86c183dc3..000000000 --- a/src/zh/UserGuide/Master/Table/User-Manual/Tiered-Storage_timecho.md +++ /dev/null @@ -1,101 +0,0 @@ - - -# 多级存储 -## 1. 
概述 - -多级存储功能向用户提供多种存储介质管理的能力,用户可以使用多级存储功能为 IoTDB 配置不同类型的存储介质,并为存储介质进行分级。具体的,在 IoTDB 中,多级存储的配置体现为多目录的管理。用户可以将多个存储目录归为同一类,作为一个“层级”向 IoTDB 中配置,这种“层级”我们称之为 storage tier;同时,用户可以根据数据的冷热进行分类,并将不同类别的数据存储到指定的“层级”中。当前 IoTDB 支持通过数据的 TTL 进行冷热数据的分类,当一个层级中的数据不满足当前层级定义的 TTL 规则时,该数据会被自动迁移至下一层级中。 - -## 2. 参数定义 - -在 IoTDB 中开启多级存储,需要进行以下几个方面的配置: - -1. 配置数据目录,并将数据目录分为不同的层级 -2. 配置每个层级所管理的数据的 TTL,以区分不同层级管理的冷热数据类别。 -3. 配置每个层级的最小剩余存储空间比例,当该层级的存储空间触发该阈值时,该层级的数据会被自动迁移至下一层级(可选)。 - -具体的参数定义及其描述如下。 - -| 配置项 | 默认值 | 是否必填 | 说明 | 约束 | -| --------------------------------------- | ------------------------ | --- | ------------------------------------------------------------ | ------------------------------------------------------------ | -| dn_data_dirs | data/datanode/data | 是 | 用来指定不同的存储目录,并将存储目录进行层级划分 | 每级存储使用分号分隔,单级内使用逗号分隔;云端配置只能作为最后一级存储且第一级不能作为云端存储;最多配置一个云端对象;远端存储目录使用 OBJECT_STORAGE 来表示 | -| tier_ttl_in_ms | -1 | 是 | 定义每个层级负责的数据范围,通过 TTL 表示 | 每级存储使用分号分隔;层级数量需与 dn_data_dirs 定义的层级数一致;"-1" 表示"无限制" | -| dn_default_space_usage_thresholds | 0.85 | 是 | 定义每个层级数据目录的最大使用空间比例;当使用空间大于该比例时,数据会被自动迁移至下一个层级;当最后一个层级的使用存储空间大于此阈值时,会将系统置为 READ_ONLY | 每级存储使用分号分隔;层级数量需与 dn_data_dirs 定义的层级数一致 | -| object_storage_type | AWS_S3 | 使用远端存储时必填 | 云端存储类型 | IoTDB 支持 S3 协议作为远端存储类型 | -| object_storage_bucket | iotdb_data | 使用远端存储时必填 | 云端存储 bucket 的名称 | AWS S3 中的 bucket 定义 | -| object_storage_region | | 使用远端存储时必填 | 云端存储的服务区域 | AWS S3 中的 region 定义 | -| object_storage_endpoint | | 使用远端存储时必填 | 云端存储的 endpoint | AWS S3 的 endpoint | -| object_storage_access_key | | 使用远端存储时必填 | 云端存储的验证信息 key | AWS S3 的 credential key | -| object_storage_access_secret | | 使用远端存储时必填 | 云端存储的验证信息 secret | AWS S3 的 credential secret | -| enable_path_style_access | false | 否 | 是否启用云端存储服务路径访问 | | -| remote_tsfile_cache_dirs | data/datanode/data/cache | 否 | 云端存储在本地的缓存目录 | | -| remote_tsfile_cache_page_size_in_kb | 20480 | 否 | 云端存储在本地缓存文件的块大小 | | -| remote_tsfile_cache_max_disk_usage_in_mb | 51200 | 否 | 云端存储本地缓存的最大磁盘占用大小 | | - - -## 3. 
本地多级存储配置示例 - -以下为本地两级存储的配置示例。 - -```JavaScript -// 必须配置项 -dn_data_dirs=/data1/data;/data2/data,/data3/data -tier_ttl_in_ms=86400000;-1 -dn_default_space_usage_thresholds=0.2;0.1 -``` - -在该示例中,共配置了两个层级的存储,具体为: - -| **层级** | **数据目录** | **数据范围** | **磁盘最小剩余空间阈值** | -| -------- | -------------------------------------- | --------------- | ------------------------ | -| 层级一 | 目录一:/data1/data | 最近 1 天的数据 | 20% | -| 层级二 | 目录一:/data2/data;目录二:/data3/data | 1 天以前的数据 | 10% | - -## 4. 远端多级存储配置示例 - -以下以三级存储为例: - -```JavaScript -// 必须配置项 -dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE -tier_ttl_in_ms=86400000;864000000;-1 -dn_default_space_usage_thresholds=0.2;0.15;0.1 -object_storage_type=AWS_S3 -object_storage_bucket=iotdb -object_storage_region= -object_storage_endpoint= -object_storage_access_key= -object_storage_access_secret= - -// 可选配置项 -enable_path_style_access=false -remote_tsfile_cache_dirs=data/datanode/data/cache -remote_tsfile_cache_page_size_in_kb=20480 -remote_tsfile_cache_max_disk_usage_in_mb=51200 -``` - -在该示例中,共配置了三个层级的存储,具体为: - -| **层级** | **数据目录** | **数据范围** | **磁盘最小剩余空间阈值** | -| -------- | -------------------------------------- | ---------------------------- | ------------------------ | -| 层级一 | 目录一:/data1/data | 最近 1 天的数据 | 20% | -| 层级二 | 目录一:/data2/data;目录二:/data3/data | 过去 1 天至过去 10 天内的数据 | 15% | -| 层级三 | 远端 S3 协议存储 | 过去 10 天以前的数据 | 10% | \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Timeseries-Featured-Analysis_timecho.md b/src/zh/UserGuide/Master/Table/User-Manual/Timeseries-Featured-Analysis_timecho.md deleted file mode 100644 index 003fdcedb..000000000 --- a/src/zh/UserGuide/Master/Table/User-Manual/Timeseries-Featured-Analysis_timecho.md +++ /dev/null @@ -1,1721 +0,0 @@ - - -# 时序特色分析 - -IoTDB 针对时序数据的特色分析场景,提供了模式查询与窗口函数两大核心能力,为时序数据的深度挖掘与复杂计算提供了灵活高效的解决方案。下文将对两大功能进行详细的介绍。 - -## 1. 
模式查询 - -### 1.1 概述 - -模式查询支持通过定义模式变量的识别逻辑以及正则表达式来捕获一段连续的数据,并对每一段捕获的数据进行分析计算,适用于识别时序数据中的特定模式(如下图所示)、检测特定事件等业务场景。 - -![](/img/timeseries-featured-analysis-1.png) - -> 注意:该功能从 V 2.0.5 版本开始提供。 - -### 1.2 功能介绍 -#### 1.2.1 语法格式 - -```SQL -MATCH_RECOGNIZE ( - [ PARTITION BY column [, ...] ] - [ ORDER BY column [, ...] ] - [ MEASURES measure_definition [, ...] ] - [ ROWS PER MATCH ] - [ AFTER MATCH skip_to ] - PATTERN ( row_pattern ) - [ SUBSET subset_definition [, ...] ] - DEFINE variable_definition [, ...] -) -``` - -**说明:** - -* PARTITION BY : 可选,用于对输入表进行分组,每个分组能独立进行模式匹配。如果未声明该子句,则整个输入表将作为一个整体进行处理。 -* ORDER BY :可选,用于确保输入数据按某种顺序进行匹配处理。 -* MEASURES :可选,用于指定从匹配到的一段数据中提取哪些信息。 -* ROWS PER MATCH :可选,用于指定模式匹配成功后结果集的输出方式。 -* AFTER MATCH SKIP :可选,用于指定在识别到一个非空匹配后,下一次模式匹配应从哪一行继续进行。 -* PATTERN :用于定义需要匹配的行模式。 -* SUBSET :可选,用于将多个基本模式变量所匹配的行合并为一个逻辑集合。 -* DEFINE :用于定义行模式的基本模式变量。 - -**语法示例原始数据:** - -```SQL -IoTDB:database3> select * from t -+-----------------------------+------+----------+ -| time|device|totalprice| -+-----------------------------+------+----------+ -|2025-01-01T00:01:00.000+08:00| d1| 90| -|2025-01-01T00:02:00.000+08:00| d1| 80| -|2025-01-01T00:03:00.000+08:00| d1| 70| -|2025-01-01T00:04:00.000+08:00| d1| 80| -|2025-01-01T00:05:00.000+08:00| d1| 70| -|2025-01-01T00:06:00.000+08:00| d1| 80| -+-----------------------------+------+----------+ - --- 创建语句 -create table t(device tag, totalprice int32 field) - -insert into t(time,device,totalprice) values(2025-01-01T00:01:00, 'd1', 90),(2025-01-01T00:02:00, 'd1', 80),(2025-01-01T00:03:00, 'd1', 70),(2025-01-01T00:04:00, 'd1', 80),(2025-01-01T00:05:00, 'd1', 70),(2025-01-01T00:06:00, 'd1', 80) -``` - -#### 1.2.2 DEFINE 子句 - -用于为模式识别中的每个基本模式变量指定其判断条件。这些变量通常由标识符(如 `A`, `B`)代表,并通过该子句中的布尔表达式精确定义哪些行符合该变量的要求。 - -* 在模式匹配执行过程中,仅当布尔表达式返回 TRUE 时,才会将当前行标记为该变量,从而将其纳入到当前匹配分组中。 - -```SQL --- 只有在当前行的 totalprice 值小于前一行 totalprice 值的情况下,当前行才可以被识别为 B。 -DEFINE B AS totalprice < PREV(totalprice) -``` - -* 
**未**在子句中**显式**定义的变量,其匹配条件隐含为恒真(TRUE),即可在任何输入行上成功匹配。 - -#### 1.2.3 SUBSET 子句 - -用于将多个基本模式变量(如 `A`、`B`)匹配到的行合并成一个联合模式变量(如 `U`),使这些行可以被视为同一个逻辑集合进行操作。可用于`MEASURES`、`DEFINE `和`AFTER MATCH SKIP`子句。 - -```SQL -SUBSET U = (A, B) -``` - -例如,对于模式 `PATTERN ((A | B){5} C+)` ,在匹配过程中无法确定第五次重复时具体匹配的是基本模式变量 A 还是 B,因此 - -1. 在 `MEASURES `子句中,若需要引用该阶段最后一次匹配到的行,则可通过定义联合模式变量 `SUBSET U = (A, B)`实现。此时表达式 `RPR_LAST(U.totalprice)` 将直接返回该目标行的 `totalprice` 值。 -2. 在 `AFTER MATCH SKIP` 子句中,若匹配结果中未包含基本模式变量 A 或 B 时,执行 `AFTER MATCH SKIP TO LAST B` 或 `AFTER MATCH SKIP TO LAST A` 会因锚点缺失跳转失败;而通过引入联合模式变量 `SUBSET U = (A, B)`,使用 `AFTER MATCH SKIP TO LAST U` 则始终有效。 - -#### 1.2.4 PATTERN 子句 - -用于定义需要匹配的行模式,其基本构成单元是**基本模式变量。** - -```SQL -PATTERN ( row_pattern ) -``` - -##### 1.2.4.1 模式种类 - -| 行模式 | 语法格式 | 描述 | -| ----------------------------------- |---------------------| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| 模式连接(Pattern Concatenation) | `A B+ C+ D+` | 由不带任何运算符的子模式组成,按声明顺序依次匹配所有子模式。| -| 模式选择(Pattern Alternation) | `A \| B \| C` | 由以`\|`分隔的多个子模式组成,仅匹配其中一个。当多个子模式均可匹配时,选择最左侧的子模式匹配。 | -| 模式排列(Pattern Permutation) | `PERMUTE(A, B, C)` | 该模式等价于对所有子模式元素的不同顺序进行选择匹配,即要求 A、B、C 三者均须匹配,但其出现顺序不固定。当多种匹配顺序均可成功时,依据 PERMUTE 列表中元素的定义先后顺序,按**字典序原则**确定优先级。例如,A B C 为最高优先,C B A 则为最低优先。 | -| 模式分组(Pattern Grouping) | `(A B C)` | 用圆括号将子模式括起,视作一个整体对待,可与其他运算符配合使用。如`(A B C)+`表示连续出现一组`(A B C)`的模式。 | -| 空模式(Empty Pattern) | `()` | 表示一个不包含任何行的空匹配 | -| 模式排除(Pattern Exclusion) | `{- row_pattern -}` | 用于指定在输出中需要排除的匹配部分。通常与`ALL ROWS PER MATCH`选项结合使用,用于输出感兴趣的行。如`PATTERN (A {- B+ C+ -} D+)`,并使用`ALL ROWS PER MATCH`时,输出将仅包含匹配的首行(`A`对应行)与尾部行(`D+`对应行)。 | - -##### 1.2.4.2 分区起始/结束锚点(Partition Start/End Anchor) - -* `^A` 表示匹配以 A 为分区开始的模式 - * 当 PATTERN 子句的取值为 `^A` 时,要求匹配必须从分区的首行开始,且这一行要满足 `A` 的定义 - * 当 PATTERN 子句的取值为 `^A^` 或 `A^` 
时,输出结果为空 -* `A$` 表示匹配以 A 为分区结束的模式 - * 当 PATTERN 子句的取值为 `A$` 时,要求必须在分区的结束位置匹配,并且这一行要满足 `A`的定义 - * 当 PATTERN 子句的取值为 `$A` 或 `$A$` 时,输出结果为空 - -**示例说明** - -* 查询 sql - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - AFTER MATCH SKIP PAST LAST ROW - PATTERN %s -- PATTERN 子句 - DEFINE A AS true -) AS m; -``` - -* 查询结果 - * 当 PATTERN 子句为 PATTERN (^A) 时 - - ![](/img/timeseries-featured-analysis-2.png) - - 实际返回 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * 当 PATTERN 子句为 PATTERN (^A^) 时,输出的结果为空,因为不可能从分区的起始位置开始匹配了一个 A 之后,又回到分区的起始位置 - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. - ``` - - * 当 PATTERN 子句为 PATTERN (A\$) 时 - - ![](/img/timeseries-featured-analysis-3.png) - - 实际返回 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:06:00.000+08:00| 1| 80| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * 当 PATTERN 子句为 PATTERN (\$A\$) 时,输出的结果为空 - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. 
    ```

##### 1.2.4.3 Quantifiers

A quantifier specifies how many times a sub-pattern repeats. It is placed directly after the sub-pattern, for example `(A | B)*`.

The common quantifiers are:

| Quantifier | Description |
| ---------- | ----------- |
| `*`        | Zero or more repetitions |
| `+`        | One or more repetitions |
| `?`        | Zero or one repetition |
| `{n}`      | Exactly n repetitions |
| `{m, n}`   | Between m and n repetitions (m and n are non-negative integers). If the lower bound is omitted, it defaults to 0. If the upper bound is omitted, the number of repetitions is unbounded (for example, `{5,}` means "at least five repetitions"). If both bounds are omitted, that is `{,}`, the quantifier is equivalent to `*`. |

* Appending `?` to a quantifier reverses its matching preference.
  * `{3,5}` prefers 5 repetitions and least prefers 3; `{3,5}?` prefers 3 repetitions and least prefers 5.
  * `?` prefers 1 repetition; `??` prefers 0 repetitions.

#### 1.2.5 AFTER MATCH SKIP Clause

Specifies the row at which the next pattern match attempt starts after a non-empty match has been found.

| Skip strategy | Description | Allows overlapping matches |
| ------------- | ----------- | -------------------------- |
| `AFTER MATCH SKIP PAST LAST ROW` | Default behavior. Resume at the row following the last row of the current match. | No |
| `AFTER MATCH SKIP TO NEXT ROW` | Resume at the second row of the current match. | Yes |
| `AFTER MATCH SKIP TO [ FIRST \| LAST ] pattern_variable` | Resume at the first or the last row mapped to the given pattern variable. | Yes |

* Of all possible configurations, only the combination of `ALL ROWS PER MATCH WITH UNMATCHED ROWS` and `AFTER MATCH SKIP PAST LAST ROW` guarantees that exactly one output row is produced for every input row.

**Example**

* Query SQL

```SQL
SELECT m.time, m.match, m.price, m.label
FROM t
MATCH_RECOGNIZE (
    ORDER BY time
    MEASURES
        MATCH_NUMBER() AS match,
        RUNNING RPR_LAST(totalprice) AS price,
        CLASSIFIER() AS label
    ALL ROWS PER MATCH
    %s -- AFTER MATCH SKIP clause
    PATTERN (A B+ C+ D?)
- SUBSET U = (C, D) - DEFINE - B AS B.totalprice < PREV (B.totalprice), - C AS C.totalprice > PREV (C.totalprice), - D AS false -- 永远不会匹配成功 -) AS m; -``` - -* 查询结果 - * 当 AFTER MATCH SKIP PAST LAST ROW 时 - - ![](/img/timeseries-featured-analysis-4.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:根据 `AFTER MATCH SKIP PAST LAST ROW` 语义,从第 5 行开始,无法再找寻到一个合法匹配 - * 此模式一定不会出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 4 - ``` - - * 当 AFTER MATCH SKIP TO NEXT ROW 时 - - ![](/img/timeseries-featured-analysis-5.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:根据 `AFTER MATCH SKIP TO NEXT ROW` 语义,从第 2 行开始,匹配:第 2、3、4 行 - * 第三次匹配:尝试从第 3 行开始,失败 - * 第三次匹配:尝试从第 4 行开始,成功,匹配第 4、5、6行 - * 此模式允许出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:02:00.000+08:00| 2| 80| A| - |2025-01-01T00:03:00.000+08:00| 2| 70| B| - |2025-01-01T00:04:00.000+08:00| 2| 80| C| - |2025-01-01T00:04:00.000+08:00| 3| 80| A| - |2025-01-01T00:05:00.000+08:00| 3| 70| B| - |2025-01-01T00:06:00.000+08:00| 3| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 10 - ``` - - * 当 AFTER MATCH SKIP TO FIRST C 时 - - ![](/img/timeseries-featured-analysis-6.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:从第一个 C (也就是第 4 行)处开始,匹配第4、5、6行 - * 此模式允许出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - 
    +-----------------------------+-----+-----+-----+
    |2025-01-01T00:01:00.000+08:00|    1|   90|    A|
    |2025-01-01T00:02:00.000+08:00|    1|   80|    B|
    |2025-01-01T00:03:00.000+08:00|    1|   70|    B|
    |2025-01-01T00:04:00.000+08:00|    1|   80|    C|
    |2025-01-01T00:04:00.000+08:00|    2|   80|    A|
    |2025-01-01T00:05:00.000+08:00|    2|   70|    B|
    |2025-01-01T00:06:00.000+08:00|    2|   80|    C|
    +-----------------------------+-----+-----+-----+
    Total line number = 7
    ```

  * With `AFTER MATCH SKIP TO LAST B` or `AFTER MATCH SKIP TO B`

    ![](/img/timeseries-featured-analysis-7.png)

    * First match: rows 1, 2, 3, 4.
    * Second match: the attempt to start at the last B (row 3) fails.
    * Second match: the attempt to start at row 4 succeeds and matches rows 4, 5, 6.
    * Overlapping matches are allowed under this strategy.

    ```SQL
    +-----------------------------+-----+-----+-----+
    |                         time|match|price|label|
    +-----------------------------+-----+-----+-----+
    |2025-01-01T00:01:00.000+08:00|    1|   90|    A|
    |2025-01-01T00:02:00.000+08:00|    1|   80|    B|
    |2025-01-01T00:03:00.000+08:00|    1|   70|    B|
    |2025-01-01T00:04:00.000+08:00|    1|   80|    C|
    |2025-01-01T00:04:00.000+08:00|    2|   80|    A|
    |2025-01-01T00:05:00.000+08:00|    2|   70|    B|
    |2025-01-01T00:06:00.000+08:00|    2|   80|    C|
    +-----------------------------+-----+-----+-----+
    Total line number = 7
    ```

  * With `AFTER MATCH SKIP TO U`

    ![](/img/timeseries-featured-analysis-8.png)

    * First match: rows 1, 2, 3, 4.
    * Second match: `SKIP TO U` jumps to the last C or D. D can never match, so the jump lands on the last C (row 4), and rows 4, 5, 6 are matched.
    * Overlapping matches are allowed under this strategy.

    ```SQL
    +-----------------------------+-----+-----+-----+
    |                         time|match|price|label|
    +-----------------------------+-----+-----+-----+
    |2025-01-01T00:01:00.000+08:00|    1|   90|    A|
    |2025-01-01T00:02:00.000+08:00|    1|   80|    B|
    |2025-01-01T00:03:00.000+08:00|    1|   70|    B|
    |2025-01-01T00:04:00.000+08:00|    1|   80|    C|
    |2025-01-01T00:04:00.000+08:00|    2|   80|    A|
    |2025-01-01T00:05:00.000+08:00|    2|   70|    B|
    |2025-01-01T00:06:00.000+08:00|    2|   80|    C|
    +-----------------------------+-----+-----+-----+
    Total line number = 7
    ```

  * With `AFTER MATCH SKIP TO A`, an error is raised: skipping to the first row of the match is not allowed, because it would cause an infinite loop.

    ```SQL
    Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: cannot skip to first row of match
    ```

  * With `AFTER MATCH SKIP TO D`, an error is raised: skipping to a pattern variable that is not present in the match is not allowed (D is defined as `false`, so it never matches).

    ```SQL
    Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: pattern variable is not present in match
    ```

#### 1.2.6 ROWS PER MATCH Clause

Specifies how the result set is output after a pattern match succeeds. There are two main options:

| Output mode | Rule | Output columns | Handling of empty matches and unmatched rows |
| ----------- | ---- | -------------- | -------------------------------------------- |
| ONE ROW PER MATCH | Each successful match produces one output row. | Columns of the PARTITION BY clause; expressions defined in the MEASURES clause. | Empty matches are output; unmatched rows are skipped. |
| ALL ROWS PER MATCH | Every row of a match produces one output row, unless the row is excluded with the exclusion syntax. | Columns of the PARTITION BY clause; columns of the ORDER BY clause; expressions defined in the MEASURES clause; the remaining columns of the input table. | Default: empty matches are output and unmatched rows are skipped. ALL ROWS PER MATCH **SHOW EMPTY MATCHES**: same as the default. ALL ROWS PER MATCH **OMIT EMPTY MATCHES**: empty matches are not output and unmatched rows are skipped. ALL ROWS PER MATCH **WITH UNMATCHED ROWS**: empty matches are output, and one additional output row is produced for each unmatched row. |

#### 1.2.7 MEASURES Clause

Specifies which information to extract from a matched segment of data. The clause is optional. If it is not specified, some of the input columns become the output of pattern recognition, depending on the ROWS PER MATCH setting.

```SQL
MEASURES measure_expression AS measure_name [, ...]
-``` - -* `measure_expression` 是根据匹配的一段数据计算出的标量值。 - -| 用法示例 | 说明 | -| ---------------------------------------------- | -------------------------------------------------------------------------------------------------------------- | -| `A.totalprice AS starting_price` | 返回匹配分组中第一行(即与变量 A 关联的唯一一行)中的价格,作为起始价格。 | -| `RPR_LAST(B.totalprice) AS bottom_price` | 返回与变量 B 关联的最后一行中的价格,代表“V”形模式中最低点的价格,对应下降区段的末尾。 | -| `RPR_LAST(U.totalprice) AS top_price` | 返回匹配分组中的最高价格,对应变量 C 或 D 所关联的最后一行,即整个匹配分组的末尾。【假设 SUBSET U = (C, D)】 | - -* 每个 `measure_expression `都会定义一个输出列,该列可通过其指定的 `measure_name `进行引用。 - -#### 1.2.8 模式查询表达式 - -在 MEASURES 与 DEFINE 子句中使用的表达式为​**标量表达式**​,用于在输入表的行级上下文中求值。**标量表达式**除了支持标准 SQL 语法外,还支持针对模式查询的特殊扩展函数。 - -##### 1.2.8.1 模式变量引用 - -```SQL -A.totalprice -U.orderdate -orderstatus -``` - -* 当列名前缀为某**基本模式变量**或**联合模式变量**时,表示引用该变量所匹配的所有行的对应列值。 -* 若列名不带前缀,则等同于使用“​**全局联合模式变量**​”(即所有基本模式变量的并集)作前缀,表示引用当前匹配中所有行的该列值。 - -> 不允许在模式识别表达式中使用表名作列名前缀。 - -##### 1.2.8.2 扩展函数 - -| 函数名 | 函数式 | 描述 | -|------------------| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `MATCH_NUMBER`函数 | `MATCH_NUMBER()` | 返回当前匹配在分区内的序号,从 1 开始计数。空匹配与非空匹配一致,也占用匹配序号。 | -| `CLASSIFIER `函数 | `CLASSIFIER(option)`| 1. 返回当前行所映射的基本模式变量名称。1. `option`是一个可选参数:可以传入基本模式变量`CLASSIFIER(A)`或联合模式变量`CLASSIFIER(U)`,用于限定函数作用范围,对于不在范围内的行,直接返回 NULL。在对联合模式变量使用时,可用于辨别该行究竟映射至并集中哪一个基本模式变量。 | -| 逻辑导航函数 | `RPR_FIRST(expr, k)` | 1. 
表示从**当前匹配分组**中,定位至第一个满足 expr 的行,在此基础上再向分组尾部方向搜索到第 k 次出现的同一模式变量对应行,返回该行的指定列值。如果在指定方向上未能找到第 k 次匹配行,则函数返回 NULL。1. 其中 k 是可选参数,默认为 0,表示仅定位至首个满足条件的行;若显式指定,必须为非负整数。 | -| 逻辑导航函数 | `RPR_LAST(expr, k)`| 1. 表示从**当前匹配分组**中,定位至最后一个满足 expr 的行,在此基础上再向分组开头方向搜索到第 k 次出现的同一模式变量对应行,返回该行的指定列值。如果在指定方向上未能找到第 k 次匹配行,则函数返回 NULL。1. 其中 k 是可选参数,默认为 0,表示仅定位至末个满足条件的行;若显式指定,必须为非负整数。 | -| 物理导航函数 | `PREV(expr, k)` | 1. 表示从最后一次匹配至给定模式变量的行开始,向开头方向偏移 k 行,返回对应列值。若导航超出​**分区边界**​,则函数返回 NULL。1. 其中 k 是可选参数,默认为 1;若显式指定,必须为非负整数。 | -| 物理导航函数 |`NEXT(expr, k)` | 1. 表示从最后一次匹配至给定模式变量的行开始,向尾部方向偏移 k 行,返回对应列值。若导航超出​**分区边界**​,则函数返回 NULL。1. 其中 k 是可选参数,默认为 1;若显式指定,必须为非负整数。 | -| 聚合函数 | COUNT、SUM、AVG、MAX、MIN 函数 | 可用于对当前匹配中的数据进行计算。聚合函数与导航函数不允许互相嵌套。(V 2.0.6 版本起支持) | -| 嵌套函数 | `PREV/NEXT(CLASSIFIER())` | 物理导航函数与 CLASSIFIER 函数嵌套。用于获取当前行的前一个和后一个匹配行所对应的模式变量 | -| 嵌套函数 |`PREV/NEXT(RPR_FIRST/RPR_LAST(expr, k)`) | 物理函数内部**允许嵌套**逻辑函数,逻辑函数内部**不允许嵌套**物理函数。用于先进行逻辑偏移,再进行物理偏移。 | - -**示例说明** - -1. CLASSIFIER 函数 - -* 查询 sql - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* 分析过程 - - ![](/img/timeseries-featured-analysis-9.png) - -* 查询结果 - -```SQL -+-----------------------------+-----+-----+---------------+-----+ -| time|match|price|lower_or_higher|label| -+-----------------------------+-----+-----+---------------+-----+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| -+-----------------------------+-----+-----+---------------+-----+ 
-Total line number = 6 -``` - -2. 逻辑导航函数 - -* 查询 sql - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- MEASURES 子句 - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` - -* 查询结果 - * 当取值为 totalprice、RPR\_LAST(totalprice)、RUNNING RPR\_LAST(totalprice) 时 - - ![](/img/timeseries-featured-analysis-10.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * 当取值为 FINAL RPR\_LAST(totalprice) 时 - - ![](/img/timeseries-featured-analysis-11.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * 当取值为 RPR\_FIRST(totalprice)、 RUNNING RPR\_FIRST(totalprice)、FINAL RPR\_FIRST(totalprice)时 - - ![](/img/timeseries-featured-analysis-12.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 90| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 90| - |2025-01-01T00:05:00.000+08:00| 90| - |2025-01-01T00:06:00.000+08:00| 90| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * 当取值为 RPR\_LAST(totalprice, 2) 时 - - ![](/img/timeseries-featured-analysis-13.png) - - 实际返回 - - ```SQL - 
    +-----------------------------+-------+
    |                         time|measure|
    +-----------------------------+-------+
    |2025-01-01T00:01:00.000+08:00|   null|
    |2025-01-01T00:02:00.000+08:00|   null|
    |2025-01-01T00:03:00.000+08:00|     90|
    |2025-01-01T00:04:00.000+08:00|     80|
    |2025-01-01T00:05:00.000+08:00|     70|
    |2025-01-01T00:06:00.000+08:00|     80|
    +-----------------------------+-------+
    Total line number = 6
    ```

  * When the measure is `FINAL RPR_LAST(totalprice, 2)`

    ![](/img/timeseries-featured-analysis-14.png)

    Actual result:

    ```SQL
    +-----------------------------+-------+
    |                         time|measure|
    +-----------------------------+-------+
    |2025-01-01T00:01:00.000+08:00|     80|
    |2025-01-01T00:02:00.000+08:00|     80|
    |2025-01-01T00:03:00.000+08:00|     80|
    |2025-01-01T00:04:00.000+08:00|     80|
    |2025-01-01T00:05:00.000+08:00|     80|
    |2025-01-01T00:06:00.000+08:00|     80|
    +-----------------------------+-------+
    Total line number = 6
    ```

  * When the measure is `RPR_FIRST(totalprice, 2)` or `FINAL RPR_FIRST(totalprice, 2)`

    ![](/img/timeseries-featured-analysis-15.png)

    Actual result:

    ```SQL
    +-----------------------------+-------+
    |                         time|measure|
    +-----------------------------+-------+
    |2025-01-01T00:01:00.000+08:00|     70|
    |2025-01-01T00:02:00.000+08:00|     70|
    |2025-01-01T00:03:00.000+08:00|     70|
    |2025-01-01T00:04:00.000+08:00|     70|
    |2025-01-01T00:05:00.000+08:00|     70|
    |2025-01-01T00:06:00.000+08:00|     70|
    +-----------------------------+-------+
    Total line number = 6
    ```

3. 
物理导航函数 - -* 查询 sql - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- MEASURES 子句 - ALL ROWS PER MATCH - PATTERN (B) - DEFINE B AS B.totalprice >= PREV(B.totalprice) -) AS m; -``` - -* 查询结果 - * 当取值为 `PREV(totalprice)` 时 - - ![](/img/timeseries-featured-analysis-16.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * 当取值为 `PREV(B.totalprice, 2)` 时 - - ![](/img/timeseries-featured-analysis-17.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * 当取值为 `PREV(B.totalprice, 4)` 时 - - ![](/img/timeseries-featured-analysis-18.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| null| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * 当取值为 `NEXT(totalprice)` 或 `NEXT(B.totalprice, 1)` 时 - - ![](/img/timeseries-featured-analysis-19.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| null| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * `当取值为 NEXT(B.totalprice, 2)` 时 - - ![](/img/timeseries-featured-analysis-20.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| null| - 
+-----------------------------+-------+ - Total line number = 2 - ``` - -4. 聚合函数 - -* 查询 sql - -```SQL -SELECT m.time, m.count, m.avg, m.sum, m.min, m.max -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - COUNT(*) AS count, - AVG(totalprice) AS avg, - SUM(totalprice) AS sum, - MIN(totalprice) AS min, - MAX(totalprice) AS max - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* 分析过程(以 MIN(totalprice)为例) - -![](/img/timeseries-featured-analysis-21.png) - -* 查询结果 - -```SQL -+-----------------------------+-----+-----------------+-----+---+---+ -| time|count| avg| sum|min|max| -+-----------------------------+-----+-----------------+-----+---+---+ -|2025-01-01T00:01:00.000+08:00| 1| 90.0| 90.0| 90| 90| -|2025-01-01T00:02:00.000+08:00| 2| 85.0|170.0| 80| 90| -|2025-01-01T00:03:00.000+08:00| 3| 80.0|240.0| 70| 90| -|2025-01-01T00:04:00.000+08:00| 4| 80.0|320.0| 70| 90| -|2025-01-01T00:05:00.000+08:00| 5| 78.0|390.0| 70| 90| -|2025-01-01T00:06:00.000+08:00| 6|78.33333333333333|470.0| 70| 90| -+-----------------------------+-----+-----------------+-----+---+---+ -Total line number = 6 -``` - -5. 
嵌套函数 - -示例一 - -* 查询 sql - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label, m.prev_label, m.next_label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label, - PREV(CLASSIFIER(W)) AS prev_label, - NEXT(CLASSIFIER(W)) AS next_label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* 分析过程 - -![](/img/timeseries-featured-analysis-22.png) - -* 查询结果 - -```SQL -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -| time|match|price|lower_or_higher|label|prev_label|next_label| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| null| A| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| H| null| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| null| A| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| L| null| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| null| A| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| L| null| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -Total line number = 6 -``` - -示例二 - -* 查询 sql - -```SQL -SELECT m.time, m.prev_last_price, m.next_first_price -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - PREV(RPR_LAST(totalprice), 2) AS prev_last_price, - NEXT(RPR_FIRST(totalprice), 2) as next_first_price - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* 分析过程 - -![](/img/timeseries-featured-analysis-23.png) - -* 查询结果 - -```SQL -+-----------------------------+---------------+----------------+ -| time|prev_last_price|next_first_price| -+-----------------------------+---------------+----------------+ -|2025-01-01T00:01:00.000+08:00| null| 70| -|2025-01-01T00:02:00.000+08:00| null| 70| 
|2025-01-01T00:03:00.000+08:00|             90|              70|
|2025-01-01T00:04:00.000+08:00|             80|              70|
|2025-01-01T00:05:00.000+08:00|             70|              70|
|2025-01-01T00:06:00.000+08:00|             80|              70|
+-----------------------------+---------------+----------------+
Total line number = 6
```

##### 1.2.8.3 RUNNING and FINAL Semantics

1. Definition

* `RUNNING`: the computation covers the current match from its first row up to the row currently being processed.
* `FINAL`: the computation covers the current match from its first row up to its last row, that is, the entire match.

2. Scope

* The DEFINE clause always uses RUNNING semantics.
* The MEASURES clause uses RUNNING semantics by default and allows FINAL to be specified. With the ONE ROW PER MATCH output mode, every expression is evaluated from the last row of the match, so RUNNING and FINAL are equivalent.

3. Syntax constraints

* `RUNNING` and `FINAL` must be written before a **logical navigation function** or an aggregate function. They cannot be applied directly to a **column reference**.
  * Valid: `RUNNING RPR_LAST(A.totalprice)`, `FINAL RPR_LAST(A.totalprice)`
  * Invalid: `RUNNING A.totalprice`, `FINAL A.totalprice`, `RUNNING PREV(A.totalprice)`

### 1.3 Scenario Examples

The examples use the [sample data](../Reference/Sample-Data.md) as source data.

#### 1.3.1 Segmenting by Time Interval

Split the data in table1 into segments in which consecutive rows are at most 24 hours apart, and query the row count, start time, and end time of each segment.

* Query SQL

```SQL
SELECT start_time, end_time, cnt
FROM table1
MATCH_RECOGNIZE (
    ORDER BY time
    MEASURES
        RPR_FIRST(A.time) AS start_time,
        RPR_LAST(time) AS end_time,
        COUNT() AS cnt
    PATTERN (A B*)
    DEFINE B AS (CAST(B.time AS INT64) - CAST(PREV(B.time) AS INT64)) <= 86400000
) AS m;
```

* Query result

```SQL
+-----------------------------+-----------------------------+---+
|                   start_time|                     end_time|cnt|
+-----------------------------+-----------------------------+---+
|2024-11-26T13:37:00.000+08:00|2024-11-26T13:38:00.000+08:00|  2|
|2024-11-27T16:38:00.000+08:00|2024-11-30T14:30:00.000+08:00| 16|
+-----------------------------+-----------------------------+---+
Total line number = 2
```

#### 1.3.2 Segmenting by Value Difference

Split the data in table2 into segments in which the humidity of each row exceeds that of the previous row by no more than 0.1, and query the row count, start time, and end time of each segment.

* Query SQL

```SQL
SELECT start_time, end_time, cnt
FROM table2
MATCH_RECOGNIZE (
    ORDER BY time
    MEASURES
        RPR_FIRST(A.time) AS start_time,
        RPR_LAST(time) AS end_time,
        COUNT() AS cnt
    PATTERN (A B*)
    DEFINE B AS (B.humidity - PREV(B.humidity)) <= 0.1
-) AS m; -``` - -* 查询结果 - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-27T00:00:00.000+08:00| 2| -|2024-11-28T08:00:00.000+08:00|2024-11-29T00:00:00.000+08:00| 2| -|2024-11-29T11:00:00.000+08:00|2024-11-30T00:00:00.000+08:00| 2| -+-----------------------------+-----------------------------+---+ -Total line number = 3 -``` - -#### 1.3.3 事件统计查询 - -将 table1 中数据按照设备号分组,统计上海地区湿度大于 35 的开始、结束时间及最大湿度值。 - -* 查询sql - -```SQL -SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity -FROM table1 -MATCH_RECOGNIZE ( - PARTITION BY device_id - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RPR_FIRST(A.time) AS event_start, - RPR_LAST(A.time) AS event_end, - MAX(A.humidity) AS max_humidity - ONE ROW PER MATCH - PATTERN (A+) - DEFINE - A AS A.region= '上海' AND A.humidity> 35 -) AS m -``` - -* 查询结果 - -```SQL -+---------+-----+-----------------------------+-----------------------------+------------+ -|device_id|match| event_start| event_end|max_humidity| -+---------+-----+-----------------------------+-----------------------------+------------+ -| 100| 1|2024-11-28T09:00:00.000+08:00|2024-11-29T18:30:00.000+08:00| 45.1| -| 101| 1|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 35.2| -+---------+-----+-----------------------------+-----------------------------+------------+ -Total line number = 2 -``` - - -## 2. 
Window Functions

### 2.1 Overview

A window function is a special function that computes a result for each row over a specific set of rows related to the current row (called the "window"). It combines grouping (`PARTITION BY`), ordering (`ORDER BY`), and a definable calculation range (the window frame, `FRAME`) to perform complex cross-row calculations without collapsing the original rows. It is commonly used in data analysis for operations such as ranking, cumulative sums, and moving averages.

> Note: this feature is available from V 2.0.5.

For example, a running total of `flow` for each device can be computed with a window function.

```SQL
-- Raw data
+-----------------------------+------+-----+
|                         time|device| flow|
+-----------------------------+------+-----+
|1970-01-01T08:00:00.000+08:00|    d0|    3|
|1970-01-01T08:00:01.000+08:00|    d0|    5|
|1970-01-01T08:00:02.000+08:00|    d0|    3|
|1970-01-01T08:00:03.000+08:00|    d0|    1|
|1970-01-01T08:00:04.000+08:00|    d1|    2|
|1970-01-01T08:00:05.000+08:00|    d1|    4|
+-----------------------------+------+-----+

-- Create the table and insert the data
CREATE TABLE device_flow(device STRING TAG, flow INT32 FIELD);
INSERT INTO device_flow(time, device, flow) VALUES ('1970-01-01T08:00:00.000+08:00','d0',3),('1970-01-01T08:00:01.000+08:00','d0',5),('1970-01-01T08:00:02.000+08:00','d0',3),('1970-01-01T08:00:03.000+08:00','d0',1),('1970-01-01T08:00:04.000+08:00','d1',2),('1970-01-01T08:00:05.000+08:00','d1',4);

-- Run the window function query
SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) AS sum FROM device_flow;
```

After partitioning, ordering, and computation (the steps are broken down in the figure below),

![](/img/window-function-1.png)

the query returns the expected result:

```SQL
+-----------------------------+------+----+----+
|                         time|device|flow| sum|
+-----------------------------+------+----+----+
|1970-01-01T08:00:04.000+08:00|    d1|   2| 2.0|
|1970-01-01T08:00:05.000+08:00|    d1|   4| 6.0|
|1970-01-01T08:00:03.000+08:00|    d0|   1| 1.0|
|1970-01-01T08:00:00.000+08:00|    d0|   3| 7.0|
|1970-01-01T08:00:02.000+08:00|    d0|   3| 7.0|
|1970-01-01T08:00:01.000+08:00|    d0|   5|12.0|
+-----------------------------+------+----+----+
```

### 2.2 Definition

#### 2.2.1 SQL Definition

```SQL
windowDefinition
    : name=identifier AS '(' windowSpecification ')'
    ;

windowSpecification
    : (existingWindowName=identifier)?
      (PARTITION BY partition+=expression (',' partition+=expression)*)?
- (ORDER BY sortItem (',' sortItem)*)? - windowFrame? - ; - -windowFrame - : frameExtent - ; - -frameExtent - : frameType=RANGE start=frameBound - | frameType=ROWS start=frameBound - | frameType=GROUPS start=frameBound - | frameType=RANGE BETWEEN start=frameBound AND end=frameBound - | frameType=ROWS BETWEEN start=frameBound AND end=frameBound - | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound - ; - -frameBound - : UNBOUNDED boundType=PRECEDING #unboundedFrame - | UNBOUNDED boundType=FOLLOWING #unboundedFrame - | CURRENT ROW #currentRowBound - | expression boundType=(PRECEDING | FOLLOWING) #boundedFrame - ; -``` - -#### 2.2.2 窗口定义 -##### 2.2.2.1 Partition - -`PARTITION BY` 用于将数据分为多个独立、不相关的「组」,窗口函数只能访问并操作其所属分组内的数据,无法访问其它分组。该子句是可选的;如果未显式指定,则默认将所有数据分到同一组。值得注意的是,与 `GROUP BY` 通过聚合函数将一组数据规约成一行不同,`PARTITION BY` 的窗口函数**并不会影响组内的行数。** - -* 示例 - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER (PARTITION BY device) as count FROM device_flow; -``` - -拆解步骤: - -![](/img/window-function-2.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 4| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -|1970-01-01T08:00:02.000+08:00| d0| 3| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 4| -+-----------------------------+------+----+-----+ -``` - -##### 2.2.2.2 Ordering - -`ORDER BY` 用于对 partition 内的数据进行排序。排序后,相等的行被称为 peers。peers 会影响窗口函数的行为,例如不同 rank function 对 peers 的处理不同;不同 frame 的划分方式对于 peers 的处理也不同。该子句是可选的。 - -* 示例 - -查询语句: - -```SQL -IoTDB> SELECT *, rank() OVER (PARTITION BY device ORDER BY flow) as rank FROM device_flow; -``` - -拆解步骤: - -![](/img/window-function-3.png) - -查询结果: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| 
-|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -##### 2.2.2.3 Framing - -对于 partition 中的每一行,窗口函数都会在相应的一组行上求值,这些行称为 Frame(即 Window Function 在每一行上的输入域)。Frame 可以手动指定,指定时涉及两个属性,具体说明如下。 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
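| Frame attribute | Attribute value | Description |
|---|---|---|
| Type | `ROWS` | Delimits the frame by row position |
| | `GROUPS` | Delimits the frame by peers: rows with equal ordering values are treated as equivalent, and all rows that are peers of one another form a peer group |
| | `RANGE` | Delimits the frame by value |
| Start and end bounds | `UNBOUNDED PRECEDING` | The first row of the partition |
| | `offset PRECEDING` | The row that precedes the current row at a "distance" of `offset` |
| | `CURRENT ROW` | The current row |
| | `offset FOLLOWING` | The row that follows the current row at a "distance" of `offset` |
| | `UNBOUNDED FOLLOWING` | The last row of the partition |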
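
As a rough illustration of the three frame types listed in the table, the Python sketch below counts the rows in a frame of the form `BETWEEN offset PRECEDING AND CURRENT ROW` for one partition that is already sorted by the ordering key. It only models the semantics and is not IoTDB code; the function name `frame_counts` and the sample values (the sorted `flow` values of device `d0`) are assumptions made for this example.

```python
from itertools import groupby


def frame_counts(sorted_values, frame_type, offset):
    """Count the rows in `<frame_type> BETWEEN offset PRECEDING AND CURRENT ROW`.

    `sorted_values` stands for a single partition, already sorted by the
    ORDER BY key (hypothetical helper, for illustration only).
    """
    n = len(sorted_values)

    # Assign each row the index of its peer group (rows with equal values).
    group_ids = []
    for gid, (_, rows) in enumerate(groupby(sorted_values)):
        group_ids.extend(gid for _ in rows)

    counts = []
    for i, value in enumerate(sorted_values):
        if frame_type == "ROWS":
            # Distance is measured in physical rows.
            frame = [j for j in range(n) if i - offset <= j <= i]
        elif frame_type == "GROUPS":
            # Distance is measured in peer groups; the whole current group is included.
            frame = [j for j in range(n)
                     if group_ids[i] - offset <= group_ids[j] <= group_ids[i]]
        elif frame_type == "RANGE":
            # Distance is measured on the ordering value itself.
            frame = [j for j in range(n)
                     if value - offset <= sorted_values[j] <= value]
        else:
            raise ValueError(f"unknown frame type: {frame_type}")
        counts.append(len(frame))
    return counts


flows_d0 = [1, 3, 3, 5]  # flow values of device d0, sorted ascending
print(frame_counts(flows_d0, "ROWS", 1))    # [1, 2, 2, 2]
print(frame_counts(flows_d0, "GROUPS", 1))  # [1, 3, 3, 3]
print(frame_counts(flows_d0, "RANGE", 2))   # [1, 3, 3, 3]
```

The difference shows up on the two rows whose value is 3: `ROWS` sees only one preceding row, while `GROUPS` and `RANGE` treat both peers as part of the current position.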
-
-The meaning of `CURRENT ROW`, `offset PRECEDING` and `offset FOLLOWING` depends on the frame type, as shown in the following table:
-
-| | `ROWS` | `GROUPS` | `RANGE` |
-|--------------------|------------|----------|---------|
-| `CURRENT ROW` | The current row | A peer group contains several rows, so the meaning depends on where the bound is used: as `frame_start`, the first row of the peer group; as `frame_end`, the last row of the peer group. | Same as `GROUPS`: as `frame_start`, the first row of the peer group; as `frame_end`, the last row of the peer group. |
-| `offset PRECEDING` | The row `offset` rows before | The peer group `offset` groups before | Preceding rows whose value differs from the current row's value by at most `offset` belong to the frame |
-| `offset FOLLOWING` | The row `offset` rows after | The peer group `offset` groups after | Following rows whose value differs from the current row's value by at most `offset` belong to the frame |
-
-The syntax is as follows:
-
-```SQL
--- Specify both frame_start and frame_end
-{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end
--- Specify only frame_start; frame_end defaults to CURRENT ROW
-{ RANGE | ROWS | GROUPS } frame_start
-```
-
-If no frame is specified explicitly, the default frame is determined as follows:
-
-* When the window uses ORDER BY: the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (from the first row of the partition to the current row). For example, in RANK() OVER(PARTITION BY COL1 ORDER BY COL2), the default frame contains the current row, its peers, and all rows before it in the partition.
-* When the window does not use ORDER BY: the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING (all rows of the partition). For example, in AVG(COL2) OVER(PARTITION BY col1), the default frame contains every row of the partition, so the average is computed over the whole partition.
-
-Note that `ORDER BY` is required when the frame type is GROUPS or RANGE. The difference is that ORDER BY may list several fields with GROUPS, whereas RANGE performs arithmetic on the ordering value and therefore accepts only one field.
-
-* Examples
-
-1. 
Frame 类型为 ROWS - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ROWS 1 PRECEDING) as count FROM device_flow; -``` - -拆解步骤: - -* 取前一行和当前行作为 Frame - * 对于 partition 的第一行,由于没有前一行,所以整个 Frame 只有它一行,返回 1; - * 对于 partition 的其他行,整个 Frame 包含当前行和它的前一行,返回 2: - -![](/img/window-function-4.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 2| -+-----------------------------+------+----+-----+ -``` - -2. Frame 类型为 GROUPS - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ORDER BY flow GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -拆解步骤: - -* 取前一个 peer group 和当前 peer group 作为 Frame,那么以 device 为 d0 的 partition 为例(d1同理),对于 count 行数: - * 对于 flow 为 1 的 peer group,由于它也没比它小的 peer group 了,所以整个 Frame 就它一行,返回 1; - * 对于 flow 为 3 的 peer group,它本身包含 2 行,前一个 peer group 就是 flow 为 1 的,就一行,因此整个 Frame 三行,返回 3; - * 对于 flow 为 5 的 peer group,它本身包含 1 行,前一个 peer group 就是 flow 为 3 的,共两行,因此整个 Frame 三行,返回 3。 - -![](/img/window-function-5.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -3. 
Frame 类型为 RANGE - -查询语句: - -```SQL -IoTDB> SELECT *,count(flow) OVER(PARTITION BY device ORDER BY flow RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -拆解步骤: - -* 把比当前行数据**小于等于 2 ​**的分为同一个 Frame,那么以 device 为 d0 的 partition 为例(d1 同理),对于 count 行数: - * 对于 flow 为 1 的行,由于它是最小的行了,所以整个 Frame 就它一行,返回 1; - * 对于 flow 为 3 的行,注意 CURRENT ROW 是作为 frame\_end 存在,因此是整个 peer group 的最后一行,符合要求比它小的共 1 行,然后 peer group 有 2 行,所以整个 Frame 共 3 行,返回 3; - * 对于 flow 为 5 的行,它本身包含 1 行,符合要求的比它小的共 2 行,所以整个 Frame 共 3 行,返回 3。 - -![](/img/window-function-6.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -### 2.3 内置的窗口函数 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
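| Category | Function | Definition | Supports the FRAME clause |
|---|---|---|---|
| Aggregate Function | All built-in aggregate functions | Aggregates a set of values into a single result. | |
| Value Function | first_value | Returns the first value of the frame; with IGNORE NULLS, leading NULLs are skipped | |
| | last_value | Returns the last value of the frame; with IGNORE NULLS, trailing NULLs are skipped | |
| | nth_value | Returns the n-th element of the frame (n starts at 1); with IGNORE NULLS, NULLs are skipped | |
| | lead | Returns the element `offset` rows after the current row (NULLs are not counted when IGNORE NULLS is given); if no such element exists (beyond the partition), returns `default` | |
| | lag | Returns the element `offset` rows before the current row (NULLs are not counted when IGNORE NULLS is given); if no such element exists (beyond the partition), returns `default` | |
| Rank Function | rank | Returns the rank of the current row within the partition; rows with equal values share a rank, and gaps may appear between ranks | |
| | dense_rank | Returns the rank of the current row within the partition; rows with equal values share a rank, and there are no gaps between ranks | |
| | row_number | Returns the row number of the current row within the partition, starting at 1 | |
| | percent_rank | Returns the relative rank of the current row's value within the partition as a fraction: (rank() - 1) / (n - 1), where n is the number of rows in the partition | |
| | cume_dist | Returns the cumulative distribution of the current row's value within the partition as a fraction: (number of rows less than or equal to it) / n | |
| | ntile | Given n, assigns each row a bucket number from 1 to n. | |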
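
To make the rank-function definitions in the table concrete, here is a small Python sketch that computes them for one partition already sorted by the ordering key. It is an illustration of the formulas only, not IoTDB's implementation; the helper names `rank_functions` and `ntile` and the sample values are assumptions for this example, and the `ntile` helper assumes that leftover rows go to the first buckets.

```python
def rank_functions(sorted_values):
    """Compute rank-style window functions for one sorted partition.

    Hypothetical helper for illustration; `sorted_values` is the ORDER BY
    column of a single partition in ascending order.
    """
    n = len(sorted_values)
    results = []
    rank = dense_rank = 0
    for i, value in enumerate(sorted_values):
        # A new peer group starts whenever the value changes.
        if i == 0 or value != sorted_values[i - 1]:
            rank = i + 1        # rank jumps to the row position, leaving gaps
            dense_rank += 1     # dense_rank increases by exactly one
        rows_le = sum(1 for x in sorted_values if x <= value)
        results.append({
            "row_number": i + 1,
            "rank": rank,
            "dense_rank": dense_rank,
            "percent_rank": (rank - 1) / (n - 1) if n > 1 else 0.0,
            "cume_dist": rows_le / n,
        })
    return results


def ntile(row_count, n):
    """Bucket numbers 1..n for `row_count` rows; extra rows go to the first buckets."""
    base, extra = divmod(row_count, n)
    labels = []
    for bucket in range(1, n + 1):
        size = base + (1 if bucket <= extra else 0)
        labels.extend([bucket] * size)
    return labels


for row in rank_functions([1, 3, 3, 5]):
    print(row)
print(ntile(4, 2))  # [1, 1, 2, 2]
print(ntile(5, 3))  # [1, 1, 2, 2, 3]
```

For the values 1, 3, 3, 5 this yields ranks 1, 2, 2, 4 and dense ranks 1, 2, 2, 3, which is the gap behaviour described in the table.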
- -#### 2.3.1 Aggregate Function - -所有内置聚合函数,如 `sum()`、`avg()`、`min()`、`max()` 都能当作 Window Function 使用。 - -> 注意:与 GROUP BY 不同,Window Function 中每一行都有相应的输出 - -示例: - -```SQL -IoTDB> SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) as sum FROM device_flow; -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -#### 2.3.2 Value Function -1. `first_value` - -* 函数名:`first_value(value) [IGNORE NULLS]` -* 定义:返回 frame 的第一个值,如果指定了 IGNORE NULLS 需要跳过前缀的 NULL; -* 示例: - -```SQL -IoTDB> SELECT *, first_value(flow) OVER w as first_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+-----------+ -| time|device|flow|first_value| -+-----------------------------+------+----+-----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----------+ -``` - -2. 
`last_value` - -* 函数名:`last_value(value) [IGNORE NULLS]` -* 定义:返回 frame 的最后一个值,如果指定了 IGNORE NULLS 需要跳过后缀的 NULL; -* 示例: - -```SQL -IoTDB> SELECT *, last_value(flow) OVER w as last_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|last_value| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 5| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -3. `nth_value` - -* 函数名:`nth_value(value, n) [IGNORE NULLS]` -* 定义:返回 frame 的第 n 个元素(注意 n 是从 1 开始),如果有 IGNORE NULLS 需要跳过 NULL; -* 示例: - -```SQL -IoTDB> SELECT *, nth_value(flow, 2) OVER w as nth_values FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|nth_values| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -4. 
lead
-
-* Function: `lead(value[, offset[, default]]) [IGNORE NULLS]`
-* Definition: returns the element `offset` rows after the current row (NULLs are not counted when IGNORE NULLS is given); if no such element exists (beyond the partition), returns `default`. `offset` defaults to 1 and `default` defaults to NULL.
-* The lead function requires an ORDER BY clause in the window.
-* Example:
-
-```SQL
-IoTDB> SELECT *, lead(flow) OVER w as lead FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time);
-+-----------------------------+------+----+----+
-|                         time|device|flow|lead|
-+-----------------------------+------+----+----+
-|1970-01-01T08:00:04.000+08:00|    d1|   2|   4|
-|1970-01-01T08:00:05.000+08:00|    d1|   4|null|
-|1970-01-01T08:00:00.000+08:00|    d0|   3|   5|
-|1970-01-01T08:00:01.000+08:00|    d0|   5|   3|
-|1970-01-01T08:00:02.000+08:00|    d0|   3|   1|
-|1970-01-01T08:00:03.000+08:00|    d0|   1|null|
-+-----------------------------+------+----+----+
-```
-
-5. lag
-
-* Function: `lag(value[, offset[, default]]) [IGNORE NULLS]`
-* Definition: returns the element `offset` rows before the current row (NULLs are not counted when IGNORE NULLS is given); if no such element exists (beyond the partition), returns `default`. `offset` defaults to 1 and `default` defaults to NULL.
-* The lag function requires an ORDER BY clause in the window.
-* Example:
-
-```SQL
-IoTDB> SELECT *, lag(flow) OVER w as lag FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time);
-+-----------------------------+------+----+----+
-|                         time|device|flow| lag|
-+-----------------------------+------+----+----+
-|1970-01-01T08:00:04.000+08:00|    d1|   2|null|
-|1970-01-01T08:00:05.000+08:00|    d1|   4|   2|
-|1970-01-01T08:00:00.000+08:00|    d0|   3|null|
-|1970-01-01T08:00:01.000+08:00|    d0|   5|   3|
-|1970-01-01T08:00:02.000+08:00|    d0|   3|   5|
-|1970-01-01T08:00:03.000+08:00|    d0|   1|   3|
-+-----------------------------+------+----+----+
-```
-
-#### 2.3.3 Rank Function
-1. 
rank - -* 函数名:`rank()` -* 定义:返回当前行在整个 partition 中的序号,值相同的行序号相同,序号之间可能有 gap; -* 示例: - -```SQL -IoTDB> SELECT *, rank() OVER w as rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -2. dense\_rank - -* 函数名:`dense_rank()` -* 定义:返回当前行在整个 partition 中的序号,值相同的行序号相同,序号之间没有 gap。 -* 示例: - -```SQL -IoTDB> SELECT *, dense_rank() OVER w as dense_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|dense_rank| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+----------+ -``` - -3. 
row\_number - -* 函数名:`row_number()` -* 定义:返回当前行在整个 partition 中的行号,注意行号从 1 开始; -* 示例: - -```SQL -IoTDB> SELECT *, row_number() OVER w as row_number FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|row_number| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----------+ -``` - -4. percent\_rank - -* 函数名:`percent_rank()` -* 定义:以百分比的形式,返回当前行的值在整个 partition 中的序号;即 **(rank() - 1) / (n - 1)**,其中 n 是整个 partition 的行数; -* 示例: - -```SQL -IoTDB> SELECT *, percent_rank() OVER w as percent_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+------------------+ -| time|device|flow| percent_rank| -+-----------------------------+------+----+------------------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.0| -|1970-01-01T08:00:00.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:02.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+------------------+ -``` - -5. 
cume\_dist - -* 函数名:cume\_dist -* 定义:以百分比的形式,返回当前行的值在整个 partition 中的序号;即 **(小于等于它的行数) / n**。 -* 示例: - -```SQL -IoTDB> SELECT *, cume_dist() OVER w as cume_dist FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+---------+ -| time|device|flow|cume_dist| -+-----------------------------+------+----+---------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.5| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.25| -|1970-01-01T08:00:00.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:02.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+---------+ -``` - -6. ntile - -* 函数名:ntile -* 定义:指定 n,给每一行进行 1~n 的编号。 - * 整个 partition 行数比 n 小,那么编号就是行号 index; - * 整个 partition 行数比 n 大: - * 如果行数能除尽 n,那么比较完美,比如行数为 4,n 为 2,那么编号为 1、1、2、2、; - * 如果行数不能除尽 n,那么就分给开头几组,比如行数为 5,n 为 3,那么编号为 1、1、2、2、3; -* 示例: - -```SQL -IoTDB> SELECT *, ntile(2) OVER w as ntile FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+-----+ -| time|device|flow|ntile| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -+-----------------------------+------+----+-----+ -``` - -### 2.4 场景示例 -1. 
Multi-device diff function
-
-For every row of every device, compute the difference from the previous row:
-
-```SQL
-SELECT
-    *,
-    measurement - lag(measurement) OVER (PARTITION BY device ORDER BY time)
-FROM data
-WHERE timeCondition;
-```
-
-For every row of every device, compute the difference from the next row:
-
-```SQL
-SELECT
-    *,
-    measurement - lead(measurement) OVER (PARTITION BY device ORDER BY time)
-FROM data
-WHERE timeCondition;
-```
-
-For every row of a single device, compute the difference from the previous row (the next row works the same way):
-
-```SQL
-SELECT
-    *,
-    measurement - lag(measurement) OVER (ORDER BY time)
-FROM data
-WHERE device='d1'
-  AND timeCondition;
-```
-
-2. Multi-device TOP\_K/BOTTOM\_K
-
-Use rank to number the rows, then keep the desired ranks in the outer query.
-
-(Note: window functions are evaluated after the HAVING clause, so a subquery is required here.)
-
-```SQL
-SELECT *
-FROM(
-    SELECT
-        *,
-        rank() OVER (PARTITION BY device ORDER BY time DESC) AS rank
-    FROM data
-    WHERE timeCondition
-)
-WHERE rank <= 3;
-```
-
-Besides ordering by time, rows can also be ordered by the measurement value:
-
-```SQL
-SELECT *
-FROM(
-    SELECT
-        *,
-        rank() OVER (PARTITION BY device ORDER BY measurement DESC) AS rank
-    FROM data
-    WHERE timeCondition
-)
-WHERE rank <= 3;
-```
-
-3. Multi-device CHANGE\_POINTS
-
-This query removes runs of consecutive identical values from the input series. It can be implemented with lead plus a subquery:
-
-```SQL
-SELECT
-    time,
-    device,
-    measurement
-FROM(
-    SELECT
-        time,
-        device,
-        measurement,
-        LEAD(measurement) OVER (PARTITION BY device ORDER BY time) AS next
-    FROM data
-    WHERE timeCondition
-)
-WHERE measurement != next OR next IS NULL;
-```
diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Tree-to-Table_timecho.md b/src/zh/UserGuide/Master/Table/User-Manual/Tree-to-Table_timecho.md
deleted file mode 100644
index 5cd9bef30..000000000
--- a/src/zh/UserGuide/Master/Table/User-Manual/Tree-to-Table_timecho.md
+++ /dev/null
@@ -1,615 +0,0 @@
-
-# Tree-to-Table View
-
-## 1. Overview
-
-IoTDB provides a tree-to-table feature: by creating a table view, existing tree-model data can be exposed as a table view and queried through it, so that the same data can be handled by the tree model and the table model together:
-
-* When writing data, use tree-model syntax, which supports flexible data ingestion and extension.
-* When analyzing data, use table-model syntax, which supports complex analysis through standard SQL.
-
-![](/img/tree-to-table-1.png)
-
-> - This feature is supported in V2.0.5 and later.
-> - Table views are read-only; data cannot be written through a table view.
-
-## 2. 
功能介绍 -### 2.1 创建表视图 -#### 2.1.1 语法定义 - -```SQL --- create (or replace) view on tree -CREATE - [OR REPLACE] - VIEW view_name ([viewColumnDefinition (',' viewColumnDefinition)*]) - [comment] - [RESTRICT] - [WITH properties] - AS prefixPath - -viewColumnDefinition - : column_name [dataType] TAG [comment] # tagColumn - | column_name [dataType] TIME [comment] # timeColumn - | column_name [dataType] FIELD [FROM original_measurement] [comment] # fieldColumn - ; - -comment - : COMMENT string - ; -``` - -> 注意:列仅支持 tag / field / time,不支持 attribute。 - -#### 2.1.2 语法说明 -1. **`prefixPath`** - -对应树模型的路径,路径最后一级必须为 `**`,且其他层级均不能出现 `*` 或 `**`。该路径确定 VIEW 对应的子树。 - -2. **`view_name`** - -视图名称,与表名相同(具体约束可参考[创建表](../Basic-Concept/Table-Management_timecho.md#\_1-1-创建表)),如 db.view。 - -3. **`viewColumnDefinition`** - -* `TAG`:每个 Tag 列按顺序对应`prefixPath`后面层级的路径节点。 -* `FIELD`:FIELD 列对应树模型中的测点(叶子节点)。 - * 若指定了 FIELD 列,则列名使用声明中的`column_name`。 - * 若声明了 `original_measurement`,则直接映射到树模型该测点。否则取小写`column_name` 作为测点名进行树模型映射。 - * 不支持多个 FIELD 映射到树模型同名测点。 - * 若未指定 FIELD 列的 `dataType`,则默认获取树模型映射测点的数据类型。 - * 若树模型中的设备不包含某些声明的 FIELD 列,或与声明的 FIELD 列的数据类型不一致,则在查询该设备时,该 FIELD 列的值永远为 NULL。 - * 若未指定 FIELD 列,则创建时会自动扫描出`prefixPath`子树下所有的测点(包括定义为所有普通序列的测点,以及挂载路径与 `prefixPath `有所重合的所有模板中的测点),列名使用树模型测点名称。 - * 不支持树模型存在名称(含小写)相同但类型不同的测点 -* `TIME`:创建视图时可以不指定时间列(TIME),IoTDB 会自动添加该列并命名为"time", 且顺序上位于第一列。自 V2.0.8.2 版本起,支持创建视图时**自定义命名时间列**,自定义时间列在视图中的顺序由创建 SQL 中的顺序决定。相关约束如下: - * 当列分类(columnCategory)设为 `TIME` 时,数据类型(dataType)必须为 `TIMESTAMP`。 - * 每个视图最多允许定义 1个时间列(columnCategory = TIME)。 - * 当未显式定义时间列时,不允许其他列使用 `time` 作为名称,否则会与系统默认时间列命名冲突。 - - -4. **`WITH properties`** - -目前仅支持 TTL,表示该视图 TTL ms 之前的数据不会在查询时展示,即`WHERE time > now() - TTL`。若树模型设置了 TTL,则查询时取两者中的更小值。 - -> 注意:表视图 TTL 不影响树模型中设备的真实 TTL,当设备数据达到树模型设定的 TTL 后,将被系统物理删除。 - -5. **`OR REPLACE`** - -table 与 view 不能重名。创建时若已存在同名 table ,则会报错;若已存在同名 view ,则进行替换。 - -6. 
**`RESTRICT`** - -约束匹配树模型设备的层级数(从 prefixPath 下一层开始),若有 RESTRICT 字段,则匹配层级完全等于 tag 数量的 device,否则匹配层级小于等于 tag 数量的 device。默认非 RESTRICT,即匹配层级小于等于 tag 数量的 device。 - -#### 2.1.3 使用示例 -1. 树模型及表视图原型 - -![](/img/tree-to-table-2.png) - -2. 创建表视图 - -* 创建语句 - -```SQL -CREATE OR REPLACE VIEW viewdb."风机表" - ("风机组" TAG, - "风机号" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD - ) -with (ttl=604800000) -AS root.db.** -``` - -* 具体说明 - -该语句表示,创建出名为 `viewdb."风机表"` 的视图(viewdb 如不存在会报错),如果该视图已存在,则替换该视图: - -* 为挂载于树模型 root.db.\*\* 路径下面的序列创建表视图。 -* 具备`风机组`、`风机号`两个 `TAG` 列,因此表视图中只包含原树模型中第 3 层上的设备。 -* 具备`电压`、`电流`两个 `FIELD` 列。这里两个 `FIELD` 列对应树模型下的序列名同样是`电压`、`电流`,且仅仅选取类型为 `DOUBLE` 的序列。 - - **​序列名的改名需求:​**如果树模型下的序列名为`current`,想要创建出的表视图中对应的 `FIELD` 列名为`电流`,这种情况下,SQL 变更如下: - - ```SQL - CREATE OR REPLACE VIEW viewdb."风机表" - ("风机组" TAG, - "风机号" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD FROM current - ) - AS root.db.** - with (ttl=604800000) - ``` - -* 当需要自定义时间列(V2.0.8.2 起支持)时,SQL 变更如下: - -```SQL -CREATE OR REPLACE VIEW viewdb."风机表" - ("风机组" TAG, - "风机号" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD, - time_user_defined TIMESTAMP TIME - ) -AS root.db.** -with (ttl=604800000) -``` - - -### 2.2 修改表视图 -#### 2.2.1 语法定义 - -修改表视图功能支持修改视图名称、添加列、列重命名、修改 FIELD 列数据类型(V2.0.8.2 起支持)、删除列、设置视图的 TTL 属性,以及通过 COMMENT 添加注释。 - -```SQL --- 修改视图名 -ALTER VIEW [IF EXISTS] viewName RENAME TO to=identifier - --- 在视图中添加某一列 -ALTER VIEW [IF EXISTS] viewName ADD COLUMN [IF NOT EXISTS] viewColumnDefinition -viewColumnDefinition - : column_name [dataType] TAG # tagColumn - | column_name [dataType] FIELD [FROM original_measurement] # fieldColumn - --- 为视图中的某一列重命名 -ALTER VIEW [IF EXISTS] viewName RENAME COLUMN [IF EXISTS] oldName TO newName - ---修改 FIELD 列的数据类型 -ALTER VIEW [IF EXISTS] viewName ALTER COLUMN [IF EXISTS] columnName SET DATA TYPE new_type - --- 删除视图中的某一列 -ALTER VIEW [IF EXISTS] viewName DROP COLUMN [IF EXISTS] columnName - --- 修改视图的 TTL -ALTER VIEW [IF EXISTS] viewName SET PROPERTIES propertyAssignments - --- 添加注释 
-COMMENT ON VIEW qualifiedName IS (string | NULL) #commentView -COMMENT ON COLUMN qualifiedName '.' column=identifier IS (string | NULL) #commentColumn -``` - -#### 2.2.2 语法说明 -1. `SET PROPERTIES`操作目前仅支持对表视图的 TTL 属性进行配置。 -2. 删除列功能,仅支持删除物理量列(FIELD),标识列(TAG)不支持删除。 -3. 修改后的 comment 会覆盖原有注释,如果指定为 null,则会擦除之前的 comment。 -4. 修改 FIELD 列数据类型时,变更后的字段类型需要与原类型兼容,具体兼容性如下表所示: - -| 原始类型 | 可变更为类型 | -| ----------- | ----------------------------------------------- | -| INT32 | INT64, FLOAT, DOUBLE, TIMESTAMP, STRING, TEXT | -| INT64 | TIMESTAMP, DOUBLE, STRING, TEXT | -| FLOAT | DOUBLE, STRING, TEXT | -| DOUBLE | STRING, TEXT | -| BOOLEAN | STRING, TEXT | -| TEXT | BLOB, STRING | -| STRING | TEXT, BLOB | -| BLOB | STRING, TEXT | -| DATE | STRING, TEXT | -| TIMESTAMP | INT64, DOUBLE, STRING, TEXT | - -#### 2.2.3 使用示例 - -```SQL --- 修改视图名 -ALTER VIEW IF EXISTS tableview1 RENAME TO tableview - --- 在视图中添加某一列 -ALTER VIEW IF EXISTS tableview ADD COLUMN IF NOT EXISTS temperature float field - --- 为视图中的某一列重命名 -ALTER VIEW IF EXISTS tableview RENAME COLUMN IF EXISTS temperature TO temp - --- 修改 FIELD 列的数据类型 -ALTER VIEW IF EXISTS tableview ALTER COLUMN IF EXISTS temperature SET DATA TYPE double - --- 删除视图中的某一列 -ALTER VIEW IF EXISTS tableview DROP COLUMN IF EXISTS temp - --- 修改视图的 TTL -ALTER VIEW IF EXISTS tableview SET PROPERTIES TTL=3600 - --- 添加注释 -COMMENT ON VIEW tableview IS '树转表' -COMMENT ON COLUMN tableview.status is Null -``` - -### 2.3 删除表视图 -#### 2.3.1 语法定义 - -```SQL -DROP VIEW [IF EXISTS] viewName -``` - -#### 2.3.2 使用示例 - -```SQL -DROP VIEW IF EXISTS tableview -``` - -### 2.4 查看表视图 -#### 2.4.1 **`Show Tables`** -1. 语法定义 - -```SQL -SHOW TABLES (DETAILS)? ((FROM | IN) database_name)? -``` - -2. 
Description
-
-The `SHOW TABLES (DETAILS)` statement reports the type of each table or view in the `TABLE_TYPE` field of the result set:
-
-| Type | `TABLE_TYPE` value |
-| -------------------------------------- | ------------------------ |
-| Base table (Table) | `BASE TABLE` |
-| Tree-to-table view (Tree View) | `VIEW FROM TREE` |
-| System table (Information\_schema.Tables) | `SYSTEM VIEW` |
-
-3. Example
-
-```SQL
-IoTDB> show tables details from database1
-+-----------+-----------+------+-------+--------------+
-|  TableName|    TTL(ms)|Status|Comment|     TableType|
-+-----------+-----------+------+-------+--------------+
-|  tableview|        INF| USING| 树转表 |VIEW FROM TREE|
-|     table1|31536000000| USING|   null|    BASE TABLE|
-|     table2|31536000000| USING|   null|    BASE TABLE|
-+-----------+-----------+------+-------+--------------+
-
-IoTDB> show tables details from information_schema
-+--------------+-------+------+-------+-----------+
-|     TableName|TTL(ms)|Status|Comment|  TableType|
-+--------------+-------+------+-------+-----------+
-|       columns|    INF| USING|   null|SYSTEM VIEW|
-|  config_nodes|    INF| USING|   null|SYSTEM VIEW|
-|configurations|    INF| USING|   null|SYSTEM VIEW|
-|    data_nodes|    INF| USING|   null|SYSTEM VIEW|
-|     databases|    INF| USING|   null|SYSTEM VIEW|
-|     functions|    INF| USING|   null|SYSTEM VIEW|
-|      keywords|    INF| USING|   null|SYSTEM VIEW|
-|        models|    INF| USING|   null|SYSTEM VIEW|
-|         nodes|    INF| USING|   null|SYSTEM VIEW|
-|  pipe_plugins|    INF| USING|   null|SYSTEM VIEW|
-|         pipes|    INF| USING|   null|SYSTEM VIEW|
-|       queries|    INF| USING|   null|SYSTEM VIEW|
-|       regions|    INF| USING|   null|SYSTEM VIEW|
-| subscriptions|    INF| USING|   null|SYSTEM VIEW|
-|        tables|    INF| USING|   null|SYSTEM VIEW|
-|        topics|    INF| USING|   null|SYSTEM VIEW|
-|         views|    INF| USING|   null|SYSTEM VIEW|
-+--------------+-------+------+-------+-----------+
-```
-
-#### 2.4.2 **`Show Create View`**
-1. Syntax
-
-```SQL
-SHOW CREATE VIEW viewname;
-```
-
-2. Description
-
-* This statement returns the complete definition statement of a table view.
-* It fills in every default value that was omitted at creation time, so the statement shown in the result set may differ from the original CREATE statement.
-* It cannot be used on system tables.
-
-3. 
使用示例 - -```SQL -IoTDB> show create view tableview -+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| View| Create View| -+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+ -|tableview|CREATE VIEW "tableview" ("device" STRING TAG,"model" STRING TAG,"status" BOOLEAN FIELD,"hardware" STRING FIELD) COMMENT '表视图' WITH (ttl=INF) AS root.ln.**| -+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+ -``` - -> 除此之外,还支持通过 `show create table` 语句查看表视图创建信息,相关详细介绍可查看[show create table](../Basic-Concept/Table-Management_timecho.md#_1-4-查看表的创建信息) - -### 2.5 非对齐与对齐设备的查询差异 - -树转表视图在查询对齐设备和非对齐设备中有 null 值的情况下结果​**可能与等价的树模型 align by device 查询不同**​。 - -* **对齐设备** - * 树模型的查询表现:当查询涉及的所有序列在某一行都是null时,不保留该行 - * 表视图的查询表现:与表模型一致,保留全是 null 的行 -* **非对齐设备** - * 树模型的查询表现:当查询涉及的所有序列在某一行都是null时,不保留该行 - * 表视图的查询表现:与树模型一致,不保留全是 null 的行 -* **说明示例** - * 对齐 - - ```SQL - -- 树模型写入数据(对齐) - CREATE ALIGNED TIMESERIES root.db.battery.b1(voltage INT32, current FLOAT) - INSERT INTO root.db.battery.b1(time, voltage, current) aligned values (1, 1, 1) - INSERT INTO root.db.battery.b1(time, voltage, current) aligned values (2, null, 1) - - -- 创建 VIEW 语句 - CREATE VIEW view1 (battery_id TAG, voltage INT32 FIELD, current FLOAT FIELD) as root.db.battery.** - - -- 查询 - IoTDB> select voltage from view1 - +-------+ - |voltage| - +-------+ - | 1| - | null| - +-------+ - Total line number = 2 - ``` - - * 非对齐 - - ```SQL - -- 树模型写入数据(非对齐) - CREATE TIMESERIES root.db.battery.b1.voltage INT32 - CREATE TIMESERIES root.db.battery.b1.current FLOAT - INSERT INTO root.db.battery.b1(time, voltage, current) values (1, 1, 1) - INSERT INTO root.db.battery.b1(time, voltage, current) 
values (2, null, 1) - - -- 创建 VIEW 语句 - CREATE VIEW view1 (battery_id TAG, voltage INT32 FIELD, current FLOAT FIELD) as root.db.battery.** - - -- 查询 - IoTDB> select voltage from view1 - +-------+ - |voltage| - +-------+ - | 1| - +-------+ - Total line number = 1 - - -- 如果在查询语句中指定了所有 field 列,或是仅指定了非 field 列时,才可以确保查到所有行 - IoTDB> select voltage,current from view1 - +-------+-------+ - |voltage|current| - +-------+-------+ - | 1| 1.0| - | null| 1.0| - +-------+-------+ - Total line number = 2 - - IoTDB> select battery_id from view1 - +-------+ - |battery_id| - +-------+ - | b1| - | b1| - +-------+ - Total line number = 2 - - -- 如果查询中同时有部分 field 列,那最终结果的行数取决于这部分 field 列根据时间戳对齐后的行数 - IoTDB> select time,voltage from view1 - +-----------------------------+-------+ - | time|voltage| - +-----------------------------+-------+ - |1970-01-01T08:00:00.001+08:00| 1| - +-----------------------------+-------+ - Total line number = 1 - ``` - -## 3. 场景示例 -### 3.1 原树模型管理了多种类型的设备 - -* 场景中不同类型的设备具备不同的层级路径和测点集合。 -* ​**写入时**​:在数据库节点下按设备类型创建分支,每种设备下可以有不同的测点结构 -* ​**查询时**​:为每种类型的设备建立一张表,每个表具有不同的标签和测点集合 - -![](/img/tree-to-table-3.png) - -**表视图的创建 SQL:** - -```SQL --- 风机表 -CREATE VIEW viewdb."风机表" - ("风机组" TAG, - "风机号" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD - ) -AS root.db."风机".** - --- 电机表 -CREATE VIEW viewdb."电机表" - ("电机组" TAG, - "电机号" TAG, - "功率" FLOAT FIELD, - "电量" FLOAT FIELD, - "温度" FLOAT FIELD - ) -AS root.db."电机".** -``` - -### 3.2 原树模型中没有设备,只有测点 - -如场站的监控系统中,每个测点都有唯一编号,但无法对应到某些设备 - -> 大宽表形式 - -![](/img/tree-to-table-4.png) - -**表视图的创建 SQL:** - -```SQL -CREATE VIEW viewdb.machine - (DCS_PIT_02105A DOUBLE FIELD, - DCS_PIT_02105B DOUBLE FIELD, - DCS_PIT_02105C DOUBLE FIELD, - ... 
- DCS_XI_02716A DOUBLE FIELD - ) -AS root.db.** -``` - -### 3.3 原树模型中一个设备既有子设备,也有测点 - -如在储能场景中,每一层结构都要监控其电压和电流 - -* ​**写入时**​:按照物理世界的监测点,对每一层结构进行建模 -* ​**查询时**​:按照设备分类,建立多个表对每一层结构信息进行管理 - -![](/img/tree-to-table-5.png) - -**表视图的创建 SQL:** - -```SQL --- 电池舱表 -CREATE VIEW viewdb."电池舱表" - ("电池站" TAG, - "电池舱" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD - ) -RESTRICT -AS root.db.** - --- 电池堆表 -CREATE VIEW viewdb."电池堆表" - ("电池站" TAG, - "电池舱" TAG, - "电池堆" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD - ) -RESTRICT -AS root.db.** - --- 电池簇表 -CREATE VIEW viewdb."电池簇表" - ("电池站" TAG, - "电池舱" TAG, - "电池堆" TAG, - "电池簇" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD - ) -RESTRICT -AS 'root.db.**' - --- 电芯表 -CREATE VIEW viewdb."电芯表" - ("电池站" TAG, - "电池舱" TAG, - "电池堆" TAG, - "电池簇" TAG, - "电芯" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD - ) -RESTRICT -AS root.db.** -``` - -### 3.4 原树模型中一个设备下只有一个测点 - -> 窄表形式 - -#### 3.4.1 所有测点数据类型相同 - -![](/img/tree-to-table-6.png) - -**表视图的创建 SQL:** - -```SQL -CREATE VIEW viewdb.machine - ( - sensor_id STRING TAG, - value DOUBLE FIELD - ) -AS root.db.** -``` - -#### 3.4.2 测点的数据类型不相同 -##### 3.4.2.1 为每一种数据类型的测点建一个窄表视图 - -​**优点**​:表视图数量是常数个,仅与系统中的数据类型相关 - -​**缺点**​:查询某一个测点值时,需要提前知道其数据类型,再去决定查询哪张表视图 - -![](/img/tree-to-table-7.png) - -**表视图的创建 SQL:** - -```SQL -CREATE VIEW viewdb.machine_float - ( - sensor_id STRING TAG, - value FLOAT FIELD - ) -AS root.db.** - -CREATE VIEW viewdb.machine_double - ( - sensor_id STRING TAG, - value DOUBLE FIELD - ) -AS root.db.** - -CREATE VIEW viewdb.machine_int32 - ( - sensor_id STRING TAG, - value INT32 FIELD - ) -AS root.db.** - -CREATE VIEW viewdb.machine_int64 - ( - sensor_id STRING TAG, - value INT64 FIELD - ) -AS root.db.** - -... 
-``` - -##### 3.4.2.2 为每一个测点建一个表 - -​**优点**​:查询某一个测点值时,不需要先查一下数据类型,再去决定查询哪张表,简单便捷 - -​**缺点**​:当测点数量较多时,会引入过多的表视图,需要写大量的建视图语句 - -![](/img/tree-to-table-8.png) - -**表视图的创建 SQL:** - -```SQL -CREATE VIEW viewdb.DCS_PIT_02105A - ( - value FLOAT FIELD - ) -AS root.db.DCS_PIT_02105A.** - -CREATE VIEW viewdb.DCS_PIT_02105B - ( - value DOUBLE FIELD - ) -AS root.db.DCS_PIT_02105B.** - -CREATE VIEW viewdb.DCS_XI_02716A - ( - value INT64 FIELD - ) -AS root.db.DCS_XI_02716A.** - -...... -``` diff --git a/src/zh/UserGuide/Master/Table/User-Manual/Window-Function_timecho.md b/src/zh/UserGuide/Master/Table/User-Manual/Window-Function_timecho.md deleted file mode 100644 index 6d656d322..000000000 --- a/src/zh/UserGuide/Master/Table/User-Manual/Window-Function_timecho.md +++ /dev/null @@ -1,754 +0,0 @@ - - -# 窗口函数 - -IoTDB 针对时序数据的特色分析场景,提供了窗口函数能力,为时序数据的深度挖掘与复杂计算提供了灵活高效的解决方案。下文将对该功能进行详细的介绍。 - -## 1. 功能介绍 - -窗口函数(Window Function) 是一种基于与当前行相关的特定行集合(称为“窗口”) 对每一行进行计算的特殊函数。它将分组操作(`PARTITION BY`)、排序(`ORDER BY`)与可定义的计算范围(窗口框架 `FRAME`)结合,在不折叠原始数据行的前提下实现复杂的跨行计算。常用于数据分析场景,比如排名、累计和、移动平均等操作。 - -> 注意:该功能从 V 2.0.5 版本开始提供。 - -例如,某场景下需要查询不同设备的功耗累加值,即可通过窗口函数来实现。 - -```SQL --- 原始数据 -+-----------------------------+------+-----+ -| time|device| flow| -+-----------------------------+------+-----+ -|1970-01-01T08:00:00.000+08:00| d0| 3| -|1970-01-01T08:00:00.001+08:00| d0| 5| -|1970-01-01T08:00:00.002+08:00| d0| 3| -|1970-01-01T08:00:00.003+08:00| d0| 1| -|1970-01-01T08:00:00.004+08:00| d1| 2| -|1970-01-01T08:00:00.005+08:00| d1| 4| -+-----------------------------+------+-----+ - --- 创建表并插入数据 -CREATE TABLE device_flow(device String tag, flow INT32 FIELD); -insert into device_flow(time, device ,flow ) values ('1970-01-01T08:00:00.000+08:00','d0',3),('1970-01-01T08:00:01.000+08:00','d0',5),('1970-01-01T08:00:02.000+08:00','d0',3),('1970-01-01T08:00:03.000+08:00','d0',1),('1970-01-01T08:00:04.000+08:00','d1',2),('1970-01-01T08:00:05.000+08:00','d1',4); - - ---执行窗口函数查询 -SELECT *, sum(flow) ​OVER(PARTITION​ 
​BY​ device ​ORDER​ ​BY​ flow) ​as​ sum ​FROM device_flow; -``` - -经过分组、排序、计算(步骤拆解如下图所示), - -![](/img/window-function-1.png) - -即可得到期望结果: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -## 2. 功能定义 -### 2.1 SQL 定义 - -```SQL -windowDefinition - : name=identifier AS '(' windowSpecification ')' - ; - -windowSpecification - : (existingWindowName=identifier)? - (PARTITION BY partition+=expression (',' partition+=expression)*)? - (ORDER BY sortItem (',' sortItem)*)? - windowFrame? - ; - -windowFrame - : frameExtent - ; - -frameExtent - : frameType=RANGE start=frameBound - | frameType=ROWS start=frameBound - | frameType=GROUPS start=frameBound - | frameType=RANGE BETWEEN start=frameBound AND end=frameBound - | frameType=ROWS BETWEEN start=frameBound AND end=frameBound - | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound - ; - -frameBound - : UNBOUNDED boundType=PRECEDING #unboundedFrame - | UNBOUNDED boundType=FOLLOWING #unboundedFrame - | CURRENT ROW #currentRowBound - | expression boundType=(PRECEDING | FOLLOWING) #boundedFrame - ; -``` - -### 2.2 窗口定义 -#### 2.2.1 Partition - -`PARTITION BY` 用于将数据分为多个独立、不相关的「组」,窗口函数只能访问并操作其所属分组内的数据,无法访问其它分组。该子句是可选的;如果未显式指定,则默认将所有数据分到同一组。值得注意的是,与 `GROUP BY` 通过聚合函数将一组数据规约成一行不同,`PARTITION BY` 的窗口函数**并不会影响组内的行数。** - -* 示例 - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER (PARTITION BY device) as count FROM device_flow; -``` - -拆解步骤: - -![](/img/window-function-2.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ 
-|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 4| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -|1970-01-01T08:00:02.000+08:00| d0| 3| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 4| -+-----------------------------+------+----+-----+ -``` - -#### 2.2.2 Ordering - -`ORDER BY` 用于对 partition 内的数据进行排序。排序后,相等的行被称为 peers。peers 会影响窗口函数的行为,例如不同 rank function 对 peers 的处理不同;不同 frame 的划分方式对于 peers 的处理也不同。该子句是可选的。 - -* 示例 - -查询语句: - -```SQL -IoTDB> SELECT *, rank() OVER (PARTITION BY device ORDER BY flow) as rank FROM device_flow; -``` - -拆解步骤: - -![](/img/window-function-3.png) - -查询结果: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -#### 2.2.3 Framing - -对于 partition 中的每一行,窗口函数都会在相应的一组行上求值,这些行称为 Frame(即 Window Function 在每一行上的输入域)。Frame 可以手动指定,指定时涉及两个属性,具体说明如下。 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
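-| Frame attribute | Value | Description |
-|---|---|---|
-| Type | `ROWS` | Divides the frame by row number |
-| | `GROUPS` | Divides the frame by peers, i.e. rows with equal values are treated as equivalent. All rows that are peers of each other form one group, called a peer group |
-| | `RANGE` | Divides the frame by value |
-| Start and end position | `UNBOUNDED PRECEDING` | The first row of the entire partition |
-| | `offset PRECEDING` | The row at a "distance" of offset before the current row |
-| | `CURRENT ROW` | The current row |
-| | `offset FOLLOWING` | The row at a "distance" of offset after the current row |
-| | `UNBOUNDED FOLLOWING` | The last row of the entire partition |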
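
To make the three frame types concrete, here is a minimal Python sketch. It is not IoTDB code, and the function name `frame_counts` is made up for illustration. It counts the rows in a frame of the form `BETWEEN offset PRECEDING AND CURRENT ROW` for a single partition. It assumes the ORDER BY values are passed in ascending order, and that `CURRENT ROW` used as the frame end in `GROUPS` or `RANGE` mode means the last row of the current peer group.

```python
from bisect import bisect_left, bisect_right


def frame_counts(values, frame_type, offset):
    """Count the rows in `<frame_type> BETWEEN offset PRECEDING AND CURRENT ROW`.

    `values` holds the ORDER BY values of one partition, sorted ascending
    (hypothetical helper, for illustration only).
    """
    # Each distinct value corresponds to one peer group.
    groups = sorted(set(values))
    counts = []
    for i, v in enumerate(values):
        if frame_type == "ROWS":
            # Physical rows: `offset` rows back up to the current row.
            start = max(0, i - offset)
            end = i + 1
        elif frame_type == "GROUPS":
            # Whole peer groups: `offset` groups back up to the end of the current group.
            g = groups.index(v)
            first_group_value = groups[max(0, g - offset)]
            start = bisect_left(values, first_group_value)
            end = bisect_right(values, v)
        elif frame_type == "RANGE":
            # Value distance: every row whose value lies in [v - offset, v].
            start = bisect_left(values, v - offset)
            end = bisect_right(values, v)
        else:
            raise ValueError(f"unknown frame type: {frame_type}")
        counts.append(end - start)
    return counts


flows = [1, 3, 3, 5]  # flow values of device d0, sorted ascending
print(frame_counts(flows, "ROWS", 1))
print(frame_counts(flows, "GROUPS", 1))
print(frame_counts(flows, "RANGE", 2))
```

For these four values, `ROWS` with offset 1 gives 1, 2, 2, 2, while `GROUPS` with offset 1 and `RANGE` with offset 2 both give 1, 3, 3, 3, because the two rows with flow 3 form one peer group and are always counted together.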
-
-The meaning of `CURRENT ROW`, `offset PRECEDING`, and `offset FOLLOWING` depends on the frame type, as shown in the following table:
-
-| | `ROWS` | `GROUPS` | `RANGE` |
-|--------------------|------------|----------|---------|
-| `CURRENT ROW` | The current row | Because a peer group contains multiple rows, the meaning depends on whether the option is used as frame\_start or frame\_end: as frame\_start, the first row of the peer group; as frame\_end, the last row of the peer group. | Same as GROUPS: as frame\_start, the first row of the peer group; as frame\_end, the last row of the peer group. |
-| `offset PRECEDING` | The offset rows before the current row | The offset peer groups before the current one | Preceding rows whose value differs from the current row's value by at most offset belong to the frame |
-| `offset FOLLOWING` | The offset rows after the current row | The offset peer groups after the current one | Following rows whose value differs from the current row's value by at most offset belong to the frame |
-
-The syntax is as follows:
-
-```SQL
--- Specify both frame_start and frame_end
-{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end
--- Specify only frame_start; frame_end defaults to CURRENT ROW
-{ RANGE | ROWS | GROUPS } frame_start
-```
-
-If no Frame is specified explicitly, the default Frame is determined as follows:
-
-* When the window uses ORDER BY: the default Frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, i.e. from the first row of the partition through the current row, including the peers of the current row. For example, in RANK() OVER(PARTITION BY COL1 ORDER BY COL2), the Frame by default contains the current row and all rows before it in the partition.
-* When the window does not use ORDER BY: the default Frame is RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, i.e. all rows of the partition. For example, in AVG(COL2) OVER(PARTITION BY col1), the Frame by default contains all rows in the partition, so the average of the whole partition is computed.
-
-Note that when the Frame type is GROUPS or RANGE, `ORDER BY` must be specified. The difference is that the ORDER BY of a GROUPS frame may involve multiple fields, whereas RANGE has to compute value differences, so its ORDER BY can specify only one field.
-
-* Example
-
-1. 
Frame 类型为 ROWS - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ROWS 1 PRECEDING) as count FROM device_flow; -``` - -拆解步骤: - -* 取前一行和当前行作为 Frame - * 对于 partition 的第一行,由于没有前一行,所以整个 Frame 只有它一行,返回 1; - * 对于 partition 的其他行,整个 Frame 包含当前行和它的前一行,返回 2: - -![](/img/window-function-4.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 2| -+-----------------------------+------+----+-----+ -``` - -2. Frame 类型为 GROUPS - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ORDER BY flow GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -拆解步骤: - -* 取前一个 peer group 和当前 peer group 作为 Frame,那么以 device 为 d0 的 partition 为例(d1同理),对于 count 行数: - * 对于 flow 为 1 的 peer group,由于它也没比它小的 peer group 了,所以整个 Frame 就它一行,返回 1; - * 对于 flow 为 3 的 peer group,它本身包含 2 行,前一个 peer group 就是 flow 为 1 的,就一行,因此整个 Frame 三行,返回 3; - * 对于 flow 为 5 的 peer group,它本身包含 1 行,前一个 peer group 就是 flow 为 3 的,共两行,因此整个 Frame 三行,返回 3。 - -![](/img/window-function-5.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -3. 
Frame 类型为 RANGE - -查询语句: - -```SQL -IoTDB> SELECT *,count(flow) OVER(PARTITION BY device ORDER BY flow RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -拆解步骤: - -* 把比当前行数据**小于等于 2 ​**的分为同一个 Frame,那么以 device 为 d0 的 partition 为例(d1 同理),对于 count 行数: - * 对于 flow 为 1 的行,由于它是最小的行了,所以整个 Frame 就它一行,返回 1; - * 对于 flow 为 3 的行,注意 CURRENT ROW 是作为 frame\_end 存在,因此是整个 peer group 的最后一行,符合要求比它小的共 1 行,然后 peer group 有 2 行,所以整个 Frame 共 3 行,返回 3; - * 对于 flow 为 5 的行,它本身包含 1 行,符合要求的比它小的共 2 行,所以整个 Frame 共 3 行,返回 3。 - -![](/img/window-function-6.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -## 3. 内置的窗口函数 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
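-| Function category | Function name | Definition | Supports the FRAME clause |
-|---|---|---|---|
-| Aggregate Function | All built-in aggregate functions | Aggregates a set of values into a single result. | Yes |
-| Value Function | `first_value` | Returns the first value of the frame; if IGNORE NULLS is specified, leading NULLs are skipped | Yes |
-| | `last_value` | Returns the last value of the frame; if IGNORE NULLS is specified, trailing NULLs are skipped | Yes |
-| | `nth_value` | Returns the n-th element of the frame (n starts at 1); if IGNORE NULLS is specified, NULLs are skipped | Yes |
-| | `lead` | Returns the element offset rows after the current row (NULLs are not counted if IGNORE NULLS is specified); if no such element exists (beyond the partition), returns default | No |
-| | `lag` | Returns the element offset rows before the current row (NULLs are not counted if IGNORE NULLS is specified); if no such element exists (beyond the partition), returns default | No |
-| Rank Function | `rank` | Returns the rank of the current row within the partition; rows with equal values share the same rank, and gaps may appear between ranks | No |
-| | `dense_rank` | Returns the rank of the current row within the partition; rows with equal values share the same rank, and there are no gaps between ranks | No |
-| | `row_number` | Returns the row number of the current row within the partition, starting from 1 | No |
-| | `percent_rank` | Returns the relative rank of the current row's value within the partition as a fraction: (rank() - 1) / (n - 1), where n is the number of rows in the partition | No |
-| | `cume_dist` | Returns the cumulative distribution of the current row's value within the partition as a fraction: (number of rows less than or equal to it) / n | No |
-| | `ntile` | Given n, numbers each row from 1 to n. | No |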
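
The rank functions in the table above differ only in how they treat peers (rows with equal ORDER BY values). The following Python sketch is not IoTDB code: the helper names `rank_functions` and `ntile` are hypothetical, and the input is assumed to be the ORDER BY values of one partition in ascending order. Returning 0.0 for `percent_rank` on a single-row partition is also an assumption, since the formula (rank() - 1) / (n - 1) is undefined there.

```python
def rank_functions(values):
    """Compute the rank-style window functions for one sorted partition.

    `values` holds the ORDER BY values in ascending order
    (hypothetical helper, for illustration only).
    """
    n = len(values)
    rows = []
    rank = 0
    dense_rank = 0
    for i, v in enumerate(values):
        if i == 0 or v != values[i - 1]:
            rank = i + 1     # jumps past all earlier rows, so gaps can appear
            dense_rank += 1  # grows by exactly one per peer group, so no gaps
        less_or_equal = sum(1 for x in values if x <= v)
        rows.append({
            "row_number": i + 1,
            "rank": rank,
            "dense_rank": dense_rank,
            # Assumption: a single-row partition yields 0.0.
            "percent_rank": (rank - 1) / (n - 1) if n > 1 else 0.0,
            "cume_dist": less_or_equal / n,
        })
    return rows


def ntile(row_count, n):
    """Number `row_count` rows from 1 to n; leftover rows go to the first buckets."""
    base, extra = divmod(row_count, n)
    numbers = []
    for bucket in range(1, n + 1):
        size = base + (1 if bucket <= extra else 0)
        numbers.extend([bucket] * size)
    return numbers


for row in rank_functions([1, 3, 3, 5]):  # flow values of device d0
    print(row)
print(ntile(5, 3))
```

For the flow values 1, 3, 3, 5, the two peers with flow 3 share rank 2 and dense rank 2, the last row gets rank 4 but dense rank 3, and `ntile(5, 3)` returns 1, 1, 2, 2, 3.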
- -### 3.1 Aggregate Function - -所有内置聚合函数,如 `sum()`、`avg()`、`min()`、`max()` 都能当作 Window Function 使用。 - -> 注意:与 GROUP BY 不同,Window Function 中每一行都有相应的输出 - -示例: - -```SQL -IoTDB> SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) as sum FROM device_flow; -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -### 3.2 Value Function -1. `first_value` - -* 函数名:`first_value(value) [IGNORE NULLS]` -* 定义:返回 frame 的第一个值,如果指定了 IGNORE NULLS 需要跳过前缀的 NULL; -* 示例: - -```SQL -IoTDB> SELECT *, first_value(flow) OVER w as first_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+-----------+ -| time|device|flow|first_value| -+-----------------------------+------+----+-----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----------+ -``` - -2. 
`last_value` - -* 函数名:`last_value(value) [IGNORE NULLS]` -* 定义:返回 frame 的最后一个值,如果指定了 IGNORE NULLS 需要跳过后缀的 NULL; -* 示例: - -```SQL -IoTDB> SELECT *, last_value(flow) OVER w as last_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|last_value| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 5| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -3. `nth_value` - -* 函数名:`nth_value(value, n) [IGNORE NULLS]` -* 定义:返回 frame 的第 n 个元素(注意 n 是从 1 开始),如果有 IGNORE NULLS 需要跳过 NULL; -* 示例: - -```SQL -IoTDB> SELECT *, nth_value(flow, 2) OVER w as nth_values FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|nth_values| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -4. 
lead
-
-* Function name: `lead(value[, offset[, default]]) [IGNORE NULLS]`
-* Definition: Returns the element offset rows after the current row (NULLs are not counted if IGNORE NULLS is specified); if no such element exists (beyond the partition), returns default. offset defaults to 1 and default defaults to NULL.
-* The lead function requires an ORDER BY clause in the window.
-* Example:
-
-```SQL
-IoTDB> SELECT *, lead(flow) OVER w as lead FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time);
-+-----------------------------+------+----+----+
-|                         time|device|flow|lead|
-+-----------------------------+------+----+----+
-|1970-01-01T08:00:04.000+08:00|    d1|   2|   4|
-|1970-01-01T08:00:05.000+08:00|    d1|   4|null|
-|1970-01-01T08:00:00.000+08:00|    d0|   3|   5|
-|1970-01-01T08:00:01.000+08:00|    d0|   5|   3|
-|1970-01-01T08:00:02.000+08:00|    d0|   3|   1|
-|1970-01-01T08:00:03.000+08:00|    d0|   1|null|
-+-----------------------------+------+----+----+
-```
-
-5. lag
-
-* Function name: `lag(value[, offset[, default]]) [IGNORE NULLS]`
-* Definition: Returns the element offset rows before the current row (NULLs are not counted if IGNORE NULLS is specified); if no such element exists (beyond the partition), returns default. offset defaults to 1 and default defaults to NULL.
-* The lag function requires an ORDER BY clause in the window.
-* Example:
-
-```SQL
-IoTDB> SELECT *, lag(flow) OVER w as lag FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time);
-+-----------------------------+------+----+----+
-|                         time|device|flow| lag|
-+-----------------------------+------+----+----+
-|1970-01-01T08:00:04.000+08:00|    d1|   2|null|
-|1970-01-01T08:00:05.000+08:00|    d1|   4|   2|
-|1970-01-01T08:00:00.000+08:00|    d0|   3|null|
-|1970-01-01T08:00:01.000+08:00|    d0|   5|   3|
-|1970-01-01T08:00:02.000+08:00|    d0|   3|   5|
-|1970-01-01T08:00:03.000+08:00|    d0|   1|   3|
-+-----------------------------+------+----+----+
-```
-
-### 3.3 Rank Function
-1. 
rank - -* 函数名:`rank()` -* 定义:返回当前行在整个 partition 中的序号,值相同的行序号相同,序号之间可能有 gap; -* 示例: - -```SQL -IoTDB> SELECT *, rank() OVER w as rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -2. dense\_rank - -* 函数名:`dense_rank()` -* 定义:返回当前行在整个 partition 中的序号,值相同的行序号相同,序号之间没有 gap。 -* 示例: - -```SQL -IoTDB> SELECT *, dense_rank() OVER w as dense_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|dense_rank| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+----------+ -``` - -3. 
row\_number - -* 函数名:`row_number()` -* 定义:返回当前行在整个 partition 中的行号,注意行号从 1 开始; -* 示例: - -```SQL -IoTDB> SELECT *, row_number() OVER w as row_number FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|row_number| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----------+ -``` - -4. percent\_rank - -* 函数名:`percent_rank()` -* 定义:以百分比的形式,返回当前行的值在整个 partition 中的序号;即 **(rank() - 1) / (n - 1)**,其中 n 是整个 partition 的行数; -* 示例: - -```SQL -IoTDB> SELECT *, percent_rank() OVER w as percent_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+------------------+ -| time|device|flow| percent_rank| -+-----------------------------+------+----+------------------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.0| -|1970-01-01T08:00:00.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:02.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+------------------+ -``` - -5. 
cume\_dist - -* 函数名:cume\_dist -* 定义:以百分比的形式,返回当前行的值在整个 partition 中的序号;即 **(小于等于它的行数) / n**。 -* 示例: - -```SQL -IoTDB> SELECT *, cume_dist() OVER w as cume_dist FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+---------+ -| time|device|flow|cume_dist| -+-----------------------------+------+----+---------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.5| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.25| -|1970-01-01T08:00:00.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:02.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+---------+ -``` - -6. ntile - -* 函数名:ntile -* 定义:指定 n,给每一行进行 1~n 的编号。 - * 整个 partition 行数比 n 小,那么编号就是行号 index; - * 整个 partition 行数比 n 大: - * 如果行数能除尽 n,那么比较完美,比如行数为 4,n 为 2,那么编号为 1、1、2、2、; - * 如果行数不能除尽 n,那么就分给开头几组,比如行数为 5,n 为 3,那么编号为 1、1、2、2、3; -* 示例: - -```SQL -IoTDB> SELECT *, ntile(2) OVER w as ntile FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+-----+ -| time|device|flow|ntile| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -+-----------------------------+------+----+-----+ -``` - -## 4. 场景示例 -1. 
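Multi-device diff function
-
-For each row of each device, compute the difference from the previous row:
-
-```SQL
-SELECT
-    *,
-    measurement - lag(measurement) OVER (PARTITION BY device ORDER BY time)
-FROM data
-WHERE timeCondition;
-```
-
-For each row of each device, compute the difference from the next row:
-
-```SQL
-SELECT
-    *,
-    measurement - lead(measurement) OVER (PARTITION BY device ORDER BY time)
-FROM data
-WHERE timeCondition;
-```
-
-For each row of a single device, compute the difference from the previous row (the next row works the same way):
-
-```SQL
-SELECT
-    *,
-    measurement - lag(measurement) OVER (ORDER BY time)
-FROM data
-WHERE device = 'd1'
-    AND timeCondition;
-```
-
-2. Multi-device TOP\_K/BOTTOM\_K
-
-Use rank to obtain the rank of each row, then keep the desired rows in the outer query.
-
-(Note: window functions are evaluated after the HAVING clause, so a subquery is needed here.)
-
-```SQL
-SELECT *
-FROM(
-    SELECT
-        *,
-        rank() OVER (PARTITION BY device ORDER BY time DESC) AS rank
-    FROM data
-    WHERE timeCondition
-)
-WHERE rank <= 3;
-```
-
-Besides ordering by time, rows can also be ordered by the measurement value:
-
-```SQL
-SELECT *
-FROM(
-    SELECT
-        *,
-        rank() OVER (PARTITION BY device ORDER BY measurement DESC) AS rank
-    FROM data
-    WHERE timeCondition
-)
-WHERE rank <= 3;
-```
-
-3. Multi-device CHANGE\_POINTS
-
-This SQL removes consecutive identical values from the input series. It can be implemented with lead plus a subquery:
-
-```SQL
-SELECT
-    time,
-    device,
-    measurement
-FROM(
-    SELECT
-        time,
-        device,
-        measurement,
-        LEAD(measurement) OVER (PARTITION BY device ORDER BY time) AS next
-    FROM data
-    WHERE timeCondition
-)
-WHERE measurement != next OR next IS NULL;
-```
diff --git a/src/zh/UserGuide/Master/Tree/AI-capability/AINode_Upgrade_timecho.md b/src/zh/UserGuide/Master/Tree/AI-capability/AINode_Upgrade_timecho.md
deleted file mode 100644
index e7464bc7f..000000000
--- a/src/zh/UserGuide/Master/Tree/AI-capability/AINode_Upgrade_timecho.md
+++ /dev/null
@@ -1,669 +0,0 @@
-
-
-# AINode
-
-AINode is a native IoTDB node that supports the registration, management, and invocation of time series related models. It includes industry-leading self-developed time series large models, such as the Timer series models developed by Tsinghua University. Models can be invoked using standard SQL statements, enabling millisecond-level real-time inference on time series data and supporting application scenarios such as time series trend prediction, missing value filling, and anomaly detection.
-
-The system architecture is shown in the following figure:
-
-![](/img/AINode-0.png)
-
-The responsibilities of the three nodes are as follows:
-
-* **ConfigNode**: Responsible for distributed node management and load balancing.
-* **DataNode**: Responsible for receiving and parsing user SQL requests, storing time series data, and performing data preprocessing calculations.
-* **AINode**: Responsible for managing and using time series models.
-
-## 1. 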
优势特点 - -与单独构建机器学习服务相比,具有以下优势: - -* **简单易用**:无需使用 Python 或 Java 编程,使用 SQL 语句即可完成机器学习模型管理与推理的完整流程。如创建模型可使用CREATE MODEL语句、使用模型进行推理可使用 CALL INFERENCE (...) 语句等,使用更加简单便捷。 -* **避免数据迁移**:使用 IoTDB 原生机器学习可以将存储在 IoTDB 中的数据直接应用于机器学习模型的推理,无需将数据移动到单独的机器学习服务平台,从而加速数据处理、提高安全性并降低成本。 - -![](/img/h1.png) - -* **内置先进算法**:支持业内领先机器学习分析算法,覆盖典型时序分析任务,为时序数据库赋能原生数据分析能力。如: - * **时间序列预测(Time Series Forecasting)**:从过去时间序列中学习变化模式;从而根据给定过去时间的观测值,输出未来序列最可能的预测。 - * **时序异常检测(Anomaly Detection for Time Series)**:在给定的时间序列数据中检测和识别异常值,帮助发现时间序列中的异常行为。 - -## 2. 基本概念 - -* **模型(Model)**:机器学习模型,以时序数据作为输入,输出分析任务的结果或决策。模型是 AINode 的基本管理单元,支持模型的增(注册)、删、查、改(微调)、用(推理)。 -* **创建(Create)**: 将外部设计或训练好的模型文件或算法加载到 AINode 中,由 IoTDB 统一管理与使用。 -* **推理(Inference)**:使用创建的模型在指定时序数据上完成该模型适用的时序分析任务。 -* **内置能力(Built-in)**:AINode 自带常见时序分析场景(例如预测与异常检测)的机器学习算法或自研模型。 - -![](/img/h3.png) - -## 3. 安装部署 - -AINode 的部署可参考文档 [AINode 部署](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md) 。 - -## 4. 使用指导 - -TimechoDB-AINode 支持模型推理、模型微调以及模型管理(注册、查看、删除、加载、卸载等)三大功能,下面章节将进行详细说明。 - -### 4.1 模型推理 - -SQL语法如下: - -```SQL -call inference(,inputSql,(=)*) -``` - -在完成模型的注册后(内置模型推理无需注册流程),通过call关键字,调用inference函数就可以使用模型的推理功能,其对应的参数介绍如下: - -* **model\_id**: 对应一个已经注册的模型 -* **sql**:sql查询语句,查询的结果作为模型的输入进行模型推理。查询的结果中行列的维度需要与具体模型config中指定的大小相匹配。(这里的sql不建议使用`SELECT *`子句,因为在IoTDB中,`*`并不会对列进行排序,因此列的顺序是未定义的,可以使用`SELECT ot` 的方式确保列的顺序符合模型输入的预期) -* **parameterName/parameterValue**:参数名/参数值,目前支持: - - | 参数名称 | 参数类型 | 参数描述 | 默认值 | - | ------------------------ | ---------- | -------------------------- | -------- | - | **generateTime** | boolean | 返回结果是否包含时间戳列 | false | - | **outputLength** | int | 指定返回结果的输出长度 | 96 | - - -说明: - -1. 使用内置时序大模型进行推理的前提条件是本地存有对应模型权重,目录为 /TIMECHODB\_AINODE\_HOME/data/ainode/models/builtin/model\_id/。若本地没有模型权重,则会自动从 HuggingFace 拉取,请保证本地能直接访问 HuggingFace。 -2. 
在深度学习应用中,经常将时间戳衍生特征(数据中的时间列)作为生成式任务的协变量,一同输入到模型中以提升模型的效果,但是在模型的输出结果中一般不包含时间列。为了保证实现的通用性,模型推理结果只对应模型的真实输出,如果模型不输出时间列,则结果中不会包含。 - -**示例** - -样本数据 [ETTh-tree](/img/ETTh-tree.csv) - -下面是使用 sundial 模型推理的一个操作示例,输入 96 行, 输出 48 行,我们通过SQL使用其进行推理。 - -```SQL -IoTDB> select OT from root.db.** -+-----------------------------+---------------+ -| Time|root.db.etth.OT| -+-----------------------------+---------------+ -|2016-07-01T00:00:00.000+08:00| 30.531| -|2016-07-01T01:00:00.000+08:00| 27.787| -|2016-07-01T02:00:00.000+08:00| 27.787| -|2016-07-01T03:00:00.000+08:00| 25.044| -|2016-07-01T04:00:00.000+08:00| 21.948| -| ...... | ...... | -|2016-07-04T19:00:00.000+08:00| 29.546| -|2016-07-04T20:00:00.000+08:00| 29.475| -|2016-07-04T21:00:00.000+08:00| 29.264| -|2016-07-04T22:00:00.000+08:00| 30.953| -|2016-07-04T23:00:00.000+08:00| 31.726| -+-----------------------------+---------------+ -Total line number = 96 - -IoTDB> call inference(sundial,"select OT from root.db.**", generateTime=True, outputLength=48) -+-----------------------------+------------------+ -| Time| output| -+-----------------------------+------------------+ -|2016-07-04T23:00:00.000+08:00|30.537494659423828| -|2016-07-04T23:59:22.500+08:00|29.619892120361328| -|2016-07-05T00:58:45.000+08:00|28.815832138061523| -|2016-07-05T01:58:07.500+08:00| 27.91131019592285| -|2016-07-05T02:57:30.000+08:00|26.893848419189453| -| ...... | ...... | -|2016-07-06T17:33:07.500+08:00| 24.40607261657715| -|2016-07-06T18:32:30.000+08:00| 25.00441551208496| -|2016-07-06T19:31:52.500+08:00|24.907312393188477| -|2016-07-06T20:31:15.000+08:00|25.156436920166016| -|2016-07-06T21:30:37.500+08:00|25.335433959960938| -+-----------------------------+------------------+ -Total line number = 48 -``` - -### 4.2 模型微调 - -AINode 支持通过 SQL 进行模型微调任务。 - -**SQL 语法** - -```SQL -createModel - | CREATE MODEL modelId=identifier (WITH HYPERPARAMETERS LR_BRACKET hparamPair (COMMA hparamPair)* RR_BRACKET)? 
FROM MODEL existingModelId=identifier ON DATASET LR_BRACKET trainingData RR_BRACKET - ; - -trainingData - : dataElement(COMMA dataElement)* - ; - -dataElement - : pathPatternElement (LR_BRACKET timeRange RR_BRACKET)? - ; - -pathPatternElement - : PATH path=prefixPath - ; -``` - -**参数说明** - -| 名称 | 描述 | -| ----------------- |---------------------------------------------------------------------------------------------------------------------------------------| -| modelId | 微调出的模型的唯一标识 | -| hparamPair | 微调使用的超参数 key-value 对,目前支持如下:
`train_epochs`: int 类型,微调轮数
`iter_per_epoch`: int 类型,每轮微调的迭代次数
`learning_rate`: double 类型,学习率 | -| existingModelId | 微调使用的基座模型 | -| trainingData | 微调使用的数据集 | - -**示例** - -1. 选择测点 root.db.etth.ot 中指定时间范围的数据作为微调数据集,基于 sundial 创建模型 sundialv2. - -```SQL -IoTDB> CREATE MODEL sundialv2 FROM MODEL sundial ON DATASET (PATH root.db.etth.OT([1467302400000, 1467644400000))) -Msg: The statement is executed successfully. -IoTDB> show models -+---------------------+---------+-----------+---------+ -| ModelId|ModelType| Category| State| -+---------------------+---------+-----------+---------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -| sundialv2| sundial| fine_tuned| training| -+---------------------+---------+-----------+---------+ -``` - -2. 微调任务后台异步启动,可在 AINode 进程看到 log;微调完成后,查询并使用新的模型 - -```SQL -IoTDB> show models -+---------------------+---------+-----------+---------+ -| ModelId|ModelType| Category| State| -+---------------------+---------+-----------+---------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -| sundialv2| sundial| fine_tuned| active| -+---------------------+---------+-----------+---------+ -``` - -### 4.3 注册自定义模型 - -**符合以下要求的 Transformers 模型可以注册到 AINode 中:** - -1. AINode 目前使用 v4.56.2 版本的 transformers,构建模型时需**避免继承低版本(<4.50)接口**; -2. 
模型需继承一类 AINode 的推理任务流水线(当前支持预测流水线): - * iotdb-core/ainode/iotdb/ainode/core/inference/pipeline/basic\_pipeline.py - - **V2.0.9.3 之前** - ```Python - class BasicPipeline(ABC): - def __init__(self, model_id, **model_kwargs): - self.model_info = model_info - self.device = model_kwargs.get("device", "cpu") - self.model = load_model(model_info, device_map=self.device, **model_kwargs) - - @abstractmethod - def preprocess(self, inputs, **infer_kwargs): - """ - 在推理任务开始前对输入数据进行前处理,包括形状验证和数值转换。 - """ - pass - - @abstractmethod - def postprocess(self, output, **infer_kwargs): - """ - 在推理任务结束后对输出结果进行后处理。 - """ - pass - - - class ForecastPipeline(BasicPipeline): - def __init__(self, model_info, **model_kwargs): - super().__init__(model_info, model_kwargs=model_kwargs) - - def preprocess(self, inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], **infer_kwargs): - """ - 在将输入数据传递给模型进行推理之前进行预处理,验证输入数据的形状和类型。 - - Args: - inputs (list[dict]): - 输入数据,字典列表,每个字典包含: - - 'targets': 形状为 (input_length,) 或 (target_count, input_length) 的张量。 - - 'past_covariates': 可选,张量字典,每个张量形状为 (input_length,)。 - - 'future_covariates': 可选,张量字典,每个张量形状为 (input_length,)。 - - infer_kwargs (dict, optional): 推理的额外关键字参数,如: - - `output_length`(int): 如果提供'future_covariates',用于验证其有效性。 - - Raises: - ValueError: 如果输入格式不正确(例如,缺少键、张量形状无效)。 - - Returns: - 经过预处理和验证的输入数据,可直接用于模型推理。 - """ - pass - - def forecast(self, inputs, **infer_kwargs): - """ - 对给定输入执行预测。 - - Parameters: - inputs: 用于进行预测的输入数据。类型和结构取决于模型的具体实现。 - **infer_kwargs: 额外的推理参数,例如: - - `output_length`(int): 模型应该生成的时间点数量。 - - Returns: - 预测输出,具体形式取决于模型的具体实现。 - """ - pass - - def postprocess(self, outputs: list[torch.Tensor], **infer_kwargs) -> list[torch.Tensor]: - """ - 在推理后对模型输出进行后处理,验证输出数据的形状并确保其符合预期维度。 - - Args: - outputs: - 模型输出,2D张量列表,每个张量形状为 `[target_count, output_length]`。 - - Raises: - InferenceModelInternalException: 如果输出张量形状无效(例如,维数错误)。 - ValueError: 如果输出格式不正确。 - - Returns: - list[torch.Tensor]: - 后处理后的输出,将是一个2D张量列表。 - """ - pass - ``` - - 
**V2.0.9.3 起** - ```Python - class BasicPipeline(ABC): - def __init__(self, model_id, **model_kwargs): - self.model_info = model_info - self.device = model_kwargs.get("device", "cpu") - self.model = load_model(model_info, device_map=self.device, **model_kwargs) - - @abstractmethod - def preprocess(self, inputs, **infer_kwargs): - """ - 在推理任务开始前对输入数据进行前处理,包括形状验证和数值转换。 - """ - pass - - @abstractmethod - def postprocess(self, output, **infer_kwargs): - """ - 在推理任务结束后对输出结果进行后处理。 - """ - pass - - - class ForecastPipeline(BasicPipeline): - def __init__(self, model_info, **model_kwargs): - super().__init__(model_info, model_kwargs=model_kwargs) - - def _preprocess( - self, - inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], - **infer_kwargs, - ): - """ - 在将输入数据传递给模型进行推理之前进行预处理,验证输入数据的形状和类型。 - - Args: - inputs (list[dict[str, dict[str, torch.Tensor] | torch.Tensor]]): - 输入数据,字典列表,每个字典包含: - - 'targets': 形状为 (input_length,) 或 (target_count, input_length) 的张量。 - - 'past_covariates': 可选,张量字典,每个张量形状为 (input_length,)。 - - 'future_covariates': 可选,张量字典,每个张量形状为 (input_length,)。 - - infer_kwargs (dict, optional): 推理的额外关键字参数,如: - - `output_length`(int): 如果提供'future_covariates',用于验证其有效性。 - - Raises: - ValueError: 如果输入格式不正确(例如,缺少键、张量形状无效)。 - - Returns: - 经过预处理和验证的输入数据,可直接用于模型推理。 - """ - pass - - def forecast(self, inputs, **infer_kwargs): - """ - 对给定输入执行预测。 - - Parameters: - inputs: 用于进行预测的输入数据。类型和结构取决于模型的具体实现。 - **infer_kwargs: 额外的推理参数,例如: - - `output_length`(int): 模型应该生成的时间点数量。 - - Returns: - 预测输出,具体形式取决于模型的具体实现。 - """ - pass - - def _postprocess(self, outputs, **infer_kwargs) -> list[torch.Tensor]: - """ - 在推理后对模型输出进行后处理,验证输出数据的形状并确保其符合预期维度。 - - Args: - outputs: - 模型输出,2D张量列表,每个张量形状为 `[target_count, output_length]`。 - - Raises: - InferenceModelInternalException: 如果输出张量形状无效(例如,维数错误)。 - ValueError: 如果输出格式不正确。 - - Returns: - list[torch.Tensor]: - 后处理后的输出,将是一个2D张量列表。 - """ - pass - ``` - -3. 
修改模型配置文件 config.json,确保包含以下字段: - - **V2.0.9.3 之前** - ```JSON - { - "auto_map": { - "AutoConfig": "config.Chronos2CoreConfig", // 指定模型 Config 类 - "AutoModelForCausalLM": "model.Chronos2Model" // 指定模型类 - }, - "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // 指定模型的推理流水线 - "model_type": "custom_t5", // 指定模型类型 - } - ``` - - * 必须通过 auto\_map 指定模型的 Config 类和模型类; - * 必须集成并指定推理流水线类; - * 对于 AINode 管理的内置(builtin)和自定义(user\_defined)模型,模型类别(model\_type)也作为不可重复的唯一标识。即,要注册的模型类别不得与任何已存在的模型类型重复,通过微调创建的模型将继承原模型的模型类别。 - - **V2.0.9.3 起** - > 参数 model_type 非必填 - ```JSON - { - "auto_map": { - "AutoConfig": "config.Chronos2CoreConfig", // 指定模型 Config 类 - "AutoModelForCausalLM": "model.Chronos2Model" // 指定模型类 - }, - "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // 指定模型的推理流水线 - } - ``` - * 必须通过 auto\_map 指定模型的 Config 类和模型类; - * 必须集成并指定推理流水线类; - - -4. 确保要注册的模型目录包含以下文件,且模型配置文件名称和权重文件名称不支持自定义: - * 模型配置文件:config.json; - * 模型权重文件:model.safetensors; - * 模型代码:其它 .py 文件。 - -**注册自定义模型的 SQL 语法如下所示:** - -```SQL -CREATE MODEL USING URI -``` - -**参数说明:** - -* **model\_id**:自定义模型的唯一标识;不可重复,有以下约束: - * 允许出现标识符 [ 0-9 a-z A-Z \_ ] (字母,数字(非开头),下划线(非开头)) - * 长度限制为 2-64 字符 - * 大小写敏感 -* **uri**:包含模型代码和权重的本地 uri 地址。 - -**注册示例:** - -从本地路径上传自定义 Transformers 模型,AINode 会将该文件夹拷贝至 user\_defined 目录中。 - -```SQL -CREATE MODEL chronos2 USING URI 'file:///path/to/chronos2' -``` - -SQL执行后会异步进行注册的流程,可以通过模型展示查看模型的注册状态(见查看模型章节)。模型注册完成后,就可以通过使用正常查询的方式调用具体函数,进行模型推理。 - -### 4.4 查看模型 - -注册成功的模型可以通过查看指令查询模型的具体信息。 - -```SQL -SHOW MODELS -``` - -除了直接展示所有模型的信息外,可以指定`model_id`来查看某一具体模型的信息。 - -```SQL -SHOW MODELS -- 只展示特定模型 -``` - -模型展示的结果中包含如下内容: - -| **ModelId** | **ModelType** | **Category** | **State** | -| ------------------- | --------------------- | -------------------- | ----------------- | -| 模型ID | 模型类型 | 模型种类 | 模型状态 | - -其中,State 模型状态机流转示意图如下: - -![](/img/ainode-upgrade-state-timecho.png) - -状态机流程说明: - -1. 启动 AINode 后,执行 `show models` 命令,仅能查看到**系统内置(BUILTIN)**的模型。 -2. 
用户可导入自己的模型,这类模型的来源标识为**用户自定义(USER\_****DEFINED)**;AINode 会尝试从模型配置文件中解析模型类型(ModelType),若解析失败,该字段则显示为空。 -3. 时序大模型(内置模型)权重文件不随 AINode 打包,AINode 启动时自动下载。 - 1. 下载过程中为 ACTIVATING,下载成功转变为 ACTIVE,失败则变成 INACTIVE。 -4. 用户启动模型微调任务后,正在训练的模型状态为 TRAINING,训练成功变为 ACTIVE,失败则是 FAILED。 -5. 若微调任务成功,微调结束后会统计所有 ckpt (训练文件)中指标最佳的文件并自动重命名,变成用户指定的 model\_id。 - -**查看示例** - -```SQL -IoTDB> show models -+---------------------+--------------+--------------+-------------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------+--------------+-------------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| custom| | user_defined| active| -| timer_xl| timer| builtin| activating| -| sundial| sundial| builtin| active| -| sundialx_1| sundial| fine_tuned| active| -| sundialx_4| sundial| fine_tuned| training| -| sundialx_5| sundial| fine_tuned| failed| -| chronos2| t5| builtin| inactive| -+---------------------+--------------+--------------+-------------+ -``` - -内置传统时序模型介绍如下: - -| 模型名称 | 核心概念 | 适用场景 | 主要特点 | -|----------------------------------| ----------------------------------------------------------------------------------------- | ---------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | -| **ARIMA**(自回归整合移动平均模型) | 结合自回归(AR)、差分(I)和移动平均(MA),用于预测平稳时间序列或可通过差分变为平稳的数据。 | 单变量时间序列预测,如股票价格、销量、经济指标等。| 1. 适用于线性趋势和季节性较弱的数据。2. 需要选择参数 (p,d,q)。3. 对缺失值敏感。 | -| **Holt-Winters**(三参数指数平滑) | 基于指数平滑,引入水平、趋势和季节性三个分量,适用于具有趋势和季节性的数据。 | 有明显季节性和趋势的时间序列,如月度销售额、电力需求等。 | 1. 可处理加性或乘性季节性。2. 对近期数据赋予更高权重。3. 简单易实现。 | -| **Exponential Smoothing**(指数平滑) | 通过加权平均历史数据,权重随时间指数递减,强调近期观测值的重要性。 | 无显著季节性但存在趋势的数据,如短期需求预测。 | 1. 
参数少,计算简单。2. 适合平稳或缓慢变化序列。3. 可扩展为双指数或三指数平滑。 | -| **Naive Forecaster**(朴素预测器) | 使用最近一期的观测值作为下一期的预测值,是最简单的基准模型。 | 作为其他模型的比较基准,或数据无明显模式时的简单预测。 | 1. 无需训练。2. 对突发变化敏感。3. 季节性朴素变体可用前一季节同期值预测。 | -| **STL Forecaster**(季节趋势分解预测) | 基于STL分解时间序列,分别预测趋势、季节性和残差分量后组合。 | 具有复杂季节性、趋势和非线性模式的数据,如气候数据、交通流量。 | 1. 能处理非固定季节性。2. 对异常值稳健。3. 分解后可结合其他模型预测各分量。 | -| **Gaussian HMM**(高斯隐马尔可夫模型) | 假设观测数据由隐藏状态生成,每个状态的观测概率服从高斯分布。 | 状态序列预测或分类,如语音识别、金融状态识别。 | 1. 适用于时序数据的状态建模。2. 假设观测值在给定状态下独立。3. 需指定隐藏状态数量。 | -| **GMM HMM** (高斯混合隐马尔可夫模型) | 扩展Gaussian HMM,每个状态的观测概率由高斯混合模型描述,可捕捉更复杂的观测分布。 | 需要多模态观测分布的场景,如复杂动作识别、生物信号分析。 | 1. 比单一高斯更灵活。2. 参数更多,计算复杂度高。3. 需训练GMM成分数。 | -| **STRAY**(基于奇异值的异常检测) | 通过奇异值分解(SVD)检测高维数据中的异常点,常用于时间序列异常检测。 | 高维时间序列的异常检测,如传感器网络、IT系统监控。 | 1. 无需分布假设。2. 可处理高维数据。3. 对全局异常敏感,局部异常可能漏检。 | - -内置时序大模型介绍如下: - -| 模型名称 | 核心概念 | 适用场景 | 主要特点 | -|---------------| ---------------------------------------------------------------------- | ------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| **Timer-XL** | 支持超长上下文的时序大模型,通过大规模工业数据预训练增强泛化能力。 | 需利用极长历史数据的复杂工业预测,如能源、航空航天、交通等领域。 | 1. 超长上下文支持,可处理数万时间点输入。2. 多场景覆盖,支持非平稳、多变量及协变量预测。3. 基于万亿级高质量工业时序数据预训练。 | -| **Timer-Sundial** | 采用“Transformer + TimeFlow”架构的生成式基础模型,专注于概率预测。 | 需要量化不确定性的零样本预测场景,如金融、供应链、新能源发电预测。 | 1. 强大的零样本泛化能力,支持点预测与概率预测 2. 可灵活分析预测分布的任意统计特性。3. 创新生成架构,实现高效的非确定性样本生成。 | -| **Chronos-2** | 基于离散词元化范式的通用时序基础模型,将预测转化为语言建模任务。 | 快速零样本单变量预测,以及可借助协变量(如促销、天气)提升效果的场景。 | 1. 强大的零样本概率预测能力。2. 支持协变量统一建模,但对输入有严格要求:a. 未来协变量的名称组成的集合必须是历史协变量的名称组成的集合的子集;b. 每个历史协变量的长度必须等于目标变量的长度; c. 每个未来协变量的长度必须等于预测长度;3. 
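Uses an efficient encoder-style architecture that balances performance and inference speed. | - - -### 4.5 Delete Models - -A successfully registered model can be deleted through SQL. AINode removes the entire folder of that model from the user\_defined directory. The SQL syntax is as follows: - -```SQL -DROP MODEL <model_id> -``` - -Specify the model\_id of a successfully registered model to delete it. Because deletion involves cleaning up model data, the operation does not complete immediately. During this time the model is in the DROPPING state and cannot be used for inference. Note that built-in models cannot be deleted. - -### 4.6 Load/Unload Models - -To suit different scenarios, AINode provides two model loading strategies: - -* On-demand loading: the model is loaded temporarily at inference time and its resources are released afterwards. Suitable for testing or low-load scenarios. -* Resident loading: the model stays loaded in memory (CPU) or video memory (GPU) to support high-concurrency inference. Users only specify through SQL which model to load or unload, and AINode manages the number of instances automatically. The status of resident models can be checked at any time. - -The following describes loading and unloading in detail: - -1. Configuration parameters - -The parameters related to resident loading can be set by editing the following configuration items. - -```Properties -# Fraction of the total device memory / video memory that AINode may use for inference -# Datatype: Float -ain_inference_memory_usage_ratio=0.4 - -# Memory reserved for each loaded model instance, as a multiple of the model's own footprint (model footprint * this value) -# Datatype: Float -ain_inference_extra_memory_ratio=1.2 -``` - -2. Show available devices - -The following SQL command lists the IDs of all available devices. - -```SQL -SHOW AI_DEVICES -``` - -Example - -```SQL -IoTDB> show ai_devices -+-------------+ -| DeviceId| -+-------------+ -| cpu| -| 0| -| 1| -+-------------+ -``` - -3. Load a model - -The following SQL command loads a model manually. The system **automatically balances** the number of model instances according to hardware resource usage. - -```SQL -LOAD MODEL <existing_model_id> TO DEVICES <device_id>(, <device_id>)* -``` - -Parameters - -* **existing\_model\_id:** the ID of the model. The current version supports only timer\_xl and sundial. -* **device\_id:** where the model is loaded. - * **cpu:** load into the memory of the server on which AINode runs. - * **gpu\_id:** load onto the corresponding GPUs of the server on which AINode runs. For example, "0, 1" loads the model onto the two GPUs numbered 0 and 1. - -Example - -```SQL -LOAD MODEL sundial TO DEVICES 'cpu,0,1' -``` - -4. Unload a model - -The following SQL command manually unloads all instances of the specified model. The system **reallocates** the freed resources to other models. - -```SQL -UNLOAD MODEL <existing_model_id> FROM DEVICES <device_id>(, <device_id>)* -``` - -Parameters - -* **existing\_model\_id:** the ID of the model. The current version supports only timer\_xl and sundial. -* **device\_id:** where the model is loaded. - * **cpu:** try to unload the specified model from the memory of the server on which AINode runs. - * **gpu\_id:** try to unload the specified model from the corresponding GPUs of the server on which AINode runs. For example, "0, 1" tries to unload the model from the two GPUs numbered 0 and 1. - -Example - -```SQL -UNLOAD MODEL sundial FROM DEVICES 'cpu,0,1' -``` - -5. 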
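Show loaded models - -The following SQL commands list the model instances that have been loaded manually. A `device_id` can be given to restrict the result to specific devices. - -```SQL -SHOW LOADED MODELS -SHOW LOADED MODELS <device_id>(, <device_id>)* -- show the model instances on the specified devices -``` - -Example: the sundial model has been loaded into memory and onto two GPUs, gpu\_0 and gpu\_1 - -```SQL -IoTDB> show loaded models -+-------------+--------------+------------------+ -| DeviceId| ModelId| Count(instances)| -+-------------+--------------+------------------+ -| cpu| sundial| 4| -| 0| sundial| 6| -| 1| sundial| 6| -+-------------+--------------+------------------+ -``` - -Notes: - -* DeviceId: the device ID -* ModelId: the ID of the loaded model -* Count(instances): the number of model instances on each device (assigned automatically by the system) - -### 4.7 Introduction to Time Series Large Models - -AINode currently supports several time series large models. For an introduction and deployment instructions, see [Time Series Large Models](../AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md) - -## 5. Permission Management - -AINode features are governed by IoTDB's own authentication and authorization. A user can use the model management features only with the USE\_MODEL permission. To run inference, the user also needs permission to access the source series referenced by the SQL whose result is fed to the model. - -| Permission name | Scope | Administrator (default ROOT) | Ordinary user | Path-related | -| ------------ | ----------------------------------------- | ------------------------ | ---------- | ---------- | -| USE\_MODEL | create model / show models / drop model | √ | √ | x | -| READ\_DATA | call inference | √ | √ | √ | diff --git a/src/zh/UserGuide/Master/Tree/AI-capability/AINode_timecho.md b/src/zh/UserGuide/Master/Tree/AI-capability/AINode_timecho.md deleted file mode 100644 index c551a0b0c..000000000 --- a/src/zh/UserGuide/Master/Tree/AI-capability/AINode_timecho.md +++ /dev/null @@ -1,668 +0,0 @@ - - -# AINode - -AINode is a native IoTDB node that supports the registration, management, and invocation of time series models. It ships with industry-leading self-developed time series large models, such as the Timer series developed at Tsinghua University. Models are invoked through standard SQL statements, which enables millisecond-level real-time inference on time series data and supports scenarios such as trend forecasting, missing value imputation, and anomaly detection. - -> Supported in V2.0.5.1 and later - -The system architecture is shown in the following figure: - -![](/img/AINode-0.png) - -The responsibilities of the three node types are as follows: - -- **ConfigNode**: manages the distributed nodes and handles load balancing. -- **DataNode**: receives and parses user SQL requests, stores time series data, and performs data preprocessing. -- **AINode**: manages and runs time series models. - -## 1. 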
优势特点 - -与单独构建机器学习服务相比,具有以下优势: - -- **简单易用**:无需使用 Python 或 Java 编程,使用 SQL 语句即可完成机器学习模型管理与推理的完整流程。如创建模型可使用CREATE MODEL语句、使用模型进行推理可使用CALL INFERENCE(...)语句等,使用更加简单便捷。 - -- **避免数据迁移**:使用 IoTDB 原生机器学习可以将存储在 IoTDB 中的数据直接应用于机器学习模型的推理,无需将数据移动到单独的机器学习服务平台,从而加速数据处理、提高安全性并降低成本。 - -![](/img/h1.png) - -- **内置先进算法**:支持业内领先机器学习分析算法,覆盖典型时序分析任务,为时序数据库赋能原生数据分析能力。如: - - **时间序列预测(Time Series Forecasting)**:从过去时间序列中学习变化模式;从而根据给定过去时间的观测值,输出未来序列最可能的预测。 - - **时序异常检测(Anomaly Detection for Time Series)**:在给定的时间序列数据中检测和识别异常值,帮助发现时间序列中的异常行为。 - - **时间序列标注(Time Series Annotation)**:为每个数据点或特定时间段添加额外的信息或标记,例如事件发生、异常点、趋势变化等,以便更好地理解和分析数据。 - - -## 2. 基本概念 - -- **模型(Model)**:机器学习模型,以时序数据作为输入,输出分析任务的结果或决策。模型是 AINode 的基本管理单元,支持模型的增(注册)、删、查、改(微调)、用(推理)。 -- **创建(Create)**: 将外部设计或训练好的模型文件或算法加载到 AINode 中,由 IoTDB 统一管理与使用。 -- **推理(Inference)**:使用创建的模型在指定时序数据上完成该模型适用的时序分析任务。 -- **内置能力(Built-in)**:AINode 自带常见时序分析场景(例如预测与异常检测)的机器学习算法或自研模型。 - -![](/img/AINode-new.png) - -## 3. 安装部署 - -AINode 的部署可参考文档 [AINode 部署](../Deployment-and-Maintenance/AINode_Deployment_timecho.md) 章节。 - -## 4. 使用指导 - -AINode 对时序模型提供了模型创建及删除功能,内置模型无需创建,可直接使用。 - -### 4.1 注册模型 - -通过指定模型输入输出的向量维度,可以注册训练好的深度学习模型,从而用于模型推理。 - -符合以下内容的模型可以注册到AINode中: - 1. AINode 目前支持基于 PyTorch 2.4.0 版本训练的模型,需避免使用 2.4.0 版本以上的特性。 - 2. AINode 支持使用 PyTorch JIT 存储的模型(`model.pt`),模型文件需要包含模型的结构和权重。 - 3. 模型输入序列可以包含一列或多列,若有多列,需要和模型能力、模型配置文件对应。 - 4. 
模型的配置参数必须在`config.yaml`配置文件中明确定义。使用模型时,必须严格按照`config.yaml`配置文件中定义的输入输出维度。如果输入输出列数不匹配配置文件,将会导致错误。 - -下方为模型注册的SQL语法定义。 - -```SQL -create model using uri -``` - -SQL中参数的具体含义如下: - -- model_id:模型的全局唯一标识,不可重复。模型名称具备以下约束: - - - 允许出现标识符 [ 0-9 a-z A-Z _ ](字母,数字(非开头),下划线(非开头)) - - 长度限制为2-64字符 - - 大小写敏感 - -- uri:模型注册文件的资源路径,路径下应包含**模型结构及权重文件 model.pt 文件和模型配置文件 config.yaml** - - - 模型结构及权重文件:模型训练完成后得到的权重文件,目前支持 pytorch 训练得到的 .pt 文件 - - - 模型配置文件:模型注册时需要提供的与模型结构有关的参数,其中必须包含模型的输入输出维度用于模型推理: - - - | **参数名** | **参数描述** | **示例** | - | ------------ | ---------------------------- | -------- | - | input_shape | 模型输入的行列,用于模型推理 | [96,2] | - | output_shape | 模型输出的行列,用于模型推理 | [48,2] | - - - ​ 除了模型推理外,还可以指定模型输入输出的数据类型: - - - | **参数名** | **参数描述** | **示例** | - | ----------- | ------------------ | --------------------- | - | input_type | 模型输入的数据类型 | ['float32','float32'] | - | output_type | 模型输出的数据类型 | ['float32','float32'] | - - - ​ 除此之外,可以额外指定备注信息用于在模型管理时进行展示 - - - | **参数名** | **参数描述** | **示例** | - | ---------- | ---------------------------------------------- | ------------------------------------------- | - | attributes | 可选,用户自行设定的模型备注信息,用于模型展示 | 'model_type': 'dlinear','kernel_size': '25' | - - -除了本地模型文件的注册,还可以通过URI来指定远程资源路径来进行注册,使用开源的模型仓库(例如HuggingFace)。 - -#### 示例 - -在[example 文件夹](https://github.com/apache/iotdb/tree/master/integration-test/src/test/resources/ainode-example)下,包含model.pt和config.yaml文件,model.pt为训练得到,config.yaml的内容如下: - -```YAML -configs: - # 必选项 - input_shape: [96, 2] # 表示模型接收的数据为96行x2列 - output_shape: [48, 2] # 表示模型输出的数据为48行x2列 - - # 可选项 默认为全部float32,列数为shape对应的列数 - input_type: ["int64","int64"] #输入对应的数据类型,需要与输入列数匹配 - output_type: ["text","int64"] #输出对应的数据类型,需要与输出列数匹配 - -attributes: # 可选项 为用户自定义的备注信息 - 'model_type': 'dlinear' - 'kernel_size': '25' -``` - -指定该文件夹作为加载路径就可以注册该模型 - -```SQL -IoTDB> create model dlinear_example using uri "file://./example" -``` - -SQL执行后会异步进行注册的流程,可以通过模型展示查看模型的注册状态(见模型展示章节),注册成功的耗时主要受到模型文件大小的影响。 - -模型注册完成后,就可以通过使用正常查询的方式调用具体函数,进行模型推理。 - -### 
4.2 查看模型 - -注册成功的模型可以通过show models指令查询模型的具体信息。其SQL定义如下: - -```SQL -show models - -show models -``` - -除了直接展示所有模型的信息外,可以指定model id来查看某一具体模型的信息。模型展示的结果中包含如下信息: - -| **ModelId** | **ModelType** | **Category** | **State** | -|-------------|-----------|--------------|----------------| -| 模型ID | 模型类型 | 模型种类 | 模型状态 | - -- 模型状态机流转示意图如下 - -![](/img/AINode-State.png) - -**说明:** - -1. 启动 AINode,show models 只能看到 BUILT-IN 模型 -2. 用户可导入自己的模型,来源为 USER-DEFINED,可尝试从配置文件解析 ModelType,解析不到则为空 -3. 时序大模型权重不随 AINode 打包,AINode 启动时自动下载,下载过程中为 LOADING -4. 下载成功转变为 ACTIVE,失败则变成 INACTIVE -5. 用户启动微调,正在训练的模型状态为 TRAINING,训练成功变为 ACTIVE,失败则是 FAILED - -**示例** - -```SQL -IoTDB> show models -+---------------------+--------------------+--------------+---------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+--------------+---------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| custom| | USER-DEFINED| ACTIVE| -| timerxl| Timer-XL| BUILT-IN| LOADING| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| sundialx_1| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx_2| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx_4| Timer-Sundial| FINE-TUNED| TRAINING| -| sundialx_5| Timer-Sundial| FINE-TUNED| FAILED| -+---------------------+--------------------+--------------+---------+ -``` - -### 4.3 删除模型 - -对于注册成功的模型,用户可以通过SQL进行删除,该操作会删除所有 AINode 下的相关模型文件,其SQL如下: - -```SQL -drop model -``` - -需要指定已经成功注册的模型 model_id 来删除对应的模型。由于模型删除涉及模型数据清理,操作不会立即完成,此时模型的状态为 DROPPING,该状态的模型不能用于模型推理。请注意,该功能不支持删除内置模型。 - -### 4.4 使用内置模型推理 - -SQL语法如下: - - -```SQL -call inference(,inputSql,(=)*) - -window_function: - head(window_size) - 
tail(window_size) - count(window_size,sliding_step) -``` - -内置模型推理无需注册流程,通过call关键字,调用inference函数就可以使用模型的推理功能,其对应的参数介绍如下: - -- **model_id:** 模型名称 -- **parameterName**:参数名 -- **parameterValue**:参数值 - -请注意,使用内置时序大模型进行推理的前提条件是本地存有对应模型权重,目录为 /IOTDB_AINODE_HOME/data/ainode/models/weights/model_id/。若本地没有模型权重,则会自动从 HuggingFace 拉取,请保证本地能直接访问 HuggingFace。 - - -#### 内置模型及参数说明 - -目前已内置如下机器学习模型,具体参数说明请参考以下链接。 - -| 模型 | built_in_model_id | 任务类型 | 参数说明 | -| -------------------- | --------------------- | -------- | ------------------------------------------------------------ | -| Arima | _Arima | 预测 | [Arima参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.arima.ARIMA.html?highlight=Arima) | -| STLForecaster | _STLForecaster | 预测 | [STLForecaster参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.trend.STLForecaster.html#sktime.forecasting.trend.STLForecaster) | -| NaiveForecaster | _NaiveForecaster | 预测 | [NaiveForecaster参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.naive.NaiveForecaster.html#naiveforecaster) | -| ExponentialSmoothing | _ExponentialSmoothing | 预测 | [ExponentialSmoothing参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.exp_smoothing.ExponentialSmoothing.html) | -| GaussianHMM | _GaussianHMM | 标注 | [GaussianHMM参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.detection.hmm_learn.gaussian.GaussianHMM.html) | -| GMMHMM | _GMMHMM | 标注 | [GMMHMM参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.detection.hmm_learn.gmm.GMMHMM.html) | -| Stray | _Stray | 异常检测 | [Stray参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.detection.stray.STRAY.html) | - - -在完成模型的注册后,通过call关键字,调用inference函数就可以使用模型的推理功能,其对应的参数介绍如下: - -- **model_id**: 对应一个已经注册的模型 -- **sql**:sql查询语句,查询的结果作为模型的输入进行模型推理。查询的结果中行列的维度需要与具体模型config中指定的大小相匹配。(这里的sql不建议使用`SELECT 
*`子句,因为在IoTDB中,`*`并不会对列进行排序,因此列的顺序是未定义的,可以使用`SELECT s0,s1`的方式确保列的顺序符合模型输入的预期) -- **window_function**: 推理过程中可以使用的窗口函数,目前提供三种类型的窗口函数用于辅助模型推理: - - **head(window_size)**: 获取数据中最前的window_size个点用于模型推理,该窗口可用于数据裁剪 - ![](/img/AINode-call1.png) - - - **tail(window_size)**:获取数据中最后的window_size个点用于模型推,该窗口可用于数据裁剪 - ![](/img/AINode-call2.png) - - - **count(window_size, sliding_step)**:基于点数的滑动窗口,每个窗口的数据会分别通过模型进行推理,如下图示例所示,window_size为2的窗口函数将输入数据集分为三个窗口,每个窗口分别进行推理运算生成结果。该窗口可用于连续推理 - ![](/img/AINode-call3.png) - -**说明1: window可以用来解决sql查询结果和模型的输入行数要求不一致时的问题,对行进行裁剪。需要注意的是,当列数不匹配或是行数直接少于模型需求时,推理无法进行,会返回错误信息。** - -**说明2: 在深度学习应用中,经常将时间戳衍生特征(数据中的时间列)作为生成式任务的协变量,一同输入到模型中以提升模型的效果,但是在模型的输出结果中一般不包含时间列。为了保证实现的通用性,模型推理结果只对应模型的真实输出,如果模型不输出时间列,则结果中不会包含。** - - -#### 示例 - -下面是使用深度学习模型推理的一个操作示例,针对上面提到的输入为`[96,2]`,输出为`[48,2]`的`dlinear`预测模型,我们通过SQL使用其进行推理。 - -```Shell -IoTDB> select s1,s2 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... 
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**", generateTime=True) -+-----------------------------+--------------------------------------------+-----------------------------+ -| Time| _result_0| _result_1| -+-----------------------------+--------------------------------------------+-----------------------------+ -|1990-04-06T00:00:00.000+08:00| 0.726302981376648| 1.6549958229064941| -|1990-04-08T00:00:00.000+08:00| 0.7354921698570251| 1.6482787370681763| -|1990-04-10T00:00:00.000+08:00| 0.7238251566886902| 1.6278168201446533| -...... -|1990-07-07T00:00:00.000+08:00| 0.7692174911499023| 1.654654049873352| -|1990-07-09T00:00:00.000+08:00| 0.7685555815696716| 1.6625318765640259| -|1990-07-11T00:00:00.000+08:00| 0.7856493592262268| 1.6508299350738525| -+-----------------------------+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### 使用tail/head窗口函数的示例 - -当数据量不定且想要取96行最新数据用于推理时,可以使用对应的窗口函数tail。head函数的用法与其类似,不同点在于其取的是最早的96个点。 - -```Shell -IoTDB> select s1,s2 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1988-01-01T00:00:00.000+08:00| 0.7355| 1.211| -...... 
-|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... -|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 996 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**", generateTime=True, window=tail(96)) -+-----------------------------+--------------------------------------------+-----------------------------+ -| Time| _result_0| _result_1| -+-----------------------------+--------------------------------------------+-----------------------------+ -|1990-04-06T00:00:00.000+08:00| 0.726302981376648| 1.6549958229064941| -|1990-04-08T00:00:00.000+08:00| 0.7354921698570251| 1.6482787370681763| -|1990-04-10T00:00:00.000+08:00| 0.7238251566886902| 1.6278168201446533| -...... 
-|1990-07-07T00:00:00.000+08:00| 0.7692174911499023| 1.654654049873352| -|1990-07-09T00:00:00.000+08:00| 0.7685555815696716| 1.6625318765640259| -|1990-07-11T00:00:00.000+08:00| 0.7856493592262268| 1.6508299350738525| -+-----------------------------+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### Example of using the count window function - -This window is mainly used for computational tasks. When the model can only process a fixed number of rows at a time but several groups of results are wanted, this window function performs continuous inference with a point-count sliding window. Suppose we have an anomaly detection model anomaly_example (input: [24,2], output: [1,1]) that produces one 0/1 label for every 24 rows of data. It is used as follows: - -```Shell -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... 
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(anomaly_example,"select s0,s1 from root.**", generateTime=True, window=count(24,24)) -+-----------------------------+-------------------------+ -| Time| _result_0| -+-----------------------------+-------------------------+ -|1990-04-06T00:00:00.000+08:00| 0| -|1990-04-30T00:00:00.000+08:00| 1| -|1990-05-24T00:00:00.000+08:00| 1| -|1990-06-17T00:00:00.000+08:00| 0| -+-----------------------------+-------------------------+ -Total line number = 4 -``` - -其中结果集中每行的标签对应每24行数据为一组,输入该异常检测模型后的输出。 - -### 4.5 使用内置模型微调 -> 仅 Timer-XL、Timer-Sundial 可以进行微调操作。 - -SQL语法如下: - - -```SQL -create model (with hyperparameters -(=(, =)*))? -from model -on dataset (PATH ([timeRange])?) -``` - -#### 示例 - -1. 选择测点 root.db.etth.ot 中前 80% 的数据作为微调数据集,基于 sundial 创建模型 sundialv2. - -```SQL -IoTDB> CREATE MODEL sundialv2 FROM MODEL sundial ON DATASET (PATH root.db.etth.OT([1467302400000, 1517468400001))) -Msg: The statement is executed successfully. 
-IoTDB> show models -+---------------------+--------------------+----------+--------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+--------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| timer_xl| Timer-XL| BUILT-IN| ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED|TRAINING| -+---------------------+--------------------+----------+--------+ -``` - -2. 微调任务后台异步启动,可在 AINode 进程看到 log;微调完成后,查询并使用新的模型 - -```SQL -IoTDB> show models -+---------------------+--------------------+----------+------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+------+ -| arima| Arima| BUILT-IN|ACTIVE| -| holtwinters| HoltWinters| BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN|ACTIVE| -| stray| Stray| BUILT-IN|ACTIVE| -| sundial| Timer-Sundial| BUILT-IN|ACTIVE| -| timer_xl| Timer-XL| BUILT-IN|ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED|ACTIVE| -+---------------------+--------------------+----------+------+ -``` - -### 4.6 时序大模型导入步骤 - -AINode 目前支持多种时序大模型,部署使用请参考[时序大模型](../AI-capability/TimeSeries-Large-Model.md) - -## 5. 
权限管理 - -使用AINode相关的功能时,可以使用IoTDB本身的鉴权去做一个权限管理,用户只有在具备 USE_MODEL 权限时,才可以使用模型管理的相关功能。当使用推理功能时,用户需要有访问输入模型的SQL对应的源序列的权限。 - -| 权限名称 | 权限范围 | 管理员用户(默认ROOT) | 普通用户 | 路径相关 | -| --------- | --------------------------------- | ---------------------- | -------- | -------- | -| USE_MODEL | create model / show models / drop model | √ | √ | x | -| READ_DATA | call inference | √ | √ | √ | - -## 6. 实际案例 - -### 6.1 电力负载预测 - -在部分工业场景下,会存在预测电力负载的需求,预测结果可用于优化电力供应、节约能源和资源、支持规划和扩展以及增强电力系统的可靠性。 - -我们所使用的 ETTh1 的测试集的数据为[ETTh1](/img/ETTh1.csv)。 - - -包含间隔1h采集一次的电力数据,每条数据由负载和油温构成,分别为:High UseFul Load, High UseLess Load, Middle UseLess Load, Low UseFul Load, Low UseLess Load, Oil Temperature。 - -在该数据集上,IoTDB-ML的模型推理功能可以通过以往高中低三种负载的数值和对应时间戳油温的关系,预测未来一段时间内的油温,赋能电网变压器的自动调控和监视。 - -#### 步骤一:数据导入 - -用户可以使用tools文件夹中的`import-data.sh` 向 IoTDB 中导入 ETT 数据集 - -```Bash -bash ./import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw root -s /path/ETTh1.csv -``` - -#### 步骤二:模型导入 - -我们可以在iotdb-cli 中输入以下SQL从 huggingface 上拉取一个已经训练好的模型进行注册,用于后续的推理。 - -```SQL -create model dlinear using uri 'https://huggingface.co/hvlgo/dlinear/tree/main' -``` - -该模型基于较为轻量化的深度模型DLinear训练而得,能够以相对快的推理速度尽可能多地捕捉到序列内部的变化趋势和变量间的数据变化关系,相较于其他更深的模型更适用于快速实时预测。 - -#### 步骤三:模型推理 - -```Shell -IoTDB> select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth LIMIT 96 -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -| Time|root.eg.etth.s0|root.eg.etth.s1|root.eg.etth.s2|root.eg.etth.s3|root.eg.etth.s4|root.eg.etth.s5|root.eg.etth.s6| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -|2017-10-20T00:00:00.000+08:00| 10.449| 3.885| 8.706| 2.025| 2.041| 0.944| 8.864| -|2017-10-20T01:00:00.000+08:00| 11.119| 3.952| 8.813| 2.31| 2.071| 1.005| 8.442| -|2017-10-20T02:00:00.000+08:00| 9.511| 2.88| 7.533| 1.564| 1.949| 0.883| 8.16| -|2017-10-20T03:00:00.000+08:00| 9.645| 
2.21| 7.249| 1.066| 1.828| 0.914| 7.949| -...... -|2017-10-23T20:00:00.000+08:00| 8.105| 0.938| 4.371| -0.569| 3.533| 1.279| 9.708| -|2017-10-23T21:00:00.000+08:00| 7.167| 1.206| 4.087| -0.462| 3.107| 1.432| 8.723| -|2017-10-23T22:00:00.000+08:00| 7.1| 1.34| 4.015| -0.32| 2.772| 1.31| 8.864| -|2017-10-23T23:00:00.000+08:00| 9.176| 2.746| 7.107| 1.635| 2.65| 1.097| 9.004| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example, "select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth", generateTime=True, window=head(96)) -+-----------------------------+-----------+----------+----------+------------+---------+----------+----------+ -| Time| output0| output1| output2| output3| output4| output5| output6| -+-----------------------------+-----------+----------+----------+------------+---------+----------+----------+ -|2017-10-23T23:00:00.000+08:00| 10.319546| 3.1450553| 7.877341| 1.5723765|2.7303758| 1.1362307| 8.867775| -|2017-10-24T01:00:00.000+08:00| 10.443649| 3.3286757| 7.8593454| 1.7675098| 2.560634| 1.1177158| 8.920919| -|2017-10-24T03:00:00.000+08:00| 10.883752| 3.2341104| 8.47036| 1.6116762|2.4874182| 1.1760603| 8.798939| -...... 
-|2017-10-26T19:00:00.000+08:00| 8.0115595| 1.2995274| 6.9900327|-0.098746896| 3.04923| 1.176214| 9.548782| -|2017-10-26T21:00:00.000+08:00| 8.612427| 2.5036244| 5.6790237| 0.66474205|2.8870275| 1.2051733| 9.330128| -|2017-10-26T22:00:00.000+08:00| 10.096699| 3.399722| 6.9909| 1.7478468|2.7642853| 1.1119363| 9.541455| -+-----------------------------+-----------+----------+----------+------------+---------+----------+----------+ -Total line number = 48 -``` - -我们将对油温的预测的结果和真实结果进行对比,可以得到以下的图像。 - -图中10/24 00:00之前的数据为输入模型的过去数据,10/24 00:00后的蓝色线条为模型给出的油温预测结果,而红色为数据集中实际的油温数据(用于进行对比)。 - -![](/img/AINode-analysis1.png) - -可以看到,我们使用了过去96个小时(4天)的六个负载信息和对应时间油温的关系,基于之前学习到的序列间相互关系对未来48个小时(2天)的油温这一数据的可能变化进行了建模,可以看到可视化后预测曲线与实际结果在趋势上保持了较高程度的一致性。 - -### 6.2 功率预测 - -变电站需要对电流、电压、功率等数据进行电力监控,用于检测潜在的电网问题、识别电力系统中的故障、有效管理电网负载以及分析电力系统的性能和趋势等。 - -我们利用某变电站中的电流、电压和功率等数据构成了真实场景下的数据集。该数据集包括变电站近四个月时间跨度,每5 - 6s 采集一次的 A相电压、B相电压、C相电压等数据。 - -测试集数据内容为[data](/img/data.csv)。 - -在该数据集上,IoTDB-ML的模型推理功能可以通过以往A相电压,B相电压和C相电压的数值和对应时间戳,预测未来一段时间内的C相电压,赋能变电站的监视管理。 - -#### 步骤一:数据导入 - -用户可以使用tools文件夹中的`import-data.sh` 导入数据集 - -```Bash -bash ./import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw root -s /path/data.csv -``` - -#### 步骤二:模型导入 - -我们可以在iotdb-cli 中选择内置模型或已经注册好的模型用于后续的推理。 - -我们采用内置模型STLForecaster进行预测,STLForecaster 是一个基于 statsmodels 库中 STL 实现的时间序列预测方法。 - -#### 步骤三:模型推理 - -```Shell -IoTDB> select * from root.eg.voltage limit 96 -+-----------------------------+------------------+------------------+------------------+ -| Time|root.eg.voltage.s0|root.eg.voltage.s1|root.eg.voltage.s2| -+-----------------------------+------------------+------------------+------------------+ -|2023-02-14T20:38:32.000+08:00| 2038.0| 2028.0| 2041.0| -|2023-02-14T20:38:38.000+08:00| 2014.0| 2005.0| 2018.0| -|2023-02-14T20:38:44.000+08:00| 2014.0| 2005.0| 2018.0| -...... 
-|2023-02-14T20:47:52.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:47:57.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:48:03.000+08:00| 2024.0| 2016.0| 2027.0| -+-----------------------------+------------------+------------------+------------------+ -Total line number = 96 - -IoTDB> call inference(_STLForecaster, "select s0,s1,s2 from root.eg.voltage", generateTime=True, window=head(96),predict_length=48) -+-----------------------------+---------+---------+---------+ -| Time| output0| output1| output2| -+-----------------------------+---------+---------+---------+ -|2023-02-14T20:48:03.000+08:00|2026.3601|2018.2953|2029.4257| -|2023-02-14T20:48:09.000+08:00|2019.1538|2011.4361|2022.0888| -|2023-02-14T20:48:15.000+08:00|2025.5074|2017.4522|2028.5199| -...... - -|2023-02-14T20:52:15.000+08:00|2022.2336|2015.0290|2025.1023| -|2023-02-14T20:52:21.000+08:00|2015.7241|2008.8975|2018.5085| -|2023-02-14T20:52:27.000+08:00|2022.0777|2014.9136|2024.9396| -|2023-02-14T20:52:33.000+08:00|2015.5682|2008.7821|2018.3458| -+-----------------------------+---------+---------+---------+ -Total line number = 48 -``` -我们将对C相电压的预测的结果和真实结果进行对比,可以得到以下的图像。 - -图中 02/14 20:48 之前的数据为输入模型的过去数据, 02/14 20:48 后的蓝色线条为模型给出的C相电压预测结果,而红色为数据集中实际的C相电压数据(用于进行对比)。 - -![](/img/AINode-analysis2.png) - -可以看到,我们使用了过去10分钟的电压的数据,基于之前学习到的序列间相互关系对未来5分钟的C相电压这一数据的可能变化进行了建模,可以看到可视化后预测曲线与实际结果在趋势上保持了一定的同步性。 - -### 6.3 异常检测 - -在民航交通运输业,存在着对乘机旅客数量进行异常检测的需求。异常检测的结果可用于指导调整航班的调度,以使得企业获得更大效益。 - -Airline Passengers一个时间序列数据集,该数据集记录了1949年至1960年期间国际航空乘客数量,间隔一个月进行一次采样。该数据集共含一条时间序列。数据集为[airline](/img/airline.csv)。 -在该数据集上,IoTDB-ML的模型推理功能可以通过捕捉序列的变化规律以对序列时间点进行异常检测,赋能交通运输业。 - -#### 步骤一:数据导入 - -用户可以使用tools文件夹中的`import-data.sh` 导入数据集 - -```Bash -bash ./import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw root -s /path/data.csv -``` - -#### 步骤二:模型推理 - -IoTDB内置有部分可以直接使用的机器学习算法,使用其中的异常检测算法进行预测的样例如下: - -```Shell -IoTDB> select * from root.eg.airline -+-----------------------------+------------------+ -| 
Time|root.eg.airline.s0| -+-----------------------------+------------------+ -|1949-01-31T00:00:00.000+08:00| 224.0| -|1949-02-28T00:00:00.000+08:00| 118.0| -|1949-03-31T00:00:00.000+08:00| 132.0| -|1949-04-30T00:00:00.000+08:00| 129.0| -...... -|1960-09-30T00:00:00.000+08:00| 508.0| -|1960-10-31T00:00:00.000+08:00| 461.0| -|1960-11-30T00:00:00.000+08:00| 390.0| -|1960-12-31T00:00:00.000+08:00| 432.0| -+-----------------------------+------------------+ -Total line number = 144 - -IoTDB> call inference(_Stray, "select s0 from root.eg.airline", generateTime=True, k=2) -+-----------------------------+-------+ -| Time|output0| -+-----------------------------+-------+ -|1960-12-31T00:00:00.000+08:00| 0| -|1961-01-31T08:00:00.000+08:00| 0| -|1961-02-28T08:00:00.000+08:00| 0| -|1961-03-31T08:00:00.000+08:00| 0| -...... -|1972-06-30T08:00:00.000+08:00| 1| -|1972-07-31T08:00:00.000+08:00| 1| -|1972-08-31T08:00:00.000+08:00| 0| -|1972-09-30T08:00:00.000+08:00| 0| -|1972-10-31T08:00:00.000+08:00| 0| -|1972-11-30T08:00:00.000+08:00| 0| -+-----------------------------+-------+ -Total line number = 144 -``` - -我们将检测为异常的结果进行绘制,可以得到以下图像。其中蓝色曲线为原时间序列,用红色点特殊标注的时间点为算法检测为异常的时间点。 - -![](/img/s6.png) - -可以看到,Stray模型对输入序列变化进行了建模,成功检测出出现异常的时间点。 \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md b/src/zh/UserGuide/Master/Tree/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md deleted file mode 100644 index 3d86bfc4f..000000000 --- a/src/zh/UserGuide/Master/Tree/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md +++ /dev/null @@ -1,157 +0,0 @@ - -# 时序大模型 - -## 1. 简介 - -时序大模型是专为时序数据分析设计的基础模型。IoTDB 团队长期自研时序基础模型 Timer,该模型基于 Transformer 架构,经海量多领域时序数据预训练,可支撑时序预测、异常检测、时序填补等下游任务;团队打造的 AINode 平台同时支持集成业界前沿时序基础模型,为用户提供多元选型。不同于传统时序分析技术,这类大模型具备通用特征提取能力,可通过零样本分析、微调等技术服务广泛的分析任务。 - -本文相关时序大模型领域的技术成果(含团队自研及业界前沿方向)均发表于国际机器学习顶级会议,具体内容见附录。 - -## 2. 
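Application Scenarios - -* **Time series forecasting**: provides forecasting for time series data in fields such as industrial production and the natural environment, helping users anticipate future trends. -* **Data imputation**: fills missing segments of a time series from their context, improving the continuity and completeness of the dataset. -* **Anomaly detection**: monitors time series data in real time with autoregressive analysis and raises early warnings about potential anomalies. - -![](/img/LargeModel09.png) - -## 3. Timer-1 Model - -The Timer[1] model (not a built-in model) shows strong few-shot generalization and multi-task adaptability, and pre-training gives it a rich knowledge base and general capabilities for diverse downstream tasks. It has the following characteristics: - -* **Generalization**: fine-tuned on a small number of samples, the model reaches the forecasting quality of leading deep models in the industry. -* **Versatility**: the model design is flexible. It adapts to many different tasks and supports variable input and output lengths, so it can be used across a wide range of scenarios. -* **Scalability**: model quality keeps improving as the number of parameters or the scale of the pre-training data increases, so forecasting results continue to improve as more data becomes available. - -![](/img/model01.png) - -## 4. Timer-XL Model - -Timer-XL[2] extends and upgrades the network architecture of Timer and advances along several dimensions: - -* **Ultra-long context support**: the model goes beyond the limits of traditional time series forecasting models and accepts inputs of thousands of tokens (equivalent to tens of thousands of time points), which removes the context-length bottleneck. -* **Coverage of multivariate forecasting scenarios**: supports several forecasting scenarios, including non-stationary time series, tasks involving multiple variables, and forecasting with covariates, to meet diverse business needs. -* **Large-scale industrial time series dataset**: pre-trained on a trillion-scale time series dataset from the industrial IoT domain. The dataset is large, of high quality, and rich in domains, covering energy, aerospace, steel, transportation, and more. - -![](/img/model02.png) - -## 5. Timer-Sundial Model - -Timer-Sundial[3] is a family of generative foundation models focused on time series forecasting. Its base version has 128 million parameters and was pre-trained at scale on 1 trillion time points. Its core features include: - -* **Strong generalization**: provides zero-shot forecasting and supports both point forecasts and probabilistic forecasts. -* **Flexible analysis of the predictive distribution**: besides the mean or quantiles, any statistic of the predictive distribution can be evaluated from the raw samples generated by the model. -* **Novel generative architecture**: uses a combined "Transformer + TimeFlow" architecture. The Transformer learns autoregressive representations of time segments, and the TimeFlow module, built on the flow-matching framework, transforms random noise into diverse forecast trajectories, enabling efficient generation of non-deterministic samples. - -![](/img/model03.png) - -## 6. Chronos-2 Model - -Chronos-2 [4] is a general-purpose time series foundation model developed by the Amazon Web Services (AWS) research team, evolved from the discrete-token modeling paradigm of Chronos. It is suited to both zero-shot univariate forecasting and forecasting with covariates. Its main features include: - -* **Probabilistic forecasting**: the model outputs multi-step forecasts in a generative manner and supports quantile-level or distribution-level forecasts, which characterizes future uncertainty. -* **Zero-shot general forecasting**: relying on the in-context learning ability acquired in pre-training, it can forecast on unseen datasets directly, without retraining or parameter updates. -* **Unified modeling of multiple variables and covariates**: supports jointly modeling several related time series and their covariates in one architecture to improve results on complex tasks. It places strict requirements on the input: - * The set of names of the future covariates must be a subset of the set of names of the historical covariates; - * The length of each historical covariate must equal the length of the target variable; - * The length of each future covariate must equal the prediction length; -* **Efficient inference and deployment**: the model uses a compact encoder-only architecture, which keeps inference efficient while retaining strong generalization. - -![](/img/timeseries-large-model-chronos2.png) - -## 7. 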
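Results
-
-Time series large models adapt to real-world time series data from many domains and scenarios and perform well across tasks. The following shows actual results on different data:
-
-**Time series forecasting:**
-
-Using the forecasting capability of a time series large model, future trends of a time series can be predicted accurately. In the figure below, the blue curve is the predicted trend and the red curve is the actual trend; the two curves agree closely.
-
-![](/img/LargeModel03.png)
-
-**Data imputation:**
-
-A time series large model fills missing data segments by prediction.
-
-![](/img/timeseries-large-model-data-imputation.png)
-
-**Anomaly detection:**
-
-A time series large model accurately identifies outliers that deviate too far from the normal trend.
-
-![](/img/LargeModel05.png)
-
-## 8. Deployment and Usage
-
-1. Open the IoTDB CLI console and check that the ConfigNode, DataNode, and AINode are all in the Running state.
-
-```Plain
-IoTDB> show cluster
-+------+----------+-------+---------------+------------+--------------+-----------+
-|NodeID|  NodeType| Status|InternalAddress|InternalPort|       Version|  BuildInfo|
-+------+----------+-------+---------------+------------+--------------+-----------+
-|     0|ConfigNode|Running|      127.0.0.1|       10710|       2.0.5.1|    069354f|
-|     1|  DataNode|Running|      127.0.0.1|       10730|       2.0.5.1|    069354f|
-|     2|    AINode|Running|      127.0.0.1|       10810|       2.0.5.1|069354f-dev|
-+------+----------+-------+---------------+------------+--------------+-----------+
-Total line number = 3
-It costs 0.140s
-```
-
-2. In a networked environment, the first startup of an AINode automatically pulls the Timer-XL, Sundial, and Chronos2 models.
-
-   > Note:
-   >
-   > * The AINode installation package does not include model weight files.
-   > * Automatic pulling requires that the deployment environment can access HuggingFace.
-   > * AINode supports manually uploading model weight files; for details, see [Importing Weight Files](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md#_3-3-导入内置权重文件).
-
-3. 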
Check whether the models are available.
-
-```Bash
-IoTDB> show models
-+---------------------+---------+--------+--------+
-|              ModelId|ModelType|Category|   State|
-+---------------------+---------+--------+--------+
-|                arima|   sktime| builtin|  active|
-|          holtwinters|   sktime| builtin|  active|
-|exponential_smoothing|   sktime| builtin|  active|
-|     naive_forecaster|   sktime| builtin|  active|
-|       stl_forecaster|   sktime| builtin|  active|
-|         gaussian_hmm|   sktime| builtin|  active|
-|              gmm_hmm|   sktime| builtin|  active|
-|                stray|   sktime| builtin|  active|
-|             timer_xl|    timer| builtin|  active|
-|              sundial|  sundial| builtin|  active|
-|             chronos2|       t5| builtin|  active|
-+---------------------+---------+--------+--------+
-```
-
-### Appendix
-
-**[1]** Timer: Generative Pre-trained Transformers Are Large Time Series Models, Yong Liu, Haoran Zhang, Chenyu Li, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ Back](#ref1)
-
-**[2]** Timer-XL: Long-Context Transformers for Unified Time Series Forecasting, Yong Liu, Guo Qin, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ Back](#ref2)
-
-**[3]** Sundial: A Family of Highly Capable Time Series Foundation Models, Yong Liu, Guo Qin, Zhiyuan Shi, Zhi Chen, Caiyin Yang, Xiangdong Huang, Jianmin Wang, Mingsheng Long, **ICML 2025 spotlight**. [↩ Back](#ref3)
-
-**[4]** Chronos-2: From Univariate to Universal Forecasting, Abdul Fatir Ansari, Oleksandr Shchur, Jaris Küken, Andreas Auer, Boran Han, Pedro Mercado, Syama Sundar Rangapuram, Huibin Shen, Lorenzo Stella, Xiyuan Zhang, Mononito Goswami, Shubham Kapoor, Danielle C. 
Maddix, Pablo Guerron, Tony Hu, Junming Yin, Nick Erickson, Prateek Mutalik Desai, Hao Wang, Huzefa Rangwala, George Karypis, Yuyang Wang, Michael Bohlke-Schneider, **arXiv:2510.15821.**[↩ 返回](#ref4) diff --git a/src/zh/UserGuide/Master/Tree/API/Programming-Data-Subscription_timecho.md b/src/zh/UserGuide/Master/Tree/API/Programming-Data-Subscription_timecho.md deleted file mode 100644 index 183fdcc80..000000000 --- a/src/zh/UserGuide/Master/Tree/API/Programming-Data-Subscription_timecho.md +++ /dev/null @@ -1,268 +0,0 @@ - - - - -# 数据订阅API - -IoTDB 提供了强大的数据订阅功能,允许用户通过订阅 API 实时获取 IoTDB 新增的数据。详细的功能定义及介绍:[数据订阅](../User-Manual/Data-subscription_timecho) - -## 1. 核心步骤 - -1. 创建Topic:创建一个Topic,Topic中包含希望订阅的测点。 -2. 订阅Topic:在 consumer 订阅 topic 前,topic 必须已经被创建,否则订阅会失败。同一个 consumer group 下的 consumers 会均分数据。 -3. 消费数据:只有显式订阅了某个 topic,才会收到对应 topic 的数据。 -4. 取消订阅: consumer close 时会退出对应的 consumer group,同时取消现存的所有订阅。 - - -## 2. 详细步骤 - -本章节用于说明开发的核心流程,并未演示所有的参数和接口,如需了解全部功能及参数请参见: [全量接口说明](./Programming-Java-Native-API_timecho#_3-全量接口说明) - - -### 2.1 创建maven项目 - -创建一个maven项目,并导入以下依赖(JDK >= 1.8, Maven >= 3.6) - -```xml - - - org.apache.iotdb - iotdb-session - - ${project.version} - - -``` -注意:请勿使用高版本客户端连接低版本服务。 - -### 2.2 代码案例 - -#### 2.2.1 Topic操作 - -```java -import java.util.Optional; -import java.util.Properties; -import java.util.Set; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.rpc.subscription.config.TopicConstant; -import org.apache.iotdb.session.subscription.SubscriptionSession; -import org.apache.iotdb.session.subscription.model.Topic; - -public class DataConsumerExample { - - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - try (SubscriptionSession session = new SubscriptionSession("127.0.0.1", 6667, "root", "TimechoDB@2021", 67108864)) { //V2.0.6.x 之前默认密码为root - // 1. open session - session.open(); - - // 2. 
create a topic of all data - Properties sessionConfig = new Properties(); - sessionConfig.put(TopicConstant.PATH_KEY, "root.**"); - - session.createTopic("allData", sessionConfig); - - // 3. show all topics - Set topics = session.getTopics(); - System.out.println(topics); - - // 4. show a specific topic - Optional allData = session.getTopic("allData"); - System.out.println(allData.get()); - } - } -} -``` - -#### 2.2.2 数据消费 - -##### 场景-1: 订阅IoTDB中新增的实时数据(大屏或组态展示的场景) - -```java -import java.io.IOException; -import java.util.List; -import java.util.Properties; -import org.apache.iotdb.rpc.subscription.config.ConsumerConstant; -import org.apache.iotdb.rpc.subscription.config.TopicConstant; -import org.apache.iotdb.session.subscription.consumer.SubscriptionPullConsumer; -import org.apache.iotdb.session.subscription.payload.SubscriptionMessage; -import org.apache.iotdb.session.subscription.payload.SubscriptionMessageType; -import org.apache.iotdb.session.subscription.payload.SubscriptionSessionDataSet; -import org.apache.tsfile.read.common.RowRecord; - -public class DataConsumerExample { - - public static void main(String[] args) throws IOException { - - // 5. 
create a pull consumer, the subscription is automatically cancelled when the logic in the try resources is completed - Properties consumerConfig = new Properties(); - consumerConfig.put(ConsumerConstant.CONSUMER_ID_KEY, "c1"); - consumerConfig.put(ConsumerConstant.CONSUMER_GROUP_ID_KEY, "cg1"); - consumerConfig.put(ConsumerConstant.USERNAME_KEY, "root"); - consumerConfig.put(ConsumerConstant.PASSWORD_KEY, "TimechoDB@2021"); //V2.0.6.x 之前默认密码为root - try (SubscriptionPullConsumer pullConsumer = new SubscriptionPullConsumer(consumerConfig)) { - pullConsumer.open(); - pullConsumer.subscribe("topic_all"); - while (true) { - List messages = pullConsumer.poll(10000); - for (final SubscriptionMessage message : messages) { - final short messageType = message.getMessageType(); - if (SubscriptionMessageType.isValidatedMessageType(messageType)) { - for (final SubscriptionSessionDataSet dataSet : message.getSessionDataSetsHandler()) { - while (dataSet.hasNext()) { - final RowRecord record = dataSet.next(); - System.out.println(record); - } - } - } - } - } - } - } -} - - -``` - -##### 场景-2:订阅新增的 TsFile(定期数据备份的场景) - -前提:需要被消费的topic的格式为TsfileHandler类型,举例:`create topic topic_all_tsfile with ('path'='root.**','format'='TsFileHandler')` - -```java -import java.io.IOException; -import java.util.List; -import java.util.Properties; -import org.apache.iotdb.rpc.subscription.config.ConsumerConstant; -import org.apache.iotdb.rpc.subscription.config.TopicConstant; -import org.apache.iotdb.session.subscription.consumer.SubscriptionPullConsumer; -import org.apache.iotdb.session.subscription.payload.SubscriptionMessage; - - -public class DataConsumerExample { - - public static void main(String[] args) throws IOException { - // 1. 
create a pull consumer, the subscription is automatically cancelled when the logic in the try resources is completed - Properties consumerConfig = new Properties(); - consumerConfig.put(ConsumerConstant.CONSUMER_ID_KEY, "c1"); - consumerConfig.put(ConsumerConstant.CONSUMER_GROUP_ID_KEY, "cg1"); - consumerConfig.put(ConsumerConstant.USERNAME_KEY, "root"); - consumerConfig.put(ConsumerConstant.PASSWORD_KEY, "TimechoDB@2021");//V2.0.6.x 之前默认密码为root - consumerConfig.put(ConsumerConstant.FILE_SAVE_DIR_KEY, "/Users/iotdb/Downloads"); - try (SubscriptionPullConsumer pullConsumer = new SubscriptionPullConsumer(consumerConfig)) { - pullConsumer.open(); - pullConsumer.subscribe("topic_all_tsfile"); - while (true) { - List messages = pullConsumer.poll(10000); - for (final SubscriptionMessage message : messages) { - message.getTsFileHandler().copyFile("/Users/iotdb/Downloads/1.tsfile"); - } - } - } - } -} -``` - - - - -## 3. 全量接口说明 - -### 3.1 参数列表 - -可通过Properties参数对象设置消费者相关参数,具体参数如下。 - -#### 3.1.1 SubscriptionConsumer - - -| 参数 | 是否必填(默认值) | 参数含义 | -| :---------------------- |:-------------------------------------------------------------------------------------| :----------------------------------------------------------- | -| host | optional: 127.0.0.1 | `String`: IoTDB 中某 DataNode 的 RPC host | -| port | optional: 6667 | `Integer`: IoTDB 中某 DataNode 的 RPC port | -| node-urls | optional: 127.0.0.1:6667 | `List`: IoTDB 中所有 DataNode 的 RPC 地址,可以是多个;host:port 和 node-urls 选填一个即可。当 host:port 和 node-urls 都填写了,则取 host:port 和 node-urls 的**并集**构成新的 node-urls 应用 | -| username | optional: root | `String`: IoTDB 中 DataNode 的用户名 | -| password | optional: TimechoDB@2021 //V2.0.6.x 之前默认密码为root | `String`: IoTDB 中 DataNode 的密码 | -| groupId | optional | `String`: consumer group id,若未指定则随机分配(新的 consumer group),保证不同的 consumer group 对应的 consumer group id 均不相同 | -| consumerId | optional | `String`: consumer client id,若未指定则随机分配,保证同一个 consumer group 中每一个 consumer client id 均不相同 | -| 
heartbeatIntervalMs | optional: 30000 (min: 1000) | `Long`: consumer 向 IoTDB DataNode 定期发送心跳请求的间隔 | -| endpointsSyncIntervalMs | optional: 120000 (min: 5000) | `Long`: consumer 探测 IoTDB 集群节点扩缩容情况调整订阅连接的间隔 | -| fileSaveDir | optional: Paths.get(System.getProperty("user.dir"), "iotdb-subscription").toString() | `String`: consumer 订阅出的 TsFile 文件临时存放的目录路径 | -| fileSaveFsync | optional: false | `Boolean`: consumer 订阅 TsFile 的过程中是否主动调用 fsync | - -`SubscriptionPushConsumer` 中的特殊配置: - -| 参数 | 是否必填(默认值) | 参数含义 | -| :----------------- | :------------------------------------ | :----------------------------------------------------------- | -| ackStrategy | optional: `ACKStrategy.AFTER_CONSUME` | 消费进度的确认机制包含以下选项:`ACKStrategy.BEFORE_CONSUME`(当 consumer 收到数据时立刻提交消费进度,`onReceive` 前)`ACKStrategy.AFTER_CONSUME`(当 consumer 消费完数据再去提交消费进度,`onReceive` 后) | -| consumeListener | optional | 消费数据的回调函数,需实现 `ConsumeListener` 接口,定义消费 `SessionDataSetsHandler` 和 `TsFileHandler` 形式数据的处理逻辑 | -| autoPollIntervalMs | optional: 5000 (min: 500) | Long: consumer 自动拉取数据的时间间隔,单位为**毫秒** | -| autoPollTimeoutMs | optional: 10000 (min: 1000) | Long: consumer 每次拉取数据的超时时间,单位为**毫秒** | - -`SubscriptionPullConsumer` 中的特殊配置: - -| 参数 | 是否必填(默认值) | 参数含义 | -| :----------------- | :------------------------ | :----------------------------------------------------------- | -| autoCommit | optional: true | Boolean: 是否自动提交消费进度如果此参数设置为 false,则需要调用 `commit` 方法来手动提交消费进度 | -| autoCommitInterval | optional: 5000 (min: 500) | Long: 自动提交消费进度的时间间隔,单位为**毫秒**仅当 autoCommit 参数为 true 的时候才会生效 | - - -### 3.2 函数列表 - -#### 3.2.1 数据订阅 - -##### SubscriptionPullConsumer - -| **函数名** | **说明** | **参数** | -| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -| `open()` | 打开消费者连接,启动消息消费。如果 `autoCommit` 启用,会启动自动提交工作器。 | 无 | -| `close()` | 关闭消费者连接。如果 `autoCommit` 启用,会在关闭前提交所有未提交的消息。 | 无 | -| `poll(final Duration 
timeout)` | 拉取消息,指定超时时间。 | `timeout` : 拉取的超时时间。 | -| `poll(final long timeoutMs)` | 拉取消息,指定超时时间(毫秒)。 | `timeoutMs` : 超时时间,单位为毫秒。 | -| `poll(final Set topicNames, final Duration timeout)` | 拉取指定主题的消息,指定超时时间。 | `topicNames` : 要拉取的主题集合。`timeout`: 超时时间。 | -| `poll(final Set topicNames, final long timeoutMs)` | 拉取指定主题的消息,指定超时时间(毫秒)。 | `topicNames` : 要拉取的主题集合。`timeoutMs`: 超时时间,单位为毫秒。 | -| `commitSync(final SubscriptionMessage message)` | 同步提交单条消息。 | `message` : 需要提交的消息对象。 | -| `commitSync(final Iterable messages)` | 同步提交多条消息。 | `messages` : 需要提交的消息集合。 | -| `commitAsync(final SubscriptionMessage message)` | 异步提交单条消息。 | `message` : 需要提交的消息对象。 | -| `commitAsync(final Iterable messages)` | 异步提交多条消息。 | `messages` : 需要提交的消息集合。 | -| `commitAsync(final SubscriptionMessage message, final AsyncCommitCallback callback)` | 异步提交单条消息并指定回调函数。 | `message` : 需要提交的消息对象。`callback` : 异步提交完成后的回调函数。 | -| `commitAsync(final Iterable messages, final AsyncCommitCallback callback)` | 异步提交多条消息并指定回调函数。 | `messages` : 需要提交的消息集合。`callback` : 异步提交完成后的回调函数。 | - -##### SubscriptionPushConsumer - -| **函数名** | **说明** | **参数** | -| -------------------------------------------------------- | ----------------------------------------------------- | ------------------------------------------------------- | -| `open()` | 打开消费者连接,启动消息消费,提交自动轮询工作器。 | 无 | -| `close()` | 关闭消费者连接,停止消息消费。 | 无 | -| `toString()` | 返回消费者对象的核心配置信息。 | 无 | -| `coreReportMessage()` | 获取消费者核心配置的键值对表示形式。 | 无 | -| `allReportMessage()` | 获取消费者所有配置的键值对表示形式。 | 无 | -| `buildPushConsumer()` | 通过 `Builder` 构建 `SubscriptionPushConsumer` 实例。 | 无 | -| `ackStrategy(final AckStrategy ackStrategy)` | 配置消费者的消息确认策略。 | `ackStrategy`: 指定的消息确认策略。 | -| `consumeListener(final ConsumeListener consumeListener)` | 配置消费者的消息消费逻辑。 | `consumeListener`: 消费者接收消息时的处理逻辑。 | -| `autoPollIntervalMs(final long autoPollIntervalMs)` | 配置自动轮询的时间间隔。 | `autoPollIntervalMs` : 自动轮询的间隔时间,单位为毫秒。 | -| `autoPollTimeoutMs(final long autoPollTimeoutMs)` | 配置自动轮询的超时时间。 | `autoPollTimeoutMs`: 
自动轮询的超时时间,单位为毫秒。 | - - - - - - - - - diff --git a/src/zh/UserGuide/Master/Tree/API/Programming-JDBC_timecho.md b/src/zh/UserGuide/Master/Tree/API/Programming-JDBC_timecho.md deleted file mode 100644 index f259d0287..000000000 --- a/src/zh/UserGuide/Master/Tree/API/Programming-JDBC_timecho.md +++ /dev/null @@ -1,295 +0,0 @@ - - -# JDBC - -**注意**: 当前 JDBC 实现仅适用于与第三方工具对接。不建议通过 JDBC 执行插入操作,因其无法提供高性能写入;查询场景推荐使用 JDBC。 - -对于Java应用,我们推荐使用[Java 原生接口](./Programming-Java-Native-API_timecho)* - -## 1. 依赖 - -* JDK >= 1.8 -* Maven >= 3.6 - -## 2. 安装方法 - -在根目录下执行下面的命令: -```shell -mvn clean install -pl iotdb-client/jdbc -am -DskipTests -``` - -### 2.1 在 MAVEN 中使用 IoTDB JDBC - -```xml - - - org.apache.iotdb - iotdb-jdbc - - ${project.version} - - -``` -注意:请勿使用高版本客户端连接低版本服务。 - -### 2.2 示例代码 - -本章提供了如何建立数据库连接、执行 SQL 和显示查询结果的示例。 - -要求您已经在工程中包含了数据库编程所需引入的包和 JDBC class. - -**注意:为了更快地插入,建议使用 executeBatch()** - -```java -import java.sql.*; -import org.apache.iotdb.jdbc.IoTDBSQLException; - -public class JDBCExample { - /** - * Before executing a SQL statement with a Statement object, you need to create a Statement object using the createStatement() method of the Connection object. - * After creating a Statement object, you can use its execute() method to execute a SQL statement - * Finally, remember to close the 'statement' and 'connection' objects by using their close() method - * For statements with query results, we can use the getResultSet() method of the Statement object to get the result set. 
- */ - public static void main(String[] args) throws SQLException { - Connection connection = getConnection(); - if (connection == null) { - System.out.println("get connection defeat"); - return; - } - Statement statement = connection.createStatement(); - //Create database - try { - statement.execute("CREATE DATABASE root.demo"); - }catch (IoTDBSQLException e){ - System.out.println(e.getMessage()); - } - - //SHOW DATABASES - statement.execute("SHOW DATABASES"); - outputResult(statement.getResultSet()); - - //Create time series - //Different data type has different encoding methods. Here use INT32 as an example - try { - statement.execute("CREATE TIMESERIES root.demo.s0 WITH DATATYPE=INT32,ENCODING=RLE;"); - }catch (IoTDBSQLException e){ - System.out.println(e.getMessage()); - } - //Show time series - statement.execute("SHOW TIMESERIES root.demo"); - outputResult(statement.getResultSet()); - //Show devices - statement.execute("SHOW DEVICES"); - outputResult(statement.getResultSet()); - //Count time series - statement.execute("COUNT TIMESERIES root"); - outputResult(statement.getResultSet()); - //Count nodes at the given level - statement.execute("COUNT NODES root LEVEL=3"); - outputResult(statement.getResultSet()); - //Count timeseries group by each node at the given level - statement.execute("COUNT TIMESERIES root GROUP BY LEVEL=3"); - outputResult(statement.getResultSet()); - - - //Execute insert statements in batch - statement.addBatch("insert into root.demo(timestamp,s0) values(1,1);"); - statement.addBatch("insert into root.demo(timestamp,s0) values(2,15);"); - statement.addBatch("insert into root.demo(timestamp,s0) values(2,17);"); - statement.addBatch("insert into root.demo(timestamp,s0) values(4,12);"); - statement.executeBatch(); - statement.clearBatch(); - - //Full query statement - String sql = "select * from root.demo"; - ResultSet resultSet = statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Exact query 
statement - sql = "select s0 from root.demo where time = 4;"; - resultSet= statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Time range query - sql = "select s0 from root.demo where time >= 2 and time < 5;"; - resultSet = statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Aggregate query - sql = "select count(s0) from root.demo;"; - resultSet = statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Delete time series - statement.execute("delete timeseries root.demo.s0"); - - //close connection - statement.close(); - connection.close(); - } - - public static Connection getConnection() { - // JDBC driver name and database URL - String driver = "org.apache.iotdb.jdbc.IoTDBDriver"; - String url = "jdbc:iotdb://127.0.0.1:6667/"; - // set rpc compress mode - // String url = "jdbc:iotdb://127.0.0.1:6667?rpc_compress=true"; - - // Database credentials - String username = "root"; - String password = "TimechoDB@2021"; // V2.0.6.x 之前默认密码是 root - - Connection connection = null; - try { - Class.forName(driver); - connection = DriverManager.getConnection(url, username, password); - } catch (ClassNotFoundException e) { - e.printStackTrace(); - } catch (SQLException e) { - e.printStackTrace(); - } - return connection; - } - - /** - * This is an example of outputting the results in the ResultSet - */ - private static void outputResult(ResultSet resultSet) throws SQLException { - if (resultSet != null) { - System.out.println("--------------------------"); - final ResultSetMetaData metaData = resultSet.getMetaData(); - final int columnCount = metaData.getColumnCount(); - for (int i = 0; i < columnCount; i++) { - System.out.print(metaData.getColumnLabel(i + 1) + " "); - } - System.out.println(); - while (resultSet.next()) { - for (int i = 1; ; i++) { - System.out.print(resultSet.getString(i)); - if (i < columnCount) { - System.out.print(", "); - 
} else { - System.out.println(); - break; - } - } - } - System.out.println("--------------------------\n"); - } - } -} -``` - -可以在 url 中指定 version 参数: -```java -String url = "jdbc:iotdb://127.0.0.1:6667?version=V_1_0"; -``` -version 表示客户端使用的 SQL 语义版本,用于升级 0.13 时兼容 0.12 的 SQL 语义,可能取值有:`V_0_12`、`V_0_13`、`V_1_0`。 - -此外,IoTDB 在 JDBC 中提供了额外的接口,供用户在连接中使用不同的字符集(例如 GB18030)读写数据库。 -IoTDB 默认的字符集为 UTF-8。当用户期望使用 UTF-8 外的字符集时,需要在 JDBC 的连接中,指定 charset 属性。例如: -1. 使用 GB18030 的 charset 创建连接: -```java -DriverManager.getConnection("jdbc:iotdb://127.0.0.1:6667?charset=GB18030", "root", "TimechoDB@2021") -// V2.0.6.x 之前默认密码是 root -``` -2. 调用如下 `IoTDBStatement` 接口执行 SQL 时,可以接受 `byte[]` 编码的 SQL,该 SQL 将按照被指定的 charset 解析成字符串。 -```java -public boolean execute(byte[] sql) throws SQLException; -``` -3. 查询结果输出时,可使用 `ResultSet` 的 `getBytes` 方法得到的 `byte[]`,`byte[]` 的编码使用连接指定的 charset 进行。 -```java -System.out.print(resultSet.getString(i) + " (" + new String(resultSet.getBytes(i), charset) + ")"); -``` -以下是完整示例: -```java -public class JDBCCharsetExample { - - private static final Logger LOGGER = LoggerFactory.getLogger(JDBCCharsetExample.class); - - public static void main(String[] args) throws Exception { - Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); - - try (final Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?charset=GB18030", "root", "TimechoDB@2021"); // V2.0.6.x 之前默认密码是 root - final IoTDBStatement statement = (IoTDBStatement) connection.createStatement()) { - - final String insertSQLWithGB18030 = - "insert into root.测试(timestamp, 维语, 彝语, 繁体, 蒙文, 简体, 标点符号, 藏语) values(1, 'ئۇيغۇر تىلى', 'ꆈꌠꉙ', \"繁體\", 'ᠮᠣᠩᠭᠣᠯ ᠬᠡᠯᠡ', '简体', '——?!', \"བོད་སྐད།\");"; - final byte[] insertSQLWithGB18030Bytes = insertSQLWithGB18030.getBytes("GB18030"); - statement.execute(insertSQLWithGB18030Bytes); - } catch (IoTDBSQLException e) { - LOGGER.error("IoTDB Jdbc example error", e); - } - - outputResult("GB18030"); - outputResult("UTF-8"); - outputResult("UTF-16"); - 
outputResult("GBK"); - outputResult("ISO-8859-1"); - } - - private static void outputResult(String charset) throws SQLException { - System.out.println("[Charset: " + charset + "]"); - try (final Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?charset=" + charset, "root", "TimechoDB@2021"); // V2.0.6.x 之前默认密码是 root - final IoTDBStatement statement = (IoTDBStatement) connection.createStatement()) { - outputResult(statement.executeQuery("select ** from root"), Charset.forName(charset)); - } catch (IoTDBSQLException e) { - LOGGER.error("IoTDB Jdbc example error", e); - } - } - - private static void outputResult(ResultSet resultSet, Charset charset) throws SQLException { - if (resultSet != null) { - System.out.println("--------------------------"); - final ResultSetMetaData metaData = resultSet.getMetaData(); - final int columnCount = metaData.getColumnCount(); - for (int i = 0; i < columnCount; i++) { - System.out.print(metaData.getColumnLabel(i + 1) + " "); - } - System.out.println(); - - while (resultSet.next()) { - for (int i = 1; ; i++) { - System.out.print( - resultSet.getString(i) + " (" + new String(resultSet.getBytes(i), charset) + ")"); - if (i < columnCount) { - System.out.print(", "); - } else { - System.out.println(); - break; - } - } - } - System.out.println("--------------------------\n"); - } - } -} -``` \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/API/Programming-Java-Native-API_timecho.md b/src/zh/UserGuide/Master/Tree/API/Programming-Java-Native-API_timecho.md deleted file mode 100644 index 18a7bb83d..000000000 --- a/src/zh/UserGuide/Master/Tree/API/Programming-Java-Native-API_timecho.md +++ /dev/null @@ -1,625 +0,0 @@ - - - -# Java原生API - -IoTDB 原生 API 中的 Session 是实现与数据库交互的核心接口,它集成了丰富的方法,支持数据写入、查询以及元数据操作等功能。通过实例化 Session,能够建立与 IoTDB 服务器的连接,在该连接所构建的环境中执行各类数据库操作。Session为非线程安全,不能被多线程同时调用。 - -SessionPool 是 Session 的连接池,推荐使用SessionPool编程。在多线程并发的情形下,SessionPool 能够合理地管理和分配连接资源,以提升系统性能与资源利用效率。 - 
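-## 1. Overview of Steps
-
-1. Create a connection pool instance: initialize a SessionPool object to manage multiple Session instances.
-2. Perform operations: obtain Session instances directly from the SessionPool and run database operations, without opening and closing a connection each time.
-3. Close the connection pool: when no further database operations are needed, close the SessionPool to release all related resources.
-
-## 2. Detailed Steps
-
-This section describes the core development workflow and does not cover every parameter and interface. For the full set of features and parameters, see the [Full Interface Reference](./Programming-Java-Native-API_timecho#_3-全量接口说明) or the [source code](https://github.com/apache/iotdb/tree/rc/2.0.1/example/session/src/main/java/org/apache/iotdb).
-
-### 2.1 Create a Maven Project
-
-Create a Maven project and add the following dependency to pom.xml (JDK >= 1.8, Maven >= 3.6):
-
-```xml
-<dependencies>
-    <dependency>
-      <groupId>org.apache.iotdb</groupId>
-      <artifactId>iotdb-session</artifactId>
-
-      <version>${project.version}</version>
-    </dependency>
-</dependencies>
-```
-Note: Do not use a newer client version to connect to an older server version.
-
-### 2.2 Create a Connection Pool Instance
-
-```java
-import java.util.ArrayList;
-import java.util.List;
-import org.apache.iotdb.session.pool.SessionPool;
-
-public class IoTDBSessionPoolExample {
-    private static SessionPool sessionPool;
-
-    public static void main(String[] args) {
-        // Using nodeUrls ensures that when one node goes down, other nodes are automatically connected to retry
-        List<String> nodeUrls = new ArrayList<>();
-        nodeUrls.add("127.0.0.1:6667");
-        nodeUrls.add("127.0.0.1:6668");
-        sessionPool =
-                new SessionPool.Builder()
-                        .nodeUrls(nodeUrls)
-                        .user("root")
-                        .password("TimechoDB@2021") // the default password was root before V2.0.6.x
-                        .maxSize(3)
-                        .build();
-    }
-}
-```
-
-### 2.3 Perform Database Operations
-
-#### 2.3.1 Data Writing
-
-In industrial scenarios, data writes fall into two categories: multi-row writes and multi-row writes for a single device. The write interfaces are introduced below by scenario.
-
-##### Multi-row Write Interface
-
-Description: writes multiple rows in one call; each row holds the values of multiple measurements for one device at one timestamp.
-
-
-Interfaces:
-
-| Interface | Description |
-| ------------------------------------------------------------ | ------------------------------------------ |
-| `insertRecords(List<String> deviceIds, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList)` | Inserts multiple rows; suited to scenarios where different measurements are collected independently |
-
-Code example:
-
-```java
-import java.util.ArrayList;
-import java.util.List;
-import org.apache.iotdb.rpc.IoTDBConnectionException;
-import org.apache.iotdb.rpc.StatementExecutionException;
-import org.apache.iotdb.session.pool.SessionPool;
-import org.apache.tsfile.enums.TSDataType;
-
-public class SessionPoolExample {
-    private static 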
SessionPool sessionPool; - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. execute insert data - insertRecordsExample(); - // 3. close SessionPool - closeSessionPool(); - } - - private static void constructSessionPool() { - // Using nodeUrls ensures that when one node goes down, other nodes are automatically connected to retry - List nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") //V2.0.6.x 之前默认密码为root - .maxSize(3) - .build(); - } - - public static void insertRecordsExample() throws IoTDBConnectionException, StatementExecutionException { - String deviceId = "root.sg1.d1"; - List measurements = new ArrayList<>(); - measurements.add("s1"); - measurements.add("s2"); - measurements.add("s3"); - List deviceIds = new ArrayList<>(); - List> measurementsList = new ArrayList<>(); - List> valuesList = new ArrayList<>(); - List timestamps = new ArrayList<>(); - List> typesList = new ArrayList<>(); - - for (long time = 0; time < 500; time++) { - List values = new ArrayList<>(); - List types = new ArrayList<>(); - values.add(1L); - values.add(2L); - values.add(3L); - types.add(TSDataType.INT64); - types.add(TSDataType.INT64); - types.add(TSDataType.INT64); - - deviceIds.add(deviceId); - measurementsList.add(measurements); - valuesList.add(values); - typesList.add(types); - timestamps.add(time); - if (time != 0 && time % 100 == 0) { - try { - sessionPool.insertRecords(deviceIds, timestamps, measurementsList, typesList, valuesList); - } catch (IoTDBConnectionException | StatementExecutionException e) { - // solve exception - } - deviceIds.clear(); - measurementsList.clear(); - valuesList.clear(); - typesList.clear(); - timestamps.clear(); - } - } - try { - sessionPool.insertRecords(deviceIds, timestamps, 
measurementsList, typesList, valuesList); - } catch (IoTDBConnectionException | StatementExecutionException e) { - // solve exception - } - } - - public static void closeSessionPool(){ - sessionPool.close(); - } -} -``` - -##### 单设备多行数据写入接口 - -接口说明:支持一次写入单个设备的多行数据,每一行对应一个时间戳的多个测点值。 - -接口列表: - -| 接口名称 | 功能描述 | -| ----------------------------- | ---------------------------------------------------- | -| `insertTablet(Tablet tablet)` | 插入单个设备的多行数据,适用于不同测点独立采集的场景 | - -代码案例: - -```java -import java.util.ArrayList; -import java.util.List; -import java.util.Random; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; -import org.apache.tsfile.enums.TSDataType; -import org.apache.tsfile.write.record.Tablet; -import org.apache.tsfile.write.schema.IMeasurementSchema; -import org.apache.tsfile.write.schema.MeasurementSchema; - -public class SessionPoolExample { - private static SessionPool sessionPool; - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. execute insert data - insertTabletExample(); - // 3. 
close SessionPool - closeSessionPool(); - } - - private static void constructSessionPool() { - // Using nodeUrls ensures that when one node goes down, other nodes are automatically connected to retry - List nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - //nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") //V2.0.6.x 之前默认密码为root - .maxSize(3) - .build(); - } - - private static void insertTabletExample() throws IoTDBConnectionException, StatementExecutionException { - /* - * A Tablet example: - * device1 - * time s1, s2, s3 - * 1, 1, 1, 1 - * 2, 2, 2, 2 - * 3, 3, 3, 3 - */ - // The schema of measurements of one device - // only measurementId and data type in MeasurementSchema take effects in Tablet - List schemaList = new ArrayList<>(); - schemaList.add(new MeasurementSchema("s1", TSDataType.INT64)); - schemaList.add(new MeasurementSchema("s2", TSDataType.INT64)); - schemaList.add(new MeasurementSchema("s3", TSDataType.INT64)); - - Tablet tablet = new Tablet("root.sg.d1",schemaList,100); - - // Method 1 to add tablet data - long timestamp = System.currentTimeMillis(); - - Random random = new Random(); - for (long row = 0; row < 100; row++) { - int rowIndex = tablet.getRowSize(); - tablet.addTimestamp(rowIndex, timestamp); - for (int s = 0; s < 3; s++) { - long value = random.nextLong(); - tablet.addValue(schemaList.get(s).getMeasurementName(), rowIndex, value); - } - if (tablet.getRowSize() == tablet.getMaxRowNumber()) { - sessionPool.insertTablet(tablet); - tablet.reset(); - } - timestamp++; - } - if (tablet.getRowSize() != 0) { - sessionPool.insertTablet(tablet); - tablet.reset(); - } - } - - public static void closeSessionPool(){ - sessionPool.close(); - } -} -``` - -#### 2.3.2 SQL操作 - -SQL操作分为查询和非查询两类操作,对应的接口为`executeQuery`和`executeNonQuery`操作,其区别为前者执行的是具体的查询语句,会返回一个结果集,后者是执行的是增、删、改操作,不返回结果集。 - -```java -import java.util.ArrayList; -import 
java.util.List;
-import org.apache.iotdb.isession.SessionDataSet.DataIterator;
-import org.apache.iotdb.isession.pool.SessionDataSetWrapper;
-import org.apache.iotdb.rpc.IoTDBConnectionException;
-import org.apache.iotdb.rpc.StatementExecutionException;
-import org.apache.iotdb.session.pool.SessionPool;
-
-public class SessionPoolExample {
-    private static SessionPool sessionPool;
-    public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException {
-        // 1. init SessionPool
-        constructSessionPool();
-        // 2. executes a query SQL statement and returns the result set.
-        executeQueryExample();
-        // 3. executes a non-query SQL statement, such as a DDL or DML command.
-        executeNonQueryExample();
-        // 4. close SessionPool
-        closeSessionPool();
-    }
-
-    private static void executeNonQueryExample() throws IoTDBConnectionException, StatementExecutionException {
-        // 1. create a nonAligned time series
-        sessionPool.executeNonQueryStatement("create timeseries root.test.d1.s1 with dataType = int32");
-        // 2. set ttl
-        sessionPool.executeNonQueryStatement("set TTL to root.test.** 10000");
-        // 3. delete time series
-        sessionPool.executeNonQueryStatement("delete timeseries root.test.d1.s1");
-    }
-
-    private static void executeQueryExample() throws IoTDBConnectionException, StatementExecutionException {
-        // 1. execute normal query
-        try(SessionDataSetWrapper wrapper = sessionPool.executeQueryStatement("select s1 from root.sg1.d1 limit 10")) {
-            // get DataIterator like JDBC
-            DataIterator dataIterator = wrapper.iterator();
-            System.out.println(wrapper.getColumnNames());
-            System.out.println(wrapper.getColumnTypes());
-            while (dataIterator.next()) {
-                StringBuilder builder = new StringBuilder();
-                for (String columnName : wrapper.getColumnNames()) {
-                    builder.append(dataIterator.getString(columnName) + " ");
-                }
-                System.out.println(builder);
-            }
-        }
-        // 2. 
execute aggregate query - try(SessionDataSetWrapper wrapper = sessionPool.executeQueryStatement("select count(s1) from root.sg1.d1 group by ([0, 40), 5ms) ")) { - // get DataIterator like JDBC - DataIterator dataIterator = wrapper.iterator(); - System.out.println(wrapper.getColumnNames()); - System.out.println(wrapper.getColumnTypes()); - while (dataIterator.next()) { - StringBuilder builder = new StringBuilder(); - for (String columnName : wrapper.getColumnNames()) { - builder.append(dataIterator.getString(columnName) + " "); - } - System.out.println(builder); - } - } - } - - private static void constructSessionPool() { - // Using nodeUrls ensures that when one node goes down, other nodes are automatically connected to retry - List<String> nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") // before V2.0.6.x the default password is root - .maxSize(3) - .build(); - } - - public static void closeSessionPool(){ - sessionPool.close(); - } -} -``` - - -For more on the result set and the methods of `SessionDataSet.DataIterator`, see the following example (the `getBlob` and `getDate` interfaces are supported since V2.0.4): - -```java -import org.apache.iotdb.isession.SessionDataSet; -import org.apache.iotdb.isession.pool.SessionDataSetWrapper; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; - -import org.apache.tsfile.enums.TSDataType; -import org.apache.tsfile.utils.Binary; -import org.apache.tsfile.utils.DateUtils; -import org.apache.tsfile.write.record.Tablet; -import org.apache.tsfile.write.schema.MeasurementSchema; -import org.junit.Assert; - -import java.sql.Timestamp; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.List; - -public class SessionExample { - private static SessionPool sessionPool; - - public static void main(String[] args) - throws IoTDBConnectionException, 
StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. write one row, query it back, and read every column through DataIterator. - executeQueryExample(); - // 3. close SessionPool - closeSessionPool(); - } - - private static void executeQueryExample() - throws IoTDBConnectionException, StatementExecutionException { - Tablet tablet = - new Tablet( - "root.sg.d1", - Arrays.asList( - new MeasurementSchema("s1", TSDataType.INT32), - new MeasurementSchema("s2", TSDataType.INT64), - new MeasurementSchema("s3", TSDataType.FLOAT), - new MeasurementSchema("s4", TSDataType.DOUBLE), - new MeasurementSchema("s5", TSDataType.TEXT), - new MeasurementSchema("s6", TSDataType.BOOLEAN), - new MeasurementSchema("s7", TSDataType.TIMESTAMP), - new MeasurementSchema("s8", TSDataType.BLOB), - new MeasurementSchema("s9", TSDataType.STRING), - new MeasurementSchema("s10", TSDataType.DATE), - new MeasurementSchema("s11", TSDataType.TIMESTAMP)), - 10); - tablet.addTimestamp(0, 0L); - tablet.addValue("s1", 0, 1); - tablet.addValue("s2", 0, 1L); - tablet.addValue("s3", 0, 0f); - tablet.addValue("s4", 0, 0d); - tablet.addValue("s5", 0, "text_value"); - tablet.addValue("s6", 0, true); - tablet.addValue("s7", 0, 1L); - tablet.addValue("s8", 0, new Binary(new byte[] {1})); - tablet.addValue("s9", 0, "string_value"); - tablet.addValue("s10", 0, DateUtils.parseIntToLocalDate(20250403)); - // s11 is left unset and marked null in its bitmap - tablet.initBitMaps(); - tablet.bitMaps[10].mark(0); - tablet.rowSize = 1; - sessionPool.insertAlignedTablet(tablet); - - try (SessionDataSetWrapper dataSet = - sessionPool.executeQueryStatement("select * from root.sg.d1")) { - SessionDataSet.DataIterator iterator = dataSet.iterator(); - int count = 0; - while (iterator.next()) { - count++; - Assert.assertFalse(iterator.isNull("root.sg.d1.s1")); - Assert.assertEquals(1, iterator.getInt("root.sg.d1.s1")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s2")); - Assert.assertEquals(1L, iterator.getLong("root.sg.d1.s2")); - 
Assert.assertFalse(iterator.isNull("root.sg.d1.s3")); - Assert.assertEquals(0, iterator.getFloat("root.sg.d1.s3"), 0.01); - Assert.assertFalse(iterator.isNull("root.sg.d1.s4")); - Assert.assertEquals(0, iterator.getDouble("root.sg.d1.s4"), 0.01); - Assert.assertFalse(iterator.isNull("root.sg.d1.s5")); - Assert.assertEquals("text_value", iterator.getString("root.sg.d1.s5")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s6")); - Assert.assertTrue(iterator.getBoolean("root.sg.d1.s6")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s7")); - Assert.assertEquals(new Timestamp(1), iterator.getTimestamp("root.sg.d1.s7")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s8")); - Assert.assertEquals(new Binary(new byte[] {1}), iterator.getBlob("root.sg.d1.s8")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s9")); - Assert.assertEquals("string_value", iterator.getString("root.sg.d1.s9")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s10")); - Assert.assertEquals( - DateUtils.parseIntToLocalDate(20250403), iterator.getDate("root.sg.d1.s10")); - Assert.assertTrue(iterator.isNull("root.sg.d1.s11")); - Assert.assertNull(iterator.getTimestamp("root.sg.d1.s11")); - - Assert.assertEquals(new Timestamp(0), iterator.getTimestamp("Time")); - Assert.assertFalse(iterator.isNull("Time")); - } - Assert.assertEquals(tablet.rowSize, count); - } - } - - private static void constructSessionPool() { - List nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("root") - .maxSize(3) - .build(); - } - - public static void closeSessionPool() { - sessionPool.close(); - } -} -``` - - -## 3. 
Full API Reference - -### 3.1 Parameter List - -A Session has the following fields, which can be set through the constructor or through Session.Builder: - -| Field | Type | Description | -| -------------------------------- | ----------------------------------- | ---------------------------------------- | -| `nodeUrls` | `List<String>` | List of database node URLs; multiple nodes are supported | -| `username` | `String` | User name | -| `password` | `String` | Password | -| `fetchSize` | `int` | Default batch size for returned query results | -| `useSSL` | `boolean` | Whether to enable SSL | -| `trustStore` | `String` | Trust store path | -| `trustStorePwd` | `String` | Trust store password | -| `queryTimeoutInMs` | `long` | Query timeout in milliseconds. Default -1. A negative value uses the server's default configuration; 0 disables the query timeout. | -| `enableRPCCompression` | `boolean` | Whether to enable RPC compression | -| `connectionTimeoutInMs` | `int` | Connection timeout in milliseconds | -| `zoneId` | `ZoneId` | Time zone of the session | -| `thriftDefaultBufferSize` | `int` | Thrift default buffer size | -| `thriftMaxFrameSize` | `int` | Thrift maximum frame size | -| `defaultEndPoint` | `TEndPoint` | Default database endpoint | -| `defaultSessionConnection` | `SessionConnection` | Default session connection object | -| `isClosed` | `boolean` | Whether the session is closed | -| `enableRedirection` | `boolean` | Whether to enable redirection | -| `enableRecordsAutoConvertTablet` | `boolean` | Whether to automatically convert records into Tablets | -| `deviceIdToEndpoint` | `Map` | Mapping from device ID to database endpoint | -| `endPointToSessionConnection` | `Map` | Mapping from database endpoint to session connection | -| `executorService` | `ScheduledExecutorService` | Thread pool that periodically refreshes the node list | -| `availableNodes` | `INodeSupplier` | Supplier of available nodes | -| `enableQueryRedirection` | `boolean` | Whether to enable query redirection | -| `version` | `Version` | Client version, used for compatibility checks against the server | -| `enableAutoFetch` | `boolean` | Whether to enable auto fetch | -| `maxRetryCount` | `int` | Maximum number of retries | -| `retryIntervalInMs` | `long` | Retry interval in milliseconds | - - - -### 3.2 API List - -#### 3.2.1 Metadata Management - -| Method | Description | Parameters | -| ------------------------------------------------------------ | ------------------------ | ------------------------------------------------------------ | -| `createDatabase(String database)` | Create a database | `database`: database name | -| `deleteDatabase(String database)` | Delete the specified database | `database`: name of the database to delete | -| `deleteDatabases(List<String> databases)` | Delete databases in batch | `databases`: 
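list of names of the databases to delete | -| `createTimeseries(String path, TSDataType dataType, TSEncoding encoding, CompressionType compressor)` | Create a single time series | `path`: time series path, `dataType`: data type, `encoding`: encoding type, `compressor`: compression type | -| `createAlignedTimeseries(...)` | Create aligned time series | Device ID, measurement list, data type list, encoding list, compression type list | -| `createMultiTimeseries(...)` | Create time series in batch | Multiple paths, data types, encodings, compression types, properties, tags, aliases, etc. | -| `deleteTimeseries(String path)` | Delete a time series | `path`: path of the time series to delete | -| `deleteTimeseries(List<String> paths)` | Delete time series in batch | `paths`: list of paths of the time series to delete | -| `setSchemaTemplate(String templateName, String prefixPath)` | Set a schema template | `templateName`: template name, `prefixPath`: path the template is applied to | -| `createSchemaTemplate(Template template)` | Create a schema template | `template`: template object | -| `dropSchemaTemplate(String templateName)` | Drop a schema template | `templateName`: name of the template to drop | -| `addAlignedMeasurementsInTemplate(...)` | Add aligned measurements to a template | Template name, measurement path list, data types, encoding types, compression types | -| `addUnalignedMeasurementsInTemplate(...)` | Add non-aligned measurements to a template | Same as above | -| `deleteNodeInTemplate(String templateName, String path)` | Delete a node in a template | `templateName`: template name, `path`: path to delete | -| `countMeasurementsInTemplate(String name)` | Count the measurements in a template | `name`: template name | -| `isMeasurementInTemplate(String templateName, String path)` | Check whether a measurement exists in a template | `templateName`: template name, `path`: measurement path | -| `isPathExistInTemplate(String templateName, String path)` | Check whether a path exists in a template | Same as above | -| `showMeasurementsInTemplate(String templateName)` | Show the measurements in a template | `templateName`: template name | -| `showMeasurementsInTemplate(String templateName, String pattern)` | Show the measurements in a template that match a pattern | `templateName`: template name, `pattern`: match pattern | -| `showAllTemplates()` | Show all templates | No parameters | -| `showPathsTemplateSetOn(String templateName)` | Show the paths a template is set on | `templateName`: template name | -| `showPathsTemplateUsingOn(String templateName)` | Show the paths that actually use a template | Same as above | -| `unsetSchemaTemplate(String prefixPath, String templateName)` | Unset the template on a path | `prefixPath`: path, `templateName`: template name | - - -#### 3.2.2 Data Writing - -| Method | Description | Parameters | -| ------------------------------------------------------------ | ---------------------------------- | 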
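------------------------------------------------------------ | -| `insertRecord(String deviceId, long time, List<String> measurements, List<TSDataType> types, Object... values)` | Insert a single record | `deviceId`: device ID, `time`: timestamp, `measurements`: measurement list, `types`: data type list, `values`: value list | -| `insertRecord(String deviceId, long time, List<String> measurements, List<String> values)` | Insert a single record | `deviceId`: device ID, `time`: timestamp, `measurements`: measurement list, `values`: value list | -| `insertRecords(List<String> deviceIds, List<Long> times, List<List<String>> measurementsList, List<List<String>> valuesList)` | Insert multiple records | `deviceIds`: device ID list, `times`: timestamp list, `measurementsList`: list of measurement lists, `valuesList`: list of value lists | -| `insertRecords(List<String> deviceIds, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList)` | Insert multiple records | Same as above, plus `typesList`: list of data type lists | -| `insertRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList)` | Insert multiple records of one device | `deviceId`: device ID, `times`: timestamp list, `measurementsList`: list of measurement lists, `typesList`: list of type lists, `valuesList`: list of value lists | -| `insertRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList, boolean haveSorted)` | Insert sorted records of one device | Same as above, plus `haveSorted`: whether the data is already sorted | -| `insertStringRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<String>> valuesList)` | Insert string-format records of one device | `deviceId`: device ID, `times`: timestamp list, `measurementsList`: list of measurement lists, `valuesList`: list of value lists | -| `insertStringRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<String>> valuesList, boolean haveSorted)` | Insert sorted string-format records of one device | Same as above, plus `haveSorted`: whether the data is already sorted | -| `insertAlignedRecord(String deviceId, long time, List<String> measurements, List<TSDataType> types, List<Object> values)` | Insert a single aligned record | `deviceId`: device ID, `time`: timestamp, `measurements`: measurement list, `types`: type list, `values`: value list | -| `insertAlignedRecord(String deviceId, long time, List<String> measurements, List<String> values)` | Insert a single string-format aligned record | `deviceId`: device ID, `time`: timestamp, `measurements`: measurement list, `values`: value list | -| `insertAlignedRecords(List<String> deviceIds, List<Long> times, List<List<String>> measurementsList, List<List<String>> valuesList)` | Insert multiple aligned records | 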
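`deviceIds`: device ID list, `times`: timestamp list, `measurementsList`: list of measurement lists, `valuesList`: list of value lists | -| `insertAlignedRecords(List<String> deviceIds, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList)` | Insert multiple aligned records | Same as above, plus `typesList`: list of data type lists | -| `insertAlignedRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList)` | Insert multiple aligned records of one device | `deviceId`: device ID; the remaining parameters are the same as above | -| `insertAlignedRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList, boolean haveSorted)` | Insert sorted aligned records of one device | Same as above, plus `haveSorted`: whether the data is already sorted | -| `insertAlignedStringRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<String>> valuesList)` | Insert string-format aligned records of one device | `deviceId`: device ID, `times`: timestamp list, `measurementsList`: list of measurement lists, `valuesList`: list of value lists | -| `insertAlignedStringRecordsOfOneDevice(String deviceId, List<Long> times, List<List<String>> measurementsList, List<List<String>> valuesList, boolean haveSorted)` | Insert sorted string-format aligned records of one device | Same as above, plus `haveSorted`: whether the data is already sorted | -| `insertTablet(Tablet tablet)` | Insert a single Tablet | `tablet`: the Tablet to insert | -| `insertTablet(Tablet tablet, boolean sorted)` | Insert a sorted Tablet | Same as above, plus `sorted`: whether the data is already sorted | -| `insertAlignedTablet(Tablet tablet)` | Insert an aligned Tablet | `tablet`: the Tablet to insert | -| `insertAlignedTablet(Tablet tablet, boolean sorted)` | Insert a sorted aligned Tablet | Same as above, plus `sorted`: whether the data is already sorted | -| `insertTablets(Map<String, Tablet> tablets)` | Insert multiple Tablets in batch | `tablets`: mapping from device ID to Tablet | -| `insertTablets(Map<String, Tablet> tablets, boolean sorted)` | Insert multiple sorted Tablets in batch | Same as above, plus `sorted`: whether the data is already sorted | -| `insertAlignedTablets(Map<String, Tablet> tablets)` | Insert multiple aligned Tablets in batch | `tablets`: mapping from device ID to Tablet | -| `insertAlignedTablets(Map<String, Tablet> tablets, boolean sorted)` | Insert multiple sorted aligned Tablets in batch | Same as above, plus `sorted`: whether the data is already sorted | - -#### 3.2.3 Data Deletion - -| Method | Description | Parameters | -| ------------------------------------------------------------ | ---------------------------- | ---------------------------------------- | -| `deleteTimeseries(String path)` | Delete a single time series | `path`: time series path | -| `deleteTimeseries(List<String> paths)` | Delete time series in batch | `paths`: 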
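list of time series paths | -| `deleteData(String path, long endTime)` | Delete historical data of the specified path | `path`: path, `endTime`: end timestamp | -| `deleteData(List<String> paths, long endTime)` | Delete historical data of multiple paths | `paths`: path list, `endTime`: end timestamp | -| `deleteData(List<String> paths, long startTime, long endTime)` | Delete historical data of the paths within a time range | Same as above, plus `startTime`: start timestamp | - - -#### 3.2.4 Data Query - -| Method | Description | Parameters | -| ------------------------------------------------------------ | -------------------------------- | ------------------------------------------------------------ | -| `executeQueryStatement(String sql)` | Execute a query statement | `sql`: query SQL statement | -| `executeQueryStatement(String sql, long timeoutInMs)` | Execute a query statement with a timeout | `sql`: query SQL statement, `timeoutInMs`: query timeout in milliseconds; defaults to the server configuration, which is 60s | -| `executeRawDataQuery(List<String> paths, long startTime, long endTime)` | Query raw data of the specified paths | `paths`: list of query paths, `startTime`: start timestamp, `endTime`: end timestamp | -| `executeRawDataQuery(List<String> paths, long startTime, long endTime, long timeOut)` | Query raw data of the specified paths (with timeout) | Same as above, plus `timeOut`: timeout | -| `executeLastDataQuery(List<String> paths)` | Query the latest data | `paths`: list of query paths | -| `executeLastDataQuery(List<String> paths, long lastTime)` | Query the latest data at the specified time | `paths`: list of query paths, `lastTime`: the specified timestamp | -| `executeLastDataQuery(List<String> paths, long lastTime, long timeOut)` | Query the latest data at the specified time (with timeout) | Same as above, plus `timeOut`: timeout | -| `executeLastDataQueryForOneDevice(String db, String device, List<String> sensors, boolean isLegalPathNodes)` | Query the latest data of a single device | `db`: database name, `device`: device name, `sensors`: sensor list, `isLegalPathNodes`: whether the path nodes are legal | -| `executeAggregationQuery(List<String> paths, List<TAggregationType> aggregations)` | Execute an aggregation query | `paths`: list of query paths, `aggregations`: list of aggregation types | -| `executeAggregationQuery(List<String> paths, List<TAggregationType> aggregations, long startTime, long endTime)` | Execute an aggregation query over a time range | Same as above, plus `startTime`: start timestamp, `endTime`: end timestamp | -| `executeAggregationQuery(List<String> paths, List<TAggregationType> aggregations, long startTime, long endTime, long interval)` | Execute an aggregation query with a time interval | Same as above, plus `interval`: time interval | -| `executeAggregationQuery(List<String> paths, List<TAggregationType> aggregations, long startTime, long endTime, long interval, long slidingStep)` | Execute a sliding-window aggregation query | Same as above, plus 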
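`slidingStep`: sliding step | -| `fetchAllConnections()` | Get information about all active connections | No parameters | - -#### 3.2.5 System Status and Backup - -| Method | Description | Parameters | -| -------------------------- | ---------------------- | -------------------------------------- | -| `getBackupConfiguration()` | Get the backup configuration | No parameters | -| `fetchAllConnections()` | Get information about all active connections | No parameters | -| `getSystemStatus()` | Get the system status | Deprecated; returns `SystemStatus.NORMAL` by default | - - - diff --git a/src/zh/UserGuide/Master/Tree/API/Programming-MQTT_timecho.md b/src/zh/UserGuide/Master/Tree/API/Programming-MQTT_timecho.md deleted file mode 100644 index 9dbd8aae4..000000000 --- a/src/zh/UserGuide/Master/Tree/API/Programming-MQTT_timecho.md +++ /dev/null @@ -1,295 +0,0 @@ - - -# MQTT Protocol - -## 1. Overview - -MQTT is a lightweight messaging protocol designed for the Internet of Things (IoT) and low-bandwidth environments. It is based on the publish/subscribe (Pub/Sub) model and supports efficient, reliable two-way communication between devices. Its core goals are low power consumption, low bandwidth usage, and high real-time performance, which makes it especially suitable for unstable networks or resource-constrained scenarios (such as sensors and mobile devices). - -IoTDB deeply integrates MQTT protocol capabilities and is fully compatible with MQTT v3.1 (OASIS international standard protocol). The IoTDB server has a built-in high-performance MQTT Broker service module, so no third-party middleware is needed, and devices can write time series data directly into the IoTDB storage engine through MQTT messages. - - - -Note: since V2.0.8.2, the TimechoDB installation package no longer includes the MQTT service JAR by default. Before using this service, contact the Timecho team to obtain the JAR and place it under timechodb_home/lib or timechodb_home/ext/external_service. - -## 2. Built-in MQTT Service -The built-in MQTT service makes it possible to connect to IoTDB directly over MQTT. It listens for publish messages from MQTT clients and immediately writes the data to storage. -MQTT topics correspond to IoTDB time series. -The message payload can be formatted into events by a `PayloadFormatter` loaded through Java SPI; the default implementation is `JSONPayloadFormatter`. - The default `json` formatter supports two JSON formats as well as JSON arrays composed of them. The following are example MQTT message payloads: - -```json - { - "device":"root.sg.d1", - "timestamp":1586076045524, - "measurements":["s1","s2"], - "values":[0.530635,0.530635] - } -``` -or -```json - { - "device":"root.sg.d1", - "timestamps":[1586076045524,1586076065526], - "measurements":["s1","s2"], - "values":[[0.530635,0.530635], [0.530655,0.530695]] - } -``` -or a JSON array of either of the two forms above. - - - -## 3. 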
MQTT Configuration -By default, the IoTDB MQTT service loads its configuration from `${IOTDB_HOME}/${IOTDB_CONF}/iotdb-system.properties`. - -The configuration items are as follows: - -| **Name** | **Description** | **Default** | -|---------------------------| --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------- | -| `enable_mqtt_service` | Whether to enable the MQTT service | FALSE | -| `mqtt_host` | Host the MQTT service binds to | 127.0.0.1 | -| `mqtt_port` | Port the MQTT service binds to | 1883 | -| `mqtt_handler_pool_size` | Size of the handler pool that processes MQTT messages | 1 | -| **`mqtt_payload_formatter`** | **MQTT message payload formatter. Options: `json` (tree model only); `line` (table model only).** | **json** | -| `mqtt_max_message_size` | Maximum length of an MQTT message (bytes) | 1048576 | - - -## 4. Example Code -The following is an example of an MQTT client sending messages to the IoTDB server. - - ```java -MQTT mqtt = new MQTT(); -mqtt.setHost("127.0.0.1", 1883); -mqtt.setUserName("root"); -mqtt.setPassword("root"); - -BlockingConnection connection = mqtt.blockingConnection(); -connection.connect(); - -Random random = new Random(); -for (int i = 0; i < 10; i++) { - String payload = String.format("{\n" + - "\"device\":\"root.sg.d1\",\n" + - "\"timestamp\":%d,\n" + - "\"measurements\":[\"s1\"],\n" + - "\"values\":[%f]\n" + - "}", System.currentTimeMillis(), random.nextDouble()); - - connection.publish("root.sg.d1.s1", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -} - -connection.disconnect(); - ``` - - -## 5. 
Customizing the MQTT Message Format - -In production, each device usually ships with its own MQTT client, and the message format of these clients is already fixed. Communicating in the MQTT message format that IoTDB supports would require upgrading every existing client, which is costly. With a small amount of programming, however, the MQTT message format can be customized without modifying the clients. -A simple example can be found in the [example/mqtt-customize](https://github.com/apache/iotdb/tree/rc/2.0.1/example/mqtt-customize) project in the source code. - -Suppose the MQTT client sends messages in the following format: -```json - { - "time":1586076045523, - "deviceID":"car_1", - "deviceType":"油车", - "point":"油量", - "value":10.0 -} -``` -or as a JSON array: -```json -[ - { - "time":1586076045523, - "deviceID":"car_1", - "deviceType":"油车", - "point":"油量", - "value":10.0 - }, - { - "time":1586076045524, - "deviceID":"car_2", - "deviceType":"新能源车", - "point":"速度", - "value":80.0 - } -] -``` - - -The custom MQTT message format can then be set up with the following steps: - -1. Create a Java project and add the following dependency -```xml - <dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-server</artifactId> - <version>2.0.4-SNAPSHOT</version> - </dependency> -``` -2. Create an implementation class that implements the interface `org.apache.iotdb.db.protocol.mqtt.PayloadFormatter` - -```java -package org.apache.iotdb.mqtt.server; - -import org.apache.iotdb.db.protocol.mqtt.Message; -import org.apache.iotdb.db.protocol.mqtt.PayloadFormatter; -import org.apache.iotdb.db.protocol.mqtt.TableMessage; - -import com.google.common.collect.Lists; -import com.google.gson.Gson; -import com.google.gson.GsonBuilder; -import com.google.gson.JsonArray; -import com.google.gson.JsonElement; -import com.google.gson.JsonObject; -import com.google.gson.JsonParseException; -import io.netty.buffer.ByteBuf; -import org.apache.commons.lang3.NotImplementedException; -import org.apache.tsfile.enums.TSDataType; - -import java.nio.charset.StandardCharsets; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.List; - -/** - * The Customized JSON payload formatter. 
one json format supported: { "time":1586076045523, - * "deviceID":"car_1", "deviceType":"新能源车", "point":"速度", "value":80.0 } - */ -public class CustomizedJsonPayloadFormatter implements PayloadFormatter { - private static final String JSON_KEY_TIME = "time"; - private static final String JSON_KEY_DEVICEID = "deviceID"; - private static final String JSON_KEY_DEVICETYPE = "deviceType"; - private static final String JSON_KEY_POINT = "point"; - private static final String JSON_KEY_VALUE = "value"; - private static final Gson GSON = new GsonBuilder().create(); - - @Override - public List<Message> format(String topic, ByteBuf payload) { - if (payload == null) { - return new ArrayList<>(); - } - String txt = payload.toString(StandardCharsets.UTF_8); - JsonElement jsonElement = GSON.fromJson(txt, JsonElement.class); - if (jsonElement.isJsonObject()) { - JsonObject jsonObject = jsonElement.getAsJsonObject(); - return formatTableRow(topic, jsonObject); - } else if (jsonElement.isJsonArray()) { - JsonArray jsonArray = jsonElement.getAsJsonArray(); - List<Message> messages = new ArrayList<>(); - for (JsonElement element : jsonArray) { - JsonObject jsonObject = element.getAsJsonObject(); - messages.addAll(formatTableRow(topic, jsonObject)); - } - return messages; - } - throw new JsonParseException("payload is invalid"); - } - - @Override - @Deprecated - public List<Message> format(ByteBuf payload) { - throw new NotImplementedException(); - } - - private List<Message> formatTableRow(String topic, JsonObject jsonObject) { - TableMessage message = new TableMessage(); - String database = !topic.contains("/") ? 
topic : topic.substring(0, topic.indexOf("/")); - String table = "test_table"; - - // Parsing Database Name - message.setDatabase(database); - - // Parsing Table Name - message.setTable(table); - - // Parsing Tags - List tagKeys = new ArrayList<>(); - tagKeys.add(JSON_KEY_DEVICEID); - List tagValues = new ArrayList<>(); - tagValues.add(jsonObject.get(JSON_KEY_DEVICEID).getAsString()); - message.setTagKeys(tagKeys); - message.setTagValues(tagValues); - - // Parsing Attributes - List attributeKeys = new ArrayList<>(); - List attributeValues = new ArrayList<>(); - attributeKeys.add(JSON_KEY_DEVICETYPE); - attributeValues.add(jsonObject.get(JSON_KEY_DEVICETYPE).getAsString()); - message.setAttributeKeys(attributeKeys); - message.setAttributeValues(attributeValues); - - // Parsing Fields - List fields = Arrays.asList(JSON_KEY_POINT); - List dataTypes = Arrays.asList(TSDataType.FLOAT); - List values = Arrays.asList(jsonObject.get(JSON_KEY_VALUE).getAsFloat()); - message.setFields(fields); - message.setDataTypes(dataTypes); - message.setValues(values); - - // Parsing timestamp - message.setTimestamp(jsonObject.get(JSON_KEY_TIME).getAsLong()); - return Lists.newArrayList(message); - } - - @Override - public String getName() { - // set the value of mqtt_payload_formatter in iotdb-system.properties as the following string: - return "CustomizedJson2Table"; - } - - @Override - public String getType() { - return PayloadFormatter.TABLE_TYPE; - } -} -``` -3. Modify the `src/main/resources/META-INF/services/org.apache.iotdb.db.protocol.mqtt.PayloadFormatter` file in the project: - Clear the content of the file from the example and write the fully qualified name (package name.class name) of the implementation class above into it. Note that the file contains only one line. - In this example, the file content is: `org.apache.iotdb.mqtt.server.CustomizedJsonPayloadFormatter` -4. Build the project into a jar package: `mvn package -DskipTests` - - -On the IoTDB server: -1. Create the ${IOTDB_HOME}/ext/mqtt/ folder and put the jar package into it. -2. Enable the MQTT service (`enable_mqtt_service=true` in `conf/iotdb-system.properties`). -3. 
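Set `mqtt_payload_formatter` in `conf/iotdb-system.properties` to the value returned by the getName() method of the implementation class above; - in this example, `CustomizedJson2Table` -4. Start IoTDB -5. Done - -More: MQTT message payloads are not limited to JSON; any binary content can be used. The raw bytes are available through: -`payload.forEachByte()` or `payload.array()`. - - -## 6. Notes - -To avoid compatibility problems caused by a missing client_id, it is strongly recommended that every MQTT client always explicitly provide a unique, non-empty client_id. -Clients behave differently when client_id is missing or empty. Common examples: -1. An empty string is passed explicitly - • MQTTX: with client_id="", IoTDB discards the message outright; - • mosquitto_pub: with client_id="", IoTDB receives the message normally. -2. client_id is not passed at all - • MQTTX: the message is received by IoTDB normally; - • mosquitto_pub: IoTDB refuses the connection. - Explicitly specifying a unique, non-empty client_id is therefore the simplest way to eliminate these differences and ensure reliable message delivery. \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/API/Programming-ODBC_timecho.md b/src/zh/UserGuide/Master/Tree/API/Programming-ODBC_timecho.md deleted file mode 100644 index ebdfef741..000000000 --- a/src/zh/UserGuide/Master/Tree/API/Programming-ODBC_timecho.md +++ /dev/null @@ -1,1031 +0,0 @@ - - -# ODBC - -## 1. Feature Overview - -The IoTDB ODBC driver makes it possible to interact with the database through the standard ODBC interface and to manage data in the time series database over ODBC connections. It currently supports database connection, data query, data insertion, data modification, and data deletion, and works with applications and toolchains that support the ODBC protocol. - -> Note: this feature is supported since V2.0.8.2. - - -## 2. Usage - -Installing from the precompiled binary package is recommended: no compilation is needed, and scripts handle driver installation and system registration. Only Windows is currently supported. - -### 2.1 Environment Requirements - -Only the operating-system-level ODBC driver manager dependency must be met; no build environment is required: - -| **Operating System** | **Requirements and Installation** | -| -------------------- |------------------------------------------------------------------------------------------------------------------------------------| -| Windows | 1. **Windows 10/11, Server 2016/2019/2022**: ships with an ODBC 17/18 driver manager; no extra installation is needed. 2. **Windows 8.1/Server 2012 R2**: the matching version of the ODBC driver manager must be installed manually | - -### 2.2 Installation Steps - -1. Contact the Timecho team to obtain the precompiled binary package - -Directory structure of the binary package: - -```Plain -├── bin/ -│ ├── apache_iotdb_odbc.dll -│ └── install_driver.exe -├── install.bat -└── registry.bat -``` - -2. Open a command-line tool (CMD/PowerShell) with **administrator privileges** and run the following command (the path can be replaced with any absolute path): - -```Bash -install.bat "C:\Program Files\Apache IoTDB ODBC Driver" -``` - -The script automatically performs the following operations: - -* Creates the installation directory (if it does not exist) -* Copies `bin\apache_iotdb_odbc.dll` to the specified installation directory -* Calls `install_driver.exe` to register the driver with the system through the standard ODBC API (`SQLInstallDriverEx`) - -3. Verify the installation: open the "ODBC Data Source Administrator"; if `Apache IoTDB ODBC Driver` appears on the "Drivers" tab, the registration succeeded. - -![](/img/odbc-1.png) - -### 2.3 Uninstallation Steps - -1. Open a command prompt as administrator and `cd` into the project root directory. -2. Run the uninstall script: - -```Bash -uninstall.bat -``` - -The script calls `install_driver.exe` to unregister the driver from the system through the standard ODBC API (`SQLRemoveDriver`). DLL files in the installation directory are not deleted automatically; delete them manually if cleanup is needed. - -### 2.4 Connection Configuration - -After the driver is installed, a data source (DSN) must be configured before applications can connect to the database by DSN name. The IoTDB ODBC driver supports two ways of configuring connection parameters: data sources and connection strings. - -#### 2.4.1 Configuring a Data Source - -**Configuring through the ODBC Data Source Administrator** - -1. Open the "ODBC Data Source Administrator", switch to the "User DSN" tab, and click "Add". - -![](/img/odbc-2.png) - -2. In the driver list that appears, select "Apache IoTDB ODBC Driver" and click "Finish". - -![](/img/odbc-3.png) - -3. The data source configuration dialog appears. Fill in the connection parameters, then click OK: - -![](/img/odbc-4.png) - -The fields in the dialog have the following meanings: - -| **Area** | **Field** | **Description** | -| ---------------- | ----------------- | ----------------------------------------------------------------------------------------------------------------- | -| Data Source | DSN Name | Data source name; applications refer to the data source by this name | -| Data Source | Description | Data source description (optional) | -| Connection | Server | IoTDB server IP address, default 127.0.0.1 | -| Connection | Port | IoTDB Session API port, default 6667 | -| Connection | User | User name, default root | -| Connection | Password | Password, default root | -| Options | Table Model | When checked, the table model is used; when unchecked, the tree model is used | -| Options | Database | Database name. Available only in table model mode; in tree model mode the field is grayed out and not editable | -| Options | Log Level | Log level (0-4): 0=OFF, 1=ERROR, 2=WARN, 3=INFO, 4=TRACE | -| Options | Session Timeout | Session timeout in milliseconds; 0 means no timeout. Note that the server-side queryTimeoutThreshold defaults to 60000ms; exceeding it requires changing the server configuration | -| Options | Batch Size | Number of rows fetched per request, default 1000. A value of 0 is reset to the default | - -4. 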
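After filling in the parameters, click "Test Connection" to test the connection. The test uses the parameters currently entered to connect to the IoTDB server and runs a `SHOW VERSION` query. On success the server version is displayed; on failure the specific error is displayed. -5. After confirming the parameters, click "OK" to save. The data source appears in the "User DSN" list, such as the data source named 123 in the figure below. - -![](/img/odbc-5.png) - -To modify an existing data source, select it in the list and click "Configure" to edit it again. - -#### 2.4.2 Connection String - -A connection string consists of **semicolon-separated key-value pairs**, for example: - -```Bash -Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;isTableModel=false;loglevel=2 -``` - -The fields are described in the following table: - -| **Field** | **Description** | **Allowed Values** | **Default** | -| --------------------------- | ---------------------------------- |------------------------------------------------------------------------------------------------------------------------------| --------------------------------- | -| DSN | Data source name | Custom data source name | - | -| uid | Database user name | Any string | root | -| pwd | Database password | Any string | root | -| server | IoTDB server address | IP address | 127.0.0.1 | -| port | IoTDB server port | Port number | 6667 | -| database | Database name (effective only in table model mode) | Any string | Empty string| -| loglevel | Log level | Integer (0-4) | 4(LOG\_LEVEL\_TRACE) | -| isTableModel / tablemodel | Whether to enable table model mode | Boolean, accepted in several forms: 1. 0, false, no, off: set to false; 2. 1, true, yes, on: set to true; 3. any other value is treated as true. | true| -| sessiontimeoutms | Session timeout (milliseconds) | 64-bit integer, default `LLONG_MAX`; a value of `0` is replaced with `LLONG_MAX`. Note that the server has its own timeout setting, `private long queryTimeoutThreshold = 60000;`, which must be changed to obtain a timeout longer than 60 seconds. | LLONG\_MAX| -| batchsize | Batch size of each data fetch | 64-bit integer, default `1000`; a value of `0` is replaced with `1000` | 1000| - -Notes: - -* Field names are case-insensitive (they are converted to lowercase for comparison) -* A connection string consists of semicolon-separated key-value pairs, for example: `Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;isTableModel=false;loglevel=2` -* Boolean fields (isTableModel) accept several forms -* All fields are optional; the default value is used when a field is not specified -* Unsupported fields are ignored and a warning is written to the log; the connection is not affected -* The default server port 6667 is the default port of IoTDB's C++ Session interface. This ODBC driver exchanges data with IoTDB through the C++ Session interface. If the IoTDB server uses a non-default port for the C++ Session interface, change the port in the ODBC connection string accordingly. - -#### 2.4.3 Relationship Between Data Source Configuration and Connection Strings - -Configuration saved in the ODBC Data Source Administrator is written as key-value pairs into the system's ODBC data source configuration (on Windows, the registry key `HKEY_CURRENT_USER\SOFTWARE\ODBC\ODBC.INI`). When an application uses `SQLConnect` or specifies `DSN=data_source_name` in a connection string, the driver reads these parameters from the system configuration. - -**Parameters in the connection string take precedence over the configuration saved in the DSN.** The rules are as follows: - -1. If the connection string contains `DSN=xxx` and does not contain `DRIVER=...`, the driver first loads all parameters of that DSN from the system configuration as base values. -2. Parameters specified explicitly in the connection string then override the parameters of the same name in the DSN. -3. If the connection string contains `DRIVER=...`, no DSN parameters are read from the system configuration, and the connection string alone is used. - -For example: if the DSN is configured with `Server=192.168.1.100` and `Port=6667`, but the connection string is `DSN=MyDSN;Server=127.0.0.1`, the actual connection uses `Server=127.0.0.1` (overridden by the connection string) and `Port=6667` (from the DSN). - -### 2.5 Logging - -Log output at driver runtime falls into two categories, the driver's own log and the ODBC manager trace log. Note the performance impact of the log level. - -#### 2.5.1 Driver Log - -* Output location: `apache_iotdb_odbc.log` in the user's home directory; -* Log level: configured through `loglevel` in the connection string (0-4; the higher the level, the more detailed the output); -* Performance impact: a high log level significantly reduces driver performance, so use it only when debugging. - -#### 2.5.2 ODBC Manager Trace Log - -* How to enable: open the "ODBC Data Source Administrator" → "Tracing" → "Start Tracing Now"; -* Note: enabling tracing greatly reduces driver performance; use it only for troubleshooting. - -## 3. 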
接口支持 - -### 3.1 方法列表 - -驱动对 ODBC 标准 API 的支持情况如下: - -| ODBC/Setup API | 函数功能 | 参数列表 | 参数说明 | -| ------------------- | ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| SQLAllocHandle| 分配ODBC句柄 | (SQLSMALLINT HandleType, SQLHANDLE InputHandle, SQLHANDLE \*OutputHandle) | HandleType: 要分配的句柄类型(ENV/DBC/STMT/DESC);
InputHandle: parent context handle;
OutputHandle: pointer to the newly allocated handle | -| SQLBindCol | Bind a column to a result buffer | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN \*StrLen\_or\_Ind) | StatementHandle: statement handle;
ColumnNumber: column number;
TargetType: C data type;
TargetValue: data buffer;
BufferLength: buffer length;
StrLen\_or\_Ind: returned data length or NULL indicator | -| SQLColAttribute | Get column attribute information | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLUSMALLINT FieldIdentifier, SQLPOINTER CharacterAttribute, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength, SQLLEN \*NumericAttribute) | StatementHandle: statement handle;
ColumnNumber: column number;
FieldIdentifier: attribute ID;
CharacterAttribute: character attribute output;
BufferLength: buffer length;
StringLength: returned length;
NumericAttribute: numeric attribute output | -| SQLColumns | Query table column information | (SQLHSTMT StatementHandle, SQLCHAR \*CatalogName, SQLSMALLINT NameLength1, SQLCHAR \*SchemaName, SQLSMALLINT NameLength2, SQLCHAR \*TableName, SQLSMALLINT NameLength3, SQLCHAR \*ColumnName, SQLSMALLINT NameLength4) | StatementHandle: statement handle;
Catalog/Schema/Table/ColumnName: names of the objects to query;
NameLength\*: length of the corresponding name | -| SQLConnect | Establish a database connection | (SQLHDBC ConnectionHandle, SQLCHAR \*ServerName, SQLSMALLINT NameLength1, SQLCHAR \*UserName, SQLSMALLINT NameLength2, SQLCHAR \*Authentication, SQLSMALLINT NameLength3) | ConnectionHandle: connection handle;
ServerName: data source name;
UserName: user name;
Authentication: password; NameLength\*: string lengths | -| SQLDescribeCol | Describe a column in the result set | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLCHAR \*ColumnName, SQLSMALLINT BufferLength, SQLSMALLINT \*NameLength, SQLSMALLINT \*DataType, SQLULEN \*ColumnSize, SQLSMALLINT \*DecimalDigits, SQLSMALLINT \*Nullable) | StatementHandle: statement handle;
ColumnNumber: column number;
ColumnName: column name output;
BufferLength: buffer length;
NameLength: returned column name length;
DataType: SQL type;
ColumnSize: column size;
DecimalDigits: decimal digits;
Nullable: whether NULL is allowed | -| SQLDisconnect | Close the database connection | (SQLHDBC ConnectionHandle) | ConnectionHandle: connection handle | -| SQLDriverConnect | Connect using a connection string | (SQLHDBC ConnectionHandle, SQLHWND WindowHandle, SQLCHAR \*InConnectionString, SQLSMALLINT StringLength1, SQLCHAR \*OutConnectionString, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength2, SQLUSMALLINT DriverCompletion) | ConnectionHandle: connection handle;
WindowHandle: window handle;
InConnectionString: input connection string;
StringLength1: input length;
OutConnectionString: output connection string;
BufferLength: output buffer length;
StringLength2: returned length;
DriverCompletion: connection prompt mode | -| SQLEndTran | Commit or roll back a transaction | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT CompletionType) | HandleType: handle type;
Handle: connection or environment handle;
CompletionType: commit or rollback | -| SQLExecDirect | Execute an SQL statement directly | (SQLHSTMT StatementHandle, SQLCHAR \*StatementText, SQLINTEGER TextLength) | StatementHandle: statement handle;
StatementText: SQL text;
TextLength: SQL text length | -| SQLFetch | Fetch the next row of the result set | (SQLHSTMT StatementHandle) | StatementHandle: statement handle | -| SQLFreeHandle | Free an ODBC handle | (SQLSMALLINT HandleType, SQLHANDLE Handle) | HandleType: handle type;
Handle: handle to free | -| SQLFreeStmt | Free statement-related resources | (SQLHSTMT StatementHandle, SQLUSMALLINT Option) | StatementHandle: statement handle;
Option: release option (close cursor, reset parameters, etc.) | -| SQLGetConnectAttr | Get a connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER \*StringLength) | ConnectionHandle: connection handle;
Attribute: attribute ID;
Value: returned attribute value;
BufferLength: buffer length;
StringLength: returned length | -| SQLGetData | Get result data | (SQLHSTMT StatementHandle, SQLUSMALLINT Col\_or\_Param\_Num, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN \*StrLen\_or\_Ind) | StatementHandle: statement handle;
Col\_or\_Param\_Num: column number;
TargetType: C type;
TargetValue: data buffer;
BufferLength: buffer size;
StrLen\_or\_Ind: returned length or NULL indicator | -| SQLGetDiagField | Get a diagnostic field | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLSMALLINT DiagIdentifier, SQLPOINTER DiagInfo, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength) | HandleType: handle type;
Handle: handle;
RecNumber: record number;
DiagIdentifier: diagnostic field ID;
DiagInfo: output information;
BufferLength: buffer length;
StringLength: returned length | -| SQLGetDiagRec | Get a diagnostic record | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLCHAR \*Sqlstate, SQLINTEGER \*NativeError, SQLCHAR \*MessageText, SQLSMALLINT BufferLength, SQLSMALLINT \*TextLength) | HandleType: handle type;
Handle: handle;
RecNumber: record number;
Sqlstate: SQLSTATE code;
NativeError: native error code;
MessageText: error message;
BufferLength: buffer length;
TextLength: returned length | -| SQLGetInfo | Get database information | (SQLHDBC ConnectionHandle, SQLUSMALLINT InfoType, SQLPOINTER InfoValue, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength) | ConnectionHandle: connection handle;

InfoType: information type;
InfoValue: returned value;
BufferLength: buffer length;
StringLength: returned length | -| SQLGetStmtAttr | Get a statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER \*StringLength) | StatementHandle: statement handle;
Attribute: attribute ID;
Value: returned value;
BufferLength: buffer length;
StringLength: returned length | -| SQLGetTypeInfo | Get data type information | (SQLHSTMT StatementHandle, SQLSMALLINT DataType) | StatementHandle: statement handle;
DataType: SQL data type | -| SQLMoreResults | Get more result sets | (SQLHSTMT StatementHandle) | StatementHandle: statement handle | -| SQLNumResultCols | Get the number of result set columns | (SQLHSTMT StatementHandle, SQLSMALLINT \*ColumnCount) | StatementHandle: statement handle;
ColumnCount: returned column count | -| SQLRowCount | Get the number of affected rows | (SQLHSTMT StatementHandle, SQLLEN \*RowCount) | StatementHandle: statement handle;
RowCount: returned number of affected rows | -| SQLSetConnectAttr | Set a connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | ConnectionHandle: connection handle;
Attribute: attribute ID;
Value: attribute value;
StringLength: attribute value length | -| SQLSetEnvAttr | Set an environment attribute | (SQLHENV EnvironmentHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | EnvironmentHandle: environment handle;
Attribute: attribute ID;
Value: attribute value;
StringLength: length | -| SQLSetStmtAttr | Set a statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | StatementHandle: statement handle;
Attribute: attribute ID;
Value: attribute value;
StringLength: length | -| SQLTables | Query table information | (SQLHSTMT StatementHandle, SQLCHAR \*CatalogName, SQLSMALLINT NameLength1, SQLCHAR \*SchemaName, SQLSMALLINT NameLength2, SQLCHAR \*TableName, SQLSMALLINT NameLength3, SQLCHAR \*TableType, SQLSMALLINT NameLength4) | StatementHandle: statement handle;
Catalog/Schema/TableName: catalog, schema, and table names;
TableType: table type;
NameLength\*: corresponding lengths | - -### 3.2 Data Type Conversion - -IoTDB data types map to ODBC standard data types as follows: - -| **IoTDB Data Type** | **ODBC Data Type** | -| -------------------------- | ------------------------- | -| BOOLEAN | SQL\_BIT | -| INT32 | SQL\_INTEGER | -| INT64 | SQL\_BIGINT | -| FLOAT | SQL\_REAL | -| DOUBLE | SQL\_DOUBLE | -| TEXT | SQL\_VARCHAR | -| STRING | SQL\_VARCHAR | -| BLOB | SQL\_LONGVARBINARY | -| TIMESTAMP | SQL\_BIGINT | -| DATE | SQL\_DATE | - -## 4. Operation Examples - -This chapter presents operation examples covering all data types for **C#**, **Python**, **C++**, **PowerBI**, and **Excel**, including core operations such as data query, insertion, and deletion. - -### 4.1 C# Example - -```C# -/******* -Note: When the output contains Chinese characters, it may cause garbled text. -This is because the table.Write() function cannot output strings in UTF-8 encoding -and can only output using GB2312 (or another system default encoding). This issue -may not occur in software like Power BI; it also does not occur when using the -Console.WriteLine function. This is an issue with the ConsoleTables package. -*****/ -using System.Data.Common; -using System.Data.Odbc; -using System.Reflection.PortableExecutable; -using ConsoleTables; -using System; - -/// Run a SELECT query and print the results from root.full.fulldevice as a table -void Query(OdbcConnection dbConnection) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - dbCommand.CommandText = "SELECT * FROM root.full.fulldevice WHERE time >= 1735689600000 AND time <= 1735690790000"; - using (OdbcDataReader dbReader = dbCommand.ExecuteReader()) - { - var fCount = dbReader.FieldCount; - Console.WriteLine($"fCount = {fCount}"); - // Output the table header - var columns = new string[fCount]; - for (var i = 0; i < fCount; i++) - { - var fName = dbReader.GetName(i); - if (fName.Contains('.')) - { - fName = fName.Substring(fName.LastIndexOf('.') + 1); - } - columns[i] = fName; - } - // Output the rows - var table = new ConsoleTable(columns); - while (dbReader.Read()) - { - var row = new object[fCount]; - for (var i = 0; i < fCount; i++) - { - if (dbReader.IsDBNull(i)) - { - row[i] = null; - continue; - 
} - row[i] = dbReader.GetValue(i); - } - table.AddRow(row); - } - table.Write(); - Console.WriteLine(); - } - } - } - catch (Exception ex) - { - Console.WriteLine(ex.ToString()); - } -} - -/// Execute a non-query SQL statement (such as INSERT; in the tree model, INSERT creates the schema automatically) -void Execute(OdbcConnection dbConnection, string command) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - try - { - dbCommand.CommandText = command; - Console.WriteLine($"Execute command: {command}"); - dbCommand.ExecuteNonQuery(); - } - catch (Exception ex) - { - Console.WriteLine($"CommandText error: {ex.Message}"); - } - } - } - catch (OdbcException ex) - { - Console.WriteLine($"Database error: {ex.Message}"); - } - catch (Exception ex) - { - Console.WriteLine($"Unknown error: {ex.Message}"); - } -} - -var dsn = "Apache IoTDB DSN"; -var user = "root"; -var password = "root"; -var server = "127.0.0.1"; -var connectionString = $"DSN={dsn};Server={server};UID={user};PWD={password};loglevel=4;istablemodel=0"; - -using (OdbcConnection dbConnection = new OdbcConnection(connectionString)) -{ - Console.WriteLine($"Start"); - try - { - dbConnection.Open(); - } - catch (Exception ex) - { - Console.WriteLine($"Login failed: {ex.Message}"); - Console.WriteLine($"Stack Trace: {ex.StackTrace}"); - dbConnection.Dispose(); - return; - } - string[] insertStatements = new string[] - { - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, '设备运行状态正常', '设备A-机房1', 1735689600000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, '设备运行状态正常', '设备A-机房1', 1735689660000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, 
timestamp_col, date_col) VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, '设备运行状态正常', '设备A-机房1', 1735689720000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, '设备温度偏高告警', '设备A-机房1', 1735689780000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, '设备状态恢复正常', '设备A-机房1', 1735689840000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, '设备运行状态正常', '设备B-机房2', 1735689900000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, '设备运行状态正常', '设备B-机房2', 1735689960000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, '设备湿度偏低告警', '设备B-机房2', 1735690020000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, '设备状态恢复正常', '设备B-机房2', 1735690080000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, '设备运行状态正常', '设备C-机房3', 1735690140000, '2026-01-04')", 
- "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, '设备运行状态正常', '设备C-机房3', 1735690200000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, '设备电压不稳告警', '设备C-机房3', 1735690260000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, '设备状态恢复正常', '设备C-机房3', 1735690320000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, '设备运行状态正常', '设备D-机房4', 1735690380000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, '设备运行状态正常', '设备D-机房4', 1735690440000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, '设备运行状态正常', '设备D-机房4', 1735690500000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, '设备信号中断告警', '设备D-机房4', 1735690560000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, 
date_col) VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, '设备运行状态正常', '设备E-机房5', 1735690620000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, '设备运行状态正常', '设备E-机房5', 1735690680000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', 1735690740000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', 1735690740000, '2026-01-04')" - }; - foreach (var insert in insertStatements) - { - Execute(dbConnection, insert); - } - Console.WriteLine($"[DEBUG] Inserted {insertStatements.Length} rows. Begin to query."); - - Query(dbConnection); // Run the query and print the results -} -``` - -### 4.2 Python Example - -1. Accessing ODBC from Python requires the pyodbc package: - -```Plain -pip install pyodbc -``` - -2. 
Complete code - -```Python -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -""" -Apache IoTDB ODBC Python example - Tree Model -Connects to the IoTDB ODBC driver with pyodbc and selects the tree model via istablemodel=0. -Functionality follows examples/BasicTest/TreeTest/TreeTest.cs and examples/cpp-example/TreeTest.cpp. -""" - -import pyodbc - -def execute(conn: pyodbc.Connection, command: str) -> None: - """Execute a non-query SQL statement (such as INSERT; in the tree model, INSERT creates the schema automatically)""" - try: - with conn.cursor() as cursor: - cursor.execute(command) - cmd_upper = command.strip().upper() - if cmd_upper.startswith(("INSERT", "UPDATE", "DELETE")): - conn.commit() - print(f"Execute command: {command}") - except pyodbc.Error as ex: - print(f"CommandText error: {ex}") - -def query(conn: pyodbc.Connection, sql: str) -> None: - """Run a SELECT query and print the results from root.full.fulldevice as a table""" - try: - with conn.cursor() as cursor: - cursor.execute(sql) - # cursor.description is None when the statement returns no result set - col_count = len(cursor.description or []) - print(f"fCount = {col_count}") - - if col_count <= 0: - return - - columns = [] - for i in range(col_count): - col_name = cursor.description[i][0] or f"Column{i}" - if "." 
in str(col_name): - col_name = str(col_name).split(".")[-1] - columns.append(str(col_name)) - - rows = cursor.fetchall() - - col_widths = [max(len(str(col)), 4) for col in columns] - for row in rows: - for j, val in enumerate(row): - if j < len(col_widths): - col_widths[j] = max(col_widths[j], len(str(val) if val is not None else "NULL")) - - header = " | ".join(str(c).ljust(col_widths[i]) for i, c in enumerate(columns)) - print(header) - print("-" * len(header)) - - for row in rows: - values = [] - for i, val in enumerate(row): - if val is None: - cell = "NULL" - else: - cell = str(val) - values.append(cell.ljust(col_widths[i]) if i < len(col_widths) else cell) - print(" | ".join(values)) - - print() - - except pyodbc.Error as ex: - print(f"Query error: {ex}") - -def main() -> None: - dsn = "Apache IoTDB DSN" - user = "root" - password = "root" - server = "127.0.0.1" - connection_string = ( - f"DSN={dsn};Server={server};UID={user};PWD={password};" - f"loglevel=4;istablemodel=0" - ) - - print("Start") - - try: - conn = pyodbc.connect(connection_string) - except pyodbc.Error as ex: - print(f"Login failed: {ex}") - return - - try: - driver_name = conn.getinfo(6) # SQL_DRIVER_NAME - print(f"Successfully opened connection. 
driver = {driver_name}") - except Exception: - print("Successfully opened connection.") - - try: - insert_statements = [ - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689600001, true, 100, 10000000000, 36.5, 128.689, '设备运行状态正常', '设备A-机房1', 1735689600000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, '设备运行状态正常', '设备A-机房1', 1735689660000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, '设备运行状态正常', '设备A-机房1', 1735689720000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, '设备温度偏高告警', '设备A-机房1', 1735689780000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, '设备状态恢复正常', '设备A-机房1', 1735689840000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, '设备运行状态正常', '设备B-机房2', 1735689900000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, '设备运行状态正常', '设备B-机房2', 1735689960000, '2026-01-04')", - "INSERT INTO 
root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, '设备湿度偏低告警', '设备B-机房2', 1735690020000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, '设备状态恢复正常', '设备B-机房2', 1735690080000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, '设备运行状态正常', '设备C-机房3', 1735690140000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, '设备运行状态正常', '设备C-机房3', 1735690200000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, '设备电压不稳告警', '设备C-机房3', 1735690260000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, '设备状态恢复正常', '设备C-机房3', 1735690320000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, '设备运行状态正常', '设备D-机房4', 1735690380000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES 
(1735690440000, true, 114, 10000000014, 37.9, 130.089, '设备运行状态正常', '设备D-机房4', 1735690440000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, '设备运行状态正常', '设备D-机房4', 1735690500000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, '设备信号中断告警', '设备D-机房4', 1735690560000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, '设备运行状态正常', '设备E-机房5', 1735690620000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, '设备运行状态正常', '设备E-机房5', 1735690680000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', 1735690740000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', 1735690740000, '2026-01-04')", - ] - for insert_sql in insert_statements: - execute(conn, insert_sql) - print(f"[DEBUG] Inserted {len(insert_statements)} rows. 
Begin to query.") - - query_sql = "SELECT * FROM root.full.fulldevice WHERE time >= 1735689600000 AND time <= 1735690790000" - query(conn, query_sql) - print("Query ok") - finally: - conn.close() - -if __name__ == "__main__": - main() -``` - -### 4.3 C++ Example - -```C++ -#define WIN32_LEAN_AND_MEAN -#include - -#include -#include -#include -#include -#include -#include -#include - -#ifndef SQL_DIAG_COLUMN_SIZE -#define SQL_DIAG_COLUMN_SIZE 33L -#endif - -void CheckOdbcError(SQLRETURN retCode, SQLSMALLINT handleType, SQLHANDLE handle, const char* functionName) { - if (retCode == SQL_SUCCESS || retCode == SQL_SUCCESS_WITH_INFO) { - return; - } - - SQLCHAR sqlState[6]; - SQLCHAR message[SQL_MAX_MESSAGE_LENGTH]; - SQLINTEGER nativeError; - SQLSMALLINT textLength; - SQLRETURN errRet; - errRet = SQLGetDiagRec(handleType, handle, 1, sqlState, &nativeError, message, sizeof(message), &textLength); - - std::cerr << "ODBC Error in " << functionName << ":\n"; - std::cerr << " SQL State: " << sqlState << "\n"; - std::cerr << " Native Error: " << nativeError << "\n"; - std::cerr << " Message: " << message << "\n"; - std::cerr << " SQLGetDiagRec Return: " << errRet << "\n"; - - if (retCode == SQL_ERROR || retCode == SQL_INVALID_HANDLE) { - exit(1); - } -} - -void PrintSimpleTable(const std::vector<std::string>& headers, - const std::vector<std::vector<std::string>>& rows) { - for (size_t i = 0; i < headers.size(); i++) { - std::cout << headers[i]; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - for (size_t i = 0; i < headers.size(); i++) { - std::cout << "----------------"; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - for (const auto& row : rows) { - for (size_t i = 0; i < row.size(); i++) { - std::cout << row[i]; - if (i < row.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - } - std::cout << std::endl; -} - -/// Run a SELECT query and print the results from root.full.fulldevice as a table -void Query(SQLHDBC hDbc) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN 
ret = SQL_SUCCESS; - - try { - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - return; - } - - const std::string sqlQuery = "SELECT * FROM root.full.fulldevice WHERE time >= 1735689600000 AND time <= 1735690790000"; - std::cout << "Execute query: " << sqlQuery << std::endl; - - ret = SQLExecDirect(hStmt, reinterpret_cast<SQLCHAR*>(const_cast<char*>(sqlQuery.c_str())), SQL_NTS); - if (!SQL_SUCCEEDED(ret)) { - if (ret != SQL_NO_DATA) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect(SELECT)"); - } - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - SQLSMALLINT colCount = 0; - ret = SQLNumResultCols(hStmt, &colCount); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLNumResultCols"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::cout << "Column count = " << colCount << std::endl; - - if (colCount <= 0) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::vector<std::string> columnNames; - std::vector<SQLSMALLINT> columnTypes(colCount); - std::vector<SQLULEN> columnSizes(colCount); - std::vector<SQLSMALLINT> decimalDigits(colCount); - std::vector<SQLSMALLINT> nullable(colCount); - - // Get basic column information - for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLSMALLINT nameLength = 0; - ret = SQLDescribeCol(hStmt, i, NULL, 0, &nameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get length)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::vector<SQLCHAR> colNameBuffer(nameLength + 1); - SQLSMALLINT actualNameLength = 0; - - ret = SQLDescribeCol(hStmt, i, colNameBuffer.data(), nameLength + 1, - &actualNameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get name)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::string fullName(reinterpret_cast<char*>(colNameBuffer.data())); - - 
size_t pos = fullName.find_last_of('.'); - if (pos != std::string::npos) { - columnNames.push_back(fullName.substr(pos + 1)); - } else { - columnNames.push_back(fullName); - } - - ret = SQLDescribeCol(hStmt, i, NULL, 0, NULL, &columnTypes[i-1], - &columnSizes[i-1], &decimalDigits[i-1], &nullable[i-1]); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get type info)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - } - - std::vector<std::vector<std::string>> tableRows; - - int rowCount = 0; - // Get data from every row - while (true) { - ret = SQLFetch(hStmt); - if (ret == SQL_NO_DATA) { - break; - } - - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLFetch"); - break; - } - - std::vector<std::string> row; - - for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLLEN indicator = 0; - std::string valueStr; - - SQLSMALLINT cType; - size_t bufferSize; - bool isCharacterType = false; - const int maxBufferSize = 32768; - - switch (columnTypes[i-1]) { - case SQL_CHAR: - case SQL_VARCHAR: - case SQL_LONGVARCHAR: - case SQL_WCHAR: - case SQL_WVARCHAR: - case SQL_WLONGVARCHAR: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast<int>(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_DECIMAL: - case SQL_NUMERIC: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast<int>(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_INTEGER: - case SQL_SMALLINT: - case SQL_TINYINT: - case SQL_BIGINT: - cType = SQL_C_SBIGINT; - bufferSize = sizeof(SQLBIGINT); - break; - - case SQL_REAL: - case SQL_FLOAT: - case SQL_DOUBLE: - cType = SQL_C_DOUBLE; - bufferSize = sizeof(double); - break; - - case SQL_BIT: - cType = SQL_C_BIT; - bufferSize = sizeof(SQLCHAR); - break; - - case SQL_DATE: - case SQL_TYPE_DATE: - cType = SQL_C_DATE; - 
bufferSize = sizeof(SQL_DATE_STRUCT); - break; - - case SQL_TIME: - case SQL_TYPE_TIME: - cType = SQL_C_TIME; - bufferSize = sizeof(SQL_TIME_STRUCT); - break; - - case SQL_TIMESTAMP: - case SQL_TYPE_TIMESTAMP: - cType = SQL_C_TIMESTAMP; - bufferSize = sizeof(SQL_TIMESTAMP_STRUCT); - break; - - default: - cType = SQL_C_CHAR; - bufferSize = 256; - isCharacterType = true; - break; - } - - std::vector<SQLCHAR> buffer(bufferSize); - - ret = SQLGetData(hStmt, i, cType, buffer.data(), bufferSize, &indicator); - - if (indicator == SQL_NULL_DATA) { - valueStr = "NULL"; - } - else if (!SQL_SUCCEEDED(ret)) { - valueStr = "ERR_CONV"; - } - else { - if (cType == SQL_C_CHAR) { - valueStr = reinterpret_cast<char*>(buffer.data()); - } - else if (cType == SQL_C_SBIGINT) { - SQLBIGINT intVal = *reinterpret_cast<SQLBIGINT*>(buffer.data()); - valueStr = std::to_string(intVal); - } - else if (cType == SQL_C_DOUBLE) { - double doubleVal = *reinterpret_cast<double*>(buffer.data()); - valueStr = std::to_string(doubleVal); - } - else if (cType == SQL_C_BIT) { - valueStr = (*buffer.data() != 0) ? 
"TRUE" : "FALSE"; - } - else if (cType == SQL_C_DATE) { - SQL_DATE_STRUCT* date = reinterpret_cast<SQL_DATE_STRUCT*>(buffer.data()); - char dateStr[20]; - snprintf(dateStr, sizeof(dateStr), "%04d-%02d-%02d", - date->year, date->month, date->day); - valueStr = dateStr; - } - else if (cType == SQL_C_TIME) { - SQL_TIME_STRUCT* time = reinterpret_cast<SQL_TIME_STRUCT*>(buffer.data()); - char timeStr[15]; - snprintf(timeStr, sizeof(timeStr), "%02d:%02d:%02d", - time->hour, time->minute, time->second); - valueStr = timeStr; - } - else if (cType == SQL_C_TIMESTAMP) { - SQL_TIMESTAMP_STRUCT* ts = reinterpret_cast<SQL_TIMESTAMP_STRUCT*>(buffer.data()); - char tsStr[30]; - snprintf(tsStr, sizeof(tsStr), "%04d-%02d-%02d %02d:%02d:%02d.%06d", - ts->year, ts->month, ts->day, - ts->hour, ts->minute, ts->second, - ts->fraction / 1000); - valueStr = tsStr; - } - else { - valueStr = "UNKNOWN_TYPE"; - } - - if (isCharacterType && ret == SQL_SUCCESS_WITH_INFO) { - SQLLEN actualSize = 0; - SQLGetDiagField(SQL_HANDLE_STMT, hStmt, 0, SQL_DIAG_COLUMN_SIZE, - &actualSize, SQL_IS_INTEGER, NULL); - - if (indicator > 0 && static_cast<size_t>(indicator) > bufferSize - 1) { - valueStr += "..."; - } - } - - } - - row.push_back(valueStr); - } - - tableRows.push_back(row); - } - - if (!tableRows.empty()) { - PrintSimpleTable(columnNames, tableRows); - } - - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (const std::exception& ex) { - std::cerr << "Exception: " << ex.what() << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } - catch (...) 
{ - std::cerr << "Unknown exception occurred" << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -/// Execute a non-query SQL statement (such as INSERT; in the tree model, INSERT creates the schema automatically) -void Execute(SQLHDBC hDbc, const std::string& command) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN ret; - - try { - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - - ret = SQLExecDirect(hStmt, (SQLCHAR*)command.c_str(), SQL_NTS); - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect"); - } - - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (...) { - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -int main() { - SQLHENV hEnv = SQL_NULL_HENV; - SQLHDBC hDbc = SQL_NULL_HDBC; - SQLRETURN ret; - - try { - std::cout << "Start" << std::endl; - - ret = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &hEnv); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_ENV)"); - - ret = SQLSetEnvAttr(hEnv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLSetEnvAttr"); - - ret = SQLAllocHandle(SQL_HANDLE_DBC, hEnv, &hDbc); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_DBC)"); - - std::string dsn = "Apache IoTDB DSN"; - std::string user = "root"; - std::string password = "root"; - std::string server = "127.0.0.1"; - - std::string connectionString = "DSN=" + dsn + ";Server=" + server + - ";UID=" + user + ";PWD=" + password + - ";loglevel=4;istablemodel=0"; - std::cout << "Using connection string: " << connectionString << std::endl; - - SQLCHAR outConnStr[1024]; - SQLSMALLINT outConnStrLen; - - ret = SQLDriverConnect(hDbc, NULL, - (SQLCHAR*)connectionString.c_str(), SQL_NTS, - outConnStr, sizeof(outConnStr), - &outConnStrLen, SQL_DRIVER_COMPLETE); - - if (ret != SQL_SUCCESS && ret != 
SQL_SUCCESS_WITH_INFO) { - std::cerr << "Login failed" << std::endl; - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLDriverConnect"); - return 1; - } - - SQLCHAR driverName[256]; - SQLSMALLINT nameLength; - ret = SQLGetInfo(hDbc, SQL_DRIVER_NAME, driverName, sizeof(driverName), &nameLength); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLGetInfo"); - - std::cout << "Successfully opened connection. database name = " << driverName << std::endl; - - const char* insertStatements[] = { - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, '设备运行状态正常', '设备A-机房1', 1735689600000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, '设备运行状态正常', '设备A-机房1', 1735689660000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, '设备运行状态正常', '设备A-机房1', 1735689720000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, '设备温度偏高告警', '设备A-机房1', 1735689780000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, '设备状态恢复正常', '设备A-机房1', 1735689840000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689900000, false, 105, 
10000000005, 37.0, 129.189, '设备运行状态正常', '设备B-机房2', 1735689900000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, '设备运行状态正常', '设备B-机房2', 1735689960000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, '设备湿度偏低告警', '设备B-机房2', 1735690020000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, '设备状态恢复正常', '设备B-机房2', 1735690080000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, '设备运行状态正常', '设备C-机房3', 1735690140000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, '设备运行状态正常', '设备C-机房3', 1735690200000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, '设备电压不稳告警', '设备C-机房3', 1735690260000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, '设备状态恢复正常', '设备C-机房3', 1735690320000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, 
int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, '设备运行状态正常', '设备D-机房4', 1735690380000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, '设备运行状态正常', '设备D-机房4', 1735690440000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, '设备运行状态正常', '设备D-机房4', 1735690500000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, '设备信号中断告警', '设备D-机房4', 1735690560000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, '设备运行状态正常', '设备E-机房5', 1735690620000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, '设备运行状态正常', '设备E-机房5', 1735690680000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', 1735690740000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690790000, false, 119, 10000000019, 
38.4, 130.589, '设备运行状态正常', '设备E-机房5', 1735690740000, '2026-01-04')" - }; - for (const char* sql : insertStatements) { - Execute(hDbc, sql); - } - std::cout << "[DEBUG] Inserted 20 rows. Begin to query." << std::endl; - Query(hDbc); - std::cout << "Query ok" << std::endl; - - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - - return 0; - } - catch (...) { - if (hDbc != SQL_NULL_HDBC) { - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - } - if (hEnv != SQL_NULL_HENV) { - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - } - - std::cerr << "Unexpected error!" << std::endl; - return 1; - } -} -``` - -### 4.4 PowerBI 示例 - -1. 打开 PowerBI Desktop,创建新项目; -2. 点击「主页」→「获取数据」→「更多...」→「ODBC」→ 点击「连接」按钮; -3. 数据源选择:在弹出窗口中选择「数据源名称 (DSN)」,下拉选择 `Apache IoTDB DSN`; -4. 高级配置: - -* 点击「高级选项」,在「连接字符串」输入框填写配置(样例): - -```Plain -server=127.0.0.1;port=6667;isTableModel=false;loglevel=4 -``` - -* 说明: - - * `dsn` 项可选,填写 / 不填写均不影响连接; - * `loglevel` 分为 0-4 等级:0 级(ERROR)日志最少,4 级(TRACE)日志最详细,按需设置; - * `server`/`dsn`/`loglevel` 大小写不敏感(如可写为 `Server`); - * 如果在DSN中配置了相关信息,则可以不填写任何配置信息,驱动管理器会自动使用在DSN中填入的配置信息。 - -5. 身份验证:输入用户名(默认 `root`)和密码(默认 `root`),点击「连接」; -6. 数据加载:点击「加载」即可查看数据。 - -### 4.5 Excel 示例 - -1. 打开 Excel,创建空白工作簿; -2. 点击「数据」选项卡 →「自其他来源」→「来自数据连接向导」; -3. 数据源选择:选择「ODBC DSN」→ 下一步 → 选择 `Apache IoTDB DSN` → 下一步; -4. 连接配置: -* 连接字符串、用户名、密码的输入流程与 PowerBI 完全一致,连接字符串格式参考: - -```Plain -server=127.0.0.1;port=6667;isTableModel=false;loglevel=4 -``` - -* 如果在DSN中配置了相关信息,则可以不填写任何配置信息,驱动管理器会自动使用在DSN中填入的配置信息。 -5. 保存连接:自定义设置数据连接文件名、连接描述等信息,点击「完成」; -6. 导入数据:选择数据导入到工作表的位置(如「现有工作表」的 A1 单元格),点击「确定」,完成数据加载。 diff --git a/src/zh/UserGuide/Master/Tree/API/Programming-OPC-DA_timecho.md b/src/zh/UserGuide/Master/Tree/API/Programming-OPC-DA_timecho.md deleted file mode 100644 index f6dc7368d..000000000 --- a/src/zh/UserGuide/Master/Tree/API/Programming-OPC-DA_timecho.md +++ /dev/null @@ -1,208 +0,0 @@ - - -# OPC DA 协议 - -## 1. 
OPC DA - -OPC DA (OPC Data Access) 是工业自动化领域的一种通信协议标准,属于经典 OPC(OLE for Process Control)技术的核心部分。它的主要目标是实现 Windows 环境下工业设备与软件(如 SCADA、HMI、数据库)之间的实时数据交互。OPC DA 基于 COM / DCOM 实现,是一个轻量级的协议,分为服务器和客户端两个角色。 - -* **服务器:** 可以视为一个 Item 的池,存储各个实例的最新数据及其状态。所有 item 只能在服务器端管理,客户端只能读写数据,无权操作元信息。 - -![](/img/opc-da-1-1.png) - -* **客户端:** 连接服务器后,需要自定义一个组(这个组仅与客户端有关),并创建服务器的同名 item,然后可以对自身已创建的 item 进行读写。 - -![](/img/opc-da-1-2.png) - -## 2. OPC DA Sink - -IoTDB (V2.0.5.1及以后的V2.x版本支持) 提供的 OPC DA Sink 支持将树模型数据推送到本地 COM 服务器的插件,它封装了 OPC DA 接口规范及其固有复杂性,显著简化了集成流程。OPC DA Sink 推送数据流图如下所示。 - -![](/img/opc-da-2-1.png) - -### 2.1 SQL 语法 - -```SQL ----- 注意这里的 clsID 需要替换为自己的 clsID -create pipe opc ( - 'sink'='opc-da-sink', - --- 'opcda.progid'='opcserversim.Instance.1' - 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C' -); -``` - -### 2.2 参数介绍 - -| **参数** | **描述** | **取值范围 ** | 是否必填 | -| ------------------- | --------------------------------------------------------------------- | ----------------------- | ------------------ | -| sink | OPC DA SINK | String: opc-da-sink | 必填 | -| sink.opcda.clsid | OPC Server 的 ClsID(唯一标识字符串)。建议使用 clsID 而非 progID。 | String | 和 progId 二选一 | -| sink.opcda.progid | OPC Server 的 ProgID,如果有 clsID,优先使用 clsID。 | String | 和 clsID 二选一 | - -### 2.3 映射规范 - -使用时,IoTDB 将会将自身的树模型最新数据推送到服务器,数据的 itemID 为树模型下的时间序列的全路径,如 `root.a.b.c.d`。注意根据 OPC DA 标准,客户端无权直接在 server 侧创建 item,因此需要服务器提前将 IoTDB 的时间序列以 itemID 和对应数据类型的格式创建为 item。 - -* 数据类型对应如下表所示。 - -| IoTDB | OPC-DA Server | -| ----------- | ----------------------------------------------------------- | -| INT32 | VT\_I4 | -| INT64 | VT\_I8 | -| FLOAT | VT\_R4 | -| DOUBLE | VT\_R8 | -| TEXT | VT\_BSTR | -| BOOLEAN | VT\_BOOL | -| DATE | VT\_DATE | -| TIMESTAMP | VT\_DATE | -| BLOB | VT\_BSTR(Variant 不支持 VT\_BLOB,因此用 VT\_BSTR 替代) | -| STRING | VT\_BSTR | - -### 2.4 常见错误码 - -| 符号 | 错误码 | 描述 | -| ----------------------------- | ------------ | 
------------------------------------------------------------------------------------------------------------------------------------------------------ | -| OPC\_E\_BADTYPE | 0xC0040004 | 服务器无法在指定格式/请求的数据类型与规范数据类型之间转换数据。即服务器的数据类型与 IoTDB 的注册类型不一致。 | -| OPC\_E\_UNKNOWNITEMID| 0xC0040007 | 在服务器地址空间中未定义该条目ID(添加或验证时),或该条目ID在服务器地址空间中已不存在(读取或写入时)。即 IoTDB 的测点在服务器内没有对应的 itemID。 | -| OPC\_E\_INVALIDITEMID | 0xC0040008 | 该 itemID不符合服务器的语法规范。 | -| REGDB\_E\_CLASSNOTREG | 0x80040154 | 未注册类 | -| RPC\_S\_SERVER\_UNAVAILABLE | 0x800706ba | RPC服务不可用 | -| DISP\_E\_OVERFLOW | 0x8002000a | 超过类型的最大值 | -| DISP\_E\_BADVARTYPE | 0x80020005 | 类型不匹配 | - -### 2.5 使用限制 - -* 仅支持 COM,且仅能在 Windows 上使用 -* 重启后可能会推送少部分旧数据,但是最终会推送新数据 -* 目前仅支持树模型数据。 - -## 3. 使用步骤 -### 3.1 前置条件 -1. Windows 环境,版本 >= 8 -2. IoTDB 已安装且可正常运行 -3. OPC DA Server 已安装 - -* 以 Simple OPC Server Simulator 为例 - -![](/img/opc-da-3-1.png) - -* 双击某项,可以修改该项的名字(itemID),数据,数据类型等各个信息。 -* 右键某项,可以删除该项、更新值、以及新建项。 - -![](/img/opc-da-3-2.png) - -4. OPC DA Client 已安装 - -* 以 KepwareServerEX 的 quickClient 为例 -* 在 Kepware 中可以如下打开 OPC DA Client - -![](/img/opc-da-3-3.png) - -![](/img/opc-da-3-4.png) - - -### 3.2 配置修改 - -修改 server 配置,以避免 IoTDB 的写入 client 与 Kepware 的读取 client 连接到两个不同的实例而无法调试。 - -* 首先按 Win+R 键,在运行菜单内输入 `dcomcnfg`,打开 dcom 的组件配置: - -![](/img/opc-da-3-5.png) - -* 点击组件服务 -> 计算机 -> 我的电脑 -> DCOM 配置,找到`AGG Software Simple OPC Server Simulator`,右键“属性”: - -![](/img/opc-da-3-6.png) - -* 在`标识`内,将`用户账户`改为`交互式用户`。注意这里不要为`启动用户`,否则可能导致两个 client 分别启动不同的 server 实例。 - -![](/img/opc-da-3-7.png) - -### 3.3 clsID 获取 -1. 方式一:通过 DCOM 配置 获取 - -* 按 Win+R 键,在运行菜单内输入 `dcomcnfg`,打开 dcom 的组件配置; -* 点击组件服务 -> 计算机 -> 我的电脑 -> DCOM 配置,找到`AGG Software Simple OPC Server Simulator`,右键“属性”。 -* 在 `常规 `中可以获取该应用程序的 clsID,用于之后 opc-da-sink 的连接,注意不带大括号 - -![](/img/opc-da-3-8.png) - -2. 方式二:clsID 与 progID 也可以直接在 server 里获取 - -* 点击 `Help` > `Show OPC Server Info` - -![](/img/opc-da-3-9.png) - -* 弹窗中即可显示 - -![](/img/opc-da-3-10.png) - -### 3.4 写入数据 -#### 3.4.1 DA Server -1. 
在 DA Server 内新建项,与 IoTDB 的待写入项的 name 与 type 保持一致 - -![](/img/opc-da-3-11.png) - -2. 在 Kepware 中连上该 server: - -![](/img/opc-da-3-12.png) - -3. 右键服务器新建组,组名任意: - -![](/img/opc-da-3-13.png) - -![](/img/opc-da-3-14.png) - -4. 右键新建 item,item 的名字为之前创建的名字 - -![](/img/opc-da-3-15.png) - -![](/img/opc-da-3-16.png) - -![](/img/opc-da-3-17.png) - -#### 3.4.2 IoTDB -1. 启动 IoTDB -2. 创建 Pipe - -```SQL -create pipe opc ('sink'='opc-da-sink', 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C') -``` - -* 注意:如果创建失败,提示` Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 1107: Failed to connect to server, error code: 0x80040154`,则可以参考该解决方案进行处理:https://opcexpert.com/support/0x80040154-class-not-registered/ - -3. 创建时间序列(如果已开启自动创建元数据,则本步骤可以省略) - -```SQL -create timeseries root.a.b.c.r string; -``` - -4. 插入数据 - -```SQL -insert into root.a.b.c (time, r) values(10000, "SomeString") -``` - -### 3.5 验证数据 - -查看 Quick client 的数据,应该已经得到更新。 - -![](/img/opc-da-3-18.png) \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/API/Programming-OPC-UA_timecho.md b/src/zh/UserGuide/Master/Tree/API/Programming-OPC-UA_timecho.md deleted file mode 100644 index f53d8d036..000000000 --- a/src/zh/UserGuide/Master/Tree/API/Programming-OPC-UA_timecho.md +++ /dev/null @@ -1,398 +0,0 @@ - - -# OPC UA 协议 - -## 1. 功能概述 - -本文档介绍了 IoTDB 与 OPC UA 协议集成的两种独立工作模式,请根据您的业务场景进行选择: - -* **模式一:数据订阅服务 (IoTDB 作为 OPC UA 服务器)**:IoTDB 启动内置的 OPC UA 服务器,被动地允许外部客户端(如 UAExpert)连接并订阅其内部数据。这是传统用法。 -* **模式二:数据主动推送 (IoTDB 作为 OPC UA 客户端)**:IoTDB 作为客户端,主动将数据和元数据同步到一个或多个独立部署的外部 OPC UA 服务器。 - > 注意:该模式从 V2.0.8 起支持。 - -**注意:模式互斥** - -当 Pipe 配置中指定了 `node-urls` 参数(模式二),IoTDB 将不会启动内置的 OPC UA 服务器(模式一)。两种模式在同一 Pipe 中**不可同时使用**。 - -## 2. 数据订阅 - -本模式支持用户以 OPC UA 协议从 IoTDB 中订阅数据,订阅数据的通信模式支持 Client/Server 和 Pub/Sub 两种。 - -注意:本功能并非从外部 OPC Server 中采集数据写入 IoTDB - -![](/img/opc-ua-new-1.png) - -### 2.1 OPC 服务启动方式 -#### 2.1.1 语法 - -启动 OPC UA 协议的语法: - -```SQL -create pipe p1 - with source (...) - with processor (...) 
- with sink ('sink' = 'opc-ua-sink', - 'sink.opcua.tcp.port' = '12686', - 'sink.opcua.https.port' = '8443', - 'sink.user' = 'root', - 'sink.password' = 'TimechoDB@2021', //V2.0.6.x 之前默认密码为root - 'sink.opcua.security.dir' = '...' - ) -``` - -#### 2.1.2 参数 - -| **参数** | **描述** | **取值范围** | **是否必填** | **默认值** | -| ------------------------------------ |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------| -------------------- |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| sink | OPC UA SINK | String: opc-ua-sink | 必填 | | -| sink.opcua.model | OPC UA 使用的模式 | String: client-server / pub-sub | 选填 | client-server | -| sink.opcua.tcp.port | OPC UA 的 TCP 端口 | Integer: [0, 65536] | 选填 | 12686 | -| sink.opcua.https.port | OPC UA 的 HTTPS 端口 | Integer: [0, 65536] | 选填 | 8443 | -| sink.opcua.security.dir | OPC UA 的密钥及证书目录 | String: Path,支持绝对及相对目录 | 选填 | 1. iotdb 相关 DataNode 的 conf 目录下的 `opc_security` 文件夹 `/`。2. 如无 iotdb 的 conf 目录(例如 IDEA 中启动 DataNode),则为用户主目录下的 `iotdb_opc_security` 文件夹 `/` | -| opcua.security-policy | OPC UA 连接使用的安全策略,不区分大小写。可以配置多个,用`,`连接。配置一个安全策略后,client 才能用对应的策略连接。当前实现默认支持`None`和`Basic256Sha256`策略,应该默认改为任意的非`None`策略,`None`策略在调试环境中单独配置,因为`None`策略虽然不需移动证书,操作方便,但是不安全,生产环境的 server 不建议支持该策略。注意:V2.0.8 起支持该参数,且仅支持 client-server 模式 | String(安全性依次递增):
`None`
`Basic128Rsa15`
`Basic256`
`Basic256Sha256`
`Aes128_Sha256_RsaOaep`
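`Aes256_Sha256_RsaPss` | Optional | `Basic256Sha256`,`Aes128_Sha256_RsaOaep`,`Aes256_Sha256_RsaPss` | -| sink.opcua.enable-anonymous-access | Whether OPC UA allows anonymous access | Boolean | Optional | true | -| sink.user | User name; here, the user allowed to access OPC UA | String | Optional | root | -| sink.password | Password; here, the password of the allowed OPC UA user | String | Optional | TimechoDB@2021 (the default password was root before V2.0.6.x) | -| opcua.with-quality | Whether OPC UA publishes measurement points in value + quality mode. When enabled, written data is handled according to the following rules: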
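1. If a write contains both value and quality, it is pushed to the OPC UA Server directly.
2. If it contains only value, quality is filled in automatically as UNCERTAIN (the default, which is configurable).
3. If it contains only quality, the write is ignored and nothing is processed.
4. If it contains fields other than value/quality, the data is ignored and a warning is logged (the log frequency is configurable, to avoid high-frequency noise).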
5. quality 类型限制:目前仅支持布尔类型(true 表示 GOOD,false 表示 BAD); 注意:V2.0.8 起支持该参数,且仅支持 client-server 模式 | Boolean | 选填 | false | -| opcua.value-name | With-quality 为 true 时生效,表示 value 测点的名字。 注意:V2.0.8 起支持该参数,且仅支持 client-server 模式 | String | 选填 | value | -| opcua.quality-name | With-quality 为 true 时生效,表示 quality 测点的名字。 注意:V2.0.8 起支持该参数,且仅支持 client-server 模式 | String | 选填 | quality | -| opcua.default-quality | 没有 quality 时,可以通过 SQL 参数指定`GOOD`/`UNCERTAIN`/`BAD`。 注意:V2.0.8 起支持该参数,且仅支持 client-server 模式 | String:`GOOD`/`UNCERTAIN`/`BAD` | 选填 | `UNCERTAIN` | -| opcua.timeout-seconds | Client 连接 server 的超时秒数,仅在 IoTDB 为 client 时生效 注意:V2.0.8 起支持该参数,且仅支持 client-server 模式 | Long | 选填 | 10L | - -#### 2.1.3 示例 - -```Bash -create pipe p1 - with sink ('sink' = 'opc-ua-sink', - 'sink.user' = 'root', - 'sink.password' = 'TimechoDB@2021');//V2.0.6.x 之前默认密码为root -start pipe p1; -``` - -#### 2.1.4 使用限制 -1. 启动协议之后需要写入数据,才能建立连接,且仅能订阅建立连接之后的数据。 -2. 推荐在单机模式下使用。在分布式模式下,每一个 IoTDB DataNode 都作为一个独立的 OPC Server 提供数据,需要单独订阅。 - -### 2.2 两种通信模式示例 -#### 2.2.1 Client / Server 模式 - -在这种模式下,IoTDB 的流处理引擎通过 OPC UA Sink 与 OPC UA 服务器(Server)建立连接。OPC UA 服务器在其地址空间(Address Space) 中维护数据,IoTDB可以请求并获取这些数据。同时,其他OPC UA客户端(Client)也能访问服务器上的数据。 - -* 特性: - * OPC UA 将从 Sink 收到的设备信息,按照树形模型整理到 Objects folder 下的文件夹中。 - * 每个测点都被记录为一个变量节点,并记录当前数据库中的最新值。 - * OPC UA 无法删除数据或者改变数据类型的设置 - -##### 2.2.1.1 准备工作 -1. 此处以UAExpert客户端为例,下载 UAExpert 客户端:https://www.unified-automation.com/downloads/opc-ua-clients.html -2. 安装 UAExpert,填写自身的证书等信息。 - -##### 2.2.1.2 快速开始 -###### 2.2.1.2.1 支持 None 安全策略的场景 -1. 使用如下 sql,启动 OPC UA 服务。详细语法参见上文:[IoTDB OPC Server语法](./Programming-OPC-UA_timecho.md#_2-1-语法) - -```SQL -create pipe p1 with sink ('sink'='opc-ua-sink', 'opcua.security-policy'='AES128_SHA256_RSAOAEP, AES256_SHA256_RSAPSS, BASIC256SHA256, NONE'); -``` -注意:在 2.0.8.1 及以上版本中,默认不再支持 `None`,如需使用必须通过 `security-policy` 参数手动开启,如上所示。 - -2. 写入部分数据。 - -```SQL -insert into root.test.db(time, s2) values(now(), 2); -``` - -3. 
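Configure the IoTDB connection in UAExpert. For the password, enter the value set in sink.password in the parameter configuration above (the user name and password here follow the root/root configured in the example in Section 2.3): - -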
- -
- -
- -
- -4. After trusting the server certificate, the written data appears under the Objects folder on the left. - -
- -
- -
- -
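- -Note: Because the `SecurityPolicy` configured here is `None`, the certificates do not need to be mutually trusted. In production, connect with a `SecurityPolicy` other than `None`, which does require mutual certificate trust. The procedure is described in the `Pub/Sub` mode section below: in the `Client/Server` certificate directory (search the printed logs for the keyword keyStore to locate it), move the contents of the rejected directory into `trusted/certs`, in this order: connect, move the certificate on the server side, connect, move the certificate on the client side, connect. - -5. Drag a node from the left pane into the middle pane to display its latest value: - -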
- -
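The `opcua.security-policy` value used in step 1 is a comma-separated, case-insensitive list of policy names. The following is a minimal sketch of how such a value could be normalized to the names in the parameter table and mapped to the `http://opcfoundation.org/UA/SecurityPolicy#` URIs mentioned in the data push section. The function names `normalize_policies` and `policy_uri` are hypothetical and are not part of IoTDB; this only illustrates the naming rule and does not reflect how IoTDB parses the parameter internally.

```python
# Hypothetical helpers, not part of IoTDB: they only illustrate how a
# comma-separated, case-insensitive policy list could be normalized.
POLICY_URI_PREFIX = "http://opcfoundation.org/UA/SecurityPolicy#"

# Policy names as listed in the parameter table, weakest first.
KNOWN_POLICIES = [
    "None",
    "Basic128Rsa15",
    "Basic256",
    "Basic256Sha256",
    "Aes128_Sha256_RsaOaep",
    "Aes256_Sha256_RsaPss",
]


def normalize_policies(raw):
    """Return canonical policy names for a comma-separated setting."""
    by_lower = {name.lower(): name for name in KNOWN_POLICIES}
    result = []
    for token in raw.split(","):
        key = token.strip().lower()
        if not key:
            continue  # tolerate stray commas
        if key not in by_lower:
            raise ValueError(f"unknown security policy: {token.strip()!r}")
        name = by_lower[key]
        if name not in result:  # drop duplicates, keep first occurrence
            result.append(name)
    return result


def policy_uri(name):
    """Build the policy URI from a canonical policy name."""
    return POLICY_URI_PREFIX + name


if __name__ == "__main__":
    setting = "AES128_SHA256_RSAOAEP, AES256_SHA256_RSAPSS, BASIC256SHA256, NONE"
    for policy in normalize_policies(setting):
        print(policy, "->", policy_uri(policy))
```

Rejecting unknown names rather than ignoring them is a design choice of this sketch: a misspelled policy would otherwise silently leave an endpoint unavailable.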
- -###### 2.2.1.2.2 不支持 None 安全策略的场景 -1. 使用如下 sql,创建并启动 OPC UA 服务。 - ```SQL - create pipe p1 with sink ('sink'='opc-ua-sink'); - ``` - 注意:从 2.0.8.1 版本开始,`OpcUaSink` 出于安全考虑,默认不再支持 `None` 模式。 - -2. 写入部分数据。 - ```SQL - insert into root.test.db(time, s2) values(now(), 2); - ``` - -3. 在 UAExpert 中配置 IoTDB 连接: - - 不可直接访问 `URL`,必须通过 `Discover` 方式发现端点 - - 客户端会先使用 `None` 策略发送 `GetEndpoints` 请求获取端点列表 - - 再根据配置的 `Basic256Sha256 + SignAndEncrypt` 选择对应加密端点建立加密连接 - -![](/img/opc-ua-un-none-1.png) - -4. 用户名密码配置同上,点击相关的连接模式后(`Sign` / `Sign & Encrypt`),如果出现以下内容,点 `Ignore` 直接连。 - -![](/img/opc-ua-un-none-2.png) - -#### 2.2.2 Pub / Sub 模式 - -在这种模式下,IoTDB的流处理引擎通过 OPC UA Sink 向OPC UA 服务器(Server)发送数据变更事件。这些事件被发布到服务器的消息队列中,并通过事件节点 (Event Node) 进行管理。其他OPC UA客户端(Client)可以订阅这些事件节点,以便在数据变更时接收通知。 - -* 特性: - * 每个测点会被 OPC UA 包装成一个事件节点(EventNode)。 - * 相关字段及其对应含义如下: - - | 字段 | 含义 | 类型(Milo) | 示例 | - | ------------ | ------------------ | --------------- | ----------------------- | - | Time | 时间戳 | DateTime | 1698907326198 | - | SourceName | 测点对应完整路径 | String | root.test.opc.sensor0 | - | SourceNode | 测点数据类型 | NodeId | Int32 | - | Message | 数据 | LocalizedText | 3.0 | - - - Event 仅会发送给所有已经监听的客户端,客户端未连接则会忽略该 Event。 - - 如果数据被删除,信息则无法推送给客户端。 - - -##### 2.2.2.1 准备工作 - -该代码位于 iotdb-example 包下的 [opc-ua-sink 文件夹](https://github.com/apache/iotdb/tree/master/example/pipe-opc-ua-sink/src/main/java/org/apache/iotdb/opcua)中 - -代码中包含: - -- 主类(ClientTest) -- Client 证书相关的逻辑(IoTDBKeyStoreLoaderClient) -- Client 的配置及启动逻辑(ClientExampleRunner) -- ClientTest 的父类(ClientExample) - -##### 2.2.2.2 快速开始 - -使用步骤为: - -1. 打开 IoTDB 并写入部分数据。 - -```SQL -insert into root.a.b(time, c, d) values(now(), 1, 2); -``` - - 此处自动创建元数据开启。 - -2. 使用如下 sql,创建并启动 Pub-Sub 模式的 OPC UA Sink。详细语法参见上文:[IoTDB OPC Server语法](./Programming-OPC-UA_timecho.md#_2-1-语法) - -```SQL -create pipe p1 with sink ('sink'='opc-ua-sink', 'sink.opcua.model'='pub-sub'); -start pipe p1; -``` - - 此时能看到服务器的 conf 目录下创建了 opc 证书相关的目录。 - -
- -
- -3. Run the Client directly to connect. At this point the server rejects the Client certificate. - -
- -
- -4. Go to the server's sink.opcua.security.dir directory and open the rejected directory under pki. The Client certificate should already have been placed there. - -
- -
- -5. Move (do not copy) the client certificate into the certs folder of the trusted directory alongside it. - -
- -
- -6. Start the Client connection again. This time the Client should reject the server certificate. - -
- -
- -7. Go to the client's /client/security directory, open the rejected directory under pki, and move (do not copy) the server certificate into the trusted directory. - -
- -
- -8. Start the Client. Mutual trust is now established and the Client can connect to the server. - -9. Write data to the server. The Client now prints the data it receives. - -
- -
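The certificate steps above (move, not copy, from the rejected directory into the trusted one) can be sketched as a small script. This is only an illustration under stated assumptions: the helper name `trust_rejected_certs` is hypothetical, and the layout `pki/rejected` and `pki/trusted/certs` is taken from the server-side steps; the client-side layout may differ. It moves every file it finds, so the rejected directory should be inspected first to make sure it holds only certificates you intend to trust.

```python
import shutil
import tempfile
from pathlib import Path


def trust_rejected_certs(pki_dir):
    """Move every file from <pki_dir>/rejected into <pki_dir>/trusted/certs.

    Hypothetical helper: the directory layout is an assumption based on
    the steps above. Returns the names of the files that were moved.
    """
    rejected = Path(pki_dir) / "rejected"
    trusted = Path(pki_dir) / "trusted" / "certs"
    trusted.mkdir(parents=True, exist_ok=True)
    moved = []
    if not rejected.is_dir():
        return moved
    for cert in sorted(rejected.iterdir()):
        if cert.is_file():
            # Move rather than copy, as the steps require.
            shutil.move(str(cert), str(trusted / cert.name))
            moved.append(cert.name)
    return moved


if __name__ == "__main__":
    # Demo on a throwaway directory. A real run would point at the pki
    # directory under sink.opcua.security.dir on the server side.
    with tempfile.TemporaryDirectory() as tmp:
        pki = Path(tmp) / "pki"
        (pki / "rejected").mkdir(parents=True)
        (pki / "rejected" / "client.der").write_bytes(b"demo")
        print(trust_rejected_certs(pki))
```

After the move, the peer has to reconnect for the newly trusted certificate to take effect, which is why the steps alternate between connecting and moving.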
- - -#### 2.2.3 注意事项 - -1. **单机与集群**:建议使用1C1D单机版,如果集群中有多个 DataNode,可能数据会分散发送在各个 DataNode 上,无法收听到全量数据。 - -2. **无需操作根目录下证书**:在证书操作过程中,无需操作 IoTDB security 根目录下的 `iotdb-server.pfx` 证书和 client security 目录下的 `example-client.pfx` 目录。Client 和 Server 双向连接时,会将根目录下的证书发给对方,对方如果第一次看见此证书,就会放入 reject dir,如果该证书在 trusted/certs 里面,则能够信任对方。 - -3. **建议使用** **Java 17+**:在 JVM 8 的版本中,可能会存在密钥长度限制,报 Illegal key size 错误。对于特定版本(如 jdk.1.8u151+),可以在 ClientExampleRunner 的 create client 里加入 `Security.`*`setProperty`*`("crypto.policy", "unlimited");` 解决,也可以下载无限制的包 `local_policy.jar` 与 `US_export_policy `解决替换 `JDK/jre/lib/security `目录下的包解决,下载网址:https://www.oracle.com/java/technologies/javase-jce8-downloads.html。 - -4. **连接问题**:如果报错为 Unknown host,需要修改 IoTDB DataNode 所在机器的 etc/hosts 文件,加入目标端机器的 url 和 hostName。 - -## 3. 数据推送 - -在此模式下,IoTDB 通过 Pipe 扮演 OPC UA 客户端角色,主动将选定的数据连同质量码(`quality`)一并推送到一个或多个外部 OPC UA 服务器。外部服务器会自动按 IoTDB 的元数据动态创建目录树和节点。 - -![](/img/opc-ua-data-push.png) - -### 3.1 OPC 服务启动方式 - -#### 3.1.1语法 - -启动 OPC UA 协议的语法: - -```SQL -create pipe p1 - with source (...) - with processor (...) 
- with sink ('sink' = 'opc-ua-sink', - 'opcua.node-url' = '127.0.0.1:12686', - 'opcua.historizing' = 'true', - 'opcua.with-quality' = 'true' - ) -``` - -#### 3.1.2 参数 - -| **参数** | **描述** | ** 取值范围 ** | **是否必填** | **默认值** | -|-----------------------| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------- | -------------------- | -| sink | OPC UA SINK | String: opc-ua-sink | 必填 | | -| opcua.node-url | 逗号分隔,可以配置单个 OPC UA 的 tcp port,当存在该参数时,不会启动本机 server,而是发送到配置的 OPC UA Server。 | String | 选填 | `''` | -| opcua.historizing | 自动创建目录及叶子节点时,新节点是否存变量的历史数据。 | Boolean | 选填 | false | -| opcua.with-quality | OPC UA 的测点发布是否为 value + quality 模式。启用配置后,系统将按以下规则处理写入数据:
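1. If a write contains both value and quality, it is pushed to the OPC UA Server directly.
2. If it contains only value, quality is filled in automatically as UNCERTAIN (the default, which is configurable).
3. If it contains only quality, the write is ignored and nothing is processed.
4. If it contains fields other than value/quality, the data is ignored and a warning is logged (the log frequency is configurable, to avoid high-frequency noise).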
5. quality 类型限制:目前仅支持布尔类型(true 表示 GOOD,false 表示 BAD); | Boolean | 选填 | false | -| opcua.value-name | With-quality 为 true 时生效,表示 value 测点的名字。 | String | 选填 | value | -| opcua.quality-name | With-quality 为 true 时生效,表示 quality 测点的名字。 | String | 选填 | quality | -| opcua.default-quality | 没有 quality 时,可以通过 SQL 参数指定`GOOD`/`UNCERTAIN`/`BAD`。 | String:`GOOD`/`UNCERTAIN`/`BAD` | 选填 | `UNCERTAIN` | -| opcua.security-policy | OPC UA client 连接使用的安全策略,不区分大小写,网址为:`http://opcfoundation.org/UA/SecurityPolicy#`,例如http://opcfoundation.org/UA/SecurityPolicy#Aes128_Sha256_RsaOaep | String(安全性依次递增):
`None`
`Basic128Rsa15`
`Basic256`
`Basic256Sha256`
`Aes128_Sha256_RsaOaep`
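`Aes256_Sha256_RsaPss` | Optional | `Basic256Sha256` | -| opcua.timeout-seconds | Timeout in seconds for the client to connect to the server; takes effect only when IoTDB acts as the client | Long | Optional | 10L | - -> **Parameter naming**: The `opcua.` prefix can be omitted from all of the parameters above; for example, `node-urls` and `opcua.node-urls` are equivalent. -> -> **Parameter support**: The `opcua.` parameters above are supported from V2.0.8 onward, and only in `client-server` mode. - -#### 3.1.3 Example - -```SQL -create pipe p1 - with source (...) - with processor (...) - with sink ('sink' = 'opc-ua-sink', - 'node-urls' = '127.0.0.1:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ) -``` - -#### 3.1.4 Limitations - -1. This mode **supports only `client-server` mode and tree-model data**. -2. Configuring multiple DataNodes on one machine is not supported, to avoid contention for the same port. -3. Pushing data of type `OBJECT` is not supported. -4. When a time series is renamed, the OPC UA Sink is updated accordingly: the old path is deleted and data is pushed to the new path. -5. In production, using a security policy other than `None` (such as `Basic256Sha256`) and configuring mutual certificate trust correctly is strongly recommended. - -### 3.2 External OPC UA Server Project - -IoTDB supports a separate external Server project. Its implementation and configuration items are the same as those of IoTDB's built-in Server, but it must additionally support adding directories and leaf nodes, so that they can be created automatically from the metadata in IoTDB writes. - -The Server's configuration must be passed as command-line args when the Server starts; configuration files such as yml or xml are not supported yet. The startup argument keys correspond to the IoTDB OPC Server configuration items, with the dots (.) and hyphens (-) in the item names replaced by underscores (\_). - -For example: - -```Bash -.\start-IoTDB-opc-server.sh -enable_anonymous_access true -u root -pw root -https_port 8443 -``` - -Here, `user` and `password` can be abbreviated as `-u` and `-p`; all other argument keys match the configuration items. Note that `userName` cannot be used as an argument key; only `user` is supported. - -### 3.3 Scenario Example - -**Goal**: Aggregate data from multiple sources by region into 3 external OPC Servers for unified access by a monitoring center. - -![](/img/opc-ua-data-push-example.png) - -1. **Preparation**: Start an external OPC UA Server (port 12686) on each of three servers (`ip1`, `ip2`, `ip3`). -2. **Configure the Pipes**: Create 3 Pipes in IoTDB, filtering by path pattern in `processor` or `source`, so that each region's data is pushed to the corresponding Server. - ```SQL - -- Start and connect to IoTDB - .\start-standalone.sh - - -- Start the three OPC UA Servers - -- Run once on each of ip1, ip2, ip3; the port is the default, 12686 - .\start-IoTDB-external-opc-server.sh -enable-anonymous-access true -u root -pw root - - -- Create the three Pipes - .\start-cli.sh - create pipe p1 - with source () - with processor (...) - with sink ('sink' = 'opc-ua-sink', - 'node-urls' = 'ip1:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ); - create pipe p2 - with source () - with processor (...) 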
- with sink ('sink' = 'opc-ua-sink', - 'node-urls' = 'ip2:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ); - create pipe p3 - with source () - with processor (...) - with sink ('sink' = 'opc-ua-sink', - 'node-urls' = 'ip3:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ); - ``` -3. **Result**: The monitoring center only needs to connect to the three Servers `ip1`, `ip2`, and `ip3` to get a complete view of the data from all regions, with quality information attached. diff --git a/src/zh/UserGuide/Master/Tree/API/Programming-Python-Native-API_timecho.md b/src/zh/UserGuide/Master/Tree/API/Programming-Python-Native-API_timecho.md deleted file mode 100644 index f9895a3d4..000000000 --- a/src/zh/UserGuide/Master/Tree/API/Programming-Python-Native-API_timecho.md +++ /dev/null @@ -1,860 +0,0 @@ - - -# Python Native API - -## 1. Dependencies - -Before using the Python native API package, install the thrift (>=0.13) dependency. - -## 2. How to Use (Example) - -First install the package: `pip3 install "apache-iotdb>=2.0"` - -Note: Do not use a newer client version to connect to an older server version. - -An example of reading and writing data with this package is available here: [Session Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/session_example.py) - -An example of reading and writing aligned time series: [Aligned Timeseries Session Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/session_aligned_timeseries_example.py) - -(You need to add `import iotdb` at the top of the file.) - -Or: - -```python -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021" # the default password was root before V2.0.6.x -session = Session(ip, port_, username_, password_) -session.open(False) -zone = session.get_time_zone() -session.close() -``` - -## 3. 
Basic API Description - -The following briefly describes the Session interfaces and their parameters: - -### 3.1 Initialization - -* Initialize a Session - -```python -session = Session( - ip="127.0.0.1", - port="6667", - user="root", - password="TimechoDB@2021", # the default password was root before V2.0.6.x - fetch_size=1024, - zone_id="UTC+8", - enable_redirection=True -) -``` - -* Initialize a Session that can connect to multiple nodes - -```python -session = Session.init_from_node_urls( - node_urls=["127.0.0.1:6667", "127.0.0.1:6668", "127.0.0.1:6669"], - user="root", - password="TimechoDB@2021", # the default password was root before V2.0.6.x - fetch_size=1024, - zone_id="UTC+8", - enable_redirection=True -) -``` - -* Open the Session and choose whether to enable RPC compression - -```python -session.open(enable_rpc_compression=False) -``` - -Note: The client's RPC compression setting must match the server's. - -* Close the Session - -```python -session.close() -``` -### 3.2 Managing Session Connections with SessionPool - -With SessionPool managing sessions, you no longer need to handle session reuse yourself. When the number of sessions reaches the pool's maximum, requests for a session block; the wait time is set by a parameter. After each use, return the session to the SessionPool by calling put_back. - -#### Create a SessionPool - -```python -pool_config = PoolConfig(host=ip,port=port, user_name=username, - password=password, fetch_size=1024, - time_zone="UTC+8", max_retry=3) -max_pool_size = 5 -wait_timeout_in_ms = 3000 - -# Create the connection pool from the configuration -session_pool = SessionPool(pool_config, max_pool_size, wait_timeout_in_ms) -``` -#### Create a SessionPool from Distributed Nodes -```python -pool_config = PoolConfig(node_urls=["127.0.0.1:6667", "127.0.0.1:6668", "127.0.0.1:6669"], user_name=username, - password=password, fetch_size=1024, - time_zone="UTC+8", max_retry=3) -max_pool_size = 5 -wait_timeout_in_ms = 3000 -``` - -#### Get a Session from the SessionPool and Return It Manually with put_back - -```python -session = session_pool.get_session() -session.set_storage_group(STORAGE_GROUP_NAME) -session.create_time_series( - TIMESERIES_PATH, TSDataType.BOOLEAN, TSEncoding.PLAIN, Compressor.SNAPPY -) -# Return the session with put_back after use -session_pool.put_back(session) -# Closing the SessionPool also closes the sessions it manages -session_pool.close() -``` - -### 3.3 SSL Connection - -#### 3.3.1 Configure the Certificate on the Server - -Find or add the following settings in the `conf/iotdb-system.properties` configuration file: - -```Java 
-enable_thrift_ssl=true -key_store_path=/path/to/your/server_keystore.jks -key_store_pwd=your_keystore_password -``` - -#### 3.3.2 配置 python 客户端证书 - -- 设置 use_ssl 为 True 以启用 SSL。 -- 指定客户端证书路径,使用 ca_certs 参数。 - -```Java -use_ssl = True -ca_certs = "/path/to/your/server.crt" # 或 ca_certs = "/path/to/your//ca_cert.pem" -``` -**示例代码:使用 SSL 连接 IoTDB** - -```Java -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, -# software distributed under the License is distributed on an -# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -# KIND, either express or implied. See the License for the -# specific language governing permissions and limitations -# under the License. 
-# - -from iotdb.SessionPool import PoolConfig, SessionPool -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021" # the default password was root before V2.0.6.x -# Configure SSL enabled -use_ssl = True -# Configure certificate path -ca_certs = "/path/server.crt" - - -def get_data(): - session = Session( - ip, port_, username_, password_, use_ssl=use_ssl, ca_certs=ca_certs - ) - session.open(False) - with session.execute_query_statement("select * from root.eg.etth") as result: - df = result.todf() - df.rename(columns={"Time": "date"}, inplace=True) - session.close() - return df - - -def get_data2(): - pool_config = PoolConfig( - host=ip, - port=port_, - user_name=username_, - password=password_, - fetch_size=1024, - time_zone="UTC+8", - max_retry=3, - use_ssl=use_ssl, - ca_certs=ca_certs, - ) - max_pool_size = 5 - wait_timeout_in_ms = 3000 - session_pool = SessionPool(pool_config, max_pool_size, wait_timeout_in_ms) - session = session_pool.get_session() - with session.execute_query_statement("select * from root.eg.etth") as result: - df = result.todf() - df.rename(columns={"Time": "date"}, inplace=True) - session_pool.put_back(session) - session_pool.close() - return df - - -if __name__ == "__main__": - df = get_data() -``` - -## 4. 
Data Definition Interface (DDL) - -### 4.1 Database Management - -* Set (create) a database - -```python -session.set_storage_group(group_name) -``` - -* Delete one or more databases - -```python -session.delete_storage_group(group_name) -session.delete_storage_groups(group_name_lst) -``` -### 4.2 Time Series Management - -* Create one or more time series - -```python -session.create_time_series(ts_path, data_type, encoding, compressor, - props=None, tags=None, attributes=None, alias=None) - -session.create_multi_time_series( - ts_path_lst, data_type_lst, encoding_lst, compressor_lst, - props_lst=None, tags_lst=None, attributes_lst=None, alias_lst=None -) -``` - -* Create aligned time series - -```python -session.create_aligned_time_series( - device_id, measurements_lst, data_type_lst, encoding_lst, compressor_lst -) -``` - -Note: measurement aliases are **not yet supported**. - -* Delete one or more time series - -```python -session.delete_time_series(paths_list) -``` - -* Check whether a time series exists - -```python -session.check_time_series_exists(path) -``` - -## 5. Data Manipulation Interface (DML) - -### 5.1 Data Insertion - -Using insert_tablet is recommended to improve write efficiency. - -* Insert a Tablet. A Tablet is a block of several rows of data for one device, and every row has the same columns. - * **High write efficiency** - * **Supports writing null values** (since version 0.13) - -The Python API currently has two Tablet implementations. 
- -* Normal Tablet - -```python -values_ = [ - [False, 10, 11, 1.1, 10011.1, "test01"], - [True, 100, 11111, 1.25, 101.0, "test02"], - [False, 100, 1, 188.1, 688.25, "test03"], - [True, 0, 0, 0, 6.25, "test04"], -] -timestamps_ = [1, 2, 3, 4] -tablet_ = Tablet( - device_id, measurements_, data_types_, values_, timestamps_ -) -session.insert_tablet(tablet_) - -values_ = [ - [None, 10, 11, 1.1, 10011.1, "test01"], - [True, None, 11111, 1.25, 101.0, "test02"], - [False, 100, None, 188.1, 688.25, "test03"], - [True, 0, 0, 0, None, None], -] -timestamps_ = [16, 17, 18, 19] -tablet_ = Tablet( - device_id, measurements_, data_types_, values_, timestamps_ -) -session.insert_tablet(tablet_) -``` -* Numpy Tablet - -Compared with a normal Tablet, a Numpy Tablet uses [numpy.ndarray](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html) to store numeric data. -This considerably reduces memory usage and serialization time, and improves write efficiency. - -**Notes** -1. Each column of timestamps and values in the Tablet is stored as one ndarray -2. 
A Numpy Tablet supports only big-endian data. An ndarray built without an explicit data type uses little-endian, so it is recommended to specify the big-endian types shown in the example below when building the ndarray. If you do not, the IoTDB Python client converts the byte order itself, so correctness is not affected. - -```python -import numpy as np -data_types_ = [ - TSDataType.BOOLEAN, - TSDataType.INT32, - TSDataType.INT64, - TSDataType.FLOAT, - TSDataType.DOUBLE, - TSDataType.TEXT, -] -np_values_ = [ - np.array([False, True, False, True], TSDataType.BOOLEAN.np_dtype()), - np.array([10, 100, 100, 0], TSDataType.INT32.np_dtype()), - np.array([11, 11111, 1, 0], TSDataType.INT64.np_dtype()), - np.array([1.1, 1.25, 188.1, 0], TSDataType.FLOAT.np_dtype()), - np.array([10011.1, 101.0, 688.25, 6.25], TSDataType.DOUBLE.np_dtype()), - np.array(["test01", "test02", "test03", "test04"], TSDataType.TEXT.np_dtype()), -] -np_timestamps_ = np.array([1, 2, 3, 4], TSDataType.INT64.np_dtype()) -np_tablet_ = NumpyTablet( - device_id, measurements_, data_types_, np_values_, np_timestamps_ -) -session.insert_tablet(np_tablet_) - -# insert one numpy tablet with None into the database. 
-np_values_ = [ - np.array([False, True, False, True], TSDataType.BOOLEAN.np_dtype()), - np.array([10, 100, 100, 0], TSDataType.INT32.np_dtype()), - np.array([11, 11111, 1, 0], TSDataType.INT64.np_dtype()), - np.array([1.1, 1.25, 188.1, 0], TSDataType.FLOAT.np_dtype()), - np.array([10011.1, 101.0, 688.25, 6.25], TSDataType.DOUBLE.np_dtype()), - np.array(["test01", "test02", "test03", "test04"], TSDataType.TEXT.np_dtype()), -] -np_timestamps_ = np.array([98, 99, 100, 101], TSDataType.INT64.np_dtype()) -np_bitmaps_ = [] -for i in range(len(measurements_)): - np_bitmaps_.append(BitMap(len(np_timestamps_))) -np_bitmaps_[0].mark(0) -np_bitmaps_[1].mark(1) -np_bitmaps_[2].mark(2) -np_bitmaps_[4].mark(3) -np_bitmaps_[5].mark(3) -np_tablet_with_none = NumpyTablet( - device_id, measurements_, data_types_, np_values_, np_timestamps_, np_bitmaps_ -) -session.insert_tablet(np_tablet_with_none) -``` - -* Insert multiple Tablets - -```python -session.insert_tablets(tablet_lst) -``` - -* Insert a Record. A Record 
holds the values of multiple measurements of one device at one timestamp. - -```python -session.insert_record(device_id, timestamp, measurements_, data_types_, values_) -``` - -* Insert multiple Records - -```python -session.insert_records( - device_ids_, time_list_, measurements_list_, data_type_list_, values_list_ - ) -``` - -* Insert multiple Records that belong to the same device - -```python -session.insert_records_of_one_device(device_id, time_list, measurements_list, data_types_list, values_list) -``` - -### 5.2 Insertion with Type Inference - -When all values are of String type, the following interface infers the type from each value. For example, the value "true" is inferred as a boolean and the value "3.2" as a numeric type. Because the server has to perform the inference, this path may take extra time and is slower than insertion without type inference. - -```python -session.insert_str_record(device_id, timestamp, measurements, string_values) -``` - -### 5.3 Insertion into Aligned Time Series - -Aligned time series are written with the insert_aligned_xxx interfaces, which otherwise work like the interfaces above: - -* insert_aligned_record -* insert_aligned_records -* insert_aligned_records_of_one_device -* insert_aligned_tablet -* insert_aligned_tablets - - -## 6. IoTDB-SQL Interface - -* Execute a query statement - -```python -session.execute_query_statement(sql) -``` - -* Execute a non-query statement - -```python -session.execute_non_query_statement(sql) -``` - -* Execute a statement - -```python -session.execute_statement(sql) -``` - - -## 7. Schema Template Interface -### 7.1 Create a Schema Template -1. First construct a Template object -2. Add MeasurementNode child nodes -3. 
Call the interface that creates the schema template - -```python -template = Template(name=template_name, share_time=True) - -m_node_x = MeasurementNode("x", TSDataType.FLOAT, TSEncoding.RLE, Compressor.SNAPPY) -m_node_y = MeasurementNode("y", TSDataType.FLOAT, TSEncoding.RLE, Compressor.SNAPPY) -m_node_z = MeasurementNode("z", TSDataType.FLOAT, TSEncoding.RLE, Compressor.SNAPPY) - -template.add_template(m_node_x) -template.add_template(m_node_y) -template.add_template(m_node_z) - -session.create_schema_template(template) -``` -### 7.2 Modify Template Nodes -The template being modified must already exist. The following functions add measurements to, or delete measurements from, an existing template. -* Add measurements to a template -```python -session.add_measurements_in_template(template_name, measurements_path, data_types, encodings, compressors, is_aligned) -``` - -* Delete a measurement from a template -```python -session.delete_node_in_template(template_name, path) -``` - -### 7.3 Set a Schema Template -```python -session.set_schema_template(template_name, prefix_path) -``` - -### 7.4 Unset a Schema Template -```python -session.unset_schema_template(template_name, prefix_path) -``` - -### 7.5 Show Schema Templates -* Show all schema templates -```python -session.show_all_templates() -``` -* Count the measurements in a schema template -```python -session.count_measurements_in_template(template_name) -``` - -* Check whether a node is a measurement. 
The node must already be in the schema template -```python -session.is_measurement_in_template(template_name, path) -``` - -* Check whether a path exists in a schema template. The path may not be in the template -```python -session.is_path_exist_in_template(template_name, path) -``` - -* Show the measurements under a schema template -```python -session.show_measurements_in_template(template_name) -``` - -* Show the path prefixes on which a schema template is set -```python -session.show_paths_template_set_on(template_name) -``` - -* Show the path prefixes that use a schema template (that is, whose series have been created) -```python -session.show_paths_template_using_on(template_name) -``` - -### 7.6 Drop a Schema Template -Drops an existing schema template. A template that is still set on a path cannot be dropped. -```python -session.drop_schema_template("template_python") -``` - - -## 8. 
Pandas Support - -Query results can easily be converted to a [Pandas Dataframe](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html). - -SessionDataSet has a method `.todf()` that consumes the data in the SessionDataSet and converts it to a pandas dataframe. - -Example: - -```python -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021" # Before V2.0.6.x the default password was root -session = Session(ip, port_, username_, password_) -session.open(False) -with session.execute_query_statement("SELECT ** FROM root") as result: - # Transform to Pandas Dataset - df = result.todf() - -session.close() - -# Now you can work with the dataframe -df = ... -``` - -Since V2.0.8.2, SessionDataSet provides methods for fetching DataFrames in batches, for efficient handling of large query results: - -```python -# Fetch DataFrames in batches -has_next = result.has_next_df() -if has_next: - df = result.next_df() - # Process the DataFrame -``` - -**Method description:** -- `has_next_df()`: returns `True`/`False`, indicating whether more data is available -- `next_df()`: returns a `DataFrame` or `None`, with `fetchSize` rows per call (5000 by default, controlled by the Session's `fetch_size` parameter) - - When at least `fetchSize` rows remain, returns `fetchSize` rows - - When fewer than `fetchSize` rows remain, returns all remaining rows - - When the data has been fully traversed, returns `None` -- `fetchSize` is checked when the Session is initialized; if it is ≤ 0, it is reset to 5000 and a warning is logged - -**Note:** Do not mix different traversal methods (for example, `todf()` together with `next_df()`); doing so causes unexpected errors. 
- -**Usage example:** -```python -from iotdb.Session import Session - -# Initialize the session with fetch_size set to 2 -session = Session( - host="127.0.0.1", port="6667", fetch_size=2 -) -session.open(False) -session.execute_non_query_statement("CREATE DATABASE root.device0") - -# Write three records -session.insert_str_record("root.device0", 123, "pressure", "15.0") -session.insert_str_record("root.device0", 124, "pressure", "15.0") -session.insert_str_record("root.device0", 125, "pressure", "15.0") - -# Query the data as DataFrames -with session.execute_query_statement("SELECT * FROM root.device0") as session_data_set: - while session_data_set.has_next_df(): - df = session_data_set.next_df() - # Prints two dataframes: the first has 2 rows, the second has 1 row - print(df) - -session.close() -``` - - -## 9. 
IoTDB Testcontainer - -Test support in the Python client is based on the `testcontainers` library (https://testcontainers-python.readthedocs.io/en/latest/index.html). To use this feature, install the library in your project. - -To start (and stop) an IoTDB database in a Docker container, simply do: - -```python -class MyTestCase(unittest.TestCase): - - def test_something(self): - with IoTDBContainer() as c: - session = Session("localhost", c.get_exposed_port(6667), "root", "TimechoDB@2021") # Before V2.0.6.x the default password was root - session.open(False) - with session.execute_query_statement("SHOW TIMESERIES") as result: - print(result) - session.close() -``` - -By default it pulls the latest IoTDB image, `apache/iotdb:latest`, for testing. To test a specific IoTDB version, declare it like this: `IoTDBContainer("apache/iotdb:0.12.0")`, which gives you an IoTDB instance of version `0.12.0`. - -## 10. IoTDB DBAPI - -IoTDB DBAPI follows the Python DB API 2.0 specification (https://peps.python.org/pep-0249/) and implements the common interface for accessing a database from Python. 
- -### 10.1 Examples -+ Initialization - -The initialization parameters are the same as those of Session (except for sqlalchemy_mode, which is used only by the SQLAlchemy dialect) -```python -from iotdb.dbapi import connect - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021" # Before V2.0.6.x the default password was root -conn = connect(ip, port_, username_, password_,fetch_size=1024,zone_id="UTC+8",sqlalchemy_mode=False) -cursor = conn.cursor() -``` -+ Execute a simple SQL statement -```python -cursor.execute("SELECT ** FROM root") -for row in cursor.fetchall(): - print(row) -``` - -+ Execute a SQL statement with parameters - -IoTDB DBAPI supports pyformat-style parameters -```python -cursor.execute("SELECT ** FROM root WHERE time < %(time)s",{"time":"2017-11-01T00:08:00.000"}) -for row in cursor.fetchall(): - print(row) -``` - -+ Execute a SQL statement with parameters in batch -```python -seq_of_parameters = [ - {"timestamp": 1, "temperature": 1}, - {"timestamp": 2, "temperature": 2}, - {"timestamp": 3, "temperature": 3}, - {"timestamp": 4, "temperature": 4}, - {"timestamp": 5, "temperature": 5}, -] -sql = "insert into root.cursor(timestamp,temperature) values(%(timestamp)s,%(temperature)s)" -cursor.executemany(sql,seq_of_parameters) -``` - -+ Close the connection -```python -cursor.close() -conn.close() -``` - -## 11. 
IoTDB SQLAlchemy Dialect (Experimental) -The IoTDB SQLAlchemy dialect was written mainly to adapt to Apache Superset. It is still being improved; do not use it in production! -### 11.1 Metadata Model Mapping -SQLAlchemy uses a relational data model, which describes the relationships between entities with tables. -IoTDB uses a hierarchical data model, which organizes data in a tree structure. -To adapt IoTDB to the SQLAlchemy dialect, the original IoTDB data model has to be reorganized -and converted into the SQLAlchemy data model. - -The metadata in IoTDB are: - -1. Database -2. Path: the storage path -3. Entity -4. Measurement - -The metadata in SQLAlchemy are: -1. Schema -2. Table -3. Column - -The mapping between them is: - -| Metadata in SQLAlchemy | Corresponding metadata in IoTDB | -| -------------------- | ---------------------------------------------- | -| Schema | Database | -| Table | Path ( from database to entity ) + Entity | -| Column | Measurement | - -The following figure shows the mapping more clearly: - -![sqlalchemy-to-iotdb](/img/UserGuide/API/IoTDB-SQLAlchemy/sqlalchemy-to-iotdb.png?raw=true) - -### 11.2 Data Type Mapping -| Data type in IoTDB | Data type in SQLAlchemy | -|--------------|-------------------| -| BOOLEAN | Boolean | -| INT32 | Integer | -| INT64 | BigInteger | -| FLOAT | Float | -| DOUBLE | Float | -| TEXT | Text | -| LONG | BigInteger | -### 11.3 Example - -+ Execute a statement - -```python -from sqlalchemy import create_engine - -engine = create_engine("iotdb://root:TimechoDB@2021@127.0.0.1:6667") # Before V2.0.6.x the default password was root -connect = engine.connect() -result = connect.execute("SELECT ** FROM root") -for row in result.fetchall(): - print(row) -``` - -+ ORM (only simple queries are supported for now) - -```python -from sqlalchemy import create_engine, Column, Float, BigInteger, MetaData -from sqlalchemy.ext.declarative import declarative_base -from sqlalchemy.orm import sessionmaker - -metadata = MetaData( - 
schema='root.factory' -) -Base = declarative_base(metadata=metadata) - - -class Device(Base): - __tablename__ = "room2.device1" - Time = Column(BigInteger, primary_key=True) - temperature = Column(Float) - status = Column(Float) - - -engine = create_engine("iotdb://root:TimechoDB@2021@127.0.0.1:6667") # Before V2.0.6.x the default password was root - -DbSession = sessionmaker(bind=engine) -session = DbSession() - -res = 
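session.query(Device.status).filter(Device.temperature > 1) - -for row in res: - print(row) -``` - -## 12. For Developers - -### 12.1 Introduction - -This is an example of connecting to IoTDB through the thrift rpc interface. The steps are almost identical on Windows and Linux, but watch for differences such as the path separator. - -### 12.2 Dependencies - -Python 3.7 or later is preferred. - -thrift (0.11.0 or later) must be installed to compile the thrift files into Python code. The official installation guide is linked below; in the end you should have a thrift executable. - -``` -http://thrift.apache.org/docs/install/ -``` - -Before you start, you also need to install the other dependencies listed in `requirements_dev.txt` into your Python environment: -```shell -pip install -r requirements_dev.txt -``` - -### 12.3 Compile the thrift Library and Debug - -In the root directory of the IoTDB source code, run `mvn clean generate-sources -pl iotdb-client/client-py -am`. - -This command automatically deletes the files in `iotdb/thrift` and repopulates the folder with the newly generated thrift files. - -This folder is ignored by git and **should never be pushed to git!** - -**Note**: do not upload `iotdb/thrift` to the git repository! - -### 12.4 Session Client & Usage Example - -The thrift interface is packaged in `client-py/src/iotdb/session.py ` (similar to the Java version), and an example file, `client-py/src/SessionExample.py`, shows how to use the Session module. Please read it carefully. 
- -Another simple example: - -```python -from iotdb.Session import Session - -ip = "127.0.0.1" -port_ = "6667" -username_ = "root" -password_ = "TimechoDB@2021" # Before V2.0.6.x the default password was root -session = Session(ip, port_, username_, password_) -session.open(False) -zone = session.get_time_zone() -session.close() -``` - -### 12.5 Tests - -Please add your custom tests to the `tests` folder. - -To run all tests, simply run `pytest . ` in the root directory. - -**Note**: some tests require docker on your system, because the IoTDB instance under test is started in a docker container using [testcontainers](https://testcontainers-python.readthedocs.io/en/latest/index.html). - -### 12.6 Other Tools - -[black](https://pypi.org/project/black/) and [flake8](https://pypi.org/project/flake8/) are used for auto-formatting and linting, respectively. -They can be run with `black .` and `flake8 .`, respectively. - -## 13. Releasing - -To make a release, - -first make sure you have generated the correct thrift code, - -run the linting and the auto-formatting, - -then make sure all tests pass (via `pytest . 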
`), - -and finally you can publish the package to pypi. - -### 13.1 Prepare Your Environment - -First, install all required development dependencies with `pip install -r requirements_dev.txt`. - -### 13.2 Release - -The script `release.sh` performs all the steps of a release. - -These steps are: - -* Remove all temporary directories (if they exist) - -* (Re-)generate all required sources via mvn - -* Run linting (flake8) - -* Run the tests via pytest - -* Build - -* Release to pypi diff --git a/src/zh/UserGuide/Master/Tree/API/RestServiceV1_timecho.md b/src/zh/UserGuide/Master/Tree/API/RestServiceV1_timecho.md deleted file mode 100644 index 239d57ee7..000000000 --- a/src/zh/UserGuide/Master/Tree/API/RestServiceV1_timecho.md +++ /dev/null @@ -1,965 +0,0 @@ - - -# REST API V1 (Not Recommended) -IoTDB's RESTful service can be used for query, write, and management operations. It uses the OpenAPI standard to define the interfaces and generate the framework. - -Note: Since V2.0.8.2, the TimechoDB installation package does not include the REST service JAR by default. Before using this service, contact the Timecho team to obtain the JAR, and place it under timechodb_home/lib or timechodb_home/ext/external_service. - -## 1. Enable the RESTful Service -The RESTful service is disabled by default. - - Find the `conf/iotdb-system.properties` file in the IoTDB installation directory and set `enable_rest_service` to `true` to enable the module. - - ```properties - enable_rest_service=true - ``` - -## 2. Authentication -Except for the liveness endpoint `/ping`, the RESTful service uses Basic authentication, and every request must carry `Authorization` information in its header. - -1. Authentication format - -```JSON -Authorization: Basic -``` - -The value that follows `Basic` is the Base64 encoding of the string `username:password`. 
It can be generated quickly as follows: - -* Linux/macOS - -```Bash -echo -n "your_username:your_password" | base64 -eg: echo -n "root:TimechoDB@2021" | base64 -``` - -* Windows - -```Bash -# PowerShell -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("username:password")) -eg: [Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("root:TimechoDB@2021")) - -# CMD -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"username:password\"))" -eg: powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"root:TimechoDB@2021\"))" -``` - -2. Authentication example - -With the default username `root` and password `TimechoDB@2021`: - -* Concatenated string: `root:TimechoDB@2021` -* After Base64 encoding: `cm9vdDpUaW1lY2hvREJAMjAyMQ==` -* Final header: - -```JSON -Authorization: Basic cm9vdDpUaW1lY2hvREJAMjAyMQ== -``` - -3. 
Error description -* Wrong username or password: returns HTTP status code `600` with the body: - -```JSON -{"code":600,"message":"WRONG_LOGIN_PASSWORD_ERROR"} -``` - -* `Authorization` not set: returns HTTP status code `603` with the body: - -```JSON -{"code":603,"message":"UNINITIALIZED_AUTH_ERROR"} -``` - -## 3. Interfaces - -### 3.1 ping - -The ping endpoint can be used as a liveness check for the online service. - -Request method: `GET` - -Request path: `http://ip:port/ping` - -Request example: - -```shell -$ curl http://127.0.0.1:18080/ping -``` - -Returned HTTP status codes: - -- `200`: the service is working normally and can accept external requests. -- `503`: the service has encountered a problem and cannot accept external requests. - -Response parameters: - -|Parameter |Type |Description| -| ------------ | ------------ | ------------| -| code | integer | Status code | -| message | string | Message | - -Response examples: - -- When the HTTP status code is `200`: - - ```json - { - "code": 200, - "message": "SUCCESS_STATUS" - } - ``` - -- When the HTTP status code is `503`: - - ```json - { - "code": 500, - "message": "thrift service is unavailable" - } - ``` - -> The `/ping` endpoint does not require authentication. - -### 3.2 query - -The query endpoint handles data queries and metadata queries. - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v1/query` - -Parameters: - -| Parameter |Type |Required|Description| -|-----------| ------------ | ------------ |------------ | -| sql | string | Yes | | -| rowLimit | integer | No | Maximum number of rows that a single query can return.
If this parameter is not set, the `rest_query_default_row_size_limit` value from the configuration file is used as the default.
When the number of rows in the result set exceeds the limit, status code `411` is returned. | - -Response parameters: - -| Parameter |Type |Description| -|--------------| ------------ | ------------| -| expressions | array | Array of result-set column names for data queries; `null` for metadata queries| -| columnNames | array | Array of result-set column names for metadata queries; `null` for data queries | -| timestamps | array | Timestamp column; `null` for metadata queries | -| values |array|Two-dimensional array. The first dimension has the same length as the column-name array, and each second-dimension array is one column of the result set| - -Request examples are shown below: - -Tip: to avoid OOM problems, queries of the form select * from root.xx.** are not recommended. - -1. Request example (expression query): - ```shell - curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select s3, s4, s3 + 1 from root.sg27 limit 2"}' http://127.0.0.1:18080/rest/v1/query -``` - - - Response example: - -```json -{ - "expressions": [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg27.s3 + 1" - ], - "columnNames": null, - "timestamps": [ - 1635232143960, - 1635232153960 - ], - "values": [ - [ - 11, - null - ], - [ - false, - true - ], - [ - 12.0, - null - ] - ] -} -``` - -2. Request example (show child paths): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show child paths root"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "child paths" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ] - ] -} -``` - -3. 
Request example (show child nodes): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show child nodes root"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "child nodes" - ], - "timestamps": null, - "values": [ - [ - "sg27", - "sg28" - ] - ] -} -``` - -4. 
Request example (show all ttl): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show all ttl"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - null, - null - ] - ] -} -``` - -5. Request example (show ttl): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show ttl on root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27" - ], - [ - null - ] - ] -} -``` - -6. Request example (show functions): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show functions"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "function name", - "function type", - "class name (UDF)" - ], - "timestamps": null, - "values": [ - [ - "ABS", - "ACOS", - "ASIN", - ... - ], - [ - "built-in UDTF", - "built-in UDTF", - "built-in UDTF", - ... - ], - [ - "org.apache.iotdb.db.query.udf.builtin.UDTFAbs", - "org.apache.iotdb.db.query.udf.builtin.UDTFAcos", - "org.apache.iotdb.db.query.udf.builtin.UDTFAsin", - ... - ] - ] -} -``` - -7. 
Request example (show timeseries): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show timeseries"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg28.s3", - "root.sg28.s4" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg27", - "root.sg27", - "root.sg28", - "root.sg28" - ], - [ - "INT32", - "BOOLEAN", - "INT32", - "BOOLEAN" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -8. Request example (show latest timeseries): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show latest timeseries"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg28.s4", - "root.sg27.s4", - "root.sg28.s3", - "root.sg27.s3" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg28", - "root.sg27", - "root.sg28", - "root.sg27" - ], - [ - "BOOLEAN", - "BOOLEAN", - "INT32", - "INT32" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -9. 
Request example (count timeseries): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"count timeseries root.**"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -10. Request example (count nodes): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"count nodes root.** level=2"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -11. Request example (show devices): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show devices"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "devices", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -12. Request example (show devices with database): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show devices with database"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "devices", - "database", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -13. 
Request example (list user): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"list user"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "user" - ], - "timestamps": null, - "values": [ - [ - "root" - ] - ] -} -``` - -14. 
Request example (raw aggregation query): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(*) from root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "columnNames": null, - "timestamps": [ - 0 - ], - "values": [ - [ - 1 - ], - [ - 2 - ] - ] -} -``` - -15. Request example (group by level): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(*) from root.** group by level = 1"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "count(root.sg27.*)", - "count(root.sg28.*)" - ], - "timestamps": null, - "values": [ - [ - 3 - ], - [ - 3 - ] - ] -} -``` - -16. Request example (group by): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(*) from root.sg27 group by([1635232143960,1635232153960),1s)"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "columnNames": null, - "timestamps": [ - 1635232143960, - 1635232144960, - 1635232145960, - 1635232146960, - 1635232147960, - 1635232148960, - 1635232149960, - 1635232150960, - 1635232151960, - 1635232152960 - ], - "values": [ - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ], - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ] - ] -} -``` - -17. 
Request example (last): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select last s3 from root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "expressions": null, - "columnNames": [ - "timeseries", - "value", - "dataType" - ], - "timestamps": [ - 1635232143960 - ], - "values": [ - [ - "root.sg27.s3" - ], - [ - "11" - ], - [ - "INT32" - ] - ] -} -``` - -18. 
Request example (disable align): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select * from root.sg27 disable align"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "code": 407, - "message": "disable align clauses are not supported." -} -``` - -19. Request example (align by device): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(s3) from root.sg27 align by device"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "code": 407, - "message": "align by device clauses are not supported." -} -``` - -20. Request example (select into): -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select s3, s4 into root.sg29.s1, root.sg29.s2 from root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -- Response example: - -```json -{ - "code": 407, - "message": "select into clauses are not supported." 
-} -``` - -### 3.3 nonQuery - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v1/nonQuery` - -Parameters: - -|Parameter |Type |Required|Description| -| ------------ | ------------ | ------------ |------------ | -| sql | string | Yes | | - -Request example: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"CREATE DATABASE root.ln"}' http://127.0.0.1:18080/rest/v1/nonQuery -``` - -Response parameters: - -|Parameter |Type |Description| -| ------------ | ------------ | ------------| -| code | integer | Status code | -| message | string | Message | - -Response example: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - - -### 3.4 insertTablet - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v1/insertTablet` - -Parameters: - -| Parameter |Type |Required|Description| -|--------------| ------------ | ------------ |------------ | -| timestamps | array | Yes | Time column | -| measurements | array | Yes | Measurement names | -| dataTypes | array | Yes | Data types | -| values | array | Yes | Value columns; values within a column may be `null` | 
-| isAligned | boolean | Yes | Whether the time series are aligned | -| deviceId | string | Yes | Device name | - -Request example: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"timestamps":[1635232143960,1635232153960],"measurements":["s3","s4"],"dataTypes":["INT32","BOOLEAN"],"values":[[11,null],[false,true]],"isAligned":false,"deviceId":"root.sg27"}' http://127.0.0.1:18080/rest/v1/insertTablet -``` - -Response parameters: - -|Parameter |Type |Description| -| ------------ | ------------ | ------------| -| code | integer | Status code | -| message | string | Message | - -Response example: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - -## 4. Configuration - -The configuration is located in `iotdb-system.properties`. - - - -* Set `enable_rest_service` to `true` to enable the module, or to `false` to disable it. The default value is `false`. - -```properties -enable_rest_service=true -``` - -* Takes effect only when `enable_rest_service=true`. Set `rest_service_port` to a number (1025~65535) to customize the REST service socket port. The default value is `18080`. - -```properties -rest_service_port=18080 -``` - -* Set `enable_swagger` to `true` to enable Swagger for displaying REST interface information, or to `false` to disable it. 
The default value is `false`. - -```properties -enable_swagger=false -``` - -* Maximum number of rows that a single query can return. When the result set exceeds this limit, only the rows within the limit are returned, together with status code `411`. - -```properties -rest_query_default_row_size_limit=10000 -``` - -* Expiration time of cached user login information (used to speed up user authentication; in seconds, 8 hours by default) - -```properties -cache_expire=28800 -``` - -* Maximum number of users stored in the cache (100 by default) - -```properties -cache_max_num=100 -``` - -* Initial cache capacity (10 by default) - -```properties -cache_init_num=10 -``` - -* Whether the REST service enables SSL. Set `enable_https` to `true` to enable it, or to `false` to disable it. The default value is `false`. - -```properties -enable_https=false -``` - -* Path of the keyStore (optional) - -```properties -key_store_path= -``` - - -* Password of the keyStore (optional) - -```properties -key_store_pwd= -``` - - -* Path of the trustStore (optional) - -```properties -trust_store_path= -``` - -* Password of the trustStore (optional) - -```properties -trust_store_pwd= -``` - - -* SSL timeout, in seconds - -```properties -idle_timeout=5000 -``` diff --git a/src/zh/UserGuide/Master/Tree/API/RestServiceV2_timecho.md b/src/zh/UserGuide/Master/Tree/API/RestServiceV2_timecho.md deleted file mode 100644 index 629c40fb4..000000000 --- a/src/zh/UserGuide/Master/Tree/API/RestServiceV2_timecho.md +++ /dev/null @@ -1,1004 +0,0 @@ - - -# REST API V2 -IoTDB's RESTful service can be used for query, write, and management operations. It uses the OpenAPI standard to define the interfaces and generate the framework. - -Note: Since V2.0.8.2, the TimechoDB installation package does not include the REST service JAR by default. Before using this service, contact the Timecho team to obtain the JAR, and place it under timechodb_home/lib or timechodb_home/ext/external_service. - -## 1. 
Enable the RESTful Service -The RESTful service is disabled by default. - - Find the `conf/iotdb-system.properties` file in the IoTDB installation directory and set `enable_rest_service` to `true` to enable the module. - - ```properties - enable_rest_service=true - ``` - -## 2. Authentication -Except for the liveness endpoint `/ping`, the RESTful service uses Basic authentication, and every request must carry `Authorization` information in its header. - -1. 
Authentication format - -```JSON -Authorization: Basic -``` - -The value that follows `Basic` is the Base64 encoding of the string `username:password`. It can be generated quickly as follows: - -* Linux/macOS - -```Bash -echo -n "your_username:your_password" | base64 -eg: echo -n "root:TimechoDB@2021" | base64 -``` - -* Windows - -```Bash -# PowerShell -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("username:password")) -eg: [Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("root:TimechoDB@2021")) - -# CMD -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"username:password\"))" -eg: powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"root:TimechoDB@2021\"))" -``` - -2. Authentication example - -With the default username `root` and password `TimechoDB@2021`: - -* Concatenated string: `root:TimechoDB@2021` -* After Base64 encoding: `cm9vdDpUaW1lY2hvREJAMjAyMQ==` -* Final header: - -```JSON -Authorization: Basic cm9vdDpUaW1lY2hvREJAMjAyMQ== -``` - -3. Error description -* Wrong username or password: returns HTTP status code `600` with the body: - -```JSON -{"code":600,"message":"WRONG_LOGIN_PASSWORD_ERROR"} -``` - -* `Authorization` not set: returns HTTP status code `603` with the body: - -```JSON -{"code":603,"message":"UNINITIALIZED_AUTH_ERROR"} -``` - -## 3. Interfaces - -### 3.1 ping - -The ping endpoint can be used as a liveness check for the online service. - -Request method: `GET` - -Request path: http://ip:port/ping - -Request example: - -```shell -$ curl http://127.0.0.1:18080/ping -``` - -Returned HTTP status codes: - -- `200`: the service is working normally and can accept external requests. -- `503`: the service has encountered a problem and cannot accept external requests. 
- -Response parameters: - -|Parameter |Type |Description| -| ------------ | ------------ | ------------| -| code | integer | Status code | -| message | string | Message | - -Response examples: - -- When the HTTP status code is `200`: - - ```json - { - "code": 200, - "message": "SUCCESS_STATUS" - } - ``` - -- When the HTTP status code is `503`: - - ```json - { - "code": 500, - "message": "thrift service is unavailable" - } - ``` - -> The `/ping` endpoint does not require authentication. - -### 3.2 query - -The query endpoint handles data queries and metadata queries. - -Request method: `POST` - -Request header: `application/json` - -Request path: `http://ip:port/rest/v2/query` - -Parameters: - -| Parameter |Type |Required|Description| -|-----------| ------------ | ------------ |------------ | -| sql | string | Yes | | -| row_limit | integer | No | Maximum number of rows that a single query can return.
If this parameter is not set, the `rest_query_default_row_size_limit` value from the configuration file is used as the default.
当返回结果集的行数超出限制时,将返回状态码 `411`。 | - -响应参数: - -| 参数名称 |参数类型 |参数描述| -|--------------| ------------ | ------------| -| expressions | array | 用于数据查询时结果集列名的数组,用于元数据查询时为`null`| -| column_names | array | 用于元数据查询结果集列名数组,用于数据查询时为`null` | -| timestamps | array | 时间戳列,用于元数据查询时为`null` | -| values |array|二维数组,第一维与结果集列名数组的长度相同,第二维数组代表结果集的一列| - -请求示例如下所示: - -提示:为了避免OOM问题,不推荐使用select * from root.xx.** 这种查找方式。 - -1. 请求示例 表达式查询: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select s3, s4, s3 + 1 from root.sg27 limit 2"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg27.s3 + 1" - ], - "column_names": null, - "timestamps": [ - 1635232143960, - 1635232153960 - ], - "values": [ - [ - 11, - null - ], - [ - false, - true - ], - [ - 12.0, - null - ] - ] -} -``` - -2.请求示例 show child paths: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show child paths root"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "child paths" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ] - ] -} -``` - -3. 请求示例 show child nodes: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show child nodes root"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "child nodes" - ], - "timestamps": null, - "values": [ - [ - "sg27", - "sg28" - ] - ] -} -``` - -4. 
请求示例 show all ttl: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show all ttl"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - null, - null - ] - ] -} -``` - -5. 请求示例 show ttl: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show ttl on root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27" - ], - [ - null - ] - ] -} -``` - -6. 请求示例 show functions: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show functions"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "function name", - "function type", - "class name (UDF)" - ], - "timestamps": null, - "values": [ - [ - "ABS", - "ACOS", - "ASIN", - ... - ], - [ - "built-in UDTF", - "built-in UDTF", - "built-in UDTF", - ... - ], - [ - "org.apache.iotdb.db.query.udf.builtin.UDTFAbs", - "org.apache.iotdb.db.query.udf.builtin.UDTFAcos", - "org.apache.iotdb.db.query.udf.builtin.UDTFAsin", - ... - ] - ] -} -``` - -7. 
请求示例 show timeseries: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show timeseries"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg28.s3", - "root.sg28.s4" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg27", - "root.sg27", - "root.sg28", - "root.sg28" - ], - [ - "INT32", - "BOOLEAN", - "INT32", - "BOOLEAN" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -8. 请求示例 show latest timeseries: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show latest timeseries"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg28.s4", - "root.sg27.s4", - "root.sg28.s3", - "root.sg27.s3" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg28", - "root.sg27", - "root.sg28", - "root.sg27" - ], - [ - "BOOLEAN", - "BOOLEAN", - "INT32", - "INT32" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -9. 
请求示例 count timeseries: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"count timeseries root.**"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -10. 请求示例 count nodes: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"count nodes root.** level=2"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -11. 请求示例 show devices: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show devices"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "devices", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -12. 请求示例 show devices with database: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show devices with database"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "devices", - "database", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -13. 
请求示例 list user: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"list user"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "user" - ], - "timestamps": null, - "values": [ - [ - "root" - ] - ] -} -``` - -14. 请求示例 原始聚合查询: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(*) from root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "column_names": null, - "timestamps": [ - 0 - ], - "values": [ - [ - 1 - ], - [ - 2 - ] - ] -} -``` - -15. 请求示例 group by level: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(*) from root.** group by level = 1"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "count(root.sg27.*)", - "count(root.sg28.*)" - ], - "timestamps": null, - "values": [ - [ - 3 - ], - [ - 3 - ] - ] -} -``` - -16. 请求示例 group by: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(*) from root.sg27 group by([1635232143960,1635232153960),1s)"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "column_names": null, - "timestamps": [ - 1635232143960, - 1635232144960, - 1635232145960, - 1635232146960, - 1635232147960, - 1635232148960, - 1635232149960, - 1635232150960, - 1635232151960, - 1635232152960 - ], - "values": [ - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ], - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ] - ] -} -``` - -17. 
请求示例 last: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select last s3 from root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "timeseries", - "value", - "dataType" - ], - "timestamps": [ - 1635232143960 - ], - "values": [ - [ - "root.sg27.s3" - ], - [ - "11" - ], - [ - "INT32" - ] - ] -} -``` - -18. 请求示例 disable align: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select * from root.sg27 disable align"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "code": 407, - "message": "disable align clauses are not supported." -} -``` - -19. 请求示例 align by device: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(s3) from root.sg27 align by device"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "code": 407, - "message": "align by device clauses are not supported." -} -``` - -20. 请求示例 select into: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select s3, s4 into root.sg29.s1, root.sg29.s2 from root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "code": 407, - "message": "select into clauses are not supported." 
-} -``` - -### 3.3 nonQuery - -请求方式:`POST` - -请求头:`application/json` - -请求路径:`http://ip:port/rest/v2/nonQuery` - -参数说明: - -|参数名称 |参数类型 |是否必填|参数描述| -| ------------ | ------------ | ------------ |------------ | -| sql | string | 是 | | - -请求示例: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"CREATE DATABASE root.ln"}' http://127.0.0.1:18080/rest/v2/nonQuery -``` - -响应参数: - -|参数名称 |参数类型 |参数描述| -| ------------ | ------------ | ------------| -| code | integer | 状态码 | -| message | string | 信息提示 | - -响应示例: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - - -### 3.4 insertTablet - -请求方式:`POST` - -请求头:`application/json` - -请求路径:`http://ip:port/rest/v2/insertTablet` - -参数说明: - -| 参数名称 |参数类型 |是否必填|参数描述| -|--------------| ------------ | ------------ |------------ | -| timestamps | array | 是 | 时间列 | -| measurements | array | 是 | 测点名称 | -| data_types | array | 是 | 数据类型 | -| values | array | 是 | 值列,每一列中的值可以为 `null` | -| is_aligned | boolean | 是 | 是否是对齐时间序列 | -| device | string | 是 | 设备名称 | - -请求示例: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"timestamps":[1635232143960,1635232153960],"measurements":["s3","s4"],"data_types":["INT32","BOOLEAN"],"values":[[11,null],[false,true]],"is_aligned":false,"device":"root.sg27"}' http://127.0.0.1:18080/rest/v2/insertTablet -``` - -响应参数: - -|参数名称 |参数类型 |参数描述| -| ------------ | ------------ | ------------| -| code | integer | 状态码 | -| message | string | 信息提示 | - -响应示例: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - -### 3.5 insertRecords - -请求方式:`POST` - -请求头:`application/json` - -请求路径:`http://ip:port/rest/v2/insertRecords` - -参数说明: - -| 参数名称 |参数类型 |是否必填|参数描述| -|-------------------| ------------ | ------------ |------------ | -| timestamps | array | 是 | 时间列 | -| measurements_list | array | 是 | 测点名称 | -| data_types_list | array | 是 | 数据类型 | -| 
values_list | array | 是 | 值列,每一列中的值可以为 `null` | -| devices | string | 是 | 设备名称 | -| is_aligned | string | 是 | 是否是对齐时间序列 | - -请求示例: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"timestamps":[1635232113960,1635232151960,1635232143960,1635232143960],"measurements_list":[["s33","s44"],["s55","s66"],["s77","s88"],["s771","s881"]],"data_types_list":[["INT32","INT64"],["FLOAT","DOUBLE"],["FLOAT","DOUBLE"],["BOOLEAN","TEXT"]],"values_list":[[1,11],[2.1,2],[4,6],[false,"cccccc"]],"is_aligned":false,"devices":["root.s1","root.s1","root.s1","root.s3"]}' http://127.0.0.1:18080/rest/v2/insertRecords -``` - -响应参数: - -|参数名称 |参数类型 |参数描述| -| ------------ | ------------ | ------------| -| code | integer | 状态码 | -| message | string | 信息提示 | - -响应示例: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - -## 4. 配置 - -配置位于 `iotdb-system.properties` 中。 - - - -* 将 `enable_rest_service` 设置为 `true` 以启用该模块,而将 `false` 设置为禁用该模块。默认情况下,该值为 `false`。 - -```properties -enable_rest_service=true -``` - -* 仅在 `enable_rest_service=true` 时生效。将 `rest_service_port `设置为数字(1025~65535),以自定义REST服务套接字端口。默认情况下,值为 `18080`。 - -```properties -rest_service_port=18080 -``` - -* 将 'enable_swagger' 设置 'true' 启用swagger来展示rest接口信息, 而设置为 'false' 关闭该功能. 
默认情况下,该值为 `false`。 - -```properties -enable_swagger=false -``` - -* 一次查询能返回的结果集最大行数。当返回结果集的行数超出参数限制时,您只会得到在行数范围内的结果集,且将得到状态码`411`。 - -```properties -rest_query_default_row_size_limit=10000 -``` - -* 缓存客户登录信息的过期时间(用于加速用户鉴权的速度,单位为秒,默认是8个小时) - -```properties -cache_expire=28800 -``` - -* 缓存中存储的最大用户数量(默认是100) - -```properties -cache_max_num=100 -``` - -* 缓存初始容量(默认是10) - -```properties -cache_init_num=10 -``` - -* REST Service 是否开启 SSL 配置,将 `enable_https` 设置为 `true` 以启用该模块,而将 `false` 设置为禁用该模块。默认情况下,该值为 `false`。 - -```properties -enable_https=false -``` - -* keyStore 所在路径(非必填) - -```properties -key_store_path= -``` - - -* keyStore 密码(非必填) - -```properties -key_store_pwd= -``` - - -* trustStore 所在路径(非必填) - -```properties -trust_store_path= -``` - -* trustStore 密码(非必填) - -```properties -trust_store_pwd= -``` - - -* SSL 超时时间,单位为秒 - -```properties -idle_timeout=5000 -``` diff --git a/src/zh/UserGuide/Master/Tree/Background-knowledge/Cluster-Concept_timecho.md b/src/zh/UserGuide/Master/Tree/Background-knowledge/Cluster-Concept_timecho.md deleted file mode 100644 index b0462f0dd..000000000 --- a/src/zh/UserGuide/Master/Tree/Background-knowledge/Cluster-Concept_timecho.md +++ /dev/null @@ -1,132 +0,0 @@ - - -# 常见概念 - -## 1. 数据模型相关概念 - -### 1.1 数据模型(sql_dialect) - -IoTDB 支持两种时序数据模型(SQL语法),管理的对象均为设备和测点树:以层级路径的方式管理数据,一条路径对应一个设备的一个测点表;以关系表的方式管理数据,一张表对应一类设备。 - -### 1.2 元数据(Schema) - -元数据是数据库的数据模型信息,即树形结构或表结构。包括测点的名称、数据类型等定义。 - -### 1.3 设备(Device) - -对应一个实际场景中的物理设备,通常包含多个测点。 - -### 1.4 测点(Timeseries) - -又名:物理量、时间序列、时间线、点位、信号量、指标、测量值等。
-测点是多个数据点按时间戳递增排列形成的一个时间序列。通常一个测点代表一个采集点位,能够定期采集所在环境的物理量。 - -### 1.5 编码(Encoding) - -编码是一种压缩技术,将数据以二进制的形式进行表示,可以提高存储效率。IoTDB 支持多种针对不同类型的数据的编码方法,详细信息请查看:[压缩和编码](../Technical-Insider/Encoding-and-Compression.md)。 - -### 1.6 压缩(Compression) - -IoTDB 在数据编码后,使用压缩技术进一步压缩二进制数据,提升存储效率。IoTDB 支持多种压缩方法,详细信息请查看:[压缩和编码](../Technical-Insider/Encoding-and-Compression.md)。 - -## 2. 分布式相关概念 - -下图展示了一个常见的 IoTDB 3C3D(3 个 ConfigNode、3 个 DataNode)的集群部署模式: - - - -IoTDB 的集群包括如下常见概念: - -- 节点(ConfigNode、DataNode、AINode) -- Region(SchemaRegion、DataRegion) -- 多副本 - -下文将对以上概念进行介绍。 - - -### 2.1 节点 - -IoTDB 集群包括三种节点(进程):ConfigNode(管理节点),DataNode(数据节点)和 AINode(分析节点),如下所示: - -- ConfigNode:管理集群的节点信息、配置信息、用户权限、元数据、分区信息等,负责分布式操作的调度和负载均衡,所有 ConfigNode 之间互为全量备份,如上图中的 ConfigNode-1,ConfigNode-2 和 ConfigNode-3 所示。 -- DataNode:服务客户端请求,负责数据的存储和计算,如上图中的 DataNode-1,DataNode-2 和 DataNode-3 所示。 -- AINode:负责提供机器学习能力,支持注册已训练好的机器学习模型,并通过 SQL 调用模型进行推理,目前已内置自研时序大模型和常见的机器学习算法(如预测与异常检测)。 - -### 2.2 数据分区 - -在 IoTDB 中,元数据和数据都被分为小的分区,即 Region,由集群的各个 DataNode 进行管理。 - -- SchemaRegion:元数据分区,管理一部分设备和测点的元数据。不同 DataNode 相同 RegionID 的 SchemaRegion 互为副本,如上图中 SchemaRegion-1 拥有三个副本,分别放置于 DataNode-1,DataNode-2 和 DataNode-3。 -- DataRegion:数据分区,管理一部分设备的一段时间的数据。不同 DataNode 相同 RegionID 的 DataRegion 互为副本,如上图中 DataRegion-2 拥有两个副本,分别放置于 DataNode-1 和 DataNode-2。 -- 具体分区算法可参考:[数据分区](../Technical-Insider/Cluster-data-partitioning.md) - -### 2.3 多副本 - -数据和元数据的副本数可配置,不同部署模式下的副本数推荐如下配置,其中多副本时可提供高可用服务。 - -| 类别 | 配置项 | 单机推荐配置 | 集群推荐配置 | -| :----- | :------------------------ | :----------- | :----------- | -| 元数据 | schema_replication_factor | 1 | 3 | -| 数据 | data_replication_factor | 1 | 2 | - - -## 3. 
部署相关概念 - -IoTDB 有三种运行模式:单机模式、集群模式和双活模式。 - -### 3.1 单机模式 - -IoTDB单机实例包括 1 个ConfigNode、1个DataNode,即1C1D; - -- **特点**:便于开发者安装部署,部署和维护成本较低,操作方便。 -- **适用场景**:资源有限或对高可用要求不高的场景,例如边缘端服务器。 -- **部署方法**:[单机版部署](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -### 3.2 双活模式 - -双活版部署为 TimechoDB 企业版功能,是指两个独立的实例进行双向同步,能同时对外提供服务。当一台停机重启后,另一个实例会将缺失数据断点续传。 - -> IoTDB 双活实例通常为2个单机节点,即2套1C1D。每个实例也可以为集群。 - -- **特点**:资源占用最低的高可用解决方案。 -- **适用场景**:资源有限(仅有两台服务器),但希望获得高可用能力。 -- **部署方法**:[双活版部署](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -### 3.3 集群模式 - -IoTDB 集群实例为 3 个ConfigNode 和不少于 3 个 DataNode,通常为 3 个 DataNode,即3C3D;当部分节点出现故障时,剩余节点仍然能对外提供服务,保证数据库服务的高可用性,且可随节点增加提升数据库性能。 - -- **特点**:具有高可用性、高扩展性,可通过增加 DataNode 提高系统性能。 -- **适用场景**:需要提供高可用和可靠性的企业级应用场景。 -- **部署方法**:[集群版部署](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - -### 3.4 特点总结 - -| 维度 | 单机模式 | 双活模式 | 集群模式 | -| ------------ | ---------------------------- | ------------------------ | ------------------------ | -| 适用场景 | 边缘侧部署、对高可用要求不高 | 高可用性业务、容灾场景等 | 高可用性业务、容灾场景等 | -| 所需机器数量 | 1 | 2 | ≥3 | -| 安全可靠性 | 无法容忍单点故障 | 高,可容忍单点故障 | 高,可容忍单点故障 | -| 扩展性 | 可扩展 DataNode 提升性能 | 每个实例可按需扩展 | 可扩展 DataNode 提升性能 | -| 性能 | 可随 DataNode 数量扩展 | 与其中一个实例性能相同 | 可随 DataNode 数量扩展 | - -- 单机模式和集群模式,部署步骤类似(逐个增加 ConfigNode 和 DataNode),仅副本数和可提供服务的最少节点数不同。 \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/Background-knowledge/Data-Model-and-Terminology_timecho.md b/src/zh/UserGuide/Master/Tree/Background-knowledge/Data-Model-and-Terminology_timecho.md deleted file mode 100644 index c56987874..000000000 --- a/src/zh/UserGuide/Master/Tree/Background-knowledge/Data-Model-and-Terminology_timecho.md +++ /dev/null @@ -1,395 +0,0 @@ - - -# 建模方案设计 - -本章节主要介绍如何将时序数据应用场景转化为IoTDB时序建模。 - -## 1. 时序数据模型 - -在构建IoTDB建模方案前,需要先了解时序数据和时序数据模型,详细内容见此页面:[时序数据模型](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - -## 2. 
IoTDB 的树表孪生模型 - -IoTDB 提供了树表孪生模型的方式,其特点分别如下: - -**树模型**:以测点为对象进行管理,每个测点对应一条时间序列,测点名按`.`分割可形成一个树形目录结构,与物理世界一一对应,对测点的读写操作简单直观。 - -> 1. 数据建模时,为了足够的性能要求,建议数据路径(Path)的倒数第二层节点(对应设备数量)不少于 1000 个,且设备数量与并发处理能力挂钩,设备数量充足时,并发读写效率更优。 -若遇到“设备数量较少但单设备测点数量较多”的场景(如仅 3 台设备,每台设备含 10000 个测点),推荐在最后层级新增 `.value` ,以此提升倒数第二层节点总数,示例:`root.db.device01.metric.value`。 -> 2. 在构建树模型[路径](../Basic-Concept/Operate-Metadata_timecho.md#4-路径查询)时,节点命名若存在包含非标准字符或特殊符号的可能性,则建议对所有层级节点实施反引号封装策略。这样可以有效规避因字符解析异常导致的测点注册失败及数据写入中断问题,确保路径标识符在语法解析层面的准确性。 - -**表模型**:推荐为每类设备创建一张表,同类设备的物理量采集都具备一定共性(如都采集温度和湿度物理量),数据分析灵活丰富。 - -### 2.1 模型特点 - -树表孪生模型有各自的适用场景。 - -以下表格从适用场景、典型操作等多个维度对树模型和表模型进行了对比。用户可以根据具体的使用需求,选择适合的模型,从而实现数据的高效存储和管理。 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
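-| Dimension | Tree model | Table model |
-| --- | --- | --- |
-| Applicable scenarios | Measurement-point management, monitoring scenarios | Device management, analysis scenarios |
-| Typical operations | Read and write by specifying a measurement-point path | Filter and analyze data by tags |
-| Structure | Nodes can be added and removed as flexibly as in a file system | Template-based management, convenient for data governance |
-| Syntax | Concise and flexible | Rich analysis capabilities |
-| Performance | Same | Same |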
- -**注意:** -- 同一个集群实例中可以存在两种模型空间,不同模型的语法、数据库命名方式不同,默认不互相可见。 - -### 2.2 模型选择 - -IoTDB 支持通过多种客户端工具与数据库建立连接,不同客户端下进行模型选择的方式说明如下: - -1. [命令行工具 CLI](../Tools-System/CLI_timecho.md) - -通过 CLI 建立连接时,需要通过 `sql_dialect` 参数指定使用的模型(默认使用树模型)。 - -```Bash -# 树模型 -start-cli.sh(bat) -start-cli.sh(bat) -sql_dialect tree - -# 表模型 -start-cli.sh(bat) -sql_dialect table -``` - -2. [SQL](../User-Manual/Maintenance-statement_timecho.md#_2-1-设置连接的模型) - -在使用 SQL 语言进行数据操作时,可通过 set 语句切换使用的模型。 - -```SQL --- 指定为树模型 -IoTDB> SET SQL_DIALECT=TREE - --- 指定为表模型 -IoTDB> SET SQL_DIALECT=TABLE -``` - -3. 应用编程接口 - -通过多语言应用编程接口建立连接时,可通过模型对应的 session/sessionpool 创建连接池实例,简单示例如下: - -* [Java 原生接口](../API/Programming-Java-Native-API_timecho.md) - -```Java -// 树模型 -SessionPool sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(3) - .build(); - -//表模型 - ITableSessionPool tableSessionPool = - new TableSessionPoolBuilder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(1) - .build(); -``` - -* [Python 原生接口](../API/Programming-Python-Native-API_timecho.md) - -```Python -# 树模型 -session = Session( -​ ip=ip, -​ port=port, -​ user=username, -​ password=password, -​ fetch_size=1024, -​ zone_id="UTC+8", -​ enable_redirection=True -) - -# 表模型 -config = TableSessionPoolConfig( -​ node_urls=node_urls, -​ username=username, -​ password=password, -​ database=database, -​ max_pool_size=max_pool_size, -​ fetch_size=fetch_size, -​ wait_timeout_in_ms=wait_timeout_in_ms, -) -session_pool = TableSessionPool(config) -``` - -* [C++ 原生接口](../API/Programming-Cpp-Native-API.md) - -```C++ -// 树模型 -session = new Session(hostip, port, username, password); - -// 表模型 -session = (new TableSessionBuilder()) - ->host(ip) - ->rpcPort(port) - ->username(username) - ->password(password) - ->build(); -``` - -* [GO 原生接口](../API/Programming-Go-Native-API.md) - -```Go -//树模型 -config := &client.PoolConfig{ - Host: host, - Port: port, - UserName: user, - 
Password: password, -} -sessionPool = client.NewSessionPool(config, 3, 60000, 60000, false) -defer sessionPool.Close() - -//表模型 -config := &client.PoolConfig{ - Host: host, - Port: port, - UserName: user, - Password: password, - Database: dbname, -} -sessionPool := client.NewTableSessionPool(config, 3, 60000, 4000, false) -defer sessionPool.Close() -``` - -* [C# 原生接口](../API/Programming-CSharp-Native-API.md) - -```C# -//树模型 -var session_pool = new SessionPool(host, port, pool_size); - -//表模型 -var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(nodeUrls) - .SetUsername(username) - .SetPassword(password) - .SetFetchSize(1024) - .Build(); -``` - -* [JDBC](../API/Programming-JDBC_timecho.md) - -使用表模型,必须在 url 中指定 sql\_dialect 参数为 table。 - -```Java -// 树模型 -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection connection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667/", username, password); - -// 表模型 -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection connection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table", username, password); -``` - -### 2.3 树转表 - -IoTDB 提供了树转表功能,如下图所示: - -![](/img/tree-to-table-1.png) - -该功能支持通过创建表视图的方式,将已存在的树模型数据转化为表视图,进而通过表视图进行查询,实现了对同一份数据的树模型和表模型协同处理。更详细的功能介绍可参考[树转表视图](../../latest-Table/User-Manual/Tree-to-Table_timecho.md),需要注意的是:​**创建树转表视图的 SQL 语句只允许在表模型下执行**​。 - - -## 3. 应用场景 - -应用场景主要包括三类: - -- 场景一:使用树模型进行数据的读写 - -- 场景二:使用表模型进行数据的读写 - -- 场景三:共用一份数据,使用树模型进行数据读写、使用表模型进行数据分析 - -### 3.1 场景一:树模型 - -#### 3.1.1 特点 - -- 简单直观,和物理世界的监测点位一一对应 - -- 类似文件系统一样灵活,可以设计任意分支结构 - -- 适用 DCS、SCADA 等工业监控场景 - -#### 3.1.2 基础概念 - -| **概念** | **定义** | -| -------------------- | ------------------------------------------------------------ | -| **数据库** | 定义:一个以 root. 为前缀的路径
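Naming: include only the node directly below root, e.g. root.db<br>Count: the upper limit depends on memory; a single database can already make full use of machine resources, so there is no need to create multiple databases for performance reasons<br>Creation: manual creation is recommended; a database can also be created automatically when a time series is created (by default, as the node directly below root) |
-| **Time series (measurement point)** | Definition:<br>1. A path prefixed with the database path and separated by `.`; it may contain any number of levels, e.g. root.db.turbine.device1.metric1<br>2. Each time series can have its own data type.<br>Naming:<br>1. Put only the tags that uniquely locate the time series (similar to a composite primary key) into the path, generally no more than 10 levels<br>2. Tags with low cardinality (few distinct values) usually go first, so that the system can compress the common prefix<br>Count:<br>1. The total number of time series a cluster can manage depends on total memory; see the resource recommendation chapter<br>2. There is no limit on the number of child nodes at any level<br>Creation: can be created manually or automatically when data is written. |
-| **Device** | Definition: the second-to-last level is the device, e.g. the "device1" level in root.db.turbine.**device1**.metric1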
创建方式:无法仅创建设备,随时间序列创建而存在 | - - -#### 3.1.3 建模示例 - -##### 3.1.3.1 有多种类型的设备需要管理,如何建模? - -- 如场景中不同类型的设备具备不同的层级路径和测点集合,可以在数据库节点下按设备类型创建分支。每种设备下可以有不同的测点结构。 - -
- -
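The branching described above can be sketched in a few lines of Python. This is a minimal illustration and does not call IoTDB: `quote_node` and `series_path` are hypothetical helpers, the device types and measurement names are made up, and the code assumes the backtick-wrapping convention for node names (every node below the database is wrapped, and an embedded backtick is written as two backticks).

```python
def quote_node(name: str) -> str:
    """Wrap one path node in backticks; an embedded backtick is doubled."""
    return "`" + name.replace("`", "``") + "`"


def series_path(database: str, *nodes: str) -> str:
    """Join a database prefix such as 'root.db' with quoted lower-level nodes."""
    if not nodes:
        raise ValueError("a time series needs at least one node below the database")
    return ".".join([database, *(quote_node(n) for n in nodes)])


# Hypothetical layout: each device type gets its own branch under the database,
# and the branches may differ in depth and in their measurement sets.
layout = {
    ("turbine", "device1"): ["speed", "power"],
    ("meter", "line-7", "m01"): ["voltage"],
}

paths = [
    series_path("root.db", *branch, measurement)
    for branch, measurements in layout.items()
    for measurement in measurements
]

for p in paths:
    print(p)
```

Quoting every node is the conservative choice: it costs a few characters per path and avoids registration failures caused by digits-only names or special characters.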
- -##### 3.1.3.2 如果场景中没有设备,只有测点,如何建模? - -- 如场站的监控系统中,每个测点都有唯一编号,但无法对应到某些设备。 - -
- -
- -##### 3.1.3.3 如果在一个设备下,既有子设备,也有测点,如何建模? - -- 如在储能场景中,每一层结构都要监控其电压和电流,可以采用如下建模方式。 - -
- -
- - -### 3.2 场景二:表模型 - -#### 3.2.1 特点 - -- 以时序表建模管理设备时序数据,便于使用标准 SQL 进行分析 - -- 适用于设备数据分析或从其他数据库迁移至 IoTDB 的场景 - -#### 3.2.2 基础概念 - -- 数据库:可管理多类设备 - -- 时序表:对应一类设备 - -| **列类别** | **定义** | -| --------------------------- | ------------------------------------------------------------ | -| **时间列(TIME)** | 每个时序表必须有一个时间列,且列名必须为 time,数据类型为 TIMESTAMP | -| **标签列(TAG)** | 设备的唯一标识(联合主键),可以为 0 至多个
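Tag information cannot be modified or deleted, but new tags can be added<br>Ordering tags from coarse to fine granularity is recommended |
-| **Field column (FIELD)** | A device can collect one or more measurement points, whose values change over time<br>There is no limit on the number of field columns in a table; it can exceed several hundred thousand |
-| **Attribute column (ATTRIBUTE)** | Supplementary description of a device that **does not change over time**<br>A device can have zero or more attributes, which can be updated or added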
少量希望修改的静态属性可以存至此列 | - - -数据筛选效率:时间列=标签列>属性列>测点列 - -#### 3.2.3 建模示例 - -##### 3.2.3.1 有多种类型的设备需要管理,如何建模? - -- 推荐为每一类型的设备建立一张表,每个表可以具有不同的标签和测点集合。 -- 即使设备之间有联系,或有层级关系,也推荐为每一类设备建一张表。 - -
- -
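The "one table per device type" advice, together with the column categories listed in the basic-concepts table, can be sketched as plain Python data. This is only an assumption-laden illustration: `Column`, `validate_table`, the table names, and the column names are all hypothetical, and the checks encode nothing beyond the rules stated in this section (exactly one TIME column named `time` of type TIMESTAMP, at least one FIELD column, any number of TAG and ATTRIBUTE columns).

```python
from dataclasses import dataclass

CATEGORIES = ("TIME", "TAG", "ATTRIBUTE", "FIELD")


@dataclass(frozen=True)
class Column:
    name: str
    data_type: str
    category: str


def validate_table(columns):
    """Check the column rules described in this section for one time series table."""
    for c in columns:
        if c.category not in CATEGORIES:
            raise ValueError(f"unknown column category: {c.category}")
    time_cols = [c for c in columns if c.category == "TIME"]
    if (len(time_cols) != 1 or time_cols[0].name != "time"
            or time_cols[0].data_type != "TIMESTAMP"):
        raise ValueError("exactly one TIME column named 'time' of type TIMESTAMP is required")
    if not any(c.category == "FIELD" for c in columns):
        raise ValueError("at least one FIELD column is required")


# Hypothetical device types: each gets its own table with its own tags and fields.
tables = {
    "wind_turbine": [
        Column("time", "TIMESTAMP", "TIME"),
        Column("region", "STRING", "TAG"),
        Column("device_id", "STRING", "TAG"),
        Column("model", "STRING", "ATTRIBUTE"),
        Column("speed", "FLOAT", "FIELD"),
        Column("power", "FLOAT", "FIELD"),
    ],
    "meter": [
        Column("time", "TIMESTAMP", "TIME"),
        Column("device_id", "STRING", "TAG"),
        Column("voltage", "FLOAT", "FIELD"),
    ],
}

for name, cols in tables.items():
    validate_table(cols)
    tags = [c.name for c in cols if c.category == "TAG"]
    fields = [c.name for c in cols if c.category == "FIELD"]
    print(name, "tags:", tags, "fields:", fields)
```

Keeping the two device types in separate tables lets each one carry only the tags and fields it actually has, instead of one wide table full of nulls.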
- -##### 3.2.3.2 如果没有设备标识列和属性列,如何建模? - -- 列数没有数量限制,可以达到数十万以上。 - -
- -
- -##### 3.2.3.3 如果在一个设备下,既有子设备,也有测点,如何建模? - -- 每个设备有多个子设备及测点信息,推荐为每类设备建一个表进行管理。 - -
- -
- -### 3.3 场景三:双模型结合 - -#### 3.3.1 特点 - -- 巧妙融合了树模型与表模型的优点,共用一份数据,写入灵活,查询丰富。 - -- 数据写入阶段,采用树模型语法,支持数据灵活接入和扩展。 - -- 数据分析阶段,采用表模型语法,允许用户通过标准 SQL 查询语言,执行复杂的数据分析。 - -#### 3.3.2 建模示例 - -##### 3.3.2.1 有多种类型的设备需要管理,如何建模? - -- 场景中不同类型的设备具备不同的层级路径和测点集合。 - -- 树模型:在数据库节点下按设备类型创建分支,每种设备下可以有不同的测点结构。 - -- 表视图:为每种类型的设备建立一张表视图,每个表视图具有不同的标签和测点集合。 - -
- -
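To make the correspondence between the two models concrete, the sketch below regroups tree-model data points into table-style rows. It is purely illustrative and does not use IoTDB: the real tree-to-table view is created with SQL in the table model, while `split_series`, `tag_schema`, the paths, and the values here are hypothetical. It also assumes that node names contain no `.` or backticks.

```python
from collections import defaultdict


def split_series(path, tag_names):
    """Split 'root.<db>.<type>.<tag values...>.<measurement>' into (table, tags, field)."""
    nodes = path.split(".")
    if nodes[0] != "root" or len(nodes) < 4:
        raise ValueError(f"unexpected path: {path}")
    table, tag_values, field = nodes[2], nodes[3:-1], nodes[-1]
    if len(tag_values) != len(tag_names):
        raise ValueError(f"{path} does not match tag columns {tag_names}")
    return table, tuple(zip(tag_names, tag_values)), field


# Hypothetical tag columns per device type (one table view per type).
tag_schema = {"turbine": ["device_id"], "meter": ["line", "device_id"]}

# Hypothetical tree-model points: (time series path, timestamp, value).
points = [
    ("root.db.turbine.device1.speed", 1, 12.5),
    ("root.db.turbine.device1.power", 1, 300.0),
    ("root.db.meter.line7.m01.voltage", 1, 220.1),
]

# One row per (table, tag values, timestamp); measurements become field columns.
rows = defaultdict(dict)
for path, ts, value in points:
    device_type = path.split(".")[2]
    table, tags, field = split_series(path, tag_schema[device_type])
    rows[(table, tags, ts)][field] = value

for (table, tags, ts), fields in rows.items():
    print(table, dict(tags), ts, fields)
```

The two turbine series written at the same timestamp collapse into a single row with two field columns, which is the shape that table-model queries operate on.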
- -##### 3.3.2.2 如果没有设备标识列和属性列,如何建模? - -- 树模型:每个测点都有唯一编号,但无法对应到某些设备。 - -- 表视图:将所有测点放入一张表中,测点列数没有数量限制,可以达到数十万以上。若测点具有相同的数据类型,可将测点作为同一类设备。 - -
- -
- -##### 3.3.2.3 如果在一个设备下,既有子设备,也有测点,如何建模? - -- 树模型:按照物理世界的监测点,对每一层结构进行建模。 - -- 表视图:按照设备分类,建立多个表对每一层结构信息进行管理。 - -
- -
diff --git a/src/zh/UserGuide/Master/Tree/Background-knowledge/Navigating_Time_Series_Data_timecho.md b/src/zh/UserGuide/Master/Tree/Background-knowledge/Navigating_Time_Series_Data_timecho.md deleted file mode 100644 index f537b0e63..000000000 --- a/src/zh/UserGuide/Master/Tree/Background-knowledge/Navigating_Time_Series_Data_timecho.md +++ /dev/null @@ -1,70 +0,0 @@ - -# 时序数据模型 - -## 1. 什么叫时序数据? - -万物互联的今天,物联网场景、工业场景等各类场景都在进行数字化转型,人们通过在各类设备上安装传感器对设备的各类状态进行采集。如电机采集电压、电流,风机的叶片转速、角速度、发电功率;车辆采集经纬度、速度、油耗;桥梁的振动频率、挠度、位移量等。传感器的数据采集,已经渗透在各个行业中。 - -![](/img/%E6%97%B6%E5%BA%8F%E6%95%B0%E6%8D%AE%E4%BB%8B%E7%BB%8D.png) - - - -通常来说,我们把每个采集点位叫做一个**测点( 也叫物理量、时间序列、时间线、信号量、指标、测量值等)**,每个测点都在随时间的推移不断收集到新的数据信息,从而构成了一条**时间序列**。用表格的方式,每个时间序列就是一个由时间、值两列形成的表格;用图形化的方式,每个时间序列就是一个随时间推移形成的走势图,也可以形象的称之为设备的“心电图”。 - -![](/img/%E5%BF%83%E7%94%B5%E5%9B%BE1.png) - -传感器产生的海量时序数据是各行各业数字化转型的基础,因此我们对时序数据的模型梳理主要围绕设备、传感器展开。 - -## 2. 时序数据中的关键概念有哪些? - -时序数据中主要涉及的概念由下至上可分为:数据点、测点、设备。 - -![](/img/%E7%99%BD%E6%9D%BF.png) - -### 2.1 数据点 - -- 定义:由一个时间戳和一个数值组成,其中时间戳为 long 类型,数值可以为 BOOLEAN、FLOAT、INT32 等各种类型。 -- 示例:如上图中表格形式的时间序列的一行,或图形形式的时间序列的一个点,就是一个数据点。 - -![](/img/%E6%95%B0%E6%8D%AE%E7%82%B9.png) - -### 2.2 测点 - -- 定义:是多个数据点按时间戳递增排列形成的一个时间序列。通常一个测点代表一个采集点位,能够定期采集所在环境的物理量。 -- 又名:物理量、时间序列、时间线、信号量、指标、测量值等 -- 示例: - - 电力场景:电流、电压 - - 能源场景:风速、转速 - - 车联网场景:油量、车速、经度、维度 - - 工厂场景:温度、湿度 - -- _树模型下**测点数量**等于整个路径模式下叶子节点的数量,具体统计方法可参考_[统计时间序列总数](../Basic-Concept/Operate-Metadata_timecho.md#_2-7-统计时间序列总数) - - -### 2.3 设备 - -- 定义:对应一个实际场景中的物理设备,通常是一组测点的集合,由一到多个标签定位标识 -- 示例 - - 车联网场景:车辆,由车辆识别代码 VIN 标识 - - 工厂场景:机械臂,由物联网平台生成的唯一 ID 标识 - - 能源场景:风机,由区域、场站、线路、机型、实例等标识 - - 监控场景:CPU,由机房、机架、Hostname、设备类型等标识 \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/Basic-Concept/Operate-Metadata_timecho.md b/src/zh/UserGuide/Master/Tree/Basic-Concept/Operate-Metadata_timecho.md deleted file mode 100644 index 55fb26b33..000000000 --- a/src/zh/UserGuide/Master/Tree/Basic-Concept/Operate-Metadata_timecho.md +++ /dev/null @@ 
-1,1278 +0,0 @@ - - -# 测点管理 - -## 1. 数据库管理 - -数据库(Database)可以被视为关系数据库中的Database。 - -### 1.1 创建数据库 - -我们可以根据存储模型建立相应的数据库。如下所示: - -```sql -CREATE DATABASE root.ln; -``` - -需要注意的是,推荐创建一个 database. - -Database 的父子节点都不能再设置 database。 - -例如在已经有`root.ln`和`root.sgcc`这两个 database 的情况下,创建`root.ln.wf01` database 是不可行的。系统将给出相应的错误提示,如下所示: - -```sql -CREATE DATABASE root.ln.wf01; -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 501: root.ln has already been created as database -``` -同样,在已经有 `root.db.test` 这个 database 的情况下,创建 `root.db` database 也是不可行的。系统也会给出相应的错误提示,如下所示: - -```sql -CREATE DATABASE root.db; -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 529: some children of root.db have already been created as database -``` - -Database 节点名命名规则: -1. 节点名可由**中英文字符、数字、下划线(\_)、英文句号(.)、反引号(\`)** 组成 -2. 若节点名为以下情况,则必须用**反引号(\`)** 将整个名称包裹。 - - 纯数字(如 12345) - - 含有特殊字符(如 . 或 \_)并可能引发歧义的名称(如 db.01、\_temp) -3. 反引号的特殊处理: - 若节点名本身需要包含反引号(\`),则需用**两个反引号(\`\`)** 表示一个反引号。例如:命名为\`db123\`\`(本身包含一个反引号),需写为 \`db123\`\`\`。 - -还需注意,如果在 Windows 或 macOS 系统上部署,database 名是大小写不敏感的。例如同时创建`root.ln` 和 `root.LN` 是不被允许的。 - -### 1.2 查看数据库 - -在 database 创建后,我们可以使用 [SHOW DATABASES](../SQL-Manual/SQL-Manual.md#查看数据库) 语句和 [SHOW DATABASES \](../SQL-Manual/SQL-Manual.md#查看数据库) 来查看 database,SQL 语句如下所示: - -```sql -show databases; -show databases root.*; -show databases root.**; -``` - -执行结果为: - -```shell -+-------------+----+-------------------------+-----------------------+-----------------------+ -| database| ttl|schema_replication_factor|data_replication_factor|time_partition_interval| -+-------------+----+-------------------------+-----------------------+-----------------------+ -| root.sgcc|null| 2| 2| 604800| -| root.ln|null| 2| 2| 604800| -+-------------+----+-------------------------+-----------------------+-----------------------+ -Total line number = 2 -It costs 0.060s -``` - -### 1.3 删除数据库 - -用户可以使用`DELETE DATABASE `语句删除该路径模式匹配的所有的数据库。在删除的过程中,需要注意的是数据库的数据也会被删除。 - -```sql -DELETE DATABASE root.ln; -DELETE DATABASE 
root.sgcc; -// 删除所有数据,时间序列以及数据库; -DELETE DATABASE root.**; -``` - -### 1.4 统计数据库数量 - -用户可以使用`COUNT DATABASES `语句统计数据库的数量,允许指定`PathPattern` 用来统计匹配该`PathPattern` 的数据库的数量 - -SQL 语句如下所示: - -```sql -show databases; -count databases; -count databases root.*; -count databases root.sgcc.*; -count databases root.sgcc; -``` - -执行结果为: - -```shell -+-------------+ -| database| -+-------------+ -| root.sgcc| -| root.turbine| -| root.ln| -+-------------+ -Total line number = 3 -It costs 0.003s - -+-------------+ -| Database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.003s - -+-------------+ -| Database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| Database| -+-------------+ -| 0| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| database| -+-------------+ -| 1| -+-------------+ -Total line number = 1 -It costs 0.002s -``` - -### 1.5 数据保留时间(TTL) - -IoTDB 支持对设备(device)级别设置数据保留时间(TTL),允许系统自动定期删除旧数据,以有效控制磁盘空间并维护高性能查询和低内存占用。TTL 默认以毫秒为单位,数据过期后不可查询且禁止写入,但物理删除会延迟至压缩时。需注意,TTL 变更可能导致短暂数据可查询性变化,且若调小或解除 TTL,之前因 TTL 不可见的数据可能重新出现。 - -注意事项: -- TTL 设置为毫秒,不受配置文件时间精度影响。 -- TTL 变更可能影响数据的可查询性。 -- 系统最终会移除过期数据,但存在延迟。 -- TTL 判断数据是否过期依据的是数据点时间,非写入时间。 -- 系统最多支持设置 1000 条 TTL 规则,达到上限需先删除部分规则才能设置新规则。 - -#### TTL Path 规则 -设置的路径 path 只支持前缀路径(即路径中间不能带 \* , 且必须以 \*\* 结尾),该路径会匹配到设备,也允许用户指定不带星的 path 为具体的 database 或 device,当 path 不带 \* 时,会检查是否匹配到 database,若匹配到 database,则会同时设置 path 和 path.\*\*。 -注意:设备 TTL 设置不会对元数据的存在性进行校验,即允许对一条不存在的设备设置 TTL。 -```shell -合格的 path: -root.** -root.db.** -root.db.group1.** -root.db -root.db.group1.d1 - -不合格的 path: -root.*.db -root.**.db.* -root.db.* -``` -#### TTL 适用规则 -当一个设备适用多条TTL规则时,优先适用较精确和较长的规则。例如对于设备“root.bj.hd.dist001.turbine001”来说,规则“root.bj.hd.dist001.turbine001”比“root.bj.hd.dist001.\*\*”优先,而规则“root.bj.hd.dist001.\*\*”比“root.bj.hd.\*\*”优先; -#### 设置 TTL -set ttl 操作可以理解为设置一条 TTL规则,比如 set ttl to root.sg.group1.\*\* 就相当于对所有可以匹配到该路径模式的设备挂载 ttl。 unset ttl 
操作表示对相应路径模式卸载 TTL,若不存在对应 TTL,则不做任何事。若想把 TTL 调成无限大,则可以使用 INF 关键字 -设置 TTL 的 SQL 语句如下所示: -```sql -set ttl to pathPattern 360000; -``` -pathPattern 是前缀路径,即路径中间不能带 \* 且必须以 \*\* 结尾。 -pathPattern 匹配对应的设备。为了兼容老版本 SQL 语法,允许用户输入的 pathPattern 匹配到 db,则自动将前缀路径扩展为 path.\*\*。 -例如,写set ttl to root.sg 360000 则会自动转化为set ttl to root.sg.\*\* 360000,转化后的语句对所有 root.sg 下的 device 设置TTL。 -但若写的 pathPattern 无法匹配到 db,则上述逻辑不会生效。 -如写set ttl to root.sg.group 360000 ,由于root.sg.group未匹配到 db,则不会被扩充为root.sg.group.\*\*。 也允许指定具体 device,不带 \*。 -#### 取消 TTL - -取消 TTL 的 SQL 语句如下所示: - -```sql -unset ttl from root.ln; -``` - -取消设置 TTL 后, `root.ln` 路径下所有的数据都会被保存。 -```sql -unset ttl from root.sgcc.**; -``` - -取消设置`root.sgcc`路径下的所有的 TTL 。 -```sql -unset ttl from root.**; -``` - -取消设置所有的 TTL 。 - -新语法 -```sql -unset ttl from root.**; -``` - -旧语法 -```sql -unset ttl to root.**; -``` -新旧语法在功能上没有区别并且同时兼容,仅是新语法在用词上更符合常规。 -#### 显示 TTL - -显示 TTL 的 SQL 语句如下所示: -show all ttl - -```sql -SHOW ALL TTL; -``` -```shell -+--------------+--------+ -| path| TTL| -| root.**|55555555| -| root.sg2.a.**|44440000| -+--------------+--------+ -``` - -show ttl on pathPattern -```sql -SHOW TTL ON root.db.**; -``` -```shell -+--------------+--------+ -| path| TTL| -| root.db.**|55555555| -| root.db.a.**|44440000| -+--------------+--------+ -``` -SHOW ALL TTL 这个例子会给出所有的 TTL。 -SHOW TTL ON pathPattern 这个例子会显示指定路径的 TTL。 - -显示设备的 TTL。 -```sql -show devices; -``` -```shell -+---------------+---------+---------+ -| Device|IsAligned| TTL| -+---------------+---------+---------+ -|root.sg.device1| false| 36000000| -|root.sg.device2| true| INF| -+---------------+---------+---------+ -``` -所有设备都一定会有 TTL,即不可能是 null。INF 表示无穷大。 - - -### 1.6 设置异构数据库(进阶操作) - -在熟悉 IoTDB 元数据建模的前提下,用户可以在 IoTDB 中设置异构的数据库,以便应对不同的生产需求。 - -目前支持的数据库异构参数有: - -| 参数名 | 参数类型 | 参数描述 | -|---------------------------|---------|---------------------------| -| TTL | Long | 数据库的 TTL | -| SCHEMA_REPLICATION_FACTOR | Integer | 数据库的元数据副本数 | -| DATA_REPLICATION_FACTOR | Integer | 数据库的数据副本数 | 
-| SCHEMA_REGION_GROUP_NUM | Integer | Number of SchemaRegionGroups of the database | -| DATA_REGION_GROUP_NUM | Integer | Number of DataRegionGroups of the database | - -Keep the following three points in mind when configuring heterogeneous parameters: -+ TTL and TIME_PARTITION_INTERVAL must be positive integers. -+ SCHEMA_REPLICATION_FACTOR and DATA_REPLICATION_FACTOR must be less than or equal to the number of deployed DataNodes. -+ The behavior of SCHEMA_REGION_GROUP_NUM and DATA_REGION_GROUP_NUM depends on the -`schema_region_group_extension_policy` and `data_region_group_extension_policy` parameters in the iotdb-common.properties configuration file. Taking DATA_REGION_GROUP_NUM as an example: -If `data_region_group_extension_policy=CUSTOM`, DATA_REGION_GROUP_NUM is the number of DataRegionGroups the Database owns; -If `data_region_group_extension_policy=AUTO`, DATA_REGION_GROUP_NUM is the lower bound of the Database's DataRegionGroup quota, that is, once the Database starts receiving writes it owns at least this many DataRegionGroups. - -Any heterogeneous parameter can be set when a Database is created, and some of them can be adjusted while a standalone or distributed IoTDB is running. - -#### Set Heterogeneous Parameters When Creating a Database - -Any of the heterogeneous parameters above can be set when creating a Database. The SQL statement is as follows: - -```sql -CREATE DATABASE prefixPath (WITH databaseAttributeClause (COMMA? databaseAttributeClause)*)? -``` - -For example: -```sql -CREATE DATABASE root.db WITH SCHEMA_REPLICATION_FACTOR=1, DATA_REPLICATION_FACTOR=3, SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -#### Adjust Heterogeneous Parameters at Runtime - -Some heterogeneous parameters can be adjusted while IoTDB is running. The SQL statement is as follows: - -```sql -ALTER DATABASE prefixPath WITH databaseAttributeClause (COMMA? databaseAttributeClause)* -``` - -For example: -```sql -ALTER DATABASE root.db WITH SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -Note that only the following heterogeneous parameters can be adjusted at runtime: -+ SCHEMA_REGION_GROUP_NUM -+ DATA_REGION_GROUP_NUM - -#### Show Heterogeneous Databases - -The heterogeneous configuration of each Database can be queried. The SQL statement is as follows: - -```sql -SHOW DATABASES DETAILS prefixPath? 
-``` - -例如: - -```sql -SHOW DATABASES DETAILS; -``` -```shell -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|Database| TTL|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|MinSchemaRegionGroupNum|MaxSchemaRegionGroupNum|DataRegionGroupNum|MinDataRegionGroupNum|MaxDataRegionGroupNum| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|root.db1| null| 1| 3| 604800000| 0| 1| 1| 0| 2| 2| -|root.db2|86400000| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -|root.db3| null| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -Total line number = 3 -It costs 0.058s -``` - -各列查询结果依次为: -+ 数据库名称 -+ 数据库的 TTL -+ 数据库的元数据副本数 -+ 数据库的数据副本数 -+ 数据库的时间分区间隔 -+ 数据库当前拥有的 SchemaRegionGroup 数量 -+ 数据库需要拥有的最小 SchemaRegionGroup 数量 -+ 数据库允许拥有的最大 SchemaRegionGroup 数量 -+ 数据库当前拥有的 DataRegionGroup 数量 -+ 数据库需要拥有的最小 DataRegionGroup 数量 -+ 数据库允许拥有的最大 DataRegionGroup 数量 - - -## 2. 
时间序列管理 - -### 2.1 创建时间序列 - -根据建立的数据模型,我们可以分别在两个数据库中创建相应的时间序列。创建时间序列的 SQL 语句如下所示: - -```sql -create timeseries root.ln.wf01.wt01.status with datatype=BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature with datatype=FLOAT; -create timeseries root.ln.wf02.wt02.hardware with datatype=TEXT; -create timeseries root.ln.wf02.wt02.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature with datatype=FLOAT; -``` - -从 v0.13 起,可以使用简化版的 SQL 语句创建时间序列: - -```sql -create timeseries root.ln.wf01.wt01.status BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature FLOAT; -create timeseries root.ln.wf02.wt02.hardware TEXT; -create timeseries root.ln.wf02.wt02.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature FLOAT; -``` - -创建时间序列时,系统会默认指定编码压缩方式,无需手动指定,若业务场景需要手动调整,可参考如下示例: -```sql -create timeseries root.sgcc.wf03.wt01.temperature FLOAT encoding=PLAIN compressor=SNAPPY; -``` - -需要注意的是,如果手动指定了编码方式,但与数据类型不对应时,系统会给出相应的错误提示,如下所示: -```sql -create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF; -error: encoding TS_2DIFF does not support BOOLEAN -``` - -更多详细的数据类型与编码压缩方式的对应列表请参见 [压缩&编码](../Technical-Insider/Encoding-and-Compression.md)。 - - -### 2.2 创建对齐时间序列 - -创建一组对齐时间序列的SQL语句如下所示: - -```sql -CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT, longitude FLOAT); -``` - -一组对齐序列中的序列可以有不同的数据类型、编码方式以及压缩方式。 - -对齐的时间序列也支持设置别名、标签、属性。 - - -### 2.3 修改时间序列数据类型 - -自 V2.0.8.2 版本起,支持通过 SQL 语句修改时间序列的数据类型。 - -语法定义: - -```SQL -ALTER TIMESERIES fullPath SET DATA TYPE newType=type -``` - -说明: - -* 变更过程中若该时间序列被并发删除,会报错提示。 -* 变更后的时间序列类型需要与原类型兼容,具体兼容性如下表所示: - -| 原始类型 | 可变更为类型 | -| ----------- | ----------------------------------------------- | -| INT32 | INT64, FLOAT, DOUBLE, TIMESTAMP, STRING, TEXT | -| INT64 | TIMESTAMP, DOUBLE, STRING, TEXT | -| FLOAT | DOUBLE, STRING, TEXT | -| DOUBLE | STRING, TEXT | 
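-| BOOLEAN | STRING, TEXT | -| TEXT | BLOB, STRING | -| STRING | TEXT, BLOB | -| BLOB | STRING, TEXT | -| DATE | STRING, TEXT | -| TIMESTAMP | INT64, DOUBLE, STRING, TEXT | - -Example: - -```SQL -ALTER TIMESERIES root.ln.wf01.wt01.temperature set data type DOUBLE; -``` - -### 2.4 Rename a Time Series - -Since V2.0.8.2, the full path name of a time series can be changed with a SQL statement. After a successful rename, the original name is invalidated but remains in the metadata store. - -Syntax: - -```SQL --- Changes the full path of one series to another full path -ALTER TIMESERIES RENAME TO -``` - -Usage notes: - -* The statement takes effect immediately after it succeeds, and the tags, attributes, and alias of the original series are migrated to the new series. -* The invalidated (original) series no longer supports writes, queries, deletes, or other operations. The system reserves its name and does not allow a new series with the same name to be created, which keeps the original name unique and traceable. The original series can be viewed with the SHOW INVALID TIMESERIES statement, so its information is not lost after repeated renames, which helps with data tracing and troubleshooting. -* Views can be created on the new series but no longer on the original series. Changing the encoding, compression, data type, tags, attributes, or alias of the new series does not modify the original series; deleting the new series, however, also modifies the original series. -* If the new series path, or the original series' alias under the target device, already exists (including real series, views, invalidated series, and their aliases), the system reports an error. - -Example: - -```SQL -ALTER TIMESERIES root.ln.wf01.wt01.temperature RENAME TO root.newln.newwf.newwt.temperature; -``` - - -### 2.5 Delete Time Series - -Use the `(DELETE | DROP) TimeSeries ` statement to delete previously created time series. The SQL statements are as follows: - -```sql -delete timeseries root.ln.wf01.wt01.status; -delete timeseries root.ln.wf01.wt01.temperature, root.ln.wf02.wt02.hardware; -delete timeseries root.ln.wf02.*; -drop timeseries root.ln.wf02.*; -``` - -### 2.6 Show Time Series - -* SHOW LATEST? TIMESERIES pathPattern? timeseriesWhereClause? limitClause? 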
- - SHOW TIMESERIES 中可以有四种可选的子句,查询结果为这些时间序列的所有信息 - -时间序列信息具体包括:时间序列路径名,database,Measurement 别名,数据类型,编码方式,压缩方式,属性和标签。 - -示例: - -* SHOW TIMESERIES - - 展示系统中所有的时间序列信息 - -* SHOW TIMESERIES <`Path`> - - 返回给定路径的下的所有时间序列信息。其中 `Path` 需要为一个时间序列路径或路径模式。例如,分别查看`root`路径和`root.ln`路径下的时间序列,SQL 语句如下所示: - -```sql -show timeseries root.**; -show timeseries root.ln.**; -``` - -执行结果分别为: - -```shell -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.016s - 
-+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression|tags|attributes|deadband|deadband parameters| -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -|root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY|null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -Total line number = 4 -It costs 0.004s -``` - -* SHOW TIMESERIES LIMIT INT OFFSET INT - - 只返回从指定下标开始的结果,最大返回条数被 LIMIT 限制,用于分页查询。例如: - -```sql -show timeseries root.ln.** limit 10 offset 10; -``` - -* SHOW TIMESERIES WHERE TIMESERIES contains 'containStr' - - 对查询结果集根据 timeseries 名称进行字符串模糊匹配过滤。例如: - -```sql -show timeseries root.ln.** where timeseries contains 'wf01.wt'; -``` - -执行结果为: - -```shell -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| 
-+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 2 -It costs 0.016s -``` - -* SHOW TIMESERIES WHERE DataType=type - - 对查询结果集根据时间序列数据类型进行过滤。例如: - -```sql -show timeseries root.ln.** where dataType=FLOAT; -``` - -执行结果为: - -```shell -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 3 -It costs 0.016s - -``` - -* SHOW TIMESERIES WHERE TAGS(KEY) = VALUE -* SHOW TIMESERIES WHERE TAGS(KEY) CONTAINS VALUE - - 对查询结果集根据标签进行过滤。例如: - -```sql -show timeseries root.ln.** where TAGS(unit)='c'; -show timeseries root.ln.** where TAGS(description) contains 'test1'; -``` - -执行结果分别为: - -```shell 
-+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s - -``` - - -* SHOW LATEST TIMESERIES - - 表示查询出的时间序列需要按照最近插入时间戳降序排列 - -需要注意的是,当查询路径不存在时,系统会返回 0 条时间序列。 - -* SHOW INVALID TIMESERIES - -自 V2.0.8.2 版本起,支持该 SQL 语句,用于展示**修改全路径名称**成功后的作废时间序列。 - -```SQL -IoTDB> show invalid timeSeries -+-----------------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+----------------------------------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| NewPath| 
-+-----------------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+----------------------------------+ -|root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| GORILLA| LZ4|null| null| null| null| BASE|root.newln.newwf.newwt.temperature| -+-----------------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+----------------------------------+ -``` - -说明:返回结果中的最后一列 `NewPath`,展示作废序列对应的新序列,以服务于视图构建、集群迁移(Load+改名)等场景。 - -### 2.7 统计时间序列总数 - -IoTDB 支持使用`COUNT TIMESERIES`来统计一条路径中的时间序列个数。SQL 语句如下所示: - -* 可以通过 `WHERE` 条件对时间序列名称进行字符串模糊匹配,语法为: `COUNT TIMESERIES WHERE TIMESERIES contains 'containStr'` 。 -* 可以通过 `WHERE` 条件对时间序列数据类型进行过滤,语法为: `COUNT TIMESERIES WHERE DataType='`。 -* 可以通过 `WHERE` 条件对标签点进行过滤,语法为: `COUNT TIMESERIES WHERE TAGS(key)='value'` 或 `COUNT TIMESERIES WHERE TAGS(key) contains 'value'`。 -* 可以通过定义`LEVEL`来统计指定层级下的时间序列个数。这条语句可以用来统计每一个设备下的传感器数量,语法为:`COUNT TIMESERIES GROUP BY LEVEL=`。 - -```sql -COUNT TIMESERIES root.**; -COUNT TIMESERIES root.ln.**; -COUNT TIMESERIES root.ln.*.*.status; -COUNT TIMESERIES root.ln.wf01.wt01.status; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' ; -COUNT TIMESERIES root.** WHERE DATATYPE = INT64; -COUNT TIMESERIES root.** WHERE TAGS(unit) contains 'c' ; -COUNT TIMESERIES root.** WHERE TAGS(unit) = 'c' ; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' group by level = 1; -``` - -例如有如下时间序列(可以使用`show timeseries`展示所有时间序列): - -```shell -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| 
-+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| {"unit":"c"}| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| {"description":"test1"}| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.004s -``` - -那么 Metadata Tree 如下所示: - - - -可以看到,`root`被定义为`LEVEL=0`。那么当你输入如下语句时: - -```sql -COUNT TIMESERIES root.** GROUP BY LEVEL=1; -COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2; -COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2; -``` - -你将得到以下结果: - -```sql -COUNT TIMESERIES root.** GROUP BY LEVEL=1; -COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2; -COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2; -``` -```shell -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -| root.sgcc| 2| -| root.ln| 4| -+------------+-----------------+ -Total line number = 3 -It costs 0.002s - -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -|root.ln.wf02| 2| -|root.ln.wf01| 2| 
-+------------+-----------------+ -Total line number = 2 -It costs 0.002s - -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -|root.ln.wf01| 2| -+------------+-----------------+ -Total line number = 1 -It costs 0.002s -``` - -> 注意:时间序列的路径只是过滤条件,与 level 的定义无关。 - -### 2.8 活跃时间序列查询 -我们在原有的时间序列查询和统计上添加新的WHERE时间过滤条件,可以得到在指定时间范围中存在数据的时间序列。 - -需要注意的是, 在带有时间过滤的元数据查询中并不考虑视图的存在,只考虑TsFile中实际存储的时间序列。 - -一个使用样例如下: -```sql -insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -show timeseries; -show timeseries where time >= 15000 and time < 16000; -count timeseries where time >= 15000 and time < 16000; -``` -```shell -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| 
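Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -+-----------------+ -|count(timeseries)| -+-----------------+ -| 4| -+-----------------+ -``` -As for the definition of an active time series: data that a normal query can return is active data, so time series that were inserted and then deleted are not considered. - -### 2.9 Tag Management - -When creating a time series, we can add an alias plus extra tags and attributes to it. - -Tags and attributes differ as follows: - -* Tags can be used to look up time series paths; an inverted index from tags to time series paths is maintained in memory: tag -> time series path -* Attributes can only be looked up by time series path: time series path -> attributes - -The extended SQL statement for creating a time series is as follows: -```sql -create timeseries root.turbine.d1.s1(temprature) with datatype=FLOAT, encoding=RLE, compression=SNAPPY tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2); -``` - -The `temprature` in parentheses is the alias of the sensor `s1`. -Wherever `s1` is used, `temprature` can be used instead; the two are equivalent. - -> IoTDB also supports setting aliases with the AS function in query statements. The difference is that an alias set with AS replaces the whole time series name and is temporary, not bound to the time series, whereas the alias above is only an alias for the sensor, is bound to it, and can be used interchangeably with the original sensor name. - -> Note: the total size of the extra tags and attributes cannot exceed `tag_attribute_total_size`. 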
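The difference between tags and attributes described above can be pictured with a small Python sketch. This is only a toy model under stated assumptions, not IoTDB's actual implementation: the class `TagIndex` and its methods are hypothetical names introduced for illustration. It keeps an inverted index from (tag key, tag value) to time series paths, while attributes are stored per path and can only be reached once the path is known.

```python
from collections import defaultdict


class TagIndex:
    """Toy model (hypothetical, not IoTDB code): tags are indexed, attributes are not."""

    def __init__(self):
        # Inverted index: (tag key, tag value) -> set of time series paths.
        self._paths_by_tag = defaultdict(set)
        # Forward map only: time series path -> attribute dict.
        self._attributes_by_path = {}

    def create(self, path, tags=None, attributes=None):
        """Register a time series with optional tags and attributes."""
        for key, value in (tags or {}).items():
            self._paths_by_tag[(key, value)].add(path)
        self._attributes_by_path[path] = dict(attributes or {})

    def paths_with_tag(self, key, value):
        """Look up paths by tag, analogous to WHERE TAGS(key) = 'value'."""
        return sorted(self._paths_by_tag.get((key, value), set()))

    def attributes_of(self, path):
        """Attributes are reachable only through the path."""
        return self._attributes_by_path.get(path, {})


index = TagIndex()
index.create(
    "root.turbine.d1.s1",
    tags={"tag1": "v1", "tag2": "v2"},
    attributes={"attr1": "v1", "attr2": "v2"},
)
index.create("root.turbine.d1.s2", tags={"tag1": "v1"})

print(index.paths_with_tag("tag1", "v1"))
print(index.paths_with_tag("attr1", "v1"))
print(index.attributes_of("root.turbine.d1.s1"))
```

In this model a lookup by `attr1` returns nothing because attributes never enter the inverted index, which mirrors the rule that a `where` clause can filter on tags but not on attributes.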
- - * 标签点属性更新 -创建时间序列后,我们也可以对其原有的标签点属性进行更新,主要有以下六种更新方式: -* 重命名标签或属性 -```sql -ALTER timeseries root.turbine.d1.s1 RENAME tag1 TO newTag1; -``` -* 重新设置标签或属性的值 -```sql -ALTER timeseries root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1; -``` -* 删除已经存在的标签或属性 -```sql -ALTER timeseries root.turbine.d1.s1 DROP tag1, tag2; -``` -* 添加新的标签 -```sql -ALTER timeseries root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4; -``` -* 添加新的属性 -```sql -ALTER timeseries root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4; -``` -* 更新插入别名,标签和属性 -> 如果该别名,标签或属性原来不存在,则插入,否则,用新值更新原来的旧值 -```sql -ALTER timeseries root.turbine.d1.s1 UPSERT ALIAS=newAlias TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4); -``` - -* 使用标签作为过滤条件查询时间序列,使用 TAGS(tagKey) 来标识作为过滤条件的标签 -```sql -SHOW TIMESERIES (<`PathPattern`>)? timeseriesWhereClause -``` - -返回给定路径的下的所有满足条件的时间序列信息,SQL 语句如下所示: - -```sql -ALTER timeseries root.ln.wf02.wt02.hardware ADD TAGS unit=c; -ALTER timeseries root.ln.wf02.wt02.status ADD TAGS description=test1; -show timeseries root.ln.** where TAGS(unit)='c'; -show timeseries root.ln.** where TAGS(description) contains 'test1'; -``` - -执行结果分别为: - -```shell -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| 
tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s -``` - -- 使用标签作为过滤条件统计时间序列数量 - -```sql -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause; -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause GROUP BY LEVEL=; -``` - -返回给定路径的下的所有满足条件的时间序列的数量,SQL 语句如下所示: - -```sql -count timeseries; -count timeseries root.** where TAGS(unit)='c'; -count timeseries root.** where TAGS(unit)='c' group by level = 2; -``` - -执行结果分别为: - -```shell -+-----------------+ -|count(timeseries)| -+-----------------+ -| 6| -+-----------------+ -Total line number = 1 -It costs 0.019s - -+-----------------+ -|count(timeseries)| -+-----------------+ -| 2| -+-----------------+ -Total line number = 1 -It costs 0.020s - -+--------------+-----------------+ -| column|count(timeseries)| -+--------------+-----------------+ -| root.ln.wf02| 2| -| root.ln.wf01| 0| -|root.sgcc.wf03| 0| -+--------------+-----------------+ -Total line number = 3 -It costs 0.011s -``` - -> 注意,现在我们只支持一个查询条件,要么是等值条件查询,要么是包含条件查询。当然 where 子句中涉及的必须是标签值,而不能是属性值。 - -创建对齐时间序列 - -```sql -create aligned timeseries root.sg1.d1(s1 INT32 tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2), s2 DOUBLE tags(tag3=v3, tag4=v4) attributes(attr3=v3, attr4=v4)); -``` - -执行结果如下: - -```sql -show timeseries; -``` -```shell -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| 
-+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -|root.sg1.d1.s2| null| root.sg1| DOUBLE| GORILLA| SNAPPY|{"tag4":"v4","tag3":"v3"}|{"attr4":"v4","attr3":"v3"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -支持查询: - -```sql -show timeseries where TAGS(tag1)='v1'; -``` -```shell -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -上述对时间序列标签、属性的更新等操作都支持。 - - -## 3. 路径查询 - -### 3.1 路径(Path) - -路径(path)是用于表示时间序列的层级结构的表达式,其语法定义如下: -```SQL - path - : nodeName ('.' nodeName)* - ; - - nodeName - : wildcard? identifier wildcard? 
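- | wildcard - ; - - wildcard - : '*' - | '**' - ; -``` -### 3.2 Path Node Name (NodeName) - -- The parts of a path separated by `.` are called path node names (nodeName). -- For example, `root.a.b.c` is a path with 4 levels, where root, a, b, and c are all path node names. - -#### Constraints -- Reserved word root: root is reserved and may only appear at the beginning of a path. If root appears at any other level, the system cannot parse the path and reports an error. - -- Supported characters: levels other than root support the following characters: - - Letters (a-z, A-Z) - - Digits (0-9) - - Underscore (_) - - UNICODE Chinese characters (\u2E80 to \u9FFF) -- Case sensitivity: on Windows, database path node names are case-insensitive. For example, root.ln and root.LN are treated as the same path. - -### 3.3 Special Characters (Backticks) - -If a `path node name (NodeName)` needs special characters (such as spaces or punctuation), wrap the node name in backticks (`). For more on backtick usage, see [Backticks](../SQL-Manual/Syntax-Rule.md#反引号). - -### 3.4 Path Pattern - -To make it easier to refer to multiple time series, IoTDB provides paths with the wildcards `*` and `**`. Wildcards can appear at any level of a path. - -- Single-level wildcard (*): stands for exactly one level in a path. - - For example, `root.vehicle.*.sensor1` denotes paths with prefix `root.vehicle`, suffix `sensor1`, and exactly 4 levels. - -- Multi-level wildcard (**): stands for (`*`)+ in a path, that is, one or more levels of `*`. - - For example, `root.vehicle.device1.**` denotes all paths with prefix `root.vehicle.device1` and 4 or more levels. - - `root.vehicle.**.sensor1` denotes paths with prefix `root.vehicle`, suffix `sensor1`, and 4 or more levels. - -**Note**: `*` and `**` cannot appear at the beginning of a path. - -### 3.5 Show Child Paths - -```sql -SHOW CHILD PATHS pathPattern; -``` - -Shows all paths one level below every path matched by the path pattern, together with their node types, that is, the paths matched by pathPattern.* and their node types. - -Node types: ROOT -> SG INTERNAL -> DATABASE -> INTERNAL -> DEVICE -> TIMESERIES - -Examples: - -* Query the level below root.ln: show child paths root.ln - -```shell -+------------+----------+ -| child paths|node types| -+------------+----------+ -|root.ln.wf01| INTERNAL| -|root.ln.wf02| INTERNAL| -+------------+----------+ -Total line number = 2 -It costs 0.002s -``` - -* Query paths of the form root.xx.xx.xx: show child paths root.\*.\* - -```shell -+---------------+ -| child paths| -+---------------+ -|root.ln.wf01.s1| -|root.ln.wf02.s2| -+---------------+ -``` - -### 3.6 Show Child Nodes - -```sql -SHOW CHILD NODES pathPattern; -``` - -Shows all nodes one level below the nodes matched by the path pattern. - -Examples: - -* Query the level below root: show child nodes root - -```shell -+------------+ -| child nodes| -+------------+ -| ln| -+------------+ -``` - -* Query the level below root.ln: show child nodes root.ln - -```shell -+------------+ -| child 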
nodes| -+------------+ -| wf01| -| wf02| -+------------+ -``` - -### 3.7 统计节点数 - -IoTDB 支持使用`COUNT NODES LEVEL=`来统计当前 Metadata - 树下满足某路径模式的路径中指定层级的节点个数。这条语句可以用来统计带有特定采样点的设备数。例如: - -```sql -COUNT NODES root.** LEVEL=2; -COUNT NODES root.ln.** LEVEL=2; -COUNT NODES root.ln.wf01.* LEVEL=3; -COUNT NODES root.**.temperature LEVEL=3; -``` - -对于上面提到的例子和 Metadata Tree,你可以获得如下结果: - -```shell -+------------+ -|count(nodes)| -+------------+ -| 4| -+------------+ -Total line number = 1 -It costs 0.003s - -+------------+ -|count(nodes)| -+------------+ -| 2| -+------------+ -Total line number = 1 -It costs 0.002s - -+------------+ -|count(nodes)| -+------------+ -| 1| -+------------+ -Total line number = 1 -It costs 0.002s - -+------------+ -|count(nodes)| -+------------+ -| 2| -+------------+ -Total line number = 1 -It costs 0.002s -``` - -> 注意:时间序列的路径只是过滤条件,与 level 的定义无关。 - -### 3.8 查看设备 - -* SHOW DEVICES pathPattern? (WITH DATABASE)? devicesWhereClause? limitClause? - -与 `Show Timeseries` 相似,IoTDB 目前也支持两种方式查看设备。 - -* `SHOW DEVICES` 语句显示当前所有的设备信息,等价于 `SHOW DEVICES root.**`。 -* `SHOW DEVICES ` 语句规定了 `PathPattern`,返回给定的路径模式所匹配的设备信息。 -* `WHERE` 条件中可以使用 `DEVICE contains 'xxx'`,根据 device 名称进行模糊查询。 - -SQL 语句如下所示: - -```sql -show devices; -show devices root.ln.**; -show devices root.ln.** where device contains 't'; -``` - -你可以获得如下数据: - -```shell -+-------------------+---------+---------+ -| devices|isAligned| Template| -+-------------------+---------+---------+ -| root.ln.wf01.wt01| false| t1| -| root.ln.wf02.wt02| false| null| -|root.sgcc.wf03.wt01| false| null| -| root.turbine.d1| false| null| -+-------------------+---------+---------+ -Total line number = 4 -It costs 0.002s - -+-----------------+---------+---------+ -| devices|isAligned| Template| -+-----------------+---------+---------+ -|root.ln.wf01.wt01| false| t1| -|root.ln.wf02.wt02| false| null| -+-----------------+---------+---------+ -Total line number = 2 -It costs 0.001s - -+-----------------+---------+---------+ -| 
devices|isAligned| Template| -+-----------------+---------+---------+ -|root.ln.wf01.wt01| false| t1| -|root.ln.wf02.wt02| false| null| -+-----------------+---------+---------+ -Total line number = 2 -It costs 0.001s - -``` - -其中,`isAligned`表示该设备下的时间序列是否对齐, -`Template`显示着该设备所激活的模板名,null 表示没有激活模板。 - -查看设备及其 database 信息,可以使用 `SHOW DEVICES WITH DATABASE` 语句。 - -* `SHOW DEVICES WITH DATABASE` 语句显示当前所有的设备信息和其所在的 database,等价于 `SHOW DEVICES root.**`。 -* `SHOW DEVICES WITH DATABASE` 语句规定了 `PathPattern`,返回给定的路径模式所匹配的设备信息和其所在的 database。 - -SQL 语句如下所示: - -```sql -show devices with database; -show devices root.ln.** with database; -``` - -你可以获得如下数据: - -```shell -+-------------------+-------------+---------+---------+ -| devices| database|isAligned| Template| -+-------------------+-------------+---------+---------+ -| root.ln.wf01.wt01| root.ln| false| t1| -| root.ln.wf02.wt02| root.ln| false| null| -|root.sgcc.wf03.wt01| root.sgcc| false| null| -| root.turbine.d1| root.turbine| false| null| -+-------------------+-------------+---------+---------+ -Total line number = 4 -It costs 0.003s - -+-----------------+-------------+---------+---------+ -| devices| database|isAligned| Template| -+-----------------+-------------+---------+---------+ -|root.ln.wf01.wt01| root.ln| false| t1| -|root.ln.wf02.wt02| root.ln| false| null| -+-----------------+-------------+---------+---------+ -Total line number = 2 -It costs 0.001s -``` - -### 3.9 统计设备数量 - -* COUNT DEVICES \ - -上述语句用于统计设备的数量,同时允许指定`PathPattern` 用于统计匹配该`PathPattern` 的设备数量 - -SQL 语句如下所示: - -```sql -show devices; -count devices; -count devices root.ln.**; -``` - -你可以获得如下数据: - -```shell -+-------------------+---------+---------+ -| devices|isAligned| Template| -+-------------------+---------+---------+ -|root.sgcc.wf03.wt03| false| null| -| root.turbine.d1| false| null| -| root.ln.wf02.wt02| false| null| -| root.ln.wf01.wt01| false| t1| -+-------------------+---------+---------+ -Total line number = 4 -It costs 0.024s - 
-+--------------+ -|count(devices)| -+--------------+ -| 4| -+--------------+ -Total line number = 1 -It costs 0.004s - -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -Total line number = 1 -It costs 0.004s -``` - -### 3.10 活跃设备查询 -和活跃时间序列一样,我们可以在查看和统计设备的基础上添加时间过滤条件来查询在某段时间内存在数据的活跃设备。这里活跃的定义与活跃时间序列相同,使用样例如下: -```sql -insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -show devices; -show devices where time >= 15000 and time < 16000; -count devices where time >= 15000 and time < 16000; -``` -```shell -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -| root.sg.data3| false| -+-------------------+---------+ - -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -+-------------------+---------+ - -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -``` \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/Basic-Concept/Query-Data_timecho.md b/src/zh/UserGuide/Master/Tree/Basic-Concept/Query-Data_timecho.md deleted file mode 100644 index f367c3294..000000000 --- a/src/zh/UserGuide/Master/Tree/Basic-Concept/Query-Data_timecho.md +++ /dev/null @@ -1,3087 +0,0 @@ - - -# 数据查询 -## 1. 概述 - -在 IoTDB 中,使用 `SELECT` 语句从一条或多条时间序列中查询数据,IoTDB 不区分历史数据和实时数据,用户可以用统一的sql语法进行查询,通过 `WHERE` 子句中的时间过滤谓词决定查询的时间范围。 - -### 1.1 语法定义 - -```sql -SELECT [LAST] selectExpr [, selectExpr] ... - [INTO intoItem [, intoItem] ...] - FROM prefixPath [, prefixPath] ... - [WHERE whereCondition] - [GROUP BY { - ([startTime, endTime), interval [, slidingStep]) | - LEVEL = levelNum [, levelNum] ... | - TAGS(tagKey [, tagKey] ... 
| - VARIATION(expression[,delta][,ignoreNull=true/false]) | - CONDITION(expression,[keep>/>=/=/ 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; -``` - -其含义为: - -被选择的设备为 ln 集团 wf01 子站 wt01 设备;被选择的时间序列为供电状态(status)和温度传感器(temperature);该语句要求选择出 “2017-11-01T00:05:00.000” 至 “2017-11-01T00:12:00.000” 之间的所选时间序列的值。 - -该 SQL 语句的执行结果如下: - -```shell -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -|2017-11-01T00:08:00.000+08:00| false| 22.58| -|2017-11-01T00:09:00.000+08:00| false| 20.98| -|2017-11-01T00:10:00.000+08:00| true| 25.52| -|2017-11-01T00:11:00.000+08:00| false| 22.91| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 6 -It costs 0.018s -``` - -#### 示例3:按照多个时间区间选择同一设备的多列数据 - -IoTDB 支持在一次查询中指定多个时间区间条件,用户可以根据需求随意组合时间区间条件。例如, - -SQL 语句为: - -```sql -select status, temperature from root.ln.wf01.wt01 where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` - -其含义为: - -被选择的设备为 ln 集团 wf01 子站 wt01 设备;被选择的时间序列为“供电状态(status)”和“温度传感器(temperature)”;该语句指定了两个不同的时间区间,分别为“2017-11-01T00:05:00.000 至 2017-11-01T00:12:00.000”和“2017-11-01T16:35:00.000 至 2017-11-01T16:37:00.000”;该语句要求选择出满足任一时间区间的被选时间序列的值。 - -该 SQL 语句的执行结果如下: - -```shell -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -|2017-11-01T00:08:00.000+08:00| false| 22.58| -|2017-11-01T00:09:00.000+08:00| 
false| 20.98| -|2017-11-01T00:10:00.000+08:00| true| 25.52| -|2017-11-01T00:11:00.000+08:00| false| 22.91| -|2017-11-01T16:35:00.000+08:00| true| 23.44| -|2017-11-01T16:36:00.000+08:00| false| 21.98| -|2017-11-01T16:37:00.000+08:00| false| 21.93| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 9 -It costs 0.018s -``` - -#### 示例4:按照多个时间区间选择不同设备的多列数据 - -该系统支持在一次查询中选择任意列的数据,也就是说,被选择的列可以来源于不同的设备。例如,SQL 语句为: - -```sql -select wf01.wt01.status, wf02.wt02.hardware from root.ln where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` - -其含义为: - -被选择的时间序列为 “ln 集团 wf01 子站 wt01 设备的供电状态” 以及 “ln 集团 wf02 子站 wt02 设备的硬件版本”;该语句指定了两个时间区间,分别为 “2017-11-01T00:05:00.000 至 2017-11-01T00:12:00.000” 和 “2017-11-01T16:35:00.000 至 2017-11-01T16:37:00.000”;该语句要求选择出满足任意时间区间的被选时间序列的值。 - -该 SQL 语句的执行结果如下: - -```shell -+-----------------------------+------------------------+--------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf02.wt02.hardware| -+-----------------------------+------------------------+--------------------------+ -|2017-11-01T00:06:00.000+08:00| false| v1| -|2017-11-01T00:07:00.000+08:00| false| v1| -|2017-11-01T00:08:00.000+08:00| false| v1| -|2017-11-01T00:09:00.000+08:00| false| v1| -|2017-11-01T00:10:00.000+08:00| true| v2| -|2017-11-01T00:11:00.000+08:00| false| v1| -|2017-11-01T16:35:00.000+08:00| true| v2| -|2017-11-01T16:36:00.000+08:00| false| v1| -|2017-11-01T16:37:00.000+08:00| false| v1| -+-----------------------------+------------------------+--------------------------+ -Total line number = 9 -It costs 0.014s -``` - -#### 示例5:根据时间降序返回结果集 - -IoTDB 支持 `order by time` 语句,用于对结果按照时间进行降序展示。例如,SQL 语句为: - -```sql -select * from root.ln.** where time > 1 order by time desc limit 10; -``` - -语句执行的结果为: - -```shell 
-+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -| Time|root.ln.wf02.wt02.hardware|root.ln.wf02.wt02.status|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -|2017-11-07T23:59:00.000+08:00| v1| false| 21.07| false| -|2017-11-07T23:58:00.000+08:00| v1| false| 22.93| false| -|2017-11-07T23:57:00.000+08:00| v2| true| 24.39| true| -|2017-11-07T23:56:00.000+08:00| v2| true| 24.44| true| -|2017-11-07T23:55:00.000+08:00| v2| true| 25.9| true| -|2017-11-07T23:54:00.000+08:00| v1| false| 22.52| false| -|2017-11-07T23:53:00.000+08:00| v2| true| 24.58| true| -|2017-11-07T23:52:00.000+08:00| v1| false| 20.18| false| -|2017-11-07T23:51:00.000+08:00| v1| false| 22.24| false| -|2017-11-07T23:50:00.000+08:00| v2| true| 23.7| true| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -Total line number = 10 -It costs 0.016s -``` - -### 1.4 查询执行接口 - -在 IoTDB 中,提供两种方式执行数据查询操作: -- 使用 IoTDB-SQL 执行查询。 -- 常用查询的高效执行接口,包括时间序列原始数据范围查询、最新点查询、简单聚合查询。 - -#### 使用 IoTDB-SQL 执行查询 - -数据查询语句支持在 SQL 命令行终端、JDBC、JAVA / C++ / Python / Go 等编程语言 API、RESTful API 中使用。 - -- 在 SQL 命令行终端中执行查询语句:启动 SQL 命令行终端,直接输入查询语句执行即可,详见 [SQL 命令行终端](../Tools-System/CLI_timecho.md)。 - -- 在 JDBC 中执行查询语句,详见 [JDBC](../API/Programming-JDBC_timecho.md) 。 - -- 在 JAVA / C++ / Python / Go 等编程语言 API 中执行查询语句,详见应用编程接口一章相应文档。接口原型如下: - - ```java - SessionDataSet executeQueryStatement(String sql); - ``` - -- 在 RESTful API 中使用,详见 [HTTP API V1](../API/RestServiceV1_timecho.md) 或者 [HTTP API V2](../API/RestServiceV2_timecho.md)。 - -#### 常用查询的高效执行接口 - -各编程语言的 API 为常用的查询提供了高效执行接口,可以省去 SQL 解析等操作的耗时。包括: - -* 时间序列原始数据范围查询: - - 指定的查询时间范围为左闭右开区间,包含开始时间但不包含结束时间。 - -```java -SessionDataSet executeRawDataQuery(List paths, long 
startTime, long endTime); -``` - -* 最新点查询: - - 查询最后一条时间戳大于等于某个时间点的数据。 - -```java -SessionDataSet executeLastDataQuery(List paths, long lastTime); -``` - -* 聚合查询: - - 支持指定查询时间范围。指定的查询时间范围为左闭右开区间,包含开始时间但不包含结束时间。 - - 支持按照时间区间分段查询。 - -```java -SessionDataSet executeAggregationQuery(List paths, List aggregations); - -SessionDataSet executeAggregationQuery( - List paths, List aggregations, long startTime, long endTime); - -SessionDataSet executeAggregationQuery( - List paths, - List aggregations, - long startTime, - long endTime, - long interval); - -SessionDataSet executeAggregationQuery( - List paths, - List aggregations, - long startTime, - long endTime, - long interval, - long slidingStep); -``` - -## 2. 选择表达式(SELECT FROM 子句) - -`SELECT` 子句指定查询的输出,由若干个 `selectExpr` 组成。 每个 `selectExpr` 定义了查询结果中的一列或多列。 - -**`selectExpr` 是一个由时间序列路径后缀、常量、函数和运算符组成的表达式。即 `selectExpr` 中可以包含:** -- 时间序列路径后缀(支持使用通配符) -- 运算符 - - 算数运算符 - - 比较运算符 - - 逻辑运算符 -- 函数 - - 聚合函数 - - 时间序列生成函数(包括内置函数和用户自定义函数) -- 常量 - -### 2.1 使用别名 - -由于 IoTDB 独特的数据模型,在每个传感器前都附带有设备等诸多额外信息。有时,我们只针对某个具体设备查询,而这些前缀信息频繁显示造成了冗余,影响了结果集的显示与分析。 - -IoTDB 支持使用`AS`为查询结果集中的列指定别名。 - -**示例:** - -```sql -select s1 as temperature, s2 as speed from root.ln.wf01.wt01; -``` - -结果集将显示为: - -| Time | temperature | speed | -| ---- | ----------- | ----- | -| ... | ... | ... 
| - -### 2.2 运算符 - -IoTDB 中支持的运算符列表见文档 [运算符和函数](../SQL-Manual/Operator-and-Expression.md)。 - -### 2.3 函数 - -#### 聚合函数 - -聚合函数是多对一函数。它们对一组值进行聚合计算,得到单个聚合结果。 - -**包含聚合函数的查询称为聚合查询**,否则称为时间序列查询。 - -**注意:聚合查询和时间序列查询不能混合使用。** 下列语句是不支持的: - -```sql -select s1, count(s1) from root.sg.d1; -select sin(s1), count(s1) from root.sg.d1; -select s1, count(s1) from root.sg.d1 group by ([10,100),10ms); -``` - -IoTDB 支持的聚合函数见文档 [聚合函数](../SQL-Manual/Operator-and-Expression.md#内置函数)。 - -#### 时间序列生成函数 - -时间序列生成函数接受若干原始时间序列作为输入,产生一列时间序列输出。与聚合函数不同的是,时间序列生成函数的结果集带有时间戳列。 - -所有的时间序列生成函数都可以接受 * 作为输入,都可以与原始时间序列查询混合进行。 - -##### 内置时间序列生成函数 - -IoTDB 中支持的内置函数列表见文档 [运算符和函数](../SQL-Manual/Operator-and-Expression.md)。 - -##### 自定义时间序列生成函数 - -IoTDB 支持通过用户自定义函数(点击查看: [用户自定义函数](../User-Manual/Database-Programming.md#用户自定义函数) )能力进行函数功能扩展。 - -### 2.4 嵌套表达式举例 - -IoTDB 支持嵌套表达式,由于聚合查询和时间序列查询不能在一条查询语句中同时出现,我们将支持的嵌套表达式分为时间序列查询嵌套表达式和聚合查询嵌套表达式两类。 - -#### 时间序列查询嵌套表达式 - -IoTDB 支持在 `SELECT` 子句中计算由**时间序列、常量、时间序列生成函数(包括用户自定义函数)和运算符**组成的任意嵌套表达式。 - -**说明:** - -- 当某个时间戳下左操作数和右操作数都不为空(`null`)时,表达式才会有结果,否则表达式值为`null`,且默认不出现在结果集中。 -- 如果表达式中某个操作数对应多条时间序列(如通配符 `*`),那么每条时间序列对应的结果都会出现在结果集中(按照笛卡尔积形式)。 - -**示例 1:** - -```sql -select a, - b, - ((a + 1) * 2 - 1) % 2 + 1.5, - sin(a + sin(a + sin(b))), - -(a + b) * (sin(a + b) * sin(a + b) + cos(a + b) * cos(a + b)) + 1 -from root.sg1; -``` - -运行结果: - -```shell -+-----------------------------+----------+----------+----------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| Time|root.sg1.a|root.sg1.b|((((root.sg1.a + 1) * 2) - 1) % 2) + 1.5|sin(root.sg1.a + sin(root.sg1.a + sin(root.sg1.b)))|(-root.sg1.a + root.sg1.b * ((sin(root.sg1.a + root.sg1.b) * sin(root.sg1.a + root.sg1.b)) + (cos(root.sg1.a + root.sg1.b) * cos(root.sg1.a + root.sg1.b)))) + 1| 
-+-----------------------------+----------+----------+----------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.010+08:00| 1| 1| 2.5| 0.9238430524420609| -1.0| -|1970-01-01T08:00:00.020+08:00| 2| 2| 2.5| 0.7903505371876317| -3.0| -|1970-01-01T08:00:00.030+08:00| 3| 3| 2.5| 0.14065207680386618| -5.0| -|1970-01-01T08:00:00.040+08:00| 4| null| 2.5| null| null| -|1970-01-01T08:00:00.050+08:00| null| 5| null| null| null| -|1970-01-01T08:00:00.060+08:00| 6| 6| 2.5| -0.7288037411970916| -11.0| -+-----------------------------+----------+----------+----------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 6 -It costs 0.048s -``` - -**示例 2:** - -```sql -select (a + b) * 2 + sin(a) from root.sg; -``` - -运行结果: - -```shell -+-----------------------------+----------------------------------------------+ -| Time|((root.sg.a + root.sg.b) * 2) + sin(root.sg.a)| -+-----------------------------+----------------------------------------------+ -|1970-01-01T08:00:00.010+08:00| 59.45597888911063| -|1970-01-01T08:00:00.020+08:00| 100.91294525072763| -|1970-01-01T08:00:00.030+08:00| 139.01196837590714| -|1970-01-01T08:00:00.040+08:00| 180.74511316047935| -|1970-01-01T08:00:00.050+08:00| 219.73762514629607| -|1970-01-01T08:00:00.060+08:00| 259.6951893788978| -|1970-01-01T08:00:00.070+08:00| 300.7738906815579| -|1970-01-01T08:00:00.090+08:00| 39.45597888911063| -|1970-01-01T08:00:00.100+08:00| 39.45597888911063| -+-----------------------------+----------------------------------------------+ -Total line number = 9 -It costs 0.011s -``` - -**示例 3:** - -```sql 
-select (a + *) / 2 from root.sg1; -``` - -运行结果: - -```shell -+-----------------------------+-----------------------------+-----------------------------+ -| Time|(root.sg1.a + root.sg1.a) / 2|(root.sg1.a + root.sg1.b) / 2| -+-----------------------------+-----------------------------+-----------------------------+ -|1970-01-01T08:00:00.010+08:00| 1.0| 1.0| -|1970-01-01T08:00:00.020+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.030+08:00| 3.0| 3.0| -|1970-01-01T08:00:00.040+08:00| 4.0| null| -|1970-01-01T08:00:00.060+08:00| 6.0| 6.0| -+-----------------------------+-----------------------------+-----------------------------+ -Total line number = 5 -It costs 0.011s -``` - -**示例 4:** - -```sql -select (a + b) * 3 from root.sg, root.ln; -``` - -运行结果: - -```shell -+-----------------------------+---------------------------+---------------------------+---------------------------+---------------------------+ -| Time|(root.sg.a + root.sg.b) * 3|(root.sg.a + root.ln.b) * 3|(root.ln.a + root.sg.b) * 3|(root.ln.a + root.ln.b) * 3| -+-----------------------------+---------------------------+---------------------------+---------------------------+---------------------------+ -|1970-01-01T08:00:00.010+08:00| 90.0| 270.0| 360.0| 540.0| -|1970-01-01T08:00:00.020+08:00| 150.0| 330.0| 690.0| 870.0| -|1970-01-01T08:00:00.030+08:00| 210.0| 450.0| 570.0| 810.0| -|1970-01-01T08:00:00.040+08:00| 270.0| 240.0| 690.0| 660.0| -|1970-01-01T08:00:00.050+08:00| 330.0| null| null| null| -|1970-01-01T08:00:00.060+08:00| 390.0| null| null| null| -|1970-01-01T08:00:00.070+08:00| 450.0| null| null| null| -|1970-01-01T08:00:00.090+08:00| 60.0| null| null| null| -|1970-01-01T08:00:00.100+08:00| 60.0| null| null| null| -+-----------------------------+---------------------------+---------------------------+---------------------------+---------------------------+ -Total line number = 9 -It costs 0.014s -``` - -#### 聚合查询嵌套表达式 - -IoTDB 支持在 `SELECT` 子句中计算由**聚合函数、常量、时间序列生成函数和表达式**组成的任意嵌套表达式。 - -**说明:** -- 
当某个时间戳下左操作数和右操作数都不为空(`null`)时,表达式才会有结果,否则表达式值为`null`,且默认不出现在结果集中。但在使用`GROUP BY`子句的聚合查询嵌套表达式中,我们希望保留每个时间窗口的值,所以表达式值为`null`的窗口也包含在结果集中。 -- 如果表达式中某个操作数对应多条时间序列(如通配符`*`),那么每条时间序列对应的结果都会出现在结果集中(按照笛卡尔积形式)。 - -**示例 1:** - -```sql -select avg(temperature), - sin(avg(temperature)), - avg(temperature) + 1, - -sum(hardware), - avg(temperature) + sum(hardware) -from root.ln.wf01.wt01; -``` - -运行结果: - -```shell -+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+--------------------------------------------------------------------+ -|avg(root.ln.wf01.wt01.temperature)|sin(avg(root.ln.wf01.wt01.temperature))|avg(root.ln.wf01.wt01.temperature) + 1|-sum(root.ln.wf01.wt01.hardware)|avg(root.ln.wf01.wt01.temperature) + sum(root.ln.wf01.wt01.hardware)| -+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+--------------------------------------------------------------------+ -| 15.927999999999999| -0.21826546964855045| 16.927999999999997| -7426.0| 7441.928| -+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+--------------------------------------------------------------------+ -Total line number = 1 -It costs 0.009s -``` - -**示例 2:** - -```sql -select avg(*), - (avg(*) + 1) * 3 / 2 -1 -from root.sg1; -``` - -运行结果: - -```shell -+---------------+---------------+-------------------------------------+-------------------------------------+ -|avg(root.sg1.a)|avg(root.sg1.b)|(avg(root.sg1.a) + 1) * 3 / 2 - 1 |(avg(root.sg1.b) + 1) * 3 / 2 - 1 | -+---------------+---------------+-------------------------------------+-------------------------------------+ -| 3.2| 3.4| 5.300000000000001| 5.6000000000000005| 
-+---------------+---------------+-------------------------------------+-------------------------------------+ -Total line number = 1 -It costs 0.007s -``` - -**示例 3:** - -```sql -select avg(temperature), - sin(avg(temperature)), - avg(temperature) + 1, - -sum(hardware), - avg(temperature) + sum(hardware) as custom_sum -from root.ln.wf01.wt01 -GROUP BY([10, 90), 10ms); -``` - -运行结果: - -```shell -+-----------------------------+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+----------+ -| Time|avg(root.ln.wf01.wt01.temperature)|sin(avg(root.ln.wf01.wt01.temperature))|avg(root.ln.wf01.wt01.temperature) + 1|-sum(root.ln.wf01.wt01.hardware)|custom_sum| -+-----------------------------+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+----------+ -|1970-01-01T08:00:00.010+08:00| 13.987499999999999| 0.9888207947857667| 14.987499999999999| -3211.0| 3224.9875| -|1970-01-01T08:00:00.020+08:00| 29.6| -0.9701057337071853| 30.6| -3720.0| 3749.6| -|1970-01-01T08:00:00.030+08:00| null| null| null| null| null| -|1970-01-01T08:00:00.040+08:00| null| null| null| null| null| -|1970-01-01T08:00:00.050+08:00| null| null| null| null| null| -|1970-01-01T08:00:00.060+08:00| null| null| null| null| null| -|1970-01-01T08:00:00.070+08:00| null| null| null| null| null| -|1970-01-01T08:00:00.080+08:00| null| null| null| null| null| -+-----------------------------+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+----------+ -Total line number = 8 -It costs 0.012s -``` - -### 2.5 最新点查询 - -最新点查询是时序数据库 Apache IoTDB 中提供的一种特殊查询。它返回指定时间序列中时间戳最大的数据点,即一条序列的最新状态。 - -在物联网数据分析场景中,此功能尤为重要。为了满足了用户对设备实时监控的需求,Apache IoTDB 对最新点查询进行了**缓存优化**,能够提供毫秒级的返回速度。 - -SQL 语法: - -```sql -select last [COMMA ]* from < PrefixPath > 
[COMMA < PrefixPath >]* [ORDER BY TIMESERIES (DESC | ASC)?] -``` - -其含义是: 查询时间序列 prefixPath.path 中最近时间戳的数据。 - -- `whereClause` 中当前只支持时间过滤条件,任何其他过滤条件都将会返回异常。当缓存的最新点不满足过滤条件时,IoTDB 需要从存储中获取结果,此时性能将会有所下降。 - -- 结果集为四列的结构: - - ```shell - +----+----------+-----+--------+ - |Time|timeseries|value|dataType| - +----+----------+-----+--------+ - ``` - -- 可以使用 `ORDER BY TIME/TIMESERIES/VALUE/DATATYPE (DESC | ASC)` 指定结果集按照某一列进行降序/升序排列。当值列包含多种类型的数据时,按照字符串类型来排序。 - -**示例 1:** 查询 root.ln.wf01.wt01.status 的最新数据点 - -```sql - select last status from root.ln.wf01.wt01; -``` -```shell -+-----------------------------+------------------------+-----+--------+ -| Time| timeseries|value|dataType| -+-----------------------------+------------------------+-----+--------+ -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.status|false| BOOLEAN| -+-----------------------------+------------------------+-----+--------+ -Total line number = 1 -It costs 0.000s -``` - -**示例 2:** 查询 root.ln.wf01.wt01 下 status,temperature 时间戳大于等于 2017-11-07T23:50:00 的最新数据点。 - -```sql - select last status, temperature from root.ln.wf01.wt01 where time >= 2017-11-07T23:50:00; -``` -```shell -+-----------------------------+-----------------------------+---------+--------+ -| Time| timeseries| value|dataType| -+-----------------------------+-----------------------------+---------+--------+ -|2017-11-07T23:59:00.000+08:00| root.ln.wf01.wt01.status| false| BOOLEAN| -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.temperature|21.067368| DOUBLE| -+-----------------------------+-----------------------------+---------+--------+ -Total line number = 2 -It costs 0.002s -``` - -**示例 3:** 查询 root.ln.wf01.wt01 下所有序列的最新数据点,并按照序列名降序排列。 - -```sql - select last * from root.ln.wf01.wt01 order by timeseries desc; -``` -```shell -+-----------------------------+-----------------------------+---------+--------+ -| Time| timeseries| value|dataType| -+-----------------------------+-----------------------------+---------+--------+ 
-|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.temperature|21.067368| DOUBLE| -|2017-11-07T23:59:00.000+08:00| root.ln.wf01.wt01.status| false| BOOLEAN| -+-----------------------------+-----------------------------+---------+--------+ -Total line number = 2 -It costs 0.002s -``` - -**示例 4:** 查询 root.ln.wf01.wt01 下所有序列的最新数据点,并按照dataType降序排列。 - -```sql - select last * from root.ln.wf01.wt01 order by dataType desc; -``` -```shell -+-----------------------------+-----------------------------+---------+--------+ -| Time| timeseries| value|dataType| -+-----------------------------+-----------------------------+---------+--------+ -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.temperature|21.067368| DOUBLE| -|2017-11-07T23:59:00.000+08:00| root.ln.wf01.wt01.status| false| BOOLEAN| -+-----------------------------+-----------------------------+---------+--------+ -Total line number = 2 -It costs 0.002s -``` - -**注意:** 可以通过函数组合方式实现其他过滤条件查询最新点的需求,例如 - -```sql - select max_time(*), last_value(*) from root.ln.wf01.wt01 where time >= 2017-11-07T23:50:00 and status = false align by device; -``` -```shell -+-----------------+---------------------+----------------+-----------------------+------------------+ -| Device|max_time(temperature)|max_time(status)|last_value(temperature)|last_value(status)| -+-----------------+---------------------+----------------+-----------------------+------------------+ -|root.ln.wf01.wt01| 1510077540000| 1510077540000| 21.067368| false| -+-----------------+---------------------+----------------+-----------------------+------------------+ -Total line number = 1 -It costs 0.021s -``` - - -## 3. 
Query Filter Conditions (WHERE Clause) - -The `WHERE` clause specifies the filter conditions applied to data rows and consists of a single `whereCondition`. - -`whereCondition` is a logical expression that evaluates to true for every row to be selected. If there is no `WHERE` clause, all rows are selected. -In `whereCondition`, you can use any function or operator supported by IoTDB except aggregate functions. - -Filter conditions fall into two kinds: time filters and value filters. The two kinds can be combined. - -### 3.1 Time Filter Conditions - -Time filters select data within a specific time range. For the supported timestamp formats, see [Timestamp Type](../Background-knowledge/Data-Type.md). - -Examples: - -1. Select data with a timestamp greater than 2022-01-01T00:05:00.000: - - ```sql - select s1 from root.sg1.d1 where time > 2022-01-01T00:05:00.000; - ``` - -2. Select data with a timestamp equal to 2022-01-01T00:05:00.000: - - ```sql - select s1 from root.sg1.d1 where time = 2022-01-01T00:05:00.000; - ``` - -3. Select data in the time interval [2017-11-01T00:05:00.000, 2017-11-01T00:12:00.000): - - ```sql - select s1 from root.sg1.d1 where time >= 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; - ``` - -Note: in the examples above, `time` can also be written as `timestamp`. - -### 3.2 Value Filter Conditions - -Value filters select data whose values satisfy specific conditions. -Time series that are not selected in the `select` clause **may** be used in value filters. - -Examples: - -1. Select data whose value is greater than 36.5: - - ```sql - select temperature from root.sg1.d1 where temperature > 36.5; - ``` - -2. Select data whose value equals true: - - ```sql - select status from root.sg1.d1 where status = true; - ``` - -3. Select data inside or outside the interval [36.5, 40]: - - ```sql - select temperature from root.sg1.d1 where temperature between 36.5 and 40; - ``` - ```sql - select temperature from root.sg1.d1 where temperature not between 36.5 and 40; - ``` - -4. Select data whose value is in a given set: - - ```sql - select code from root.sg1.d1 where code in ('200', '300', '400', '500'); - ``` - -5. Select data whose value is not in a given set: - - ```sql - select code from root.sg1.d1 where code not in ('200', '300', '400', '500'); - ``` - -6. Select data whose value is null: - - ```sql - select code from root.sg1.d1 where temperature is null; - ``` - -7. 
Select data whose value is not null: - - ```sql - select code from root.sg1.d1 where temperature is not null; - ``` - -### 3.3 Fuzzy Query - -For data of type TEXT and STRING, the `Like` and `Regexp` operators can be used for fuzzy matching. - -#### Fuzzy Matching with `Like` - -**Matching rules:** - -- `%` matches any sequence of zero or more characters. -- `_` matches any single character. - -**Example 1:** Query the data under `root.sg.d1` whose `value` contains `'cc'`. - -```sql - select * from root.sg.d1 where value like '%cc%'; -``` -```shell -+-----------------------------+----------------+ -| Time|root.sg.d1.value| -+-----------------------------+----------------+ -|2017-11-01T00:00:00.000+08:00| aabbccdd| -|2017-11-01T00:00:01.000+08:00| cc| -+-----------------------------+----------------+ -Total line number = 2 -It costs 0.002s -``` - -**Example 2:** Query the data under `root.sg.d1` whose `value` has `'b'` in the middle, with exactly one arbitrary character before and after it. - -```sql - select * from root.sg.d1 where value like '_b_'; -``` -```shell -+-----------------------------+----------------+ -| Time|root.sg.d1.value| -+-----------------------------+----------------+ -|2017-11-01T00:00:02.000+08:00| abc| -+-----------------------------+----------------+ -Total line number = 1 -It costs 0.002s -``` - -#### Fuzzy Matching with `Regexp` - -The filter condition must be a **regular expression in the style of the Java standard library**. - -**Common regular expression examples:** - -```shell -Any string of 3 to 20 characters: ^.{3,20}$ -Uppercase English letters only: ^[A-Z]+$ -Digits and English letters: ^[A-Za-z0-9]+$ -Starts with a: ^a.* -``` - -**Example 1:** Query the data under root.sg.d1 whose value is a string consisting only of English letters. - -```sql - select * from root.sg.d1 where value regexp '^[A-Za-z]+$'; -``` -```shell -+-----------------------------+----------------+ -| Time|root.sg.d1.value| -+-----------------------------+----------------+ -|2017-11-01T00:00:00.000+08:00| aabbccdd| -|2017-11-01T00:00:01.000+08:00| cc| -+-----------------------------+----------------+ -Total line number = 2 -It costs 0.002s -``` - -**Example 2:** Query the data under root.sg.d1 whose value is a string consisting only of lowercase English letters and whose time is greater than 100. - -```sql - select * from root.sg.d1 where value regexp '^[a-z]+$' and time > 100; -``` -```shell -+-----------------------------+----------------+ -| Time|root.sg.d1.value| -+-----------------------------+----------------+ 
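-|2017-11-01T00:00:00.000+08:00| aabbccdd| -|2017-11-01T00:00:01.000+08:00| cc| -+-----------------------------+----------------+ -Total line number = 2 -It costs 0.002s -``` - -## 4. Segmented and Grouped Aggregation (GROUP BY Clause) -IoTDB supports segmented or grouped aggregation of series through the `GROUP BY` clause. - -Segmented aggregation splits data row-wise along the time dimension, based on the temporal relationship between data points of the same time series, and produces one aggregate value per segment. Currently supported are **time interval segmentation**, **variation segmentation**, **condition segmentation**, **session segmentation**, and **count segmentation**; more segmentation methods will be supported in the future. - -Grouped aggregation groups different time series by their underlying business attributes. Each group contains several time series and produces one aggregate value. Two grouping methods are supported: **grouping by path level** and **grouping by series tags**. - -### 4.1 Segmented Aggregation - -#### Time Interval Segmented Aggregation - -Time interval segmented aggregation is a typical query pattern for time series data. Data is collected at high frequency and needs to be aggregated at fixed time intervals. For example, to compute the average temperature per day, the temperature series is segmented by day and the average is computed for each segment. - -In IoTDB, an aggregation query can use the `GROUP BY` clause to aggregate by time interval. Users can specify the aggregation interval and the sliding step. The parameters are: - -* Parameter 1: the display window on the time axis -* Parameter 2: the size of the aggregation window (must be positive) -* Parameter 3: the sliding step of the aggregation window (optional; defaults to the aggregation window size) - -The figure below illustrates the meaning of these three parameters: - - - -Next, we give several typical examples: - -##### Time Interval Segmented Aggregation without a Sliding Step - -The corresponding SQL statement is: - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d); -``` -The meaning of this query is: - -Because the user did not specify a sliding step, it defaults to the time interval parameter, that is, `1d`. - -The first parameter in this example is the display window, which sets the final display range to [2017-11-01T00:00:00, 2017-11-07T23:00:00). - -The second parameter in this example is the time interval used to divide the time axis. With `1d` as the interval and the start of the display window as the origin, the time axis is divided into consecutive intervals: [0,1d), [1d, 2d), [2d, 3d), and so on. - -The system then combines the time and value filters in the WHERE clause with the first parameter of the GROUP BY clause into a joint filter, retrieves the data that satisfies all conditions (in this example, the data in the range [2017-11-01T00:00:00, 2017-11-07T23:00:00)), and maps that data onto the divided time axis (in this example, each day from 2017-11-01T00:00:00 to 2017-11-07T23:00:00). - -Every time window contains data, so the result set after executing the SQL is as follows: - -```shell -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-01T00:00:00.000+08:00| 1440| 26.0| -|2017-11-02T00:00:00.000+08:00| 1440| 26.0| -|2017-11-03T00:00:00.000+08:00| 1440| 25.99| -|2017-11-04T00:00:00.000+08:00| 1440| 26.0| -|2017-11-05T00:00:00.000+08:00| 1440| 26.0| 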
-|2017-11-06T00:00:00.000+08:00| 1440| 25.99| -|2017-11-07T00:00:00.000+08:00| 1380| 26.0| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 7 -It costs 0.024s -``` - -##### Time Interval Segmented Aggregation with a Sliding Step - -The corresponding SQL statement is: - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d); -``` - -The meaning of this query is: - -Because the user specified a sliding step of `1d`, the GROUP BY clause moves the time interval forward by one day each time, rather than by the default 3 hours. - -In other words, we want the data from 00:00 to 03:00 of each day from 2017-11-01 to 2017-11-07. - -The first parameter in this example is the display window, which sets the final display range to [2017-11-01T00:00:00, 2017-11-07T23:00:00). - -The second parameter in this example is the time interval used to divide the time axis. With `3h` as the interval and the start of the display window as the origin, the time axis is divided into the intervals [2017-11-01T00:00:00, 2017-11-01T03:00:00), [2017-11-02T00:00:00, 2017-11-02T03:00:00), [2017-11-03T00:00:00, 2017-11-03T03:00:00), and so on. - -The third parameter in this example is the sliding step applied to the time interval each time. - -The system then combines the time and value filters in the WHERE clause with the first parameter of the GROUP BY clause into a joint filter, retrieves the data that satisfies all conditions (in this example, the data in the range [2017-11-01T00:00:00, 2017-11-07T23:00:00)), and maps that data onto the divided time axis (in this example, 00:00 to 03:00 of each day from 2017-11-01T00:00:00 to 2017-11-07T23:00:00). - -Every time window contains data, so the result set after executing the SQL is as follows: - -```shell -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-01T00:00:00.000+08:00| 180| 25.98| -|2017-11-02T00:00:00.000+08:00| 180| 25.98| -|2017-11-03T00:00:00.000+08:00| 180| 25.96| -|2017-11-04T00:00:00.000+08:00| 180| 25.96| -|2017-11-05T00:00:00.000+08:00| 180| 26.0| -|2017-11-06T00:00:00.000+08:00| 180| 25.85| -|2017-11-07T00:00:00.000+08:00| 180| 25.99| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 7 -It costs 0.006s -``` - -The sliding step can be smaller than the aggregation window, in which case adjacent aggregation windows overlap in time (similar to a sliding window). - -For example, the SQL: -```sql -select 
count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-01 10:00:00), 4h, 2h); -``` - -The result set after executing the SQL is as follows: - -```shell -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-01T00:00:00.000+08:00| 180| 25.98| -|2017-11-01T02:00:00.000+08:00| 180| 25.98| -|2017-11-01T04:00:00.000+08:00| 180| 25.96| -|2017-11-01T06:00:00.000+08:00| 180| 25.96| -|2017-11-01T08:00:00.000+08:00| 180| 26.0| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 5 -It costs 0.006s -``` - -##### Time Interval Segmented Aggregation by Natural Month - -The corresponding SQL statement is: - -```sql -select count(status) from root.ln.wf01.wt01 where time > 2017-11-01T01:00:00 group by([2017-11-01T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo); -``` - -The meaning of this query is: - -Because the user specified a sliding step of `2mo`, the GROUP BY clause moves the time interval forward by 2 natural months each time, rather than by the default 1 natural month. - -In other words, we want the data of the first month of every 2 natural months from 2017-11-01 to 2019-11-07. - -The first parameter in this example is the display window, which sets the final display range to [2017-11-01T00:00:00, 2019-11-07T23:00:00). - -The start time is 2017-11-01T00:00:00. The sliding step advances month by month relative to the start time, and the 1st day of the month is taken as the start of each time interval. - -The second parameter in this example is the time interval used to divide the time axis. With `1mo` as the interval and the start of the display window as the origin, the time axis is divided into the intervals [2017-11-01T00:00:00, 2017-12-01T00:00:00), [2018-01-01T00:00:00, 2018-02-01T00:00:00), [2018-03-01T00:00:00, 2018-04-01T00:00:00), and so on. - -The third parameter in this example is the sliding step applied to the time interval each time. - -The system then combines the time and value filters in the WHERE clause with the first parameter of the GROUP BY clause into a joint filter, retrieves the data that satisfies all conditions (in this example, the data in the range [2017-11-01T00:00:00, 2019-11-07T23:00:00)), and maps that data onto the divided time axis (in this example, the first month of every two natural months from 2017-11-01T00:00:00 to 2019-11-07T23:00:00). - -Every time window contains data, so the result set after executing the SQL is as follows: - -```shell -+-----------------------------+-------------------------------+ -| Time|count(root.ln.wf01.wt01.status)| 
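-+-----------------------------+-------------------------------+
-|2017-11-01T00:00:00.000+08:00| 259|
-|2018-01-01T00:00:00.000+08:00| 250|
-|2018-03-01T00:00:00.000+08:00| 259|
-|2018-05-01T00:00:00.000+08:00| 251|
-|2018-07-01T00:00:00.000+08:00| 242|
-|2018-09-01T00:00:00.000+08:00| 225|
-|2018-11-01T00:00:00.000+08:00| 216|
-|2019-01-01T00:00:00.000+08:00| 207|
-|2019-03-01T00:00:00.000+08:00| 216|
-|2019-05-01T00:00:00.000+08:00| 207|
-|2019-07-01T00:00:00.000+08:00| 199|
-|2019-09-01T00:00:00.000+08:00| 181|
-|2019-11-01T00:00:00.000+08:00| 60|
-+-----------------------------+-------------------------------+
-```
-
-The corresponding SQL statement is:
-
-```sql
-select count(status) from root.ln.wf01.wt01 group by([2017-10-31T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo);
-```
-
-This query means:
-
-Because the user specifies a sliding step of `2mo`, the GROUP BY clause moves the time interval forward by 2 natural months each time, instead of the default 1 natural month.
-
-In other words, we want the data of the first month in every 2 natural months from 2017-10-31 to 2019-11-07.
-
-Unlike the previous example, the start time is 2017-10-31T00:00:00. The sliding step advances month by month from the start time, and the 31st (that is, the last day) of the month is used as the start of each time interval. If the start time were set to the 30th, the start of each time interval would be the 30th of the month, or the last day of the month when the 30th does not exist.
-
-The first parameter in the example above is the display window, which sets the final display range to [2017-10-31T00:00:00, 2019-11-07T23:00:00).
-
-The second parameter is the time interval used to divide the time axis. With `1mo` as the interval and the start of the display window as the origin, the time axis is divided into the following intervals, where a day that does not exist in a month falls back to the last day of that month: [2017-10-31T00:00:00, 2017-11-30T00:00:00), [2017-12-31T00:00:00, 2018-01-31T00:00:00), [2018-02-28T00:00:00, 2018-03-31T00:00:00), and so on.
-
-The third parameter is the sliding step applied to the time interval each time.
-
-The system then combines the time and value filters in the WHERE clause with the first parameter of the GROUP BY clause into a joint filter, fetches the data that satisfies all of them (here, the data in the range [2017-10-31T00:00:00, 2019-11-07T23:00:00)), and maps that data onto the divided time axis (here, the first month of every two natural months from 2017-10-31T00:00:00 to 2019-11-07T23:00:00).
-
-Every time interval window contains data, so the result set after executing the SQL is:
-
-```shell
-+-----------------------------+-------------------------------+
-| Time|count(root.ln.wf01.wt01.status)|
-+-----------------------------+-------------------------------+
-|2017-10-31T00:00:00.000+08:00| 251|
-|2017-12-31T00:00:00.000+08:00| 250|
-|2018-02-28T00:00:00.000+08:00| 259|
-|2018-04-30T00:00:00.000+08:00| 250|
-|2018-06-30T00:00:00.000+08:00|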
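242|
-|2018-08-31T00:00:00.000+08:00| 225|
-|2018-10-31T00:00:00.000+08:00| 216|
-|2018-12-31T00:00:00.000+08:00| 208|
-|2019-02-28T00:00:00.000+08:00| 216|
-|2019-04-30T00:00:00.000+08:00| 208|
-|2019-06-30T00:00:00.000+08:00| 199|
-|2019-08-31T00:00:00.000+08:00| 181|
-|2019-10-31T00:00:00.000+08:00| 69|
-+-----------------------------+-------------------------------+
-```
-
-##### Left-Open, Right-Closed Intervals
-
-The result timestamp of each interval is the right endpoint of the interval. The corresponding SQL statement is:
-
-```sql
-select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d);
-```
-
-The time range of this query is left-open and right-closed: the result excludes the data point at 2017-11-01T00:00:00 but includes the data point at 2017-11-07T23:00:00.
-
-The result set after executing the SQL is:
-
-```shell
-+-----------------------------+-------------------------------+
-| Time|count(root.ln.wf01.wt01.status)|
-+-----------------------------+-------------------------------+
-|2017-11-02T00:00:00.000+08:00| 1440|
-|2017-11-03T00:00:00.000+08:00| 1440|
-|2017-11-04T00:00:00.000+08:00| 1440|
-|2017-11-05T00:00:00.000+08:00| 1440|
-|2017-11-06T00:00:00.000+08:00| 1440|
-|2017-11-07T00:00:00.000+08:00| 1440|
-|2017-11-07T23:00:00.000+08:00| 1380|
-+-----------------------------+-------------------------------+
-Total line number = 7
-It costs 0.004s
-```
-
-#### Variation Segmented Aggregation
-IoTDB supports grouping by value difference through the `GROUP BY VARIATION` clause. `GROUP BY VARIATION` takes the first point as the **base point** of a group and computes, by the given rule, the difference between each new data point and the base point.
-If the difference does not exceed the given threshold, the new point joins the same group; otherwise the current group ends and a new group starts with the new point as its base point.
-Groups produced this way never overlap and have no fixed start or end time. The clause syntax is:
-```sql
-group by variation(controlExpression[,delta][,ignoreNull=true/false])
-```
-The parameters have the following meanings:
-* controlExpression
-
-The value that grouping refers to. **It can be a column of the queried data or an expression over several columns (a multi-column expression still evaluates to a single value, and all columns used in it must be numeric)**. The difference is computed on the controlExpression value of each data point.
-* delta
-
-The threshold used for grouping. Within a group, **the difference between the controlExpression value of every point and that of the group's base point is no greater than `delta`**. When `delta=0`, this is an equal-value grouping: all consecutive data points with the same expression value fall into one group.
-
-* ignoreNull
-
-Specifies how data is handled when the value of `controlExpression` is null. When `ignoreNull` is false, the null is treated as a new value; when `ignoreNull` is true, the corresponding point is skipped.
-
-The table below lists, for different values of `delta`, the return types that `controlExpression` supports and how null values are handled when `ignoreNull` is false:
-
-| delta |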
Return types supported by controlExpression | Handling of null values when ignoreNull=false |
-|----------|--------------------------------------|-----------------------------------------------------------------|
-| delta!=0 | INT32, INT64, FLOAT, DOUBLE | If the value of the group being maintained is not null, null is treated as positive/negative infinity and ends the current group. Consecutive nulls are treated as values with equal difference and are assigned to the same group |
-| delta=0 | TEXT, BINARY, INT32, INT64, FLOAT, DOUBLE | null is treated as a new value in a new group; consecutive nulls belong to the same group |
-
-The figure below illustrates one way variation segmentation divides the data: points whose control-column value differs from that of the first point in the group by no more than delta belong to the same group.
-
-(Figure: groupByVariation)
-
-##### Usage Notes
-1. The result of `controlExpression` must be a single value; an error is reported if wildcard expansion produces multiple columns.
-2. For each group, the Time column outputs the start time of the group by default. Add `__endTime` to the select list to output the end time of the group as well.
-3. When used with `ALIGN BY DEVICE`, grouping is performed separately for each device.
-4. When `delta` and `ignoreNull` are not specified, `delta` defaults to 0 and `ignoreNull` defaults to true.
-5. Use together with `GROUP BY LEVEL` is not supported yet.
-
-The following raw data is used in the example queries below.
-```shell
-+-----------------------------+-------+-------+-------+--------+-------+-------+
-| Time| s1| s2| s3| s4| s5| s6|
-+-----------------------------+-------+-------+-------+--------+-------+-------+
-|1970-01-01T08:00:00.000+08:00| 4.5| 9.0| 0.0| 45.0| 9.0| 8.25|
-|1970-01-01T08:00:00.010+08:00| null| 19.0| 10.0| 145.0| 19.0| 8.25|
-|1970-01-01T08:00:00.020+08:00| 24.5| 29.0| null| 245.0| 29.0| null|
-|1970-01-01T08:00:00.030+08:00| 34.5| null| 30.0| 345.0| null| null|
-|1970-01-01T08:00:00.040+08:00| 44.5| 49.0| 40.0| 445.0| 49.0| 8.25|
-|1970-01-01T08:00:00.050+08:00| null| 59.0| 50.0| 545.0| 59.0| 6.25|
-|1970-01-01T08:00:00.060+08:00| 64.5| 69.0| 60.0| 645.0| 69.0| null|
-|1970-01-01T08:00:00.070+08:00| 74.5| 79.0| null| null| 79.0| 3.25|
-|1970-01-01T08:00:00.080+08:00| 84.5| 89.0| 80.0| 845.0| 89.0| 3.25|
-|1970-01-01T08:00:00.090+08:00| 94.5| 99.0| 90.0| 945.0| 99.0| 3.25|
-|1970-01-01T08:00:00.150+08:00| 66.5| 77.0| 90.0| 945.0| 99.0| 9.25|
-+-----------------------------+-------+-------+-------+--------+-------+-------+
-```
-##### Equal-Value Segmentation (delta=0)
-Use the following SQL statement:
-```sql
-select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6);
-```
-The query returns the following result; rows where s6 is null are ignored:
-```shell
-+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+
-| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)|
-+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+
-|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.040+08:00| 24.5| 3| 50.0|
-|1970-01-01T08:00:00.050+08:00|1970-01-01T08:00:00.050+08:00| null| 1| 50.0|
-|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.090+08:00| 84.5| 3| 170.0|
-|1970-01-01T08:00:00.150+08:00|1970-01-01T08:00:00.150+08:00| 66.5| 1| 90.0|
-+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+
-```
-When ignoreNull is set to false, rows where s6 is null are also taken into account:
-```sql
-select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, ignoreNull=false);
-```
-The result is:
-```shell
-+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+
-| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)|
-+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+
-|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.010+08:00| 4.5| 2| 10.0|
-|1970-01-01T08:00:00.020+08:00|1970-01-01T08:00:00.030+08:00| 29.5| 1| 30.0|
-|1970-01-01T08:00:00.040+08:00|1970-01-01T08:00:00.040+08:00| 44.5| 1| 40.0|
-|1970-01-01T08:00:00.050+08:00|1970-01-01T08:00:00.050+08:00| null| 1| 50.0|
-|1970-01-01T08:00:00.060+08:00|1970-01-01T08:00:00.060+08:00| 64.5| 1| 60.0|
-|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.090+08:00| 84.5| 3| 170.0|
-|1970-01-01T08:00:00.150+08:00|1970-01-01T08:00:00.150+08:00| 66.5| 1| 90.0|
-+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+
-```
-##### Variation Segmentation (delta!=0)
-Use the following SQL statement:
-```sql
-select __endTime, avg(s1), count(s2),
sum(s3) from root.sg.d group by variation(s6, 4);
-```
-The query returns the following result:
-```shell
-+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+
-| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)|
-+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+
-|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.050+08:00| 24.5| 4| 100.0|
-|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.090+08:00| 84.5| 3| 170.0|
-|1970-01-01T08:00:00.150+08:00|1970-01-01T08:00:00.150+08:00| 66.5| 1| 90.0|
-+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+
-```
-The controlExpression in the group by clause also supports expressions over columns:
-
-```sql
-select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6+s5, 10);
-```
-The query returns the following result:
-```shell
-+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+
-| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)|
-+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+
-|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.010+08:00| 4.5| 2| 10.0|
-|1970-01-01T08:00:00.040+08:00|1970-01-01T08:00:00.050+08:00| 44.5| 2| 90.0|
-|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.080+08:00| 79.5| 2| 80.0|
-|1970-01-01T08:00:00.090+08:00|1970-01-01T08:00:00.150+08:00| 80.5| 2| 180.0|
-+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+
-```
-#### Condition Segmented Aggregation
-When data needs to be filtered by a given condition and consecutive rows that satisfy it grouped together for aggregation, use `GROUP BY CONDITION`. Rows that do not satisfy the condition belong to no group and are simply ignored.
-The syntax is defined as follows:
-```sql
-group by condition(predict,[keep>/>=/=/<=/<]threshold[,ignoreNull=true/false])
-```
-* predict
-
-A valid expression that returns a boolean value, used to select the rows of a group.
-* keep[>/>=/=/<=/<]threshold
-
-The keep expression specifies how many consecutive rows must satisfy `predict` to form a group; only groups whose row count satisfies the keep expression are output. A keep expression consists either of the string 'keep' combined with a threshold of type `long`, or of a single `long` value.
-
-* ignoreNull=true/false
-
-Specifies how rows whose predict value is null are handled: true skips the row, false ends the current group.
-
-##### Usage Notes
-1. The keep condition is required in the query, but the 'keep' string may be omitted and only a `long` constant given; this defaults to the equality condition `keep=<that long constant>`.
-2. `ignoreNull` defaults to true.
-3. For each group, the Time column outputs the start time of the group by default. Add `__endTime` to the select list to output the end time of the group as well.
-4. When used with `ALIGN BY DEVICE`, grouping is performed separately for each device.
-5. Use together with `GROUP BY LEVEL` is not supported yet.
-
-
-The example queries below use the following raw data:
-```shell
-+-----------------------------+-------------------------+-------------------------------------+------------------------------------+
-| Time|root.sg.beijing.car01.soc|root.sg.beijing.car01.charging_status|root.sg.beijing.car01.vehicle_status|
-+-----------------------------+-------------------------+-------------------------------------+------------------------------------+
-|1970-01-01T08:00:00.001+08:00| 14.0| 1| 1|
-|1970-01-01T08:00:00.002+08:00| 16.0| 1| 1|
-|1970-01-01T08:00:00.003+08:00| 16.0| 0| 1|
-|1970-01-01T08:00:00.004+08:00| 16.0| 0| 1|
-|1970-01-01T08:00:00.005+08:00| 18.0| 1| 1|
-|1970-01-01T08:00:00.006+08:00| 24.0| 1| 1|
-|1970-01-01T08:00:00.007+08:00| 36.0| 1| 1|
-|1970-01-01T08:00:00.008+08:00| 36.0| null| 1|
-|1970-01-01T08:00:00.009+08:00| 45.0| 1| 1|
-|1970-01-01T08:00:00.010+08:00| 60.0| 1| 1|
-+-----------------------------+-------------------------+-------------------------------------+------------------------------------+
-```
-To query the runs of at least two consecutive rows with charging_status=1, use the following SQL statement:
-```sql
-select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoreNull=true);
-```
-The result is:
-```shell
-+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+
-|
Time|max_time(root.sg.beijing.car01.charging_status)|count(root.sg.beijing.car01.vehicle_status)|last_value(root.sg.beijing.car01.soc)|
-+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+
-|1970-01-01T08:00:00.001+08:00| 2| 2| 16.0|
-|1970-01-01T08:00:00.005+08:00| 10| 5| 60.0|
-+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+
-```
-When `ignoreNull` is set to false, a null value is treated as a row that does not satisfy the condition and ends the group being computed.
-```sql
-select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoreNull=false);
-```
-The result is as follows; the original group is split by the row that contains null:
-```shell
-+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+
-| Time|max_time(root.sg.beijing.car01.charging_status)|count(root.sg.beijing.car01.vehicle_status)|last_value(root.sg.beijing.car01.soc)|
-+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+
-|1970-01-01T08:00:00.001+08:00| 2| 2| 16.0|
-|1970-01-01T08:00:00.005+08:00| 7| 3| 36.0|
-|1970-01-01T08:00:00.009+08:00| 10| 2| 60.0|
-+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+
-```
-#### Session Segmented Aggregation
-`GROUP BY SESSION` groups data by the gaps in the time column: rows of the result set whose time gap is less than or equal to the configured threshold fall into the same group. In industrial scenarios, for example, devices do not always run continuously, and `GROUP BY SESSION` groups together the data produced during each session in which a device is connected.
-The syntax is defined as follows:
-```sql
-group by session(timeInterval)
-```
-* timeInterval
-
-The time-difference threshold. When the difference between the time columns of two rows is greater than this threshold, a new group is created.
-
-(Figure: an example of grouping under `group by session`.)
-
-##### Usage Notes
-1. For each group, the Time column outputs the start time of the group by default. Add `__endTime` to the select list to output the end time of the group as well.
-2.
When used with `ALIGN BY DEVICE`, grouping is performed separately for each device.
-3. Use together with `GROUP BY LEVEL` is not supported yet.
-
-The example queries below use the following raw data.
-```shell
-+-----------------------------+-----------------+-----------+--------+------+
-| Time| Device|temperature|hardware|status|
-+-----------------------------+-----------------+-----------+--------+------+
-|1970-01-01T08:00:01.000+08:00|root.ln.wf02.wt01| 35.7| 11| false|
-|1970-01-01T08:00:02.000+08:00|root.ln.wf02.wt01| 35.8| 22| true|
-|1970-01-01T08:00:03.000+08:00|root.ln.wf02.wt01| 35.4| 33| false|
-|1970-01-01T08:00:04.000+08:00|root.ln.wf02.wt01| 36.4| 44| false|
-|1970-01-01T08:00:05.000+08:00|root.ln.wf02.wt01| 36.8| 55| false|
-|1970-01-01T08:00:10.000+08:00|root.ln.wf02.wt01| 36.8| 110| false|
-|1970-01-01T08:00:20.000+08:00|root.ln.wf02.wt01| 37.8| 220| true|
-|1970-01-01T08:00:30.000+08:00|root.ln.wf02.wt01| 37.5| 330| false|
-|1970-01-01T08:00:40.000+08:00|root.ln.wf02.wt01| 37.4| 440| false|
-|1970-01-01T08:00:50.000+08:00|root.ln.wf02.wt01| 37.9| 550| false|
-|1970-01-01T08:01:40.000+08:00|root.ln.wf02.wt01| 38.0| 110| false|
-|1970-01-01T08:02:30.000+08:00|root.ln.wf02.wt01| 38.8| 220| true|
-|1970-01-01T08:03:20.000+08:00|root.ln.wf02.wt01| 38.6| 330| false|
-|1970-01-01T08:04:20.000+08:00|root.ln.wf02.wt01| 38.4| 440| false|
-|1970-01-01T08:05:20.000+08:00|root.ln.wf02.wt01| 38.3| 550| false|
-|1970-01-01T08:06:40.000+08:00|root.ln.wf02.wt01| null| 0| null|
-|1970-01-01T08:07:50.000+08:00|root.ln.wf02.wt01| null| 0| null|
-|1970-01-01T08:08:00.000+08:00|root.ln.wf02.wt01| null| 0| null|
-|1970-01-02T08:08:01.000+08:00|root.ln.wf02.wt01| 38.2| 110| false|
-|1970-01-02T08:08:02.000+08:00|root.ln.wf02.wt01| 37.5| 220| true|
-|1970-01-02T08:08:03.000+08:00|root.ln.wf02.wt01| 37.4| 330| false|
-|1970-01-02T08:08:04.000+08:00|root.ln.wf02.wt01| 36.8| 440| false|
-|1970-01-02T08:08:05.000+08:00|root.ln.wf02.wt01| 37.4| 550| false|
-+-----------------------------+-----------------+-----------+--------+------+
-```
-The time interval can be given in different time units. The SQL statement is:
-```sql
-select __endTime,count(*) from root.** group by session(1d);
-```
-The result is:
-```shell
-+-----------------------------+-----------------------------+------------------------------------+---------------------------------+-------------------------------+
-| Time| __endTime|count(root.ln.wf02.wt01.temperature)|count(root.ln.wf02.wt01.hardware)|count(root.ln.wf02.wt01.status)|
-+-----------------------------+-----------------------------+------------------------------------+---------------------------------+-------------------------------+
-|1970-01-01T08:00:01.000+08:00|1970-01-01T08:08:00.000+08:00| 15| 18| 15|
-|1970-01-02T08:08:01.000+08:00|1970-01-02T08:08:05.000+08:00| 5| 5| 5|
-+-----------------------------+-----------------------------+------------------------------------+---------------------------------+-------------------------------+
-```
-It can also be combined with `HAVING` and `ALIGN BY DEVICE`:
-```sql
-select __endTime,sum(hardware) from root.ln.wf02.wt01 group by session(50s) having sum(hardware)>0 align by device;
-```
-The result is as follows; groups whose `sum(hardware)` is 0 are excluded:
-```shell
-+-----------------------------+-----------------+-----------------------------+-------------+
-| Time| Device| __endTime|sum(hardware)|
-+-----------------------------+-----------------+-----------------------------+-------------+
-|1970-01-01T08:00:01.000+08:00|root.ln.wf02.wt01|1970-01-01T08:03:20.000+08:00| 2475.0|
-|1970-01-01T08:04:20.000+08:00|root.ln.wf02.wt01|1970-01-01T08:04:20.000+08:00| 440.0|
-|1970-01-01T08:05:20.000+08:00|root.ln.wf02.wt01|1970-01-01T08:05:20.000+08:00| 550.0|
-|1970-01-02T08:08:01.000+08:00|root.ln.wf02.wt01|1970-01-02T08:08:05.000+08:00| 1650.0|
-+-----------------------------+-----------------+-----------------------------+-------------+
-```
-#### Count Segmented Aggregation
-`GROUP BY COUNT` aggregates by number of points: every run of a specified number of consecutive data points forms one group, so groups have a fixed point count.
-The syntax is defined as follows:
-```sql
-group by count(controlExpression, size[,ignoreNull=true/false])
-```
-* controlExpression
-
-The object that counting refers to. It can be any column of the result set or an expression over columns.
-
-* size
-
-The number of data points in a group; every `size` data points are placed in the same group.
-
-* ignoreNull=true/false
-
-Whether to ignore data points whose `controlExpression` is null. When ignoreNull is true, data points whose `controlExpression` evaluates to null are skipped during counting.
-
-##### Usage Notes
-1. For each group, the Time column outputs the start time of the group by default. Add `__endTime` to the select list to output the end time of the group as well.
-2. When used with `ALIGN BY DEVICE`, grouping is performed separately for each device.
-3. Use together with `GROUP BY LEVEL` is not supported yet.
-4. When the final number of points in a group is less than `size`, the result of that group is not output.
-
-The example queries below use the following raw data.
-```shell
-+-----------------------------+-----------+-----------------------+
-| Time|root.sg.soc|root.sg.charging_status|
-+-----------------------------+-----------+-----------------------+
-|1970-01-01T08:00:00.001+08:00| 14.0| 1|
-|1970-01-01T08:00:00.002+08:00| 16.0| 1|
-|1970-01-01T08:00:00.003+08:00| 16.0| 0|
-|1970-01-01T08:00:00.004+08:00| 16.0| 0|
-|1970-01-01T08:00:00.005+08:00| 18.0| 1|
-|1970-01-01T08:00:00.006+08:00| 24.0| 1|
-|1970-01-01T08:00:00.007+08:00| 36.0| 1|
-|1970-01-01T08:00:00.008+08:00| 36.0| null|
-|1970-01-01T08:00:00.009+08:00| 45.0| 1|
-|1970-01-01T08:00:00.010+08:00| 60.0| 1|
-+-----------------------------+-----------+-----------------------+
-```
-The SQL statement is:
-```sql
-select count(charging_status), first_value(soc) from root.sg group by count(charging_status,5);
-```
-The result is as follows. The second window, from 1970-01-01T08:00:00.006+08:00 to 1970-01-01T08:00:00.010+08:00, contains only four points, which does not meet the `size = 5` condition, so it is not output.
-```shell
-+-----------------------------+-----------------------------+--------------------------------------+
-| Time| __endTime|first_value(root.sg.beijing.car01.soc)|
-+-----------------------------+-----------------------------+--------------------------------------+
-|1970-01-01T08:00:00.001+08:00|1970-01-01T08:00:00.005+08:00| 14.0|
-+-----------------------------+-----------------------------+--------------------------------------+
-```
-When ignoreNull is set to false so that null values are counted as well, two windows of 5 points each are obtained. The SQL statement is:
-```sql
-select count(charging_status), first_value(soc) from root.sg group by count(charging_status,5,ignoreNull=false);
-```
-The result is:
-```shell
-+-----------------------------+-----------------------------+--------------------------------------+
-| Time| __endTime|first_value(root.sg.beijing.car01.soc)|
-+-----------------------------+-----------------------------+--------------------------------------+
-|1970-01-01T08:00:00.001+08:00|1970-01-01T08:00:00.005+08:00| 14.0|
-|1970-01-01T08:00:00.006+08:00|1970-01-01T08:00:00.010+08:00| 24.0|
-+-----------------------------+-----------------------------+--------------------------------------+
-```
-### 4.2 Grouped Aggregation
-
-#### Aggregation by Path Level
-
-In the time series hierarchy, aggregation by path level is used to **aggregate series with the same name under a given level**.
-
-- Use `GROUP BY LEVEL = INT` to specify the level to aggregate on; `ROOT` is defined as level 0. To aggregate all series under "root.ln", set level to 1.
-- Aggregation by path level supports all built-in aggregate functions. For the five aggregate functions `sum`, `avg`, `min_value`, `max_value`, and `extreme`, all aggregated time series must have the same data type. Other aggregate functions have no such restriction.
-
-**Example 1:** Series named status exist under different databases, such as "root.ln.wf01.wt01.status", "root.ln.wf02.wt02.status", and "root.sgcc.wf03.wt01.status". To count the data points of the status series under each database, use the following query:
-
-```sql
-select count(status) from root.** group by level = 1;
-```
-
-The result is:
-
-```shell
-+-------------------------+---------------------------+
-|count(root.ln.*.*.status)|count(root.sgcc.*.*.status)|
-+-------------------------+---------------------------+
-| 20160| 10080|
-+-------------------------+---------------------------+
-Total line number = 1
-It costs 0.003s
-```
-
-**Example 2:** To count the data points of the status series under each device, set level = 3:
-
-```sql
-select count(status) from root.** group by level = 3;
-```
-
-The result is:
-
-```shell
-+---------------------------+---------------------------+
-|count(root.*.*.wt01.status)|count(root.*.*.wt02.status)|
-+---------------------------+---------------------------+
-| 20160| 10080|
-+---------------------------+---------------------------+
-Total line number = 1
-It costs 0.003s
-```
-
-Note that the devices named `wt01` under the databases `ln` and `sgcc` are treated as the same device and aggregated together.
-
-**Example 3:** To count the data points of the status series for each device under each database, use the following query:
-
-```sql
-select count(status) from root.** group by level = 1, 3;
-```
-
-The result is:
-
-```shell
-+----------------------------+----------------------------+------------------------------+
-|count(root.ln.*.wt01.status)|count(root.ln.*.wt02.status)|count(root.sgcc.*.wt01.status)|
-+----------------------------+----------------------------+------------------------------+
-| 10080| 10080| 10080|
-+----------------------------+----------------------------+------------------------------+
-Total line number = 1
-It costs 0.003s
-```
-
-**Example 4:** To query the maximum value of the temperature sensor across all series, use the following query:
-
-```sql
-select max_value(temperature) from root.** group by level = 0;
-```
-
-The result is:
-
-```shell
-+---------------------------------+
-|max_value(root.*.*.*.temperature)|
-+---------------------------------+
-| 26.0|
-+---------------------------------+
-Total line number = 1
-It costs 0.013s
-```
-
-**Example 5:** The queries above each target a single sensor. In particular, **to query the total number of data points of all sensors under a given level, the measurement must be explicitly specified as `*`**:
-
-```sql
-select count(*) from root.ln.** group by level = 2;
-```
-
-The result is:
-
-```shell
-+----------------------+----------------------+
-|count(root.*.wf01.*.*)|count(root.*.wf02.*.*)|
-+----------------------+----------------------+
-| 20160| 20160|
-+----------------------+----------------------+
-Total line number = 1
-It costs 0.013s
-```
-
-##### Combining with Time-Interval Segmented Aggregation
-
-Specify LEVEL to count the data points under a given level.
-
-For example:
-
-Count the data points after downsampling:
-
-```sql
-select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d), level=1;
-```
-
-Result:
-
-```shell
-+-----------------------------+-------------------------+
-| Time|COUNT(root.ln.*.*.status)|
-+-----------------------------+-------------------------+
-|2017-11-02T00:00:00.000+08:00| 1440|
-|2017-11-03T00:00:00.000+08:00| 1440|
-|2017-11-04T00:00:00.000+08:00| 1440|
-|2017-11-05T00:00:00.000+08:00| 1440|
-|2017-11-06T00:00:00.000+08:00| 1440|
-|2017-11-07T00:00:00.000+08:00| 1440|
-|2017-11-07T23:00:00.000+08:00| 1380|
-+-----------------------------+-------------------------+
-Total line number = 7
-It costs 0.006s
-```
-
-Downsampling results that use a sliding step can be aggregated by level as well:
-
-```sql
-select count(status) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d), level=1;
-```
-
-```shell
-+-----------------------------+-------------------------+
-| Time|COUNT(root.ln.*.*.status)|
-+-----------------------------+-------------------------+
-|2017-11-01T00:00:00.000+08:00| 180|
-|2017-11-02T00:00:00.000+08:00| 180|
-|2017-11-03T00:00:00.000+08:00| 180|
-|2017-11-04T00:00:00.000+08:00| 180|
-|2017-11-05T00:00:00.000+08:00| 180|
-|2017-11-06T00:00:00.000+08:00| 180|
-|2017-11-07T00:00:00.000+08:00| 180|
-+-----------------------------+-------------------------+
-Total line number = 7
-It costs 0.004s
-```
-
-#### Aggregation by Tags
-
-IoTDB supports grouped aggregation over the key-value tags defined on time series through the `GROUP BY TAGS` clause.
-
-We first write the following sample data into IoTDB; the tag aggregation queries below use it as an example.
-
-The data is the device temperature of a factory `factory1` across several workshops in several cities, over the time range [1000, 10000).
-
-The device level of the time series path is the unique device identifier. The city `city` and the workshop `workshop` are modeled as tags on the device's time series.
-Devices `d1` and `d2` are in workshop `w1` in `Beijing`, `d3` and `d4` are in workshop `w2` in `Beijing`, `d5` and `d6` are in workshop `w1` in `Shanghai`, and `d7` is in workshop `w2` in `Shanghai`.
-Devices `d8` and `d9` are still being commissioned and have not been assigned to a city or workshop, so their tag values are null.
-
-```SQL
-create database root.factory1;
-create timeseries root.factory1.d1.temperature with datatype=FLOAT tags(city=Beijing, workshop=w1);
-create timeseries root.factory1.d2.temperature with datatype=FLOAT tags(city=Beijing, workshop=w1);
-create timeseries root.factory1.d3.temperature with datatype=FLOAT tags(city=Beijing, workshop=w2);
-create timeseries root.factory1.d4.temperature with datatype=FLOAT tags(city=Beijing, workshop=w2);
-create timeseries root.factory1.d5.temperature with datatype=FLOAT tags(city=Shanghai, workshop=w1);
-create timeseries root.factory1.d6.temperature with datatype=FLOAT tags(city=Shanghai, workshop=w1);
-create timeseries root.factory1.d7.temperature with datatype=FLOAT tags(city=Shanghai, workshop=w2);
-create timeseries root.factory1.d8.temperature with datatype=FLOAT;
-create timeseries root.factory1.d9.temperature with
datatype=FLOAT; - -insert into root.factory1.d1(time, temperature) values(1000, 104.0); -insert into root.factory1.d1(time, temperature) values(3000, 104.2); -insert into root.factory1.d1(time, temperature) values(5000, 103.3); -insert into root.factory1.d1(time, temperature) values(7000, 104.1); - -insert into root.factory1.d2(time, temperature) values(1000, 104.4); -insert into root.factory1.d2(time, temperature) values(3000, 103.7); -insert into root.factory1.d2(time, temperature) values(5000, 103.3); -insert into root.factory1.d2(time, temperature) values(7000, 102.9); - -insert into root.factory1.d3(time, temperature) values(1000, 103.9); -insert into root.factory1.d3(time, temperature) values(3000, 103.8); -insert into root.factory1.d3(time, temperature) values(5000, 102.7); -insert into root.factory1.d3(time, temperature) values(7000, 106.9); - -insert into root.factory1.d4(time, temperature) values(1000, 103.9); -insert into root.factory1.d4(time, temperature) values(5000, 102.7); -insert into root.factory1.d4(time, temperature) values(7000, 106.9); - -insert into root.factory1.d5(time, temperature) values(1000, 112.9); -insert into root.factory1.d5(time, temperature) values(7000, 113.0); - -insert into root.factory1.d6(time, temperature) values(1000, 113.9); -insert into root.factory1.d6(time, temperature) values(3000, 113.3); -insert into root.factory1.d6(time, temperature) values(5000, 112.7); -insert into root.factory1.d6(time, temperature) values(7000, 112.3); - -insert into root.factory1.d7(time, temperature) values(1000, 101.2); -insert into root.factory1.d7(time, temperature) values(3000, 99.3); -insert into root.factory1.d7(time, temperature) values(5000, 100.1); -insert into root.factory1.d7(time, temperature) values(7000, 99.8); - -insert into root.factory1.d8(time, temperature) values(1000, 50.0); -insert into root.factory1.d8(time, temperature) values(3000, 52.1); -insert into root.factory1.d8(time, temperature) values(5000, 50.1); -insert into 
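root.factory1.d8(time, temperature) values(7000, 50.5);
-
-insert into root.factory1.d9(time, temperature) values(1000, 50.3);
-insert into root.factory1.d9(time, temperature) values(3000, 52.1);
-```
-
-##### Single-Tag Aggregation Query
-
-To compute the average device temperature for each region of the factory, use the following query:
-
-```SQL
-SELECT AVG(temperature) FROM root.factory1.** GROUP BY TAGS(city);
-```
-
-The query averages all points that satisfy the query conditions across the time series sharing the same `city` tag value. The result is:
-
-```shell
-+--------+------------------+
-| city| avg(temperature)|
-+--------+------------------+
-| Beijing|104.04666697184244|
-|Shanghai|107.85000076293946|
-| NULL| 50.84999910990397|
-+--------+------------------+
-Total line number = 3
-It costs 0.231s
-```
-
-As the result set shows, tag aggregation differs from segmented aggregation and aggregation by level in the following ways:
-1. The aggregation result is not expanded per wildcard-matched series; the data of multiple time series is aggregated as a whole.
-2. Besides the aggregation result column, tag aggregation also outputs a key-value column for the aggregated tag. The column name is the tag key given in the query, and the column values are the values of that tag found in all queried time series.
-If some time series do not set the tag, the column contains a separate `NULL` row that holds the aggregation result of all time series without the tag.
-
-##### Multi-Tag Grouped Aggregation Query
-
-Besides basic single-tag aggregation, multiple tags can be specified in order for aggregation.
-
-For example, suppose the user wants the average device temperature of each workshop in each city. Because workshops in different cities may share a name, aggregating directly by `workshop` is not possible; the data must be grouped first by city and then by workshop.
-
-The SQL statement is:
-
-```SQL
-SELECT avg(temperature) FROM root.factory1.** GROUP BY TAGS(city, workshop);
-```
-
-The query result is:
-
-```shell
-+--------+--------+------------------+
-| city|workshop| avg(temperature)|
-+--------+--------+------------------+
-| NULL| NULL| 50.84999910990397|
-|Shanghai| w1|113.01666768391927|
-| Beijing| w2| 104.4000004359654|
-|Shanghai| w2|100.10000038146973|
-| Beijing| w1|103.73750019073486|
-+--------+--------+------------------+
-Total line number = 5
-It costs 0.027s
-```
-
-As the result set shows, compared with single-tag aggregation, multi-tag aggregation outputs the key-value columns of the tags in the order they were specified.
-
-##### Tag Aggregation Query over Time Intervals
-
-Aggregation by time interval is one of the most common query needs in a time series database. On top of time-interval aggregation, IoTDB supports further aggregation by tags.
-
-For example, suppose the user wants the average temperature, every 5 seconds, of the devices in each workshop of each city within the time range `[1000, 10000)`.
-
-The SQL statement is:
-
-```SQL
-SELECT AVG(temperature) FROM root.factory1.** GROUP BY ([1000, 10000), 5s), TAGS(city, workshop);
-```
-
-The query result is:
-
-```shell
-+-----------------------------+--------+--------+------------------+
-| Time| city|workshop| avg(temperature)|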
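-+-----------------------------+--------+--------+------------------+
-|1970-01-01T08:00:01.000+08:00| NULL| NULL| 50.91999893188476|
-|1970-01-01T08:00:01.000+08:00|Shanghai| w1|113.20000076293945|
-|1970-01-01T08:00:01.000+08:00| Beijing| w2| 103.4|
-|1970-01-01T08:00:01.000+08:00|Shanghai| w2| 100.1999994913737|
-|1970-01-01T08:00:01.000+08:00| Beijing| w1|103.81666692097981|
-|1970-01-01T08:00:06.000+08:00| NULL| NULL| 50.5|
-|1970-01-01T08:00:06.000+08:00|Shanghai| w1| 112.6500015258789|
-|1970-01-01T08:00:06.000+08:00| Beijing| w2| 106.9000015258789|
-|1970-01-01T08:00:06.000+08:00|Shanghai| w2| 99.80000305175781|
-|1970-01-01T08:00:06.000+08:00| Beijing| w1| 103.5|
-+-----------------------------+--------+--------+------------------+
-```
-
-Compared with plain tag aggregation, a tag aggregation query over time intervals first delimits the aggregation range by time interval and then, within each interval, aggregates the data according to the specified tag order. The output result set contains a time column whose values have the same meaning as in a time-interval aggregation query.
-
-##### Limitations of Aggregation by Tags
-
-Because tag aggregation is still under development, the following features are not implemented yet.
-
-> 1. Filtering query results with a `HAVING` clause is not supported yet.
-> 2. Sorting results by tag value is not supported yet.
-> 3. `LIMIT`, `OFFSET`, `SLIMIT`, and `SOFFSET` are not supported yet.
-> 4. `ALIGN BY DEVICE` is not supported yet.
-> 5. Expressions inside aggregate functions, such as `count(s+1)`, are not supported yet.
-> 6. Aggregation with value filter conditions is not supported, consistent with the behavior of level-based aggregation queries.
-
-## 5. Filtering Aggregation Results (HAVING Clause)
-
-To filter the results of an aggregation query, use a `HAVING` clause after the `GROUP BY` clause.
-
-**Note:**
-
-1. The filter condition in a `HAVING` clause must be built from aggregated values; a raw series must not appear on its own.
-
-   The following usages are incorrect:
-   ```sql
-   select count(s1) from root.** group by ([1,3),1ms) having sum(s1) > s1;
-   select count(s1) from root.** group by ([1,3),1ms) having s1 > 1;
-   ```
-
-2.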
When filtering the results of `GROUP BY LEVEL`, a PATH that appears in `SELECT` or `HAVING` may contain only one level.
-
-   The following usages are incorrect:
-   ```sql
-   select count(s1) from root.** group by ([1,3),1ms), level=1 having sum(d1.s1) > 1;
-   select count(d1.s1) from root.** group by ([1,3),1ms), level=1 having sum(s1) > 1;
-   ```
-
-**SQL examples:**
-
-- **Example 1:**
-
-  Filter the following aggregation result:
-
-  ```shell
-  +-----------------------------+---------------------+---------------------+
-  | Time|count(root.test.*.s1)|count(root.test.*.s2)|
-  +-----------------------------+---------------------+---------------------+
-  |1970-01-01T08:00:00.001+08:00| 4| 4|
-  |1970-01-01T08:00:00.003+08:00| 1| 0|
-  |1970-01-01T08:00:00.005+08:00| 2| 4|
-  |1970-01-01T08:00:00.007+08:00| 3| 2|
-  |1970-01-01T08:00:00.009+08:00| 4| 4|
-  +-----------------------------+---------------------+---------------------+
-  ```
-
-  ```sql
-  select count(s1) from root.** group by ([1,11),2ms), level=1 having count(s2) > 2;
-  ```
-
-  The execution result is:
-
-  ```shell
-  +-----------------------------+---------------------+
-  | Time|count(root.test.*.s1)|
-  +-----------------------------+---------------------+
-  |1970-01-01T08:00:00.001+08:00| 4|
-  |1970-01-01T08:00:00.005+08:00| 2|
-  |1970-01-01T08:00:00.009+08:00| 4|
-  +-----------------------------+---------------------+
-  ```
-
-- **Example 2:**
-
-  Filter the following aggregation result:
-  ```shell
-  +-----------------------------+-------------+---------+---------+
-  | Time| Device|count(s1)|count(s2)|
-  +-----------------------------+-------------+---------+---------+
-  |1970-01-01T08:00:00.001+08:00|root.test.sg1| 1| 2|
-  |1970-01-01T08:00:00.003+08:00|root.test.sg1| 1| 0|
-  |1970-01-01T08:00:00.005+08:00|root.test.sg1| 1| 2|
-  |1970-01-01T08:00:00.007+08:00|root.test.sg1| 2| 1|
-  |1970-01-01T08:00:00.009+08:00|root.test.sg1| 2| 2|
-  |1970-01-01T08:00:00.001+08:00|root.test.sg2| 2| 2|
-  |1970-01-01T08:00:00.003+08:00|root.test.sg2| 0| 0|
-  |1970-01-01T08:00:00.005+08:00|root.test.sg2| 1| 2|
-  |1970-01-01T08:00:00.007+08:00|root.test.sg2| 1| 1|
-  
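|1970-01-01T08:00:00.009+08:00|root.test.sg2|        2|        2|
-  +-----------------------------+-------------+---------+---------+
-  ```
-
-  ```sql
-  select count(s1), count(s2) from root.** group by ([1,11),2ms) having count(s2) > 1 align by device;
-  ```
-
-  The result is as follows:
-
-  ```shell
-  +-----------------------------+-------------+---------+---------+
-  |                         Time|       Device|count(s1)|count(s2)|
-  +-----------------------------+-------------+---------+---------+
-  |1970-01-01T08:00:00.001+08:00|root.test.sg1|        1|        2|
-  |1970-01-01T08:00:00.005+08:00|root.test.sg1|        1|        2|
-  |1970-01-01T08:00:00.009+08:00|root.test.sg1|        2|        2|
-  |1970-01-01T08:00:00.001+08:00|root.test.sg2|        2|        2|
-  |1970-01-01T08:00:00.005+08:00|root.test.sg2|        1|        2|
-  |1970-01-01T08:00:00.009+08:00|root.test.sg2|        2|        2|
-  +-----------------------------+-------------+---------+---------+
-  ```
-
-
-## 6. Filling Null Values in the Result Set (FILL Clause)
-
-### 6.1 Overview
-
-When a query is executed, some row and column positions in the result set may have no data, so the result at those positions is null. Such nulls get in the way of data visualization and analysis, so they need to be filled.
-
-In IoTDB, the `FILL` clause specifies how missing data is filled. It lets users fill the nulls in the result set of any query with a chosen method, such as taking the previous non-null value or applying linear interpolation.
-
-### 6.2 Syntax
-
-**The `FILL` clause is defined as follows:**
-
-```sql
-FILL '(' PREVIOUS | LINEAR | constant ')'
-```
-
-**Note:**
-- Only one fill method can be specified in a `FILL` clause, and it applies to every column of the result set.
-- Null filling is not compatible with the syntax of version 0.13 and earlier (that is, `FILL((<data_type>[<fill_method>(, <before_range>, <after_range>)?])+)` is not supported).
-
-### 6.3 Fill Methods
-
-**IoTDB currently supports the following three null-fill methods:**
-
-- `PREVIOUS` fill: fills with the previous non-null value of the column.
-- `LINEAR` fill: fills with the linear interpolation between the previous non-null value and the next non-null value of the column.
-- Constant fill: fills with the specified constant.
-
-**The fill methods supported by each data type are listed in the following table:**
-
-| Data type | Supported fill methods |
-| :------- |:------------------------|
-| BOOLEAN | `PREVIOUS`, constant |
-| INT32 | `PREVIOUS`, `LINEAR`, constant |
-| INT64 | `PREVIOUS`, `LINEAR`, constant |
-| FLOAT | `PREVIOUS`, `LINEAR`, constant |
-| DOUBLE | `PREVIOUS`, `LINEAR`, constant |
-| TEXT | `PREVIOUS`, constant |
-
-**Note:** A column whose data type does not support the specified fill method is neither filled nor reported as an error; it is simply left unchanged.
-
-**The following examples illustrate this further.**
-
-If no fill method is used:
-
-```sql
-select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000;
-```
-
-The query result is as follows:
-
-```shell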
-+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| null| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| null| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| null| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -#### `PREVIOUS` 填充 - -**对于查询结果集中的空值,使用该列前一个非空值进行填充。** - -**注意:** 如果结果集的某一列第一个值就为空,则不会填充该值,直到遇到该列第一个非空值为止。 - -例如,使用 `PREVIOUS` 填充,SQL 语句如下: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(previous); -``` - -`PREVIOUS` 填充后的结果如下: - -```shell -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| 21.93| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| false| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 
-``` - -**在前值填充时,能够支持指定一个时间间隔,如果当前null值的时间戳与前一个非null值的时间戳的间隔,超过指定的时间间隔,则不进行填充。** - -> 1. 在线性填充和常量填充的情况下,如果指定了第二个参数,会抛出异常 -> 2. 时间超时参数仅支持整数 - 例如,原始数据如下所示: - -```sql -select s1 from root.db.d1; -``` -```shell -+-----------------------------+-------------+ -| Time|root.db.d1.s1| -+-----------------------------+-------------+ -|2023-11-08T16:41:50.008+08:00| 1.0| -+-----------------------------+-------------+ -|2023-11-08T16:46:50.011+08:00| 2.0| -+-----------------------------+-------------+ -|2023-11-08T16:48:50.011+08:00| 3.0| -+-----------------------------+-------------+ -``` - -根据时间分组,每1分钟求一个平均值 - -```sql -select avg(s1) - from root.db.d1 - group by([2023-11-08T16:40:00.008+08:00, 2023-11-08T16:50:00.008+08:00), 1m); -``` -```shell -+-----------------------------+------------------+ -| Time|avg(root.db.d1.s1)| -+-----------------------------+------------------+ -|2023-11-08T16:40:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:41:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:42:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:43:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:44:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:45:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:46:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:47:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:48:00.008+08:00| 3.0| -+-----------------------------+------------------+ -|2023-11-08T16:49:00.008+08:00| null| -+-----------------------------+------------------+ -``` - -根据时间分组并用前值填充 - -```sql -select avg(s1) - from root.db.d1 - group by([2023-11-08T16:40:00.008+08:00, 2023-11-08T16:50:00.008+08:00), 1m) - FILL(PREVIOUS); -``` -```shell 
-+-----------------------------+------------------+ -| Time|avg(root.db.d1.s1)| -+-----------------------------+------------------+ -|2023-11-08T16:40:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:41:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:42:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:43:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:44:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:45:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:46:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:47:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:48:00.008+08:00| 3.0| -+-----------------------------+------------------+ -|2023-11-08T16:49:00.008+08:00| 3.0| -+-----------------------------+------------------+ -``` - -根据时间分组并用前值填充,并指定超过2分钟的就不填充 - -```sql -select avg(s1) -from root.db.d1 -group by([2023-11-08T16:40:00.008+08:00, 2023-11-08T16:50:00.008+08:00), 1m) - FILL(PREVIOUS, 2m); -``` -```shell -+-----------------------------+------------------+ -| Time|avg(root.db.d1.s1)| -+-----------------------------+------------------+ -|2023-11-08T16:40:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:41:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:42:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:43:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:44:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:45:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:46:00.008+08:00| 2.0| -+-----------------------------+------------------+ 
-|2023-11-08T16:47:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:48:00.008+08:00| 3.0| -+-----------------------------+------------------+ -|2023-11-08T16:49:00.008+08:00| 3.0| -+-----------------------------+------------------+ -``` - - -#### `LINEAR` 填充 - -**对于查询结果集中的空值,使用该列前一个非空值和下一个非空值的线性插值进行填充。** - -**注意:** -- 如果某个值之前的所有值都为空,或者某个值之后的所有值都为空,则不会填充该值。 -- 如果某列的数据类型为boolean/text,我们既不会填充它,也不会报错,只是让那一列保持原样。 - -例如,使用 `LINEAR` 填充,SQL 语句如下: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(linear); -``` - -`LINEAR` 填充后的结果如下: - -```shell -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| 22.08| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| null| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| null| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -#### 常量填充 - -**对于查询结果集中的空值,使用指定常量填充。** - -**注意:** -- 如果某列数据类型与常量类型不兼容,既不填充该列,也不报错,将该列保持原样。对于常量兼容的数据类型,如下表所示: - - | 常量类型 | 能够填充的序列数据类型 | - |:------ |:------------------ | - | `BOOLEAN` | `BOOLEAN` `TEXT` | - | `INT64` | `INT32` `INT64` `FLOAT` `DOUBLE` `TEXT` | - | `DOUBLE` | `FLOAT` `DOUBLE` `TEXT` | - | `TEXT` | `TEXT` | -- 当常量值大于 `INT32` 所能表示的最大值时,对于 `INT32` 类型的列,既不填充该列,也不报错,将该列保持原样。 - -例如,使用 `FLOAT` 类型的常量填充,SQL 语句如下: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where 
time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(2.0); -``` - -`FLOAT` 类型的常量填充后的结果如下: - -```shell -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| 2.0| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| null| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| null| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -再比如,使用 `BOOLEAN` 类型的常量填充,SQL 语句如下: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(true); -``` - -`BOOLEAN` 类型的常量填充后的结果如下: - -```shell -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| null| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| true| -+-----------------------------+-------------------------------+--------------------------+ -Total 
line number = 4
-```
-
-
-## 7. Paginating Query Results (LIMIT/SLIMIT Clauses)
-
-When a query result set is too large to display conveniently on one page, the `LIMIT/SLIMIT` and `OFFSET/SOFFSET` clauses can be used to paginate it.
-
-- The `LIMIT` and `SLIMIT` clauses control the number of rows and columns in the query result.
-- The `OFFSET` and `SOFFSET` clauses control the position at which the displayed result starts.
-
-### 7.1 Row Pagination
-
-The `LIMIT` and `OFFSET` clauses control the rows of the query result: `LIMIT rowLimit` sets the number of rows returned, and `OFFSET rowOffset` sets the row at which the displayed result starts.
-
-Note:
-- When `rowOffset` exceeds the size of the result set, an empty result set is returned.
-- When `rowLimit` exceeds the size of the result set, all query results are returned.
-- When `rowLimit` or `rowOffset` is not a positive integer, or exceeds the maximum value allowed for `INT64`, the system reports an error.
-
-The following examples show how to use the `LIMIT` and `OFFSET` clauses.
-
-- **Example 1:** Basic `LIMIT` clause
-
-SQL statement:
-
-```sql
-select status, temperature from root.ln.wf01.wt01 limit 10;
-```
-
-Meaning:
-
-The selected device is wt01 in factory wf01 of group ln; the selected time series are "status" and "temperature". The SQL statement returns the first 10 rows of the query result.
-
-The result is as follows:
-
-```shell
-+-----------------------------+------------------------+-----------------------------+
-|                         Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature|
-+-----------------------------+------------------------+-----------------------------+
-|2017-11-01T00:00:00.000+08:00|                    true|                        25.96|
-|2017-11-01T00:01:00.000+08:00|                    true|                        24.36|
-|2017-11-01T00:02:00.000+08:00|                   false|                        20.09|
-|2017-11-01T00:03:00.000+08:00|                   false|                        20.18|
-|2017-11-01T00:04:00.000+08:00|                   false|                        21.13|
-|2017-11-01T00:05:00.000+08:00|                   false|                        22.72|
-|2017-11-01T00:06:00.000+08:00|                   false|                        20.71|
-|2017-11-01T00:07:00.000+08:00|                   false|                        21.45|
-|2017-11-01T00:08:00.000+08:00|                   false|                        22.58|
-|2017-11-01T00:09:00.000+08:00|                   false|                        20.98|
-+-----------------------------+------------------------+-----------------------------+
-Total line number = 10
-It costs 0.000s
-```
-
-- **Example 2:** `LIMIT` clause with `OFFSET`
-
-SQL statement:
-
-```sql
-select status, temperature from root.ln.wf01.wt01 limit 5 offset 3;
-```
-
-Meaning:
-
-The selected device is wt01 in factory wf01 of group ln; the selected time series are "status" and "temperature". The SQL statement returns rows 3 to 7 of the query result (the first row is numbered 0).
-
-The result is as follows:
-
-```shell
-+-----------------------------+------------------------+-----------------------------+
-|                         Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature|
-+-----------------------------+------------------------+-----------------------------+
-|2017-11-01T00:03:00.000+08:00|                   false|                        20.18|
-|2017-11-01T00:04:00.000+08:00|                   false|                        21.13|
-|2017-11-01T00:05:00.000+08:00|                   false|                        22.72|
-|2017-11-01T00:06:00.000+08:00|                   false|                        20.71|
-|2017-11-01T00:07:00.000+08:00|                   false|                        21.45|
-+-----------------------------+------------------------+-----------------------------+
-Total line number = 5
-It costs 0.342s
-```
-
-- **Example 3:** `LIMIT` clause combined with a `WHERE` clause
-
-SQL statement:
-
-```sql
-select status,temperature from root.ln.wf01.wt01 where time > 2024-07-07T00:05:00.000 and time < 2024-07-12T00:12:00.000 limit 5 offset 3;
-```
-
-Meaning:
-
-The selected device is wt01 in factory wf01 of group ln; the selected time series are "status" and "temperature". The SQL statement returns rows 3 to 7 (the first row is numbered 0) of the status and temperature sensor values between "2024-07-07T00:05:00.000" and "2024-07-12T00:12:00.000".
-
-The result is as follows:
-
-```shell
-+-----------------------------+------------------------+-----------------------------+
-|                         Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature|
-+-----------------------------+------------------------+-----------------------------+
-|2024-07-09T17:32:11.943+08:00|                    true|                    24.941973|
-|2024-07-09T17:32:12.944+08:00|                    true|                     20.05108|
-|2024-07-09T17:32:13.945+08:00|                    true|                    20.541632|
-|2024-07-09T17:32:14.945+08:00|                    null|                     23.09016|
-|2024-07-09T17:32:14.946+08:00|                    true|                         null|
-+-----------------------------+------------------------+-----------------------------+
-Total line number = 5
-It costs 0.070s
-```
-
-- **Example 4:** `LIMIT` clause combined with a `GROUP BY` clause
-
-SQL statement:
-
-```sql
-select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) limit 4 offset 3;
-```
-
-Meaning:
-
-The SQL statement returns rows 3 to 6 of the query result (the first row is numbered 0).
-
-The result is as follows:
-
-```shell
-+-----------------------------+-------------------------------+----------------------------------------+
-|                         Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)|
-+-----------------------------+-------------------------------+----------------------------------------+
-|2017-11-04T00:00:00.000+08:00|                           1440|                                    26.0|
-|2017-11-05T00:00:00.000+08:00|                           1440|                                    26.0|
-|2017-11-06T00:00:00.000+08:00|                           1440|                                   25.99|
-|2017-11-07T00:00:00.000+08:00|                           1380|                                    26.0|
-+-----------------------------+-------------------------------+----------------------------------------+
-Total line number = 4
-It costs 0.016s
-```
-
-### 7.2 Column Pagination
-
-The `SLIMIT` and `SOFFSET` clauses control the columns of the query result: `SLIMIT seriesLimit` sets the number of columns returned, and `SOFFSET seriesOffset` sets the column at which the displayed result starts.
-
-Note:
-- These clauses apply only to value columns; they have no effect on the time column or the device column.
-- When `seriesOffset` exceeds the size of the result set, an empty result set is returned.
-- When `seriesLimit` exceeds the size of the result set, all query results are returned.
-- When `seriesLimit` or `seriesOffset` is not a positive integer, or exceeds the maximum value allowed for `INT64`, the system reports an error.
-
-The following examples show how to use the `SLIMIT` and `SOFFSET` clauses.
-
-- **Example 1:** Basic `SLIMIT` clause
-
-SQL statement:
-
-```sql
-select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1;
-```
-
-Meaning:
-
-The selected device is wt01 in factory wf01 of group ln; the selected time series is the first column under this device (column 0), that is, temperature. The SQL statement selects the temperature sensor values between "2017-11-01T00:05:00.000" and "2017-11-01T00:12:00.000".
-
-The result is as follows:
-
-```shell
-+-----------------------------+-----------------------------+
-|                         Time|root.ln.wf01.wt01.temperature|
-+-----------------------------+-----------------------------+
-|2017-11-01T00:06:00.000+08:00|                        20.71|
-|2017-11-01T00:07:00.000+08:00|                        21.45|
-|2017-11-01T00:08:00.000+08:00|                        22.58|
-|2017-11-01T00:09:00.000+08:00|                        20.98|
-|2017-11-01T00:10:00.000+08:00|                        25.52|
-|2017-11-01T00:11:00.000+08:00|                        22.91|
-+-----------------------------+-----------------------------+
-Total line number = 6
-It costs 0.000s
-```
-
-- **Example 2:** `SLIMIT` clause with `SOFFSET`
-
-SQL statement:
-
-```sql
-select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1 soffset 1;
-```
-
-Meaning:
-
-The selected device is wt01 in factory wf01 of group ln; the selected time series is the second column under this device (column 1), that is, power status. The SQL statement selects the status sensor values between "2017-11-01T00:05:00.000" and "2017-11-01T00:12:00.000".
-
-The result is as follows:
-
-```shell
-+-----------------------------+------------------------+
-|                         Time|root.ln.wf01.wt01.status|
-+-----------------------------+------------------------+
-|2017-11-01T00:06:00.000+08:00|                   false|
-|2017-11-01T00:07:00.000+08:00|                   false|
-|2017-11-01T00:08:00.000+08:00|                   false|
-|2017-11-01T00:09:00.000+08:00|                   false|
-|2017-11-01T00:10:00.000+08:00|                    true|
-|2017-11-01T00:11:00.000+08:00|                   false|
-+-----------------------------+------------------------+
-Total line number = 6
-It costs 0.003s
-```
-
-- **Example 3:** `SLIMIT` clause combined with a `GROUP BY` clause
-
-SQL statement:
-
-```sql
-select max_value(*) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) slimit 1 soffset 1;
-```
-
-Meaning:
-
-The selected device is wt01 in factory wf01 of group ln. `max_value` is computed for each 1-day interval in [2017-11-01T00:00:00, 2017-11-07T23:00:00), and `slimit 1 soffset 1` keeps only the second result column (column 1, with the first column numbered 0), which is the status series.
-
-The result is as follows:
-
-```shell
-+-----------------------------+-----------------------------------+
-|                         Time|max_value(root.ln.wf01.wt01.status)|
-+-----------------------------+-----------------------------------+
-|2017-11-01T00:00:00.000+08:00|                               true|
-|2017-11-02T00:00:00.000+08:00|                               true|
-|2017-11-03T00:00:00.000+08:00|                               true|
-|2017-11-04T00:00:00.000+08:00|                               true|
-|2017-11-05T00:00:00.000+08:00|                               true|
-|2017-11-06T00:00:00.000+08:00|                               true|
-|2017-11-07T00:00:00.000+08:00|                               true|
-+-----------------------------+-----------------------------------+
-Total line number = 7
-It costs 0.000s
-```
-
-- **Example 4:** `SLIMIT` clause combined with a `LIMIT` clause
-
-SQL statement:
-
-```sql
-select * from root.ln.wf01.wt01 limit 10 offset 100 slimit 2 soffset 0;
-```
-
-Meaning:
-
-The selected device is wt01 in factory wf01 of group ln; the selected time series are columns 0 to 1 under this device (the first column is numbered 0). The SQL statement returns rows 100 to 109 of the query result (the first row is numbered 0).
-
-The result is as follows:
-
-```shell
-+-----------------------------+-----------------------------+------------------------+
-|                         Time|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status|
-+-----------------------------+-----------------------------+------------------------+
-|2017-11-01T01:40:00.000+08:00|                        21.19|                   false|
-|2017-11-01T01:41:00.000+08:00|                        22.79|                   false|
-|2017-11-01T01:42:00.000+08:00|                        22.98|                   false|
-|2017-11-01T01:43:00.000+08:00|                        21.52|                   false|
-|2017-11-01T01:44:00.000+08:00| 23.45| true| -|2017-11-01T01:45:00.000+08:00| 24.06| true| -|2017-11-01T01:46:00.000+08:00| 22.6| false| -|2017-11-01T01:47:00.000+08:00| 23.78| true| -|2017-11-01T01:48:00.000+08:00| 24.72| true| -|2017-11-01T01:49:00.000+08:00| 24.68| true| -+-----------------------------+-----------------------------+------------------------+ -Total line number = 10 -It costs 0.009s -``` - -## 8. 结果集排序(ORDER BY 子句) - -### 8.1 时间对齐模式下的排序 -IoTDB的查询结果集默认按照时间对齐,可以使用`ORDER BY TIME`的子句指定时间戳的排列顺序。示例代码如下: -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time desc; -``` -执行结果: - -```shell -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -| Time|root.ln.wf02.wt02.hardware|root.ln.wf02.wt02.status|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -|2017-11-01T00:01:00.000+08:00| v2| true| 24.36| true| -|2017-11-01T00:00:00.000+08:00| v2| true| 25.96| true| -|1970-01-01T08:00:00.002+08:00| v2| false| null| null| -|1970-01-01T08:00:00.001+08:00| v1| true| null| null| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -``` -### 8.2 设备对齐模式下的排序 -当使用`ALIGN BY DEVICE`查询对齐模式下的结果集时,可以使用`ORDER BY`子句对返回的结果集顺序进行规定。 - -在设备对齐模式下支持4种排序模式的子句,其中包括两种排序键,`DEVICE`和`TIME`,靠前的排序键为主排序键,每种排序键都支持`ASC`和`DESC`两种排列顺序。 -1. ``ORDER BY DEVICE``: 按照设备名的字典序进行排序,排序方式为字典序排序,在这种情况下,相同名的设备会以组的形式进行展示。 - -2. ``ORDER BY TIME``: 按照时间戳进行排序,此时不同的设备对应的数据点会按照时间戳的优先级被打乱排序。 - -3. ``ORDER BY DEVICE,TIME``: 按照设备名的字典序进行排序,设备名相同的数据点会通过时间戳进行排序。 - -4. 
``ORDER BY TIME,DEVICE``: sorts by timestamp; data points with the same timestamp are then sorted by device name in lexicographic order.
-
-> To keep results readable, when only `ALIGN BY DEVICE` is used without an `ORDER BY` clause, the device view is given a default ordering. The default ordering is ``ORDER BY DEVICE,TIME`` and the default sort direction is `ASC`,
-> that is, the result set is sorted by device name in ascending order first, and then by timestamp in ascending order within each device.
-
-
-When the primary sort key is `DEVICE`, the result set has a format similar to the default: results are ordered by device name first, and by timestamp within each device. For example:
-```sql
-select * from root.ln.** where time <= 2017-11-01T00:01:00 order by device desc,time asc align by device;
-```
-Result:
-
-```shell
-+-----------------------------+-----------------+--------+------+-----------+
-|                         Time|           Device|hardware|status|temperature|
-+-----------------------------+-----------------+--------+------+-----------+
-|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02|      v1|  true|       null|
-|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02|      v2| false|       null|
-|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02|      v2|  true|       null|
-|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02|      v2|  true|       null|
-|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01|    null|  true|      25.96|
-|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01|    null|  true|      24.36|
-+-----------------------------+-----------------+--------+------+-----------+
-```
-When the primary sort key is `TIME`, the result set is sorted by timestamp first, and by device name when timestamps are equal.
-For example:
-```sql
-select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time asc,device desc align by device;
-```
-Result:
-```shell
-+-----------------------------+-----------------+--------+------+-----------+
-|                         Time|           Device|hardware|status|temperature|
-+-----------------------------+-----------------+--------+------+-----------+
-|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02|      v1|  true|       null|
-|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02|      v2| false|       null|
-|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02|      v2|  true|       null|
-|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01|    null|  true|      25.96|
-|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02|      v2|  true|       null|
-|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01|    null|  true|      24.36|
-+-----------------------------+-----------------+--------+------+-----------+ -``` -当没有显式指定时,主排序键默认为`Device`,排序顺序默认为`ASC`,示例代码如下: -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -``` -结果如图所示,可以看出,`ORDER BY DEVICE ASC,TIME ASC`就是默认情况下的排序方式,由于`ASC`是默认排序顺序,此处可以省略。 -```shell -+-----------------------------+-----------------+--------+------+-----------+ -| Time| Device|hardware|status|temperature| -+-----------------------------+-----------------+--------+------+-----------+ -|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01| null| true| 25.96| -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| true| 24.36| -|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02| v1| true| null| -|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02| v2| false| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -+-----------------------------+-----------------+--------+------+-----------+ -``` -同样,可以在聚合查询中使用`ALIGN BY DEVICE`和`ORDER BY`子句,对聚合后的结果进行排序,示例代码如下所示: -```sql -select count(*) from root.ln.** group by ((2017-11-01T00:00:00.000+08:00,2017-11-01T00:03:00.000+08:00],1m) order by device asc,time asc align by device; -``` -执行结果: -```shell -+-----------------------------+-----------------+---------------+-------------+------------------+ -| Time| Device|count(hardware)|count(status)|count(temperature)| -+-----------------------------+-----------------+---------------+-------------+------------------+ -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| 1| 1| -|2017-11-01T00:02:00.000+08:00|root.ln.wf01.wt01| null| 0| 0| -|2017-11-01T00:03:00.000+08:00|root.ln.wf01.wt01| null| 0| 0| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| 1| 1| null| -|2017-11-01T00:02:00.000+08:00|root.ln.wf02.wt02| 0| 0| null| -|2017-11-01T00:03:00.000+08:00|root.ln.wf02.wt02| 0| 0| null| 
-+-----------------------------+-----------------+---------------+-------------+------------------+ -``` - -### 8.3 任意表达式排序 -除了IoTDB中规定的Time,Device关键字外,还可以通过`ORDER BY`子句对指定时间序列中任意列的表达式进行排序。 - -排序在通过`ASC`,`DESC`指定排序顺序的同时,可以通过`NULLS`语法来指定NULL值在排序中的优先级,`NULLS FIRST`默认NULL值在结果集的最上方,`NULLS LAST`则保证NULL值在结果集的最后。如果没有在子句中指定,则默认顺序为`ASC`,`NULLS LAST`。 - -对于如下的数据,将给出几个任意表达式的查询示例供参考: -```shell -+-----------------------------+-------------+-------+-------+--------+-------+ -| Time| Device| base| score| bonus| total| -+-----------------------------+-------------+-------+-------+--------+-------+ -|1970-01-01T08:00:00.000+08:00| root.one| 12| 50.0| 45.0| 107.0| -|1970-01-02T08:00:00.000+08:00| root.one| 10| 50.0| 45.0| 105.0| -|1970-01-03T08:00:00.000+08:00| root.one| 8| 50.0| 45.0| 103.0| -|1970-01-01T08:00:00.010+08:00| root.two| 9| 50.0| 15.0| 74.0| -|1970-01-01T08:00:00.020+08:00| root.two| 8| 10.0| 15.0| 33.0| -|1970-01-01T08:00:00.010+08:00| root.three| 9| null| 24.0| 33.0| -|1970-01-01T08:00:00.020+08:00| root.three| 8| null| 22.5| 30.5| -|1970-01-01T08:00:00.030+08:00| root.three| 7| null| 23.5| 30.5| -|1970-01-01T08:00:00.010+08:00| root.four| 9| 32.0| 45.0| 86.0| -|1970-01-01T08:00:00.020+08:00| root.four| 8| 32.0| 45.0| 85.0| -|1970-01-01T08:00:00.030+08:00| root.five| 7| 53.0| 44.0| 104.0| -|1970-01-01T08:00:00.040+08:00| root.five| 6| 54.0| 42.0| 102.0| -+-----------------------------+-------------+-------+-------+--------+-------+ -``` - -当需要根据基础分数score对结果进行排序时,可以直接使用 -```Sql -select score from root.** order by score desc align by device; -``` -会得到如下结果 - -```shell -+-----------------------------+---------+-----+ -| Time| Device|score| -+-----------------------------+---------+-----+ -|1970-01-01T08:00:00.040+08:00|root.five| 54.0| -|1970-01-01T08:00:00.030+08:00|root.five| 53.0| -|1970-01-01T08:00:00.000+08:00| root.one| 50.0| -|1970-01-02T08:00:00.000+08:00| root.one| 50.0| -|1970-01-03T08:00:00.000+08:00| root.one| 50.0| -|1970-01-01T08:00:00.000+08:00| root.two| 
50.0| -|1970-01-01T08:00:00.010+08:00| root.two| 50.0| -|1970-01-01T08:00:00.010+08:00|root.four| 32.0| -|1970-01-01T08:00:00.020+08:00|root.four| 32.0| -|1970-01-01T08:00:00.020+08:00| root.two| 10.0| -+-----------------------------+---------+-----+ -``` - -当想要根据总分对结果进行排序,可以在order by子句中使用表达式进行计算 -```Sql -select score,total from root.one order by base+score+bonus desc; -``` -该sql等价于 -```Sql -select score,total from root.one order by total desc; -``` -得到如下结果 - -```shell -+-----------------------------+--------------+--------------+ -| Time|root.one.score|root.one.total| -+-----------------------------+--------------+--------------+ -|1970-01-01T08:00:00.000+08:00| 50.0| 107.0| -|1970-01-02T08:00:00.000+08:00| 50.0| 105.0| -|1970-01-03T08:00:00.000+08:00| 50.0| 103.0| -+-----------------------------+--------------+--------------+ -``` -而如果要对总分进行排序,且分数相同时依次根据score, base, bonus和提交时间进行排序时,可以通过多个表达式来指定多层排序 - -```Sql -select base, score, bonus, total from root.** order by total desc NULLS Last, - score desc NULLS Last, - bonus desc NULLS Last, - time desc align by device; -``` -得到如下结果 -```shell -+-----------------------------+----------+----+-----+-----+-----+ -| Time| Device|base|score|bonus|total| -+-----------------------------+----------+----+-----+-----+-----+ -|1970-01-01T08:00:00.000+08:00| root.one| 12| 50.0| 45.0|107.0| -|1970-01-02T08:00:00.000+08:00| root.one| 10| 50.0| 45.0|105.0| -|1970-01-01T08:00:00.030+08:00| root.five| 7| 53.0| 44.0|104.0| -|1970-01-03T08:00:00.000+08:00| root.one| 8| 50.0| 45.0|103.0| -|1970-01-01T08:00:00.040+08:00| root.five| 6| 54.0| 42.0|102.0| -|1970-01-01T08:00:00.010+08:00| root.four| 9| 32.0| 45.0| 86.0| -|1970-01-01T08:00:00.020+08:00| root.four| 8| 32.0| 45.0| 85.0| -|1970-01-01T08:00:00.010+08:00| root.two| 9| 50.0| 15.0| 74.0| -|1970-01-01T08:00:00.000+08:00| root.two| 9| 50.0| 15.0| 74.0| -|1970-01-01T08:00:00.020+08:00| root.two| 8| 10.0| 15.0| 33.0| -|1970-01-01T08:00:00.010+08:00|root.three| 9| null| 24.0| 33.0| 
-|1970-01-01T08:00:00.030+08:00|root.three| 7| null| 23.5| 30.5| -|1970-01-01T08:00:00.020+08:00|root.three| 8| null| 22.5| 30.5| -+-----------------------------+----------+----+-----+-----+-----+ -``` -在order by中同样可以使用聚合查询表达式 -```Sql -select min_value(total) from root.** order by min_value(total) asc align by device; -``` -得到如下结果 -```shell -+----------+----------------+ -| Device|min_value(total)| -+----------+----------------+ -|root.three| 30.5| -| root.two| 33.0| -| root.four| 85.0| -| root.five| 102.0| -| root.one| 103.0| -+----------+----------------+ -``` -当在查询中指定多列,未被排序的列会随着行和排序列一起改变顺序,当排序列相同时行的顺序和具体实现有关(没有固定顺序) -```Sql -select min_value(total),max_value(base) from root.** order by max_value(total) desc align by device; -``` -得到结果如下 - -```shell -+----------+----------------+---------------+ -| Device|min_value(total)|max_value(base)| -+----------+----------------+---------------+ -| root.one| 103.0| 12| -| root.five| 102.0| 7| -| root.four| 85.0| 9| -| root.two| 33.0| 9| -|root.three| 30.5| 9| -+----------+----------------+---------------+ -``` - -Order by device, time可以和order by expression共同使用 -```Sql -select score from root.** order by device asc, score desc, time asc align by device; -``` -会得到如下结果 -```shell -+-----------------------------+---------+-----+ -| Time| Device|score| -+-----------------------------+---------+-----+ -|1970-01-01T08:00:00.040+08:00|root.five| 54.0| -|1970-01-01T08:00:00.030+08:00|root.five| 53.0| -|1970-01-01T08:00:00.010+08:00|root.four| 32.0| -|1970-01-01T08:00:00.020+08:00|root.four| 32.0| -|1970-01-01T08:00:00.000+08:00| root.one| 50.0| -|1970-01-02T08:00:00.000+08:00| root.one| 50.0| -|1970-01-03T08:00:00.000+08:00| root.one| 50.0| -|1970-01-01T08:00:00.000+08:00| root.two| 50.0| -|1970-01-01T08:00:00.010+08:00| root.two| 50.0| -|1970-01-01T08:00:00.020+08:00| root.two| 10.0| -+-----------------------------+---------+-----+ -``` - -## 9. 
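Query Alignment Mode (ALIGN BY DEVICE Clause)
-
-In IoTDB, query result sets are **aligned by time by default**: they contain one time column and several value columns, and all columns in a row share the same timestamp.
-
-Besides alignment by time, the following alignment mode is also supported:
-
-- Alignment by device: `ALIGN BY DEVICE`
-
-### 9.1 Alignment by Device
-
-In device-aligned mode, the device name appears as a column of its own, so the result set contains one time column, one device column, and several value columns. If the `SELECT` clause selects `N` columns, the result set contains `N + 2` columns (the extra two being the time column and the device name column).
-
-By default, the result set is ordered by `Device`, and within each `Device` it is sorted by the `Time` column in ascending order.
-
-When multiple devices are queried, columns with the same name must have the same data type across devices.
-
-For easier understanding, this can be mapped to the relational model: a device corresponds to a table, the selected columns correspond to the table's columns, and `Time + Device` serves as the primary key.
-
-**Example:**
-
-```sql
-select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device;
-```
-
-The result is as follows:
-
-```shell
-+-----------------------------+-----------------+-----------+------+--------+
-|                         Time|           Device|temperature|status|hardware|
-+-----------------------------+-----------------+-----------+------+--------+
-|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01|      25.96|  true|    null|
-|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01|      24.36|  true|    null|
-|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02|       null|  true|      v1|
-|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02|       null| false|      v2|
-|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02|       null|  true|      v2|
-|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02|       null|  true|      v2|
-+-----------------------------+-----------------+-----------+------+--------+
-Total line number = 6
-It costs 0.012s
-```
-### 9.2 Sorting in Device-Aligned Mode
-In device-aligned mode, results are sorted by device name in ascending lexicographic order by default, and by timestamp in ascending order within each device. The `ORDER BY` clause can be used to change the sort priority of the device column and the time column.
-
-For a detailed description and examples, see [Result Set Sorting](../SQL-Manual/Operator-and-Expression.md).
-
-## 10. Writing Back Query Results (INTO Clause)
-
-The `SELECT INTO` statement writes query results into a set of specified time series.
-
-Typical use cases:
-- **ETL inside IoTDB**: apply ETL processing to raw data and write the output to new series.
-- **Storing query results**: persist query results, which serves a purpose similar to a materialized view.
-- **Converting non-aligned series to aligned series**: aligned series are supported from version 0.13 onward, and this feature can write the data of non-aligned series into new aligned series.
-
-### 10.1 Syntax
-
-#### Overview
-
-```sql
-selectIntoStatement
-    : SELECT
-        resultColumn [, resultColumn] ...
-        INTO intoItem [, intoItem] ...
-        FROM prefixPath [, prefixPath] ...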
- [WHERE whereCondition] - [GROUP BY groupByTimeClause, groupByLevelClause] - [FILL {PREVIOUS | LINEAR | constant}] - [LIMIT rowLimit OFFSET rowOffset] - [ALIGN BY DEVICE] - ; - -intoItem - : [ALIGNED] intoDevicePath '(' intoMeasurementName [',' intoMeasurementName]* ')' - ; -``` - -#### `INTO` 子句 - -`INTO` 子句由若干个 `intoItem` 构成。 - -每个 `intoItem` 由一个目标设备路径和一个包含若干目标物理量名的列表组成(与 `INSERT` 语句中的 `INTO` 子句写法类似)。 - -其中每个目标物理量名与目标设备路径组成一个目标序列,一个 `intoItem` 包含若干目标序列。例如:`root.sg_copy.d1(s1, s2)` 指定了两条目标序列 `root.sg_copy.d1.s1` 和 `root.sg_copy.d1.s2`。 - -`INTO` 子句指定的目标序列要能够与查询结果集的列一一对应。具体规则如下: - -- **按时间对齐**(默认):全部 `intoItem` 包含的目标序列数量要与查询结果集的列数(除时间列外)一致,且按照表头从左到右的顺序一一对应。 -- **按设备对齐**(使用 `ALIGN BY DEVICE`):全部 `intoItem` 中指定的目标设备数和查询的设备数(即 `FROM` 子句中路径模式匹配的设备数)一致,且按照结果集设备的输出顺序一一对应。 - 为每个目标设备指定的目标物理量数量要与查询结果集的列数(除时间和设备列外)一致,且按照表头从左到右的顺序一一对应。 - -下面通过示例进一步说明: - -- **示例 1**(按时间对齐) -```sql - select s1, s2 into root.sg_copy.d1(t1), root.sg_copy.d2(t1, t2), root.sg_copy.d1(t2) from root.sg.d1, root.sg.d2; -``` -```shell -+--------------+-------------------+--------+ -| source column| target timeseries| written| -+--------------+-------------------+--------+ -| root.sg.d1.s1| root.sg_copy.d1.t1| 8000| -+--------------+-------------------+--------+ -| root.sg.d2.s1| root.sg_copy.d2.t1| 10000| -+--------------+-------------------+--------+ -| root.sg.d1.s2| root.sg_copy.d2.t2| 12000| -+--------------+-------------------+--------+ -| root.sg.d2.s2| root.sg_copy.d1.t2| 10000| -+--------------+-------------------+--------+ -Total line number = 4 -It costs 0.725s -``` - -该语句将 `root.sg` database 下四条序列的查询结果写入到 `root.sg_copy` database 下指定的四条序列中。注意,`root.sg_copy.d2(t1, t2)` 也可以写做 `root.sg_copy.d2(t1), root.sg_copy.d2(t2)`。 - -可以看到,`INTO` 子句的写法非常灵活,只要满足组合出的目标序列没有重复,且与查询结果列一一对应即可。 - -> `CLI` 展示的结果集中,各列的含义如下: -> - `source column` 列表示查询结果的列名。 -> - `target timeseries` 表示对应列写入的目标序列。 -> - `written` 表示预期写入的数据量。 - -- **示例 2**(按时间对齐) -```sql - select count(s1 + s2), last_value(s2) into 
root.agg.count(s1_add_s2), root.agg.last_value(s2) from root.sg.d1 group by ([0, 100), 10ms); -``` -```shell -+--------------------------------------+-------------------------+--------+ -| source column| target timeseries| written| -+--------------------------------------+-------------------------+--------+ -| count(root.sg.d1.s1 + root.sg.d1.s2)| root.agg.count.s1_add_s2| 10| -+--------------------------------------+-------------------------+--------+ -| last_value(root.sg.d1.s2)| root.agg.last_value.s2| 10| -+--------------------------------------+-------------------------+--------+ -Total line number = 2 -It costs 0.375s -``` - -该语句将聚合查询的结果存储到指定序列中。 - -- **示例 3**(按设备对齐) -```sql - select s1, s2 into root.sg_copy.d1(t1, t2), root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -``` -```shell -+--------------+--------------+-------------------+--------+ -| source device| source column| target timeseries| written| -+--------------+--------------+-------------------+--------+ -| root.sg.d1| s1| root.sg_copy.d1.t1| 8000| -+--------------+--------------+-------------------+--------+ -| root.sg.d1| s2| root.sg_copy.d1.t2| 11000| -+--------------+--------------+-------------------+--------+ -| root.sg.d2| s1| root.sg_copy.d2.t1| 12000| -+--------------+--------------+-------------------+--------+ -| root.sg.d2| s2| root.sg_copy.d2.t2| 9000| -+--------------+--------------+-------------------+--------+ -Total line number = 4 -It costs 0.625s -``` - -该语句同样是将 `root.sg` database 下四条序列的查询结果写入到 `root.sg_copy` database 下指定的四条序列中。但在按设备对齐中,`intoItem` 的数量必须和查询的设备数量一致,每个查询设备对应一个 `intoItem`。 - -> 按设备对齐查询时,`CLI` 展示的结果集多出一列 `source device` 列表示查询的设备。 - -- **示例 4**(按设备对齐) -```sql - select s1 + s2 into root.expr.add(d1s1_d1s2), root.expr.add(d2s1_d2s2) from root.sg.d1, root.sg.d2 align by device; -``` -```shell -+--------------+--------------+------------------------+--------+ -| source device| source column| target timeseries| written| 
-+--------------+--------------+------------------------+--------+ -| root.sg.d1| s1 + s2| root.expr.add.d1s1_d1s2| 10000| -+--------------+--------------+------------------------+--------+ -| root.sg.d2| s1 + s2| root.expr.add.d2s1_d2s2| 10000| -+--------------+--------------+------------------------+--------+ -Total line number = 2 -It costs 0.532s -``` - -该语句将表达式计算的结果存储到指定序列中。 - -#### 使用变量占位符 - -特别地,可以使用变量占位符描述目标序列与查询序列之间的对应规律,简化语句书写。目前支持以下两种变量占位符: - -- 后缀复制符 `::`:复制查询设备后缀(或物理量),表示从该层开始一直到设备的最后一层(或物理量),目标设备的节点名(或物理量名)与查询的设备对应的节点名(或物理量名)相同。 -- 单层节点匹配符 `${i}`:表示目标序列当前层节点名与查询序列的第`i`层节点名相同。比如,对于路径`root.sg1.d1.s1`而言,`${1}`表示`sg1`,`${2}`表示`d1`,`${3}`表示`s1`。 - -在使用变量占位符时,`intoItem`与查询结果集列的对应关系不能存在歧义,具体情况分类讨论如下: - -##### 按时间对齐(默认) - -> 注:变量占位符**只能描述序列与序列之间的对应关系**,如果查询中包含聚合、表达式计算,此时查询结果中的列无法与某个序列对应,因此目标设备和目标物理量都不能使用变量占位符。 - -###### (1)目标设备不使用变量占位符 & 目标物理量列表使用变量占位符 - -**限制:** - 1. 每个 `intoItem` 中,物理量列表的长度必须为 1。
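(If the length could exceed 1, e.g. `root.sg1.d1(::, s1)`, there would be no way to tell which columns `::` should match.) - 2. The number of `intoItem`s is 1, or equals the number of columns in the query result set.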
(在每个目标物理量列表长度均为 1 的情况下,若 `intoItem` 只有 1 个,此时表示全部查询序列写入相同设备;若 `intoItem` 数量与查询序列一致,则表示为每个查询序列指定一个目标设备;若 `intoItem` 大于 1 小于查询序列数,此时无法与查询序列一一对应) - -**匹配方法:** 每个查询序列指定目标设备,而目标物理量根据变量占位符生成。 - -**示例:** - -```sql -select s1, s2 -into root.sg_copy.d1(::), root.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::) -from root.sg.d1, root.sg.d2; -``` -该语句等价于: -```sql -select s1, s2 -into root.sg_copy.d1(s1), root.sg_copy.d2(s1), root.sg_copy.d1(s2), root.sg_copy.d2(s2) -from root.sg.d1, root.sg.d2; -``` -可以看到,在这种情况下,语句并不能得到很好地简化。 - -###### (2)目标设备使用变量占位符 & 目标物理量列表不使用变量占位符 - -**限制:** 全部 `intoItem` 中目标物理量的数量与查询结果集列数一致。 - -**匹配方式:** 为每个查询序列指定了目标物理量,目标设备根据对应目标物理量所在 `intoItem` 的目标设备占位符生成。 - -**示例:** -```sql -select d1.s1, d1.s2, d2.s3, d3.s4 -into ::(s1_1, s2_2), root.sg.d2_2(s3_3), root.${2}_copy.::(s4) -from root.sg; -``` - -###### (3)目标设备使用变量占位符 & 目标物理量列表使用变量占位符 - -**限制:** `intoItem` 只有一个且物理量列表的长度为 1。 - -**匹配方式:** 每个查询序列根据变量占位符可以得到一个目标序列。 - -**示例:** -```sql -select * into root.sg_bk.::(::) from root.sg.**; -``` -将 `root.sg` 下全部序列的查询结果写到 `root.sg_bk`,设备名后缀和物理量名保持不变。 - -##### 按设备对齐(使用 `ALIGN BY DEVICE`) - -> 注:变量占位符**只能描述序列与序列之间的对应关系**,如果查询中包含聚合、表达式计算,此时查询结果中的列无法与某个物理量对应,因此目标物理量不能使用变量占位符。 - -###### (1)目标设备不使用变量占位符 & 目标物理量列表使用变量占位符 - -**限制:** 每个 `intoItem` 中,如果物理量列表使用了变量占位符,则列表的长度必须为 1。 - -**匹配方法:** 每个查询序列指定目标设备,而目标物理量根据变量占位符生成。 - -**示例:** -```sql -select s1, s2, s3, s4 -into root.backup_sg.d1(s1, s2, s3, s4), root.backup_sg.d2(::), root.sg.d3(backup_${4}) -from root.sg.d1, root.sg.d2, root.sg.d3 -align by device; -``` - -###### (2)目标设备使用变量占位符 & 目标物理量列表不使用变量占位符 - -**限制:** `intoItem` 只有一个。(如果出现多个带占位符的 `intoItem`,我们将无法得知每个 `intoItem` 需要匹配哪几个源设备) - -**匹配方式:** 每个查询设备根据变量占位符得到一个目标设备,每个设备下结果集各列写入的目标物理量由目标物理量列表指定。 - -**示例:** -```sql -select avg(s1), sum(s2) + sum(s3), count(s4) -into root.agg_${2}.::(avg_s1, sum_s2_add_s3, count_s4) -from root.** -align by device; -``` - -###### (3)目标设备使用变量占位符 & 目标物理量列表使用变量占位符 - -**限制:** `intoItem` 只有一个且物理量列表的长度为 1。 - -**匹配方式:** 
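Each query series yields one target series according to the variable placeholders. - -**Example:** -```sql -select * into ::(backup_${4}) from root.sg.** align by device; -``` -This writes the query result of every series under `root.sg` back to the same device, with `backup_` prepended to the measurement name. - -#### Specifying Target Series as Aligned Series - -The `ALIGNED` keyword marks a target device as written in aligned mode; it can be set independently for each `intoItem`. - -**Example:** -```sql -select s1, s2 into root.sg_copy.d1(t1, t2), aligned root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -``` -This statement specifies that `root.sg_copy.d1` is a non-aligned device and `root.sg_copy.d2` is an aligned device. - -#### Unsupported Query Clauses - -- `SLIMIT`, `SOFFSET`: the set of queried columns is not deterministic, so the semantics are unclear; not supported. -- `LAST` queries, `GROUP BY TAGS`, `DISABLE ALIGN`: the result table structure does not match the write structure; not supported. - -#### Other Notes - -- For ordinary aggregation queries the timestamp is meaningless, so by convention the result is stored at timestamp 0. -- If the target series already exists, the data types of the source series and the target series must be compatible. For the compatibility rules, see [Data Types](../Background-knowledge/Data-Type.md#数据类型兼容性). -- If the target series does not exist, the system creates it automatically (including the database). -- If the queried series does not exist or contains no data, the target series is not created automatically. - -### 10.2 Application Examples - -#### Implementing ETL inside IoTDB -Run ETL on the raw data and write the result to new series. -```sql -SELECT preprocess_udf(s1), preprocess_udf(s2) INTO ::(preprocessed_s1, preprocessed_s2) FROM root.sg.* ALIGN BY DEVICE; -``` -```shell -+--------------+-------------------+---------------------------+--------+ -| source device| source column| target timeseries| written| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d1| preprocess_udf(s1)| root.sg.d1.preprocessed_s1| 8000| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d1| preprocess_udf(s2)| root.sg.d1.preprocessed_s2| 10000| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d2| preprocess_udf(s1)| root.sg.d2.preprocessed_s1| 11000| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d2| preprocess_udf(s2)| root.sg.d2.preprocessed_s2| 9000| -+--------------+-------------------+---------------------------+--------+ -``` -The statement above preprocesses the data with a user-defined function and persists the preprocessed result to new series. - -#### Storing Query Results -Persist query results, which serves a purpose similar to a materialized view. -```sql -SELECT count(s1), last_value(s1) INTO root.sg.agg_${2}(count_s1, last_value_s1) FROM root.sg1.d1 GROUP BY ([0, 10000), 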
10ms); -``` -```shell -+--------------------------+-----------------------------+--------+ -| source column| target timeseries| written| -+--------------------------+-----------------------------+--------+ -| count(root.sg.d1.s1)| root.sg.agg_d1.count_s1| 1000| -+--------------------------+-----------------------------+--------+ -| last_value(root.sg.d1.s2)| root.sg.agg_d1.last_value_s2| 1000| -+--------------------------+-----------------------------+--------+ -Total line number = 2 -It costs 0.115s -``` -以上语句将降采样查询的结果持久化存储到新序列中。 - -#### 非对齐序列转对齐序列 -对齐序列从 0.13 版本开始支持,可以通过该功能将非对齐序列的数据写入新的对齐序列中。 - -**注意:** 建议配合使用 `LIMIT & OFFSET` 子句或 `WHERE` 子句(时间过滤条件)对数据进行分批,防止单次操作的数据量过大。 - -```sql -SELECT s1, s2 INTO ALIGNED root.sg1.aligned_d(s1, s2) FROM root.sg1.non_aligned_d WHERE time >= 0 and time < 10000; -``` -```shell -+--------------------------+----------------------+--------+ -| source column| target timeseries| written| -+--------------------------+----------------------+--------+ -| root.sg1.non_aligned_d.s1| root.sg1.aligned_d.s1| 10000| -+--------------------------+----------------------+--------+ -| root.sg1.non_aligned_d.s2| root.sg1.aligned_d.s2| 10000| -+--------------------------+----------------------+--------+ -Total line number = 2 -It costs 0.375s -``` -以上语句将一组非对齐的序列的数据迁移到一组对齐序列。 - -### 10.3 相关用户权限 - -用户必须有下列权限才能正常执行查询写回语句: - -* 所有 `SELECT` 子句中源序列的 `WRITE_SCHEMA` 权限。 -* 所有 `INTO` 子句中目标序列 `WRITE_DATA` 权限。 - -更多用户权限相关的内容,请参考[权限管理语句](../User-Manual/Authority-Management_timecho.md)。 - -### 10.4 相关配置参数 - -* `select_into_insert_tablet_plan_row_limit` - - | 参数名 | select_into_insert_tablet_plan_row_limit | - | ---- | ---- | - | 描述 | 写入过程中每一批 `Tablet` 的最大行数 | - | 类型 | int32 | - | 默认值 | 10000 | - | 改后生效方式 | 重启后生效 | diff --git a/src/zh/UserGuide/Master/Tree/Basic-Concept/Write-Data_timecho.md b/src/zh/UserGuide/Master/Tree/Basic-Concept/Write-Data_timecho.md deleted file mode 100644 index 4bdd250e9..000000000 --- 
a/src/zh/UserGuide/Master/Tree/Basic-Concept/Write-Data_timecho.md +++ /dev/null @@ -1,190 +0,0 @@ - - - -# 数据写入 -## 1. CLI写入数据 - -IoTDB 为用户提供多种插入实时数据的方式,例如在 [Cli/Shell 工具](../Tools-System/CLI.md) 中直接输入插入数据的 INSERT 语句,或使用 Java API(标准 [Java JDBC](../API/Programming-JDBC_timecho) 接口)单条或批量执行插入数据的 INSERT 语句。 - -本节主要为您介绍实时数据接入的 INSERT 语句在场景中的实际使用示例,有关 INSERT SQL 语句的详细语法请参见本文 [INSERT 语句](../SQL-Manual/SQL-Manual.md#写入数据) 节。 - -注:写入重复时间戳的数据时,会直接覆盖原有同时间戳数据,等效于数据更新;但若写入值为 NULL,则不生效,不会覆盖原有字段值。 - -### 1.1 使用 INSERT 语句 - -使用 INSERT 语句可以向指定的已经创建的一条或多条时间序列中插入数据。对于每一条数据,均由一个时间戳类型的时间戳和一个数值或布尔值、字符串类型的传感器采集值组成。 - -在本节的场景实例下,以其中的两个时间序列`root.ln.wf02.wt02.status`和`root.ln.wf02.wt02.hardware`为例 ,它们的数据类型分别为 BOOLEAN 和 TEXT。 - -单列数据插入示例代码如下: - -```sql -IoTDB > insert into root.ln.wf02.wt02(timestamp,status) values(1,true) -IoTDB > insert into root.ln.wf02.wt02(timestamp,hardware) values(1, "v1") -``` - -以上示例代码将长整型的 timestamp 以及值为 true 的数据插入到时间序列`root.ln.wf02.wt02.status`中和将长整型的 timestamp 以及值为"v1"的数据插入到时间序列`root.ln.wf02.wt02.hardware`中。执行成功后会返回执行时间,代表数据插入已完成。 - -> 注意:在 IoTDB 中,TEXT 类型的数据单双引号都可以来表示,上面的插入语句是用的是双引号表示 TEXT 类型数据,下面的示例将使用单引号表示 TEXT 类型数据。 - -INSERT 语句还可以支持在同一个时间点下多列数据的插入,同时向 2 时间点插入上述两个时间序列的值,多列数据插入示例代码如下: - -```sql -IoTDB > insert into root.ln.wf02.wt02(timestamp, status, hardware) values (2, false, 'v2') -``` - -此外,INSERT 语句支持一次性插入多行数据,同时向 2 个不同时间点插入上述时间序列的值,示例代码如下: - -```sql -IoTDB > insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4') -``` - -在树模型写入数据时,timestamp 与 time 均可作为时间列标识用于 INSERT 语句,书写时无需刻意区分;但查询结果中,时间列统一展示为 Time(固定名称),保证结果格式统一。 - -插入数据后我们可以使用 SELECT 语句简单查询已插入的数据。 - -```sql -IoTDB > select * from root.ln.wf02.wt02 where time < 5 -``` - -结果如图所示。由查询结果可以看出,单列、多列数据的插入操作正确执行。 - -``` -+-----------------------------+--------------------------+------------------------+ -| Time|root.ln.wf02.wt02.hardware|root.ln.wf02.wt02.status| -+-----------------------------+--------------------------+------------------------+ 
-|1970-01-01T08:00:00.001+08:00| v1| true| -|1970-01-01T08:00:00.002+08:00| v2| false| -|1970-01-01T08:00:00.003+08:00| v3| false| -|1970-01-01T08:00:00.004+08:00| v4| true| -+-----------------------------+--------------------------+------------------------+ -Total line number = 4 -It costs 0.004s -``` - -此外,我们可以省略 timestamp 列,此时系统将使用当前的系统时间作为该数据点的时间戳,示例代码如下: -```sql -IoTDB > insert into root.ln.wf02.wt02(status, hardware) values (false, 'v2') -``` -**注意:** 当一次插入多行数据时必须指定时间戳。 - -### 1.2 向对齐时间序列插入数据 - -向对齐时间序列插入数据只需在SQL中增加`ALIGNED`关键词,其他类似。 - -示例代码如下: - -```sql -IoTDB > create aligned timeseries root.sg1.d1(s1 INT32, s2 DOUBLE) -IoTDB > insert into root.sg1.d1(time, s1, s2) aligned values(1, 1, 1) -IoTDB > insert into root.sg1.d1(time, s1, s2) aligned values(2, 2, 2), (3, 3, 3) -IoTDB > select * from root.sg1.d1 -``` - -结果如图所示。由查询结果可以看出,数据的插入操作正确执行。 - -``` -+-----------------------------+--------------+--------------+ -| Time|root.sg1.d1.s1|root.sg1.d1.s2| -+-----------------------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 1| 1.0| -|1970-01-01T08:00:00.002+08:00| 2| 2.0| -|1970-01-01T08:00:00.003+08:00| 3| 3.0| -+-----------------------------+--------------+--------------+ -Total line number = 3 -It costs 0.004s -``` - -## 2. 原生接口写入 -原生接口 (Session) 是目前IoTDB使用最广泛的系列接口,包含多种写入接口,适配不同的数据采集场景,性能高效且支持多语言。 - -### 2.1 多语言接口写入 -* ### Java - 使用Java接口写入之前,你需要先建立连接,参考 [Java原生接口](../API/Programming-Java-Native-API_timecho)。 - 之后通过 [ JAVA 数据操作接口(DML)](../API/Programming-Java-Native-API_timecho#数据写入)写入。 - -* ### Python - 参考 [ Python 数据操作接口(DML)](../API/Programming-Python-Native-API_timecho#数据写入) - -* ### C++ - 参考 [ C++ 数据操作接口(DML)](../API/Programming-Cpp-Native-API.md) - -* ### Go - 参考 [Go 原生接口](../API/Programming-Go-Native-API.md) - -## 3. 
REST API写入 - -参考 [insertTablet (v1)](../API/RestServiceV1_timecho#inserttablet) or [insertTablet (v2)](../API/RestServiceV2_timecho#inserttablet) - -示例如下: -```JSON -{ -      "timestamps": [ -            1, -            2, -            3 -      ], -      "measurements": [ -            "temperature", -            "status" -      ], -      "data_types": [ -            "FLOAT", -            "BOOLEAN" -      ], -      "values": [ -            [ -                  1.1, -                  2.2, -                  3.3 -            ], -            [ -                  false, -                  true, -                  true -            ] -      ], -      "is_aligned": false, -      "device": "root.ln.wf01.wt01" -} -``` - -## 4. MQTT写入 - -参考 [内置 MQTT 服务](../API/Programming-MQTT_timecho.md#_2-内置-mqtt-服务) - -## 5. 批量数据导入 - -针对于不同场景,IoTDB 为用户提供多种批量导入数据的操作方式,本章节向大家介绍最为常用的两种方式为 CSV文本形式的导入 和 TsFile文件形式的导入。 - -### 5.1 TsFile批量导入 - -TsFile 是在 IoTDB 中使用的时间序列的文件格式,您可以通过CLI等工具直接将存有时间序列的一个或多个 TsFile 文件导入到另外一个正在运行的IoTDB实例中。具体操作方式请参考[数据导入](../Tools-System/Data-Import-Tool_timecho)。 - -### 5.2 CSV批量导入 - -CSV 是以纯文本形式存储表格数据,您可以在CSV文件中写入多条格式化的数据,并批量的将这些数据导入到 IoTDB 中,在导入数据之前,建议在IoTDB中创建好对应的元数据信息。如果忘记创建元数据也不要担心,IoTDB 可以自动将CSV中数据推断为其对应的数据类型,前提是你每一列的数据类型必须唯一。除单个文件外,此工具还支持以文件夹的形式导入多个 CSV 文件,并且支持设置如时间精度等优化参数。具体操作方式请参考[数据导入](../Tools-System/Data-Import-Tool_timecho)。 - -## 6. 
无模式写入 -在物联网场景中,由于设备的类型、数量可能随时间动态增减,不同设备可能产生不同字段的数据(如温度、湿度、状态码等),业务上又往往需要快速部署,需要灵活接入新设备且无需繁琐的预定义流程。因此,不同于传统时序数据库通常需要预先定义数据模型,IoTDB支持不提前创建元数据,在写入数据时,数据库中将自动识别并注册所需的元数据,实现自动建模。 - -用户既可以通过CLI使用insert语句或者原生接口的方式,批量或者单行实时写入一个设备或者多个设备的测点数据,也可以通过导入工具导入csv,TsFile等格式的历史数据,在导入过程中会自动创建序列,数据类型,压缩编码方式等元数据。 diff --git a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md b/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md deleted file mode 100644 index be000d67e..000000000 --- a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md +++ /dev/null @@ -1,268 +0,0 @@ - -# AINode 部署 - -## 1. AINode 介绍 - -### 1.1 能力介绍 - -AINode 是 TimechoDB 在 ConfigNode、DataNode 后提供的第三种内生节点,该节点通过与 TimechoDB 集群的 DataNode、ConfigNode 交互,扩展了对时间序列进行机器学习分析的能力。AINode 将模型的管理、训练及推理融合在数据库引擎中,支持使用注册的模型在指定时序数据上通过简单 SQL 语句完成时序分析任务,还支持注册并使用自定义机器学习模型。AINode 目前已集成常见时序分析场景(例如预测)的机器学习算法和自研模型。 - -### 1.2 部署模式 - -AINode 是 TimechoDB 集群外的额外套件,采用独立安装包部署。 - -
- -## 2. 安装准备 - -### 2.1 安装包获取 - -AINode 安装包(`timechodb--ainode-bin.zip`)解压后关键目录结构如下: - -| **目录** | **类型** | **说明** | -| ---------------- | ---------------- | ------------------------------------------ | -| lib | 文件夹 | AINode 的可执行程序及依赖 | -| sbin | 文件夹 | AINode 的运行脚本,用于启动或停止 AINode | -| conf | 文件夹 | AINode 的配置文件和版本声明文件 | - -### 2.2 前置检查 - -为确保您获取的 AINode 安装包完整且正确,在执行安装部署前建议您进行 SHA512 校验。 - -**准备工作:** - -* 获取官方发布的 SHA512 校验码:请联系天谋工作人员获取 - -**校验步骤(以 linux 为例):** - -1. 打开终端,进入安装包所在目录(如`/data/ainode`): - ```Bash - cd /data/ainode - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum timechodb-{version}-ainode-bin.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -```SQL -(base) root@hadoop@1:/data/ainode (0.664s) -sha512sum timechodb-2.0.6.1-ainode-bin.zip -4d5a6a64935b4f0459bc9ed214c4563aa7a6a5941024336e9416212424707f27bdfdfc70f4c528b51b812687d660014adc1b8add699498ea67ff17c7e619a6f0 timechodb-2.0.6.1-ainode-bin.zip -``` - -4. 对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行 AINode 的安装部署操作。 - -**注意事项:** - -* 若校验结果不一致,请联系天谋工作人员重新获取安装包 -* 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - -### 2.3 环境要求 - -* 建议操作环境: Linux, MacOS; -* TimechoDB 版本:>= V 2.0.8-beta; - -## 3. 
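Installation, Deployment, and Usage - -### 3.1 Installing AINode - -Download the AINode package into a dedicated folder, switch to that folder, and extract the package: - -```Shell -unzip timechodb-{version}-ainode-bin.zip -``` - -### 3.2 Modifying Configuration Items - -AINode allows a number of essential parameters to be changed. The parameters below can be found in `/TIMECHO_AINODE_HOME/conf/iotdb-ainode.properties` and modified persistently: - -| **Name** | **Description** | **Type** | **Default** | -|-----------------------------------|----------------------------------------------| ---------------- | -------------------- | -| cluster\_name | Identifier of the cluster that AINode joins | string| defaultCluster | -| ain\_seed\_config\_node | Address of the ConfigNode that AINode registers with at startup | String | 127.0.0.1:10710 | -| ain\_cluster\_ingress\_address | RPC address of the DataNode from which AINode pulls data | String | 127.0.0.1 | -| ain\_cluster\_ingress\_port | RPC port of the DataNode from which AINode pulls data | Integer | 6667 | -| ain\_cluster\_ingress\_username | Client username for the DataNode from which AINode pulls data | String | root | -| ain\_cluster\_ingress\_password | Client password for the DataNode from which AINode pulls data | String | root | -| ain\_rpc\_address | Address on which AINode serves and communicates (internal service communication interface) | String | 127.0.0.1 | -| ain\_rpc\_port | Port on which AINode serves and communicates | String | 10810 | -| ain\_system\_dir | Storage path for AINode metadata; the base directory of a relative path depends on the operating system, so an absolute path is recommended | String| data/AINode/system | -| ain\_models\_dir | Storage path for AINode model files; the base directory of a relative path depends on the operating system, so an absolute path is recommended | String| data/AINode/models | -| ain\_thrift\_compression\_enabled | Whether AINode enables Thrift compression: 0 = disabled, 1 = enabled | Boolean | 0 | - -### 3.3 Importing Built-in Weight Files - -If the deployment environment has network access and can reach HuggingFace, the system pulls the built-in model weight files automatically and this step can be skipped. - -In an offline environment, contact Timecho staff to obtain the model weights folder and place it under `/TIMECHO_AINODE_HOME/data/ainode/models/builtin`. - -**NOTE:** Mind the directory hierarchy: the parent directory of every built-in model's weights must be `builtin`. - -### 3.4 Starting AINode - -Once the ConfigNode has been deployed, an AINode can be added to support time series model management and inference. After specifying the TimechoDB cluster information in the configuration items, run the corresponding command to start AINode and join the TimechoDB cluster. - -```Shell -# Start commands - # Linux and MacOS - bash sbin/start-ainode.sh - - # Windows - sbin\start-ainode.bat - - # Start in the background (recommended for long-running deployments) - # Linux and MacOS - bash sbin/start-ainode.sh -d - - # Windows - bash sbin\start-ainode.bat -d -``` - -### 3.5 Activating AINode - -1. 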
参考 TimechoDB 激活:[激活方式](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md#_2-6-激活数据库) - -2. 可通过如下方式验证 AINode 激活,当看到状态显示为 ACTIVATED 表示激活成功。 - -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -Total line number = 3 -It costs 0.002s -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2025-07-16T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| AiNodeLimit| 1| 1| -| CpuLimit| 11| Unlimited| -| DeviceLimit| 0| Unlimited| -|TimeSeriesLimit| 0| 9,999| -+---------------+---------+-----------------------------+ -Total line number = 7 -It costs 0.013s -``` - - -### 3.6 检测 AINode 节点状态 - -AINode 启动过程中会自动将新的 AINode 加入 TimechoDB 集群。启动 AINode 后可以在命令行中输入 SQL 来查询,集群中看到 AINode 节点,其运行状态为 Running(如下展示)表示加入成功。 - -```Shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| 
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -除此之外,还可以通过 show models 命令来查看模型状态。如果模型状态不对,请检查权重文件路径是否正确。 - -```Bash -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### 3.7 停止 AINode - -如果需要停止正在运行的 AINode 节点,则执行相应的关停脚本,且支持通过参数 -p 指定端口,该端口为配置项中的 `ain_rpc_port`。 - -```Shell -# Linux / MacOS - bash sbin/stop-ainode.sh - bash sbin/stop-ainode.sh -p # 指定端口 - - #Windows - sbin\stop-ainode.bat - sbin\stop-ainode.bat -p # 指定端口 -``` - -停止 AINode 后,还可以在集群中看到 AINode 节点,其运行状态为 UNKNOWN(如下展示),此时无法使用 AINode 功能。 - -```Shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|UNKNOWN| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -如果需要重新启动该节点,需重新执行启动脚本。 - -### 3.8 升级 AINode - -如果需要对当前 AINode 进行版本升级,可参考如下步骤: - -1. 
停止当前 AINode 服务 - - * 执行停止命令,确保服务完全退出后再进行后续操作 - - ```Shell - # Linux / MacOS - bash sbin/stop-ainode.sh - bash sbin/stop-ainode.sh -p # 指定端口 - - #Windows - sbin\stop-ainode.bat - sbin\stop-ainode.bat -p # 指定端口 - ``` -2. 替换核心文件 - - * 删除当前版本的`lib` 和 `sbin`目录,并将新版本的 `lib` 和 `sbin` 复制到对应位置 - * 备份 conf 目录下已修改的配置文件,然后替换 conf 文件夹,并将修改的配置同步到对应位置 -3. 更新内置模型权重(可选) - - * 若新版本涉及内置模型更新,相关信息将在[发布历史](../IoTDB-Introduction/Release-history\_timecho.md)中同步。可联系天谋工作人员获取最新权重包,并将权重包替换至 `data/ainode/models/builtin` 目录 -4. 升级完毕后,可启动 AINode 服务,并查看节点状态,具体命令可参考【3.4】和【3.6】小节。 - diff --git a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/AINode_Deployment_timecho.md b/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/AINode_Deployment_timecho.md deleted file mode 100644 index 2dd610188..000000000 --- a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/AINode_Deployment_timecho.md +++ /dev/null @@ -1,340 +0,0 @@ - -# AINode 部署 - -## 1. AINode介绍 - -### 1.1 能力介绍 - -AINode 是 IoTDB 在 ConfigNode、DataNode 后提供的第三种内生节点,该节点通过与 IoTDB 集群的 DataNode、ConfigNode 的交互,扩展了对时间序列进行机器学习分析的能力,支持从外部引入已有机器学习模型进行注册,并使用注册的模型在指定时序数据上通过简单 SQL 语句完成时序分析任务的过程,将模型的创建、管理及推理融合在数据库引擎中。目前已提供常见时序分析场景(例如预测与异常检测)的机器学习算法或自研模型。 - -### 1.2 交付方式 -AINode 是 IoTDB 集群外的额外套件,独立安装包。 - -### 1.3 部署模式 -
- -## 2. 安装准备 - -### 2.1 安装包获取 - -AINode 安装包(`timechodb--ainode-bin.zip`),安装包解压后目录结构如下: - -| **目录** | **类型** | **说明** | -| ------------ | -------- | ------------------------------------------------ | -| lib | 文件夹 | AINode 的 python 包文件 | -| sbin | 文件夹 | AINode的运行脚本,可以启动,移除和停止AINode | -| conf | 文件夹 | AINode 的配置文件和运行环境设置脚本 | -| LICENSE | 文件 | 证书 | -| NOTICE | 文件 | 提示 | -| README_ZH.md | 文件 | markdown格式的中文版说明 | -| README.md | 文件 | 使用说明 | - -### 2.2 前置检查 - -为确保您获取的 AINode 安装包完整且正确,在执行安装部署前建议您进行SHA512校验。 - -#### 准备工作: - -- 获取官方发布的 SHA512 校验码:请联系天谋工作人员获取 - -#### 校验步骤(以 linux 为例): - -1. 打开终端,进入安装包所在目录(如`/data/ainode`): - ```Bash - cd /data/ainode - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum timechodb-{version}-ainode-bin.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -![img](/img/sha512-06.png) - -4. 对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行 AINode 的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - -### 2.3 环境准备 - -1. 建议操作环境: Ubuntu, MacOS -2. IoTDB 版本:>= V 2.0.5.1 -3. 运行环境 - - Python 版本在 3.9 ~3.12,且带有 pip 和 venv 工具; - - -## 3. 安装部署及使用 - -### 3.1 安装 AINode - -1. 保证 Python 版本介于 3.9 ~3.12 - -```shell -python --version -# 或 -python3 --version -``` -2. 下载导入 AINode 到专用文件夹,切换到专用文件夹并解压安装包 - -```shell - unzip timechodb--ainode-bin.zip -``` - -3. 
激活 AINode: - -- 进入 IoTDB CLI - -```sql - # Linux或MACOS系统 - ./start-cli.sh - - # windows系统 - ./start-cli.bat -``` - -- 执行以下内容获取激活所需机器码: - -```sql -show system info -``` - -- 将返回的机器码复制给天谋工作人员: - -```sql -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -``` - -- 将工作人员返回的激活码输入到CLI中,输入以下内容 - - 注:激活码前后需要用'符号进行标注,如所示 - -```sql -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZK' -``` - -- 可通过如下方式验证激活,当看到状态显示为 ACTIVATED 表示激活成功 - -```sql -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ - -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2025-07-16T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| AiNodeLimit| 1| 1| -| CpuLimit| 11| Unlimited| -| DeviceLimit| 0| Unlimited| -|TimeSeriesLimit| 0| 9,999| -+---------------+---------+-----------------------------+ - -``` - -### 3.2 配置项修改 -AINode 支持修改一些必要的参数。可以在 `conf/iotdb-ainode.properties` 文件中找到下列参数并进行持久化的修改: - -| **名称** | **描述** | **类型** | **默认值** | -| ------------------------------ | ------------------------------------------------------------ | -------- | ------------------ | -| 
cluster_name | AINode 要加入集群的标识 | string | defaultCluster | -| ain_seed_config_node | AINode 启动时注册的 ConfigNode 地址 | String | 127.0.0.1:10710 | -| ain_cluster_ingress_address | AINode 拉取数据的 DataNode 的 rpc 地址 | String | 127.0.0.1 | -| ain_cluster_ingress_port | AINode 拉取数据的 DataNode 的 rpc 端口 | Integer | 6667 | -| ain_cluster_ingress_username | AINode 拉取数据的 DataNode 的客户端用户名 | String | root | -| ain_cluster_ingress_password | AINode 拉取数据的 DataNode 的客户端密码 | String | root | -| ain_cluster_ingress_time_zone | AINode 拉取数据的 DataNode 的客户端时区 | String | UTC+8 | -| ain_inference_rpc_address | AINode 提供服务与通信的地址 ,内部服务通讯接口 | String | 127.0.0.1 | -| ain_inference_rpc_port | AINode 提供服务与通信的端口 | String | 10810 | -| ain_system_dir | AINode 元数据存储路径,相对路径的起始目录与操作系统相关,建议使用绝对路径 | String | data/AINode/system | -| ain_models_dir | AINode 存储模型文件的路径,相对路径的起始目录与操作系统相关,建议使用绝对路径 | String | data/AINode/models | -| ain_thrift_compression_enabled | AINode 是否启用 thrift 的压缩机制,0-不启动、1-启动 | Boolean | 0 | - -### 3.3 导入权重文件 - -> 仅离线环境,在线环境可忽略本步骤 -> -联系天谋工作人员获取模型权重文件,并放置到/IOTDB_AINODE_HOME/data/ainode/models/weights/目录下。 - -### 3.4 启动 AINode - -在完成 Seed-ConfigNode 的部署后,可以通过添加 AINode 节点来支持模型的注册和推理功能。在配置项中指定 IoTDB 集群的信息后,可以执行相应的指令来启动 AINode,加入 IoTDB 集群。 - -- 联网环境启动 - -启动命令 - -```shell - # 启动命令 - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh - - # Windows 系统 - sbin\start-ainode.bat - - # 后台启动命令(长期运行推荐) - # Linux 和 MacOS 系统 - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - - # Windows 系统 - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -### 3.5 检测 AINode 节点状态 - -AINode 启动过程中会自动将新的 AINode 加入 IoTDB 集群。启动 AINode 后可以在 命令行中输入 SQL 来查询,集群中看到 AINode 节点,其运行状态为 Running(如下展示)表示加入成功。 - -```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 
10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|Running| 127.0.0.1| 10810|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` - -除此之外,还可以通过 show models 命令来查看模型状态。如果模型状态不对,请检查权重文件路径是否正确。 - -```sql -IoTDB:etth> show models -+---------------------+--------------------+--------+------+ -| ModelId| ModelType|Category| State| -+---------------------+--------------------+--------+------+ -| arima| Arima|BUILT-IN|ACTIVE| -| holtwinters| HoltWinters|BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing|BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster|BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster|BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm|BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm|BUILT-IN|ACTIVE| -| stray| Stray|BUILT-IN|ACTIVE| -| sundial| Timer-Sundial|BUILT-IN|ACTIVE| -| timer_xl| Timer-XL|BUILT-IN|ACTIVE| -+---------------------+--------------------+--------+------+ -``` - -### 3.6 停止 AINode - -如果需要停止正在运行的 AINode 节点,则执行相应的关闭脚本。 - -- 停止命令 - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh - - #Windows - sbin\stop-ainode.bat - ``` - -停止 AINode 后,还可以在集群中看到 AINode 节点,其运行状态为 UNKNOWN(如下展示),此时无法使用 AINode 功能。 - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|UNKNOWN| 127.0.0.1| 10790|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -如果需要重新启动该节点,需重新执行启动脚本。 - - -## 4. 
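FAQ - -### 4.1 "venv module not found" error when starting AINode - -When AINode is started in the default way, it creates a Python virtual environment under the installation package directory and installs its dependencies there, so the venv module must be available. Python 3.10 and later normally ship with venv, but the Python bundled with some operating systems may not. When this error occurs, use either of the two solutions below: - -Solution 1: install the venv module locally. On Ubuntu, for example, run the following command to install the venv module for the system Python, or install a Python release from the official Python website that already includes venv. - - ```shell -apt-get install python3.10-venv -``` -To create the virtual environment inside AINode with Python 3.10.0, run the following in the AINode directory: - - ```shell -../Python-3.10.0/python -m venv venv # the last argument is the target folder name -``` -Solution 2: when running the startup script, pass `-i` to specify the path of an existing Python interpreter as the AINode runtime environment, so that a new virtual environment no longer needs to be created. - -### 4.2 The SSL module in Python is not installed or configured correctly, so HTTPS resources cannot be accessed -WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available. -Installing OpenSSL and then rebuilding Python resolves this problem. -> Currently Python versions 3.6 to 3.9 are compatible with OpenSSL 1.0.2, 1.1.0, and 1.1.1. - -Python requires OpenSSL to be installed on the system; for installation details, see [this link](https://stackoverflow.com/questions/56552390/how-to-fix-ssl-module-in-python-is-not-available-in-centos) - - ```shell -sudo apt-get install build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev uuid-dev lzma-dev liblzma-dev -sudo -E ./configure --with-ssl -make -sudo make install -``` - -### 4.3 pip version is too old - -On Windows, a build error similar to "error: Microsoft Visual C++ 14.0 or greater is required..." appears. - -This error usually means that the C++ build tools or setuptools are too old. pip and setuptools can be upgraded with the following commands: - - ```shell -./python -m pip install --upgrade pip -./python -m pip install --upgrade setuptools -``` - - -### 4.4 Building and installing Python - -Use the following commands to download the source package from the official website and extract it: - ```shell -wget https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tar.xz -tar Jxf Python-3.10.0.tar.xz -``` -Build and install Python: - ```shell -cd Python-3.10.0 -./configure --prefix=/usr/local/python3 -make -sudo make install -python3 --version -``` \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Cluster-Deployment_timecho.md 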
b/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Cluster-Deployment_timecho.md deleted file mode 100644 index 391d6a474..000000000 --- a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Cluster-Deployment_timecho.md +++ /dev/null @@ -1,589 +0,0 @@ - -# 集群版部署指导 - -本小节描述如何手动部署包括3个ConfigNode和3个DataNode的实例,即通常所说的3C3D集群。 - -
- -
- -## 1. 注意事项 - -1. 安装前请确认系统已参照[系统配置](./Environment-Requirements.md)准备完成。 - -2. 部署时推荐优先使用`hostname`进行IP配置,可避免后期修改主机ip导致数据库无法启动的问题。设置hostname需要在目标服务器上配置/etc/hosts,如本机ip是192.168.1.3,hostname是iotdb-1,则可以使用以下命令设置服务器的 hostname,并使用hostname配置IoTDB的`cn_internal_address`、`dn_internal_address`。`dn_internal_address`。 - - ``` shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -3. 有些参数首次启动后不能修改,请参考下方的"参数配置"章节来进行设置。 - -4. 无论是在linux还是windows中,请确保IoTDB的安装路径中不含空格和中文,避免软件运行异常。 - -5. 请注意,安装部署(包括激活和使用软件)IoTDB时需要保持使用同一个用户进行操作,您可以: -- 使用 root 用户(推荐):使用 root 用户可以避免权限等问题。 -- 使用固定的非 root 用户: - - 使用同一用户操作:确保在启动、激活、停止等操作均保持使用同一用户,不要切换用户。 - - 避免使用 sudo:尽量避免使用 sudo 命令,因为它会以 root 用户权限执行命令,可能会引起权限混淆或安全问题。 - -6. 推荐部署监控面板,可以对重要运行指标进行监控,随时掌握数据库运行状态,监控面板可以联系商务获取,部署监控面板步骤可以参考:[监控面板部署](./Monitoring-panel-deployment.md) - -7. 在安装部署数据库前,可以使用健康检查工具检测 IoTDB 节点运行环境,并获取详细的检查结果。 IoTDB 健康检查工具使用方法可以参考:[健康检查工具](../Tools-System/Health-Check-Tool.md)。 - -## 2. 准备步骤 - -1. 准备IoTDB数据库安装包 :iotdb-enterprise-{version}-bin.zip(安装包获取见:[链接](../Deployment-and-Maintenance/IoTDB-Package_timecho.md)) -2. 按环境要求配置好操作系统环境(系统环境配置见:[链接](../Deployment-and-Maintenance/Environment-Requirements.md)) - -### 2.1 前置检查 - -为确保您获取的IoTDB企业版安装包完整且正确,在执行安装部署前建议您进行SHA512校验。 - -#### 准备工作: - -- 获取官方发布的 SHA512 校验码:[发布历史](../IoTDB-Introduction/Release-history_timecho.md)文档中各版本对应的"SHA512校验码" - -#### 校验步骤(以 linux 为例): - -1. 打开终端,进入安装包所在目录(如`/data/iotdb`): - ```Bash - cd /data/iotdb - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -![img](/img/sha512-01.png) - -4. 对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行IoTDB企业版的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - -## 3. 
安装步骤 - -假设现在有3台linux服务器,IP地址和服务角色分配如下: - -| 节点ip | 主机名 | 服务 | -| ----------- | ------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode、DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode、DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode、DataNode | - -### 3.1 设置主机名 - -在3台机器上分别配置主机名,设置主机名需要在目标服务器上配置`/etc/hosts`,使用如下命令: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### 3.2 参数配置 - -解压安装包并进入安装目录 - -```Plain -unzip iotdb-enterprise-{version}-bin.zip -cd iotdb-enterprise-{version}-bin -``` - -#### 环境脚本配置 - -- `./conf/confignode-env.sh`配置 - - | **配置项** | **说明** | **默认值** | **推荐值** | 备注 | - | :---------- | :------------------------------------- | :--------- | :----------------------------------------------- | :----------- | - | MEMORY_SIZE | IoTDB ConfigNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的30% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -- `./conf/datanode-env.sh`配置 - - | **配置项** | **说明** | **默认值** | **推荐值** | 备注 | - | :---------- | :----------------------------------- |:-----------------------| :----------------------------------------------- | :----------- | - | MEMORY_SIZE | IoTDB DataNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的50% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -#### 通用配置 - -打开通用配置文件`./conf/iotdb-system.properties`,可根据部署方式设置以下参数: - -| 配置项 | 说明 | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | -| ------------------------- | ---------------------------------------- | -------------- | -------------- | -------------- | -| cluster_name | 集群名称 | defaultCluster | defaultCluster | defaultCluster | -| schema_replication_factor | 元数据副本数,DataNode数量不应少于此数目 | 3 | 3 | 3 | -| data_replication_factor | 数据副本数,DataNode数量不应少于此数目 | 2 | 2 | 2 | - -#### ConfigNode 配置 - -打开ConfigNode配置文件`./conf/iotdb-system.properties`,设置以下参数 - -| 配置项 | 说明 | 默认 | 推荐值 | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | 备注 | -| ------------------- | 
------------------------------------------------------------ | --------------- | ------------------------------------------------------- | ------------- | ------------- | ------------- | ------------------ | -| cn_internal_address | ConfigNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | iotdb-1 | iotdb-2 | iotdb-3 | 首次启动后不能修改 | -| cn_internal_port | ConfigNode在集群内部通讯使用的端口 | 10710 | 10710 | 10710 | 10710 | 10710 | 首次启动后不能修改 | -| cn_consensus_port | ConfigNode副本组共识协议通信使用的端口 | 10720 | 10720 | 10720 | 10720 | 10720 | 首次启动后不能修改 | -| cn_seed_config_node | 节点注册加入集群时连接的ConfigNode 的地址,cn_internal_address:cn_internal_port | 127.0.0.1:10710 | 第一个CongfigNode的cn_internal_address:cn_internal_port | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | 首次启动后不能修改 | - -#### DataNode 配置 - -打开DataNode配置文件 `./conf/iotdb-system.properties`,设置以下参数: - -| 配置项 | 说明 | 默认 | 推荐值 | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | 备注 | -| ------------------------------- | ------------------------------------------------------------ | --------------- | ------------------------------------------------------- | ------------- | ------------- | ------------- | ------------------ | -| dn_rpc_address | 客户端 RPC 服务的地址 | 127.0.0.1 | 默认本机可直接访问。非本机访问,请修改此配置项为所在服务器的IPV4地址或hostname,推荐使用所在服务器的IPV4地址。 | iotdb-1 |iotdb-2 | iotdb-3 | 重启服务生效 | -| dn_rpc_port | 客户端 RPC 服务的端口 | 6667 | 6667 | 6667 | 6667 | 6667 | 重启服务生效 | -| dn_internal_address | DataNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | iotdb-1 | iotdb-2 | iotdb-3 | 首次启动后不能修改 | -| dn_internal_port | DataNode在集群内部通信使用的端口 | 10730 | 10730 | 10730 | 10730 | 10730 | 首次启动后不能修改 | -| dn_mpp_data_exchange_port | DataNode用于接收数据流使用的端口 | 10740 | 10740 | 10740 | 10740 | 10740 | 首次启动后不能修改 | -| dn_data_region_consensus_port | DataNode用于数据副本共识协议通信使用的端口 | 10750 | 10750 | 10750 | 10750 | 10750 | 首次启动后不能修改 | -| dn_schema_region_consensus_port | DataNode用于元数据副本共识协议通信使用的端口 | 10760 | 10760 | 10760 | 10760 | 10760 | 首次启动后不能修改 | -| dn_seed_config_node | 
节点注册加入集群时连接的ConfigNode地址,即cn_internal_address:cn_internal_port | 127.0.0.1:10710 | 第一个CongfigNode的cn_internal_address:cn_internal_port | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | 首次启动后不能修改 | - -> ❗️注意:VSCode Remote等编辑器无自动保存配置功能,请确保修改的文件被持久化保存,否则配置项无法生效 - -### 3.3 启动ConfigNode节点 - -先启动第一个iotdb-1的confignode, 保证种子confignode节点先启动,然后依次启动第2和第3个confignode节点 - -```shell -# Unix/OS X -cd sbin -./start-confignode.sh -d #“-d”参数将在后台进行启动 - -# Windows -# V2.0.4.x 版本之前 -.\start-confignode.bat - -# V2.0.4.x 版本及之后 -.\windows\start-confignode.bat -``` - -如果启动失败,请参考下[常见问题](#常见问题) - -### 3.4 启动DataNode 节点 - -分别进入iotdb的sbin目录下,依次启动3个datanode节点: - -```shell -# Unix/OS X -cd sbin -./start-datanode.sh -d #-d参数将在后台进行启动 - -# Windows -# V2.0.4.x 版本之前 -.\start-datanode.bat - -# V2.0.4.x 版本及之后 -.\windows\start-datanode.bat -``` - -### 3.5 激活数据库 - -#### 方式一:通过 CLI 激活 - -- 进入集群任一节点 CLI - -```shell -# Linux 系统与 MacOS 系统启动命令如下: -# V2.0.6.x 版本之前 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x 版本及之后 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 - -# Windows 系统启动命令如下: -# V2.0.4.x 版本之前 -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.4.x 版本及之后, V2.0.6.x 版本之前 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x 版本及之后 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -- 执行以下内容获取激活所需机器码: - -```SQL -IoTDB> show system info -``` -```shell -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -|01-TE5NLES4-UDDWCMYE,01-GG5NLES4-XXDWCMYE,01-FF5NLES4-WWWWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -- 执行以下语句获取待激活数据库的版本号: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.x.x| xxxxxxx| -+-------+---------+ -Total line number 
= 1 -``` - -- 将获取到的机器码与版本号,一同提供给天谋工作人员。 - -- 工作人员会返回激活码,正常是与提供的机器码的顺序对应的,请将整串激活码粘贴到CLI中进行激活,此激活操作只需在集群中的任意一台机器上执行一次即可。 - - - 注:激活码前后需要用`'`符号进行标注,如下所示 - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - - -#### 方式二:激活文件拷贝激活 - -- 依次启动3个Confignode、Datanode节点后,每台机器各自的activation文件夹, 分别拷贝每台机器的system_info文件给天谋工作人员; -- 工作人员将返回每个ConfigNode、Datanode节点的license文件,这里会返回3个license文件; -- 将3个license文件分别放入对应的ConfigNode节点的activation文件夹下; - - -### 3.6 验证激活 - -可在 CLI 中通过执行 `show activation` 命令查看激活状态,示例如下,状态显示为 ACTIVATED 表示激活成功 - -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -### 3.7 一键启停集群 - -#### 3.7.1 概述 - -在 IoTDB 的根目录中,`sbin` 子目录包含的 `start-all.sh` 和 `stop-all.sh` 脚本,与 `conf` 子目录中的 `iotdb-cluster.properties` 配置文件协同工作,可通过单一节点实现一键启动或停止集群所有节点的功能。通过这种方式,可以高效地管理 IoTDB 集群的生命周期,简化了部署和运维流程。 -下文将介绍`iotdb-cluster.properties` 文件中的具体配置项。 - -#### 3.7.2 配置项 - - -> 注意: -> -> * 当集群变更时,需要手动更新此配置文件。 -> * 如果在未配置 
`iotdb-cluster.properties` 配置文件的情况下执行 `start-all.sh` 或者 `stop-all.sh` 脚本,则默认会启停当前脚本所在 IOTDB\_HOME 目录下的 ConfigNode 与 DataNode 节点。 -> * 推荐配置 ssh 免密登录:如果未配置,启动脚本后会提示输入服务器密码以便于后续启动/停止/销毁操作。如果已配置,则无需在执行脚本过程中输入服务器密码。 - -* confignode\_address\_list - -| 名字 | confignode\_address\_list | -| :--------------: | :------------------------------------------------------------------------------ | -| 描述 | 待启动/停止的 ConfigNode 节点所在主机的 IP 或主机名列表,如果有多个需要用“,”分隔。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* datanode\_address\_list - -| 名字 | datanode\_address\_list | -| :----------------: | :---------------------------------------------------------------------------- | -| 描述 | 待启动/停止的 DataNode 节点所在主机的 IP 或主机名列表,如果有多个需要用“,”分隔。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* ssh\_account - -| 名字 | ssh\_account | -| :----------------: | :------------------------------------------------------------- | -| 描述 | 通过 SSH 登陆目标主机的用户名,需要所有的主机的用户名都相同 | -| 类型 | String | -| 默认值 | root | -| 改后生效方式 | 重启服务生效 | - -* ssh\_port - -| 名字 | ssh\_port | -| :----------------: | :--------------------------------------------------------- | -| 描述 | 目标主机对外暴露的 SSH 端口,需要所有的主机的端口都相同 | -| 类型 | int | -| 默认值 | 22 | -| 改后生效方式 | 重启服务生效 | - -* confignode\_deploy\_path - -| 名字 | confignode\_deploy\_path | -| :----------------: | :---------------------------------------------------------------------------------------------------------------- | -| 描述 | 待启动/停止的所有 ConfigNode 所在目标主机的路径,需要所有待启动/停止的 ConfigNode 节点在目标主机的相同目录下。例如:`/data/demo/apache-iotdb-1.3.1-all-bin` | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* datanode\_deploy\_path - -| 名字 | datanode\_deploy\_path | -| :----------------: | :------------------------------------------------------------------------------------------------------------ | -| 描述 | 待启动/停止的所有 DataNode 所在目标主机的路径,需要所有待启动/停止的 DataNode 节点在目标主机的相同目录下。例如:`/data/demo/apache-iotdb-1.3.1-all-bin` | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - - -#### 3.7.3 简单示例 - -1. 
配置文件 `iotdb-cluster.properties` - -```properties -# Configure ConfigNodes machine addresses separated by , -confignode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# Configure DataNodes machine addresses separated by , -datanode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# User name for logging in to the deployment machine using ssh -ssh_account=root - -# ssh login port -ssh_port=22 - -# iotdb deployment directory (iotdb will be deployed to the target node in this folder) -confignode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -datanode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -``` - -2. 执行 ./start-all.sh 命令验证启动结果,在 cli 中执行 show cluster,可看到类似如下结果 -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -| 0|ConfigNode|Running| 172.xx.xx.16| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 1|ConfigNode|Running| 172.xx.xx.18| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 2|ConfigNode|Running| 172.xx.xx.17| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 3| DataNode|Running| 172.xx.xx.18| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 4| DataNode|Running| 172.xx.xx.17| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 5| DataNode|Running| 172.xx.xx.16| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -``` - -## 4. 
节点维护步骤 - -### 4.1 ConfigNode节点维护 - -ConfigNode节点维护分为ConfigNode添加和移除两种操作,有两个常见使用场景: -- 集群扩展:如集群中只有1个ConfigNode时,希望增加ConfigNode以提升ConfigNode节点高可用性,则可以添加2个ConfigNode,使得集群中有3个ConfigNode。 -- 集群故障恢复:1个ConfigNode所在机器发生故障,使得该ConfigNode无法正常运行,此时可以移除该ConfigNode,然后添加一个新的ConfigNode进入集群。 - -> ❗️注意,在完成ConfigNode节点维护后,需要保证集群中有1或者3个正常运行的ConfigNode。2个ConfigNode不具备高可用性,超过3个ConfigNode会导致性能损失。 - -#### 添加ConfigNode节点 - -脚本命令: -```shell -# Linux / MacOS -# 首先切换到IoTDB根目录 -sbin/start-confignode.sh - -# Windows -# 首先切换到IoTDB根目录 -# V2.0.4.x 版本之前 -sbin\start-confignode.bat - -# V2.0.4.x 版本及之后 -sbin\windows\start-confignode.bat -``` - -参数介绍: - -| 参数 | 描述 | 是否为必填项 | -| :--- | :--------------------------------------------- | :----------- | -| -v | 显示版本信息 | 否 | -| -f | 在前台运行脚本,不将其放到后台 | 否 | -| -d | 以守护进程模式启动,即在后台运行 | 否 | -| -p | 指定一个文件来存放进程ID,用于进程管理 | 否 | -| -c | 指定配置文件夹的路径,脚本会从这里加载配置文件 | 否 | -| -g | 打印垃圾回收(GC)的详细信息 | 否 | -| -H | 指定Java堆转储文件的路径,当JVM内存溢出时使用 | 否 | -| -E | 指定JVM错误日志文件的路径 | 否 | -| -D | 定义系统属性,格式为 key=value | 否 | -| -X | 直接传递 -XX 参数给 JVM | 否 | -| -h | 帮助指令 | 否 | - -#### 移除ConfigNode节点 - -首先通过CLI连接集群,通过`show confignodes`确认想要移除ConfigNode的NodeID: - -```Bash -IoTDB> show confignodes -+------+-------+---------------+------------+--------+ -|NodeID| Status|InternalAddress|InternalPort| Role| -+------+-------+---------------+------------+--------+ -| 0|Running| 127.0.0.1| 10710| Leader| -| 1|Running| 127.0.0.1| 10711|Follower| -| 2|Running| 127.0.0.1| 10712|Follower| -+------+-------+---------------+------------+--------+ -Total line number = 3 -It costs 0.030s -``` - -然后使用SQL将ConfigNode移除,SQL命令: - - -```Bash -remove confignode [confignode_id] - -``` - -### 4.2 DataNode节点维护 - -DataNode节点维护有两个常见场景: - -- 集群扩容:出于集群能力扩容等目的,添加新的DataNode进入集群 -- 集群故障恢复:一个DataNode所在机器出现故障,使得该DataNode无法正常运行,此时可以移除该DataNode,并添加新的DataNode进入集群 - -> ❗️注意,为了使集群能正常工作,在DataNode节点维护过程中以及维护完成后,正常运行的DataNode总数不得少于数据副本数(通常为2),也不得少于元数据副本数(通常为3)。 - -#### 添加DataNode节点 - -脚本命令: - -```Bash -# Linux / MacOS -# 首先切换到IoTDB根目录 
-sbin/start-datanode.sh - -# Windows -# 首先切换到IoTDB根目录 -# V2.0.4.x 版本之前 -sbin\start-datanode.bat - -# V2.0.4.x 版本及之后 -sbin\windows\start-datanode.bat -``` - -参数介绍: - -| 缩写 | 描述 | 是否为必填项 | -| :--- | :--------------------------------------------- | :----------- | -| -v | 显示版本信息 | 否 | -| -f | 在前台运行脚本,不将其放到后台 | 否 | -| -d | 以守护进程模式启动,即在后台运行 | 否 | -| -p | 指定一个文件来存放进程ID,用于进程管理 | 否 | -| -c | 指定配置文件夹的路径,脚本会从这里加载配置文件 | 否 | -| -g | 打印垃圾回收(GC)的详细信息 | 否 | -| -H | 指定Java堆转储文件的路径,当JVM内存溢出时使用 | 否 | -| -E | 指定JVM错误日志文件的路径 | 否 | -| -D | 定义系统属性,格式为 key=value | 否 | -| -X | 直接传递 -XX 参数给 JVM | 否 | -| -h | 帮助指令 | 否 | - -说明:在添加DataNode后,随着新的写入到来(以及旧数据过期,如果设置了TTL),集群负载会逐渐向新的DataNode均衡,最终在所有节点上达到存算资源的均衡。 - -#### 移除DataNode节点 - -首先通过CLI连接集群,通过`show datanodes`确认想要移除的DataNode的NodeID: - -```Bash -IoTDB> show datanodes -+------+-------+----------+-------+-------------+---------------+ -|NodeID| Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum| -+------+-------+----------+-------+-------------+---------------+ -| 1|Running| 0.0.0.0| 6667| 0| 0| -| 2|Running| 0.0.0.0| 6668| 1| 1| -| 3|Running| 0.0.0.0| 6669| 1| 0| -+------+-------+----------+-------+-------------+---------------+ -Total line number = 3 -It costs 0.110s -``` - -然后使用SQL将DataNode移除,SQL命令: - -```Bash -remove datanode [datanode_id] - -``` - -### 4.3 集群维护 - -更多关于集群维护的介绍可参考:[集群维护](../User-Manual/Load-Balance.md) - - -## 5. 常见问题 - -1. 部署过程中多次提示激活失败 - - 使用 `ls -al` 命令:使用 `ls -al` 命令检查安装包根目录的所有者信息是否为当前用户。 - - 检查激活目录:检查 `./activation` 目录下的所有文件,所有者信息是否为当前用户。 - -2. Confignode节点启动失败 - - 步骤 1: 请查看启动日志,检查是否修改了某些首次启动后不可改的参数。 - - 步骤 2: 请查看启动日志,检查是否出现其他异常。日志中若存在异常现象,请联系天谋技术支持人员咨询解决方案。 - - 步骤 3: 如果是首次部署或者数据可删除,也可按下述步骤清理环境,重新部署后,再次启动。 - - 步骤 4: 清理环境: - - a. 结束所有 ConfigNode 和 DataNode 进程。 - -```Bash - # 1. 停止 ConfigNode 和 DataNode 服务 - # Unix/OS X - sbin/stop-standalone.sh - - # Windows - # V2.0.4.x 版本之前 - sbin\stop-standalone.bat - - # V2.0.4.x 版本及之后 - sbin\windows\stop-standalone.bat - - # 2. 检查是否还有进程残留 - jps - # 或者 - ps -ef|grep iotdb - - # 3. 
如果有进程残留,则手动kill - kill -9 - # 如果确定机器上仅有1个iotdb,可以使用下面命令清理残留进程 - ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9 - ``` - b. 删除 data 和 logs 目录。 - - 说明:删除 data 目录是必要的,删除 logs 目录是为了纯净日志,非必需。 - ```Bash - cd /data/iotdb - rm -rf data logs - ``` diff --git a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Database-Resources_timecho.md b/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Database-Resources_timecho.md deleted file mode 100644 index 11f6b5861..000000000 --- a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Database-Resources_timecho.md +++ /dev/null @@ -1,206 +0,0 @@ - -# 资源规划 -## 1. CPU - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
序列数(采集频率<=1HZ)CPU节点数
单机双活分布式
10W以内2核-4核123
30W以内4核-8核123
50W以内8核-16核123
100W以内16核-32核123
200w以内32核-48核123
1000w以内48核12请联系天谋商务咨询
1000w以上请联系天谋商务咨询
- -> CPU支持型号:鲲鹏、飞腾、申威、海光、兆芯、龙芯 - -## 2. 内存 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
序列数(采集频率<=1HZ)内存节点数
单机双活分布式
10W以内2G-4G123
30W以内6G-12G123
50W以内12G-24G123
100W以内24G-48G123
200w以内48G-96G123
1000w以内128G12请联系天谋商务咨询
1000w以上请联系天谋商务咨询
- -> 提供灵活的内存配置选项,用户可在datanode-env文件中进行调整,详细信息和配置指南请参见 [datanode-env](../Reference/DataNode-Config-Manual.md#_2-环境配置项-datanode-env-sh-bat) - -## 3. 存储(磁盘) -### 3.1 存储空间 - -可通过磁盘资源评估器进行计算:[磁盘资源评估器](https://www.timecho.com/docs/zh/ResourceEvaluator.html) - -计算公式:测点数量 * 采样频率(Hz)* 每个数据点大小(Byte,不同数据类型不一样,见下表) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
数据点大小计算表
数据类型 时间戳(字节)值(字节)数据点总大小(字节)
开关量(Boolean)819
整型(INT32)/ 单精度浮点数(FLOAT)8412
长整型(INT64)/ 双精度浮点数(DOUBLE)8816
字符串(TEXT)8平均为a8+a
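- -Example: 1,000 devices with 100 measurement points each, for a total of 100,000 series of type INT32, sampled at 1 Hz (once per second), stored for 1 year with 3 replicas and a compression ratio of 10. -- Full formula: 1000 devices * 100 measurement points * 12 bytes per data point * 86400 seconds per day * 365 days per year * 3 replicas / 10 compression ratio / 1024 / 1024 / 1024 / 1024 ≈ 10.3 TiB, rounded up to 11T -- Short formula: 1000 * 100 * 12 * 86400 * 365 * 3 / 10 / 1024 / 1024 / 1024 / 1024 ≈ 10.3 TiB, rounded up to 11T -### 3.2 Storage Configuration -For more than 10 million measurement points, or for heavy query loads, SSDs are recommended. -## 4. Network (NIC) -When write throughput is below 10 million points per second, a gigabit NIC is required; when write throughput reaches or exceeds 10 million points per second, a 10-gigabit NIC is required. -| **Write throughput (data points/second)** | **NIC speed** | -| ------------------- | ------------- | -| < 10 million | 1 Gbps (gigabit) | -| >= 10 million | 10 Gbps (10-gigabit) | -## 5. Other Notes -IoTDB can scale out a cluster within seconds, and existing data does not need to be migrated when nodes are added. You therefore do not need to worry that a cluster sized from current data will be too limited: new nodes can be added to the cluster whenever more capacity is needed. \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Deployment-form_timecho.md b/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Deployment-form_timecho.md deleted file mode 100644 index d49674d07..000000000 --- a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Deployment-form_timecho.md +++ /dev/null @@ -1,61 +0,0 @@ - -# Deployment Forms - -IoTDB has three running modes: standalone mode, cluster mode, and dual-active mode. - -## 1. Standalone Mode - -An IoTDB standalone instance consists of 1 ConfigNode and 1 DataNode, i.e. 1C1D. - -- **Features**: Easy for developers to install and deploy, with low deployment and maintenance costs and simple operation. -- **Applicable scenarios**: Scenarios with limited resources or low high-availability requirements, such as edge servers. -- **Deployment method**: [Standalone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -## 2. Dual-Active Mode - -Dual-active deployment is a TimechoDB Enterprise Edition feature in which two independent instances synchronize bidirectionally and can both serve requests at the same time. When one instance restarts after downtime, the other resumes transmission of the missing data from the breakpoint. - -> An IoTDB dual-active deployment usually consists of 2 standalone nodes, i.e. 2 sets of 1C1D. Each instance can also be a cluster. - -- **Features**: The high-availability solution with the lowest resource usage. -- **Applicable scenarios**: Resources are limited (only two servers), but high availability is still required. -- **Deployment method**: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -## 3. Cluster Mode - -An IoTDB cluster instance consists of 3 ConfigNodes and at least 3 DataNodes, typically 3 DataNodes, i.e. 3C3D. When some nodes fail, the remaining nodes can still serve requests, which keeps the database service highly available, and database performance can be improved by adding nodes. - -- **Features**: High availability and high scalability; system performance can be improved by adding DataNodes. -- **Applicable scenarios**: Enterprise application scenarios that require high availability and reliability. -- **Deployment method**: [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - -## 4. 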
特点总结 - -| 维度 | 单机模式 | 双活模式 | 集群模式 | -| ------------ | ---------------------------- | ------------------------ | ------------------------ | -| 适用场景 | 边缘侧部署、对高可用要求不高 | 高可用性业务、容灾场景等 | 高可用性业务、容灾场景等 | -| 所需机器数量 | 1 | 2 | ≥3 | -| 安全可靠性 | 无法容忍单点故障 | 高,可容忍单点故障 | 高,可容忍单点故障 | -| 扩展性 | 可扩展 DataNode 提升性能 | 每个实例可按需扩展 | 可扩展 DataNode 提升性能 | -| 性能 | 可随 DataNode 数量扩展 | 与其中一个实例性能相同 | 可随 DataNode 数量扩展 | - -- 单机模式和集群模式,部署步骤类似(逐个增加 ConfigNode 和 DataNode),仅副本数和可提供服务的最少节点数不同。 \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Docker-Deployment_timecho.md b/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Docker-Deployment_timecho.md deleted file mode 100644 index 80a847eaf..000000000 --- a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Docker-Deployment_timecho.md +++ /dev/null @@ -1,495 +0,0 @@ - -# Docker部署指导 - -## 1. 环境准备 - -### 1.1 Docker安装 - -```Bash -#以ubuntu为例,其他操作系统可以自行搜索安装方法 -#step1: 安装一些必要的系统工具 -sudo apt-get update -sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common -#step2: 安装GPG证书 -curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add - -#step3: 写入软件源信息 -sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" -#step4: 更新并安装Docker-CE -sudo apt-get -y update -sudo apt-get -y install docker-ce -#step5: 设置docker开机自启动 -sudo systemctl enable docker -#step6: 验证docker是否安装成功 -docker --version #显示版本信息,即安装成功 -``` - -### 1.2 docker-compose安装 - -```Bash -#安装命令 -curl -L "https://github.com/docker/compose/releases/download/v2.20.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose -chmod +x /usr/local/bin/docker-compose -ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose -#验证是否安装成功 -docker-compose --version #显示版本信息即安装成功 -``` - -### 1.3 安装dmidecode插件 - -默认情况下,linux服务器应该都已安装,如果没有安装的话,可以使用下面的命令安装。 - -```Bash -sudo apt-get install dmidecode -``` - -dmidecode 
安装后,查找安装路径:`whereis dmidecode`,这里假设结果为`/usr/sbin/dmidecode`,记住该路径,后面的docker-compose的yml文件会用到。 - -### 1.4 获取IoTDB的容器镜像 - -关于IoTDB企业版的容器镜像您可联系商务或技术支持获取。 - -## 2. 单机版部署 - -本节演示如何部署1C1D的docker单机版。 - -### 2.1 load 镜像文件 - -比如这里获取的IoTDB的容器镜像文件名是:`iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz` - -load镜像: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -查看镜像: - -```Bash -docker images -``` - -![](/img/%E5%8D%95%E6%9C%BA-%E6%9F%A5%E7%9C%8B%E9%95%9C%E5%83%8F.png) - -### 2.2 创建docker bridge网络 - -```Bash -docker network create --driver=bridge --subnet=172.18.0.0/16 --gateway=172.18.0.1 iotdb -``` - -### 2.3 编写docker-compose的yml文件 - -这里我们以把IoTDB安装目录和yml文件统一放在`/docker-iotdb` 文件夹下为例: - -文件目录结构为:`/docker-iotdb/iotdb`, `/docker-iotdb/docker-compose-standalone.yml ` - -```Bash -docker-iotdb: -├── iotdb #iotdb安装目录 -│── docker-compose-standalone.yml #单机版docker-compose的yml文件 -``` - -完整的`docker-compose-standalone.yml`内容如下: - -```Bash -version: "3" -services: - iotdb-service: - image: timecho/timechodb:2.0.2.1-standalone #使用的镜像 - hostname: iotdb - container_name: iotdb - restart: always - ports: - - "6667:6667" - environment: - - cn_internal_address=iotdb - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb:10710 - - dn_rpc_address=iotdb - - dn_internal_address=iotdb - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - dn_seed_config_node=iotdb:10710 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - networks: - iotdb: - ipv4_address: 172.18.0.6 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -networks: - iotdb: - external: true -``` - -### 2.4 首次启动 - -使用下面的命令启动: - -```Bash -cd /docker-iotdb -docker-compose -f docker-compose-standalone.yml up -``` - -由于没有激活,首次启动时会直接退出,属于正常现象,首次启动是为了获取机器码文件,用于后面的激活流程。 - -![](/img/%E5%8D%95%E6%9C%BA-%E6%BF%80%E6%B4%BB.png) - -### 2.5 申请激活 - -- 首次启动后,在物理机目录`/docker-iotdb/iotdb/activation`下会生成一个 `system_info`文件,将这个文件拷贝给天谋工作人员。 - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- 收到工作人员返回的license文件,将license文件拷贝到`/docker-iotdb/iotdb/activation`文件夹下。 - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -### 2.6 再次启动IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -![](/img/%E5%90%AF%E5%8A%A8iotdb.png) - -### 2.7 验证部署 - -- 查看日志,有如下字样,表示启动成功 - -```Bash -docker logs -f iotdb-datanode #查看日志命令 -2024-07-19 12:02:32,608 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
-``` - -![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B21.png) - -- 进入容器,查看服务运行状态及激活信息 - - 查看启动的容器 - - ```Bash - docker ps - ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B22.png) - - 进入容器, 通过cli登录数据库, 使用show cluster命令查看服务状态及激活状态 - - ```Bash - docker exec -it iotdb /bin/bash #进入容器 - ./start-cli.sh -h iotdb #登录数据库 - IoTDB> show cluster #查看状态 - ``` - - 可以看到服务都是running,激活状态显示已激活。 - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B23.png) - -### 2.8 映射/conf目录(可选) - -后续如果想在物理机中直接修改配置文件,可以把容器中的/conf文件夹映射出来,分三步: - -步骤一:拷贝容器中的/conf目录到`/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -步骤二:在docker-compose-standalone.yml中添加映射 - -```Bash - volumes: - - ./iotdb/conf:/iotdb/conf #增加这个/conf文件夹的映射 - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -步骤三:重新启动IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -## 3. 集群版部署 - -本小节描述如何手动部署包括3个ConfigNode和3个DataNode的实例,即通常所说的3C3D集群。 - -
- -
- -**注意:集群版目前只支持host网络和overlay 网络,不支持bridge网络。** - -下面以host网络为例演示如何部署3C3D集群。 - -### 3.1 设置主机名 - -假设现在有3台linux服务器,IP地址和服务角色分配如下: - -| 节点ip | 主机名 | 服务 | -| ----------- | ------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode、DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode、DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode、DataNode | - -在3台机器上分别配置主机名,设置主机名需要在目标服务器上配置/etc/hosts,使用如下命令: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### 3.2 load镜像文件 - -比如获取的IoTDB的容器镜像文件名是:`iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz` - -在3台服务器上分别执行load镜像命令: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -查看镜像: - -```Bash -docker images -``` - -![](/img/%E9%95%9C%E5%83%8F%E5%8A%A0%E8%BD%BD.png) - -### 3.3 编写docker-compose的yml文件 - -这里我们以把IoTDB安装目录和yml文件统一放在/docker-iotdb文件夹下为例: - -文件目录结构为:`/docker-iotdb/iotdb`,`/docker-iotdb/confignode.yml`,`/docker-iotdb/datanode.yml` - -```Bash -docker-iotdb: -├── confignode.yml #confignode的yml文件 -├── datanode.yml #datanode的yml文件 -└── iotdb #IoTDB安装目录 -``` - -在每台服务器上都要编写2个yml文件,即`confignode.yml`和`datanode.yml`,yml示例如下: - -**confignode.yml:** - -```Bash -#confignode.yml -version: "3" -services: - iotdb-confignode: - image: iotdb-enterprise:2.0.x.x-standalone #使用的镜像 - hostname: iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - container_name: iotdb-confignode - command: ["bash", "-c", "entrypoint.sh confignode"] - restart: always - environment: - - cn_internal_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb-1:10710 #默认第一台为seed节点 - - schema_replication_factor=3 #元数据副本数 - - data_replication_factor=2 #数据副本数 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" 
#使用host网络 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -**datanode.yml:** - -```Bash -#datanode.yml -version: "3" -services: - iotdb-datanode: - image: iotdb-enterprise:2.0.x.x-standalone #使用的镜像 - hostname: iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - container_name: iotdb-datanode - command: ["bash", "-c", "entrypoint.sh datanode"] - restart: always - ports: - - "6667:6667" - privileged: true - environment: - - dn_rpc_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - dn_internal_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - dn_seed_config_node=iotdb-1:10710 #默认第1台为seed节点 - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - schema_replication_factor=3 #元数据副本数 - - data_replication_factor=2 #数据副本数 - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #使用host网络 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -### 3.4 首次启动confignode - -先在3台服务器上分别启动confignode, 用来获取机器码,注意启动顺序,先启动第1台iotdb-1,再启动iotdb-2和iotdb-3。 - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d #后台启动 -``` - -### 3.5 申请激活 - -- 首次启动3个confignode后,在每个物理机目录`/docker-iotdb/iotdb/activation`下都会生成一个`system_info`文件,将3个服务器的`system_info`文件拷贝给天谋工作人员; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- 将3个license文件分别放入对应的ConfigNode节点的`/docker-iotdb/iotdb/activation`文件夹下; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -- license放入对应的activation文件夹后,confignode会自动激活,不用重启confignode - -### 3.6 启动datanode - -在3台服务器上分别启动datanode - -```Bash -cd /docker-iotdb -docker-compose -f datanode.yml up -d #后台启动 -``` - -![](/img/%E9%9B%86%E7%BE%A4%E7%89%88-dn%E5%90%AF%E5%8A%A8.png) - -### 3.7 验证部署 - -- 查看日志,有如下字样,表示datanode启动成功 - - ```Bash - docker logs -f iotdb-datanode #查看日志命令 - 2024-07-20 16:50:48,937 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
- ``` - - ![](/img/dn%E5%90%AF%E5%8A%A8.png) - -- 进入任意一个容器,查看服务运行状态及激活信息 - - 查看启动的容器 - - ```Bash - docker ps - ``` - - ![](/img/%E6%9F%A5%E7%9C%8B%E5%AE%B9%E5%99%A8.png) - - 进入容器,通过cli登录数据库,使用`show cluster`命令查看服务状态及激活状态 - - ```Bash - docker exec -it iotdb-datanode /bin/bash #进入容器 - ./start-cli.sh -h iotdb-1 #登录数据库 - IoTDB> show cluster #查看状态 - ``` - - 可以看到服务都是running,激活状态显示已激活。 - - ![](/img/%E9%9B%86%E7%BE%A4-%E6%BF%80%E6%B4%BB.png) - -### 3.8 映射/conf目录(可选) - -后续如果想在物理机中直接修改配置文件,可以把容器中的/conf文件夹映射出来,分三步: - -步骤一:在3台服务器中分别拷贝容器中的/conf目录到`/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb-confignode:/iotdb/conf /docker-iotdb/iotdb/conf -或者 -docker cp iotdb-datanode:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -步骤二:在3台服务器的`confignode.yml`和`datanode.yml`中添加/conf目录映射 - -```Bash -#confignode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #增加这个/conf文件夹的映射 - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - -#datanode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #增加这个/conf文件夹的映射 - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -步骤三:在3台服务器上重新启动IoTDB - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d -docker-compose -f datanode.yml up -d -``` \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md b/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md deleted file mode 100644 index 5e9c9314f..000000000 --- a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md +++ /dev/null @@ -1,203 +0,0 @@ - -# 双活版部署指导 - -## 1. 什么是双活版? 
- -双活通常是指两个独立的单机(或集群),实时进行镜像同步,它们的配置完全独立,可以同时接收外界的写入,每一个独立的单机(或集群)都可以将写入到自己的数据同步到另一个单机(或集群)中,两个单机(或集群)的数据可达到最终一致。 - -- 两个单机(或集群)可构成一个高可用组:当其中一个单机(或集群)停止服务时,另一个单机(或集群)不会受到影响。当停止服务的单机(或集群)再次启动时,另一个单机(或集群)会将新写入的数据同步过来。业务可以绑定两个单机(或集群)进行读写,从而达到高可用的目的。 -- 双活部署方案允许在物理节点少于 3 的情况下实现高可用,在部署成本上具备一定优势。同时可以通过电力、网络的双环网,实现两套单机(或集群)的物理供应隔离,保障运行的稳定性。 -- 目前双活能力为企业版功能。 - -![](/img/%E5%8F%8C%E6%B4%BB%E5%90%8C%E6%AD%A5.png) - -## 2. 注意事项 - -1. 部署时推荐优先使用`hostname`进行IP配置,可避免后期修改主机ip导致数据库无法启动的问题。设置hostname需要在目标服务器上配置`/etc/hosts`,如本机ip是192.168.1.3,hostname是iotdb-1,则可以使用以下命令设置服务器的 hostname,并使用hostname配置IoTDB的`cn_internal_address`、`dn_internal_address`。 - - ```Bash - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -2. 有些参数首次启动后不能修改,请参考下方的"安装步骤"章节来进行设置。 - -3. 推荐部署监控面板,可以对重要运行指标进行监控,随时掌握数据库运行状态,监控面板可以联系商务获取,部署监控面板步骤可以参考[文档](https://www.timecho.com/docs/zh/UserGuide/latest/Deployment-and-Maintenance/Monitoring-panel-deployment.html) - -## 3. 安装步骤 - -我们以两台单机A和B构建的双活版IoTDB为例,A和B的ip分别是192.168.1.3 和 192.168.1.4 ,这里用hostname来表示不同的主机,规划如下: - -| 机器 | 机器ip | 主机名 | -| ---- | ----------- | ------- | -| A | 192.168.1.3 | iotdb-1 | -| B | 192.168.1.4 | iotdb-2 | - -### Step1:分别安装两套独立的 IoTDB - -在2个机器上分别安装 IoTDB,单机版部署文档可参考[文档](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md),集群版部署文档可参考[文档](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md)。**推荐 A、B 集群的各项配置保持一致,以实现最佳的双活效果。** - -### Step2:在机器A上创建数据同步任务至机器B - -- 在机器A上创建数据同步流程,即机器A上的数据自动同步到机器B,使用sbin目录下的cli工具连接A上的IoTDB数据库: - - ```Bash - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-1 - - # Windows - # V2.0.4.x 版本之前 - .\sbin\start-cli.bat -h iotdb-1 - - # V2.0.4.x 版本及之后 - .\sbin\windows\start-cli.bat -h iotdb-1 - ``` - -- 创建并启动数据同步命令,SQL 如下: - - ```Bash - create pipe AB - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-2', - 'sink.port'='6667' - ) - ``` - -- 注意:为了避免数据无限循环,需要将A和B上的参数`source.forwarding-pipe-requests` 均设置为 `false`,表示不转发从另一pipe传输而来的数据。 - -### 
Step3:在机器B上创建数据同步任务至机器A - - - 在机器B上创建数据同步流程,即机器B上的数据自动同步到机器A,使用sbin目录下的cli工具连接B上的IoTDB数据库: - - ```Bash - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-2 - - # Windows - # V2.0.4.x 版本之前 - .\sbin\start-cli.bat -h iotdb-2 - - # V2.0.4.x 版本及之后 - .\sbin\windows\start-cli.bat -h iotdb-2 - ``` - - 创建并启动pipe,SQL 如下: - - ```Bash - create pipe BA - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-1', - 'sink.port'='6667' - ) - ``` - -- 注意:为了避免数据无限循环,需要将A和B上的参数`source.forwarding-pipe-requests` 均设置为 `false`,表示不转发从另一pipe传输而来的数据。 - -### Step4:验证部署 - -上述数据同步流程创建完成后,即可启动双活集群。 - -#### 检查集群运行状态 - -```Bash -#在2个节点分别执行show cluster命令检查IoTDB服务状态 -show cluster -``` - -**机器A**: - -![](/img/%E5%8F%8C%E6%B4%BB-A.png) - -**机器B**: - -![](/img/%E5%8F%8C%E6%B4%BB-B.png) - -确保每一个 ConfigNode 和 DataNode 都处于 Running 状态。 - -#### 检查同步状态 - -- 机器A上检查同步状态 - -```Bash -show pipes -``` - -![](/img/show%20pipes-A.png) - -- 机器B上检查同步状态 - -```Bash -show pipes -``` - -![](/img/show%20pipes-B.png) - -确保每一个 pipe 都处于 RUNNING 状态。 - -### Step5:停止双活版 IoTDB - -- 在机器A的执行下列命令: - - ```SQL - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-1 #登录cli - IoTDB> stop pipe AB #停止数据同步流程 - ./sbin/stop-standalone.sh #停止数据库服务 - - # Windows - # V2.0.4.x 版本之前 - .\sbin\start-cli.bat -h iotdb-1 - IoTDB> stop pipe AB - .\sbin\stop-standalone.bat - - # V2.0.4.x 版本及之后 - .\sbin\windows\start-cli.bat -h iotdb-1 - IoTDB> stop pipe AB - .\sbin\windows\stop-standalone.bat - ``` - -- 在机器B的执行下列命令: - - ```SQL - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-2 #登录cli - IoTDB> stop pipe BA #停止数据同步流程 - ./sbin/stop-standalone.sh #停止数据库服务 - - # Windows - # V2.0.4.x 版本之前 - .\sbin\start-cli.bat -h iotdb-2 - IoTDB> stop pipe BA - .\sbin\stop-standalone.bat - - # V2.0.4.x 版本及之后 - .\sbin\windows\start-cli.bat -h iotdb-2 - IoTDB> stop pipe BA - .\sbin\windows\stop-standalone.bat - ``` diff --git a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/IoTDB-Package_timecho.md 
b/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/IoTDB-Package_timecho.md deleted file mode 100644 index e2aaf79fb..000000000 --- a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/IoTDB-Package_timecho.md +++ /dev/null @@ -1,48 +0,0 @@ - -# 安装包获取 - -## 1. 企业版获取方式 - -企业版安装包可通过产品试用申请,或直接联系与您对接的商务人员获取。 - -## 2. 安装包结构 - -解压后安装包(iotdb-enterprise-{version}-bin.zip),安装包解压后目录结构如下: - -| **目录** | **类型** | **说明** | -| ---------------- | -------- | ------------------------------------------------------------ | -| activation | 文件夹 | 激活文件所在目录,包括生成的机器码以及从天谋工作人员获取的企业版激活码(启动ConfigNode后才会生成该目录,即可获取激活码) | -| conf | 文件夹 | 配置文件目录,包含 ConfigNode、DataNode、JMX 和 logback 等配置文件 | -| data | 文件夹 | 默认的数据文件目录,包含 ConfigNode 和 DataNode 的数据文件。(启动程序后才会生成该目录) | -| lib | 文件夹 | 库文件目录 | -| licenses | 文件夹 | 开源协议证书文件目录 | -| logs | 文件夹 | 默认的日志文件目录,包含 ConfigNode 和 DataNode 的日志文件(启动程序后才会生成该目录) | -| sbin | 文件夹 | 主要脚本目录,包含数据库启、停等脚本 | -| tools | 文件夹 | 工具目录 | -| ext | 文件夹 | pipe,trigger,udf插件的相关文件 | -| LICENSE | 文件 | 开源许可证文件 | -| NOTICE | 文件 | 开源声明文件 | -| README_ZH.md | 文件 | 使用说明(中文版) | -| README.md | 文件 | 使用说明(英文版) | -| RELEASE_NOTES.md | 文件 | 版本说明 | - -注意:自 V2.0.8.2 版本起,TimechoDB 安装包中默认不包含 MQTT 服务 和 REST 服务的 JAR 包。如需使用,请联系天谋团队获取。 \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Kubernetes_timecho.md b/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Kubernetes_timecho.md deleted file mode 100644 index 7fbc7be8d..000000000 --- a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Kubernetes_timecho.md +++ /dev/null @@ -1,445 +0,0 @@ - - -# Kubernetes - -## 1. 环境准备 - -### 1.1 准备 Kubernetes 集群 - -确保拥有一个可用的 Kubernetes 集群(建议最低版本:Kubernetes 1.24),作为部署 IoTDB 集群的基础。 - -Kubernetes 版本要求:建议版本为 Kubernetes 1.24及以上 - -IoTDB版本要求:不能低于v1.3.3 - -## 2. 
创建命名空间 - -### 2.1 创建命名空间 - -> 注意:在执行命名空间创建操作之前,需验证所指定的命名空间名称在 Kubernetes 集群中尚未被使用。如果命名空间已存在,创建命令将无法执行,可能导致部署过程中的错误。 - -```Bash -kubectl create ns iotdb-ns -``` - -### 2.2 查看命名空间 - -```Bash -kubectl get ns -``` - -## 3. 创建 PersistentVolume (PV) - -### 3.1 创建 PV 配置文件 - -PV用于持久化存储IoTDB的ConfigNode 和 DataNode的数据,有几个节点就要创建几个PV。 - -> 注:1个ConfigNode和1个DataNode 算2个节点,需要2个PV。 - -以 3ConfigNode、3DataNode 为例: - -1. 创建 `pv.yaml` 文件,并复制六份,分别重命名为 `pv01.yaml` ~ `pv06.yaml`。 - -```Bash -#可新建个文件夹放yaml文件 -#创建 pv.yaml 文件语句 -touch pv.yaml -``` - -2. 修改每个文件中的 `name` 和 `path` 以确保一致性。 - -**pv.yaml 示例:** - -```YAML -# pv.yaml -apiVersion: v1 -kind: PersistentVolume -metadata: - name: iotdb-pv-01 -spec: - capacity: - storage: 10Gi # 存储容量 - accessModes: # 访问模式 - - ReadWriteOnce - persistentVolumeReclaimPolicy: Retain # 回收策略 - # 存储类名称,如果使用本地静态存储storageClassName 不用配置,如果使用动态存储必需设置此项 - storageClassName: local-storage - # 根据你的存储类型添加相应的配置 - hostPath: # 如果是使用本地路径 - path: /data/k8s-data/iotdb-pv-01 - type: DirectoryOrCreate # 这行不配置就要手动创建文件夹 -``` - -### 3.2 应用 PV 配置 - -```Bash -kubectl apply -f pv01.yaml -kubectl apply -f pv-02.yaml -... -``` - -### 3.3 查看 PV - -```Bash -kubectl get pv -``` - - -### 3.4 手动创建文件夹 - -> 如果yaml里的hostPath-type未配置,需要手动创建对应的文件夹 - -在所有 Kubernetes 节点上创建对应的文件夹: - -```Bash -mkdir -p /data/k8s-data/iotdb-pv-01 -mkdir -p /data/k8s-data/iotdb-pv-02 -... -``` - -## 4. 安装 Helm - -安装Helm步骤请参考[Helm官网](https://helm.sh/zh/docs/intro/install/) - -## 5. 
配置IoTDB的Helm Chart - -### 5.1 克隆 IoTDB Kubernetes 部署代码 - -请联系天谋工作人员获取IoTDB的Helm Chart - -### 5.2 修改 YAML 文件 - -> 确保使用的是支持的版本 >=1.3.3.2 - -**values.yaml 示例:** - -```YAML -nameOverride: "iotdb" -fullnameOverride: "iotdb" #软件安装后的名称 - -image: - repository: nexus.infra.timecho.com:8143/timecho/iotdb-enterprise - pullPolicy: IfNotPresent - tag: 1.3.3.2-standalone #软件所用的仓库和版本 - -storage: -# 存储类名称,如果使用本地静态存储storageClassName 不用配置,如果使用动态存储必需设置此项 - className: local-storage - -datanode: - name: datanode - nodeCount: 3 #datanode的节点数量 - enableRestService: true - storageCapacity: 10Gi #datanode的可用空间大小 - resources: - requests: - memory: 2Gi #datanode的内存初始化大小 - cpu: 1000m #datanode的CPU初始化大小 - limits: - memory: 4Gi #datanode的最大内存大小 - cpu: 1000m #datanode的最大CPU大小 - -confignode: - name: confignode - nodeCount: 3 #confignode的节点数量 - storageCapacity: 10Gi #confignode的可用空间大小 - resources: - requests: - memory: 512Mi #confignode的内存初始化大小 - cpu: 1000m #confignode的CPU初始化大小 - limits: - memory: 1024Mi #confignode的最大内存大小 - cpu: 2000m #confignode的最大CPU大小 - configNodeConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - schemaReplicationFactor: 3 - schemaRegionConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - dataReplicationFactor: 2 - dataRegionConsensusProtocolClass: org.apache.iotdb.consensus.iot.IoTConsensus -``` - -## 6. 
配置私库信息或预先使用ctr拉取镜像 - -在k8s上配置私有仓库的信息,为下一步helm install的前置步骤。 - -方案一即在 helm install 时拉取可用的iotdb镜像,方案二则是提前将可用的iotdb镜像导入到containerd里。 - -### 6.1 【方案一】从私有仓库拉取镜像 - -#### 6.1.1 创建secret 使k8s可访问iotdb-helm的私有仓库 - -下文中“xxxxxx”表示IoTDB私有仓库的账号、密码、邮箱。 - -```Bash -# 注意 单引号 -kubectl create secret docker-registry timecho-nexus \ - --docker-server='nexus.infra.timecho.com:8143' \ - --docker-username='xxxxxx' \ - --docker-password='xxxxxx' \ - --docker-email='xxxxxx' \ - -n iotdb-ns - -# 查看secret -kubectl get secret timecho-nexus -n iotdb-ns -# 查看并输出为yaml -kubectl get secret timecho-nexus --output=yaml -n iotdb-ns -# 查看并解密 -kubectl get secret timecho-nexus --output="jsonpath={.data.\.dockerconfigjson}" -n iotdb-ns | base64 --decode -``` - -#### 6.1.2 将secret作为一个patch加载到命名空间iotdb-ns - -```Bash -# 添加一个patch,使该命名空间增加登陆nexus的登陆信息 -kubectl patch serviceaccount default -n iotdb-ns -p '{"imagePullSecrets": [{"name": "timecho-nexus"}]}' - -# 查看命名空间的该条信息 -kubectl get serviceaccounts -n iotdb-ns -o yaml -``` - -### 6.2 【方案二】导入镜像 - -该步骤用于客户无法连接私库的场景,需要联系公司实施同事辅助准备。 - -#### 6.2.1 拉取并导出镜像: - -```Bash -ctr images pull --user xxxxxxxx nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.2 查看并导出镜像: - -```Bash -# 查看 -ctr images ls - -# 导出 -ctr images export iotdb-enterprise:1.3.3.2-standalone.tar nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.3 导入到k8s的namespace下: - -> 注意,k8s.io为示例环境中k8s的ctr的命名空间,导入到其他命名空间是不行的 - -```Bash -# 导入到k8s的namespace下 -ctr -n k8s.io images import iotdb-enterprise:1.3.3.2-standalone.tar -``` - -#### 6.2.4 查看镜像 - -```Bash -ctr --namespace k8s.io images list | grep 1.3.3.2 -``` - -## 7. 
安装 IoTDB - -### 7.1 安装 IoTDB - -```Bash -# 进入文件夹 -cd iotdb-cluster-k8s/helm - -# 安装iotdb -helm install iotdb ./ -n iotdb-ns -``` - -### 7.2 查看 Helm 安装列表 - -```Bash -# helm list -helm list -n iotdb-ns -``` - -### 7.3 查看 Pods - -```Bash -# 查看 iotdb的pods -kubectl get pods -n iotdb-ns -o wide -``` - -执行命令后,输出了带有confignode和datanode标识的各3个Pods,,总共6个Pods,即表明安装成功;需要注意的是,并非所有Pods都处于Running状态,未激活的datanode可能会持续重启,但在激活后将恢复正常。 - -### 7.4 发现故障的排除方式 - -```Bash -# 查看k8s的创建log -kubectl get events -n iotdb-ns -watch kubectl get events -n iotdb-ns - -# 获取详细信息 -kubectl describe pod confignode-0 -n iotdb-ns -kubectl describe pod datanode-0 -n iotdb-ns - -# 查看confignode日志 -kubectl logs -n iotdb-ns confignode-0 -f -``` - -## 8. 激活 IoTDB - -### 8.1 方案1:直接在 Pod 中激活(最快捷) - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-1 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-2 -- /iotdb/sbin/start-activate.sh -# 拿到机器码后进行激活 -``` - -### 8.2 方案2:进入confignode的容器中激活 - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /bin/bash -cd /iotdb/sbin -/bin/bash start-activate.sh -# 拿到机器码后进行激活 -# 退出容器 -``` - -### 8.3 方案3:手动激活 - -1. 查看 ConfigNode 详细信息,确定所在节点: - -```Bash -kubectl describe pod confignode-0 -n iotdb-ns | grep -e "Node:" -e "Path:" - -# 结果示例: -# Node: a87/172.20.31.87 -# Path: /data/k8s-data/env/confignode/.env -``` - -2. 查看 PVC 并找到 ConfigNode 对应的 Volume,确定所在路径: - -```Bash -kubectl get pvc -n iotdb-ns | grep "confignode-0" - -# 结果示例: -# map-confignode-confignode-0 Bound iotdb-pv-04 10Gi RWO local-storage 8h - -# 如果要查看多个confignode,使用如下: -for i in {0..2}; do echo confignode-$i;kubectl describe pod confignode-${i} -n iotdb-ns | grep -e "Node:" -e "Path:"; echo "----"; done -``` - -3. 查看对应 Volume 的详细信息,确定物理目录的位置: - -```Bash -kubectl describe pv iotdb-pv-04 | grep "Path:" - -# 结果示例: -# Path: /data/k8s-data/iotdb-pv-04 -``` - -4. 
从对应节点的对应目录下找到 system-info 文件,使用该 system-info 作为机器码生成激活码,并在同级目录新建文件 license,将激活码写入到该文件。 - -## 9. 验证 IoTDB - -### 9.1 查看命名空间内的 Pods 状态 - -查看iotdb-ns命名空间内的IP、状态等信息,确定全部运行正常 - -```Bash -kubectl get pods -n iotdb-ns -o wide - -# 结果示例: -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` - -### 9.2 查看命名空间内的端口映射情况 - -```Bash -kubectl get svc -n iotdb-ns - -# 结果示例: -# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE -# confignode-svc NodePort 10.10.226.151 80:31026/TCP 7d8h -# datanode-svc NodePort 10.10.194.225 6667:31563/TCP 7d8h -# jdbc-balancer LoadBalancer 10.10.191.209 6667:31895/TCP 7d8h -``` - -### 9.3 在任意服务器启动 CLI 脚本验证 IoTDB 集群状态 - -端口即jdbc-balancer的端口,服务器为k8s任意节点的IP - -```Bash -start-cli.sh -h 172.20.31.86 -p 31895 -start-cli.sh -h 172.20.31.87 -p 31895 -start-cli.sh -h 172.20.31.88 -p 31895 -``` - - - -## 10. 扩容 - -### 10.1 新增pv - -新增pv,必须有可用的pv才可以扩容。 - - - -**注意:DataNode重启后无法加入集群** - -**原因**:配置了静态存储的 hostPath 模式,并通过脚本修改了 `iotdb-system.properties` 文件,将 `dn_data_dirs` 设为 `/iotdb6/iotdb_data,/iotdb7/iotdb_data`,但未将默认存储路径 `/iotdb/data` 进行外挂,导致重启时数据丢失。 - -**解决方案**:是将 `/iotdb/data` 目录也进行外挂操作,且 ConfigNode 和 DataNode 均需如此设置,以确保数据完整性和集群稳定性。 - -### 10.2 扩容confignode - -示例:3 confignode 扩容为 4 confignode - -修改iotdb-cluster-k8s/helm的values.yaml文件,将confignode的3改成4 - -```Shell -helm upgrade iotdb . -n iotdb-ns -``` - - - - -### 10.3 扩容datanode - -示例:3 datanode 扩容为 4 datanode - -修改iotdb-cluster-k8s/helm的values.yaml文件,将datanode的3改成4 - -```Shell -helm upgrade iotdb . 
-n iotdb-ns -``` - -### 10.4 验证IoTDB状态 - -```Shell -kubectl get pods -n iotdb-ns -o wide - -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -# datanode-3 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md b/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md deleted file mode 100644 index a877f5b27..000000000 --- a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md +++ /dev/null @@ -1,294 +0,0 @@ - -# 单机版部署指导 - -本章将介绍如何启动IoTDB单机实例,IoTDB单机实例包括 1 个ConfigNode 和1个DataNode(即通常所说的1C1D)。 - -## 1. 注意事项 - -1. 安装前请确认系统已参照[系统配置](./Environment-Requirements.md)准备完成。 - -2. 部署时推荐优先使用`hostname`进行IP配置,可避免后期修改主机ip导致数据库无法启动的问题。设置hostname需要在目标服务器上配置/etc/hosts,如本机ip是192.168.1.3,hostname是iotdb-1,则可以使用以下命令设置服务器的 hostname,并使用hostname配置IoTDB的`cn_internal_address`、dn_internal_address、dn_rpc_address。 - - ```shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -3. 部分参数首次启动后不能修改,请参考下方的【参数配置】章节进行设置 - -4. 无论是在linux还是windows中,请确保IoTDB的安装路径中不含空格和中文,避免软件运行异常。 - -5. 请注意,安装部署(包括激活和使用软件)IoTDB时需要保持使用同一个用户进行操作,您可以: -- 使用 root 用户(推荐):使用 root 用户可以避免权限等问题。 -- 使用固定的非 root 用户: - - 使用同一用户操作:确保在启动、激活、停止等操作均保持使用同一用户,不要切换用户。 - - 避免使用 sudo:尽量避免使用 sudo 命令,因为它会以 root 用户权限执行命令,可能会引起权限混淆或安全问题。 - -6. 推荐部署监控面板,可以对重要运行指标进行监控,随时掌握数据库运行状态,监控面板可以联系商务获取,部署监控面板步骤可以参考:[监控面板部署](./Monitoring-panel-deployment.md)。 - -7. 在安装部署数据库前,可以使用健康检查工具检测 IoTDB 节点运行环境,并获取详细的检查结果。 IoTDB 健康检查工具使用方法可以参考:[健康检查工具](../Tools-System/Health-Check-Tool.md)。 - -## 2. 
安装步骤 - -### 2.1 前置检查 - -为确保您获取的IoTDB企业版安装包完整且正确,在执行安装部署前建议您进行SHA512校验。 - -#### 准备工作: - -- 获取官方发布的 SHA512 校验码:[发布历史](../IoTDB-Introduction/Release-history_timecho.md)文档中各版本对应的"SHA512校验码" - -#### 校验步骤(以 linux 为例): - -1. 打开终端,进入安装包所在目录(如`/data/iotdb`): - ```Bash - cd /data/iotdb - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -![img](/img/sha512-01.png) - -4. 对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行IoTDB企业版的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - -### 2.2 解压安装包并进入安装目录 - -```shell -unzip iotdb-enterprise-{version}-bin.zip -cd iotdb-enterprise-{version}-bin -``` - -### 2.3 参数配置 - -#### 环境脚本配置 - -- ./conf/confignode-env.sh(./conf/confignode-env.bat)配置 - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------: | :------------------------------------: |:------------------------:| :----------------------------------------------: | :----------: | -| MEMORY_SIZE | IoTDB ConfigNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的30% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -- ./conf/datanode-env.sh(./conf/datanode-env.bat)配置 - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------: | :----------------------------------: |:----------------------:| :----------------------------------------------: | :----------: | -| MEMORY_SIZE | IoTDB DataNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的50% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -#### 系统通用配置 - -打开通用配置文件(./conf/iotdb-system.properties 文件),设置以下参数: - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :-----------------------: | :------------------------------: | :------------: | :----------------------------------------------: |:------------------------:| -| cluster_name | 集群名称 | defaultCluster | 可根据需要设置集群名称,如无特殊需要保持默认即可 | 支持热加载,但不建议手动修改该参数 | -| schema_replication_factor | 元数据副本数,单机版此处设置为 1 | 1 | 1 | 默认1,首次启动后不可修改 | -| data_replication_factor | 数据副本数,单机版此处设置为 1 | 1 | 1 | 默认1,首次启动后不可修改 | - -#### 
ConfigNode配置 - -打开ConfigNode配置文件(./conf/iotdb-system.properties文件),设置以下参数: - -| **配置项** | **说明** | **默认** | 推荐值 | **备注** | -| :-----------------: | :----------------------------------------------------------: | :-------------: | :----------------------------------------------: | :----------------: | -| cn_internal_address | ConfigNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | 首次启动后不能修改 | -| cn_internal_port | ConfigNode在集群内部通讯使用的端口 | 10710 | 10710 | 首次启动后不能修改 | -| cn_consensus_port | ConfigNode副本组共识协议通信使用的端口 | 10720 | 10720 | 首次启动后不能修改 | -| cn_seed_config_node | 节点注册加入集群时连接的ConfigNode 的地址,cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | 首次启动后不能修改 | - -#### DataNode 配置 - -打开DataNode配置文件 ./conf/iotdb-system.properties,设置以下参数: - -| **配置项** | **说明** | **默认** | 推荐值 | **备注** | -| :------------------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------- | :----------------- | -| dn_rpc_address | 客户端 RPC 服务的地址 | 127.0.0.1 | 默认本机可直接访问。非本机访问,请修改此配置项为所在服务器的IPV4地址或hostname,推荐使用所在服务器的IPV4地址。 | 重启服务生效 | -| dn_rpc_port | 客户端 RPC 服务的端口 | 6667 | 6667 | 重启服务生效 | -| dn_internal_address | DataNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | 首次启动后不能修改 | -| dn_internal_port | DataNode在集群内部通信使用的端口 | 10730 | 10730 | 首次启动后不能修改 | -| dn_mpp_data_exchange_port | DataNode用于接收数据流使用的端口 | 10740 | 10740 | 首次启动后不能修改 | -| dn_data_region_consensus_port | DataNode用于数据副本共识协议通信使用的端口 | 10750 | 10750 | 首次启动后不能修改 | -| dn_schema_region_consensus_port | DataNode用于元数据副本共识协议通信使用的端口 | 10760 | 10760 | 首次启动后不能修改 | -| dn_seed_config_node | 节点注册加入集群时连接的ConfigNode地址,即cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | 首次启动后不能修改 | - -> ❗️注意:VSCode Remote等编辑器无自动保存配置功能,请确保修改的文件被持久化保存,否则配置项无法生效 - -### 2.4 启动 ConfigNode 节点 - -进入iotdb的sbin目录下,启动confignode - -```shell -# Unix/OS X 
-./sbin/start-confignode.sh -d #“-d”参数将在后台进行启动 - -# Windows -# V2.0.4.x 版本之前 -.\sbin\start-confignode.bat - -# V2.0.4.x 版本及之后 -.\sbin\windows\start-confignode.bat -``` - -如果启动失败,请参考下方[常见问题](#常见问题)。 - -### 2.5 启动 DataNode 节点 - -进入iotdb的sbin目录下,启动datanode: - -```shell -# Unix/OS X -./sbin/start-datanode.sh -d #“-d”参数将在后台进行启动 - -# Windows -# V2.0.4.x 版本之前 -.\sbin\start-datanode.bat - -# V2.0.4.x 版本及之后 -.\sbin\windows\start-datanode.bat -``` - -### 2.6 激活数据库 - -#### 方式一:命令激活 -- 进入 IoTDB CLI - -```shell -# Linux 系统与 MacOS 系统启动命令如下: -# V2.0.6.x 版本之前 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x 版本及之后 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 - -# Windows 系统启动命令如下: -# V2.0.4.x 版本之前 -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.4.x 版本及之后, V2.0.6.x 版本之前 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x 版本及之后 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` -- 执行以下语句获取激活所需机器码: - -```SQL -IoTDB> show system info -``` -```shell -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -- 执行以下语句获取待激活数据库的版本号: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.x.x| xxxxxxx| -+-------+---------+ -Total line number = 1 -``` - -- 将获取到的机器码与版本号,一同提供给天谋工作人员。 - -- 将工作人员返回的激活码输入到 CLI 中进行激活操作,请注意激活码前后需要用`'`符号进行标注,如下所示 - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - - -#### 方式二:文件激活 - -- 
启动Confignode、Datanode节点后,进入activation文件夹, 将 system_info文件复制给天谋工作人员 -- 收到工作人员返回的 license文件 -- 将license文件放入对应节点的activation文件夹下; - - -### 2.7 验证激活 - -可在 CLI 中通过执行 `show activation` 命令查看激活状态,当看到“ClusterActivationStatus”字段状态显示为 ACTIVATED 表示激活成功 - -![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81.png) - -## 3. 常见问题 - -1. 部署过程中多次提示激活失败 - - 使用 `ls -al` 命令:使用 `ls -al` 命令检查安装包根目录的所有者信息是否为当前用户。 - - 检查激活目录:检查 `./activation` 目录下的所有文件,所有者信息是否为当前用户。 - -2. Confignode节点启动失败 - - 步骤 1: 请查看启动日志,检查是否修改了某些首次启动后不可改的参数。 - - 步骤 2: 请查看启动日志,检查是否出现其他异常。日志中若存在异常现象,请联系天谋技术支持人员咨询解决方案。 - - 步骤 3: 如果是首次部署或者数据可删除,也可按下述步骤清理环境,重新部署后,再次启动。 - - 步骤 4: 清理环境: - - a. 结束所有 ConfigNode 和 DataNode 进程。 - -```Bash - # 1. 停止 ConfigNode 和 DataNode 服务 - # Unix/OS X - sbin/stop-standalone.sh - - # Windows - # V2.0.4.x 版本之前 - sbin\stop-standalone.bat - - # V2.0.4.x 版本及之后 - sbin\windows\stop-standalone.bat - - # 2. 检查是否还有进程残留 - jps - # 或者 - ps -ef|grep iotdb - - # 3. 如果有进程残留,则手动kill - kill -9 - # 如果确定机器上仅有1个iotdb,可以使用下面命令清理残留进程 - ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9 - ``` - b. 删除 data 和 logs 目录。 - - 说明:删除 data 目录是必要的,删除 logs 目录是为了纯净日志,非必需。 - ```Bash - cd /data/iotdb - rm -rf data logs - ``` diff --git a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/workbench-deployment_timecho.md b/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/workbench-deployment_timecho.md deleted file mode 100644 index 48267f677..000000000 --- a/src/zh/UserGuide/Master/Tree/Deployment-and-Maintenance/workbench-deployment_timecho.md +++ /dev/null @@ -1,254 +0,0 @@ - -# 可视化控制台部署 - -可视化控制台是IoTDB配套工具之一(类似 Navicat for MySQL)。它用于数据库部署实施、运维管理、应用开发各阶段的官方应用工具体系,让数据库的使用、运维和管理更加简单、高效,真正实现数据库低成本的管理和运维。本文档将帮助您安装Workbench。 - -
- -可视化控制台工具的使用说明可参考文档 [使用说明](../Tools-System/Workbench_timecho.md) 章节。 - -## 1. 安装准备 - -| 准备内容 | 名称 | 版本要求 | 官方链接 | -| :------: | :-----------------------: | :----------------------------------------------------------: | :----------------------------------------------------: | -| 操作系统 | Windows或Linux | - | - | -| 安装环境 | JDK | 1.5.4及以下版本需要 >= 1.8,1.5.5及以上版本需要 >= 17(下载时请根据机器配置选择ARM或x64安装包) | https://www.oracle.com/java/technologies/downloads/ | -| 相关软件 | Prometheus | 需要 >=V2.30.3 | https://prometheus.io/download/ | -| 数据库 | IoTDB | 需要>=V1.2.0企业版 | 您可联系商务或技术支持获取 | -| 控制台 | IoTDB-Workbench-``| - | 您可根据附录版本对照表进行选择后联系商务或技术支持获取 | - -### 1.1 前置检查 - -为确保您获取的可视化控制台安装包完整且正确,在执行安装部署前建议您进行SHA512校验。 - -#### 准备工作: - -- 获取官方发布的 SHA512 校验码:联系天谋工作人员获取 - -#### 校验步骤(以 linux 为例): - -1. 打开终端,进入安装包所在目录(如`/data/workbench`): - ```Bash - cd /data/workbench - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum IoTDB-Workbench-``.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -![img](/img/sha512-04.png) - -4. 对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行可视化控制台的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - -## 2. 安装步骤 - -### 2.1 步骤一:IoTDB 开启监控指标采集 - -1. 打开监控配置项。IoTDB中监控有关的配置项默认是关闭的,在部署监控面板前,您需要打开相关配置项(注意开启监控配置后需要重启服务)。 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
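| Configuration item | Configuration file | Description |
| :--------------------------------- | :--------------------------- | :----------------------------------------------------------- |
| cn_metric_reporter_list | conf/iotdb-system.properties | Add this item to the configuration file and set it to PROMETHEUS |
| cn_metric_level | conf/iotdb-system.properties | Add this item to the configuration file and set it to IMPORTANT |
| cn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this item to the configuration file. The default 9091 can be kept; if you set another port, make sure it does not conflict with a port already in use |
| dn_metric_reporter_list | conf/iotdb-system.properties | Add this item to the configuration file and set it to PROMETHEUS |
| dn_metric_level | conf/iotdb-system.properties | Add this item to the configuration file and set it to IMPORTANT |
| dn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this item to the configuration file. The default 9092 can be kept; if you set another port, make sure it does not conflict with a port already in use |
| dn_metric_internal_reporter_type | conf/iotdb-system.properties | Add this item to the configuration file and set it to IOTDB |
| enable_audit_log | conf/iotdb-system.properties | Add this item to the configuration file and set it to true |
| audit_log_storage | conf/iotdb-system.properties | Add this item to the configuration file and set it to IOTDB,LOGGER |
| audit_log_operation | conf/iotdb-system.properties | Add this item to the configuration file and set it to DML,DDL,QUERY |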
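
As a minimal sketch, the monitoring and audit settings listed in this step could be added to `conf/iotdb-system.properties` as one block. The keys and values are taken directly from the table; the grouping comments and the use of the default ports 9091 and 9092 are assumptions made for illustration, so adjust the ports if they are already in use on your servers.

```properties
# Sketch only: add to conf/iotdb-system.properties on every node, then restart the services.

# ConfigNode metrics
cn_metric_reporter_list=PROMETHEUS
cn_metric_level=IMPORTANT
# Default port; any free port works (assumed unchanged here)
cn_metric_prometheus_reporter_port=9091

# DataNode metrics
dn_metric_reporter_list=PROMETHEUS
dn_metric_level=IMPORTANT
# Default port; any free port works (assumed unchanged here)
dn_metric_prometheus_reporter_port=9092
dn_metric_internal_reporter_type=IOTDB

# Audit log
enable_audit_log=true
audit_log_storage=IOTDB,LOGGER
audit_log_operation=DML,DDL,QUERY
```

If the ports are changed, the scrape targets configured for Prometheus must use the same port numbers.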
- -2. 重启所有节点。修改3个节点的监控指标配置后,可重新启动所有节点的confignode和datanode服务: - - ```shell - # Unix/OS X - ./sbin/stop-standalone.sh #先停止confignode和datanode - ./sbin/start-confignode.sh -d #启动confignode - ./sbin/start-datanode.sh -d #启动datanode - - # Windows - # V2.0.4.x 版本之前 - .\sbin\stop-standalone.bat - .\sbin\start-confignode.bat - .\sbin\start-datanode.bat - - # V2.0.4.x 版本及之后 - .\sbin\windows\stop-standalone.bat - .\sbin\windows\start-confignode.bat - .\sbin\windows\start-datanode.bat - ``` - -3. 重启后,通过客户端确认各节点的运行状态,若状态都为Running,则为配置成功: - - ![](/img/%E5%90%AF%E5%8A%A8.png) - -### 2.2 步骤二:安装、配置Prometheus监控 - -1. 确保Prometheus安装完成(官方安装说明可参考:https://prometheus.io/docs/introduction/first_steps/) -2. 解压安装包,进入解压后的文件夹: - - ```Shell - tar xvfz prometheus-*.tar.gz - cd prometheus-* - ``` - -3. 修改配置。修改配置文件prometheus.yml如下 - 1. 新增confignode任务收集ConfigNode的监控数据 - 2. 新增datanode任务收集DataNode的监控数据 - - ```shell - global: - scrape_interval: 15s - evaluation_interval: 15s - scrape_configs: - - job_name: "prometheus" - static_configs: - - targets: ["localhost:9090"] - - job_name: "confignode" - static_configs: - - targets: ["iotdb-1:9091","iotdb-2:9091","iotdb-3:9091"] - honor_labels: true - - job_name: "datanode" - static_configs: - - targets: ["iotdb-1:9092","iotdb-2:9092","iotdb-3:9092"] - honor_labels: true - ``` - -4. 启动Prometheus。Prometheus 监控数据的默认过期时间为15天,在生产环境中,建议将其调整为180天以上,以对更长时间的历史监控数据进行追踪,启动命令如下所示: - - ```Shell - ./prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=180d - ``` - -5. 确认启动成功。在浏览器中输入 `http://IP:port`,进入Prometheus,点击进入Status下的Target界面,当看到State均为Up时表示配置成功并已经联通。 - -
- -### 2.3 步骤三:安装Workbench - -1. 进入iotdb-Workbench-``的config目录 - -2. 修改Workbench配置文件:进入`config`文件夹下修改配置文件`application-prod.properties`。若您是在本机安装则无需修改,若是部署在服务器上则需修改IP地址 - > Workbench可以部署在本地或者云服务器,只要能与 IoTDB 连接即可 - - | 配置项 | 修改前 | 修改后 | - | ---------------- | --------------------------------- | -------------------------------------- | - | pipe.callbackUrl | pipe.callbackUrl=`http://127.0.0.1` | pipe.callbackUrl=`http://<部署Workbench的IP地址>` | - - ![](/img/workbench-conf-1.png) - -3. 启动程序:请在IoTDB-Workbench-``的sbin文件夹下执行启动命令 - - Windows版: - ```shell - # 后台启动Workbench - start.bat -d - ``` - - Linux版: - ```shell - # 后台启动Workbench - ./start.sh -d - ``` - -4. 可以通过`jps`命令进行启动是否成功,如图所示即为启动成功: - - ![](/img/windows-jps.png) - -5. 验证是否成功:浏览器中打开:"`http://服务器ip:配置文件中端口`"进行访问,例如:"`http://127.0.0.1:9190`",当出现登录界面时即为成功 - - ![](/img/workbench.png) - - -## 3. 附录:IoTDB与控制台版本对照表 - -| **控制台版本号** | **版本说明** | **可支持IoTDB版本** | -|------------|--------------------------------------------------------|-------------------| -| V2.0.1-beta | V2.x系列首个版本,支持树、表双模型 | V2.0 及以上版本,AI分析模块仅支持2.0.5以上版本 | -| V1.5.7 | 优化测点列表中测点名称拆分为设备名称和测点,测点选择区域支持左右滚动,以及导出文件列顺序与页面保持一致 | V1.3.4及以上的1.x系列版本 | -| V1.5.6 | 优化 CSV 格式导入导出功能:导入时,支持标签、别名为非必填项;导出时,支持测点描述里反引号包裹引号的场景 | V1.3.4及以上的1.x系列版本 | -| V1.5.5 | 新增服务器时钟,支持企业版激活数据库 | V1.3.4及以上的1.x系列版本 | -| V1.5.4 | 新增实例管理中prometheus设置的认证功能 | V1.3.4及以上的1.x系列版本 | -| V1.5.1 | 新增AI分析功能以及模式匹配功能 | V1.3.2及以上的1.x系列版本 | -| V1.4.0 | 新增树模型展示及英文版 | V1.3.2及以上的1.x系列版本 | -| V1.3.1 | 分析功能新增分析方式,优化导入模版等功能 | V1.3.2及以上的1.x系列版本 | -| V1.3.0 | 新增数据库配置功能,优化部分版本细节 | V1.3.2及以上的1.x系列版本 | -| V1.2.6 | 优化各模块权限控制功能 | V1.3.1及以上的1.x系列版本 | -| V1.2.5 | 可视化功能新增“常用模版”概念,所有界面优化补充页面缓存等功能 | V1.3.0及以上的1.x系列版本 | -| V1.2.4 | 计算功能新增“导入、导出”功能,测点列表新增“时间对齐”字段 | V1.2.2及以上的1.x系列版本 | -| V1.2.3 | 首页新增“激活详情”,新增分析等功能 | V1.2.2及以上的1.x系列版本 | -| V1.2.2 | 优化“测点描述”展示内容等功能 | V1.2.2及以上的1.x系列版本 | -| V1.2.1 | 数据同步界面新增“监控面板”,优化Prometheus提示信息 | V1.2.2及以上的1.x系列版本 | -| V1.2.0 | 全新Workbench版本升级 | V1.2.0及以上的1.x系列版本 | diff --git 
a/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/Ecosystem-Overview_timecho.md b/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/Ecosystem-Overview_timecho.md deleted file mode 100644 index 454e27bba..000000000 --- a/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/Ecosystem-Overview_timecho.md +++ /dev/null @@ -1,47 +0,0 @@ - - -# 概览 - -IoTDB 生态集成打通时序数据全链路:通过数据采集实现设备秒级接入,经数据集成构建跨云管道,依托编程框架快速开发业务逻辑,结合计算引擎完成分布式处理,通过可视化与 SQL 开发实现分析策略,最终对接物联网平台完成边云协同,构建从物理世界到数字决策的完整智能闭环。 - -![](/img/eco-overview-n.png) - -下面的文档将会帮助您快速详细的了解各个阶段不同集成工具的使用方式: - -- 数据采集 - - Telegraf [Telegraf 插件](./Telegraf.md) -- 数据集成 - - NiFi [Apache NiFi](./NiFi-IoTDB.md) - - Kafka [Kafka](./Programming-Kafka.md) -- 计算引擎 - - Flink [Flink](./Flink-IoTDB.md) - - Spark [Spark](./Spark-IoTDB.md) -- 可视化分析 - - Zeppelin [Zeppelin](./Zeppelin-IoTDB_timecho.md) - - Grafana [Grafana](./Grafana-Connector.md) - - Grafana Plugin [Grafana Plugin](./Grafana-Plugin.md) - - DataEase [DataEase](./DataEase.md) -- SQL 开发 - - DBeaver [DBeaver](./DBeaver.md) -- 物联网对接 - - Ignition [Ignition](./Ignition-IoTDB-plugin_timecho.md) - - Thingsboard [Thingsboard](./Thingsboard.md) \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md b/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md deleted file mode 100644 index 553d01d39..000000000 --- a/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md +++ /dev/null @@ -1,274 +0,0 @@ - -# Ignition - -## 1. 产品概述 - -1. Ignition简介 - -Ignition 是一个基于WEB的监控和数据采集工具(SCADA)- 一个开放且可扩展的通用平台。Ignition可以让你更轻松地控制、跟踪、显示和分析企业的所有数据,提升业务能力。更多介绍详情请参考[Ignition官网](https://docs.inductiveautomation.com/docs/8.1/getting-started/introducing-ignition) - -2. 
Ignition-IoTDB Connector介绍 - - Ignition-IoTDB Connector分为两个模块:Ignition-IoTDB连接器、Ignition-IoTDB With JDBC。其中: - - - Ignition-IoTDB 连接器:提供了将 Ignition 采集到的数据存入 IoTDB 的能力,也支持在Components中进行数据读取,同时注入了 `system.iotdb.insert`和`system.iotdb.query`脚本接口用于方便在Ignition编程使用 - - Ignition-IoTDB With JDBC:Ignition-IoTDB With JDBC 可以在 `Transaction Groups` 模块中使用,不适用于 `Tag Historian`模块,可以用于自定义写入和查询。 - - 两个模块与Ignition的具体关系与内容如下图所示。 - - ![](/img/Ignition.png) - -## 2. 安装要求 - -| **准备内容** | **版本要求** | -| :------------------------: | :------------------------------------------------------------: | -| IoTDB | 要求已安装V1.3.1及以上版本,安装请参考 IoTDB [部署指导](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) | -| Ignition | 要求已安装 8.1.x版本(8.1.37及以上)的 8.1 版本,安装请参考 Ignition 官网[安装指导](https://docs.inductiveautomation.com/docs/8.1/getting-started/installing-and-upgrading)(其他版本适配请联系商务了解) | -| Ignition-IoTDB连接器模块 | 请联系商务获取 | -| Ignition-IoTDB With JDBC模块 | 下载地址:https://repo1.maven.org/maven2/org/apache/iotdb/iotdb-jdbc/ | - -## 3. Ignition-IoTDB连接器使用说明 - -### 3.1 简介 - -Ignition-IoTDB连接器模块可以将数据存入与历史数据库提供程序关联的数据库连接中。数据根据其数据类型直接存储到 SQL 数据库中的表中,以及毫秒时间戳。根据每个标签上的值模式和死区设置,仅在更改时存储数据,从而避免重复和不必要的数据存储。 - -Ignition-IoTDB连接器提供了将 Ignition 采集到的数据存入 IoTDB 的能力。 - -### 3.2 安装步骤 - -步骤一:进入 `Config` - `System`- `Modules` 模块,点击最下方的`Install or Upgrade a Module...` - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-1.png) - -步骤二:选择获取到的 `modl`,选择文件并上传,点击 `Install`,信任相关证书。 - -![](/img/ignition-3.png) - -步骤三:安装完成后可以看到如下内容 - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-3.png) - -步骤四:进入 `Config` - `Tags`- `History` 模块,点击下方的`Create new Historical Tag Provider...` - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-4.png) - -步骤五:选择 `IoTDB`并填写配置信息 - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-5.png) - -配置内容如下: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
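-| Name | Meaning | Default | Remarks |
-| --- | --- | --- | --- |
-| **Main** | | | |
-| Provider Name | Name of the provider | - | |
-| Enabled | | true | The provider can be used only when this is true |
-| Description | Remarks | - | |
-| **IoTDB Settings** | | | |
-| Host Name | Address of the target IoTDB instance | - | |
-| Port Number | Port of the target IoTDB instance | 6667 | |
-| Username | Username for the target IoTDB | - | |
-| Password | Password for the target IoTDB | - | |
-| Database Name | Name of the database to store data in; it must start with root, e.g. root.db | - | |
-| Pool Size | Size of the SessionPool | 50 | Configure as needed |
-| **Store and Forward Settings** | Keep the defaults | | |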
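
The provider settings in the table above can be sanity-checked before they are typed into the Gateway. The following is a minimal sketch, not part of the connector: the function name `check_provider_settings` is hypothetical, and treating every setting whose default is "-" (except Description) as required is an assumption made for this example. Only the defaults (port 6667, pool size 50, Enabled = true) and the rule that the database name starts with root come from the table.

```python
# Defaults taken from the configuration table.
DEFAULTS = {"Enabled": True, "Port Number": 6667, "Pool Size": 50}

# Assumption: settings without a default are treated as required here.
REQUIRED = ("Provider Name", "Host Name", "Username", "Password", "Database Name")


def check_provider_settings(settings):
    """Merge user settings with the defaults and return (merged, list_of_problems)."""
    merged = {**DEFAULTS, **settings}
    problems = []
    for key in REQUIRED:
        if not merged.get(key):
            problems.append("missing " + key)
    database = merged.get("Database Name") or ""
    if database and not database.startswith("root."):
        problems.append("Database Name must start with root, e.g. root.db")
    if not 0 < int(merged["Port Number"]) < 65536:
        problems.append("Port Number out of range")
    if int(merged["Pool Size"]) < 1:
        problems.append("Pool Size must be at least 1")
    return merged, problems


if __name__ == "__main__":
    merged, problems = check_provider_settings({
        "Provider Name": "IoTDB",
        "Host Name": "127.0.0.1",
        "Username": "root",
        "Password": "example-password",  # placeholder value
        "Database Name": "root.db",
    })
    print(merged["Port Number"], merged["Pool Size"], problems)
```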
- - -### 3.3 使用说明 - -#### 配置历史数据存储 - -- 配置好 `Provider` 后就可以在 `Designer` 中使用 `IoTDB Tag Historian` 了,就跟使用其他的 `Provider` 一样,右键点击对应 `Tag` 选择 `Edit tag(s)`,在 Tag Editor 中选择 History 分类 - - ![](/img/ignition-7.png) - -- 设置 `History Enabled` 为 `true`,并选择 `Storage Provider` 为上一步创建的 `Provider`,按需要配置其它参数,并点击 `OK`,然后保存项目。此时数据将会按照设置的内容持续的存入 `IoTDB` 实例中。 - - ![](/img/ignition-8.png) - -#### 读取数据 - -- 也可以在 Report 的 Data 标签下面直接选择存入 IoTDB 的 Tags - - ![](/img/ignition-9.png) - -- 在 Components 中也可以直接浏览相关数据 - - ![](/img/ignition-10.png) - -#### 脚本模块:该功能能够与 IoTDB 进行交互 - -1. system.iotdb.insert: - - -- 脚本说明:将数据写入到 IoTDB 实例中 - -- 脚本定义: - ``` shell - system.iotdb.insert(historian, deviceId, timestamps, measurementNames, measurementValues) - ``` - -- 参数: - - - `str historian`:对应的 IoTDB Tag Historian Provider 的名称 - - `str deviceId`:写入的 deviceId,不含配置的 database,如 Sine - - `long[] timestamps`:写入的数据点对于的时间戳列表 - - `str[] measurementNames`:写入的物理量的名称列表 - - `str[][] measurementValues`:写入的数据点数据,与时间戳列表和物理量名称列表对应 - -- 返回值:无 - -- 可用范围:Client, Designer, Gateway - -- 使用示例: - - ```shell - system.iotdb.insert("IoTDB", "Sine", [system.date.now()],["measure1","measure2"],[["val1","val2"]]) - ``` - -2. system.iotdb.query: - - -- 脚本说明:查询写到 IoTDB 实例中的数据 - -- 脚本定义: - ```shell - system.iotdb.query(historian, sql) - ``` - -- 参数: - - - `str historian`:对应的 IoTDB Tag Historian Provider 的名称 - - `str sql`:待查询的 sql 语句 - -- 返回值: - 查询的结果:`List>` - -- 可用范围:Client, Designer, Gateway -- 使用示例: - -```shell -system.iotdb.query("IoTDB", "select * from root.db.Sine where time > 1709563427247") -``` - -## 4. 
Ignition-IoTDB With JDBC - -### 4.1 简介 - - Ignition-IoTDB With JDBC提供了一个 JDBC 驱动,允许用户使用标准的JDBC API 连接和查询 lgnition-loTDB 数据库 - -### 4.2 安装步骤 - - 步骤一:进入 `Config` - `Databases` -`Drivers` 模块,创建 `Translator` - -![](/img/Ignition-IoTDBWithJDBC-1.png) - - 步骤二:进入 `Config` - `Databases` -`Drivers` 模块,创建 `JDBC Driver`,选择上一步配置的 `Translator`并上传下载的 `IoTDB-JDBC`,Classname 配置为 `org.apache.iotdb.jdbc.IoTDBDriver` - -![](/img/Ignition-IoTDBWithJDBC-2.png) - -步骤三:进入 `Config` - `Databases` -`Connections` 模块,创建新的 `Connections`,`JDBC Driver` 选择上一步创建的 `IoTDB Driver`,配置相关信息后保存即可使用 - -![](/img/Ignition-IoTDBWithJDBC-3.png) - -### 4.3 使用说明 - -#### 数据写入 - - 在`Transaction Groups`中的 `Data Source`选择之前创建的 `Connection` - -- `Table name` 需设置为 root 开始的完整的设备路径 -- 取消勾选 `Automatically create table` -- `Store timestame to` 配置为 time - -不选择其他项,设置好字段,并 `Enabled` 后 数据会安装设置存入对应的 IoTDB - -![](/img/%E6%95%B0%E6%8D%AE%E5%86%99%E5%85%A5-1.png) - -#### 数据查询 - -- 在 `Database Query Browser` 中选择`Data Source`选择之前创建的 `Connection`,即可编写 SQL 语句查询 IoTDB 中的数据 - -![](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2-ponz.png) - diff --git a/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/SeaTunnel_timecho.md b/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/SeaTunnel_timecho.md deleted file mode 100644 index 5fbef03f9..000000000 --- a/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/SeaTunnel_timecho.md +++ /dev/null @@ -1,189 +0,0 @@ - - -# Apache SeaTunnel - -## 1. 概述 - -SeaTunnel 是一款专为海量数据设计的分布式集成平台,凭借其高性能与弹性扩展能力,通过标准化的 Connector 连接器(由 Source 和 Sink 构成)打通多源异构数据链路。平台将各类数据源通过 Source 统一抽象为 SeaTunnelRow 格式,经动态资源调度与批量处理优化后,由 Sink 高效写入不同存储系统。通过 IoTDB Connector 与 SeaTunnel 的深度集成,不仅解决了时序数据场景下的 高吞吐写入、多源治理、复杂分析 等核心挑战,更通过开箱即用的连接器生态和自动化运维能力,帮助企业在物联网、工业互联网等领域快速构建 低成本、高可靠、易扩展 的数据基础设施。 - -## 2. 
Usage steps
-
-### 2.1 Environment preparation
-
-#### 2.1.1 Software requirements
-
-| Software | Version | Installation reference |
-| --- | --- | --- |
-| IoTDB | >= 2.0.5 | [Quick Start](../QuickStart/QuickStart_timecho.md) |
-| SeaTunnel | 2.3.12 | [Official website](https://seatunnel.apache.org/download) |
-
-* Resolving the Thrift version conflict (only needed for the Spark engine):
-
-```Bash
-# Remove the old Thrift version bundled with Spark
-rm -f $SPARK_HOME/jars/libthrift*
-# Copy IoTDB's Thrift library into Spark
-cp $IOTDB_HOME/lib/libthrift* $SPARK_HOME/jars/
-```
-
-#### 2.1.2 Dependency configuration
-
-1. JDBC
-
-* Spark/Flink engine: place the [JDBC driver jar](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) in the `${SEATUNNEL_HOME}/plugins/` directory
-* SeaTunnel Zeta engine: place the [JDBC driver jar](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) in the `${SEATUNNEL_HOME}/lib/` directory
-
-2. Connector
-
-Place the matching version of the [SeaTunnel Connector](https://mvnrepository.com/artifact/org.apache.seatunnel/connector-iotdb) in the `${SEATUNNEL_HOME}/plugins/` directory
-
-### 2.2 Reading data (IoTDB Source Connector)
-
-#### 2.2.1 Configuration parameters
-
-| **Parameter** | **Type** | **Required** | **Default** | **Description** |
-| --- | --- | --- | --- | --- |
-| `node_urls` | string | Yes | - | IoTDB cluster address, in the form `"host1:port"` or `"host1:port,host2:port"` |
-| `username` | string | Yes | - | IoTDB username |
-| `password` | string | Yes | - | IoTDB password |
-| `sql_dialect` | string | No | tree | IoTDB model; tree: tree model; table: table model |
-| `sql` | string | Yes | - | SQL query to execute |
-| `database` | string | No | - | Database name; only takes effect in the table model |
-| `schema` | config | Yes | - | Data schema definition |
-| `fetch_size` | int | No | - | Fetch size: amount of data retrieved from IoTDB per fetch during a query |
-| `lower_bound` | long | No | - | Lower bound of the time range (used when sharding data by the time column) |
-| `upper_bound` | long | No | - | Upper bound of the time range (used when sharding data by the time column) |
-| `num_partitions` | int | No | - | Number of partitions (used when sharding data by the time column). With 1 partition, the full time range is used; if the partition count < (upper bound - lower bound), the difference is used as the actual number of partitions |
-| `thrift_default_buffer_size` | int | No | - | Thrift protocol buffer size |
-| `thrift_max_frame_size` | int | No | - | Thrift maximum frame size |
-| `enable_cache_leader` | boolean | No | - | Whether to enable leader node caching |
-| `version` | string | No | - | Client SQL semantic version `(V_0_12/V_0_13)` |
-
-#### 2.2.2 Configuration example
-
-1. Create `iotdb_source_example.conf` in the `${SEATUNNEL_HOME}/config/` directory
-
-```Bash
-env{
-  parallelism = 2    # parallelism of 2
-  job.mode = "BATCH" # batch mode
-}
-
-source {
-  IoTDB {
-    node_urls = "localhost:6667"
-    username = "root"
-    password = "root"
-    sql = "SELECT temperature, humidity, status FROM root.testdb.seatunnel.source.device align by device"
-    schema {
-      fields {
-        ts = timestamp
-        device_name = string
-        temperature = double
-        humidity = double
-        status = boolean
-      }
-    }
-  }
-}
-
-sink {
-  Console {
-  } # print to the console
-}
-```
-
-2. Run SeaTunnel with the following command
-
-```Bash
-./bin/seatunnel.sh --config config/iotdb_source_example.conf -e local
-```
-
-3. For more details, see the [IoTDB Source Connector](https://seatunnel.incubator.apache.org/zh-CN/docs/2.3.12/connector-v2/source/IoTDB) page on the Apache SeaTunnel website
-
-### 2.3 Writing data (IoTDB Sink Connector)
-
-#### 2.3.1 Configuration parameters
-
-| **Name** | **Type** | **Required** | **Default** | **Description** |
-| --- | --- | --- | --- | --- |
-| `node_urls` | Array | Yes | - | `IoTDB` cluster address, in the form `["host1:port"]` or `["host1:port","host2:port"]` |
-| `username` | String | Yes | - | Username of the `IoTDB` user |
-| `password` | String | Yes | - | Password of the `IoTDB` user |
-| `sql_dialect` | String | No | tree | `IoTDB` model; tree: tree model; table: table model |
-| `storage_group` | String | Yes | - | `IoTDB` tree model: the device storage group (path prefix), e.g. deviceId = \${storage\_group} + "." + \${key\_device}; `IoTDB` table model: the database |
-| `key_device` | String | Yes | - | `IoTDB` tree model: name of the SeaTunnelRow field that holds the `IoTDB` device ID; `IoTDB` table model: name of the SeaTunnelRow field that holds the `IoTDB` table name |
-| `key_timestamp` | String | No | processing time | `IoTDB` tree model: name of the SeaTunnelRow field that holds the `IoTDB` timestamp (if unspecified, the processing time is used as the timestamp); `IoTDB` table model: name of the SeaTunnelRow field that holds the IoTDB time column (if unspecified, the processing time is used as the timestamp) |
-| `key_measurement_fields` | Array | No | See description | `IoTDB` tree model: names of the SeaTunnelRow fields that form the `IoTDB` measurement list (if unspecified, all fields remaining after excluding `key_device` & `key_timestamp`); `IoTDB` table model: names of the SeaTunnelRow fields used as `IoTDB` FIELD columns (if unspecified, all fields remaining after excluding `key_device` & `key_timestamp` & `key_tag_fields` & `key_attribute_fields`) |
-| `key_tag_fields` | Array | No | - | `IoTDB` tree model: no effect; `IoTDB` table model: names of the SeaTunnelRow fields used as `IoTDB` TAG columns |
-| `key_attribute_fields` | Array | No | - | `IoTDB` tree model: no effect; `IoTDB` table model: names of the SeaTunnelRow fields used as `IoTDB` ATTRIBUTE columns |
-| `batch_size` | Integer | No | 1024 | For batch writes, data is flushed to IoTDB when the number of buffered records reaches `batch_size` or the elapsed time reaches `batch_interval_ms` |
-| `max_retries` | Integer | No | - | Number of retries when a flush fails |
-| `retry_backoff_multiplier_ms` | Integer | No | - | Multiplier used to generate the next backoff delay |
-| `max_retry_backoff_ms` | Integer | No | - | Amount of time to wait before retrying a request to `IoTDB` |
-| `default_thrift_buffer_size` | Integer | No | - | Thrift initial buffer size in the `IoTDB` client |
-| `max_thrift_frame_size` | Integer | No | - | Thrift maximum frame size in the `IoTDB` client |
-| `zone_id` | string | No | - | java.time.ZoneId used by the `IoTDB` client |
-| `enable_rpc_compression` | Boolean | No | - | Enable RPC compression in the `IoTDB` client |
-| `connection_timeout_in_ms` | Integer | No | - | Maximum time to wait when connecting to `IoTDB` (milliseconds) |
-
-#### 2.3.2 Configuration example
-
-1. 
在 `${SEATUNNEL_HOME}/`​`config/` 目录下新建` iotdb_sink_example.conf` - -```bash -# 定义运行时环境 -env { - parallelism = 4 - job.mode = "BATCH" -} -source{ - Jdbc { - url = "jdbc:mysql://localhost:3306/demo_db?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true" - driver = "com.mysql.cj.jdbc.Driver" - connection_check_timeout_sec = 100 - user = "root" - password = "IoTDB@2024" - query = "select * from device" - } -} -sink { - IoTDB { - node_urls = ["localhost:6667"] - username = "root" - password = "root" - key_device = "id" # specify the `deviceId` use device_name field - key_timestamp = "intime" - storage_group = "root.mysql" - } -} -``` - -2. 执行如下命令运行 seaTunnel - -```Bash -./bin/seatunnel.sh --config config/iotdb_sink_example.conf -e local -``` - -3. 更多配置参数及示例请参考 Apache SeanTunnel 官网 [IoTDB Sink Connector](https://seatunnel.incubator.apache.org/zh-CN/docs/2.3.12/connector-v2/sink/IoTDB) 相关介绍 - - diff --git a/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/Zeppelin-IoTDB_timecho.md b/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/Zeppelin-IoTDB_timecho.md deleted file mode 100644 index f3bba6de1..000000000 --- a/src/zh/UserGuide/Master/Tree/Ecosystem-Integration/Zeppelin-IoTDB_timecho.md +++ /dev/null @@ -1,174 +0,0 @@ - - -# Apache Zeppelin - -## 1. Zeppelin 简介 - -Apache Zeppelin 是一个基于网页的交互式数据分析系统。用户可以通过 Zeppelin 连接数据源并使用 SQL、Scala 等进行交互式操作。操作可以保存为文档(类似于 Jupyter)。Zeppelin 支持多种数据源,包括 Spark、ElasticSearch、Cassandra 和 InfluxDB 等等。现在,IoTDB 已经支持使用 Zeppelin 进行操作。样例如下: - -![iotdb-note-snapshot](/img/github/102752947-520a3e80-43a5-11eb-8fb1-8fac471c8c7e.png) - -## 2. Zeppelin-IoTDB 解释器 - -### 2.1 系统环境需求 - -| IoTDB 版本 | Java 版本 | Zeppelin 版本 | -| :--------: | :-----------: | :-----------: | -| >=`0.12.0` | >=`1.8.0_271` | `>=0.9.0` | - -安装 IoTDB:参考 [快速上手](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md). 假设 IoTDB 安装在 `$IoTDB_HOME`. 
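
The IoTDB and Zeppelin minimums in the requirements table above (`0.12.0` and `0.9.0`) are dotted versions, which should be compared numerically rather than as strings. A small sketch of such a check follows; the helper names are hypothetical, and the Java requirement (`1.8.0_271`) is deliberately left out because its `_271` suffix does not fit this simple scheme.

```python
def parse_version(text):
    """Turn a dotted version such as '0.12.0' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed, minimum):
    """True when the installed version is at least the required minimum."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # Minimums come from the requirements table; installed versions are examples.
    print(meets_minimum("0.13.1", "0.12.0"))  # IoTDB
    print(meets_minimum("0.10.0", "0.9.0"))   # Zeppelin; a plain string comparison would get this wrong
```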
- -安装 Zeppelin: -> 方法 1 直接下载:下载 [Zeppelin](https://zeppelin.apache.org/download.html#) 并解压二进制文件。推荐下载 [netinst](http://www.apache.org/dyn/closer.cgi/zeppelin/zeppelin-0.9.0/zeppelin-0.9.0-bin-netinst.tgz) 二进制包,此包由于未编译不相关的 interpreter,因此大小相对较小。 -> -> 方法 2 源码编译:参考 [从源码构建 Zeppelin](https://zeppelin.apache.org/docs/latest/setup/basics/how_to_build.html) ,使用命令为 `mvn clean package -pl zeppelin-web,zeppelin-server -am -DskipTests`。 - -假设 Zeppelin 安装在 `$Zeppelin_HOME`. - -### 2.2 编译解释器 - -运行如下命令编译 IoTDB Zeppelin 解释器。 - -```shell -cd $IoTDB_HOME - mvn clean package -pl iotdb-connector/zeppelin-interpreter -am -DskipTests -P get-jar-with-dependencies -``` - -编译后的解释器位于如下目录: - -```shell -$IoTDB_HOME/zeppelin-interpreter/target/zeppelin-{version}-SNAPSHOT-jar-with-dependencies.jar -``` - -### 2.3 安装解释器 - -当你编译好了解释器,在 Zeppelin 的解释器目录下创建一个新的文件夹`iotdb`,并将 IoTDB 解释器放入其中。 - -```shell -cd $IoTDB_HOME -mkdir -p $Zeppelin_HOME/interpreter/iotdb -cp $IoTDB_HOME/zeppelin-interpreter/target/zeppelin-{version}-SNAPSHOT-jar-with-dependencies.jar $Zeppelin_HOME/interpreter/iotdb -``` - -### 2.4 修改 Zeppelin 配置 - -进入 `$Zeppelin_HOME/conf`,使用 template 创建 Zeppelin 配置文件: - -```shell -cp zeppelin-site.xml.template zeppelin-site.xml -``` - -打开 zeppelin-site.xml 文件,将 `zeppelin.server.addr` 项修改为 `0.0.0.0` - -### 2.5 启动 Zeppelin 和 IoTDB - -进入 `$Zeppelin_HOME` 并运行 Zeppelin: - -```shell -# Unix/OS X -> ./bin/zeppelin-daemon.sh start - -# Windows -> .\bin\zeppelin.cmd -``` - -进入 `$IoTDB_HOME` 并运行 IoTDB: - -```shell -# Unix/OS X -> nohup sbin/start-server.sh >/dev/null 2>&1 & -or -> nohup sbin/start-server.sh -c -rpc_port >/dev/null 2>&1 & - -# Windows -> sbin\start-server.bat -c -rpc_port -``` - -## 3. 使用 Zeppelin-IoTDB 解释器 - -当 Zeppelin 启动后,访问 [http://127.0.0.1:8080/](http://127.0.0.1:8080/) - -通过如下步骤创建一个新的笔记本页面: - -1. 点击 `Create new node` 按钮 -2. 设置笔记本名 -3. 
选择解释器为 iotdb - -现在可以开始使用 Zeppelin 操作 IoTDB 了。 - -![iotdb-create-note](/img/github/102752945-5171a800-43a5-11eb-8614-53b3276a3ce2.png) - -我们提供了一些简单的 SQL 来展示 Zeppelin-IoTDB 解释器的使用: - -```sql -CREATE DATABASE root.ln.wf01.wt01; -CREATE TIMESERIES root.ln.wf01.wt01.status WITH DATATYPE=BOOLEAN, ENCODING=PLAIN; -CREATE TIMESERIES root.ln.wf01.wt01.temperature WITH DATATYPE=FLOAT, ENCODING=PLAIN; -CREATE TIMESERIES root.ln.wf01.wt01.hardware WITH DATATYPE=INT32, ENCODING=PLAIN; - -INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware) -VALUES (1, 1.1, false, 11); - -INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware) -VALUES (2, 2.2, true, 22); - -INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware) -VALUES (3, 3.3, false, 33); - -INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware) -VALUES (4, 4.4, false, 44); - -INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware) -VALUES (5, 5.5, false, 55); - -SELECT * -FROM root.ln.wf01.wt01 -WHERE time >= 1 - AND time <= 6; -``` - -样例如下: - -![iotdb-note-snapshot2](/img/github/102752948-52a2d500-43a5-11eb-9156-0c55667eb4cd.png) - -用户也可以参考 [[1]](https://zeppelin.apache.org/docs/0.9.0/usage/display_system/basic.html) 编写更丰富多彩的文档。 - -以上样例放置于 `$IoTDB_HOME/zeppelin-interpreter/Zeppelin-IoTDB-Demo.zpln` - -## 4. 
解释器配置项 - -进入页面 [http://127.0.0.1:8080/#/interpreter](http://127.0.0.1:8080/#/interpreter) 并配置 IoTDB 的连接参数: - -![iotdb-configuration](/img/github/102752940-50407b00-43a5-11eb-94fb-3e3be222183c.png) - -可配置参数默认值和解释如下: - -| 属性 | 默认值 | 描述 | -| ---------------------------- | --------- | -------------------------------- | -| iotdb.host | 127.0.0.1 | IoTDB 主机名 | -| iotdb.port | 6667 | IoTDB 端口 | -| iotdb.username | root | 用户名 | -| iotdb.password | root | 密码 | -| iotdb.fetchSize | 10000 | 查询结果分批次返回时,每一批数量 | -| iotdb.zoneId | | 时区 ID | -| iotdb.enable.rpc.compression | FALSE | 是否允许 rpc 压缩 | -| iotdb.time.display.type | default | 时间戳的展示格式 | diff --git a/src/zh/UserGuide/Master/Tree/IoTDB-Introduction/IoTDB-Introduction_timecho.md b/src/zh/UserGuide/Master/Tree/IoTDB-Introduction/IoTDB-Introduction_timecho.md deleted file mode 100644 index a82e4ed7f..000000000 --- a/src/zh/UserGuide/Master/Tree/IoTDB-Introduction/IoTDB-Introduction_timecho.md +++ /dev/null @@ -1,271 +0,0 @@ - - -# 产品介绍 - -TimechoDB 是一款低成本、高性能的物联网原生时序数据库,是天谋科技基于 Apache IoTDB 社区版本提供的原厂商业化产品。它可以解决企业组建物联网大数据平台管理时序数据时所遇到的应用场景复杂、数据体量大、采样频率高、数据乱序多、数据处理耗时长、分析需求多样、存储与运维成本高等多种问题。 - -天谋科技基于 TimechoDB 提供更多样的产品功能、更强大的性能和稳定性、更丰富的效能工具,并为用户提供全方位的企业服务,从而为商业化客户提供更强大的产品能力,和更优质的开发、运维、使用体验。 - -- 下载、部署与使用:[快速上手](../QuickStart/QuickStart_timecho.md) - -## 1. 产品体系 - -天谋产品体系由若干个组件构成,覆盖由【数据采集】到【数据管理】到【数据分析&应用】的全时序数据生命周期,做到“采-存-用”一体化时序数据解决方案,帮助用户高效地管理和分析物联网产生的海量时序数据。 - -
- Introduction-zh-timecho.png -
- - -其中: - -1. **时序数据库(TimechoDB,基于 Apache IoTDB 提供的原厂商业化产品)**:时序数据存储的核心组件,其能够为用户提供高压缩存储能力、丰富时序查询能力、实时流处理能力,同时具备数据的高可用和集群的高扩展性,并在安全层面提供全方位保障。同时 TimechoDB 还为用户提供多种应用工具,方便用户配置和管理系统;多语言API和外部系统应用集成能力,方便用户在 TimechoDB 基础上构建业务应用。 -2. **时序数据标准文件格式(Apache TsFile,多位天谋科技核心团队成员主导&贡献代码)**:该文件格式是一种专为时序数据设计的存储格式,可以高效地存储和查询海量时序数据。目前 Timecho 采集、存储、智能分析等模块的底层存储文件均由 Apache TsFile 进行支撑。TsFile 可以被高效地加载至 IoTDB 中,也能够被迁移出来。通过 TsFile,用户可以在采集、管理、应用&分析阶段统一使用相同的文件格式进行数据管理,极大简化了数据采集到分析的整个流程,提高时序数据管理的效率和便捷度。 -3. **时序模型训推一体化引擎(AINode)**:针对智能分析场景,TimechoDB 提供 AINode 时序模型训推一体化引擎,它提供了一套完整的时序数据分析工具,底层为模型训练引擎,支持训练任务与数据管理,与包括机器学习、深度学习等。通过这些工具,用户可以对存储在 TimechoDB 中的数据进行深入分析,挖掘出其中的价值。 -4. **数据采集**:为了更加便捷的对接各类工业采集场景, 天谋科技提供数据采集接入服务,支持多种协议和格式,可以接入各种传感器、设备产生的数据,同时支持断点续传、网闸穿透等特性。更加适配工业领域采集过程中配置难、传输慢、网络弱的特点,让用户的数采变得更加简单、高效。 - -## 2. TimechoDB 整体架构 - -下图展示了一个常见的 IoTDB 3C3D(3 个 ConfigNode、3 个 DataNode)的集群部署模式: - - - -## 3. 产品特性 - -TimechoDB 具备以下优势和特性: - -- 灵活的部署方式:支持云端一键部署、终端解压即用、终端-云端无缝连接(数据云端同步工具) - -- 低硬件成本的存储解决方案:支持高压缩比的磁盘存储,无需区分历史库与实时库,数据统一管理 - -- 层级化的测点组织管理方式:支持在系统中根据设备实际层级关系进行建模,以实现与工业测点管理结构的对齐,同时支持针对层级结构的目录查看、检索等能力 - -- 高通量的数据读写:支持百万级设备接入、数据高速读写、乱序/多频采集等复杂工业读写场景 - -- 丰富的时间序列查询语义:支持时序数据原生计算引擎,支持查询时时间戳对齐,提供近百种内置聚合与时序计算函数,支持面向时序特征分析和AI能力 - -- 高可用的分布式系统:支持HA分布式架构,系统提供7*24小时不间断的实时数据库服务,一个物理节点宕机或网络故障,不会影响系统的正常运行;支持物理节点的增加、删除或过热,系统会自动进行计算/存储资源的负载均衡处理;支持异构环境,不同类型、不同性能的服务器可以组建集群,系统根据物理机的配置,自动负载均衡 - -- 极低的使用&运维门槛:支持类 SQL 语言、提供多语言原生二次开发接口、具备控制台等完善的工具体系 - -- 丰富的生态环境对接:支持Hadoop、Spark等大数据生态系统组件对接,支持Grafana、Thingsboard、DataEase等设备管理和可视化工具 - -## 4. 
企业特性 - -### 4.1 更高阶的产品功能 - -TimechoDB 在 Apache IoTDB 基础上提供了更多高阶产品功能,在内核层面针对工业生产场景进行原生升级和优化,如多级存储、云边协同、可视化工具、安全增强等功能,能够让用户无需过多关注底层逻辑,将精力聚焦在业务开发中,让工业生产更简单更高效,为企业带来更多的经济效益。如: - -- 双活部署:双活通常是指两个独立的单机(或集群),实时进行镜像同步,它们的配置完全独立,可以同时接收外界的写入,每一个独立的单机(或集群)都可以将写入到自己的数据同步到另一个单机(或集群)中,两个单机(或集群)的数据可达到最终一致。 - -- 数据同步:通过数据库内置的同步模块,支持数据由场站向中心汇聚,支持全量汇聚、部分汇聚、级联汇聚等各类场景,可支持实时数据同步与批量数据同步两种模式。同时提供多种内置插件,支持企业数据同步应用中的网闸穿透、加密传输、压缩传输等相关要求。 - -- 多级存储:通过升级底层存储能力,支持根据访问频率和数据重要性等因素将数据划分为冷、温、热等不同层级的数据,并将其存储在不同介质中(如 SSD、机械硬盘、云存储等),同时在查询过程中也由系统进行数据调度。从而在保证数据访问速度的同时,降低客户数据存储成本。 - -- 安全增强:通过白名单、审计日志等功能加强企业内部管理,降低数据泄露风险。 - -详细功能对比如下: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
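-| Feature category | Feature | Apache IoTDB | TimechoDB |
-| --- | --- | --- | --- |
-| Deployment mode | Standalone deployment | √ | √ |
-| | Distributed deployment | √ | √ |
-| | Dual-active deployment | - | √ |
-| | Container deployment | Partially supported | √ |
-| Database features | Measurement point management | √ | √ |
-| | Data writing | √ | √ |
-| | Data query | √ | √ |
-| | Continuous query | √ | √ |
-| | Trigger | √ | √ |
-| | User-defined function | √ | √ |
-| | Permission management | √ | √ |
-| | Data synchronization | File synchronization only, no built-in plugins | Real-time synchronization + file synchronization, rich built-in plugins |
-| | Stream processing | Framework only, no built-in plugins | Framework + rich built-in plugins |
-| | Tiered storage | - | √ |
-| | View | - | √ |
-| | Whitelist | - | √ |
-| | Audit log | - | √ |
-| Supporting tools | Visual console | - | √ |
-| | Cluster management tool | - | √ |
-| | System monitoring tool | - | √ |
-| Localization | Domestic (China) compatibility certification | - | √ |
-| Technical support | Expert services | - | √ |
-| | Usage training | - | √ |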
- -### 4.2 更高效/稳定的产品性能 - -TimechoDB 在 Apache IoTDB 的基础上优化了稳定性与性能,经过企业版技术支持,能够实现10倍以上性能提升,并具有故障及时恢复的性能优势。 - -### 4.3 更用户友好的工具体系 - -TimechoDB 将为用户提供更简单、易用的工具体系,通过集群监控面板(IoTDB Grafana)、数据库控制台(IoTDB Workbench)、集群管理工具(IoTDB Deploy Tool,简称 IoTD)等产品帮助用户快速部署、管理、监控数据库集群,降低运维人员工作/学习成本,简化数据库运维工作,使运维过程更加方便、快捷。 - -- 集群监控面板:旨在解决 IoTDB 及其所在操作系统的监控问题,主要包括:操作系统资源监控、IoTDB 性能监控,及上百项内核监控指标,从而帮助用户监控集群健康状态,并进行集群调优和运维。 - -
-

-[Figure: cluster monitoring dashboard. Panels: Overview; Operating system resource monitoring; IoTDB performance monitoring]

-
-
- - - -
-

- -- 数据库控制台:旨在提供低门槛的数据库交互工具,通过提供界面化的控制台帮助用户简洁明了的进行元数据管理、数据增删改查、权限管理、系统管理等操作,简化数据库使用难度,提高数据库使用效率。 - - -
-

-[Figure: database console. Screens: Home; Metadata management; SQL query]

-
-
- - - -
-

- - -- 集群管理工具:旨在解决分布式系统多节点的运维难题,主要包括集群部署、集群启停、弹性扩容、配置更新、数据导出等功能,从而实现对复杂数据库集群的一键式指令下发,极大降低管理难度。 - - -
-  -
- -### 4.4 更专业的企业技术服务 - -TimechoDB 客户提供强大的原厂服务,包括但不限于现场安装及培训、专家顾问咨询、现场紧急救助、软件升级、在线自助服务、远程支持、最新开发版使用指导等服务。同时,为了使 IoTDB 更契合工业生产场景,我们会根据企业实际数据结构和读写负载,进行建模方案推荐、读写性能调优、压缩比调优、数据库配置推荐及其他的技术支持。如遇到部分产品未覆盖的工业化定制场景,TimechoDB 将根据用户特点提供定制化开发工具。 - -相较于 Apache IoTDB,每 2-3 个月一个发版周期,TimechoDB 提供周期更快的发版频率,同时针对客户现场紧急问题,提供天级别的专属修复,确保生产环境稳定。 - - -### 4.5 更兼容的国产化适配 - -TimechoDB 代码自研可控,同时兼容大部分主流信创产品(CPU、操作系统等),并完成与多个厂家的兼容认证,确保产品的合规性和安全性。 \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/IoTDB-Introduction/Release-history_timecho.md b/src/zh/UserGuide/Master/Tree/IoTDB-Introduction/Release-history_timecho.md deleted file mode 100644 index 77f73b1e2..000000000 --- a/src/zh/UserGuide/Master/Tree/IoTDB-Introduction/Release-history_timecho.md +++ /dev/null @@ -1,677 +0,0 @@ - -# 发布历史 - -## 1. TimechoDB(数据库内核) - -### V2.0.9.4 - -> 发版时间:2026.06.10
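-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.9.4-bin.zip
-> SHA512 checksum: 040ebdd9e45d93535e9628cf377003d560be83cec9737f5a5fbd0c3a93a12810814094752eac3eacdfec5cddcf433fa83e76edc14be34c73c1a54d9b937ea1b5
-
-V2.0.9.4 mainly improves AINode inference for the table model, fixes several product defects, and improves database monitoring, performance, and stability across the board. Release details:
-
-- AINode: covariate inference models in the table model adaptively support filling null values
-
-
-### V2.0.9.3
-
-> Release date: 2026.05.14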
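-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.9.3-bin.zip
-> SHA512 checksum: f6c5d50cbf8902503289884f073593c650ffdc8edbebfabf27f6ab4499630749331aa4ed09dd34627a39fa8dee27b4d7e2689d0ed1cf23c76dd9c7270f9fae2a
-
-In V2.0.9.3, AINode adds support for registering the same model code with different model weights as separate models. The release also includes improvements and bug fixes over earlier versions and improves database monitoring, performance, and stability across the board. Release details:
-
-- AINode: [Supports registering the same model code with different model weights as separate models](../AI-capability/AINode_Upgrade_timecho.md#_4-3-注册自定义模型)
-
-
-### V2.0.9.2
-
-> Release date: 2026.05.11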
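-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.9.2-bin.zip
-> SHA512 checksum: 10d3f34b6e65ad5c09b1cf3538ee27e181cc38c5fedf6acfd7d7053797ca23c76245683536275b69bd478aa1e43364351eceef1948832ab663a7398665af9eff
-
-V2.0.9.2 adds import and export of Object-type data and a new tsfile-backup script (currently table model only). The release also includes improvements and bug fixes over earlier versions and improves database monitoring, performance, and stability across the board. Release details:
-
-- Scripts and tools: in the table model, the [TsFile format of the import-data script](../../latest-Table/Tools-System/Data-Import-Tool_timecho.md#_2-4-tsfile-格式) supports importing Object-type data
-- Scripts and tools: the table model adds the [tsfile-backup script](../../latest-Table/Tools-System/Data-Export-Tool_timecho.md#_3-基于-pipe-框架的-tsfilebackup)
-- Stream processing: table-model PIPE supports [local export and remote transfer of Object-type data](../../latest-Table/User-Manual/Data-Sync_timecho.md#_3-9-object-类型数据导出)
-- System: the [audit log](../User-Manual/Audit-Log_timecho.md) supports counting slow requests
-
-
-### V2.0.9.1
-
-> Release date: 2026.05.11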
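-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.9.1-bin.zip
-> SHA512 checksum: 18ff3801ba58550e06ef0aa4bf4465e8ce1b31d1aecb9c6899eb843f5d9187d3cc575e930ee38d96b87b17067e2b21f1852ab5127eac7480cf5051c20a68894b
-
-V2.0.9.1 adds covariate classification inference to AINode, schema-level and table-level storage space statistics, set operations, CTEs, and several built-in functions for queries, query debugging through DEBUG SQL, and configurable start on boot. The release also includes improvements and bug fixes over earlier versions and improves database monitoring, performance, and stability across the board. Release details:
-
-- AINode: the table model supports [classification inference on time series data](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md#_4-1-模型推理)
-- Query: the table model supports [set operations (UNION/INTERSECT/EXCEPT)](../../latest-Table/SQL-Manual/Set-Operations_timecho.md) and [common table expressions (CTE)](../../latest-Table/SQL-Manual/Common-Table-Expression_timecho.md)
-- Query: the table model adds the [IF scalar function](../../latest-Table/SQL-Manual/Basis-Function_timecho.md#_8-3-if-表达式), [binary functions](../../latest-Table/SQL-Manual/Basis-Function_timecho.md#_7-二进制函数), and the [APPROX_PERCENTILE aggregate function](../../latest-Table/SQL-Manual/Basis-Function_timecho.md#_2-聚合函数)
-- Query: supports [DEBUG SQL](../User-Manual/Maintenance-statement_timecho.md#_6-调试查询) and improves the [Explain Analyze](../User-Manual/Query-Performance-Analysis.md) result set
-- Query: supports [schema-level](../User-Manual/Maintenance-statement_timecho.md#_1-10-查看磁盘空间占用情况) and [table-level](../../latest-Table/Reference/System-Tables_timecho.md#_2-22-table-disk-usage-表) storage space statistics, and supports the [show configuration statement](../../latest-Table/User-Manual/Maintenance-statement_timecho.md#_1-13-查看节点配置信息) for viewing cluster configuration
-- Scripts and tools: the data and metadata import/export tools support the SSL protocol
-- Scripts and tools: the command-line tool supports displaying [access history](../Tools-System/CLI_timecho.md#_5-访问历史功能)
-- System: supports configuring [start on boot](../User-Manual/Auto-Start-On-Boot_timecho.md)
-- Other: fixes security vulnerability CVE-2026-28564
-
-
-### V2.0.8.3
-
-> Release date: 2026.04.21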
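-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.8.3-bin.zip
-> SHA512 checksum: 4b95bea87cc375bc455897dcf4cec80692421fa5c3eee746e1095b94288611d4afdd94aa8dad70340757d041757758924701cbdb2b73b49fb8730c4caac2a126
-
-V2.0.8.3 adds the ability to read and write Object-type data from Python. The release also includes improvements and bug fixes over earlier versions and improves database monitoring, performance, and stability across the board. Release details:
-
-- API: the table-model [Python native API](../../latest-Table/API/Programming-Python-Native-API_timecho.md) supports reading and writing Object-type data
-
-
-### V2.0.8.2
-
-> Release date: 2026.03.31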
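-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.8.2-bin.zip
-> SHA512 checksum: 02ab10e3e94786dd5676e0a69609eef192afd90d87f4d8d7bd44e7e9cbc8a18d61ba5668bae56cb8e4416ac71a877f760963b72ca7838d7c39ae10f1ed321d89
-
-V2.0.8.2 adds renaming of full series names in the tree model, custom Time column names in the table model, data type changes in both the tree and table models, an ODBC Driver, and more. The release also includes improvements and bug fixes over earlier versions and improves database monitoring, performance, and stability across the board. Release details:
-
-- Storage: the tree model supports [changing the full series name](../Basic-Concept/Operate-Metadata_timecho.md#_2-4-修改时间序列名称) and [changing the series data type](../Basic-Concept/Operate-Metadata_timecho.md#_2-3-修改时间序列数据类型)
-- Storage: the table model supports [changing a column's data type](../../latest-Table/Basic-Concept/Table-Management_timecho.md#_1-5-修改表) and [custom Time column names](../../latest-Table/Basic-Concept/Table-Management_timecho.md#_1-1-创建表)
-- API: supports the [ODBC Driver](../API/Programming-ODBC_timecho.md); Python SessionDataset supports fetching DataFrames in batches; the MQTT service is externalized and a new system table, Services, is added for querying services
-- AINode: the table model supports adaptive [covariate inference](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md#_4-1-模型推理)
-- Stream processing: tree-model data synchronization pipe statements accept multiple exact paths in path
-
-### V2.0.8.1
-
-> Release date: 2026.02.04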
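-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.8.1-bin.zip
-> SHA512 checksum: 49d97cbf488443f8e8e73cc39f6f320b3bc84b194aed90af695ebd5771650b5e5b6a3abb0fb68059bd01827260485b903c035657b337442f4fdd32c877f2aca3
-
-V2.0.8.1 adds the Object data type to the table model, strengthens the audit log feature, improves the OPC UA protocol for the tree model, and adds covariate forecasting and concurrent inference to AINode. It also improves database monitoring, performance, and stability across the board. Release details:
-
-- Query: adds a list of available DataNodes, which lets you [view each node's RPC address and port](../User-Manual/Maintenance-statement_timecho.md#_1-7-查看可用节点)
-- Query: the table model adds a [system table for query latency statistics](../../latest-Table/Reference/System-Tables_timecho.md#_2-20-queries-costs-histogram-表)
-- Storage: supports viewing the full definition statement of a [table](../../latest-Table/Basic-Concept/Table-Management_timecho.md#_1-4-查看表的创建信息) or [view](../../latest-Table/User-Manual/Tree-to-Table_timecho.md#_2-4-查看表视图) through SQL
-- Storage: improves the tree-model [OPC UA protocol](../API/Programming-OPC-UA_timecho.md)
-- System: the table model adds the [Object data type](../../latest-Table/Background-knowledge/Data-Type_timecho.md)
-- System: strengthens the [audit log](../User-Manual/Audit-Log_timecho.md) feature
-- System: the table model adds a system table for DataNode [connection status](../../latest-Table/Reference/System-Tables_timecho.md#_2-18-connections-表)
-- AINode: ships the built-in chronos-2 model and supports [covariate forecasting](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md)
-- AINode: the built-in Timer-XL and Sundial models support [concurrent inference](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md)
-- Stream processing: creating a full synchronization pipe [automatically splits](../User-Manual/Data-Sync_timecho.md#_2-1-创建任务) it into two independent pipes, real-time and historical, whose remaining event counts can be viewed separately with the show pipes statement
-- Other: fixes security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226
-
-
-### V2.0.6.6
-
-> Release date: 2026.01.20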
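-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.6.6-bin.zip
-> SHA512 checksum: d12e60b8119690d63c501d0c2afcd527e39df8a8786198e35b53338e21939e1a9244805e710d81cbb62d02c2739909d7e8227c029660a0cd9ea7ca718cf9bdf6
-
-V2.0.6.6 mainly improves time series query performance in the tree model and improves database monitoring, performance, and stability across the board. Release details:
-
-* Query: improves the query performance of show/count timeseries/devices
-
-### V2.0.6.4
-
-> Release date: 2025.11.17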
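-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.6.4-bin.zip
-> SHA512 checksum: 57b9998cc14632862c32b6781c70db1c52caf8172b5d45d27cc214cab50d3afd4230ed0754e1c1a4ed825666bf971dc81fbb7d3b93261e57e9dabc20e794a2b8
-
-V2.0.6.4 mainly improves the storage and AINode modules, fixes several product defects, and improves database monitoring, performance, and stability across the board. Release details:
-
-* Storage: the tree model supports changing the encoding and compression of a time series
-* AINode: supports one-click deployment and improves model inference
-
-
-### V2.0.6.1
-
-> Release date: 2025.09.19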
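-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.6.1-bin.zip
-> SHA512 checksum: c88e3e2c0dbd06578bd0697ca9992880b300baee2c4906ba1f952134e37ae2fa803a6af236f4541d318b75f43a498b5d5bfbbc7c445783271076c36e696e4dd0
-
-V2.0.6.1 adds query write-back for the table model, access-control blacklists and whitelists, bitwise functions (built-in scalar functions), and pushdown-capable time functions. It also improves database monitoring, performance, and stability across the board. Release details:
-
-* Query: supports query write-back in the table model
-* Query: row pattern recognition in the table model supports aggregate functions, capturing consecutive data for analysis and computation
-* Query: the table model adds built-in scalar functions for bitwise operations
-* Query: the table model adds the pushdown-capable EXTRACT time function
-* System: adds access control with user-configurable blacklists and whitelists
-* Other: the default user password is changed to the stronger "TimechoDB@2021"
-
-### V2.0.5.2
-
-> Release date: 2025.08.08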
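-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.5.2-bin.zip
-> SHA512 checksum: a00a4075c9937b7749c454f71d2480fea5e9ff9659c0628b132e30e2f256c7c537cd91dca4f6be924db0274bb180946a1b88e460c025bf82fdb994a3c2c7b91e
-
-V2.0.5.2 fixes several product defects, improves data synchronization, and improves database monitoring, performance, and stability across the board.
-
-### V2.0.5.1
-
-> Release date: 2025.07.14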
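-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.5.1-bin.zip
-> SHA512 checksum: aa724755b659bf89a60da6f2123dfa91fe469d2e330ed9bd029e8f36dd49212f3d83b1025e9da26cb69315e02f65c7e9a93922e40df4f2aa4c7f8da8da2a4cea
-
-V2.0.5.1 adds tree-to-table views, table-model window functions, and the aggregate function approx\_most\_frequent, and supports LEFT & RIGHT JOIN and ASOF LEFT JOIN. AINode adds two built-in models, Timer-XL and Timer-Sundial, and supports inference and fine-tuning for both the tree and table models. The release also improves database monitoring, performance, and stability across the board. Release details:
-
-* Query: supports manually creating tree-to-table views
-* Query: the table model adds window functions
-* Query: the table model adds the aggregate function approx\_most\_frequent
-* Query: table-model JOIN is extended to support LEFT & RIGHT JOIN and ASOF LEFT JOIN
-* Query: the table model supports row pattern recognition, capturing consecutive data for analysis and computation
-* Query: the table model adds several system tables, such as VIEWS (table view information) and MODELS (model information)
-* System: adds encryption of TsFile data files
-* AI: AINode adds two built-in models, Timer-XL and Timer-Sundial
-* AI: AINode supports inference and fine-tuning for the tree and table models
-* Other: supports publishing data over the OPC DA protocol
-
-### Other 2.x historical versions
-
-#### V2.0.4.2
-
-> Release date: 2025.06.21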
-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.4.2-bin.zip
-> SHA512 checksum: 31f26473ac90988ce970dac8d0950671bde918f9af6f2f6a6c2bf99a53aa1c0a459c53a137b18ff0b28e70952e9c4b6acb50029e0b2e38837b969eb8f78f2939
-
-V2.0.4.2 supports passing the TOPIC to custom MQTT plugins and improves database monitoring, performance, and stability across the board.
-
-#### V2.0.4.1
-
-> Release date: 2025.06.03
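-> Download: please contact Timecho staff
-> Package name: timechodb-2.0.4.1-bin.zip
-> SHA512 checksum: 93ac08bfae06aff6db04849f474458433026f66778f4f5c402eb22f1a7cb14d8096daf0a9e9cc365ddfefd4f8ca4443b2a9fb6461906f056b1e6a344990beb3a - -In V2.0.4.1 the table model adds user-defined table functions (UDTF) and several built-in table functions, the aggregate function approx\_count\_distinct, and ASOF INNER JOIN on the time column. Script tools are reorganized by category, with Windows-specific scripts separated out. The release also comprehensively improves database monitoring, performance, and stability. Release details: - -* Query: the table model adds user-defined table functions (UDTF) and several built-in table functions -* Query: the table model supports ASOF INNER JOIN on the time column -* Query: the table model adds the aggregate function approx\_count\_distinct -* Stream processing: supports loading TsFile asynchronously through SQL -* System: when scaling in, replica selection supports a disaster-tolerant load-balancing strategy -* System: adapted to Windows Server 2025 -* Scripts and tools: script tools are reorganized by category, with Windows-specific scripts separated out - -#### V2.0.3.4 - -> Release date: 2025.06.13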
-> Download: please contact Timecho staff
-> Package name: timechodb-2.0.3.4-bin.zip
-> SHA512 checksum: d80d34b7d3890def75b17c491fc4c13efc36153a5950a9b23744755d04d6adb5d6ab9ec970101183fef7bfeb8a559ef92fce90d2d22f7b7fd5795cd5589461bb - -V2.0.3.4 changes the encryption algorithm for user passwords to SHA-256, and comprehensively improves database monitoring, performance, and stability. - -#### V2.0.3.3 - -> Release date: 2025.05.16
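-> Download: please contact Timecho staff
-> Package name: timechodb-2.0.3.3-bin.zip
-> SHA512 checksum: f47e3fb45f869dbe690e7cfaa93f95e5e08a462b362aa9d7ccac7ee5b55022dc8f62db12009dfde055f278f3003ff9ea7c22849d52a3ef2c25822f01ade78591 - -V2.0.3.3 adapts the metadata import and export scripts to the table model, adds Spark ecosystem integration (table model), adds timestamps to AINode results, and adds several aggregate and scalar functions to the table model. It also comprehensively improves database monitoring, performance, and stability. Release details: - -* Query: the table model adds the aggregate function count\_if and the scalar functions greatest / least -* Query: full-table count(\*) queries on the table model are significantly faster -* AI: AINode results now include timestamps -* System: performance optimization of the table-model metadata module -* System: the table model supports actively listening for and loading TsFile -* System: adds monitoring metrics such as TsFile parse and conversion time and the number of TsFile-to-Tablet conversions -* Ecosystem integration: the table model integrates with Spark -* Scripts and tools: the import-schema and export-schema scripts support importing and exporting table-model metadata - -#### V2.0.3.2 - -> Release date: 2025.05.15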
-> Download: please contact Timecho staff
-> Package name: timechodb-2.0.3.2-bin.zip
-> SHA512 checksum: 76bd294de4b01782e5dd621a996aeb448e4581f98c70fb5b72b17dc392c2e1227c0d26bd3df5533669a80f217a83a566bc6ec926b7efd21ce7a89b894cd33e19 - -V2.0.3.2 fixes several product defects and optimizes node removal, and comprehensively improves database monitoring, performance, and stability. - -#### V2.0.2.1 - -> Release date: 2025.04.07
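-> Download: please contact Timecho staff
-> Package name: timechodb-2.0.2.1-bin.zip
-> SHA512 checksum: a41be3f8c57e6a39ac165f1d6ab92c9ed790b0712528f31662c58617f4c94e6bfc9392a9c1ef2fc5bdd8c7ca79901389f368cbdbec3e5b1d5c1ce155b2f1a457 - -V2.0.2.1 adds permission management, user management, and authorization of related operations for the table model, together with table-model UDFs, system tables, and nested queries. It also continues to optimize the data subscription mechanism and comprehensively improves database monitoring, performance, and stability. Release details: - -* Query: adds table-model UDF management, user-defined scalar functions (UDSF), and user-defined aggregate functions (UDAF) -* Query: a configuration item controls whether UDF, PipePlugin, Trigger, and AINode may load jar packages via URI -* Query: the table model supports permission management, user management, and authorization of related operations -* Query: adds system tables and several O&M statements to improve system management -* System: the CSharp client supports the table model -* System: adds a C++ Session write interface for the table model -* System: tiered storage supports non-AWS object storage systems that are compatible with the S3 protocol -* System: UDF extension adds the pattern\_match pattern-matching function -* Data sync: the table model supports metadata synchronization and synchronization of delete operations - -#### V2.0.1.2 - -> Release date: 2025.01.25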
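-> Download: please contact Timecho staff
-> Package name: timechodb-2.0.1.2-bin.zip
-> SHA512 checksum: 51c2fa5da2974a8a3c8871dec1c49bd98e5d193a13ef33ac7801adb833a1e360d74f0160bcdf33c7ffb23a5c5e0f376e26a4315cf877f1459483356285b85349 - -V2.0.1.2 formally introduces the dual tree/table model configuration and, for the table model, supports standard SQL query syntax, a range of functions and operators, stream processing, and Benchmark. The release also includes: four new data types in the Python client, database deletion in read-only mode, script tools that handle import and export of TsFile, CSV, and SQL data, and ecosystem integration with Kubernetes Operator. It also comprehensively improves database monitoring, performance, and stability. Release details: - -* Time-series table model: IoTDB supports the time-series table model; the SQL syntax includes the SELECT, WHERE, JOIN, GROUP BY, ORDER BY, and LIMIT clauses and nested queries -* Query: the table model supports a range of functions and operators, including logical operators, mathematical functions, and the time-series function DIFF -* Query: a configuration item controls whether UDF, PipePlugin, Trigger, and AINode may load jar packages via URI -* Storage: the table model supports writing data through the Session interface, which supports automatic metadata creation -* Storage: the Python client supports four new data types: `String`, `Blob`, `Date`, and `Timestamp` -* Storage: optimizes the priority comparison rule for compaction tasks of the same type -* Stream processing: supports specifying the receiver's authentication information on the sender side -* Stream processing: TsFile Load supports the table model -* Stream processing: stream processing plugins are adapted to the table model -* System: improves the stability of DataNode scale-in -* System: in the readonly state, users can run drop database -* Scripts and tools: the Benchmark tool is adapted to the table model -* Scripts and tools: the Benchmark tool supports four new data types: `String`, `Blob`, `Date`, and `Timestamp` -* Scripts and tools: the import-data/export-data scripts are extended to support new data types (string, binary large object, date, timestamp) -* Scripts and tools: the import-data/export-data scripts now handle import and export of TsFile, CSV, and SQL data -* Ecosystem integration: supports Kubernetes Operator - - -### V1.3.7.3 - -> Release date: 2026.06.02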
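-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.7.3-bin.zip
-> SHA512 checksum: 8e6cde061421a552b9855f39f9cccd4838c820dc15ef0ad2a7c23a54cd6cc4f06c35190c1f428784e6a4d5463dd1b794f58ff5cdf891f27f6d0be4d3ab00bf6f - -V1.3.7.3 mainly optimizes the query module and data synchronization and fixes several product defects, and comprehensively improves database monitoring, performance, and stability. Release details: - -- Query: optimizes Last queries, aligned-series queries, and reverse-order time-filter queries -- Metadata: optimizes device-creation validation under activated series and their sub-paths -- Data sync: optimizes the retry mechanism after a synchronization failure -- Data sync: the cross-network-gateway sync plugin supports configuring the transfer timeout for real-time writes -- Interface: the Go client write interface adds error-code validation -- Interface: optimizes connection pool management in the C# client - - -### V1.3.7.2 - -> Release date: 2026.04.07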
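-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.7.2-bin.zip
-> SHA512 checksum: 787766af64992069f0db0ac8b250b461d799307b3ce06b0782fc25752c8c5307fa2205c9e3a38a41685b81bb6b4b5c1ec9f71a395bfad285caf90de7b8224783 - -V1.3.7.2 mainly optimizes data synchronization and the query module and fixes several product defects, and comprehensively improves database monitoring, performance, and stability. Release details: - -- Data sync: optimizes Pipe distribution performance in complex path-matching scenarios -- Query: the Show Queries statement adds client IP, query timeout, and server-side wait time -- Ecosystem integration: IoTDB can push data to an external OPC Server in OPC Client mode - - -### V1.3.6.6 - -> Release date: 2026.01.20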
-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.6.6-bin.zip
-> SHA512 checksum: 590d3ead053298c6df0ede637572ba598b9b684f8b35ab874bd4452f765e1421938f4cca2cf0423af2e806592aa8b15bdd25b41df7de809435a4d0239fc04790 - -V1.3.6.6 optimizes data reads and writes and fixes several product defects, and comprehensively improves database monitoring, performance, and stability. - -### V1.3.6.3 - -> Release date: 2026.01.04
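-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.6.3-bin.zip
-> SHA512 checksum: 43719a1384f59f63cb0029cdda0aba433383cd1a0f5ebc142e54f8aa6623cc30a7efb3e3aef7f3d485d5e07bec91be215c92ed21b5201613d5cc44044251c978 - -V1.3.6.3 focuses on two core areas, query performance and memory management, and comprehensively improves database monitoring, performance, and stability. Release details: - -* Query: optimizes query performance in several scenarios, including multi-series Last queries -* Query: the Java SDK adds the FastLastQuery interface for more efficient Last queries -* Query: tree-model fetchSchema now returns results as a segmented stream, improving response time on large data volumes -* Storage: optimizes memory management to avoid the risk of memory leaks and keep the system stable over long runs -* Storage: optimizes the file compaction mechanism, improving compaction efficiency and storage resource usage -* Other: fixes the security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226 - - -### V1.3.6.1 - -> Release date: 2025.12.09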
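-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.6.1-bin.zip
-> SHA512 checksum: 9fb6a6870aa2133bfc40508324a7d97ee078d0d44895beef7b0a331edd203419119fb02b933f585b6c4a6fe9b59708a053d7cf65206b22b1a4f01a5fe518424c - -V1.3.6.1 focuses on the stability of data synchronization, and comprehensively improves database monitoring, performance, and stability. Release details: - -* Data sync: optimizes Pipe SQL parameter configuration; an asynchronous loading mode can be specified -* Data sync: adds syntactic sugar that automatically splits a full Pipe creation SQL into a real-time sync and a historical sync -* System: adds a global configuration item for compression per data type, so the storage compression strategy can be adjusted as needed - - -### V1.3.5.11 - -> Release date: 2025.09.24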
-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.5.11-bin.zip
-> SHA512 checksum: f18419e20c0d7e9316febee5a053306a97268cb07e18e6933716c2ef98520fbbe051dfa1da02a9c83e8481a839ce35525ce6c50f890f821e3d760f550c75f804 - -V1.3.5.11 mainly optimizes data synchronization and fixes several product defects, and comprehensively improves database monitoring, performance, and stability. - -### V1.3.5.10 - -> Release date: 2025.08.27
-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.5.10-bin.zip
-> SHA512 checksum: 3aea6d2318f52b39bfb86dae9ff06fe1b719fdeceaabb39278c9a73544e1ceaf0660339f9342abb888c8281a0fb6144179dac9bb0c40ba0ecc66bac4dd7cbe80 - -V1.3.5.10 fixes several product defects, and comprehensively improves database monitoring, performance, and stability. - -### V1.3.5.9 - -> Release date: 2025.08.25
-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.5.9-bin.zip
-> SHA512 checksum: 95b7a6790e94dc88e355a81e5a54b10ee87bdadae69ba0b215273967b3422178d5ee81fa5adf1c5380a67dbb30cf9782eaa3cbfd6ec744b0fd9a91c983ee8f70 - -V1.3.5.9 optimizes memory control and fixes several product defects, and comprehensively improves database monitoring, performance, and stability. -### Other 1.x historical releases - -#### V1.3.5.8 - -> Release date: 2025.08.19
-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.5.8-bin.zip
-> SHA512 checksum: aa9802301614e20294a7f2fc4c149ba20d58213d9b74e8f8c607e0f4860949bad164bce2851b63c1d39b7568d62975ab257c269b3a9c168a29ea3945b6d28982 - -V1.3.5.8 optimizes data synchronization and fixes several product defects, and comprehensively improves database monitoring, performance, and stability. - -#### V1.3.5.7 - -> Release date: 2025.08.13
-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.5.7-bin.zip
-> SHA512 checksum: 17374a440267aed3507dcc8cf4dc8703f8136d5af30d16206a6e1101e378cbbc50eda340b1598a12df35fe87d96db20f7802f0e64033a013d4b81499198663d4 - -V1.3.5.7 optimizes data synchronization and fixes several product defects, and comprehensively improves database monitoring, performance, and stability. - -#### V1.3.5.6 - -> Release date: 2025.07.16
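-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.5.6-bin.zip
-> SHA512 checksum: 05b9fda4d98ba8a1c9313c0831362ed3d667ce07cb00acaeabcf6441a6d67dff7da27f3fda2a5e1b3c3b85d1e5c730a534f3aa2f0c731b8c03ef447203b32493 - -V1.3.5.6 adds a configuration switch to disable data subscription, optimizes the C++ high-availability client, and addresses PIPE synchronization latency in three scenarios (normal operation, restart, and deletion) as well as query issues with large TEXT objects. It also comprehensively improves database monitoring, performance, and stability. - -#### V1.3.5.4 - -> Release date: 2025.06.19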
-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.5.4-bin.zip
-> SHA512 checksum: edac5f8b70dd67b3f84d3e693dc025a10b41565143afa15fc0c4937f8207479ffe2da787cc9384440262b1b05748c23411373c08606c6e354ea3dcdba0371778 - -V1.3.5.4 fixes several product defects and optimizes node removal, and comprehensively improves database monitoring, performance, and stability. - -#### V1.3.5.3 - -> Release date: 2025.06.13
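-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.5.3-bin.zip
-> SHA512 checksum: 5f807322ceec9e63a6be86108cc57e7ad4251b99a6c28baf11256ab65b2145768e9110409f89834d5f4256094a8ad995775c0e59a17224ff2627cd9354e09d82 - -V1.3.5.3 mainly optimizes data synchronization: it persists PIPE send progress, adds a monitoring item for PIPE event transfer time, and fixes related defects. It also changes the encryption algorithm for user passwords to SHA-256, and comprehensively improves database monitoring, performance, and stability. - -#### V1.3.5.2 - -> Release date: 2025.06.10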
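-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.5.2-bin.zip
-> SHA512 checksum: 4c0a5db76c6045dfd27cce303546155cdb402318024dae5f999f596000d7b038b13bbeac39068331b5c6e2c80bc1d89cd346dd0be566fe2fe865007d441d9d05 - -V1.3.5.2 mainly optimizes data synchronization: cascading can be configured through parameters, the order of synchronized writes can match the real-time write order exactly, and after a system restart historical data and real-time data are sent separately. It also comprehensively improves database monitoring, performance, and stability. - -#### V1.3.5.1 - -> Release date: 2025.05.15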
-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.5.1-bin.zip
-> SHA512 checksum: 91f22bafbdd4d580126ed59ba1ba99d14209f10ce4a0a4bd7d731943ac99fdb6ebfab6e3a1e294a7cb7f46367e9fd4252b0d9ac4d4240ddedf6d85658e48f212 - -V1.3.5.1 fixes several product defects, and comprehensively improves database monitoring, performance, and stability. - -#### V1.3.4.2 - -> Release date: 2025.04.14
-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.4.2-bin.zip
-> SHA512 checksum: 52fbd79f5e7256e7d04edc8f640bb8d918e837fedd1e64642beb2b2b25e3525b5f5a4c92235f88f6f7b59bfcdf096e4ea52ab85bfef0b69274334470017a2c5b2 - -V1.3.4.2 optimizes data synchronization: dual-active instances can synchronize data that was forwarded to them by an external PIPE. - - -#### V1.3.4.1 - -> Release date: 2025.01.08
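-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.4.1-bin.zip
-> SHA512 checksum: e9d46516f1f25732a93cc915041a8e59bca77cf8a1018c89d18ed29598540c9f2bdf1ffae9029c87425cecd9ecb5ebebea0334c7e23af11e28d78621d4a78148 - -V1.3.4.1 adds a pattern-matching function, continues to optimize and stabilize the data subscription mechanism, extends the import-data/export-data scripts to support new data types, and unifies those scripts so that they handle import and export of TsFile, CSV, and SQL data. It also comprehensively improves database monitoring, performance, and stability. Release details: - -- Query: a configuration item controls whether UDF, PipePlugin, Trigger, and AINode may load jar packages via URI -- System: UDF extension adds the pattern_match pattern-matching function -- Data sync: supports specifying the receiver's authentication information on the sender side -- Ecosystem integration: supports Kubernetes Operator -- Scripts and tools: the import-data/export-data scripts are extended to support new data types (string, binary large object, date, timestamp) -- Scripts and tools: the import-data/export-data scripts now handle import and export of TsFile, CSV, and SQL data - -#### V1.3.3.3 - -> Release date: 2024.10.31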
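-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.3.3-bin.zip
-> SHA512 checksum: 4a3eceda479db3980e9c8058628e71ba5a16fbfccf70894e8181aea5e014c7b89988d0093f6d42df29d478340a33878602a3924bec13f442a48611cec4e0e961 - -V1.3.3.3 improves restart recovery performance to shorten startup time, lets the DataNode actively listen for and load TsFile (with added observability metrics), lets the receiver load files into IoTDB automatically once the sender has transferred them to a specified directory, and lets Alter Pipe alter the Source. It also comprehensively improves database monitoring, performance, and stability. Release details: - -- Data sync: the receiver supports automatic conversion of inconsistent data types -- Data sync: receiver observability is enhanced with ops/latency statistics for several internal interfaces -- Data sync: the opc-ua-sink plugin supports CS-mode access and non-anonymous access -- Data subscription: the SDK supports the create if not exists and drop if exists interfaces -- Stream processing: Alter Pipe supports Alter Source -- System: adds latency monitoring for the rest module -- Scripts and tools: supports automatically loading TsFile files from a specified directory -- Scripts and tools: the import-tsfile script is extended so that it can run on a different server from the iotdb server -- Scripts and tools: adds support for Kubernetes Helm -- Scripts and tools: the Python client supports new data types (string, binary large object, date, timestamp) - -#### V1.3.3.2 - -> Release date: 2024.8.15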
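-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.3.2-bin.zip
-> SHA512 checksum: 32733610da40aa965e5e9263a869d6e315c5673feaefad43b61749afcf534926398209d9ca7fff866c09deb92c09d950c583cea84be5a6aa2c315e1c7e8cfb74 - -V1.3.3.2 supports outputting the time spent reading mods files, the maximum memory used for merge-sorting sequential and out-of-order data, and the dispatch time; adjusting the time-partition origin through a parameter; and ending a subscription automatically based on the pipe's end-of-historical-data marker. It also includes performance improvements to module memory control. Release details: - -- Query: Explain Analyze supports outputting the time spent reading mods files -- Query: Explain Analyze supports outputting the maximum memory used for merge-sorting sequential and out-of-order data, and the dispatch time -- Storage: adds splitting of compaction target files, with new configuration file parameters -- System: supports adjusting the time-partition origin through a parameter -- Stream processing: data subscription supports ending a subscription automatically based on the pipe's end-of-historical-data marker -- Data sync: RPC compression supports specifying the compression level -- Scripts and tools: data and metadata export filters out only root.__system, and no longer filters data whose path merely starts with it, such as root.__systema - -#### V1.3.3.1 - -> Release date: 2024.7.12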
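-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.3.1-bin.zip
-> SHA512 checksum: 1fdffbc1f18bfabfa3463a5a6fbc4f6ba6ab686942f9e85e7e6be1840fb8700e0147e5e73fd52201656ae6adb572cc2e5ecc61bcad6fa4c5a4048c4207e3c6c0 - -V1.3.3.1 adds rate limiting to tiered storage, lets the sender-side sink in data synchronization specify username/password authentication for the receiver, clarifies some ambiguous WARN logs on the sync receiver, improves restart recovery performance to shorten startup time, and consolidates script content. Release details: - -- Query: Filter performance optimization speeds up aggregation queries and where-condition queries -- Query: the Java Session client distributes SQL query requests evenly across all nodes -- System: merges the configuration files "iotdb-confignode.properties, iotdb-datanode.properties, iotdb-common.properties" into "iotdb-system.properties" -- Storage: tiered storage adds rate limiting -- Data sync: the sender-side sink can specify username/password authentication for the receiver -- System: optimizes restart recovery performance to shorten startup time - -#### V1.3.2.2 - -> Release date: 2024.6.4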
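-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.2.2-bin.zip
-> SHA512 checksum: ad73212a0b5025d18d2481163f6b2d4f604e06eb5e391cc6cba7bf4e42792e115b527ed8bfb5cd95d20a150645c8b4d56a531889dac229ce0f63139a27267322 - -V1.3.2.2 adds the explain analyze statement for analyzing the time spent by a single SQL query, a UDAF (user-defined aggregate function) framework, automatic data deletion when disk usage reaches a configured threshold, metadata synchronization, counting of data points under a specified path, and scripts for importing and exporting SQL statements. The cluster management tool supports rolling upgrades and uploading plugins to the whole cluster. The release also comprehensively improves database monitoring, performance, and stability. Release details: - -- Storage: improves write performance of the insertRecords interface -- Storage: adds SpaceTL, which deletes data automatically when disk usage reaches a configured threshold -- Query: adds the Explain Analyze statement (monitors the time spent in each stage of a single SQL execution) -- Query: adds the UDAF user-defined aggregate function framework -- Query: UDF adds envelope demodulation analysis -- Query: adds the MaxBy/MinBy functions, which return the corresponding timestamp along with the maximum/minimum value -- Query: improves value-filter query performance -- Data sync: path matching supports wildcards -- Data sync: supports metadata synchronization (including time series and related attributes, permissions, and other settings) -- Stream processing: adds the Alter Pipe statement, supporting hot updates of Pipe task plugins -- System: data point statistics now include data imported through load TsFile -- Scripts and tools: adds a local upgrade backup tool (backs up existing data through hard links) -- Scripts and tools: adds the export-data/import-data scripts, which export data as CSV, TsFile, or SQL statements -- Scripts and tools: on Windows, ConfigNode, DataNode, and Cli can be told apart by window name - -#### V1.3.1.4 - -> Release date: 2024.4.23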
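-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.1.4-bin.zip
-> SHA512 checksum: 8547702061d52e2707c750a624730eb2d9b605b60661efa3c8f11611ca1685aeb51b6f8a93f94c1b30bf2e8764139489c9fbb76cf598cfa8bf9c874b2a7c57eb - -V1.3.1.4 adds viewing of system activation status, built-in variance and standard-deviation aggregate functions, a timeout setting for the built-in Fill statement, and a tsfile repair command. It also adds scripts for one-click collection of instance information and one-click cluster start and stop, and optimizes views and stream processing to improve usability and performance. Release details: - -- Query: the Fill clause supports a fill timeout threshold; values beyond the threshold are not filled -- Query: the Rest interface (V2) returns column types -- Data sync: simplifies how the time range is specified; start and end times are set directly -- Data sync: supports the SSL transport protocol (iotdb-thrift-ssl-sink plugin) -- System: supports querying cluster activation information through SQL -- System: tiered storage adds transfer rate control during migration -- System: improves observability (adds divergence monitoring of cluster nodes and observability of the distributed task scheduling framework) -- System: optimizes the default log output strategy -- Scripts and tools: adds one-click cluster start and stop scripts (start-all/stop-all.sh & start-all/stop-all.bat) -- Scripts and tools: adds one-click instance information collection scripts (collect-info.sh & collect-info.bat) - -#### V1.3.0.4 - -> Release date: 2024.1.3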
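-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.3.0.4-bin.zip
-> SHA512 checksum: 3c07798f37c07e776e5cd24f758e8aaa563a2aae0fb820dad5ebf565ad8a76c765b896d44e7fdb7dad2e46ffd4262af901c765f9bf6af926bc62103118e38951 - -V1.3.0.4 releases AINode, a new native machine learning framework, upgrades the permission module to support granting permissions at time-series granularity, and refines many details of views and stream processing, further improving usability, stability, and performance. Release details: - -- Query: adds the native machine learning module AINode -- Query: fixes the slow response of the show path statement -- Security: upgrades the permission module to support permission settings at time-series granularity -- Security: supports SSL-encrypted communication between client and server -- Stream processing: the stream processing module adds several metrics monitoring items -- Query: non-writable view series support LAST queries -- System: improves the accuracy of data point monitoring statistics - -#### V1.2.0.1 - -> Release date: 2023.6.30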
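-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.2.0.1-bin.zip
-> SHA512 checksum: dcf910d0c047d148a6c52fa9ee03a4d6bc3ff2a102dc31c0864695a25268ae933a274b093e5f3121689063544d7c6b3b635e5e87ae6408072e8705b3c4e20bf0 - -V1.2.0.1 mainly adds a stream processing framework, dynamic templates, and built-in query functions such as substring/replace/round; enhances built-in statements such as show region, show timeseries, and show variable, as well as the Session interface; optimizes the built-in monitoring items and their implementation; and fixes several product bugs and performance issues. - -- Stream processing: adds the stream processing framework -- Metadata: adds dynamic template extension -- Storage: adds the SPRINTZ and RLBE encodings and the LZMA2 compression algorithm -- Query: adds the built-in scalar functions cast, round, substr, and replace -- Query: adds the built-in aggregate functions time_duration and mode -- Query: SQL statements support the case when syntax -- Query: SQL statements support order by expressions -- Interface: the Python API supports connecting to multiple distributed nodes -- Interface: the Python client supports write redirection -- Interface: the Session API adds an interface for creating series in batches from a template - -#### V1.1.0.1 - -> Release date: 2023-04-03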
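-> Download: please contact Timecho staff
-> Package name: iotdb-enterprise-1.1.0.1.zip
-> SHA512 checksum: 58df58fc8b11afeec8436678842210ec092ac32f6308656d5356b7819acc199f1aec4b531635976b091b61d6736f0d9706badcabeaa5de50939e5c331c1dc804 - -V1.1.0.1 mainly adds new features, such as the GROUP BY VARIATION and GROUP BY CONDITION segmentation methods, utility functions such as DIFF and COUNT_IF, and a pipeline execution engine that further speeds up queries. It also fixes issues with order by timeseries in last queries on aligned series, LIMIT & OFFSET not taking effect, metadata template errors after restart, and series creation errors after deleting all databases. - -- Query: the align by device statement supports order by time -- Query: supports the Show Queries command -- Query: supports the kill query command -- System: show regions supports specifying a particular database -- System: adds the SQL statement show variables, which displays the current cluster parameters -- Query: aggregation queries support GROUP BY VARIATION -- Query: SELECT INTO supports casting to specific data types -- Query: implements the built-in scalar function DIFF -- System: show regions displays the creation time -- Query: implements the built-in aggregate function COUNT_IF -- Query: aggregation queries support GROUP BY CONDITION -- System: supports modifying dn_rpc_port and dn_rpc_address - - -## 2. Workbench (Console Tool) - -| **Workbench version** | **Release notes** | **Supported IoTDB versions** | **SHA512 checksum** | -| ---------------- | ------------------------------------------------------------ | ------------------------- | ------------------------------------------------------------ | -| V2.1.1 | Optimized measurement-point selection on the trend page; supports scenarios without devices | V2.0 and later | aa05fd4d9f33f07c0949bc2d6546bb4b9791ed5ea94bcef27e2bf51ea141ec0206f1c12466aced7bf3449e11ad68d65378d697f3d10cb4881024a83746029a65 | -| V2.0.1-beta | First release of the V2.x series; supports both the tree and table models | V2.0 and later | 0ca0d5029874ed8ada9c7d1cb562370b3a46913eed66d39c08759287ccc8bf332cf80bb8861e788614b61ae5d53a9f5605f553e1a607e856f395eb5102e7cc4d | -| V1.5.7 | In the measurement-point list, point names are split into device name and measurement point; the selection area supports horizontal scrolling; exported file columns follow the order shown on the page | 1.x series, V1.3.4 and later | d3cd4a63372ca5d6217b67dddf661980c6a442b3b1564235e9ad34fc254d681febd58c2cc59c6273ffbfd8a1b003b9adb130ecfaaebe1942003b0d07427b1fcc | -| V1.5.6 | Optimized CSV import and export: on import, tags and aliases are optional; on export, measurement-point descriptions containing quotes wrapped in backticks are supported | 1.x series, V1.3.4 and later | 276ac1ea341f468bf6d29489c9109e9aa61afe2d1caaab577bc40603c6f4120efccc36b65a58a29ce6a266c21b46837aad6128f84ba5e676231ea9e6284a35e5 | -| V1.5.5 | Added a server clock; supports activating the Enterprise Edition database | 1.x series, V1.3.4 and later | b18d01b70908d503a25866d1cc69d14e024d5b10ca6fcc536932fdbef8257c66e53204663ce3be5548479911aca238645be79dfd7ee7e65a07ab3c0f68c497f6 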
| -| V1.5.4 | Added authentication for the Prometheus settings in instance management | 1.x series, V1.3.4 and later | adc7e13576913f9e43a9671fed02911983888da57be98ec8fbbb2593600d310f69619d32b22b569520c88e29f100d7ccae995b20eba757dbb1b2825655719335 | -| V1.5.1 | Added AI analysis and pattern matching | 1.x series, V1.3.2 and later | 4f2053a2a3b2b255ce195268d6cd245278f3be32ba4cf68be1552c386d78ed4424f7bdc9d8e68c6b8260b3e398c8fd23ff342439c4e88e1e777c62640d2279f9 | -| V1.4.0 | Added tree-model display and an English version | 1.x series, V1.3.2 and later | 734077f3bb5e1719d20b319d8b554ce30718c935cb0451e02b2c9267ff770e9c2d63b958222f314f16c2e6e62bf78b643255249b574ee6f37d00e123433981e8 | -| V1.3.1 | Added analysis methods to the analysis feature; optimized template import and other functions | 1.x series, V1.3.2 and later | 134f87101cc7f159f8a22ac976ad2a3a295c5435058ee0a15160892aac46ac61dd3cfb0633b4aea9cc7415bf904d0ae65aaf77d663f027d864204d81fb34768b | -| V1.3.0 | Added database configuration; refined some version details | 1.x series, V1.3.2 and later | 94a137fc5c681b211f3e076472a9c5875d59e7f0cd6d7409cb8f66bb9e4f87577a0f12dd500e2bcb99a435860c82183e4a6514b638bcb4aecfb48f184730f3f1 | -| V1.2.6 | Optimized permission control in all modules | 1.x series, V1.3.1 and later | f345b7edcbe245a561cb94ec2e4f4d40731fe205f134acadf5e391e5874c5c2477d9f75f15dbaf36c3a7cb6506823ac6fbc2a0ccce484b7c4cc71ec0fbdd9901 | -| V1.2.5 | Visualization adds the concept of “common templates”; all pages add page caching and other optimizations | 1.x series, V1.3.0 and later | 37376b6cfbef7df8496e255fc33627de01bd68f636e50b573ed3940906b6f3da1e8e8b25260262293b8589718f5a72180fa15e5823437bf6dc51ed7da0c583f7 | -| V1.2.4 | Computation adds “import and export”; the measurement-point list adds a “time alignment” field | 1.x series, V1.2.2 and later | 061ad1add38c109c1a90b06f1ddb7797bd45e84a34a4f77154ee48b90bdc7ecccc1e25eaa53fbbc98170d99facca93e3536192dd8d10a50ce505f59923ce6186 | -| V1.2.3 | The home page adds “activation details”; adds analysis and other functions | 1.x series, V1.2.2 and later | 254f5b7451300f6f99937d27fd7a5b20847d5293f53e0eaf045ac9235c7ea011785716b800014645ed5d2161078b37e1d04f3c59589c976614fb801c4da982e1 | -| V1.2.2 | Optimized the display of “measurement-point description” and other functions | 1.x series, V1.2.2 and later | 062e520d010082be852d6db0e2a3aa6de594eb26aeb608da28a212726e378cd4ea30fca5e1d2c3231ebd8de29e94ca9641f1fabc1cea46acfb650c37b7681b4e | -| V1.2.1 | The data sync page adds a “monitoring panel”; optimized Prometheus prompts | 1.x series, V1.2.2 and later | 
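8a3bcf87982ad5004528829b121f2d3945429deb77069917a42a8c8d2e2e2a2c24a398aaa87003920eeacc0c692f1ed39eac52a696887aa085cce011f0ddd745 | -| V1.2.0 | All-new Workbench version upgrade | 1.x series, V1.2.0 and later | ea1f7d3a4c0c6476a195479e69bbd3b3a2da08b5b2bb70b0a4aba988a28b5db5a209d4e2c697eb8095dfdf130e29f61f2ddf58c5b51d002c8d4c65cfc13106b3 | diff --git a/src/zh/UserGuide/Master/Tree/QuickStart/QuickStart_timecho.md b/src/zh/UserGuide/Master/Tree/QuickStart/QuickStart_timecho.md deleted file mode 100644 index 8445cc339..000000000 --- a/src/zh/UserGuide/Master/Tree/QuickStart/QuickStart_timecho.md +++ /dev/null @@ -1,109 +0,0 @@ - - -# Quick Start - -This document helps you get started with IoTDB quickly. - -## 1. How to Install and Deploy? - -This section helps you install and deploy IoTDB quickly. Use the links below to jump to the content you need: - -1. Prepare machine resources: deploying and running IoTDB requires planning machine resources in several respects. For the resource configuration, see [Resource Planning](../Deployment-and-Maintenance/Database-Resources_timecho.md) - -2. Prepare the system configuration: IoTDB system configuration covers several aspects. For the key settings, see [System Configuration](../Deployment-and-Maintenance/Environment-Requirements.md) - -3. Obtain the installation package: contact Timecho sales to obtain the IoTDB installation package, so that you download the latest stable version. For the package structure, see [Obtaining the Installation Package](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) - -4. Install and activate the database: choose one of the following tutorials according to your deployment architecture: - - - Stand-alone: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - - - Distributed (cluster): [Cluster Deployment](../Deployment-and-Maintenance//Cluster-Deployment_timecho.md) - - - Dual-active: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -> ❗️Note: we still recommend installing directly on physical or virtual machines. If you need a Docker deployment, see [Docker Deployment](../Deployment-and-Maintenance/Docker-Deployment_timecho.md) - -5. Install the companion tools: the Enterprise Edition provides a monitoring panel, a visual console, and other companion tools. We recommend installing them when you deploy the Enterprise Edition, as they make IoTDB easier to use: - - - Monitoring panel: provides hundreds of database monitoring metrics for fine-grained monitoring of IoTDB and its operating system, supporting system tuning, performance tuning, and bottleneck discovery. For installation steps, see [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - Visual console: the visual interface of IoTDB. It provides metadata management, data query, and data visualization through an interactive UI, helping users work with the database simply and efficiently. For installation steps, see [Visual Console Deployment](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - -## 2. How to Use? - -1. 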
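Database modeling: modeling is an important step in building a database system. It involves designing the structure and relationships of the data so that its organization meets the needs of the application. The following documents introduce IoTDB modeling: - - - Time series concepts: [Navigating Time Series Data](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - - - Modeling design: [Data Model Introduction](../Background-knowledge/Data-Model-and-Terminology_timecho.md) - - - SQL syntax: [SQL Syntax Introduction](../Basic-Concept/Operate-Metadata_timecho.md) - -2. Data writing: IoTDB provides several ways to insert real-time data. For basic write operations, see [Data Writing](../Basic-Concept/Write-Data_timecho.md) - -3. Data query: IoTDB provides rich query capabilities. For an introduction, see [Data Query](../Basic-Concept/Query-Data_timecho.md) - -4. Other advanced features: besides the usual database functions such as writing and querying, IoTDB supports data synchronization, a stream processing framework, security control, permission management, and AI analysis. For usage, see the corresponding documents: - - - Data sync: [Data Sync](../User-Manual/Data-Sync_timecho.md) - - - Stream processing framework: [Stream Processing Framework](../User-Manual/Streaming_timecho.md) - - - Security control: [Security Control](../User-Manual/Black-White-List_timecho.md) - - - Permission management: [Permission Management](../User-Manual/Authority-Management_timecho.md) - - - AI analysis: [AI Capability](../AI-capability/AINode_timecho.md) - -5. Application programming interfaces: IoTDB provides several application programming interfaces (APIs) so that developers can interact with IoTDB from their applications. It currently supports the [Java Native API](../API/Programming-Java-Native-API_timecho.md), [Python Native API](../API/Programming-Python-Native-API_timecho.md), [C++ Native API](../API/Programming-Cpp-Native-API.md), [Go Native API](../API/Programming-Go-Native-API.md), and others. For more interfaces, see the other chapters under "Application Programming Interface" on the official website - -## 3. What Other Convenient Peripheral Tools Are There? 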
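- -Beyond its own rich feature set, IoTDB comes with a comprehensive set of peripheral tools. This section helps you get started with them: - - - Visual console: workbench is an interactive visual interface for IoTDB. It provides intuitive metadata management, data query, and data visualization, making database operations more convenient and efficient. For usage, see [Visual Console Deployment](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - - - Monitoring panel: a tool for fine-grained monitoring of IoTDB and its operating system. It covers hundreds of database monitoring metrics, including database performance and system resources, and helps with system tuning and bottleneck identification. For usage, see [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - Benchmark tool: IoT-benchmark is a time series database benchmarking tool developed in Java for big data environments, created and open-sourced by the School of Software at Tsinghua University. It supports several write and query modes, stores test information and results for further query or analysis, and integrates with Tableau to visualize test results. For usage, see [Benchmark Tool](../Tools-System/Benchmark.md) - - - Data import scripts: IoTDB provides several ways to import data in batches for different scenarios. For usage, see [Data Import](../Tools-System/Data-Import-Tool_timecho.md) - - - - Data export scripts: IoTDB provides several ways to export data in batches for different scenarios. For usage, see [Data Export](../Tools-System/Data-Export-Tool_timecho.md) - - -## 4. Want to Learn More Technical Details? - -If you want to know more about the internals of IoTDB, see the following documents: - - - Research papers: IoTDB features columnar storage, data encoding, pre-computation, and indexing, along with a SQL-like interface and high-performance data processing, and integrates seamlessly with Apache Hadoop, MapReduce, and Apache Spark. For the related papers, see [Research Papers](../Technical-Insider/Publication.md) - - - Compression & encoding: IoTDB uses a variety of encoding and compression techniques to optimize storage efficiency for different data types. To learn more, see [Compression & Encoding](../Technical-Insider/Encoding-and-Compression.md) - - - Data partitioning and load balancing: based on the characteristics of time series data, IoTDB uses purpose-built data partitioning strategies and load-balancing algorithms that improve cluster availability and performance. To learn more, see [Data Partitioning and Load Balancing](../Technical-Insider/Cluster-data-partitioning.md) - - -## 5. Running Into Problems? - -If you run into difficulties during installation or use, see the [FAQ](../FAQ/Frequently-asked-questions.md) \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/Reference/DataNode-Config-Manual_timecho.md b/src/zh/UserGuide/Master/Tree/Reference/DataNode-Config-Manual_timecho.md deleted file mode 100644 index 8f0a3fc61..000000000 --- a/src/zh/UserGuide/Master/Tree/Reference/DataNode-Config-Manual_timecho.md +++ /dev/null @@ -1,625 +0,0 @@ - - -# DataNode Configuration Parameters - -IoTDB DataNode and Standalone mode share one set of configuration files, all located in the `conf` folder of the IoTDB installation directory. - -* `datanode-env.sh/bat`: the environment configuration file, where the memory size of the DataNode can be set. - -* `iotdb-system.properties`: the IoTDB configuration file. - -## 1. 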
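Hot-Modifiable Configuration Items - -For convenience, IoTDB supports hot modification: some parameters in `iotdb-system.properties` can be changed while the system is running and are applied immediately. Among the parameters described below, those whose Effective mode is `Hot reload` support hot modification. - -Sending the ```load configuration``` or `set configuration` command (SQL) to IoTDB through a Session or the Cli triggers a hot reload of the configuration. - -## 2. Environment Configuration (datanode-env.sh/bat) - -The environment configuration sets parameters of the Java environment in which the DataNode runs, such as JVM options. When DataNode/Standalone starts, these settings are passed to the JVM. The items are described below: - -* MEMORY\_SIZE - -|Name|MEMORY\_SIZE| -|:---:|:---| -|Description|Memory allocated to the IoTDB DataNode at startup | -|Type|String| -|Default|Depends on the operating system and machine configuration. By default, half of the machine's memory.| -|Effective|After restarting the service| - -* ON\_HEAP\_MEMORY - -|Name|ON\_HEAP\_MEMORY| -|:---:|:---| -|Description|On-heap memory available to the IoTDB DataNode. Formerly named MAX\_HEAP\_SIZE | -|Type|String| -|Default|Depends on the MEMORY\_SIZE setting.| -|Effective|After restarting the service| - -* OFF\_HEAP\_MEMORY - -|Name|OFF\_HEAP\_MEMORY| -|:---:|:---| -|Description|Off-heap memory available to the IoTDB DataNode. Formerly named MAX\_DIRECT\_MEMORY\_SIZE | -|Type|String| -|Default|Depends on the MEMORY\_SIZE setting| -|Effective|After restarting the service| - -* JMX\_LOCAL - -|Name|JMX\_LOCAL| -|:---:|:---| -|Description|JMX monitoring mode. true allows local monitoring only; false allows remote monitoring. To connect to the JMX Service over the network from the local machine (for example, nodeTool.sh tries to connect to 127.0.0.1:31999), set JMX_LOCAL to false.| -|Type|Enum String: “true”, “false”| -|Default|true| -|Effective|After restarting the service| - -* JMX\_PORT - -|Name|JMX\_PORT| -|:---:|:---| -|Description|JMX listening port. Make sure the port is not a system-reserved port and is not already in use.| -|Type|Short Int: [0,65535]| -|Default|31999| -|Effective|After restarting the service| - -## 3. 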
System Configuration (iotdb-system.properties) - -The system configuration is the core configuration for running IoTDB DataNode/Standalone. It mainly sets the parameters of the DataNode/Standalone database engine. - -### 3.1 Data Node RPC Service Configuration - -* dn\_rpc\_address - -|Name| dn\_rpc\_address | -|:---:|:-----------------| -|Description| Listening address of the client RPC service | -|Type| String | -|Default| 127.0.0.1 | -|Effective| After restarting the service | - -* dn\_rpc\_port - -|Name| dn\_rpc\_port | -|:---:|:---| -|Description| Listening port of the client RPC service| -|Type| Short Int : [0,65535] | -|Default| 6667 | -|Effective|After restarting the service| - -* dn\_internal\_address - -|Name| dn\_internal\_address | -|:---:|:---| -|Description| Address for DataNode internal communication | -|Type| string | -|Default| 127.0.0.1 | -|Effective|Can only be modified before the service is first started| - -* dn\_internal\_port - -|Name| dn\_internal\_port | -|:---:|:-------------------| -|Description| Port for DataNode internal communication | -|Type| int | -|Default| 10730 | -|Effective| Can only be modified before the service is first started | - -* dn\_mpp\_data\_exchange\_port - -|Name| dn\_mpp\_data\_exchange\_port | -|:---:|:---| -|Description| MPP data exchange port | -|Type| int | -|Default| 10740 | -|Effective|Can only be modified before the service is first started| - -* dn\_schema\_region\_consensus\_port - -|Name| dn\_schema\_region\_consensus\_port | -|:---:|:---| -|Description| Consensus protocol communication port for DataNode metadata replicas | -|Type| int | -|Default| 10750 | -|Effective|Can only be modified before the service is first started| - -* dn\_data\_region\_consensus\_port - -|Name| dn\_data\_region\_consensus\_port | -|:---:|:---| -|Description| Consensus protocol communication port for DataNode data replicas | -|Type| int | -|Default| 10760 | -|Effective|Can only be modified before the service is first started| - -* dn\_join\_cluster\_retry\_interval\_ms - -|Name| dn\_join\_cluster\_retry\_interval\_ms | -|:---:|:---------------------------------------| -|Description| Time the DataNode waits before retrying to join the cluster | -|Type| long | -|Default| 5000 | -|Effective| After restarting the service | - - -### 3.2 SSL Configuration - -* enable\_thrift\_ssl - -|Name| enable\_thrift\_ssl | -|:---:|:----------------------------------------------| -|Description| When enable\_thrift\_ssl is set to true, communication through dn\_rpc\_port is encrypted with SSL | -|Type| Boolean | -|Default| false | -|Effective| After restarting the service | - -* enable\_https - -|Name| enable\_https | -|:---:|:-------------------------| -|Description| Whether the REST Service enables SSL | -|Type| Boolean | -|Default| false | -|Effective| After restart | - -* key\_store\_path - -|Name| key\_store\_path | -|:---:|:-----------------| -|Description| SSL certificate path 
| -|Type| String | -|Default| "" | -|Effective| After restarting the service | - -* key\_store\_pwd - -|Name| key\_store\_pwd | -|:---:|:----------------| -|Description| SSL certificate password | -|Type| String | -|Default| "" | -|Effective| After restarting the service | - - -### 3.3 SeedConfigNode Configuration - -* dn\_seed\_config\_node - -|Name| dn\_seed\_config\_node | -|:---:|:------------------------------------| -|Description| ConfigNode address through which the DataNode joins the cluster at startup; using the SeedConfigNode is recommended. In V1.2.2 and earlier this item was named dn\_target\_config\_node\_list | -|Type| String | -|Default| 127.0.0.1:10710 | -|Effective| Can only be modified before the service is first started | - -### 3.4 Connection Configuration - -* dn\_session\_timeout\_threshold - -|Name| dn\_session\_timeout\_threshold | -|:---:|:------------------------------| -|Description| Maximum session idle time | -|Type| int | -|Default| 0 | -|Effective| After restarting the service | - - -* dn\_rpc\_thrift\_compression\_enable - -|Name| dn\_rpc\_thrift\_compression\_enable | -|:---:|:---------------------------------| -|Description| Whether to enable thrift compression | -|Type| Boolean | -|Default| false | -|Effective| After restarting the service | - -* dn\_rpc\_advanced\_compression\_enable - -|Name| dn\_rpc\_advanced\_compression\_enable | -|:---:|:-----------------------------------| -|Description| Whether to enable thrift's customized compression | -|Type| Boolean | -|Default| false | -|Effective| After restarting the service | - -* dn\_rpc\_selector\_thread\_count - -| Name | dn\_rpc\_selector\_thread\_count | -|:------:|:-----------------------------| -| Description | Number of rpc selector threads | -| Type | int | -| Default | 1 | -| Effective | After restarting the service | - -* dn\_rpc\_min\_concurrent\_client\_num - -| Name | dn\_rpc\_min\_concurrent\_client\_num | -|:------:|:----------------------------------| -| Description | Minimum number of connections | -| Type | Short Int : [0,65535] | -| Default | 1 | -| Effective | After restarting the service | - -* dn\_rpc\_max\_concurrent\_client\_num - -| Name | dn\_rpc\_max\_concurrent\_client\_num | -|:------:|:--------------------------------------| -| Description | Maximum number of connections | -| Type | Short Int : [0,65535] | -| Default | 1000 | -| Effective | After restarting the service | - -* dn\_thrift\_max\_frame\_size - -|Name| dn\_thrift\_max\_frame\_size | -|:---:|:------------------------------------------------------------------------------------------------------------------| -|Description| Maximum number of bytes in an RPC request or response | -|Type| int | 
-|Default| 0 by default, meaning the value is computed automatically from the DataNode JVM settings at startup:<br>a. min(64MB, dn_alloc_memory/64)<br>b. If the user sets dn_thrift_max_frame_size manually, the user-specified size is used | -|Effective| After restarting the service | - -* dn\_thrift\_init\_buffer\_size - -|Name| dn\_thrift\_init\_buffer\_size | -|:---:|:---| -|Description| Size in bytes | -|Type| long | -|Default| 1024 | -|Effective|After restarting the service| - -* dn\_connection\_timeout\_ms - -| Name | dn\_connection\_timeout\_ms | -|:------:|:----------------------------| -| Description | Node connection timeout | -| Type | int | -| Default | 60000 | -| Effective | After restarting the service | - -* dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager | -|:------:|:--------------------------------------------------------------| -| Description | Number of core Clients routed to each node in a single ClientManager | -| Type | int | -| Default | 200 | -| Effective | After restarting the service | - -* dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager | -|:------:|:-------------------------------------------------------------| -| Description | Maximum number of Clients routed to each node in a single ClientManager | -| Type | int | -| Default | 300 | -| Effective | After restarting the service | - -### 3.5 Directory Configuration - -* dn\_system\_dir - -| Name | dn\_system\_dir | -|:------:|:--------------------------------------------------------------------| -| Description | Storage path for IoTDB metadata; by default under the data directory at the same level as the sbin directory. The base of a relative path depends on the operating system, so an absolute path is recommended. | -| Type | String | -| Default | data/datanode/system (Windows: data\\datanode\\system) | -| Effective | After restarting the service | - -* dn\_data\_dirs - -| Name | dn\_data\_dirs | -|:------:|:-------------------------------------------------------------------| -| Description | Storage path for IoTDB data; by default under the data directory at the same level as the sbin directory. The base of a relative path depends on the operating system, so an absolute path is recommended. | -| Type | String | -| Default | data/datanode/data (Windows: data\\datanode\\data) | -| Effective | After restarting the service | - -* dn\_multi\_dir\_strategy - -| Name | dn\_multi\_dir\_strategy | 
-|:------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| 描述 | IoTDB 在 data\_dirs 中为 TsFile 选择目录时采用的策略。可使用简单类名或类名全称。系统提供以下三种策略:
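1. SequenceStrategy: IoTDB selects directories in order, iterating over all directories in data\_dirs in turn and cycling continuously;
2. MaxDiskUsableSpaceFirstStrategy: IoTDB preferentially selects the directory in data\_dirs whose disk has the most free space;
You can implement a user-defined strategy as follows:
1. Extend the org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy class and implement your own strategy method;
2. Enter the fully qualified name of the implemented class (package name plus class name, UserDefineStrategyPackage) in this configuration item;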
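3. Add the jar file containing the class to the project. | -| Type | String | -| Default | SequenceStrategy | -| Effective | Hot reload | - -* dn\_consensus\_dir - -| Name | dn\_consensus\_dir | -|:------:|:-------------------------------------------------------------------------| -| Description | Storage path for IoTDB consensus-layer logs. By default it is placed in the data directory at the same level as the sbin directory. The base directory for relative paths depends on the operating system, so an absolute path is recommended. | -| Type | String | -| Default | data/datanode/consensus (Windows: data\\datanode\\consensus) | -| Effective | After restarting the service | - -* dn\_wal\_dirs - -| Name | dn\_wal\_dirs | -|:------:|:---------------------------------------------------------------------| -| Description | Storage path for the IoTDB write-ahead log. By default it is placed in the data directory at the same level as the sbin directory. The base directory for relative paths depends on the operating system, so an absolute path is recommended. | -| Type | String | -| Default | data/datanode/wal (Windows: data\\datanode\\wal) | -| Effective | After restarting the service | - -* dn\_tracing\_dir - -| Name | dn\_tracing\_dir | -|:------:|:--------------------------------------------------------------------| -| Description | Root directory path for IoTDB tracing. By default it is placed in the data directory at the same level as the sbin directory. The base directory for relative paths depends on the operating system, so an absolute path is recommended. | -| Type | String | -| Default | datanode/tracing | -| Effective | After restarting the service | - -* dn\_sync\_dir - -| Name | dn\_sync\_dir | -|:------:|:----------------------------------------------------------------------| -| Description | Storage path for IoTDB sync. By default it is placed in the data directory at the same level as the sbin directory. The base directory for relative paths depends on the operating system, so an absolute path is recommended. | -| Type | String | -| Default | data/datanode/sync | -| Effective | After restarting the service | - -### 3.6 Metric Configuration - -* dn\_metric\_reporter\_list - -| Name | dn\_metric\_reporter\_list | -|:------:|:------------------------------------------| -| Description | The systems to which the DataNode monitoring module reports its data. | -| Type | String | -| Default | None | -| Effective | After restarting the service | - -* dn\_metric\_level - -| Name | dn\_metric\_level | -|:------:|:---------------------------------| -| Description | Controls the level of detail of the data collected by the DataNode monitoring module | -| Type | String | -| Default | IMPORTANT | -| Effective | After restarting the service | - -* dn\_metric\_async\_collect\_period - -| Name | dn\_metric\_async\_collect\_period | -|:------:|:---------------------------------------| -| Description | Period for asynchronous collection of certain monitoring data in the DataNode, in seconds. | -| Type | int | -| Default | 5 | -| Effective | After restarting the service | - -* dn\_metric\_prometheus\_reporter\_port - -| Name | 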
dn\_metric\_prometheus\_reporter\_port | -|:------:|:-------------------------------------------| -| Description | Port used by the Prometheus reporter in the DataNode to report monitoring data. | -| Type | int | -| Default | 9092 | -| Effective | After restarting the service | - -* dn\_metric\_internal\_reporter\_type - -| Name | dn\_metric\_internal\_reporter\_type | -|:------:|:-----------------------------------------------------------| -| Description | Type of the internal reporter of the DataNode monitoring module, used for internal monitoring and for checking whether data has been written and flushed successfully. | -| Type | String | -| Default | IOTDB | -| Effective | After restarting the service | - -## 4. Enabling GC Logs - -GC logging is disabled by default. For performance tuning, users may need to collect GC information. -To enable GC logging, add the "printgc" parameter when starting the IoTDB server: - -```bash -nohup sbin/start-datanode.sh printgc >/dev/null 2>&1 & -``` - -Or, on Windows: - -```bash -# Before V2.0.4.x -sbin\start-datanode.bat printgc - -# V2.0.4.x and later -tools\windows\start-datanode.bat printgc -``` - -GC logs are stored in `IOTDB_HOME/logs/gc.log`. At most 10 gc.log files are kept, each up to 10 MB. - -#### REST Service Configuration - -* enable\_rest\_service - -|Name| enable\_rest\_service | -|:---:|:--------------------| -|Description| Whether to enable the REST service. | -|Type| Boolean | -|Default| false | -|Effective| After restart | - -* rest\_service\_port - -|Name| rest\_service\_port | -|:---:|:------------------| -|Description| Port on which the REST service listens | -|Type| int32 | -|Default| 18080 | -|Effective| After restart | - -* enable\_swagger - -|Name| enable\_swagger | -|:---:|:-----------------------| -|Description| Whether to enable Swagger to display REST interface information | -|Type| Boolean | -|Default| false | -|Effective| After restart | - -* rest\_query\_default\_row\_size\_limit - -|Name| rest\_query\_default\_row\_size\_limit | -|:---:|:----------------------------------| -|Description| Maximum number of rows a single query can return in its result set | -|Type| int32 | -|Default| 10000 | -|Effective| After restart | - -* cache\_expire - -|Name| cache\_expire | -|:---:|:--------------| -|Description| Expiration time of cached user login information | -|Type| int32 | -|Default| 28800 | -|Effective| After restart | - -* cache\_max\_num - -|Name| cache\_max\_num | -|:---:|:--------------| -|Description| Maximum number of users stored in the cache | -|Type| int32 | -|Default| 100 | -|Effective| After restart | - -* cache\_init\_num - -|Name| cache\_init\_num | -|:---:|:---------------| -|Description| Initial cache capacity | -|Type| int32 | -|Default| 10 | -|Effective| After restart | - -* trust\_store\_path - -|Name| 
trust\_store\_path | -|:---:|:---------------| -|Description| trustStore path (optional) | -|Type| String | -|Default| "" | -|Effective| After restart | - -* trust\_store\_pwd - -|Name| trust\_store\_pwd | -|:---:|:---------------| -|Description| trustStore password (optional) | -|Type| String | -|Default| "" | -|Effective| After restart | - -* idle\_timeout - -|Name| idle\_timeout | -|:---:|:--------------| -|Description| SSL timeout, in seconds | -|Type| int32 | -|Default| 5000 | -|Effective| After restart | - - - -#### Tiered Storage Configuration - -* dn\_default\_space\_usage\_thresholds - -|Name| dn\_default\_space\_usage\_thresholds | -|:---:|:--------------| -|Description| Defines the minimum proportion of free space for the data directories of each tier; when free space falls below this proportion, data is automatically migrated to the next tier; when the free space of the last tier falls below this threshold, the system is set to READ_ONLY | -|Type| double | -|Default| 0.85 | -|Effective| Hot reload | - -* remote\_tsfile\_cache\_dirs - -|Name| remote\_tsfile\_cache\_dirs | -|:---:|:--------------| -|Description| Local cache directories for cloud storage | -|Type| string | -|Default| data/datanode/data/cache | -|Effective| After restart | - -* remote\_tsfile\_cache\_page\_size\_in\_kb - -|Name| remote\_tsfile\_cache\_page\_size\_in\_kb | -|:---:|:--------------| -|Description| Block size of local cache files for cloud storage | -|Type| int | -|Default| 20480 | -|Effective| After restart | - -* remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb - -|Name| remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb | -|:---:|:--------------| -|Description| Maximum disk usage of the local cache for cloud storage | -|Type| long | -|Default| 51200 | -|Effective| After restart | - -* object\_storage\_type - -|Name| object\_storage\_type | -|:---:|:--------------| -|Description| Cloud storage type | -|Type| string | -|Default| AWS_S3 | -|Effective| After restart | - -* object\_storage\_bucket - -|Name| object\_storage\_bucket | -|:---:|:--------------| -|Description| Name of the cloud storage bucket | -|Type| string | -|Default| iotdb_data | -|Effective| After restart | - -* object\_storage\_endpoint - -|Name| object\_storage\_endpoint | -|:---:|:---------------------------| -|Description| Endpoint of the cloud storage | -|Type| string | -|Default| None | -|Effective| After restart | - -* object\_storage\_access\_key - -|Name| object\_storage\_access\_key | -|:---:|:--------------| -|Description| Access key credential for the cloud storage | -|Type| string | -|Default| None | -|Effective| After restart | - -* object\_storage\_access\_secret - -|Name| object\_storage\_access\_secret | 
-|:---:|:--------------| -|Description| Access secret credential for the cloud storage | -|Type| string | -|Default| None | -|Effective| After restart | diff --git a/src/zh/UserGuide/Master/Tree/SQL-Manual/QuickStart-Only-Sql_timecho.md b/src/zh/UserGuide/Master/Tree/SQL-Manual/QuickStart-Only-Sql_timecho.md deleted file mode 100644 index 55f835bb8..000000000 --- a/src/zh/UserGuide/Master/Tree/SQL-Manual/QuickStart-Only-Sql_timecho.md +++ /dev/null @@ -1,111 +0,0 @@ - - -# Quick SQL Experience - -> **Before executing the following SQL statements, make sure that** -> -> * **the IoTDB service has been started successfully** -> * **you have connected to IoTDB through the CLI client** -> -> Note: If your terminal does not support multi-line pasting (for example, Windows CMD), reformat the SQL statements onto a single line before executing them. - -## 1. Database Management - -```SQL --- Create a database; -CREATE DATABASE root.ln; - --- Show databases; -SHOW DATABASES root.**; - --- Delete the database; -DELETE DATABASE root.ln; - --- Count databases; -COUNT DATABASES root.**; -``` - -For detailed syntax, see: [Database Management](../Basic-Concept/Operate-Metadata_timecho.md#_1-数据库管理) - -## 2. Timeseries Management - -```SQL --- Create timeseries; -CREATE TIMESERIES root.ln.wf01.wt01.status BOOLEAN; -CREATE TIMESERIES root.ln.wf01.wt01.temperature FLOAT; - --- Create aligned timeseries; -CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT, longitude FLOAT); - --- Delete a timeseries; -DELETE TIMESERIES root.ln.wf01.wt01.status; - --- Show timeseries; -SHOW TIMESERIES root.ln.**; - --- Count timeseries; -COUNT TIMESERIES root.ln.**; -``` - -For detailed syntax, see: [Timeseries Management](../Basic-Concept/Operate-Metadata_timecho.md#_2-时间序列管理) - -## 3. Data Writing - -```SQL --- Single-column write; -INSERT INTO root.ln.wf01.wt01(timestamp, temperature) VALUES(1, 23.0),(2, 42.6); - --- Multi-column write; -INSERT INTO root.ln.wf01.wt01(timestamp, status, temperature) VALUES (3, false, 33.1),(4, true, 24.6); -``` - -For detailed syntax, see: [Data Writing](../Basic-Concept/Write-Data_timecho.md) - -## 4. Data Query - -```SQL --- Query with a time filter; -SELECT * from root.ln.** where time > 1; - --- Query with a value filter; -SELECT temperature FROM root.ln.wf01.wt01 where temperature > 36.5; - --- Query with a function; -SELECT count(temperature) FROM root.ln.wf01.wt01; - --- Last-point query; -SELECT LAST status FROM root.ln.wf01.wt01; -``` - -For detailed syntax, see: [Data Query](../Basic-Concept/Query-Data_timecho.md) - -## 5. 
Data Deletion - -```SQL --- Single-column delete; -DELETE FROM root.ln.wf01.wt01.status WHERE time >= 20; - --- Multi-column delete; -DELETE FROM root.ln.wf01.wt01.* where time <= 10; -``` - -For detailed syntax, see: [Data Deletion](../Basic-Concept/Delete-Data.md) diff --git a/src/zh/UserGuide/Master/Tree/SQL-Manual/SQL-Manual_timecho.md b/src/zh/UserGuide/Master/Tree/SQL-Manual/SQL-Manual_timecho.md deleted file mode 100644 index 44a04a5b5..000000000 --- a/src/zh/UserGuide/Master/Tree/SQL-Manual/SQL-Manual_timecho.md +++ /dev/null @@ -1,1704 +0,0 @@ -# SQL Manual - -## 1. Metadata Operations - -### 1.1 Database Management - -#### Create Database - -```sql -CREATE DATABASE root.ln; -``` - -#### Show Databases - -```sql -show databases; -show databases root.*; -show databases root.**; -``` - -#### Delete Database - -```sql -DELETE DATABASE root.ln; -DELETE DATABASE root.sgcc; -DELETE DATABASE root.**; -``` - -#### Count Databases - -```sql -count databases; -count databases root.*; -count databases root.sgcc.*; -count databases root.sgcc; -``` - -### 1.2 Timeseries Management - -#### Create Timeseries - -```sql -create timeseries root.ln.wf01.wt01.status with datatype=BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature with datatype=FLOAT; -create timeseries root.ln.wf02.wt02.hardware with datatype=TEXT; -create timeseries root.ln.wf02.wt02.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature with datatype=FLOAT; -``` - -- Simplified form - -```sql -create timeseries root.ln.wf01.wt01.status BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature FLOAT; -create timeseries root.ln.wf02.wt02.hardware TEXT; -create timeseries root.ln.wf02.wt02.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature FLOAT; -``` - -- Error message - -```sql -create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF; -error: encoding TS_2DIFF does not support BOOLEAN -``` - -#### Create Aligned Timeseries - -```sql -CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT, longitude 
FLOAT); -``` - -#### Alter Timeseries Data Type -> Supported since V2.0.8.2 - -```sql -ALTER TIMESERIES root.ln.wf01.wt01.temperature set data type DOUBLE -``` - -#### Rename Timeseries -> Supported since V2.0.8.2 - -```SQL -ALTER TIMESERIES root.ln.wf01.wt01.temperature RENAME TO root.newln.newwf.newwt.temperature -``` - -#### Delete Timeseries - -```sql -delete timeseries root.ln.wf01.wt01.status; -delete timeseries root.ln.wf01.wt01.temperature, root.ln.wf02.wt02.hardware; -delete timeseries root.ln.wf02.*; -drop timeseries root.ln.wf02.*; -``` - -#### Show Timeseries - -```sql -SHOW TIMESERIES; -SHOW TIMESERIES ; -SHOW TIMESERIES root.**; -SHOW TIMESERIES root.ln.**; -SHOW TIMESERIES root.ln.** limit 10 offset 10; -SHOW TIMESERIES root.ln.** where timeseries contains 'wf01.wt'; -SHOW TIMESERIES root.ln.** where dataType=FLOAT; -SHOW TIMESERIES root.ln.** where time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -SHOW LATEST TIMESERIES; -SHOW INVALID TIMESERIES; -- Supported since V2.0.8.2; -``` - -#### Count Timeseries - -```sql -COUNT TIMESERIES root.**; -COUNT TIMESERIES root.ln.**; -COUNT TIMESERIES root.ln.*.*.status; -COUNT TIMESERIES root.ln.wf01.wt01.status; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc'; -COUNT TIMESERIES root.** WHERE DATATYPE = INT64; -COUNT TIMESERIES root.** WHERE TAGS(unit) contains 'c'; -COUNT TIMESERIES root.** WHERE TAGS(unit) = 'c'; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' group by level = 1; -COUNT TIMESERIES root.** WHERE time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -COUNT TIMESERIES root.** GROUP BY LEVEL=1; -COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2; -COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2; -``` - -#### Tag Management - -```sql -create timeseries root.turbine.d1.s1(temprature) with datatype=FLOAT tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2); -``` - -- Rename a tag or attribute - -```sql -ALTER timeseries root.turbine.d1.s1 RENAME tag1 TO newTag1; -``` - -- Reset the value of a tag or attribute - -```sql -ALTER timeseries root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1; -``` - -- 
Delete an existing tag or attribute - -```sql -ALTER timeseries root.turbine.d1.s1 DROP tag1, tag2; -``` - -- Add new tags - -```sql -ALTER timeseries root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4; -``` - -- Add new attributes - -```sql -ALTER timeseries root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4; -``` - -- Upsert alias, tags, and attributes - -```sql -ALTER timeseries root.turbine.d1.s1 UPSERT ALIAS=newAlias TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4); -``` - -- Query timeseries using tags as filter conditions - -```sql -SHOW TIMESERIES (<`PathPattern`>)? timeseriesWhereClause -``` - -Returns information on all timeseries under the given path that satisfy the condition: - -```sql -ALTER timeseries root.ln.wf02.wt02.hardware ADD TAGS unit=c; -ALTER timeseries root.ln.wf02.wt02.status ADD TAGS description=test1; -show timeseries root.ln.** where TAGS(unit)='c'; -show timeseries root.ln.** where TAGS(description) contains 'test1'; -``` - -- Count timeseries using tags as filter conditions - -```sql -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause; -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause GROUP BY LEVEL=; -``` - -Returns the number of timeseries under the given path that satisfy the condition: - -```sql -count timeseries; -count timeseries root.** where TAGS(unit)='c'; -count timeseries root.** where TAGS(unit)='c' group by level = 2; -``` - -Create aligned timeseries: - -```sql -create aligned timeseries root.sg1.d1(s1 INT32 tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2), s2 DOUBLE tags(tag3=v3, tag4=v4) attributes(attr3=v3, attr4=v4)); -``` - -Queries are supported: - -```sql -show timeseries where TAGS(tag1)='v1'; -``` - -### 1.3 Timeseries Path Management - -#### Show All Child Paths of a Path - -```sql -SHOW CHILD PATHS pathPattern; --- Query the next level under root.ln; -show child paths root.ln; --- Query paths of the form root.xx.xx.xx; -show child paths root.*.*; -``` -#### Show All Child Nodes of a Path - -```sql -SHOW CHILD NODES pathPattern; --- Query the next level under root; -show child nodes root; --- Query the next level under root.ln; -show child nodes root.ln; -``` -#### Show Devices - -```sql -show devices; -show devices root.ln.**; -show devices where time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -``` -##### Show Devices with Their Database Information - -```sql -show devices with database; -show devices root.ln.** 
with database; -``` -#### Count Nodes - -```sql -COUNT NODES root.** LEVEL=2; -COUNT NODES root.ln.** LEVEL=2; -COUNT NODES root.ln.wf01.* LEVEL=3; -COUNT NODES root.**.temperature LEVEL=3; -``` -#### Count Devices - -```sql -count devices; -count devices root.ln.**; -count devices where time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -``` - -### 1.4 Data Time-to-Live (TTL) Management - -#### Set TTL -```sql -set ttl to root.ln 3600000; -set ttl to root.sgcc.** 3600000; -set ttl to root.** 3600000; -``` -#### Unset TTL -```sql -unset ttl from root.ln; -unset ttl from root.sgcc.**; -unset ttl from root.**; -``` - -#### Show TTL -```sql -SHOW ALL TTL; -SHOW TTL ON pathPattern; -show DEVICES; -``` -## 2. Writing Data - -### 2.1 Write Single-Column Data -```sql -insert into root.ln.wf02.wt02(timestamp,status) values(1,true); -insert into root.ln.wf02.wt02(timestamp,hardware) values(1, 'v1'),(2, 'v1'); -``` -### 2.2 Write Multi-Column Data -```sql -insert into root.ln.wf02.wt02(timestamp, status, hardware) values (2, false, 'v2'); -insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4'); -``` -### 2.3 Use the Server Timestamp -```sql -insert into root.ln.wf02.wt02(status, hardware) values (false, 'v2'); -``` -### 2.4 Write Aligned Timeseries Data -```sql -create aligned timeseries root.sg1.d1(s1 INT32, s2 DOUBLE); -insert into root.sg1.d1(timestamp, s1, s2) aligned values(1, 1, 1); -insert into root.sg1.d1(timestamp, s1, s2) aligned values(2, 2, 2), (3, 3, 3); -select * from root.sg1.d1; -``` -### 2.5 Load TsFile Data - -load '' [sglevel=int][onSuccess=delete/none] - -#### Load a Single TsFile by Specifying a File Path (Absolute Path) - -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile'` -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile' sglevel=1` -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile' onSuccess=delete` -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile' sglevel=1 onSuccess=delete` - - -#### Load Files in Batch by Specifying a Folder Path (Absolute Path) - -- `load '/Users/Desktop/data'` -- `load '/Users/Desktop/data' sglevel=1` -- `load '/Users/Desktop/data' onSuccess=delete` 
-- `load '/Users/Desktop/data' sglevel=1 onSuccess=delete` - -## 3. Deleting Data - -### 3.1 Delete Single-Column Data -```sql -delete from root.ln.wf02.wt02.status where time<=2017-11-01T16:26:00; -delete from root.ln.wf02.wt02.status where time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -delete from root.ln.wf02.wt02.status where time < 10; -delete from root.ln.wf02.wt02.status where time <= 10; -delete from root.ln.wf02.wt02.status where time < 20 and time > 10; -delete from root.ln.wf02.wt02.status where time <= 20 and time >= 10; -delete from root.ln.wf02.wt02.status where time > 20; -delete from root.ln.wf02.wt02.status where time >= 20; -delete from root.ln.wf02.wt02.status where time = 20; -``` -Error: -```sql -delete from root.ln.wf02.wt02.status where time > 4 or time < 0; -Msg: 303: Check metadata error: For delete statement, where clause can only contain atomic expressions like : time > XXX, time <= XXX, or two atomic expressions connected by 'AND' -``` - -Delete all data in a timeseries: -```sql -delete from root.ln.wf02.wt02.status; -``` -### 3.2 Delete Multi-Column Data -```sql -delete from root.ln.wf02.wt02.* where time <= 2017-11-01T16:26:00; -``` -Declarative programming style: -```sql -delete from root.ln.wf03.wt02.status where time < now(); -Msg: The statement is executed successfully. -``` -## 4. 
Data Query - -### 4.1 Basic Queries - -#### Query with a Time Filter -```sql -select temperature from root.ln.wf01.wt01 where time < 2017-11-01T00:08:00.000; -``` -#### Select Multiple Columns Within One Time Range -```sql -select status, temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; -``` -#### Select Multiple Columns of the Same Device Within Multiple Time Ranges -```sql -select status, temperature from root.ln.wf01.wt01 where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` -#### Select Multiple Columns of Different Devices Within Multiple Time Ranges -```sql -select wf01.wt01.status, wf02.wt02.hardware from root.ln where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` -#### Return the Result Set in Descending Time Order -```sql -select * from root.ln.** where time > 1 order by time desc limit 10; -``` -### 4.2 Select Expressions - -#### Using Aliases -```sql -select s1 as temperature, s2 as speed from root.ln.wf01.wt01; -``` -#### Operators - -#### Functions - -Not supported: -```sql -select s1, count(s1) from root.sg.d1; -select sin(s1), count(s1) from root.sg.d1; -select s1, count(s1) from root.sg.d1 group by ([10,100),10ms); -``` -##### Nested Expressions in Timeseries Queries - -Example 1: -```sql -select a, - b, - ((a + 1) * 2 - 1) % 2 + 1.5, - sin(a + sin(a + sin(b))), - -(a + b) * (sin(a + b) * sin(a + b) + cos(a + b) * cos(a + b)) + 1 -from root.sg1; -``` -Example 2: -```sql -select (a + b) * 2 + sin(a) from root.sg; -``` -Example 3: -```sql -select (a + *) / 2 from root.sg1; -``` -Example 4: -```sql -select (a + b) * 3 from root.sg, root.ln; -``` -##### Nested Expressions in Aggregation Queries - -Example 1: -```sql -select avg(temperature), - sin(avg(temperature)), - avg(temperature) + 1, - -sum(hardware), - avg(temperature) + sum(hardware) -from root.ln.wf01.wt01; -``` -Example 2: -```sql -select avg(*), - (avg(*) + 1) * 3 / 2 -1 -from root.sg1; -``` -Example 3: -```sql -select avg(temperature), - sin(avg(temperature)), - avg(temperature) + 1, - -sum(hardware), - avg(temperature) + sum(hardware) as custom_sum -from root.ln.wf01.wt01 
-GROUP BY([10, 90), 10ms); -``` -#### Last-Point Query - -SQL syntax: - -```sql -select last [COMMA ]* from < PrefixPath > [COMMA < PrefixPath >]* [ORDER BY TIMESERIES (DESC | ASC)?] -``` - -Query the latest data point of root.ln.wf01.wt01.status -```sql -select last status from root.ln.wf01.wt01; -``` -Query the latest data points of status and temperature under root.ln.wf01.wt01 whose timestamp is greater than or equal to 2017-11-07T23:50:00 -```sql -select last status, temperature from root.ln.wf01.wt01 where time >= 2017-11-07T23:50:00; -``` - Query the latest data points of all series under root.ln.wf01.wt01, sorted by series name in descending order -```sql -select last * from root.ln.wf01.wt01 order by timeseries desc; -``` -### 4.3 Query Filter Conditions - -#### Time Filter Conditions - -Select data with a timestamp greater than 2022-01-01T00:05:00.000: -```sql -select s1 from root.sg1.d1 where time > 2022-01-01T00:05:00.000; -``` -Select data with a timestamp equal to 2022-01-01T00:05:00.000: -```sql -select s1 from root.sg1.d1 where time = 2022-01-01T00:05:00.000; -``` -Select data within the time range [2017-11-01T00:05:00.000, 2017-11-01T00:12:00.000): -```sql -select s1 from root.sg1.d1 where time >= 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; -``` -#### Value Filter Conditions - -Select data with a value greater than 36.5: -```sql -select temperature from root.sg1.d1 where temperature > 36.5; -``` -Select data with a value equal to true: -```sql -select status from root.sg1.d1 where status = true; -``` -Select data inside or outside the interval [36.5,40]: -```sql -select temperature from root.sg1.d1 where temperature between 36.5 and 40; -``` -```sql -select temperature from root.sg1.d1 where temperature not between 36.5 and 40; -``` -Select data whose value is in a specific set: -```sql -select code from root.sg1.d1 where code in ('200', '300', '400', '500'); -``` -Select data whose value is outside a specific set: -```sql -select code from root.sg1.d1 where code not in ('200', '300', '400', '500'); -``` -Select data whose value is null: -```sql -select code from root.sg1.d1 where temperature is null; -``` -Select data whose value is not null: -```sql -select code from root.sg1.d1 where temperature is not null; -``` -#### Fuzzy Queries - -Query data under `root.sg.d1` whose `value` contains `'cc'` -```sql -select * from root.sg.d1 where value like '%cc%'; -``` -Query data under `root.sg.d1` whose `value` has `'b'` in the middle with any single character before and after it -```sql -select * from root.sg.device 
where value like '_b_'; -``` -Query data under root.sg.d1 whose value is a string consisting only of the 26 English letters -```sql -select * from root.sg.d1 where value regexp '^[A-Za-z]+$'; -``` - -Query data under root.sg.d1 whose value is a string consisting only of the 26 lowercase English letters and whose time is greater than 100 -```sql -select * from root.sg.d1 where value regexp '^[a-z]+$' and time > 100; -``` - -### 4.4 Segmented and Grouped Aggregation - -#### Time-Interval Grouped Aggregation Without a Sliding Step -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d); -``` -#### Time-Interval Grouped Aggregation With a Sliding Step -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d); -``` -The sliding step can be smaller than the aggregation window -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-01 10:00:00), 4h, 2h); -``` -#### Time-Interval Grouped Aggregation by Natural Month -```sql -select count(status) from root.ln.wf01.wt01 where time > 2017-11-01T01:00:00 group by([2017-11-01T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo); -``` -Every time-interval window contains data -```sql -select count(status) from root.ln.wf01.wt01 group by([2017-10-31T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo); -``` -#### Left-Open, Right-Closed Interval -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d); -``` -#### Combined Use with Grouped Aggregation - -Count the data points after downsampling -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d), level=1; -``` -Downsampled results with a sliding step can also be summarized -```sql -select count(status) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d), level=1; -``` -#### Aggregation Grouped by Path Level - -Count the data points of the status series under different databases -```sql -select count(status) from root.** group by level = 1; -``` - Count the data points of the status series under different devices -```sql -select count(status) from root.** group by level = 3; -``` -Count the data points of the status series in different devices under different databases -```sql -select count(status) from root.** group by level = 1, 3; -``` -Query the maximum value of the temperature sensor across all series -```sql -select max_value(temperature) from root.** group by level = 0; -``` -Query the total number of data points held by all sensors under a given level 
-```sql -select count(*) from root.ln.** group by level = 2; -``` -#### Tag-Grouped Aggregation - -##### Single-Tag Aggregation Query -```sql -SELECT AVG(temperature) FROM root.factory1.** GROUP BY TAGS(city); -``` -##### Multi-Tag Aggregation Query -```sql -SELECT avg(temperature) FROM root.factory1.** GROUP BY TAGS(city, workshop); -``` -##### Tag Aggregation Query Based on Time Intervals -```sql -SELECT AVG(temperature) FROM root.factory1.** GROUP BY ([1000, 10000), 5s), TAGS(city, workshop); -``` -#### Variation-Based Segmented Aggregation -```sql -group by variation(controlExpression[,delta][,ignoreNull=true/false]) -``` -##### Equal-Value Event Segmentation When delta=0 -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6); -``` -Set ignoreNull to false -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, ignoreNull=false); -``` -##### Variation Event Segmentation When delta!=0 -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, 4); -``` -#### Condition-Based Segmented Aggregation -```sql -group by condition(predict,[keep>/>=/=/<=/<]threshold,[,ignoreNull=true/false]) -``` -Query data where charging_status=1 holds for at least two consecutive rows -```sql -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoreNull=true); -``` -When `ignoreNull` is set to false, a null value is treated as a row that does not satisfy the condition, so the original groups are split by rows containing null -```sql -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoreNull=false); -``` -#### Session-Based Segmented Aggregation -```sql -group by session(timeInterval) -``` -Set the time interval using different time units -```sql -select __endTime,count(*) from root.** group by session(1d); -``` -Used together with `HAVING` and `ALIGN BY DEVICE` -```sql -select __endTime,sum(hardware) from root.ln.wf02.wt01 group by session(50s) having sum(hardware)>0 align by device; -``` -#### Count-Based Segmented Aggregation -```sql -group by count(controlExpression, size[,ignoreNull=true/false]) -``` -```sql -select count(charging_status), first_value(soc) from root.sg group by count(charging_status,5); -``` -When ignoreNull is used so that null values are also taken into account -```sql -select 
count(charging_status), first_value(soc) from root.sg group by count(charging_status,5,ignoreNull=false); -``` -### 4.5 Filtering Aggregation Results - -Incorrect: -```sql -select count(s1) from root.** group by ([1,3),1ms) having sum(s1) > s1; -select count(s1) from root.** group by ([1,3),1ms) having s1 > 1; -select count(s1) from root.** group by ([1,3),1ms), level=1 having sum(d1.s1) > 1; -select count(d1.s1) from root.** group by ([1,3),1ms), level=1 having sum(s1) > 1; -``` -SQL examples: -```sql - select count(s1) from root.** group by ([1,11),2ms), level=1 having count(s2) > 2; - select count(s1), count(s2) from root.** group by ([1,11),2ms) having count(s2) > 1 align by device; -``` -### 4.6 Filling Null Values in the Result Set -```sql -FILL '(' PREVIOUS | LINEAR | constant (, interval=DURATION_LITERAL)? ')' -``` -#### `PREVIOUS` Fill -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(previous); -``` -#### `PREVIOUS` Fill with a Fill Timeout Threshold -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(previous, 2m); -``` -#### `LINEAR` Fill -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(linear); -``` -#### Constant Fill -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(2.0); -``` -Fill with a constant of type `BOOLEAN` -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(true); -``` -### 4.7 Paginating Query Results - -#### Row-Wise Pagination - - Basic `LIMIT` clause -```sql -select status, temperature from root.ln.wf01.wt01 limit 10; -``` -`LIMIT` clause with `OFFSET` -```sql -select status, temperature from root.ln.wf01.wt01 limit 5 offset 3; -``` -`LIMIT` clause combined with a `WHERE` clause -```sql -select status,temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and 
time< 2017-11-01T00:12:00.000 limit 5 offset 3; -``` - `LIMIT` clause combined with a `GROUP BY` clause -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) limit 4 offset 3; -``` -#### Column-Wise Pagination - - Basic `SLIMIT` clause -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1; -``` -`SLIMIT` clause with `SOFFSET` -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1 soffset 1; -``` -`SLIMIT` clause combined with a `GROUP BY` clause -```sql -select max_value(*) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) slimit 1 soffset 1; -``` -`SLIMIT` clause combined with a `LIMIT` clause -```sql -select * from root.ln.wf01.wt01 limit 10 offset 100 slimit 2 soffset 0; -``` -### 4.8 Sorting - -Sorting in time-aligned mode -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time desc; -``` -Sorting in device-aligned mode -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by device desc,time asc align by device; -``` -Sort by device name when timestamps are equal -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time asc,device desc align by device; -``` -When no ordering is explicitly specified -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -``` -Sort aggregated results -```sql -select count(*) from root.ln.** group by ((2017-11-01T00:00:00.000+08:00,2017-11-01T00:03:00.000+08:00],1m) order by device asc,time asc align by device; -``` -### 4.9 Query Alignment Mode - -#### Align by Device -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -``` -### 4.10 Query Write-Back (SELECT INTO) - -#### Overview -```sql -selectIntoStatement - : SELECT - resultColumn [, resultColumn] ... - INTO intoItem [, intoItem] ... - FROM prefixPath [, prefixPath] ... 
- [WHERE whereCondition] - [GROUP BY groupByTimeClause, groupByLevelClause] - [FILL ({PREVIOUS | LINEAR | constant} (, interval=DURATION_LITERAL)?)] - [LIMIT rowLimit OFFSET rowOffset] - [ALIGN BY DEVICE] - ; - -intoItem - : [ALIGNED] intoDevicePath '(' intoMeasurementName [',' intoMeasurementName]* ')' - ; -``` -Aligned by time: write the query results of four series under the `root.sg` database into four specified series under the `root.sg_copy` database -```sql -select s1, s2 into root.sg_copy.d1(t1), root.sg_copy.d2(t1, t2), root.sg_copy.d1(t2) from root.sg.d1, root.sg.d2; -``` -Aligned by time: store the results of an aggregation query in the specified series -```sql -select count(s1 + s2), last_value(s2) into root.agg.count(s1_add_s2), root.agg.last_value(s2) from root.sg.d1 group by ([0, 100), 10ms); -``` -Aligned by device -```sql -select s1, s2 into root.sg_copy.d1(t1, t2), root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -``` -Aligned by device: store the results of expression evaluation in the specified series -```sql -select s1 + s2 into root.expr.add(d1s1_d1s2), root.expr.add(d2s1_d2s2) from root.sg.d1, root.sg.d2 align by device; -``` -#### Using Variable Placeholders - -##### Align by Time (Default) - -###### Target Device Without Variable Placeholders & Target Measurement List With Variable Placeholders -```sql -select s1, s2 -into root.sg_copy.d1(::), root.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::) -from root.sg.d1, root.sg.d2; -``` - -This statement is equivalent to: -```sql -select s1, s2 -into root.sg_copy.d1(s1), root.sg_copy.d2(s1), root.sg_copy.d1(s2), root.sg_copy.d2(s2) -from root.sg.d1, root.sg.d2; -``` - -###### Target Device With Variable Placeholders & Target Measurement List Without Variable Placeholders - -```sql -select d1.s1, d1.s2, d2.s3, d3.s4 -into ::(s1_1, s2_2), root.sg.d2_2(s3_3), root.${2}_copy.::(s4) -from root.sg; -``` - -###### Target Device With Variable Placeholders & Target Measurement List With Variable Placeholders - -```sql -select * into root.sg_bk.::(::) from root.sg.**; -``` - -##### Align by Device (Using `ALIGN BY DEVICE`) - -###### Target Device Without Variable Placeholders & Target Measurement List With Variable Placeholders -```sql -select s1, s2, s3, s4 -into root.backup_sg.d1(s1, s2, s3, s4), root.backup_sg.d2(::), root.sg.d3(backup_${4}) -from root.sg.d1, root.sg.d2, root.sg.d3 -align by device; -``` - -###### Target Device With Variable Placeholders & Target Measurement List Without Variable Placeholders -```sql -select avg(s1), sum(s2) + sum(s3), count(s4) -into 
root.agg_${2}.::(avg_s1, sum_s2_add_s3, count_s4) -from root.** -align by device; -``` - -###### Target Device With Variable Placeholders & Target Measurement List With Variable Placeholders -```sql -select * into ::(backup_${4}) from root.sg.** align by device; -``` - -#### Specifying the Target Series as Aligned Series -```sql -select s1, s2 into root.sg_copy.d1(t1, t2), aligned root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -``` -## 5. Maintenance Statements -Generate the corresponding query plan -```sql -explain select s1,s2 from root.sg.d1; -``` -Execute the corresponding query and obtain the analysis results -```sql -explain analyze select s1,s2 from root.sg.d1 order by s1; -``` - -For more maintenance statements, see [Maintenance Statements](../User-Manual/Maintenance-statement_timecho.md) - -## 6. Operators - -For more, see [Operator-and-Expression](./Operator-and-Expression.md) - -### 6.1 Arithmetic Operators - -For more, see [Arithmetic Operators and Functions](./Operator-and-Expression.md#_1-1-算数运算符) - -```sql -select s1, - s1, s2, + s2, s1 + s2, s1 - s2, s1 * s2, s1 / s2, s1 % s2 from root.sg.d1; -``` - -### 6.2 Comparison Operators - -For more, see [Comparison Operators and Functions](./Operator-and-Expression.md#_1-2-比较运算符) - -```sql -# Basic comparison operators; -select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test; - -# `BETWEEN ... 
AND ...` operator; -select temperature from root.sg1.d1 where temperature between 36.5 and 40; -select temperature from root.sg1.d1 where temperature not between 36.5 and 40; - -# Fuzzy matching operator: Use `Like` for fuzzy matching; -select * from root.sg.d1 where value like '%cc%'; -select * from root.sg.device where value like '_b_'; - -# Fuzzy matching operator: Use `Regexp` for fuzzy matching; -select * from root.sg.d1 where value regexp '^[A-Za-z]+$'; -select * from root.sg.d1 where value regexp '^[a-z]+$' and time > 100; -select b, b like '1%', b regexp '[0-2]' from root.test; - -# `IS NULL` operator; -select code from root.sg1.d1 where temperature is null; -select code from root.sg1.d1 where temperature is not null; - -# `IN` operator; -select code from root.sg1.d1 where code in ('200', '300', '400', '500'); -select code from root.sg1.d1 where code not in ('200', '300', '400', '500'); -select a, a in (1, 2) from root.test; -``` - -### 6.3 Logical Operators - -For details, see [Logical Operators](./Operator-and-Expression.md#_1-3-逻辑运算符) - -```sql -select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test; -``` - -## 7. 
内置函数 - -更多见文档[Operator-and-Expression](./Operator-and-Expression.md#_2-内置函数) - -### 7.1 Aggregate Functions - -更多见文档[Aggregate Functions](./Operator-and-Expression.md#_2-1-聚合函数) - -```sql -select count(status) from root.ln.wf01.wt01; - -select count_if(s1=0 & s2=0, 3), count_if(s1=1 & s2=0, 3) from root.db.d1; -select count_if(s1=0 & s2=0, 3, 'ignoreNull'='false'), count_if(s1=1 & s2=0, 3, 'ignoreNull'='false') from root.db.d1; - -select time_duration(s1) from root.db.d1; -``` - -### 7.2 算数函数 - -更多见文档[Arithmetic Operators and Functions](./Operator-and-Expression.md#_2-2-数学函数) - -```sql -select s1, sin(s1), cos(s1), tan(s1) from root.sg1.d1 limit 5 offset 1000; -select s4,round(s4),round(s4,2),round(s4,-1) from root.sg1.d1; -``` - -### 7.3 比较函数 - -更多见文档[Comparison Operators and Functions](./Operator-and-Expression.md#_2-3-比较函数) - -```sql -select ts, on_off(ts, 'threshold'='2') from root.test; -select ts, in_range(ts, 'lower'='2', 'upper'='3.1') from root.test; -``` - -### 7.4 字符串处理函数 - -更多见文档[String Processing](./Operator-and-Expression.md#_2-4-字符串函数) - -```sql -select s1, string_contains(s1, 's'='warn') from root.sg1.d4; -select s1, string_matches(s1, 'regex'='[^\\s]+37229') from root.sg1.d4; -select s1, length(s1) from root.sg1.d1; -select s1, locate(s1, "target"="1") from root.sg1.d1; -select s1, locate(s1, "target"="1", "reverse"="true") from root.sg1.d1; -select s1, startswith(s1, "target"="1") from root.sg1.d1; -select s1, endswith(s1, "target"="1") from root.sg1.d1; -select s1, s2, concat(s1, s2, "target1"="IoT", "target2"="DB") from root.sg1.d1; -select s1, s2, concat(s1, s2, "target1"="IoT", "target2"="DB", "series_behind"="true") from root.sg1.d1; -select s1, substring(s1 from 1 for 2) from root.sg1.d1; -select s1, replace(s1, 'es', 'tt') from root.sg1.d1; -select s1, upper(s1) from root.sg1.d1; -select s1, lower(s1) from root.sg1.d1; -select s3, trim(s3) from root.sg1.d1; -select s1, s2, strcmp(s1, s2) from root.sg1.d1; -select strreplace(s1, 
"target"=",", "replace"="/", "limit"="2") from root.test.d1; -select strreplace(s1, "target"=",", "replace"="/", "limit"="1", "offset"="1", "reverse"="true") from root.test.d1; -select regexmatch(s1, "regex"="\d+\.\d+\.\d+\.\d+", "group"="0") from root.test.d1; -select regexreplace(s1, "regex"="192\.168\.0\.(\d+)", "replace"="cluster-$1", "limit"="1") from root.test.d1; -select regexsplit(s1, "regex"=",", "index"="-1") from root.test.d1; -select regexsplit(s1, "regex"=",", "index"="3") from root.test.d1; -``` - -### 7.5 数据类型转换函数 - -更多见文档[Data Type Conversion Function](./Operator-and-Expression.md#_2-5-数据类型转换函数) - -```sql -SELECT cast(s1 as INT32) from root.sg; -``` - -### 7.6 常序列生成函数 - -更多见文档[Constant Timeseries Generating Functions](./Operator-and-Expression.md#_2-6-常序列生成函数) - -```sql -select s1, s2, const(s1, 'value'='1024', 'type'='INT64'), pi(s2), e(s1, s2) from root.sg1.d1; -``` - -### 7.7 选择函数 - -更多见文档[Selector Functions](./Operator-and-Expression.md#_2-7-选择函数) - -```sql -select s1, top_k(s1, 'k'='2'), bottom_k(s1, 'k'='2') from root.sg1.d2 where time > 2020-12-10T20:36:15.530+08:00; -``` - -### 7.8 区间查询函数 - -更多见文档[Continuous Interval Functions](./Operator-and-Expression.md#_2-8-区间查询函数) - -```sql -select s1, zero_count(s1), non_zero_count(s2), zero_duration(s3), non_zero_duration(s4) from root.sg.d2; -``` - -### 7.9 趋势计算函数 - -更多见文档[Variation Trend Calculation Functions](./Operator-and-Expression.md#_2-9-趋势计算函数) - -```sql -select s1, time_difference(s1), difference(s1), non_negative_difference(s1), derivative(s1), non_negative_derivative(s1) from root.sg1.d1 limit 5 offset 1000; - -SELECT DIFF(s1), DIFF(s2) from root.test; -SELECT DIFF(s1, 'ignoreNull'='false'), DIFF(s2, 'ignoreNull'='false') from root.test; -``` - -### 7.10 采样函数 - -更多见文档[Sample Functions](./Operator-and-Expression.md#_2-10-采样函数)。 - -```sql -select equal_size_bucket_random_sample(temperature,'proportion'='0.1') as random_sample from root.ln.wf01.wt01; -select 
equal_size_bucket_agg_sample(temperature, 'type'='avg','proportion'='0.1') as agg_avg, equal_size_bucket_agg_sample(temperature, 'type'='max','proportion'='0.1') as agg_max, equal_size_bucket_agg_sample(temperature,'type'='min','proportion'='0.1') as agg_min, equal_size_bucket_agg_sample(temperature, 'type'='sum','proportion'='0.1') as agg_sum, equal_size_bucket_agg_sample(temperature, 'type'='extreme','proportion'='0.1') as agg_extreme, equal_size_bucket_agg_sample(temperature, 'type'='variance','proportion'='0.1') as agg_variance from root.ln.wf01.wt01; -select equal_size_bucket_m4_sample(temperature, 'proportion'='0.1') as M4_sample from root.ln.wf01.wt01; -select equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='avg', 'number'='2') as outlier_avg_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='stendis', 'number'='2') as outlier_stendis_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='cos', 'number'='2') as outlier_cos_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='prenextdis', 'number'='2') as outlier_prenextdis_sample from root.ln.wf01.wt01; - -select M4(s1,'timeInterval'='25','displayWindowBegin'='0','displayWindowEnd'='100') from root.vehicle.d1; -select M4(s1,'windowSize'='10') from root.vehicle.d1; -``` - -### 7.11 Time Series Processing Functions - -For details, see [Time-Series](./Operator-and-Expression.md#_2-11-时间序列处理函数) - -```sql -select change_points(s1), change_points(s2), change_points(s3), change_points(s4), change_points(s5), change_points(s6) from root.testChangePoints.d1; -``` - -## 8. 
Data Quality Function Library - -For details, see [UDF-Libraries](../SQL-Manual/UDF-Libraries.md) - -### 8.1 Data Quality - -For details, see [Data-Quality](../SQL-Manual/UDF-Libraries.md#数据质量) - -```sql -# Completeness; -select completeness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select completeness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Consistency; -select consistency(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select consistency(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Timeliness; -select timeliness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select timeliness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Validity; -select Validity(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select Validity(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Accuracy; -select Accuracy(t1,t2,t3,m1,m2,m3) from root.test; -``` - -### 8.2 Data Profiling - -For details, see [Data-Profiling](../SQL-Manual/UDF-Libraries.md#数据画像) - -```sql -# ACF; -select acf(s1) from root.test.d1 where time <= 2020-01-01 00:00:05; - -# Distinct; -select distinct(s2) from root.test.d2; - -# Histogram; -select histogram(s1,"min"="1","max"="20","count"="10") from root.test.d1; - -# Integral; -select integral(s1) from root.test.d1 where time <= 2020-01-01 00:00:10; -select integral(s1, "unit"="1m") from root.test.d1 where time <= 2020-01-01 00:00:10; - -# IntegralAvg; -select integralavg(s1) from root.test.d1 where time <= 2020-01-01 00:00:10; - -# Mad; -select mad(s0) from root.test; -select mad(s0, "error"="0.01") from root.test; - -# Median; -select median(s0, "error"="0.01") from root.test; - -# MinMax; -select minmax(s1) from root.test; - -# Mode; -select mode(s2) from root.test.d2; - -# MvAvg; -select mvavg(s1, "window"="3") from root.test; - -# PACF; -select pacf(s1, "lag"="5") from root.test; - -# Percentile; -select percentile(s0, "rank"="0.2", "error"="0.01") from root.test; - -# 
Quantile; -select quantile(s0, "rank"="0.2", "K"="800") from root.test; - -# Period; -select period(s1) from root.test.d3; - -# QLB; -select QLB(s1) from root.test.d1; - -# Resample; -select resample(s1,'every'='5m','interp'='linear') from root.test.d1; -select resample(s1,'every'='30m','aggr'='first') from root.test.d1; -select resample(s1,'every'='30m','start'='2021-03-06 15:00:00') from root.test.d1; - -# Sample; -select sample(s1,'method'='reservoir','k'='5') from root.test.d1; -select sample(s1,'method'='isometric','k'='5') from root.test.d1; - -# Segment; -select segment(s1, "error"="0.1") from root.test; - -# Skew; -select skew(s1) from root.test.d1; - -# Spline; -select spline(s1, "points"="151") from root.test; - -# Spread; -select spread(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; - -# Stddev; -select stddev(s1) from root.test.d1; - -# ZScore; -select zscore(s1) from root.test; -``` - -### 8.3 异常检测 - -更多见文档[Anomaly-Detection](../SQL-Manual/UDF-Libraries.md#异常检测) - -```sql -# IQR; -select iqr(s1) from root.test; - -# KSigma; -select ksigma(s1,"k"="1.0") from root.test.d1 where time <= 2020-01-01 00:00:30; - -# LOF; -select lof(s1,s2) from root.test.d1 where time<1000; -select lof(s1, "method"="series") from root.test.d1 where time<1000; - -# MissDetect; -select missdetect(s2,'minlen'='10') from root.test.d2; - -# Range; -select range(s1,"lower_bound"="101.0","upper_bound"="125.0") from root.test.d1 where time <= 2020-01-01 00:00:30; - -# TwoSidedFilter; -select TwoSidedFilter(s0, 'len'='5', 'threshold'='0.3') from root.test; - -# Outlier; -select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test; - -# MasterTrain; -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test; - -# MasterDetect; -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test; -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test; -``` - 
-### 8.4 Frequency Domain Analysis - -For details, see [Frequency-Domain](../SQL-Manual/UDF-Libraries.md#频域分析) - -```sql -# Conv; -select conv(s1,s2) from root.test.d2; - -# Deconv; -select deconv(s3,s2) from root.test.d2; -select deconv(s3,s2,'result'='remainder') from root.test.d2; - -# DWT; -select dwt(s1,"method"="haar") from root.test.d1; - -# FFT; -select fft(s1) from root.test.d1; -select fft(s1, 'result'='real', 'compress'='0.99'), fft(s1, 'result'='imag','compress'='0.99') from root.test.d1; - -# HighPass; -select highpass(s1,'wpass'='0.45') from root.test.d1; - -# IFFT; -select ifft(re, im, 'interval'='1m', 'start'='2021-01-01 00:00:00') from root.test.d1; - -# LowPass; -select lowpass(s1,'wpass'='0.45') from root.test.d1; - -# Envelope; -select envelope(s1) from root.test.d1; -``` - -### 8.5 Data Matching - -For details, see [Data-Matching](../SQL-Manual/UDF-Libraries.md#数据匹配) - -```sql -# Cov; -select cov(s1,s2) from root.test.d2; - -# DTW; -select dtw(s1,s2) from root.test.d2; - -# Pearson; -select pearson(s1,s2) from root.test.d2; - -# PtnSym; -select ptnsym(s4, 'window'='5', 'threshold'='0') from root.test.d1; - -# XCorr; -select xcorr(s1, s2) from root.test.d1 where time <= 2020-01-01 00:00:05; -``` - -### 8.6 Data Repair - -For details, see [Data-Repairing](../SQL-Manual/UDF-Libraries.md#数据修复) - -```sql -# TimestampRepair; -select timestamprepair(s1,'interval'='10000') from root.test.d2; -select timestamprepair(s1) from root.test.d2; - -# ValueFill; -select valuefill(s1) from root.test.d2; -select valuefill(s1,"method"="previous") from root.test.d2; - -# ValueRepair; -select valuerepair(s1) from root.test.d2; -select valuerepair(s1,'method'='LsGreedy') from root.test.d2; - -# MasterRepair; -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test; - -# SeasonalRepair; -select seasonalrepair(s1,'period'=3,'k'=2) from root.test.d2; -select seasonalrepair(s1,'method'='improved','period'=3) from root.test.d2; -``` - -### 8.7 Series Discovery - -For details, see [Series-Discovery](../SQL-Manual/UDF-Libraries.md#序列发现) - -```sql -# ConsecutiveSequences; 
-select consecutivesequences(s1,s2,'gap'='5m') from root.test.d1; -select consecutivesequences(s1,s2) from root.test.d1; - -# ConsecutiveWindows; -select consecutivewindows(s1,s2,'length'='10m') from root.test.d1; -``` - -### 8.8 机器学习 - -更多见文档[Machine-Learning](../SQL-Manual/UDF-Libraries.md#机器学习) - -```sql -# AR; -select ar(s0,"p"="2") from root.test.d0; - -# Representation; -select representation(s0,"tb"="3","vb"="2") from root.test.d0; - -# RM; -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0; -``` - -## 9. 条件表达式 - -更多见文档[Conditional Expressions](./Operator-and-Expression.md#_3-条件表达式) - -```sql -select T, P, case -when 1000=1050 then "bad temperature" -when P<=1000000 or P>=1100000 then "bad pressure" -end as `result` -from root.test1; - -select str, case -when str like "%cc%" then "has cc" -when str like "%dd%" then "has dd" -else "no cc and dd" end as `result` -from root.test2; - -select -count(case when x<=1 then 1 end) as `(-∞,1]`, -count(case when 1 -[RESAMPLE - [EVERY ] - [BOUNDARY ] - [RANGE [, end_time_offset]] -] -[TIMEOUT POLICY BLOCKED|DISCARD] -BEGIN - SELECT CLAUSE - INTO CLAUSE - FROM CLAUSE - [WHERE CLAUSE] - [GROUP BY([, ]) [, level = ]] - [HAVING CLAUSE] - [FILL ({PREVIOUS | LINEAR | constant} (, interval=DURATION_LITERAL)?)] - [LIMIT rowLimit OFFSET rowOffset] - [ALIGN BY DEVICE] -END -``` - -#### 配置连续查询执行的周期性间隔 -```sql -CREATE CONTINUOUS QUERY cq1 -RESAMPLE EVERY 20s -BEGIN - SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) -END; - -SELECT temperature_max from root.ln.*.*; -``` -#### 配置连续查询的时间窗口大小 -```sql -CREATE CONTINUOUS QUERY cq2 -RESAMPLE RANGE 40s -BEGIN - SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) 
-END; - -SELECT temperature_max from root.ln.*.*; -``` -#### 同时配置连续查询执行的周期性间隔和时间窗口大小 -```sql -CREATE CONTINUOUS QUERY cq3 -RESAMPLE EVERY 20s RANGE 40s -BEGIN - SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) - FILL(100.0) -END; - -SELECT temperature_max from root.ln.*.*; -``` -#### 配置连续查询每次查询执行时间窗口的结束时间 -```sql -CREATE CONTINUOUS QUERY cq4 -RESAMPLE EVERY 20s RANGE 40s, 20s -BEGIN - SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) - FILL(100.0) -END; - -SELECT temperature_max from root.ln.*.*; -``` -#### 没有GROUP BY TIME子句的连续查询 -```sql -CREATE CONTINUOUS QUERY cq5 -RESAMPLE EVERY 20s -BEGIN - SELECT temperature + 1 - INTO root.precalculated_sg.::(temperature) - FROM root.ln.*.* - align by device -END; - -SELECT temperature from root.precalculated_sg.*.* align by device; -``` -### 11.2 连续查询的管理 - -#### 查询系统已有的连续查询 - -展示集群中所有的已注册的连续查询 -```sql -SHOW (CONTINUOUS QUERIES | CQS) -``` -```sql -SHOW CONTINUOUS QUERIES; -``` -#### 删除已有的连续查询 - -删除指定的名为cq_id的连续查询: - -```sql -DROP (CONTINUOUS QUERY | CQ) -``` -```sql -DROP CONTINUOUS QUERY s1_count_cq; -``` -#### 作为子查询的替代品 - -1. 创建一个连续查询 -```sql -CREATE CQ s1_count_cq -BEGIN - SELECT count(s1) - INTO root.sg_count.d.count_s1 - FROM root.sg.d - GROUP BY(30m) -END; -``` -1. 查询连续查询的结果 -```sql -SELECT avg(count_s1) from root.sg_count.d; -``` -## 12. 用户自定义函数 - -### 12.1 UDFParameters -```sql -SELECT UDF(s1, s2, 'key1'='iotdb', 'key2'='123.45') FROM root.sg.d; -``` -### 12.2 UDF 注册 - -```sql -CREATE FUNCTION AS (USING URI URI-STRING)? 
``` - -#### Without Specifying a URI -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample'; -``` -#### Specifying a URI -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' USING URI 'http://jar/example.jar'; -``` -### 12.3 UDF Deregistration - -```sql -DROP FUNCTION -``` -```sql -DROP FUNCTION example; -``` -### 12.4 UDF Queries - -#### Queries with Custom Input Parameters -```sql -SELECT example(s1, 'key1'='value1', 'key2'='value2'), example(*, 'key3'='value3') FROM root.sg.d1; -``` -```sql -SELECT example(s1, s2, 'key1'='value1', 'key2'='value2') FROM root.sg.d1; -``` -#### Nested Queries with Other Queries -```sql -SELECT s1, s2, example(s1, s2) FROM root.sg.d1; -SELECT *, example(*) FROM root.sg.d1 DISABLE ALIGN; -SELECT s1 * example(* / s1 + s2) FROM root.sg.d1; -SELECT s1, s2, s1 + example(s1, s2), s1 - example(s1 + example(s1, s2) / s2) FROM root.sg.d1; -``` -### 12.5 Show All Registered UDFs -```sql -SHOW FUNCTIONS; -``` -## 13. Permission Management - -### 13.1 Users and Roles - -- Create a user (requires the MANAGE_USER privilege) - -```SQL -CREATE USER ; -eg: CREATE USER user1 'passwd'; -``` - -- Delete a user (requires the MANAGE_USER privilege) - -```SQL -DROP USER ; -eg: DROP USER user1; -``` - -- Create a role (requires the MANAGE_ROLE privilege) - -```SQL -CREATE ROLE ; -eg: CREATE ROLE role1; -``` - -- Delete a role (requires the MANAGE_ROLE privilege) - -```SQL -DROP ROLE ; -eg: DROP ROLE role1; -``` - -- Grant a role to a user (requires the MANAGE_ROLE privilege) - -```SQL -GRANT ROLE TO ; -eg: GRANT ROLE admin TO user1; -``` - -- Revoke a role from a user (requires the MANAGE_ROLE privilege) - -```SQL -REVOKE ROLE FROM ; -eg: REVOKE ROLE admin FROM user1; -``` - -- List all users (requires the MANAGE_USER privilege) - -```SQL -LIST USER; -``` - -- List all roles (requires the MANAGE_ROLE privilege) - -```SQL -LIST ROLE; -``` - -- List all users that hold a specified role (requires the MANAGE_USER privilege) - -```SQL -LIST USER OF ROLE ; -eg: LIST USER OF ROLE roleuser; -``` - -- List all roles of a specified user - -Users can list their own roles, but listing another user's roles requires the MANAGE_ROLE privilege. - -```SQL -LIST ROLE OF USER ; -eg: LIST ROLE OF USER tempuser; -``` - -- List all privileges of a user - -Users can list their own privileges, but listing another user's privileges requires the MANAGE_USER privilege. - -```SQL -LIST PRIVILEGES OF USER ; -eg: LIST PRIVILEGES OF USER tempuser; - -``` - -- List all privileges of a role - -Users can list the privileges of the roles they hold; listing the privileges of other roles requires the MANAGE_ROLE privilege. - 
-```SQL -LIST PRIVILEGES OF ROLE ; -eg: LIST PRIVILEGES OF ROLE actor; -``` - -- Change a password - -Users can change their own password, but changing another user's password requires the MANAGE_USER privilege. - -```SQL -ALTER USER SET PASSWORD ; -eg: ALTER USER tempuser SET PASSWORD 'newpwd'; -``` - -### 13.2 Granting and Revoking Privileges - -Users grant privileges to other users with the GRANT statement. The syntax is as follows: - -```SQL -GRANT ON TO ROLE/USER [WITH GRANT OPTION]; -eg: GRANT READ ON root.** TO ROLE role1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.** TO USER user1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.**,root.t2.** TO USER user1; -eg: GRANT MANAGE_ROLE ON root.** TO USER user1 WITH GRANT OPTION; -eg: GRANT ALL ON root.** TO USER user1 WITH GRANT OPTION; -``` - -Users revoke privileges from other users or roles with the REVOKE statement. The syntax is as follows: - -```SQL -REVOKE ON FROM ROLE/USER ; -eg: REVOKE READ ON root.** FROM ROLE role1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.** FROM USER user1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.**, root.t2.** FROM USER user1; -eg: REVOKE MANAGE_ROLE ON root.** FROM USER user1; -eg: REVOKE ALL ON root.** FROM USER user1; -``` - diff --git a/src/zh/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md b/src/zh/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md deleted file mode 100644 index b26f6b668..000000000 --- a/src/zh/UserGuide/Master/Tree/SQL-Manual/UDF-Libraries_timecho.md +++ /dev/null @@ -1,5007 +0,0 @@ - -# UDF Function Library - -Building on its user-defined function capability, IoTDB provides a set of functions for time series data processing, covering data quality, data profiling, anomaly detection, frequency domain analysis, data matching, data repair, series discovery, and machine learning, to meet the time series processing needs of industrial applications. - -> Note: The functions in the current UDF library support only millisecond-level timestamp precision. - -## 1. Installation Steps -1. Obtain the compressed package of the UDF library JAR that is compatible with your IoTDB version. - - | UDF Package | Supported IoTDB Versions | Download Link | - | --------------- | ----------------- | ------------------------------------------------------------ | - | TimechoDB-UDF-1.3.3.zip | V1.3.3 and above | Contact Timecho business staff to obtain it | - | TimechoDB-UDF-1.3.2.zip | V1.0.0~V1.3.2 | Contact Timecho business staff to obtain it | - -2. Place the `library-udf.jar` file from the package in the `/ext/udf` directory on every node of the IoTDB cluster -3. In the IoTDB SQL command-line interface (CLI) or the SQL console of the visual console (Workbench), execute the corresponding function registration statements below. -4. 
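Batch registration: two methods are available, a registration script or a consolidated SQL file -- Registration script - - Copy the registration script from the package (`register-UDF.sh` or `register-UDF.bat`) to the IoTDB tools directory as needed, and modify the parameters in the script (defaults: host=127.0.0.1, rpcPort=6667, user=root, pass=root); - - Start the IoTDB service and run the registration script to register the UDFs in batch - -- Consolidated SQL statements - - Open the SQL file in the package, copy all of the SQL statements, and execute them in the IoTDB SQL command-line interface (CLI) or the SQL console of the visual console (Workbench) to register the UDFs in batch - -## 2. Data Quality - -### 2.1 Completeness - -#### Registration Statement - -```sql -create function completeness as 'org.apache.iotdb.library.dquality.UDTFCompleteness' -``` - -#### Function Overview - -This function computes the completeness of a time series, which measures whether a segment of time series data has missing points. The function splits the input series into consecutive, non-overlapping time windows, computes the completeness of each window, and outputs the timestamp of the first data point in the window together with the completeness result. - -**Function name:** COMPLETENESS - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The window size, given either as an integer greater than 0 or as a positive number with a unit. The former is the number of data points in each window (the last window may contain fewer points); the latter is the time span of the window. Five units are currently supported: 'ms' (milliseconds), 's' (seconds), 'm' (minutes), 'h' (hours), and 'd' (days). By default, all input data belongs to a single window. -+ `downtime`: Whether the completeness computation accounts for downtime anomalies. The value is 'true' or 'false', and the default is 'true'. When downtime is considered, long periods of missing data are treated as downtime and do not affect completeness. - -**Output series:** A single output series of type DOUBLE, in which every value lies in the range [0,1]. - -**Note:** Completeness is computed only when a window contains more than 10 data points. Otherwise, the window is ignored and produces no output. - - -#### Examples - -##### Default Parameters - -With default parameters, the function treats all input data as a single window when computing completeness. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL statement for the query: - -```sql -select completeness(s1) from root.test.d1 where time <= 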
2020-01-01 00:00:30 -``` - -输出序列: - -``` -+-----------------------------+-----------------------------+ -| Time|completeness(root.test.d1.s1)| -+-----------------------------+-----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -+-----------------------------+-----------------------------+ -``` - -##### 指定窗口大小 - -在指定窗口大小的情况下,本函数会把输入数据划分为若干个窗口计算完整性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select completeness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------+ -| 
Time|completeness(root.test.d1.s1, "window"="15")| -+-----------------------------+--------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+--------------------------------------------+ -``` - -### 2.2 Consistency - -#### Registration Statement - -```sql -create function consistency as 'org.apache.iotdb.library.dquality.UDTFConsistency' -``` - -#### Function Overview - -This function computes the consistency of a time series, which measures whether the data changes smoothly and follows a uniform pattern. The function splits the input series into consecutive, non-overlapping time windows, computes the consistency of each window, and outputs the timestamp of the first data point in the window together with the consistency result. - -**Function name:** CONSISTENCY - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The window size, given either as an integer greater than 0 or as a positive number with a unit. The former is the number of data points in each window (the last window may contain fewer points); the latter is the time span of the window. Five units are currently supported: 'ms' (milliseconds), 's' (seconds), 'm' (minutes), 'h' (hours), and 'd' (days). By default, all input data belongs to a single window. - -**Output series:** A single output series of type DOUBLE, in which every value lies in the range [0,1]. - -**Note:** Consistency is computed only when a window contains more than 10 data points. Otherwise, the window is ignored and produces no output. - - -#### Examples - -##### Default Parameters - -With default parameters, the function treats all input data as a single window when computing consistency. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL statement for the query: - -```sql -select consistency(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|consistency(root.test.d1.s1)| 
-+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+----------------------------+ -``` - -##### 指定窗口大小 - -在指定窗口大小的情况下,本函数会把输入数据划分为若干个窗口计算一致性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select consistency(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------+ -| Time|consistency(root.test.d1.s1, "window"="15")| -+-----------------------------+-------------------------------------------+ 
-|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+-------------------------------------------+ -``` - -### 2.3 Timeliness - -#### Registration Statement - -```sql -create function timeliness as 'org.apache.iotdb.library.dquality.UDTFTimeliness' -``` - -#### Function Overview - -This function computes the timeliness of a time series, which measures whether the data is collected and reported on time. The function splits the input series into consecutive, non-overlapping time windows, computes the timeliness of each window, and outputs the timestamp of the first data point in the window together with the timeliness result. - -**Function name:** TIMELINESS - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The window size, given either as an integer greater than 0 or as a positive number with a unit. The former is the number of data points in each window (the last window may contain fewer points); the latter is the time span of the window. Five units are currently supported: 'ms' (milliseconds), 's' (seconds), 'm' (minutes), 'h' (hours), and 'd' (days). By default, all input data belongs to a single window. - -**Output series:** A single output series of type DOUBLE, in which every value lies in the range [0,1]. - -**Note:** Timeliness is computed only when a window contains more than 10 data points. Otherwise, the window is ignored and produces no output. - - -#### Examples - -##### Default Parameters - -With default parameters, the function treats all input data as a single window when computing timeliness. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL statement for the query: - -```sql -select timeliness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------+ -| Time|timeliness(root.test.d1.s1)| -+-----------------------------+---------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| 
-+-----------------------------+---------------------------+ -``` - -##### 指定窗口大小 - -在指定窗口大小的情况下,本函数会把输入数据划分为若干个窗口计算时效性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select timeliness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------+ -| Time|timeliness(root.test.d1.s1, "window"="15")| -+-----------------------------+------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| 
-+-----------------------------+------------------------------------------+ -``` - -### 2.4 Validity - -#### 注册语句 - -```sql -create function validity as 'org.apache.iotdb.library.dquality.UDTFValidity' -``` - -#### 函数简介 - -本函数用于计算时间序列的有效性,用来衡量时序数据是否正常、可用、无异常值。函数会把输入的时序数据分成连续不重叠的时间窗口,分别计算每个窗口的数据有效性,并输出窗口第一个数据点的时间戳和有效性结果。 - - -**函数名:** VALIDITY - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `window`:窗口大小,它是一个大于0的整数或者一个有单位的正数。前者代表每一个窗口包含的数据点数目,最后一个窗口的数据点数目可能会不足;后者代表窗口的时间跨度,目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。缺省情况下,全部输入数据都属于同一个窗口。 - -**输出序列:** 输出单个序列,类型为DOUBLE,其中每一个数据点的值的范围都是 [0,1]. - -**提示:** 只有当窗口内的数据点数目超过10时,才会进行有效性计算。否则,该窗口将被忽略,不做任何输出。 - - -#### 使用示例 - -##### 参数缺省 - -在参数缺省的情况下,本函数将会把全部输入数据都作为同一个窗口计算有效性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select validity(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -输出序列: - -``` -+-----------------------------+-------------------------+ -| Time|validity(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -+-----------------------------+-------------------------+ -``` - -##### 指定窗口大小 - -在指定窗口大小的情况下,本函数会把输入数据划分为若干个窗口计算有效性。 - -输入序列: 
- -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select validity(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -输出序列: - -``` -+-----------------------------+----------------------------------------+ -| Time|validity(root.test.d1.s1, "window"="15")| -+-----------------------------+----------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - - - - -## 3. 
Data Profiling - -### 3.1 ACF - -#### Registration statement - -```sql -create function acf as 'org.apache.iotdb.library.dprofile.UDTFACF' -``` - -#### Usage - -This function computes the autocorrelation function of a time series, i.e. the cross-correlation of the series with itself. - -**Name:** ACF - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Outputs a single series of type DOUBLE containing $2N-1$ data points, where $N$ is the number of input points. - -**Note:** - -+ `NaN` values in the series are ignored and treated as 0 in the calculation. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select acf(s1) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|acf(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 6.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -|1970-01-01T08:00:00.004+08:00| 7.0| -|1970-01-01T08:00:00.005+08:00| 0.0| -|1970-01-01T08:00:00.006+08:00| 3.6| -|1970-01-01T08:00:00.007+08:00| 0.0| -|1970-01-01T08:00:00.008+08:00| 1.0| -+-----------------------------+--------------------+ -``` - -### 3.2 Distinct - -#### Registration statement - -```sql -create function distinct as 'org.apache.iotdb.library.dprofile.UDTFDistinct' -``` - -#### Usage - -This function returns all distinct values that appear in the input series. - -**Name:** DISTINCT - -**Input Series:** Only a single input series is supported. The type is arbitrary. - -**Output Series:** Outputs a single series of the same type as the input. - -**Note:** - -+ The timestamps of the output series are meaningless. The output order is arbitrary. -+ Missing values and null values are ignored, but `NaN` is not ignored. -+ Strings are case-sensitive. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2020-01-01T08:00:00.001+08:00| Hello| -|2020-01-01T08:00:00.002+08:00| hello| -|2020-01-01T08:00:00.003+08:00| Hello| 
-|2020-01-01T08:00:00.004+08:00| World| -|2020-01-01T08:00:00.005+08:00| World| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select distinct(s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|distinct(root.test.d2.s2)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.001+08:00| Hello| -|1970-01-01T08:00:00.002+08:00| hello| -|1970-01-01T08:00:00.003+08:00| World| -+-----------------------------+-------------------------+ -``` - -### 3.3 Histogram - -#### Registration statement - -```sql -create function histogram as 'org.apache.iotdb.library.dprofile.UDTFHistogram' -``` - -#### Usage - -This function computes the distribution histogram of a single column of numerical data. - -**Name:** HISTOGRAM - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `min`: The lower limit of the requested data range. The default value is -Double.MAX_VALUE. -+ `max`: The upper limit of the requested data range. The default value is Double.MAX_VALUE. The value of `min` must be less than or equal to `max`. -+ `count`: The number of histogram buckets. It must be a positive integer; the default value is 1. - -**Output Series:** The value of each histogram bucket, where the i-th bucket (counting from 1) covers the range from the lower bound $min+ (i-1)\cdot\frac{max-min}{count}$ to the upper bound $min+ i \cdot \frac{max-min}{count}$. - - -**Note:** - -+ A data point smaller than `min` is put into the 1st bucket; a data point larger than `max` is put into the last bucket. -+ Null values, missing values and `NaN` in the data are ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 11.0| -|2020-01-01T00:00:11.000+08:00| 12.0| -|2020-01-01T00:00:12.000+08:00| 13.0| -|2020-01-01T00:00:13.000+08:00| 14.0| -|2020-01-01T00:00:14.000+08:00| 15.0| -|2020-01-01T00:00:15.000+08:00| 16.0| 
-|2020-01-01T00:00:16.000+08:00| 17.0| -|2020-01-01T00:00:17.000+08:00| 18.0| -|2020-01-01T00:00:18.000+08:00| 19.0| -|2020-01-01T00:00:19.000+08:00| 20.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select histogram(s1,"min"="1","max"="20","count"="10") from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------------------------------+ -| Time|histogram(root.test.d1.s1, "min"="1", "max"="20", "count"="10")| -+-----------------------------+---------------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 2| -|1970-01-01T08:00:00.001+08:00| 2| -|1970-01-01T08:00:00.002+08:00| 2| -|1970-01-01T08:00:00.003+08:00| 2| -|1970-01-01T08:00:00.004+08:00| 2| -|1970-01-01T08:00:00.005+08:00| 2| -|1970-01-01T08:00:00.006+08:00| 2| -|1970-01-01T08:00:00.007+08:00| 2| -|1970-01-01T08:00:00.008+08:00| 2| -|1970-01-01T08:00:00.009+08:00| 2| -+-----------------------------+---------------------------------------------------------------+ -``` - -### 3.4 Integral - -#### 注册语句 - -```sql -create function integral as 'org.apache.iotdb.library.dprofile.UDAFIntegral' -``` - -#### 函数简介 - -本函数用于计算时间序列的数值积分,即以时间为横坐标、数值为纵坐标绘制的折线图中折线以下的面积。 - -**函数名:** INTEGRAL - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `unit`:积分求解所用的时间轴单位,取值为 "1S", "1s", "1m", "1H", "1d"(区分大小写),分别表示以毫秒、秒、分钟、小时、天为单位计算积分。 - 缺省情况下取 "1s",以秒为单位。 - -**输出序列:** 输出单个序列,类型为 DOUBLE,序列仅包含一个时间戳为 0、值为积分结果的数据点。 - -**提示:** - -+ 积分值等于折线图中每相邻两个数据点和时间轴形成的直角梯形的面积之和,不同时间单位下相当于横轴进行不同倍数放缩,得到的积分值可直接按放缩倍数转换。 - -+ 数据中`NaN`将会被忽略。折线将以临近两个有值数据点为准。 - -#### 使用示例 - -##### 参数缺省 - -缺省情况下积分以1s为时间单位。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| 2| -|2020-01-01T00:00:03.000+08:00| 5| -|2020-01-01T00:00:04.000+08:00| 6| -|2020-01-01T00:00:05.000+08:00| 7| 
-|2020-01-01T00:00:08.000+08:00| 8| -|2020-01-01T00:00:09.000+08:00| NaN| -|2020-01-01T00:00:10.000+08:00| 10| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select integral(s1) from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|integral(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.000+08:00| 57.5| -+-----------------------------+-------------------------+ -``` - -It is calculated as: -$$\frac{1}{2}[(1+2)\times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 57.5$$ - - -##### Specifying the time unit - -The time unit is set to minutes. - - -The input series is the same as above. SQL for query: - -```sql -select integral(s1, "unit"="1m") from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|integral(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.958| -+-----------------------------+-------------------------+ -``` - -It is calculated as: -$$\frac{1}{2\times 60}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] \approx 0.958$$ - -### 3.5 IntegralAvg - -#### Registration statement - -```sql -create function integralavg as 'org.apache.iotdb.library.dprofile.UDAFIntegralAvg' -``` - -#### Usage - -This function computes the function average of a time series, i.e. the numerical integral divided by the total time span of the series, both measured in the same time unit. For more about numerical integration, see the `Integral` function. - -**Name:** INTEGRALAVG - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Outputs a single series of type DOUBLE. It contains exactly one data point, whose timestamp is 0 and whose value is the time-weighted average. - -**Note:** - -+ The time-weighted average equals the numerical integral computed in any time unit `unit` (i.e. the sum of the areas of the right trapezoids formed by every two adjacent data points and the time axis), -  divided by the time span of the input series in the same time unit. Its value does not depend on the time unit used; by default the unit is the same as the IoTDB time unit. - -+ `NaN` values in the data are ignored. The polyline connects the two nearest points that have values. - -+ If the input series is empty, the output is 0; if it contains only one data point, the output is the value of that point. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ 
-|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| 2| -|2020-01-01T00:00:03.000+08:00| 5| -|2020-01-01T00:00:04.000+08:00| 6| -|2020-01-01T00:00:05.000+08:00| 7| -|2020-01-01T00:00:08.000+08:00| 8| -|2020-01-01T00:00:09.000+08:00| NaN| -|2020-01-01T00:00:10.000+08:00| 10| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select integralavg(s1) from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|integralavg(root.test.d1.s1)| -+-----------------------------+----------------------------+ -|1970-01-01T08:00:00.000+08:00| 6.388888888888889| -+-----------------------------+----------------------------+ -``` - -It is calculated as the integral divided by the 9-second span from 00:00:01 to 00:00:10: -$$\frac{1}{2}[(1+2)\times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] / 9 \approx 6.389$$ - -### 3.6 Mad - -#### Registration statement - -```sql -create function mad as 'org.apache.iotdb.library.dprofile.UDAFMad' -``` - -#### Usage - -This function computes the exact or approximate median absolute deviation (MAD) of a single column of numerical data. The MAD is the median of the absolute deviations of all values from their median. - -For example, the data set $\{1,3,3,5,5,6,7,8,9\}$ has median 5. The absolute deviations of its values from the median are $\{0,0,1,2,2,2,3,4,4\}$, whose median is 2, so the MAD of the original data set is 2. - -**Name:** MAD - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `error`: The value-based relative error of the approximate MAD, in the range [0,1). The default value is 0. For example, with `error`=0.01, let a be the exact MAD and b the approximate MAD; then $0.99a \le b \le 1.01a$ holds. With `error`=0, the result is the exact MAD. - - -**Output Series:** Outputs a single series of type DOUBLE. It contains exactly one data point, whose timestamp is 0 and whose value is the MAD. - -**Note:** Null values, missing values and `NaN` in the data are ignored. - -#### Examples - -##### Approximate query - -When the `error` parameter is not 0, this function computes the approximate MAD. - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| 
-|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -............ -Total line number = 20 -``` - -用于查询的 SQL 语句如下: - -```sql -select mad(s1, "error"="0.01") from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------+ -| Time|mad(root.test.s1, "error"="0.01")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9900000000000001| -+-----------------------------+---------------------------------+ -``` - -### 3.7 Median - -#### 注册语句 - -```sql -create function median as 'org.apache.iotdb.library.dprofile.UDAFMedian' -``` - -#### 函数简介 - -本函数用于计算单列数值型数据的精确或近似中位数。中位数是顺序排列的一组数据中居于中间位置的数;当序列有偶数个时,中位数为中间二者的平均数。 - -**函数名:** MEDIAN - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `error`:近似中位数的基于排名的误差百分比,取值范围 [0,1),默认值为 0。如当`error`=0.01 时,计算出的中位数的真实排名百分比在 0.49~0.51 之间。当`error`=0 时,计算结果为精确中位数。 - -**输出序列:** 输出单个序列,类型为 DOUBLE,序列仅包含一个时间戳为 0、值为中位数的数据点。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| 
-|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -Total line number = 20 -``` - -用于查询的 SQL 语句: - -```sql -select median(s1, "error"="0.01") from root.test -``` - -输出序列: - -``` -+-----------------------------+------------------------------------+ -| Time|median(root.test.s1, "error"="0.01")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -+-----------------------------+------------------------------------+ -``` - -### 3.8 MinMax - -#### 注册语句 - -```sql -create function minmax as 'org.apache.iotdb.library.dprofile.UDTFMinMax' -``` - -#### 函数简介 - -本函数将输入序列使用 min-max 方法进行标准化。最小值归一至 0,最大值归一至 1. - -**函数名:** MINMAX - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `compute`:若设置为"batch",则将数据全部读入后转换;若设置为 "stream",则需用户提供最大值及最小值进行流式计算转换。默认为 "batch"。 -+ `min`:使用流式计算时的最小值。 -+ `max`:使用流式计算时的最大值。 - -**输出序列**:输出单个序列,类型为 DOUBLE。 - -#### 使用示例 - -##### 全数据计算 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| 
-|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select minmax(s1) from root.test -``` - -输出序列: - -``` -+-----------------------------+--------------------+ -| Time|minmax(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.200+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.300+08:00| 0.25| -|1970-01-01T08:00:00.400+08:00| 0.08333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.700+08:00| 0.0| -|1970-01-01T08:00:00.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.900+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.000+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.100+08:00| 0.25| -|1970-01-01T08:00:01.200+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.300+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.25| -|1970-01-01T08:00:01.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.700+08:00| 1.0| -|1970-01-01T08:00:01.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.900+08:00| 0.0| -|1970-01-01T08:00:02.000+08:00| 0.16666666666666666| -+-----------------------------+--------------------+ -``` - - - -### 3.9 MvAvg - -#### 注册语句 - -```sql -create function mvavg as 'org.apache.iotdb.library.dprofile.UDTFMvAvg' -``` - -#### 函数简介 - -本函数计算序列的移动平均。 - -**函数名:** MVAVG - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `window`:移动窗口的长度。默认值为 10. 
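
To make the `window` parameter concrete, the sketch below shows a trailing moving average over `(timestamp, value)` pairs. It is a minimal illustration, not the UDF's source: the helper name `moving_average` is hypothetical, and stamping each output with the time of the last point in its window is an assumption suggested by the example timestamps in this section.

```python
from collections import deque


def moving_average(points, window=10):
    """Trailing moving average over (timestamp, value) pairs.

    Assumption: each output carries the timestamp of the last point in
    its window, so the first window-1 inputs produce no output.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    buf = deque()   # values currently inside the window
    total = 0.0     # running sum of the values in buf
    out = []
    for ts, value in points:
        buf.append(value)
        total += value
        if len(buf) > window:
            total -= buf.popleft()  # drop the value that left the window
        if len(buf) == window:
            out.append((ts, total / window))
    return out


if __name__ == "__main__":
    series = [(1, 3.0), (2, 6.0), (3, 9.0), (4, 12.0)]
    print(moving_average(series, window=3))
```

With `window=3`, four input points give two averages, one for each complete window. Keeping a running sum makes each step O(1) instead of re-summing the window.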
- -**输出序列**:输出单个序列,类型为 DOUBLE。 - -#### 使用示例 - -##### 指定窗口长度 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select mvavg(s1, "window"="3") from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------+ -| Time|mvavg(root.test.s1, "window"="3")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.300+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.400+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.700+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.100+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.200+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.300+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.0| -|1970-01-01T08:00:01.500+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.600+08:00| 0.3333333333333333| 
-|1970-01-01T08:00:01.700+08:00| 3.0| -|1970-01-01T08:00:01.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.900+08:00| -0.6666666666666666| -|1970-01-01T08:00:02.000+08:00| -3.3333333333333335| -+-----------------------------+---------------------------------+ -``` - -### 3.10 PACF - -#### 注册语句 - -```sql -create function pacf as 'org.apache.iotdb.library.dprofile.UDTFPACF' -``` - -#### 函数简介 - -本函数通过求解 Yule-Walker 方程,计算序列的偏自相关系数。对于特殊的输入序列,方程可能没有解,此时输出`NaN`。 - -**函数名:** PACF - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `lag`:最大滞后阶数。默认值为$\min(10\log_{10}n,n-1)$,$n$表示数据点个数。 - -**输出序列**:输出单个序列,类型为 DOUBLE。 - -#### 使用示例 - -##### 指定滞后阶数 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select pacf(s1, "lag"="5") from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------+ -| Time|pacf(root.test.d1.s1, "lag"="5")| -+-----------------------------+--------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| -0.5744680851063829| -|2020-01-01T00:00:03.000+08:00| 0.3172297297297296| -|2020-01-01T00:00:04.000+08:00| -0.2977686586304181| -|2020-01-01T00:00:05.000+08:00| -2.0609033521065867| -+-----------------------------+--------------------------------+ -``` - -### 3.11 Percentile - -#### 注册语句 - -```sql -create function percentile as 'org.apache.iotdb.library.dprofile.UDAFPercentile' -``` - -#### 函数简介 - -本函数用于计算单列数值型数据的精确或近似分位数。 - -**函数名:** PERCENTILE - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `rank`:所求分位数在所有数据中的排名百分比,取值范围为 (0,1],默认值为 0.5。如当设为 0.5时则计算中位数。 -+ 
`error`:近似分位数的基于排名的误差百分比,取值范围为 [0,1),默认值为0。如`rank`=0.5 且`error`=0.01,则计算出的分位数的真实排名百分比在 0.49~0.51之间。当`error`=0 时,计算结果为精确分位数。 - -**输出序列:** 输出单个序列,类型与输入序列相同。当`error`=0时,序列仅包含一个时间戳为分位数第一次出现的时间戳、值为分位数的数据点;否则,输出值的时间戳为0。 - -**提示:** 数据中的空值、缺失值和`NaN`将会被忽略。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| -+-----------------------------+------------+ -|2021-03-17T10:32:17.054+08:00| 0.5319929| -|2021-03-17T10:32:18.054+08:00| 0.9304316| -|2021-03-17T10:32:19.054+08:00| -1.4800133| -|2021-03-17T10:32:20.054+08:00| 0.6114087| -|2021-03-17T10:32:21.054+08:00| 2.5163336| -|2021-03-17T10:32:22.054+08:00| -1.0845392| -|2021-03-17T10:32:23.054+08:00| 1.0562582| -|2021-03-17T10:32:24.054+08:00| 1.3867859| -|2021-03-17T10:32:25.054+08:00| -0.45429882| -|2021-03-17T10:32:26.054+08:00| 1.0353678| -|2021-03-17T10:32:27.054+08:00| 0.7307929| -|2021-03-17T10:32:28.054+08:00| 2.3167255| -|2021-03-17T10:32:29.054+08:00| 2.342443| -|2021-03-17T10:32:30.054+08:00| 1.5809103| -|2021-03-17T10:32:31.054+08:00| 1.4829416| -|2021-03-17T10:32:32.054+08:00| 1.5800357| -|2021-03-17T10:32:33.054+08:00| 0.7124368| -|2021-03-17T10:32:34.054+08:00| -0.78597564| -|2021-03-17T10:32:35.054+08:00| 1.2058644| -|2021-03-17T10:32:36.054+08:00| 1.4215064| -|2021-03-17T10:32:37.054+08:00| 1.2808295| -|2021-03-17T10:32:38.054+08:00| -0.6173715| -|2021-03-17T10:32:39.054+08:00| 0.06644377| -|2021-03-17T10:32:40.054+08:00| 2.349338| -|2021-03-17T10:32:41.054+08:00| 1.7335888| -|2021-03-17T10:32:42.054+08:00| 1.5872132| -............ 
-Total line number = 10000 -``` - -用于查询的 SQL 语句: - -```sql -select percentile(s0, "rank"="0.2", "error"="0.01") from root.test -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|percentile(root.test.s0, "rank"="0.2", "error"="0.01")| -+-----------------------------+------------------------------------------------------+ -|2021-03-17T10:35:02.054+08:00| 0.1801469624042511| -+-----------------------------+------------------------------------------------------+ -``` -输入序列: - -``` -+-----------------------------+-------------+ -| Time|root.test2.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+-------------+ -............ 
-Total line number = 20 -``` - -SQL for query: - -```sql -select percentile(s1, "rank"="0.2", "error"="0.01") from root.test2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------------+ -| Time|percentile(root.test2.s1, "rank"="0.2", "error"="0.01")| -+-----------------------------+-------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| -1.0| -+-----------------------------+-------------------------------------------------------+ -``` - - -### 3.12 Quantile - -#### Registration statement - -```sql -create function quantile as 'org.apache.iotdb.library.dprofile.UDAFQuantile' -``` - -#### Usage - -This function computes the approximate quantile of a single column of numerical data. It is implemented with the KLL sketch algorithm. - -**Name:** QUANTILE - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `rank`: The rank ratio of the requested quantile among all data, in the range (0,1]. The default value is 0.5; when set to 0.5, the approximate median is computed. -+ `K`: The size of the KLL sketch that may be maintained. The minimum value is 100 and the default value is 800. For example, with `rank`=0.5 and `K`=800, the true rank ratio of the computed quantile lies between 0.49 and 0.51 with a probability of at least 99%. - -**Output Series:** Outputs a single series of the same type as the input series. The timestamp of the output value is 0. - -**Note:** Null values, missing values and `NaN` in the data are ignored. - -#### Examples - - -Input series: - -``` -+-----------------------------+-------------+ -| Time|root.test1.s1| -+-----------------------------+-------------+ -|2021-03-17T10:32:17.054+08:00| 7| -|2021-03-17T10:32:18.054+08:00| 15| -|2021-03-17T10:32:19.054+08:00| 36| -|2021-03-17T10:32:20.054+08:00| 39| -|2021-03-17T10:32:21.054+08:00| 40| -|2021-03-17T10:32:22.054+08:00| 41| -|2021-03-17T10:32:23.054+08:00| 20| -|2021-03-17T10:32:24.054+08:00| 18| -+-----------------------------+-------------+ -............ 
-Total line number = 8 -``` - -用于查询的 SQL 语句: - -```sql -select quantile(s1, "rank"="0.2", "K"="800") from root.test1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------+ -| Time|quantile(root.test1.s1, "rank"="0.2", "K"="800")| -+-----------------------------+------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 7.000000000000001| -+-----------------------------+------------------------------------------------+ -``` - -### 3.13 Period - -#### 注册语句 - -```sql -create function period as 'org.apache.iotdb.library.dprofile.UDAFPeriod' -``` - -#### 函数简介 - -本函数用于计算单列数值型数据的周期。 - -**函数名:** PERIOD - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**输出序列:** 输出单个序列,类型为 INT32,序列仅包含一个时间戳为 0、值为周期的数据点。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d3.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| -|1970-01-01T08:00:00.002+08:00| 2.0| -|1970-01-01T08:00:00.003+08:00| 3.0| -|1970-01-01T08:00:00.004+08:00| 1.0| -|1970-01-01T08:00:00.005+08:00| 2.0| -|1970-01-01T08:00:00.006+08:00| 3.0| -|1970-01-01T08:00:00.007+08:00| 1.0| -|1970-01-01T08:00:00.008+08:00| 2.0| -|1970-01-01T08:00:00.009+08:00| 3.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select period(s1) from root.test.d3 -``` - -输出序列: - -``` -+-----------------------------+-----------------------+ -| Time|period(root.test.d3.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 3| -+-----------------------------+-----------------------+ -``` - -### 3.14 QLB - -#### 注册语句 - -```sql -create function qlb as 'org.apache.iotdb.library.dprofile.UDTFQLB' -``` - -#### 函数简介 - -本函数对输入序列计算$Q_{LB} $统计量,并计算对应的p值。p值越小表明序列越有可能为非平稳序列。 - -**函数名:** QLB - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `lag`:计算时用到的最大延迟阶数,取值应为 1 至 n-2 之间的整数,n 为序列采样总数。默认取 n-2。 - 
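
For orientation, the textbook Ljung-Box statistic for a maximum lag $h$ is $Q_{LB} = n(n+2)\sum_{k=1}^{h}\hat\rho_k^2/(n-k)$, where $\hat\rho_k$ is the sample autocorrelation at lag $k$ and $n$ is the number of samples. The sketch below computes only this statistic (not the p-value) using the standard library. Whether the QLB UDF follows exactly this definition is an assumption, and the helper names `autocorr` and `ljung_box_q` are hypothetical.

```python
def autocorr(xs, k):
    """Sample autocorrelation of xs at lag k (biased estimator)."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    num = sum((xs[t] - mean) * (xs[t + k] - mean) for t in range(n - k))
    return num / denom


def ljung_box_q(xs, max_lag):
    """Textbook Ljung-Box Q statistic for lags 1..max_lag.

    Assumption: this mirrors the standard definition, not necessarily
    the exact computation inside the UDF.
    """
    n = len(xs)
    if not 1 <= max_lag <= n - 2:
        raise ValueError("max_lag must be between 1 and n-2")
    acc = sum(autocorr(xs, k) ** 2 / (n - k) for k in range(1, max_lag + 1))
    return n * (n + 2) * acc


if __name__ == "__main__":
    print(ljung_box_q([1.0, 2.0, 3.0, 4.0], 2))
```

The p-value reported by a Ljung-Box test is the upper-tail probability of this statistic under a chi-square distribution with `max_lag` degrees of freedom. That step is omitted here because it needs a chi-square survival function, which the standard library does not provide.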
-**输出序列:** 输出单个序列,类型为 DOUBLE。该序列是$Q_{LB} $统计量对应的 p 值,时间标签代表偏移阶数。 - -**提示:** $Q_{LB} $统计量由自相关系数求得,如需得到统计量而非 p 值,可以使用 ACF 函数。 - -#### 使用示例 - -##### 使用默认参数 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T00:00:00.100+08:00| 1.22| -|1970-01-01T00:00:00.200+08:00| -2.78| -|1970-01-01T00:00:00.300+08:00| 1.53| -|1970-01-01T00:00:00.400+08:00| 0.70| -|1970-01-01T00:00:00.500+08:00| 0.75| -|1970-01-01T00:00:00.600+08:00| -0.72| -|1970-01-01T00:00:00.700+08:00| -0.22| -|1970-01-01T00:00:00.800+08:00| 0.28| -|1970-01-01T00:00:00.900+08:00| 0.57| -|1970-01-01T00:00:01.000+08:00| -0.22| -|1970-01-01T00:00:01.100+08:00| -0.72| -|1970-01-01T00:00:01.200+08:00| 1.34| -|1970-01-01T00:00:01.300+08:00| -0.25| -|1970-01-01T00:00:01.400+08:00| 0.17| -|1970-01-01T00:00:01.500+08:00| 2.51| -|1970-01-01T00:00:01.600+08:00| 1.42| -|1970-01-01T00:00:01.700+08:00| -1.34| -|1970-01-01T00:00:01.800+08:00| -0.01| -|1970-01-01T00:00:01.900+08:00| -0.49| -|1970-01-01T00:00:02.000+08:00| 1.63| -+-----------------------------+---------------+ -``` - - -用于查询的 SQL 语句: - -```sql -select QLB(s1) from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+---------------------+ -| Time| QLB(root.test.d1.s1)| -+-----------------------------+---------------------+ -|1970-01-01T08:00:00.021+08:00| -0.31671| -|1970-01-01T08:00:00.001+08:00| 0.12748561639660716| -|1970-01-01T08:00:00.022+08:00| -0.17051499999999997| -|1970-01-01T08:00:00.002+08:00| 0.21941409592365868| -|1970-01-01T08:00:00.023+08:00| -0.11341499999999997| -|1970-01-01T08:00:00.003+08:00| 0.3384920824593398| -|1970-01-01T08:00:00.024+08:00| 0.26146| -|1970-01-01T08:00:00.004+08:00| 0.26293189359893154| -|1970-01-01T08:00:00.025+08:00| 0.06431999999999996| -|1970-01-01T08:00:00.005+08:00| 0.37265953802871943| -|1970-01-01T08:00:00.026+08:00| 0.036919999999999994| -|1970-01-01T08:00:00.006+08:00| 0.4923218142923832| 
-|1970-01-01T08:00:00.027+08:00|-0.009294999999999993| -|1970-01-01T08:00:00.007+08:00| 0.609628728420623| -|1970-01-01T08:00:00.028+08:00| 0.12271499999999999| -|1970-01-01T08:00:00.008+08:00| 0.6510708392264906| -|1970-01-01T08:00:00.029+08:00| 0.008480000000000033| -|1970-01-01T08:00:00.009+08:00| 0.7430561964288097| -|1970-01-01T08:00:00.030+08:00| -0.21764500000000003| -|1970-01-01T08:00:00.010+08:00| 0.6236738200492055| -|1970-01-01T08:00:00.031+08:00| 0.35853999999999997| -|1970-01-01T08:00:00.011+08:00| 0.21487390993160937| -|1970-01-01T08:00:00.032+08:00| 0.18115499999999998| -|1970-01-01T08:00:00.012+08:00| 0.18479562182870324| -|1970-01-01T08:00:00.033+08:00| -0.27745499999999995| -|1970-01-01T08:00:00.013+08:00| 0.07329862193377235| -|1970-01-01T08:00:00.034+08:00| -0.22418500000000002| -|1970-01-01T08:00:00.014+08:00| 0.038000864459751926| -|1970-01-01T08:00:00.035+08:00| 0.31609000000000004| -|1970-01-01T08:00:00.015+08:00| 0.004052989734200874| -|1970-01-01T08:00:00.036+08:00| -0.06078500000000001| -|1970-01-01T08:00:00.016+08:00| 0.005663787468609627| -|1970-01-01T08:00:00.037+08:00| 0.19219499999999998| -|1970-01-01T08:00:00.017+08:00|0.0016316380755082571| -|1970-01-01T08:00:00.038+08:00| -0.25646| -|1970-01-01T08:00:00.018+08:00|2.0047954405910673E-5| -+-----------------------------+---------------------+ -``` - -### 3.15 Resample - -#### Registration statement - -```sql -create function re_sample as 'org.apache.iotdb.library.dprofile.UDTFResample' -``` - -#### Usage - -This function resamples the input series at a specified frequency, covering both upsampling and downsampling. The supported upsampling methods are `NaN` filling (NaN), forward filling (FFill), backward filling (BFill), and linear interpolation (Linear). Downsampling is done by grouped aggregation; the supported aggregation methods are maximum (Max), minimum (Min), first value (First), last value (Last), mean (Mean), and median (Median). - -**Name:** RESAMPLE - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `every`: the resampling frequency, a positive number with a unit. Five units are supported: 'ms' (millisecond), 's' (second), 'm' (minute), 'h' (hour), and 'd' (day). This parameter is required. -+ `interp`: the interpolation method for upsampling, one of 'NaN', 'FFill', 'BFill', or 'Linear'. Defaults to `NaN` filling. -+ `aggr`: the aggregation method for downsampling, one of
 'Max', 'Min', 'First', 'Last', 'Mean', or 'Median'. Defaults to mean aggregation. -+ `start`: the start time of resampling (inclusive), a time string in the format 'yyyy-MM-dd HH:mm:ss'. Defaults to the timestamp of the first valid data point. -+ `end`: the end time of resampling (exclusive), a time string in the format 'yyyy-MM-dd HH:mm:ss'. Defaults to the timestamp of the last valid data point. - -**Output Series:** Outputs a single series of type DOUBLE. Its points are strictly equally spaced at the resampling frequency. - -**Note:** `NaN` values in the data are ignored. - -#### Examples - -##### Upsampling - -Upsampling is performed when the resampling frequency is higher than the original frequency of the data. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2021-03-06T16:00:00.000+08:00| 3.09| -|2021-03-06T16:15:00.000+08:00| 3.53| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:45:00.000+08:00| 3.51| -|2021-03-06T17:00:00.000+08:00| 3.41| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select resample(s1,'every'='5m','interp'='linear') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="5m", "interp"="linear")| -+-----------------------------+----------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:05:00.000+08:00| 3.2366665999094644| -|2021-03-06T16:10:00.000+08:00| 3.3833332856496177| -|2021-03-06T16:15:00.000+08:00| 3.5299999713897705| -|2021-03-06T16:20:00.000+08:00| 3.5199999809265137| -|2021-03-06T16:25:00.000+08:00| 3.509999990463257| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:35:00.000+08:00| 3.503333330154419| -|2021-03-06T16:40:00.000+08:00| 3.506666660308838| -|2021-03-06T16:45:00.000+08:00| 3.509999990463257| -|2021-03-06T16:50:00.000+08:00| 3.4766666889190674| -|2021-03-06T16:55:00.000+08:00| 3.443333387374878| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+----------------------------------------------------------+ -``` - -##### Downsampling - -Downsampling is performed when the resampling frequency is lower than the original frequency of the data. - -The input series is the same as above. SQL for query: - -```sql -select
 resample(s1,'every'='30m','aggr'='first') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "aggr"="first")| -+-----------------------------+--------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+--------------------------------------------------------+ -``` - - -##### Specifying the resampling time range - -The `start` and `end` parameters specify the time range to resample. Portions outside the actual time range of the data are filled according to the interpolation method. - -The input series is the same as above. SQL for query: - -```sql -select resample(s1,'every'='30m','start'='2021-03-06 15:00:00') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "start"="2021-03-06 15:00:00")| -+-----------------------------+-----------------------------------------------------------------------+ -|2021-03-06T15:00:00.000+08:00| NaN| -|2021-03-06T15:30:00.000+08:00| NaN| -|2021-03-06T16:00:00.000+08:00| 3.309999942779541| -|2021-03-06T16:30:00.000+08:00| 3.5049999952316284| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+-----------------------------------------------------------------------+ -``` - -### 3.16 Sample - -#### Registration statement - -```sql -create function sample as 'org.apache.iotdb.library.dprofile.UDTFSample' -``` - -#### Usage - -This function samples the input series, that is, it selects a specified number of data points from the input series and outputs them. Three sampling methods are currently supported. **Reservoir sampling** samples the data randomly, with every data point equally likely to be selected. **Isometric sampling** selects data points at equal index intervals. **Triangle sampling** divides the data into buckets according to the sampling rate, computes the triangle areas formed by the data points within each bucket, and keeps the point with the largest area. This algorithm is typically used for data visualization, because it preserves key abrupt-change points during sampling. For more details on the sampling algorithm, see [this thesis](http://skemman.is/stream/get/1946/15343/37285/3/SS_MSthesis.pdf). - -**Name:** SAMPLE - -**Input Series:** Only a single input series is supported. The type is arbitrary. - -**Parameters:** - -+ `method`: the sampling method, one of
 'reservoir', 'isometric', or 'triangle'. Defaults to reservoir sampling. -+ `k`: the number of samples, a positive integer. Defaults to 1. - -**Output Series:** Outputs a single series of the same type as the input series. Its length equals the number of samples, and every data point in it comes from the input series. - -**Note:** If the number of samples is greater than the length of the series, all data points in the input series are output. - -#### Examples - - -##### Reservoir sampling - -When `method` is 'reservoir' or omitted, the input series is sampled with reservoir sampling. Because this method is random, the output series shown below is only one possible result. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| 2.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:04.000+08:00| 4.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -|2020-01-01T00:00:10.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select sample(s1,'method'='reservoir','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="reservoir", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - - -##### Isometric sampling - -When `method` is 'isometric', the input series is sampled with isometric sampling. - -The input series is the same as above. SQL for query: - -```sql -select sample(s1,'method'='isometric','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="isometric", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| 
-|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - -### 3.17 Segment - -#### Registration statement - -```sql -create function segment as 'org.apache.iotdb.library.dprofile.UDTFSegment' -``` - -#### Usage - -This function splits the data into several subseries according to their linear trend, and returns either the first value of each subseries after piecewise linear fitting or all fitted values. - -**Name:** SEGMENT - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `output`: "all" outputs all fitted values; "first" outputs the fitted value at the start of each subseries. Defaults to "first". - -+ `error`: the error threshold for deciding that a linear trend exists. The error is defined as the mean of the absolute errors of the linear fit over the subseries. Defaults to 0.1. - -**Output Series:** Outputs a single series of type DOUBLE. - -**Note:** The function assumes that all data points are equally spaced in time. It reads all the data, so downsample first if the original data set is large. Fitting is bottom-up, so the last value of a subseries may be output as the first value of a subseries. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:00.300+08:00| 2.0| -|1970-01-01T08:00:00.400+08:00| 3.0| -|1970-01-01T08:00:00.500+08:00| 4.0| -|1970-01-01T08:00:00.600+08:00| 5.0| -|1970-01-01T08:00:00.700+08:00| 6.0| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 8.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:01.100+08:00| 9.1| -|1970-01-01T08:00:01.200+08:00| 9.2| -|1970-01-01T08:00:01.300+08:00| 9.3| -|1970-01-01T08:00:01.400+08:00| 9.4| -|1970-01-01T08:00:01.500+08:00| 9.5| -|1970-01-01T08:00:01.600+08:00| 9.6| -|1970-01-01T08:00:01.700+08:00| 9.7| -|1970-01-01T08:00:01.800+08:00| 9.8| -|1970-01-01T08:00:01.900+08:00| 9.9| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:02.100+08:00| 8.0| -|1970-01-01T08:00:02.200+08:00| 6.0| -|1970-01-01T08:00:02.300+08:00| 4.0| -|1970-01-01T08:00:02.400+08:00| 2.0| -|1970-01-01T08:00:02.500+08:00| 0.0| -|1970-01-01T08:00:02.600+08:00| -2.0| -|1970-01-01T08:00:02.700+08:00| -4.0| 
-|1970-01-01T08:00:02.800+08:00| -6.0| -|1970-01-01T08:00:02.900+08:00| -8.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.100+08:00| 10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -|1970-01-01T08:00:03.300+08:00| 10.0| -|1970-01-01T08:00:03.400+08:00| 10.0| -|1970-01-01T08:00:03.500+08:00| 10.0| -|1970-01-01T08:00:03.600+08:00| 10.0| -|1970-01-01T08:00:03.700+08:00| 10.0| -|1970-01-01T08:00:03.800+08:00| 10.0| -|1970-01-01T08:00:03.900+08:00| 10.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select segment(s1,"error"="0.1") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|segment(root.test.s1, "error"="0.1")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -+-----------------------------+------------------------------------+ -``` - -### 3.18 Skew - -#### Registration statement - -```sql -create function skew as 'org.apache.iotdb.library.dprofile.UDAFSkew' -``` - -#### Usage - -This function computes the population skewness of a single numeric series. - -**Name:** SKEW - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Outputs a single series of type DOUBLE. It contains exactly one data point, whose timestamp is 0 and whose value is the population skewness. - -**Note:** Null values, missing values, and `NaN` in the data are ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 10.0| 
-|2020-01-01T00:00:11.000+08:00| 10.0| -|2020-01-01T00:00:12.000+08:00| 10.0| -|2020-01-01T00:00:13.000+08:00| 10.0| -|2020-01-01T00:00:14.000+08:00| 10.0| -|2020-01-01T00:00:15.000+08:00| 10.0| -|2020-01-01T00:00:16.000+08:00| 10.0| -|2020-01-01T00:00:17.000+08:00| 10.0| -|2020-01-01T00:00:18.000+08:00| 10.0| -|2020-01-01T00:00:19.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select skew(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time| skew(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| -0.9998427402292644| -+-----------------------------+-----------------------+ -``` - -### 3.19 Spline - -#### Registration statement - -```sql -create function spline as 'org.apache.iotdb.library.dprofile.UDTFSpline' -``` - -#### Usage - -This function fits a cubic spline to the original series and resamples it by interpolation. - -**Name:** SPLINE - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `points`: the number of resampled points. - -**Output Series:** Outputs a single series of type DOUBLE. - -**Note:** The output series keeps the first and last values of the input series and is sampled at equal time intervals. Interpolation is computed only when the input contains at least 4 points. - -#### Examples - -##### Specifying the number of interpolation points - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select spline(s1, "points"="151") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|spline(root.test.s1, "points"="151")| -+-----------------------------+------------------------------------+ 
-|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.010+08:00| 0.04870000251134237| -|1970-01-01T08:00:00.020+08:00| 0.09680000495910646| -|1970-01-01T08:00:00.030+08:00| 0.14430000734329226| -|1970-01-01T08:00:00.040+08:00| 0.19120000966389972| -|1970-01-01T08:00:00.050+08:00| 0.23750001192092896| -|1970-01-01T08:00:00.060+08:00| 0.2832000141143799| -|1970-01-01T08:00:00.070+08:00| 0.32830001624425253| -|1970-01-01T08:00:00.080+08:00| 0.3728000183105469| -|1970-01-01T08:00:00.090+08:00| 0.416700020313263| -|1970-01-01T08:00:00.100+08:00| 0.4600000222524008| -|1970-01-01T08:00:00.110+08:00| 0.5027000241279602| -|1970-01-01T08:00:00.120+08:00| 0.5448000259399414| -|1970-01-01T08:00:00.130+08:00| 0.5863000276883443| -|1970-01-01T08:00:00.140+08:00| 0.627200029373169| -|1970-01-01T08:00:00.150+08:00| 0.6675000309944153| -|1970-01-01T08:00:00.160+08:00| 0.7072000325520833| -|1970-01-01T08:00:00.170+08:00| 0.7463000340461731| -|1970-01-01T08:00:00.180+08:00| 0.7848000354766846| -|1970-01-01T08:00:00.190+08:00| 0.8227000368436178| -|1970-01-01T08:00:00.200+08:00| 0.8600000381469728| -|1970-01-01T08:00:00.210+08:00| 0.8967000393867494| -|1970-01-01T08:00:00.220+08:00| 0.9328000405629477| -|1970-01-01T08:00:00.230+08:00| 0.9683000416755676| -|1970-01-01T08:00:00.240+08:00| 1.0032000427246095| -|1970-01-01T08:00:00.250+08:00| 1.037500043710073| -|1970-01-01T08:00:00.260+08:00| 1.071200044631958| -|1970-01-01T08:00:00.270+08:00| 1.1043000454902647| -|1970-01-01T08:00:00.280+08:00| 1.1368000462849934| -|1970-01-01T08:00:00.290+08:00| 1.1687000470161437| -|1970-01-01T08:00:00.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:00.310+08:00| 1.2307000483103594| -|1970-01-01T08:00:00.320+08:00| 1.2608000489139557| -|1970-01-01T08:00:00.330+08:00| 1.2903000494873524| -|1970-01-01T08:00:00.340+08:00| 1.3192000500233967| -|1970-01-01T08:00:00.350+08:00| 1.3475000505149364| -|1970-01-01T08:00:00.360+08:00| 1.3752000509548186| -|1970-01-01T08:00:00.370+08:00| 1.402300051335891| 
-|1970-01-01T08:00:00.380+08:00| 1.4288000516510009| -|1970-01-01T08:00:00.390+08:00| 1.4547000518929958| -|1970-01-01T08:00:00.400+08:00| 1.480000052054723| -|1970-01-01T08:00:00.410+08:00| 1.5047000521290301| -|1970-01-01T08:00:00.420+08:00| 1.5288000521087646| -|1970-01-01T08:00:00.430+08:00| 1.5523000519867738| -|1970-01-01T08:00:00.440+08:00| 1.575200051755905| -|1970-01-01T08:00:00.450+08:00| 1.597500051409006| -|1970-01-01T08:00:00.460+08:00| 1.619200050938924| -|1970-01-01T08:00:00.470+08:00| 1.6403000503385066| -|1970-01-01T08:00:00.480+08:00| 1.660800049600601| -|1970-01-01T08:00:00.490+08:00| 1.680700048718055| -|1970-01-01T08:00:00.500+08:00| 1.7000000476837158| -|1970-01-01T08:00:00.510+08:00| 1.7188475466453037| -|1970-01-01T08:00:00.520+08:00| 1.7373800457262996| -|1970-01-01T08:00:00.530+08:00| 1.7555825448831923| -|1970-01-01T08:00:00.540+08:00| 1.7734400440724702| -|1970-01-01T08:00:00.550+08:00| 1.790937543250622| -|1970-01-01T08:00:00.560+08:00| 1.8080600423741364| -|1970-01-01T08:00:00.570+08:00| 1.8247925413995016| -|1970-01-01T08:00:00.580+08:00| 1.8411200402832066| -|1970-01-01T08:00:00.590+08:00| 1.8570275389817397| -|1970-01-01T08:00:00.600+08:00| 1.8725000374515897| -|1970-01-01T08:00:00.610+08:00| 1.8875225356492449| -|1970-01-01T08:00:00.620+08:00| 1.902080033531194| -|1970-01-01T08:00:00.630+08:00| 1.9161575310539258| -|1970-01-01T08:00:00.640+08:00| 1.9297400281739288| -|1970-01-01T08:00:00.650+08:00| 1.9428125248476913| -|1970-01-01T08:00:00.660+08:00| 1.9553600210317021| -|1970-01-01T08:00:00.670+08:00| 1.96736751668245| -|1970-01-01T08:00:00.680+08:00| 1.9788200117564232| -|1970-01-01T08:00:00.690+08:00| 1.9897025062101101| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.710+08:00| 2.0097024933913334| -|1970-01-01T08:00:00.720+08:00| 2.0188199867081615| -|1970-01-01T08:00:00.730+08:00| 2.027367479995188| -|1970-01-01T08:00:00.740+08:00| 2.0353599732971155| -|1970-01-01T08:00:00.750+08:00| 2.0428124666586482| 
-|1970-01-01T08:00:00.760+08:00| 2.049739960124489| -|1970-01-01T08:00:00.770+08:00| 2.056157453739342| -|1970-01-01T08:00:00.780+08:00| 2.06207994754791| -|1970-01-01T08:00:00.790+08:00| 2.067522441594897| -|1970-01-01T08:00:00.800+08:00| 2.072499935925006| -|1970-01-01T08:00:00.810+08:00| 2.07702743058294| -|1970-01-01T08:00:00.820+08:00| 2.081119925613404| -|1970-01-01T08:00:00.830+08:00| 2.0847924210611| -|1970-01-01T08:00:00.840+08:00| 2.0880599169707317| -|1970-01-01T08:00:00.850+08:00| 2.0909374133870027| -|1970-01-01T08:00:00.860+08:00| 2.0934399103546166| -|1970-01-01T08:00:00.870+08:00| 2.0955824079182768| -|1970-01-01T08:00:00.880+08:00| 2.0973799061226863| -|1970-01-01T08:00:00.890+08:00| 2.098847405012549| -|1970-01-01T08:00:00.900+08:00| 2.0999999046325684| -|1970-01-01T08:00:00.910+08:00| 2.1005574051201332| -|1970-01-01T08:00:00.920+08:00| 2.1002599065303778| -|1970-01-01T08:00:00.930+08:00| 2.0991524087846245| -|1970-01-01T08:00:00.940+08:00| 2.0972799118041947| -|1970-01-01T08:00:00.950+08:00| 2.0946874155104105| -|1970-01-01T08:00:00.960+08:00| 2.0914199198245944| -|1970-01-01T08:00:00.970+08:00| 2.0875224246680673| -|1970-01-01T08:00:00.980+08:00| 2.083039929962151| -|1970-01-01T08:00:00.990+08:00| 2.0780174356281687| -|1970-01-01T08:00:01.000+08:00| 2.0724999415874406| -|1970-01-01T08:00:01.010+08:00| 2.06653244776129| -|1970-01-01T08:00:01.020+08:00| 2.060159954071038| -|1970-01-01T08:00:01.030+08:00| 2.053427460438006| -|1970-01-01T08:00:01.040+08:00| 2.046379966783517| -|1970-01-01T08:00:01.050+08:00| 2.0390624730288924| -|1970-01-01T08:00:01.060+08:00| 2.031519979095454| -|1970-01-01T08:00:01.070+08:00| 2.0237974849045237| -|1970-01-01T08:00:01.080+08:00| 2.015939990377423| -|1970-01-01T08:00:01.090+08:00| 2.0079924954354746| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.110+08:00| 1.9907018211101906| -|1970-01-01T08:00:01.120+08:00| 1.9788509124245144| -|1970-01-01T08:00:01.130+08:00| 1.9645127287932083| 
-|1970-01-01T08:00:01.140+08:00| 1.9477527250665083| -|1970-01-01T08:00:01.150+08:00| 1.9286363560946513| -|1970-01-01T08:00:01.160+08:00| 1.9072290767278735| -|1970-01-01T08:00:01.170+08:00| 1.8835963418164114| -|1970-01-01T08:00:01.180+08:00| 1.8578036062105014| -|1970-01-01T08:00:01.190+08:00| 1.8299163247603802| -|1970-01-01T08:00:01.200+08:00| 1.7999999523162842| -|1970-01-01T08:00:01.210+08:00| 1.7623635841923329| -|1970-01-01T08:00:01.220+08:00| 1.7129696477516976| -|1970-01-01T08:00:01.230+08:00| 1.6543635959181928| -|1970-01-01T08:00:01.240+08:00| 1.5890908816156328| -|1970-01-01T08:00:01.250+08:00| 1.5196969577678319| -|1970-01-01T08:00:01.260+08:00| 1.4487272772986044| -|1970-01-01T08:00:01.270+08:00| 1.3787272931317647| -|1970-01-01T08:00:01.280+08:00| 1.3122424581911272| -|1970-01-01T08:00:01.290+08:00| 1.251818225400506| -|1970-01-01T08:00:01.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:01.310+08:00| 1.1548000470995912| -|1970-01-01T08:00:01.320+08:00| 1.1130667107899999| -|1970-01-01T08:00:01.330+08:00| 1.0756000393033045| -|1970-01-01T08:00:01.340+08:00| 1.043200033187868| -|1970-01-01T08:00:01.350+08:00| 1.016666692992053| -|1970-01-01T08:00:01.360+08:00| 0.9968000192642223| -|1970-01-01T08:00:01.370+08:00| 0.9844000125527389| -|1970-01-01T08:00:01.380+08:00| 0.9802666734059655| -|1970-01-01T08:00:01.390+08:00| 0.9852000023722649| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.410+08:00| 1.023999999165535| -|1970-01-01T08:00:01.420+08:00| 1.0559999990463256| -|1970-01-01T08:00:01.430+08:00| 1.0959999996423722| -|1970-01-01T08:00:01.440+08:00| 1.1440000009536744| -|1970-01-01T08:00:01.450+08:00| 1.2000000029802322| -|1970-01-01T08:00:01.460+08:00| 1.264000005722046| -|1970-01-01T08:00:01.470+08:00| 1.3360000091791153| -|1970-01-01T08:00:01.480+08:00| 1.4160000133514405| -|1970-01-01T08:00:01.490+08:00| 1.5040000182390214| -|1970-01-01T08:00:01.500+08:00| 1.600000023841858| 
-+-----------------------------+------------------------------------+ -``` - -### 3.20 Spread - -#### Registration statement - -```sql -create function spread as 'org.apache.iotdb.library.dprofile.UDAFSpread' -``` - -#### Usage - -This function computes the spread of a time series, that is, the maximum value minus the minimum value. - -**Name:** SPREAD - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Outputs a single series of the same type as the input. It contains exactly one data point, whose timestamp is 0 and whose value is the spread. - -**Note:** Null values, missing values, and `NaN` in the data are ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select spread(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time|spread(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 26.0| -+-----------------------------+-----------------------+ -``` - - - -### 3.21 ZScore - -#### Registration statement - -```sql -create function zscore as 'org.apache.iotdb.library.dprofile.UDTFZScore' -``` - -#### Usage - -This function standardizes the input series using the z-score method. - -**Name:** ZSCORE - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `compute`: when set to "batch", the function reads all the data and then transforms it; when set to "stream", the user must provide the mean and the standard deviation so that the transformation can be computed in a streaming fashion. Defaults to "batch". -+ `avg`: the mean to use in streaming computation. -+ `sd`: the standard deviation to use in streaming computation. - -**Output Series:** Outputs a single series of type DOUBLE. - 
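As a quick illustration of the two modes described above, here is a minimal Python sketch of z-score standardization, $z = (x - \mu) / \sigma$. It is not the library's code; the function name `zscore` and its signature are hypothetical. The batch branch uses the population standard deviation (dividing by n), which is an assumption chosen because it reproduces the values in the full-data example that follows; the streaming case is modeled by passing `avg` and `sd` explicitly.

```python
import math


def zscore(values, avg=None, sd=None):
    """Standardize values as (x - mean) / standard deviation (illustrative only)."""
    if avg is None or sd is None:
        # "batch" style: derive both statistics from the whole series.
        n = len(values)
        avg = sum(values) / n
        # Population standard deviation (divide by n), an assumption of this sketch.
        sd = math.sqrt(sum((x - avg) ** 2 for x in values) / n)
    if sd == 0:
        raise ValueError("standard deviation is zero; z-score is undefined")
    # "stream" style falls straight through to here with the supplied avg and sd.
    return [(x - avg) / sd for x in values]


# The 20-point input series used in the ZScore example of this document.
series = [0.0, 0.0, 1.0, -1.0, 0.0, 0.0, -2.0, 2.0, 0.0, 0.0,
          1.0, -1.0, -1.0, 1.0, 0.0, 0.0, 10.0, 2.0, -2.0, 0.0]
scores = zscore(series)
print(scores[0], scores[16])
```

With this series the mean is 0.5, so every 0.0 maps to the same negative score and the single 10.0 stands out with a score close to 3.93.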
-#### Examples - -##### Batch computation on all data - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select zscore(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|zscore(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.200+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.300+08:00| 0.20672455764868078| -|1970-01-01T08:00:00.400+08:00| -0.6201736729460423| -|1970-01-01T08:00:00.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.700+08:00| -1.033622788243404| -|1970-01-01T08:00:00.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:00.900+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.000+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.100+08:00| 0.20672455764868078| -|1970-01-01T08:00:01.200+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.300+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.400+08:00| 0.20672455764868078| 
-|1970-01-01T08:00:01.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.700+08:00| 3.9277665953249348| -|1970-01-01T08:00:01.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:01.900+08:00| -1.033622788243404| -|1970-01-01T08:00:02.000+08:00|-0.20672455764868078| -+-----------------------------+--------------------+ -``` - - - -## 4. Anomaly Detection - -### 4.1 IQR - -#### Registration statement - -```sql -create function iqr as 'org.apache.iotdb.library.anomaly.UDTFIQR' -``` - -#### Usage - -This function detects distribution anomalies: data points that lie more than 1.5 times the IQR below the lower quartile or above the upper quartile. - -**Name:** IQR - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: when set to "batch", the function reads all the data and then runs the detection; when set to "stream", the user must provide the lower and upper quartiles so that detection can run in a streaming fashion. Defaults to "batch". -+ `q1`: the lower quartile to use in streaming computation. -+ `q3`: the upper quartile to use in streaming computation. - -**Output Series:** Outputs a single series of type DOUBLE. - -**Note:** $IQR=Q_3-Q_1$ - -#### Examples - -##### Batch computation on all data - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select iqr(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+-----------------+ -| Time|iqr(root.test.s1)| 
-+-----------------------------+-----------------+ -|1970-01-01T08:00:01.700+08:00| 10.0| -+-----------------------------+-----------------+ -``` - -### 4.2 KSigma - -#### Registration statement - -```sql -create function ksigma as 'org.apache.iotdb.library.anomaly.UDTFKSigma' -``` - -#### Usage - -This function detects anomalies using the dynamic K-Sigma algorithm. Within a window, data points that deviate from the mean by more than k times the standard deviation are treated as anomalies and output. - -**Name:** KSIGMA - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `k`: the threshold, in multiples of the standard deviation, beyond which a point counts as a distribution anomaly in the dynamic K-Sigma algorithm. Defaults to 3. -+ `window`: the sliding window size of the dynamic K-Sigma algorithm. Defaults to 10000. - - -**Output Series:** Outputs a single series of the same type as the input series. - -**Note:** k must be greater than 0; otherwise no output is produced. - -#### Examples - -##### Specifying k - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:04.000+08:00| 100.0| -|2020-01-01T00:00:06.000+08:00| 150.0| -|2020-01-01T00:00:08.000+08:00| 200.0| -|2020-01-01T00:00:10.000+08:00| 200.0| -|2020-01-01T00:00:14.000+08:00| 200.0| -|2020-01-01T00:00:15.000+08:00| 200.0| -|2020-01-01T00:00:16.000+08:00| 200.0| -|2020-01-01T00:00:18.000+08:00| 200.0| -|2020-01-01T00:00:20.000+08:00| 150.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ksigma(s1,"k"="1.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -|Time |ksigma(root.test.d1.s1,"k"="1.0")| -+-----------------------------+---------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -+-----------------------------+---------------------------------+ -``` - -### 4.3 LOF - -#### Registration statement - -```sql -create function LOF as
 'org.apache.iotdb.library.anomaly.UDTFLOF' -``` - -#### Usage - -This function uses the Local Outlier Factor (LOF) method to find density anomalies in series. Based on the given k-th distance number and the local outlier factor (lof) threshold, it judges whether each input point is an outlier, that is, an anomaly, and outputs the LOF value of each point. - -**Name:** LOF - -**Input Series:** Multiple input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: the detection method. Defaults to "default", which treats the input as high-dimensional data. When set to "series", a one-dimensional time series is converted into high-dimensional data before computation. -+ `k`: the k of the k-th distance used to compute the local outlier factor. Defaults to 3. -+ `window`: the length of the window of data read at a time. Defaults to 10000. -+ `windowsize`: the dimension of the converted high-dimensional data when the series method is used, that is, the size of a single window. Defaults to 5. - -**Output Series:** Outputs a single time series of type DOUBLE. - -**Note:** Incomplete rows are ignored: they take no part in the computation and are not marked as outliers. - - -#### Examples - -##### Default parameters - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| 1.0| -|1970-01-01T08:00:00.300+08:00| 1.0| 1.0| -|1970-01-01T08:00:00.400+08:00| 1.0| 0.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -1.0| -|1970-01-01T08:00:00.600+08:00| -1.0| -1.0| -|1970-01-01T08:00:00.700+08:00| -1.0| 0.0| -|1970-01-01T08:00:00.800+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1,s2) from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|lof(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.100+08:00| 3.8274824267668244| -|1970-01-01T08:00:00.200+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.300+08:00| 2.838155437762879| -|1970-01-01T08:00:00.400+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.500+08:00| 2.73518261244453| -|1970-01-01T08:00:00.600+08:00| 2.371440975708148| -|1970-01-01T08:00:00.700+08:00| 2.73518261244453| -|1970-01-01T08:00:00.800+08:00| 1.7561416374270742| -+-----------------------------+-------------------------------------+ -``` - -##### Diagnosing a one-dimensional time series - -Input series: - -``` 
-+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 1.0| -|1970-01-01T08:00:00.200+08:00| 2.0| -|1970-01-01T08:00:00.300+08:00| 3.0| -|1970-01-01T08:00:00.400+08:00| 4.0| -|1970-01-01T08:00:00.500+08:00| 5.0| -|1970-01-01T08:00:00.600+08:00| 6.0| -|1970-01-01T08:00:00.700+08:00| 7.0| -|1970-01-01T08:00:00.800+08:00| 8.0| -|1970-01-01T08:00:00.900+08:00| 9.0| -|1970-01-01T08:00:01.000+08:00| 10.0| -|1970-01-01T08:00:01.100+08:00| 11.0| -|1970-01-01T08:00:01.200+08:00| 12.0| -|1970-01-01T08:00:01.300+08:00| 13.0| -|1970-01-01T08:00:01.400+08:00| 14.0| -|1970-01-01T08:00:01.500+08:00| 15.0| -|1970-01-01T08:00:01.600+08:00| 16.0| -|1970-01-01T08:00:01.700+08:00| 17.0| -|1970-01-01T08:00:01.800+08:00| 18.0| -|1970-01-01T08:00:01.900+08:00| 19.0| -|1970-01-01T08:00:02.000+08:00| 20.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1, "method"="series") from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|lof(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 3.77777777777778| -|1970-01-01T08:00:00.200+08:00| 4.32727272727273| -|1970-01-01T08:00:00.300+08:00| 4.85714285714286| -|1970-01-01T08:00:00.400+08:00| 5.40909090909091| -|1970-01-01T08:00:00.500+08:00| 5.94999999999999| -|1970-01-01T08:00:00.600+08:00| 6.43243243243243| -|1970-01-01T08:00:00.700+08:00| 6.79999999999999| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 7.0| -|1970-01-01T08:00:01.000+08:00| 6.79999999999999| -|1970-01-01T08:00:01.100+08:00| 6.43243243243243| -|1970-01-01T08:00:01.200+08:00| 5.94999999999999| -|1970-01-01T08:00:01.300+08:00| 5.40909090909091| -|1970-01-01T08:00:01.400+08:00| 4.85714285714286| -|1970-01-01T08:00:01.500+08:00| 4.32727272727273| -|1970-01-01T08:00:01.600+08:00| 3.77777777777778| 
-+-----------------------------+--------------------+ -``` - -### 4.4 MissDetect - -#### Registration Statement - -```sql -create function missdetect as 'org.apache.iotdb.library.anomaly.UDTFMissDetect' -``` - -#### Function Overview - -This function detects missing-data anomalies. In some datasets, missing values are filled by linear interpolation, which leaves perfectly linear segments in the data, and these segments tend to be long. The function detects missing-data anomalies by finding such perfectly linear segments. - -**Function name:** MISSDETECT - -**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `minlen`: The minimum length of a perfectly linear segment to be marked as anomalous. It is an integer greater than or equal to 10; the default is 10. - -**Output series:** A single output series of type BOOLEAN, indicating whether each data point is a missing-data anomaly. - -**Note:** `NaN` values in the data are ignored. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 0.0| -|2021-07-01T12:00:01.000+08:00| 1.0| -|2021-07-01T12:00:02.000+08:00| 0.0| -|2021-07-01T12:00:03.000+08:00| 1.0| -|2021-07-01T12:00:04.000+08:00| 0.0| -|2021-07-01T12:00:05.000+08:00| 0.0| -|2021-07-01T12:00:06.000+08:00| 0.0| -|2021-07-01T12:00:07.000+08:00| 0.0| -|2021-07-01T12:00:08.000+08:00| 0.0| -|2021-07-01T12:00:09.000+08:00| 0.0| -|2021-07-01T12:00:10.000+08:00| 0.0| -|2021-07-01T12:00:11.000+08:00| 0.0| -|2021-07-01T12:00:12.000+08:00| 0.0| -|2021-07-01T12:00:13.000+08:00| 0.0| -|2021-07-01T12:00:14.000+08:00| 0.0| -|2021-07-01T12:00:15.000+08:00| 0.0| -|2021-07-01T12:00:16.000+08:00| 1.0| -|2021-07-01T12:00:17.000+08:00| 0.0| -|2021-07-01T12:00:18.000+08:00| 1.0| -|2021-07-01T12:00:19.000+08:00| 0.0| -|2021-07-01T12:00:20.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -SQL statement used for the query: - -```sql -select missdetect(s2,'minlen'='10') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------+ -| Time|missdetect(root.test.d2.s2, "minlen"="10")| -+-----------------------------+------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| false| -|2021-07-01T12:00:01.000+08:00| false| -|2021-07-01T12:00:02.000+08:00| false| -|2021-07-01T12:00:03.000+08:00| false| 
-|2021-07-01T12:00:04.000+08:00| true| -|2021-07-01T12:00:05.000+08:00| true| -|2021-07-01T12:00:06.000+08:00| true| -|2021-07-01T12:00:07.000+08:00| true| -|2021-07-01T12:00:08.000+08:00| true| -|2021-07-01T12:00:09.000+08:00| true| -|2021-07-01T12:00:10.000+08:00| true| -|2021-07-01T12:00:11.000+08:00| true| -|2021-07-01T12:00:12.000+08:00| true| -|2021-07-01T12:00:13.000+08:00| true| -|2021-07-01T12:00:14.000+08:00| true| -|2021-07-01T12:00:15.000+08:00| true| -|2021-07-01T12:00:16.000+08:00| false| -|2021-07-01T12:00:17.000+08:00| false| -|2021-07-01T12:00:18.000+08:00| false| -|2021-07-01T12:00:19.000+08:00| false| -|2021-07-01T12:00:20.000+08:00| false| -+-----------------------------+------------------------------------------+ -``` - -### 4.5 Range - -#### Registration Statement - -```sql -create function range as 'org.apache.iotdb.library.anomaly.UDTFRange' -``` - -#### Function Overview - -This function finds range anomalies in a time series. Based on the supplied lower and upper bounds, it determines whether each input value is out of bounds (that is, anomalous) and outputs all anomalous points as a new time series. - -**Function name:** RANGE - -**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `lower_bound`: The lower bound for range anomaly detection. -+ `upper_bound`: The upper bound for range anomaly detection. - -**Output series:** A single output series of the same type as the input series. - -**Note:** `upper_bound` must be greater than `lower_bound`; otherwise nothing is output. - - -#### Examples - -##### Specifying the upper and lower bounds - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - 
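For readers who want to reason about the bound check outside the database, here is a minimal Python sketch of the rule RANGE describes. It is not the UDF's implementation. The helper name `range_anomalies` and the integer second offsets used as timestamps are hypothetical, and treating values exactly equal to a bound as in range is an assumption.

```python
def range_anomalies(points, lower_bound, upper_bound):
    """Return the (time, value) pairs whose value lies outside the bounds.

    Assumption: values equal to a bound count as in range. NaN is never
    flagged, because every comparison with NaN evaluates to False.
    """
    # The text states that nothing is output unless upper_bound > lower_bound.
    if upper_bound <= lower_bound:
        return []
    return [(t, v) for t, v in points if v < lower_bound or v > upper_bound]


# The sample series above, with timestamps reduced to second offsets.
points = [
    (2, 100.0), (3, 101.0), (4, 102.0), (6, 104.0), (8, 126.0),
    (10, 108.0), (14, 112.0), (15, 113.0), (16, 114.0), (18, 116.0),
    (20, 118.0), (22, 120.0), (26, 124.0), (28, 126.0), (30, float("nan")),
]

anomalies = range_anomalies(points, 101.0, 125.0)
print(anomalies)
```

With bounds of 101.0 and 125.0, only the values 100.0 and 126.0 fall outside the range, and the trailing `NaN` is skipped.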
-SQL statement used for the query: - -```sql -select range(s1,"lower_bound"="101.0","upper_bound"="125.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------------------+ -|Time |range(root.test.d1.s1,"lower_bound"="101.0","upper_bound"="125.0")| -+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -+-----------------------------+------------------------------------------------------------------+ -``` - -### 4.6 TwoSidedFilter - -#### Registration Statement - -```sql -create function twosidedfilter as 'org.apache.iotdb.library.anomaly.UDTFTwoSidedFilter' -``` - -#### Function Overview - -This function filters anomalous points out of the input series using the two-sided window detection method. - -**Function name:** TWOSIDEDFILTER - -**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE. - -**Output series:** A single output series of the same type as the input; it is the input series with the anomalous points removed. - -**Parameters:** - -- `len`: The window size used by the two-sided window detection method. It must be a positive integer; the default is 5. For example, when `len`=3, the algorithm takes a window of length 3 both before and after the point and computes the anomaly degree within those windows. -- `threshold`: The threshold for the anomaly degree, in the range (0,1); the default is 0.3. The higher the threshold, the stricter the criterion the function applies to the anomaly degree. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:00:37.000+08:00| 1484.0| -|1970-01-01T08:00:38.000+08:00| 1055.0| -|1970-01-01T08:00:39.000+08:00| 1050.0| 
-|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -SQL statement used for the query: - -```sql -select TwoSidedFilter(s0, 'len'='5', 'threshold'='0.3') from root.test -``` - -Output series: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -### 4.7 Outlier - -#### Registration Statement - -```sql -create function outlier as 'org.apache.iotdb.library.anomaly.UDTFOutlier' -``` - -#### Function Overview - -This function detects distance-based outliers. Within the current window, a point is an outlier if the number of its neighbors within the distance threshold (including the point itself) is less than the density threshold. - -**Function name:** OUTLIER - -**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `r`: The distance threshold used in distance-based anomaly detection. -+ `k`: The density threshold used in distance-based anomaly detection. -+ 
`w`: The size of the sliding window. -+ `s`: The step of the sliding window. - -**Output series:** A single output series of the same type as the input series. - -#### Examples - -##### Specifying query parameters - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|2020-01-04T23:59:55.000+08:00| 56.0| -|2020-01-04T23:59:56.000+08:00| 55.1| -|2020-01-04T23:59:57.000+08:00| 54.2| -|2020-01-04T23:59:58.000+08:00| 56.3| -|2020-01-04T23:59:59.000+08:00| 59.0| -|2020-01-05T00:00:00.000+08:00| 60.0| -|2020-01-05T00:00:01.000+08:00| 60.5| -|2020-01-05T00:00:02.000+08:00| 64.5| -|2020-01-05T00:00:03.000+08:00| 69.0| -|2020-01-05T00:00:04.000+08:00| 64.2| -|2020-01-05T00:00:05.000+08:00| 62.3| -|2020-01-05T00:00:06.000+08:00| 58.0| -|2020-01-05T00:00:07.000+08:00| 58.9| -|2020-01-05T00:00:08.000+08:00| 52.0| -|2020-01-05T00:00:09.000+08:00| 62.3| -|2020-01-05T00:00:10.000+08:00| 61.0| -|2020-01-05T00:00:11.000+08:00| 64.2| -|2020-01-05T00:00:12.000+08:00| 61.8| -|2020-01-05T00:00:13.000+08:00| 64.0| -|2020-01-05T00:00:14.000+08:00| 63.0| -+-----------------------------+------------+ -``` - -SQL statement used for the query: - -```sql -select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|outlier(root.test.s1,"r"="5.0","k"="4","w"="10","s"="5")| -+-----------------------------+--------------------------------------------------------+ -|2020-01-05T00:00:03.000+08:00| 69.0| -|2020-01-05T00:00:08.000+08:00| 52.0| -+-----------------------------+--------------------------------------------------------+ -``` - -## 5. 
Frequency Domain Analysis - -### 5.1 Conv - -#### Registration Statement - -```sql -create function conv as 'org.apache.iotdb.library.frequency.UDTFConv' -``` - -#### Function Overview - -This function computes the convolution of two input series, which is equivalent to polynomial multiplication. - - -**Function name:** CONV - -**Input series:** Exactly two input series are supported, both of type INT32 / INT64 / FLOAT / DOUBLE. - -**Output series:** A single output series of type DOUBLE, which is the convolution of the two series. Its timestamps start from 0 and only indicate order. - -**Note:** `NaN` values in the input series are ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| 7.0| -|1970-01-01T08:00:00.001+08:00| 0.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL statement used for the query: - -```sql -select conv(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------+ -| Time|conv(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 7.0| -|1970-01-01T08:00:00.001+08:00| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| -|1970-01-01T08:00:00.003+08:00| 2.0| -+-----------------------------+--------------------------------------+ -``` - -### 5.2 Deconv - -#### Registration Statement - -```sql -create function deconv as 'org.apache.iotdb.library.frequency.UDTFDeconv' -``` - -#### Function Overview - -This function computes the deconvolution of two input series, which is equivalent to polynomial division. - -**Function name:** DECONV - -**Input series:** Exactly two input series are supported, both of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `result`: The deconvolution result to output, either 'quotient' or 'remainder', corresponding to the quotient and the remainder of the deconvolution. By default, the quotient is output. - -**Output series:** A single output series of type DOUBLE. It is the result of deconvolving the second series from the first (the first series divided by the second). Its timestamps start from 0 and only indicate order. - -**Note:** `NaN` values in the input series are ignored. - -#### Examples - -##### Computing the quotient of the deconvolution - -When the `result` parameter is omitted or set to 'quotient', this function computes the quotient of the deconvolution. - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s3|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 8.0| 7.0| 
-|1970-01-01T08:00:00.001+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| null| -|1970-01-01T08:00:00.003+08:00| 2.0| null| -+-----------------------------+---------------+---------------+ -``` - - -SQL statement used for the query: - -```sql -select deconv(s3,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2)| -+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - -##### Computing the remainder of the deconvolution - -When the `result` parameter is 'remainder', this function computes the remainder of the deconvolution. The input series is the same as above, and the SQL statement used for the query is: - -```sql -select deconv(s3,s2,'result'='remainder') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2, "result"="remainder")| -+-----------------------------+--------------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 0.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -+-----------------------------+--------------------------------------------------------------+ -``` - -### 5.3 DWT - -#### Registration Statement - -```sql -create function dwt as 'org.apache.iotdb.library.frequency.UDTFDWT' -``` - -#### Function Overview - -This function applies a one-dimensional discrete wavelet transform to the input series. - -**Function name:** DWT - -**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of wavelet filter. The available values are 'Haar', 'DB4', 'DB6', and 'DB8', where DB stands for Daubechies. The value is case-insensitive. If this parameter is not set, the user must supply the wavelet filter coefficients. -+ `coef`: The wavelet filter coefficients. If this parameter is supplied, separate the items with commas (',') and do not add spaces or other symbols. -+ `layer`: The number of times the transform is applied. The number of output vectors equals $layer+1$. The default is 1. - -**Output series:** A single output series of type DOUBLE, with the same length as the input. - -**Note:** The length of the input series must be an integer power of 2. - -#### Examples - -##### Haar transform - - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| 
-+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.100+08:00| 0.2| -|1970-01-01T08:00:00.200+08:00| 1.5| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.600+08:00| 0.8| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.800+08:00| 2.5| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+---------------+ -``` - -SQL statement used for the query: - -```sql -select dwt(s1,"method"="haar") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|dwt(root.test.d1.s1, "method"="haar")| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.14142135834465192| -|1970-01-01T08:00:00.100+08:00| 1.909188342921157| -|1970-01-01T08:00:00.200+08:00| 1.6263456473052773| -|1970-01-01T08:00:00.300+08:00| 1.9798989957517026| -|1970-01-01T08:00:00.400+08:00| 3.252691126023161| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776479437628| -|1970-01-01T08:00:00.800+08:00| -0.14142135834465192| -|1970-01-01T08:00:00.900+08:00| 0.21213200063848547| -|1970-01-01T08:00:01.000+08:00| -0.7778174761639416| -|1970-01-01T08:00:01.100+08:00| -0.8485281289944873| -|1970-01-01T08:00:01.200+08:00| 0.2828427799095765| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426400127697095| -|1970-01-01T08:00:01.500+08:00| -0.42426408557066786| -+-----------------------------+-------------------------------------+ -``` - - -### 5.4 IDWT - -#### Registration Statement - -```sql -create function idwt as 
'org.apache.iotdb.library.frequency.UDTFIDWT' -``` - -#### Function Overview - -This function applies a one-dimensional inverse discrete wavelet transform to the input series, restoring the wavelet coefficients produced by a DWT decomposition to the original data. - -**Function name:** IDWT - -**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of wavelet filter. The available values are 'Haar', 'DB4', 'DB6', and 'DB8', where DB stands for Daubechies. The value is case-insensitive. If this parameter is not set, the user must supply the wavelet filter coefficients. -+ `coef`: The wavelet filter coefficients. If this parameter is supplied, separate the items with commas (',') and do not add spaces or other symbols. -+ `layer`: The number of times the transform is applied. The number of output vectors equals $layer+1$. The default is 1. - -**Output series:** A single output series of type DOUBLE, with the same length as the input. - -**Note:** -* The length of the input series must be an integer power of 2. -* The IDWT parameters (method/coef/layer) must match those used for the corresponding DWT transform in order to restore the original data correctly. -* The input of IDWT is usually the output of the DWT function. - -#### Examples - -##### Haar transform - - -Input series: - -``` -+-----------------------------+--------------------+ -| Time| root.test.d1.s2| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 0.1414213562373095| -|1970-01-01T08:00:00.100+08:00| 1.909188309203678| -|1970-01-01T08:00:00.200+08:00| 1.6263455967290592| -|1970-01-01T08:00:00.300+08:00| 1.979898987322333| -|1970-01-01T08:00:00.400+08:00| 3.2526911934581184| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776310850235| -|1970-01-01T08:00:00.800+08:00| -0.1414213562373095| -|1970-01-01T08:00:00.900+08:00| 0.21213203435596428| -|1970-01-01T08:00:01.000+08:00| -0.7778174593052022| -|1970-01-01T08:00:01.100+08:00| -0.8485281374238569| -|1970-01-01T08:00:01.200+08:00| 0.2828427124746189| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426406871192857| -|1970-01-01T08:00:01.500+08:00|-0.42426406871192857| -+-----------------------------+--------------------+ -``` - -SQL statement used for the query: - -```sql -select idwt(s2,"method"="haar") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------+ -| Time|idwt(root.test.d1.s2, "method"="haar")| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 
-|1970-01-01T08:00:00.100+08:00| 0.19999999999999998| -|1970-01-01T08:00:00.200+08:00| 1.4999999999999996| -|1970-01-01T08:00:00.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.6999999999999997| -|1970-01-01T08:00:00.600+08:00| 0.7999999999999998| -|1970-01-01T08:00:00.700+08:00| 1.9999999999999996| -|1970-01-01T08:00:00.800+08:00| 2.4999999999999996| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.9999999999999996| -|1970-01-01T08:00:01.200+08:00| 1.7999999999999998| -|1970-01-01T08:00:01.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:01.400+08:00| 0.9999999999999998| -|1970-01-01T08:00:01.500+08:00| 1.5999999999999999| -+-----------------------------+--------------------------------------+ -``` - - -### 5.5 FFT - -#### Registration Statement - -```sql -create function fft as 'org.apache.iotdb.library.frequency.UDTFFFT' -``` - -#### Function Overview - -This function applies a fast Fourier transform to the input series. - -**Function name:** FFT - -**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of Fourier transform, either 'uniform' or 'nonuniform'; the default is 'uniform'. With 'uniform', timestamps are ignored, all data points are treated as equidistant, and the uniform fast Fourier algorithm is applied. With 'nonuniform', the non-uniform fast Fourier algorithm would be applied according to the timestamps (not implemented). -+ `result`: The part of the transform result to output, one of 'real', 'imag', 'abs', or 'angle', corresponding to the real part, imaginary part, modulus, and argument of the result. By default, the modulus is output. -+ `compress`: The compression parameter, in the range (0,1]; it is the proportion of energy retained under lossy compression. By default, no compression is applied. - -**Output series:** A single output series of type DOUBLE, with the same length as the input. Its timestamps start from 0 and only indicate order. - -**Note:** `NaN` values in the input series are ignored. - -#### Examples - -##### Uniform Fourier transform - -When the `method` parameter is omitted or set to 'uniform', this function performs a uniform Fourier transform. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| 
-1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL statement used for the query: - -```sql -select fft(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------+ -| Time| fft(root.test.d1.s1)| -+-----------------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 1.2727111142703152E-8| -|1970-01-01T08:00:00.002+08:00| 2.385520799101839E-7| -|1970-01-01T08:00:00.003+08:00| 8.723291723972645E-8| -|1970-01-01T08:00:00.004+08:00| 19.999999960195904| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| -|1970-01-01T08:00:00.006+08:00| 3.2260694930700566E-7| -|1970-01-01T08:00:00.007+08:00| 8.723291605373329E-8| -|1970-01-01T08:00:00.008+08:00| 1.108657103979944E-7| -|1970-01-01T08:00:00.009+08:00| 1.2727110997246171E-8| -|1970-01-01T08:00:00.010+08:00|1.9852334701272664E-23| -|1970-01-01T08:00:00.011+08:00| 1.2727111194499847E-8| -|1970-01-01T08:00:00.012+08:00| 1.108657103979944E-7| -|1970-01-01T08:00:00.013+08:00| 8.723291785769131E-8| -|1970-01-01T08:00:00.014+08:00| 3.226069493070057E-7| -|1970-01-01T08:00:00.015+08:00| 9.999999850988388| -|1970-01-01T08:00:00.016+08:00| 19.999999960195904| -|1970-01-01T08:00:00.017+08:00| 8.723291747109068E-8| -|1970-01-01T08:00:00.018+08:00| 2.3855207991018386E-7| -|1970-01-01T08:00:00.019+08:00| 1.2727112069910878E-8| -+-----------------------------+----------------------+ -``` - -Note: The input series follows $y=sin(2\pi t/4)+2sin(2\pi 
t/5)$ and has length 20, so the output series has peaks at $k=4$ and $k=5$. - -##### Uniform Fourier transform with compression - -The input series is the same as above, and the SQL statement used for the query is: - -```sql -select fft(s1, 'result'='real', 'compress'='0.99'), fft(s1, 'result'='imag','compress'='0.99') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| fft(root.test.d1.s1,| fft(root.test.d1.s1,| -| | "result"="real",| "result"="imag",| -| | "compress"="0.99")| "compress"="0.99")| -+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - -Note: Because the Fourier transform result is conjugate-symmetric, only the first half of the compressed result is kept. Data points are retained from low to high frequency according to the given compression parameter until the proportion of retained energy exceeds that value. The last data point is kept to indicate the series length. - -### 5.6 HighPass - -#### Registration Statement - -```sql -create function highpass as 'org.apache.iotdb.library.frequency.UDTFHighPass' -``` - -#### Function Overview - -This function applies high-pass filtering to the input series, extracting the components above the cutoff frequency. The timestamps of the input series are ignored, and all data points are treated as equidistant. - -**Function name:** HIGHPASS - -**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `wpass`: The normalized cutoff frequency, in the range (0,1). This parameter is required. - -**Output series:** A single output series of type DOUBLE. It is the filtered series, with the same length and timestamps as the input. - -**Note:** `NaN` values in the input series are ignored. - -#### Examples - - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| 
-|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - - -SQL statement used for the query: - -```sql -select highpass(s1,'wpass'='0.45') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------+ -| Time|highpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9999999534830373| -|1970-01-01T08:00:01.000+08:00| 1.7462829277628608E-8| -|1970-01-01T08:00:02.000+08:00| -0.9999999593178128| -|1970-01-01T08:00:03.000+08:00| -4.1115269056426626E-8| -|1970-01-01T08:00:04.000+08:00| 0.9999999925494194| -|1970-01-01T08:00:05.000+08:00| 3.328126513330016E-8| -|1970-01-01T08:00:06.000+08:00| -1.0000000183304454| -|1970-01-01T08:00:07.000+08:00| 6.260191433311374E-10| -|1970-01-01T08:00:08.000+08:00| 1.0000000018134796| -|1970-01-01T08:00:09.000+08:00| -3.097210911744423E-17| -|1970-01-01T08:00:10.000+08:00| -1.0000000018134794| -|1970-01-01T08:00:11.000+08:00| -6.260191627862097E-10| -|1970-01-01T08:00:12.000+08:00| 1.0000000183304454| -|1970-01-01T08:00:13.000+08:00| -3.328126501424346E-8| -|1970-01-01T08:00:14.000+08:00| -0.9999999925494196| -|1970-01-01T08:00:15.000+08:00| 4.111526915498874E-8| -|1970-01-01T08:00:16.000+08:00| 0.9999999593178128| -|1970-01-01T08:00:17.000+08:00| -1.7462829341296528E-8| 
-|1970-01-01T08:00:18.000+08:00| -0.9999999534830369| -|1970-01-01T08:00:19.000+08:00| -1.035237222742873E-16| -+-----------------------------+-----------------------------------------+ -``` - -Note: The input series follows $y=sin(2\pi t/4)+2sin(2\pi t/5)$ and has length 20, so the output series after high-pass filtering follows $y=sin(2\pi t/4)$. - -### 5.7 IFFT - -#### Registration Statement - -```sql -create function ifft as 'org.apache.iotdb.library.frequency.UDTFIFFT' -``` - -#### Function Overview - -This function treats the two input series as the real and imaginary parts of a complex series, applies an inverse fast Fourier transform, and outputs the real part of the result. For the input data format, see the output of the `FFT` function; the compressed output of the `FFT` function is also accepted as input. - -**Function name:** IFFT - -**Input series:** Exactly two input series are supported, both of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `start`: The start time of the output series, given as a time string in the format 'yyyy-MM-dd HH:mm:ss'. The default is '1970-01-01 08:00:00'. -+ `interval`: The time interval of the output series, given as a positive number with a unit. Five units are currently supported: 'ms' (millisecond), 's' (second), 'm' (minute), 'h' (hour), and 'd' (day). The default is 1s. - - -**Output series:** A single output series of type DOUBLE. It is an equidistant time series whose values are the result of the inverse fast Fourier transform that takes the two input series as the real and imaginary parts, in that order. - -**Note:** If a row contains a null value or `NaN`, that row is ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| root.test.d1.re| root.test.d1.im| -+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - - -SQL statement used for the query: - -```sql -select ifft(re, im, 'interval'='1m', 'start'='2021-01-01 00:00:00') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------------+ -| Time|ifft(root.test.d1.re, root.test.d1.im, "interval"="1m",| 
-| | "start"="2021-01-01 00:00:00")| -+-----------------------------+-------------------------------------------------------+ -|2021-01-01T00:00:00.000+08:00| 2.902112992431231| -|2021-01-01T00:01:00.000+08:00| 1.1755704705132448| -|2021-01-01T00:02:00.000+08:00| -2.175570513757101| -|2021-01-01T00:03:00.000+08:00| -1.9021130389094498| -|2021-01-01T00:04:00.000+08:00| 0.9999999925494194| -|2021-01-01T00:05:00.000+08:00| 1.902113046743454| -|2021-01-01T00:06:00.000+08:00| 0.17557053610884188| -|2021-01-01T00:07:00.000+08:00| -1.1755704886020932| -|2021-01-01T00:08:00.000+08:00| -0.9021130371347148| -|2021-01-01T00:09:00.000+08:00| 3.552713678800501E-16| -|2021-01-01T00:10:00.000+08:00| 0.9021130371347154| -|2021-01-01T00:11:00.000+08:00| 1.1755704886020932| -|2021-01-01T00:12:00.000+08:00| -0.17557053610884144| -|2021-01-01T00:13:00.000+08:00| -1.902113046743454| -|2021-01-01T00:14:00.000+08:00| -0.9999999925494196| -|2021-01-01T00:15:00.000+08:00| 1.9021130389094498| -|2021-01-01T00:16:00.000+08:00| 2.1755705137571004| -|2021-01-01T00:17:00.000+08:00| -1.1755704705132448| -|2021-01-01T00:18:00.000+08:00| -2.902112992431231| -|2021-01-01T00:19:00.000+08:00| -3.552713678800501E-16| -+-----------------------------+-------------------------------------------------------+ -``` - -### 5.8 LowPass - -#### Registration Statement - -```sql -create function lowpass as 'org.apache.iotdb.library.frequency.UDTFLowPass' -``` - -#### Function Overview - -This function applies low-pass filtering to the input series, extracting the components below the cutoff frequency. The timestamps of the input series are ignored, and all data points are treated as equidistant. - -**Function name:** LOWPASS - -**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `wpass`: The normalized cutoff frequency, in the range (0,1). This parameter is required. - -**Output series:** A single output series of type DOUBLE. It is the filtered series, with the same length and timestamps as the input. - -**Note:** `NaN` values in the input series are ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| 
-|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - - -SQL statement used for the query: - -```sql -select lowpass(s1,'wpass'='0.45') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|lowpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.9021130073323922| -|1970-01-01T08:00:01.000+08:00| 1.1755704705132448| -|1970-01-01T08:00:02.000+08:00| -1.1755705286582614| -|1970-01-01T08:00:03.000+08:00| -1.9021130389094498| -|1970-01-01T08:00:04.000+08:00| 7.450580419288145E-9| -|1970-01-01T08:00:05.000+08:00| 1.902113046743454| -|1970-01-01T08:00:06.000+08:00| 1.1755705212076808| -|1970-01-01T08:00:07.000+08:00| -1.1755704886020932| -|1970-01-01T08:00:08.000+08:00| -1.9021130222335536| -|1970-01-01T08:00:09.000+08:00| 3.552713678800501E-16| -|1970-01-01T08:00:10.000+08:00| 1.9021130222335536| -|1970-01-01T08:00:11.000+08:00| 1.1755704886020932| -|1970-01-01T08:00:12.000+08:00| -1.1755705212076801| -|1970-01-01T08:00:13.000+08:00| -1.902113046743454| -|1970-01-01T08:00:14.000+08:00| -7.45058112983088E-9| -|1970-01-01T08:00:15.000+08:00| 1.9021130389094498| -|1970-01-01T08:00:16.000+08:00| 1.1755705286582616| 
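-|1970-01-01T08:00:17.000+08:00| -1.1755704705132448| -|1970-01-01T08:00:18.000+08:00| -1.9021130073323924| -|1970-01-01T08:00:19.000+08:00| -2.664535259100376E-16| -+-----------------------------+----------------------------------------+ -``` - -Note: The input series follows $y=sin(2\pi t/4)+2sin(2\pi t/5)$ and has length 20, so the output series after low-pass filtering follows $y=2sin(2\pi t/5)$. - - -### 5.9 Envelope - -#### Registration Statement - -```sql -create function envelope as 'org.apache.iotdb.library.frequency.UDFEnvelopeAnalysis' -``` - -#### Function Overview - -This function takes a one-dimensional array of floating-point numbers and a user-specified modulation frequency, and performs signal demodulation and envelope extraction. The goal of demodulation is to extract the part of interest from a complex signal so that it is easier to understand. For example, demodulation can reveal the envelope of a signal, that is, the trend of its amplitude. - -**Function name:** Envelope - -**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `frequency`: The frequency (optional, a positive number). If this parameter is omitted, the system infers the frequency from the time interval between the timestamps of the series. -+ `amplification`: The amplification factor (optional, a positive integer). The values in the output Time column are positive integers; decimals are never output. When the frequency is less than 1, this parameter can be used to amplify the frequency so that the results display properly. - -**Output series:** -+ `Time`: The values in this column represent frequency, not time. If the output is shown in time format (for example, 1970-01-01T08:00:19.000+08:00), convert it to a timestamp value. - -+ `Envelope(Path, 'frequency'='{frequency}')`: A single output series of type DOUBLE, which is the result of the envelope analysis. - -**Note:** When the values of the original series to be demodulated are not contiguous, this function still treats them as contiguous. It is recommended that the analyzed time series be a segment with complete values, and that a start time and an end time be specified. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:01.000+08:00| 1.0 | -|1970-01-01T08:00:02.000+08:00| 2.0 | -|1970-01-01T08:00:03.000+08:00| 3.0 | -|1970-01-01T08:00:04.000+08:00| 4.0 | -|1970-01-01T08:00:05.000+08:00| 5.0 | -|1970-01-01T08:00:06.000+08:00| 6.0 | -|1970-01-01T08:00:07.000+08:00| 7.0 | -|1970-01-01T08:00:08.000+08:00| 8.0 | -|1970-01-01T08:00:09.000+08:00| 9.0 | -|1970-01-01T08:00:10.000+08:00| 10.0 | -+-----------------------------+---------------+ -``` - -SQL statement used for the query: -```sql -set time_display_type=long; -select envelope(s1),envelope(s1,'frequency'='1000'),envelope(s1,'amplification'='10') from root.test.d1; -``` -Output series: - -``` -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ 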
-|Time|envelope(root.test.d1.s1)|envelope(root.test.d1.s1, "frequency"="1000")|envelope(root.test.d1.s1, "amplification"="10")| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -| 0| 6.284350808484124| 6.284350808484124| 6.284350808484124| -| 100| 1.5581923657404393| 1.5581923657404393| null| -| 200| 0.8503211038340728| 0.8503211038340728| null| -| 300| 0.512808785945551| 0.512808785945551| null| -| 400| 0.26361156774506744| 0.26361156774506744| null| -|1000| null| null| 1.5581923657404393| -|2000| null| null| 0.8503211038340728| -|3000| null| null| 0.512808785945551| -|4000| null| null| 0.26361156774506744| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ - -``` - -## 6. 数据匹配 - -### 6.1 Cov - -#### 注册语句 - -```sql -create function cov as 'org.apache.iotdb.library.dmatch.UDAFCov' -``` - -#### 函数简介 - -本函数用于计算两列数值型数据的总体协方差。 - -**函数名:** COV - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列仅包含一个时间戳为 0、值为总体协方差的数据点。 - -**提示:** - -+ 如果某行数据中包含空值、缺失值或`NaN`,该行数据将会被忽略; -+ 如果数据中所有的行都被忽略,函数将会输出`NaN`。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 
100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select cov(s1,s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------+ -| Time|cov(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 12.291666666666666| -+-----------------------------+-------------------------------------+ -``` - -### 6.2 Dtw - -#### 注册语句 - -```sql -create function dtw as 'org.apache.iotdb.library.dmatch.UDAFDtw' -``` - -#### 函数简介 - -本函数用于计算两列数值型数据的 DTW 距离。 - -**函数名:** DTW - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列仅包含一个时间戳为 0、值为两个时间序列的 DTW 距离值。 - -**提示:** - -+ 如果某行数据中包含空值、缺失值或`NaN`,该行数据将会被忽略; -+ 如果数据中所有的行都被忽略,函数将会输出 0。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.003+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.004+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.005+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.006+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.007+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.008+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.009+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.010+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.011+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.012+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.013+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.014+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.015+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.016+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.017+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.018+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.019+08:00| 
1.0| 2.0| -|1970-01-01T08:00:00.020+08:00| 1.0| 2.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select dtw(s1,s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------+ -| Time|dtw(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 20.0| -+-----------------------------+-------------------------------------+ -``` - -### 6.3 Pearson - -#### 注册语句 - -```sql -create function pearson as 'org.apache.iotdb.library.dmatch.UDAFPearson' -``` - -#### 函数简介 - -本函数用于计算两列数值型数据的皮尔森相关系数。 - -**函数名:** PEARSON - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列仅包含一个时间戳为 0、值为皮尔森相关系数的数据点。 - -**提示:** - -+ 如果某行数据中包含空值、缺失值或`NaN`,该行数据将会被忽略; -+ 如果数据中所有的行都被忽略,函数将会输出`NaN`。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select pearson(s1,s2) from root.test.d2 -``` - -输出序列: - -``` 
-+-----------------------------+-----------------------------------------+ -| Time|pearson(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.5630881927754872| -+-----------------------------+-----------------------------------------+ -``` - -### 6.4 PtnSym - -#### 注册语句 - -```sql -create function ptnsym as 'org.apache.iotdb.library.dmatch.UDTFPtnSym' -``` - -#### 函数简介 - -本函数用于寻找序列中所有对称度小于阈值的对称子序列。对称度通过 DTW 计算,值越小代表序列对称性越高。 - -**函数名:** PTNSYM - -**输入序列:** 仅支持一个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `window`:对称子序列的长度,是一个正整数,默认值为 10。 -+ `threshold`:对称度阈值,是一个非负数,只有对称度小于等于该值的对称子序列才会被输出。在缺省情况下,所有的子序列都会被输出。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列中的每一个数据点对应于一个对称子序列,时间戳为子序列的起始时刻,值为对称度。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s4| -+-----------------------------+---------------+ -|2021-01-01T12:00:00.000+08:00| 1.0| -|2021-01-01T12:00:01.000+08:00| 2.0| -|2021-01-01T12:00:02.000+08:00| 3.0| -|2021-01-01T12:00:03.000+08:00| 2.0| -|2021-01-01T12:00:04.000+08:00| 1.0| -|2021-01-01T12:00:05.000+08:00| 1.0| -|2021-01-01T12:00:06.000+08:00| 1.0| -|2021-01-01T12:00:07.000+08:00| 1.0| -|2021-01-01T12:00:08.000+08:00| 2.0| -|2021-01-01T12:00:09.000+08:00| 3.0| -|2021-01-01T12:00:10.000+08:00| 2.0| -|2021-01-01T12:00:11.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select ptnsym(s4, 'window'='5', 'threshold'='0') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|ptnsym(root.test.d1.s4, "window"="5", "threshold"="0")| -+-----------------------------+------------------------------------------------------+ -|2021-01-01T12:00:00.000+08:00| 0.0| -|2021-01-01T12:00:07.000+08:00| 0.0| -+-----------------------------+------------------------------------------------------+ -``` - -### 6.5 XCorr - 
-#### 注册语句 - -```sql -create function xcorr as 'org.apache.iotdb.library.dmatch.UDTFXCorr' -``` - -#### 函数简介 - -本函数用于计算两条时间序列的互相关函数值, -对离散序列而言,互相关函数可以表示为 -$$CR(n) = \frac{1}{N} \sum_{m=1}^N S_1[m]S_2[m+n]$$ -常用于表征两条序列在不同对齐条件下的相似度。 - -**函数名:** XCORR - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列中共包含$2N-1$个数据点, -其中正中心的值为两条序列按照预先对齐的结果计算的互相关系数(即等于以上公式的$CR(0)$), -前半部分的值表示将后一条输入序列向前平移时计算的互相关系数, -直至两条序列没有重合的数据点(不包含完全分离时的结果$CR(-N)=0.0$), -后半部分类似。 -用公式可表示为(所有序列的索引从1开始计数): -$$OS[i] = CR(-N+i) = \frac{1}{N} \sum_{m=1}^{i} S_1[m]S_2[N-i+m],\ if\ i <= N$$ -$$OS[i] = CR(i-N) = \frac{1}{N} \sum_{m=1}^{2N-i} S_1[i-N+m]S_2[m],\ if\ i > N$$ - -**提示:** - -+ 两条序列中的`null` 和`NaN` 值会被忽略,在计算中表现为 0。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| null| 6| -|2020-01-01T00:00:02.000+08:00| 2| 7| -|2020-01-01T00:00:03.000+08:00| 3| NaN| -|2020-01-01T00:00:04.000+08:00| 4| 9| -|2020-01-01T00:00:05.000+08:00| 5| 10| -+-----------------------------+---------------+---------------+ -``` - - -用于查询的 SQL 语句: - -```sql -select xcorr(s1, s2) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------+ -| Time|xcorr(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+---------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 10.0| -|1970-01-01T08:00:00.002+08:00| 16.0| -|1970-01-01T08:00:00.003+08:00| 16.75| -|1970-01-01T08:00:00.004+08:00| 20.0| -|1970-01-01T08:00:00.005+08:00| 13.2| -|1970-01-01T08:00:00.006+08:00| 5.6| -|1970-01-01T08:00:00.007+08:00| 7.0| -|1970-01-01T08:00:00.008+08:00| 0.0| -+-----------------------------+---------------------------------------+ -``` - - -### 6.6 Pattern\_match - -#### 注册语句 - -```SQL 
-create function pattern_match as 'org.apache.iotdb.library.match.UDAFPatternMatch' -``` - -#### 函数简介 - -本函数用于对输入的某一条时间序列与预设的`pattern`进行模式匹配,当相似度小于等于某个预设阈值时判定为匹配成功,并将最终匹配结果以`json`列表的方式输出。 - -**函数名:** PATTERN\_MATCH - -**输入序列:** 仅支持一个输入序列,类型为INT32,INT64,FLOAT,DOUBLE,BOOLEAN。 - -**参数:** - -* `timePattern` :以时间戳组成的字符串,以逗号分隔。长度必须大于1 。必填项。 -* `valuePattern `:以数字组成的字符串,以逗号分隔。数量与 `timePattern `相同,长度必须大于1。必填项。 - -> 提示:布尔类型的`valuePattern `,需要用1,0来表示`true`和`false`。 - -* `threshold` :阈值。Float类型。必填项。 - -**输出序列**:输出结果为包含所有成功匹配段落的起始时间戳`startTime`、终止时间戳`endTime`及相似度值`distance`的`json`列表。 - -#### 使用示例 -1. 线性数据 - -输入序列: - -```SQL -IoTDB> select s0 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s0| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.1| -|1970-01-01T08:00:00.003+08:00| 1.2| -|1970-01-01T08:00:00.004+08:00| 1.3| -|1970-01-01T08:00:00.005+08:00| 0.0| -+-----------------------------+-------------+ -``` - -用于查询的SQL语句: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result from root.db.d0 -``` - -输出序列: - -```SQL -+--------------------------------------------------------------------------------------------------+ -| match_result| -+--------------------------------------------------------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]| -+--------------------------------------------------------------------------------------------------+ -``` - -2. 
布尔类型数据 - -输入序列: - -```SQL -IoTDB> select s1 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| true| -|1970-01-01T08:00:00.002+08:00| true| -|1970-01-01T08:00:00.003+08:00| true| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| false| -+-----------------------------+-------------+ -``` - -用于查询的SQL语句: - -```SQL -select pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", "threshold"="0.5") as match_result from root.db.d0 -``` - -输出序列: - -```SQL -+-------------------------------------------------+ -| match_result| -+-------------------------------------------------+ -|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+-------------------------------------------------+ -``` - -3. V型数据 - -输入序列: - -```SQL -IoTDB> select s2 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s2| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| -1.0| -|1970-01-01T08:00:00.003+08:00| -2.0| -|1970-01-01T08:00:00.004+08:00| -3.0| -|1970-01-01T08:00:00.005+08:00| -2.0| -|1970-01-01T08:00:00.006+08:00| -1.0| -|1970-01-01T08:00:00.007+08:00| -0.0| -|1970-01-01T08:00:00.008+08:00| -0.0| -|1970-01-01T08:00:00.009+08:00| -0.0| -|1970-01-01T08:00:00.010+08:00| -0.0| -+-----------------------------+-------------+ -``` - -用于查询的SQL语句: - -```SQL -select pattern_match (s2, "timePattern"="1,2,3,4,5,6,7", "valuePattern"="0.0,-1.0,-2.0,-3.0,-2.0,-1.0,-0.0", "threshold"="10") as match_result from root.db.d0 -``` - -输出序列: - -```SQL -+----------------------------------------------+ -| match_result| -+----------------------------------------------+ -|[{"distance":0.53,"startTime":1,"endTime":10}]| -+----------------------------------------------+ -``` - -4. 
多个匹配模式 - -输入序列: - -```SQL -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------+-------------+ -| Time|root.db.d0.s0|root.db.d0.s1| -+-----------------------------+-------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| true| -|1970-01-01T08:00:00.002+08:00| 1.1| true| -|1970-01-01T08:00:00.003+08:00| 1.2| true| -|1970-01-01T08:00:00.004+08:00| 1.3| false| -|1970-01-01T08:00:00.005+08:00| 0.0| false| -+-----------------------------+-------------+-------------+ -``` - -用于查询的SQL语句: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result1, pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", - "threshold"="0.5") as match_result2 from root.db.d0 -``` - -输出序列: - -```SQL -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -| match_result1| match_result2| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -``` - - -## 7. 
数据修复 - -### 7.1 TimestampRepair - -#### 注册语句 - -```sql -create function timestamprepair as 'org.apache.iotdb.library.drepair.UDTFTimestampRepair' -``` - -#### 函数简介 - -本函数用于时间戳修复。根据给定的标准时间间隔,采用最小化修复代价的方法,通过对数据时间戳的微调,将原本时间戳间隔不稳定的数据修复为严格等间隔的数据。在未给定标准时间间隔的情况下,本函数将使用时间间隔的中位数 (median)、众数 (mode) 或聚类中心 (cluster) 来推算标准时间间隔。 - - -**函数名:** TIMESTAMPREPAIR - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `interval`: 标准时间间隔(单位是毫秒),是一个正整数。在缺省情况下,将根据指定的方法推算。 -+ `method`:推算标准时间间隔的方法,取值为 'median', 'mode' 或 'cluster',仅在`interval`缺省时有效。在缺省情况下,将使用中位数方法进行推算。 - -**输出序列:** 输出单个序列,类型与输入序列相同。该序列是修复后的输入序列。 - -#### 使用示例 - -#### 指定标准时间间隔 - -在给定`interval`参数的情况下,本函数将按照指定的标准时间间隔进行修复。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:19.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:01.000+08:00| 7.0| -|2021-07-01T12:01:11.000+08:00| 8.0| -|2021-07-01T12:01:21.000+08:00| 9.0| -|2021-07-01T12:01:31.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select timestamprepair(s1,'interval'='10000') from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+----------------------------------------------------+ -| Time|timestamprepair(root.test.d2.s1, "interval"="10000")| -+-----------------------------+----------------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:20.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:00.000+08:00| 7.0| -|2021-07-01T12:01:10.000+08:00| 8.0| -|2021-07-01T12:01:20.000+08:00| 9.0| -|2021-07-01T12:01:30.000+08:00| 10.0| 
-|2021-07-01T12:01:40.000+08:00| NaN|
-+-----------------------------+----------------------------------------------------+
-```
-
-#### Inferring the standard interval automatically
-
-If the `interval` parameter is not given, the function repairs the timestamps using an inferred standard interval.
-
-The input series is the same as above. The SQL statement for the query is:
-
-```sql
-select timestamprepair(s1) from root.test.d2
-```
-
-Output series:
-
-```
-+-----------------------------+--------------------------------+
-| Time|timestamprepair(root.test.d2.s1)|
-+-----------------------------+--------------------------------+
-|2021-07-01T12:00:00.000+08:00| 1.0|
-|2021-07-01T12:00:10.000+08:00| 2.0|
-|2021-07-01T12:00:20.000+08:00| 3.0|
-|2021-07-01T12:00:30.000+08:00| 4.0|
-|2021-07-01T12:00:40.000+08:00| 5.0|
-|2021-07-01T12:00:50.000+08:00| 6.0|
-|2021-07-01T12:01:00.000+08:00| 7.0|
-|2021-07-01T12:01:10.000+08:00| 8.0|
-|2021-07-01T12:01:20.000+08:00| 9.0|
-|2021-07-01T12:01:30.000+08:00| 10.0|
-|2021-07-01T12:01:40.000+08:00| NaN|
-+-----------------------------+--------------------------------+
-```
-
-### 7.2 ValueFill
-
-#### Registration statement
-
-```sql
-create function valuefill as 'org.apache.iotdb.library.drepair.UDTFValueFill'
-```
-
-#### Function overview
-
-**Function name:** ValueFill
-
-**Input series:** A single time series of type INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `method`: One of {"mean", "previous", "linear", "likelihood", "AR", "MA", "SCREEN"}; the default is "linear". "mean" fills with the mean value; "previous" fills with the previous value; "linear" fills by linear interpolation; "likelihood" uses maximum likelihood estimation based on a normal distribution of the speed; "AR" fills with an autoregressive model; "MA" fills with a moving average; "SCREEN" fills under speed constraints.
-
-**Output series:** The filled one-dimensional series.
-
-**Note:** The AR method uses an AR(1) model. The time series must be autocorrelated; otherwise the function outputs a single data point (0, 0.0).
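The default "linear" method can be illustrated with a minimal Python sketch. This is not the UDF's implementation: the function name `linear_fill` is hypothetical, and the behavior is inferred only from the linear example in this section, whose output is consistent with interpolating by row position rather than by timestamp and with a leading `NaN` taking the first valid value. Copying the last valid value into trailing gaps is an assumption, since the example has none.

```python
import math


def linear_fill(values):
    """Fill NaN gaps by interpolating evenly between the nearest valid neighbours.

    Interpolation is by row position, not by timestamp (inferred from the
    documented example). Leading gaps copy the first valid value; trailing
    gaps copy the last valid value (the trailing rule is an assumption).
    """
    filled = list(values)
    valid = [i for i, v in enumerate(filled) if not math.isnan(v)]
    if not valid:
        return filled  # nothing to interpolate from
    first, last = valid[0], valid[-1]
    for i in range(first):
        filled[i] = filled[first]
    for i in range(last + 1, len(filled)):
        filled[i] = filled[last]
    # Walk over each pair of consecutive valid positions and fill between them.
    for left, right in zip(valid, valid[1:]):
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            filled[i] = filled[left] + (filled[right] - filled[left]) * frac
    return filled


nan = float("nan")
# Values of root.test.d2.s1 from the linear example, in row order.
s1 = [nan, 101.0, 102.0, 104.0, 126.0, 108.0, nan, 113.0,
      114.0, 116.0, nan, nan, 124.0, 126.0, 128.0]
print(linear_fill(s1))
```

With the example input, the gap between 108.0 and 113.0 becomes 110.5, the midpoint, even though its timestamp is not halfway between its neighbours. That is what suggests position-based interpolation.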
-
-#### Examples
-##### Filling with the linear method
-
-When `method` is omitted or set to 'linear', the function fills values by linear interpolation.
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d2.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:02.000+08:00| NaN|
-|2020-01-01T00:00:03.000+08:00| 101.0|
-|2020-01-01T00:00:04.000+08:00| 102.0|
-|2020-01-01T00:00:06.000+08:00| 104.0|
-|2020-01-01T00:00:08.000+08:00| 126.0|
-|2020-01-01T00:00:10.000+08:00| 108.0|
-|2020-01-01T00:00:14.000+08:00| NaN|
-|2020-01-01T00:00:15.000+08:00| 113.0|
-|2020-01-01T00:00:16.000+08:00| 114.0|
-|2020-01-01T00:00:18.000+08:00| 116.0|
-|2020-01-01T00:00:20.000+08:00| NaN|
-|2020-01-01T00:00:22.000+08:00| NaN|
-|2020-01-01T00:00:26.000+08:00| 124.0|
-|2020-01-01T00:00:28.000+08:00| 126.0|
-|2020-01-01T00:00:30.000+08:00| 128.0|
-+-----------------------------+---------------+
-```
-
-SQL statement for the query:
-
-```sql
-select valuefill(s1) from root.test.d2
-```
-
-Output series:
-
-```
-+-----------------------------+--------------------------+
-| Time|valuefill(root.test.d2.s1)|
-+-----------------------------+--------------------------+
-|2020-01-01T00:00:02.000+08:00| 101.0|
-|2020-01-01T00:00:03.000+08:00| 101.0|
-|2020-01-01T00:00:04.000+08:00| 102.0|
-|2020-01-01T00:00:06.000+08:00| 104.0|
-|2020-01-01T00:00:08.000+08:00| 126.0|
-|2020-01-01T00:00:10.000+08:00| 108.0|
-|2020-01-01T00:00:14.000+08:00| 110.5|
-|2020-01-01T00:00:15.000+08:00| 113.0|
-|2020-01-01T00:00:16.000+08:00| 114.0|
-|2020-01-01T00:00:18.000+08:00| 116.0|
-|2020-01-01T00:00:20.000+08:00| 118.66666666666667|
-|2020-01-01T00:00:22.000+08:00| 121.33333333333333|
-|2020-01-01T00:00:26.000+08:00| 124.0|
-|2020-01-01T00:00:28.000+08:00| 126.0|
-|2020-01-01T00:00:30.000+08:00| 128.0|
-+-----------------------------+--------------------------+
-```
-
-##### Filling with the previous method
-
-When `method` is set to 'previous', the function fills each missing value with the previous value.
-
-The input series is the same as above. The SQL statement for the query is:
-
-```sql
-select valuefill(s1,"method"="previous") from root.test.d2
-```
-
-Output series:
-
-``` -+-----------------------------+-----------------------------------------------+ -| Time|valuefill(root.test.d2.s1, "method"="previous")| -+-----------------------------+-----------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 108.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 116.0| -|2020-01-01T00:00:22.000+08:00| 116.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-----------------------------------------------+ -``` - -### 7.3 ValueRepair - -#### 注册语句 - -```sql -create function valuerepair as 'org.apache.iotdb.library.drepair.UDTFValueRepair' -``` - -#### 函数简介 - -本函数用于对时间序列的数值进行修复。目前,本函数支持两种修复方法:**Screen** 是一种基于速度阈值的方法,在最小改动的前提下使得所有的速度符合阈值要求;**LsGreedy** 是一种基于速度变化似然的方法,将速度变化建模为高斯分布,并采用贪心算法极大化似然函数。 - -**函数名:** VALUEREPAIR - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `method`:修复时采用的方法,取值为 'Screen' 或 'LsGreedy'. 
在缺省情况下,使用 Screen 方法进行修复。 -+ `minSpeed`:该参数仅在使用 Screen 方法时有效。当速度小于该值时会被视作数值异常点加以修复。在缺省情况下为中位数减去三倍绝对中位差。 -+ `maxSpeed`:该参数仅在使用 Screen 方法时有效。当速度大于该值时会被视作数值异常点加以修复。在缺省情况下为中位数加上三倍绝对中位差。 -+ `center`:该参数仅在使用 LsGreedy 方法时有效。对速度变化分布建立的高斯模型的中心。在缺省情况下为 0。 -+ `sigma` :该参数仅在使用 LsGreedy 方法时有效。对速度变化分布建立的高斯模型的标准差。在缺省情况下为绝对中位差。 - -**输出序列:** 输出单个序列,类型与输入序列相同。该序列是修复后的输入序列。 - -**提示:** 输入序列中的`NaN`在修复之前会先进行线性插值填补。 - -#### 使用示例 - -##### 使用 Screen 方法进行修复 - -当`method`缺省或取值为 'Screen' 时,本函数将使用 Screen 方法进行数值修复。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select valuerepair(s1) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+----------------------------+ -| Time|valuerepair(root.test.d2.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 
-|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+----------------------------+ -``` - -##### 使用 LsGreedy 方法进行修复 - -当`method`取值为 'LsGreedy' 时,本函数将使用 LsGreedy 方法进行数值修复。 - -输入序列同上,用于查询的 SQL 语句如下: - -```sql -select valuerepair(s1,'method'='LsGreedy') from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|valuerepair(root.test.d2.s1, "method"="LsGreedy")| -+-----------------------------+-------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-------------------------------------------------+ -``` - -## 8. 
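Series Discovery
-
-### 8.1 ConsecutiveSequences
-
-#### Registration statement
-
-```sql
-create function consecutivesequences as 'org.apache.iotdb.library.series.UDTFConsecutiveSequences'
-```
-
-#### Function overview
-
-This function finds the locally longest consecutive subsequences in multidimensional, strictly equispaced data.
-
-Strictly equispaced data has strictly equal time intervals. Missing data is allowed (both missing rows and missing values), but redundant data and shifted timestamps are not.
-
-A consecutive subsequence is a subsequence whose points are spaced exactly at the standard time interval, with no missing data. A consecutive subsequence is locally longest if it is not a proper subsequence of any other consecutive subsequence.
-
-
-**Function name:** CONSECUTIVESEQUENCES
-
-**Input series:** Multiple input series are supported. They can be of any type but must be strictly equispaced.
-
-**Parameters:**
-
-+ `gap`: The standard time interval, a positive number with a unit. Five units are currently supported: 'ms' (milliseconds), 's' (seconds), 'm' (minutes), 'h' (hours), and 'd' (days). If omitted, the function estimates the standard time interval from the mode of the intervals.
-
-**Output series:** A single output series of type INT32. Each data point corresponds to one locally longest consecutive subsequence: the timestamp is the start time of the subsequence, and the value is the number of data points it contains.
-
-**Note:** For input that does not meet the requirements, the function makes no guarantee about its output.
-
-#### Examples
-
-##### Specifying the standard interval manually
-
-The standard time interval can be specified manually with the `gap` parameter. Note that a wrong parameter value leads to seriously wrong output.
-
-Input series:
-
-```
-+-----------------------------+---------------+---------------+
-| Time|root.test.d1.s1|root.test.d1.s2|
-+-----------------------------+---------------+---------------+
-|2020-01-01T00:00:00.000+08:00| 1.0| 1.0|
-|2020-01-01T00:05:00.000+08:00| 1.0| 1.0|
-|2020-01-01T00:10:00.000+08:00| 1.0| 1.0|
-|2020-01-01T00:20:00.000+08:00| 1.0| 1.0|
-|2020-01-01T00:25:00.000+08:00| 1.0| 1.0|
-|2020-01-01T00:30:00.000+08:00| 1.0| 1.0|
-|2020-01-01T00:35:00.000+08:00| 1.0| 1.0|
-|2020-01-01T00:40:00.000+08:00| 1.0| null|
-|2020-01-01T00:45:00.000+08:00| 1.0| 1.0|
-|2020-01-01T00:50:00.000+08:00| 1.0| 1.0|
-+-----------------------------+---------------+---------------+
-```
-
-SQL statement for the query:
-
-```sql
-select consecutivesequences(s1,s2,'gap'='5m') from root.test.d1
-```
-
-Output series:
-
-```
-+-----------------------------+------------------------------------------------------------------+
-| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2, "gap"="5m")|
-+-----------------------------+------------------------------------------------------------------+
-|2020-01-01T00:00:00.000+08:00| 3|
-|2020-01-01T00:20:00.000+08:00| 4|
-|2020-01-01T00:45:00.000+08:00| 2|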
-+-----------------------------+------------------------------------------------------------------+ -``` - -##### 自动估计标准时间间隔 - -当`gap`参数缺省时,本函数可以利用众数估计标准时间间隔,得到同样的结果。因此,这种用法更受推荐。 - -输入序列同上,用于查询的SQL语句如下: - -```sql -select consecutivesequences(s1,s2) from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| -+-----------------------------+------------------------------------------------------+ -``` - -### 8.2 ConsecutiveWindows - -#### 注册语句 - -```sql -create function consecutivewindows as 'org.apache.iotdb.library.series.UDTFConsecutiveWindows' -``` - -#### 函数简介 - -本函数用于在多维严格等间隔数据中发现指定长度的连续窗口。 - -严格等间隔数据是指数据的时间间隔是严格相等的,允许存在数据缺失(包括行缺失和值缺失),但不允许存在数据冗余和时间戳偏移。 - -连续窗口是指严格按照标准时间间隔等距排布,不存在任何数据缺失的子序列。 - - -**函数名:** CONSECUTIVEWINDOWS - -**输入序列:** 支持多个输入序列,类型可以是任意的,但要满足严格等间隔的要求。 - -**参数:** - -+ `gap`:标准时间间隔,是一个有单位的正数。目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。在缺省情况下,函数会利用众数估计标准时间间隔。 -+ `length`:序列长度,是一个有单位的正数。目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。该参数不允许缺省。 - -**输出序列:** 输出单个序列,类型为 INT32。输出序列中的每一个数据点对应一个指定长度连续子序列,时间戳为子序列的起始时刻,值为子序列包含的数据点个数。 - -**提示:** 对于不符合要求的输入,本函数不对输出做任何保证。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| 
-|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select consecutivewindows(s1,s2,'length'='10m') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------------------------------+ -| Time|consecutivewindows(root.test.d1.s1, root.test.d1.s2, "length"="10m")| -+-----------------------------+--------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -+-----------------------------+--------------------------------------------------------------------+ -``` - - - -## 9. 机器学习 - -### 9.1 AR - -#### 注册语句 - -```sql -create function ar as 'org.apache.iotdb.library.dlearn.UDTFAR' -``` -#### 函数简介 - -本函数用于学习数据的自回归模型系数。 - -**函数名:** AR - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `p`:自回归模型的阶数。默认为1。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。第一行对应模型的一阶系数,以此类推。 - -**提示:** - -- `p`应为正整数。 - -- 序列中的大部分点为等间隔采样点。 -- 序列中的缺失点通过线性插值进行填补后用于学习过程。 - -#### 使用示例 - -##### 指定阶数 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select ar(s0,"p"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+---------------------------+ -| Time|ar(root.test.d0.s0,"p"="2")| -+-----------------------------+---------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.9429| 
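-|1970-01-01T08:00:00.002+08:00| -0.2571|
-+-----------------------------+---------------------------+
-```
diff --git a/src/zh/UserGuide/Master/Tree/Tools-System/CLI_timecho.md b/src/zh/UserGuide/Master/Tree/Tools-System/CLI_timecho.md
deleted file mode 100644
index 8904f07d0..000000000
--- a/src/zh/UserGuide/Master/Tree/Tools-System/CLI_timecho.md
+++ /dev/null
@@ -1,265 +0,0 @@
-
-
-# Command Line Tool
-
-IoTDB provides cli/Shell tools for starting the client and server programs. The following sections describe how to run each cli/Shell tool and the parameters it accepts.
-> \$IOTDB\_HOME denotes the path of the IoTDB installation directory.
-
-## 1. Running the CLI
-A freshly installed IoTDB has one default user, `root`, whose default password is `TimechoDB@2021` (`root` before V2.0.6.x). You can use this user to run the IoTDB client and check that the server started correctly. The client startup script is the `start-cli` script in the $IOTDB_HOME/sbin folder. The IP and RPC port must be specified when launching the script. The examples below assume that the server runs on the local machine and that the port has not been changed; the default port is 6667. To connect to a remote server, or if the server port has been changed, pass the server's IP and RPC port with -h and -p.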
-You can also set your own environment variables, such as JAVA_HOME, at the top of the startup script.
-
-The startup commands on Linux and MacOS are:
-
-```shell
-# Before V2.0.6.x
-Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root
-
-# V2.0.6.x and later
-Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021
-```
-The startup commands on Windows are:
-
-```shell
-# Before V2.0.4.x
-Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root
-
-# V2.0.4.x and later, before V2.0.6.x
-Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root
-
-# V2.0.6.x and later
-Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021
-```
-Press Enter to start the client. The following output indicates that the client started successfully.
-
-```
- _____       _________  ______   ______
-|_   _|     |  _   _  ||_   _ `.|_   _ \
-  | |   .--.|_/ | | \_|  | | `. \ | |_) |
-  | | / .'`\ \  | |      | |  | | |  __'.
- _| |_| \__. | _| |_    _| |_.' /_| |__) |
-|_____|'.__.' |_____|  |______.'|_______/  version
-
-Successfully login at 127.0.0.1:6667
-```
-Enter `quit` or `exit` to leave the CLI and end the session. The CLI prints `quit normally` when it exits successfully.
-
-## 2. CLI Parameters
-
-| **Parameter** | **Type** | **Required** | **Description** | **Example** |
-|:-----------------------------|:-----------|:------------|:-----------------------------------------------------------|:---------------------|
-| -h `` | string | No | IP address of the IoTDB server that the client connects to. Default: 127.0.0.1. | -h 127.0.0.1 |
-| -p `` | int | No | Port of the server that the client connects to. IoTDB uses 6667 by default. | -p 6667 |
-| -u `` | string | No | User name used to connect to the server. Default: root. | -u root |
-| -pw `` | string | No | Password used to connect to the server. Default: TimechoDB@2021 (root before V2.0.6). | -pw root |
-| -sql_dialect `` | string | No | Either tree (tree model) or table (table model). Default: tree. | -sql_dialect table |
-| -e `` | string | No | Operate on IoTDB in batch without entering the interactive client mode. | -e "show databases" |
-| -c | none | No | If the server sets rpc_thrift_compression_enable=true, the CLI must use -c. | -c |
-| -disableISO8601 | none | No | If set, IoTDB prints timestamps as numbers. | -disableISO8601 |
-| -usessl `` | Boolean | No | Whether to enable an SSL connection. | -usessl true |
-| -ts `` | string | No | Path of the SSL trust store. | -ts /path/to/truststore 
| -| -tpw `` | string 类型 | 否 | ssl 证书存储密码 | -tpw myTrustPassword | -| -timeout `` | int 类型 | 否 | 查询超时时间(秒)。如果未设置,则使用服务器的配置。 | -timeout 30 | -| -help | 空 | 否 | 打印 IoTDB 的帮助信息。 | -help | - -下面展示一条客户端命令,功能是连接 IP 为 10.129.187.21 的主机,端口为 6667 ,用户名为 root,密码为 root,以数字的形式打印时间戳,IoTDB 命令行显示的最大行数为 10。 - -Linux 系统与 MacOS 系统启动命令如下: - -```shell -Shell > bash sbin/start-cli.sh -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 -``` -Windows 系统启动命令如下: - -```shell -# V2.0.4.x 版本之前 -Shell > sbin\start-cli.bat -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 - -# V2.0.4.x 版本及之后 -Shell > sbin\windows\start-cli.bat -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 -``` - -## 3. CLI 特殊命令 -下面列举了一些CLI的特殊命令。 - -| 命令 | 描述 / 例子 | -|:---|:---| -| `set time_display_type=xxx` | 例如: long, default, ISO8601, yyyy-MM-dd HH:mm:ss | -| `show time_display_type` | 显示时间显示方式 | -| `set time_zone=xxx` | 例如: +08:00, Asia/Shanghai | -| `show time_zone` | 显示CLI的时区 | -| `set fetch_size=xxx` | 设置从服务器查询数据时的读取条数 | -| `show fetch_size` | 显示读取条数的大小 | -| `set max_display_num=xxx` | 设置 CLI 一次展示的最大数据条数, 设置为-1表示无限制 | -| `help` | 获取CLI特殊命令的提示 | -| `exit/quit` | 退出CLI | - - -## 4. Cli 的批量操作 -当您想要通过脚本的方式通过 Cli / Shell 对 IoTDB 进行批量操作时,可以使用-e 参数。通过使用该参数,您可以在不进入客户端输入模式的情况下操作 IoTDB。 - -为了避免 SQL 语句和其他参数混淆,现在只支持-e 参数作为最后的参数使用。 - -针对 cli/Shell 工具的-e 参数用法如下: - -Linux 系统与 MacOS 指令: - -```shell -Shell > bash sbin/start-cli.sh -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} -``` - -Windows 系统指令 -```shell -# V2.0.4.x 版本之前 -Shell > sbin\start-cli.bat -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} - -# V2.0.4.x 版本及之后 -Shell > sbin\windows\start-cli.bat -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} -``` - -在 Windows 环境下,-e 参数的 SQL 语句需要使用` `` `对于`" "`进行替换 - -为了更好的解释-e 参数的使用,可以参考下面在 Linux 上执行的例子。 - -假设用户希望对一个新启动的 IoTDB 进行如下操作: - -1. 创建名为 root.demo 的 database - -2. 创建名为 root.demo.s1 的时间序列 - -3. 向创建的时间序列中插入三个数据点 - -4. 
查询验证数据是否插入成功 - -那么通过使用 cli/Shell 工具的 -e 参数,可以采用如下的脚本: - -```shell -# !/bin/bash - -host=127.0.0.1 -rpcPort=6667 -user=root -pass=root - -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "CREATE DATABASE root.demo" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "create timeseries root.demo.s1 WITH DATATYPE=INT32, ENCODING=RLE" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(1,10)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(2,11)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(3,12)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "select s1 from root.demo" -``` - -打印出来的结果显示如下,通过这种方式进行的操作与客户端的输入模式以及通过 JDBC 进行操作结果是一致的。 - -```shell - Shell > bash ./shell.sh -+-----------------------------+------------+ -| Time|root.demo.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.001+08:00| 10| -|1970-01-01T08:00:00.002+08:00| 11| -|1970-01-01T08:00:00.003+08:00| 12| -+-----------------------------+------------+ -Total line number = 3 -It costs 0.267s -``` - -需要特别注意的是,在脚本中使用 -e 参数时要对特殊字符进行转义。 - -## 5. 访问历史功能 - -IoTDB **V2.0.9.1** 起支持开启访问历史功能,即客户端登录成功后展示关键的历史访问信息,支持分布式场景。管理员与普通用户仅可查看自身访问历史,核心展示内容包括: - -* 上一次成功会话:显示日期、时间、访问应用、IP地址及访问方法(首次登录或无历史记录时不显示)。 -* 最近一次失败尝试:显示距离本次成功登录时间最近的一次失败记录的日期、时间、访问应用、IP地址及访问方法。 -* 累计失败次数:统计自上一次成功会话建立以来,所有未成功建立的会话尝试总次数。 - -### 5.1 开启访问历史 - -支持通过修改 `iotdb-system.properties` 文件中的相关参数来控制是否开启访问历史功能,修改参数后需重启生效,例如: - -```Plain -# 用于控制是否启用审计日志功能 -enable_audit_log=false -``` - -* 开启时,记录登录信息并定期清理过期数据; -* 关闭时,不记录、不展示、不清理; -* 开关关闭后重开,展示的历史为关闭前最后一条记录,不一定代表真实最近登录记录。 - -使用示例: - -```Bash ---------------------- -Starting IoTDB Cli ---------------------- - _____ _________ ______ ______ -|_ _| | _ _ ||_ _ `.|_ _ \ - | | .--.|_/ | | \_| | | `. 
\ | |_) | - | | / .'`\ \ | | | | | | | __'. - _| |_| \__. | _| |_ _| |_.' /_| |__) | -|_____|'.__.' |_____| |______.'|_______/ Enterprise version 2.0.9.1 (Build: xxxxxxx) - - ----Last Successful Session------------------ -Time: 2026-03-24T10:25:47.759+08:00 -IP Address: 127.0.0.1 ----Last Failed Session---------------------- -Time: 2026-03-24T10:27:26.314+08:00 -IP Address: 127.0.0.1 -Cumulative Failed Attempts: 1 -Successfully login at 127.0.0.1:6667 -IoTDB> -``` - -### 5.2 查看访问历史 - -root 用户及具有 AUDIT 权限的用户可以通过 SQL 语句查看访问历史记录。 - -语法定义: - -```SQL -select * from root.__audit.login.u_{userid}.** -``` - -其中 userid 可通过 `list user` 语句查看。 - -示例: - -```SQL -IoTDB> select * from root.__audit.login.** -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -| Time|root.__audit.login.u_0.node_1.result|root.__audit.login.u_0.node_1.ip|root.__audit.login.u_0.node_1.username| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -|2026-03-25T10:55:58.240+08:00| true| 127.0.0.1| root| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -Total line number = 1 -It costs 0.039s -IoTDB> select * from root.__audit.login.u_0.** -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -| Time|root.__audit.login.u_0.node_1.result|root.__audit.login.u_0.node_1.ip|root.__audit.login.u_0.node_1.username| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -|2026-03-25T10:55:58.240+08:00| true| 127.0.0.1| root| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -Total 
line number = 1 -It costs 0.020s -``` diff --git a/src/zh/UserGuide/Master/Tree/Tools-System/Data-Export-Tool_timecho.md b/src/zh/UserGuide/Master/Tree/Tools-System/Data-Export-Tool_timecho.md deleted file mode 100644 index 329dc0814..000000000 --- a/src/zh/UserGuide/Master/Tree/Tools-System/Data-Export-Tool_timecho.md +++ /dev/null @@ -1,176 +0,0 @@ -# 数据导出 - -## 1. 功能概述 - -数据导出工具 export-data.sh/bat 位于 tools 目录下,能够将指定 SQL 的查询结果导出为 CSV、SQL 及 TsFile(开源时间序列文件格式)格式。具体功能如下: - - - - - - - - - - - - - - - - - - - - - -
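-| File format | IoTDB tool | Description |
-| --- | --- | --- |
-| CSV | export-data.sh/bat | Plain-text format for formatted data; files must follow the CSV format specified below |
-| SQL | export-data.sh/bat | File containing custom SQL statements |
-| TsFile | export-data.sh/bat | Open-source time series data file format |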
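
As a rough illustration of the three export formats listed above, the following dry-run sketch prints one export command per file type instead of executing it. The flags (`-ft`, `-h`, `-p`, `-u`, `-t`, `-q`) are the ones documented for `export-data.sh` in this guide; the target directory, the query, and the helper name `build_export_cmd` are assumptions made only for this example.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one export command per supported file type.
# EXPORT_TOOL, TARGET_DIR and QUERY are illustrative assumptions, not documented defaults.
EXPORT_TOOL="tools/export-data.sh"
TARGET_DIR="/path/export/dir"
QUERY="SELECT * FROM root.ln"

# Hypothetical helper: validate the file type, then print the command line.
build_export_cmd() {
  local file_type="$1"
  case "$file_type" in
    csv|sql|tsfile) ;;
    *) echo "unsupported file type: $file_type" >&2; return 1 ;;
  esac
  printf '%s -ft %s -h 127.0.0.1 -p 6667 -u root -t %s -q "%s"\n' \
    "$EXPORT_TOOL" "$file_type" "$TARGET_DIR" "$QUERY"
}

for ft in csv sql tsfile; do
  build_export_cmd "$ft"
done
```

Printing the command first makes it easy to review the options before running the real tool; `-pw` is left out here so that no password is hard-coded in the script.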
- - -## 2. 功能详解 - -### 2.1 公共参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -| -------- |-------------------------| ---------------------------------------------------------------------- | -------------- |------------------------------------------| -| -ft | --file\_type | 导出文件的类型,可以选择:csv、sql、tsfile | √ | | -| -h | -- host | 主机名 | 否 | 127.0.0.1 | -| -p | --port | 端口号 | 否 | 6667 | -| -u | --username | 用户名 | 否 | root | -| -pw | --password | 密码,自 V2.0.9.1 起支持隐藏输入 | 否 | TimechoDB@2021 (V2.0.6.x 版本之前为 root) | -| -t | --target | 指定输出文件的目标文件夹,如果路径不存在新建文件夹 | √ | | -| -pfn | --prefix\_file\_name | 指定导出文件的名称。例如:abc,生成的文件是abc\_0.tsfile、abc\_1.tsfile | 否 | dump\_0.tsfile | -| -q | --query | 要执行的查询语句。自 V2.0.8 起,SQL 语句中的分号将被自动移除,查询执行保持正常。 | 否 | 无 | -| -timeout | --query\_timeout | 会话查询的超时时间(ms) | 否 | `-1`(V2.0.8 之前)
`Long.MAX_VALUE` (V2.0.8 and later)
范围:`-1~Long.MAX_VALUE` | -| -help | --help | 显示帮助信息 | 否 | | -| -usessl | --use_ssl | 使用 SSL 协议,自 V2.0.9.1 起支持 | 否 | - | -| -ts | --trust_store | 信任库。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | -| -tpw | --trust_store_password | 信任库密码。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | - -### 2.2 Csv 格式 - -#### 2.2.1 运行命令 - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-h ] [-p ] [-u ] [-pw ] -t - [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -# Windows -# V2.0.4.x 版本之前 -> tools\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] -t - [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] - -# V2.0.4.x 版本及之后 -> tools\windows\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] -t - [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -``` - -#### 2.2.2 私有参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -| ---------- | -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- |--------------------------------------| -| -dt | --datatype | 是否在CSV文件的表头输出时间序列的数据类型,可以选择`true`或`false` | 否 | false | -| -lpf | --lines\_per\_file | 每个转储文件的行数 | 否 | 10000
范围:0~Integer.Max=2147483647 | -| -tf | --time\_format | 指定CSV文件中的时间格式。可以选择:1) 时间戳(数字、长整型);2) ISO8601(默认);3) 用户自定义模式,如`yyyy-MM-dd HH:mm:ss`(默认为ISO8601)。SQL文件中的时间戳输出不受时间格式设置影响 | 否| ISO8601 | -| -tz | --timezone | 设置时区,例如`+08:00`或`-01:00` | 否 | 本机系统时间 | - -#### 2.2.3 运行示例: - -```Shell -# 正确示例 -> tools/export-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -t /path/export/dir - -pfn exported-data.csv -dt true -lpf 1000 -tf "yyyy-MM-dd HH:mm:ss" - -tz +08:00 -q "SELECT * FROM root.ln" -timeout 20000 - -# 异常示例 -> tools/export-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -Parse error: Missing required option: t - -# 注意:V2.0.6.x 版本之前 -pw 参数值默认值为 root -``` - -### 2.3 Sql 格式 - -#### 2.3.1 运行命令 - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-aligned ] - -lpf - [-tf ] [-tz ] [-q ] [-timeout ] - -# Windows -# V2.0.4.x 版本之前 -> tools\export-data.bat -ft [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] - -# V2.0.4.x 版本及之后 -> tools\windows\export-data.bat -ft [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] -``` - -#### 2.3.2 私有参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -| ---------- | -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- | -------------------------------------- | -| -aligned | --use\_aligned | 是否导出为对齐的SQL格式 | 否 | true | -| -lpf | --lines\_per\_file | 每个转储文件的行数 | 否 | 10000
范围:0~Integer.Max=2147483647 | -| -tf | --time\_format | 指定CSV文件中的时间格式。可以选择:1) 时间戳(数字、长整型);2) ISO8601(默认);3) 用户自定义模式,如`yyyy-MM-dd HH:mm:ss`(默认为ISO8601)。SQL文件中的时间戳输出不受时间格式设置影响 | 否| ISO8601| -| -tz | --timezone | 设置时区,例如`+08:00`或`-01:00` | 否 | 本机系统时间 | - -#### 2.3.3 运行示例: - -```Shell -# 正确示例 -> tools/export-data.sh -ft sql -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -t /path/export/dir - -pfn exported-data.csv -aligned true -lpf 1000 -tf "yyyy-MM-dd HH:mm:ss" - -tz +08:00 -q "SELECT * FROM root.ln" -timeout 20000 - -# 异常示例 -> tools/export-data.sh -ft sql -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -Parse error: Missing required option: t - -# 注意:V2.0.6.x 版本之前 -pw 参数值默认值为 root -``` - -### 2.4 TsFile 格式 - -#### 2.4.1 运行命令 - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# Windows -# V2.0.4.x 版本之前 -> tools\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# V2.0.4.x 版本及之后 -> tools\windows\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] -``` - -#### 2.4.2 私有参数 - -* 无 - -#### 2.4.3 运行示例: - -```Shell -# 正确示例 -> tools/export-data.sh -ft tsfile -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -t /path/export/dir - -pfn export-data.tsfile -q "SELECT * FROM root.ln" -timeout 10000 - -# 异常示例 -> tools/export-data.sh -ft tsfile -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -Parse error: Missing required option: t - -# 注意:V2.0.6.x 版本之前 -pw 参数值默认值为 root -``` diff --git a/src/zh/UserGuide/Master/Tree/Tools-System/Data-Import-Tool_timecho.md b/src/zh/UserGuide/Master/Tree/Tools-System/Data-Import-Tool_timecho.md deleted file mode 100644 index f0c05341d..000000000 --- a/src/zh/UserGuide/Master/Tree/Tools-System/Data-Import-Tool_timecho.md +++ /dev/null @@ -1,333 +0,0 @@ -# 数据导入 - -## 1. 
功能概述 - -IoTDB 支持三种方式进行数据导入: -- 数据导入工具 :`import-data.sh/bat` 位于 `tools` 目录下,可以将 `CSV`、`SQL`、及`TsFile`(开源时序文件格式)的数据导入 `IoTDB`。 -- `TsFile` 自动加载功能。 -- `Load SQL` 导入 `TsFile` 。 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
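-| File format | IoTDB tool | Description |
-| --- | --- | --- |
-| CSV | import-data.sh/bat | Batch-imports a single CSV file or a directory of CSV files into IoTDB |
-| SQL | import-data.sh/bat | Batch-imports a single SQL file or a directory of SQL files into IoTDB |
-| TsFile | import-data.sh/bat | Batch-imports a single TsFile or a directory of TsFiles into IoTDB |
-| TsFile | TsFile auto-loading | Watches a specified path for newly created TsFiles and loads them into IoTDB |
-| TsFile | Load SQL | Batch-imports a single TsFile or a directory of TsFiles into IoTDB |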
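
The mapping from file format to `import-data.sh` can be sketched as a small dry-run script. It derives the `-ft` value from a file name's extension and prints the import command rather than running it. Only `-ft` and `-s` come from this guide; `detect_file_type`, `build_import_cmd`, and the sample paths are hypothetical names used for illustration. Because `-s` may also point to a directory, this extension check only applies to single files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the -ft value from a file name's extension.
detect_file_type() {
  case "$1" in
    *.csv)    echo csv ;;
    *.sql)    echo sql ;;
    *.tsfile) echo tsfile ;;
    *) echo "unsupported file: $1" >&2; return 1 ;;
  esac
}

# Dry run: print the import command instead of executing it.
build_import_cmd() {
  local src="$1" ft
  ft="$(detect_file_type "$src")" || return 1
  printf 'tools/import-data.sh -ft %s -s %s\n' "$ft" "$src"
}

build_import_cmd /path/data/dump.csv
build_import_cmd /path/data/dump.tsfile
```

Connection options such as `-h`, `-p`, `-u` and `-pw` are omitted here, so the documented defaults would apply if the printed command were run.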
-
-## 2. Data Import Tool
-
-### 2.1 Common Parameters
-
-| Short option | Full option | Description | Required | Default |
-| --- | --- | --- | --- | --- |
-| -ft | --file\_type | Type of the file to import: csv, sql, or tsfile | Yes | |
-| -h | --host | Host name | No | 127.0.0.1 |
-| -p | --port | Port number | No | 6667 |
-| -u | --username | User name | No | root |
-| -pw | --password | Password; hidden input is supported since V2.0.9.1 | No | TimechoDB@2021 (root before V2.0.6.x) |
-| -s | --source | Local path of the file or folder to load<br>Files in one of the three supported formats (csv, sql, tsfile) are imported directly<br>Unsupported formats are rejected with the error `The file name must end with "csv" or "sql"or "tsfile"!` | Yes | |
-| -tn | --thread\_num | Maximum number of parallel threads | No | 8<br>Range: 0~Integer.Max=2147483647 |
-| -tz | --timezone | Time zone, for example `+08:00` or `-01:00` | No | Local system time zone |
-| -help | --help | Shows help information, either for all formats (`-help`) or for a single format (`-help csv`) | No | |
-| -usessl | --use_ssl | Use the SSL protocol; supported since V2.0.9.1 | No | - |
-| -ts | --trust_store | Trust store; hidden input is supported; supported since V2.0.9.1 | No | - |
-| -tpw | --trust_store_password | Trust store password; hidden input is supported; supported since V2.0.9.1 | No | - |
-
-
-### 2.2 CSV Format
-
-#### 2.2.1 Command
-
-```Shell
-# Unix/OS X
-> tools/import-data.sh -ft [-h ] [-p ] [-u ] [-pw ]
-        -s [-fd ] [-lpf ] [-aligned ]
-        [-ti ] [-tp ] [-tz ] [-batch ]
-        [-tn ]
-
-# Windows
-# Before V2.0.4.x
-> tools\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ]
-        -s [-fd ] [-lpf ] [-aligned ]
-        [-ti ] [-tp ] [-tz ] [-batch ]
-        [-tn ]
-
-# V2.0.4.x and later
-> tools\windows\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ]
-        -s [-fd ] [-lpf ] [-aligned ]
-        [-ti ] [-tp ] [-tz ] [-batch ]
-        [-tn ]
-```
-
-#### 2.2.2 Format-specific Parameters
-
-| Short option | Full option | Description | Required | Default |
-| --- | --- | --- | --- | --- |
-| -fd | --fail\_dir | Directory in which failure files are saved | No | YOUR\_CSV\_FILE\_PATH |
-| -lpf | --lines\_per\_failed\_file | Maximum number of rows written to each failure file | No | 100000
Range: 0~Integer.Max=2147483647 |
-| -aligned | --use\_aligned | Whether to import the data as aligned series | No | false |
-| -batch | --batch\_size | Number of rows processed per interface call (minimum 1, maximum Integer.*MAX\_VALUE*) | No | 100000<br>Range: 0~Integer.Max=2147483647 |
-| -ti | --type\_infer | Type-inference mapping passed as an option, for example `"boolean=text,int=long, ..."` | No | None |
-| -tp | --timestamp\_precision | Timestamp precision | No:<br>1. ms (milliseconds)<br>2. us (microseconds)<br>3. ns (nanoseconds) | ms |
-
-#### 2.2.3 Examples
-
-```Shell
-# Valid example
-> tools/import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -s /path/sql
-      -fd /path/failure/dir -lpf 100 -aligned true -ti "BOOLEAN=text,INT=long,FLOAT=double"
-      -tp ms -tz +08:00 -batch 5000 -tn 4
-
-# Error examples
-> tools/import-data.sh -ft csv -s /non_path
-error: Source file or directory /non_path does not exist
-
-> tools/import-data.sh -ft csv -s /path/sql -tn 0
-error: Invalid thread number '0'. Please set a positive integer.
-
-# Note: before V2.0.6.x, the default value of -pw is root
-```
-
-#### 2.2.4 Import Notes
-
-1. CSV import rules
-
-- Escaping special characters: if a TEXT field contains a special character (for example, a comma `,`), escape it with a backslash (`\`).
-- Supported time formats: `yyyy-MM-dd'T'HH:mm:ss`, `yyyy-MM-dd HH:mm:ss`, or `yyyy-MM-dd'T'HH:mm:ss.SSSZ`.
-- The timestamp column must be the first column of the data file.
-
-2. CSV file examples
-
-- Time-aligned
-
-```sql
--- Header without data types
-Time,root.test.t1.str,root.test.t2.str,root.test.t2.var
-1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100
-1970-01-01T08:00:00.002+08:00,"123",,
-
--- Header with data types (TEXT values may be written with or without double quotes)
-Time,root.test.t1.str(TEXT),root.test.t2.str(TEXT),root.test.t2.var(INT32)
-1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100
-1970-01-01T08:00:00.002+08:00,123,hello world,123
-1970-01-01T08:00:00.003+08:00,"123",,
-1970-01-01T08:00:00.004+08:00,123,,12
-```
-
-- Device-aligned
-
-```sql
--- Header without data types
-Time,Device,str,var
-1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world",
-1970-01-01T08:00:00.002+08:00,root.test.t1,"123",
-1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100
-
--- Header with data types (TEXT values may be written with or without double quotes)
-Time,Device,str(TEXT),var(INT32)
-1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world",
-1970-01-01T08:00:00.002+08:00,root.test.t1,"123",
-1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100
-1970-01-01T08:00:00.002+08:00,root.test.t1,hello world,123
-```
-
-### 2.3 SQL Format
-
-#### 2.3.1 Command
-
-```Shell
-# Unix/OS X
-> tools/import-data.sh -ft [-h ] [-p ] [-u ] [-pw ]
-        -s [-fd ] [-lpf ] [-tz ]
-        [-batch ] [-tn ]
-
-# Windows
-# Before V2.0.4.x
-> tools\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ]
-        -s [-fd ] [-lpf ] [-tz ]
-        [-batch ] [-tn ]
-
-# V2.0.4.x and later
-> tools\windows\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ]
-        -s [-fd ] [-lpf ] [-tz ]
-        [-batch ] [-tn ]
-```
-
-#### 2.3.2 Format-specific Parameters
-
-| Short option | Full option | Description | Required | Default |
-| --- | --- | --- | --- | --- |
-| -fd | --fail\_dir | Directory in which failure files are saved | No | YOUR\_CSV\_FILE\_PATH |
-| -lpf | --lines\_per\_failed\_file | Maximum number of rows written to each failure file | No | 100000<br>Range: 0~Integer.Max=2147483647 |
-| -batch | --batch\_size | Number of rows processed per interface call (minimum 1, maximum Integer.*MAX\_VALUE*) | No | 100000<br>Range: 0~Integer.Max=2147483647 |
-
-#### 2.3.3 Examples
-
-```Shell
-# Valid example
-> tools/import-data.sh -ft sql -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -s /path/sql
-        -fd /path/failure/dir -lpf 500 -tz +08:00
-        -batch 100000 -tn 4
-
-# Error examples
-> tools/import-data.sh -ft sql -s /path/sql -fd /non_path
-error: Source file or directory /path/sql does not exist
-
-
-> tools/import-data.sh -ft sql -s /path/sql -tn 0
-error: Invalid thread number '0'. Please set a positive integer.
-
-# Note: before V2.0.6.x, the default value of -pw is root
-```
-
-### 2.4 TsFile Format
-
-#### 2.4.1 Command
-
-```Shell
-# Unix/OS X
-> tools/import-data.sh -ft [-h ] [-p ] [-u ] [-pw ]
-        -s -os [-sd ] -of [-fd ]
-        [-tn ] [-tz ] [-tp ]
-
-# Windows
-# Before V2.0.4.x
-> tools\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ]
-        -s -os [-sd ] -of [-fd ]
-        [-tn ] [-tz ] [-tp ]
-
-# V2.0.4.x and later
-> tools\windows\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ]
-        -s -os [-sd ] -of [-fd ]
-        [-tn ] [-tz ] [-tp ]
-```
-
-#### 2.4.2 Format-specific Parameters
-
-| Short option | Full option | Description | Required | Default |
-| --- | --- | --- | --- | --- |
-| -os | --on\_success | 1. none: do not delete<br>2. mv: move successfully imported files to the target folder<br>3. cp: hard-link (copy) successfully imported files to the target folder<br>4. delete: delete | Yes | |
-| -sd | --success\_dir | Target folder for mv or cp when `--on_success` is mv or cp. Each file is renamed to its flattened folder path joined with the original file name | Required when `--on_success` is mv or cp | `${EXEC_DIR}/success` |
-| -of | --on\_fail | 1. none: skip<br>2. mv: move failed files to the target folder<br>3. cp: hard-link (copy) failed files to the target folder<br>4. delete: delete | Yes | |
-| -fd | --fail\_dir | Target folder for mv or cp when `--on_fail` is mv or cp. Each file is renamed to its flattened folder path joined with the original file name | Required when `--on_fail` is mv or cp | `${EXEC_DIR}/fail` |
-| -tp | --timestamp\_precision | Timestamp precision<br>Non-remote TsFile import: -tp specifies the timestamp precision of the TsFile, which is checked manually against the server's timestamp precision; an error is returned if they differ<br>Remote import: -tp specifies the timestamp precision of the TsFile; the pipe checks automatically whether the precision matches and returns a pipe error if it does not | No:<br>1. ms (milliseconds)<br>2. us (microseconds)
3. ns(纳秒) | ms| - - -#### 2.4.3 运行示例 - -```Shell -# 正确示例 -> tools/import-data.sh -ft tsfile -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 - -s /path/sql -os mv -of cp -sd /path/success/dir -fd /path/failure/dir - -tn 8 -tz +08:00 -tp ms - -# 异常示例 -> tools/import-data.sh -ft tsfile -s /path/sql -os mv -of cp - -fd /path/failure/dir -tn 8 -error: Missing option --success_dir (or -sd) when --on_success is 'mv' or 'cp' - -> tools/import-data.sh -ft tsfile -s /path/sql -os mv -of cp - -sd /path/success/dir -fd /path/failure/dir -tn 0 -error: Invalid thread number '0'. Please set a positive integer. - -# 注意:V2.0.6.x 版本之前 -pw 参数值默认值为 root -``` -## 3. TsFile 自动加载功能 - -本功能允许 IoTDB 主动监听指定目录下的新增 TsFile,并将 TsFile 自动加载至 IoTDB 中。通过此功能,IoTDB 能自动检测并加载 TsFile,无需手动执行任何额外的加载操作。 - -![](/img/Data-import1.png) - -### 3.1 配置参数 - -可通过从配置文件模版 `iotdb-system.properties.template` 中找到下列参数,添加到 IoTDB 配置文件 `iotdb-system.properties` 中开启 TsFile 自动加载功能。完整配置如下: - -| **配置参数** | **参数说明** | **value 取值范围** | **是否必填** | **默认值** | **加载方式** | -| --------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------- | -------------------- | ------------------------ | -------------------- | -| load\_active\_listening\_enable | 是否开启 DataNode 主动监听并且加载 tsfile 的功能(默认开启)。 | Boolean: true,false | 选填 | true | 热加载 | -| load\_active\_listening\_dirs | 需要监听的目录(自动包括目录中的子目录),如有多个使用 “,“ 隔开默认的目录为 `ext/load/pending`(支持热装载) | String: 一个或多个文件目录 | 选填 | `ext/load/pending` | 热加载 | -| load\_active\_listening\_fail\_dir | 执行加载 tsfile 文件失败后将文件转存的目录,只能配置一个 | String: 一个文件目录 | 选填 | `ext/load/failed` | 热加载 | -| load\_active\_listening\_max\_thread\_num | 同时执行加载 tsfile 任务的最大线程数,参数被注释掉时的默值为 max(1, CPU 核心数 / 2),当用户设置的值不在这个区间[1, CPU核心数 /2]内时,会设置为默认值 (1, 
CPU 核心数 / 2) | Long: [1, Long.MAX\_VALUE] | 选填 | max(1, CPU 核心数 / 2) | 重启后生效 | -| load\_active\_listening\_check\_interval\_seconds | 主动监听轮询间隔,单位秒。主动监听 tsfile 的功能是通过轮询检查文件夹实现的。该配置指定了两次检查 `load_active_listening_dirs` 的时间间隔,每次检查完成 `load_active_listening_check_interval_seconds` 秒后,会执行下一次检查。当用户设置的轮询间隔小于 1 时,会被设置为默认值 5 秒 | Long: [1, Long.MAX\_VALUE] | 选填 | 5 | 重启后生效 | - -### 3.2 注意事项 - -1. 如果待加载的文件中,存在 mods 文件,应优先将 mods 文件移动到监听目录下面,然后再移动 tsfile 文件,且 mods 文件应和对应的 tsfile 文件处于同一目录。防止加载到 tsfile 文件时,加载不到对应的 mods 文件 -2. 禁止设置 Pipe 的 receiver 目录、存放数据的 data 目录等作为监听目录 -3. 禁止 `load_active_listening_fail_dir` 与 `load_active_listening_dirs` 存在相同的目录,或者互相嵌套 -4. 保证 `load_active_listening_dirs` 目录有足够的权限,在加载成功之后,文件将会被删除,如果没有删除权限,则会重复加载 - -## 4. Load SQL - -IoTDB 支持通过 CLI 执行 SQL 直接将存有时间序列的一个或多个 TsFile 文件导入到另外一个正在运行的 IoTDB 实例中。 - -### 4.1 运行命令 - -```SQL -load '' with ( - 'attribute-key1'='attribute-value1', - 'attribute-key2'='attribute-value2', -) -``` - -* `` :文件本身,或是包含若干文件的文件夹路径 -* ``:可选参数,具体如下表所示 - -| Key | Key 描述 | Value 类型 | Value 取值范围 | Value 是否必填 | Value 默认值 | -| --------------------------------------- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ------------ | ----------------------------------------- | ---------------- | -------------------------- | -| `database-level` | 当 tsfile 对应的 database 不存在时,可以通过` database-level`参数的值来制定 database 的级别,默认为`iotdb-common.properties`中设置的级别。
例如当设置 level 参数为 1 时表明此 tsfile 中所有时间序列中层级为1的前缀路径是 database。 | Integer | `[1: Integer.MAX_VALUE]` | 否 | 1 | -| `on-success` | 表示对于成功载入的 tsfile 的处置方式:默认为`delete`,即tsfile 成功加载后将被删除;`none `表明 tsfile 成功加载之后依然被保留在源文件夹, | String | `delete / none` | 否 | delete | -| `model` | 指定写入的 tsfile 是表模型还是树模型,该参数在V2.0.2.1后无效(系统会自动识别是树模型还是表模型) | String | `tree / table` | 否 | 与`-sql_dialect`一致 | -| `database-name` | **仅限表模型有效**: 文件导入的目标 database,不存在时会自动创建,`database-name`中不允许包括"`root.`"前缀,如果包含,将会报错。 | String | `-` | 否 | null | -| `convert-on-type-mismatch` | 加载 tsfile 时,如果数据类型不一致,是否进行转换 | Boolean | `true / false` | 否 | true | -| `verify` | 加载 tsfile 前是否校验 schema | Boolean | `true / false` | 否 | true | -| `tablet-conversion-threshold` | 转换为 tablet 形式的 tsfile 大小阈值,针对小文件 tsfile 加载,采用将其转换为 tablet 形式进行写入:默认值为 -1,即任意大小 tsfile 都不进行转换 | Integer | `[-1,0 :`​`Integer.MAX_VALUE]` | 否 | -1 | -| `async` | 是否开启异步加载 tsfile,将文件移到 active load 目录下面,所有的 tsfile 都 load 到`database-name`下. | Boolean | `true / false` | 否 | false | - -### 4.2 运行示例 - -```SQL --- 准备待导入环境 -IoTDB> show databases -+-------------+-----------------------+---------------------+-------------------+---------------------+ -| Database|SchemaReplicationFactor|DataReplicationFactor|TimePartitionOrigin|TimePartitionInterval| -+-------------+-----------------------+---------------------+-------------------+---------------------+ -|root.__system| 1| 1| 0| 604800000| -+-------------+-----------------------+---------------------+-------------------+---------------------+ - --- 通过load sql 导入 tsfile -IoTDB> load '/home/dump1.tsfile' with ( 'on-success'='none') -Msg: The statement is executed successfully. 
- --- 验证数据导入成功 -IoTDB> select * from root.testdb.** -+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -| Time|root.testdb.device.model.temperature|root.testdb.device.model.humidity|root.testdb.device.model.status| -+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -|2025-04-17T10:35:47.218+08:00| 22.3| 19.4| true| -+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -``` \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/Tools-System/Maintenance-Tool_timecho.md b/src/zh/UserGuide/Master/Tree/Tools-System/Maintenance-Tool_timecho.md deleted file mode 100644 index 1457e848a..000000000 --- a/src/zh/UserGuide/Master/Tree/Tools-System/Maintenance-Tool_timecho.md +++ /dev/null @@ -1,1013 +0,0 @@ - - -# 集群管理工具 - -## 1. 集群管理工具 - -IoTDB 集群管理工具是一款易用的运维工具(企业版工具)。旨在解决 IoTDB 分布式系统多节点的运维难题,主要包括集群部署、集群启停、弹性扩容、配置更新、数据导出等功能,从而实现对复杂数据库集群的一键式指令下发,极大降低管理难度。本文档将说明如何用集群管理工具远程部署、配置、启动和停止 IoTDB 集群实例。 - -### 1.1 环境准备 - -本工具为 TimechoDB(基于IoTDB的企业版数据库)配套工具,您可以联系您的销售获取工具下载方式。 - -IoTDB 要部署的机器需要依赖jdk 8及以上版本、lsof、netstat、unzip功能如果没有请自行安装,可以参考文档最后的一节环境所需安装命令。 - -提示:IoTDB集群管理工具需要使用有root权限的账号 - -### 1.2 部署方法 - -#### 下载安装 - -本工具为TimechoDB(基于IoTDB的企业版数据库)配套工具,您可以联系您的销售获取工具下载方式。 - -注意:由于二进制包仅支持GLIBC2.17 及以上版本,因此最低适配Centos7版本 - -* 在iotd目录内输入以下指令后: - -```bash -bash install-iotdbctl.sh -``` - -即可在之后的 shell 内激活 iotdbctl 关键词,如检查部署前所需的环境指令如下所示: - -```bash -iotdbctl cluster check example -``` - -* 也可以不激活iotd直接使用 <iotdbctl absolute path>/sbin/iotdbctl 来执行命令,如检查部署前所需的环境: - -```bash -/sbin/iotdbctl cluster check example -``` - -### 1.3 系统结构 - -IoTDB集群管理工具主要由config、logs、doc、sbin目录组成。 - -* `config`存放要部署的集群配置文件如果要使用集群部署工具需要修改里面的yaml文件。 -* `logs` 存放部署工具日志,如果想要查看部署工具执行日志请查看`logs/iotd_yyyy_mm_dd.log`。 -* `sbin` 存放集群部署工具所需的二进制包。 -* `doc` 
存放用户手册、开发手册和推荐部署手册。 - - -### 1.4 集群配置文件介绍 - -* 在`iotdbctl/config` 目录下有集群配置的yaml文件,yaml文件名字就是集群名字yaml 文件可以有多个,为了方便用户配置yaml文件在iotd/config目录下面提供了`default_cluster.yaml`示例。 -* yaml 文件配置由`global`、`confignode_servers`、`datanode_servers`、`grafana_server`、`prometheus_server`四大部分组成 -* global 是通用配置主要配置机器用户名密码、IoTDB本地安装文件、Jdk配置等。在`iotdbctl/config`目录中提供了一个`default_cluster.yaml`样例数据, - 用户可以复制修改成自己集群名字并参考里面的说明进行配置IoTDB集群,在`default_cluster.yaml`样例中没有注释的均为必填项,已经注释的为非必填项。 - -例如要执行`default_cluster.yaml`检查命令则需要执行命令`iotdbctl cluster check default_cluster`即可, -更多详细命令请参考下面命令列表。 - - - -| 参数 | 说明 | 是否必填 | -|-------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------| -| iotdb\_zip\_dir | IoTDB 部署分发目录,如果值为空则从`iotdb_download_url`指定地址下载 | 非必填 | -| iotdb\_download\_url | IoTDB 下载地址,如果`iotdb_zip_dir` 没有值则从指定地址下载 | 非必填 | -| jdk\_tar\_dir | jdk 本地目录,可使用该 jdk 路径进行上传部署至目标节点。 | 非必填 | -| jdk\_deploy\_dir | jdk 远程机器部署目录,会将 jdk 部署到该目录下面,与下面的`jdk_dir_name`参数构成完整的jdk部署目录即 `/` | 非必填 | -| jdk\_dir\_name | jdk 解压后的目录名称默认是jdk_iotdb | 非必填 | -| iotdb\_lib\_dir | IoTDB lib 目录或者IoTDB 的lib 压缩包仅支持.zip格式 ,仅用于IoTDB升级,默认处于注释状态,如需升级请打开注释修改路径即可。如果使用zip文件请使用zip 命令压缩iotdb/lib目录例如 zip -r lib.zip apache\-iotdb\-1.2.0/lib/* | 非必填 | -| user | ssh登陆部署机器的用户名 | 必填 | -| password | ssh登录的密码, 如果password未指定使用pkey登陆, 请确保已配置节点之间ssh登录免密钥 | 非必填 | -| pkey | 密钥登陆如果password有值优先使用password否则使用pkey登陆 | 非必填 | -| ssh\_port | ssh登录端口 | 必填 | -| iotdb\_admin_user | iotdb服务用户名默认root | 非必填 | -| iotdb\_admin_password | iotdb服务密码默认root | 非必填 | -| deploy\_dir | IoTDB 部署目录,会把 IoTDB 部署到该目录下面与下面的`iotdb_dir_name`参数构成完整的IoTDB 部署目录即 `/` | 必填 | -| iotdb\_dir\_name | IoTDB 解压后的目录名称默认是iotdb | 非必填 | -| datanode-env.sh | 对应`iotdb/config/datanode-env.sh` ,在`global`与`confignode_servers`同时配置值时优先使用`confignode_servers`中的值 | 非必填 | -| confignode-env.sh | 
对应`iotdb/config/confignode-env.sh`,在`global`与`datanode_servers`同时配置值时优先使用`datanode_servers`中的值 | 非必填 | -| iotdb-common.properties | 对应`iotdb/config/iotdb-common.properties` | 非必填 | -| cn\_seed\_config\_node | 集群配置地址指向存活的ConfigNode,默认指向confignode\_x,在`global`与`confignode_servers`同时配置值时优先使用`confignode_servers`中的值,对应`iotdb/config/iotdb-system.properties`中的`cn_seed_config_node` | 必填 | -| dn\_seed\_config\_node | 集群配置地址指向存活的ConfigNode,默认指向confignode\_x,在`global`与`datanode_servers`同时配置值时优先使用`datanode_servers`中的值,对应`iotdb/config/iotdb-system.properties`中的`dn_seed_config_node` | 必填 | - -其中datanode-env.sh 和confignode-env.sh 可以配置额外参数extra_opts,当该参数配置后会在datanode-env.sh 和confignode-env.sh 后面追加对应的值,可参考default\_cluster.yaml,配置示例如下: -datanode-env.sh: -extra_opts: | -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC" -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200" - - -* confignode_servers 是部署IoTDB Confignodes配置,里面可以配置多个Confignode - 默认将第一个启动的ConfigNode节点node1当作Seed-ConfigNode - -| 参数 | 说明 | 是否必填 | -|-----------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------| -| name | Confignode 名称 | 必填 | -| deploy\_dir | IoTDB config node 部署目录 | 必填| | -| cn\_internal\_address | 对应iotdb/内部通信地址,对应`iotdb/config/iotdb-system.properties`中的`cn_internal_address` | 必填 | -| cn\_seed\_config\_node | 集群配置地址指向存活的ConfigNode,默认指向confignode_x,在`global`与`confignode_servers`同时配置值时优先使用`confignode_servers`中的值,对应`iotdb/config/iotdb-confignode.properties`中的`cn_seed_config_node` | 必填 | -| cn\_internal\_port | 内部通信端口,对应`iotdb/config/iotdb-system.properties`中的`cn_internal_port` | 必填 | -| cn\_consensus\_port | 对应`iotdb/config/iotdb-system.properties`中的`cn_consensus_port` | 非必填 | -| cn\_data\_dir | 对应`iotdb/config/iotdb-system.properties`中的`cn_data_dir` | 必填 | -| iotdb-system.properties | 
对应`iotdb/config/iotdb-system.properties`在`global`与`confignode_servers`同时配置值优先使用confignode\_servers中的值 | 非必填 | - -* datanode_servers 是部署IoTDB Datanodes配置,里面可以配置多个Datanode - -| 参数 | 说明 | 是否必填 | -| -------------------------- | ------------------------------------------------------------ | -------- | -| name | Datanode 名称 | 必填 | -| deploy_dir | IoTDB data node 部署目录,注:该目录不能与下面的IoTDB config node部署目录相同 | 必填 | -| dn_rpc_address | datanode rpc 地址对应`iotdb/config/iotdb-system.properties`中的`dn_rpc_address` | 必填 | -| dn_internal_address | 内部通信地址,对应`iotdb/config/iotdb-system.properties`中的`dn_internal_address` | 必填 | -| dn_seed_config_node | 集群配置地址指向存活的ConfigNode,默认指向confignode_x,在`global`与`datanode_servers`同时配置值时优先使用`datanode_servers`中的值,对应`iotdb/config/iotdb-datanode.properties`中的`dn_seed_config_node`,推荐使用 SeedConfigNode | 必填 | -| dn_rpc_port | datanode rpc端口地址,对应`iotdb/config/iotdb-system.properties`中的`dn_rpc_port` | 必填 | -| dn_internal_port | 内部通信端口,对应`iotdb/config/iotdb-system.properties`中的`dn_internal_port` | 必填 | -| iotdb-system.properties | 对应`iotdb/config/iotdb-system.properties`在`global`与`datanode_servers`同时配置值优先使用`datanode_servers`中的值 | 非必填 | - - -| 参数 | 说明 |是否必填| -|---------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|--- | -| name | Datanode 名称 |必填| -| deploy\_dir | IoTDB data node 部署目录 |必填| -| dn\_rpc\_address | datanode rpc 地址对应`iotdb/config/iotdb-system.properties`中的`dn_rpc_address` |必填| -| dn\_internal\_address | 内部通信地址,对应`iotdb/config/iotdb-system.properties`中的`dn_internal_address` |必填| -| dn\_seed\_config\_node | 集群配置地址指向存活的ConfigNode,默认指向confignode_x,在`global`与`datanode_servers`同时配置值时优先使用`datanode_servers`中的值,对应`iotdb/config/iotdb-system.properties`中的`dn_seed_config_node` |必填| -| dn\_rpc\_port | datanode rpc端口地址,对应`iotdb/config/iotdb-system.properties`中的`dn_rpc_port` |必填| -| dn\_internal\_port | 
内部通信端口,对应`iotdb/config/iotdb-system.properties`中的`dn_internal_port` |必填| -| iotdb-system.properties | 对应`iotdb/config/iotdb-common.properties`在`global`与`datanode_servers`同时配置值优先使用`datanode_servers`中的值 |非必填| - -* grafana_server 是部署Grafana 相关配置 - -| 参数 | 说明 | 是否必填 | -|--------------------|------------------|-------------------| -| grafana\_dir\_name | grafana 解压目录名称 | 非必填默认grafana_iotdb | -| host | grafana 部署的服务器ip | 必填 | -| grafana\_port | grafana 部署机器的端口 | 非必填,默认3000 | -| deploy\_dir | grafana 部署服务器目录 | 必填 | -| grafana\_tar\_dir | grafana 压缩包位置 | 必填 | -| dashboards | dashboards 所在的位置 | 非必填,多个用逗号隔开 | - -* prometheus_server 是部署Prometheus 相关配置 - -| 参数 | 说明 | 是否必填 | -|--------------------------------|------------------|-----------------------| -| prometheus_dir\_name | prometheus 解压目录名称 | 非必填默认prometheus_iotdb | -| host | prometheus 部署的服务器ip | 必填 | -| prometheus\_port | prometheus 部署机器的端口 | 非必填,默认9090 | -| deploy\_dir | prometheus 部署服务器目录 | 必填 | -| prometheus\_tar\_dir | prometheus 压缩包位置 | 必填 | -| storage\_tsdb\_retention\_time | 默认保存数据天数 默认15天 | 非必填 | -| storage\_tsdb\_retention\_size | 指定block可以保存的数据大小默认512M ,注意单位KB, MB, GB, TB, PB, EB | 非必填 | - -如果在config/xxx.yaml的`iotdb-system.properties`和`iotdb-system.properties`中配置了metrics,则会自动把配置放入到promethues无需手动修改 - -注意:如何配置yaml key对应的值包含特殊字符如:等建议整个value使用双引号,对应的文件路径中不要使用包含空格的路径,防止出现识别出现异常问题。 - -### 1.5 使用场景 - -#### 清理数据场景 - -* 清理集群数据场景会删除IoTDB集群中的data目录以及yaml文件中配置的`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`和`ext`目录。 -* 首先执行停止集群命令、然后在执行集群清理命令。 -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster clean default_cluster -``` - -#### 集群销毁场景 - -* 集群销毁场景会删除IoTDB集群中的`data`、`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`、`ext`、`IoTDB`部署目录、 - grafana部署目录和prometheus部署目录。 -* 首先执行停止集群命令、然后在执行集群销毁命令。 - - -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster destroy default_cluster -``` - -#### 集群升级场景 - -* 
集群升级首先需要在config/xxx.yaml中配置`iotdb_lib_dir`为要上传到服务器的jar所在目录路径(例如iotdb/lib)。 -* 如果使用zip文件上传请使用zip 命令压缩iotdb/lib目录例如 zip -r lib.zip apache-iotdb-1.2.0/lib/* -* 执行上传命令、然后执行重启IoTDB集群命令即可完成集群升级 - -```bash -iotdbctl cluster dist-lib default_cluster -iotdbctl cluster restart default_cluster -``` - -#### 集群配置文件的热部署场景 - -* 首先修改在config/xxx.yaml中配置。 -* 执行分发命令、然后执行热部署命令即可完成集群配置的热部署 - -```bash -iotdbctl cluster dist-conf default_cluster -iotdbctl cluster reload default_cluster -``` - -#### 集群扩容场景 - -* 首先修改在config/xxx.yaml中添加一个datanode 或者confignode 节点。 -* 执行集群扩容命令 -```bash -iotdbctl cluster scaleout default_cluster -``` - -#### 集群缩容场景 - -* 首先在config/xxx.yaml中找到要缩容的节点名字或者ip+port(其中confignode port 是cn_internal_port、datanode port 是rpc_port) -* 执行集群缩容命令 -```bash -iotdbctl cluster scalein default_cluster -``` - -#### 已有IoTDB集群,使用集群部署工具场景 - -* 配置服务器的`user`、`passwod`或`pkey`、`ssh_port` -* 修改config/xxx.yaml中IoTDB 部署路径,`deploy_dir`(IoTDB 部署目录)、`iotdb_dir_name`(IoTDB解压目录名称,默认是iotdb) - 例如IoTDB 部署完整路径是`/home/data/apache-iotdb-1.1.1`则需要修改yaml文件`deploy_dir:/home/data/`、`iotdb_dir_name:apache-iotdb-1.1.1` -* 如果服务器不是使用的java_home则修改`jdk_deploy_dir`(jdk 部署目录)、`jdk_dir_name`(jdk解压后的目录名称,默认是jdk_iotdb),如果使用的是java_home 则不需要修改配置 - 例如jdk部署完整路径是`/home/data/jdk_1.8.2`则需要修改yaml文件`jdk_deploy_dir:/home/data/`、`jdk_dir_name:jdk_1.8.2` -* 配置`cn_seed_config_node`、`dn_seed_config_node` -* 配置`confignode_servers`中`iotdb-system.properties`里面的`cn_internal_address`、`cn_internal_port`、`cn_consensus_port`、`cn_system_dir`、 - `cn_consensus_dir`里面的值不是IoTDB默认的则需要配置否则可不必配置 -* 配置`datanode_servers`中`iotdb-system.properties`里面的`dn_rpc_address`、`dn_internal_address`、`dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`等 -* 执行初始化命令 - -```bash -iotdbctl cluster init default_cluster -``` - -#### 一键部署IoTDB、Grafana和Prometheus 场景 - -* 配置`iotdb-system.properties` 打开metrics接口 -* 配置Grafana 配置,如果`dashboards` 有多个就用逗号隔开,名字不能重复否则会被覆盖。 -* 配置Prometheus配置,IoTDB 集群配置了metrics 则无需手动修改Prometheus 配置会根据哪个节点配置了metrics,自动修改Prometheus 配置。 -* 启动集群 - -```bash 
-iotdbctl cluster start default_cluster -``` - -更加详细参数请参考上方的 集群配置文件介绍 - - -### 1.6 命令格式 - -本工具的基本用法为: -```bash -iotdbctl cluster [params (Optional)] -``` -* key 表示了具体的命令。 - -* cluster name 表示集群名称(即`iotdbctl/config` 文件中yaml文件名字)。 - -* params 表示了命令的所需参数(选填)。 - -* 例如部署default_cluster集群的命令格式为: - -```bash -iotdbctl cluster deploy default_cluster -``` - -* 集群的功能及参数列表如下: - -| 命令 | 功能 | 参数 | -|-----------------|-------------------------------|-------------------------------------------------------------------------------------------------------------------------| -| check | 检测集群是否可以部署 | 集群名称列表 | -| clean | 清理集群 | 集群名称 | -| deploy/dist-all | 部署集群 | 集群名称 ,-N,模块名称(iotdb、grafana、prometheus可选),-op force(可选) | -| list | 打印集群及状态列表 | 无 | -| start | 启动集群 | 集群名称,-N,节点名称(nodename、grafana、prometheus可选) | -| stop | 关闭集群 | 集群名称,-N,节点名称(nodename、grafana、prometheus可选) ,-op force(nodename、grafana、prometheus可选) | -| restart | 重启集群 | 集群名称,-N,节点名称(nodename、grafana、prometheus可选),-op force(强制停止)/rolling(滚动重启) | -| show | 查看集群信息,details字段表示展示集群信息细节 | 集群名称, details(可选) | -| destroy | 销毁集群 | 集群名称,-N,模块名称(iotdb、grafana、prometheus可选) | -| scaleout | 集群扩容 | 集群名称 | -| scalein | 集群缩容 | 集群名称,-N,集群节点名字或集群节点ip+port | -| reload | 集群热加载 | 集群名称 | -| dist-conf | 集群配置文件分发 | 集群名称 | -| dumplog | 备份指定集群日志 | 集群名称,-N,集群节点名字 -h 备份至目标机器ip -pw 备份至目标机器密码 -p 备份至目标机器端口 -path 备份的目录 -startdate 起始时间 -enddate 结束时间 -loglevel 日志类型 -l 传输速度 | -| dumpdata | 备份指定集群数据 | 集群名称, -h 备份至目标机器ip -pw 备份至目标机器密码 -p 备份至目标机器端口 -path 备份的目录 -startdate 起始时间 -enddate 结束时间 -l 传输速度 | -| dist-lib | lib 包升级 | 集群名字(升级完后请重启) | -| init | 已有集群使用集群部署工具时,初始化集群配置 | 集群名字,初始化集群配置 | -| status | 查看进程状态 | 集群名字 | -| acitvate | 激活集群 | 集群名字 | -| dist-plugin | 上传plugin(udf,trigger,pipe)到集群 | 集群名字,-type 类型 U(udf)/T(trigger)/P(pipe) -file /xxxx/trigger.jar,上传完成后需手动执行创建udf、pipe、trigger命令 | -| upgrade | 滚动升级 | 集群名字 | -| health_check | 健康检查 | 集群名字,-N,节点名称(可选) | -| backup | 停机备份 | 集群名字,-N,节点名称(可选) | -| importschema | 元数据导入 | 集群名字,-N,节点名称(必填) -param 参数 | -| exportschema | 
元数据导出 | 集群名字,-N,节点名称(必填) -param 参数 | - - -### 1.7 细命令执行过程 - -下面的命令都是以default_cluster.yaml 为示例执行的,用户可以修改成自己的集群文件来执行 - -#### 检查集群部署环境命令 - -```bash -iotdbctl cluster check default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 验证目标节点是否能够通过 SSH 登录 - -* 验证对应节点上的 JDK 版本是否满足IoTDB jdk1.8及以上版本、服务器是否按照unzip、是否安装lsof 或者netstat - -* 如果看到下面提示`Info:example check successfully!` 证明服务器已经具备安装的要求, - 如果输出`Error:example check fail!` 证明有部分条件没有满足需求可以查看上面的输出的Error日志(例如:`Error:Server (ip:172.20.31.76) iotdb port(10713) is listening`)进行修复, - 如果检查jdk没有满足要求,我们可以自己在yaml 文件中配置一个jdk1.8 及以上版本的进行部署不影响后面使用, - 如果检查lsof、netstat或者unzip 不满足要求需要在服务器上自行安装。 - -#### 部署集群命令 - -```bash -iotdbctl cluster deploy default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 根据`confignode_servers` 和`datanode_servers`中的节点信息上传IoTDB压缩包和jdk压缩包(如果yaml中配置`jdk_tar_dir`和`jdk_deploy_dir`值) - -* 根据yaml文件节点配置信息生成并上传`iotdb-system.properties` - -```bash -iotdbctl cluster deploy default_cluster -op force -``` -注意:该命令会强制执行部署,具体过程会删除已存在的部署目录重新部署 - -*部署单个模块* -```bash -# 部署grafana模块 -iotdbctl cluster deploy default_cluster -N grafana -# 部署prometheus模块 -iotdbctl cluster deploy default_cluster -N prometheus -# 部署iotdb模块 -iotdbctl cluster deploy default_cluster -N iotdb -``` - -#### 启动集群命令 - -```bash -iotdbctl cluster start default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 启动confignode,根据yaml配置文件中`confignode_servers`中的顺序依次启动同时根据进程id检查confignode是否正常,第一个confignode 为seek config - -* 启动datanode,根据yaml配置文件中`datanode_servers`中的顺序依次启动同时根据进程id检查datanode是否正常 - -* 如果根据进程id检查进程存在后,通过cli依次检查集群列表中每个服务是否正常,如果cli链接失败则每隔10s重试一次直到成功最多重试5次 - - -*启动单个节点命令* -```bash -#按照IoTDB 节点名称启动 -iotdbctl cluster start default_cluster -N datanode_1 -#按照IoTDB 集群ip+port启动,其中port对应confignode的cn_internal_port、datanode的rpc_port -iotdbctl cluster start default_cluster -N 192.168.1.5:6667 -#启动grafana -iotdbctl 
cluster start default_cluster -N grafana -#启动prometheus -iotdbctl cluster start default_cluster -N prometheus -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件 - -* 根据提供的节点名称或者ip:port找到对于节点位置信息,如果启动的节点是`data_node`则ip使用yaml 文件中的`dn_rpc_address`、port 使用的是yaml文件中datanode_servers 中的`dn_rpc_port`。 - 如果启动的节点是`config_node`则ip使用的是yaml文件中confignode_servers 中的`cn_internal_address` 、port 使用的是`cn_internal_port` - -* 启动该节点 - -说明:由于集群部署工具仅是调用了IoTDB集群中的start-confignode.sh和start-datanode.sh 脚本, -在实际输出结果失败时有可能是集群还未正常启动,建议使用status命令进行查看当前集群状态(iotdbctl cluster status xxx) - - -#### 查看IoTDB集群状态命令 - -```bash -iotdbctl cluster show default_cluster -#查看IoTDB集群详细信息 -iotdbctl cluster show default_cluster details -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 依次在datanode通过cli执行`show cluster details` 如果有一个节点执行成功则不会在后续节点继续执行cli直接返回结果 - - -#### 停止集群命令 - - -```bash -iotdbctl cluster stop default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 根据`datanode_servers`中datanode节点信息,按照配置先后顺序依次停止datanode节点 - -* 根据`confignode_servers`中confignode节点信息,按照配置依次停止confignode节点 - -*强制停止集群命令* - -```bash -iotdbctl cluster stop default_cluster -op force -``` -会直接执行kill -9 pid 命令强制停止集群 - -*停止单个节点命令* - -```bash -#按照IoTDB 节点名称停止 -iotdbctl cluster stop default_cluster -N datanode_1 -#按照IoTDB 集群ip+port停止(ip+port是按照datanode中的ip+dn_rpc_port获取唯一节点或confignode中的ip+cn_internal_port获取唯一节点) -iotdbctl cluster stop default_cluster -N 192.168.1.5:6667 -#停止grafana -iotdbctl cluster stop default_cluster -N grafana -#停止prometheus -iotdbctl cluster stop default_cluster -N prometheus -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件 - -* 根据提供的节点名称或者ip:port找到对应节点位置信息,如果停止的节点是`data_node`则ip使用yaml 文件中的`dn_rpc_address`、port 使用的是yaml文件中datanode_servers 中的`dn_rpc_port`。 - 如果停止的节点是`config_node`则ip使用的是yaml文件中confignode_servers 中的`cn_internal_address` 、port 使用的是`cn_internal_port` - -* 停止该节点 - -说明:由于集群部署工具仅是调用了IoTDB集群中的stop-confignode.sh和stop-datanode.sh 
脚本,在某些情况下有可能iotdb集群并未停止。 - - -#### 清理集群数据命令 - -```bash -iotdbctl cluster clean default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`配置信息 - -* 根据`confignode_servers`、`datanode_servers`中的信息,检查是否还有服务正在运行, - 如果有任何一个服务正在运行则不会执行清理命令 - -* 删除IoTDB集群中的data目录以及yaml文件中配置的`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`和`ext`目录。 - - - -#### 重启集群命令 - -```bash -iotdbctl cluster restart default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 - -* 执行上述的停止集群命令(stop),然后执行启动集群命令(start) 具体参考上面的start 和stop 命令 - -*强制重启集群命令* - -```bash -iotdbctl cluster restart default_cluster -op force -``` -会直接执行kill -9 pid 命令强制停止集群,然后启动集群 - -*重启单个节点命令* - -```bash -#按照IoTDB 节点名称重启datanode_1 -iotdbctl cluster restart default_cluster -N datanode_1 -#按照IoTDB 节点名称重启confignode_1 -iotdbctl cluster restart default_cluster -N confignode_1 -#重启grafana -iotdbctl cluster restart default_cluster -N grafana -#重启prometheus -iotdbctl cluster restart default_cluster -N prometheus -``` - -#### 集群缩容命令 - -```bash -#按照节点名称缩容 -iotdbctl cluster scalein default_cluster -N nodename -#按照ip+port缩容(ip+port按照datanode中的ip+dn_rpc_port获取唯一节点,confignode中的ip+cn_internal_port获取唯一节点) -iotdbctl cluster scalein default_cluster -N ip:port -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 判断要缩容的confignode节点和datanode是否只剩一个,如果只剩一个则不能执行缩容 - -* 然后根据ip:port或者nodename 获取要缩容的节点信息,执行缩容命令,然后销毁该节点目录,如果缩容的节点是`data_node`则ip使用yaml 文件中的`dn_rpc_address`、port 使用的是yaml文件中datanode_servers 中的`dn_rpc_port`。 - 如果缩容的节点是`config_node`则ip使用的是yaml文件中confignode_servers 中的`cn_internal_address` 、port 使用的是`cn_internal_port` - - -提示:目前一次仅支持一个节点缩容 - -#### 集群扩容命令 - -```bash -iotdbctl cluster scaleout default_cluster -``` -* 修改config/xxx.yaml 文件添加一个datanode 节点或者confignode节点 - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 
找到要扩容的节点,执行上传IoTDB压缩包和jdb包(如果yaml中配置`jdk_tar_dir`和`jdk_deploy_dir`值)并解压 - -* 根据yaml文件节点配置信息生成并上传`iotdb-system.properties` - -* 执行启动该节点命令并校验节点是否启动成功 - -提示:目前一次仅支持一个节点扩容 - -#### 销毁集群命令 -```bash -iotdbctl cluster destroy default_cluster -``` - -* cluster-name 找到默认位置的 yaml 文件 - -* 根据`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`中node节点信息,检查是否节点还在运行, - 如果有任何一个节点正在运行则停止销毁命令 - -* 删除IoTDB集群中的`data`以及yaml文件配置的`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`、`ext`、`IoTDB`部署目录、 - grafana部署目录和prometheus部署目录 - -*销毁单个模块* -```bash -# 销毁grafana模块 -iotdbctl cluster destroy default_cluster -N grafana -# 销毁prometheus模块 -iotdbctl cluster destroy default_cluster -N prometheus -# 销毁iotdb模块 -iotdbctl cluster destroy default_cluster -N iotdb -``` - -#### 分发集群配置命令 -```bash -iotdbctl cluster dist-conf default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 - -* 根据yaml文件节点配置信息生成并依次上传`iotdb-system.properties`到指定节点 - -#### 热加载集群配置命令 -```bash -iotdbctl cluster reload default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 根据yaml文件节点配置信息依次在cli中执行`load configuration` - -#### 集群节点日志备份 -```bash -iotdbctl cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs' -``` -* 根据 cluster-name 找到默认位置的 yaml 文件 - -* 该命令会根据yaml文件校验datanode_1,confignode_1 是否存在,然后根据配置的起止日期(startdate<=logtime<=enddate)备份指定节点datanode_1,confignode_1 的日志数据到指定服务`192.168.9.48` 端口`36000` 数据备份路径是 `/iotdb/logs` ,IoTDB日志存储路径在`/root/data/db/iotdb/logs`(非必填,如果不填写-logs xxx 默认从IoTDB安装路径/logs下面备份日志) - -| 命令 | 功能 | 是否必填 | -|------------|------------------------------------| ---| -| -h | 存放备份数据机器ip |否| -| -u | 存放备份数据机器用户名 |否| -| -pw | 存放备份数据机器密码 |否| -| -p | 存放备份数据机器端口(默认22) |否| -| -path | 存放备份数据的路径(默认当前路径) |否| -| -loglevel | 
日志基本有all、info、error、warn(默认是全部) |否| -| -l | 限速(默认不限速范围0到104857601 单位Kbit/s) |否| -| -N | 配置文件集群名称多个用逗号隔开 |是| -| -startdate | 起始时间(包含默认1970-01-01) |否| -| -enddate | 截止时间(包含) |否| -| -logs | IoTDB 日志存放路径,默认是({iotdb}/logs) |否| - -#### 集群节点数据备份 -```bash -iotdbctl cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas' -``` -* 该命令会根据yaml文件获取leader 节点,然后根据起止日期(startdate<=logtime<=enddate)备份数据到192.168.9.48 服务上的/iotdb/datas 目录下 - -| 命令 | 功能 | 是否必填 | -| ---|---------------------------------| ---| -|-h| 存放备份数据机器ip |否| -|-u| 存放备份数据机器用户名 |否| -|-pw| 存放备份数据机器密码 |否| -|-p| 存放备份数据机器端口(默认22) |否| -|-path| 存放备份数据的路径(默认当前路径) |否| -|-granularity| 类型partition |是| -|-l| 限速(默认不限速范围0到104857601 单位Kbit/s) |否| -|-startdate| 起始时间(包含) |是| -|-enddate| 截止时间(包含) |是| - -#### 集群lib包上传(升级) -```bash -iotdbctl cluster dist-lib default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 上传lib包 - -注意执行完升级后请重启IoTDB 才能生效 - -#### 集群初始化 -```bash -iotdbctl cluster init default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 -* 初始化集群配置 - -#### 查看集群进程状态 -```bash -iotdbctl cluster status default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 -* 展示集群的存活状态 - -#### 集群授权激活 - -集群激活默认是通过输入激活码激活,也可以通过-op license_path 通过license路径激活 - -* 默认激活方式 -```bash -iotdbctl cluster activate default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`配置信息 -* 读取里面的机器码 -* 等待输入激活码 - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* 激活单个节点 - -```bash 
-iotdbctl cluster activate default_cluster -N confignode1 -``` - -* Activate via a license path - -```bash -iotdbctl cluster activate default_cluster -op license_path -``` -* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` configuration -* Read the machine code from it -* Wait for the activation code to be entered - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* Activate a single node - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -op license_path -``` - -### 1.8 Cluster Plugin Distribution -```bash -# Distribute a UDF -iotdbctl cluster dist-plugin default_cluster -type U -file /xxxx/udf.jar -# Distribute a trigger -iotdbctl cluster dist-plugin default_cluster -type T -file /xxxx/trigger.jar -# Distribute a pipe plugin -iotdbctl cluster dist-plugin default_cluster -type P -file /xxxx/pipe.jar -``` -* Locate the yaml file in the default location by cluster-name and read the `datanode_servers` configuration - -* Upload the udf/trigger/pipe jar package - -After the upload completes, the create udf/trigger/pipe command must be run manually - -### 1.9 Cluster Rolling Upgrade -```bash -iotdbctl cluster upgrade default_cluster -``` -* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration - -* Upload the lib package -* Each confignode is stopped, has its lib package replaced, and is started again; the datanodes then go through the same stop, replace, start sequence - - - -### 1.10 Cluster Health Check -```bash -iotdbctl cluster health_check default_cluster -``` -* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration -* Run health_check.sh on every node - -* Health check for a single node -```bash -iotdbctl cluster health_check default_cluster -N datanode_1 -``` -* Locate the yaml file in the default location by cluster-name and read the `datanode_servers` configuration -* Run health_check.sh on datanode_1 - - -### 1.11 Cluster Shutdown Backup -```bash -iotdbctl cluster backup default_cluster -``` -* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration -* Run backup.sh on every node - -* Backup for a single node -```bash -iotdbctl cluster backup default_cluster -N datanode_1 -``` -* Locate the yaml file in the default location by cluster-name and read the `datanode_servers` configuration -* Run backup.sh on datanode_1 - -Note: when multiple nodes are deployed on a single machine, only quick mode is supported - -### 1.12 Cluster Schema Import - -```bash -iotdbctl cluster importschema default_cluster -N datanode1 -param "-s ./dump0.csv -fd ./failed/ -lpf 10000" -``` -* Locate the yaml file in the default location by cluster-name and read the `datanode_servers` configuration -* Run the schema import script import-schema.sh on datanode1 - -The options accepted by -param are: - -| Option | Description | Required | -|-----|---------------------------------|------| -| -s | The data file to import. Either a file or a folder can be given; for a folder, all files in it with the csv suffix are imported in batch. | Yes | -| -fd | A directory for files that failed to import. If omitted, failed files are saved in the source data directory, named as the source file name plus the .failed suffix. | No | -| -lpf | Number of lines written to each import-failure file; the default is 10000 | No | - - - -### 1.13 Cluster Schema Export - -```bash -iotdbctl cluster exportschema default_cluster -N datanode1 -param "-t ./ -pf ./pattern.txt -lpf 10 -timeout 10000" -``` -* Locate the yaml file in the default location by cluster-name and read the `datanode_servers` configuration -* Run the schema export script export-schema.sh on datanode1 - -The options accepted by -param are: - -| Option | Description | Required | -|-----|------------------------------------------------------------|------| -| -t | Output path for the exported CSV files | Yes | -| -path | The path pattern of the schema to export; when specified, the -s parameter is ignored. Example: root.stock.** | No | -| -pf | Required if -path is not specified. The path of a file listing the schema paths to query; txt format is supported, with one path to export per line. | No | -| -lpf | Maximum number of lines in each exported dump file; the default is 10000. | No | -| -timeout | Timeout for session queries, in ms | No | - - - -### 1.14 Cluster Deployment Tool Samples -The config/example directory under the cluster deployment tool's installation directory contains 3 yaml samples; copy one into config and modify it as needed - -| Name | Description | -|-----------------------------|------------------------------------------------| -| default\_1c1d.yaml | Sample configuration with 1 confignode and 1 datanode | -| default\_3c3d.yaml | Sample configuration with 3 confignodes and 3 datanodes | -| default\_3c3d\_grafa\_prome | Sample configuration with 3 confignodes, 3 datanodes, Grafana and Prometheus | - -## 2. 
Data Directory Overview Tool - -The IoTDB data directory overview tool prints a structural overview of a data directory. The tool is located at tools/tsfile/print-iotdb-data-dir. - -### 2.1 Usage - -- Windows: - -```bash -.\print-iotdb-data-dir.bat (<path where the output is stored>) -``` - -- Linux or macOS: - -```shell -./print-iotdb-data-dir.sh (<path where the output is stored>) -``` - -Note: if no output path is set, the relative path "IoTDB_data_dir_overview.txt" is used as the default. - -### 2.2 Example - -Using Windows as an example: - -`````````````````````````bash -.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data -```````````````````````` -Starting Printing the IoTDB Data Directory Overview -```````````````````````` -output save path:IoTDB_data_dir_overview.txt -data dir num:1 -143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -````````````````````````` - -## 3. 
TsFile Sketch Tool - -The TsFile sketch tool prints the contents of a TsFile in sketch mode. The tool is located at tools/tsfile/print-tsfile. - -### 3.1 Usage - -- Windows: - -```bash -.\print-tsfile-sketch.bat (<path where the output is stored>) -``` - -- Linux or macOS: - -```shell -./print-tsfile-sketch.sh (<path where the output is stored>) -``` - -Note: if no output path is set, the relative path "TsFile_sketch_view.txt" is used as the default. - -### 3.2 Example - -Using Windows as an example: - -`````````````````````````bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. --------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of 
root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - 
└──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -````````````````````````` - -Explanation: - -- "|" is the separator: the left side is the actual position in the TsFile, and the right side is the sketch content. -- "|||||||||||||||||||||" is guide information added for readability; it is not data actually stored in the TsFile. -- The "IndexOfTimerseriesIndex Tree" printed at the end is a rearranged printout of the metadata index tree at the end of the TsFile, provided for easier understanding; it is not data actually stored in the TsFile. - -## 4. TsFile Resource Overview Tool - -The TsFile resource overview tool prints the contents of TsFile resource files. The tool is located at tools/tsfile/print-tsfile-resource-files. - -### 4.1 Usage - -- Windows: - -```bash -.\print-tsfile-resource-files.bat -``` - -- Linux or macOS: - -``` -./print-tsfile-resource-files.sh -``` - -### 4.2 Example - -Using Windows as an example: - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... 
- -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. -````````````````````````` - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. 
-````````````````````````` diff --git a/src/zh/UserGuide/Master/Tree/Tools-System/Monitor-Tool_timecho.md b/src/zh/UserGuide/Master/Tree/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index 9f5d3f26a..000000000 --- a/src/zh/UserGuide/Master/Tree/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,187 +0,0 @@ - - - -# Monitoring Tool - -For deploying the monitoring tool, see the [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) chapter. - -## 1. Prometheus Mapping of Monitoring Metrics - -> A metric whose Metric Name is name and whose Tags are K1=V1, ..., Kn=Vn is mapped as follows, where value is the concrete value - -| Metric type | Mapping | -| ---------------- |----------------| -| Counter | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value | -| AutoGauge, Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value | -| Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value | -| Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m1"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m5"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m15"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="mean"} value | -| Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value | - -## 2. Modifying the Configuration File - -1) Taking DataNode as an example, modify the iotdb-system.properties configuration file as follows: - -```properties -dn_metric_reporter_list=PROMETHEUS -dn_metric_level=CORE -dn_metric_prometheus_reporter_port=9091 -``` - -2) Start the IoTDB DataNode - -3) Open a browser or use ```curl``` to access ```http://server_ip:9091/metrics```; metric data like the following is returned: - -``` -... -# HELP file_count -# TYPE file_count gauge -file_count{name="wal",} 0.0 -file_count{name="unseq",} 0.0 -file_count{name="seq",} 2.0 -... -``` - -## 3. Prometheus + Grafana - -As shown above, IoTDB exposes its monitoring metrics in the standard Prometheus format, so Prometheus can be used to collect and store the metrics and Grafana -to visualize them. - -The relationship between IoTDB, Prometheus and Grafana is shown in the figure below: - -![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png) - -1. IoTDB continuously collects monitoring metrics while running. -2. Prometheus pulls the metrics from IoTDB's HTTP interface at a fixed (configurable) interval. -3. Prometheus stores the pulled metrics in its own TSDB. -4. Grafana queries the metrics from Prometheus at a fixed (configurable) interval and plots them. - -As this flow shows, some additional work is needed to deploy and configure Prometheus and Grafana. - -For example, Prometheus can be configured as follows (some parameters can be adjusted as needed) to pull monitoring data from IoTDB - -```yaml -job_name: pull-metrics -honor_labels: true -honor_timestamps: true -scrape_interval: 15s -scrape_timeout: 10s -metrics_path: /metrics -scheme: http -follow_redirects: true -static_configs: - - targets: - - localhost:9091 -``` - -For more details, see the following documents: - -[Prometheus installation and usage documentation](https://prometheus.io/docs/prometheus/latest/getting_started/) - -[Prometheus configuration for pulling metrics from an HTTP interface](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) - -[Grafana installation and usage documentation](https://grafana.com/docs/grafana/latest/getting-started/getting-started/) - -[Documentation on querying data from Prometheus and plotting it in Grafana](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus) - -## 4. 
Apache IoTDB Dashboard - -我们提供了Apache IoTDB Dashboard,支持统一集中式运维管理,可通过一个监控面板监控多个集群。 - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png) - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png) - -你可以在企业版中获取到 Dashboard 的 Json文件。 - -### 4.1 集群概览 - -可以监控包括但不限于: -- 集群总CPU核数、总内存空间、总硬盘空间 -- 集群包含多少个ConfigNode与DataNode -- 集群启动时长 -- 集群写入速度 -- 集群各节点当前CPU、内存、磁盘使用率 -- 分节点的信息 - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png) - -### 4.2 数据写入 - -可以监控包括但不限于: -- 写入平均耗时、耗时中位数、99%分位耗时 -- WAL文件数量与尺寸 -- 节点 WAL flush SyncBuffer 耗时 - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png) - -### 4.3 数据查询 - -可以监控包括但不限于: -- 节点查询加载时间序列元数据耗时 -- 节点查询读取时间序列耗时 -- 节点查询修改时间序列元数据耗时 -- 节点查询加载Chunk元数据列表耗时 -- 节点查询修改Chunk元数据耗时 -- 节点查询按照Chunk元数据过滤耗时 -- 节点查询构造Chunk Reader耗时的平均值 - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png) - -### 4.4 存储引擎 - -可以监控包括但不限于: -- 分类型的文件数量、大小 -- 处于各阶段的TsFile数量、大小 -- 各类任务的数量与耗时 - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png) - -### 4.5 系统监控 - -可以监控包括但不限于: -- 系统内存、交换内存、进程内存 -- 磁盘空间、文件数、文件尺寸 -- JVM GC时间占比、分类型的GC次数、GC数据量、各年代的堆内存占用 -- 网络传输速率、包发送速率 - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png) - -### 4.6 数据同步 - -可以监控包括但不限于: -- Pipe事件提交队列大小、未分配Pipe事件数。 -- Source队列未处理事件数、Source供给事件速率、Processor处理事件速率。 -- 各类Pipesink/source未传输事件数、Pipe connector传输事件速率。 -- Pipesink重试队列和pending handler数量、Pipesink压缩前后累计大小和压缩耗时、Pipesink 批量大小和批处理间隔分布。 -- Pipe内存占用和容量、Pipe phantom reference数量、linked TsFile数量和大小、Pipe发送TsFile读取磁盘字节数。 - -![](/img/monitor-tool-pipe-1.png) - -![](/img/monitor-tool-pipe-2.png) - -![](/img/monitor-tool-pipe-3.png) - -![](/img/monitor-tool-pipe-4.png) \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/Tools-System/Schema-Export-Tool_timecho.md 
b/src/zh/UserGuide/Master/Tree/Tools-System/Schema-Export-Tool_timecho.md deleted file mode 100644 index c7e544003..000000000 --- a/src/zh/UserGuide/Master/Tree/Tools-System/Schema-Export-Tool_timecho.md +++ /dev/null @@ -1,84 +0,0 @@ - - -# 元数据导出 - -## 1. 功能概述 - -元数据导出工具 `export-schema.sh/bat` 位于tools 目录下,能够将 IoTDB 中指定数据库下的元数据导出为脚本文件。 - -## 2. 功能详解 - -### 2.1 参数介绍 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -| --------------------- | -------------------------- | ------------------------------------------------------------------------ | ----------------------------------- |---------------------------------------| -| `-h` | `-- host` | 主机名 | 否 | 127.0.0.1 | -| `-p` | `--port` | 端口号 | 否 | 6667 | -| `-u` | `--username` | 用户名 | 否 | root | -| `-pw` | `--password` | 密码,自 V2.0.9.1 起支持隐藏输入 | 否 | TimechoDB@2021 (V2.0.6.x 版本之前为 root) | -| `-sql_dialect` | `--sql_dialect` | 选择 server 是树模型还是表模型,当前支持 tree 和 table 类型 | 否 | tree | -| `-db` | `--database` | 将要导出的目标数据库,只在`-sql_dialect`为 table 类型下生效。 | `-sql_dialect`为 table 时必填 | - | -| `-table` | `--table` | 将要导出的目标表,只在`-sql_dialect`为 table 类型下生效。 | 否 | - | -| `-t` | `--target` | 指定输出文件的目标文件夹,如果路径不存在新建文件夹 | 是 | | -| `-path` | `--path_pattern` | 指定导出元数据的path pattern | `-sql_dialect`为 tree 时必填 | | -| `-pfn` | `--prefix_file_name` | 指定导出文件的名称。 | 否 | dump\_dbname.sql | -| `-lpf` | `--lines_per_file` | 指定导出的dump文件最大行数,只在`-sql_dialect`为 tree 类型下生效。 | 否 | `10000` | -| `-timeout` | `--query_timeout` | 会话查询的超时时间(ms) | 否 | -1范围:-1~Long. 
max=9223372036854775807 | -| `-help` | `--help` | 显示帮助信息 | 否 | | -| `-usessl` | `--use_ssl` | 使用 SSL 协议,自 V2.0.9.1 起支持 | 否 | - | -| `-ts` | `--trust_store` | 信任库。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | -| `-tpw` | `--trust_store_password` | 信任库密码。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | - -### 2.2 运行命令 - -```Bash -Shell -# Unix/OS X -> tools/export-schema.sh [-sql_dialect] -db -table - [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -# Windows -# V2.0.4.x 版本之前 -> tools\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] - -# V2.0.4.x 版本及之后 -> tools\windows\schema\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -``` - -### 2.3 运行示例 - -```Bash -# 导出 root.treedb路径下的元数据 -./export-schema.sh -sql_dialect tree -t /home/ -path "root.treedb.**" - -# 导出结果内容格式如下 -Timeseries,Alias,DataType,Encoding,Compression -root.treedb.device.temperature,,DOUBLE,GORILLA,LZ4 -root.treedb.device.humidity,,DOUBLE,GORILLA,LZ4 -``` diff --git a/src/zh/UserGuide/Master/Tree/Tools-System/Schema-Import-Tool_timecho.md b/src/zh/UserGuide/Master/Tree/Tools-System/Schema-Import-Tool_timecho.md deleted file mode 100644 index e60f962b8..000000000 --- a/src/zh/UserGuide/Master/Tree/Tools-System/Schema-Import-Tool_timecho.md +++ /dev/null @@ -1,90 +0,0 @@ - - -# 元数据导入 - -## 1. 功能概述 - -元数据导入工具 `import-schema.sh/bat` 位于tools 目录下,能够将指定路径下创建元数据的脚本文件导入到 IoTDB 中。 - -## 2. 功能详解 - -### 2.1 参数介绍 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -|----------------|--------------------------------|----------------------------------------------|--------|--------------------------------------| -| `-h` | `-- host` | 主机名 | 否 | 127.0.0.1 | -| `-p` | `--port` | 端口号 | 否 | 6667 | -| `-u` | `--username` | 用户名 | 否 | root | -| `-pw` | `--password` | 密码,自 V2.0.9.1 起支持隐藏输入 | 否 | TimechoDB@2021 (V2.0.6.x 版本之前为 root) | -| `-sql_dialect` | `--sql_dialect` | 选择 server 是树模型还是表模型,当前支持 tree 和 table 类型 | 否 | tree | -| `-db` | `--database` | 将要导入的目标数据库 | `是` | - | -| `-table` | `--table` | 将要导入的目标表,只在`-sql_dialect`为 table 类型下生效。 | 否 | - | -| `-s` | `--source` | 待加载的脚本文件(夹)的本地目录路径。 | 是 | | -| `-fd` | `--fail_dir` | 指定保存失败文件的目录 | 否 | | -| `-lpf` | `--lines_per_failed_file` | 指定失败文件最大写入数据的行数,只在`-sql_dialect`为 table 类型下生效。 | 否 | 100000范围:0~Integer.Max=2147483647 | -| `-help` | `--help` | 显示帮助信息 | 否 | | -| `-usessl` | `--use_ssl` | 使用 SSL 协议,自 V2.0.9.1 起支持 | 否 | - | -| `-ts` | `--trust_store` | 信任库。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | -| `-tpw` | `--trust_store_password` | 信任库密码。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | - -### 2.2 运行命令 - -```Bash -# Unix/OS X -tools/import-schema.sh [-sql_dialect] -db 
-table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# Windows -# V2.0.4.x 版本之前 -tools\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# V2.0.4.x 版本及之后 -tools\windows\schema\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] -``` - -### 2.3 运行示例 - -```Bash -# 导入前 -IoTDB> show timeseries root.treedb.** -+----------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -+----------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -# 执行导入命令 -./import-schema.sh -sql_dialect tree -s /home/dump0_0.csv -db root.treedb - -# 导入成功后验证 -IoTDB> show timeseries root.treedb.** -+------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias| Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.treedb.device.temperature| null|root.treedb| DOUBLE| GORILLA| LZ4|null| null| null| null| BASE| -| root.treedb.device.humidity| null|root.treedb| DOUBLE| GORILLA| LZ4|null| null| null| null| BASE| -+------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+ -``` diff --git a/src/zh/UserGuide/Master/Tree/Tools-System/Workbench_timecho.md b/src/zh/UserGuide/Master/Tree/Tools-System/Workbench_timecho.md deleted file mode 100644 index 423a657a8..000000000 --- a/src/zh/UserGuide/Master/Tree/Tools-System/Workbench_timecho.md +++ /dev/null @@ -1,34 +0,0 @@ -# 可视化控制台 - -可视化控制台的部署可参考文档 [可视化控制台部署](../Deployment-and-Maintenance/workbench-deployment_timecho.md) 章节。 - -## 1. 
产品介绍 -IoTDB可视化控制台是在IoTDB企业版时序数据库基础上针对工业场景的实时数据收集、存储与分析一体化的数据管理场景开发的扩展组件,旨在为用户提供高效、可靠的实时数据存储和查询解决方案。它具有体量轻、性能高、易使用的特点,完美对接 Hadoop 与 Spark 生态,适用于工业物联网应用中海量时间序列数据高速写入和复杂分析查询的需求。 - -## 2. 使用说明 -IoTDB的可视化控制台包含以下功能模块: -| **功能模块** | **功能说明** | -| ------------ | ------------------------------------------------------------ | -| 实例管理 | 支持对连接实例进行统一管理,支持创建、编辑和删除,同时可以可视化呈现多实例的关系,帮助客户更清晰的管理多数据库实例 | -| 首页 | 支持查看数据库实例中各节点的服务运行状态(如是否激活、是否运行、IP信息等),支持查看集群、ConfigNode、DataNode运行监控状态,对数据库运行健康度进行监控,判断实例是否有潜在运行问题 | -| 测点列表 | 支持直接查看实例中的测点信息,包括所在数据库信息(如数据库名称、数据保存时间、设备数量等),及测点信息(测点名称、数据类型、压缩编码等),同时支持单条或批量创建、导出、删除测点 | -| 数据模型 | 支持查看各层级从属关系,将层级模型直观展示 | -| 数据查询 | 支持对常用数据查询场景提供界面式查询交互,并对查询数据进行批量导入、导出 | -| 统计查询 | 支持对常用数据统计场景提供界面式查询交互,如最大值、最小值、平均值、总和的结果输出。 | -| SQL操作 | 支持对数据库SQL进行界面式交互,单条或多条语句执行,结果的展示和导出 | -| 趋势 | 支持一键可视化查看数据整体趋势,对选中测点进行实时&历史数据绘制,观察测点实时&历史运行状态 | -| 分析 | 支持将数据通过不同的分析方式(如傅里叶变换等)进行可视化展示 | -| 视图 | 支持通过界面来查看视图名称、视图描述、结果测点以及表达式等信息,同时还可以通过界面交互快速的创建、编辑、删除视图 | -| 数据同步 | 支持对数据库间的数据同步任务进行直观创建、查看、管理,支持直接查看任务运行状态、同步数据和目标地址,还可以通过界面实时观察到同步状态的监控指标变化 | -| 权限管理 | 支持对权限进行界面管控,用于管理和控制数据库用户访问和操作数据库的权限 | -| 审计日志 | 支持对用户在数据库上的操作进行详细记录,包括DDL、DML和查询操作。帮助用户追踪和识别潜在的安全威胁、数据库错误和滥用行为 | - -主要功能展示: -* 首页 -![首页.png](/img/%E9%A6%96%E9%A1%B5.png) -* 测点列表 -![测点列表.png](/img/workbench-1.png) -* 数据查询 -![数据查询.png](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2.png) -* 趋势 -![历史趋势.png](/img/%E5%8E%86%E5%8F%B2%E8%B6%8B%E5%8A%BF.png) \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/User-Manual/Audit-Log_timecho.md b/src/zh/UserGuide/Master/Tree/User-Manual/Audit-Log_timecho.md deleted file mode 100644 index 14e0f28cb..000000000 --- a/src/zh/UserGuide/Master/Tree/User-Manual/Audit-Log_timecho.md +++ /dev/null @@ -1,166 +0,0 @@ - - - -# 安全审计 - -## 1. 
引言 - -审计日志是数据库的记录凭证,通过审计日志功能可以查询到数据库中增删改查等各项操作,以保证信息安全。IoTDB 审计日志功能支持以下特性: - -* 可通过配置决定是否开启审计日志功能 -* 可通过参数设置审计日志记录的操作类型和权限级别 -* 可通过参数设置审计日志文件的存储周期,包括基于 TTL 实现时间滚动和基于 SpaceTL 实现空间滚动。 -* 可通过参数设置统计任意时间段内写入和查询延时大于阈值(默认3000毫秒)的慢请求个数。 -* 审计日志文件默认加密存储 - -> 注意:该功能从 V2.0.8 版本开始提供。 - -## 2. 配置参数 - -通过编辑配置文件 `iotdb-system.properties` 中如下参数来启动审计日志功能。 - -* V2.0.8.1 - - -| 参数名称 | 参数描述 | 数据类型 | 默认值 | 生效方式 | -|-----------------------------------|------------------------------------------------------------------------------------------------------| ---------- | ------------------------ | ---------- | -| `enable_audit_log` | 是否开启审计日志。 true:启用。false:禁用。 | Boolean | false | 热加载 | -| `auditable_operation_type` | 操作类型选择。 DML :所有 DML 都会记录审计日志; DDL :所有 DDL 都会记录审计日志; QUERY :所有 QUERY 都会记录审计日志; CONTROL:所有控制语句都会记录审计日志; | String | DML,DDL,QUERY,CONTROL | 热加载 | -| `auditable_operation_level` | 权限级别选择。 global :记录全部的审计日志; object:仅针对数据实例的事件的审计日志会被记录; 包含关系:object < global。 例如:设置为 global 时,所有审计日志正常记录;设置为 object 时,仅记录对具体数据实例的操作。 | String | global | 热加载 | -| `auditable_operation_result` | 审计结果选择。 success:只记录成功事件的审计日志; fail:只记录失败事件的审计日志; | String | success, fail | 热加载 | -| `audit_log_ttl_in_days` | 审计日志的 TTL,生成审计日志的时间达到该阈值后过期。 | Double | -1.0(永远不会被删除) | 热加载 | -| `audit_log_space_tl_in_GB` | 审计日志的 SpaceTL,审计日志总空间达到该阈值后开始轮转删除。 | Double | 1.0| 热加载| -| `audit_log_batch_interval_in_ms` | 审计日志批量写入的时间间隔 | Long | 1000 | 热加载 | -| `audit_log_batch_max_queue_bytes` | 用于批量处理审计日志的队列最大字节数。当队列大小超过此值时,后续的写入操作将被阻塞。 | Long | 268435456 | 热加载 | - - -* V2.0.9.2 - -| 参数名称 | 参数描述 | 数据类型 | 默认值 | 生效方式 | -|-----------------------------------|------------------------------------------------------------------------------------------------------| ---------- | ------------------------ | ---------- | -| `enable_audit_log` | 是否开启审计日志。 true:启用。false:禁用。 | Boolean | false | 热加载 | -| `auditable_operation_type` | 操作类型选择。 DML :所有 DML 都会记录审计日志; DDL :所有 DDL 都会记录审计日志; QUERY :所有 QUERY 都会记录审计日志; CONTROL:所有控制语句都会记录审计日志; | String | 
DML,DDL,QUERY,CONTROL | 热加载 | -| `auditable_dml_event_type` | 审计DML操作时的事件类型。`OBJECT_AUTHENTICATION`:对象鉴权,`SLOW_OPERATION`:慢操作 | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | 热加载 | -| `auditable_ddl_event_type` | 审计DDL操作时的事件类型。`OBJECT_AUTHENTICATION`:对象鉴权,`SLOW_OPERATION`:慢操作 | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | 热加载 | -| `auditable_query_event_type` | 审计查询操作时的事件类型。`OBJECT_AUTHENTICATION`:对象鉴权,`SLOW_OPERATION`:慢操作 | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | 热加载 | -| `auditable_control_event_type` | 审计控制操作时的事件类型。`CHANGE_AUDIT_OPTION`:审计选项变更,`OBJECT_AUTHENTICATION`:对象鉴权,`LOGIN`:登录,`LOGOUT`:退出登录,`DN_SHUTDOWN`:数据节点关机,`SLOW_OPERATION`:慢操作 | String | `CHANGE_AUDIT_OPTION`,`OBJECT_AUTHENTICATION`,`LOGIN`,`LOGOUT`,`DN_SHUTDOWN`,`SLOW_OPERATION` | 热加载 | -| `auditable_operation_level` | 权限级别选择。 global :记录全部的审计日志; object:仅针对数据实例的事件的审计日志会被记录; 包含关系:object < global。 例如:设置为 global 时,所有审计日志正常记录;设置为 object 时,仅记录对具体数据实例的操作。 | String | global | 热加载 | -| `auditable_operation_result` | 审计结果选择。 success:只记录成功事件的审计日志; fail:只记录失败事件的审计日志; | String | success, fail | 热加载 | -| `audit_log_ttl_in_days` | 审计日志的 TTL,生成审计日志的时间达到该阈值后过期。 | Double | -1.0(永远不会被删除) | 热加载 | -| `audit_log_space_tl_in_GB` | 审计日志的 SpaceTL,审计日志总空间达到该阈值后开始轮转删除。 | Double | 1.0| 热加载| -| `audit_log_batch_interval_in_ms` | 审计日志批量写入的时间间隔 | Long | 1000 | 热加载 | -| `audit_log_batch_max_queue_bytes` | 用于批量处理审计日志的队列最大字节数。当队列大小超过此值时,后续的写入操作将被阻塞。 | Long | 268435456 | 热加载 | - -**关于对象鉴权和慢操作的说明:** -* 当 `auditable_dml_event_type` 、`auditable_ddl_event_type`、`auditable_query_event_type`、`auditable_control_event_type` 参数值设置为 `OBJECT_AUTHENTICATION`(对象鉴权)时,则对应的事件类型会被记录审计日志。 -* 当 `auditable_dml_event_type` 、`auditable_ddl_event_type`、`auditable_query_event_type`、`auditable_control_event_type` 参数值设置为 `SLOW_OPERATION`(慢操作),则操作时间大于 `slow_query_threshold` 参数值(默认 3000 ms)的对应事件类型才会被记录审计日志。`slow_query_threshold` 参数值可通过 iotdb-system.properties 文件进行配置。 - -## 3. 
查阅方法 - -支持通过 SQL 直接阅读、获取审计日志相关信息。 - -### 3.1 SQL 语法 - -```SQL -SELECT (, )* log FROM WHERE whereclause ORDER BY order_expression -``` - -* `AUDIT_LOG_PATH` :审计日志存储位置`root.__audit.log..`; -* `audit_log_field`:查询字段请参考下一小节元数据结构。 -* 支持 Where 条件搜索和 Order By 排序。 - -### 3.2 元数据结构 - -| 字段 | 含义 | 类型 | -| ------------------------ | -------------------------------------------------- | ----------- | -| `time` | 事件开始的的日期和时间 | timestamp | -| `username` | 用户名称 | string | -| `cli_hostname` | 用户主机标识 | string | -| `audit_event_type` | 审计事件类型,WRITE\_DATA, GENERATE\_KEY, SLOW\_OPERATION 等 | string | -| `operation_type` | 审计事件的操作类型,DML, DDL, QUERY, CONTROL | string | -| `privilege_type` | 审计事件使用的权限,WRITE\_DATA, MANAGE\_USER 等 | string | -| `privilege_level` | 事件的权限级别,global, object | string | -| `result` | 事件结果,success=1, fail=0 | boolean | -| `database` | 数据库名称 | string | -| `sql_string` | 用户的原始 SQL | string | -| `log` | 具体的事件描述 | string | - -### 3.3 使用示例 - -* 查询成功执行了 QUERY 操作的时间、用户名及主机信息 - -```SQL -IoTDB> select username,cli_hostname from root.__audit.log.** where operation_type='QUERY' and result=true align by device -+-----------------------------+---------------------------+--------+------------+ -| Time| Device|username|cli_hostname| -+-----------------------------+---------------------------+--------+------------+ -|2026-01-23T10:39:21.563+08:00|root.__audit.log.node_1.u_0| root| 127.0.0.1| -|2026-01-23T10:39:33.746+08:00|root.__audit.log.node_1.u_0| root| 127.0.0.1| -|2026-01-23T10:42:15.032+08:00|root.__audit.log.node_1.u_0| root| 127.0.0.1| -+-----------------------------+---------------------------+--------+------------+ -Total line number = 3 -It costs 0.036s -``` - -* 查询最近一次操作的时间、用户名、主机信息、操作类型以及原始 SQL - -```SQL -IoTDB> select username,cli_hostname,operation_type,sql_string from root.__audit.log.** order by time desc limit 1 align by device 
-+-----------------------------+---------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------------------+ -| Time| Device|username|cli_hostname|operation_type| sql_string| -+-----------------------------+---------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------------------+ -|2026-01-23T10:42:32.795+08:00|root.__audit.log.node_1.u_0| root| 127.0.0.1| QUERY|select username,cli_hostname from root.__audit.log.** where operation_type='QUERY' and result=true align by device| -+-----------------------------+---------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------------------+ -Total line number = 1 -It costs 0.033s -``` - -* 查询所有事件结果为false的操作数据库、操作类型及日志信息 - -```SQL -IoTDB> select database,operation_type,log from root.__audit.log.** where result=false align by device -+-----------------------------+-------------------------------+-----------+--------------+---------------------------------------------------------------------------------+ -| Time| Device| database|operation_type| log| -+-----------------------------+-------------------------------+-----------+--------------+---------------------------------------------------------------------------------+ -|2026-01-23T10:49:55.159+08:00|root.__audit.log.node_1.u_10000| | CONTROL| User user1 (ID=10000) login failed with code: 801, Authentication failed.| -|2026-01-23T10:52:04.579+08:00|root.__audit.log.node_1.u_10000| [root.**]| QUERY| User user1 (ID=10000) requests authority on object [root.**] with result false| -|2026-01-23T10:52:43.412+08:00|root.__audit.log.node_1.u_10000|root.userdb| DDL| User user1 (ID=10000) requests authority on object root.userdb with result false| 
-|2026-01-23T10:52:48.075+08:00|root.__audit.log.node_1.u_10000| null| QUERY|User user1 (ID=10000) requests authority on object root.__audit with result false| -+-----------------------------+-------------------------------+-----------+--------------+---------------------------------------------------------------------------------+ -Total line number = 4 -It costs 0.024s -``` - -* 设置 slow_query_threshold = 1 (ms), 查询某个用户在某个数据节点上审计事件类型为慢操作的记录 - -```SQL -IoTDB> select * from root.__audit.log.node_1.u_0 where audit_event_type='SLOW_OPERATION' align by device -+-----------------------------+---------------------------+------+---------------+--------------+--------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------+----------------+------------+--------+ -| Time| Device|result|privilege_level|privilege_type|database|operation_type| log| sql_string|audit_event_type|cli_hostname|username| -+-----------------------------+---------------------------+------+---------------+--------------+--------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------+----------------+------------+--------+ -|2026-05-06T14:43:55.088+08:00|root.__audit.log.node_1.u_0| true| OBJECT| [READ_DATA]| | QUERY| SLOW_QUERY: cost 60 ms, select * from root.__audit.log.node_1.u_0 where audit_event_type='SLOW_OPERATION' or audit_event_type='LOGIN'limit 1 align by device|select * from root.__audit.log.node_1.u_0 where audit_event_type='SLOW_OPERATION' or audit_event_type='LOGIN'limit 1 align 
by device| SLOW_OPERATION| 127.0.0.1| root| -|2026-05-06T14:44:08.715+08:00|root.__audit.log.node_1.u_0| true| OBJECT| [WRITE_DATA]| | DML| Execution: insert into root.ln.wf02.wt02(timestamp, status, hardware) values (2, false, 'v2') cost 290 ms, with status code: TSStatus(code:200, message:)| insert into root.ln.wf02.wt02(timestamp, status, hardware) values (2, false, 'v2')| SLOW_OPERATION| 127.0.0.1| root| -|2026-05-06T14:44:11.684+08:00|root.__audit.log.node_1.u_0| true| OBJECT| [WRITE_DATA]| | DML|Execution: insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4') cost 6 ms, with status code: TSStatus(code:200, message:)| insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4')| SLOW_OPERATION| 127.0.0.1| root| -+-----------------------------+---------------------------+------+---------------+--------------+--------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------+----------------+------------+--------+ -Total line number = 3 -It costs 0.010s -``` \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/User-Manual/Authority-Management-Upgrade_timecho.md b/src/zh/UserGuide/Master/Tree/User-Manual/Authority-Management-Upgrade_timecho.md deleted file mode 100644 index 2733e772c..000000000 --- a/src/zh/UserGuide/Master/Tree/User-Manual/Authority-Management-Upgrade_timecho.md +++ /dev/null @@ -1,468 +0,0 @@ - - -# 权限管理 - -IoTDB 为用户提供了权限管理操作,为用户提供对数据与集群系统的权限管理功能,保障数据与系统安全。本篇介绍IoTDB 中权限模块的基本概念、用户定义、权限管理、鉴权逻辑与功能用例。 - -## 1. 
基本概念 - -### 1.1 用户 - -用户即数据库的合法使用者。一个用户与一个唯一的用户名相对应,并且拥有密码作为身份验证的手段。一个人在使用数据库之前,必须先提供合法的(即存于数据库中的)用户名与密码,作为用户成功登录。 - -### 1.2 权限 - -数据库提供多种操作,但并非所有的用户都能执行所有操作。如果一个用户可以执行某项操作,则称该用户有执行该操作的权限。权限通常需要一个路径来限定其生效范围,可以使用[路径模式](../Basic-Concept/Operate-Metadata_timecho.md)灵活管理权限。 - -### 1.3 角色 - -角色是若干权限的集合,并且有一个唯一的角色名作为标识符。角色通常和一个现实身份相对应(例如交通调度员),而一个现实身份可能对应着多个用户。这些具有相同现实身份的用户往往具有相同的一些权限,角色就是为了能对这样的权限进行统一的管理的抽象。 - -### 1.4 默认用户与角色 - -安装初始化后的 IoTDB 中有一个默认用户:root,默认密码为`TimechoDB@2021`。该用户为管理员用户,固定拥有所有权限,无法被赋予、撤销权限,也无法被删除,数据库内仅有一个管理员用户。 - -一个新创建的用户或角色不具备任何权限。 - -## 2. 用户定义 - -拥有 SECURITY 的用户可以创建用户或者角色,需要满足以下约束: - -### 2.1 用户名限制 - -4~32个字符,支持使用英文大小写字母、数字、特殊字符(`!@#$%^&*()_+-=`)。用户无法创建和管理员用户同名的用户。 - -### 2.2 密码限制 - -12~32个字符,必须包含大写小写字母、至少一个数字、至少一个特殊字符(`!@#$%^&*()_+-=`)且不能与用户名相同。 - -### 2.3 角色名限制 - -4~32个字符,支持使用英文大小写字母、数字、特殊字符(`!@#$%^&*()_+-=`)。用户无法创建和管理员用户同名的角色。 - -## 3. 权限管理 - -IoTDB 树模型主要有两类权限:全局权限、序列权限。 - -### 3.1 全局权限 - -全局权限包含 SYSTEM、SECURITY、AUDIT 三种类型: - -* SYSTEM:只具备运维操作、DDL(Data Definition Language)相关的权限。 -* SECURITY:只具备管理角色(Role)或用户(User)以及为其他账号授予权限的权限。 -* AUDIT :只具备维护审计规则、查看审计日志的权限。 - -各权限详细描述见下表: - -
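-| Privilege | Original Privilege | Description |
-| --- | --- | --- |
-| SYSTEM | MANAGE_DATABASE | Allows users to create and delete databases. |
-| SYSTEM | USE_TRIGGER | Allows users to create, delete, and view triggers. |
-| SYSTEM | USE_UDF | Allows users to create, delete, and view user-defined functions. |
-| SYSTEM | USE_PIPE | Allows users to create, start, stop, delete, and view PIPEs. Allows users to create, delete, and view PIPEPLUGINS. |
-| SYSTEM | USE_CQ | Allows users to register, start, stop, uninstall, and query stream processing tasks. Allows users to register, uninstall, and query stream processing task plugins. |
-| SYSTEM | MAINTAIN | Allows users to view and cancel queries. Allows users to view variables. Allows users to view cluster status. |
-| SYSTEM | USE_MODEL | Allows users to create, delete, and query deep learning models. |
-| SECURITY | MANAGE_USER | Allows users to create, delete, modify, and view users. |
-| SECURITY | MANAGE_ROLE | Allows users to create, delete, and view roles. Allows users to grant roles to other users or revoke roles from them. |
-| AUDIT | | Allows users to maintain audit log rules. Allows users to view audit logs. |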
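
As a rough illustration of how the global privileges listed above might be granted, the statements below follow the `GRANT ... ON root.** TO USER ...` form that this document uses for SECURITY. The user names `ops_user` and `audit_user` are hypothetical, and granting SYSTEM and AUDIT with the same syntax is an assumption extrapolated from the SECURITY example rather than confirmed behavior.

```SQL
-- Sketch only: ops_user and audit_user are hypothetical user names.
-- Global privileges are granted on root.** (see the granting rules in this document).
GRANT SECURITY ON root.** TO USER ops_user WITH GRANT OPTION;
-- Assumption: SYSTEM and AUDIT accept the same form as SECURITY.
GRANT SYSTEM ON root.** TO USER ops_user;
GRANT AUDIT ON root.** TO USER audit_user;
```

Keeping AUDIT on a separate account from SYSTEM and SECURITY, as sketched here, is one way to separate duties; whether that fits depends on the deployment.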
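-
-### 3.2 Series Privileges
-
-Series privileges restrict the scope and manner in which users access data. They can be granted on absolute paths and on prefix-matching paths, and take effect at timeseries granularity.
-
-The following table describes the types and scope of these privileges:
-
-| Privilege | Description |
-| --- | --- |
-| READ\_DATA | Allows reading series data under the authorized path. |
-| WRITE\_DATA | Allows reading series data under the authorized path.<br>Allows inserting and deleting series data under the authorized path.<br>Allows importing and loading data under the authorized path. Importing data requires the WRITE\_DATA privilege on the corresponding path; automatically creating databases and series additionally requires the SYSTEM and WRITE\_SCHEMA privileges. |
-| READ\_SCHEMA | Allows retrieving details of the metadata tree under the authorized path, including the databases, child paths, child nodes, devices, series, templates, and views under that path. |
-| WRITE\_SCHEMA | Allows retrieving details of the metadata tree under the authorized path.<br>Allows creating, deleting, and modifying series, templates, views, and so on under the authorized path.<br>Creating or modifying a view checks the WRITE\_SCHEMA privilege on the view path and the READ\_SCHEMA privilege on the data source.<br>Querying or inserting into a view checks the READ\_DATA and WRITE\_DATA privileges on the view path.<br>Allows setting, unsetting, and viewing TTL under the authorized path.<br>Allows mounting or unmounting templates under the authorized path.<br>Allows renaming a series by its full path under the authorized path (supported since V2.0.8.2). |
-
-### 3.3 Granting and Revoking Privileges
-
-In IoTDB, a user can obtain privileges in three ways:
-
-1. Granted by the super administrator, who controls the privileges of other users.
-2. Granted by a user who is allowed to grant the privilege, that is, a user who received it with the grant option keyword.
-3. Granted a role by the super administrator or by a user with the SECURITY privilege, thereby obtaining that role's privileges.
-
-A user's privileges can be revoked in the following ways:
-
-1. Revoked by the super administrator.
-2. Revoked by a user who is allowed to grant the privilege, that is, a user who received it with the grant option keyword.
-3. The super administrator or a user with the SECURITY privilege revokes one of the user's roles, which removes the privileges that role carried.
-
-* A path must be specified when granting. Global privileges must use root.\*\*, while series privileges must use an absolute path or a prefix path ending with a double wildcard.
-* When granting a privilege to a role, the with grant option keyword can be specified. The holder can then re-grant the privilege on the authorized path and can also revoke other users' privileges on that path. For example, if user A is granted read privilege on `group1.company1.**` with the grant option keyword, A can re-grant read privilege on any node or series under `group1.company1` to others, and can likewise revoke other users' read privilege on any node under `group1.company1`.
-* When revoking, the revoke statement is matched against all of the user's privilege paths, and every matched path is cleared. For example, if user A has read privilege on `group1.company1.factory1`, revoking read privilege on `group1.company1.**` also removes A's read privilege on `group1.company1.factory1`.
-
-## 4. Syntax and Examples
-
-IoTDB provides composite privileges to simplify granting:
-
-| Privilege | Scope |
-| ---------- | ---------------------------- |
-| ALL | All privileges |
-| READ | READ\_SCHEMA, READ\_DATA |
-| WRITE | WRITE\_SCHEMA, WRITE\_DATA |
-
-A composite privilege is not a distinct privilege but a shorthand; using it is equivalent to writing out the corresponding privilege names.
-
-The following use cases show how the privilege statements are used. Non-administrator users must obtain the required privilege before executing them; the required privilege is noted after each operation description.
-
-### 4.1 Users and Roles
-
-* Create a user (requires the SECURITY privilege)
-
-```SQL
-CREATE USER
-eg: CREATE USER user1 'Passwd@202604';
-```
-
-* Delete a user (requires the SECURITY privilege)
-
-```SQL
-DROP USER
-eg: DROP USER user1;
-```
-
-* Create a role (requires the SECURITY privilege)
-
-```SQL
-CREATE ROLE
-eg: CREATE ROLE role1;
-```
-
-* Delete a role (requires the SECURITY privilege)
-
-```SQL
-DROP ROLE
-eg: DROP ROLE role1;
-```
-
-* Grant a role to a user (requires the SECURITY privilege)
-
-```SQL
-GRANT ROLE TO
-eg: GRANT ROLE admin TO user1;
-```
-
-* Revoke a role from a user (requires the SECURITY privilege)
-
-```SQL
-REVOKE ROLE FROM
-eg: REVOKE ROLE admin FROM user1;
-```
-
-* List all users (requires the SECURITY privilege)
-
-```SQL
-LIST USER;
-```
-
-* List all roles (requires the SECURITY privilege)
-
-```SQL
-LIST ROLE;
-```
-
-* List all users that have a given role (requires the SECURITY privilege)
-
-```SQL
-LIST USER OF ROLE
-eg: LIST USER OF ROLE roleuser;
-```
-
-* List all roles of a given user
-
-Users can list their own roles; listing another user's roles requires the SECURITY privilege.
-
-```SQL
-LIST ROLE OF USER
-eg: LIST ROLE OF USER tempuser;
-```
-
-* List all privileges of a user
-
-Users can list their own privileges; listing another user's privileges requires the SECURITY privilege.
-
-```SQL
-LIST PRIVILEGES OF 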
USER ; -eg: LIST PRIVILEGES OF USER tempuser; -``` - -* 列出角色所有权限 - -用户可以列出自己具有的角色的权限信息,列出其他角色的权限需要有 SECURITY 权限。 - -```SQL -LIST PRIVILEGES OF ROLE ; -eg: LIST PRIVILEGES OF ROLE actor; -``` - -* 修改密码 - -用户可以修改自己的密码,但修改其他用户密码需要具备 SECURITY 权限。 - -```SQL -ALTER USER SET PASSWORD ; -eg: ALTER USER tempuser SET PASSWORD 'Newpwd@202604'; -``` - -### 4.2 授权与取消授权 - -用户使用授权语句对赋予其他用户权限,语法如下: - -```SQL -GRANT ON TO ROLE/USER [WITH GRANT OPTION]; -eg: GRANT READ ON root.** TO ROLE role1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.** TO USER user1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.**,root.t2.** TO USER user1; -eg: GRANT SECURITY ON root.** TO USER user1 WITH GRANT OPTION; -eg: GRANT ALL ON root.** TO USER user1 WITH GRANT OPTION; -``` - -用户使用取消授权语句可以将其他的权限取消,语法如下: - -```SQL -REVOKE ON FROM ROLE/USER ; -eg: REVOKE READ ON root.** FROM ROLE role1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.** FROM USER user1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.**, root.t2.** FROM USER user1; -eg: REVOKE SECURITY ON root.** FROM USER user1; -eg: REVOKE ALL ON root.** FROM USER user1; -``` - -* **非管理员用户执行授权/取消授权语句时,需要对`` 有`` 权限,并且该权限是被标记带有 WITH GRANT OPTION 的。** -* 在授予取消全局权限时,或者语句中包含全局权限时(ALL 展开会包含全局权限),须指定 path 为 root.\*\*。 例如,以下授权/取消授权语句是合法的: - - ```SQL - GRANT SECURITY ON root.** TO USER user1; - GRANT SECURITY ON root.** TO ROLE role1 WITH GRANT OPTION; - GRANT ALL ON root.** TO role role1 WITH GRANT OPTION; - REVOKE SECURITY ON root.** FROM USER user1; - REVOKE SECURITY ON root.** FROM ROLE role1; - REVOKE ALL ON root.** FROM ROLE role1; - ``` - - 下面的语句是非法的: - - ```SQL - GRANT READ, SECURITY ON root.t1.** TO USER user1; - GRANT ALL ON root.t1.t2 TO USER user1 WITH GRANT OPTION; - REVOKE ALL ON root.t1.t2 FROM USER user1; - REVOKE READ, SECURITY ON root.t1.t2 FROM ROLE ROLE1; - ``` -* `` 必须为全路径或者以双通配符结尾的匹配路径,以下路径是合法的: - - ```SQL - root.** - root.t1.t2.** - root.t1.t2.t3 - ``` - - 以下的路径是非法的: - - ```SQL - root.t1.* - root.t1.**.t2 - root.t1*.t2.t3 - ``` - -## 5. 
场景示例 - -根据本文中描述的 [样例数据](https://github.com/thulab/iotdb/files/4438687/OtherMaterial-Sample.Data.txt) 内容,IoTDB 的样例数据可能同时属于 ln, sgcc 等不同发电集团,不同的发电集团不希望其他发电集团获取自己的数据库数据,因此我们需要将不同的数据在集团层进行权限隔离。 - -### 5.1 创建用户 - -使用 `CREATE USER ` 创建用户。例如,我们可以使用具有所有权限的root用户为 ln 和 sgcc 集团创建两个用户角色,名为 ln\_write\_user, sgcc\_write\_user,密码均为 write_Pwd@2026。建议使用反引号(\`)包裹用户名。SQL 语句为: - -```SQL -CREATE USER `ln_write_user` 'write_Pwd@2026'; -CREATE USER `sgcc_write_user` 'write_Pwd@2026'; -``` - -此时使用展示用户的 SQL 语句: - -```SQL -LIST USER; -``` - -我们可以看到这两个已经被创建的用户,结果如下: - -```SQL -IoTDB> CREATE USER `ln_write_user` 'write_Pwd@2026'; -Msg: The statement is executed successfully. -IoTDB> CREATE USER `sgcc_write_user` 'write_Pwd@2026'; -Msg: The statement is executed successfully. -IoTDB> LIST USER; -+------+---------------+-----------------+-----------------+ -|UserId| User|MaxSessionPerUser|MinSessionPerUser| -+------+---------------+-----------------+-----------------+ -| 0| root| -1| 1| -| 10000| ln_write_user| -1| -1| -| 10001|sgcc_write_user| -1| -1| -+------+---------------+-----------------+-----------------+ -Total line number = 3 -It costs 0.005s -``` - -### 5.2 赋予用户权限 - -此时,虽然两个用户已经创建,但是他们不具有任何权限,因此他们并不能对数据库进行操作,例如我们使用 ln\_write\_user 用户对数据库中的数据进行写入,SQL 语句为: - -```SQL -INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true); -``` - -此时,系统不允许用户进行此操作,会提示错误: - -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true); -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -现在,我们用 root 用户分别赋予他们向对应路径的写入权限. - -我们使用 `GRANT ON TO USER ` 语句赋予用户权限,例如: - -```SQL -GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user`; -GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user`; -``` - -执行状态如下所示: - -```SQL -IoTDB> GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user`; -Msg: The statement is executed successfully. 
-IoTDB> GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user`; -Msg: The statement is executed successfully. -``` - -接着使用ln\_write\_user再尝试写入数据 - -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true); -Msg: The statement is executed successfully. -``` - -### 5.3 撤销用户权限 - -授予用户权限后,我们可以使用 `REVOKE ON FROM USER `来撤销已经授予用户的权限。例如,用root用户撤销ln\_write\_user和sgcc\_write\_user的权限: - -```SQL -REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user`; -REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user`; -``` - -执行状态如下所示: - -```SQL -IoTDB> REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user`; -Msg: The statement is executed successfully. -IoTDB> REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user`; -Msg: The statement is executed successfully. -``` - -撤销权限后,ln\_write\_user就没有向root.ln.\*\*写入数据的权限了。 - -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true); -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -## 6. 鉴权及其他 - -### 6.1 鉴权 - -用户权限主要由三部分组成:权限生效范围(路径), 权限类型, with grant option 标记: - -```Plain -userTest1 : - root.t1.** - read_schema, read_data - with grant option - root.** - write_schema, write_data - with grant option -``` - -每个用户都有一个这样的权限访问列表,标识他们获得的所有权限,可以通过 `LIST PRIVILEGES OF USER ` 查看他们的权限。 - -在对一个路径进行鉴权时,数据库会进行路径与权限的匹配。例如检查 `root.t1.t2` 的 read\_schema 权限时,首先会与权限访问列表的 `root.t1.**`进行匹配,匹配成功,则检查该路径是否包含待鉴权的权限,否则继续下一条路径-权限的匹配,直到匹配成功或者匹配结束。 - -在进行多路径鉴权时,对于多路径查询任务,数据库只会将有权限的数据呈现出来,无权限的数据不会包含在结果中;对于多路径写入任务,数据库要求必须所有的目标序列都获得了对应的权限,才能进行写入。 - -请注意,下面的操作需要检查多重权限 - -1. 开启了自动创建序列功能,在用户将数据插入到不存在的序列中时,不仅需要对应序列的写入权限,还需要序列的元数据修改权限。 -2. 执行 select into 语句时,需要检查源序列的读权限与目标序列的写权限。需要注意的是源序列数据可能因为权限不足而仅能获取部分数据,目标序列写入权限不足时会报错终止任务。 -3. 
View 权限与数据源的权限是独立的,向 view 执行读写操作仅会检查 view 的权限,而不再对源路径进行权限校验。 - -### 6.2 其他说明 - -角色是权限的集合,而权限和角色都是用户的一种属性。即一个角色可以拥有若干权限。一个用户可以拥有若干角色与权限(称为用户自身权限)。 - -目前在 IoTDB 中并不存在相互冲突的权限,因此一个用户真正具有的权限是用户自身权限与其所有的角色的权限的并集。即要判定用户是否能执行某一项操作,就要看用户自身权限或用户的角色的所有权限中是否有一条允许了该操作。用户自身权限与其角色权限,他的多个角色的权限之间可能存在相同的权限,但这并不会产生影响。 - -需要注意的是:如果一个用户自身有某种权限(对应操作 A),而他的某个角色有相同的权限。那么如果仅从该用户撤销该权限无法达到禁止该用户执行操作 A 的目的,还需要从这个角色中也撤销对应的权限,或者从这个用户将该角色撤销。同样,如果仅从上述角色将权限撤销,也不能禁止该用户执行操作 A。 - -同时,对角色的修改会立即反映到所有拥有该角色的用户上,例如对角色增加某种权限将立即使所有拥有该角色的用户都拥有对应权限,删除某种权限也将使对应用户失去该权限(除非用户本身有该权限)。 diff --git a/src/zh/UserGuide/Master/Tree/User-Manual/Authority-Management_timecho.md b/src/zh/UserGuide/Master/Tree/User-Manual/Authority-Management_timecho.md deleted file mode 100644 index 2e4c5c21e..000000000 --- a/src/zh/UserGuide/Master/Tree/User-Manual/Authority-Management_timecho.md +++ /dev/null @@ -1,510 +0,0 @@ - - -# 权限管理 - -IoTDB 为用户提供了权限管理操作,为用户提供对数据与集群系统的权限管理功能,保障数据与系统安全。 -本篇介绍IoTDB 中权限模块的基本概念、用户定义、权限管理、鉴权逻辑与功能用例。在 JAVA 编程环境中,您可以使用 [JDBC API](../API/Programming-JDBC_timecho) 单条或批量执行权限管理类语句。 - -## 1. 基本概念 - -### 1.1 用户 - -用户即数据库的合法使用者。一个用户与一个唯一的用户名相对应,并且拥有密码作为身份验证的手段。一个人在使用数据库之前,必须先提供合法的(即存于数据库中的)用户名与密码,作为用户成功登录。 - -### 1.2 权限 - -数据库提供多种操作,但并非所有的用户都能执行所有操作。如果一个用户可以执行某项操作,则称该用户有执行该操作的权限。权限通常需要一个路径来限定其生效范围,可以使用[路径模式](../Basic-Concept/Operate-Metadata.md)灵活管理权限。 - -### 1.3 角色 - -角色是若干权限的集合,并且有一个唯一的角色名作为标识符。角色通常和一个现实身份相对应(例如交通调度员),而一个现实身份可能对应着多个用户。这些具有相同现实身份的用户往往具有相同的一些权限,角色就是为了能对这样的权限进行统一的管理的抽象。 - -### 1.4 默认用户与角色 - -安装初始化后的 IoTDB 中有一个默认用户:root,默认密码为`TimechoDB@2021`(V2.0.6.x 版本之前为`root`)。该用户为管理员用户,固定拥有所有权限,无法被赋予、撤销权限,也无法被删除,数据库内仅有一个管理员用户。 - -一个新创建的用户或角色不具备任何权限。 - -## 2. 用户定义 - -拥有 MANAGE_USER、MANAGE_ROLE 的用户或者管理员可以创建用户或者角色,需要满足以下约束: - -### 2.1 用户名限制 - -4~32个字符,支持使用英文大小写字母、数字、特殊字符(`!@#$%^&*()_+-=`) - -用户无法创建和管理员用户同名的用户。 - -### 2.2 密码限制 - -4~32个字符,可使用大写小写字母、数字、特殊字符(`!@#$%^&*()_+-=`),密码默认采用 SHA-256 进行加密。 - -### 2.3 角色名限制 - -4~32个字符,支持使用英文大小写字母、数字、特殊字符(`!@#$%^&*()_+-=`) - -用户无法创建和管理员用户同名的角色。 - -## 3. 
权限管理 - -IoTDB 主要有两类权限:序列权限、全局权限。 - -### 3.1 序列权限 - -序列权限约束了用户访问数据的范围与方式,支持对绝对路径与前缀匹配路径授权,可对timeseries 粒度生效。 - -下表描述了这类权限的种类与范围: - -| 权限名称 | 描述 | -|--------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| READ_DATA | 允许读取授权路径下的序列数据。 | -| WRITE_DATA | 允许读取授权路径下的序列数据。
允许插入、删除授权路径下的序列数据。
允许在授权路径下导入、加载数据,在导入数据时,需要拥有对应路径的 WRITE_DATA 权限,在自动创建数据库与序列时,需要有 MANAGE_DATABASE 与 WRITE_SCHEMA 权限。 | -| READ_SCHEMA | 允许获取授权路径下元数据树的详细信息:
包括:路径下的数据库、子路径、子节点、设备、序列、模版、视图等。 | -| WRITE_SCHEMA | 允许获取授权路径下元数据树的详细信息。
允许在授权路径下对序列、模版、视图等进行创建、删除、修改操作。
在创建或修改 view 的时候,会检查 view 路径的 WRITE_SCHEMA 权限、数据源的 READ_SCHEMA 权限。
在对 view 进行查询、插入时,会检查 view 路径的 READ_DATA 权限、WRITE_DATA 权限。
允许在授权路径下设置、取消、查看TTL。
允许在授权路径下挂载或者解除挂载模板。
允许在授权路径下对序列进行全路径名称的修改操作。//V2.0.8.2 起支持该功能 | - -### 3.2 全局权限 - -全局权限约束了用户使用的数据库功能、限制了用户执行改变系统状态与任务状态的命令,用户获得全局授权后,可对数据库进行管理。 - -下表描述了系统权限的种类: - -| 权限名称 | 描述 | -|:---------------:|:------------------------------------------------------------------| -| MANAGE_DATABASE | - 允许用户创建、删除数据库. | -| MANAGE_USER | - 允许用户创建、删除、修改、查看用户。 | -| MANAGE_ROLE | - 允许用户创建、删除、查看角色。
允许用户将角色授予给其他用户,或取消其他用户的角色。 | -| USE_TRIGGER | - 允许用户创建、删除、查看触发器。
与触发器的数据源权限检查相独立。 | -| USE_UDF | - 允许用户创建、删除、查看用户自定义函数。
与自定义函数的数据源权限检查相独立。 | -| USE_CQ | - 允许用户创建、开始、停止、删除、查看管道。
允许用户创建、删除、查看管道插件。
与管道的数据源权限检查相独立。 | -| USE_PIPE | - 允许用户注册、开始、停止、卸载、查询流处理任务。
- 允许用户注册、卸载、查询已注册的流处理任务插件。 | -| EXTEND_TEMPLATE | - 允许自动扩展模板。 | -| MAINTAIN | - 允许用户查询、取消查询。
允许用户查看变量。
允许用户查看集群状态。 | -| USE_MODEL | - 允许用户创建、删除、查询深度学习模型 | - -关于模板权限: - -1. 模板的创建、删除、修改、查询、挂载、卸载仅允许管理员操作。 -2. 激活模板需要拥有激活路径的 WRITE_SCHEMA 权限 -3. 若开启了自动创建,在向挂载了模板的不存在路径写入时,数据库会自动扩展模板并写入数据,因此需要有 EXTEND_TEMPLATE 权限与写入序列的 WRITE_DATA 权限。 -4. 解除模板,需要拥有挂载模板路径的 WRITE_SCHEMA 权限。 -5. 查询使用了某个元数据模板的路径,需要有路径的 READ_SCHEMA 权限,否则将返回为空。 - -### 3.3 权限授予与取消 - -在 IoTDB 中,用户可以由三种途径获得权限: - -1. 由超级管理员授予,超级管理员可以控制其他用户的权限。 -2. 由允许权限授权的用户授予,该用户获得权限时被指定了 grant option 关键字。 -3. 由超级管理员或者有 MANAGE_ROLE 的用户授予某个角色进而获取权限。 - -取消用户的权限,可以由以下几种途径: - -1. 由超级管理员取消用户的权限。 -2. 由允许权限授权的用户取消权限,该用户获得权限时被指定了 grant option 关键字。 -3. 由超级管理员或者MANAGE_ROLE 的用户取消用户的某个角色进而取消权限。 - -- 在授权时,必须指定路径。全局权限需要指定为 root.**, 而序列相关权限必须为绝对路径或者以双通配符结尾的前缀路径。 -- 当授予角色权限时,可以为该权限指定 with grant option 关键字,意味着用户可以转授其授权路径上的权限,也可以取消其他用户的授权路径上的权限。例如用户 A 在被授予`集团1.公司1.**`的读权限时制定了 grant option 关键字,那么 A 可以将`集团1.公司1`以下的任意节点、序列的读权限转授给他人, 同样也可以取消其他用户 `集团1.公司1` 下任意节点的读权限。 -- 在取消授权时,取消授权语句会与用户所有的权限路径进行匹配,将匹配到的权限路径进行清理,例如用户A 具有 `集团1.公司1.工厂1 `的读权限, 在取消 `集团1.公司1.** `的读权限时,会清除用户A 的 `集团1.公司1.工厂1` 的读权限。 - - - -## 4. 鉴权 - -用户权限主要由三部分组成:权限生效范围(路径), 权限类型, with grant option 标记: - -``` -userTest1 : - root.t1.** - read_schema, read_data - with grant option - root.** - write_schema, write_data - with grant option -``` - -每个用户都有一个这样的权限访问列表,标识他们获得的所有权限,可以通过 `LIST PRIVILEGES OF USER ` 查看他们的权限。 - -在对一个路径进行鉴权时,数据库会进行路径与权限的匹配。例如检查 `root.t1.t2` 的 read_schema 权限时,首先会与权限访问列表的 `root.t1.**`进行匹配,匹配成功,则检查该路径是否包含待鉴权的权限,否则继续下一条路径-权限的匹配,直到匹配成功或者匹配结束。 - -在进行多路径鉴权时,对于多路径查询任务,数据库只会将有权限的数据呈现出来,无权限的数据不会包含在结果中;对于多路径写入任务,数据库要求必须所有的目标序列都获得了对应的权限,才能进行写入。 - -请注意,下面的操作需要检查多重权限 -1. 开启了自动创建序列功能,在用户将数据插入到不存在的序列中时,不仅需要对应序列的写入权限,还需要序列的元数据修改权限。 -2. 执行 select into 语句时,需要检查源序列的读权限与目标序列的写权限。需要注意的是源序列数据可能因为权限不足而仅能获取部分数据,目标序列写入权限不足时会报错终止任务。 -3. View 权限与数据源的权限是独立的,向 view 执行读写操作仅会检查 view 的权限,而不再对源路径进行权限校验。 - - - -## 5. 
功能语法与示例 - -IoTDB 提供了组合权限,方便用户授权: - -| 权限名称 | 权限范围 | -|-------|-------------------------| -| ALL | 所有权限 | -| READ | READ_SCHEMA、READ_DATA | -| WRITE | WRITE_SCHEMA、WRITE_DATA | - -组合权限并不是一种具体的权限,而是一种简写方式,与直接书写对应的权限名称没有差异。 - -下面将通过一系列具体的用例展示权限语句的用法,非管理员执行下列语句需要提前获取权限,所需的权限标记在操作描述后。 - -### 5.1 用户与角色相关 - -- 创建用户(需 MANAGE_USER 权限) - - -```SQL -CREATE USER -eg: CREATE USER user1 'passwd' -``` - -- 删除用户 (需 MANAGE_USER 权限) - - -```SQL -DROP USER -eg: DROP USER user1 -``` - -- 创建角色 (需 MANAGE_ROLE 权限) - -```SQL -CREATE ROLE -eg: CREATE ROLE role1 -``` - -- 删除角色 (需 MANAGE_ROLE 权限) - - -```SQL -DROP ROLE -eg: DROP ROLE role1 -``` - -- 赋予用户角色 (需 MANAGE_ROLE 权限) - - -```SQL -GRANT ROLE TO -eg: GRANT ROLE admin TO user1 -``` - -- 移除用户角色 (需 MANAGE_ROLE 权限) - - -```SQL -REVOKE ROLE FROM -eg: REVOKE ROLE admin FROM user1 -``` - -- 列出所有用户 (需 MANAGE_USER 权限) - -```SQL -LIST USER -``` - -- 列出所有角色 (需 MANAGE_ROLE 权限) - -```SQL -LIST ROLE -``` - -- 列出指定角色下所有用户 (需 MANAGE_USER 权限) - -```SQL -LIST USER OF ROLE -eg: LIST USER OF ROLE roleuser -``` - -- 列出指定用户下所有角色 - -用户可以列出自己的角色,但列出其他用户的角色需要拥有 MANAGE_ROLE 权限。 - -```SQL -LIST ROLE OF USER -eg: LIST ROLE OF USER tempuser -``` - -- 列出用户所有权限 - -用户可以列出自己的权限信息,但列出其他用户的权限需要拥有 MANAGE_USER 权限。 - -```SQL -LIST PRIVILEGES OF USER ; -eg: LIST PRIVILEGES OF USER tempuser; - -``` - -- 列出角色所有权限 - -用户可以列出自己具有的角色的权限信息,列出其他角色的权限需要有 MANAGE_ROLE 权限。 - -```SQL -LIST PRIVILEGES OF ROLE ; -eg: LIST PRIVILEGES OF ROLE actor; -``` - -- 修改密码 - -用户可以修改自己的密码,但修改其他用户密码需要具备 MANAGE_USER 权限。 - -```SQL -ALTER USER SET PASSWORD ; -eg: ALTER USER tempuser SET PASSWORD 'newpwd'; -``` - -### 5.2 授权与取消授权 - -用户使用授权语句赋予其他用户权限,语法如下: - -```SQL -GRANT ON TO ROLE/USER [WITH GRANT OPTION]; -eg: GRANT READ ON root.** TO ROLE role1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.** TO USER user1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.**,root.t2.** TO USER user1; -eg: GRANT MANAGE_ROLE ON root.** TO USER user1 WITH GRANT OPTION; -eg: GRANT ALL ON root.** TO USER user1 WITH GRANT OPTION;
-``` - -用户使用取消授权语句可以将其他的权限取消,语法如下: - -```SQL -REVOKE ON FROM ROLE/USER ; -eg: REVOKE READ ON root.** FROM ROLE role1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.** FROM USER user1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.**, root.t2.** FROM USER user1; -eg: REVOKE MANAGE_ROLE ON root.** FROM USER user1; -eg: REVOKE ALL ON root.** FROM USER user1; -``` - -- **非管理员用户执行授权/取消授权语句时,需要对\ 有\ 权限,并且该权限是被标记带有 WITH GRANT OPTION 的。** - -- 在授予取消全局权限时,或者语句中包含全局权限时(ALL 展开会包含全局权限),须指定 path 为 root.**。 例如,以下授权/取消授权语句是合法的: - - ```SQL - GRANT MANAGE_USER ON root.** TO USER user1; - GRANT MANAGE_ROLE ON root.** TO ROLE role1 WITH GRANT OPTION; - GRANT ALL ON root.** TO role role1 WITH GRANT OPTION; - REVOKE MANAGE_USER ON root.** FROM USER user1; - REVOKE MANAGE_ROLE ON root.** FROM ROLE role1; - REVOKE ALL ON root.** FROM ROLE role1; - ``` - 下面的语句是非法的: - - ```SQL - GRANT READ, MANAGE_ROLE ON root.t1.** TO USER user1; - GRANT ALL ON root.t1.t2 TO USER user1 WITH GRANT OPTION; - REVOKE ALL ON root.t1.t2 FROM USER user1; - REVOKE READ, MANAGE_ROLE ON root.t1.t2 FROM ROLE ROLE1; - ``` - -- \ 必须为全路径或者以双通配符结尾的匹配路径,以下路径是合法的: - - ```SQL - root.** - root.t1.t2.** - root.t1.t2.t3 - ``` - - 以下的路径是非法的: - - ```SQL - root.t1.* - root.t1.**.t2 - root.t1*.t2.t3 - ``` - -## 6. 示例 - -根据本文中描述的 [样例数据](https://github.com/thulab/iotdb/files/4438687/OtherMaterial-Sample.Data.txt) 内容,IoTDB 的样例数据可能同时属于 ln, sgcc 等不同发电集团,不同的发电集团不希望其他发电集团获取自己的数据库数据,因此我们需要将不同的数据在集团层进行权限隔离。 - -### 6.1 创建用户 - -使用 `CREATE USER ` 创建用户。例如,我们可以使用具有所有权限的root用户为 ln 和 sgcc 集团创建两个用户角色,名为 ln_write_user, sgcc_write_user,密码均为 write_pwd。建议使用反引号(`)包裹用户名。SQL 语句为: - -```SQL -CREATE USER `ln_write_user` 'write_pwd' -CREATE USER `sgcc_write_user` 'write_pwd' -``` -此时使用展示用户的 SQL 语句: - -```SQL -LIST USER -``` - -我们可以看到这两个已经被创建的用户,结果如下: - -```SQL -IoTDB> CREATE USER `ln_write_user` 'write_pwd' -Msg: The statement is executed successfully. -IoTDB> CREATE USER `sgcc_write_user` 'write_pwd' -Msg: The statement is executed successfully. 
-IoTDB> LIST USER; -+---------------+ -| user| -+---------------+ -| ln_write_user| -| root| -|sgcc_write_user| -+---------------+ -Total line number = 3 -It costs 0.012s -``` - -### 6.2 赋予用户权限 - -此时,虽然两个用户已经创建,但是他们不具有任何权限,因此他们并不能对数据库进行操作,例如我们使用 ln_write_user 用户对数据库中的数据进行写入,SQL 语句为: - -```SQL -INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true) -``` - -此时,系统不允许用户进行此操作,会提示错误: - -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true) -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -现在,我们用 root 用户分别赋予他们向对应路径的写入权限. - -我们使用 `GRANT ON TO USER ` 语句赋予用户权限,例如: -```SQL -GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user` -GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user` -``` - -执行状态如下所示: - -```SQL -IoTDB> GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user` -Msg: The statement is executed successfully. -IoTDB> GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user` -Msg: The statement is executed successfully. -``` - -接着使用ln_write_user再尝试写入数据 - -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true) -Msg: The statement is executed successfully. -``` - -### 6.3 撤销用户权限 -授予用户权限后,我们可以使用 `REVOKE ON FROM USER `来撤销已经授予用户的权限。例如,用root用户撤销ln_write_user和sgcc_write_user的权限: - -``` SQL -REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user` -REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user` -``` - -执行状态如下所示: -``` SQL -IoTDB> REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user` -Msg: The statement is executed successfully. -IoTDB> REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user` -Msg: The statement is executed successfully. 
-``` - -撤销权限后,ln_write_user就没有向root.ln.**写入数据的权限了。 - -``` SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true) -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -## 7. 其他说明 - -角色是权限的集合,而权限和角色都是用户的一种属性。即一个角色可以拥有若干权限。一个用户可以拥有若干角色与权限(称为用户自身权限)。 - -目前在 IoTDB 中并不存在相互冲突的权限,因此一个用户真正具有的权限是用户自身权限与其所有的角色的权限的并集。即要判定用户是否能执行某一项操作,就要看用户自身权限或用户的角色的所有权限中是否有一条允许了该操作。用户自身权限与其角色权限,他的多个角色的权限之间可能存在相同的权限,但这并不会产生影响。 - -需要注意的是:如果一个用户自身有某种权限(对应操作 A),而他的某个角色有相同的权限。那么如果仅从该用户撤销该权限无法达到禁止该用户执行操作 A 的目的,还需要从这个角色中也撤销对应的权限,或者从这个用户将该角色撤销。同样,如果仅从上述角色将权限撤销,也不能禁止该用户执行操作 A。 - -同时,对角色的修改会立即反映到所有拥有该角色的用户上,例如对角色增加某种权限将立即使所有拥有该角色的用户都拥有对应权限,删除某种权限也将使对应用户失去该权限(除非用户本身有该权限)。 - -## 8. 升级说明 - -在 1.3 版本前,权限类型较多,在这一版实现中,权限类型做了精简,并且添加了对权限路径的约束。 - -数据库 1.3 版本的权限路径必须为全路径或者以双通配符结尾的匹配路径,在系统升级时,会自动转换不合法的权限路径和权限类型。 -路径上首个非法节点会被替换为`**`, 不在支持的权限类型也会映射到当前系统支持的权限上。 - -例如: - -| 权限类型 | 权限路径 | 映射之后的权限类型 | 权限路径 | -| ----------------- | --------------- |-----------------| ------------- | -| CREATE_DATBASE | root.db.t1.* | MANAGE_DATABASE | root.** | -| INSERT_TIMESERIES | root.db.t2.*.t3 | WRITE_DATA | root.db.t2.** | -| CREATE_TIMESERIES | root.db.t2*c.t3 | WRITE_SCHEMA | root.db.** | -| LIST_ROLE | root.** | (忽略) | | - - -新旧版本的权限类型对照可以参照下面的表格(--IGNORE 表示新版本忽略该权限): - -| 权限名称 | 是否路径相关 | 新权限名称 | 是否路径相关 | -|---------------------------|--------|-----------------|--------| -| CREATE_DATABASE | 是 | MANAGE_DATABASE | 否 | -| INSERT_TIMESERIES | 是 | WRITE_DATA | 是 | -| UPDATE_TIMESERIES | 是 | WRITE_DATA | 是 | -| READ_TIMESERIES | 是 | READ_DATA | 是 | -| CREATE_TIMESERIES | 是 | WRITE_SCHEMA | 是 | -| DELETE_TIMESERIES | 是 | WRITE_SCHEMA | 是 | -| CREATE_USER | 否 | MANAGE_USER | 否 | -| DELETE_USER | 否 | MANAGE_USER | 否 | -| MODIFY_PASSWORD | 否 | -- IGNORE | | -| LIST_USER | 否 | -- IGNORE | | -| GRANT_USER_PRIVILEGE | 否 | -- IGNORE | | -| REVOKE_USER_PRIVILEGE | 否 | -- IGNORE | | -| GRANT_USER_ROLE | 否 | MANAGE_ROLE | 否 | -| REVOKE_USER_ROLE | 否 
| MANAGE_ROLE | 否 | -| CREATE_ROLE | 否 | MANAGE_ROLE | 否 | -| DELETE_ROLE | 否 | MANAGE_ROLE | 否 | -| LIST_ROLE | 否 | -- IGNORE | | -| GRANT_ROLE_PRIVILEGE | 否 | -- IGNORE | | -| REVOKE_ROLE_PRIVILEGE | 否 | -- IGNORE | | -| CREATE_FUNCTION | 否 | USE_UDF | 否 | -| DROP_FUNCTION | 否 | USE_UDF | 否 | -| CREATE_TRIGGER | 是 | USE_TRIGGER | 否 | -| DROP_TRIGGER | 是 | USE_TRIGGER | 否 | -| START_TRIGGER | 是 | USE_TRIGGER | 否 | -| STOP_TRIGGER | 是 | USE_TRIGGER | 否 | -| CREATE_CONTINUOUS_QUERY | 否 | USE_CQ | 否 | -| DROP_CONTINUOUS_QUERY | 否 | USE_CQ | 否 | -| ALL | 否 | All privilegs | | -| DELETE_DATABASE | 是 | MANAGE_DATABASE | 否 | -| ALTER_TIMESERIES | 是 | WRITE_SCHEMA | 是 | -| UPDATE_TEMPLATE | 否 | -- IGNORE | | -| READ_TEMPLATE | 否 | -- IGNORE | | -| APPLY_TEMPLATE | 是 | WRITE_SCHEMA | 是 | -| READ_TEMPLATE_APPLICATION | 否 | -- IGNORE | | -| SHOW_CONTINUOUS_QUERIES | 否 | -- IGNORE | | -| CREATE_PIPEPLUGIN | 否 | USE_PIPE | 否 | -| DROP_PIPEPLUGINS | 否 | USE_PIPE | 否 | -| SHOW_PIPEPLUGINS | 否 | -- IGNORE | | -| CREATE_PIPE | 否 | USE_PIPE | 否 | -| START_PIPE | 否 | USE_PIPE | 否 | -| STOP_PIPE | 否 | USE_PIPE | 否 | -| DROP_PIPE | 否 | USE_PIPE | 否 | -| SHOW_PIPES | 否 | -- IGNORE | | -| CREATE_VIEW | 是 | WRITE_SCHEMA | 是 | -| ALTER_VIEW | 是 | WRITE_SCHEMA | 是 | -| RENAME_VIEW | 是 | WRITE_SCHEMA | 是 | -| DELETE_VIEW | 是 | WRITE_SCHEMA | 是 | diff --git a/src/zh/UserGuide/Master/Tree/User-Manual/Auto-Start-On-Boot_timecho.md b/src/zh/UserGuide/Master/Tree/User-Manual/Auto-Start-On-Boot_timecho.md deleted file mode 100644 index 06a0ddba6..000000000 --- a/src/zh/UserGuide/Master/Tree/User-Manual/Auto-Start-On-Boot_timecho.md +++ /dev/null @@ -1,243 +0,0 @@ - - -# 开机自启 - -## 1.概述 - -TimechoDB 支持通过 `daemon-confignode.sh`、`daemon-datanode.sh`、`daemon-ainode.sh` 三个脚本,将ConfigNode、DataNode、AINode 注册为 Linux 系统服务,结合系统自带的 `systemctl `命令,以守护进程方式管理 TimechoDB 集群,实现更便捷的启动、停止、重启及开机自启等操作,提升服务稳定性。 - -> 注意:该功能从 V 2.0.9.1 版本开始提供。 - -## 2. 
环境要求 - -| 操作系统 | Linux(支持`systemctl`命令) | -| ---------- |:-----------------------------------------------------:| -| 用户权限 | root 用户 | -| 环境变量 | 部署 ConfigNode 和 DataNode 前需设置`JAVA_HOME` | - -## 3. 服务注册 - -进入 TimechoDB 安装目录,执行对应的守护进程脚本: - -```Bash -# 注册 ConfigNode 服务 -./tools/ops/daemon-confignode.sh - -# 注册 DataNode 服务 -./tools/ops/daemon-datanode.sh - -# 注册 AINode 服务 -./tools/ops/daemon-ainode.sh -``` - -执行脚本时将提示以下两个选择项: - -1. 是否本次直接启动对应 TimechoDB 服务(timechodb-confignode/timechodb-datanode/timechodb-ainode); -2. 是否将对应服务注册为开机自启服务。 - -脚本执行完成后,将在 `/etc/systemd/system/` 目录生成对应的服务文件: - -* `timechodb-confignode.service` -* `timechodb-datanode.service` -* `timechodb-ainode.service` - -## 4. 服务管理 - -服务注册完成后,可通过 systemctl 命令对 TimechoDB 各节点服务进行启动、停止、重启、查看状态及配置开机自启等操作,以下命令均需使用 root 用户执行。 - -### 4.1 手动启动服务 - -```bash -# 启动 ConfigNode 服务 -systemctl start timechodb-confignode -# 启动 DataNode 服务 -systemctl start timechodb-datanode -# 启动 AINode 服务 -systemctl start timechodb-ainode -``` - -### 4.2 手动停止服务 - -```bash -# 停止 ConfigNode 服务 -systemctl stop timechodb-confignode -# 停止 DataNode 服务 -systemctl stop timechodb-datanode -# 停止 AINode 服务 -systemctl stop timechodb-ainode -``` - -停止服务后,通过查看服务状态,若显示为 inactive(dead),则说明服务关闭成功;若为其他状态,需查看 TimechoDB 日志,分析异常原因。 - -### 4.3 查看服务状态 - -```bash -# 查看 ConfigNode 服务状态 -systemctl status timechodb-confignode -# 查看 DataNode 服务状态 -systemctl status timechodb-datanode -# 查看 AINode 服务状态 -systemctl status timechodb-ainode -``` - -状态说明: - -* active(running):服务正在运行,若该状态持续 10 分钟,说明服务启动成功; -* failed:服务启动失败,需查看 TimechoDB 日志排查问题。 - -### 4.4 重启服务 - -重启服务相当于先执行停止操作,再执行启动操作,命令如下: - -```bash -# 重启 ConfigNode 服务 -systemctl restart timechodb-confignode -# 重启 DataNode 服务 -systemctl restart timechodb-datanode -# 重启 AINode 服务 -systemctl restart timechodb-ainode -``` - -### 4.5 配置开机自启 - -```bash -# 配置 ConfigNode 开机自启 -systemctl enable timechodb-confignode -# 配置 DataNode 开机自启 -systemctl enable timechodb-datanode -# 配置 AINode 开机自启 -systemctl enable timechodb-ainode -``` - 
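下面是一个最小示意脚本(并非 TimechoDB 自带工具),假设三个服务已按上文完成注册;其中 `DRY_RUN` 变量与 `enable_all` 函数均为本示例虚构的名称。脚本默认只打印将要执行的命令,将 `DRY_RUN` 设为 0 并以 root 用户运行时,才会真正调用 `systemctl enable`,随后用 `systemctl is-enabled` 确认结果:

```bash
#!/usr/bin/env bash
# 示意脚本:批量为三个节点服务配置开机自启。
# DRY_RUN 是本示例假设的开关:默认为 1,仅打印命令;设为 0 时才真正执行(需 root)。
DRY_RUN="${DRY_RUN:-1}"

# 服务名取自上文 systemctl 命令中使用的名称
SERVICES=(timechodb-confignode timechodb-datanode timechodb-ainode)

enable_all() {
  local svc
  for svc in "${SERVICES[@]}"; do
    if [ "$DRY_RUN" = "1" ]; then
      # 演练模式:只输出将要执行的命令
      echo "systemctl enable ${svc}"
    else
      # 实际执行,并确认该服务已处于 enabled 状态
      systemctl enable "${svc}" && systemctl is-enabled "${svc}"
    fi
  done
}

enable_all
```

先以默认的演练模式运行并核对输出的命令,再切换为实际执行,可以降低误操作的风险。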
-### 4.6 取消开机自启 - -```bash -# 取消 ConfigNode 开机自启 -systemctl disable timechodb-confignode -# 取消 DataNode 开机自启 -systemctl disable timechodb-datanode -# 取消 AINode 开机自启 -systemctl disable timechodb-ainode -``` - -## 5. 自定义服务配置 - -### 5.1 自定义方式 - -#### 5.1.1 方案一:修改脚本 - -1. 修改 `daemon-xxx.sh` 中的[Unit]、[Service]、[Install]区域配置项,具体配置项的含义参考下一小节 -2. 执行 `daemon-xxx.sh` 脚本 - -#### 5.1.2 方案二:修改服务文件 - -1. 修改 `/etc/systemd/system` 中的 `xx.service` 文件 -2. 执行 `systemctl daemon-reload` - -### 5.2 `daemon-xxx.sh` 配置项 - -#### 5.2.1 [Unit] 部分(服务元信息) - -| 配置项 | 说明 | -| --------------- | ---------------------------------- | -| Description | 服务描述 | -| Documentation | 指向 TimechoDB 官方文档 | -| After | 确保在网络服务启动后才启动该服务 | - -#### 5.2.2 [Service] 部分(服务运行配置) - -| 配置项 | 含义 | -| -------------------------------------------- | ---------------------------------------------------------------------- | -| StandardOutput、StandardError | 指定服务标准输出和错误日志的存储路径 | -| LimitNOFILE=65536 | 设置文件描述符上限,默认值为 65536 | -| Type=simple | 服务类型为简单前台进程,systemd 会跟踪服务主进程 | -| User=root、Group=root | 指定服务以 root 用户和 root 组的权限运行 | -| ExecStart/ExecStop | 分别指定服务的启动脚本和停止脚本的路径 | -| Restart=on-failure | 仅在服务异常退出时,自动重启服务 | -| SuccessExitStatus=143 | 将退出码 143(128+15,即 SIGTERM 正常终止)视为成功退出 | -| RestartSec=5 | 服务重启的间隔时间,默认为 5 秒 | -| StartLimitInterval=600s、StartLimitBurst=3 | 10 分钟(600 秒)内,服务最多重启 3 次,防止频繁重启导致系统资源浪费 | -| RestartPreventExitStatus=SIGKILL | 服务被 SIGKILL 信号杀死后,不自动重启,避免无限重启僵尸进程 | - -#### 5.2.3 [Install] 部分(安装配置) - -| 配置项 | 含义 | -| ---------------------------- | -------------------------------------------- | -| WantedBy=multi-user.target | 指定服务在系统进入多用户模式时,自动启动。 | - -### 5.3 .service 文件格式示例 - -```bash -[Unit] -Description=timechodb-confignode -Documentation=https://www.timecho.com/ -After=network.target - -[Service] -StandardOutput=null -StandardError=null -LimitNOFILE=65536 -Type=simple -User=root -Group=root -Environment=JAVA_HOME=$JAVA_HOME -ExecStart=$TimechoDB_SBIN_HOME/start-confignode.sh -Restart=on-failure
-SuccessExitStatus=143 -RestartSec=5 -StartLimitInterval=600s -StartLimitBurst=3 -RestartPreventExitStatus=SIGKILL - -[Install] -WantedBy=multi-user.target -``` - -注:上述为 timechodb-confignode.service 文件的标准格式,timechodb-datanode.service、timechodb-ainode.service 文件格式类似。 - -## 6. 注意事项 - -1. **进程守护机制** - -* **自动重启**:服务启动失败或运行中异常退出(如 OOM)时,系统将自动重启。 -* **不重启**:正常退出(如执行 `kill`、`./sbin/stop-xxx.sh` 或 `systemctl stop`)不会触发自动重启。 - -2. **日志位置** - -* 所有运行日志均存储在 TimechoDB 安装目录下的 `logs` 文件夹中,排查问题时请查阅该目录。 - -3. **集群状态查看** - -* 服务启动后,执行 `./sbin/start-cli.sh` 并输入 `show cluster` 命令,即可查看集群状态。 - -4. **故障恢复流程** - -* 若服务状态为 `failed`,修复问题后**必须**先执行 `systemctl daemon-reload`,然后再执行 `systemctl start`,否则启动将失败。 - -5. **配置生效** - -* 修改 `daemon-xxx.sh` 脚本内容后,需执行 `systemctl daemon-reload` 重新注册服务,新配置方可生效。 - -6. **启动方式兼容** - -* `systemctl start`启动的服务,可用`./sbin/stop` 停止(不重启)。 -* `./sbin/start` 启动的进程,无法通过 `systemctl` 监控状态。 diff --git a/src/zh/UserGuide/Master/Tree/User-Manual/Black-White-List_timecho.md b/src/zh/UserGuide/Master/Tree/User-Manual/Black-White-List_timecho.md deleted file mode 100644 index 66d99c273..000000000 --- a/src/zh/UserGuide/Master/Tree/User-Manual/Black-White-List_timecho.md +++ /dev/null @@ -1,78 +0,0 @@ - - -# 黑白名单 - -## 1. 引言 - -IoTDB 是一款针对物联网场景设计的时间序列数据库,支持高效的数据存储、查询和分析。随着物联网技术的广泛应用,数据安全性和访问控制变得至关重要。在开放环境中,如何保证合法用户对数据的安全访问成为了一项关键挑战。白名单机制仅允许可信 IP 或用户接入,从源头缩小攻击面;黑名单功能则能在边缘与云端协同场景下实时拦截恶意 IP,阻断非法访问、SQL 注入、暴力破解及 DDoS 等威胁,为数据传输提供持续、稳定的安全保障。 - -> 注意:该功能从 V2.0.6 版本开始提供。 - -## 2. 
白名单 - -### 2.1 功能描述 - -通过开启白名单功能、配置白名单列表,指定允许连接 IoTDB 的客户端地址,来限制仅在白名单范围内的客户端才能够访问 IoTDB,从而实现安全控制。 - -### 2.2 配置参数 - -管理员可以通过以下两种方式来启用/禁用白名单功能以及添加、修改、删除白名单ip/ip段。 - -* 编辑配置文件 `iotdb-system.properties`进行维护 -* 通过 set configuration 语句进行维护 - * 树模型请参考:[set configuration](../Reference/Modify-Config-Manual.md) - -相关参数如下: - -| 名称 | 描述 | 默认值 | 生效方式 | 示例 | -| ------------------------- | ----------------------------------------------------------------------------------- | -------- | ---------- | ------------------------------------------------------------------- | -| `enable_white_list` | 是否启用白名单功能。true:启用;false:禁用。字段值不区分大小写。 | false | 热加载 | `set enable_white_list = 'true' ` | -| `white_ip_list` | 添加、修改、删除白名单ip/ip段。支持精确匹配,支持\*通配符,多个ip之间以逗号分隔。 | 空 | 热加载 | `set white_ip_list='192.168.1.200,192.168.1.201,192.168.1.*`' | - -## 3. 黑名单 - -### 3.1 功能描述 - -通过开启黑名单功能、配置黑名单列表,阻止某些特定 IP 地址访问数据库,来防止非法访问、SQL注入、暴力破解、DDoS攻击等安全威胁,从而确保数据传输过程中的安全性和稳定性。 - -### 3.2 配置参数 - -管理员可以通过以下两种方式来启用/禁用黑名单功能以及添加、修改、删除黑名单 ip/ip 段。 - -* 编辑配置文件 `iotdb-system.properties`进行维护 -* 通过 set configuration 语句进行维护 - * 树模型请参考:[set configuration](../Reference/Modify-Config-Manual.md) - -相关参数如下: - -| 名称 | 描述 | 默认值 | 生效方式 | 示例 | -| ------------------------- | ----------------------------------------------------------------------------------- | -------- | ---------- | ------------------------------------------------------------------- | -| `enable_black_list` | 是否启用黑名单功能。true:启用;false:禁用。字段值不区分大小写。 | false | 热加载 | `set enable_black_list = 'true' ` | -| `black_ip_list` | 添加、修改、删除黑名单ip/ip段。支持精确匹配,支持\*通配符,多个ip之间以逗号分隔。 | 空 | 热加载 | `set black_ip_list='192.168.1.200,192.168.1.201,192.168.1.*`' | - -## 4. 注意事项 - -1. 开启白名单后,若列表为空将拒绝所有连接,若未包含本机 IP 则拒绝本机登录。 -2. 当同一 IP 同时存在于黑白名单时,黑名单优先级更高。 -3. 系统会校验 IP 格式,无效条目将在用户连接时报错并被跳过,不影响其他有效IP的加载。 -4. 配置支持重复IP,内存中会自动去重且无提示。如需去重请手动修改。 -5. 
黑/白名单规则仅对新连接生效,功能开启前的现有连接不受影响,其后续重连才会被拦截。 diff --git a/src/zh/UserGuide/Master/Tree/User-Manual/Data-Sync_timecho.md b/src/zh/UserGuide/Master/Tree/User-Manual/Data-Sync_timecho.md deleted file mode 100644 index c4cfe8fa7..000000000 --- a/src/zh/UserGuide/Master/Tree/User-Manual/Data-Sync_timecho.md +++ /dev/null @@ -1,743 +0,0 @@ - - -# 数据同步 -数据同步是工业物联网的典型需求,通过数据同步机制,可实现 IoTDB 之间的数据共享,搭建完整的数据链路来满足内网外网数据互通、端边云同步、数据迁移、数据备份等需求。 - -## 1. 功能概述 - -### 1.1 数据同步 - -一个数据同步任务包含 3 个阶段: - -![](/img/data-sync-new.png) - -- 抽取(Source)阶段:该部分用于从源 IoTDB 抽取数据,在 SQL 语句中的 source 部分定义 -- 处理(Process)阶段:该部分用于处理从源 IoTDB 抽取出的数据,在 SQL 语句中的 processor 部分定义 -- 发送(Sink)阶段:该部分用于向目标 IoTDB 发送数据,在 SQL 语句中的 sink 部分定义 - -通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。目前数据同步支持以下信息的同步,您可以在创建同步任务时对同步范围进行选择(默认选择 data.insert,即同步新写入的数据): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
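| 同步范围 | 同步内容 | 说明 |
| --- | --- | --- |
| all | | 所有范围 |
| data(数据) | insert(增量) | 同步新写入的数据 |
| data(数据) | delete(删除) | 同步被删除的数据 |
| schema(元数据) | database(数据库) | 同步数据库的创建、修改或删除操作 |
| schema(元数据) | timeseries(时间序列) | 同步时间序列的定义和属性 |
| schema(元数据) | TTL(数据到期时间) | 同步数据的存活时间 |
| auth(权限) | - | 同步用户权限和访问控制 |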
- -### 1.2 功能限制及说明 - -1. 元数据(schema)、权限(auth)同步功能存在如下限制: - -- 使用元数据同步时,要求`Schema region`、`ConfigNode` 的共识协议必须为默认的 ratis 协议,即`iotdb-system.properties`配置文件中是否包含`config_node_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus`、`schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus`,不包含即为默认值ratis 协议。 - -- 为了防止潜在的冲突,请在开启元数据同步时关闭接收端自动创建元数据功能。可通过修改 `iotdb-system.properties`配置文件中的`enable_auto_create_schema`配置项为 false,关闭元数据自动创建功能。 - -- 开启元数据同步时,不支持使用自定义插件。 - -- 双活集群中元数据同步需避免两端同时操作。 - -- 在进行数据同步任务时,请避免执行任何删除操作,防止两端状态不一致。 - -2. Pipe 权限控制规范如下: - -- 创建 pipe 时,可以对抽取/写回插件指定用户名和密码。密码错误则禁止创建,未指定时默认使用当前用户进行同步。 - -- 数据/元数据同步时,先根据 Pipe 配置的路径模式(pattern/path)筛选,再基于用户读取权限进行鉴权 - - - 权限范围≥写入路径:完整同步 - - - 权限范围与写入路径无交集:不同步 - - - 权限范围<写入路径或存在交集:同步交集部分 - -- 遇到无权限数据时,若发送端 skipIf=no-privileges,则跳过无权限数据;若 skipIf 配置为空,任务报错(803错误) - - - 注意:此 skipIf 配置与接收端的 skipIf(默认为空)相互独立 - -- 对于 root.__system, root.__audit 均不会同步 - -3. Pipe 接收端类型自动转换 - -当 Pipe 向接收端写入数据因字段类型不匹配而失败时,IoTDB 可按照目标端已有 schema 的字段类型对数据进行转换,并重试写入,以提高同步成功率。该能力由 `sink.exception.data.convert-on-type-mismatch` 控制,参数说明见后文 sink 参数表。 - -类型不匹配时的转换规则如下: - -| 源类型 | 目标类型 | 转换规则 | -| -------------------- | ----------- | -------------------------------------------------------------------------------- | -| 数值类型 | 数值类型 | 按目标数值类型进行转换,可能发生截断、精度损失或溢出。 | -| 数值类型 | BOOLEAN | `0`转换为`false`,非`0`转换为`true`。 | -| BOOLEAN | 数值类型 | `true`转换为`1`,`false`转换为`0`。 | -| TEXT、STRING、BLOB | BOOLEAN | 按字符串解析为 BOOLEAN。 | -| TEXT、STRING、BLOB | 数值类型 | 按字符串解析为目标数值类型;解析失败时写入默认值`0`、`0L`或`0.0`。 | -| TEXT、STRING、BLOB | TIMESTAMP | 按字符串解析为 TIMESTAMP;解析失败时写入默认值`0L`。 | -| TEXT、STRING、BLOB | DATE | 按字符串解析为 DATE;解析失败时写入默认日期`1970-01-01`。 | -| 非法数值 | DATE | 若无法转换为合法 DATE,则写入默认日期`1970-01-01`。 | -| DATE | TIMESTAMP | 按 UTC 转换为当天零点对应的时间戳。 | -| TIMESTAMP | DATE | 按 UTC 转换为对应日期。 | - -> 注意:自动转换基于目标端已有 schema 执行,不会自动修改目标端 schema;该能力优先保证同步继续进行,可能导致精度损失或默认值写入。 - - - -## 2. 
使用说明 - -数据同步任务有三种状态:RUNNING、STOPPED 和 DROPPED。任务状态转换如下图所示: - -![](/img/Data-Sync01.png) - -创建后任务会直接启动,同时当任务发生异常停止后,系统会自动尝试重启任务。 - -提供以下 SQL 语句对同步任务进行状态管理。 - -### 2.1 创建任务 - -使用 `CREATE PIPE` 语句来创建一条数据同步任务,下列属性中`PipeId`和`sink`必填,`source`和`processor`为选填项,输入 SQL 时注意 `SOURCE`与 `SINK` 插件顺序不能替换。 - -SQL 示例如下: - -```SQL -CREATE PIPE [IF NOT EXISTS] -- PipeId 是能够唯一标定任务的名字 --- 数据抽取插件,可选插件 -WITH SOURCE ( - [ = ,], -) --- 数据处理插件,可选插件 -WITH PROCESSOR ( - [ = ,], -) --- 数据连接插件,必填插件 -WITH SINK ( - [ = ,], -) -``` - -**IF NOT EXISTS 语义**:用于创建操作中,确保当指定 Pipe 不存在时,执行创建命令,防止因尝试创建已存在的 Pipe 而导致报错。 - -**注意**:V2.0.8 起,创建一个全量数据同步 Pipe (例如 Pipeid : `alldatapipe`)时,系统会自动将其拆分为两个独立的 Pipe: - -* 历史 Pipe:PipeId 为原名称加 _history后缀(如 `alldatapipe_history`),source 参数默认携带 `'realtime.enable'='false', 'inclusion'='data.insert', 'inclusion.exclusion'=''` - -* 实时 Pipe:PipeId 为原名称加 _realtime后缀(如 `alldatapipe_realtime`),source 参数默认携带 `'history.enable'='false'` ,若配置了元数据同步,则由实时 Pipe 负责发送 - -创建成功后,原 PipeId(如 `alldatapipe`)将不再作为有效标识符。在进行启动、停止、删除、查看等任务操作时,必须使用拆分后的独立 PipeId(即 `*_history`或 `*_realtime`)。操作示例见[查看任务](./Data-Sync_timecho.md#_2-5-查看任务)小节 - - -### 2.2 开始任务 - -开始处理数据: - -```SQL -START PIPE -``` - -### 2.3 停止任务 - -停止处理数据: - -```SQL -STOP PIPE -``` - -### 2.4 删除任务 - -删除指定任务: - -```SQL -DROP PIPE [IF EXISTS] -``` - -**IF EXISTS 语义**:用于删除操作中,确保当指定 Pipe 存在时,执行删除命令,防止因尝试删除不存在的 Pipe 而导致报错。 - -删除任务不需要先停止同步任务。 - -### 2.5 查看任务 - -查看全部任务: - -```SQL -SHOW PIPES -``` - -查看指定任务: - -```SQL -SHOW PIPE -``` - - pipe 的 show pipes 结果示例: - -```SQL -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State|PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| 
-+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -|59abf95db892428b9d01c5fa318014ea|2024-06-17T14:03:44.189|RUNNING| {}| {}|{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}| | 128| 1.03| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -``` - -其中各列含义如下: - -- **ID**:同步任务的唯一标识符 -- **CreationTime**:同步任务的创建的时间 -- **State**:同步任务的状态 -- **PipeSource**:同步数据流的来源 -- **PipeProcessor**:同步数据流在传输过程中的处理逻辑 -- **PipeSink**:同步数据流的目的地 -- **ExceptionMessage**:显示同步任务的异常信息 -- **RemainingEventCount(统计存在延迟)**:剩余 event 数,当前数据同步任务中的所有 event 总数,包括数据和元数据同步的 event,以及系统和用户自定义的 event。 -- **EstimatedRemainingSeconds(统计存在延迟)**:剩余时间,基于当前 event 个数和 pipe 处速率,预估完成传输的剩余时间。 - -示例: - -在 V2.0.8 及之后的版本中,创建一个全量数据同步任务,并查看该任务详情 - -```sql -IoTDB> create pipe alldatapipe with source('inclusion'='all','exclusion'='auth') with sink('node-urls'='127.0.0.1:6668') - -IoTDB> show pipe alldatapipe_history -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_history|2025-12-18T15:06:16.697|RUNNING|{exclusion=auth, history.enable=true, inclusion=data.insert, 
inclusion.exclusion=, realtime.enable=false}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -IoTDB> show pipe alldatapipe_realtime -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_realtime|2025-12-18T15:06:16.312|RUNNING|{exclusion=auth, history.enable=false, inclusion=all, realtime.enable=true}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -``` - - -### 2.6 修改任务 - -`ALTER PIPE` 语句用于动态更新已存在的 PIPE,支持修改或替换 source、processor 及 sink 的配置。 - -```SQL -ALTER PIPE [IF EXISTS] - MODIFY/REPLACE SOURCE(...) - MODIFY/REPLACE PROCESSOR(...) - MODIFY/REPLACE SINK(...) 
-``` - -说明: - -* 执行操作不会改变 PIPE 的运行状态,等价于保留原 PipeId 的处理进度,在原进度位置创建新 PIPE。 -* source/processor/sink 的 modify/replace 参数均为非必填;若未指定任何修改参数,等价于删除当前 PIPE 后,按原配置和进度重新创建。 -* 对于指定 modify 的插件,保留该插件其他参数,仅替换或新增给定的参数。 -* 对于指定 replace 的插件,直接替换该插件所有参数。 -* 当使用 [IF EXISTS] 关键字时,即使不存在同名的 Pipe 也会返回执行成功,但是实际未执行任何操作。 - -示例: - -```SQL -ALTER PIPE A2B REPLACE SINK ('sink'='iotdb-thrift-sink', 'node-urls' = '127.0.0.1:6668'); -``` - -### 2.7 同步插件 - -为了使得整体架构更加灵活以匹配不同的同步场景需求,我们支持在同步任务框架中进行插件组装。系统为您预置了一些常用插件可直接使用,同时您也可以自定义 processor 插件 和 Sink 插件,并加载至 IoTDB 系统进行使用。查看系统中的插件(含自定义与内置插件)可以用以下语句: - -```SQL -SHOW PIPEPLUGINS -``` - -返回结果如下: - -```SQL -IoTDB> SHOW PIPEPLUGINS -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| PluginName|PluginType| ClassName| PluginJar| -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.processor.donothing.DoNothingProcessor| | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.donothing.DoNothingConnector| | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector| | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.extractor.iotdb.IoTDBExtractor| | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector| | -| IOTDB-THRIFT-SSL-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector| | 
-+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ - -``` - -预置插件详细介绍如下(各插件的详细参数可参考本文[参数说明](#参考参数说明)): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| 类型 | 自定义插件 | 插件名称 | 介绍 | 适用版本 |
| --- | --- | --- | --- | --- |
| source 插件 | 不支持 | iotdb-source | 默认的 extractor 插件,用于抽取 IoTDB 历史或实时数据 | 1.2.x |
| processor 插件 | 支持 | do-nothing-processor | 默认的 processor 插件,不对传入的数据做任何的处理 | 1.2.x |
| sink 插件 | 支持 | do-nothing-sink | 不对发送出的数据做任何的处理 | 1.2.x |
| sink 插件 | 支持 | iotdb-thrift-sink | 默认的 sink 插件(V1.3.1及以上),用于 IoTDB(V1.2.0 及以上)与 IoTDB(V1.2.0 及以上)之间的数据传输。使用 Thrift RPC 框架传输数据,多线程 async non-blocking IO 模型,传输性能高,尤其适用于目标端为分布式时的场景 | 1.2.x |
| sink 插件 | 支持 | iotdb-air-gap-sink | 用于 IoTDB(V1.2.2 及以上)向 IoTDB(V1.2.2 及以上)跨单向数据网闸的数据同步。支持的网闸型号包括南瑞 Syskeeper 2000 等 | 1.2.x |
| sink 插件 | 支持 | iotdb-thrift-ssl-sink | 用于 IoTDB(V1.3.1 及以上)与 IoTDB(V1.2.0 及以上)之间的数据传输。使用 Thrift RPC 框架传输数据,单线程 sync blocking IO 模型,适用于安全需求较高的场景 | 1.3.1+ |
- -导入自定义插件可参考[流处理框架](./Streaming_timecho.md#自定义流处理插件管理)章节。 - -## 3. 使用示例 - -### 3.1 全量数据同步 - -本例子用来演示将一个 IoTDB 的所有数据同步至另一个 IoTDB,数据链路如下图所示: - -![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png) - -在这个例子中,我们可以创建一个名为 A2B 的同步任务,用来同步 A IoTDB 到 B IoTDB 间的全量数据,这里需要用到 sink 的 iotdb-thrift-sink 插件(内置插件),需通过 node-urls 配置目标端 IoTDB 中 DataNode 节点的数据服务端口的 url,如下面的示例语句: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 3.2 部分数据同步 - -本例子用来演示同步某个历史时间范围( 2023 年 8 月 23 日 8 点到 2023 年 10 月 23 日 8 点)的数据至另一个 IoTDB,数据链路如下图所示: - -![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png) - -在这个例子中,我们可以创建一个名为 A2B 的同步任务。首先我们需要在 source 中定义传输数据的范围,由于传输的是历史数据(历史数据是指同步任务创建之前存在的数据),需要配置数据的起止时间 start-time 和 end-time 以及传输的模式 mode。通过 node-urls 配置目标端 IoTDB 中 DataNode 节点的数据服务端口的 url。 - -详细语句如下: - -```SQL -create pipe A2B -WITH SOURCE ( - 'source'= 'iotdb-source', - 'realtime.mode' = 'stream', -- 新插入数据(pipe创建后)的抽取模式 - 'path' = 'root.vehicle.**', -- 同步数据的范围 - 'start-time' = '2023.08.23T08:00:00+00:00', -- 同步所有数据的开始 event time,包含 start-time - 'end-time' = '2023.10.23T08:00:00+00:00' -- 同步所有数据的结束 event time,包含 end-time -) -with SINK ( - 'sink'='iotdb-thrift-async-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 3.3 双向数据传输 - -本例子用来演示两个 IoTDB 之间互为双活的场景,数据链路如下图所示: - -![](/img/1706698592139.jpg) - -在这个例子中,为了避免数据无限循环,需要将 A 和 B 上的参数`forwarding-pipe-requests` 均设置为 `false`,表示不转发从另一 pipe 传输而来的数据,以及要保持两侧的数据一致 pipe 需要配置`inclusion=all`来同步全量数据和元数据。 - -详细语句如下: - -在 A IoTDB 上执行下列语句: - -```SQL -create pipe AB -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'forwarding-pipe-requests' = 'false' --不转发由其他 Pipe 写入的数据 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 B IoTDB 上执行下列语句: - -```SQL -create pipe BA -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'forwarding-pipe-requests'
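= 'false'   -- whether to forward data written by other pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6667', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-### 3.4 Edge-Cloud Data Transmission
-
-This example covers edge-to-cloud transmission among several IoTDB clusters: clusters B, C, and D each synchronize their data to cluster A. The data link is shown below:
-
-![](/img/dataSync03.png)
-
-In this example, to synchronize the data of clusters B, C, and D to A, the pipes BA, CA, and DA must set `path` to restrict the scope. To keep the edge and cloud data consistent, each pipe must also set `inclusion=all` to synchronize all data and metadata. The full statements are as follows:
-
-Execute the following statement on IoTDB B to synchronize the data in B to A:
-
-```SQL
-create pipe BA
-with source (
-   'inclusion'='all',  -- synchronize all data, metadata, and permissions
-   'path'='root.db.**', -- restrict the scope
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6667', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-Execute the following statement on IoTDB C to synchronize the data in C to A:
-
-```SQL
-create pipe CA
-with source (
-   'inclusion'='all',  -- synchronize all data, metadata, and permissions
-   'path'='root.db.**', -- restrict the scope
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-Execute the following statement on IoTDB D to synchronize the data in D to A:
-
-```SQL
-create pipe DA
-with source (
-   'inclusion'='all',  -- synchronize all data, metadata, and permissions
-   'path'='root.db.**', -- restrict the scope
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6669', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### 3.5 Cascaded Data Transmission
-
-This example covers cascaded transmission among several IoTDB clusters: data is synchronized from cluster A to cluster B, and then on to cluster C. The data link is shown below:
-
-![](/img/1706698610134.jpg)
-
-In this example, for the data of cluster A to reach C, the pipe between B and C must set `forwarding-pipe-requests` to `true`. The full statements are as follows:
-
-Execute the following statement on IoTDB A to synchronize the data in A to B:
-
-```SQL
-create pipe AB
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-Execute the following statement on IoTDB B to synchronize the data in B to C:
-
-```SQL
-create pipe BC
-with source (
-  'forwarding-pipe-requests' = 'true'   -- whether to forward data written by other pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6669', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### 3.6 Data Transmission Across a Network Gateway
-
-This example synchronizes data from one IoTDB to another through a unidirectional network gateway. The data link is shown below:
-
-![](/img/cross-network-gateway.png)
-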
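-In this example, the sink must use the iotdb-air-gap-sink plugin. After the gateway is configured, execute the following statement on IoTDB A, setting node-urls to the URL, as configured on the gateway, of the data service port of a DataNode in the target IoTDB:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-air-gap-sink',
-  'node-urls' = '10.53.53.53:9780', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-**Note:**
-* When creating a pipe for synchronization across a gateway, make sure the target user already exists on the receiver. If the user is missing when the pipe is created, creating it afterwards will not synchronize the earlier data.
-* The gateway models currently supported are listed in the table below.
-> For other gateway models, contact the Timecho business team to confirm whether they are supported.
-
-| Gateway type | Gateway model | Response packet restriction | Send restriction |
-| ------------ | ------------- | --------------------------- | ---------------- |
-| Forward | NARI (南瑞) Syskeeper-2000, forward type | all-0 / all-1 bytes | None |
-| Forward | XJ (许继) self-developed gateway | all-0 / all-1 bytes | None |
-| Direction not specified | Winicssec (威努特) security isolation and information exchange system | None | None |
-| Forward | Kedong (科东) StoneWall-2000 network security isolation device (forward type) | None | None |
-| Reverse | NARI (南瑞) Syskeeper-2000, reverse type | all-0 / all-1 bytes | Must conform to the E-language format |
-| Direction not specified | DPtech (迪普科技) ISG5000 | None | None |
-| Direction not specified | Xiling (熙羚) security isolation and information exchange system XL-GAP | None | None |
-
-### 3.7 Compressed Synchronization
-
-IoTDB supports specifying a data compression method during synchronization. Setting the `compressor` parameter compresses data in real time as it is transmitted. `compressor` currently supports five algorithms: snappy / gzip / lz4 / zstd / lzma2. Several algorithms can be combined, in which case they are applied in the configured order. `rate-limit-bytes-per-second` (supported in V1.3.3 and later) is the maximum number of bytes allowed per second, measured after compression; a value less than 0 means no limit.
-
-For example, to create a synchronization task named A2B:
-
-```SQL
-create pipe A2B
-with sink (
- 'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
- 'compressor' = 'snappy,lz4',  -- compression algorithms, applied in the configured order
- 'rate-limit-bytes-per-second'='1048576'  -- maximum number of bytes allowed per second
-)
-```
-
-### 3.8 Encrypted Synchronization
-
-IoTDB supports SSL encryption during synchronization, so that data is transferred securely between IoTDB instances. Configuring the SSL parameters, namely the certificate path (`ssl.trust-store-path`) and password (`ssl.trust-store-pwd`), ensures that data is protected by SSL during synchronization.
-
-For example, to create a synchronization task named A2B:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-ssl-sink',
-  'node-urls'='127.0.0.1:6667',  -- URL of the data service port of a DataNode in the target IoTDB
-  'ssl.trust-store-path'='pki/trusted', -- path of the trust store certificate required to connect to the target DataNode
-  'ssl.trust-store-pwd'='root' -- password of the trust store certificate required to connect to the target DataNode
-)
-```
-
-## 4. 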
Reference: Notes
-
-Data synchronization parameters, such as the storage directory for synchronized data, can be adjusted in the IoTDB configuration file (`iotdb-system.properties`). The full configuration is as follows:
-
-V1.3.3+:
-
-```Properties
-# pipe_receiver_file_dir
-# If this property is unset, system will save the data in the default relative path directory under the IoTDB folder(i.e., %IOTDB_HOME%/${cn_system_dir}/pipe/receiver).
-# If it is absolute, system will save the data in the exact location it points to.
-# If it is relative, system will save the data in the relative path directory it indicates under the IoTDB folder.
-# Note: If pipe_receiver_file_dir is assigned an empty string(i.e.,zero-size), it will be handled as a relative path.
-# effectiveMode: restart
-# For windows platform
-# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is absolute. Otherwise, it is relative.
-# pipe_receiver_file_dir=data\\confignode\\system\\pipe\\receiver
-# For Linux platform
-# If its prefix is "/", then the path is absolute. Otherwise, it is relative.
-pipe_receiver_file_dir=data/confignode/system/pipe/receiver
-
-####################
-### Pipe Configuration
-####################
-
-# Uncomment the following field to configure the pipe lib directory.
-# effectiveMode: first_start
-# For Windows platform
-# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is
-# absolute. Otherwise, it is relative.
-# pipe_lib_dir=ext\\pipe
-# For Linux platform
-# If its prefix is "/", then the path is absolute. Otherwise, it is relative.
-pipe_lib_dir=ext/pipe
-
-# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor.
-# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)).
-# effectiveMode: restart
-# Datatype: int
-pipe_subtask_executor_max_thread_num=5
-
-# The connection timeout (in milliseconds) for the thrift client.
-# effectiveMode: restart -# Datatype: int -pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# effectiveMode: restart -# Datatype: int -pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# effectiveMode: restart -# Datatype: int -pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# effectiveMode: restart -# Datatype: Boolean -pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# Datatype: int -# effectiveMode: restart -pipe_air_gap_receiver_port=9780 - -# The total bytes that all pipe sinks can transfer per second. -# When given a value less than or equal to 0, it means no limit. -# default value is -1, which means no limit. -# effectiveMode: hot_reload -# Datatype: double -pipe_all_sinks_rate_limit_bytes_per_second=-1 -``` - -## 5. 
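Reference: Parameter Description
-
-### 5.1 source Parameters
-
-| Parameter | Description | Value range | Required | Default |
-|-----------|-------------|-------------|----------|---------|
-| source | iotdb-source | String: iotdb-source | Required | - |
-| inclusion | Specifies the scope to synchronize in the task: data, metadata, and permissions | String: all, data(insert,delete), schema(database,timeseries,ttl), auth | Optional | data.insert |
-| inclusion.exclusion | Excludes specific operations from the scope given by inclusion, reducing the amount of data synchronized | String: all, data(insert,delete), schema(database,timeseries,ttl), auth | Optional | empty string |
-| mode.streaming | Specifies the capture source for time-series writes. Applies when `mode.snapshot` is `false` and determines the capture source for the `data.insert` data in `inclusion`. Two strategies are available. true: the capture type is chosen dynamically. The system adapts to downstream processing speed, choosing between capturing every write request and capturing only TsFile sealing requests. When downstream processing is fast, write requests are preferred to reduce latency; when it is slow, only file sealing requests are captured to avoid a backlog. This mode suits most scenarios and balances processing latency against throughput. false: fixed batch capture. Only TsFile sealing requests are captured, which suits resource-constrained scenarios and lowers system load. Note that snapshot data captured when the pipe starts is always handed downstream as files. | Boolean: true / false | Optional | true |
-| mode.strict | Whether data must be filtered strictly when the time / path / database-name / table-name parameters are used: `true`: strict filtering. Captured data is filtered exactly by the given conditions, so only matching data is selected. `false`: non-strict filtering. The selected data may include some extra data; this suits performance-sensitive scenarios and lowers CPU and IO consumption. | Boolean: true / false | Optional | true |
-| mode.snapshot | Determines how time-series data is captured, affecting the `data` part of `inclusion`. Two modes are available: `true`: static capture. A one-off snapshot is captured when the pipe starts. Once the snapshot has been fully consumed, **the pipe terminates automatically (DROP PIPE is executed automatically)**. `false`: dynamic capture. In addition to the snapshot captured at startup, subsequent data changes are captured continuously, and the pipe keeps running to process the dynamic data stream. | Boolean: true / false | Optional | false |
-| path | Can be specified when the sql_dialect of the user connection is tree. For pipes carried over from an upgrade, sql_dialect defaults to tree. This parameter determines the capture scope of time-series data, affecting the data part of inclusion and some series-related metadata. Data whose tree-model path matches path is selected into the pipe.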
 Since V2.0.8.2, this parameter accepts several exact paths in one pipe, e.g. `'path'='root.test.d0.s1,root.test.d0.s2,root.test.d0.s3'` | String: a standard IoTDB tree path pattern, wildcards allowed | Optional | root.** |
-| start-time | Start event time of the data to synchronize, inclusive | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | Optional | Long.MIN_VALUE |
-| end-time | End event time of the data to synchronize, inclusive | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | Optional | Long.MAX_VALUE |
-| forwarding-pipe-requests | Whether to forward data written by other pipes (usually data synchronization) | Boolean: true, false | Optional | true |
-| mods | Same as mods.enable: whether to send the mods files of TsFiles | Boolean: true / false | Optional | false |
-| skipIf | Which errors may be skipped; currently only the no-privileges error | String: no-privileges | Optional | no-privileges |
-
-> 💎 **Note: difference between the values true and false of the extraction mode mode.streaming**
-> - **true (recommended)**: the task processes and sends data in real time; high timeliness, low throughput
-> - **false**: the task processes and sends data in batches (by underlying data file); low timeliness, high throughput
-
-
-### 5.2 sink Parameters
-
-#### iotdb-thrift-sink
-
-| key | value | Value range | Required | Default |
-|-----|-------|-------------|----------|---------|
-| sink | iotdb-thrift-sink or iotdb-thrift-async-sink | String: iotdb-thrift-sink or iotdb-thrift-async-sink | Required | - |
-| node-urls | URLs of the data service ports of any number of DataNodes in the target IoTDB (note that a synchronization task cannot forward to its own service) | String. 
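 Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - |
-| user/username | Username used to connect to the receiver; the user must have the corresponding operation privileges | String | Optional | root |
-| password | Password of the username used to connect to the receiver; the user must have the corresponding operation privileges | String | Optional | TimechoDB@2021 (root before V2.0.6.x) |
-| batch.enable | Whether to enable batched log sending, which raises transfer throughput and lowers IOPS | Boolean: true, false | Optional | true |
-| batch.max-delay-seconds | Effective when batched sending is enabled: the longest a batch waits before being sent (unit: s) | Integer | Optional | 1 |
-| batch.max-delay-ms | Effective when batched sending is enabled: the longest a batch waits before being sent (unit: ms) (supported in V2.0.5 and later) | Integer | Optional | 1 |
-| batch.size-bytes | Effective when batched sending is enabled: the maximum size of a batch (unit: byte) | Long | Optional | 16*1024*1024 |
-| compressor | The RPC compression algorithm(s) to use; several may be configured and are applied to each request in order | String: snappy / gzip / lz4 / zstd / lzma2 | Optional | "" |
-| compressor.zstd.level | When the RPC compression algorithm is zstd, sets the zstd compression level | Int: [-131072, 22] | Optional | 3 |
-| rate-limit-bytes-per-second | Maximum number of bytes allowed per second, measured after compression (if any); a value less than 0 means no limit | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | Optional | -1 |
-| load-tsfile-strategy | When data is synchronized as files, whether the receiver waits for the result of its local load tsfile before responding to the sender.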
 sync: wait for the result of the local load tsfile;
 async: do not wait for the result of the local load tsfile. | String: sync / async | Optional | sync |
-| format | Payload format for data transmission. Options:
 - hybrid: follows the format passed on by the processor (tsfile or tablet); the sink performs no conversion.
 - tsfile: forces conversion to tsfile before sending; suitable for scenarios such as data file backup.
 - tablet: forces conversion to tablet before sending; suitable for synchronization when the data types on the sender and receiver are not fully compatible (to reduce errors). | String: hybrid / tsfile / tablet | Optional | hybrid |
-| exception.data.convert-on-type-mismatch | Whether to convert data when its type differs on the receiver | Boolean: true / false | Optional | true |
-
-#### iotdb-air-gap-sink
-
-| key | value | Value range | Required | Default |
-|-----|-------|-------------|----------|---------|
-| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | Required | - |
-| node-urls | URLs of the data service ports of any number of DataNodes in the target IoTDB | String. Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - |
-| user/username | Username used to connect to the receiver; the user must have the corresponding operation privileges | String | Optional | root |
-| password | Password of the username used to connect to the receiver; the user must have the corresponding operation privileges | String | Optional | TimechoDB@2021 (root before V2.0.6.x) |
-| compressor | The RPC compression algorithm(s) to use; several may be configured and are applied to each request in order | String: snappy / gzip / lz4 / zstd / lzma2 | Optional | "" |
-| compressor.zstd.level | When the RPC compression algorithm is zstd, sets the zstd compression level | Int: [-131072, 22] | Optional | 3 |
-| rate-limit-bytes-per-second | Maximum number of bytes allowed per second, measured after compression (if any); a value less than 0 means no limit | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | Optional | -1 |
-| load-tsfile-strategy | When data is synchronized as files, whether the receiver waits for the result of its local load tsfile before responding to the sender.
 sync: wait for the result of the local load tsfile;
 async: do not wait for the result of the local load tsfile. | String: sync / async | Optional | sync |
-| air-gap.handshake-timeout-ms | Timeout of the handshake request when the sender and receiver first try to establish a connection (unit: ms) | Integer | Optional | 5000 |
-| exception.data.convert-on-type-mismatch | Whether to convert data when its type differs on the receiver | Boolean: true / false | Optional | true |
-
-#### iotdb-thrift-ssl-sink
-
-| key | value | Value range | Required | Default |
-|-----|-------|-------------|----------|---------|
-| sink | iotdb-thrift-ssl-sink | String: iotdb-thrift-ssl-sink | Required | - |
-| node-urls | URLs of the data service ports of any number of DataNodes in the target IoTDB (note that a synchronization task cannot forward to its own service) | String. Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - |
-| user/username | Username used to connect to the receiver; the user must have the corresponding operation privileges | String | Optional | root |
-| password | Password of the username used to connect to the receiver; the user must have the corresponding operation privileges | String | Optional | TimechoDB@2021 (root before V2.0.6.x) |
-| batch.enable | Whether to enable batched log sending, which raises transfer throughput and lowers IOPS | Boolean: true, false | Optional | true |
-| batch.max-delay-seconds | Effective when batched sending is enabled: the longest a batch waits before being sent (unit: s) | Integer | Optional | 1 |
-| batch.max-delay-ms | Effective when batched sending is enabled: the longest a batch waits before being sent (unit: ms) (supported in V2.0.5 and later) | Integer | Optional | 1 |
-| batch.size-bytes | Effective when batched sending is enabled: the maximum size of a batch (unit: byte) | Long | Optional | 16*1024*1024 |
-| compressor | The RPC compression algorithm(s) to use; several may be configured and are applied to each request in order | String: snappy / gzip / lz4 / zstd / lzma2 | Optional | "" |
-| compressor.zstd.level | When the RPC compression algorithm is zstd, sets the zstd compression level | Int: [-131072, 22] | Optional | 3 |
-| rate-limit-bytes-per-second | Maximum number of bytes allowed per second, measured after compression (if any); a value less than 0 means no limit | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | Optional | -1 |
-| load-tsfile-strategy | When data is synchronized as files, whether the receiver waits for the result of its local load tsfile before responding to the sender.
 sync: wait for the result of the local load tsfile;
 async: do not wait for the result of the local load tsfile. | String: sync / async | Optional | sync |
-| ssl.trust-store-path | Path of the trust store certificate required to connect to the target DataNode | String. Example: 'pki/trusted' | Required | - |
-| ssl.trust-store-pwd | Password of the trust store certificate required to connect to the target DataNode | String | Required | - |
-| format | Payload format for data transmission. Options:
 - hybrid: follows the format passed on by the processor (tsfile or tablet); the sink performs no conversion.
 - tsfile: forces conversion to tsfile before sending; suitable for scenarios such as data file backup.
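 - tablet: forces conversion to tablet before sending; suitable for synchronization when the data types on the sender and receiver are not fully compatible (to reduce errors). | String: hybrid / tsfile / tablet | Optional | hybrid |
-| exception.data.convert-on-type-mismatch | Whether to convert data when its type differs on the receiver | Boolean: true / false | Optional | true |
-
diff --git a/src/zh/UserGuide/Master/Tree/User-Manual/Data-subscription_timecho.md b/src/zh/UserGuide/Master/Tree/User-Manual/Data-subscription_timecho.md
deleted file mode 100644
index 77267b8e7..000000000
--- a/src/zh/UserGuide/Master/Tree/User-Manual/Data-subscription_timecho.md
+++ /dev/null
@@ -1,144 +0,0 @@
-# Data Subscription
-
-## 1. Introduction
-
-The IoTDB data subscription module (also called the IoTDB subscription client) is supported since IoTDB V1.3.3. It gives users a streaming way to consume data that is distinct from querying. It borrows the basic concepts and logic of message queue products such as Kafka and **provides data subscription and consumption interfaces**, but it is not meant to fully replace those message queue products. Its purpose is to offer a more convenient subscription service in scenarios that only need simple streaming access to data.
-
-Consuming data with the IoTDB subscription client has clear advantages in the following scenarios:
-
-1. **Continuously obtaining the latest data**: compared with periodic queries, subscription is closer to real time, simpler to program against, and lighter on the system;
-2. **Simplifying data delivery to third-party systems**: there is no need to develop a data push component inside IoTDB for each system. Streaming access can be implemented in the third-party system, which makes it easier to send data to systems such as Flink, Kafka, DataX, Camel, MySQL, and PG.
-
-## 2. Key Concepts
-
-The IoTDB subscription client has three core concepts: Topic, Consumer, and Consumer Group. Their relationship is shown in the figure below.
-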
- -
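-
-1. **Topic**: A Topic is a data space in IoTDB, defined by a path and a time range (for example, the full time range of root.**). Consumers can subscribe to the data of these topics (both existing data and data written in the future). Unlike Kafka, IoTDB allows a Topic to be created after the data has been stored, and the output format can be either Message or TsFile.
-
-2. **Consumer**: A Consumer is an IoTDB subscription client that receives and processes the data published to a specific Topic. A Consumer fetches data from the queue and processes it. The IoTDB subscription client provides two types of Consumers:
-  - `SubscriptionPullConsumer` corresponds to the pull mode of message queues: user code must actively call the data fetching logic
-  - `SubscriptionPushConsumer` corresponds to the push mode of message queues: user code is triggered by newly arrived data events
-
-3. **Consumer Group**: A Consumer Group is a set of Consumers; Consumers with the same Consumer Group ID belong to the same group. A Consumer Group has the following properties:
-  - Consumer Group to Consumer is a one-to-many relationship. A consumer group can contain any number of consumers, but a consumer cannot join several consumer groups at the same time
-  - A Consumer Group may contain Consumers of different types (`SubscriptionPullConsumer` and `SubscriptionPushConsumer`)
-  - A topic does not have to be subscribed by every consumer in a consumer group
-  - When different Consumers in the same Consumer Group subscribe to the same Topic, each piece of data under that Topic is processed by only one Consumer in the group, so data is not processed twice
-
-## 3. SQL Statements
-
-### 3.1 Topic Management
-
-IoTDB supports creating, dropping, and viewing Topics through SQL statements. The Topic state transitions are shown in the figure below:
-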
- -
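-
-#### 3.1.1 Creating a Topic
-
-The SQL statement is:
-
-```SQL
-    CREATE TOPIC [IF NOT EXISTS] <topicName>
-    WITH (
-    [<parameter> = <value>,],
-    );
-```
-**IF NOT EXISTS semantics**: used in a create operation so that the Topic is created only if it does not already exist, which avoids an error from trying to create an existing Topic.
-
-The parameters are described below:
-
-| Parameter | Required (default) | Meaning |
-| :-------- | :----------------- | :------ |
-| **path** | optional: `root.**` | Path of the time series whose data the topic subscribes to; denotes the set of time series to subscribe to |
-| **start-time** | optional: `MIN_VALUE` | Start time (event time) of the subscribed data. It can be in ISO format, e.g. 2011-12-03T10:15:30 or 2011-12-03T10:15:30+01:00, or a long value interpreted as a raw timestamp in the database's timestamp precision. The special value **`now`** means the creation time of the topic; when start-time is `now` and end-time is MAX_VALUE, only real-time data is subscribed |
-| **end-time** | optional: `MAX_VALUE` | End time (event time) of the subscribed data. It can be in ISO format, e.g. 2011-12-03T10:15:30 or 2011-12-03T10:15:30+01:00, or a long value interpreted as a raw timestamp in the database's timestamp precision. The special value `now` means the creation time of the topic; when end-time is `now` and start-time is MIN_VALUE, only historical data is subscribed |
-| **processor** | optional: `do-nothing-processor` | Processor plugin name and its parameters, i.e. the custom processing applied to the raw subscribed data; specified in the same way as a pipe processor plugin |
-| **format** | optional: `SessionDataSetsHandler` | Form in which the data subscribed from the topic is presented. Two forms are currently supported: `SessionDataSetsHandler`: the data is obtained through `SubscriptionSessionDataSetsHandler`, and consumers can consume it row by row. `TsFileHandler`: the data is obtained through `SubscriptionTsFileHandler`, and consumers receive the TsFiles that store the data directly |
-| **mode** **(supported in v1.3.3.2 and later)** | optional: `live` | Subscription mode of the topic, with two options: `live`: the subscribed dataset is dynamic, so the latest data can be consumed continuously. `snapshot`: the subscribed dataset is static, i.e. a snapshot of the data at the moment the consumer group subscribes to the topic (not the moment the topic is created); the static dataset formed after subscription does not support TTL |
-| **loose-range** **(supported in v1.3.3.2 and later)** | optional: `""` | String: whether the topic's data is filtered strictly by path and time range. `""`: filter strictly by both path and time range. `"time"`: filter loosely (coarse filtering) by time range and strictly by path. `"path"`: filter loosely (coarse filtering) by path and strictly by time range. `"time, path"` / `"path, time"` / `"all"`: filter loosely (coarse filtering) by both path and time range |
-
-Examples:
-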
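-```SQL
--- Subscribe to everything
-CREATE TOPIC root_all;
-
--- Custom subscription
-CREATE TOPIC IF NOT EXISTS db_timerange
-WITH (
-  'path' = 'root.db.**',
-  'start-time' = '2023-01-01',
-  'end-time' = '2023-12-31'
-);
-```
-
-#### 3.1.2 Dropping a Topic
-
-A Topic can be dropped only when it has no subscriptions. When a Topic is dropped, its consumption progress is cleaned up as well.
-
-```SQL
-DROP TOPIC [IF EXISTS] <topicName>;
-```
-
-**IF EXISTS semantics**: used in a drop operation so that the Topic is dropped only if it exists, which avoids an error from trying to drop a Topic that does not exist.
-
-#### 3.1.3 Viewing Topics
-
-```SQL
-SHOW TOPICS;
-SHOW TOPIC <topicName>;
-```
-
-Result set:
-
-```SQL
-[TopicName|TopicConfigs]
-```
-
-- TopicName: topic ID
-- TopicConfigs: topic configuration
-
-### 3.2 Viewing Subscription Status
-
-View all subscription relationships:
-
-```SQL
--- Show the subscription relationships between all topics and consumer groups
-SHOW SUBSCRIPTIONS
--- Show all subscriptions of a given topic
-SHOW SUBSCRIPTIONS ON <topicName>
-```
-
-Result set:
-
-```SQL
-[TopicName|ConsumerGroupName|SubscribedConsumers]
-```
-
-- TopicName: topic ID
-- ConsumerGroupName: the consumer group ID specified in user code
-- SubscribedConsumers: IDs of all clients in the consumer group that have subscribed to the topic
-
-## 4. API
-
-Besides SQL statements, IoTDB also supports data subscription through the Java native API. For the detailed syntax, see the Java Native API page ([link](../API/Programming-Java-Native-API_timecho)).
-
-## 5. FAQ
-
-### 5.1 How does IoTDB data subscription differ from Kafka?
-
-1. Ordering of consumption
-
-- **Kafka guarantees that messages are ordered within a single partition**. When a topic has exactly one partition and exactly one consumer subscribes to it, that (single-threaded) consumer is guaranteed to consume the topic's data in the order it was written.
-- The IoTDB subscription client **does not guarantee** that a consumer consumes data in the order it was written, but it reflects the write order as far as possible.
-
-2. Delivery semantics
-
-- Kafka can be configured to provide exactly-once semantics for both Producer and Consumer.
-- The IoTDB subscription client currently cannot provide exactly-once semantics for the Consumer.
\ No newline at end of file
diff --git a/src/zh/UserGuide/Master/Tree/User-Manual/IoTDB-View_timecho.md b/src/zh/UserGuide/Master/Tree/User-Manual/IoTDB-View_timecho.md
deleted file mode 100644
index 6d86293d2..000000000
--- a/src/zh/UserGuide/Master/Tree/User-Manual/IoTDB-View_timecho.md
+++ /dev/null
@@ -1,547 +0,0 @@
-
-
-# View
-
-## 1. Background of Series Views
-
-### 1.1 Scenario 1: Renaming Time Series (PI Asset Management)
-
-In practice, the devices that collect data may be named with identifiers that are hard for people to read, which makes queries difficult at the business layer.
-
-A series view can reorganize and manage these series so that they are accessed through a new model structure, without changing the original series and without creating or copying any series.
-
-**For example**: a cloud device uses the MAC address of its network card as its entity ID and writes data to the following time series: `root.db.0800200A8C6D.xvjeifg`. 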
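-
-This name is hard for users to understand. With the series view feature, the user can rename it by mapping it to a series view and then access the collected data as `root.view.device001.temperature`.
-
-### 1.2 Scenario 2: Simplifying Business-Layer Queries
-
-Users sometimes have many devices and manage a large number of time series. For a given task they may want to work with only some of those series. A series view lets them pick out the series of interest, which makes repeated queries and writes easier.
-
-**For example**: a user manages a production line whose devices produce a large number of time series. A temperature inspector only needs the device temperatures, so the temperature-related series can be extracted into a series view.
-
-### 1.3 Scenario 3: Assisting Permission Management
-
-In production, different business roles usually cover different scopes, and for security reasons permission management is often needed to restrict what each operator can access.
-
-**For example**: the safety department only needs to monitor the temperature of each device on a production line, but this data is stored in the same database as other confidential data. In that case, several new views can be created that contain only the temperature-related time series of the production line, and the safety officer is granted permissions on those series views only, which achieves the intended restriction.
-
-### 1.4 Motivation for the Series View Feature
-
-Based on the scenarios above, the main motivations for the series view feature are:
-
-1. Renaming time series.
-2. Simplifying business-layer queries.
-3. Assisting permission management by exposing data to specific users through views.
-
-## 2. Series View Concepts
-
-### 2.1 Terminology
-
-Convention: unless stated otherwise, every view in this document is a **series view**. Other features such as device views may be introduced in the future.
-
-### 2.2 Series View
-
-A series view is a way of organizing and managing time series.
-
-In a traditional relational database, data must be stored in a table, whereas in a time-series database such as IoTDB the series is the unit of storage. The concept of a series view in IoTDB is therefore also built on series.
-
-A series view is a virtual time series. Each virtual time series works like a soft link or shortcut that maps to a series outside the view or to some computation. In other words, a virtual series either maps to one specific external series or is computed from several external series.
-
-Users can create a view from a complex SQL query. In that case the series view behaves like a stored query: when data is read from the view, the stored query is used as the data source and placed in the FROM clause.
-
-### 2.3 Alias Series
-
-Among series views there is a special kind that satisfies all of the following conditions:
-
-1. The data source is a single time series
-2. There is no computation logic
-3. There is no filter condition (for example, no WHERE clause)
-
-Such a series view is called an **alias series**, or alias series view. A series view that does not satisfy all of the conditions above is a non-alias series view. The difference between them is that only alias series support writes.
-
-**No series view, alias series included, currently supports triggers.**
-
-### 2.4 Nested Views
-
-A user may want to select several series from an existing series view to form a new series view. This is called a nested view.
-
-**Nested views are not supported in the current version.**
-
-### 2.5 Restrictions on Series Views in IoTDB
-
-#### Restriction 1: A series view must depend on one or more time series
-
-A series view can take one of two forms:
-
-1. It maps to one time series
-2. 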
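 It is computed from one or more time series
-
-The first form was illustrated earlier and is easy to understand. The second form exists because series views may contain computation logic.
-
-For example, a user has installed two thermometers on the same boiler and needs the average of the two temperature values as the measurement. The two series collected are `root.db.d01.temperature01` and `root.db.d01.temperature02`.
-
-The user can then average the two series to obtain a series in the view: `root.db.d01.avg_temperature`.
-
-This example is expanded under "Creating a view with computation logic" in Section 3.1.
-
-#### Restriction 2: Non-alias series views are read-only
-
-Writes to a non-alias series view are not allowed.
-
-Only alias series views support writes.
-
-#### Restriction 3: Nested views are not allowed
-
-A series view cannot be created by selecting columns of an existing series view, whether directly or indirectly.
-
-An example of this restriction is given under "Nested series views are not supported" in Section 3.1.
-
-#### Restriction 4: A series view and a time series cannot share a name
-
-Series views and time series live under the same tree, so they cannot share a name.
-
-The name (path) of every series must be unique.
-
-#### Restriction 5: A series view shares time-series data with its source, but not tags or other metadata
-
-A series view is a mapping to time series, so the two share the same time-series data, and the time series is responsible for persistent storage.
-
-Their tags, attributes, and other metadata are not shared.
-
-The reason is that a user working with a view cares about the structure of that view. When querying with, for example, group by tag, the user expects grouping by the tags of the view, not by the tags of the underlying time series (which the user may not even be aware of).
-
-## 3. Series View Features
-
-### 3.1 Creating Views
-
-Creating a series view is similar to creating a time series. The difference is that the data source, i.e. the original series, must be specified with the AS keyword.
-
-#### SQL for creating a view
-
-A user can select some series to create a view:
-
-```SQL
-CREATE VIEW root.view.device.status
-AS
-    SELECT s01
-    FROM root.db.device
-```
-
-This means the user selected the series `s01` from the existing device `root.db.device` and created the series view `root.view.device.status`.
-
-A series view can live under the same entity as a time series, for example:
-
-```SQL
-CREATE VIEW root.db.device.status
-AS
-    SELECT s01
-    FROM root.db.device
-```
-
-As a result, `root.db.device` has a virtual copy of `s01` under the different name `status`.
-
-Both series views in the two examples above are alias series. For this case a more convenient creation syntax is provided:
-
-```SQL
-CREATE VIEW root.view.device.status
-AS
-    root.db.device.s01
-```
-
-#### Creating a view with computation logic
-
-Reusing the example from Restriction 1 in Section 2.5:
-
-> A user has installed two thermometers on the same boiler and needs the average of the two temperature values as the measurement. The two series collected are `root.db.d01.temperature01` and `root.db.d01.temperature02`.
->
-> The user can then average the two series to obtain a series in the view: `root.db.d01.avg_temperature`.
-
-Without a view, the user can query the average of the two temperatures like this:
-
-```SQL
-SELECT (temperature01 + temperature02) / 2
-FROM root.db.d01
-```
-
-With a series view, the user can create a view to simplify future queries:
-
-```SQL
-CREATE VIEW root.db.d01.avg_temperature
-AS
-    SELECT (temperature01 + temperature02) / 2
-    FROM root.db.d01
-```
-
-The user can then query:
-
-```SQL
-SELECT avg_temperature FROM root.db.d01
-```
-
-#### Nested series views are not supported
-
-Continuing the example from "Creating a view with computation logic": suppose the user now wants to create a new view from the series view `root.db.d01.avg_temperature`. This is not allowed. Nested views are currently not supported, whether or not the view is an alias series.
-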
-比如下列SQL语句会报错: - -```SQL -CREATE VIEW root.view.device.avg_temp_copy -AS - root.db.d01.avg_temperature -- 不支持。不允许嵌套视图 -``` - -#### 一次创建多条序列视图 - -一次只能指定一个序列视图对用户来说使用不方便,则可以一次指定多条序列,比如: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - SELECT s01, s02 - FROM root.db.device -``` - -此外,上述写法可以做简化: - -```SQL -CREATE VIEW root.db.device(status, sub.hardware) -AS - SELECT s01, s02 - FROM root.db.device -``` - -上述两条语句都等价于如下写法: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device; - -CREATE VIEW root.db.device.sub.hardware -AS - SELECT s02 - FROM root.db.device -``` - -也等价于如下写法 - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - root.db.device.s01, root.db.device.s02 - --- 或者 - -CREATE VIEW root.db.device(status, sub.hardware) -AS - root.db.device(s01, s02) -``` - -##### 所有序列间的映射关系为静态存储 - -有时,SELECT子句中可能包含运行时才能确定的语句个数,比如如下的语句: - -```SQL -SELECT s01, s02 -FROM root.db.d01, root.db.d02 -``` - -上述语句能匹配到的序列数量是并不确定的,和系统状态有关。即便如此,用户也可以使用它创建视图。 - -不过需要特别注意,所有序列间的映射关系为静态存储(创建时固定)!请看以下示例: - -当前数据库中仅含有`root.db.d01.s01`、`root.db.d02.s01`、`root.db.d02.s02`三条序列,接着创建视图: - -```SQL -CREATE VIEW root.view.d(alpha, beta, gamma) -AS - SELECT s01, s02 - FROM root.db.d01, root.db.d02 -``` - -时间序列之间映射关系如下: - -| 序号 | 时间序列 | 序列视图 | -| ---- | ----------------- | ----------------- | -| 1 | `root.db.d01.s01` | root.view.d.alpha | -| 2 | `root.db.d02.s01` | root.view.d.beta | -| 3 | `root.db.d02.s02` | root.view.d.gamma | - -此后,用户新增了序列`root.db.d01.s02`,则它不对应到任何视图;接着,用户删除`root.db.d01.s01`,则查询`root.view.d.alpha`会直接报错,它也不会对应到`root.db.d01.s02`。 - -请时刻注意,序列间映射关系是静态地、固化地存储的。 - -#### 批量创建序列视图 - -现有若干个设备,每个设备都有一个温度数值,例如: - -1. root.db.d1.temperature -2. root.db.d2.temperature -3. ... 
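-
-These devices may store many other series (for example `root.db.d1.speed`), but a view can be created that contains only the temperature values of these devices and ignores the other series:
-
-```SQL
-CREATE VIEW root.db.view(${2}_temperature)
-AS
-    SELECT temperature FROM root.db.*
-```
-
-This follows the naming convention of query write-back (`SELECT INTO`), using variable placeholders to specify the naming rule. See: [Query write-back (SELECT INTO)](../Basic-Concept/Query-Data_timecho#查询写回(INTO-子句))
-
-Here `root.db.*.temperature` determines which time series are included in the view, and `${2}` specifies which node of the time series path supplies the name of the series view.
-
-`${2}` refers to level 2 of `root.db.*.temperature` (counting from 0), i.e. the value matched by `*`. `${2}_temperature` joins that matched value and `temperature` with an underscore to form the node name of each series in the view.
-
-The statement above is equivalent to the following:
-
-```SQL
-CREATE VIEW root.db.view(${2}_${3})
-AS
-    SELECT temperature from root.db.*
-```
-
-The resulting view contains these series:
-
-1. root.db.view.d1_temperature
-2. root.db.view.d2_temperature
-3. ...
-
-When a view is created with wildcards, only the static mapping at creation time is stored.
-
-#### Restrictions on the SELECT clause when creating a view
-
-The SELECT clause used to create a series view is subject to some restrictions. The main ones are:
-
-1. The `WHERE` clause cannot be used.
-2. The `GROUP BY` clause cannot be used.
-3. Aggregate functions such as `MAX_VALUE` cannot be used.
-
-In short, only the form `SELECT ... FROM ... ` may follow `AS`, and the result of the query must be able to form a time series.
-
-### 3.2 Querying View Data
-
-For the supported query features, series views and time series can be used interchangeably in time-series data queries and behave identically.
-
-**The query types that series views currently do not support are:**
-
-1. **align by device queries**
-2. 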
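 **group by tags queries**
-
-Users can also mix time series and series views in the same SELECT statement, for example:
-
-```SQL
-SELECT temperature01, temperature02, avg_temperature
-FROM root.db.d01
-WHERE temperature01 < temperature02
-```
-
-However, when the user queries the metadata of a series, such as tags and attributes, the result is that of the series view, not that of the time series the view refers to.
-
-In addition, for an alias series, to obtain the tags, attributes, and other information of the underlying time series, the user must first look up the mapping of the view column to find the corresponding time series and then query that time series. How to look up the mapping of view columns is described in Section 3.5.
-
-
-### 3.3 Modifying Views
-
-The supported modifications are: changing the computation logic, changing tags/attributes, and deletion.
-
-#### Changing the data source of a view
-
-```SQL
-ALTER VIEW root.view.device.status
-AS
-    SELECT s01
-    FROM root.ln.wf.d01
-```
-
-#### Changing the computation logic of a view
-
-```SQL
-ALTER VIEW root.db.d01.avg_temperature
-AS
-    SELECT (temperature01 + temperature02 + temperature03) / 3
-    FROM root.db.d01
-```
-
-#### Tag management
-
-- Add new tags
-
-```SQL
-ALTER view root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4
-```
-
-- Add new attributes
-
-```SQL
-ALTER view root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4
-```
-
-- Rename a tag or attribute
-
-```SQL
-ALTER view root.turbine.d1.s1 RENAME tag1 TO newTag1
-```
-
-- Reset the value of a tag or attribute
-
-```SQL
-ALTER view root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1
-```
-
-- Drop existing tags or attributes
-
-```SQL
-ALTER view root.turbine.d1.s1 DROP tag1, tag2
-```
-
-- Upsert tags and attributes
-
-> If the tag or attribute does not exist, it is inserted; otherwise the old value is replaced with the new one
-
-```SQL
-ALTER view root.turbine.d1.s1 UPSERT TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4)
-```
-
-#### Deleting a view
-
-Because a view is a series, it can be deleted in the same way as a time series.
-
-```SQL
-DELETE VIEW root.view.device.avg_temperature
-```
-
-### 3.4 View Synchronization
-
-#### If the source series is deleted
-
-When a series view is queried (at series resolution time) and the time series it depends on does not exist, **an empty result set is returned**.
-
-This is similar to querying a series that does not exist, with one difference: if the source time series cannot be resolved, the empty result set still contains the header, as a hint that the view has a problem.
-
-In addition, when a source time series is deleted, the system does not check whether any view depends on it, and the user receives no warning.
-
-#### Writes to non-alias series are not supported
-
-Writing to a non-alias series is not supported.
-
-See Restriction 2 in Section 2.5 for details.
-
-#### Series metadata is not shared
-
-See Restriction 5 in Section 2.5 for details.
-
-### 3.5 Querying View Metadata
-
-Querying view metadata means querying the metadata of the view itself (for example, how many columns it has) and information about the views in the database (for example, which views exist).
-
-#### Viewing the current view columns
-
-There are two ways to query:
-
-1. Use `SHOW TIMESERIES`. The result contains both time series and series views, but shows only some of the attributes of a view
-2. 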
使用`SHOW VIEW`进行查询,该查询只包含序列视图。能完整显示序列视图的属性。 - -举例: - -```Shell -IoTDB> show timeseries; -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.device.s01 | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.view.status | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp01 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp02 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.avg_temp| null| root.db| FLOAT| null| null|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -Total line number = 5 -It costs 0.789s -IoTDB> -``` - -最后一列`ViewType`中显示了该序列的类型,时间序列为BASE,序列视图是VIEW。 - -此外,某些序列视图的属性会缺失,比如`root.db.d01.avg_temp`是由温度均值计算得来,所以`Encoding`和`Compression`属性都为空值。 - -此外,`SHOW TIMESERIES`语句的查询结果主要分为两部分: - -1. 时序数据的信息,例如数据类型,压缩方式,编码等 -2. 
其他元数据信息,例如tag,attribute,所属database等 - -对于序列视图,展示的时序数据信息与其原始序列一致或者为空值(比如计算得到的平均温度有数据类型但是无压缩方式);展示的元数据信息则是视图的内容。 - -如果要得知视图的更多信息,需要使用`SHOW ``VIEW`。`SHOW ``VIEW`中展示视图的数据来源等。 - -```Shell -IoTDB> show VIEW root.**; -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -| Timeseries|Database|DataType|Tags|Attributes|ViewType| SOURCE| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.view.status | root.db| INT32|null| null| VIEW| root.db.device.s01| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.d01.avg_temp| root.db| FLOAT|null| null| VIEW|(root.db.d01.temp01+root.db.d01.temp02)/2| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -Total line number = 2 -It costs 0.789s -IoTDB> -``` - -最后一列`SOURCE`显示了该序列视图的数据来源,列出了创建该序列的SQL语句。 - -##### 关于数据类型 - -上述两种查询都涉及视图的数据类型。视图的数据类型是根据定义视图的查询语句或别名序列的原始时间序列类型推断出来的。这个数据类型是根据当前系统的状态实时计算出来的,因此在不同时刻查询到的数据类型可能是改变的。 - -## 4. FAQ - -#### Q1:我想让视图实现类型转换的功能。例如,原有一个int32类型的时间序列,和其他int64类型的序列被放在了同一个视图中。我现在希望通过视图查询到的数据,都能自动转换为int64类型。 - -> Ans:这不是序列视图的职能范围。但是可以使用`CAST`进行转换,比如: - -```SQL -CREATE VIEW root.db.device.int64_status -AS - SELECT CAST(s1, 'type'='INT64') from root.db.device -``` - -> 这样,查询`root.view.status`时,就会得到int64类型的结果。 -> -> 请特别注意,上述例子中,序列视图的数据是通过`CAST`转换得到的,因此`root.db.device.int64_status`并不是一条别名序列,也就**不支持写入**。 - -#### Q2:是否支持默认命名?选择若干时间序列,创建视图;但是我不指定每条序列的名字,由数据库自动命名? - -> Ans:不支持。用户必须明确指定命名。 - -#### Q3:在原有体系中,创建时间序列`root.db.device.s01`,可以发现自动创建了database`root.db`,自动创建了device`root.db.device`。接着删除时间序列`root.db.device.s01`,可以发现`root.db.device`被自动删除,`root.db`却还是保留的。对于创建视图,会沿用这一机制吗?出于什么考虑呢? - -> Ans:保持原有的行为不变,引入视图功能不会改变原有的这些逻辑。 - -#### Q4:是否支持序列视图重命名? 
- -> Ans: Renaming is not supported in the current version. You can create a view with the new name and use it instead. \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/User-Manual/Maintenance-statement_timecho.md b/src/zh/UserGuide/Master/Tree/User-Manual/Maintenance-statement_timecho.md deleted file mode 100644 index 41a4e4b56..000000000 --- a/src/zh/UserGuide/Master/Tree/User-Manual/Maintenance-statement_timecho.md +++ /dev/null @@ -1,720 +0,0 @@ - -# Maintenance Statements - -## 1. Status Checking - -### 1.1 View the Connected Model - -**Description**: Returns the sql_dialect of the current connection, either tree model or table model. - -#### Syntax: - -```SQL -showCurrentSqlDialectStatement - : SHOW CURRENT_SQL_DIALECT - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW CURRENT_SQL_DIALECT -``` - -Result: - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TREE| -+-----------------+ -``` - -### 1.2 View the Cluster Version - -**Description**: Returns the version of the current cluster. - -#### Syntax: - -```SQL -showVersionStatement - : SHOW VERSION - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW VERSION -``` - -Result: - -```SQL -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.1.2| 1ca4008| -+-------+---------+ -``` - -### 1.3 View Key Cluster Parameters - -**Description**: Returns the key parameters of the current cluster. - -#### Syntax: - -```SQL -showVariablesStatement - : SHOW VARIABLES - ; -``` - -The key parameters are: - -1. **ClusterName**: The name of the current cluster. -2. **DataReplicationFactor**: The number of data replicas, that is, the replica count of each data partition (DataRegion). -3. **SchemaReplicationFactor**: The number of schema replicas, that is, the replica count of each schema partition (SchemaRegion). -4. **DataRegionConsensusProtocolClass**: The consensus protocol class used by data partitions (DataRegion). -5. **SchemaRegionConsensusProtocolClass**: The consensus protocol class used by schema partitions (SchemaRegion). -6. **ConfigNodeConsensusProtocolClass**: The consensus protocol class used by ConfigNodes. -7. **TimePartitionOrigin**: The starting timestamp of database time partitioning. -8. **TimePartitionInterval**: The time partition interval of the database (unit: milliseconds). -9. **ReadConsistencyLevel**: The consistency level of read operations. -10. **SchemaRegionPerDataNode**: The number of schema partitions (SchemaRegion) on a DataNode. -11. **DataRegionPerDataNode**: The number of data partitions (DataRegion) on a DataNode. -12. **SeriesSlotNum**: The number of series slots (SeriesSlot) of a data partition (DataRegion). -13. **SeriesSlotExecutorClass**: The implementation class of the series slot. -14. **DiskSpaceWarningThreshold**: The disk space warning threshold, expressed as a fraction of disk space (for example, 0.05 means 5%). -15. 
**TimestampPrecision**: The timestamp precision. - -#### Example: - -```SQL -IoTDB> SHOW VARIABLES -``` - -Result: - -```SQL -+----------------------------------+-----------------------------------------------------------------+ -| Variable| Value| -+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - -### 1.4 View the Current Database Time - -**Description**: Returns the current time of the database. - -#### Syntax: - -```SQL -showCurrentTimestampStatement - : SHOW CURRENT_TIMESTAMP - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW CURRENT_TIMESTAMP -``` - -Result: - -```SQL -+-----------------------------+ -| CurrentTimestamp| -+-----------------------------+ -|2025-02-17T11:11:52.987+08:00| -+-----------------------------+ -``` - -### 1.5 View Running Queries - -**Description**: Displays information about all queries that are currently running. - -#### Syntax: - -```SQL -showQueriesStatement - : SHOW (QUERIES | QUERY PROCESSLIST) - (WHERE where=booleanExpression)? - (ORDER BY sortItem (',' sortItem)*)? - limitOffsetClause - ; -``` - -**Parameters**: - -1. **WHERE** clause: the column being filtered must exist in the result set -2. **ORDER BY** clause: the `sortKey` must be a column that exists in the result set -3. **limitOffsetClause**: - - **Description**: Limits the number of rows returned in the result set. - - **Format**: `LIMIT <offset>, <row_count>`, where `<offset>` is the offset and `<row_count>` is the number of rows to return. -4. 
Columns of the **QUERIES** table: - - **time**: The timestamp at which the query started, in the same precision as the system timestamp precision - - **queryid**: The ID of the query statement - - **datanodeid**: The ID of the DataNode that issued the query - - **elapsedtime**: The execution time of the query, in seconds - - **statement**: The SQL text of the query - - -#### Example: - -```SQL -IoTDB> SHOW QUERIES WHERE elapsedtime > 0.003 -``` - -Result: - -```SQL -+-----------------------------+-----------------------+----------+-----------+--------------------------------------+ -| Time| QueryId|DataNodeId|ElapsedTime| Statement| -+-----------------------------+-----------------------+----------+-----------+--------------------------------------+ -|2025-05-09T15:16:01.293+08:00|20250509_071601_00015_1| 1| 0.006|SHOW QUERIES WHERE elapsedtime > 0.003| -+-----------------------------+-----------------------+----------+-----------+--------------------------------------+ -``` - - -### 1.6 View Region Information - -**Description**: Returns the region information of the current cluster. - -#### Syntax: - -```SQL -showRegionsStatement - : SHOW REGIONS - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW REGIONS -``` - -Result: - -```SQL -+--------+------------+-------+-------------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -|RegionId| Type| Status| Database|SeriesSlotNum|TimeSlotNum|DataNodeId|RpcAddress|RpcPort|InternalAddress| Role| CreateTime|TsFileSize| -+--------+------------+-------+-------------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -| 9|SchemaRegion|Running|root.__system| 21| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.555| | -| 10| DataRegion|Running|root.__system| 21| 21| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.556| 8.27 KB| -| 65|SchemaRegion|Running| root.ln| 1| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-25T14:46:50.113| | -| 66| DataRegion|Running| root.ln| 1| 1| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-25T14:46:50.425| 524 B| 
-+--------+------------+-------+-------------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -``` - -### 1.7 View Available Nodes - -**Description**: Returns the RPC addresses and ports of all available DataNodes in the current cluster. Note: "available" here means any DataNode that is not in the REMOVING state. - -> Supported since V2.0.8 - -#### Syntax: - -```SQL -showAvailableUrlsStatement - : SHOW AVAILABLE URLS - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW AVAILABLE URLS -``` - -Result: - -```SQL -+----------+-------+ -|RpcAddress|RpcPort| -+----------+-------+ -| 0.0.0.0| 6667| -+----------+-------+ -``` - -### 1.8 View Service Information - -**Description**: Returns the service information (MQTT service, REST service) of every DataNode in the current cluster that is working normally (RUNNING or READ-ONLY). - -> Supported since V2.0.8.2 - -#### Syntax: - -```SQL -showServicesStatement - : SHOW SERVICES - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW SERVICES -IoTDB> SHOW SERVICES ON 1 -``` - -Result: - -```SQL -+------------+-----------+-------+ -|service_name|datanode_id| state| -+------------+-----------+-------+ -| MQTT| 1|STOPPED| -| REST| 1|RUNNING| -+------------+-----------+-------+ -``` - - -### 1.9 View the Cluster Activation Status - -**Description**: Returns the activation status of the current cluster. - -#### Syntax: - -```SQL -showActivationStatement - : SHOW ACTIVATION - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW ACTIVATION -``` - -Result: - -```SQL -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - - -### 1.10 View Disk Space Usage - -**Description**: Returns the disk space usage for the specified pattern, including the ChunkGroup size and the Metadata size. - -Note: The statistics are based on the actual size of the data in TsFiles, so deletions recorded in mods files are not taken into account. - -> Supported since V2.0.9.1 - -#### Syntax: - -```SQL -showDiskUsageStatement - : SHOW DISK_USAGE FROM pathPattern - whereClause? - orderByClause? - rowPaginationClause? 
- ; -pathPattern - : ROOT (DOT nodeName)* - ; -``` - -Note: The pattern is used to match devices. It must start with root, and intermediate path nodes may be * or **. - -#### Result Set - -| Column | Type | Description | -| --------------- | -------- | -------------------- | -| Database | string | Database name | -| DataNodeId | int32 | DataNode ID | -| RegionId | int32 | Region ID | -| TimePartition | int64 | Time partition ID | -| SizeInBytes | int64 | Disk space used (bytes) | - -#### Example: - -```SQL -SHOW DISK_USAGE FROM root.ln.**; -``` - -Result: - -```Bash -+--------+----------+--------+-------------+-----------+ -|Database|DataNodeId|RegionId|TimePartition|SizeInBytes| -+--------+----------+--------+-------------+-----------+ -| root.ln| 1| 13| 2932| 203| -+--------+----------+--------+-------------+-----------+ -``` - - -## 2. Status Settings - -### 2.1 Set the Connected Model - -**Description**: Sets the sql_dialect of the current connection to tree model or table model. This command can be used in both the tree model and the table model. - -#### Syntax: - -```SQL -SET SQL_DIALECT EQ (TABLE | TREE) -``` - -#### Example: - -```SQL -IoTDB> SET SQL_DIALECT=TREE -IoTDB> SHOW CURRENT_SQL_DIALECT -``` - -Result: - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TREE| -+-----------------+ -``` - -### 2.2 Update Configuration Items - -**Description**: Updates configuration items. After execution the configuration is hot-reloaded, and items that support hot modification take effect immediately. - -#### Syntax: - -```SQL -setConfigurationStatement - : SET CONFIGURATION propertyAssignments (ON INTEGER_VALUE)? - ; - -propertyAssignments - : property (',' property)* - ; - -property - : identifier EQ propertyValue - ; - -propertyValue - : DEFAULT - | expression - ; -``` - -**Parameters**: - -1. **propertyAssignments** - - **Description**: The list of configuration items to update, made up of one or more `property` entries. - - Multiple configuration items can be updated at once, separated by commas. - - **Values**: - - `DEFAULT`: Restores the configuration item to its default value. - - `expression`: A specific value, which must be a string. -2. 
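**ON INTEGER_VALUE** - - **Description**: The ID of the node whose configuration is to be updated. - - **Optional**: Yes. If it is omitted or the value is less than 0, the configuration of all ConfigNodes and DataNodes is updated. - -#### Example: - -```SQL -IoTDB> SET CONFIGURATION 'disk_space_warning_threshold'='0.05','heartbeat_interval_in_ms'='1000' ON 1; -``` - -### 2.3 Load a Manually Modified Configuration File - -**Description**: Reads a manually modified configuration file and hot-reloads the configuration items. Items that support hot modification take effect immediately. - -#### Syntax: - -```SQL -loadConfigurationStatement - : LOAD CONFIGURATION localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** - - **Description**: The scope of the configuration hot reload. - - **Optional**: Yes. The default is `CLUSTER`. - - **Values**: - - `LOCAL`: Hot-reloads the configuration only on the DataNode the client is directly connected to. - - `CLUSTER`: Hot-reloads the configuration on all DataNodes in the cluster. - -#### Example: - -```SQL -IoTDB> LOAD CONFIGURATION ON LOCAL; -``` - -### 2.4 Set the System Status - -**Description**: Sets the status of the system. - -#### Syntax: - -```SQL -setSystemStatusStatement - : SET SYSTEM TO (READONLY | RUNNING) localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **RUNNING | READONLY** - - **Description**: The new status of the system. - - **Values**: - - `RUNNING`: Sets the system to the running state, in which both reads and writes are allowed. - - `READONLY`: Sets the system to the read-only state, in which reads are allowed and writes are rejected. -2. **localOrClusterMode** - - **Description**: The scope of the status change. - - **Optional**: Yes. The default is `CLUSTER`. - - **Values**: - - `LOCAL`: Takes effect only on the DataNode the client is directly connected to. - - `CLUSTER`: Takes effect on all DataNodes in the cluster. - -#### Example: - -```SQL -IoTDB> SET SYSTEM TO READONLY ON CLUSTER; -``` - - -## 3. Data Management - -### 3.1 Flush MemTable Data to Disk - -**Description**: Flushes the data in MemTables to disk. - -#### Syntax: - -```SQL -flushStatement - : FLUSH identifier? (',' identifier)* booleanValue? localOrClusterMode? - ; - -booleanValue - : TRUE | FALSE - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **identifier** - - **Description**: The name of the path to flush. - - **Optional**: Yes. If omitted, all paths are flushed. - - **Multiple paths**: Several path names can be given, separated by commas, for example `FLUSH root.ln, root.lnm`. -2. **booleanValue** - - **Description**: Specifies what to flush. - - **Optional**: Yes. If omitted, the MemTables of both the sequence space and the unsequence space are flushed. - - **Values**: - - `TRUE`: Flushes only the MemTables of the sequence space. - - `FALSE`: Flushes only the MemTables of the unsequence space. -3. 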
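**localOrClusterMode** - - **Description**: The scope of the flush. - - **Optional**: Yes. The default is `CLUSTER`. - - **Values**: - - `ON LOCAL`: Flushes only the MemTables on the DataNode the client is directly connected to. - - `ON CLUSTER`: Flushes the MemTables on all DataNodes in the cluster. - -#### Example: - -```SQL -IoTDB> FLUSH root.ln TRUE ON LOCAL; -``` - -## 4. Data Repair - -### 4.1 Start the Background TsFile Scan-and-Repair Task - -**Description**: Starts a background task that scans and repairs TsFiles. It can repair out-of-order timestamp errors inside data files. - -#### Syntax: - -```SQL -startRepairDataStatement - : START REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** - - **Description**: The scope of the data repair. - - **Optional**: Yes. The default is `CLUSTER`. - - **Values**: - - `ON LOCAL`: Runs only on the DataNode the client is directly connected to. - - `ON CLUSTER`: Runs on all DataNodes in the cluster. - -#### Example: - -```SQL -IoTDB> START REPAIR DATA ON CLUSTER; -``` - -### 4.2 Pause the Background TsFile Repair Task - -**Description**: Pauses the background repair task. A paused task can be resumed by running the start repair data command again. - -#### Syntax: - -```SQL -stopRepairDataStatement - : STOP REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameters**: - -1. **localOrClusterMode** - - **Description**: The scope of the data repair. - - **Optional**: Yes. The default is `CLUSTER`. - - **Values**: - - `ON LOCAL`: Runs only on the DataNode the client is directly connected to. - - `ON CLUSTER`: Runs on all DataNodes in the cluster. - -#### Example: - -```SQL -IoTDB> STOP REPAIR DATA ON CLUSTER; -``` - -## 5. Terminating Queries - -### 5.1 Actively Terminate a Query - -**Description**: Use this command to actively terminate queries. - -#### Syntax: - -```SQL -killQueryStatement - : KILL (QUERY queryId=string | ALL QUERIES) - ; -``` - -**Parameters**: - -1. **QUERY queryId=string** - - **Description**: The ID of the query to terminate. `queryId` is the unique identifier of a running query. - - **Obtaining the query ID**: The `SHOW QUERIES` command lists all running queries together with their IDs. -2. **ALL QUERIES** - - **Description**: Terminates all running queries. - -#### Example: - -Specifying a `queryId` terminates that query. To obtain the ID of a running query, use the show queries command, which lists all queries currently running. - -```SQL -IoTDB> KILL QUERY 20250108_101015_00000_1; -- terminate the specified query -IoTDB> KILL ALL QUERIES; -- terminate all queries -``` - - - -## 6. Debugging Queries - -### 6.1 DEBUG SQL - - -**Description:** Adding the debug keyword at the beginning of a SQL query makes its execution write debug logs, including scan information for the underlying files involved. - -> Supported since V2.0.9 - -#### Syntax: - -```SQL -debugSQLStatement - : DEBUG ? 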
query - ; -``` - -**Notes:** - -* The log is written to: `logs/log_datanode_query_debug.log` - -#### Example: - -1. Run the following SQL as a DEBUG query - -```SQL -debug select * from root.ln.**; -``` - -2. Inspect `log_datanode_query_debug.log` to see the file scan information involved in the query. - -```Bash -2026-03-24 10:06:18,755 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:159 - Cache miss: root.ln.wf01.wt01.temperature in file: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile -2026-03-24 10:06:18,757 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:160 - Device: root.ln.wf01.wt01, all sensors: [temperature] -2026-03-24 10:06:18,758 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.BloomFilterCache:110 - get bloomFilter from cache where filePath is: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile -2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:227 - Get timeseries: root.ln.wf01.wt01.temperature metadata in file: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile from cache: TimeseriesMetadata{timeSeriesMetadataType=0, chunkMetaDataListDataSize=8, measurementId='temperature', dataType=DOUBLE, statistics=startTime: 1773824951259 endTime: 1773824951259 count: 1 [minValue:12.9,maxValue:12.9,firstValue:12.9,lastValue:12.9,sumValue:12.9], modified=false, isSeq=true, chunkMetadataList=[measurementId: temperature, datatype: DOUBLE, version: 0, Statistics: startTime: 1773824951259 endTime: 1773824951259 count: 1 [minValue:12.9,maxValue:12.9,firstValue:12.9,lastValue:12.9,sumValue:12.9], deleteIntervalList: null]}. 
-2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskChunkMetadataLoader:97 - Modifications size is 0 for file Path: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile -2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskChunkMetadataLoader:109 - After modification Chunk meta data list is: -2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskChunkMetadataLoader:110 - measurementId: temperature, datatype: DOUBLE, version: 0, Statistics: startTime: 1773824951259 endTime: 1773824951259 count: 1 [minValue:12.9,maxValue:12.9,firstValue:12.9,lastValue:12.9,sumValue:12.9], deleteIntervalList: null -2026-03-24 10:06:18,760 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.ChunkCache:167 - get chunk from cache whose key is: ChunkCacheKey{filePath='/home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile', regionId=13, timePartitionId=2932, tsFileVersion=1, compactionVersion=0, offsetOfChunkHeader=27} -2026-03-24 10:06:18,761 [pool-69-IoTDB-ClientRPC-Processor-1$20260324_020618_00052_1] INFO o.a.i.d.q.p.Coordinator:902 - debug select * from root.ln.** -``` \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/User-Manual/Streaming_timecho.md b/src/zh/UserGuide/Master/Tree/User-Manual/Streaming_timecho.md deleted file mode 100644 index a5944ef7a..000000000 --- a/src/zh/UserGuide/Master/Tree/User-Manual/Streaming_timecho.md +++ /dev/null @@ -1,858 +0,0 @@ - - -# Stream Processing Framework - -The IoTDB stream processing framework lets users implement custom stream processing logic: listening for and capturing changes in the storage engine, transforming the changed data, and pushing the transformed data outward. - -A data stream processing task is called a Pipe. A stream processing task (Pipe) consists of three subtasks: - -- Extract (Source) -- Process (Process) -- Send (Sink) - -The framework lets users write the logic of the three subtasks in Java and process data in a UDF-like manner. -Within a Pipe, the three subtasks are carried out by three kinds of plugins, and data passes through them in order: -the Pipe Source extracts data, the Pipe Processor processes data, and the Pipe Sink 
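sends data, which is ultimately delivered to an external system. - -**The model of a Pipe task is as follows:** - -![Task model diagram](/img/1706697228308.jpg) - -Describing a data stream processing task essentially means describing the properties of the Pipe Source, Pipe Processor, and Pipe Sink plugins. -Users can configure the properties of the three subtasks declaratively through SQL statements and combine different properties to achieve flexible data ETL. - -With the stream processing framework, a complete data pipeline can be built to meet needs such as *edge-to-cloud synchronization, off-site disaster recovery, and splitting read and write load across databases*. - -## 1. Developing Custom Stream Processing Plugins - -### 1.1 Development Dependencies - -We recommend building the project with Maven and adding the following dependency to `pom.xml`. Make sure the dependency version matches the version of the IoTDB server. - -```xml -<dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>pipe-api</artifactId> - <version>1.3.1</version> - <scope>provided</scope> -</dependency> -``` - -### 1.2 Event-Driven Programming Model - -The user programming interface of stream processing plugins follows the general design of the event-driven programming model. An event (Event) is the data abstraction in the user programming interface. The interface is decoupled from the concrete execution mode, so the user only needs to describe how the system should handle an event (data) once it arrives. - -In this interface, an event is an abstraction of a database write operation. Events are captured by the single-node stream processing engine and passed, following the three stages of stream processing, to the PipeSource plugin, the PipeProcessor plugin, and the PipeSink plugin in turn, triggering the user logic in each. - -To achieve both low latency under light load and high throughput under heavy load on edge devices, the stream processing engine dynamically chooses between operation logs and data files as its processing target. The user programming interface therefore requires handling logic for two kinds of events: the operation log write event TabletInsertionEvent and the data file write event TsFileInsertionEvent. - -#### **Operation Log Write Event (TabletInsertionEvent)** - -The operation log write event (TabletInsertionEvent) is a high-level abstraction of a user write request. Through a unified interface, it lets users manipulate the underlying data of the write request. - -The underlying storage structure behind an operation log write event differs by deployment mode. In a standalone deployment, the event wraps a write-ahead log (WAL) entry; in a distributed deployment, it wraps a consensus protocol operation log entry on a single node. - -The request structure behind the event also differs by write interface. IoTDB provides many write interfaces, such as InsertRecord, InsertRecords, InsertTablet, and InsertTablets, and each kind of request is serialized in a completely different way, producing different binary entries. - -The operation log write event gives users a unified view for operating on data. It hides the differences in the underlying data structures, which greatly lowers the programming barrier and makes the feature easier to use. - -```java -/** TabletInsertionEvent is used to define the event of data insertion. */ -public interface TabletInsertionEvent extends Event { - - /** - * The consumer processes the data row by row and collects the results by RowCollector. - * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processRowByRow(BiConsumer<Row, RowCollector> consumer); - - /** - * The consumer processes the Tablet directly and collects the results by RowCollector. 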
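- * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processTablet(BiConsumer<Tablet, RowCollector> consumer); -} -``` - -#### **Data File Write Event (TsFileInsertionEvent)** - -The data file write event (TsFileInsertionEvent) is a high-level abstraction of the database flushing a file to disk. It is a collection of the data of several operation log write events (TabletInsertionEvent). - -The IoTDB storage engine is LSM-structured. On a write, the operation is first persisted to a log-structured file while the written data is kept in memory. When memory reaches its limit, a flush is triggered: the in-memory data is converted into a database file and the previously written operation log is deleted. During this conversion the data goes through two rounds of compression, encoding compression and general-purpose compression, so the data in a database file takes less space than the original data in memory. - -Under extreme network conditions, transferring data files directly is more economical than transferring write operations: it uses less network bandwidth and achieves faster transfer. There is no free lunch, however. Computing on data in files incurs extra file I/O compared with computing directly on data in memory. Because the on-disk data file and the in-memory write operation each have strengths and weaknesses, the system has room to trade one off against the other dynamically, and this observation is why the data file write event was introduced into the plugin event model. - -In summary, a data file write event appears in the event stream of a stream processing plugin in two cases: - -(1) Historical data extraction: before a stream processing task starts, all written data that has already been persisted exists as TsFiles. When the task starts and collects historical data, that data is abstracted as TsFileInsertionEvent; - -(2) Real-time data extraction: while a stream processing task is running, if operation log write events are processed more slowly than write requests arrive and the lag exceeds a certain amount, the unprocessed operation log write events are persisted to disk as TsFiles. When the stream processing engine extracts this data, it is abstracted as TsFileInsertionEvent. - -```java -/** - * TsFileInsertionEvent is used to define the event of writing TsFile. Event data stores in disks, - * which is compressed and encoded, and requires IO cost for computational processing. - */ -public interface TsFileInsertionEvent extends Event { - - /** - * The method is used to convert the TsFileInsertionEvent into several TabletInsertionEvents. - * - * @return {@code Iterable<TabletInsertionEvent>} the list of TabletInsertionEvent - */ - Iterable<TabletInsertionEvent> toTabletInsertionEvents(); -} -``` - -### 1.3 Custom Stream Processing Plugin Programming Interfaces - -With the custom plugin programming interfaces, users can easily write data extraction plugins, data processing plugins, and data sending plugins, so that stream processing adapts flexibly to a variety of industrial scenarios. - -#### Data Extraction Plugin Interface - -Data extraction is the first of the three stages of stream processing, which run from extraction to sending. The data extraction plugin (PipeSource) is the bridge between the stream processing engine and the storage engine: it listens to the behavior of the storage engine and captures the various data write events. - -```java -/** - * PipeSource - * - *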

PipeSource is responsible for capturing events from sources. - * - *

Various data sources can be supported by implementing different PipeSource classes. - * - *

The lifecycle of a PipeSource is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH SOURCE` clause in SQL are - * parsed and the validation method {@link PipeSource#validate(PipeParameterValidator)} will - * be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} will be called to - * configure the runtime behavior of the PipeSource. - *
  • Then the method {@link PipeSource#start()} will be called to start the PipeSource. - *
  • While the collaboration task is in progress, the method {@link PipeSource#supply()} will be - * called to capture events from sources and then the events will be passed to the - * PipeProcessor. - *
  • The method {@link PipeSource#close()} will be called when the collaboration task is - * cancelled (the `DROP PIPE` command is executed). - *
- */ -public interface PipeSource extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSource. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeSourceRuntimeConfiguration. - *
- * - *

This method is called after the method {@link PipeSource#validate(PipeParameterValidator)} - * is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSource - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSourceRuntimeConfiguration configuration) - throws Exception; - - /** - * Start the Source. After this method is called, events should be ready to be supplied by - * {@link PipeSource#supply()}. This method is called after {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @throws Exception the user can throw errors if necessary - */ - void start() throws Exception; - - /** - * Supply a single event from the Source and the caller will send the event to the processor. - * This method is called after {@link PipeSource#start()} is called. - * - * @return the event to be supplied. The event may be null if the Source has no more events at - * the moment, but the Source is still running for more events. - * @throws Exception the user can throw errors if necessary - */ - Event supply() throws Exception; -} -``` - -#### Data Processing Plugin Interface - -Data processing is the second of the three stages of stream processing, which run from extraction to sending. The data processing plugin (PipeProcessor) is mainly used to filter and transform the events captured by the data extraction plugin (PipeSource). - -```java -/** - * PipeProcessor - * - *

PipeProcessor is used to filter and transform the Event formed by the PipeSource. - * - *

The lifecycle of a PipeProcessor is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH PROCESSOR` clause in SQL are - * parsed and the validation method {@link PipeProcessor#validate(PipeParameterValidator)} - * will be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} will be called - * to configure the runtime behavior of the PipeProcessor. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeSource captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the events and then passes them to the PipeSink. The - * following 3 methods will be called: {@link - * PipeProcessor#process(TabletInsertionEvent, EventCollector)}, {@link - * PipeProcessor#process(TsFileInsertionEvent, EventCollector)} and {@link - * PipeProcessor#process(Event, EventCollector)}. - *
    • PipeSink serializes the events into binaries and sends them to sinks. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeProcessor#close() } method will be called. - *
- */ -public interface PipeProcessor extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeProcessor. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeProcessorRuntimeConfiguration. - *
- * - *

This method is called after the method {@link - * PipeProcessor#validate(PipeParameterValidator)} is called and before the beginning of the - * events processing. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeProcessor - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeProcessorRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is called to process the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(TabletInsertionEvent tabletInsertionEvent, EventCollector eventCollector) - throws Exception; - - /** - * This method is called to process the TsFileInsertionEvent. - * - * @param tsFileInsertionEvent TsFileInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - default void process(TsFileInsertionEvent tsFileInsertionEvent, EventCollector eventCollector) - throws Exception { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - process(tabletInsertionEvent, eventCollector); - } - } - - /** - * This method is called to process the Event. - * - * @param event Event to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(Event event, EventCollector eventCollector) throws Exception; -} -``` - -#### Data Sending Plugin Interface - -Data sending is the third of the three stages of stream processing, which run from extraction to sending. The data sending plugin (PipeSink) is mainly used to send the events that have been processed by the data processing plugin (PipeProcessor). As the network implementation layer of the stream processing framework, its interface should allow multiple real-time communication protocols and multiple connectors to be plugged in. - -```java -/** - * PipeSink - * - *

PipeSink is responsible for sending events to sinks. - * - *

Various network protocols can be supported by implementing different PipeSink classes. - * - *

The lifecycle of a PipeSink is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH SINK` clause in SQL are - * parsed and the validation method {@link PipeSink#validate(PipeParameterValidator)} will be - * called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link PipeSink#customize(PipeParameters, - * PipeSinkRuntimeConfiguration)} will be called to configure the runtime behavior of the - * PipeSink and the method {@link PipeSink#handshake()} will be called to create a connection - * with sink. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeSource captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the events and then passes them to the PipeSink. - *
    • PipeSink serializes the events into binaries and sends them to sinks. The following 3 - * methods will be called: {@link PipeSink#transfer(TabletInsertionEvent)}, {@link - * PipeSink#transfer(TsFileInsertionEvent)} and {@link PipeSink#transfer(Event)}. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeSink#close() } method will be called. - *
- * - *

In addition, the method {@link PipeSink#heartbeat()} will be called periodically to check - * whether the connection with sink is still alive. The method {@link PipeSink#handshake()} will be - * called to create a new connection with the sink when the method {@link PipeSink#heartbeat()} - * throws exceptions. - */ -public interface PipeSink extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSink. In this method, the user can do the following - * things: - * - *

    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeSinkRuntimeConfiguration. - *
- * - *

This method is called after the method {@link PipeSink#validate(PipeParameterValidator)} is - * called and before the method {@link PipeSink#handshake()} is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSink - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSinkRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is used to create a connection with sink. This method will be called after the - * method {@link PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called or - * will be called when the method {@link PipeSink#heartbeat()} throws exceptions. - * - * @throws Exception if the connection is failed to be created - */ - void handshake() throws Exception; - - /** - * This method will be called periodically to check whether the connection with sink is still - * alive. - * - * @throws Exception if the connection dies - */ - void heartbeat() throws Exception; - - /** - * This method is used to transfer the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(TabletInsertionEvent tabletInsertionEvent) throws Exception; - - /** - * This method is used to transfer the TsFileInsertionEvent. 
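- * - * @param tsFileInsertionEvent TsFileInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - default void transfer(TsFileInsertionEvent tsFileInsertionEvent) throws Exception { - try { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - transfer(tabletInsertionEvent); - } - } finally { - tsFileInsertionEvent.close(); - } - } - - /** - * This method is used to transfer the generic events, including HeartbeatEvent. - * - * @param event Event to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(Event event) throws Exception; -} -``` - -## 2. Managing Custom Stream Processing Plugins - -To keep user-defined plugins flexible and easy to use in production, the system must also be able to manage plugins dynamically and uniformly. -The plugin management statements described in this chapter are the entry point for that management. - -### 2.1 Load Plugin Statement - -To load a user-defined plugin into IoTDB dynamically, first implement a concrete plugin class based on PipeSource, PipeProcessor, or PipeSink, then compile and package the class into a JAR file, and finally load the plugin into IoTDB with the load plugin statement. - -The syntax of the load plugin statement is shown below. - -```sql -CREATE PIPEPLUGIN [IF NOT EXISTS] <alias> -AS <fully qualified class name> -USING -``` - -**IF NOT EXISTS semantics**: Used in create operations so that the create command runs only when the specified Pipe Plugin does not exist, which avoids an error from trying to create a Pipe Plugin that already exists. - -Example: Suppose a user has implemented a data processing plugin whose fully qualified class name is edu.tsinghua.iotdb.pipe.ExampleProcessor, packaged as pipe-plugin.jar, and wants to use it in the stream processing engine under the alias example. The plugin package can be provided in two ways: uploaded to a URI server, or placed in a local directory of the cluster. Either way works. - -[Method 1] Upload to a URI server - -Preparation: To register this way, upload the JAR package to a URI server in advance and make sure the IoTDB instance that runs the registration statement can reach that server, for example https://example.com:8080/iotdb/pipe-plugin.jar . - -Create statement: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI -``` - -[Method 2] Place in a local directory of the cluster - -Preparation: To register this way, place the JAR package in advance at any path on the machine hosting the DataNode. We recommend the /ext/pipe directory under the IoTDB installation path (it already exists in the installation package and does not need to be created), for example iotdb-1.x.x-bin/ext/pipe/pipe-plugin.jar. (**Note: in a cluster, the JAR package must be placed at this path on the machine of every DataNode.)** - -Create statement: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 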
'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI -``` - -### 2.2 删除插件语句 - -当用户不再想使用一个插件,需要将插件从系统中卸载时,可以使用如图所示的删除插件语句。 - -```sql -DROP PIPEPLUGIN [IF EXISTS] <别名> -``` - -**IF EXISTS 语义**:用于删除操作中,确保当指定 Pipe Plugin 存在时,执行删除命令,防止因尝试删除不存在的 Pipe Plugin 而导致报错。 - -### 2.3 查看插件语句 - -用户也可以按需查看系统中的插件。查看插件的语句如图所示。 - -```sql -SHOW PIPEPLUGINS -``` - -## 3. 系统预置的流处理插件 - -### 3.1 预置 source 插件 - -#### iotdb-source - -作用:抽取 IoTDB 内部的历史或实时数据进入 pipe。 - - -| key | value | value 取值范围 | required or optional with default | -|---------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------|-----------------------------------| -| source | iotdb-source | String: iotdb-source | required | -| source.pattern | 用于筛选时间序列的路径前缀 | String: 任意的时间序列前缀 | optional: root | -| source.history.start-time | 抽取的历史数据的开始 event time,包含 start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| source.history.end-time | 抽取的历史数据的结束 event time,包含 end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| start-time(V1.3.1+) | start of synchronizing all data event time,including start-time. Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| end-time(V1.3.1+) | end of synchronizing all data event time,including end-time. 
Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| source.realtime.mode | 实时数据的抽取模式 | String: hybrid, log, file | optional: hybrid | -| source.forwarding-pipe-requests | 是否抽取由其他 Pipe (通常是数据同步)写入的数据 | Boolean: true, false | optional: true | - -> 🚫 **source.pattern 参数说明** -> -> * Pattern 需用反引号修饰不合法字符或者是不合法路径节点,例如如果希望筛选 root.\`a@b\` 或者 root.\`123\`,应设置 pattern 为 root.\`a@b\` 或者 root.\`123\`(具体参考 [单双引号和反引号的使用时机](https://iotdb.apache.org/zh/Download/#_1-0-版本不兼容的语法详细说明)) -> * 在底层实现中,当检测到 pattern 为 root(默认值)时,抽取效率较高,其他任意格式都将降低性能 -> * 路径前缀不需要能够构成完整的路径。例如,当创建一个包含参数为 'source.pattern'='root.aligned.1' 的 pipe 时: - > - > * root.aligned.1TS - > * root.aligned.1TS.\`1\` -> * root.aligned.100T - > - > 的数据会被抽取; - > - > * root.aligned.\`1\` -> * root.aligned.\`123\` - > - > 的数据不会被抽取。 - -> ❗️**source.history 的 start-time,end-time 参数说明** -> -> * start-time,end-time 应为 ISO 格式,例如 2011-12-03T10:15:30 或 2011-12-03T10:15:30+01:00 - -> ✅ **一条数据从生产到落库 IoTDB,包含两个关键的时间概念** -> -> * **event time:** 数据实际生产时的时间(或者数据生产系统给数据赋予的生成时间,是数据点中的时间项),也称为事件时间。 -> * **arrival time:** 数据到达 IoTDB 系统内的时间。 -> -> 我们常说的乱序数据,指的是数据到达时,其 **event time** 远落后于当前系统时间(或者已经落库的最大 **event time**)的数据。另一方面,不论是乱序数据还是顺序数据,只要它们是新到达系统的,那它们的 **arrival time** 都是会随着数据到达 IoTDB 的顺序递增的。 - -> 💎 **iotdb-source 的工作可以拆分成两个阶段** -> -> 1. 历史数据抽取:所有 **arrival time** < 创建 pipe 时**当前系统时间**的数据称为历史数据 -> 2. 
实时数据抽取:所有 **arrival time** >= 创建 pipe 时**当前系统时间**的数据称为实时数据 -> -> 历史数据传输阶段和实时数据传输阶段,**两阶段串行执行,只有当历史数据传输阶段完成后,才执行实时数据传输阶段。** - -> 📌 **source.realtime.mode:数据抽取的模式** -> -> * log:该模式下,任务仅使用操作日志进行数据处理、发送 -> * file:该模式下,任务仅使用数据文件进行数据处理、发送 -> * hybrid:该模式,考虑了按操作日志逐条目发送数据时延迟低但吞吐低的特点,以及按数据文件批量发送时发送吞吐高但延迟高的特点,能够在不同的写入负载下自动切换适合的数据抽取方式,首先采取基于操作日志的数据抽取方式以保证低发送延迟,当产生数据积压时自动切换成基于数据文件的数据抽取方式以保证高发送吞吐,积压消除时自动切换回基于操作日志的数据抽取方式,避免了采用单一数据抽取算法难以平衡数据发送延迟或吞吐的问题。 - -> 🍕 **source.forwarding-pipe-requests:是否允许转发从另一 pipe 传输而来的数据** -> -> * 如果要使用 pipe 构建 A -> B -> C 的数据同步,那么 B -> C 的 pipe 需要将该参数为 true 后,A -> B 中 A 通过 pipe 写入 B 的数据才能被正确转发到 C -> * 如果要使用 pipe 构建 A \<-> B 的双向数据同步(双活),那么 A -> B 和 B -> A 的 pipe 都需要将该参数设置为 false,否则将会造成数据无休止的集群间循环转发 - -### 3.2 预置 processor 插件 - -#### do-nothing-processor - -作用:不对 source 传入的事件做任何的处理。 - - -| key | value | value 取值范围 | required or optional with default | -|-----------|----------------------|------------------------------|-----------------------------------| -| processor | do-nothing-processor | String: do-nothing-processor | required | - -### 3.3 预置 sink 插件 - -#### do-nothing-sink - -作用:不对 processor 传入的事件做任何的处理。 - - -| key | value | value 取值范围 | required or optional with default | -|------|-----------------|-------------------------|-----------------------------------| -| sink | do-nothing-sink | String: do-nothing-sink | required | - -## 4. 
流处理任务管理 - -### 4.1 创建流处理任务 - -使用 `CREATE PIPE` 语句来创建流处理任务。以数据同步流处理任务的创建为例,示例 SQL 语句如下: - -```sql -CREATE PIPE -- PipeId 是能够唯一标定流处理任务的名字 -WITH SOURCE ( - -- 默认的 IoTDB 数据抽取插件 - 'source' = 'iotdb-source', - -- 路径前缀,只有能够匹配该路径前缀的数据才会被抽取,用作后续的处理和发送 - 'source.pattern' = 'root.timecho', - -- 是否抽取历史数据 - 'source.history.enable' = 'true', - -- 描述被抽取的历史数据的时间范围,表示最早时间 - 'source.history.start-time' = '2011.12.03T10:15:30+01:00', - -- 描述被抽取的历史数据的时间范围,表示最晚时间 - 'source.history.end-time' = '2022.12.03T10:15:30+01:00', - -- 是否抽取实时数据 - 'source.realtime.enable' = 'true', - -- 描述实时数据的抽取方式 - 'source.realtime.mode' = 'hybrid', -) -WITH PROCESSOR ( - -- 默认的数据处理插件,即不做任何处理 - 'processor' = 'do-nothing-processor', -) -WITH SINK ( - -- IoTDB 数据发送插件,目标端为 IoTDB - 'sink' = 'iotdb-thrift-sink', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip - 'sink.ip' = '127.0.0.1', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port - 'sink.port' = '6667', -) -``` - -**创建流处理任务时需要配置 PipeId 以及三个插件部分的参数:** - - -| 配置项 | 说明 | 是否必填 | 默认实现 | 默认实现说明 | 是否允许自定义实现 | -|-----------|--------------------------------|---------------------------|----------------------|------------------------------|--------------------------| -| PipeId | 全局唯一标定一个流处理任务的名称 | 必填 | - | - | - | -| source | Pipe Source 插件,负责在数据库底层抽取流处理数据 | 选填 | iotdb-source | 将数据库的全量历史数据和后续到达的实时数据接入流处理任务 | 否 | -| processor | Pipe Processor 插件,负责处理数据 | 选填 | do-nothing-processor | 对传入的数据不做任何处理 | | -| sink | Pipe Sink 插件,负责发送数据 | 必填 | - | - | | - -示例中,使用了 iotdb-source、do-nothing-processor 和 iotdb-thrift-sink 插件构建数据流处理任务。IoTDB 还内置了其他的流处理插件,**请查看“系统预置流处理插件”一节**。 - -**一个最简的 CREATE PIPE 语句示例如下:** - -```sql -CREATE PIPE -- PipeId 是能够唯一标定流处理任务的名字 -WITH SINK ( - -- IoTDB 数据发送插件,目标端为 IoTDB - 'sink' = 'iotdb-thrift-sink', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip - 'sink.ip' = '127.0.0.1', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port - 'sink.port' = '6667', -) -``` - -其表达的语义是:将本数据库实例中的全量历史数据和后续到达的实时数据,同步到目标为 127.0.0.1:6667 的 IoTDB 实例上。 - -**注意:** - -- SOURCE 和 PROCESSOR 为选填配置,若不填写配置参数,系统则会采用相应的默认实现 -- 
SINK 为必填配置,需要在 CREATE PIPE 语句中声明式配置 -- SINK 具备自复用能力。对于不同的流处理任务,如果他们的 SINK 具备完全相同 KV 属性的(所有属性的 key 对应的 value 都相同),**那么系统最终只会创建一个 SINK 实例**,以实现对连接资源的复用。 - - - 例如,有下面 pipe1, pipe2 两个流处理任务的声明: - - ```sql - CREATE PIPE pipe1 - WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'sink.ip' = 'localhost', - 'sink.port' = '9999', - ) - - CREATE PIPE pipe2 - WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'sink.port' = '9999', - 'sink.ip' = 'localhost', - ) - ``` - - - 因为它们对 SINK 的声明完全相同(**即使某些属性声明时的顺序不同**),所以框架会自动对它们声明的 SINK 进行复用,最终 pipe1, pipe2 的 SINK 将会是同一个实例。 -- 在 source 为默认的 iotdb-source,且 source.forwarding-pipe-requests 为默认值 true 时,请不要构建出包含数据循环同步的应用场景(会导致无限循环): - - - IoTDB A -> IoTDB B -> IoTDB A - - IoTDB A -> IoTDB A - -### 4.2 启动流处理任务 - -CREATE PIPE 语句成功执行后,流处理任务相关实例会被创建,但整个流处理任务的运行状态会被置为 STOPPED,即流处理任务不会立刻处理数据(V1.3.0)。在 1.3.1 及以上的版本,流处理任务的运行状态在创建后将被立即置为 RUNNING。 - -可以使用 START PIPE 语句使流处理任务开始处理数据: - -```sql -START PIPE -``` - -### 4.3 停止流处理任务 - -使用 STOP PIPE 语句使流处理任务停止处理数据: - -```sql -STOP PIPE -``` - -### 4.4 删除流处理任务 - -使用 DROP PIPE 语句使流处理任务停止处理数据(当流处理任务状态为 RUNNING 时),然后删除整个流处理任务流处理任务: - -```sql -DROP PIPE -``` - -用户在删除流处理任务前,不需要执行 STOP 操作。 - -### 4.5 展示流处理任务 - -使用 SHOW PIPES 语句查看所有流处理任务: - -```sql -SHOW PIPES -``` - -查询结果如下: - -```sql -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -| ID| CreationTime | State|PipeSource|PipeProcessor|PipeSink|ExceptionMessage| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -|iotdb-kafka|2022-03-30T20:58:30.689|RUNNING| ...| ...| ...| {}| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -|iotdb-iotdb|2022-03-31T12:55:28.129|STOPPED| ...| ...| ...| TException: ...| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -``` - -可以使用 `` 指定想看的某个流处理任务状态: - -```sql -SHOW PIPE -``` - -您也可以通过 where 子句,判断某个 \ 使用的 Pipe Sink 被复用的情况。 - -```sql -SHOW PIPES -WHERE 
SINK USED BY -``` - -### 4.6 流处理任务运行状态迁移 - -一个流处理 pipe 在其的生命周期中会经过多种状态: - -- **RUNNING:** pipe 正在正常工作 - - 当一个 pipe 被成功创建之后,其初始状态为工作状态(V1.3.1+) -- **STOPPED:** pipe 处于停止运行状态。当管道处于该状态时,有如下几种可能: - - 当一个 pipe 被成功创建之后,其初始状态为暂停状态(V1.3.0) - - 用户手动将一个处于正常运行状态的 pipe 暂停,其状态会被动从 RUNNING 变为 STOPPED - - 当一个 pipe 运行过程中出现无法恢复的错误时,其状态会自动从 RUNNING 变为 STOPPED -- **DROPPED:** pipe 任务被永久删除 - -下图表明了所有状态以及状态的迁移: - -![状态迁移图](/img/%E7%8A%B6%E6%80%81%E8%BF%81%E7%A7%BB%E5%9B%BE.png) - -## 5. 权限管理 - -### 5.1 流处理任务 - - -| 权限名称 | 描述 | -|----------|---------------| -| USE_PIPE | 注册流处理任务。路径无关。 | -| USE_PIPE | 开启流处理任务。路径无关。 | -| USE_PIPE | 停止流处理任务。路径无关。 | -| USE_PIPE | 卸载流处理任务。路径无关。 | -| USE_PIPE | 查询流处理任务。路径无关。 | - -### 5.2 流处理任务插件 - - -| 权限名称 | 描述 | -|----------|-----------------| -| USE_PIPE | 注册流处理任务插件。路径无关。 | -| USE_PIPE | 卸载流处理任务插件。路径无关。 | -| USE_PIPE | 查询流处理任务插件。路径无关。 | - -## 6. 配置参数 - -在 iotdb-system.properties 中: - -V1.3.0+: -```Properties -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_connector_timeout_ms=900000 - -# The maximum number of selectors that can be used in the async connector. -# pipe_async_connector_selector_number=1 - -# The core number of clients that can be used in the async connector. 
-# pipe_async_connector_core_client_number=8 - -# The maximum number of clients that can be used in the async connector. -# pipe_async_connector_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` - -V1.3.1+: -```Properties -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. 
-# pipe_air_gap_receiver_port=9780 -``` diff --git a/src/zh/UserGuide/Master/Tree/User-Manual/Tiered-Storage_timecho.md b/src/zh/UserGuide/Master/Tree/User-Manual/Tiered-Storage_timecho.md deleted file mode 100644 index 86c183dc3..000000000 --- a/src/zh/UserGuide/Master/Tree/User-Manual/Tiered-Storage_timecho.md +++ /dev/null @@ -1,101 +0,0 @@ - - -# 多级存储 -## 1. 概述 - -多级存储功能向用户提供多种存储介质管理的能力,用户可以使用多级存储功能为 IoTDB 配置不同类型的存储介质,并为存储介质进行分级。具体的,在 IoTDB 中,多级存储的配置体现为多目录的管理。用户可以将多个存储目录归为同一类,作为一个“层级”向 IoTDB 中配置,这种“层级”我们称之为 storage tier;同时,用户可以根据数据的冷热进行分类,并将不同类别的数据存储到指定的“层级”中。当前 IoTDB 支持通过数据的 TTL 进行冷热数据的分类,当一个层级中的数据不满足当前层级定义的 TTL 规则时,该数据会被自动迁移至下一层级中。 - -## 2. 参数定义 - -在 IoTDB 中开启多级存储,需要进行以下几个方面的配置: - -1. 配置数据目录,并将数据目录分为不同的层级 -2. 配置每个层级所管理的数据的 TTL,以区分不同层级管理的冷热数据类别。 -3. 配置每个层级的最小剩余存储空间比例,当该层级的存储空间触发该阈值时,该层级的数据会被自动迁移至下一层级(可选)。 - -具体的参数定义及其描述如下。 - -| 配置项 | 默认值 | 是否必填 | 说明 | 约束 | -| --------------------------------------- | ------------------------ | --- | ------------------------------------------------------------ | ------------------------------------------------------------ | -| dn_data_dirs | data/datanode/data | 是 | 用来指定不同的存储目录,并将存储目录进行层级划分 | 每级存储使用分号分隔,单级内使用逗号分隔;云端配置只能作为最后一级存储且第一级不能作为云端存储;最多配置一个云端对象;远端存储目录使用 OBJECT_STORAGE 来表示 | -| tier_ttl_in_ms | -1 | 是 | 定义每个层级负责的数据范围,通过 TTL 表示 | 每级存储使用分号分隔;层级数量需与 dn_data_dirs 定义的层级数一致;"-1" 表示"无限制" | -| dn_default_space_usage_thresholds | 0.85 | 是 | 定义每个层级数据目录的最大使用空间比例;当使用空间大于该比例时,数据会被自动迁移至下一个层级;当最后一个层级的使用存储空间大于此阈值时,会将系统置为 READ_ONLY | 每级存储使用分号分隔;层级数量需与 dn_data_dirs 定义的层级数一致 | -| object_storage_type | AWS_S3 | 使用远端存储时必填 | 云端存储类型 | IoTDB 支持 S3 协议作为远端存储类型 | -| object_storage_bucket | iotdb_data | 使用远端存储时必填 | 云端存储 bucket 的名称 | AWS S3 中的 bucket 定义 | -| object_storage_region | | 使用远端存储时必填 | 云端存储的服务区域 | AWS S3 中的 region 定义 | -| object_storage_endpoint | | 使用远端存储时必填 | 云端存储的 endpoint | AWS S3 的 endpoint | -| object_storage_access_key | | 使用远端存储时必填 | 云端存储的验证信息 key | AWS S3 的 credential key | -| object_storage_access_secret | | 使用远端存储时必填 | 云端存储的验证信息 
secret | AWS S3 的 credential secret | -| enable_path_style_access | false | 否 | 是否启用云端存储服务路径访问 | | -| remote_tsfile_cache_dirs | data/datanode/data/cache | 否 | 云端存储在本地的缓存目录 | | -| remote_tsfile_cache_page_size_in_kb | 20480 | 否 | 云端存储在本地缓存文件的块大小 | | -| remote_tsfile_cache_max_disk_usage_in_mb | 51200 | 否 | 云端存储本地缓存的最大磁盘占用大小 | | - - -## 3. 本地多级存储配置示例 - -以下以本地两级存储的配置示例。 - -```JavaScript -// 必须配置项 -dn_data_dirs=/data1/data;/data2/data,/data3/data; -tier_ttl_in_ms=86400000;-1 -dn_default_space_usage_thresholds=0.2;0.1 -``` - -在该示例中,共配置了两个层级的存储,具体为: - -| **层级** | **数据目录** | **数据范围** | **磁盘最小剩余空间阈值** | -| -------- | -------------------------------------- | --------------- | ------------------------ | -| 层级一 | 目录一:/data1/data | 最近 1 天的数据 | 20% | -| 层级二 | 目录一:/data2/data目录二:/data3/data | 1 天以前的数据 | 10% | - -## 4. 远端多级存储配置示例 - -以下以三级存储为例: - -```JavaScript -// 必须配置项 -dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE -tier_ttl_in_ms=86400000;864000000;-1 -dn_default_space_usage_thresholds=0.2;0.15;0.1 -object_storage_type=AWS_S3 -object_storage_bucket=iotdb -object_storage_region= -object_storage_endpoint= -object_storage_access_key= -object_storage_access_secret= - -// 可选配置项 -enable_path_style_access=false -remote_tsfile_cache_dirs=data/datanode/data/cache -remote_tsfile_cache_page_size_in_kb=20971520 -remote_tsfile_cache_max_disk_usage_in_mb=53687091200 -``` - -在该示例中,共配置了三个层级的存储,具体为: - -| **层级** | **数据目录** | **数据范围** | **磁盘最小剩余空间阈值** | -| -------- | -------------------------------------- | ---------------------------- | ------------------------ | -| 层级一 | 目录一:/data1/data | 最近 1 天的数据 | 20% | -| 层级二 | 目录一:/data2/data目录二:/data3/data | 过去1 天至过去 10 天内的数据 | 15% | -| 层级三 | 远端 S3 协议存储 | 过去 10 天以前的数据 | 10% | \ No newline at end of file diff --git a/src/zh/UserGuide/Master/Tree/User-Manual/User-defined-function_timecho.md b/src/zh/UserGuide/Master/Tree/User-Manual/User-defined-function_timecho.md deleted file mode 100644 index 35a998b90..000000000 --- 
a/src/zh/UserGuide/Master/Tree/User-Manual/User-defined-function_timecho.md
+++ /dev/null
@@ -1,928 +0,0 @@
-# UDF
-
-## 1. UDF Introduction
-
-A UDF (User Defined Function) is a function defined by the user. IoTDB provides a variety of built-in functions for time series processing and also supports user-defined extensions to meet additional computing needs.
-
-IoTDB supports two types of UDFs, as shown in the table below.
-
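| UDF type | Data access strategy | Description |
|----------|----------------------|-------------|
| UDTF | MAPPABLE_ROW_BY_ROW | User-defined scalar function. It takes one row of k input time series columns and outputs one row of one time series column. It can be used in any clause or expression where a scalar function may appear, such as the SELECT clause or the WHERE clause. |
| UDTF | ROW_BY_ROW<br>SLIDING_TIME_WINDOW<br>SLIDING_SIZE_WINDOW<br>SESSION_TIME_WINDOW<br>STATE_WINDOW | User-defined time series generating function. It takes m rows of k input time series columns and outputs n rows of one time series column. The number of input rows m may differ from the number of output rows n. It can only be used in the SELECT clause. |
| UDAF | - | User-defined aggregate function. It takes m rows of k input time series columns and outputs one row of one time series column. It can be used in any clause or expression where an aggregate function may appear, such as the SELECT clause or the HAVING clause. |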
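
The three input/output shapes described in the table can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use the IoTDB UDF API, and the class `UdfShapes` and its method names are hypothetical, chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names belong to the IoTDB UDF API.
class UdfShapes {
  // Scalar shape (MAPPABLE_ROW_BY_ROW): one row of k values in, one value out.
  static long scalarAdd(long[] row) {
    long sum = 0;
    for (long v : row) {
      sum += v;
    }
    return sum;
  }

  // Generating shape (ROW_BY_ROW and the window strategies): m rows in, n rows out.
  // Rows that contain a null are skipped, so n can be smaller than m.
  static List<Long> generateSums(List<Long[]> rows) {
    List<Long> out = new ArrayList<>();
    for (Long[] row : rows) {
      boolean hasNull = false;
      long sum = 0;
      for (Long v : row) {
        if (v == null) {
          hasNull = true;
          break;
        }
        sum += v;
      }
      if (!hasNull) {
        out.add(sum);
      }
    }
    return out;
  }

  // Aggregate shape (UDAF): m rows in, exactly one value out.
  static long aggregateCount(List<Long[]> rows) {
    return rows.size();
  }

  public static void main(String[] args) {
    List<Long[]> rows = new ArrayList<>();
    rows.add(new Long[] {1L, 2L});
    rows.add(new Long[] {null, 5L});
    rows.add(new Long[] {3L, 4L});
    System.out.println(scalarAdd(new long[] {1, 2})); // 3
    System.out.println(generateSums(rows));           // [3, 7]
    System.out.println(aggregateCount(rows));         // 3
  }
}
```

The scalar shape returns exactly one value per input row, which is why it can stand wherever a scalar expression is allowed, while the generating shape may emit fewer or more rows than it receives.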
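-
-### 1.1 Using UDFs
-
-UDFs are used in the same way as built-in functions: you can call a UDF directly in a SELECT statement just like a built-in function.
-
-#### 1. Supported basic SQL syntax
-
-* `SLIMIT` / `SOFFSET`
-* `LIMIT` / `OFFSET`
-* Value filters
-* Time filters
-
-
-#### 2. Queries with *
-
-Assume there are two time series, `root.sg.d1.s1` and `root.sg.d1.s2`.
-
-* **Execute `SELECT example(*) from root.sg.d1`**
-
-The result set will include the results of `example(root.sg.d1.s1)` and `example(root.sg.d1.s2)`.
-
-* **Execute `SELECT example(s1, *) from root.sg.d1`**
-
-The result set will include the results of `example(root.sg.d1.s1, root.sg.d1.s1)` and `example(root.sg.d1.s1, root.sg.d1.s2)`.
-
-* **Execute `SELECT example(*, *) from root.sg.d1`**
-
-The result set will include the results of `example(root.sg.d1.s1, root.sg.d1.s1)`, `example(root.sg.d1.s2, root.sg.d1.s1)`, `example(root.sg.d1.s1, root.sg.d1.s2)`, and `example(root.sg.d1.s2, root.sg.d1.s2)`.
-
-#### 3. Queries with custom input parameters
-
-You can pass any number of key-value parameters to a UDF in a query. Both the key and the value must be enclosed in single or double quotes. Note that key-value parameters can only be passed after all time series.
-
-  Examples:
-``` sql
-SELECT example(s1, 'key1'='value1', 'key2'='value2'), example(*, 'key3'='value3') FROM root.sg.d1;
-SELECT example(s1, s2, 'key1'='value1', 'key2'='value2') FROM root.sg.d1;
-```
-
-#### 4. Nested queries with other queries
-
-  Examples:
-``` sql
-SELECT s1, s2, example(s1, s2) FROM root.sg.d1;
-SELECT *, example(*) FROM root.sg.d1 DISABLE ALIGN;
-SELECT s1 * example(* / s1 + s2) FROM root.sg.d1;
-SELECT s1, s2, s1 + example(s1, s2), s1 - example(s1 + example(s1, s2) / s2) FROM root.sg.d1;
-```
-
-
-
-## 2. UDF Management
-
-### 2.1 UDF Registration
-
-A UDF can be registered with the following steps:
-
-1. Implement a complete UDF class. Assume that its fully qualified class name is `org.apache.iotdb.udf.UDTFExample`.
-2. Package the project into a JAR. If you manage the project with Maven, you can refer to the [Maven project example](https://github.com/apache/iotdb/tree/master/example/udf).
-3. Make the preparations required before registration. They differ by registration method; see the examples below.
-4. Register the UDF with the following SQL statement:
-
-```sql
-CREATE FUNCTION <UDF-NAME> AS <UDF-CLASS-FULL-PATHNAME> (USING URI URI-STRING)
-```
-
-#### Example: register a UDF named `example` using either of the two methods below
-
-#### Method 1: place the JAR manually
-
-Preparation:
-With this method, you need to place the JAR in the `ext/udf` directory (configurable) of every node in the cluster before registration.
-
-Registration statement:
-
-```sql
-CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample'
-```
-
-#### Method 2: let the cluster install the JAR automatically from a URI
-
-Preparation:
-With this method, you need to upload the JAR to a URI server before registration and make sure that the IoTDB instance executing the registration statement can access that server.
-
-Registration statement:
-
-```sql
-CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' USING URI 'http://jar/example.jar'
-```
-
-IoTDB downloads the JAR and synchronizes it to the whole cluster.
-
-#### Notes
-
-1. Because IoTDB loads UDFs dynamically through reflection, you do not need to stop or restart the server to load one.
-
-2. UDF names are case-insensitive.
-
-3. Do not register a UDF under the name of a built-in function. Such a registration will fail.
-
-4. Different JARs should not contain classes that share the same fully qualified name but implement different logic. For example, suppose the UDFs (UDAF/UDTF) `udf1` and `udf2` correspond to the resources `udf1.jar` and `udf2.jar`. If both JARs contain a class named `org.apache.iotdb.udf.UDTFExample` and both UDFs are used in the same SQL statement, the system loads one of the two classes at random, which makes the UDF behavior inconsistent.
-
-### 2.2 Uninstalling a UDF
-
-The SQL syntax is as follows:
-
-```sql
-DROP FUNCTION <UDF-NAME>
-```
-
-Example: uninstall the UDF from the example above:
-
-```sql
-DROP FUNCTION example
-```
-
-Note: for functions registered with USING URI, you also need to remove the UDF's JAR files from the `ext/udf/install` path under the installation directory on every node of the cluster.
-
-### 2.3 Showing All Registered UDFs
-
-``` sql
-SHOW FUNCTIONS
-```
-
-### 2.4 UDF Configuration
-
-- The UDF storage directory can be configured in `iotdb-system.properties`:
-  ``` Properties
-# UDF lib dir
-
-udf_lib_dir=ext/udf
-```
-
-- If an out-of-memory error is reported when using a UDF, change the following parameters in `iotdb-system.properties` and restart the service.
-  ``` Properties
-
-# Used to estimate the memory usage of text fields in a UDF query.
-# It is recommended to set this value to be slightly larger than the average length of all text
-# effectiveMode: restart
-# Datatype: int
-udf_initial_byte_array_length_for_memory_control=48
-
-# How much memory may be used in ONE UDF query (in MB).
-# The upper limit is 20% of allocated memory for read.
-# effectiveMode: restart
-# Datatype: float
-udf_memory_budget_in_mb=30.0
-
-# UDF memory allocation ratio.
-# The parameter form is a:b:c, where a, b, and c are integers.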
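-# effectiveMode: restart
-udf_reader_transformer_collector_memory_proportion=1:1:1
-```
-
-### 2.5 UDF User Permissions
-
-Using UDFs requires the `USE_UDF` permission. Only users with this permission are allowed to register, uninstall, and query UDFs.
-
-For more on user permissions, see [Authority Management Statements](../User-Manual/Authority-Management_timecho##权限管理).
-
-
-## 3. UDF Libraries
-
-Building on the UDF capability, IoTDB provides a set of functions for time series processing, covering data quality, data profiling, anomaly detection, frequency domain analysis, data matching, data repairing, series discovery, and machine learning, which meet the time series processing needs of industrial scenarios.
-
-See the [UDF Libraries](../SQL-Manual/UDF-Libraries_timecho.md) document for the installation steps and the registration statement of each function, so that all the functions you need are registered correctly.
-
-## 4. UDF Development
-
-### 4.1 UDF Dependency
-
-If you use [Maven](http://search.maven.org/), you can search the [Maven repository](http://search.maven.org/) for the dependency shown below. Be sure to choose the dependency version that matches the version of the target IoTDB server.
-
-``` xml
-<dependency>
-  <groupId>org.apache.iotdb</groupId>
-  <artifactId>udf-api</artifactId>
-  <version>1.0.0</version>
-  <scope>provided</scope>
-</dependency>
-```
-
-### 4.2 UDTF (User Defined Timeseries Generating Function)
-
-To write a UDTF, implement the `org.apache.iotdb.udf.api.UDTF` interface, with at least the `beforeStart` method and one `transform` method.
-
-#### Interface description:
-
-| Interface definition | Description | Required |
-| :----------------------------------------------------------- | :----------------------------------------------------------- | ------------------------- |
-| void validate(UDFParameterValidator validator) throws Exception | Executed before the initialization method `beforeStart`. It checks whether the user-supplied parameters in `UDFParameters` are valid. | No |
-| void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception | Initialization method. It runs the user-defined initialization logic before the UDTF processes input data. Each time a user executes a UDTF query, the framework constructs a new UDF class instance, and this method is called once when that instance is initialized. It is called only once in the lifetime of each UDF class instance. | Yes |
-| Object transform(Row row) throws Exception | Called by the framework. When you choose `MappableRowByRowAccessStrategy` in `beforeStart`, you can use this method to process data. The input is passed in as a `Row`, and the result is returned as an `Object`. | Implement one of the four `transform` methods |
-| void transform(Column[] columns, ColumnBuilder builder) throws Exception | Called by the framework. When you choose `MappableRowByRowAccessStrategy` in `beforeStart`, you can use this method to process data. The input is passed in as `Column[]`, and the output is written through `ColumnBuilder`. You need to call the data collection methods provided by `builder` yourself to determine the final output. | Implement one of the four `transform` methods |
-| void transform(Row row, PointCollector collector) throws Exception | Called by the framework. This method is called when you choose `RowByRowAccessStrategy` in `beforeStart`. The input is passed in as a `Row`, and the output is written through `PointCollector`. You need to call the data collection methods provided by `collector` yourself to determine the final output. | Implement one of the four `transform` methods |
-| void transform(RowWindow rowWindow, PointCollector collector) throws Exception | Called by the framework. This method is called when you choose `SlidingSizeWindowAccessStrategy` or `SlidingTimeWindowAccessStrategy` in `beforeStart`. The input is passed in as a `RowWindow`, and the output is written through `PointCollector`. You need to call the data collection methods provided by `collector` yourself to determine the final output. | Implement one of the four `transform` methods |
-| void terminate(PointCollector collector) throws Exception | Called by the framework after all `transform` calls have completed and before `beforeDestroy` is executed. It is called exactly once in a UDF query. You need to call the data collection methods provided by `collector` yourself to determine the final output. | No |
-| void beforeDestroy() | The termination method of a UDTF. It is called by the framework exactly once, after the last record has been processed. | No |
-
-In the full lifecycle of a UDTF instance, the methods are called in the following order:
-
-1. void validate(UDFParameterValidator validator) throws Exception
-2. void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception
-3. Object transform(Row row) throws Exception, or void transform(Column[] columns, ColumnBuilder builder) throws Exception, or void transform(Row row, PointCollector collector) throws Exception, or void transform(RowWindow rowWindow, PointCollector collector) throws Exception
-4. void terminate(PointCollector collector) throws Exception
-5. void beforeDestroy()
-
-> Note that each time the framework executes a UDTF query, it constructs a brand-new UDF class instance and destroys that instance when the query ends. The internal data of UDF class instances is therefore isolated between different UDTF queries, even within the same SQL statement. You can safely keep state in a UDTF without worrying about concurrent access to the internal state of a UDF class instance.
-
-#### Interface details:
-
-1. **void validate(UDFParameterValidator validator) throws Exception**
-
-   The `validate` method validates the parameters entered by the user.
-
-   In this method you can restrict the number and types of input series, check the attributes entered by the user, or run custom validation logic.
-
-For the usage of `UDFParameterValidator`, see the [Javadoc](https://github.com/apache/iotdb/blob/rc/2.0.4/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/parameter/UDFParameterValidator.java).
-
-2. 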
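**void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception**
-
-   The `beforeStart` method serves three purposes:
-   1. Help the user parse the UDF parameters in the SQL statement.
-   2. Configure the information the UDF needs at runtime, that is, specify the strategy the UDF uses to access raw data and the type of the output series.
-   3. Create resources, such as establishing external connections or opening files.
-
-2.1 **UDFParameters**
-
-`UDFParameters` parses the UDF parameters in the SQL statement (the part in parentheses after the UDF name). The parameters include time series parameters and attribute parameters given as string key-value pairs.
-
-Example:
-
-``` sql
-SELECT UDF(s1, s2, 'key1'='iotdb', 'key2'='123.45') FROM root.sg.d;
-```
-
-Usage:
-
-``` java
-void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception {
-  String stringValue = parameters.getString("key1"); // iotdb
-  Float floatValue = parameters.getFloat("key2"); // 123.45
-  Double doubleValue = parameters.getDouble("key3"); // null
-  int intValue = parameters.getIntOrDefault("key4", 678); // 678
-  // do something
-
-  // configurations
-  // ...
-}
-```
-
-2.2 **UDTFConfigurations**
-
-You must use `UDTFConfigurations` to specify the strategy the UDF uses to access raw data and the type of the output series.
-
-Usage:
-
-``` java
-void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception {
-  // parameters
-  // ...
-
-  // configurations
-  configurations
-    .setAccessStrategy(new RowByRowAccessStrategy())
-    .setOutputDataType(Type.INT32);
-}
-```
-
-The `setAccessStrategy` method sets the strategy the UDF uses to access raw data, and `setOutputDataType` sets the type of the output series.
-
- 2.2.1 **setAccessStrategy**
-
-Note that the access strategy you set here determines which `transform` method the framework calls, so implement the `transform` method that corresponds to the strategy. You can also decide which strategy to set dynamically, based on the attribute parameters parsed by `UDFParameters`, so implementing two `transform` methods is also allowed.
-
-The following strategies for accessing raw data are available:
-
-| Interface definition | Description | `transform` method called |
-| ------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| MappableRowByRowAccessStrategy | User-defined scalar function.<br>The framework calls `transform` once for each row of raw input. It takes one row of k time series columns and outputs one row of one time series column. It can be used in any clause or expression where a scalar function may appear, such as the SELECT clause or the WHERE clause. | void transform(Column[] columns, ColumnBuilder builder) throws Exception<br>Object transform(Row row) throws Exception |
-| RowByRowAccessStrategy | User-defined time series generating function that processes raw data row by row.<br>The framework calls `transform` once for each row of raw input. It takes one row of k time series columns and outputs n rows of one time series column.<br>With a single input series, each row is one data point of that series.<br>With multiple input series, the series are aligned by time, and each aligned row is one input data point.<br>(Some columns in a row may be `null`, but never all of them.) | void transform(Row row, PointCollector collector) throws Exception |
-| SlidingTimeWindowAccessStrategy | User-defined time series generating function that processes raw data in sliding time windows.<br>The framework calls `transform` once for each input window. It takes m rows of k time series columns and outputs n rows of one time series column.<br>A window may contain multiple rows. The input series are aligned by time, and each window is passed in as one unit of input.<br>(A window may contain i rows. Some columns in a row may be `null`, but never all of them.) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| SlidingSizeWindowAccessStrategy | User-defined time series generating function that processes raw data in windows of a fixed number of rows, that is, every window contains the same number of rows (except the last one).<br>The framework calls `transform` once for each input window. It takes m rows of k time series columns and outputs n rows of one time series column.<br>A window may contain multiple rows. The input series are aligned by time, and each window is passed in as one unit of input.<br>(A window may contain i rows. Some columns in a row may be `null`, but never all of them.) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| SessionTimeWindowAccessStrategy | User-defined time series generating function that processes raw data in session windows.<br>The framework calls `transform` once for each input window. It takes m rows of k time series columns and outputs n rows of one time series column.<br>A window may contain multiple rows. The input series are aligned by time, and each window is passed in as one unit of input.<br>(A window may contain i rows. Some columns in a row may be `null`, but never all of them.) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| StateWindowAccessStrategy | User-defined time series generating function that processes raw data in state windows.<br>The framework calls `transform` once for each input window. It takes m rows of 1 time series column and outputs n rows of one time series column.<br>A window may contain multiple rows. Currently, windowing is supported on only one measurement, that is, one column of data. | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-
-#### Interface details:
-
-- `MappableRowByRowAccessStrategy` and `RowByRowAccessStrategy` are constructed without any parameters.
-
-- `SlidingTimeWindowAccessStrategy`
-
-Windowing diagram:
-
-`SlidingTimeWindowAccessStrategy` has several constructors. You can pass three kinds of parameters to them:
-
-1. The start time and end time of the display window on the time axis
-
-The start and end time of the display window are optional. If you do not provide them, the start time is set to the smallest timestamp in the whole query result set, and the end time is set to the largest timestamp in the whole query result set.
-
-2. The time interval used to divide the time axis (must be positive)
-3. The sliding step (must be positive, but does not have to be greater than or equal to the time interval)
-
-The sliding step is also optional. If you do not provide it, the sliding step is set to the time interval used to divide the time axis.
-
-The relationship between the three kinds of parameters is shown in the figure below. For the constructors of this strategy, see the [Javadoc](https://github.com/apache/iotdb/blob/rc/2.0.4/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/SlidingTimeWindowAccessStrategy.java).
-
-> Note that the actual time interval of the last few time windows may be smaller than the specified time interval. In addition, some time windows may contain zero rows. The framework still calls `transform` once for such a window.
-
-- `SlidingSizeWindowAccessStrategy`
-
-Windowing diagram:
-
-`SlidingSizeWindowAccessStrategy` has several constructors. You can pass two parameters to them:
-
-1. The window size, that is, the number of rows in one data processing window. Note that the last few windows may contain fewer rows than specified.
-2. The sliding step, that is, the number of rows between the first row of the next window and the first row of the current window (must be positive, but does not have to be greater than or equal to the window size)
-
-The sliding step is optional. If you do not provide it, the sliding step is set to the window size.
-
-- `SessionTimeWindowAccessStrategy`
-
-Windowing diagram: **rows whose time gap is less than or equal to the given minimum time interval sessionGap are grouped into the same window.**
-
-`SessionTimeWindowAccessStrategy` has several constructors. You can pass two kinds of parameters to them:
-
-1. The start time and end time of the display window on the time axis.
-2. The minimum time interval between session windows.
-
-- `StateWindowAccessStrategy`
-
-Windowing diagram: **for numeric data, rows whose state difference is less than or equal to the given threshold delta are grouped into the same window.**
-
-`StateWindowAccessStrategy` has four constructors:
-
-1. For numeric data, you can provide the start and end time of the display window on the time axis, together with the threshold delta of the change allowed within a single window.
-2. For text and boolean data, you can provide the start and end time of the display window on the time axis. For these two data types, all values within a single window are identical, so no change threshold is needed.
-3. For numeric data, you can provide only the threshold delta of the change allowed within a single window. The start time of the display window is then set to the smallest timestamp in the whole query result set, and the end time to the largest timestamp in the whole query result set.
-4. 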
For text and boolean data, you may provide no parameters at all. The start and end timestamps are then determined as described in item 3.
-
-StateWindowAccessStrategy currently accepts only one input column. For the constructors of this strategy, see the [Javadoc](https://github.com/apache/iotdb/blob/rc/2.0.4/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/StateWindowAccessStrategy.java).
-
- 2.2.2 **setOutputDataType**
-
-Note that the output series type you set here determines the data types that `PointCollector` can actually receive in the `transform` method. The output type set in `setOutputDataType` and the output types that `PointCollector` can actually receive are related as follows:
-
-| Output type set in `setOutputDataType` | Output type that `PointCollector` can actually receive |
-| :---------------------------------- | :----------------------------------------------------------- |
-| INT32 | int |
-| INT64 | long |
-| FLOAT | float |
-| DOUBLE | double |
-| BOOLEAN | boolean |
-| TEXT | java.lang.String and org.apache.iotdb.udf.api.type.Binary |
-
-The type of a UDTF's output series is determined at runtime. You can decide the output series type dynamically based on the input series types.
-
-Example:
-
-```java
-void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception {
-  // do something
-  // ...
-
-  configurations
-    .setAccessStrategy(new RowByRowAccessStrategy())
-    .setOutputDataType(parameters.getDataType(0));
-}
-```
-
-3. **Object transform(Row row) throws Exception**
-
-When you set the raw-data access strategy to `MappableRowByRowAccessStrategy` in `beforeStart`, you need to implement either this method or `void transform(Column[] columns, ColumnBuilder builder) throws Exception` described below, and put your raw-data processing logic in it.
-
-This method processes one row of raw data per call. Raw data is read from `Row`, and the result is output through the return value. Within one `transform` call you must output exactly one data point for each input data point, so input and output remain one-to-one. Note that the type of the output data points must match the type set in `beforeStart`, and the timestamps of the output data points must be strictly monotonically increasing.
-
-Below is a complete UDF example that implements `Object transform(Row row) throws Exception`. It is an adder that takes two input time series and outputs the algebraic sum of the two data points.
-
-```java
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.access.Row;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Adder implements UDTF {
-  private Type dataType;
-
-  @Override
-  public void validate(UDFParameterValidator validator) throws Exception {
-    // Require two input series, both of type INT64.
-    validator
-        .validateInputSeriesNumber(2)
-        .validateInputSeriesDataType(0, Type.INT64)
-        .validateInputSeriesDataType(1, Type.INT64);
-  }
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    // The output type follows the type of the first input series.
-    dataType = parameters.getDataType(0);
-    configurations
-        .setAccessStrategy(new MappableRowByRowAccessStrategy())
-        .setOutputDataType(dataType);
-  }
-
-  @Override
-  public Object transform(Row row) throws Exception {
-    return row.getLong(0) + row.getLong(1);
-  }
-}
-```
-
-4. 
**void transform(Column[] columns, ColumnBuilder builder) throws Exception** - -当您在`beforeStart`方法中指定 UDF 读取原始数据的策略为 `MappableRowByRowAccessStrategy`,您就需要实现该方法,在该方法中增加对原始数据处理的逻辑。 - -该方法每次处理原始数据的多行,经过性能测试,我们发现一次性处理多行的 UDTF 比一次处理一行的 UDTF 性能更好。原始数据由`Column[]`读入,由`ColumnBuilder`输出。您必须在一次`transform`方法调用中,根据每个输入的数据点输出一个对应的数据点,即输入和输出依然是一对一的。需要注意的是,输出数据点的类型必须与您在`beforeStart`方法中设置的一致,而输出数据点的时间戳必须是严格单调递增的。 - -下面是一个实现了`void transform(Column[] columns, ColumnBuilder builder) throws Exceptionn`方法的完整 UDF 示例。它是一个加法器,接收两列时间序列输入,输出这两个数据点的代数和。 - -``` java -import org.apache.iotdb.tsfile.read.common.block.column.Column; -import org.apache.iotdb.tsfile.read.common.block.column.ColumnBuilder; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - private Type type; - - @Override - public void validate(UDFParameterValidator validator) throws Exception { - validator - .validateInputSeriesNumber(2) - .validateInputSeriesDataType(0, Type.INT64) - .validateInputSeriesDataType(1, Type.INT64); - } - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - type = parameters.getDataType(0); - configurations.setAccessStrategy(new MappableRowByRowAccessStrategy()).setOutputDataType(type); - } - - @Override - public void transform(Column[] columns, ColumnBuilder builder) throws Exception { - long[] inputs1 = columns[0].getLongs(); - long[] inputs2 = columns[1].getLongs(); - - int count = columns[0].getPositionCount(); - for (int i = 0; i < count; i++) { - builder.writeLong(inputs1[i] + inputs2[i]); - } - } -} -``` - -5. 
**void transform(Row row, PointCollector collector) throws Exception** - -When the access strategy you specify in `beforeStart` is `RowByRowAccessStrategy`, you need to implement this method and add your logic for processing the raw data to it. - -This method processes one row of raw data per call. The raw data is read from `Row` and the output is written through `PointCollector`. You may output any number of data points in a single `transform` call. Note that the type of the output data points must match the type you set in `beforeStart`, and the timestamps of the output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements `void transform(Row row, PointCollector collector) throws Exception`. It is an adder that takes two time series as input and, when neither data point is `null`, outputs the algebraic sum of the two data points. - -```java -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Adder implements UDTF { - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT64) - .setAccessStrategy(new RowByRowAccessStrategy()); - } - - @Override - public void transform(Row row, PointCollector collector) throws Exception { - if (row.isNull(0) || row.isNull(1)) { - return; - } - collector.putLong(row.getTime(), row.getLong(0) + row.getLong(1)); - } -} -``` - -6. 
**void transform(RowWindow rowWindow, PointCollector collector) throws Exception** - -When the access strategy you specify in `beforeStart` is `SlidingTimeWindowAccessStrategy` or `SlidingSizeWindowAccessStrategy`, you need to implement this method and add your logic for processing the raw data to it. - -This method processes one batch of data per call, where a batch holds either a fixed number of rows or the rows within a fixed time interval. We call the container that holds such a batch a window. The raw data is read from `RowWindow` and the output is written through `PointCollector`. `RowWindow` gives you access to one batch of `Row` objects and provides interfaces for both random access and iterative access to them. You may output any number of data points in a single `transform` call. Note that the type of the output data points must match the type you set in `beforeStart`, and the timestamps of the output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements `void transform(RowWindow rowWindow, PointCollector collector) throws Exception`. It is a counter that accepts any number of time series as input; it counts and outputs the number of data rows in each time window within the specified time range. - -```java -import java.io.IOException; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.RowWindow; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.SlidingTimeWindowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Counter implements UDTF { - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT32) - .setAccessStrategy(new SlidingTimeWindowAccessStrategy( - parameters.getLong("time_interval"), - parameters.getLong("sliding_step"), - parameters.getLong("display_window_begin"), - parameters.getLong("display_window_end"))); - } - - @Override - public void transform(RowWindow rowWindow, PointCollector collector) throws Exception { - if (rowWindow.windowSize() != 0) { - collector.putInt(rowWindow.windowStartTime(), rowWindow.windowSize()); - } - } -} -``` - -7. 
**void terminate(PointCollector collector) throws Exception** - -In some scenarios, a UDF has to traverse all of the raw data before it can produce its final output. The `terminate` interface supports this kind of UDF. - -This method is called after all `transform` calls have finished and before `beforeDestroy` is executed. You can use `transform` purely for data processing and then output the result in `terminate`. - -The result must be output through `PointCollector`. You may output any number of data points in a single `terminate` call. Note that the type of the output data points must match the type you set in `beforeStart`, and the timestamps of the output data points must be strictly monotonically increasing. - -The following is a complete UDF example that implements `void terminate(PointCollector collector) throws Exception`. It takes one time series of type `INT32` as input and outputs the point with the maximum value in that series. - -```java -import java.io.IOException; -import org.apache.iotdb.udf.api.UDTF; -import org.apache.iotdb.udf.api.access.Row; -import org.apache.iotdb.udf.api.collector.PointCollector; -import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations; -import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters; -import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy; -import org.apache.iotdb.udf.api.type.Type; - -public class Max implements UDTF { - - private Long time; - private int value; - - @Override - public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) { - configurations - .setOutputDataType(Type.INT32) - .setAccessStrategy(new RowByRowAccessStrategy()); - } - - @Override - public void transform(Row row, PointCollector collector) { - if (row.isNull(0)) { - return; - } - int candidateValue = row.getInt(0); - if (time == null || value < candidateValue) { - time = row.getTime(); - value = candidateValue; - } - } - - @Override - public void terminate(PointCollector collector) throws IOException { - if (time != null) { - collector.putInt(time, value); - } - } -} -``` - -8. 
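**void beforeDestroy()** - -The termination method of a UDTF. You can release resources and perform similar cleanup in this method. - -This method is called by the framework. For a given UDF class instance it is called exactly once in its lifecycle, after the last record has been processed. - -### 4.3 UDAF (User Defined Aggregation Function) - -A complete UDAF definition involves two classes: State and UDAF. - -#### State Class - -To write a State class, implement the `org.apache.iotdb.udf.api.State` interface. The table below describes the methods you need to implement. - -#### Interface Description: - -| Interface Definition | Description | Required | -| -------------------------------- | ------------------------------------------------------------ | -------- | -| void reset() | Resets the `State` object to its initial state. As when writing a constructor, set the initial value of every field of the `State` class in this method. | Yes | -| byte[] serialize() | Serializes the `State` to binary data. This method is used to pass `State` objects inside IoTDB. Note that the serialization order must match the deserialization method below. | Yes | -| void deserialize(byte[] bytes) | Deserializes binary data into the `State`. This method is used to pass `State` objects inside IoTDB. Note that the deserialization order must match the serialization method above. | Yes | - -#### Detailed Interface Introduction: - -1. **void reset()** - -This method resets the `State` to its initial state. Set the initial value of every field of the `State` object in this method. For optimization, IoTDB reuses `State` objects internally wherever possible instead of creating a new `State` for each group, which would introduce unnecessary overhead. After a `State` has been updated with all the data of one group, this method is called to reset it to the initial state so that it can process the next group. - -Take the `State` for computing an average (`avg`) as an example: you need the sum of the data, `sum`, and the number of data points, `count`, and you initialize both to 0 in `reset()`. - -```java -class AvgState implements State { - double sum; - - long count; - - @Override - public void reset() { - sum = 0; - count = 0; - } - - // other methods -} -``` - -2. 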
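**byte[] serialize()/void deserialize(byte[] bytes)** - -These methods serialize a `State` to binary data and deserialize a `State` from binary data. IoTDB is a distributed database and passes data between nodes, so you need to write both methods so that a `State` can be transferred between nodes. Note that the serialization order and the deserialization order must be the same. - -Taking the `State` for computing an average (`avg`) as the example again, you can convert the contents of the `State` to a `byte[]` array, and read them back from a `byte[]` array, in any way you like. The code below serializes and deserializes with `java.nio.ByteBuffer`: - -```java -@Override -public byte[] serialize() { - ByteBuffer buffer = ByteBuffer.allocate(Double.BYTES + Long.BYTES); - buffer.putDouble(sum); - buffer.putLong(count); - - return buffer.array(); -} - -@Override -public void deserialize(byte[] bytes) { - ByteBuffer buffer = ByteBuffer.wrap(bytes); - sum = buffer.getDouble(); - count = buffer.getLong(); -} -``` - -#### UDAF Class - -To write a UDAF class, implement the `org.apache.iotdb.udf.api.UDAF` interface. The table below describes the methods you need to implement. - -#### Interface Description: - -| Interface Definition | Description | Required | -| ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | -| void validate(UDFParameterValidator validator) throws Exception | Executed before the initialization method `beforeStart` to check whether the parameters the user supplied in `UDFParameters` are valid. This method is the same as `validate` in a UDTF. | No | -| void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception | Initialization method that runs the user-defined initialization before the UDAF processes any input data. Unlike in a UDTF, the configuration here is of type `UDAFConfigurations`. | Yes | -| State createState() | Creates the `State` object. Usually you only need to call the default constructor and then adjust the default initial values as needed. | Yes | -| void addInput(State state, Column[] columns, BitMap bitMap) | Updates the `State` object in batch from the incoming `Column[]` data. Note that the last column, `columns[columns.length - 1]`, is always the time column. In addition, `BitMap` indicates data that has already been filtered out; when writing this method you need to check manually whether each data point has been filtered out. | Yes | -| void combineState(State state, State rhs) | Merges the state `rhs` into the state `state`. In a distributed deployment, the data of one group may be spread across nodes; IoTDB generates a `State` object for the partial data on each node and then calls this method to merge them into a complete `State`. | Yes | -| void outputFinal(State state, ResultValue resultValue) | Computes the final aggregation result from the data in the `State`. Note that, by the semantics of aggregation, each group can output only one value. | Yes | -| void beforeDestroy() | The termination method of a UDAF. It is called by the framework exactly once, after the last record has been processed. | No | - -In the lifecycle of a complete UDAF instance, the methods are called in the following order: - -1. State createState() -2. 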
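void validate(UDFParameterValidator validator) throws Exception -3. void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception -4. void addInput(State state, Column[] columns, BitMap bitMap) -5. void combineState(State state, State rhs) -6. void outputFinal(State state, ResultValue resultValue) -7. void beforeDestroy() - -As with UDTFs, the framework constructs a brand-new UDF class instance each time it executes a UDAF query and destroys that instance when the query ends. The data inside the UDF class instances of different UDAF queries (even within the same SQL statement) is therefore isolated. You can safely maintain state data in a UDAF without worrying about concurrency affecting the internal state of the UDF class instance. - -#### Detailed Interface Introduction: - -1. **void validate(UDFParameterValidator validator) throws Exception** - -As in a UDTF, the `validate` method validates the parameters entered by the user. - -In this method you can restrict the number and types of the input series, check the attributes entered by the user, or run custom validation logic. - -2. **void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception** - - The `beforeStart` method serves the same purposes as in a UDTF: - - 1. Help the user parse the UDF parameters in the SQL statement. - 2. Configure the information the UDF needs at runtime, that is, the strategy the UDF uses to access the raw data and the type of the output series. - 3. Create resources, such as establishing external connections or opening files. - -For the role of the `UDFParameters` type, see the description above. - -2.2 **UDAFConfigurations** - -The difference from a UDTF is that a UDAF uses `UDAFConfigurations` as the type of the `configuration` object. - -Currently, this class only supports setting the type of the output data. - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // parameters - // ... - - // configurations - configurations - .setOutputDataType(Type.INT32); -} -``` - -The output type set in `setOutputDataType` maps to the output type that `ResultValue` can actually receive as follows: - -| Output type set in `setOutputDataType` | Output type `ResultValue` can actually receive | -| :---------------------------------- | :------------------------------------- | -| INT32 | int | -| INT64 | long | -| FLOAT | float | -| DOUBLE | double | -| BOOLEAN | boolean | -| TEXT | org.apache.iotdb.udf.api.type.Binary | - -The type of a UDAF output series is also determined at runtime. You can decide the output series type dynamically from the input series type. - -Example: - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // do something - // ... - - configurations - .setOutputDataType(parameters.getDataType(0)); -} -``` - -3. 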
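**State createState()** - -Creates and initializes the `State` for the UDAF. Because of limitations of the Java language itself, you can only call the default constructor of the `State` class. The default constructor assigns a default initial value to every field of the class; if that initial value does not meet your needs, initialize the fields manually in this method. - -The following example includes manual initialization. Suppose you want to implement a cumulative-product aggregation function. The initial value of the `State` should be 1, but the default constructor initializes it to 0, so you need to initialize the `State` manually after calling the default constructor: - -```java -public State createState() { - MultiplyState state = new MultiplyState(); - state.result = 1; - return state; -} -``` - -4. **void addInput(State state, Column[] columns, BitMap bitMap)** - -This method updates the `State` object with the raw input data. For performance reasons, and to align with IoTDB's vectorized query engine, the raw input is no longer a single data point but an array of columns, `Column[]`. Note that the last column, `columns[columns.length - 1]`, is always the time column, so you can also act differently in a UDAF depending on the time. - -Because the input is multiple columns rather than a single data point, you need to filter out part of the data in the columns manually. This is the purpose of the third parameter, `BitMap`: it identifies which data in these columns has been filtered out, and you never need to take filtered-out data into account. - -The following is an `addInput()` example that counts the number of data points (that is, `count`). It shows how to use `BitMap` to ignore data that has already been filtered out. Note that, again because of limitations of the Java language itself, you need to cast the `State` type defined in the interface to your custom `State` type at the beginning of the method; otherwise you cannot use the `State` object properly afterwards. - -```java -public void addInput(State state, Column[] columns, BitMap bitMap) { - CountState countState = (CountState) state; - - int count = columns[0].getPositionCount(); - for (int i = 0; i < count; i++) { - if (bitMap != null && !bitMap.isMarked(i)) { - continue; - } - if (!columns[0].isNull(i)) { - countState.count++; - } - } -} -``` - -5. **void combineState(State state, State rhs)** - -This method merges two `State` objects; more precisely, it updates the first `State` object with the second one. IoTDB is a distributed database, and the data of one group may be spread across several nodes. For performance reasons, IoTDB first aggregates the partial data on each node into a `State`, and then merges the `State` objects that belong to the same group on different nodes. That is what `combineState` is for. - -The following is a `combineState()` example for computing an average (that is, `avg`). As in `addInput`, you need to cast both `State` objects at the beginning. Also note that the contents of the second `State` are used to update the value of the first `State`. - -```java -public void combineState(State state, State rhs) { - AvgState avgState = (AvgState) state; - AvgState avgRhs = (AvgState) rhs; - - avgState.count += avgRhs.count; - avgState.sum += avgRhs.sum; -} -``` - -6. 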
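**void outputFinal(State state, ResultValue resultValue)** - -This method computes the final result from the `State`. Access the fields of the `State`, compute the final result, and set it on the `ResultValue` object. IoTDB calls this method once at the end for each group. Note that, by the semantics of aggregation, the final result can only be a single value. - -The following is again an `outputFinal` example for computing an average (that is, `avg`). Besides the cast at the beginning, it shows how the `ResultValue` object is used: the final result is set through `setXXX`, where `XXX` is the type name. - -```java -public void outputFinal(State state, ResultValue resultValue) { - AvgState avgState = (AvgState) state; - - if (avgState.count != 0) { - resultValue.setDouble(avgState.sum / avgState.count); - } else { - resultValue.setNull(); - } -} -``` - -7. **void beforeDestroy()** - -The termination method of a UDAF. You can release resources and perform similar cleanup in this method. - -This method is called by the framework. For a given UDF class instance it is called exactly once in its lifecycle, after the last record has been processed. - -### 4.4 Complete Maven Project Example - -If you use [Maven](http://search.maven.org/), you can refer to our example project **udf-example**. You can find it [here](https://github.com/apache/iotdb/tree/master/example/udf). - - -## 5. Contributing General-Purpose Built-in UDFs to IoTDB - -This part describes how external users can contribute the UDFs they have written to the IoTDB community. - -### 5.1 Prerequisites - -1. The UDF is general-purpose. - - General-purpose here means that the UDF can be widely used in certain business scenarios. In other words, the UDF has reuse value and can be used directly by other users in the community. - - If you are not sure whether your UDF is general-purpose, send an email to `dev@iotdb.apache.org` or create an issue to start a discussion. - -2. The UDF has been tested and runs correctly in the user's production environment. - -### 5.2 Contribution Checklist - -1. Source code of the UDF -2. Test cases for the UDF -3. Usage instructions for the UDF - -### 5.3 Contribution Content - -#### 5.3.1 Source Code - -1. Create the main UDF class and any related helper classes in `iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin`. -2. 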
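Register the UDF you wrote in `iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin/BuiltinTimeSeriesGeneratingFunction.java`. - -#### 5.3.2 Test Cases - -At a minimum, write integration tests for the contributed UDF. - -You can add a new test class for the contributed UDF in `integration-test/src/test/java/org/apache/iotdb/db/it/udf`. - -#### 5.3.3 Usage Instructions - -The usage instructions must include the name of the UDF, what the UDF does, the attribute parameters required to execute the function, the scenarios the function applies to, and usage examples. - -The usage instructions must be provided in both Chinese and English. Add them to `docs/zh/UserGuide/Operation Manual/DML Data Manipulation Language.md` and `docs/UserGuide/Operation Manual/DML Data Manipulation Language.md` respectively. - -#### 5.3.4 Submitting a PR - -Once the source code, test cases, and usage instructions are ready, you can contribute the UDF to the IoTDB community by submitting a Pull Request (PR) on [Github](https://github.com/apache/iotdb). For the submission procedure, see the [Contribution Guide](https://iotdb.apache.org/zh/Community/Development-Guide.html). - -After the PR has been reviewed and merged, the UDF has been contributed to the IoTDB community! - -## 6. FAQ - -1. How do I modify a UDF that has already been registered? - -A: Suppose the UDF is named `example`, its fully qualified class name is `org.apache.iotdb.udf.UDTFExample`, and it is provided by `example.jar`. - -1. First unload the registered `example` function by executing `DROP FUNCTION example`. -2. Delete `example.jar` from the `iotdb-server-2.0.x-all-bin/ext/udf` directory. -3. Modify the logic in `org.apache.iotdb.udf.UDTFExample` and repackage it. The JAR file can still be named `example.jar`. -4. Upload the new JAR file to the `iotdb-server-2.0.x-all-bin/ext/udf` directory. -5. 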
装载新的 UDF,执行`CREATE FUNCTION example AS "org.apache.iotdb.udf.UDTFExample"` \ No newline at end of file diff --git a/src/zh/UserGuide/V1.2.x/Deployment-and-Maintenance/Deployment-Guide_timecho.md b/src/zh/UserGuide/V1.2.x/Deployment-and-Maintenance/Deployment-Guide_timecho.md deleted file mode 100644 index 8749b74dc..000000000 --- a/src/zh/UserGuide/V1.2.x/Deployment-and-Maintenance/Deployment-Guide_timecho.md +++ /dev/null @@ -1,1071 +0,0 @@ - - -# 部署指导 - -IoTDB 提供单机版、集群版和双活版共 3 种部署形态。本章节将详细介绍每一种部署形态的具体部署步骤。 - -## 预备知识 - -在开始部署前,您需要充分了解下面的预备知识。 - -### 安装包结构 - -首先,需要获取安装包,名字为 `apache-iotdb-{version}-all-bin` 的安装包包含 ConfigNode 和 DataNode 的可执行程序,请将安装包部署于目标集群的所有机器上,推荐将安装包部署于所有服务器的相同目录下。 - -**之后,需要对 IoTDB 安装包的结构有了解。IoTDB 安装包目录结构如下:** - -| **目录** | **说明** | -| -------- | --------------------------------------------------------------------------- | -| conf | 配置文件目录,包含 ConfigNode、DataNode、JMX 和 logback 等配置文件 | -| data | 数据文件目录,包含 ConfigNode 和 DataNode 的数据文件 | -| lib | 库文件目录 | -| licenses | 证书文件目录 | -| logs | 日志文件目录,包含 ConfigNode 和 DataNode 的日志文件 | -| sbin | 脚本目录,包含 ConfigNode 和 DataNode 的启停移除脚本,以及 Cli 的启动脚本等 | -| tools | 系统工具目录 | - -### 配置文件 - -**必要情况下,您需要根据业务需求,修改每个服务器上的配置文件。登录服务器,并将工作路径切换至 `apache-iotdb-{version}-all-bin`,配置文件在 `./conf` 目录内。** - -* 对于所有部署 ConfigNode 的服务器,需要修改 **通用配置** 和 **ConfigNode 配置** 。 -* 对于所有部署 DataNode 的服务器,需要修改 **通用配置** 和 **DataNode 配置** 。 - -#### 通用配置 - -打开通用配置文件 ./conf/iotdb-common.properties,可根据 [部署推荐](./Deployment-Recommendation.md)设置以下参数: - -| **配置项** | **说明** | **默认** | -| ------------------------------------------ | ------------------------------------------------------------- | ----------------------------------------------- | -| cluster\_name | 节点希望加入的集群的名称 | defaultCluster | -| config\_node\_consensus\_protocol\_class | ConfigNode 使用的共识协议 | org.apache.iotdb.consensus.ratis.RatisConsensus | -| schema\_replication\_factor | 元数据副本数,DataNode 数量不应少于此数目 | 1 | -| schema\_region\_consensus\_protocol\_class | 元数据副本组的共识协议 | 
org.apache.iotdb.consensus.ratis.RatisConsensus | -| data\_replication\_factor | 数据副本数,DataNode 数量不应少于此数目 | 1 | -| data\_region\_consensus\_protocol\_class | 数据副本组的共识协议。注:RatisConsensus 目前不支持多数据目录 | org.apache.iotdb.consensus.iot.IoTConsensus | - -**注意:上述配置项在集群启动后即不可更改,且务必保证所有节点的通用配置完全一致,否则节点无法启动。** - -#### ConfigNode 配置 - -打开 ConfigNode 配置文件 ./conf/iotdb-confignode.properties,根据服务器/虚拟机的 IP 地址和可用端口,设置以下参数: - -| **配置项** | **说明** | **默认** | **用法** | -| ------------------------------ | ------------------------------------------------------------ | --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| cn\_internal\_address | ConfigNode 在集群内部通讯使用的地址 | 127.0.0.1 | 设置为服务器的 IPV4 地址或域名 | -| cn\_internal\_port | ConfigNode 在集群内部通讯使用的端口 | 10710 | 设置为任意未占用端口 | -| cn\_consensus\_port | ConfigNode 副本组共识协议通信使用的端口 | 10720 | 设置为任意未占用端口 | -| cn\_target\_config\_node\_list | 节点注册加入集群时连接的 ConfigNode 的地址。注:只能配置一个 | 127.0.0.1:10710 | 对于 Seed-ConfigNode,设置为自己的 cn\_internal\_address:cn\_internal\_port;对于其它 ConfigNode,设置为另一个正在运行的 ConfigNode 的 cn\_internal\_address:cn\_internal\_port | - -**注意:上述配置项在节点启动后即不可更改,且务必保证所有端口均未被占用,否则节点无法启动。** - -#### DataNode 配置 - -打开 DataNode 配置文件 ./conf/iotdb-datanode.properties,根据服务器/虚拟机的 IP 地址和可用端口,设置以下参数: - -| **配置项** | **说明** | **默认** | **用法** | -| ----------------------------------- | ----------------------------------------- | --------------- | ---------------------------------------------------------------------------------------------------------- | -| dn\_rpc\_address | 客户端 RPC 服务的地址 | 127.0.0.1 | 设置为服务器的 IPV4 地址或域名 | -| dn\_rpc\_port | 客户端 RPC 服务的端口 | 6667 | 设置为任意未占用端口 | -| dn\_internal\_address | DataNode 在集群内部接收控制流使用的地址 | 127.0.0.1 | 设置为服务器的 IPV4 地址或域名 | -| dn\_internal\_port | DataNode 在集群内部接收控制流使用的端口 | 10730 | 设置为任意未占用端口 | -| dn\_mpp\_data\_exchange\_port | DataNode 在集群内部接收数据流使用的端口 | 10740 | 设置为任意未占用端口 | 
-| dn\_data\_region\_consensus\_port | DataNode 的数据副本间共识协议通信的端口 | 10750 | 设置为任意未占用端口 | -| dn\_schema\_region\_consensus\_port | DataNode 的元数据副本间共识协议通信的端口 | 10760 | 设置为任意未占用端口 | -| dn\_target\_config\_node\_list | 集群中正在运行的 ConfigNode 地址 | 127.0.0.1:10710 | 设置为任意正在运行的 ConfigNode 的 cn\_internal\_address:cn\_internal\_port,可设置多个,用逗号(",")隔开 | - -**注意:上述配置项在节点启动后即不可更改,且务必保证所有端口均未被占用,否则节点无法启动。** - -### 环境检查 - -**最后,在正式部署前,还需要对下列项目进行检查:** - -1. JDK>=1.8 的运行环境,并配置好 JAVA_HOME 环境变量。 -2. 设置最大文件打开数为 65535。 -3. 关闭交换内存。 -4. 首次启动 ConfigNode 节点时,确保已清空 ConfigNode 节点的 data/confignode 目录;首次启动 DataNode 节点时,确保已清空 DataNode 节点的 data/datanode 目录。 -5. 如果整个集群处在可信环境下,可以关闭机器上的防火墙选项。 -6. 在集群默认配置中,ConfigNode 会占用端口 10710 和 10720,DataNode 会占用端口 6667、10730、10740、10750 和 10760,请确保这些端口未被占用,或者手动修改配置文件中的端口配置。 - -### FAQ - -在部署集群过程中有任何问题,请参考 [分布式部署FAQ](../FAQ/Frequently-asked-questions.md)。 - -## 单机版部署 - -本小节描述如何启动包括 1 个 ConfigNode 和 1 个 DataNode 的实例。 - -### 启动流程 - -在完成配置文件的修改后(一般仅需要修改 IP 等信息) ,用户可以使用 sbin 文件夹下的 start-standalone 脚本启动 IoTDB。 - -Linux 系统与 MacOS 系统启动命令如下: - -``` -> bash sbin/start-standalone.sh -``` - -Windows 系统启动命令如下: - -``` -> sbin\start-standalone.bat -``` - -注意:目前,要使用单机模式,你需要保证所有的地址设置为 127.0.0.1,如果需要从非 IoTDB 所在的机器访问此IoTDB,请将配置项 `dn_rpc_address` 修改为 IoTDB 所在的机器 IP。 - -### 验证部署 - -若搭建的集群仅用于本地调试,可直接执行 ./sbin 目录下的 Cli 启动脚本: - -``` -# Linux -./sbin/start-cli.sh - -# Windows -.\sbin\start-cli.bat -``` - -若希望通过 Cli 连接生产环境的集群, -请阅读 [Cli 使用手册](../Tools-System/CLI.md)。 - -成功启动集群后,在 Cli 执行 `show cluster details`: -* 若所有节点的状态均为 **Running**,则说明集群部署成功; -* 否则,请阅读启动失败节点的运行日志,并检查对应的配置参数。 - -### 停止流程 - -Linux 系统与 MacOS 系统停止命令如下: - -``` -> bash sbin/stop-standalone.sh -``` - -Windows 系统停止命令如下: - -``` -> sbin\stop-standalone.bat -``` - -## 集群版部署(使用集群管理工具) - -IoTDB 集群管理工具是一款易用的运维工具(企业版工具)。旨在解决 IoTDB 分布式系统多节点的运维难题,主要包括集群部署、集群启停、弹性扩容、配置更新、数据导出等功能,从而实现对复杂数据库集群的一键式指令下发, -极大降低管理难度。本文档将说明如何用集群管理工具远程部署、配置、启动和停止 IoTDB 集群实例。 - -### 环境准备 - -本工具为 TimechoDB(基于IoTDB的企业版数据库)配套工具,您可以联系您的销售获取工具下载方式。 - -IoTDB 
要部署的机器需要依赖jdk 8及以上版本、lsof、netstat、unzip功能如果没有请自行安装,可以参考文档最后的一节环境所需安装命令。 - -提示:IoTDB集群管理工具需要使用有root权限的账号 - -### 部署方法 - -#### 下载安装 - -本工具为TimechoDB(基于IoTDB的企业版数据库)配套工具,您可以联系您的销售获取工具下载方式。 - -注意:由于二进制包仅支持GLIBC2.17 及以上版本,因此最低适配Centos7版本 - -* 在iotd目录内输入以下指令后: - -```bash -bash install-iotd.sh -``` - -即可在之后的 shell 内激活 iotd 关键词,如检查部署前所需的环境指令如下所示: - -```bash -iotd cluster check example -``` - -* 也可以不激活iotd直接使用 <iotd absolute path>/sbin/iotd 来执行命令,如检查部署前所需的环境: - -```bash -/sbin/iotd cluster check example -``` - -### 系统结构 - -IoTDB集群管理工具主要由config、logs、doc、sbin目录组成。 - -* `config`存放要部署的集群配置文件如果要使用集群部署工具需要修改里面的yaml文件。 - -* `logs` 存放部署工具日志,如果想要查看部署工具执行日志请查看`logs/iotd_yyyy_mm_dd.log`。 - -* `sbin` 存放集群部署工具所需的二进制包。 - -* `doc` 存放用户手册、开发手册和推荐部署手册。 - -### 集群配置文件介绍 - -* 在`iotd/config` 目录下有集群配置的yaml文件,yaml文件名字就是集群名字yaml 文件可以有多个,为了方便用户配置yaml文件在iotd/config目录下面提供了`default_cluster.yaml`示例。 -* yaml 文件配置由`global`、`confignode_servers`、`datanode_servers`、`grafana_server`、`prometheus_server`四大部分组成 -* global 是通用配置主要配置机器用户名密码、IoTDB本地安装文件、Jdk配置等。在`iotd/config`目录中提供了一个`default_cluster.yaml`样例数据, - 用户可以复制修改成自己集群名字并参考里面的说明进行配置IoTDB集群,在`default_cluster.yaml`样例中没有注释的均为必填项,已经注释的为非必填项。 - -例如要执行`default_cluster.yaml`检查命令则需要执行命令`iotd cluster check default_cluster`即可, -更多详细命令请参考下面命令列表。 - - -| 参数 | 说明 | 是否必填 | -|----------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------| -| iotdb\_zip\_dir | IoTDB 部署分发目录,如果值为空则从`iotdb_download_url`指定地址下载 | 非必填 | -| iotdb\_download\_url | IoTDB 下载地址,如果`iotdb_zip_dir` 没有值则从指定地址下载 | 非必填 | -| jdk\_tar\_dir | jdk 本地目录,可使用该 jdk 路径进行上传部署至目标节点。 | 非必填 | -| jdk\_deploy\_dir | jdk 远程机器部署目录,会将 jdk 部署到该目录下面,与下面的`jdk_dir_name`参数构成完整的jdk部署目录即 `/` | 非必填 | -| jdk\_dir\_name | jdk 解压后的目录名称默认是jdk_iotdb | 非必填 | -| iotdb\_lib\_dir | IoTDB lib 目录或者IoTDB 的lib 压缩包仅支持.zip格式 ,仅用于IoTDB升级,默认处于注释状态,如需升级请打开注释修改路径即可。如果使用zip文件请使用zip 命令压缩iotdb/lib目录例如 zip -r 
lib.zip apache-iotdb-1.2.0/lib/* | 非必填 | -| user | ssh登陆部署机器的用户名 | 必填 | -| password | ssh登录的密码, 如果password未指定使用pkey登陆, 请确保已配置节点之间ssh登录免密钥 | 非必填 | -| pkey | 密钥登陆如果password有值优先使用password否则使用pkey登陆 | 非必填 | -| ssh\_port | ssh登录端口 | 必填 | -| deploy\_dir | IoTDB 部署目录,会把 IoTDB 部署到该目录下面与下面的`iotdb_dir_name`参数构成完整的IoTDB 部署目录即 `/` | 必填 | -| iotdb\_dir\_name | IoTDB 解压后的目录名称默认是iotdb | 非必填 | -| datanode-env.sh | 对应`iotdb/config/datanode-env.sh` ,在`global`与`confignode_servers`同时配置值时优先使用`confignode_servers`中的值 | 非必填 | -| confignode-env.sh | 对应`iotdb/config/confignode-env.sh`,在`global`与`datanode_servers`同时配置值时优先使用`datanode_servers`中的值 | 非必填 | -| iotdb-common.properties | 对应`iotdb/config/iotdb-common.properties` | 非必填 | -| cn_target_config_node_list | 集群配置地址指向存活的ConfigNode,默认指向confignode_x,在`global`与`confignode_servers`同时配置值时优先使用`confignode_servers`中的值,对应`iotdb/config/iotdb-confignode.properties`中的`cn_target_config_node_list` | 必填 | -| dn_target_config_node_list | 集群配置地址指向存活的ConfigNode,默认指向confignode_x,在`global`与`datanode_servers`同时配置值时优先使用`datanode_servers`中的值,对应`iotdb/config/iotdb-datanode.properties`中的`dn_target_config_node_list` | 必填 | - -其中datanode-env.sh 和confignode-env.sh 可以配置额外参数extra_opts,当该参数配置后会在datanode-env.sh 和confignode-env.sh 后面追加对应的值,可参考default_cluster.yaml,配置示例如下: -datanode-env.sh: -extra_opts: | -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC" -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200" - -* confignode_servers 是部署IoTDB Confignodes配置,里面可以配置多个Confignode - 默认将第一个启动的ConfigNode节点node1当作Seed-ConfigNode - -| 参数 | 说明 | 是否必填 | -|--------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------| -| name | Confignode 名称 | 必填 | -| deploy_dir | IoTDB config node 部署目录 | 必填| | -| iotdb-confignode.properties | 对应`iotdb/config/iotdb-confignode.properties`更加详细请参看`iotdb-confignode.properties`文件说明 | 非必填 | -| 
cn\_internal\_address | 对应iotdb/内部通信地址,对应`iotdb/config/iotdb-confignode.properties`中的`cn_internal_address` | 必填 | -| cn\_target\_config\_node\_list | 集群配置地址指向存活的ConfigNode,默认指向confignode_x,在`global`与`confignode_servers`同时配置值时优先使用`confignode_servers`中的值,对应`iotdb/config/iotdb-confignode.properties`中的`cn_target_config_node_list` | 必填 | -| cn\_internal\_port | 内部通信端口,对应`iotdb/config/iotdb-confignode.properties`中的`cn_internal_port` | 必填 | -| cn\_consensus\_port | 对应`iotdb/config/iotdb-confignode.properties`中的`cn_consensus_port` | 非必填 | -| cn\_data\_dir | 对应`iotdb/config/iotdb-confignode.properties`中的`cn_data_dir` | 必填 | -| iotdb-common.properties | 对应`iotdb/config/iotdb-common.properties`在`global`与`confignode_servers`同时配置值优先使用confignode_servers中的值 | 非必填 | - -* datanode_servers 是部署IoTDB Datanodes配置,里面可以配置多个Datanode - -| 参数 | 说明 |是否必填| -|--------------------------------| --- |--- | -| name |Datanode 名称|必填| -| deploy\_dir |IoTDB data node 部署目录|必填| -| iotdb-datanode.properties |对应`iotdb/config/iotdb-datanode.properties`更加详细请参看`iotdb-datanode.properties`文件说明|非必填| -| dn\_rpc\_address |datanode rpc 地址对应`iotdb/config/iotdb-datanode.properties`中的`dn_rpc_address`|必填| -| dn\_internal\_address |内部通信地址,对应`iotdb/config/iotdb-datanode.properties`中的`dn_internal_address`|必填| -| dn\_target\_config\_node\_list |集群配置地址指向存活的ConfigNode,默认指向confignode_x,在`global`与`datanode_servers`同时配置值时优先使用`datanode_servers`中的值,对应`iotdb/config/iotdb-datanode.properties`中的`dn_target_config_node_list`|必填| -| dn\_rpc\_port |datanode rpc端口地址,对应`iotdb/config/iotdb-datanode.properties`中的`dn_rpc_port`|必填| -| dn\_internal\_port |内部通信端口,对应`iotdb/config/iotdb-datanode.properties`中的`dn_internal_port`|必填| -| iotdb-common.properties |对应`iotdb/config/iotdb-common.properties`在`global`与`datanode_servers`同时配置值优先使用`datanode_servers`中的值|非必填| - -* grafana_server 是部署Grafana 相关配置 - -| 参数 | 说明 | 是否必填 | -|--------------------|------------------|-------------------| -| grafana\_dir\_name | grafana 解压目录名称 | 非必填默认grafana_iotdb | -| 
host | grafana 部署的服务器ip | 必填 | -| grafana\_port | grafana 部署机器的端口 | 非必填,默认3000 | -| deploy\_dir | grafana 部署服务器目录 | 必填 | -| grafana\_tar\_dir | grafana 压缩包位置 | 必填 | -| dashboards | dashboards 所在的位置 | 非必填,多个用逗号隔开 | - -* prometheus_server 是部署Prometheus 相关配置 - -| 参数 | 说明 | 是否必填 | -|--------------------------------|------------------|-----------------------| -| prometheus\_dir\_name | prometheus 解压目录名称 | 非必填默认prometheus_iotdb | -| host | prometheus 部署的服务器ip | 必填 | -| prometheus\_port | prometheus 部署机器的端口 | 非必填,默认9090 | -| deploy\_dir | prometheus 部署服务器目录 | 必填 | -| prometheus\_tar\_dir | prometheus 压缩包位置 | 必填 | -| storage\_tsdb\_retention\_time | 默认保存数据天数 默认15天 | 非必填 | -| storage\_tsdb\_retention\_size | 指定block可以保存的数据大小默认512M ,注意单位KB, MB, GB, TB, PB, EB | 非必填 | - -如果在config/xxx.yaml的`iotdb-datanode.properties`和`iotdb-confignode.properties`中配置了metrics,则会自动把配置放入到promethues无需手动修改 - -注意:如何配置yaml key对应的值包含特殊字符如:等建议整个value使用双引号,对应的文件路径中不要使用包含空格的路径,防止出现识别出现异常问题。 - -### 使用场景 - -#### 清理数据场景 - -* 清理集群数据场景会删除IoTDB集群中的data目录以及yaml文件中配置的`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`和`ext`目录。 -* 首先执行停止集群命令、然后在执行集群清理命令。 -```bash -iotd cluster stop default_cluster -iotd cluster clean default_cluster -``` - -#### 集群销毁场景 - -* 集群销毁场景会删除IoTDB集群中的`data`、`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`、`ext`、`IoTDB`部署目录、 - grafana部署目录和prometheus部署目录。 -* 首先执行停止集群命令、然后在执行集群销毁命令。 - - -```bash -iotd cluster stop default_cluster -iotd cluster destroy default_cluster -``` - -#### 集群升级场景 - -* 集群升级首先需要在config/xxx.yaml中配置`iotdb_lib_dir`为要上传到服务器的jar所在目录路径(例如iotdb/lib)。 -* 如果使用zip文件上传请使用zip 命令压缩iotdb/lib目录例如 zip -r lib.zip apache-iotdb-1.2.0/lib/* -* 执行上传命令、然后执行重启IoTDB集群命令即可完成集群升级 - -```bash -iotd cluster upgrade default_cluster -iotd cluster restart default_cluster -``` - -#### 集群配置文件的热部署场景 - -* 首先修改在config/xxx.yaml中配置。 -* 执行分发命令、然后执行热部署命令即可完成集群配置的热部署 - -```bash -iotd cluster distribute default_cluster -iotd 
cluster reload default_cluster -``` - -#### 集群扩容场景 - -* 首先修改在config/xxx.yaml中添加一个datanode 或者confignode 节点。 -* 执行集群扩容命令 -```bash -iotd cluster scaleout default_cluster -``` - -#### 集群缩容场景 - -* 首先在config/xxx.yaml中找到要缩容的节点名字或者ip+port(其中confignode port 是cn_internal_port、datanode port 是rpc_port) -* 执行集群缩容命令 -```bash -iotd cluster scalein default_cluster -``` - -#### 已有IoTDB集群,使用集群部署工具场景 - -* 配置服务器的`user`、`passwod`或`pkey`、`ssh_port` -* 修改config/xxx.yaml中IoTDB 部署路径,`deploy_dir`(IoTDB 部署目录)、`iotdb_dir_name`(IoTDB解压目录名称,默认是iotdb) - 例如IoTDB 部署完整路径是`/home/data/apache-iotdb-1.1.1`则需要修改yaml文件`deploy_dir:/home/data/`、`iotdb_dir_name:apache-iotdb-1.1.1` -* 如果服务器不是使用的java_home则修改`jdk_deploy_dir`(jdk 部署目录)、`jdk_dir_name`(jdk解压后的目录名称,默认是jdk_iotdb),如果使用的是java_home 则不需要修改配置 - 例如jdk部署完整路径是`/home/data/jdk_1.8.2`则需要修改yaml文件`jdk_deploy_dir:/home/data/`、`jdk_dir_name:jdk_1.8.2` -* 配置`cn_target_config_node_list`、`dn_target_config_node_list` -* 配置`confignode_servers`中`iotdb-confignode.properties`里面的`cn_internal_address`、`cn_internal_port`、`cn_consensus_port`、`cn_system_dir`、 - `cn_consensus_dir`和`iotdb-common.properties`里面的值不是IoTDB默认的则需要配置否则可不必配置 -* 配置`datanode_servers`中`iotdb-datanode.properties`里面的`dn_rpc_address`、`dn_internal_address`、`dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`和`iotdb-common.properties`等 -* 执行初始化命令 - -```bash -iotd cluster init default_cluster -``` - -#### 一键部署IoTDB、Grafana和Prometheus 场景 - -* 配置`iotdb-datanode.properties` 、`iotdb-confignode.properties` 打开metrics接口 -* 配置Grafana 配置,如果`dashboards` 有多个就用逗号隔开,名字不能重复否则会被覆盖。 -* 配置Prometheus配置,IoTDB 集群配置了metrics 则无需手动修改Prometheus 配置会根据哪个节点配置了metrics,自动修改Prometheus 配置。 -* 启动集群 - -```bash -iotd cluster start default_cluster -``` - -更加详细参数请参考上方的 集群配置文件介绍 - - -### 命令格式 - -本工具的基本用法为: -```bash -iotd cluster [params (Optional)] -``` -* key 表示了具体的命令。 - -* cluster name 表示集群名称(即`iotd/config` 文件中yaml文件名字)。 - -* params 表示了命令的所需参数(选填)。 - -* 例如部署default_cluster集群的命令格式为: - -```bash -iotd cluster deploy default_cluster -``` - -* 
集群的功能及参数列表如下: - -| 命令 | 功能 | 参数 | -|------------|----------------------------|-------------------------------------------------------------------------------------------------------------------------| -| check | 检测集群是否可以部署 | 集群名称列表 | -| clean | 清理集群 | 集群名称 | -| deploy | 部署集群 | 集群名称 ,-N,模块名称(iotdb、grafana、prometheus可选),-op force(可选) | -| list | 打印集群及状态列表 | 无 | -| start | 启动集群 | 集群名称,-N,节点名称(nodename、grafana、prometheus可选) | -| stop | 关闭集群 | 集群名称,-N,节点名称(nodename、grafana、prometheus可选) ,-op force(nodename、grafana、prometheus可选) | -| restart | 重启集群 | 集群名称,-N,节点名称(nodename、grafana、prometheus可选),-op force(nodename、grafana、prometheus可选) | -| show | 查看集群信息,details字段表示展示集群信息细节 | 集群名称, details(可选) | -| destroy | 销毁集群 | 集群名称,-N,模块名称(iotdb、grafana、prometheus可选) | -| scaleout | 集群扩容 | 集群名称 | -| scalein | 集群缩容 | 集群名称,-N,集群节点名字或集群节点ip+port | -| reload | 集群热加载 | 集群名称 | -| distribute | 集群配置文件分发 | 集群名称 | -| dumplog | 备份指定集群日志 | 集群名称,-N,集群节点名字 -h 备份至目标机器ip -pw 备份至目标机器密码 -p 备份至目标机器端口 -path 备份的目录 -startdate 起始时间 -enddate 结束时间 -loglevel 日志类型 -l 传输速度 | -| dumpdata | 备份指定集群数据 | 集群名称, -h 备份至目标机器ip -pw 备份至目标机器密码 -p 备份至目标机器端口 -path 备份的目录 -startdate 起始时间 -enddate 结束时间 -l 传输速度 | -| upgrade | lib 包升级 | 集群名字(升级完后请重启) | -| init | 已有集群使用集群部署工具时,初始化集群配置 | 集群名字,初始化集群配置 | -| status | 查看进程状态 | 集群名字 | - -### 详细命令执行过程 - -下面的命令都是以default_cluster.yaml 为示例执行的,用户可以修改成自己的集群文件来执行 - -#### 检查集群部署环境命令 - -```bash -iotd cluster check default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 验证目标节点是否能够通过 SSH 登录 - -* 验证对应节点上的 JDK 版本是否满足IoTDB jdk1.8及以上版本、服务器是否按照unzip、是否安装lsof 或者netstat - -* 如果看到下面提示`Info:example check successfully!` 证明服务器已经具备安装的要求, - 如果输出`Error:example check fail!` 证明有部分条件没有满足需求可以查看上面的输出的Error日志(例如:`Error:Server (ip:172.20.31.76) iotdb port(10713) is listening`)进行修复, - 如果检查jdk没有满足要求,我们可以自己在yaml 文件中配置一个jdk1.8 及以上版本的进行部署不影响后面使用, - 如果检查lsof、netstat或者unzip 不满足要求需要在服务器上自行安装。 - -#### 部署集群命令 - -```bash -iotd cluster deploy default_cluster -``` - -* 根据 
cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 根据`confignode_servers` 和`datanode_servers`中的节点信息上传IoTDB压缩包和jdk压缩包(如果yaml中配置`jdk_tar_dir`和`jdk_deploy_dir`值) - -* 根据yaml文件节点配置信息生成并上传`iotdb-common.properties`、`iotdb-confignode.properties`、`iotdb-datanode.properties` - -```bash -iotd cluster deploy default_cluster -op force -``` -注意:该命令会强制执行部署,具体过程会删除已存在的部署目录重新部署 - -*部署单个模块* -```bash -# 部署grafana模块 -iotd cluster deploy default_cluster -N grafana -# 部署prometheus模块 -iotd cluster deploy default_cluster -N prometheus -# 部署iotdb模块 -iotd cluster deploy default_cluster -N iotdb -``` - -#### 启动集群命令 - -```bash -iotd cluster start default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 启动confignode,根据yaml配置文件中`confignode_servers`中的顺序依次启动同时根据进程id检查confignode是否正常,第一个confignode 为seek config - -* 启动datanode,根据yaml配置文件中`datanode_servers`中的顺序依次启动同时根据进程id检查datanode是否正常 - -* 如果根据进程id检查进程存在后,通过cli依次检查集群列表中每个服务是否正常,如果cli链接失败则每隔10s重试一次直到成功最多重试5次 - - -*启动单个节点命令* -```bash -#按照IoTDB 节点名称启动 -iotd cluster start default_cluster -N datanode_1 -#按照IoTDB 集群ip+port启动,其中port对应confignode的cn_internal_port、datanode的rpc_port -iotd cluster start default_cluster -N 192.168.1.5:6667 -#启动grafana -iotd cluster start default_cluster -N grafana -#启动prometheus -iotd cluster start default_cluster -N prometheus -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件 - -* 根据提供的节点名称或者ip:port找到对于节点位置信息,如果启动的节点是`data_node`则ip使用yaml 文件中的`dn_rpc_address`、port 使用的是yaml文件中datanode_servers 中的`dn_rpc_port`。 - 如果启动的节点是`config_node`则ip使用的是yaml文件中confignode_servers 中的`cn_internal_address` 、port 使用的是`cn_internal_port` - -* 启动该节点 - -说明:由于集群部署工具仅是调用了IoTDB集群中的start-confignode.sh和start-datanode.sh 脚本, -在实际输出结果失败时有可能是集群还未正常启动,建议使用status命令进行查看当前集群状态(iotd cluster status xxx) - - -#### 查看IoTDB集群状态命令 - -```bash -iotd cluster show default_cluster -#查看IoTDB集群详细信息 -iotd cluster show default_cluster details -``` -* 根据 cluster-name 找到默认位置的 yaml 
文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 依次在datanode通过cli执行`show cluster details` 如果有一个节点执行成功则不会在后续节点继续执行cli直接返回结果 - - -#### 停止集群命令 - - -```bash -iotd cluster stop default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 根据`datanode_servers`中datanode节点信息,按照配置先后顺序依次停止datanode节点 - -* 根据`confignode_servers`中confignode节点信息,按照配置依次停止confignode节点 - -*强制停止集群命令* - -```bash -iotd cluster stop default_cluster -op force -``` -会直接执行kill -9 pid 命令强制停止集群 - -*停止单个节点命令* - -```bash -#按照IoTDB 节点名称停止 -iotd cluster stop default_cluster -N datanode_1 -#按照IoTDB 集群ip+port停止(ip+port是按照datanode中的ip+dn_rpc_port获取唯一节点或confignode中的ip+cn_internal_port获取唯一节点) -iotd cluster stop default_cluster -N 192.168.1.5:6667 -#停止grafana -iotd cluster stop default_cluster -N grafana -#停止prometheus -iotd cluster stop default_cluster -N prometheus -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件 - -* 根据提供的节点名称或者ip:port找到对应节点位置信息,如果停止的节点是`data_node`则ip使用yaml 文件中的`dn_rpc_address`、port 使用的是yaml文件中datanode_servers 中的`dn_rpc_port`。 - 如果停止的节点是`config_node`则ip使用的是yaml文件中confignode_servers 中的`cn_internal_address` 、port 使用的是`cn_internal_port` - -* 停止该节点 - -说明:由于集群部署工具仅是调用了IoTDB集群中的stop-confignode.sh和stop-datanode.sh 脚本,在某些情况下有可能iotdb集群并未停止。 - - -#### 清理集群数据命令 - -```bash -iotd cluster clean default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`配置信息 - -* 根据`confignode_servers`、`datanode_servers`中的信息,检查是否还有服务正在运行, - 如果有任何一个服务正在运行则不会执行清理命令 - -* 删除IoTDB集群中的data目录以及yaml文件中配置的`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`和`ext`目录。 - - - -#### 重启集群命令 - -```bash -iotd cluster restart default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 - -* 执行上述的停止集群命令(stop),然后执行启动集群命令(start) 具体参考上面的start 和stop 命令 - -*强制重启集群命令* - -```bash -iotd cluster restart default_cluster -op force -``` -会直接执行kill -9 pid 命令强制停止集群,然后启动集群 - 
-*重启单个节点命令* - -```bash -#按照IoTDB 节点名称重启datanode_1 -iotd cluster restart default_cluster -N datanode_1 -#按照IoTDB 节点名称重启confignode_1 -iotd cluster restart default_cluster -N confignode_1 -#重启grafana -iotd cluster restart default_cluster -N grafana -#重启prometheus -iotd cluster restart default_cluster -N prometheus -``` - -#### 集群缩容命令 - -```bash -#按照节点名称缩容 -iotd cluster scalein default_cluster -N nodename -#按照ip+port缩容(ip+port按照datanode中的ip+dn_rpc_port获取唯一节点,confignode中的ip+cn_internal_port获取唯一节点) -iotd cluster scalein default_cluster -N ip:port -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 判断要缩容的confignode节点和datanode是否只剩一个,如果只剩一个则不能执行缩容 - -* 然后根据ip:port或者nodename 获取要缩容的节点信息,执行缩容命令,然后销毁该节点目录,如果缩容的节点是`data_node`则ip使用yaml 文件中的`dn_rpc_address`、port 使用的是yaml文件中datanode_servers 中的`dn_rpc_port`。 - 如果缩容的节点是`config_node`则ip使用的是yaml文件中confignode_servers 中的`cn_internal_address` 、port 使用的是`cn_internal_port` - - -提示:目前一次仅支持一个节点缩容 - -#### 集群扩容命令 - -```bash -iotd cluster scaleout default_cluster -``` -* 修改config/xxx.yaml 文件添加一个datanode 节点或者confignode节点 - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 找到要扩容的节点,执行上传IoTDB压缩包和jdb包(如果yaml中配置`jdk_tar_dir`和`jdk_deploy_dir`值)并解压 - -* 根据yaml文件节点配置信息生成并上传`iotdb-common.properties`、`iotdb-confignode.properties`或`iotdb-datanode.properties` - -* 执行启动该节点命令并校验节点是否启动成功 - -提示:目前一次仅支持一个节点扩容 - -#### 销毁集群命令 -```bash -iotd cluster destroy default_cluster -``` - -* cluster-name 找到默认位置的 yaml 文件 - -* 根据`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`中node节点信息,检查是否节点还在运行, - 如果有任何一个节点正在运行则停止销毁命令 - -* 删除IoTDB集群中的`data`以及yaml文件配置的`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`、`ext`、`IoTDB`部署目录、 - grafana部署目录和prometheus部署目录 - -*销毁单个模块* -```bash -# 销毁grafana模块 -iotd cluster destroy default_cluster -N grafana -# 销毁prometheus模块 -iotd cluster destroy default_cluster -N prometheus -# 销毁iotdb模块 -iotd cluster destroy 
default_cluster -N iotdb -``` - -#### 分发集群配置命令 -```bash -iotd cluster distribute default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 - -* 根据yaml文件节点配置信息生成并依次上传`iotdb-common.properties`、`iotdb-confignode.properties`、`iotdb-datanode.properties`、到指定节点 - -#### 热加载集群配置命令 -```bash -iotd cluster reload default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 根据yaml文件节点配置信息依次在cli中执行`load configuration` - -#### 集群节点日志备份 -```bash -iotd cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs' -``` -* 根据 cluster-name 找到默认位置的 yaml 文件 - -* 该命令会根据yaml文件校验datanode_1,confignode_1 是否存在,然后根据配置的起止日期(startdate<=logtime<=enddate)备份指定节点datanode_1,confignode_1 的日志数据到指定服务`192.168.9.48` 端口`36000` 数据备份路径是 `/iotdb/logs` ,IoTDB日志存储路径在`/root/data/db/iotdb/logs`(非必填,如果不填写-logs xxx 默认从IoTDB安装路径/logs下面备份日志) - -| 命令 | 功能 | 是否必填 | -|------------|------------------------------------| ---| -| -h | 存放备份数据机器ip |否| -| -u | 存放备份数据机器用户名 |否| -| -pw | 存放备份数据机器密码 |否| -| -p | 存放备份数据机器端口(默认22) |否| -| -path | 存放备份数据的路径(默认当前路径) |否| -| -loglevel | 日志基本有all、info、error、warn(默认是全部) |否| -| -l | 限速(默认不限速范围0到104857601 单位Kbit/s) |否| -| -N | 配置文件集群名称多个用逗号隔开 |是| -| -startdate | 起始时间(包含默认1970-01-01) |否| -| -enddate | 截止时间(包含) |否| -| -logs | IoTDB 日志存放路径,默认是({iotdb}/logs) |否| - -#### 集群节点数据备份 -```bash -iotd cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas' -``` -* 该命令会根据yaml文件获取leader 节点,然后根据起止日期(startdate<=logtime<=enddate)备份数据到192.168.9.48 服务上的/iotdb/datas 目录下 - -| 命令 | 功能 | 是否必填 | -| ---|---------------------------------| ---| -|-h| 存放备份数据机器ip |否| -|-u| 存放备份数据机器用户名 |否| -|-pw| 存放备份数据机器密码 |否| -|-p| 存放备份数据机器端口(默认22) |否| -|-path| 存放备份数据的路径(默认当前路径) 
|否| -|-granularity| 类型partition |是| -|-l| 限速(默认不限速范围0到104857601 单位Kbit/s) |否| -|-startdate| 起始时间(包含) |是| -|-enddate| 截止时间(包含) |是| - -#### 集群升级 -```bash -iotd cluster upgrade default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 上传lib包 - -注意执行完升级后请重启IoTDB 才能生效 - -#### 集群初始化 -```bash -iotd cluster init default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 -* 初始化集群配置 - -#### 查看集群进程状态 -```bash -iotd cluster status default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 -* 展示集群的存活状态 - -### 集群部署工具样例介绍 -在集群部署工具安装目录中config/example 下面有3个yaml样例,如果需要可以复制到config 中进行修改即可 - -| 名称 | 说明 | -|-----------------------------|------------------------------------------------| -| default\_1c1d.yaml | 1个confignode和1个datanode 配置样例 | -| default\_3c3d.yaml | 3个confignode和3个datanode 配置样例 | -| default\_3c3d\_grafa\_prome | 3个confignode和3个datanode、Grafana、Prometheus配置样例 | - -## 集群版部署(手工部署) - -### 启动流程 - -本小节描述如何启动包括若干 ConfigNode 和 DataNode 的集群。 -集群可以提供服务的标准是至少启动一个 ConfigNode 且启动不小于(数据/元数据)副本个数的 DataNode。 - -总体启动流程分为三步: - -1. 启动种子 ConfigNode -2. 增加 ConfigNode(可选) -3. 
Add DataNodes
-
-#### Start the Seed-ConfigNode
-
-**The first node started in the cluster must be a ConfigNode, and the first ConfigNode started must follow this section.**
-
-The first ConfigNode started is the Seed-ConfigNode; starting it marks the creation of a new cluster.
-Before starting the Seed-ConfigNode, open the common configuration file ./conf/iotdb-common.properties and check the following parameters:
-
-| **Parameter** | **Check** |
-| ------------------------------------------ | -------------------------- |
-| cluster\_name | Set to the desired cluster name |
-| config\_node\_consensus\_protocol\_class | Set to the desired consensus protocol |
-| schema\_replication\_factor | Set to the desired number of schema replicas |
-| schema\_region\_consensus\_protocol\_class | Set to the desired consensus protocol |
-| data\_replication\_factor | Set to the desired number of data replicas |
-| data\_region\_consensus\_protocol\_class | Set to the desired consensus protocol |
-
-**Note:** Configure the common parameters according to the [Deployment Recommendation](./Deployment-Recommendation.md). These parameters cannot be changed after the first configuration.
-
-Next, open its configuration file ./conf/iotdb-confignode.properties and check the following parameters:
-
-
-| **Parameter** | **Check** |
-| ------------------------------ | ----------------------------------------------------------------------- |
-| cn\_internal\_address | Set to the server's IPv4 address or domain name |
-| cn\_internal\_port | The port is not in use |
-| cn\_consensus\_port | The port is not in use |
-| cn\_target\_config\_node\_list | Set to its own internal address, i.e. cn\_internal\_address:cn\_internal\_port |
-
-After the checks, run the start script on the server:
-
-```
-# Linux, foreground
-bash ./sbin/start-confignode.sh
-
-# Linux, background
-nohup bash ./sbin/start-confignode.sh >/dev/null 2>&1 &
-
-# Windows
-.\sbin\start-confignode.bat
-```
-
-For other ConfigNode parameters, see
-[ConfigNode Configuration Parameters](../Reference/ConfigNode-Config-Manual.md).
-
-#### Add more ConfigNodes (optional)
-
-**Every ConfigNode other than the first one started must follow this section.**
-
-More ConfigNodes can be added to keep the ConfigNodes highly available. A common setup adds two more, for three ConfigNodes in total.
-
-A new ConfigNode must have all parameters in ./conf/iotdb-common.properties identical to those of the Seed-ConfigNode; otherwise it may fail to start or hit runtime errors.
-Pay particular attention to the following parameters in the common configuration file:
-
-| **Parameter** | **Check** |
-| ------------------------------------------ | --------------------------- |
-| cluster\_name | Same as the Seed-ConfigNode |
-| config\_node\_consensus\_protocol\_class | Same as the Seed-ConfigNode |
-| schema\_replication\_factor | Same as the Seed-ConfigNode |
-| schema\_region\_consensus\_protocol\_class | Same as the Seed-ConfigNode |
-| data\_replication\_factor | Same as the Seed-ConfigNode |
-| data\_region\_consensus\_protocol\_class | Same as the Seed-ConfigNode |
-
-Next, open its configuration file ./conf/iotdb-confignode.properties and check the following parameters:
-
-| **Parameter** | **Check** |
-| ------------------------------ | ------------------------------------------------------------------------------------------- |
-| cn\_internal\_address | Set to the server's IPv4 address or domain name |
-| cn\_internal\_port | The port is not in use |
-| cn\_consensus\_port | The port is not in use |
-| cn\_target\_config\_node\_list | Set to the internal address of another running ConfigNode; the Seed-ConfigNode's address is recommended |
-
-After the checks, run the start script on the server:
-
-```
-# Linux, foreground
-bash ./sbin/start-confignode.sh
-
-# Linux, background
-nohup bash ./sbin/start-confignode.sh >/dev/null 2>&1 &
-
-# Windows
-.\sbin\start-confignode.bat
-```
-
-For other ConfigNode parameters, see
-[ConfigNode Configuration Parameters](../Reference/ConfigNode-Config-Manual.md).
-
-#### Add DataNodes
-
-**Add DataNodes only after the cluster has a running ConfigNode.**
-
-Any number of DataNodes can be added to the cluster.
-Before adding a new DataNode, open the common configuration file ./conf/iotdb-common.properties and check the following parameter:
-
-
-| **Parameter** | **Check** |
-| ------------- | --------------------------- |
-| cluster\_name | Same as the Seed-ConfigNode |
-
-Next, open its configuration file ./conf/iotdb-datanode.properties and check the following parameters:
-
-
-| **Parameter** | **Check** |
-| ----------------------------------- | ------------------------------------------------------------------------------------- |
-| dn\_rpc\_address | Set to the server's IPv4 address or domain name |
-| dn\_rpc\_port | The port is not in use |
-| dn\_internal\_address | Set to the server's IPv4 address or domain name |
-| dn\_internal\_port | The port is not in use |
-| dn\_mpp\_data\_exchange\_port | The port is not in use |
-| dn\_data\_region\_consensus\_port | The port is not in use |
-| dn\_schema\_region\_consensus\_port | The port is not in use |
-| dn\_target\_config\_node\_list | Set to the internal address of a running ConfigNode; the Seed-ConfigNode's address is recommended |
-
-After the checks, run the start script on the server:
-
-```
-# Linux, foreground
-bash ./sbin/start-datanode.sh
-
-# Linux, background
-nohup bash ./sbin/start-datanode.sh >/dev/null 2>&1 &
-
-# Windows
-.\sbin\start-datanode.bat
-```
-
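The "port is not in use" checks in the tables above can be scripted before a start script is run. The following is a minimal sketch and not part of IoTDB: the helper name `port_is_free` is hypothetical, and the port list is an assumption taken from the default DataNode ports used elsewhere in this guide (6667, 10730, 10740, 10750, 10760). Replace it with the values from your own properties files.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or "permission denied".
            return False
        return True


if __name__ == "__main__":
    # Assumed default DataNode ports; adjust to ./conf/iotdb-datanode.properties.
    for port in (6667, 10730, 10740, 10750, 10760):
        state = "free" if port_is_free(port) else "IN USE"
        print(f"{port}: {state}")
```

The check only reflects the moment it runs, so a port reported as free can still be taken before the node starts. It is a convenience, not a guarantee.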
-For other DataNode parameters, see
-[DataNode Configuration Parameters](../Reference/DataNode-Config-Manual.md).
-
-**Note: the cluster can serve requests if and only if it has at least as many DataNodes as the replication factor (max{schema\_replication\_factor, data\_replication\_factor}).**
-
-### Verify the deployment
-
-If the cluster is only used for local debugging, run the Cli start script in the ./sbin directory directly:
-
-```
-# Linux
-./sbin/start-cli.sh
-
-# Windows
-.\sbin\start-cli.bat
-```
-
-To connect to a production cluster through the Cli,
-read the [Cli User Manual](../Tools-System/CLI.md).
-
-
-Take a 3C3D cluster (3 ConfigNodes and 3 DataNodes) started on 6 servers as an example.
-Assume the 3 ConfigNodes have the IP addresses 192.168.1.10, 192.168.1.11, and 192.168.1.12, and all use the default ports 10710 and 10720;
-the 3 DataNodes have the IP addresses 192.168.1.20, 192.168.1.21, and 192.168.1.22, and all use the default ports 6667, 10730, 10740, 10750, and 10760.
-
-After the cluster has started, running `show cluster details` in the Cli should print the following:
-
-```
-IoTDB> show cluster details
-+------+----------+-------+---------------+------------+-------------------+------------+-------+-------+-------------------+-----------------+
-|NodeID|  NodeType| Status|InternalAddress|InternalPort|ConfigConsensusPort|  RpcAddress|RpcPort|MppPort|SchemaConsensusPort|DataConsensusPort|
-+------+----------+-------+---------------+------------+-------------------+------------+-------+-------+-------------------+-----------------+
-|     0|ConfigNode|Running|   192.168.1.10|       10710|              10720|            |       |       |                   |                 |
-|     2|ConfigNode|Running|   192.168.1.11|       10710|              10720|            |       |       |                   |                 |
-|     3|ConfigNode|Running|   192.168.1.12|       10710|              10720|            |       |       |                   |                 |
-|     1|  DataNode|Running|   192.168.1.20|       10730|                   |192.168.1.20|   6667|  10740|              10750|            10760|
-|     4|  DataNode|Running|   192.168.1.21|       10730|                   |192.168.1.21|   6667|  10740|              10750|            10760|
-|     5|  DataNode|Running|   192.168.1.22|       10730|                   |192.168.1.22|   6667|  10740|              10750|            10760|
-+------+----------+-------+---------------+------------+-------------------+------------+-------+-------+-------------------+-----------------+
-Total line number = 6
-It costs 0.012s
-```
-
-If the status of every node is **Running**, the cluster was deployed successfully;
-otherwise, read the logs of the nodes that failed to start and check the corresponding configuration parameters.
-
-### Stopping
-
-This section describes how to manually stop the ConfigNode or DataNode processes of IoTDB.
-
-#### Stop a ConfigNode with the script
-
-Run the ConfigNode stop script:
-
-```
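-# Linux
-./sbin/stop-confignode.sh
-
-# Windows
-.\sbin\stop-confignode.bat
-```
-
-#### Stop a DataNode with the script
-
-Run the DataNode stop script:
-
-```
-# Linux
-./sbin/stop-datanode.sh
-
-# Windows
-.\sbin\stop-datanode.bat
-```
-
-#### Stop the node process
-
-First, get the process ID of the node:
-
-```
-jps
-
-# or
-
-ps aux | grep iotdb
-```
-
-Then terminate the process:
-
-```
-kill -9 
-```
-
-**Note: some port information can only be obtained with root privileges; use sudo in that case.**
-
-### Cluster scale-in
-
-This section describes how to remove a ConfigNode or DataNode from the cluster.
-
-#### Remove a ConfigNode
-
-Before removing a ConfigNode, make sure the cluster will still have at least one active ConfigNode afterwards.
-Run the remove-confignode script on an active ConfigNode:
-
-```
-# Linux
-## Remove the node by confignode_id
-./sbin/remove-confignode.sh 
-
-## Remove the node by the ConfigNode internal address and port
-./sbin/remove-confignode.sh :
-
-
-# Windows
-## Remove the node by confignode_id
-.\sbin\remove-confignode.bat 
-
-## Remove the node by the ConfigNode internal address and port
-.\sbin\remove-confignode.bat :
-```
-
-#### Remove a DataNode
-
-Before removing a DataNode, make sure the number of DataNodes left in the cluster is no less than the (data/schema) replication factor.
-Run the remove-datanode script on an active DataNode:
-
-```
-# Linux
-## Remove the node by datanode_id
-./sbin/remove-datanode.sh 
-
-## Remove the node by the DataNode RPC address and port
-./sbin/remove-datanode.sh :
-
-
-# Windows
-## Remove the node by datanode_id
-.\sbin\remove-datanode.bat 
-
-## Remove the node by the DataNode RPC address and port
-.\sbin\remove-datanode.bat :
-```
-
-## Dual-active deployment
-
-An IoTDB dual-active deployment consists of two independent clusters whose configurations are fully independent. Both can accept external writes at the same time, and each cluster synchronizes the data written to it to the other cluster,
-so the data in the two clusters is eventually consistent.
-
-The two clusters form a high-availability group: when one cluster stops serving, the other is not affected. When the stopped cluster starts again, the other cluster synchronizes the newly written data to it.
-Applications can bind to both clusters for reads and writes to achieve high availability.
-
-The dual-active scheme provides high availability with fewer than 3 physical nodes, which gives it some advantage in deployment cost.
-
-See the [Distributed Deployment FAQ](https://iotdb.apache.org/zh/UserGuide/Master/FAQ/FAQ-for-cluster-setup.html)
\ No newline at end of file
diff --git a/src/zh/UserGuide/V1.2.x/Tools-System/Data-Sync-Software_timecho.md b/src/zh/UserGuide/V1.2.x/Tools-System/Data-Sync-Software_timecho.md
deleted file mode 100644
index 871ffc79b..000000000
--- a/src/zh/UserGuide/V1.2.x/Tools-System/Data-Sync-Software_timecho.md
+++ /dev/null
@@ -1,255 +0,0 @@
-
-
-# Data Sync Software
-
-In addition to the built-in Pipe synchronization, IoTDB provides a standalone data sync program. It is written in Java and is therefore cross-platform: it can run synchronization on both Windows and Linux. Beyond the basic Pipe features it adds extra functionality. It can be driven by SQL statements on the command line and also offers a graphical interface.
-
-## Connection
-To use the data sync software, first locate the sender cluster. So that the sender cluster may be shut down during synchronization, the cluster can be located either by ip:port or by the data directory of the sender cluster. Write one of the following two groups of settings into data-sync.properties:
-
-- ssh_port, ssh_user, ssh_password, data_dir: locate the data directory. When ssh is not configured, the local machine is assumed.
-- user, password, dn_rpc_address, dn_rpc_port: locate a running IoTDB. user and password may be omitted and default to root.
-
-## Features and usage
-The data sync software supports interaction through SQL and through a graphical interface. The SQL statements are used in the same way as on the IoTDB Pipe command line. The SQL statement for creating a sync task is introduced first:
-### Create a sync task
-```shell
-create pipe p1
-  with extractor (
-    ....
-  )
-  with processor (
-    ....
-  )
-  with connector (
-    ....
-  )
-```
-extractor, processor, and connector are all customizable plugins. The extractor collects specific data from IoTDB; the processor filters or otherwise processes the collected data; the connector sends the data. The `with extractor` and `with processor` clauses may both be omitted, in which case the extractor and processor take their default values. The (....) parts above are configurable parameters. Invalid parameters are tolerated, and which parameters are valid depends on the plugin implementation.
-
-All features of the software and their SQL usage are listed below.
-
-### Data collection
-The extractor that currently ships with the software is iotdb-extractor. It supports synchronizing any prefix path, that is, any database, device, or time series. You can also choose to synchronize historical data, realtime data, or both, and specify the start and end time of the historical data. Example extractor parameters:
-```shell
-create pipe p1
-  with extractor (
-    'extractor'='iotdb-extractor',
-    'extractor.pattern'='root',
-    'extractor.history.enable'='true',
-    'extractor.history.start-time'='2023-07-03T16:49:58.845+08:00',
-    'extractor.history.end-time'='2023-07-04T16:49:58.845+08:00',
-    'extractor.realtime.enable'='true',
-    'extractor.realtime.mode'='log'
-  )
-  with processor (
-    ....
-  )
-  with connector (
-    ....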
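-  )
-```
-The parameters have the following meanings:
-
-| Parameter | Description | Required |
-| ---------------------- | ------------------------------------------------------- | -------- |
-| extractor | General setting: the extractor type to use, here the built-in iotdb-extractor. | No, defaults to iotdb-extractor |
-| extractor.pattern | Specific to iotdb-extractor, as are all parameters below. The prefix path of the data to select. | No, defaults to root |
-| extractor.history.enable | Whether historical data is synchronized; true or false. | No, defaults to true |
-| extractor.history.start-time | Start timestamp of the historical data, used to select data in a time range. | No, defaults to the start of the historical data |
-| extractor.history.end-time | End timestamp of the historical data, used to select data in a time range. | No, defaults to the end of the historical data |
-| extractor.realtime.enable | Whether realtime data is synchronized; true or false. | No, defaults to true |
-| extractor.realtime.mode | How realtime data is synchronized; log, hybrid, or file, meaning WAL-based, hybrid, or file-based synchronization respectively. | No, defaults to hybrid |
-### Data processing
-The software can also process the selected data. The built-in processors cover simple functions such as filtering and selecting by the value of a field and renaming. Custom stream-processing algorithms, for example a moving average, can also be applied to the collected data. The simpler functions are packaged with the software; for more complex custom behavior, you can write classes that act as data-processing plugins.
-
-#### No-op
-When no processing is needed, use do-nothing-processor. Example processor parameters:
-```shell
-create pipe p1
-  with extractor (
-    ....
-  )
-  with processor (
-    'processor'='do-nothing-processor'
-  )
-  with connector (
-    ....
-  )
-```
-As above, processor is a general setting that names the processor type to use. It is optional and defaults to do-nothing-processor.
-
-#### Value filtering and selection
-The built-in value-filter processor filters by the value of IoTDB data points. Example processor parameters:
-```shell
-create pipe p1
-  with extractor (
-    ....
-  )
-  with processor (
-    'processor'='filter-processor',
-    'processor.include.condition.type'='double',
-    'processor.include.condition'='>1',
-    'processor.exclude.condition.type'='double',
-    'processor.exclude.condition'='>=2'
-  )
-  with connector (
-    ....
-  )
-```
-processor.include.condition is the condition for selecting values, and processor.exclude.condition is the condition for filtering values out; at least one of the two is required. The parameters above select the collected data of type double whose value is greater than 1 and less than 2.
-
-#### Value rewriting
-The value-rewrite processor rewrites data according to the value of IoTDB data points. Processor parameters:
-```shell
-create pipe p1
-  with extractor (
-    ....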
-  )
-  with processor (
-    'processor'='rewrite-processor',
-    'processor.rewrite.condition.type'='double',
-    'processor.rewrite.condition'='>1',
-    'processor.rewrite.newValue'='1'
-  )
-  with connector (
-    ....
-  )
-```
-processor.rewrite.condition is the condition that decides whether a value is rewritten, and is required; processor.rewrite.newValue is the new value, and is also required. The parameters above change collected values of type double that are greater than 1 to 1 and leave all other data unchanged.
-
-#### Field renaming
-Besides value rewriting, the software also supports renaming fields. The parameters are:
-```shell
-create pipe p1
-  with extractor (
-    ....
-  )
-  with processor (
-    'processor'='rename-processor',
-    'processor.rename.oldPattern'='root.testpipe.d0',
-    'processor.rename.newPattern'='root.receive.d1'
-  )
-  with connector (
-    ....
-  )
-```
-processor.rename.oldPattern is the series to be renamed, and processor.rename.newPattern is its new name. Both are required.
-
-### Data sending
-The data sync software ships with several built-in connectors, of the following types:
-
-#### Thrift connector
-Synchronizes data to a receiving IoTDB over Thrift. The connectors that use Thrift are:
-- iotdb-thrift-connector
-- iotdb-thrift-connector-v1
-- iotdb-thrift-connector-v2
-
-iotdb-thrift-connector-v1 picks one receiver address to send to and is usually more efficient with a single address; iotdb-thrift-connector-v2 sends to all receiver addresses concurrently and is more efficient with multiple addresses.
-iotdb-thrift-connector selects the default connector of the current version, which is currently iotdb-thrift-connector-v1. These connectors share the same parameters; example values:
-
-```shell
-create pipe p1
-  with extractor (
-    ....
-  )
-  with processor (
-    ....
-  )
-  with connector (
-    'connector'='iotdb-thrift-connector',
-    'connector.ip'='xxx.xxx.xxx.xxx',
-    'connector.port'='xxxx',
-    'connector.node-urls'='xxx.xxx.xxx.xxx:xxxx,yyy.yyy.yyy.yyy:yyyy',
-    'connector.compression'='zstd'
-  )
-```
-| Parameter | Description | Required |
-| ---------------------- | ------------------------------------------------------- | -------- |
-|connector| General setting: the connector type to use, here iotdb-thrift-connector, iotdb-thrift-connector-v1, or iotdb-thrift-connector-v2. | No, defaults to iotdb-thrift-connector |
-|connector.ip| IP address of the chosen receiving IoTDB | Either ip/port or node-urls is required |
-|connector.port| Port of the chosen receiving IoTDB | Either ip/port or node-urls is required |
-|connector.node-urls | Address list of the receiving cluster; may be used together with ip/port above | Either node-urls or ip/port is required |
-|connector.compression | Secondary compression algorithm applied to the tsFile when sending | No, defaults to no compression |
-
-#### InfluxDB connector
-With the InfluxDB connector, the filtered and processed data can also be synchronized to InfluxDB. The connector is named influxdb-connector. Only single-endpoint transfer is currently supported. Example parameter values:
-```shell
-create pipe p1
-  with extractor (
-    ....
-  )
-  with processor (
-    ....
-  )
-  with connector (
-    'connector'='influxdb-connector',
-    'connector.ip'='xxx.xxx.xxx.xxx',
-    'connector.port'='xxxx'
-  )
-```
-The parameter names and descriptions are the same as above, but ip and port are required here.
-
-#### Local file backup connector
-
-The local file backup connector backs up IoTDB's internal tsFile files to the local machine. The parameters are:
-```shell
-create pipe p1
-  with extractor (
-    ....
-  )
-  with processor (
-    ....
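-  )
-  with connector (
-    'connector'='local-file-backup-connector',
-    'connector.path'='/usr/local',
-    'connector.compression'='zstd'
-  )
-```
-connector.path is the backup directory and is required. connector.compression is the secondary compression method and is optional; when set, the packaged tsFile is compressed again with the given compression type.
-
-Besides these three kinds of connector, users can write their own connector to send data in a custom way. In principle, data can be sent in any compression format, by any method, to any port.
-
-### REST api
-As with the Pipe feature built into IoTDB, the software can also be accessed through a REST interface, for example to check whether sync tasks are started or stopped and to view their execution status. An example REST call:
-```shell
-curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show pipes"}' http://127.0.0.1:18080/rest/v2/query
-```
-In this example, the string cm9vdDpyb290 in `Authorization: Basic cm9vdDpyb290` is the Basic authentication credential for user name root and password root. The credential is built as base64.encode(username + ':' + password); if you have set your own user name and password, build the string in the same way.
-`show pipes` in the SQL has the same format as in IoTDB, and 18080 is the configured REST port.
-
-The REST api is also configured through the data-sync.properties file in the project root directory. It controls whether the REST service is enabled, the user name, the password, and the port the REST service listens on.
-
-### Start a sync task
-The SQL for starting a sync task is also the same as for the Pipe built into IoTDB:
-```shell
-start pipe p1
-```
-### Stop a sync task
-```shell
-stop pipe p1
-```
-### Drop a sync task
-```shell
-drop pipe p1
-```
diff --git a/src/zh/UserGuide/V1.2.x/Tools-System/Maintenance-Tool_timecho.md b/src/zh/UserGuide/V1.2.x/Tools-System/Maintenance-Tool_timecho.md
deleted file mode 100644
index cf1ef99ce..000000000
--- a/src/zh/UserGuide/V1.2.x/Tools-System/Maintenance-Tool_timecho.md
+++ /dev/null
@@ -1,491 +0,0 @@
-
-
-# Maintenance Tools
-
-## Cluster deployment
-
-### Cluster management tool deployment
-
-The IoTDB cluster management tool is an easy-to-use operations tool (an enterprise-edition tool). It addresses the difficulty of operating the many nodes of a distributed IoTDB system and covers cluster deployment, cluster start and stop, elastic scaling, configuration updates, and data export, so that commands for a complex database cluster can be issued in one step, which greatly reduces management effort. This document explains how to use the cluster management tool to remotely deploy, configure, start, and stop IoTDB cluster instances.
-
-#### Deploy the cluster management tool
-
-##### Environment dependencies
-
-The machines on which IoTDB is deployed need JDK 8 or later, lsof or netstat, and unzip. Install any that are missing; see the installation commands in the last section of this document.
-
-Tip: the IoTDB cluster management tool must be run with an account that has root privileges.
-
-##### Deployment
-
-###### Download and install
-
-This tool accompanies TimechoDB (the enterprise database based on IoTDB). Contact your sales representative to obtain the download.
-
-Note: the binary package supports only GLIBC 2.17 and later, so the minimum supported system is CentOS 7.
-
-* In the iotd directory, run the following command:
-
-```bash
-bash install-iotd.sh
-```
-
-After that, the iotd keyword is available in subsequent shells. For example, the command that checks the environment before deployment is: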
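-
-```bash
-iotd cluster check example
-```
-
-* You can also run commands without activating iotd by calling `<iotd absolute path>/sbin/iotd` directly, for example to check the environment before deployment:
-
-```bash
-/sbin/iotd cluster check example
-```
-
-#### Cluster configuration file
-
-* The `iotd/config` directory holds the yaml files that configure clusters. The name of a yaml file is the cluster name, and there can be several yaml files. To make configuration easier, a `default_cluster.yaml` example is provided in the iotd/config directory.
-* A yaml file consists of four parts: `global`, `confignode_servers`, `datanode_servers`, and `grafana_servers` (not yet implemented).
-* global holds the common configuration, mainly the machine user name and password, the local IoTDB installation files, and the JDK settings. The `iotd/config` directory provides a `default_cluster.yaml` sample;
-  copy it, rename it to your own cluster name, and follow the comments in it to configure the IoTDB cluster. In the `default_cluster.yaml` sample, entries that are not commented out are required and entries that are commented out are optional.
-
-For example, to run the check for `default_cluster.yaml`, run `iotd cluster check default_cluster`.
-For more commands, see the command list below.
-
-| Parameter | Description | Required |
-| -------------------------- | ------------------------------------------------------------ | -------- |
-| iotdb_zip_dir | IoTDB distribution directory; if empty, IoTDB is downloaded from the address given in `iotdb_download_url` | Optional |
-| iotdb_download_url | IoTDB download address; used when `iotdb_zip_dir` has no value | Optional |
-| jdk_tar_dir | Local jdk directory; this jdk can be uploaded and deployed to the target nodes. | Optional |
-| jdk_deploy_dir | jdk deployment directory on the remote machine; the jdk is deployed under this folder on the target node, and the final path is `/jdk_iotdb` | Optional |
-| iotdb_lib_dir | IoTDB lib directory or an IoTDB lib archive (.zip only); used only for IoTDB upgrades; commented out by default, uncomment it to upgrade | Optional |
-| user | User name for ssh login to the deployment machines | Required |
-| password | Password for ssh login; if password is not set, pkey login is used, so make sure key-based ssh login between the nodes is configured | Optional |
-| pkey | Key-based login; if password has a value it takes precedence, otherwise pkey is used | Optional |
-| ssh_port | ssh port | Required |
-| deploy_dir | iotdb deployment directory; iotdb is deployed under this folder on the target node, and the final path is `/iotdb` | Required |
-| datanode-env.sh | Corresponds to `iotdb/config/datanode-env.sh` | Optional |
-| confignode-env.sh | Corresponds to `iotdb/config/confignode-env.sh` | Optional |
-| iotdb-common.properties | Corresponds to `iotdb/config/iotdb-common.properties` | Optional |
-| cn_target_config_node_list | Cluster address that points to a live ConfigNode, confignode_x by default; when set in both `global` and `confignode_servers`, the value in `confignode_servers` takes precedence; corresponds to `cn_target_config_node_list` in `iotdb/config/iotdb-confignode.properties` | Required |
-| dn_target_config_node_list | Cluster address that points to a live ConfigNode, confignode_x by default; when set in both `global` and `datanode_servers`, the value in `datanode_servers` takes precedence; corresponds to `dn_target_config_node_list` in `iotdb/config/iotdb-datanode.properties` | Required |
-
-* confignode_servers configures the IoTDB ConfigNodes to deploy; several ConfigNodes can be configured in it.
-  By default, the first ConfigNode started (node1) is treated as the Seed-ConfigNode.
-
-| Parameter | Description | Required |
-| --------------------------- | ------------------------------------------------------------ | -------- |
-| name | ConfigNode name | Required |
-| deploy_dir | IoTDB config node deployment directory. Note: it must differ from the IoTDB data node deployment directory below | Required |
-| iotdb-confignode.properties | Corresponds to `iotdb/config/iotdb-confignode.properties`; see the description of the `iotdb-confignode.properties` file for details | Optional |
-| cn_internal_address | Internal communication address; corresponds to `cn_internal_address` in `iotdb/config/iotdb-confignode.properties` | Required |
-| cn_target_config_node_list | Cluster address that points to a live ConfigNode, confignode_x by default; when set in both `global` and `confignode_servers`, the value in `confignode_servers` takes precedence; corresponds to `cn_target_config_node_list` in `iotdb/config/iotdb-confignode.properties` | Required |
-| cn_internal_port | Internal communication port; corresponds to `cn_internal_port` in `iotdb/config/iotdb-confignode.properties` | Required |
-| cn_consensus_port | Corresponds to `cn_consensus_port` in `iotdb/config/iotdb-confignode.properties` | Optional |
-| cn_data_dir | Corresponds to `cn_data_dir` in `iotdb/config/iotdb-confignode.properties` | Required |
-| iotdb-common.properties | Corresponds to `iotdb/config/iotdb-common.properties`; when set in both `global` and `confignode_servers`, the value in confignode_servers takes precedence | Optional |
-
-
-* datanode_servers configures the IoTDB DataNodes to deploy; several DataNodes can be configured in it.
-
-| Parameter | Description | Required |
-| -------------------------- | ------------------------------------------------------------ | -------- |
-| name | DataNode name | Required |
-| deploy_dir | IoTDB data node deployment directory. Note: it must differ from the IoTDB config node deployment directory | Required |
-| iotdb-datanode.properties | Corresponds to `iotdb/config/iotdb-datanode.properties`; see the description of the `iotdb-datanode.properties` file for details | Optional |
-| dn_rpc_address | DataNode rpc address; corresponds to `dn_rpc_address` in `iotdb/config/iotdb-datanode.properties` | Required |
-| dn_internal_address | Internal communication address; corresponds to `dn_internal_address` in `iotdb/config/iotdb-datanode.properties` | Required |
-| dn_target_config_node_list | Cluster address that points to a live ConfigNode, confignode_x by default; when set in both `global` and `datanode_servers`, the value in `datanode_servers` takes precedence; corresponds to `dn_target_config_node_list` in `iotdb/config/iotdb-datanode.properties` | Required |
-| dn_rpc_port | DataNode rpc port; corresponds to `dn_rpc_port` in `iotdb/config/iotdb-datanode.properties` | Required |
-| dn_internal_port | Internal communication port; corresponds to `dn_internal_port` in `iotdb/config/iotdb-datanode.properties` | Required |
-| iotdb-common.properties | Corresponds to `iotdb/config/iotdb-common.properties`; when set in both `global` and `datanode_servers`, the value in `datanode_servers` takes precedence | Optional |
-
-* grafana_servers configures the Grafana deployment.
-  This module is not supported yet.
-
-Note: if the value of a yaml key contains special characters such as `:`, wrap the whole value in double quotes. Do not use file paths that contain spaces, to avoid parsing errors.
-
-#### Command format
-
-The basic usage of the tool is:
-
-```bash
-iotd cluster [params (Optional)]
-```
-
-* key is the specific command.
-
-* cluster name is the name of the cluster (that is, the name of the yaml file in `iotd/config`).
-
-* params are the parameters the command needs (optional).
-
-* For example, the command to deploy the default_cluster cluster is:
-
-```bash
-iotd cluster deploy default_cluster
-```
-
-* The cluster commands and their parameters are listed below:
-
-| Command | Function | Parameters |
-| ---------- | --------------------------------------------- | ------------------------------------------------------------ |
-| check | Check whether the cluster can be deployed | List of cluster names |
-| clean | Clean the cluster | Cluster name |
-| deploy | Deploy the cluster | Cluster name |
-| list | Print the clusters and their status | None |
-| start | Start the cluster | Cluster name, -N, node name (optional) |
-| stop | Stop the cluster | Cluster name, -N, node name (optional) |
-| restart | Restart the cluster | Cluster name |
-| show | Show cluster information; the details field shows detailed cluster information | Cluster name, details (optional) |
-| destroy | Destroy the cluster | Cluster name |
-| scaleout | Scale out the cluster | Cluster name |
-| scalein | Scale in the cluster | Cluster name, -N, node name or node ip+port |
-| reload | Hot-reload the cluster configuration | Cluster name |
-| distribute | Distribute the cluster configuration files | Cluster name |
-| dumplog | Back up the logs of the specified cluster | Cluster name, -N node name, -h target machine ip, -pw target machine password, -p target machine port, -path backup directory, -startdate start time, -enddate end time, -loglevel log type, -l transfer speed |
-| dumpdata | Back up the data of the specified cluster | Cluster name, -h target machine ip, -pw target machine password, -p target machine port, -path backup directory, -startdate start time, -enddate end time, -l transfer speed |
-| upgrade | Upgrade the lib package | Cluster name (restart after upgrading) |
-
-#### Detailed command execution
-
-The commands below all use default_cluster.yaml as the example; replace it with your own cluster file as needed.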
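-
-##### Check the cluster deployment environment
-
-```bash
-iotd cluster check default_cluster
-```
-
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration
-
-* Verify that the target nodes can be logged in to over SSH
-
-* Verify that the JDK on each node meets IoTDB's requirement of jdk1.8 or later, and that unzip and lsof or netstat are installed on the server
-
-* If `Info:example check successfully!` is printed, the servers meet the installation requirements.
-  If `Warn:example check fail!` is printed, some conditions are not met; check the Warn log lines above and fix them. If the jdk does not meet the requirement, you can configure a jdk1.8 or later in the yaml file for deployment, which does not affect later use. If lsof, netstat, or unzip is missing, install it on the server yourself
-
-
-##### Deploy the cluster
-
-```bash
-iotd cluster deploy default_cluster
-```
-
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration
-
-* Upload the iotdb archive and the jdk archive (if `jdk_tar_dir` and `jdk_deploy_dir` are set in the yaml) to the nodes listed in `confignode_servers` and `datanode_servers`
-
-* Generate `iotdb-common.properties`, `iotdb-confignode.properties`, and `iotdb-datanode.properties` from the node configuration in the yaml file and upload them
-
-Tip: when a confignode and a datanode are deployed on the same machine, their directories must differ; otherwise the files of the node deployed later overwrite those of the earlier one
-
-
-##### Start the cluster
-
-```bash
-iotd cluster start default_cluster
-```
-
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration
-
-* Start the confignodes one by one in the order given in `confignode_servers` in the yaml file, checking by process id that each confignode is running; the first confignode is the Seed-ConfigNode
-
-* Start the datanodes one by one in the order given in `datanode_servers` in the yaml file, checking by process id that each datanode is running
-
-* Once the processes are confirmed by process id, check each service in the cluster list through the cli; if the cli connection fails, retry every 10s, up to 5 times
-
-
-*Start a single node*
-
-```bash
-iotd cluster start default_cluster -N datanode_1
-```
-
-or
-
-```bash
-iotd cluster start default_cluster -N 192.168.1.5:6667
-```
-
-* Locate the yaml file in the default location by cluster-name
-
-* Find the node by the given node name or ip:port. If the node to start is a `data_node`, the ip is `dn_rpc_address` and the port is `dn_rpc_port` from datanode_servers in the yaml file.
-  If the node to start is a `config_node`, the ip is `cn_internal_address` and the port is `cn_internal_port` from confignode_servers in the yaml file
-
-* Start the node
-
-##### Show the cluster status
-
-```bash
-iotd cluster show default_cluster
-```
-
-or
-
-```bash
-iotd cluster show default_cluster details
-```
-
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration
-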
-* Run `show cluster details` through the CLI on the datanodes in turn; as soon as one node succeeds, the result is returned without running the CLI on the remaining nodes
-
-
-##### Stop the cluster
-
-```bash
-iotd cluster stop default_cluster
-```
-
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration
-
-* Stop the datanodes one by one, in the order they are configured in `datanode_servers`
-
-* Stop the confignodes one by one, in the order they are configured in `confignode_servers`
-
-
-*Stop a single node*
-
-```bash
-iotd cluster stop default_cluster -N datanode_1
-```
-
-or
-
-```bash
-iotd cluster stop default_cluster -N 192.168.1.5:6667
-```
-
-* Locate the yaml file in the default location by cluster-name
-
-* Find the node's location from the given node name or ip:port. If the node being stopped is a `data_node`, the ip is `dn_rpc_address` from the yaml file and the port is `dn_rpc_port` under datanode_servers.
- If the node being stopped is a `config_node`, the ip is `cn_internal_address` under confignode_servers in the yaml file and the port is `cn_internal_port`
-
-* Stop the node
-
-## Data Folder Overview Tool
-
-The IoTDB data folder overview tool prints an overview of the structure of a data folder. It is located at tools/tsfile/print-iotdb-data-dir.
-
-### Usage
-
-- Windows:
-
-```bash
-.\print-iotdb-data-dir.bat (<path to store the output>)
-```
-
-- Linux or MacOs:
-
-```shell
-./print-iotdb-data-dir.sh (<path to store the output>)
-```
-
-Note: if no output path is set, the relative path "IoTDB_data_dir_overview.txt" is used as the default.
-
-### Example
-
-Using Windows as an example:
-
-`````````````````````````bash
-.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data
-````````````````````````
-Starting Printing the IoTDB Data Directory Overview
-````````````````````````
-output save path:IoTDB_data_dir_overview.txt
-data dir num:1
-143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-common.properties, use the default configs.
-|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -````````````````````````` - -## TsFile概览工具 - -TsFile概览工具用于以概要模式打印出一个TsFile的内容,工具位置为 tools/tsfile/print-tsfile。 - -### 用法 - -- Windows: - -```bash -.\print-tsfile-sketch.bat (<输出结果的存储路径>) -``` - -- Linux or MacOs: - -```shell -./print-tsfile-sketch.sh (<输出结果的存储路径>) -``` - -注意:如果没有设置输出结果的存储路径, 将使用相对路径"TsFile_sketch_view.txt"作为默认值。 - -### 示例 - -以Windows系统为例: - -`````````````````````````bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-common.properties, use the default configs. 
--------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] 
offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - └──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -````````````````````````` - -解释: - -- 以"|"为分隔,左边是在TsFile文件中的实际位置,右边是梗概内容。 -- "|||||||||||||||||||||"是为增强可读性而添加的导引信息,不是TsFile中实际存储的数据。 -- 最后打印的"IndexOfTimerseriesIndex Tree"是对TsFile文件末尾的元数据索引树的重新整理打印,便于直观理解,不是TsFile中存储的实际数据。 - -## TsFile Resource概览工具 - -TsFile resource概览工具用于打印出TsFile resource文件的内容,工具位置为 tools/tsfile/print-tsfile-resource-files。 - -### 用法 - -- Windows: - -```bash -.\print-tsfile-resource-files.bat -``` - -- Linux or MacOs: - -``` -./print-tsfile-resource-files.sh -``` - -### 示例 - -以Windows系统为例: - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` 
-Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-common.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-common.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-common.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-datanode.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-datanode.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. -````````````````````````` - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-common.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-common.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-common.properties from any of the known sources. 
-188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-datanode.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-datanode.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. -````````````````````````` diff --git a/src/zh/UserGuide/V1.2.x/Tools-System/Monitor-Tool_timecho.md b/src/zh/UserGuide/V1.2.x/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index 9a5a773c0..000000000 --- a/src/zh/UserGuide/V1.2.x/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,169 +0,0 @@ - - -# 监控工具 -## Prometheus - -### 监控指标的 Prometheus 映射关系 - -> 对于 Metric Name 为 name, Tags 为 K1=V1, ..., Kn=Vn 的监控指标有如下映射,其中 value 为具体值 - -| 监控指标类型 | 映射关系 | -| ---------------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Counter | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value | -| AutoGauge、Gauge | 
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value | -| Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value | -| Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m1"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m5"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m15"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="mean"} value | -| Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value |
-
-### Modify the configuration file
-
-1) Taking DataNode as an example, modify the iotdb-datanode.properties configuration file as follows:
-
-```properties
-dn_metric_reporter_list=PROMETHEUS
-dn_metric_level=CORE
-dn_metric_prometheus_reporter_port=9091
-```
-
-2) Start the IoTDB DataNode
-
-3) Open a browser or use ```curl``` to access ```http://server_ip:9091/metrics```, and you get metric data like the following:
-
-```
-...
-# HELP file_count
-# TYPE file_count gauge
-file_count{name="wal",} 0.0
-file_count{name="unseq",} 0.0
-file_count{name="seq",} 2.0
-...
-```
-
-### Prometheus + Grafana
-
-As shown above, IoTDB exposes its monitoring metrics in the standard Prometheus format, so Prometheus can be used to collect and store the metrics and Grafana
-to visualize them.
-
-The relationship between IoTDB, Prometheus and Grafana is shown in the figure below:
-
-![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png)
-
-1. IoTDB continuously collects monitoring metrics while it runs.
-2. Prometheus pulls the metrics from IoTDB's HTTP interface at a fixed (configurable) interval.
-3. Prometheus stores the pulled metrics in its own TSDB.
-4. Grafana queries the metrics from Prometheus at a fixed (configurable) interval and plots them.
-
-As the interaction flow shows, some extra work is needed to deploy and configure Prometheus and Grafana.
-
-For example, you can configure Prometheus as follows (some parameters can be adjusted as needed) to pull monitoring data from IoTDB:
-
-```yaml
-job_name: pull-metrics
-honor_labels: true
-honor_timestamps: true
-scrape_interval: 15s
-scrape_timeout: 10s
-metrics_path: /metrics
-scheme: http
-follow_redirects: true
-static_configs:
-  - targets:
-      - localhost:9091
-```
-
-For more details, see the documents below:
-
-[Prometheus installation and usage documentation](https://prometheus.io/docs/prometheus/latest/getting_started/)
-
-[Prometheus configuration for pulling metrics from an HTTP interface](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config)
-
-[Grafana installation and usage documentation](https://grafana.com/docs/grafana/latest/getting-started/getting-started/)
-
-[Documentation on querying data from Prometheus and plotting it in Grafana](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus)
-
-### Apache IoTDB Dashboard
-
-We provide the Apache IoTDB Dashboard, which supports unified, centralized operations management and lets you monitor multiple clusters from a single monitoring panel.
-
-![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png)
-
-![Apache IoTDB 
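Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png)
-
-You can obtain the Dashboard JSON file in the Enterprise Edition.
-
-#### Cluster overview
-
-Items that can be monitored include, but are not limited to:
-- Total CPU cores, total memory and total disk space of the cluster
-- Number of ConfigNodes and DataNodes in the cluster
-- Cluster uptime
-- Cluster write speed
-- Current CPU, memory and disk usage of each node in the cluster
-- Per-node information
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png)
-
-#### Data writing
-
-Items that can be monitored include, but are not limited to:
-- Average write latency, median latency and 99th-percentile latency
-- Number and size of WAL files
-- Time spent by each node on WAL flush SyncBuffer
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png)
-
-#### Data querying
-
-Items that can be monitored include, but are not limited to:
-- Time a node spends loading time series metadata during queries
-- Time a node spends reading time series during queries
-- Time a node spends modifying time series metadata during queries
-- Time a node spends loading the Chunk metadata list during queries
-- Time a node spends modifying Chunk metadata during queries
-- Time a node spends filtering by Chunk metadata during queries
-- Average time a node spends constructing the Chunk Reader during queries
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png)
-
-#### Storage engine
-
-Items that can be monitored include, but are not limited to:
-- Number and size of files by type
-- Number and size of TsFiles at each stage
-- Number and duration of each kind of task
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png)
-
-#### System monitoring
-
-Items that can be monitored include, but are not limited to:
-- System memory, swap memory and process memory
-- Disk space, number of files and file size
-- JVM GC time ratio, GC counts by type, GC data volume and heap usage of each generation
-- Network transfer rate and packet send rate
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png)
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png)
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png)
-
diff --git a/src/zh/UserGuide/V1.2.x/Tools-System/Workbench_timecho.md b/src/zh/UserGuide/V1.2.x/Tools-System/Workbench_timecho.md
deleted file mode 100644
index 98ca65907..000000000
--- a/src/zh/UserGuide/V1.2.x/Tools-System/Workbench_timecho.md
+++ /dev/null
@@ -1,31 +0,0 @@
-# Visualization Console
-## Chapter 1 Product Introduction
-The IoTDB Visualization Console is an extension component built on the IoTDB Enterprise Edition time series database for data management in industrial scenarios that combine real-time data collection, storage and analysis. It aims to give users an efficient and reliable solution for real-time data storage and querying. It is lightweight, high-performance and easy to use, integrates with the Hadoop and Spark ecosystems, and suits the high-speed writes and complex analytical queries over massive time series data found in industrial IoT applications.
-
-## Chapter 2 Usage
-The IoTDB Visualization Console contains the following functional modules:
-| **Module** | **Description** |
-| ------------ | ------------------------------------------------------------ |
-| Instance management | Manages connection instances in one place, with support for creating, editing and deleting them; it can also visualize the relationships between instances to help customers manage multiple database instances more clearly |
-| Home | Shows the service status of each node in a database instance (for example whether it is activated, whether it is running, and its IP information), and shows the monitoring status of the cluster, ConfigNodes and DataNodes, so that database health can be monitored and potential problems in the instance identified |
-| Measurement point list | 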
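Shows the measurement points in an instance directly, including database information (such as database name, data retention time and number of devices) and measurement point information (name, data type, compression and encoding); also supports creating, exporting and deleting measurement points individually or in batches |
-| Data model | Shows the parent-child relationships between levels, presenting the hierarchy model visually |
-| Data query | Provides interface-driven queries for common data query scenarios, with batch import and export of the queried data |
-| Statistical query | Provides interface-driven queries for common statistics scenarios, such as outputting the maximum, minimum, average and sum. |
-| SQL operations | Provides an interface for running database SQL, executing single or multiple statements, and displaying and exporting the results |
-| Trend | Visualizes the overall trend of data in one click, plotting real-time and historical data for the selected measurement points so their real-time and historical status can be observed |
-| Analysis | Visualizes data through different analysis methods (such as the Fourier transform) |
-| View | Lets you inspect view names, view descriptions, result measurement points and expressions through the interface, and quickly create, edit and delete views |
-| Data sync | Lets you create, inspect and manage data synchronization tasks between databases; shows task status, synchronized data and target address directly, and displays changes in the sync monitoring metrics in real time |
-| Permission management | Provides interface-based control of permissions, used to manage and restrict what database users may access and do |
-| Audit log | Records in detail the operations users perform on the database, including DDL, DML and query operations, helping users track and identify potential security threats, database errors and misuse |
-
-Main features:
-* Home
-![Home](/img/%E9%A6%96%E9%A1%B5.png)
-* Measurement point list
-![Measurement point list](/img/%E6%B5%8B%E7%82%B9%E5%88%97%E8%A1%A8.png)
-* Data query
-![Data query](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2.png)
-* Trend
-![Historical trend](/img/%E5%8E%86%E5%8F%B2%E8%B6%8B%E5%8A%BF.png)
\ No newline at end of file
diff --git a/src/zh/UserGuide/V1.2.x/User-Manual/Data-Sync_timecho.md b/src/zh/UserGuide/V1.2.x/User-Manual/Data-Sync_timecho.md
deleted file mode 100644
index ec8e0b7c5..000000000
--- a/src/zh/UserGuide/V1.2.x/User-Manual/Data-Sync_timecho.md
+++ /dev/null
@@ -1,390 +0,0 @@
-
-
-# Data Sync
-Data synchronization is a typical requirement in industrial IoT. With the data sync mechanism, data can be shared between IoTDB instances, and complete data pipelines can be built for needs such as intranet-to-internet data exchange, edge-to-cloud synchronization, data migration and data backup.
-
-## Feature overview
-
-### Sync task overview
-
-A data sync task consists of two stages:
-
-- Extract stage: extracts data from the source IoTDB; defined in the Extractor part of the SQL statement
-- Connect stage: sends data to the target IoTDB; defined in the Connector part of the SQL statement
-
-
-
-Declaring the contents of these two parts in SQL gives flexible data synchronization.
-
-### Sync task: create
-
-Use the `CREATE PIPE` statement to create a data sync task. Of the attributes below, `PipeId` and `connector` are required, while `extractor` and `processor` are optional. When writing the SQL, note that the order of the `EXTRACTOR` and `CONNECTOR` plugins cannot be swapped.
-
-An SQL example:
-
-```SQL
-CREATE PIPE <PipeId> -- PipeId is a name that uniquely identifies the task
--- Data extraction plugin, optional
-WITH EXTRACTOR (
-  [<parameter> = <value>,],
-)
--- Data connection plugin, required
-WITH CONNECTOR (
-  [<parameter> = <value>,],
-)
-```
-> 📌 Note: to use data sync, make sure automatic metadata creation is enabled on the receiving end
-
-
-
-### Sync task: manage
-
-A data sync task has three states: RUNNING, STOPPED and DROPPED. The state transitions are shown in the figure below:
-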
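-![State transition diagram](/img/%E7%8A%B6%E6%80%81%E8%BF%81%E7%A7%BB%E5%9B%BE.png)
-
-A data sync task passes through several states during its lifecycle:
-
-- RUNNING: the task is running.
-- STOPPED: the task is stopped.
-  - Note 1: the initial state of a task is STOPPED; it must be started with an SQL statement
-  - Note 2: a running task can also be stopped manually with an SQL statement, which changes its state from RUNNING to STOPPED
-  - Note 3: when a task hits an unrecoverable error, its state changes automatically from RUNNING to STOPPED
-- DROPPED: the task is deleted.
-
-The following SQL statements manage the state of sync tasks.
-
-#### Start a task
-
-A task is not processed immediately after it is created; it has to be started. Use the `START PIPE` statement to start the task and begin processing data:
-
-```SQL
-START PIPE <PipeId>
-```
-
-#### Stop a task
-
-Stop processing data:
-
-```SQL
-STOP PIPE <PipeId>
-```
-
-#### Delete a task
-
-Delete the specified task:
-
-```SQL
-DROP PIPE <PipeId>
-```
-A sync task does not need to be stopped before it is deleted.
-#### View tasks
-
-View all tasks:
-
-```SQL
-SHOW PIPES
-```
-
-View a specific task:
-
-```SQL
-SHOW PIPE <PipeId>
-```
-
-### Plugins
-
-To make the overall architecture flexible enough to match different synchronization scenarios, IoTDB supports assembling plugins within the sync task framework described above. Some commonly used plugins are preset and can be used directly; you can also write custom connector plugins and load them into IoTDB.
-
-| Module | Plugin | Preset plugins | Custom plugins |
-| --- | --- | --- | --- |
-| Extract | Extractor plugin | iotdb-extractor | Not supported |
-| Connect | Connector plugin | iotdb-thrift-connector, iotdb-air-gap-connector | Supported |
-
-#### Preset plugins
-
-The preset plugins are:
-
-| Plugin name | Type | Description | Applicable version |
-| ---------------------------- | ---- | ------------------------------------------------------------ | --------- |
-| iotdb-extractor | extractor plugin | The default extractor plugin, used to extract historical or real-time data from IoTDB | 1.2.x |
-| iotdb-thrift-connector | connector plugin | Used for data transfer between IoTDB (v1.2.0 and later) and IoTDB (v1.2.0 and later). Transfers data with the Thrift RPC framework using a multi-threaded async non-blocking IO model; transfer performance is high, especially when the target is distributed | 1.2.x |
-| iotdb-air-gap-connector | connector plugin | Used for data sync from IoTDB (v1.2.2+) to IoTDB (v1.2.2+) across a one-way data gateway. Supported gateway models include NARI Syskeeper 2000 | Above 1.2.1 |
-
-For the detailed parameters of each plugin, see the [parameter description](#connector-参数) section of this document.
-
-#### View plugins
-
-To view the plugins in the system (both custom and built-in), use the following statement:
-
-```SQL
-SHOW PIPEPLUGINS
-```
-
-The result looks like this (some of these are internal system plugins that will be removed in version 1.3.0):
-
-```Go
-IoTDB> SHOW PIPEPLUGINS
-+----------------------------+----------+--------------------------------------------------------------------------------+----------------------------------------------------+
-| PluginName|PluginType| ClassName| PluginJar|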
-+----------------------------+----------+--------------------------------------------------------------------------------+----------------------------------------------------+
-| DO-NOTHING-CONNECTOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.DoNothingConnector| |
-| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.processor.DoNothingProcessor| |
-| IOTDB-AIR-GAP-CONNECTOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.IoTDBAirGapConnector| |
-| IOTDB-EXTRACTOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.extractor.IoTDBExtractor| |
-| IOTDB-LEGACY-PIPE-CONNECTOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.IoTDBLegacyPipeConnector| |
-|IOTDB-THRIFT-ASYNC-CONNECTOR| Builtin|org.apache.iotdb.commons.pipe.plugin.builtin.connector.IoTDBThriftAsyncConnector| |
-| IOTDB-THRIFT-CONNECTOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.IoTDBThriftConnector| |
-| IOTDB-THRIFT-SYNC-CONNECTOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.IoTDBThriftSyncConnector| |
-+----------------------------+----------+--------------------------------------------------------------------------------+----------------------------------------------------+
-```
-
-## Usage examples
-
-### Full data sync
-
-This example shows how to sync all data from one IoTDB to another. The data link is shown in the figure below:
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png)
-
-In this example we create a sync task named A2B to sync the full data from IoTDB A to IoTDB B. It uses the built-in iotdb-thrift-connector plugin and must specify the address of the receiving end. Here 'connector.ip' and 'connector.port' are specified; 'connector.node-urls' can be used instead. The statement is:
-
-```SQL
-create pipe A2B
-with connector (
-  'connector'='iotdb-thrift-connector',
-  'connector.ip'='127.0.0.1',
-  'connector.port'='6668'
-)
-```
-
-
-### Historical data sync
-
-This example shows how to sync the data of a historical time range (08:00 on August 23, 2023 to 08:00 on October 23, 2023) to another IoTDB. The data link is shown in the figure below:
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png)
-
-In this example we create a sync task named A2B. First, the range of data to transfer is defined in the extractor. Because historical data is being transferred (historical data is data that existed before the sync task was created), the extractor.realtime.enable parameter must be set to false. The start and end times of the data, start-time and end-time, and the transfer mode must also be configured. The recommended mode here is hybrid (mixed transfer: real-time transfer when there is no backlog, batch transfer when there is a backlog, switching automatically according to the internal state of the system).
-
-The full statement is:
-
-```SQL
-create pipe A2B
-WITH EXTRACTOR (
-'extractor'= 'iotdb-extractor',
-'extractor.realtime.enable' = 'false',
-'extractor.realtime.mode'='hybrid',
-'extractor.history.start-time' = '2023.08.23T08:00:00+00:00',
-'extractor.history.end-time' = '2023.10.23T08:00:00+00:00')
-with connector (
-'connector'='iotdb-thrift-async-connector',
-'connector.node-urls'='xxxx:6668',
-'connector.batch.enable'='false')
-```
-
-
-### Bidirectional data transfer
-
-This example shows two IoTDB instances acting as active-active peers of each other. The data link is shown in the figure below:
-
-![](/img/1706698592139.jpg)
-
-In this example, to avoid data looping forever, the parameter `extractor.forwarding-pipe-requests` must be set to `false` on both A and B, meaning that data received from another pipe is not forwarded. `'extractor.history.enable'` is also set to `false`, meaning that historical data, that is, data from before the task was created, is not transferred.
-
-The full statements are:
-
-Run the following statement on IoTDB A:
-
-```SQL
-create pipe AB
-with extractor (
-  'extractor.history.enable' = 'false',
-  'extractor.forwarding-pipe-requests' = 'false'
-)
-with connector (
-  'connector'='iotdb-thrift-connector',
-  'connector.ip'='127.0.0.1',
-  'connector.port'='6668'
-)
-```
-
-Run the following statement on IoTDB B:
-
-```SQL
-create pipe BA
-with extractor (
-  'extractor.history.enable' = 'false',
-  'extractor.forwarding-pipe-requests' = 'false'
-)
-with connector (
-  'connector'='iotdb-thrift-connector',
-  'connector.ip'='127.0.0.1',
-  'connector.port'='6667'
-)
-```
-
-
-### Cascading data transfer
-
-
-This example shows data being cascaded across several IoTDB instances: data is synced from cluster A to cluster B, and then on to cluster C. The data link is shown in the figure below:
-
-![](/img/1706698610134.jpg)
-
-In this example, for the data of cluster A to reach C, the pipe between B and C must set `extractor.forwarding-pipe-requests` to `true`. The full statements are:
-
-Run the following statement on IoTDB A to sync the data in A to B:
-
-```SQL
-create pipe AB
-with connector (
-  'connector'='iotdb-thrift-connector',
-  'connector.ip'='127.0.0.1',
-  'connector.port'='6668'
-)
-```
-
-Run the following statement on IoTDB B to sync the data in B to C:
-
-```Go
-create pipe BC
-with extractor (
-  'extractor.forwarding-pipe-requests' = 'true'
-)
-with connector (
-
  'connector'='iotdb-thrift-connector',
-  'connector.ip'='127.0.0.1',
-  'connector.port'='6669'
-)
-```
-
-### Data transfer across a network gateway
-
-This example shows how to sync data from one IoTDB to another through a one-way gateway. The data link is shown in the figure below:
-
-![](/img/cross-network-gateway.png)
-
-This example uses the iotdb-air-gap-connector connector plugin (only some gateway models are currently supported; please confirm the specific models with Timecho staff). After configuring the gateway, run the following statement on IoTDB A, filling in the gateway's information for ip and port:
-
-```SQL
-create pipe A2B
-with connector (
-  'connector'='iotdb-air-gap-connector',
-  'connector.ip'='10.53.53.53',
-  'connector.port'='9780'
-)
-```
-
-## Reference: notes
-
-The data sync parameters, such as the directory where sync data is stored, can be adjusted by editing the IoTDB configuration file (iotdb-common.properties). The complete configuration is:
-
-```Go
-####################
-### Pipe Configuration
-####################
-
-# Uncomment the following field to configure the pipe lib directory.
-# For Windows platform
-# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is
-# absolute. Otherwise, it is relative.
-# pipe_lib_dir=ext\\pipe
-# For Linux platform
-# If its prefix is "/", then the path is absolute. Otherwise, it is relative.
-# pipe_lib_dir=ext/pipe
-
-# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor.
-# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)).
-# pipe_subtask_executor_max_thread_num=5
-
-# The connection timeout (in milliseconds) for the thrift client.
-# pipe_connector_timeout_ms=900000
-
-# The maximum number of selectors that can be used in the async connector.
-# pipe_async_connector_selector_number=1
-
-# The core number of clients that can be used in the async connector.
-# pipe_async_connector_core_client_number=8
-
-# The maximum number of clients that can be used in the async connector.
-# pipe_async_connector_max_client_number=16
-
-# Whether to enable receiving pipe data through air gap.
-# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully.
-# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` - -## 参考:参数说明 - -### extractor 参数 - -| key | value | value 取值范围 | 是否必填 |默认取值| -| ---------------------------------- | ------------------------------------------------ | -------------------------------------- | -------- |------| -| extractor | iotdb-extractor | String: iotdb-extractor | 必填 | - | -| extractor.pattern | 用于筛选时间序列的路径前缀 | String: 任意的时间序列前缀 | 选填 | root | -| extractor.history.enable | 是否同步历史数据 | Boolean: true, false | 选填 | true | -| extractor.history.start-time | 同步历史数据的开始 event time,包含 start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | 选填 | Long.MIN_VALUE | -| extractor.history.end-time | 同步历史数据的结束 event time,包含 end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | 选填 | Long.MAX_VALUE | -| extractor.realtime.enable | 是否同步实时数据 | Boolean: true, false | 选填 | true | -| extractor.realtime.mode | 实时数据的抽取模式 | String: hybrid, log, file | 选填 | hybrid | -| extractor.forwarding-pipe-requests | 是否转发由其他 Pipe (通常是数据同步)写入的数据 | Boolean: true, false | 选填 | true | - -> 💎 **说明:历史数据与实时数据的差异** -> -> * **历史数据**:所有 arrival time < 创建 pipe 时当前系统时间的数据称为历史数据 -> * **实时数据**:所有 arrival time >= 创建 pipe 时当前系统时间的数据称为实时数据 -> * **全量数据**: 全量数据 = 历史数据 + 实时数据 - - -> 💎 ​**说明:数据抽取模式hybrid, log和file的差异** -> -> - **hybrid(推荐)**:该模式下,任务将优先对数据进行实时处理、发送,当数据产生积压时自动切换至批量发送模式,其特点是平衡了数据同步的时效性和吞吐量 -> - **​log**:该模式下,任务将对数据进行实时处理、发送,其特点是高时效、低吞吐 -> - **file**:该模式下,任务将对数据进行批量(按底层数据文件)处理、发送,其特点是低时效、高吞吐 - - -### connector 参数 - -#### iotdb-thrift-connector - -| key | value | value 取值范围 | 是否必填 | 默认取值 | -| --------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | ------------------------------------------- | -| connector | iotdb-thrift-connector 或 iotdb-thrift-sync-connector | String: iotdb-thrift-connector 或 iotdb-thrift-sync-connector | 必填 | | -| 
connector.ip | 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip(请注意同步任务不支持向自身服务进行转发) | String | 选填 | 与 connector.node-urls 任选其一填写 | -| connector.port | 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port(请注意同步任务不支持向自身服务进行转发) | Integer | 选填 | 与 connector.node-urls 任选其一填写 | -| connector.node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url(请注意同步任务不支持向自身服务进行转发) | String。例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 选填 | 与 connector.ip:connector.port 任选其一填写 | -| connector.batch.enable | 是否开启日志攒批发送模式,用于提高传输吞吐,降低 IOPS | Boolean: true, false | 选填 | true | -| connector.batch.max-delay-seconds | 在开启日志攒批发送模式时生效,表示一批数据在发送前的最长等待时间(单位:s) | Integer | 选填 | 1 | -| connector.batch.size-bytes | 在开启日志攒批发送模式时生效,表示一批数据最大的攒批大小(单位:byte) | Long | 选填 - - - -#### iotdb-air-gap-connector - -| key | value | value 取值范围 | 是否必填 | 默认取值 | -| -------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | ------------------------------------------- | -| connector | iotdb-air-gap-connector | String: iotdb-air-gap-connector | 必填 | | -| connector.ip | 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip | String | 选填 | 与 connector.node-urls 任选其一填写 | -| connector.port | 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port | Integer | 选填 | 与 connector.node-urls 任选其一填写 | -| connector.node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url | String。例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 选填 | 与 connector.ip:connector.port 任选其一填写 | -| connector.air-gap.handshake-timeout-ms | 发送端与接收端在首次尝试建立连接时握手请求的超时时长,单位:毫秒 | Integer | 选填 | 5000 | \ No newline at end of file diff --git a/src/zh/UserGuide/V1.2.x/User-Manual/IoTDB-View_timecho.md b/src/zh/UserGuide/V1.2.x/User-Manual/IoTDB-View_timecho.md deleted file mode 100644 index 7788f6666..000000000 --- a/src/zh/UserGuide/V1.2.x/User-Manual/IoTDB-View_timecho.md +++ /dev/null @@ -1,549 +0,0 @@ - - -# 视图 - -## 一、 序列视图应用背景 - -### 1.1 应用场景1 时间序列重命名(PI资产管理) - 
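-In practice, the devices that collect data may be named with identifiers that are hard for people to understand, which makes querying difficult for the business layer.
-
-A series view can reorganize and manage these series so that they are accessed through a new model structure, without changing the content of the original series and without creating or copying any series.
-
-**Example**: a cloud device uses the MAC address of its network card as its entity number and writes data to the following time series: `root.db.0800200A8C6D.xvjeifg`.
-
-This is hard for users to understand. With the series view feature, however, the user can rename it by mapping it into a series view and then access the collected data as `root.view.device001.temperature`.
-
-### 1.2 Use case 2: simplifying business-layer query logic
-
-Sometimes a user has a large number of devices and manages a large number of time series. For a given piece of business, the user may want to work with only some of those series. The series view feature can pick out the series of interest, making repeated queries and writes easier.
-
-**Example**: a user manages a production line whose devices produce a large number of time series at each stage. A temperature inspector only needs the device temperatures, so the temperature-related series can be extracted into series views.
-
-### 1.3 Use case 3: assisting permission management
-
-In production, different business roles are usually responsible for different scopes, and for security reasons access is often restricted through permission management.
-
-**Example**: the safety department only needs to monitor the temperature of each device on a production line, but this data is stored in the same database as other confidential data. Several new views can be created that contain only the temperature-related time series of that line, and the safety staff can then be granted permission on those series views only, which restricts their access.
-
-### 1.4 Motivation for the series view feature
-
-Based on the use cases above, the main motivations for the series view feature are:
-
-1. Renaming time series.
-2. Simplifying business-layer query logic.
-3. Assisting permission management by exposing data to specific users through views.
-
-## 2. Series view concepts
-
-### 2.1 Terminology
-
-Convention: unless stated otherwise, every view mentioned in this document is a **series view**. New features such as device views may be introduced in the future.
-
-### 2.2 Series view
-
-A series view is a way of organizing and managing time series.
-
-In a traditional relational database, data must be stored in a table, whereas in a time series database such as IoTDB the series is the unit of storage. The concept of a series view in IoTDB is therefore also built on series.
-
-A series view is a virtual time series. Each virtual time series is like a soft link or shortcut that maps to a series outside the view or to some computation. In other words, a virtual series either maps to one specific external series or is computed from several external series.
-
-Users can create views from complex SQL queries. In that case the series view acts like a stored query: when data is read from the view, the stored query is used as the data source and placed in the FROM clause.
-
-### 2.3 Alias series
-
-Among series views there is a special kind that satisfies all of the following conditions:
-
-1. The data source is a single time series
-2. There is no computation logic
-3. There is no filter condition (for example, no WHERE clause)
-
-Such a series view is called an **alias series**, or alias series view. A series view that does not satisfy all of the conditions above is called a non-alias series view. The difference between them is that only alias series support writes.
-
-**No series view, alias series included, currently supports triggers.**
-
-### 2.4 Nested views
-
-A user may want to select some series from an existing series view to form a new series view; this is called a nested view.
-
-**The current version does not support nested views**.
-
-### 2.5 Constraints on series views in IoTDB
-
-#### Restriction 1: a series view must depend on one or more time series
-
-A series view can exist in one of two forms:
-
-1. It maps to a single time series
-2. It is computed from one or more time series
-
-The first form was illustrated earlier and is easy to understand. The second form exists because series views allow computation logic.
-
-For example, a user has installed two thermometers on the same boiler and needs the average of the two temperature values as the measurement. The two series collected are `root.db.d01.temperature01` and `root.db.d01.temperature02`.
-
-The user can then take the average of the two series as one series in the view: `root.db.d01.avg_temperature`.
-
-This example is expanded in section 3.1.2.
-
-#### Restriction 2: non-alias series views are read-only
-
-Writing to a non-alias series view is not allowed.
-
-Only alias series views support writes.
-
-#### Restriction 3: nested views are not allowed
-
-A series view cannot be created by selecting columns from an existing series view, whether directly or indirectly.
-
-An example of this restriction is given in section 3.1.3.
-
-#### Restriction 4: a series view and a time series cannot share a name
-
-Series views and time series live in the same tree, so they cannot have the same name.
-
-The name (path) of every series must be unique.
-
-#### Restriction 5: series views share time series data with time series, but not tags and other metadata
-
-A series view is a mapping to time series, so the two share the time series data completely, and the time series is responsible for persisting it.
-
-Their metadata, such as tags and attributes, is not shared.
-
-This is because users who query through a view care about the structure of that view. If they query with group by tag, for example, they clearly expect grouping by the tags defined on the view, not by the tags of the underlying time series (which they may not even be aware of).
-
-## 3. Series view features
-
-### 3.1 Creating views
-
-Creating a series view is similar to creating a time series; the difference is that the data source, that is, the original series, must be specified with the AS keyword.
-
-#### 3.1.1 SQL for creating a view
-
-A user can select some series to create a view:
-
-```SQL
-CREATE VIEW root.view.device.status
-AS
-    SELECT s01
-    FROM root.db.device
-```
-
-This means the user selects the series `s01` from the existing device `root.db.device` and creates the series view `root.view.device.status`.
-
-A series view can live under the same entity as the time series, for example:
-
-```SQL
-CREATE VIEW root.db.device.status
-AS
-    SELECT s01
-    FROM root.db.device
-```
-
-This gives `root.db.device` a virtual copy of `s01` under a different name, `status`.
-
-The series views in both examples above are alias series, for which a more convenient creation syntax is provided:
-
-```SQL
-CREATE VIEW root.view.device.status
-AS
-    root.db.device.s01
-```
-
-#### 3.1.2 Creating a view with computation logic
-
-Continuing the example from Restriction 1 in section 2.5:
-
-> A user has installed two thermometers on the same boiler and needs the average of the two temperature values as the measurement. The two series collected are `root.db.d01.temperature01` and `root.db.d01.temperature02`.
->
-> The user can then take the average of the two series as one series in the view: `root.db.d01.avg_temperature`.
-
-Without a view, the user can query the average of the two temperatures like this:
-
-```SQL
-SELECT (temperature01 + temperature02) / 2
-FROM root.db.d01
-```
-
-With a series view, the user can create a view like this to simplify future queries:
-
-```SQL
-CREATE VIEW root.db.d01.avg_temperature
-AS
-    SELECT (temperature01 + temperature02) / 2
-    FROM root.db.d01
-```
-
-The user can then query it like this:
-
-```SQL
-SELECT avg_temperature FROM root.db.d01
-```
-
-#### 3.1.3 Nested series views are not supported
-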
-继续沿用3.1.2中的例子,现在用户想使用序列视图`root.db.d01.avg_temperature`创建一个新的视图,这是不允许的。我们目前不支持嵌套视图,无论它是否是别名序列,都不支持。 - -比如下列SQL语句会报错: - -```SQL -CREATE VIEW root.view.device.avg_temp_copy -AS - root.db.d01.avg_temperature -- 不支持。不允许嵌套视图 -``` - -#### 3.1.4 一次创建多条序列视图 - -一次只能指定一个序列视图对用户来说使用不方便,则可以一次指定多条序列,比如: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - SELECT s01, s02 - FROM root.db.device -``` - -此外,上述写法可以做简化: - -```SQL -CREATE VIEW root.db.device(status, sub.hardware) -AS - SELECT s01, s02 - FROM root.db.device -``` - -上述两条语句都等价于如下写法: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device; - -CREATE VIEW root.db.device.sub.hardware -AS - SELECT s02 - FROM root.db.device -``` - -也等价于如下写法 - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - root.db.device.s01, root.db.device.s02 - --- 或者 - -CREATE VIEW root.db.device(status, sub.hardware) -AS - root.db.device(s01, s02) -``` - -##### 所有序列间的映射关系为静态存储 - -有时,SELECT子句中可能包含运行时才能确定的语句个数,比如如下的语句: - -```SQL -SELECT s01, s02 -FROM root.db.d01, root.db.d02 -``` - -上述语句能匹配到的序列数量是并不确定的,和系统状态有关。即便如此,用户也可以使用它创建视图。 - -不过需要特别注意,所有序列间的映射关系为静态存储(创建时固定)!请看以下示例: - -当前数据库中仅含有`root.db.d01.s01`、`root.db.d02.s01`、`root.db.d02.s02`三条序列,接着创建视图: - -```SQL -CREATE VIEW root.view.d(alpha, beta, gamma) -AS - SELECT s01, s02 - FROM root.db.d01, root.db.d02 -``` - -时间序列之间映射关系如下: - -| 序号 | 时间序列 | 序列视图 | -| ---- | ----------------- | ----------------- | -| 1 | `root.db.d01.s01` | root.view.d.alpha | -| 2 | `root.db.d02.s01` | root.view.d.beta | -| 3 | `root.db.d02.s02` | root.view.d.gamma | - -此后,用户新增了序列`root.db.d01.s02`,则它不对应到任何视图;接着,用户删除`root.db.d01.s01`,则查询`root.view.d.alpha`会直接报错,它也不会对应到`root.db.d01.s02`。 - -请时刻注意,序列间映射关系是静态地、固化地存储的。 - -#### 3.1.5 批量创建序列视图 - -现有若干个设备,每个设备都有一个温度数值,例如: - -1. root.db.d1.temperature -2. root.db.d2.temperature -3. ... 
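-
-These devices may store many other series (for example `root.db.d1.speed`), but you can create a view that contains only their temperature values and ignores the other series:
-
-```SQL
-CREATE VIEW root.db.view(${2}_temperature)
-AS
-    SELECT temperature FROM root.db.*
-```
-
-This follows the naming convention of query write-back (`SELECT INTO`) and uses variable placeholders to specify the naming rule. See: [Query write-back (SELECT INTO)](../User-Manual/Query-Data.md#查询写回(SELECT-INTO-子句))
-
-Here `root.db.*.temperature` specifies which time series are included in the view, and `${2}` specifies which node of the time series path supplies the name of the series view.
-
-`${2}` refers to level 2 of `root.db.*.temperature` (counting from 0), that is, whatever `*` matched. `${2}_temperature` joins that match and `temperature` with an underscore to form the node name of each series in the view.
-
-The statement above is equivalent to the following:
-
-```SQL
-CREATE VIEW root.db.view(${2}_${3})
-AS
-    SELECT temperature from root.db.*
-```
-
-The resulting view contains these series:
-
-1. root.db.view.d1_temperature
-2. root.db.view.d2_temperature
-3. ...
-
-Creating views with wildcards stores only the static mapping that exists at creation time.
-
-#### 3.1.6 Restrictions on the SELECT clause when creating a view
-
-The SELECT clause used to create a series view is restricted. The main restrictions are:
-
-1. The `WHERE` clause cannot be used.
-2. The `GROUP BY` clause cannot be used.
-3. Aggregate functions such as `MAX_VALUE` cannot be used.
-
-In short, only the `SELECT ... FROM ...` structure may follow `AS`, and the result of that query must be able to form a time series.
-
-### 3.2 Querying view data
-
-For the supported query features, series views and time series can be used interchangeably when querying time series data, and they behave identically.
-
-**The query types that series views do not currently support are:**
-
-1. **align by device queries**
-2. **last queries on non-alias series views**
-3. 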
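**group by tags queries**
-
-Users can also mix time series and series views in the same SELECT statement, for example:
-
-```SQL
-SELECT temperature01, temperature02, avg_temperature
-FROM root.db.d01
-WHERE temperature01 < temperature02
-```
-
-However, when a user queries the metadata of a series, such as tags and attributes, the result is the metadata of the series view, not that of the time series the view references.
-
-In addition, for an alias series, a user who wants the tags, attributes, and similar information of the underlying time series must first look up the mapping of the view column to find the corresponding time series, and then query that time series. Section 3.5 explains how to look up the mapping of view columns.
-
-### 3.3 Modifying views
-
-The supported modifications are: changing the computation logic, changing tags/attributes/aliases, and deleting the view.
-
-#### 3.3.1 Changing the data source of a view
-
-```SQL
-ALTER VIEW root.view.device.status
-AS
-    SELECT s01
-    FROM root.ln.wf.d01
-```
-
-#### 3.3.2 Changing the computation logic of a view
-
-```SQL
-ALTER VIEW root.db.d01.avg_temperature
-AS
-    SELECT (temperature01 + temperature02 + temperature03) / 3
-    FROM root.db.d01
-```
-
-#### 3.3.3 Tag management
-
-- Add new tags
-
-```SQL
-ALTER view root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4
-```
-
-- Add new attributes
-
-```SQL
-ALTER view root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4
-```
-
-- Rename a tag or attribute
-
-```SQL
-ALTER view root.turbine.d1.s1 RENAME tag1 TO newTag1
-```
-
-- Reset the value of a tag or attribute
-
-```SQL
-ALTER view root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1
-```
-
-- Delete existing tags or attributes
-
-```SQL
-ALTER view root.turbine.d1.s1 DROP tag1, tag2
-```
-
-- Upsert alias, tags, and attributes
-
-> If the alias, tag, or attribute does not exist yet, it is inserted; otherwise the old value is replaced with the new one.
-
-```SQL
-ALTER view root.turbine.d1.s1 UPSERT TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4)
-```
-
-#### 3.3.4 Deleting a view
-
-Because a view is a series, it can be deleted in the same way as a time series.
-
-
-```SQL
-DELETE VIEW root.view.device.avg_temperature
-```
-
-### 3.4 View synchronization
-
-
-#### If the source series a view depends on is deleted
-
-When a series view is queried (when the series is resolved), if the time series it depends on does not exist, **an empty result set is returned**.
-
-This is similar to querying a series that does not exist, with one difference: when the source time series cannot be resolved, the empty result set still contains the header, which signals to the user that the view has a problem.
-
-In addition, when a source time series is deleted, the system does not check whether any view depends on it, and the user receives no warning.
-
-#### Writing to non-alias series is not supported
-
-Writing to a non-alias series view is not supported.
-
-See Restriction 2 in Section 2.5 above for details.
-
-#### Series metadata is not shared
-
-See Restriction 5 in Section 2.5 above for details.
-
-### 3.5 Querying view metadata
-
-Querying view metadata means querying the metadata of the views themselves (for example, how many columns a view has) and the information about views in the database (for example, which views exist).
-
-#### 3.5.1 Listing the current view columns
-
-There are two ways to query:
-
-1. Use `SHOW TIMESERIES`. The result contains both time series and series views, but shows only part of the properties of a view.
-2. 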
使用`SHOW VIEW`进行查询,该查询只包含序列视图。能完整显示序列视图的属性。 - -举例: - -```Shell -IoTDB> show timeseries; -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.device.s01 | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.view.status | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp01 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp02 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.avg_temp| null| root.db| FLOAT| null| null|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -Total line number = 5 -It costs 0.789s -IoTDB> -``` - -最后一列`ViewType`中显示了该序列的类型,时间序列为BASE,序列视图是VIEW。 - -此外,某些序列视图的属性会缺失,比如`root.db.d01.avg_temp`是由温度均值计算得来,所以`Encoding`和`Compression`属性都为空值。 - -此外,`SHOW TIMESERIES`语句的查询结果主要分为两部分: - -1. 时序数据的信息,例如数据类型,压缩方式,编码等 -2. 
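Other metadata, such as tags, attributes, and the owning database
-
-For a series view, the time series data information shown is either the same as that of its source series or empty (for example, the computed average temperature has a data type but no compression method), while the metadata shown is that of the view itself.
-
-To learn more about a view, use `SHOW VIEW`. `SHOW VIEW` displays, among other things, the data source of the view.
-
-```Shell
-IoTDB> show VIEW root.**;
-+--------------------+--------+--------+----+----------+--------+-----------------------------------------+
-|          Timeseries|Database|DataType|Tags|Attributes|ViewType|                                   SOURCE|
-+--------------------+--------+--------+----+----------+--------+-----------------------------------------+
-|root.db.view.status | root.db|   INT32|null|      null|    VIEW|                       root.db.device.s01|
-+--------------------+--------+--------+----+----------+--------+-----------------------------------------+
-|root.db.d01.avg_temp| root.db|   FLOAT|null|      null|    VIEW|(root.db.d01.temp01+root.db.d01.temp02)/2|
-+--------------------+--------+--------+----+----------+--------+-----------------------------------------+
-Total line number = 2
-It costs 0.789s
-IoTDB>
-```
-
-The last column, `SOURCE`, shows the data source of the series view, that is, the source series or expression the view was created from.
-
-##### About data types
-
-Both queries above involve the data type of a view. The data type of a view is inferred from the query statement that defines the view, or from the type of the source time series of an alias series. It is computed in real time from the current state of the system, so the data type returned may differ between queries made at different times.
-
-## 4. FAQ
-
-#### Q1: I want a view to perform type conversion. For example, a time series of type int32 was placed in the same view as other series of type int64. I would like all data queried through the view to be converted to int64 automatically.
-
-> Ans: This is outside the scope of series views. However, you can convert with `CAST`, for example:
-
-```SQL
-CREATE VIEW root.db.device.int64_status
-AS
-    SELECT CAST(s1, 'type'='INT64') from root.db.device
-```
-
-> Querying `root.db.device.int64_status` then returns results of type int64.
->
-> Note that in this example the data of the series view is produced by `CAST`, so `root.db.device.int64_status` is not an alias series and therefore **does not support writes**.
-
-#### Q2: Is default naming supported? Can I select several time series and create a view without naming each series, letting the database name them automatically?
-
-> Ans: No. The user must specify the names explicitly.
-
-#### Q3: In the existing system, creating the time series `root.db.device.s01` automatically creates the database `root.db` and the device `root.db.device`. Deleting the time series `root.db.device.s01` then automatically deletes `root.db.device`, while `root.db` is kept. Do views follow the same mechanism, and what is the reasoning?
-
-> Ans: The existing behavior is kept unchanged. Introducing views does not alter this logic.
-
-#### Q4: Can a series view be renamed?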
- -> A:当前版本不支持重命名,可以自行创建新名称的视图投入使用。 \ No newline at end of file diff --git a/src/zh/UserGuide/V1.2.x/User-Manual/Security-Management_timecho.md b/src/zh/UserGuide/V1.2.x/User-Manual/Security-Management_timecho.md deleted file mode 100644 index 48b84b8ad..000000000 --- a/src/zh/UserGuide/V1.2.x/User-Manual/Security-Management_timecho.md +++ /dev/null @@ -1,158 +0,0 @@ - - -# 安全控制 - -## 白名单 - -**功能描述** - -允许哪些客户端地址能连接 IoTDB - -**配置文件** - -conf/iotdb-common.properties - -conf/white.list - -**配置项** - -iotdb-common.properties: - -决定是否开启白名单功能 - -```YAML -# 是否开启白名单功能 -enable_white_list=true -``` - -white.list: - -决定哪些IP地址能够连接IoTDB - -```YAML -# 支持注释 -# 支持精确匹配,每行一个ip -10.2.3.4 - -# 支持*通配符,每行一个ip -10.*.1.3 -10.100.0.* -``` - -**注意事项** - -1. 如果通过session客户端取消本身的白名单,当前连接并不会立即断开。在下次创建连接的时候拒绝。 -2. 如果直接修改white.list,一分钟内生效。如果通过session客户端修改,立即生效,更新内存中的值和white.list磁盘文件 -3. 开启白名单功能,没有white.list 文件,启动DB服务成功,但是,拒绝所有连接。 -4. DB服务运行中,删除 white.list 文件,至多一分钟后,拒绝所有连接。 -5. 是否开启白名单功能的配置,可以热加载。 -6. 使用Java 原生接口修改白名单,必须是root用户才能修改,拒绝非root用户修改;修改内容必须合法,否则会抛出StatementExecutionException异常。 - -![白名单](/img/%E7%99%BD%E5%90%8D%E5%8D%95.png) - -## 安全审计 - -### 功能背景 - - 审计日志是数据库的记录凭证,通过审计日志功能可以查询到用户在数据库中增删改查等各项操作,以保证信息安全。关于IoTDB的审计日志功能可以实现以下场景的需求: - -- 可以按链接来源(是否人为操作)决定是否记录审计日志,如:非人为操作如硬件采集器写入的数据不需要记录审计日志,人为操作如普通用户通过cli、workbench等工具操作的数据需要记录审计日志。 -- 过滤掉系统级别的写入操作,如IoTDB监控体系本身记录的写入操作等。 - - - -#### 场景说明 - - - -##### 对所有用户的所有操作(增、删、改、查)进行记录 - -通过审计日志功能追踪到所有用户在数据中的各项操作。其中所记录的信息要包含数据操作(新增、删除、查询)及元数据操作(新增、修改、删除、查询)、客户端登录信息(用户名、ip地址)。 - - - -客户端的来源 - -- Cli、workbench、Zeppelin、Grafana、通过 Session/JDBC/MQTT 等协议传入的请求 - -![审计日志](/img/%E5%AE%A1%E8%AE%A1%E6%97%A5%E5%BF%97.png) - - - -##### 可关闭部分用户连接的审计日志 - - - -如非人为操作,硬件采集器通过 Session/JDBC/MQTT 写入的数据不需要记录审计日志 - - - -### 功能定义 - - - -通过配置可以实现: - -- 决定是否开启审计功能 -- 决定审计日志的输出位置,支持输出至一项或多项 - 1. 日志文件 - 2. IoTDB存储 -- 决定是否屏蔽原生接口的写入,防止记录审计日志过多影响性能 -- 决定审计日志内容类别,支持记录一项或多项 - 1. 数据的新增、删除操作 - 2. 数据和元数据的查询操作 - 3. 
元数据类的新增、修改、删除操作 - -#### 配置项 - - 在iotdb-engine.properties 或 iotdb-common.properties中修改以下几项配置 - -```YAML -#################### -### Audit log Configuration -#################### - -# whether to enable the audit log. -# Datatype: Boolean -# enable_audit_log=false - -# Output location of audit logs -# Datatype: String -# IOTDB: the stored time series is: root.__system.audit._{user} -# LOGGER: log_audit.log in the log directory -# audit_log_storage=IOTDB,LOGGER - -# whether enable audit log for DML operation of data -# whether enable audit log for DDL operation of schema -# whether enable audit log for QUERY operation of data and schema -# Datatype: String -# audit_log_operation=DML,DDL,QUERY - -# whether the local write api records audit logs -# Datatype: Boolean -# This contains Session insert api: insertRecord(s), insertTablet(s),insertRecordsOfOneDevice -# MQTT insert api -# RestAPI insert api -# This parameter will cover the DML in audit_log_operation -# enable_audit_log_for_native_insert_api=true -``` - diff --git a/src/zh/UserGuide/V1.2.x/User-Manual/Stage_Data-Sync_timecho.md b/src/zh/UserGuide/V1.2.x/User-Manual/Stage_Data-Sync_timecho.md deleted file mode 100644 index a0bf01106..000000000 --- a/src/zh/UserGuide/V1.2.x/User-Manual/Stage_Data-Sync_timecho.md +++ /dev/null @@ -1,536 +0,0 @@ - - -# IoTDB 数据同步 - -**IoTDB 数据同步功能可以将 IoTDB 的数据传输到另一个数据平台,我们将一个数据同步任务称为 Pipe。** - -**一个 Pipe 包含三个子任务(插件):** - -- 抽取(Extract) -- 处理(Process) -- 发送(Connect) - -**Pipe 允许用户自定义三个子任务的处理逻辑,通过类似 UDF 的方式处理数据。** 在一个 Pipe 中,上述的子任务分别由三种插件执行实现,数据会依次经过这三个插件进行处理:Pipe Extractor 用于抽取数据,Pipe Processor 用于处理数据,Pipe Connector 用于发送数据,最终数据将被发至外部系统。 - -**Pipe 任务的模型如下:** - -![任务模型图](/img/%E6%B5%81%E5%A4%84%E7%90%86%E5%BC%95%E6%93%8E.jpeg) - -描述一个数据同步任务,本质就是描述 Pipe Extractor、Pipe Processor 和 Pipe Connector 插件的属性。用户可以通过 SQL 语句声明式地配置三个子任务的具体属性,通过组合不同的属性,实现灵活的数据 ETL 能力。 - -利用数据同步功能,可以搭建完整的数据链路来满足端*边云同步、异地灾备、读写负载分库*等需求。 - -## 快速开始 - -**🎯 目标:实现 IoTDB A -> IoTDB B 的全量数据同步** - -- 启动两个 IoTDB,A(datanode -> 
127.0.0.1:6667) B(datanode -> 127.0.0.1:6668) -- 创建 A -> B 的 Pipe,在 A 上执行 - - ```sql - create pipe a2b - with connector ( - 'connector'='iotdb-thrift-connector', - 'connector.ip'='127.0.0.1', - 'connector.port'='6668' - ) - ``` -- 启动 A -> B 的 Pipe,在 A 上执行 - - ```sql - start pipe a2b - ``` -- 向 A 写入数据 - - ```sql - INSERT INTO root.db.d(time, m) values (1, 1) - ``` -- 在 B 检查由 A 同步过来的数据 - - ```sql - SELECT ** FROM root - ``` - -> ❗️**注:目前的 IoTDB -> IoTDB 的数据同步实现并不支持 DDL 同步** -> -> 即:不支持 ttl,trigger,别名,模板,视图,创建/删除序列,创建/删除存储组等操作 -> -> **IoTDB -> IoTDB 的数据同步要求目标端 IoTDB:** -> -> * 开启自动创建元数据:需要人工配置数据类型的编码和压缩与发送端保持一致 -> * 不开启自动创建元数据:手工创建与源端一致的元数据 - -## 同步任务管理 - -### 创建同步任务 - -可以使用 `CREATE PIPE` 语句来创建一条数据同步任务,示例 SQL 语句如下所示: - -```sql -CREATE PIPE -- PipeId 是能够唯一标定同步任务任务的名字 -WITH EXTRACTOR ( - -- 默认的 IoTDB 数据抽取插件 - 'extractor' = 'iotdb-extractor', - -- 路径前缀,只有能够匹配该路径前缀的数据才会被抽取,用作后续的处理和发送 - 'extractor.pattern' = 'root.timecho', - -- 是否抽取历史数据 - 'extractor.history.enable' = 'true', - -- 描述被抽取的历史数据的时间范围,表示最早时间 - 'extractor.history.start-time' = '2011.12.03T10:15:30+01:00', - -- 描述被抽取的历史数据的时间范围,表示最晚时间 - 'extractor.history.end-time' = '2022.12.03T10:15:30+01:00', - -- 是否抽取实时数据 - 'extractor.realtime.enable' = 'true', - -- 描述实时数据的抽取方式 - 'extractor.realtime.mode' = 'hybrid', -) -WITH PROCESSOR ( - -- 默认的数据处理插件,即不做任何处理 - 'processor' = 'do-nothing-processor', -) -WITH CONNECTOR ( - -- IoTDB 数据发送插件,目标端为 IoTDB - 'connector' = 'iotdb-thrift-connector', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip - 'connector.ip' = '127.0.0.1', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port - 'connector.port' = '6667', -) -``` - -**创建同步任务时需要配置 PipeId 以及三个插件部分的参数:** - - -| 配置项 | 说明 | 是否必填 | 默认实现 | 默认实现说明 | 是否允许自定义实现 | -| --------- | ------------------------------------------------- | --------------------------- | -------------------- | ------------------------------------------------------ | ------------------------- | -| PipeId | 全局唯一标定一个同步任务的名称 | 必填 | - | - | - | -| extractor | Pipe Extractor 插件,负责在数据库底层抽取同步数据 | 选填 
| iotdb-extractor | 将数据库的全量历史数据和后续到达的实时数据接入同步任务 | 否 | -| processor | Pipe Processor 插件,负责处理数据 | 选填 | do-nothing-processor | 对传入的数据不做任何处理 | | -| connector | Pipe Connector 插件,负责发送数据 | 必填 | - | - | | - -示例中,使用了 iotdb-extractor、do-nothing-processor 和 iotdb-thrift-connector 插件构建数据同步任务。IoTDB 还内置了其他的数据同步插件,**请查看“系统预置数据同步插件”一节**。 - -**一个最简的 CREATE PIPE 语句示例如下:** - -```sql -CREATE PIPE -- PipeId 是能够唯一标定任务任务的名字 -WITH CONNECTOR ( - -- IoTDB 数据发送插件,目标端为 IoTDB - 'connector' = 'iotdb-thrift-connector', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip - 'connector.ip' = '127.0.0.1', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port - 'connector.port' = '6667', -) -``` - -其表达的语义是:将本数据库实例中的全量历史数据和后续到达的实时数据,同步到目标为 127.0.0.1:6667 的 IoTDB 实例上。 - -**注意:** - -- EXTRACTOR 和 PROCESSOR 为选填配置,若不填写配置参数,系统则会采用相应的默认实现 -- CONNECTOR 为必填配置,需要在 CREATE PIPE 语句中声明式配置 -- CONNECTOR 具备自复用能力。对于不同的任务,如果他们的 CONNECTOR 具备完全相同 KV 属性的(所有属性的 key 对应的 value 都相同),**那么系统最终只会创建一个 CONNECTOR 实例**,以实现对连接资源的复用。 - - - 例如,有下面 pipe1, pipe2 两个任务的声明: - - ```sql - CREATE PIPE pipe1 - WITH CONNECTOR ( - 'connector' = 'iotdb-thrift-connector', - 'connector.thrift.host' = 'localhost', - 'connector.thrift.port' = '9999', - ) - - CREATE PIPE pipe2 - WITH CONNECTOR ( - 'connector' = 'iotdb-thrift-connector', - 'connector.thrift.port' = '9999', - 'connector.thrift.host' = 'localhost', - ) - ``` - - - 因为它们对 CONNECTOR 的声明完全相同(**即使某些属性声明时的顺序不同**),所以框架会自动对它们声明的 CONNECTOR 进行复用,最终 pipe1, pipe2 的CONNECTOR 将会是同一个实例。 -- 在 extractor 为默认的 iotdb-extractor,且 extractor.forwarding-pipe-requests 为默认值 true 时,请不要构建出包含数据循环同步的应用场景(会导致无限循环): - - - IoTDB A -> IoTDB B -> IoTDB A - - IoTDB A -> IoTDB A - -### 启动任务 - -CREATE PIPE 语句成功执行后,任务相关实例会被创建,但整个任务的运行状态会被置为 STOPPED,即任务不会立刻处理数据。 - -可以使用 START PIPE 语句使任务开始处理数据: - -```sql -START PIPE -``` - -### 停止任务 - -使用 STOP PIPE 语句使任务停止处理数据: - -```sql -STOP PIPE -``` - -### 删除任务 - -使用 DROP PIPE 语句使任务停止处理数据(当任务状态为 RUNNING 时),然后删除整个任务同步任务: - -```sql -DROP PIPE -``` - -用户在删除任务前,不需要执行 STOP 操作。 - -### 展示任务 - -使用 SHOW PIPES 语句查看所有任务: - 
-```sql -SHOW PIPES -``` - -查询结果如下: - -```sql -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -| ID| CreationTime | State|PipeExtractor|PipeProcessor|PipeConnector|ExceptionMessage| -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -|iotdb-kafka|2022-03-30T20:58:30.689|RUNNING| ...| ...| ...| None| -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -|iotdb-iotdb|2022-03-31T12:55:28.129|STOPPED| ...| ...| ...| TException: ...| -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -``` - -可以使用 `` 指定想看的某个同步任务状态: - -```sql -SHOW PIPE -``` - -您也可以通过 where 子句,判断某个 \ 使用的 Pipe Connector 被复用的情况。 - -```sql -SHOW PIPES -WHERE CONNECTOR USED BY -``` - -### 任务运行状态迁移 - -一个数据同步 pipe 在其被管理的生命周期中会经过多种状态: - -- **STOPPED:** pipe 处于停止运行状态。当管道处于该状态时,有如下几种可能: - - 当一个 pipe 被成功创建之后,其初始状态为暂停状态 - - 用户手动将一个处于正常运行状态的 pipe 暂停,其状态会被动从 RUNNING 变为 STOPPED - - 当一个 pipe 运行过程中出现无法恢复的错误时,其状态会自动从 RUNNING 变为 STOPPED -- **RUNNING:** pipe 正在正常工作 -- **DROPPED:** pipe 任务被永久删除 - -下图表明了所有状态以及状态的迁移: - -![状态迁移图](/img/%E7%8A%B6%E6%80%81%E8%BF%81%E7%A7%BB%E5%9B%BE.png) - -## 系统预置数据同步插件 - -### 查看预置插件 - -用户可以按需查看系统中的插件。查看插件的语句如图所示。 - -```sql -SHOW PIPEPLUGINS -``` - -### 预置 extractor 插件 - -#### iotdb-extractor - -作用:抽取 IoTDB 内部的历史或实时数据进入 pipe。 - - -| key | value | value 取值范围 | required or optional with default | -| ---------------------------------- | ------------------------------------------------ | -------------------------------------- | --------------------------------- | -| extractor | iotdb-extractor | String: iotdb-extractor | required | -| extractor.pattern | 用于筛选时间序列的路径前缀 | String: 任意的时间序列前缀 | optional: root | -| extractor.history.enable | 是否同步历史数据 | Boolean: true, false | optional: true | -| extractor.history.start-time | 同步历史数据的开始 event time,包含 start-time | Long: [Long.MIN_VALUE, 
Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| extractor.history.end-time | 同步历史数据的结束 event time,包含 end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| extractor.realtime.enable | 是否同步实时数据 | Boolean: true, false | optional: true | -| extractor.realtime.mode | 实时数据的抽取模式 | String: hybrid, log, file | optional: hybrid | -| extractor.forwarding-pipe-requests | 是否转发由其他 Pipe (通常是数据同步)写入的数据 | Boolean: true, false | optional: true | - -> 🚫 **extractor.pattern 参数说明** -> -> * Pattern 需用反引号修饰不合法字符或者是不合法路径节点,例如如果希望筛选 root.\`a@b\` 或者 root.\`123\`,应设置 pattern 为 root.\`a@b\` 或者 root.\`123\`(具体参考 [单双引号和反引号的使用时机](https://iotdb.apache.org/zh/Download/#_1-0-版本不兼容的语法详细说明)) -> * 在底层实现中,当检测到 pattern 为 root(默认值)时,同步效率较高,其他任意格式都将降低性能 -> * 路径前缀不需要能够构成完整的路径。例如,当创建一个包含参数为 'extractor.pattern'='root.aligned.1' 的 pipe 时: -> -> * root.aligned.1TS -> * root.aligned.1TS.\`1\` -> * root.aligned.100TS -> -> 的数据会被同步; -> -> * root.aligned.\`1\` -> * root.aligned.\`123\` -> -> 的数据不会被同步。 -> * root.\_\_system 的数据不会被 pipe 抽取,即不会被同步到目标端。用户虽然可以在 extractor.pattern 中包含任意前缀,包括带有(或覆盖) root.\__system 的前缀,但是 root.__system 下的数据总是会被 pipe 忽略的 - -> ❗️**extractor.history 的 start-time,end-time 参数说明** -> -> * start-time,end-time 应为 ISO 格式,例如 2011-12-03T10:15:30 或 2011-12-03T10:15:30+01:00 - -> ✅ **一条数据从生产到落库 IoTDB,包含两个关键的时间概念** -> -> * **event time:** 数据实际生产时的时间(或者数据生产系统给数据赋予的生成时间,是数据点中的时间项),也称为事件时间。 -> * **arrival time:** 数据到达 IoTDB 系统内的时间。 -> -> 我们常说的乱序数据,指的是数据到达时,其 **event time** 远落后于当前系统时间(或者已经落库的最大 **event time**)的数据。另一方面,不论是乱序数据还是顺序数据,只要它们是新到达系统的,那它们的 **arrival time** 都是会随着数据到达 IoTDB 的顺序递增的。 - -> 💎 **iotdb-extractor 的工作可以拆分成两个阶段** -> -> 1. 历史数据抽取:所有 **arrival time** < 创建 pipe 时**当前系统时间**的数据称为历史数据 -> 2. 
实时数据抽取:所有 **arrival time** >= 创建 pipe 时**当前系统时间**的数据称为实时数据 -> -> 历史数据传输阶段和实时数据传输阶段,**两阶段串行执行,只有当历史数据传输阶段完成后,才执行实时数据传输阶段。** -> -> 用户可以指定 iotdb-extractor 进行: -> -> * 历史数据抽取(`'extractor.history.enable' = 'true'`, `'extractor.realtime.enable' = 'false'` ) -> * 实时数据抽取(`'extractor.history.enable' = 'false'`, `'extractor.realtime.enable' = 'true'` ) -> * 全量数据抽取(`'extractor.history.enable' = 'true'`, `'extractor.realtime.enable' = 'true'` ) -> * 禁止同时设置 `extractor.history.enable` 和 `extractor.realtime.enable` 为 `false` - -> 📌 **extractor.realtime.mode:数据抽取的模式** -> -> * log:该模式下,任务仅使用操作日志进行数据处理、发送 -> * file:该模式下,任务仅使用数据文件进行数据处理、发送 -> * hybrid:该模式,考虑了按操作日志逐条目发送数据时延迟低但吞吐低的特点,以及按数据文件批量发送时发送吞吐高但延迟高的特点,能够在不同的写入负载下自动切换适合的数据抽取方式,首先采取基于操作日志的数据抽取方式以保证低发送延迟,当产生数据积压时自动切换成基于数据文件的数据抽取方式以保证高发送吞吐,积压消除时自动切换回基于操作日志的数据抽取方式,避免了采用单一数据抽取算法难以平衡数据发送延迟或吞吐的问题。 - -> 🍕 **extractor.forwarding-pipe-requests:是否允许转发从另一 pipe 传输而来的数据** -> -> * 如果要使用 pipe 构建 A -> B -> C 的数据同步,那么 B -> C 的 pipe 需要将该参数为 true 后,A -> B 中 A 通过 pipe 写入 B 的数据才能被正确转发到 C -> * 如果要使用 pipe 构建 A \<-> B 的双向数据同步(双活),那么 A -> B 和 B -> A 的 pipe 都需要将该参数设置为 false,否则将会造成数据无休止的集群间循环转发 - -### 预置 processor 插件 - -#### do-nothing-processor - -作用:不对 extractor 传入的事件做任何的处理。 - - -| key | value | value 取值范围 | required or optional with default | -| --------- | -------------------- | ---------------------------- | --------------------------------- | -| processor | do-nothing-processor | String: do-nothing-processor | required | - -### 预置 connector 插件 - -#### iotdb-thrift-sync-connector(别名:iotdb-thrift-connector) - -作用:主要用于 IoTDB(v1.2.0+)与 IoTDB(v1.2.0+)之间的数据传输。 -使用 Thrift RPC 框架传输数据,单线程 blocking IO 模型。 -保证接收端 apply 数据的顺序与发送端接受写入请求的顺序一致。 - -限制:源端 IoTDB 与 目标端 IoTDB 版本都需要在 v1.2.0+。 - - -| key | value | value 取值范围 | required or optional with default | -| --------------------------------- | --------------------------------------------------------------------------- | ---------------------------------------------------------------------------- | 
----------------------------------------------------- | -| connector | iotdb-thrift-connector 或 iotdb-thrift-sync-connector | String: iotdb-thrift-connector 或 iotdb-thrift-sync-connector | required | -| connector.ip | 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip | String | optional: 与 connector.node-urls 任选其一填写 | -| connector.port | 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port | Integer | optional: 与 connector.node-urls 任选其一填写 | -| connector.node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url | String。例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | optional: 与 connector.ip:connector.port 任选其一填写 | -| connector.batch.enable | 是否开启日志攒批发送模式,用于提高传输吞吐,降低 IOPS | Boolean: true, false | optional: true | -| connector.batch.max-delay-seconds | 在开启日志攒批发送模式时生效,表示一批数据在发送前的最长等待时间(单位:s) | Integer | optional: 1 | -| connector.batch.size-bytes | 在开启日志攒批发送模式时生效,表示一批数据最大的攒批大小(单位:byte) | Long | optional: 16 * 1024 * 1024 (16MiB) | - -> 📌 请确保接收端已经创建了发送端的所有时间序列,或是开启了自动创建元数据,否则将会导致 pipe 运行失败。 - -#### iotdb-thrift-async-connector - -作用:主要用于 IoTDB(v1.2.0+)与 IoTDB(v1.2.0+)之间的数据传输。 -使用 Thrift RPC 框架传输数据,多线程 async non-blocking IO 模型,传输性能高,尤其适用于目标端为分布式时的场景。 -不保证接收端 apply 数据的顺序与发送端接受写入请求的顺序一致,但是保证数据发送的完整性(at-least-once)。 - -限制:源端 IoTDB 与 目标端 IoTDB 版本都需要在 v1.2.0+。 - - -| key | value | value 取值范围 | required or optional with default | -| --------------------------------- | --------------------------------------------------------------------------- | ---------------------------------------------------------------------------- | ----------------------------------------------------- | -| connector | iotdb-thrift-async-connector | String: iotdb-thrift-async-connector | required | -| connector.ip | 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip | String | optional: 与 connector.node-urls 任选其一填写 | -| connector.port | 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port | Integer | optional: 与 connector.node-urls 任选其一填写 | -| connector.node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url | 
String。例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | optional: 与 connector.ip:connector.port 任选其一填写 | -| connector.batch.enable | 是否开启日志攒批发送模式,用于提高传输吞吐,降低 IOPS | Boolean: true, false | optional: true | -| connector.batch.max-delay-seconds | 在开启日志攒批发送模式时生效,表示一批数据在发送前的最长等待时间(单位:s) | Integer | optional: 1 | -| connector.batch.size-bytes | 在开启日志攒批发送模式时生效,表示一批数据最大的攒批大小(单位:byte) | Long | optional: 16 * 1024 * 1024 (16MiB) | - -> 📌 请确保接收端已经创建了发送端的所有时间序列,或是开启了自动创建元数据,否则将会导致 pipe 运行失败。 - -#### iotdb-legacy-pipe-connector - -作用:主要用于 IoTDB(v1.2.0+)向更低版本的 IoTDB 传输数据,使用 v1.2.0 版本前的数据同步(Sync)协议。 -使用 Thrift RPC 框架传输数据。单线程 sync blocking IO 模型,传输性能较弱。 - -限制:源端 IoTDB 版本需要在 v1.2.0+,目标端 IoTDB 版本可以是 v1.2.0+、v1.1.x(更低版本的 IoTDB 理论上也支持,但是未经测试)。 - -注意:理论上 v1.2.0+ IoTDB 可作为 v1.2.0 版本前的任意版本的数据同步(Sync)接收端。 - - -| key | value | value 取值范围 | required or optional with default | -| ------------------ | --------------------------------------------------------------------- | ----------------------------------- | --------------------------------- | -| connector | iotdb-legacy-pipe-connector | String: iotdb-legacy-pipe-connector | required | -| connector.ip | 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip | String | required | -| connector.port | 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port | Integer | required | -| connector.user | 目标端 IoTDB 的用户名,注意该用户需要支持数据写入、TsFile Load 的权限 | String | optional: root | -| connector.password | 目标端 IoTDB 的密码,注意该用户需要支持数据写入、TsFile Load 的权限 | String | optional: root | -| connector.version | 目标端 IoTDB 的版本,用于伪装自身实际版本,绕过目标端的版本一致性检查 | String | optional: 1.1 | - -> 📌 请确保接收端已经创建了发送端的所有时间序列,或是开启了自动创建元数据,否则将会导致 pipe 运行失败。 - -#### iotdb-air-gap-connector - -作用:用于 IoTDB(v1.2.2+)向 IoTDB(v1.2.2+)跨单向数据网闸的数据同步。支持的网闸型号包括南瑞 Syskeeper 2000 等。 -该 Connector 使用 Java 自带的 Socket 实现数据传输,单线程 blocking IO 模型,其性能与 iotdb-thrift-sync-connector 相当。 -保证接收端 apply 数据的顺序与发送端接受写入请求的顺序一致。 - -场景:例如,在电力系统的规范中 - -> 1.I/II 区与 III 区之间的应用程序禁止采用 SQL 命令访问数据库和基于 B/S 方式的双向数据传输 -> -> 2.I/II 区与 III 
区之间的数据通信,传输的启动端由内网发起,反向的应答报文不容许携带数据,应用层的应答报文最多为 1 个字节,并且 1 个字节为全 0 或者全 1 两种状态 - -限制: - -1. 源端 IoTDB 与 目标端 IoTDB 版本都需要在 v1.2.2+。 -2. 单向数据网闸需要允许 TCP 请求跨越,且每一个请求可返回一个全 1 或全 0 的 byte。 -3. 目标端 IoTDB 需要在 iotdb-common.properties 内,配置 - a. pipe_air_gap_receiver_enabled=true - b. pipe_air_gap_receiver_port 配置 receiver 的接收端口 - - -| key | value | value 取值范围 | required or optional with default | -| -------------------------------------- | ---------------------------------------------------------------- | ---------------------------------------------------------------------------- | ----------------------------------------------------- | -| connector | iotdb-air-gap-connector | String: iotdb-air-gap-connector | required | -| connector.ip | 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip | String | optional: 与 connector.node-urls 任选其一填写 | -| connector.port | 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port | Integer | optional: 与 connector.node-urls 任选其一填写 | -| connector.node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url | String。例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | optional: 与 connector.ip:connector.port 任选其一填写 | -| connector.air-gap.handshake-timeout-ms | 发送端与接收端在首次尝试建立连接时握手请求的超时时长,单位:毫秒 | Integer | optional: 5000 | - -> 📌 请确保接收端已经创建了发送端的所有时间序列,或是开启了自动创建元数据,否则将会导致 pipe 运行失败。 - -#### do-nothing-connector - -作用:不对 processor 传入的事件做任何的处理。 - - -| key | value | value 取值范围 | required or optional with default | -| --------- | -------------------- | ---------------------------- | --------------------------------- | -| connector | do-nothing-connector | String: do-nothing-connector | required | - -## 权限管理 - -| 权限名称 | 描述 | -| ----------- | -------------------- | -| CREATE_PIPE | 注册任务。路径无关。 | -| START_PIPE | 开启任务。路径无关。 | -| STOP_PIPE | 停止任务。路径无关。 | -| DROP_PIPE | 卸载任务。路径无关。 | -| SHOW_PIPES | 查询任务。路径无关。 | - -## 配置参数 - -在 iotdb-common.properties 中: - -```Properties -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the 
pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_connector_timeout_ms=900000 - -# The maximum number of selectors that can be used in the async connector. -# pipe_async_connector_selector_number=1 - -# The core number of clients that can be used in the async connector. -# pipe_async_connector_core_client_number=8 - -# The maximum number of clients that can be used in the async connector. -# pipe_async_connector_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. 
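-# pipe_air_gap_receiver_port=9780
-```
-
-## Features
-
-### Delivery guarantee: **at-least-once**
-
-When transferring data to an external system, data synchronization provides at-least-once delivery semantics. In most scenarios it can provide an exactly-once guarantee, meaning every piece of data is synchronized exactly once.
-
-In the following scenarios, however, some data may be synchronized more than once **(resumable transfer)**:
-
-- Temporary network failure: after a transfer request fails, the system retries sending until the maximum number of attempts is reached
-- Faulty Pipe plugin logic: when a plugin throws an error at runtime, the system retries sending until the maximum number of attempts is reached
-- Data partition leader switch caused by a DataNode crash, restart, or similar event: once the partition change completes, the affected data is transferred again
-- Cluster unavailable: once the cluster is available again, the affected data is transferred again
-
-### Source side: data writes are asynchronously decoupled from Pipe processing and sending
-
-Data synchronization transfers data in asynchronous replication mode.
-
-Synchronization is fully decoupled from write operations and does not affect the critical write path. This lets the framework preserve the write speed of the time series database while keeping data synchronization continuous.
-
-### Source side: a transfer strategy that adapts to the write load
-
-The transfer method is adjusted dynamically according to the write load. By default, synchronization uses a dynamic mix of TsFile transfer and operation-stream transfer (`'extractor.realtime.mode'='hybrid'`).
-
-Under a high write load, TsFile transfer is preferred. TsFiles have a high compression ratio, which saves network bandwidth.
-
-Under a low write load, operation-stream transfer is preferred. Operation-stream transfer has lower latency.
-
-### Source side: Pipe service is highly available in a highly available cluster deployment
-
-When the sending IoTDB is deployed as a highly available cluster, the data synchronization service is highly available as well. The synchronization framework monitors the synchronization progress of every DataNode and periodically takes lightweight, distributed consistent snapshots to save the synchronization state.
-
-- When a DataNode in the sending cluster goes down, the framework can use the consistent snapshot and the data kept on replicas to resume synchronization quickly, which makes the synchronization service highly available.
-- When the whole sending cluster goes down and restarts, the framework can also use the snapshot to resume the synchronization service.
diff --git a/src/zh/UserGuide/V1.2.x/User-Manual/Streaming_timecho.md b/src/zh/UserGuide/V1.2.x/User-Manual/Streaming_timecho.md
deleted file mode 100644
index 1d4e8e6aa..000000000
--- a/src/zh/UserGuide/V1.2.x/User-Manual/Streaming_timecho.md
+++ /dev/null
@@ -1,814 +0,0 @@
-
-
-# Stream Processing Framework
-
-The IoTDB stream processing framework lets users implement custom stream processing logic: listening to and capturing changes in the storage engine, transforming the changed data, and pushing the transformed data outward.
-
-A stream processing task is called a Pipe. A stream processing task (Pipe) consists of three subtasks:
-
-- Extract
-- Process
-- Connect (send)
-
-The framework lets users write the logic of the three subtasks in Java and process data in a way similar to UDFs.
-Within a Pipe, the three subtasks are carried out by three kinds of plugins, and data passes through them in order:
-the Pipe Extractor extracts data, the Pipe Processor processes data, and the Pipe Connector sends data, so that the data finally reaches an external system.
-
-**The model of a Pipe task is as follows:**
-
-![Task model diagram](/img/%E5%90%8C%E6%AD%A5%E5%BC%95%E6%93%8E.jpeg)
-
-Describing a stream processing task essentially means describing the properties of the Pipe Extractor, Pipe Processor, and Pipe Connector plugins.
-Users can configure the properties of the three subtasks declaratively with SQL statements, and combine different properties to build flexible data ETL pipelines.
-
-With the stream processing framework, a complete data link can be built to meet needs such as device-edge-cloud synchronization, remote disaster recovery, and separating read and write workloads across databases.
-
-## Developing custom stream processing plugins
-
-### Development dependency
-
-Maven is the recommended build tool. Add the following dependency to `pom.xml`, and make sure its version matches the version of the IoTDB server.
-
-```xml
-<dependency>
-    <groupId>org.apache.iotdb</groupId>
-    <artifactId>pipe-api</artifactId>
-    <version>1.2.1</version>
-    <scope>provided</scope>
-</dependency>
-```
-
-### Event-driven programming model
-
-The user programming interface of stream processing plugins follows the general design ideas of the event-driven programming model. An event (Event) is the data abstraction of the programming interface. The interface is decoupled from how execution actually happens, so the user only needs to describe how the system should handle an event (data) once it arrives.
-
-In this programming interface, an event is an abstraction of a data write operation in the database. Events are captured by the stream processing engine on a single node and passed through the three stages of stream processing in order: the PipeExtractor plugin, the PipeProcessor plugin, and the PipeConnector plugin. User logic is triggered in each of the three plugins in turn.
-
-To achieve both low latency under low load on the device side and high throughput under high load on the device side, the engine dynamically chooses between operation logs and data files as the objects to process. The programming interface therefore requires the user to provide processing logic for two kinds of events: the operation log insertion event TabletInsertionEvent and the data file insertion event TsFileInsertionEvent.
-
-#### **Operation log insertion event (TabletInsertionEvent)**
-
-The operation log insertion event (TabletInsertionEvent) is a high-level data abstraction of a user write request. Through a unified interface, it gives users the ability to manipulate the underlying data of the write request.
-
-The underlying storage structure behind this event differs with the deployment mode of the database. In a standalone deployment, the event wraps a write-ahead log (WAL) entry. In a distributed deployment, it wraps an operation log entry of the consensus protocol on a single node.
-
-The request structure behind this event also differs with the write interface that produced the write. IoTDB provides many write interfaces, such as InsertRecord, InsertRecords, InsertTablet, and InsertTablets. Each kind of write request is serialized in a completely different way, and the resulting binary entries differ as well.
-
-The operation log insertion event gives users a unified view for operating on data. It hides the implementation differences of the underlying data structures, which greatly lowers the barrier to programming and makes the feature easier to use.
-
-```java
-/** TabletInsertionEvent is used to define the event of data insertion. */
-public interface TabletInsertionEvent extends Event {
-
-  /**
-   * The consumer processes the data row by row and collects the results by RowCollector.
-   *
-   * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the
-   *     results collected by the RowCollector
-   */
-  Iterable<TabletInsertionEvent> processRowByRow(BiConsumer<Row, RowCollector> consumer);
-
-  /**
-   * The consumer processes the Tablet directly and collects the results by RowCollector.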
- * - * @return {@code Iterable} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable processTablet(BiConsumer consumer); -} -``` - -#### **数据文件写入事件(TsFileInsertionEvent)** - -数据文件写入事件(TsFileInsertionEvent) 是对数据库文件落盘操作的高层抽象,它是若干操作日志写入事件(TabletInsertionEvent)的数据集合。 - -IoTDB 的存储引擎是 LSM 结构的。数据写入时会先将写入操作落盘到日志结构的文件里,同时将写入数据保存在内存里。当内存达到控制上限,则会触发刷盘行为,即将内存中的数据转换为数据库文件,同时删除之前预写的操作日志。当内存中的数据转换为数据库文件中的数据时,会经过编码压缩和通用压缩两次压缩处理,因此数据库文件的数据相比内存中的原始数据占用的空间更少。 - -在极端的网络情况下,直接传输数据文件相比传输数据写入的操作要更加经济,它会占用更低的网络带宽,能实现更快的传输速度。当然,天下没有免费的午餐,对文件中的数据进行计算处理,相比直接对内存中的数据进行计算处理时,需要额外付出文件 I/O 的代价。但是,正是磁盘数据文件和内存写入操作两种结构各有优劣的存在,给了系统做动态权衡调整的机会,也正是基于这样的观察,插件的事件模型中才引入了数据文件写入事件。 - -综上,数据文件写入事件出现在流处理插件的事件流中,存在下面两种情况: - -(1)历史数据抽取:一个流处理任务开始前,所有已经落盘的写入数据都会以 TsFile 的形式存在。一个流处理任务开始后,采集历史数据时,历史数据将以 TsFileInsertionEvent 作为抽象; - -(2)实时数据抽取:一个流处理任务进行时,当数据流中实时处理操作日志写入事件的速度慢于写入请求速度一定进度之后,未来得及处理的操作日志写入事件会被被持久化至磁盘,以 TsFile 的形式存在,这一些数据被流处理引擎抽取到后,会以 TsFileInsertionEvent 作为抽象。 - -```java -/** - * TsFileInsertionEvent is used to define the event of writing TsFile. Event data stores in disks, - * which is compressed and encoded, and requires IO cost for computational processing. - */ -public interface TsFileInsertionEvent extends Event { - - /** - * The method is used to convert the TsFileInsertionEvent into several TabletInsertionEvents. - * - * @return {@code Iterable} the list of TabletInsertionEvent - */ - Iterable toTabletInsertionEvents(); -} -``` - -### 自定义流处理插件编程接口定义 - -基于自定义流处理插件编程接口,用户可以轻松编写数据抽取插件、数据处理插件和数据发送插件,从而使得流处理功能灵活适配各种工业场景。 - -#### 数据抽取插件接口 - -数据抽取是流处理数据从数据抽取到数据发送三阶段的第一阶段。数据抽取插件(PipeExtractor)是流处理引擎和存储引擎的桥梁,它通过监听存储引擎的行为, -捕获各种数据写入事件。 - -```java -/** - * PipeExtractor - * - *

<p>PipeExtractor is responsible for capturing events from sources.
- *
- * <p>Various data sources can be supported by implementing different PipeExtractor classes.
- *
- * <p>The lifecycle of a PipeExtractor is as follows:
- *
- * <ul>
- *   <li>When a collaboration task is created, the KV pairs of `WITH EXTRACTOR` clause in SQL are
- *       parsed and the validation method {@link PipeExtractor#validate(PipeParameterValidator)}
- *       will be called to validate the parameters.
- *   <li>Before the collaboration task starts, the method {@link
- *       PipeExtractor#customize(PipeParameters, PipeExtractorRuntimeConfiguration)} will be called
- *       to config the runtime behavior of the PipeExtractor.
- *   <li>Then the method {@link PipeExtractor#start()} will be called to start the PipeExtractor.
- *   <li>While the collaboration task is in progress, the method {@link PipeExtractor#supply()} will
- *       be called to capture events from sources and then the events will be passed to the
- *       PipeProcessor.
- *   <li>The method {@link PipeExtractor#close()} will be called when the collaboration task is
- *       cancelled (the `DROP PIPE` command is executed).
- * </ul>
- */ -public interface PipeExtractor extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeExtractor#customize(PipeParameters, PipeExtractorRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeExtractor. In this method, the user can do the - * following things: - * - *
<ul>
- *   <li>Use PipeParameters to parse key-value pair attributes entered by the user.
- *   <li>Set the running configurations in PipeExtractorRuntimeConfiguration.
- * </ul>
- *
- * <p>
This method is called after the method {@link - * PipeExtractor#validate(PipeParameterValidator)} is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeExtractor - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeExtractorRuntimeConfiguration configuration) - throws Exception; - - /** - * Start the extractor. After this method is called, events should be ready to be supplied by - * {@link PipeExtractor#supply()}. This method is called after {@link - * PipeExtractor#customize(PipeParameters, PipeExtractorRuntimeConfiguration)} is called. - * - * @throws Exception the user can throw errors if necessary - */ - void start() throws Exception; - - /** - * Supply single event from the extractor and the caller will send the event to the processor. - * This method is called after {@link PipeExtractor#start()} is called. - * - * @return the event to be supplied. the event may be null if the extractor has no more events at - * the moment, but the extractor is still running for more events. - * @throws Exception the user can throw errors if necessary - */ - Event supply() throws Exception; -} -``` - -#### 数据处理插件接口 - -数据处理是流处理数据从数据抽取到数据发送三阶段的第二阶段。数据处理插件(PipeProcessor)主要用于过滤和转换由数据抽取插件(PipeExtractor)捕获的 -各种事件。 - -```java -/** - * PipeProcessor - * - *

<p>PipeProcessor is used to filter and transform the Event formed by the PipeExtractor.
- *
- * <p>The lifecycle of a PipeProcessor is as follows:
- *
- * <ul>
- *   <li>When a collaboration task is created, the KV pairs of `WITH PROCESSOR` clause in SQL are
- *       parsed and the validation method {@link PipeProcessor#validate(PipeParameterValidator)}
- *       will be called to validate the parameters.
- *   <li>Before the collaboration task starts, the method {@link
- *       PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} will be called
- *       to config the runtime behavior of the PipeProcessor.
- *   <li>While the collaboration task is in progress:
- *       <ul>
- *         <li>PipeExtractor captures the events and wraps them into three types of Event instances.
- *         <li>PipeProcessor processes the event and then passes them to the PipeConnector. The
- *             following 3 methods will be called: {@link
- *             PipeProcessor#process(TabletInsertionEvent, EventCollector)}, {@link
- *             PipeProcessor#process(TsFileInsertionEvent, EventCollector)} and {@link
- *             PipeProcessor#process(Event, EventCollector)}.
- *         <li>PipeConnector serializes the events into binaries and send them to sinks.
- *       </ul>
- *   <li>When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link
- *       PipeProcessor#close() } method will be called.
- * </ul>
- */ -public interface PipeProcessor extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeProcessor. In this method, the user can do the - * following things: - * - *
<ul>
- *   <li>Use PipeParameters to parse key-value pair attributes entered by the user.
- *   <li>Set the running configurations in PipeProcessorRuntimeConfiguration.
- * </ul>
- *
- * <p>
This method is called after the method {@link - * PipeProcessor#validate(PipeParameterValidator)} is called and before the beginning of the - * events processing. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeProcessor - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeProcessorRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is called to process the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(TabletInsertionEvent tabletInsertionEvent, EventCollector eventCollector) - throws Exception; - - /** - * This method is called to process the TsFileInsertionEvent. - * - * @param tsFileInsertionEvent TsFileInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - default void process(TsFileInsertionEvent tsFileInsertionEvent, EventCollector eventCollector) - throws Exception { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - process(tabletInsertionEvent, eventCollector); - } - } - - /** - * This method is called to process the Event. - * - * @param event Event to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(Event event, EventCollector eventCollector) throws Exception; -} -``` - -#### 数据发送插件接口 - -数据发送是流处理数据从数据抽取到数据发送三阶段的第三阶段。数据发送插件(PipeConnector)主要用于发送经由数据处理插件(PipeProcessor)处理过后的 -各种事件,它作为流处理框架的网络实现层,接口上应允许接入多种实时通信协议和多种连接器。 - -```java -/** - * PipeConnector - * - *

<p>PipeConnector is responsible for sending events to sinks.
- *
- * <p>Various network protocols can be supported by implementing different PipeConnector classes.
- *
- * <p>The lifecycle of a PipeConnector is as follows:
- *
- * <ul>
- *   <li>When a collaboration task is created, the KV pairs of `WITH CONNECTOR` clause in SQL are
- *       parsed and the validation method {@link PipeConnector#validate(PipeParameterValidator)}
- *       will be called to validate the parameters.
- *   <li>Before the collaboration task starts, the method {@link
- *       PipeConnector#customize(PipeParameters, PipeConnectorRuntimeConfiguration)} will be called
- *       to config the runtime behavior of the PipeConnector and the method {@link
- *       PipeConnector#handshake()} will be called to create a connection with sink.
- *   <li>While the collaboration task is in progress:
- *       <ul>
- *         <li>PipeExtractor captures the events and wraps them into three types of Event instances.
- *         <li>PipeProcessor processes the event and then passes them to the PipeConnector.
- *         <li>PipeConnector serializes the events into binaries and send them to sinks. The
- *             following 3 methods will be called: {@link
- *             PipeConnector#transfer(TabletInsertionEvent)}, {@link
- *             PipeConnector#transfer(TsFileInsertionEvent)} and {@link
- *             PipeConnector#transfer(Event)}.
- *       </ul>
- *   <li>When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link
- *       PipeConnector#close() } method will be called.
- * </ul>
- *
- * <p>
In addition, the method {@link PipeConnector#heartbeat()} will be called periodically to check - * whether the connection with sink is still alive. The method {@link PipeConnector#handshake()} - * will be called to create a new connection with the sink when the method {@link - * PipeConnector#heartbeat()} throws exceptions. - */ -public interface PipeConnector extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeConnector#customize(PipeParameters, PipeConnectorRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeConnector. In this method, the user can do the - * following things: - * - *

<ul>
- *   <li>Use PipeParameters to parse key-value pair attributes entered by the user.
- *   <li>Set the running configurations in PipeConnectorRuntimeConfiguration.
- * </ul>
- *
- * <p>
This method is called after the method {@link - * PipeConnector#validate(PipeParameterValidator)} is called and before the method {@link - * PipeConnector#handshake()} is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeConnector - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeConnectorRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is used to create a connection with sink. This method will be called after the - * method {@link PipeConnector#customize(PipeParameters, PipeConnectorRuntimeConfiguration)} is - * called or will be called when the method {@link PipeConnector#heartbeat()} throws exceptions. - * - * @throws Exception if the connection is failed to be created - */ - void handshake() throws Exception; - - /** - * This method will be called periodically to check whether the connection with sink is still - * alive. - * - * @throws Exception if the connection dies - */ - void heartbeat() throws Exception; - - /** - * This method is used to transfer the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(TabletInsertionEvent tabletInsertionEvent) throws Exception; - - /** - * This method is used to transfer the TsFileInsertionEvent. 
- * - * @param tsFileInsertionEvent TsFileInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - default void transfer(TsFileInsertionEvent tsFileInsertionEvent) throws Exception { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - transfer(tabletInsertionEvent); - } - } - - /** - * This method is used to transfer the Event. - * - * @param event Event to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(Event event) throws Exception; -} -``` - -## 自定义流处理插件管理 - -为了保证用户自定义插件在实际生产中的灵活性和易用性,系统还需要提供对插件进行动态统一管理的能力。 -本章节介绍的流处理插件管理语句提供了对插件进行动态统一管理的入口。 - -### 加载插件语句 - -在 IoTDB 中,若要在系统中动态载入一个用户自定义插件,则首先需要基于 PipeExtractor、 PipeProcessor 或者 PipeConnector 实现一个具体的插件类, -然后需要将插件类编译打包成 jar 可执行文件,最后使用加载插件的管理语句将插件载入 IoTDB。 - -加载插件的管理语句的语法如图所示。 - -```sql -CREATE PIPEPLUGIN <别名> -AS <全类名> -USING -``` - -示例:假如用户实现了一个全类名为edu.tsinghua.iotdb.pipe.ExampleProcessor 的数据处理插件,打包后的jar包为 pipe-plugin.jar ,用户希望在流处理引擎中使用这个插件,将插件标记为 example。插件包有两种使用方式,一种为上传到URI服务器,一种为上传到集群本地目录,两种方法任选一种即可。 - -【方式一】上传到URI服务器 - -准备工作:使用该种方式注册,您需要提前将 JAR 包上传到 URI 服务器上并确保执行注册语句的IoTDB实例能够访问该 URI 服务器。例如 https://example.com:8080/iotdb/pipe-plugin.jar 。 - -创建语句: - -```sql -CREATE PIPEPLUGIN example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI -``` - -【方式二】上传到集群本地目录 - -准备工作:使用该种方式注册,您需要提前将 JAR 包放置到DataNode节点所在机器的任意路径下,推荐您将JAR包放在IoTDB安装路径的/ext/pipe目录下(安装包中已有,无需新建)。例如:iotdb-1.x.x-bin/ext/pipe/pipe-plugin.jar。(**注意:如果您使用的是集群,那么需要将 JAR 包放置到每个 DataNode 节点所在机器的该路径下)** - -创建语句: - -```sql -CREATE PIPEPLUGIN example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI -``` - -### 删除插件语句 - -当用户不再想使用一个插件,需要将插件从系统中卸载时,可以使用如图所示的删除插件语句。 - -```sql -DROP PIPEPLUGIN <别名> -``` - -### 查看插件语句 - -用户也可以按需查看系统中的插件。查看插件的语句如图所示。 - -```sql -SHOW PIPEPLUGINS 
-``` - -## 系统预置的流处理插件 - -### 预置 extractor 插件 - -#### iotdb-extractor - -作用:抽取 IoTDB 内部的历史或实时数据进入 pipe。 - - -| key | value | value 取值范围 | required or optional with default | -| ---------------------------------- | ------------------------------------------------ | -------------------------------------- | --------------------------------- | -| extractor | iotdb-extractor | String: iotdb-extractor | required | -| extractor.pattern | 用于筛选时间序列的路径前缀 | String: 任意的时间序列前缀 | optional: root | -| extractor.history.enable | 是否抽取历史数据 | Boolean: true, false | optional: true | -| extractor.history.start-time | 抽取的历史数据的开始 event time,包含 start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| extractor.history.end-time | 抽取的历史数据的结束 event time,包含 end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| extractor.realtime.enable | 是否抽取实时数据 | Boolean: true, false | optional: true | -| extractor.realtime.mode | 实时数据的抽取模式 | String: hybrid, log, file | optional: hybrid | -| extractor.forwarding-pipe-requests | 是否抽取由其他 Pipe (通常是数据同步)写入的数据 | Boolean: true, false | optional: true | - -> 🚫 **extractor.pattern 参数说明** -> -> * Pattern 需用反引号修饰不合法字符或者是不合法路径节点,例如如果希望筛选 root.\`a@b\` 或者 root.\`123\`,应设置 pattern 为 root.\`a@b\` 或者 root.\`123\`(具体参考 [单双引号和反引号的使用时机](https://iotdb.apache.org/zh/Download/#_1-0-版本不兼容的语法详细说明)) -> * 在底层实现中,当检测到 pattern 为 root(默认值)时,抽取效率较高,其他任意格式都将降低性能 -> * 路径前缀不需要能够构成完整的路径。例如,当创建一个包含参数为 'extractor.pattern'='root.aligned.1' 的 pipe 时: - > - > * root.aligned.1TS -> * root.aligned.1TS.\`1\` -> * root.aligned.100T - > - > 的数据会被抽取; - > - > * root.aligned.\`1\` -> * root.aligned.\`123\` - > - > 的数据不会被抽取。 -> * root.\_\_system 的数据不会被 pipe 抽取。用户虽然可以在 extractor.pattern 中包含任意前缀,包括带有(或覆盖) root.\__system 的前缀,但是 root.__system 下的数据总是会被 pipe 忽略的 - -> ❗️**extractor.history 的 start-time,end-time 参数说明** -> -> * start-time,end-time 应为 ISO 格式,例如 2011-12-03T10:15:30 或 2011-12-03T10:15:30+01:00 - -> ✅ **一条数据从生产到落库 IoTDB,包含两个关键的时间概念** -> -> * 
**event time:** 数据实际生产时的时间(或者数据生产系统给数据赋予的生成时间,是数据点中的时间项),也称为事件时间。 -> * **arrival time:** 数据到达 IoTDB 系统内的时间。 -> -> 我们常说的乱序数据,指的是数据到达时,其 **event time** 远落后于当前系统时间(或者已经落库的最大 **event time**)的数据。另一方面,不论是乱序数据还是顺序数据,只要它们是新到达系统的,那它们的 **arrival time** 都是会随着数据到达 IoTDB 的顺序递增的。 - -> 💎 **iotdb-extractor 的工作可以拆分成两个阶段** -> -> 1. 历史数据抽取:所有 **arrival time** < 创建 pipe 时**当前系统时间**的数据称为历史数据 -> 2. 实时数据抽取:所有 **arrival time** >= 创建 pipe 时**当前系统时间**的数据称为实时数据 -> -> 历史数据传输阶段和实时数据传输阶段,**两阶段串行执行,只有当历史数据传输阶段完成后,才执行实时数据传输阶段。** -> -> 用户可以指定 iotdb-extractor 进行: -> -> * 历史数据抽取(`'extractor.history.enable' = 'true'`, `'extractor.realtime.enable' = 'false'` ) -> * 实时数据抽取(`'extractor.history.enable' = 'false'`, `'extractor.realtime.enable' = 'true'` ) -> * 全量数据抽取(`'extractor.history.enable' = 'true'`, `'extractor.realtime.enable' = 'true'` ) -> * 禁止同时设置 `extractor.history.enable` 和 `extractor.realtime.enable` 为 `false` - -> 📌 **extractor.realtime.mode:数据抽取的模式** -> -> * log:该模式下,任务仅使用操作日志进行数据处理、发送 -> * file:该模式下,任务仅使用数据文件进行数据处理、发送 -> * hybrid:该模式,考虑了按操作日志逐条目发送数据时延迟低但吞吐低的特点,以及按数据文件批量发送时发送吞吐高但延迟高的特点,能够在不同的写入负载下自动切换适合的数据抽取方式,首先采取基于操作日志的数据抽取方式以保证低发送延迟,当产生数据积压时自动切换成基于数据文件的数据抽取方式以保证高发送吞吐,积压消除时自动切换回基于操作日志的数据抽取方式,避免了采用单一数据抽取算法难以平衡数据发送延迟或吞吐的问题。 - -> 🍕 **extractor.forwarding-pipe-requests:是否允许转发从另一 pipe 传输而来的数据** -> -> * 如果要使用 pipe 构建 A -> B -> C 的数据同步,那么 B -> C 的 pipe 需要将该参数为 true 后,A -> B 中 A 通过 pipe 写入 B 的数据才能被正确转发到 C -> * 如果要使用 pipe 构建 A \<-> B 的双向数据同步(双活),那么 A -> B 和 B -> A 的 pipe 都需要将该参数设置为 false,否则将会造成数据无休止的集群间循环转发 - -### 预置 processor 插件 - -#### do-nothing-processor - -作用:不对 extractor 传入的事件做任何的处理。 - - -| key | value | value 取值范围 | required or optional with default | -| --------- | -------------------- | ---------------------------- | --------------------------------- | -| processor | do-nothing-processor | String: do-nothing-processor | required | - -### 预置 connector 插件 - -#### do-nothing-connector - -作用:不对 processor 传入的事件做任何的处理。 - - -| key | value | value 取值范围 | required or optional with default | -| 
--------- | -------------------- | ---------------------------- | --------------------------------- | -| connector | do-nothing-connector | String: do-nothing-connector | required | - -## 流处理任务管理 - -### 创建流处理任务 - -使用 `CREATE PIPE` 语句来创建流处理任务。以数据同步流处理任务的创建为例,示例 SQL 语句如下: - -```sql -CREATE PIPE <PipeId> -- PipeId 是能够唯一标定流处理任务的名字 -WITH EXTRACTOR ( - -- 默认的 IoTDB 数据抽取插件 - 'extractor' = 'iotdb-extractor', - -- 路径前缀,只有能够匹配该路径前缀的数据才会被抽取,用作后续的处理和发送 - 'extractor.pattern' = 'root.timecho', - -- 是否抽取历史数据 - 'extractor.history.enable' = 'true', - -- 描述被抽取的历史数据的时间范围,表示最早时间 - 'extractor.history.start-time' = '2011-12-03T10:15:30+01:00', - -- 描述被抽取的历史数据的时间范围,表示最晚时间 - 'extractor.history.end-time' = '2022-12-03T10:15:30+01:00', - -- 是否抽取实时数据 - 'extractor.realtime.enable' = 'true', - -- 描述实时数据的抽取方式 - 'extractor.realtime.mode' = 'hybrid', -) -WITH PROCESSOR ( - -- 默认的数据处理插件,即不做任何处理 - 'processor' = 'do-nothing-processor', -) -WITH CONNECTOR ( - -- IoTDB 数据发送插件,目标端为 IoTDB - 'connector' = 'iotdb-thrift-connector', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip - 'connector.ip' = '127.0.0.1', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port - 'connector.port' = '6667', -) -``` - -**创建流处理任务时需要配置 PipeId 以及三个插件部分的参数:** - - -| 配置项 | 说明 | 是否必填 | 默认实现 | 默认实现说明 | 是否允许自定义实现 | -| --------- | --------------------------------------------------- | --------------------------- | -------------------- | -------------------------------------------------------- | ------------------------- | -| PipeId | 全局唯一标定一个流处理任务的名称 | 必填 | - | - | - | -| extractor | Pipe Extractor 插件,负责在数据库底层抽取流处理数据 | 选填 | iotdb-extractor | 将数据库的全量历史数据和后续到达的实时数据接入流处理任务 | 否 | -| processor | Pipe Processor 插件,负责处理数据 | 选填 | do-nothing-processor | 对传入的数据不做任何处理 | | -| connector | Pipe Connector 插件,负责发送数据 | 必填 | - | - | | - -示例中,使用了 iotdb-extractor、do-nothing-processor 和 iotdb-thrift-connector 插件构建数据流处理任务。IoTDB 还内置了其他的流处理插件,**请查看“系统预置的流处理插件”一节**。 - -**一个最简的 CREATE PIPE 语句示例如下:** - -```sql -CREATE PIPE <PipeId> -- PipeId 是能够唯一标定流处理任务的名字 -WITH CONNECTOR ( - -- IoTDB 
数据发送插件,目标端为 IoTDB - 'connector' = 'iotdb-thrift-connector', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip - 'connector.ip' = '127.0.0.1', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port - 'connector.port' = '6667', -) -``` - -其表达的语义是:将本数据库实例中的全量历史数据和后续到达的实时数据,同步到目标为 127.0.0.1:6667 的 IoTDB 实例上。 - -**注意:** - -- EXTRACTOR 和 PROCESSOR 为选填配置,若不填写配置参数,系统则会采用相应的默认实现 -- CONNECTOR 为必填配置,需要在 CREATE PIPE 语句中声明式配置 -- CONNECTOR 具备自复用能力。对于不同的流处理任务,如果它们的 CONNECTOR 具备完全相同的 KV 属性(所有属性的 key 对应的 value 都相同),**那么系统最终只会创建一个 CONNECTOR 实例**,以实现对连接资源的复用。 - - - 例如,有下面 pipe1, pipe2 两个流处理任务的声明: - - ```sql - CREATE PIPE pipe1 - WITH CONNECTOR ( - 'connector' = 'iotdb-thrift-connector', - 'connector.thrift.host' = 'localhost', - 'connector.thrift.port' = '9999', - ) - - CREATE PIPE pipe2 - WITH CONNECTOR ( - 'connector' = 'iotdb-thrift-connector', - 'connector.thrift.port' = '9999', - 'connector.thrift.host' = 'localhost', - ) - ``` - - - 因为它们对 CONNECTOR 的声明完全相同(**即使某些属性声明时的顺序不同**),所以框架会自动对它们声明的 CONNECTOR 进行复用,最终 pipe1, pipe2 的 CONNECTOR 将会是同一个实例。 -- 在 extractor 为默认的 iotdb-extractor,且 extractor.forwarding-pipe-requests 为默认值 true 时,请不要构建出包含数据循环同步的应用场景(会导致无限循环): - - - IoTDB A -> IoTDB B -> IoTDB A - - IoTDB A -> IoTDB A - -### 启动流处理任务 - -CREATE PIPE 语句成功执行后,流处理任务相关实例会被创建,但整个流处理任务的运行状态会被置为 STOPPED,即流处理任务不会立刻处理数据。 - -可以使用 START PIPE 语句使流处理任务开始处理数据: - -```sql -START PIPE <PipeId> -``` - -### 停止流处理任务 - -使用 STOP PIPE 语句使流处理任务停止处理数据: - -```sql -STOP PIPE <PipeId> -``` - -### 删除流处理任务 - -使用 DROP PIPE 语句使流处理任务停止处理数据(当流处理任务状态为 RUNNING 时),然后删除整个流处理任务: - -```sql -DROP PIPE <PipeId> -``` - -用户在删除流处理任务前,不需要执行 STOP 操作。 - -### 展示流处理任务 - -使用 SHOW PIPES 语句查看所有流处理任务: - -```sql -SHOW PIPES -``` - -查询结果如下: - -```sql -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -| ID| CreationTime | State|PipeExtractor|PipeProcessor|PipeConnector|ExceptionMessage| -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ 
-|iotdb-kafka|2022-03-30T20:58:30.689|RUNNING| ...| ...| ...| None| -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -|iotdb-iotdb|2022-03-31T12:55:28.129|STOPPED| ...| ...| ...| TException: ...| -+-----------+-----------------------+-------+-------------+-------------+-------------+----------------+ -``` - -可以使用 `` 指定想看的某个流处理任务状态: - -```sql -SHOW PIPE -``` - -您也可以通过 where 子句,判断某个 \ 使用的 Pipe Connector 被复用的情况。 - -```sql -SHOW PIPES -WHERE CONNECTOR USED BY -``` - -### 流处理任务运行状态迁移 - -一个流处理 pipe 在其被管理的生命周期中会经过多种状态: - -- **STOPPED:** pipe 处于停止运行状态。当管道处于该状态时,有如下几种可能: - - 当一个 pipe 被成功创建之后,其初始状态为暂停状态 - - 用户手动将一个处于正常运行状态的 pipe 暂停,其状态会被动从 RUNNING 变为 STOPPED - - 当一个 pipe 运行过程中出现无法恢复的错误时,其状态会自动从 RUNNING 变为 STOPPED -- **RUNNING:** pipe 正在正常工作 -- **DROPPED:** pipe 任务被永久删除 - -下图表明了所有状态以及状态的迁移: - -![状态迁移图](/img/%E7%8A%B6%E6%80%81%E8%BF%81%E7%A7%BB%E5%9B%BE.png) - -## 权限管理 - -### 流处理任务 - - -| 权限名称 | 描述 | -|-------------|---------------| -| CREATE_PIPE | 注册流处理任务。路径无关。 | -| START_PIPE | 开启流处理任务。路径无关。 | -| STOP_PIPE | 停止流处理任务。路径无关。 | -| DROP_PIPE | 卸载流处理任务。路径无关。 | -| SHOW_PIPES | 查询流处理任务。路径无关。 | - -### 流处理任务插件 - - -| 权限名称 | 描述 | -|-------------------|-----------------| -| CREATE_PIPEPLUGIN | 注册流处理任务插件。路径无关。 | -| DROP_PIPEPLUGIN | 卸载流处理任务插件。路径无关。 | -| SHOW_PIPEPLUGINS | 查询流处理任务插件。路径无关。 | - -## 配置参数 - -在 iotdb-common.properties 中: - -```Properties -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. 
-# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_connector_timeout_ms=900000 -``` diff --git a/src/zh/UserGuide/V1.2.x/User-Manual/Subscription_timecho.md b/src/zh/UserGuide/V1.2.x/User-Manual/Subscription_timecho.md deleted file mode 100644 index a6b79d313..000000000 --- a/src/zh/UserGuide/V1.2.x/User-Manual/Subscription_timecho.md +++ /dev/null @@ -1,430 +0,0 @@ - - -# IoTDB 数据订阅客户端 - -**IoTDB 的数据订阅客户端能够从特定的 IoTDB 集群内,按照一定的方式获取数据。** -我们提供了多语言的 api,能够实时获取 IoTDB 内的最新数据,且具有推送和拉取两种模式。下面将按照语言顺序列出这些 api。 - -## Java -提供 SubscriptionFactory 构造消费者,支持 PushConsumer 和 PullConsumer 两种消费风格: -```` java -public interface SubscriptionFactory { - - PushConsumer createPushConsumer(SubscriptionConfiguration subscriptionConfiguration) - throws SubscriptionException; - - PullConsumer createPullConsumer(SubscriptionConfiguration subscriptionConfiguration) - throws SubscriptionException; -} - -public class PushConsumer implements Consumer { - // 唤醒 - void resumeConsume(); - // 暂停 - void pauseConsume(); - // 是否暂停 - boolean isConsumePaused(); - - // 订阅消息到达后的回调 Listener - PushConsumer registerSubscriptionListener(SubscriptionListener listener) throws SubscriptionException; - // 错误出现后的回调 Listener - PushConsumer registerErrorListener(ReceiveErrorListener listener) throws SubscriptionException; - - void close(); -} - -public class PullConsumer implements Consumer { - // 获取队列顶部消息,客户端自己循环调用。比如:consumer.poll(Duration.ofMillis(100)) - List poll(Duration timeout) throws SubscriptionException; - - void openSubscription(); - - void close(); -} -```` -## C/C++ -IoTDB 提供与 Java 类似的订阅接口: -```` C -push_consumer * iotdb_create_push_consumer(subscription_config *cnf); -pull_consumer * iotdb_create_pull_consumer(subscription_config *cnf); - -int push_consumer_resume(push_consumer *pc); -int push_consumer_pause(push_consumer *pc); -int 
push_consumer_is_consume_paused(push_consumer *pc); -int push_consumer_register_subscription_listener(push_consumer *pc, SUBSCRIPTION_LISTENER listener); -int push_consumer_register_error_listener(push_consumer *pc, SERROR_LISTENER listener); -int push_consumer_close(push_consumer *pc); - -consumer_dataset * pull_consumer_poll(pull_consumer *pc, int64_t timeout); -int pull_consumer_open(pull_consumer *pc); -int pull_consumer_close(pull_consumer *pc); -```` - -## Python -Python 的订阅接口与 Java 的类似,订阅方式如下: -```python -from abc import ABC, abstractmethod -from typing import List -from datetime import timedelta - -class SubscriptionFactory(ABC): - @abstractmethod - def createPushConsumer(self, subscriptionConfiguration): - pass - - @abstractmethod - def createPullConsumer(self, subscriptionConfiguration): - pass - - -class PushConsumer: - def resumeConsume(self): - pass - - def pauseConsume(self): - pass - - def isConsumePaused(self): - pass - - def registerSubscriptionListener(self, listener): - pass - - def registerErrorListener(self, listener): - pass - - def close(self): - pass - - -class PullConsumer: - def poll(self, timeout: timedelta) -> List: - pass - - def openSubscription(self): - pass - - def close(self): - pass -``` - -## Go -Go 语言内的订阅方式如下: -```go -package main - -import ( -\t"time" -) - -type SubscriptionFactory interface { -\tCreatePushConsumer(subscriptionConfiguration SubscriptionConfiguration) (PushConsumer, error) -\tCreatePullConsumer(subscriptionConfiguration SubscriptionConfiguration) (PullConsumer, error) -} - -type PushConsumer interface { -\tResumeConsume() -\tPauseConsume() -\tIsConsumePaused() bool -\tRegisterSubscriptionListener(listener SubscriptionListener) error -\tRegisterErrorListener(listener ReceiveErrorListener) error -\tClose() -} - -type PullConsumer interface { -\tPoll(timeout time.Duration) ([]ConsumerDataSet, error) -\tOpenSubscription() -\tClose() -} - -type ConsumerDataSet struct { -\t// define fields of ConsumerDataSet -} - -type 
SubscriptionConfiguration struct { -\t// define fields of SubscriptionConfiguration -} - -type SubscriptionListener interface { -\t// define methods of SubscriptionListener -} - -type ReceiveErrorListener interface { -\t// define methods of ReceiveErrorListener -} -``` -## Rust -Rust 语言内的订阅方式如下: -``` rust -use std::time::Duration; - -pub trait SubscriptionFactory { - fn create_push_consumer(&self, subscription_configuration: SubscriptionConfiguration) - -> Result; - - fn create_pull_consumer(&self, subscription_configuration: SubscriptionConfiguration) - -> Result; -} - -pub struct PushConsumer { - // 唤醒 - fn resume_consume(&self); - - // 暂停 - fn pause_consume(&self); - - // 是否暂停 - fn is_consume_paused(&self) -> bool; - - // 订阅消息到达后的回调 Listener - fn register_subscription_listener(&self, listener: SubscriptionListener) - -> Result; - - // 错误出现后的回调 Listener - fn register_error_listener(&self, listener: ReceiveErrorListener) - -> Result; - - fn close(&self); -} - -pub struct PullConsumer { - // 获取队列顶部消息,客户端自己循环调用。比如:consumer.poll(Duration::from_millis(100)) - fn poll(&self, timeout: Duration) -> Result, SubscriptionException>; - - fn openSubscription(&self); - - fn close(&self); -} -``` -## Node.JS -```javascript -class PushConsumer { - constructor() { - this.paused = false; - this.subscriptionListener = null; - this.errorListener = null; - } - - resumeConsume() { - this.paused = false; - } - - pauseConsume() { - this.paused = true; - } - - isConsumePaused() { - return this.paused; - } - - registerSubscriptionListener(listener) { - this.subscriptionListener = listener; - return this; - } - - registerErrorListener(listener) { - this.errorListener = listener; - return this; - } - - close() { - // 关闭操作 - } -} - -class PullConsumer { - poll(timeout) { - // 获取队列顶部消息的操作 - return []; - } - - openSubscription() { - // 打开连接 - } - - close() { - // 关闭操作 - } -} - -class SubscriptionFactory { - createPushConsumer(subscriptionConfiguration) { - return new PushConsumer(); - } - - 
createPullConsumer(subscriptionConfiguration) { - return new PullConsumer(); - } -} - -module.exports = { - SubscriptionFactory, - PushConsumer, - PullConsumer -}; -``` - -使用方式示例: - -```javascript -const { SubscriptionFactory } = require('./subscription'); - -const factory = new SubscriptionFactory(); -const subscriptionConfiguration = { /* 配置信息 */ }; - -const pushConsumer = factory.createPushConsumer(subscriptionConfiguration); -pushConsumer.registerSubscriptionListener((message) => { - // 处理订阅消息到达后的逻辑 -}).registerErrorListener((error) => { - // 处理错误出现后的逻辑 -}); - -const pullConsumer = factory.createPullConsumer(subscriptionConfiguration); - -// 使用 pushConsumer 和 pullConsumer 进行操作 -``` - -## C# -``` C# -public interface SubscriptionFactory -{ - PushConsumer CreatePushConsumer(SubscriptionConfiguration subscriptionConfiguration) - { - throw new SubscriptionException(); - } - - PullConsumer CreatePullConsumer(SubscriptionConfiguration subscriptionConfiguration) - { - throw new SubscriptionException(); - } -} - -public class PushConsumer : Consumer -{ - public void ResumeConsume() - { - // 唤醒逻辑 - } - - public void PauseConsume() - { - // 暂停逻辑 - } - - public bool IsConsumePaused() - { - // 判断是否暂停逻辑 - } - - public PushConsumer RegisterSubscriptionListener(SubscriptionListener listener) - { - throw new SubscriptionException(); - } - - public PushConsumer RegisterErrorListener(ReceiveErrorListener listener) - { - throw new SubscriptionException(); - } - - public void Close() - { - // 关闭逻辑 - } -} - -public class PullConsumer : Consumer -{ - public List Poll(TimeSpan timeout) - { - throw new SubscriptionException(); - } - - public void OpenSubscription() - { - // 开始逻辑 - } - - public void Close() - { - // 关闭逻辑 - } -} -``` - -## WebSocket 方式订阅 -同时,IoTDB 的订阅客户端还支持以 WebSocket 的方式订阅。 WebSocket 的默认客户端端口为 9090,也可以在客户端内配置。订阅消息为: -```json - { - "event": "subscribe", - "pattern": "root", - "unordered": "false", - "timeRange": "...", - "ValueRange": ">100" -} -``` -该消息将订阅 IoTDB 
excluding out-of-order data, filtered by a time range, and keeping only values greater than 100.
-
-You also need to define your own socket.onmessage function to handle the received data, and write functions such as socket.onclose and socket.onerror to customize how the client responds to those events.
-
-## Subscribing over MQTT
-IoTDB's MQTT interface currently supports data subscription, and its data format is the same as for WebSocket. The MQTT host and port must be configured in the IoTDB properties file.
-
-
-## Data Filtering
-Like data synchronization software, the IoTDB subscription feature can also filter data. The WebSocket and MQTT sections above already show examples.
-With the API, taking the Java subscription interface as an example, users can configure filter conditions (strategies) in SubscriptionConfiguration. The conditions currently supported are:
-- Whether to filter out out-of-order data (disorderHandlingStrategy)
-- The common prefix of the series to subscribe to (topicsStrategy)
-- The time range of the series (timeStrategy)
-- The value range of the series (valueStrategy)
-
-```java
-public class PullConsumerExample {
-
-  public static void test(String[] args) throws Throwable {
-    SubscriptionConfiguration config = new SubscriptionConfiguration.Builder()
-        .host("127.0.0.1")
-        .port(6667)
-        .user("root")
-        .password("root")
-        .group("my-test-group")
-        .build();
-
-    // Set the filter conditions here
-    config.disorderHandlingStrategy(new IntolerableStrategy())
-        .topicsStrategy(new SingleTopicStrategy("root.sg.d1.n1"))
-        .timeStrategy(new GlobalTimeStrategy())
-        .valueStrategy(ValueStrategy.GreaterThanStrategy(
-            new SingleTopicStrategy("root.sg.d1.n1"), 100d));
-
-    SubscriptionFactory factory = new SubscriptionFactory(config);
-    final PullConsumer pullConsumer = factory.createPullConsumer(config);
-    pullConsumer.openSubscription();
-    while (true) {
-      List<ConsumerDataSet> result = pullConsumer.poll(Duration.ofMillis(300));
-      for (ConsumerDataSet item : result) {
-        System.out.println("received message : " + item);
-      }
-    }
-  }
-}
-```
diff --git a/src/zh/UserGuide/V1.2.x/User-Manual/Tiered-Storage_timecho.md b/src/zh/UserGuide/V1.2.x/User-Manual/Tiered-Storage_timecho.md
deleted file mode 100644
index a8b9684ff..000000000
--- a/src/zh/UserGuide/V1.2.x/User-Manual/Tiered-Storage_timecho.md
+++ /dev/null
@@ -1,96 +0,0 @@
-
-
-# Tiered Storage
-## Overview
-
-Tiered storage lets users manage multiple kinds of storage media: users can configure different types of storage media for IoTDB and rank them into tiers. Based on how hot or cold the data is, IoTDB supports tiered storage spanning memory, SSD, ordinary hard disks, and network disks through parameter configuration alone. Concretely, tiered storage in IoTDB is configured as the management of multiple directories. Users can group multiple storage directories into one class and configure that group in IoTDB as a tier,
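which we call a storage tier. Users can also classify data as hot or cold and store each class in a designated tier. IoTDB currently classifies hot and cold data by TTL: when data in a tier no longer satisfies the TTL rule defined for that tier, it is automatically migrated to the next tier.
-
-## Parameter Definitions
-
-To enable tiered storage in IoTDB, configure the following:
-
-1. Configure the data directories and divide them into tiers.
-2. Configure the TTL of the data managed by each tier, to distinguish the hot and cold data classes handled by different tiers.
-3. Configure the minimum remaining storage space ratio for each tier; when a tier's storage space hits this threshold, its data is automatically migrated to the next tier (optional).
-
-The parameters and their descriptions are as follows.
-
-| Parameter | Default | Description | Constraints |
-| --------------------------------------- | ------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| dn_data_dirs | data/datanode/data | Specifies the storage directories and divides them into tiers | Tiers are separated by semicolons and directories within a tier by commas; object storage can only be the last tier and cannot be the first tier; at most one object storage target can be configured; the remote storage directory is written as OBJECT_STORAGE |
-| default_ttl_in_ms | -1 | Defines the data range each tier is responsible for, expressed as a TTL | Tiers are separated by semicolons; the number of tiers must match the number defined in dn_data_dirs |
-| dn_default_space_move_thresholds | 0.15 | Defines the minimum remaining space ratio of each tier's data directories; when the remaining space falls below this ratio, data is automatically migrated to the next tier; when the remaining space of the last tier falls below this threshold, the system is set to READ_ONLY | Tiers are separated by semicolons; the number of tiers must match the number defined in dn_data_dirs |
-| object_storage_type | AWS_S3 | Object storage type | IoTDB currently supports only AWS S3 as the remote storage type; this parameter cannot be changed |
-| object_storage_bucket | iotdb_data | Name of the object storage bucket | The bucket as defined in AWS S3; not needed if remote storage is not used |
-| object_storage_endpoint | | Endpoint of the object storage | The AWS S3 endpoint; not needed if remote storage is not used |
-| object_storage_access_key | | Credential key for the object storage | The AWS S3 credential key; not needed if remote storage is not used |
-| object_storage_access_secret | | Credential secret for the object storage | The AWS S3 credential secret; not needed if remote storage is not used |
-| remote_tsfile_cache_dirs | data/datanode/data/cache | Local cache directory for object storage | Not needed if remote storage is not used |
-| remote_tsfile_cache_page_size_in_kb | 20480 | Block size of the local cache files for object storage | Not needed if remote storage is not used |
-| remote_tsfile_cache_max_disk_usage_in_mb | 51200 | Maximum disk usage of the local cache for object storage | Not needed if remote storage is not used |
-
-## Local Tiered Storage Configuration Example
-
-The following is a configuration example with two local storage tiers.
-
-```JavaScript
-// Required settings
-dn_data_dirs=/data1/data;/data2/data,/data3/data
-default_ttl_in_ms=86400000;-1
-dn_default_space_move_thresholds=0.2;0.1
-```
-
-This example configures two storage tiers:
-
-| **Tier** | **Data directories** | **Data range** | **Minimum remaining disk space threshold** |
-| -------- | 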
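-------------------------------------- | --------------- | ------------------------ |
-| Tier 1 | Directory 1: /data1/data | Data from the last 1 day | 20% |
-| Tier 2 | Directory 1: /data2/data<br>Directory 2: /data3/data | Data older than 1 day | 10% |
-
-## Remote Tiered Storage Configuration Example
-
-The following uses three storage tiers as an example:
-
-```JavaScript
-// Required settings
-dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE
-default_ttl_in_ms=86400000;864000000;-1
-dn_default_space_move_thresholds=0.2;0.15;0.1
-object_storage_type=AWS_S3
-object_storage_bucket=iotdb
-object_storage_endpoint=
-object_storage_access_key=
-object_storage_access_secret=
-
-// Optional settings
-remote_tsfile_cache_dirs=data/datanode/data/cache
-remote_tsfile_cache_page_size_in_kb=20480
-remote_tsfile_cache_max_disk_usage_in_mb=51200
-```
-
-This example configures three storage tiers:
-
-| **Tier** | **Data directories** | **Data range** | **Minimum remaining disk space threshold** |
-| -------- | -------------------------------------- | ---------------------------- | ------------------------ |
-| Tier 1 | Directory 1: /data1/data | Data from the last 1 day | 20% |
-| Tier 2 | Directory 1: /data2/data<br>Directory 2: /data3/data | Data between 1 and 10 days old | 15% |
-| Tier 3 | Remote AWS S3 storage | Data older than 10 days | 10% |
diff --git a/src/zh/UserGuide/V1.3.x/AI-capability/AINode_timecho.md b/src/zh/UserGuide/V1.3.x/AI-capability/AINode_timecho.md
deleted file mode 100644
index 0e3b6ee97..000000000
--- a/src/zh/UserGuide/V1.3.x/AI-capability/AINode_timecho.md
+++ /dev/null
@@ -1,655 +0,0 @@
-
-
-# AINode
-
-AINode is a native IoTDB node that supports the registration, management, and invocation of time series related models. It includes industry-leading self-developed time series large models, such as the Timer series developed by Tsinghua University. Models can be invoked with standard SQL statements, enabling millisecond-level real-time inference on time series data and supporting scenarios such as trend prediction, missing value filling, and anomaly detection.
-
-The system architecture is shown in the following figure:
-::: center
-
-:::
-The responsibilities of the three nodes are as follows:
-
-- **ConfigNode**: Stores and manages model metadata; manages the distributed nodes.
-- **DataNode**: Receives and parses user SQL requests; stores time series data; performs data preprocessing.
-- **AINode**: Imports and creates model files and performs model inference.
-
-## 1. 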
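Advantages and Features
-
-Compared with building a separate machine learning service, AINode has the following advantages:
-
-- **Simple and easy to use**: No Python or Java programming is needed; the whole workflow of machine learning model management and inference can be done with SQL statements. For example, a model is created with the CREATE MODEL statement and used for inference with the CALL INFERENCE(...) statement.
-
-- **Avoids data migration**: With IoTDB-native machine learning, data stored in IoTDB can be fed directly into model inference without moving it to a separate machine learning platform, which speeds up data processing, improves security, and reduces cost.
-
-![](/img/h1.png)
-
-- **Built-in advanced algorithms**: Supports industry-leading machine learning analysis algorithms that cover typical time series analysis tasks, giving the time series database native data analysis capabilities. For example:
-  - **Time Series Forecasting**: Learns change patterns from past time series and, given past observations, outputs the most likely predictions for the future sequence.
-  - **Anomaly Detection for Time Series**: Detects and identifies outliers in the given time series data to help discover abnormal behavior.
-  - **Time Series Annotation**: Adds extra information or labels to each data point or to specific time ranges, such as event occurrences, anomalies, or trend changes, so that the data can be better understood and analyzed.
-
-
-## 2. Basic Concepts
-
-- **Model**: A machine learning model that takes time series data as input and outputs the result or decision of an analysis task. The model is the basic management unit of AINode, which supports creating (registering), deleting, querying, and using (inference with) models.
-- **Create**: Loads an externally designed or trained model file or algorithm into AINode, to be managed and used uniformly by IoTDB.
-- **Inference**: Uses a created model to complete, on the specified time series data, the time series analysis task that the model is suited for.
-- **Built-in**: AINode ships with machine learning algorithms or self-developed models for common time series analysis scenarios (for example, forecasting and anomaly detection).
-
-::: center
-
-:::
-
-## 3. Installation and Deployment
-
-For AINode deployment, see the [Deployment Guide](../Deployment-and-Maintenance/AINode_Deployment_timecho.md#AINode-部署) section.
-
-## 4. Usage Guide
-
-AINode provides creation and deletion workflows for deep learning models related to time series data. Built-in models need no creation or deletion and can be used directly; the built-in model instance created for an inference is destroyed automatically once the inference completes.
-
-### 4.1 Registering a Model
-
-A trained deep learning model can be registered by specifying the vector dimensions of its input and output, after which it can be used for inference.
-
-Models that meet the following requirements can be registered in AINode:
-  1. Models trained with PyTorch 2.1.0 or 2.2.0, the versions AINode supports; avoid features introduced after 2.2.0.
-  2. AINode supports models saved with PyTorch JIT; the model file must contain both the parameters and the structure of the model.
-  3. The model input sequence may contain one or more columns; with multiple columns, they must match the model's capability and the model configuration file.
-  4. 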
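The input and output dimensions of the model must be explicitly defined in the `config.yaml` configuration file. When using the model, the input and output dimensions defined in `config.yaml` must be followed strictly; an error occurs if the number of input or output columns does not match the configuration file.
-
-The SQL syntax for model registration is defined below.
-
-```SQL
-create model <model_name> using uri <uri>
-```
-
-The parameters in the SQL have the following meanings:
-
-- model_name: The globally unique identifier of the model; it cannot be repeated. Model names are subject to the following constraints:
-
-  - Allowed characters are [ 0-9 a-z A-Z _ ] (letters, digits, underscore)
-  - Length is limited to 2-64 characters
-  - Case-sensitive
-
-- uri: The resource path of the model registration files. The path should contain the **model weight file model.pt and the model metadata description file config.yaml**
-
-  - Model weight file: the weight file obtained after training a deep learning model; currently the .pt file produced by PyTorch training is supported
-
-  - YAML metadata description file: the parameters related to the model structure that must be provided at registration; it must include the input and output dimensions of the model, which are used for inference:
-
-  - | **Parameter** | **Description** | **Example** |
-    | ------------ | ---------------------------- | -------- |
-    | input_shape | Rows and columns of the model input, used for inference | [96,2] |
-    | output_shape | Rows and columns of the model output, used for inference | [48,2] |
-
-  - Besides the shapes used for inference, the data types of the model input and output can also be specified:
-
-  - | **Parameter** | **Description** | **Example** |
-    | ----------- | ------------------ | --------------------- |
-    | input_type | Data types of the model input | ['float32','float32'] |
-    | output_type | Data types of the model output | ['float32','float32'] |
-
-  - In addition, notes can be specified for display during model management
-
-  - | **Parameter** | **Description** | **Example** |
-    | ---------- | ---------------------------------------------- | ------------------------------------------- |
-    | attributes | Optional; user-defined notes about the model, used when displaying the model | 'model_type': 'dlinear','kernel_size': '25' |
-
-
-Besides registering local model files, a model can also be registered by giving a URI that points to a remote resource path, using an open-source model repository (for example, HuggingFace).
-
-#### Example
-
-The current example folder contains the files model.pt and config.yaml. model.pt is the result of training, and config.yaml has the following content:
-
-```YAML
-configs:
-    # Required
-    input_shape: [96, 2]      # The model accepts data of 96 rows x 2 columns
-    output_shape: [48, 2]     # The model outputs data of 48 rows x 2 columns
-
-    # Optional; defaults to float32 for every column, with as many columns as the shape specifies
-    input_type: ["int64","int64"] # Data types of the input; must match the number of input columns
-    output_type: ["text","int64"] # Data types of the output; must match the number of output columns
-
-attributes:           # Optional; user-defined notes
-   'model_type': 'dlinear'
-   'kernel_size': '25'
-```
-
-The model can be registered by specifying this folder as the load path
-
-```SQL
-IoTDB> create model dlinear_example using uri "file://./example"
-```
-
-The model files can also be downloaded from HuggingFace for registration
-
-```SQL
-IoTDB> create model dlinear_example using uri "https://huggingface.com/IoTDBML/dlinear/"
-```
-
-After the SQL is executed, registration proceeds asynchronously. The registration state can be checked by displaying the model (see the Viewing Models section). The time registration takes depends mainly on the size of the model files.
-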
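The naming rule and the `config.yaml` shape/type rules described in this section can be checked locally before running `create model`. The following is a minimal sketch only, not part of AINode: the helper names `is_valid_model_name` and `check_configs` are hypothetical, and the `configs` mapping is assumed to have already been loaded from `config.yaml` into a Python dict.

```python
import re

# Naming rule quoted from this section: only [0-9 a-z A-Z _], 2-64 characters,
# case-sensitive.
MODEL_NAME_RE = re.compile(r"[0-9A-Za-z_]{2,64}")


def is_valid_model_name(name):
    """Return True if `name` satisfies the documented naming rule."""
    return MODEL_NAME_RE.fullmatch(name) is not None


def check_configs(configs):
    """Return a list of problems found in the `configs` mapping of config.yaml."""
    problems = []
    shapes = {}
    for key in ("input_shape", "output_shape"):
        shape = configs.get(key)
        ok = (
            isinstance(shape, list)
            and len(shape) == 2
            and all(
                isinstance(n, int) and not isinstance(n, bool) and n > 0
                for n in shape
            )
        )
        if ok:
            shapes[key] = shape
        else:
            problems.append(key + " must be [rows, columns] with positive integers")
    pairs = (("input_shape", "input_type"), ("output_shape", "output_type"))
    for shape_key, type_key in pairs:
        types = configs.get(type_key)
        if types is None or shape_key not in shapes:
            continue  # types are optional; skip when the shape itself is invalid
        if not isinstance(types, list) or len(types) != shapes[shape_key][1]:
            problems.append(type_key + " must list one type per column of " + shape_key)
    return problems


# The same values as the config.yaml example in this section.
example = {
    "input_shape": [96, 2],
    "output_shape": [48, 2],
    "input_type": ["int64", "int64"],
    "output_type": ["text", "int64"],
}
print(is_valid_model_name("dlinear_example"), check_configs(example))
```

A check like this only catches mistakes in the name and the declared shapes; whether the `model.pt` file actually matches those shapes is still decided by AINode during registration.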
-Once registration is complete, the model can be used for inference by calling the corresponding function in an ordinary query.
-
-### 4.2 Viewing Models
-
-Details of registered models can be queried with the show models statement. Its SQL definition is:
-
-```SQL
-show models
-
-show models <model_name>
-```
-
-Besides showing the information of all models, a model id can be specified to view one particular model. The result contains the following information:
-
-| **ModelId** | **State** | **Configs** | **Attributes** |
-| ------------ | ------------------------------------- | ---------------------------------------------- | -------------- |
-| Unique model identifier | Model registration state (LOADING, ACTIVE, DROPPING, UNAVAILABLE) | InputShape, outputShape<br>InputTypes, outputTypes | Model notes |
-
-State shows the current registration state of the model and takes one of the following four values:
-
-- **LOADING**: The model metadata has been added to the ConfigNode, and the model files are being transferred to the AINode
-- **ACTIVE**: The model has been set up and is available
-- **DROPPING**: The model is being deleted; its information is being removed from the ConfigNode and the AINode
-- **UNAVAILABLE**: Model creation failed; the failed model_name can be removed with drop model.
-
-#### Example
-
-```SQL
-IoTDB> show models
-
-
-+---------------------+--------------------------+-----------+----------------------------+-----------------------+
-| ModelId| ModelType| State| Configs| Notes|
-+---------------------+--------------------------+-----------+----------------------------+-----------------------+
-| dlinear_example| USER_DEFINED| ACTIVE| inputShape:[96,2]| |
-| | | | outputShape:[48,2]| |
-| | | | inputDataType:[float,float]| |
-| | | |outputDataType:[float,float]| |
-| _STLForecaster| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB|
-| _NaiveForecaster| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB|
-| _ARIMA| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB|
-|_ExponentialSmoothing| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB|
-| _GaussianHMM|BUILT_IN_ANOMALY_DETECTION| ACTIVE| |Built-in model in IoTDB|
-| _GMMHMM|BUILT_IN_ANOMALY_DETECTION| ACTIVE| |Built-in model in IoTDB|
-| _Stray|BUILT_IN_ANOMALY_DETECTION| ACTIVE| |Built-in model in IoTDB|
-+---------------------+--------------------------+-----------+------------------------------------------------------------+-----------------------+
-```
-
-The model was registered earlier, and its state can be checked with this statement. ACTIVE indicates that the model was registered successfully and can be used for inference.
-
-### 4.3 Dropping a Model
-
-A successfully registered model can be dropped through SQL. Besides removing the metadata on the ConfigNode, this operation also deletes the related model files on every AINode. The SQL is:
-
-```SQL
-drop model <model_name>
-```
-
-The model_name of a successfully registered model must be given to drop it. Because dropping a model involves deleting data on several nodes, the operation does not complete immediately; during this time the model is in the DROPPING state and cannot be used for inference.
-
-### 4.4 Inference with Built-in Models
-
-The SQL syntax is:
-
-
-```SQL
-call inference(<built_in_model_name>,sql[,<parameterName>=<parameterValue>])
-```
-
-Built-in models need no registration. Inference is performed by calling the inference function with the call keyword. The parameters are:
-
-- **built_in_model_name**: Name of the built-in model
-- **parameterName**: Parameter name
-- **parameterValue**: Parameter value
-
-#### Built-in Models and Their Parameters
-
-The following machine learning models are currently built in; see the links below for their parameters.
-
-| Model | built_in_model_name | Task type | Parameters |
-| -------------------- | --------------------- | -------- | ------------------------------------------------------------ |
-| Arima | _Arima | Forecasting | [Arima parameters](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.arima.ARIMA.html?highlight=Arima) |
-| STLForecaster | _STLForecaster | Forecasting | [STLForecaster parameters](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.trend.STLForecaster.html#sktime.forecasting.trend.STLForecaster) |
-| NaiveForecaster | _NaiveForecaster | Forecasting | [NaiveForecaster parameters](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.naive.NaiveForecaster.html#naiveforecaster) |
-| ExponentialSmoothing | _ExponentialSmoothing | Forecasting | [ExponentialSmoothing parameters](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.exp_smoothing.ExponentialSmoothing.html) |
-| GaussianHMM | _GaussianHMM | Annotation | [GaussianHMM parameters](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.annotation.hmm_learn.gaussian.GaussianHMM.html) |
-| GMMHMM | _GMMHMM | Annotation | [GMMHMM parameters](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.annotation.hmm_learn.gmm.GMMHMM.html) |
-| Stray | _Stray | Anomaly detection | [Stray parameters](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.annotation.stray.STRAY.html) |
-
-#### Example
-
-The following example runs inference with a built-in model. It uses the built-in Stray model for anomaly detection, with input `[144,1]` and output `[144,1]`, invoked through SQL.
-
-```SQL
-IoTDB> select * 
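from root.eg.airline
-+-----------------------------+------------------+
-| Time|root.eg.airline.s0|
-+-----------------------------+------------------+
-|1949-01-31T00:00:00.000+08:00| 224.0|
-|1949-02-28T00:00:00.000+08:00| 118.0|
-|1949-03-31T00:00:00.000+08:00| 132.0|
-|1949-04-30T00:00:00.000+08:00| 129.0|
-......
-|1960-09-30T00:00:00.000+08:00| 508.0|
-|1960-10-31T00:00:00.000+08:00| 461.0|
-|1960-11-30T00:00:00.000+08:00| 390.0|
-|1960-12-31T00:00:00.000+08:00| 432.0|
-+-----------------------------+------------------+
-Total line number = 144
-
-IoTDB> call inference(_Stray, "select s0 from root.eg.airline", k=2)
-+-------+
-|output0|
-+-------+
-| 0|
-| 0|
-| 0|
-| 0|
-......
-| 1|
-| 1|
-| 0|
-| 0|
-| 0|
-| 0|
-+-------+
-Total line number = 144
-```
-
-### 4.5 Inference with Deep Learning Models
-
-The SQL syntax is:
-
-```SQL
-call inference(<model_name>,sql[,window=<window_function>])
-
-
-window_function:
-  head(window_size)
-  tail(window_size)
-  count(window_size,sliding_step)
-```
-
-After a model has been registered, inference is performed by calling the inference function with the call keyword. The parameters are:
-
-- **model_name**: A model that has already been registered
-- **sql**: A SQL query whose result is used as the model input. The row and column dimensions of the result must match the sizes specified in the model's config. (Using a `SELECT *` clause here is not recommended: in IoTDB, `*` does not sort the columns, so the column order is undefined. Use a form such as `SELECT s0,s1` to make sure the column order matches what the model expects.)
-- **window_function**: Window functions that can be used during inference. Three kinds are currently provided to assist model inference:
-  - **head(window_size)**: Takes the first window_size points of the data for inference; this window can be used to trim the data
-  ![](/img/AINode-call1.png)
-
-  - **tail(window_size)**: Takes the last window_size points of the data for inference; this window can be used to trim the data
-  ![](/img/AINode-call2.png)
-
-  - **count(window_size, sliding_step)**: A sliding window based on point count. The data of each window is passed through the model separately. In the example in the figure below, a window function with window_size 2 splits the input data set into three windows, and inference is run on each window to produce a result. This window can be used for continuous inference
-  ![](/img/AINode-call3.png)
-
-**Note 1: window can resolve a mismatch between the number of rows returned by the SQL query and the number of input rows the model requires, by trimming rows. When the number of columns does not match, or the number of rows is simply smaller than the model requires, inference cannot proceed and an error is returned.**
-
-**Note 2: In deep learning applications, features derived from timestamps (the time column of the data) are often fed to the model as covariates of generative tasks to improve model quality, but the model output usually does not contain a time column. To keep the implementation general, the inference result corresponds only to the actual model output; if the model does not output a time column, the result does not contain one.**
-
-
-#### Example
-
-The following example runs inference with a deep learning model. For the `dlinear` forecasting model mentioned above, with input `[96,2]` and output `[48,2]`, inference is invoked through SQL.
-
-```Shell
-IoTDB> select 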
s0,s1 from root.**
-+-----------------------------+-------------------+-------------------+
-| Time| root.eg.etth.s0| root.eg.etth.s1|
-+-----------------------------+-------------------+-------------------+
-|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611|
-|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61|
-|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293|
-|1990-01-04T00:00:00.000+08:00| 0.786| 1.637|
-|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653|
-|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537|
-|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662|
-......
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678|
-|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763|
-|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813|
-|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684|
-|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677|
-|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68|
-|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917|
-+-----------------------------+-------------------+-------------------+
-Total line number = 96
-
-IoTDB> call inference(dlinear_example,"select s0,s1 from root.**")
-+--------------------------------------------+-----------------------------+
-| _result_0| _result_1|
-+--------------------------------------------+-----------------------------+
-| 0.726302981376648| 1.6549958229064941|
-| 0.7354921698570251| 1.6482787370681763|
-| 0.7238251566886902| 1.6278168201446533|
-......
-| 0.7692174911499023| 1.654654049873352|
-| 0.7685555815696716| 1.6625318765640259|
-| 0.7856493592262268| 1.6508299350738525|
-+--------------------------------------------+-----------------------------+
-Total line number = 48
-```
-
-#### Example Using the tail/head Window Functions
-
-When the amount of data varies and the latest 96 rows should be used for inference, use the tail window function. The head function is used in the same way, except that it takes the earliest 96 points.
-
-```Shell
-IoTDB> select s0,s1 from root.**
-+-----------------------------+-------------------+-------------------+
-| Time| root.eg.etth.s0| root.eg.etth.s1|
-+-----------------------------+-------------------+-------------------+
-|1988-01-01T00:00:00.000+08:00| 0.7355| 1.211|
-...... 
-|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... -|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 996 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**",window=tail(96)) -+--------------------------------------------+-----------------------------+ -| _result_0| _result_1| -+--------------------------------------------+-----------------------------+ -| 0.726302981376648| 1.6549958229064941| -| 0.7354921698570251| 1.6482787370681763| -| 0.7238251566886902| 1.6278168201446533| -...... 
-| 0.7692174911499023| 1.654654049873352|
-| 0.7685555815696716| 1.6625318765640259|
-| 0.7856493592262268| 1.6508299350738525|
-+--------------------------------------------+-----------------------------+
-Total line number = 48
-```
-
-#### Example Using the count Window Function
-
-This window is mainly used for computational tasks. When the model can only process a fixed number of rows at a time but several groups of results are wanted, this window function performs continuous inference with a point-count sliding window. Suppose there is an anomaly detection model anomaly_example (input: [24,2], output: [1,1]) that produces one 0/1 label for every 24 rows of data. It is used as follows:
-
-```Shell
-IoTDB> select s0,s1 from root.**
-+-----------------------------+-------------------+-------------------+
-| Time| root.eg.etth.s0| root.eg.etth.s1|
-+-----------------------------+-------------------+-------------------+
-|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611|
-|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61|
-|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293|
-|1990-01-04T00:00:00.000+08:00| 0.786| 1.637|
-|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653|
-|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537|
-|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662|
-......
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678|
-|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763|
-|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813|
-|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684|
-|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677|
-|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68|
-|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917|
-+-----------------------------+-------------------+-------------------+
-Total line number = 96
-
-IoTDB> call inference(anomaly_example,"select s0,s1 from root.**",window=count(24,24))
-+-------------------------+
-| _result_0|
-+-------------------------+
-| 0|
-| 1|
-| 1|
-| 0|
-+-------------------------+
-Total line number = 4
-```
-
-Each row of the result set is the label the anomaly detection model outputs for one group of 24 input rows.
-
-
-### 4.6 Importing Time Series Large Models
-
-AINode currently supports several time series large models. For deployment and usage, see [Time Series Large Models](../AI-capability/TimeSeries-Large-Model)
-
-## 5. 
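Permission Management
-
-AINode features rely on IoTDB's own authentication for permission management. Users can use the model management features only if they have the USE_MODEL permission. To use inference, users need permission to access the source series referenced by the SQL that provides the model input.
-
-| Permission | Scope | Administrator (default ROOT) | Ordinary user | Path-related |
-| --------- | --------------------------------- | ---------------------- | -------- | -------- |
-| USE_MODEL | create model / show models / drop model | √ | √ | x |
-| READ_DATA | call inference | √ | √ | √ |
-
-## 6. Practical Examples
-
-### 6.1 Power Load Forecasting
-
-Some industrial scenarios need to forecast power load. The forecasts can be used to optimize power supply, save energy and resources, support planning and expansion, and improve the reliability of the power system.
-
-The ETTh1 test set data used here is [ETTh1](/img/ETTh1.csv).
-
-
-It contains power data collected at 1-hour intervals. Each record consists of loads and oil temperature: High UseFul Load, High UseLess Load, Middle UseFul Load, Middle UseLess Load, Low UseFul Load, Low UseLess Load, Oil Temperature.
-
-On this data set, the model inference feature of IoTDB-ML can use the relationship between past high, middle, and low load values and the oil temperature at the corresponding timestamps to predict the oil temperature over a future period, supporting automatic regulation and monitoring of grid transformers.
-
-#### Step 1: Import the Data
-
-Use `import-csv.sh` in the tools folder to import the ETT data set into IoTDB
-
-```Bash
-bash ./import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f ../../ETTh1.csv
-```
-
-#### Step 2: Import the Model
-
-Enter the following SQL in iotdb-cli to pull a trained model from HuggingFace and register it for later inference.
-
-```SQL
-create model dlinear using uri 'https://huggingface.co/hvlgo/dlinear/tree/main'
-```
-
-The model is trained on DLinear, a fairly lightweight deep model. It captures as much as possible of the trends within a series and of the relationships between variables at a relatively high inference speed, which makes it better suited to fast real-time forecasting than deeper models.
-
-#### Step 3: Run Inference
-
-```Shell
-IoTDB> select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth LIMIT 96
-+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+
-| Time|root.eg.etth.s0|root.eg.etth.s1|root.eg.etth.s2|root.eg.etth.s3|root.eg.etth.s4|root.eg.etth.s5|root.eg.etth.s6|
-+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+
-|2017-10-20T00:00:00.000+08:00| 10.449| 3.885| 8.706| 2.025| 2.041| 0.944| 8.864|
-|2017-10-20T01:00:00.000+08:00| 11.119| 3.952| 8.813| 2.31| 2.071| 1.005| 8.442|
-|2017-10-20T02:00:00.000+08:00| 9.511| 2.88| 7.533| 1.564| 1.949| 0.883| 8.16|
-|2017-10-20T03:00:00.000+08:00| 9.645| 2.21| 7.249| 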
1.066| 1.828| 0.914| 7.949| -...... -|2017-10-23T20:00:00.000+08:00| 8.105| 0.938| 4.371| -0.569| 3.533| 1.279| 9.708| -|2017-10-23T21:00:00.000+08:00| 7.167| 1.206| 4.087| -0.462| 3.107| 1.432| 8.723| -|2017-10-23T22:00:00.000+08:00| 7.1| 1.34| 4.015| -0.32| 2.772| 1.31| 8.864| -|2017-10-23T23:00:00.000+08:00| 9.176| 2.746| 7.107| 1.635| 2.65| 1.097| 9.004| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example, "select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth", window=head(96)) -+-----------+----------+----------+------------+---------+----------+----------+ -| output0| output1| output2| output3| output4| output5| output6| -+-----------+----------+----------+------------+---------+----------+----------+ -| 10.319546| 3.1450553| 7.877341| 1.5723765|2.7303758| 1.1362307| 8.867775| -| 10.443649| 3.3286757| 7.8593454| 1.7675098| 2.560634| 1.1177158| 8.920919| -| 10.883752| 3.2341104| 8.47036| 1.6116762|2.4874182| 1.1760603| 8.798939| -...... 
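-| 8.0115595| 1.2995274| 6.9900327|-0.098746896| 3.04923| 1.176214| 9.548782|
-| 8.612427| 2.5036244| 5.6790237| 0.66474205|2.8870275| 1.2051733| 9.330128|
-| 10.096699| 3.399722| 6.9909| 1.7478468|2.7642853| 1.1119363| 9.541455|
-+-----------+----------+----------+------------+---------+----------+----------+
-Total line number = 48
-```
-
-Comparing the predicted oil temperature with the actual values gives the following figure.
-
-In the figure, the data before 10/24 00:00 is the past data fed to the model, the blue line after 10/24 00:00 is the oil temperature predicted by the model, and the red line is the actual oil temperature in the data set (for comparison).
-
-![](/img/AINode-analysis1.png)
-
-The model uses the relationship between six load values and the oil temperature over the past 96 hours (4 days) and, based on the inter-series relationships learned earlier, models the likely changes of the oil temperature over the next 48 hours (2 days). After visualization, the predicted curve follows the trend of the actual values closely.
-
-### 6.2 Power Prediction
-
-Substations need to monitor current, voltage, power, and similar data in order to detect potential grid problems, identify faults in the power system, manage grid load effectively, and analyze the performance and trends of the power system.
-
-A real-world data set was built from the current, voltage, and power data of a substation. It covers nearly four months and contains phase A voltage, phase B voltage, phase C voltage, and other data collected every 5 to 6 s.
-
-The test set data is [data](/img/data.csv).
-
-On this data set, the model inference feature of IoTDB-ML can use past phase A, phase B, and phase C voltage values and their timestamps to predict the phase C voltage over a future period, supporting substation monitoring and management.
-
-#### Step 1: Import the Data
-
-Use `import-csv.sh` in the tools folder to import the data set
-
-```Bash
-bash ./import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f ../../data.csv
-```
-
-#### Step 2: Import the Model
-
-In iotdb-cli, either a built-in model or an already registered model can be chosen for the inference that follows.
-
-Here the built-in model STLForecaster is used. STLForecaster is a time series forecasting method based on the STL implementation in the statsmodels library.
-
-#### Step 3: Run Inference
-
-```Shell
-IoTDB> select * from root.eg.voltage limit 96
-+-----------------------------+------------------+------------------+------------------+
-| Time|root.eg.voltage.s0|root.eg.voltage.s1|root.eg.voltage.s2|
-+-----------------------------+------------------+------------------+------------------+
-|2023-02-14T20:38:32.000+08:00| 2038.0| 2028.0| 2041.0|
-|2023-02-14T20:38:38.000+08:00| 2014.0| 2005.0| 2018.0|
-|2023-02-14T20:38:44.000+08:00| 2014.0| 2005.0| 2018.0|
-...... 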
-|2023-02-14T20:47:52.000+08:00| 2024.0| 2016.0| 2027.0|
-|2023-02-14T20:47:57.000+08:00| 2024.0| 2016.0| 2027.0|
-|2023-02-14T20:48:03.000+08:00| 2024.0| 2016.0| 2027.0|
-+-----------------------------+------------------+------------------+------------------+
-Total line number = 96
-
-IoTDB> call inference(_STLForecaster, "select s0,s1,s2 from root.eg.voltage", window=head(96),predict_length=48)
-+---------+---------+---------+
-| output0| output1| output2|
-+---------+---------+---------+
-|2026.3601|2018.2953|2029.4257|
-|2019.1538|2011.4361|2022.0888|
-|2025.5074|2017.4522|2028.5199|
-......
-
-|2022.2336|2015.0290|2025.1023|
-|2015.7241|2008.8975|2018.5085|
-|2022.0777|2014.9136|2024.9396|
-|2015.5682|2008.7821|2018.3458|
-+---------+---------+---------+
-Total line number = 48
-```
-Comparing the predicted phase C voltage with the actual values gives the following figure.
-
-In the figure, the data before 02/14 20:48 is the past data fed to the model, the blue line after 02/14 20:48 is the phase C voltage predicted by the model, and the red line is the actual phase C voltage in the data set (for comparison).
-
-![](/img/AINode-analysis2.png)
-
-The model uses the voltage data of the past 10 minutes and, based on the inter-series relationships learned earlier, models the likely changes of the phase C voltage over the next 5 minutes. After visualization, the predicted curve stays reasonably in step with the trend of the actual values.
-
-### 6.3 Anomaly Detection
-
-In civil aviation, there is a need to detect anomalies in passenger numbers. The detection results can guide adjustments to flight scheduling so that the airline earns more.
-
-Airline Passengers is a time series data set that records the number of international airline passengers from 1949 to 1960, sampled once a month. The data set contains a single time series. The data set is [airline](/img/airline.csv).
-On this data set, the model inference feature of IoTDB-ML can capture the patterns of change in the series and detect anomalies at individual time points, supporting the transportation industry.
-
-#### Step 1: Import the Data
-
-Use `import-csv.sh` in the tools folder to import the data set
-
-```Bash
-bash ./import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f ../../airline.csv
-```
-
-#### Step 2: Run Inference
-
-IoTDB has several built-in machine learning algorithms that can be used directly. The following example uses one of the anomaly detection algorithms:
-
-```Shell
-IoTDB> select * from root.eg.airline
-+-----------------------------+------------------+
-| Time|root.eg.airline.s0|
-+-----------------------------+------------------+
-|1949-01-31T00:00:00.000+08:00| 224.0|
-|1949-02-28T00:00:00.000+08:00| 118.0|
-|1949-03-31T00:00:00.000+08:00| 132.0|
-|1949-04-30T00:00:00.000+08:00| 129.0|
-...... 
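-|1960-09-30T00:00:00.000+08:00| 508.0|
-|1960-10-31T00:00:00.000+08:00| 461.0|
-|1960-11-30T00:00:00.000+08:00| 390.0|
-|1960-12-31T00:00:00.000+08:00| 432.0|
-+-----------------------------+------------------+
-Total line number = 144
-
-IoTDB> call inference(_Stray, "select s0 from root.eg.airline", k=2)
-+-------+
-|output0|
-+-------+
-| 0|
-| 0|
-| 0|
-| 0|
-......
-| 1|
-| 1|
-| 0|
-| 0|
-| 0|
-| 0|
-+-------+
-Total line number = 144
-```
-
-Plotting the points detected as anomalous gives the following figure. The blue curve is the original time series, and the time points marked with red dots are those the algorithm detected as anomalous.
-
-![](/img/s6.png)
-
-The Stray model has modeled the changes in the input series and detected the time points at which anomalies occur.
\ No newline at end of file
diff --git a/src/zh/UserGuide/V1.3.x/API/Programming-OPC-DA_timecho.md b/src/zh/UserGuide/V1.3.x/API/Programming-OPC-DA_timecho.md
deleted file mode 100644
index 435f28c80..000000000
--- a/src/zh/UserGuide/V1.3.x/API/Programming-OPC-DA_timecho.md
+++ /dev/null
@@ -1,208 +0,0 @@
-
-
-# OPC DA Protocol
-
-## 1. OPC DA
-
-OPC DA (OPC Data Access) is a communication protocol standard for industrial automation and a core part of classic OPC (OLE for Process Control) technology. Its main goal is real-time data exchange between industrial devices and software (such as SCADA, HMI, and databases) in a Windows environment. OPC DA is implemented on COM / DCOM and is a lightweight protocol with two roles, server and client.
-
-* **Server:** Can be regarded as a pool of items that stores the latest data and status of each instance. All items can only be managed on the server side; clients can only read and write data and have no right to operate on the metadata.
-
-![](/img/opc-da-1-1.png)
-
-* **Client:** After connecting to the server, the client defines a group (the group concerns only this client) and creates items with the same names as those on the server; it can then read and write the items it has created.
-
-![](/img/opc-da-1-2.png)
-
-## 2. 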
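OPC DA Sink
-
-The OPC DA Sink provided by IoTDB (supported in V1.3.5.2 and later V1.x versions) is a plugin that pushes tree-model data to a local COM server. It wraps the OPC DA interface specification and its inherent complexity, which greatly simplifies integration. The data flow of the OPC DA Sink is shown below.
-
-![](/img/opc-da-2-1.png)
-
-### 2.1 SQL Syntax
-
-```SQL
----- Note: replace the clsID here with your own clsID
-create pipe opc (
-    'sink'='opc-da-sink',
-    --- 'opcda.progid'='opcserversim.Instance.1'
-    'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C'
-);
-```
-
-### 2.2 Parameters
-
-| **Parameter** | **Description** | **Value range** | **Required** |
-| ------------------- | --------------------------------------------------------------------- | ----------------------- | ------------------ |
-| sink | OPC DA SINK | String: opc-da-sink | Required |
-| sink.opcda.clsid | ClsID of the OPC Server (a unique identifier string). Using the clsID rather than the progID is recommended. | String | Either this or progId |
-| sink.opcda.progid | ProgID of the OPC Server. If a clsID is available, the clsID takes precedence. | String | Either this or clsID |
-
-### 2.3 Mapping Rules
-
-In use, IoTDB pushes the latest data of its tree model to the server. The itemID of the data is the full path of the time series in the tree model, such as `root.a.b.c.d`. Note that under the OPC DA standard a client has no right to create items directly on the server side, so the server must create the IoTDB time series as items in advance, using the itemID and the corresponding data type.
-
-* The data types correspond as shown in the table below.
-
-| IoTDB | OPC-DA Server |
-| ----------- | ----------------------------------------------------------- |
-| INT32 | VT\_I4 |
-| INT64 | VT\_I8 |
-| FLOAT | VT\_R4 |
-| DOUBLE | VT\_R8 |
-| TEXT | VT\_BSTR |
-| BOOLEAN | VT\_BOOL |
-| DATE | VT\_DATE |
-| TIMESTAMP | VT\_DATE |
-| BLOB | VT\_BSTR (Variant does not support VT\_BLOB, so VT\_BSTR is used instead) |
-| STRING | VT\_BSTR |
-
-### 2.4 Common Error Codes
-
-| Symbol | Error code | Description |
-| ----------------------------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| OPC\_E\_BADTYPE | 0xC0040004 | The server cannot convert the data between the specified format/requested data type and the canonical data type. That is, the server's data type does not match the type registered in IoTDB. |
-| OPC\_E\_UNKNOWNITEMID| 0xC0040007 | The item ID is not defined in the server address space (on add or validate), or it no longer exists in the server address space (on read or write). That is, the IoTDB measurement has no corresponding itemID on the server. |
-| OPC\_E\_INVALIDITEMID | 0xC0040008 | The itemID does not conform to the server's syntax. |
-| REGDB\_E\_CLASSNOTREG | 0x80040154 | Class not registered |
-| RPC\_S\_SERVER\_UNAVAILABLE | 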
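0x800706ba | The RPC server is unavailable |
-| DISP\_E\_OVERFLOW | 0x8002000a | The value exceeds the maximum of the type |
-| DISP\_E\_TYPEMISMATCH | 0x80020005 | Type mismatch |
-
-### 2.5 Limitations
-
-* Only COM is supported, and only on Windows.
-* After a restart, a small amount of old data may be pushed, but the new data is pushed eventually.
-* Only tree-model data is currently supported.
-
-## 3. Usage Steps
-### 3.1 Prerequisites
-1. Windows environment, version >= 8
-2. IoTDB is installed and runs normally
-3. An OPC DA Server is installed
-
-* Simple OPC Server Simulator is used as the example here
-
-![](/img/opc-da-3-1.png)
-
-* Double-click an item to change its name (itemID), data, data type, and other information.
-* Right-click an item to delete it, update its value, or create a new item.
-
-![](/img/opc-da-3-2.png)
-
-4. An OPC DA Client is installed
-
-* The quickClient of KepwareServerEX is used as the example here
-* In Kepware, the OPC DA Client can be opened as follows
-
-![](/img/opc-da-3-3.png)
-
-![](/img/opc-da-3-4.png)
-
-
-### 3.2 Configuration Changes
-
-Change the server configuration so that the IoTDB writing client and the Kepware reading client do not connect to two different instances, which would make debugging impossible.
-
-* First press Win+R and enter `dcomcnfg` in the Run dialog to open the DCOM component configuration:
-
-![](/img/opc-da-3-5.png)
-
-* Click Component Services -> Computers -> My Computer -> DCOM Config, find `AGG Software Simple OPC Server Simulator`, and right-click "Properties":
-
-![](/img/opc-da-3-6.png)
-
-* On the `Identity` tab, change the user account to `The interactive user`. Do not select `The launching user` here, otherwise the two clients may each start a different server instance.
-
-![](/img/opc-da-3-7.png)
-
-### 3.3 Obtaining the clsID
-1. Method 1: obtain it from the DCOM configuration
-
-* Press Win+R and enter `dcomcnfg` in the Run dialog to open the DCOM component configuration;
-* Click Component Services -> Computers -> My Computer -> DCOM Config, find `AGG Software Simple OPC Server Simulator`, and right-click "Properties".
-* The `General` tab shows the clsID of the application, which is used later to connect the opc-da-sink. Do not include the braces
-
-![](/img/opc-da-3-8.png)
-
-2. Method 2: the clsID and progID can also be obtained directly in the server
-
-* Click `Help` > `Show OPC Server Info`
-
-![](/img/opc-da-3-9.png)
-
-* They are shown in the pop-up window
-
-![](/img/opc-da-3-10.png)
-
-### 3.4 Writing Data
-#### 3.4.1 DA Server
-1. Create a new item in the DA Server whose name and type match the item to be written from IoTDB
-
-![](/img/opc-da-3-11.png)
-
-2. Connect to the server in Kepware:
-
-![](/img/opc-da-3-12.png)
-
-3. Right-click the server and create a new group; any group name works:
-
-![](/img/opc-da-3-13.png)
-
-![](/img/opc-da-3-14.png)
-
-4. Right-click to create a new item, using the name created earlier
-
-![](/img/opc-da-3-15.png)
-
-![](/img/opc-da-3-16.png)
-
-![](/img/opc-da-3-17.png)
-
-#### 3.4.2 IoTDB
-1. Start IoTDB
-2. 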
Create the pipe
-
-```SQL
-create pipe opc ('sink'='opc-da-sink', 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C')
-```
-
-* Note: if creation fails with the message ` Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 1107: Failed to connect to server, error code: 0x80040154`, see this solution: https://opcexpert.com/support/0x80040154-class-not-registered/
-
-3. Create the time series (this step can be skipped if automatic metadata creation is enabled)
-
-```SQL
-create timeseries root.a.b.c.r string;
-```
-
-4. Insert data
-
-```SQL
-insert into root.a.b.c (time, r) values(10000, "SomeString")
-```
-
-### 3.5 Verifying the Data
-
-Check the data in Quick Client; it should have been updated.
-
-![](/img/opc-da-3-18.png)
\ No newline at end of file
diff --git a/src/zh/UserGuide/V1.3.x/API/Programming-OPC-UA_timecho.md b/src/zh/UserGuide/V1.3.x/API/Programming-OPC-UA_timecho.md
deleted file mode 100644
index 6661ae168..000000000
--- a/src/zh/UserGuide/V1.3.x/API/Programming-OPC-UA_timecho.md
+++ /dev/null
@@ -1,265 +0,0 @@
-
-
-# OPC UA Protocol
-
-## Subscribing to Data over OPC UA
-
-This feature lets users subscribe to data from IoTDB over the OPC UA protocol. Two communication modes are supported for subscription: Client/Server and Pub/Sub.
-
-Note: this feature does not collect data from an external OPC Server and write it into IoTDB
-
-![](/img/opc-ua-new-1.png)
-
-## Starting the OPC Service
-
-### Syntax
-
-The syntax for starting the OPC UA protocol:
-
-```SQL
-create pipe p1
-    with source (...)
-    with processor (...)
-    with sink ('sink' = 'opc-ua-sink',
-               'sink.opcua.tcp.port' = '12686',
-               'sink.opcua.https.port' = '8443',
-               'sink.user' = 'root',
-               'sink.password' = 'root',
-               'sink.opcua.security.dir' = '...' 
- ) -``` - -### 参数 - -| **参数** | **描述** | **取值范围** | **是否必填** | **默认值** | -| ---------------------------------- | ------------------------------ | -------------------------------- | ------------ | ------------------------------------------------------------ | -| sink | OPC UA SINK | String: opc-ua-sink | 必填 | | -| sink.opcua.model | OPC UA 使用的模式 | String: client-server / pub-sub | 选填 | pub-sub | -| sink.opcua.tcp.port | OPC UA 的 TCP 端口 | Integer: [0, 65536] | 选填 | 12686 | -| sink.opcua.https.port | OPC UA 的 HTTPS 端口 | Integer: [0, 65536] | 选填 | 8443 | -| sink.opcua.security.dir | OPC UA 的密钥及证书目录 | String: Path,支持绝对及相对目录 | 选填 | iotdb 相关 DataNode 的 conf 目录下的 opc_security 文件夹 /``。
如无 iotdb 的 conf 目录(例如 IDEA 中启动 DataNode),则为用户主目录下的 iotdb_opc_security 文件夹 /`` | -| sink.opcua.enable-anonymous-access | OPC UA 是否允许匿名访问 | Boolean | 选填 | true | -| sink.user | 用户,这里指 OPC UA 的允许用户 | String | 选填 | root | -| sink.password | 密码,这里指 OPC UA 的允许密码 | String | 选填 | root | - -### 示例 - -```Bash -create pipe p1 - with sink ('sink' = 'opc-ua-sink', - 'sink.user' = 'root', - 'sink.password' = 'root'); -start pipe p1; -``` - -### 使用限制 - -1. 启动协议之后需要写入数据,才能建立连接,且仅能订阅建立连接之后的数据。 -2. 推荐在单机模式下使用。在分布式模式下,每一个 IoTDB DataNode 都作为一个独立的 OPC Server 提供数据,需要单独订阅。 - -## 两种通信模式示例 -### Client / Server 模式 - -在这种模式下,IoTDB 的流处理引擎通过 OPC UA Sink 与 OPC UA 服务器(Server)建立连接。OPC UA 服务器在其地址空间(Address Space) 中维护数据,IoTDB可以请求并获取这些数据。同时,其他OPC UA客户端(Client)也能访问服务器上的数据。 - -* 特性: - * OPC UA 将从 Sink 收到的设备信息,按照树形模型整理到 Objects folder 下的文件夹中。 - * 每个测点都被记录为一个变量节点,并记录当前数据库中的最新值。 - * OPC UA 无法删除数据或者改变数据类型的设置 - -#### 准备工作 - -1. 此处以UAExpert客户端为例,下载 UAExpert 客户端:https://www.unified-automation.com/downloads/opc-ua-clients.html - -2. 安装 UAExpert,填写自身的证书等信息。 - -#### 快速开始 -##### 支持 None 安全策略的场景 -1. 使用如下 sql,创建并启动 client-server 模式的 OPC UA Sink。详细语法参见上文:[IoTDB OPC Server语法](#语法) - -```SQL -create pipe p1 with sink ('sink'='opc-ua-sink', 'opcua.security-policy'='AES128_SHA256_RSAOAEP, AES256_SHA256_RSAPSS, BASIC256SHA256, NONE'); -``` -注意:在 V1.3.7.2 及以上版本中,默认不再支持 `None`,如需使用必须通过 `security-policy` 参数手动开启,如上所示。 - -2. 写入部分数据。 - -```SQL -insert into root.test.db(time, s2) values(now(), 2) -``` - -此处自动创建元数据开启。 - -3. 在 UAExpert 中配置 iotdb 的连接,其中 password 填写为上述参数配置中 sink.password 中设定的密码(此处以默认密码root为例): - -

- -
- -
- -
- -4. 信任服务器的证书后,在左侧 Objects folder 即可看到写入的数据。 - -
- -
- -
- -
- -注意:由于此处配置的 `SecurityPolicy` 为 `None`,因此不需要相互信任证书。生产环境建议使用非 `None` 的 `SecurityPolicy` 进行连接,此时需要相互信任证书,操作步骤可以见下文 `Pub/Sub` 模式,在 `Client/Server` 的证书目录下(可以在打印的日志中找 keyStore 关键词),将 reject 的内容挪到 `trusted/certs`下即可,采用连接、移动 server 目录、连接、移动 client 目录、连接的顺序。 - -5. 可以将左侧节点拖动到中间,并展示该节点的最新值: - -
- -
- -##### 不支持 None 安全策略的场景 -1. 使用如下 sql,创建并启动 OPC UA 服务。 - ```SQL - create pipe p1 with sink ('sink'='opc-ua-sink'); - ``` - 注意:从 V1.3.7.2 版本开始,`OpcUaSink` 出于安全考虑,默认不再支持 `None` 模式。 - -2. 写入部分数据。 - ```SQL - insert into root.test.db(time, s2) values(now(), 2); - ``` - -3. 在 UAExpert 中配置 IoTDB 连接: - - 不可直接访问 `URL`,必须通过 `Discover` 方式发现端点 - - 客户端会先使用 `None` 策略发送 `GetEndpoints` 请求获取端点列表 - - 再根据配置的 `Basic256Sha256 + SignAndEncrypt` 选择对应加密端点建立加密连接 - -![](/img/opc-ua-un-none-1.png) - -4. 用户名密码配置同上,点击相关的连接模式后(`Sign` / `Sign & Encrypt`),如果出现以下内容,点 `Ignore` 直接连。 - -![](/img/opc-ua-un-none-2.png) - -### Pub / Sub 模式 - -在这种模式下,IoTDB的流处理引擎通过 OPC UA Sink 向OPC UA 服务器(Server)发送数据变更事件。这些事件被发布到服务器的消息队列中,并通过事件节点 (Event Node) 进行管理。其他OPC UA客户端(Client)可以订阅这些事件节点,以便在数据变更时接收通知。 - -- 特性: - - - 每个测点会被 OPC UA 包装成一个事件节点(EventNode)。 - - 相关字段及其对应含义如下: - - | 字段 | 含义 | 类型(Milo) | 示例 | - | :--------- | :--------------- | :------------ | :-------------------- | - | Time | 时间戳 | DateTime | 1698907326198 | - | SourceName | 测点对应完整路径 | String | root.test.opc.sensor0 | - | SourceNode | 测点数据类型 | NodeId | Int32 | - | Message | 数据 | LocalizedText | 3.0 | - - - Event 仅会发送给所有已经监听的客户端,客户端未连接则会忽略该 Event。 - - 如果数据被删除,信息则无法推送给客户端。 - - -#### 准备工作 - -该代码位于 iotdb-example 包下的 [opc-ua-sink 文件夹](https://github.com/apache/iotdb/tree/rc/1.3.5/example/pipe-opc-ua-sink/src/main/java/org/apache/iotdb/opcua)中 - -代码中包含: - -- 主类(ClientTest) -- Client 证书相关的逻辑(IoTDBKeyStoreLoaderClient) -- Client 的配置及启动逻辑(ClientExampleRunner) -- ClientTest 的父类(ClientExample) - -#### 快速开始 - -使用步骤为: - -1. 打开 IoTDB 并写入部分数据。 - -```SQL -insert into root.a.b(time, c, d) values(now(), 1, 2); -``` - -​ 此处自动创建元数据开启。 - -2. 使用如下 sql,创建并启动 Pub-Sub 模式的 OPC UA Sink。详细语法参见上文:[IoTDB OPC Server语法](#语法) - -```SQL -create pipe p1 with sink ('sink'='opc-ua-sink', - 'sink.opcua.model'='pub-sub'); -start pipe p1; -``` - -​ 此时能看到服务器的 conf 目录下创建了 opc 证书相关的目录。 - -
- -
- -3. 直接运行 Client 连接,此时 Client 证书被服务器拒收。 - -
- -
- -4. 进入服务器的 sink.opcua.security.dir 目录下,进入 pki 的 rejected 目录,此时 Client 的证书应该已经在该目录下生成。 - -
- -
- -5. 将客户端的证书移入(不是复制) 同目录下 trusted 目录的 certs 文件夹中。 - -
- -
- -6. 再次打开 Client 连接,此时服务器的证书应该被 Client 拒收。 - -
- -
- -7. 进入客户端的 /client/security 目录下,进入 pki 的 rejected 目录,将服务器的证书移入(不是复制)trusted 目录。 - -
- -
- -8. 打开 Client,此时建立双向信任成功, Client 能够连接到服务器。 - -9. 向服务器中写入数据,此时 Client 中能够打印出收到的数据。 - -
- -
- - -### 注意事项 - -1. **单机与集群**:建议使用 1C1D 单机版,如果集群中有多个 DataNode,数据可能会分散发送到各个 DataNode 上,无法收听到全量数据。 - -2. **无需操作根目录下证书**:在证书操作过程中,无需操作 IoTDB security 根目录下的 `iotdb-server.pfx` 证书和 client security 目录下的 `example-client.pfx` 证书。Client 和 Server 双向连接时,会将根目录下的证书发给对方,对方如果第一次看见此证书,就会放入 rejected 目录,如果该证书在 trusted/certs 里面,则能够信任对方。 - -3. **建议使用** **Java 17+**:在 Java 8 的版本中,可能会存在密钥长度限制,报 Illegal key size 错误。对于特定版本(如 JDK 1.8u151+),可以在 ClientExampleRunner 的 create client 里加入 `Security.setProperty("crypto.policy", "unlimited");` 解决,也可以下载无限制的包 `local_policy.jar` 与 `US_export_policy.jar`,替换 `JDK/jre/lib/security` 目录下的包解决,下载网址为 https://www.oracle.com/java/technologies/javase-jce8-downloads.html 。 diff --git a/src/zh/UserGuide/V1.3.x/Background-knowledge/Cluster-Concept_timecho.md b/src/zh/UserGuide/V1.3.x/Background-knowledge/Cluster-Concept_timecho.md deleted file mode 100644 index 44739754b..000000000 --- a/src/zh/UserGuide/V1.3.x/Background-knowledge/Cluster-Concept_timecho.md +++ /dev/null @@ -1,116 +0,0 @@ - - -# 常见概念 - -## 数据模型相关概念 - -| 概念 | 含义 | -|-----------------|----------------------------------------------------------------------------------------------------------------------------| -| 数据模型 | 树模型,管理的对象为设备和测点,以层级路径的方式管理数据,一条路径对应一个设备的一个测点 | -| 元数据(Schema) | 元数据是数据库的数据模型信息,即树形结构,包括测点的名称、数据类型等定义。 | -| 设备(Device) | 对应一个实际场景中的物理设备,通常包含多个测点。 | -| 测点(Timeseries) | 又名:物理量、时间序列、时间线、点位、信号量、指标、测量值等。是多个数据点按时间戳递增排列形成的一个时间序列。通常一个测点代表一个采集点位,能够定期采集所在环境的物理量。 | -| 编码(Encoding) | 编码是一种压缩技术,将数据以二进制的形式进行表示,可以提高存储效率。IoTDB 支持多种针对不同类型的数据的编码方法,详细信息请查看:[压缩和编码](../Technical-Insider/Encoding-and-Compression.md) | -| 压缩(Compression) | IoTDB 在数据编码后,使用压缩技术进一步压缩二进制数据,提升存储效率。IoTDB 支持多种压缩方法,详细信息请查看:[压缩和编码](../Technical-Insider/Encoding-and-Compression.md) | - -## 分布式相关概念 - -下图展示了一个常见的 IoTDB 3C3D(3 个 ConfigNode、3 个 DataNode)的集群部署模式: - - - -IoTDB 的集群包括如下常见概念: - -- 节点(ConfigNode、DataNode、AINode) -- Region(SchemaRegion、DataRegion) -- 多副本 - -下文将对以上概念进行介绍。 - - -### 节点 - -IoTDB 
集群包括三种节点(进程):ConfigNode(管理节点),DataNode(数据节点)和 AINode(分析节点),如下所示: - -- ConfigNode:管理集群的节点信息、配置信息、用户权限、元数据、分区信息等,负责分布式操作的调度和负载均衡,所有 ConfigNode 之间互为全量备份,如上图中的 ConfigNode-1,ConfigNode-2 和 ConfigNode-3 所示。 -- DataNode:服务客户端请求,负责数据的存储和计算,如上图中的 DataNode-1,DataNode-2 和 DataNode-3 所示。 -- AINode:负责提供机器学习能力,支持注册已训练好的机器学习模型,并通过 SQL 调用模型进行推理,目前已内置自研时序大模型和常见的机器学习算法(如预测与异常检测)。 - -### 数据分区 - -在 IoTDB 中,元数据和数据都被分为小的分区,即 Region,由集群的各个 DataNode 进行管理。 - -- SchemaRegion:元数据分区,管理一部分设备和测点的元数据。不同 DataNode 相同 RegionID 的 SchemaRegion 互为副本,如上图中 SchemaRegion-1 拥有三个副本,分别放置于 DataNode-1,DataNode-2 和 DataNode-3。 -- DataRegion:数据分区,管理一部分设备的一段时间的数据。不同 DataNode 相同 RegionID 的 DataRegion 互为副本,如上图中 DataRegion-2 拥有两个副本,分别放置于 DataNode-1 和 DataNode-2。 -- 具体分区算法可参考:[数据分区](../Technical-Insider/Cluster-data-partitioning.md) - -### 多副本 - -数据和元数据的副本数可配置,不同部署模式下的副本数推荐如下配置,其中多副本时可提供高可用服务。 - -| 类别 | 配置项 | 单机推荐配置 | 集群推荐配置 | -| :----- | :------------------------ | :----------- | :----------- | -| 元数据 | schema_replication_factor | 1 | 3 | -| 数据 | data_replication_factor | 1 | 2 | - - -## 部署相关概念 - -IoTDB 有三种运行模式:单机模式、集群模式和双活模式。 - -### 单机模式 - -IoTDB单机实例包括 1 个ConfigNode、1个DataNode,即1C1D; - -- **特点**:便于开发者安装部署,部署和维护成本较低,操作方便。 -- **适用场景**:资源有限或对高可用要求不高的场景,例如边缘端服务器。 -- **部署方法**:[单机版部署](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -### 双活模式 - -双活版部署为 TimechoDB 企业版功能,是指两个独立的实例进行双向同步,能同时对外提供服务。当一台停机重启后,另一个实例会将缺失数据断点续传。 - -> IoTDB 双活实例通常为2个单机节点,即2套1C1D。每个实例也可以为集群。 - -- **特点**:资源占用最低的高可用解决方案。 -- **适用场景**:资源有限(仅有两台服务器),但希望获得高可用能力。 -- **部署方法**:[双活版部署](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -### 集群模式 - -IoTDB 集群实例为 3 个ConfigNode 和不少于 3 个 DataNode,通常为 3 个 DataNode,即3C3D;当部分节点出现故障时,剩余节点仍然能对外提供服务,保证数据库服务的高可用性,且可随节点增加提升数据库性能。 - -- **特点**:具有高可用性、高扩展性,可通过增加 DataNode 提高系统性能。 -- **适用场景**:需要提供高可用和可靠性的企业级应用场景。 -- **部署方法**:[集群版部署](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - -### 特点总结 - -| 维度 | 单机模式 | 双活模式 | 集群模式 | -| ------------ | ---------------------------- | 
------------------------ | ------------------------ | -| 适用场景 | 边缘侧部署、对高可用要求不高 | 高可用性业务、容灾场景等 | 高可用性业务、容灾场景等 | -| 所需机器数量 | 1 | 2 | ≥3 | -| 安全可靠性 | 无法容忍单点故障 | 高,可容忍单点故障 | 高,可容忍单点故障 | -| 扩展性 | 可扩展 DataNode 提升性能 | 每个实例可按需扩展 | 可扩展 DataNode 提升性能 | -| 性能 | 可随 DataNode 数量扩展 | 与其中一个实例性能相同 | 可随 DataNode 数量扩展 | - -- 单机模式和集群模式,部署步骤类似(逐个增加 ConfigNode 和 DataNode),仅副本数和可提供服务的最少节点数不同。 \ No newline at end of file diff --git a/src/zh/UserGuide/V1.3.x/Basic-Concept/Operate-Metadata_timecho.md b/src/zh/UserGuide/V1.3.x/Basic-Concept/Operate-Metadata_timecho.md deleted file mode 100644 index f9871e3fc..000000000 --- a/src/zh/UserGuide/V1.3.x/Basic-Concept/Operate-Metadata_timecho.md +++ /dev/null @@ -1,1366 +0,0 @@ - - -# 测点管理 -## 数据库管理 - -数据库(Database)可以被视为关系数据库中的Database。 - -### 创建数据库 - -我们可以根据存储模型建立相应的数据库。如下所示: - -``` -IoTDB > CREATE DATABASE root.ln -``` - -需要注意的是,推荐创建一个 database. - -Database 的父子节点都不能再设置 database。例如在已经有`root.ln`和`root.sgcc`这两个 database 的情况下,创建`root.ln.wf01` database 是不可行的。系统将给出相应的错误提示,如下所示: - -``` -IoTDB> CREATE DATABASE root.ln.wf01 -Msg: 300: root.ln has already been created as database. -``` -Database 节点名命名规则: -1. 节点名可由**中英文字符、数字、下划线(\_)、英文句号(.)、反引号(\`)** 组成 -2. 若节点名为以下情况,则必须用**反引号(\`)** 将整个名称包裹。 - - 纯数字(如 12345) - - 含有特殊字符(如 . 或 \_)并可能引发歧义的名称(如 db.01、\_temp) -3. 
反引号的特殊处理: - 若节点名本身需要包含反引号(\`),则需用**两个反引号(\`\`)** 表示一个反引号。例如:命名为\`db123\`\`(本身包含一个反引号),需写为 \`db123\`\`\`。 - -还需注意,如果在 Windows 或 macOS 系统上部署,database 名是大小写不敏感的。例如同时创建`root.ln` 和 `root.LN` 是不被允许的。 - -### 查看数据库 - -在 database 创建后,我们可以使用 [SHOW DATABASES](../SQL-Manual/SQL-Manual.md#查看数据库) 语句和 [SHOW DATABASES \](../SQL-Manual/SQL-Manual.md#查看数据库) 来查看 database,SQL 语句如下所示: - -``` -IoTDB> show databases -IoTDB> show databases root.* -IoTDB> show databases root.** -``` - -执行结果为: - -``` -+-------------+----+-------------------------+-----------------------+-----------------------+ -| database| ttl|schema_replication_factor|data_replication_factor|time_partition_interval| -+-------------+----+-------------------------+-----------------------+-----------------------+ -| root.sgcc|null| 2| 2| 604800| -| root.ln|null| 2| 2| 604800| -+-------------+----+-------------------------+-----------------------+-----------------------+ -Total line number = 2 -It costs 0.060s -``` - -### 删除数据库 - -用户可以使用`DELETE DATABASE `语句删除该路径模式匹配的所有的数据库。在删除的过程中,需要注意的是数据库的数据也会被删除。 - -``` -IoTDB > DELETE DATABASE root.ln -IoTDB > DELETE DATABASE root.sgcc -// 删除所有数据,时间序列以及数据库 -IoTDB > DELETE DATABASE root.** -``` - -### 统计数据库数量 - -用户可以使用`COUNT DATABASES `语句统计数据库的数量,允许指定`PathPattern` 用来统计匹配该`PathPattern` 的数据库的数量 - -SQL 语句如下所示: - -``` -IoTDB> show databases -IoTDB> count databases -IoTDB> count databases root.* -IoTDB> count databases root.sgcc.* -IoTDB> count databases root.sgcc -``` - -执行结果为: - -``` -+-------------+ -| database| -+-------------+ -| root.sgcc| -| root.turbine| -| root.ln| -+-------------+ -Total line number = 3 -It costs 0.003s - -+-------------+ -| Database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.003s - -+-------------+ -| Database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| Database| -+-------------+ -| 0| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| database| 
-+-------------+ -| 1| -+-------------+ -Total line number = 1 -It costs 0.002s -``` - -### 数据保留时间(TTL) - -IoTDB 支持对设备(device)级别设置数据保留时间(TTL),允许系统自动定期删除旧数据,以有效控制磁盘空间并维护高性能查询和低内存占用。TTL 默认以毫秒为单位,数据过期后不可查询且禁止写入,但物理删除会延迟至压缩时。需注意,TTL 变更可能导致短暂数据可查询性变化,且若调小或解除 TTL,之前因 TTL 不可见的数据可能重新出现。 - -注意事项: -- TTL 设置为毫秒,不受配置文件时间精度影响。 -- TTL 变更可能影响数据的可查询性。 -- 系统最终会移除过期数据,但存在延迟。 -- TTL 判断数据是否过期依据的是数据点时间,非写入时间。 -- 系统最多支持设置 1000 条 TTL 规则,达到上限需先删除部分规则才能设置新规则。 - -#### TTL Path 规则 -设置的路径 path 只支持前缀路径(即路径中间不能带 \* , 且必须以 \*\* 结尾),该路径会匹配到设备,也允许用户指定不带星的 path 为具体的 database 或 device,当 path 不带 \* 时,会检查是否匹配到 database,若匹配到 database,则会同时设置 path 和 path.\*\*。 -注意:设备 TTL 设置不会对元数据的存在性进行校验,即允许对一条不存在的设备设置 TTL。 -``` -合格的 path: -root.** -root.db.** -root.db.group1.** -root.db -root.db.group1.d1 - -不合格的 path: -root.*.db -root.**.db.* -root.db.* -``` -#### TTL 适用规则 -当一个设备适用多条TTL规则时,优先适用较精确和较长的规则。例如对于设备“root.bj.hd.dist001.turbine001”来说,规则“root.bj.hd.dist001.turbine001”比“root.bj.hd.dist001.\*\*”优先,而规则“root.bj.hd.dist001.\*\*”比“root.bj.hd.\*\*”优先; -#### 设置 TTL -set ttl 操作可以理解为设置一条 TTL规则,比如 set ttl to root.sg.group1.\*\* 就相当于对所有可以匹配到该路径模式的设备挂载 ttl。 unset ttl 操作表示对相应路径模式卸载 TTL,若不存在对应 TTL,则不做任何事。若想把 TTL 调成无限大,则可以使用 INF 关键字 -设置 TTL 的 SQL 语句如下所示: -``` -set ttl to pathPattern 360000; -``` -pathPattern 是前缀路径,即路径中间不能带 \* 且必须以 \*\* 结尾。 -pathPattern 匹配对应的设备。为了兼容老版本 SQL 语法,允许用户输入的 pathPattern 匹配到 db,则自动将前缀路径扩展为 path.\*\*。 -例如,写set ttl to root.sg 360000 则会自动转化为set ttl to root.sg.\*\* 360000,转化后的语句对所有 root.sg 下的 device 设置TTL。 -但若写的 pathPattern 无法匹配到 db,则上述逻辑不会生效。 -如写set ttl to root.sg.group 360000 ,由于root.sg.group未匹配到 db,则不会被扩充为root.sg.group.\*\*。 也允许指定具体 device,不带 \*。 -#### 取消 TTL - -取消 TTL 的 SQL 语句如下所示: - -``` -IoTDB> unset ttl from root.ln -``` - -取消设置 TTL 后, `root.ln` 路径下所有的数据都会被保存。 -``` -IoTDB> unset ttl from root.sgcc.** -``` - -取消设置`root.sgcc`路径下的所有的 TTL 。 -``` -IoTDB> unset ttl from root.** -``` - -取消设置所有的 TTL 。 - -新语法 -``` -IoTDB> unset ttl from root.** -``` - -旧语法 -``` -IoTDB> unset ttl to root.** -``` 
-新旧语法在功能上没有区别并且同时兼容,仅是新语法在用词上更符合常规。 -#### 显示 TTL - -显示 TTL 的 SQL 语句如下所示: -show all ttl - -``` -IoTDB> SHOW ALL TTL -+--------------+--------+ -| path| TTL| -| root.**|55555555| -| root.sg2.a.**|44440000| -+--------------+--------+ -``` - -show ttl on pathPattern -``` -IoTDB> SHOW TTL ON root.db.**; -+--------------+--------+ -| path| TTL| -| root.db.**|55555555| -| root.db.a.**|44440000| -+--------------+--------+ -``` -SHOW ALL TTL 这个例子会给出所有的 TTL。 -SHOW TTL ON pathPattern 这个例子会显示指定路径的 TTL。 - -显示设备的 TTL。 -``` -IoTDB> show devices -+---------------+---------+---------+ -| Device|IsAligned| TTL| -+---------------+---------+---------+ -|root.sg.device1| false| 36000000| -|root.sg.device2| true| INF| -+---------------+---------+---------+ -``` -所有设备都一定会有 TTL,即不可能是 null。INF 表示无穷大。 - - -### 设置异构数据库(进阶操作) - -在熟悉 IoTDB 元数据建模的前提下,用户可以在 IoTDB 中设置异构的数据库,以便应对不同的生产需求。 - -目前支持的数据库异构参数有: - -| 参数名 | 参数类型 | 参数描述 | -|---------------------------|---------|---------------------------| -| TTL | Long | 数据库的 TTL | -| SCHEMA_REPLICATION_FACTOR | Integer | 数据库的元数据副本数 | -| DATA_REPLICATION_FACTOR | Integer | 数据库的数据副本数 | -| SCHEMA_REGION_GROUP_NUM | Integer | 数据库的 SchemaRegionGroup 数量 | -| DATA_REGION_GROUP_NUM | Integer | 数据库的 DataRegionGroup 数量 | - -用户在配置异构参数时需要注意以下三点: -+ TTL 和 TIME_PARTITION_INTERVAL 必须为正整数。 -+ SCHEMA_REPLICATION_FACTOR 和 DATA_REPLICATION_FACTOR 必须小于等于已部署的 DataNode 数量。 -+ SCHEMA_REGION_GROUP_NUM 和 DATA_REGION_GROUP_NUM 的功能与 iotdb-common.properties 配置文件中的 -`schema_region_group_extension_policy` 和 `data_region_group_extension_policy` 参数相关,以 DATA_REGION_GROUP_NUM 为例: -若设置 `data_region_group_extension_policy=CUSTOM`,则 DATA_REGION_GROUP_NUM 将作为 Database 拥有的 DataRegionGroup 的数量; -若设置 `data_region_group_extension_policy=AUTO`,则 DATA_REGION_GROUP_NUM 将作为 Database 拥有的 DataRegionGroup 的配额下界,即当该 Database 开始写入数据时,将至少拥有此数量的 DataRegionGroup。 - -用户可以在创建 Database 时设置任意异构参数,或在单机/分布式 IoTDB 运行时调整部分异构参数。 - -#### 创建 Database 时设置异构参数 - -用户可以在创建 Database 时设置上述任意异构参数,SQL 语句如下所示: - -``` -CREATE 
DATABASE prefixPath (WITH databaseAttributeClause (COMMA? databaseAttributeClause)*)? -``` - -例如: -``` -CREATE DATABASE root.db WITH SCHEMA_REPLICATION_FACTOR=1, DATA_REPLICATION_FACTOR=3, SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -#### 运行时调整异构参数 - -用户可以在 IoTDB 运行时调整部分异构参数,SQL 语句如下所示: - -``` -ALTER DATABASE prefixPath WITH databaseAttributeClause (COMMA? databaseAttributeClause)* -``` - -例如: -``` -ALTER DATABASE root.db WITH SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -注意,运行时只能调整下列异构参数: -+ SCHEMA_REGION_GROUP_NUM -+ DATA_REGION_GROUP_NUM - -#### 查看异构数据库 - -用户可以查询每个 Database 的具体异构配置,SQL 语句如下所示: - -``` -SHOW DATABASES DETAILS prefixPath? -``` - -例如: - -``` -IoTDB> SHOW DATABASES DETAILS -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|Database| TTL|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|MinSchemaRegionGroupNum|MaxSchemaRegionGroupNum|DataRegionGroupNum|MinDataRegionGroupNum|MaxDataRegionGroupNum| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|root.db1| null| 1| 3| 604800000| 0| 1| 1| 0| 2| 2| -|root.db2|86400000| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -|root.db3| null| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -Total line number = 3 -It costs 0.058s -``` - -各列查询结果依次为: -+ 数据库名称 -+ 数据库的 TTL -+ 数据库的元数据副本数 -+ 数据库的数据副本数 -+ 数据库的时间分区间隔 -+ 数据库当前拥有的 SchemaRegionGroup 数量 -+ 数据库需要拥有的最小 SchemaRegionGroup 数量 -+ 数据库允许拥有的最大 SchemaRegionGroup 数量 -+ 
数据库当前拥有的 DataRegionGroup 数量 -+ 数据库需要拥有的最小 DataRegionGroup 数量 -+ 数据库允许拥有的最大 DataRegionGroup 数量 - - -## 设备模板管理 - -IoTDB 支持设备模板功能,实现同类型不同实体的物理量元数据共享,减少元数据内存占用,同时简化同类型实体的管理。 - - -![img](/img/%E6%A8%A1%E6%9D%BF.png) - -![img](/img/template.jpg) - -### 创建设备模板 - -创建设备模板的 SQL 语法如下: - -```sql -CREATE DEVICE TEMPLATE ALIGNED? '(' [',' ]+ ')' -``` - -**示例1:** 创建包含两个非对齐序列的元数据模板 - -```shell -IoTDB> create device template t1 (temperature FLOAT encoding=RLE, status BOOLEAN encoding=PLAIN compression=SNAPPY) -``` - -**示例2:** 创建包含一组对齐序列的元数据模板 - -```shell -IoTDB> create device template t2 aligned (lat FLOAT encoding=Gorilla, lon FLOAT encoding=Gorilla) -``` - -其中,物理量 `lat` 和 `lon` 是对齐的。 - -### 挂载设备模板 - -元数据模板在创建后,需执行挂载操作,方可用于相应路径下的序列创建与数据写入。 - -**挂载模板前,需确保相关数据库已经创建。** - -**推荐将模板挂载在 database 节点上,不建议将模板挂载到 database 上层的节点上。** - -**模板挂载路径下禁止创建普通序列,已创建了普通序列的前缀路径上不允许挂载模板。** - -挂载元数据模板的 SQL 语句如下所示: - -```shell -IoTDB> set device template t1 to root.sg1.d1 -``` - -### 激活设备模板 - -挂载好设备模板后,且系统开启自动注册序列功能的情况下,即可直接进行数据的写入。例如 database 为 root.sg1,模板 t1 被挂载到了节点 root.sg1.d1,那么可直接向时间序列(如 root.sg1.d1.temperature 和 root.sg1.d1.status)写入时间序列数据,该时间序列已可被当作正常创建的序列使用。 - -**注意**:在插入数据之前或系统未开启自动注册序列功能,模板定义的时间序列不会被创建。可以使用如下SQL语句在插入数据前创建时间序列即激活模板: - -```shell -IoTDB> create timeseries using device template on root.sg1.d1 -``` - -**示例:** 执行以下语句 -```shell -IoTDB> set device template t1 to root.sg1.d1 -IoTDB> set device template t2 to root.sg1.d2 -IoTDB> create timeseries using device template on root.sg1.d1 -IoTDB> create timeseries using device template on root.sg1.d2 -``` - -查看此时的时间序列: -```sql -show timeseries root.sg1.** -``` - -```shell -+-----------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression|tags|attributes|deadband|deadband parameters| -+-----------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -|root.sg1.d1.temperature| 
null| root.sg1| FLOAT| RLE| SNAPPY|null| null| null| null| -| root.sg1.d1.status| null| root.sg1| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -| root.sg1.d2.lon| null| root.sg1| FLOAT| GORILLA| SNAPPY|null| null| null| null| -| root.sg1.d2.lat| null| root.sg1| FLOAT| GORILLA| SNAPPY|null| null| null| null| -+-----------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -``` - -查看此时的设备: -```sql -show devices root.sg1.** -``` - -```shell -+---------------+---------+---------+ -| devices|isAligned| Template| -+---------------+---------+---------+ -| root.sg1.d1| false| null| -| root.sg1.d2| true| null| -+---------------+---------+---------+ -``` - -### 查看设备模板 - -- 查看所有设备模板 - -SQL 语句如下所示: - -```shell -IoTDB> show device templates -``` - -执行结果如下: -```shell -+-------------+ -|template name| -+-------------+ -| t2| -| t1| -+-------------+ -``` - -- 查看某个设备模板下的物理量 - -SQL 语句如下所示: - -```shell -IoTDB> show nodes in device template t1 -``` - -执行结果如下: -```shell -+-----------+--------+--------+-----------+ -|child nodes|dataType|encoding|compression| -+-----------+--------+--------+-----------+ -|temperature| FLOAT| RLE| SNAPPY| -| status| BOOLEAN| PLAIN| SNAPPY| -+-----------+--------+--------+-----------+ -``` - -- 查看挂载了某个设备模板的路径 - -```shell -IoTDB> show paths set device template t1 -``` - -执行结果如下: -```shell -+-----------+ -|child paths| -+-----------+ -|root.sg1.d1| -+-----------+ -``` - -- 查看使用了某个设备模板的路径(即模板在该路径上已激活,序列已创建) - -```shell -IoTDB> show paths using device template t1 -``` - -执行结果如下: -```shell -+-----------+ -|child paths| -+-----------+ -|root.sg1.d1| -+-----------+ -``` - -### 解除设备模板 - -若需删除模板表示的某一组时间序列,可采用解除模板操作,SQL语句如下所示: - -```shell -IoTDB> delete timeseries of device template t1 from root.sg1.d1 -``` - -或 - -```shell -IoTDB> deactivate device template t1 from root.sg1.d1 -``` - -解除操作支持批量处理,SQL语句如下所示: - -```shell -IoTDB> delete timeseries of device template t1 from root.sg1.*, root.sg2.* -``` - -或 - 
-```shell -IoTDB> deactivate device template t1 from root.sg1.*, root.sg2.* -``` - -若解除命令不指定模板名称,则会将给定路径涉及的所有模板使用情况均解除。 - -### 卸载设备模板 - -卸载设备模板的 SQL 语句如下所示: - -```shell -IoTDB> unset device template t1 from root.sg1.d1 -``` - -**注意**:不支持卸载仍处于激活状态的模板,需保证执行卸载操作前解除对该模板的所有使用,即删除所有该模板表示的序列。 - -### 删除设备模板 - -删除设备模板的 SQL 语句如下所示: - -```shell -IoTDB> drop device template t1 -``` - -**注意**:不支持删除已经挂载的模板,需在删除操作前保证该模板卸载成功。 - -### 修改设备模板 - -在需要新增物理量的场景中,可以通过修改设备模板来给所有已激活该模板的设备新增物理量。 - -修改设备模板的 SQL 语句如下所示: - -```shell -IoTDB> alter device template t1 add (speed FLOAT encoding=RLE) -``` - -**向已挂载模板的路径下的设备中写入数据,若写入请求中的物理量不在模板中,将自动扩展模板。** - - -## 时间序列管理 - -### 创建时间序列 - -根据建立的数据模型,我们可以分别在两个数据库中创建相应的时间序列。创建时间序列的 SQL 语句如下所示: - -``` -IoTDB > create timeseries root.ln.wf01.wt01.status with datatype=BOOLEAN,encoding=PLAIN -IoTDB > create timeseries root.ln.wf01.wt01.temperature with datatype=FLOAT,encoding=RLE -IoTDB > create timeseries root.ln.wf02.wt02.hardware with datatype=TEXT,encoding=PLAIN -IoTDB > create timeseries root.ln.wf02.wt02.status with datatype=BOOLEAN,encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.status with datatype=BOOLEAN,encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.temperature with datatype=FLOAT,encoding=RLE -``` - -从 v0.13 起,可以使用简化版的 SQL 语句创建时间序列: - -``` -IoTDB > create timeseries root.ln.wf01.wt01.status BOOLEAN encoding=PLAIN -IoTDB > create timeseries root.ln.wf01.wt01.temperature FLOAT encoding=RLE -IoTDB > create timeseries root.ln.wf02.wt02.hardware TEXT encoding=PLAIN -IoTDB > create timeseries root.ln.wf02.wt02.status BOOLEAN encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.status BOOLEAN encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.temperature FLOAT encoding=RLE -``` - -需要注意的是,当创建时间序列时指定的编码方式与数据类型不对应时,系统会给出相应的错误提示,如下所示: -``` -IoTDB> create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF -error: encoding TS_2DIFF does not support BOOLEAN -``` - -详细的数据类型与编码方式的对应列表请参见 
[编码方式](../Technical-Insider/Encoding-and-Compression.md)。 - -### 创建对齐时间序列 - -创建一组对齐时间序列的SQL语句如下所示: - -``` -IoTDB> CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT encoding=PLAIN compressor=SNAPPY, longitude FLOAT encoding=PLAIN compressor=SNAPPY) -``` - -一组对齐序列中的序列可以有不同的数据类型、编码方式以及压缩方式。 - -对齐的时间序列也支持设置别名、标签、属性。 - -### 删除时间序列 - -我们可以使用`(DELETE | DROP) TimeSeries `语句来删除我们之前创建的时间序列。SQL 语句如下所示: - -``` -IoTDB> delete timeseries root.ln.wf01.wt01.status -IoTDB> delete timeseries root.ln.wf01.wt01.temperature, root.ln.wf02.wt02.hardware -IoTDB> delete timeseries root.ln.wf02.* -IoTDB> drop timeseries root.ln.wf02.* -``` - -### 查看时间序列 - -* SHOW LATEST? TIMESERIES pathPattern? timeseriesWhereClause? limitClause? - - SHOW TIMESERIES 中可以有四种可选的子句,查询结果为这些时间序列的所有信息 - -时间序列信息具体包括:时间序列路径名,database,Measurement 别名,数据类型,编码方式,压缩方式,属性和标签。 - -示例: - -* SHOW TIMESERIES - - 展示系统中所有的时间序列信息 - -* SHOW TIMESERIES <`Path`> - - 返回给定路径的下的所有时间序列信息。其中 `Path` 需要为一个时间序列路径或路径模式。例如,分别查看`root`路径和`root.ln`路径下的时间序列,SQL 语句如下所示: - -``` -IoTDB> show timeseries root.** -IoTDB> show timeseries root.ln.** -``` - -执行结果分别为: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| 
SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.016s - -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression|tags|attributes|deadband|deadband parameters| -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -|root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY|null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -Total line number = 4 -It costs 0.004s -``` - -* SHOW TIMESERIES LIMIT INT OFFSET INT - - 只返回从指定下标开始的结果,最大返回条数被 LIMIT 限制,用于分页查询。例如: - -``` -show timeseries root.ln.** limit 10 offset 10 -``` - -* SHOW TIMESERIES WHERE TIMESERIES contains 'containStr' - - 对查询结果集根据 timeseries 名称进行字符串模糊匹配过滤。例如: - -``` -show timeseries root.ln.** where timeseries contains 'wf01.wt' -``` - -执行结果为: - -``` 
-+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 2 -It costs 0.016s -``` - -* SHOW TIMESERIES WHERE DataType=type - - 对查询结果集根据时间序列数据类型进行过滤。例如: - -``` -show timeseries root.ln.** where dataType=FLOAT -``` - -执行结果为: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf01.wt01.temperature| 
null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 3 -It costs 0.016s - -``` - -* SHOW TIMESERIES WHERE TAGS(KEY) = VALUE -* SHOW TIMESERIES WHERE TAGS(KEY) CONTAINS VALUE - - 对查询结果集根据标签进行过滤。例如: - -``` -show timeseries root.ln.** where TAGS(unit)='c' -show timeseries root.ln.** where TAGS(description) contains 'test1' -``` - -执行结果分别为: - -``` -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s - -``` - - -* SHOW LATEST TIMESERIES - - 表示查询出的时间序列需要按照最近插入时间戳降序排列 - - 
-Note that when the queried path does not exist, the system returns 0 time series.
-
-### Counting Time Series
-
-IoTDB supports using `COUNT TIMESERIES <Path>` to count the time series under a path. The SQL statements are as follows:
-
-* A `WHERE` condition can fuzzy-match time series names by substring. Syntax: `COUNT TIMESERIES <Path> WHERE TIMESERIES contains 'containStr'`.
-* A `WHERE` condition can filter by time series data type. Syntax: `COUNT TIMESERIES <Path> WHERE DataType=<DataType>`.
-* A `WHERE` condition can filter by tag. Syntax: `COUNT TIMESERIES <Path> WHERE TAGS(key)='value'` or `COUNT TIMESERIES <Path> WHERE TAGS(key) contains 'value'`.
-* `LEVEL` can be specified to count the time series under each node at the given level. This statement can be used to count the sensors under each device. Syntax: `COUNT TIMESERIES <Path> GROUP BY LEVEL=<INTEGER>`.
-
-```
-IoTDB > COUNT TIMESERIES root.**
-IoTDB > COUNT TIMESERIES root.ln.**
-IoTDB > COUNT TIMESERIES root.ln.*.*.status
-IoTDB > COUNT TIMESERIES root.ln.wf01.wt01.status
-IoTDB > COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc'
-IoTDB > COUNT TIMESERIES root.** WHERE DATATYPE = INT64
-IoTDB > COUNT TIMESERIES root.** WHERE TAGS(unit) contains 'c'
-IoTDB > COUNT TIMESERIES root.** WHERE TAGS(unit) = 'c'
-IoTDB > COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' group by level = 1
-```
-
-For example, suppose the following time series exist (use `show timeseries` to list all time series):
-
-```
-+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+
-| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters|
-+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+
-|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null|
-| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null|
-| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null|
-| 
root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| {"unit":"c"}| null| null| null|
-| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| {"description":"test1"}| null| null| null|
-| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null|
-| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null|
-+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+
-Total line number = 7
-It costs 0.004s
-```
-
-The Metadata Tree is then as follows:
-
-
-
-As shown, `root` is defined as `LEVEL=0`. So when you enter the following statements:
-
-```
-IoTDB > COUNT TIMESERIES root.** GROUP BY LEVEL=1
-IoTDB > COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2
-IoTDB > COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2
-```
-
-you get the following results:
-
-```
-IoTDB> COUNT TIMESERIES root.** GROUP BY LEVEL=1
-+------------+-----------------+
-| column|count(timeseries)|
-+------------+-----------------+
-| root.sgcc| 2|
-|root.turbine| 1|
-| root.ln| 4|
-+------------+-----------------+
-Total line number = 3
-It costs 0.002s
-
-IoTDB > COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2
-+------------+-----------------+
-| column|count(timeseries)|
-+------------+-----------------+
-|root.ln.wf02| 2|
-|root.ln.wf01| 2|
-+------------+-----------------+
-Total line number = 2
-It costs 0.002s
-
-IoTDB > COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2
-+------------+-----------------+
-| column|count(timeseries)|
-+------------+-----------------+
-|root.ln.wf01| 2|
-+------------+-----------------+
-Total line number = 1
-It costs 0.002s
-```
-
-> Note: the time series path is only a filter condition and has nothing to do with the definition of level.
-
-### Querying Active Time Series
-Adding a `WHERE` time filter to the existing time series queries and counts returns the time series that have data within the specified time range.
-
-Note that metadata queries with a time filter do not take views into account; only the time series actually stored in TsFiles are considered.
-
-A usage example follows:
-```
-IoTDB> insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2);
-IoTDB> insert into root.sg.data2(timestamp, 
s1,s2) values(15002, 1, 2); -IoTDB> insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -IoTDB> show timeseries; -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -IoTDB> show timeseries where time >= 15000 and time < 16000; -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -IoTDB> count 
timeseries where time >= 15000 and time < 16000; -+-----------------+ -|count(timeseries)| -+-----------------+ -| 4| -+-----------------+ -``` -关于活跃时间序列的定义,能通过正常查询查出来的数据就是活跃数据,也就是说插入但被删除的时间序列不在考虑范围内。 - -### 标签点管理 - -我们可以在创建时间序列的时候,为它添加别名和额外的标签和属性信息。 - -标签和属性的区别在于: - -* 标签可以用来查询时间序列路径,会在内存中维护标点到时间序列路径的倒排索引:标签 -> 时间序列路径 -* 属性只能用时间序列路径来查询:时间序列路径 -> 属性 - -所用到的扩展的创建时间序列的 SQL 语句如下所示: -``` -create timeseries root.turbine.d1.s1(temprature) with datatype=FLOAT, encoding=RLE, compression=SNAPPY tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2) -``` - -括号里的`temprature`是`s1`这个传感器的别名。 -我们可以在任何用到`s1`的地方,将其用`temprature`代替,这两者是等价的。 - -> IoTDB 同时支持在查询语句中使用 AS 函数设置别名。二者的区别在于:AS 函数设置的别名用于替代整条时间序列名,且是临时的,不与时间序列绑定;而上文中的别名只作为传感器的别名,与其绑定且可与原传感器名等价使用。 - -> 注意:额外的标签和属性信息总的大小不能超过`tag_attribute_total_size`. - - * 标签点属性更新 -创建时间序列后,我们也可以对其原有的标签点属性进行更新,主要有以下六种更新方式: -* 重命名标签或属性 -``` -ALTER timeseries root.turbine.d1.s1 RENAME tag1 TO newTag1 -``` -* 重新设置标签或属性的值 -``` -ALTER timeseries root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1 -``` -* 删除已经存在的标签或属性 -``` -ALTER timeseries root.turbine.d1.s1 DROP tag1, tag2 -``` -* 添加新的标签 -``` -ALTER timeseries root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4 -``` -* 添加新的属性 -``` -ALTER timeseries root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4 -``` -* 更新插入别名,标签和属性 -> 如果该别名,标签或属性原来不存在,则插入,否则,用新值更新原来的旧值 -``` -ALTER timeseries root.turbine.d1.s1 UPSERT ALIAS=newAlias TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4) -``` - -* 使用标签作为过滤条件查询时间序列,使用 TAGS(tagKey) 来标识作为过滤条件的标签 -``` -SHOW TIMESERIES (<`PathPattern`>)? 
timeseriesWhereClause -``` - -返回给定路径的下的所有满足条件的时间序列信息,SQL 语句如下所示: - -``` -ALTER timeseries root.ln.wf02.wt02.hardware ADD TAGS unit=c -ALTER timeseries root.ln.wf02.wt02.status ADD TAGS description=test1 -show timeseries root.ln.** where TAGS(unit)='c' -show timeseries root.ln.** where TAGS(description) contains 'test1' -``` - -执行结果分别为: - -``` -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s -``` - -- 使用标签作为过滤条件统计时间序列数量 - -``` -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause -COUNT TIMESERIES (<`PathPattern`>)? 
timeseriesWhereClause GROUP BY LEVEL= -``` - -返回给定路径的下的所有满足条件的时间序列的数量,SQL 语句如下所示: - -``` -count timeseries -count timeseries root.** where TAGS(unit)='c' -count timeseries root.** where TAGS(unit)='c' group by level = 2 -``` - -执行结果分别为: - -``` -IoTDB> count timeseries -+-----------------+ -|count(timeseries)| -+-----------------+ -| 6| -+-----------------+ -Total line number = 1 -It costs 0.019s -IoTDB> count timeseries root.** where TAGS(unit)='c' -+-----------------+ -|count(timeseries)| -+-----------------+ -| 2| -+-----------------+ -Total line number = 1 -It costs 0.020s -IoTDB> count timeseries root.** where TAGS(unit)='c' group by level = 2 -+--------------+-----------------+ -| column|count(timeseries)| -+--------------+-----------------+ -| root.ln.wf02| 2| -| root.ln.wf01| 0| -|root.sgcc.wf03| 0| -+--------------+-----------------+ -Total line number = 3 -It costs 0.011s -``` - -> 注意,现在我们只支持一个查询条件,要么是等值条件查询,要么是包含条件查询。当然 where 子句中涉及的必须是标签值,而不能是属性值。 - -创建对齐时间序列 - -``` -create aligned timeseries root.sg1.d1(s1 INT32 tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2), s2 DOUBLE tags(tag3=v3, tag4=v4) attributes(attr3=v3, attr4=v4)) -``` - -执行结果如下: - -``` -IoTDB> show timeseries -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -|root.sg1.d1.s2| null| root.sg1| DOUBLE| GORILLA| SNAPPY|{"tag4":"v4","tag3":"v3"}|{"attr4":"v4","attr3":"v3"}| null| null| 
-+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -支持查询: - -``` -IoTDB> show timeseries where TAGS(tag1)='v1' -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -上述对时间序列标签、属性的更新等操作都支持。 - - -## 路径查询 - -### 查看路径的所有子路径 - -``` -SHOW CHILD PATHS pathPattern -``` - -可以查看此路径模式所匹配的所有路径的下一层的所有路径和它对应的节点类型,即pathPattern.*所匹配的路径及其节点类型。 - -节点类型:ROOT -> SG INTERNAL -> DATABASE -> INTERNAL -> DEVICE -> TIMESERIES - -示例: - -* 查询 root.ln 的下一层:show child paths root.ln - -``` -+------------+----------+ -| child paths|node types| -+------------+----------+ -|root.ln.wf01| INTERNAL| -|root.ln.wf02| INTERNAL| -+------------+----------+ -Total line number = 2 -It costs 0.002s -``` - -* 查询形如 root.xx.xx.xx 的路径:show child paths root.\*.\* - -``` -+---------------+ -| child paths| -+---------------+ -|root.ln.wf01.s1| -|root.ln.wf02.s2| -+---------------+ -``` - -### 查看路径的下一级节点 - -``` -SHOW CHILD NODES pathPattern -``` - -可以查看此路径模式所匹配的节点的下一层的所有节点。 - -示例: - -* 查询 root 的下一层:show child nodes root - -``` -+------------+ -| child nodes| -+------------+ -| ln| -+------------+ -``` - -* 查询 root.ln 的下一层 :show child nodes root.ln - -``` -+------------+ -| child nodes| -+------------+ -| wf01| -| wf02| -+------------+ -``` - -### 统计节点数 - -IoTDB 支持使用`COUNT NODES 
LEVEL=`来统计当前 Metadata - 树下满足某路径模式的路径中指定层级的节点个数。这条语句可以用来统计带有特定采样点的设备数。例如: - -``` -IoTDB > COUNT NODES root.** LEVEL=2 -IoTDB > COUNT NODES root.ln.** LEVEL=2 -IoTDB > COUNT NODES root.ln.wf01.* LEVEL=3 -IoTDB > COUNT NODES root.**.temperature LEVEL=3 -``` - -对于上面提到的例子和 Metadata Tree,你可以获得如下结果: - -``` -+------------+ -|count(nodes)| -+------------+ -| 4| -+------------+ -Total line number = 1 -It costs 0.003s - -+------------+ -|count(nodes)| -+------------+ -| 2| -+------------+ -Total line number = 1 -It costs 0.002s - -+------------+ -|count(nodes)| -+------------+ -| 1| -+------------+ -Total line number = 1 -It costs 0.002s - -+------------+ -|count(nodes)| -+------------+ -| 2| -+------------+ -Total line number = 1 -It costs 0.002s -``` - -> 注意:时间序列的路径只是过滤条件,与 level 的定义无关。 - -### 查看设备 - -* SHOW DEVICES pathPattern? (WITH DATABASE)? devicesWhereClause? limitClause? - -与 `Show Timeseries` 相似,IoTDB 目前也支持两种方式查看设备。 - -* `SHOW DEVICES` 语句显示当前所有的设备信息,等价于 `SHOW DEVICES root.**`。 -* `SHOW DEVICES ` 语句规定了 `PathPattern`,返回给定的路径模式所匹配的设备信息。 -* `WHERE` 条件中可以使用 `DEVICE contains 'xxx'`,根据 device 名称进行模糊查询。 -* `WHERE` 条件中可以使用 `TEMPLATE = 'xxx'`,`TEMPLATE != 'xxx'`,根据 template 名称进行过滤查询。 -* `WHERE` 条件中可以使用 `TEMPLATE is null`,`TEMPLATE is not null`,根据 template 是否为null(null 表示没激活)进行过滤查询。 - -SQL 语句如下所示: - -``` -IoTDB> show devices -IoTDB> show devices root.ln.** -IoTDB> show devices root.ln.** where device contains 't' -IoTDB> show devices root.ln.** where template = 't1' -IoTDB> show devices root.ln.** where template is null -IoTDB> show devices root.ln.** where template != 't1' -IoTDB> show devices root.ln.** where template is not null -``` - -你可以获得如下数据: - -``` -+-------------------+---------+---------+ -| devices|isAligned| Template| -+-------------------+---------+---------+ -| root.ln.wf01.wt01| false| t1| -| root.ln.wf02.wt02| false| null| -|root.sgcc.wf03.wt01| false| null| -| root.turbine.d1| false| null| -+-------------------+---------+---------+ -Total line number = 4 -It 
costs 0.002s - -+-----------------+---------+---------+ -| devices|isAligned| Template| -+-----------------+---------+---------+ -|root.ln.wf01.wt01| false| t1| -|root.ln.wf02.wt02| false| null| -+-----------------+---------+---------+ -Total line number = 2 -It costs 0.001s - -+-----------------+---------+---------+ -| devices|isAligned| Template| -+-----------------+---------+---------+ -|root.ln.wf01.wt01| false| t1| -|root.ln.wf02.wt02| false| null| -+-----------------+---------+---------+ -Total line number = 2 -It costs 0.001s - -+-----------------+---------+---------+ -| devices|isAligned| Template| -+-----------------+---------+---------+ -|root.ln.wf01.wt01| false| t1| -+-----------------+---------+---------+ -Total line number = 1 -It costs 0.001s - -+-----------------+---------+---------+ -| devices|isAligned| Template| -+-----------------+---------+---------+ -|root.ln.wf02.wt02| false| null| -+-----------------+---------+---------+ -Total line number = 1 -It costs 0.001s -``` - -其中,`isAligned`表示该设备下的时间序列是否对齐, -`Template`显示着该设备所激活的模板名,null 表示没有激活模板。 - -查看设备及其 database 信息,可以使用 `SHOW DEVICES WITH DATABASE` 语句。 - -* `SHOW DEVICES WITH DATABASE` 语句显示当前所有的设备信息和其所在的 database,等价于 `SHOW DEVICES root.**`。 -* `SHOW DEVICES WITH DATABASE` 语句规定了 `PathPattern`,返回给定的路径模式所匹配的设备信息和其所在的 database。 - -SQL 语句如下所示: - -``` -IoTDB> show devices with database -IoTDB> show devices root.ln.** with database -``` - -你可以获得如下数据: - -``` -+-------------------+-------------+---------+---------+ -| devices| database|isAligned| Template| -+-------------------+-------------+---------+---------+ -| root.ln.wf01.wt01| root.ln| false| t1| -| root.ln.wf02.wt02| root.ln| false| null| -|root.sgcc.wf03.wt01| root.sgcc| false| null| -| root.turbine.d1| root.turbine| false| null| -+-------------------+-------------+---------+---------+ -Total line number = 4 -It costs 0.003s - -+-----------------+-------------+---------+---------+ -| devices| database|isAligned| Template| 
-+-----------------+-------------+---------+---------+ -|root.ln.wf01.wt01| root.ln| false| t1| -|root.ln.wf02.wt02| root.ln| false| null| -+-----------------+-------------+---------+---------+ -Total line number = 2 -It costs 0.001s -``` - -### 统计设备数量 - -* COUNT DEVICES \ - -上述语句用于统计设备的数量,同时允许指定`PathPattern` 用于统计匹配该`PathPattern` 的设备数量 - -SQL 语句如下所示: - -``` -IoTDB> show devices -IoTDB> count devices -IoTDB> count devices root.ln.** -``` - -你可以获得如下数据: - -``` -+-------------------+---------+---------+ -| devices|isAligned| Template| -+-------------------+---------+---------+ -|root.sgcc.wf03.wt03| false| null| -| root.turbine.d1| false| null| -| root.ln.wf02.wt02| false| null| -| root.ln.wf01.wt01| false| t1| -+-------------------+---------+---------+ -Total line number = 4 -It costs 0.024s - -+--------------+ -|count(devices)| -+--------------+ -| 4| -+--------------+ -Total line number = 1 -It costs 0.004s - -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -Total line number = 1 -It costs 0.004s -``` - -### 活跃设备查询 -和活跃时间序列一样,我们可以在查看和统计设备的基础上添加时间过滤条件来查询在某段时间内存在数据的活跃设备。这里活跃的定义与活跃时间序列相同,使用样例如下: -``` -IoTDB> insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -IoTDB> insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -IoTDB> insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -IoTDB> show devices; -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -| root.sg.data3| false| -+-------------------+---------+ - -IoTDB> show devices where time >= 15000 and time < 16000; -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -+-------------------+---------+ - -IoTDB> count devices where time >= 15000 and time < 16000; -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -``` \ No newline at end of file diff --git 
a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/AINode_Deployment_timecho.md b/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/AINode_Deployment_timecho.md deleted file mode 100644 index 185ba32d8..000000000 --- a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/AINode_Deployment_timecho.md +++ /dev/null @@ -1,564 +0,0 @@ - -# AINode 部署 - -## AINode介绍 - -### 能力介绍 - -AINode 是 IoTDB 在 ConfigNode、DataNode 后提供的第三种内生节点,该节点通过与 IoTDB 集群的 DataNode、ConfigNode 的交互,扩展了对时间序列进行机器学习分析的能力,支持从外部引入已有机器学习模型进行注册,并使用注册的模型在指定时序数据上通过简单 SQL 语句完成时序分析任务的过程,将模型的创建、管理及推理融合在数据库引擎中。目前已提供常见时序分析场景(例如预测与异常检测)的机器学习算法或自研模型。 - -### 交付方式 - 是 IoTDB 集群外的额外套件,独立安装包。 - -### 部署模式 -
- - -
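-
-## Installation Preparation
-
-### Obtaining the Installation Package
-
-Download the AINode installation package; downloading and unzipping it completes the AINode installation.
-
-After unzipping the package (`apache-iotdb-<version>-ainode-bin.zip`), the directory structure is as follows:
-| **Directory** | **Type** | **Description** |
-| ------------ | -------- | ------------------------------------------------ |
-| lib | Folder | Compiled binary executables of AINode and their code dependencies |
-| sbin | Folder | AINode scripts for starting, removing, and stopping AINode |
-| conf | Folder | AINode configuration files |
-| LICENSE | File | License |
-| NOTICE | File | Notice |
-| README_ZH.md | File | Chinese README in Markdown format |
-| `README.md` | File | Usage instructions |
-
-### Pre-installation Check
-
-To make sure the AINode installation package you obtained is complete and correct, we recommend verifying its SHA512 checksum before installation and deployment.
-
-#### Preparation:
-
-- Obtain the officially released SHA512 checksum: contact Timecho staff to get it
-
-#### Verification steps (Linux as an example):
-
-1. Open a terminal and go to the directory containing the installation package (e.g. `/data/ainode`):
-   ```Bash
-   cd /data/ainode
-   ```
-2. Run the following command to compute the hash:
-   ```Bash
-   sha512sum timechodb-{version}-ainode-bin.zip
-   ```
-3. The terminal prints the result (SHA512 checksum on the left, file name on the right):
-
-![img](/img/sha512-05.png)
-
-4. Compare the output with the official SHA512 checksum. If they match, proceed with the AINode installation and deployment steps below.
-
-#### Notes:
-
-- If the checksums do not match, contact Timecho staff to obtain the installation package again
-- If a "file does not exist" message appears during verification, check that the file path is correct and that the package was downloaded completely
-
-### Environment Preparation
-- Recommended operating systems: Ubuntu, CentOS, MacOS
-
-- Runtime environment
-  - Python 3.10 to 3.12, with the pip and venv tools. In an offline environment, download the zip archive for your operating system from [here](https://cloud.tsinghua.edu.cn/d/4c1342f6c272439aa96c/?p=%2Flibs&mode=list) (for dependencies, select the zip archive in the libs folder, as shown in the figure below), copy all files in that folder into the `lib` folder under `apache-iotdb-<version>-ainode-bin`, and start AINode by following the steps below.
-
-
-
-  - A Python interpreter must be available through the environment variables and directly callable with the `python` command
-  - We recommend creating a Python venv virtual environment under the `apache-iotdb-<version>-ainode-bin` folder. For example, to create a virtual environment with Python 3.10.0:
-
-    ```shell
-    # Create a virtual environment with Python 3.10.0, in a folder named venv
-    ../Python-3.10.0/python -m venv venv
-    ```
-## Installation, Deployment, and Usage
-
-### Installing AINode
-
-1. 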
AINode 激活 - - 要求 IoTDB 处于正常运行状态,且 license 中有 AINode 模块授权。 - - 激活 AINode 模块授权方式如下: - - 方式一:激活文件拷贝激活 - - 重新启动 confignode 节点后,进入 activation 文件夹, 将 system_info 文件复制给天谋工作人员,并告知工作人员申请 AINode 独立授权; - - 收到工作人员返回的 license 文件; - - 将 license 文件放入对应节点的 activation 文件夹下; -- 方式二:激活脚本激活 - - 获取激活所需机器码,进入安装目录的 `sbin` 目录,执行激活脚本: - ```shell - cd sbin - ./start-activate.sh - ``` - - 显示如下信息,请将机器码(即该串字符)复制给天谋工作人员,并告知工作人员申请 AINode 独立授权: - ```shell - Please copy the system_info's content and send it to Timecho: - 01-KU5LDFFN-PNBEHDRH - Please enter license: - ``` - - 将工作人员返回的激活码输入上一步的命令行提示处 `Please enter license:`,如下提示: - ```shell - Please enter license: - Jw+MmF+AtexsfgNGOFgTm83BgXbq0zT1+fOfPvQsLlj6ZsooHFU6HycUSEGC78eT1g67KPvkcLCUIsz2QpbyVmPLr9x1+kVjBubZPYlVpsGYLqLFc8kgpb5vIrPLd3hGLbJ5Ks8fV1WOVrDDVQq89YF2atQa2EaB9EAeTWd0bRMZ+s9ffjc/1Zmh9NSP/T3VCfJcJQyi7YpXWy5nMtcW0gSV+S6fS5r7a96PjbtE0zXNjnEhqgRzdU+mfO8gVuUNaIy9l375cp1GLpeCh6m6pF+APW1CiXLTSijK9Qh3nsL5bAOXNeob5l+HO5fEMgzrW8OJPh26Vl6ljKUpCvpTiw== - License has been stored to sbin/../activation/license - Import completed. Please start cluster and excute 'show cluster' to verify activation status - ``` -- 更新 license 后,重新启动 DataNode 节点,进入 IoTDB 的 sbin 目录下,启动 datanode: - ```shell - cd sbin - ./start-datanode.sh -d #-d参数将在后台进行启动 - ``` - -2. 检查Linux的内核架构 - ```shell - uname -m - ``` - -3. 导入Python环境[下载](https://repo.anaconda.com/miniconda/) - - 推荐下载py311版本应用,导入至用户根目录下 iotdb专用文件夹 中 - -4. 验证Python版本 - - ```shell - python --version - ``` - -5. 创建虚拟环境(在 ainode 目录下执行): - - ```shell - python -m venv venv - ``` - -6. 激活虚拟环境: - - ```shell - source venv/bin/activate - ``` - -7. 下载导入AINode到专用文件夹,切换到专用文件夹并解压安装包 - - ```shell - unzip iotdb-enterprise-ainode-1.3.3.2.zip - ``` - -8. 配置项修改 - - ```shell - vi iotdb-enterprise-ainode-1.3.3.2/conf/iotdb-ainode.properties - ``` - 配置项修改:[详细信息](#配置项修改) - > ain_seed_config_node=iotdb-1:10710(集群通讯节点IP:通讯节点端口)
- > ain_inference_rpc_address=iotdb-3(运行AINode的服务器IP) - -9. 更换Python源 - - ```shell - pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/ - ``` - -10. 启动AINode节点 - - ```shell - nohup bash iotdb-enterprise-ainode-1.3.3.2/sbin/start-ainode.sh > myout.file 2>& 1 & - ``` - > 回到系统默认环境:conda deactivate - -### 配置项修改 -AINode 支持修改一些必要的参数。可以在 `conf/iotdb-ainode.properties` 文件中找到下列参数并进行持久化的修改: - -| **名称** | **描述** | **类型** | **默认值** | **改后生效方式** | -| :----------------------------- | ------------------------------------------------------------ | ------- | ------------------ | ---------------------------- | -| cluster_name | AINode 要加入集群的标识 | string | defaultCluster | 仅允许在第一次启动服务前修改 | -| ain_seed_config_node | AINode 启动时注册的 ConfigNode 地址 | String | 127.0.0.1:10710 | 仅允许在第一次启动服务前修改 | -| ain_inference_rpc_address | AINode 提供服务与通信的地址 ,内部服务通讯接口 | String | 127.0.0.1 | 仅允许在第一次启动服务前修改 | -| ain_inference_rpc_port | AINode 提供服务与通信的端口 | String | 10810 | 仅允许在第一次启动服务前修改 | -| ain_system_dir | AINode 元数据存储路径,相对路径的起始目录与操作系统相关,建议使用绝对路径 | String | data/AINode/system | 仅允许在第一次启动服务前修改 | -| ain_models_dir | AINode 存储模型文件的路径,相对路径的起始目录与操作系统相关,建议使用绝对路径 | String | data/AINode/models | 仅允许在第一次启动服务前修改 | -| ain_logs_dir | AINode 存储日志的路径,相对路径的起始目录与操作系统相关,建议使用绝对路径 | String | logs/AINode | 重启后生效 | -| ain_thrift_compression_enabled | AINode 是否启用 thrift 的压缩机制,0-不启动、1-启动 | Boolean | 0 | 重启后生效 | -### 启动 AINode - - 在完成 Seed-ConfigNode 的部署后,可以通过添加 AINode 节点来支持模型的注册和推理功能。在配置项中指定 IoTDB 集群的信息后,可以执行相应的指令来启动 AINode,加入 IoTDB 集群。 - -#### 联网环境启动 - -##### 启动命令 - -```shell - # 启动命令 - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh - - # Windows 系统 - sbin\start-ainode.bat - - # 后台启动命令(长期运行推荐) - # Linux 和 MacOS 系统 - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - - # Windows 系统 - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -##### 详细语法 - -```shell - # 启动命令 - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh -i -r -n - - # Windows 系统 - sbin\start-ainode.bat -i -r -n - ``` - -##### 参数介绍: 
- -| **名称** | **标签** | **描述** | **是否必填** | **类型** | **默认值** | **输入方式** | -| ------------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ---------------- | ---------------------- | -| ain_interpreter_dir | -i | AINode 所安装在的虚拟环境的解释器路径,需要使用绝对路径 | 否 | String | 默认读取环境变量 | 调用时输入或持久化修改 | -| ain_force_reinstall | -r | 该脚本在检查 AINode 安装情况的时候是否检查版本,如果检查则在版本不对的情况下会强制安装 lib 里的 whl 安装包 | 否 | Bool | false | 调用时输入 | -| ain_no_dependencies | -n | 指定在安装 AINode 的时候是否安装依赖,如果指定则仅安装 AINode 主程序而不安装依赖。 | 否 | Bool | false | 调用时输入 | - - 如不想每次启动时指定对应参数,也可以在 `conf` 文件夹下的`ainode-env.sh` 和 `ainode-env.bat` 脚本中持久化修改参数(目前支持持久化修改 ain_interpreter_dir 参数)。 - - `ainode-env.sh` : - ```shell - # The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - # ain_interpreter_dir= - ``` - `ainode-env.bat` : -```shell - @REM The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - @REM set ain_interpreter_dir= - ``` - 在写入参数值的后解除对应行的注释并保存即可在下一次执行脚本时生效。 - -#### 示例 - -##### 直接启动: - -```shell - # 启动命令 - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh - # Windows 系统 - sbin\start-ainode.bat - - - # 后台启动命令(长期运行推荐) - # Linux 和 MacOS 系统 - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - # Windows 系统 - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -##### 更新启动: -如果 AINode 的版本进行了更新(如更新了 `lib` 文件夹),可使用此命令。首先要保证 AINode 已经停止运行,然后通过 `-r` 参数重启,该参数会根据 `lib` 下的文件重新安装 AINode。 - -```shell - # 更新启动命令 - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh -r - # Windows 系统 - sbin\start-ainode.bat -r - - - # 后台更新启动命令(长期运行推荐) - # Linux 和 MacOS 系统 - nohup bash sbin/start-ainode.sh -r > myout.file 2>& 1 & - # Windows 系统 - nohup bash sbin\start-ainode.bat -r > myout.file 2>& 1 & - ``` -#### 非联网环境启动 - -##### 启动命令 - -```shell - # 启动命令 - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh - - # Windows 系统 - sbin\start-ainode.bat - - # 后台启动命令(长期运行推荐) 
- # Linux 和 MacOS 系统 - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - - # Windows 系统 - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -##### 详细语法 - -```shell - # 启动命令 - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh -i -r -n - - # Windows 系统 - sbin\start-ainode.bat -i -r -n - ``` - -##### 参数介绍: - -| **名称** | **标签** | **描述** | **是否必填** | **类型** | **默认值** | **输入方式** | -| ------------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ---------------- | ---------------------- | -| ain_interpreter_dir | -i | AINode 所安装在的虚拟环境的解释器路径,需要使用绝对路径 | 否 | String | 默认读取环境变量 | 调用时输入或持久化修改 | -| ain_force_reinstall | -r | 该脚本在检查 AINode 安装情况的时候是否检查版本,如果检查则在版本不对的情况下会强制安装 lib 里的 whl 安装包 | 否 | Bool | false | 调用时输入 | - -> 注意:非联网环境下安装失败时,首先检查是否选择了平台对应的安装包,其次确认python版本(由于下载的安装包限制了python版本,3.7、3.9等其他都不行) - -#### 示例 - -##### 直接启动: - -```shell - # 启动命令 - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh - # Windows 系统 - sbin\start-ainode.bat - - - # 后台启动命令(长期运行推荐) - # Linux 和 MacOS 系统 - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - # Windows 系统 - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -### 检测 AINode 节点状态 - -AINode 启动过程中会自动将新的 AINode 加入 IoTDB 集群。启动 AINode 后可以在 命令行中输入 SQL 来查询,集群中看到 AINode 节点,其运行状态为 Running(如下展示)表示加入成功。 - -```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|Running| 127.0.0.1| 10810|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` - -### 停止 AINode - -如果需要停止正在运行的 AINode 节点,则执行相应的关闭脚本。 - -#### 停止命令 - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh - - #Windows - 
sbin\stop-ainode.bat - ``` - -##### 详细语法 - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh -t - - #Windows - sbin\stop-ainode.bat -t - ``` - -##### 参数介绍: - - | **名称** | **标签** | **描述** | **是否必填** | **类型** | **默认值** | **输入方式** | -| ----------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ------ | ---------- | -| ain_remove_target | -t | AINode 关闭时可以指定待移除的目标 AINode 的 Node ID、地址和端口号,格式为`` | 否 | String | 无 | 调用时输入 | - -#### 示例 -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh - - # Windows - sbin\stop-ainode.bat - ``` -停止 AINode 后,还可以在集群中看到 AINode 节点,其运行状态为 UNKNOWN(如下展示),此时无法使用 AINode 功能。 - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|UNKNOWN| 127.0.0.1| 10790|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -如果需要重新启动该节点,需重新执行启动脚本。 - -### 移除 AINode - -当需要把一个 AINode 节点移出集群时,可以执行移除脚本。移除和停止脚本的差别是:停止是在集群中保留 AINode 节点但停止 AINode 服务,移除则是把 AINode 节点从集群中移除出去。 - - - #### 移除命令 - -```shell - # Linux / MacOS - bash sbin/remove-ainode.sh - - # Windows - sbin\remove-ainode.bat - ``` - -##### 详细语法 - -```shell - # Linux / MacOS - bash sbin/remove-ainode.sh -i -t -r -n - - # Windows - sbin\remove-ainode.bat -i -t -r -n - ``` - -##### 参数介绍: - - | **名称** | **标签** | **描述** | **是否必填** | **类型** | **默认值** | **输入方式** | -| ------------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ---------------- | --------------------- | -| ain_interpreter_dir | -i | AINode 所安装在的虚拟环境的解释器路径,需要使用绝对路径 | 否 | String | 默认读取环境变量 | 调用时输入+持久化修改 | -| ain_remove_target | -t | AINode 关闭时可以指定待移除的目标 
AINode 的 Node ID、地址和端口号,格式为`` | 否 | String | 无 | 调用时输入 | -| ain_force_reinstall | -r | 该脚本在检查 AINode 安装情况的时候是否检查版本,如果检查则在版本不对的情况下会强制安装 lib 里的 whl 安装包 | 否 | Bool | false | 调用时输入 | -| ain_no_dependencies | -n | 指定在安装 AINode 的时候是否安装依赖,如果指定则仅安装 AINode 主程序而不安装依赖。 | 否 | Bool | false | 调用时输入 | - - 如不想每次启动时指定对应参数,也可以在 `conf` 文件夹下的`ainode-env.sh` 和 `ainode-env.bat` 脚本中持久化修改参数(目前支持持久化修改 ain_interpreter_dir 参数)。 - - `ainode-env.sh` : - ```shell - # The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - # ain_interpreter_dir= - ``` - `ainode-env.bat` : -```shell - @REM The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - @REM set ain_interpreter_dir= - ``` - 在写入参数值的后解除对应行的注释并保存即可在下一次执行脚本时生效。 - -#### 示例 - -##### 直接移除: - - ```shell - # Linux / MacOS - bash sbin/remove-ainode.sh - - # Windows - sbin\remove-ainode.bat - ``` - 移除节点后,将无法查询到节点的相关信息。 - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -##### 指定移除: - -如果用户丢失了 data 文件夹下的文件,可能 AINode 本地无法主动移除自己,需要用户指定节点号、地址和端口号进行移除,此时我们支持用户按照以下方法输入参数进行删除。 - - ```shell - # Linux / MacOS - bash sbin/remove-ainode.sh -t /: - - # Windows - sbin\remove-ainode.bat -t /: - ``` - -## 常见问题 - -### 启动AINode时出现找不到venv模块的报错 - - 当使用默认方式启动 AINode 时,会在安装包目录下创建一个 python 虚拟环境并安装依赖,因此要求安装 venv 模块。通常来说 python3.10 及以上的版本会自带 venv,但对于一些系统自带的 python 环境可能并不满足这一要求。出现该报错时有两种解决方案(二选一): - - 在本地安装 venv 模块,以 ubuntu 为例,可以通过运行以下命令来安装 python 自带的 venv 模块。或者从 python 官网安装一个自带 venv 的 python 版本。 - - ```shell -apt-get install 
python3.10-venv -``` - To create a Python 3.10.0 venv for AINode, run the following under the AINode path: - - ```shell -../Python-3.10.0/python -m venv venv # the last argument is the folder name -``` - Alternatively, pass `-i` when running the start script to use an existing Python interpreter as the AINode runtime environment, so that no new virtual environment has to be created. - - ### The SSL module in Python is not installed or configured correctly and cannot handle HTTPS resources -WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available. -To fix this, install OpenSSL and then rebuild Python. -> Currently Python versions 3.6 to 3.9 are compatible with OpenSSL 1.0.2, 1.1.0, and 1.1.1. - - Python requires OpenSSL to be installed on the system; for installation details see this [link](https://stackoverflow.com/questions/56552390/how-to-fix-ssl-module-in-python-is-not-available-in-centos) - - ```shell -sudo apt-get install build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev uuid-dev lzma-dev liblzma-dev -sudo -E ./configure --with-ssl -make -sudo make install -``` - - ### pip version too low - - On Windows, a build error such as "error: Microsoft Visual C++ 14.0 or greater is required..." occurs. - - This error usually means that the C++ build tools or the setuptools version is too old; try upgrading pip and setuptools: - - ```shell -./python -m pip install --upgrade pip -./python -m pip install --upgrade setuptools -``` - - - ### Compile and install Python - - Use the following commands to download the package from the official website and extract it: - ```shell -wget https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tar.xz -tar Jxf Python-3.10.0.tar.xz -``` - Compile and install Python: - ```shell -cd Python-3.10.0 -./configure --prefix=/usr/local/python3 -make -sudo make install -python3 --version -``` \ No newline at end of file diff --git a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Cluster-Deployment_timecho.md b/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Cluster-Deployment_timecho.md deleted file mode 100644 index cd0dc532e..000000000 --- a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Cluster-Deployment_timecho.md +++ /dev/null @@ -1,559 +0,0 @@ - -# Cluster Deployment - -This section describes how to manually deploy an instance with 3 ConfigNodes and 3 DataNodes, commonly known as a 3C3D cluster. - -
- -
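- -## Notes - -1. Before installation, make sure the system has been prepared according to [System Configuration](./Environment-Requirements.md). - -2. When deploying, prefer `hostname` over raw IP addresses; this avoids the database failing to start after a host IP is changed later. To set the hostname, configure /etc/hosts on the target server. For example, if the local IP is 192.168.1.3 and the hostname is iotdb-1, use the following command to set the server's hostname, and use the hostname for IoTDB's `cn_internal_address` and `dn_internal_address`. - - ``` shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -3. Some parameters cannot be modified after the first startup; refer to the "Parameter Configuration" section below when setting them. - -4. On both Linux and Windows, make sure the IoTDB installation path contains no spaces or Chinese characters, to avoid abnormal behavior. - -5. Note that the same user must be used throughout installation and deployment of IoTDB (including activation and use). You can: -- Use the root user (recommended): this avoids permission problems. -- Use a fixed non-root user: - - Use the same user for all operations: start, activate, stop, and so on must all be performed by the same user; do not switch users. - - Avoid sudo: sudo runs commands with root privileges, which may cause permission confusion or security problems. - -6. Deploying the monitoring panel is recommended; it monitors key runtime metrics so you can track the database status at any time. Contact your business representative to obtain the monitoring panel; for deployment steps see [Monitoring Panel Deployment](./Monitoring-panel-deployment.md) - -7. Before installing and deploying the database, you can use the health check tool to inspect the runtime environment of the IoTDB nodes and obtain detailed results. For usage, see [Health Check Tool](../Tools-System/Health-Check-Tool.md). - -## Preparation - -1. Prepare the IoTDB installation package: iotdb-enterprise-{version}-bin.zip (to obtain the package, see this [link](../Deployment-and-Maintenance/IoTDB-Package_timecho.md)) -2. Configure the operating system environment as required (for system environment configuration, see this [link](../Deployment-and-Maintenance/Environment-Requirements.md)) - -### Pre-installation check - -To make sure the IoTDB enterprise installation package you obtained is complete and correct, run a SHA512 check before installing. - -#### Prerequisites: - -- Obtain the officially published SHA512 checksum: the "SHA512 checksum" listed for each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document - -#### Verification steps (Linux as an example): - -1. Open a terminal and change to the directory containing the installation package (for example `/data/iotdb`): - ```Bash - cd /data/iotdb - ``` -2. Run the following command to compute the hash: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. The terminal prints the result (the SHA512 checksum on the left, the file name on the right): - -![img](/img/sha512-02.png) - -4. 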
对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行IoTDB企业版的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - - -## 安装步骤 - -假设现在有3台linux服务器,IP地址和服务角色分配如下: - -| 节点ip | 主机名 | 服务 | -| ----------- | ------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode、DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode、DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode、DataNode | - -### 设置主机名 - -在3台机器上分别配置主机名,设置主机名需要在目标服务器上配置`/etc/hosts`,使用如下命令: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### 参数配置 - -解压安装包并进入安装目录 - -```Plain -unzip iotdb-enterprise-{version}-bin.zip -cd iotdb-enterprise-{version}-bin -``` - -#### 环境脚本配置 - -- `./conf/confignode-env.sh`配置 - - | **配置项** | **说明** | **默认值** | **推荐值** | 备注 | - | :---------- | :------------------------------------- | :--------- | :----------------------------------------------- | :----------- | - | MEMORY_SIZE | IoTDB ConfigNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的30% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -- `./conf/datanode-env.sh`配置 - - | **配置项** | **说明** | **默认值** | **推荐值** | 备注 | - | :---------- | :----------------------------------- |:-----------------------| :----------------------------------------------- | :----------- | - | MEMORY_SIZE | IoTDB DataNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的50% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -#### 通用配置 - -打开通用配置文件`./conf/iotdb-system.properties`,可根据部署方式设置以下参数: - -| 配置项 | 说明 | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | -| ------------------------- | ---------------------------------------- | -------------- | -------------- | -------------- | -| cluster_name | 集群名称 | defaultCluster | defaultCluster | defaultCluster | -| schema_replication_factor | 元数据副本数,DataNode数量不应少于此数目 | 3 | 3 | 3 | -| data_replication_factor | 数据副本数,DataNode数量不应少于此数目 | 2 | 2 | 2 | - -#### ConfigNode 配置 - -打开ConfigNode配置文件`./conf/iotdb-system.properties`,设置以下参数 
- -| 配置项 | 说明 | 默认 | 推荐值 | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | 备注 | -| ------------------- | ------------------------------------------------------------ | --------------- | ------------------------------------------------------- | ------------- | ------------- | ------------- | ------------------ | -| cn_internal_address | ConfigNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | iotdb-1 | iotdb-2 | iotdb-3 | 首次启动后不能修改 | -| cn_internal_port | ConfigNode在集群内部通讯使用的端口 | 10710 | 10710 | 10710 | 10710 | 10710 | 首次启动后不能修改 | -| cn_consensus_port | ConfigNode副本组共识协议通信使用的端口 | 10720 | 10720 | 10720 | 10720 | 10720 | 首次启动后不能修改 | -| cn_seed_config_node | 节点注册加入集群时连接的ConfigNode 的地址,cn_internal_address:cn_internal_port | 127.0.0.1:10710 | 第一个CongfigNode的cn_internal_address:cn_internal_port | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | 首次启动后不能修改 | - -#### DataNode 配置 - -打开DataNode配置文件 `./conf/iotdb-system.properties`,设置以下参数: - -| 配置项 | 说明 | 默认 | 推荐值 | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | 备注 | -| ------------------------------- | ------------------------------------------------------------ | --------------- | ------------------------------------------------------- | ------------- | ------------- | ------------- | ------------------ | -| dn_rpc_address | 客户端 RPC 服务的地址 | 0.0.0.0 | 所在服务器的IPV4地址或hostname,推荐使用所在服务器的IPV4地址 | iotdb-1 |iotdb-2 | iotdb-3 | 重启服务生效 | -| dn_rpc_port | 客户端 RPC 服务的端口 | 6667 | 6667 | 6667 | 6667 | 6667 | 重启服务生效 | -| dn_internal_address | DataNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | iotdb-1 | iotdb-2 | iotdb-3 | 首次启动后不能修改 | -| dn_internal_port | DataNode在集群内部通信使用的端口 | 10730 | 10730 | 10730 | 10730 | 10730 | 首次启动后不能修改 | -| dn_mpp_data_exchange_port | DataNode用于接收数据流使用的端口 | 10740 | 10740 | 10740 | 10740 | 10740 | 首次启动后不能修改 | -| dn_data_region_consensus_port | DataNode用于数据副本共识协议通信使用的端口 | 10750 | 10750 | 10750 | 10750 | 10750 | 首次启动后不能修改 | -| dn_schema_region_consensus_port | DataNode用于元数据副本共识协议通信使用的端口 | 
10760 | 10760 | 10760 | 10760 | 10760 | 首次启动后不能修改 | -| dn_seed_config_node | 节点注册加入集群时连接的ConfigNode地址,即cn_internal_address:cn_internal_port | 127.0.0.1:10710 | 第一个CongfigNode的cn_internal_address:cn_internal_port | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | 首次启动后不能修改 | - -> ❗️注意:VSCode Remote等编辑器无自动保存配置功能,请确保修改的文件被持久化保存,否则配置项无法生效 - -### 启动及激活数据库 (V 1.3.4 及以后的 1.x 版本) - -#### 启动 ConfigNode 节点 - -先启动第一个iotdb-1的confignode, 保证种子confignode节点先启动,然后依次启动第2和第3个confignode节点 - -```Bash -./start-confignode.sh -d #“-d”参数将在后台进行启动 -``` -如果启动失败,请参考[常见问题](#常见问题)。 - -#### 启动 DataNode 节点 - -分别进入iotdb的`sbin`目录下,依次启动3个datanode节点: - -```Bash -./start-datanode.sh -d #-d参数将在后台进行启动 -``` - -#### 激活数据库 - -##### 通过 CLI 激活 - -- 进入集群任一节点 CLI,执行获取机器码的语句 - - ```SQL - -- 连接CLI - ./sbin/start-cli.sh - -- 获取激活所需机器码 - IoTDB> show system info -``` - -- 系统将自动返回集群所有节点的机器码 - -```Bash -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -|01-TE5NLES4-UDDWCMYE,01-GG5NLES4-XXDWCMYE,01-FF5NLES4-WWWWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -It costs 0.030s -``` - -- 将获取的机器码复制给天谋工作人员 - -- 工作人员会返回激活码,正常是与提供的机器码的顺序对应的,请将整串激活码粘贴到CLI中进行激活 - - - 注:激活码前后需要用`'`符号进行标注,如下所示 - - ```Bash - IoTDB> activate 
'01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' - ``` - -### 启动及激活数据库 (V 1.3.4 之前版本) - -#### 启动 ConfigNode 节点 - -先启动第一个iotdb-1的confignode, 保证种子confignode节点先启动,然后依次启动第2和第3个confignode节点 - -```Bash -./start-confignode.sh -d #“-d”参数将在后台进行启动 -``` -如果启动失败,请参考[常见问题](#常见问题)。 - -#### 激活数据库 - -##### 方式一:激活文件拷贝激活 - -- 依次启动3个confignode节点后,每台机器各自的`activation`文件夹, 分别拷贝每台机器的`system_info`文件给天谋工作人员; -- 工作人员将返回每个ConfigNode节点的license文件,这里会返回3个license文件; -- 将3个license文件分别放入对应的ConfigNode节点的`activation`文件夹下; - -##### 方式二:激活脚本激活 - -- 依次获取3台机器的机器码,分别进入安装目录的`sbin`目录,执行激活脚本`start-activate.sh`: - - ```Bash - ./start-activate.sh - ``` - -- 显示如下信息,这里显示的是1台机器的机器码 : - - ```Bash - Please copy the system_info's content and send it to Timecho: - 01-KU5LDFFN-PNBEHDRH - Please enter license: - ``` - -- 其他2个节点依次执行激活脚本`start-activate.sh`,然后将获取的3台机器的机器码都复制给天谋工作人员 -- 工作人员会返回3段激活码,正常是与提供的3个机器码的顺序对应的,请分别将各自的激活码粘贴到上一步的命令行提示处 `Please enter license:`,如下提示: - - ```Bash - Please enter license: - Jw+MmF+Atxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx5bAOXNeob5l+HO5fEMgzrW8OJPh26Vl6ljKUpCvpTiw== - License has been stored to sbin/../activation/license - Import completed. 
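Please start cluster and excute 'show cluster' to verify activation status - ``` - -#### Start the DataNode nodes - - On each machine, go to the IoTDB `sbin` directory and start the 3 DataNode nodes one after another: - -```Bash -./start-datanode.sh -d # -d starts the node in the background -``` - -### Verify the deployment - -Run the CLI start script in the `./sbin` directory: - -```Plain -./start-cli.sh -h ip(local IP or domain name) -p port(6667) -``` - - After a successful start, the following screen indicates that IoTDB was installed successfully. - -![](/img/%E4%BC%81%E4%B8%9A%E7%89%88%E6%88%90%E5%8A%9F.png) - -Once the success screen appears, check whether activation succeeded with the `show cluster` command. - -`ACTIVATED` in the rightmost column means activation succeeded. - -![](/img/%E4%BC%81%E4%B8%9A%E7%89%88%E6%BF%80%E6%B4%BB.png) - -You can also run the `show activation` command in the CLI to check the activation status, as in the example below; a status of ACTIVATED means activation succeeded. - -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - - -> `ACTIVATED(W)` means passive activation: this ConfigNode has no license file (or does not have the license file with the latest issue timestamp), and its activation depends on other activated ConfigNodes in the cluster. In this case, check whether the license file has been placed in the license folder; if not, put it there. If a license file already exists, the license file on this node may be inconsistent with the information on other nodes; contact Timecho staff to apply again.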
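
The activation check described above can also be scripted. The following is only a minimal sketch: it assumes the CLI is run non-interactively (the IoTDB CLI's `-e` option is assumed to be available in your version) and that the `show activation` output keeps the table layout shown above. The helper name `activation_status` is hypothetical and not part of IoTDB.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with IoTDB): read `show activation` output
# on stdin and print the value in the Usage column of the Status row.
# Assumes the table layout shown above, e.g. "|  Status|ACTIVATED|  -|".
activation_status() {
  awk -F'|' '
    { key = $2; val = $3; gsub(/^[ \t]+|[ \t]+$/, "", key); gsub(/^[ \t]+|[ \t]+$/, "", val) }
    key == "Status" { print val; exit }
  '
}

# Example usage (assumption: start-cli.sh accepts -e to run one statement and exit):
#   ./sbin/start-cli.sh -h iotdb-1 -p 6667 -e "show activation" | activation_status
```

The function only parses text, so a result of `ACTIVATED(W)` is passed through unchanged and can be handled separately as the passive-activation case described above.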
- - -### 一键启停集群 - -#### 概述 - -在 IoTDB 的根目录中,`sbin` 子目录包含的 `start-all.sh` 和 `stop-all.sh` 脚本,与 `conf` 子目录中的 `iotdb-cluster.properties` 配置文件协同工作,可通过单一节点实现一键启动或停止集群所有节点的功能。通过这种方式,可以高效地管理 IoTDB 集群的生命周期,简化了部署和运维流程。 -下文将介绍`iotdb-cluster.properties` 文件中的具体配置项。 - -#### 配置项 - - -> 注意: -> -> * 当集群变更时,需要手动更新此配置文件。 -> * 如果在未配置 `iotdb-cluster.properties` 配置文件的情况下执行 `start-all.sh` 或者 `stop-all.sh` 脚本,则默认会启停当前脚本所在 IOTDB\_HOME 目录下的 ConfigNode 与 DataNode 节点。 -> * 推荐配置 ssh 免密登录:如果未配置,启动脚本后会提示输入服务器密码以便于后续启动/停止/销毁操作。如果已配置,则无需在执行脚本过程中输入服务器密码。 - -* confignode\_address\_list - -| 名字 | confignode\_address\_list | -| :--------------: | :------------------------------------------------------------------------------ | -| 描述 | 待启动/停止的 ConfigNode 节点所在主机的 IP 列表,如果有多个需要用“,”分隔。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* datanode\_address\_list - -| 名字 | datanode\_address\_list | -| :----------------: | :---------------------------------------------------------------------------- | -| 描述 | 待启动/停止的 DataNode 节点所在主机的 IP 列表,如果有多个需要用“,”分隔。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* ssh\_account - -| 名字 | ssh\_account | -| :----------------: | :------------------------------------------------------------- | -| 描述 | 通过 SSH 登陆目标主机的用户名,需要所有的主机的用户名都相同 | -| 类型 | String | -| 默认值 | root | -| 改后生效方式 | 重启服务生效 | - -* ssh\_port - -| 名字 | ssh\_port | -| :----------------: | :--------------------------------------------------------- | -| 描述 | 目标主机对外暴露的 SSH 端口,需要所有的主机的端口都相同 | -| 类型 | int | -| 默认值 | 22 | -| 改后生效方式 | 重启服务生效 | - -* confignode\_deploy\_path - -| 名字 | confignode\_deploy\_path | -| :----------------: | :---------------------------------------------------------------------------------------------------------------- | -| 描述 | 待启动/停止的所有 ConfigNode 所在目标主机的路径,需要所有待启动/停止的 ConfigNode 节点在目标主机的相同目录下。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* datanode\_deploy\_path - -| 名字 | datanode\_deploy\_path | -| :----------------: | 
:------------------------------------------------------------------------------------------------------------ | -| 描述 | 待启动/停止的所有 DataNode 所在目标主机的路径,需要所有待启动/停止的 DataNode 节点在目标主机的相同目录下。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - - - -## 节点维护步骤 - -### ConfigNode节点维护 - -ConfigNode节点维护分为ConfigNode添加和移除两种操作,有两个常见使用场景: -- 集群扩展:如集群中只有1个ConfigNode时,希望增加ConfigNode以提升ConfigNode节点高可用性,则可以添加2个ConfigNode,使得集群中有3个ConfigNode。 -- 集群故障恢复:1个ConfigNode所在机器发生故障,使得该ConfigNode无法正常运行,此时可以移除该ConfigNode,然后添加一个新的ConfigNode进入集群。 - -> ❗️注意,在完成ConfigNode节点维护后,需要保证集群中有1或者3个正常运行的ConfigNode。2个ConfigNode不具备高可用性,超过3个ConfigNode会导致性能损失。 - -#### 添加ConfigNode节点 - -脚本命令: -```shell -# Linux / MacOS -# 首先切换到IoTDB根目录 -sbin/start-confignode.sh - -# Windows -# 首先切换到IoTDB根目录 -sbin/start-confignode.bat -``` - -参数介绍: - -| 参数 | 描述 | 是否为必填项 | -| :--- | :--------------------------------------------- | :----------- | -| -v | 显示版本信息 | 否 | -| -f | 在前台运行脚本,不将其放到后台 | 否 | -| -d | 以守护进程模式启动,即在后台运行 | 否 | -| -p | 指定一个文件来存放进程ID,用于进程管理 | 否 | -| -c | 指定配置文件夹的路径,脚本会从这里加载配置文件 | 否 | -| -g | 打印垃圾回收(GC)的详细信息 | 否 | -| -H | 指定Java堆转储文件的路径,当JVM内存溢出时使用 | 否 | -| -E | 指定JVM错误日志文件的路径 | 否 | -| -D | 定义系统属性,格式为 key=value | 否 | -| -X | 直接传递 -XX 参数给 JVM | 否 | -| -h | 帮助指令 | 否 | - -#### 移除ConfigNode节点 - -首先通过CLI连接集群,通过`show confignodes`确认想要移除ConfigNode的内部地址与端口号: - -```Bash -IoTDB> show confignodes -+------+-------+---------------+------------+--------+ -|NodeID| Status|InternalAddress|InternalPort| Role| -+------+-------+---------------+------------+--------+ -| 0|Running| 127.0.0.1| 10710| Leader| -| 1|Running| 127.0.0.1| 10711|Follower| -| 2|Running| 127.0.0.1| 10712|Follower| -+------+-------+---------------+------------+--------+ -Total line number = 3 -It costs 0.030s -``` - -然后使用脚本将ConfigNode移除。脚本命令: - -```Bash -# Linux / MacOS -sbin/remove-confignode.sh [confignode_id] - -#Windows -sbin/remove-confignode.bat [confignode_id] - -``` - -### DataNode节点维护 - -DataNode节点维护有两个常见场景: - -- 集群扩容:出于集群能力扩容等目的,添加新的DataNode进入集群 -- 
集群故障恢复:一个DataNode所在机器出现故障,使得该DataNode无法正常运行,此时可以移除该DataNode,并添加新的DataNode进入集群 - -> ❗️注意,为了使集群能正常工作,在DataNode节点维护过程中以及维护完成后,正常运行的DataNode总数不得少于数据副本数(通常为2),也不得少于元数据副本数(通常为3)。 - -#### 添加DataNode节点 - -脚本命令: - -```Bash -# Linux / MacOS -# 首先切换到IoTDB根目录 -sbin/start-datanode.sh - -# Windows -# 首先切换到IoTDB根目录 -sbin/start-datanode.bat -``` - -参数介绍: - -| 缩写 | 描述 | 是否为必填项 | -| :--- | :--------------------------------------------- | :----------- | -| -v | 显示版本信息 | 否 | -| -f | 在前台运行脚本,不将其放到后台 | 否 | -| -d | 以守护进程模式启动,即在后台运行 | 否 | -| -p | 指定一个文件来存放进程ID,用于进程管理 | 否 | -| -c | 指定配置文件夹的路径,脚本会从这里加载配置文件 | 否 | -| -g | 打印垃圾回收(GC)的详细信息 | 否 | -| -H | 指定Java堆转储文件的路径,当JVM内存溢出时使用 | 否 | -| -E | 指定JVM错误日志文件的路径 | 否 | -| -D | 定义系统属性,格式为 key=value | 否 | -| -X | 直接传递 -XX 参数给 JVM | 否 | -| -h | 帮助指令 | 否 | - -说明:在添加DataNode后,随着新的写入到来(以及旧数据过期,如果设置了TTL),集群负载会逐渐向新的DataNode均衡,最终在所有节点上达到存算资源的均衡。 - -#### 移除DataNode节点 - -首先通过CLI连接集群,通过`show datanodes`确认想要移除的DataNode的RPC地址与端口号: - -```Bash -IoTDB> show datanodes -+------+-------+----------+-------+-------------+---------------+ -|NodeID| Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum| -+------+-------+----------+-------+-------------+---------------+ -| 1|Running| 0.0.0.0| 6667| 0| 0| -| 2|Running| 0.0.0.0| 6668| 1| 1| -| 3|Running| 0.0.0.0| 6669| 1| 0| -+------+-------+----------+-------+-------------+---------------+ -Total line number = 3 -It costs 0.110s -``` - -然后使用脚本将DataNode移除。脚本命令: - -```Bash -# Linux / MacOS -sbin/remove-datanode.sh [datanode_id] - -#Windows -sbin/remove-datanode.bat [datanode_id] -``` - -## 常见问题 - -1. 部署过程中多次提示激活失败 - - 使用 `ls -al` 命令:使用 `ls -al` 命令检查安装包根目录的所有者信息是否为当前用户。 - - 检查激活目录:检查 `./activation` 目录下的所有文件,所有者信息是否为当前用户。 - -2. Confignode节点启动失败 - - 步骤 1: 请查看启动日志,检查是否修改了某些首次启动后不可改的参数。 - - 步骤 2: 请查看启动日志,检查是否出现其他异常。日志中若存在异常现象,请联系天谋技术支持人员咨询解决方案。 - - 步骤 3: 如果是首次部署或者数据可删除,也可按下述步骤清理环境,重新部署后,再次启动。 - - 步骤 4: 清理环境: - - a. 结束所有 ConfigNode 和 DataNode 进程。 - - ```Bash - # 1. 停止 ConfigNode 和 DataNode 服务 - sbin/stop-standalone.sh - - # 2. 
Check whether any processes remain - jps - # or - ps -ef|grep iotdb - - # 3. If any processes remain, kill them manually - kill -9 - # If you are sure there is only 1 iotdb on this machine, the following command cleans up the remaining processes - ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9 - ``` - b. Delete the data and logs directories. - - Note: deleting the data directory is required; deleting the logs directory only keeps the logs clean and is optional. - ```Bash - cd /data/iotdb - rm -rf data logs - ``` \ No newline at end of file diff --git a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Docker-Deployment_timecho.md b/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Docker-Deployment_timecho.md deleted file mode 100644 index 741238903..000000000 --- a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Docker-Deployment_timecho.md +++ /dev/null @@ -1,495 +0,0 @@ - -# Docker Deployment - -## Environment preparation - -### Install Docker - -```Bash -# Ubuntu is used as an example; for other operating systems, look up the installation method -#step1: install the required system tools -sudo apt-get update -sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common -#step2: install the GPG key -curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add - -#step3: add the software repository -sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" -#step4: update and install Docker-CE -sudo apt-get -y update -sudo apt-get -y install docker-ce -#step5: enable docker to start on boot -sudo systemctl enable docker -#step6: verify that docker is installed -docker --version # version information means the installation succeeded -``` - -### Install docker-compose - -```Bash -# install command -curl -L "https://github.com/docker/compose/releases/download/v2.20.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose -chmod +x /usr/local/bin/docker-compose -ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose -# verify the installation -docker-compose --version # version information means the installation succeeded -``` - -### Install dmidecode - -dmidecode is normally already installed on Linux servers; if it is not, install it with the following command. - -```Bash -sudo apt-get install dmidecode -``` - -After dmidecode is installed, find its path with `whereis dmidecode`. The result is assumed here to be `/usr/sbin/dmidecode`; note this path, because the docker-compose yml file uses it later. - -### Obtain the IoTDB container image - -Contact your business representative or technical support to obtain the IoTDB enterprise container image. - -## Standalone deployment - 
-本节演示如何部署1C1D的docker单机版。 - -### load 镜像文件 - -比如这里获取的IoTDB的容器镜像文件名是:`iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz` - -load镜像: - -```Bash -docker load -i iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz -``` - -查看镜像: - -```Bash -docker images -``` - -![](/img/%E5%8D%95%E6%9C%BA-%E6%9F%A5%E7%9C%8B%E9%95%9C%E5%83%8F.png) - -### 创建docker bridge网络 - -```Bash -docker network create --driver=bridge --subnet=172.18.0.0/16 --gateway=172.18.0.1 iotdb -``` - -### 编写docker-compose的yml文件 - -这里我们以把IoTDB安装目录和yml文件统一放在`/docker-iotdb` 文件夹下为例: - -文件目录结构为:`/docker-iotdb/iotdb`, `/docker-iotdb/docker-compose-standalone.yml ` - -```Bash -docker-iotdb: -├── iotdb #iotdb安装目录 -│── docker-compose-standalone.yml #单机版docker-compose的yml文件 -``` - -完整的`docker-compose-standalone.yml`内容如下: - -```Bash -version: "3" -services: - iotdb-service: - image: iotdb-enterprise:1.3.2.3-standalone #使用的镜像 - hostname: iotdb - container_name: iotdb - restart: always - ports: - - "6667:6667" - environment: - - cn_internal_address=iotdb - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb:10710 - - dn_rpc_address=iotdb - - dn_internal_address=iotdb - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - dn_seed_config_node=iotdb:10710 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - networks: - iotdb: - ipv4_address: 172.18.0.6 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -networks: - iotdb: - external: true -``` - -### First startup - -Start with the following commands: - -```Bash -cd /docker-iotdb -docker-compose -f docker-compose-standalone.yml up -``` - -Because the database is not yet activated, it exits immediately on the first startup. This is expected: the first startup only generates the machine code file used in the activation process below. - -![](/img/%E5%8D%95%E6%9C%BA-%E6%BF%80%E6%B4%BB.png) - -### Apply for activation - -- After the first startup, a `system_info` file is generated in the host directory `/docker-iotdb/iotdb/activation`; send this file to Timecho staff. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- When you receive the license file from the staff, copy it into the `/docker-iotdb/iotdb/activation` folder. - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -### Start IoTDB again - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -![](/img/%E5%90%AF%E5%8A%A8iotdb.png) - -### Verify the deployment - -- Check the log; the following message indicates a successful start - -```Bash -docker logs -f iotdb # view the log (iotdb is the container_name set in docker-compose-standalone.yml) -2024-07-19 12:02:32,608 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
-``` - -![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B21.png) - -- Enter the container and check the service status and activation information - - List the running containers - - ```Bash - docker ps - ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B22.png) - - Enter the container, log in to the database through the CLI, and use the show cluster command to check the service status and activation status - - ```Bash - docker exec -it iotdb /bin/bash # enter the container - ./start-cli.sh -h iotdb # log in to the database - IoTDB> show cluster # check the status - ``` - - All services should be running and the activation status should show as activated. - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B23.png) - -### Map the /conf directory (optional) - -To edit the configuration files directly on the host later, map the container's /conf folder out in three steps: - -Step 1: copy the /conf directory from the container to `/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -Step 2: add the mapping in docker-compose-standalone.yml - -```Bash - volumes: - - ./iotdb/conf:/iotdb/conf # add this mapping for the /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -Step 3: restart IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -## Cluster deployment - -This section describes how to manually deploy an instance with 3 ConfigNodes and 3 DataNodes, commonly known as a 3C3D cluster. - -
- -
- -**注意:集群版目前只支持host网络和overlay 网络,不支持bridge网络。** - -下面以host网络为例演示如何部署3C3D集群。 - -### 设置主机名 - -假设现在有3台linux服务器,IP地址和服务角色分配如下: - -| 节点ip | 主机名 | 服务 | -| ----------- | ------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode、DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode、DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode、DataNode | - -在3台机器上分别配置主机名,设置主机名需要在目标服务器上配置/etc/hosts,使用如下命令: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### load镜像文件 - -比如获取的IoTDB的容器镜像文件名是:`iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz` - -在3台服务器上分别执行load镜像命令: - -```Bash -docker load -i iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz -``` - -查看镜像: - -```Bash -docker images -``` - -![](/img/%E9%95%9C%E5%83%8F%E5%8A%A0%E8%BD%BD.png) - -### 编写docker-compose的yml文件 - -这里我们以把IoTDB安装目录和yml文件统一放在/docker-iotdb文件夹下为例: - -文件目录结构为:`/docker-iotdb/iotdb`,`/docker-iotdb/confignode.yml`,`/docker-iotdb/datanode.yml` - -```Bash -docker-iotdb: -├── confignode.yml #confignode的yml文件 -├── datanode.yml #datanode的yml文件 -└── iotdb #IoTDB安装目录 -``` - -在每台服务器上都要编写2个yml文件,即`confignode.yml`和`datanode.yml`,yml示例如下: - -**confignode.yml:** - -```Bash -#confignode.yml -version: "3" -services: - iotdb-confignode: - image: iotdb-enterprise:1.3.2.3-standalone #使用的镜像 - hostname: iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - container_name: iotdb-confignode - command: ["bash", "-c", "entrypoint.sh confignode"] - restart: always - environment: - - cn_internal_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb-1:10710 #默认第一台为seed节点 - - schema_replication_factor=3 #元数据副本数 - - data_replication_factor=2 #数据副本数 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #使用host网络 - # Note: 
Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -**datanode.yml:** - -```Bash -#datanode.yml -version: "3" -services: - iotdb-datanode: - image: iotdb-enterprise:1.3.2.3-standalone #使用的镜像 - hostname: iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - container_name: iotdb-datanode - command: ["bash", "-c", "entrypoint.sh datanode"] - restart: always - ports: - - "6667:6667" - privileged: true - environment: - - dn_rpc_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - dn_internal_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - dn_seed_config_node=iotdb-1:10710 #默认第1台为seed节点 - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - schema_replication_factor=3 #元数据副本数 - - data_replication_factor=2 #数据副本数 - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #使用host网络 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -### 首次启动confignode - -先在3台服务器上分别启动confignode, 用来获取机器码,注意启动顺序,先启动第1台iotdb-1,再启动iotdb-2和iotdb-3。 - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d #后台启动 -``` - -### 申请激活 - -- 首次启动3个confignode后,在每个物理机目录`/docker-iotdb/iotdb/activation`下都会生成一个`system_info`文件,将3个服务器的`system_info`文件拷贝给天谋工作人员; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- 将3个license文件分别放入对应的ConfigNode节点的`/docker-iotdb/iotdb/activation`文件夹下; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -- license放入对应的activation文件夹后,confignode会自动激活,不用重启confignode - -### 启动datanode - -在3台服务器上分别启动datanode - -```Bash -cd /docker-iotdb -docker-compose -f datanode.yml up -d #后台启动 -``` - -![](/img/%E9%9B%86%E7%BE%A4%E7%89%88-dn%E5%90%AF%E5%8A%A8.png) - -### 验证部署 - -- 查看日志,有如下字样,表示datanode启动成功 - - ```Bash - docker logs -f iotdb-datanode #查看日志命令 - 2024-07-20 16:50:48,937 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
- ``` - - ![](/img/dn%E5%90%AF%E5%8A%A8.png) - -- Enter any one of the containers and check the service status and activation information - - List the running containers - - ```Bash - docker ps - ``` - - ![](/img/%E6%9F%A5%E7%9C%8B%E5%AE%B9%E5%99%A8.png) - - Enter the container, log in to the database through the CLI, and use the `show cluster` command to check the service status and activation status - - ```Bash - docker exec -it iotdb-datanode /bin/bash # enter the container - ./start-cli.sh -h iotdb-1 # log in to the database - IoTDB> show cluster # check the status - ``` - - All services should be running and the activation status should show as activated. - - ![](/img/%E9%9B%86%E7%BE%A4-%E6%BF%80%E6%B4%BB.png) - -### Map the /conf directory (optional) - -To edit the configuration files directly on the host later, map the container's /conf folder out in three steps: - -Step 1: on each of the 3 servers, copy the /conf directory from the container to `/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb-confignode:/iotdb/conf /docker-iotdb/iotdb/conf -# or -docker cp iotdb-datanode:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -Step 2: add the /conf directory mapping to `confignode.yml` and `datanode.yml` on the 3 servers - -```Bash -#confignode.yml - volumes: - - ./iotdb/conf:/iotdb/conf # add this mapping for the /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - -#datanode.yml - volumes: - - ./iotdb/conf:/iotdb/conf # add this mapping for the /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -Step 3: restart IoTDB on the 3 servers - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d -docker-compose -f datanode.yml up -d -``` \ No newline at end of file diff --git a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md b/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md deleted file mode 100644 index 545e0ca75..000000000 --- a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md +++ /dev/null @@ -1,163 +0,0 @@ - -# Dual-Active Deployment - -## What is dual-active? 
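- -Dual-active usually refers to two independent standalone instances (or clusters) that mirror each other in real time. Their configurations are fully independent, both can accept external writes at the same time, and each standalone instance (or cluster) can synchronize the data written to it to the other, so that the data on the two eventually becomes consistent. - -- The two standalone instances (or clusters) can form a high-availability group: when one stops serving, the other is not affected. When the stopped one starts again, the other synchronizes the newly written data to it. Applications can bind to both for reads and writes to achieve high availability. -- Dual-active deployment allows high availability with fewer than 3 physical nodes, which gives it an advantage in deployment cost. In addition, dual power and network loops can physically isolate the supply of the two standalone instances (or clusters) to keep operation stable. -- Dual-active is currently an enterprise edition feature. - -![](/img/%E5%8F%8C%E6%B4%BB%E5%90%8C%E6%AD%A5.png) - -## Notes - -1. When deploying, prefer `hostname` over raw IP addresses; this avoids the database failing to start after a host IP is changed later. To set the hostname, configure `/etc/hosts` on the target server. For example, if the local IP is 192.168.1.3 and the hostname is iotdb-1, use the following command to set the server's hostname, and use the hostname for IoTDB's `cn_internal_address` and `dn_internal_address`. - - ```Bash - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -2. Some parameters cannot be modified after the first startup; refer to the "Installation Steps" section below when setting them. - -3. Deploying the monitoring panel is recommended; it monitors key runtime metrics so you can track the database status at any time. Contact your business representative to obtain the monitoring panel; for deployment steps see the [documentation](https://www.timecho.com/docs/zh/UserGuide/latest/Deployment-and-Maintenance/Monitoring-panel-deployment.html) - -## Installation Steps - -This example builds a dual-active IoTDB from two standalone machines, A and B, whose IPs are 192.168.1.3 and 192.168.1.4. Hostnames are used to identify the hosts, planned as follows: - -| Machine | Machine IP | Hostname | -| ---- | ----------- | ------- | -| A | 192.168.1.3 | iotdb-1 | -| B | 192.168.1.4 | iotdb-2 | - -### Step1: Install two independent IoTDB instances - -Install IoTDB on each of the 2 machines. For standalone deployment see this [document](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md); for cluster deployment see this [document](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md). **Keeping all configuration items of A and B identical is recommended for the best dual-active result.** - -### Step2: Create a data synchronization task on machine A to machine B - -- Create the data synchronization pipeline on machine A, so that data on machine A is automatically synchronized to machine B. Use the CLI tool in the sbin directory to connect to the IoTDB database on A: - - ```Bash - ./sbin/start-cli.sh -h iotdb-1 - ``` - -- Create and start the data synchronization pipe with the following SQL: - - ```Bash - create pipe AB - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-2', - 'sink.port'='6667' - ) - ``` - -- Note: to avoid an infinite data loop, set the parameter `source.forwarding-pipe-requests` to `false` on both A and B, which means data received from another pipe is not forwarded. - -### Step3: Create a data synchronization task on machine B to machine A - - - Create the data synchronization pipeline on machine B, so that data on machine B is automatically synchronized to machine A. Use the CLI tool in the sbin directory to connect to the IoTDB database on B: - - ```Bash - ./sbin/start-cli.sh -h iotdb-2 - ``` - - 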
创建并启动pipe,SQL 如下: - - ```Bash - create pipe BA - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-1', - 'sink.port'='6667' - ) - ``` - -- 注意:为了避免数据无限循环,需要将A和B上的参数`source.forwarding-pipe-requests` 均设置为 `false`,表示不转发从另一pipe传输而来的数据。 - -### Step4:验证部署 - -上述数据同步流程创建完成后,即可启动双活集群。 - -#### 检查集群运行状态 - -```Bash -#在2个节点分别执行show cluster命令检查IoTDB服务状态 -show cluster -``` - -**机器A**: - -![](/img/%E5%8F%8C%E6%B4%BB-A.png) - -**机器B**: - -![](/img/%E5%8F%8C%E6%B4%BB-B.png) - -确保每一个 ConfigNode 和 DataNode 都处于 Running 状态。 - -#### 检查同步状态 - -- 机器A上检查同步状态 - -```Bash -show pipes -``` - -![](/img/show%20pipes-A.png) - -- 机器B上检查同步状态 - -```Bash -show pipes -``` - -![](/img/show%20pipes-B.png) - -确保每一个 pipe 都处于 RUNNING 状态。 - -### Step5:停止双活版 IoTDB - -- 在机器A的执行下列命令: - - ```SQL - ./sbin/start-cli.sh -h iotdb-1 #登录cli - IoTDB> stop pipe AB #停止数据同步流程 - ./sbin/stop-standalone.sh #停止数据库服务 - ``` - -- 在机器B的执行下列命令: - - ```SQL - ./sbin/start-cli.sh -h iotdb-2 #登录cli - IoTDB> stop pipe BA #停止数据同步流程 - ./sbin/stop-standalone.sh #停止数据库服务 - ``` \ No newline at end of file diff --git a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/IoTDB-Package_timecho.md b/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/IoTDB-Package_timecho.md deleted file mode 100644 index f824da365..000000000 --- a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/IoTDB-Package_timecho.md +++ /dev/null @@ -1,46 +0,0 @@ - -# 安装包获取 - -## 企业版获取方式 - -企业版安装包可通过产品试用申请,或直接联系与您对接的商务人员获取。 - -## 安装包结构 - -解压后安装包(iotdb-enterprise-{version}-bin.zip),安装包解压后目录结构如下: - -| **目录** | **类型** | **说明** | -| ---------------- | -------- | ------------------------------------------------------------ | -| activation | 文件夹 | 激活文件所在目录,包括生成的机器码以及从商务侧获取的企业版激活码(启动ConfigNode后才会生成该目录,即可获取激活码) | -| conf | 文件夹 | 配置文件目录,包含 ConfigNode、DataNode、JMX 和 logback 等配置文件 | -| data | 文件夹 | 默认的数据文件目录,包含 ConfigNode 和 DataNode 的数据文件。(启动程序后才会生成该目录) | -| lib | 文件夹 | IoTDB可执行库文件目录 | -| licenses | 文件夹 | 
开源社区证书文件目录 | -| logs | 文件夹 | 默认的日志文件目录,包含 ConfigNode 和 DataNode 的日志文件(启动程序后才会生成该目录) | -| sbin | 文件夹 | 主要脚本目录,包含启、停等脚本等 | -| tools | 文件夹 | 系统周边工具目录 | -| ext | 文件夹 | pipe,trigger,udf插件的相关文件(需要使用时用户自行创建) | -| LICENSE | 文件 | 证书 | -| NOTICE | 文件 | 提示 | -| README_ZH\.md | 文件 | markdown格式的中文版说明 | -| README\.md | 文件 | 使用说明 | -| RELEASE_NOTES\.md | 文件 | 版本说明 | diff --git a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Kubernetes_timecho.md b/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Kubernetes_timecho.md deleted file mode 100644 index 7fbc7be8d..000000000 --- a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Kubernetes_timecho.md +++ /dev/null @@ -1,445 +0,0 @@ - - -# Kubernetes - -## 1. 环境准备 - -### 1.1 准备 Kubernetes 集群 - -确保拥有一个可用的 Kubernetes 集群(建议最低版本:Kubernetes 1.24),作为部署 IoTDB 集群的基础。 - -Kubernetes 版本要求:建议版本为 Kubernetes 1.24及以上 - -IoTDB版本要求:不能低于v1.3.3 - -## 2. 创建命名空间 - -### 2.1 创建命名空间 - -> 注意:在执行命名空间创建操作之前,需验证所指定的命名空间名称在 Kubernetes 集群中尚未被使用。如果命名空间已存在,创建命令将无法执行,可能导致部署过程中的错误。 - -```Bash -kubectl create ns iotdb-ns -``` - -### 2.2 查看命名空间 - -```Bash -kubectl get ns -``` - -## 3. 创建 PersistentVolume (PV) - -### 3.1 创建 PV 配置文件 - -PV用于持久化存储IoTDB的ConfigNode 和 DataNode的数据,有几个节点就要创建几个PV。 - -> 注:1个ConfigNode和1个DataNode 算2个节点,需要2个PV。 - -以 3ConfigNode、3DataNode 为例: - -1. 创建 `pv.yaml` 文件,并复制六份,分别重命名为 `pv01.yaml` ~ `pv06.yaml`。 - -```Bash -#可新建个文件夹放yaml文件 -#创建 pv.yaml 文件语句 -touch pv.yaml -``` - -2. 
修改每个文件中的 `name` 和 `path` 以确保一致性。 - -**pv.yaml 示例:** - -```YAML -# pv.yaml -apiVersion: v1 -kind: PersistentVolume -metadata: - name: iotdb-pv-01 -spec: - capacity: - storage: 10Gi # 存储容量 - accessModes: # 访问模式 - - ReadWriteOnce - persistentVolumeReclaimPolicy: Retain # 回收策略 - # 存储类名称,如果使用本地静态存储storageClassName 不用配置,如果使用动态存储必需设置此项 - storageClassName: local-storage - # 根据你的存储类型添加相应的配置 - hostPath: # 如果是使用本地路径 - path: /data/k8s-data/iotdb-pv-01 - type: DirectoryOrCreate # 这行不配置就要手动创建文件夹 -``` - -### 3.2 应用 PV 配置 - -```Bash -kubectl apply -f pv01.yaml -kubectl apply -f pv-02.yaml -... -``` - -### 3.3 查看 PV - -```Bash -kubectl get pv -``` - - -### 3.4 手动创建文件夹 - -> 如果yaml里的hostPath-type未配置,需要手动创建对应的文件夹 - -在所有 Kubernetes 节点上创建对应的文件夹: - -```Bash -mkdir -p /data/k8s-data/iotdb-pv-01 -mkdir -p /data/k8s-data/iotdb-pv-02 -... -``` - -## 4. 安装 Helm - -安装Helm步骤请参考[Helm官网](https://helm.sh/zh/docs/intro/install/) - -## 5. 配置IoTDB的Helm Chart - -### 5.1 克隆 IoTDB Kubernetes 部署代码 - -请联系天谋工作人员获取IoTDB的Helm Chart - -### 5.2 修改 YAML 文件 - -> 确保使用的是支持的版本 >=1.3.3.2 - -**values.yaml 示例:** - -```YAML -nameOverride: "iotdb" -fullnameOverride: "iotdb" #软件安装后的名称 - -image: - repository: nexus.infra.timecho.com:8143/timecho/iotdb-enterprise - pullPolicy: IfNotPresent - tag: 1.3.3.2-standalone #软件所用的仓库和版本 - -storage: -# 存储类名称,如果使用本地静态存储storageClassName 不用配置,如果使用动态存储必需设置此项 - className: local-storage - -datanode: - name: datanode - nodeCount: 3 #datanode的节点数量 - enableRestService: true - storageCapacity: 10Gi #datanode的可用空间大小 - resources: - requests: - memory: 2Gi #datanode的内存初始化大小 - cpu: 1000m #datanode的CPU初始化大小 - limits: - memory: 4Gi #datanode的最大内存大小 - cpu: 1000m #datanode的最大CPU大小 - -confignode: - name: confignode - nodeCount: 3 #confignode的节点数量 - storageCapacity: 10Gi #confignode的可用空间大小 - resources: - requests: - memory: 512Mi #confignode的内存初始化大小 - cpu: 1000m #confignode的CPU初始化大小 - limits: - memory: 1024Mi #confignode的最大内存大小 - cpu: 2000m #confignode的最大CPU大小 - configNodeConsensusProtocolClass: 
org.apache.iotdb.consensus.ratis.RatisConsensus - schemaReplicationFactor: 3 - schemaRegionConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - dataReplicationFactor: 2 - dataRegionConsensusProtocolClass: org.apache.iotdb.consensus.iot.IoTConsensus -``` - -## 6. 配置私库信息或预先使用ctr拉取镜像 - -在k8s上配置私有仓库的信息,为下一步helm install的前置步骤。 - -方案一即在 helm install 时拉取可用的iotdb镜像,方案二则是提前将可用的iotdb镜像导入到containerd里。 - -### 6.1 【方案一】从私有仓库拉取镜像 - -#### 6.1.1 创建secret 使k8s可访问iotdb-helm的私有仓库 - -下文中“xxxxxx”表示IoTDB私有仓库的账号、密码、邮箱。 - -```Bash -# 注意 单引号 -kubectl create secret docker-registry timecho-nexus \ - --docker-server='nexus.infra.timecho.com:8143' \ - --docker-username='xxxxxx' \ - --docker-password='xxxxxx' \ - --docker-email='xxxxxx' \ - -n iotdb-ns - -# 查看secret -kubectl get secret timecho-nexus -n iotdb-ns -# 查看并输出为yaml -kubectl get secret timecho-nexus --output=yaml -n iotdb-ns -# 查看并解密 -kubectl get secret timecho-nexus --output="jsonpath={.data.\.dockerconfigjson}" -n iotdb-ns | base64 --decode -``` - -#### 6.1.2 将secret作为一个patch加载到命名空间iotdb-ns - -```Bash -# 添加一个patch,使该命名空间增加登陆nexus的登陆信息 -kubectl patch serviceaccount default -n iotdb-ns -p '{"imagePullSecrets": [{"name": "timecho-nexus"}]}' - -# 查看命名空间的该条信息 -kubectl get serviceaccounts -n iotdb-ns -o yaml -``` - -### 6.2 【方案二】导入镜像 - -该步骤用于客户无法连接私库的场景,需要联系公司实施同事辅助准备。 - -#### 6.2.1 拉取并导出镜像: - -```Bash -ctr images pull --user xxxxxxxx nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.2 查看并导出镜像: - -```Bash -# 查看 -ctr images ls - -# 导出 -ctr images export iotdb-enterprise:1.3.3.2-standalone.tar nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.3 导入到k8s的namespace下: - -> 注意,k8s.io为示例环境中k8s的ctr的命名空间,导入到其他命名空间是不行的 - -```Bash -# 导入到k8s的namespace下 -ctr -n k8s.io images import iotdb-enterprise:1.3.3.2-standalone.tar -``` - -#### 6.2.4 查看镜像 - -```Bash -ctr --namespace k8s.io images list | grep 1.3.3.2 -``` - -## 7. 
安装 IoTDB - -### 7.1 安装 IoTDB - -```Bash -# 进入文件夹 -cd iotdb-cluster-k8s/helm - -# 安装iotdb -helm install iotdb ./ -n iotdb-ns -``` - -### 7.2 查看 Helm 安装列表 - -```Bash -# helm list -helm list -n iotdb-ns -``` - -### 7.3 查看 Pods - -```Bash -# 查看 iotdb的pods -kubectl get pods -n iotdb-ns -o wide -``` - -执行命令后,输出了带有confignode和datanode标识的各3个Pods,,总共6个Pods,即表明安装成功;需要注意的是,并非所有Pods都处于Running状态,未激活的datanode可能会持续重启,但在激活后将恢复正常。 - -### 7.4 发现故障的排除方式 - -```Bash -# 查看k8s的创建log -kubectl get events -n iotdb-ns -watch kubectl get events -n iotdb-ns - -# 获取详细信息 -kubectl describe pod confignode-0 -n iotdb-ns -kubectl describe pod datanode-0 -n iotdb-ns - -# 查看confignode日志 -kubectl logs -n iotdb-ns confignode-0 -f -``` - -## 8. 激活 IoTDB - -### 8.1 方案1:直接在 Pod 中激活(最快捷) - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-1 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-2 -- /iotdb/sbin/start-activate.sh -# 拿到机器码后进行激活 -``` - -### 8.2 方案2:进入confignode的容器中激活 - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /bin/bash -cd /iotdb/sbin -/bin/bash start-activate.sh -# 拿到机器码后进行激活 -# 退出容器 -``` - -### 8.3 方案3:手动激活 - -1. 查看 ConfigNode 详细信息,确定所在节点: - -```Bash -kubectl describe pod confignode-0 -n iotdb-ns | grep -e "Node:" -e "Path:" - -# 结果示例: -# Node: a87/172.20.31.87 -# Path: /data/k8s-data/env/confignode/.env -``` - -2. 查看 PVC 并找到 ConfigNode 对应的 Volume,确定所在路径: - -```Bash -kubectl get pvc -n iotdb-ns | grep "confignode-0" - -# 结果示例: -# map-confignode-confignode-0 Bound iotdb-pv-04 10Gi RWO local-storage 8h - -# 如果要查看多个confignode,使用如下: -for i in {0..2}; do echo confignode-$i;kubectl describe pod confignode-${i} -n iotdb-ns | grep -e "Node:" -e "Path:"; echo "----"; done -``` - -3. 查看对应 Volume 的详细信息,确定物理目录的位置: - -```Bash -kubectl describe pv iotdb-pv-04 | grep "Path:" - -# 结果示例: -# Path: /data/k8s-data/iotdb-pv-04 -``` - -4. 
从对应节点的对应目录下找到 system-info 文件,使用该 system-info 作为机器码生成激活码,并在同级目录新建文件 license,将激活码写入到该文件。 - -## 9. 验证 IoTDB - -### 9.1 查看命名空间内的 Pods 状态 - -查看iotdb-ns命名空间内的IP、状态等信息,确定全部运行正常 - -```Bash -kubectl get pods -n iotdb-ns -o wide - -# 结果示例: -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` - -### 9.2 查看命名空间内的端口映射情况 - -```Bash -kubectl get svc -n iotdb-ns - -# 结果示例: -# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE -# confignode-svc NodePort 10.10.226.151 80:31026/TCP 7d8h -# datanode-svc NodePort 10.10.194.225 6667:31563/TCP 7d8h -# jdbc-balancer LoadBalancer 10.10.191.209 6667:31895/TCP 7d8h -``` - -### 9.3 在任意服务器启动 CLI 脚本验证 IoTDB 集群状态 - -端口即jdbc-balancer的端口,服务器为k8s任意节点的IP - -```Bash -start-cli.sh -h 172.20.31.86 -p 31895 -start-cli.sh -h 172.20.31.87 -p 31895 -start-cli.sh -h 172.20.31.88 -p 31895 -``` - - - -## 10. 扩容 - -### 10.1 新增pv - -新增pv,必须有可用的pv才可以扩容。 - - - -**注意:DataNode重启后无法加入集群** - -**原因**:配置了静态存储的 hostPath 模式,并通过脚本修改了 `iotdb-system.properties` 文件,将 `dn_data_dirs` 设为 `/iotdb6/iotdb_data,/iotdb7/iotdb_data`,但未将默认存储路径 `/iotdb/data` 进行外挂,导致重启时数据丢失。 - -**解决方案**:是将 `/iotdb/data` 目录也进行外挂操作,且 ConfigNode 和 DataNode 均需如此设置,以确保数据完整性和集群稳定性。 - -### 10.2 扩容confignode - -示例:3 confignode 扩容为 4 confignode - -修改iotdb-cluster-k8s/helm的values.yaml文件,将confignode的3改成4 - -```Shell -helm upgrade iotdb . -n iotdb-ns -``` - - - - -### 10.3 扩容datanode - -示例:3 datanode 扩容为 4 datanode - -修改iotdb-cluster-k8s/helm的values.yaml文件,将datanode的3改成4 - -```Shell -helm upgrade iotdb . 
-n iotdb-ns -``` - -### 10.4 验证IoTDB状态 - -```Shell -kubectl get pods -n iotdb-ns -o wide - -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -# datanode-3 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` \ No newline at end of file diff --git a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md b/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md deleted file mode 100644 index 1f9a38b46..000000000 --- a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md +++ /dev/null @@ -1,325 +0,0 @@ - -# 单机版部署 - -本章将介绍如何启动IoTDB单机实例,IoTDB单机实例包括 1 个ConfigNode 和1个DataNode(即通常所说的1C1D)。 - -## 注意事项 - -1. 安装前请确认系统已参照[系统配置](./Environment-Requirements.md)准备完成。 - -2. 部署时推荐优先使用`hostname`进行IP配置,可避免后期修改主机ip导致数据库无法启动的问题。设置hostname需要在目标服务器上配置/etc/hosts,如本机ip是192.168.1.3,hostname是iotdb-1,则可以使用以下命令设置服务器的 hostname,并使用hostname配置IoTDB的`cn_internal_address`、dn_internal_address、dn_rpc_address。 - - ```shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -3. 部分参数首次启动后不能修改,请参考下方的【参数配置】章节进行设置 - -4. 无论是在linux还是windows中,请确保IoTDB的安装路径中不含空格和中文,避免软件运行异常。 - -5. 请注意,安装部署(包括激活和使用软件)IoTDB时需要保持使用同一个用户进行操作,您可以: -- 使用 root 用户(推荐):使用 root 用户可以避免权限等问题。 -- 使用固定的非 root 用户: - - 使用同一用户操作:确保在启动、激活、停止等操作均保持使用同一用户,不要切换用户。 - - 避免使用 sudo:尽量避免使用 sudo 命令,因为它会以 root 用户权限执行命令,可能会引起权限混淆或安全问题。 - -6. 推荐部署监控面板,可以对重要运行指标进行监控,随时掌握数据库运行状态,监控面板可以联系商务获取,部署监控面板步骤可以参考:[监控面板部署](./Monitoring-panel-deployment.md)。 - -7. 
在安装部署数据库前,可以使用健康检查工具检测 IoTDB 节点运行环境,并获取详细的检查结果。 IoTDB 健康检查工具使用方法可以参考:[健康检查工具](../Tools-System/Health-Check-Tool.md)。 - -## 安装步骤 - -### 前置检查 - -为确保您获取的IoTDB企业版安装包完整且正确,在执行安装部署前建议您进行SHA512校验。 - -#### 准备工作: - -- 获取官方发布的 SHA512 校验码:[发布历史](../IoTDB-Introduction/Release-history_timecho.md)文档中各版本对应的"SHA512校验码" - -#### 校验步骤(以 linux 为例): - -1. 打开终端,进入安装包所在目录(如`/data/iotdb`): - ```Bash - cd /data/iotdb - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -![img](/img/sha512-02.png) - -4. 对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行IoTDB企业版的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - - -### 解压安装包并进入安装目录 - -```shell -unzip iotdb-enterprise-{version}-bin.zip -cd iotdb-enterprise-{version}-bin -``` - -### 参数配置 - -#### 环境脚本配置 - -- ./conf/confignode-env.sh(./conf/confignode-env.bat)配置 - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------: | :------------------------------------: | :--------: | :----------------------------------------------: | :----------: | -| MEMORY_SIZE | IoTDB ConfigNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的30% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -- ./conf/datanode-env.sh(./conf/datanode-env.bat)配置 - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------: | :----------------------------------: |:----------------------:| :----------------------------------------------: | :----------: | -| MEMORY_SIZE | IoTDB DataNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的50% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -#### 系统通用配置 - -打开通用配置文件(./conf/iotdb-system.properties 文件),设置以下参数: - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :-----------------------: | :------------------------------: | :------------: | :----------------------------------------------: |:---------------------------------------------:| -| cluster_name | 集群名称 | defaultCluster | 可根据需要设置集群名称,如无特殊需要保持默认即可 | 首次启动后不可修改,V1.3.3及之后版本支持热加载,但不建议手动修改该参数 | -| 
schema_replication_factor | 元数据副本数,单机版此处设置为 1 | 1 | 1 | 默认1,首次启动后不可修改 | -| data_replication_factor | 数据副本数,单机版此处设置为 1 | 1 | 1 | 默认1,首次启动后不可修改 | - -#### ConfigNode配置 - -打开ConfigNode配置文件(./conf/iotdb-system.properties文件),设置以下参数: - -| **配置项** | **说明** | **默认** | 推荐值 | **备注** | -| :-----------------: | :----------------------------------------------------------: | :-------------: | :----------------------------------------------: | :----------------: | -| cn_internal_address | ConfigNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | 首次启动后不能修改 | -| cn_internal_port | ConfigNode在集群内部通讯使用的端口 | 10710 | 10710 | 首次启动后不能修改 | -| cn_consensus_port | ConfigNode副本组共识协议通信使用的端口 | 10720 | 10720 | 首次启动后不能修改 | -| cn_seed_config_node | 节点注册加入集群时连接的ConfigNode 的地址,cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | 首次启动后不能修改 | - -#### DataNode 配置 - -打开DataNode配置文件 ./conf/iotdb-system.properties,设置以下参数: - -| **配置项** | **说明** | **默认** | 推荐值 | **备注** | -| :------------------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------- | :----------------- | -| dn_rpc_address | 客户端 RPC 服务的地址 |0.0.0.0 | 所在服务器的IPV4地址或hostname,推荐使用所在服务器的IPV4地址 | 重启服务生效 | -| dn_rpc_port | 客户端 RPC 服务的端口 | 6667 | 6667 | 重启服务生效 | -| dn_internal_address | DataNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | 首次启动后不能修改 | -| dn_internal_port | DataNode在集群内部通信使用的端口 | 10730 | 10730 | 首次启动后不能修改 | -| dn_mpp_data_exchange_port | DataNode用于接收数据流使用的端口 | 10740 | 10740 | 首次启动后不能修改 | -| dn_data_region_consensus_port | DataNode用于数据副本共识协议通信使用的端口 | 10750 | 10750 | 首次启动后不能修改 | -| dn_schema_region_consensus_port | DataNode用于元数据副本共识协议通信使用的端口 | 10760 | 10760 | 首次启动后不能修改 | -| dn_seed_config_node | 节点注册加入集群时连接的ConfigNode地址,即cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | 首次启动后不能修改 | - -> ❗️注意:VSCode 
Remote等编辑器无自动保存配置功能,请确保修改的文件被持久化保存,否则配置项无法生效 - -### 启动及激活数据库 (V 1.3.4 及以后的 1.x 版本) - -#### 启动 ConfigNode 节点 - -进入iotdb的sbin目录下,启动confignode - -```shell -./start-confignode.sh -d #“-d”参数将在后台进行启动 -``` -如果启动失败,请参考[常见问题](#常见问题)。 - -#### 启动 DataNode 节点 - -进入iotdb的sbin目录下,启动datanode: - -```shell -./start-datanode.sh -d #-d参数将在后台进行启动 -``` - -#### 激活数据库 - -##### 通过 CLI 激活 - -- 进入 CLI - - ```SQL - ./sbin/start-cli.sh -``` - -- 执行以下内容获取激活所需机器码: - - ```Bash - show system info - ``` - -- 将返回机器码复制给天谋工作人员: - -```Bash -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -It costs 0.030s -``` - -- 将工作人员返回的激活码输入到CLI中,输入以下内容 - - 注:激活码前后需要用`'`符号进行标注,如所示 - -```Bash -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` -### 启动及激活数据库 (V 1.3.4 之前版本) - -#### 启动 ConfigNode 节点 - -进入iotdb的sbin目录下,启动confignode - -```shell -./start-confignode.sh -d #“-d”参数将在后台进行启动 -``` -如果启动失败,请参考[常见问题](#常见问题)。 - -#### 激活数据库 - -##### 方式一:激活文件拷贝激活 - -- 启动confignode节点后,进入activation文件夹, 将 system_info文件复制给天谋工作人员 -- 收到工作人员返回的 license文件 -- 将license文件放入对应节点的activation文件夹下; - -##### 方式二:激活脚本激活 - -- 获取激活所需机器码,进入安装目录的sbin目录,执行激活脚本: - -```shell - cd sbin -./start-activate.sh -``` - -- 显示如下信息,请将机器码(即该串字符)复制给天谋工作人员: - -```shell -Please copy the system_info's content and send it to Timecho: -01-KU5LDFFN-PNBEHDRH -Please enter license: -``` - -- 将工作人员返回的激活码输入上一步的命令行提示处 `Please enter license:`,如下提示: - -```shell -Please enter license: 
-Jw+MmF+AtexsfgNGOFgTm83BgXbq0zT1+fOfPvQsLlj6ZsooHFU6HycUSEGC78eT1g67KPvkcLCUIsz2QpbyVmPLr9x1+kVjBubZPYlVpsGYLqLFc8kgpb5vIrPLd3hGLbJ5Ks8fV1WOVrDDVQq89YF2atQa2EaB9EAeTWd0bRMZ+s9ffjc/1Zmh9NSP/T3VCfJcJQyi7YpXWy5nMtcW0gSV+S6fS5r7a96PjbtE0zXNjnEhqgRzdU+mfO8gVuUNaIy9l375cp1GLpeCh6m6pF+APW1CiXLTSijK9Qh3nsL5bAOXNeob5l+HO5fEMgzrW8OJPh26Vl6ljKUpCvpTiw== -License has been stored to sbin/../activation/license -Import completed. Please start cluster and excute 'show cluster' to verify activation status -``` - -#### 启动 DataNode 节点 - -进入iotdb的sbin目录下,启动datanode: - -```shell -./start-datanode.sh -d #-d参数将在后台进行启动 -``` - - -### 验证部署 - -可直接执行 ./sbin 目录下的 Cli 启动脚本: - -```shell -./start-cli.sh -h ip(本机ip或域名) -p 端口号(6667) -``` - -成功启动后,出现如下界面显示IOTDB安装成功。 - -![](/img/%E5%90%AF%E5%8A%A8%E6%88%90%E5%8A%9F.png) - -出现安装成功界面后,继续看下是否激活成功,使用`show cluster`命令 - -当看到最右侧显示ACTIVATED表示激活成功 - -![](/img/show%20cluster.png) - -还可在 CLI 中通过执行 `show activation` 命令查看激活状态,示例如下,状态显示为ACTIVATED表示激活成功 - -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -> 出现`ACTIVATED(W)`为被动激活,表示此ConfigNode没有license文件(或没有签发时间戳最新的license文件)。此时建议检查license文件是否已放入license文件夹,没有请放入license文件,若已存在license文件,可能是此节点license文件与其他节点信息不一致导致,请联系天谋工作人员重新申请. - - - -## 常见问题 - -1. 部署过程中多次提示激活失败 - - 使用 `ls -al` 命令:使用 `ls -al` 命令检查安装包根目录的所有者信息是否为当前用户。 - - 检查激活目录:检查 `./activation` 目录下的所有文件,所有者信息是否为当前用户。 - -2. Confignode节点启动失败 - - 步骤 1: 请查看启动日志,检查是否修改了某些首次启动后不可改的参数。 - - 步骤 2: 请查看启动日志,检查是否出现其他异常。日志中若存在异常现象,请联系天谋技术支持人员咨询解决方案。 - - 步骤 3: 如果是首次部署或者数据可删除,也可按下述步骤清理环境,重新部署后,再次启动。 - - 步骤 4: 清理环境: - - a. 结束所有 ConfigNode 和 DataNode 进程。 - - ```Bash - # 1. 
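Stop the ConfigNode and DataNode services
-    sbin/stop-standalone.sh
-
-    # 2. Check whether any processes are left over
-    jps
-    # or
-    ps -ef|grep iotdb
-
-    # 3. If any processes remain, kill them manually
-    kill -9
-    # If you are sure only one IoTDB runs on this machine, the following command cleans up the leftover processes
-    ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9
-    ```
-  b. Delete the data and logs directories.
-
-  Note: deleting the data directory is required. Deleting the logs directory only gives you clean logs and is optional.
-    ```Bash
-    cd /data/iotdb
-    rm -rf data logs
-    ```
\ No newline at end of file
diff --git a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/workbench-deployment_timecho.md b/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/workbench-deployment_timecho.md
deleted file mode 100644
index 89dfb6ab1..000000000
--- a/src/zh/UserGuide/V1.3.x/Deployment-and-Maintenance/workbench-deployment_timecho.md
+++ /dev/null
@@ -1,242 +0,0 @@
-
-# Visualization Console Deployment
-
-The visualization console is one of the companion tools of IoTDB (similar to Navicat for MySQL). It is the official tool suite for the deployment, operations, and application development stages of the database. It makes using, operating, and managing the database simpler and more efficient, and lowers the cost of management and maintenance. This document helps you install Workbench.
-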
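-
-For instructions on using the visualization console, see the [User Guide](../Tools-System/Workbench_timecho.md).
-
-## Installation Preparation
-
-| Item | Name | Version requirement | Official link |
-| :------: | :-----------------------: | :----------------------------------------------------------: | :----------------------------------------------------: |
-| Operating system | Windows or Linux | - | - |
-| Runtime environment | JDK | Console versions 1.5.4 and earlier need JDK >= 1.8; console versions 1.5.5 and later need JDK >= 17 (choose the ARM or x64 package to match your machine) | https://www.oracle.com/java/technologies/downloads/ |
-| Related software | Prometheus | >= V2.30.3 | https://prometheus.io/download/ |
-| Database | IoTDB | Enterprise edition >= V1.2.0 | Contact sales or technical support |
-| Console | IoTDB-Workbench-`` | - | Choose a version from the compatibility table in the appendix, then contact sales or technical support |
-
-### Pre-installation Check
-
-To make sure the visualization console package you received is complete and correct, we recommend verifying its SHA512 checksum before installation.
-
-#### Preparation:
-
-- Obtain the officially published SHA512 checksum: contact Timecho staff
-
-#### Verification steps (Linux as an example):
-
-1. Open a terminal and go to the directory that contains the package (for example `/data/workbench`):
-   ```Bash
-   cd /data/workbench
-   ```
-2. Run the following command to compute the hash:
-   ```Bash
-   sha512sum IoTDB-Workbench-``.zip
-   ```
-3. The terminal prints the result (SHA512 checksum on the left, file name on the right):
-
-![img](/img/sha512-03.png)
-
-4. Compare the output with the official SHA512 checksum. If they match, continue with the installation steps below.
-
-#### Notes:
-
-- If the checksums do not match, contact Timecho staff for a new package
-- If a "file not found" message appears during verification, check that the file path is correct and that the package was downloaded completely
-
-
-## Installation Steps
-
-### Step 1: Enable metrics collection in IoTDB
-
-1. Enable the monitoring configuration items. Monitoring-related items are disabled by default in IoTDB. Enable them before deploying the monitoring panel (the services must be restarted after the change).
-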
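-| Configuration item | Configuration file | Description |
-| --- | --- | --- |
-| cn_metric_reporter_list | conf/iotdb-system.properties | Add this item to the configuration file and set it to PROMETHEUS |
-| cn_metric_level | conf/iotdb-system.properties | Add this item to the configuration file and set it to IMPORTANT |
-| cn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this item to the configuration file. The default 9091 can be kept; any other port works as long as it does not conflict with a port already in use |
-| dn_metric_reporter_list | conf/iotdb-system.properties | Add this item to the configuration file and set it to PROMETHEUS |
-| dn_metric_level | conf/iotdb-system.properties | Add this item to the configuration file and set it to IMPORTANT |
-| dn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this item to the configuration file. The default 9092 can be kept; any other port works as long as it does not conflict with a port already in use |
-| dn_metric_internal_reporter_type | conf/iotdb-system.properties | Add this item to the configuration file and set it to IOTDB |
-| enable_audit_log | conf/iotdb-system.properties | Add this item to the configuration file and set it to true |
-| audit_log_storage | conf/iotdb-system.properties | Add this item to the configuration file and set it to IOTDB,LOGGER |
-| audit_log_operation | conf/iotdb-system.properties | Add this item to the configuration file and set it to DML,DDL,QUERY |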
-
-2. Restart all nodes. After changing the metrics configuration on the 3 nodes, restart the ConfigNode and DataNode services on every node:
-
-    ```shell
-    ./sbin/stop-standalone.sh      # stop the ConfigNode and DataNode first
-    ./sbin/start-confignode.sh  -d # start the ConfigNode
-    ./sbin/start-datanode.sh  -d   # start the DataNode
-    ```
-
-3. After the restart, check the status of each node from the client. If every node is Running, the configuration succeeded:
-
-    ![](/img/%E5%90%AF%E5%8A%A8.png)
-
-### Step 2: Install and configure Prometheus
-
-1. Make sure Prometheus is installed (official guide: https://prometheus.io/docs/introduction/first_steps/)
-2. Extract the package and enter the extracted directory:
-
-    ```Shell
-    tar xvfz prometheus-*.tar.gz
-    cd prometheus-*
-    ```
-
-3. Edit the configuration. Modify prometheus.yml as follows:
-    1. Add a confignode job that collects ConfigNode metrics
-    2. Add a datanode job that collects DataNode metrics
-
-    ```YAML
-    global:
-      scrape_interval: 15s
-      evaluation_interval: 15s
-    scrape_configs:
-      - job_name: "prometheus"
-        static_configs:
-          - targets: ["localhost:9090"]
-      - job_name: "confignode"
-        static_configs:
-          - targets: ["iotdb-1:9091","iotdb-2:9091","iotdb-3:9091"]
-        honor_labels: true
-      - job_name: "datanode"
-        static_configs:
-          - targets: ["iotdb-1:9092","iotdb-2:9092","iotdb-3:9092"]
-        honor_labels: true
-    ```
-
-4. Start Prometheus. Prometheus keeps monitoring data for 15 days by default. In production, we recommend raising this to 180 days or more so that a longer history can be traced. The start command is:
-
-    ```Shell
-    ./prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=180d
-    ```
-
-5. Confirm that startup succeeded. Open `http://IP:port` in a browser to reach Prometheus, then go to the Targets page under Status. When every State is Up, the configuration works and the nodes are reachable.
-
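The check in step 5 can also be done from a terminal. The following is a minimal sketch, not part of the original guide. It assumes Prometheus listens on `localhost:9090` (the address of the `prometheus` job above) and that the response of the Prometheus HTTP API endpoint `/api/v1/targets` is compact JSON in which each active target carries a `"health"` field. `PROM_URL` and `count_unhealthy_targets` are hypothetical names chosen for this example.

```shell
#!/usr/bin/env bash
# Sketch: count Prometheus scrape targets that are not "up".
# Assumption: Prometheus is reachable at localhost:9090; override PROM_URL otherwise.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Reads the JSON body of /api/v1/targets on stdin and prints the number of
# targets whose "health" field is anything other than "up".
# "|| true" keeps the exit status at 0 when the count is 0 (grep -c exits 1 then).
count_unhealthy_targets() {
  grep -o '"health":"[a-z]*"' | grep -vc '"health":"up"' || true
}

# Example call (needs a running Prometheus, so it is left commented out):
# curl -s "$PROM_URL/api/v1/targets" | count_unhealthy_targets
```

A printed value of 0 corresponds to every State being Up on the Targets page. Any other value means at least one ConfigNode or DataNode endpoint is not being scraped successfully.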
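-
-### Step 3: Install Workbench
-
-1. Go to the config directory of iotdb-Workbench-``
-
-2. Edit the Workbench configuration file: in the `config` folder, modify `application-prod.properties`. No change is needed for a local installation. When deploying on a server, update the IP address
-    > Workbench can run locally or on a cloud server, as long as it can connect to IoTDB
-
-    | Item | Before | After |
-    | ---------------- | --------------------------------- | -------------------------------------- |
-    | pipe.callbackUrl | pipe.callbackUrl=`http://127.0.0.1` | pipe.callbackUrl=`http://<IP address of the Workbench host>` |
-
-    ![](/img/workbench-conf-1.png)
-
-3. Start the program: run the start command in the sbin folder of IoTDB-Workbench-``
-
-    Windows:
-    ```shell
-    # Start Workbench in the background
-    start.bat -d
-    ```
-
-    Linux:
-    ```shell
-    # Start Workbench in the background
-    ./start.sh -d
-    ```
-
-4. Use the `jps` command to check whether startup succeeded. Output like the following indicates success:
-
-    ![](/img/windows-jps.png)
-
-5. Verify the installation: open "`http://server-ip:port-from-configuration-file`" in a browser, for example "`http://127.0.0.1:9190`". If the login page appears, the installation succeeded
-
-    ![](/img/workbench.png)
-
-
-## Appendix: IoTDB and Console Version Compatibility
-
-| **Console version** | **Release notes** | **Supported IoTDB versions** |
-|------------|--------------------------------------------------------|----------------|
-| V1.5.7 | Measurement names in the measurement list are split into device name and measurement; the measurement selection area supports horizontal scrolling; exported file columns follow the on-page order | V1.3.4 and later 1.x versions |
-| V1.5.6 | Improved CSV import and export: on import, tags and aliases are optional; on export, measurement descriptions that contain quotes wrapped in backticks are supported | V1.3.4 and later 1.x versions |
-| V1.5.5 | Added a server clock; supports activating the enterprise edition database | V1.3.4 and later 1.x versions |
-| V1.5.4 | Added authentication for the Prometheus settings in instance management | V1.3.4 and later 1.x versions |
-| V1.5.1 | Added AI analysis and pattern matching | V1.3.2 and later 1.x versions |
-| V1.4.0 | Added tree model display and an English version | V1.3.2 and later 1.x versions |
-| V1.3.1 | Added analysis methods to the analysis feature; improved template import and other features | V1.3.2 and later 1.x versions |
-| V1.3.0 | Added database configuration; refined some version details | V1.3.2 and later 1.x versions |
-| V1.2.6 | Improved permission control across modules | V1.3.1 and later 1.x versions |
-| V1.2.5 | Added "common templates" to visualization; added page caching and other improvements to all pages | V1.3.0 and later 1.x versions |
-| V1.2.4 | Added import and export to the calculation feature; added a "time alignment" field to the measurement list | V1.2.2 and later 1.x versions |
-| V1.2.3 | Added "activation details" to the home page; added analysis and other features | V1.2.2 and later 1.x versions |
-| V1.2.2 | Improved the "measurement description" display and other features | V1.2.2 and later 1.x versions |
-| V1.2.1 | Added a "monitoring panel" to the data synchronization page; improved Prometheus prompts | V1.2.2 and later 1.x versions |
-| V1.2.0 | New Workbench version upgrade | V1.2.0 and later 1.x versions |
diff --git a/src/zh/UserGuide/V1.3.x/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md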
b/src/zh/UserGuide/V1.3.x/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md
deleted file mode 100644
index ff570158b..000000000
--- a/src/zh/UserGuide/V1.3.x/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md
+++ /dev/null
@@ -1,274 +0,0 @@
-
-# Ignition
-
-## Product Overview
-
-1. About Ignition
-
-Ignition is a web-based supervisory control and data acquisition (SCADA) tool: an open, extensible, general-purpose platform. Ignition makes it easier to control, track, display, and analyze all of an enterprise's data. For details, see the [Ignition website](https://docs.inductiveautomation.com/docs/8.1/getting-started/introducing-ignition)
-
-2. About the Ignition-IoTDB Connector
-
-    The Ignition-IoTDB Connector consists of two modules: the Ignition-IoTDB connector and Ignition-IoTDB With JDBC.
-
-    - Ignition-IoTDB connector: stores data collected by Ignition in IoTDB and supports reading data in Components. It also injects the `system.iotdb.insert` and `system.iotdb.query` script interfaces for use in Ignition scripting
-    - Ignition-IoTDB With JDBC: can be used in the `Transaction Groups` module for custom writes and queries. It is not applicable to the `Tag Historian` module.
-
-    The figure below shows how the two modules relate to Ignition.
-
-    ![](/img/Ignition.png)
-
-## Installation Requirements
-
-| **Item** | **Version requirement** |
-| :------------------------: | :------------------------------------------------------------: |
-| IoTDB | V1.3.1 or later must be installed. See the IoTDB [deployment guide](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) |
-| Ignition | An 8.1.x release (8.1.37 or later) must be installed. See the Ignition [installation guide](https://docs.inductiveautomation.com/docs/8.1/getting-started/installing-and-upgrading) (contact sales about support for other versions) |
-| Ignition-IoTDB connector module | Contact sales |
-| Ignition-IoTDB With JDBC module | Download: https://repo1.maven.org/maven2/org/apache/iotdb/iotdb-jdbc/ |
-
-## Using the Ignition-IoTDB Connector
-
-### Introduction
-
-The Ignition-IoTDB connector module stores data in the database connection associated with a historian provider. Data is stored directly in tables of the SQL database according to its data type, together with a millisecond timestamp. Based on the value mode and deadband settings of each tag, data is stored only on change, which avoids duplicate and unnecessary storage.
-
-The Ignition-IoTDB connector stores data collected by Ignition in IoTDB.
-
-### Installation Steps
-
-Step 1: Go to `Config` - `System` - `Modules` and click `Install or Upgrade a Module...` at the bottom
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-1.png)
-
-Step 2: Select the `modl` file you obtained, upload it, click `Install`, and trust the certificate.
-
-![](/img/ignition-3.png)
-
-Step 3: After installation, the following is shown
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-3.png)
-
-Step 4: Go to `Config` - `Tags` - `History` and click `Create new Historical Tag Provider...` at the bottom
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-4.png)
-
-Step 5: Select `IoTDB` and fill in the configuration
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-5.png)
-
-The configuration items are:
-
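-| Name | Meaning | Default | Notes |
-| --- | --- | --- | --- |
-| **Main** | | | |
-| Provider Name | Provider name | - | |
-| Enabled | | true | The Provider can be used only when this is true |
-| Description | Notes | - | |
-| **IoTDB Settings** | | | |
-| Host Name | Address of the target IoTDB instance | - | |
-| Port Number | Port of the target IoTDB instance | 6667 | |
-| Username | Username for the target IoTDB | - | |
-| Password | Password for the target IoTDB | - | |
-| Database Name | Name of the database to store data in; starts with root, for example root.db | - | |
-| Pool Size | Size of the SessionPool | 50 | Configure as needed |
-| **Store and Forward Settings** | | | Keep the defaults |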
-
-
-### Usage
-
-#### Configure historical data storage
-
-- Once the `Provider` is configured, the `IoTDB Tag Historian` can be used in the `Designer` like any other `Provider`: right-click the `Tag`, choose `Edit tag(s)`, and select the History category in the Tag Editor
-
-  ![](/img/ignition-7.png)
-
-- Set `History Enabled` to `true`, select the `Provider` created in the previous step as the `Storage Provider`, configure other parameters as needed, click `OK`, and save the project. Data is then written continuously to the `IoTDB` instance according to these settings.
-
-  ![](/img/ignition-8.png)
-
-#### Read data
-
-- Tags stored in IoTDB can also be selected directly under the Data tab of a Report
-
-  ![](/img/ignition-9.png)
-
-- The data can also be browsed directly in Components
-
-  ![](/img/ignition-10.png)
-
-#### Script module: interacts with IoTDB
-
-1. system.iotdb.insert:
-
-
-- Description: writes data to an IoTDB instance
-
-- Definition:
-  ``` shell
-  system.iotdb.insert(historian, deviceId, timestamps, measurementNames, measurementValues)
-  ```
-
-- Parameters:
-
-  - `str historian`: name of the corresponding IoTDB Tag Historian Provider
-  - `str deviceId`: deviceId to write to, without the configured database, for example Sine
-  - `long[] timestamps`: list of timestamps for the data points to write
-  - `str[] measurementNames`: list of measurement names to write
-  - `str[][] measurementValues`: values of the data points, aligned with the timestamp list and the measurement name list
-
-- Return value: none
-
-- Scope: Client, Designer, Gateway
-
-- Example:
-
-  ```shell
-  system.iotdb.insert("IoTDB", "Sine", [system.date.now()],["measure1","measure2"],[["val1","val2"]])
-  ```
-
-2. 
system.iotdb.query: - - -- 脚本说明:查询写到 IoTDB 实例中的数据 - -- 脚本定义: - ```shell - system.iotdb.query(historian, sql) - ``` - -- 参数: - - - `str historian`:对应的 IoTDB Tag Historian Provider 的名称 - - `str sql`:待查询的 sql 语句 - -- 返回值: - 查询的结果:`List>` - -- 可用范围:Client, Designer, Gateway -- 使用示例: - -```shell -system.iotdb.query("IoTDB", "select * from root.db.Sine where time > 1709563427247") -``` - -## Ignition-IoTDB With JDBC - -### 简介 - - Ignition-IoTDB With JDBC提供了一个 JDBC 驱动,允许用户使用标准的JDBC API 连接和查询 lgnition-loTDB 数据库 - -### 安装步骤 - - 步骤一:进入 `Config` - `Databases` -`Drivers` 模块,创建 `Translator` - -![](/img/Ignition-IoTDBWithJDBC-1.png) - - 步骤二:进入 `Config` - `Databases` -`Drivers` 模块,创建 `JDBC Driver`,选择上一步配置的 `Translator`并上传下载的 `IoTDB-JDBC`,Classname 配置为 `org.apache.iotdb.jdbc.IoTDBDriver` - -![](/img/Ignition-IoTDBWithJDBC-2.png) - -步骤三:进入 `Config` - `Databases` -`Connections` 模块,创建新的 `Connections`,`JDBC Driver` 选择上一步创建的 `IoTDB Driver`,配置相关信息后保存即可使用 - -![](/img/Ignition-IoTDBWithJDBC-3.png) - -### 使用说明 - -#### 数据写入 - - 在`Transaction Groups`中的 `Data Source`选择之前创建的 `Connection` - -- `Table name` 需设置为 root 开始的完整的设备路径 -- 取消勾选 `Automatically create table` -- `Store timestame to` 配置为 time - -不选择其他项,设置好字段,并 `Enabled` 后 数据会安装设置存入对应的 IoTDB - -![](/img/%E6%95%B0%E6%8D%AE%E5%86%99%E5%85%A5-1.png) - -#### 数据查询 - -- 在 `Database Query Browser` 中选择`Data Source`选择之前创建的 `Connection`,即可编写 SQL 语句查询 IoTDB 中的数据 - -![](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2-ponz.png) - diff --git a/src/zh/UserGuide/V1.3.x/Ecosystem-Integration/Zeppelin-IoTDB_timecho.md b/src/zh/UserGuide/V1.3.x/Ecosystem-Integration/Zeppelin-IoTDB_timecho.md deleted file mode 100644 index 39ea4384b..000000000 --- a/src/zh/UserGuide/V1.3.x/Ecosystem-Integration/Zeppelin-IoTDB_timecho.md +++ /dev/null @@ -1,174 +0,0 @@ - - -# Apache Zeppelin - -## Zeppelin 简介 - -Apache Zeppelin 是一个基于网页的交互式数据分析系统。用户可以通过 Zeppelin 连接数据源并使用 SQL、Scala 等进行交互式操作。操作可以保存为文档(类似于 Jupyter)。Zeppelin 支持多种数据源,包括 Spark、ElasticSearch、Cassandra 和 InfluxDB 等等。现在,IoTDB 
supports operation through Zeppelin. An example is shown below:
-
-![iotdb-note-snapshot](/img/github/102752947-520a3e80-43a5-11eb-8fb1-8fac471c8c7e.png)
-
-## Zeppelin-IoTDB Interpreter
-
-### System requirements
-
-| IoTDB version | Java version | Zeppelin version |
-| :--------: | :-----------: | :-----------: |
-| >=`0.12.0` | >=`1.8.0_271` | `>=0.9.0` |
-
-Install IoTDB: see [Quick Start](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md). Assume IoTDB is installed in `$IoTDB_HOME`.
-
-Install Zeppelin:
-> Method 1, direct download: download [Zeppelin](https://zeppelin.apache.org/download.html#) and unpack the binary package. The [netinst](http://www.apache.org/dyn/closer.cgi/zeppelin/zeppelin-0.9.0/zeppelin-0.9.0-bin-netinst.tgz) binary package is recommended; it is relatively small because it does not include unrelated interpreters.
->
-> Method 2, build from source: see [Building Zeppelin from source](https://zeppelin.apache.org/docs/latest/setup/basics/how_to_build.html). The command is `mvn clean package -pl zeppelin-web,zeppelin-server -am -DskipTests`.
-
-Assume Zeppelin is installed in `$Zeppelin_HOME`.
-
-### Building the interpreter
-
-Run the following commands to build the IoTDB Zeppelin interpreter.
-
-```shell
-cd $IoTDB_HOME
-mvn clean package -pl iotdb-connector/zeppelin-interpreter -am -DskipTests -P get-jar-with-dependencies
-```
-
-The built interpreter is located at:
-
-```shell
-$IoTDB_HOME/zeppelin-interpreter/target/zeppelin-{version}-SNAPSHOT-jar-with-dependencies.jar
-```
-
-### Installing the interpreter
-
-Once the interpreter is built, create a new folder named `iotdb` under the Zeppelin interpreter directory and place the IoTDB interpreter in it.
-
-```shell
-cd $IoTDB_HOME
-mkdir -p $Zeppelin_HOME/interpreter/iotdb
-cp $IoTDB_HOME/zeppelin-interpreter/target/zeppelin-{version}-SNAPSHOT-jar-with-dependencies.jar $Zeppelin_HOME/interpreter/iotdb
-```
-
-### Modifying the Zeppelin configuration
-
-Go to `$Zeppelin_HOME/conf` and create the Zeppelin configuration file from the template:
-
-```shell
-cp zeppelin-site.xml.template zeppelin-site.xml
-```
-
-Open zeppelin-site.xml and change the `zeppelin.server.addr` entry to `0.0.0.0`.
-
-### Starting Zeppelin and IoTDB
-
-Go to `$Zeppelin_HOME` and start Zeppelin:
-
-```shell
-# Unix/OS X
-> ./bin/zeppelin-daemon.sh start
-
-# Windows
-> .\bin\zeppelin.cmd
-```
-
-Go to `$IoTDB_HOME` and start IoTDB:
-
-```shell
-# Unix/OS X
-> nohup sbin/start-server.sh >/dev/null 2>&1 &
-or
-> nohup sbin/start-server.sh -c -rpc_port 
>/dev/null 2>&1 &
-
-# Windows
-> sbin\start-server.bat -c -rpc_port
-```
-
-## Using the Zeppelin-IoTDB Interpreter
-
-Once Zeppelin has started, open [http://127.0.0.1:8080/](http://127.0.0.1:8080/)
-
-Create a new note as follows:
-
-1. Click the `Create new note` button
-2. Enter a name for the note
-3. Select iotdb as the interpreter
-
-You can now use Zeppelin to operate IoTDB.
-
-![iotdb-create-note](/img/github/102752945-5171a800-43a5-11eb-8614-53b3276a3ce2.png)
-
-The following simple SQL statements show how the Zeppelin-IoTDB interpreter is used:
-
-```sql
-CREATE DATABASE root.ln.wf01.wt01;
-CREATE TIMESERIES root.ln.wf01.wt01.status WITH DATATYPE=BOOLEAN, ENCODING=PLAIN;
-CREATE TIMESERIES root.ln.wf01.wt01.temperature WITH DATATYPE=FLOAT, ENCODING=PLAIN;
-CREATE TIMESERIES root.ln.wf01.wt01.hardware WITH DATATYPE=INT32, ENCODING=PLAIN;
-
-INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware)
-VALUES (1, 1.1, false, 11);
-
-INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware)
-VALUES (2, 2.2, true, 22);
-
-INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware)
-VALUES (3, 3.3, false, 33);
-
-INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware)
-VALUES (4, 4.4, false, 44);
-
-INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware)
-VALUES (5, 5.5, false, 55);
-
-SELECT *
-FROM root.ln.wf01.wt01
-WHERE time >= 1
-  AND time <= 6;
-```
-
-The result looks like this:
-
-![iotdb-note-snapshot2](/img/github/102752948-52a2d500-43a5-11eb-9156-0c55667eb4cd.png)
-
-To write richer notes, see [[1]](https://zeppelin.apache.org/docs/0.9.0/usage/display_system/basic.html).
-
-The example above is available at `$IoTDB_HOME/zeppelin-interpreter/Zeppelin-IoTDB-Demo.zpln`
-
-## Interpreter configuration
-
-Open [http://127.0.0.1:8080/#/interpreter](http://127.0.0.1:8080/#/interpreter) and configure the IoTDB connection parameters:
-
-![iotdb-configuration](/img/github/102752940-50407b00-43a5-11eb-94fb-3e3be222183c.png)
-
-The configurable properties, their default values, and their meanings are as follows:
-
-| Property | Default | Description |
-| ---------------------------- | --------- | -------------------------------- |
-| iotdb.host | 127.0.0.1 | IoTDB host name |
-| iotdb.port | 6667 | IoTDB port |
-| 
iotdb.username | root | 用户名 | -| iotdb.password | root | 密码 | -| iotdb.fetchSize | 10000 | 查询结果分批次返回时,每一批数量 | -| iotdb.zoneId | | 时区 ID | -| iotdb.enable.rpc.compression | FALSE | 是否允许 rpc 压缩 | -| iotdb.time.display.type | default | 时间戳的展示格式 | diff --git a/src/zh/UserGuide/V1.3.x/IoTDB-Introduction/IoTDB-Introduction_timecho.md b/src/zh/UserGuide/V1.3.x/IoTDB-Introduction/IoTDB-Introduction_timecho.md deleted file mode 100644 index a97f6e9fb..000000000 --- a/src/zh/UserGuide/V1.3.x/IoTDB-Introduction/IoTDB-Introduction_timecho.md +++ /dev/null @@ -1,266 +0,0 @@ - - -# 产品介绍 - -TimechoDB 是一款低成本、高性能的物联网原生时序数据库,是天谋科技基于 Apache IoTDB 社区版本提供的原厂商业化产品。它可以解决企业组建物联网大数据平台管理时序数据时所遇到的应用场景复杂、数据体量大、采样频率高、数据乱序多、数据处理耗时长、分析需求多样、存储与运维成本高等多种问题。 - -天谋科技基于 TimechoDB 提供更多样的产品功能、更强大的性能和稳定性、更丰富的效能工具,并为用户提供全方位的企业服务,从而为商业化客户提供更强大的产品能力,和更优质的开发、运维、使用体验。 - -- 下载、部署与使用:[快速上手](../QuickStart/QuickStart_timecho.md) - -## 产品体系 - -天谋产品体系由若干个组件构成,覆盖由【数据采集】到【数据管理】到【数据分析&应用】的全时序数据生命周期,做到“采-存-用”一体化时序数据解决方案,帮助用户高效地管理和分析物联网产生的海量时序数据。 - -
-[Figure: Introduction-zh-timecho.png]
- - -其中: - -1. **时序数据库(TimechoDB,基于 Apache IoTDB 提供的原厂商业化产品)**:时序数据存储的核心组件,其能够为用户提供高压缩存储能力、丰富时序查询能力、实时流处理能力,同时具备数据的高可用和集群的高扩展性,并在安全层面提供全方位保障。同时 TimechoDB 还为用户提供多种应用工具,方便用户配置和管理系统;多语言API和外部系统应用集成能力,方便用户在 TimechoDB 基础上构建业务应用。 -2. **时序数据标准文件格式(Apache TsFile,多位天谋科技核心团队成员主导&贡献代码)**:该文件格式是一种专为时序数据设计的存储格式,可以高效地存储和查询海量时序数据。目前 Timecho 采集、存储、智能分析等模块的底层存储文件均由 Apache TsFile 进行支撑。TsFile 可以被高效地加载至 IoTDB 中,也能够被迁移出来。通过 TsFile,用户可以在采集、管理、应用&分析阶段统一使用相同的文件格式进行数据管理,极大简化了数据采集到分析的整个流程,提高时序数据管理的效率和便捷度。 -3. **时序模型训推一体化引擎(AINode)**:针对智能分析场景,TimechoDB 提供 AINode 时序模型训推一体化引擎,它提供了一套完整的时序数据分析工具,底层为模型训练引擎,支持训练任务与数据管理,与包括机器学习、深度学习等。通过这些工具,用户可以对存储在 TimechoDB 中的数据进行深入分析,挖掘出其中的价值。 -4. **数据采集**:为了更加便捷的对接各类工业采集场景, 天谋科技提供数据采集接入服务,支持多种协议和格式,可以接入各种传感器、设备产生的数据,同时支持断点续传、网闸穿透等特性。更加适配工业领域采集过程中配置难、传输慢、网络弱的特点,让用户的数采变得更加简单、高效。 - - -## 产品特性 - -TimechoDB 具备以下优势和特性: - -- 灵活的部署方式:支持云端一键部署、终端解压即用、终端-云端无缝连接(数据云端同步工具) - -- 低硬件成本的存储解决方案:支持高压缩比的磁盘存储,无需区分历史库与实时库,数据统一管理 - -- 层级化的测点组织管理方式:支持在系统中根据设备实际层级关系进行建模,以实现与工业测点管理结构的对齐,同时支持针对层级结构的目录查看、检索等能力 - -- 高通量的数据读写:支持百万级设备接入、数据高速读写、乱序/多频采集等复杂工业读写场景 - -- 丰富的时间序列查询语义:支持时序数据原生计算引擎,支持查询时时间戳对齐,提供近百种内置聚合与时序计算函数,支持面向时序特征分析和AI能力 - -- 高可用的分布式系统:支持HA分布式架构,系统提供7*24小时不间断的实时数据库服务,一个物理节点宕机或网络故障,不会影响系统的正常运行;支持物理节点的增加、删除或过热,系统会自动进行计算/存储资源的负载均衡处理;支持异构环境,不同类型、不同性能的服务器可以组建集群,系统根据物理机的配置,自动负载均衡 - -- 极低的使用&运维门槛:支持类 SQL 语言、提供多语言原生二次开发接口、具备控制台等完善的工具体系 - -- 丰富的生态环境对接:支持Hadoop、Spark等大数据生态系统组件对接,支持Grafana、Thingsboard、DataEase等设备管理和可视化工具 - -## 企业特性 - -### 更高阶的产品功能 - -TimechoDB 在 Apache IoTDB 基础上提供了更多高阶产品功能,在内核层面针对工业生产场景进行原生升级和优化,如多级存储、云边协同、可视化工具、安全增强等功能,能够让用户无需过多关注底层逻辑,将精力聚焦在业务开发中,让工业生产更简单更高效,为企业带来更多的经济效益。如: - -- 双活部署:双活通常是指两个独立的单机(或集群),实时进行镜像同步,它们的配置完全独立,可以同时接收外界的写入,每一个独立的单机(或集群)都可以将写入到自己的数据同步到另一个单机(或集群)中,两个单机(或集群)的数据可达到最终一致。 - -- 数据同步:通过数据库内置的同步模块,支持数据由场站向中心汇聚,支持全量汇聚、部分汇聚、级联汇聚等各类场景,可支持实时数据同步与批量数据同步两种模式。同时提供多种内置插件,支持企业数据同步应用中的网闸穿透、加密传输、压缩传输等相关要求。 - -- 多级存储:通过升级底层存储能力,支持根据访问频率和数据重要性等因素将数据划分为冷、温、热等不同层级的数据,并将其存储在不同介质中(如 SSD、机械硬盘、云存储等),同时在查询过程中也由系统进行数据调度。从而在保证数据访问速度的同时,降低客户数据存储成本。 - -- 
Security enhancements: strengthen internal enterprise management and reduce the risk of data leakage through features such as whitelists and audit logs.
-
-A detailed feature comparison follows:
-
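-| Category | Feature | Apache IoTDB | TimechoDB |
-| --- | --- | --- | --- |
-| Deployment mode | Stand-alone deployment | √ | √ |
-| | Distributed deployment | √ | √ |
-| | Dual-active deployment | - | √ |
-| | Container deployment | Partially supported | √ |
-| Database features | Measurement point management | √ | √ |
-| | Data writing | √ | √ |
-| | Data query | √ | √ |
-| | Continuous query | √ | √ |
-| | Trigger | √ | √ |
-| | User-defined function | √ | √ |
-| | Permission management | √ | √ |
-| | Data synchronization | File synchronization only, no built-in plugins | Real-time synchronization + file synchronization, rich built-in plugins |
-| | Stream processing | Framework only, no built-in plugins | Framework + rich built-in plugins |
-| | Tiered storage | - | √ |
-| | View | - | √ |
-| | Whitelist | - | √ |
-| | Audit log | - | √ |
-| Supporting tools | Visual console | - | √ |
-| | Cluster management tool | - | √ |
-| | System monitoring tool | - | √ |
-| Domestic platform adaptation | Domestic platform compatibility certification | - | √ |
-| Technical support | Expert services | - | √ |
-| | Usage training | - | √ |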
- -### 更高效/稳定的产品性能 - -TimechoDB 在 Apache IoTDB 的基础上优化了稳定性与性能,经过企业版技术支持,能够实现10倍以上性能提升,并具有故障及时恢复的性能优势。 - -### 更用户友好的工具体系 - -TimechoDB 将为用户提供更简单、易用的工具体系,通过集群监控面板(IoTDB Grafana)、数据库控制台(IoTDB Workbench)、集群管理工具(IoTDB Deploy Tool,简称 IoTD)等产品帮助用户快速部署、管理、监控数据库集群,降低运维人员工作/学习成本,简化数据库运维工作,使运维过程更加方便、快捷。 - -- 集群监控面板:旨在解决 IoTDB 及其所在操作系统的监控问题,主要包括:操作系统资源监控、IoTDB 性能监控,及上百项内核监控指标,从而帮助用户监控集群健康状态,并进行集群调优和运维。 - -
-[Figure: cluster monitoring dashboard screenshots. Panels: Overview; Operating system resource monitoring; IoTDB performance monitoring]
- -- 数据库控制台:旨在提供低门槛的数据库交互工具,通过提供界面化的控制台帮助用户简洁明了的进行元数据管理、数据增删改查、权限管理、系统管理等操作,简化数据库使用难度,提高数据库使用效率。 - - -
-[Figure: database console screenshots. Panels: Home; Metadata management; SQL query]
- - -- 集群管理工具:旨在解决分布式系统多节点的运维难题,主要包括集群部署、集群启停、弹性扩容、配置更新、数据导出等功能,从而实现对复杂数据库集群的一键式指令下发,极大降低管理难度。 - - -
- -### 更专业的企业技术服务 - -TimechoDB 客户提供强大的原厂服务,包括但不限于现场安装及培训、专家顾问咨询、现场紧急救助、软件升级、在线自助服务、远程支持、最新开发版使用指导等服务。同时,为了使 IoTDB 更契合工业生产场景,我们会根据企业实际数据结构和读写负载,进行建模方案推荐、读写性能调优、压缩比调优、数据库配置推荐及其他的技术支持。如遇到部分产品未覆盖的工业化定制场景,TimechoDB 将根据用户特点提供定制化开发工具。 - -相较于 Apache IoTDB,每 2-3 个月一个发版周期,TimechoDB 提供周期更快的发版频率,同时针对客户现场紧急问题,提供天级别的专属修复,确保生产环境稳定。 - - -### 更兼容的国产化适配 - -TimechoDB 代码自研可控,同时兼容大部分主流信创产品(CPU、操作系统等),并完成与多个厂家的兼容认证,确保产品的合规性和安全性。 \ No newline at end of file diff --git a/src/zh/UserGuide/V1.3.x/IoTDB-Introduction/Release-history_timecho.md b/src/zh/UserGuide/V1.3.x/IoTDB-Introduction/Release-history_timecho.md deleted file mode 100644 index f4dc5fbaa..000000000 --- a/src/zh/UserGuide/V1.3.x/IoTDB-Introduction/Release-history_timecho.md +++ /dev/null @@ -1,388 +0,0 @@ - -# 发布历史 - -## TimechoDB(数据库内核) - -### V1.3.7.3 - -> 发版时间:2026.06.02
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.7.3-bin.zip
-> SHA512 校验码:8e6cde061421a552b9855f39f9cccd4838c820dc15ef0ad2a7c23a54cd6cc4f06c35190c1f428784e6a4d5463dd1b794f58ff5cdf891f27f6d0be4d3ab00bf6f - -V1.3.7.3 版本主要优化了查询模块和数据同步等功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 查询模块:优化 Last 查询、对齐序列查询、倒序时间过滤查询等场景 -- 元数据模块:优化已激活序列及其子路径下的设备创建校验 -- 数据同步:优化同步失败后的重试机制 -- 数据同步:跨网闸同步插件支持配置实时写入传输超时时间 -- 接口模块:Go 客户端写入接口增加错误码校验 -- 接口模块:优化 C# 客户端连接池管理 - - -### V1.3.7.2 - -> 发版时间:2026.04.07
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.7.2-bin.zip
-> SHA512 校验码:787766af64992069f0db0ac8b250b461d799307b3ce06b0782fc25752c8c5307fa2205c9e3a38a41685b81bb6b4b5c1ec9f71a395bfad285caf90de7b8224783 - -V1.3.7.2 版本主要优化了数据同步和查询模块的相关功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 数据同步:优化 Pipe 复杂路径匹配场景下的分发性能 -- 查询模块:Show Queries 语句新增客户端 IP、查询超时时间、服务端等待时间等信息 -- 生态集成:支持 IoTDB 以 OPC Client 模式向外部 OPC Server 推送数据 - - -### V1.3.6.6 - -> 发版时间:2026.01.20
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.6.6-bin.zip
-> SHA512 校验码:590d3ead053298c6df0ede637572ba598b9b684f8b35ab874bd4452f765e1421938f4cca2cf0423af2e806592aa8b15bdd25b41df7de809435a4d0239fc04790 - -V1.3.6.6 版本优化了数据的读写功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V1.3.6.3 - -> 发版时间:2026.01.04
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.6.3-bin.zip
-> SHA512 校验码:43719a1384f59f63cb0029cdda0aba433383cd1a0f5ebc142e54f8aa6623cc30a7efb3e3aef7f3d485d5e07bec91be215c92ed21b5201613d5cc44044251c978 - -V1.3.6.3 版本主要围绕查询性能、内存管理机制两大核心方向进行了深度优化,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 查询模块:优化多种场景的查询性能,包括多序列 Last 查询等 -* 查询模块:Java SDK 新增 FastLastQuery 接口,支持更高效的 Last 查询操作 -* 查询模块:树模型 fetchSchema 调整为分段流式返回,提升大数据量场景下的响应速度 -* 存储模块:优化内存管理,避免内存泄漏风险,保障系统长期稳定运行 -* 存储模块:优化文件合并机制,提升合并处理效率,优化系统存储资源占用 -* 其他:修复安全漏洞 CVE-2025-12183,CVE-2025-66566 and CVE-2025-11226 - - -### V1.3.6.1 - -> 发版时间:2025.12.09
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.6.1-bin.zip
-> SHA512 校验码:9fb6a6870aa2133bfc40508324a7d97ee078d0d44895beef7b0a331edd203419119fb02b933f585b6c4a6fe9b59708a053d7cf65206b22b1a4f01a5fe518424c - -V1.3.6.1 版本主要围绕数据同步稳定性这一核心方向进行了深度优化,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 数据同步:优化 Pipe SQL 参数配置,支持指定异步加载方式 -* 数据同步:新增语法糖功能,可将全量 Pipe 创建 SQL 自动拆分为实时同步与历史同步两类 -* 系统模块:新增全局数据类型压缩方式配置项,支持按需调整存储压缩策略 - - -### V1.3.5.11 - -> 发版时间:2025.09.24
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.11-bin.zip
-> SHA512 校验码:f18419e20c0d7e9316febee5a053306a97268cb07e18e6933716c2ef98520fbbe051dfa1da02a9c83e8481a839ce35525ce6c50f890f821e3d760f550c75f804 - -V1.3.5.11 版本主要优化了数据同步功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V1.3.5.10 - -> 发版时间:2025.08.27
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.10-bin.zip
-> SHA512 校验码:3aea6d2318f52b39bfb86dae9ff06fe1b719fdeceaabb39278c9a73544e1ceaf0660339f9342abb888c8281a0fb6144179dac9bb0c40ba0ecc66bac4dd7cbe80 - -V1.3.5.10 版本修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V1.3.5.9 - -> 发版时间:2025.08.25
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.9-bin.zip
-> SHA512 校验码:95b7a6790e94dc88e355a81e5a54b10ee87bdadae69ba0b215273967b3422178d5ee81fa5adf1c5380a67dbb30cf9782eaa3cbfd6ec744b0fd9a91c983ee8f70 - -V1.3.5.9 版本优化了内存控制,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 -### 1.x 其他历史版本 - -#### V1.3.5.8 - -> 发版时间:2025.08.19
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.8-bin.zip
-> SHA512 校验码:aa9802301614e20294a7f2fc4c149ba20d58213d9b74e8f8c607e0f4860949bad164bce2851b63c1d39b7568d62975ab257c269b3a9c168a29ea3945b6d28982 - -V1.3.5.8 版本优化了数据同步功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.7 - -> 发版时间:2025.08.13
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.7-bin.zip
-> SHA512 校验码:17374a440267aed3507dcc8cf4dc8703f8136d5af30d16206a6e1101e378cbbc50eda340b1598a12df35fe87d96db20f7802f0e64033a013d4b81499198663d4 - -V1.3.5.7 版本优化了数据同步功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.6 - -> 发版时间:2025.07.16
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.6-bin.zip
-> SHA512 校验码:05b9fda4d98ba8a1c9313c0831362ed3d667ce07cb00acaeabcf6441a6d67dff7da27f3fda2a5e1b3c3b85d1e5c730a534f3aa2f0c731b8c03ef447203b32493 - -V1.3.5.6 版本新增配置项开关支持禁用数据订阅功能,优化了C++高可用客户端,以及正常情况、重启、删除三个场景下的 PIPE 同步延迟问题,和大 TEXT 对象时的查询问题,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.4 - -> 发版时间:2025.06.19
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.4-bin.zip
-> SHA512 校验码:edac5f8b70dd67b3f84d3e693dc025a10b41565143afa15fc0c4937f8207479ffe2da787cc9384440262b1b05748c23411373c08606c6e354ea3dcdba0371778 - -V1.3.5.4 版本修复了部分产品缺陷,优化了节点移除功能,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.3 - -> 发版时间:2025.06.13
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.3-bin.zip
-> SHA512 校验码:5f807322ceec9e63a6be86108cc57e7ad4251b99a6c28baf11256ab65b2145768e9110409f89834d5f4256094a8ad995775c0e59a17224ff2627cd9354e09d82 - -V1.3.5.3 版本主要优化了数据同步功能,包括持久化 PIPE 发送进度,增加 PIPE 事件传输时间监控项,并修复了相关缺陷;另外将用户密码的加密算法变更为 SHA-256,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.2 - -> 发版时间:2025.06.10
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.2-bin.zip
-> SHA512 校验码:4c0a5db76c6045dfd27cce303546155cdb402318024dae5f999f596000d7b038b13bbeac39068331b5c6e2c80bc1d89cd346dd0be566fe2fe865007d441d9d05 - -V1.3.5.2 版本主要优化了数据同步功能,包括支持通过使用参数进行级联配置,支持同步和实时写入顺序完全一致;支持系统重启后历史数据和实时数据分区发送,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.1 - -> 发版时间:2025.05.15
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.1-bin.zip
-> SHA512 校验码:91f22bafbdd4d580126ed59ba1ba99d14209f10ce4a0a4bd7d731943ac99fdb6ebfab6e3a1e294a7cb7f46367e9fd4252b0d9ac4d4240ddedf6d85658e48f212 - -V1.3.5.1 版本修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.4.2 - -> 发版时间:2025.04.14
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.4.2-bin.zip
-> SHA512 校验码:52fbd79f5e7256e7d04edc8f640bb8d918e837fedd1e64642beb2b2b25e3525b5f5a4c92235f88f6f7b59bfcdf096e4ea52ab85bfef0b69274334470017a2c5b2 - -V1.3.4.2 版本优化了数据同步功能,支持双活之间同步外部 PIPE 转发而来的数据。 - - -#### V1.3.4.1 - -> 发版时间:2025.01.08
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.4.1-bin.zip
-> SHA512 校验码:e9d46516f1f25732a93cc915041a8e59bca77cf8a1018c89d18ed29598540c9f2bdf1ffae9029c87425cecd9ecb5ebebea0334c7e23af11e28d78621d4a78148 - -V1.3.4.1 版本新增模式匹配函数、持续优化数据订阅机制,提升稳定性、import-data/export-data 脚本扩展支持新数据类型,import-data/export-data 脚本合并同时兼容 TsFile、CSV 和 SQL 三种类型数据的导入导出等功能,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 查询模块:用户可通过配置项控制 UDF、PipePlugin、Trigger 和 AINode 通过 URI 加载 jar 包 -- 系统模块:UDF 函数拓展,新增 pattern_match 模式匹配函数 -- 数据同步:支持在发送端指定接收端鉴权信息 -- 生态集成:支持 Kubernetes Operator -- 脚本与工具:import-data/export-data 脚本扩展,支持新数据类型(字符串、大二进制对象、日期、时间戳) -- 脚本与工具:import-data/export-data 脚本迭代,同时兼容 TsFile、CSV 和 SQL 三种类型数据的导入导出 - -#### V1.3.3.3 - -> 发版时间:2024.10.31
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.3.3-bin.zip
-> SHA512 校验码:4a3eceda479db3980e9c8058628e71ba5a16fbfccf70894e8181aea5e014c7b89988d0093f6d42df29d478340a33878602a3924bec13f442a48611cec4e0e961 - -V1.3.3.3版本增加优化重启恢复性能,减少启动时间、DataNode 主动监听并加载 TsFile,同时增加可观测性指标、发送端支持传文件至指定目录后,接收端自动加载到IoTDB、Alter Pipe 支持 Alter Source 的能力等功能,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 数据同步:接收端支持对不一致数据类型的自动转换 -- 数据同步:接收端增强可观测性,支持多个内部接口的 ops/latency 统计 -- 数据同步:opc-ua-sink 插件支持 CS 模式访问和非匿名访问方式 -- 数据订阅: SDK 支持 create if not exists 和 drop if exists 接口 -- 流处理:Alter Pipe 支持 Alter Source 的能力 -- 系统模块:新增 rest 模块的耗时监控 -- 脚本与工具:支持加载自动加载指定目录的TsFile文件 -- 脚本与工具:import-tsfile脚本扩展,支持脚本与iotdb server不在同一服务器运行 -- 脚本与工具:新增对Kubernetes Helm的支持 -- 脚本与工具:Python 客户端支持新数据类型(字符串、大二进制对象、日期、时间戳) - -#### V1.3.3.2 - -> 发版时间:2024.8.15
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.3.2-bin.zip
-> SHA512 校验码:32733610da40aa965e5e9263a869d6e315c5673feaefad43b61749afcf534926398209d9ca7fff866c09deb92c09d950c583cea84be5a6aa2c315e1c7e8cfb74 - -V1.3.3.2版本支持输出读取mods文件的耗时、输入最大顺乱序归并排序内存 以及dispatch 耗时、通过参数配置对时间分区原点的调整、支持根据 pipe 历史数据处理结束标记自动结束订阅,同时合并了模块内存控制性能提升,具体发布内容如下: - -- 查询模块:Explain Analyze 功能支持输出读取mods文件的耗时 -- 查询模块:Explain Analyze 功能支持输入最大顺乱序归并排序内存以及 dispatch 耗时 -- 存储模块:新增合并目标文件拆分功能,增加配置文件参数 -- 系统模块:支持通过参数配置对时间分区原点的调整 -- 流处理:数据订阅支持根据 pipe 历史数据处理结束标记自动结束订阅 -- 数据同步:RPC 压缩支持指定压缩等级 -- 脚本与工具:数据/元数据导出只过滤 root.__system,不对root.__systema 等开头的数据进行过滤 - -#### V1.3.3.1 - -> 发版时间:2024.7.12
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.3.1-bin.zip
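-> SHA512 checksum: 1fdffbc1f18bfabfa3463a5a6fbc4f6ba6ab686942f9e85e7e6be1840fb8700e0147e5e73fd52201656ae6adb572cc2e5ecc61bcad6fa4c5a4048c4207e3c6c0
-
-V1.3.3.1 adds a rate-limiting mechanism to tiered storage and lets the data synchronization sender sink specify username/password authentication for the receiver. It also clarifies some ambiguous WARN logs on the data synchronization receiver, improves restart recovery performance to shorten startup time, and merges script content. The release contents are as follows:
-
-- Query module: Filter performance optimization, speeding up aggregation queries and queries with WHERE conditions
-- Query module: SQL query requests from the Java Session client are distributed evenly across all nodes
-- System module: the configuration files "iotdb-confignode.properties", "iotdb-datanode.properties", and "iotdb-common.properties" are merged into "iotdb-system.properties"
-- Storage module: tiered storage adds a rate-limiting mechanism
-- Data synchronization: the sender sink can specify username/password authentication for the receiver
-- System module: restart recovery performance is optimized, shortening startup time
-
-#### V1.3.2.2
-
-> Release date: 2024.6.4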
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.2.2-bin.zip
-> SHA512 校验码:ad73212a0b5025d18d2481163f6b2d4f604e06eb5e391cc6cba7bf4e42792e115b527ed8bfb5cd95d20a150645c8b4d56a531889dac229ce0f63139a27267322 - -V1.3.2.2 版本新增 explain analyze 语句分析单个 SQL 查询耗时、新增 UDAF 用户自定义聚合函数框架、支持磁盘空间到达设置阈值自动删除数据、元数据同步、统计指定路径下数据点数、SQL 语句导入导出脚本等功能,同时集群管理工具支持滚动升级、上传插件到整个集群,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 存储模块:insertRecords 接口写入性能提升 -- 存储模块:新增 SpaceTL 功能,支持磁盘空间到达设置阈值自动删除数据 -- 查询模块:新增 Explain Analyze 语句(监控单条 SQL 执行各阶段耗时) -- 查询模块:新增 UDAF 用户自定义聚合函数框架 -- 查询模块:UDF 新增包络解调分析 -- 查询模块:新增 MaxBy/MinBy 函数,支持获取最大/小值的同时返回对应时间戳 -- 查询模块:值过滤查询性能提升 -- 数据同步:路径匹配支持通配符 -- 数据同步:支持元数据同步(含时间序列及相关属性、权限等设置) -- 流处理:增加 Alter Pipe 语句,支持热更新 Pipe 任务的插件 -- 系统模块:系统数据点数统计增加对 load TsFile 导入数据的统计 -- 脚本与工具:新增本地升级备份工具(通过硬链接对原有数据进行备份) -- 脚本与工具:新增 export-data/import-data 脚本,支持将数据导出为 CSV、TsFile 格式或 SQL 语句 -- 脚本与工具:Windows 环境支持通过窗口名区分 ConfigNode、DataNode、Cli - -#### V1.3.1.4 - -> 发版时间:2024.4.23
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.1.4-bin.zip
-> SHA512 校验码:8547702061d52e2707c750a624730eb2d9b605b60661efa3c8f11611ca1685aeb51b6f8a93f94c1b30bf2e8764139489c9fbb76cf598cfa8bf9c874b2a7c57eb - -V1.3.1 版本增加系统激活情况查看、内置方差/标准差聚合函数、内置Fill语句支持超时时间设置、tsfile修复命令等功能,增加一键收集实例信息脚本、一键启停集群等脚本,并对视图、流处理等功能进行优化,提升使用易用度和版本性能。具体发布内容如下: - -- 查询模块:Fill 子句支持设置填充超时阈值,超过时间阈值不填充 -- 查询模块:Rest 接口(V2 版)增加列类型返回 -- 数据同步:数据同步简化时间范围指定方式,直接设置起止时间 -- 数据同步:数据同步支持 SSL 传输协议(iotdb-thrift-ssl-sink 插件) -- 系统模块:支持使用 SQL 查询集群激活信息 -- 系统模块:多级存储增加迁移时传输速率控制 -- 系统模块:系统可观测性提升(增加集群节点的散度监控、分布式任务调度框架可观测性) -- 系统模块:日志默认输出策略优化 -- 脚本与工具:增加一键启停集群脚本(start-all/stop-all.sh & start-all/stop-all.bat) -- 脚本与工具:增加一键收集实例信息脚本(collect-info.sh & collect-info.bat) - -#### V1.3.0.4 - -> 发版时间:2024.1.3
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.0.4-bin.zip
-> SHA512 校验码:3c07798f37c07e776e5cd24f758e8aaa563a2aae0fb820dad5ebf565ad8a76c765b896d44e7fdb7dad2e46ffd4262af901c765f9bf6af926bc62103118e38951 - -V1.3.0.4 发布了全新内生机器学习框架 AINode,全面升级权限模块支持序列粒度授予权限,并对视图、流处理等功能进行诸多细节优化,进一步提升了产品的使用易用度,并增强了版本稳定性和各方面性能。具体发布内容如下: - -- 查询模块:新增 AINode 内生机器学习模块 -- 查询模块:优化 show path 语句返回时间长的问题 -- 安全模块:升级权限模块,支持时间序列粒度的权限设置 -- 安全模块:支持客户端与服务器 SSL 通讯加密 -- 流处理:流处理模块新增多种 metrics 监控项 -- 查询模块:非可写视图序列支持 LAST 查询 -- 系统模块:优化数据点监控项统计准确性 - -#### V1.2.0.1 - -> 发版时间:2023.6.30
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.2.0.1-bin.zip
-> SHA512 校验码:dcf910d0c047d148a6c52fa9ee03a4d6bc3ff2a102dc31c0864695a25268ae933a274b093e5f3121689063544d7c6b3b635e5e87ae6408072e8705b3c4e20bf0 - -V1.2.0.1主要增加了流处理框架、动态模板、substring/replace/round内置查询函数等新特性,增强了show region、show timeseries、show variable等内置语句功能和Session接口,同时优化了内置监控项及其实现,修复部分产品bug和性能问题。 - -- 流处理:新增流处理框架 -- 元数据模块:新增模板动态扩充功能 -- 存储模块:新增SPRINTZ和RLBE编码以及LZMA2压缩算法 -- 查询模块:新增cast、round、substr、replace内置标量函数 -- 查询模块:新增time_duration、mode内置聚合函数 -- 查询模块:SQL语句支持case when语法 -- 查询模块:SQL语句支持order by表达式 -- 接口模块:Python API支持连接分布式多个节点 -- 接口模块:Python客户端支持写入重定向 -- 接口模块:Session API增加用模板批量创建序列接口 - -#### V1.1.0.1 - -> 发版时间:2023-04-03
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.1.0.1.zip
-> SHA512 校验码:58df58fc8b11afeec8436678842210ec092ac32f6308656d5356b7819acc199f1aec4b531635976b091b61d6736f0d9706badcabeaa5de50939e5c331c1dc804 - -V1.1.0.1主要改进增加了部分新特性,如支持 GROUP BY VARIATION、GROUP BY CONDITION 等分段方式、增加 DIFF、COUNT_IF 等实用函数,引入 pipeline 执行引擎进一步提升查询速度等。同时修复对齐序列 last 查询 order by timeseries、LIMIT&OFFSET 不生效、重启后元数据模版错误、删除所有 database 后创建序列错误等相关问题。 - -- 查询模块:align by device 语句支持 order by time -- 查询模块:支持 Show Queries 命令 -- 查询模块:支持 kill query 命令 -- 系统模块:show regions 支持指定特定的 database -- 系统模块:新增 SQL show variables, 可以展示当前集群参数 -- 查询模块:聚合查询支持 GROUP BY VARIATION -- 查询模块:SELECT INTO 支持特定的数据类型强转 -- 查询模块:实现内置标量函数 DIFF -- 系统模块:show regions 显示创建时间 -- 查询模块:实现内置聚合函数 COUNT_IF -- 查询模块:聚合查询支持 GROUP BY CONDITION -- 系统模块:支持修改 dn_rpc_port 和 dn_rpc_address - - -## Workbench(控制台工具) - -| **控制台版本号** | **版本说明** | **可支持IoTDB版本** | **SHA512 校验码** | -| ---------------- | ------------------------------------------------------------ | ------------------------- | ------------------------------------------------------------ | -| V1.5.7 | 优化测点列表中测点名称拆分为设备名称和测点,测点选择区域支持左右滚动,以及导出文件列顺序与页面保持一致 | V1.3.4及以上的1.x系列版本 | d3cd4a63372ca5d6217b67dddf661980c6a442b3b1564235e9ad34fc254d681febd58c2cc59c6273ffbfd8a1b003b9adb130ecfaaebe1942003b0d07427b1fcc | -| V1.5.6 | 优化 CSV 格式导入导出功能:导入时,支持标签、别名为非必填项;导出时,支持测点描述里反引号包裹引号的场景 | V1.3.4及以上的1.x系列版本 | 276ac1ea341f468bf6d29489c9109e9aa61afe2d1caaab577bc40603c6f4120efccc36b65a58a29ce6a266c21b46837aad6128f84ba5e676231ea9e6284a35e5 | -| V1.5.5 | 新增服务器时钟,支持企业版激活数据库 | V1.3.4及以上的1.x系列版本 | b18d01b70908d503a25866d1cc69d14e024d5b10ca6fcc536932fdbef8257c66e53204663ce3be5548479911aca238645be79dfd7ee7e65a07ab3c0f68c497f6 | -| V1.5.4 | 新增实例管理中prometheus设置的认证功能 | V1.3.4及以上的1.x系列版本 | adc7e13576913f9e43a9671fed02911983888da57be98ec8fbbb2593600d310f69619d32b22b569520c88e29f100d7ccae995b20eba757dbb1b2825655719335 | -| V1.5.1 | 新增AI分析功能以及模式匹配功能 | V1.3.2及以上的1.x系列版本 | 
4f2053a2a3b2b255ce195268d6cd245278f3be32ba4cf68be1552c386d78ed4424f7bdc9d8e68c6b8260b3e398c8fd23ff342439c4e88e1e777c62640d2279f9 | -| V1.4.0 | 新增树模型展示及英文版 | V1.3.2及以上的1.x系列版本 | 734077f3bb5e1719d20b319d8b554ce30718c935cb0451e02b2c9267ff770e9c2d63b958222f314f16c2e6e62bf78b643255249b574ee6f37d00e123433981e8 | -| V1.3.1 | 分析功能新增分析方式,优化导入模版等功能 | V1.3.2及以上的1.x系列版本 | 134f87101cc7f159f8a22ac976ad2a3a295c5435058ee0a15160892aac46ac61dd3cfb0633b4aea9cc7415bf904d0ae65aaf77d663f027d864204d81fb34768b | -| V1.3.0 | 新增数据库配置功能,优化部分版本细节 | V1.3.2及以上的1.x系列版本 | 94a137fc5c681b211f3e076472a9c5875d59e7f0cd6d7409cb8f66bb9e4f87577a0f12dd500e2bcb99a435860c82183e4a6514b638bcb4aecfb48f184730f3f1 | -| V1.2.6 | 优化各模块权限控制功能 | V1.3.1及以上的1.x系列版本 | f345b7edcbe245a561cb94ec2e4f4d40731fe205f134acadf5e391e5874c5c2477d9f75f15dbaf36c3a7cb6506823ac6fbc2a0ccce484b7c4cc71ec0fbdd9901 | -| V1.2.5 | 可视化功能新增“常用模版”概念,所有界面优化补充页面缓存等功能 | V1.3.0及以上的1.x系列版本 | 37376b6cfbef7df8496e255fc33627de01bd68f636e50b573ed3940906b6f3da1e8e8b25260262293b8589718f5a72180fa15e5823437bf6dc51ed7da0c583f7 | -| V1.2.4 | 计算功能新增“导入、导出”功能,测点列表新增“时间对齐”字段 | V1.2.2及以上的1.x系列版本 | 061ad1add38c109c1a90b06f1ddb7797bd45e84a34a4f77154ee48b90bdc7ecccc1e25eaa53fbbc98170d99facca93e3536192dd8d10a50ce505f59923ce6186 | -| V1.2.3 | 首页新增“激活详情”,新增分析等功能 | V1.2.2及以上的1.x系列版本 | 254f5b7451300f6f99937d27fd7a5b20847d5293f53e0eaf045ac9235c7ea011785716b800014645ed5d2161078b37e1d04f3c59589c976614fb801c4da982e1 | -| V1.2.2 | 优化“测点描述”展示内容等功能 | V1.2.2及以上的1.x系列版本 | 062e520d010082be852d6db0e2a3aa6de594eb26aeb608da28a212726e378cd4ea30fca5e1d2c3231ebd8de29e94ca9641f1fabc1cea46acfb650c37b7681b4e | -| V1.2.1 | 数据同步界面新增“监控面板”,优化Prometheus提示信息 | V1.2.2及以上的1.x系列版本 | 8a3bcf87982ad5004528829b121f2d3945429deb77069917a42a8c8d2e2e2a2c24a398aaa87003920eeacc0c692f1ed39eac52a696887aa085cce011f0ddd745 | -| V1.2.0 | 全新Workbench版本升级 | V1.2.0及以上的1.x系列版本 | ea1f7d3a4c0c6476a195479e69bbd3b3a2da08b5b2bb70b0a4aba988a28b5db5a209d4e2c697eb8095dfdf130e29f61f2ddf58c5b51d002c8d4c65cfc13106b3 | diff 
--git a/src/zh/UserGuide/V1.3.x/QuickStart/QuickStart_timecho.md b/src/zh/UserGuide/V1.3.x/QuickStart/QuickStart_timecho.md deleted file mode 100644 index f898f60ee..000000000 --- a/src/zh/UserGuide/V1.3.x/QuickStart/QuickStart_timecho.md +++ /dev/null @@ -1,109 +0,0 @@ - - -# 快速上手 - -本篇文档将帮助您了解快速入门 IoTDB 的方法。 - -## 如何安装部署? - -本篇文档将帮助您快速安装部署 IoTDB,您可以通过以下文档的链接快速定位到所需要查看的内容: - -1. 准备所需机器资源:IoTDB 的部署和运行需要考虑多个方面的机器资源配置。具体资源配置可查看 [资源规划](../Deployment-and-Maintenance/Database-Resources.md) - -2. 完成系统配置准备:IoTDB 的系统配置涉及多个方面,关键的系统配置介绍可查看 [系统配置](../Deployment-and-Maintenance/Environment-Requirements.md) - -3. 获取安装包:您可以联系天谋商务获取 IoTDB 安装包,以确保下载的是最新且稳定的版本。具体安装包结构可查看:[安装包获取](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) - -4. 安装数据库并激活:您可以根据实际部署架构选择以下教程进行安装部署: - - - 单机版:[单机版](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - - - 分布式(集群)版:[分布式(集群)版](../Deployment-and-Maintenance//Cluster-Deployment_timecho.md) - - - 双活版:[双活版](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -> ❗️注意:目前我们仍然推荐直接在物理机/虚拟机上安装部署,如需要 docker 部署,可参考:[Docker 部署](../Deployment-and-Maintenance/Docker-Deployment_timecho.md) - -5. 安装数据库配套工具:企业版数据库提供监控面板、可视化控制台等配套工具,建议在部署企业版时安装,可以帮助您更加便捷的使用 IoTDB: - - - 监控面板:提供了上百个数据库监控指标,用来对 IoTDB 及其所在操作系统进行细致监控,从而进行系统优化、性能优化、发现瓶颈等,安装步骤可查看 [监控面板部署](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - 可视化控制台:是 IoTDB 的可视化界面,支持通过界面交互的形式提供元数据管理、数据查询、数据可视化等功能的操作,帮助用户简单、高效的使用数据库,安装步骤可查看 [可视化控制台部署](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - -## 如何使用? - -1. 数据库建模设计:数据库建模是创建数据库系统的重要步骤,它涉及到设计数据的结构和关系,以确保数据的组织方式能够满足特定应用的需求,下面的文档将会帮助您快速了解 IoTDB 的建模设计: - - - 时序概念介绍:[走进时序数据](../Basic-Concept/Navigating_Time_Series_Data.md) - - - 建模设计介绍:[数据模型介绍](../Basic-Concept/Data-Model-and-Terminology.md) - - - SQL 语法介绍:[SQL 语法介绍](../Basic-Concept/Operate-Metadata_timecho.md) - -2. 数据写入:在数据写入方面,IoTDB 提供了多种方式来插入实时数据,基本的数据写入操作请查看 [数据写入](../Basic-Concept/Write-Data) - -3. 
数据查询:IoTDB 提供了丰富的数据查询功能,数据查询的基本介绍请查看 [数据查询](../Basic-Concept/Query-Data.md) - -4. 其他进阶功能:除了数据库常见的写入、查询等功能外,IoTDB 还支持“数据同步、流处理框架、安全控制、权限管理、AI 分析”等功能,具体使用方法可参见具体文档: - - - 数据同步:[数据同步](../User-Manual/Data-Sync_timecho.md) - - - 流处理框架:[流处理框架](../User-Manual/Streaming_timecho.md) - - - 安全控制:[安全控制](../User-Manual/White-List_timecho.md) - - - 权限管理:[权限管理](../User-Manual/Authority-Management.md) - - - AI 分析:[AI 能力](../AI-capability/AINode_timecho.md) - -5. 应用编程接口: IoTDB 提供了多种应用编程接口(API),以便于开发者在应用程序中与 IoTDB 进行交互,目前支持[ Java 原生接口](../API/Programming-Java-Native-API.md)、[Python 原生接口](../API/Programming-Python-Native-API.md)、[C++原生接口](../API/Programming-Cpp-Native-API.md)、[Go 原生接口](../API/Programming-Go-Native-API.md)等,更多编程接口可参见官网【应用编程接口】其他章节 - -## 还有哪些便捷的周边工具? - -IoTDB 除了自身拥有丰富的功能外,其周边的工具体系包含的种类十分齐全。本篇文档将帮助您快速使用周边工具体系: - - - 可视化控制台:workbench 是 IoTDB 的一个支持界面交互的形式的可视化界面,提供直观的元数据管理、数据查询和数据可视化等功能,提升用户操作数据库的便捷性和效率,具体使用介绍请查看 [可视化控制台部署](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - - - 监控面板:是一个对 IoTDB 及其所在操作系统进行细致监控的工具,涵盖数据库性能、系统资源等上百个数据库监控指标,助力系统优化与瓶颈识别等,具体使用介绍请查看 [监控面板部署](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - 测试工具:IoT-benchmark 是一个基于 Java 和大数据环境开发的时序数据库基准测试工具,由清华大学软件学院研发并开源。它支持多种写入和查询方式,能够存储测试信息和结果供进一步查询或分析,并支持与 Tableau 集成以可视化测试结果。具体使用介绍请查看:[测试工具](../Tools-System/Benchmark.md) - - - 数据导入脚本:针对于不同场景,IoTDB 为用户提供多种批量导入数据的操作方式,具体使用介绍请查看:[数据导入](../Tools-System/Data-Import-Tool.md) - - - - 数据导出脚本:针对于不同场景,IoTDB 为用户提供多种批量导出数据的操作方式,具体使用介绍请查看:[数据导出](../Tools-System/Data-Export-Tool.md) - - -## 想了解更多技术细节? 
- -如果您想了解 IoTDB 的更多技术内幕,可以移步至下面的文档: - - - 研究论文:IoTDB 具有列式存储、数据编码、预计算和索引技术,以及其类 SQL 接口和高性能数据处理能力,同时与 Apache Hadoop、MapReduce 和 Apache Spark 无缝集成。相关研究论文请查看 [研究论文](../Technical-Insider/Publication.md) - - - 压缩&编码:IoTDB 通过多样化的编码和压缩技术,针对不同数据类型优化存储效率,想了解更多请查看 [压缩&编码](../Technical-Insider/Encoding-and-Compression.md) - - - 数据分区和负载均衡:IoTDB 基于时序数据特性,精心设计了数据分区策略和负载均衡算法,提升了集群的可用性和性能,想了解更多请查看 [数据分区和负载均衡](../Technical-Insider/Cluster-data-partitioning.md) - - -## 使用过程中遇到问题? - -如果您在安装或使用过程中遇到困难,可以移步至 [常见问题](../FAQ/Frequently-asked-questions.md) 中进行查看 \ No newline at end of file diff --git a/src/zh/UserGuide/V1.3.x/Reference/DataNode-Config-Manual-old_timecho.md b/src/zh/UserGuide/V1.3.x/Reference/DataNode-Config-Manual-old_timecho.md deleted file mode 100644 index 5a08a01cb..000000000 --- a/src/zh/UserGuide/V1.3.x/Reference/DataNode-Config-Manual-old_timecho.md +++ /dev/null @@ -1,582 +0,0 @@ - - -# DataNode 配置参数 - -IoTDB DataNode 与 Standalone 模式共用一套配置文件,均位于 IoTDB 安装目录:`conf`文件夹下。 - -* `datanode-env.sh/bat`:环境配置项的配置文件,可以配置 DataNode 的内存大小。 - -* `iotdb-datanode.properties`:IoTDB 配置文件。 - -## 热修改配置项 - -为方便用户使用,IoTDB 为用户提供了热修改功能,即在系统运行过程中修改 `iotdb-datanode.properties` 中部分配置参数并即时应用到系统中。下面介绍的参数中,改后 生效方式为`热加载`的均为支持热修改的配置参数。 - -通过 Session 或 Cli 发送 ```load configuration``` 或 `set configuration` 命令(SQL)至 IoTDB 可触发配置热加载。 - -## 环境配置项(datanode-env.sh/bat) - -环境配置项主要用于对 DataNode 运行的 Java 环境相关参数进行配置,如 JVM 相关配置。DataNode/Standalone 启动时,此部分配置会被传给 JVM,详细配置项说明如下: - -* MEMORY\_SIZE - -|名字|MEMORY\_SIZE| -|:---:|:---| -|描述|IoTDB DataNode 启动时分配的内存大小 | -|类型|String| -|默认值|取决于操作系统和机器配置。默认为机器内存的二分之一。| -|改后生效方式|重启服务生效| - -* ON\_HEAP\_MEMORY - -|名字|ON\_HEAP\_MEMORY| -|:---:|:---| -|描述|IoTDB DataNode 能使用的堆内内存大小, 曾用名: MAX\_HEAP\_SIZE | -|类型|String| -|默认值|取决于MEMORY\_SIZE的配置。| -|改后生效方式|重启服务生效| - -* OFF\_HEAP\_MEMORY - -|名字|OFF\_HEAP\_MEMORY| -|:---:|:---| -|描述|IoTDB DataNode 能使用的堆外内存大小, 曾用名: MAX\_DIRECT\_MEMORY\_SIZE | -|类型|String| -|默认值|取决于MEMORY\_SIZE的配置| -|改后生效方式|重启服务生效| - -* JMX\_LOCAL - -|名字|JMX\_LOCAL| 
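-|:---:|:---|
-|Description|JMX monitoring mode. `true` allows local monitoring only; `false` allows remote monitoring. To connect to the JMX Service over the network from the local machine (for example, nodeTool.sh tries to connect to 127.0.0.1:31999), set JMX_LOCAL to false.|
-|Type|Enumerated String: "true", "false"|
-|Default|true|
-|Effective|After restarting the service|
-
-* JMX\_PORT
-
-|Name|JMX\_PORT|
-|:---:|:---|
-|Description|JMX listening port. Make sure the port is not a system-reserved port and is not already in use.|
-|Type|Short Int: [0,65535]|
-|Default|31999|
-|Effective|After restarting the service|
-
-## System configuration (iotdb-datanode.properties)
-
-The system configuration is the core configuration for running IoTDB DataNode/Standalone. It mainly sets the parameters of the DataNode/Standalone database engine.
-
-### Data Node RPC service configuration
-
-* dn\_rpc\_address
-
-|Name| dn\_rpc\_address |
-|:---:|:-----------------|
-|Description| Listening address of the client RPC service |
-|Type| String |
-|Default| 0.0.0.0 |
-|Effective| After restarting the service |
-
-* dn\_rpc\_port
-
-|Name| dn\_rpc\_port |
-|:---:|:---|
-|Description| Listening port of the client RPC service |
-|Type| Short Int : [0,65535] |
-|Default| 6667 |
-|Effective| After restarting the service |
-
-* dn\_internal\_address
-
-|Name| dn\_internal\_address |
-|:---:|:---|
-|Description| DataNode internal communication address |
-|Type| String |
-|Default| 127.0.0.1 |
-|Effective| Can only be modified before the service is started for the first time |
-
-* dn\_internal\_port
-
-|Name| dn\_internal\_port |
-|:---:|:-------------------|
-|Description| DataNode internal communication port |
-|Type| int |
-|Default| 10730 |
-|Effective| Can only be modified before the service is started for the first time |
-
-* dn\_mpp\_data\_exchange\_port
-
-|Name| dn\_mpp\_data\_exchange\_port |
-|:---:|:---|
-|Description| MPP data exchange port |
-|Type| int |
-|Default| 10740 |
-|Effective| Can only be modified before the service is started for the first time |
-
-* dn\_schema\_region\_consensus\_port
-
-|Name| dn\_schema\_region\_consensus\_port |
-|:---:|:---|
-|Description| Consensus protocol communication port for DataNode schema replicas |
-|Type| int |
-|Default| 10750 |
-|Effective| Can only be modified before the service is started for the first time |
-
-* dn\_data\_region\_consensus\_port
-
-|Name| dn\_data\_region\_consensus\_port |
-|:---:|:---|
-|Description| Consensus protocol communication port for DataNode data replicas |
-|Type| int |
-|Default| 10760 |
-|Effective| Can only be modified before the service is started for the first time |
-
-* dn\_join\_cluster\_retry\_interval\_ms
-
-|Name| dn\_join\_cluster\_retry\_interval\_ms |
-|:---:|:---------------------------------------|
-|Description| Time the DataNode waits before retrying to join the cluster |
-|Type| long |
-|Default| 5000 |
-|Effective| After restarting the service |
-
-
-### SSL configuration
-
-* enable\_thrift\_ssl
-
-|Name| enable\_thrift\_ssl |
-|:---:|:----------------------------------------------|
-|Description| 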
当enable\_thrift\_ssl配置为true时,将通过dn\_rpc\_port使用 SSL 加密进行通信 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启服务生效 | - -* enable\_https - -|名字| enable\_https | -|:---:|:-------------------------| -|描述| REST Service 是否开启 SSL 配置 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启生效 | - -* key\_store\_path - -|名字| key\_store\_path | -|:---:|:-----------------| -|描述| ssl证书路径 | -|类型| String | -|默认值| "" | -|改后生效方式| 重启服务生效 | - -* key\_store\_pwd - -|名字| key\_store\_pwd | -|:---:|:----------------| -|描述| ssl证书密码 | -|类型| String | -|默认值| "" | -|改后生效方式| 重启服务生效 | - - -### 目标 Config Nodes 配置 - -* dn\_seed\_config\_node - -|名字| dn\_seed\_config\_node | -|:---:|:------------------------------------| -|描述| ConfigNode 地址,DataNode 启动时通过此地址加入集群 | -|类型| String | -|默认值| 127.0.0.1:10710 | -|改后生效方式| 仅允许在第一次启动服务前修改 | - -### 连接配置 - -* dn\_session\_timeout\_threshold - -|名字| dn\_session_timeout_threshold | -|:---:|:------------------------------| -|描述| 最大的会话空闲时间 | -|类型| int | -|默认值| 0 | -|改后生效方式| 重启服务生效 | - - -* dn\_rpc\_thrift\_compression\_enable - -|名字| dn\_rpc\_thrift\_compression\_enable | -|:---:|:---------------------------------| -|描述| 是否启用 thrift 的压缩机制 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启服务生效 | - -* dn\_rpc\_advanced\_compression\_enable - -|名字| dn\_rpc\_advanced\_compression\_enable | -|:---:|:-----------------------------------| -|描述| 是否启用 thrift 的自定制压缩机制 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启服务生效 | - -* dn\_rpc\_selector\_thread\_count - -| 名字 | rpc\_selector\_thread\_count | -|:------:|:-----------------------------| -| 描述 | rpc 选择器线程数量 | -| 类型 | int | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -* dn\_rpc\_min\_concurrent\_client\_num - -| 名字 | rpc\_min\_concurrent\_client\_num | -|:------:|:----------------------------------| -| 描述 | 最小连接数 | -| 类型 | Short Int : [0,65535] | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -* dn\_rpc\_max\_concurrent\_client\_num - -| 名字 | dn\_rpc\_max\_concurrent\_client\_num | -|:------:|:--------------------------------------| -| 描述 | 最大连接数 | -| 类型 | Short 
Int : [0,65535] | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -* dn\_thrift\_max\_frame\_size - -|名字| dn\_thrift\_max\_frame\_size | -|:---:|:---| -|描述| RPC 请求/响应的最大字节数| -|类型| long | -|默认值| 536870912 (默认值512MB,应大于等于 512 * 1024 * 1024) | -|改后生效方式|重启服务生效| - -* dn\_thrift\_init\_buffer\_size - -|名字| dn\_thrift\_init\_buffer\_size | -|:---:|:---| -|描述| 字节数 | -|类型| long | -|默认值| 1024 | -|改后生效方式|重启服务生效| - -* dn\_connection\_timeout\_ms - -| 名字 | dn\_connection\_timeout\_ms | -|:------:|:----------------------------| -| 描述 | 节点连接超时时间 | -| 类型 | int | -| 默认值 | 60000 | -| 改后生效方式 | 重启服务生效 | - -* dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager - -| 名字 | dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager | -|:------:|:--------------------------------------------------------------| -| 描述 | 单 ClientManager 中路由到每个节点的核心 Client 个数 | -| 类型 | int | -| 默认值 | 200 | -| 改后生效方式 | 重启服务生效 | - -* dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager - -| 名字 | dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager | -|:------:|:-------------------------------------------------------------| -| 描述 | 单 ClientManager 中路由到每个节点的最大 Client 个数 | -| 类型 | int | -| 默认值 | 300 | -| 改后生效方式 | 重启服务生效 | - -### 目录配置 - -* dn\_system\_dir - -| 名字 | dn\_system\_dir | -|:------:|:--------------------------------------------------------------------| -| 描述 | IoTDB 元数据存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/system(Windows:data\\datanode\\system) | -| 改后生效方式 | 重启服务生效 | - -* dn\_data\_dirs - -| 名字 | dn\_data\_dirs | -|:------:|:-------------------------------------------------------------------| -| 描述 | IoTDB 数据存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/data(Windows:data\\datanode\\data) | -| 改后生效方式 | 重启服务生效 | - -* dn\_multi\_dir\_strategy - -| 名字 | dn\_multi\_dir\_strategy | 
-|:------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| 描述 | IoTDB 在 data\_dirs 中为 TsFile 选择目录时采用的策略。可使用简单类名或类名全称。系统提供以下三种策略:
1. SequenceStrategy:IoTDB 按顺序选择目录,依次遍历 data\_dirs 中的所有目录,并不断轮循;
2. MaxDiskUsableSpaceFirstStrategy:IoTDB 优先选择 data\_dirs 中对应磁盘空余空间最大的目录;
您可以通过以下方法完成用户自定义策略:
1. 继承 org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy 类并实现自身的 Strategy 方法;
2. 将实现的类的完整类名(包名加类名,UserDefineStrategyPackage)填写到该配置项;
3. 将该类 jar 包添加到工程中。 | -| 类型 | String | -| 默认值 | SequenceStrategy | -| 改后生效方式 | 热加载 | - -* dn\_consensus\_dir - -| 名字 | dn\_consensus\_dir | -|:------:|:-------------------------------------------------------------------------| -| 描述 | IoTDB 共识层日志存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/consensus(Windows:data\\datanode\\consensus) | -| 改后生效方式 | 重启服务生效 | - -* dn\_wal\_dirs - -| 名字 | dn\_wal\_dirs | -|:------:|:---------------------------------------------------------------------| -| 描述 | IoTDB 写前日志存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/wal(Windows:data\\datanode\\wal) | -| 改后生效方式 | 重启服务生效 | - -* dn\_tracing\_dir - -| 名字 | dn\_tracing\_dir | -|:------:|:--------------------------------------------------------------------| -| 描述 | IoTDB 追踪根目录路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | datanode/tracing | -| 改后生效方式 | 重启服务生效 | - -* dn\_sync\_dir - -| 名字 | dn\_sync\_dir | -|:------:|:----------------------------------------------------------------------| -| 描述 | IoTDB sync 存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/sync | -| 改后生效方式 | 重启服务生效 | - -### Metric 配置 - -## 开启 GC 日志 - -GC 日志默认是关闭的。为了性能调优,用户可能会需要收集 GC 信息。 -若要打开 GC 日志,则需要在启动 IoTDB Server 的时候加上"printgc"参数: - -```bash -nohup sbin/start-datanode.sh printgc >/dev/null 2>&1 & -``` - -或者 - -```bash -sbin\start-datanode.bat printgc -``` - -GC 日志会被存储在`IOTDB_HOME/logs/gc.log`. 
至多会存储 10 个 gc.log 文件,每个文件最多 10MB。 - -#### REST 服务配置 - -* enable\_rest\_service - -|名字| enable\_rest\_service | -|:---:|:--------------------| -|描述| 是否开启Rest服务。 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启生效 | - -* rest\_service\_port - -|名字| rest\_service\_port | -|:---:|:------------------| -|描述| Rest服务监听端口号 | -|类型| int32 | -|默认值| 18080 | -|改后生效方式| 重启生效 | - -* enable\_swagger - -|名字| enable\_swagger | -|:---:|:-----------------------| -|描述| 是否启用swagger来展示rest接口信息 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启生效 | - -* rest\_query\_default\_row\_size\_limit - -|名字| rest\_query\_default\_row\_size\_limit | -|:---:|:----------------------------------| -|描述| 一次查询能返回的结果集最大行数 | -|类型| int32 | -|默认值| 10000 | -|改后生效方式| 重启生效 | - -* cache\_expire - -|名字| cache\_expire | -|:---:|:--------------| -|描述| 缓存客户登录信息的过期时间 | -|类型| int32 | -|默认值| 28800 | -|改后生效方式| 重启生效 | - -* cache\_max\_num - -|名字| cache\_max\_num | -|:---:|:--------------| -|描述| 缓存中存储的最大用户数量 | -|类型| int32 | -|默认值| 100 | -|改后生效方式| 重启生效 | - -* cache\_init\_num - -|名字| cache\_init\_num | -|:---:|:---------------| -|描述| 缓存初始容量 | -|类型| int32 | -|默认值| 10 | -|改后生效方式| 重启生效 | - -* trust\_store\_path - -|名字| trust\_store\_path | -|:---:|:---------------| -|描述| keyStore 密码(非必填) | -|类型| String | -|默认值| "" | -|改后生效方式| 重启生效 | - -* trust\_store\_pwd - -|名字| trust\_store\_pwd | -|:---:|:---------------| -|描述| trustStore 密码(非必填) | -|类型| String | -|默认值| "" | -|改后生效方式| 重启生效 | - -* idle\_timeout - -|名字| idle\_timeout | -|:---:|:--------------| -|描述| SSL 超时时间,单位为秒 | -|类型| int32 | -|默认值| 5000 | -|改后生效方式| 重启生效 | - -#### 存储引擎配置 - -* dn\_default\_space\_move\_thresholds - -|名字| dn\_default\_space\_move\_thresholds | -|:---:|:--------------| -|描述| 1.3.0/1版本:定义每个层级数据目录的最小剩余空间比例;当剩余空间少于该比例时,数据会被自动迁移至下一个层级;当最后一个层级的剩余存储空间到低于此阈值时,会将系统置为 READ_ONLY | -|类型| double | -|默认值| 0.15 | -|改后生效方式| 热加载 | - -* dn\_default\_space\_usage\_thresholds - -|名字| dn\_default\_space\_usage\_thresholds | -|:---:|:--------------| -|描述| 
1.3.2版本:定义每个层级数据目录的最小剩余空间比例;当剩余空间少于该比例时,数据会被自动迁移至下一个层级;当最后一个层级的剩余存储空间到低于此阈值时,会将系统置为 READ_ONLY | -|类型| double | -|默认值| 0.85 | -|改后生效方式| 热加载 | - -* remote\_tsfile\_cache\_dirs - -|名字| remote\_tsfile\_cache\_dirs | -|:---:|:--------------| -|描述| 云端存储在本地的缓存目录 | -|类型| string | -|默认值| data/datanode/data/cache | -|改后生效方式| 重启生效 | - -* remote\_tsfile\_cache\_page\_size\_in\_kb - -|名字| remote\_tsfile\_cache\_page\_size\_in\_kb | -|:---:|:--------------| -|描述| 云端存储在本地缓存文件的块大小 | -|类型| int | -|默认值| 20480 | -|改后生效方式| 重启生效 | - -* remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb - -|名字| remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb | -|:---:|:--------------| -|描述| 云端存储本地缓存的最大磁盘占用大小 | -|类型| long | -|默认值| 51200 | -|改后生效方式| 重启生效 | - -* object\_storage\_type - -|名字| object\_storage\_type | -|:---:|:--------------| -|描述| 云端存储类型 | -|类型| string | -|默认值| AWS_S3 | -|改后生效方式| 重启生效 | - -* object\_storage\_bucket - -|名字| object\_storage\_bucket | -|:---:|:--------------| -|描述| 云端存储 bucket 的名称 | -|类型| string | -|默认值| iotdb_data | -|改后生效方式| 重启生效 | - -* object\_storage\_endpoint - -|名字| object\_storage\_endpoint | -|:---:|:---------------------------| -|描述| 云端存储的 endpoint | -|类型| string | -|默认值| 无 | -|改后生效方式| 重启生效 | - -* object\_storage\_access\_key - -|名字| object\_storage\_access\_key | -|:---:|:--------------| -|描述| 云端存储的验证信息 key | -|类型| string | -|默认值| 无 | -|改后生效方式| 重启生效 | - -* object\_storage\_access\_secret - -|名字| object\_storage\_access\_secret | -|:---:|:--------------| -|描述| 云端存储的验证信息 secret | -|类型| string | -|默认值| 无 | -|改后生效方式| 重启生效 | \ No newline at end of file diff --git a/src/zh/UserGuide/V1.3.x/Reference/DataNode-Config-Manual_timecho.md b/src/zh/UserGuide/V1.3.x/Reference/DataNode-Config-Manual_timecho.md deleted file mode 100644 index 0105369f1..000000000 --- a/src/zh/UserGuide/V1.3.x/Reference/DataNode-Config-Manual_timecho.md +++ /dev/null @@ -1,576 +0,0 @@ - - -# DataNode 配置参数 - -IoTDB DataNode 与 Standalone 模式共用一套配置文件,均位于 IoTDB 安装目录:`conf`文件夹下。 - -* 
`datanode-env.sh/bat`:环境配置项的配置文件,可以配置 DataNode 的内存大小。 - -* `iotdb-system.properties`:IoTDB 的配置文件。 - -## 热修改配置项 - -为方便用户使用,IoTDB 为用户提供了热修改功能,即在系统运行过程中修改 `iotdb-system.properties` 中部分配置参数并即时应用到系统中。下面介绍的参数中,改后 生效方式为`热加载` -的均为支持热修改的配置参数。 - -通过 Session 或 Cli 发送 ```load configuration``` 或 `set configuration` 命令(SQL)至 IoTDB 可触发配置热加载。 - -## 环境配置项(datanode-env.sh/bat) - -环境配置项主要用于对 DataNode 运行的 Java 环境相关参数进行配置,如 JVM 相关配置。DataNode/Standalone 启动时,此部分配置会被传给 JVM,详细配置项说明如下: - -* MEMORY\_SIZE - -|名字|MEMORY\_SIZE| -|:---:|:---| -|描述|IoTDB DataNode 启动时分配的内存大小 | -|类型|String| -|默认值|取决于操作系统和机器配置。默认为机器内存的二分之一。| -|改后生效方式|重启服务生效| - -* ON\_HEAP\_MEMORY - -|名字|ON\_HEAP\_MEMORY| -|:---:|:---| -|描述|IoTDB DataNode 能使用的堆内内存大小, 曾用名: MAX\_HEAP\_SIZE | -|类型|String| -|默认值|取决于MEMORY\_SIZE的配置。| -|改后生效方式|重启服务生效| - -* OFF\_HEAP\_MEMORY - -|名字|OFF\_HEAP\_MEMORY| -|:---:|:---| -|描述|IoTDB DataNode 能使用的堆外内存大小, 曾用名: MAX\_DIRECT\_MEMORY\_SIZE | -|类型|String| -|默认值|取决于MEMORY\_SIZE的配置| -|改后生效方式|重启服务生效| - -* JMX\_LOCAL - -|名字|JMX\_LOCAL| -|:---:|:---| -|描述|JMX 监控模式,配置为 true 表示仅允许本地监控,设置为 false 的时候表示允许远程监控。如想在本地通过网络连接JMX Service,比如nodeTool.sh会尝试连接127.0.0.1:31999,请将JMX_LOCAL设置为false。| -|类型|枚举 String : “true”, “false”| -|默认值|true| -|改后生效方式|重启服务生效| - -* JMX\_PORT - -|名字|JMX\_PORT| -|:---:|:---| -|描述|JMX 监听端口。请确认该端口是不是系统保留端口并且未被占用。| -|类型|Short Int: [0,65535]| -|默认值|31999| -|改后生效方式|重启服务生效| - -## 系统配置项(iotdb-system.properties) - -系统配置项是 IoTDB DataNode/Standalone 运行的核心配置,它主要用于设置 DataNode/Standalone 数据库引擎的参数。 - -### Data Node RPC 服务配置 - -* dn\_rpc\_address - -|名字| dn\_rpc\_address | -|:---:|:-----------------| -|描述| 客户端 RPC 服务监听地址 | -|类型| String | -|默认值| 0.0.0.0 | -|改后生效方式| 重启服务生效 | - -* dn\_rpc\_port - -|名字| dn\_rpc\_port | -|:---:|:---| -|描述| Client RPC 服务监听端口| -|类型| Short Int : [0,65535] | -|默认值| 6667 | -|改后生效方式|重启服务生效| - -* dn\_internal\_address - -|名字| dn\_internal\_address | -|:---:|:---| -|描述| DataNode 内网通信地址 | -|类型| string | -|默认值| 127.0.0.1 | -|改后生效方式|仅允许在第一次启动服务前修改| - -* dn\_internal\_port - -|名字| 
dn\_internal\_port | -|:---:|:-------------------| -|描述| DataNode 内网通信端口 | -|类型| int | -|默认值| 10730 | -|改后生效方式| 仅允许在第一次启动服务前修改 | - -* dn\_mpp\_data\_exchange\_port - -|名字| dn\_mpp\_data\_exchange\_port | -|:---:|:---| -|描述| MPP 数据交换端口 | -|类型| int | -|默认值| 10740 | -|改后生效方式|仅允许在第一次启动服务前修改| - -* dn\_schema\_region\_consensus\_port - -|名字| dn\_schema\_region\_consensus\_port | -|:---:|:---| -|描述| DataNode 元数据副本的共识协议通信端口 | -|类型| int | -|默认值| 10750 | -|改后生效方式|仅允许在第一次启动服务前修改| - -* dn\_data\_region\_consensus\_port - -|名字| dn\_data\_region\_consensus\_port | -|:---:|:---| -|描述| DataNode 数据副本的共识协议通信端口 | -|类型| int | -|默认值| 10760 | -|改后生效方式|仅允许在第一次启动服务前修改| - -* dn\_join\_cluster\_retry\_interval\_ms - -|名字| dn\_join\_cluster\_retry\_interval\_ms | -|:---:|:---------------------------------------| -|描述| DataNode 再次重试加入集群等待时间 | -|类型| long | -|默认值| 5000 | -|改后生效方式| 重启服务生效 | - - -### SSL 配置 - -* enable\_thrift\_ssl - -|名字| enable\_thrift\_ssl | -|:---:|:----------------------------------------------| -|描述| 当enable\_thrift\_ssl配置为true时,将通过dn\_rpc\_port使用 SSL 加密进行通信 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启服务生效 | - -* enable\_https - -|名字| enable\_https | -|:---:|:-------------------------| -|描述| REST Service 是否开启 SSL 配置 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启生效 | - -* key\_store\_path - -|名字| key\_store\_path | -|:---:|:-----------------| -|描述| ssl证书路径 | -|类型| String | -|默认值| "" | -|改后生效方式| 重启服务生效 | - -* key\_store\_pwd - -|名字| key\_store\_pwd | -|:---:|:----------------| -|描述| ssl证书密码 | -|类型| String | -|默认值| "" | -|改后生效方式| 重启服务生效 | - - -### SeedConfigNode 配置 - -* dn\_seed\_config\_node - -|名字| dn\_seed\_config\_node | -|:---:|:------------------------------------| -|描述| ConfigNode 地址,DataNode 启动时通过此地址加入集群,推荐使用 SeedConfigNode。V1.2.2 及以前曾用名是 dn\_target\_config\_node\_list | -|类型| String | -|默认值| 127.0.0.1:10710 | -|改后生效方式| 仅允许在第一次启动服务前修改 | - -### 连接配置 - -* dn\_session\_timeout\_threshold - -|名字| dn\_session_timeout_threshold | -|:---:|:------------------------------| 
-|描述| 最大的会话空闲时间 | -|类型| int | -|默认值| 0 | -|改后生效方式| 重启服务生效 | - - -* dn\_rpc\_thrift\_compression\_enable - -|名字| dn\_rpc\_thrift\_compression\_enable | -|:---:|:---------------------------------| -|描述| 是否启用 thrift 的压缩机制 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启服务生效 | - -* dn\_rpc\_advanced\_compression\_enable - -|名字| dn\_rpc\_advanced\_compression\_enable | -|:---:|:-----------------------------------| -|描述| 是否启用 thrift 的自定制压缩机制 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启服务生效 | - -* dn\_rpc\_selector\_thread\_count - -| 名字 | rpc\_selector\_thread\_count | -|:------:|:-----------------------------| -| 描述 | rpc 选择器线程数量 | -| 类型 | int | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -* dn\_rpc\_min\_concurrent\_client\_num - -| 名字 | rpc\_min\_concurrent\_client\_num | -|:------:|:----------------------------------| -| 描述 | 最小连接数 | -| 类型 | Short Int : [0,65535] | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -* dn\_rpc\_max\_concurrent\_client\_num - -| 名字 | dn\_rpc\_max\_concurrent\_client\_num | -|:------:|:--------------------------------------| -| 描述 | 最大连接数 | -| 类型 | Short Int : [0,65535] | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -* dn\_thrift\_max\_frame\_size - -|名字| dn\_thrift\_max\_frame\_size | -|:---:|:---| -|描述| RPC 请求/响应的最大字节数| -|类型| long | -|默认值| 536870912 (默认值512MB) | -|改后生效方式|重启服务生效| - -* dn\_thrift\_init\_buffer\_size - -|名字| dn\_thrift\_init\_buffer\_size | -|:---:|:---| -|描述| 字节数 | -|类型| long | -|默认值| 1024 | -|改后生效方式|重启服务生效| - -* dn\_connection\_timeout\_ms - -| 名字 | dn\_connection\_timeout\_ms | -|:------:|:----------------------------| -| 描述 | 节点连接超时时间 | -| 类型 | int | -| 默认值 | 60000 | -| 改后生效方式 | 重启服务生效 | - -* dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager - -| 名字 | dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager | -|:------:|:--------------------------------------------------------------| -| 描述 | 单 ClientManager 中路由到每个节点的核心 Client 个数 | -| 类型 | int | -| 默认值 | 200 | -| 改后生效方式 | 重启服务生效 | - -* 
dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager - -| 名字 | dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager | -|:------:|:-------------------------------------------------------------| -| 描述 | 单 ClientManager 中路由到每个节点的最大 Client 个数 | -| 类型 | int | -| 默认值 | 300 | -| 改后生效方式 | 重启服务生效 | - -### 目录配置 - -* dn\_system\_dir - -| 名字 | dn\_system\_dir | -|:------:|:--------------------------------------------------------------------| -| 描述 | IoTDB 元数据存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/system(Windows:data\\datanode\\system) | -| 改后生效方式 | 重启服务生效 | - -* dn\_data\_dirs - -| 名字 | dn\_data\_dirs | -|:------:|:-------------------------------------------------------------------| -| 描述 | IoTDB 数据存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/data(Windows:data\\datanode\\data) | -| 改后生效方式 | 重启服务生效 | - -* dn\_multi\_dir\_strategy - -| 名字 | dn\_multi\_dir\_strategy | -|:------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| 描述 | IoTDB 在 data\_dirs 中为 TsFile 选择目录时采用的策略。可使用简单类名或类名全称。系统提供以下三种策略:
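1. SequenceStrategy: IoTDB picks directories in order, iterating over every directory in data\_dirs in a round-robin loop;
2. MaxDiskUsableSpaceFirstStrategy: IoTDB prefers the directory in data\_dirs whose disk has the most free space;
To plug in a user-defined strategy:
1. Extend the class org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy and implement your own strategy method;
2. Set this configuration item to the fully qualified class name of your implementation (package name plus class name, UserDefineStrategyPackage);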
3. 将该类 jar 包添加到工程中。 | -| 类型 | String | -| 默认值 | SequenceStrategy | -| 改后生效方式 | 热加载 | - -* dn\_consensus\_dir - -| 名字 | dn\_consensus\_dir | -|:------:|:-------------------------------------------------------------------------| -| 描述 | IoTDB 共识层日志存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/consensus(Windows:data\\datanode\\consensus) | -| 改后生效方式 | 重启服务生效 | - -* dn\_wal\_dirs - -| 名字 | dn\_wal\_dirs | -|:------:|:---------------------------------------------------------------------| -| 描述 | IoTDB 写前日志存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/wal(Windows:data\\datanode\\wal) | -| 改后生效方式 | 重启服务生效 | - -* dn\_tracing\_dir - -| 名字 | dn\_tracing\_dir | -|:------:|:--------------------------------------------------------------------| -| 描述 | IoTDB 追踪根目录路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | datanode/tracing | -| 改后生效方式 | 重启服务生效 | - -* dn\_sync\_dir - -| 名字 | dn\_sync\_dir | -|:------:|:----------------------------------------------------------------------| -| 描述 | IoTDB sync 存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/sync | -| 改后生效方式 | 重启服务生效 | - -### Metric 配置 - -## 开启 GC 日志 - -GC 日志默认是关闭的。为了性能调优,用户可能会需要收集 GC 信息。 -若要打开 GC 日志,则需要在启动 IoTDB Server 的时候加上"printgc"参数: - -```bash -nohup sbin/start-datanode.sh printgc >/dev/null 2>&1 & -``` - -或者 - -```bash -sbin\start-datanode.bat printgc -``` - -GC 日志会被存储在`IOTDB_HOME/logs/gc.log`. 
至多会存储 10 个 gc.log 文件,每个文件最多 10MB。 - -#### REST 服务配置 - -* enable\_rest\_service - -|名字| enable\_rest\_service | -|:---:|:--------------------| -|描述| 是否开启Rest服务。 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启生效 | - -* rest\_service\_port - -|名字| rest\_service\_port | -|:---:|:------------------| -|描述| Rest服务监听端口号 | -|类型| int32 | -|默认值| 18080 | -|改后生效方式| 重启生效 | - -* enable\_swagger - -|名字| enable\_swagger | -|:---:|:-----------------------| -|描述| 是否启用swagger来展示rest接口信息 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启生效 | - -* rest\_query\_default\_row\_size\_limit - -|名字| rest\_query\_default\_row\_size\_limit | -|:---:|:----------------------------------| -|描述| 一次查询能返回的结果集最大行数 | -|类型| int32 | -|默认值| 10000 | -|改后生效方式| 重启生效 | - -* cache\_expire - -|名字| cache\_expire | -|:---:|:--------------| -|描述| 缓存客户登录信息的过期时间 | -|类型| int32 | -|默认值| 28800 | -|改后生效方式| 重启生效 | - -* cache\_max\_num - -|名字| cache\_max\_num | -|:---:|:--------------| -|描述| 缓存中存储的最大用户数量 | -|类型| int32 | -|默认值| 100 | -|改后生效方式| 重启生效 | - -* cache\_init\_num - -|名字| cache\_init\_num | -|:---:|:---------------| -|描述| 缓存初始容量 | -|类型| int32 | -|默认值| 10 | -|改后生效方式| 重启生效 | - -* trust\_store\_path - -|名字| trust\_store\_path | -|:---:|:---------------| -|描述| keyStore 密码(非必填) | -|类型| String | -|默认值| "" | -|改后生效方式| 重启生效 | - -* trust\_store\_pwd - -|名字| trust\_store\_pwd | -|:---:|:---------------| -|描述| trustStore 密码(非必填) | -|类型| String | -|默认值| "" | -|改后生效方式| 重启生效 | - -* idle\_timeout - -|名字| idle\_timeout | -|:---:|:--------------| -|描述| SSL 超时时间,单位为秒 | -|类型| int32 | -|默认值| 5000 | -|改后生效方式| 重启生效 | - - - -#### 多级存储配置 - -* dn\_default\_space\_usage\_thresholds - -|名字| dn\_default\_space\_usage\_thresholds | -|:---:|:--------------| -|描述| 定义每个层级数据目录的最小剩余空间比例;当剩余空间少于该比例时,数据会被自动迁移至下一个层级;当最后一个层级的剩余存储空间到低于此阈值时,会将系统置为 READ_ONLY | -|类型| double | -|默认值| 0.85 | -|改后生效方式| 热加载 | - -* remote\_tsfile\_cache\_dirs - -|名字| remote\_tsfile\_cache\_dirs | -|:---:|:--------------| -|描述| 云端存储在本地的缓存目录 | -|类型| string | -|默认值| 
data/datanode/data/cache | -|改后生效方式| 重启生效 | - -* remote\_tsfile\_cache\_page\_size\_in\_kb - -|名字| remote\_tsfile\_cache\_page\_size\_in\_kb | -|:---:|:--------------| -|描述| 云端存储在本地缓存文件的块大小 | -|类型| int | -|默认值| 20480 | -|改后生效方式| 重启生效 | - -* remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb - -|名字| remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb | -|:---:|:--------------| -|描述| 云端存储本地缓存的最大磁盘占用大小 | -|类型| long | -|默认值| 51200 | -|改后生效方式| 重启生效 | - -* object\_storage\_type - -|名字| object\_storage\_type | -|:---:|:--------------| -|描述| 云端存储类型 | -|类型| string | -|默认值| AWS_S3 | -|改后生效方式| 重启生效 | - -* object\_storage\_bucket - -|名字| object\_storage\_bucket | -|:---:|:--------------| -|描述| 云端存储 bucket 的名称 | -|类型| string | -|默认值| iotdb_data | -|改后生效方式| 重启生效 | - -* object\_storage\_endpoint - -|名字| object\_storage\_endpoint | -|:---:|:---------------------------| -|描述| 云端存储的 endpoint | -|类型| string | -|默认值| 无 | -|改后生效方式| 重启生效 | - -* object\_storage\_access\_key - -|名字| object\_storage\_access\_key | -|:---:|:--------------| -|描述| 云端存储的验证信息 key | -|类型| string | -|默认值| 无 | -|改后生效方式| 重启生效 | - -* object\_storage\_access\_secret - -|名字| object\_storage\_access\_secret | -|:---:|:--------------| -|描述| 云端存储的验证信息 secret | -|类型| string | -|默认值| 无 | -|改后生效方式| 重启生效 | \ No newline at end of file diff --git a/src/zh/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md b/src/zh/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md deleted file mode 100644 index df869257e..000000000 --- a/src/zh/UserGuide/V1.3.x/SQL-Manual/UDF-Libraries_timecho.md +++ /dev/null @@ -1,5091 +0,0 @@ - -# UDF函数库 - -基于用户自定义函数能力,IoTDB 提供了一系列关于时序数据处理的函数,包括数据质量、数据画像、异常检测、 频域分析、数据匹配、数据修复、序列发现、机器学习等,能够满足工业领域对时序数据处理的需求。 - -> 注意:当前UDF函数库中的函数仅支持毫秒级的时间戳精度。 - -## 安装步骤 -1. 
请获取与 IoTDB 版本兼容的 UDF 函数库 JAR 包的压缩包。 - - | UDF 安装包 | 支持的 IoTDB 版本 | 下载链接 | - | --------------- | ----------------- | ------------------------------------------------------------ | - | TimechoDB-UDF-1.3.3.zip | V1.3.3及以上 | 请联系天谋商务获取 | - | TimechoDB-UDF-1.3.2.zip | V1.0.0~V1.3.2 | 请联系天谋商务获取 | - -2. 将获取的压缩包中的 `library-udf.jar` 文件放置在 IoTDB 集群所有节点的 `/ext/udf` 的目录下 -3. 在 IoTDB 的 SQL 命令行终端(CLI)或可视化控制台(Workbench)的 SQL 操作界面中,执行下述相应的函数注册语句。 -4. 批量注册:两种注册方式:注册脚本 或 SQL汇总语句 -- 注册脚本 - - 将压缩包中的注册脚本(`register-UDF.sh` 或 `register-UDF.bat`)按需复制到 IoTDB 的 tools 目录下,修改脚本中的参数(默认为host=127.0.0.1,rpcPort=6667,user=root,pass=root); - - 启动 IoTDB 服务,运行注册脚本批量注册 UDF - -- SQL汇总语句 - - 打开压缩包中的SQl文件,复制全部 SQL 语句,在 IoTDB 的 SQL 命令行终端(CLI)或可视化控制台(Workbench)的 SQL 操作界面中,执行全部 SQl 语句批量注册 UDF - -## 数据质量 - -### Completeness - -#### 注册语句 - -```sql -create function completeness as 'org.apache.iotdb.library.dquality.UDTFCompleteness' -``` - -#### 函数简介 - -本函数用于计算时间序列的完整性,用来衡量一段时序数据有没有缺失。函数会把输入的时序数据分成连续不重叠的时间窗口,分别计算每个窗口的数据完整程度,并输出窗口第一个数据点的时间戳和完整性结果。 - -**函数名:** COMPLETENESS - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `window`:窗口大小,它是一个大于0的整数或者一个有单位的正数。前者代表每一个窗口包含的数据点数目,最后一个窗口的数据点数目可能会不足;后者代表窗口的时间跨度,目前支持五种单位,分别是'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。缺省情况下,全部输入数据都属于同一个窗口。 -+ `downtime`:完整性计算是否考虑停机异常。它的取值为 'true' 或 'false',默认值为 'true'. 在考虑停机异常时,长时间的数据缺失将被视作停机,不对完整性产生影响。 - -**输出序列:** 输出单个序列,类型为DOUBLE,其中每一个数据点的值的范围都是 [0,1]. 
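The `window` parameter described above takes two forms: a plain positive integer (points per window) or a positive number followed by one of the units ms, s, m, h, d (time span per window). The Python sketch below only illustrates that distinction. The helper name `parse_window` and its return shape are hypothetical and not part of the IoTDB UDF library, whose actual parsing may differ.

```python
import re

# Milliseconds per supported unit, following the units listed for `window`.
UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}


def parse_window(value: str):
    """Hypothetical helper: classify a `window` value.

    Returns ("count", n) for a plain positive integer, or
    ("duration_ms", x) for a positive number followed by a unit.
    """
    text = value.strip()
    if re.fullmatch(r"\d+", text):
        n = int(text)
        if n <= 0:
            raise ValueError("window must be greater than 0")
        return ("count", n)
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h|d)", text)
    if match is None:
        raise ValueError(f"unsupported window value: {value!r}")
    amount = float(match.group(1))
    if amount <= 0:
        raise ValueError("window must be positive")
    return ("duration_ms", amount * UNIT_MS[match.group(2)])


print(parse_window("15"))   # a count-based window of 15 points
print(parse_window("10s"))  # a time-based window of 10 seconds
```

Under this sketch, a value without a unit is always read as a point count, which matches the example query that passes `"window"="15"`.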
- -**提示:** 只有当窗口内的数据点数目超过10时,才会进行完整性计算。否则,该窗口将被忽略,不做任何输出。 - - -#### 使用示例 - -##### 参数缺省 - -在参数缺省的情况下,本函数将会把全部输入数据都作为同一个窗口计算完整性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select completeness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -输出序列: - -``` -+-----------------------------+-----------------------------+ -| Time|completeness(root.test.d1.s1)| -+-----------------------------+-----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -+-----------------------------+-----------------------------+ -``` - -##### 指定窗口大小 - -在指定窗口大小的情况下,本函数会把输入数据划分为若干个窗口计算完整性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| 
-|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select completeness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------+ -| Time|completeness(root.test.d1.s1, "window"="15")| -+-----------------------------+--------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+--------------------------------------------+ -``` - -### Consistency - -#### 注册语句 - -```sql -create function consistency as 'org.apache.iotdb.library.dquality.UDTFConsistency' -``` - -#### 函数简介 - -本函数用于计算时间序列的一致性,用来衡量时序数据变化是否平稳、规律是否统一。函数会把输入的时序数据分成连续不重叠的时间窗口,分别计算每个窗口的数据一致性,并输出窗口第一个数据点的时间戳和一致性结果。 - -**函数名:** CONSISTENCY - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `window`:窗口大小,它是一个大于0的整数或者一个有单位的正数。前者代表每一个窗口包含的数据点数目,最后一个窗口的数据点数目可能会不足;后者代表窗口的时间跨度,目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。缺省情况下,全部输入数据都属于同一个窗口。 - -**输出序列:** 输出单个序列,类型为DOUBLE,其中每一个数据点的值的范围都是 [0,1]. 
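The count-based windowing described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the library's code: `split_into_windows` is a hypothetical helper and it covers only the integer form of `window`. It cuts the points into consecutive non-overlapping windows, allows the last window to fall short, and reports the first timestamp of each window, which is the timestamp attached to that window's result.

```python
from typing import List, Tuple

Point = Tuple[int, float]  # (timestamp in milliseconds, value)


def split_into_windows(points: List[Point], window: int) -> List[List[Point]]:
    """Hypothetical helper: consecutive, non-overlapping, count-based windows."""
    if window <= 0:
        raise ValueError("window must be greater than 0")
    # The final slice may hold fewer than `window` points.
    return [points[i:i + window] for i in range(0, len(points), window)]


# Synthetic data for illustration: 30 points spaced 2 seconds apart.
points = [(2_000 * i, float(i)) for i in range(30)]
windows = split_into_windows(points, 15)

# One output row per window, stamped with the window's first timestamp.
first_timestamps = [w[0][0] for w in windows]
print(first_timestamps)  # prints [0, 30000]
```

With 30 input points and a window of 15, the sketch yields two windows, so a query with `"window"="15"` over 30 points would be expected to return two rows.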
- -**提示:** 只有当窗口内的数据点数目超过10时,才会进行一致性计算。否则,该窗口将被忽略,不做任何输出。 - - -#### 使用示例 - -##### 参数缺省 - -在参数缺省的情况下,本函数将会把全部输入数据都作为同一个窗口计算一致性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select consistency(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -输出序列: - -``` -+-----------------------------+----------------------------+ -| Time|consistency(root.test.d1.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+----------------------------+ -``` - -##### 指定窗口大小 - -在指定窗口大小的情况下,本函数会把输入数据划分为若干个窗口计算一致性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| 
-|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select consistency(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------+ -| Time|consistency(root.test.d1.s1, "window"="15")| -+-----------------------------+-------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+-------------------------------------------+ -``` - -### Timeliness - -#### 注册语句 - -```sql -create function timeliness as 'org.apache.iotdb.library.dquality.UDTFTimeliness' -``` - -#### 函数简介 - -本函数用于计算时间序列的时效性,用来衡量时序数据是否按时采集、按时上报。函数会把输入的时序数据分成连续不重叠的时间窗口,分别计算每个窗口的数据时效性,并输出窗口第一个数据点的时间戳和时效性结果。 - -**函数名:** TIMELINESS - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `window`:窗口大小,它是一个大于0的整数或者一个有单位的正数。前者代表每一个窗口包含的数据点数目,最后一个窗口的数据点数目可能会不足;后者代表窗口的时间跨度,目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。缺省情况下,全部输入数据都属于同一个窗口。 - -**输出序列:** 输出单个序列,类型为DOUBLE,其中每一个数据点的值的范围都是 [0,1]. 
- -**提示:** 只有当窗口内的数据点数目超过10时,才会进行时效性计算。否则,该窗口将被忽略,不做任何输出。 - - -#### 使用示例 - -##### 参数缺省 - -在参数缺省的情况下,本函数将会把全部输入数据都作为同一个窗口计算时效性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select timeliness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -输出序列: - -``` -+-----------------------------+---------------------------+ -| Time|timeliness(root.test.d1.s1)| -+-----------------------------+---------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+---------------------------+ -``` - -##### 指定窗口大小 - -在指定窗口大小的情况下,本函数会把输入数据划分为若干个窗口计算时效性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| 
-|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select timeliness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------+ -| Time|timeliness(root.test.d1.s1, "window"="15")| -+-----------------------------+------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+------------------------------------------+ -``` - -### Validity - -#### 注册语句 - -```sql -create function validity as 'org.apache.iotdb.library.dquality.UDTFValidity' -``` - -#### 函数简介 - -本函数用于计算时间序列的有效性,用来衡量时序数据是否正常、可用、无异常值。函数会把输入的时序数据分成连续不重叠的时间窗口,分别计算每个窗口的数据有效性,并输出窗口第一个数据点的时间戳和有效性结果。 - -**函数名:** VALIDITY - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `window`:窗口大小,它是一个大于0的整数或者一个有单位的正数。前者代表每一个窗口包含的数据点数目,最后一个窗口的数据点数目可能会不足;后者代表窗口的时间跨度,目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。缺省情况下,全部输入数据都属于同一个窗口。 - -**输出序列:** 输出单个序列,类型为DOUBLE,其中每一个数据点的值的范围都是 [0,1]. 
- -**提示:** 只有当窗口内的数据点数目超过10时,才会进行有效性计算。否则,该窗口将被忽略,不做任何输出。 - - -#### 使用示例 - -##### 参数缺省 - -在参数缺省的情况下,本函数将会把全部输入数据都作为同一个窗口计算有效性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select validity(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -输出序列: - -``` -+-----------------------------+-------------------------+ -| Time|validity(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -+-----------------------------+-------------------------+ -``` - -##### 指定窗口大小 - -在指定窗口大小的情况下,本函数会把输入数据划分为若干个窗口计算有效性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| 
-|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select validity(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -输出序列: - -``` -+-----------------------------+----------------------------------------+ -| Time|validity(root.test.d1.s1, "window"="15")| -+-----------------------------+----------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - - - - -## 数据画像 - -### ACF - -#### 注册语句 - -```sql -create function acf as 'org.apache.iotdb.library.dprofile.UDTFACF' -``` - -#### 函数简介 - -本函数用于计算时间序列的自相关函数值,即序列与自身之间的互相关函数。 - -**函数名:** ACF - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列中共包含$2N-1$个数据点。 - -**提示:** - -+ 序列中的`NaN`值会被忽略,在计算中表现为0。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ 
-``` - - -用于查询的 SQL 语句: - -```sql -select acf(s1) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -输出序列: - -``` -+-----------------------------+--------------------+ -| Time|acf(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 6.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -|1970-01-01T08:00:00.004+08:00| 7.0| -|1970-01-01T08:00:00.005+08:00| 0.0| -|1970-01-01T08:00:00.006+08:00| 3.6| -|1970-01-01T08:00:00.007+08:00| 0.0| -|1970-01-01T08:00:00.008+08:00| 1.0| -+-----------------------------+--------------------+ -``` - -### Distinct - -#### 注册语句 - -```sql -create function distinct as 'org.apache.iotdb.library.dprofile.UDTFDistinct' -``` - -#### 函数简介 - -本函数可以返回输入序列中出现的所有不同的元素。 - -**函数名:** DISTINCT - -**输入序列:** 仅支持单个输入序列,类型可以是任意的 - -**输出序列:** 输出单个序列,类型与输入相同。 - -**提示:** - -+ 输出序列的时间戳是无意义的。输出顺序是任意的。 -+ 缺失值和空值将被忽略,但`NaN`不会被忽略。 -+ 字符串区分大小写 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2020-01-01T08:00:00.001+08:00| Hello| -|2020-01-01T08:00:00.002+08:00| hello| -|2020-01-01T08:00:00.003+08:00| Hello| -|2020-01-01T08:00:00.004+08:00| World| -|2020-01-01T08:00:00.005+08:00| World| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select distinct(s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------+ -| Time|distinct(root.test.d2.s2)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.001+08:00| Hello| -|1970-01-01T08:00:00.002+08:00| hello| -|1970-01-01T08:00:00.003+08:00| World| -+-----------------------------+-------------------------+ -``` - -### Histogram - -#### 注册语句 - -```sql -create function histogram as 'org.apache.iotdb.library.dprofile.UDTFHistogram' -``` - -#### 函数简介 - -本函数用于计算单列数值型数据的分布直方图。 - -**函数名:** 
HISTOGRAM - -**Input series:** Only a single input series is supported; its type must be INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `min`: the lower bound of the requested data range; the default is -Double.MAX_VALUE. -+ `max`: the upper bound of the requested data range; the default is Double.MAX_VALUE. The value of `min` must be less than or equal to `max`. -+ `count`: the number of histogram buckets; the default is 1, and the value must be a positive integer. - -**Output series:** The value of each histogram bucket, where the i-th bucket (counting from 1) covers the range with lower bound $min+ (i-1)\cdot\frac{max-min}{count}$ and upper bound $min+ i \cdot \frac{max-min}{count}$. - - -**Note:** - -+ A data point smaller than `min` is placed in the first bucket; a data point larger than `max` is placed in the last bucket. -+ Null values, missing values, and `NaN` in the data are ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 11.0| -|2020-01-01T00:00:11.000+08:00| 12.0| -|2020-01-01T00:00:12.000+08:00| 13.0| -|2020-01-01T00:00:13.000+08:00| 14.0| -|2020-01-01T00:00:14.000+08:00| 15.0| -|2020-01-01T00:00:15.000+08:00| 16.0| -|2020-01-01T00:00:16.000+08:00| 17.0| -|2020-01-01T00:00:17.000+08:00| 18.0| -|2020-01-01T00:00:18.000+08:00| 19.0| -|2020-01-01T00:00:19.000+08:00| 20.0| -+-----------------------------+---------------+ -``` - -Query SQL: - -```sql -select histogram(s1,"min"="1","max"="20","count"="10") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+---------------------------------------------------------------+ -| Time|histogram(root.test.d1.s1, "min"="1", "max"="20", "count"="10")| -+-----------------------------+---------------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 2| -|1970-01-01T08:00:00.001+08:00| 2| -|1970-01-01T08:00:00.002+08:00| 2| -|1970-01-01T08:00:00.003+08:00| 2| 
-|1970-01-01T08:00:00.004+08:00| 2| -|1970-01-01T08:00:00.005+08:00| 2| -|1970-01-01T08:00:00.006+08:00| 2| -|1970-01-01T08:00:00.007+08:00| 2| -|1970-01-01T08:00:00.008+08:00| 2| -|1970-01-01T08:00:00.009+08:00| 2| -+-----------------------------+---------------------------------------------------------------+ -``` - -### Integral - -#### 注册语句 - -```sql -create function integral as 'org.apache.iotdb.library.dprofile.UDAFIntegral' -``` - -#### 函数简介 - -本函数用于计算时间序列的数值积分,即以时间为横坐标、数值为纵坐标绘制的折线图中折线以下的面积。 - -**函数名:** INTEGRAL - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `unit`:积分求解所用的时间轴单位,取值为 "1S", "1s", "1m", "1H", "1d"(区分大小写),分别表示以毫秒、秒、分钟、小时、天为单位计算积分。 - 缺省情况下取 "1s",以秒为单位。 - -**输出序列:** 输出单个序列,类型为 DOUBLE,序列仅包含一个时间戳为 0、值为积分结果的数据点。 - -**提示:** - -+ 积分值等于折线图中每相邻两个数据点和时间轴形成的直角梯形的面积之和,不同时间单位下相当于横轴进行不同倍数放缩,得到的积分值可直接按放缩倍数转换。 - -+ 数据中`NaN`将会被忽略。折线将以临近两个有值数据点为准。 - -#### 使用示例 - -##### 参数缺省 - -缺省情况下积分以1s为时间单位。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| 2| -|2020-01-01T00:00:03.000+08:00| 5| -|2020-01-01T00:00:04.000+08:00| 6| -|2020-01-01T00:00:05.000+08:00| 7| -|2020-01-01T00:00:08.000+08:00| 8| -|2020-01-01T00:00:09.000+08:00| NaN| -|2020-01-01T00:00:10.000+08:00| 10| -+-----------------------------+---------------+ -``` - - -用于查询的 SQL 语句: - -```sql -select integral(s1) from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -输出序列: - -``` -+-----------------------------+-------------------------+ -| Time|integral(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.000+08:00| 57.5| -+-----------------------------+-------------------------+ -``` - -其计算公式为: -$$\frac{1}{2}[(1+2)\times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 57.5$$ - - -##### 指定时间单位 - -指定以分钟为时间单位。 - - -输入序列同上,用于查询的 SQL 语句如下: - 
-```sql -select integral(s1, "unit"="1m") from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+-------------------------+ -| Time|integral(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.958| -+-----------------------------+-------------------------+ -``` - -It is computed as: -$$\frac{1}{2\times 60}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 0.958$$ - -### IntegralAvg - -#### Registration statement - -```sql -create function integralavg as 'org.apache.iotdb.library.dprofile.UDAFIntegralAvg' -``` - -#### Function introduction - -This function computes the function average of a time series, that is, the numerical integral divided by the total time span of the series, both measured in the same time unit. For details on how the numerical integral is computed, see the `Integral` function. - -**Function name:** INTEGRALAVG - -**Input series:** Only a single input series is supported; its type must be INT32 / INT64 / FLOAT / DOUBLE. - -**Output series:** Outputs a single series of type DOUBLE that contains one data point whose timestamp is 0 and whose value is the time-weighted average. - -**Note:** - -+ The time-weighted average equals the numerical integral computed under any time unit `unit` (the sum of the areas of the right trapezoids formed by each pair of adjacent data points and the time axis in the line chart), - divided by the time span of the input series in the same unit. Its value does not depend on the time unit used; by default the unit is the same as IoTDB's time unit. - -+ `NaN` values in the data are ignored. The line is drawn between the two nearest data points that have values. - -+ When the input series is empty, the function outputs 0; when it contains only one data point, the output is the value of that point. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| 2| -|2020-01-01T00:00:03.000+08:00| 5| -|2020-01-01T00:00:04.000+08:00| 6| -|2020-01-01T00:00:05.000+08:00| 7| -|2020-01-01T00:00:08.000+08:00| 8| -|2020-01-01T00:00:09.000+08:00| NaN| -|2020-01-01T00:00:10.000+08:00| 10| -+-----------------------------+---------------+ -``` - - -Query SQL: - -```sql -select integralavg(s1) from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|integralavg(root.test.d1.s1)| -+-----------------------------+----------------------------+ -|1970-01-01T08:00:00.000+08:00| 6.388888888888889| -+-----------------------------+----------------------------+ -``` 
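The trapezoidal rule behind `Integral` and `IntegralAvg` can be checked by hand with a short Python sketch. This is only an illustration, not the UDF's implementation: the function names `trapezoid_integral` and `integral_avg` are hypothetical, timestamps are given in seconds, and `NaN` points are assumed to be dropped before the trapezoids are formed, as the notes above describe.

```python
import math


def trapezoid_integral(points, unit_seconds=1.0):
    """Sum the trapezoid areas between adjacent non-NaN points.

    points: list of (time_in_seconds, value) pairs sorted by time.
    unit_seconds: length of the chosen time unit in seconds (60 for "1m").
    """
    valid = [(t, v) for t, v in points if not math.isnan(v)]
    area = 0.0
    for (t0, v0), (t1, v1) in zip(valid, valid[1:]):
        area += (v0 + v1) * (t1 - t0) / 2.0
    return area / unit_seconds


def integral_avg(points):
    """Time-weighted average: the integral divided by the time span."""
    valid = [(t, v) for t, v in points if not math.isnan(v)]
    if not valid:
        return 0.0          # empty input yields 0
    if len(valid) == 1:
        return valid[0][1]  # a single point yields its own value
    span = valid[-1][0] - valid[0][0]
    return trapezoid_integral(valid) / span


# The example series used in this section, with time in seconds.
series = [(1, 1.0), (2, 2.0), (3, 5.0), (4, 6.0), (5, 7.0),
          (8, 8.0), (9, float("nan")), (10, 10.0)]

print(trapezoid_integral(series))        # integral in seconds
print(trapezoid_integral(series, 60.0))  # same integral in minutes
print(integral_avg(series))              # time-weighted average
```

With the example series, the integral is 57.5 in seconds and about 0.958 in minutes, and dividing 57.5 by the 9-second span between the first and last points gives about 6.389, which agrees with the outputs shown in the tables.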
- -It is computed as: -$$\frac{1}{2}[(1+2)\times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] / 9 \approx 6.389$$ - -### Mad - -#### Registration statement - -```sql -create function mad as 'org.apache.iotdb.library.dprofile.UDAFMad' -``` - -#### Function introduction - -This function computes the exact or approximate median absolute deviation (MAD) of a single column of numeric data. The MAD is the median of the absolute deviations of all values from their median. - -For example, the data set $\{1,3,3,5,5,6,7,8,9\}$ has median 5; the absolute deviations of its values from the median are $\{0,0,1,2,2,2,3,4,4\}$, whose median is 2, so the MAD of the original data set is 2. - -**Function name:** MAD - -**Input series:** Only a single input series is supported; its type must be INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `error`: the value-based relative error of the approximate MAD, in the range [0,1); the default is 0. For example, with `error`=0.01, let a be the exact MAD and b the approximate MAD; then $0.99a \le b \le 1.01a$ holds. With `error`=0, the result is the exact MAD. - - -**Output series:** Outputs a single series of type DOUBLE that contains one data point whose timestamp is 0 and whose value is the MAD. - -**Note:** Null values, missing values, and `NaN` in the data are ignored. - -#### Examples - -##### Approximate query - -When the `error` parameter is not 0, this function computes the approximate MAD. - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -............ 
-Total line number = 20 -``` - -用于查询的 SQL 语句如下: - -```sql -select mad(s1, "error"="0.01") from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------+ -| Time|mad(root.test.s1, "error"="0.01")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9900000000000001| -+-----------------------------+---------------------------------+ -``` - -### Median - -#### 注册语句 - -```sql -create function median as 'org.apache.iotdb.library.dprofile.UDAFMedian' -``` - -#### 函数简介 - -本函数用于计算单列数值型数据的精确或近似中位数。中位数是顺序排列的一组数据中居于中间位置的数;当序列有偶数个时,中位数为中间二者的平均数。 - -**函数名:** MEDIAN - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `error`:近似中位数的基于排名的误差百分比,取值范围 [0,1),默认值为 0。如当`error`=0.01 时,计算出的中位数的真实排名百分比在 0.49~0.51 之间。当`error`=0 时,计算结果为精确中位数。 - -**输出序列:** 输出单个序列,类型为 DOUBLE,序列仅包含一个时间戳为 0、值为中位数的数据点。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -Total line number = 20 -``` - -用于查询的 SQL 语句: - -```sql -select median(s1, "error"="0.01") from root.test -``` - -输出序列: - -``` 
-+-----------------------------+------------------------------------+ -| Time|median(root.test.s1, "error"="0.01")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -+-----------------------------+------------------------------------+ -``` - -### MinMax - -#### 注册语句 - -```sql -create function minmax as 'org.apache.iotdb.library.dprofile.UDTFMinMax' -``` - -#### 函数简介 - -本函数将输入序列使用 min-max 方法进行标准化。最小值归一至 0,最大值归一至 1. - -**函数名:** MINMAX - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `compute`:若设置为"batch",则将数据全部读入后转换;若设置为 "stream",则需用户提供最大值及最小值进行流式计算转换。默认为 "batch"。 -+ `min`:使用流式计算时的最小值。 -+ `max`:使用流式计算时的最大值。 - -**输出序列**:输出单个序列,类型为 DOUBLE。 - -#### 使用示例 - -##### 全数据计算 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select minmax(s1) from root.test -``` - -输出序列: - -``` -+-----------------------------+--------------------+ -| Time|minmax(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 0.16666666666666666| 
-|1970-01-01T08:00:00.200+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.300+08:00| 0.25| -|1970-01-01T08:00:00.400+08:00| 0.08333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.700+08:00| 0.0| -|1970-01-01T08:00:00.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.900+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.000+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.100+08:00| 0.25| -|1970-01-01T08:00:01.200+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.300+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.25| -|1970-01-01T08:00:01.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.700+08:00| 1.0| -|1970-01-01T08:00:01.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.900+08:00| 0.0| -|1970-01-01T08:00:02.000+08:00| 0.16666666666666666| -+-----------------------------+--------------------+ -``` - - - -### MvAvg - -#### 注册语句 - -```sql -create function mvavg as 'org.apache.iotdb.library.dprofile.UDTFMvAvg' -``` - -#### 函数简介 - -本函数计算序列的移动平均。 - -**函数名:** MVAVG - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `window`:移动窗口的长度。默认值为 10. 
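A moving average with window length `window` can be sketched in Python as follows. This is an illustrative sketch, not the library's code: the function name `moving_average` is hypothetical, and it assumes a trailing window whose result is stamped with the timestamp of the window's last point. That alignment is an assumption made for the example, not something the parameter description specifies.

```python
def moving_average(points, window=10):
    """Trailing moving average over (timestamp, value) pairs.

    Emits one value per full window, stamped with the timestamp of the
    last point in that window. Fewer than `window` points yields [].
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    result = []
    running = 0.0
    for i, (t, v) in enumerate(points):
        running += v
        if i >= window:
            # Drop the value that just slid out of the window.
            running -= points[i - window][1]
        if i >= window - 1:
            result.append((t, running / window))
    return result


# Timestamps in milliseconds, values from a short hypothetical series.
sample = [(100, 0.0), (200, 0.0), (300, 1.0), (400, -1.0), (500, 0.0)]
for t, avg in moving_average(sample, window=3):
    print(t, avg)
```

Keeping a running sum makes each step constant time regardless of the window length, at the cost of some floating-point drift on very long series.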
- -**输出序列**:输出单个序列,类型为 DOUBLE。 - -#### 使用示例 - -##### 指定窗口长度 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select mvavg(s1, "window"="3") from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------+ -| Time|mvavg(root.test.s1, "window"="3")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.300+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.400+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.700+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.100+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.200+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.300+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.0| -|1970-01-01T08:00:01.500+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.600+08:00| 0.3333333333333333| 
-|1970-01-01T08:00:01.700+08:00| 3.0| -|1970-01-01T08:00:01.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.900+08:00| -0.6666666666666666| -|1970-01-01T08:00:02.000+08:00| -3.3333333333333335| -+-----------------------------+---------------------------------+ -``` - -### PACF - -#### 注册语句 - -```sql -create function pacf as 'org.apache.iotdb.library.dprofile.UDTFPACF' -``` - -#### 函数简介 - -本函数通过求解 Yule-Walker 方程,计算序列的偏自相关系数。对于特殊的输入序列,方程可能没有解,此时输出`NaN`。 - -**函数名:** PACF - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `lag`:最大滞后阶数。默认值为$\min(10\log_{10}n,n-1)$,$n$表示数据点个数。 - -**输出序列**:输出单个序列,类型为 DOUBLE。 - -#### 使用示例 - -##### 指定滞后阶数 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select pacf(s1, "lag"="5") from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------+ -| Time|pacf(root.test.d1.s1, "lag"="5")| -+-----------------------------+--------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| -0.5744680851063829| -|2020-01-01T00:00:03.000+08:00| 0.3172297297297296| -|2020-01-01T00:00:04.000+08:00| -0.2977686586304181| -|2020-01-01T00:00:05.000+08:00| -2.0609033521065867| -+-----------------------------+--------------------------------+ -``` - -### Percentile - -#### 注册语句 - -```sql -create function percentile as 'org.apache.iotdb.library.dprofile.UDAFPercentile' -``` - -#### 函数简介 - -本函数用于计算单列数值型数据的精确或近似分位数。 - -**函数名:** PERCENTILE - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `rank`:所求分位数在所有数据中的排名百分比,取值范围为 (0,1],默认值为 0.5。如当设为 0.5时则计算中位数。 -+ `error`:近似分位数的基于排名的误差百分比,取值范围为 
[0,1),默认值为0。如`rank`=0.5 且`error`=0.01,则计算出的分位数的真实排名百分比在 0.49~0.51之间。当`error`=0 时,计算结果为精确分位数。 - -**输出序列:** 输出单个序列,类型与输入序列相同。当`error`=0时,序列仅包含一个时间戳为分位数第一次出现的时间戳、值为分位数的数据点;否则,输出值的时间戳为0。 - -**提示:** 数据中的空值、缺失值和`NaN`将会被忽略。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| -+-----------------------------+------------+ -|2021-03-17T10:32:17.054+08:00| 0.5319929| -|2021-03-17T10:32:18.054+08:00| 0.9304316| -|2021-03-17T10:32:19.054+08:00| -1.4800133| -|2021-03-17T10:32:20.054+08:00| 0.6114087| -|2021-03-17T10:32:21.054+08:00| 2.5163336| -|2021-03-17T10:32:22.054+08:00| -1.0845392| -|2021-03-17T10:32:23.054+08:00| 1.0562582| -|2021-03-17T10:32:24.054+08:00| 1.3867859| -|2021-03-17T10:32:25.054+08:00| -0.45429882| -|2021-03-17T10:32:26.054+08:00| 1.0353678| -|2021-03-17T10:32:27.054+08:00| 0.7307929| -|2021-03-17T10:32:28.054+08:00| 2.3167255| -|2021-03-17T10:32:29.054+08:00| 2.342443| -|2021-03-17T10:32:30.054+08:00| 1.5809103| -|2021-03-17T10:32:31.054+08:00| 1.4829416| -|2021-03-17T10:32:32.054+08:00| 1.5800357| -|2021-03-17T10:32:33.054+08:00| 0.7124368| -|2021-03-17T10:32:34.054+08:00| -0.78597564| -|2021-03-17T10:32:35.054+08:00| 1.2058644| -|2021-03-17T10:32:36.054+08:00| 1.4215064| -|2021-03-17T10:32:37.054+08:00| 1.2808295| -|2021-03-17T10:32:38.054+08:00| -0.6173715| -|2021-03-17T10:32:39.054+08:00| 0.06644377| -|2021-03-17T10:32:40.054+08:00| 2.349338| -|2021-03-17T10:32:41.054+08:00| 1.7335888| -|2021-03-17T10:32:42.054+08:00| 1.5872132| -............ 
-Total line number = 10000 -``` - -用于查询的 SQL 语句: - -```sql -select percentile(s0, "rank"="0.2", "error"="0.01") from root.test -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|percentile(root.test.s0, "rank"="0.2", "error"="0.01")| -+-----------------------------+------------------------------------------------------+ -|2021-03-17T10:35:02.054+08:00| 0.1801469624042511| -+-----------------------------+------------------------------------------------------+ -``` -输入序列: - -``` -+-----------------------------+-------------+ -| Time|root.test2.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+-------------+ -............ 
-Total line number = 20 -``` - -Query SQL: - -```sql -select percentile(s1, "rank"="0.2", "error"="0.01") from root.test2 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------------+ -| Time|percentile(root.test2.s1, "rank"="0.2", "error"="0.01")| -+-----------------------------+-------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| -1.0| -+-----------------------------+-------------------------------------------------------+ -``` - - -### Quantile - -#### Registration statement - -```sql -create function quantile as 'org.apache.iotdb.library.dprofile.UDAFQuantile' -``` - -#### Function introduction - -This function computes an approximate quantile of a single column of numeric data. It is implemented on top of the KLL sketch algorithm. - -**Function name:** QUANTILE - -**Input series:** Only a single input series is supported; its type must be INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `rank`: the rank ratio of the requested quantile among all data, in the range (0,1]; the default is 0.5. Setting it to 0.5 computes the approximate median. -+ `K`: the size of the KLL sketch that may be maintained; the minimum is 100 and the default is 800. For example, with `rank`=0.5 and `K`=800, the true rank ratio of the computed quantile lies between 0.49 and 0.51 with a probability of at least 99%. - -**Output series:** Outputs a single series of the same type as the input series. The timestamp of the output value is 0. - -**Note:** Null values, missing values, and `NaN` in the data are ignored. - -#### Examples - - -Input series: - -``` -+-----------------------------+-------------+ -| Time|root.test1.s1| -+-----------------------------+-------------+ -|2021-03-17T10:32:17.054+08:00| 7| -|2021-03-17T10:32:18.054+08:00| 15| -|2021-03-17T10:32:19.054+08:00| 36| -|2021-03-17T10:32:20.054+08:00| 39| -|2021-03-17T10:32:21.054+08:00| 40| -|2021-03-17T10:32:22.054+08:00| 41| -|2021-03-17T10:32:23.054+08:00| 20| -|2021-03-17T10:32:24.054+08:00| 18| -+-----------------------------+-------------+ -............ 
-Total line number = 8 -``` - -用于查询的 SQL 语句: - -```sql -select quantile(s1, "rank"="0.2", "K"="800") from root.test1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------+ -| Time|quantile(root.test1.s1, "rank"="0.2", "K"="800")| -+-----------------------------+------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 7.000000000000001| -+-----------------------------+------------------------------------------------+ -``` - -### Period - -#### 注册语句 - -```sql -create function period as 'org.apache.iotdb.library.dprofile.UDAFPeriod' -``` - -#### 函数简介 - -本函数用于计算单列数值型数据的周期。 - -**函数名:** PERIOD - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**输出序列:** 输出单个序列,类型为 INT32,序列仅包含一个时间戳为 0、值为周期的数据点。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d3.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| -|1970-01-01T08:00:00.002+08:00| 2.0| -|1970-01-01T08:00:00.003+08:00| 3.0| -|1970-01-01T08:00:00.004+08:00| 1.0| -|1970-01-01T08:00:00.005+08:00| 2.0| -|1970-01-01T08:00:00.006+08:00| 3.0| -|1970-01-01T08:00:00.007+08:00| 1.0| -|1970-01-01T08:00:00.008+08:00| 2.0| -|1970-01-01T08:00:00.009+08:00| 3.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select period(s1) from root.test.d3 -``` - -输出序列: - -``` -+-----------------------------+-----------------------+ -| Time|period(root.test.d3.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 3| -+-----------------------------+-----------------------+ -``` - -### QLB - -#### 注册语句 - -```sql -create function qlb as 'org.apache.iotdb.library.dprofile.UDTFQLB' -``` - -#### 函数简介 - -本函数对输入序列计算$Q_{LB} $统计量,并计算对应的p值。p值越小表明序列越有可能为非平稳序列。 - -**函数名:** QLB - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `lag`:计算时用到的最大延迟阶数,取值应为 1 至 n-2 之间的整数,n 为序列采样总数。默认取 n-2。 - -**输出序列:** 
Outputs a single series of type DOUBLE. The series holds the p-values corresponding to the $Q_{LB}$ statistic, and the timestamp denotes the lag order. - -**Note:** The $Q_{LB}$ statistic is derived from the autocorrelation coefficients. To obtain the statistic rather than the p-value, use the ACF function. - -#### Examples - -##### Using default parameters - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T00:00:00.100+08:00| 1.22| -|1970-01-01T00:00:00.200+08:00| -2.78| -|1970-01-01T00:00:00.300+08:00| 1.53| -|1970-01-01T00:00:00.400+08:00| 0.70| -|1970-01-01T00:00:00.500+08:00| 0.75| -|1970-01-01T00:00:00.600+08:00| -0.72| -|1970-01-01T00:00:00.700+08:00| -0.22| -|1970-01-01T00:00:00.800+08:00| 0.28| -|1970-01-01T00:00:00.900+08:00| 0.57| -|1970-01-01T00:00:01.000+08:00| -0.22| -|1970-01-01T00:00:01.100+08:00| -0.72| -|1970-01-01T00:00:01.200+08:00| 1.34| -|1970-01-01T00:00:01.300+08:00| -0.25| -|1970-01-01T00:00:01.400+08:00| 0.17| -|1970-01-01T00:00:01.500+08:00| 2.51| -|1970-01-01T00:00:01.600+08:00| 1.42| -|1970-01-01T00:00:01.700+08:00| -1.34| -|1970-01-01T00:00:01.800+08:00| -0.01| -|1970-01-01T00:00:01.900+08:00| -0.49| -|1970-01-01T00:00:02.000+08:00| 1.63| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select QLB(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+---------------------+ -| Time| QLB(root.test.d1.s1)| -+-----------------------------+---------------------+ -|1970-01-01T08:00:00.021+08:00| -0.31671| -|1970-01-01T08:00:00.001+08:00| 0.12748561639660716| -|1970-01-01T08:00:00.022+08:00| -0.17051499999999997| -|1970-01-01T08:00:00.002+08:00| 0.21941409592365868| -|1970-01-01T08:00:00.023+08:00| -0.11341499999999997| -|1970-01-01T08:00:00.003+08:00| 0.3384920824593398| -|1970-01-01T08:00:00.024+08:00| 0.26146| -|1970-01-01T08:00:00.004+08:00| 0.26293189359893154| -|1970-01-01T08:00:00.025+08:00| 0.06431999999999996| -|1970-01-01T08:00:00.005+08:00| 0.37265953802871943| -|1970-01-01T08:00:00.026+08:00| 0.036919999999999994| -|1970-01-01T08:00:00.006+08:00| 0.4923218142923832| 
-|1970-01-01T08:00:00.027+08:00|-0.009294999999999993| -|1970-01-01T08:00:00.007+08:00| 0.609628728420623| -|1970-01-01T08:00:00.028+08:00| 0.12271499999999999| -|1970-01-01T08:00:00.008+08:00| 0.6510708392264906| -|1970-01-01T08:00:00.029+08:00| 0.008480000000000033| -|1970-01-01T08:00:00.009+08:00| 0.7430561964288097| -|1970-01-01T08:00:00.030+08:00| -0.21764500000000003| -|1970-01-01T08:00:00.010+08:00| 0.6236738200492055| -|1970-01-01T08:00:00.031+08:00| 0.35853999999999997| -|1970-01-01T08:00:00.011+08:00| 0.21487390993160937| -|1970-01-01T08:00:00.032+08:00| 0.18115499999999998| -|1970-01-01T08:00:00.012+08:00| 0.18479562182870324| -|1970-01-01T08:00:00.033+08:00| -0.27745499999999995| -|1970-01-01T08:00:00.013+08:00| 0.07329862193377235| -|1970-01-01T08:00:00.034+08:00| -0.22418500000000002| -|1970-01-01T08:00:00.014+08:00| 0.038000864459751926| -|1970-01-01T08:00:00.035+08:00| 0.31609000000000004| -|1970-01-01T08:00:00.015+08:00| 0.004052989734200874| -|1970-01-01T08:00:00.036+08:00| -0.06078500000000001| -|1970-01-01T08:00:00.016+08:00| 0.005663787468609627| -|1970-01-01T08:00:00.037+08:00| 0.19219499999999998| -|1970-01-01T08:00:00.017+08:00|0.0016316380755082571| -|1970-01-01T08:00:00.038+08:00| -0.25646| -|1970-01-01T08:00:00.018+08:00|2.0047954405910673E-5| -+-----------------------------+---------------------+ -``` - -### Resample - -#### Registration statement - -```sql -create function re_sample as 'org.apache.iotdb.library.dprofile.UDTFResample' -``` - -#### Usage - -This function resamples the input series at a given frequency, covering both upsampling and downsampling. Currently, the supported upsampling methods are `NaN` filling (NaN), forward filling (FFill), backward filling (BFill), and linear interpolation (Linear); the supported downsampling method is group aggregation, with the aggregations maximum (Max), minimum (Min), first value (First), last value (Last), mean (Mean), and median (Median). - -**Name:** RESAMPLE - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `every`: the resampling frequency, a positive number with a unit. Five units are currently supported: 'ms' (milliseconds), 's' (seconds), 'm' (minutes), 'h' (hours), and 'd' (days). This parameter is required. -+ `interp`: the interpolation method for upsampling, one of 'NaN', 'FFill', 'BFill', or 'Linear'. The default is `NaN` filling. -+ `aggr`: the aggregation method for downsampling, one of 
'Max', 'Min', 'First', 'Last', 'Mean', or 'Median'. The default is mean aggregation. -+ `start`: the start time of resampling (inclusive), a time string in the format 'yyyy-MM-dd HH:mm:ss'. The default is the timestamp of the first valid data point. -+ `end`: the end time of resampling (exclusive), a time string in the format 'yyyy-MM-dd HH:mm:ss'. The default is the timestamp of the last valid data point. - -**Output Series:** Outputs a single series of type DOUBLE. The series is strictly equally spaced at the resampling frequency. - -**Note:** `NaN` in the data is ignored. - -#### Examples - -##### Upsampling - -When the resampling frequency is higher than the original frequency of the data, upsampling is performed. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2021-03-06T16:00:00.000+08:00| 3.09| -|2021-03-06T16:15:00.000+08:00| 3.53| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:45:00.000+08:00| 3.51| -|2021-03-06T17:00:00.000+08:00| 3.41| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select resample(s1,'every'='5m','interp'='linear') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="5m", "interp"="linear")| -+-----------------------------+----------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:05:00.000+08:00| 3.2366665999094644| -|2021-03-06T16:10:00.000+08:00| 3.3833332856496177| -|2021-03-06T16:15:00.000+08:00| 3.5299999713897705| -|2021-03-06T16:20:00.000+08:00| 3.5199999809265137| -|2021-03-06T16:25:00.000+08:00| 3.509999990463257| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:35:00.000+08:00| 3.503333330154419| -|2021-03-06T16:40:00.000+08:00| 3.506666660308838| -|2021-03-06T16:45:00.000+08:00| 3.509999990463257| -|2021-03-06T16:50:00.000+08:00| 3.4766666889190674| -|2021-03-06T16:55:00.000+08:00| 3.443333387374878| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+----------------------------------------------------------+ -``` - -##### Downsampling - -When the resampling frequency is lower than the original frequency of the data, downsampling is performed. - -The input series is the same as above. The SQL for query is: - -```sql -select 
resample(s1,'every'='30m','aggr'='first') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "aggr"="first")| -+-----------------------------+--------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+--------------------------------------------------------+ -``` - - -##### Specifying the resampling time range - -The `start` and `end` parameters specify the time range of resampling. Any part of the range that lies outside the actual time range of the data is filled by interpolation. - -The input series is the same as above. The SQL for query is: - -```sql -select resample(s1,'every'='30m','start'='2021-03-06 15:00:00') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "start"="2021-03-06 15:00:00")| -+-----------------------------+-----------------------------------------------------------------------+ -|2021-03-06T15:00:00.000+08:00| NaN| -|2021-03-06T15:30:00.000+08:00| NaN| -|2021-03-06T16:00:00.000+08:00| 3.309999942779541| -|2021-03-06T16:30:00.000+08:00| 3.5049999952316284| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+-----------------------------------------------------------------------+ -``` - -### Sample - -#### Registration statement - -```sql -create function sample as 'org.apache.iotdb.library.dprofile.UDTFSample' -``` - -#### Usage - -This function samples the input series, that is, it selects a given number of data points from the input series and outputs them. Three sampling methods are currently supported: **reservoir sampling** samples the data randomly, with every data point having the same probability of being selected; **isometric sampling** samples the data at equal index intervals; **triangle sampling** divides all data into buckets according to the sampling rate, computes the triangle areas between data points within each bucket, and keeps the point with the largest area. Triangle sampling is commonly used for data visualization, because the sampling process preserves key abrupt-change points. More details on the sampling algorithm can be found in the thesis [here](http://skemman.is/stream/get/1946/15343/37285/3/SS_MSthesis.pdf). - -**Name:** SAMPLE - -**Input Series:** Only a single input series is supported. The type is arbitrary. - -**Parameters:** - -+ `method`: the sampling method, one of 
'reservoir', 'isometric', or 'triangle'. The default is reservoir sampling. -+ `k`: the number of samples, a positive integer. The default is 1. - -**Output Series:** Outputs a single series of the same type as the input series. Its length equals the number of samples, and every data point in it comes from the input series. - -**Note:** If the number of samples is greater than the length of the series, all data points of the input series are output. - -#### Examples - - -##### Reservoir sampling - -When the `method` parameter is 'reservoir' or omitted, the input series is sampled with reservoir sampling. Because this method is random, the output series shown below is only one possible result. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| 2.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:04.000+08:00| 4.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -|2020-01-01T00:00:10.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select sample(s1,'method'='reservoir','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="reservoir", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - - -##### Isometric sampling - -When the `method` parameter is 'isometric', the input series is sampled with isometric sampling. - -The input series is the same as above. The SQL for query is: - -```sql -select sample(s1,'method'='isometric','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="isometric", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| 
-|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - -### Segment - -#### Registration statement - -```sql -create function segment as 'org.apache.iotdb.library.dprofile.UDTFSegment' -``` - -#### Usage - -This function splits the data into multiple subsequences according to its linear trend, and returns either the first value of each subsequence after piecewise linear fitting or all fitted values. - -**Name:** SEGMENT - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `output`: "all" outputs all fitted values; "first" outputs the fitted value at the start of each subsequence. The default is "first". - -+ `error`: the error tolerance threshold for deciding that a linear trend exists. The error is defined as the mean absolute error of the linear fit of a subsequence. The default is 0.1. - -**Output Series:** Outputs a single series of type DOUBLE. - -**Note:** The function assumes that all data points are equally spaced in time. It reads all of the data, so downsample first if the raw data is too large. The fit uses a bottom-up method, so the last value of a subsequence may be output as the first value of a subsequence. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:00.300+08:00| 2.0| -|1970-01-01T08:00:00.400+08:00| 3.0| -|1970-01-01T08:00:00.500+08:00| 4.0| -|1970-01-01T08:00:00.600+08:00| 5.0| -|1970-01-01T08:00:00.700+08:00| 6.0| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 8.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:01.100+08:00| 9.1| -|1970-01-01T08:00:01.200+08:00| 9.2| -|1970-01-01T08:00:01.300+08:00| 9.3| -|1970-01-01T08:00:01.400+08:00| 9.4| -|1970-01-01T08:00:01.500+08:00| 9.5| -|1970-01-01T08:00:01.600+08:00| 9.6| -|1970-01-01T08:00:01.700+08:00| 9.7| -|1970-01-01T08:00:01.800+08:00| 9.8| -|1970-01-01T08:00:01.900+08:00| 9.9| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:02.100+08:00| 8.0| -|1970-01-01T08:00:02.200+08:00| 6.0| -|1970-01-01T08:00:02.300+08:00| 4.0| -|1970-01-01T08:00:02.400+08:00| 2.0| -|1970-01-01T08:00:02.500+08:00| 0.0| -|1970-01-01T08:00:02.600+08:00| -2.0| -|1970-01-01T08:00:02.700+08:00| -4.0| 
-|1970-01-01T08:00:02.800+08:00| -6.0| -|1970-01-01T08:00:02.900+08:00| -8.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.100+08:00| 10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -|1970-01-01T08:00:03.300+08:00| 10.0| -|1970-01-01T08:00:03.400+08:00| 10.0| -|1970-01-01T08:00:03.500+08:00| 10.0| -|1970-01-01T08:00:03.600+08:00| 10.0| -|1970-01-01T08:00:03.700+08:00| 10.0| -|1970-01-01T08:00:03.800+08:00| 10.0| -|1970-01-01T08:00:03.900+08:00| 10.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select segment(s1,"error"="0.1") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|segment(root.test.s1, "error"="0.1")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -+-----------------------------+------------------------------------+ -``` - -### Skew - -#### Registration statement - -```sql -create function skew as 'org.apache.iotdb.library.dprofile.UDAFSkew' -``` - -#### Usage - -This function computes the population skewness of a single numeric series. - -**Name:** SKEW - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Outputs a single series of type DOUBLE. The series contains a single data point whose timestamp is 0 and whose value is the population skewness. - -**Note:** Null values, missing values, and `NaN` in the data are ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 10.0| 
-|2020-01-01T00:00:11.000+08:00| 10.0| -|2020-01-01T00:00:12.000+08:00| 10.0| -|2020-01-01T00:00:13.000+08:00| 10.0| -|2020-01-01T00:00:14.000+08:00| 10.0| -|2020-01-01T00:00:15.000+08:00| 10.0| -|2020-01-01T00:00:16.000+08:00| 10.0| -|2020-01-01T00:00:17.000+08:00| 10.0| -|2020-01-01T00:00:18.000+08:00| 10.0| -|2020-01-01T00:00:19.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select skew(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time| skew(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| -0.9998427402292644| -+-----------------------------+-----------------------+ -``` - -### Spline - -#### Registration statement - -```sql -create function spline as 'org.apache.iotdb.library.dprofile.UDTFSpline' -``` - -#### Usage - -This function resamples the original series by interpolation after fitting a cubic spline to it. - -**Name:** SPLINE - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `points`: the number of resampled points. - -**Output Series:** Outputs a single series of type DOUBLE. - -**Note:** The output series keeps the first and last values of the input series and is sampled at equal time intervals. Interpolation is computed only when there are at least 4 input points. - -#### Examples - -##### Specifying the number of interpolation points - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select spline(s1, "points"="151") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|spline(root.test.s1, "points"="151")| -+-----------------------------+------------------------------------+ 
-|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.010+08:00| 0.04870000251134237| -|1970-01-01T08:00:00.020+08:00| 0.09680000495910646| -|1970-01-01T08:00:00.030+08:00| 0.14430000734329226| -|1970-01-01T08:00:00.040+08:00| 0.19120000966389972| -|1970-01-01T08:00:00.050+08:00| 0.23750001192092896| -|1970-01-01T08:00:00.060+08:00| 0.2832000141143799| -|1970-01-01T08:00:00.070+08:00| 0.32830001624425253| -|1970-01-01T08:00:00.080+08:00| 0.3728000183105469| -|1970-01-01T08:00:00.090+08:00| 0.416700020313263| -|1970-01-01T08:00:00.100+08:00| 0.4600000222524008| -|1970-01-01T08:00:00.110+08:00| 0.5027000241279602| -|1970-01-01T08:00:00.120+08:00| 0.5448000259399414| -|1970-01-01T08:00:00.130+08:00| 0.5863000276883443| -|1970-01-01T08:00:00.140+08:00| 0.627200029373169| -|1970-01-01T08:00:00.150+08:00| 0.6675000309944153| -|1970-01-01T08:00:00.160+08:00| 0.7072000325520833| -|1970-01-01T08:00:00.170+08:00| 0.7463000340461731| -|1970-01-01T08:00:00.180+08:00| 0.7848000354766846| -|1970-01-01T08:00:00.190+08:00| 0.8227000368436178| -|1970-01-01T08:00:00.200+08:00| 0.8600000381469728| -|1970-01-01T08:00:00.210+08:00| 0.8967000393867494| -|1970-01-01T08:00:00.220+08:00| 0.9328000405629477| -|1970-01-01T08:00:00.230+08:00| 0.9683000416755676| -|1970-01-01T08:00:00.240+08:00| 1.0032000427246095| -|1970-01-01T08:00:00.250+08:00| 1.037500043710073| -|1970-01-01T08:00:00.260+08:00| 1.071200044631958| -|1970-01-01T08:00:00.270+08:00| 1.1043000454902647| -|1970-01-01T08:00:00.280+08:00| 1.1368000462849934| -|1970-01-01T08:00:00.290+08:00| 1.1687000470161437| -|1970-01-01T08:00:00.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:00.310+08:00| 1.2307000483103594| -|1970-01-01T08:00:00.320+08:00| 1.2608000489139557| -|1970-01-01T08:00:00.330+08:00| 1.2903000494873524| -|1970-01-01T08:00:00.340+08:00| 1.3192000500233967| -|1970-01-01T08:00:00.350+08:00| 1.3475000505149364| -|1970-01-01T08:00:00.360+08:00| 1.3752000509548186| -|1970-01-01T08:00:00.370+08:00| 1.402300051335891| 
-|1970-01-01T08:00:00.380+08:00| 1.4288000516510009| -|1970-01-01T08:00:00.390+08:00| 1.4547000518929958| -|1970-01-01T08:00:00.400+08:00| 1.480000052054723| -|1970-01-01T08:00:00.410+08:00| 1.5047000521290301| -|1970-01-01T08:00:00.420+08:00| 1.5288000521087646| -|1970-01-01T08:00:00.430+08:00| 1.5523000519867738| -|1970-01-01T08:00:00.440+08:00| 1.575200051755905| -|1970-01-01T08:00:00.450+08:00| 1.597500051409006| -|1970-01-01T08:00:00.460+08:00| 1.619200050938924| -|1970-01-01T08:00:00.470+08:00| 1.6403000503385066| -|1970-01-01T08:00:00.480+08:00| 1.660800049600601| -|1970-01-01T08:00:00.490+08:00| 1.680700048718055| -|1970-01-01T08:00:00.500+08:00| 1.7000000476837158| -|1970-01-01T08:00:00.510+08:00| 1.7188475466453037| -|1970-01-01T08:00:00.520+08:00| 1.7373800457262996| -|1970-01-01T08:00:00.530+08:00| 1.7555825448831923| -|1970-01-01T08:00:00.540+08:00| 1.7734400440724702| -|1970-01-01T08:00:00.550+08:00| 1.790937543250622| -|1970-01-01T08:00:00.560+08:00| 1.8080600423741364| -|1970-01-01T08:00:00.570+08:00| 1.8247925413995016| -|1970-01-01T08:00:00.580+08:00| 1.8411200402832066| -|1970-01-01T08:00:00.590+08:00| 1.8570275389817397| -|1970-01-01T08:00:00.600+08:00| 1.8725000374515897| -|1970-01-01T08:00:00.610+08:00| 1.8875225356492449| -|1970-01-01T08:00:00.620+08:00| 1.902080033531194| -|1970-01-01T08:00:00.630+08:00| 1.9161575310539258| -|1970-01-01T08:00:00.640+08:00| 1.9297400281739288| -|1970-01-01T08:00:00.650+08:00| 1.9428125248476913| -|1970-01-01T08:00:00.660+08:00| 1.9553600210317021| -|1970-01-01T08:00:00.670+08:00| 1.96736751668245| -|1970-01-01T08:00:00.680+08:00| 1.9788200117564232| -|1970-01-01T08:00:00.690+08:00| 1.9897025062101101| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.710+08:00| 2.0097024933913334| -|1970-01-01T08:00:00.720+08:00| 2.0188199867081615| -|1970-01-01T08:00:00.730+08:00| 2.027367479995188| -|1970-01-01T08:00:00.740+08:00| 2.0353599732971155| -|1970-01-01T08:00:00.750+08:00| 2.0428124666586482| 
-|1970-01-01T08:00:00.760+08:00| 2.049739960124489| -|1970-01-01T08:00:00.770+08:00| 2.056157453739342| -|1970-01-01T08:00:00.780+08:00| 2.06207994754791| -|1970-01-01T08:00:00.790+08:00| 2.067522441594897| -|1970-01-01T08:00:00.800+08:00| 2.072499935925006| -|1970-01-01T08:00:00.810+08:00| 2.07702743058294| -|1970-01-01T08:00:00.820+08:00| 2.081119925613404| -|1970-01-01T08:00:00.830+08:00| 2.0847924210611| -|1970-01-01T08:00:00.840+08:00| 2.0880599169707317| -|1970-01-01T08:00:00.850+08:00| 2.0909374133870027| -|1970-01-01T08:00:00.860+08:00| 2.0934399103546166| -|1970-01-01T08:00:00.870+08:00| 2.0955824079182768| -|1970-01-01T08:00:00.880+08:00| 2.0973799061226863| -|1970-01-01T08:00:00.890+08:00| 2.098847405012549| -|1970-01-01T08:00:00.900+08:00| 2.0999999046325684| -|1970-01-01T08:00:00.910+08:00| 2.1005574051201332| -|1970-01-01T08:00:00.920+08:00| 2.1002599065303778| -|1970-01-01T08:00:00.930+08:00| 2.0991524087846245| -|1970-01-01T08:00:00.940+08:00| 2.0972799118041947| -|1970-01-01T08:00:00.950+08:00| 2.0946874155104105| -|1970-01-01T08:00:00.960+08:00| 2.0914199198245944| -|1970-01-01T08:00:00.970+08:00| 2.0875224246680673| -|1970-01-01T08:00:00.980+08:00| 2.083039929962151| -|1970-01-01T08:00:00.990+08:00| 2.0780174356281687| -|1970-01-01T08:00:01.000+08:00| 2.0724999415874406| -|1970-01-01T08:00:01.010+08:00| 2.06653244776129| -|1970-01-01T08:00:01.020+08:00| 2.060159954071038| -|1970-01-01T08:00:01.030+08:00| 2.053427460438006| -|1970-01-01T08:00:01.040+08:00| 2.046379966783517| -|1970-01-01T08:00:01.050+08:00| 2.0390624730288924| -|1970-01-01T08:00:01.060+08:00| 2.031519979095454| -|1970-01-01T08:00:01.070+08:00| 2.0237974849045237| -|1970-01-01T08:00:01.080+08:00| 2.015939990377423| -|1970-01-01T08:00:01.090+08:00| 2.0079924954354746| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.110+08:00| 1.9907018211101906| -|1970-01-01T08:00:01.120+08:00| 1.9788509124245144| -|1970-01-01T08:00:01.130+08:00| 1.9645127287932083| 
-|1970-01-01T08:00:01.140+08:00| 1.9477527250665083| -|1970-01-01T08:00:01.150+08:00| 1.9286363560946513| -|1970-01-01T08:00:01.160+08:00| 1.9072290767278735| -|1970-01-01T08:00:01.170+08:00| 1.8835963418164114| -|1970-01-01T08:00:01.180+08:00| 1.8578036062105014| -|1970-01-01T08:00:01.190+08:00| 1.8299163247603802| -|1970-01-01T08:00:01.200+08:00| 1.7999999523162842| -|1970-01-01T08:00:01.210+08:00| 1.7623635841923329| -|1970-01-01T08:00:01.220+08:00| 1.7129696477516976| -|1970-01-01T08:00:01.230+08:00| 1.6543635959181928| -|1970-01-01T08:00:01.240+08:00| 1.5890908816156328| -|1970-01-01T08:00:01.250+08:00| 1.5196969577678319| -|1970-01-01T08:00:01.260+08:00| 1.4487272772986044| -|1970-01-01T08:00:01.270+08:00| 1.3787272931317647| -|1970-01-01T08:00:01.280+08:00| 1.3122424581911272| -|1970-01-01T08:00:01.290+08:00| 1.251818225400506| -|1970-01-01T08:00:01.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:01.310+08:00| 1.1548000470995912| -|1970-01-01T08:00:01.320+08:00| 1.1130667107899999| -|1970-01-01T08:00:01.330+08:00| 1.0756000393033045| -|1970-01-01T08:00:01.340+08:00| 1.043200033187868| -|1970-01-01T08:00:01.350+08:00| 1.016666692992053| -|1970-01-01T08:00:01.360+08:00| 0.9968000192642223| -|1970-01-01T08:00:01.370+08:00| 0.9844000125527389| -|1970-01-01T08:00:01.380+08:00| 0.9802666734059655| -|1970-01-01T08:00:01.390+08:00| 0.9852000023722649| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.410+08:00| 1.023999999165535| -|1970-01-01T08:00:01.420+08:00| 1.0559999990463256| -|1970-01-01T08:00:01.430+08:00| 1.0959999996423722| -|1970-01-01T08:00:01.440+08:00| 1.1440000009536744| -|1970-01-01T08:00:01.450+08:00| 1.2000000029802322| -|1970-01-01T08:00:01.460+08:00| 1.264000005722046| -|1970-01-01T08:00:01.470+08:00| 1.3360000091791153| -|1970-01-01T08:00:01.480+08:00| 1.4160000133514405| -|1970-01-01T08:00:01.490+08:00| 1.5040000182390214| -|1970-01-01T08:00:01.500+08:00| 1.600000023841858| 
-+-----------------------------+------------------------------------+ -``` - -### Spread - -#### Registration statement - -```sql -create function spread as 'org.apache.iotdb.library.dprofile.UDAFSpread' -``` - -#### Usage - -This function computes the range of a time series, that is, the maximum value minus the minimum value. - -**Name:** SPREAD - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Outputs a single series of the same type as the input. The series contains a single data point whose timestamp is 0 and whose value is the range. - -**Note:** Null values, missing values, and `NaN` in the data are ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select spread(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time|spread(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 26.0| -+-----------------------------+-----------------------+ -``` - - - -### ZScore - -#### Registration statement - -```sql -create function zscore as 'org.apache.iotdb.library.dprofile.UDTFZScore' -``` - -#### Usage - -This function normalizes the input series with the z-score method. - -**Name:** ZSCORE - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `compute`: when set to "batch", the data is converted after all of it has been read; when set to "stream", the user must supply the mean and standard deviation for streaming conversion. The default is "batch". -+ `avg`: the mean used in streaming computation. -+ `sd`: the standard deviation used in streaming computation. - -**Output Series:** Outputs a single series of type DOUBLE. - -#### Examples - 
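As a rough illustration of the batch mode of ZSCORE, the following Python sketch normalizes a list of values with their mean and standard deviation. The helper name `zscore_batch` is hypothetical and is not part of IoTDB. Dividing by n (the population standard deviation) is an assumption inferred from the batch example output in this section, not a statement about the actual implementation. The sample values are copied from that example's input series.

```python
import math


def zscore_batch(values):
    """Return (v - mean) / sd for every value, using the population sd."""
    n = len(values)
    mean = sum(values) / n
    # Population variance (divide by n): an assumption inferred from the
    # example output, not confirmed against the IoTDB source code.
    variance = sum((v - mean) ** 2 for v in values) / n
    sd = math.sqrt(variance)
    # Note: a constant series gives sd == 0 and would raise ZeroDivisionError.
    return [(v - mean) / sd for v in values]


# Values of root.test.s1 from the batch example in this section.
data = [0.0, 0.0, 1.0, -1.0, 0.0, 0.0, -2.0, 2.0, 0.0, 0.0,
        1.0, -1.0, -1.0, 1.0, 0.0, 0.0, 10.0, 2.0, -2.0, 0.0]
scores = zscore_batch(data)
print(round(scores[0], 4))   # -0.2067
print(round(scores[16], 4))  # 3.9278
```

In the streaming mode, the same formula would presumably be applied per point with the user-supplied `avg` and `sd` in place of the computed mean and standard deviation.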
-##### Batch computation - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select zscore(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|zscore(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.200+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.300+08:00| 0.20672455764868078| -|1970-01-01T08:00:00.400+08:00| -0.6201736729460423| -|1970-01-01T08:00:00.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.700+08:00| -1.033622788243404| -|1970-01-01T08:00:00.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:00.900+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.000+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.100+08:00| 0.20672455764868078| -|1970-01-01T08:00:01.200+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.300+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.400+08:00| 0.20672455764868078| -|1970-01-01T08:00:01.500+08:00|-0.20672455764868078| 
-|1970-01-01T08:00:01.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.700+08:00| 3.9277665953249348| -|1970-01-01T08:00:01.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:01.900+08:00| -1.033622788243404| -|1970-01-01T08:00:02.000+08:00|-0.20672455764868078| -+-----------------------------+--------------------+ -``` - - - -## Anomaly Detection - -### IQR - -#### Registration statement - -```sql -create function iqr as 'org.apache.iotdb.library.anomaly.UDTFIQR' -``` - -#### Usage - -This function detects distribution anomalies, that is, data points that lie more than 1.5 times the IQR beyond the lower or upper quartile. - -**Name:** IQR - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: when set to "batch", detection runs after all of the data has been read; when set to "stream", the user must supply the lower and upper quartiles for streaming detection. The default is "batch". -+ `q1`: the lower quartile used in streaming computation. -+ `q3`: the upper quartile used in streaming computation. - -**Output Series:** Outputs a single series of type DOUBLE. - -**Note:** $IQR=Q_3-Q_1$ - -#### Examples - -##### Batch computation - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select iqr(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+-----------------+ -| Time|iqr(root.test.s1)| -+-----------------------------+-----------------+ -|1970-01-01T08:00:01.700+08:00| 10.0| 
-+-----------------------------+-----------------+ -``` - -### KSigma - -#### Registration statement - -```sql -create function ksigma as 'org.apache.iotdb.library.anomaly.UDTFKSigma' -``` - -#### Usage - -This function detects anomalies with the dynamic K-Sigma algorithm. Within a window, data points that deviate from the mean by more than k times the standard deviation are treated as anomalies and output. - -**Name:** KSIGMA - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `k`: the threshold, in multiples of the standard deviation, for a distribution anomaly in the dynamic K-Sigma algorithm. The default is 3. -+ `window`: the sliding window size of the dynamic K-Sigma algorithm. The default is 10000. - - -**Output Series:** Outputs a single series of the same type as the input series. - -**Note:** k should be greater than 0; otherwise, nothing is output. - -#### Examples - -##### Specifying k - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:04.000+08:00| 100.0| -|2020-01-01T00:00:06.000+08:00| 150.0| -|2020-01-01T00:00:08.000+08:00| 200.0| -|2020-01-01T00:00:10.000+08:00| 200.0| -|2020-01-01T00:00:14.000+08:00| 200.0| -|2020-01-01T00:00:15.000+08:00| 200.0| -|2020-01-01T00:00:16.000+08:00| 200.0| -|2020-01-01T00:00:18.000+08:00| 200.0| -|2020-01-01T00:00:20.000+08:00| 150.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ksigma(s1,"k"="1.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -|Time |ksigma(root.test.d1.s1,"k"="1.0")| -+-----------------------------+---------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -+-----------------------------+---------------------------------+ -``` - -### LOF - -#### Registration statement - -```sql -create function LOF as 'org.apache.iotdb.library.anomaly.UDTFLOF' -``` - -#### Usage - 
-This function uses the local outlier factor method to find density anomalies in series. Based on the given k-th distance and the local outlier factor (LOF) threshold, it decides whether each input data point is an outlier, that is, an anomaly, and outputs the LOF value of each point. - -**Name:** LOF - -**Input Series:** Multiple input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: the detection method. The default is "default", which computes on high-dimensional data. When set to "series", a one-dimensional time series is converted into high-dimensional data for the computation. -+ `k`: the k-th distance used to compute the local outlier factor. The default is 3. -+ `window`: the window length of data read at a time. The default is 10000. -+ `windowsize`: when the series method is used, the dimension of the converted high-dimensional data, that is, the size of a single window. The default is 5. - -**Output Series:** Outputs a single time series of type DOUBLE. - -**Note:** Incomplete rows are ignored: they take no part in the computation and are not marked as outliers. - - -#### Examples - -##### Using default parameters - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| 1.0| -|1970-01-01T08:00:00.300+08:00| 1.0| 1.0| -|1970-01-01T08:00:00.400+08:00| 1.0| 0.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -1.0| -|1970-01-01T08:00:00.600+08:00| -1.0| -1.0| -|1970-01-01T08:00:00.700+08:00| -1.0| 0.0| -|1970-01-01T08:00:00.800+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1,s2) from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|lof(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.100+08:00| 3.8274824267668244| -|1970-01-01T08:00:00.200+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.300+08:00| 2.838155437762879| -|1970-01-01T08:00:00.400+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.500+08:00| 2.73518261244453| -|1970-01-01T08:00:00.600+08:00| 2.371440975708148| -|1970-01-01T08:00:00.700+08:00| 2.73518261244453| -|1970-01-01T08:00:00.800+08:00| 1.7561416374270742| -+-----------------------------+-------------------------------------+ -``` - -##### Diagnosing a one-dimensional time series - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| 
-+-----------------------------+---------------+
-|1970-01-01T08:00:00.100+08:00| 1.0|
-|1970-01-01T08:00:00.200+08:00| 2.0|
-|1970-01-01T08:00:00.300+08:00| 3.0|
-|1970-01-01T08:00:00.400+08:00| 4.0|
-|1970-01-01T08:00:00.500+08:00| 5.0|
-|1970-01-01T08:00:00.600+08:00| 6.0|
-|1970-01-01T08:00:00.700+08:00| 7.0|
-|1970-01-01T08:00:00.800+08:00| 8.0|
-|1970-01-01T08:00:00.900+08:00| 9.0|
-|1970-01-01T08:00:01.000+08:00| 10.0|
-|1970-01-01T08:00:01.100+08:00| 11.0|
-|1970-01-01T08:00:01.200+08:00| 12.0|
-|1970-01-01T08:00:01.300+08:00| 13.0|
-|1970-01-01T08:00:01.400+08:00| 14.0|
-|1970-01-01T08:00:01.500+08:00| 15.0|
-|1970-01-01T08:00:01.600+08:00| 16.0|
-|1970-01-01T08:00:01.700+08:00| 17.0|
-|1970-01-01T08:00:01.800+08:00| 18.0|
-|1970-01-01T08:00:01.900+08:00| 19.0|
-|1970-01-01T08:00:02.000+08:00| 20.0|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select lof(s1, "method"="series") from root.test.d1 where time<1000
-```
-
-Output series:
-
-```
-+-----------------------------+--------------------+
-| Time|lof(root.test.d1.s1)|
-+-----------------------------+--------------------+
-|1970-01-01T08:00:00.100+08:00| 3.77777777777778|
-|1970-01-01T08:00:00.200+08:00| 4.32727272727273|
-|1970-01-01T08:00:00.300+08:00| 4.85714285714286|
-|1970-01-01T08:00:00.400+08:00| 5.40909090909091|
-|1970-01-01T08:00:00.500+08:00| 5.94999999999999|
-|1970-01-01T08:00:00.600+08:00| 6.43243243243243|
-|1970-01-01T08:00:00.700+08:00| 6.79999999999999|
-|1970-01-01T08:00:00.800+08:00| 7.0|
-|1970-01-01T08:00:00.900+08:00| 7.0|
-|1970-01-01T08:00:01.000+08:00| 6.79999999999999|
-|1970-01-01T08:00:01.100+08:00| 6.43243243243243|
-|1970-01-01T08:00:01.200+08:00| 5.94999999999999|
-|1970-01-01T08:00:01.300+08:00| 5.40909090909091|
-|1970-01-01T08:00:01.400+08:00| 4.85714285714286|
-|1970-01-01T08:00:01.500+08:00| 4.32727272727273|
-|1970-01-01T08:00:01.600+08:00| 3.77777777777778|
-+-----------------------------+--------------------+
-```
-
-### MissDetect
-
-#### Registration statement
-
-```sql
-create function missdetect as 'org.apache.iotdb.library.anomaly.UDTFMissDetect'
-```
-
-#### Function Overview
-
-This function detects missing-data anomalies. In some datasets, missing values are filled by linear interpolation, which leaves perfectly linear segments in the data, and these segments tend to be long. The function detects missing-data anomalies by finding such perfectly linear segments.
-
-**Function name:** MISSDETECT
-
-**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `minlen`: The minimum length of a perfectly linear segment that is marked as anomalous. It is an integer greater than or equal to 10, and the default is 10.
-
-**Output series:** A single series of type BOOLEAN that indicates whether each data point is a missing-data anomaly.
-
-**Note:** `NaN` values in the data are ignored.
-
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d2.s2|
-+-----------------------------+---------------+
-|2021-07-01T12:00:00.000+08:00| 0.0|
-|2021-07-01T12:00:01.000+08:00| 1.0|
-|2021-07-01T12:00:02.000+08:00| 0.0|
-|2021-07-01T12:00:03.000+08:00| 1.0|
-|2021-07-01T12:00:04.000+08:00| 0.0|
-|2021-07-01T12:00:05.000+08:00| 0.0|
-|2021-07-01T12:00:06.000+08:00| 0.0|
-|2021-07-01T12:00:07.000+08:00| 0.0|
-|2021-07-01T12:00:08.000+08:00| 0.0|
-|2021-07-01T12:00:09.000+08:00| 0.0|
-|2021-07-01T12:00:10.000+08:00| 0.0|
-|2021-07-01T12:00:11.000+08:00| 0.0|
-|2021-07-01T12:00:12.000+08:00| 0.0|
-|2021-07-01T12:00:13.000+08:00| 0.0|
-|2021-07-01T12:00:14.000+08:00| 0.0|
-|2021-07-01T12:00:15.000+08:00| 0.0|
-|2021-07-01T12:00:16.000+08:00| 1.0|
-|2021-07-01T12:00:17.000+08:00| 0.0|
-|2021-07-01T12:00:18.000+08:00| 1.0|
-|2021-07-01T12:00:19.000+08:00| 0.0|
-|2021-07-01T12:00:20.000+08:00| 1.0|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select missdetect(s2,'minlen'='10') from root.test.d2
-```
-
-Output series:
-
-```
-+-----------------------------+------------------------------------------+
-| Time|missdetect(root.test.d2.s2, "minlen"="10")|
-+-----------------------------+------------------------------------------+
-|2021-07-01T12:00:00.000+08:00| false|
-|2021-07-01T12:00:01.000+08:00| false|
-|2021-07-01T12:00:02.000+08:00| false|
-|2021-07-01T12:00:03.000+08:00| false|
-|2021-07-01T12:00:04.000+08:00| true|
-|2021-07-01T12:00:05.000+08:00| true|
-|2021-07-01T12:00:06.000+08:00| true|
-|2021-07-01T12:00:07.000+08:00| true|
-|2021-07-01T12:00:08.000+08:00| true|
-|2021-07-01T12:00:09.000+08:00| true|
-|2021-07-01T12:00:10.000+08:00| true|
-|2021-07-01T12:00:11.000+08:00| true|
-|2021-07-01T12:00:12.000+08:00| true|
-|2021-07-01T12:00:13.000+08:00| true|
-|2021-07-01T12:00:14.000+08:00| true|
-|2021-07-01T12:00:15.000+08:00| true|
-|2021-07-01T12:00:16.000+08:00| false|
-|2021-07-01T12:00:17.000+08:00| false|
-|2021-07-01T12:00:18.000+08:00| false|
-|2021-07-01T12:00:19.000+08:00| false|
-|2021-07-01T12:00:20.000+08:00| false|
-+-----------------------------+------------------------------------------+
-```
-
-### Range
-
-#### Registration statement
-
-```sql
-create function range as 'org.apache.iotdb.library.anomaly.UDTFRange'
-```
-
-#### Function Overview
-
-This function finds range anomalies in a time series. Given an upper bound and a lower bound, it decides whether each input point falls outside the bounds (that is, is anomalous) and outputs all anomalous points as a new time series.
-
-**Function name:** RANGE
-
-**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `lower_bound`: The lower bound for range anomaly detection.
-+ `upper_bound`: The upper bound for range anomaly detection.
-
-**Output series:** A single series of the same type as the input series.
-
-**Note:** `upper_bound` must be greater than `lower_bound`; otherwise nothing is output.
-
-
-#### Examples
-
-##### Specifying the upper and lower bounds
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:02.000+08:00| 100.0|
-|2020-01-01T00:00:03.000+08:00| 101.0|
-|2020-01-01T00:00:04.000+08:00| 102.0|
-|2020-01-01T00:00:06.000+08:00| 104.0|
-|2020-01-01T00:00:08.000+08:00| 126.0|
-|2020-01-01T00:00:10.000+08:00| 108.0|
-|2020-01-01T00:00:14.000+08:00| 112.0|
-|2020-01-01T00:00:15.000+08:00| 113.0|
-|2020-01-01T00:00:16.000+08:00| 114.0|
-|2020-01-01T00:00:18.000+08:00| 116.0|
-|2020-01-01T00:00:20.000+08:00| 118.0|
-|2020-01-01T00:00:22.000+08:00| 120.0|
-|2020-01-01T00:00:26.000+08:00| 124.0|
-|2020-01-01T00:00:28.000+08:00| 126.0|
-|2020-01-01T00:00:30.000+08:00| NaN|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select range(s1,"lower_bound"="101.0","upper_bound"="125.0")
from root.test.d1 where time <= 2020-01-01 00:00:30
-```
-
-Output series:
-
-```
-+-----------------------------+------------------------------------------------------------------+
-|Time |range(root.test.d1.s1,"lower_bound"="101.0","upper_bound"="125.0")|
-+-----------------------------+------------------------------------------------------------------+
-|2020-01-01T00:00:02.000+08:00| 100.0|
-|2020-01-01T00:00:08.000+08:00| 126.0|
-|2020-01-01T00:00:28.000+08:00| 126.0|
-+-----------------------------+------------------------------------------------------------------+
-```
-
-### TwoSidedFilter
-
-#### Registration statement
-
-```sql
-create function twosidedfilter as 'org.apache.iotdb.library.anomaly.UDTFTwoSidedFilter'
-```
-
-#### Function Overview
-
-This function filters anomalous points out of the input series using the two-sided window detection method.
-
-**Function name:** TWOSIDEDFILTER
-
-**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Output series:** A single series of the same type as the input. It is the input series with the anomalous points removed.
-
-**Parameters:**
-
-- `len`: The window size used by the two-sided window detection method. It must be a positive integer, and the default is 5. For example, when `len`=3, the algorithm takes one window of length 3 before the point and one after it, and computes the anomaly degree within those windows.
-- `threshold`: The threshold of the anomaly degree. Its range is (0,1), and the default is 0.3. The higher the threshold, the stricter the criterion the function applies to the anomaly degree.
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+------------+
-| Time|root.test.s0|
-+-----------------------------+------------+
-|1970-01-01T08:00:00.000+08:00| 2002.0|
-|1970-01-01T08:00:01.000+08:00| 1946.0|
-|1970-01-01T08:00:02.000+08:00| 1958.0|
-|1970-01-01T08:00:03.000+08:00| 2012.0|
-|1970-01-01T08:00:04.000+08:00| 2051.0|
-|1970-01-01T08:00:05.000+08:00| 1898.0|
-|1970-01-01T08:00:06.000+08:00| 2014.0|
-|1970-01-01T08:00:07.000+08:00| 2052.0|
-|1970-01-01T08:00:08.000+08:00| 1935.0|
-|1970-01-01T08:00:09.000+08:00| 1901.0|
-|1970-01-01T08:00:10.000+08:00| 1972.0|
-|1970-01-01T08:00:11.000+08:00| 1969.0|
-|1970-01-01T08:00:12.000+08:00| 1984.0|
-|1970-01-01T08:00:13.000+08:00| 2018.0|
-|1970-01-01T08:00:37.000+08:00| 1484.0|
-|1970-01-01T08:00:38.000+08:00| 1055.0|
-|1970-01-01T08:00:39.000+08:00| 1050.0|
-|1970-01-01T08:01:05.000+08:00| 1023.0|
-|1970-01-01T08:01:06.000+08:00| 1056.0|
-|1970-01-01T08:01:07.000+08:00|
978.0|
-|1970-01-01T08:01:08.000+08:00| 1050.0|
-|1970-01-01T08:01:09.000+08:00| 1123.0|
-|1970-01-01T08:01:10.000+08:00| 1150.0|
-|1970-01-01T08:01:11.000+08:00| 1034.0|
-|1970-01-01T08:01:12.000+08:00| 950.0|
-|1970-01-01T08:01:13.000+08:00| 1059.0|
-+-----------------------------+------------+
-```
-
-SQL for query:
-
-```sql
-select TwoSidedFilter(s0, 'len'='5', 'threshold'='0.3') from root.test
-```
-
-Output series:
-
-```
-+-----------------------------+------------+
-| Time|root.test.s0|
-+-----------------------------+------------+
-|1970-01-01T08:00:00.000+08:00| 2002.0|
-|1970-01-01T08:00:01.000+08:00| 1946.0|
-|1970-01-01T08:00:02.000+08:00| 1958.0|
-|1970-01-01T08:00:03.000+08:00| 2012.0|
-|1970-01-01T08:00:04.000+08:00| 2051.0|
-|1970-01-01T08:00:05.000+08:00| 1898.0|
-|1970-01-01T08:00:06.000+08:00| 2014.0|
-|1970-01-01T08:00:07.000+08:00| 2052.0|
-|1970-01-01T08:00:08.000+08:00| 1935.0|
-|1970-01-01T08:00:09.000+08:00| 1901.0|
-|1970-01-01T08:00:10.000+08:00| 1972.0|
-|1970-01-01T08:00:11.000+08:00| 1969.0|
-|1970-01-01T08:00:12.000+08:00| 1984.0|
-|1970-01-01T08:00:13.000+08:00| 2018.0|
-|1970-01-01T08:01:05.000+08:00| 1023.0|
-|1970-01-01T08:01:06.000+08:00| 1056.0|
-|1970-01-01T08:01:07.000+08:00| 978.0|
-|1970-01-01T08:01:08.000+08:00| 1050.0|
-|1970-01-01T08:01:09.000+08:00| 1123.0|
-|1970-01-01T08:01:10.000+08:00| 1150.0|
-|1970-01-01T08:01:11.000+08:00| 1034.0|
-|1970-01-01T08:01:12.000+08:00| 950.0|
-|1970-01-01T08:01:13.000+08:00| 1059.0|
-+-----------------------------+------------+
-```
-
-### Outlier
-
-#### Registration statement
-
-```sql
-create function outlier as 'org.apache.iotdb.library.anomaly.UDTFOutlier'
-```
-
-#### Function Overview
-
-This function detects distance-based outliers. Within the current window, a point is an outlier if the number of its neighbors within the distance threshold (including the point itself) is smaller than the density threshold.
-
-**Function name:** OUTLIER
-
-**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `r`: The distance threshold used in distance-based outlier detection.
-+ `k`: The density threshold used in distance-based outlier detection.
-+ `w`: The size of the sliding window.
-+ `s`: The step of the sliding window.
-
-**Output series:** A single series of the same type as the input series.
-
-#### Examples
-
-##### Specifying the query parameters
-
-Input series:
-
-```
-+-----------------------------+------------+
-| Time|root.test.s1|
-+-----------------------------+------------+
-|2020-01-04T23:59:55.000+08:00| 56.0|
-|2020-01-04T23:59:56.000+08:00| 55.1|
-|2020-01-04T23:59:57.000+08:00| 54.2|
-|2020-01-04T23:59:58.000+08:00| 56.3|
-|2020-01-04T23:59:59.000+08:00| 59.0|
-|2020-01-05T00:00:00.000+08:00| 60.0|
-|2020-01-05T00:00:01.000+08:00| 60.5|
-|2020-01-05T00:00:02.000+08:00| 64.5|
-|2020-01-05T00:00:03.000+08:00| 69.0|
-|2020-01-05T00:00:04.000+08:00| 64.2|
-|2020-01-05T00:00:05.000+08:00| 62.3|
-|2020-01-05T00:00:06.000+08:00| 58.0|
-|2020-01-05T00:00:07.000+08:00| 58.9|
-|2020-01-05T00:00:08.000+08:00| 52.0|
-|2020-01-05T00:00:09.000+08:00| 62.3|
-|2020-01-05T00:00:10.000+08:00| 61.0|
-|2020-01-05T00:00:11.000+08:00| 64.2|
-|2020-01-05T00:00:12.000+08:00| 61.8|
-|2020-01-05T00:00:13.000+08:00| 64.0|
-|2020-01-05T00:00:14.000+08:00| 63.0|
-+-----------------------------+------------+
-```
-
-SQL for query:
-
-```sql
-select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test
-```
-
-Output series:
-
-```
-+-----------------------------+--------------------------------------------------------+
-| Time|outlier(root.test.s1,"r"="5.0","k"="4","w"="10","s"="5")|
-+-----------------------------+--------------------------------------------------------+
-|2020-01-05T00:00:03.000+08:00| 69.0|
-|2020-01-05T00:00:08.000+08:00| 52.0|
-+-----------------------------+--------------------------------------------------------+
-```
-
-## Frequency Domain Analysis
-
-### Conv
-
-#### Registration statement
-
-```sql
-create function conv as 'org.apache.iotdb.library.frequency.UDTFConv'
-```
-
-#### Function Overview
-
-This function computes the convolution of two input series, which is equivalent to polynomial multiplication.
-
-
-**Function name:** CONV
-
-**Input series:** Exactly two input series are supported. Both are of type INT32 / INT64 / FLOAT / DOUBLE.
-
-**Output series:** A single series of type DOUBLE that holds the convolution of the two series. Its timestamps start at 0 and only indicate order.
-
-**Note:** `NaN` values in the input series are ignored.
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+---------------+---------------+
-|
Time|root.test.d2.s1|root.test.d2.s2|
-+-----------------------------+---------------+---------------+
-|1970-01-01T08:00:00.000+08:00| 1.0| 7.0|
-|1970-01-01T08:00:00.001+08:00| 0.0| 2.0|
-|1970-01-01T08:00:00.002+08:00| 1.0| null|
-+-----------------------------+---------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select conv(s1,s2) from root.test.d2
-```
-
-Output series:
-
-```
-+-----------------------------+--------------------------------------+
-| Time|conv(root.test.d2.s1, root.test.d2.s2)|
-+-----------------------------+--------------------------------------+
-|1970-01-01T08:00:00.000+08:00| 7.0|
-|1970-01-01T08:00:00.001+08:00| 2.0|
-|1970-01-01T08:00:00.002+08:00| 7.0|
-|1970-01-01T08:00:00.003+08:00| 2.0|
-+-----------------------------+--------------------------------------+
-```
-
-### Deconv
-
-#### Registration statement
-
-```sql
-create function deconv as 'org.apache.iotdb.library.frequency.UDTFDeconv'
-```
-
-#### Function Overview
-
-This function computes the deconvolution of two input series, which is equivalent to polynomial division.
-
-**Function name:** DECONV
-
-**Input series:** Exactly two input series are supported. Both are of type INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `result`: The result of the deconvolution to return, either 'quotient' or 'remainder', corresponding to the quotient and the remainder of the deconvolution. By default, the quotient is output.
-
-**Output series:** A single series of type DOUBLE. It is the result of deconvolving the second series from the first (the first series divided by the second). Its timestamps start at 0 and only indicate order.
-
-**Note:** `NaN` values in the input series are ignored.
-
-#### Examples
-
-##### Computing the quotient of the deconvolution
-
-When the `result` parameter is omitted or set to 'quotient', this function computes the quotient of the deconvolution.
-
-Input series:
-
-```
-+-----------------------------+---------------+---------------+
-| Time|root.test.d2.s3|root.test.d2.s2|
-+-----------------------------+---------------+---------------+
-|1970-01-01T08:00:00.000+08:00| 8.0| 7.0|
-|1970-01-01T08:00:00.001+08:00| 2.0| 2.0|
-|1970-01-01T08:00:00.002+08:00| 7.0| null|
-|1970-01-01T08:00:00.003+08:00| 2.0| null|
-+-----------------------------+---------------+---------------+
-```
-
-
-SQL for query:
-
-```sql
-select deconv(s3,s2) from root.test.d2
-```
-
-Output series:
-
-```
-+-----------------------------+----------------------------------------+
-| Time|deconv(root.test.d2.s3, root.test.d2.s2)|
-+-----------------------------+----------------------------------------+
-|1970-01-01T08:00:00.000+08:00| 1.0|
-|1970-01-01T08:00:00.001+08:00| 0.0|
-|1970-01-01T08:00:00.002+08:00| 1.0|
-+-----------------------------+----------------------------------------+
-```
-
-##### Computing the remainder of the deconvolution
-
-When the `result` parameter is 'remainder', this function computes the remainder of the deconvolution. The input series is the same as above, and the SQL for query is:
-
-```sql
-select deconv(s3,s2,'result'='remainder') from root.test.d2
-```
-
-Output series:
-
-```
-+-----------------------------+--------------------------------------------------------------+
-| Time|deconv(root.test.d2.s3, root.test.d2.s2, "result"="remainder")|
-+-----------------------------+--------------------------------------------------------------+
-|1970-01-01T08:00:00.000+08:00| 1.0|
-|1970-01-01T08:00:00.001+08:00| 0.0|
-|1970-01-01T08:00:00.002+08:00| 0.0|
-|1970-01-01T08:00:00.003+08:00| 0.0|
-+-----------------------------+--------------------------------------------------------------+
-```
-
-### DWT
-
-#### Registration statement
-
-```sql
-create function dwt as 'org.apache.iotdb.library.frequency.UDTFDWT'
-```
-
-#### Function Overview
-
-This function applies a one-dimensional discrete wavelet transform to the input series.
-
-**Function name:** DWT
-
-**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `method`: The type of wavelet filter. 'Haar', 'DB4', 'DB6', and 'DB8' are provided, where DB stands for Daubechies. If this parameter is not set, the user must supply the filter coefficients. The value is case-insensitive.
-+ `coef`: The coefficients of the wavelet filter. When provided, separate the values with commas ',' and add no spaces or other symbols.
-+ `layer`: The number of times the transform is applied. The number of output vectors equals $layer+1$. The default is 1.
-
-**Output series:** A single series of type DOUBLE with the same length as the input.
-
-**Note:** The length of the input series must be an integer power of 2.
-
-#### Examples
-
-##### Haar transform
-
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|1970-01-01T08:00:00.000+08:00| 0.0|
-|1970-01-01T08:00:00.100+08:00| 0.2|
-|1970-01-01T08:00:00.200+08:00| 1.5|
-|1970-01-01T08:00:00.300+08:00| 1.2|
-|1970-01-01T08:00:00.400+08:00| 0.6|
-|1970-01-01T08:00:00.500+08:00| 1.7|
-|1970-01-01T08:00:00.600+08:00| 0.8|
-|1970-01-01T08:00:00.700+08:00| 2.0|
-|1970-01-01T08:00:00.800+08:00| 2.5|
-|1970-01-01T08:00:00.900+08:00| 2.1|
-|1970-01-01T08:00:01.000+08:00| 0.0|
-|1970-01-01T08:00:01.100+08:00| 2.0|
-|1970-01-01T08:00:01.200+08:00| 1.8|
-|1970-01-01T08:00:01.300+08:00| 1.2|
-|1970-01-01T08:00:01.400+08:00| 1.0|
-|1970-01-01T08:00:01.500+08:00| 1.6|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select dwt(s1,"method"="haar") from root.test.d1
-```
-
-Output series:
-
-```
-+-----------------------------+-------------------------------------+
-| Time|dwt(root.test.d1.s1, "method"="haar")|
-+-----------------------------+-------------------------------------+
-|1970-01-01T08:00:00.000+08:00| 0.14142135834465192|
-|1970-01-01T08:00:00.100+08:00| 1.909188342921157|
-|1970-01-01T08:00:00.200+08:00| 1.6263456473052773|
-|1970-01-01T08:00:00.300+08:00| 1.9798989957517026|
-|1970-01-01T08:00:00.400+08:00| 3.252691126023161|
-|1970-01-01T08:00:00.500+08:00| 1.414213562373095|
-|1970-01-01T08:00:00.600+08:00| 2.1213203435596424|
-|1970-01-01T08:00:00.700+08:00| 1.8384776479437628|
-|1970-01-01T08:00:00.800+08:00| -0.14142135834465192|
-|1970-01-01T08:00:00.900+08:00| 0.21213200063848547|
-|1970-01-01T08:00:01.000+08:00| -0.7778174761639416|
-|1970-01-01T08:00:01.100+08:00| -0.8485281289944873|
-|1970-01-01T08:00:01.200+08:00| 0.2828427799095765|
-|1970-01-01T08:00:01.300+08:00| -1.414213562373095|
-|1970-01-01T08:00:01.400+08:00| 0.42426400127697095|
-|1970-01-01T08:00:01.500+08:00| -0.42426408557066786|
-+-----------------------------+-------------------------------------+
-```
-
-
-### IDWT
-
-#### Registration statement
-
-```sql
-create function idwt as 'org.apache.iotdb.library.frequency.UDTFIDWT'
-```
-
-#### Function Overview
-
-This function applies a one-dimensional inverse discrete wavelet transform to the input series, restoring the wavelet coefficients produced by DWT to the original data.
-
-**Function name:** IDWT
-
-**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `method`: The type of wavelet filter. 'Haar', 'DB4', 'DB6', and 'DB8' are provided, where DB stands for Daubechies. If this parameter is not set, the user must supply the filter coefficients. The value is case-insensitive.
-+ `coef`: The coefficients of the wavelet filter. When provided, separate the values with commas ',' and add no spaces or other symbols.
-+ `layer`: The number of times the transform is applied. The number of output vectors equals $layer+1$. The default is 1.
-
-**Output series:** A single series of type DOUBLE with the same length as the input.
-
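As a rough illustration of what a one-level Haar transform and its inverse compute, the following Python sketch reproduces the numbers in the Haar examples of this section. It is not the library's implementation: the function names `haar_dwt` and `haar_idwt` are hypothetical, and the output layout (approximation coefficients in the first half, detail coefficients in the second half) is an assumption inferred from the example tables.

```python
import math


def haar_dwt(x):
    """One-level Haar DWT: approximations first, then details (assumed layout)."""
    n = len(x)
    if n < 2 or n & (n - 1):
        raise ValueError("input length must be a power of two (at least 2)")
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, n, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, n, 2)]
    return approx + detail


def haar_idwt(c):
    """Inverse of haar_dwt: rebuild each pair from its approximation and detail."""
    n = len(c)
    if n < 2 or n & (n - 1):
        raise ValueError("input length must be a power of two (at least 2)")
    half = n // 2
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(c[:half], c[half:]):
        out.append((a + d) / s)  # first sample of the pair
        out.append((a - d) / s)  # second sample of the pair
    return out


# Input series taken from the DWT example in this section.
series = [0.0, 0.2, 1.5, 1.2, 0.6, 1.7, 0.8, 2.0,
          2.5, 2.1, 0.0, 2.0, 1.8, 1.2, 1.0, 1.6]
coeffs = haar_dwt(series)
restored = haar_idwt(coeffs)
print(coeffs[:2], restored[:2])
```

Because the inverse simply undoes the pairwise sums and differences, it only recovers the original data when it is given coefficients produced with the same filter and the same number of layers.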
-**Note:**
-* The length of the input series must be an integer power of 2.
-* The parameter settings of IDWT (method/coef/layer) must match those used for the corresponding DWT transform; otherwise the original data cannot be restored correctly.
-* The input of IDWT is usually the output of the DWT function.
-
-#### Examples
-
-##### Haar transform
-
-
-Input series:
-
-```
-+-----------------------------+--------------------+
-| Time| root.test.d1.s2|
-+-----------------------------+--------------------+
-|1970-01-01T08:00:00.000+08:00| 0.1414213562373095|
-|1970-01-01T08:00:00.100+08:00| 1.909188309203678|
-|1970-01-01T08:00:00.200+08:00| 1.6263455967290592|
-|1970-01-01T08:00:00.300+08:00| 1.979898987322333|
-|1970-01-01T08:00:00.400+08:00| 3.2526911934581184|
-|1970-01-01T08:00:00.500+08:00| 1.414213562373095|
-|1970-01-01T08:00:00.600+08:00| 2.1213203435596424|
-|1970-01-01T08:00:00.700+08:00| 1.8384776310850235|
-|1970-01-01T08:00:00.800+08:00| -0.1414213562373095|
-|1970-01-01T08:00:00.900+08:00| 0.21213203435596428|
-|1970-01-01T08:00:01.000+08:00| -0.7778174593052022|
-|1970-01-01T08:00:01.100+08:00| -0.8485281374238569|
-|1970-01-01T08:00:01.200+08:00| 0.2828427124746189|
-|1970-01-01T08:00:01.300+08:00| -1.414213562373095|
-|1970-01-01T08:00:01.400+08:00| 0.42426406871192857|
-|1970-01-01T08:00:01.500+08:00|-0.42426406871192857|
-+-----------------------------+--------------------+
-```
-
-SQL for query:
-
-```sql
-select idwt(s2,"method"="haar") from root.test.d1
-```
-
-Output series:
-
-```
-+-----------------------------+--------------------------------------+
-| Time|idwt(root.test.d1.s2, "method"="haar")|
-+-----------------------------+--------------------------------------+
-|1970-01-01T08:00:00.000+08:00| 0.0|
-|1970-01-01T08:00:00.100+08:00| 0.19999999999999998|
-|1970-01-01T08:00:00.200+08:00| 1.4999999999999996|
-|1970-01-01T08:00:00.300+08:00| 1.1999999999999997|
-|1970-01-01T08:00:00.400+08:00| 0.6|
-|1970-01-01T08:00:00.500+08:00| 1.6999999999999997|
-|1970-01-01T08:00:00.600+08:00| 0.7999999999999998|
-|1970-01-01T08:00:00.700+08:00| 1.9999999999999996|
-|1970-01-01T08:00:00.800+08:00| 2.4999999999999996|
-|1970-01-01T08:00:00.900+08:00| 2.1|
-|1970-01-01T08:00:01.000+08:00| 0.0|
-|1970-01-01T08:00:01.100+08:00| 1.9999999999999996|
-|1970-01-01T08:00:01.200+08:00| 1.7999999999999998|
-|1970-01-01T08:00:01.300+08:00| 1.1999999999999997|
-|1970-01-01T08:00:01.400+08:00| 0.9999999999999998|
-|1970-01-01T08:00:01.500+08:00| 1.5999999999999999|
-+-----------------------------+--------------------------------------+
-```
-
-
-### FFT
-
-#### Registration statement
-
-```sql
-create function fft as 'org.apache.iotdb.library.frequency.UDTFFFT'
-```
-
-#### Function Overview
-
-This function applies the fast Fourier transform to the input series.
-
-**Function name:** FFT
-
-**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `method`: The type of Fourier transform, either 'uniform' or 'nonuniform'. The default is 'uniform'. With 'uniform', timestamps are ignored, all data points are treated as equally spaced, and the uniform fast Fourier algorithm is applied. With 'nonuniform', the non-uniform fast Fourier algorithm would be applied according to the timestamps (not implemented).
-+ `result`: The part of the transform result to return, one of 'real', 'imag', 'abs', or 'angle', corresponding to the real part, the imaginary part, the magnitude, and the phase angle of the result. By default, the magnitude is output.
-+ `compress`: The compression parameter, in the range (0,1]. It is the proportion of energy retained under lossy compression. By default, no compression is applied.
-
-**Output series:** A single series of type DOUBLE with the same length as the input. Its timestamps start at 0 and only indicate order.
-
-**Note:** `NaN` values in the input series are ignored.
-
-#### Examples
-
-##### Uniform Fourier transform
-
-When the `method` parameter is omitted or set to 'uniform', this function performs the uniform Fourier transform.
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|1970-01-01T08:00:00.000+08:00| 2.902113|
-|1970-01-01T08:00:01.000+08:00| 1.1755705|
-|1970-01-01T08:00:02.000+08:00| -2.1755705|
-|1970-01-01T08:00:03.000+08:00| -1.9021131|
-|1970-01-01T08:00:04.000+08:00| 1.0|
-|1970-01-01T08:00:05.000+08:00| 1.9021131|
-|1970-01-01T08:00:06.000+08:00| 0.1755705|
-|1970-01-01T08:00:07.000+08:00| -1.1755705|
-|1970-01-01T08:00:08.000+08:00| -0.902113|
-|1970-01-01T08:00:09.000+08:00| 0.0|
-|1970-01-01T08:00:10.000+08:00| 0.902113|
-|1970-01-01T08:00:11.000+08:00| 1.1755705|
-|1970-01-01T08:00:12.000+08:00| -0.1755705|
-|1970-01-01T08:00:13.000+08:00| -1.9021131|
-|1970-01-01T08:00:14.000+08:00| -1.0|
-|1970-01-01T08:00:15.000+08:00| 1.9021131|
-|1970-01-01T08:00:16.000+08:00| 2.1755705|
-|1970-01-01T08:00:17.000+08:00| -1.1755705|
-|1970-01-01T08:00:18.000+08:00| -2.902113|
-|1970-01-01T08:00:19.000+08:00| 0.0|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select fft(s1) from root.test.d1
-```
-
-Output series:
-
-```
-+-----------------------------+----------------------+
-| Time| fft(root.test.d1.s1)|
-+-----------------------------+----------------------+
-|1970-01-01T08:00:00.000+08:00| 0.0|
-|1970-01-01T08:00:00.001+08:00| 1.2727111142703152E-8|
-|1970-01-01T08:00:00.002+08:00| 2.385520799101839E-7|
-|1970-01-01T08:00:00.003+08:00| 8.723291723972645E-8|
-|1970-01-01T08:00:00.004+08:00| 19.999999960195904|
-|1970-01-01T08:00:00.005+08:00| 9.999999850988388|
-|1970-01-01T08:00:00.006+08:00| 3.2260694930700566E-7|
-|1970-01-01T08:00:00.007+08:00| 8.723291605373329E-8|
-|1970-01-01T08:00:00.008+08:00| 1.108657103979944E-7|
-|1970-01-01T08:00:00.009+08:00| 1.2727110997246171E-8|
-|1970-01-01T08:00:00.010+08:00|1.9852334701272664E-23|
-|1970-01-01T08:00:00.011+08:00| 1.2727111194499847E-8|
-|1970-01-01T08:00:00.012+08:00| 1.108657103979944E-7|
-|1970-01-01T08:00:00.013+08:00| 8.723291785769131E-8|
-|1970-01-01T08:00:00.014+08:00| 3.226069493070057E-7|
-|1970-01-01T08:00:00.015+08:00| 9.999999850988388|
-|1970-01-01T08:00:00.016+08:00| 19.999999960195904|
-|1970-01-01T08:00:00.017+08:00| 8.723291747109068E-8|
-|1970-01-01T08:00:00.018+08:00| 2.3855207991018386E-7|
-|1970-01-01T08:00:00.019+08:00| 1.2727112069910878E-8|
-+-----------------------------+----------------------+
-```
-
-Note: The input series follows $y=sin(2\pi t/4)+2sin(2\pi t/5)$ and has length 20, so the output series has peaks at $k=4$ and $k=5$.
-
-##### Uniform Fourier transform with compression
-
-The input series is the same as above, and the SQL for query is:
-
-```sql
-select fft(s1, 'result'='real', 'compress'='0.99'), fft(s1, 'result'='imag','compress'='0.99') from root.test.d1
-```
-
-Output series:
-
-```
-+-----------------------------+----------------------+----------------------+
-| Time| fft(root.test.d1.s1,| fft(root.test.d1.s1,|
-| | "result"="real",| "result"="imag",|
-| | "compress"="0.99")| "compress"="0.99")|
-+-----------------------------+----------------------+----------------------+
-|1970-01-01T08:00:00.000+08:00| 0.0| 0.0|
-|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8|
-|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7|
-|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8|
-|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807|
-|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16|
-|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8|
-+-----------------------------+----------------------+----------------------+
-```
-
-Note: Because the Fourier transform result is conjugate-symmetric, only the first half of the result is kept after compression. Data points are retained from low to high frequency, according to the given compression parameter, until the retained proportion of energy exceeds that value. The last data point is kept to indicate the length of the series.
-
-### HighPass
-
-#### Registration statement
-
-```sql
-create function highpass as 'org.apache.iotdb.library.frequency.UDTFHighPass'
-```
-
-#### Function Overview
-
-This function applies a high-pass filter to the input series and extracts the components above the cutoff frequency. The timestamps of the input series are ignored, and all data points are treated as equally spaced.
-
-**Function name:** HIGHPASS
-
-**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `wpass`: The normalized cutoff frequency, in the range (0,1). This parameter is required.
-
-**Output series:** A single series of type DOUBLE. It is the filtered series, with the same length and timestamps as the input.
-
-**Note:** `NaN` values in the input series are ignored.
-
-#### Examples
-
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|1970-01-01T08:00:00.000+08:00| 2.902113|
-|1970-01-01T08:00:01.000+08:00| 1.1755705|
-|1970-01-01T08:00:02.000+08:00| -2.1755705|
-|1970-01-01T08:00:03.000+08:00| -1.9021131|
-|1970-01-01T08:00:04.000+08:00| 1.0|
-|1970-01-01T08:00:05.000+08:00| 1.9021131|
-|1970-01-01T08:00:06.000+08:00| 0.1755705|
-|1970-01-01T08:00:07.000+08:00| -1.1755705|
-|1970-01-01T08:00:08.000+08:00| -0.902113|
-|1970-01-01T08:00:09.000+08:00| 0.0|
-|1970-01-01T08:00:10.000+08:00| 0.902113|
-|1970-01-01T08:00:11.000+08:00| 1.1755705|
-|1970-01-01T08:00:12.000+08:00| -0.1755705|
-|1970-01-01T08:00:13.000+08:00| -1.9021131|
-|1970-01-01T08:00:14.000+08:00| -1.0|
-|1970-01-01T08:00:15.000+08:00|
1.9021131|
-|1970-01-01T08:00:16.000+08:00| 2.1755705|
-|1970-01-01T08:00:17.000+08:00| -1.1755705|
-|1970-01-01T08:00:18.000+08:00| -2.902113|
-|1970-01-01T08:00:19.000+08:00| 0.0|
-+-----------------------------+---------------+
-```
-
-
-SQL for query:
-
-```sql
-select highpass(s1,'wpass'='0.45') from root.test.d1
-```
-
-Output series:
-
-```
-+-----------------------------+-----------------------------------------+
-| Time|highpass(root.test.d1.s1, "wpass"="0.45")|
-+-----------------------------+-----------------------------------------+
-|1970-01-01T08:00:00.000+08:00| 0.9999999534830373|
-|1970-01-01T08:00:01.000+08:00| 1.7462829277628608E-8|
-|1970-01-01T08:00:02.000+08:00| -0.9999999593178128|
-|1970-01-01T08:00:03.000+08:00| -4.1115269056426626E-8|
-|1970-01-01T08:00:04.000+08:00| 0.9999999925494194|
-|1970-01-01T08:00:05.000+08:00| 3.328126513330016E-8|
-|1970-01-01T08:00:06.000+08:00| -1.0000000183304454|
-|1970-01-01T08:00:07.000+08:00| 6.260191433311374E-10|
-|1970-01-01T08:00:08.000+08:00| 1.0000000018134796|
-|1970-01-01T08:00:09.000+08:00| -3.097210911744423E-17|
-|1970-01-01T08:00:10.000+08:00| -1.0000000018134794|
-|1970-01-01T08:00:11.000+08:00| -6.260191627862097E-10|
-|1970-01-01T08:00:12.000+08:00| 1.0000000183304454|
-|1970-01-01T08:00:13.000+08:00| -3.328126501424346E-8|
-|1970-01-01T08:00:14.000+08:00| -0.9999999925494196|
-|1970-01-01T08:00:15.000+08:00| 4.111526915498874E-8|
-|1970-01-01T08:00:16.000+08:00| 0.9999999593178128|
-|1970-01-01T08:00:17.000+08:00| -1.7462829341296528E-8|
-|1970-01-01T08:00:18.000+08:00| -0.9999999534830369|
-|1970-01-01T08:00:19.000+08:00| -1.035237222742873E-16|
-+-----------------------------+-----------------------------------------+
-```
-
-Note: The input series follows $y=sin(2\pi t/4)+2sin(2\pi t/5)$ and has length 20, so the output series after high-pass filtering follows $y=sin(2\pi t/4)$.
-
-### IFFT
-
-#### Registration statement
-
-```sql
-create function ifft as 'org.apache.iotdb.library.frequency.UDTFIFFT'
-```
-
-#### Function Overview
-
-This function treats the two input series as the real and imaginary parts of a complex series, applies the inverse fast Fourier transform, and outputs the real part of the result. For the format of the input data, see the output of the `FFT` function. The compressed output of the `FFT` function is also accepted as input.
-
-**Function name:** IFFT
-
-**Input series:** Exactly two input series are supported. Both are of type INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `start`: The start time of the output series, given as a time string in the format 'yyyy-MM-dd HH:mm:ss'. The default is '1970-01-01 08:00:00'.
-+ `interval`: The time interval of the output series, given as a positive number with a unit. Five units are currently supported: 'ms' (millisecond), 's' (second), 'm' (minute), 'h' (hour), and 'd' (day). The default is 1s.
-
-
-**Output series:** A single series of type DOUBLE. It is an equally spaced time series whose values are the result of the inverse fast Fourier transform, with the two input series taken as the real part and the imaginary part, in that order.
-
-**Note:** If a row contains a null value or `NaN`, the row is ignored.
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+----------------------+----------------------+
-| Time| root.test.d1.re| root.test.d1.im|
-+-----------------------------+----------------------+----------------------+
-|1970-01-01T08:00:00.000+08:00| 0.0| 0.0|
-|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8|
-|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7|
-|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8|
-|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807|
-|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16|
-|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8|
-+-----------------------------+----------------------+----------------------+
-```
-
-
-SQL for query:
-
-```sql
-select ifft(re, im, 'interval'='1m', 'start'='2021-01-01 00:00:00') from root.test.d1
-```
-
-Output series:
-
-```
-+-----------------------------+-------------------------------------------------------+
-| Time|ifft(root.test.d1.re, root.test.d1.im, "interval"="1m",|
-| | "start"="2021-01-01 00:00:00")|
-+-----------------------------+-------------------------------------------------------+
-|2021-01-01T00:00:00.000+08:00| 2.902112992431231|
-|2021-01-01T00:01:00.000+08:00| 1.1755704705132448|
-|2021-01-01T00:02:00.000+08:00| -2.175570513757101|
-|2021-01-01T00:03:00.000+08:00| -1.9021130389094498|
-|2021-01-01T00:04:00.000+08:00| 0.9999999925494194|
-|2021-01-01T00:05:00.000+08:00| 1.902113046743454|
-|2021-01-01T00:06:00.000+08:00| 0.17557053610884188|
-|2021-01-01T00:07:00.000+08:00| -1.1755704886020932|
-|2021-01-01T00:08:00.000+08:00| -0.9021130371347148|
-|2021-01-01T00:09:00.000+08:00| 3.552713678800501E-16|
-|2021-01-01T00:10:00.000+08:00| 0.9021130371347154|
-|2021-01-01T00:11:00.000+08:00| 1.1755704886020932|
-|2021-01-01T00:12:00.000+08:00| -0.17557053610884144|
-|2021-01-01T00:13:00.000+08:00| -1.902113046743454|
-|2021-01-01T00:14:00.000+08:00| -0.9999999925494196|
-|2021-01-01T00:15:00.000+08:00| 1.9021130389094498|
-|2021-01-01T00:16:00.000+08:00| 2.1755705137571004|
-|2021-01-01T00:17:00.000+08:00| -1.1755704705132448|
-|2021-01-01T00:18:00.000+08:00| -2.902112992431231|
-|2021-01-01T00:19:00.000+08:00| -3.552713678800501E-16|
-+-----------------------------+-------------------------------------------------------+
-```
-
-### LowPass
-
-#### Registration statement
-
-```sql
-create function lowpass as 'org.apache.iotdb.library.frequency.UDTFLowPass'
-```
-
-#### Function Overview
-
-This function applies a low-pass filter to the input series and extracts the components below the cutoff frequency. The timestamps of the input series are ignored, and all data points are treated as equally spaced.
-
-**Function name:** LOWPASS
-
-**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `wpass`: The normalized cutoff frequency, in the range (0,1). This parameter is required.
-
-**Output series:** A single series of type DOUBLE. It is the filtered series, with the same length and timestamps as the input.
-
-**Note:** `NaN` values in the input series are ignored.
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|1970-01-01T08:00:00.000+08:00| 2.902113|
-|1970-01-01T08:00:01.000+08:00| 1.1755705|
-|1970-01-01T08:00:02.000+08:00| -2.1755705|
-|1970-01-01T08:00:03.000+08:00| -1.9021131|
-|1970-01-01T08:00:04.000+08:00| 1.0|
-|1970-01-01T08:00:05.000+08:00| 1.9021131|
-|1970-01-01T08:00:06.000+08:00| 0.1755705|
-|1970-01-01T08:00:07.000+08:00| -1.1755705|
-|1970-01-01T08:00:08.000+08:00| -0.902113|
-|1970-01-01T08:00:09.000+08:00| 0.0|
-|1970-01-01T08:00:10.000+08:00| 0.902113|
-|1970-01-01T08:00:11.000+08:00| 1.1755705|
-|1970-01-01T08:00:12.000+08:00| -0.1755705|
-|1970-01-01T08:00:13.000+08:00| -1.9021131|
-|1970-01-01T08:00:14.000+08:00| -1.0|
-|1970-01-01T08:00:15.000+08:00| 1.9021131|
-|1970-01-01T08:00:16.000+08:00| 2.1755705|
-|1970-01-01T08:00:17.000+08:00| -1.1755705|
-|1970-01-01T08:00:18.000+08:00| -2.902113|
-|1970-01-01T08:00:19.000+08:00| 0.0|
-+-----------------------------+---------------+
-```
-
-
-SQL for query:
-
-```sql
-select lowpass(s1,'wpass'='0.45') from root.test.d1
-```
-
-Output series:
-
-```
-+-----------------------------+----------------------------------------+
-| Time|lowpass(root.test.d1.s1, "wpass"="0.45")|
-+-----------------------------+----------------------------------------+
-|1970-01-01T08:00:00.000+08:00| 1.9021130073323922|
-|1970-01-01T08:00:01.000+08:00| 1.1755704705132448|
-|1970-01-01T08:00:02.000+08:00| -1.1755705286582614|
-|1970-01-01T08:00:03.000+08:00| -1.9021130389094498|
-|1970-01-01T08:00:04.000+08:00| 7.450580419288145E-9|
-|1970-01-01T08:00:05.000+08:00| 1.902113046743454|
-|1970-01-01T08:00:06.000+08:00| 1.1755705212076808|
-|1970-01-01T08:00:07.000+08:00| -1.1755704886020932|
-|1970-01-01T08:00:08.000+08:00| -1.9021130222335536|
-|1970-01-01T08:00:09.000+08:00| 3.552713678800501E-16|
-|1970-01-01T08:00:10.000+08:00| 1.9021130222335536|
-|1970-01-01T08:00:11.000+08:00| 1.1755704886020932|
-|1970-01-01T08:00:12.000+08:00| -1.1755705212076801|
-|1970-01-01T08:00:13.000+08:00| -1.902113046743454|
-|1970-01-01T08:00:14.000+08:00| -7.45058112983088E-9|
-|1970-01-01T08:00:15.000+08:00| 1.9021130389094498|
-|1970-01-01T08:00:16.000+08:00| 1.1755705286582616|
-|1970-01-01T08:00:17.000+08:00| -1.1755704705132448|
-|1970-01-01T08:00:18.000+08:00| -1.9021130073323924|
-|1970-01-01T08:00:19.000+08:00| -2.664535259100376E-16|
-+-----------------------------+----------------------------------------+
-```
-
-Note: The input series follows $y=sin(2\pi t/4)+2sin(2\pi t/5)$ and has length 20, so the output series after low-pass filtering follows $y=2sin(2\pi t/5)$.
-
-
-### Envelope
-
-#### Registration statement
-
-```sql
-create function
envelope as 'org.apache.iotdb.library.frequency.UDFEnvelopeAnalysis' -``` - -#### Usage - -This function demodulates a signal and extracts its envelope, given a one-dimensional floating-point array and a user-specified modulation frequency. Demodulation extracts the part of interest from a complex signal so that it is easier to interpret. For example, it can recover the envelope of the signal, i.e. the trend of its amplitude. - -**Name:** Envelope - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `frequency`: The frequency (optional, positive number). If omitted, the system infers the frequency from the time intervals between the timestamps of the series. -+ `amplification`: The amplification factor (optional, positive integer). The Time column of the output contains only integers, never decimals. When the frequency is less than 1, use this parameter to amplify the frequency so that the result is displayed properly. - -**Output Series:** -+ `Time`: The values in this column are frequencies, not times. If the output is shown in a time format (e.g. 1970-01-01T08:00:19.000+08:00), convert it to a timestamp value. - -+ `Envelope(Path, 'frequency'='{frequency}')`: A single output series of type DOUBLE, which is the result of the envelope analysis. - -**Note:** If the values of the series being demodulated are not continuous, this function still treats them as continuous. The analyzed time series should therefore be a segment with complete values, and specifying a start time and an end time is recommended. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:01.000+08:00| 1.0 | -|1970-01-01T08:00:02.000+08:00| 2.0 | -|1970-01-01T08:00:03.000+08:00| 3.0 | -|1970-01-01T08:00:04.000+08:00| 4.0 | -|1970-01-01T08:00:05.000+08:00| 5.0 | -|1970-01-01T08:00:06.000+08:00| 6.0 | -|1970-01-01T08:00:07.000+08:00| 7.0 | -|1970-01-01T08:00:08.000+08:00| 8.0 | -|1970-01-01T08:00:09.000+08:00| 9.0 | -|1970-01-01T08:00:10.000+08:00| 10.0 | -+-----------------------------+---------------+ -``` - -SQL for query: -```sql -set time_display_type=long; -select envelope(s1),envelope(s1,'frequency'='1000'),envelope(s1,'amplification'='10') from root.test.d1; -``` -Output series: - -``` -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -|Time|envelope(root.test.d1.s1)|envelope(root.test.d1.s1, "frequency"="1000")|envelope(root.test.d1.s1, "amplification"="10")| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -| 0| 6.284350808484124| 6.284350808484124| 6.284350808484124| -| 100| 1.5581923657404393| 1.5581923657404393| null| -| 200| 0.8503211038340728| 0.8503211038340728| 
null| -| 300| 0.512808785945551| 0.512808785945551| null| -| 400| 0.26361156774506744| 0.26361156774506744| null| -|1000| null| null| 1.5581923657404393| -|2000| null| null| 0.8503211038340728| -|3000| null| null| 0.512808785945551| -|4000| null| null| 0.26361156774506744| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ - -``` - -## 数据匹配 - -### Cov - -#### 注册语句 - -```sql -create function cov as 'org.apache.iotdb.library.dmatch.UDAFCov' -``` - -#### 函数简介 - -本函数用于计算两列数值型数据的总体协方差。 - -**函数名:** COV - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列仅包含一个时间戳为 0、值为总体协方差的数据点。 - -**提示:** - -+ 如果某行数据中包含空值、缺失值或`NaN`,该行数据将会被忽略; -+ 如果数据中所有的行都被忽略,函数将会输出`NaN`。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select cov(s1,s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------+ -| Time|cov(root.test.d2.s1, root.test.d2.s2)| 
-+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 12.291666666666666| -+-----------------------------+-------------------------------------+ -``` - -### Dtw - -#### 注册语句 - -```sql -create function dtw as 'org.apache.iotdb.library.dmatch.UDAFDtw' -``` - -#### 函数简介 - -本函数用于计算两列数值型数据的 DTW 距离。 - -**函数名:** DTW - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列仅包含一个时间戳为 0、值为两个时间序列的 DTW 距离值。 - -**提示:** - -+ 如果某行数据中包含空值、缺失值或`NaN`,该行数据将会被忽略; -+ 如果数据中所有的行都被忽略,函数将会输出 0。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.003+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.004+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.005+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.006+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.007+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.008+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.009+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.010+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.011+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.012+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.013+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.014+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.015+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.016+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.017+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.018+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.019+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.020+08:00| 1.0| 2.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select dtw(s1,s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------+ -| Time|dtw(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 
20.0| -+-----------------------------+-------------------------------------+ -``` - -### Pearson - -#### 注册语句 - -```sql -create function pearson as 'org.apache.iotdb.library.dmatch.UDAFPearson' -``` - -#### 函数简介 - -本函数用于计算两列数值型数据的皮尔森相关系数。 - -**函数名:** PEARSON - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列仅包含一个时间戳为 0、值为皮尔森相关系数的数据点。 - -**提示:** - -+ 如果某行数据中包含空值、缺失值或`NaN`,该行数据将会被忽略; -+ 如果数据中所有的行都被忽略,函数将会输出`NaN`。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select pearson(s1,s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-----------------------------------------+ -| Time|pearson(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.5630881927754872| -+-----------------------------+-----------------------------------------+ -``` - -### PtnSym - -#### 注册语句 - -```sql -create function ptnsym as 
'org.apache.iotdb.library.dmatch.UDTFPtnSym' -``` - -#### Usage - -This function finds all symmetric subsequences in a series whose degree of symmetry is less than or equal to the threshold. The degree of symmetry is computed with DTW. The smaller the value, the more symmetric the subsequence. - -**Name:** PTNSYM - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The length of the symmetric subsequence, a positive integer. The default is 10. -+ `threshold`: The symmetry threshold, a non-negative number. Only subsequences whose degree of symmetry is less than or equal to this value are output. By default, all subsequences are output. - -**Output Series:** A single output series of type DOUBLE. Each data point corresponds to one symmetric subsequence: the timestamp is the start time of the subsequence and the value is its degree of symmetry. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s4| -+-----------------------------+---------------+ -|2021-01-01T12:00:00.000+08:00| 1.0| -|2021-01-01T12:00:01.000+08:00| 2.0| -|2021-01-01T12:00:02.000+08:00| 3.0| -|2021-01-01T12:00:03.000+08:00| 2.0| -|2021-01-01T12:00:04.000+08:00| 1.0| -|2021-01-01T12:00:05.000+08:00| 1.0| -|2021-01-01T12:00:06.000+08:00| 1.0| -|2021-01-01T12:00:07.000+08:00| 1.0| -|2021-01-01T12:00:08.000+08:00| 2.0| -|2021-01-01T12:00:09.000+08:00| 3.0| -|2021-01-01T12:00:10.000+08:00| 2.0| -|2021-01-01T12:00:11.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ptnsym(s4, 'window'='5', 'threshold'='0') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|ptnsym(root.test.d1.s4, "window"="5", "threshold"="0")| -+-----------------------------+------------------------------------------------------+ -|2021-01-01T12:00:00.000+08:00| 0.0| -|2021-01-01T12:00:07.000+08:00| 0.0| -+-----------------------------+------------------------------------------------------+ -``` - -### XCorr - -#### Registration Statement - -```sql -create function xcorr as 'org.apache.iotdb.library.dmatch.UDTFXCorr' -``` - -#### Usage - -This function computes the cross-correlation of two time series. -For discrete series, the cross-correlation function can be expressed as -$$CR(n) = \frac{1}{N} \sum_{m=1}^N S_1[m]S_2[m+n]$$ -It is commonly used to characterize the similarity of two series under different alignments. - -**Name:** XCORR - -**Input Series:** Only two input series are supported. Both are of type INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** A single output series of type DOUBLE. The series contains $2N-1$ data points. 
-The value at the exact center is the cross-correlation computed with the two series aligned as given (that is, $CR(0)$ in the formula above). -The values in the first half are the cross-correlations computed as the second input series is shifted forward, -until the two series no longer share any data points (the fully separated result $CR(-N)=0.0$ is not included). -The second half is analogous. -Expressed as formulas (all series are indexed from 1): -$$OS[i] = CR(-N+i) = \frac{1}{N} \sum_{m=1}^{i} S_1[m]S_2[N-i+m],\ if\ i <= N$$ -$$OS[i] = CR(i-N) = \frac{1}{N} \sum_{m=1}^{2N-i} S_1[i-N+m]S_2[m],\ if\ i > N$$ - -**Note:** - -+ `null` and `NaN` values in the two series are ignored and treated as 0 in the computation. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| null| 6| -|2020-01-01T00:00:02.000+08:00| 2| 7| -|2020-01-01T00:00:03.000+08:00| 3| NaN| -|2020-01-01T00:00:04.000+08:00| 4| 9| -|2020-01-01T00:00:05.000+08:00| 5| 10| -+-----------------------------+---------------+---------------+ -``` - - -SQL for query: - -```sql -select xcorr(s1, s2) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -Output series: - -``` -+-----------------------------+---------------------------------------+ -| Time|xcorr(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+---------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 10.0| -|1970-01-01T08:00:00.002+08:00| 16.0| -|1970-01-01T08:00:00.003+08:00| 16.75| -|1970-01-01T08:00:00.004+08:00| 20.0| -|1970-01-01T08:00:00.005+08:00| 13.2| -|1970-01-01T08:00:00.006+08:00| 5.6| -|1970-01-01T08:00:00.007+08:00| 7.0| -|1970-01-01T08:00:00.008+08:00| 0.0| -+-----------------------------+---------------------------------------+ -``` - -### Pattern\_match - -#### Registration Statement - -```SQL -create function pattern_match as 'org.apache.iotdb.library.match.UDAFPatternMatch' -``` - -#### Usage - -This function matches an input time series against a preset `pattern`. A match succeeds when the similarity value is less than or equal to a preset threshold, and the final matches are output as a `json` list. - -**Name:** PATTERN\_MATCH - -**Input Series:** Only a single input series is supported. The type is INT32, INT64, FLOAT, DOUBLE, or BOOLEAN. - -**Parameters:** - -* `timePattern`: A comma-separated string of timestamps. The length must be greater than 1. Required. -* `valuePattern 
`: A comma-separated string of numbers. The number of values must equal that of `timePattern`, and the length must be greater than 1. Required. - -> Note: For a Boolean `valuePattern`, use 1 and 0 to represent `true` and `false`. - -* `threshold`: The threshold, of type Float. Required. - -**Output Series:** A `json` list containing, for every successfully matched segment, the start timestamp `startTime`, the end timestamp `endTime`, and the similarity value `distance`. - -#### Examples -1. Linear data - -Input series: - -```SQL -IoTDB> select s0 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s0| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.1| -|1970-01-01T08:00:00.003+08:00| 1.2| -|1970-01-01T08:00:00.004+08:00| 1.3| -|1970-01-01T08:00:00.005+08:00| 0.0| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+--------------------------------------------------------------------------------------------------+ -| match_result| -+--------------------------------------------------------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]| -+--------------------------------------------------------------------------------------------------+ -``` - -2. 
Boolean data - -Input series: - -```SQL -IoTDB> select s1 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| true| -|1970-01-01T08:00:00.002+08:00| true| -|1970-01-01T08:00:00.003+08:00| true| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| false| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", "threshold"="0.5") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+-------------------------------------------------+ -| match_result| -+-------------------------------------------------+ -|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+-------------------------------------------------+ -``` - -3. V-shaped data - -Input series: - -```SQL -IoTDB> select s2 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s2| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| -1.0| -|1970-01-01T08:00:00.003+08:00| -2.0| -|1970-01-01T08:00:00.004+08:00| -3.0| -|1970-01-01T08:00:00.005+08:00| -2.0| -|1970-01-01T08:00:00.006+08:00| -1.0| -|1970-01-01T08:00:00.007+08:00| -0.0| -|1970-01-01T08:00:00.008+08:00| -0.0| -|1970-01-01T08:00:00.009+08:00| -0.0| -|1970-01-01T08:00:00.010+08:00| -0.0| -+-----------------------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s2, "timePattern"="1,2,3,4,5,6,7", "valuePattern"="0.0,-1.0,-2.0,-3.0,-2.0,-1.0,-0.0", "threshold"="10") as match_result from root.db.d0 -``` - -Output series: - -```SQL -+----------------------------------------------+ -| match_result| -+----------------------------------------------+ -|[{"distance":0.53,"startTime":1,"endTime":10}]| -+----------------------------------------------+ -``` - -4. 
Multiple match patterns - -Input series: - -```SQL -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------+-------------+ -| Time|root.db.d0.s0|root.db.d0.s1| -+-----------------------------+-------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| true| -|1970-01-01T08:00:00.002+08:00| 1.1| true| -|1970-01-01T08:00:00.003+08:00| 1.2| true| -|1970-01-01T08:00:00.004+08:00| 1.3| false| -|1970-01-01T08:00:00.005+08:00| 0.0| false| -+-----------------------------+-------------+-------------+ -``` - -SQL for query: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result1, pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", - "threshold"="0.5") as match_result2 from root.db.d0 -``` - -Output series: - -```SQL -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -| match_result1| match_result2| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -``` - - - -## Data Repairing - -### TimestampRepair - -#### Registration Statement - -```sql -create function timestamprepair as 'org.apache.iotdb.library.drepair.UDTFTimestampRepair' -``` - -#### Usage - -This function repairs timestamps. Given a standard time interval, it minimizes the repair cost and fine-tunes the timestamps so that data with unstable timestamp intervals becomes strictly equally spaced. When no standard time interval is given, the function estimates it from the median, the mode, or the cluster center of the time intervals. - - -**Name:** TIMESTAMPREPAIR - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `interval`: The standard time interval in milliseconds, a positive integer. By default it is estimated with the specified method. -+ `method`: The method used to estimate the standard time interval. The value is 'median', 'mode' or 
'cluster'. It takes effect only when `interval` is omitted. By default, the median is used. - -**Output Series:** A single output series of the same type as the input series. It is the repaired input series. - -#### Examples - -##### Specifying the standard time interval - -When the `interval` parameter is given, this function repairs the series according to the specified standard time interval. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:19.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:01.000+08:00| 7.0| -|2021-07-01T12:01:11.000+08:00| 8.0| -|2021-07-01T12:01:21.000+08:00| 9.0| -|2021-07-01T12:01:31.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timestamprepair(s1,'interval'='10000') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------------------+ -| Time|timestamprepair(root.test.d2.s1, "interval"="10000")| -+-----------------------------+----------------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:20.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:00.000+08:00| 7.0| -|2021-07-01T12:01:10.000+08:00| 8.0| -|2021-07-01T12:01:20.000+08:00| 9.0| -|2021-07-01T12:01:30.000+08:00| 10.0| -|2021-07-01T12:01:40.000+08:00| NaN| -+-----------------------------+----------------------------------------------------+ -``` - -##### Estimating the standard time interval automatically - -If the `interval` parameter is not given, this function repairs the series according to the estimated standard time interval. - -The input series is the same as above. SQL for query: - -```sql -select timestamprepair(s1) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------+ -| Time|timestamprepair(root.test.d2.s1)| -+-----------------------------+--------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| 
-|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:20.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:00.000+08:00| 7.0| -|2021-07-01T12:01:10.000+08:00| 8.0| -|2021-07-01T12:01:20.000+08:00| 9.0| -|2021-07-01T12:01:30.000+08:00| 10.0| -|2021-07-01T12:01:40.000+08:00| NaN| -+-----------------------------+--------------------------------+ -``` - -### ValueFill - -#### Registration Statement - -```sql -create function valuefill as 'org.apache.iotdb.library.drepair.UDTFValueFill' -``` - -#### Usage - -**Name:** ValueFill - -**Input Series:** A single time series of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: One of {"mean", "previous", "linear", "likelihood", "AR", "MA", "SCREEN"}. The default is "linear". "mean" fills with the mean value; "previous" fills with the previous value; "linear" fills by linear interpolation; "likelihood" uses maximum likelihood estimation based on a normal distribution of the speed; "AR" fills with an autoregressive model; "MA" fills with a moving average; "SCREEN" fills under constraints. - -**Output Series:** The filled one-dimensional series. - -**Remark:** The AR model used is AR(1). The time series must satisfy the autocorrelation condition; otherwise a single data point (0, 0.0) is output. 
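The difference between the "linear" and "previous" methods can be illustrated outside IoTDB. The Python sketch below is not the library's implementation. It assumes gaps are marked with `NaN`, that linear interpolation is index-based (timestamps are ignored), and that a leading gap takes the first valid value; these assumptions are inferred from the sample output for this function. The trailing-gap rule and the names `fill_linear` and `fill_previous` are hypothetical.

```python
import math


def fill_linear(values):
    """Fill NaN gaps by index-based linear interpolation (assumed behavior)."""
    out = list(values)
    valid = [i for i, v in enumerate(out) if not math.isnan(v)]
    if not valid:
        return out  # nothing to interpolate from
    # Assumption: gaps at either edge copy the nearest valid value.
    for i in range(valid[0]):
        out[i] = out[valid[0]]
    for i in range(valid[-1] + 1, len(out)):
        out[i] = out[valid[-1]]
    # Interior gaps: a straight line between the neighboring valid points.
    for left, right in zip(valid, valid[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out


def fill_previous(values):
    """Fill NaN gaps with the last valid value; a leading gap stays NaN."""
    out = []
    last = math.nan
    for v in values:
        if not math.isnan(v):
            last = v
        out.append(last)
    return out


print(fill_linear([1.0, math.nan, 2.0]))    # [1.0, 1.5, 2.0]
print(fill_previous([math.nan, 1.0, math.nan]))  # [nan, 1.0, 1.0]
```

Under these assumptions, a gap between 108.0 and 113.0 is filled with 110.5 by `fill_linear` and with 108.0 by `fill_previous`.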
- -#### Examples -##### Filling with the linear method - -When `method` is omitted or set to 'linear', this function fills values by linear interpolation. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| NaN| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| NaN| -|2020-01-01T00:00:22.000+08:00| NaN| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select valuefill(s1) from root.test.d2 -``` - -Output series: - - - -``` -+-----------------------------+--------------------------+ -| Time|valuefill(root.test.d2.s1)| -+-----------------------------+--------------------------+ -|2020-01-01T00:00:02.000+08:00| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 110.5| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.66666666666667| -|2020-01-01T00:00:22.000+08:00| 121.33333333333333| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+--------------------------+ -``` - -##### Filling with the previous method - -When `method` is set to 'previous', this function fills each missing value with the previous value. - -The input series is the same as above. SQL for query: - -```sql -select valuefill(s1,"method"="previous") from root.test.d2 -``` - -Output series: - 
-``` -+-----------------------------+-----------------------------------------------+ -| Time|valuefill(root.test.d2.s1, "method"="previous")| -+-----------------------------+-----------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 108.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 116.0| -|2020-01-01T00:00:22.000+08:00| 116.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-----------------------------------------------+ -``` - -### ValueRepair - -#### 注册语句 - -```sql -create function valuerepair as 'org.apache.iotdb.library.drepair.UDTFValueRepair' -``` - -#### 函数简介 - -本函数用于对时间序列的数值进行修复。目前,本函数支持两种修复方法:**Screen** 是一种基于速度阈值的方法,在最小改动的前提下使得所有的速度符合阈值要求;**LsGreedy** 是一种基于速度变化似然的方法,将速度变化建模为高斯分布,并采用贪心算法极大化似然函数。 - -**函数名:** VALUEREPAIR - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `method`:修复时采用的方法,取值为 'Screen' 或 'LsGreedy'. 
在缺省情况下,使用 Screen 方法进行修复。 -+ `minSpeed`:该参数仅在使用 Screen 方法时有效。当速度小于该值时会被视作数值异常点加以修复。在缺省情况下为中位数减去三倍绝对中位差。 -+ `maxSpeed`:该参数仅在使用 Screen 方法时有效。当速度大于该值时会被视作数值异常点加以修复。在缺省情况下为中位数加上三倍绝对中位差。 -+ `center`:该参数仅在使用 LsGreedy 方法时有效。对速度变化分布建立的高斯模型的中心。在缺省情况下为 0。 -+ `sigma` :该参数仅在使用 LsGreedy 方法时有效。对速度变化分布建立的高斯模型的标准差。在缺省情况下为绝对中位差。 - -**输出序列:** 输出单个序列,类型与输入序列相同。该序列是修复后的输入序列。 - -**提示:** 输入序列中的`NaN`在修复之前会先进行线性插值填补。 - -#### 使用示例 - -##### 使用 Screen 方法进行修复 - -当`method`缺省或取值为 'Screen' 时,本函数将使用 Screen 方法进行数值修复。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select valuerepair(s1) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+----------------------------+ -| Time|valuerepair(root.test.d2.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 
-|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+----------------------------+ -``` - -##### 使用 LsGreedy 方法进行修复 - -当`method`取值为 'LsGreedy' 时,本函数将使用 LsGreedy 方法进行数值修复。 - -输入序列同上,用于查询的 SQL 语句如下: - -```sql -select valuerepair(s1,'method'='LsGreedy') from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|valuerepair(root.test.d2.s1, "method"="LsGreedy")| -+-----------------------------+-------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-------------------------------------------------+ -``` - -## 序列发现 - -### ConsecutiveSequences - -#### 注册语句 - -```sql -create function consecutivesequences as 'org.apache.iotdb.library.series.UDTFConsecutiveSequences' -``` - -#### 函数简介 - -本函数用于在多维严格等间隔数据中发现局部最长连续子序列。 - -严格等间隔数据是指数据的时间间隔是严格相等的,允许存在数据缺失(包括行缺失和值缺失),但不允许存在数据冗余和时间戳偏移。 - -连续子序列是指严格按照标准时间间隔等距排布,不存在任何数据缺失的子序列。如果某个连续子序列不是任何连续子序列的真子序列,那么它是局部最长的。 - - -**函数名:** CONSECUTIVESEQUENCES - -**输入序列:** 支持多个输入序列,类型可以是任意的,但要满足严格等间隔的要求。 - -**参数:** - -+ `gap`:标准时间间隔,是一个有单位的正数。目前支持五种单位,分别是'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。在缺省情况下,函数会利用众数估计标准时间间隔。 - -**输出序列:** 输出单个序列,类型为 INT32。输出序列中的每一个数据点对应一个局部最长连续子序列,时间戳为子序列的起始时刻,值为子序列包含的数据点个数。 - -**提示:** 
对于不符合要求的输入,本函数不对输出做任何保证。 - -#### 使用示例 - -##### 手动指定标准时间间隔 - -本函数可以通过`gap`参数手动指定标准时间间隔。需要注意的是,错误的参数设置会导致输出产生严重错误。 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| -|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select consecutivesequences(s1,s2,'gap'='5m') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2, "gap"="5m")| -+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| -+-----------------------------+------------------------------------------------------------------+ -``` - -##### 自动估计标准时间间隔 - -当`gap`参数缺省时,本函数可以利用众数估计标准时间间隔,得到同样的结果。因此,这种用法更受推荐。 - -输入序列同上,用于查询的SQL语句如下: - -```sql -select consecutivesequences(s1,s2) from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| 
-+-----------------------------+------------------------------------------------------+ -``` - -### ConsecutiveWindows - -#### 注册语句 - -```sql -create function consecutivewindows as 'org.apache.iotdb.library.series.UDTFConsecutiveWindows' -``` - -#### 函数简介 - -本函数用于在多维严格等间隔数据中发现指定长度的连续窗口。 - -严格等间隔数据是指数据的时间间隔是严格相等的,允许存在数据缺失(包括行缺失和值缺失),但不允许存在数据冗余和时间戳偏移。 - -连续窗口是指严格按照标准时间间隔等距排布,不存在任何数据缺失的子序列。 - - -**函数名:** CONSECUTIVEWINDOWS - -**输入序列:** 支持多个输入序列,类型可以是任意的,但要满足严格等间隔的要求。 - -**参数:** - -+ `gap`:标准时间间隔,是一个有单位的正数。目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。在缺省情况下,函数会利用众数估计标准时间间隔。 -+ `length`:序列长度,是一个有单位的正数。目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。该参数不允许缺省。 - -**输出序列:** 输出单个序列,类型为 INT32。输出序列中的每一个数据点对应一个指定长度连续子序列,时间戳为子序列的起始时刻,值为子序列包含的数据点个数。 - -**提示:** 对于不符合要求的输入,本函数不对输出做任何保证。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| -|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select consecutivewindows(s1,s2,'length'='10m') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------------------------------+ -| Time|consecutivewindows(root.test.d1.s1, root.test.d1.s2, "length"="10m")| -+-----------------------------+--------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| 
-+-----------------------------+--------------------------------------------------------------------+ -``` - - - -## 机器学习 - -### AR - -#### 注册语句 - -```sql -create function ar as 'org.apache.iotdb.library.dlearn.UDTFAR' -``` -#### 函数简介 - -本函数用于学习数据的自回归模型系数。 - -**函数名:** AR - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `p`:自回归模型的阶数。默认为1。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。第一行对应模型的一阶系数,以此类推。 - -**提示:** - -- `p`应为正整数。 - -- 序列中的大部分点为等间隔采样点。 -- 序列中的缺失点通过线性插值进行填补后用于学习过程。 - -#### 使用示例 - -##### 指定阶数 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select ar(s0,"p"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+---------------------------+ -| Time|ar(root.test.d0.s0,"p"="2")| -+-----------------------------+---------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.9429| -|1970-01-01T08:00:00.002+08:00| -0.2571| -+-----------------------------+---------------------------+ -``` - -### Cluster - -#### 注册语句 - -```sql -create function cluster as 'org.apache.iotdb.library.dlearn.UDTFCluster' -``` - -#### 函数简介 - -本函数对**单条输入时间序列**,按固定长度 `l` 切分为**互不重叠**的连续子序列(窗口),再对这些子序列聚类,得到 `k` 个分组。 - -**函数名:** Cluster - -**输入序列:** 仅支持单条数值型时间序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。点按时间顺序读取;末尾不足以凑满一整窗的采样会被**丢弃**(仅使用 `⌊n/l⌋` 个窗口,`n` 为有效点数)。 - -**参数:** - -| 名称 | 含义 | 默认值 | 说明 | -|------|------|--------|------| -| `l` | 子序列(窗口)长度 | (必填) | 正整数;每个窗口含连续 `l` 个采样。 | -| `k` | 聚类个数 | (必填) | 整数 ≥ 2。 | -| `method` | 聚类算法 | `kmeans` | 
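One of `kmeans`, `kshape`, `medoidshape` (case-insensitive). Defaults to k-means when omitted. |
-| `norm` | Whether to Z-score normalize each subsequence | `true` | Boolean; when `true`, each subsequence is normalized before clustering. |
-| `maxiter` | Maximum number of iterations | `200` | Positive integer. |
-| `output` | Output mode | `label` | `label`: one cluster id per window; `centroid`: the `k` centroid vectors concatenated in cluster order. |
-| `sample_rate` | Greedy sampling ratio | `0.3` | Used only when **`method` = `medoidshape`**; the value must lie in `(0, 1]`. |
-
-**About `method`:**
-
-- **kmeans**: k-means in Euclidean space (with optional per-window normalization first).
-- **kshape**: Assigns clusters by shape-based distance (SBD, derived from the normalized cross-correlation NCC); centroids are updated through the **SVD** of the cluster matrix.
-- **medoidshape**: Performs a coarse clustering first, then greedily selects `k` representative subsequences; `sample_rate` controls how many candidates are sampled in each round.
-
-**Output series:** Controlled by `output`:
-
-- **`output` = `label` (default):** One output series of type **INT32**. Number of rows = number of complete windows `⌊n/l⌋`. Each row's timestamp is the time of the **first sample** of that window; the value is a cluster id in **0 … k−1**.
-- **`output` = `centroid`:** One output series of type **DOUBLE**. Number of rows = **`k × l`**: the `l` components of each cluster centroid are emitted in order for clusters **0 → k−1** (concatenated). Timestamps are `0, 1, 2, …` (placeholders only, with no physical time meaning).
-
-**Note:**
-
-- The number of valid points must satisfy `n ≥ l`, and the number of windows must satisfy `⌊n/l⌋ ≥ k`.
-
-#### Example
-
-##### KShape: window length 3, k = 2
-
-The nine samples `{1,2,3,10,20,30,1,5,1}` form three non-overlapping windows of length 3: `{1,2,3}`, `{10,20,30}` and `{1,5,1}`. With **`method` = `kshape`** and the default **`norm` = `true`**, each row is the cluster id of one window and its timestamp is the start of that window. The resulting labels are **0, 0, 1**.
-
-Input series:
-
-```
-+-----------------------------+---------------+
-|                         Time|root.test.d0.s0|
-+-----------------------------+---------------+
-|2020-01-01T00:00:01.000+08:00|            1.0|
-|2020-01-01T00:00:02.000+08:00|            2.0|
-|2020-01-01T00:00:03.000+08:00|            3.0|
-|2020-01-01T00:00:04.000+08:00|           10.0|
-|2020-01-01T00:00:05.000+08:00|           20.0|
-|2020-01-01T00:00:06.000+08:00|           30.0|
-|2020-01-01T00:00:07.000+08:00|            1.0|
-|2020-01-01T00:00:08.000+08:00|            5.0|
-|2020-01-01T00:00:09.000+08:00|            1.0|
-+-----------------------------+---------------+
-```
-
-SQL for the query:
-
-```sql
-select cluster(s0, "l"="3", "k"="2", "method"="kshape", "output"="label")
-from root.test.d0
-```
-
-Output series:
-
-```
-+-----------------------------+----------------------------------------------------------------------------+
-|                         Time| cluster(root.test.d0.s0,"l"="3","k"="2","method"="kshape","output"="label")|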
-+-----------------------------+----------------------------------------------------------------------------+
-|2020-01-01T00:00:01.000+08:00|                                                                           0|
-|2020-01-01T00:00:04.000+08:00|                                                                           0|
-|2020-01-01T00:00:07.000+08:00|                                                                           1|
-+-----------------------------+----------------------------------------------------------------------------+
-```
diff --git a/src/zh/UserGuide/V1.3.x/Tools-System/CLI_timecho.md b/src/zh/UserGuide/V1.3.x/Tools-System/CLI_timecho.md
deleted file mode 100644
index 4cd3d1744..000000000
--- a/src/zh/UserGuide/V1.3.x/Tools-System/CLI_timecho.md
+++ /dev/null
@@ -1,168 +0,0 @@
-
-
-# SQL Command Line Interface (CLI)
-
-IoTDB provides cli/Shell tools for starting the client and server programs. The following describes how to run each cli/Shell tool and its parameters.
-> \$IOTDB\_HOME denotes the path of the IoTDB installation directory.
-
-## Running the Cli
-A freshly installed IoTDB has one default user, `root`, whose default password is `root`. You can use this user to run the IoTDB client and check that the server started correctly. The client startup script is the `start-cli` script in the $IOTDB_HOME/sbin folder. The IP and RPC PORT must be specified when launching the script. The example below assumes the server runs on the local machine and the port has not been changed from the default, 6667. To connect to a remote server, or if the server port was changed, pass the server's IP and RPC PORT to -h and -p.
-You can also set your own environment variables, such as JAVA_HOME, at the top of the startup script (for Linux users the script path is "/sbin/start-cli.sh"; for Windows users it is "/sbin/start-cli.bat").
-
-The startup command on Linux and macOS is:
-
-```shell
-Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root
-```
-The startup command on Windows is:
-
-```shell
-Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root
-```
-Press Enter to start the client. The following banner indicates a successful start.
-
-```
- _____       _________  ______   ______
-|_   _|     |  _   _  ||_   _ `.|_   _ \
-  | |   .--.|_/ | | \_|  | | `. \ | |_) |
-  | |  / .'`\ \  | |      | |  | | |  __'.
- _| |_| \__. | _| |_    _| |_.' /_| |__) |
-|_____|'.__.' |_____|  |______.'|_______/  version
-
-Successfully login at 127.0.0.1:6667
-```
-Enter `quit` or `exit` to leave the cli and end the session; the cli prints `quit normally` when the exit succeeds.
-
-## Cli parameters
-
-|Parameter|Type|Required|Description|Example|
-|:---|:---|:---|:---|:---|
-|-disableISO8601 |No argument | No |If set, IoTDB prints timestamps as numbers.|-disableISO8601|
-|-h <`host`> |string, no quotes needed|Yes|IP address of the IoTDB server that the client connects to.|-h 10.129.187.21|
-|-help|No argument|No|Prints the IoTDB help information.|-help|
-|-p <`rpcPort`>|int|Yes|Port of the server to connect to. IoTDB listens on port 6667 by default.|-p 6667|
-|-pw <`password`>|string, no quotes needed|No|Password used to connect to the server. If it is not supplied, the Cli prompts for it.|-pw root|
-|-u <`username`>|string, no quotes needed|Yes|User name used to connect to the server.|-u root|
-|-maxPRC <`maxPrintRowCount`>|int|No|Maximum number of rows displayed for a result in the client command line.|-maxPRC 10|
-|-e <`execute`> |string|No|Operates on IoTDB in batch, without entering the client's interactive mode.|-e "show databases"|
-|-c | None | No | If the server sets `rpc_thrift_compression_enable=true`, the CLI must use `-c`. | -c |
-
-The command below connects to the host at IP 10.129.187.21 on port 6667 as user root with password root, prints timestamps as numbers, and limits the IoTDB command line to 10 displayed rows.
-
-The startup command on Linux and macOS is:
-
-```shell
-Shell > bash sbin/start-cli.sh -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10
-```
-The startup command on Windows is:
-
-```shell
-Shell > sbin\start-cli.bat -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10
-```
-
-## CLI special commands
-Some special CLI commands are listed below.
-
-| **Parameter** | **Type** | **Required** | **Description** | **Example** | 
-|:-----------------------------|:-----------|:------------|:-----------------------------------------------------------|:---------------------|
-| -h `` | string | No | IP address of the IoTDB server that the client connects to. Defaults to 127.0.0.1. | -h 127.0.0.1 |
-| -p `` | int | No | Port of the server that the client connects to. IoTDB uses 6667 by default. | -p 6667 |
-| -u `` | string | No | User name the client uses to connect to the server. Defaults to root. | -u root |
-| -pw `` | string | No | Password the client uses to connect to the server. Defaults to TimechoDB@2021 (root before V2.0.6). | -pw root |
-| -e `` | string | No | Operates on IoTDB in batch, without entering the client's interactive mode. | -e "show databases" |
-| -c | None | No | If the server sets rpc_thrift_compression_enable=true, the CLI must use -c. | -c |
-| -disableISO8601 | None | No | If set, IoTDB prints timestamps as numbers. | -disableISO8601 |
-| -usessl `` | Boolean | No | Whether to enable an SSL connection. | -usessl true |
-| -ts `` | string | No | Path of the SSL trust store. | -ts /path/to/truststore |
-| -tpw `` | string | No | Password of the SSL trust store. | -tpw myTrustPassword |
-| -timeout `` | int | No | Query timeout in seconds. If unset, the server's configuration is used. | -timeout 30 |
-| -help | None | No | Prints the IoTDB help information. | -help |
-
-
-
-## Batch operations with the Cli
-To operate on IoTDB in batch from a script through the Cli / Shell, use the -e parameter. It lets you operate on IoTDB without entering the client's interactive mode.
-
-To keep the SQL statement from being confused with other parameters, -e is currently supported only as the last parameter.
-
-The -e parameter of the cli/Shell tool is used as follows:
-
-On Linux and macOS:
-
-```shell
-Shell > bash sbin/start-cli.sh -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb}
-```
-
-On Windows:
-```shell
-Shell > sbin\start-cli.bat -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb}
-```
-
-On Windows, the SQL statement passed to -e must use ` `` ` in place of `" "`.
-
-To illustrate the -e parameter, consider the following example run on Linux.
-
-Suppose the user wants to do the following on a newly started IoTDB:
-
-1. Create a database named root.demo
-
-2. Create a time series named root.demo.s1
-
-3. Insert three data points into the new time series
-
-4. 
Query the data to verify that the inserts succeeded
-
-With the -e parameter of the cli/Shell tool, the following script can be used:
-
-```shell
-#!/bin/bash
-
-host=127.0.0.1
-rpcPort=6667
-user=root
-pass=root
-
-bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "CREATE DATABASE root.demo"
-bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "create timeseries root.demo.s1 WITH DATATYPE=INT32, ENCODING=RLE"
-bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(1,10)"
-bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(2,11)"
-bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(3,12)"
-bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "select s1 from root.demo"
-```
-
-The printed result is shown below. Operations performed this way give the same results as the client's interactive mode and as operations through JDBC.
-
-```shell
- Shell > bash ./shell.sh
-+-----------------------------+------------+
-|                         Time|root.demo.s1|
-+-----------------------------+------------+
-|1970-01-01T08:00:00.001+08:00|          10|
-|1970-01-01T08:00:00.002+08:00|          11|
-|1970-01-01T08:00:00.003+08:00|          12|
-+-----------------------------+------------+
-Total line number = 3
-It costs 0.267s
-```
-
-Note in particular that special characters must be escaped when the -e parameter is used in a script.
-
diff --git a/src/zh/UserGuide/V1.3.x/Tools-System/Maintenance-Tool_timecho.md b/src/zh/UserGuide/V1.3.x/Tools-System/Maintenance-Tool_timecho.md
deleted file mode 100644
index 63d69149f..000000000
--- a/src/zh/UserGuide/V1.3.x/Tools-System/Maintenance-Tool_timecho.md
+++ /dev/null
@@ -1,1013 +0,0 @@
-
-
-# Cluster Management Tool
-
-## Cluster Management Tool
-
-The IoTDB cluster management tool is an easy-to-use operations tool (an enterprise-edition tool). It addresses the difficulty of operating the many nodes of a distributed IoTDB system and covers cluster deployment, cluster start and stop, elastic scale-out, configuration updates and data export, so that a complex database cluster can be driven with single commands, which greatly reduces management effort. This document describes how to use the tool to remotely deploy, configure, start and stop IoTDB cluster instances.
-
-### Environment preparation
-
-This tool accompanies TimechoDB (the enterprise database built on IoTDB). Contact your sales representative to find out how to download it.
-
-Machines on which IoTDB is to be deployed need JDK 8 or later, lsof, netstat and unzip. Install any that are missing; see the installation commands in the last section of this document.
-
-Note: the IoTDB cluster management tool must be run with an account that has root privileges
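
The prerequisites above can be checked by hand before running the tool. The following is a minimal sketch, not part of iotdbctl: the function name `missing_tools` is hypothetical, and the check only looks for the commands on the `PATH` (it does not verify the JDK version).

```shell
#!/usr/bin/env bash
# Sketch: report which of the given commands cannot be found on this machine.
# `missing_tools` is a hypothetical helper name, not an iotdbctl command.
missing_tools() {
  local tool missing=""
  for tool in "$@"; do
    # `command -v` succeeds only when the command can be resolved
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  # Print the missing names separated by single spaces (empty line if none)
  echo "${missing# }"
}

# Commands named in the prerequisites; the JDK version itself is not checked.
missing_tools java lsof netstat unzip
```

An empty output line means all four commands were found; otherwise the listed ones need to be installed on that machine first.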
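-
-### Deployment
-
-#### Download and installation
-
-This tool accompanies TimechoDB (the enterprise database built on IoTDB). Contact your sales representative to find out how to download it.
-
-Note: the binary package only supports GLIBC 2.17 and later, so the minimum supported system is CentOS 7.
-
-* Run the following command inside the iotd directory:
-
-```bash
-bash install-iotdbctl.sh
-```
-
-After that, the iotdbctl keyword is available in subsequent shells. For example, the command that checks the environment required before deployment is:
-
-```bash
-iotdbctl cluster check example
-```
-
-* Alternatively, without activating iotd, run commands directly with <iotdbctl absolute path>/sbin/iotdbctl, for example to check the environment required before deployment:
-
-```bash
-/sbin/iotdbctl cluster check example
-```
-
-### Directory structure
-
-The IoTDB cluster management tool consists of the config, logs, doc and sbin directories.
-
-* `config` holds the configuration files of the clusters to deploy. To use the cluster deployment tool, edit the yaml files in it.
-* `logs` holds the deployment tool's logs. To inspect the tool's execution log, see `logs/iotd_yyyy_mm_dd.log`.
-* `sbin` holds the binaries required by the cluster deployment tool.
-* `doc` holds the user manual, the developer manual and the recommended deployment manual.
-
-
-### Cluster configuration file
-
-* The `iotdbctl/config` directory contains the yaml files that configure clusters. The yaml file name is the cluster name, and there can be several yaml files. To make configuration easier, a `default_cluster.yaml` example is provided under the iotd/config directory.
-* A yaml configuration file consists of the `global`, `confignode_servers`, `datanode_servers`, `grafana_server` and `prometheus_server` sections.
-* global holds the common configuration: mainly the machine user name and password, the local IoTDB installation file, the JDK configuration and so on. The `iotdbctl/config` directory provides a `default_cluster.yaml` sample.
-  Copy it, rename it after your own cluster, and follow the notes inside to configure the IoTDB cluster. In the `default_cluster.yaml` sample, entries that are not commented out are required and entries that are commented out are optional.
-
-For example, to run the check command against `default_cluster.yaml`, execute `iotdbctl cluster check default_cluster`.
-See the command list below for more commands.
-
-
-
-| Parameter | Description | Required |
-|-------------------------|--------------------------------------------------------------|------|
-| iotdb\_zip\_dir | IoTDB distribution directory for deployment. If empty, IoTDB is downloaded from the address given by `iotdb_download_url`. | Optional |
-| iotdb\_download\_url | IoTDB download address. Used when `iotdb_zip_dir` has no value. | Optional |
-| jdk\_tar\_dir | Local JDK directory. This JDK can be uploaded and deployed to the target nodes. | Optional |
-| jdk\_deploy\_dir | JDK deployment directory on the remote machine. The JDK is deployed under this directory; together with the `jdk_dir_name` parameter below it forms the full JDK deployment directory, i.e. `/` | Optional |
-| jdk\_dir\_name | Name of the directory the JDK is extracted to. Defaults to jdk_iotdb. | Optional |
-| iotdb\_lib\_dir | IoTDB lib directory, or an archive of the IoTDB lib directory (only .zip is supported). Used only for IoTDB upgrades and commented out by default; to upgrade, uncomment it and set the path. When using a zip file, compress the iotdb/lib directory with the zip command, e.g. zip -r lib.zip apache\-iotdb\-1.2.0/lib/* | Optional |
-| user | User name for SSH login to the deployment machines | Required |
-| 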
password | Password for SSH login. If password is not given, pkey is used to log in; make sure passwordless SSH login between the nodes is configured. | Optional |
-| pkey | Key-based login. If password has a value, password takes precedence; otherwise pkey is used. | Optional |
-| ssh\_port | SSH login port | Required |
-| iotdb\_admin_user | IoTDB service user name. Defaults to root. | Optional |
-| iotdb\_admin_password | IoTDB service password. Defaults to root. | Optional |
-| deploy\_dir | IoTDB deployment directory. IoTDB is deployed under this directory; together with the `iotdb_dir_name` parameter below it forms the full IoTDB deployment directory, i.e. `/` | Required |
-| iotdb\_dir\_name | Name of the directory IoTDB is extracted to. Defaults to iotdb. | Optional |
-| datanode-env.sh | Corresponds to `iotdb/config/datanode-env.sh`. When both `global` and `confignode_servers` set a value, the value in `confignode_servers` takes precedence. | Optional |
-| confignode-env.sh | Corresponds to `iotdb/config/confignode-env.sh`. When both `global` and `datanode_servers` set a value, the value in `datanode_servers` takes precedence. | Optional |
-| iotdb-common.properties | Corresponds to `iotdb/config/iotdb-common.properties` | Optional |
-| cn\_seed\_config\_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode\_x. When both `global` and `confignode_servers` set a value, the value in `confignode_servers` takes precedence. Corresponds to `cn_seed_config_node` in `iotdb/config/iotdb-system.properties`. | Required |
-| dn\_seed\_config\_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode\_x. When both `global` and `datanode_servers` set a value, the value in `datanode_servers` takes precedence. Corresponds to `dn_seed_config_node` in `iotdb/config/iotdb-system.properties`. | Required |
-
-datanode-env.sh and confignode-env.sh accept an extra parameter, extra_opts. When it is set, the given values are appended to datanode-env.sh and confignode-env.sh. See default\_cluster.yaml; a configuration example follows:
-datanode-env.sh:
-  extra_opts: |
-    IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC"
-    IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200"
-
-
-* confignode_servers configures the IoTDB ConfigNodes to deploy; several ConfigNodes can be configured in it.
-  By default the first ConfigNode started, node1, is treated as the Seed-ConfigNode.
-
-| Parameter | Description | Required |
-|-----------------------------|--------------------------------------------------------------|------|
-| name | ConfigNode name | Required |
-| deploy\_dir | IoTDB config node deployment directory | Required |
-| cn\_internal\_address | Internal communication address; corresponds to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | Required |
-| cn\_seed\_config\_node | 
Cluster configuration address pointing to a live ConfigNode; defaults to confignode_x. When both `global` and `confignode_servers` set a value, the value in `confignode_servers` takes precedence. Corresponds to `cn_seed_config_node` in `iotdb/config/iotdb-confignode.properties`. | Required |
-| cn\_internal\_port | Internal communication port; corresponds to `cn_internal_port` in `iotdb/config/iotdb-system.properties` | Required |
-| cn\_consensus\_port | Corresponds to `cn_consensus_port` in `iotdb/config/iotdb-system.properties` | Optional |
-| cn\_data\_dir | Corresponds to `cn_data_dir` in `iotdb/config/iotdb-system.properties` | Required |
-| iotdb-system.properties | Corresponds to `iotdb/config/iotdb-system.properties`. When both `global` and `confignode_servers` set a value, the value in confignode\_servers takes precedence. | Optional |
-
-* datanode_servers configures the IoTDB DataNodes to deploy; several DataNodes can be configured in it.
-
-| Parameter | Description | Required |
-| -------------------------- | ------------------------------------------------------------ | -------- |
-| name | DataNode name | Required |
-| deploy_dir | IoTDB data node deployment directory. Note: this directory must not be the same as the IoTDB config node deployment directory. | Required |
-| dn_rpc_address | DataNode RPC address; corresponds to `dn_rpc_address` in `iotdb/config/iotdb-system.properties` | Required |
-| dn_internal_address | Internal communication address; corresponds to `dn_internal_address` in `iotdb/config/iotdb-system.properties` | Required |
-| dn_seed_config_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode_x. When both `global` and `datanode_servers` set a value, the value in `datanode_servers` takes precedence. Corresponds to `dn_seed_config_node` in `iotdb/config/iotdb-datanode.properties`. Using the SeedConfigNode is recommended. | Required |
-| dn_rpc_port | DataNode RPC port; corresponds to `dn_rpc_port` in `iotdb/config/iotdb-system.properties` | Required |
-| dn_internal_port | Internal communication port; corresponds to `dn_internal_port` in `iotdb/config/iotdb-system.properties` | Required |
-| iotdb-system.properties | Corresponds to `iotdb/config/iotdb-system.properties`. When both `global` and `datanode_servers` set a value, the value in `datanode_servers` takes precedence. | Optional |
-
-
-| Parameter | Description | Required |
-|---------------------------|--------------------------------------------------------------|--- |
-| name | DataNode name | Required |
-| deploy\_dir | IoTDB data node deployment directory | Required |
-| dn\_rpc\_address | DataNode RPC 
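address; corresponds to `dn_rpc_address` in `iotdb/config/iotdb-system.properties` | Required |
-| dn\_internal\_address | Internal communication address; corresponds to `dn_internal_address` in `iotdb/config/iotdb-system.properties` | Required |
-| dn\_seed\_config\_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode_x. When both `global` and `datanode_servers` set a value, the value in `datanode_servers` takes precedence. Corresponds to `dn_seed_config_node` in `iotdb/config/iotdb-system.properties`. | Required |
-| dn\_rpc\_port | DataNode RPC port; corresponds to `dn_rpc_port` in `iotdb/config/iotdb-system.properties` | Required |
-| dn\_internal\_port | Internal communication port; corresponds to `dn_internal_port` in `iotdb/config/iotdb-system.properties` | Required |
-| iotdb-system.properties | Corresponds to `iotdb/config/iotdb-common.properties`. When both `global` and `datanode_servers` set a value, the value in `datanode_servers` takes precedence. | Optional |
-
-* grafana_server configures the Grafana deployment
-
-| Parameter | Description | Required |
-|--------------------|------------------|-------------------|
-| grafana\_dir\_name | Name of the directory Grafana is extracted to | Optional; defaults to grafana_iotdb |
-| host | IP of the server Grafana is deployed on | Required |
-| grafana\_port | Port Grafana listens on | Optional; defaults to 3000 |
-| deploy\_dir | Grafana deployment directory on the server | Required |
-| grafana\_tar\_dir | Location of the Grafana archive | Required |
-| dashboards | Location of the dashboards | Optional; separate multiple entries with commas |
-
-* prometheus_server configures the Prometheus deployment
-
-| Parameter | Description | Required |
-|--------------------------------|------------------|-----------------------|
-| prometheus_dir\_name | Name of the directory Prometheus is extracted to | Optional; defaults to prometheus_iotdb |
-| host | IP of the server Prometheus is deployed on | Required |
-| prometheus\_port | Port Prometheus listens on | Optional; defaults to 9090 |
-| deploy\_dir | Prometheus deployment directory on the server | Required |
-| prometheus\_tar\_dir | Location of the Prometheus archive | Required |
-| storage\_tsdb\_retention\_time | Number of days data is kept. Defaults to 15 days. | Optional |
-| storage\_tsdb\_retention\_size | Amount of data a block may hold. Defaults to 512M; note the units KB, MB, GB, TB, PB, EB. | Optional |
-
-If metrics are configured in `iotdb-system.properties` in config/xxx.yaml, the configuration is written into Prometheus automatically and no manual change is needed.
-
-Note: if the value of a yaml key contains special characters such as `:`, wrap the whole value in double quotes. Do not use file paths that contain spaces, to avoid parsing problems.
-
-### Usage scenarios
-
-#### Cleaning data
-
-* The clean scenario deletes the data directory of the IoTDB cluster as well as these directories configured in the yaml file: `cn_system_dir`, `cn_consensus_dir`,
-  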
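`dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext`.
-* First run the stop command, then run the clean command.
-```bash
-iotdbctl cluster stop default_cluster
-iotdbctl cluster clean default_cluster
-```
-
-#### Destroying a cluster
-
-* The destroy scenario deletes the `data`, `cn_system_dir`, `cn_consensus_dir`,
-  `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories of the IoTDB cluster, the `IoTDB` deployment directory,
-  the grafana deployment directory and the prometheus deployment directory.
-* First run the stop command, then run the destroy command.
-
-
-```bash
-iotdbctl cluster stop default_cluster
-iotdbctl cluster destroy default_cluster
-```
-
-#### Upgrading a cluster
-
-* To upgrade a cluster, first set `iotdb_lib_dir` in config/xxx.yaml to the directory containing the jars to upload to the servers (for example iotdb/lib).
-* To upload a zip file instead, compress the iotdb/lib directory with the zip command, e.g. zip -r lib.zip apache-iotdb-1.2.0/lib/*
-* Run the upload command, then restart the IoTDB cluster to complete the upgrade.
-
-```bash
-iotdbctl cluster dist-lib default_cluster
-iotdbctl cluster restart default_cluster
-```
-
-#### Hot-reloading the cluster configuration
-
-* First change the configuration in config/xxx.yaml.
-* Run the distribute command, then the reload command, to hot-deploy the cluster configuration.
-
-```bash
-iotdbctl cluster dist-conf default_cluster
-iotdbctl cluster reload default_cluster
-```
-
-#### Scaling out a cluster
-
-* First add a datanode or confignode node in config/xxx.yaml.
-* Run the scale-out command
-```bash
-iotdbctl cluster scaleout default_cluster
-```
-
-#### Scaling in a cluster
-
-* First find, in config/xxx.yaml, the name or ip+port of the node to remove (the confignode port is cn_internal_port and the datanode port is rpc_port)
-* Run the scale-in command
-```bash
-iotdbctl cluster scalein default_cluster
-```
-
-#### Using the cluster deployment tool with an existing IoTDB cluster
-
-* Configure the servers' `user`, `password` or `pkey`, and `ssh_port`
-* Change the IoTDB deployment path in config/xxx.yaml: `deploy_dir` (IoTDB deployment directory) and `iotdb_dir_name` (name of the directory IoTDB is extracted to, iotdb by default)
-  For example, if the full IoTDB deployment path is `/home/data/apache-iotdb-1.1.1`, set `deploy_dir:/home/data/` and `iotdb_dir_name:apache-iotdb-1.1.1` in the yaml file
-* If the servers do not use java_home, change `jdk_deploy_dir` (JDK deployment directory) and `jdk_dir_name` (name of the directory the JDK is extracted to, jdk_iotdb by default); if java_home is used, no change is needed
-  For example, if the full JDK deployment path is `/home/data/jdk_1.8.2`, set `jdk_deploy_dir:/home/data/` and `jdk_dir_name:jdk_1.8.2` in the yaml file
-* Configure `cn_seed_config_node` and `dn_seed_config_node`
-* 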
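In `iotdb-system.properties` under `confignode_servers`, configure `cn_internal_address`, `cn_internal_port`, `cn_consensus_port`, `cn_system_dir` and
-  `cn_consensus_dir` if their values differ from the IoTDB defaults; otherwise they can be left out
-* In `iotdb-system.properties` under `datanode_servers`, configure `dn_rpc_address`, `dn_internal_address`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir` and so on
-* Run the init command
-
-```bash
-iotdbctl cluster init default_cluster
-```
-
-#### Deploying IoTDB, Grafana and Prometheus in one step
-
-* Configure `iotdb-system.properties` to enable the metrics interface
-* Configure Grafana. If there are several `dashboards`, separate them with commas; names must be unique, otherwise they overwrite each other.
-* Configure Prometheus. If metrics are configured for the IoTDB cluster, the Prometheus configuration does not need manual changes: it is updated automatically according to which nodes have metrics configured.
-* Start the cluster
-
-```bash
-iotdbctl cluster start default_cluster
-```
-
-For more detailed parameters, see the Cluster configuration file section above
-
-
-### Command format
-
-The basic usage of the tool is:
-```bash
-iotdbctl cluster [params (Optional)]
-```
-* key is the specific command.
-
-* cluster name is the name of the cluster (the name of a yaml file under `iotdbctl/config`).
-
-* params are the parameters of the command (optional).
-
-* For example, the command that deploys the default_cluster cluster is:
-
-```bash
-iotdbctl cluster deploy default_cluster
-```
-
-* The cluster functions and their parameters are listed below:
-
-| Command | Function | Parameters |
-|-----------------|-------------------------------|--------------------------------------------------------------|
-| check | Check whether the cluster can be deployed | List of cluster names |
-| clean | Clean the cluster | Cluster name |
-| deploy/dist-all | Deploy the cluster | Cluster name, -N module name (iotdb, grafana, prometheus; optional), -op force (optional) |
-| list | Print the clusters and their status | None |
-| start | Start the cluster | Cluster name, -N node name (nodename, grafana, prometheus; optional) |
-| stop | Stop the cluster | Cluster name, -N node name (nodename, grafana, prometheus; optional), -op force (nodename, grafana, prometheus; optional) |
-| restart | Restart the cluster | Cluster name, -N node name (nodename, grafana, prometheus; optional), -op force (forced stop)/rolling (rolling restart) |
-| show | Show cluster information; the details field shows detailed cluster information | Cluster name, details (optional) |
-| destroy | Destroy the cluster | Cluster name, -N module name (iotdb, grafana, prometheus; optional) |
-| scaleout | Scale out the cluster | Cluster name |
-| scalein | Scale in the cluster | Cluster name, -N cluster node name or cluster node ip+port |
-| reload | Hot-reload the cluster | Cluster name |
-| dist-conf | Distribute the cluster configuration files | Cluster name |
-| dumplog | Back up the logs of the given cluster | Cluster name, -N cluster node name -h target machine ip -pw target machine password -p target machine port -path backup directory -startdate start time 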
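-enddate end time -loglevel log level -l transfer speed |
-| dumpdata | Back up the data of the given cluster | Cluster name, -h target machine ip -pw target machine password -p target machine port -path backup directory -startdate start time -enddate end time -l transfer speed |
-| dist-lib | Upgrade the lib package | Cluster name (restart after the upgrade) |
-| init | Initialize the cluster configuration when an existing cluster adopts the deployment tool | Cluster name; initializes the cluster configuration |
-| status | Show process status | Cluster name |
-| activate | Activate the cluster | Cluster name |
-| dist-plugin | Upload a plugin (udf, trigger, pipe) to the cluster | Cluster name, -type U(udf)/T(trigger)/P(pipe) -file /xxxx/trigger.jar; after the upload, the create udf, pipe or trigger command must be run manually |
-| upgrade | Rolling upgrade | Cluster name |
-| health_check | Health check | Cluster name, -N node name (optional) |
-| backup | Offline backup | Cluster name, -N node name (optional) |
-| importschema | Import metadata | Cluster name, -N node name (required) -param parameters |
-| exportschema | Export metadata | Cluster name, -N node name (required) -param parameters |
-
-
-### Detailed command execution
-
-The commands below all use default_cluster.yaml as the example; replace it with your own cluster file as needed
-
-#### Checking the cluster deployment environment
-
-```bash
-iotdbctl cluster check default_cluster
-```
-
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers` and `datanode_servers` configuration
-
-* Verify that the target nodes can be reached over SSH
-
-* Verify that the JDK on each node is 1.8 or later as IoTDB requires, and that unzip and lsof or netstat are installed on the server
-
-* The message `Info:example check successfully!` means the servers meet the installation requirements.
-  The message `Error:example check fail!` means some conditions are not met; check the Error lines printed above it (for example `Error:Server (ip:172.20.31.76) iotdb port(10713) is listening`) and fix them.
-  If the JDK check fails, you can configure a JDK 1.8 or later in the yaml file for deployment; this does not affect later use.
-  If the lsof, netstat or unzip check fails, install them on the server yourself.
-
-#### Deploying the cluster
-
-```bash
-iotdbctl cluster deploy default_cluster
-```
-
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers` and `datanode_servers` configuration
-
-* Upload the IoTDB archive and the JDK archive to the nodes listed in `confignode_servers` and `datanode_servers` (the JDK only if `jdk_tar_dir` and `jdk_deploy_dir` are set in the yaml file)
-
-* Generate `iotdb-system.properties` from the node configuration in the yaml file and upload it
-
-```bash
-iotdbctl cluster deploy default_cluster -op force
-```
-Note: this command forces the deployment. It deletes the existing deployment directory and deploys again.
-
-*Deploying a single module*
-```bash
-# Deploy the grafana module
-iotdbctl cluster deploy default_cluster -N grafana
-# Deploy the prometheus module
-iotdbctl cluster deploy default_cluster -N prometheus
-# Deploy the iotdb module
-iotdbctl cluster deploy default_cluster -N iotdb
-```
-
-#### Starting the cluster
-
-```bash
-iotdbctl 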
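cluster start default_cluster
-```
-
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers` and `datanode_servers` configuration
-
-* Start the ConfigNodes one by one in the order given by `confignode_servers` in the yaml file, checking by process id that each ConfigNode is running; the first ConfigNode is the seed ConfigNode
-
-* Start the DataNodes one by one in the order given by `datanode_servers` in the yaml file, checking by process id that each DataNode is running
-
-* Once the process-id check confirms the processes exist, check each service in the cluster list through the cli. If the cli connection fails, retry every 10s until it succeeds, at most 5 times
-
-
-*Starting a single node*
-```bash
-# Start by IoTDB node name
-iotdbctl cluster start default_cluster -N datanode_1
-# Start by IoTDB cluster ip+port, where port is cn_internal_port for a confignode and rpc_port for a datanode
-iotdbctl cluster start default_cluster -N 192.168.1.5:6667
-# Start grafana
-iotdbctl cluster start default_cluster -N grafana
-# Start prometheus
-iotdbctl cluster start default_cluster -N prometheus
-```
-
-* Using the cluster-name, locate the yaml file in its default location
-
-* Find the node's location from the given node name or ip:port. If the node is a `data_node`, the ip is `dn_rpc_address` from the yaml file and the port is `dn_rpc_port` under datanode_servers in the yaml file.
-  If the node is a `config_node`, the ip is `cn_internal_address` under confignode_servers in the yaml file and the port is `cn_internal_port`
-
-* Start the node
-
-Note: the cluster deployment tool only calls the start-confignode.sh and start-datanode.sh scripts of the IoTDB cluster.
-If the reported result is a failure, the cluster may simply not have finished starting; use the status command to check the current cluster state (iotdbctl cluster status xxx)
-
-
-#### Viewing the IoTDB cluster state
-
-```bash
-iotdbctl cluster show default_cluster
-# Show detailed IoTDB cluster information
-iotdbctl cluster show default_cluster details
-```
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers` and `datanode_servers` configuration
-
-* Run `show cluster details` through the cli on each DataNode in turn. As soon as one node succeeds, the result is returned and the remaining nodes are skipped
-
-
-#### Stopping the cluster
-
-
-```bash
-iotdbctl cluster stop default_cluster
-```
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers` and `datanode_servers` configuration
-
-* Stop the DataNodes one by one in the order they are configured in `datanode_servers`
-
-* Stop the ConfigNodes one by one in the order they are configured in `confignode_servers`
-
-*Force-stopping the cluster*
-
-```bash
-iotdbctl cluster stop default_cluster -op force
-```
-This runs kill -9 pid directly to force the cluster to stop
-
-*Stopping a single node*
-
-```bash
-# Stop by IoTDB node name
-iotdbctl cluster stop default_cluster -N datanode_1
-# Stop by IoTDB 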
cluster ip+port (the ip+port identifies a unique node: ip plus dn_rpc_port for a datanode, or ip plus cn_internal_port for a confignode)
-iotdbctl cluster stop default_cluster -N 192.168.1.5:6667
-# Stop grafana
-iotdbctl cluster stop default_cluster -N grafana
-# Stop prometheus
-iotdbctl cluster stop default_cluster -N prometheus
-```
-
-* Using the cluster-name, locate the yaml file in its default location
-
-* Find the node's location from the given node name or ip:port. If the node is a `data_node`, the ip is `dn_rpc_address` from the yaml file and the port is `dn_rpc_port` under datanode_servers in the yaml file.
-  If the node is a `config_node`, the ip is `cn_internal_address` under confignode_servers in the yaml file and the port is `cn_internal_port`
-
-* Stop the node
-
-Note: the cluster deployment tool only calls the stop-confignode.sh and stop-datanode.sh scripts of the IoTDB cluster, so in some cases the IoTDB cluster may not actually have stopped.
-
-
-#### Cleaning the cluster data
-
-```bash
-iotdbctl cluster clean default_cluster
-```
-
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers` and `datanode_servers` configuration
-
-* Check, from the information in `confignode_servers` and `datanode_servers`, whether any service is still running.
-  If any service is running, the clean command is not executed
-
-* Delete the data directory of the IoTDB cluster as well as the `cn_system_dir`, `cn_consensus_dir`,
-  `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories configured in the yaml file.
-
-
-
-#### Restarting the cluster
-
-```bash
-iotdbctl cluster restart default_cluster
-```
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` configuration
-
-* Run the stop command described above, then the start command; see the start and stop commands above for details
-
-*Force-restarting the cluster*
-
-```bash
-iotdbctl cluster restart default_cluster -op force
-```
-This runs kill -9 pid directly to force the cluster to stop, then starts the cluster
-
-*Restarting a single node*
-
-```bash
-# Restart datanode_1 by IoTDB node name
-iotdbctl cluster restart default_cluster -N datanode_1
-# Restart confignode_1 by IoTDB node name
-iotdbctl cluster restart default_cluster -N confignode_1
-# Restart grafana
-iotdbctl cluster restart default_cluster -N grafana
-# Restart prometheus
-iotdbctl cluster restart default_cluster -N prometheus
-```
-
-#### Scaling in the cluster
-
-```bash
-# Scale in by node name
-iotdbctl cluster scalein default_cluster -N nodename
-# Scale in by ip+port (the ip+port identifies a unique node: ip plus dn_rpc_port for a datanode, ip plus cn_internal_port for a confignode)
-iotdbctl cluster scalein default_cluster -N ip:port
-```
-* Using the 
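cluster-name, locate the yaml file in its default location and read the `confignode_servers` and `datanode_servers` configuration
-
-* Check whether only one ConfigNode or only one DataNode would remain; if only one is left, the scale-in cannot be performed
-
-* Then look up the node to remove by ip:port or nodename, run the scale-in command, and destroy that node's directory. If the node is a `data_node`, the ip is `dn_rpc_address` from the yaml file and the port is `dn_rpc_port` under datanode_servers in the yaml file.
-  If the node is a `config_node`, the ip is `cn_internal_address` under confignode_servers in the yaml file and the port is `cn_internal_port`
-
-
-Note: only one node can currently be removed at a time
-
-#### Scaling out the cluster
-
-```bash
-iotdbctl cluster scaleout default_cluster
-```
-* Edit config/xxx.yaml to add a datanode or confignode node
-
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers` and `datanode_servers` configuration
-
-* Find the node to add, upload the IoTDB archive and the JDK package (the JDK only if `jdk_tar_dir` and `jdk_deploy_dir` are set in the yaml file), and extract them
-
-* Generate `iotdb-system.properties` from the node configuration in the yaml file and upload it
-
-* Start the node and verify that it started successfully
-
-Note: only one node can currently be added at a time
-
-#### Destroying the cluster
-```bash
-iotdbctl cluster destroy default_cluster
-```
-
-* Using the cluster-name, locate the yaml file in its default location
-
-* Check, from the node information in `confignode_servers`, `datanode_servers`, `grafana` and `prometheus`, whether any node is still running.
-  If any node is running, the destroy command is aborted
-
-* Delete the `data` directory of the IoTDB cluster as well as the `cn_system_dir`, `cn_consensus_dir`,
-  `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs` and `ext` directories configured in the yaml file, the `IoTDB` deployment directory,
-  the grafana deployment directory and the prometheus deployment directory
-
-*Destroying a single module*
-```bash
-# Destroy the grafana module
-iotdbctl cluster destroy default_cluster -N grafana
-# Destroy the prometheus module
-iotdbctl cluster destroy default_cluster -N prometheus
-# Destroy the iotdb module
-iotdbctl cluster destroy default_cluster -N iotdb
-```
-
-#### Distributing the cluster configuration
-```bash
-iotdbctl cluster dist-conf default_cluster
-```
-
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` configuration
-
-* Generate `iotdb-system.properties` from the node configuration in the yaml file and upload it to each specified node in turn
-
-#### Hot-reloading the cluster configuration
-```bash
-iotdbctl cluster reload default_cluster
-```
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers` and `datanode_servers` configuration
-
-* Run `load configuration` in the cli on each node in turn, following the node configuration in the yaml file
-
-#### Backing up cluster node logs
-```bash
-iotdbctl cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate 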
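'2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs'
-```
-* Using the cluster-name, locate the yaml file in its default location
-
-* The command checks against the yaml file that datanode_1 and confignode_1 exist, then backs up the logs of those nodes within the configured date range (startdate<=logtime<=enddate) to the server `192.168.9.48` on port `36000`, under the backup path `/iotdb/logs`. The IoTDB logs are read from `/root/data/db/iotdb/logs` (optional; if -logs xxx is omitted, logs are backed up from /logs under the IoTDB installation path)
-
-| Option | Function | Required |
-|------------|------------------------------------| ---|
-| -h | IP of the machine that stores the backup | No |
-| -u | User name on the machine that stores the backup | No |
-| -pw | Password on the machine that stores the backup | No |
-| -p | Port of the machine that stores the backup (default 22) | No |
-| -path | Path where the backup is stored (default: current path) | No |
-| -loglevel | Log level: all, info, error or warn (default: all) | No |
-| -l | Rate limit (unlimited by default; range 0 to 104857601, in Kbit/s) | No |
-| -N | Names from the cluster configuration file; separate multiple names with commas | Yes |
-| -startdate | Start time (inclusive; default 1970-01-01) | No |
-| -enddate | End time (inclusive) | No |
-| -logs | Path of the IoTDB logs (default: {iotdb}/logs) | No |
-
-#### Backing up cluster node data
-```bash
-iotdbctl cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas'
-```
-* The command reads the leader node from the yaml file, then backs up the data within the date range (startdate<=logtime<=enddate) to the /iotdb/datas directory on the server 192.168.9.48
-
-| Option | Function | Required |
-| ---|---------------------------------| ---|
-|-h| IP of the machine that stores the backup | No |
-|-u| User name on the machine that stores the backup | No |
-|-pw| Password on the machine that stores the backup | No |
-|-p| Port of the machine that stores the backup (default 22) | No |
-|-path| Path where the backup is stored (default: current path) | No |
-|-granularity| Granularity type: partition | Yes |
-|-l| Rate limit (unlimited by default; range 0 to 104857601, in Kbit/s) | No |
-|-startdate| Start time (inclusive) | Yes |
-|-enddate| End time (inclusive) | Yes |
-
-#### Uploading the cluster lib package (upgrade)
-```bash
-iotdbctl cluster dist-lib default_cluster
-```
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers` and `datanode_servers` configuration
-
-* Upload the lib package
-
-Note: restart IoTDB after the upgrade for it to take effect
-
-#### Initializing the cluster
-```bash
-iotdbctl cluster init default_cluster
-```
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` configuration
-* Initialize the cluster configuration
-
-#### Viewing the cluster process status
-```bash
-iotdbctl cluster status default_cluster
-```
-* Using the cluster-name, locate the yaml 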
file in its default location and read the `confignode_servers`, `datanode_servers`, `grafana` and `prometheus` configuration
-* Show which cluster processes are alive
-
-#### Activating the cluster license
-
-By default the cluster is activated by entering an activation code. It can also be activated from a license path with -op license_path
-
-* Default activation
-```bash
-iotdbctl cluster activate default_cluster
-```
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers` configuration
-* Read the machine code from it
-* Wait for the activation code to be entered
-
-```bash
-Machine code:
-Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ==
-JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u==
-Please enter the activation code:
-JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws=
-Activation successful
-```
-* Activating a single node
-
-```bash
-iotdbctl cluster activate default_cluster -N confignode1
-```
-
-* Activating from a license path
-
-```bash
-iotdbctl cluster activate default_cluster -op license_path
-```
-* Using the cluster-name, locate the yaml file in its default location and read the `confignode_servers` configuration
-* Read the machine code from it
-* Wait for the activation code to be entered
-
-```bash
-Machine code:
-Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ==
-JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u==
-Please enter the activation code:
-JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws=
-Activation successful
-```
-* Activating a single node
-
-```bash
-iotdbctl cluster activate default_cluster -N confignode1 -op license_path
-```
-
-### Distributing cluster plugins
-```bash
-# Distribute a udf
-iotdbctl cluster dist-plugin default_cluster -type U -file /xxxx/udf.jar
-# Distribute a trigger
-iotdbctl cluster dist-plugin default_cluster -type T -file /xxxx/trigger.jar
-# Distribute a pipe
-iotdbctl cluster dist-plugin default_cluster -type P -file /xxxx/pipe.jar
-```
-* Using the cluster-name, locate the yaml file in its default location and read the `datanode_servers` configuration
-
-* Upload the udf/trigger/pipe jar
-
-After the upload, the create udf/trigger/pipe command must be run manually
-
-### Rolling upgrade of the cluster
-```bash
-iotdbctl cluster upgrade default_cluster
-```
-* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 上传lib包 -* confignode 执行停止、替换lib包、启动,然后datanode执行停止、替换lib包、启动 - - - -### 集群健康检查 -```bash -iotdbctl cluster health_check default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 -* 每个节点执行health_check.sh - -* 单个节点健康检查 -```bash -iotdbctl cluster health_check default_cluster -N datanode_1 -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`datanode_servers`配置信息 -* datanode1 执行health_check.sh - - -### 集群停机备份 -```bash -iotdbctl cluster backup default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 -* 每个节点执行backup.sh - -* 单个节点健康检查 -```bash -iotdbctl cluster backup default_cluster -N datanode_1 -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`datanode_servers`配置信息 -* datanode1 执行backup.sh - -说明:多个节点部署到单台机器,只支持 quick 模式 - -### 集群元数据导入 - -```bash -iotdbctl cluster importschema default_cluster -N datanode1 -param "-s ./dump0.csv -fd ./failed/ -lpf 10000" -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`datanode_servers`配置信息 -* datanode1 执行元数据导入import-schema.sh - -其中 -param的参数如下: - -| 命令 | 功能 | 是否必填 | -|-----|---------------------------------|------| -| -s |指定想要导入的数据文件,这里可以指定文件或者文件夹。如果指定的是文件夹,将会把文件夹中所有的后缀为csv的文件进行批量导入。 | 是 | -| -fd |指定一个目录来存放导入失败的文件,如果没有指定这个参数,失败的文件将会被保存到源数据的目录中,文件名为是源文件名加上.failed的后缀。 | 否 | -| -lpf |用于指定每个导入失败文件写入数据的行数,默认值为10000 | 否 | - - - -### 集群元数据导出 - -```bash -iotdbctl cluster exportschema default_cluster -N datanode1 -param "-t ./ -pf ./pattern.txt -lpf 10 -t 10000" -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`datanode_servers`配置信息 -* datanode1 执行元数据导入export-schema.sh - -其中 -param的参数如下: - -| 命令 | 功能 | 是否必填 | -|-----|------------------------------------------------------------|------| -| -t | 为导出的CSV文件指定输出路径 | 是 | -| -path |指定导出元数据的path pattern,指定该参数后会忽略-s参数例如:root.stock.** | 否 | -| -pf |如果未指定-path,则需指定该参数,指定查询元数据路径所在文件路径,支持 txt 文件格式,每个待导出的路径为一行。 | 否 | -| -lpf |指定导出的dump文件最大行数,默认值为10000。 | 否 | -| -timeout 
|指定session查询时的超时时间,单位为ms | 否 | - - - -### 集群部署工具样例介绍 -在集群部署工具安装目录中config/example 下面有3个yaml样例,如果需要可以复制到config 中进行修改即可 - -| 名称 | 说明 | -|-----------------------------|------------------------------------------------| -| default\_1c1d.yaml | 1个confignode和1个datanode 配置样例 | -| default\_3c3d.yaml | 3个confignode和3个datanode 配置样例 | -| default\_3c3d\_grafa\_prome | 3个confignode和3个datanode、Grafana、Prometheus配置样例 | - -## 数据文件夹概览工具 - -IoTDB数据文件夹概览工具用于打印出数据文件夹的结构概览信息,工具位置为 tools/tsfile/print-iotdb-data-dir。 - -### 用法 - -- Windows: - -```bash -.\print-iotdb-data-dir.bat (<输出结果的存储路径>) -``` - -- Linux or MacOs: - -```shell -./print-iotdb-data-dir.sh (<输出结果的存储路径>) -``` - -注意:如果没有设置输出结果的存储路径, 将使用相对路径"IoTDB_data_dir_overview.txt"作为默认值。 - -### 示例 - -以Windows系统为例: - -`````````````````````````bash -.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data -```````````````````````` -Starting Printing the IoTDB Data Directory Overview -```````````````````````` -output save path:IoTDB_data_dir_overview.txt -data dir num:1 -143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
-|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -````````````````````````` - -## TsFile概览工具 - -TsFile概览工具用于以概要模式打印出一个TsFile的内容,工具位置为 tools/tsfile/print-tsfile。 - -### 用法 - -- Windows: - -```bash -.\print-tsfile-sketch.bat (<输出结果的存储路径>) -``` - -- Linux or MacOs: - -```shell -./print-tsfile-sketch.sh (<输出结果的存储路径>) -``` - -注意:如果没有设置输出结果的存储路径, 将使用相对路径"TsFile_sketch_view.txt"作为默认值。 - -### 示例 - -以Windows系统为例: - -`````````````````````````bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
--------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] 
offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - └──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -````````````````````````` - -解释: - -- 以"|"为分隔,左边是在TsFile文件中的实际位置,右边是梗概内容。 -- "|||||||||||||||||||||"是为增强可读性而添加的导引信息,不是TsFile中实际存储的数据。 -- 最后打印的"IndexOfTimerseriesIndex Tree"是对TsFile文件末尾的元数据索引树的重新整理打印,便于直观理解,不是TsFile中存储的实际数据。 - -## TsFile Resource概览工具 - -TsFile resource概览工具用于打印出TsFile resource文件的内容,工具位置为 tools/tsfile/print-tsfile-resource-files。 - -### 用法 - -- Windows: - -```bash -.\print-tsfile-resource-files.bat -``` - -- Linux or MacOs: - -``` -./print-tsfile-resource-files.sh -``` - -### 示例 - -以Windows系统为例: - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` 
-Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. -````````````````````````` - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. 
-188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. -````````````````````````` diff --git a/src/zh/UserGuide/V1.3.x/Tools-System/Monitor-Tool_timecho.md b/src/zh/UserGuide/V1.3.x/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index d3cc9dceb..000000000 --- a/src/zh/UserGuide/V1.3.x/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,170 +0,0 @@ - - - -# 监控工具 - -监控工具的部署可参考文档 [监控面板部署](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) 章节。 - -## 监控指标的 Prometheus 映射关系 - -> 对于 Metric Name 为 name, Tags 为 K1=V1, ..., Kn=Vn 的监控指标有如下映射,其中 value 为具体值 - -| 监控指标类型 | 映射关系 | -| ---------------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Counter | name_total{cluster="clusterName", nodeType="nodeType", 
nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value | -| AutoGauge、Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value | -| Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value | -| Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m1"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m5"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m15"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="mean"} value | -| Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value |
-## Modifying the Configuration File
-
-1) Taking DataNode as an example, modify the iotdb-system.properties configuration file as follows:
-
-```properties
-dn_metric_reporter_list=PROMETHEUS
-dn_metric_level=CORE
-dn_metric_prometheus_reporter_port=9091
-```
-
-2) Start the IoTDB DataNode.
-
-3) Open a browser or use ```curl``` to access ```http://server_ip:9091/metrics```, and you will get metric data like the following:
-
-```
-...
-# HELP file_count
-# TYPE file_count gauge
-file_count{name="wal",} 0.0
-file_count{name="unseq",} 0.0
-file_count{name="seq",} 2.0
-...
-```
-
-## Prometheus + Grafana
-
-As shown above, IoTDB exposes monitoring metrics in the standard Prometheus format, so Prometheus can be used to collect and store the metrics and Grafana to visualize them.
-
-The relationship among IoTDB, Prometheus, and Grafana is shown in the figure below:
-
-![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png)
-
-1. IoTDB continuously collects monitoring metrics while it is running.
-2. Prometheus pulls the metrics from IoTDB's HTTP interface at a fixed (configurable) interval.
-3. Prometheus stores the pulled metrics in its own TSDB.
-4. Grafana queries the metrics from Prometheus at a fixed (configurable) interval and plots them.
-
-As the interaction flow shows, some extra work is needed to deploy and configure Prometheus and Grafana.
-
-For example, you can configure Prometheus as follows (some parameters can be adjusted as needed) to pull monitoring data from IoTDB:
-
-```yaml
-job_name: pull-metrics
-honor_labels: true
-honor_timestamps: true
-scrape_interval: 15s
-scrape_timeout: 10s
-metrics_path: /metrics
-scheme: http
-follow_redirects: true
-static_configs:
-  - targets:
-      - localhost:9091
-```
-
-For more details, refer to the following documents:
-
-[Prometheus installation and usage documentation](https://prometheus.io/docs/prometheus/latest/getting_started/)
-
-[Prometheus configuration for pulling metrics from an HTTP interface](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config)
-
-[Grafana installation and usage documentation](https://grafana.com/docs/grafana/latest/getting-started/getting-started/)
-
-[Documentation on querying data from Prometheus and plotting it in Grafana](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus)
-
-## Apache IoTDB Dashboard
-
-We provide the Apache IoTDB Dashboard, which supports unified, centralized operations management and can monitor multiple clusters from a single monitoring panel.
-
-![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png)
-
-![Apache IoTDB
Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png) - -你可以在企业版中获取到 Dashboard 的 Json文件。 - -### 集群概览 - -可以监控包括但不限于: -- 集群总CPU核数、总内存空间、总硬盘空间 -- 集群包含多少个ConfigNode与DataNode -- 集群启动时长 -- 集群写入速度 -- 集群各节点当前CPU、内存、磁盘使用率 -- 分节点的信息 - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png) - -### 数据写入 - -可以监控包括但不限于: -- 写入平均耗时、耗时中位数、99%分位耗时 -- WAL文件数量与尺寸 -- 节点 WAL flush SyncBuffer 耗时 - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png) - -### 数据查询 - -可以监控包括但不限于: -- 节点查询加载时间序列元数据耗时 -- 节点查询读取时间序列耗时 -- 节点查询修改时间序列元数据耗时 -- 节点查询加载Chunk元数据列表耗时 -- 节点查询修改Chunk元数据耗时 -- 节点查询按照Chunk元数据过滤耗时 -- 节点查询构造Chunk Reader耗时的平均值 - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png) - -### 存储引擎 - -可以监控包括但不限于: -- 分类型的文件数量、大小 -- 处于各阶段的TsFile数量、大小 -- 各类任务的数量与耗时 - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png) - -### 系统监控 - -可以监控包括但不限于: -- 系统内存、交换内存、进程内存 -- 磁盘空间、文件数、文件尺寸 -- JVM GC时间占比、分类型的GC次数、GC数据量、各年代的堆内存占用 -- 网络传输速率、包发送速率 - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png) diff --git a/src/zh/UserGuide/V1.3.x/Tools-System/Workbench_timecho.md b/src/zh/UserGuide/V1.3.x/Tools-System/Workbench_timecho.md deleted file mode 100644 index 2eb3f47b4..000000000 --- a/src/zh/UserGuide/V1.3.x/Tools-System/Workbench_timecho.md +++ /dev/null @@ -1,34 +0,0 @@ -# 可视化控制台 - -可视化控制台的部署可参考文档 [可视化控制台部署](../Deployment-and-Maintenance/workbench-deployment_timecho.md) 章节。 - -## 第1章 产品介绍 -IoTDB可视化控制台是在IoTDB企业版时序数据库基础上针对工业场景的实时数据收集、存储与分析一体化的数据管理场景开发的扩展组件,旨在为用户提供高效、可靠的实时数据存储和查询解决方案。它具有体量轻、性能高、易使用的特点,完美对接 Hadoop 与 Spark 生态,适用于工业物联网应用中海量时间序列数据高速写入和复杂分析查询的需求。 - -## 第2章 使用说明 -IoTDB的可视化控制台包含以下功能模块: -| **功能模块** | **功能说明** | -| ------------ | ------------------------------------------------------------ | -| 实例管理 | 支持对连接实例进行统一管理,支持创建、编辑和删除,同时可以可视化呈现多实例的关系,帮助客户更清晰的管理多数据库实例 | -| 首页 | 
支持查看数据库实例中各节点的服务运行状态(如是否激活、是否运行、IP信息等),支持查看集群、ConfigNode、DataNode运行监控状态,对数据库运行健康度进行监控,判断实例是否有潜在运行问题 | -| 测点列表 | 支持直接查看实例中的测点信息,包括所在数据库信息(如数据库名称、数据保存时间、设备数量等),及测点信息(测点名称、数据类型、压缩编码等),同时支持单条或批量创建、导出、删除测点 | -| 数据模型 | 支持查看各层级从属关系,将层级模型直观展示 | -| 数据查询 | 支持对常用数据查询场景提供界面式查询交互,并对查询数据进行批量导入、导出 | -| 统计查询 | 支持对常用数据统计场景提供界面式查询交互,如最大值、最小值、平均值、总和的结果输出。 | -| SQL操作 | 支持对数据库SQL进行界面式交互,单条或多条语句执行,结果的展示和导出 | -| 趋势 | 支持一键可视化查看数据整体趋势,对选中测点进行实时&历史数据绘制,观察测点实时&历史运行状态 | -| 分析 | 支持将数据通过不同的分析方式(如傅里叶变换等)进行可视化展示 | -| 视图 | 支持通过界面来查看视图名称、视图描述、结果测点以及表达式等信息,同时还可以通过界面交互快速的创建、编辑、删除视图 | -| 数据同步 | 支持对数据库间的数据同步任务进行直观创建、查看、管理,支持直接查看任务运行状态、同步数据和目标地址,还可以通过界面实时观察到同步状态的监控指标变化 | -| 权限管理 | 支持对权限进行界面管控,用于管理和控制数据库用户访问和操作数据库的权限 | -| 审计日志 | 支持对用户在数据库上的操作进行详细记录,包括DDL、DML和查询操作。帮助用户追踪和识别潜在的安全威胁、数据库错误和滥用行为 | - -主要功能展示: -* 首页 -![首页.png](/img/%E9%A6%96%E9%A1%B5.png) -* 测点列表 -![测点列表.png](/img/workbench-1.png) -* 数据查询 -![数据查询.png](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2.png) -* 趋势 -![历史趋势.png](/img/%E5%8E%86%E5%8F%B2%E8%B6%8B%E5%8A%BF.png) \ No newline at end of file diff --git a/src/zh/UserGuide/V1.3.x/User-Manual/Audit-Log_timecho.md b/src/zh/UserGuide/V1.3.x/User-Manual/Audit-Log_timecho.md deleted file mode 100644 index cb2ff4cdd..000000000 --- a/src/zh/UserGuide/V1.3.x/User-Manual/Audit-Log_timecho.md +++ /dev/null @@ -1,108 +0,0 @@ - - - -# 安全审计 - -## 功能背景 - - 审计日志是数据库的记录凭证,通过审计日志功能可以查询到用户在数据库中增删改查等各项操作,以保证信息安全。关于IoTDB的审计日志功能可以实现以下场景的需求: - -- 可以按链接来源(是否人为操作)决定是否记录审计日志,如:非人为操作如硬件采集器写入的数据不需要记录审计日志,人为操作如普通用户通过cli、workbench等工具操作的数据需要记录审计日志。 -- 过滤掉系统级别的写入操作,如IoTDB监控体系本身记录的写入操作等。 - - - -### 场景说明 - - - -#### 对所有用户的所有操作(增、删、改、查)进行记录 - -通过审计日志功能追踪到所有用户在数据中的各项操作。其中所记录的信息要包含数据操作(新增、删除、查询)及元数据操作(新增、修改、删除、查询)、客户端登录信息(用户名、ip地址)。 - - - -客户端的来源 - -- Cli、workbench、Zeppelin、Grafana、通过 Session/JDBC/MQTT 等协议传入的请求 - -![](/img/audit-log.png) - - -#### 可关闭部分用户连接的审计日志 - - - -如非人为操作,硬件采集器通过 Session/JDBC/MQTT 写入的数据不需要记录审计日志 - - - -## 功能定义 - - - -通过配置可以实现: - -- 决定是否开启审计功能 -- 决定审计日志的输出位置,支持输出至一项或多项 - 1. 日志文件 - 2. 
IoTDB存储 -- 决定是否屏蔽原生接口的写入,防止记录审计日志过多影响性能 -- 决定审计日志内容类别,支持记录一项或多项 - 1. 数据的新增、删除操作 - 2. 数据和元数据的查询操作 - 3. 元数据类的新增、修改、删除操作 - -### 配置项 - - 在iotdb-system.properties中修改以下几项配置 - -```YAML -#################### -### Audit log Configuration -#################### - -# whether to enable the audit log. -# Datatype: Boolean -# enable_audit_log=false - -# Output location of audit logs -# Datatype: String -# IOTDB: the stored time series is: root.__system.audit._{user} -# LOGGER: log_audit.log in the log directory -# audit_log_storage=IOTDB,LOGGER - -# whether enable audit log for DML operation of data -# whether enable audit log for DDL operation of schema -# whether enable audit log for QUERY operation of data and schema -# Datatype: String -# audit_log_operation=DML,DDL,QUERY - -# whether the local write api records audit logs -# Datatype: Boolean -# This contains Session insert api: insertRecord(s), insertTablet(s),insertRecordsOfOneDevice -# MQTT insert api -# RestAPI insert api -# This parameter will cover the DML in audit_log_operation -# enable_audit_log_for_native_insert_api=true -``` - diff --git a/src/zh/UserGuide/V1.3.x/User-Manual/Data-Sync-old_timecho.md b/src/zh/UserGuide/V1.3.x/User-Manual/Data-Sync-old_timecho.md deleted file mode 100644 index e039fb2e9..000000000 --- a/src/zh/UserGuide/V1.3.x/User-Manual/Data-Sync-old_timecho.md +++ /dev/null @@ -1,606 +0,0 @@ - - -# 数据同步 -数据同步是工业物联网的典型需求,通过数据同步机制,可实现 IoTDB 之间的数据共享,搭建完整的数据链路来满足内网外网数据互通、端边云同步、数据迁移、数据备份等需求。 - -## 功能概述 - -### 数据同步 - -一个数据同步任务包含 3 个阶段: - -![](/img/dataSync01.png) - -- 抽取(Source)阶段:该部分用于从源 IoTDB 抽取数据,在 SQL 语句中的 source 部分定义 -- 处理(Process)阶段:该部分用于处理从源 IoTDB 抽取出的数据,在 SQL 语句中的 processor 部分定义 -- 发送(Sink)阶段:该部分用于向目标 IoTDB 发送数据,在 SQL 语句中的 sink 部分定义 - -通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。目前数据同步支持以下信息的同步,您可以在创建同步任务时对同步范围进行选择(默认选择 data.insert,即同步新写入的数据): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
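| Sync scope | Sync content | Description |
| --- | --- | --- |
| all | | All scopes |
| data | insert (incremental) | Sync newly written data |
| data | delete | Sync deleted data |
| schema (metadata) | database | Sync database creation, modification, or deletion operations |
| schema (metadata) | timeseries | Sync time series definitions and attributes |
| schema (metadata) | TTL (data expiry time) | Sync the time-to-live of data |
| auth (permissions) | - | Sync user permissions and access control |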
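
As a hedged illustration of the scope table above, the sketch below spells out the default scope explicitly instead of relying on it. It reuses only names that appear elsewhere in this document (`iotdb-source`, `iotdb-thrift-sink`, `inclusion`, `node-urls`); the pipe name `A2B` and the address `127.0.0.1:6668` are placeholders. Note that `inclusion` is listed only under the V1.3.2 source parameters, so earlier versions may not accept it.

```SQL
-- Sketch only: sync newly written data, i.e. the default scope data.insert.
create pipe A2B
with source (
  'source' = 'iotdb-source',
  'inclusion' = 'data.insert'      -- use 'all' to also sync schema and auth
)
with sink (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668'   -- placeholder: data service address of a target DataNode
)
```

Stating the scope explicitly makes it visible in the `SHOW PIPES` output which operations a task is expected to carry.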
- -### 功能限制及说明 - -元数据(schema)、权限(auth)同步功能存在如下限制: - -- 使用元数据同步时,要求`Schema region`、`ConfigNode` 的共识协议必须为默认的 ratis 协议,即:`iotdb-common.properties`配置文件中`config_node_consensus_protocol_class`和`schema_region_consensus_protocol_class`配置项均为`org.apache.iotdb.consensus.ratis.RatisConsensus` - -- 为了防止潜在的冲突,请在开启元数据同步时关闭接收端自动创建元数据功能。可通过修改 `iotdb-common.properties`配置文件中的`enable_auto_create_schema`配置项为 false,关闭元数据自动创建功能。 - -- 开启元数据同步时,不支持使用自定义插件。 - -- 双活集群中元数据同步需避免两端同时操作。 - -- 在进行数据同步任务时,请避免执行任何删除操作,防止两端状态不一致。 - -## 使用说明 - -数据同步任务有三种状态:RUNNING、STOPPED 和 DROPPED。任务状态转换如下图所示: - -V1.3.0及之前版本: - -在创建后不会立即启动,需要执行`START PIPE`语句启动任务。 - -![](/img/dataSync02.png) - -V1.3.1及之后版本: - -创建后任务会直接启动,同时当任务发生异常停止后,系统会自动尝试重启任务。 - -![](/img/Data-Sync01.png) - -提供以下 SQL 语句对同步任务进行状态管理。 - -### 创建任务 - -使用 `CREATE PIPE` 语句来创建一条数据同步任务,下列属性中`PipeId`和`sink`必填,`source`和`processor`为选填项,输入 SQL 时注意 `SOURCE`与 `SINK` 插件顺序不能替换。 - -SQL 示例如下: - -```SQL -CREATE PIPE -- PipeId 是能够唯一标定任务任务的名字 --- 数据抽取插件,可选插件 -WITH SOURCE ( - [ = ,], -) --- 数据处理插件,可选插件 -WITH PROCESSOR ( - [ = ,], -) --- 数据连接插件,必填插件 -WITH SINK ( - [ = ,], -) -``` - -### 开始任务 - -开始处理数据: - -```SQL -START PIPE -``` - -### 停止任务 - -停止处理数据: - -```SQL -STOP PIPE -``` - -### 删除任务 - -删除指定任务: - -```SQL -DROP PIPE -``` - -删除任务不需要先停止同步任务。 - -### 查看任务 - -查看全部任务: - -```SQL -SHOW PIPES -``` - -查看指定任务: - -```SQL -SHOW PIPE -``` - -pipe 的 show pipes 结果示例: - -```SQL -+--------------------------------+-----------------------+-------+---------------+--------------------+------------------------------------------------------------+----------------+ -| ID| CreationTime| State| PipeSource| PipeProcessor| PipeSink|ExceptionMessage| -+--------------------------------+-----------------------+-------+---------------+--------------------+------------------------------------------------------------+----------------+ -|3421aacb16ae46249bac96ce4048a220|2024-08-13T09:55:18.717|RUNNING| {}| {}|{{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}}| | 
-+--------------------------------+-----------------------+-------+---------------+--------------------+------------------------------------------------------------+----------------+ -``` - -其中各列含义如下: - -- **ID**:同步任务的唯一标识符 -- **CreationTime**:同步任务的创建的时间 -- **State**:同步任务的状态 -- **PipeSource**:同步数据流的来源 -- **PipeProcessor**:同步数据流在传输过程中的处理逻辑 -- **PipeSink**:同步数据流的目的地 -- **ExceptionMessage**:显示同步任务的异常信息 - - -### 同步插件 - -为了使得整体架构更加灵活以匹配不同的同步场景需求,我们支持在同步任务框架中进行插件组装。系统为您预置了一些常用插件可直接使用,同时您也可以自定义 processor 插件 和 Sink 插件,并加载至 IoTDB 系统进行使用。查看系统中的插件(含自定义与内置插件)可以用以下语句: - -```SQL -SHOW PIPEPLUGINS -``` - -返回结果如下(1.3.2 版本): - -```SQL -IoTDB> SHOW PIPEPLUGINS -+---------------------+----------+-------------------------------------------------------------------------------------------+----------------------------------------------------+ -| PluginName|PluginType| ClassName| PluginJar| -+---------------------+----------+-------------------------------------------------------------------------------------------+----------------------------------------------------+ -| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.processor.donothing.DoNothingProcessor| | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.donothing.DoNothingConnector| | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector| | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.extractor.iotdb.IoTDBExtractor| | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector| | -|IOTDB-THRIFT-SSL-SINK| Builtin|org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector| | -+---------------------+----------+-------------------------------------------------------------------------------------------+----------------------------------------------------+ -``` - 
-预置插件详细介绍如下(各插件的详细参数可参考本文[参数说明](#参考参数说明)): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
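| Type | Custom plugin | Plugin name | Description | Applicable version |
| --- | --- | --- | --- | --- |
| source plugin | Not supported | iotdb-source | Default extractor plugin, used to extract historical or real-time IoTDB data | 1.2.x |
| processor plugin | Supported | do-nothing-processor | Default processor plugin; performs no processing on incoming data | 1.2.x |
| sink plugin | Supported | do-nothing-sink | Performs no processing on outgoing data | 1.2.x |
| sink plugin | Supported | iotdb-thrift-sink | Default sink plugin (V1.3.1 and above), used for data transfer between IoTDB (V1.2.0 and above) and IoTDB (V1.2.0 and above). Transfers data with the Thrift RPC framework using a multi-threaded async non-blocking IO model; transfer performance is high, which suits scenarios where the target is distributed | 1.2.x |
| sink plugin | Supported | iotdb-air-gap-sink | Used for data synchronization from IoTDB (V1.2.2 and above) to IoTDB (V1.2.2 and above) across a unidirectional data gateway. Supported gateway models include NARI (南瑞) Syskeeper 2000 | 1.2.x |
| sink plugin | Supported | iotdb-thrift-ssl-sink | Used for data transfer between IoTDB (V1.3.1 and above) and IoTDB (V1.2.0 and above). Transfers data with the Thrift RPC framework using a single-threaded sync blocking IO model; suits scenarios with high security requirements | 1.3.1+ |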
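-
-To import a custom plugin, see the [Stream Processing Framework](./Streaming_timecho.md#自定义流处理插件管理) chapter.
-
-## Usage Examples
-
-### Full Data Synchronization
-
-This example demonstrates synchronizing all data from one IoTDB to another. The data link is shown in the figure below:
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png)
-
-In this example, we create a synchronization task named A2B to synchronize the full data from IoTDB A to IoTDB B. This uses the built-in iotdb-thrift-sink sink plugin, and node-urls must be set to the URL of the data service port of a DataNode in the target IoTDB, as in the following statement:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### Partial Data Synchronization
-
-This example demonstrates synchronizing data within a historical time range (08:00 on August 23, 2023 to 08:00 on October 23, 2023) to another IoTDB. The data link is shown in the figure below:
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png)
-
-In this example, we create a synchronization task named A2B. First, the range of data to transfer is defined in the source. Because the data being transferred is historical (historical data is data that existed before the synchronization task was created), the start-time and end-time of the data and the transfer mode must be configured. node-urls is set to the URL of the data service port of a DataNode in the target IoTDB.
-
-The full statement is as follows:
-
-```SQL
-create pipe A2B
-WITH SOURCE (
-  'source'= 'iotdb-source',
-  'realtime.mode' = 'stream', -- extraction mode for newly inserted data (after the pipe is created)
-  'start-time' = '2023.08.23T08:00:00+00:00', -- start event time of the data to synchronize, inclusive
-  'end-time' = '2023.10.23T08:00:00+00:00' -- end event time of the data to synchronize, inclusive
-)
-with SINK (
-  'sink'='iotdb-thrift-async-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### Bidirectional Data Transfer
-
-This example demonstrates a scenario in which two IoTDB instances form an active-active pair. The data link is shown in the figure below:
-
-![](/img/1706698592139.jpg)
-
-In this example, to avoid an infinite data loop, the parameter `forwarding-pipe-requests` must be set to `false` on both A and B, meaning that data arriving through another pipe is not forwarded. In addition, to keep the data on both sides consistent, each pipe must be configured with `inclusion=all` to synchronize the full data and metadata.
-
-The full statements are as follows:
-
-Execute the following statement on IoTDB A:
-
-```SQL
-create pipe AB
-with source (
-  'inclusion'='all',  -- synchronize all data, metadata, and permissions
-  'forwarding-pipe-requests' = 'false' -- do not forward data written by other pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-Execute the following statement on IoTDB B:
-
-```SQL
-create pipe BA
-with source (
-  'inclusion'='all',  -- synchronize all data, metadata, and permissions
-  'forwarding-pipe-requests' = 'false' -- do not forward data written by other pipes
-)
-with sink (
-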
'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` -### 边云数据传输 - -本例子用来演示多个 IoTDB 之间边云传输数据的场景,数据由 B 、C、D 集群分别都同步至 A 集群,数据链路如下图所示: - -![](/img/dataSync03.png) - -在这个例子中,为了将 B 、C、D 集群的数据同步至 A,在 BA 、CA、DA 之间的 pipe 需要配置`path`限制范围,以及要保持边侧和云侧的数据一致 pipe 需要配置`inclusion=all`来同步全量数据和元数据,详细语句如下: - -在 B IoTDB 上执行下列语句,将 B 中数据同步至 A: - -```SQL -create pipe BA -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'path'='root.db.**', -- 限制范围 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 C IoTDB 上执行下列语句,将 C 中数据同步至 A: - -```SQL -create pipe CA -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'path'='root.db.**', -- 限制范围 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 D IoTDB 上执行下列语句,将 D 中数据同步至 A: - -```SQL -create pipe DA -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'path'='root.db.**', -- 限制范围 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 级联数据传输 - -本例子用来演示多个 IoTDB 之间级联传输数据的场景,数据由 A 集群同步至 B 集群,再同步至 C 集群,数据链路如下图所示: - -![](/img/1706698610134.jpg) - -在这个例子中,为了将 A 集群的数据同步至 C,在 BC 之间的 pipe 需要将 `forwarding-pipe-requests` 配置为`true`,详细语句如下: - -在 A IoTDB 上执行下列语句,将 A 中数据同步至 B: - -```SQL -create pipe AB -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 B IoTDB 上执行下列语句,将 B 中数据同步至 C: - -```SQL -create pipe BC -with source ( - 'forwarding-pipe-requests' = 'true' --是否转发由其他 Pipe 写入的数据 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 跨网闸数据传输 - -本例子用来演示将一个 IoTDB 的数据,经过单向网闸,同步至另一个 IoTDB 的场景,数据链路如下图所示: - -![](/img/cross-network-gateway.png) - -在这个例子中,需要使用 sink 任务中的 iotdb-air-gap-sink 
插件(目前支持部分型号网闸,具体型号请联系天谋科技工作人员确认),配置网闸后,在 A IoTDB 上执行下列语句,其中 node-urls 填写网闸配置的目标端 IoTDB 中 DataNode 节点的数据服务端口的 url,详细语句如下: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-air-gap-sink', - 'node-urls' = '10.53.53.53:9780', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - - -### 加密同步(V1.3.1+ ) - -IoTDB 支持在同步过程中使用 SSL 加密,从而在不同的 IoTDB 实例之间安全地传输数据。通过配置 SSL 相关的参数,如证书地址和密码(`ssl.trust-store-path`)、(`ssl.trust-store-pwd`)可以确保数据在同步过程中被 SSL 加密所保护。 - -如创建名为 A2B 的同步任务: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-ssl-sink', - 'node-urls'='127.0.0.1:6667', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url - 'ssl.trust-store-path'='pki/trusted', -- 连接目标端 DataNode 所需的 trust store 证书路径 - 'ssl.trust-store-pwd'='root' -- 连接目标端 DataNode 所需的 trust store 证书密码 -) -``` - -## 参考:注意事项 - -可通过修改 IoTDB 配置文件(`iotdb-common.properties`)以调整数据同步的参数,如同步数据存储目录等。完整配置如下:: - -V1.3.0/1/2: - -```Properties -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. 
-# pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` - -## 参考:参数说明 - -### source 参数(V1.3.0) - -| 参数 | 描述 | value 取值范围 | 是否必填 | 默认取值 | -| :------------------------------ | :----------------------------------------------------------- | :------------------------------------- | :------- | :------------- | -| source | iotdb-source | String: iotdb-source | 必填 | - | -| source.pattern | 用于筛选时间序列的路径前缀 | String: 任意的时间序列前缀 | 选填 | root | -| source.history.enable | 是否发送历史数据 | Boolean: true / false | 选填 | true | -| source.history.start-time | 同步历史数据的开始 event time,包含 start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | 选填 | Long.MIN_VALUE | -| source.history.end-time | 同步历史数据的结束 event time,包含 end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | 选填 | Long.MAX_VALUE | -| source.realtime.enable | 是否发送实时数据 | Boolean: true / false | 选填 | true | -| source.realtime.mode | 新插入数据(pipe 创建后)的抽取模式 | String: stream, batch | 选填 | stream | -| source.forwarding-pipe-requests | 是否转发由其他 Pipe (通常是数据同步)写入的数据 | Boolean: true, false | 选填 | true | -| source.history.loose-range | tsfile 传输时,是否放宽历史数据(pipe 创建前)范围。"":不放宽范围,严格按照设置的条件挑选数据"time":放宽时间范围,避免对 TsFile 进行拆分,可以提升同步效率 | String: "" / "time" | 选填 | 空字符串 | - -> 💎 **说明:历史数据与实时数据的差异** -> - **历史数据**:所有 arrival time < 创建 pipe 时当前系统时间的数据称为历史数据 -> - **实时数据**:所有 arrival time >= 创建 pipe 时当前系统时间的数据称为实时数据 -> - **全量数据**: 全量数据 = 历史数据 + 实时数据 -> -> 💎 **说明:数据抽取模式 stream 和 batch 的差异** -> - **stream(推荐)**:该模式下,任务将对数据进行实时处理、发送,其特点是高时效、低吞吐 -> - **batch**:该模式下,任务将对数据进行批量(按底层数据文件)处理、发送,其特点是低时效、高吞吐 - -### source 参数(V1.3.1) - -> 在 1.3.1 及以上的版本中,各项参数不再需要额外增加 source、processor、sink 前缀 - -| 参数 | 描述 | value 取值范围 | 是否必填 | 默认取值 | -| :----------------------- | 
:----------------------------------------------------------- | :------------------------------------- | :------- | :------------- | -| source | iotdb-source | String: iotdb-source | 必填 | - | -| pattern | 用于筛选时间序列的路径前缀 | String: 任意的时间序列前缀 | 选填 | root | -| start-time | 同步所有数据的开始 event time,包含 start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | 选填 | Long.MIN_VALUE | -| end-time | 同步所有数据的结束 event time,包含 end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | 选填 | Long.MAX_VALUE | -| realtime.mode | 新插入数据(pipe 创建后)的抽取模式 | String: stream, batch | 选填 | stream | -| forwarding-pipe-requests | 是否转发由其他 Pipe (通常是数据同步)写入的数据 | Boolean: true, false | 选填 | true | -| history.loose-range | tsfile 传输时,是否放宽历史数据(pipe 创建前)范围。"":不放宽范围,严格按照设置的条件挑选数据"time":放宽时间范围,避免对 TsFile 进行拆分,可以提升同步效率 | String: "" / "time" | 选填 | 空字符串 | - -> 💎 **说明**:为保持低版本兼容,history.enable、history.start-time、history.end-time、realtime.enable 仍可使用,但在新版本中不推荐。 -> -> 💎 **说明:数据抽取模式 stream 和 batch 的差异** -> - **stream(推荐)**:该模式下,任务将对数据进行实时处理、发送,其特点是高时效、低吞吐 -> - **batch**:该模式下,任务将对数据进行批量(按底层数据文件)处理、发送,其特点是低时效、高吞吐 - -### source 参数(V1.3.2) - -> 在 1.3.1 及以上的版本中,各项参数不再需要额外增加 source、processor、sink 前缀 - -| 参数 | 描述 | value 取值范围 | 是否必填 | 默认取值 | -| :----------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :------------- | -| source | iotdb-source | String: iotdb-source | 必填 | - | -| inclusion | 用于指定数据同步任务中需要同步范围,分为数据、元数据和权限 | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | 选填 | data.insert | -| inclusion.exclusion | 用于从 inclusion 指定的同步范围内排除特定的操作,减少同步的数据量 | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | 选填 | - | -| path | 用于筛选待同步的时间序列及其相关元数据 / 数据的路径模式path 是精确匹配,参数必须为前缀路径或完整路径,即不能含有 `"*"`,最多在 path 参数的尾部含有一个 `"**"` | String:IoTDB 的 pattern | 选填 | root.** | -| pattern | 用于筛选时间序列的路径前缀元数据同步不能用 pattern 参数 | String: 任意的时间序列前缀 | 选填 | root | -| start-time | 同步所有数据的开始 event time,包含 start-time | 
Long: [Long.MIN_VALUE, Long.MAX_VALUE] | 选填 | Long.MIN_VALUE | -| end-time | 同步所有数据的结束 event time,包含 end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | 选填 | Long.MAX_VALUE | -| realtime.mode | 新插入数据(pipe 创建后)的抽取模式 | String: stream, batch | 选填 | stream | -| forwarding-pipe-requests | 是否转发由其他 Pipe (通常是数据同步)写入的数据 | Boolean: true, false | 选填 | true | -| history.loose-range | tsfile 传输时,是否放宽历史数据(pipe 创建前)范围。"":不放宽范围,严格按照设置的条件挑选数据"time":放宽时间范围,避免对 TsFile 进行拆分,可以提升同步效率 | String: "" 、 "time" | 选填 | "" | -| mods.enable | 是否发送 tsfile 的 mods 文件 | Boolean: true / false | 选填 | false | - -> 💎 **说明**:为保持低版本兼容,history.enable、history.start-time、history.end-time、realtime.enable 仍可使用,但在新版本中不推荐。 -> -> 💎 **说明:数据抽取模式 stream 和 batch 的差异** -> - **stream(推荐)**:该模式下,任务将对数据进行实时处理、发送,其特点是高时效、低吞吐 -> - **batch**:该模式下,任务将对数据进行批量(按底层数据文件)处理、发送,其特点是低时效、高吞吐 - -### sink 参数 - -> 在 1.3.1 及以上的版本中,各项参数不再需要额外增加 source、processor、sink 前缀 - -#### iotdb-thrift-sink( V1.3.0/1/2) - -| key | value | value 取值范围 | 是否必填 | 默认取值 | -| :--------------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :----------- | -| sink | iotdb-thrift-sink 或 iotdb-thrift-async-sink | String: iotdb-thrift-sink 或 iotdb-thrift-async-sink | 必填 | | -| sink.node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url(请注意同步任务不支持向自身服务进行转发) | String. 
例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 必填 | - | -| sink.batch.enable | 是否开启日志攒批发送模式,用于提高传输吞吐,降低 IOPS | Boolean: true, false | 选填 | true | -| sink.batch.max-delay-seconds | 在开启日志攒批发送模式时生效,表示一批数据在发送前的最长等待时间(单位:s) | Integer | 选填 | 1 | -| sink.batch.size-bytes | 在开启日志攒批发送模式时生效,表示一批数据最大的攒批大小(单位:byte) | Long | 选填 | 16*1024*1024 | - -#### iotdb-air-gap-sink( V1.3.0/1/2) - -| key | value | value 取值范围 | 是否必填 | 默认取值 | -| :-------------------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :------- | -| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | 必填 | - | -| sink.node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url | String. 例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 必填 | - | -| sink.air-gap.handshake-timeout-ms | 发送端与接收端在首次尝试建立连接时握手请求的超时时长,单位:毫秒 | Integer | 选填 | 5000 | - -#### iotdb-thrift-ssl-sink( V1.3.1/2) - -| key | value | value 取值范围 | 是否必填 | 默认取值 | -| :---------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :----------- | -| sink | iotdb-thrift-ssl-sink | String: iotdb-thrift-ssl-sink | 必填 | - | -| node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url(请注意同步任务不支持向自身服务进行转发) | String. 
例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 必填 | - | -| batch.enable | 是否开启日志攒批发送模式,用于提高传输吞吐,降低 IOPS | Boolean: true, false | 选填 | true | -| batch.max-delay-seconds | 在开启日志攒批发送模式时生效,表示一批数据在发送前的最长等待时间(单位:s) | Integer | 选填 | 1 | -| batch.size-bytes | 在开启日志攒批发送模式时生效,表示一批数据最大的攒批大小(单位:byte) | Long | 选填 | 16*1024*1024 | -| ssl.trust-store-path | 连接目标端 DataNode 所需的 trust store 证书路径 | String: 证书目录名,配置为相对目录时,相对于 IoTDB 根目录Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 必填 | - | -| ssl.trust-store-pwd | 连接目标端 DataNode 所需的 trust store 证书密码 | Integer | 必填 | - | diff --git a/src/zh/UserGuide/V1.3.x/User-Manual/Data-Sync_timecho.md b/src/zh/UserGuide/V1.3.x/User-Manual/Data-Sync_timecho.md deleted file mode 100644 index ede5dba3b..000000000 --- a/src/zh/UserGuide/V1.3.x/User-Manual/Data-Sync_timecho.md +++ /dev/null @@ -1,659 +0,0 @@ - - -# 数据同步 -数据同步是工业物联网的典型需求,通过数据同步机制,可实现 IoTDB 之间的数据共享,搭建完整的数据链路来满足内网外网数据互通、端边云同步、数据迁移、数据备份等需求。 - -## 功能概述 - -### 数据同步 - -一个数据同步任务包含 3 个阶段: - -![](/img/dataSync01.png) - -- 抽取(Source)阶段:该部分用于从源 IoTDB 抽取数据,在 SQL 语句中的 source 部分定义 -- 处理(Process)阶段:该部分用于处理从源 IoTDB 抽取出的数据,在 SQL 语句中的 processor 部分定义 -- 发送(Sink)阶段:该部分用于向目标 IoTDB 发送数据,在 SQL 语句中的 sink 部分定义 - -通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。目前数据同步支持以下信息的同步,您可以在创建同步任务时对同步范围进行选择(默认选择 data.insert,即同步新写入的数据): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
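| Sync scope | Sync content | Description |
| :--- | :--- | :--- |
| all | | All scopes |
| data | insert (incremental) | Synchronizes newly written data |
| data | delete | Synchronizes deleted data |
| schema (metadata) | database | Synchronizes database creation, modification, and deletion operations |
| schema (metadata) | timeseries | Synchronizes time series definitions and attributes |
| schema (metadata) | TTL (data expiration time) | Synchronizes the data time-to-live |
| auth (permissions) | - | Synchronizes user permissions and access control |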
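
The way a sync scope is narrowed by `inclusion` and `inclusion.exclusion` can be pictured with a small model. The sketch below is not IoTDB code: the leaf names (`data.insert`, `schema.ttl`, and so on), the comma-separated input format, and the helper names `expand` and `resolve_scope` are assumptions made for illustration, derived only from the scopes listed in the table above.

```python
# Hypothetical model of how `inclusion` and `inclusion.exclusion` combine.
# Leaf names and the comma-separated input format are assumptions.
ALL_SCOPES = (
    "data.insert",
    "data.delete",
    "schema.database",
    "schema.timeseries",
    "schema.ttl",
    "auth",
)


def expand(spec: str) -> set[str]:
    """Expand a comma-separated scope spec into the leaf scopes it covers."""
    leaves: set[str] = set()
    for item in (part.strip() for part in spec.split(",")):
        if not item:
            continue  # an empty spec selects nothing
        if item == "all":
            matched = set(ALL_SCOPES)
        else:
            # A group name such as "data" covers every leaf below it.
            matched = {s for s in ALL_SCOPES if s == item or s.startswith(item + ".")}
        if not matched:
            raise ValueError(f"unknown sync scope: {item!r}")
        leaves |= matched
    return leaves


def resolve_scope(inclusion: str = "data.insert", exclusion: str = "") -> set[str]:
    """Return the leaf scopes left after removing the excluded ones."""
    return expand(inclusion) - expand(exclusion)


print(sorted(resolve_scope()))               # default scope only
print(sorted(resolve_scope("all", "auth")))  # everything except permissions
```

With no arguments the model yields only the default `data.insert` scope; passing `"all"` with `"auth"` excluded yields every data and schema scope but no permission sync.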
- -### 功能限制及说明 - -元数据(schema)、权限(auth)同步功能存在如下限制: - -- 使用元数据同步时,要求`Schema region`、`ConfigNode` 的共识协议必须为默认的 ratis 协议,即`iotdb-system.properties`配置文件中是否包含`config_node_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus`、`schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus`,不包含即为默认值ratis 协议。 - -- 为了防止潜在的冲突,请在开启元数据同步时关闭接收端自动创建元数据功能。可通过修改 `iotdb-system.properties`配置文件中的`enable_auto_create_schema`配置项为 false,关闭元数据自动创建功能。 - -- 开启元数据同步时,不支持使用自定义插件。 - -- 双活集群中元数据同步需避免两端同时操作。 - -- 在进行数据同步任务时,请避免执行任何删除操作,防止两端状态不一致。 - -## 使用说明 - -数据同步任务有三种状态:RUNNING、STOPPED 和 DROPPED。任务状态转换如下图所示: - -![](/img/Data-Sync01.png) - -创建后任务会直接启动,同时当任务发生异常停止后,系统会自动尝试重启任务。 - -提供以下 SQL 语句对同步任务进行状态管理。 - -### 创建任务 - -使用 `CREATE PIPE` 语句来创建一条数据同步任务,下列属性中`PipeId`和`sink`必填,`source`和`processor`为选填项,输入 SQL 时注意 `SOURCE`与 `SINK` 插件顺序不能替换。 - -SQL 示例如下: - -```SQL -CREATE PIPE [IF NOT EXISTS] -- PipeId 是能够唯一标定任务的名字 --- 数据抽取插件,可选插件 -WITH SOURCE ( - [ = ,], -) --- 数据处理插件,可选插件 -WITH PROCESSOR ( - [ = ,], -) --- 数据连接插件,必填插件 -WITH SINK ( - [ = ,], -) -``` - -**IF NOT EXISTS 语义**:用于创建操作中,确保当指定 Pipe 不存在时,执行创建命令,防止因尝试创建已存在的 Pipe 而导致报错。 - -**注意**:V1.3.6 起,创建一个全量数据同步 Pipe (例如 Pipeid : `alldatapipe`)时,系统会自动将其拆分为两个独立的 Pipe: - -* 历史 Pipe:PipeId 为原名称加 _history后缀(如 `alldatapipe_history`),source 参数默认携带 `'realtime.enable'='false', 'inclusion'='data.insert', 'inclusion.exclusion'=''` - -* 实时 Pipe:PipeId 为原名称加 _realtime后缀(如 `alldatapipe_realtime`),source 参数默认携带 `'history.enable'='false'` ,若配置了元数据同步,则由实时 Pipe 负责发送 - -创建成功后,原 PipeId(如 `alldatapipe`)将不再作为有效标识符。在进行启动、停止、删除、查看等任务操作时,必须使用拆分后的独立 PipeId(即 `*_history`或 `*_realtime`)。操作示例见[查看任务](./Data-Sync_timecho.md#查看任务)小节 - -### 开始任务 - -开始处理数据: - -```SQL -START PIPE -``` - -### 停止任务 - -停止处理数据: - -```SQL -STOP PIPE -``` - -### 删除任务 - -删除指定任务: - -```SQL -DROP PIPE [IF EXISTS] -``` - -**IF EXISTS 语义**:用于删除操作中,确保当指定 Pipe 存在时,执行删除命令,防止因尝试删除不存在的 Pipe 而导致报错。 - -删除任务不需要先停止同步任务。 - -### 查看任务 - -查看全部任务: - -```SQL -SHOW PIPES -``` - -查看指定任务: - 
-```SQL -SHOW PIPE -``` - - pipe 的 show pipes 结果示例: - -```SQL -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State|PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -|59abf95db892428b9d01c5fa318014ea|2024-06-17T14:03:44.189|RUNNING| {}| {}|{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}| | 128| 1.03| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -``` - -其中各列含义如下: - -- **ID**:同步任务的唯一标识符 -- **CreationTime**:同步任务的创建的时间 -- **State**:同步任务的状态 -- **PipeSource**:同步数据流的来源 -- **PipeProcessor**:同步数据流在传输过程中的处理逻辑 -- **PipeSink**:同步数据流的目的地 -- **ExceptionMessage**:显示同步任务的异常信息 -- **RemainingEventCount(统计存在延迟)**:剩余 event 数,当前数据同步任务中的所有 event 总数,包括数据和元数据同步的 event,以及系统和用户自定义的 event。 -- **EstimatedRemainingSeconds(统计存在延迟)**:剩余时间,基于当前 event 个数和 pipe 处速率,预估完成传输的剩余时间。 - -示例: - -在 V1.3.6 及之后的版本中,创建一个全量数据同步任务,并查看该任务详情 - -```sql -IoTDB> create pipe alldatapipe with source('inclusion'='all','exclusion'='auth') with sink('node-urls'='127.0.0.1:6668') - -IoTDB> show pipe alldatapipe_history -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| 
PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_history|2025-12-18T15:06:16.697|RUNNING|{exclusion=auth, history.enable=true, inclusion=data.insert, inclusion.exclusion=, realtime.enable=false}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -IoTDB> show pipe alldatapipe_realtime -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_realtime|2025-12-18T15:06:16.312|RUNNING|{exclusion=auth, history.enable=false, inclusion=all, realtime.enable=true}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -``` - -### 同步插件 - -为了使得整体架构更加灵活以匹配不同的同步场景需求,我们支持在同步任务框架中进行插件组装。系统为您预置了一些常用插件可直接使用,同时您也可以自定义 processor 插件 和 Sink 插件,并加载至 IoTDB 
系统进行使用。查看系统中的插件(含自定义与内置插件)可以用以下语句: - -```SQL -SHOW PIPEPLUGINS -``` - -返回结果如下: - -```SQL -IoTDB> SHOW PIPEPLUGINS -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| PluginName|PluginType| ClassName| PluginJar| -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.processor.donothing.DoNothingProcessor| | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.donothing.DoNothingConnector| | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector| | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.extractor.iotdb.IoTDBExtractor| | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector| | -| IOTDB-THRIFT-SSL-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector| | -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ - -``` - -预置插件详细介绍如下(各插件的详细参数可参考本文[参数说明](#参考参数说明)): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
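| Type | Custom plugin | Plugin name | Description | Applicable version |
| :--- | :--- | :--- | :--- | :--- |
| source plugin | Not supported | iotdb-source | Default extractor plugin, used to extract historical or real-time IoTDB data | 1.2.x |
| processor plugin | Supported | do-nothing-processor | Default processor plugin; performs no processing on incoming data | 1.2.x |
| sink plugin | Supported | do-nothing-sink | Performs no processing on outgoing data | 1.2.x |
| sink plugin | Supported | iotdb-thrift-sink | Default sink plugin (V1.3.1 and above), used for data transfer between IoTDB (V1.2.0 and above) and IoTDB (V1.2.0 and above). Transfers data over the Thrift RPC framework with a multi-threaded async non-blocking IO model; transfer performance is high, especially when the target is a distributed cluster | 1.2.x |
| sink plugin | Supported | iotdb-air-gap-sink | Used to synchronize data from IoTDB (V1.2.2 and above) to IoTDB (V1.2.2 and above) across a unidirectional data gateway. Supported gateway models include NARI Syskeeper 2000, among others | 1.2.x |
| sink plugin | Supported | iotdb-thrift-ssl-sink | Used for data transfer between IoTDB (V1.3.1 and above) and IoTDB (V1.2.0 and above). Transfers data over the Thrift RPC framework with a single-threaded sync blocking IO model; suited to scenarios with higher security requirements | 1.3.1+ |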
- -导入自定义插件可参考[流处理框架](./Streaming_timecho.md#自定义流处理插件管理)章节。 - -## 使用示例 - -### 全量数据同步 - -本例子用来演示将一个 IoTDB 的所有数据同步至另一个 IoTDB,数据链路如下图所示: - -![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png) - -在这个例子中,我们可以创建一个名为 A2B 的同步任务,用来同步 A IoTDB 到 B IoTDB 间的全量数据,这里需要用到用到 sink 的 iotdb-thrift-sink 插件(内置插件),需通过 node-urls 配置目标端 IoTDB 中 DataNode 节点的数据服务端口的 url,如下面的示例语句: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 部分数据同步 - -本例子用来演示同步某个历史时间范围( 2023 年 8 月 23 日 8 点到 2023 年 10 月 23 日 8 点)的数据至另一个 IoTDB,数据链路如下图所示: - -![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png) - -在这个例子中,我们可以创建一个名为 A2B 的同步任务。首先我们需要在 source 中定义传输数据的范围,由于传输的是历史数据(历史数据是指同步任务创建之前存在的数据),需要配置数据的起止时间 start-time 和 end-time 以及传输的模式 mode。通过 node-urls 配置目标端 IoTDB 中 DataNode 节点的数据服务端口的 url。 - -详细语句如下: - -```SQL -create pipe A2B -WITH SOURCE ( - 'source'= 'iotdb-source', - 'realtime.mode' = 'stream' -- 新插入数据(pipe创建后)的抽取模式 - 'path' = 'root.vehicle.**', -- 同步数据的范围 - 'start-time' = '2023.08.23T08:00:00+00:00', -- 同步所有数据的开始 event time,包含 start-time - 'end-time' = '2023.10.23T08:00:00+00:00' -- 同步所有数据的结束 event time,包含 end-time -) -with SINK ( - 'sink'='iotdb-thrift-async-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 双向数据传输 - -本例子用来演示两个 IoTDB 之间互为双活的场景,数据链路如下图所示: - -![](/img/1706698592139.jpg) - -在这个例子中,为了避免数据无限循环,需要将 A 和 B 上的参数`forwarding-pipe-requests` 均设置为 `false`,表示不转发从另一 pipe 传输而来的数据,以及要保持两侧的数据一致 pipe 需要配置`inclusion=all`来同步全量数据和元数据。 - -详细语句如下: - -在 A IoTDB 上执行下列语句: - -```SQL -create pipe AB -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'forwarding-pipe-requests' = 'false' --不转发由其他 Pipe 写入的数据 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 B IoTDB 上执行下列语句: - -```SQL -create pipe BA -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'forwarding-pipe-requests' = 'false' 
--是否转发由其他 Pipe 写入的数据 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` -### 边云数据传输 - -本例子用来演示多个 IoTDB 之间边云传输数据的场景,数据由 B 、C、D 集群分别都同步至 A 集群,数据链路如下图所示: - -![](/img/dataSync03.png) - -在这个例子中,为了将 B 、C、D 集群的数据同步至 A,在 BA 、CA、DA 之间的 pipe 需要配置`path`限制范围,以及要保持边侧和云侧的数据一致 pipe 需要配置`inclusion=all`来同步全量数据和元数据,详细语句如下: - -在 B IoTDB 上执行下列语句,将 B 中数据同步至 A: - -```SQL -create pipe BA -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'path'='root.db.**', -- 限制范围 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 C IoTDB 上执行下列语句,将 C 中数据同步至 A: - -```SQL -create pipe CA -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'path'='root.db.**', -- 限制范围 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 D IoTDB 上执行下列语句,将 D 中数据同步至 A: - -```SQL -create pipe DA -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'path'='root.db.**', -- 限制范围 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 级联数据传输 - -本例子用来演示多个 IoTDB 之间级联传输数据的场景,数据由 A 集群同步至 B 集群,再同步至 C 集群,数据链路如下图所示: - -![](/img/1706698610134.jpg) - -在这个例子中,为了将 A 集群的数据同步至 C,在 BC 之间的 pipe 需要将 `forwarding-pipe-requests` 配置为`true`,详细语句如下: - -在 A IoTDB 上执行下列语句,将 A 中数据同步至 B: - -```SQL -create pipe AB -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 B IoTDB 上执行下列语句,将 B 中数据同步至 C: - -```SQL -create pipe BC -with source ( - 'forwarding-pipe-requests' = 'true' --是否转发由其他 Pipe 写入的数据 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 跨网闸数据传输 - -本例子用来演示将一个 IoTDB 的数据,经过单向网闸,同步至另一个 IoTDB 的场景,数据链路如下图所示: - -![](/img/cross-network-gateway.png) - -在这个例子中,需要使用 sink 任务中的 
iotdb-air-gap-sink 插件,配置网闸后,在 A IoTDB 上执行下列语句,其中 node-urls 填写网闸配置的目标端 IoTDB 中 DataNode 节点的数据服务端口的 url,详细语句如下: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-air-gap-sink', - 'node-urls' = '10.53.53.53:9780', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` -**注意:目前支持的网闸型号** -> 其他型号的网闸设备,请与天谋商务联系确认是否支持。 - -| 网闸类型 | 网闸型号 | 回包限制 | 发送限制 | -| ------------ | -------------------------------------------- | ----------------- | --------------- | -| 正向型 | 南瑞 Syskeeper-2000 正向型 | 全 0 / 全 1 bytes | 无限制 | -| 正向型 | 许继自研网闸 | 全 0 / 全 1 bytes | 无限制 | -| 未标记正反向 | 威努特安全隔离与信息交换系统 | 无限制 | 无限制 | -| 正向型 | 科东 StoneWall-2000 网络安全隔离设备(正向型) | 无限制 | 无限制 | -| 反向型 | 南瑞 Syskeeper-2000 反向型 | 全 0 / 全 1 bytes | 满足 E 语言格式 | -| 未标记正反向 | 迪普科技ISG5000 | 无限制 | 无限制 | -| 未标记正反向 | 熙羚安全隔离与信息交换系统XL—GAP | 无限制 | 无限制 | - -### 压缩同步 - -IoTDB 支持在同步过程中指定数据压缩方式。可通过配置 `compressor` 参数,实现数据的实时压缩和传输。`compressor`目前支持 snappy / gzip / lz4 / zstd / lzma2 5 种可选算法,且可以选择多种压缩算法组合,按配置的顺序进行压缩。`rate-limit-bytes-per-second`(V1.3.3 及以后版本支持)每秒最大允许传输的byte数,计算压缩后的byte,若小于0则不限制。 - -如创建一个名为 A2B 的同步任务: - -```SQL -create pipe A2B -with sink ( - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url - 'compressor' = 'snappy,lz4' -- - 'rate-limit-bytes-per-second'='1048576' -- 每秒最大允许传输的byte数 -) -``` - -### 加密同步 - -IoTDB 支持在同步过程中使用 SSL 加密,从而在不同的 IoTDB 实例之间安全地传输数据。通过配置 SSL 相关的参数,如证书地址和密码(`ssl.trust-store-path`)、(`ssl.trust-store-pwd`)可以确保数据在同步过程中被 SSL 加密所保护。 - -如创建名为 A2B 的同步任务: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-ssl-sink', - 'node-urls'='127.0.0.1:6667', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url - 'ssl.trust-store-path'='pki/trusted', -- 连接目标端 DataNode 所需的 trust store 证书路径 - 'ssl.trust-store-pwd'='root' -- 连接目标端 DataNode 所需的 trust store 证书密码 -) -``` - -## 参考:注意事项 - -可通过修改 IoTDB 配置文件(`iotdb-system.properties`)以调整数据同步的参数,如同步数据存储目录等。完整配置如下:: - -V1.3.3+: - -```Properties -# pipe_receiver_file_dir -# If this property is unset, system will save the data in the default relative path directory under the 
IoTDB folder(i.e., %IOTDB_HOME%/${cn_system_dir}/pipe/receiver). -# If it is absolute, system will save the data in the exact location it points to. -# If it is relative, system will save the data in the relative path directory it indicates under the IoTDB folder. -# Note: If pipe_receiver_file_dir is assigned an empty string(i.e.,zero-size), it will be handled as a relative path. -# effectiveMode: restart -# For windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is absolute. Otherwise, it is relative. -# pipe_receiver_file_dir=data\\confignode\\system\\pipe\\receiver -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_receiver_file_dir=data/confignode/system/pipe/receiver - -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# effectiveMode: first_start -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# effectiveMode: restart -# Datatype: int -pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# effectiveMode: restart -# Datatype: int -pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. 
-# effectiveMode: restart -# Datatype: int -pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# effectiveMode: restart -# Datatype: int -pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# effectiveMode: restart -# Datatype: Boolean -pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# Datatype: int -# effectiveMode: restart -pipe_air_gap_receiver_port=9780 - -# The total bytes that all pipe sinks can transfer per second. -# When given a value less than or equal to 0, it means no limit. -# default value is -1, which means no limit. -# effectiveMode: hot_reload -# Datatype: double -pipe_all_sinks_rate_limit_bytes_per_second=-1 -``` - -## 参考:参数说明 - -### source 参数(V1.3.3) - -| 参数 | 描述 | value 取值范围 | 是否必填 | 默认取值 | -| ------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | -------------- | -| source | iotdb-source | String: iotdb-source | 必填 | - | -| inclusion | 用于指定数据同步任务中需要同步范围,分为数据、元数据和权限 | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | 选填 | data.insert | -| inclusion.exclusion | 用于从 inclusion 指定的同步范围内排除特定的操作,减少同步的数据量 | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | 选填 | 空字符串 | -| mode | 用于在每个 data region 发送完毕时分别发送结束事件,并在全部 data region 发送完毕后自动 drop pipe。query:结束,subscribe:不结束。 | String: query / subscribe | 选填 | subscribe | -| path | 用于筛选待同步的时间序列及其相关元数据 / 数据的路径模式元数据同步只能用pathpath 是精确匹配,参数必须为前缀路径或完整路径,即不能含有 `"*"`,最多在 path参数的尾部含有一个 `"**"` | String:IoTDB 的 pattern | 选填 | root.** | -| pattern | 用于筛选时间序列的路径前缀 | String: 任意的时间序列前缀 | 选填 | root | -| start-time | 同步所有数据的开始 event time,包含 start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | 选填 | Long.MIN_VALUE | -| end-time | 
同步所有数据的结束 event time,包含 end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | 选填 | Long.MAX_VALUE | -| realtime.mode | 新插入数据(pipe创建后)的抽取模式 | String: stream, batch | 选填 | batch | -| forwarding-pipe-requests | 是否转发由其他 Pipe (通常是数据同步)写入的数据 | Boolean: true, false | 选填 | true | -| history.loose-range | tsfile传输时,是否放宽历史数据(pipe创建前)范围。"":不放宽范围,严格按照设置的条件挑选数据"time":放宽时间范围,避免对TsFile进行拆分,可以提升同步效率"path":放宽路径范围,避免对TsFile进行拆分,可以提升同步效率"time, path" 、 "path, time" 、"all" : 放宽所有范围,避免对TsFile进行拆分,可以提升同步效率 | String: "" 、 "time" 、 "path" 、 "time, path" 、 "path, time" 、 "all" | 选填 | "" | -| realtime.loose-range | tsfile传输时,是否放宽实时数据(pipe创建前)范围。"":不放宽范围,严格按照设置的条件挑选数据"time":放宽时间范围,避免对TsFile进行拆分,可以提升同步效率"path":放宽路径范围,避免对TsFile进行拆分,可以提升同步效率"time, path" 、 "path, time" 、"all" : 放宽所有范围,避免对TsFile进行拆分,可以提升同步效率 | String: "" 、 "time" 、 "path" 、 "time, path" 、 "path, time" 、 "all" | 选填 | "" | -| mods.enable | 是否发送 tsfile 的 mods 文件 | Boolean: true / false | 选填 | false | - -> 💎 **说明**:为保持低版本兼容,history.enable、history.start-time、history.end-time、realtime.enable 仍可使用,但在新版本中不推荐。 -> -> 💎 **说明:数据抽取模式 stream 和 batch 的差异** -> - **stream(推荐)**:该模式下,任务将对数据进行实时处理、发送,其特点是高时效、低吞吐 -> - **batch**:该模式下,任务将对数据进行批量(按底层数据文件)处理、发送,其特点是低时效、高吞吐 - - -### sink **参数** - -> 在 1.3.3 及以上的版本中,只包含sink的情况下,不再需要额外增加with sink 前缀 - -#### iotdb-thrift-sink - -| key | value | value 取值范围 | 是否必填 | 默认取值 | -|-------------------------|--------------------------------------------------------------|----------------------------------------------------------------------------| -------- |--------------| -| sink | iotdb-thrift-sink 或 iotdb-thrift-async-sink | String: iotdb-thrift-sink 或 iotdb-thrift-async-sink | 必填 | - | -| node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url(请注意同步任务不支持向自身服务进行转发) | String. 
例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 必填 | - | -| batch.enable | 是否开启日志攒批发送模式,用于提高传输吞吐,降低 IOPS | Boolean: true, false | 选填 | true | -| batch.max-delay-seconds | 在开启日志攒批发送模式时生效,表示一批数据在发送前的最长等待时间(单位:s) | Integer | 选填 | 1 | -| batch.max-delay-ms | 在开启日志攒批发送模式时生效,表示一批数据在发送前的最长等待时间(单位:ms)(V1.3.6及以后的V1.x版本支持) | Integer | 选填 | 1 | -| batch.size-bytes | 在开启日志攒批发送模式时生效,表示一批数据最大的攒批大小(单位:byte) | Long | 选填 | 16*1024*1024 | -| load-tsfile-strategy | 文件同步数据时,接收端请求返回发送端前,是否等待接收端本地的 load tsfile 执行结果返回。
sync:等待本地的 load tsfile 执行结果返回;
async:不等待本地的 load tsfile 执行结果返回。(V1.3.6及以后的V1.x版本支持) | String: sync / async | 选填 | sync | - - -#### iotdb-air-gap-sink - -| key | value | value 取值范围 | 是否必填 | 默认取值 | -|------------------------------| ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | -------- | -| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | 必填 | - | -| node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url | String. 例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 必填 | - | -| air-gap.handshake-timeout-ms | 发送端与接收端在首次尝试建立连接时握手请求的超时时长,单位:毫秒 | Integer | 选填 | 5000 | -| load-tsfile-strategy | 文件同步数据时,接收端请求返回发送端前,是否等待接收端本地的 load tsfile 执行结果返回。
sync:等待本地的 load tsfile 执行结果返回;
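async: do not wait for the local load tsfile result to be returned. (Supported in V1.3.6 and later V1.x versions) | String: sync / async | Optional | sync |
-
-#### iotdb-thrift-ssl-sink
-
-| key | value | value range | required | default |
-| ----------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | ------------ |
-| sink | iotdb-thrift-ssl-sink | String: iotdb-thrift-ssl-sink | Required | - |
-| node-urls | URLs of the data service ports of any number of DataNodes on the target IoTDB (note that a sync task cannot forward data to its own service) | String. Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - |
-| batch.enable | Whether to enable batched log sending, which raises transfer throughput and lowers IOPS | Boolean: true, false | Optional | true |
-| batch.max-delay-seconds | Takes effect when batched log sending is enabled; the longest time a batch waits before being sent (unit: s) | Integer | Optional | 1 |
-| batch.max-delay-ms | Takes effect when batched log sending is enabled; the longest time a batch waits before being sent (unit: ms) (Supported in V1.3.6 and later V1.x versions) | Integer | Optional | 1 |
-| batch.size-bytes | Takes effect when batched log sending is enabled; the maximum size of one batch (unit: byte) | Long | Optional | 16*1024*1024 |
-| load-tsfile-strategy | When data is synchronized as files, whether the receiver waits for its local load tsfile result before responding to the sender. sync: wait for the local load tsfile result to be returned; async: do not wait for the local load tsfile result to be returned. (Supported in V1.3.6 and later V1.x versions) | String: sync / async | Optional | sync |
-| ssl.trust-store-path | Path of the trust store certificate needed to connect to the target DataNode | String | Required | - |
-| ssl.trust-store-pwd | Password of the trust store certificate needed to connect to the target DataNode | Integer | Required | - |
diff --git a/src/zh/UserGuide/V1.3.x/User-Manual/IoTDB-View_timecho.md b/src/zh/UserGuide/V1.3.x/User-Manual/IoTDB-View_timecho.md
deleted file mode 100644
index 2181ae4d4..000000000
--- a/src/zh/UserGuide/V1.3.x/User-Manual/IoTDB-View_timecho.md
+++ /dev/null
@@ -1,546 +0,0 @@
-
-
-# View
-
-## Background of Series Views
-
-### Scenario 1: Renaming time series (PI asset management)
-
-In practice, data-collecting devices may be named with identifiers that are hard for people to read, which makes queries difficult at the business layer.
-
-A series view can reorganize these series so that they are accessed through a new model structure, without changing the original series and without creating or copying any series.
-
-**Example**: a cloud device builds its entity ID from the MAC address of its network card and writes data to the time series `root.db.0800200A8C6D.xvjeifg`.
-
-This name is hard for users to understand. With the series view feature, the user can rename the series by mapping it to a series view and then read the collected data through `root.view.device001.temperature`.
-
-### Scenario 2: Simplifying business-layer query logic
-
-Users sometimes own many devices and manage a large number of time series. For a given task they may want to handle only some of those series. A series view lets them pick out the series they care about so that these can be queried and written repeatedly.
-
-**Example**: a user manages a production line whose devices produce many time series. A temperature inspector only needs device temperatures, so the temperature-related series can be extracted into a series view.
-
-### Scenario 3: Supporting permission management
-
-In production, different business roles are usually responsible for different scopes, and for security reasons access is often restricted through permission management.
-
-**Example**: the safety department only needs to monitor the temperature of each device on a production line, but this data is stored in the same database as other confidential data. Several new views can be created that contain only the temperature-related time series of that line, and the safety staff are then granted permissions on those series views only, which restricts their access.
-
-### Motivation for the series view feature
-
-Based on the three scenarios above, the main motivations for the series view feature are:
-
-1. Renaming time series.
-2. Simplifying business-layer query logic.
-3. Supporting permission management by exposing data to specific users through views.
-
-## Series View Concepts
-
-### Terminology
-
-Convention: unless stated otherwise, every view mentioned in this document is a **series view**. Features such as device views may be introduced in the future.
-
-### Series view
-
-A series view is a way of organizing and managing time series.
-
-In a traditional relational database, data must be stored in a table, whereas in a time series database such as IoTDB the series is the unit of storage. The concept of a series view in IoTDB is therefore built on series.
-
-A series view is a virtual time series. Each virtual series works like a soft link or shortcut that maps to a series outside the view or to some computation logic. In other words, a virtual series either maps to one specific external series or is computed from several external series.
-
-Users can create a view with a complex SQL query. The series view then behaves like a stored query statement: when data is read from the view, the stored query is used as the data source and placed in the FROM clause.
-
-### Alias series
-
-Some series views are special in that they satisfy all of the following conditions:
-
-1. The data source is a single time series.
-2. There is no computation logic.
-3. There are no filter conditions (for example, no WHERE clause).
-
-Such a series view is called an **alias series**, or alias series view. A series view that does not satisfy all of the conditions above is called a non-alias series view. The difference between them is that only alias series support writes.
-
-**No series view, alias series included, currently supports triggers.**
-
-### Nested views
-
-A user may want to select several series from an existing series view to form a new series view. This is called a nested view.
-
-**The current version does not support nested views.**
-
-### Constraints on series views in IoTDB
-
-#### Constraint 1: A series view must depend on one or more time series
-
-A series view can exist in one of two forms:
-
-1. It maps to one time series.
-2. It is computed from one or more time series.
-
-The first form was illustrated above. The second form exists because series views allow computation logic.
-
-For example, a user installs two thermometers on the same boiler and wants the average of the two readings as the measurement. The collected series are `root.db.d01.temperature01` and `root.db.d01.temperature02`.
-
-The user can average the two series and expose the result as one series in a view: `root.db.d01.avg_temperature`.
-
-This example is expanded in the section "Creating a view with computation logic".
-
-#### Constraint 2: Non-alias series views are read-only
-
-Writing to a non-alias series view is not allowed.
-
-Only alias series views support writes.
-
-#### Constraint 3: Nested views are not allowed
-
-A series view cannot be created from columns of an existing series view, whether directly or indirectly.
-
-An example is given in the section "Nested series views are not supported".
-
-#### Constraint 4: A series view and a time series cannot share a name
-
-Series views and time series live in the same tree, so they cannot have the same name.
-
-The name (path) of every series must be unique.
-
-#### Constraint 5: A series view shares time series data with its source, but not tags or other metadata
-
-A series view is a mapping to time series, so the two share the time series data in full, and the time series is responsible for persistence.
-
-Their tags, attributes, and other metadata are not shared.
-
-The reason is that users who query through a view care about the structure of that view. When they query with group by tag, for instance, they expect grouping by the tags defined on the view, not by the tags of the underlying time series (which they may not even be aware of).
-
-## Series View Features
-
-### Creating a view
-
-Creating a series view is similar to creating a time series. The difference is that the data source, that is the original series, must be specified with the AS keyword.
-
-#### SQL for creating a view
-
-A view can be created from selected series:
-
-```SQL
-CREATE VIEW root.view.device.status
-AS
-    SELECT s01
-    FROM root.db.device
-```
-
-This selects the series `s01` from the existing device `root.db.device` and creates the series view `root.view.device.status`.
-
-A series view can live under the same entity as a time series, for example:
-
-```SQL
-CREATE VIEW root.db.device.status
-AS
-    SELECT s01
-    FROM root.db.device
-```
-
-`root.db.device` now has a virtual copy of `s01` under a different name, `status`.
-
-Both series views above are alias series, for which a more convenient creation syntax is provided:
-
-```SQL
-CREATE VIEW root.view.device.status
-AS
-    root.db.device.s01
-```
-
-#### Creating a view with computation logic
-
-Reusing the example from Constraint 1:
-
-> A user installs two thermometers on the same boiler and wants the average of the two readings as the measurement. The collected series are `root.db.d01.temperature01` and `root.db.d01.temperature02`.
->
-> The user can average the two series and expose the result as one series in a view: `root.db.d01.avg_temperature`.
-
-Without a view, the average of the two temperatures can be queried like this:
-
-```SQL
-SELECT (temperature01 + temperature02) / 2
-FROM root.db.d01
-```
-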
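-With a series view, the user can create a view like this to simplify future queries:
-
-```SQL
-CREATE VIEW root.db.d01.avg_temperature
-AS
-    SELECT (temperature01 + temperature02) / 2
-    FROM root.db.d01
-```
-
-The user can then query:
-
-```SQL
-SELECT avg_temperature FROM root.db.d01
-```
-
-#### Nested series views are not supported
-
-Continuing the example from "Creating a view with computation logic", suppose the user now wants to create a new view from the series view `root.db.d01.avg_temperature`. This is not allowed. Nested views are currently not supported, whether or not the view is an alias series.
-
-For example, the following SQL statement reports an error:
-
-```SQL
-CREATE VIEW root.view.device.avg_temp_copy
-AS
-    root.db.d01.avg_temperature                        -- Not supported: nested views are not allowed
-```
-
-#### Creating multiple series views at once
-
-Specifying only one series view per statement is inconvenient, so several series can be specified at once, for example:
-
-```SQL
-CREATE VIEW root.db.device.status, root.db.device.sub.hardware
-AS
-    SELECT s01, s02
-    FROM root.db.device
-```
-
-This can be simplified to:
-
-```SQL
-CREATE VIEW root.db.device(status, sub.hardware)
-AS
-    SELECT s01, s02
-    FROM root.db.device
-```
-
-Both statements are equivalent to:
-
-```SQL
-CREATE VIEW root.db.device.status
-AS
-    SELECT s01
-    FROM root.db.device;
-
-CREATE VIEW root.db.device.sub.hardware
-AS
-    SELECT s02
-    FROM root.db.device
-```
-
-They are also equivalent to:
-
-```SQL
-CREATE VIEW root.db.device.status, root.db.device.sub.hardware
-AS
-    root.db.device.s01, root.db.device.s02
-
--- or
-
-CREATE VIEW root.db.device(status, sub.hardware)
-AS
-    root.db.device(s01, s02)
-```
-
-##### Mappings between series are stored statically
-
-Sometimes the number of series matched by a SELECT clause is only known at run time, as in:
-
-```SQL
-SELECT s01, s02
-FROM root.db.d01, root.db.d02
-```
-
-The number of series this statement matches is not fixed and depends on the state of the system. Even so, it can be used to create a view.
-
-Note, however, that all mappings between series are stored statically (fixed at creation time). Consider the following example.
-
-The database currently contains only three series, `root.db.d01.s01`, `root.db.d02.s01`, and `root.db.d02.s02`, and then a view is created:
-
-```SQL
-CREATE VIEW root.view.d(alpha, beta, gamma)
-AS
-    SELECT s01, s02
-    FROM root.db.d01, root.db.d02
-```
-
-The mapping between the time series and the series views is:
-
-| No. | Time series       | Series view       |
-| ---- | ----------------- | ----------------- |
-| 1    | `root.db.d01.s01` | root.view.d.alpha |
-| 2    | `root.db.d02.s01` | root.view.d.beta  |
-| 3    | `root.db.d02.s02` | root.view.d.gamma |
-
-If the user later adds the series `root.db.d01.s02`, it is not mapped to any view. If the user then deletes `root.db.d01.s01`, querying `root.view.d.alpha` reports an error directly, and the view is not remapped to `root.db.d01.s02`.
-
-Always keep in mind that the mappings between series are static and fixed.
-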
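The static mapping described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not IoTDB's implementation: the class `StaticViewMappingDemo` and its methods `createView` and `resolve` are hypothetical names introduced for illustration, and the only behavior taken from the text is that each view name is paired with a source series once, at creation time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Toy model (hypothetical, not IoTDB code): a view stores a fixed name-to-source mapping.
class StaticViewMappingDemo {

    /** Pairs each view name with the source series matched at creation time, in order. */
    static Map<String, String> createView(List<String> viewNames, List<String> matchedSources) {
        if (viewNames.size() != matchedSources.size()) {
            throw new IllegalArgumentException("view names and matched series must pair up");
        }
        Map<String, String> mapping = new LinkedHashMap<>();
        for (int i = 0; i < viewNames.size(); i++) {
            mapping.put(viewNames.get(i), matchedSources.get(i));
        }
        // The mapping is never recomputed after this point.
        return Collections.unmodifiableMap(mapping);
    }

    /** Returns the mapped source only if that exact series still exists. */
    static Optional<String> resolve(
            Map<String, String> mapping, Set<String> existingSeries, String viewName) {
        String source = mapping.get(viewName);
        if (source != null && existingSeries.contains(source)) {
            return Optional.of(source);
        }
        // How the database reports a missing source is not modelled here.
        return Optional.empty();
    }

    public static void main(String[] args) {
        Set<String> series = new TreeSet<>(
                List.of("root.db.d01.s01", "root.db.d02.s01", "root.db.d02.s02"));
        Map<String, String> view = createView(
                List.of("root.view.d.alpha", "root.view.d.beta", "root.view.d.gamma"),
                new ArrayList<>(series));

        series.add("root.db.d01.s02");    // a new series is not picked up by the view
        series.remove("root.db.d01.s01"); // alpha loses its source and is not remapped

        System.out.println(view.containsValue("root.db.d01.s02"));      // false
        System.out.println(resolve(view, series, "root.view.d.alpha")); // Optional.empty
        System.out.println(resolve(view, series, "root.view.d.beta"));  // Optional[root.db.d02.s01]
    }
}
```

The design point is that `createView` takes a snapshot: nothing in `resolve` re-evaluates the original SELECT, which is why `alpha` does not fall back to `root.db.d01.s02`.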
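-#### Creating series views in batches
-
-Suppose there are several devices, each with a temperature value, for example:
-
-1. root.db.d1.temperature
-2. root.db.d2.temperature
-3. ...
-
-These devices may store many other series (for example `root.db.d1.speed`), but a view can be created that contains only the temperature values of these devices and ignores the other series:
-
-```SQL
-CREATE VIEW root.db.view(${2}_temperature)
-AS
-    SELECT temperature FROM root.db.*
-```
-
-This follows the naming convention of query write-back (`SELECT INTO`) and uses variable placeholders to specify the naming rule. See [Query write-back (SELECT INTO)](../Basic-Concept/Query-Data.md#查询写回(INTO-子句)).
-
-Here `root.db.*.temperature` determines which time series are included in the view, and `${2}` specifies which node of the time series path supplies the name of the series view.
-
-`${2}` refers to level 2 of `root.db.*.temperature` (counting from 0), that is, whatever `*` matched. `${2}_temperature` joins that match with `temperature` using an underscore to form the node name of each series in the view.
-
-The statement above is equivalent to:
-
-```SQL
-CREATE VIEW root.db.view(${2}_${3})
-AS
-    SELECT temperature from root.db.*
-```
-
-The resulting view contains these series:
-
-1. root.db.view.d1_temperature
-2. root.db.view.d2_temperature
-3. ...
-
-A view created with wildcards stores only the static mapping that existed at creation time.
-
-#### Restrictions on the SELECT clause when creating a view
-
-The SELECT clause used to create a series view is restricted. The main restrictions are:
-
-1. The `WHERE` clause cannot be used.
-2. The `GROUP BY` clause cannot be used.
-3. Aggregate functions such as `MAX_VALUE` cannot be used.
-
-In short, only the form `SELECT ... FROM ...` may follow `AS`, and the result of that query must form a single time series.
-
-### Querying view data
-
-For the supported query features, series views and time series can be used interchangeably in time series data queries, with identical behavior.
-
-**Query types that series views currently do not support:**
-
-1. **align by device queries**
-2. **group by tags queries**
-
-Time series and series views can also be mixed in the same SELECT statement, for example:
-
-```SQL
-SELECT temperature01, temperature02, avg_temperature
-FROM root.db.d01
-WHERE temperature01 < temperature02
-```
-
-However, when querying series metadata such as tags and attributes, the result is that of the series view, not of the time series the view references.
-
-In addition, to get the tags, attributes, and similar information of the time series behind an alias series, the user must first query the mapping of the view column to find the corresponding time series, and then query that time series. How to query the mapping is described in the section "Querying view metadata".
-
-### Modifying views
-
-Supported modifications include changing the computation logic, changing tags/attributes, and deletion.
-
-#### Changing the data source of a view
-
-```SQL
-ALTER VIEW root.view.device.status
-AS
-    SELECT s01
-    FROM root.ln.wf.d01
-```
-
-#### Changing the computation logic of a view
-
-```SQL
-ALTER VIEW root.db.d01.avg_temperature
-AS
-    SELECT (temperature01 + temperature02 + temperature03) / 3
-    FROM root.db.d01
-```
-
-#### Tag management
-
-- Add new tags
-
-```SQL
-ALTER view root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4
-```
-
-- Add new attributes
-
-```SQL
-ALTER view root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4
-```
-
-- Rename a tag or attribute
-
-```SQL
-ALTER view root.turbine.d1.s1 RENAME tag1 TO newTag1
-```
-
-- Reset the value of a tag or attribute
-
-```SQL
-ALTER view root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1
-```
-
-- Delete existing tags or attributes
-
-```SQL
-ALTER view root.turbine.d1.s1 DROP tag1, tag2
-```
-
-- Upsert tags and attributes
-
-> If a tag or attribute does not exist yet, it is inserted; otherwise its old value is replaced with the new one.
-
-```SQL
-ALTER view root.turbine.d1.s1 UPSERT TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4)
-```
-
-#### Deleting a view
-
-Because a view is a series, it can be deleted in the same way as a time series.
-
-```SQL
-DELETE VIEW root.view.device.avg_temperature
-```
-
-### View synchronization
-
-#### If the source series a view depends on is deleted
-
-When a series view is queried (when the series is resolved) and the time series it depends on does not exist, **an empty result set is returned**.
-
-This is similar to the response for querying a nonexistent series, with one difference: if the dependent time series cannot be resolved, the empty result set still contains the header, as a hint that the view has a problem.
-
-In addition, when a depended-on time series is deleted, the system does not look for views that depend on it, and the user receives no warning.
-
-#### Writes to non-alias series are not supported
-
-Writing to a non-alias series is not supported.
-
-See Constraint 2 above for details.
-
-#### Series metadata is not shared
-
-See Constraint 5 above for details.
-
-### Querying view metadata
-
-View metadata queries are queries for the metadata of the views themselves (for example, how many columns a view has) and for information about the views in a database (for example, which views exist).
-
-#### Viewing current view columns
-
-There are two ways to query:
-
-1. Use `SHOW TIMESERIES`. The result contains both time series and series views, but shows only some of the attributes of a view.
-2. Use `SHOW VIEW`. The result contains only series views and shows their attributes in full.
-
-Example:
-
-```Shell
-IoTDB> show timeseries;
-+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+
-|          Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType|
-+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+
-|root.db.device.s01  | null| root.db|   INT32|     RLE|     SNAPPY|null|      null|    null|              null|    BASE|
-+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+
-|root.db.view.status | null| root.db|   INT32|     RLE|     SNAPPY|null|      null|    null|              null|    VIEW|
-+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+
-|root.db.d01.temp01  | null| root.db|   FLOAT|     RLE|     SNAPPY|null|      null|    null|              null|    BASE|
-+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+
-|root.db.d01.temp02  | null| root.db|   FLOAT|     RLE|     SNAPPY|null|      null|    null|              null|    BASE|
-+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+
-|root.db.d01.avg_temp| null| root.db|   FLOAT|    null|       null|null|      null|    null|              null|    VIEW|
-+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+
-Total line number = 5
-It costs 0.789s
-IoTDB>
-```
-
-The last column, `ViewType`, shows the type of the series: BASE for a time series and VIEW for a series view.
-
-Some attributes are missing for certain series views. For example, `root.db.d01.avg_temp` is computed as an average of temperatures, so its `Encoding` and `Compression` attributes are null.
-
-The result of `SHOW TIMESERIES` consists of two main parts:
-
-1. Information about the time series data, such as data type, compression, and encoding.
-2. Other metadata, such as tags, attributes, and the owning database.
-
-For a series view, the time series data information shown is either the same as that of its source series or null (the computed average temperature, for instance, has a data type but no compression method). The metadata shown is that of the view.
-
-To learn more about a view, use `SHOW VIEW`, which also shows the data source of the view.
-
-```Shell
-IoTDB> show VIEW root.**;
-+--------------------+--------+--------+----+----------+--------+-----------------------------------------+
-|          Timeseries|Database|DataType|Tags|Attributes|ViewType|                                   SOURCE|
-+--------------------+--------+--------+----+----------+--------+-----------------------------------------+
-|root.db.view.status | root.db|   INT32|null|      null|    VIEW|                       root.db.device.s01|
-+--------------------+--------+--------+----+----------+--------+-----------------------------------------+
-|root.db.d01.avg_temp| root.db|   FLOAT|null|      null|    VIEW|(root.db.d01.temp01+root.db.d01.temp02)/2|
-+--------------------+--------+--------+----+----------+--------+-----------------------------------------+
-Total line number = 2
-It costs 0.789s
-IoTDB>
-```
-
-The last column, `SOURCE`, shows the data source of the series view, listing the SQL statement that created the series.
-
-##### About data types
-
-Both queries above involve the data type of a view. The data type of a view is inferred from the query statement that defines the view or, for an alias series, from the type of the original time series. It is computed in real time from the current state of the system, so the data type returned may differ at different times.
-
-## FAQ
-
-#### Q1: I want a view to perform type conversion. For example, a time series of type int32 is placed in the same view as other series of type int64, and I want all data queried through the view to be converted to int64 automatically.
-
-> Ans: This is outside the scope of series views, but `CAST` can be used for the conversion, for example:
-
-```SQL
-CREATE VIEW root.db.device.int64_status
-AS
-    SELECT CAST(s1, 'type'='INT64') from root.db.device
-```
-
-> Querying `root.db.device.int64_status` then returns results of type int64.
->
-> Note in particular that in this example the data of the series view is obtained through `CAST`, so `root.db.device.int64_status` is not an alias series and therefore **does not support writes**.
-
-#### Q2: Is default naming supported? Can I select several time series and create a view without naming each series, letting the database name them automatically?
-
-> Ans: No. The user must specify the names explicitly.
-
-#### Q3: In the existing system, creating the time series `root.db.device.s01` automatically creates the database `root.db` and the device `root.db.device`. Deleting `root.db.device.s01` then automatically deletes `root.db.device`, while `root.db` is kept. Do views follow the same mechanism, and what is the reasoning?
-
-> Ans: The existing behavior is unchanged. Introducing views does not alter this logic.
-
-#### Q4: Can a series view be renamed?
-
-> Ans: Renaming is not supported in the current version. Create a view with the new name and use that instead.
\ No newline at end of file
diff --git a/src/zh/UserGuide/V1.3.x/User-Manual/Streaming_timecho.md b/src/zh/UserGuide/V1.3.x/User-Manual/Streaming_timecho.md
deleted file mode 100644
index 010ce7ce6..000000000
--- a/src/zh/UserGuide/V1.3.x/User-Manual/Streaming_timecho.md
+++ /dev/null
@@ -1,862 +0,0 @@
-
-
-# Stream Processing Framework
-
-The IoTDB stream processing framework lets users implement custom stream processing logic: listening for and capturing changes in the storage engine, transforming the changed data, and pushing the transformed data outward.
-
-We call a data stream processing task a Pipe. A stream processing task (Pipe) consists of three subtasks:
-
-- Extraction (Source)
-- Processing (Process)
-- Sending (Sink)
-
-The framework lets users write the logic of the three subtasks in Java and process data in a way similar to UDFs.
-Within a Pipe, the three subtasks are carried out by three kinds of plugins, and data passes through them in order:
-Pipe Source extracts data, Pipe Processor processes data, and Pipe Sink sends data, after which the data is delivered to an external system.
-
-**The model of a Pipe task is as follows:**
-
-![Task model](/img/1706697228308.jpg)
-
-Describing a data stream processing task essentially means describing the properties of the Pipe Source, Pipe Processor, and Pipe Sink plugins.
-Users configure the properties of the three subtasks declaratively with SQL statements, and flexible data ETL is achieved by combining different properties.
-
-With the stream processing framework, a complete data pipeline can be built to meet needs such as *edge-cloud synchronization, remote disaster recovery, and splitting read/write load across databases*.
-
-## Custom Stream Processing Plugin Development
-
-### Development dependencies
-
-Maven is recommended for building the project. Add the following dependency to `pom.xml`, choosing the dependency version that matches the IoTDB server version.
-
-```xml
-<dependency>
-    <groupId>org.apache.iotdb</groupId>
-    <artifactId>pipe-api</artifactId>
-    <version>1.3.1</version>
-    <scope>provided</scope>
-</dependency>
-```
-
-### Event-driven programming model
-
-The user programming interface of stream processing plugins follows the general design of the event-driven programming model. An event (Event) is the data abstraction in the programming interface. The interface is decoupled from the execution mechanism, so the user only needs to describe how an event (data) should be handled once it reaches the system.
-
-In this interface, an event is an abstraction of a write operation on database data. Events are captured by the stream processing engine on a single node and passed through the three stages in order, to the PipeSource plugin, the PipeProcessor plugin, and the PipeSink plugin, triggering user logic in each.
-
-To provide both low latency under light load and high throughput under heavy load on the edge side, the engine dynamically chooses between operation logs and data files as the objects to process. The programming interface therefore requires users to supply handling logic for two kinds of events: the operation log write event TabletInsertionEvent and the data file write event TsFileInsertionEvent.
-
-#### **Operation log write event (TabletInsertionEvent)**
-
-The operation log write event (TabletInsertionEvent) is a high-level abstraction of a user write request. Through a unified interface it lets users manipulate the underlying data of the write request.
-
-The underlying storage structure behind the event differs by deployment mode. In a standalone deployment, the event wraps write-ahead log (WAL) entries. In a distributed deployment, it wraps the consensus protocol operation log entries of a single node.
-
-The request structures also differ across the write interfaces of the database. IoTDB provides many write interfaces, such as InsertRecord, InsertRecords, InsertTablet, and InsertTablets, each with its own serialization and its own binary entries.
-
-The operation log write event gives users a unified view of data operations. It hides the differences between the underlying data structures, which lowers the programming barrier and makes the feature easier to use.
-
-```java
-/** TabletInsertionEvent is used to define the event of data insertion. */
-public interface TabletInsertionEvent extends Event {
-
-  /**
-   * The consumer processes the data row by row and collects the results by RowCollector.
-   *
-   * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the
-   *     results collected by the RowCollector
-   */
-  Iterable<TabletInsertionEvent> processRowByRow(BiConsumer<Row, RowCollector> consumer);
-
-  /**
-   * The consumer processes the Tablet directly and collects the results by RowCollector.
-   *
-   * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the
-   *     results collected by the RowCollector
-   */
-  Iterable<TabletInsertionEvent> processTablet(BiConsumer<Tablet, RowCollector> consumer);
-}
-```
-
-#### **Data file write event (TsFileInsertionEvent)**
-
-The data file write event (TsFileInsertionEvent) is a high-level abstraction of flushing a database file to disk. It is the collection of the data of several operation log write events (TabletInsertionEvent).
-
-IoTDB's storage engine has an LSM structure. On a write, the operation is first persisted to a log-structured file while the written data is kept in memory. When memory reaches its limit, a flush is triggered: the in-memory data is converted into a database file and the previously written operation log is deleted. During that conversion the data goes through two rounds of compression, encoding compression and general-purpose compression, so a database file takes less space than the original data in memory.
-
-Under extreme network conditions, transferring data files directly is more economical than transferring write operations: it uses less network bandwidth and achieves a faster transfer. There is a cost, however: computing on data in files requires extra file I/O compared with computing on data in memory. Because on-disk data files and in-memory write operations each have their own strengths and weaknesses, the system has room to trade one off against the other dynamically, and this observation is why the data file write event was introduced into the plugin event model.
-
-In summary, a data file write event appears in the event stream of a stream processing plugin in two cases:
-
-(1) Historical data extraction: before a stream processing task starts, all written data that has been persisted exists as TsFiles. After the task starts, historical data is collected and represented as TsFileInsertionEvent.
-
-(2) Real-time data extraction: while a task is running, if the operation log write events in the stream are processed more slowly than write requests arrive and fall behind by a certain amount, the events that could not be processed in time are persisted to disk as TsFiles. When the engine extracts this data, it is represented as TsFileInsertionEvent.
-
-```java
-/**
- * TsFileInsertionEvent is used to define the event of writing TsFile. Event data stores in disks,
- * which is compressed and encoded, and requires IO cost for computational processing.
- */
-public interface TsFileInsertionEvent extends Event {
-
-  /**
-   * The method is used to convert the TsFileInsertionEvent into several TabletInsertionEvents.
-   *
-   * @return {@code Iterable<TabletInsertionEvent>} the list of TabletInsertionEvent
-   */
-  Iterable<TabletInsertionEvent> toTabletInsertionEvents();
-}
-```
-
-### Custom stream processing plugin interface definitions
-
-With the custom plugin programming interfaces, users can write data extraction plugins, data processing plugins, and data sending plugins, so that stream processing adapts flexibly to a variety of industrial scenarios.
-
-#### Data extraction plugin interface
-
-Data extraction is the first of the three stages that stream data goes through, from extraction to sending. The data extraction plugin (PipeSource) is the bridge between the stream processing engine and the storage engine. It captures data write events by listening to the behavior of the storage engine.
-
-```java
-/**
- * PipeSource
- *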
- * <p>PipeSource is responsible for capturing events from sources.
- *
- * <p>Various data sources can be supported by implementing different PipeSource classes.
- *
- * <p>The lifecycle of a PipeSource is as follows:
- *
- * <ul>
- *   <li>When a collaboration task is created, the KV pairs of `WITH SOURCE` clause in SQL are
- *       parsed and the validation method {@link PipeSource#validate(PipeParameterValidator)} will
- *       be called to validate the parameters.
- *   <li>Before the collaboration task starts, the method {@link
- *       PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} will be called to
- *       configure the runtime behavior of the PipeSource.
- *   <li>Then the method {@link PipeSource#start()} will be called to start the PipeSource.
- *   <li>While the collaboration task is in progress, the method {@link PipeSource#supply()} will be
- *       called to capture events from sources and then the events will be passed to the
- *       PipeProcessor.
- *   <li>The method {@link PipeSource#close()} will be called when the collaboration task is
- *       cancelled (the `DROP PIPE` command is executed).
- * </ul>
- */
-public interface PipeSource extends PipePlugin {
-
-  /**
-   * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link
-   * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called.
-   *
-   * @param validator the validator used to validate {@link PipeParameters}
-   * @throws Exception if any parameter is not valid
-   */
-  void validate(PipeParameterValidator validator) throws Exception;
-
-  /**
-   * This method is mainly used to customize PipeSource. In this method, the user can do the
-   * following things:
-   *
-   * <ul>
-   *   <li>Use PipeParameters to parse key-value pair attributes entered by the user.
-   *   <li>Set the running configurations in PipeSourceRuntimeConfiguration.
-   * </ul>
-   *
-   * <p>This method is called after the method {@link PipeSource#validate(PipeParameterValidator)}
-   * is called.
-   *
-   * @param parameters used to parse the input parameters entered by the user
-   * @param configuration used to set the required properties of the running PipeSource
-   * @throws Exception the user can throw errors if necessary
-   */
-  void customize(PipeParameters parameters, PipeSourceRuntimeConfiguration configuration)
-      throws Exception;
-
-  /**
-   * Start the Source. After this method is called, events should be ready to be supplied by
-   * {@link PipeSource#supply()}. This method is called after {@link
-   * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called.
-   *
-   * @throws Exception the user can throw errors if necessary
-   */
-  void start() throws Exception;
-
-  /**
-   * Supply a single event from the Source and the caller will send the event to the processor.
-   * This method is called after {@link PipeSource#start()} is called.
-   *
-   * @return the event to be supplied. the event may be null if the Source has no more events at
-   *     the moment, but the Source is still running for more events.
-   * @throws Exception the user can throw errors if necessary
-   */
-  Event supply() throws Exception;
-}
-```
-
-#### Data processing plugin interface
-
-Data processing is the second of the three stages that stream data goes through, from extraction to sending. The data processing plugin (PipeProcessor) is mainly used to filter and transform the events captured by the data extraction plugin (PipeSource).
-
-```java
-/**
- * PipeProcessor
- *
- * <p>PipeProcessor is used to filter and transform the Event formed by the PipeSource.
- *
- * <p>The lifecycle of a PipeProcessor is as follows:
- *
- * <ul>
- *   <li>When a collaboration task is created, the KV pairs of `WITH PROCESSOR` clause in SQL are
- *       parsed and the validation method {@link PipeProcessor#validate(PipeParameterValidator)}
- *       will be called to validate the parameters.
- *   <li>Before the collaboration task starts, the method {@link
- *       PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} will be called
- *       to configure the runtime behavior of the PipeProcessor.
- *   <li>While the collaboration task is in progress:
- *       <ul>
- *         <li>PipeSource captures the events and wraps them into three types of Event instances.
- *         <li>PipeProcessor processes the events and then passes them to the PipeSink. The
- *             following 3 methods will be called: {@link
- *             PipeProcessor#process(TabletInsertionEvent, EventCollector)}, {@link
- *             PipeProcessor#process(TsFileInsertionEvent, EventCollector)} and {@link
- *             PipeProcessor#process(Event, EventCollector)}.
- *         <li>PipeSink serializes the events into binaries and sends them to sinks.
- *       </ul>
- *   <li>When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link
- *       PipeProcessor#close() } method will be called.
- * </ul>
- */
-public interface PipeProcessor extends PipePlugin {
-
-  /**
-   * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link
-   * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} is called.
-   *
-   * @param validator the validator used to validate {@link PipeParameters}
-   * @throws Exception if any parameter is not valid
-   */
-  void validate(PipeParameterValidator validator) throws Exception;
-
-  /**
-   * This method is mainly used to customize PipeProcessor. In this method, the user can do the
-   * following things:
-   *
-   * <ul>
-   *   <li>Use PipeParameters to parse key-value pair attributes entered by the user.
-   *   <li>Set the running configurations in PipeProcessorRuntimeConfiguration.
-   * </ul>
-   *
-   * <p>This method is called after the method {@link
-   * PipeProcessor#validate(PipeParameterValidator)} is called and before the beginning of the
-   * events processing.
-   *
-   * @param parameters used to parse the input parameters entered by the user
-   * @param configuration used to set the required properties of the running PipeProcessor
-   * @throws Exception the user can throw errors if necessary
-   */
-  void customize(PipeParameters parameters, PipeProcessorRuntimeConfiguration configuration)
-      throws Exception;
-
-  /**
-   * This method is called to process the TabletInsertionEvent.
-   *
-   * @param tabletInsertionEvent TabletInsertionEvent to be processed
-   * @param eventCollector used to collect result events after processing
-   * @throws Exception the user can throw errors if necessary
-   */
-  void process(TabletInsertionEvent tabletInsertionEvent, EventCollector eventCollector)
-      throws Exception;
-
-  /**
-   * This method is called to process the TsFileInsertionEvent.
-   *
-   * @param tsFileInsertionEvent TsFileInsertionEvent to be processed
-   * @param eventCollector used to collect result events after processing
-   * @throws Exception the user can throw errors if necessary
-   */
-  default void process(TsFileInsertionEvent tsFileInsertionEvent, EventCollector eventCollector)
-      throws Exception {
-    for (final TabletInsertionEvent tabletInsertionEvent :
-        tsFileInsertionEvent.toTabletInsertionEvents()) {
-      process(tabletInsertionEvent, eventCollector);
-    }
-  }
-
-  /**
-   * This method is called to process the Event.
-   *
-   * @param event Event to be processed
-   * @param eventCollector used to collect result events after processing
-   * @throws Exception the user can throw errors if necessary
-   */
-  void process(Event event, EventCollector eventCollector) throws Exception;
-}
-```
-
-#### Data sending plugin interface
-
-Data sending is the third of the three stages that stream data goes through, from extraction to sending. The data sending plugin (PipeSink) is mainly used to send the events that have been handled by the data processing plugin (PipeProcessor). As the network layer of the stream processing framework, its interface should allow multiple real-time communication protocols and multiple connectors to be plugged in.
-
-```java
-/**
- * PipeSink
- *
- * <p>PipeSink is responsible for sending events to sinks.
- *
- * <p>Various network protocols can be supported by implementing different PipeSink classes.
- *
- * <p>The lifecycle of a PipeSink is as follows:
- *
- * <ul>
- *   <li>When a collaboration task is created, the KV pairs of `WITH SINK` clause in SQL are
- *       parsed and the validation method {@link PipeSink#validate(PipeParameterValidator)} will be
- *       called to validate the parameters.
- *   <li>Before the collaboration task starts, the method {@link PipeSink#customize(PipeParameters,
- *       PipeSinkRuntimeConfiguration)} will be called to configure the runtime behavior of the
- *       PipeSink and the method {@link PipeSink#handshake()} will be called to create a connection
- *       with sink.
- *   <li>While the collaboration task is in progress:
- *       <ul>
- *         <li>PipeSource captures the events and wraps them into three types of Event instances.
- *         <li>PipeProcessor processes the events and then passes them to the PipeSink.
- *         <li>PipeSink serializes the events into binaries and sends them to sinks. The following 3
- *             methods will be called: {@link PipeSink#transfer(TabletInsertionEvent)}, {@link
- *             PipeSink#transfer(TsFileInsertionEvent)} and {@link PipeSink#transfer(Event)}.
- *       </ul>
- *   <li>When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link
- *       PipeSink#close() } method will be called.
- * </ul>
- *
- * <p>In addition, the method {@link PipeSink#heartbeat()} will be called periodically to check
- * whether the connection with sink is still alive. The method {@link PipeSink#handshake()} will be
- * called to create a new connection with the sink when the method {@link PipeSink#heartbeat()}
- * throws exceptions.
- */
-public interface PipeSink extends PipePlugin {
-
-  /**
-   * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link
-   * PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called.
-   *
-   * @param validator the validator used to validate {@link PipeParameters}
-   * @throws Exception if any parameter is not valid
-   */
-  void validate(PipeParameterValidator validator) throws Exception;
-
-  /**
-   * This method is mainly used to customize PipeSink. In this method, the user can do the following
-   * things:
-   *
-   * <ul>
-   *   <li>Use PipeParameters to parse key-value pair attributes entered by the user.
-   *   <li>Set the running configurations in PipeSinkRuntimeConfiguration.
-   * </ul>
-   *
-   * <p>This method is called after the method {@link PipeSink#validate(PipeParameterValidator)} is
-   * called and before the method {@link PipeSink#handshake()} is called.
-   *
-   * @param parameters used to parse the input parameters entered by the user
-   * @param configuration used to set the required properties of the running PipeSink
-   * @throws Exception the user can throw errors if necessary
-   */
-  void customize(PipeParameters parameters, PipeSinkRuntimeConfiguration configuration)
-      throws Exception;
-
-  /**
-   * This method is used to create a connection with sink. This method will be called after the
-   * method {@link PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called or
-   * will be called when the method {@link PipeSink#heartbeat()} throws exceptions.
-   *
-   * @throws Exception if the connection is failed to be created
-   */
-  void handshake() throws Exception;
-
-  /**
-   * This method will be called periodically to check whether the connection with sink is still
-   * alive.
-   *
-   * @throws Exception if the connection dies
-   */
-  void heartbeat() throws Exception;
-
-  /**
-   * This method is used to transfer the TabletInsertionEvent.
-   *
-   * @param tabletInsertionEvent TabletInsertionEvent to be transferred
-   * @throws PipeConnectionException if the connection is broken
-   * @throws Exception the user can throw errors if necessary
-   */
-  void transfer(TabletInsertionEvent tabletInsertionEvent) throws Exception;
-
-  /**
-   * This method is used to transfer the TsFileInsertionEvent.
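-   *
-   * @param tsFileInsertionEvent TsFileInsertionEvent to be transferred
-   * @throws PipeConnectionException if the connection is broken
-   * @throws Exception the user can throw errors if necessary
-   */
-  default void transfer(TsFileInsertionEvent tsFileInsertionEvent) throws Exception {
-    try {
-      for (final TabletInsertionEvent tabletInsertionEvent :
-          tsFileInsertionEvent.toTabletInsertionEvents()) {
-        transfer(tabletInsertionEvent);
-      }
-    } finally {
-      tsFileInsertionEvent.close();
-    }
-  }
-
-  /**
-   * This method is used to transfer the generic events, including HeartbeatEvent.
-   *
-   * @param event Event to be transferred
-   * @throws PipeConnectionException if the connection is broken
-   * @throws Exception the user can throw errors if necessary
-   */
-  void transfer(Event event) throws Exception;
-}
-```
-
-## Custom Stream Processing Plugin Management
-
-To keep user-defined plugins flexible and easy to use in production, the system also needs to manage plugins dynamically and uniformly.
-The plugin management statements introduced in this chapter are the entry point for that.
-
-### Loading a plugin
-
-To load a user-defined plugin into IoTDB dynamically, first implement a concrete plugin class based on PipeSource, PipeProcessor, or PipeSink,
-then compile and package the plugin class into an executable jar file, and finally load the plugin into IoTDB with the plugin loading statement.
-
-The syntax of the plugin loading statement is shown below.
-
-```sql
-CREATE PIPEPLUGIN [IF NOT EXISTS] <alias>
-AS <fully qualified class name>
-USING <URI of the JAR package>
-```
-
-**IF NOT EXISTS semantics**: used in creation to ensure that the create command runs only when the specified Pipe Plugin does not exist, which prevents errors caused by trying to create a Pipe Plugin that already exists.
-
-Example: suppose a user has implemented a data processing plugin whose fully qualified class name is edu.tsinghua.iotdb.pipe.ExampleProcessor, packaged as pipe-plugin.jar, and wants to use it in the stream processing engine under the name example. The plugin package can be provided in two ways, by uploading it to a URI server or by placing it in a local directory of the cluster. Either one works.
-
-[Method 1] Upload to a URI server
-
-Preparation: to register this way, upload the JAR package to a URI server in advance and make sure the IoTDB instance that executes the registration statement can reach that server, for example https://example.com:8080/iotdb/pipe-plugin.jar.
-
-Creation statement:
-
-```sql
-CREATE PIPEPLUGIN IF NOT EXISTS example
-AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor'
-USING URI <https://example.com:8080/iotdb/pipe-plugin.jar>
-```
-
-[Method 2] Upload to a local directory of the cluster
-
-Preparation: to register this way, place the JAR package in advance in any path on the machine of the DataNode. The /ext/pipe directory under the IoTDB installation path is recommended (it already exists in the installation package and does not need to be created), for example iotdb-1.x.x-bin/ext/pipe/pipe-plugin.jar. (**Note: in a cluster, the JAR package must be placed at that path on the machine of every DataNode.**)
-
-Creation statement:
-
-```sql
-CREATE PIPEPLUGIN IF NOT EXISTS example
-AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor'
-USING URI <file URI of pipe-plugin.jar on the DataNode>
-```
-
-### Dropping a plugin
-
-When a plugin is no longer needed and should be unloaded from the system, use the drop statement shown below.
-
-```sql
-DROP PIPEPLUGIN [IF EXISTS] <alias>
-```
-
-**IF EXISTS semantics**: used in deletion to ensure that the drop command runs only when the specified Pipe Plugin exists, which prevents errors caused by trying to drop a Pipe Plugin that does not exist.
-
-### Viewing plugins
-
-Users can also view the plugins in the system as needed, with the statement shown below.
-
-```sql
-SHOW PIPEPLUGINS
-```
-
-## Built-in Stream Processing Plugins
-
-### Built-in source plugins
-
-#### iotdb-source
-
-Purpose: extracts historical or real-time data inside IoTDB into the pipe.
-
-
-| key | value | value range | required or optional with default |
-|---------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------|-----------------------------------|
-| source | iotdb-source | String: iotdb-source | required |
-| source.pattern | Path prefix used to filter time series | String: any time series prefix | optional: root |
-| source.history.start-time | Start event time of the historical data to extract, inclusive | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE |
-| source.history.end-time | End event time of the historical data to extract, inclusive | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE |
-| start-time(V1.3.1+) | Start event time of all data to synchronize, inclusive. If configured, "history.start-time" and "history.end-time" are disabled | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE |
-| end-time(V1.3.1+) | End event time of all data to synchronize, inclusive. If configured, "history.start-time" and "history.end-time" are disabled | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE |
-| source.realtime.mode | Extraction mode for real-time data | String: hybrid, log, file | optional: hybrid |
-| source.forwarding-pipe-requests | Whether to extract data written by other Pipes (usually data synchronization) | Boolean: true, false | optional: true |
-
-> 🚫 **Notes on source.pattern**
->
-> * In the pattern, illegal characters or illegal path nodes must be quoted with backticks. For example, to filter root.\`a@b\` or root.\`123\`, set the pattern to root.\`a@b\` or root.\`123\` (see [When to use single quotes, double quotes, and backticks](https://iotdb.apache.org/zh/Download/#_1-0-版本不兼容的语法详细说明))
-> * In the underlying implementation, extraction is more efficient when the pattern is root (the default); any other format lowers performance
-> * The path prefix does not need to form a complete path. For example, when a pipe is created with the parameter 'source.pattern'='root.aligned.1':
->
->   * root.aligned.1TS
->   * root.aligned.1TS.\`1\`
->   * root.aligned.100T
->
->   data of these series is extracted;
->
->   * root.aligned.\`1\`
->   * root.aligned.\`123\`
->
->   data of these series is not extracted.
-
-> ❗️**Notes on the start-time and end-time parameters of source.history**
->
-> * start-time and end-time should be in ISO format, for example 2011-12-03T10:15:30 or 2011-12-03T10:15:30+01:00
-
-> ✅ **From production to storage in IoTDB, a data point involves two key notions of time**
->
-> * **event time:** the time at which the data was actually produced (or the generation time assigned to the data by the producing system; it is the time field of the data point).
-> * **arrival time:** the time at which the data arrives in the IoTDB system.
->
-> Out-of-order data refers to data whose **event time** on arrival is far behind the current system time (or the largest **event time** already stored). By contrast, whether data is out of order or in order, as long as it is newly arrived, its **arrival time** increases in the order in which the data reaches IoTDB.
-
-> 💎 **The work of iotdb-source can be split into two stages**
->
-> 1. Historical data extraction: all data with **arrival time** < the **current system time** when the pipe was created is historical data
-> 2. Real-time data extraction: all data with **arrival time** >= the **current system time** when the pipe was created is real-time data
->
-> The historical data transfer stage and the real-time data transfer stage **run serially: the real-time stage starts only after the historical stage has finished.**
-
-> 📌 **source.realtime.mode: the data extraction mode**
->
-> * log: the task processes and sends data using operation logs only
-> * file: the task processes and sends data using data files only
-> * hybrid: sending entry by entry from operation logs gives low latency but low throughput, while sending in batches from data files gives high throughput but high latency. This mode switches automatically between the two depending on write load. It starts with log-based extraction to keep sending latency low, switches to file-based extraction when a backlog builds up to keep throughput high, and switches back to log-based extraction when the backlog clears. This avoids the difficulty of balancing latency and throughput with a single extraction algorithm.
-
-> 🍕 **source.forwarding-pipe-requests: whether to forward data transferred from another pipe**
->
-> * To build A -> B -> C data synchronization with pipes, the B -> C pipe must set this parameter to true so that data written to B by A through the A -> B pipe is forwarded to C correctly
-> * To build A \<-> B bidirectional synchronization (active-active), both the A -> B and B -> A pipes must set this parameter to false; otherwise data is forwarded between the clusters in an endless loop
-
-### Built-in processor plugins
-
-#### do-nothing-processor
-
-Purpose: does nothing to the events passed in by the source.
-
-
-| key | value | value range | required or optional with default |
-|-----------|----------------------|------------------------------|-----------------------------------|
-| processor | do-nothing-processor | String: do-nothing-processor | required |
-
-### Built-in sink plugins
-
-#### do-nothing-sink
-
-Purpose: does nothing to the events passed in by the processor.
-
-
-| key | value | value range | required or optional with default |
-|------|-----------------|-------------------------|-----------------------------------|
-| sink | do-nothing-sink | String: do-nothing-sink | required |
-
-## Stream Processing Task Management
-
-### Creating a stream processing task
-
-Use the `CREATE PIPE` statement to create a stream processing task. Taking a data synchronization task as an example, the SQL statement is:
-
-```sql
-CREATE PIPE <PipeId> -- PipeId is a name that uniquely identifies the stream processing task
-WITH SOURCE (
-  -- Default IoTDB data extraction plugin
-  'source'                    = 'iotdb-source',
-  -- Path prefix; only data matching this prefix is extracted for later processing and sending
-  'source.pattern'            = 'root.timecho',
-  -- Whether to extract historical data
-  'source.history.enable'     = 'true',
-  -- Time range of the historical data to extract: earliest time
-  'source.history.start-time' = '2011-12-03T10:15:30+01:00',
-  -- Time range of the historical data to extract: latest time
-  'source.history.end-time'   = '2022-12-03T10:15:30+01:00',
-  -- Whether to extract real-time data
-  'source.realtime.enable'    = 'true',
-  -- How real-time data is extracted
-  'source.realtime.mode'      = 'hybrid',
-)
-WITH PROCESSOR (
-  -- Default data processing plugin, which does nothing
-  'processor'                 = 'do-nothing-processor',
-)
-WITH SINK (
-  -- IoTDB data sending plugin; the target is IoTDB
-  'sink'                      = 'iotdb-thrift-sink',
-  -- Data service ip of one of the DataNodes of the target IoTDB
-  'sink.ip'                   = '127.0.0.1',
-  -- Data service port of one of the DataNodes of the target IoTDB
-  'sink.port'                 = '6667',
-)
-```
-
-**Creating a stream processing task requires a PipeId and the parameters of the three plugin sections:**
-
-
-| Item | Description | Required | Default implementation | Description of the default | Custom implementation allowed |
-|-----------|--------------------------------|---------------------------|----------------------|------------------------------|--------------------------|
-| PipeId | Name that globally and uniquely identifies a stream processing task | Required | - | - | - |
-| source | Pipe Source plugin, which extracts stream data at the bottom layer of the database | Optional | iotdb-source | Feeds all historical data of the database and subsequently arriving real-time data into the task | No |
-| processor | Pipe Processor plugin, which processes data | Optional | do-nothing-processor | Does nothing to incoming data | |
-| sink | Pipe Sink plugin, which sends data | Required | - | - | |
-
-The example builds the task with the iotdb-source, do-nothing-processor, and iotdb-thrift-sink plugins. IoTDB also ships other built-in stream processing plugins; **see the section "Built-in Stream Processing Plugins"**.
-
-**A minimal CREATE PIPE statement looks like this:**
-
-```sql
-CREATE PIPE <PipeId> -- PipeId is a name that uniquely identifies the stream processing task
-WITH SINK (
-  -- IoTDB data sending plugin; the target is IoTDB
-  'sink'      = 'iotdb-thrift-sink',
-  -- Data service ip of one of the DataNodes of the target IoTDB
-  'sink.ip'   = '127.0.0.1',
-  -- Data service port of one of the DataNodes of the target IoTDB
-  'sink.port' = '6667',
-)
-```
-
-Its meaning is: synchronize all historical data of this database instance, and all real-time data that arrives afterwards, to the IoTDB instance at 127.0.0.1:6667.
-
-**Notes:**
-
-- SOURCE and PROCESSOR are optional. If their parameters are omitted, the system uses the corresponding default implementations
-- SINK is required and must be configured declaratively in the CREATE PIPE statement
-- SINK instances are reused. For different stream processing tasks whose SINKs have exactly the same KV properties (every key has the same value), **the system creates only one SINK instance**, so that connection resources are reused.
-
-  - For example, consider the declarations of the two tasks pipe1 and pipe2:
-
-  ```sql
-  CREATE PIPE pipe1
-  WITH SINK (
-    'sink' = 'iotdb-thrift-sink',
-    'sink.ip' = 'localhost',
-    'sink.port' = '9999',
-  )
-
-  CREATE PIPE pipe2
-  WITH SINK (
-    'sink' = 'iotdb-thrift-sink',
-    'sink.port' = '9999',
-    'sink.ip' = 'localhost',
-  )
-  ```
-
-  - Because their SINK declarations are identical (**even though some properties are declared in a different order**), the framework reuses the declared SINK automatically, and pipe1 and pipe2 end up sharing the same SINK instance.
-- When the source is the default iotdb-source and source.forwarding-pipe-requests has its default value true, do not build scenarios that synchronize data in a cycle (this causes an infinite loop):
-
-  - IoTDB A -> IoTDB B -> IoTDB A
-  - IoTDB A -> IoTDB A
-
-### Starting a stream processing task
-
-After a CREATE PIPE statement succeeds, the instances of the task are created, but the running state of the task is set to STOPPED, so the task does not process data immediately (V1.3.0). In version 1.3.1 and later, the running state is set to RUNNING immediately after creation.
-
-Use the START PIPE statement to make the task start processing data:
-
-```sql
-START PIPE <PipeId>
-```
-
-### Stopping a stream processing task
-
-Use the STOP PIPE statement to make the task stop processing data:
-
-```sql
-STOP PIPE <PipeId>
-```
-
-### Dropping a stream processing task
-
-Use the DROP PIPE statement to make the task stop processing data (if its state is RUNNING) and then delete the whole task:
-
-```sql
-DROP PIPE <PipeId>
-```
-
-A STOP operation is not required before dropping a task.
-
-### Showing stream processing tasks
-
-Use the SHOW PIPES statement to view all stream processing tasks:
-
-```sql
-SHOW PIPES
-```
-
-The result looks like this:
-
-```sql
-+-----------+-----------------------+-------+----------+-------------+--------+----------------+
-|         ID|          CreationTime |  State|PipeSource|PipeProcessor|PipeSink|ExceptionMessage|
-+-----------+-----------------------+-------+----------+-------------+--------+----------------+
-|iotdb-kafka|2022-03-30T20:58:30.689|RUNNING|       ...|          ...|     ...|              {}|
-+-----------+-----------------------+-------+----------+-------------+--------+----------------+
-|iotdb-iotdb|2022-03-31T12:55:28.129|STOPPED|       ...|          ...|     ...| TException: ...|
-+-----------+-----------------------+-------+----------+-------------+--------+----------------+
-```
-
-Use `<PipeId>` to view the state of one specific task:
-
-```sql
-SHOW PIPE <PipeId>
-```
-
-A where clause can also be used to check whether the Pipe Sink used by a given \<PipeId\> is being reused.
-
-```sql
-SHOW PIPES
-WHERE SINK USED BY <PipeId>
-```
-
-### Stream processing task state transitions
-
-A stream processing pipe passes through several states during its lifecycle:
-
-- **RUNNING:** the pipe is working normally
-  - After a pipe is created successfully, its initial state is RUNNING (V1.3.1+)
-- **STOPPED:** the pipe is stopped. A pipe is in this state in the following cases:
-  - After a pipe is created successfully, its initial state is STOPPED (V1.3.0)
-  - The user manually stops a pipe that is running normally, and its state changes from RUNNING to STOPPED
-  - An unrecoverable error occurs while the pipe is running, and its state changes automatically from RUNNING to STOPPED
-- **DROPPED:** the pipe task has been deleted permanently
-
-The following figure shows all states and the transitions between them:
-
-![State transition diagram](/img/%E7%8A%B6%E6%80%81%E8%BF%81%E7%A7%BB%E5%9B%BE.png)
-
-## Permission Management
-
-### Stream processing tasks
-
-
-| Permission | Description |
-|----------|---------------|
-| USE_PIPE | Register a stream processing task. Path-independent. |
-| USE_PIPE | Start a stream processing task. Path-independent. |
-| USE_PIPE | Stop a stream processing task. Path-independent. |
-| USE_PIPE | 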
卸载流处理任务。路径无关。 | -| USE_PIPE | 查询流处理任务。路径无关。 | - -### 流处理任务插件 - - -| 权限名称 | 描述 | -|----------|-----------------| -| USE_PIPE | 注册流处理任务插件。路径无关。 | -| USE_PIPE | 卸载流处理任务插件。路径无关。 | -| USE_PIPE | 查询流处理任务插件。路径无关。 | - -## 配置参数 - -在 iotdb-system.properties 中: - -V1.3.0+: -```Properties -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_connector_timeout_ms=900000 - -# The maximum number of selectors that can be used in the async connector. -# pipe_async_connector_selector_number=1 - -# The core number of clients that can be used in the async connector. -# pipe_async_connector_core_client_number=8 - -# The maximum number of clients that can be used in the async connector. -# pipe_async_connector_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` - -V1.3.1+: -```Properties -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. 
Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` diff --git a/src/zh/UserGuide/V1.3.x/User-Manual/Tiered-Storage_timecho.md b/src/zh/UserGuide/V1.3.x/User-Manual/Tiered-Storage_timecho.md deleted file mode 100644 index 2b9fddc8d..000000000 --- a/src/zh/UserGuide/V1.3.x/User-Manual/Tiered-Storage_timecho.md +++ /dev/null @@ -1,97 +0,0 @@ - - -# 多级存储 -## 概述 - -多级存储功能向用户提供多种存储介质管理的能力,用户可以使用多级存储功能为 IoTDB 配置不同类型的存储介质,并为存储介质进行分级。具体的,在 IoTDB 中,多级存储的配置体现为多目录的管理。用户可以将多个存储目录归为同一类,作为一个“层级”向 IoTDB 中配置,这种“层级”我们称之为 storage tier;同时,用户可以根据数据的冷热进行分类,并将不同类别的数据存储到指定的“层级”中。当前 IoTDB 支持通过数据的 TTL 进行冷热数据的分类,当一个层级中的数据不满足当前层级定义的 TTL 规则时,该数据会被自动迁移至下一层级中。 - -## 参数定义 - -在 IoTDB 中开启多级存储,需要进行以下几个方面的配置: - -1. 配置数据目录,并将数据目录分为不同的层级 -2. 配置每个层级所管理的数据的 TTL,以区分不同层级管理的冷热数据类别。 -3. 
配置每个层级的最小剩余存储空间比例,当该层级的存储空间触发该阈值时,该层级的数据会被自动迁移至下一层级(可选)。 - -具体的参数定义及其描述如下。 - -| 配置项 | 默认值 | 说明 | 约束 | -| --------------------------------------- | ------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -| dn_data_dirs | data/datanode/data | 用来指定不同的存储目录,并将存储目录进行层级划分 | 每级存储使用分号分隔,单级内使用逗号分隔;云端配置只能作为最后一级存储且第一级不能作为云端存储;最多配置一个云端对象;远端存储目录使用 OBJECT_STORAGE 来表示 | -| tier_ttl_in_ms | -1 | 定义每个层级负责的数据范围,通过 TTL 表示 | 每级存储使用分号分隔;层级数量需与 dn_data_dirs 定义的层级数一致;"-1" 表示"无限制" | -| dn_default_space_usage_thresholds | 0.85 | 定义每个层级数据目录的最大使用空间比例;当使用空间大于该比例时,数据会被自动迁移至下一个层级;当最后一个层级的使用存储空间大于此阈值时,会将系统置为 READ_ONLY | 每级存储使用分号分隔;层级数量需与 dn_data_dirs 定义的层级数一致 | -| object_storage_type | AWS_S3 | 云端存储类型 | IoTDB 当前只支持 AWS S3 作为远端存储类型,此参数不支持修改 | -| object_storage_bucket | iotdb_data | 云端存储 bucket 的名称 | AWS S3 中的 bucket 定义;如果未使用远端存储,无需配置 | -| object_storage_endpoint | | 云端存储的 endpoint | AWS S3 的 endpoint;如果未使用远端存储,无需配置 | -| object_storage_access_key | | 云端存储的验证信息 key | AWS S3 的 credential key;如果未使用远端存储,无需配置 | -| object_storage_access_secret | | 云端存储的验证信息 secret | AWS S3 的 credential secret;如果未使用远端存储,无需配置 | -| remote_tsfile_cache_dirs | data/datanode/data/cache | 云端存储在本地的缓存目录 | 如果未使用远端存储,无需配置 | -| remote_tsfile_cache_page_size_in_kb | 20480 | 云端存储在本地缓存文件的块大小 | 如果未使用远端存储,无需配置 | -| remote_tsfile_cache_max_disk_usage_in_mb | 51200 | 云端存储本地缓存的最大磁盘占用大小 | 如果未使用远端存储,无需配置 | - - -## 本地多级存储配置示例 - -以下以本地两级存储的配置示例。 - -```JavaScript -// 必须配置项 -dn_data_dirs=/data1/data;/data2/data,/data3/data; -tier_ttl_in_ms=86400000;-1 -dn_default_space_usage_thresholds=0.2;0.1 -``` - -在该示例中,共配置了两个层级的存储,具体为: - -| **层级** | **数据目录** | **数据范围** | **磁盘最小剩余空间阈值** | -| -------- | -------------------------------------- | --------------- | ------------------------ | -| 层级一 | 目录一:/data1/data | 最近 1 天的数据 | 20% | -| 层级二 | 目录一:/data2/data目录二:/data3/data | 1 天以前的数据 | 10% | - -## 远端多级存储配置示例 - -以下以三级存储为例: - -```JavaScript -// 必须配置项 
-dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE -tier_ttl_in_ms=86400000;864000000;-1 -dn_default_space_usage_thresholds=0.2;0.15;0.1 -object_storage_name=AWS_S3 -object_storage_bucket=iotdb -object_storage_endpoint= -object_storage_access_key= -object_storage_access_secret= - -// 可选配置项 -remote_tsfile_cache_dirs=data/datanode/data/cache -remote_tsfile_cache_page_size_in_kb=20971520 -remote_tsfile_cache_max_disk_usage_in_mb=53687091200 -``` - -在该示例中,共配置了三个层级的存储,具体为: - -| **层级** | **数据目录** | **数据范围** | **磁盘最小剩余空间阈值** | -| -------- | -------------------------------------- | ---------------------------- | ------------------------ | -| 层级一 | 目录一:/data1/data | 最近 1 天的数据 | 20% | -| 层级二 | 目录一:/data2/data目录二:/data3/data | 过去1 天至过去 10 天内的数据 | 15% | -| 层级三 | 远端 AWS S3 存储 | 过去 10 天以前的数据 | 10% | \ No newline at end of file diff --git a/src/zh/UserGuide/V1.3.x/User-Manual/User-defined-function_timecho.md b/src/zh/UserGuide/V1.3.x/User-Manual/User-defined-function_timecho.md deleted file mode 100644 index 74d8f4baf..000000000 --- a/src/zh/UserGuide/V1.3.x/User-Manual/User-defined-function_timecho.md +++ /dev/null @@ -1,927 +0,0 @@ -# UDF - -## 1. UDF 介绍 - -UDF(User Defined Function)即用户自定义函数,IoTDB 提供多种内建的面向时序处理的函数,也支持扩展自定义函数来满足更多的计算需求。 - -IoTDB 支持两种类型的 UDF 函数,如下表所示。 - - - - - - - - - - - - - - - - - - - - - - -
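| UDF type | Data access strategy | Description |
| -------- | -------------------- | ----------- |
| UDTF | MAPPABLE_ROW_BY_ROW | User-defined scalar function. It takes k input time series with 1 row of data and outputs 1 time series with 1 row of data. It can be used in any clause or expression where a scalar function may appear, such as the SELECT clause and the WHERE clause. |
| UDTF | ROW_BY_ROW<br>SLIDING_TIME_WINDOW<br>SLIDING_SIZE_WINDOW<br>SESSION_TIME_WINDOW<br>STATE_WINDOW | User-defined time series generating function. It takes k input time series with m rows of data and outputs 1 time series with n rows of data. The number of input rows m may differ from the number of output rows n. It can only be used in the SELECT clause. |
| UDAF | - | User-defined aggregate function. It takes k input time series with m rows of data and outputs 1 time series with 1 row of data. It can be used in any clause or expression where an aggregate function may appear, such as the SELECT clause and the HAVING clause. |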
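
The row-count relationships in the table (m rows in, then m, n, or 1 rows out) can be sketched in plain Java. This is only an illustration of the three shapes, not IoTDB code: the class `AccessShapes` and its methods are hypothetical names and do not use the UDF API, and the window example assumes the sliding step equals the window size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; none of these names come from the IoTDB UDF API.
class AccessShapes {

  // MAPPABLE_ROW_BY_ROW shape: one output value per input row (m rows in, m rows out).
  static long[] mapRowByRow(long[] s1, long[] s2) {
    long[] out = new long[s1.length];
    for (int i = 0; i < s1.length; i++) {
      out[i] = s1[i] + s2[i];
    }
    return out;
  }

  // SLIDING_SIZE_WINDOW shape: one output value per window (m rows in, n rows out).
  // The last window may hold fewer than windowSize rows.
  static List<Integer> countPerSizeWindow(long[] s1, int windowSize) {
    if (windowSize <= 0) {
      throw new IllegalArgumentException("windowSize must be positive");
    }
    List<Integer> counts = new ArrayList<>();
    for (int start = 0; start < s1.length; start += windowSize) {
      counts.add(Math.min(windowSize, s1.length - start));
    }
    return counts;
  }

  // UDAF shape: all rows collapse into a single value (m rows in, 1 row out).
  static double average(long[] s1) {
    long sum = 0;
    for (long v : s1) {
      sum += v;
    }
    return s1.length == 0 ? 0.0 : (double) sum / s1.length;
  }

  public static void main(String[] args) {
    long[] s1 = {1, 2, 3, 4, 5};
    long[] s2 = {10, 20, 30, 40, 50};
    System.out.println(Arrays.toString(mapRowByRow(s1, s2)));
    System.out.println(countPerSizeWindow(s1, 2));
    System.out.println(average(s1));
  }
}
```

In a real UDF the framework drives these loops: it calls `transform` once per row or per window, and for a UDAF it feeds batches of rows into a `State` object.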
- -### 1.1 UDF 使用 - -UDF 的使用方法与普通内建函数类似,可以直接在 SELECT 语句中像调用普通函数一样使用UDF。 - -#### 1.支持的基础 SQL 语法 - -* `SLIMIT` / `SOFFSET` -* `LIMIT` / `OFFSET` -* 支持值过滤 -* 支持时间过滤 - - -#### 2. 带 * 查询 - -假定现在有时间序列 `root.sg.d1.s1`和 `root.sg.d1.s2`。 - -* **执行`SELECT example(*) from root.sg.d1`** - -那么结果集中将包括`example(root.sg.d1.s1)`和`example(root.sg.d1.s2)`的结果。 - -* **执行`SELECT example(s1, *) from root.sg.d1`** - -那么结果集中将包括`example(root.sg.d1.s1, root.sg.d1.s1)`和`example(root.sg.d1.s1, root.sg.d1.s2)`的结果。 - -* **执行`SELECT example(*, *) from root.sg.d1`** - -那么结果集中将包括`example(root.sg.d1.s1, root.sg.d1.s1)`,`example(root.sg.d1.s2, root.sg.d1.s1)`,`example(root.sg.d1.s1, root.sg.d1.s2)` 和 `example(root.sg.d1.s2, root.sg.d1.s2)`的结果。 - -#### 3. 带自定义输入参数的查询 - -可以在进行 UDF 查询的时候,向 UDF 传入任意数量的键值对参数。键值对中的键和值都需要被单引号或者双引号引起来。注意,键值对参数只能在所有时间序列后传入。下面是一组例子: - - 示例: -``` sql -SELECT example(s1, 'key1'='value1', 'key2'='value2'), example(*, 'key3'='value3') FROM root.sg.d1; -SELECT example(s1, s2, 'key1'='value1', 'key2'='value2') FROM root.sg.d1; -``` - -#### 4. 与其他查询的嵌套查询 - - 示例: -``` sql -SELECT s1, s2, example(s1, s2) FROM root.sg.d1; -SELECT *, example(*) FROM root.sg.d1 DISABLE ALIGN; -SELECT s1 * example(* / s1 + s2) FROM root.sg.d1; -SELECT s1, s2, s1 + example(s1, s2), s1 - example(s1 + example(s1, s2) / s2) FROM root.sg.d1; -``` - - -## 2. UDF 管理 - -### 2.1 UDF 注册 - -注册一个 UDF 可以按如下流程进行: - -1. 实现一个完整的 UDF 类,假定这个类的全类名为`org.apache.iotdb.udf.UDTFExample` -2. 将项目打成 JAR 包,如果使用 Maven 管理项目,可以参考 [Maven 项目示例](https://github.com/apache/iotdb/tree/master/example/udf)的写法 -3. 进行注册前的准备工作,根据注册方式的不同需要做不同的准备,具体可参考以下例子 -4. 
使用以下 SQL 语句注册 UDF - -```sql -CREATE FUNCTION AS (USING URI URI-STRING) -``` - -#### 示例:注册名为`example`的 UDF,以下两种注册方式任选其一即可 - -#### 方式一:手动放置jar包 - -准备工作: -使用该种方式注册时,需要提前将 JAR 包放置到集群所有节点的 `ext/udf`目录下(该目录可配置)。 - -注册语句: - -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' -``` - -#### 方式二:集群通过URI自动安装jar包 - -准备工作: -使用该种方式注册时,需要提前将 JAR 包上传到 URI 服务器上并确保执行注册语句的 IoTDB 实例能够访问该 URI 服务器。 - -注册语句: - -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' USING URI 'http://jar/example.jar' -``` - -IoTDB 会下载 JAR 包并同步到整个集群。 - -#### 注意 - -1. 由于 IoTDB 的 UDF 是通过反射技术动态装载的,因此在装载过程中无需启停服务器。 - -2. UDF 函数名称是大小写不敏感的。 - -3. 请不要给 UDF 函数注册一个内置函数的名字。使用内置函数的名字给 UDF 注册会失败。 - -4. 不同的 JAR 包中最好不要有全类名相同但实现功能逻辑不一样的类。例如 UDF(UDAF/UDTF):`udf1`、`udf2`分别对应资源`udf1.jar`、`udf2.jar`。如果两个 JAR 包里都包含一个`org.apache.iotdb.udf.UDTFExample`类,当同一个 SQL 中同时使用到这两个 UDF 时,系统会随机加载其中一个类,导致 UDF 执行行为不一致。 - -### 2.2 UDF 卸载 - -SQL 语法如下: - -```sql -DROP FUNCTION -``` - -示例:卸载上述例子的 UDF: - -```sql -DROP FUNCTION example -``` - -注意:对于使用 using uri 注册的函数,需要移除集群所有节点路径(`安装包/ext/udf/install`)中存在的 UDF 的 jar 文件。 - -### 2.3 查看所有注册的 UDF - -``` sql -SHOW FUNCTIONS -``` - -### 2.4 UDF 配置 - -- 允许在 `iotdb-system.properties` 中配置 udf 的存储目录.: - ``` Properties -# UDF lib dir - -udf_lib_dir=ext/udf -``` - -- 使用自定义函数时,提示内存不足,更改 `iotdb-system.properties` 中下述配置参数并重启服务。 - ``` Properties - -# Used to estimate the memory usage of text fields in a UDF query. -# It is recommended to set this value to be slightly larger than the average length of all text -# effectiveMode: restart -# Datatype: int -udf_initial_byte_array_length_for_memory_control=48 - -# How much memory may be used in ONE UDF query (in MB). -# The upper limit is 20% of allocated memory for read. -# effectiveMode: restart -# Datatype: float -udf_memory_budget_in_mb=30.0 - -# UDF memory allocation ratio. -# The parameter form is a:b:c, where a, b, and c are integers. 
-# effectiveMode: restart -udf_reader_transformer_collector_memory_proportion=1:1:1 -``` - -### 2.5 UDF 用户权限 - -用户在使用 UDF 时会涉及到 `USE_UDF` 权限,具备该权限的用户才被允许执行 UDF 注册、卸载和查询操作。 - -更多用户权限相关的内容,请参考 [权限管理语句](../User-Manual/Authority-Management.md##权限管理)。 - - -## 3. UDF 函数库 - -基于用户自定义函数能力,IoTDB 提供了一系列关于时序数据处理的函数,包括数据质量、数据画像、异常检测、 频域分析、数据匹配、数据修复、序列发现、机器学习等,能够满足工业领域对时序数据处理的需求。 - -可以参考 [UDF 函数库](../SQL-Manual/UDF-Libraries_timecho.md)文档,查找安装步骤及每个函数对应的注册语句,以确保正确注册所有需要的函数。 - -## 4. UDF 开发 - -### 4.1 UDF 依赖 - -如果您使用 [Maven](http://search.maven.org/) ,可以从 [Maven 库](http://search.maven.org/) 中搜索下面示例中的依赖。请注意选择和目标 IoTDB 服务器版本相同的依赖版本。 - -``` xml - - org.apache.iotdb - udf-api - 1.0.0 - provided - -``` - -### 4.2 UDTF(User Defined Timeseries Generating Function) - -编写一个 UDTF 需要继承`org.apache.iotdb.udf.api.UDTF`类,并至少实现`beforeStart`方法和一种`transform`方法。 - -#### 接口说明: - -| 接口定义 | 描述 | 是否必须 | -| :----------------------------------------------------------- | :----------------------------------------------------------- | ------------------------- | -| void validate(UDFParameterValidator validator) throws Exception | 在初始化方法`beforeStart`调用前执行,用于检测`UDFParameters`中用户输入的参数是否合法。 | 否 | -| void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception | 初始化方法,在 UDTF 处理输入数据前,调用用户自定义的初始化行为。用户每执行一次 UDTF 查询,框架就会构造一个新的 UDF 类实例,该方法在每个 UDF 类实例被初始化时调用一次。在每一个 UDF 类实例的生命周期内,该方法只会被调用一次。 | 是 | -| Object transform(Row row) throws Exception` | 这个方法由框架调用。当您在`beforeStart`中选择以`MappableRowByRowAccessStrategy`的策略消费原始数据时,可以选用该方法进行数据处理。输入参数以`Row`的形式传入,输出结果通过返回值`Object`输出。 | 所有`transform`方法四选一 | -| void transform(Column[] columns, ColumnBuilder builder) throws Exception | 这个方法由框架调用。当您在`beforeStart`中选择以`MappableRowByRowAccessStrategy`的策略消费原始数据时,可以选用该方法进行数据处理。输入参数以`Column[]`的形式传入,输出结果通过`ColumnBuilder`输出。您需要在该方法内自行调用`builder`提供的数据收集方法,以决定最终的输出数据。 | 所有`transform`方法四选一 | -| void transform(Row row, PointCollector collector) throws Exception | 
这个方法由框架调用。当您在`beforeStart`中选择以`RowByRowAccessStrategy`的策略消费原始数据时,这个数据处理方法就会被调用。输入参数以`Row`的形式传入,输出结果通过`PointCollector`输出。您需要在该方法内自行调用`collector`提供的数据收集方法,以决定最终的输出数据。 | 所有`transform`方法四选一 | -| void transform(RowWindow rowWindow, PointCollector collector) throws Exception | 这个方法由框架调用。当您在`beforeStart`中选择以`SlidingSizeWindowAccessStrategy`或者`SlidingTimeWindowAccessStrategy`的策略消费原始数据时,这个数据处理方法就会被调用。输入参数以`RowWindow`的形式传入,输出结果通过`PointCollector`输出。您需要在该方法内自行调用`collector`提供的数据收集方法,以决定最终的输出数据。 | 所有`transform`方法四选一 | -| void terminate(PointCollector collector) throws Exception | 这个方法由框架调用。该方法会在所有的`transform`调用执行完成后,在`beforeDestory`方法执行前被调用。在一个 UDF 查询过程中,该方法会且只会调用一次。您需要在该方法内自行调用`collector`提供的数据收集方法,以决定最终的输出数据。 | 否 | -| void beforeDestroy() | UDTF 的结束方法。此方法由框架调用,并且只会被调用一次,即在处理完最后一条记录之后被调用。 | 否 | - -在一个完整的 UDTF 实例生命周期中,各个方法的调用顺序如下: - -1. void validate(UDFParameterValidator validator) throws Exception -2. void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception -3. Object transform(Row row) throws Exception 或着 void transform(Column[] columns, ColumnBuilder builder) throws Exception 或者 void transform(Row row, PointCollector collector) throws Exception 或者 void transform(RowWindow rowWindow, PointCollector collector) throws Exception -4. void terminate(PointCollector collector) throws Exception -5. void beforeDestroy() - -> 注意,框架每执行一次 UDTF 查询,都会构造一个全新的 UDF 类实例,查询结束时,对应的 UDF 类实例即被销毁,因此不同 UDTF 查询(即使是在同一个 SQL 语句中)UDF 类实例内部的数据都是隔离的。您可以放心地在 UDTF 中维护一些状态数据,无需考虑并发对 UDF 类实例内部状态数据的影响。 - -#### 接口详细介绍: - -1. **void validate(UDFParameterValidator validator) throws Exception** - - `validate`方法能够对用户输入的参数进行验证。 - - 您可以在该方法中限制输入序列的数量和类型,检查用户输入的属性或者进行自定义逻辑的验证。 - -`UDFParameterValidator`的使用方法请见 [Javadoc](https://github.com/apache/iotdb/blob/rc/1.3.4-1/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/parameter/UDFParameterValidator.java)。 - -2. 
**void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception** - - `beforeStart`方法有两个作用: - 1. 帮助用户解析 SQL 语句中的 UDF 参数 - 2. 配置 UDF 运行时必要的信息,即指定 UDF 访问原始数据时采取的策略和输出结果序列的类型 - 3. 创建资源,比如建立外部链接,打开文件等 - -2.1 **UDFParameters** - -`UDFParameters`的作用是解析 SQL 语句中的 UDF 参数(SQL 中 UDF 函数名称后括号中的部分)。参数包括序列类型参数和字符串 key-value 对形式输入的属性参数。 - -示例: - -``` sql -SELECT UDF(s1, s2, 'key1'='iotdb', 'key2'='123.45') FROM root.sg.d; -``` - -用法: - -``` java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - String stringValue = parameters.getString("key1"); // iotdb - Float floatValue = parameters.getFloat("key2"); // 123.45 - Double doubleValue = parameters.getDouble("key3"); // null - int intValue = parameters.getIntOrDefault("key4", 678); // 678 - // do something - - // configurations - // ... -} -``` - -2.2 **UDTFConfigurations** - -您必须使用 `UDTFConfigurations` 指定 UDF 访问原始数据时采取的策略和输出结果序列的类型。 - -用法: - -``` java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - // parameters - // ... - - // configurations - configurations - .setAccessStrategy(new RowByRowAccessStrategy()) - .setOutputDataType(Type.INT32); -} -``` - -其中`setAccessStrategy`方法用于设定 UDF 访问原始数据时采取的策略,`setOutputDataType`用于设定输出结果序列的类型。 - - 2.2.1 **setAccessStrategy** - -注意,您在此处设定的原始数据访问策略决定了框架会调用哪一种`transform`方法 ,请实现与原始数据访问策略对应的`transform`方法。当然,您也可以根据`UDFParameters`解析出来的属性参数,动态决定设定哪一种策略,因此,实现两种`transform`方法也是被允许的。 - -下面是您可以设定的访问原始数据的策略: - -| 接口定义 | 描述 | 调用的`transform`方法 | -| ------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | -| MappableRowByRowStrategy | 自定义标量函数
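The framework calls `transform` once for each row of raw input: k input time series with 1 row of data in, 1 time series with 1 row of data out. It can be used in any clause or expression where a scalar function may appear, such as the SELECT clause and the WHERE clause. | void transform(Column[] columns, ColumnBuilder builder) throws Exception<br>Object transform(Row row) throws Exception |
-| RowByRowAccessStrategy | User-defined time series generating function that processes raw data row by row.<br>The framework calls `transform` once for each row of raw input: k input time series with 1 row of data in, 1 time series with n rows of data out.<br>With a single input series, each row is one data point of that series.<br>With multiple input series, the series are aligned by time and each row holds one data point from each series.<br>(Some columns in a row may be `null`, but never all of them.) | void transform(Row row, PointCollector collector) throws Exception |
-| SlidingTimeWindowAccessStrategy | User-defined time series generating function that processes raw data in sliding time windows.<br>The framework calls `transform` once for each input window: k input time series with m rows of data in, 1 time series with n rows of data out.<br>A window may contain multiple rows. The input series are aligned by time and each window is passed to `transform` as one input unit.<br>(A window may contain i rows. Some columns in a row may be `null`, but never all of them.) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| SlidingSizeWindowAccessStrategy | User-defined time series generating function that processes raw data in windows of a fixed row count, so every window except possibly the last contains the same number of rows.<br>The framework calls `transform` once for each input window: k input time series with m rows of data in, 1 time series with n rows of data out.<br>A window may contain multiple rows. The input series are aligned by time and each window is passed to `transform` as one input unit.<br>(A window may contain i rows. Some columns in a row may be `null`, but never all of them.) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| SessionTimeWindowAccessStrategy | User-defined time series generating function that processes raw data in session windows.<br>The framework calls `transform` once for each input window: k input time series with m rows of data in, 1 time series with n rows of data out.<br>A window may contain multiple rows. The input series are aligned by time and each window is passed to `transform` as one input unit.<br>(A window may contain i rows. Some columns in a row may be `null`, but never all of them.) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| StateWindowAccessStrategy | User-defined time series generating function that processes raw data in state windows.<br>The framework calls `transform` once for each input window: 1 input time series with m rows of data in, 1 time series with n rows of data out.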
一个窗口可能存在多行数据,目前仅支持对一个物理量也就是一列数据进行开窗。 | void transform(RowWindow rowWindow, PointCollector collector) throws Exception | - -#### 接口详情: - -- `MappableRowByRowStrategy` 和 `RowByRowAccessStrategy`的构造不需要任何参数。 - -- `SlidingTimeWindowAccessStrategy` - -开窗示意图: - - - -`SlidingTimeWindowAccessStrategy`有多种构造方法,您可以向构造方法提供 3 类参数: - -1. 时间轴显示时间窗开始和结束时间 - -时间轴显示时间窗开始和结束时间不是必须要提供的。当您不提供这类参数时,时间轴显示时间窗开始时间会被定义为整个查询结果集中最小的时间戳,时间轴显示时间窗结束时间会被定义为整个查询结果集中最大的时间戳。 - -2. 划分时间轴的时间间隔参数(必须为正数) -3. 滑动步长(不要求大于等于时间间隔,但是必须为正数) - -滑动步长参数也不是必须的。当您不提供滑动步长参数时,滑动步长会被设定为划分时间轴的时间间隔。 - -3 类参数的关系可见下图。策略的构造方法详见 [Javadoc](https://github.com/apache/iotdb/blob/rc/1.3.4-1/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/SlidingTimeWindowAccessStrategy.java)。 - - - -> 注意,最后的一些时间窗口的实际时间间隔可能小于规定的时间间隔参数。另外,可能存在某些时间窗口内数据行数量为 0 的情况,这种情况框架也会为该窗口调用一次`transform`方法。 - -- `SlidingSizeWindowAccessStrategy` - -开窗示意图: - - - -`SlidingSizeWindowAccessStrategy`有多种构造方法,您可以向构造方法提供 2 个参数: - -1. 窗口大小,即一个数据处理窗口包含的数据行数。注意,最后一些窗口的数据行数可能少于规定的数据行数。 -2. 滑动步长,即下一窗口第一个数据行与当前窗口第一个数据行间的数据行数(不要求大于等于窗口大小,但是必须为正数) - -滑动步长参数不是必须的。当您不提供滑动步长参数时,滑动步长会被设定为窗口大小。 - -- `SessionTimeWindowAccessStrategy` - -开窗示意图:**时间间隔小于等于给定的最小时间间隔 sessionGap 则分为一组。** - - - - -`SessionTimeWindowAccessStrategy`有多种构造方法,您可以向构造方法提供 2 类参数: - -1. 时间轴显示时间窗开始和结束时间。 -2. 会话窗口之间的最小时间间隔。 - -- `StateWindowAccessStrategy` - -开窗示意图:**对于数值型数据,状态差值小于等于给定的阈值 delta 则分为一组。** - - - -`StateWindowAccessStrategy`有四种构造方法: - -1. 针对数值型数据,可以提供时间轴显示时间窗开始和结束时间以及对于单个窗口内部允许变化的阈值delta。 -2. 针对文本数据以及布尔数据,可以提供时间轴显示时间窗开始和结束时间。对于这两种数据类型,单个窗口内的数据是相同的,不需要提供变化阈值。 -3. 针对数值型数据,可以只提供单个窗口内部允许变化的阈值delta,时间轴显示时间窗开始时间会被定义为整个查询结果集中最小的时间戳,时间轴显示时间窗结束时间会被定义为整个查询结果集中最大的时间戳。 -4. 
针对文本数据以及布尔数据,可以不提供任何参数,开始与结束时间戳见3中解释。 - -StateWindowAccessStrategy 目前只能接收一列输入。策略的构造方法详见 [Javadoc](https://github.com/apache/iotdb/blob/rc/1.3.4-1/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/StateWindowAccessStrategy.java)。 - - 2.2.2 **setOutputDataType** - -注意,您在此处设定的输出结果序列的类型,决定了`transform`方法中`PointCollector`实际能够接收的数据类型。`setOutputDataType`中设定的输出类型和`PointCollector`实际能够接收的数据输出类型关系如下: - -| `setOutputDataType`中设定的输出类型 | `PointCollector`实际能够接收的输出类型 | -| :---------------------------------- | :----------------------------------------------------------- | -| INT32 | int | -| INT64 | long | -| FLOAT | float | -| DOUBLE | double | -| BOOLEAN | boolean | -| TEXT | java.lang.String 和 org.apache.iotdb.udf.api.type.Binary | - -UDTF 输出序列的类型是运行时决定的。您可以根据输入序列类型动态决定输出序列类型。 - -示例: - -```java -void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception { - // do something - // ... - - configurations - .setAccessStrategy(new RowByRowAccessStrategy()) - .setOutputDataType(parameters.getDataType(0)); -} -``` - -3. 
**Object transform(Row row) throws Exception**
-
-When `beforeStart` sets the access strategy to `MappableRowByRowAccessStrategy`, you must implement either this method or `void transform(Column[] columns, ColumnBuilder builder) throws Exception` (described in the next item) and put the logic that processes the raw data in it.
-
-This method processes one row of raw data per call. The input arrives as a `Row` and the result is the return value. Each call must produce exactly one output data point for the input row, so input and output stay one-to-one. The type of the output data point must match the type set in `beforeStart`, and the timestamps of the output data points must be strictly increasing.
-
-Below is a complete UDF example that implements `Object transform(Row row) throws Exception`. It is an adder: it takes two input time series and outputs the sum of the two data points in each row.
-
-```java
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.access.Row;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Adder implements UDTF {
-  private Type dataType;
-
-  @Override
-  public void validate(UDFParameterValidator validator) throws Exception {
-    // Require exactly two INT64 input series.
-    validator
-        .validateInputSeriesNumber(2)
-        .validateInputSeriesDataType(0, Type.INT64)
-        .validateInputSeriesDataType(1, Type.INT64);
-  }
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    // The output type follows the type of the first input series.
-    dataType = parameters.getDataType(0);
-    configurations
-        .setAccessStrategy(new MappableRowByRowAccessStrategy())
-        .setOutputDataType(dataType);
-  }
-
-  @Override
-  public Object transform(Row row) throws Exception {
-    // One input row in, one value out.
-    return row.getLong(0) + row.getLong(1);
-  }
-}
-```
-
-4. 
**void transform(Column[] columns, ColumnBuilder builder) throws Exception**
-
-When `beforeStart` sets the access strategy to `MappableRowByRowAccessStrategy`, you can implement this method as the batch alternative to `Object transform(Row row) throws Exception` and put the logic that processes the raw data in it.
-
-This method processes multiple rows of raw data per call. In our performance tests, a UDTF that processes several rows at once outperformed one that processes a single row at a time. The input arrives as `Column[]` and the result is written through `ColumnBuilder`. Each call must produce exactly one output data point for every input row, so input and output stay one-to-one. The type of the output data points must match the type set in `beforeStart`, and the timestamps of the output data points must be strictly increasing.
-
-Below is a complete UDF example that implements `void transform(Column[] columns, ColumnBuilder builder) throws Exception`. It is an adder: it takes two input time series and outputs the sum of the two data points in each row.
-
-```java
-import org.apache.iotdb.tsfile.read.common.block.column.Column;
-import org.apache.iotdb.tsfile.read.common.block.column.ColumnBuilder;
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Adder implements UDTF {
-  private Type type;
-
-  @Override
-  public void validate(UDFParameterValidator validator) throws Exception {
-    // Require exactly two INT64 input series.
-    validator
-        .validateInputSeriesNumber(2)
-        .validateInputSeriesDataType(0, Type.INT64)
-        .validateInputSeriesDataType(1, Type.INT64);
-  }
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    type = parameters.getDataType(0);
-    configurations.setAccessStrategy(new MappableRowByRowAccessStrategy()).setOutputDataType(type);
-  }
-
-  @Override
-  public void transform(Column[] columns, ColumnBuilder builder) throws Exception {
-    long[] inputs1 = columns[0].getLongs();
-    long[] inputs2 = columns[1].getLongs();
-
-    // Write one output value for each input row in the batch.
-    int count = columns[0].getPositionCount();
-    for (int i = 0; i < count; i++) {
-      builder.writeLong(inputs1[i] + inputs2[i]);
-    }
-  }
-}
-```
-
-5. 
**void transform(Row row, PointCollector collector) throws Exception**
-
-When `beforeStart` sets the access strategy to `RowByRowAccessStrategy`, you must implement this method and put the logic that processes the raw data in it.
-
-This method processes one row of raw data per call. The input arrives as a `Row` and the result is written through `PointCollector`. A single call may output any number of data points. The type of the output data points must match the type set in `beforeStart`, and the timestamps of the output data points must be strictly increasing.
-
-Below is a complete UDF example that implements `void transform(Row row, PointCollector collector) throws Exception`. It is an adder: it takes two input time series and, when neither data point in a row is `null`, outputs their sum.
-
-```java
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.access.Row;
-import org.apache.iotdb.udf.api.collector.PointCollector;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Adder implements UDTF {
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    configurations
-        .setOutputDataType(Type.INT64)
-        .setAccessStrategy(new RowByRowAccessStrategy());
-  }
-
-  @Override
-  public void transform(Row row, PointCollector collector) throws Exception {
-    // Skip rows in which either input is null; such rows produce no output point.
-    if (row.isNull(0) || row.isNull(1)) {
-      return;
-    }
-    collector.putLong(row.getTime(), row.getLong(0) + row.getLong(1));
-  }
-}
-```
-
-6. 
**void transform(RowWindow rowWindow, PointCollector collector) throws Exception**
-
-When `beforeStart` sets the access strategy to `SlidingTimeWindowAccessStrategy` or `SlidingSizeWindowAccessStrategy`, you must implement this method and put the logic that processes the raw data in it.
-
-This method processes one batch of data per call, either a fixed number of rows or the rows within a fixed time interval. The container that holds such a batch is called a window. The input arrives as a `RowWindow` and the result is written through `PointCollector`. `RowWindow` gives access to the `Row`s of one batch and provides interfaces for both random access and iteration over them. A single call may output any number of data points. The type of the output data points must match the type set in `beforeStart`, and the timestamps of the output data points must be strictly increasing.
-
-Below is a complete UDF example that implements `void transform(RowWindow rowWindow, PointCollector collector) throws Exception`. It is a counter: it accepts any number of input time series and outputs the number of rows in each time window within the specified time range.
-
-```java
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.access.RowWindow;
-import org.apache.iotdb.udf.api.collector.PointCollector;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.SlidingTimeWindowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Counter implements UDTF {
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    // Window parameters are read from the key-value attributes of the query.
-    configurations
-        .setOutputDataType(Type.INT32)
-        .setAccessStrategy(new SlidingTimeWindowAccessStrategy(
-            parameters.getLong("time_interval"),
-            parameters.getLong("sliding_step"),
-            parameters.getLong("display_window_begin"),
-            parameters.getLong("display_window_end")));
-  }
-
-  @Override
-  public void transform(RowWindow rowWindow, PointCollector collector) throws Exception {
-    // Empty windows are still passed to transform; they produce no output here.
-    if (rowWindow.windowSize() != 0) {
-      collector.putInt(rowWindow.windowStartTime(), rowWindow.windowSize());
-    }
-  }
-}
-```
-
-7. 
**void terminate(PointCollector collector) throws Exception**
-
-In some scenarios a UDF can only produce its final output after it has traversed all of the raw data. The `terminate` interface supports this kind of UDF.
-
-This method is called after all `transform` calls have completed and before `beforeDestroy` is executed. You can use `transform` purely to process data and then output the result in `terminate`.
-
-The result must be written through `PointCollector`. A single `terminate` call may output any number of data points. The type of the output data points must match the type set in `beforeStart`, and the timestamps of the output data points must be strictly increasing.
-
-Below is a complete UDF example that implements `void terminate(PointCollector collector) throws Exception`. It takes one input time series of type `INT32` and outputs the point with the maximum value in that series.
-
-```java
-import java.io.IOException;
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.access.Row;
-import org.apache.iotdb.udf.api.collector.PointCollector;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Max implements UDTF {
-
-  // Timestamp and value of the current maximum; time stays null until a non-null row is seen.
-  private Long time;
-  private int value;
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    configurations
-        .setOutputDataType(Type.INT32)
-        .setAccessStrategy(new RowByRowAccessStrategy());
-  }
-
-  @Override
-  public void transform(Row row, PointCollector collector) {
-    if (row.isNull(0)) {
-      return;
-    }
-    // Only track the maximum here; nothing is output until terminate.
-    int candidateValue = row.getInt(0);
-    if (time == null || value < candidateValue) {
-      time = row.getTime();
-      value = candidateValue;
-    }
-  }
-
-  @Override
-  public void terminate(PointCollector collector) throws IOException {
-    if (time != null) {
-      collector.putInt(time, value);
-    }
-  }
-}
-```
-
-8. 
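**void beforeDestroy()** - -The termination method of a UDTF. Use it to release resources and perform similar cleanup. - -This method is called by the framework. For a UDF class instance, it is called exactly once in the lifecycle, after the last record has been processed. - -### 4.3 UDAF (User Defined Aggregation Function) - -A complete UDAF definition involves two classes: State and UDAF. - -#### State Class - -To write a State class, implement the `org.apache.iotdb.udf.api.State` interface. The table below describes the methods to implement. - -#### Interface Description: - -| Interface Definition | Description | Required | -| -------------------------------- | ------------------------------------------------------------ | -------- | -| void reset() | Resets the `State` object to its initial state. As when writing a constructor, set the initial value of every field of the `State` class in this method. | Yes | -| byte[] serialize() | Serializes the `State` to binary data. This method is used to pass `State` objects inside IoTDB. The serialization order must match the deserialization method below. | Yes | -| void deserialize(byte[] bytes) | Deserializes binary data into the `State`. This method is used to pass `State` objects inside IoTDB. The deserialization order must match the serialization method above. | Yes | - -#### Detailed Interface Introduction: - -1. **void reset()** - -This method resets the `State` to its initial state; set the initial value of every field of the `State` object in it. For optimization reasons, IoTDB reuses `State` objects internally as much as possible instead of creating a new `State` for each group, which would introduce unnecessary overhead. After a `State` has finished updating the data of one group, this method is called to reset it to the initial state so that it can process the next group. - -Taking the `State` for computing an average (that is, `avg`) as an example, you need the sum of the data, `sum`, and the number of data points, `count`, and you initialize both to 0 in `reset()`. - -```java -class AvgState implements State { - double sum; - - long count; - - @Override - public void reset() { - sum = 0; - count = 0; - } - - // other methods -} -``` - -2. 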
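**byte[] serialize()/void deserialize(byte[] bytes)** - -These two methods serialize the State to binary data and deserialize the State from binary data. IoTDB is a distributed database and passes data between nodes, so you need to write both methods to allow a State to be transferred between nodes. Note that the serialization order and the deserialization order must be the same. - -Again taking the State for computing an average (that is, avg) as an example, you may convert the content of the State to a `byte[]` array, and read it back from a `byte[]` array, in any way you like. The code below serializes and deserializes with `java.nio.ByteBuffer`: - -```java -// requires: import java.nio.ByteBuffer; -@Override -public byte[] serialize() { - ByteBuffer buffer = ByteBuffer.allocate(Double.BYTES + Long.BYTES); - buffer.putDouble(sum); - buffer.putLong(count); - - return buffer.array(); -} - -@Override -public void deserialize(byte[] bytes) { - ByteBuffer buffer = ByteBuffer.wrap(bytes); - sum = buffer.getDouble(); - count = buffer.getLong(); -} -``` - -#### UDAF Class - -To write a UDAF class, implement the `org.apache.iotdb.udf.api.UDAF` interface. The table below describes the methods to implement. - -#### Interface Description: - -| Interface Definition | Description | Required | -| ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | -| void validate(UDFParameterValidator validator) throws Exception | Executed before the initialization method `beforeStart`; checks whether the parameters supplied by the user in `UDFParameters` are valid. Same as `validate` in UDTF. | No | -| void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception | Initialization method; runs user-defined initialization before the UDAF processes input data. Unlike UDTF, the configuration here is of type `UDAFConfigurations`. | Yes | -| State createState() | Creates the `State` object. Usually it is enough to call the default constructor and then adjust the default initial values as needed. | Yes | -| void addInput(State state, Column[] columns, BitMap bitMap) | Updates the `State` object in batch from the incoming `Column[]`. Note that the last column, `columns[columns.length - 1]`, is always the time column. The `BitMap` indicates which data has already been filtered out; when writing this method you must check manually whether each row has been filtered. | Yes | -| void combineState(State state, State rhs) | Merges the `rhs` state into the `state` state. In a distributed scenario the data of one group may be spread across nodes; IoTDB generates a `State` object for the partial data on each node and then calls this method to merge them into a complete `State`. | Yes | -| void outputFinal(State state, ResultValue resultValue) | Computes the final aggregation result from the data in the `State`. Note that, by the semantics of aggregation, each group can output only one value. | Yes | -| void beforeDestroy() | The termination method of the UDAF. It is called by the framework exactly once, after the last record has been processed. | No | - -Within the lifecycle of a complete UDAF instance, the methods are called in the following order: - -1. State createState() -2. 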
void validate(UDFParameterValidator validator) throws Exception -3. void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception -4. void addInput(State state, Column[] columns, BitMap bitMap) -5. void combineState(State state, State rhs) -6. void outputFinal(State state, ResultValue resultValue) -7. void beforeDestroy() - -As with UDTF, each time the framework executes a UDAF query it constructs a brand-new UDF class instance, and that instance is destroyed when the query ends. The data inside the UDF class instances of different UDAF queries (even within the same SQL statement) is therefore isolated. You can safely maintain state data in a UDAF without worrying about concurrent access to the internal state of the UDF class instance. - -#### Detailed Interface Introduction: - -1. **void validate(UDFParameterValidator validator) throws Exception** - -As in UDTF, the `validate` method validates the parameters supplied by the user. - -In this method you can restrict the number and types of the input series, check the attributes supplied by the user, or run custom validation logic. - -2. **void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception** - - The `beforeStart` method serves the same purposes as in UDTF: - - 1. Helps the user parse the UDF parameters in the SQL statement - 2. Configures the information the UDF needs at runtime (for a UDAF, currently only the type of the output result) - 3. Creates resources, such as opening external connections or files. - -For the role of the `UDFParameters` type, see the description above. - -2.2 **UDAFConfigurations** - -The difference from UDTF is that UDAF uses `UDAFConfigurations` as the type of the `configuration` object. - -At present, this class only supports setting the type of the output data. - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // parameters - // ... - - // configurations - configurations - .setOutputDataType(Type.INT32); -} -``` - -The output type set in `setOutputDataType` corresponds to the output type that `ResultValue` can actually accept as follows: - -| Output type set in `setOutputDataType` | Output type that `ResultValue` can actually accept | -| :---------------------------------- | :------------------------------------- | -| INT32 | int | -| INT64 | long | -| FLOAT | float | -| DOUBLE | double | -| BOOLEAN | boolean | -| TEXT | org.apache.iotdb.udf.api.type.Binary | - -The type of the UDAF output series is also decided at runtime. You can decide the output series type dynamically from the input series type. - -Example: - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // do something - // ... - - configurations - .setOutputDataType(parameters.getDataType(0)); -} -``` - -3. 
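**State createState()** - -Creates and initializes the `State` for the UDAF. Because of a limitation of the Java language, you can only call the default constructor of the `State` class. The default constructor assigns a default initial value to every field of the class; if that value does not meet your needs, initialize the fields manually in this method. - -Below is an example with manual initialization. Suppose you want to implement a product aggregation function: the initial value of the `State` should be 1, but the default constructor initializes it to 0, so you need to initialize the `State` manually after calling the default constructor: - -```java -public State createState() { - MultiplyState state = new MultiplyState(); - state.result = 1; - return state; -} -``` - -4. **void addInput(State state, Column[] columns, BitMap bitMap)** - -This method updates the `State` object from the raw input data. For performance reasons, and to align with IoTDB's vectorized query engine, the raw input is no longer a single data point but an array of columns, `Column[]`. Note that the last column (that is, `columns[columns.length - 1]`) is always the time column, so a UDAF can also behave differently depending on time. - -Because the input is a set of columns rather than a single data point, you need to filter part of the data in the columns manually. This is the purpose of the third parameter, `BitMap`: it identifies which rows in these columns have been filtered out, and filtered rows must never be taken into account. - -Below is an `addInput()` example that counts data points (that is, count). It shows how to use the `BitMap` to ignore rows that have already been filtered out. Note that, again because of a limitation of the Java language, you need to cast the `State` type defined in the interface to your own `State` type at the start of the method; otherwise the `State` object cannot be used afterwards. - -```java -public void addInput(State state, Column[] columns, BitMap bitMap) { - CountState countState = (CountState) state; - - int count = columns[0].getPositionCount(); - for (int i = 0; i < count; i++) { - // Skip rows that are not marked in the BitMap - if (bitMap != null && !bitMap.isMarked(i)) { - continue; - } - if (!columns[0].isNull(i)) { - countState.count++; - } - } -} -``` - -5. **void combineState(State state, State rhs)** - -This method merges two `State` objects; more precisely, it uses the second `State` object to update the first. IoTDB is a distributed database, and the data of one group may be spread across several nodes. For performance reasons, IoTDB first aggregates the partial data on each node into a `State` and then merges the `State` objects from different nodes that belong to the same group. That merge is what `combineState` does. - -Below is a `combineState()` example for computing an average (that is, avg). As with `addInput`, you need to cast both `State` objects at the start. Also note that the content of the second `State` is used to update the value of the first. - -```java -public void combineState(State state, State rhs) { - AvgState avgState = (AvgState) state; - AvgState avgRhs = (AvgState) rhs; - - avgState.count += avgRhs.count; - avgState.sum += avgRhs.sum; -} -``` - -6. 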
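**void outputFinal(State state, ResultValue resultValue)** - -This method computes the final result from the `State`. Read the fields of the `State`, compute the final result, and set it on the `ResultValue` object. IoTDB calls this method once at the end for each group. Note that, by the semantics of aggregation, the final result can only be a single value. - -Below is an `outputFinal` example, again for computing an average (that is, avg). Besides the cast at the start, it shows how the `ResultValue` object is used: the final result is set through `setXXX`, where `XXX` is the type name. - -```java -public void outputFinal(State state, ResultValue resultValue) { - AvgState avgState = (AvgState) state; - - if (avgState.count != 0) { - resultValue.setDouble(avgState.sum / avgState.count); - } else { - resultValue.setNull(); - } -} -``` - -7. **void beforeDestroy()** - -The termination method of a UDAF. Use it to release resources and perform similar cleanup. - -This method is called by the framework. For a UDF class instance, it is called exactly once in the lifecycle, after the last record has been processed. - -### 4.4 Complete Maven Project Example - -If you use [Maven](http://search.maven.org/), you can refer to our example project **udf-example**, available [here](https://github.com/apache/iotdb/tree/master/example/udf). - - -## 5. Contributing Generic Built-in UDFs to IoTDB - -This part describes how external users can contribute the UDFs they have written to the IoTDB community. - -### 5.1 Prerequisites - -1. The UDF is generic. - - Generic here means that the UDF can be widely used in certain business scenarios. In other words, the UDF has reuse value and can be used directly by other users in the community. - - If you are not sure whether your UDF is generic, send an email to `dev@iotdb.apache.org` or open an ISSUE to start a discussion. - -2. The UDF has been tested and runs correctly in the user's production environment. - -### 5.2 Contribution Checklist - -1. Source code of the UDF -2. Test cases for the UDF -3. Usage instructions for the UDF - -### 5.3 Contribution Content - -#### 5.3.1 Source Code - -1. Create the main UDF class and its helper classes in `iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin`. -2. 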
在`iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin/BuiltinTimeSeriesGeneratingFunction.java`中注册编写的 UDF。 - -#### 5.3.2 测试用例 - -至少需要为贡献的 UDF 编写集成测试。 - -可以在`integration-test/src/test/java/org/apache/iotdb/db/it/udf`中为贡献的 UDF 新增一个测试类进行测试。 - -#### 5.3.3 使用说明 - -使用说明需要包含:UDF 的名称、UDF 的作用、执行函数必须的属性参数、函数的适用的场景以及使用示例等。 - -使用说明需包含中英文两个版本。应分别在 `docs/zh/UserGuide/Operation Manual/DML Data Manipulation Language.md` 和 `docs/UserGuide/Operation Manual/DML Data Manipulation Language.md` 中新增使用说明。 - -#### 5.3.4 提交 PR - -当准备好源代码、测试用例和使用说明后,就可以将 UDF 贡献到 IoTDB 社区了。在 [Github](https://github.com/apache/iotdb) 上面提交 Pull Request (PR) 即可。具体提交方式见:[贡献指南](https://iotdb.apache.org/zh/Community/Development-Guide.html)。 - -当 PR 评审通过并被合并后, UDF 就已经贡献给 IoTDB 社区了! - -## 6. 常见问题 - -1. 如何修改已经注册的 UDF? - -答:假设 UDF 的名称为`example`,全类名为`org.apache.iotdb.udf.UDTFExample`,由`example.jar`引入 - -1. 首先卸载已经注册的`example`函数,执行`DROP FUNCTION example` -2. 删除 `iotdb-server-1.0.0-all-bin/ext/udf` 目录下的`example.jar` -3. 修改`org.apache.iotdb.udf.UDTFExample`中的逻辑,重新打包,JAR 包的名字可以仍然为`example.jar` -4. 将新的 JAR 包上传至 `iotdb-server-1.0.0-all-bin/ext/udf` 目录下 -5. 装载新的 UDF,执行`CREATE FUNCTION example AS "org.apache.iotdb.udf.UDTFExample"` \ No newline at end of file diff --git a/src/zh/UserGuide/V1.3.x/User-Manual/White-List_timecho.md b/src/zh/UserGuide/V1.3.x/User-Manual/White-List_timecho.md deleted file mode 100644 index d69a563fc..000000000 --- a/src/zh/UserGuide/V1.3.x/User-Manual/White-List_timecho.md +++ /dev/null @@ -1,70 +0,0 @@ - - - -# 白名单 - -**功能描述** - -允许哪些客户端地址能连接 IoTDB - -**配置文件** - -conf/iotdb-system.properties - -conf/white.list - -**配置项** - -iotdb-system.properties: - -决定是否开启白名单功能 - -```YAML -# 是否开启白名单功能 -enable_white_list=true -``` - -white.list: - -决定哪些IP地址能够连接IoTDB - -```YAML -# 支持注释 -# 支持精确匹配,每行一个ip -10.2.3.4 - -# 支持*通配符,每行一个ip -10.*.1.3 -10.100.0.* -``` - -**注意事项** - -1. 如果通过session客户端取消本身的白名单,当前连接并不会立即断开。在下次创建连接的时候拒绝。 -2. 
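If white.list is modified directly, the change takes effect within one minute. If it is modified through the session client, the change takes effect immediately and updates both the in-memory value and the white.list file on disk. -3. If the white list feature is enabled but there is no white.list file, the DB service starts successfully but rejects all connections. -4. If the white.list file is deleted while the DB service is running, all connections are rejected after at most one minute. -5. The configuration that enables or disables the white list feature can be hot-loaded. -6. Modifying the white list through the Java native interface requires the root user; modifications by non-root users are rejected. The modified content must be valid, otherwise a StatementExecutionException is thrown. - -![White list](/img/%E7%99%BD%E5%90%8D%E5%8D%95.png) - diff --git a/src/zh/UserGuide/dev-1.3/AI-capability/AINode_timecho.md b/src/zh/UserGuide/dev-1.3/AI-capability/AINode_timecho.md deleted file mode 100644 index 0e3b6ee97..000000000 --- a/src/zh/UserGuide/dev-1.3/AI-capability/AINode_timecho.md +++ /dev/null @@ -1,655 +0,0 @@ - - -# AINode - -AINode is a native IoTDB node that supports the registration, management, and invocation of time series related models. It includes industry-leading self-developed time series large models, such as the Timer series models developed by Tsinghua University. Models can be invoked with standard SQL statements, enabling millisecond-level real-time inference on time series data and supporting application scenarios such as time series trend prediction, missing value filling, and anomaly detection. - -The system architecture is shown in the following figure: -::: center - -::: -The responsibilities of the three kinds of nodes are as follows: - -- **ConfigNode**: Stores and manages model metadata; manages distributed nodes. -- **DataNode**: Receives and parses user SQL requests; stores time series data; performs data preprocessing computation. -- **AINode**: Imports and creates model files and performs model inference. - -## 1. Advantages and Features - -Compared with building a machine learning service separately, it has the following advantages: - -- **Simple and Easy to Use**: No Python or Java programming is needed; the whole process of machine learning model management and inference can be completed with SQL statements. For example, a model is created with the CREATE MODEL statement and used for inference with the CALL INFERENCE(...) statement. - -- **Avoid Data Migration**: IoTDB-native machine learning applies data stored in IoTDB directly to model inference without moving the data to a separate machine learning service platform, which speeds up data processing, improves security, and reduces cost. - -![](/img/h1.png) - -- **Built-in Advanced Algorithms**: Supports industry-leading machine learning analysis algorithms that cover typical time series analysis tasks, giving the time series database native data analysis capabilities. For example: - - **Time Series Forecasting**: Learns change patterns from past time series and outputs the most likely prediction of the future series given past observations. - - **Anomaly Detection for Time Series**: Detects and identifies outliers in given time series data to help discover abnormal behavior in the series. - - **Time Series Annotation**: Adds extra information or labels to each data point or to specific time ranges, such as event occurrences, anomaly points, or trend changes, so that the data can be better understood and analyzed. - - -## 2. Basic Concepts - -- **Model**: A machine learning model that takes time series data as input and outputs the result or decision of an analysis task. The model is the basic management unit of AINode, which supports creating (registering), deleting, querying, and using (inference with) models. -- **Create**: Loads an externally designed or trained model file or algorithm into AINode, where it is managed and used uniformly by IoTDB. -- **Inference**: The process of using a created model to complete, on specified time series data, the time series analysis task the model is suited for. -- **Built-in**: AINode ships with machine learning algorithms or self-developed models for common time series analysis scenarios (for example, forecasting and anomaly detection). - -::: center - -::: - -## 3. Installation and Deployment - -For AINode deployment, see the [Deployment Guide](../Deployment-and-Maintenance/AINode_Deployment_timecho.md#AINode-部署) section. - -## 4. 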
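Usage Guide - -AINode provides creation and deletion workflows for deep learning models related to time series data. Built-in models do not need to be created or deleted; they can be used directly, and the built-in model instance created for an inference is destroyed automatically once the inference completes. - -### 4.1 Registering a Model - -A trained deep learning model can be registered by specifying the dimensions of its input and output, and can then be used for inference. - -Models that meet the following requirements can be registered in AINode: - 1. Models trained with a PyTorch version supported by AINode (2.1.0 or 2.2.0); avoid features introduced after version 2.2.0. - 2. AINode supports models stored with PyTorch JIT; the model file must contain both the parameters and the structure of the model. - 3. The model input may contain one or more columns; if there are several, they must correspond to the model's capability and to the model configuration file. - 4. The input and output dimensions of the model must be defined explicitly in the `config.yaml` configuration file. When the model is used, the input and output dimensions defined in `config.yaml` must be followed strictly; a mismatch between the number of input or output columns and the configuration file causes an error. - -The SQL syntax for registering a model is defined below. - -```SQL -create model <model_name> using uri <uri> -``` - -The parameters in the SQL have the following meanings: - -- model_name: The globally unique identifier of the model; it cannot be duplicated. The model name has the following constraints: - - - Allowed characters are [ 0-9 a-z A-Z _ ] (letters, digits, underscore) - - Length is limited to 2-64 characters - - Case sensitive - -- uri: The resource path of the files used for registration. The path should contain **the model weight file model.pt and the model metadata description file config.yaml** - - - Model weight file: the weight file obtained after training the deep learning model; currently the .pt file produced by PyTorch training is supported - - - yaml metadata description file: the parameters related to the model structure that must be provided at registration; it must contain the input and output dimensions used for inference: - - - | **Parameter** | **Description** | **Example** | - | ------------ | ---------------------------- | -------- | - | input_shape | Rows and columns of the model input, used for inference | [96,2] | - | output_shape | Rows and columns of the model output, used for inference | [48,2] | - - - In addition to the shapes used for inference, the data types of the model input and output can be specified: - - - | **Parameter** | **Description** | **Example** | - | ----------- | ------------------ | --------------------- | - | input_type | Data types of the model input | ['float32','float32'] | - | output_type | Data types of the model output | ['float32','float32'] | - - - Furthermore, notes can be specified for display in model management: - - - | **Parameter** | **Description** | **Example** | - | ---------- | ---------------------------------------------- | ------------------------------------------- | - | attributes | Optional; user-defined notes about the model, used for display | 'model_type': 'dlinear','kernel_size': '25' | - - -Besides registering local model files, a model can also be registered from a remote resource path given as a URI, using an open-source model repository (for example, HuggingFace). - -#### Example - -The current example folder contains model.pt and config.yaml. model.pt was obtained by training, and the content of config.yaml is as follows: - -```YAML -configs: - # Required - input_shape: [96, 2] # The model accepts data of 96 rows x 2 columns - output_shape: [48, 2] # The model outputs data of 48 rows x 2 columns - - # Optional; defaults to float32 for all columns, with the column count taken from the shape - input_type: ["int64","int64"] # Data types of the input; must match the number of input columns - output_type: ["text","int64"] # Data types of the output; must match the number of output columns - -attributes: # Optional; user-defined notes - 'model_type': 'dlinear' - 'kernel_size': 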
'25' -``` - -指定该文件夹作为加载路径就可以注册该模型 - -```SQL -IoTDB> create model dlinear_example using uri "file://./example" -``` - -也可以从huggingFace上下载对应的模型文件进行注册 - -```SQL -IoTDB> create model dlinear_example using uri "https://huggingface.com/IoTDBML/dlinear/" -``` - -SQL执行后会异步进行注册的流程,可以通过模型展示查看模型的注册状态(见模型展示章节),注册成功的耗时主要受到模型文件大小的影响。 - -模型注册完成后,就可以通过使用正常查询的方式调用具体函数,进行模型推理。 - -### 4.2 查看模型 - -注册成功的模型可以通过show models指令查询模型的具体信息。其SQL定义如下: - -```SQL -show models - -show models -``` - -除了直接展示所有模型的信息外,可以指定model id来查看某一具体模型的信息。模型展示的结果中包含如下信息: - -| **ModelId** | **State** | **Configs** | **Attributes** | -| ------------ | ------------------------------------- | ---------------------------------------------- | -------------- | -| 模型唯一标识 | 模型注册状态(LOADING,ACTIVE,DROPPING) | InputShape, outputShapeInputTypes, outputTypes | 模型备注信息 | - -其中,State用于展示当前模型注册的状态,包含以下三个阶段 - -- **LOADING**:已经在configNode中添加对应的模型元信息,正将模型文件传输到AINode节点上 -- **ACTIVE:** 模型已经设置完成,模型处于可用状态 -- **DROPPING**:模型删除中,正在从configNode以及AINode处删除模型相关信息 -- **UNAVAILABLE**: 模型创建失败,可以通过drop model删除创建失败的model_name。 - -#### 示例 - -```SQL -IoTDB> show models - - -+---------------------+--------------------------+-----------+----------------------------+-----------------------+ -| ModelId| ModelType| State| Configs| Notes| -+---------------------+--------------------------+-----------+----------------------------+-----------------------+ -| dlinear_example| USER_DEFINED| ACTIVE| inputShape:[96,2]| | -| | | | outputShape:[48,2]| | -| | | | inputDataType:[float,float]| | -| | | |outputDataType:[float,float]| | -| _STLForecaster| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB| -| _NaiveForecaster| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB| -| _ARIMA| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB| -|_ExponentialSmoothing| BUILT_IN_FORECAST| ACTIVE| |Built-in model in IoTDB| -| _GaussianHMM|BUILT_IN_ANOMALY_DETECTION| ACTIVE| |Built-in model in IoTDB| -| _GMMHMM|BUILT_IN_ANOMALY_DETECTION| ACTIVE| |Built-in model in IoTDB| -| 
_Stray|BUILT_IN_ANOMALY_DETECTION| ACTIVE| |Built-in model in IoTDB| -+---------------------+--------------------------+-----------+------------------------------------------------------------+-----------------------+ -``` - -我们前面已经注册了对应的模型,可以通过对应的指定查看模型状态,active表明模型注册成功,可用于推理。 - -### 4.3 删除模型 - -对于注册成功的模型,用户可以通过SQL进行删除。该操作除了删除configNode上的元信息外,还会删除所有AINode下的相关模型文件。其SQL如下: - -```SQL -drop model -``` - -需要指定已经成功注册的模型model_name来删除对应的模型。由于模型删除涉及多个节点上的数据删除,操作不会立即完成,此时模型的状态为DROPPING,该状态的模型不能用于模型推理。 - -### 4.4 使用内置模型推理 - -SQL语法如下: - - -```SQL -call inference(,sql[,=]) -``` - -内置模型推理无需注册流程,通过call关键字,调用inference函数就可以使用模型的推理功能,其对应的参数介绍如下: - -- **built_in_model_name:** 内置模型名称 -- **parameterName**:参数名 -- **parameterValue**:参数值 - -#### 内置模型及参数说明 - -目前已内置如下机器学习模型,具体参数说明请参考以下链接。 - -| 模型 | built_in_model_name | 任务类型 | 参数说明 | -| -------------------- | --------------------- | -------- | ------------------------------------------------------------ | -| Arima | _Arima | 预测 | [Arima参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.arima.ARIMA.html?highlight=Arima) | -| STLForecaster | _STLForecaster | 预测 | [STLForecaster参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.trend.STLForecaster.html#sktime.forecasting.trend.STLForecaster) | -| NaiveForecaster | _NaiveForecaster | 预测 | [NaiveForecaster参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.naive.NaiveForecaster.html#naiveforecaster) | -| ExponentialSmoothing | _ExponentialSmoothing | 预测 | [ExponentialSmoothing参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.exp_smoothing.ExponentialSmoothing.html) | -| GaussianHMM | _GaussianHMM | 标注 | [GaussianHMM参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.annotation.hmm_learn.gaussian.GaussianHMM.html) | -| GMMHMM | _GMMHMM | 标注 | 
[GMMHMM参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.annotation.hmm_learn.gmm.GMMHMM.html) | -| Stray | _Stray | 异常检测 | [Stray参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.annotation.stray.STRAY.html) | - -#### 示例 - -下面是使用内置模型推理的一个操作示例,使用内置的Stray模型进行异常检测算法,输入为`[144,1]`,输出为`[144,1]`,我们通过SQL使用其进行推理。 - -```SQL -IoTDB> select * from root.eg.airline -+-----------------------------+------------------+ -| Time|root.eg.airline.s0| -+-----------------------------+------------------+ -|1949-01-31T00:00:00.000+08:00| 224.0| -|1949-02-28T00:00:00.000+08:00| 118.0| -|1949-03-31T00:00:00.000+08:00| 132.0| -|1949-04-30T00:00:00.000+08:00| 129.0| -...... -|1960-09-30T00:00:00.000+08:00| 508.0| -|1960-10-31T00:00:00.000+08:00| 461.0| -|1960-11-30T00:00:00.000+08:00| 390.0| -|1960-12-31T00:00:00.000+08:00| 432.0| -+-----------------------------+------------------+ -Total line number = 144 - -IoTDB> call inference(_Stray, "select s0 from root.eg.airline", k=2) -+-------+ -|output0| -+-------+ -| 0| -| 0| -| 0| -| 0| -...... 
-| 1| -| 1| -| 0| -| 0| -| 0| -| 0| -+-------+ -Total line number = 144 -``` - -### 4.5 使用深度学习模型推理 - -SQL语法如下: - -```SQL -call inference(,sql[,window=]) - - -window_function: - head(window_size) - tail(window_size) - count(window_size,sliding_step) -``` - -在完成模型的注册后,通过call关键字,调用inference函数就可以使用模型的推理功能,其对应的参数介绍如下: - -- **model_name**: 对应一个已经注册的模型 -- **sql**:sql查询语句,查询的结果作为模型的输入进行模型推理。查询的结果中行列的维度需要与具体模型config中指定的大小相匹配。(这里的sql不建议使用`SELECT *`子句,因为在IoTDB中,`*`并不会对列进行排序,因此列的顺序是未定义的,可以使用`SELECT s0,s1`的方式确保列的顺序符合模型输入的预期) -- **window_function**: 推理过程中可以使用的窗口函数,目前提供三种类型的窗口函数用于辅助模型推理: - - **head(window_size)**: 获取数据中最前的window_size个点用于模型推理,该窗口可用于数据裁剪 - ![](/img/AINode-call1.png) - - - **tail(window_size)**:获取数据中最后的window_size个点用于模型推,该窗口可用于数据裁剪 - ![](/img/AINode-call2.png) - - - **count(window_size, sliding_step)**:基于点数的滑动窗口,每个窗口的数据会分别通过模型进行推理,如下图示例所示,window_size为2的窗口函数将输入数据集分为三个窗口,每个窗口分别进行推理运算生成结果。该窗口可用于连续推理 - ![](/img/AINode-call3.png) - -**说明1: window可以用来解决sql查询结果和模型的输入行数要求不一致时的问题,对行进行裁剪。需要注意的是,当列数不匹配或是行数直接少于模型需求时,推理无法进行,会返回错误信息。** - -**说明2: 在深度学习应用中,经常将时间戳衍生特征(数据中的时间列)作为生成式任务的协变量,一同输入到模型中以提升模型的效果,但是在模型的输出结果中一般不包含时间列。为了保证实现的通用性,模型推理结果只对应模型的真实输出,如果模型不输出时间列,则结果中不会包含。** - - -#### 示例 - -下面是使用深度学习模型推理的一个操作示例,针对上面提到的输入为`[96,2]`,输出为`[48,2]`的`dlinear`预测模型,我们通过SQL使用其进行推理。 - -```Shell -IoTDB> select s1,s2 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... 
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**") -+--------------------------------------------+-----------------------------+ -| _result_0| _result_1| -+--------------------------------------------+-----------------------------+ -| 0.726302981376648| 1.6549958229064941| -| 0.7354921698570251| 1.6482787370681763| -| 0.7238251566886902| 1.6278168201446533| -...... -| 0.7692174911499023| 1.654654049873352| -| 0.7685555815696716| 1.6625318765640259| -| 0.7856493592262268| 1.6508299350738525| -+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### 使用tail/head窗口函数的示例 - -当数据量不定且想要取96行最新数据用于推理时,可以使用对应的窗口函数tail。head函数的用法与其类似,不同点在于其取的是最早的96个点。 - -```Shell -IoTDB> select s1,s2 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1988-01-01T00:00:00.000+08:00| 0.7355| 1.211| -...... -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... 
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 996 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**",window=tail(96)) -+--------------------------------------------+-----------------------------+ -| _result_0| _result_1| -+--------------------------------------------+-----------------------------+ -| 0.726302981376648| 1.6549958229064941| -| 0.7354921698570251| 1.6482787370681763| -| 0.7238251566886902| 1.6278168201446533| -...... -| 0.7692174911499023| 1.654654049873352| -| 0.7685555815696716| 1.6625318765640259| -| 0.7856493592262268| 1.6508299350738525| -+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### 使用count窗口函数的示例 - -该窗口主要用于计算式任务,当任务对应的模型一次只能处理固定行数据而最终想要的确实多组预测结果时,使用该窗口函数可以使用点数滑动窗口进行连续推理。假设我们现在有一个异常检测模型anomaly_example(input: [24,2], output[1,1]),对每24行数据会生成一个0/1的标签,其使用示例如下: - -```Shell -IoTDB> select s1,s2 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... 
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(anomaly_example,"select s0,s1 from root.**",window=count(24,24)) -+-------------------------+ -| _result_0| -+-------------------------+ -| 0| -| 1| -| 1| -| 0| -+-------------------------+ -Total line number = 4 -``` - -其中结果集中每行的标签对应每24行数据为一组,输入该异常检测模型后的输出。 - - -### 4.6 时序大模型导入步骤 - -AINode 目前支持多种时序大模型,部署使用请参考[时序大模型](../AI-capability/TimeSeries-Large-Model) - -## 5. 权限管理 - -使用AINode相关的功能时,可以使用IoTDB本身的鉴权去做一个权限管理,用户只有在具备 USE_MODEL 权限时,才可以使用模型管理的相关功能。当使用推理功能时,用户需要有访问输入模型的SQL对应的源序列的权限。 - -| 权限名称 | 权限范围 | 管理员用户(默认ROOT) | 普通用户 | 路径相关 | -| --------- | --------------------------------- | ---------------------- | -------- | -------- | -| USE_MODEL | create model / show models / drop model | √ | √ | x | -| READ_DATA | call inference | √ | √ | √ | - -## 6. 
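Practical Cases - -### 6.1 Power Load Forecasting - -In some industrial scenarios there is a need to forecast power load. The forecast can be used to optimize power supply, save energy and resources, support planning and expansion, and improve the reliability of the power system. - -The test set data of ETTh1 that we use is [ETTh1](/img/ETTh1.csv). - - -It contains power data collected at 1-hour intervals. Each record consists of loads and oil temperature: High UseFul Load, High UseLess Load, Middle UseFul Load, Middle UseLess Load, Low UseFul Load, Low UseLess Load, Oil Temperature. - -On this dataset, the model inference capability of AINode can use the relationship between past values of the high, middle, and low loads and the oil temperature at the corresponding timestamps to predict the oil temperature over a future period, supporting automatic regulation and monitoring of power grid transformers. - -#### Step 1: Data Import - -Users can import the ETT dataset into IoTDB with `import-csv.sh` in the tools folder - -```Bash -bash ./import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f ../../ETTh1.csv -``` - -#### Step 2: Model Import - -Enter the following SQL in iotdb-cli to pull an already trained model from huggingface and register it for later inference. - -```SQL -create model dlinear using uri 'https://huggingface.co/hvlgo/dlinear/tree/main' -``` - -This model is trained on DLinear, a relatively lightweight deep model. It captures as much as possible of the trend within each series and of the relationships between variables at a relatively fast inference speed, which makes it better suited to fast real-time prediction than deeper models. - -#### Step 3: Model Inference - -```Shell -IoTDB> select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth LIMIT 96 -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -| Time|root.eg.etth.s0|root.eg.etth.s1|root.eg.etth.s2|root.eg.etth.s3|root.eg.etth.s4|root.eg.etth.s5|root.eg.etth.s6| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -|2017-10-20T00:00:00.000+08:00| 10.449| 3.885| 8.706| 2.025| 2.041| 0.944| 8.864| -|2017-10-20T01:00:00.000+08:00| 11.119| 3.952| 8.813| 2.31| 2.071| 1.005| 8.442| -|2017-10-20T02:00:00.000+08:00| 9.511| 2.88| 7.533| 1.564| 1.949| 0.883| 8.16| -|2017-10-20T03:00:00.000+08:00| 9.645| 2.21| 7.249| 1.066| 1.828| 0.914| 7.949| -...... 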
-|2017-10-23T20:00:00.000+08:00| 8.105| 0.938| 4.371| -0.569| 3.533| 1.279| 9.708| -|2017-10-23T21:00:00.000+08:00| 7.167| 1.206| 4.087| -0.462| 3.107| 1.432| 8.723| -|2017-10-23T22:00:00.000+08:00| 7.1| 1.34| 4.015| -0.32| 2.772| 1.31| 8.864| -|2017-10-23T23:00:00.000+08:00| 9.176| 2.746| 7.107| 1.635| 2.65| 1.097| 9.004| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example, "select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth", window=head(96)) -+-----------+----------+----------+------------+---------+----------+----------+ -| output0| output1| output2| output3| output4| output5| output6| -+-----------+----------+----------+------------+---------+----------+----------+ -| 10.319546| 3.1450553| 7.877341| 1.5723765|2.7303758| 1.1362307| 8.867775| -| 10.443649| 3.3286757| 7.8593454| 1.7675098| 2.560634| 1.1177158| 8.920919| -| 10.883752| 3.2341104| 8.47036| 1.6116762|2.4874182| 1.1760603| 8.798939| -...... 
-| 8.0115595| 1.2995274| 6.9900327|-0.098746896| 3.04923| 1.176214| 9.548782| -| 8.612427| 2.5036244| 5.6790237| 0.66474205|2.8870275| 1.2051733| 9.330128| -| 10.096699| 3.399722| 6.9909| 1.7478468|2.7642853| 1.1119363| 9.541455| -+-----------+----------+----------+------------+---------+----------+----------+ -Total line number = 48 -``` - -我们将对油温的预测的结果和真实结果进行对比,可以得到以下的图像。 - -图中10/24 00:00之前的数据为输入模型的过去数据,10/24 00:00后的蓝色线条为模型给出的油温预测结果,而红色为数据集中实际的油温数据(用于进行对比)。 - -![](/img/AINode-analysis1.png) - -可以看到,我们使用了过去96个小时(4天)的六个负载信息和对应时间油温的关系,基于之前学习到的序列间相互关系对未来48个小时(2天)的油温这一数据的可能变化进行了建模,可以看到可视化后预测曲线与实际结果在趋势上保持了较高程度的一致性。 - -### 6.2 功率预测 - -变电站需要对电流、电压、功率等数据进行电力监控,用于检测潜在的电网问题、识别电力系统中的故障、有效管理电网负载以及分析电力系统的性能和趋势等。 - -我们利用某变电站中的电流、电压和功率等数据构成了真实场景下的数据集。该数据集包括变电站近四个月时间跨度,每5 - 6s 采集一次的 A相电压、B相电压、C相电压等数据。 - -测试集数据内容为[data](/img/data.csv)。 - -在该数据集上,IoTDB-ML的模型推理功能可以通过以往A相电压,B相电压和C相电压的数值和对应时间戳,预测未来一段时间内的C相电压,赋能变电站的监视管理。 - -#### 步骤一:数据导入 - -用户可以使用tools文件夹中的`import-csv.sh` 导入数据集 - -```Bash -bash ./import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f ../../data.csv -``` - -#### 步骤二:模型导入 - -我们可以在iotdb-cli 中选择内置模型或已经注册好的模型用于后续的推理。 - -我们采用内置模型STLForecaster进行预测,STLForecaster 是一个基于 statsmodels 库中 STL 实现的时间序列预测方法。 - -#### 步骤三:模型推理 - -```Shell -IoTDB> select * from root.eg.voltage limit 96 -+-----------------------------+------------------+------------------+------------------+ -| Time|root.eg.voltage.s0|root.eg.voltage.s1|root.eg.voltage.s2| -+-----------------------------+------------------+------------------+------------------+ -|2023-02-14T20:38:32.000+08:00| 2038.0| 2028.0| 2041.0| -|2023-02-14T20:38:38.000+08:00| 2014.0| 2005.0| 2018.0| -|2023-02-14T20:38:44.000+08:00| 2014.0| 2005.0| 2018.0| -...... 
-|2023-02-14T20:47:52.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:47:57.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:48:03.000+08:00| 2024.0| 2016.0| 2027.0| -+-----------------------------+------------------+------------------+------------------+ -Total line number = 96 - -IoTDB> call inference(_STLForecaster, "select s0,s1,s2 from root.eg.voltage", window=head(96),predict_length=48) -+---------+---------+---------+ -| output0| output1| output2| -+---------+---------+---------+ -|2026.3601|2018.2953|2029.4257| -|2019.1538|2011.4361|2022.0888| -|2025.5074|2017.4522|2028.5199| -...... - -|2022.2336|2015.0290|2025.1023| -|2015.7241|2008.8975|2018.5085| -|2022.0777|2014.9136|2024.9396| -|2015.5682|2008.7821|2018.3458| -+---------+---------+---------+ -Total line number = 48 -``` -我们将对C相电压的预测的结果和真实结果进行对比,可以得到以下的图像。 - -图中 02/14 20:48 之前的数据为输入模型的过去数据, 02/14 20:48 后的蓝色线条为模型给出的C相电压预测结果,而红色为数据集中实际的C相电压数据(用于进行对比)。 - -![](/img/AINode-analysis2.png) - -可以看到,我们使用了过去10分钟的电压的数据,基于之前学习到的序列间相互关系对未来5分钟的C相电压这一数据的可能变化进行了建模,可以看到可视化后预测曲线与实际结果在趋势上保持了一定的同步性。 - -### 6.3 异常检测 - -在民航交通运输业,存在着对乘机旅客数量进行异常检测的需求。异常检测的结果可用于指导调整航班的调度,以使得企业获得更大效益。 - -Airline Passengers一个时间序列数据集,该数据集记录了1949年至1960年期间国际航空乘客数量,间隔一个月进行一次采样。该数据集共含一条时间序列。数据集为[airline](/img/airline.csv)。 -在该数据集上,IoTDB-ML的模型推理功能可以通过捕捉序列的变化规律以对序列时间点进行异常检测,赋能交通运输业。 - -#### 步骤一:数据导入 - -用户可以使用tools文件夹中的`import-csv.sh` 导入数据集 - -```Bash -bash ./import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f ../../data.csv -``` - -#### 步骤二:模型推理 - -IoTDB内置有部分可以直接使用的机器学习算法,使用其中的异常检测算法进行预测的样例如下: - -```Shell -IoTDB> select * from root.eg.airline -+-----------------------------+------------------+ -| Time|root.eg.airline.s0| -+-----------------------------+------------------+ -|1949-01-31T00:00:00.000+08:00| 224.0| -|1949-02-28T00:00:00.000+08:00| 118.0| -|1949-03-31T00:00:00.000+08:00| 132.0| -|1949-04-30T00:00:00.000+08:00| 129.0| -...... 
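-|1960-09-30T00:00:00.000+08:00| 508.0| -|1960-10-31T00:00:00.000+08:00| 461.0| -|1960-11-30T00:00:00.000+08:00| 390.0| -|1960-12-31T00:00:00.000+08:00| 432.0| -+-----------------------------+------------------+ -Total line number = 144 - -IoTDB> call inference(_Stray, "select s0 from root.eg.airline", k=2) -+-------+ -|output0| -+-------+ -| 0| -| 0| -| 0| -| 0| -...... -| 1| -| 1| -| 0| -| 0| -| 0| -| 0| -+-------+ -Total line number = 144 -``` - -Plotting the points detected as anomalous gives the figure below. The blue curve is the original time series, and the time points marked with red dots are those the algorithm detected as anomalies. - -![](/img/s6.png) - -As shown, the Stray model models the variation of the input series and successfully detects the time points where anomalies occur. \ No newline at end of file diff --git a/src/zh/UserGuide/dev-1.3/API/Programming-OPC-DA_timecho.md b/src/zh/UserGuide/dev-1.3/API/Programming-OPC-DA_timecho.md deleted file mode 100644 index 435f28c80..000000000 --- a/src/zh/UserGuide/dev-1.3/API/Programming-OPC-DA_timecho.md +++ /dev/null @@ -1,208 +0,0 @@ - - -# OPC DA Protocol - -## 1. OPC DA - -OPC DA (OPC Data Access) is a communication protocol standard in industrial automation and a core part of classic OPC (OLE for Process Control) technology. Its main goal is real-time data exchange between industrial devices and software (such as SCADA, HMI, and databases) in Windows environments. OPC DA is built on COM / DCOM. It is a lightweight protocol with two roles, server and client. - -* **Server:** Can be regarded as a pool of items that stores the latest data and status of each instance. Items can only be managed on the server side; clients can only read and write data and have no permission to modify metadata. - -![](/img/opc-da-1-1.png) - -* **Client:** After connecting to the server, the client must define a group (the group is relevant only to the client) and create items with the same names as those on the server. It can then read and write the items it has created. - -![](/img/opc-da-1-2.png) - -## 2. 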
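OPC DA Sink - -The OPC DA Sink provided by IoTDB (supported in V1.3.5.2 and later V1.x versions) is a plugin that pushes tree-model data to a local COM server. It encapsulates the OPC DA interface specification and its inherent complexity, which significantly simplifies integration. The data flow of the OPC DA Sink is shown below. - -![](/img/opc-da-2-1.png) - -### 2.1 SQL Syntax - -```SQL ----- Note: replace the clsID here with your own clsID -create pipe opc ( - 'sink'='opc-da-sink', - --- 'opcda.progid'='opcserversim.Instance.1' - 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C' -); -``` - -### 2.2 Parameters - -| **Parameter** | **Description** | **Value range** | **Required** | -| ------------------- | --------------------------------------------------------------------- | ----------------------- | ------------------ | -| sink | OPC DA SINK | String: opc-da-sink | Required | -| sink.opcda.clsid | ClsID of the OPC Server (a unique identifier string). Using the clsID rather than the progID is recommended. | String | Either this or progID | -| sink.opcda.progid | ProgID of the OPC Server. If a clsID is available, prefer the clsID. | String | Either this or clsID | - -### 2.3 Mapping Rules - -In use, IoTDB pushes the latest data of its tree model to the server. The itemID of each value is the full path of the time series in the tree model, such as `root.a.b.c.d`. Note that under the OPC DA standard a client cannot create items directly on the server side, so the server must create an item for each IoTDB time series in advance, using the itemID and the corresponding data type. - -* The data type mapping is shown in the table below. - -| IoTDB | OPC-DA Server | -| ----------- | ----------------------------------------------------------- | -| INT32 | VT\_I4 | -| INT64 | VT\_I8 | -| FLOAT | VT\_R4 | -| DOUBLE | VT\_R8 | -| TEXT | VT\_BSTR | -| BOOLEAN | VT\_BOOL | -| DATE | VT\_DATE | -| TIMESTAMP | VT\_DATE | -| BLOB | VT\_BSTR (Variant does not support VT\_BLOB, so VT\_BSTR is used instead) | -| STRING | VT\_BSTR | - -### 2.4 Common Error Codes - -| Symbol | Error code | Description | -| ----------------------------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------ | -| OPC\_E\_BADTYPE | 0xC0040004 | The server cannot convert the data between the specified format / requested data type and the canonical data type. That is, the data type on the server does not match the type registered in IoTDB. | -| OPC\_E\_UNKNOWNITEMID| 0xC0040007 | The item ID is not defined in the server address space (on add or validate), or no longer exists in the server address space (on read or write). That is, the IoTDB measurement has no corresponding itemID on the server. | -| OPC\_E\_INVALIDITEMID | 0xC0040008 | The itemID does not conform to the server's syntax. | -| REGDB\_E\_CLASSNOTREG | 0x80040154 | Class not registered | -| RPC\_S\_SERVER\_UNAVAILABLE | 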
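0x800706ba | RPC service unavailable | -| DISP\_E\_OVERFLOW | 0x8002000a | The value exceeds the maximum of the type | -| DISP\_E\_TYPEMISMATCH | 0x80020005 | Type mismatch | - -### 2.5 Limitations - -* Only COM is supported, and it can only be used on Windows. -* After a restart, a small amount of old data may be pushed, but new data will eventually be pushed. -* Currently only tree-model data is supported. - -## 3. Usage Steps -### 3.1 Prerequisites -1. Windows environment, version >= 8 -2. IoTDB is installed and running normally -3. An OPC DA Server is installed - -* Simple OPC Server Simulator is used as the example - -![](/img/opc-da-3-1.png) - -* Double-click an item to modify its name (itemID), data, data type, and other information. -* Right-click an item to delete it, update its value, or create a new item. - -![](/img/opc-da-3-2.png) - -4. An OPC DA Client is installed - -* The Quick Client of Kepware KEPServerEX is used as the example -* In Kepware, open the OPC DA Client as follows - -![](/img/opc-da-3-3.png) - -![](/img/opc-da-3-4.png) - - -### 3.2 Configuration Changes - -Modify the server configuration so that the IoTDB writing client and the Kepware reading client do not connect to two different instances, which would make debugging impossible. - -* First press Win+R, enter `dcomcnfg` in the Run dialog, and open the DCOM component configuration: - -![](/img/opc-da-3-5.png) - -* Go to Component Services -> Computers -> My Computer -> DCOM Config, find `AGG Software Simple OPC Server Simulator`, right-click it, and choose "Properties": - -![](/img/opc-da-3-6.png) - -* On the `Identity` tab, change the user account to `The interactive user`. Do not select `The launching user` here, otherwise the two clients may each start a separate server instance. - -![](/img/opc-da-3-7.png) - -### 3.3 Obtaining the clsID -1. Method 1: obtain it from the DCOM configuration - -* Press Win+R, enter `dcomcnfg` in the Run dialog, and open the DCOM component configuration; -* Go to Component Services -> Computers -> My Computer -> DCOM Config, find `AGG Software Simple OPC Server Simulator`, right-click it, and choose "Properties". -* The `General` tab shows the application's clsID, which is used later to connect the opc-da-sink. Note that the braces are not included. - -![](/img/opc-da-3-8.png) - -2. Method 2: the clsID and progID can also be obtained directly in the server - -* Click `Help` > `Show OPC Server Info` - -![](/img/opc-da-3-9.png) - -* They are shown in the pop-up window - -![](/img/opc-da-3-10.png) - -### 3.4 Writing Data -#### 3.4.1 DA Server -1. Create a new item in the DA Server whose name and type match the item to be written from IoTDB - -![](/img/opc-da-3-11.png) - -2. Connect to the server in Kepware: - -![](/img/opc-da-3-12.png) - -3. Right-click the server and create a new group with any name: - -![](/img/opc-da-3-13.png) - -![](/img/opc-da-3-14.png) - -4. Right-click and create a new item, using the name created earlier - -![](/img/opc-da-3-15.png) - -![](/img/opc-da-3-16.png) - -![](/img/opc-da-3-17.png) - -#### 3.4.2 IoTDB -1. Start IoTDB -2. 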
Create the pipe - -```SQL -create pipe opc ('sink'='opc-da-sink', 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C') -``` - -* Note: if creation fails with ` Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 1107: Failed to connect to server, error code: 0x80040154`, refer to the following solution: https://opcexpert.com/support/0x80040154-class-not-registered/ - -3. Create the time series (this step can be skipped if automatic metadata creation is enabled) - -```SQL -create timeseries root.a.b.c.r string; -``` - -4. Insert data - -```SQL -insert into root.a.b.c (time, r) values(10000, "SomeString") -``` - -### 3.5 Verifying the Data - -Check the data in the Quick Client; it should have been updated. - -![](/img/opc-da-3-18.png) \ No newline at end of file diff --git a/src/zh/UserGuide/dev-1.3/API/Programming-OPC-UA_timecho.md b/src/zh/UserGuide/dev-1.3/API/Programming-OPC-UA_timecho.md deleted file mode 100644 index 6661ae168..000000000 --- a/src/zh/UserGuide/dev-1.3/API/Programming-OPC-UA_timecho.md +++ /dev/null @@ -1,265 +0,0 @@ - - -# OPC UA Protocol - -## Subscribing to Data over OPC UA - -This feature lets users subscribe to data from IoTDB over the OPC UA protocol. Two communication modes are supported for subscription: Client/Server and Pub/Sub. - -Note: this feature does not collect data from an external OPC Server and write it into IoTDB. - -![](/img/opc-ua-new-1.png) - -## Starting the OPC Service - -### Syntax - -The syntax for starting the OPC UA protocol is: - -```SQL -create pipe p1 - with source (...) - with processor (...) - with sink ('sink' = 'opc-ua-sink', - 'sink.opcua.tcp.port' = '12686', - 'sink.opcua.https.port' = '8443', - 'sink.user' = 'root', - 'sink.password' = 'root', - 'sink.opcua.security.dir' = '...' 
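- ) -``` - -### Parameters - -| **Parameter** | **Description** | **Value range** | **Required** | **Default** | -| ---------------------------------- | ------------------------------ | -------------------------------- | ------------ | ------------------------------------------------------------ | -| sink | OPC UA SINK | String: opc-ua-sink | Required | | -| sink.opcua.model | Mode used by OPC UA | String: client-server / pub-sub | Optional | pub-sub | -| sink.opcua.tcp.port | TCP port of OPC UA | Integer: [0, 65536] | Optional | 12686 | -| sink.opcua.https.port | HTTPS port of OPC UA | Integer: [0, 65536] | Optional | 8443 | -| sink.opcua.security.dir | Directory of the OPC UA keys and certificates | String: Path, absolute or relative | Optional | The opc_security folder /`` under the conf directory of the corresponding IoTDB DataNode. If there is no IoTDB conf directory (for example, when the DataNode is started from IDEA), the iotdb_opc_security folder /`` under the user's home directory | -| sink.opcua.enable-anonymous-access | Whether OPC UA allows anonymous access | Boolean | Optional | true | -| sink.user | User, here the user allowed by OPC UA | String | Optional | root | -| sink.password | Password, here the password allowed by OPC UA | String | Optional | root | - -### Example - -```SQL -create pipe p1 - with sink ('sink' = 'opc-ua-sink', - 'sink.user' = 'root', - 'sink.password' = 'root'); -start pipe p1; -``` - -### Limitations - -1. After the protocol is started, data must be written before a connection can be established, and only data written after the connection is established can be subscribed to. -2. Standalone mode is recommended. In distributed mode, each IoTDB DataNode serves data as an independent OPC Server and must be subscribed to separately. - -## Examples of the Two Communication Modes -### Client / Server Mode - -In this mode, the IoTDB stream processing engine establishes a connection with the OPC UA server through the OPC UA Sink. The OPC UA server maintains data in its Address Space, and IoTDB can request and retrieve that data. Other OPC UA clients can also access the data on the server. - -* Features: - * OPC UA organizes the device information received from the Sink into folders under the Objects folder, following the tree model. - * Each measurement is recorded as a variable node that holds the latest value currently in the database. - * OPC UA cannot delete data or change data type settings. - -#### Preparation - -1. UAExpert is used as the example client here. Download the UAExpert client: https://www.unified-automation.com/downloads/opc-ua-clients.html - -2. Install UAExpert and fill in your own certificate information. - -#### Quick Start -##### When the None Security Policy Is Supported -1. Use the following SQL to create and start the OPC UA Sink in client-server mode. For the detailed syntax, see the syntax section under the OPC service startup above. - -```SQL -create pipe p1 with sink ('sink'='opc-ua-sink', 'opcua.security-policy'='AES128_SHA256_RSAOAEP, AES256_SHA256_RSAPSS, BASIC256SHA256, NONE'); -``` -Note: in V1.3.7.2 and later, `None` is no longer supported by default. To use it, it must be enabled manually through the `security-policy` parameter, as shown above. - -2. Write some data. - -```SQL -insert into root.test.db(time, s2) values(now(), 2) -``` - -This assumes that automatic metadata creation is enabled. - -3. Configure the IoTDB connection in UAExpert. For the password, enter the value set for sink.password in the parameters above (the default password root is used here): - -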

- -
- -
- -
- -4. After trusting the server's certificate, the written data appears under the Objects folder on the left. - -
- -
- -
- -
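- -Note: because the `SecurityPolicy` configured here is `None`, mutual certificate trust is not required. In production, connecting with a `SecurityPolicy` other than `None` is recommended, which requires mutual certificate trust. The procedure is described in the `Pub/Sub` mode section below: in the certificate directories of the `Client/Server` (search the printed logs for the keyword keyStore), move the rejected contents into `trusted/certs`. Follow this order: connect, move the files in the server directory, connect, move the files in the client directory, connect. - -5. Drag a node from the left pane to the middle to display its latest value: - -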
- -
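- -##### When the None Security Policy Is Not Supported -1. Use the following SQL to create and start the OPC UA service. - ```SQL - create pipe p1 with sink ('sink'='opc-ua-sink'); - ``` - Note: starting with V1.3.7.2, `OpcUaSink` no longer supports the `None` mode by default, for security reasons. - -2. Write some data. - ```SQL - insert into root.test.db(time, s2) values(now(), 2); - ``` - -3. Configure the IoTDB connection in UAExpert: - - The `URL` cannot be accessed directly; endpoints must be discovered through `Discover` - - The client first sends a `GetEndpoints` request with the `None` policy to obtain the endpoint list - - It then selects the matching encrypted endpoint according to the configured `Basic256Sha256 + SignAndEncrypt` and establishes an encrypted connection - -![](/img/opc-ua-un-none-1.png) - -4. Configure the username and password as above. After clicking the relevant connection mode (`Sign` / `Sign & Encrypt`), if the following dialog appears, click `Ignore` to connect directly. - -![](/img/opc-ua-un-none-2.png) - -### Pub / Sub Mode - -In this mode, the IoTDB stream processing engine sends data change events to the OPC UA server through the OPC UA Sink. These events are published to the server's message queue and managed through Event Nodes. Other OPC UA clients can subscribe to these event nodes to be notified when data changes. - -- Features: - - - Each measurement is wrapped by OPC UA as an event node (EventNode). - - The relevant fields and their meanings are as follows: - - | Field | Meaning | Type (Milo) | Example | - | :--------- | :--------------- | :------------ | :-------------------- | - | Time | Timestamp | DateTime | 1698907326198 | - | SourceName | Full path of the measurement | String | root.test.opc.sensor0 | - | SourceNode | Data type of the measurement | NodeId | Int32 | - | Message | Data | LocalizedText | 3.0 | - - - An Event is sent only to clients that are already listening; if no client is connected, the Event is ignored. - - If data is deleted, that information cannot be pushed to clients. - - -#### Preparation - -The code is located in the [opc-ua-sink folder](https://github.com/apache/iotdb/tree/rc/1.3.5/example/pipe-opc-ua-sink/src/main/java/org/apache/iotdb/opcua) of the iotdb-example package. - -The code contains: - -- The main class (ClientTest) -- The client certificate logic (IoTDBKeyStoreLoaderClient) -- The client configuration and startup logic (ClientExampleRunner) -- The parent class of ClientTest (ClientExample) - -#### Quick Start - -The steps are: - -1. Start IoTDB and write some data. - -```SQL -insert into root.a.b(time, c, d) values(now(), 1, 2); -``` - - This assumes that automatic metadata creation is enabled. - -2. Use the following SQL to create and start the OPC UA Sink in Pub-Sub mode. For the detailed syntax, see the syntax section under the OPC service startup above. - -```SQL -create pipe p1 with sink ('sink'='opc-ua-sink', - 'sink.opcua.model'='pub-sub'); -start pipe p1; -``` - - At this point, a directory for the OPC certificates has been created under the server's conf directory. - -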
- -
- -3. Run the Client directly to connect. At this point the Client certificate is rejected by the server. - -
- -
- -4. Go to the server's sink.opcua.security.dir directory and open the rejected directory under pki. The Client's certificate should already have been generated there. - -
- -
- -5. Move (do not copy) the client's certificate into the certs folder of the trusted directory at the same level. - -
- -
- -6. Start the Client connection again. This time the server's certificate should be rejected by the Client. - -
- -
- -7. Go to the client's /client/security directory, open the rejected directory under pki, and move (do not copy) the server's certificate into the trusted directory. - -
- -
- -8. Start the Client. Mutual trust is now established and the Client can connect to the server. - -9. Write data to the server. The Client now prints the data it receives. - -
- -
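- - -### Notes - -1. **Standalone vs. cluster**: the 1C1D standalone version is recommended. If the cluster has multiple DataNodes, data may be sent from different DataNodes, so the full data cannot be received from a single one. - -2. **No need to touch the certificates in the root directory**: during certificate operations, there is no need to touch the `iotdb-server.pfx` certificate in the IoTDB security root directory or the `example-client.pfx` certificate in the client security directory. When the Client and Server connect to each other, each sends the certificate in its root directory to the other side. If the other side sees this certificate for the first time, it places it in the reject dir; if the certificate is in trusted/certs, the other side is trusted. - -3. **Java 17+ is recommended**: on JVM 8, key length restrictions may apply and cause an Illegal key size error. For certain versions (such as jdk.1.8u151+), you can add `Security.`*`setProperty`*`("crypto.policy", "unlimited");` to the create client logic in ClientExampleRunner. Alternatively, download the unlimited-strength policy files `local_policy.jar` and `US_export_policy.jar` and use them to replace the files under `JDK/jre/lib/security`. Download URL: https://www.oracle.com/java/technologies/javase-jce8-downloads.html. diff --git a/src/zh/UserGuide/dev-1.3/Background-knowledge/Cluster-Concept_timecho.md b/src/zh/UserGuide/dev-1.3/Background-knowledge/Cluster-Concept_timecho.md deleted file mode 100644 index 44739754b..000000000 --- a/src/zh/UserGuide/dev-1.3/Background-knowledge/Cluster-Concept_timecho.md +++ /dev/null @@ -1,116 +0,0 @@ - - -# Common Concepts - -## Data Model Concepts - -| Concept | Meaning | -|-----------------|----------------------------------------------------------------------------------------------------------------------------| -| Data model | Tree model. The managed objects are devices and measurements; data is organized by hierarchical paths, and one path corresponds to one measurement of one device. | -| Metadata (Schema) | The data model information of the database, that is, the tree structure, including definitions such as measurement names and data types. | -| Device | Corresponds to a physical device in a real scenario and usually contains multiple measurements. | -| Measurement (Timeseries) | Also known as: physical quantity, time series, timeline, point, signal, metric, measured value, and so on. It is a time series formed by data points arranged in increasing timestamp order. A measurement usually represents one collection point that periodically samples a physical quantity of its environment. | -| Encoding | Encoding is a compression technique that represents data in binary form to improve storage efficiency. IoTDB supports multiple encoding methods for different data types. For details, see [Encoding and Compression](../Technical-Insider/Encoding-and-Compression.md) | -| Compression | After encoding the data, IoTDB applies compression to further compress the binary data and improve storage efficiency. IoTDB supports multiple compression methods. For details, see [Encoding and Compression](../Technical-Insider/Encoding-and-Compression.md) | - -## Distributed Concepts - -The figure below shows a common IoTDB 3C3D (3 ConfigNodes, 3 DataNodes) cluster deployment: - - - -An IoTDB cluster involves the following common concepts: - -- Nodes (ConfigNode, DataNode, AINode) -- Regions (SchemaRegion, DataRegion) -- Multiple replicas - -These concepts are introduced below. - - -### Nodes - -IoTDB 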
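clusters consist of three kinds of nodes (processes): ConfigNode (management node), DataNode (data node), and AINode (analysis node), as follows: - -- ConfigNode: Manages the cluster's node information, configuration, user permissions, metadata, and partition information, and is responsible for scheduling distributed operations and load balancing. All ConfigNodes are full backups of one another, as shown by ConfigNode-1, ConfigNode-2, and ConfigNode-3 in the figure above. -- DataNode: Serves client requests and is responsible for data storage and computation, as shown by DataNode-1, DataNode-2, and DataNode-3 in the figure above. -- AINode: Provides machine learning capabilities. It supports registering trained machine learning models and invoking them for inference through SQL. It currently has built-in self-developed time series large models and common machine learning algorithms (such as forecasting and anomaly detection). - -### Data Partitioning - -In IoTDB, both metadata and data are divided into small partitions called Regions, which are managed by the DataNodes of the cluster. - -- SchemaRegion: A metadata partition that manages the metadata of a subset of devices and measurements. SchemaRegions with the same RegionID on different DataNodes are replicas of one another. In the figure above, SchemaRegion-1 has three replicas, placed on DataNode-1, DataNode-2, and DataNode-3. -- DataRegion: A data partition that manages the data of a subset of devices over a period of time. DataRegions with the same RegionID on different DataNodes are replicas of one another. In the figure above, DataRegion-2 has two replicas, placed on DataNode-1 and DataNode-2. -- For the partitioning algorithm, see [Data Partitioning](../Technical-Insider/Cluster-data-partitioning.md) - -### Multiple Replicas - -The number of replicas for data and metadata is configurable. The recommended settings for each deployment mode are listed below; multiple replicas provide high availability. - -| Category | Configuration item | Recommended for standalone | Recommended for cluster | -| :----- | :------------------------ | :----------- | :----------- | -| Metadata | schema_replication_factor | 1 | 3 | -| Data | data_replication_factor | 1 | 2 | - - -## Deployment Concepts - -IoTDB has three running modes: standalone mode, cluster mode, and dual-active mode. - -### Standalone Mode - -A standalone IoTDB instance consists of 1 ConfigNode and 1 DataNode, i.e. 1C1D. - -- **Features**: Easy for developers to install and deploy, with low deployment and maintenance costs and simple operation. -- **Use cases**: Scenarios with limited resources or low high-availability requirements, such as edge servers. -- **Deployment**: [Standalone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -### Dual-Active Mode - -Dual-active deployment is a TimechoDB Enterprise Edition feature in which two independent instances synchronize bidirectionally and can both serve requests at the same time. When one instance goes down and restarts, the other instance resumes transfer from the breakpoint to send it the missing data. - -> An IoTDB dual-active deployment usually consists of 2 standalone nodes, i.e. 2 sets of 1C1D. Each instance can also be a cluster. - -- **Features**: The high-availability solution with the lowest resource usage. -- **Use cases**: Limited resources (only two servers) but high availability is desired. -- **Deployment**: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -### Cluster Mode - -An IoTDB cluster instance consists of 3 ConfigNodes and no fewer than 3 DataNodes, usually 3 DataNodes, i.e. 3C3D. When some nodes fail, the remaining nodes can still serve requests, which keeps the database service highly available, and database performance can be improved by adding nodes. - -- **Features**: High availability and high scalability; system performance can be improved by adding DataNodes. -- **Use cases**: Enterprise application scenarios that require high availability and reliability. -- **Deployment**: [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - -### Feature Summary - -| Dimension | Standalone mode | Dual-active mode | Cluster mode | -| ------------ | ---------------------------- | 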
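------------------------ | ------------------------ | -| Use cases | Edge deployment, low high-availability requirements | High-availability services, disaster recovery, etc. | High-availability services, disaster recovery, etc. | -| Machines required | 1 | 2 | ≥3 | -| Reliability | Cannot tolerate a single point of failure | High, tolerates a single point of failure | High, tolerates a single point of failure | -| Scalability | DataNodes can be added to improve performance | Each instance can be scaled as needed | DataNodes can be added to improve performance | -| Performance | Scales with the number of DataNodes | Same as the performance of one of the instances | Scales with the number of DataNodes | - -- Standalone mode and cluster mode have similar deployment steps (adding ConfigNodes and DataNodes one by one); they differ only in the replica count and the minimum number of nodes needed to provide service. \ No newline at end of file diff --git a/src/zh/UserGuide/dev-1.3/Basic-Concept/Operate-Metadata_timecho.md b/src/zh/UserGuide/dev-1.3/Basic-Concept/Operate-Metadata_timecho.md deleted file mode 100644 index f9871e3fc..000000000 --- a/src/zh/UserGuide/dev-1.3/Basic-Concept/Operate-Metadata_timecho.md +++ /dev/null @@ -1,1366 +0,0 @@ - - -# Timeseries Management -## Database Management - -A database in IoTDB can be regarded as the counterpart of a Database in a relational database. - -### Creating a Database - -We can create databases according to the storage model, as shown below: - -``` -IoTDB > CREATE DATABASE root.ln -``` - -Note that creating a single database is recommended. - -Neither the ancestors nor the descendants of a database node can themselves be set as a database. For example, when the databases `root.ln` and `root.sgcc` already exist, creating a `root.ln.wf01` database is not allowed. The system returns an error, as shown below: - -``` -IoTDB> CREATE DATABASE root.ln.wf01 -Msg: 300: root.ln has already been created as database. -``` -Naming rules for database node names: -1. A node name may consist of **Chinese and English characters, digits, underscores (\_), periods (.), and backticks (\`)**. -2. If the node name is one of the following, the entire name must be wrapped in **backticks (\`)**. - - Purely numeric (such as 12345) - - Contains special characters (such as . or \_) and may cause ambiguity (such as db.01, \_temp) -3. 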
Special handling of backticks: - If the node name itself needs to contain a backtick (\`), use **two backticks (\`\`)** to represent one backtick. For example, to name a node db123\` (which itself contains one backtick), write \`db123\`\`\`. - -Also note that when deployed on Windows or macOS, database names are case-insensitive. For example, creating both `root.ln` and `root.LN` is not allowed. - -### Viewing Databases - -After a database is created, we can use the [SHOW DATABASES](../SQL-Manual/SQL-Manual.md#查看数据库) statement and [SHOW DATABASES \](../SQL-Manual/SQL-Manual.md#查看数据库) to view databases. The SQL statements are as follows: - -``` -IoTDB> show databases -IoTDB> show databases root.* -IoTDB> show databases root.** -``` - -The result is: - -``` -+-------------+----+-------------------------+-----------------------+-----------------------+ -| database| ttl|schema_replication_factor|data_replication_factor|time_partition_interval| -+-------------+----+-------------------------+-----------------------+-----------------------+ -| root.sgcc|null| 2| 2| 604800| -| root.ln|null| 2| 2| 604800| -+-------------+----+-------------------------+-----------------------+-----------------------+ -Total line number = 2 -It costs 0.060s -``` - -### Deleting Databases - -Use the `DELETE DATABASE ` statement to delete all databases matching the path pattern. Note that the data in those databases is deleted as well. - -``` -IoTDB > DELETE DATABASE root.ln -IoTDB > DELETE DATABASE root.sgcc -// Delete all data, time series, and databases -IoTDB > DELETE DATABASE root.** -``` - -### Counting Databases - -Use the `COUNT DATABASES ` statement to count databases. A `PathPattern` may be specified to count only the databases that match it. - -The SQL statements are as follows: - -``` -IoTDB> show databases -IoTDB> count databases -IoTDB> count databases root.* -IoTDB> count databases root.sgcc.* -IoTDB> count databases root.sgcc -``` - -The result is: - -``` -+-------------+ -| database| -+-------------+ -| root.sgcc| -| root.turbine| -| root.ln| -+-------------+ -Total line number = 3 -It costs 0.003s - -+-------------+ -| Database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.003s - -+-------------+ -| Database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| Database| -+-------------+ -| 0| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| database| 
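-+-------------+ -| 1| -+-------------+ -Total line number = 1 -It costs 0.002s -``` - -### Data Retention Time (TTL) - -IoTDB supports setting a data retention time (TTL) at the device level, which lets the system delete old data automatically and periodically to control disk usage and maintain high query performance and low memory usage. TTL is in milliseconds by default. Expired data cannot be queried and cannot be written, but physical deletion is deferred until compaction. Note that changing a TTL may briefly change which data is queryable, and if a TTL is increased or unset, data that was previously invisible because of the TTL may reappear. - -Notes: -- TTL is set in milliseconds and is not affected by the time precision in the configuration file. -- Changing a TTL may affect which data is queryable. -- The system eventually removes expired data, but with a delay. -- Whether data is expired is determined by the timestamp of the data point, not by the write time. -- The system supports at most 1000 TTL rules. Once the limit is reached, some rules must be deleted before new ones can be set. - -#### TTL Path Rules -The path can only be a prefix path (that is, it cannot contain \* in the middle and must end with \*\*), which matches devices. A path without wildcards may also be given to specify a concrete database or device. When the path contains no \*, the system checks whether it matches a database; if it does, both path and path.\*\* are set. -Note: setting a device TTL does not check whether the metadata exists, so a TTL can be set for a device that does not exist. -``` -Valid paths: -root.** -root.db.** -root.db.group1.** -root.db -root.db.group1.d1 - -Invalid paths: -root.*.db -root.**.db.* -root.db.* -``` -#### TTL Precedence Rules -When several TTL rules apply to a device, the more precise and longer rule takes precedence. For example, for the device "root.bj.hd.dist001.turbine001", the rule "root.bj.hd.dist001.turbine001" takes precedence over "root.bj.hd.dist001.\*\*", which in turn takes precedence over "root.bj.hd.\*\*". -#### Setting TTL -A set ttl operation can be understood as defining one TTL rule. For example, set ttl to root.sg.group1.\*\* attaches the TTL to every device that matches this path pattern. An unset ttl operation removes the TTL for the corresponding path pattern; if no such TTL exists, nothing happens. To make the TTL infinite, use the INF keyword. -The SQL statement for setting a TTL is: -``` -set ttl to pathPattern 360000; -``` -pathPattern is a prefix path, that is, it cannot contain \* in the middle and must end with \*\*. -pathPattern matches the corresponding devices. For compatibility with the older SQL syntax, if the pathPattern entered by the user matches a db, the prefix path is automatically expanded to path.\*\*. -For example, set ttl to root.sg 360000 is automatically converted to set ttl to root.sg.\*\* 360000, and the converted statement sets the TTL for all devices under root.sg. -If the pathPattern does not match a db, this logic does not apply. -For example, with set ttl to root.sg.group 360000, root.sg.group does not match a db, so it is not expanded to root.sg.group.\*\*. A concrete device without \* may also be specified. -#### Unsetting TTL - -The SQL statements for unsetting a TTL are as follows: - -``` -IoTDB> unset ttl from root.ln -``` - -After the TTL is unset, all data under the `root.ln` path is retained. -``` -IoTDB> unset ttl from root.sgcc.** -``` - -This unsets all TTLs under the `root.sgcc` path. -``` -IoTDB> unset ttl from root.** -``` - -This unsets all TTLs. - -New syntax -``` -IoTDB> unset ttl from root.** -``` - -Old syntax -``` -IoTDB> unset ttl to root.** -``` 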
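-The new and old syntaxes are functionally identical and both are supported; the new syntax simply uses more conventional wording. -#### Showing TTL - -The SQL statements for showing TTLs are as follows: -show all ttl - -``` -IoTDB> SHOW ALL TTL -+--------------+--------+ -| path| TTL| -| root.**|55555555| -| root.sg2.a.**|44440000| -+--------------+--------+ -``` - -show ttl on pathPattern -``` -IoTDB> SHOW TTL ON root.db.**; -+--------------+--------+ -| path| TTL| -| root.db.**|55555555| -| root.db.a.**|44440000| -+--------------+--------+ -``` -The SHOW ALL TTL example lists all TTLs. -The SHOW TTL ON pathPattern example shows the TTLs for the specified path. - -Showing the TTL of devices: -``` -IoTDB> show devices -+---------------+---------+---------+ -| Device|IsAligned| TTL| -+---------------+---------+---------+ -|root.sg.device1| false| 36000000| -|root.sg.device2| true| INF| -+---------------+---------+---------+ -``` -Every device always has a TTL, so it can never be null. INF means infinite. - - -### Configuring Heterogeneous Databases (Advanced) - -Users who are familiar with IoTDB metadata modeling can configure heterogeneous databases in IoTDB to meet different production needs. - -The currently supported heterogeneous database parameters are: - -| Parameter | Type | Description | -|---------------------------|---------|---------------------------| -| TTL | Long | TTL of the database | -| SCHEMA_REPLICATION_FACTOR | Integer | Number of metadata replicas of the database | -| DATA_REPLICATION_FACTOR | Integer | Number of data replicas of the database | -| SCHEMA_REGION_GROUP_NUM | Integer | Number of SchemaRegionGroups of the database | -| DATA_REGION_GROUP_NUM | Integer | Number of DataRegionGroups of the database | - -Keep the following three points in mind when configuring heterogeneous parameters: -+ TTL and TIME_PARTITION_INTERVAL must be positive integers. -+ SCHEMA_REPLICATION_FACTOR and DATA_REPLICATION_FACTOR must be less than or equal to the number of deployed DataNodes. -+ The effect of SCHEMA_REGION_GROUP_NUM and DATA_REGION_GROUP_NUM depends on the -`schema_region_group_extension_policy` and `data_region_group_extension_policy` parameters in the iotdb-common.properties configuration file. Taking DATA_REGION_GROUP_NUM as an example: -If `data_region_group_extension_policy=CUSTOM` is set, DATA_REGION_GROUP_NUM is the number of DataRegionGroups owned by the Database; -If `data_region_group_extension_policy=AUTO` is set, DATA_REGION_GROUP_NUM is the lower bound of the DataRegionGroup quota of the Database, that is, the Database owns at least this many DataRegionGroups once it starts receiving writes. - -Any heterogeneous parameter can be set when creating a Database, and some of them can be adjusted while a standalone or distributed IoTDB is running. - -#### Setting Heterogeneous Parameters When Creating a Database - -Any of the heterogeneous parameters above can be set when creating a Database. The SQL statement is as follows: - -``` -CREATE 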
DATABASE prefixPath (WITH databaseAttributeClause (COMMA? databaseAttributeClause)*)? -``` - -For example: -``` -CREATE DATABASE root.db WITH SCHEMA_REPLICATION_FACTOR=1, DATA_REPLICATION_FACTOR=3, SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -#### Adjusting Heterogeneous Parameters at Runtime - -Some heterogeneous parameters can be adjusted while IoTDB is running. The SQL statement is as follows: - -``` -ALTER DATABASE prefixPath WITH databaseAttributeClause (COMMA? databaseAttributeClause)* -``` - -For example: -``` -ALTER DATABASE root.db WITH SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -Note that only the following heterogeneous parameters can be adjusted at runtime: -+ SCHEMA_REGION_GROUP_NUM -+ DATA_REGION_GROUP_NUM - -#### Viewing Heterogeneous Databases - -The heterogeneous configuration of each Database can be queried with the following SQL statement: - -``` -SHOW DATABASES DETAILS prefixPath? -``` - -For example: - -``` -IoTDB> SHOW DATABASES DETAILS -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|Database| TTL|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|MinSchemaRegionGroupNum|MaxSchemaRegionGroupNum|DataRegionGroupNum|MinDataRegionGroupNum|MaxDataRegionGroupNum| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|root.db1| null| 1| 3| 604800000| 0| 1| 1| 0| 2| 2| -|root.db2|86400000| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -|root.db3| null| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -Total line number = 3 -It costs 0.058s -``` - -The result columns are, in order: -+ Database name -+ TTL of the database -+ Number of metadata replicas of the database -+ Number of data replicas of the database -+ Time partition interval of the database -+ Number of SchemaRegionGroups the database currently owns -+ Minimum number of SchemaRegionGroups the database must own -+ Maximum number of SchemaRegionGroups the database may own -+ 
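Number of DataRegionGroups the database currently owns -+ Minimum number of DataRegionGroups the database must own -+ Maximum number of DataRegionGroups the database may own - - -## Device Template Management - -IoTDB supports device templates, which let different entities of the same type share measurement metadata. This reduces the memory used by metadata and simplifies the management of entities of the same type. - - -![img](/img/%E6%A8%A1%E6%9D%BF.png) - -![img](/img/template.jpg) - -### Creating a Device Template - -The SQL syntax for creating a device template is: - -```sql -CREATE DEVICE TEMPLATE ALIGNED? '(' [',' ]+ ')' -``` - -**Example 1:** create a device template containing two non-aligned series - -```shell -IoTDB> create device template t1 (temperature FLOAT encoding=RLE, status BOOLEAN encoding=PLAIN compression=SNAPPY) -``` - -**Example 2:** create a device template containing a group of aligned series - -```shell -IoTDB> create device template t2 aligned (lat FLOAT encoding=Gorilla, lon FLOAT encoding=Gorilla) -``` - -Here the measurements `lat` and `lon` are aligned. - -### Setting a Device Template - -After a device template is created, it must be set on a path before it can be used to create series and write data under that path. - -**Before setting a template, make sure the relevant database has been created.** - -**Setting the template on a database node is recommended. Setting it on a node above the database is not recommended.** - -**Ordinary series cannot be created under a path where a template is set, and a template cannot be set on a prefix path under which ordinary series already exist.** - -The SQL statement for setting a device template is: - -```shell -IoTDB> set device template t1 to root.sg1.d1 -``` - -### Activating a Device Template - -Once a device template is set and automatic series registration is enabled, data can be written directly. For example, if the database is root.sg1 and template t1 is set on node root.sg1.d1, data can be written directly to time series such as root.sg1.d1.temperature and root.sg1.d1.status, and these time series can be used like normally created series. - -**Note**: before any data is inserted, or if automatic series registration is not enabled, the time series defined by the template are not created. The following SQL statement creates the time series before data is inserted, which activates the template: - -```shell -IoTDB> create timeseries using device template on root.sg1.d1 -``` - -**Example:** execute the following statements -```shell -IoTDB> set device template t1 to root.sg1.d1 -IoTDB> set device template t2 to root.sg1.d2 -IoTDB> create timeseries using device template on root.sg1.d1 -IoTDB> create timeseries using device template on root.sg1.d2 -``` - -View the time series at this point: -```sql -show timeseries root.sg1.** -``` - -```shell -+-----------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression|tags|attributes|deadband|deadband parameters| -+-----------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -|root.sg1.d1.temperature| 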
null| root.sg1| FLOAT| RLE| SNAPPY|null| null| null| null| -| root.sg1.d1.status| null| root.sg1| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -| root.sg1.d2.lon| null| root.sg1| FLOAT| GORILLA| SNAPPY|null| null| null| null| -| root.sg1.d2.lat| null| root.sg1| FLOAT| GORILLA| SNAPPY|null| null| null| null| -+-----------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -``` - -View the devices at this point: -```sql -show devices root.sg1.** -``` - -```shell -+---------------+---------+---------+ -| devices|isAligned| Template| -+---------------+---------+---------+ -| root.sg1.d1| false| null| -| root.sg1.d2| true| null| -+---------------+---------+---------+ -``` - -### Viewing Device Templates - -- View all device templates - -The SQL statement is: - -```shell -IoTDB> show device templates -``` - -The result is: -```shell -+-------------+ -|template name| -+-------------+ -| t2| -| t1| -+-------------+ -``` - -- View the measurements in a device template - -The SQL statement is: - -```shell -IoTDB> show nodes in device template t1 -``` - -The result is: -```shell -+-----------+--------+--------+-----------+ -|child nodes|dataType|encoding|compression| -+-----------+--------+--------+-----------+ -|temperature| FLOAT| RLE| SNAPPY| -| status| BOOLEAN| PLAIN| SNAPPY| -+-----------+--------+--------+-----------+ -``` - -- View the paths on which a device template is set - -```shell -IoTDB> show paths set device template t1 -``` - -The result is: -```shell -+-----------+ -|child paths| -+-----------+ -|root.sg1.d1| -+-----------+ -``` - -- View the paths that use a device template (that is, the template is activated on the path and the series have been created) - -```shell -IoTDB> show paths using device template t1 -``` - -The result is: -```shell -+-----------+ -|child paths| -+-----------+ -|root.sg1.d1| -+-----------+ -``` - -### Deactivating a Device Template - -To delete a group of time series represented by a template, deactivate the template with the following SQL statement: - -```shell -IoTDB> delete timeseries of device template t1 from root.sg1.d1 -``` - -or - -```shell -IoTDB> deactivate device template t1 from root.sg1.d1 -``` - -Deactivation supports batch processing. The SQL statement is: - -```shell -IoTDB> delete timeseries of device template t1 from root.sg1.*, root.sg2.* -``` - -or - 
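-```shell -IoTDB> deactivate device template t1 from root.sg1.*, root.sg2.* -``` - -If the deactivation command does not specify a template name, every template usage involved in the given paths is deactivated. - -### Unsetting a Device Template - -The SQL statement for unsetting a device template is: - -```shell -IoTDB> unset device template t1 from root.sg1.d1 -``` - -**Note**: a template that is still activated cannot be unset. Before unsetting, make sure all uses of the template have been deactivated, that is, all series represented by the template have been deleted. - -### Dropping a Device Template - -The SQL statement for dropping a device template is: - -```shell -IoTDB> drop device template t1 -``` - -**Note**: a template that is still set on a path cannot be dropped. Make sure the template has been unset successfully before dropping it. - -### Altering a Device Template - -When new measurements are needed, altering the device template adds them to all devices on which the template is activated. - -The SQL statement for altering a device template is: - -```shell -IoTDB> alter device template t1 add (speed FLOAT encoding=RLE) -``` - -**When data is written to a device under a path where a template is set, and the write request contains measurements that are not in the template, the template is extended automatically.** - - -## Timeseries Management - -### Creating Timeseries - -Based on the data model we have built, we can create the corresponding time series in each of the two databases. The SQL statements for creating time series are: - -``` -IoTDB > create timeseries root.ln.wf01.wt01.status with datatype=BOOLEAN,encoding=PLAIN -IoTDB > create timeseries root.ln.wf01.wt01.temperature with datatype=FLOAT,encoding=RLE -IoTDB > create timeseries root.ln.wf02.wt02.hardware with datatype=TEXT,encoding=PLAIN -IoTDB > create timeseries root.ln.wf02.wt02.status with datatype=BOOLEAN,encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.status with datatype=BOOLEAN,encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.temperature with datatype=FLOAT,encoding=RLE -``` - -Since v0.13, a simplified SQL statement can be used to create time series: - -``` -IoTDB > create timeseries root.ln.wf01.wt01.status BOOLEAN encoding=PLAIN -IoTDB > create timeseries root.ln.wf01.wt01.temperature FLOAT encoding=RLE -IoTDB > create timeseries root.ln.wf02.wt02.hardware TEXT encoding=PLAIN -IoTDB > create timeseries root.ln.wf02.wt02.status BOOLEAN encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.status BOOLEAN encoding=PLAIN -IoTDB > create timeseries root.sgcc.wf03.wt01.temperature FLOAT encoding=RLE -``` - -Note that if the encoding specified when creating a time series does not match the data type, the system returns an error, as shown below: -``` -IoTDB> create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF -error: encoding TS_2DIFF does not support BOOLEAN -``` - -For the full list of data types and their supported encodings, see 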
[Encoding](../Technical-Insider/Encoding-and-Compression.md). - -### Creating Aligned Timeseries - -The SQL statement for creating a group of aligned time series is: - -``` -IoTDB> CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT encoding=PLAIN compressor=SNAPPY, longitude FLOAT encoding=PLAIN compressor=SNAPPY) -``` - -The series in an aligned group can have different data types, encodings, and compression methods. - -Aligned time series also support aliases, tags, and attributes. - -### Deleting Timeseries - -We can use the `(DELETE | DROP) TimeSeries ` statement to delete time series created earlier. The SQL statements are: - -``` -IoTDB> delete timeseries root.ln.wf01.wt01.status -IoTDB> delete timeseries root.ln.wf01.wt01.temperature, root.ln.wf02.wt02.hardware -IoTDB> delete timeseries root.ln.wf02.* -IoTDB> drop timeseries root.ln.wf02.* -``` - -### Viewing Timeseries - -* SHOW LATEST? TIMESERIES pathPattern? timeseriesWhereClause? limitClause? - - SHOW TIMESERIES accepts four optional clauses, and the result contains all information about the matching time series. - -The time series information includes: the time series path, database, measurement alias, data type, encoding, compression, attributes, and tags. - -Examples: - -* SHOW TIMESERIES - - Shows information about all time series in the system. - -* SHOW TIMESERIES <`Path`> - - Returns information about all time series under the given path. `Path` must be a time series path or a path pattern. For example, to view the time series under the `root` path and under the `root.ln` path, use the following SQL statements: - -``` -IoTDB> show timeseries root.** -IoTDB> show timeseries root.ln.** -``` - -The results are, respectively: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| 
SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.016s - -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression|tags|attributes|deadband|deadband parameters| -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -|root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY|null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -Total line number = 4 -It costs 0.004s -``` - -* SHOW TIMESERIES LIMIT INT OFFSET INT - - 只返回从指定下标开始的结果,最大返回条数被 LIMIT 限制,用于分页查询。例如: - -``` -show timeseries root.ln.** limit 10 offset 10 -``` - -* SHOW TIMESERIES WHERE TIMESERIES contains 'containStr' - - 对查询结果集根据 timeseries 名称进行字符串模糊匹配过滤。例如: - -``` -show timeseries root.ln.** where timeseries contains 'wf01.wt' -``` - -执行结果为: - -``` 
-+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 2 -It costs 0.016s -``` - -* SHOW TIMESERIES WHERE DataType=type - - 对查询结果集根据时间序列数据类型进行过滤。例如: - -``` -show timeseries root.ln.** where dataType=FLOAT -``` - -执行结果为: - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf01.wt01.temperature| 
null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 3 -It costs 0.016s - -``` - -* SHOW TIMESERIES WHERE TAGS(KEY) = VALUE -* SHOW TIMESERIES WHERE TAGS(KEY) CONTAINS VALUE - - 对查询结果集根据标签进行过滤。例如: - -``` -show timeseries root.ln.** where TAGS(unit)='c' -show timeseries root.ln.** where TAGS(description) contains 'test1' -``` - -执行结果分别为: - -``` -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s - -``` - - -* SHOW LATEST TIMESERIES - - 表示查询出的时间序列需要按照最近插入时间戳降序排列 - - 
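The optional clauses of `SHOW TIMESERIES` listed above compose in a fixed order: `LATEST`, the path pattern, the `WHERE` filter, then `LIMIT`/`OFFSET`. As a rough illustration only, the Python sketch below assembles such statements as plain strings. The helper names `build_show_timeseries` and `paginate` are hypothetical (they are not part of IoTDB or of any client library), nothing is sent to a server, and the rule that `OFFSET` is only emitted together with `LIMIT` is an assumption of this sketch.

```python
from typing import Optional


def build_show_timeseries(
    path_pattern: Optional[str] = None,
    where: Optional[str] = None,
    limit: Optional[int] = None,
    offset: Optional[int] = None,
    latest: bool = False,
) -> str:
    """Assemble a SHOW TIMESERIES statement in the clause order
    SHOW LATEST? TIMESERIES pathPattern? whereClause? limitClause?.

    Hypothetical helper for illustration: it only formats a string and
    does not validate the path pattern or the filter expression.
    """
    if offset is not None and limit is None:
        # Assumption of this sketch: OFFSET is only emitted together with LIMIT.
        raise ValueError("offset requires limit")

    parts = ["show"]
    if latest:
        parts.append("latest")
    parts.append("timeseries")
    if path_pattern:
        parts.append(path_pattern)
    if where:
        parts.append(f"where {where}")
    if limit is not None:
        parts.append(f"limit {limit}")
        if offset is not None:
            parts.append(f"offset {offset}")
    return " ".join(parts)


def paginate(path_pattern: str, page_size: int, pages: int) -> list[str]:
    """Return one statement per page, stepping OFFSET by the page size."""
    return [
        build_show_timeseries(path_pattern, limit=page_size, offset=i * page_size)
        for i in range(pages)
    ]


if __name__ == "__main__":
    print(build_show_timeseries("root.ln.**", where="dataType=FLOAT"))
    for stmt in paginate("root.ln.**", page_size=10, pages=3):
        print(stmt)
```

Building the text in one place keeps the clause order consistent across calls; the resulting strings would still have to be executed through whichever IoTDB client is in use.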
-需要注意的是,当查询路径不存在时,系统会返回 0 条时间序列。 - -### 统计时间序列总数 - -IoTDB 支持使用`COUNT TIMESERIES`来统计一条路径中的时间序列个数。SQL 语句如下所示: - -* 可以通过 `WHERE` 条件对时间序列名称进行字符串模糊匹配,语法为: `COUNT TIMESERIES WHERE TIMESERIES contains 'containStr'` 。 -* 可以通过 `WHERE` 条件对时间序列数据类型进行过滤,语法为: `COUNT TIMESERIES WHERE DataType='`。 -* 可以通过 `WHERE` 条件对标签点进行过滤,语法为: `COUNT TIMESERIES WHERE TAGS(key)='value'` 或 `COUNT TIMESERIES WHERE TAGS(key) contains 'value'`。 -* 可以通过定义`LEVEL`来统计指定层级下的时间序列个数。这条语句可以用来统计每一个设备下的传感器数量,语法为:`COUNT TIMESERIES GROUP BY LEVEL=`。 - -``` -IoTDB > COUNT TIMESERIES root.** -IoTDB > COUNT TIMESERIES root.ln.** -IoTDB > COUNT TIMESERIES root.ln.*.*.status -IoTDB > COUNT TIMESERIES root.ln.wf01.wt01.status -IoTDB > COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' -IoTDB > COUNT TIMESERIES root.** WHERE DATATYPE = INT64 -IoTDB > COUNT TIMESERIES root.** WHERE TAGS(unit) contains 'c' -IoTDB > COUNT TIMESERIES root.** WHERE TAGS(unit) = 'c' -IoTDB > COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' group by level = 1 -``` - -例如有如下时间序列(可以使用`show timeseries`展示所有时间序列): - -``` -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| 
root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| {"unit":"c"}| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| {"description":"test1"}| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.004s -``` - -那么 Metadata Tree 如下所示: - - - -可以看到,`root`被定义为`LEVEL=0`。那么当你输入如下语句时: - -``` -IoTDB > COUNT TIMESERIES root.** GROUP BY LEVEL=1 -IoTDB > COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2 -IoTDB > COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2 -``` - -你将得到以下结果: - -``` -IoTDB> COUNT TIMESERIES root.** GROUP BY LEVEL=1 -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -| root.sgcc| 2| -| root.ln| 4| -+------------+-----------------+ -Total line number = 3 -It costs 0.002s - -IoTDB > COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2 -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -|root.ln.wf02| 2| -|root.ln.wf01| 2| -+------------+-----------------+ -Total line number = 2 -It costs 0.002s - -IoTDB > COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2 -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -|root.ln.wf01| 2| -+------------+-----------------+ -Total line number = 1 -It costs 0.002s -``` - -> 注意:时间序列的路径只是过滤条件,与 level 的定义无关。 - -### 活跃时间序列查询 -我们在原有的时间序列查询和统计上添加新的WHERE时间过滤条件,可以得到在指定时间范围中存在数据的时间序列。 - -需要注意的是, 在带有时间过滤的元数据查询中并不考虑视图的存在,只考虑TsFile中实际存储的时间序列。 - -一个使用样例如下: -``` -IoTDB> insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -IoTDB> insert into root.sg.data2(timestamp, 
s1,s2) values(15002, 1, 2); -IoTDB> insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -IoTDB> show timeseries; -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -IoTDB> show timeseries where time >= 15000 and time < 16000; -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -IoTDB> count 
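timeseries where time >= 15000 and time < 16000; -+-----------------+ -|count(timeseries)| -+-----------------+ -| 4| -+-----------------+ -``` -Regarding the definition of an active time series: data that can be returned by a normal query counts as active, so time series whose data was inserted and then deleted are not considered. - -### Tag and Attribute Management - -When creating a time series, we can attach an alias as well as extra tag and attribute information to it. - -Tags and attributes differ as follows: - -* Tags can be used to query time series paths; an inverted index from tags to time series paths is maintained in memory: tag -> time series path -* Attributes can only be looked up by time series path: time series path -> attribute - -The extended SQL statement for creating such a time series is as follows: -``` -create timeseries root.turbine.d1.s1(temprature) with datatype=FLOAT, encoding=RLE, compression=SNAPPY tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2) -``` - -The `temprature` in parentheses is an alias for the sensor `s1`. -Wherever `s1` is used, it can be replaced with `temprature`; the two are equivalent. - -> IoTDB also supports setting an alias with the AS function in query statements. The difference is that an alias set with AS replaces the whole time series name and is temporary, not bound to the time series, whereas the alias above is only an alias of the sensor, is bound to it, and can be used interchangeably with the original sensor name. - -> Note: the total size of the extra tag and attribute information must not exceed `tag_attribute_total_size`. - - * Tag and attribute updates -After a time series has been created, its existing tags and attributes can still be updated, in the following six ways: -* Rename a tag or attribute -``` -ALTER timeseries root.turbine.d1.s1 RENAME tag1 TO newTag1 -``` -* Reset the value of a tag or attribute -``` -ALTER timeseries root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1 -``` -* Delete existing tags or attributes -``` -ALTER timeseries root.turbine.d1.s1 DROP tag1, tag2 -``` -* Add new tags -``` -ALTER timeseries root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4 -``` -* Add new attributes -``` -ALTER timeseries root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4 -``` -* Upsert the alias, tags, and attributes -> If the alias, tag, or attribute did not exist before, it is inserted; otherwise the old value is replaced with the new one -``` -ALTER timeseries root.turbine.d1.s1 UPSERT ALIAS=newAlias TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4) -``` - -* Query time series using tags as the filter condition; use TAGS(tagKey) to identify the tag used as the filter -``` -SHOW TIMESERIES (<`PathPattern`>)? 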
timeseriesWhereClause -``` - -返回给定路径的下的所有满足条件的时间序列信息,SQL 语句如下所示: - -``` -ALTER timeseries root.ln.wf02.wt02.hardware ADD TAGS unit=c -ALTER timeseries root.ln.wf02.wt02.status ADD TAGS description=test1 -show timeseries root.ln.** where TAGS(unit)='c' -show timeseries root.ln.** where TAGS(description) contains 'test1' -``` - -执行结果分别为: - -``` -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s -``` - -- 使用标签作为过滤条件统计时间序列数量 - -``` -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause -COUNT TIMESERIES (<`PathPattern`>)? 
timeseriesWhereClause GROUP BY LEVEL= -``` - -返回给定路径的下的所有满足条件的时间序列的数量,SQL 语句如下所示: - -``` -count timeseries -count timeseries root.** where TAGS(unit)='c' -count timeseries root.** where TAGS(unit)='c' group by level = 2 -``` - -执行结果分别为: - -``` -IoTDB> count timeseries -+-----------------+ -|count(timeseries)| -+-----------------+ -| 6| -+-----------------+ -Total line number = 1 -It costs 0.019s -IoTDB> count timeseries root.** where TAGS(unit)='c' -+-----------------+ -|count(timeseries)| -+-----------------+ -| 2| -+-----------------+ -Total line number = 1 -It costs 0.020s -IoTDB> count timeseries root.** where TAGS(unit)='c' group by level = 2 -+--------------+-----------------+ -| column|count(timeseries)| -+--------------+-----------------+ -| root.ln.wf02| 2| -| root.ln.wf01| 0| -|root.sgcc.wf03| 0| -+--------------+-----------------+ -Total line number = 3 -It costs 0.011s -``` - -> 注意,现在我们只支持一个查询条件,要么是等值条件查询,要么是包含条件查询。当然 where 子句中涉及的必须是标签值,而不能是属性值。 - -创建对齐时间序列 - -``` -create aligned timeseries root.sg1.d1(s1 INT32 tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2), s2 DOUBLE tags(tag3=v3, tag4=v4) attributes(attr3=v3, attr4=v4)) -``` - -执行结果如下: - -``` -IoTDB> show timeseries -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -|root.sg1.d1.s2| null| root.sg1| DOUBLE| GORILLA| SNAPPY|{"tag4":"v4","tag3":"v3"}|{"attr4":"v4","attr3":"v3"}| null| null| 
-+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -支持查询: - -``` -IoTDB> show timeseries where TAGS(tag1)='v1' -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -上述对时间序列标签、属性的更新等操作都支持。 - - -## 路径查询 - -### 查看路径的所有子路径 - -``` -SHOW CHILD PATHS pathPattern -``` - -可以查看此路径模式所匹配的所有路径的下一层的所有路径和它对应的节点类型,即pathPattern.*所匹配的路径及其节点类型。 - -节点类型:ROOT -> SG INTERNAL -> DATABASE -> INTERNAL -> DEVICE -> TIMESERIES - -示例: - -* 查询 root.ln 的下一层:show child paths root.ln - -``` -+------------+----------+ -| child paths|node types| -+------------+----------+ -|root.ln.wf01| INTERNAL| -|root.ln.wf02| INTERNAL| -+------------+----------+ -Total line number = 2 -It costs 0.002s -``` - -* 查询形如 root.xx.xx.xx 的路径:show child paths root.\*.\* - -``` -+---------------+ -| child paths| -+---------------+ -|root.ln.wf01.s1| -|root.ln.wf02.s2| -+---------------+ -``` - -### 查看路径的下一级节点 - -``` -SHOW CHILD NODES pathPattern -``` - -可以查看此路径模式所匹配的节点的下一层的所有节点。 - -示例: - -* 查询 root 的下一层:show child nodes root - -``` -+------------+ -| child nodes| -+------------+ -| ln| -+------------+ -``` - -* 查询 root.ln 的下一层 :show child nodes root.ln - -``` -+------------+ -| child nodes| -+------------+ -| wf01| -| wf02| -+------------+ -``` - -### 统计节点数 - -IoTDB 支持使用`COUNT NODES 
LEVEL=`来统计当前 Metadata - 树下满足某路径模式的路径中指定层级的节点个数。这条语句可以用来统计带有特定采样点的设备数。例如: - -``` -IoTDB > COUNT NODES root.** LEVEL=2 -IoTDB > COUNT NODES root.ln.** LEVEL=2 -IoTDB > COUNT NODES root.ln.wf01.* LEVEL=3 -IoTDB > COUNT NODES root.**.temperature LEVEL=3 -``` - -对于上面提到的例子和 Metadata Tree,你可以获得如下结果: - -``` -+------------+ -|count(nodes)| -+------------+ -| 4| -+------------+ -Total line number = 1 -It costs 0.003s - -+------------+ -|count(nodes)| -+------------+ -| 2| -+------------+ -Total line number = 1 -It costs 0.002s - -+------------+ -|count(nodes)| -+------------+ -| 1| -+------------+ -Total line number = 1 -It costs 0.002s - -+------------+ -|count(nodes)| -+------------+ -| 2| -+------------+ -Total line number = 1 -It costs 0.002s -``` - -> 注意:时间序列的路径只是过滤条件,与 level 的定义无关。 - -### 查看设备 - -* SHOW DEVICES pathPattern? (WITH DATABASE)? devicesWhereClause? limitClause? - -与 `Show Timeseries` 相似,IoTDB 目前也支持两种方式查看设备。 - -* `SHOW DEVICES` 语句显示当前所有的设备信息,等价于 `SHOW DEVICES root.**`。 -* `SHOW DEVICES ` 语句规定了 `PathPattern`,返回给定的路径模式所匹配的设备信息。 -* `WHERE` 条件中可以使用 `DEVICE contains 'xxx'`,根据 device 名称进行模糊查询。 -* `WHERE` 条件中可以使用 `TEMPLATE = 'xxx'`,`TEMPLATE != 'xxx'`,根据 template 名称进行过滤查询。 -* `WHERE` 条件中可以使用 `TEMPLATE is null`,`TEMPLATE is not null`,根据 template 是否为null(null 表示没激活)进行过滤查询。 - -SQL 语句如下所示: - -``` -IoTDB> show devices -IoTDB> show devices root.ln.** -IoTDB> show devices root.ln.** where device contains 't' -IoTDB> show devices root.ln.** where template = 't1' -IoTDB> show devices root.ln.** where template is null -IoTDB> show devices root.ln.** where template != 't1' -IoTDB> show devices root.ln.** where template is not null -``` - -你可以获得如下数据: - -``` -+-------------------+---------+---------+ -| devices|isAligned| Template| -+-------------------+---------+---------+ -| root.ln.wf01.wt01| false| t1| -| root.ln.wf02.wt02| false| null| -|root.sgcc.wf03.wt01| false| null| -| root.turbine.d1| false| null| -+-------------------+---------+---------+ -Total line number = 4 -It 
costs 0.002s - -+-----------------+---------+---------+ -| devices|isAligned| Template| -+-----------------+---------+---------+ -|root.ln.wf01.wt01| false| t1| -|root.ln.wf02.wt02| false| null| -+-----------------+---------+---------+ -Total line number = 2 -It costs 0.001s - -+-----------------+---------+---------+ -| devices|isAligned| Template| -+-----------------+---------+---------+ -|root.ln.wf01.wt01| false| t1| -|root.ln.wf02.wt02| false| null| -+-----------------+---------+---------+ -Total line number = 2 -It costs 0.001s - -+-----------------+---------+---------+ -| devices|isAligned| Template| -+-----------------+---------+---------+ -|root.ln.wf01.wt01| false| t1| -+-----------------+---------+---------+ -Total line number = 1 -It costs 0.001s - -+-----------------+---------+---------+ -| devices|isAligned| Template| -+-----------------+---------+---------+ -|root.ln.wf02.wt02| false| null| -+-----------------+---------+---------+ -Total line number = 1 -It costs 0.001s -``` - -Here, `isAligned` indicates whether the time series under the device are aligned, -and `Template` shows the name of the template activated on the device; null means no template is activated. - -To view devices together with their database, use the `SHOW DEVICES WITH DATABASE` statement. - -* `SHOW DEVICES WITH DATABASE` shows all current devices and the database each belongs to; it is equivalent to `SHOW DEVICES root.** WITH DATABASE`. -* `SHOW DEVICES <PathPattern> WITH DATABASE` specifies a `PathPattern` and returns the devices matching the given path pattern together with the database each belongs to. - -The SQL statements are as follows: - -``` -IoTDB> show devices with database -IoTDB> show devices root.ln.** with database -``` - -You will get the following results: - -``` -+-------------------+-------------+---------+---------+ -| devices| database|isAligned| Template| -+-------------------+-------------+---------+---------+ -| root.ln.wf01.wt01| root.ln| false| t1| -| root.ln.wf02.wt02| root.ln| false| null| -|root.sgcc.wf03.wt01| root.sgcc| false| null| -| root.turbine.d1| root.turbine| false| null| -+-------------------+-------------+---------+---------+ -Total line number = 4 -It costs 0.003s - -+-----------------+-------------+---------+---------+ -| devices| database|isAligned| Template| 
-+-----------------+-------------+---------+---------+ -|root.ln.wf01.wt01| root.ln| false| t1| -|root.ln.wf02.wt02| root.ln| false| null| -+-----------------+-------------+---------+---------+ -Total line number = 2 -It costs 0.001s -``` - -### 统计设备数量 - -* COUNT DEVICES \ - -上述语句用于统计设备的数量,同时允许指定`PathPattern` 用于统计匹配该`PathPattern` 的设备数量 - -SQL 语句如下所示: - -``` -IoTDB> show devices -IoTDB> count devices -IoTDB> count devices root.ln.** -``` - -你可以获得如下数据: - -``` -+-------------------+---------+---------+ -| devices|isAligned| Template| -+-------------------+---------+---------+ -|root.sgcc.wf03.wt03| false| null| -| root.turbine.d1| false| null| -| root.ln.wf02.wt02| false| null| -| root.ln.wf01.wt01| false| t1| -+-------------------+---------+---------+ -Total line number = 4 -It costs 0.024s - -+--------------+ -|count(devices)| -+--------------+ -| 4| -+--------------+ -Total line number = 1 -It costs 0.004s - -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -Total line number = 1 -It costs 0.004s -``` - -### 活跃设备查询 -和活跃时间序列一样,我们可以在查看和统计设备的基础上添加时间过滤条件来查询在某段时间内存在数据的活跃设备。这里活跃的定义与活跃时间序列相同,使用样例如下: -``` -IoTDB> insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -IoTDB> insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -IoTDB> insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -IoTDB> show devices; -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -| root.sg.data3| false| -+-------------------+---------+ - -IoTDB> show devices where time >= 15000 and time < 16000; -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -+-------------------+---------+ - -IoTDB> count devices where time >= 15000 and time < 16000; -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -``` \ No newline at end of file diff --git 
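a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/AINode_Deployment_timecho.md b/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/AINode_Deployment_timecho.md deleted file mode 100644 index 185ba32d8..000000000 --- a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/AINode_Deployment_timecho.md +++ /dev/null @@ -1,564 +0,0 @@ - -# AINode Deployment - -## AINode Introduction - -### Capabilities - -AINode is the third type of native node that IoTDB provides, after ConfigNode and DataNode. By interacting with the DataNodes and ConfigNodes of an IoTDB cluster, it extends the cluster with machine learning analysis of time series: existing machine learning models can be imported from outside and registered, and a registered model can then be applied to specified time series data through simple SQL statements, so that model creation, management, and inference are integrated into the database engine. Machine learning algorithms and self-developed models for common time series analysis scenarios (for example, forecasting and anomaly detection) are already provided. - -### Delivery Method - AINode is an additional component outside the IoTDB cluster, delivered as a standalone installation package. - -### Deployment Modes -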
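- - ## Installation Preparation - -### Obtaining the Installation Package - - Users can download the AINode software installation package; once it is downloaded and unzipped, the installation of AINode is complete. - - After unzipping the installation package (`apache-iotdb--ainode-bin.zip`), the directory structure is as follows: -| **Directory** | **Type** | **Description** | -| ------------ | -------- | ------------------------------------------------ | -| lib | Folder | Compiled binary executables of AINode and the related code dependencies | -| sbin | Folder | AINode run scripts, used to start, remove, and stop AINode | -| conf | Folder | Contains the AINode configuration items | -| LICENSE | File | License | -| NOTICE | File | Notice | -| README_ZH.md | File | Chinese documentation in Markdown format | -| `README.md` | File | Usage instructions | - -### Pre-installation Check - -To make sure the AINode installation package you obtained is complete and correct, we recommend running a SHA512 check before installation and deployment. - -#### Preparation: - -- Obtain the officially published SHA512 checksum: please contact Timecho staff - -#### Verification steps (using Linux as an example): - -1. Open a terminal and change to the directory containing the installation package (for example `/data/ainode`): - ```Bash - cd /data/ainode - ``` -2. Run the following command to compute the hash: - ```Bash - sha512sum timechodb-{version}-ainode-bin.zip - ``` -3. The terminal prints the result (the SHA512 checksum on the left, the file name on the right): - -![img](/img/sha512-05.png) - -4. Compare the output with the official SHA512 checksum; once they match, proceed with the AINode installation and deployment steps below. - -#### Notes: - -- If the checksums do not match, contact Timecho staff to obtain the installation package again -- If a "file does not exist" message appears during verification, check whether the file path is correct and whether the package was downloaded completely - -### Environment Preparation -- Recommended operating systems: Ubuntu, CentOS, MacOS - -- Runtime environment - - Python 3.10 to 3.12, with the pip and venv tools available. In an offline environment, download the zip archive for your operating system from [here](https://cloud.tsinghua.edu.cn/d/4c1342f6c272439aa96c/?p=%2Flibs&mode=list) (note that the dependencies are the zip archives in the libs folder, as shown in the figure below), copy all files in that folder into the `lib` folder inside the `apache-iotdb--ainode-bin` folder, and then start AINode following the steps below. - - - - - A Python interpreter must be available through the environment variables and callable directly with the `python` command - - We recommend creating a Python venv virtual environment inside the `apache-iotdb--ainode-bin` folder. For example, to create a virtual environment with Python 3.10.0: - - ```shell - # Create a virtual environment with Python 3.10.0 in a folder named venv - ../Python-3.10.0/python -m venv venv - ``` -## Installation, Deployment, and Usage - -### Installing AINode - -1. 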
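AINode activation - - IoTDB must be running normally, and the license must include authorization for the AINode module. - - The AINode module authorization can be activated in the following ways: - - Method 1: activation by copying the activation file - - After restarting the confignode node, enter the activation folder, send the system_info file to Timecho staff, and tell them you are applying for a standalone AINode authorization; - - Receive the license file returned by the staff; - - Place the license file in the activation folder of the corresponding node; -- Method 2: activation with the activation script - - To obtain the machine code required for activation, enter the `sbin` directory of the installation directory and run the activation script: - ```shell - cd sbin - ./start-activate.sh - ``` - - The following information is displayed. Send the machine code (that is, this string of characters) to Timecho staff and tell them you are applying for a standalone AINode authorization: - ```shell - Please copy the system_info's content and send it to Timecho: - 01-KU5LDFFN-PNBEHDRH - Please enter license: - ``` - - Enter the activation code returned by the staff at the `Please enter license:` prompt from the previous step, as shown below: - ```shell - Please enter license: - Jw+MmF+AtexsfgNGOFgTm83BgXbq0zT1+fOfPvQsLlj6ZsooHFU6HycUSEGC78eT1g67KPvkcLCUIsz2QpbyVmPLr9x1+kVjBubZPYlVpsGYLqLFc8kgpb5vIrPLd3hGLbJ5Ks8fV1WOVrDDVQq89YF2atQa2EaB9EAeTWd0bRMZ+s9ffjc/1Zmh9NSP/T3VCfJcJQyi7YpXWy5nMtcW0gSV+S6fS5r7a96PjbtE0zXNjnEhqgRzdU+mfO8gVuUNaIy9l375cp1GLpeCh6m6pF+APW1CiXLTSijK9Qh3nsL5bAOXNeob5l+HO5fEMgzrW8OJPh26Vl6ljKUpCvpTiw== - License has been stored to sbin/../activation/license - Import completed. Please start cluster and excute 'show cluster' to verify activation status - ``` -- After updating the license, restart the DataNode: enter the sbin directory of IoTDB and start the datanode: - ```shell - cd sbin - ./start-datanode.sh -d # the -d flag starts it in the background - ``` - -2. Check the Linux kernel architecture - ```shell - uname -m - ``` - -3. Set up the Python environment ([download](https://repo.anaconda.com/miniconda/)) - - The py311 build is recommended; place it in a dedicated iotdb folder under the user's home directory - -4. Verify the Python version - - ```shell - python --version - ``` - -5. Create a virtual environment (run in the ainode directory): - - ```shell - python -m venv venv - ``` - -6. Activate the virtual environment: - - ```shell - source venv/bin/activate - ``` - -7. Download AINode into the dedicated folder, switch to that folder, and unzip the installation package - - ```shell - unzip iotdb-enterprise-ainode-1.3.3.2.zip - ``` - -8. Modify the configuration - - ```shell - vi iotdb-enterprise-ainode-1.3.3.2/conf/iotdb-ainode.properties - ``` - Configuration changes: [details](#配置项修改) - > ain_seed_config_node=iotdb-1:10710 (cluster communication node IP:communication node port)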
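- > ain_inference_rpc_address=iotdb-3 (IP of the server running AINode) - -9. Switch the Python package index - - ```shell - pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/ - ``` - -10. Start the AINode node - - ```shell - nohup bash iotdb-enterprise-ainode-1.3.3.2/sbin/start-ainode.sh > myout.file 2>& 1 & - ``` - > To return to the system default environment: conda deactivate - -### Configuration Changes -AINode supports modifying some necessary parameters. The following parameters can be found in the `conf/iotdb-ainode.properties` file and changed persistently: - -| **Name** | **Description** | **Type** | **Default** | **When the change takes effect** | -| :----------------------------- | ------------------------------------------------------------ | ------- | ------------------ | ---------------------------- | -| cluster_name | Identifier of the cluster that AINode joins | string | defaultCluster | Can only be changed before the service is started for the first time | -| ain_seed_config_node | Address of the ConfigNode that AINode registers with at startup | String | 127.0.0.1:10710 | Can only be changed before the service is started for the first time | -| ain_inference_rpc_address | Address on which AINode provides service and communicates; internal service communication interface | String | 127.0.0.1 | Can only be changed before the service is started for the first time | -| ain_inference_rpc_port | Port on which AINode provides service and communicates | String | 10810 | Can only be changed before the service is started for the first time | -| ain_system_dir | Storage path for AINode metadata; the base directory of a relative path depends on the operating system, so an absolute path is recommended | String | data/AINode/system | Can only be changed before the service is started for the first time | -| ain_models_dir | Path where AINode stores model files; the base directory of a relative path depends on the operating system, so an absolute path is recommended | String | data/AINode/models | Can only be changed before the service is started for the first time | -| ain_logs_dir | Path where AINode stores logs; the base directory of a relative path depends on the operating system, so an absolute path is recommended | String | logs/AINode | Takes effect after restart | -| ain_thrift_compression_enabled | Whether AINode enables Thrift compression: 0 = disabled, 1 = enabled | Boolean | 0 | Takes effect after restart | -### Starting AINode - - After the Seed-ConfigNode has been deployed, an AINode node can be added to support model registration and inference. Once the IoTDB cluster information is specified in the configuration, run the corresponding command to start AINode and join the IoTDB cluster. - -#### Starting in a Networked Environment - -##### Start commands - -```shell - # Start command - # Linux and MacOS - bash sbin/start-ainode.sh - - # Windows - sbin\start-ainode.bat - - # Background start command (recommended for long-running use) - # Linux and MacOS - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - - # Windows - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -##### Detailed syntax - -```shell - # Start command - # Linux and MacOS - bash sbin/start-ainode.sh -i -r -n - - # Windows - sbin\start-ainode.bat -i -r -n - ``` - -##### Parameters: 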
- -| **Name** | **Flag** | **Description** | **Required** | **Type** | **Default** | **Input method** | -| ------------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ---------------- | ---------------------- | -| ain_interpreter_dir | -i | Path of the interpreter in the virtual environment where AINode is installed; an absolute path is required | No | String | Read from the environment variable by default | Passed at invocation or set persistently | -| ain_force_reinstall | -r | Whether the script checks the version when checking the AINode installation; if it does, the whl package in lib is force-installed when the version does not match | No | Bool | false | Passed at invocation | -| ain_no_dependencies | -n | Whether to skip dependencies when installing AINode; if specified, only the AINode main program is installed, without its dependencies. | No | Bool | false | Passed at invocation | - - If you do not want to pass these parameters on every start, they can also be set persistently in the `ainode-env.sh` and `ainode-env.bat` scripts in the `conf` folder (currently only the ain_interpreter_dir parameter can be set persistently). - - `ainode-env.sh` : - ```shell - # The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - # ain_interpreter_dir= - ``` - `ainode-env.bat` : -```shell - @REM The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - @REM set ain_interpreter_dir= - ``` - After writing the parameter value, uncomment the corresponding line and save; the change takes effect the next time the script is run. - -#### Examples - -##### Direct start: - -```shell - # Start command - # Linux and MacOS - bash sbin/start-ainode.sh - # Windows - sbin\start-ainode.bat - - - # Background start command (recommended for long-running use) - # Linux and MacOS - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - # Windows - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -##### Update start: -Use this command if the AINode version has been updated (for example, the `lib` folder was updated). First make sure AINode has stopped running, then restart it with the `-r` flag, which reinstalls AINode from the files under `lib`. - -```shell - # Update start command - # Linux and MacOS - bash sbin/start-ainode.sh -r - # Windows - sbin\start-ainode.bat -r - - - # Background update start command (recommended for long-running use) - # Linux and MacOS - nohup bash sbin/start-ainode.sh -r > myout.file 2>& 1 & - # Windows - nohup bash sbin\start-ainode.bat -r > myout.file 2>& 1 & - ``` -#### Starting in an Offline Environment - -##### Start commands - -```shell - # Start command - # Linux and MacOS - bash sbin/start-ainode.sh - - # Windows - sbin\start-ainode.bat - - # Background start command (recommended for long-running use) 
- # Linux 和 MacOS 系统 - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - - # Windows 系统 - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -##### 详细语法 - -```shell - # 启动命令 - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh -i -r -n - - # Windows 系统 - sbin\start-ainode.bat -i -r -n - ``` - -##### 参数介绍: - -| **名称** | **标签** | **描述** | **是否必填** | **类型** | **默认值** | **输入方式** | -| ------------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ---------------- | ---------------------- | -| ain_interpreter_dir | -i | AINode 所安装在的虚拟环境的解释器路径,需要使用绝对路径 | 否 | String | 默认读取环境变量 | 调用时输入或持久化修改 | -| ain_force_reinstall | -r | 该脚本在检查 AINode 安装情况的时候是否检查版本,如果检查则在版本不对的情况下会强制安装 lib 里的 whl 安装包 | 否 | Bool | false | 调用时输入 | - -> 注意:非联网环境下安装失败时,首先检查是否选择了平台对应的安装包,其次确认python版本(由于下载的安装包限制了python版本,3.7、3.9等其他都不行) - -#### 示例 - -##### 直接启动: - -```shell - # 启动命令 - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh - # Windows 系统 - sbin\start-ainode.bat - - - # 后台启动命令(长期运行推荐) - # Linux 和 MacOS 系统 - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - # Windows 系统 - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -### 检测 AINode 节点状态 - -AINode 启动过程中会自动将新的 AINode 加入 IoTDB 集群。启动 AINode 后可以在 命令行中输入 SQL 来查询,集群中看到 AINode 节点,其运行状态为 Running(如下展示)表示加入成功。 - -```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|Running| 127.0.0.1| 10810|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` - -### 停止 AINode - -如果需要停止正在运行的 AINode 节点,则执行相应的关闭脚本。 - -#### 停止命令 - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh - - #Windows - 
sbin\stop-ainode.bat - ``` - -##### 详细语法 - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh -t - - #Windows - sbin\stop-ainode.bat -t - ``` - -##### 参数介绍: - - | **名称** | **标签** | **描述** | **是否必填** | **类型** | **默认值** | **输入方式** | -| ----------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ------ | ---------- | -| ain_remove_target | -t | AINode 关闭时可以指定待移除的目标 AINode 的 Node ID、地址和端口号,格式为`` | 否 | String | 无 | 调用时输入 | - -#### 示例 -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh - - # Windows - sbin\stop-ainode.bat - ``` -停止 AINode 后,还可以在集群中看到 AINode 节点,其运行状态为 UNKNOWN(如下展示),此时无法使用 AINode 功能。 - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|UNKNOWN| 127.0.0.1| 10790|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -如果需要重新启动该节点,需重新执行启动脚本。 - -### 移除 AINode - -当需要把一个 AINode 节点移出集群时,可以执行移除脚本。移除和停止脚本的差别是:停止是在集群中保留 AINode 节点但停止 AINode 服务,移除则是把 AINode 节点从集群中移除出去。 - - - #### 移除命令 - -```shell - # Linux / MacOS - bash sbin/remove-ainode.sh - - # Windows - sbin\remove-ainode.bat - ``` - -##### 详细语法 - -```shell - # Linux / MacOS - bash sbin/remove-ainode.sh -i -t -r -n - - # Windows - sbin\remove-ainode.bat -i -t -r -n - ``` - -##### 参数介绍: - - | **名称** | **标签** | **描述** | **是否必填** | **类型** | **默认值** | **输入方式** | -| ------------------- | ---- | ------------------------------------------------------------ | -------- | ------ | ---------------- | --------------------- | -| ain_interpreter_dir | -i | AINode 所安装在的虚拟环境的解释器路径,需要使用绝对路径 | 否 | String | 默认读取环境变量 | 调用时输入+持久化修改 | -| ain_remove_target | -t | AINode 关闭时可以指定待移除的目标 
AINode 的 Node ID、地址和端口号,格式为`` | 否 | String | 无 | 调用时输入 | -| ain_force_reinstall | -r | 该脚本在检查 AINode 安装情况的时候是否检查版本,如果检查则在版本不对的情况下会强制安装 lib 里的 whl 安装包 | 否 | Bool | false | 调用时输入 | -| ain_no_dependencies | -n | 指定在安装 AINode 的时候是否安装依赖,如果指定则仅安装 AINode 主程序而不安装依赖。 | 否 | Bool | false | 调用时输入 | - - 如不想每次启动时指定对应参数,也可以在 `conf` 文件夹下的`ainode-env.sh` 和 `ainode-env.bat` 脚本中持久化修改参数(目前支持持久化修改 ain_interpreter_dir 参数)。 - - `ainode-env.sh` : - ```shell - # The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - # ain_interpreter_dir= - ``` - `ainode-env.bat` : -```shell - @REM The defaulte venv environment is used if ain_interpreter_dir is not set. Please use absolute path without quotation mark - @REM set ain_interpreter_dir= - ``` - 在写入参数值的后解除对应行的注释并保存即可在下一次执行脚本时生效。 - -#### 示例 - -##### 直接移除: - - ```shell - # Linux / MacOS - bash sbin/remove-ainode.sh - - # Windows - sbin\remove-ainode.bat - ``` - 移除节点后,将无法查询到节点的相关信息。 - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -##### 指定移除: - -如果用户丢失了 data 文件夹下的文件,可能 AINode 本地无法主动移除自己,需要用户指定节点号、地址和端口号进行移除,此时我们支持用户按照以下方法输入参数进行删除。 - - ```shell - # Linux / MacOS - bash sbin/remove-ainode.sh -t /: - - # Windows - sbin\remove-ainode.bat -t /: - ``` - -## 常见问题 - -### 启动AINode时出现找不到venv模块的报错 - - 当使用默认方式启动 AINode 时,会在安装包目录下创建一个 python 虚拟环境并安装依赖,因此要求安装 venv 模块。通常来说 python3.10 及以上的版本会自带 venv,但对于一些系统自带的 python 环境可能并不满足这一要求。出现该报错时有两种解决方案(二选一): - - 在本地安装 venv 模块,以 ubuntu 为例,可以通过运行以下命令来安装 python 自带的 venv 模块。或者从 python 官网安装一个自带 venv 的 python 版本。 - - ```shell -apt-get install 
python3.10-venv
-```
- Alternatively, create a Python 3.10.0 venv for AINode. Run this in the AINode directory:
-
- ```shell
-../Python-3.10.0/python -m venv venv   # the last "venv" is the folder name
-```
- When running the start script, pass the path of an existing Python interpreter with `-i` to use it as the AINode runtime environment, so that no new virtual environment needs to be created.
-
- ### The SSL module in Python is not installed or configured correctly, so HTTPS resources cannot be handled
-WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
-Install OpenSSL and then rebuild Python to resolve this problem.
-> Currently Python versions 3.6 to 3.9 are compatible with OpenSSL 1.0.2, 1.1.0, and 1.1.1.
-
- Python requires OpenSSL to be installed on the system. For installation details, see this [link](https://stackoverflow.com/questions/56552390/how-to-fix-ssl-module-in-python-is-not-available-in-centos)
-
- ```shell
-sudo apt-get install build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev uuid-dev lzma-dev liblzma-dev
-sudo -E ./configure --with-ssl
-make
-sudo make install
-```
-
- ### The pip version is too old
-
- On Windows, a build error similar to "error: Microsoft Visual C++ 14.0 or greater is required..." appears.
-
- This error usually means that the C++ build tools or the setuptools version is too old. Upgrade pip and setuptools:
-
- ```shell
-./python -m pip install --upgrade pip
-./python -m pip install --upgrade setuptools
-```
-
-
- ### Compiling and installing Python
-
- Use the following commands to download the source package from the official website and extract it:
- ```shell
-wget https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tar.xz
-tar Jxf Python-3.10.0.tar.xz
-```
- Compile and install Python:
- ```shell
-cd Python-3.10.0
-./configure --prefix=/usr/local/python3
-make
-sudo make install
-python3 --version
-```
\ No newline at end of file
diff --git a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Cluster-Deployment_timecho.md b/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Cluster-Deployment_timecho.md
deleted file mode 100644
index cd0dc532e..000000000
--- a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Cluster-Deployment_timecho.md
+++ /dev/null
@@ -1,559 +0,0 @@
-
-# Cluster Deployment
-
-This section describes how to manually deploy a cluster with 3 ConfigNodes and 3 DataNodes, commonly known as a 3C3D cluster.
-
-
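-
-## Notes
-
-1. Before installation, make sure the system has been prepared according to [System Configuration](./Environment-Requirements.md).
-
-2. When deploying, prefer using a `hostname` for IP configuration. This avoids the database failing to start after the host IP is changed later. Setting the hostname requires configuring `/etc/hosts` on the target server. For example, if the local IP is 192.168.1.3 and the hostname is iotdb-1, use the following command to set the server's hostname, and then use the hostname to configure IoTDB's `cn_internal_address` and `dn_internal_address`.
-
-   ``` shell
-   echo "192.168.1.3 iotdb-1" >> /etc/hosts
-   ```
-
-3. Some parameters cannot be modified after the first startup. Refer to the "Parameter Configuration" (参数配置) section below when setting them.
-
-4. On both Linux and Windows, make sure the IoTDB installation path contains no spaces or Chinese characters, to avoid abnormal behavior.
-
-5. Use the same user for all installation and deployment operations on IoTDB (including activating and using the software). You can:
-- Use the root user (recommended): this avoids permission problems.
-- Use a fixed non-root user:
-  - Operate as the same user: start, activate, stop, and perform all other operations as the same user, and do not switch users.
-  - Avoid sudo: avoid the sudo command where possible, because it runs commands with root privileges and may cause permission confusion or security problems.
-
-6. Deploying the monitoring panel is recommended. It monitors key runtime metrics so that you always know the state of the database. Contact the sales team to obtain the monitoring panel; for deployment steps, see [Monitoring Panel Deployment](./Monitoring-panel-deployment.md).
-
-7. Before installing and deploying the database, you can use the health check tool to inspect the runtime environment of the IoTDB nodes and obtain detailed results. For usage, see [Health Check Tool](../Tools-System/Health-Check-Tool.md).
-
-## Preparation Steps
-
-1. Prepare the IoTDB installation package: iotdb-enterprise-{version}-bin.zip (to obtain the package, see this [link](../Deployment-and-Maintenance/IoTDB-Package_timecho.md))
-2. Configure the operating system according to the environment requirements (for system environment configuration, see this [link](../Deployment-and-Maintenance/Environment-Requirements.md))
-
-### Pre-installation Check
-
-To make sure the IoTDB Enterprise Edition package you obtained is complete and correct, we recommend verifying its SHA512 checksum before installation.
-
-#### Preparation:
-
-- Obtain the officially published SHA512 checksum: see the "SHA512 checksum" listed for each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document
-
-#### Verification steps (Linux as an example):
-
-1. Open a terminal and change to the directory that contains the installation package (for example `/data/iotdb`):
-   ```Bash
-   cd /data/iotdb
-   ```
-2. Run the following command to compute the hash:
-   ```Bash
-   sha512sum timechodb-{version}-bin.zip
-   ```
-3. The terminal prints the result (the SHA512 checksum on the left, the file name on the right):
-
-![img](/img/sha512-02.png)
-
-4. 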
对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行IoTDB企业版的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - - -## 安装步骤 - -假设现在有3台linux服务器,IP地址和服务角色分配如下: - -| 节点ip | 主机名 | 服务 | -| ----------- | ------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode、DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode、DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode、DataNode | - -### 设置主机名 - -在3台机器上分别配置主机名,设置主机名需要在目标服务器上配置`/etc/hosts`,使用如下命令: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### 参数配置 - -解压安装包并进入安装目录 - -```Plain -unzip iotdb-enterprise-{version}-bin.zip -cd iotdb-enterprise-{version}-bin -``` - -#### 环境脚本配置 - -- `./conf/confignode-env.sh`配置 - - | **配置项** | **说明** | **默认值** | **推荐值** | 备注 | - | :---------- | :------------------------------------- | :--------- | :----------------------------------------------- | :----------- | - | MEMORY_SIZE | IoTDB ConfigNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的30% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -- `./conf/datanode-env.sh`配置 - - | **配置项** | **说明** | **默认值** | **推荐值** | 备注 | - | :---------- | :----------------------------------- |:-----------------------| :----------------------------------------------- | :----------- | - | MEMORY_SIZE | IoTDB DataNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的50% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -#### 通用配置 - -打开通用配置文件`./conf/iotdb-system.properties`,可根据部署方式设置以下参数: - -| 配置项 | 说明 | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | -| ------------------------- | ---------------------------------------- | -------------- | -------------- | -------------- | -| cluster_name | 集群名称 | defaultCluster | defaultCluster | defaultCluster | -| schema_replication_factor | 元数据副本数,DataNode数量不应少于此数目 | 3 | 3 | 3 | -| data_replication_factor | 数据副本数,DataNode数量不应少于此数目 | 2 | 2 | 2 | - -#### ConfigNode 配置 - -打开ConfigNode配置文件`./conf/iotdb-system.properties`,设置以下参数 
- -| 配置项 | 说明 | 默认 | 推荐值 | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | 备注 | -| ------------------- | ------------------------------------------------------------ | --------------- | ------------------------------------------------------- | ------------- | ------------- | ------------- | ------------------ | -| cn_internal_address | ConfigNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | iotdb-1 | iotdb-2 | iotdb-3 | 首次启动后不能修改 | -| cn_internal_port | ConfigNode在集群内部通讯使用的端口 | 10710 | 10710 | 10710 | 10710 | 10710 | 首次启动后不能修改 | -| cn_consensus_port | ConfigNode副本组共识协议通信使用的端口 | 10720 | 10720 | 10720 | 10720 | 10720 | 首次启动后不能修改 | -| cn_seed_config_node | 节点注册加入集群时连接的ConfigNode 的地址,cn_internal_address:cn_internal_port | 127.0.0.1:10710 | 第一个CongfigNode的cn_internal_address:cn_internal_port | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | 首次启动后不能修改 | - -#### DataNode 配置 - -打开DataNode配置文件 `./conf/iotdb-system.properties`,设置以下参数: - -| 配置项 | 说明 | 默认 | 推荐值 | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | 备注 | -| ------------------------------- | ------------------------------------------------------------ | --------------- | ------------------------------------------------------- | ------------- | ------------- | ------------- | ------------------ | -| dn_rpc_address | 客户端 RPC 服务的地址 | 0.0.0.0 | 所在服务器的IPV4地址或hostname,推荐使用所在服务器的IPV4地址 | iotdb-1 |iotdb-2 | iotdb-3 | 重启服务生效 | -| dn_rpc_port | 客户端 RPC 服务的端口 | 6667 | 6667 | 6667 | 6667 | 6667 | 重启服务生效 | -| dn_internal_address | DataNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | iotdb-1 | iotdb-2 | iotdb-3 | 首次启动后不能修改 | -| dn_internal_port | DataNode在集群内部通信使用的端口 | 10730 | 10730 | 10730 | 10730 | 10730 | 首次启动后不能修改 | -| dn_mpp_data_exchange_port | DataNode用于接收数据流使用的端口 | 10740 | 10740 | 10740 | 10740 | 10740 | 首次启动后不能修改 | -| dn_data_region_consensus_port | DataNode用于数据副本共识协议通信使用的端口 | 10750 | 10750 | 10750 | 10750 | 10750 | 首次启动后不能修改 | -| dn_schema_region_consensus_port | DataNode用于元数据副本共识协议通信使用的端口 | 
10760 | 10760 | 10760 | 10760 | 10760 | 首次启动后不能修改 | -| dn_seed_config_node | 节点注册加入集群时连接的ConfigNode地址,即cn_internal_address:cn_internal_port | 127.0.0.1:10710 | 第一个CongfigNode的cn_internal_address:cn_internal_port | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | 首次启动后不能修改 | - -> ❗️注意:VSCode Remote等编辑器无自动保存配置功能,请确保修改的文件被持久化保存,否则配置项无法生效 - -### 启动及激活数据库 (V 1.3.4 及以后的 1.x 版本) - -#### 启动 ConfigNode 节点 - -先启动第一个iotdb-1的confignode, 保证种子confignode节点先启动,然后依次启动第2和第3个confignode节点 - -```Bash -./start-confignode.sh -d #“-d”参数将在后台进行启动 -``` -如果启动失败,请参考[常见问题](#常见问题)。 - -#### 启动 DataNode 节点 - -分别进入iotdb的`sbin`目录下,依次启动3个datanode节点: - -```Bash -./start-datanode.sh -d #-d参数将在后台进行启动 -``` - -#### 激活数据库 - -##### 通过 CLI 激活 - -- 进入集群任一节点 CLI,执行获取机器码的语句 - - ```SQL - -- 连接CLI - ./sbin/start-cli.sh - -- 获取激活所需机器码 - IoTDB> show system info -``` - -- 系统将自动返回集群所有节点的机器码 - -```Bash -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -|01-TE5NLES4-UDDWCMYE,01-GG5NLES4-XXDWCMYE,01-FF5NLES4-WWWWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -It costs 0.030s -``` - -- 将获取的机器码复制给天谋工作人员 - -- 工作人员会返回激活码,正常是与提供的机器码的顺序对应的,请将整串激活码粘贴到CLI中进行激活 - - - 注:激活码前后需要用`'`符号进行标注,如下所示 - - ```Bash - IoTDB> activate 
'01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' - ``` - -### 启动及激活数据库 (V 1.3.4 之前版本) - -#### 启动 ConfigNode 节点 - -先启动第一个iotdb-1的confignode, 保证种子confignode节点先启动,然后依次启动第2和第3个confignode节点 - -```Bash -./start-confignode.sh -d #“-d”参数将在后台进行启动 -``` -如果启动失败,请参考[常见问题](#常见问题)。 - -#### 激活数据库 - -##### 方式一:激活文件拷贝激活 - -- 依次启动3个confignode节点后,每台机器各自的`activation`文件夹, 分别拷贝每台机器的`system_info`文件给天谋工作人员; -- 工作人员将返回每个ConfigNode节点的license文件,这里会返回3个license文件; -- 将3个license文件分别放入对应的ConfigNode节点的`activation`文件夹下; - -##### 方式二:激活脚本激活 - -- 依次获取3台机器的机器码,分别进入安装目录的`sbin`目录,执行激活脚本`start-activate.sh`: - - ```Bash - ./start-activate.sh - ``` - -- 显示如下信息,这里显示的是1台机器的机器码 : - - ```Bash - Please copy the system_info's content and send it to Timecho: - 01-KU5LDFFN-PNBEHDRH - Please enter license: - ``` - -- 其他2个节点依次执行激活脚本`start-activate.sh`,然后将获取的3台机器的机器码都复制给天谋工作人员 -- 工作人员会返回3段激活码,正常是与提供的3个机器码的顺序对应的,请分别将各自的激活码粘贴到上一步的命令行提示处 `Please enter license:`,如下提示: - - ```Bash - Please enter license: - Jw+MmF+Atxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx5bAOXNeob5l+HO5fEMgzrW8OJPh26Vl6ljKUpCvpTiw== - License has been stored to sbin/../activation/license - Import completed. 
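Please start cluster and excute 'show cluster' to verify activation status
-  ```
-
-#### Start the DataNodes
-
- Go to the `sbin` directory of IoTDB on each server and start the 3 DataNodes one by one:
-
-```Bash
-./start-datanode.sh -d   # the -d option starts the node in the background
-```
-
-### Verify the Deployment
-
-Run the CLI startup script in the `./sbin` directory directly:
-
-```Plain
-./start-cli.sh -h ip(local IP or domain name) -p port(6667)
-```
-
- After a successful start, the following screen appears, indicating that IoTDB was installed successfully.
-
-![](/img/%E4%BC%81%E4%B8%9A%E7%89%88%E6%88%90%E5%8A%9F.png)
-
-Once the success screen appears, check whether activation succeeded by running the `show cluster` command.
-
-`ACTIVATED` in the rightmost column means that activation succeeded.
-
-![](/img/%E4%BC%81%E4%B8%9A%E7%89%88%E6%BF%80%E6%B4%BB.png)
-
-You can also check the activation status in the CLI with the `show activation` command. In the example below, a status of ACTIVATED means that activation succeeded.
-
-```sql
-IoTDB> show activation
-+---------------+---------+-----------------------------+
-|    LicenseInfo|    Usage|                        Limit|
-+---------------+---------+-----------------------------+
-|         Status|ACTIVATED|                            -|
-|    ExpiredTime|        -|2026-04-30T00:00:00.000+08:00|
-|  DataNodeLimit|        1|                    Unlimited|
-|       CpuLimit|       16|                    Unlimited|
-|    DeviceLimit|       30|                    Unlimited|
-|TimeSeriesLimit|       72|                1,000,000,000|
-+---------------+---------+-----------------------------+
-```
-
-
-> `ACTIVATED(W)` indicates passive activation: this ConfigNode has no license file (or does not have the license file with the latest issue timestamp), and its activation depends on other ConfigNodes in the cluster that are in the activated state. In this case, check whether the license file has been placed in the license folder, and add it if it is missing. If a license file is already present, this node's license file may be inconsistent with the information on other nodes; contact Timecho staff to reapply.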
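
The activation check described above can also be scripted. The following is only a minimal sketch, not part of the IoTDB tooling: the helper names `activation_status` and `is_fully_activated` are hypothetical, and the sketch assumes that the `Status` row of `show activation` carries the same values discussed above (`ACTIVATED`, or `ACTIVATED(W)` for passive activation). It reads the printed table on standard input, so it can be tried on saved output first.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helpers): parse the table printed by `show activation`.

# Print the Usage column of the "Status" row, e.g. ACTIVATED or ACTIVATED(W).
activation_status() {
  awk -F'|' '
    {
      key = $2; val = $3
      gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim padding around the row label
      gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim padding around the value
    }
    key == "Status" { print val; exit }
  '
}

# Succeed only for full activation; passive activation, ACTIVATED(W), fails.
is_fully_activated() {
  [ "$(activation_status)" = "ACTIVATED" ]
}

# Assumed usage (host, port, and the -e batch option are illustrative):
#   ./sbin/start-cli.sh -h iotdb-1 -p 6667 -e "show activation" | is_fully_activated
```

Treating `ACTIVATED(W)` as a failure is a deliberate choice in this sketch, so that a node relying on other ConfigNodes for its activation is flagged for the license check described in the note above.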
- - -### 一键启停集群 - -#### 概述 - -在 IoTDB 的根目录中,`sbin` 子目录包含的 `start-all.sh` 和 `stop-all.sh` 脚本,与 `conf` 子目录中的 `iotdb-cluster.properties` 配置文件协同工作,可通过单一节点实现一键启动或停止集群所有节点的功能。通过这种方式,可以高效地管理 IoTDB 集群的生命周期,简化了部署和运维流程。 -下文将介绍`iotdb-cluster.properties` 文件中的具体配置项。 - -#### 配置项 - - -> 注意: -> -> * 当集群变更时,需要手动更新此配置文件。 -> * 如果在未配置 `iotdb-cluster.properties` 配置文件的情况下执行 `start-all.sh` 或者 `stop-all.sh` 脚本,则默认会启停当前脚本所在 IOTDB\_HOME 目录下的 ConfigNode 与 DataNode 节点。 -> * 推荐配置 ssh 免密登录:如果未配置,启动脚本后会提示输入服务器密码以便于后续启动/停止/销毁操作。如果已配置,则无需在执行脚本过程中输入服务器密码。 - -* confignode\_address\_list - -| 名字 | confignode\_address\_list | -| :--------------: | :------------------------------------------------------------------------------ | -| 描述 | 待启动/停止的 ConfigNode 节点所在主机的 IP 列表,如果有多个需要用“,”分隔。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* datanode\_address\_list - -| 名字 | datanode\_address\_list | -| :----------------: | :---------------------------------------------------------------------------- | -| 描述 | 待启动/停止的 DataNode 节点所在主机的 IP 列表,如果有多个需要用“,”分隔。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* ssh\_account - -| 名字 | ssh\_account | -| :----------------: | :------------------------------------------------------------- | -| 描述 | 通过 SSH 登陆目标主机的用户名,需要所有的主机的用户名都相同 | -| 类型 | String | -| 默认值 | root | -| 改后生效方式 | 重启服务生效 | - -* ssh\_port - -| 名字 | ssh\_port | -| :----------------: | :--------------------------------------------------------- | -| 描述 | 目标主机对外暴露的 SSH 端口,需要所有的主机的端口都相同 | -| 类型 | int | -| 默认值 | 22 | -| 改后生效方式 | 重启服务生效 | - -* confignode\_deploy\_path - -| 名字 | confignode\_deploy\_path | -| :----------------: | :---------------------------------------------------------------------------------------------------------------- | -| 描述 | 待启动/停止的所有 ConfigNode 所在目标主机的路径,需要所有待启动/停止的 ConfigNode 节点在目标主机的相同目录下。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* datanode\_deploy\_path - -| 名字 | datanode\_deploy\_path | -| :----------------: | 
:------------------------------------------------------------------------------------------------------------ | -| 描述 | 待启动/停止的所有 DataNode 所在目标主机的路径,需要所有待启动/停止的 DataNode 节点在目标主机的相同目录下。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - - - -## 节点维护步骤 - -### ConfigNode节点维护 - -ConfigNode节点维护分为ConfigNode添加和移除两种操作,有两个常见使用场景: -- 集群扩展:如集群中只有1个ConfigNode时,希望增加ConfigNode以提升ConfigNode节点高可用性,则可以添加2个ConfigNode,使得集群中有3个ConfigNode。 -- 集群故障恢复:1个ConfigNode所在机器发生故障,使得该ConfigNode无法正常运行,此时可以移除该ConfigNode,然后添加一个新的ConfigNode进入集群。 - -> ❗️注意,在完成ConfigNode节点维护后,需要保证集群中有1或者3个正常运行的ConfigNode。2个ConfigNode不具备高可用性,超过3个ConfigNode会导致性能损失。 - -#### 添加ConfigNode节点 - -脚本命令: -```shell -# Linux / MacOS -# 首先切换到IoTDB根目录 -sbin/start-confignode.sh - -# Windows -# 首先切换到IoTDB根目录 -sbin/start-confignode.bat -``` - -参数介绍: - -| 参数 | 描述 | 是否为必填项 | -| :--- | :--------------------------------------------- | :----------- | -| -v | 显示版本信息 | 否 | -| -f | 在前台运行脚本,不将其放到后台 | 否 | -| -d | 以守护进程模式启动,即在后台运行 | 否 | -| -p | 指定一个文件来存放进程ID,用于进程管理 | 否 | -| -c | 指定配置文件夹的路径,脚本会从这里加载配置文件 | 否 | -| -g | 打印垃圾回收(GC)的详细信息 | 否 | -| -H | 指定Java堆转储文件的路径,当JVM内存溢出时使用 | 否 | -| -E | 指定JVM错误日志文件的路径 | 否 | -| -D | 定义系统属性,格式为 key=value | 否 | -| -X | 直接传递 -XX 参数给 JVM | 否 | -| -h | 帮助指令 | 否 | - -#### 移除ConfigNode节点 - -首先通过CLI连接集群,通过`show confignodes`确认想要移除ConfigNode的内部地址与端口号: - -```Bash -IoTDB> show confignodes -+------+-------+---------------+------------+--------+ -|NodeID| Status|InternalAddress|InternalPort| Role| -+------+-------+---------------+------------+--------+ -| 0|Running| 127.0.0.1| 10710| Leader| -| 1|Running| 127.0.0.1| 10711|Follower| -| 2|Running| 127.0.0.1| 10712|Follower| -+------+-------+---------------+------------+--------+ -Total line number = 3 -It costs 0.030s -``` - -然后使用脚本将ConfigNode移除。脚本命令: - -```Bash -# Linux / MacOS -sbin/remove-confignode.sh [confignode_id] - -#Windows -sbin/remove-confignode.bat [confignode_id] - -``` - -### DataNode节点维护 - -DataNode节点维护有两个常见场景: - -- 集群扩容:出于集群能力扩容等目的,添加新的DataNode进入集群 -- 
集群故障恢复:一个DataNode所在机器出现故障,使得该DataNode无法正常运行,此时可以移除该DataNode,并添加新的DataNode进入集群 - -> ❗️注意,为了使集群能正常工作,在DataNode节点维护过程中以及维护完成后,正常运行的DataNode总数不得少于数据副本数(通常为2),也不得少于元数据副本数(通常为3)。 - -#### 添加DataNode节点 - -脚本命令: - -```Bash -# Linux / MacOS -# 首先切换到IoTDB根目录 -sbin/start-datanode.sh - -# Windows -# 首先切换到IoTDB根目录 -sbin/start-datanode.bat -``` - -参数介绍: - -| 缩写 | 描述 | 是否为必填项 | -| :--- | :--------------------------------------------- | :----------- | -| -v | 显示版本信息 | 否 | -| -f | 在前台运行脚本,不将其放到后台 | 否 | -| -d | 以守护进程模式启动,即在后台运行 | 否 | -| -p | 指定一个文件来存放进程ID,用于进程管理 | 否 | -| -c | 指定配置文件夹的路径,脚本会从这里加载配置文件 | 否 | -| -g | 打印垃圾回收(GC)的详细信息 | 否 | -| -H | 指定Java堆转储文件的路径,当JVM内存溢出时使用 | 否 | -| -E | 指定JVM错误日志文件的路径 | 否 | -| -D | 定义系统属性,格式为 key=value | 否 | -| -X | 直接传递 -XX 参数给 JVM | 否 | -| -h | 帮助指令 | 否 | - -说明:在添加DataNode后,随着新的写入到来(以及旧数据过期,如果设置了TTL),集群负载会逐渐向新的DataNode均衡,最终在所有节点上达到存算资源的均衡。 - -#### 移除DataNode节点 - -首先通过CLI连接集群,通过`show datanodes`确认想要移除的DataNode的RPC地址与端口号: - -```Bash -IoTDB> show datanodes -+------+-------+----------+-------+-------------+---------------+ -|NodeID| Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum| -+------+-------+----------+-------+-------------+---------------+ -| 1|Running| 0.0.0.0| 6667| 0| 0| -| 2|Running| 0.0.0.0| 6668| 1| 1| -| 3|Running| 0.0.0.0| 6669| 1| 0| -+------+-------+----------+-------+-------------+---------------+ -Total line number = 3 -It costs 0.110s -``` - -然后使用脚本将DataNode移除。脚本命令: - -```Bash -# Linux / MacOS -sbin/remove-datanode.sh [datanode_id] - -#Windows -sbin/remove-datanode.bat [datanode_id] -``` - -## 常见问题 - -1. 部署过程中多次提示激活失败 - - 使用 `ls -al` 命令:使用 `ls -al` 命令检查安装包根目录的所有者信息是否为当前用户。 - - 检查激活目录:检查 `./activation` 目录下的所有文件,所有者信息是否为当前用户。 - -2. Confignode节点启动失败 - - 步骤 1: 请查看启动日志,检查是否修改了某些首次启动后不可改的参数。 - - 步骤 2: 请查看启动日志,检查是否出现其他异常。日志中若存在异常现象,请联系天谋技术支持人员咨询解决方案。 - - 步骤 3: 如果是首次部署或者数据可删除,也可按下述步骤清理环境,重新部署后,再次启动。 - - 步骤 4: 清理环境: - - a. 结束所有 ConfigNode 和 DataNode 进程。 - - ```Bash - # 1. 停止 ConfigNode 和 DataNode 服务 - sbin/stop-standalone.sh - - # 2. 
检查是否还有进程残留 - jps - # 或者 - ps -ef|grep iotdb - - # 3. 如果有进程残留,则手动kill - kill -9 - # 如果确定机器上仅有1个iotdb,可以使用下面命令清理残留进程 - ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9 - ``` - b. 删除 data 和 logs 目录。 - - 说明:删除 data 目录是必要的,删除 logs 目录是为了纯净日志,非必需。 - ```Bash - cd /data/iotdb - rm -rf data logs - ``` \ No newline at end of file diff --git a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Docker-Deployment_timecho.md b/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Docker-Deployment_timecho.md deleted file mode 100644 index 741238903..000000000 --- a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Docker-Deployment_timecho.md +++ /dev/null @@ -1,495 +0,0 @@ - -# Docker部署 - -## 环境准备 - -### Docker安装 - -```Bash -#以ubuntu为例,其他操作系统可以自行搜索安装方法 -#step1: 安装一些必要的系统工具 -sudo apt-get update -sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common -#step2: 安装GPG证书 -curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add - -#step3: 写入软件源信息 -sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" -#step4: 更新并安装Docker-CE -sudo apt-get -y update -sudo apt-get -y install docker-ce -#step5: 设置docker开机自启动 -sudo systemctl enable docker -#step6: 验证docker是否安装成功 -docker --version #显示版本信息,即安装成功 -``` - -### docker-compose安装 - -```Bash -#安装命令 -curl -L "https://github.com/docker/compose/releases/download/v2.20.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose -chmod +x /usr/local/bin/docker-compose -ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose -#验证是否安装成功 -docker-compose --version #显示版本信息即安装成功 -``` - -### 安装dmidecode插件 - -默认情况下,linux服务器应该都已安装,如果没有安装的话,可以使用下面的命令安装。 - -```Bash -sudo apt-get install dmidecode -``` - -dmidecode 安装后,查找安装路径:`whereis dmidecode`,这里假设结果为`/usr/sbin/dmidecode`,记住该路径,后面的docker-compose的yml文件会用到。 - -### 获取IoTDB的容器镜像 - -关于IoTDB企业版的容器镜像您可联系商务或技术支持获取。 - -## 单机版部署 - 
-本节演示如何部署1C1D的docker单机版。 - -### load 镜像文件 - -比如这里获取的IoTDB的容器镜像文件名是:`iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz` - -load镜像: - -```Bash -docker load -i iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz -``` - -查看镜像: - -```Bash -docker images -``` - -![](/img/%E5%8D%95%E6%9C%BA-%E6%9F%A5%E7%9C%8B%E9%95%9C%E5%83%8F.png) - -### 创建docker bridge网络 - -```Bash -docker network create --driver=bridge --subnet=172.18.0.0/16 --gateway=172.18.0.1 iotdb -``` - -### 编写docker-compose的yml文件 - -这里我们以把IoTDB安装目录和yml文件统一放在`/docker-iotdb` 文件夹下为例: - -文件目录结构为:`/docker-iotdb/iotdb`, `/docker-iotdb/docker-compose-standalone.yml ` - -```Bash -docker-iotdb: -├── iotdb #iotdb安装目录 -│── docker-compose-standalone.yml #单机版docker-compose的yml文件 -``` - -完整的`docker-compose-standalone.yml`内容如下: - -```Bash -version: "3" -services: - iotdb-service: - image: iotdb-enterprise:1.3.2.3-standalone #使用的镜像 - hostname: iotdb - container_name: iotdb - restart: always - ports: - - "6667:6667" - environment: - - cn_internal_address=iotdb - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb:10710 - - dn_rpc_address=iotdb - - dn_internal_address=iotdb - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - dn_seed_config_node=iotdb:10710 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - networks: - iotdb: - ipv4_address: 172.18.0.6 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
-    # If you see that line for a long time, lower the nofile limit by uncommenting below:
-    # ulimits:
-    #   nofile:
-    #     soft: 1048576
-    #     hard: 1048576
-networks:
-  iotdb:
-    external: true
-```
-
-### First Startup
-
-Start with the following commands:
-
-```Bash
-cd /docker-iotdb
-docker-compose -f docker-compose-standalone.yml up
-```
-
-Because the database has not been activated yet, it exits immediately on the first startup. This is expected: the first startup only generates the machine code file used in the activation process below.
-
-![](/img/%E5%8D%95%E6%9C%BA-%E6%BF%80%E6%B4%BB.png)
-
-### Apply for Activation
-
-- After the first startup, a `system_info` file is generated in the host directory `/docker-iotdb/iotdb/activation`. Send this file to Timecho staff.
-
-  ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png)
-
-- After receiving the license file returned by the staff, copy it into the `/docker-iotdb/iotdb/activation` folder.
-
-  ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png)
-
-### Start IoTDB Again
-
-```Bash
-docker-compose -f docker-compose-standalone.yml up -d
-```
-
-![](/img/%E5%90%AF%E5%8A%A8iotdb.png)
-
-### Verify the Deployment
-
-- Check the logs. The following message indicates that startup succeeded:
-
-```Bash
-docker logs -f iotdb    # view the logs; "iotdb" is the container_name set in docker-compose-standalone.yml
-2024-07-19 12:02:32,608 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
-``` - -![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B21.png) - -- 进入容器,查看服务运行状态及激活信息 - - 查看启动的容器 - - ```Bash - docker ps - ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B22.png) - - 进入容器, 通过cli登录数据库, 使用show cluster命令查看服务状态及激活状态 - - ```Bash - docker exec -it iotdb /bin/bash #进入容器 - ./start-cli.sh -h iotdb #登录数据库 - IoTDB> show cluster #查看状态 - ``` - - 可以看到服务都是running,激活状态显示已激活。 - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B23.png) - -### 映射/conf目录(可选) - -后续如果想在物理机中直接修改配置文件,可以把容器中的/conf文件夹映射出来,分三步: - -步骤一:拷贝容器中的/conf目录到`/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -步骤二:在docker-compose-standalone.yml中添加映射 - -```Bash - volumes: - - ./iotdb/conf:/iotdb/conf #增加这个/conf文件夹的映射 - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -步骤三:重新启动IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -## 集群版部署 - -本小节描述如何手动部署包括3个ConfigNode和3个DataNode的实例,即通常所说的3C3D集群。 - -
- -**注意:集群版目前只支持host网络和overlay 网络,不支持bridge网络。** - -下面以host网络为例演示如何部署3C3D集群。 - -### 设置主机名 - -假设现在有3台linux服务器,IP地址和服务角色分配如下: - -| 节点ip | 主机名 | 服务 | -| ----------- | ------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode、DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode、DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode、DataNode | - -在3台机器上分别配置主机名,设置主机名需要在目标服务器上配置/etc/hosts,使用如下命令: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### load镜像文件 - -比如获取的IoTDB的容器镜像文件名是:`iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz` - -在3台服务器上分别执行load镜像命令: - -```Bash -docker load -i iotdb-enterprise-1.3.2.3-standalone-docker.tar.gz -``` - -查看镜像: - -```Bash -docker images -``` - -![](/img/%E9%95%9C%E5%83%8F%E5%8A%A0%E8%BD%BD.png) - -### 编写docker-compose的yml文件 - -这里我们以把IoTDB安装目录和yml文件统一放在/docker-iotdb文件夹下为例: - -文件目录结构为:`/docker-iotdb/iotdb`,`/docker-iotdb/confignode.yml`,`/docker-iotdb/datanode.yml` - -```Bash -docker-iotdb: -├── confignode.yml #confignode的yml文件 -├── datanode.yml #datanode的yml文件 -└── iotdb #IoTDB安装目录 -``` - -在每台服务器上都要编写2个yml文件,即`confignode.yml`和`datanode.yml`,yml示例如下: - -**confignode.yml:** - -```Bash -#confignode.yml -version: "3" -services: - iotdb-confignode: - image: iotdb-enterprise:1.3.2.3-standalone #使用的镜像 - hostname: iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - container_name: iotdb-confignode - command: ["bash", "-c", "entrypoint.sh confignode"] - restart: always - environment: - - cn_internal_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb-1:10710 #默认第一台为seed节点 - - schema_replication_factor=3 #元数据副本数 - - data_replication_factor=2 #数据副本数 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #使用host网络 - # Note: 
Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -**datanode.yml:** - -```Bash -#datanode.yml -version: "3" -services: - iotdb-datanode: - image: iotdb-enterprise:1.3.2.3-standalone #使用的镜像 - hostname: iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - container_name: iotdb-datanode - command: ["bash", "-c", "entrypoint.sh datanode"] - restart: always - ports: - - "6667:6667" - privileged: true - environment: - - dn_rpc_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - dn_internal_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - dn_seed_config_node=iotdb-1:10710 #默认第1台为seed节点 - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - schema_replication_factor=3 #元数据副本数 - - data_replication_factor=2 #数据副本数 - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #使用host网络 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -### 首次启动confignode - -先在3台服务器上分别启动confignode, 用来获取机器码,注意启动顺序,先启动第1台iotdb-1,再启动iotdb-2和iotdb-3。 - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d #后台启动 -``` - -### 申请激活 - -- 首次启动3个confignode后,在每个物理机目录`/docker-iotdb/iotdb/activation`下都会生成一个`system_info`文件,将3个服务器的`system_info`文件拷贝给天谋工作人员; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- 将3个license文件分别放入对应的ConfigNode节点的`/docker-iotdb/iotdb/activation`文件夹下; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -- license放入对应的activation文件夹后,confignode会自动激活,不用重启confignode - -### 启动datanode - -在3台服务器上分别启动datanode - -```Bash -cd /docker-iotdb -docker-compose -f datanode.yml up -d #后台启动 -``` - -![](/img/%E9%9B%86%E7%BE%A4%E7%89%88-dn%E5%90%AF%E5%8A%A8.png) - -### 验证部署 - -- 查看日志,有如下字样,表示datanode启动成功 - - ```Bash - docker logs -f iotdb-datanode #查看日志命令 - 2024-07-20 16:50:48,937 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
- ``` - - ![](/img/dn%E5%90%AF%E5%8A%A8.png) - -- 进入任意一个容器,查看服务运行状态及激活信息 - - 查看启动的容器 - - ```Bash - docker ps - ``` - - ![](/img/%E6%9F%A5%E7%9C%8B%E5%AE%B9%E5%99%A8.png) - - 进入容器,通过cli登录数据库,使用`show cluster`命令查看服务状态及激活状态 - - ```Bash - docker exec -it iotdb-datanode /bin/bash #进入容器 - ./start-cli.sh -h iotdb-1 #登录数据库 - IoTDB> show cluster #查看状态 - ``` - - 可以看到服务都是running,激活状态显示已激活。 - - ![](/img/%E9%9B%86%E7%BE%A4-%E6%BF%80%E6%B4%BB.png) - -### 映射/conf目录(可选) - -后续如果想在物理机中直接修改配置文件,可以把容器中的/conf文件夹映射出来,分三步: - -步骤一:在3台服务器中分别拷贝容器中的/conf目录到`/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb-confignode:/iotdb/conf /docker-iotdb/iotdb/conf -或者 -docker cp iotdb-datanode:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -步骤二:在3台服务器的`confignode.yml`和`datanode.yml`中添加/conf目录映射 - -```Bash -#confignode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #增加这个/conf文件夹的映射 - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - -#datanode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #增加这个/conf文件夹的映射 - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -步骤三:在3台服务器上重新启动IoTDB - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d -docker-compose -f datanode.yml up -d -``` \ No newline at end of file diff --git a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md b/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md deleted file mode 100644 index 545e0ca75..000000000 --- a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md +++ /dev/null @@ -1,163 +0,0 @@ - -# 双活版部署 - -## 什么是双活版? 
- -双活通常是指两个独立的单机(或集群),实时进行镜像同步,它们的配置完全独立,可以同时接收外界的写入,每一个独立的单机(或集群)都可以将写入到自己的数据同步到另一个单机(或集群)中,两个单机(或集群)的数据可达到最终一致。 - -- 两个单机(或集群)可构成一个高可用组:当其中一个单机(或集群)停止服务时,另一个单机(或集群)不会受到影响。当停止服务的单机(或集群)再次启动时,另一个单机(或集群)会将新写入的数据同步过来。业务可以绑定两个单机(或集群)进行读写,从而达到高可用的目的。 -- 双活部署方案允许在物理节点少于 3 的情况下实现高可用,在部署成本上具备一定优势。同时可以通过电力、网络的双环网,实现两套单机(或集群)的物理供应隔离,保障运行的稳定性。 -- 目前双活能力为企业版功能。 - -![](/img/%E5%8F%8C%E6%B4%BB%E5%90%8C%E6%AD%A5.png) - -## 注意事项 - -1. 部署时推荐优先使用`hostname`进行IP配置,可避免后期修改主机ip导致数据库无法启动的问题。设置hostname需要在目标服务器上配置`/etc/hosts`,如本机ip是192.168.1.3,hostname是iotdb-1,则可以使用以下命令设置服务器的 hostname,并使用hostname配置IoTDB的`cn_internal_address`、`dn_internal_address`。 - - ```Bash - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -2. 有些参数首次启动后不能修改,请参考下方的"安装步骤"章节来进行设置。 - -3. 推荐部署监控面板,可以对重要运行指标进行监控,随时掌握数据库运行状态,监控面板可以联系商务获取,部署监控面板步骤可以参考[文档](https://www.timecho.com/docs/zh/UserGuide/latest/Deployment-and-Maintenance/Monitoring-panel-deployment.html) - -## 安装步骤 - -我们以两台单机A和B构建的双活版IoTDB为例,A和B的ip分别是192.168.1.3 和 192.168.1.4 ,这里用hostname来表示不同的主机,规划如下: - -| 机器 | 机器ip | 主机名 | -| ---- | ----------- | ------- | -| A | 192.168.1.3 | iotdb-1 | -| B | 192.168.1.4 | iotdb-2 | - -### Step1:分别安装两套独立的 IoTDB - -在2个机器上分别安装 IoTDB,单机版部署文档可参考[文档](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md),集群版部署文档可参考[文档](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md)。**推荐 A、B 集群的各项配置保持一致,以实现最佳的双活效果。** - -### Step2:在机器A上创建数据同步任务至机器B - -- 在机器A上创建数据同步流程,即机器A上的数据自动同步到机器B,使用sbin目录下的cli工具连接A上的IoTDB数据库: - - ```Bash - ./sbin/start-cli.sh -h iotdb-1 - ``` - -- 创建并启动数据同步命令,SQL 如下: - - ```Bash - create pipe AB - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-2', - 'sink.port'='6667' - ) - ``` - -- 注意:为了避免数据无限循环,需要将A和B上的参数`source.forwarding-pipe-requests` 均设置为 `false`,表示不转发从另一pipe传输而来的数据。 - -### Step3:在机器B上创建数据同步任务至机器A - - - 在机器B上创建数据同步流程,即机器B上的数据自动同步到机器A,使用sbin目录下的cli工具连接B上的IoTDB数据库: - - ```Bash - ./sbin/start-cli.sh -h iotdb-2 - ``` - - 
创建并启动pipe,SQL 如下: - - ```Bash - create pipe BA - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-1', - 'sink.port'='6667' - ) - ``` - -- 注意:为了避免数据无限循环,需要将A和B上的参数`source.forwarding-pipe-requests` 均设置为 `false`,表示不转发从另一pipe传输而来的数据。 - -### Step4:验证部署 - -上述数据同步流程创建完成后,即可启动双活集群。 - -#### 检查集群运行状态 - -```Bash -#在2个节点分别执行show cluster命令检查IoTDB服务状态 -show cluster -``` - -**机器A**: - -![](/img/%E5%8F%8C%E6%B4%BB-A.png) - -**机器B**: - -![](/img/%E5%8F%8C%E6%B4%BB-B.png) - -确保每一个 ConfigNode 和 DataNode 都处于 Running 状态。 - -#### 检查同步状态 - -- 机器A上检查同步状态 - -```Bash -show pipes -``` - -![](/img/show%20pipes-A.png) - -- 机器B上检查同步状态 - -```Bash -show pipes -``` - -![](/img/show%20pipes-B.png) - -确保每一个 pipe 都处于 RUNNING 状态。 - -### Step5:停止双活版 IoTDB - -- 在机器A的执行下列命令: - - ```SQL - ./sbin/start-cli.sh -h iotdb-1 #登录cli - IoTDB> stop pipe AB #停止数据同步流程 - ./sbin/stop-standalone.sh #停止数据库服务 - ``` - -- 在机器B的执行下列命令: - - ```SQL - ./sbin/start-cli.sh -h iotdb-2 #登录cli - IoTDB> stop pipe BA #停止数据同步流程 - ./sbin/stop-standalone.sh #停止数据库服务 - ``` \ No newline at end of file diff --git a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/IoTDB-Package_timecho.md b/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/IoTDB-Package_timecho.md deleted file mode 100644 index f824da365..000000000 --- a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/IoTDB-Package_timecho.md +++ /dev/null @@ -1,46 +0,0 @@ - -# 安装包获取 - -## 企业版获取方式 - -企业版安装包可通过产品试用申请,或直接联系与您对接的商务人员获取。 - -## 安装包结构 - -解压后安装包(iotdb-enterprise-{version}-bin.zip),安装包解压后目录结构如下: - -| **目录** | **类型** | **说明** | -| ---------------- | -------- | ------------------------------------------------------------ | -| activation | 文件夹 | 激活文件所在目录,包括生成的机器码以及从商务侧获取的企业版激活码(启动ConfigNode后才会生成该目录,即可获取激活码) | -| conf | 文件夹 | 配置文件目录,包含 ConfigNode、DataNode、JMX 和 logback 等配置文件 | -| data | 文件夹 | 默认的数据文件目录,包含 ConfigNode 和 DataNode 的数据文件。(启动程序后才会生成该目录) | -| lib | 文件夹 | IoTDB可执行库文件目录 | -| licenses | 文件夹 | 
开源社区证书文件目录 | -| logs | 文件夹 | 默认的日志文件目录,包含 ConfigNode 和 DataNode 的日志文件(启动程序后才会生成该目录) | -| sbin | 文件夹 | 主要脚本目录,包含启、停等脚本等 | -| tools | 文件夹 | 系统周边工具目录 | -| ext | 文件夹 | pipe,trigger,udf插件的相关文件(需要使用时用户自行创建) | -| LICENSE | 文件 | 证书 | -| NOTICE | 文件 | 提示 | -| README_ZH\.md | 文件 | markdown格式的中文版说明 | -| README\.md | 文件 | 使用说明 | -| RELEASE_NOTES\.md | 文件 | 版本说明 | diff --git a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Kubernetes_timecho.md b/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Kubernetes_timecho.md deleted file mode 100644 index 7fbc7be8d..000000000 --- a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Kubernetes_timecho.md +++ /dev/null @@ -1,445 +0,0 @@ - - -# Kubernetes - -## 1. 环境准备 - -### 1.1 准备 Kubernetes 集群 - -确保拥有一个可用的 Kubernetes 集群(建议最低版本:Kubernetes 1.24),作为部署 IoTDB 集群的基础。 - -Kubernetes 版本要求:建议版本为 Kubernetes 1.24及以上 - -IoTDB版本要求:不能低于v1.3.3 - -## 2. 创建命名空间 - -### 2.1 创建命名空间 - -> 注意:在执行命名空间创建操作之前,需验证所指定的命名空间名称在 Kubernetes 集群中尚未被使用。如果命名空间已存在,创建命令将无法执行,可能导致部署过程中的错误。 - -```Bash -kubectl create ns iotdb-ns -``` - -### 2.2 查看命名空间 - -```Bash -kubectl get ns -``` - -## 3. 创建 PersistentVolume (PV) - -### 3.1 创建 PV 配置文件 - -PV用于持久化存储IoTDB的ConfigNode 和 DataNode的数据,有几个节点就要创建几个PV。 - -> 注:1个ConfigNode和1个DataNode 算2个节点,需要2个PV。 - -以 3ConfigNode、3DataNode 为例: - -1. 创建 `pv.yaml` 文件,并复制六份,分别重命名为 `pv01.yaml` ~ `pv06.yaml`。 - -```Bash -#可新建个文件夹放yaml文件 -#创建 pv.yaml 文件语句 -touch pv.yaml -``` - -2. 
修改每个文件中的 `name` 和 `path` 以确保一致性。 - -**pv.yaml 示例:** - -```YAML -# pv.yaml -apiVersion: v1 -kind: PersistentVolume -metadata: - name: iotdb-pv-01 -spec: - capacity: - storage: 10Gi # 存储容量 - accessModes: # 访问模式 - - ReadWriteOnce - persistentVolumeReclaimPolicy: Retain # 回收策略 - # 存储类名称,如果使用本地静态存储storageClassName 不用配置,如果使用动态存储必需设置此项 - storageClassName: local-storage - # 根据你的存储类型添加相应的配置 - hostPath: # 如果是使用本地路径 - path: /data/k8s-data/iotdb-pv-01 - type: DirectoryOrCreate # 这行不配置就要手动创建文件夹 -``` - -### 3.2 应用 PV 配置 - -```Bash -kubectl apply -f pv01.yaml -kubectl apply -f pv-02.yaml -... -``` - -### 3.3 查看 PV - -```Bash -kubectl get pv -``` - - -### 3.4 手动创建文件夹 - -> 如果yaml里的hostPath-type未配置,需要手动创建对应的文件夹 - -在所有 Kubernetes 节点上创建对应的文件夹: - -```Bash -mkdir -p /data/k8s-data/iotdb-pv-01 -mkdir -p /data/k8s-data/iotdb-pv-02 -... -``` - -## 4. 安装 Helm - -安装Helm步骤请参考[Helm官网](https://helm.sh/zh/docs/intro/install/) - -## 5. 配置IoTDB的Helm Chart - -### 5.1 克隆 IoTDB Kubernetes 部署代码 - -请联系天谋工作人员获取IoTDB的Helm Chart - -### 5.2 修改 YAML 文件 - -> 确保使用的是支持的版本 >=1.3.3.2 - -**values.yaml 示例:** - -```YAML -nameOverride: "iotdb" -fullnameOverride: "iotdb" #软件安装后的名称 - -image: - repository: nexus.infra.timecho.com:8143/timecho/iotdb-enterprise - pullPolicy: IfNotPresent - tag: 1.3.3.2-standalone #软件所用的仓库和版本 - -storage: -# 存储类名称,如果使用本地静态存储storageClassName 不用配置,如果使用动态存储必需设置此项 - className: local-storage - -datanode: - name: datanode - nodeCount: 3 #datanode的节点数量 - enableRestService: true - storageCapacity: 10Gi #datanode的可用空间大小 - resources: - requests: - memory: 2Gi #datanode的内存初始化大小 - cpu: 1000m #datanode的CPU初始化大小 - limits: - memory: 4Gi #datanode的最大内存大小 - cpu: 1000m #datanode的最大CPU大小 - -confignode: - name: confignode - nodeCount: 3 #confignode的节点数量 - storageCapacity: 10Gi #confignode的可用空间大小 - resources: - requests: - memory: 512Mi #confignode的内存初始化大小 - cpu: 1000m #confignode的CPU初始化大小 - limits: - memory: 1024Mi #confignode的最大内存大小 - cpu: 2000m #confignode的最大CPU大小 - configNodeConsensusProtocolClass: 
org.apache.iotdb.consensus.ratis.RatisConsensus - schemaReplicationFactor: 3 - schemaRegionConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - dataReplicationFactor: 2 - dataRegionConsensusProtocolClass: org.apache.iotdb.consensus.iot.IoTConsensus -``` - -## 6. 配置私库信息或预先使用ctr拉取镜像 - -在k8s上配置私有仓库的信息,为下一步helm install的前置步骤。 - -方案一即在 helm install 时拉取可用的iotdb镜像,方案二则是提前将可用的iotdb镜像导入到containerd里。 - -### 6.1 【方案一】从私有仓库拉取镜像 - -#### 6.1.1 创建secret 使k8s可访问iotdb-helm的私有仓库 - -下文中“xxxxxx”表示IoTDB私有仓库的账号、密码、邮箱。 - -```Bash -# 注意 单引号 -kubectl create secret docker-registry timecho-nexus \ - --docker-server='nexus.infra.timecho.com:8143' \ - --docker-username='xxxxxx' \ - --docker-password='xxxxxx' \ - --docker-email='xxxxxx' \ - -n iotdb-ns - -# 查看secret -kubectl get secret timecho-nexus -n iotdb-ns -# 查看并输出为yaml -kubectl get secret timecho-nexus --output=yaml -n iotdb-ns -# 查看并解密 -kubectl get secret timecho-nexus --output="jsonpath={.data.\.dockerconfigjson}" -n iotdb-ns | base64 --decode -``` - -#### 6.1.2 将secret作为一个patch加载到命名空间iotdb-ns - -```Bash -# 添加一个patch,使该命名空间增加登陆nexus的登陆信息 -kubectl patch serviceaccount default -n iotdb-ns -p '{"imagePullSecrets": [{"name": "timecho-nexus"}]}' - -# 查看命名空间的该条信息 -kubectl get serviceaccounts -n iotdb-ns -o yaml -``` - -### 6.2 【方案二】导入镜像 - -该步骤用于客户无法连接私库的场景,需要联系公司实施同事辅助准备。 - -#### 6.2.1 拉取并导出镜像: - -```Bash -ctr images pull --user xxxxxxxx nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.2 查看并导出镜像: - -```Bash -# 查看 -ctr images ls - -# 导出 -ctr images export iotdb-enterprise:1.3.3.2-standalone.tar nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.3 导入到k8s的namespace下: - -> 注意,k8s.io为示例环境中k8s的ctr的命名空间,导入到其他命名空间是不行的 - -```Bash -# 导入到k8s的namespace下 -ctr -n k8s.io images import iotdb-enterprise:1.3.3.2-standalone.tar -``` - -#### 6.2.4 查看镜像 - -```Bash -ctr --namespace k8s.io images list | grep 1.3.3.2 -``` - -## 7. 
安装 IoTDB - -### 7.1 安装 IoTDB - -```Bash -# 进入文件夹 -cd iotdb-cluster-k8s/helm - -# 安装iotdb -helm install iotdb ./ -n iotdb-ns -``` - -### 7.2 查看 Helm 安装列表 - -```Bash -# helm list -helm list -n iotdb-ns -``` - -### 7.3 查看 Pods - -```Bash -# 查看 iotdb的pods -kubectl get pods -n iotdb-ns -o wide -``` - -执行命令后,输出了带有confignode和datanode标识的各3个Pods,,总共6个Pods,即表明安装成功;需要注意的是,并非所有Pods都处于Running状态,未激活的datanode可能会持续重启,但在激活后将恢复正常。 - -### 7.4 发现故障的排除方式 - -```Bash -# 查看k8s的创建log -kubectl get events -n iotdb-ns -watch kubectl get events -n iotdb-ns - -# 获取详细信息 -kubectl describe pod confignode-0 -n iotdb-ns -kubectl describe pod datanode-0 -n iotdb-ns - -# 查看confignode日志 -kubectl logs -n iotdb-ns confignode-0 -f -``` - -## 8. 激活 IoTDB - -### 8.1 方案1:直接在 Pod 中激活(最快捷) - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-1 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-2 -- /iotdb/sbin/start-activate.sh -# 拿到机器码后进行激活 -``` - -### 8.2 方案2:进入confignode的容器中激活 - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /bin/bash -cd /iotdb/sbin -/bin/bash start-activate.sh -# 拿到机器码后进行激活 -# 退出容器 -``` - -### 8.3 方案3:手动激活 - -1. 查看 ConfigNode 详细信息,确定所在节点: - -```Bash -kubectl describe pod confignode-0 -n iotdb-ns | grep -e "Node:" -e "Path:" - -# 结果示例: -# Node: a87/172.20.31.87 -# Path: /data/k8s-data/env/confignode/.env -``` - -2. 查看 PVC 并找到 ConfigNode 对应的 Volume,确定所在路径: - -```Bash -kubectl get pvc -n iotdb-ns | grep "confignode-0" - -# 结果示例: -# map-confignode-confignode-0 Bound iotdb-pv-04 10Gi RWO local-storage 8h - -# 如果要查看多个confignode,使用如下: -for i in {0..2}; do echo confignode-$i;kubectl describe pod confignode-${i} -n iotdb-ns | grep -e "Node:" -e "Path:"; echo "----"; done -``` - -3. 查看对应 Volume 的详细信息,确定物理目录的位置: - -```Bash -kubectl describe pv iotdb-pv-04 | grep "Path:" - -# 结果示例: -# Path: /data/k8s-data/iotdb-pv-04 -``` - -4. 
从对应节点的对应目录下找到 system-info 文件,使用该 system-info 作为机器码生成激活码,并在同级目录新建文件 license,将激活码写入到该文件。 - -## 9. 验证 IoTDB - -### 9.1 查看命名空间内的 Pods 状态 - -查看iotdb-ns命名空间内的IP、状态等信息,确定全部运行正常 - -```Bash -kubectl get pods -n iotdb-ns -o wide - -# 结果示例: -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` - -### 9.2 查看命名空间内的端口映射情况 - -```Bash -kubectl get svc -n iotdb-ns - -# 结果示例: -# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE -# confignode-svc NodePort 10.10.226.151 80:31026/TCP 7d8h -# datanode-svc NodePort 10.10.194.225 6667:31563/TCP 7d8h -# jdbc-balancer LoadBalancer 10.10.191.209 6667:31895/TCP 7d8h -``` - -### 9.3 在任意服务器启动 CLI 脚本验证 IoTDB 集群状态 - -端口即jdbc-balancer的端口,服务器为k8s任意节点的IP - -```Bash -start-cli.sh -h 172.20.31.86 -p 31895 -start-cli.sh -h 172.20.31.87 -p 31895 -start-cli.sh -h 172.20.31.88 -p 31895 -``` - - - -## 10. 扩容 - -### 10.1 新增pv - -新增pv,必须有可用的pv才可以扩容。 - - - -**注意:DataNode重启后无法加入集群** - -**原因**:配置了静态存储的 hostPath 模式,并通过脚本修改了 `iotdb-system.properties` 文件,将 `dn_data_dirs` 设为 `/iotdb6/iotdb_data,/iotdb7/iotdb_data`,但未将默认存储路径 `/iotdb/data` 进行外挂,导致重启时数据丢失。 - -**解决方案**:是将 `/iotdb/data` 目录也进行外挂操作,且 ConfigNode 和 DataNode 均需如此设置,以确保数据完整性和集群稳定性。 - -### 10.2 扩容confignode - -示例:3 confignode 扩容为 4 confignode - -修改iotdb-cluster-k8s/helm的values.yaml文件,将confignode的3改成4 - -```Shell -helm upgrade iotdb . -n iotdb-ns -``` - - - - -### 10.3 扩容datanode - -示例:3 datanode 扩容为 4 datanode - -修改iotdb-cluster-k8s/helm的values.yaml文件,将datanode的3改成4 - -```Shell -helm upgrade iotdb . 
-n iotdb-ns -``` - -### 10.4 验证IoTDB状态 - -```Shell -kubectl get pods -n iotdb-ns -o wide - -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -# datanode-3 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` \ No newline at end of file diff --git a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md b/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md deleted file mode 100644 index ee6c78f66..000000000 --- a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md +++ /dev/null @@ -1,325 +0,0 @@ - -# 单机版部署 - -本章将介绍如何启动IoTDB单机实例,IoTDB单机实例包括 1 个ConfigNode 和1个DataNode(即通常所说的1C1D)。 - -## 注意事项 - -1. 安装前请确认系统已参照[系统配置](./Environment-Requirements.md)准备完成。 - -2. 部署时推荐优先使用`hostname`进行IP配置,可避免后期修改主机ip导致数据库无法启动的问题。设置hostname需要在目标服务器上配置/etc/hosts,如本机ip是192.168.1.3,hostname是iotdb-1,则可以使用以下命令设置服务器的 hostname,并使用hostname配置IoTDB的`cn_internal_address`、dn_internal_address、dn_rpc_address。 - - ```shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -3. 部分参数首次启动后不能修改,请参考下方的【参数配置】章节进行设置 - -4. 无论是在linux还是windows中,请确保IoTDB的安装路径中不含空格和中文,避免软件运行异常。 - -5. 请注意,安装部署(包括激活和使用软件)IoTDB时需要保持使用同一个用户进行操作,您可以: -- 使用 root 用户(推荐):使用 root 用户可以避免权限等问题。 -- 使用固定的非 root 用户: - - 使用同一用户操作:确保在启动、激活、停止等操作均保持使用同一用户,不要切换用户。 - - 避免使用 sudo:尽量避免使用 sudo 命令,因为它会以 root 用户权限执行命令,可能会引起权限混淆或安全问题。 - -6. 推荐部署监控面板,可以对重要运行指标进行监控,随时掌握数据库运行状态,监控面板可以联系商务获取,部署监控面板步骤可以参考:[监控面板部署](./Monitoring-panel-deployment.md)。 - -7. 
在安装部署数据库前,可以使用健康检查工具检测 IoTDB 节点运行环境,并获取详细的检查结果。 IoTDB 健康检查工具使用方法可以参考:[健康检查工具](../Tools-System/Health-Check-Tool.md)。 - -## 安装步骤 - -### 前置检查 - -为确保您获取的IoTDB企业版安装包完整且正确,在执行安装部署前建议您进行SHA512校验。 - -#### 准备工作: - -- 获取官方发布的 SHA512 校验码:[发布历史](../IoTDB-Introduction/Release-history_timecho.md)文档中各版本对应的"SHA512校验码" - -#### 校验步骤(以 linux 为例): - -1. 打开终端,进入安装包所在目录(如`/data/iotdb`): - ```Bash - cd /data/iotdb - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -![img](/img/sha512-02.png) - -4. 对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行IoTDB企业版的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - - -### 解压安装包并进入安装目录 - -```shell -unzip iotdb-enterprise-{version}-bin.zip -cd iotdb-enterprise-{version}-bin -``` - -### 参数配置 - -#### 环境脚本配置 - -- ./conf/confignode-env.sh(./conf/confignode-env.bat)配置 - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------: | :------------------------------------: | :--------: | :----------------------------------------------: | :----------: | -| MEMORY_SIZE | IoTDB ConfigNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的30% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -- ./conf/datanode-env.sh(./conf/datanode-env.bat)配置 - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------: | :----------------------------------: |:----------------------:| :----------------------------------------------: | :----------: | -| MEMORY_SIZE | IoTDB DataNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的50% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -#### 系统通用配置 - -打开通用配置文件(./conf/iotdb-system.properties 文件),设置以下参数: - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :-----------------------: | :------------------------------: | :------------: | :----------------------------------------------: |:--------------------------------------:| -| cluster_name | 集群名称 | defaultCluster | 可根据需要设置集群名称,如无特殊需要保持默认即可 | 首次启动后不可修改,V1.3.3及之后版本支持热加载,但不建议手动修改该参数 | -| 
schema_replication_factor | 元数据副本数,单机版此处设置为 1 | 1 | 1 | 默认1,首次启动后不可修改 | -| data_replication_factor | 数据副本数,单机版此处设置为 1 | 1 | 1 | 默认1,首次启动后不可修改 | - -#### ConfigNode配置 - -打开ConfigNode配置文件(./conf/iotdb-system.properties文件),设置以下参数: - -| **配置项** | **说明** | **默认** | 推荐值 | **备注** | -| :-----------------: | :----------------------------------------------------------: | :-------------: | :----------------------------------------------: | :----------------: | -| cn_internal_address | ConfigNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | 首次启动后不能修改 | -| cn_internal_port | ConfigNode在集群内部通讯使用的端口 | 10710 | 10710 | 首次启动后不能修改 | -| cn_consensus_port | ConfigNode副本组共识协议通信使用的端口 | 10720 | 10720 | 首次启动后不能修改 | -| cn_seed_config_node | 节点注册加入集群时连接的ConfigNode 的地址,cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | 首次启动后不能修改 | - -#### DataNode 配置 - -打开DataNode配置文件 ./conf/iotdb-system.properties,设置以下参数: - -| **配置项** | **说明** | **默认** | 推荐值 | **备注** | -| :------------------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------- | :----------------- | -| dn_rpc_address | 客户端 RPC 服务的地址 |0.0.0.0 | 所在服务器的IPV4地址或hostname,推荐使用所在服务器的IPV4地址 | 重启服务生效 | -| dn_rpc_port | 客户端 RPC 服务的端口 | 6667 | 6667 | 重启服务生效 | -| dn_internal_address | DataNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | 首次启动后不能修改 | -| dn_internal_port | DataNode在集群内部通信使用的端口 | 10730 | 10730 | 首次启动后不能修改 | -| dn_mpp_data_exchange_port | DataNode用于接收数据流使用的端口 | 10740 | 10740 | 首次启动后不能修改 | -| dn_data_region_consensus_port | DataNode用于数据副本共识协议通信使用的端口 | 10750 | 10750 | 首次启动后不能修改 | -| dn_schema_region_consensus_port | DataNode用于元数据副本共识协议通信使用的端口 | 10760 | 10760 | 首次启动后不能修改 | -| dn_seed_config_node | 节点注册加入集群时连接的ConfigNode地址,即cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | 首次启动后不能修改 | - -> ❗️注意:VSCode 
Remote等编辑器无自动保存配置功能,请确保修改的文件被持久化保存,否则配置项无法生效 - -### 启动及激活数据库 (V 1.3.4 及以后的 1.x 版本) - -#### 启动 ConfigNode 节点 - -进入iotdb的sbin目录下,启动confignode - -```shell -./start-confignode.sh -d #“-d”参数将在后台进行启动 -``` -如果启动失败,请参考[常见问题](#常见问题)。 - -#### 启动 DataNode 节点 - -进入iotdb的sbin目录下,启动datanode: - -```shell -./start-datanode.sh -d #-d参数将在后台进行启动 -``` - -#### 激活数据库 - -##### 通过 CLI 激活 - -- 进入 CLI - - ```SQL - ./sbin/start-cli.sh -``` - -- 执行以下内容获取激活所需机器码: - - ```Bash - show system info - ``` - -- 将返回机器码复制给天谋工作人员: - -```Bash -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -It costs 0.030s -``` - -- 将工作人员返回的激活码输入到CLI中,输入以下内容 - - 注:激活码前后需要用`'`符号进行标注,如所示 - -```Bash -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` -### 启动及激活数据库 (V 1.3.4 之前版本) - -#### 启动 ConfigNode 节点 - -进入iotdb的sbin目录下,启动confignode - -```shell -./start-confignode.sh -d #“-d”参数将在后台进行启动 -``` -如果启动失败,请参考[常见问题](#常见问题)。 - -#### 激活数据库 - -##### 方式一:激活文件拷贝激活 - -- 启动confignode节点后,进入activation文件夹, 将 system_info文件复制给天谋工作人员 -- 收到工作人员返回的 license文件 -- 将license文件放入对应节点的activation文件夹下; - -##### 方式二:激活脚本激活 - -- 获取激活所需机器码,进入安装目录的sbin目录,执行激活脚本: - -```shell - cd sbin -./start-activate.sh -``` - -- 显示如下信息,请将机器码(即该串字符)复制给天谋工作人员: - -```shell -Please copy the system_info's content and send it to Timecho: -01-KU5LDFFN-PNBEHDRH -Please enter license: -``` - -- 将工作人员返回的激活码输入上一步的命令行提示处 `Please enter license:`,如下提示: - -```shell -Please enter license: 
-Jw+MmF+AtexsfgNGOFgTm83BgXbq0zT1+fOfPvQsLlj6ZsooHFU6HycUSEGC78eT1g67KPvkcLCUIsz2QpbyVmPLr9x1+kVjBubZPYlVpsGYLqLFc8kgpb5vIrPLd3hGLbJ5Ks8fV1WOVrDDVQq89YF2atQa2EaB9EAeTWd0bRMZ+s9ffjc/1Zmh9NSP/T3VCfJcJQyi7YpXWy5nMtcW0gSV+S6fS5r7a96PjbtE0zXNjnEhqgRzdU+mfO8gVuUNaIy9l375cp1GLpeCh6m6pF+APW1CiXLTSijK9Qh3nsL5bAOXNeob5l+HO5fEMgzrW8OJPh26Vl6ljKUpCvpTiw== -License has been stored to sbin/../activation/license -Import completed. Please start cluster and excute 'show cluster' to verify activation status -``` - -#### 启动 DataNode 节点 - -进入iotdb的sbin目录下,启动datanode: - -```shell -./start-datanode.sh -d #-d参数将在后台进行启动 -``` - - -### 验证部署 - -可直接执行 ./sbin 目录下的 Cli 启动脚本: - -```shell -./start-cli.sh -h ip(本机ip或域名) -p 端口号(6667) -``` - -成功启动后,出现如下界面显示IOTDB安装成功。 - -![](/img/%E5%90%AF%E5%8A%A8%E6%88%90%E5%8A%9F.png) - -出现安装成功界面后,继续看下是否激活成功,使用`show cluster`命令 - -当看到最右侧显示ACTIVATED表示激活成功 - -![](/img/show%20cluster.png) - -还可在 CLI 中通过执行 `show activation` 命令查看激活状态,示例如下,状态显示为ACTIVATED表示激活成功 - -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -> 出现`ACTIVATED(W)`为被动激活,表示此ConfigNode没有license文件(或没有签发时间戳最新的license文件)。此时建议检查license文件是否已放入license文件夹,没有请放入license文件,若已存在license文件,可能是此节点license文件与其他节点信息不一致导致,请联系天谋工作人员重新申请. - - - -## 常见问题 - -1. 部署过程中多次提示激活失败 - - 使用 `ls -al` 命令:使用 `ls -al` 命令检查安装包根目录的所有者信息是否为当前用户。 - - 检查激活目录:检查 `./activation` 目录下的所有文件,所有者信息是否为当前用户。 - -2. Confignode节点启动失败 - - 步骤 1: 请查看启动日志,检查是否修改了某些首次启动后不可改的参数。 - - 步骤 2: 请查看启动日志,检查是否出现其他异常。日志中若存在异常现象,请联系天谋技术支持人员咨询解决方案。 - - 步骤 3: 如果是首次部署或者数据可删除,也可按下述步骤清理环境,重新部署后,再次启动。 - - 步骤 4: 清理环境: - - a. 结束所有 ConfigNode 和 DataNode 进程。 - - ```Bash - # 1. 
停止 ConfigNode 和 DataNode 服务 - sbin/stop-standalone.sh - - # 2. 检查是否还有进程残留 - jps - # 或者 - ps -ef|grep iotdb - - # 3. 如果有进程残留,则手动kill - kill -9 - # 如果确定机器上仅有1个iotdb,可以使用下面命令清理残留进程 - ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9 - ``` - b. 删除 data 和 logs 目录。 - - 说明:删除 data 目录是必要的,删除 logs 目录是为了纯净日志,非必需。 - ```Bash - cd /data/iotdb - rm -rf data logs - ``` \ No newline at end of file diff --git a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/workbench-deployment_timecho.md b/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/workbench-deployment_timecho.md deleted file mode 100644 index 4e825f213..000000000 --- a/src/zh/UserGuide/dev-1.3/Deployment-and-Maintenance/workbench-deployment_timecho.md +++ /dev/null @@ -1,243 +0,0 @@ - -# 可视化控制台部署 - -可视化控制台是IoTDB配套工具之一(类似 Navicat for MySQL)。它用于数据库部署实施、运维管理、应用开发各阶段的官方应用工具体系,让数据库的使用、运维和管理更加简单、高效,真正实现数据库低成本的管理和运维。本文档将帮助您安装Workbench。 - -
- -可视化控制台工具的使用说明可参考文档 [使用说明](../Tools-System/Workbench_timecho.md) 章节。 - -## 安装准备 - -| 准备内容 | 名称 | 版本要求 | 官方链接 | -| :------: | :-----------------------: |:-------------------------------------------------------------------:| :----------------------------------------------------: | -| 操作系统 | Windows或Linux | - | - | -| 安装环境 | JDK | 1.5.4及以下版本需要 >= 1.8,1.5.5及以上版本需要 >= 17(下载时请根据机器配置选择ARM或x64安装包) | https://www.oracle.com/java/technologies/downloads/ | -| 相关软件 | Prometheus | 需要 >=V2.30.3 | https://prometheus.io/download/ | -| 数据库 | IoTDB | 需要>=V1.2.0企业版 | 您可联系商务或技术支持获取 | -| 控制台 | IoTDB-Workbench-``| - | 您可根据附录版本对照表进行选择后联系商务或技术支持获取 | - - -### 前置检查 - -为确保您获取的可视化控制台安装包完整且正确,在执行安装部署前建议您进行SHA512校验。 - -#### 准备工作: - -- 获取官方发布的 SHA512 校验码:联系天谋工作人员获取 - -#### 校验步骤(以 linux 为例): - -1. 打开终端,进入安装包所在目录(如`/data/workbench`): - ```Bash - cd /data/workbench - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum IoTDB-Workbench-``.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -![img](/img/sha512-03.png) - -4. 对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行可视化控制台的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - -## 安装步骤 - -### 步骤一:IoTDB 开启监控指标采集 - -1. 打开监控配置项。IoTDB中监控有关的配置项默认是关闭的,在部署监控面板前,您需要打开相关配置项(注意开启监控配置后需要重启服务)。 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
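-| Configuration item | Configuration file | Description |
-| --- | --- | --- |
-| cn_metric_reporter_list | conf/iotdb-system.properties | Add this item to the configuration file and set it to PROMETHEUS |
-| cn_metric_level | conf/iotdb-system.properties | Add this item to the configuration file and set it to IMPORTANT |
-| cn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this item to the configuration file. The default 9091 can be kept; any other port works as long as it does not conflict with a port already in use |
-| dn_metric_reporter_list | conf/iotdb-system.properties | Add this item to the configuration file and set it to PROMETHEUS |
-| dn_metric_level | conf/iotdb-system.properties | Add this item to the configuration file and set it to IMPORTANT |
-| dn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this item to the configuration file. The default 9092 can be kept; any other port works as long as it does not conflict with a port already in use |
-| dn_metric_internal_reporter_type | conf/iotdb-system.properties | Add this item to the configuration file and set it to IOTDB |
-| enable_audit_log | conf/iotdb-system.properties | Add this item to the configuration file and set it to true |
-| audit_log_storage | conf/iotdb-system.properties | Add this item to the configuration file and set it to IOTDB,LOGGER |
-| audit_log_operation | conf/iotdb-system.properties | Add this item to the configuration file and set it to DML,DDL,QUERY |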
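
As a minimal sketch of the table above, the items could be appended to a node's configuration file with a small shell helper. The function name `append_metric_config` is hypothetical and not part of IoTDB; the keys and values are the ones listed in the table. The sketch assumes the default ports 9091 and 9092 are free and that none of these keys is already set in the file.

```shell
# Hypothetical helper (not shipped with IoTDB): appends the monitoring and
# audit-log items from the table above to the properties file given as $1.
# Assumption: the keys are not already present, so appending creates no duplicates.
append_metric_config() {
  conf_file="$1"
  cat >> "$conf_file" <<'EOF'
cn_metric_reporter_list=PROMETHEUS
cn_metric_level=IMPORTANT
cn_metric_prometheus_reporter_port=9091
dn_metric_reporter_list=PROMETHEUS
dn_metric_level=IMPORTANT
dn_metric_prometheus_reporter_port=9092
dn_metric_internal_reporter_type=IOTDB
enable_audit_log=true
audit_log_storage=IOTDB,LOGGER
audit_log_operation=DML,DDL,QUERY
EOF
}
```

On each node this would be called as `append_metric_config ./conf/iotdb-system.properties` from the IoTDB installation directory, followed by a service restart so the settings take effect.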
-
-2. Restart all nodes. After changing the monitoring configuration on the three nodes, restart the ConfigNode and DataNode services on every node:
-
-   ```shell
-   ./sbin/stop-standalone.sh      # stop the ConfigNode and DataNode first
-   ./sbin/start-confignode.sh -d  # start the ConfigNode
-   ./sbin/start-datanode.sh -d    # start the DataNode
-   ```
-
-3. After the restart, check the running status of each node from the client. If every node is Running, the configuration succeeded:
-
-   ![](/img/%E5%90%AF%E5%8A%A8.png)
-
-### Step 2: Install and Configure Prometheus Monitoring
-
-1. Make sure Prometheus is installed (official installation guide: https://prometheus.io/docs/introduction/first_steps/)
-2. Extract the package and enter the extracted folder:
-
-   ```Shell
-   tar xvfz prometheus-*.tar.gz
-   cd prometheus-*
-   ```
-
-3. Edit the configuration. Change the configuration file prometheus.yml as follows:
-   1. Add a confignode job that collects the ConfigNode monitoring data
-   2. Add a datanode job that collects the DataNode monitoring data
-
-   ```yaml
-   global:
-     scrape_interval: 15s
-     evaluation_interval: 15s
-   scrape_configs:
-     - job_name: "prometheus"
-       static_configs:
-         - targets: ["localhost:9090"]
-     - job_name: "confignode"
-       static_configs:
-         - targets: ["iotdb-1:9091","iotdb-2:9091","iotdb-3:9091"]
-       honor_labels: true
-     - job_name: "datanode"
-       static_configs:
-         - targets: ["iotdb-1:9092","iotdb-2:9092","iotdb-3:9092"]
-       honor_labels: true
-   ```
-
-4. Start Prometheus. By default Prometheus keeps monitoring data for 15 days. In production, raise this to 180 days or more so that a longer history of monitoring data can be traced. The start command is:
-
-   ```Shell
-   ./prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=180d
-   ```
-
-5. Confirm that Prometheus started. Open `http://IP:port` in a browser to enter Prometheus, then open the Targets page under Status. When every State is Up, the configuration succeeded and the targets are reachable.
-
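
The check on the Targets page can also be scripted. The sketch below assumes the Prometheus HTTP API endpoint `/api/v1/targets`, whose JSON response carries a `health` field per target, and that `grep`, `wc` and `tr` are available. The helper name `count_unhealthy_targets` is hypothetical. It uses plain text matching rather than a JSON parser, so treat it as a quick check, not a robust tool.

```shell
# Hypothetical helper: reads the JSON returned by Prometheus' /api/v1/targets
# endpoint from stdin and prints how many targets report a health other than "up".
count_unhealthy_targets() {
  grep -o '"health": *"[a-z]*"' | grep -v '"up"' | wc -l | tr -d ' '
}
```

Assuming Prometheus listens on `localhost:9090` as in the prometheus.yml above, it could be used as `curl -s http://localhost:9090/api/v1/targets | count_unhealthy_targets`. An output of `0` corresponds to every State being Up.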
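-
-### Step 3: Install Workbench
-
-1. Go to the config directory of iotdb-Workbench-``
-
-2. Edit the Workbench configuration file: in the `config` folder, edit `application-prod.properties`. No change is needed for a local installation; if Workbench is deployed on a server, change the IP address
-   > Workbench can be deployed locally or on a cloud server, as long as it can connect to IoTDB
-
-   | Configuration item | Before | After |
-   | ---------------- | --------------------------------- | -------------------------------------- |
-   | pipe.callbackUrl | pipe.callbackUrl=`http://127.0.0.1` | pipe.callbackUrl=`http://<IP address where Workbench is deployed>` |
-
-   ![](/img/workbench-conf-1.png)
-
-3. Start the program: run the start command in the sbin folder of IoTDB-Workbench-``
-
-   Windows:
-   ```shell
-   # Start Workbench in the background
-   start.bat -d
-   ```
-
-   Linux:
-   ```shell
-   # Start Workbench in the background
-   ./start.sh -d
-   ```
-
-4. Use the `jps` command to check whether the start succeeded. Output like the following means it did:
-
-   ![](/img/windows-jps.png)
-
-5. Verify the installation: open "`http://<server IP>:<port from the configuration file>`" in a browser, for example "`http://127.0.0.1:9190`". The installation succeeded when the login page appears
-
-   ![](/img/workbench.png)
-
-
-## Appendix: IoTDB and Workbench Version Compatibility
-
-| **Workbench version** | **Release notes** | **Supported IoTDB versions** |
-|------------|--------------------------------------------------------|----------------|
-| V1.5.7 | In the measurement point list, point names are split into device name and measurement point; the point selection area supports horizontal scrolling; exported file columns follow the same order as the page | 1.x releases from V1.3.4 onward |
-| V1.5.6 | Improved CSV import and export: on import, tags and aliases are optional; on export, measurement point descriptions containing quotes wrapped in backticks are supported | 1.x releases from V1.3.4 onward |
-| V1.5.5 | Added a server clock; supports activating the enterprise edition database | 1.x releases from V1.3.4 onward |
-| V1.5.4 | Added authentication for the Prometheus settings in instance management | 1.x releases from V1.3.4 onward |
-| V1.5.1 | Added AI analysis and pattern matching | 1.x releases from V1.3.2 onward |
-| V1.4.0 | Added tree model display and an English version | 1.x releases from V1.3.2 onward |
-| V1.3.1 | Added analysis methods to the analysis feature; improved import templates and other features | 1.x releases from V1.3.2 onward |
-| V1.3.0 | Added database configuration; refined some version details | 1.x releases from V1.3.2 onward |
-| V1.2.6 | Improved permission control in each module | 1.x releases from V1.3.1 onward |
-| V1.2.5 | Added the concept of "common templates" to visualization; added page caching and other improvements across all pages | 1.x releases from V1.3.0 onward |
-| V1.2.4 | Added import and export to the computation feature; added a "time alignment" field to the measurement point list | 1.x releases from V1.2.2 onward |
-| V1.2.3 | Added "activation details" to the home page; added analysis and other features | 1.x releases from V1.2.2 onward |
-| V1.2.2 | Improved the display of "measurement point description" and other features | 1.x releases from V1.2.2 onward |
-| V1.2.1 | Added a "monitoring panel" to the data synchronization page; improved Prometheus prompts | 1.x releases from V1.2.2 onward |
-| V1.2.0 | Brand-new Workbench version upgrade | 1.x releases from V1.2.0 onward |
-
diff --git a/src/zh/UserGuide/dev-1.3/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md b/src/zh/UserGuide/dev-1.3/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md
deleted file mode 100644
index ff570158b..000000000
--- a/src/zh/UserGuide/dev-1.3/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md
+++ /dev/null
@@ -1,274 +0,0 @@
-
-# Ignition
-
-## Product Overview
-
-1. Introduction to Ignition
-
-Ignition is a web-based supervisory control and data acquisition (SCADA) tool: an open, extensible, general-purpose platform. Ignition makes it easier to control, track, display, and analyze all of an enterprise's data and so strengthens business capabilities. For more details, see the [Ignition website](https://docs.inductiveautomation.com/docs/8.1/getting-started/introducing-ignition)
-
-2. Introduction to the Ignition-IoTDB Connector
-
-   The Ignition-IoTDB Connector consists of two modules: the Ignition-IoTDB Connector and Ignition-IoTDB With JDBC.
-
-   - Ignition-IoTDB Connector: stores data collected by Ignition in IoTDB and supports reading data in Components. It also injects the `system.iotdb.insert` and `system.iotdb.query` script interfaces for convenient use in Ignition scripting
-   - Ignition-IoTDB With JDBC: can be used in the `Transaction Groups` module for custom writes and queries. It is not suitable for the `Tag Historian` module.
-
-   The figure below shows how the two modules relate to Ignition and what they contain.
-
-   ![](/img/Ignition.png)
-
-## Installation Requirements
-
-| **Item** | **Version requirement** |
-| :------------------------: | :------------------------------------------------------------: |
-| IoTDB | V1.3.1 or later must be installed. For installation, see the IoTDB [deployment guide](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) |
-| Ignition | Version 8.1.x (8.1.37 or later) must be installed. For installation, see the [installation guide](https://docs.inductiveautomation.com/docs/8.1/getting-started/installing-and-upgrading) on the Ignition website (contact sales about support for other versions) |
-| Ignition-IoTDB Connector module | Contact sales to obtain it |
-| Ignition-IoTDB With JDBC module | Download: https://repo1.maven.org/maven2/org/apache/iotdb/iotdb-jdbc/ |
-
-## Using the Ignition-IoTDB Connector
-
-### Introduction
-
-The Ignition-IoTDB Connector module stores data in the database connection associated with a historian provider. Data is stored directly in tables of a SQL database according to its data type, together with a millisecond timestamp. Based on each tag's value mode and deadband settings, data is stored only when it changes, which avoids duplicate and unnecessary storage.
-
-The Ignition-IoTDB Connector stores data collected by Ignition in IoTDB.
-
-### Installation Steps
-
-Step 1: Go to `Config` - `System` - `Modules` and click `Install or Upgrade a Module...` at the bottom
-
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-1.png)
-
-Step 2: Choose the `modl` file you obtained, upload it, click `Install`, and trust the related certificate.
-
-![](/img/ignition-3.png)
-
-Step 3: After installation completes, the following is shown
-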
-![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-3.png) - -步骤四:进入 `Config` - `Tags`- `History` 模块,点击下方的`Create new Historical Tag Provider...` - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-4.png) - -步骤五:选择 `IoTDB`并填写配置信息 - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-5.png) - -配置内容如下: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
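-| Name | Meaning | Default | Notes |
-| --- | --- | --- | --- |
-| **Main** | | | |
-| Provider Name | Name of the provider | - | |
-| Enabled | | true | The provider can be used only when this is true |
-| Description | Remarks | - | |
-| **IoTDB Settings** | | | |
-| Host Name | Address of the target IoTDB instance | - | |
-| Port Number | Port of the target IoTDB instance | 6667 | |
-| Username | Username for the target IoTDB | - | |
-| Password | Password for the target IoTDB | - | |
-| Database Name | Name of the database to store data in; starts with root, e.g. root.db | - | |
-| Pool Size | Size of the SessionPool | 50 | Configure as needed |
-| **Store and Forward Settings** | Keep the defaults | | |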
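
The IoTDB Settings rows above can be sanity-checked before they are typed into the Gateway page. The following is a minimal sketch in plain Python, not part of Ignition or the connector: the `ProviderSettings` class, its field names, and the `problems` method are hypothetical names chosen for illustration. Only the defaults (port 6667, pool size 50) and the "starts with root" rule come from the table.

```python
from dataclasses import dataclass


@dataclass
class ProviderSettings:
    """Hypothetical container mirroring the 'IoTDB Settings' rows of the table."""
    host_name: str
    username: str
    password: str
    database_name: str
    port_number: int = 6667  # default taken from the table
    pool_size: int = 50      # default SessionPool size taken from the table

    def problems(self):
        """Return a list of readable problems; an empty list means the values look plausible."""
        found = []
        if not self.host_name:
            found.append("Host Name is empty")
        if not 0 < self.port_number < 65536:
            found.append("Port Number must be between 1 and 65535")
        # The table states that the database name starts with root, e.g. root.db
        if not self.database_name.startswith("root."):
            found.append("Database Name must start with root, e.g. root.db")
        if self.pool_size < 1:
            found.append("Pool Size must be at least 1")
        return found


# Placeholder credentials; substitute the values used by your own IoTDB instance.
ok = ProviderSettings("127.0.0.1", "user", "secret", "root.db")
bad = ProviderSettings("127.0.0.1", "user", "secret", "db", port_number=0)
print(ok.problems())
print(bad.problems())
```

The check is deliberately conservative: it only encodes what the table states and does not try to contact the IoTDB instance.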
-
-
-### Usage
-
-#### Configuring historical data storage
-
-- Once the `Provider` is configured, `IoTDB Tag Historian` can be used in the `Designer` like any other `Provider`: right-click the `Tag`, choose `Edit tag(s)`, and select the History category in the Tag Editor
-
-  ![](/img/ignition-7.png)
-
-- Set `History Enabled` to `true`, select the `Provider` created in the previous step as the `Storage Provider`, configure the other parameters as needed, click `OK`, and save the project. Data is then stored continuously in the `IoTDB` instance according to these settings.
-
-  ![](/img/ignition-8.png)
-
-#### Reading data
-
-- Tags stored in IoTDB can also be selected directly under the Data tab of a Report
-
-  ![](/img/ignition-9.png)
-
-- The data can also be browsed directly in Components
-
-  ![](/img/ignition-10.png)
-
-#### Script module: interacting with IoTDB
-
-1. system.iotdb.insert:
-
-
-- Description: writes data to the IoTDB instance
-
-- Definition:
-  ```python
-  system.iotdb.insert(historian, deviceId, timestamps, measurementNames, measurementValues)
-  ```
-
-- Parameters:
-
-  - `str historian`: name of the corresponding IoTDB Tag Historian Provider
-  - `str deviceId`: deviceId to write to, without the configured database, e.g. Sine
-  - `long[] timestamps`: list of timestamps for the data points being written
-  - `str[] measurementNames`: list of names of the measurements being written
-  - `str[][] measurementValues`: values of the data points, corresponding to the timestamp list and the measurement name list
-
-- Return value: none
-
-- Scope: Client, Designer, Gateway
-
-- Example:
-
-  ```python
-  system.iotdb.insert("IoTDB", "Sine", [system.date.now()],["measure1","measure2"],[["val1","val2"]])
-  ```
-
-2. 
system.iotdb.query:
-
-
-- Description: queries data written to the IoTDB instance
-
-- Definition:
-  ```python
-  system.iotdb.query(historian, sql)
-  ```
-
-- Parameters:
-
-  - `str historian`: name of the corresponding IoTDB Tag Historian Provider
-  - `str sql`: SQL statement to run
-
-- Return value:
-  Query result: `List>`
-
-- Scope: Client, Designer, Gateway
-- Example:
-
-```python
-system.iotdb.query("IoTDB", "select * from root.db.Sine where time > 1709563427247")
-```
-
-## Ignition-IoTDB With JDBC
-
-### Introduction
-
- Ignition-IoTDB With JDBC provides a JDBC driver that lets users connect to and query the IoTDB database through the standard JDBC API
-
-### Installation Steps
-
- Step 1: Go to `Config` - `Databases` - `Drivers` and create a `Translator`
-
-![](/img/Ignition-IoTDBWithJDBC-1.png)
-
- Step 2: Go to `Config` - `Databases` - `Drivers`, create a `JDBC Driver`, select the `Translator` configured in the previous step, and upload the downloaded `IoTDB-JDBC`; set Classname to `org.apache.iotdb.jdbc.IoTDBDriver`
-
-![](/img/Ignition-IoTDBWithJDBC-2.png)
-
-Step 3: Go to `Config` - `Databases` - `Connections`, create a new `Connection`, select the `IoTDB Driver` created in the previous step as the `JDBC Driver`, fill in the connection details, and save
-
-![](/img/Ignition-IoTDBWithJDBC-3.png)
-
-### Usage
-
-#### Writing data
-
- In `Transaction Groups`, select the previously created `Connection` as the `Data Source`
-
-- `Table name` must be the full device path starting with root
-- Uncheck `Automatically create table`
-- Set `Store timestamp to` to time
-
-Leave the other options unselected, configure the fields, and set `Enabled`; data is then stored in the corresponding IoTDB according to these settings
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%86%99%E5%85%A5-1.png)
-
-#### Querying data
-
-- In the `Database Query Browser`, select the previously created `Connection` as the `Data Source`, then write SQL statements to query data in IoTDB
-
-![](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2-ponz.png)
-
diff --git a/src/zh/UserGuide/dev-1.3/Ecosystem-Integration/Zeppelin-IoTDB_timecho.md b/src/zh/UserGuide/dev-1.3/Ecosystem-Integration/Zeppelin-IoTDB_timecho.md
deleted file mode 100644
index 39ea4384b..000000000
--- a/src/zh/UserGuide/dev-1.3/Ecosystem-Integration/Zeppelin-IoTDB_timecho.md
+++ /dev/null
@@ -1,174 +0,0 @@
-
-
-# Apache Zeppelin
-
-## About Zeppelin
-
-Apache Zeppelin is a web-based interactive data analysis system. Users can connect to data sources through Zeppelin and work interactively with SQL, Scala, and other languages. Work can be saved as documents (similar to Jupyter). Zeppelin supports many data sources, including Spark, ElasticSearch, Cassandra, and InfluxDB. IoTDB can now be used through Zeppelin as well. An example:
-
-![iotdb-note-snapshot](/img/github/102752947-520a3e80-43a5-11eb-8fb1-8fac471c8c7e.png)
-
-## Zeppelin-IoTDB Interpreter
-
-### System Requirements
-
-| IoTDB version | Java version | Zeppelin version |
-| :--------: | :-----------: | :-----------: |
-| >=`0.12.0` | >=`1.8.0_271` | `>=0.9.0` |
-
-Install IoTDB: see [Quick Start](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md). IoTDB is assumed to be installed in `$IoTDB_HOME`.
-
-Install Zeppelin:
-> Method 1, direct download: download [Zeppelin](https://zeppelin.apache.org/download.html#) and unpack the binary. The [netinst](http://www.apache.org/dyn/closer.cgi/zeppelin/zeppelin-0.9.0/zeppelin-0.9.0-bin-netinst.tgz) binary package is recommended; it is smaller because it does not bundle unrelated interpreters.
->
-> Method 2, build from source: see [Build Zeppelin from source](https://zeppelin.apache.org/docs/latest/setup/basics/how_to_build.html); the command is `mvn clean package -pl zeppelin-web,zeppelin-server -am -DskipTests`.
-
-Zeppelin is assumed to be installed in `$Zeppelin_HOME`.
-
-### Building the Interpreter
-
-Run the following commands to build the IoTDB Zeppelin interpreter.
-
-```shell
-cd $IoTDB_HOME
-mvn clean package -pl iotdb-connector/zeppelin-interpreter -am -DskipTests -P get-jar-with-dependencies
-```
-
-The built interpreter is located at:
-
-```shell
-$IoTDB_HOME/zeppelin-interpreter/target/zeppelin-{version}-SNAPSHOT-jar-with-dependencies.jar
-```
-
-### Installing the Interpreter
-
-After building the interpreter, create a new folder named `iotdb` in Zeppelin's interpreter directory and copy the IoTDB interpreter into it.
-
-```shell
-cd $IoTDB_HOME
-mkdir -p $Zeppelin_HOME/interpreter/iotdb
-cp $IoTDB_HOME/zeppelin-interpreter/target/zeppelin-{version}-SNAPSHOT-jar-with-dependencies.jar $Zeppelin_HOME/interpreter/iotdb
-```
-
-### Modifying the Zeppelin Configuration
-
-Go to `$Zeppelin_HOME/conf` and create the Zeppelin configuration file from the template:
-
-```shell
-cp zeppelin-site.xml.template zeppelin-site.xml
-```
-
-Open zeppelin-site.xml and change the `zeppelin.server.addr` item to `0.0.0.0`
-
-### Starting Zeppelin and IoTDB
-
-Go to `$Zeppelin_HOME` and start Zeppelin:
-
-```shell
-# Unix/OS X
-> ./bin/zeppelin-daemon.sh start
-
-# Windows
-> .\bin\zeppelin.cmd
-```
-
-Go to `$IoTDB_HOME` and start IoTDB:
-
-```shell
-# Unix/OS X
-> nohup sbin/start-server.sh >/dev/null 2>&1 &
-# or
-> nohup sbin/start-server.sh -c -rpc_port >/dev/null 2>&1 &
-
-# Windows
-> sbin\start-server.bat -c -rpc_port
-```
-
-## Using the Zeppelin-IoTDB Interpreter
-
-Once Zeppelin is running, open [http://127.0.0.1:8080/](http://127.0.0.1:8080/)
-
-Create a new note as follows:
-
-1. Click the `Create new note` button
-2. Set the note name
-3. Select iotdb as the interpreter
-
-You can now use Zeppelin to work with IoTDB.
-
-![iotdb-create-note](/img/github/102752945-5171a800-43a5-11eb-8614-53b3276a3ce2.png)
-
-The following simple SQL statements illustrate the use of the Zeppelin-IoTDB interpreter:
-
-```sql
-CREATE DATABASE root.ln.wf01.wt01;
-CREATE TIMESERIES root.ln.wf01.wt01.status WITH DATATYPE=BOOLEAN, ENCODING=PLAIN;
-CREATE TIMESERIES root.ln.wf01.wt01.temperature WITH DATATYPE=FLOAT, ENCODING=PLAIN;
-CREATE TIMESERIES root.ln.wf01.wt01.hardware WITH DATATYPE=INT32, ENCODING=PLAIN;
-
-INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware)
-VALUES (1, 1.1, false, 11);
-
-INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware)
-VALUES (2, 2.2, true, 22);
-
-INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware)
-VALUES (3, 3.3, false, 33);
-
-INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware)
-VALUES (4, 4.4, false, 44);
-
-INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware)
-VALUES (5, 5.5, false, 55);
-
-SELECT *
-FROM root.ln.wf01.wt01
-WHERE time >= 1
-  AND time <= 6;
-```
-
-The result looks like this:
-
-![iotdb-note-snapshot2](/img/github/102752948-52a2d500-43a5-11eb-9156-0c55667eb4cd.png)
-
-See [[1]](https://zeppelin.apache.org/docs/0.9.0/usage/display_system/basic.html) for how to write richer documents.
-
-The example above is located at `$IoTDB_HOME/zeppelin-interpreter/Zeppelin-IoTDB-Demo.zpln`
-
-## Interpreter Configuration
-
-Open [http://127.0.0.1:8080/#/interpreter](http://127.0.0.1:8080/#/interpreter) and configure the IoTDB connection parameters:
-
-![iotdb-configuration](/img/github/102752940-50407b00-43a5-11eb-94fb-3e3be222183c.png)
-
-The configurable parameters, their defaults, and their meanings are as follows:
-
-| Property | Default | Description |
-| ---------------------------- | --------- | -------------------------------- |
-| iotdb.host | 127.0.0.1 | IoTDB host name |
-| iotdb.port | 6667 | IoTDB port |
-| iotdb.username | root | Username |
-| iotdb.password | root | Password |
-| iotdb.fetchSize | 10000 | Number of rows per batch when query results are returned in batches |
-| iotdb.zoneId | | Time zone ID |
-| iotdb.enable.rpc.compression | FALSE | Whether to enable RPC compression |
-| iotdb.time.display.type | default | Timestamp display format |
diff --git a/src/zh/UserGuide/dev-1.3/IoTDB-Introduction/IoTDB-Introduction_timecho.md b/src/zh/UserGuide/dev-1.3/IoTDB-Introduction/IoTDB-Introduction_timecho.md
deleted file mode 100644
index 58c17ae67..000000000
--- a/src/zh/UserGuide/dev-1.3/IoTDB-Introduction/IoTDB-Introduction_timecho.md
+++ /dev/null
@@ -1,266 +0,0 @@
-
-
-# Product Introduction
-
-TimechoDB is a low-cost, high-performance, IoT-native time series database. It is the original vendor's commercial product, offered by Timecho on top of the Apache IoTDB community edition. It addresses the problems enterprises face when building IoT big data platforms to manage time series data: complex application scenarios, large data volumes, high sampling frequencies, heavy out-of-order data, long processing times, diverse analysis needs, and high storage and operations costs.
-
-Built on TimechoDB, Timecho offers a broader set of product features, stronger performance and stability, and richer productivity tools, along with comprehensive enterprise services. This gives commercial customers stronger product capabilities and a better development, operations, and usage experience.
-
-- Download, deployment, and usage: [Quick Start](../QuickStart/QuickStart_timecho.md)
-
-## Product System
-
-The Timecho product system consists of several components that cover the full time series data lifecycle, from data collection to data management to data analysis and application. It forms an integrated "collect, store, use" time series data solution that helps users efficiently manage and analyze the massive time series data generated by the IoT.
-
-
-[Figure: Introduction-zh-timecho.png]
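-
-
-Where:
-
-1. **Time series database (TimechoDB, the original vendor's commercial product based on Apache IoTDB)**: the core component for time series data storage. It provides high-compression storage, rich time series queries, and real-time stream processing, with high data availability, high cluster scalability, and comprehensive security. TimechoDB also provides a range of tools for configuring and managing the system, plus multi-language APIs and integration with external systems, so that users can build business applications on top of TimechoDB.
-2. **Standard time series file format (Apache TsFile, led and contributed to by several core Timecho team members)**: a storage format designed specifically for time series data that stores and queries massive time series data efficiently. The underlying storage files of Timecho's collection, storage, and intelligent analysis modules are all based on Apache TsFile. TsFile files can be loaded into IoTDB efficiently and can also be migrated out. With TsFile, users can rely on one file format across the collection, management, and application & analysis stages, which greatly simplifies the pipeline from data collection to analysis and makes time series data management more efficient and convenient.
-3. **Integrated training and inference engine for time series models (AINode)**: for intelligent analysis scenarios, TimechoDB provides AINode, an integrated training and inference engine for time series models. It offers a complete set of time series analysis tools on top of a model training engine that supports training tasks and data management, covering machine learning, deep learning, and more. With these tools, users can analyze the data stored in TimechoDB in depth and extract its value.
-4. **Data collection**: to connect more easily to industrial collection scenarios, Timecho provides a data collection and ingestion service. It supports multiple protocols and formats, can ingest data from a wide range of sensors and devices, and supports features such as resumable transfer and network gateway traversal. It is well suited to the difficult configuration, slow transmission, and weak networks typical of industrial data collection, making collection simpler and more efficient.
-
-
-## Product Features
-
-TimechoDB has the following advantages and features:
-
-- Flexible deployment: one-click cloud deployment, unzip-and-run on edge devices, and seamless edge-cloud connection (via the cloud data synchronization tool)
-
-- Low-hardware-cost storage: high-compression-ratio disk storage; no need to separate historical and real-time databases; data is managed uniformly
-
-- Hierarchical measurement point organization: devices can be modeled according to their actual hierarchy to align with industrial measurement point management structures, with directory browsing and search over the hierarchy
-
-- High-throughput reads and writes: supports millions of connected devices, high-speed reads and writes, and complex industrial scenarios such as out-of-order and multi-frequency collection
-
-- Rich time series query semantics: a native time series computation engine, timestamp alignment at query time, nearly one hundred built-in aggregation and time series functions, and support for time series feature analysis and AI capabilities
-
-- Highly available distributed system: an HA distributed architecture provides 24/7 uninterrupted real-time database service; the failure of one physical node or a network fault does not affect normal operation. When physical nodes are added, removed, or overheat, the system automatically rebalances compute and storage resources. Heterogeneous environments are supported: servers of different types and performance can form a cluster, and the system balances load automatically according to each machine's configuration
-
-- Very low barrier to use and operate: a SQL-like language, native multi-language development interfaces, and a complete toolset including a console
-
-- Rich ecosystem integration: integrates with big data ecosystem components such as Hadoop and Spark, and with device management and visualization tools such as Grafana, Thingsboard, and DataEase
-
-## Enterprise Features
-
-### More Advanced Product Features
-
-TimechoDB provides more advanced product features on top of Apache IoTDB, with native upgrades and optimizations at the kernel level for industrial production scenarios, such as tiered storage, cloud-edge collaboration, visualization tools, and security enhancements. Users can focus on business development rather than underlying logic, which makes industrial production simpler and more efficient and brings greater economic benefit. For example:
-
-- Dual-active deployment: two independent stand-alone instances (or clusters) mirror each other in real time. Their configurations are fully independent, both can accept external writes at the same time, each synchronizes the data written to it to the other, and the data on the two eventually becomes consistent.
-
-- Data synchronization: the database's built-in synchronization module supports aggregating data from stations to a center, covering full, partial, and cascading aggregation, in both real-time and batch modes. A variety of built-in plugins meet enterprise requirements such as network gateway traversal, encrypted transmission, and compressed transmission.
-
-- Tiered storage: with upgraded underlying storage, data can be classified into cold, warm, and hot tiers by factors such as access frequency and importance and stored on different media (SSD, HDD, cloud storage, and so on), with the system scheduling data during queries. This lowers storage costs while preserving access speed.
-
-- Security enhancements: features such as white lists and audit logs strengthen internal management and reduce the risk of data leaks.
-
-A detailed feature comparison follows:
-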
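-| Category | Feature | Apache IoTDB | TimechoDB |
-| --- | --- | --- | --- |
-| Deployment mode | Stand-alone deployment | √ | √ |
-| | Distributed deployment | √ | √ |
-| | Dual-active deployment | - | √ |
-| | Container deployment | Partially supported | √ |
-| Database features | Measurement point management | √ | √ |
-| | Data writing | √ | √ |
-| | Data query | √ | √ |
-| | Continuous query | √ | √ |
-| | Trigger | √ | √ |
-| | User-defined function | √ | √ |
-| | Permission management | √ | √ |
-| | Data synchronization | File sync only, no built-in plugins | Real-time sync + file sync, rich built-in plugins |
-| | Stream processing | Framework only, no built-in plugins | Framework + rich built-in plugins |
-| | Tiered storage | - | √ |
-| | View | - | √ |
-| | White list | - | √ |
-| | Audit log | - | √ |
-| Supporting tools | Visual console | - | √ |
-| | Cluster management tool | - | √ |
-| | System monitoring tool | - | √ |
-| Localization | Domestic compatibility certification | - | √ |
-| Technical support | Expert services | - | √ |
-| | Usage training | - | √ |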
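-
-
-### More Efficient and Stable Performance
-
-TimechoDB improves stability and performance on top of Apache IoTDB. With enterprise edition technical support, it can deliver a performance improvement of more than 10x and recovers from failures promptly.
-
-### A More User-Friendly Toolset
-
-TimechoDB provides a simpler, easier-to-use toolset. The cluster monitoring panel (IoTDB Grafana), the database console (IoTDB Workbench), and the cluster management tool (IoTDB Deploy Tool, IoTD for short) help users deploy, manage, and monitor database clusters quickly, lower the workload and learning cost for operations staff, and make database operations simpler and faster.
-
-- Cluster monitoring panel: monitors IoTDB and the operating system it runs on, including operating system resource monitoring, IoTDB performance monitoring, and hundreds of kernel monitoring metrics, helping users track cluster health and carry out cluster tuning and operations.
-
-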
-[Figures: Overview; Operating system resource monitoring; IoTDB performance monitoring]
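-
-- Database console: a low-barrier tool for interacting with the database. Its graphical console lets users perform metadata management, data insertion, deletion, update, and query, permission management, and system management clearly and simply, making the database easier and more efficient to use.
-
-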
-[Figures: Home page; Metadata management; SQL query]
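-
-
-- Cluster management tool: addresses the difficulty of operating a multi-node distributed system. It covers cluster deployment, cluster start and stop, elastic scaling, configuration updates, and data export, so that commands can be issued to a complex database cluster in one step, greatly reducing management effort.
-
-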
- -### 更专业的企业技术服务 - -TimechoDB 客户提供强大的原厂服务,包括但不限于现场安装及培训、专家顾问咨询、现场紧急救助、软件升级、在线自助服务、远程支持、最新开发版使用指导等服务。同时,为了使 IoTDB 更契合工业生产场景,我们会根据企业实际数据结构和读写负载,进行建模方案推荐、读写性能调优、压缩比调优、数据库配置推荐及其他的技术支持。如遇到部分产品未覆盖的工业化定制场景,TimechoDB 将根据用户特点提供定制化开发工具。 - -相较于 Apache IoTDB ,每 2-3 个月一个发版周期,TimechoDB 提供周期更快的发版频率,同时针对客户现场紧急问题,提供天级别的专属修复,确保生产环境稳定。 - - -### 更兼容的国产化适配 - -TimechoDB 代码自研可控,同时兼容大部分主流信创产品(CPU、操作系统等),并完成与多个厂家的兼容认证,确保产品的合规性和安全性。 \ No newline at end of file diff --git a/src/zh/UserGuide/dev-1.3/IoTDB-Introduction/Release-history_timecho.md b/src/zh/UserGuide/dev-1.3/IoTDB-Introduction/Release-history_timecho.md deleted file mode 100644 index f4dc5fbaa..000000000 --- a/src/zh/UserGuide/dev-1.3/IoTDB-Introduction/Release-history_timecho.md +++ /dev/null @@ -1,388 +0,0 @@ - -# 发布历史 - -## TimechoDB(数据库内核) - -### V1.3.7.3 - -> 发版时间:2026.06.02
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.7.3-bin.zip
-> SHA512 校验码:8e6cde061421a552b9855f39f9cccd4838c820dc15ef0ad2a7c23a54cd6cc4f06c35190c1f428784e6a4d5463dd1b794f58ff5cdf891f27f6d0be4d3ab00bf6f - -V1.3.7.3 版本主要优化了查询模块和数据同步等功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 查询模块:优化 Last 查询、对齐序列查询、倒序时间过滤查询等场景 -- 元数据模块:优化已激活序列及其子路径下的设备创建校验 -- 数据同步:优化同步失败后的重试机制 -- 数据同步:跨网闸同步插件支持配置实时写入传输超时时间 -- 接口模块:Go 客户端写入接口增加错误码校验 -- 接口模块:优化 C# 客户端连接池管理 - - -### V1.3.7.2 - -> 发版时间:2026.04.07
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.7.2-bin.zip
-> SHA512 校验码:787766af64992069f0db0ac8b250b461d799307b3ce06b0782fc25752c8c5307fa2205c9e3a38a41685b81bb6b4b5c1ec9f71a395bfad285caf90de7b8224783 - -V1.3.7.2 版本主要优化了数据同步和查询模块的相关功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 数据同步:优化 Pipe 复杂路径匹配场景下的分发性能 -- 查询模块:Show Queries 语句新增客户端 IP、查询超时时间、服务端等待时间等信息 -- 生态集成:支持 IoTDB 以 OPC Client 模式向外部 OPC Server 推送数据 - - -### V1.3.6.6 - -> 发版时间:2026.01.20
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.6.6-bin.zip
-> SHA512 校验码:590d3ead053298c6df0ede637572ba598b9b684f8b35ab874bd4452f765e1421938f4cca2cf0423af2e806592aa8b15bdd25b41df7de809435a4d0239fc04790 - -V1.3.6.6 版本优化了数据的读写功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V1.3.6.3 - -> 发版时间:2026.01.04
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.6.3-bin.zip
-> SHA512 校验码:43719a1384f59f63cb0029cdda0aba433383cd1a0f5ebc142e54f8aa6623cc30a7efb3e3aef7f3d485d5e07bec91be215c92ed21b5201613d5cc44044251c978 - -V1.3.6.3 版本主要围绕查询性能、内存管理机制两大核心方向进行了深度优化,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 查询模块:优化多种场景的查询性能,包括多序列 Last 查询等 -* 查询模块:Java SDK 新增 FastLastQuery 接口,支持更高效的 Last 查询操作 -* 查询模块:树模型 fetchSchema 调整为分段流式返回,提升大数据量场景下的响应速度 -* 存储模块:优化内存管理,避免内存泄漏风险,保障系统长期稳定运行 -* 存储模块:优化文件合并机制,提升合并处理效率,优化系统存储资源占用 -* 其他:修复安全漏洞 CVE-2025-12183,CVE-2025-66566 and CVE-2025-11226 - - -### V1.3.6.1 - -> 发版时间:2025.12.09
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.6.1-bin.zip
-> SHA512 校验码:9fb6a6870aa2133bfc40508324a7d97ee078d0d44895beef7b0a331edd203419119fb02b933f585b6c4a6fe9b59708a053d7cf65206b22b1a4f01a5fe518424c - -V1.3.6.1 版本主要围绕数据同步稳定性这一核心方向进行了深度优化,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 数据同步:优化 Pipe SQL 参数配置,支持指定异步加载方式 -* 数据同步:新增语法糖功能,可将全量 Pipe 创建 SQL 自动拆分为实时同步与历史同步两类 -* 系统模块:新增全局数据类型压缩方式配置项,支持按需调整存储压缩策略 - - -### V1.3.5.11 - -> 发版时间:2025.09.24
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.11-bin.zip
-> SHA512 校验码:f18419e20c0d7e9316febee5a053306a97268cb07e18e6933716c2ef98520fbbe051dfa1da02a9c83e8481a839ce35525ce6c50f890f821e3d760f550c75f804 - -V1.3.5.11 版本主要优化了数据同步功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V1.3.5.10 - -> 发版时间:2025.08.27
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.10-bin.zip
-> SHA512 校验码:3aea6d2318f52b39bfb86dae9ff06fe1b719fdeceaabb39278c9a73544e1ceaf0660339f9342abb888c8281a0fb6144179dac9bb0c40ba0ecc66bac4dd7cbe80 - -V1.3.5.10 版本修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V1.3.5.9 - -> 发版时间:2025.08.25
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.9-bin.zip
-> SHA512 校验码:95b7a6790e94dc88e355a81e5a54b10ee87bdadae69ba0b215273967b3422178d5ee81fa5adf1c5380a67dbb30cf9782eaa3cbfd6ec744b0fd9a91c983ee8f70 - -V1.3.5.9 版本优化了内存控制,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 -### 1.x 其他历史版本 - -#### V1.3.5.8 - -> 发版时间:2025.08.19
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.8-bin.zip
-> SHA512 校验码:aa9802301614e20294a7f2fc4c149ba20d58213d9b74e8f8c607e0f4860949bad164bce2851b63c1d39b7568d62975ab257c269b3a9c168a29ea3945b6d28982 - -V1.3.5.8 版本优化了数据同步功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.7 - -> 发版时间:2025.08.13
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.7-bin.zip
-> SHA512 校验码:17374a440267aed3507dcc8cf4dc8703f8136d5af30d16206a6e1101e378cbbc50eda340b1598a12df35fe87d96db20f7802f0e64033a013d4b81499198663d4 - -V1.3.5.7 版本优化了数据同步功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.6 - -> 发版时间:2025.07.16
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.6-bin.zip
-> SHA512 校验码:05b9fda4d98ba8a1c9313c0831362ed3d667ce07cb00acaeabcf6441a6d67dff7da27f3fda2a5e1b3c3b85d1e5c730a534f3aa2f0c731b8c03ef447203b32493 - -V1.3.5.6 版本新增配置项开关支持禁用数据订阅功能,优化了C++高可用客户端,以及正常情况、重启、删除三个场景下的 PIPE 同步延迟问题,和大 TEXT 对象时的查询问题,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.4 - -> 发版时间:2025.06.19
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.4-bin.zip
-> SHA512 校验码:edac5f8b70dd67b3f84d3e693dc025a10b41565143afa15fc0c4937f8207479ffe2da787cc9384440262b1b05748c23411373c08606c6e354ea3dcdba0371778 - -V1.3.5.4 版本修复了部分产品缺陷,优化了节点移除功能,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.3 - -> 发版时间:2025.06.13
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.3-bin.zip
-> SHA512 校验码:5f807322ceec9e63a6be86108cc57e7ad4251b99a6c28baf11256ab65b2145768e9110409f89834d5f4256094a8ad995775c0e59a17224ff2627cd9354e09d82 - -V1.3.5.3 版本主要优化了数据同步功能,包括持久化 PIPE 发送进度,增加 PIPE 事件传输时间监控项,并修复了相关缺陷;另外将用户密码的加密算法变更为 SHA-256,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.2 - -> 发版时间:2025.06.10
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.2-bin.zip
-> SHA512 校验码:4c0a5db76c6045dfd27cce303546155cdb402318024dae5f999f596000d7b038b13bbeac39068331b5c6e2c80bc1d89cd346dd0be566fe2fe865007d441d9d05 - -V1.3.5.2 版本主要优化了数据同步功能,包括支持通过使用参数进行级联配置,支持同步和实时写入顺序完全一致;支持系统重启后历史数据和实时数据分区发送,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.1 - -> 发版时间:2025.05.15
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.1-bin.zip
-> SHA512 校验码:91f22bafbdd4d580126ed59ba1ba99d14209f10ce4a0a4bd7d731943ac99fdb6ebfab6e3a1e294a7cb7f46367e9fd4252b0d9ac4d4240ddedf6d85658e48f212 - -V1.3.5.1 版本修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.4.2 - -> 发版时间:2025.04.14
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.4.2-bin.zip
-> SHA512 校验码:52fbd79f5e7256e7d04edc8f640bb8d918e837fedd1e64642beb2b2b25e3525b5f5a4c92235f88f6f7b59bfcdf096e4ea52ab85bfef0b69274334470017a2c5b2 - -V1.3.4.2 版本优化了数据同步功能,支持双活之间同步外部 PIPE 转发而来的数据。 - - -#### V1.3.4.1 - -> 发版时间:2025.01.08
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.4.1-bin.zip
-> SHA512 校验码:e9d46516f1f25732a93cc915041a8e59bca77cf8a1018c89d18ed29598540c9f2bdf1ffae9029c87425cecd9ecb5ebebea0334c7e23af11e28d78621d4a78148 - -V1.3.4.1 版本新增模式匹配函数、持续优化数据订阅机制,提升稳定性、import-data/export-data 脚本扩展支持新数据类型,import-data/export-data 脚本合并同时兼容 TsFile、CSV 和 SQL 三种类型数据的导入导出等功能,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 查询模块:用户可通过配置项控制 UDF、PipePlugin、Trigger 和 AINode 通过 URI 加载 jar 包 -- 系统模块:UDF 函数拓展,新增 pattern_match 模式匹配函数 -- 数据同步:支持在发送端指定接收端鉴权信息 -- 生态集成:支持 Kubernetes Operator -- 脚本与工具:import-data/export-data 脚本扩展,支持新数据类型(字符串、大二进制对象、日期、时间戳) -- 脚本与工具:import-data/export-data 脚本迭代,同时兼容 TsFile、CSV 和 SQL 三种类型数据的导入导出 - -#### V1.3.3.3 - -> 发版时间:2024.10.31
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.3.3-bin.zip
-> SHA512 校验码:4a3eceda479db3980e9c8058628e71ba5a16fbfccf70894e8181aea5e014c7b89988d0093f6d42df29d478340a33878602a3924bec13f442a48611cec4e0e961 - -V1.3.3.3版本增加优化重启恢复性能,减少启动时间、DataNode 主动监听并加载 TsFile,同时增加可观测性指标、发送端支持传文件至指定目录后,接收端自动加载到IoTDB、Alter Pipe 支持 Alter Source 的能力等功能,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 数据同步:接收端支持对不一致数据类型的自动转换 -- 数据同步:接收端增强可观测性,支持多个内部接口的 ops/latency 统计 -- 数据同步:opc-ua-sink 插件支持 CS 模式访问和非匿名访问方式 -- 数据订阅: SDK 支持 create if not exists 和 drop if exists 接口 -- 流处理:Alter Pipe 支持 Alter Source 的能力 -- 系统模块:新增 rest 模块的耗时监控 -- 脚本与工具:支持加载自动加载指定目录的TsFile文件 -- 脚本与工具:import-tsfile脚本扩展,支持脚本与iotdb server不在同一服务器运行 -- 脚本与工具:新增对Kubernetes Helm的支持 -- 脚本与工具:Python 客户端支持新数据类型(字符串、大二进制对象、日期、时间戳) - -#### V1.3.3.2 - -> 发版时间:2024.8.15
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.3.2-bin.zip
-> SHA512 校验码:32733610da40aa965e5e9263a869d6e315c5673feaefad43b61749afcf534926398209d9ca7fff866c09deb92c09d950c583cea84be5a6aa2c315e1c7e8cfb74 - -V1.3.3.2版本支持输出读取mods文件的耗时、输入最大顺乱序归并排序内存 以及dispatch 耗时、通过参数配置对时间分区原点的调整、支持根据 pipe 历史数据处理结束标记自动结束订阅,同时合并了模块内存控制性能提升,具体发布内容如下: - -- 查询模块:Explain Analyze 功能支持输出读取mods文件的耗时 -- 查询模块:Explain Analyze 功能支持输入最大顺乱序归并排序内存以及 dispatch 耗时 -- 存储模块:新增合并目标文件拆分功能,增加配置文件参数 -- 系统模块:支持通过参数配置对时间分区原点的调整 -- 流处理:数据订阅支持根据 pipe 历史数据处理结束标记自动结束订阅 -- 数据同步:RPC 压缩支持指定压缩等级 -- 脚本与工具:数据/元数据导出只过滤 root.__system,不对root.__systema 等开头的数据进行过滤 - -#### V1.3.3.1 - -> 发版时间:2024.7.12
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.3.1-bin.zip
-> SHA512 校验码:1fdffbc1f18bfabfa3463a5a6fbc4f6ba6ab686942f9e85e7e6be1840fb8700e0147e5e73fd52201656ae6adb572cc2e5ecc61bcad6fa4c5a4048c4207e3c6c0 - -V1.3.3.1版本多级存储增加限流机制、数据同步支持在发送端 sink 指定接收端使用用户名密码密码鉴权,优化了数据同步接收端一些不明确的WARN日志、重启恢复性能,减少启动时间,同时对脚本内容进行了合并,具体发布内容如下: - -- 查询模块:Filter 性能优化,提升聚合查询和where条件查询的速度 -- 查询模块:Java Session客户端查询 sql 请求均分到所有节点 -- 系统模块:将"iotdb-confignode.properties、iotdb-datanode.properties、iotdb-common.properties"配置文件合并为" iotdb-system.properties" -- 存储模块:多级存储增加限流机制 -- 数据同步:数据同步支持在发送端 sink 指定接收端使用用户名密码密码鉴权 -- 系统模块:优化重启恢复性能,减少启动时间 - -#### V1.3.2.2 - -> 发版时间:2024.6.4
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.2.2-bin.zip
-> SHA512 校验码:ad73212a0b5025d18d2481163f6b2d4f604e06eb5e391cc6cba7bf4e42792e115b527ed8bfb5cd95d20a150645c8b4d56a531889dac229ce0f63139a27267322 - -V1.3.2.2 版本新增 explain analyze 语句分析单个 SQL 查询耗时、新增 UDAF 用户自定义聚合函数框架、支持磁盘空间到达设置阈值自动删除数据、元数据同步、统计指定路径下数据点数、SQL 语句导入导出脚本等功能,同时集群管理工具支持滚动升级、上传插件到整个集群,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 存储模块:insertRecords 接口写入性能提升 -- 存储模块:新增 SpaceTL 功能,支持磁盘空间到达设置阈值自动删除数据 -- 查询模块:新增 Explain Analyze 语句(监控单条 SQL 执行各阶段耗时) -- 查询模块:新增 UDAF 用户自定义聚合函数框架 -- 查询模块:UDF 新增包络解调分析 -- 查询模块:新增 MaxBy/MinBy 函数,支持获取最大/小值的同时返回对应时间戳 -- 查询模块:值过滤查询性能提升 -- 数据同步:路径匹配支持通配符 -- 数据同步:支持元数据同步(含时间序列及相关属性、权限等设置) -- 流处理:增加 Alter Pipe 语句,支持热更新 Pipe 任务的插件 -- 系统模块:系统数据点数统计增加对 load TsFile 导入数据的统计 -- 脚本与工具:新增本地升级备份工具(通过硬链接对原有数据进行备份) -- 脚本与工具:新增 export-data/import-data 脚本,支持将数据导出为 CSV、TsFile 格式或 SQL 语句 -- 脚本与工具:Windows 环境支持通过窗口名区分 ConfigNode、DataNode、Cli - -#### V1.3.1.4 - -> 发版时间:2024.4.23
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.1.4-bin.zip
-> SHA512 校验码:8547702061d52e2707c750a624730eb2d9b605b60661efa3c8f11611ca1685aeb51b6f8a93f94c1b30bf2e8764139489c9fbb76cf598cfa8bf9c874b2a7c57eb - -V1.3.1 版本增加系统激活情况查看、内置方差/标准差聚合函数、内置Fill语句支持超时时间设置、tsfile修复命令等功能,增加一键收集实例信息脚本、一键启停集群等脚本,并对视图、流处理等功能进行优化,提升使用易用度和版本性能。具体发布内容如下: - -- 查询模块:Fill 子句支持设置填充超时阈值,超过时间阈值不填充 -- 查询模块:Rest 接口(V2 版)增加列类型返回 -- 数据同步:数据同步简化时间范围指定方式,直接设置起止时间 -- 数据同步:数据同步支持 SSL 传输协议(iotdb-thrift-ssl-sink 插件) -- 系统模块:支持使用 SQL 查询集群激活信息 -- 系统模块:多级存储增加迁移时传输速率控制 -- 系统模块:系统可观测性提升(增加集群节点的散度监控、分布式任务调度框架可观测性) -- 系统模块:日志默认输出策略优化 -- 脚本与工具:增加一键启停集群脚本(start-all/stop-all.sh & start-all/stop-all.bat) -- 脚本与工具:增加一键收集实例信息脚本(collect-info.sh & collect-info.bat) - -#### V1.3.0.4 - -> 发版时间:2024.1.3
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.0.4-bin.zip
-> SHA512 校验码:3c07798f37c07e776e5cd24f758e8aaa563a2aae0fb820dad5ebf565ad8a76c765b896d44e7fdb7dad2e46ffd4262af901c765f9bf6af926bc62103118e38951 - -V1.3.0.4 发布了全新内生机器学习框架 AINode,全面升级权限模块支持序列粒度授予权限,并对视图、流处理等功能进行诸多细节优化,进一步提升了产品的使用易用度,并增强了版本稳定性和各方面性能。具体发布内容如下: - -- 查询模块:新增 AINode 内生机器学习模块 -- 查询模块:优化 show path 语句返回时间长的问题 -- 安全模块:升级权限模块,支持时间序列粒度的权限设置 -- 安全模块:支持客户端与服务器 SSL 通讯加密 -- 流处理:流处理模块新增多种 metrics 监控项 -- 查询模块:非可写视图序列支持 LAST 查询 -- 系统模块:优化数据点监控项统计准确性 - -#### V1.2.0.1 - -> 发版时间:2023.6.30
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.2.0.1-bin.zip
-> SHA512 校验码:dcf910d0c047d148a6c52fa9ee03a4d6bc3ff2a102dc31c0864695a25268ae933a274b093e5f3121689063544d7c6b3b635e5e87ae6408072e8705b3c4e20bf0 - -V1.2.0.1主要增加了流处理框架、动态模板、substring/replace/round内置查询函数等新特性,增强了show region、show timeseries、show variable等内置语句功能和Session接口,同时优化了内置监控项及其实现,修复部分产品bug和性能问题。 - -- 流处理:新增流处理框架 -- 元数据模块:新增模板动态扩充功能 -- 存储模块:新增SPRINTZ和RLBE编码以及LZMA2压缩算法 -- 查询模块:新增cast、round、substr、replace内置标量函数 -- 查询模块:新增time_duration、mode内置聚合函数 -- 查询模块:SQL语句支持case when语法 -- 查询模块:SQL语句支持order by表达式 -- 接口模块:Python API支持连接分布式多个节点 -- 接口模块:Python客户端支持写入重定向 -- 接口模块:Session API增加用模板批量创建序列接口 - -#### V1.1.0.1 - -> 发版时间:2023-04-03
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.1.0.1.zip
-> SHA512 校验码:58df58fc8b11afeec8436678842210ec092ac32f6308656d5356b7819acc199f1aec4b531635976b091b61d6736f0d9706badcabeaa5de50939e5c331c1dc804 - -V1.1.0.1主要改进增加了部分新特性,如支持 GROUP BY VARIATION、GROUP BY CONDITION 等分段方式、增加 DIFF、COUNT_IF 等实用函数,引入 pipeline 执行引擎进一步提升查询速度等。同时修复对齐序列 last 查询 order by timeseries、LIMIT&OFFSET 不生效、重启后元数据模版错误、删除所有 database 后创建序列错误等相关问题。 - -- 查询模块:align by device 语句支持 order by time -- 查询模块:支持 Show Queries 命令 -- 查询模块:支持 kill query 命令 -- 系统模块:show regions 支持指定特定的 database -- 系统模块:新增 SQL show variables, 可以展示当前集群参数 -- 查询模块:聚合查询支持 GROUP BY VARIATION -- 查询模块:SELECT INTO 支持特定的数据类型强转 -- 查询模块:实现内置标量函数 DIFF -- 系统模块:show regions 显示创建时间 -- 查询模块:实现内置聚合函数 COUNT_IF -- 查询模块:聚合查询支持 GROUP BY CONDITION -- 系统模块:支持修改 dn_rpc_port 和 dn_rpc_address - - -## Workbench(控制台工具) - -| **控制台版本号** | **版本说明** | **可支持IoTDB版本** | **SHA512 校验码** | -| ---------------- | ------------------------------------------------------------ | ------------------------- | ------------------------------------------------------------ | -| V1.5.7 | 优化测点列表中测点名称拆分为设备名称和测点,测点选择区域支持左右滚动,以及导出文件列顺序与页面保持一致 | V1.3.4及以上的1.x系列版本 | d3cd4a63372ca5d6217b67dddf661980c6a442b3b1564235e9ad34fc254d681febd58c2cc59c6273ffbfd8a1b003b9adb130ecfaaebe1942003b0d07427b1fcc | -| V1.5.6 | 优化 CSV 格式导入导出功能:导入时,支持标签、别名为非必填项;导出时,支持测点描述里反引号包裹引号的场景 | V1.3.4及以上的1.x系列版本 | 276ac1ea341f468bf6d29489c9109e9aa61afe2d1caaab577bc40603c6f4120efccc36b65a58a29ce6a266c21b46837aad6128f84ba5e676231ea9e6284a35e5 | -| V1.5.5 | 新增服务器时钟,支持企业版激活数据库 | V1.3.4及以上的1.x系列版本 | b18d01b70908d503a25866d1cc69d14e024d5b10ca6fcc536932fdbef8257c66e53204663ce3be5548479911aca238645be79dfd7ee7e65a07ab3c0f68c497f6 | -| V1.5.4 | 新增实例管理中prometheus设置的认证功能 | V1.3.4及以上的1.x系列版本 | adc7e13576913f9e43a9671fed02911983888da57be98ec8fbbb2593600d310f69619d32b22b569520c88e29f100d7ccae995b20eba757dbb1b2825655719335 | -| V1.5.1 | 新增AI分析功能以及模式匹配功能 | V1.3.2及以上的1.x系列版本 | 
4f2053a2a3b2b255ce195268d6cd245278f3be32ba4cf68be1552c386d78ed4424f7bdc9d8e68c6b8260b3e398c8fd23ff342439c4e88e1e777c62640d2279f9 | -| V1.4.0 | 新增树模型展示及英文版 | V1.3.2及以上的1.x系列版本 | 734077f3bb5e1719d20b319d8b554ce30718c935cb0451e02b2c9267ff770e9c2d63b958222f314f16c2e6e62bf78b643255249b574ee6f37d00e123433981e8 | -| V1.3.1 | 分析功能新增分析方式,优化导入模版等功能 | V1.3.2及以上的1.x系列版本 | 134f87101cc7f159f8a22ac976ad2a3a295c5435058ee0a15160892aac46ac61dd3cfb0633b4aea9cc7415bf904d0ae65aaf77d663f027d864204d81fb34768b | -| V1.3.0 | 新增数据库配置功能,优化部分版本细节 | V1.3.2及以上的1.x系列版本 | 94a137fc5c681b211f3e076472a9c5875d59e7f0cd6d7409cb8f66bb9e4f87577a0f12dd500e2bcb99a435860c82183e4a6514b638bcb4aecfb48f184730f3f1 | -| V1.2.6 | 优化各模块权限控制功能 | V1.3.1及以上的1.x系列版本 | f345b7edcbe245a561cb94ec2e4f4d40731fe205f134acadf5e391e5874c5c2477d9f75f15dbaf36c3a7cb6506823ac6fbc2a0ccce484b7c4cc71ec0fbdd9901 | -| V1.2.5 | 可视化功能新增“常用模版”概念,所有界面优化补充页面缓存等功能 | V1.3.0及以上的1.x系列版本 | 37376b6cfbef7df8496e255fc33627de01bd68f636e50b573ed3940906b6f3da1e8e8b25260262293b8589718f5a72180fa15e5823437bf6dc51ed7da0c583f7 | -| V1.2.4 | 计算功能新增“导入、导出”功能,测点列表新增“时间对齐”字段 | V1.2.2及以上的1.x系列版本 | 061ad1add38c109c1a90b06f1ddb7797bd45e84a34a4f77154ee48b90bdc7ecccc1e25eaa53fbbc98170d99facca93e3536192dd8d10a50ce505f59923ce6186 | -| V1.2.3 | 首页新增“激活详情”,新增分析等功能 | V1.2.2及以上的1.x系列版本 | 254f5b7451300f6f99937d27fd7a5b20847d5293f53e0eaf045ac9235c7ea011785716b800014645ed5d2161078b37e1d04f3c59589c976614fb801c4da982e1 | -| V1.2.2 | 优化“测点描述”展示内容等功能 | V1.2.2及以上的1.x系列版本 | 062e520d010082be852d6db0e2a3aa6de594eb26aeb608da28a212726e378cd4ea30fca5e1d2c3231ebd8de29e94ca9641f1fabc1cea46acfb650c37b7681b4e | -| V1.2.1 | 数据同步界面新增“监控面板”,优化Prometheus提示信息 | V1.2.2及以上的1.x系列版本 | 8a3bcf87982ad5004528829b121f2d3945429deb77069917a42a8c8d2e2e2a2c24a398aaa87003920eeacc0c692f1ed39eac52a696887aa085cce011f0ddd745 | -| V1.2.0 | 全新Workbench版本升级 | V1.2.0及以上的1.x系列版本 | ea1f7d3a4c0c6476a195479e69bbd3b3a2da08b5b2bb70b0a4aba988a28b5db5a209d4e2c697eb8095dfdf130e29f61f2ddf58c5b51d002c8d4c65cfc13106b3 | diff 
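--git a/src/zh/UserGuide/dev-1.3/QuickStart/QuickStart_timecho.md b/src/zh/UserGuide/dev-1.3/QuickStart/QuickStart_timecho.md
deleted file mode 100644
index f898f60ee..000000000
--- a/src/zh/UserGuide/dev-1.3/QuickStart/QuickStart_timecho.md
+++ /dev/null
@@ -1,109 +0,0 @@
-
-
-# Quick Start
-
-This document helps you get started with IoTDB quickly.
-
-## How do I install and deploy?
-
-This document helps you install and deploy IoTDB quickly. Use the links below to jump to the content you need:
-
-1. Prepare machine resources: deploying and running IoTDB involves several aspects of machine resource configuration. See [Resource Planning](../Deployment-and-Maintenance/Database-Resources.md)
-
-2. Prepare the system configuration: IoTDB's system configuration covers several areas. The key settings are described in [System Configuration](../Deployment-and-Maintenance/Environment-Requirements.md)
-
-3. Get the installation package: contact Timecho sales to obtain the IoTDB installation package and make sure you have the latest stable version. For the package structure, see [Obtaining the Installation Package](../Deployment-and-Maintenance/IoTDB-Package_timecho.md)
-
-4. Install and activate the database: choose one of the following tutorials according to your deployment architecture:
-
-   - Stand-alone: [Stand-Alone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md)
-
-   - Distributed (cluster): [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md)
-
-   - Dual-active: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md)
-
-> ❗️Note: we still recommend installing directly on physical or virtual machines. If you need a Docker deployment, see [Docker Deployment](../Deployment-and-Maintenance/Docker-Deployment_timecho.md)
-
-5. Install the supporting tools: the enterprise edition provides supporting tools such as a monitoring panel and a visual console. Installing them together with the enterprise edition is recommended, because they make IoTDB more convenient to use:
-
-   - Monitoring panel: provides more than a hundred database monitoring metrics for detailed monitoring of IoTDB and the operating system it runs on, supporting system optimization, performance optimization, and bottleneck discovery. For installation, see [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md)
-
-   - Visual console: IoTDB's graphical interface. It supports metadata management, data query, data visualization, and other operations through the UI, helping users work with the database simply and efficiently. For installation, see [Visual Console Deployment](../Deployment-and-Maintenance/workbench-deployment_timecho.md)
-
-## How do I use it?
-
-1. Database modeling: modeling is an important step in creating a database system. It involves designing the structure and relationships of the data so that the data is organized to meet the needs of a specific application. The following documents introduce IoTDB modeling:
-
-   - Time series concepts: [Navigating Time Series Data](../Basic-Concept/Navigating_Time_Series_Data.md)
-
-   - Modeling: [Data Model](../Basic-Concept/Data-Model-and-Terminology.md)
-
-   - SQL syntax: [SQL Syntax](../Basic-Concept/Operate-Metadata_timecho.md)
-
-2. Writing data: IoTDB offers several ways to insert real-time data. For basic write operations, see [Writing Data](../Basic-Concept/Write-Data)
-
-3. Querying data: IoTDB provides rich query capabilities. For a basic introduction, see [Querying Data](../Basic-Concept/Query-Data.md)
-
-4. Advanced features: beyond common database functions such as writing and querying, IoTDB also supports data synchronization, a stream processing framework, security control, permission management, and AI analysis. See the individual documents for usage:
-
-   - Data synchronization: [Data Synchronization](../User-Manual/Data-Sync_timecho.md)
-
-   - Stream processing framework: [Stream Processing Framework](../User-Manual/Streaming_timecho.md)
-
-   - Security control: [Security Control](../User-Manual/White-List_timecho.md)
-
-   - Permission management: [Permission Management](../User-Manual/Authority-Management.md)
-
-   - AI analysis: [AI Capability](../AI-capability/AINode_timecho.md)
-
-5. Application programming interfaces: IoTDB provides several application programming interfaces (APIs) for interacting with IoTDB from applications. Currently supported are the [Java native API](../API/Programming-Java-Native-API.md), [Python native API](../API/Programming-Python-Native-API.md), [C++ native API](../API/Programming-Cpp-Native-API.md), [Go native API](../API/Programming-Go-Native-API.md), and others. For more, see the other chapters under "API" on the official website
-
-## What other convenient tools are available?
-
-Beyond its own rich features, IoTDB comes with a comprehensive set of peripheral tools. This document helps you get started with them quickly:
-
-  - Visual console: Workbench is IoTDB's graphical interface. It provides intuitive metadata management, data query, and data visualization, making the database more convenient and efficient to operate. See [Visual Console Deployment](../Deployment-and-Maintenance/workbench-deployment_timecho.md)
-
-  - Monitoring panel: a tool for detailed monitoring of IoTDB and the operating system it runs on. It covers more than a hundred metrics, including database performance and system resources, and helps with system optimization and bottleneck identification. See [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md)
-
-  - Benchmark tool: IoT-benchmark is a time series database benchmarking tool built on Java and big data environments, developed and open-sourced by the School of Software, Tsinghua University. It supports multiple write and query modes, stores test information and results for further query or analysis, and integrates with Tableau to visualize test results. See [Benchmark Tool](../Tools-System/Benchmark.md)
-
-  - Data import script: IoTDB provides several ways to import data in bulk for different scenarios. See [Data Import](../Tools-System/Data-Import-Tool.md)
-
-
-  - Data export script: IoTDB provides several ways to export data in bulk for different scenarios. See [Data Export](../Tools-System/Data-Export-Tool.md)
-
-
-## Want more technical details?
-
-If you want to learn more about IoTDB internals, see the following documents:
-
-  - Research papers: IoTDB features columnar storage, data encoding, pre-computation, and indexing, along with a SQL-like interface and high-performance data processing, and integrates seamlessly with Apache Hadoop, MapReduce, and Apache Spark. See [Research Papers](../Technical-Insider/Publication.md)
-
-  - Compression & encoding: IoTDB uses a variety of encoding and compression techniques to optimize storage efficiency for different data types. See [Compression & Encoding](../Technical-Insider/Encoding-and-Compression.md)
-
-  - Data partitioning and load balancing: based on the characteristics of time series data, IoTDB uses purpose-designed data partitioning strategies and load balancing algorithms that improve cluster availability and performance. See [Data Partitioning and Load Balancing](../Technical-Insider/Cluster-data-partitioning.md)
-
-
-## Running into problems?
-
-If you run into difficulties during installation or use, see the [FAQ](../FAQ/Frequently-asked-questions.md)
\ No newline at end of file
diff --git a/src/zh/UserGuide/dev-1.3/Reference/DataNode-Config-Manual_timecho.md b/src/zh/UserGuide/dev-1.3/Reference/DataNode-Config-Manual_timecho.md
deleted file mode 100644
index 0105369f1..000000000
--- a/src/zh/UserGuide/dev-1.3/Reference/DataNode-Config-Manual_timecho.md
+++ /dev/null
@@ -1,576 +0,0 @@
-
-
-# DataNode Configuration Parameters
-
-IoTDB DataNode shares one set of configuration files with Standalone mode. They are all located in the `conf` folder of the IoTDB installation directory.
-
-* `datanode-env.sh/bat`: configuration file for environment settings; it can be used to set the DataNode memory size.
-
-* `iotdb-system.properties`: the IoTDB configuration file.
-
-## Hot-Modifiable Configuration Items
-
-For convenience, IoTDB supports hot modification: some parameters in `iotdb-system.properties` can be changed while the system is running and take effect immediately. Among the parameters described below, those whose effective mode is `hot reload` support hot modification.
-
-Sending the ```load configuration``` or `set configuration` command (SQL) to IoTDB through Session or Cli triggers a hot reload of the configuration.
-
-## Environment Configuration Items (datanode-env.sh/bat)
-
-Environment configuration items set parameters of the Java environment in which DataNode runs, such as JVM settings. They are passed to the JVM when DataNode/Standalone starts. The items are described below:
-
-* MEMORY\_SIZE
-
-|Name|MEMORY\_SIZE|
-|:---:|:---|
-|Description|Memory allocated to IoTDB DataNode at startup |
-|Type|String|
-|Default|Depends on the operating system and machine configuration. Defaults to half of the machine's memory.|
-|Effective mode|Takes effect after restarting the service|
-
-* ON\_HEAP\_MEMORY
-
-|Name|ON\_HEAP\_MEMORY|
-|:---:|:---|
-|Description|On-heap memory available to IoTDB DataNode; formerly MAX\_HEAP\_SIZE |
-|Type|String|
-|Default|Depends on the MEMORY\_SIZE setting.|
-|Effective mode|Takes effect after restarting the service|
-
-* OFF\_HEAP\_MEMORY
-
-|Name|OFF\_HEAP\_MEMORY|
-|:---:|:---|
-|Description|Off-heap memory available to IoTDB DataNode; formerly MAX\_DIRECT\_MEMORY\_SIZE |
-|Type|String|
-|Default|Depends on the MEMORY\_SIZE setting|
-|Effective mode|Takes effect after restarting the service|
-
-* JMX\_LOCAL
-
-|Name|JMX\_LOCAL|
-|:---:|:---|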
-|Description|JMX monitoring mode. `true` allows local monitoring only; `false` also allows remote monitoring. To connect to the JMX service over the network from the local machine (for example, nodeTool.sh tries to connect to 127.0.0.1:31999), set JMX_LOCAL to false.|
-|Type|Enum String: "true", "false"|
-|Default|true|
-|Effective|After restarting the service|
-
-* JMX\_PORT
-
-|Name|JMX\_PORT|
-|:---:|:---|
-|Description|JMX listening port. Make sure the port is not a system-reserved port and is not already in use.|
-|Type|Short Int: [0,65535]|
-|Default|31999|
-|Effective|After restarting the service|
-
-## System Configuration (iotdb-system.properties)
-
-The system configuration is the core configuration for running IoTDB DataNode/Standalone. It mainly sets the parameters of the DataNode/Standalone database engine.
-
-### DataNode RPC Service Configuration
-
-* dn\_rpc\_address
-
-|Name| dn\_rpc\_address |
-|:---:|:---|
-|Description| Listening address of the client RPC service |
-|Type| String |
-|Default| 0.0.0.0 |
-|Effective| After restarting the service |
-
-* dn\_rpc\_port
-
-|Name| dn\_rpc\_port |
-|:---:|:---|
-|Description| Listening port of the client RPC service |
-|Type| Short Int : [0,65535] |
-|Default| 6667 |
-|Effective| After restarting the service |
-
-* dn\_internal\_address
-
-|Name| dn\_internal\_address |
-|:---:|:---|
-|Description| DataNode internal communication address |
-|Type| string |
-|Default| 127.0.0.1 |
-|Effective| Can only be modified before the service is first started |
-
-* dn\_internal\_port
-
-|Name| dn\_internal\_port |
-|:---:|:---|
-|Description| DataNode internal communication port |
-|Type| int |
-|Default| 10730 |
-|Effective| Can only be modified before the service is first started |
-
-* dn\_mpp\_data\_exchange\_port
-
-|Name| dn\_mpp\_data\_exchange\_port |
-|:---:|:---|
-|Description| MPP data exchange port |
-|Type| int |
-|Default| 10740 |
-|Effective| Can only be modified before the service is first started |
-
-* dn\_schema\_region\_consensus\_port
-
-|Name| dn\_schema\_region\_consensus\_port |
-|:---:|:---|
-|Description| Consensus protocol communication port for DataNode schema replicas |
-|Type| int |
-|Default| 10750 |
-|Effective| Can only be modified before the service is first started |
-
-* dn\_data\_region\_consensus\_port
-
-|Name| dn\_data\_region\_consensus\_port |
-|:---:|:---|
-|Description| Consensus protocol communication port for DataNode data replicas |
-|Type| int |
-|Default| 10760 |
-|Effective| Can only be modified before the service is first started |
-
-* dn\_join\_cluster\_retry\_interval\_ms
-
-|Name| dn\_join\_cluster\_retry\_interval\_ms |
-|:---:|:---|
-|Description| Time the DataNode waits before retrying to join the cluster |
-|Type| long |
-|Default| 5000 |
-|Effective| After restarting the service |
-
-
-### SSL Configuration
-
-* enable\_thrift\_ssl
-
-|Name| enable\_thrift\_ssl |
-|:---:|:---|
-|Description| When enable\_thrift\_ssl is set to true, communication through dn\_rpc\_port is encrypted with SSL |
-|Type| Boolean |
-|Default| false |
-|Effective| After restarting the service |
-
-* enable\_https
-
-|Name| enable\_https |
-|:---:|:---|
-|Description| Whether SSL is enabled for the REST service |
-|Type| Boolean |
-|Default| false |
-|Effective| After restarting the service |
-
-* key\_store\_path
-
-|Name| key\_store\_path |
-|:---:|:---|
-|Description| Path of the SSL certificate |
-|Type| String |
-|Default| "" |
-|Effective| After restarting the service |
-
-* key\_store\_pwd
-
-|Name| key\_store\_pwd |
-|:---:|:---|
-|Description| Password of the SSL certificate |
-|Type| String |
-|Default| "" |
-|Effective| After restarting the service |
-
-
-### SeedConfigNode Configuration
-
-* dn\_seed\_config\_node
-
-|Name| dn\_seed\_config\_node |
-|:---:|:---|
-|Description| ConfigNode address that the DataNode uses to join the cluster at startup; using the SeedConfigNode is recommended. In V1.2.2 and earlier this parameter was named dn\_target\_config\_node\_list |
-|Type| String |
-|Default| 127.0.0.1:10710 |
-|Effective| Can only be modified before the service is first started |
-
-### Connection Configuration
-
-* dn\_session\_timeout\_threshold
-
-|Name| dn\_session\_timeout\_threshold |
-|:---:|:---|
-|Description| Maximum session idle time |
-|Type| int |
-|Default| 0 |
-|Effective| After restarting the service |
-
-
-* dn\_rpc\_thrift\_compression\_enable
-
-|Name| dn\_rpc\_thrift\_compression\_enable |
-|:---:|:---|
-|Description| Whether to enable Thrift compression |
-|Type| Boolean |
-|Default| false |
-|Effective| After restarting the service |
-
-* dn\_rpc\_advanced\_compression\_enable
-
-|Name| dn\_rpc\_advanced\_compression\_enable |
-|:---:|:---|
-|Description| Whether to enable Thrift's customized compression |
-|Type| Boolean |
-|Default| false |
-|Effective| After restarting the service |
-
-* dn\_rpc\_selector\_thread\_count
-
-| Name | dn\_rpc\_selector\_thread\_count |
-|:---:|:---|
-| Description | Number of RPC selector threads |
-| Type | int |
-| Default | 1 |
-| Effective | After restarting the service |
-
-* dn\_rpc\_min\_concurrent\_client\_num
-
-| Name | dn\_rpc\_min\_concurrent\_client\_num |
-|:---:|:---|
-| Description | Minimum number of connections |
-| Type | Short Int : [0,65535] |
-| Default | 1 |
-| Effective | After restarting the service |
-
-* dn\_rpc\_max\_concurrent\_client\_num
-
-| Name | dn\_rpc\_max\_concurrent\_client\_num |
-|:------:|:---|
-| Description | Maximum number of connections |
-| Type | Short Int : [0,65535] |
-| Default | 1000 |
-| Effective | After restarting the service |
-
-* dn\_thrift\_max\_frame\_size
-
-|Name| dn\_thrift\_max\_frame\_size |
-|:---:|:---|
-|Description| Maximum size, in bytes, of an RPC request/response |
-|Type| long |
-|Default| 536870912 (512 MB) |
-|Effective| After restarting the service |
-
-* dn\_thrift\_init\_buffer\_size
-
-|Name| dn\_thrift\_init\_buffer\_size |
-|:---:|:---|
-|Description| Size in bytes |
-|Type| long |
-|Default| 1024 |
-|Effective| After restarting the service |
-
-* dn\_connection\_timeout\_ms
-
-| Name | dn\_connection\_timeout\_ms |
-|:------:|:---|
-| Description | Node connection timeout |
-| Type | int |
-| Default | 60000 |
-| Effective | After restarting the service |
-
-* dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager
-
-| Name | dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager |
-|:------:|:---|
-| Description | Number of core clients routed to each node in a single ClientManager |
-| Type | int |
-| Default | 200 |
-| Effective | After restarting the service |
-
-* dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager
-
-| Name | dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager |
-|:------:|:---|
-| Description | Maximum number of clients routed to each node in a single ClientManager |
-| Type | int |
-| Default | 300 |
-| Effective | After restarting the service |
-
-### Directory Configuration
-
-* dn\_system\_dir
-
-| Name | dn\_system\_dir |
-|:------:|:---|
-| Description | Storage path for IoTDB metadata. By default it is placed in the data directory at the same level as the sbin directory. The base directory of a relative path depends on the operating system, so an absolute path is recommended. |
-| Type | String |
-| Default | data/datanode/system(Windows:data\\datanode\\system) |
-| Effective | After restarting the service |
-
-* dn\_data\_dirs
-
-| Name | dn\_data\_dirs |
-|:------:|:---|
-| Description | Storage path for IoTDB data. By default it is placed in the data directory at the same level as the sbin directory. The base directory of a relative path depends on the operating system, so an absolute path is recommended. |
-| Type | String |
-| Default | data/datanode/data(Windows:data\\datanode\\data) |
-| Effective | After restarting the service |
-
-* dn\_multi\_dir\_strategy
-
-| Name | dn\_multi\_dir\_strategy |
-|:------:|:---|
-| Description | Strategy that IoTDB uses to choose a directory in data\_dirs for a TsFile. Either the simple class name or the fully qualified class name can be used. The system provides the following strategies:<br>1. SequenceStrategy: IoTDB selects directories in order, iterating over all directories in data\_dirs in a round-robin fashion;<br>2. MaxDiskUsableSpaceFirstStrategy: IoTDB prefers the directory in data\_dirs whose disk has the most free space;<br>To define a custom strategy:<br>1. Extend the org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy class and implement your own strategy method;<br>2. Set this parameter to the fully qualified name of the implemented class (package name plus class name, UserDefineStrategyPackage);<br>3. Add the jar containing the class to the project. |
-| Type | String |
-| Default | SequenceStrategy |
-| Effective | Hot reload |
-
-* dn\_consensus\_dir
-
-| Name | dn\_consensus\_dir |
-|:------:|:---|
-| Description | Storage path for IoTDB consensus-layer logs. By default it is placed in the data directory at the same level as the sbin directory. The base directory of a relative path depends on the operating system, so an absolute path is recommended. |
-| Type | String |
-| Default | data/datanode/consensus(Windows:data\\datanode\\consensus) |
-| Effective | After restarting the service |
-
-* dn\_wal\_dirs
-
-| Name | dn\_wal\_dirs |
-|:------:|:---|
-| Description | Storage path for the IoTDB write-ahead log. By default it is placed in the data directory at the same level as the sbin directory. The base directory of a relative path depends on the operating system, so an absolute path is recommended. |
-| Type | String |
-| Default | data/datanode/wal(Windows:data\\datanode\\wal) |
-| Effective | After restarting the service |
-
-* dn\_tracing\_dir
-
-| Name | dn\_tracing\_dir |
-|:------:|:---|
-| Description | Root directory path for IoTDB tracing. By default it is placed in the data directory at the same level as the sbin directory. The base directory of a relative path depends on the operating system, so an absolute path is recommended. |
-| Type | String |
-| Default | datanode/tracing |
-| Effective | After restarting the service |
-
-* dn\_sync\_dir
-
-| Name | dn\_sync\_dir |
-|:------:|:---|
-| Description | Storage path for IoTDB sync. By default it is placed in the data directory at the same level as the sbin directory. The base directory of a relative path depends on the operating system, so an absolute path is recommended. |
-| Type | String |
-| Default | data/datanode/sync |
-| Effective | After restarting the service |
-
-### Metric Configuration
-
-## Enabling GC Logs
-
-GC logging is disabled by default. For performance tuning, you may need to collect GC information.
-To enable GC logging, add the "printgc" argument when starting the IoTDB server:
-
-```bash
-nohup sbin/start-datanode.sh printgc >/dev/null 2>&1 &
-```
-
-or
-
-```bash
-sbin\start-datanode.bat printgc
-```
-
-GC logs are stored in `IOTDB_HOME/logs/gc.log`.
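At most 10 gc.log files are kept, each up to 10 MB.
-
-#### REST Service Configuration
-
-* enable\_rest\_service
-
-|Name| enable\_rest\_service |
-|:---:|:---|
-|Description| Whether to enable the REST service. |
-|Type| Boolean |
-|Default| false |
-|Effective| After restarting the service |
-
-* rest\_service\_port
-
-|Name| rest\_service\_port |
-|:---:|:---|
-|Description| Listening port of the REST service |
-|Type| int32 |
-|Default| 18080 |
-|Effective| After restarting the service |
-
-* enable\_swagger
-
-|Name| enable\_swagger |
-|:---:|:---|
-|Description| Whether to enable Swagger to display REST interface information |
-|Type| Boolean |
-|Default| false |
-|Effective| After restarting the service |
-
-* rest\_query\_default\_row\_size\_limit
-
-|Name| rest\_query\_default\_row\_size\_limit |
-|:---:|:---|
-|Description| Maximum number of rows that a single query can return in its result set |
-|Type| int32 |
-|Default| 10000 |
-|Effective| After restarting the service |
-
-* cache\_expire
-
-|Name| cache\_expire |
-|:---:|:---|
-|Description| Expiration time of cached client login information |
-|Type| int32 |
-|Default| 28800 |
-|Effective| After restarting the service |
-
-* cache\_max\_num
-
-|Name| cache\_max\_num |
-|:---:|:---|
-|Description| Maximum number of users stored in the cache |
-|Type| int32 |
-|Default| 100 |
-|Effective| After restarting the service |
-
-* cache\_init\_num
-
-|Name| cache\_init\_num |
-|:---:|:---|
-|Description| Initial cache capacity |
-|Type| int32 |
-|Default| 10 |
-|Effective| After restarting the service |
-
-* trust\_store\_path
-
-|Name| trust\_store\_path |
-|:---:|:---|
-|Description| trustStore path (optional) |
-|Type| String |
-|Default| "" |
-|Effective| After restarting the service |
-
-* trust\_store\_pwd
-
-|Name| trust\_store\_pwd |
-|:---:|:---|
-|Description| trustStore password (optional) |
-|Type| String |
-|Default| "" |
-|Effective| After restarting the service |
-
-* idle\_timeout
-
-|Name| idle\_timeout |
-|:---:|:---|
-|Description| SSL timeout, in seconds |
-|Type| int32 |
-|Default| 5000 |
-|Effective| After restarting the service |
-
-
-
-#### Tiered Storage Configuration
-
-* dn\_default\_space\_usage\_thresholds
-
-|Name| dn\_default\_space\_usage\_thresholds |
-|:---:|:---|
-|Description| Defines the minimum ratio of remaining space for the data directories of each tier. When the remaining space falls below this ratio, data is automatically migrated to the next tier; when the remaining space of the last tier falls below this threshold, the system is set to READ_ONLY |
-|Type| double |
-|Default| 0.85 |
-|Effective| Hot reload |
-
-* remote\_tsfile\_cache\_dirs
-
-|Name| remote\_tsfile\_cache\_dirs |
-|:---:|:---|
-|Description| Local cache directory for cloud storage |
-|Type| string |
-|Default| data/datanode/data/cache |
-|Effective| After restarting the service |
-
-* remote\_tsfile\_cache\_page\_size\_in\_kb
-
-|Name| remote\_tsfile\_cache\_page\_size\_in\_kb |
-|:---:|:---|
-|Description| Block size of the local cache files for cloud storage |
-|Type| int |
-|Default| 20480 |
-|Effective| After restarting the service |
-
-* remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb
-
-|Name| remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb |
-|:---:|:---|
-|Description| Maximum disk space used by the local cache for cloud storage |
-|Type| long |
-|Default| 51200 |
-|Effective| After restarting the service |
-
-* object\_storage\_type
-
-|Name| object\_storage\_type |
-|:---:|:---|
-|Description| Cloud storage type |
-|Type| string |
-|Default| AWS_S3 |
-|Effective| After restarting the service |
-
-* object\_storage\_bucket
-
-|Name| object\_storage\_bucket |
-|:---:|:---|
-|Description| Name of the cloud storage bucket |
-|Type| string |
-|Default| iotdb_data |
-|Effective| After restarting the service |
-
-* object\_storage\_endpoint
-
-|Name| object\_storage\_endpoint |
-|:---:|:---|
-|Description| Endpoint of the cloud storage |
-|Type| string |
-|Default| None |
-|Effective| After restarting the service |
-
-* object\_storage\_access\_key
-
-|Name| object\_storage\_access\_key |
-|:---:|:---|
-|Description| Access key credential for the cloud storage |
-|Type| string |
-|Default| None |
-|Effective| After restarting the service |
-
-* object\_storage\_access\_secret
-
-|Name| object\_storage\_access\_secret |
-|:---:|:---|
-|Description| Access secret credential for the cloud storage |
-|Type| string |
-|Default| None |
-|Effective| After restarting the service |
\ No newline at end of file
diff --git a/src/zh/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md b/src/zh/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md
deleted file mode 100644
index f361f26c9..000000000
--- a/src/zh/UserGuide/dev-1.3/SQL-Manual/UDF-Libraries_timecho.md
+++ /dev/null
@@ -1,5005 +0,0 @@
-
-# UDF Function Library
-
-Built on its user-defined function (UDF) capability, IoTDB provides a series of functions for time series processing, covering data quality, data profiling, anomaly detection, frequency-domain analysis, data matching, data repairing, series discovery, and machine learning, to meet the time series processing needs of industrial applications.
-
-> Note: The functions in the current UDF library support only millisecond timestamp precision.
-
-## Installation Steps
-1. Obtain the compressed package of the UDF library JAR that is compatible with your IoTDB version.
-
-    | UDF package | Supported IoTDB versions | Download link |
-    | --------------- | ----------------- | ------------------------------------------------------------ |
-    | TimechoDB-UDF-1.3.3.zip | V1.3.3 and later | Contact Timecho sales to obtain it |
-    | TimechoDB-UDF-1.3.2.zip | V1.0.0~V1.3.2 | Contact Timecho sales to obtain it |
-
-2. Place the `library-udf.jar` file from the package in the `/ext/udf` directory of every node in the IoTDB cluster.
-3. In the IoTDB SQL command-line client (CLI) or the SQL page of the visual console (Workbench), execute the corresponding function registration statements below.
-4. Batch registration: two methods are available, a registration script or a consolidated SQL file.
-- Registration script
-    - Copy the registration script from the package (`register-UDF.sh` or `register-UDF.bat`) to the tools directory of IoTDB as needed, and modify the parameters in the script (defaults: host=127.0.0.1, rpcPort=6667, user=root, pass=root);
-    - Start the IoTDB service and run the registration script to register the UDFs in batch.
-
-- Consolidated SQL file
-    - Open the SQL file in the package, copy all SQL statements, and execute them in the IoTDB SQL command-line client (CLI) or the SQL page of the visual console (Workbench) to register the UDFs in batch.
-
-## Data Quality
-
-### Completeness
-
-#### Registration Statement
-
-```sql
-create function completeness as 'org.apache.iotdb.library.dquality.UDTFCompleteness'
-```
-
-#### Function Overview
-
-This function computes the completeness of a time series, which measures whether a stretch of time series data has missing points. It splits the input series into consecutive, non-overlapping windows, computes the completeness of each window, and outputs the timestamp of the window's first data point together with the completeness result.
-
-**Name:** COMPLETENESS
-
-**Input Series:** Only a single input series is supported. Its type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `window`: Window size, given as either an integer greater than 0 or a positive number with a unit. The former is the number of data points in each window, and the last window may hold fewer points; the latter is the time span of the window, with five supported units: 'ms' (millisecond), 's' (second), 'm' (minute), 'h' (hour), and 'd' (day). By default, all input data belongs to a single window.
-+ `downtime`: Whether the completeness computation accounts for downtime. Its value is 'true' or 'false', and the default is 'true'. When downtime is taken into account, a long stretch of missing data is treated as downtime and does not affect completeness.
-
-**Output Series:** A single output series of type DOUBLE, in which every value lies in the range [0,1].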
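
To illustrate the count-based form of the `window` parameter, the following Python sketch splits a list of timestamps into consecutive, non-overlapping windows and reports which windows would produce an output point. It is not the UDF's implementation: the helper name `count_windows` is hypothetical, and the rule that a window is evaluated only when it holds more than 10 points is taken from this function's usage note.

```python
from typing import List, Sequence, Tuple

# Assumption taken from the function's usage note: a window is evaluated
# only when it holds more than 10 data points.
MIN_POINTS = 10


def count_windows(timestamps: Sequence[int], window: int) -> List[Tuple[int, int]]:
    """Hypothetical helper: split points into consecutive, non-overlapping
    windows of `window` points each (the last one may be shorter) and return
    (first_timestamp, point_count) for every window that would be evaluated."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    evaluated = []
    for start in range(0, len(timestamps), window):
        chunk = timestamps[start:start + window]
        if len(chunk) > MIN_POINTS:
            evaluated.append((chunk[0], len(chunk)))
    return evaluated


# Second offsets of the 30 points used in the windowed example for this function.
seconds = [2, 3, 4, 6, 8, 10, 14, 15, 16, 18, 20, 22, 26, 28, 30] + list(range(32, 61, 2))
print(count_windows(seconds, 15))  # [(2, 15), (32, 15)]
```

With `window` set to 15, the 30 points form two full windows that start at second 2 and second 32, which is consistent with the two output timestamps in the windowed example for this function.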
- -**提示:** 只有当窗口内的数据点数目超过10时,才会进行完整性计算。否则,该窗口将被忽略,不做任何输出。 - - -#### 使用示例 - -##### 参数缺省 - -在参数缺省的情况下,本函数将会把全部输入数据都作为同一个窗口计算完整性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select completeness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -输出序列: - -``` -+-----------------------------+-----------------------------+ -| Time|completeness(root.test.d1.s1)| -+-----------------------------+-----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -+-----------------------------+-----------------------------+ -``` - -##### 指定窗口大小 - -在指定窗口大小的情况下,本函数会把输入数据划分为若干个窗口计算完整性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| 
-|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select completeness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------+ -| Time|completeness(root.test.d1.s1, "window"="15")| -+-----------------------------+--------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+--------------------------------------------+ -``` - -### Consistency - -#### 注册语句 - -```sql -create function consistency as 'org.apache.iotdb.library.dquality.UDTFConsistency' -``` - -#### 函数简介 - -本函数用于计算时间序列的一致性,用来衡量时序数据变化是否平稳、规律是否统一。函数会把输入的时序数据分成连续不重叠的时间窗口,分别计算每个窗口的数据一致性,并输出窗口第一个数据点的时间戳和一致性结果。 - -**函数名:** CONSISTENCY - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `window`:窗口大小,它是一个大于0的整数或者一个有单位的正数。前者代表每一个窗口包含的数据点数目,最后一个窗口的数据点数目可能会不足;后者代表窗口的时间跨度,目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。缺省情况下,全部输入数据都属于同一个窗口。 - -**输出序列:** 输出单个序列,类型为DOUBLE,其中每一个数据点的值的范围都是 [0,1]. 
- -**提示:** 只有当窗口内的数据点数目超过10时,才会进行一致性计算。否则,该窗口将被忽略,不做任何输出。 - - -#### 使用示例 - -##### 参数缺省 - -在参数缺省的情况下,本函数将会把全部输入数据都作为同一个窗口计算一致性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select consistency(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -输出序列: - -``` -+-----------------------------+----------------------------+ -| Time|consistency(root.test.d1.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+----------------------------+ -``` - -##### 指定窗口大小 - -在指定窗口大小的情况下,本函数会把输入数据划分为若干个窗口计算一致性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| 
-|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select consistency(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------+ -| Time|consistency(root.test.d1.s1, "window"="15")| -+-----------------------------+-------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+-------------------------------------------+ -``` - -### Timeliness - -#### 注册语句 - -```sql -create function timeliness as 'org.apache.iotdb.library.dquality.UDTFTimeliness' -``` - -#### 函数简介 - -本函数用于计算时间序列的时效性,用来衡量时序数据是否按时采集、按时上报。函数会把输入的时序数据分成连续不重叠的时间窗口,分别计算每个窗口的数据时效性,并输出窗口第一个数据点的时间戳和时效性结果。 - -**函数名:** TIMELINESS - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `window`:窗口大小,它是一个大于0的整数或者一个有单位的正数。前者代表每一个窗口包含的数据点数目,最后一个窗口的数据点数目可能会不足;后者代表窗口的时间跨度,目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。缺省情况下,全部输入数据都属于同一个窗口。 - -**输出序列:** 输出单个序列,类型为DOUBLE,其中每一个数据点的值的范围都是 [0,1]. 
- -**提示:** 只有当窗口内的数据点数目超过10时,才会进行时效性计算。否则,该窗口将被忽略,不做任何输出。 - - -#### 使用示例 - -##### 参数缺省 - -在参数缺省的情况下,本函数将会把全部输入数据都作为同一个窗口计算时效性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select timeliness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -输出序列: - -``` -+-----------------------------+---------------------------+ -| Time|timeliness(root.test.d1.s1)| -+-----------------------------+---------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+---------------------------+ -``` - -##### 指定窗口大小 - -在指定窗口大小的情况下,本函数会把输入数据划分为若干个窗口计算时效性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| 
-|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select timeliness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------+ -| Time|timeliness(root.test.d1.s1, "window"="15")| -+-----------------------------+------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+------------------------------------------+ -``` - -### Validity - -#### 注册语句 - -```sql -create function validity as 'org.apache.iotdb.library.dquality.UDTFValidity' -``` - -#### 函数简介 - -本函数用于计算时间序列的有效性,用来衡量时序数据是否正常、可用、无异常值。函数会把输入的时序数据分成连续不重叠的时间窗口,分别计算每个窗口的数据有效性,并输出窗口第一个数据点的时间戳和有效性结果。 - -**函数名:** VALIDITY - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `window`:窗口大小,它是一个大于0的整数或者一个有单位的正数。前者代表每一个窗口包含的数据点数目,最后一个窗口的数据点数目可能会不足;后者代表窗口的时间跨度,目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。缺省情况下,全部输入数据都属于同一个窗口。 - -**输出序列:** 输出单个序列,类型为DOUBLE,其中每一个数据点的值的范围都是 [0,1]. 
- -**提示:** 只有当窗口内的数据点数目超过10时,才会进行有效性计算。否则,该窗口将被忽略,不做任何输出。 - - -#### 使用示例 - -##### 参数缺省 - -在参数缺省的情况下,本函数将会把全部输入数据都作为同一个窗口计算有效性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select validity(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -输出序列: - -``` -+-----------------------------+-------------------------+ -| Time|validity(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -+-----------------------------+-------------------------+ -``` - -##### 指定窗口大小 - -在指定窗口大小的情况下,本函数会把输入数据划分为若干个窗口计算有效性。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| 
-|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select validity(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -输出序列: - -``` -+-----------------------------+----------------------------------------+ -| Time|validity(root.test.d1.s1, "window"="15")| -+-----------------------------+----------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.8833333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - - - - -## 数据画像 - -### ACF - -#### 注册语句 - -```sql -create function acf as 'org.apache.iotdb.library.dprofile.UDTFACF' -``` - -#### 函数简介 - -本函数用于计算时间序列的自相关函数值,即序列与自身之间的互相关函数。 - -**函数名:** ACF - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列中共包含$2N-1$个数据点。 - -**提示:** - -+ 序列中的`NaN`值会被忽略,在计算中表现为0。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ 
-``` - - -用于查询的 SQL 语句: - -```sql -select acf(s1) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -输出序列: - -``` -+-----------------------------+--------------------+ -| Time|acf(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 6.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -|1970-01-01T08:00:00.004+08:00| 7.0| -|1970-01-01T08:00:00.005+08:00| 0.0| -|1970-01-01T08:00:00.006+08:00| 3.6| -|1970-01-01T08:00:00.007+08:00| 0.0| -|1970-01-01T08:00:00.008+08:00| 1.0| -+-----------------------------+--------------------+ -``` - -### Distinct - -#### 注册语句 - -```sql -create function distinct as 'org.apache.iotdb.library.dprofile.UDTFDistinct' -``` - -#### 函数简介 - -本函数可以返回输入序列中出现的所有不同的元素。 - -**函数名:** DISTINCT - -**输入序列:** 仅支持单个输入序列,类型可以是任意的 - -**输出序列:** 输出单个序列,类型与输入相同。 - -**提示:** - -+ 输出序列的时间戳是无意义的。输出顺序是任意的。 -+ 缺失值和空值将被忽略,但`NaN`不会被忽略。 -+ 字符串区分大小写 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2020-01-01T08:00:00.001+08:00| Hello| -|2020-01-01T08:00:00.002+08:00| hello| -|2020-01-01T08:00:00.003+08:00| Hello| -|2020-01-01T08:00:00.004+08:00| World| -|2020-01-01T08:00:00.005+08:00| World| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select distinct(s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------+ -| Time|distinct(root.test.d2.s2)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.001+08:00| Hello| -|1970-01-01T08:00:00.002+08:00| hello| -|1970-01-01T08:00:00.003+08:00| World| -+-----------------------------+-------------------------+ -``` - -### Histogram - -#### 注册语句 - -```sql -create function histogram as 'org.apache.iotdb.library.dprofile.UDTFHistogram' -``` - -#### 函数简介 - -本函数用于计算单列数值型数据的分布直方图。 - -**函数名:** 
HISTOGRAM
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `min`: The lower bound of the requested data range. The default value is -Double.MAX_VALUE.
-+ `max`: The upper bound of the requested data range. The default value is Double.MAX_VALUE. The value of `min` must be less than or equal to `max`.
-+ `count`: The number of histogram buckets. The default value is 1. It must be a positive integer.
-
-**Output Series:** The values of the histogram buckets. The i-th bucket (counting from 1) covers the data range with lower bound $min+ (i-1)\cdot\frac{max-min}{count}$ and upper bound $min+ i \cdot \frac{max-min}{count}$.
-
-
-**Note:**
-
-+ If the value of a data point is less than `min`, it is placed in the 1st bucket; if it is greater than `max`, it is placed in the last bucket.
-+ Null values, missing values, and `NaN` in the data are ignored.
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:00.000+08:00| 1.0|
-|2020-01-01T00:00:01.000+08:00| 2.0|
-|2020-01-01T00:00:02.000+08:00| 3.0|
-|2020-01-01T00:00:03.000+08:00| 4.0|
-|2020-01-01T00:00:04.000+08:00| 5.0|
-|2020-01-01T00:00:05.000+08:00| 6.0|
-|2020-01-01T00:00:06.000+08:00| 7.0|
-|2020-01-01T00:00:07.000+08:00| 8.0|
-|2020-01-01T00:00:08.000+08:00| 9.0|
-|2020-01-01T00:00:09.000+08:00| 10.0|
-|2020-01-01T00:00:10.000+08:00| 11.0|
-|2020-01-01T00:00:11.000+08:00| 12.0|
-|2020-01-01T00:00:12.000+08:00| 13.0|
-|2020-01-01T00:00:13.000+08:00| 14.0|
-|2020-01-01T00:00:14.000+08:00| 15.0|
-|2020-01-01T00:00:15.000+08:00| 16.0|
-|2020-01-01T00:00:16.000+08:00| 17.0|
-|2020-01-01T00:00:17.000+08:00| 18.0|
-|2020-01-01T00:00:18.000+08:00| 19.0|
-|2020-01-01T00:00:19.000+08:00| 20.0|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select histogram(s1,"min"="1","max"="20","count"="10") from root.test.d1
-```
-
-Output series:
-
-```
-+-----------------------------+---------------------------------------------------------------+
-| Time|histogram(root.test.d1.s1, "min"="1", "max"="20", "count"="10")|
-+-----------------------------+---------------------------------------------------------------+
-|1970-01-01T08:00:00.000+08:00| 2|
-|1970-01-01T08:00:00.001+08:00| 2|
-|1970-01-01T08:00:00.002+08:00| 2|
-|1970-01-01T08:00:00.003+08:00| 2|
-|1970-01-01T08:00:00.004+08:00| 2| -|1970-01-01T08:00:00.005+08:00| 2| -|1970-01-01T08:00:00.006+08:00| 2| -|1970-01-01T08:00:00.007+08:00| 2| -|1970-01-01T08:00:00.008+08:00| 2| -|1970-01-01T08:00:00.009+08:00| 2| -+-----------------------------+---------------------------------------------------------------+ -``` - -### Integral - -#### 注册语句 - -```sql -create function integral as 'org.apache.iotdb.library.dprofile.UDAFIntegral' -``` - -#### 函数简介 - -本函数用于计算时间序列的数值积分,即以时间为横坐标、数值为纵坐标绘制的折线图中折线以下的面积。 - -**函数名:** INTEGRAL - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `unit`:积分求解所用的时间轴单位,取值为 "1S", "1s", "1m", "1H", "1d"(区分大小写),分别表示以毫秒、秒、分钟、小时、天为单位计算积分。 - 缺省情况下取 "1s",以秒为单位。 - -**输出序列:** 输出单个序列,类型为 DOUBLE,序列仅包含一个时间戳为 0、值为积分结果的数据点。 - -**提示:** - -+ 积分值等于折线图中每相邻两个数据点和时间轴形成的直角梯形的面积之和,不同时间单位下相当于横轴进行不同倍数放缩,得到的积分值可直接按放缩倍数转换。 - -+ 数据中`NaN`将会被忽略。折线将以临近两个有值数据点为准。 - -#### 使用示例 - -##### 参数缺省 - -缺省情况下积分以1s为时间单位。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| 2| -|2020-01-01T00:00:03.000+08:00| 5| -|2020-01-01T00:00:04.000+08:00| 6| -|2020-01-01T00:00:05.000+08:00| 7| -|2020-01-01T00:00:08.000+08:00| 8| -|2020-01-01T00:00:09.000+08:00| NaN| -|2020-01-01T00:00:10.000+08:00| 10| -+-----------------------------+---------------+ -``` - - -用于查询的 SQL 语句: - -```sql -select integral(s1) from root.test.d1 where time <= 2020-01-01 00:00:10 -``` - -输出序列: - -``` -+-----------------------------+-------------------------+ -| Time|integral(root.test.d1.s1)| -+-----------------------------+-------------------------+ -|1970-01-01T08:00:00.000+08:00| 57.5| -+-----------------------------+-------------------------+ -``` - -其计算公式为: -$$\frac{1}{2}[(1+2)\times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 57.5$$ - - -##### 指定时间单位 - -指定以分钟为时间单位。 - - -输入序列同上,用于查询的 SQL 语句如下: - 
-```sql
-select integral(s1, "unit"="1m") from root.test.d1 where time <= 2020-01-01 00:00:10
-```
-
-Output series:
-
-```
-+-----------------------------+-------------------------+
-| Time|integral(root.test.d1.s1)|
-+-----------------------------+-------------------------+
-|1970-01-01T08:00:00.000+08:00| 0.958|
-+-----------------------------+-------------------------+
-```
-
-It is computed as:
-$$\frac{1}{2\times 60}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 0.958$$
-
-### IntegralAvg
-
-#### Registration statement
-
-```sql
-create function integralavg as 'org.apache.iotdb.library.dprofile.UDAFIntegralAvg'
-```
-
-#### Usage
-
-This function computes the function average of a time series, that is, the numerical integral divided by the total time span of the series, with both measured in the same time unit. For more on how the numerical integral is computed, see the `Integral` function.
-
-**Name:** INTEGRALAVG
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Output Series:** Outputs a single series of type DOUBLE. The series contains exactly one data point, whose timestamp is 0 and whose value is the time-weighted average.
-
-**Note:**
-
-+ The time-weighted average equals the numerical integral computed under any time unit `unit` (the sum of the areas of the right trapezoids formed by each pair of adjacent data points and the time axis),
-  divided by the time span of the input series in the same time unit. Its value does not depend on the time unit used; by default the unit is the same as the IoTDB time unit.
-
-+ `NaN` values in the data are ignored. The polyline is drawn between the nearest two data points that have values.
-
-+ If the input series is empty, the function outputs 0; if it contains only one data point, the output is the value of that point.
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:01.000+08:00| 1|
-|2020-01-01T00:00:02.000+08:00| 2|
-|2020-01-01T00:00:03.000+08:00| 5|
-|2020-01-01T00:00:04.000+08:00| 6|
-|2020-01-01T00:00:05.000+08:00| 7|
-|2020-01-01T00:00:08.000+08:00| 8|
-|2020-01-01T00:00:09.000+08:00| NaN|
-|2020-01-01T00:00:10.000+08:00| 10|
-+-----------------------------+---------------+
-```
-
-
-SQL for query:
-
-```sql
-select integralavg(s1) from root.test.d1 where time <= 2020-01-01 00:00:10
-```
-
-Output series:
-
-```
-+-----------------------------+----------------------------+
-| Time|integralavg(root.test.d1.s1)|
-+-----------------------------+----------------------------+
-|1970-01-01T08:00:00.000+08:00| 5.75|
-+-----------------------------+----------------------------+
-```
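The trapezoid sums behind `Integral` and `IntegralAvg` can be checked by hand. The Python sketch below is only an illustration under stated assumptions, not the library's implementation: the helper `trapezoid_integral` and its `unit_seconds` parameter are hypothetical names, timestamps are reduced to plain seconds, and the divisor 10 in the last step is copied from the formula given in this section rather than derived here.

```python
import math


def trapezoid_integral(points, unit_seconds=1.0):
    """Sum the trapezoid areas between adjacent valid points.

    points: list of (time_in_seconds, value); NaN values are skipped,
    so the polyline connects the nearest valid neighbours.
    unit_seconds: length of the time unit in seconds (hypothetical
    stand-in for the `unit` parameter, e.g. 60 for "1m").
    """
    valid = [(t, v) for t, v in points if not math.isnan(v)]
    area = 0.0
    for (t0, v0), (t1, v1) in zip(valid, valid[1:]):
        area += (v0 + v1) * (t1 - t0) / 2.0
    return area / unit_seconds


# The example series from this section, with time expressed in seconds.
points = [
    (1, 1.0), (2, 2.0), (3, 5.0), (4, 6.0),
    (5, 7.0), (8, 8.0), (9, float("nan")), (10, 10.0),
]

integral_seconds = trapezoid_integral(points)        # unit "1s"
integral_minutes = trapezoid_integral(points, 60.0)  # unit "1m"

# Divisor 10 is taken from the documented IntegralAvg formula (assumption).
documented_average = integral_seconds / 10

print(integral_seconds, round(integral_minutes, 3), documented_average)
```

Skipping the `NaN` point at second 9 is what produces the final trapezoid of width 2 between the values 8 and 10.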
- -其计算公式为: -$$\frac{1}{2}[(1+2)\times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] / 10 = 5.75$$ - -### Mad - -#### 注册语句 - -```sql -create function mad as 'org.apache.iotdb.library.dprofile.UDAFMad' -``` - -#### 函数简介 - -本函数用于计算单列数值型数据的精确或近似绝对中位差,绝对中位差为所有数值与其中位数绝对偏移量的中位数。 - -如有数据集$\{1,3,3,5,5,6,7,8,9\}$,其中位数为5,所有数值与中位数的偏移量的绝对值为$\{0,0,1,2,2,2,3,4,4\}$,其中位数为2,故而原数据集的绝对中位差为2。 - -**函数名:** MAD - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `error`:近似绝对中位差的基于数值的误差百分比,取值范围为 [0,1),默认值为 0。如当`error`=0.01 时,记精确绝对中位差为a,近似绝对中位差为b,不等式 $0.99a \le b \le 1.01a$ 成立。当`error`=0 时,计算结果为精确绝对中位差。 - - -**输出序列:** 输出单个序列,类型为DOUBLE,序列仅包含一个时间戳为 0、值为绝对中位差的数据点。 - -**提示:** 数据中的空值、缺失值和`NaN`将会被忽略。 - -#### 使用示例 - -##### 近似查询 - -当`error`参数取值不为 0 时,本函数计算近似绝对中位差。 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -............ 
-Total line number = 20 -``` - -用于查询的 SQL 语句如下: - -```sql -select mad(s1, "error"="0.01") from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------+ -| Time|mad(root.test.s1, "error"="0.01")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9900000000000001| -+-----------------------------+---------------------------------+ -``` - -### Median - -#### 注册语句 - -```sql -create function median as 'org.apache.iotdb.library.dprofile.UDAFMedian' -``` - -#### 函数简介 - -本函数用于计算单列数值型数据的精确或近似中位数。中位数是顺序排列的一组数据中居于中间位置的数;当序列有偶数个时,中位数为中间二者的平均数。 - -**函数名:** MEDIAN - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `error`:近似中位数的基于排名的误差百分比,取值范围 [0,1),默认值为 0。如当`error`=0.01 时,计算出的中位数的真实排名百分比在 0.49~0.51 之间。当`error`=0 时,计算结果为精确中位数。 - -**输出序列:** 输出单个序列,类型为 DOUBLE,序列仅包含一个时间戳为 0、值为中位数的数据点。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -Total line number = 20 -``` - -用于查询的 SQL 语句: - -```sql -select median(s1, "error"="0.01") from root.test -``` - -输出序列: - -``` 
-+-----------------------------+------------------------------------+ -| Time|median(root.test.s1, "error"="0.01")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -+-----------------------------+------------------------------------+ -``` - -### MinMax - -#### 注册语句 - -```sql -create function minmax as 'org.apache.iotdb.library.dprofile.UDTFMinMax' -``` - -#### 函数简介 - -本函数将输入序列使用 min-max 方法进行标准化。最小值归一至 0,最大值归一至 1. - -**函数名:** MINMAX - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `compute`:若设置为"batch",则将数据全部读入后转换;若设置为 "stream",则需用户提供最大值及最小值进行流式计算转换。默认为 "batch"。 -+ `min`:使用流式计算时的最小值。 -+ `max`:使用流式计算时的最大值。 - -**输出序列**:输出单个序列,类型为 DOUBLE。 - -#### 使用示例 - -##### 全数据计算 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select minmax(s1) from root.test -``` - -输出序列: - -``` -+-----------------------------+--------------------+ -| Time|minmax(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 0.16666666666666666| 
-|1970-01-01T08:00:00.200+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.300+08:00| 0.25| -|1970-01-01T08:00:00.400+08:00| 0.08333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:00.700+08:00| 0.0| -|1970-01-01T08:00:00.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.900+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.000+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.100+08:00| 0.25| -|1970-01-01T08:00:01.200+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.300+08:00| 0.08333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.25| -|1970-01-01T08:00:01.500+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.600+08:00| 0.16666666666666666| -|1970-01-01T08:00:01.700+08:00| 1.0| -|1970-01-01T08:00:01.800+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.900+08:00| 0.0| -|1970-01-01T08:00:02.000+08:00| 0.16666666666666666| -+-----------------------------+--------------------+ -``` - - - -### MvAvg - -#### 注册语句 - -```sql -create function mvavg as 'org.apache.iotdb.library.dprofile.UDTFMvAvg' -``` - -#### 函数简介 - -本函数计算序列的移动平均。 - -**函数名:** MVAVG - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `window`:移动窗口的长度。默认值为 10. 
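As a rough illustration of what the `window` parameter controls, the Python sketch below computes a trailing moving average. It assumes that each output value is the mean of the current point and the `window - 1` points before it, and that the output carries the timestamp of the last point in the window. Whether MVAVG uses exactly this alignment is an assumption, and `moving_average` and `sample` are hypothetical names introduced for the example.

```python
def moving_average(points, window=10):
    """Trailing moving average over (timestamp, value) pairs.

    Emits nothing until `window` points are available, then one value
    per input point, stamped with the time of the newest point.
    """
    values = [v for _, v in points]
    result = []
    for i in range(window - 1, len(points)):
        chunk = values[i - window + 1 : i + 1]
        result.append((points[i][0], sum(chunk) / window))
    return result


# Small made-up series (timestamps in milliseconds), not taken from the document.
sample = [(100, 0.0), (200, 0.0), (300, 1.0), (400, 2.0), (500, 3.0)]
result = moving_average(sample, window=3)

for timestamp, value in result:
    print(timestamp, value)
```

With a window of 3, a series of 5 points yields 3 output points, because the first two positions do not yet have a full window.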
- -**输出序列**:输出单个序列,类型为 DOUBLE。 - -#### 使用示例 - -##### 指定窗口长度 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select mvavg(s1, "window"="3") from root.test -``` - -输出序列: - -``` -+-----------------------------+---------------------------------+ -| Time|mvavg(root.test.s1, "window"="3")| -+-----------------------------+---------------------------------+ -|1970-01-01T08:00:00.300+08:00| 0.3333333333333333| -|1970-01-01T08:00:00.400+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.700+08:00| -0.3333333333333333| -|1970-01-01T08:00:00.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.100+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.200+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.300+08:00| -0.3333333333333333| -|1970-01-01T08:00:01.400+08:00| 0.0| -|1970-01-01T08:00:01.500+08:00| 0.3333333333333333| -|1970-01-01T08:00:01.600+08:00| 0.3333333333333333| 
-|1970-01-01T08:00:01.700+08:00| 3.0| -|1970-01-01T08:00:01.800+08:00| 0.6666666666666666| -|1970-01-01T08:00:01.900+08:00| -0.6666666666666666| -|1970-01-01T08:00:02.000+08:00| -3.3333333333333335| -+-----------------------------+---------------------------------+ -``` - -### PACF - -#### 注册语句 - -```sql -create function pacf as 'org.apache.iotdb.library.dprofile.UDTFPACF' -``` - -#### 函数简介 - -本函数通过求解 Yule-Walker 方程,计算序列的偏自相关系数。对于特殊的输入序列,方程可能没有解,此时输出`NaN`。 - -**函数名:** PACF - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `lag`:最大滞后阶数。默认值为$\min(10\log_{10}n,n-1)$,$n$表示数据点个数。 - -**输出序列**:输出单个序列,类型为 DOUBLE。 - -#### 使用示例 - -##### 指定滞后阶数 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1| -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 3| -|2020-01-01T00:00:04.000+08:00| NaN| -|2020-01-01T00:00:05.000+08:00| 5| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select pacf(s1, "lag"="5") from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------+ -| Time|pacf(root.test.d1.s1, "lag"="5")| -+-----------------------------+--------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| -0.5744680851063829| -|2020-01-01T00:00:03.000+08:00| 0.3172297297297296| -|2020-01-01T00:00:04.000+08:00| -0.2977686586304181| -|2020-01-01T00:00:05.000+08:00| -2.0609033521065867| -+-----------------------------+--------------------------------+ -``` - -### Percentile - -#### 注册语句 - -```sql -create function percentile as 'org.apache.iotdb.library.dprofile.UDAFPercentile' -``` - -#### 函数简介 - -本函数用于计算单列数值型数据的精确或近似分位数。 - -**函数名:** PERCENTILE - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `rank`:所求分位数在所有数据中的排名百分比,取值范围为 (0,1],默认值为 0.5。如当设为 0.5时则计算中位数。 -+ `error`:近似分位数的基于排名的误差百分比,取值范围为 
[0,1),默认值为0。如`rank`=0.5 且`error`=0.01,则计算出的分位数的真实排名百分比在 0.49~0.51之间。当`error`=0 时,计算结果为精确分位数。 - -**输出序列:** 输出单个序列,类型与输入序列相同。当`error`=0时,序列仅包含一个时间戳为分位数第一次出现的时间戳、值为分位数的数据点;否则,输出值的时间戳为0。 - -**提示:** 数据中的空值、缺失值和`NaN`将会被忽略。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| -+-----------------------------+------------+ -|2021-03-17T10:32:17.054+08:00| 0.5319929| -|2021-03-17T10:32:18.054+08:00| 0.9304316| -|2021-03-17T10:32:19.054+08:00| -1.4800133| -|2021-03-17T10:32:20.054+08:00| 0.6114087| -|2021-03-17T10:32:21.054+08:00| 2.5163336| -|2021-03-17T10:32:22.054+08:00| -1.0845392| -|2021-03-17T10:32:23.054+08:00| 1.0562582| -|2021-03-17T10:32:24.054+08:00| 1.3867859| -|2021-03-17T10:32:25.054+08:00| -0.45429882| -|2021-03-17T10:32:26.054+08:00| 1.0353678| -|2021-03-17T10:32:27.054+08:00| 0.7307929| -|2021-03-17T10:32:28.054+08:00| 2.3167255| -|2021-03-17T10:32:29.054+08:00| 2.342443| -|2021-03-17T10:32:30.054+08:00| 1.5809103| -|2021-03-17T10:32:31.054+08:00| 1.4829416| -|2021-03-17T10:32:32.054+08:00| 1.5800357| -|2021-03-17T10:32:33.054+08:00| 0.7124368| -|2021-03-17T10:32:34.054+08:00| -0.78597564| -|2021-03-17T10:32:35.054+08:00| 1.2058644| -|2021-03-17T10:32:36.054+08:00| 1.4215064| -|2021-03-17T10:32:37.054+08:00| 1.2808295| -|2021-03-17T10:32:38.054+08:00| -0.6173715| -|2021-03-17T10:32:39.054+08:00| 0.06644377| -|2021-03-17T10:32:40.054+08:00| 2.349338| -|2021-03-17T10:32:41.054+08:00| 1.7335888| -|2021-03-17T10:32:42.054+08:00| 1.5872132| -............ 
-Total line number = 10000 -``` - -用于查询的 SQL 语句: - -```sql -select percentile(s0, "rank"="0.2", "error"="0.01") from root.test -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|percentile(root.test.s0, "rank"="0.2", "error"="0.01")| -+-----------------------------+------------------------------------------------------+ -|2021-03-17T10:35:02.054+08:00| 0.1801469624042511| -+-----------------------------+------------------------------------------------------+ -``` -输入序列: - -``` -+-----------------------------+-------------+ -| Time|root.test2.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+-------------+ -............ 
-Total line number = 20
-```
-
-SQL for query:
-
-```sql
-select percentile(s1, "rank"="0.2", "error"="0.01") from root.test2
-```
-
-Output series:
-
-```
-+-----------------------------+-------------------------------------------------------+
-| Time|percentile(root.test2.s1, "rank"="0.2", "error"="0.01")|
-+-----------------------------+-------------------------------------------------------+
-|1970-01-01T08:00:00.000+08:00| -1.0|
-+-----------------------------+-------------------------------------------------------+
-```
-
-
-### Quantile
-
-#### Registration statement
-
-```sql
-create function quantile as 'org.apache.iotdb.library.dprofile.UDAFQuantile'
-```
-
-#### Usage
-
-This function computes the approximate quantile of a single column of numerical data. It is implemented with the KLL sketch algorithm.
-
-**Name:** QUANTILE
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `rank`: The rank ratio of the requested quantile among all data, in the range (0,1]. The default value is 0.5, which computes the approximate median.
-+ `K`: The size of the KLL sketch that may be maintained. The minimum value is 100 and the default value is 800. For example, with `rank`=0.5 and `K`=800, the true rank ratio of the computed quantile lies between 0.49 and 0.51 with a probability of at least 99%.
-
-**Output Series:** Outputs a single series with the same type as the input series. The timestamp of the output value is 0.
-
-**Note:** Null values, missing values, and `NaN` in the data are ignored.
-
-#### Examples
-
-
-Input series:
-
-```
-+-----------------------------+-------------+
-| Time|root.test1.s1|
-+-----------------------------+-------------+
-|2021-03-17T10:32:17.054+08:00| 7|
-|2021-03-17T10:32:18.054+08:00| 15|
-|2021-03-17T10:32:19.054+08:00| 36|
-|2021-03-17T10:32:20.054+08:00| 39|
-|2021-03-17T10:32:21.054+08:00| 40|
-|2021-03-17T10:32:22.054+08:00| 41|
-|2021-03-17T10:32:23.054+08:00| 20|
-|2021-03-17T10:32:24.054+08:00| 18|
-+-----------------------------+-------------+
-............
-Total line number = 8
-```
-
-SQL for query:
-
-```sql
-select quantile(s1, "rank"="0.2", "K"="800") from root.test1
-```
-
-Output series:
-
-```
-+-----------------------------+------------------------------------------------+
-| Time|quantile(root.test1.s1, "rank"="0.2", "K"="800")|
-+-----------------------------+------------------------------------------------+
-|1970-01-01T08:00:00.000+08:00| 7.000000000000001|
-+-----------------------------+------------------------------------------------+
-```
-
-### Period
-
-#### Registration statement
-
-```sql
-create function period as 'org.apache.iotdb.library.dprofile.UDAFPeriod'
-```
-
-#### Usage
-
-This function computes the period of a single column of numerical data.
-
-**Name:** PERIOD
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Output Series:** Outputs a single series of type INT32. The series contains exactly one data point, whose timestamp is 0 and whose value is the period.
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d3.s1|
-+-----------------------------+---------------+
-|1970-01-01T08:00:00.001+08:00| 1.0|
-|1970-01-01T08:00:00.002+08:00| 2.0|
-|1970-01-01T08:00:00.003+08:00| 3.0|
-|1970-01-01T08:00:00.004+08:00| 1.0|
-|1970-01-01T08:00:00.005+08:00| 2.0|
-|1970-01-01T08:00:00.006+08:00| 3.0|
-|1970-01-01T08:00:00.007+08:00| 1.0|
-|1970-01-01T08:00:00.008+08:00| 2.0|
-|1970-01-01T08:00:00.009+08:00| 3.0|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select period(s1) from root.test.d3
-```
-
-Output series:
-
-```
-+-----------------------------+-----------------------+
-| Time|period(root.test.d3.s1)|
-+-----------------------------+-----------------------+
-|1970-01-01T08:00:00.000+08:00| 3|
-+-----------------------------+-----------------------+
-```
-
-### QLB
-
-#### Registration statement
-
-```sql
-create function qlb as 'org.apache.iotdb.library.dprofile.UDTFQLB'
-```
-
-#### Usage
-
-This function computes the $Q_{LB}$ statistic of the input series and the corresponding p-value. The smaller the p-value, the more likely the series is non-stationary.
-
-**Name:** QLB
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `lag`: The maximum lag order used in the computation. It should be an integer between 1 and n-2, where n is the total number of samples in the series. The default is n-2.
-
-**Output Series:** 
输出单个序列,类型为 DOUBLE。该序列是$Q_{LB} $统计量对应的 p 值,时间标签代表偏移阶数。 - -**提示:** $Q_{LB} $统计量由自相关系数求得,如需得到统计量而非 p 值,可以使用 ACF 函数。 - -#### 使用示例 - -##### 使用默认参数 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T00:00:00.100+08:00| 1.22| -|1970-01-01T00:00:00.200+08:00| -2.78| -|1970-01-01T00:00:00.300+08:00| 1.53| -|1970-01-01T00:00:00.400+08:00| 0.70| -|1970-01-01T00:00:00.500+08:00| 0.75| -|1970-01-01T00:00:00.600+08:00| -0.72| -|1970-01-01T00:00:00.700+08:00| -0.22| -|1970-01-01T00:00:00.800+08:00| 0.28| -|1970-01-01T00:00:00.900+08:00| 0.57| -|1970-01-01T00:00:01.000+08:00| -0.22| -|1970-01-01T00:00:01.100+08:00| -0.72| -|1970-01-01T00:00:01.200+08:00| 1.34| -|1970-01-01T00:00:01.300+08:00| -0.25| -|1970-01-01T00:00:01.400+08:00| 0.17| -|1970-01-01T00:00:01.500+08:00| 2.51| -|1970-01-01T00:00:01.600+08:00| 1.42| -|1970-01-01T00:00:01.700+08:00| -1.34| -|1970-01-01T00:00:01.800+08:00| -0.01| -|1970-01-01T00:00:01.900+08:00| -0.49| -|1970-01-01T00:00:02.000+08:00| 1.63| -+-----------------------------+---------------+ -``` - - -用于查询的 SQL 语句: - -```sql -select QLB(s1) from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+---------------------+ -| Time| QLB(root.test.d1.s1)| -+-----------------------------+---------------------+ -|1970-01-01T08:00:00.021+08:00| -0.31671| -|1970-01-01T08:00:00.001+08:00| 0.12748561639660716| -|1970-01-01T08:00:00.022+08:00| -0.17051499999999997| -|1970-01-01T08:00:00.002+08:00| 0.21941409592365868| -|1970-01-01T08:00:00.023+08:00| -0.11341499999999997| -|1970-01-01T08:00:00.003+08:00| 0.3384920824593398| -|1970-01-01T08:00:00.024+08:00| 0.26146| -|1970-01-01T08:00:00.004+08:00| 0.26293189359893154| -|1970-01-01T08:00:00.025+08:00| 0.06431999999999996| -|1970-01-01T08:00:00.005+08:00| 0.37265953802871943| -|1970-01-01T08:00:00.026+08:00| 0.036919999999999994| -|1970-01-01T08:00:00.006+08:00| 0.4923218142923832| 
-|1970-01-01T08:00:00.027+08:00|-0.009294999999999993| -|1970-01-01T08:00:00.007+08:00| 0.609628728420623| -|1970-01-01T08:00:00.028+08:00| 0.12271499999999999| -|1970-01-01T08:00:00.008+08:00| 0.6510708392264906| -|1970-01-01T08:00:00.029+08:00| 0.008480000000000033| -|1970-01-01T08:00:00.009+08:00| 0.7430561964288097| -|1970-01-01T08:00:00.030+08:00| -0.21764500000000003| -|1970-01-01T08:00:00.010+08:00| 0.6236738200492055| -|1970-01-01T08:00:00.031+08:00| 0.35853999999999997| -|1970-01-01T08:00:00.011+08:00| 0.21487390993160937| -|1970-01-01T08:00:00.032+08:00| 0.18115499999999998| -|1970-01-01T08:00:00.012+08:00| 0.18479562182870324| -|1970-01-01T08:00:00.033+08:00| -0.27745499999999995| -|1970-01-01T08:00:00.013+08:00| 0.07329862193377235| -|1970-01-01T08:00:00.034+08:00| -0.22418500000000002| -|1970-01-01T08:00:00.014+08:00| 0.038000864459751926| -|1970-01-01T08:00:00.035+08:00| 0.31609000000000004| -|1970-01-01T08:00:00.015+08:00| 0.004052989734200874| -|1970-01-01T08:00:00.036+08:00| -0.06078500000000001| -|1970-01-01T08:00:00.016+08:00| 0.005663787468609627| -|1970-01-01T08:00:00.037+08:00| 0.19219499999999998| -|1970-01-01T08:00:00.017+08:00|0.0016316380755082571| -|1970-01-01T08:00:00.038+08:00| -0.25646| -|1970-01-01T08:00:00.018+08:00|2.0047954405910673E-5| -+-----------------------------+---------------------+ -``` - -### Resample - -#### 注册语句 - -```sql -create function re_sample as 'org.apache.iotdb.library.dprofile.UDTFResample' -``` - -#### 函数简介 - -本函数对输入序列按照指定的频率进行重采样,包括上采样和下采样。目前,本函数支持的上采样方法包括`NaN`填充法 (NaN)、前值填充法 (FFill)、后值填充法 (BFill) 以及线性插值法 (Linear);本函数支持的下采样方法为分组聚合,聚合方法包括最大值 (Max)、最小值 (Min)、首值 (First)、末值 (Last)、平均值 (Mean)和中位数 (Median)。 - -**函数名:** RESAMPLE - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `every`:重采样频率,是一个有单位的正数。目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。该参数不允许缺省。 -+ `interp`:上采样的插值方法,取值为 'NaN'、'FFill'、'BFill' 或 'Linear'。在缺省情况下,使用`NaN`填充法。 -+ `aggr`:下采样的聚合方法,取值为 
'Max'、'Min'、'First'、'Last'、'Mean' 或 'Median'。在缺省情况下,使用平均数聚合。 -+ `start`:重采样的起始时间(包含),是一个格式为 'yyyy-MM-dd HH:mm:ss' 的时间字符串。在缺省情况下,使用第一个有效数据点的时间戳。 -+ `end`:重采样的结束时间(不包含),是一个格式为 'yyyy-MM-dd HH:mm:ss' 的时间字符串。在缺省情况下,使用最后一个有效数据点的时间戳。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。该序列按照重采样频率严格等间隔分布。 - -**提示:** 数据中的`NaN`将会被忽略。 - -#### 使用示例 - -##### 上采样 - -当重采样频率高于数据原始频率时,将会进行上采样。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2021-03-06T16:00:00.000+08:00| 3.09| -|2021-03-06T16:15:00.000+08:00| 3.53| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:45:00.000+08:00| 3.51| -|2021-03-06T17:00:00.000+08:00| 3.41| -+-----------------------------+---------------+ -``` - - -用于查询的 SQL 语句: - -```sql -select resample(s1,'every'='5m','interp'='linear') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+----------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="5m", "interp"="linear")| -+-----------------------------+----------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:05:00.000+08:00| 3.2366665999094644| -|2021-03-06T16:10:00.000+08:00| 3.3833332856496177| -|2021-03-06T16:15:00.000+08:00| 3.5299999713897705| -|2021-03-06T16:20:00.000+08:00| 3.5199999809265137| -|2021-03-06T16:25:00.000+08:00| 3.509999990463257| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:35:00.000+08:00| 3.503333330154419| -|2021-03-06T16:40:00.000+08:00| 3.506666660308838| -|2021-03-06T16:45:00.000+08:00| 3.509999990463257| -|2021-03-06T16:50:00.000+08:00| 3.4766666889190674| -|2021-03-06T16:55:00.000+08:00| 3.443333387374878| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+----------------------------------------------------------+ -``` - -##### 下采样 - -当重采样频率低于数据原始频率时,将会进行下采样。 - -输入序列同上,用于查询的 SQL 语句如下: - -```sql -select 
resample(s1,'every'='30m','aggr'='first') from root.test.d1
-```
-
-Output series:
-
-```
-+-----------------------------+--------------------------------------------------------+
-| Time|resample(root.test.d1.s1, "every"="30m", "aggr"="first")|
-+-----------------------------+--------------------------------------------------------+
-|2021-03-06T16:00:00.000+08:00| 3.0899999141693115|
-|2021-03-06T16:30:00.000+08:00| 3.5|
-|2021-03-06T17:00:00.000+08:00| 3.4100000858306885|
-+-----------------------------+--------------------------------------------------------+
-```
-
-
-##### Specify the resampling time range
-
-The `start` and `end` parameters specify the time range of resampling. The part that falls outside the actual time range of the data is filled by interpolation.
-
-The input series is the same as above. The SQL for query is:
-
-```sql
-select resample(s1,'every'='30m','start'='2021-03-06 15:00:00') from root.test.d1
-```
-
-Output series:
-
-```
-+-----------------------------+-----------------------------------------------------------------------+
-| Time|resample(root.test.d1.s1, "every"="30m", "start"="2021-03-06 15:00:00")|
-+-----------------------------+-----------------------------------------------------------------------+
-|2021-03-06T15:00:00.000+08:00| NaN|
-|2021-03-06T15:30:00.000+08:00| NaN|
-|2021-03-06T16:00:00.000+08:00| 3.309999942779541|
-|2021-03-06T16:30:00.000+08:00| 3.5049999952316284|
-|2021-03-06T17:00:00.000+08:00| 3.4100000858306885|
-+-----------------------------+-----------------------------------------------------------------------+
-```
-
-### Sample
-
-#### Registration statement
-
-```sql
-create function sample as 'org.apache.iotdb.library.dprofile.UDTFSample'
-```
-
-#### Usage
-
-This function samples the input series, that is, it selects a specified number of data points from the input series and outputs them. Three sampling methods are currently supported. **Reservoir sampling** samples the data randomly, and every data point has the same probability of being selected. **Isometric sampling** samples the data at equal index intervals. **Triangle sampling** assigns all data to buckets according to the sampling rate, computes the triangle areas between data points within each bucket, and keeps the point with the largest area. This algorithm is commonly used for data visualization, because the sampling process can preserve key points where the data changes abruptly. For more details on the sampling algorithm, see the thesis [here](http://skemman.is/stream/get/1946/15343/37285/3/SS_MSthesis.pdf).
-
-**Name:** SAMPLE
-
-**Input Series:** Only a single input series is supported. Any type is allowed.
-
-**Parameters:**
-
-+ `method`: The sampling method, one of 
'reservoir','isometric' 或 'triangle' 。在缺省情况下,采用蓄水池采样法。 -+ `k`:采样数,它是一个正整数,在缺省情况下为 1。 - -**输出序列:** 输出单个序列,类型与输入序列相同。该序列的长度为采样数,序列中的每一个数据点都来自于输入序列。 - -**提示:** 如果采样数大于序列长度,那么输入序列中所有的数据点都会被输出。 - -#### 使用示例 - - -##### 蓄水池采样 - -当`method`参数为 'reservoir' 或缺省时,采用蓄水池采样法对输入序列进行采样。由于该采样方法具有随机性,下面展示的输出序列只是一种可能的结果。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| 2.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:04.000+08:00| 4.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -|2020-01-01T00:00:10.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - - -用于查询的 SQL 语句: - -```sql -select sample(s1,'method'='reservoir','k'='5') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="reservoir", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - - -##### 等距采样 - -当`method`参数为 'isometric' 时,采用等距采样法对输入序列进行采样。 - -输入序列同上,用于查询的 SQL 语句如下: - -```sql -select sample(s1,'method'='isometric','k'='5') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="isometric", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| 
-|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - -### Segment - -#### 注册语句 - -```sql -create function segment as 'org.apache.iotdb.library.dprofile.UDTFSegment' -``` - -#### 函数简介 - -本函数按照数据的线性变化趋势将数据划分为多个子序列,返回分段直线拟合后的子序列首值或所有拟合值。 - -**函数名:** SEGMENT - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `output`:"all" 输出所有拟合值;"first" 输出子序列起点拟合值。默认为 "first"。 - -+ `error`:判定存在线性趋势的误差允许阈值。误差的定义为子序列进行线性拟合的误差的绝对值的均值。默认为 0.1. - -**输出序列:** 输出单个序列,类型为 DOUBLE。 - -**提示:** 函数默认所有数据等时间间隔分布。函数读取所有数据,若原始数据过多,请先进行降采样处理。拟合采用自底向上方法,子序列的尾值可能会被认作子序列首值输出。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:00.300+08:00| 2.0| -|1970-01-01T08:00:00.400+08:00| 3.0| -|1970-01-01T08:00:00.500+08:00| 4.0| -|1970-01-01T08:00:00.600+08:00| 5.0| -|1970-01-01T08:00:00.700+08:00| 6.0| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 8.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:01.100+08:00| 9.1| -|1970-01-01T08:00:01.200+08:00| 9.2| -|1970-01-01T08:00:01.300+08:00| 9.3| -|1970-01-01T08:00:01.400+08:00| 9.4| -|1970-01-01T08:00:01.500+08:00| 9.5| -|1970-01-01T08:00:01.600+08:00| 9.6| -|1970-01-01T08:00:01.700+08:00| 9.7| -|1970-01-01T08:00:01.800+08:00| 9.8| -|1970-01-01T08:00:01.900+08:00| 9.9| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:02.100+08:00| 8.0| -|1970-01-01T08:00:02.200+08:00| 6.0| -|1970-01-01T08:00:02.300+08:00| 4.0| -|1970-01-01T08:00:02.400+08:00| 2.0| -|1970-01-01T08:00:02.500+08:00| 0.0| -|1970-01-01T08:00:02.600+08:00| -2.0| -|1970-01-01T08:00:02.700+08:00| -4.0| 
-|1970-01-01T08:00:02.800+08:00| -6.0| -|1970-01-01T08:00:02.900+08:00| -8.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.100+08:00| 10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -|1970-01-01T08:00:03.300+08:00| 10.0| -|1970-01-01T08:00:03.400+08:00| 10.0| -|1970-01-01T08:00:03.500+08:00| 10.0| -|1970-01-01T08:00:03.600+08:00| 10.0| -|1970-01-01T08:00:03.700+08:00| 10.0| -|1970-01-01T08:00:03.800+08:00| 10.0| -|1970-01-01T08:00:03.900+08:00| 10.0| -+-----------------------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select segment(s1,"error"="0.1") from root.test -``` - -输出序列: - -``` -+-----------------------------+------------------------------------+ -| Time|segment(root.test.s1, "error"="0.1")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -+-----------------------------+------------------------------------+ -``` - -### Skew - -#### 注册语句 - -```sql -create function skew as 'org.apache.iotdb.library.dprofile.UDAFSkew' -``` - -#### 函数简介 - -本函数用于计算单列数值型数据的总体偏度 - -**函数名:** SKEW - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**输出序列:** 输出单个序列,类型为 DOUBLE,序列仅包含一个时间戳为 0、值为总体偏度的数据点。 - -**提示:** 数据中的空值、缺失值和`NaN`将会被忽略。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 10.0| 
-|2020-01-01T00:00:11.000+08:00| 10.0| -|2020-01-01T00:00:12.000+08:00| 10.0| -|2020-01-01T00:00:13.000+08:00| 10.0| -|2020-01-01T00:00:14.000+08:00| 10.0| -|2020-01-01T00:00:15.000+08:00| 10.0| -|2020-01-01T00:00:16.000+08:00| 10.0| -|2020-01-01T00:00:17.000+08:00| 10.0| -|2020-01-01T00:00:18.000+08:00| 10.0| -|2020-01-01T00:00:19.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select skew(s1) from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+-----------------------+ -| Time| skew(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| -0.9998427402292644| -+-----------------------------+-----------------------+ -``` - -### Spline - -#### 注册语句 - -```sql -create function spline as 'org.apache.iotdb.library.dprofile.UDTFSpline' -``` - -#### 函数简介 - -本函数提供对原始序列进行三次样条曲线拟合后的插值重采样。 - -**函数名:** SPLINE - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `points`:重采样个数。 - -**输出序列**:输出单个序列,类型为 DOUBLE。 - -**提示**:输出序列保留输入序列的首尾值,等时间间隔采样。仅当输入点个数不少于 4 个时才计算插值。 - -#### 使用示例 - -##### 指定插值个数 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select spline(s1, "points"="151") from root.test -``` - -输出序列: - -``` -+-----------------------------+------------------------------------+ -| Time|spline(root.test.s1, "points"="151")| -+-----------------------------+------------------------------------+ 
-|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.010+08:00| 0.04870000251134237| -|1970-01-01T08:00:00.020+08:00| 0.09680000495910646| -|1970-01-01T08:00:00.030+08:00| 0.14430000734329226| -|1970-01-01T08:00:00.040+08:00| 0.19120000966389972| -|1970-01-01T08:00:00.050+08:00| 0.23750001192092896| -|1970-01-01T08:00:00.060+08:00| 0.2832000141143799| -|1970-01-01T08:00:00.070+08:00| 0.32830001624425253| -|1970-01-01T08:00:00.080+08:00| 0.3728000183105469| -|1970-01-01T08:00:00.090+08:00| 0.416700020313263| -|1970-01-01T08:00:00.100+08:00| 0.4600000222524008| -|1970-01-01T08:00:00.110+08:00| 0.5027000241279602| -|1970-01-01T08:00:00.120+08:00| 0.5448000259399414| -|1970-01-01T08:00:00.130+08:00| 0.5863000276883443| -|1970-01-01T08:00:00.140+08:00| 0.627200029373169| -|1970-01-01T08:00:00.150+08:00| 0.6675000309944153| -|1970-01-01T08:00:00.160+08:00| 0.7072000325520833| -|1970-01-01T08:00:00.170+08:00| 0.7463000340461731| -|1970-01-01T08:00:00.180+08:00| 0.7848000354766846| -|1970-01-01T08:00:00.190+08:00| 0.8227000368436178| -|1970-01-01T08:00:00.200+08:00| 0.8600000381469728| -|1970-01-01T08:00:00.210+08:00| 0.8967000393867494| -|1970-01-01T08:00:00.220+08:00| 0.9328000405629477| -|1970-01-01T08:00:00.230+08:00| 0.9683000416755676| -|1970-01-01T08:00:00.240+08:00| 1.0032000427246095| -|1970-01-01T08:00:00.250+08:00| 1.037500043710073| -|1970-01-01T08:00:00.260+08:00| 1.071200044631958| -|1970-01-01T08:00:00.270+08:00| 1.1043000454902647| -|1970-01-01T08:00:00.280+08:00| 1.1368000462849934| -|1970-01-01T08:00:00.290+08:00| 1.1687000470161437| -|1970-01-01T08:00:00.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:00.310+08:00| 1.2307000483103594| -|1970-01-01T08:00:00.320+08:00| 1.2608000489139557| -|1970-01-01T08:00:00.330+08:00| 1.2903000494873524| -|1970-01-01T08:00:00.340+08:00| 1.3192000500233967| -|1970-01-01T08:00:00.350+08:00| 1.3475000505149364| -|1970-01-01T08:00:00.360+08:00| 1.3752000509548186| -|1970-01-01T08:00:00.370+08:00| 1.402300051335891| 
-|1970-01-01T08:00:00.380+08:00| 1.4288000516510009| -|1970-01-01T08:00:00.390+08:00| 1.4547000518929958| -|1970-01-01T08:00:00.400+08:00| 1.480000052054723| -|1970-01-01T08:00:00.410+08:00| 1.5047000521290301| -|1970-01-01T08:00:00.420+08:00| 1.5288000521087646| -|1970-01-01T08:00:00.430+08:00| 1.5523000519867738| -|1970-01-01T08:00:00.440+08:00| 1.575200051755905| -|1970-01-01T08:00:00.450+08:00| 1.597500051409006| -|1970-01-01T08:00:00.460+08:00| 1.619200050938924| -|1970-01-01T08:00:00.470+08:00| 1.6403000503385066| -|1970-01-01T08:00:00.480+08:00| 1.660800049600601| -|1970-01-01T08:00:00.490+08:00| 1.680700048718055| -|1970-01-01T08:00:00.500+08:00| 1.7000000476837158| -|1970-01-01T08:00:00.510+08:00| 1.7188475466453037| -|1970-01-01T08:00:00.520+08:00| 1.7373800457262996| -|1970-01-01T08:00:00.530+08:00| 1.7555825448831923| -|1970-01-01T08:00:00.540+08:00| 1.7734400440724702| -|1970-01-01T08:00:00.550+08:00| 1.790937543250622| -|1970-01-01T08:00:00.560+08:00| 1.8080600423741364| -|1970-01-01T08:00:00.570+08:00| 1.8247925413995016| -|1970-01-01T08:00:00.580+08:00| 1.8411200402832066| -|1970-01-01T08:00:00.590+08:00| 1.8570275389817397| -|1970-01-01T08:00:00.600+08:00| 1.8725000374515897| -|1970-01-01T08:00:00.610+08:00| 1.8875225356492449| -|1970-01-01T08:00:00.620+08:00| 1.902080033531194| -|1970-01-01T08:00:00.630+08:00| 1.9161575310539258| -|1970-01-01T08:00:00.640+08:00| 1.9297400281739288| -|1970-01-01T08:00:00.650+08:00| 1.9428125248476913| -|1970-01-01T08:00:00.660+08:00| 1.9553600210317021| -|1970-01-01T08:00:00.670+08:00| 1.96736751668245| -|1970-01-01T08:00:00.680+08:00| 1.9788200117564232| -|1970-01-01T08:00:00.690+08:00| 1.9897025062101101| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.710+08:00| 2.0097024933913334| -|1970-01-01T08:00:00.720+08:00| 2.0188199867081615| -|1970-01-01T08:00:00.730+08:00| 2.027367479995188| -|1970-01-01T08:00:00.740+08:00| 2.0353599732971155| -|1970-01-01T08:00:00.750+08:00| 2.0428124666586482| 
-|1970-01-01T08:00:00.760+08:00| 2.049739960124489| -|1970-01-01T08:00:00.770+08:00| 2.056157453739342| -|1970-01-01T08:00:00.780+08:00| 2.06207994754791| -|1970-01-01T08:00:00.790+08:00| 2.067522441594897| -|1970-01-01T08:00:00.800+08:00| 2.072499935925006| -|1970-01-01T08:00:00.810+08:00| 2.07702743058294| -|1970-01-01T08:00:00.820+08:00| 2.081119925613404| -|1970-01-01T08:00:00.830+08:00| 2.0847924210611| -|1970-01-01T08:00:00.840+08:00| 2.0880599169707317| -|1970-01-01T08:00:00.850+08:00| 2.0909374133870027| -|1970-01-01T08:00:00.860+08:00| 2.0934399103546166| -|1970-01-01T08:00:00.870+08:00| 2.0955824079182768| -|1970-01-01T08:00:00.880+08:00| 2.0973799061226863| -|1970-01-01T08:00:00.890+08:00| 2.098847405012549| -|1970-01-01T08:00:00.900+08:00| 2.0999999046325684| -|1970-01-01T08:00:00.910+08:00| 2.1005574051201332| -|1970-01-01T08:00:00.920+08:00| 2.1002599065303778| -|1970-01-01T08:00:00.930+08:00| 2.0991524087846245| -|1970-01-01T08:00:00.940+08:00| 2.0972799118041947| -|1970-01-01T08:00:00.950+08:00| 2.0946874155104105| -|1970-01-01T08:00:00.960+08:00| 2.0914199198245944| -|1970-01-01T08:00:00.970+08:00| 2.0875224246680673| -|1970-01-01T08:00:00.980+08:00| 2.083039929962151| -|1970-01-01T08:00:00.990+08:00| 2.0780174356281687| -|1970-01-01T08:00:01.000+08:00| 2.0724999415874406| -|1970-01-01T08:00:01.010+08:00| 2.06653244776129| -|1970-01-01T08:00:01.020+08:00| 2.060159954071038| -|1970-01-01T08:00:01.030+08:00| 2.053427460438006| -|1970-01-01T08:00:01.040+08:00| 2.046379966783517| -|1970-01-01T08:00:01.050+08:00| 2.0390624730288924| -|1970-01-01T08:00:01.060+08:00| 2.031519979095454| -|1970-01-01T08:00:01.070+08:00| 2.0237974849045237| -|1970-01-01T08:00:01.080+08:00| 2.015939990377423| -|1970-01-01T08:00:01.090+08:00| 2.0079924954354746| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.110+08:00| 1.9907018211101906| -|1970-01-01T08:00:01.120+08:00| 1.9788509124245144| -|1970-01-01T08:00:01.130+08:00| 1.9645127287932083| 
-|1970-01-01T08:00:01.140+08:00| 1.9477527250665083| -|1970-01-01T08:00:01.150+08:00| 1.9286363560946513| -|1970-01-01T08:00:01.160+08:00| 1.9072290767278735| -|1970-01-01T08:00:01.170+08:00| 1.8835963418164114| -|1970-01-01T08:00:01.180+08:00| 1.8578036062105014| -|1970-01-01T08:00:01.190+08:00| 1.8299163247603802| -|1970-01-01T08:00:01.200+08:00| 1.7999999523162842| -|1970-01-01T08:00:01.210+08:00| 1.7623635841923329| -|1970-01-01T08:00:01.220+08:00| 1.7129696477516976| -|1970-01-01T08:00:01.230+08:00| 1.6543635959181928| -|1970-01-01T08:00:01.240+08:00| 1.5890908816156328| -|1970-01-01T08:00:01.250+08:00| 1.5196969577678319| -|1970-01-01T08:00:01.260+08:00| 1.4487272772986044| -|1970-01-01T08:00:01.270+08:00| 1.3787272931317647| -|1970-01-01T08:00:01.280+08:00| 1.3122424581911272| -|1970-01-01T08:00:01.290+08:00| 1.251818225400506| -|1970-01-01T08:00:01.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:01.310+08:00| 1.1548000470995912| -|1970-01-01T08:00:01.320+08:00| 1.1130667107899999| -|1970-01-01T08:00:01.330+08:00| 1.0756000393033045| -|1970-01-01T08:00:01.340+08:00| 1.043200033187868| -|1970-01-01T08:00:01.350+08:00| 1.016666692992053| -|1970-01-01T08:00:01.360+08:00| 0.9968000192642223| -|1970-01-01T08:00:01.370+08:00| 0.9844000125527389| -|1970-01-01T08:00:01.380+08:00| 0.9802666734059655| -|1970-01-01T08:00:01.390+08:00| 0.9852000023722649| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.410+08:00| 1.023999999165535| -|1970-01-01T08:00:01.420+08:00| 1.0559999990463256| -|1970-01-01T08:00:01.430+08:00| 1.0959999996423722| -|1970-01-01T08:00:01.440+08:00| 1.1440000009536744| -|1970-01-01T08:00:01.450+08:00| 1.2000000029802322| -|1970-01-01T08:00:01.460+08:00| 1.264000005722046| -|1970-01-01T08:00:01.470+08:00| 1.3360000091791153| -|1970-01-01T08:00:01.480+08:00| 1.4160000133514405| -|1970-01-01T08:00:01.490+08:00| 1.5040000182390214| -|1970-01-01T08:00:01.500+08:00| 1.600000023841858| 
-+-----------------------------+------------------------------------+ -``` - -### Spread - -#### Registration statement - -```sql -create function spread as 'org.apache.iotdb.library.dprofile.UDAFSpread' -``` - -#### Overview - -This function computes the spread of a time series, that is, the maximum value minus the minimum value. - -**Name:** SPREAD - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** A single series of the same type as the input, containing one data point whose timestamp is 0 and whose value is the spread. - -**Note:** Null values, missing values and `NaN` in the data are ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select spread(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time|spread(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 26.0| -+-----------------------------+-----------------------+ -``` - - - -### ZScore - -#### Registration statement - -```sql -create function zscore as 'org.apache.iotdb.library.dprofile.UDTFZScore' -``` - -#### Overview - -This function normalizes the input series using the z-score method. - -**Name:** ZSCORE - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `compute`: if set to "batch", the transformation is performed after all data has been read; if set to "stream", the user must supply the mean (`avg`) and the standard deviation (`sd`) for streaming computation. The default is "batch". -+ `avg`: the mean used in streaming computation. -+ `sd`: the standard deviation used in streaming computation. - -**Output Series:** A single series of type DOUBLE. - -#### Examples - 
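The batch and stream modes can be cross-checked outside IoTDB. The following is a minimal Python sketch, not the library's implementation: the helper names `zscore_batch` and `zscore_stream` are hypothetical, and dividing by n (population standard deviation) is an assumption, chosen because it agrees with the sample output of the full-data example in this section to within floating-point tolerance.

```python
import math


def zscore_batch(values):
    """Batch mode sketch: derive mean and population standard deviation from the data."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: population standard deviation (divide by n, not n - 1).
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def zscore_stream(values, avg, sd):
    """Stream mode sketch: the caller supplies the mean (avg) and standard deviation (sd)."""
    return [(v - avg) / sd for v in values]


# The 20 input values of root.test.s1 from the full-data example.
s1 = [0.0, 0.0, 1.0, -1.0, 0.0, 0.0, -2.0, 2.0, 0.0, 0.0,
      1.0, -1.0, -1.0, 1.0, 0.0, 0.0, 10.0, 2.0, -2.0, 0.0]

normalized = zscore_batch(s1)
# Supplying the same statistics explicitly gives the same result in stream mode.
streamed = zscore_stream(s1, avg=0.5, sd=math.sqrt(5.85))
```

Stream mode avoids reading the whole series before producing output, at the cost of requiring statistics that were computed or estimated beforehand.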
-##### 全数据计算 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select zscore(s1) from root.test -``` - -输出序列: - -``` -+-----------------------------+--------------------+ -| Time|zscore(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.200+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.300+08:00| 0.20672455764868078| -|1970-01-01T08:00:00.400+08:00| -0.6201736729460423| -|1970-01-01T08:00:00.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.700+08:00| -1.033622788243404| -|1970-01-01T08:00:00.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:00.900+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.000+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.100+08:00| 0.20672455764868078| -|1970-01-01T08:00:01.200+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.300+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.400+08:00| 0.20672455764868078| -|1970-01-01T08:00:01.500+08:00|-0.20672455764868078| 
-|1970-01-01T08:00:01.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.700+08:00| 3.9277665953249348| -|1970-01-01T08:00:01.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:01.900+08:00| -1.033622788243404| -|1970-01-01T08:00:02.000+08:00|-0.20672455764868078| -+-----------------------------+--------------------+ -``` - - - -## 异常检测 - -### IQR - -#### 注册语句 - -```sql -create function iqr as 'org.apache.iotdb.library.anomaly.UDTFIQR' -``` - -#### 函数简介 - -本函数用于检验超出上下四分位数1.5倍IQR的数据分布异常。 - -**函数名:** IQR - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `method`:若设置为 "batch",则将数据全部读入后检测;若设置为 "stream",则需用户提供上下四分位数进行流式检测。默认为 "batch"。 -+ `q1`:使用流式计算时的下四分位数。 -+ `q3`:使用流式计算时的上四分位数。 - -**输出序列**:输出单个序列,类型为 DOUBLE。 - -**说明**:$IQR=Q_3-Q_1$ - -#### 使用示例 - -##### 全数据计算 - -输入序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select iqr(s1) from root.test -``` - -输出序列: - -``` -+-----------------------------+-----------------+ -| Time|iqr(root.test.s1)| -+-----------------------------+-----------------+ -|1970-01-01T08:00:01.700+08:00| 10.0| 
-+-----------------------------+-----------------+ -``` - -### KSigma - -#### Registration statement - -```sql -create function ksigma as 'org.apache.iotdb.library.anomaly.UDTFKSigma' -``` - -#### Overview - -This function detects anomalies using the dynamic K-Sigma algorithm. Within a window, data points whose deviation from the mean exceeds k times the standard deviation are regarded as anomalies and are output. - -**Name:** KSIGMA - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `k`: the threshold, as a multiple of the standard deviation, for a distribution anomaly in the dynamic K-Sigma algorithm. The default is 3. -+ `window`: the sliding window size of the dynamic K-Sigma algorithm. The default is 10000. - - -**Output Series:** A single series of the same type as the input series. - -**Note:** k should be greater than 0; otherwise nothing is output. - -#### Examples - -##### Specify k - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:04.000+08:00| 100.0| -|2020-01-01T00:00:06.000+08:00| 150.0| -|2020-01-01T00:00:08.000+08:00| 200.0| -|2020-01-01T00:00:10.000+08:00| 200.0| -|2020-01-01T00:00:14.000+08:00| 200.0| -|2020-01-01T00:00:15.000+08:00| 200.0| -|2020-01-01T00:00:16.000+08:00| 200.0| -|2020-01-01T00:00:18.000+08:00| 200.0| -|2020-01-01T00:00:20.000+08:00| 150.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ksigma(s1,"k"="1.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -|Time |ksigma(root.test.d1.s1,"k"="1.0")| -+-----------------------------+---------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -+-----------------------------+---------------------------------+ -``` - -### LOF - -#### Registration statement - -```sql -create function LOF as 'org.apache.iotdb.library.anomaly.UDTFLOF' -``` - -#### Overview - 
-This function uses the Local Outlier Factor (LOF) method to find density anomalies in a series. Based on the given k-th distance number and the LOF threshold, it judges whether each input point is an outlier, that is, an anomaly, and outputs the LOF value of each point. - -**Name:** LOF - -**Input Series:** Multiple input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: the detection method. The default is default, which treats the input as high-dimensional data. When set to series, a one-dimensional time series is converted into high-dimensional data before computation. -+ `k`: the k-th distance used to compute the local outlier factor. The default is 3. -+ `window`: the length of the window of data read each time. The default is 10000. -+ `windowsize`: the number of dimensions of the converted high-dimensional data when the series method is used, that is, the size of a single window. The default is 5. - -**Output Series:** A single time series of type DOUBLE. - -**Note:** Incomplete data rows are ignored. They take no part in the computation and are not marked as outliers. - - -#### Examples - -##### Default parameters - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| 1.0| -|1970-01-01T08:00:00.300+08:00| 1.0| 1.0| -|1970-01-01T08:00:00.400+08:00| 1.0| 0.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -1.0| -|1970-01-01T08:00:00.600+08:00| -1.0| -1.0| -|1970-01-01T08:00:00.700+08:00| -1.0| 0.0| -|1970-01-01T08:00:00.800+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1,s2) from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|lof(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.100+08:00| 3.8274824267668244| -|1970-01-01T08:00:00.200+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.300+08:00| 2.838155437762879| -|1970-01-01T08:00:00.400+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.500+08:00| 2.73518261244453| -|1970-01-01T08:00:00.600+08:00| 2.371440975708148| -|1970-01-01T08:00:00.700+08:00| 2.73518261244453| -|1970-01-01T08:00:00.800+08:00| 1.7561416374270742| -+-----------------------------+-------------------------------------+ -``` - -##### Diagnosing a one-dimensional time series - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| 
-+-----------------------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 1.0| -|1970-01-01T08:00:00.200+08:00| 2.0| -|1970-01-01T08:00:00.300+08:00| 3.0| -|1970-01-01T08:00:00.400+08:00| 4.0| -|1970-01-01T08:00:00.500+08:00| 5.0| -|1970-01-01T08:00:00.600+08:00| 6.0| -|1970-01-01T08:00:00.700+08:00| 7.0| -|1970-01-01T08:00:00.800+08:00| 8.0| -|1970-01-01T08:00:00.900+08:00| 9.0| -|1970-01-01T08:00:01.000+08:00| 10.0| -|1970-01-01T08:00:01.100+08:00| 11.0| -|1970-01-01T08:00:01.200+08:00| 12.0| -|1970-01-01T08:00:01.300+08:00| 13.0| -|1970-01-01T08:00:01.400+08:00| 14.0| -|1970-01-01T08:00:01.500+08:00| 15.0| -|1970-01-01T08:00:01.600+08:00| 16.0| -|1970-01-01T08:00:01.700+08:00| 17.0| -|1970-01-01T08:00:01.800+08:00| 18.0| -|1970-01-01T08:00:01.900+08:00| 19.0| -|1970-01-01T08:00:02.000+08:00| 20.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select lof(s1, "method"="series") from root.test.d1 where time<1000 -``` - -输出序列: - -``` -+-----------------------------+--------------------+ -| Time|lof(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 3.77777777777778| -|1970-01-01T08:00:00.200+08:00| 4.32727272727273| -|1970-01-01T08:00:00.300+08:00| 4.85714285714286| -|1970-01-01T08:00:00.400+08:00| 5.40909090909091| -|1970-01-01T08:00:00.500+08:00| 5.94999999999999| -|1970-01-01T08:00:00.600+08:00| 6.43243243243243| -|1970-01-01T08:00:00.700+08:00| 6.79999999999999| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 7.0| -|1970-01-01T08:00:01.000+08:00| 6.79999999999999| -|1970-01-01T08:00:01.100+08:00| 6.43243243243243| -|1970-01-01T08:00:01.200+08:00| 5.94999999999999| -|1970-01-01T08:00:01.300+08:00| 5.40909090909091| -|1970-01-01T08:00:01.400+08:00| 4.85714285714286| -|1970-01-01T08:00:01.500+08:00| 4.32727272727273| -|1970-01-01T08:00:01.600+08:00| 3.77777777777778| -+-----------------------------+--------------------+ -``` - -### MissDetect - 
-#### 注册语句 - -```sql -create function missdetect as 'org.apache.iotdb.library.anomaly.UDTFMissDetect' -``` - -#### 函数简介 - -本函数用于检测数据中的缺失异常。在一些数据中,缺失数据会被线性插值填补,在数据中出现完美的线性片段,且这些片段往往长度较大。本函数通过在数据中发现这些完美线性片段来检测缺失异常。 - -**函数名:** MISSDETECT - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `minlen`:被标记为异常的完美线性片段的最小长度,是一个大于等于 10 的整数,默认值为 10。 - -**输出序列:** 输出单个序列,类型为 BOOLEAN,即该数据点是否为缺失异常。 - -**提示:** 数据中的`NaN`将会被忽略。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 0.0| -|2021-07-01T12:00:01.000+08:00| 1.0| -|2021-07-01T12:00:02.000+08:00| 0.0| -|2021-07-01T12:00:03.000+08:00| 1.0| -|2021-07-01T12:00:04.000+08:00| 0.0| -|2021-07-01T12:00:05.000+08:00| 0.0| -|2021-07-01T12:00:06.000+08:00| 0.0| -|2021-07-01T12:00:07.000+08:00| 0.0| -|2021-07-01T12:00:08.000+08:00| 0.0| -|2021-07-01T12:00:09.000+08:00| 0.0| -|2021-07-01T12:00:10.000+08:00| 0.0| -|2021-07-01T12:00:11.000+08:00| 0.0| -|2021-07-01T12:00:12.000+08:00| 0.0| -|2021-07-01T12:00:13.000+08:00| 0.0| -|2021-07-01T12:00:14.000+08:00| 0.0| -|2021-07-01T12:00:15.000+08:00| 0.0| -|2021-07-01T12:00:16.000+08:00| 1.0| -|2021-07-01T12:00:17.000+08:00| 0.0| -|2021-07-01T12:00:18.000+08:00| 1.0| -|2021-07-01T12:00:19.000+08:00| 0.0| -|2021-07-01T12:00:20.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select missdetect(s2,'minlen'='10') from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------+ -| Time|missdetect(root.test.d2.s2, "minlen"="10")| -+-----------------------------+------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| false| -|2021-07-01T12:00:01.000+08:00| false| -|2021-07-01T12:00:02.000+08:00| false| -|2021-07-01T12:00:03.000+08:00| false| -|2021-07-01T12:00:04.000+08:00| true| -|2021-07-01T12:00:05.000+08:00| true| 
-|2021-07-01T12:00:06.000+08:00| true| -|2021-07-01T12:00:07.000+08:00| true| -|2021-07-01T12:00:08.000+08:00| true| -|2021-07-01T12:00:09.000+08:00| true| -|2021-07-01T12:00:10.000+08:00| true| -|2021-07-01T12:00:11.000+08:00| true| -|2021-07-01T12:00:12.000+08:00| true| -|2021-07-01T12:00:13.000+08:00| true| -|2021-07-01T12:00:14.000+08:00| true| -|2021-07-01T12:00:15.000+08:00| true| -|2021-07-01T12:00:16.000+08:00| false| -|2021-07-01T12:00:17.000+08:00| false| -|2021-07-01T12:00:18.000+08:00| false| -|2021-07-01T12:00:19.000+08:00| false| -|2021-07-01T12:00:20.000+08:00| false| -+-----------------------------+------------------------------------------+ -``` - -### Range - -#### 注册语句 - -```sql -create function range as 'org.apache.iotdb.library.anomaly.UDTFRange' -``` - -#### 函数简介 - -本函数用于查找时间序列的范围异常。将根据提供的上界与下界,判断输入数据是否越界,即异常,并输出所有异常点为新的时间序列。 - -**函数名:** RANGE - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `lower_bound`:范围异常检测的下界。 -+ `upper_bound`:范围异常检测的上界。 - -**输出序列:** 输出单个序列,类型与输入序列相同。 - -**提示:** 应满足`upper_bound`大于`lower_bound`,否则将不做输出。 - - -#### 使用示例 - -##### 指定上界与下界 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select range(s1,"lower_bound"="101.0","upper_bound"="125.0") 
from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------------------+ -|Time |range(root.test.d1.s1,"lower_bound"="101.0","upper_bound"="125.0")| -+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -+-----------------------------+------------------------------------------------------------------+ -``` - -### TwoSidedFilter - -#### Registration statement - -```sql -create function twosidedfilter as 'org.apache.iotdb.library.anomaly.UDTFTwoSidedFilter' -``` - -#### Overview - -This function filters out anomalies in the input series based on two-sided window detection. - -**Name:** TWOSIDEDFILTER - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** A single series of the same type as the input, which is the input series with the anomalies removed. - -**Parameters:** - -- `len`: the window size used in two-sided window detection, a positive integer. The default is 5. For example, when `len`=3, the algorithm takes a window of length 3 both forward and backward and computes the anomaly degree within those windows. -- `threshold`: the threshold of the anomaly degree, in the range (0,1). The default is 0.3. The higher the threshold, the stricter the function's criterion for the anomaly degree. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:00:37.000+08:00| 1484.0| -|1970-01-01T08:00:38.000+08:00| 1055.0| -|1970-01-01T08:00:39.000+08:00| 1050.0| -|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 
978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select TwoSidedFilter(s0, 'len'='5', 'threshold'='0.3') from root.test -``` - -输出序列: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -### Outlier - -#### 注册语句 - -```sql -create function outlier as 'org.apache.iotdb.library.anomaly.UDTFOutlier' -``` - -#### 函数简介 - -本函数用于检测基于距离的异常点。在当前窗口中,如果一个点距离阈值范围内的邻居数量(包括它自己)少于密度阈值,则该点是异常点。 - -**函数名:** OUTLIER - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `r`:基于距离异常检测中的距离阈值。 -+ `k`:基于距离异常检测中的密度阈值。 -+ `w`:用于指定滑动窗口的大小。 -+ `s`:用于指定滑动窗口的步长。 - -**输出序列**:输出单个序列,类型与输入序列相同。 - -#### 使用示例 - -##### 指定查询参数 - -输入序列: - -``` 
-+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|2020-01-04T23:59:55.000+08:00| 56.0| -|2020-01-04T23:59:56.000+08:00| 55.1| -|2020-01-04T23:59:57.000+08:00| 54.2| -|2020-01-04T23:59:58.000+08:00| 56.3| -|2020-01-04T23:59:59.000+08:00| 59.0| -|2020-01-05T00:00:00.000+08:00| 60.0| -|2020-01-05T00:00:01.000+08:00| 60.5| -|2020-01-05T00:00:02.000+08:00| 64.5| -|2020-01-05T00:00:03.000+08:00| 69.0| -|2020-01-05T00:00:04.000+08:00| 64.2| -|2020-01-05T00:00:05.000+08:00| 62.3| -|2020-01-05T00:00:06.000+08:00| 58.0| -|2020-01-05T00:00:07.000+08:00| 58.9| -|2020-01-05T00:00:08.000+08:00| 52.0| -|2020-01-05T00:00:09.000+08:00| 62.3| -|2020-01-05T00:00:10.000+08:00| 61.0| -|2020-01-05T00:00:11.000+08:00| 64.2| -|2020-01-05T00:00:12.000+08:00| 61.8| -|2020-01-05T00:00:13.000+08:00| 64.0| -|2020-01-05T00:00:14.000+08:00| 63.0| -+-----------------------------+------------+ -``` - -用于查询的 SQL 语句: - -```sql -select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|outlier(root.test.s1,"r"="5.0","k"="4","w"="10","s"="5")| -+-----------------------------+--------------------------------------------------------+ -|2020-01-05T00:00:03.000+08:00| 69.0| -+-----------------------------+--------------------------------------------------------+ -|2020-01-05T00:00:08.000+08:00| 52.0| -+-----------------------------+--------------------------------------------------------+ -``` - -## 频域分析 - -### Conv - -#### 注册语句 - -```sql -create function conv as 'org.apache.iotdb.library.frequency.UDTFConv' -``` - -#### 函数简介 - -本函数对两个输入序列进行卷积,即多项式乘法。 - - -**函数名:** CONV - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE - -**输出序列:** 输出单个序列,类型为DOUBLE,它是两个序列卷积的结果。序列的时间戳从0开始,仅用于表示顺序。 - -**提示:** 输入序列中的`NaN`将被忽略。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| 
Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| 7.0| -|1970-01-01T08:00:00.001+08:00| 0.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| null| -+-----------------------------+---------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select conv(s1,s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------+ -| Time|conv(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 7.0| -|1970-01-01T08:00:00.001+08:00| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| -|1970-01-01T08:00:00.003+08:00| 2.0| -+-----------------------------+--------------------------------------+ -``` - -### Deconv - -#### 注册语句 - -```sql -create function deconv as 'org.apache.iotdb.library.frequency.UDTFDeconv' -``` - -#### 函数简介 - -本函数对两个输入序列进行去卷积,即多项式除法运算。 - -**函数名:** DECONV - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `result`:去卷积的结果,取值为'quotient'或'remainder',分别对应于去卷积的商和余数。在缺省情况下,输出去卷积的商。 - -**输出序列:** 输出单个序列,类型为DOUBLE。它是将第二个序列从第一个序列中去卷积(第一个序列除以第二个序列)的结果。序列的时间戳从0开始,仅用于表示顺序。 - -**提示:** 输入序列中的`NaN`将被忽略。 - -#### 使用示例 - -##### 计算去卷积的商 - -当`result`参数缺省或为'quotient'时,本函数计算去卷积的商。 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s3|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 8.0| 7.0| -|1970-01-01T08:00:00.001+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| null| -|1970-01-01T08:00:00.003+08:00| 2.0| null| -+-----------------------------+---------------+---------------+ -``` - - -用于查询的SQL语句: - -```sql -select deconv(s3,s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+----------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2)| 
-+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - -##### 计算去卷积的余数 - -当`result`参数为'remainder'时,本函数计算去卷积的余数。输入序列同上,用于查询的SQL语句如下: - -```sql -select deconv(s3,s2,'result'='remainder') from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2, "result"="remainder")| -+-----------------------------+--------------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 0.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -+-----------------------------+--------------------------------------------------------------+ -``` - -### DWT - -#### 注册语句 - -```sql -create function dwt as 'org.apache.iotdb.library.frequency.UDTFDWT' -``` - -#### 函数简介 - -本函数对输入序列进行一维离散小波变换。 - -**函数名:** DWT - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `method`:小波滤波的类型,提供'Haar', 'DB4', 'DB6', 'DB8',其中DB指代Daubechies。若不设置该参数,则用户需提供小波滤波的系数。不区分大小写。 -+ `coef`:小波滤波的系数。若提供该参数,请使用英文逗号','分割各项,不添加空格或其它符号。 -+ `layer`:进行变换的次数,最终输出的向量个数等同于$layer+1$.默认取1。 - -**输出序列:** 输出单个序列,类型为DOUBLE,长度与输入相等。 - -**提示:** 输入序列长度必须为2的整数次幂。 - -#### 使用示例 - -##### Haar变换 - - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.100+08:00| 0.2| -|1970-01-01T08:00:00.200+08:00| 1.5| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.600+08:00| 0.8| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.800+08:00| 2.5| -|1970-01-01T08:00:00.900+08:00| 2.1| 
-|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+---------------+ -``` - -SQL statement used for the query: - -```sql -select dwt(s1,"method"="haar") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|dwt(root.test.d1.s1, "method"="haar")| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.14142135834465192| -|1970-01-01T08:00:00.100+08:00| 1.909188342921157| -|1970-01-01T08:00:00.200+08:00| 1.6263456473052773| -|1970-01-01T08:00:00.300+08:00| 1.9798989957517026| -|1970-01-01T08:00:00.400+08:00| 3.252691126023161| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776479437628| -|1970-01-01T08:00:00.800+08:00| -0.14142135834465192| -|1970-01-01T08:00:00.900+08:00| 0.21213200063848547| -|1970-01-01T08:00:01.000+08:00| -0.7778174761639416| -|1970-01-01T08:00:01.100+08:00| -0.8485281289944873| -|1970-01-01T08:00:01.200+08:00| 0.2828427799095765| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426400127697095| -|1970-01-01T08:00:01.500+08:00| -0.42426408557066786| -+-----------------------------+-------------------------------------+ -``` - - -### IDWT - -#### Registration Statement - -```sql -create function idwt as 'org.apache.iotdb.library.frequency.UDTFIDWT' -``` - -#### Function Overview - -This function performs a one-dimensional inverse discrete wavelet transform on the input series, restoring the wavelet coefficients produced by a DWT decomposition to the original data. - -**Function name:** IDWT - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of wavelet filter. The available values are 'Haar', 'DB4', 'DB6', and 'DB8', where DB stands for Daubechies. If this parameter is not set, the user must provide the wavelet filter coefficients. The value is case-insensitive. -+ `coef`: The wavelet filter coefficients. If this parameter is provided, separate the items with ASCII commas ',' and do not add spaces or other symbols. -+ `layer`: The number of times the transform is applied; the number of output vectors equals $layer+1$. The default is 1. - -**Output series:** Outputs a single series of type DOUBLE with the same length as the input. -
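The Haar examples for DWT and IDWT can be cross-checked by hand. The following is a minimal Python sketch of a single-layer Haar transform and its inverse, assuming the output layout suggested by the example values in this section: the first half holds the approximation coefficients `(a + b) / sqrt(2)` and the second half the detail coefficients `(a - b) / sqrt(2)` for each adjacent pair. The function names `haar_dwt` and `haar_idwt` are illustrative and not part of IoTDB; the library's actual implementation may differ, especially for `layer` greater than 1 or for non-Haar filters.

```python
import math


def haar_dwt(values):
    """Single-layer Haar DWT (illustrative, not the IoTDB implementation).

    Returns the approximation coefficients followed by the detail coefficients.
    """
    n = len(values)
    if n < 2 or n & (n - 1):
        raise ValueError("input length must be a power of two (at least 2)")
    s = math.sqrt(2.0)
    pairs = list(zip(values[0::2], values[1::2]))
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx + detail


def haar_idwt(coeffs):
    """Inverse of haar_dwt: rebuild each pair from (approximation, detail)."""
    half = len(coeffs) // 2
    s = math.sqrt(2.0)
    restored = []
    for a, d in zip(coeffs[:half], coeffs[half:]):
        restored.append((a + d) / s)  # first element of the pair
        restored.append((a - d) / s)  # second element of the pair
    return restored


if __name__ == "__main__":
    # Input series of the DWT Haar example.
    series = [0.0, 0.2, 1.5, 1.2, 0.6, 1.7, 0.8, 2.0,
              2.5, 2.1, 0.0, 2.0, 1.8, 1.2, 1.0, 1.6]
    coeffs = haar_dwt(series)
    print(coeffs[:2])         # approximation coefficients of the first two pairs
    print(haar_idwt(coeffs))  # matches the input up to floating-point rounding
```

The round trip illustrates why IDWT has to be called with the same settings as the DWT that produced its input: the inverse only works if it assumes the same filter and coefficient layout.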
-**提示:** -* 输入序列长度必须为2的整数次幂。 -* IDWT 函数的参数设置(method/coef/layer)应与对应 DWT 变换时保持一致,才能正确还原原始数据。 -* 通常 IDWT 的输入为 DWT 函数的输出结果。 - -#### 使用示例 - -##### Haar变换 - - -输入序列: - -``` -+-----------------------------+--------------------+ -| Time| root.test.d1.s2| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 0.1414213562373095| -|1970-01-01T08:00:00.100+08:00| 1.909188309203678| -|1970-01-01T08:00:00.200+08:00| 1.6263455967290592| -|1970-01-01T08:00:00.300+08:00| 1.979898987322333| -|1970-01-01T08:00:00.400+08:00| 3.2526911934581184| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776310850235| -|1970-01-01T08:00:00.800+08:00| -0.1414213562373095| -|1970-01-01T08:00:00.900+08:00| 0.21213203435596428| -|1970-01-01T08:00:01.000+08:00| -0.7778174593052022| -|1970-01-01T08:00:01.100+08:00| -0.8485281374238569| -|1970-01-01T08:00:01.200+08:00| 0.2828427124746189| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426406871192857| -|1970-01-01T08:00:01.500+08:00|-0.42426406871192857| -+-----------------------------+--------------------+ -``` - -用于查询的SQL语句: - -```sql -select idwt(s2,"method"="haar") from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------+ -| Time|idwt(root.test.d1.s2, "method"="haar")| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.100+08:00| 0.19999999999999998| -|1970-01-01T08:00:00.200+08:00| 1.4999999999999996| -|1970-01-01T08:00:00.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.6999999999999997| -|1970-01-01T08:00:00.600+08:00| 0.7999999999999998| -|1970-01-01T08:00:00.700+08:00| 1.9999999999999996| -|1970-01-01T08:00:00.800+08:00| 2.4999999999999996| -|1970-01-01T08:00:00.900+08:00| 2.1| 
-|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.9999999999999996| -|1970-01-01T08:00:01.200+08:00| 1.7999999999999998| -|1970-01-01T08:00:01.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:01.400+08:00| 0.9999999999999998| -|1970-01-01T08:00:01.500+08:00| 1.5999999999999999| -+-----------------------------+--------------------------------------+ -``` - - -### FFT - -#### Registration Statement - -```sql -create function fft as 'org.apache.iotdb.library.frequency.UDTFFFT' -``` - -#### Function Overview - -This function performs a fast Fourier transform on the input series. - -**Function name:** FFT - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of Fourier transform, either 'uniform' or 'nonuniform'; the default is 'uniform'. With 'uniform', timestamps are ignored, all data points are treated as equidistant, and the equidistant fast Fourier algorithm is applied. With 'nonuniform', the non-equidistant fast Fourier algorithm is applied according to the timestamps (not implemented). -+ `result`: The part of the transform result to output, one of 'real', 'imag', 'abs', or 'angle', corresponding to the real part, imaginary part, magnitude, and phase angle of the result. By default, the magnitude is output. -+ `compress`: The compression parameter, in the range (0,1]; it is the proportion of energy retained under lossy compression. By default, no compression is applied. - -**Output series:** Outputs a single series of type DOUBLE with the same length as the input. The timestamps of the series start from 0 and only indicate order. - -**Note:** `NaN` values in the input series are ignored. - -#### Usage Examples - -##### Uniform Fourier Transform - -When the `method` parameter is omitted or set to 'uniform', this function performs a uniform (equidistant) Fourier transform. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| 
-|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select fft(s1) from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+----------------------+ -| Time| fft(root.test.d1.s1)| -+-----------------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 1.2727111142703152E-8| -|1970-01-01T08:00:00.002+08:00| 2.385520799101839E-7| -|1970-01-01T08:00:00.003+08:00| 8.723291723972645E-8| -|1970-01-01T08:00:00.004+08:00| 19.999999960195904| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| -|1970-01-01T08:00:00.006+08:00| 3.2260694930700566E-7| -|1970-01-01T08:00:00.007+08:00| 8.723291605373329E-8| -|1970-01-01T08:00:00.008+08:00| 1.108657103979944E-7| -|1970-01-01T08:00:00.009+08:00| 1.2727110997246171E-8| -|1970-01-01T08:00:00.010+08:00|1.9852334701272664E-23| -|1970-01-01T08:00:00.011+08:00| 1.2727111194499847E-8| -|1970-01-01T08:00:00.012+08:00| 1.108657103979944E-7| -|1970-01-01T08:00:00.013+08:00| 8.723291785769131E-8| -|1970-01-01T08:00:00.014+08:00| 3.226069493070057E-7| -|1970-01-01T08:00:00.015+08:00| 9.999999850988388| -|1970-01-01T08:00:00.016+08:00| 19.999999960195904| -|1970-01-01T08:00:00.017+08:00| 8.723291747109068E-8| -|1970-01-01T08:00:00.018+08:00| 2.3855207991018386E-7| -|1970-01-01T08:00:00.019+08:00| 1.2727112069910878E-8| -+-----------------------------+----------------------+ -``` - -注:输入序列服从$y=sin(2\pi t/4)+2sin(2\pi t/5)$,长度为20,因此在输出序列中$k=4$和$k=5$处有尖峰。 - -##### 等距傅里叶变换并压缩 - -输入序列同上,用于查询的SQL语句如下: - -```sql -select fft(s1, 'result'='real', 'compress'='0.99'), fft(s1, 'result'='imag','compress'='0.99') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| fft(root.test.d1.s1,| fft(root.test.d1.s1,| -| | "result"="real",| "result"="imag",| -| | "compress"="0.99")| "compress"="0.99")| 
-+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - -注:基于傅里叶变换结果的共轭性质,压缩结果只保留前一半;根据给定的压缩参数,从低频到高频保留数据点,直到保留的能量比例超过该值;保留最后一个数据点以表示序列长度。 - -### HighPass - -#### 注册语句 - -```sql -create function highpass as 'org.apache.iotdb.library.frequency.UDTFHighPass' -``` - -#### 函数简介 - -本函数对输入序列进行高通滤波,提取高于截止频率的分量。输入序列的时间戳将被忽略,所有数据点都将被视作等距的。 - -**函数名:** HIGHPASS - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `wpass`:归一化后的截止频率,取值为(0,1),不可缺省。 - -**输出序列:** 输出单个序列,类型为DOUBLE,它是滤波后的序列,长度与时间戳均与输入一致。 - -**提示:** 输入序列中的`NaN`将被忽略。 - -#### 使用示例 - - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 
1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - - -用于查询的SQL语句: - -```sql -select highpass(s1,'wpass'='0.45') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+-----------------------------------------+ -| Time|highpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9999999534830373| -|1970-01-01T08:00:01.000+08:00| 1.7462829277628608E-8| -|1970-01-01T08:00:02.000+08:00| -0.9999999593178128| -|1970-01-01T08:00:03.000+08:00| -4.1115269056426626E-8| -|1970-01-01T08:00:04.000+08:00| 0.9999999925494194| -|1970-01-01T08:00:05.000+08:00| 3.328126513330016E-8| -|1970-01-01T08:00:06.000+08:00| -1.0000000183304454| -|1970-01-01T08:00:07.000+08:00| 6.260191433311374E-10| -|1970-01-01T08:00:08.000+08:00| 1.0000000018134796| -|1970-01-01T08:00:09.000+08:00| -3.097210911744423E-17| -|1970-01-01T08:00:10.000+08:00| -1.0000000018134794| -|1970-01-01T08:00:11.000+08:00| -6.260191627862097E-10| -|1970-01-01T08:00:12.000+08:00| 1.0000000183304454| -|1970-01-01T08:00:13.000+08:00| -3.328126501424346E-8| -|1970-01-01T08:00:14.000+08:00| -0.9999999925494196| -|1970-01-01T08:00:15.000+08:00| 4.111526915498874E-8| -|1970-01-01T08:00:16.000+08:00| 0.9999999593178128| -|1970-01-01T08:00:17.000+08:00| -1.7462829341296528E-8| -|1970-01-01T08:00:18.000+08:00| -0.9999999534830369| -|1970-01-01T08:00:19.000+08:00| -1.035237222742873E-16| -+-----------------------------+-----------------------------------------+ -``` - -注:输入序列服从$y=sin(2\pi t/4)+2sin(2\pi t/5)$,长度为20,因此高通滤波之后的输出序列服从$y=sin(2\pi t/4)$。 - -### IFFT - -#### 注册语句 - -```sql -create function ifft as 'org.apache.iotdb.library.frequency.UDTFIFFT' -``` - -#### 函数简介 - 
-本函数将输入的两个序列作为实部和虚部视作一个复数,进行逆快速傅里叶变换,并输出结果的实部。输入数据的格式参见`FFT`函数的输出,并支持以`FFT`函数压缩后的输出作为本函数的输入。 - -**函数名:** IFFT - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `start`:输出序列的起始时刻,是一个格式为'yyyy-MM-dd HH:mm:ss'的时间字符串。在缺省情况下,为'1970-01-01 08:00:00'。 -+ `interval`:输出序列的时间间隔,是一个有单位的正数。目前支持五种单位,分别是'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。在缺省情况下,为1s。 - - -**输出序列:** 输出单个序列,类型为DOUBLE。该序列是一个等距时间序列,它的值是将两个输入序列依次作为实部和虚部进行逆快速傅里叶变换的结果。 - -**提示:** 如果某行数据中包含空值或`NaN`,该行数据将会被忽略。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| root.test.d1.re| root.test.d1.im| -+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - - -用于查询的SQL语句: - -```sql -select ifft(re, im, 'interval'='1m', 'start'='2021-01-01 00:00:00') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------------------+ -| Time|ifft(root.test.d1.re, root.test.d1.im, "interval"="1m",| -| | "start"="2021-01-01 00:00:00")| -+-----------------------------+-------------------------------------------------------+ -|2021-01-01T00:00:00.000+08:00| 2.902112992431231| -|2021-01-01T00:01:00.000+08:00| 1.1755704705132448| -|2021-01-01T00:02:00.000+08:00| -2.175570513757101| -|2021-01-01T00:03:00.000+08:00| -1.9021130389094498| -|2021-01-01T00:04:00.000+08:00| 0.9999999925494194| 
-|2021-01-01T00:05:00.000+08:00| 1.902113046743454| -|2021-01-01T00:06:00.000+08:00| 0.17557053610884188| -|2021-01-01T00:07:00.000+08:00| -1.1755704886020932| -|2021-01-01T00:08:00.000+08:00| -0.9021130371347148| -|2021-01-01T00:09:00.000+08:00| 3.552713678800501E-16| -|2021-01-01T00:10:00.000+08:00| 0.9021130371347154| -|2021-01-01T00:11:00.000+08:00| 1.1755704886020932| -|2021-01-01T00:12:00.000+08:00| -0.17557053610884144| -|2021-01-01T00:13:00.000+08:00| -1.902113046743454| -|2021-01-01T00:14:00.000+08:00| -0.9999999925494196| -|2021-01-01T00:15:00.000+08:00| 1.9021130389094498| -|2021-01-01T00:16:00.000+08:00| 2.1755705137571004| -|2021-01-01T00:17:00.000+08:00| -1.1755704705132448| -|2021-01-01T00:18:00.000+08:00| -2.902112992431231| -|2021-01-01T00:19:00.000+08:00| -3.552713678800501E-16| -+-----------------------------+-------------------------------------------------------+ -``` - -### LowPass - -#### 注册语句 - -```sql -create function lowpass as 'org.apache.iotdb.library.frequency.UDTFLowPass' -``` - -#### 函数简介 - -本函数对输入序列进行低通滤波,提取低于截止频率的分量。输入序列的时间戳将被忽略,所有数据点都将被视作等距的。 - -**函数名:** LOWPASS - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `wpass`:归一化后的截止频率,取值为(0,1),不可缺省。 - -**输出序列:** 输出单个序列,类型为DOUBLE,它是滤波后的序列,长度与时间戳均与输入一致。 - -**提示:** 输入序列中的`NaN`将被忽略。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| 
-|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - - -用于查询的SQL语句: - -```sql -select lowpass(s1,'wpass'='0.45') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+----------------------------------------+ -| Time|lowpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.9021130073323922| -|1970-01-01T08:00:01.000+08:00| 1.1755704705132448| -|1970-01-01T08:00:02.000+08:00| -1.1755705286582614| -|1970-01-01T08:00:03.000+08:00| -1.9021130389094498| -|1970-01-01T08:00:04.000+08:00| 7.450580419288145E-9| -|1970-01-01T08:00:05.000+08:00| 1.902113046743454| -|1970-01-01T08:00:06.000+08:00| 1.1755705212076808| -|1970-01-01T08:00:07.000+08:00| -1.1755704886020932| -|1970-01-01T08:00:08.000+08:00| -1.9021130222335536| -|1970-01-01T08:00:09.000+08:00| 3.552713678800501E-16| -|1970-01-01T08:00:10.000+08:00| 1.9021130222335536| -|1970-01-01T08:00:11.000+08:00| 1.1755704886020932| -|1970-01-01T08:00:12.000+08:00| -1.1755705212076801| -|1970-01-01T08:00:13.000+08:00| -1.902113046743454| -|1970-01-01T08:00:14.000+08:00| -7.45058112983088E-9| -|1970-01-01T08:00:15.000+08:00| 1.9021130389094498| -|1970-01-01T08:00:16.000+08:00| 1.1755705286582616| -|1970-01-01T08:00:17.000+08:00| -1.1755704705132448| -|1970-01-01T08:00:18.000+08:00| -1.9021130073323924| -|1970-01-01T08:00:19.000+08:00| -2.664535259100376E-16| -+-----------------------------+----------------------------------------+ -``` - -注:输入序列服从$y=sin(2\pi t/4)+2sin(2\pi t/5)$,长度为20,因此低通滤波之后的输出序列服从$y=2sin(2\pi t/5)$。 - - -### Envelope - -#### 注册语句 - -```sql -create function 
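envelope as 'org.apache.iotdb.library.frequency.UDFEnvelopeAnalysis' -``` - -#### Function Overview - -This function takes a one-dimensional floating-point array and a user-specified modulation frequency, and demodulates the signal to extract its envelope. The goal of demodulation is to extract the part of interest from a complex signal so that it is easier to understand. For example, demodulation can reveal the envelope of a signal, that is, the trend of its amplitude. - -**Function name:** Envelope - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `frequency`: The frequency (optional, a positive number; if this parameter is omitted, the system infers the frequency from the time intervals between the timestamps of the series). -+ `amplification`: The amplification factor (optional, a positive integer; the Time column of the output contains only positive integers and never fractional values, so when the frequency is less than 1, this parameter can be used to scale the frequency up so that the result is displayed properly). - -**Output series:** -+ `Time`: The values in this column represent frequency, not time. If the output is displayed in time format (for example, 1970-01-01T08:00:19.000+08:00), convert it to a timestamp value. - -+ `Envelope(Path, 'frequency'='{frequency}')`: Outputs a single series of type DOUBLE, which is the result of the envelope analysis. - -**Note:** When the values of the original series being demodulated are not continuous, this function treats them as continuous. The analyzed time series should preferably be a segment with complete values, and it is recommended to specify a start time and an end time. - -#### Usage Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:01.000+08:00| 1.0 | -|1970-01-01T08:00:02.000+08:00| 2.0 | -|1970-01-01T08:00:03.000+08:00| 3.0 | -|1970-01-01T08:00:04.000+08:00| 4.0 | -|1970-01-01T08:00:05.000+08:00| 5.0 | -|1970-01-01T08:00:06.000+08:00| 6.0 | -|1970-01-01T08:00:07.000+08:00| 7.0 | -|1970-01-01T08:00:08.000+08:00| 8.0 | -|1970-01-01T08:00:09.000+08:00| 9.0 | -|1970-01-01T08:00:10.000+08:00| 10.0 | -+-----------------------------+---------------+ -``` - -SQL statement used for the query: - -```sql -set time_display_type=long; -select envelope(s1),envelope(s1,'frequency'='1000'),envelope(s1,'amplification'='10') from root.test.d1; -``` - -Output series: - -``` -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -|Time|envelope(root.test.d1.s1)|envelope(root.test.d1.s1, "frequency"="1000")|envelope(root.test.d1.s1, "amplification"="10")| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -| 0| 6.284350808484124| 6.284350808484124| 6.284350808484124| -| 100| 1.5581923657404393| 1.5581923657404393| null| -| 200| 0.8503211038340728| 0.8503211038340728| 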
null| -| 300| 0.512808785945551| 0.512808785945551| null| -| 400| 0.26361156774506744| 0.26361156774506744| null| -|1000| null| null| 1.5581923657404393| -|2000| null| null| 0.8503211038340728| -|3000| null| null| 0.512808785945551| -|4000| null| null| 0.26361156774506744| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ - -``` - -## 数据匹配 - -### Cov - -#### 注册语句 - -```sql -create function cov as 'org.apache.iotdb.library.dmatch.UDAFCov' -``` - -#### 函数简介 - -本函数用于计算两列数值型数据的总体协方差。 - -**函数名:** COV - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列仅包含一个时间戳为 0、值为总体协方差的数据点。 - -**提示:** - -+ 如果某行数据中包含空值、缺失值或`NaN`,该行数据将会被忽略; -+ 如果数据中所有的行都被忽略,函数将会输出`NaN`。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select cov(s1,s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------+ -| Time|cov(root.test.d2.s1, root.test.d2.s2)| 
-+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 12.291666666666666| -+-----------------------------+-------------------------------------+ -``` - -### Dtw - -#### 注册语句 - -```sql -create function dtw as 'org.apache.iotdb.library.dmatch.UDAFDtw' -``` - -#### 函数简介 - -本函数用于计算两列数值型数据的 DTW 距离。 - -**函数名:** DTW - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列仅包含一个时间戳为 0、值为两个时间序列的 DTW 距离值。 - -**提示:** - -+ 如果某行数据中包含空值、缺失值或`NaN`,该行数据将会被忽略; -+ 如果数据中所有的行都被忽略,函数将会输出 0。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.003+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.004+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.005+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.006+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.007+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.008+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.009+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.010+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.011+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.012+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.013+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.014+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.015+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.016+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.017+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.018+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.019+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.020+08:00| 1.0| 2.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select dtw(s1,s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------+ -| Time|dtw(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 
20.0| -+-----------------------------+-------------------------------------+ -``` - -### Pearson - -#### 注册语句 - -```sql -create function pearson as 'org.apache.iotdb.library.dmatch.UDAFPearson' -``` - -#### 函数简介 - -本函数用于计算两列数值型数据的皮尔森相关系数。 - -**函数名:** PEARSON - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列仅包含一个时间戳为 0、值为皮尔森相关系数的数据点。 - -**提示:** - -+ 如果某行数据中包含空值、缺失值或`NaN`,该行数据将会被忽略; -+ 如果数据中所有的行都被忽略,函数将会输出`NaN`。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select pearson(s1,s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-----------------------------------------+ -| Time|pearson(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.5630881927754872| -+-----------------------------+-----------------------------------------+ -``` - -### PtnSym - -#### 注册语句 - -```sql -create function ptnsym as 
'org.apache.iotdb.library.dmatch.UDTFPtnSym' -``` - -#### 函数简介 - -本函数用于寻找序列中所有对称度小于阈值的对称子序列。对称度通过 DTW 计算,值越小代表序列对称性越高。 - -**函数名:** PTNSYM - -**输入序列:** 仅支持一个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `window`:对称子序列的长度,是一个正整数,默认值为 10。 -+ `threshold`:对称度阈值,是一个非负数,只有对称度小于等于该值的对称子序列才会被输出。在缺省情况下,所有的子序列都会被输出。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列中的每一个数据点对应于一个对称子序列,时间戳为子序列的起始时刻,值为对称度。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s4| -+-----------------------------+---------------+ -|2021-01-01T12:00:00.000+08:00| 1.0| -|2021-01-01T12:00:01.000+08:00| 2.0| -|2021-01-01T12:00:02.000+08:00| 3.0| -|2021-01-01T12:00:03.000+08:00| 2.0| -|2021-01-01T12:00:04.000+08:00| 1.0| -|2021-01-01T12:00:05.000+08:00| 1.0| -|2021-01-01T12:00:06.000+08:00| 1.0| -|2021-01-01T12:00:07.000+08:00| 1.0| -|2021-01-01T12:00:08.000+08:00| 2.0| -|2021-01-01T12:00:09.000+08:00| 3.0| -|2021-01-01T12:00:10.000+08:00| 2.0| -|2021-01-01T12:00:11.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select ptnsym(s4, 'window'='5', 'threshold'='0') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|ptnsym(root.test.d1.s4, "window"="5", "threshold"="0")| -+-----------------------------+------------------------------------------------------+ -|2021-01-01T12:00:00.000+08:00| 0.0| -|2021-01-01T12:00:07.000+08:00| 0.0| -+-----------------------------+------------------------------------------------------+ -``` - -### XCorr - -#### 注册语句 - -```sql -create function xcorr as 'org.apache.iotdb.library.dmatch.UDTFXCorr' -``` - -#### 函数简介 - -本函数用于计算两条时间序列的互相关函数值, -对离散序列而言,互相关函数可以表示为 -$$CR(n) = \frac{1}{N} \sum_{m=1}^N S_1[m]S_2[m+n]$$ -常用于表征两条序列在不同对齐条件下的相似度。 - -**函数名:** XCORR - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列中共包含$2N-1$个数据点, 
-其中正中心的值为两条序列按照预先对齐的结果计算的互相关系数(即等于以上公式的$CR(0)$), -前半部分的值表示将后一条输入序列向前平移时计算的互相关系数, -直至两条序列没有重合的数据点(不包含完全分离时的结果$CR(-N)=0.0$), -后半部分类似。 -用公式可表示为(所有序列的索引从1开始计数): -$$OS[i] = CR(-N+i) = \frac{1}{N} \sum_{m=1}^{i} S_1[m]S_2[N-i+m],\ if\ i <= N$$ -$$OS[i] = CR(i-N) = \frac{1}{N} \sum_{m=1}^{2N-i} S_1[i-N+m]S_2[m],\ if\ i > N$$ - -**提示:** - -+ 两条序列中的`null` 和`NaN` 值会被忽略,在计算中表现为 0。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| null| 6| -|2020-01-01T00:00:02.000+08:00| 2| 7| -|2020-01-01T00:00:03.000+08:00| 3| NaN| -|2020-01-01T00:00:04.000+08:00| 4| 9| -|2020-01-01T00:00:05.000+08:00| 5| 10| -+-----------------------------+---------------+---------------+ -``` - - -用于查询的 SQL 语句: - -```sql -select xcorr(s1, s2) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------+ -| Time|xcorr(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+---------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 10.0| -|1970-01-01T08:00:00.002+08:00| 16.0| -|1970-01-01T08:00:00.003+08:00| 16.75| -|1970-01-01T08:00:00.004+08:00| 20.0| -|1970-01-01T08:00:00.005+08:00| 13.2| -|1970-01-01T08:00:00.006+08:00| 5.6| -|1970-01-01T08:00:00.007+08:00| 7.0| -|1970-01-01T08:00:00.008+08:00| 0.0| -+-----------------------------+---------------------------------------+ -``` -### 6.6 Pattern\_match - -#### 注册语句 - -```SQL -create function pattern_match as 'org.apache.iotdb.library.match.UDAFPatternMatch' -``` - -#### 函数简介 - -本函数用于对输入的某一条时间序列与预设的`pattern`进行模式匹配,当相似度小于等于某个预设阈值时判定为匹配成功,并将最终匹配结果以`json`列表的方式输出。 - -**函数名:** PATTERN\_MATCH - -**输入序列:** 仅支持一个输入序列,类型为INT32,INT64,FLOAT,DOUBLE,BOOLEAN。 - -**参数:** - -* `timePattern` :以时间戳组成的字符串,以逗号分隔。长度必须大于1。必填项。 -* `valuePattern 
`:以数字组成的字符串,以逗号分隔。数量与 `timePattern `相同,长度必须大于1。必填项。 - -> 提示:布尔类型的`valuePattern `,需要用1,0来表示`true`和`false`。 - -* `threshold` :阈值。Float类型。必填项。 - -**输出序列**:输出结果为包含所有成功匹配段落的起始时间戳`startTime`、终止时间戳`endTime`及相似度值`distance`的`json`列表。 - -#### 使用示例 -1. 线性数据 - -输入序列: - -```SQL -IoTDB> select s0 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s0| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.1| -|1970-01-01T08:00:00.003+08:00| 1.2| -|1970-01-01T08:00:00.004+08:00| 1.3| -|1970-01-01T08:00:00.005+08:00| 0.0| -+-----------------------------+-------------+ -``` - -用于查询的SQL语句: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result from root.db.d0 -``` - -输出序列: - -```SQL -+--------------------------------------------------------------------------------------------------+ -| match_result| -+--------------------------------------------------------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]| -+--------------------------------------------------------------------------------------------------+ -``` - -2. 
布尔类型数据 - -输入序列: - -```SQL -IoTDB> select s1 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| true| -|1970-01-01T08:00:00.002+08:00| true| -|1970-01-01T08:00:00.003+08:00| true| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| false| -+-----------------------------+-------------+ -``` - -用于查询的SQL语句: - -```SQL -select pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", "threshold"="0.5") as match_result from root.db.d0 -``` - -输出序列: - -```SQL -+-------------------------------------------------+ -| match_result| -+-------------------------------------------------+ -|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+-------------------------------------------------+ -``` - -3. V型数据 - -输入序列: - -```SQL -IoTDB> select s2 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s2| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| -1.0| -|1970-01-01T08:00:00.003+08:00| -2.0| -|1970-01-01T08:00:00.004+08:00| -3.0| -|1970-01-01T08:00:00.005+08:00| -2.0| -|1970-01-01T08:00:00.006+08:00| -1.0| -|1970-01-01T08:00:00.007+08:00| -0.0| -|1970-01-01T08:00:00.008+08:00| -0.0| -|1970-01-01T08:00:00.009+08:00| -0.0| -|1970-01-01T08:00:00.010+08:00| -0.0| -+-----------------------------+-------------+ -``` - -用于查询的SQL语句: - -```SQL -select pattern_match (s2, "timePattern"="1,2,3,4,5,6,7", "valuePattern"="0.0,-1.0,-2.0,-3.0,-2.0,-1.0,-0.0", "threshold"="10") as match_result from root.db.d0 -``` - -输出序列: - -```SQL -+----------------------------------------------+ -| match_result| -+----------------------------------------------+ -|[{"distance":0.53,"startTime":1,"endTime":10}]| -+----------------------------------------------+ -``` - -4. 
多个匹配模式 - -输入序列: - -```SQL -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------+-------------+ -| Time|root.db.d0.s0|root.db.d0.s1| -+-----------------------------+-------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| true| -|1970-01-01T08:00:00.002+08:00| 1.1| true| -|1970-01-01T08:00:00.003+08:00| 1.2| true| -|1970-01-01T08:00:00.004+08:00| 1.3| false| -|1970-01-01T08:00:00.005+08:00| 0.0| false| -+-----------------------------+-------------+-------------+ -``` - -用于查询的SQL语句: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result1, pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", - "threshold"="0.5") as match_result2 from root.db.d0 -``` - -输出序列: - -```SQL -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -| match_result1| match_result2| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -``` - - - -## 数据修复 - -### TimestampRepair - -#### 注册语句 - -```sql -create function timestamprepair as 'org.apache.iotdb.library.drepair.UDTFTimestampRepair' -``` - -### 函数简介 - -本函数用于时间戳修复。根据给定的标准时间间隔,采用最小化修复代价的方法,通过对数据时间戳的微调,将原本时间戳间隔不稳定的数据修复为严格等间隔的数据。在未给定标准时间间隔的情况下,本函数将使用时间间隔的中位数 (median)、众数 (mode) 或聚类中心 (cluster) 来推算标准时间间隔。 - - -**函数名:** TIMESTAMPREPAIR - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `interval`: 标准时间间隔(单位是毫秒),是一个正整数。在缺省情况下,将根据指定的方法推算。 -+ `method`:推算标准时间间隔的方法,取值为 'median', 'mode' 或 
'cluster',仅在`interval`缺省时有效。在缺省情况下,将使用中位数方法进行推算。 - -**输出序列:** 输出单个序列,类型与输入序列相同。该序列是修复后的输入序列。 - -### 使用示例 - -#### 指定标准时间间隔 - -在给定`interval`参数的情况下,本函数将按照指定的标准时间间隔进行修复。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:19.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:01.000+08:00| 7.0| -|2021-07-01T12:01:11.000+08:00| 8.0| -|2021-07-01T12:01:21.000+08:00| 9.0| -|2021-07-01T12:01:31.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select timestamprepair(s1,'interval'='10000') from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+----------------------------------------------------+ -| Time|timestamprepair(root.test.d2.s1, "interval"="10000")| -+-----------------------------+----------------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:20.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:00.000+08:00| 7.0| -|2021-07-01T12:01:10.000+08:00| 8.0| -|2021-07-01T12:01:20.000+08:00| 9.0| -|2021-07-01T12:01:30.000+08:00| 10.0| -|2021-07-01T12:01:40.000+08:00| NaN| -+-----------------------------+----------------------------------------------------+ -``` - -#### 自动推算标准时间间隔 - -如果`interval`参数没有给定,本函数将按照推算的标准时间间隔进行修复。 - -输入序列同上,用于查询的 SQL 语句如下: - -```sql -select timestamprepair(s1) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------+ -| Time|timestamprepair(root.test.d2.s1)| -+-----------------------------+--------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| 
-|2021-07-01T12:00:10.000+08:00| 2.0|
-|2021-07-01T12:00:20.000+08:00| 3.0|
-|2021-07-01T12:00:30.000+08:00| 4.0|
-|2021-07-01T12:00:40.000+08:00| 5.0|
-|2021-07-01T12:00:50.000+08:00| 6.0|
-|2021-07-01T12:01:00.000+08:00| 7.0|
-|2021-07-01T12:01:10.000+08:00| 8.0|
-|2021-07-01T12:01:20.000+08:00| 9.0|
-|2021-07-01T12:01:30.000+08:00| 10.0|
-|2021-07-01T12:01:40.000+08:00| NaN|
-+-----------------------------+--------------------------------+
-```
-
-### ValueFill
-
-#### Registration Statement
-
-```sql
-create function valuefill as 'org.apache.iotdb.library.drepair.UDTFValueFill'
-```
-
-#### Function Overview
-
-**Function name:** ValueFill
-
-**Input series:** A single time series of type INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `method`: one of {"mean", "previous", "linear", "likelihood", "AR", "MA", "SCREEN"}; defaults to "linear". "mean" fills with the mean value; "previous" fills with the previous value; "linear" fills by linear interpolation; "likelihood" uses maximum likelihood estimation based on a normal distribution of the speed; "AR" fills with an autoregressive model; "MA" fills with a moving average; "SCREEN" fills under constraints.
-
-**Output series:** The filled one-dimensional series.
-
-**Note:** The AR model used is AR(1). The series must satisfy the autocorrelation condition; otherwise a single data point (0, 0.0) is output.
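The `linear` method can be pictured with a small Python sketch. This is only an illustration, not the IoTDB implementation: the function name `linear_fill` is hypothetical, and two behaviours are assumptions. The first is that interpolation runs over row positions rather than timestamps, which is inferred from the documented example where the gap between 108.0 and 113.0 is filled with 110.5. The second is that leading or trailing gaps copy the nearest valid value.

```python
import math


def linear_fill(values):
    """Fill NaN gaps by linear interpolation over row positions.

    Illustrative only; not the IoTDB implementation.
    """
    filled = list(values)
    valid = [i for i, v in enumerate(filled) if not math.isnan(v)]
    if not valid:
        return filled  # nothing to interpolate from

    # Assumption: gaps at either end copy the nearest valid value.
    for i in range(valid[0]):
        filled[i] = filled[valid[0]]
    for i in range(valid[-1] + 1, len(filled)):
        filled[i] = filled[valid[-1]]

    # Interior gaps: straight line between the two surrounding valid rows.
    for left, right in zip(valid, valid[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


if __name__ == "__main__":
    nan = float("nan")
    print(linear_fill([108.0, nan, 113.0]))
```

Working on row positions keeps the sketch short; a timestamp-weighted variant would need the time column as a second argument.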
-
-#### Usage Examples
-##### Filling with the linear method
-
-When `method` is omitted or set to 'linear', this function fills values by linear interpolation.
-
-Input series:
-
-```
-+-----------------------------+---------------+
-|                         Time|root.test.d2.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:02.000+08:00|            NaN|
-|2020-01-01T00:00:03.000+08:00|          101.0|
-|2020-01-01T00:00:04.000+08:00|          102.0|
-|2020-01-01T00:00:06.000+08:00|          104.0|
-|2020-01-01T00:00:08.000+08:00|          126.0|
-|2020-01-01T00:00:10.000+08:00|          108.0|
-|2020-01-01T00:00:14.000+08:00|            NaN|
-|2020-01-01T00:00:15.000+08:00|          113.0|
-|2020-01-01T00:00:16.000+08:00|          114.0|
-|2020-01-01T00:00:18.000+08:00|          116.0|
-|2020-01-01T00:00:20.000+08:00|            NaN|
-|2020-01-01T00:00:22.000+08:00|            NaN|
-|2020-01-01T00:00:26.000+08:00|          124.0|
-|2020-01-01T00:00:28.000+08:00|          126.0|
-|2020-01-01T00:00:30.000+08:00|          128.0|
-+-----------------------------+---------------+
-```
-
-SQL statement for the query:
-
-```sql
-select valuefill(s1) from root.test.d2
-```
-
-Output series:
-
-```
-+-----------------------------+--------------------------+
-|                         Time|valuefill(root.test.d2.s1)|
-+-----------------------------+--------------------------+
-|2020-01-01T00:00:02.000+08:00|                     101.0|
-|2020-01-01T00:00:03.000+08:00|                     101.0|
-|2020-01-01T00:00:04.000+08:00|                     102.0|
-|2020-01-01T00:00:06.000+08:00|                     104.0|
-|2020-01-01T00:00:08.000+08:00|                     126.0|
-|2020-01-01T00:00:10.000+08:00|                     108.0|
-|2020-01-01T00:00:14.000+08:00|                     110.5|
-|2020-01-01T00:00:15.000+08:00|                     113.0|
-|2020-01-01T00:00:16.000+08:00|                     114.0|
-|2020-01-01T00:00:18.000+08:00|                     116.0|
-|2020-01-01T00:00:20.000+08:00|        118.66666666666667|
-|2020-01-01T00:00:22.000+08:00|        121.33333333333333|
-|2020-01-01T00:00:26.000+08:00|                     124.0|
-|2020-01-01T00:00:28.000+08:00|                     126.0|
-|2020-01-01T00:00:30.000+08:00|                     128.0|
-+-----------------------------+--------------------------+
-```
-
-##### Filling with the previous method
-
-When `method` is set to 'previous', this function fills each missing value with the previous value.
-
-The input series is the same as above. The SQL statement for the query is:
-
-```sql
-select valuefill(s1,"method"="previous") from root.test.d2
-```
-
-Output series:
-
-``` -+-----------------------------+-----------------------------------------------+ -| Time|valuefill(root.test.d2.s1, "method"="previous")| -+-----------------------------+-----------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 108.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 116.0| -|2020-01-01T00:00:22.000+08:00| 116.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-----------------------------------------------+ -``` - -### ValueRepair - -#### 注册语句 - -```sql -create function valuerepair as 'org.apache.iotdb.library.drepair.UDTFValueRepair' -``` - -#### 函数简介 - -本函数用于对时间序列的数值进行修复。目前,本函数支持两种修复方法:**Screen** 是一种基于速度阈值的方法,在最小改动的前提下使得所有的速度符合阈值要求;**LsGreedy** 是一种基于速度变化似然的方法,将速度变化建模为高斯分布,并采用贪心算法极大化似然函数。 - -**函数名:** VALUEREPAIR - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `method`:修复时采用的方法,取值为 'Screen' 或 'LsGreedy'. 
在缺省情况下,使用 Screen 方法进行修复。 -+ `minSpeed`:该参数仅在使用 Screen 方法时有效。当速度小于该值时会被视作数值异常点加以修复。在缺省情况下为中位数减去三倍绝对中位差。 -+ `maxSpeed`:该参数仅在使用 Screen 方法时有效。当速度大于该值时会被视作数值异常点加以修复。在缺省情况下为中位数加上三倍绝对中位差。 -+ `center`:该参数仅在使用 LsGreedy 方法时有效。对速度变化分布建立的高斯模型的中心。在缺省情况下为 0。 -+ `sigma` :该参数仅在使用 LsGreedy 方法时有效。对速度变化分布建立的高斯模型的标准差。在缺省情况下为绝对中位差。 - -**输出序列:** 输出单个序列,类型与输入序列相同。该序列是修复后的输入序列。 - -**提示:** 输入序列中的`NaN`在修复之前会先进行线性插值填补。 - -#### 使用示例 - -##### 使用 Screen 方法进行修复 - -当`method`缺省或取值为 'Screen' 时,本函数将使用 Screen 方法进行数值修复。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select valuerepair(s1) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+----------------------------+ -| Time|valuerepair(root.test.d2.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 
-|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+----------------------------+ -``` - -##### 使用 LsGreedy 方法进行修复 - -当`method`取值为 'LsGreedy' 时,本函数将使用 LsGreedy 方法进行数值修复。 - -输入序列同上,用于查询的 SQL 语句如下: - -```sql -select valuerepair(s1,'method'='LsGreedy') from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|valuerepair(root.test.d2.s1, "method"="LsGreedy")| -+-----------------------------+-------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-------------------------------------------------+ -``` - -## 序列发现 - -### ConsecutiveSequences - -#### 注册语句 - -```sql -create function consecutivesequences as 'org.apache.iotdb.library.series.UDTFConsecutiveSequences' -``` - -#### 函数简介 - -本函数用于在多维严格等间隔数据中发现局部最长连续子序列。 - -严格等间隔数据是指数据的时间间隔是严格相等的,允许存在数据缺失(包括行缺失和值缺失),但不允许存在数据冗余和时间戳偏移。 - -连续子序列是指严格按照标准时间间隔等距排布,不存在任何数据缺失的子序列。如果某个连续子序列不是任何连续子序列的真子序列,那么它是局部最长的。 - - -**函数名:** CONSECUTIVESEQUENCES - -**输入序列:** 支持多个输入序列,类型可以是任意的,但要满足严格等间隔的要求。 - -**参数:** - -+ `gap`:标准时间间隔,是一个有单位的正数。目前支持五种单位,分别是'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。在缺省情况下,函数会利用众数估计标准时间间隔。 - -**输出序列:** 输出单个序列,类型为 INT32。输出序列中的每一个数据点对应一个局部最长连续子序列,时间戳为子序列的起始时刻,值为子序列包含的数据点个数。 - -**提示:** 
对于不符合要求的输入,本函数不对输出做任何保证。 - -#### 使用示例 - -##### 手动指定标准时间间隔 - -本函数可以通过`gap`参数手动指定标准时间间隔。需要注意的是,错误的参数设置会导致输出产生严重错误。 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| -|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select consecutivesequences(s1,s2,'gap'='5m') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2, "gap"="5m")| -+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| -+-----------------------------+------------------------------------------------------------------+ -``` - -##### 自动估计标准时间间隔 - -当`gap`参数缺省时,本函数可以利用众数估计标准时间间隔,得到同样的结果。因此,这种用法更受推荐。 - -输入序列同上,用于查询的SQL语句如下: - -```sql -select consecutivesequences(s1,s2) from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| 
-+-----------------------------+------------------------------------------------------+ -``` - -### ConsecutiveWindows - -#### 注册语句 - -```sql -create function consecutivewindows as 'org.apache.iotdb.library.series.UDTFConsecutiveWindows' -``` - -#### 函数简介 - -本函数用于在多维严格等间隔数据中发现指定长度的连续窗口。 - -严格等间隔数据是指数据的时间间隔是严格相等的,允许存在数据缺失(包括行缺失和值缺失),但不允许存在数据冗余和时间戳偏移。 - -连续窗口是指严格按照标准时间间隔等距排布,不存在任何数据缺失的子序列。 - - -**函数名:** CONSECUTIVEWINDOWS - -**输入序列:** 支持多个输入序列,类型可以是任意的,但要满足严格等间隔的要求。 - -**参数:** - -+ `gap`:标准时间间隔,是一个有单位的正数。目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。在缺省情况下,函数会利用众数估计标准时间间隔。 -+ `length`:序列长度,是一个有单位的正数。目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。该参数不允许缺省。 - -**输出序列:** 输出单个序列,类型为 INT32。输出序列中的每一个数据点对应一个指定长度连续子序列,时间戳为子序列的起始时刻,值为子序列包含的数据点个数。 - -**提示:** 对于不符合要求的输入,本函数不对输出做任何保证。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| -|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select consecutivewindows(s1,s2,'length'='10m') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------------------------------+ -| Time|consecutivewindows(root.test.d1.s1, root.test.d1.s2, "length"="10m")| -+-----------------------------+--------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| 
-+-----------------------------+--------------------------------------------------------------------+
-```
-
-
-
-## Machine Learning
-
-### AR
-
-#### Registration Statement
-
-```sql
-create function ar as 'org.apache.iotdb.library.dlearn.UDTFAR'
-```
-#### Function Overview
-
-This function learns the coefficients of an autoregressive model from the data.
-
-**Function name:** AR
-
-**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-- `p`: the order of the autoregressive model. Defaults to 1.
-
-**Output series:** A single series of type DOUBLE. The first row is the first-order coefficient of the model, and so on.
-
-**Notes:**
-
-- `p` must be a positive integer.
-- Most points in the series should be sampled at equal intervals.
-- Missing points in the series are filled by linear interpolation before being used for learning.
-
-#### Usage Examples
-
-##### Specifying the Order
-
-Input series:
-
-```
-+-----------------------------+---------------+
-|                         Time|root.test.d0.s0|
-+-----------------------------+---------------+
-|2020-01-01T00:00:01.000+08:00|           -4.0|
-|2020-01-01T00:00:02.000+08:00|           -3.0|
-|2020-01-01T00:00:03.000+08:00|           -2.0|
-|2020-01-01T00:00:04.000+08:00|           -1.0|
-|2020-01-01T00:00:05.000+08:00|            0.0|
-|2020-01-01T00:00:06.000+08:00|            1.0|
-|2020-01-01T00:00:07.000+08:00|            2.0|
-|2020-01-01T00:00:08.000+08:00|            3.0|
-|2020-01-01T00:00:09.000+08:00|            4.0|
-+-----------------------------+---------------+
-```
-
-SQL statement for the query:
-
-```sql
-select ar(s0,"p"="2") from root.test.d0
-```
-
-Output series:
-
-```
-+-----------------------------+---------------------------+
-|                         Time|ar(root.test.d0.s0,"p"="2")|
-+-----------------------------+---------------------------+
-|1970-01-01T08:00:00.001+08:00|                     0.9429|
-|1970-01-01T08:00:00.002+08:00|                    -0.2571|
-+-----------------------------+---------------------------+
-```
diff --git a/src/zh/UserGuide/dev-1.3/Tools-System/CLI_timecho.md b/src/zh/UserGuide/dev-1.3/Tools-System/CLI_timecho.md
deleted file mode 100644
index a4960ec6c..000000000
--- a/src/zh/UserGuide/dev-1.3/Tools-System/CLI_timecho.md
+++ /dev/null
@@ -1,170 +0,0 @@
-
-
-# SQL Command-Line Interface (CLI)
-
-IoTDB provides CLI/shell tools for starting the client and server programs. The following describes how each CLI/shell tool is run and which parameters it takes.
-> \$IOTDB\_HOME denotes the path of the IoTDB installation directory.
-
-## Running the CLI
-A freshly installed IoTDB has a default user, `root`, whose default password is `root`. You can use this user to run the IoTDB client and check that the server started correctly. The client startup script is the `start-cli` script in the $IOTDB_HOME/sbin folder. When starting the script, specify the IP and RPC PORT to connect to. The example below assumes that the server runs on the local machine and that its port has not been changed; the default port is 6667. To connect to a remote server, or if the server port has been changed, pass the server's IP and RPC PORT to -h and -p.
-You can also set your own environment variables, such as JAVA_HOME, at the top of the startup script (for Linux users the script path is "/sbin/start-cli.sh"; for Windows users it is "/sbin/start-cli.bat").
-
-Startup command on Linux and MacOS:
-
-```shell
-Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root
-```
-Startup command on Windows:
-
-```shell
-Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root
-```
-Press Enter to start the client. The startup succeeded if the following prompt appears.
-
-```
- _____       _________  ______   ______
-|_   _|     |  _   _  ||_   _ `.|_   _ \
-  | |   .--.|_/ | | \_|  | | `. \ | |_) |
-  | | / .'`\ \  | |      | |  | | |  __'.
- _| |_| \__. | _| |_    _| |_.' /_| |__) |
-|_____|'.__.' |_____|  |______.'|_______/  version
-
-Successfully login at 127.0.0.1:6667
-```
-Enter `quit` or `exit` to leave the CLI and end the session. The CLI prints `quit normally` when it exits successfully.
-
-## CLI Parameters
-
-
-| **Parameter** | **Type** | **Required** | **Description** | **Example** |
-|:---|:---|:---|:---|:---|
-| -h `` | string | No | IP address of the IoTDB server that the client connects to. Defaults to 127.0.0.1. | -h 127.0.0.1 |
-| -p `` | int | No | Port of the server that the client connects to. IoTDB uses 6667 by default. | -p 6667 |
-| -u `` | string | No | Username the client uses to connect to the server. Defaults to root. | -u root |
-| -pw `` | string | No | Password the client uses to connect to the server. Defaults to TimechoDB@2021 (root before V2.0.6). | -pw root |
-| -e `` | string | No | Operate on IoTDB in batch without entering the client's interactive mode. | -e "show databases" |
-| -c | none | No | Must be passed to the CLI if the server sets rpc_thrift_compression_enable=true. | -c |
-| -disableISO8601 | none | No | If set, IoTDB prints timestamps as numbers. | -disableISO8601 |
-| -usessl `` | Boolean | No | Whether to enable an SSL connection. | -usessl true |
-| -ts `` | string | No | Path of the SSL trust store. | -ts /path/to/truststore |
-| -tpw `` | string | No | Password of the SSL trust store. | -tpw myTrustPassword |
-| -timeout `` | int | No | Query timeout in seconds. If not set, the server's configuration is used. | -timeout 30 |
-| -help | none | No | Print the IoTDB help information. | -help |
-
-
-
-The following client command connects to the host with IP 10.129.187.21 on port 6667 with username root and password root, prints timestamps as numbers, and limits the IoTDB command line to displaying at most 10 rows.
-
-Startup command on Linux and MacOS:
-
-```shell
-Shell > bash sbin/start-cli.sh -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10
-```
-Startup command on Windows:
-
-```shell
-Shell > sbin\start-cli.bat -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10
-```
-
-## Special CLI Commands
-Some special CLI commands are listed below.
-
-| Command | Description / Example |
-|:---|:---|
-| `set time_display_type=xxx` | For example: long, default, ISO8601, yyyy-MM-dd HH:mm:ss |
-| `show time_display_type` | Show the time display type |
-| `set time_zone=xxx` | For example: +08:00, Asia/Shanghai |
-| `show time_zone` | Show the time zone of the CLI |
-| `set fetch_size=xxx` | Set the number of rows fetched per read when querying data from the server |
-| `show fetch_size` | Show the fetch size |
-| `set max_display_num=xxx` | Set the maximum number of rows the CLI displays at a time; -1 means no limit |
-| `help` | Show hints for the special CLI commands |
-| `exit/quit` | Exit the CLI |
-
-
-## Batch Operations with the CLI
-To operate on IoTDB in batch from a script through the CLI/shell, use the -e parameter. With this parameter you can operate on IoTDB without entering the client's interactive mode.
-
-To avoid confusing the SQL statement with other parameters, -e is currently supported only as the last parameter.
-
-The -e parameter of the CLI/shell tool is used as follows:
-
-Command on Linux and MacOS:
-
-```shell
-Shell > bash sbin/start-cli.sh -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb}
-```
-
-Command on Windows:
-```shell
-Shell > sbin\start-cli.bat -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb}
-```
-
-On Windows, the SQL statement passed to -e must be wrapped in ` `` ` instead of `" "`.
-
-The following example, executed on Linux, illustrates the use of the -e parameter.
-
-Suppose you want to perform the following operations on a newly started IoTDB:
-
-1. Create a database named root.demo
-
-2. Create a time series named root.demo.s1
-
-3. Insert three data points into the created time series
-
-4. Query the data to verify that the insertion succeeded
-
-With the -e parameter of the CLI/shell tool, this can be done with the following script:
-
-```shell
-#!/bin/bash
-
-host=127.0.0.1
-rpcPort=6667
-user=root
-pass=root
-
-bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "CREATE DATABASE root.demo"
-bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "create timeseries root.demo.s1 WITH DATATYPE=INT32, ENCODING=RLE"
-bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(1,10)"
-bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(2,11)"
-bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(3,12)"
-bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "select s1 from root.demo"
-```
-
-The printed results are shown below. Operations performed this way give the same results as the client's interactive mode and as operations through JDBC.
-
-```shell
-Shell > bash ./shell.sh
-+-----------------------------+------------+
-|                         Time|root.demo.s1|
-+-----------------------------+------------+
-|1970-01-01T08:00:00.001+08:00|          10|
-|1970-01-01T08:00:00.002+08:00|          11|
-|1970-01-01T08:00:00.003+08:00|          12|
-+-----------------------------+------------+
-Total line number = 3
-It costs 0.267s
-```
-
-Note that special characters must be escaped when the -e parameter is used in a script.
-
diff --git a/src/zh/UserGuide/dev-1.3/Tools-System/Maintenance-Tool_timecho.md b/src/zh/UserGuide/dev-1.3/Tools-System/Maintenance-Tool_timecho.md
deleted file mode 100644
index 63d69149f..000000000
--- a/src/zh/UserGuide/dev-1.3/Tools-System/Maintenance-Tool_timecho.md
+++ /dev/null
@@ -1,1013 +0,0 @@
-
-
-# Cluster Management Tool
-
-## Cluster Management Tool
-
-The IoTDB cluster management tool is an easy-to-use operation and maintenance tool (an enterprise-edition tool). It addresses the difficulty of operating the many nodes of a distributed IoTDB system and covers cluster deployment, cluster start and stop, elastic scaling, configuration updates, and data export, so that commands can be issued to a complex database cluster in a single step, which greatly reduces management effort. This document explains how to use the cluster management tool to remotely deploy, configure, start, and stop IoTDB cluster instances.
-
-### Environment Preparation
-
-This tool ships with TimechoDB (the enterprise database based on IoTDB). Contact your sales representative to obtain the download.
-
-The machines on which IoTDB is to be deployed require JDK 8 or later, lsof, netstat, and unzip. If any of these is missing, install it yourself; see the installation commands for the required environment in the last section of this document.
-
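The prerequisites listed above can be checked before deployment with a short script. The following Python sketch is only an illustration under stated assumptions: the helper name `missing_tools` is hypothetical, the check looks only at whether each executable is on `PATH`, and it does not verify that the JDK is version 8 or later.

```python
import shutil

# Executables named in the environment requirements above.
REQUIRED_TOOLS = ("java", "lsof", "netstat", "unzip")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites were found on PATH.")
```

Running it on each target machine gives a quick list of what still needs to be installed.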
-Note: the IoTDB cluster management tool must be run with an account that has root privileges.
-
-### Deployment
-
-#### Download and Installation
-
-This tool ships with TimechoDB (the enterprise database based on IoTDB). Contact your sales representative to obtain the download.
-
-Note: the binary package supports only GLIBC 2.17 and later, so the minimum supported OS version is CentOS 7.
-
-* Run the following command in the iotdbctl directory:
-
-```bash
-bash install-iotdbctl.sh
-```
-
-The iotdbctl keyword is then available in subsequent shells. For example, the command that checks the environment required before deployment is:
-
-```bash
-iotdbctl cluster check example
-```
-
-* Alternatively, without activating iotdbctl, run commands directly with &lt;iotdbctl absolute path&gt;/sbin/iotdbctl. For example, to check the environment required before deployment:
-
-```bash
-/sbin/iotdbctl cluster check example
-```
-
-### System Structure
-
-The IoTDB cluster management tool consists mainly of the config, logs, doc, and sbin directories.
-
-* `config` holds the configuration files of the clusters to deploy. To use the cluster deployment tool, edit the yaml files in this directory.
-* `logs` holds the deployment tool's logs. To view the tool's execution log, see `logs/iotd_yyyy_mm_dd.log`.
-* `sbin` holds the binaries that the cluster deployment tool needs.
-* `doc` holds the user manual, the developer manual, and the recommended deployment manual.
-
-
-### Cluster Configuration Files
-
-* The `iotdbctl/config` directory contains the yaml files for cluster configuration. The name of a yaml file is the name of the cluster, and there can be several yaml files. For convenience, a `default_cluster.yaml` example is provided in the `iotdbctl/config` directory.
-* A yaml file consists of five parts: `global`, `confignode_servers`, `datanode_servers`, `grafana_server`, and `prometheus_server`.
-* `global` holds the common settings, mainly the machine username and password, the local IoTDB installation files, and the JDK configuration. You can copy the `default_cluster.yaml` sample in `iotdbctl/config`, rename it to your own cluster name, and configure the IoTDB cluster by following the instructions inside it. In the sample, items that are not commented out are required, and items that are commented out are optional.
-
-For example, to run the check command against `default_cluster.yaml`, execute `iotdbctl cluster check default_cluster`.
-See the command list below for more commands.
-
-
-
-| Parameter | Description | Required |
-|---|---|---|
-| iotdb\_zip\_dir | IoTDB distribution directory for deployment. If empty, the package is downloaded from the address given by `iotdb_download_url` | Optional |
-| iotdb\_download\_url | IoTDB download address. Used when `iotdb_zip_dir` has no value | Optional |
-| jdk\_tar\_dir | Local JDK directory. This JDK can be uploaded and deployed to the target nodes. | Optional |
-| jdk\_deploy\_dir | JDK deployment directory on the remote machine. The JDK is deployed under this directory; together with the `jdk_dir_name` parameter below it forms the full JDK deployment directory, i.e. `/` | Optional |
-| jdk\_dir\_name | Name of the directory after the JDK is extracted. Defaults to jdk_iotdb | Optional |
-| iotdb\_lib\_dir | IoTDB lib directory, or a compressed IoTDB lib package (only .zip is supported). Used only for upgrading IoTDB and commented out by default; to upgrade, uncomment it and change the path. When using a zip file, compress the iotdb/lib directory with the zip command, for example zip -r lib.zip apache\-iotdb\-1.2.0/lib/* | Optional |
-| user | Username for logging in to the deployment machines over SSH | Required |
-| password | SSH login password. If password is not specified, pkey is used to log in; make sure passwordless SSH login between the nodes has been configured | Optional |
-| pkey | Key-based login. If password has a value, password takes precedence; otherwise pkey is used | Optional |
-| ssh\_port | SSH login port | Required |
-| iotdb\_admin_user | IoTDB service username. Defaults to root | Optional |
-| iotdb\_admin_password | IoTDB service password. Defaults to root | Optional |
-| deploy\_dir | IoTDB deployment directory. IoTDB is deployed under this directory; together with the `iotdb_dir_name` parameter below it forms the full IoTDB deployment directory, i.e. `/` | Required |
-| iotdb\_dir\_name | Name of the directory after IoTDB is extracted. Defaults to iotdb | Optional |
-| datanode-env.sh | Corresponds to `iotdb/config/datanode-env.sh`. When both `global` and `datanode_servers` set a value, the value in `datanode_servers` takes precedence | Optional |
-| confignode-env.sh | Corresponds to `iotdb/config/confignode-env.sh`. When both `global` and `confignode_servers` set a value, the value in `confignode_servers` takes precedence | Optional |
-| iotdb-common.properties | Corresponds to `iotdb/config/iotdb-common.properties` | Optional |
-| cn\_seed\_config\_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode\_x. When both `global` and `confignode_servers` set a value, the value in `confignode_servers` takes precedence. Corresponds to `cn_seed_config_node` in `iotdb/config/iotdb-system.properties` | Required |
-| dn\_seed\_config\_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode\_x. When both `global` and `datanode_servers` set a value, the value in `datanode_servers` takes precedence. Corresponds to `dn_seed_config_node` in `iotdb/config/iotdb-system.properties` | Required |
-
-datanode-env.sh and confignode-env.sh accept an additional parameter, extra_opts. When it is set, the corresponding values are appended to datanode-env.sh and confignode-env.sh. See default\_cluster.yaml for reference; a configuration example follows:
-datanode-env.sh:
-  extra_opts: |
-    IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC"
-    IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200"
-
-
-* `confignode_servers` configures the deployment of IoTDB ConfigNodes; multiple ConfigNodes can be configured in it.
-  By default, the first ConfigNode started, node1, is treated as the Seed-ConfigNode.
-
-| Parameter | Description | Required |
-|---|---|---|
-| name | ConfigNode name | Required |
-| deploy\_dir | IoTDB ConfigNode deployment directory | Required |
-| cn\_internal\_address | Internal communication address; corresponds to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | Required |
-| cn\_seed\_config\_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode_x. When both `global` and `confignode_servers` set a value, the value in `confignode_servers` takes precedence. Corresponds to `cn_seed_config_node` in `iotdb/config/iotdb-confignode.properties` | Required |
-| cn\_internal\_port | Internal communication port; corresponds to `cn_internal_port` in `iotdb/config/iotdb-system.properties` | Required |
-| cn\_consensus\_port | Corresponds to `cn_consensus_port` in `iotdb/config/iotdb-system.properties` | Optional |
-| cn\_data\_dir | Corresponds to `cn_data_dir` in `iotdb/config/iotdb-system.properties` | Required |
-| iotdb-system.properties | Corresponds to `iotdb/config/iotdb-system.properties`. When both `global` and `confignode_servers` set a value, the value in confignode\_servers takes precedence | Optional |
-
-* `datanode_servers` configures the deployment of IoTDB DataNodes; multiple DataNodes can be configured in it.
-
-| Parameter | Description | Required |
-|---|---|---|
-| name | DataNode name | Required |
-| deploy_dir | IoTDB DataNode deployment directory. Note: this directory must not be the same as the IoTDB ConfigNode deployment directory | Required |
-| dn_rpc_address | DataNode RPC address; corresponds to `dn_rpc_address` in `iotdb/config/iotdb-system.properties` | Required |
-| dn_internal_address | Internal communication address; corresponds to `dn_internal_address` in `iotdb/config/iotdb-system.properties` | Required |
-| dn_seed_config_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode_x. When both `global` and `datanode_servers` set a value, the value in `datanode_servers` takes precedence. Corresponds to `dn_seed_config_node` in `iotdb/config/iotdb-datanode.properties`. Using the SeedConfigNode is recommended | Required |
-| dn_rpc_port | DataNode RPC port; corresponds to `dn_rpc_port` in `iotdb/config/iotdb-system.properties` | Required |
-| dn_internal_port | Internal communication port; corresponds to `dn_internal_port` in `iotdb/config/iotdb-system.properties` | Required |
-| iotdb-system.properties | Corresponds to `iotdb/config/iotdb-system.properties`. When both `global` and `datanode_servers` set a value, the value in `datanode_servers` takes precedence | Optional |
-
-* `grafana_server` configures the deployment of Grafana.
-
-| Parameter | Description | Required |
-|---|---|---|
-| grafana\_dir\_name | Name of the directory after Grafana is extracted | Optional; defaults to grafana_iotdb |
-| host | IP of the server on which Grafana is deployed | Required |
-| grafana\_port | Port of the machine on which Grafana is deployed | Optional; defaults to 3000 |
-| deploy\_dir | Grafana deployment directory on the server | Required |
-| grafana\_tar\_dir | Location of the Grafana archive | Required |
-| dashboards | Location of the dashboards | Optional; separate multiple entries with commas |
-
-* `prometheus_server` configures the deployment of Prometheus.
-
-| Parameter | Description | Required |
-|---|---|---|
-| prometheus_dir\_name | Name of the directory after Prometheus is extracted | Optional; defaults to prometheus_iotdb |
-| host | IP of the server on which Prometheus is deployed | Required |
-| prometheus\_port | Port of the machine on which Prometheus is deployed | Optional; defaults to 9090 |
-| deploy\_dir | Prometheus deployment directory on the server | Required |
-| prometheus\_tar\_dir | Location of the Prometheus archive | Required |
-| storage\_tsdb\_retention\_time | Number of days data is retained; defaults to 15 days | Optional |
-| storage\_tsdb\_retention\_size | Maximum size of data that blocks may hold; defaults to 512M. Note the units: KB, MB, GB, TB, PB, EB | Optional |
-
-If metrics are configured under `iotdb-system.properties` in config/xxx.yaml, the configuration is applied to Prometheus automatically and no manual change is needed.
-
-Note: if the value of a yaml key contains special characters such as `:`, wrap the whole value in double quotes. Do not use file paths that contain spaces, to avoid parsing errors.
-
-### Usage Scenarios
-
-#### Cleaning Data
-
-* The data-cleaning scenario deletes the data directory of the IoTDB cluster and the following directories configured in the yaml file: `cn_system_dir`, `cn_consensus_dir`,
-
`dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`和`ext`目录。 -* 首先执行停止集群命令、然后在执行集群清理命令。 -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster clean default_cluster -``` - -#### 集群销毁场景 - -* 集群销毁场景会删除IoTDB集群中的`data`、`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`、`ext`、`IoTDB`部署目录、 - grafana部署目录和prometheus部署目录。 -* 首先执行停止集群命令、然后在执行集群销毁命令。 - - -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster destroy default_cluster -``` - -#### 集群升级场景 - -* 集群升级首先需要在config/xxx.yaml中配置`iotdb_lib_dir`为要上传到服务器的jar所在目录路径(例如iotdb/lib)。 -* 如果使用zip文件上传请使用zip 命令压缩iotdb/lib目录例如 zip -r lib.zip apache-iotdb-1.2.0/lib/* -* 执行上传命令、然后执行重启IoTDB集群命令即可完成集群升级 - -```bash -iotdbctl cluster dist-lib default_cluster -iotdbctl cluster restart default_cluster -``` - -#### 集群配置文件的热部署场景 - -* 首先修改在config/xxx.yaml中配置。 -* 执行分发命令、然后执行热部署命令即可完成集群配置的热部署 - -```bash -iotdbctl cluster dist-conf default_cluster -iotdbctl cluster reload default_cluster -``` - -#### 集群扩容场景 - -* 首先修改在config/xxx.yaml中添加一个datanode 或者confignode 节点。 -* 执行集群扩容命令 -```bash -iotdbctl cluster scaleout default_cluster -``` - -#### 集群缩容场景 - -* 首先在config/xxx.yaml中找到要缩容的节点名字或者ip+port(其中confignode port 是cn_internal_port、datanode port 是rpc_port) -* 执行集群缩容命令 -```bash -iotdbctl cluster scalein default_cluster -``` - -#### 已有IoTDB集群,使用集群部署工具场景 - -* 配置服务器的`user`、`passwod`或`pkey`、`ssh_port` -* 修改config/xxx.yaml中IoTDB 部署路径,`deploy_dir`(IoTDB 部署目录)、`iotdb_dir_name`(IoTDB解压目录名称,默认是iotdb) - 例如IoTDB 部署完整路径是`/home/data/apache-iotdb-1.1.1`则需要修改yaml文件`deploy_dir:/home/data/`、`iotdb_dir_name:apache-iotdb-1.1.1` -* 如果服务器不是使用的java_home则修改`jdk_deploy_dir`(jdk 部署目录)、`jdk_dir_name`(jdk解压后的目录名称,默认是jdk_iotdb),如果使用的是java_home 则不需要修改配置 - 例如jdk部署完整路径是`/home/data/jdk_1.8.2`则需要修改yaml文件`jdk_deploy_dir:/home/data/`、`jdk_dir_name:jdk_1.8.2` -* 配置`cn_seed_config_node`、`dn_seed_config_node` -* 
配置`confignode_servers`中`iotdb-system.properties`里面的`cn_internal_address`、`cn_internal_port`、`cn_consensus_port`、`cn_system_dir`、 - `cn_consensus_dir`里面的值不是IoTDB默认的则需要配置否则可不必配置 -* 配置`datanode_servers`中`iotdb-system.properties`里面的`dn_rpc_address`、`dn_internal_address`、`dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`等 -* 执行初始化命令 - -```bash -iotdbctl cluster init default_cluster -``` - -#### 一键部署IoTDB、Grafana和Prometheus 场景 - -* 配置`iotdb-system.properties` 打开metrics接口 -* 配置Grafana 配置,如果`dashboards` 有多个就用逗号隔开,名字不能重复否则会被覆盖。 -* 配置Prometheus配置,IoTDB 集群配置了metrics 则无需手动修改Prometheus 配置会根据哪个节点配置了metrics,自动修改Prometheus 配置。 -* 启动集群 - -```bash -iotdbctl cluster start default_cluster -``` - -更加详细参数请参考上方的 集群配置文件介绍 - - -### 命令格式 - -本工具的基本用法为: -```bash -iotdbctl cluster [params (Optional)] -``` -* key 表示了具体的命令。 - -* cluster name 表示集群名称(即`iotdbctl/config` 文件中yaml文件名字)。 - -* params 表示了命令的所需参数(选填)。 - -* 例如部署default_cluster集群的命令格式为: - -```bash -iotdbctl cluster deploy default_cluster -``` - -* 集群的功能及参数列表如下: - -| 命令 | 功能 | 参数 | -|-----------------|-------------------------------|-------------------------------------------------------------------------------------------------------------------------| -| check | 检测集群是否可以部署 | 集群名称列表 | -| clean | 清理集群 | 集群名称 | -| deploy/dist-all | 部署集群 | 集群名称 ,-N,模块名称(iotdb、grafana、prometheus可选),-op force(可选) | -| list | 打印集群及状态列表 | 无 | -| start | 启动集群 | 集群名称,-N,节点名称(nodename、grafana、prometheus可选) | -| stop | 关闭集群 | 集群名称,-N,节点名称(nodename、grafana、prometheus可选) ,-op force(nodename、grafana、prometheus可选) | -| restart | 重启集群 | 集群名称,-N,节点名称(nodename、grafana、prometheus可选),-op force(强制停止)/rolling(滚动重启) | -| show | 查看集群信息,details字段表示展示集群信息细节 | 集群名称, details(可选) | -| destroy | 销毁集群 | 集群名称,-N,模块名称(iotdb、grafana、prometheus可选) | -| scaleout | 集群扩容 | 集群名称 | -| scalein | 集群缩容 | 集群名称,-N,集群节点名字或集群节点ip+port | -| reload | 集群热加载 | 集群名称 | -| dist-conf | 集群配置文件分发 | 集群名称 | -| dumplog | 备份指定集群日志 | 集群名称,-N,集群节点名字 -h 备份至目标机器ip -pw 备份至目标机器密码 -p 备份至目标机器端口 -path 备份的目录 -startdate 起始时间 
-enddate 结束时间 -loglevel 日志类型 -l 传输速度 | -| dumpdata | 备份指定集群数据 | 集群名称, -h 备份至目标机器ip -pw 备份至目标机器密码 -p 备份至目标机器端口 -path 备份的目录 -startdate 起始时间 -enddate 结束时间 -l 传输速度 | -| dist-lib | lib 包升级 | 集群名字(升级完后请重启) | -| init | 已有集群使用集群部署工具时,初始化集群配置 | 集群名字,初始化集群配置 | -| status | 查看进程状态 | 集群名字 | -| acitvate | 激活集群 | 集群名字 | -| dist-plugin | 上传plugin(udf,trigger,pipe)到集群 | 集群名字,-type 类型 U(udf)/T(trigger)/P(pipe) -file /xxxx/trigger.jar,上传完成后需手动执行创建udf、pipe、trigger命令 | -| upgrade | 滚动升级 | 集群名字 | -| health_check | 健康检查 | 集群名字,-N,节点名称(可选) | -| backup | 停机备份 | 集群名字,-N,节点名称(可选) | -| importschema | 元数据导入 | 集群名字,-N,节点名称(必填) -param 参数 | -| exportschema | 元数据导出 | 集群名字,-N,节点名称(必填) -param 参数 | - - -### 详细命令执行过程 - -下面的命令都是以default_cluster.yaml 为示例执行的,用户可以修改成自己的集群文件来执行 - -#### 检查集群部署环境命令 - -```bash -iotdbctl cluster check default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 验证目标节点是否能够通过 SSH 登录 - -* 验证对应节点上的 JDK 版本是否满足IoTDB jdk1.8及以上版本、服务器是否按照unzip、是否安装lsof 或者netstat - -* 如果看到下面提示`Info:example check successfully!` 证明服务器已经具备安装的要求, - 如果输出`Error:example check fail!` 证明有部分条件没有满足需求可以查看上面的输出的Error日志(例如:`Error:Server (ip:172.20.31.76) iotdb port(10713) is listening`)进行修复, - 如果检查jdk没有满足要求,我们可以自己在yaml 文件中配置一个jdk1.8 及以上版本的进行部署不影响后面使用, - 如果检查lsof、netstat或者unzip 不满足要求需要在服务器上自行安装。 - -#### 部署集群命令 - -```bash -iotdbctl cluster deploy default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 根据`confignode_servers` 和`datanode_servers`中的节点信息上传IoTDB压缩包和jdk压缩包(如果yaml中配置`jdk_tar_dir`和`jdk_deploy_dir`值) - -* 根据yaml文件节点配置信息生成并上传`iotdb-system.properties` - -```bash -iotdbctl cluster deploy default_cluster -op force -``` -注意:该命令会强制执行部署,具体过程会删除已存在的部署目录重新部署 - -*部署单个模块* -```bash -# 部署grafana模块 -iotdbctl cluster deploy default_cluster -N grafana -# 部署prometheus模块 -iotdbctl cluster deploy default_cluster -N prometheus -# 部署iotdb模块 -iotdbctl cluster deploy default_cluster -N iotdb -``` - -#### 启动集群命令 - -```bash -iotdbctl 
cluster start default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 启动confignode,根据yaml配置文件中`confignode_servers`中的顺序依次启动同时根据进程id检查confignode是否正常,第一个confignode 为seek config - -* 启动datanode,根据yaml配置文件中`datanode_servers`中的顺序依次启动同时根据进程id检查datanode是否正常 - -* 如果根据进程id检查进程存在后,通过cli依次检查集群列表中每个服务是否正常,如果cli链接失败则每隔10s重试一次直到成功最多重试5次 - - -*启动单个节点命令* -```bash -#按照IoTDB 节点名称启动 -iotdbctl cluster start default_cluster -N datanode_1 -#按照IoTDB 集群ip+port启动,其中port对应confignode的cn_internal_port、datanode的rpc_port -iotdbctl cluster start default_cluster -N 192.168.1.5:6667 -#启动grafana -iotdbctl cluster start default_cluster -N grafana -#启动prometheus -iotdbctl cluster start default_cluster -N prometheus -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件 - -* 根据提供的节点名称或者ip:port找到对于节点位置信息,如果启动的节点是`data_node`则ip使用yaml 文件中的`dn_rpc_address`、port 使用的是yaml文件中datanode_servers 中的`dn_rpc_port`。 - 如果启动的节点是`config_node`则ip使用的是yaml文件中confignode_servers 中的`cn_internal_address` 、port 使用的是`cn_internal_port` - -* 启动该节点 - -说明:由于集群部署工具仅是调用了IoTDB集群中的start-confignode.sh和start-datanode.sh 脚本, -在实际输出结果失败时有可能是集群还未正常启动,建议使用status命令进行查看当前集群状态(iotdbctl cluster status xxx) - - -#### 查看IoTDB集群状态命令 - -```bash -iotdbctl cluster show default_cluster -#查看IoTDB集群详细信息 -iotdbctl cluster show default_cluster details -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 依次在datanode通过cli执行`show cluster details` 如果有一个节点执行成功则不会在后续节点继续执行cli直接返回结果 - - -#### 停止集群命令 - - -```bash -iotdbctl cluster stop default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 根据`datanode_servers`中datanode节点信息,按照配置先后顺序依次停止datanode节点 - -* 根据`confignode_servers`中confignode节点信息,按照配置依次停止confignode节点 - -*强制停止集群命令* - -```bash -iotdbctl cluster stop default_cluster -op force -``` -会直接执行kill -9 pid 命令强制停止集群 - -*停止单个节点命令* - -```bash -#按照IoTDB 节点名称停止 -iotdbctl cluster stop default_cluster -N datanode_1 -#按照IoTDB 
集群ip+port停止(ip+port是按照datanode中的ip+dn_rpc_port获取唯一节点或confignode中的ip+cn_internal_port获取唯一节点) -iotdbctl cluster stop default_cluster -N 192.168.1.5:6667 -#停止grafana -iotdbctl cluster stop default_cluster -N grafana -#停止prometheus -iotdbctl cluster stop default_cluster -N prometheus -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件 - -* 根据提供的节点名称或者ip:port找到对应节点位置信息,如果停止的节点是`data_node`则ip使用yaml 文件中的`dn_rpc_address`、port 使用的是yaml文件中datanode_servers 中的`dn_rpc_port`。 - 如果停止的节点是`config_node`则ip使用的是yaml文件中confignode_servers 中的`cn_internal_address` 、port 使用的是`cn_internal_port` - -* 停止该节点 - -说明:由于集群部署工具仅是调用了IoTDB集群中的stop-confignode.sh和stop-datanode.sh 脚本,在某些情况下有可能iotdb集群并未停止。 - - -#### 清理集群数据命令 - -```bash -iotdbctl cluster clean default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`配置信息 - -* 根据`confignode_servers`、`datanode_servers`中的信息,检查是否还有服务正在运行, - 如果有任何一个服务正在运行则不会执行清理命令 - -* 删除IoTDB集群中的data目录以及yaml文件中配置的`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`和`ext`目录。 - - - -#### 重启集群命令 - -```bash -iotdbctl cluster restart default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 - -* 执行上述的停止集群命令(stop),然后执行启动集群命令(start) 具体参考上面的start 和stop 命令 - -*强制重启集群命令* - -```bash -iotdbctl cluster restart default_cluster -op force -``` -会直接执行kill -9 pid 命令强制停止集群,然后启动集群 - -*重启单个节点命令* - -```bash -#按照IoTDB 节点名称重启datanode_1 -iotdbctl cluster restart default_cluster -N datanode_1 -#按照IoTDB 节点名称重启confignode_1 -iotdbctl cluster restart default_cluster -N confignode_1 -#重启grafana -iotdbctl cluster restart default_cluster -N grafana -#重启prometheus -iotdbctl cluster restart default_cluster -N prometheus -``` - -#### 集群缩容命令 - -```bash -#按照节点名称缩容 -iotdbctl cluster scalein default_cluster -N nodename -#按照ip+port缩容(ip+port按照datanode中的ip+dn_rpc_port获取唯一节点,confignode中的ip+cn_internal_port获取唯一节点) -iotdbctl cluster scalein default_cluster -N ip:port -``` -* 根据 
cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 判断要缩容的confignode节点和datanode是否只剩一个,如果只剩一个则不能执行缩容 - -* 然后根据ip:port或者nodename 获取要缩容的节点信息,执行缩容命令,然后销毁该节点目录,如果缩容的节点是`data_node`则ip使用yaml 文件中的`dn_rpc_address`、port 使用的是yaml文件中datanode_servers 中的`dn_rpc_port`。 - 如果缩容的节点是`config_node`则ip使用的是yaml文件中confignode_servers 中的`cn_internal_address` 、port 使用的是`cn_internal_port` - - -提示:目前一次仅支持一个节点缩容 - -#### 集群扩容命令 - -```bash -iotdbctl cluster scaleout default_cluster -``` -* 修改config/xxx.yaml 文件添加一个datanode 节点或者confignode节点 - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 找到要扩容的节点,执行上传IoTDB压缩包和jdb包(如果yaml中配置`jdk_tar_dir`和`jdk_deploy_dir`值)并解压 - -* 根据yaml文件节点配置信息生成并上传`iotdb-system.properties` - -* 执行启动该节点命令并校验节点是否启动成功 - -提示:目前一次仅支持一个节点扩容 - -#### 销毁集群命令 -```bash -iotdbctl cluster destroy default_cluster -``` - -* cluster-name 找到默认位置的 yaml 文件 - -* 根据`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`中node节点信息,检查是否节点还在运行, - 如果有任何一个节点正在运行则停止销毁命令 - -* 删除IoTDB集群中的`data`以及yaml文件配置的`cn_system_dir`、`cn_consensus_dir`、 - `dn_data_dirs`、`dn_consensus_dir`、`dn_system_dir`、`logs`、`ext`、`IoTDB`部署目录、 - grafana部署目录和prometheus部署目录 - -*销毁单个模块* -```bash -# 销毁grafana模块 -iotdbctl cluster destroy default_cluster -N grafana -# 销毁prometheus模块 -iotdbctl cluster destroy default_cluster -N prometheus -# 销毁iotdb模块 -iotdbctl cluster destroy default_cluster -N iotdb -``` - -#### 分发集群配置命令 -```bash -iotdbctl cluster dist-conf default_cluster -``` - -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 - -* 根据yaml文件节点配置信息生成并依次上传`iotdb-system.properties`到指定节点 - -#### 热加载集群配置命令 -```bash -iotdbctl cluster reload default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 根据yaml文件节点配置信息依次在cli中执行`load configuration` - -#### 集群节点日志备份 -```bash -iotdbctl cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate 
'2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs' -``` -* 根据 cluster-name 找到默认位置的 yaml 文件 - -* 该命令会根据yaml文件校验datanode_1,confignode_1 是否存在,然后根据配置的起止日期(startdate<=logtime<=enddate)备份指定节点datanode_1,confignode_1 的日志数据到指定服务`192.168.9.48` 端口`36000` 数据备份路径是 `/iotdb/logs` ,IoTDB日志存储路径在`/root/data/db/iotdb/logs`(非必填,如果不填写-logs xxx 默认从IoTDB安装路径/logs下面备份日志) - -| 命令 | 功能 | 是否必填 | -|------------|------------------------------------| ---| -| -h | 存放备份数据机器ip |否| -| -u | 存放备份数据机器用户名 |否| -| -pw | 存放备份数据机器密码 |否| -| -p | 存放备份数据机器端口(默认22) |否| -| -path | 存放备份数据的路径(默认当前路径) |否| -| -loglevel | 日志基本有all、info、error、warn(默认是全部) |否| -| -l | 限速(默认不限速范围0到104857601 单位Kbit/s) |否| -| -N | 配置文件集群名称多个用逗号隔开 |是| -| -startdate | 起始时间(包含默认1970-01-01) |否| -| -enddate | 截止时间(包含) |否| -| -logs | IoTDB 日志存放路径,默认是({iotdb}/logs) |否| - -#### 集群节点数据备份 -```bash -iotdbctl cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas' -``` -* 该命令会根据yaml文件获取leader 节点,然后根据起止日期(startdate<=logtime<=enddate)备份数据到192.168.9.48 服务上的/iotdb/datas 目录下 - -| 命令 | 功能 | 是否必填 | -| ---|---------------------------------| ---| -|-h| 存放备份数据机器ip |否| -|-u| 存放备份数据机器用户名 |否| -|-pw| 存放备份数据机器密码 |否| -|-p| 存放备份数据机器端口(默认22) |否| -|-path| 存放备份数据的路径(默认当前路径) |否| -|-granularity| 类型partition |是| -|-l| 限速(默认不限速范围0到104857601 单位Kbit/s) |否| -|-startdate| 起始时间(包含) |是| -|-enddate| 截止时间(包含) |是| - -#### 集群lib包上传(升级) -```bash -iotdbctl cluster dist-lib default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 上传lib包 - -注意执行完升级后请重启IoTDB 才能生效 - -#### 集群初始化 -```bash -iotdbctl cluster init default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 -* 初始化集群配置 - -#### 查看集群进程状态 -```bash -iotdbctl cluster status default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 
文件,获取`confignode_servers`、`datanode_servers`、`grafana`、`prometheus`配置信息 -* 展示集群的存活状态 - -#### 集群授权激活 - -集群激活默认是通过输入激活码激活,也可以通过-op license_path 通过license路径激活 - -* 默认激活方式 -```bash -iotdbctl cluster activate default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`配置信息 -* 读取里面的机器码 -* 等待输入激活码 - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* 激活单个节点 - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -``` - -* 通过license路径方式激活 - -```bash -iotdbctl cluster activate default_cluster -op license_path -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`配置信息 -* 读取里面的机器码 -* 等待输入激活码 - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* 激活单个节点 - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -op license_path -``` - -* 通过license路径方式激活 - -```bash -iotdbctl cluster activate default_cluster -op license_path -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`配置信息 -* 读取里面的机器码 -* 等待输入激活码 - -### 集群plugin分发 -```bash -#分发udf -iotdbctl cluster dist-plugin default_cluster -type U -file /xxxx/udf.jar -#分发trigger -iotdbctl cluster dist-plugin default_cluster -type T -file /xxxx/trigger.jar -#分发pipe -iotdbctl cluster dist-plugin default_cluster -type P -file /xxxx/pipe.jar -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取 `datanode_servers`配置信息 - -* 上传udf/trigger/pipe jar包 - -上传完成后需要手动执行创建udf/trigger/pipe命令 - -### 集群滚动升级 -```bash -iotdbctl cluster upgrade default_cluster -``` 
-* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 - -* 上传lib包 -* confignode 执行停止、替换lib包、启动,然后datanode执行停止、替换lib包、启动 - - - -### 集群健康检查 -```bash -iotdbctl cluster health_check default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 -* 每个节点执行health_check.sh - -* 单个节点健康检查 -```bash -iotdbctl cluster health_check default_cluster -N datanode_1 -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`datanode_servers`配置信息 -* datanode1 执行health_check.sh - - -### 集群停机备份 -```bash -iotdbctl cluster backup default_cluster -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`confignode_servers`和`datanode_servers`配置信息 -* 每个节点执行backup.sh - -* 单个节点健康检查 -```bash -iotdbctl cluster backup default_cluster -N datanode_1 -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`datanode_servers`配置信息 -* datanode1 执行backup.sh - -说明:多个节点部署到单台机器,只支持 quick 模式 - -### 集群元数据导入 - -```bash -iotdbctl cluster importschema default_cluster -N datanode1 -param "-s ./dump0.csv -fd ./failed/ -lpf 10000" -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`datanode_servers`配置信息 -* datanode1 执行元数据导入import-schema.sh - -其中 -param的参数如下: - -| 命令 | 功能 | 是否必填 | -|-----|---------------------------------|------| -| -s |指定想要导入的数据文件,这里可以指定文件或者文件夹。如果指定的是文件夹,将会把文件夹中所有的后缀为csv的文件进行批量导入。 | 是 | -| -fd |指定一个目录来存放导入失败的文件,如果没有指定这个参数,失败的文件将会被保存到源数据的目录中,文件名为是源文件名加上.failed的后缀。 | 否 | -| -lpf |用于指定每个导入失败文件写入数据的行数,默认值为10000 | 否 | - - - -### 集群元数据导出 - -```bash -iotdbctl cluster exportschema default_cluster -N datanode1 -param "-t ./ -pf ./pattern.txt -lpf 10 -t 10000" -``` -* 根据 cluster-name 找到默认位置的 yaml 文件,获取`datanode_servers`配置信息 -* datanode1 执行元数据导入export-schema.sh - -其中 -param的参数如下: - -| 命令 | 功能 | 是否必填 | -|-----|------------------------------------------------------------|------| -| -t | 为导出的CSV文件指定输出路径 | 是 | -| -path |指定导出元数据的path pattern,指定该参数后会忽略-s参数例如:root.stock.** | 否 | -| -pf |如果未指定-path,则需指定该参数,指定查询元数据路径所在文件路径,支持 txt 文件格式,每个待导出的路径为一行。 | 否 | -| -lpf |指定导出的dump文件最大行数,默认值为10000。 | 否 | -| -timeout 
|指定session查询时的超时时间,单位为ms | 否 | - - - -### 集群部署工具样例介绍 -在集群部署工具安装目录中config/example 下面有3个yaml样例,如果需要可以复制到config 中进行修改即可 - -| 名称 | 说明 | -|-----------------------------|------------------------------------------------| -| default\_1c1d.yaml | 1个confignode和1个datanode 配置样例 | -| default\_3c3d.yaml | 3个confignode和3个datanode 配置样例 | -| default\_3c3d\_grafa\_prome | 3个confignode和3个datanode、Grafana、Prometheus配置样例 | - -## 数据文件夹概览工具 - -IoTDB数据文件夹概览工具用于打印出数据文件夹的结构概览信息,工具位置为 tools/tsfile/print-iotdb-data-dir。 - -### 用法 - -- Windows: - -```bash -.\print-iotdb-data-dir.bat (<输出结果的存储路径>) -``` - -- Linux or MacOs: - -```shell -./print-iotdb-data-dir.sh (<输出结果的存储路径>) -``` - -注意:如果没有设置输出结果的存储路径, 将使用相对路径"IoTDB_data_dir_overview.txt"作为默认值。 - -### 示例 - -以Windows系统为例: - -`````````````````````````bash -.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data -```````````````````````` -Starting Printing the IoTDB Data Directory Overview -```````````````````````` -output save path:IoTDB_data_dir_overview.txt -data dir num:1 -143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
-|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -````````````````````````` - -## TsFile概览工具 - -TsFile概览工具用于以概要模式打印出一个TsFile的内容,工具位置为 tools/tsfile/print-tsfile。 - -### 用法 - -- Windows: - -```bash -.\print-tsfile-sketch.bat (<输出结果的存储路径>) -``` - -- Linux or MacOs: - -```shell -./print-tsfile-sketch.sh (<输出结果的存储路径>) -``` - -注意:如果没有设置输出结果的存储路径, 将使用相对路径"TsFile_sketch_view.txt"作为默认值。 - -### 示例 - -以Windows系统为例: - -`````````````````````````bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
--------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] 
offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - └──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -````````````````````````` - -解释: - -- 以"|"为分隔,左边是在TsFile文件中的实际位置,右边是梗概内容。 -- "|||||||||||||||||||||"是为增强可读性而添加的导引信息,不是TsFile中实际存储的数据。 -- 最后打印的"IndexOfTimerseriesIndex Tree"是对TsFile文件末尾的元数据索引树的重新整理打印,便于直观理解,不是TsFile中存储的实际数据。 - -## TsFile Resource概览工具 - -TsFile resource概览工具用于打印出TsFile resource文件的内容,工具位置为 tools/tsfile/print-tsfile-resource-files。 - -### 用法 - -- Windows: - -```bash -.\print-tsfile-resource-files.bat -``` - -- Linux or MacOs: - -``` -./print-tsfile-resource-files.sh -``` - -### 示例 - -以Windows系统为例: - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` 
-Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. -````````````````````````` - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. 
-188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. -````````````````````````` diff --git a/src/zh/UserGuide/dev-1.3/Tools-System/Monitor-Tool_timecho.md b/src/zh/UserGuide/dev-1.3/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index d3cc9dceb..000000000 --- a/src/zh/UserGuide/dev-1.3/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,170 +0,0 @@ - - - -# 监控工具 - -监控工具的部署可参考文档 [监控面板部署](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) 章节。 - -## 监控指标的 Prometheus 映射关系 - -> 对于 Metric Name 为 name, Tags 为 K1=V1, ..., Kn=Vn 的监控指标有如下映射,其中 value 为具体值 - -| 监控指标类型 | 映射关系 | -| ---------------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Counter | name_total{cluster="clusterName", 
nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value | -| AutoGauge、Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value | -| Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value | -| Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m1"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m5"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m15"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="mean"} value | -| Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value | - -## 修改配置文件 - -1) 以 DataNode 为例,修改 iotdb-system.properties 配置文件如下: - -```properties -dn_metric_reporter_list=PROMETHEUS -dn_metric_level=CORE -dn_metric_prometheus_reporter_port=9091 -``` - -2) 启动 IoTDB DataNode - -3) 打开浏览器或者用```curl``` 访问 ```http://servier_ip:9091/metrics```, 就能得到如下 metric 数据: - -``` -... -# HELP file_count -# TYPE file_count gauge -file_count{name="wal",} 0.0 -file_count{name="unseq",} 0.0 -file_count{name="seq",} 2.0 -... -``` - -## Prometheus + Grafana - -如上所示,IoTDB 对外暴露出标准的 Prometheus 格式的监控指标数据,可以使用 Prometheus 采集并存储监控指标,使用 Grafana -可视化监控指标。 - -IoTDB、Prometheus、Grafana三者的关系如下图所示: - -![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png) - -1. IoTDB在运行过程中持续收集监控指标数据。 -2. Prometheus以固定的间隔(可配置)从IoTDB的HTTP接口拉取监控指标数据。 -3. Prometheus将拉取到的监控指标数据存储到自己的TSDB中。 -4. Grafana以固定的间隔(可配置)从Prometheus查询监控指标数据并绘图展示。 - -从交互流程可以看出,我们需要做一些额外的工作来部署和配置Prometheus和Grafana。 - -比如,你可以对Prometheus进行如下的配置(部分参数可以自行调整)来从IoTDB获取监控数据 - -```yaml -job_name: pull-metrics -honor_labels: true -honor_timestamps: true -scrape_interval: 15s -scrape_timeout: 10s -metrics_path: /metrics -scheme: http -follow_redirects: true -static_configs: - - targets: - - localhost:9091 -``` - -更多细节可以参考下面的文档: - -[Prometheus安装使用文档](https://prometheus.io/docs/prometheus/latest/getting_started/) - -[Prometheus从HTTP接口拉取metrics数据的配置说明](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) - -[Grafana安装使用文档](https://grafana.com/docs/grafana/latest/getting-started/getting-started/) - -[Grafana从Prometheus查询数据并绘图的文档](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus) - -## Apache IoTDB Dashboard - -我们提供了Apache IoTDB Dashboard,支持统一集中式运维管理,可通过一个监控面板监控多个集群。 - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png) - -![Apache IoTDB 
Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png) - -你可以在企业版中获取到 Dashboard 的 Json文件。 - -### 集群概览 - -可以监控包括但不限于: -- 集群总CPU核数、总内存空间、总硬盘空间 -- 集群包含多少个ConfigNode与DataNode -- 集群启动时长 -- 集群写入速度 -- 集群各节点当前CPU、内存、磁盘使用率 -- 分节点的信息 - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png) - -### 数据写入 - -可以监控包括但不限于: -- 写入平均耗时、耗时中位数、99%分位耗时 -- WAL文件数量与尺寸 -- 节点 WAL flush SyncBuffer 耗时 - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png) - -### 数据查询 - -可以监控包括但不限于: -- 节点查询加载时间序列元数据耗时 -- 节点查询读取时间序列耗时 -- 节点查询修改时间序列元数据耗时 -- 节点查询加载Chunk元数据列表耗时 -- 节点查询修改Chunk元数据耗时 -- 节点查询按照Chunk元数据过滤耗时 -- 节点查询构造Chunk Reader耗时的平均值 - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png) - -### 存储引擎 - -可以监控包括但不限于: -- 分类型的文件数量、大小 -- 处于各阶段的TsFile数量、大小 -- 各类任务的数量与耗时 - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png) - -### 系统监控 - -可以监控包括但不限于: -- 系统内存、交换内存、进程内存 -- 磁盘空间、文件数、文件尺寸 -- JVM GC时间占比、分类型的GC次数、GC数据量、各年代的堆内存占用 -- 网络传输速率、包发送速率 - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png) diff --git a/src/zh/UserGuide/dev-1.3/Tools-System/Workbench_timecho.md b/src/zh/UserGuide/dev-1.3/Tools-System/Workbench_timecho.md deleted file mode 100644 index 2eb3f47b4..000000000 --- a/src/zh/UserGuide/dev-1.3/Tools-System/Workbench_timecho.md +++ /dev/null @@ -1,34 +0,0 @@ -# 可视化控制台 - -可视化控制台的部署可参考文档 [可视化控制台部署](../Deployment-and-Maintenance/workbench-deployment_timecho.md) 章节。 - -## 第1章 产品介绍 -IoTDB可视化控制台是在IoTDB企业版时序数据库基础上针对工业场景的实时数据收集、存储与分析一体化的数据管理场景开发的扩展组件,旨在为用户提供高效、可靠的实时数据存储和查询解决方案。它具有体量轻、性能高、易使用的特点,完美对接 Hadoop 与 Spark 生态,适用于工业物联网应用中海量时间序列数据高速写入和复杂分析查询的需求。 - -## 第2章 使用说明 -IoTDB的可视化控制台包含以下功能模块: -| **功能模块** | **功能说明** | -| ------------ | ------------------------------------------------------------ | -| 实例管理 | 支持对连接实例进行统一管理,支持创建、编辑和删除,同时可以可视化呈现多实例的关系,帮助客户更清晰的管理多数据库实例 | -| 首页 | 
支持查看数据库实例中各节点的服务运行状态(如是否激活、是否运行、IP信息等),支持查看集群、ConfigNode、DataNode运行监控状态,对数据库运行健康度进行监控,判断实例是否有潜在运行问题 | -| 测点列表 | 支持直接查看实例中的测点信息,包括所在数据库信息(如数据库名称、数据保存时间、设备数量等),及测点信息(测点名称、数据类型、压缩编码等),同时支持单条或批量创建、导出、删除测点 | -| 数据模型 | 支持查看各层级从属关系,将层级模型直观展示 | -| 数据查询 | 支持对常用数据查询场景提供界面式查询交互,并对查询数据进行批量导入、导出 | -| 统计查询 | 支持对常用数据统计场景提供界面式查询交互,如最大值、最小值、平均值、总和的结果输出。 | -| SQL操作 | 支持对数据库SQL进行界面式交互,单条或多条语句执行,结果的展示和导出 | -| 趋势 | 支持一键可视化查看数据整体趋势,对选中测点进行实时&历史数据绘制,观察测点实时&历史运行状态 | -| 分析 | 支持将数据通过不同的分析方式(如傅里叶变换等)进行可视化展示 | -| 视图 | 支持通过界面来查看视图名称、视图描述、结果测点以及表达式等信息,同时还可以通过界面交互快速的创建、编辑、删除视图 | -| 数据同步 | 支持对数据库间的数据同步任务进行直观创建、查看、管理,支持直接查看任务运行状态、同步数据和目标地址,还可以通过界面实时观察到同步状态的监控指标变化 | -| 权限管理 | 支持对权限进行界面管控,用于管理和控制数据库用户访问和操作数据库的权限 | -| 审计日志 | 支持对用户在数据库上的操作进行详细记录,包括DDL、DML和查询操作。帮助用户追踪和识别潜在的安全威胁、数据库错误和滥用行为 | - -主要功能展示: -* 首页 -![首页.png](/img/%E9%A6%96%E9%A1%B5.png) -* 测点列表 -![测点列表.png](/img/workbench-1.png) -* 数据查询 -![数据查询.png](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2.png) -* 趋势 -![历史趋势.png](/img/%E5%8E%86%E5%8F%B2%E8%B6%8B%E5%8A%BF.png) \ No newline at end of file diff --git a/src/zh/UserGuide/dev-1.3/User-Manual/Audit-Log_timecho.md b/src/zh/UserGuide/dev-1.3/User-Manual/Audit-Log_timecho.md deleted file mode 100644 index cb2ff4cdd..000000000 --- a/src/zh/UserGuide/dev-1.3/User-Manual/Audit-Log_timecho.md +++ /dev/null @@ -1,108 +0,0 @@ - - - -# 安全审计 - -## 功能背景 - - 审计日志是数据库的记录凭证,通过审计日志功能可以查询到用户在数据库中增删改查等各项操作,以保证信息安全。关于IoTDB的审计日志功能可以实现以下场景的需求: - -- 可以按链接来源(是否人为操作)决定是否记录审计日志,如:非人为操作如硬件采集器写入的数据不需要记录审计日志,人为操作如普通用户通过cli、workbench等工具操作的数据需要记录审计日志。 -- 过滤掉系统级别的写入操作,如IoTDB监控体系本身记录的写入操作等。 - - - -### 场景说明 - - - -#### 对所有用户的所有操作(增、删、改、查)进行记录 - -通过审计日志功能追踪到所有用户在数据中的各项操作。其中所记录的信息要包含数据操作(新增、删除、查询)及元数据操作(新增、修改、删除、查询)、客户端登录信息(用户名、ip地址)。 - - - -客户端的来源 - -- Cli、workbench、Zeppelin、Grafana、通过 Session/JDBC/MQTT 等协议传入的请求 - -![](/img/audit-log.png) - - -#### 可关闭部分用户连接的审计日志 - - - -如非人为操作,硬件采集器通过 Session/JDBC/MQTT 写入的数据不需要记录审计日志 - - - -## 功能定义 - - - -通过配置可以实现: - -- 决定是否开启审计功能 -- 决定审计日志的输出位置,支持输出至一项或多项 - 1. 日志文件 - 2. 
IoTDB存储 -- 决定是否屏蔽原生接口的写入,防止记录审计日志过多影响性能 -- 决定审计日志内容类别,支持记录一项或多项 - 1. 数据的新增、删除操作 - 2. 数据和元数据的查询操作 - 3. 元数据类的新增、修改、删除操作 - -### 配置项 - - 在iotdb-system.properties中修改以下几项配置 - -```YAML -#################### -### Audit log Configuration -#################### - -# whether to enable the audit log. -# Datatype: Boolean -# enable_audit_log=false - -# Output location of audit logs -# Datatype: String -# IOTDB: the stored time series is: root.__system.audit._{user} -# LOGGER: log_audit.log in the log directory -# audit_log_storage=IOTDB,LOGGER - -# whether enable audit log for DML operation of data -# whether enable audit log for DDL operation of schema -# whether enable audit log for QUERY operation of data and schema -# Datatype: String -# audit_log_operation=DML,DDL,QUERY - -# whether the local write api records audit logs -# Datatype: Boolean -# This contains Session insert api: insertRecord(s), insertTablet(s),insertRecordsOfOneDevice -# MQTT insert api -# RestAPI insert api -# This parameter will cover the DML in audit_log_operation -# enable_audit_log_for_native_insert_api=true -``` - diff --git a/src/zh/UserGuide/dev-1.3/User-Manual/Data-Sync_timecho.md b/src/zh/UserGuide/dev-1.3/User-Manual/Data-Sync_timecho.md deleted file mode 100644 index 98c69cc58..000000000 --- a/src/zh/UserGuide/dev-1.3/User-Manual/Data-Sync_timecho.md +++ /dev/null @@ -1,658 +0,0 @@ - - -# 数据同步 -数据同步是工业物联网的典型需求,通过数据同步机制,可实现 IoTDB 之间的数据共享,搭建完整的数据链路来满足内网外网数据互通、端边云同步、数据迁移、数据备份等需求。 - -## 功能概述 - -### 数据同步 - -一个数据同步任务包含 3 个阶段: - -![](/img/dataSync01.png) - -- 抽取(Source)阶段:该部分用于从源 IoTDB 抽取数据,在 SQL 语句中的 source 部分定义 -- 处理(Process)阶段:该部分用于处理从源 IoTDB 抽取出的数据,在 SQL 语句中的 processor 部分定义 -- 发送(Sink)阶段:该部分用于向目标 IoTDB 发送数据,在 SQL 语句中的 sink 部分定义 - -通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。目前数据同步支持以下信息的同步,您可以在创建同步任务时对同步范围进行选择(默认选择 data.insert,即同步新写入的数据): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
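| Sync scope | Content | Description |
| --- | --- | --- |
| all | | All scopes |
| data | insert (incremental) | Synchronizes newly written data |
| data | delete | Synchronizes deleted data |
| schema (metadata) | database | Synchronizes database creation, modification, and deletion operations |
| schema (metadata) | timeseries | Synchronizes time series definitions and attributes |
| schema (metadata) | TTL (data expiration time) | Synchronizes the data time-to-live |
| auth (permissions) | - | Synchronizes user permissions and access control |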
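
As a minimal sketch of choosing a synchronization scope from the list above, the statement below synchronizes data and schema but leaves permissions out. The pipe name `scope_demo_pipe` and the target address are hypothetical. The `inclusion` and `inclusion.exclusion` keys are the source parameters listed in the parameter reference of this document; whether this exact combination fits a given deployment is an assumption to verify against your IoTDB version.

```SQL
-- Hypothetical pipe name and target address, for illustration only
CREATE PIPE scope_demo_pipe
WITH SOURCE (
  'inclusion' = 'all',             -- start from every scope: data, schema, auth
  'inclusion.exclusion' = 'auth'   -- then exclude permission synchronization
)
WITH SINK (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668'   -- data service port of a target DataNode (assumed)
)
```

If `inclusion` is omitted, the documented default `data.insert` applies, so only newly written data is synchronized.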
- -### 功能限制及说明 - -元数据(schema)、权限(auth)同步功能存在如下限制: - -- 使用元数据同步时,要求`Schema region`、`ConfigNode` 的共识协议必须为默认的 ratis 协议,即`iotdb-system.properties`配置文件中是否包含`config_node_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus`、`schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus`,不包含即为默认值ratis 协议。 - -- 为了防止潜在的冲突,请在开启元数据同步时关闭接收端自动创建元数据功能。可通过修改 `iotdb-system.properties`配置文件中的`enable_auto_create_schema`配置项为 false,关闭元数据自动创建功能。 - -- 开启元数据同步时,不支持使用自定义插件。 - -- 双活集群中元数据同步需避免两端同时操作。 - -- 在进行数据同步任务时,请避免执行任何删除操作,防止两端状态不一致。 - -## 使用说明 - -数据同步任务有三种状态:RUNNING、STOPPED 和 DROPPED。任务状态转换如下图所示: - -![](/img/Data-Sync01.png) - -创建后任务会直接启动,同时当任务发生异常停止后,系统会自动尝试重启任务。 - -提供以下 SQL 语句对同步任务进行状态管理。 - -### 创建任务 - -使用 `CREATE PIPE` 语句来创建一条数据同步任务,下列属性中`PipeId`和`sink`必填,`source`和`processor`为选填项,输入 SQL 时注意 `SOURCE`与 `SINK` 插件顺序不能替换。 - -SQL 示例如下: - -```SQL -CREATE PIPE [IF NOT EXISTS] -- PipeId 是能够唯一标定任务的名字 --- 数据抽取插件,可选插件 -WITH SOURCE ( - [ = ,], -) --- 数据处理插件,可选插件 -WITH PROCESSOR ( - [ = ,], -) --- 数据连接插件,必填插件 -WITH SINK ( - [ = ,], -) -``` - -**IF NOT EXISTS 语义**:用于创建操作中,确保当指定 Pipe 不存在时,执行创建命令,防止因尝试创建已存在的 Pipe 而导致报错。 - -**注意**:V1.3.6 起,创建一个全量数据同步 Pipe (例如 Pipeid : `alldatapipe`)时,系统会自动将其拆分为两个独立的 Pipe: - -* 历史 Pipe:PipeId 为原名称加 _history后缀(如 `alldatapipe_history`),source 参数默认携带 `'realtime.enable'='false', 'inclusion'='data.insert', 'inclusion.exclusion'=''` - -* 实时 Pipe:PipeId 为原名称加 _realtime后缀(如 `alldatapipe_realtime`),source 参数默认携带 `'history.enable'='false'` ,若配置了元数据同步,则由实时 Pipe 负责发送 - -创建成功后,原 PipeId(如 `alldatapipe`)将不再作为有效标识符。在进行启动、停止、删除、查看等任务操作时,必须使用拆分后的独立 PipeId(即 `*_history`或 `*_realtime`)。操作示例见[查看任务](./Data-Sync_timecho.md#查看任务)小节 - -### 开始任务 - -开始处理数据: - -```SQL -START PIPE -``` - -### 停止任务 - -停止处理数据: - -```SQL -STOP PIPE -``` - -### 删除任务 - -删除指定任务: - -```SQL -DROP PIPE [IF EXISTS] -``` - -**IF EXISTS 语义**:用于删除操作中,确保当指定 Pipe 存在时,执行删除命令,防止因尝试删除不存在的 Pipe 而导致报错。 - -删除任务不需要先停止同步任务。 - -### 查看任务 - -查看全部任务: - -```SQL -SHOW PIPES -``` - -查看指定任务: - 
-```SQL -SHOW PIPE -``` - - pipe 的 show pipes 结果示例: - -```SQL -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State|PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -|59abf95db892428b9d01c5fa318014ea|2024-06-17T14:03:44.189|RUNNING| {}| {}|{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}| | 128| 1.03| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -``` - -其中各列含义如下: - -- **ID**:同步任务的唯一标识符 -- **CreationTime**:同步任务的创建的时间 -- **State**:同步任务的状态 -- **PipeSource**:同步数据流的来源 -- **PipeProcessor**:同步数据流在传输过程中的处理逻辑 -- **PipeSink**:同步数据流的目的地 -- **ExceptionMessage**:显示同步任务的异常信息 -- **RemainingEventCount(统计存在延迟)**:剩余 event 数,当前数据同步任务中的所有 event 总数,包括数据和元数据同步的 event,以及系统和用户自定义的 event。 -- **EstimatedRemainingSeconds(统计存在延迟)**:剩余时间,基于当前 event 个数和 pipe 处速率,预估完成传输的剩余时间。 - -示例: - -在 V1.3.6 及之后的版本中,创建一个全量数据同步任务,并查看该任务详情 - -```sql -IoTDB> create pipe alldatapipe with source('inclusion'='all','exclusion'='auth') with sink('node-urls'='127.0.0.1:6668') - -IoTDB> show pipe alldatapipe_history -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| 
PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_history|2025-12-18T15:06:16.697|RUNNING|{exclusion=auth, history.enable=true, inclusion=data.insert, inclusion.exclusion=, realtime.enable=false}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -IoTDB> show pipe alldatapipe_realtime -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_realtime|2025-12-18T15:06:16.312|RUNNING|{exclusion=auth, history.enable=false, inclusion=all, realtime.enable=true}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -``` - -### 同步插件 - -为了使得整体架构更加灵活以匹配不同的同步场景需求,我们支持在同步任务框架中进行插件组装。系统为您预置了一些常用插件可直接使用,同时您也可以自定义 processor 插件 和 Sink 插件,并加载至 IoTDB 
系统进行使用。查看系统中的插件(含自定义与内置插件)可以用以下语句: - -```SQL -SHOW PIPEPLUGINS -``` - -返回结果如下: - -```SQL -IoTDB> SHOW PIPEPLUGINS -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| PluginName|PluginType| ClassName| PluginJar| -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.processor.donothing.DoNothingProcessor| | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.donothing.DoNothingConnector| | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector| | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.extractor.iotdb.IoTDBExtractor| | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector| | -| IOTDB-THRIFT-SSL-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector| | -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ - -``` - -预置插件详细介绍如下(各插件的详细参数可参考本文[参数说明](#参考参数说明)): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
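| Type | Custom plugin | Plugin name | Description | Applicable version |
| --- | --- | --- | --- | --- |
| source plugin | Not supported | iotdb-source | Default extractor plugin; extracts historical or real-time data from IoTDB | 1.2.x |
| processor plugin | Supported | do-nothing-processor | Default processor plugin; does not process the incoming data in any way | 1.2.x |
| sink plugin | Supported | do-nothing-sink | Does not process the outgoing data in any way | 1.2.x |
| sink plugin | Supported | iotdb-thrift-sink | Default sink plugin (V1.3.1 and later), used for data transfer between IoTDB (V1.2.0 and later) and IoTDB (V1.2.0 and later). Transfers data with the Thrift RPC framework using a multi-threaded async non-blocking IO model; transfer performance is high, especially when the target is a distributed deployment | 1.2.x |
| sink plugin | Supported | iotdb-air-gap-sink | Used for data synchronization from IoTDB (V1.2.2 and later) to IoTDB (V1.2.2 and later) across a unidirectional data gateway. Supported gateway models include 南瑞 Syskeeper 2000 | 1.2.x |
| sink plugin | Supported | iotdb-thrift-ssl-sink | Used for data transfer between IoTDB (V1.3.1 and later) and IoTDB (V1.2.0 and later). Transfers data with the Thrift RPC framework using a single-threaded sync blocking IO model; suited to scenarios with higher security requirements | 1.3.1+ |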
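
To show how the built-in plugins in the table above fit together, here is a hedged sketch of a pipe that names one plugin per stage explicitly. The pipe name `plugin_demo_pipe` and the target address are hypothetical, and the `'processor'` key used to select the processor plugin is an assumption not confirmed by this section; the `'source'` and `'sink'` keys follow the examples elsewhere in this document.

```SQL
-- Hypothetical pipe name and target address, for illustration only
CREATE PIPE plugin_demo_pipe
WITH SOURCE (
  'source' = 'iotdb-source'               -- built-in extractor plugin
)
WITH PROCESSOR (
  'processor' = 'do-nothing-processor'    -- assumed key; passes data through unchanged
)
WITH SINK (
  'sink' = 'iotdb-thrift-sink',           -- built-in Thrift sink
  'node-urls' = '127.0.0.1:6668'          -- data service port of a target DataNode (assumed)
)
```

Because the source and processor stages are optional, omitting them should give the same behavior with the default plugins; spelling them out is mainly useful as a template when swapping in a custom processor or sink.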
- -导入自定义插件可参考[流处理框架](./Streaming_timecho.md#自定义流处理插件管理)章节。 - -## 使用示例 - -### 全量数据同步 - -本例子用来演示将一个 IoTDB 的所有数据同步至另一个 IoTDB,数据链路如下图所示: - -![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png) - -在这个例子中,我们可以创建一个名为 A2B 的同步任务,用来同步 A IoTDB 到 B IoTDB 间的全量数据,这里需要用到用到 sink 的 iotdb-thrift-sink 插件(内置插件),需通过 node-urls 配置目标端 IoTDB 中 DataNode 节点的数据服务端口的 url,如下面的示例语句: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 部分数据同步 - -本例子用来演示同步某个历史时间范围( 2023 年 8 月 23 日 8 点到 2023 年 10 月 23 日 8 点)的数据至另一个 IoTDB,数据链路如下图所示: - -![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png) - -在这个例子中,我们可以创建一个名为 A2B 的同步任务。首先我们需要在 source 中定义传输数据的范围,由于传输的是历史数据(历史数据是指同步任务创建之前存在的数据),需要配置数据的起止时间 start-time 和 end-time 以及传输的模式 mode。通过 node-urls 配置目标端 IoTDB 中 DataNode 节点的数据服务端口的 url。 - -详细语句如下: - -```SQL -create pipe A2B -WITH SOURCE ( - 'source'= 'iotdb-source', - 'realtime.mode' = 'stream' -- 新插入数据(pipe创建后)的抽取模式 - 'path' = 'root.vehicle.**', -- 同步数据的范围 - 'start-time' = '2023.08.23T08:00:00+00:00', -- 同步所有数据的开始 event time,包含 start-time - 'end-time' = '2023.10.23T08:00:00+00:00' -- 同步所有数据的结束 event time,包含 end-time -) -with SINK ( - 'sink'='iotdb-thrift-async-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 双向数据传输 - -本例子用来演示两个 IoTDB 之间互为双活的场景,数据链路如下图所示: - -![](/img/1706698592139.jpg) - -在这个例子中,为了避免数据无限循环,需要将 A 和 B 上的参数`forwarding-pipe-requests` 均设置为 `false`,表示不转发从另一 pipe 传输而来的数据,以及要保持两侧的数据一致 pipe 需要配置`inclusion=all`来同步全量数据和元数据。 - -详细语句如下: - -在 A IoTDB 上执行下列语句: - -```SQL -create pipe AB -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'forwarding-pipe-requests' = 'false' --不转发由其他 Pipe 写入的数据 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 B IoTDB 上执行下列语句: - -```SQL -create pipe BA -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'forwarding-pipe-requests' = 'false' 
--是否转发由其他 Pipe 写入的数据 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` -### 边云数据传输 - -本例子用来演示多个 IoTDB 之间边云传输数据的场景,数据由 B 、C、D 集群分别都同步至 A 集群,数据链路如下图所示: - -![](/img/dataSync03.png) - -在这个例子中,为了将 B 、C、D 集群的数据同步至 A,在 BA 、CA、DA 之间的 pipe 需要配置`path`限制范围,以及要保持边侧和云侧的数据一致 pipe 需要配置`inclusion=all`来同步全量数据和元数据,详细语句如下: - -在 B IoTDB 上执行下列语句,将 B 中数据同步至 A: - -```SQL -create pipe BA -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'path'='root.db.**', -- 限制范围 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 C IoTDB 上执行下列语句,将 C 中数据同步至 A: - -```SQL -create pipe CA -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'path'='root.db.**', -- 限制范围 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 D IoTDB 上执行下列语句,将 D 中数据同步至 A: - -```SQL -create pipe DA -with source ( - 'inclusion'='all', -- 表示同步全量数据、元数据和权限 - 'path'='root.db.**', -- 限制范围 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 级联数据传输 - -本例子用来演示多个 IoTDB 之间级联传输数据的场景,数据由 A 集群同步至 B 集群,再同步至 C 集群,数据链路如下图所示: - -![](/img/1706698610134.jpg) - -在这个例子中,为了将 A 集群的数据同步至 C,在 BC 之间的 pipe 需要将 `forwarding-pipe-requests` 配置为`true`,详细语句如下: - -在 A IoTDB 上执行下列语句,将 A 中数据同步至 B: - -```SQL -create pipe AB -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 B IoTDB 上执行下列语句,将 B 中数据同步至 C: - -```SQL -create pipe BC -with source ( - 'forwarding-pipe-requests' = 'true' --是否转发由其他 Pipe 写入的数据 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 跨网闸数据传输 - -本例子用来演示将一个 IoTDB 的数据,经过单向网闸,同步至另一个 IoTDB 的场景,数据链路如下图所示: - -![](/img/cross-network-gateway.png) - -在这个例子中,需要使用 sink 任务中的 
iotdb-air-gap-sink 插件,配置网闸后,在 A IoTDB 上执行下列语句,其中 node-urls 填写网闸配置的目标端 IoTDB 中 DataNode 节点的数据服务端口的 url,详细语句如下: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-air-gap-sink', - 'node-urls' = '10.53.53.53:9780', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` -**注意:目前支持的网闸型号** -> 其他型号的网闸设备,请与天谋商务联系确认是否支持。 - -| 网闸类型 | 网闸型号 | 回包限制 | 发送限制 | -| ------------ | -------------------------------------------- | ----------------- | --------------- | -| 正向型 | 南瑞 Syskeeper-2000 正向型 | 全 0 / 全 1 bytes | 无限制 | -| 正向型 | 许继自研网闸 | 全 0 / 全 1 bytes | 无限制 | -| 未标记正反向 | 威努特安全隔离与信息交换系统 | 无限制 | 无限制 | -| 正向型 | 科东 StoneWall-2000 网络安全隔离设备(正向型) | 无限制 | 无限制 | -| 反向型 | 南瑞 Syskeeper-2000 反向型 | 全 0 / 全 1 bytes | 满足 E 语言格式 | -| 未标记正反向 | 迪普科技ISG5000 | 无限制 | 无限制 | -| 未标记正反向 | 熙羚安全隔离与信息交换系统XL—GAP | 无限制 | 无限制 | - -### 压缩同步 - -IoTDB 支持在同步过程中指定数据压缩方式。可通过配置 `compressor` 参数,实现数据的实时压缩和传输。`compressor`目前支持 snappy / gzip / lz4 / zstd / lzma2 5 种可选算法,且可以选择多种压缩算法组合,按配置的顺序进行压缩。`rate-limit-bytes-per-second`(V1.3.3 及以后版本支持)每秒最大允许传输的byte数,计算压缩后的byte,若小于0则不限制。 - -如创建一个名为 A2B 的同步任务: - -```SQL -create pipe A2B -with sink ( - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url - 'compressor' = 'snappy,lz4' -- - 'rate-limit-bytes-per-second'='1048576' -- 每秒最大允许传输的byte数 -) -``` - -### 加密同步 - -IoTDB 支持在同步过程中使用 SSL 加密,从而在不同的 IoTDB 实例之间安全地传输数据。通过配置 SSL 相关的参数,如证书地址和密码(`ssl.trust-store-path`)、(`ssl.trust-store-pwd`)可以确保数据在同步过程中被 SSL 加密所保护。 - -如创建名为 A2B 的同步任务: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-ssl-sink', - 'node-urls'='127.0.0.1:6667', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url - 'ssl.trust-store-path'='pki/trusted', -- 连接目标端 DataNode 所需的 trust store 证书路径 - 'ssl.trust-store-pwd'='root' -- 连接目标端 DataNode 所需的 trust store 证书密码 -) -``` - -## 参考:注意事项 - -可通过修改 IoTDB 配置文件(`iotdb-system.properties`)以调整数据同步的参数,如同步数据存储目录等。完整配置如下:: - -V1.3.3+: - -```Properties -# pipe_receiver_file_dir -# If this property is unset, system will save the data in the default relative path directory under the 
IoTDB folder(i.e., %IOTDB_HOME%/${cn_system_dir}/pipe/receiver). -# If it is absolute, system will save the data in the exact location it points to. -# If it is relative, system will save the data in the relative path directory it indicates under the IoTDB folder. -# Note: If pipe_receiver_file_dir is assigned an empty string(i.e.,zero-size), it will be handled as a relative path. -# effectiveMode: restart -# For windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is absolute. Otherwise, it is relative. -# pipe_receiver_file_dir=data\\confignode\\system\\pipe\\receiver -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_receiver_file_dir=data/confignode/system/pipe/receiver - -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# effectiveMode: first_start -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# effectiveMode: restart -# Datatype: int -pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# effectiveMode: restart -# Datatype: int -pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. 
-# effectiveMode: restart -# Datatype: int -pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# effectiveMode: restart -# Datatype: int -pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# effectiveMode: restart -# Datatype: Boolean -pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# Datatype: int -# effectiveMode: restart -pipe_air_gap_receiver_port=9780 - -# The total bytes that all pipe sinks can transfer per second. -# When given a value less than or equal to 0, it means no limit. -# default value is -1, which means no limit. -# effectiveMode: hot_reload -# Datatype: double -pipe_all_sinks_rate_limit_bytes_per_second=-1 -``` - -## 参考:参数说明 - -### source 参数(V1.3.3) - -| 参数 | 描述 | value 取值范围 | 是否必填 | 默认取值 | -| ------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | -------------- | -| source | iotdb-source | String: iotdb-source | 必填 | - | -| inclusion | 用于指定数据同步任务中需要同步范围,分为数据、元数据和权限 | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | 选填 | data.insert | -| inclusion.exclusion | 用于从 inclusion 指定的同步范围内排除特定的操作,减少同步的数据量 | String:all, data(insert,delete), schema(database,timeseries,ttl), auth | 选填 | 空字符串 | -| mode | 用于在每个 data region 发送完毕时分别发送结束事件,并在全部 data region 发送完毕后自动 drop pipe。query:结束,subscribe:不结束。 | String: query / subscribe | 选填 | subscribe | -| path | 用于筛选待同步的时间序列及其相关元数据 / 数据的路径模式元数据同步只能用pathpath 是精确匹配,参数必须为前缀路径或完整路径,即不能含有 `"*"`,最多在 path参数的尾部含有一个 `"**"` | String:IoTDB 的 pattern | 选填 | root.** | -| pattern | 用于筛选时间序列的路径前缀 | String: 任意的时间序列前缀 | 选填 | root | -| start-time | 同步所有数据的开始 event time,包含 start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | 选填 | Long.MIN_VALUE | -| end-time | 
同步所有数据的结束 event time,包含 end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | 选填 | Long.MAX_VALUE | -| realtime.mode | 新插入数据(pipe创建后)的抽取模式 | String: stream, batch | 选填 | batch | -| forwarding-pipe-requests | 是否转发由其他 Pipe (通常是数据同步)写入的数据 | Boolean: true, false | 选填 | true | -| history.loose-range | tsfile传输时,是否放宽历史数据(pipe创建前)范围。"":不放宽范围,严格按照设置的条件挑选数据"time":放宽时间范围,避免对TsFile进行拆分,可以提升同步效率"path":放宽路径范围,避免对TsFile进行拆分,可以提升同步效率"time, path" 、 "path, time" 、"all" : 放宽所有范围,避免对TsFile进行拆分,可以提升同步效率 | String: "" 、 "time" 、 "path" 、 "time, path" 、 "path, time" 、 "all" | 选填 | "" | -| realtime.loose-range | tsfile传输时,是否放宽实时数据(pipe创建前)范围。"":不放宽范围,严格按照设置的条件挑选数据"time":放宽时间范围,避免对TsFile进行拆分,可以提升同步效率"path":放宽路径范围,避免对TsFile进行拆分,可以提升同步效率"time, path" 、 "path, time" 、"all" : 放宽所有范围,避免对TsFile进行拆分,可以提升同步效率 | String: "" 、 "time" 、 "path" 、 "time, path" 、 "path, time" 、 "all" | 选填 | "" | -| mods.enable | 是否发送 tsfile 的 mods 文件 | Boolean: true / false | 选填 | false | - -> 💎 **说明**:为保持低版本兼容,history.enable、history.start-time、history.end-time、realtime.enable 仍可使用,但在新版本中不推荐。 -> -> 💎 **说明:数据抽取模式 stream 和 batch 的差异** -> - **stream(推荐)**:该模式下,任务将对数据进行实时处理、发送,其特点是高时效、低吞吐 -> - **batch**:该模式下,任务将对数据进行批量(按底层数据文件)处理、发送,其特点是低时效、高吞吐 - - -### sink **参数** - -> 在 1.3.3 及以上的版本中,只包含sink的情况下,不再需要额外增加with sink 前缀 - -#### iotdb-thrift-sink - -| key | value | value 取值范围 | 是否必填 | 默认取值 | -| ----------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | ------------ | -| sink | iotdb-thrift-sink 或 iotdb-thrift-async-sink | String: iotdb-thrift-sink 或 iotdb-thrift-async-sink | 必填 | - | -| node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url(请注意同步任务不支持向自身服务进行转发) | String. 
例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 必填 | - | -| batch.enable | 是否开启日志攒批发送模式,用于提高传输吞吐,降低 IOPS | Boolean: true, false | 选填 | true | -| batch.max-delay-seconds | 在开启日志攒批发送模式时生效,表示一批数据在发送前的最长等待时间(单位:s) | Integer | 选填 | 1 | -| batch.max-delay-ms | 在开启日志攒批发送模式时生效,表示一批数据在发送前的最长等待时间(单位:ms)(V1.3.6及以后的V1.x版本支持) | Integer | 选填 | 1 | -| batch.size-bytes | 在开启日志攒批发送模式时生效,表示一批数据最大的攒批大小(单位:byte) | Long | 选填 | 16*1024*1024 | -| load-tsfile-strategy | 文件同步数据时,接收端请求返回发送端前,是否等待接收端本地的 load tsfile 执行结果返回。
sync:等待本地的 load tsfile 执行结果返回;
async:不等待本地的 load tsfile 执行结果返回。(V1.3.6及以后的V1.x版本支持) | String: sync / async | 选填 | sync | - -#### iotdb-air-gap-sink - -| key | value | value 取值范围 | 是否必填 | 默认取值 | -| ---------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | -------- | -| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | 必填 | - | -| node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url | String. 例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 必填 | - | -| air-gap.handshake-timeout-ms | 发送端与接收端在首次尝试建立连接时握手请求的超时时长,单位:毫秒 | Integer | 选填 | 5000 | -| load-tsfile-strategy | 文件同步数据时,接收端请求返回发送端前,是否等待接收端本地的 load tsfile 执行结果返回。
sync:等待本地的 load tsfile 执行结果返回;
async:不等待本地的 load tsfile 执行结果返回。(V1.3.6及以后的V1.x版本支持) | String: sync / async | 选填 | sync | - -#### iotdb-thrift-ssl-sink - -| key | value | value 取值范围 | 是否必填 | 默认取值 | -| ----------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | ------------ | -| sink | iotdb-thrift-ssl-sink | String: iotdb-thrift-ssl-sink | 必填 | - | -| node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url(请注意同步任务不支持向自身服务进行转发) | String. 例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 必填 | - | -| batch.enable | 是否开启日志攒批发送模式,用于提高传输吞吐,降低 IOPS | Boolean: true, false | 选填 | true | -| batch.max-delay-seconds | 在开启日志攒批发送模式时生效,表示一批数据在发送前的最长等待时间(单位:s) | Integer | 选填 | 1 | -| batch.max-delay-ms | 在开启日志攒批发送模式时生效,表示一批数据在发送前的最长等待时间(单位:ms)(V1.3.6及以后的V1.x版本支持) | Integer | 选填 | 1 | -| batch.size-bytes | 在开启日志攒批发送模式时生效,表示一批数据最大的攒批大小(单位:byte) | Long | 选填 | 16*1024*1024 | -| load-tsfile-strategy | 文件同步数据时,接收端请求返回发送端前,是否等待接收端本地的 load tsfile 执行结果返回。
sync:等待本地的 load tsfile 执行结果返回;
async:不等待本地的 load tsfile 执行结果返回。(V1.3.6及以后的V1.x版本支持) | String: sync / async | 选填 | sync | -| ssl.trust-store-path | 连接目标端 DataNode 所需的 trust store 证书路径 | String.Example: '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 必填 | - | -| ssl.trust-store-pwd | 连接目标端 DataNode 所需的 trust store 证书密码 | Integer | 必填 | - | diff --git a/src/zh/UserGuide/dev-1.3/User-Manual/IoTDB-View_timecho.md b/src/zh/UserGuide/dev-1.3/User-Manual/IoTDB-View_timecho.md deleted file mode 100644 index 2181ae4d4..000000000 --- a/src/zh/UserGuide/dev-1.3/User-Manual/IoTDB-View_timecho.md +++ /dev/null @@ -1,546 +0,0 @@ - - -# 视图 - -## 序列视图应用背景 - -### 应用场景1 时间序列重命名(PI资产管理) - -实际应用中,采集数据的设备可能使用人类难以理解的标识号来命名,这给业务层带来了查询上的困难。 - -而序列视图能够重新组织管理这些序列,在不改变原有序列内容、无需新建或拷贝序列的情况下,使用新的模型结构来访问他们。 - -**例如**:一台云端设备使用自己的网卡MAC地址组成实体编号,存储数据时写入如下时间序列:`root.db.0800200A8C6D.xvjeifg`. - -对于用户来说,它是难以理解的。但此时,用户能够使用序列视图功能对它重命名,将它映射到一个序列视图中去,使用`root.view.device001.temperature`来访问采集到的数据。 - -### 应用场景2 简化业务层查询逻辑 - -有时用户有大量设备,管理着大量时间序列。在进行某项业务时,用户希望仅处理其中的部分序列,此时就可以通过序列视图功能挑选出关注重点,方便反复查询、写入。 - -**例如**:用户管理一条产品流水线,各环节的设备有大量时间序列。温度检测员仅需要关注设备温度,就可以抽取温度相关的序列,组成序列视图。 - -### 应用场景3 辅助权限管理 - -生产过程中,不同业务负责的范围一般不同,出于安全考虑往往需要通过权限管理来限制业务员的访问范围。 - -**例如**:安全管理部门现在仅需要监控某生产线上各设备的温度,但这些数据与其他机密数据存放在同一数据库。此时,就可以创建若干新的视图,视图中仅含有生产线上与温度有关的时间序列,接着,向安全员只赋予这些序列视图的权限,从而达到权限限制的目的。 - -### 设计序列视图功能的动机 - -结合上述两类使用场景,设计序列视图功能的动机,主要有: - -1. 时间序列重命名。 -2. 简化业务层查询逻辑。 -3. 辅助权限管理,通过视图向特定用户开放数据。 - -## 序列视图概念 - -### 术语概念 - -约定:若无特殊说明,本文档所指定的视图均是**序列视图**,未来可能引入设备视图等新功能。 - -### 序列视图 - -序列视图是一种组织管理时间序列的方式。 - -在传统关系型数据库中,数据都必须存放在一个表中,而在IoTDB等时序数据库中,序列才是存储单元。因此,IoTDB中序列视图的概念也是建立在序列上的。 - -一个序列视图就是一条虚拟的时间序列,每条虚拟的时间序列都像是一条软链接或快捷方式,映射到某个视图外部的序列或者某种计算逻辑。换言之,一个虚拟序列要么映射到某个确定的外部序列,要么由多个外部序列运算得来。 - -用户可以使用复杂的SQL查询创建视图,此时序列视图就像一条被存储的查询语句,当从视图中读取数据时,就把被存储的查询语句作为数据来源,放在FROM子句中。 - -### 别名序列 - -在序列视图中,有一类特殊的存在,他们满足如下所有条件: - -1. 数据来源为单一的时间序列 -2. 没有任何计算逻辑 -3. 
没有任何筛选条件(例如无WHERE子句的限制) - -这样的序列视图,被称为**别名序列**,或别名序列视图。不完全满足上述所有条件的序列视图,就称为非别名序列视图。他们之间的区别是:只有别名序列支持写入功能。 - -**所有序列视图包括别名序列目前均不支持触发器功能(Trigger)。** - -### 嵌套视图 - -用户可能想从一个现有的序列视图中选出若干序列,组成一个新的序列视图,就称之为嵌套视图。 - -**当前版本不支持嵌套视图功能**。 - -### IoTDB中对序列视图的一些约束 - -#### 限制1 序列视图必须依赖于一个或者若干个时间序列 - -一个序列视图有两种可能的存在形式: - -1. 它映射到一条时间序列 -2. 它由一条或若干条时间序列计算得来 - -前种存在形式已在前文举例,易于理解;而此处的后一种存在形式,则是因为序列视图允许计算逻辑的存在。 - -比如,用户在同一个锅炉安装了两个温度计,现在需要计算两个温度值的平均值作为测量结果。用户采集到的是如下两个序列:`root.db.d01.temperature01`、`root.db.d01.temperature02`。 - -此时,用户可以使用两个序列求平均值,作为视图中的一条序列:`root.db.d01.avg_temperature`。 - -该例子会3.1.2详细展开。 - -#### 限制2 非别名序列视图是只读的 - -不允许向非别名序列视图写入。 - -只有别名序列视图是支持写入的。 - -#### 限制3 不允许嵌套视图 - -不能选定现有序列视图中的某些列来创建序列视图,无论是直接的还是间接的。 - -本限制将在3.1.3给出示例。 - -#### 限制4 序列视图与时间序列不能重名 - -序列视图和时间序列都位于同一棵树下,所以他们不能重名。 - -任何一条序列的名称(路径)都应该是唯一确定的。 - -#### 限制5 序列视图与时间序列的时序数据共用,标签等元数据不共用 - -序列视图是指向时间序列的映射,所以它们完全共用时序数据,由时间序列负责持久化存储。 - -但是它们的tag、attributes等元数据不共用。 - -这是因为进行业务查询时,面向视图的用户关心的是当前视图的结构,而如果使用group by tag等方式做查询,显然希望是得到视图下含有对应tag的分组效果,而非时间序列的tag的分组效果(用户甚至对那些时间序列毫无感知)。 - -## 序列视图功能介绍 - -### 创建视图 - -创建一个序列视图与创建一条时间序列类似,区别在于需要通过AS关键字指定数据来源,即原始序列。 - -#### 创建视图的SQL - -用户可以选取一些序列创建一个视图: - -```SQL -CREATE VIEW root.view.device.status -AS - SELECT s01 - FROM root.db.device -``` - -它表示用户从现有设备`root.db.device`中选出了`s01`这条序列,创建了序列视图`root.view.device.status`。 - -序列视图可以与时间序列存在于同一实体下,例如: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device -``` - -这样,`root.db.device`下就有了`s01`的一份虚拟拷贝,但是使用不同的名字`status`。 - -可以发现,上述两个例子中的序列视图,都是别名序列,我们给用户提供一种针对该序列的更方便的创建方式: - -```SQL -CREATE VIEW root.view.device.status -AS - root.db.device.s01 -``` - -#### 创建含有计算逻辑的视图 - -沿用2.2章节限制1中的例子: - -> 用户在同一个锅炉安装了两个温度计,现在需要计算两个温度值的平均值作为测量结果。用户采集到的是如下两个序列:`root.db.d01.temperature01`、`root.db.d01.temperature02`。 -> -> 此时,用户可以使用两个序列求平均值,作为视图中的一条序列:`root.view.device01.avg_temperature`。 - -如果不使用视图,用户可以这样查询两个温度的平均值: - -```SQL -SELECT (temperature01 + temperature02) / 2 -FROM root.db.d01 -``` - 
-而如果使用序列视图,用户可以这样创建一个视图来简化将来的查询: - -```SQL -CREATE VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02) / 2 - FROM root.db.d01 -``` - -然后用户可以这样查询: - -```SQL -SELECT avg_temperature FROM root.db.d01 -``` - -#### 不支持嵌套序列视图 - -继续沿用3.1.2中的例子,现在用户想使用序列视图`root.db.d01.avg_temperature`创建一个新的视图,这是不允许的。我们目前不支持嵌套视图,无论它是否是别名序列,都不支持。 - -比如下列SQL语句会报错: - -```SQL -CREATE VIEW root.view.device.avg_temp_copy -AS - root.db.d01.avg_temperature -- 不支持。不允许嵌套视图 -``` - -#### 一次创建多条序列视图 - -一次只能指定一个序列视图对用户来说使用不方便,则可以一次指定多条序列,比如: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - SELECT s01, s02 - FROM root.db.device -``` - -此外,上述写法可以做简化: - -```SQL -CREATE VIEW root.db.device(status, sub.hardware) -AS - SELECT s01, s02 - FROM root.db.device -``` - -上述两条语句都等价于如下写法: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device; - -CREATE VIEW root.db.device.sub.hardware -AS - SELECT s02 - FROM root.db.device -``` - -也等价于如下写法 - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - root.db.device.s01, root.db.device.s02 - --- 或者 - -CREATE VIEW root.db.device(status, sub.hardware) -AS - root.db.device(s01, s02) -``` - -##### 所有序列间的映射关系为静态存储 - -有时,SELECT子句中可能包含运行时才能确定的语句个数,比如如下的语句: - -```SQL -SELECT s01, s02 -FROM root.db.d01, root.db.d02 -``` - -上述语句能匹配到的序列数量是并不确定的,和系统状态有关。即便如此,用户也可以使用它创建视图。 - -不过需要特别注意,所有序列间的映射关系为静态存储(创建时固定)!请看以下示例: - -当前数据库中仅含有`root.db.d01.s01`、`root.db.d02.s01`、`root.db.d02.s02`三条序列,接着创建视图: - -```SQL -CREATE VIEW root.view.d(alpha, beta, gamma) -AS - SELECT s01, s02 - FROM root.db.d01, root.db.d02 -``` - -时间序列之间映射关系如下: - -| 序号 | 时间序列 | 序列视图 | -| ---- | ----------------- | ----------------- | -| 1 | `root.db.d01.s01` | root.view.d.alpha | -| 2 | `root.db.d02.s01` | root.view.d.beta | -| 3 | `root.db.d02.s02` | root.view.d.gamma | - -此后,用户新增了序列`root.db.d01.s02`,则它不对应到任何视图;接着,用户删除`root.db.d01.s01`,则查询`root.view.d.alpha`会直接报错,它也不会对应到`root.db.d01.s02`。 - -请时刻注意,序列间映射关系是静态地、固化地存储的。 - 
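-A minimal sketch of how the static mapping described above could be extended by hand, assuming the scenario in which `root.db.d01.s02` is added after the view was created. The view name `root.view.d.delta` is hypothetical and chosen only for illustration; the statement reuses the alias-creation form shown earlier in this document.
-
-```SQL
--- Assumption: root.db.d01.s02 was added after root.view.d(alpha, beta, gamma)
--- was created, so no existing view maps to it.
--- 'delta' is a hypothetical view name used only in this sketch.
-CREATE VIEW root.view.d.delta
-AS
-  root.db.d01.s02
-```
-
-The existing views `alpha`, `beta`, and `gamma` keep the mappings that were fixed when they were created.
-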
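-#### Batch creation of series views
-
-Suppose there are several devices, each with a temperature reading, for example:
-
-1. root.db.d1.temperature
-2. root.db.d2.temperature
-3. ...
-
-These devices may store many other series (for example `root.db.d1.speed`), but a view can be created that contains only the temperature values of these devices and ignores the other series:
-
-```SQL
-CREATE VIEW root.db.view(${2}_temperature)
-AS
-  SELECT temperature FROM root.db.*
-```
-
-This follows the naming convention of query write-back (`SELECT INTO`) and uses variable placeholders to specify the naming rule. See: [Query write-back (SELECT INTO)](../Basic-Concept/Query-Data.md#查询写回(INTO-子句))
-
-Here `root.db.*.temperature` specifies which time series are included in the view, and `${2}` specifies which node of the time series path supplies the name used for the series view.
-
-`${2}` refers to level 2 of `root.db.*.temperature` (counting from 0), that is, whatever `*` matched. `${2}_temperature` joins that match and `temperature` with an underscore to form the node name of each series under the view.
-
-The statement above is equivalent to the following:
-
-```SQL
-CREATE VIEW root.db.view(${2}_${3})
-AS
-  SELECT temperature FROM root.db.*
-```
-
-The resulting view contains these series:
-
-1. root.db.view.d1_temperature
-2. root.db.view.d2_temperature
-3. ...
-
-Creating views with wildcards stores only the static mapping that exists at creation time.
-
-#### Restrictions on the SELECT clause when creating views
-
-The SELECT clause used to create a series view is subject to the following main restrictions:
-
-1. The `WHERE` clause cannot be used.
-2. The `GROUP BY` clause cannot be used.
-3. Aggregate functions such as `MAX_VALUE` cannot be used.
-
-In short, only the `SELECT ... FROM ...` structure may follow `AS`, and the result of the query must be able to form a time series.
-
-### Querying view data
-
-For the query features that are supported, series views and time series can be used interchangeably in time series data queries, with identical behavior.
-
-**Series views currently do not support the following query types:**
-
-1. **align by device queries**
-2. 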
**group by tags 查询** - -用户也可以在同一个SELECT语句中混合查询时间序列与序列视图,比如: - -```SQL -SELECT temperature01, temperature02, avg_temperature -FROM root.db.d01 -WHERE temperature01 < temperature02 -``` - -但是,如果用户想要查询序列的元数据,例如tag、attributes等,则查询到的是序列视图的结果,而并非序列视图所引用的时间序列的结果。 - -此外,对于别名序列,如果用户想要得到时间序列的tag、attributes等信息,则需要先查询视图列的映射,找到对应的时间序列,再向时间序列查询tag、attributes等信息。查询视图列的映射的方法将会在3.5部分说明。 - -### 视图修改 - -视图支持的修改操作包括:修改计算逻辑,修改标签/属性,以及删除。 - -#### 修改视图数据来源 - -```SQL -ALTER VIEW root.view.device.status -AS - SELECT s01 - FROM root.ln.wf.d01 -``` - -#### 修改视图的计算逻辑 - -```SQL -ALTER VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02 + temperature03) / 3 - FROM root.db.d01 -``` - -#### 标签点管理 - -- 添加新的标签 - -```SQL -ALTER view root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4 -``` - -- 添加新的属性 - -```SQL -ALTER view root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4 -``` - -- 重命名标签或属性 - -```SQL -ALTER view root.turbine.d1.s1 RENAME tag1 TO newTag1 -``` - -- 重新设置标签或属性的值 - -```SQL -ALTER view root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1 -``` - -- 删除已经存在的标签或属性 - -```SQL -ALTER view root.turbine.d1.s1 DROP tag1, tag2 -``` - -- 更新插入标签和属性 - -> 如果该标签或属性原来不存在,则插入,否则,用新值更新原来的旧值 - -```SQL -ALTER view root.turbine.d1.s1 UPSERT TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4) -``` - -#### 删除视图 - -因为一个视图就是一条序列,因此可以像删除时间序列一样删除一个视图。 - -```SQL -DELETE VIEW root.view.device.avg_temperatue -``` - -### 视图同步 - -#### 如果依赖的原序列被删除了 - -当序列视图查询时(序列解析时),如果依赖的时间序列不存在,则**返回空结果集**。 - -这和查询一个不存在的序列的反馈类似,但是有区别:如果依赖的时间序列无法解析,空结果集是包含表头的,以此来提醒用户该视图是存在问题的。 - -此外,被依赖的时间序列删除时,不会去查找是否有依赖于该列的视图,用户不会收到任何警告。 - -#### 不支持非别名序列的数据写入 - -不支持向非别名序列的写入。 - -详情请参考前文 2.1.6 限制2 - -#### 序列的元数据不共用 - -详情请参考前文2.1.6 限制5 - -### 视图元数据查询 - -视图元数据查询,特指查询视图本身的元数据(例如视图有多少列),以及数据库内视图的信息(例如有哪些视图)。 - -#### 查看当前的视图列 - -用户有两种查询方式: - -1. 使用`SHOW TIMESERIES`进行查询,该查询既包含时间序列,也包含序列视图。但是只能显示视图的部分属性 -2. 
使用`SHOW VIEW`进行查询,该查询只包含序列视图。能完整显示序列视图的属性。 - -举例: - -```Shell -IoTDB> show timeseries; -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.device.s01 | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.view.status | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp01 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp02 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.avg_temp| null| root.db| FLOAT| null| null|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -Total line number = 5 -It costs 0.789s -IoTDB> -``` - -最后一列`ViewType`中显示了该序列的类型,时间序列为BASE,序列视图是VIEW。 - -此外,某些序列视图的属性会缺失,比如`root.db.d01.avg_temp`是由温度均值计算得来,所以`Encoding`和`Compression`属性都为空值。 - -此外,`SHOW TIMESERIES`语句的查询结果主要分为两部分: - -1. 时序数据的信息,例如数据类型,压缩方式,编码等 -2. 
其他元数据信息,例如tag,attribute,所属database等 - -对于序列视图,展示的时序数据信息与其原始序列一致或者为空值(比如计算得到的平均温度有数据类型但是无压缩方式);展示的元数据信息则是视图的内容。 - -如果要得知视图的更多信息,需要使用`SHOW ``VIEW`。`SHOW ``VIEW`中展示视图的数据来源等。 - -```Shell -IoTDB> show VIEW root.**; -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -| Timeseries|Database|DataType|Tags|Attributes|ViewType| SOURCE| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.view.status | root.db| INT32|null| null| VIEW| root.db.device.s01| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.d01.avg_temp| root.db| FLOAT|null| null| VIEW|(root.db.d01.temp01+root.db.d01.temp02)/2| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -Total line number = 2 -It costs 0.789s -IoTDB> -``` - -最后一列`SOURCE`显示了该序列视图的数据来源,列出了创建该序列的SQL语句。 - -##### 关于数据类型 - -上述两种查询都涉及视图的数据类型。视图的数据类型是根据定义视图的查询语句或别名序列的原始时间序列类型推断出来的。这个数据类型是根据当前系统的状态实时计算出来的,因此在不同时刻查询到的数据类型可能是改变的。 - -## FAQ - -#### Q1:我想让视图实现类型转换的功能。例如,原有一个int32类型的时间序列,和其他int64类型的序列被放在了同一个视图中。我现在希望通过视图查询到的数据,都能自动转换为int64类型。 - -> Ans:这不是序列视图的职能范围。但是可以使用`CAST`进行转换,比如: - -```SQL -CREATE VIEW root.db.device.int64_status -AS - SELECT CAST(s1, 'type'='INT64') from root.db.device -``` - -> 这样,查询`root.view.status`时,就会得到int64类型的结果。 -> -> 请特别注意,上述例子中,序列视图的数据是通过`CAST`转换得到的,因此`root.db.device.int64_status`并不是一条别名序列,也就**不支持写入**。 - -#### Q2:是否支持默认命名?选择若干时间序列,创建视图;但是我不指定每条序列的名字,由数据库自动命名? - -> Ans:不支持。用户必须明确指定命名。 - -#### Q3:在原有体系中,创建时间序列`root.db.device.s01`,可以发现自动创建了database`root.db`,自动创建了device`root.db.device`。接着删除时间序列`root.db.device.s01`,可以发现`root.db.device`被自动删除,`root.db`却还是保留的。对于创建视图,会沿用这一机制吗?出于什么考虑呢? - -> Ans:保持原有的行为不变,引入视图功能不会改变原有的这些逻辑。 - -#### Q4:是否支持序列视图重命名? 
- -> A:当前版本不支持重命名,可以自行创建新名称的视图投入使用。 \ No newline at end of file diff --git a/src/zh/UserGuide/dev-1.3/User-Manual/Streaming_timecho.md b/src/zh/UserGuide/dev-1.3/User-Manual/Streaming_timecho.md deleted file mode 100644 index 010ce7ce6..000000000 --- a/src/zh/UserGuide/dev-1.3/User-Manual/Streaming_timecho.md +++ /dev/null @@ -1,862 +0,0 @@ - - -# 流计算框架 - -IoTDB 流处理框架允许用户实现自定义的流处理逻辑,可以实现对存储引擎变更的监听和捕获、实现对变更数据的变形、实现对变形后数据的向外推送等逻辑。 - -我们将一个数据流处理任务称为 Pipe。一个流处理任务(Pipe)包含三个子任务: - -- 抽取(Source) -- 处理(Process) -- 发送(Sink) - -流处理框架允许用户使用 Java 语言自定义编写三个子任务的处理逻辑,通过类似 UDF 的方式处理数据。 -在一个 Pipe 中,上述的三个子任务分别由三种插件执行实现,数据会依次经过这三个插件进行处理: -Pipe Source 用于抽取数据,Pipe Processor 用于处理数据,Pipe Sink 用于发送数据,最终数据将被发至外部系统。 - -**Pipe 任务的模型如下:** - -![任务模型图](/img/1706697228308.jpg) - -描述一个数据流处理任务,本质就是描述 Pipe Source、Pipe Processor 和 Pipe Sink 插件的属性。 -用户可以通过 SQL 语句声明式地配置三个子任务的具体属性,通过组合不同的属性,实现灵活的数据 ETL 能力。 - -利用流处理框架,可以搭建完整的数据链路来满足端*边云同步、异地灾备、读写负载分库*等需求。 - -## 自定义流处理插件开发 - -### 编程开发依赖 - -推荐采用 maven 构建项目,在`pom.xml`中添加以下依赖。请注意选择和 IoTDB 服务器版本相同的依赖版本。 - -```xml - - org.apache.iotdb - pipe-api - 1.3.1 - provided - -``` - -### 事件驱动编程模型 - -流处理插件的用户编程接口设计,参考了事件驱动编程模型的通用设计理念。事件(Event)是用户编程接口中的数据抽象,而编程接口与具体的执行方式解耦,只需要专注于描述事件(数据)到达系统后,系统期望的处理方式即可。 - -在流处理插件的用户编程接口中,事件是数据库数据写入操作的抽象。事件由单机流处理引擎捕获,按照流处理三个阶段的流程,依次传递至 PipeSource 插件,PipeProcessor 插件和 PipeSink 插件,并依次在三个插件中触发用户逻辑的执行。 - -为了兼顾端侧低负载场景下的流处理低延迟和端侧高负载场景下的流处理高吞吐,流处理引擎会动态地在操作日志和数据文件中选择处理对象,因此,流处理的用户编程接口要求用户提供下列两类事件的处理逻辑:操作日志写入事件 TabletInsertionEvent 和数据文件写入事件 TsFileInsertionEvent。 - -#### **操作日志写入事件(TabletInsertionEvent)** - -操作日志写入事件(TabletInsertionEvent)是对用户写入请求的高层数据抽象,它通过提供统一的操作接口,为用户提供了操纵写入请求底层数据的能力。 - -对于不同的数据库部署方式,操作日志写入事件对应的底层存储结构是不一样的。对于单机部署的场景,操作日志写入事件是对写前日志(WAL)条目的封装;对于分布式部署的场景,操作日志写入事件是对单个节点共识协议操作日志条目的封装。 - -对于数据库不同写入请求接口生成的写入操作,操作日志写入事件对应的请求结构体的数据结构也是不一样的。IoTDB 提供了 InsertRecord、InsertRecords、InsertTablet、InsertTablets 等众多的写入接口,每一种写入请求都使用了完全不同的序列化方式,生成的二进制条目也不尽相同。 - -操作日志写入事件的存在,为用户提供了一种统一的数据操作视图,它屏蔽了底层数据结构的实现差异,极大地降低了用户的编程门槛,提升了功能的易用性。 - -```java -/** 
TabletInsertionEvent is used to define the event of data insertion. */ -public interface TabletInsertionEvent extends Event { - - /** - * The consumer processes the data row by row and collects the results by RowCollector. - * - * @return {@code Iterable} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable processRowByRow(BiConsumer consumer); - - /** - * The consumer processes the Tablet directly and collects the results by RowCollector. - * - * @return {@code Iterable} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable processTablet(BiConsumer consumer); -} -``` - -#### **数据文件写入事件(TsFileInsertionEvent)** - -数据文件写入事件(TsFileInsertionEvent) 是对数据库文件落盘操作的高层抽象,它是若干操作日志写入事件(TabletInsertionEvent)的数据集合。 - -IoTDB 的存储引擎是 LSM 结构的。数据写入时会先将写入操作落盘到日志结构的文件里,同时将写入数据保存在内存里。当内存达到控制上限,则会触发刷盘行为,即将内存中的数据转换为数据库文件,同时删除之前预写的操作日志。当内存中的数据转换为数据库文件中的数据时,会经过编码压缩和通用压缩两次压缩处理,因此数据库文件的数据相比内存中的原始数据占用的空间更少。 - -在极端的网络情况下,直接传输数据文件相比传输数据写入的操作要更加经济,它会占用更低的网络带宽,能实现更快的传输速度。当然,天下没有免费的午餐,对文件中的数据进行计算处理,相比直接对内存中的数据进行计算处理时,需要额外付出文件 I/O 的代价。但是,正是磁盘数据文件和内存写入操作两种结构各有优劣的存在,给了系统做动态权衡调整的机会,也正是基于这样的观察,插件的事件模型中才引入了数据文件写入事件。 - -综上,数据文件写入事件出现在流处理插件的事件流中,存在下面两种情况: - -(1)历史数据抽取:一个流处理任务开始前,所有已经落盘的写入数据都会以 TsFile 的形式存在。一个流处理任务开始后,采集历史数据时,历史数据将以 TsFileInsertionEvent 作为抽象; - -(2)实时数据抽取:一个流处理任务进行时,当数据流中实时处理操作日志写入事件的速度慢于写入请求速度一定进度之后,未来得及处理的操作日志写入事件会被被持久化至磁盘,以 TsFile 的形式存在,这一些数据被流处理引擎抽取到后,会以 TsFileInsertionEvent 作为抽象。 - -```java -/** - * TsFileInsertionEvent is used to define the event of writing TsFile. Event data stores in disks, - * which is compressed and encoded, and requires IO cost for computational processing. - */ -public interface TsFileInsertionEvent extends Event { - - /** - * The method is used to convert the TsFileInsertionEvent into several TabletInsertionEvents. 
- * - * @return {@code Iterable} the list of TabletInsertionEvent - */ - Iterable toTabletInsertionEvents(); -} -``` - -### 自定义流处理插件编程接口定义 - -基于自定义流处理插件编程接口,用户可以轻松编写数据抽取插件、数据处理插件和数据发送插件,从而使得流处理功能灵活适配各种工业场景。 - -#### 数据抽取插件接口 - -数据抽取是流处理数据从数据抽取到数据发送三阶段的第一阶段。数据抽取插件(PipeSource)是流处理引擎和存储引擎的桥梁,它通过监听存储引擎的行为, -捕获各种数据写入事件。 - -```java -/** - * PipeSource - * - *

PipeSource is responsible for capturing events from sources. - * - *

Various data sources can be supported by implementing different PipeSource classes. - * - *

The lifecycle of a PipeSource is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH SOURCE` clause in SQL are - * parsed and the validation method {@link PipeSource#validate(PipeParameterValidator)} will - * be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} will be called to - * configure the runtime behavior of the PipeSource. - *
  • Then the method {@link PipeSource#start()} will be called to start the PipeSource. - *
  • While the collaboration task is in progress, the method {@link PipeSource#supply()} will be - * called to capture events from sources and then the events will be passed to the - * PipeProcessor. - *
  • The method {@link PipeSource#close()} will be called when the collaboration task is - * cancelled (the `DROP PIPE` command is executed). - *
- */ -public interface PipeSource extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSource. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeSourceRuntimeConfiguration. - *
- * - *

This method is called after the method {@link PipeSource#validate(PipeParameterValidator)} - * is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSource - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSourceRuntimeConfiguration configuration) - throws Exception; - - /** - * Start the Source. After this method is called, events should be ready to be supplied by - * {@link PipeSource#supply()}. This method is called after {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @throws Exception the user can throw errors if necessary - */ - void start() throws Exception; - - /** - * Supply single event from the Source and the caller will send the event to the processor. - * This method is called after {@link PipeSource#start()} is called. - * - * @return the event to be supplied. the event may be null if the Source has no more events at - * the moment, but the Source is still running for more events. - * @throws Exception the user can throw errors if necessary - */ - Event supply() throws Exception; -} -``` - -#### 数据处理插件接口 - -数据处理是流处理数据从数据抽取到数据发送三阶段的第二阶段。数据处理插件(PipeProcessor)主要用于过滤和转换由数据抽取插件(PipeSource)捕获的 -各种事件。 - -```java -/** - * PipeProcessor - * - *

PipeProcessor is used to filter and transform the Event formed by the PipeSource. - * - *

The lifecycle of a PipeProcessor is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH PROCESSOR` clause in SQL are - * parsed and the validation method {@link PipeProcessor#validate(PipeParameterValidator)} - * will be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} will be called - * to configure the runtime behavior of the PipeProcessor. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeSource captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the events and then passes them to the PipeSink. The - * following 3 methods will be called: {@link - * PipeProcessor#process(TabletInsertionEvent, EventCollector)}, {@link - * PipeProcessor#process(TsFileInsertionEvent, EventCollector)} and {@link - * PipeProcessor#process(Event, EventCollector)}. - *
    • PipeSink serializes the events into binaries and sends them to sinks. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeProcessor#close() } method will be called. - *
- */ -public interface PipeProcessor extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeProcessor. In this method, the user can do the - * following things: - * - *
    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeProcessorRuntimeConfiguration. - *
- * - *

This method is called after the method {@link - * PipeProcessor#validate(PipeParameterValidator)} is called and before the beginning of the - * events processing. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeProcessor - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeProcessorRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is called to process the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(TabletInsertionEvent tabletInsertionEvent, EventCollector eventCollector) - throws Exception; - - /** - * This method is called to process the TsFileInsertionEvent. - * - * @param tsFileInsertionEvent TsFileInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - default void process(TsFileInsertionEvent tsFileInsertionEvent, EventCollector eventCollector) - throws Exception { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - process(tabletInsertionEvent, eventCollector); - } - } - - /** - * This method is called to process the Event. - * - * @param event Event to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(Event event, EventCollector eventCollector) throws Exception; -} -``` - -#### 数据发送插件接口 - -数据发送是流处理数据从数据抽取到数据发送三阶段的第三阶段。数据发送插件(PipeSink)主要用于发送经由数据处理插件(PipeProcessor)处理过后的 -各种事件,它作为流处理框架的网络实现层,接口上应允许接入多种实时通信协议和多种连接器。 - -```java -/** - * PipeSink - * - *

PipeSink is responsible for sending events to sinks. - * - *

Various network protocols can be supported by implementing different PipeSink classes. - * - *

The lifecycle of a PipeSink is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH SINK` clause in SQL are - * parsed and the validation method {@link PipeSink#validate(PipeParameterValidator)} will be - * called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link PipeSink#customize(PipeParameters, - * PipeSinkRuntimeConfiguration)} will be called to configure the runtime behavior of the - * PipeSink and the method {@link PipeSink#handshake()} will be called to create a connection - * with the sink. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeSource captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the events and then passes them to the PipeSink. - *
    • PipeSink serializes the events into binaries and sends them to sinks. The following 3 - * methods will be called: {@link PipeSink#transfer(TabletInsertionEvent)}, {@link - * PipeSink#transfer(TsFileInsertionEvent)} and {@link PipeSink#transfer(Event)}. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeSink#close() } method will be called. - *
- * - *

In addition, the method {@link PipeSink#heartbeat()} will be called periodically to check - * whether the connection with sink is still alive. The method {@link PipeSink#handshake()} will be - * called to create a new connection with the sink when the method {@link PipeSink#heartbeat()} - * throws exceptions. - */ -public interface PipeSink extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSink. In this method, the user can do the following - * things: - * - *

    - *
  • Use PipeParameters to parse key-value pair attributes entered by the user. - *
  • Set the running configurations in PipeSinkRuntimeConfiguration. - *
- * - *

This method is called after the method {@link PipeSink#validate(PipeParameterValidator)} is - * called and before the method {@link PipeSink#handshake()} is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSink - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSinkRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is used to create a connection with sink. This method will be called after the - * method {@link PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called or - * will be called when the method {@link PipeSink#heartbeat()} throws exceptions. - * - * @throws Exception if the connection is failed to be created - */ - void handshake() throws Exception; - - /** - * This method will be called periodically to check whether the connection with sink is still - * alive. - * - * @throws Exception if the connection dies - */ - void heartbeat() throws Exception; - - /** - * This method is used to transfer the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(TabletInsertionEvent tabletInsertionEvent) throws Exception; - - /** - * This method is used to transfer the TsFileInsertionEvent. 
- * - * @param tsFileInsertionEvent TsFileInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - default void transfer(TsFileInsertionEvent tsFileInsertionEvent) throws Exception { - try { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - transfer(tabletInsertionEvent); - } - } finally { - tsFileInsertionEvent.close(); - } - } - - /** - * This method is used to transfer the generic events, including HeartbeatEvent. - * - * @param event Event to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(Event event) throws Exception; -} -``` - -## 自定义流处理插件管理 - -为了保证用户自定义插件在实际生产中的灵活性和易用性,系统还需要提供对插件进行动态统一管理的能力。 -本章节介绍的流处理插件管理语句提供了对插件进行动态统一管理的入口。 - -### 加载插件语句 - -在 IoTDB 中,若要在系统中动态载入一个用户自定义插件,则首先需要基于 PipeSource、 PipeProcessor 或者 PipeSink 实现一个具体的插件类, -然后需要将插件类编译打包成 jar 可执行文件,最后使用加载插件的管理语句将插件载入 IoTDB。 - -加载插件的管理语句的语法如图所示。 - -```sql -CREATE PIPEPLUGIN [IF NOT EXISTS] <别名> -AS <全类名> -USING -``` - -**IF NOT EXISTS 语义**:用于创建操作中,确保当指定 Pipe Plugin 不存在时,执行创建命令,防止因尝试创建已存在的 Pipe Plugin 而导致报错。 - -示例:假如用户实现了一个全类名为edu.tsinghua.iotdb.pipe.ExampleProcessor 的数据处理插件,打包后的jar包为 pipe-plugin.jar ,用户希望在流处理引擎中使用这个插件,将插件标记为 example。插件包有两种使用方式,一种为上传到URI服务器,一种为上传到集群本地目录,两种方法任选一种即可。 - -【方式一】上传到URI服务器 - -准备工作:使用该种方式注册,您需要提前将 JAR 包上传到 URI 服务器上并确保执行注册语句的IoTDB实例能够访问该 URI 服务器。例如 https://example.com:8080/iotdb/pipe-plugin.jar 。 - -创建语句: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI -``` - -【方式二】上传到集群本地目录 - -准备工作:使用该种方式注册,您需要提前将 JAR 包放置到DataNode节点所在机器的任意路径下,推荐您将JAR包放在IoTDB安装路径的/ext/pipe目录下(安装包中已有,无需新建)。例如:iotdb-1.x.x-bin/ext/pipe/pipe-plugin.jar。(**注意:如果您使用的是集群,那么需要将 JAR 包放置到每个 DataNode 节点所在机器的该路径下)** - -创建语句: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 
'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI -``` - -### 删除插件语句 - -当用户不再想使用一个插件,需要将插件从系统中卸载时,可以使用如图所示的删除插件语句。 - -```sql -DROP PIPEPLUGIN [IF EXISTS] <别名> -``` - -**IF EXISTS 语义**:用于删除操作中,确保当指定 Pipe Plugin 存在时,执行删除命令,防止因尝试删除不存在的 Pipe Plugin 而导致报错。 - -### 查看插件语句 - -用户也可以按需查看系统中的插件。查看插件的语句如图所示。 - -```sql -SHOW PIPEPLUGINS -``` - -## 系统预置的流处理插件 - -### 预置 source 插件 - -#### iotdb-source - -作用:抽取 IoTDB 内部的历史或实时数据进入 pipe。 - - -| key | value | value 取值范围 | required or optional with default | -|---------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------|-----------------------------------| -| source | iotdb-source | String: iotdb-source | required | -| source.pattern | 用于筛选时间序列的路径前缀 | String: 任意的时间序列前缀 | optional: root | -| source.history.start-time | 抽取的历史数据的开始 event time,包含 start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| source.history.end-time | 抽取的历史数据的结束 event time,包含 end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| start-time(V1.3.1+) | start of synchronizing all data event time,including start-time. Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| end-time(V1.3.1+) | end of synchronizing all data event time,including end-time. 
Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| source.realtime.mode | 实时数据的抽取模式 | String: hybrid, log, file | optional: hybrid | -| source.forwarding-pipe-requests | 是否抽取由其他 Pipe (通常是数据同步)写入的数据 | Boolean: true, false | optional: true | - -> 🚫 **source.pattern 参数说明** -> -> * Pattern 需用反引号修饰不合法字符或者是不合法路径节点,例如如果希望筛选 root.\`a@b\` 或者 root.\`123\`,应设置 pattern 为 root.\`a@b\` 或者 root.\`123\`(具体参考 [单双引号和反引号的使用时机](https://iotdb.apache.org/zh/Download/#_1-0-版本不兼容的语法详细说明)) -> * 在底层实现中,当检测到 pattern 为 root(默认值)时,抽取效率较高,其他任意格式都将降低性能 -> * 路径前缀不需要能够构成完整的路径。例如,当创建一个包含参数为 'source.pattern'='root.aligned.1' 的 pipe 时: - > - > * root.aligned.1TS - > * root.aligned.1TS.\`1\` -> * root.aligned.100T - > - > 的数据会被抽取; - > - > * root.aligned.\`1\` -> * root.aligned.\`123\` - > - > 的数据不会被抽取。 - -> ❗️**source.history 的 start-time,end-time 参数说明** -> -> * start-time,end-time 应为 ISO 格式,例如 2011-12-03T10:15:30 或 2011-12-03T10:15:30+01:00 - -> ✅ **一条数据从生产到落库 IoTDB,包含两个关键的时间概念** -> -> * **event time:** 数据实际生产时的时间(或者数据生产系统给数据赋予的生成时间,是数据点中的时间项),也称为事件时间。 -> * **arrival time:** 数据到达 IoTDB 系统内的时间。 -> -> 我们常说的乱序数据,指的是数据到达时,其 **event time** 远落后于当前系统时间(或者已经落库的最大 **event time**)的数据。另一方面,不论是乱序数据还是顺序数据,只要它们是新到达系统的,那它们的 **arrival time** 都是会随着数据到达 IoTDB 的顺序递增的。 - -> 💎 **iotdb-source 的工作可以拆分成两个阶段** -> -> 1. 历史数据抽取:所有 **arrival time** < 创建 pipe 时**当前系统时间**的数据称为历史数据 -> 2. 
实时数据抽取:所有 **arrival time** >= 创建 pipe 时**当前系统时间**的数据称为实时数据 -> -> 历史数据传输阶段和实时数据传输阶段,**两阶段串行执行,只有当历史数据传输阶段完成后,才执行实时数据传输阶段。** - -> 📌 **source.realtime.mode:数据抽取的模式** -> -> * log:该模式下,任务仅使用操作日志进行数据处理、发送 -> * file:该模式下,任务仅使用数据文件进行数据处理、发送 -> * hybrid:该模式,考虑了按操作日志逐条目发送数据时延迟低但吞吐低的特点,以及按数据文件批量发送时发送吞吐高但延迟高的特点,能够在不同的写入负载下自动切换适合的数据抽取方式,首先采取基于操作日志的数据抽取方式以保证低发送延迟,当产生数据积压时自动切换成基于数据文件的数据抽取方式以保证高发送吞吐,积压消除时自动切换回基于操作日志的数据抽取方式,避免了采用单一数据抽取算法难以平衡数据发送延迟或吞吐的问题。 - -> 🍕 **source.forwarding-pipe-requests:是否允许转发从另一 pipe 传输而来的数据** -> -> * 如果要使用 pipe 构建 A -> B -> C 的数据同步,那么 B -> C 的 pipe 需要将该参数为 true 后,A -> B 中 A 通过 pipe 写入 B 的数据才能被正确转发到 C -> * 如果要使用 pipe 构建 A \<-> B 的双向数据同步(双活),那么 A -> B 和 B -> A 的 pipe 都需要将该参数设置为 false,否则将会造成数据无休止的集群间循环转发 - -### 预置 processor 插件 - -#### do-nothing-processor - -作用:不对 source 传入的事件做任何的处理。 - - -| key | value | value 取值范围 | required or optional with default | -|-----------|----------------------|------------------------------|-----------------------------------| -| processor | do-nothing-processor | String: do-nothing-processor | required | - -### 预置 sink 插件 - -#### do-nothing-sink - -作用:不对 processor 传入的事件做任何的处理。 - - -| key | value | value 取值范围 | required or optional with default | -|------|-----------------|-------------------------|-----------------------------------| -| sink | do-nothing-sink | String: do-nothing-sink | required | - -## 流处理任务管理 - -### 创建流处理任务 - -使用 `CREATE PIPE` 语句来创建流处理任务。以数据同步流处理任务的创建为例,示例 SQL 语句如下: - -```sql -CREATE PIPE -- PipeId 是能够唯一标定流处理任务的名字 -WITH SOURCE ( - -- 默认的 IoTDB 数据抽取插件 - 'source' = 'iotdb-source', - -- 路径前缀,只有能够匹配该路径前缀的数据才会被抽取,用作后续的处理和发送 - 'source.pattern' = 'root.timecho', - -- 是否抽取历史数据 - 'source.history.enable' = 'true', - -- 描述被抽取的历史数据的时间范围,表示最早时间 - 'source.history.start-time' = '2011.12.03T10:15:30+01:00', - -- 描述被抽取的历史数据的时间范围,表示最晚时间 - 'source.history.end-time' = '2022.12.03T10:15:30+01:00', - -- 是否抽取实时数据 - 'source.realtime.enable' = 'true', - -- 描述实时数据的抽取方式 - 'source.realtime.mode' = 'hybrid', -) -WITH PROCESSOR ( - -- 
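Default data processing plugin, which does nothing
-  'processor'                 = 'do-nothing-processor',
-)
-WITH SINK (
-  -- IoTDB data sending plugin; the target is IoTDB
-  'sink'                      = 'iotdb-thrift-sink',
-  -- Data service ip of one DataNode in the target IoTDB
-  'sink.ip'                   = '127.0.0.1',
-  -- Data service port of one DataNode in the target IoTDB
-  'sink.port'                 = '6667',
-)
-```
-
-**Creating a stream processing task requires a PipeId and the parameters of the three plugin sections:**
-
-
-| Item      | Description | Required | Default implementation | Default implementation notes | Custom implementation allowed |
-|-----------|-------------|----------|------------------------|------------------------------|-------------------------------|
-| PipeId    | Name that globally and uniquely identifies a stream processing task | Required | - | - | - |
-| source    | Pipe Source plugin, extracts stream data at the lower layer of the database | Optional | iotdb-source | Feeds the full historical data of the database and subsequently arriving realtime data into the task | No |
-| processor | Pipe Processor plugin, processes the data | Optional | do-nothing-processor | Does nothing with the incoming data | |
-| sink      | Pipe Sink plugin, sends the data | Required | - | - | |
-
-The example builds the task from the iotdb-source, do-nothing-processor, and iotdb-thrift-sink plugins. IoTDB also ships other built-in stream processing plugins; **see the "System preset stream processing plugins" section**.
-
-**A minimal CREATE PIPE statement looks like this:**
-
-```sql
-CREATE PIPE <PipeId> -- PipeId is the name that uniquely identifies the stream processing task
-WITH SINK (
-  -- IoTDB data sending plugin; the target is IoTDB
-  'sink'      = 'iotdb-thrift-sink',
-  -- Data service ip of one DataNode in the target IoTDB
-  'sink.ip'   = '127.0.0.1',
-  -- Data service port of one DataNode in the target IoTDB
-  'sink.port' = '6667',
-)
-```
-
-Its meaning: synchronize all historical data in this database instance, and all realtime data that arrives later, to the IoTDB instance at 127.0.0.1:6667.
-
-**Notes:**
-
-- SOURCE and PROCESSOR are optional. If they are omitted, the system uses the corresponding default implementations
-- SINK is required and must be declared in the CREATE PIPE statement
-- SINKs are reused automatically. If different stream processing tasks declare SINKs with exactly the same KV attributes (every key maps to the same value), **the system creates only one SINK instance** so that connection resources are shared.
-
-  - For example, consider the declarations of two tasks, pipe1 and pipe2:
-
-  ```sql
-  CREATE PIPE pipe1
-  WITH SINK (
-    'sink' = 'iotdb-thrift-sink',
-    'sink.ip' = 'localhost',
-    'sink.port' = '9999',
-  )
-
-  CREATE PIPE pipe2
-  WITH SINK (
-    'sink' = 'iotdb-thrift-sink',
-    'sink.port' = '9999',
-    'sink.ip' = 'localhost',
-  )
-  ```
-
-  - Because their SINK declarations are identical (**even though some attributes are declared in a different order**), the framework reuses the declared SINK, and pipe1 and pipe2 end up sharing the same SINK instance.
-- When the source is the default iotdb-source and source.forwarding-pipe-requests has its default value true, do not build scenarios that synchronize data in a cycle (this causes an infinite loop):
-
-  - IoTDB A -> IoTDB B -> IoTDB A
-  - IoTDB A -> IoTDB A
-
-### Start a stream processing task
-
-In V1.3.0, after a CREATE PIPE statement succeeds, the instances of the task are created but the task state is set to STOPPED, so the task does not process data immediately. In version 1.3.1 and later, the task state is set to RUNNING immediately after creation.
-
-Use the START PIPE statement to make a task start processing data:
-
-```sql
-START PIPE <PipeId>
-```
-
-### Stop a stream processing task
-
-Use the STOP PIPE statement to make a task stop processing data:
-
-```sql
-STOP PIPE <PipeId>
-```
-
-### Drop a stream processing task
-
-Use the DROP PIPE statement to stop a task from processing data (if its state is RUNNING) and then delete the whole task:
-
-```sql
-DROP PIPE <PipeId>
-```
-
-There is no need to run STOP before dropping a task.
-
-### Show stream processing tasks
-
-Use the SHOW PIPES statement to list all stream processing tasks:
-
-```sql
-SHOW PIPES
-```
-
-The query result looks like this:
-
-```sql
-+-----------+-----------------------+-------+----------+-------------+--------+----------------+
-|         ID|          CreationTime |  State|PipeSource|PipeProcessor|PipeSink|ExceptionMessage|
-+-----------+-----------------------+-------+----------+-------------+--------+----------------+
-|iotdb-kafka|2022-03-30T20:58:30.689|RUNNING|       ...|          ...|     ...|              {}|
-+-----------+-----------------------+-------+----------+-------------+--------+----------------+
-|iotdb-iotdb|2022-03-31T12:55:28.129|STOPPED|       ...|          ...|     ...| TException: ...|
-+-----------+-----------------------+-------+----------+-------------+--------+----------------+
-```
-
-Use `<PipeId>` to show the state of one specific task:
-
-```sql
-SHOW PIPE <PipeId>
-```
-
-You can also use a where clause to check how the Pipe Sink used by a given \<PipeId\> is being reused.
-
-```sql
-SHOW PIPES
-WHERE SINK USED BY <PipeId>
-```
-
-### State transitions of a stream processing task
-
-A stream processing pipe passes through several states during its lifecycle:
-
-- **RUNNING:** the pipe is working normally
-  - After a pipe is created successfully, its initial state is RUNNING (V1.3.1+)
-- **STOPPED:** the pipe is stopped. A pipe can be in this state for the following reasons:
-  - After a pipe is created successfully, its initial state is STOPPED (V1.3.0)
-  - The user manually pauses a running pipe, which changes its state from RUNNING to STOPPED
-  - An unrecoverable error occurs while the pipe is running, which automatically changes its state from RUNNING to STOPPED
-- **DROPPED:** the pipe task has been permanently deleted
-
-The following figure shows all states and the transitions between them:
-
-![State transition diagram](/img/%E7%8A%B6%E6%80%81%E8%BF%81%E7%A7%BB%E5%9B%BE.png)
-
-## Permission management
-
-### Stream processing tasks
-
-
-| Permission | Description |
-|------------|-------------|
-| USE_PIPE   | Register a stream processing task. Path-independent. |
-| USE_PIPE   | Start a stream processing task. Path-independent. |
-| USE_PIPE   | Stop a stream processing task. Path-independent. |
-| USE_PIPE   | 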
Drop a stream processing task. Path-independent. |
-| USE_PIPE   | Query stream processing tasks. Path-independent. |
-
-### Stream processing task plugins
-
-
-| Permission | Description |
-|------------|-------------|
-| USE_PIPE   | Register a stream processing task plugin. Path-independent. |
-| USE_PIPE   | Drop a stream processing task plugin. Path-independent. |
-| USE_PIPE   | Query stream processing task plugins. Path-independent. |
-
-## Configuration parameters
-
-In iotdb-system.properties:
-
-V1.3.0+:
-```Properties
-####################
-### Pipe Configuration
-####################
-
-# Uncomment the following field to configure the pipe lib directory.
-# For Windows platform
-# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is
-# absolute. Otherwise, it is relative.
-# pipe_lib_dir=ext\\pipe
-# For Linux platform
-# If its prefix is "/", then the path is absolute. Otherwise, it is relative.
-# pipe_lib_dir=ext/pipe
-
-# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor.
-# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)).
-# pipe_subtask_executor_max_thread_num=5
-
-# The connection timeout (in milliseconds) for the thrift client.
-# pipe_connector_timeout_ms=900000
-
-# The maximum number of selectors that can be used in the async connector.
-# pipe_async_connector_selector_number=1
-
-# The core number of clients that can be used in the async connector.
-# pipe_async_connector_core_client_number=8
-
-# The maximum number of clients that can be used in the async connector.
-# pipe_async_connector_max_client_number=16
-
-# Whether to enable receiving pipe data through air gap.
-# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully.
-# pipe_air_gap_receiver_enabled=false
-
-# The port for the server to receive pipe data through air gap.
-# pipe_air_gap_receiver_port=9780
-```
-
-V1.3.1+:
-```Properties
-# Uncomment the following field to configure the pipe lib directory.
-# For Windows platform
-# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is
-# absolute. Otherwise, it is relative.
-# pipe_lib_dir=ext\\pipe
-# For Linux platform
-# If its prefix is "/", then the path is absolute. Otherwise, it is relative.
-# pipe_lib_dir=ext/pipe
-
-# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor.
-# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)).
-# pipe_subtask_executor_max_thread_num=5
-
-# The connection timeout (in milliseconds) for the thrift client.
-# pipe_sink_timeout_ms=900000
-
-# The maximum number of selectors that can be used in the sink.
-# Recommend to set this value to less than or equal to pipe_sink_max_client_number.
-# pipe_sink_selector_number=4
-
-# The maximum number of clients that can be used in the sink.
-# pipe_sink_max_client_number=16
-
-# Whether to enable receiving pipe data through air gap.
-# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully.
-# pipe_air_gap_receiver_enabled=false
-
-# The port for the server to receive pipe data through air gap.
-# pipe_air_gap_receiver_port=9780
-```
diff --git a/src/zh/UserGuide/dev-1.3/User-Manual/Tiered-Storage_timecho.md b/src/zh/UserGuide/dev-1.3/User-Manual/Tiered-Storage_timecho.md
deleted file mode 100644
index 960450334..000000000
--- a/src/zh/UserGuide/dev-1.3/User-Manual/Tiered-Storage_timecho.md
+++ /dev/null
@@ -1,97 +0,0 @@
-
-
-# Tiered Storage
-## Overview
-
-Tiered storage lets users manage multiple kinds of storage media. With it, users can configure different types of storage media for IoTDB and rank them into tiers. In IoTDB, tiered storage is configured through multi-directory management: several storage directories can be grouped into one class and configured as a single tier, called a storage tier. Users can also classify data as hot or cold and store each class in a designated tier. IoTDB currently classifies hot and cold data by TTL: when data in a tier no longer satisfies the TTL rule defined for that tier, it is migrated automatically to the next tier.
-
-## Parameter definitions
-
-Enabling tiered storage in IoTDB requires the following configuration:
-
-1. Configure the data directories and divide them into tiers
-2. Configure the TTL of the data managed by each tier, to distinguish the hot and cold data classes managed by different tiers.
-3.
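Configure the minimum remaining storage space ratio of each tier. When the storage space of a tier reaches this threshold, the data of that tier is migrated automatically to the next tier (optional).
-
-The parameters and their descriptions are as follows.
-
-| Parameter | Default | Description | Constraints |
-| --------------------------------------- | ------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| dn_data_dirs | data/datanode/data | Specifies the storage directories and divides them into tiers | Tiers are separated by semicolons and directories within a tier by commas; cloud storage can only be the last tier and the first tier cannot be cloud storage; at most one cloud object can be configured; the remote storage directory is written as OBJECT_STORAGE |
-| tier_ttl_in_ms | -1 | Defines the data range each tier is responsible for, expressed as a TTL | Tiers are separated by semicolons; the number of tiers must match the number defined in dn_data_dirs; "-1" means "unlimited" |
-| dn_default_space_usage_thresholds | 0.85 | Defines the maximum space usage ratio of the data directories in each tier; when usage exceeds this ratio, data is migrated automatically to the next tier; when the usage of the last tier exceeds this threshold, the system is set to READ_ONLY | Tiers are separated by semicolons; the number of tiers must match the number defined in dn_data_dirs |
-| object_storage_type | AWS_S3 | Cloud storage type | IoTDB currently supports only AWS S3 as the remote storage type; this parameter cannot be changed |
-| object_storage_bucket | iotdb_data | Name of the cloud storage bucket | The bucket as defined in AWS S3; not needed if remote storage is not used |
-| object_storage_endpoint | | Cloud storage endpoint | The AWS S3 endpoint; not needed if remote storage is not used |
-| object_storage_access_key | | Cloud storage credential key | The AWS S3 credential key; not needed if remote storage is not used |
-| object_storage_access_secret | | Cloud storage credential secret | The AWS S3 credential secret; not needed if remote storage is not used |
-| remote_tsfile_cache_dirs | data/datanode/data/cache | Local cache directory for cloud storage | Not needed if remote storage is not used |
-| remote_tsfile_cache_page_size_in_kb | 20480 | Block size of the local cache files for cloud storage | Not needed if remote storage is not used |
-| remote_tsfile_cache_max_disk_usage_in_mb | 51200 | Maximum disk usage of the local cache for cloud storage | Not needed if remote storage is not used |
-
-
-## Local tiered storage configuration example
-
-The following is a configuration example with two local storage tiers.
-
-```JavaScript
-// Required settings
-dn_data_dirs=/data1/data;/data2/data,/data3/data
-tier_ttl_in_ms=86400000;-1
-dn_default_space_usage_thresholds=0.2;0.1
-```
-
-This example configures two storage tiers:
-
-| **Tier** | **Data directories** | **Data range** | **Minimum remaining disk space threshold** |
-| -------- | -------------------------------------- | --------------- | ------------------------ |
-| Tier 1 | Directory 1: /data1/data | Data from the last 1 day | 20% |
-| Tier 2 | Directory 1: /data2/data; Directory 2: /data3/data | Data older than 1 day | 10% |
-
-## Remote tiered storage configuration example
-
-The following example uses three storage tiers:
-
-```JavaScript
-// 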
Required settings
-dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE
-tier_ttl_in_ms=86400000;864000000;-1
-dn_default_space_usage_thresholds=0.2;0.15;0.1
-object_storage_type=AWS_S3
-object_storage_bucket=iotdb
-object_storage_endpoint=<your_endpoint>
-object_storage_access_key=<your_access_key>
-object_storage_access_secret=<your_access_secret>
-
-// Optional settings
-remote_tsfile_cache_dirs=data/datanode/data/cache
-remote_tsfile_cache_page_size_in_kb=20971520
-remote_tsfile_cache_max_disk_usage_in_mb=53687091200
-```
-
-This example configures three storage tiers:
-
-| **Tier** | **Data directories** | **Data range** | **Minimum remaining disk space threshold** |
-| -------- | -------------------------------------- | ---------------------------- | ------------------------ |
-| Tier 1 | Directory 1: /data1/data | Data from the last 1 day | 20% |
-| Tier 2 | Directory 1: /data2/data; Directory 2: /data3/data | Data between 1 day and 10 days old | 15% |
-| Tier 3 | Remote AWS S3 storage | Data older than 10 days | 10% |
\ No newline at end of file
diff --git a/src/zh/UserGuide/dev-1.3/User-Manual/User-defined-function_timecho.md b/src/zh/UserGuide/dev-1.3/User-Manual/User-defined-function_timecho.md
deleted file mode 100644
index 74d8f4baf..000000000
--- a/src/zh/UserGuide/dev-1.3/User-Manual/User-defined-function_timecho.md
+++ /dev/null
@@ -1,927 +0,0 @@
-# UDF
-
-## 1. Introduction to UDFs
-
-A UDF (User Defined Function) is a function defined by the user. IoTDB provides many built-in functions for time series processing and also supports custom functions to meet further computing needs.
-
-IoTDB supports two types of UDFs, as shown in the table below.
-
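-| UDF type | Data access strategy | Description |
-| -------- | -------------------- | ----------- |
-| UDTF | MAPPABLE_ROW_BY_ROW | User-defined scalar function. It takes k columns and 1 row of time series data as input and outputs 1 column and 1 row. It can be used in any clause or expression where a scalar function may appear, such as the SELECT clause and the WHERE clause. |
-| UDTF | ROW_BY_ROW, SLIDING_TIME_WINDOW, SLIDING_SIZE_WINDOW, SESSION_TIME_WINDOW, STATE_WINDOW | User-defined time series generating function. It takes k columns and m rows of time series data as input and outputs 1 column and n rows; the number of input rows m may differ from the number of output rows n. It can only be used in the SELECT clause. |
-| UDAF | - | User-defined aggregate function. It takes k columns and m rows of time series data as input and outputs 1 column and 1 row. It can be used in any clause or expression where an aggregate function may appear, such as the SELECT clause and the HAVING clause. |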
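
The three input/output shapes in the table can be illustrated without any IoTDB dependency. The sketch below is plain Java and is not the IoTDB UDF API: the class `UdfShapes` and its three methods are hypothetical names chosen only for illustration, and each method works on a single column for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the three UDF shapes; this is NOT the IoTDB UDF API. */
public class UdfShapes {

  /** Scalar shape (MAPPABLE_ROW_BY_ROW): one input row of k values yields exactly one output value. */
  public static long scalarAdd(long a, long b) {
    return a + b;
  }

  /** Generating shape (ROW_BY_ROW): m input rows yield n output rows; null rows are skipped here, so n <= m. */
  public static List<Long> generateNonNull(List<Long> rows) {
    List<Long> out = new ArrayList<>();
    for (Long value : rows) {
      if (value != null) {
        out.add(value);
      }
    }
    return out;
  }

  /** Aggregate shape (UDAF): m input rows yield exactly one output value, here the average of non-null rows. */
  public static double aggregateAvg(List<Long> rows) {
    double sum = 0;
    long count = 0;
    for (Long value : rows) {
      if (value != null) {
        sum += value;
        count++;
      }
    }
    return count == 0 ? 0.0 : sum / count;
  }

  public static void main(String[] args) {
    List<Long> rows = new ArrayList<>();
    rows.add(1L);
    rows.add(null);
    rows.add(3L);
    System.out.println(scalarAdd(1L, 2L));
    System.out.println(generateNonNull(rows).size());
    System.out.println(aggregateAvg(rows));
  }
}
```

In a real UDF the framework, not the caller, drives these calls, and the shape is selected through the access strategy rather than through separate helper methods.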
-
-### 1.1 Using UDFs
-
-UDFs are used in the same way as ordinary built-in functions and can be called directly in a SELECT statement.
-
-#### 1. Supported basic SQL syntax
-
-* `SLIMIT` / `SOFFSET`
-* `LIMIT` / `OFFSET`
-* Value filtering
-* Time filtering
-
-
-#### 2. Queries with *
-
-Assume the time series `root.sg.d1.s1` and `root.sg.d1.s2` exist.
-
-* **Run `SELECT example(*) from root.sg.d1`**
-
-The result set contains the results of `example(root.sg.d1.s1)` and `example(root.sg.d1.s2)`.
-
-* **Run `SELECT example(s1, *) from root.sg.d1`**
-
-The result set contains the results of `example(root.sg.d1.s1, root.sg.d1.s1)` and `example(root.sg.d1.s1, root.sg.d1.s2)`.
-
-* **Run `SELECT example(*, *) from root.sg.d1`**
-
-The result set contains the results of `example(root.sg.d1.s1, root.sg.d1.s1)`, `example(root.sg.d1.s2, root.sg.d1.s1)`, `example(root.sg.d1.s1, root.sg.d1.s2)`, and `example(root.sg.d1.s2, root.sg.d1.s2)`.
-
-#### 3. Queries with custom input parameters
-
-You can pass any number of key-value parameters to a UDF in a query. Both the key and the value must be enclosed in single or double quotes. Note that key-value parameters can only be passed after all time series arguments. Examples:
-
-``` sql
-SELECT example(s1, 'key1'='value1', 'key2'='value2'), example(*, 'key3'='value3') FROM root.sg.d1;
-SELECT example(s1, s2, 'key1'='value1', 'key2'='value2') FROM root.sg.d1;
-```
-
-#### 4. Nesting with other queries
-
-Examples:
-``` sql
-SELECT s1, s2, example(s1, s2) FROM root.sg.d1;
-SELECT *, example(*) FROM root.sg.d1 DISABLE ALIGN;
-SELECT s1 * example(* / s1 + s2) FROM root.sg.d1;
-SELECT s1, s2, s1 + example(s1, s2), s1 - example(s1 + example(s1, s2) / s2) FROM root.sg.d1;
-```
-
-
-## 2. UDF management
-
-### 2.1 UDF registration
-
-Register a UDF with the following steps:
-
-1. Implement a complete UDF class; assume its fully qualified class name is `org.apache.iotdb.udf.UDTFExample`
-2. Package the project as a JAR. If you manage the project with Maven, see the [Maven project example](https://github.com/apache/iotdb/tree/master/example/udf)
-3. Prepare for registration. The preparation depends on the registration method; see the examples below
-4.
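Register the UDF with the following SQL statement
-
-```sql
-CREATE FUNCTION <UDF-NAME> AS <UDF-CLASS-FULL-PATHNAME> (USING URI URI-STRING)
-```
-
-#### Example: register a UDF named `example` using either of the two methods below
-
-#### Method 1: place the JAR manually
-
-Preparation:
-With this method, place the JAR in the `ext/udf` directory (configurable) of every node in the cluster before registering.
-
-Registration statement:
-
-```sql
-CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample'
-```
-
-#### Method 2: let the cluster install the JAR automatically from a URI
-
-Preparation:
-With this method, upload the JAR to a URI server before registering, and make sure the IoTDB instance that runs the registration statement can reach that server.
-
-Registration statement:
-
-```sql
-CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' USING URI 'http://jar/example.jar'
-```
-
-IoTDB downloads the JAR and synchronizes it to the whole cluster.
-
-#### Notes
-
-1. IoTDB loads UDFs dynamically through reflection, so the server does not need to be restarted during loading.
-
-2. UDF names are case-insensitive.
-
-3. Do not register a UDF under the name of a built-in function. Registration with a built-in function name fails.
-
-4. Different JARs should not contain classes with the same fully qualified name but different logic. For example, suppose the UDFs (UDAF/UDTF) `udf1` and `udf2` correspond to `udf1.jar` and `udf2.jar`. If both JARs contain a class `org.apache.iotdb.udf.UDTFExample` and one SQL statement uses both UDFs, the system loads one of the two classes at random, which makes UDF behavior inconsistent.
-
-### 2.2 UDF deregistration
-
-SQL syntax:
-
-```sql
-DROP FUNCTION <UDF-NAME>
-```
-
-Example: drop the UDF from the example above:
-
-```sql
-DROP FUNCTION example
-```
-
-Note: for functions registered with USING URI, you need to remove the UDF JAR files that exist under the path `<install dir>/ext/udf/install` on every node of the cluster.
-
-### 2.3 Show all registered UDFs
-
-``` sql
-SHOW FUNCTIONS
-```
-
-### 2.4 UDF configuration
-
-- The UDF storage directory can be configured in `iotdb-system.properties`:
- ``` Properties
-# UDF lib dir
-
-udf_lib_dir=ext/udf
-```
-
-- If an out-of-memory error is reported when using a custom function, change the following parameters in `iotdb-system.properties` and restart the service.
- ``` Properties
-
-# Used to estimate the memory usage of text fields in a UDF query.
-# It is recommended to set this value to be slightly larger than the average length of all text
-# effectiveMode: restart
-# Datatype: int
-udf_initial_byte_array_length_for_memory_control=48
-
-# How much memory may be used in ONE UDF query (in MB).
-# The upper limit is 20% of allocated memory for read.
-# effectiveMode: restart
-# Datatype: float
-udf_memory_budget_in_mb=30.0
-
-# UDF memory allocation ratio.
-# The parameter form is a:b:c, where a, b, and c are integers.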
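-# effectiveMode: restart
-udf_reader_transformer_collector_memory_proportion=1:1:1
-```
-
-### 2.5 UDF user permissions
-
-Using UDFs involves the `USE_UDF` permission. Only users with this permission may register, drop, and query UDFs.
-
-For more on user permissions, see [Authority management statements](../User-Manual/Authority-Management.md##权限管理).
-
-
-## 3. UDF library
-
-Built on the UDF capability, IoTDB provides a set of functions for time series processing, covering data quality, data profiling, anomaly detection, frequency-domain analysis, data matching, data repair, series discovery, and machine learning, to meet the needs of industrial time series processing.
-
-See the [UDF library](../SQL-Manual/UDF-Libraries_timecho.md) document for the installation steps and the registration statement of each function, to make sure all required functions are registered correctly.
-
-## 4. UDF development
-
-### 4.1 UDF dependency
-
-If you use [Maven](http://search.maven.org/), you can search the [Maven repository](http://search.maven.org/) for the dependency shown below. Choose the dependency version that matches the version of the target IoTDB server.
-
-``` xml
-<dependency>
-  <groupId>org.apache.iotdb</groupId>
-  <artifactId>udf-api</artifactId>
-  <version>1.0.0</version>
-  <scope>provided</scope>
-</dependency>
-```
-
-### 4.2 UDTF (User Defined Timeseries Generating Function)
-
-To write a UDTF, implement the `org.apache.iotdb.udf.api.UDTF` interface and implement at least the `beforeStart` method and one `transform` method.
-
-#### Interface description:
-
-| Interface definition | Description | Required |
-| :----------------------------------------------------------- | :----------------------------------------------------------- | ------------------------- |
-| void validate(UDFParameterValidator validator) throws Exception | Runs before the initialization method `beforeStart` and checks whether the parameters supplied by the user in `UDFParameters` are valid. | No |
-| void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception | Initialization method; runs the user-defined initialization before the UDTF processes input data. Each time a user runs a UDTF query, the framework constructs a new instance of the UDF class, and this method is called once when that instance is initialized. It is called only once in the lifecycle of each UDF class instance. | Yes |
-| Object transform(Row row) throws Exception | Called by the framework. If `beforeStart` selects `MappableRowByRowAccessStrategy` to consume raw data, you can use this method to process the data. Input is passed in as a `Row`, and the result is output through the `Object` return value. | One of the four `transform` methods |
-| void transform(Column[] columns, ColumnBuilder builder) throws Exception | Called by the framework. If `beforeStart` selects `MappableRowByRowAccessStrategy` to consume raw data, you can use this method to process the data. Input is passed in as `Column[]`, and the result is output through `ColumnBuilder`. Call the data collection methods of `builder` inside this method to decide the final output. | One of the four `transform` methods |
-| void transform(Row row, PointCollector collector) throws Exception | Called by the framework. If `beforeStart` selects `RowByRowAccessStrategy` to consume raw data, this method is called to process the data. Input is passed in as a `Row`, and the result is output through `PointCollector`. Call the data collection methods of `collector` inside this method to decide the final output. | One of the four `transform` methods |
-| void transform(RowWindow rowWindow, PointCollector collector) throws Exception | Called by the framework. If `beforeStart` selects `SlidingSizeWindowAccessStrategy` or `SlidingTimeWindowAccessStrategy` to consume raw data, this method is called to process the data. Input is passed in as a `RowWindow`, and the result is output through `PointCollector`. Call the data collection methods of `collector` inside this method to decide the final output. | One of the four `transform` methods |
-| void terminate(PointCollector collector) throws Exception | Called by the framework after all `transform` calls have finished and before `beforeDestroy` runs. It is called exactly once in a UDF query. Call the data collection methods of `collector` inside this method to decide the final output. | No |
-| void beforeDestroy() | The closing method of the UDTF. It is called by the framework exactly once, after the last record has been processed. | No |
-
-Within the lifecycle of one UDTF instance, the methods are called in this order:
-
-1. void validate(UDFParameterValidator validator) throws Exception
-2. void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception
-3. Object transform(Row row) throws Exception, or void transform(Column[] columns, ColumnBuilder builder) throws Exception, or void transform(Row row, PointCollector collector) throws Exception, or void transform(RowWindow rowWindow, PointCollector collector) throws Exception
-4. void terminate(PointCollector collector) throws Exception
-5. void beforeDestroy()
-
-> Note: every time the framework runs a UDTF query it constructs a brand-new UDF class instance, and that instance is destroyed when the query ends. The internal data of UDF class instances is therefore isolated between UDTF queries, even queries within the same SQL statement. You can safely keep state inside a UDTF without worrying about concurrent access to the internal state of the instance.
-
-#### Interface details:
-
-1. **void validate(UDFParameterValidator validator) throws Exception**
-
-  The `validate` method validates the parameters supplied by the user.
-
-  In this method you can restrict the number and types of input series, check the attributes supplied by the user, or run custom validation logic.
-
-For how to use `UDFParameterValidator`, see the [Javadoc](https://github.com/apache/iotdb/blob/rc/1.3.4-1/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/parameter/UDFParameterValidator.java).
-
-2.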
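**void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception**
-
-  The `beforeStart` method serves three purposes:
-  1. Help the user parse the UDF parameters in the SQL statement
-  2. Configure the information the UDF needs at runtime, that is, the strategy the UDF uses to access raw data and the type of the output series
-  3. Create resources, such as opening external connections or files
-
-2.1 **UDFParameters**
-
-`UDFParameters` parses the UDF parameters in the SQL statement (the part inside the parentheses after the UDF name). The parameters include series-type parameters and attribute parameters supplied as string key-value pairs.
-
-Example:
-
-``` sql
-SELECT UDF(s1, s2, 'key1'='iotdb', 'key2'='123.45') FROM root.sg.d;
-```
-
-Usage:
-
-``` java
-void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception {
-  String stringValue = parameters.getString("key1"); // iotdb
-  Float floatValue = parameters.getFloat("key2"); // 123.45
-  Double doubleValue = parameters.getDouble("key3"); // null
-  int intValue = parameters.getIntOrDefault("key4", 678); // 678
-  // do something
-
-  // configurations
-  // ...
-}
-```
-
-2.2 **UDTFConfigurations**
-
-You must use `UDTFConfigurations` to specify the strategy the UDF uses to access raw data and the type of the output series.
-
-Usage:
-
-``` java
-void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception {
-  // parameters
-  // ...
-
-  // configurations
-  configurations
-    .setAccessStrategy(new RowByRowAccessStrategy())
-    .setOutputDataType(Type.INT32);
-}
-```
-
-`setAccessStrategy` sets the strategy the UDF uses to access raw data, and `setOutputDataType` sets the type of the output series.
-
- 2.2.1 **setAccessStrategy**
-
-Note that the access strategy you set here determines which `transform` method the framework calls, so implement the `transform` method that matches the strategy. You may also choose the strategy dynamically based on the attribute parameters parsed from `UDFParameters`, so implementing two `transform` methods is allowed.
-
-The following access strategies are available:
-
-| Interface definition | Description | `transform` method called |
-| ------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| MappableRowByRowAccessStrategy | User-defined scalar function. The framework calls `transform` once for each input row: k columns and 1 row in, 1 column and 1 row out. It can be used in any clause or expression where a scalar function may appear, such as the SELECT clause and the WHERE clause. | void transform(Column[] columns, ColumnBuilder builder) throws Exception, or Object transform(Row row) throws Exception |
-| RowByRowAccessStrategy | User-defined time series generating function that processes raw data row by row. The framework calls `transform` once for each input row: k columns and 1 row in, 1 column and n rows out. With one input series, each row is one data point of that series. With several input series, the series are aligned by time and each aligned row is one input. (Some columns of a row may be `null`, but never all of them.) | void transform(Row row, PointCollector collector) throws Exception |
-| SlidingTimeWindowAccessStrategy | User-defined time series generating function that processes raw data in sliding time windows. The framework calls `transform` once for each input window: k columns and m rows in, 1 column and n rows out. A window may contain several rows; the input series are aligned by time and each window is one input. (A window may contain i rows; some columns of a row may be `null`, but never all of them.) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| SlidingSizeWindowAccessStrategy | User-defined time series generating function that processes raw data in windows with a fixed number of rows, that is, every window contains the same number of rows (except the last one). The framework calls `transform` once for each input window: k columns and m rows in, 1 column and n rows out. A window may contain several rows; the input series are aligned by time and each window is one input. (A window may contain i rows; some columns of a row may be `null`, but never all of them.) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| SessionTimeWindowAccessStrategy | User-defined time series generating function that processes raw data in session windows. The framework calls `transform` once for each input window: k columns and m rows in, 1 column and n rows out. A window may contain several rows; the input series are aligned by time and each window is one input. (A window may contain i rows; some columns of a row may be `null`, but never all of them.) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| StateWindowAccessStrategy | User-defined time series generating function that processes raw data in state windows. The framework calls `transform` once for each input window: 1 column and m rows in, 1 column and n rows out. A window may contain several rows; windowing is currently supported on only one measurement, that is, one column of data. | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-
-#### Strategy details:
-
-- The constructors of `MappableRowByRowAccessStrategy` and `RowByRowAccessStrategy` take no parameters.
-
-- `SlidingTimeWindowAccessStrategy`
-
-Windowing diagram:
-
-
-
-`SlidingTimeWindowAccessStrategy` has several constructors. You can pass three kinds of parameters to them:
-
-1. The begin and end time of the display window on the time axis
-
-The begin and end time of the display window are optional. If you do not provide them, the begin time is defined as the smallest timestamp in the whole query result set and the end time as the largest timestamp in the whole query result set.
-
-2. The time interval that divides the time axis (must be positive)
-3. The sliding step (need not be greater than or equal to the time interval, but must be positive)
-
-The sliding step is also optional. If you do not provide it, the sliding step is set to the time interval that divides the time axis.
-
-The relationship between the three kinds of parameters is shown in the figure below. For the constructors of the strategy, see the [Javadoc](https://github.com/apache/iotdb/blob/rc/1.3.4-1/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/SlidingTimeWindowAccessStrategy.java).
-
-
-
-> Note: the actual time interval of the last few windows may be smaller than the specified time interval. In addition, some windows may contain zero rows; the framework still calls `transform` once for such a window.
-
-- `SlidingSizeWindowAccessStrategy`
-
-Windowing diagram:
-
-
-
-`SlidingSizeWindowAccessStrategy` has several constructors. You can pass two parameters to them:
-
-1. The window size, that is, the number of rows in one processing window. Note that the last few windows may contain fewer rows than specified.
-2. The sliding step, that is, the number of rows between the first row of the next window and the first row of the current window (need not be greater than or equal to the window size, but must be positive)
-
-The sliding step is optional. If you do not provide it, the sliding step is set to the window size.
-
-- `SessionTimeWindowAccessStrategy`
-
-Windowing diagram: **rows whose time gap is less than or equal to the given minimum gap sessionGap fall into the same group.**
-
-
-
-
-`SessionTimeWindowAccessStrategy` has several constructors. You can pass two kinds of parameters to them:
-
-1. The begin and end time of the display window on the time axis.
-2. The minimum time gap between session windows.
-
-- `StateWindowAccessStrategy`
-
-Windowing diagram: **for numeric data, rows whose state difference is less than or equal to the given threshold delta fall into the same group.**
-
-
-
-`StateWindowAccessStrategy` has four constructors:
-
-1. For numeric data, you can provide the begin and end time of the display window and the threshold delta of the change allowed within a single window.
-2. For text and boolean data, you can provide the begin and end time of the display window. For these two data types the data within a window is identical, so no change threshold is needed.
-3. For numeric data, you can provide only the threshold delta of the change allowed within a single window. The begin time of the display window is then defined as the smallest timestamp in the whole query result set and the end time as the largest timestamp in the whole query result set.
-4.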
For text and boolean data, you can provide no parameters at all; the begin and end timestamps are determined as described in item 3.
-
-StateWindowAccessStrategy currently accepts only one input column. For the constructors of the strategy, see the [Javadoc](https://github.com/apache/iotdb/blob/rc/1.3.4-1/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/StateWindowAccessStrategy.java).
-
- 2.2.2 **setOutputDataType**
-
-Note that the output series type you set here determines the data types that `PointCollector` can actually accept in the `transform` method. The output type set in `setOutputDataType` relates to the types `PointCollector` can accept as follows:
-
-| Output type set in `setOutputDataType` | Output type accepted by `PointCollector` |
-| :---------------------------------- | :----------------------------------------------------------- |
-| INT32 | int |
-| INT64 | long |
-| FLOAT | float |
-| DOUBLE | double |
-| BOOLEAN | boolean |
-| TEXT | java.lang.String and org.apache.iotdb.udf.api.type.Binary |
-
-The type of a UDTF output series is decided at runtime. You can choose the output series type dynamically based on the input series types.
-
-Example:
-
-```java
-void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception {
-  // do something
-  // ...
-
-  configurations
-    .setAccessStrategy(new RowByRowAccessStrategy())
-    .setOutputDataType(parameters.getDataType(0));
-}
-```
-
-3.
**Object transform(Row row) throws Exception**
-
-If `beforeStart` sets the raw data access strategy to `MappableRowByRowAccessStrategy`, implement either this method or `void transform(Column[] columns, ColumnBuilder builder) throws Exception` below, and put the raw data processing logic in it.
-
-This method processes one row of raw data per call. Raw data is read from `Row` and output through the return value. Each `transform` call must output exactly one data point for the input data point, so input and output stay one-to-one. The type of the output data point must match the type set in `beforeStart`, and the timestamps of the output data points must be strictly monotonically increasing.
-
-Below is a complete UDF that implements `Object transform(Row row) throws Exception`. It is an adder that takes two time series columns as input and outputs the algebraic sum of the two data points.
-
-```java
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.access.Row;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Adder implements UDTF {
-  private Type dataType;
-
-  @Override
-  public void validate(UDFParameterValidator validator) throws Exception {
-    validator
-        .validateInputSeriesNumber(2)
-        .validateInputSeriesDataType(0, Type.INT64)
-        .validateInputSeriesDataType(1, Type.INT64);
-  }
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    dataType = parameters.getDataType(0);
-    configurations
-        .setAccessStrategy(new MappableRowByRowAccessStrategy())
-        .setOutputDataType(dataType);
-  }
-
-  @Override
-  public Object transform(Row row) throws Exception {
-    return row.getLong(0) + row.getLong(1);
-  }
-}
-```
-
-4.
**void transform(Column[] columns, ColumnBuilder builder) throws Exception**
-
-If `beforeStart` sets the raw data access strategy to `MappableRowByRowAccessStrategy`, implement this method and put the raw data processing logic in it.
-
-This method processes several rows of raw data per call. In performance tests, UDTFs that process several rows at a time outperformed UDTFs that process one row at a time. Raw data is read from `Column[]` and output through `ColumnBuilder`. Each `transform` call must output one data point for every input data point, so input and output stay one-to-one. The type of the output data points must match the type set in `beforeStart`, and the timestamps of the output data points must be strictly monotonically increasing.
-
-Below is a complete UDF that implements `void transform(Column[] columns, ColumnBuilder builder) throws Exception`. It is an adder that takes two time series columns as input and outputs the algebraic sum of the two data points.
-
-``` java
-import org.apache.iotdb.tsfile.read.common.block.column.Column;
-import org.apache.iotdb.tsfile.read.common.block.column.ColumnBuilder;
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Adder implements UDTF {
-  private Type type;
-
-  @Override
-  public void validate(UDFParameterValidator validator) throws Exception {
-    validator
-        .validateInputSeriesNumber(2)
-        .validateInputSeriesDataType(0, Type.INT64)
-        .validateInputSeriesDataType(1, Type.INT64);
-  }
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    type = parameters.getDataType(0);
-    configurations.setAccessStrategy(new MappableRowByRowAccessStrategy()).setOutputDataType(type);
-  }
-
-  @Override
-  public void transform(Column[] columns, ColumnBuilder builder) throws Exception {
-    long[] inputs1 = columns[0].getLongs();
-    long[] inputs2 = columns[1].getLongs();
-
-    int count = columns[0].getPositionCount();
-    for (int i = 0; i < count; i++) {
-      builder.writeLong(inputs1[i] + inputs2[i]);
-    }
-  }
-}
-```
-
-5.
**void transform(Row row, PointCollector collector) throws Exception**
-
-If `beforeStart` sets the raw data access strategy to `RowByRowAccessStrategy`, implement this method and put the raw data processing logic in it.
-
-This method processes one row of raw data per call. Raw data is read from `Row` and output through `PointCollector`. You may output any number of data points in one `transform` call. The type of the output data points must match the type set in `beforeStart`, and the timestamps of the output data points must be strictly monotonically increasing.
-
-Below is a complete UDF that implements `void transform(Row row, PointCollector collector) throws Exception`. It is an adder that takes two time series columns as input and, when neither data point is `null`, outputs the algebraic sum of the two data points.
-
-``` java
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.access.Row;
-import org.apache.iotdb.udf.api.collector.PointCollector;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Adder implements UDTF {
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    configurations
-        .setOutputDataType(Type.INT64)
-        .setAccessStrategy(new RowByRowAccessStrategy());
-  }
-
-  @Override
-  public void transform(Row row, PointCollector collector) throws Exception {
-    if (row.isNull(0) || row.isNull(1)) {
-      return;
-    }
-    collector.putLong(row.getTime(), row.getLong(0) + row.getLong(1));
-  }
-}
-```
-
-6.
**void transform(RowWindow rowWindow, PointCollector collector) throws Exception**
-
-If `beforeStart` sets the raw data access strategy to `SlidingTimeWindowAccessStrategy` or `SlidingSizeWindowAccessStrategy`, implement this method and put the raw data processing logic in it.
-
-This method processes one batch of data per call, either a fixed number of rows or the rows within a fixed time interval. The container that holds such a batch is called a window. Raw data is read from `RowWindow` and output through `PointCollector`. `RowWindow` gives access to the `Row`s of one batch and provides interfaces for both random access and iteration over them. You may output any number of data points in one `transform` call. The type of the output data points must match the type set in `beforeStart`, and the timestamps of the output data points must be strictly monotonically increasing.
-
-Below is a complete UDF that implements `void transform(RowWindow rowWindow, PointCollector collector) throws Exception`. It is a counter that accepts any number of time series columns and outputs the number of rows in each time window within the specified time range.
-
-```java
-import java.io.IOException;
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.access.RowWindow;
-import org.apache.iotdb.udf.api.collector.PointCollector;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.SlidingTimeWindowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Counter implements UDTF {
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    configurations
-        .setOutputDataType(Type.INT32)
-        .setAccessStrategy(new SlidingTimeWindowAccessStrategy(
-            parameters.getLong("time_interval"),
-            parameters.getLong("sliding_step"),
-            parameters.getLong("display_window_begin"),
-            parameters.getLong("display_window_end")));
-  }
-
-  @Override
-  public void transform(RowWindow rowWindow, PointCollector collector) throws Exception {
-    if (rowWindow.windowSize() != 0) {
-      collector.putInt(rowWindow.windowStartTime(), rowWindow.windowSize());
-    }
-  }
-}
-```
-
-7.
**void terminate(PointCollector collector) throws Exception**
-
-In some scenarios a UDF can only produce its final output after it has traversed all the raw data. The `terminate` interface supports this kind of UDF.
-
-This method is called after all `transform` calls have finished and before `beforeDestroy` runs. You can use `transform` purely to process data and then use `terminate` to output the result.
-
-The result must be output through `PointCollector`. You may output any number of data points in one `terminate` call. The type of the output data points must match the type set in `beforeStart`, and the timestamps of the output data points must be strictly monotonically increasing.
-
-Below is a complete UDF that implements `void terminate(PointCollector collector) throws Exception`. It takes one time series of type `INT32` as input and outputs the point with the maximum value in that series.
-
-```java
-import java.io.IOException;
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.access.Row;
-import org.apache.iotdb.udf.api.collector.PointCollector;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Max implements UDTF {
-
-  private Long time;
-  private int value;
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    configurations
-        .setOutputDataType(Type.INT32)
-        .setAccessStrategy(new RowByRowAccessStrategy());
-  }
-
-  @Override
-  public void transform(Row row, PointCollector collector) {
-    if (row.isNull(0)) {
-      return;
-    }
-    int candidateValue = row.getInt(0);
-    if (time == null || value < candidateValue) {
-      time = row.getTime();
-      value = candidateValue;
-    }
-  }
-
-  @Override
-  public void terminate(PointCollector collector) throws IOException {
-    if (time != null) {
-      collector.putInt(time, value);
-    }
-  }
-}
-```
-
-8.
**void beforeDestroy()** - -UDTF 的结束方法,您可以在此方法中进行一些资源释放等的操作。 - -此方法由框架调用。对于一个 UDF 类实例而言,生命周期中会且只会被调用一次,即在处理完最后一条记录之后被调用。 - -### 4.3 UDAF(User Defined Aggregation Function) - -一个完整的 UDAF 定义涉及到 State 和 UDAF 两个类。 - -#### State 类 - -编写一个 State 类需要实现`org.apache.iotdb.udf.api.State`接口,下表是需要实现的方法说明。 - -#### 接口说明: - -| 接口定义 | 描述 | 是否必须 | -| -------------------------------- | ------------------------------------------------------------ | -------- | -| void reset() | 将 `State` 对象重置为初始的状态,您需要像编写构造函数一样,在该方法内填入 `State` 类中各个字段的初始值。 | 是 | -| byte[] serialize() | 将 `State` 序列化为二进制数据。该方法用于 IoTDB 内部的 `State` 对象传递,注意序列化的顺序必须和下面的反序列化方法一致。 | 是 | -| void deserialize(byte[] bytes) | 将二进制数据反序列化为 `State`。该方法用于 IoTDB 内部的 `State` 对象传递,注意反序列化的顺序必须和上面的序列化方法一致。 | 是 | - -#### 接口详细介绍: - -1. **void reset()** - -该方法的作用是将 `State` 重置为初始的状态,您需要在该方法内填写 `State` 对象中各个字段的初始值。出于优化上的考量,IoTDB 在内部会尽可能地复用 `State`,而不是为每一个组创建一个新的 `State`,这样会引入不必要的开销。当 `State` 更新完一个组中的数据之后,就会调用这个方法重置为初始状态,以此来处理下一个组。 - -以求平均数(也就是 `avg`)的 `State` 为例,您需要数据的总和 `sum` 与数据的条数 `count`,并在 `reset()` 方法中将二者初始化为 0。 - -```java -class AvgState implements State { - double sum; - - long count; - - @Override - public void reset() { - sum = 0; - count = 0; - } - - // other methods -} -``` - -2. 
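**byte[] serialize()/void deserialize(byte[] bytes)** - -These methods serialize the State into binary data and deserialize the State from binary data. IoTDB is a distributed database and passes data between nodes, so you need to implement both methods to allow the State to be transferred between nodes. Note that the serialization order and the deserialization order must be the same. - -Again taking the State for computing the average (avg) as an example, you can convert the contents of the State into a `byte[]` array, and read them back from a `byte[]` array, in any way you like. The code below performs serialization and deserialization with `java.nio.ByteBuffer`: - -```java -@Override -public byte[] serialize() { - ByteBuffer buffer = ByteBuffer.allocate(Double.BYTES + Long.BYTES); - buffer.putDouble(sum); - buffer.putLong(count); - - return buffer.array(); -} - -@Override -public void deserialize(byte[] bytes) { - ByteBuffer buffer = ByteBuffer.wrap(bytes); - sum = buffer.getDouble(); - count = buffer.getLong(); -} -``` - -#### UDAF Class - -To write a UDAF class, implement the `org.apache.iotdb.udf.api.UDAF` interface. The table below describes the methods to implement. - -#### Interface Description: - -| Interface Definition | Description | Required | -| ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | -| void validate(UDFParameterValidator validator) throws Exception | Executed before the initialization method `beforeStart` is called; used to check whether the parameters entered by the user in `UDFParameters` are valid. This method is the same as `validate` in UDTF. | No | -| void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception | Initialization method that invokes user-defined initialization behavior before the UDAF processes input data. Unlike UDTF, the configuration here is of type `UDAFConfigurations`. | Yes | -| State createState() | Creates a `State` object. Usually you only need to call the default constructor and then adjust the default initial values as needed. | Yes | -| void addInput(State state, Column[] columns, BitMap bitMap) | Updates the `State` object in batch from the incoming `Column[]` data. Note that the last column, `columns[columns.length - 1]`, always represents the time column. In addition, `BitMap` indicates the data that has already been filtered out; when writing this method you must check manually whether each row has been filtered out. | Yes | -| void combineState(State state, State rhs) | Merges the `rhs` state into the `state` state. In distributed scenarios, data of the same group may be distributed across different nodes; IoTDB generates a `State` object for the partial data on each node and then calls this method to merge them into a complete `State`. | Yes | -| void outputFinal(State state, ResultValue resultValue) | Computes the final aggregation result from the data in the `State`. Note that, by the semantics of aggregation, each group can output only one value. | Yes | -| void beforeDestroy() | The termination method of the UDAF. It is called by the framework exactly once, after the last record has been processed. | No | - -In the complete life cycle of a UDAF instance, the methods are called in the following order: - -1. State createState() -2. 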
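void validate(UDFParameterValidator validator) throws Exception -3. void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception -4. void addInput(State state, Column[] columns, BitMap bitMap) -5. void combineState(State state, State rhs) -6. void outputFinal(State state, ResultValue resultValue) -7. void beforeDestroy() - -As with UDTF, each time the framework executes a UDAF query it constructs a brand-new instance of the UDF class, and that instance is destroyed when the query ends. The data inside UDF class instances is therefore isolated between different UDAF queries (even within the same SQL statement). You can safely maintain state data in a UDAF without considering the effect of concurrency on the internal state of the UDF class instance. - -#### Detailed Interface Introduction: - -1. **void validate(UDFParameterValidator validator) throws Exception** - -As in UDTF, the `validate` method validates the parameters entered by the user. - -In this method you can restrict the number and types of input series, check the attributes entered by the user, or perform custom validation logic. - -2. **void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception** - - The `beforeStart` method serves the same purposes as in UDTF: - - 1. Help the user parse the UDF parameters in the SQL statement - 2. Configure the information required at UDF runtime, that is, specify the strategy the UDF uses to access raw data and the type of the output result series - 3. Create resources, such as establishing external connections or opening files. - -For the role of the `UDFParameters` type, see the description above. - -2.2 **UDAFConfigurations** - -The difference from UDTF is that UDAF uses `UDAFConfigurations` as the type of the `configuration` object. - -Currently, this class only supports setting the type of the output data. - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // parameters - // ... - - // configurations - configurations - .setOutputDataType(Type.INT32); -} -``` - -The output type set in `setOutputDataType` maps to the output type that `ResultValue` can actually receive as follows: - -| Output type set in `setOutputDataType` | Output type that `ResultValue` can actually receive | -| :---------------------------------- | :------------------------------------- | -| INT32 | int | -| INT64 | long | -| FLOAT | float | -| DOUBLE | double | -| BOOLEAN | boolean | -| TEXT | org.apache.iotdb.udf.api.type.Binary | - -The type of the UDAF output series is also determined at runtime. You can decide the output series type dynamically based on the input series type. - -Example: - -```java -void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception { - // do something - // ... - - configurations - .setOutputDataType(parameters.getDataType(0)); -} -``` - -3. 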
**State createState()** - -为 UDAF 创建并初始化 `State`。由于 Java 语言本身的限制,您只能调用 `State` 类的默认构造函数。默认构造函数会为类中所有的字段赋一个默认的初始值,如果该初始值并不符合您的要求,您需要在这个方法内进行手动的初始化。 - -下面是一个包含手动初始化的例子。假设您要实现一个累乘的聚合函数,`State` 的初始值应该设置为 1,但是默认构造函数会初始化为 0,因此您需要在调用默认构造函数之后,手动对 `State` 进行初始化: - -```java -public State createState() { - MultiplyState state = new MultiplyState(); - state.result = 1; - return state; -} -``` - -4. **void addInput(State state, Column[] columns, BitMap bitMap)** - -该方法的作用是,通过原始的输入数据来更新 `State` 对象。出于性能上的考量,也是为了和 IoTDB 向量化的查询引擎相对齐,原始的输入数据不再是一个数据点,而是列的数组 `Column[]`。注意最后一列(也就是 `columns[columns.length - 1]` )总是时间列,因此您也可以在 UDAF 中根据时间进行不同的操作。 - -由于输入参数的类型不是一个数据点,而是多个列,您需要手动对列中的部分数据进行过滤处理,这就是第三个参数 `BitMap` 存在的意义。它用来标识这些列中哪些数据被过滤掉了,您在任何情况下都无需考虑被过滤掉的数据。 - -下面是一个用于统计数据条数(也就是 count)的 `addInput()` 示例。它展示了您应该如何使用 `BitMap` 来忽视那些已经被过滤掉的数据。注意还是由于 Java 语言本身的限制,您需要在方法的开头将接口中定义的 `State` 类型强制转化为自定义的 `State` 类型,不然后续无法正常使用该 `State` 对象。 - -```java -public void addInput(State state, Column[] columns, BitMap bitMap) { - CountState countState = (CountState) state; - - int count = columns[0].getPositionCount(); - for (int i = 0; i < count; i++) { - if (bitMap != null && !bitMap.isMarked(i)) { - continue; - } - if (!columns[0].isNull(i)) { - countState.count++; - } - } -} -``` - -5. **void combineState(State state, State rhs)** - -该方法的作用是合并两个 `State`,更加准确的说,是用第二个 `State` 对象来更新第一个 `State` 对象。IoTDB 是分布式数据库,同一组的数据可能分布在多个不同的节点上。出于性能考虑,IoTDB 会为每个节点上的部分数据先进行聚合成 `State`,然后再将不同节点上的、属于同一个组的 `State` 进行合并,这就是 `combineState` 的作用。 - -下面是一个用于求平均数(也就是 avg)的 `combineState()` 示例。和 `addInput` 类似,您都需要在开头对两个 `State` 进行强制类型转换。另外需要注意是用第二个 `State` 的内容来更新第一个 `State` 的值。 - -```java -public void combineState(State state, State rhs) { - AvgState avgState = (AvgState) state; - AvgState avgRhs = (AvgState) rhs; - - avgState.count += avgRhs.count; - avgState.sum += avgRhs.sum; -} -``` - -6. 
**void outputFinal(State state, ResultValue resultValue)** - -该方法的作用是从 `State` 中计算出最终的结果。您需要访问 `State` 中的各个字段,求出最终的结果,并将最终的结果设置到 `ResultValue` 对象中。IoTDB 内部会为每个组在最后调用一次这个方法。注意根据聚合的语义,最终的结果只能是一个值。 - -下面还是一个用于求平均数(也就是 avg)的 `outputFinal` 示例。除了开头的强制类型转换之外,您还将看到 `ResultValue` 对象的具体用法,即通过 `setXXX`(其中 `XXX` 是类型名)来设置最后的结果。 - -```java -public void outputFinal(State state, ResultValue resultValue) { - AvgState avgState = (AvgState) state; - - if (avgState.count != 0) { - resultValue.setDouble(avgState.sum / avgState.count); - } else { - resultValue.setNull(); - } -} -``` - -7. **void beforeDestroy()** - -UDAF 的结束方法,您可以在此方法中进行一些资源释放等的操作。 - -此方法由框架调用。对于一个 UDF 类实例而言,生命周期中会且只会被调用一次,即在处理完最后一条记录之后被调用。 - -### 4.4 完整 Maven 项目示例 - -如果您使用 [Maven](http://search.maven.org/),可以参考我们编写的示例项目**udf-example**。您可以在 [这里](https://github.com/apache/iotdb/tree/master/example/udf) 找到它。 - - -## 5. 为iotdb贡献通用的内置UDF函数 - -该部分主要讲述了外部用户如何将自己编写的 UDF 贡献给 IoTDB 社区。 - -### 5.1 前提条件 - -1. UDF 具有通用性。 - - 通用性主要指的是:UDF 在某些业务场景下,可以被广泛使用。换言之,就是 UDF 具有复用价值,可被社区内其他用户直接使用。 - - 如果不确定自己写的 UDF 是否具有通用性,可以发邮件到 `dev@iotdb.apache.org` 或直接创建 ISSUE 发起讨论。 - -2. UDF 已经完成测试,且能够正常运行在用户的生产环境中。 - -### 5.2 贡献清单 - -1. UDF 的源代码 -2. UDF 的测试用例 -3. UDF 的使用说明 - -### 5.3 贡献内容 - -#### 5.3.1 源代码 - -1. 在`iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin`中创建 UDF 主类和相关的辅助类。 -2. 
在`iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin/BuiltinTimeSeriesGeneratingFunction.java`中注册编写的 UDF。 - -#### 5.3.2 测试用例 - -至少需要为贡献的 UDF 编写集成测试。 - -可以在`integration-test/src/test/java/org/apache/iotdb/db/it/udf`中为贡献的 UDF 新增一个测试类进行测试。 - -#### 5.3.3 使用说明 - -使用说明需要包含:UDF 的名称、UDF 的作用、执行函数必须的属性参数、函数的适用的场景以及使用示例等。 - -使用说明需包含中英文两个版本。应分别在 `docs/zh/UserGuide/Operation Manual/DML Data Manipulation Language.md` 和 `docs/UserGuide/Operation Manual/DML Data Manipulation Language.md` 中新增使用说明。 - -#### 5.3.4 提交 PR - -当准备好源代码、测试用例和使用说明后,就可以将 UDF 贡献到 IoTDB 社区了。在 [Github](https://github.com/apache/iotdb) 上面提交 Pull Request (PR) 即可。具体提交方式见:[贡献指南](https://iotdb.apache.org/zh/Community/Development-Guide.html)。 - -当 PR 评审通过并被合并后, UDF 就已经贡献给 IoTDB 社区了! - -## 6. 常见问题 - -1. 如何修改已经注册的 UDF? - -答:假设 UDF 的名称为`example`,全类名为`org.apache.iotdb.udf.UDTFExample`,由`example.jar`引入 - -1. 首先卸载已经注册的`example`函数,执行`DROP FUNCTION example` -2. 删除 `iotdb-server-1.0.0-all-bin/ext/udf` 目录下的`example.jar` -3. 修改`org.apache.iotdb.udf.UDTFExample`中的逻辑,重新打包,JAR 包的名字可以仍然为`example.jar` -4. 将新的 JAR 包上传至 `iotdb-server-1.0.0-all-bin/ext/udf` 目录下 -5. 装载新的 UDF,执行`CREATE FUNCTION example AS "org.apache.iotdb.udf.UDTFExample"` \ No newline at end of file diff --git a/src/zh/UserGuide/dev-1.3/User-Manual/White-List_timecho.md b/src/zh/UserGuide/dev-1.3/User-Manual/White-List_timecho.md deleted file mode 100644 index d69a563fc..000000000 --- a/src/zh/UserGuide/dev-1.3/User-Manual/White-List_timecho.md +++ /dev/null @@ -1,70 +0,0 @@ - - - -# 白名单 - -**功能描述** - -允许哪些客户端地址能连接 IoTDB - -**配置文件** - -conf/iotdb-system.properties - -conf/white.list - -**配置项** - -iotdb-system.properties: - -决定是否开启白名单功能 - -```YAML -# 是否开启白名单功能 -enable_white_list=true -``` - -white.list: - -决定哪些IP地址能够连接IoTDB - -```YAML -# 支持注释 -# 支持精确匹配,每行一个ip -10.2.3.4 - -# 支持*通配符,每行一个ip -10.*.1.3 -10.100.0.* -``` - -**注意事项** - -1. 如果通过session客户端取消本身的白名单,当前连接并不会立即断开。在下次创建连接的时候拒绝。 -2. 
如果直接修改white.list,一分钟内生效。如果通过session客户端修改,立即生效,更新内存中的值和white.list磁盘文件 -3. 开启白名单功能,没有white.list 文件,启动DB服务成功,但是,拒绝所有连接。 -4. DB服务运行中,删除 white.list 文件,至多一分钟后,拒绝所有连接。 -5. 是否开启白名单功能的配置,可以热加载。 -6. 使用Java 原生接口修改白名单,必须是root用户才能修改,拒绝非root用户修改;修改内容必须合法,否则会抛出StatementExecutionException异常。 - -![白名单](/img/%E7%99%BD%E5%90%8D%E5%8D%95.png) - diff --git a/src/zh/UserGuide/latest-Table/AI-capability/AINode_Upgrade_timecho.md b/src/zh/UserGuide/latest-Table/AI-capability/AINode_Upgrade_timecho.md deleted file mode 100644 index 72b244f2b..000000000 --- a/src/zh/UserGuide/latest-Table/AI-capability/AINode_Upgrade_timecho.md +++ /dev/null @@ -1,941 +0,0 @@ - - -# AINode - -AINode 是支持时序相关模型注册、管理、调用的 IoTDB 原生节点,内置业界领先的自研时序大模型,如清华自研时序模型 Timer 系列,可通过标准 SQL 语句进行调用,实现时序数据的毫秒级实时推理,可支持时序趋势预测、缺失值填补、异常值检测等应用场景。 - -系统架构如下图所示: - -![](/img/AINode-0.png) - -三种节点的职责如下: - -* **ConfigNode**:负责分布式节点管理和负载均衡。 -* **DataNode**:负责接收并解析用户的 SQL请求;负责存储时间序列数据;负责数据的预处理计算。 -* **AINode**:负责时序模型的管理和使用。 - -## 1. 优势特点 - -与单独构建机器学习服务相比,具有以下优势: - -* **简单易用**:无需使用 Python 或 Java 编程,使用 SQL 语句即可完成机器学习模型管理与推理的完整流程。如创建模型可使用CREATE MODEL语句、使用模型进行推理可使用` SELECT * FROM FORECAST (...) ` 语句等,使用更加简单便捷。 -* **避免数据迁移**:使用 IoTDB 原生机器学习可以将存储在 IoTDB 中的数据直接应用于机器学习模型的推理,无需将数据移动到单独的机器学习服务平台,从而加速数据处理、提高安全性并降低成本。 - -![](/img/h1.png) - -* **内置先进算法**:支持业内领先机器学习分析算法,覆盖典型时序分析任务,为时序数据库赋能原生数据分析能力。如: - * **时间序列预测(Time Series Forecasting)**:从过去时间序列中学习变化模式;从而根据给定过去时间的观测值,输出未来序列最可能的预测。 - * **时序异常检测(Anomaly Detection for Time Series)**:在给定的时间序列数据中检测和识别异常值,帮助发现时间序列中的异常行为。 - -## 2. 基本概念 - -* **模型(Model)**:机器学习模型,以时序数据作为输入,输出分析任务的结果或决策。模型是 AINode 的基本管理单元,支持模型的增(注册)、删、查、改(微调)、用(推理)。 -* **创建(Create)**: 将外部设计或训练好的模型文件或算法加载到 AINode 中,由 IoTDB 统一管理与使用。 -* **推理(Inference)**:使用创建的模型在指定时序数据上完成该模型适用的时序分析任务。 -* **内置能力(Built-in)**:AINode 自带常见时序分析场景(例如预测与异常检测)的机器学习算法或自研模型。 - -![](/img/AINode-new.png) - -## 3. 安装部署 - -AINode 的部署可参考文档 [AINode 部署](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md) 。 - -## 4. 
使用指导 - -TimechoDB-AINode 支持模型推理、模型微调以及模型管理(注册、查看、删除、加载、卸载等)三大功能,下面章节将进行详细说明。 - -### 4.1 模型推理 - -TimechoDB-AINode 表模型支持时序预测和时序数据分类两大推理能力。 - -#### 4.1.1 时序预测 - -表模型 AINode 提供的时序预测能力包括: - -* **单变量预测**:支持对单一目标变量进行预测。 -* **协变量预测**:可同时对多个目标变量进行联合预测,并支持在预测中引入协变量,以提升预测的准确性。 - -下文将详细介绍预测推理功能的语法定义、参数说明以及使用实例。 - -1. **SQL 语法** - -```SQL -SELECT * FROM FORECAST( - MODEL_ID, - TARGETS, -- 获取目标变量的 SQL - [HISTORY_COVS, -- 字符串,用于获取历史协变量的 SQL - FUTURE_COVS, -- 字符串,用于获取未来协变量的 SQL - OUTPUT_START_TIME, - OUTPUT_LENGTH, - OUTPUT_INTERVAL, - TIMECOL, - PRESERVE_INPUT, - AUTO_ADAPT, -- bool类型,表示是否开启自适应 - MODEL_OPTIONS]? -) -``` - -* 内置模型推理无需注册流程,通过 forecast 函数,指定 model\_id 就可以使用模型的推理功能。 -* 参数介绍 - -| 参数名 | 参数类型 | 参数属性 | 描述 | 是否必填 | 备注 | -|---------------------|-------|----------------------------------------------------|-----------------------------------------------------------------------------------------| ---------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| model\_id | 标量参数 | 字符串类型 | 预测所用模型的唯一标识 | 是| | -| targets | 表参数 | SET SEMANTIC | 待预测目标变量的输入数据。IoTDB会自动将数据按时间升序排序再交给AINode 。 | 是 | 使用 SQL 描述带预测目标变量的输入数据,输入的 SQL 不合法时会有对应的查询报错。 | -| history\_covs | 标量参数 | 字符串类型(合法的表模型查询 SQL)默认:无 | 指定此次预测任务的协变量的历史数据,这些数据用于辅助目标变量的预测,AINode 不会对历史协变量输出预测结果。在将数据给予模型前,AINode 会自动将数据按时间升序排序。 | 否 | 1. 查询结果只能包含 FIELD 列;
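2. Other: individual models may have their own requirements; an error is raised when they are not met. | -| future\_covs | Scalar parameter | String (a valid table-model query SQL). Default: none | Future data for some of the covariates of this forecasting task, used to assist forecasting of the target variables. AINode automatically sorts the data by time in ascending order before passing it to the model. | No | 1. This parameter can be specified only when history\_covs is set;
2. The covariate names involved must be a subset of those in history\_covs;
3. The query result may contain only FIELD columns;
4. Other: individual models may have their own requirements; an error is raised when they are not met. | -| auto\_adapt | Scalar parameter | Boolean. Default: true | Whether to enable auto-adaptation for covariate inference (supported since V2.0.8.2). | No | When auto-adaptation is enabled:
1. If the set of future covariates (future\_covs) is not a subset of the set of historical covariates (history\_covs), any future covariate that is not a historical covariate is automatically discarded.
2. If the length of a historical covariate differs from the length of the input target variables: a. if shorter, it is padded with 0 at the head; b. if longer, its earliest data is automatically discarded.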
3. 若某个未来协变量的长度不等于预测长度output\_length: a. 小于时,在其尾部补 0;b. 大于时,自动丢弃其最新的数据。 | -| output\_start\_time | 标量参数 | 时间戳类型。 默认值:目标变量最后一个时间戳 + output\_interval | 输出的预测点的起始时间戳 【即起报时间】 | 否 | 必须大于目标变量时间戳的最大值 | -| output\_length | 标量参数 | INT32 类型。 默认值:96 | 输出窗口大小 | 否 | 必须大于 0 | -| output\_interval | 标量参数 | 时间间隔类型。 默认值:(输入数据的最后一个时间戳 - 输入数据的第一个时间戳) / n - 1 | 输出的预测点之间的时间间隔 支持的单位是 ns、us、ms、s、m、h、d、w | 否 | 必须大于 0 | -| timecol | 标量参数 | 字符串类型。 默认值:time | 时间列名 | 否 | 必须为存在于 targets 中的且数据类型为 TIMESTAMP 的列 | -| preserve\_input | 标量参数 | 布尔类型。 默认值:false | 是否在输出结果集中保留目标变量输入的所有原始行 | 否 | | -| model\_options | 标量参数 | 字符串类型。 默认值:空字符串 | 模型相关的 key-value 对,比如是否需要对输入进行归一化等。不同的 key-value 对以 ';' 间隔 | 否 | | - -说明: - -* **默认行为**:预测 targets 的所有列。当前仅支持 INT32、INT64、FLOAT、DOUBLE 类型。 -* **输入数据要求**: - * 必须包含时间列。 - * 行数要求:不足最低行数会报错,超过最大行数则自动截取末尾数据。 - * 列数要求:单变量模型仅支持单列,多列将报错;协变量模型通常无限制,除非模型自身有明确约束。 - * 协变量预测时,SQL 语句中需明确指定 DATABASE。 -* **输出结果**: - * 包含所有目标变量列,数据类型与原表一致。 - * 若指定 `preserve_input=true`,会额外增加 `is_input` 列来标识原始数据行。 -* **时间戳生成**: - * 使用 `OUTPUT_START_TIME`(可选)作为预测起始时间点,并以此划分历史与未来数据。 - * 使用 `OUTPUT_INTERVAL`(可选,默认为输入数据的采样间隔)作为输出时间间隔。第 N 行的时间戳计算公式为:`OUTPUT_START_TIME + (N - 1) * OUTPUT_INTERVAL`。 - -2. **使用示例** - -**示例一:单变量预测** - -提前创建数据库 etth 及表 eg - -```SQL -create database etth; -create table eg (hufl FLOAT FIELD, hull FLOAT FIELD, mufl FLOAT FIELD, mull FLOAT FIELD, lufl FLOAT FIELD, lull FLOAT FIELD, ot FLOAT FIELD) -``` - -准备原始数据 [ETTh1-tab](/img/ETTh1-tab.csv),可通过 [import-data](../Tools-System/Data-Import-Tool_timecho.md#_2-2-csv-格式) 脚本导入原始数据,例如 - -```bash -./tools/import-data.sh -ft csv -sql_dialect table -db etth -table eg -s ~/Desktop/model-compare-html/ETTh1-tab.csv -``` - -使用表 eg 中测点 ot 已知的 1440 行数据,预测其未来的 96 行数据. 
- -```SQL -IoTDB:etth> select Time, HUFL,HULL,MUFL,MULL,LUFL,LULL,OT from eg LIMIT 1440 -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -| Time| HUFL| HULL| MUFL| MULL| LUFL| LULL| OT| -+-----------------------------+------+-----+-----+-----+-----+-----+------+ -|2016-07-01T00:00:00.000+08:00| 5.827|2.009|1.599|0.462|4.203| 1.34|30.531| -|2016-07-01T01:00:00.000+08:00| 5.693|2.076|1.492|0.426|4.142|1.371|27.787| -|2016-07-01T02:00:00.000+08:00| 5.157|1.741|1.279|0.355|3.777|1.218|27.787| -|2016-07-01T03:00:00.000+08:00| 5.09|1.942|1.279|0.391|3.807|1.279|25.044| -...... -Total line number = 1440 -It costs 0.119s - -IoTDB:etth> select * from forecast( - model_id => 'sundial', - targets => (select Time, ot from etth.eg where time >= 2016-08-07T18:00:00.000+08:00 limit 1440) order BY time, - output_length => 96 -) -+-----------------------------+---------+ -| time| ot| -+-----------------------------+---------+ -|2016-10-06T18:00:00.000+08:00|20.733124| -|2016-10-06T19:00:00.000+08:00|20.258146| -|2016-10-06T20:00:00.000+08:00|20.022043| -|2016-10-06T21:00:00.000+08:00|19.789446| -...... 
-Total line number = 96 -It costs 1.615s -``` - -**示例二:协变量预测** - -提前创建表 tab\_real(存储原始真实数据) - -```SQL -create table tab_real (target1 DOUBLE FIELD, target2 DOUBLE FIELD, cov1 DOUBLE FIELD, cov2 DOUBLE FIELD, cov3 DOUBLE FIELD); -``` - -准备原始数据 - -```SQL ---写入语句 -IoTDB:etth> INSERT INTO tab_real (time, target1, target2, cov1, cov2, cov3) VALUES -(1, 1.0, 1.0, 1.0, 1.0, 1.0), -(2, 2.0, 2.0, 2.0, 2.0, 2.0), -(3, 3.0, 3.0, 3.0, 3.0, 3.0), -(4, 4.0, 4.0, 4.0, 4.0, 4.0), -(5, 5.0, 5.0, 5.0, 5.0, 5.0), -(6, 6.0, 6.0, 6.0, 6.0, 6.0), -(7, NULL, NULL, NULL, NULL, 7.0), -(8, NULL, NULL, NULL, NULL, 8.0); - -IoTDB:etth> SELECT * FROM tab_real -+-----------------------------+-------+-------+----+----+----+ -| time|target1|target2|cov1|cov2|cov3| -+-----------------------------+-------+-------+----+----+----+ -|1970-01-01T08:00:00.001+08:00| 1.0| 1.0| 1.0| 1.0| 1.0| -|1970-01-01T08:00:00.002+08:00| 2.0| 2.0| 2.0| 2.0| 2.0| -|1970-01-01T08:00:00.003+08:00| 3.0| 3.0| 3.0| 3.0| 3.0| -|1970-01-01T08:00:00.004+08:00| 4.0| 4.0| 4.0| 4.0| 4.0| -|1970-01-01T08:00:00.005+08:00| 5.0| 5.0| 5.0| 5.0| 5.0| -|1970-01-01T08:00:00.006+08:00| 6.0| 6.0| 6.0| 6.0| 6.0| -|1970-01-01T08:00:00.007+08:00| null| null|null|null| 7.0| -|1970-01-01T08:00:00.008+08:00| null| null|null|null| 8.0| -+-----------------------------+-------+-------+----+----+----+ -``` - -* 预测任务一:使用历史协变量 cov1,cov2 和 cov3 辅助预测目标变量 target1 和 target2。 - - ![](/img/ainode-upgrade-table-forecast-timecho-1.png) - - * 使用表 tab\_real 中 cov1,cov2,cov3,target1,target2 的 前 6 行历史数据,预测目标变量 target1 和 target2 未来的 2 行数据 - ```SQL - IoTDB:etth> SELECT * FROM FORECAST ( - MODEL_ID => 'chronos2', - TARGETS => ( - SELECT TIME, target1, target2 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6) ORDER BY TIME, - HISTORY_COVS => ' - SELECT TIME, cov1, cov2, cov3 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6', - OUTPUT_LENGTH => 2 - ) - +-----------------------------+-----------------+-----------------+ - | time| 
target1| target2| - +-----------------------------+-----------------+-----------------+ - |1970-01-01T08:00:00.007+08:00|7.338330268859863|7.338330268859863| - |1970-01-01T08:00:00.008+08:00| 8.02529525756836| 8.02529525756836| - +-----------------------------+-----------------+-----------------+ - Total line number = 2 - It costs 0.315s - ``` -* 预测任务二:使用相同表中的历史协变量 cov1,cov2 和已知协变量 cov3 辅助预测目标变量 target1 和 target2。 - - ![](/img/ainode-upgrade-table-forecast-timecho-2.png) - - * 使用表 tab\_real 中 cov1,cov2,cov3,target1,target2 的 前 6 行历史数据,以及同表中已知协变量 cov3 在未来的 2 行数据来预测目标变量 target1 和 target2 未来的 2 行数据 - ```SQL - IoTDB:etth> SELECT * FROM FORECAST ( - MODEL_ID => 'chronos2', - TARGETS => ( - SELECT TIME, target1, target2 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6) ORDER BY TIME, - HISTORY_COVS => ' - SELECT TIME, cov1, cov2, cov3 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6', - FUTURE_COVS => ' - SELECT TIME, cov3 - FROM etth.tab_real - WHERE TIME >= 7 - LIMIT 2', - OUTPUT_LENGTH => 2 - ) - +-----------------------------+-----------------+-----------------+ - | time| target1| target2| - +-----------------------------+-----------------+-----------------+ - |1970-01-01T08:00:00.007+08:00|7.244050025939941|7.244050025939941| - |1970-01-01T08:00:00.008+08:00|7.907227516174316|7.907227516174316| - +-----------------------------+-----------------+-----------------+ - Total line number = 2 - It costs 0.291s - ``` -* 预测任务三:使用不同表中的历史协变量 cov1,cov2 和已知协变量 cov3 辅助预测目标变量 target1 和 target2。 - - ![](/img/ainode-upgrade-table-forecast-timecho-3.png) - - * 提前创建表 tab\_cov\_forecast(存储已知协变量 cov3 的预测值 ),并准备相关数据。 - ```SQL - create table tab_cov_forecast (cov3 DOUBLE FIELD); - - --写入语句 - INSERT INTO tab_cov_forecast (time, cov3) VALUES (7, 7.0),(8, 8.0); - - IoTDB:etth> SELECT * FROM tab_cov_forecast - +----+----+ - |time|cov3| - +----+----+ - | 7| 7.0| - | 8| 8.0| - +----+----+ - ``` - * 使用表 tab\_real 中 cov1,cov2,cov3,target1,target2 已知的前 6 
行数据,以及表 tab\_cov\_forecast 中已知协变量 cov3 在未来的 2 行数据来预测目标变量 target1 和 target2 未来的 2 行数据 - ```SQL - IoTDB:etth> SELECT * FROM FORECAST ( - MODEL_ID => 'chronos2', - TARGETS => ( - SELECT TIME, target1, target2 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6) ORDER BY TIME, - HISTORY_COVS => ' - SELECT TIME, cov1, cov2, cov3 - FROM etth.tab_real - WHERE TIME < 7 - ORDER BY TIME DESC - LIMIT 6', - FUTURE_COVS => ' - SELECT TIME, cov3 - FROM etth.tab_cov_forecast - WHERE TIME >= 7 - LIMIT 2', - OUTPUT_LENGTH => 2 - ) - +-----------------------------+-----------------+-----------------+ - | time| target1| target2| - +-----------------------------+-----------------+-----------------+ - |1970-01-01T08:00:00.007+08:00|7.244050025939941|7.244050025939941| - |1970-01-01T08:00:00.008+08:00|7.907227516174316|7.907227516174316| - +-----------------------------+-----------------+-----------------+ - Total line number = 2 - It costs 0.351s - ``` - - -#### 4.1.2 时序分类 - -时序分类是时序预测之外的重要能力,在工业界具有广泛应用。其典型范式是输入多个测点的近期采样值,综合判断设备整体运行状态,输出当前状态的分类标签。例如:可用于新能源电池组设备的运行状态分类等场景。 - -AINode 表模型支持通过调用协变量分类模型执行时序数据的分类任务。 - -> 注意:该功能从 V2.0.9.1 版本开始提供。 - -1. **SQL 语法** - -```SQL -SELECT * FROM CLASSIFY( - MODEL_ID, - INPUTS -- 获取输入变量的 SQL - [TIMECOL, - MODEL_OPTIONS]? 
-) -``` - -* 参数介绍 - -| 参数名 | 参数类型 | 参数属性 | 描述 | 是否必填 | 备注 | -|-----------------|-------|-------------------| ----------------------------------------------------------------------------------------- | ---------- | ------------------------------------------------------------------------ | -| model\_id | 标量参数 | 字符串类型 | 分类所用模型的唯一标识| 是| | -| inputs | 表参数 | SET SEMANTIC | 输入的待分类数据。IoTDB 会自动将数据按时间升序排序再交给 AINode 。 | 是 | 使用 SQL 描述输入的待分类数据,输入的 SQL 不合法时会有对应的查询报错。| -| timecol | 标量参数 | 字符串类型,默认值:time | 时间列名| 否 | 存在于 inputs 中的,数据类型为 TIMESTAMP 的列,否则报错。 | -| model\_options | 标量参数 | 字符串类型默认值:空字符串。 | 模型相关的 key-value 对,比如是否需要对输入进行归一化等。不同的 key-value 对以 ';' 间隔 | 否| 指定某个模型不支持参数,并不会报错,只会被忽略 | - -说明: - -* ​**输入数据要求**​: - * 类型约束:目前仅支持 INT32、INT64、FLOAT、DOUBLE 类型。 - * 行数要求:不同模型要求不同。对于有行数限制的模型,低于最小行数或高于最多行数时将报错。 - * 列数要求:必须包含时间列。单变量分类模型仅支持单列,多列将报错;多变量分类模型通常无限制,除非模型自身有明确约束。 - * 顺序要求:多变量零样本分类模型通常无限制,除非模型自身有明确约束。 -* ​**输出结果**​: - * 返回结果是由时序数据的分类结果组成的表,其规格取决于模型的具体实现。 - -2. **使用示例** - -假设某项目的时序数据的变量数为10,输入长度为192。以自定义的 mantis\_custom 模型为例进行时序数据分类推理。 - -![](/img/ainode-classify-table-timecho.png) - -* 注册模型 - -```SQL -CREATE MODEL mantis_custom USING URI 'file:///path/to/mantis' -``` - -注册自定义模型的详细步骤说明可参考 [4.3 小节](#_4-3-注册自定义模型)。 - -* 运行SQL - -```SQL -IoTDB:etth> SELECT * FROM CLASSIFY ( - MODEL_ID => 'mantis_custom', - INPUTS => ( - SELECT Time, HUFL,HULL,MUFL,MULL,LUFL,LULL,OT,UT,MT,LT - FROM eg - WHERE TIME < 2016-07-09 00:00:00 - ORDER BY TIME DESC - LIMIT 192) ORDER BY TIME -) -``` - -* 执行结果 - -```SQL -+--------+ -|category| -+--------+ -| 4| -+--------+ -``` - - -### 4.2 模型微调 - -AINode 支持通过 SQL 进行模型微调任务。 - -**SQL 语法** - -```SQL -createModelStatement - | CREATE MODEL modelId=identifier (WITH HYPERPARAMETERS '(' hparamPair (',' hparamPair)* ')')? 
FROM MODEL existingModelId=identifier ON DATASET '(' targetData=string ')' - ; -hparamPair - : hparamKey=identifier '=' hyparamValue=primaryExpression - ; -``` - -**参数说明** - -| 名称 | 描述 | -| ----------------- |----------------------------------------------------------------------------------------------------------------------------------------| -| modelId | 微调出的模型的唯一标识 | -| hparamPair | 微调使用的超参数 key-value 对,目前支持如下:
`train_epochs`: int, number of fine-tuning epochs
`iter_per_epoch`: int, number of iterations per fine-tuning epoch
`learning_rate`: double 类型,学习率 | -| existingModelId | 微调使用的基座模型 | -| targetData | 用于获取微调使用的数据集的 SQL | - -**示例** - -1. 选择测点 ot 中指定时间范围的数据作为微调数据集,基于 sundial 创建模型 sundialv3。 - -```SQL -IoTDB> set sql_dialect=table -Msg: The statement is executed successfully. -IoTDB> CREATE MODEL sundialv3 FROM MODEL sundial ON DATASET ('SELECT time, ot from etth.eg where 1467302400000 <= time and time < 1517468400001') -Msg: The statement is executed successfully. -IoTDB> show models -+---------------------+---------+-----------+---------+ -| ModelId|ModelType| Category| State| -+---------------------+---------+-----------+---------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -| sundialv2| sundial| fine_tuned| active| -| sundialv3| sundial| fine_tuned| training| -+---------------------+---------+-----------+---------+ -``` - -2. 
微调任务后台异步启动,可在 AINode 进程看到 log;微调完成后,查询并使用新的模型 - -```SQL -IoTDB> show models -+---------------------+---------+-----------+---------+ -| ModelId|ModelType| Category| State| -+---------------------+---------+-----------+---------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -| sundialv2| sundial| fine_tuned| active| -| sundialv3| sundial| fine_tuned| active| -+---------------------+---------+-----------+---------+ -``` - -### 4.3 注册自定义模型 - -**符合以下要求的 Transformers 模型可以注册到 AINode 中:** - -1. AINode 目前使用 v4.56.2 版本的 transformers,构建模型时需**避免继承低版本(<4.50)接口**; -2. 模型需继承一类 AINode 的推理任务流水线(当前支持预测流水线): - * iotdb-core/ainode/iotdb/ainode/core/inference/pipeline/basic\_pipeline.py - - **V2.0.9.3 之前** - ```Python - class BasicPipeline(ABC): - def __init__(self, model_id, **model_kwargs): - self.model_info = model_info - self.device = model_kwargs.get("device", "cpu") - self.model = load_model(model_info, device_map=self.device, **model_kwargs) - - @abstractmethod - def preprocess(self, inputs, **infer_kwargs): - """ - 在推理任务开始前对输入数据进行前处理,包括形状验证和数值转换。 - """ - pass - - @abstractmethod - def postprocess(self, output, **infer_kwargs): - """ - 在推理任务结束后对输出结果进行后处理。 - """ - pass - - - class ForecastPipeline(BasicPipeline): - def __init__(self, model_info, **model_kwargs): - super().__init__(model_info, model_kwargs=model_kwargs) - - def preprocess(self, inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], **infer_kwargs): - """ - 在将输入数据传递给模型进行推理之前进行预处理,验证输入数据的形状和类型。 - - Args: - inputs (list[dict]): - 输入数据,字典列表,每个字典包含: - - 'targets': 形状为 (input_length,) 或 (target_count, input_length) 的张量。 - 
- 'past_covariates': 可选,张量字典,每个张量形状为 (input_length,)。 - - 'future_covariates': 可选,张量字典,每个张量形状为 (input_length,)。 - - infer_kwargs (dict, optional): 推理的额外关键字参数,如: - - `output_length`(int): 如果提供'future_covariates',用于验证其有效性。 - - Raises: - ValueError: 如果输入格式不正确(例如,缺少键、张量形状无效)。 - - Returns: - 经过预处理和验证的输入数据,可直接用于模型推理。 - """ - pass - - def forecast(self, inputs, **infer_kwargs): - """ - 对给定输入执行预测。 - - Parameters: - inputs: 用于进行预测的输入数据。类型和结构取决于模型的具体实现。 - **infer_kwargs: 额外的推理参数,例如: - - `output_length`(int): 模型应该生成的时间点数量。 - - Returns: - 预测输出,具体形式取决于模型的具体实现。 - """ - pass - - def postprocess(self, outputs: list[torch.Tensor], **infer_kwargs) -> list[torch.Tensor]: - """ - 在推理后对模型输出进行后处理,验证输出数据的形状并确保其符合预期维度。 - - Args: - outputs: - 模型输出,2D张量列表,每个张量形状为 `[target_count, output_length]`。 - - Raises: - InferenceModelInternalException: 如果输出张量形状无效(例如,维数错误)。 - ValueError: 如果输出格式不正确。 - - Returns: - list[torch.Tensor]: - 后处理后的输出,将是一个2D张量列表。 - """ - pass - ``` - - **V2.0.9.3 起** - ```Python - class BasicPipeline(ABC): - def __init__(self, model_id, **model_kwargs): - self.model_info = model_info - self.device = model_kwargs.get("device", "cpu") - self.model = load_model(model_info, device_map=self.device, **model_kwargs) - - @abstractmethod - def preprocess(self, inputs, **infer_kwargs): - """ - 在推理任务开始前对输入数据进行前处理,包括形状验证和数值转换。 - """ - pass - - @abstractmethod - def postprocess(self, output, **infer_kwargs): - """ - 在推理任务结束后对输出结果进行后处理。 - """ - pass - - - class ForecastPipeline(BasicPipeline): - def __init__(self, model_info, **model_kwargs): - super().__init__(model_info, model_kwargs=model_kwargs) - - def _preprocess( - self, - inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], - **infer_kwargs, - ): - """ - 在将输入数据传递给模型进行推理之前进行预处理,验证输入数据的形状和类型。 - - Args: - inputs (list[dict[str, dict[str, torch.Tensor] | torch.Tensor]]): - 输入数据,字典列表,每个字典包含: - - 'targets': 形状为 (input_length,) 或 (target_count, input_length) 的张量。 - - 'past_covariates': 可选,张量字典,每个张量形状为 (input_length,)。 - - 
'future_covariates': 可选,张量字典,每个张量形状为 (input_length,)。 - - infer_kwargs (dict, optional): 推理的额外关键字参数,如: - - `output_length`(int): 如果提供'future_covariates',用于验证其有效性。 - - Raises: - ValueError: 如果输入格式不正确(例如,缺少键、张量形状无效)。 - - Returns: - 经过预处理和验证的输入数据,可直接用于模型推理。 - """ - pass - - def forecast(self, inputs, **infer_kwargs): - """ - 对给定输入执行预测。 - - Parameters: - inputs: 用于进行预测的输入数据。类型和结构取决于模型的具体实现。 - **infer_kwargs: 额外的推理参数,例如: - - `output_length`(int): 模型应该生成的时间点数量。 - - Returns: - 预测输出,具体形式取决于模型的具体实现。 - """ - pass - - def _postprocess(self, outputs, **infer_kwargs) -> list[torch.Tensor]: - """ - 在推理后对模型输出进行后处理,验证输出数据的形状并确保其符合预期维度。 - - Args: - outputs: - 模型输出,2D张量列表,每个张量形状为 `[target_count, output_length]`。 - - Raises: - InferenceModelInternalException: 如果输出张量形状无效(例如,维数错误)。 - ValueError: 如果输出格式不正确。 - - Returns: - list[torch.Tensor]: - 后处理后的输出,将是一个2D张量列表。 - """ - pass - ``` - -3. 修改模型配置文件 config.json,确保包含以下字段: - - **V2.0.9.3 之前** - ```JSON - { - "auto_map": { - "AutoConfig": "config.Chronos2CoreConfig", // 指定模型 Config 类 - "AutoModelForCausalLM": "model.Chronos2Model" // 指定模型类 - }, - "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // 指定模型的推理流水线 - "model_type": "custom_t5", // 指定模型类型 - } - ``` - - * 必须通过 auto\_map 指定模型的 Config 类和模型类; - * 必须集成并指定推理流水线类; - * 对于 AINode 管理的内置(builtin)和自定义(user\_defined)模型,模型类别(model\_type)也作为不可重复的唯一标识。即,要注册的模型类别不得与任何已存在的模型类型重复,通过微调创建的模型将继承原模型的模型类别。 - - **V2.0.9.3 起** - > 参数 model_type 非必填 - ```JSON - { - "auto_map": { - "AutoConfig": "config.Chronos2CoreConfig", // 指定模型 Config 类 - "AutoModelForCausalLM": "model.Chronos2Model" // 指定模型类 - }, - "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // 指定模型的推理流水线 - } - ``` - * 必须通过 auto\_map 指定模型的 Config 类和模型类; - * 必须集成并指定推理流水线类; - - -4. 
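Make sure the model directory to be registered contains the following files. The names of the model configuration file and the weight file cannot be customized: - * Model configuration file: config.json; - * Model weight file: model.safetensors; - * Model code: other .py files. - -**The SQL syntax for registering a custom model is as follows:** - -```SQL -CREATE MODEL <model_id> USING URI <uri> -``` - -**Parameter description:** - -* **model\_id**: the unique identifier of the custom model; it must not duplicate an existing one and is subject to the following constraints: - * Allowed characters: [ 0-9 a-z A-Z \_ ] (letters, digits (not as the first character), underscores (not as the first character)) - * Length is limited to 2-64 characters - * Case sensitive -* **uri**: the local URI of the directory that contains the model code and weights. - -**Registration example:** - -Upload a custom Transformers model from a local path. AINode copies the folder into the user\_defined directory. - -```SQL -CREATE MODEL chronos2 USING URI 'file:///path/to/chronos2' -``` - -After the SQL statement is executed, registration proceeds asynchronously. You can check the registration state through the model listing (see the Viewing Models section). Once registration is complete, you can run model inference by calling the corresponding function in an ordinary query. - -### 4.4 Viewing Models - -You can query the details of successfully registered models with the show command. - -```SQL -SHOW MODELS -``` - -Besides listing the information of all models, you can specify a `model_id` to view the information of one particular model. - -```SQL -SHOW MODELS <model_id> -- show only the specified model -``` - -The model listing contains the following columns: - -| **ModelId** | **ModelType** | **Category** | **State** | -| ------------------- | --------------------- | -------------------- | ----------------- | -| Model ID | Model type | Model category | Model state | - -The state machine for the State column is shown in the following figure: - -![](/img/ainode-upgrade-state-timecho.png) - -State machine description: - -1. After AINode starts, running `show models` lists only the **built-in (BUILTIN)** models. -2. Users can import their own models; the source of such models is marked as **user-defined (USER\_DEFINED)**. AINode tries to parse the model type (ModelType) from the model configuration file, and if parsing fails the field is displayed as empty. -3. The weight files of the large time series models (built-in models) are not packaged with AINode; AINode downloads them automatically at startup. - 1. The state is ACTIVATING during the download, changes to ACTIVE when the download succeeds, and to INACTIVE when it fails. -4. After a user starts a model fine-tuning task, the state of the model being trained is TRAINING; it becomes ACTIVE if training succeeds and FAILED if it fails. -5. 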
若微调任务成功,微调结束后会统计所有 ckpt (训练文件)中指标最佳的文件并自动重命名,变成用户指定的 model\_id。 - -**查看示例** - -```SQL -IoTDB> show models -+---------------------+--------------+--------------+-------------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------+--------------+-------------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| custom| | user_defined| active| -| timer_xl| timer| builtin| activating| -| sundial| sundial| builtin| active| -| sundialx_1| sundial| fine_tuned| active| -| sundialx_4| sundial| fine_tuned| training| -| sundialx_5| sundial| fine_tuned| failed| -| chronos2| t5| builtin| inactive| -+---------------------+--------------+--------------+-------------+ -``` - -内置传统时序模型介绍如下: - -| 模型名称 | 核心概念 | 适用场景 | 主要特点 | -|----------------------------------| ----------------------------------------------------------------------------------------- | ---------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | -| **ARIMA**(自回归整合移动平均模型) | 结合自回归(AR)、差分(I)和移动平均(MA),用于预测平稳时间序列或可通过差分变为平稳的数据。 | 单变量时间序列预测,如股票价格、销量、经济指标等。| 1. 适用于线性趋势和季节性较弱的数据。2. 需要选择参数 (p,d,q)。3. 对缺失值敏感。 | -| **Holt-Winters**(三参数指数平滑) | 基于指数平滑,引入水平、趋势和季节性三个分量,适用于具有趋势和季节性的数据。 | 有明显季节性和趋势的时间序列,如月度销售额、电力需求等。 | 1. 可处理加性或乘性季节性。2. 对近期数据赋予更高权重。3. 简单易实现。 | -| **Exponential Smoothing**(指数平滑) | 通过加权平均历史数据,权重随时间指数递减,强调近期观测值的重要性。 | 无显著季节性但存在趋势的数据,如短期需求预测。 | 1. 参数少,计算简单。2. 适合平稳或缓慢变化序列。3. 可扩展为双指数或三指数平滑。 | -| **Naive Forecaster**(朴素预测器) | 使用最近一期的观测值作为下一期的预测值,是最简单的基准模型。 | 作为其他模型的比较基准,或数据无明显模式时的简单预测。 | 1. 无需训练。2. 对突发变化敏感。3. 
季节性朴素变体可用前一季节同期值预测。 | -| **STL Forecaster**(季节趋势分解预测) | 基于STL分解时间序列,分别预测趋势、季节性和残差分量后组合。 | 具有复杂季节性、趋势和非线性模式的数据,如气候数据、交通流量。 | 1. 能处理非固定季节性。2. 对异常值稳健。3. 分解后可结合其他模型预测各分量。 | -| **Gaussian HMM**(高斯隐马尔可夫模型) | 假设观测数据由隐藏状态生成,每个状态的观测概率服从高斯分布。 | 状态序列预测或分类,如语音识别、金融状态识别。 | 1. 适用于时序数据的状态建模。2. 假设观测值在给定状态下独立。3. 需指定隐藏状态数量。 | -| **GMM HMM** (高斯混合隐马尔可夫模型) | 扩展Gaussian HMM,每个状态的观测概率由高斯混合模型描述,可捕捉更复杂的观测分布。 | 需要多模态观测分布的场景,如复杂动作识别、生物信号分析。 | 1. 比单一高斯更灵活。2. 参数更多,计算复杂度高。3. 需训练GMM成分数。 | -| **STRAY**(基于奇异值的异常检测) | 通过奇异值分解(SVD)检测高维数据中的异常点,常用于时间序列异常检测。 | 高维时间序列的异常检测,如传感器网络、IT系统监控。 | 1. 无需分布假设。2. 可处理高维数据。3. 对全局异常敏感,局部异常可能漏检。 | - -内置时序大模型介绍如下: - -| 模型名称 | 核心概念 | 适用场景 | 主要特点 | -|---------------| ---------------------------------------------------------------------- | ------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| **Timer-XL** | 支持超长上下文的时序大模型,通过大规模工业数据预训练增强泛化能力。 | 需利用极长历史数据的复杂工业预测,如能源、航空航天、交通等领域。 | 1. 超长上下文支持,可处理数万时间点输入。2. 多场景覆盖,支持非平稳、多变量及协变量预测。3. 基于万亿级高质量工业时序数据预训练。 | -| **Timer-Sundial** | 采用“Transformer + TimeFlow”架构的生成式基础模型,专注于概率预测。 | 需要量化不确定性的零样本预测场景,如金融、供应链、新能源发电预测。 | 1. 强大的零样本泛化能力,支持点预测与概率预测 2. 可灵活分析预测分布的任意统计特性。3. 创新生成架构,实现高效的非确定性样本生成。 | -| **Chronos-2** | 基于离散词元化范式的通用时序基础模型,将预测转化为语言建模任务。 | 快速零样本单变量预测,以及可借助协变量(如促销、天气)提升效果的场景。 | 1. 强大的零样本概率预测能力。2. 支持协变量统一建模,但对输入有严格要求:a. 未来协变量的名称组成的集合必须是历史协变量的名称组成的集合的子集;b. 每个历史协变量的长度必须等于目标变量的长度; c. 每个未来协变量的长度必须等于预测长度;3. 
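An efficient encoder-style architecture that balances performance and inference speed. |
-
-
-### 4.5 Delete a Model
-
-A successfully registered model can be deleted with SQL. AINode removes the entire corresponding model folder under the user\_defined directory. The SQL syntax is:
-
-```SQL
-DROP MODEL <model_id>
-```
-
-Specify the model\_id of a successfully registered model to delete it. Because deletion involves cleaning up model data, it does not complete immediately; during this time the model is in state DROPPING and cannot be used for inference. Note that built-in models cannot be deleted.
-
-### 4.6 Load/Unload Models
-
-To suit different scenarios, AINode provides two model loading strategies:
-
-* On-demand loading: the model is loaded temporarily at inference time and its resources are released afterwards. Suitable for testing or low-load scenarios.
-* Resident loading: the model is kept loaded in memory (CPU) or GPU memory to support high-concurrency inference. Users only specify through SQL which models to load or unload; AINode manages the number of instances automatically. The state of resident models can be inspected at any time.
-
-The rest of this section describes loading and unloading in detail:
-
-1. Configuration parameters
-
-Resident loading is tuned by editing the following configuration items.
-
-```Properties
-# Fraction of the total device memory / GPU memory that AINode may use for inference
-# Datatype: Float
-ain_inference_memory_usage_ratio=0.4
-
-# Memory reserved for each loaded model instance, as a multiple of the model's own footprint (model size * this value)
-# Datatype: Float
-ain_inference_extra_memory_ratio=1.2
-```
-
-2. Show available devices
-
-The following SQL command lists all available device IDs.
-
-```SQL
-SHOW AI_DEVICES
-```
-
-Example
-
-```SQL
-IoTDB> show ai_devices
-+-------------+
-|     DeviceId|
-+-------------+
-|          cpu|
-|            0|
-|            1|
-+-------------+
-```
-
-3. Load a model
-
-The following SQL command loads a model manually. The system **automatically balances** the number of model instances according to hardware resource usage.
-
-```SQL
-LOAD MODEL <existing_model_id> TO DEVICES <device_id>(, <device_id>)*
-```
-
-Parameter requirements
-
-* **existing\_model\_id:** ID of the model to load. The current version supports only timer\_xl and sundial.
-* **device\_id:** Where the model is loaded.
-  * **cpu:** Load into the memory of the server that hosts AINode.
-  * **gpu\_id:** Load into the corresponding GPU(s) of the server that hosts AINode; for example, "0, 1" loads the model onto GPUs 0 and 1.
-
-Example
-
-```SQL
-LOAD MODEL sundial TO DEVICES 'cpu,0,1'
-```
-
-4. Unload a model
-
-The following SQL command manually unloads all instances of the specified model. The system **reallocates** the freed resources to other models.
-
-```SQL
-UNLOAD MODEL <existing_model_id> FROM DEVICES <device_id>(, <device_id>)*
-```
-
-Parameter requirements
-
-* **existing\_model\_id:** ID of the model to unload. The current version supports only timer\_xl and sundial.
-* **device\_id:** Where the model is unloaded from.
-  * **cpu:** Try to unload the model from the memory of the server that hosts AINode.
-  * **gpu\_id:** Try to unload the model from the corresponding GPU(s) of the server that hosts AINode; for example, "0, 1" tries to unload the model from GPUs 0 and 1.
-
-Example
-
-```SQL
-UNLOAD MODEL sundial FROM DEVICES 'cpu,0,1'
-```
-
-5. 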
展示加载的模型 - -支持通过如下 SQL 命令查看已经手动加载的模型实例,可通过 `device_id `指定设备。 - -```SQL -SHOW LOADED MODELS -SHOW LOADED MODELS (, )* # 展示指定设备中的模型实例 -``` - -示例:在内存、gpu\_0 和 gpu\_1 两张显卡加载了sundial 模型 - -```SQL -IoTDB> show loaded models -+-------------+--------------+------------------+ -| DeviceId| ModelId| Count(instances)| -+-------------+--------------+------------------+ -| cpu| sundial| 4| -| 0| sundial| 6| -| 1| sundial| 6| -+-------------+--------------+------------------+ -``` - -说明: - -* DeviceId : 设备 ID -* ModelId :加载的模型 ID -* Count(instances) :每个设备中的模型实例数量(系统自动分配) - -### 4.7 时序大模型介绍 - -AINode 目前支持多种时序大模型,相关介绍及部署使用可参考[时序大模型](../AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md) - -## 5. 权限管理 - -使用 AINode 相关的功能时,可以使用IoTDB本身的鉴权去做一个权限管理,用户只有在具备 USE\_MODEL 权限时,才可以使用模型管理的相关功能。当使用推理功能时,用户需要有访问输入模型的 SQL 对应的源序列的权限。 - -| **权限名称** | **权限范围** | **管理员用户(默认ROOT)** | **普通用户** | -| ------------------------- | ----------------------------------------- | ---------------------------------- | -------------------- | -| USE\_MODEL | create model / show models / drop model | √ | √ | -| READ\_SCHEMA&READ\_DATA | forecast | √ | √ | diff --git a/src/zh/UserGuide/latest-Table/AI-capability/AINode_timecho.md b/src/zh/UserGuide/latest-Table/AI-capability/AINode_timecho.md deleted file mode 100644 index 1cefe4f43..000000000 --- a/src/zh/UserGuide/latest-Table/AI-capability/AINode_timecho.md +++ /dev/null @@ -1,453 +0,0 @@ - - -# AINode - -AINode 是支持时序相关模型注册、管理、调用的 IoTDB 原生节点,内置业界领先的自研时序大模型,如清华自研时序模型 Timer 系列,可通过标准 SQL 语句进行调用,实现时序数据的毫秒级实时推理,可支持时序趋势预测、缺失值填补、异常值检测等应用场景。 - -> V2.0.5.1及以后版本支持 - -系统架构如下图所示: - -![](/img/AINode-0.png) - -三种节点的职责如下: - -- **ConfigNode**:负责分布式节点管理和负载均衡。 -- **DataNode**:负责接收并解析用户的 SQL请求;负责存储时间序列数据;负责数据的预处理计算。 -- **AINode**:负责时序模型的管理和使用。 - -## 1. 优势特点 - -与单独构建机器学习服务相比,具有以下优势: - -- **简单易用**:无需使用 Python 或 Java 编程,使用 SQL 语句即可完成机器学习模型管理与推理的完整流程。如创建模型可使用CREATE MODEL语句、使用模型进行推理可使用 SELECT * FROM FORECAST (...) 
语句等,使用更加简单便捷。 - -- **避免数据迁移**:使用 IoTDB 原生机器学习可以将存储在 IoTDB 中的数据直接应用于机器学习模型的推理,无需将数据移动到单独的机器学习服务平台,从而加速数据处理、提高安全性并降低成本。 - -![](/img/h1.png) - -- **内置先进算法**:支持业内领先机器学习分析算法,覆盖典型时序分析任务,为时序数据库赋能原生数据分析能力。如: - - **时间序列预测(Time Series Forecasting)**:从过去时间序列中学习变化模式;从而根据给定过去时间的观测值,输出未来序列最可能的预测。 - - **时序异常检测(Anomaly Detection for Time Series)**:在给定的时间序列数据中检测和识别异常值,帮助发现时间序列中的异常行为。 - - **时间序列标注(Time Series Annotation)**:为每个数据点或特定时间段添加额外的信息或标记,例如事件发生、异常点、趋势变化等,以便更好地理解和分析数据。 - - -## 2. 基本概念 - -- **模型(Model)**:机器学习模型,以时序数据作为输入,输出分析任务的结果或决策。模型是 AINode 的基本管理单元,支持模型的增(注册)、删、查、改(微调)、用(推理)。 -- **创建(Create)**: 将外部设计或训练好的模型文件或算法加载到 AINode 中,由 IoTDB 统一管理与使用。 -- **推理(Inference)**:使用创建的模型在指定时序数据上完成该模型适用的时序分析任务。 -- **内置能力(Built-in)**:AINode 自带常见时序分析场景(例如预测与异常检测)的机器学习算法或自研模型。 - -![](/img/AINode-new.png) - -## 3. 安装部署 - -AINode 的部署可参考文档 [AINode 部署](../Deployment-and-Maintenance/AINode_Deployment_timecho.md) 章节。 - -## 4. 使用指导 - -AINode 对时序模型提供了模型创建及删除功能,内置模型无需创建,可直接使用。 - -### 4.1 注册模型 - -通过指定模型输入输出的向量维度,可以注册训练好的深度学习模型,从而用于模型推理。 - -符合以下内容的模型可以注册到AINode中: - 1. AINode 目前支持基于 PyTorch 2.4.0 版本训练的模型,需避免使用 2.4.0 版本以上的特性。 - 2. AINode 支持使用 PyTorch JIT 存储的模型(`model.pt`),模型文件需要包含模型的结构和权重。 - 3. 模型输入序列可以包含一列或多列,若有多列,需要和模型能力、模型配置文件对应。 - 4. 
模型的配置参数必须在`config.yaml`配置文件中明确定义。使用模型时,必须严格按照`config.yaml`配置文件中定义的输入输出维度。如果输入输出列数不匹配配置文件,将会导致错误。 - -下方为模型注册的SQL语法定义。 - -```SQL -create model using uri -``` - -SQL中参数的具体含义如下: - -- model_id:模型的全局唯一标识,不可重复。模型名称具备以下约束: - - - 允许出现标识符 [ 0-9 a-z A-Z _ ](字母,数字(非开头),下划线(非开头)) - - 长度限制为2-64字符 - - 大小写敏感 - -- uri:模型注册文件的资源路径,路径下应包含**模型结构及权重文件 model.pt 文件和模型配置文件 config.yaml** - - - 模型结构及权重文件:模型训练完成后得到的权重文件,目前支持 pytorch 训练得到的 .pt 文件 - - - 模型配置文件:模型注册时需要提供的与模型结构有关的参数,其中必须包含模型的输入输出维度用于模型推理: - - | **参数名** | **参数描述** | **示例** | - | ------------ | ---------------------------- | -------- | - | input_shape | 模型输入的行列,用于模型推理 | [96,2] | - | output_shape | 模型输出的行列,用于模型推理 | [48,2] | - - 除了模型推理外,还可以指定模型输入输出的数据类型: - - | **参数名** | **参数描述** | **示例** | - | ----------- | ------------------ | --------------------- | - | input_type | 模型输入的数据类型 | ['float32','float32'] | - | output_type | 模型输出的数据类型 | ['float32','float32'] | - - 除此之外,可以额外指定备注信息用于在模型管理时进行展示 - - | **参数名** | **参数描述** | **示例** | - | ---------- | ---------------------------------------------- | ------------------------------------------- | - | attributes | 可选,用户自行设定的模型备注信息,用于模型展示 | 'model_type': 'dlinear','kernel_size': '25' | - - -除了本地模型文件的注册,还可以通过URI来指定远程资源路径来进行注册,使用开源的模型仓库(例如HuggingFace)。 - -#### 示例 - -在 [example 文件夹](https://github.com/apache/iotdb/tree/master/integration-test/src/test/resources/ainode-example)下,包含model.pt和config.yaml文件,model.pt为训练得到,config.yaml的内容如下: - -```YAML -configs: - # 必选项 - input_shape: [96, 2] # 表示模型接收的数据为96行x2列 - output_shape: [48, 2] # 表示模型输出的数据为48行x2列 - - # 可选项 默认为全部float32,列数为shape对应的列数 - input_type: ["int64","int64"] #输入对应的数据类型,需要与输入列数匹配 - output_type: ["text","int64"] #输出对应的数据类型,需要与输出列数匹配 - -attributes: # 可选项 为用户自定义的备注信息 - 'model_type': 'dlinear' - 'kernel_size': '25' -``` - -指定该文件夹作为加载路径就可以注册该模型 - -```SQL -IoTDB> create model dlinear_example using uri "file://./example" -``` - -SQL执行后会异步进行注册的流程,可以通过模型展示查看模型的注册状态(见模型展示章节),注册成功的耗时主要受到模型文件大小的影响。 - -模型注册完成后,就可以通过使用正常查询的方式调用具体函数,进行模型推理。 - -### 4.2 查看模型 - 
-注册成功的模型可以通过show models指令查询模型的具体信息。其SQL定义如下: - -```SQL -show models - -show models -``` - -除了直接展示所有模型的信息外,可以指定model id来查看某一具体模型的信息。模型展示的结果中包含如下信息: - -| **ModelId** | **ModelType** | **Category** | **State** | -|-------------|-----------|--------------|----------------| -| 模型ID | 模型类型 | 模型种类 | 模型状态 | - -- 模型状态机流转示意图如下 - -![](/img/AINode-State.png) - -**说明:** - -1. 启动 AINode,show models 只能看到 BUILT-IN 模型 -2. 用户可导入自己的模型,来源为 USER-DEFINED,可尝试从配置文件解析 ModelType,解析不到则为空 -3. 时序大模型权重不随 AINode 打包,AINode 启动时自动下载,下载过程中为 LOADING -4. 下载成功转变为 ACTIVE,失败则变成 INACTIVE -5. 用户启动微调,正在训练的模型状态为 TRAINING,训练成功变为 ACTIVE,失败则是 FAILED - -**示例** - -```SQL -IoTDB> show models -+---------------------+--------------------+--------------+---------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+--------------+---------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| custom| | USER-DEFINED| ACTIVE| -| timerxl| Timer-XL| BUILT-IN| LOADING| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| sundialx_1| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx_2| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx_4| Timer-Sundial| FINE-TUNED| TRAINING| -| sundialx_5| Timer-Sundial| FINE-TUNED| FAILED| -+---------------------+--------------------+--------------+---------+ -``` - -### 4.3 删除模型 - -对于注册成功的模型,用户可以通过SQL进行删除,该操作会删除所有 AINode 下的相关模型文件,其SQL如下: - -```SQL -drop model -``` - -需要指定已经成功注册的模型 model_id 来删除对应的模型。由于模型删除涉及模型数据清理,操作不会立即完成,此时模型的状态为 DROPPING,该状态的模型不能用于模型推理。请注意,该功能不支持删除内置模型。 - -### 4.4 使用内置模型推理 - -SQL语法如下: - - -```SQL -SELECT * FROM forecast( - input, - model_id, - [output_length, - output_start_time, - 
output_interval, - timecol, - preserve_input, - model_options]? -) -``` - -内置模型推理无需注册流程,通过 forecast 函数,指定 model_id 就可以使用模型的推理功能 - - - 请注意,使用内置时序大模型进行推理的前提条件是本地存有对应模型权重,目录为 /IOTDB_AINODE_HOME/data/ainode/models/weights/model_id/。若本地没有模型权重,则会自动从 HuggingFace 拉取,请保证本地能直接访问 HuggingFace。 - -- 参数介绍如下: - -| 参数名 | 参数类型 | 参数属性 | 描述 | 是否必填 | 备注 | -| :---------------- | :------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :------- | :----------------------------------------------------------- | -| input | 表参数 | SET SEMANTIC | 待预测的输入数据 | 是 | | -| model_id | 标量参数 | 字符串类型 | 需要选择的model名 | 是 | 只能为非空,且内置的模型,否则报错:空字符串:MODEL_ID should never be null or empty不存在的模型:model [%s] has not been created模型不可用:model [%s] is not available | -| output_length | 标量参数 | INT32类型默认值:96 | 输出窗口大小 | 否 | 必须大于 0,否则报错:OUTPUT_LENGTH should be greater than 0 | -| output_start_time | 标量参数 | 时间戳类型默认值:输入数据的最后一个时间戳加 output_interval | 输出的预测点的起始时间戳 | 否 | 可以为负数,表示1970年1月1号之前的时间戳 | -| output_interval | 标量参数 | 时间间隔类型默认值:0(输入数据的采样间隔) | 输出的预测点之间的时间间隔支持的单位是 ns、us、ms、s、m、h、d、w | 否 | 大于 0 时,采用用户指定的输出间隔;小于等于 0 时,根据输入数据自动推测 | -| timecol | 标量参数 | 字符串类型默认值:time | 时间列名 | 否 | 存在于 input 中的,数据类型为 TIMESTAMP 的列,否则报错:若数据类型不为 TIMESTAMP: The type of the column [%s] is not as expected.若列不存在:Required column [%s] not found in the source table argument. | -| preserve_input | 标量参数 | 布尔类型默认值:false | 是否在输出结果集中保留输入的所有原始行 | 否 | | -| model_options | 标量参数 | 字符串类型默认值:空字符串 | 模型相关的key-value对,比如是否需要对输入进行归一化等。不同的key-value对以';'间隔 | 否 | 指定某个模型不支持参数,并不会报错,只会被忽略;AINode 中内置的模型支持的常见参数详见文末附录说明。 | - -**说明:** - -1. forecast 函数默认对输入表中所有列进行预测(不包含time列和partition by 的列)。 -2. forecast 函数对于输入数据无顺序性要求,默认对输入数据按照时间戳(由 TIMECOL 参数指定时间戳的列名)做升序排序后,再调用模型进行预测。 -3. 不同模型对于输入数据的行数要求不同,输入数据少于最低行数要求时会报错。 - - 在当前的 AINdoe 内置模型中,Timer-XL 模型至少需要输入 96 行数据,Timer-Sundial 模型至少需要输入 16 行数据。 -4. 
The result columns of the forecast function include all input columns of the input table, with the same data types as in the source table. If preserve_input = true, an is_input column is also included (indicating whether the row is an input row).
-   - Only INT32, INT64, FLOAT, and DOUBLE columns can currently be forecast; otherwise an error is raised: The type of the column [%s] is [%s], only INT32, INT64, FLOAT, DOUBLE is allowed
-5. output_start_time and output_interval only affect how the timestamp column of the result set is generated; both are optional.
-   - output_start_time defaults to the last timestamp of the input data plus output_interval
-   - output_interval = (last timestamp of the input data - first timestamp of the input data) / (n - 1), where n is the number of input rows; by default this is the sampling interval of the input data
-   - The time of the N-th output row is output_start_time + (N - 1) * output_interval
-
-**Example: the database and table must be created beforehand**
-
-```sql
-create database etth
-create table eg (hufl FLOAT FIELD, hull FLOAT FIELD, mufl FLOAT FIELD, mull FLOAT FIELD, lufl FLOAT FIELD, lull FLOAT FIELD, ot FLOAT FIELD)
-```
-
-The test data used is [ETTh1-tab](/img/ETTh1-tab.csv).
-
-**List the currently supported models**
-
-```Bash
-IoTDB:etth> show models
-+---------------------+--------------------+--------+------+
-|              ModelId|           ModelType|Category| State|
-+---------------------+--------------------+--------+------+
-|                arima|               Arima|BUILT-IN|ACTIVE|
-|          holtwinters|         HoltWinters|BUILT-IN|ACTIVE|
-|exponential_smoothing|ExponentialSmoothing|BUILT-IN|ACTIVE|
-|     naive_forecaster|     NaiveForecaster|BUILT-IN|ACTIVE|
-|       stl_forecaster|       StlForecaster|BUILT-IN|ACTIVE|
-|         gaussian_hmm|         GaussianHmm|BUILT-IN|ACTIVE|
-|              gmm_hmm|              GmmHmm|BUILT-IN|ACTIVE|
-|                stray|               Stray|BUILT-IN|ACTIVE|
-|              sundial|       Timer-Sundial|BUILT-IN|ACTIVE|
-|             timer_xl|            Timer-XL|BUILT-IN|ACTIVE|
-+---------------------+--------------------+--------+------+
-Total line number = 10
-It costs 0.004s
-```
-
-**Table-model inference (using sundial as an example)**
-
-```Bash
-IoTDB:etth> select Time, HUFL,HULL,MUFL,MULL,LUFL,LULL,OT from eg LIMIT 96
-+-----------------------------+------+-----+-----+-----+-----+-----+------+
-|                         Time|  HUFL| HULL| MUFL| MULL| LUFL| LULL|    OT|
-+-----------------------------+------+-----+-----+-----+-----+-----+------+
-|2016-07-01T00:00:00.000+08:00| 5.827|2.009|1.599|0.462|4.203| 1.34|30.531|
-|2016-07-01T01:00:00.000+08:00| 5.693|2.076|1.492|0.426|4.142|1.371|27.787|
-|2016-07-01T02:00:00.000+08:00| 5.157|1.741|1.279|0.355|3.777|1.218|27.787| -|2016-07-01T03:00:00.000+08:00| 5.09|1.942|1.279|0.391|3.807|1.279|25.044| -...... -Total line number = 96 -It costs 0.119s - -IoTDB:etth> select * from forecast( - model_id => 'sundial', - input => (select Time, ot from etth.eg where time >= 2016-08-07T18:00:00.000+08:00 limit 1440) order BY time, - output_length => 96 -) -+-----------------------------+---------+ -| time| ot| -+-----------------------------+---------+ -|2016-10-06T18:00:00.000+08:00|20.781654| -|2016-10-06T19:00:00.000+08:00|20.252121| -|2016-10-06T20:00:00.000+08:00|19.960138| -|2016-10-06T21:00:00.000+08:00|19.662334| -...... -Total line number = 96 -It costs 1.615s -``` -### 4.5 使用内置模型微调 - -> 仅 Timer-XL、Timer-Sundial 可以进行微调操作。 - -SQL语法如下: - - -```SQL -create model (with hyperparameters -(=(, =)*))? -from model -on dataset (inputSql) -``` - -#### 示例 - -1. 选择测点 ot 中前 80% 数据作为微调数据集,基于 sundial 创建模型 sundialv3。 - -```SQL -IoTDB> set sql_dialect=table -Msg: The statement is executed successfully. -IoTDB> CREATE MODEL sundialv3 FROM MODEL sundial ON DATASET ('SELECT time, ot from etth.eg where 1467302400000 <= time and time < 1517468400001') -Msg: The statement is executed successfully. 
-IoTDB> show models -+---------------------+--------------------+----------+--------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+--------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| timer_xl| Timer-XL| BUILT-IN| ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED| ACTIVE| -| sundialv3| Timer-Sundial|FINE-TUNED|TRAINING| -+---------------------+--------------------+----------+--------+ -``` - -2. 微调任务后台异步启动,可在 AINode 进程看到 log;微调完成后,查询并使用新的模型 - -```SQL -IoTDB> show models -+---------------------+--------------------+----------+------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+------+ -| arima| Arima| BUILT-IN|ACTIVE| -| holtwinters| HoltWinters| BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN|ACTIVE| -| stray| Stray| BUILT-IN|ACTIVE| -| sundial| Timer-Sundial| BUILT-IN|ACTIVE| -| timer_xl| Timer-XL| BUILT-IN|ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED|ACTIVE| -| sundialv3| Timer-Sundial|FINE-TUNED|ACTIVE| -+---------------------+--------------------+----------+------+ -``` - -### 4.6 时序大模型导入步骤 - -AINode 目前支持多种时序大模型,部署使用请参考[时序大模型](../AI-capability/TimeSeries-Large-Model.md) - -## 5 权限管理 - -使用AINode相关的功能时,可以使用IoTDB本身的鉴权去做一个权限管理,用户只有在具备 USE_MODEL 权限时,才可以使用模型管理的相关功能。当使用推理功能时,用户需要有访问输入模型的SQL对应的源序列的权限。 - -| **权限名称** | **权限范围** | **管理员用户(默认ROOT)** | **普通用户** | **路径相关** | -| :----------- | 
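:-------------------------------------- | :------------------------- | :----------- | :----------- |
-| USE_MODEL    | create model / show models / drop model | √                          | √            | x            |
-| READ_DATA    | call inference                          | √                          | √            | √            |
-
-## 6 Appendix
-
-**Arima**
-
-| Supported parameter | Meaning | Default |
-| :---------------------- | :----------------------------------------------------------- | :-------- |
-| order | Order of the ARIMA model `(p, d, q)`: p is the autoregressive order, d the differencing order, and q the moving-average order. | (1,0,0) |
-| seasonal_order | Order of the seasonal ARIMA component `(P, D, Q, s)`: the seasonal autoregressive, differencing, and moving-average orders, with s the seasonal period (e.g. 12 for monthly data). | (0,0,0,0) |
-| method | Optimizer to use; one of 'newton', 'nm', 'bfgs', 'lbfgs', 'powell', 'cg', 'ncg', 'basinhopping'. | 'lbfgs' |
-| maxiter | Maximum number of iterations or function evaluations. | 50 |
-| out_of_sample_size | Number of samples at the tail of the series held out for validation; the model is not fitted on them. | 0 |
-| scoring | Scoring function used for validation; either the name of a metric importable from sklearn or a user-defined function. | 'mse' |
-| trend | Trend configuration. If with_intercept=True and this is None, 'c' (constant term) is used. | None |
-| with_intercept | Whether to include an intercept term. | True |
-| time_varying_regression | Whether regression coefficients are allowed to vary over time. | False |
-| enforce_stationarity | Whether to enforce stationarity of the AR component. | True |
-| enforce_invertibility | Whether to enforce invertibility of the MA component. | True |
-| simple_differencing | Whether to estimate on differenced data (sacrifices the first few rows in exchange for a simpler state space). | False |
-| measurement_error | Whether observations are assumed to contain measurement error. | False |
-| mle_regression | Whether to estimate regression coefficients by maximum likelihood; must be False if `time_varying_regression=True`. | True |
-| hamilton_representation | Whether to use the Hamilton representation (the Harvey representation is the default). | False |
-| concentrate_scale | Whether to concentrate the error variance out of the likelihood, reducing the number of estimated parameters (standard errors for the error variance are then unavailable). | False |
-
-**NaiveForecaster**
-
-| Supported parameter | Meaning | Default |
-| ---------- | ------------------------------------------------------------ | ------ |
-| strategy | Forecasting strategy: • `"last"`: forecast the last value of the training series; if a seasonal period is set (`sp`>1), forecast the last value of each season. Robust to NaN values. • `"mean"`: forecast the mean of the last window; if `sp`>1, the mean is computed per season. Robust to NaN values. • `"drift"`: fit a line through the first and last points of the last window and extrapolate. Not robust to NaN values. | "last" |
-| sp | Seasonal period. `None` is equivalent to `1`, meaning no seasonality; 12 means one cycle every 12 units (e.g. months). | 1 |
-
-**STLForecaster**
-
-| Supported parameter | Meaning | Default |
-| :------------ | :----------------------------------------------------------- | :----- |
-| sp | Length of the seasonal period (number of units per cycle). Passed to the STL implementation in statsmodels. | 2 |
-| seasonal | Length of the seasonal smoothing window; must be an odd number ≥3, and ≥7 is usually recommended. | 7 |
-| seasonal_deg | Polynomial degree of the seasonal LOESS (0 for constant, 1 for linear). | 1 |
-| trend_deg | Polynomial degree of the trend LOESS (0 or 1). | 1 |
-| low_pass_deg | Polynomial degree of the low-pass LOESS (0 or 1). | 1 |
-| seasonal_jump | Interpolation step of the seasonal LOESS: fit every n points and interpolate in between. Larger values make estimation faster. | 1 |
-| trend_jump | Interpolation step of the trend component; larger values are faster but may reduce accuracy. | 1 |
-| low_pass_jump | Interpolation step of the low-pass component; same behavior as above. | 1 |
-
-**ExponentialSmoothing (HoltWinters)**
-
-| Supported parameter | Meaning | Default |
-| :-------------------- | :----------------------------------------------------------- | :---------- |
-| damped_trend | Whether to use a damped trend (the trend flattens gradually instead of growing without bound). | True |
-| initialization_method | Initialization method: • `"estimated"`: estimate the initial state by fitting • `"heuristic"`: estimate the initial level/trend/season heuristically • `"known"`: the user supplies all initial values explicitly • `"legacy-heuristic"`: backward-compatible legacy behavior | "estimated" |
-| optimized | Whether to optimize parameters by maximizing the log-likelihood. | True |
-| remove_bias | Whether to remove bias so that the residuals between forecast and fitted values average to 0. | False |
-| use_brute | Whether to use brute force (grid search) to find initial parameters; otherwise heuristic starting values are used. | |
diff --git a/src/zh/UserGuide/latest-Table/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md b/src/zh/UserGuide/latest-Table/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md
deleted file mode 100644
index 953b3ed9a..000000000
--- a/src/zh/UserGuide/latest-Table/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md
+++ /dev/null
@@ -1,157 +0,0 @@
-
-# Large Time Series Models
-
-## 1. Introduction
-
-Large time series models are foundation models designed specifically for time series analysis. The IoTDB team has long developed its own time series foundation model, Timer. It is based on the Transformer architecture, is pre-trained on massive multi-domain time series data, and supports downstream tasks such as forecasting, anomaly detection, and imputation. The team's AINode platform also supports integrating leading time series foundation models from the wider community, giving users a range of choices. Unlike traditional time series analysis techniques, these models have general feature-extraction capabilities and can serve a wide range of analysis tasks through zero-shot analysis, fine-tuning, and similar techniques.
-
-The technical results on large time series models covered here (both the team's own work and leading industry work) have been published at top international machine learning conferences; see the appendix for details.
-
-## 2. Application Scenarios
-
-* **Time series forecasting**: Provides forecasting of time series data for industrial production, the natural environment, and other domains, helping users anticipate future trends.
-* **Data imputation**: Fills missing segments of a time series from context to improve the continuity and completeness of the dataset.
-* **Anomaly detection**: Uses autoregressive analysis to monitor time series data in real time and warn of potential anomalies promptly.
-
-![](/img/LargeModel09.png)
-
-## 3. 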
Timer-1 模型 - -Timer[1] 模型(非内置模型)不仅展现了出色的少样本泛化和多任务适配能力,还通过预训练获得了丰富的知识库,赋予了它处理多样化下游任务的通用能力,拥有以下特点: - -* **泛化性**:模型能够通过使用少量样本进行微调,达到行业内领先的深度模型预测效果。 -* **通用性**:模型设计灵活,能够适配多种不同的任务需求,并且支持变化的输入和输出长度,使其在各种应用场景中都能发挥作用。 -* **可扩展性**:随着模型参数数量的增加或预训练数据规模的扩大,模型效果会持续提升,确保模型能够随着时间和数据量的增长而不断优化其预测效果。 - -![](/img/model01.png) - -## 4. Timer-XL 模型 - -Timer-XL[2]基于 Timer 进一步扩展升级了网络结构,在多个维度全面突破: - -* **超长上下文支持**:该模型突破了传统时序预测模型的限制,支持处理数千个 Token(相当于数万个时间点)的输入,有效解决了上下文长度瓶颈问题。 -* **多变量预测场景覆盖**:支持多种预测场景,包括非平稳时间序列的预测、涉及多个变量的预测任务以及包含协变量的预测,满足多样化的业务需求。 -* **大规模工业时序数据集:**采用万亿大规模工业物联网领域的时序数据集进行预训练,数据集兼有庞大的体量、卓越的质量和丰富的领域等重要特质,覆盖能源、航空航天、钢铁、交通等多领域。 - -![](/img/model02.png) - -## 5. Timer-Sundial 模型 - -Timer-Sundial[3]是一个专注于时间序列预测的生成式基础模型系列,其基础版本拥有 1.28 亿参数,并在 1 万亿个时间点上进行了大规模预训练,其核心特性包括: - -* **强大的泛化性能:**具备零样本预测能力,可同时支持点预测和概率预测。 -* **灵活预测分布分析:**不仅能预测均值或分位数,还可通过模型生成的原始样本评估预测分布的任意统计特性。 -* **创新生成架构:** 采用 “Transformer + TimeFlow” 协同架构——Transformer 学习时间片段的自回归表征,TimeFlow 模块基于流匹配框架 (Flow-Matching) 将随机噪声转化为多样化预测轨迹,实现高效的非确定性样本生成。 - -![](/img/model03.png) - -## 6. Chronos-2 模型 - -Chronos-2 [4]是由 Amazon Web Services (AWS) 研究团队开发的,基于 Chronos 离散词元建模范式发展起来的通用时间序列基础模型,该模型同时适用于零样本单变量预测和协变量预测。其主要特性包括: - -* **概率性预测能力**:模型以生成式方式输出多步预测结果,支持分位数或分布级预测,从而刻画未来不确定性。 -* **零样本通用预测**:依托预训练获得的上下文学习能力,可直接对未见过的数据集执行预测,无需重新训练或参数更新。 -* **多变量与协变量统一建模**:支持在同一架构下联合建模多条相关时间序列及其协变量,以提升复杂任务的预测效果。但对输入有严格要求: - * 未来协变量的名称组成的集合必须是历史协变量的名称组成的集合的子集; - * 每个历史协变量的长度必须等于目标变量的长度; - * 每个未来协变量的长度必须等于预测长度; -* **高效推理与部署**:模型采用紧凑的编码器式(encoder-only)结构,在保持强泛化能力的同时兼顾推理效率。 - -![](/img/timeseries-large-model-chronos2.png) - -## 7. 效果展示 - -时序大模型能够适应多种不同领域和场景的真实时序数据,在各种任务上拥有优异的处理效果,以下是在不同数据上的真实表现: - -**时序预测:** - -利用时序大模型的预测能力,能够准确预测时间序列的未来变化趋势,如下图蓝色曲线代表预测趋势,红色曲线为实际趋势,两曲线高度吻合。 - -![](/img/LargeModel03.png) - -**数据填补**: - -利用时序大模型对缺失数据段进行预测式填补。 - -![](/img/timeseries-large-model-data-imputation.png) - -**异常检测**: - -利用时序大模型精准识别与正常趋势偏离过大的异常值。 - -![](/img/LargeModel05.png) - -## 8. 部署使用 - -1. 
打开 IoTDB cli 控制台,检查 ConfigNode、DataNode、AINode 节点确保均为 Running。 - -```Plain -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| -+------+----------+-------+---------------+------------+--------------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| 2.0.5.1| 069354f| -| 1| DataNode|Running| 127.0.0.1| 10730| 2.0.5.1| 069354f| -| 2| AINode|Running| 127.0.0.1| 10810| 2.0.5.1|069354f-dev| -+------+----------+-------+---------------+------------+--------------+-----------+ -Total line number = 3 -It costs 0.140s -``` - -2. 联网环境下首次启动 AINode 节点会自动拉取 Timer-XL、Sundial、Chronos2 模型。 - - > 注意: - > - > * AINode 安装包不包含模型权重文件 - > * 自动拉取功能依赖部署环境具备 HuggingFace 网络访问能力 - > * AINode 支持手动上传模型权重文件,具体操作方法可参考[导入权重文件](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md#_3-3-导入内置权重文件) - -3. 检查模型是否可用。 - -```Bash -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### 附录 - -**[1]** Timer- Generative Pre-trained Transformers Are Large Time Series Models, Yong Liu, Haoran Zhang, Chenyu Li, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ 返回](#ref1) - -**[2]** TIMER-XL- LONG-CONTEXT TRANSFORMERS FOR UNIFIED TIME SERIES FORECASTING ,Yong Liu, Guo Qin, Xiangdong Huang, Jianmin Wang, Mingsheng Long. 
[↩ 返回](#ref2) - -**[3]** Sundial- A Family of Highly Capable Time Series Foundation Models, Yong Liu, Guo Qin, Zhiyuan Shi, Zhi Chen, Caiyin Yang, Xiangdong Huang, Jianmin Wang, Mingsheng Long, **ICML 2025 spotlight**. [↩ 返回](#ref3) - -**[4] **Chronos-2: From Univariate to Universal Forecasting, Abdul Fatir Ansari, Oleksandr Shchur, Jaris Küken, Andreas Auer, Boran Han, Pedro Mercado, Syama Sundar Rangapuram, Huibin Shen, Lorenzo Stella, Xiyuan Zhang, Mononito Goswami, Shubham Kapoor, Danielle C. Maddix, Pablo Guerron, Tony Hu, Junming Yin, Nick Erickson, Prateek Mutalik Desai, Hao Wang, Huzefa Rangwala, George Karypis, Yuyang Wang, Michael Bohlke-Schneider, **arXiv:2510.15821.**[↩ 返回](#ref4) diff --git a/src/zh/UserGuide/latest-Table/API/Programming-CSharp-Native-API_timecho.md b/src/zh/UserGuide/latest-Table/API/Programming-CSharp-Native-API_timecho.md deleted file mode 100644 index 0d1c44c6d..000000000 --- a/src/zh/UserGuide/latest-Table/API/Programming-CSharp-Native-API_timecho.md +++ /dev/null @@ -1,403 +0,0 @@ - -# C# 原生接口 - -## 1. 功能介绍 - -IoTDB具备C#原生客户端驱动和对应的连接池,提供对象化接口,可以直接组装时序对象进行写入,无需拼装 SQL。推荐使用连接池,多线程并行操作数据库。 - -## 2. 使用方式 - -**环境要求:** - -* .NET SDK >= 5.0 或 .NET Framework 4.x -* Thrift >= 0.14.1 -* NLog >= 4.7.9 - -**依赖安装:** - -支持使用 NuGet Package Manager, .NET CLI等工具来安装,以 .NET CLI为例 - -如果使用的是\\.NET 5.0 或者更高版本的SDK,输入如下命令即可安装最新的NuGet包 - -```Plain -dotnet add package Apache.IoTDB -``` -注意:请勿使用高版本客户端连接低版本服务。 - -## 3. 
读写操作 - -### 3.1 TableSessionPool - -#### 3.1.1 功能描述 - -TableSessionPool 定义了与IoTDB交互的基本操作,可以执行数据插入、查询操作以及关闭会话等,同时也是一个连接池这个池可以帮助我们高效地重用连接,并且在不需要时正确地清理资源, 该接口定义了如何从池中获取会话以及如何关闭池的基本操作。 - -#### 3.1.2 方法列表 - -以下是 TableSessionPool 中定义的方法及其详细说明: - -| 方法 | 描述 | 参数 | 返回值 | -| ---------------------------------------------------------------- | ---------------------------------------------------------------- |-------------------------------------------------------------------|--------------------| -| `Open(bool enableRpcCompression)` | 打开会话连接,自定义`enableRpcCompression` | `enableRpcCompression`:是否启用`RpcCompression`,此参数需配合 Server 端配置一同使用 | `Task` | -| `Open()` | 打开会话连接,不开启`RpcCompression` | 无 | `Task` | -| `InsertAsync(Tablet tablet)` | 将一个包含时间序列数据的Tablet 对象插入到数据库中 | tablet: 要插入的Tablet对象 | `Task` | -| `ExecuteNonQueryStatementAsync(string sql)` | 执行非查询SQL语句,如DDL(数据定义语言)或DML(数据操作语言)命令 | sql: 要执行的SQL语句。 | `Task` | -| `ExecuteQueryStatementAsync(string sql)` | 执行查询SQL语句,并返回包含查询结果的SessionDataSet对象 | sql: 要执行的SQL语句。 | `Task` | -| `ExecuteQueryStatementAsync(string sql, long timeoutInMs)` | 执行查询SQL语句,并设置查询超时时间(以毫秒为单位) | sql: 要执行的查询SQL语句。
timeoutInMs: 查询超时时间(毫秒) | `Task` | -| `Close()` | 关闭会话,释放所持有的资源 | 无 | `Task` | - -#### 3.1.3 接口展示 - -```C# -public async Task Open(bool enableRpcCompression, CancellationToken cancellationToken = default) - - public async Task Open(CancellationToken cancellationToken = default) - - public async Task InsertAsync(Tablet tablet) - - public async Task ExecuteNonQueryStatementAsync(string sql) - - public async Task ExecuteQueryStatementAsync(string sql) - - public async Task ExecuteQueryStatementAsync(string sql, long timeoutInMs) - - public async Task Close() -``` - -### 3.2 TableSessionPool.Builder - -#### 3.2.1 功能描述 - -TableSessionPool.Builder 是 TableSessionPool的构造器,用于配置和创建 TableSessionPool 的实例。允许开发者配置连接参数、会话参数和池化行为等。 - -#### 3.2.2 配置选项 - -以下是 TableSessionPool.Builder 类的可用配置选项及其默认值: - -| **配置项** | **描述** | **默认值** | -| ---------------------------------------------------- | ---------------------------------------------------------------- |---------------------| -| `SetHost(string host)` | 设置IoTDB的节点 host | `localhost` | -| `SetPort(int port)` | 设置IoTDB的节点端口 | `6667` | -| `SetNodeUrls(List nodeUrls)` | 设置IoTDB集群的节点URL列表,当设置此项时会忽略SetHost和SetPort | `无 ` | -| `SetUsername(string username)` | 设置连接的用户名 | `"root"` | -| `SetPassword(string password)` | 设置连接的密码 | `"TimechoDB@2021"` //V2.0.6.x 之前默认密码是root | -| `SetFetchSize(int fetchSize)` | 设置查询结果的获取大小 | `1024 ` | -| `SetZoneId(string zoneId)` | 设置时区相关的ZoneId | `UTC+08:00` | -| `SetPoolSize(int poolSize)` | 设置会话池的最大大小,即池中允许的最大会话数 | `8 ` | -| `SetEnableRpcCompression(bool enableRpcCompression)` | 是否启用RPC压缩 | `false` | -| `SetConnectionTimeoutInMs(int timeout)` | 设置连接超时时间(毫秒) | `500` | -| `SetDatabase(string database)` | 设置目标数据库名称 | ` "" ` | - -#### 3.2.3 接口展示 - -```c# -public Builder SetHost(string host) - { - _host = host; - return this; - } - - public Builder SetPort(int port) - { - _port = port; - return this; - } - - public Builder SetUsername(string username) - { - _username = username; - return this; - } - - public 
Builder SetPassword(string password) - { - _password = password; - return this; - } - - public Builder SetFetchSize(int fetchSize) - { - _fetchSize = fetchSize; - return this; - } - - public Builder SetZoneId(string zoneId) - { - _zoneId = zoneId; - return this; - } - - public Builder SetPoolSize(int poolSize) - { - _poolSize = poolSize; - return this; - } - - public Builder SetEnableRpcCompression(bool enableRpcCompression) - { - _enableRpcCompression = enableRpcCompression; - return this; - } - - public Builder SetConnectionTimeoutInMs(int timeout) - { - _connectionTimeoutInMs = timeout; - return this; - } - - public Builder SetNodeUrls(List nodeUrls) - { - _nodeUrls = nodeUrls; - return this; - } - - protected internal Builder SetSqlDialect(string sqlDialect) - { - _sqlDialect = sqlDialect; - return this; - } - - public Builder SetDatabase(string database) - { - _database = database; - return this; - } - - public Builder() - { - _host = "localhost"; - _port = 6667; - _username = "root"; - _password = "TimechoDB@2021"; //V2.0.6.x 之前默认密码是root - _fetchSize = 1024; - _zoneId = "UTC+08:00"; - _poolSize = 8; - _enableRpcCompression = false; - _connectionTimeoutInMs = 500; - _sqlDialect = IoTDBConstant.TABLE_SQL_DIALECT; - _database = ""; - } - - public TableSessionPool Build() - { - SessionPool sessionPool; - // if nodeUrls is not empty, use nodeUrls to create session pool - if (_nodeUrls.Count > 0) - { - sessionPool = new SessionPool(_nodeUrls, _username, _password, _fetchSize, _zoneId, _poolSize, _enableRpcCompression, _connectionTimeoutInMs, _sqlDialect, _database); - } - else - { - sessionPool = new SessionPool(_host, _port, _username, _password, _fetchSize, _zoneId, _poolSize, _enableRpcCompression, _connectionTimeoutInMs, _sqlDialect, _database); - } - return new TableSessionPool(sessionPool); - } -``` - -## 4. 
Sample Code - -Full example: [samples/Apache.IoTDB.Samples/TableSessionPoolTest.cs](https://github.com/apache/iotdb-client-csharp/blob/main/samples/Apache.IoTDB.Samples/TableSessionPoolTest.cs) - -```c# -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. 
- */ - -using System; -using System.Collections.Generic; -using System.Threading.Tasks; -using Apache.IoTDB.DataStructure; - -namespace Apache.IoTDB.Samples; - -public class TableSessionPoolTest -{ - private readonly SessionPoolTest sessionPoolTest; - - public TableSessionPoolTest(SessionPoolTest sessionPoolTest) - { - this.sessionPoolTest = sessionPoolTest; - } - - public async Task Test() - { - await TestCleanup(); - - await TestSelectAndInsert(); - await TestUseDatabase(); - // await TestCleanup(); - } - - - public async Task TestSelectAndInsert() - { - var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(sessionPoolTest.nodeUrls) - .SetUsername(sessionPoolTest.username) - .SetPassword(sessionPoolTest.password) - .SetFetchSize(1024) - .Build(); - - await tableSessionPool.Open(false); - - if (sessionPoolTest.debug) tableSessionPool.OpenDebugMode(); - - - await tableSessionPool.ExecuteNonQueryStatementAsync("CREATE DATABASE test1"); - await tableSessionPool.ExecuteNonQueryStatementAsync("CREATE DATABASE test2"); - - await tableSessionPool.ExecuteNonQueryStatementAsync("use test2"); - - // or use full qualified table name - await tableSessionPool.ExecuteNonQueryStatementAsync( - "create table test1.table1(region_id STRING TAG, plant_id STRING TAG, device_id STRING TAG, model STRING ATTRIBUTE, temperature FLOAT FIELD, humidity DOUBLE FIELD) with (TTL=3600000)"); - - await tableSessionPool.ExecuteNonQueryStatementAsync( - "create table table2(region_id STRING TAG, plant_id STRING TAG, color STRING ATTRIBUTE, temperature FLOAT FIELD, speed DOUBLE FIELD) with (TTL=6600000)"); - - // show tables from current database - var res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - // show tables by specifying another database - // using SHOW tables FROM - res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES FROM test1"); - 
res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - var tableName = "testTable1"; - List columnNames = - new List { - "region_id", - "plant_id", - "device_id", - "model", - "temperature", - "humidity" }; - List dataTypes = - new List{ - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.FLOAT, - TSDataType.DOUBLE}; - List columnCategories = - new List{ - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.ATTRIBUTE, - ColumnCategory.FIELD, - ColumnCategory.FIELD}; - var values = new List> { }; - var timestamps = new List { }; - for (long timestamp = 0; timestamp < 100; timestamp++) - { - timestamps.Add(timestamp); - values.Add(new List { "1", "5", "3", "A", 1.23F + timestamp, 111.1 + timestamp }); - } - var tablet = new Tablet(tableName, columnNames, columnCategories, dataTypes, values, timestamps); - - await tableSessionPool.InsertAsync(tablet); - - - res = await tableSessionPool.ExecuteQueryStatementAsync("select * from testTable1 " - + "where region_id = '1' and plant_id in ('3', '5') and device_id = '3'"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - await tableSessionPool.Close(); - } - - - public async Task TestUseDatabase() - { - var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(sessionPoolTest.nodeUrls) - .SetUsername(sessionPoolTest.username) - .SetPassword(sessionPoolTest.password) - .SetDatabase("test1") - .SetFetchSize(1024) - .Build(); - - await tableSessionPool.Open(false); - - if (sessionPoolTest.debug) tableSessionPool.OpenDebugMode(); - - - // show tables from current database - var res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - await tableSessionPool.ExecuteNonQueryStatementAsync("use test2"); - - // show tables from 
current database - res = await tableSessionPool.ExecuteQueryStatementAsync("SHOW TABLES"); - res.ShowTableNames(); - while (res.HasNext()) Console.WriteLine(res.Next()); - await res.Close(); - - await tableSessionPool.Close(); - } - - public async Task TestCleanup() - { - var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(sessionPoolTest.nodeUrls) - .SetUsername(sessionPoolTest.username) - .SetPassword(sessionPoolTest.password) - .SetFetchSize(1024) - .Build(); - - await tableSessionPool.Open(false); - - if (sessionPoolTest.debug) tableSessionPool.OpenDebugMode(); - - await tableSessionPool.ExecuteNonQueryStatementAsync("drop database test1"); - await tableSessionPool.ExecuteNonQueryStatementAsync("drop database test2"); - - await tableSessionPool.Close(); - } -} -``` diff --git a/src/zh/UserGuide/latest-Table/API/Programming-Cpp-Native-API_timecho.md b/src/zh/UserGuide/latest-Table/API/Programming-Cpp-Native-API_timecho.md deleted file mode 100644 index be499eba7..000000000 --- a/src/zh/UserGuide/latest-Table/API/Programming-Cpp-Native-API_timecho.md +++ /dev/null @@ -1,462 +0,0 @@ - - -# C++ Native API - -## 1. Dependencies - -- Java 8+ -- Flex -- Bison 2.7+ -- Boost 1.56+ -- OpenSSL 1.0+ -- GCC 5.5.0+ - - -## 2. Installation - -### 2.1 Install Dependencies - -- **MAC** - 1. Install Bison: - - Install bison with the following brew command: - ```shell - brew install bison - ``` - - 2. Install Boost: make sure the latest version of Boost is installed. - - ```shell - brew install boost - ``` - - 3. Check OpenSSL: make sure the openssl library is installed. The default openssl header path is "/usr/local/opt/openssl/include" - - If the build fails because openssl cannot be found, try adding `-Dopenssl.include.dir=""` - - -- **Ubuntu 16.04+ or other Debian-family distributions** - - Install the required dependencies with the following commands: - - ```shell - sudo apt-get update - sudo apt-get install gcc g++ bison flex libboost-all-dev libssl-dev - ``` - - -- **CentOS 7.7+/Fedora/Rocky Linux or other Red Hat-family distributions** - - Install the dependencies with yum: - - ```shell - sudo yum update - sudo yum install gcc gcc-c++ boost-devel bison flex openssl-devel - ``` - - -- **Windows** - -1. 
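Set up the build environment - - Install MS Visual Studio (2019+ recommended): during installation, select Visual Studio C/C++ IDE and compiler (supporting CMake, Clang, MinGW) - - Download and install [CMake](https://cmake.org/download/). - -2. Download and install Flex and Bison - - Download [Win_Flex_Bison](https://sourceforge.net/projects/winflexbison/) - - After downloading, rename the executables to flex.exe and bison.exe so that the build can find them, and add the directory containing them to the PATH environment variable - -3. Install the Boost library - - Download [Boost](https://www.boost.org/users/download/) - - Build Boost locally: run bootstrap.bat and then b2.exe - - Add the Boost installation directory to the PATH environment variable, for example `C:\Program Files (x86)\boost_1_78_0` - -4. Install OpenSSL - - Download and install [OpenSSL](http://slproweb.com/products/Win32OpenSSL.html) - - Add the include directory under the OpenSSL installation to the PATH environment variable - - -### 2.2 Build - -Clone the source code from git: -```shell -git clone https://github.com/apache/iotdb.git -``` - -The default branch is master. To use a specific release, switch to its branch (for example, version 2.0.6): -```shell -git checkout rc/2.0.6 -``` -Note: do not use a newer client version to connect to an older server version. - -Run the maven build from the IoTDB root directory: - -- Mac, or Linux with glibc version >= 2.32 - ```shell - ./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp - ``` - -- Linux with glibc version >= 2.31 - ```shell - ./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Diotdb-tools-thrift.version=0.14.1.1-old-glibc-SNAPSHOT - ``` - -- Linux with glibc version >= 2.17 - ```shell - ./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Diotdb-tools-thrift.version=0.14.1.1-glibc223-SNAPSHOT - ``` - -- Windows with Visual Studio 2022 - ```batch - .\mvnw.cmd clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp - ``` - -- Windows with Visual Studio 2019 - ```batch - .\mvnw.cmd clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Dcmake.generator="Visual Studio 16 2019" -Diotdb-tools-thrift.version=0.14.1.1-msvc142-SNAPSHOT - ``` - - If the Boost library path has not been added to the PATH environment variable, the build command also needs the corresponding parameters, for example: `-DboostIncludeDir="C:\Program Files (x86)\boost_1_78_0" -DboostLibraryDir="C:\Program Files (x86)\boost_1_78_0\stage\lib"` - -After a successful build, the packaged library files are located in `iotdb-client/client-cpp/target`, and the compiled example program can be found under `example/client-cpp-example/target`. 
- -### 2.3 Build Q&A - -Q: What are the environment requirements on Linux? - -A: -- The minimum known glibc version (x86_64) is 2.17, and the minimum GCC version is 5.5 -- The minimum known glibc version (ARM) is 2.31, and the minimum GCC version is 10.2 -- If these requirements are not met, you can try building Thrift locally yourself - - Download the code from https://github.com/apache/iotdb-bin-resources/tree/iotdb-tools-thrift-v0.14.1.0/iotdb-tools-thrift - - Run `./mvnw clean install` - - Return to the iotdb source directory and run `./mvnw clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp` - - -Q: How do I handle the Linux build error `undefined reference to '__libc_single_threaded'`? - -A: -- This error occurs because the default precompiled Thrift depends on a glibc version newer than the one installed -- Try adding `-Diotdb-tools-thrift.version=0.14.1.1-glibc223-SNAPSHOT` or `-Diotdb-tools-thrift.version=0.14.1.1-old-glibc-SNAPSHOT` to the maven build command - -Q: What should I do if I need to build on Windows with Visual Studio 2017 or earlier? - -A: -- You can try building Thrift locally first and then building the client - - Download the code from https://github.com/apache/iotdb-bin-resources/tree/iotdb-tools-thrift-v0.14.1.0/iotdb-tools-thrift - - Run `.\mvnw.cmd clean install` - - Return to the iotdb source directory and run `.\mvnw.cmd clean package -pl example/client-cpp-example -am -DskipTests -P with-cpp -Dcmake.generator="Visual Studio 15 2017"` - -Q: How do I fix garbled characters when building with Visual Studio on Windows? - -A: -- The project as a whole uses utf-8 encoding, while some Windows system files used during the build are not utf-8 encoded (the system encoding follows the region by default; in China it is GBK) -- You can change the system locale in Control Panel: open Control Panel -> Clock and Region -> Region, switch to the Administrative tab, choose Change system locale under "Language for non-Unicode programs", check `Beta: Use Unicode UTF-8 for worldwide language support`, and restart the computer. (Details may vary between Windows versions; detailed tutorials are available online.) -- Note that after changing the Windows system encoding, other programs that use GBK encoding may show garbled characters. Reverting the system locale restores them, so weigh this before making the change. - -## 3. Usage - -### 3.1 TableSession Class - -All operations of the C++ client go through the TableSession class. The methods defined in the TableSession interface are described below. - -#### 3.1.1 Method List - -1. `insert(Tablet& tablet, bool sorted = false)`: inserts a Tablet object containing time series data into the database. The sorted parameter indicates whether the rows in the tablet are already ordered by time. -2. `executeNonQueryStatement(string& sql)`: executes a non-query SQL statement, such as a DDL (data definition language) or DML (data manipulation language) command. -3. 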
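`executeQueryStatement(string& sql)`: executes a query SQL statement and returns a SessionDataSet object containing the query results. An optional timeoutInMs parameter sets the query timeout. - * Note: when calling `SessionDataSet::next()` to fetch a result row, the returned `std::shared_ptr` object must be stored in a local-scope variable (for example: `auto row = dataSet->next();`) so that the data stays alive. Accessing it directly through `.get()` or a raw pointer (such as `dataSet->next().get()`) drops the smart pointer's reference count to zero, the data is released immediately, and any later access is undefined behavior. -4. `open(bool enableRPCCompression = false)`: opens the connection and decides whether RPC compression is enabled (the client setting must match the server; disabled by default). -5. `close()`: closes the connection. - -#### 3.1.2 Interface Definition - -```cpp -class TableSession { -private: - Session* session; -public: - TableSession(Session* session) { - this->session = session; - } - void insert(Tablet& tablet, bool sorted = false); - void executeNonQueryStatement(const std::string& sql); - unique_ptr executeQueryStatement(const std::string& sql); - unique_ptr executeQueryStatement(const std::string& sql, int64_t timeoutInMs); - string getDatabase(); // Returns the currently selected database; executeNonQueryStatement can be used instead - void open(bool enableRPCCompression = false); - void close(); -}; -``` - -### 3.2 TableSessionBuilder Class - -TableSessionBuilder is a builder used to configure and create TableSession instances. It makes it convenient to set connection parameters, query parameters, and so on when creating an instance. - -#### 3.2.1 Usage Example - -```cpp -// Set the IP, port, username, and password of the connection -// The setters can be called in any order as long as build() is called last; the created instance has already performed open() by default -session = (new TableSessionBuilder()) - ->host("127.0.0.1") - ->rpcPort(6667) - ->username("root") - ->password("TimechoDB@2021") // Before V2.0.6.x the default password was root - ->build(); -``` - -#### 3.2.2 Configurable Parameters - -| **Parameter** | **Description** | **Default** | -| :---: | :---: |:-------------------------:| -| host | Sets the IP of the node to connect to | "127.0.0.1" ("localhost") | -| rpcPort | Sets the port of the node to connect to | 6667 | -| username | Sets the username for the connection | "root" | -| password | Sets the password for the connection | "TimechoDB@2021" (before V2.0.6.x the default password was root) | -| zoneId | Sets the time-zone ZoneId | "" | -| fetchSize | Sets the fetch size for query results | 10000 | -| database | Sets the target database name | "" | - - -## 4. 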
Sample Code - -Source code of the example project: - -- `example/client-cpp-example/src/TableModelSessionExample.cpp` : [TableModelSessionExample](https://github.com/apache/iotdb/blob/master/example/client-cpp-example/src/TableModelSessionExample.cpp) - -After a successful build, the example project is located in `example/client-cpp-example/target` - -```cpp -#include "TableSession.h" -#include "TableSessionBuilder.h" - -using namespace std; - -shared_ptr session; - -void insertRelationalTablet() { - - vector> schemaList { - make_pair("region_id", TSDataType::TEXT), - make_pair("plant_id", TSDataType::TEXT), - make_pair("device_id", TSDataType::TEXT), - make_pair("model", TSDataType::TEXT), - make_pair("temperature", TSDataType::FLOAT), - make_pair("humidity", TSDataType::DOUBLE) - }; - - vector columnTypes = { - ColumnCategory::TAG, - ColumnCategory::TAG, - ColumnCategory::TAG, - ColumnCategory::ATTRIBUTE, - ColumnCategory::FIELD, - ColumnCategory::FIELD - }; - - Tablet tablet("table1", schemaList, columnTypes, 100); - - for (int row = 0; row < 100; row++) { - int rowIndex = tablet.rowSize++; - tablet.timestamps[rowIndex] = row; - - // The index-based API is more efficient than looking columns up by name - // Recommended: tablet.addValue(0, rowIndex, "1"); - // Avoid: tablet.addValue("region_id", rowIndex, "1"); - tablet.addValue(0, rowIndex, "1"); // region_id - tablet.addValue(1, rowIndex, "5"); // plant_id - tablet.addValue(2, rowIndex, "3"); // device_id - tablet.addValue(3, rowIndex, "A"); // model - tablet.addValue(4, rowIndex, 37.6F); // temperature - tablet.addValue(5, rowIndex, 111.1); // humidity - if (tablet.rowSize == tablet.maxRowNumber) { - session->insert(tablet); - tablet.reset(); - } - } - - if (tablet.rowSize != 0) { - session->insert(tablet); - tablet.reset(); - } -} - -void Output(unique_ptr &dataSet) { - for (const string &name: dataSet->getColumnNames()) { - cout << name << " "; - } - cout << endl; - while (dataSet->hasNext()) { - cout << dataSet->next()->toString(); - } - cout << endl; -} - -void OutputWithType(unique_ptr &dataSet) { - for (const string &name: 
dataSet->getColumnNames()) { - cout << name << " "; - } - cout << endl; - for (const string &type: dataSet->getColumnTypeList()) { - cout << type << " "; - } - cout << endl; - while (dataSet->hasNext()) { - cout << dataSet->next()->toString(); - } - cout << endl; -} - -int main() { - try { - session = (new TableSessionBuilder()) - ->host("127.0.0.1") - ->rpcPort(6667) - ->username("root") - ->password("root") - ->build(); - - cout << "[Create Database db1,db2]\n" << endl; - try { - session->executeNonQueryStatement("CREATE DATABASE IF NOT EXISTS db1"); - session->executeNonQueryStatement("CREATE DATABASE IF NOT EXISTS db2"); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Use db1 as database]\n" << endl; - try { - session->executeNonQueryStatement("USE db1"); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Create Table table1,table2]\n" << endl; - try { - session->executeNonQueryStatement("create table db1.table1(region_id STRING TAG, plant_id STRING TAG, device_id STRING TAG, model STRING ATTRIBUTE, temperature FLOAT FIELD, humidity DOUBLE FIELD) with (TTL=3600000)"); - session->executeNonQueryStatement("create table db2.table2(region_id STRING TAG, plant_id STRING TAG, color STRING ATTRIBUTE, temperature FLOAT FIELD, speed DOUBLE FIELD) with (TTL=6600000)"); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Show Tables]\n" << endl; - try { - unique_ptr dataSet = session->executeQueryStatement("SHOW TABLES"); - Output(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Show tables from specific database]\n" << endl; - try { - unique_ptr dataSet = session->executeQueryStatement("SHOW TABLES FROM db1"); - Output(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[InsertTablet]\n" << endl; - try { - insertRelationalTablet(); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Query 
Table Data]\n" << endl; - try { - unique_ptr dataSet = session->executeQueryStatement("SELECT * FROM table1" - " where region_id = '1' and plant_id in ('3', '5') and device_id = '3'"); - OutputWithType(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - session->close(); - - // specify database in constructor - session = (new TableSessionBuilder()) - ->host("127.0.0.1") - ->rpcPort(6667) - ->username("root") - ->password("root") - ->database("db1") - ->build(); - - cout << "[Show tables from current database(db1)]\n" << endl; - try { - unique_ptr dataSet = session->executeQueryStatement("SHOW TABLES"); - Output(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Change database to db2]\n" << endl; - try { - session->executeNonQueryStatement("USE db2"); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Show tables from current database(db2)]\n" << endl; - try { - unique_ptr dataSet = session->executeQueryStatement("SHOW TABLES"); - Output(dataSet); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "[Drop Database db1,db2]\n" << endl; - try { - session->executeNonQueryStatement("DROP DATABASE db1"); - session->executeNonQueryStatement("DROP DATABASE db2"); - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - - cout << "session close\n" << endl; - session->close(); - - cout << "finished!\n" << endl; - } catch (IoTDBConnectionException &e) { - cout << e.what() << endl; - } catch (IoTDBException &e) { - cout << e.what() << endl; - } - return 0; -} -``` - -## 5. FAQ - -### 5.1 Thrift Build Issues - -1. MAC: if the problem described at the following link occurs when building Thrift locally with Maven, try downgrading xcode-commandline from version 12 to 11.5 - https://stackoverflow.com/questions/63592445/ld-unsupported-tapi-file-type-tapi-tbd-in-yaml-file/65518087#65518087 - - -2. 
Windows: building Thrift with Maven uses wget to download remote files, and the following error may occur: - ``` - Failed to delete cached file C:\Users\Administrator\.m2\repository\.cache\download-maven-plugin\index.ser - ``` - - Solutions: - - Try deleting the ".m2\repository\\.cache\" directory and retry. - - In the download-maven-plugin section of the corresponding pom file, add "\true\" diff --git a/src/zh/UserGuide/latest-Table/API/Programming-Go-Native-API_timecho.md b/src/zh/UserGuide/latest-Table/API/Programming-Go-Native-API_timecho.md deleted file mode 100644 index ad109f414..000000000 --- a/src/zh/UserGuide/latest-Table/API/Programming-Go-Native-API_timecho.md +++ /dev/null @@ -1,598 +0,0 @@ - - -# Go Native API - -## 1. Usage -### 1.1 Dependencies - -* golang >= 1.13 -* make >= 3.0 -* curl >= 7.1.1 -* thrift 0.15.0 -* Linux, macOS, or another Unix-like system -* Windows+bash (downloading the IoTDB Go client requires git; WSL, cygwin, or Git Bash all work) - -### 1.2 Installation - -* Using go mod - -```sh -# Switch to the HOME path of GOPATH and enable Go Modules -export GO111MODULE=on - -# Configure the GOPROXY environment variable -export GOPROXY=https://goproxy.io - -# Create a named folder or directory and switch into it -mkdir session_example && cd session_example - -# Save the file, following redirects to the new address -curl -o session_example.go -L https://github.com/apache/iotdb-client-go/raw/main/example/session_example.go - -# Initialize the go module environment -go mod init session_example - -# Download dependencies -go mod tidy - -# Compile and run the program -go run session_example.go -``` - -* Using GOPATH - -```sh -# get thrift 0.13.0 -go get github.com/apache/thrift@0.13.0 - -# Create the directory recursively -mkdir -p $GOPATH/src/iotdb-client-go-example/session_example - -# Switch into that directory -cd $GOPATH/src/iotdb-client-go-example/session_example - -# Save the file, following redirects to the new address -curl -o session_example.go -L https://github.com/apache/iotdb-client-go/raw/main/example/session_example.go - -# Initialize the go module environment -go mod init - -# Download dependencies -go mod tidy - -# Compile and run the program -go run session_example.go -``` -* Note: do not use a newer client version to connect to an older server version. - -## 2. 
ITableSession Interface -### 2.1 Description - -The ITableSession interface defines the basic operations for interacting with IoTDB, such as inserting data, running queries, and closing the session. It is not thread-safe. - -### 2.2 Method List - -The methods defined in the ITableSession interface and their descriptions: - -| **Method** | **Description** | **Parameters** | **Return Value** | **Error** | -| -------------------------------------------------------------- | ---------------------------------------------------------------------- | --------------------------------------------------------------------------- | ---------------------------------------------------- | ----------------------------------------------- | -| `Insert(tablet *Tablet)` | Inserts a Tablet containing time series data into the database | `tablet`: the `Tablet` to insert | `TSStatus`: the status of the execution result, defined in the common package. | `error`: an error raised during the operation (such as a connection failure). | -| `ExecuteNonQueryStatement(sql string)`| Executes a non-query SQL statement, such as a DDL (data definition language) or DML (data manipulation language) command | `sql`: the SQL statement to execute. | Same as above | Same as above | -| `ExecuteQueryStatement(sql string, timeoutInMs *int64)` | Executes a query SQL statement and returns the result set | `sql`: the query SQL statement to execute. `timeoutInMs`: query timeout in milliseconds | `SessionDataSet`: the query result data set. | `error`: an error raised during the operation (such as a connection failure). | -| `Close()` | Closes the session and releases the resources it holds | None | None | `error`: an error raised while closing the connection | - -### 2.3 Interface Definition -1. ITableSession - -```go -// ITableSession defines an interface for interacting with IoTDB tables. -// It supports operations such as data insertion, executing queries, and closing the session. -// Implementations of this interface are expected to manage connections and ensure -// proper resource cleanup. -// -// Each method may return an error to indicate issues such as connection errors -// or execution failures. -// -// Since this interface includes a Close method, it is recommended to use -// defer to ensure the session is properly closed. -type ITableSession interface { - - // Insert inserts a Tablet into the database. - // - // Parameters: - // - tablet: A pointer to a Tablet containing time-series data to be inserted. - // - // Returns: - // - r: A pointer to TSStatus indicating the execution result. - // - err: An error if an issue occurs during the operation, such as a connection error or execution failure. 
- Insert(tablet *Tablet) (r *common.TSStatus, err error) - - // ExecuteNonQueryStatement executes a non-query SQL statement, such as a DDL or DML command. - // - // Parameters: - // - sql: The SQL statement to execute. - // - // Returns: - // - r: A pointer to TSStatus indicating the execution result. - // - err: An error if an issue occurs during the operation, such as a connection error or execution failure. - ExecuteNonQueryStatement(sql string) (r *common.TSStatus, err error) - - // ExecuteQueryStatement executes a query SQL statement and returns the result set. - // - // Parameters: - // - sql: The SQL query statement to execute. - // - timeoutInMs: A pointer to the timeout duration in milliseconds for the query execution. - // - // Returns: - // - result: A pointer to SessionDataSet containing the query results. - // - err: An error if an issue occurs during the operation, such as a connection error or execution failure. - ExecuteQueryStatement(sql string, timeoutInMs *int64) (*SessionDataSet, error) - - // Close closes the session, releasing any held resources. - // - // Returns: - // - err: An error if there is an issue with closing the IoTDB connection. - Close() (err error) -} -``` - -2. 构造 TableSession - -* Config 中不需要手动设置 sqlDialect,使用时只需要使用对应的 NewSession 函数 - -```Go -type Config struct { - Host string - Port string - UserName string - Password string - FetchSize int32 - TimeZone string - ConnectRetryMax int - sqlDialect string - Version Version - Database string -} - -type ClusterConfig struct { - NodeUrls []string //ip:port - UserName string - Password string - FetchSize int32 - TimeZone string - ConnectRetryMax int - sqlDialect string - Database string -} - -// NewTableSession creates a new TableSession instance using the provided configuration. -// -// Parameters: -// - config: The configuration for the session. -// - enableRPCCompression: A boolean indicating whether RPC compression is enabled. 
-// - connectionTimeoutInMs: The timeout in milliseconds for establishing a connection. -// -// Returns: -// - An ITableSession instance if the session is successfully created. -// - An error if there is an issue during session initialization. -func NewTableSession(config *Config, enableRPCCompression bool, connectionTimeoutInMs int) (ITableSession, error) - -// NewClusterTableSession creates a new TableSession instance for a cluster setup. -// -// Parameters: -// - clusterConfig: The configuration for the cluster session. -// - enableRPCCompression: A boolean indicating whether RPC compression is enabled. -// -// Returns: -// - An ITableSession instance if the session is successfully created. -// - An error if there is an issue during session initialization. -func NewClusterTableSession(clusterConfig *ClusterConfig, enableRPCCompression bool) (ITableSession, error) -``` - -> Note: -> -> A TableSession obtained through *NewTableSession or NewClusterTableSession* already has its connection established, so no additional open operation is needed. - -### 2.4 Sample Code - -```Go -package main - -import ( - "flag" - "log" - "math/rand" - "strconv" - "time" - - "github.com/apache/iotdb-client-go/v2/client" -) - -func main() { - flag.Parse() - config := &client.Config{ - Host: "127.0.0.1", - Port: "6667", - UserName: "root", - Password: "root", - Database: "test_session", - } - session, err := client.NewTableSession(config, false, 0) - if err != nil { - log.Fatal(err) - } - defer session.Close() - - checkError(session.ExecuteNonQueryStatement("create database test_db")) - checkError(session.ExecuteNonQueryStatement("use test_db")) - checkError(session.ExecuteNonQueryStatement("create table t1 (tag1 string tag, tag2 string tag, s1 text field, s2 text field)")) - insertRelationalTablet(session) - showTables(session) - query(session) -} - -func getTextValueFromDataSet(dataSet *client.SessionDataSet, columnName string) string { - if isNull, err := dataSet.IsNull(columnName); err != nil { - log.Fatal(err) - } else if isNull { - return "null" - } - v, err := 
dataSet.GetString(columnName) - if err != nil { - log.Fatal(err) - } - return v -} - -func insertRelationalTablet(session client.ITableSession) { - tablet, err := client.NewRelationalTablet("t1", []*client.MeasurementSchema{ - { - Measurement: "tag1", - DataType: client.STRING, - }, - { - Measurement: "tag2", - DataType: client.STRING, - }, - { - Measurement: "s1", - DataType: client.TEXT, - }, - { - Measurement: "s2", - DataType: client.TEXT, - }, - }, []client.ColumnCategory{client.TAG, client.TAG, client.FIELD, client.FIELD}, 1024) - if err != nil { - log.Fatal("Failed to create relational tablet: ", err) - } - ts := time.Now().UTC().UnixNano() / 1000000 - for row := 0; row < 16; row++ { - ts++ - tablet.SetTimestamp(ts, row) - tablet.SetValueAt("tag1_value_"+strconv.Itoa(row), 0, row) - tablet.SetValueAt("tag2_value_"+strconv.Itoa(row), 1, row) - tablet.SetValueAt("s1_value_"+strconv.Itoa(row), 2, row) - tablet.SetValueAt("s2_value_"+strconv.Itoa(row), 3, row) - tablet.RowSize++ - } - checkError(session.Insert(tablet)) - - tablet.Reset() - - for row := 0; row < 16; row++ { - ts++ - tablet.SetTimestamp(ts, row) - tablet.SetValueAt("tag1_value_1", 0, row) - tablet.SetValueAt("tag2_value_1", 1, row) - tablet.SetValueAt("s1_value_"+strconv.Itoa(row), 2, row) - tablet.SetValueAt("s2_value_"+strconv.Itoa(row), 3, row) - - nullValueColumn := rand.Intn(4) - tablet.SetValueAt(nil, nullValueColumn, row) - tablet.RowSize++ - } - checkError(session.Insert(tablet)) -} - -func showTables(session client.ITableSession) { - timeout := int64(2000) - dataSet, err := session.ExecuteQueryStatement("show tables", &timeout) - if err != nil { - log.Fatal(err) - } - // defer Close only after the error check, once dataSet is known to be valid - defer dataSet.Close() - for { - hasNext, err := dataSet.Next() - if err != nil { - log.Fatal(err) - } - if !hasNext { - break - } - value, err := dataSet.GetString("TableName") - if err != nil { - log.Fatal(err) - } - log.Printf("tableName is %v", value) - } -} - -func query(session client.ITableSession) { - timeout := 
int64(2000) - dataSet, err := session.ExecuteQueryStatement("select * from t1", &timeout) - if err != nil { - log.Fatal(err) - } - // defer Close only after the error check, once dataSet is known to be valid - defer dataSet.Close() - for { - hasNext, err := dataSet.Next() - if err != nil { - log.Fatal(err) - } - if !hasNext { - break - } - log.Printf("%v %v %v %v", getTextValueFromDataSet(dataSet, "tag1"), getTextValueFromDataSet(dataSet, "tag2"), getTextValueFromDataSet(dataSet, "s1"), getTextValueFromDataSet(dataSet, "s2")) - } -} - -func checkError(err error) { - if err != nil { - log.Fatal(err) - } -} -``` - -## 3. TableSessionPool Interface -### 3.1 Description - -TableSessionPool is a pool that manages ITableSession instances. The pool reuses connections efficiently and cleans up resources properly when they are no longer needed. The interface defines the basic operations for acquiring a session from the pool and for closing the pool. - -### 3.2 Method List - -| **Method** | **Description** | **Return Value** | **Error** | -| -------------------- | -------------------------------------------------------------------- | --------------------------- | --------------------------------- | -| `GetSession()` | Acquires an ITableSession instance from the pool for interacting with IoTDB. | An `ITableSession` instance | `error`: the reason the session could not be acquired | -| `Close()` | Closes the session pool and releases any held resources. After closing, no new sessions can be acquired from the pool. | None | None | - -### 3.3 Interface Definition -1. TableSessionPool - -```Go -// TableSessionPool manages a pool of ITableSession instances, enabling efficient -// reuse and management of resources. It provides methods to acquire a session -// from the pool and to close the pool, releasing all held resources. -// -// This implementation ensures proper lifecycle management of sessions, -// including efficient reuse and cleanup of resources. - -// GetSession acquires an ITableSession instance from the pool. -// -// Returns: -// - A usable ITableSession instance for interacting with IoTDB. -// - An error if a session cannot be acquired. -func (spool *TableSessionPool) GetSession() (ITableSession, error) { - return spool.sessionPool.getTableSession() -} - -// Close closes the TableSessionPool, releasing all held resources. -// Once closed, no further sessions can be acquired from the pool. -func (spool *TableSessionPool) Close() -``` - -2. 
Constructing a TableSessionPool - -```Go -type PoolConfig struct { - Host string - Port string - NodeUrls []string - UserName string - Password string - FetchSize int32 - TimeZone string - ConnectRetryMax int - Database string - sqlDialect string -} - -// NewTableSessionPool creates a new TableSessionPool with the specified configuration. -// -// Parameters: -// - conf: PoolConfig defining the configuration for the pool. -// - maxSize: The maximum number of sessions the pool can hold. -// - connectionTimeoutInMs: Timeout for establishing a connection in milliseconds. -// - waitToGetSessionTimeoutInMs: Timeout for waiting to acquire a session in milliseconds. -// - enableCompression: A boolean indicating whether to enable compression. -// -// Returns: -// - A TableSessionPool instance. -func NewTableSessionPool(conf *PoolConfig, maxSize, connectionTimeoutInMs, waitToGetSessionTimeoutInMs int, - enableCompression bool) TableSessionPool -``` - -> Note: -> -> * For a TableSession obtained from a TableSessionPool, if a Database was specified when the TableSessionPool was created, there is no need to specify the Database again when using the session. -> * If another database is selected with use database during use, the session is automatically restored to the database of the TableSessionPool when it is closed and returned to the pool. - -### 3.4 Sample Code - -```go -package main - -import ( - "log" - "strconv" - "sync" - "sync/atomic" - "time" - - "github.com/apache/iotdb-client-go/v2/client" -) - -func main() { - sessionPoolWithSpecificDatabaseExample() - sessionPoolWithoutSpecificDatabaseExample() - putBackToSessionPoolExample() -} - -func putBackToSessionPoolExample() { - // should create database test_db before executing - config := &client.PoolConfig{ - Host: "127.0.0.1", - Port: "6667", - UserName: "root", - Password: "root", - Database: "test_db", - } - sessionPool := client.NewTableSessionPool(config, 3, 60000, 4000, false) - defer sessionPool.Close() - - num := 4 - successGetSessionNum := int32(0) - var wg sync.WaitGroup - wg.Add(num) - for i := 0; i < num; i++ { - dbName := "db" + strconv.Itoa(i) - go func() { - defer wg.Done() - session, err := 
sessionPool.GetSession() - if err != nil { - log.Println("Failed to create database "+dbName+" because ", err) - return - } - atomic.AddInt32(&successGetSessionNum, 1) - defer func() { - time.Sleep(6 * time.Second) - // put back to session pool - session.Close() - }() - checkError(session.ExecuteNonQueryStatement("create database " + dbName)) - checkError(session.ExecuteNonQueryStatement("use " + dbName)) - checkError(session.ExecuteNonQueryStatement("create table table_of_" + dbName + " (tag1 string tag, tag2 string tag, s1 text field, s2 text field)")) - }() - } - wg.Wait() - log.Println("success num is", successGetSessionNum) - - log.Println("All sessions' databases have been reset.") - // the database in use is automatically reset to the session pool's database after the session is closed - wg.Add(5) - for i := 0; i < 5; i++ { - go func() { - defer wg.Done() - session, err := sessionPool.GetSession() - if err != nil { - log.Println("Failed to get session because ", err) - // without a session there is nothing to close or query - return - } - defer session.Close() - timeout := int64(3000) - dataSet, err := session.ExecuteQueryStatement("show tables", &timeout) - if err != nil { - log.Fatal(err) - } - for { - hasNext, err := dataSet.Next() - if err != nil { - log.Fatal(err) - } - if !hasNext { - break - } - value, err := dataSet.GetString("TableName") - if err != nil { - log.Fatal(err) - } - log.Println("table is", value) - } - dataSet.Close() - }() - } - wg.Wait() -} - -func sessionPoolWithSpecificDatabaseExample() { - // should create database test_db before executing - config := &client.PoolConfig{ - Host: "127.0.0.1", - Port: "6667", - UserName: "root", - Password: "root", - Database: "test_db", - } - sessionPool := client.NewTableSessionPool(config, 3, 60000, 8000, false) - defer sessionPool.Close() - num := 10 - var wg sync.WaitGroup - wg.Add(num) - for i := 0; i < num; i++ { - tableName := "t" + strconv.Itoa(i) - go func() { - defer wg.Done() - session, err := sessionPool.GetSession() - if err != nil { - log.Println("Failed to create table "+tableName+" because ", err) 
- return - } - defer session.Close() - checkError(session.ExecuteNonQueryStatement("create table " + tableName + " (tag1 string tag, tag2 string tag, s1 text field, s2 text field)")) - }() - } - wg.Wait() -} - -func sessionPoolWithoutSpecificDatabaseExample() { - config := &client.PoolConfig{ - Host: "127.0.0.1", - Port: "6667", - UserName: "root", - Password: "root", - } - sessionPool := client.NewTableSessionPool(config, 3, 60000, 8000, false) - defer sessionPool.Close() - num := 10 - var wg sync.WaitGroup - wg.Add(num) - for i := 0; i < num; i++ { - dbName := "db" + strconv.Itoa(i) - go func() { - defer wg.Done() - session, err := sessionPool.GetSession() - if err != nil { - log.Println("Failed to create database ", dbName, err) - return - } - defer session.Close() - checkError(session.ExecuteNonQueryStatement("create database " + dbName)) - checkError(session.ExecuteNonQueryStatement("use " + dbName)) - checkError(session.ExecuteNonQueryStatement("create table t1 (tag1 string tag, tag2 string tag, s1 text field, s2 text field)")) - }() - } - wg.Wait() -} - -func checkError(err error) { - if err != nil { - log.Fatal(err) - } -} -``` - diff --git a/src/zh/UserGuide/latest-Table/API/Programming-JDBC_timecho.md b/src/zh/UserGuide/latest-Table/API/Programming-JDBC_timecho.md deleted file mode 100644 index 9484025d3..000000000 --- a/src/zh/UserGuide/latest-Table/API/Programming-JDBC_timecho.md +++ /dev/null @@ -1,192 +0,0 @@ - - -# JDBC - -## 1. Overview - -The IoTDB JDBC interface provides a standard way to interact with the IoTDB database, allowing users to execute SQL statements from Java programs to manage databases and time series data. It supports connecting to the database, create, query, update, and delete operations, and batch insertion and querying of time series data. - -**Note**: The current JDBC implementation is intended only for connecting with third-party tools. JDBC cannot provide high-performance writes when executing insert statements. - -For Java applications, we recommend the Java native API. - -## 2. Usage - -**Environment requirements:** - -- JDK >= 1.8 -- Maven >= 3.6 - -**Add the dependency in Maven:** - -```XML -<dependencies> - <dependency> - <groupId>com.timecho.iotdb</groupId> - <artifactId>iotdb-jdbc</artifactId> - <version>2.0.1.1</version> - </dependency> -</dependencies> -``` -Note: Do not use a newer client version to connect to an older server version. - -## 3. 
Read and Write Operations - -### 3.1 Description - -- **Write operations**: Use the execute method to run inserts, create databases, create time series, and similar operations. -- **Read operations**: Use the executeQuery method to run queries, and use the ResultSet object to retrieve the results. - -### 3.2 **Method List** - -| **Method** | **Description** | **Parameters** | **Return Value** | -| ------------------------------------------------------------ | ---------------------------------- | ---------------------------------------------------------- | ----------------------------------- | -| Class.forName(String driver) | Loads the JDBC driver class | driver: name of the JDBC driver class | Class: the loaded class object | -| DriverManager.getConnection(String url, String username, String password) | Establishes a database connection | url: database URL; username: database username; password: database password | Connection: database connection object | -| Connection.createStatement() | Creates a Statement object for executing SQL statements | None | Statement: SQL statement execution object | -| Statement.execute(String sql) | Executes an SQL statement, typically a non-query statement | sql: the SQL statement to execute | boolean: indicates whether a ResultSet object is returned | -| Statement.executeQuery(String sql) | Executes a query SQL statement and returns the result set | sql: the query SQL statement to execute | ResultSet: query result set | -| ResultSet.getMetaData() | Gets the metadata of the result set | None | ResultSetMetaData: result set metadata object | -| ResultSet.next() | Moves to the next row of the result set | None | boolean: whether the move to the next row succeeded | -| ResultSet.getString(int columnIndex) | Gets the string value of the specified column | columnIndex: column index (starting from 1) | String: string value of the column | - -## 4. Sample Code - -**Note: To use the table model, the sql_dialect parameter must be set to table in the url.** - -```Java -String url = "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table"; -``` - -JDBC sample code: [src/main/java/org/apache/iotdb/TableModelJDBCExample.java](https://github.com/apache/iotdb/blob/rc/2.0.1/example/jdbc/src/main/java/org/apache/iotdb/TableModelJDBCExample.java) - - -```Java -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. 
You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. - */ - -package org.apache.iotdb; - -import org.apache.iotdb.jdbc.IoTDBSQLException; - -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; - -import java.sql.Connection; -import java.sql.DriverManager; -import java.sql.ResultSet; -import java.sql.ResultSetMetaData; -import java.sql.SQLException; -import java.sql.Statement; - -public class TableModelJDBCExample { - - private static final Logger LOGGER = LoggerFactory.getLogger(TableModelJDBCExample.class); - - public static void main(String[] args) throws ClassNotFoundException, SQLException { - Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); - - // don't specify database in url - try (Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table", "root", "TimechoDB@2021"); // before V2.0.6.x the default password was root - Statement statement = connection.createStatement()) { - - statement.execute("CREATE DATABASE test1"); - statement.execute("CREATE DATABASE test2"); - - statement.execute("use test2"); - - // or use a fully qualified table name - statement.execute( - "create table test1.table1(region_id STRING TAG, plant_id STRING TAG, device_id STRING TAG, model STRING ATTRIBUTE, temperature FLOAT FIELD, humidity DOUBLE FIELD) with (TTL=3600000)"); - - statement.execute( - "create table table2(region_id STRING TAG, plant_id STRING TAG, color STRING ATTRIBUTE, temperature FLOAT FIELD, speed DOUBLE FIELD) with (TTL=6600000)"); - - // show tables from current database - try (ResultSet resultSet = statement.executeQuery("SHOW TABLES")) { - ResultSetMetaData metaData = 
resultSet.getMetaData(); - System.out.println(metaData.getColumnCount()); - while (resultSet.next()) { - System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2)); - } - } - - // show tables by specifying another database - // using SHOW tables FROM - try (ResultSet resultSet = statement.executeQuery("SHOW TABLES FROM test1")) { - ResultSetMetaData metaData = resultSet.getMetaData(); - System.out.println(metaData.getColumnCount()); - while (resultSet.next()) { - System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2)); - } - } - - } catch (IoTDBSQLException e) { - LOGGER.error("IoTDB Jdbc example error", e); - } - - // specify database in url - try (Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667/test1?sql_dialect=table", "root", "TimechoDB@2021"); // before V2.0.6.x the default password was root - Statement statement = connection.createStatement()) { - // show tables from current database test1 - try (ResultSet resultSet = statement.executeQuery("SHOW TABLES")) { - ResultSetMetaData metaData = resultSet.getMetaData(); - System.out.println(metaData.getColumnCount()); - while (resultSet.next()) { - System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2)); - } - } - - // change database to test2 - statement.execute("use test2"); - - try (ResultSet resultSet = statement.executeQuery("SHOW TABLES")) { - ResultSetMetaData metaData = resultSet.getMetaData(); - System.out.println(metaData.getColumnCount()); - while (resultSet.next()) { - System.out.println(resultSet.getString(1) + ", " + resultSet.getInt(2)); - } - } - } - } -} -``` \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/API/Programming-Java-Native-API_timecho.md b/src/zh/UserGuide/latest-Table/API/Programming-Java-Native-API_timecho.md deleted file mode 100644 index a17dab3c0..000000000 --- a/src/zh/UserGuide/latest-Table/API/Programming-Java-Native-API_timecho.md +++ /dev/null @@ -1,855 +0,0 @@ - - -# Java Native API - -## 1. 
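Overview - -IoTDB provides a native Java client driver and a matching connection pool. They expose an object-oriented interface, so time series objects can be assembled and written directly without composing SQL. Using the connection pool and operating on the database from multiple threads in parallel is recommended. - -## 2. Usage - -**Environment requirements:** - -- JDK >= 1.8 -- Maven >= 3.6 - -**Add the dependency in Maven:** - -```XML -<dependencies> - <dependency> - <groupId>com.timecho.iotdb</groupId> - <artifactId>iotdb-session</artifactId> - - <version>${project.version}</version> - </dependency> -</dependencies> -``` -* The latest version of `iotdb-session` can be found [here](https://repo1.maven.org/maven2/com/timecho/iotdb/iotdb-session/) -* Note: Do not use a newer client version to connect to an older server version. - -## 3. Read and Write Operations - -### 3.1 ITableSession Interface - -#### 3.1.1 Description - -The ITableSession interface defines the basic operations for interacting with IoTDB, such as inserting data, running queries, and closing the session. It is not thread-safe. - -#### 3.1.2 Method List - -The methods defined in the ITableSession interface are described below: - -| **Method** | **Description** | **Parameters** | **Return Value** | **Exceptions** | -| --------------------------------------------------- | ------------------------------------------------------------ | ----------------------------------------------------------- | -------------- | --------------------------------------------------- | -| insert(Tablet tablet) | Inserts a Tablet object containing time series data into the database | tablet: the Tablet object to insert | None | StatementExecutionException, IoTDBConnectionException | -| executeNonQueryStatement(String sql) | Executes a non-query SQL statement, such as a DDL (data definition language) or DML (data manipulation language) command | sql: the SQL statement to execute | None | StatementExecutionException, IoTDBConnectionException | -| executeQueryStatement(String sql) | Executes a query SQL statement and returns a SessionDataSet object containing the query results | sql: the query SQL statement to execute | SessionDataSet | StatementExecutionException, IoTDBConnectionException | -| executeQueryStatement(String sql, long timeoutInMs) | Executes a query SQL statement with a query timeout in milliseconds | sql: the query SQL statement to execute; timeoutInMs: query timeout in milliseconds | SessionDataSet | StatementExecutionException, IoTDBConnectionException | -| close() | Closes the session and releases the resources it holds | None | None | IoTDBConnectionException | - -**Notes on the Object data type:** - -Since V2.0.8, the `iTableSession.insert(Tablet tablet)` interface supports splitting a single Object file into multiple segments and writing them in order. When a column in the Tablet has data type **`TSDataType.Object`**, use the following method to fill values into the Tablet. - -```Java -/* -rowIndex: row position in the tablet -columnIndex: column position in the tablet -isEOF: whether this write is the last part of the Object file -offset: starting offset of this write within the Object file -content: the file content of this write -When writing, make sure the total length of the segmented byte[] chunks equals the size of the original Object; otherwise the written data size will be incorrect -*/ -void addValue(int rowIndex, int columnIndex, boolean 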
isEOF, long offset, byte[] content) -``` - -When querying, the value can be retrieved with any of four methods: `Field.getStringValue`, `Field.getObjectValue`, `SessionDataSet.DataIterator.getObject`, and `SessionDataSet.DataIterator.getString`. Each returns a String that starts with (Object) and ends with the object size, for example: (Object) XX.XX KB. - - -#### 3.1.3 Interface Definition - -``` java -/** - * This interface defines a session for interacting with IoTDB tables. - * It supports operations such as data insertion, executing queries, and closing the session. - * Implementations of this interface are expected to manage connections and ensure - * proper resource cleanup. - * - *

<p>Each method may throw exceptions to indicate issues such as connection errors or - * execution failures. - * - *

<p>Since this interface extends {@link AutoCloseable}, it is recommended to use - * try-with-resources to ensure the session is properly closed. - */ -public interface ITableSession extends AutoCloseable { - - /** - * Inserts a {@link Tablet} into the database. - * - * @param tablet the tablet containing time-series data to be inserted. - * @throws StatementExecutionException if an error occurs while executing the statement. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. - */ - void insert(Tablet tablet) throws StatementExecutionException, IoTDBConnectionException; - - /** - * Executes a non-query SQL statement, such as a DDL or DML command. - * - * @param sql the SQL statement to execute. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. - * @throws StatementExecutionException if an error occurs while executing the statement. - */ - void executeNonQueryStatement(String sql) throws IoTDBConnectionException, StatementExecutionException; - - /** - * Executes a query SQL statement and returns the result set. - * - * @param sql the SQL query statement to execute. - * @return a {@link SessionDataSet} containing the query results. - * @throws StatementExecutionException if an error occurs while executing the statement. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. - */ - SessionDataSet executeQueryStatement(String sql) - throws StatementExecutionException, IoTDBConnectionException; - - /** - * Executes a query SQL statement with a specified timeout and returns the result set. - * - * @param sql the SQL query statement to execute. - * @param timeoutInMs the timeout duration in milliseconds for the query execution. - * @return a {@link SessionDataSet} containing the query results. - * @throws StatementExecutionException if an error occurs while executing the statement. - * @throws IoTDBConnectionException if there is an issue with the IoTDB connection. 
- */ - SessionDataSet executeQueryStatement(String sql, long timeoutInMs) - throws StatementExecutionException, IoTDBConnectionException; - - /** - * Closes the session, releasing any held resources. - * - * @throws IoTDBConnectionException if there is an issue with closing the IoTDB connection. - */ - @Override - void close() throws IoTDBConnectionException; -} -``` - -### 3.2 TableSessionBuilder Class - -#### 3.2.1 Description - -The TableSessionBuilder class is a builder for configuring and creating instances of the ITableSession interface. It lets developers set connection parameters, query parameters, and security features. - -#### 3.2.2 Configuration Options - -The configuration options available in the TableSessionBuilder class and their default values are listed below: - -| **Option** | **Description** | **Default** | -| ---------------------------------------------------- | ---------------------------------------- |---------------------------------------------| -| nodeUrls(List`<String>` nodeUrls) | Sets the list of node URLs of the IoTDB cluster | Collections.singletonList("localhost:6667") | -| username(String username) | Sets the username for the connection | "root" | -| password(String password) | Sets the password for the connection | "TimechoDB@2021" (before V2.0.6.x the default password was "root") | -| database(String database) | Sets the target database name | null | -| queryTimeoutInMs(long queryTimeoutInMs) | Sets the query timeout (milliseconds) | 60000 (1 minute) | -| fetchSize(int fetchSize) | Sets the fetch size for query results | 5000 | -| zoneId(ZoneId zoneId) | Sets the ZoneId used for time zone handling | ZoneId.systemDefault() | -| thriftDefaultBufferSize(int thriftDefaultBufferSize) | Sets the default buffer size of the Thrift client (bytes) | 1024 (1 KB) | -| thriftMaxFrameSize(int thriftMaxFrameSize) | Sets the maximum frame size of the Thrift client (bytes) | 64 * 1024 * 1024 (64 MB) | -| enableRedirection(boolean enableRedirection) | Whether to enable redirection for cluster nodes | true | -| enableAutoFetch(boolean enableAutoFetch) | Whether to enable automatic fetching of available DataNodes | true | -| maxRetryCount(int maxRetryCount) | Sets the maximum number of retries for connection attempts | 60 | -| retryIntervalInMs(long retryIntervalInMs) | Sets the interval between retries (milliseconds) | 500 | -| useSSL(boolean useSSL) | Whether to enable SSL for secure connections | false | -| trustStore(String keyStore) | Sets the trust store path for SSL connections | null | -| trustStorePwd(String keyStorePwd) | Sets the trust store password for SSL connections | null | -| enableCompression(boolean enableCompression) | Whether to enable RPC compression | false | -| connectionTimeoutInMs(int connectionTimeoutInMs) | Sets the connection timeout (milliseconds) | 0 (no timeout) | 
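The defaults in the table above only take effect for options a caller never sets. The following is a minimal, self-contained sketch of that fluent-builder behaviour. `SessionConfigSketch` and its `maxRetryWaitMs` and `describe` helpers are hypothetical stand-ins written for this illustration; they are not part of the IoTDB client, and only a handful of the listed defaults are mirrored. The retry bound assumes one fixed sleep of `retryIntervalInMs` per retry, which the table does not state.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for illustration only; this is NOT the real TableSessionBuilder.
final class SessionConfigSketch {
    // Defaults copied from the configuration table above.
    private List<String> nodeUrls = Collections.singletonList("localhost:6667");
    private String database = null;
    private long queryTimeoutInMs = 60_000L;
    private int fetchSize = 5000;
    private int maxRetryCount = 60;
    private long retryIntervalInMs = 500L;

    // Each setter overrides one default and returns this, so calls can be chained.
    SessionConfigSketch nodeUrls(List<String> nodeUrls) { this.nodeUrls = nodeUrls; return this; }
    SessionConfigSketch database(String database) { this.database = database; return this; }
    SessionConfigSketch queryTimeoutInMs(long timeout) { this.queryTimeoutInMs = timeout; return this; }
    SessionConfigSketch fetchSize(int fetchSize) { this.fetchSize = fetchSize; return this; }
    SessionConfigSketch maxRetryCount(int count) { this.maxRetryCount = count; return this; }
    SessionConfigSketch retryIntervalInMs(long interval) { this.retryIntervalInMs = interval; return this; }

    // Assumption: one fixed sleep per retry, so the total wait is count * interval.
    long maxRetryWaitMs() { return maxRetryCount * retryIntervalInMs; }

    // Summarizes the effective settings so overrides and defaults can be compared.
    String describe() {
        return "nodeUrls=" + nodeUrls + ", database=" + database
            + ", queryTimeoutInMs=" + queryTimeoutInMs + ", fetchSize=" + fetchSize;
    }

    public static void main(String[] args) {
        SessionConfigSketch defaults = new SessionConfigSketch();
        SessionConfigSketch tuned = new SessionConfigSketch().database("test1").fetchSize(1000);
        System.out.println(defaults.describe());
        System.out.println(tuned.describe());
        System.out.println(defaults.maxRetryWaitMs());
    }
}
```

Under these assumptions, the untouched defaults of 60 retries at 500 ms each bound the time spent sleeping between retries at 30000 ms, which may be worth lowering for interactive tools that should fail fast.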
- -#### 3.2.3 Interface Definition - -``` java -/** - * A builder class for constructing instances of {@link ITableSession}. - * - *

<p>This builder provides a fluent API for configuring various options such as connection - * settings, query parameters, and security features. - * - *

<p>All configurations have reasonable default values, which can be overridden as needed. - */ -public class TableSessionBuilder { - - /** - * Builds and returns a configured {@link ITableSession} instance. - * - * @return a fully configured {@link ITableSession}. - * @throws IoTDBConnectionException if an error occurs while establishing the connection. - */ - public ITableSession build() throws IoTDBConnectionException; - - /** - * Sets the list of node URLs for the IoTDB cluster. - * - * @param nodeUrls a list of node URLs. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue Collections.singletonList("localhost:6667") - */ - public TableSessionBuilder nodeUrls(List<String> nodeUrls); - - /** - * Sets the username for the connection. - * - * @param username the username. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue "root" - */ - public TableSessionBuilder username(String username); - - /** - * Sets the password for the connection. - * - * @param password the password. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue "TimechoDB@2021" (before V2.0.6.x the default password was "root") - */ - public TableSessionBuilder password(String password); - - /** - * Sets the target database name. - * - * @param database the database name. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue null - */ - public TableSessionBuilder database(String database); - - /** - * Sets the query timeout in milliseconds. - * - * @param queryTimeoutInMs the query timeout in milliseconds. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 60000 (1 minute) - */ - public TableSessionBuilder queryTimeoutInMs(long queryTimeoutInMs); - - /** - * Sets the fetch size for query results. - * - * @param fetchSize the fetch size. - * @return the current {@link TableSessionBuilder} instance. 
- * @defaultValue 5000 - */ - public TableSessionBuilder fetchSize(int fetchSize); - - /** - * Sets the {@link ZoneId} for timezone-related operations. - * - * @param zoneId the {@link ZoneId}. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue ZoneId.systemDefault() - */ - public TableSessionBuilder zoneId(ZoneId zoneId); - - /** - * Sets the default init buffer size for the Thrift client. - * - * @param thriftDefaultBufferSize the buffer size in bytes. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 1024 (1 KB) - */ - public TableSessionBuilder thriftDefaultBufferSize(int thriftDefaultBufferSize); - - /** - * Sets the maximum frame size for the Thrift client. - * - * @param thriftMaxFrameSize the maximum frame size in bytes. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 64 * 1024 * 1024 (64 MB) - */ - public TableSessionBuilder thriftMaxFrameSize(int thriftMaxFrameSize); - - /** - * Enables or disables redirection for cluster nodes. - * - * @param enableRedirection whether to enable redirection. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue true - */ - public TableSessionBuilder enableRedirection(boolean enableRedirection); - - /** - * Enables or disables automatic fetching of available DataNodes. - * - * @param enableAutoFetch whether to enable automatic fetching. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue true - */ - public TableSessionBuilder enableAutoFetch(boolean enableAutoFetch); - - /** - * Sets the maximum number of retries for connection attempts. - * - * @param maxRetryCount the maximum retry count. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 60 - */ - public TableSessionBuilder maxRetryCount(int maxRetryCount); - - /** - * Sets the interval between retries in milliseconds. - * - * @param retryIntervalInMs the interval in milliseconds. 
- * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 500 milliseconds - */ - public TableSessionBuilder retryIntervalInMs(long retryIntervalInMs); - - /** - * Enables or disables SSL for secure connections. - * - * @param useSSL whether to enable SSL. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue false - */ - public TableSessionBuilder useSSL(boolean useSSL); - - /** - * Sets the trust store path for SSL connections. - * - * @param keyStore the trust store path. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue null - */ - public TableSessionBuilder trustStore(String keyStore); - - /** - * Sets the trust store password for SSL connections. - * - * @param keyStorePwd the trust store password. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue null - */ - public TableSessionBuilder trustStorePwd(String keyStorePwd); - - /** - * Enables or disables RPC compression for the connection. - * - * @param enableCompression whether to enable compression. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue false - */ - public TableSessionBuilder enableCompression(boolean enableCompression); - - /** - * Sets the connection timeout in milliseconds. - * - * @param connectionTimeoutInMs the connection timeout in milliseconds. - * @return the current {@link TableSessionBuilder} instance. - * @defaultValue 0 (no timeout) - */ - public TableSessionBuilder connectionTimeoutInMs(int connectionTimeoutInMs); -} -``` - -> Note: When creating a table through the native API, table or column names that contain special characters or Chinese characters do not need to be wrapped in extra double quotes; if they are, the quote characters become part of the name. - -## 4. 
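Client Connection Pool - -### 4.1 ITableSessionPool Interface - -#### 4.1.1 Description - -ITableSessionPool is a pool that manages ITableSession instances. It lets connections be reused efficiently and releases resources properly once they are no longer needed. The interface defines the basic operations for acquiring a session from the pool and for closing the pool. - -#### 4.1.2 Method List - -| **Method** | **Description** | **Return Value** | **Exceptions** | -| ------------ | ------------------------------------------------------------ | ------------------ | ------------------------ | -| getSession() | Acquires an ITableSession instance from the pool for interacting with IoTDB. | ITableSession instance | IoTDBConnectionException | -| close() | Closes the session pool and releases all held resources. After closing, no new sessions can be acquired from the pool. | None | None | - -#### 4.1.3 Interface Definition - -```Java -/** - * This interface defines a pool for managing {@link ITableSession} instances. - * It provides methods to acquire a session from the pool and to close the pool. - * - *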

<p>The implementation should handle the lifecycle of sessions, ensuring efficient - * reuse and proper cleanup of resources. - */ -public interface ITableSessionPool { - - /** - * Acquires an {@link ITableSession} instance from the pool. - * - * @return an {@link ITableSession} instance for interacting with the IoTDB. - * @throws IoTDBConnectionException if there is an issue obtaining a session from the pool. - */ - ITableSession getSession() throws IoTDBConnectionException; - - /** - * Closes the session pool, releasing any held resources. - * - *

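<p>Once the pool is closed, no further sessions can be acquired. - */ - void close(); -} -``` - -### 4.2 TableSessionPoolBuilder Class - -#### 4.2.1 Description - -The builder for TableSessionPool, used to configure and create instances of ITableSessionPool. It lets developers configure connection parameters, session parameters, and pooling behavior. - -#### 4.2.2 Configuration Options - -The configuration options available in the TableSessionPoolBuilder class and their default values are listed below: - -| **Option** | **Description** | **Default** | -| ------------------------------------------------------------ | -------------------------------------------- |---------------------------------------------| -| nodeUrls(List`<String>` nodeUrls) | Sets the list of node URLs of the IoTDB cluster | Collections.singletonList("localhost:6667") | -| maxSize(int maxSize) | Sets the maximum size of the session pool, that is, the maximum number of sessions allowed in the pool | 5 | -| user(String user) | Sets the username for the connection | "root" | -| password(String password) | Sets the password for the connection | "TimechoDB@2021" (before V2.0.6.x the default password was "root") | -| database(String database) | Sets the target database name | "root" | -| queryTimeoutInMs(long queryTimeoutInMs) | Sets the query timeout (milliseconds) | 60000 (1 minute) | -| fetchSize(int fetchSize) | Sets the fetch size for query results | 5000 | -| zoneId(ZoneId zoneId) | Sets the ZoneId used for time zone handling | ZoneId.systemDefault() | -| waitToGetSessionTimeoutInMs(long waitToGetSessionTimeoutInMs) | Sets the timeout for acquiring a session from the pool (milliseconds) | 30000 (30 seconds) | -| thriftDefaultBufferSize(int thriftDefaultBufferSize) | Sets the default buffer size of the Thrift client (bytes) | 1024 (1 KB) | -| thriftMaxFrameSize(int thriftMaxFrameSize) | Sets the maximum frame size of the Thrift client (bytes) | 64 * 1024 * 1024 (64 MB) | -| enableCompression(boolean enableCompression) | Whether to enable compression for the connection | false | -| enableRedirection(boolean enableRedirection) | Whether to enable redirection for cluster nodes | true | -| connectionTimeoutInMs(int connectionTimeoutInMs) | Sets the connection timeout (milliseconds) | 10000 (10 seconds) | -| enableAutoFetch(boolean enableAutoFetch) | Whether to enable automatic fetching of available DataNodes | true | -| maxRetryCount(int maxRetryCount) | Sets the maximum number of retries for connection attempts | 60 | -| retryIntervalInMs(long retryIntervalInMs) | Sets the interval between retries (milliseconds) | 500 | -| useSSL(boolean useSSL) | Whether to enable SSL for secure connections | false | -| trustStore(String keyStore) | Sets the trust store path for SSL connections | null | -| trustStorePwd(String keyStorePwd) | Sets the trust store password for SSL connections | null | - -#### 4.2.3 Interface Definition - -```Java -/** - * A builder class for constructing instances of {@link 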
ITableSessionPool}. - * - *

<p>This builder provides a fluent API for configuring a session pool, including - * connection settings, session parameters, and pool behavior. - * - *

<p>All configurations have reasonable default values, which can be overridden as needed. - */ -public class TableSessionPoolBuilder { - - /** - * Builds and returns a configured {@link ITableSessionPool} instance. - * - * @return a fully configured {@link ITableSessionPool}. - */ - public ITableSessionPool build(); - - /** - * Sets the list of node URLs for the IoTDB cluster. - * - * @param nodeUrls a list of node URLs. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue Collections.singletonList("localhost:6667") - */ - public TableSessionPoolBuilder nodeUrls(List<String> nodeUrls); - - /** - * Sets the maximum size of the session pool. - * - * @param maxSize the maximum number of sessions allowed in the pool. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 5 - */ - public TableSessionPoolBuilder maxSize(int maxSize); - - /** - * Sets the username for the connection. - * - * @param user the username. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue "root" - */ - public TableSessionPoolBuilder user(String user); - - /** - * Sets the password for the connection. - * - * @param password the password. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue "TimechoDB@2021" (before V2.0.6.x the default password was "root") - */ - public TableSessionPoolBuilder password(String password); - - /** - * Sets the target database name. - * - * @param database the database name. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue "root" - */ - public TableSessionPoolBuilder database(String database); - - /** - * Sets the query timeout in milliseconds. - * - * @param queryTimeoutInMs the query timeout in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 60000 (1 minute) - */ - public TableSessionPoolBuilder queryTimeoutInMs(long queryTimeoutInMs); - - /** - * Sets the fetch size for query results. 
- * - * @param fetchSize the fetch size. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 5000 - */ - public TableSessionPoolBuilder fetchSize(int fetchSize); - - /** - * Sets the {@link ZoneId} for timezone-related operations. - * - * @param zoneId the {@link ZoneId}. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue ZoneId.systemDefault() - */ - public TableSessionPoolBuilder zoneId(ZoneId zoneId); - - /** - * Sets the timeout for waiting to acquire a session from the pool. - * - * @param waitToGetSessionTimeoutInMs the timeout duration in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 30000 (30 seconds) - */ - public TableSessionPoolBuilder waitToGetSessionTimeoutInMs(long waitToGetSessionTimeoutInMs); - - /** - * Sets the default buffer size for the Thrift client. - * - * @param thriftDefaultBufferSize the buffer size in bytes. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 1024 (1 KB) - */ - public TableSessionPoolBuilder thriftDefaultBufferSize(int thriftDefaultBufferSize); - - /** - * Sets the maximum frame size for the Thrift client. - * - * @param thriftMaxFrameSize the maximum frame size in bytes. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 64 * 1024 * 1024 (64 MB) - */ - public TableSessionPoolBuilder thriftMaxFrameSize(int thriftMaxFrameSize); - - /** - * Enables or disables compression for the connection. - * - * @param enableCompression whether to enable compression. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue false - */ - public TableSessionPoolBuilder enableCompression(boolean enableCompression); - - /** - * Enables or disables redirection for cluster nodes. - * - * @param enableRedirection whether to enable redirection. - * @return the current {@link TableSessionPoolBuilder} instance. 
- * @defaultValue true - */ - public TableSessionPoolBuilder enableRedirection(boolean enableRedirection); - - /** - * Sets the connection timeout in milliseconds. - * - * @param connectionTimeoutInMs the connection timeout in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 10000 (10 seconds) - */ - public TableSessionPoolBuilder connectionTimeoutInMs(int connectionTimeoutInMs); - - /** - * Enables or disables automatic fetching of available DataNodes. - * - * @param enableAutoFetch whether to enable automatic fetching. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue true - */ - public TableSessionPoolBuilder enableAutoFetch(boolean enableAutoFetch); - - /** - * Sets the maximum number of retries for connection attempts. - * - * @param maxRetryCount the maximum retry count. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 60 - */ - public TableSessionPoolBuilder maxRetryCount(int maxRetryCount); - - /** - * Sets the interval between retries in milliseconds. - * - * @param retryIntervalInMs the interval in milliseconds. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue 500 milliseconds - */ - public TableSessionPoolBuilder retryIntervalInMs(long retryIntervalInMs); - - /** - * Enables or disables SSL for secure connections. - * - * @param useSSL whether to enable SSL. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue false - */ - public TableSessionPoolBuilder useSSL(boolean useSSL); - - /** - * Sets the trust store path for SSL connections. - * - * @param keyStore the trust store path. - * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue null - */ - public TableSessionPoolBuilder trustStore(String keyStore); - - /** - * Sets the trust store password for SSL connections. - * - * @param keyStorePwd the trust store password. 
- * @return the current {@link TableSessionPoolBuilder} instance. - * @defaultValue null - */ - public TableSessionPoolBuilder trustStorePwd(String keyStorePwd); -} -``` - -## 5. Sample Code - -Session sample code: [src/main/java/org/apache/iotdb/TableModelSessionExample.java](https://github.com/apache/iotdb/blob/master/example/session/src/main/java/org/apache/iotdb/TableModelSessionExample.java) - -SessionPool sample code: [src/main/java/org/apache/iotdb/TableModelSessionPoolExample.java](https://github.com/apache/iotdb/blob/master/example/session/src/main/java/org/apache/iotdb/TableModelSessionPoolExample.java) - -```Java -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. 
- */ - -package org.apache.iotdb; - -import org.apache.iotdb.isession.ITableSession; -import org.apache.iotdb.isession.SessionDataSet; -import org.apache.iotdb.isession.pool.ITableSessionPool; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.TableSessionPoolBuilder; - -import org.apache.tsfile.enums.ColumnCategory; -import org.apache.tsfile.enums.TSDataType; -import org.apache.tsfile.write.record.Tablet; - -import java.util.ArrayList; -import java.util.Arrays; -import java.util.Collections; -import java.util.List; - -import static org.apache.iotdb.SessionExample.printDataSet; - -public class TableModelSessionPoolExample { - - private static final String LOCAL_URL = "127.0.0.1:6667"; - - public static void main(String[] args) { - - // don't specify database in constructor - ITableSessionPool tableSessionPool = - new TableSessionPoolBuilder() - .nodeUrls(Collections.singletonList(LOCAL_URL)) - .user("root") - .password("TimechoDB@2021") // before V2.0.6.x the default password was root - .maxSize(1) - .build(); - - try (ITableSession session = tableSessionPool.getSession()) { - - session.executeNonQueryStatement("CREATE DATABASE test1"); - session.executeNonQueryStatement("CREATE DATABASE test2"); - - session.executeNonQueryStatement("use test2"); - - // or use a fully qualified table name - session.executeNonQueryStatement( - "create table test1.table1(" - + "region_id STRING TAG, " - + "plant_id STRING TAG, " - + "device_id STRING TAG, " - + "model STRING ATTRIBUTE, " - + "temperature FLOAT FIELD, " - + "humidity DOUBLE FIELD) with (TTL=3600000)"); - - session.executeNonQueryStatement( - "create table table2(" - + "region_id STRING TAG, " - + "plant_id STRING TAG, " - + "color STRING ATTRIBUTE, " - + "temperature FLOAT FIELD, " - + "speed DOUBLE FIELD) with (TTL=6600000)"); - - // show tables from current database - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) { - 
printDataSet(dataSet); - } - - // show tables by specifying another database - // using SHOW tables FROM - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES FROM test1")) { - printDataSet(dataSet); - } - - // insert table data by tablet - List<String> columnNameList = - Arrays.asList("region_id", "plant_id", "device_id", "model", "temperature", "humidity"); - List<TSDataType> dataTypeList = - Arrays.asList( - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.STRING, - TSDataType.FLOAT, - TSDataType.DOUBLE); - List<ColumnCategory> columnTypeList = - new ArrayList<>( - Arrays.asList( - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.TAG, - ColumnCategory.ATTRIBUTE, - ColumnCategory.FIELD, - ColumnCategory.FIELD)); - Tablet tablet = new Tablet("test1", columnNameList, dataTypeList, columnTypeList, 100); - for (long timestamp = 0; timestamp < 100; timestamp++) { - int rowIndex = tablet.getRowSize(); - tablet.addTimestamp(rowIndex, timestamp); - tablet.addValue("region_id", rowIndex, "1"); - tablet.addValue("plant_id", rowIndex, "5"); - tablet.addValue("device_id", rowIndex, "3"); - tablet.addValue("model", rowIndex, "A"); - tablet.addValue("temperature", rowIndex, 37.6F); - tablet.addValue("humidity", rowIndex, 111.1); - if (tablet.getRowSize() == tablet.getMaxRowNumber()) { - session.insert(tablet); - tablet.reset(); - } - } - if (tablet.getRowSize() != 0) { - session.insert(tablet); - tablet.reset(); - } - - // query table data - try (SessionDataSet dataSet = - session.executeQueryStatement( - "select * from test1 " - + "where region_id = '1' and plant_id in ('3', '5') and device_id = '3'")) { - printDataSet(dataSet); - } - - } catch (IoTDBConnectionException e) { - e.printStackTrace(); - } catch (StatementExecutionException e) { - e.printStackTrace(); - } finally { - tableSessionPool.close(); - } - - // specify database in constructor - tableSessionPool = - new TableSessionPoolBuilder() - .nodeUrls(Collections.singletonList(LOCAL_URL)) - 
.user("root") - .password("TimechoDB@2021") // before V2.0.6.x the default password was root - .maxSize(1) - .database("test1") - .build(); - - try (ITableSession session = tableSessionPool.getSession()) { - - // show tables from current database - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) { - printDataSet(dataSet); - } - - // change database to test2 - session.executeNonQueryStatement("use test2"); - - // show tables from the new current database test2 - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) { - printDataSet(dataSet); - } - - } catch (IoTDBConnectionException e) { - e.printStackTrace(); - } catch (StatementExecutionException e) { - e.printStackTrace(); - } - - try (ITableSession session = tableSessionPool.getSession()) { - - // show tables from default database test1 - try (SessionDataSet dataSet = session.executeQueryStatement("SHOW TABLES")) { - printDataSet(dataSet); - } - - } catch (IoTDBConnectionException e) { - e.printStackTrace(); - } catch (StatementExecutionException e) { - e.printStackTrace(); - } finally { - tableSessionPool.close(); - } - } -} -``` \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/API/Programming-MQTT_timecho.md b/src/zh/UserGuide/latest-Table/API/Programming-MQTT_timecho.md deleted file mode 100644 index 9d9073502..000000000 --- a/src/zh/UserGuide/latest-Table/API/Programming-MQTT_timecho.md +++ /dev/null @@ -1,264 +0,0 @@ - -# MQTT Protocol - -## 1. Overview - -MQTT is a lightweight messaging protocol designed for the Internet of Things (IoT) and low-bandwidth environments. It is based on the publish/subscribe (Pub/Sub) model and supports efficient, reliable two-way communication between devices. Its core goals are low power consumption, low bandwidth usage, and high real-time performance, which makes it especially suitable for unstable networks or resource-constrained scenarios (such as sensors and mobile devices). - -IoTDB deeply integrates MQTT protocol capabilities and is fully compatible with MQTT v3.1 (OASIS international standard protocol). The IoTDB server has a built-in high-performance MQTT Broker service module, so no third-party middleware is needed, and devices can write time series data directly into the IoTDB storage engine through MQTT messages. - -![](/img/mqtt-table-1.png) - -Note: starting from V2.0.8.2, the TimechoDB installation package no longer includes the JAR package of the MQTT service by default. Before using this service, contact the Timecho team to obtain the JAR package and place it under timechodb_home/lib or timechodb_home/ext/external_service. - -## 2. 
Configuration - -By default, the IoTDB MQTT service loads its configuration from `${IOTDB_HOME}/${IOTDB_CONF}/iotdb-system.properties`. - -The configuration items are as follows: - -| **Name** | **Description** | **Default** | -|---------------------------| ------------------------------------------------ | ---------------- | -| `enable_mqtt_service` | Whether to enable the MQTT service | FALSE | -| `mqtt_host` | Host that the MQTT service binds to | 127.0.0.1 | -| `mqtt_port` | Port that the MQTT service binds to | 1883 | -| `mqtt_handler_pool_size` | Size of the handler pool that processes MQTT messages | 1 | -| **`mqtt_payload_formatter`** | **Payload formatter for MQTT messages. Options: `json` (tree model only), `line` (table model only).** | **json** | -| `mqtt_max_message_size` | Maximum MQTT message length in bytes | 1048576 | - -## 3. Write Protocol - -* Line protocol syntax - -```JavaScript -<measurement>[,<tag_key>=<tag_value>[,<tag_key>=<tag_value>]][ <attribute_key>=<attribute_value>[,<attribute_key>=<attribute_value>]] <field_key>=<field_value>[,<field_key>=<field_value>] [<timestamp>] -``` - -* Example - -```JavaScript -myMeasurement,tag1=value1,tag2=value2 attr1=value1,attr2=value2 fieldKey="fieldValue" 1556813561098000000 -``` - -![](/img/mqtt-table-2.png) - - -## 4. Naming Conventions - -* Database name - -The MQTT topic name is split by `/`, and the first segment is used as the database name. - -```Properties -topic: stock/Legacy -databaseName: stock - - -topic: stock/Legacy/# -databaseName: stock -``` - -* Table name - -The table name is the `<measurement>` in the line protocol. - -* Type identifiers - -| Field content | IoTDB data type | -|--------------------------------------------------------------| ---------------- | -| 1<br>
1.12 | DOUBLE | -| 1`f`<br>1.12`f` | FLOAT | -| 1`i`<br>123`i` | INT64 | -| 1`u`<br>123`u` | INT64 | -| 1`i32`<br>123`i32` | INT32 | -| `"xxx"` | TEXT | -| `t`,`T`,`true`,`True`,`TRUE`<br>
`f`,`F`,`false`,`False`,`FALSE` | BOOLEAN | - - -## 5. Code Example -The following example shows an MQTT client sending messages to the IoTDB server. - -```java -MQTT mqtt = new MQTT(); -mqtt.setHost("127.0.0.1", 1883); -mqtt.setUserName("root"); -mqtt.setPassword("root"); - -BlockingConnection connection = mqtt.blockingConnection(); -String DATABASE = "myMqttTest"; -connection.connect(); - -String payload = - "test1,tag1=t1,tag2=t2 attr3=a5,attr4=a4 field1=\"fieldValue1\",field2=1i,field3=1u 1"; -connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -payload = "test1,tag1=t1,tag2=t2 field4=2,field5=2i32,field6=2f 2"; -connection.publish(DATABASE, payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -payload = "# It's a remark\n " + "test1,tag1=t1,tag2=t2 field4=2,field5=2i32,field6=2f 6"; -connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -// batch write example -payload = - "test1,tag1=t1,tag2=t2 field7=t,field8=T,field9=true 3 \n " - + "test1,tag1=t1,tag2=t2 field7=f,field8=F,field9=FALSE 4"; -connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -// batch write example -payload = - "test1,tag1=t1,tag2=t2 attr1=a1,attr2=a2 field1=\"fieldValue1\",field2=1i,field3=1u 4 \n " - + "test1,tag1=t1,tag2=t2 field4=2,field5=2i32,field6=2f 5"; -connection.publish(DATABASE + "/myTopic", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -Thread.sleep(10); - -connection.disconnect(); -``` - - -## 6. Customizing the MQTT Message Format - -The MQTT message format can be customized with a small amount of programming. -A simple example is available in the [example/mqtt-customize](https://github.com/apache/iotdb/tree/master/example/mqtt-customize) project of the source code. - -Steps: -1. Create a Java project and add the following dependency -```xml -<dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-server</artifactId> - <version>2.0.4-SNAPSHOT</version> -</dependency> -``` -2. 
Create an implementation class that implements the interface `org.apache.iotdb.db.protocol.mqtt.PayloadFormatter` - -```java -package org.apache.iotdb.mqtt.server; - -import io.netty.buffer.ByteBuf; -import org.apache.iotdb.db.protocol.mqtt.Message; -import org.apache.iotdb.db.protocol.mqtt.PayloadFormatter; -// TableMessage is assumed to live in the same package as Message -import org.apache.iotdb.db.protocol.mqtt.TableMessage; -import org.apache.tsfile.enums.TSDataType; - -import java.nio.charset.StandardCharsets; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.List; - -public class CustomizedLinePayloadFormatter implements PayloadFormatter { - - @Override - public List<Message> format(String topic, ByteBuf payload) { - // Suppose the payload is a line format - if (payload == null) { - return null; - } - - String line = payload.toString(StandardCharsets.UTF_8); - // parse data from the line and generate Messages and put them into List ret - List<Message> ret = new ArrayList<>(); - // this is just an example, so we just generate some Messages directly - for (int i = 0; i < 3; i++) { - long ts = i; - TableMessage message = new TableMessage(); - - // Parsing Database Name - message.setDatabase("db" + i); - - // Parsing Table Names - message.setTable("t" + i); - - // Parsing Tags - List<String> tagKeys = new ArrayList<>(); - tagKeys.add("tag1" + i); - tagKeys.add("tag2" + i); - List<Object> tagValues = new ArrayList<>(); - tagValues.add("t_value1" + i); - tagValues.add("t_value2" + i); - message.setTagKeys(tagKeys); - message.setTagValues(tagValues); - - // Parsing Attributes - List<String> attributeKeys = new ArrayList<>(); - List<Object> attributeValues = new ArrayList<>(); - attributeKeys.add("attr1" + i); - attributeKeys.add("attr2" + i); - attributeValues.add("a_value1" + i); - attributeValues.add("a_value2" + i); - message.setAttributeKeys(attributeKeys); - message.setAttributeValues(attributeValues); - - // Parsing Fields - List<String> fields = Arrays.asList("field1" + i, "field2" + i); - List<TSDataType> dataTypes = Arrays.asList(TSDataType.FLOAT, TSDataType.FLOAT); - List<Object> values = Arrays.asList("4.0" + i, "5.0" + i); - message.setFields(fields); - message.setDataTypes(dataTypes); - message.setValues(values); - - 
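// Parsing timestamp - message.setTimestamp(ts); - ret.add(message); - } - return ret; - } - - @Override - public String getName() { - // set the value of mqtt_payload_formatter in iotdb-system.properties as the following string: - return "CustomizedLine"; - } -} -``` - - -3. Edit the `src/main/resources/META-INF/services/org.apache.iotdb.db.protocol.mqtt.PayloadFormatter` file in the project: - Clear the contents of the example file and write the fully qualified name (package.class) of the implementation class into it. Note that this file contains only one line. - In this example, the file content is: `org.apache.iotdb.mqtt.server.CustomizedLinePayloadFormatter` -4. Build the project to produce a jar package: `mvn package -DskipTests` - - -On the IoTDB server: -1. Create the ${IOTDB_HOME}/ext/mqtt/ folder and put the jar package into it. -2. Enable the MQTT service (`enable_mqtt_service=true` in `conf/iotdb-system.properties`). -3. Set `mqtt_payload_formatter` in `conf/iotdb-system.properties` to the value returned by getName() of the implementation class; in this example, `CustomizedLine`. -4. Start IoTDB. -5. Done. - -More: MQTT messages are not limited to the line format; any binary payload can be used. Access it through: -`payload.forEachByte()` or `payload.array()`. - - -## 7. Notes - -To avoid compatibility problems caused by a default client_id, we strongly recommend always providing a unique, non-empty client_id explicitly in every MQTT client. -Clients behave differently when client_id is missing or empty. Common examples: -1. Empty string passed explicitly -• MQTTX: with client_id="", IoTDB silently drops the message; -• mosquitto_pub: with client_id="", IoTDB receives the message normally. -2. client_id not passed at all -• MQTTX: the message is received by IoTDB normally; -• mosquitto_pub: IoTDB rejects the connection. -Explicitly specifying a unique, non-empty client_id is therefore the simplest way to eliminate these differences and ensure reliable message delivery. \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/API/Programming-ODBC_timecho.md b/src/zh/UserGuide/latest-Table/API/Programming-ODBC_timecho.md deleted file mode 100644 index 6326752d8..000000000 --- a/src/zh/UserGuide/latest-Table/API/Programming-ODBC_timecho.md +++ /dev/null @@ -1,1083 +0,0 @@ - - -# ODBC - -## 1. Introduction - -The IoTDB ODBC driver provides the ability to interact with the database through the standard ODBC interface and to manage data in the time series database over ODBC connections. It currently supports database connection, data query, data insertion, data modification, and data deletion, and works with applications and toolchains that support the ODBC protocol. - -> Note: this feature is supported since V2.0.8.2. - -## 2. 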
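Usage - -We recommend installing from the precompiled binary package. No compilation is required: a script installs the driver and registers it with the system. Only Windows is currently supported. - -### 2.1 Requirements - -Only the operating-system-level ODBC driver manager dependency must be satisfied; no build environment is needed: - -| **Operating system** | **Requirements and installation** | -| -------------------- |--------------------------------------------| -| Windows | 1. **Windows 10/11, Server 2016/2019/2022**: ship with the ODBC 17/18 driver manager, no extra installation needed<br>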
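2. **Windows 8.1/Server 2012 R2**: the matching version of the ODBC driver manager must be installed manually | - -### 2.2 Installation - -1. Contact the Timecho team to obtain the precompiled binary package - -Directory structure of the binary package: - -```Plain -├── bin/ -│ ├── apache_iotdb_odbc.dll -│ └── install_driver.exe -├── install.bat -└── registry.bat -``` - -2. Open a command-line tool (CMD/PowerShell) with **administrator privileges** and run the following command (the path can be replaced with any absolute path): - -```Bash -install.bat "C:\Program Files\Apache IoTDB ODBC Driver" -``` - -The script automatically performs the following steps: - -* Creates the installation directory (if it does not exist) -* Copies `bin\apache_iotdb_odbc.dll` to the specified installation directory -* Calls `install_driver.exe` to register the driver with the system through the standard ODBC API (`SQLInstallDriverEx`) - -3. Verify the installation: open the ODBC Data Source Administrator; if `Apache IoTDB ODBC Driver` appears on the "Drivers" tab, the registration succeeded. - -![](/img/odbc-1.png) - -### 2.3 Uninstallation - -1. Open a command prompt as administrator and `cd` into the project root directory. -2. Run the uninstall script: - -```Bash -uninstall.bat -``` - -The script calls `install_driver.exe` to unregister the driver from the system through the standard ODBC API (`SQLRemoveDriver`). The DLL files in the installation directory are not deleted automatically; remove them manually if needed. - -### 2.4 Connection Configuration - -After the driver is installed, a data source (DSN) must be configured before applications can connect to the database by DSN name. The IoTDB ODBC driver supports two ways to configure connection parameters: a data source and a connection string. - -#### 2.4.1 Configuring a Data Source - -**Configuring through the ODBC Data Source Administrator** - -1. Open the ODBC Data Source Administrator, switch to the "User DSN" tab, and click "Add". - -![](/img/odbc-2.png) - -2. In the driver list that pops up, select "Apache IoTDB ODBC Driver" and click "Finish". - -![](/img/odbc-3.png) - -3. In the data source configuration dialog, fill in the connection parameters and click OK: - -![](/img/odbc-4.png) - -The fields in the dialog have the following meanings: - -| **Section** | **Field** | **Description** | -| ---------------- | ----------------- | ------------------------------------------------ | -| Data Source | DSN Name | Data source name; applications refer to the data source by this name | -| Data Source | Description | Data source description (optional) | -| Connection | Server | IoTDB server IP address, default 127.0.0.1 | -| Connection | Port | IoTDB Session API port, default 6667 | -| Connection | User | User name, default root | -| Connection | Password | Password, default root | -| Options | Table Model | Checked: use the table model; unchecked: use the tree model | -| Options | Database | Database name. Available only in table model mode; greyed out and not editable in tree model mode | -| Options | Log Level | Log level (0-4): 0=OFF, 1=ERROR, 2=WARN, 3=INFO, 4=TRACE | -| Options | Session Timeout | Session timeout in milliseconds; 0 means no timeout. Note that the server-side queryTimeoutThreshold defaults to 60000 ms; a longer timeout requires changing the server configuration | -| Options | Batch Size | Number of rows fetched per request, default 1000. Setting it to 0 resets it to the default | - -4. 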
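After filling in the fields, you can click the "Test Connection" button. The test uses the current parameters to connect to the IoTDB server and run a `SHOW VERSION` query. On success it displays the server version; on failure it displays the specific error. -5. Once the parameters are confirmed, click "OK" to save. The data source appears in the "User DSN" list, such as the data source named 123 in the figure below. - -![](/img/odbc-5.png) - -To change an existing data source, select it in the list and click "Configure" to edit it again. - -#### 2.4.2 Connection String - -A connection string consists of **semicolon-separated key-value pairs**, for example: - -```Bash -Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;database=testdb;isTableModel=true;loglevel=2 -``` - -The fields are described in the following table: - -| **Field** | **Description** | **Allowed values** | **Default** | -| --------------------------- | ---------------------------------- |--------------------------------------------| --------------------------------- | -| DSN | Data source name | Custom data source name | - | -| uid | Database user name | Any string | root | -| pwd | Database password | Any string | root | -| server | IoTDB server address | IP address | 127.0.0.1 | -| port | IoTDB server port | Port number | 6667 | -| database | Database name (effective only in table model mode) | Any string | Empty string | -| loglevel | Log level | Integer (0-4) | 4 (LOG\_LEVEL\_TRACE) | -| isTableModel / tablemodel | Whether to enable table model mode | Boolean; several notations are accepted:<br>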
1. 0, false, no, off: set to false;<br>2. 1, true, yes, on: set to true;<br>
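3. Any other value is treated as true. | true| -| sessiontimeoutms | Session timeout (milliseconds) | 64-bit integer, default `LLONG_MAX`; a value of `0` is replaced with `LLONG_MAX`. Note that the server has its own timeout setting, `private long queryTimeoutThreshold = 60000;`, which must be changed to obtain a timeout longer than 60 seconds. | LLONG\_MAX| -| batchsize | Batch size of each data fetch | 64-bit integer, default `1000`; a value of `0` is replaced with `1000` | 1000| - -Notes: - -* Field names are case-insensitive (they are converted to lowercase before comparison) -* A connection string consists of semicolon-separated key-value pairs, for example: `Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;database=testdb;isTableModel=true;loglevel=2` -* Boolean fields (isTableModel) accept several notations -* All fields are optional; defaults are used for any field that is not specified -* Unsupported fields are ignored and a warning is written to the log; they do not affect the connection -* The default server port 6667 is the default port of the IoTDB C++ Session interface. This ODBC driver exchanges data with IoTDB through the C++ Session interface. If the IoTDB server uses a non-default port for the C++ Session interface, change the port in the ODBC connection string accordingly. - -#### 2.4.3 Relationship Between Data Source Configuration and Connection Strings - -Configuration saved in the ODBC Data Source Administrator is written as key-value pairs into the system ODBC data source configuration (on Windows, the registry key `HKEY_CURRENT_USER\SOFTWARE\ODBC\ODBC.INI`). When an application uses `SQLConnect` or specifies `DSN=<data source name>` in a connection string, the driver reads these parameters from the system configuration. - -**The connection string takes precedence over the configuration saved in the DSN.** The rules are: - -1. If the connection string contains `DSN=xxx` and does not contain `DRIVER=...`, the driver first loads all parameters of that DSN from the system configuration as base values. -2. Parameters specified explicitly in the connection string then override the DSN parameters of the same name. -3. If the connection string contains `DRIVER=...`, no DSN parameters are read from the system configuration and the connection string alone is used. - -For example, if the DSN is configured with `Server=192.168.1.100` and `Port=6667` but the connection string is `DSN=MyDSN;Server=127.0.0.1`, the connection uses `Server=127.0.0.1` (overridden by the connection string) and `Port=6667` (from the DSN). - -### 2.5 Logging - -The driver's runtime log output falls into two categories, the driver's own log and the ODBC manager trace log. Be aware of the performance impact of the log level. - -#### 2.5.1 Driver Log - -* Location: `apache_iotdb_odbc.log` in the user's home directory; -* Log level: set through `loglevel` in the connection string (0-4; a higher level produces more detailed output); -* Performance impact: high log levels significantly reduce driver performance and are recommended only for debugging. - -#### 2.5.2 ODBC Manager Trace Log - -* How to enable: open the ODBC Data Source Administrator, go to "Tracing", and click "Start Tracing Now"; -* Caution: enabling tracing greatly reduces driver performance; use it only for troubleshooting. - -## 3. 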
API Support - -### 3.1 Method List - -The driver supports the following ODBC standard APIs: - -| ODBC/Setup API | Function | Parameter list | Parameter description | -| ------------------- | ------------------------ | ------------------------------------------------ | ------------------------------------------------ | -| SQLAllocHandle| Allocates an ODBC handle | (SQLSMALLINT HandleType, SQLHANDLE InputHandle, SQLHANDLE \*OutputHandle) | HandleType: type of handle to allocate (ENV/DBC/STMT/DESC);<br>
InputHandle: parent context handle;<br>OutputHandle: pointer that receives the newly allocated handle | -| SQLBindCol | Binds a column to a result buffer | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN \*StrLen\_or\_Ind) | StatementHandle: statement handle;<br>
ColumnNumber: column number;<br>TargetType: C data type;<br>TargetValue: data buffer;<br>BufferLength: buffer length;<br>StrLen\_or\_Ind: returned data length or NULL indicator | -| SQLColAttribute| Gets column attribute information | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLUSMALLINT FieldIdentifier, SQLPOINTER CharacterAttribute, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength, SQLLEN \*NumericAttribute) | StatementHandle: statement handle;<br>
ColumnNumber: column number;<br>FieldIdentifier: attribute ID;<br>CharacterAttribute: character attribute output;<br>BufferLength: buffer length;<br>StringLength: returned length;<br>NumericAttribute: numeric attribute output | -| SQLColumns| Queries table column information | (SQLHSTMT StatementHandle, SQLCHAR \*CatalogName, SQLSMALLINT NameLength1, SQLCHAR \*SchemaName, SQLSMALLINT NameLength2, SQLCHAR \*TableName, SQLSMALLINT NameLength3, SQLCHAR \*ColumnName, SQLSMALLINT NameLength4) | StatementHandle: statement handle;<br>
Catalog/Schema/Table/ColumnName: names of the objects to query;<br>NameLength\*: length of the corresponding name | -| SQLConnect | Establishes a database connection | (SQLHDBC ConnectionHandle, SQLCHAR \*ServerName, SQLSMALLINT NameLength1, SQLCHAR \*UserName, SQLSMALLINT NameLength2, SQLCHAR \*Authentication, SQLSMALLINT NameLength3) | ConnectionHandle: connection handle;<br>
ServerName: data source name;<br>UserName: user name;<br>Authentication: password;<br>NameLength\*: string length | -| SQLDescribeCol | Describes a column in the result set | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLCHAR \*ColumnName, SQLSMALLINT BufferLength, SQLSMALLINT \*NameLength, SQLSMALLINT \*DataType, SQLULEN \*ColumnSize, SQLSMALLINT \*DecimalDigits, SQLSMALLINT \*Nullable) | StatementHandle: statement handle;<br>
ColumnNumber: column number;<br>ColumnName: column name output;<br>BufferLength: buffer length;<br>NameLength: returned column name length;<br>DataType: SQL type;<br>ColumnSize: column size;<br>DecimalDigits: decimal digits;<br>Nullable: whether the column can be NULL | -| SQLDisconnect | Closes the database connection | (SQLHDBC ConnectionHandle) | ConnectionHandle: connection handle | -| SQLDriverConnect | Establishes a connection using a connection string | (SQLHDBC ConnectionHandle, SQLHWND WindowHandle, SQLCHAR \*InConnectionString, SQLSMALLINT StringLength1, SQLCHAR \*OutConnectionString, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength2, SQLUSMALLINT DriverCompletion) | ConnectionHandle: connection handle;<br>
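WindowHandle: window handle;<br>InConnectionString: input connection string;<br>StringLength1: input length;<br>OutConnectionString: output connection string;<br>BufferLength: output buffer length;<br>StringLength2: returned length;<br>DriverCompletion: connection prompt mode | -| SQLEndTran | Commits or rolls back a transaction | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT CompletionType) | HandleType: handle type;<br>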
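Handle: connection or environment handle;<br>CompletionType: whether to commit or roll back the transaction | -| SQLExecDirect | Executes an SQL statement directly | (SQLHSTMT StatementHandle, SQLCHAR \*StatementText, SQLINTEGER TextLength) | StatementHandle: statement handle;<br>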
StatementText: SQL text;<br>TextLength: SQL length | -| SQLFetch | Fetches the next row of the result set | (SQLHSTMT StatementHandle) | StatementHandle: statement handle | -| SQLFreeHandle | Frees an ODBC handle | (SQLSMALLINT HandleType, SQLHANDLE Handle) | HandleType: handle type;<br>
Handle: handle to free | -| SQLFreeStmt | Frees resources associated with a statement | (SQLHSTMT StatementHandle, SQLUSMALLINT Option) | StatementHandle: statement handle;<br>
Option: free option (close the cursor, reset parameters, etc.) | -| SQLGetConnectAttr | Gets a connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER \*StringLength) | ConnectionHandle: connection handle;<br>
Attribute: attribute ID;<br>Value: returned attribute value;<br>BufferLength: buffer length;<br>StringLength: returned length | -| SQLGetData | Gets result data | (SQLHSTMT StatementHandle, SQLUSMALLINT Col\_or\_Param\_Num, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN \*StrLen\_or\_Ind) | StatementHandle: statement handle;<br>
Col\_or\_Param\_Num: column number;<br>TargetType: C type;<br>TargetValue: data buffer;<br>BufferLength: buffer size;<br>StrLen\_or\_Ind: returned length or NULL indicator | -| SQLGetDiagField | Gets a diagnostic field | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLSMALLINT DiagIdentifier, SQLPOINTER DiagInfo, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength) | HandleType: handle type;<br>
Handle: handle;<br>RecNumber: record number;<br>DiagIdentifier: diagnostic field ID;<br>DiagInfo: output information;<br>BufferLength: buffer length;<br>StringLength: returned length | -| SQLGetDiagRec | Gets a diagnostic record | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLCHAR \*Sqlstate, SQLINTEGER \*NativeError, SQLCHAR \*MessageText, SQLSMALLINT BufferLength, SQLSMALLINT \*TextLength) | HandleType: handle type;<br>
Handle: handle;<br>RecNumber: record number;<br>Sqlstate: SQLSTATE code;<br>NativeError: native error code;<br>MessageText: error message;<br>BufferLength: buffer length;<br>TextLength: returned length | -| SQLGetInfo | Gets database information | (SQLHDBC ConnectionHandle, SQLUSMALLINT InfoType, SQLPOINTER InfoValue, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength) | ConnectionHandle: connection handle;<br>
InfoType: information type;<br>InfoValue: returned value;<br>BufferLength: buffer length;<br>StringLength: returned length | -| SQLGetStmtAttr | Gets a statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER \*StringLength) | StatementHandle: statement handle;<br>
Attribute: attribute ID;<br>Value: returned value;<br>BufferLength: buffer length;<br>StringLength: returned length | -| SQLGetTypeInfo | Gets data type information | (SQLHSTMT StatementHandle, SQLSMALLINT DataType) | StatementHandle: statement handle;<br>
DataType: SQL data type | -| SQLMoreResults | Gets more result sets | (SQLHSTMT StatementHandle) | StatementHandle: statement handle | -| SQLNumResultCols | Gets the number of columns in the result set | (SQLHSTMT StatementHandle, SQLSMALLINT \*ColumnCount) | StatementHandle: statement handle;<br>
ColumnCount: returned column count | -| SQLRowCount | Gets the number of affected rows | (SQLHSTMT StatementHandle, SQLLEN \*RowCount) | StatementHandle: statement handle;<br>
RowCount: returned number of affected rows | -| SQLSetConnectAttr | Sets a connection attribute | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | ConnectionHandle: connection handle;<br>
Attribute: attribute ID;<br>Value: attribute value;<br>StringLength: attribute value length | -| SQLSetEnvAttr | Sets an environment attribute | (SQLHENV EnvironmentHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | EnvironmentHandle: environment handle;<br>
Attribute: attribute ID;<br>Value: attribute value;<br>StringLength: length | -| SQLSetStmtAttr | Sets a statement attribute | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | StatementHandle: statement handle;<br>
Attribute: attribute ID;<br>Value: attribute value;<br>StringLength: length | -| SQLTables | Queries table information | (SQLHSTMT StatementHandle, SQLCHAR \*CatalogName, SQLSMALLINT NameLength1, SQLCHAR \*SchemaName, SQLSMALLINT NameLength2, SQLCHAR \*TableName, SQLSMALLINT NameLength3, SQLCHAR \*TableType, SQLSMALLINT NameLength4) | StatementHandle: statement handle;<br>
Catalog/Schema/TableName: catalog, schema, and table names to match;<br>TableType: table type;<br>
NameLength\*: length of the corresponding name | - -### 3.2 Data Type Conversion - -IoTDB data types map to ODBC standard data types as follows: - -| **IoTDB data type** | **ODBC data type** | -| -------------------------- | ------------------------- | -| BOOLEAN | SQL\_BIT | -| INT32 | SQL\_INTEGER | -| INT64 | SQL\_BIGINT | -| FLOAT | SQL\_REAL | -| DOUBLE | SQL\_DOUBLE | -| TEXT | SQL\_VARCHAR | -| STRING | SQL\_VARCHAR | -| BLOB | SQL\_LONGVARBINARY | -| TIMESTAMP | SQL\_BIGINT | -| DATE | SQL\_DATE | - -## 4. Examples - -This chapter presents operation examples for **C#**, **Python**, **C++**, **PowerBI**, and **Excel**, covering core operations such as data query, insertion, and deletion. - -### 4.1 C# Example - -```C# -/******* -Note: When the output contains Chinese characters, it may cause garbled text. -This is because the table.Write() function cannot output strings in UTF-8 encoding -and can only output using GB2312 (or another system default encoding). This issue -may not occur in software like Power BI; it also does not occur when using the -Console.WriteLine function. This is an issue with the ConsoleTables package. -*****/ -using System.Data.Common; -using System.Data.Odbc; -using System.Reflection.PortableExecutable; -using ConsoleTables; -using System; - -/// Runs a SELECT query and prints the contents of fulltable as a table -void Query(OdbcConnection dbConnection) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - dbCommand.CommandText = "select * from fulltable"; - using (OdbcDataReader dbReader = dbCommand.ExecuteReader()) - { - var fCount = dbReader.FieldCount; - Console.WriteLine($"fCount = {fCount}"); - // Build the table header - var columns = new string[fCount]; - for (var i = 0; i < fCount; i++) - { - var fName = dbReader.GetName(i); - if (fName.Contains('.')) - { - fName = fName.Substring(fName.LastIndexOf('.') + 1); - } - columns[i] = fName; - } - // Fill in the rows - var table = new ConsoleTable(columns); - while (dbReader.Read()) - { - var row = new object[fCount]; - for (var i = 0; i < fCount; i++) - { - if (dbReader.IsDBNull(i)) - { - row[i] = null; - continue; - } - row[i] = dbReader.GetValue(i); - } - table.AddRow(row); - } - 
table.Write(); - Console.WriteLine(); - } - } - } - catch (Exception ex) - { - Console.WriteLine(ex.ToString()); - } -} - -/// Runs a non-query SQL statement (such as CREATE DATABASE, CREATE TABLE, INSERT) -void Execute(OdbcConnection dbConnection, string command) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - try - { - dbCommand.CommandText = command; - Console.WriteLine($"Execute command: {command}"); - dbCommand.ExecuteNonQuery(); - } - catch (Exception ex) - { - Console.WriteLine($"CommandText error: {ex.Message}"); - } - } - } - catch (OdbcException ex) - { - Console.WriteLine($"Database error: {ex.Message}"); - } - catch (Exception ex) - { - Console.WriteLine($"Unknown error: {ex.Message}"); - } -} - -var dsn = "Apache IoTDB DSN"; -var user = "root"; -var password = "root"; -var server = "127.0.0.1"; -var database = "test"; -var connectionString = $"DSN={dsn};Server={server};UID={user};PWD={password};Database={database};loglevel=4"; - -using (OdbcConnection dbConnection = new OdbcConnection(connectionString)) -{ - Console.WriteLine($"Start"); - try - { - dbConnection.Open(); - } - catch (Exception ex) - { - Console.WriteLine($"Login failed: {ex.Message}"); - Console.WriteLine($"Stack Trace: {ex.StackTrace}"); - dbConnection.Dispose(); - return; - } - Console.WriteLine($"Successfully opened connection. driver = {dbConnection.Driver}"); - Execute(dbConnection, "CREATE DATABASE IF NOT EXISTS test"); - Execute(dbConnection, "use test"); - Console.WriteLine("use test Execute complete. 
Begin to setup fulltable."); - - Execute(dbConnection, "CREATE TABLE IF NOT EXISTS fullTable (time TIMESTAMP TIME, bool_col BOOLEAN FIELD, int32_col INT32 FIELD, int64_col INT64 FIELD, float_col FLOAT FIELD, double_col DOUBLE FIELD, text_col TEXT FIELD, string_col STRING FIELD, blob_col BLOB FIELD, timestamp_col TIMESTAMP FIELD, date_col DATE FIELD) WITH (TTL=315360000000)"); - string[] insertStatements = new string[] - { - "INSERT INTO fulltable VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689600000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689660000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689720000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, '设备温度偏高告警', '设备A-机房1', '0x506C616E7444617462', 1735689780000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, '设备状态恢复正常', '设备A-机房1', '0x506C616E7444617461', 1735689840000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, '设备运行状态正常', '设备B-机房2', '0x506C616E7444617463', 1735689900000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, '设备运行状态正常', '设备B-机房2', '0x506C616E7444617463', 1735689960000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, '设备湿度偏低告警', '设备B-机房2', '0x506C616E7444617464', 1735690020000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, '设备状态恢复正常', '设备B-机房2', '0x506C616E7444617463', 1735690080000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690140000, false, 109, 10000000009, 37.4, 
129.589, '设备运行状态正常', '设备C-机房3', '0x506C616E7444617465', 1735690140000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, '设备运行状态正常', '设备C-机房3', '0x506C616E7444617465', 1735690200000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, '设备电压不稳告警', '设备C-机房3', '0x506C616E7444617466', 1735690260000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, '设备状态恢复正常', '设备C-机房3', '0x506C616E7444617465', 1735690320000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690380000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690440000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690500000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, '设备信号中断告警', '设备D-机房4', '0x506C616E7444617468', 1735690560000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690620000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690680000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690740000, '2026-01-04')" - }; - foreach (var insert in insertStatements) - { - Execute(dbConnection, insert); - } - 
Console.WriteLine("fulltable setup complete. Begin to query."); - Query(dbConnection); // Run the query and print the results -} - -``` - -### 4.2 Python Example - -1. Accessing ODBC from Python requires the pyodbc package: - -```Plain -pip install pyodbc -``` - -2. Complete code - -```Python -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -""" -Apache IoTDB ODBC Python example -Use pyodbc to connect to the IoTDB ODBC driver and perform operations such as query and insert. -For reference, see examples/cpp-example/test.cpp and examples/BasicTest/BasicTest/Program.cs -""" - -import pyodbc - -def execute(conn: pyodbc.Connection, command: str) -> None: - """Execute a non-query SQL statement (e.g. USE, CREATE, INSERT, DELETE).""" - try: - with conn.cursor() as cursor: - cursor.execute(command) - # INSERT/UPDATE/DELETE require commit; session commands such as USE do not. - cmd_upper = command.strip().upper() - if cmd_upper.startswith(("INSERT", "UPDATE", "DELETE")): - conn.commit() - print(f"Execute command: {command}") - except pyodbc.Error as ex: - print(f"CommandText error: {ex}") - -def query(conn: pyodbc.Connection, sql: str) -> None: - """Execute a SELECT query and print the results as a table.""" - try: - with conn.cursor() as cursor: - cursor.execute(sql) - # cursor.description is None when the statement returns no result set. - col_count = len(cursor.description) if cursor.description else 0 - print(f"fCount = {col_count}") - - if col_count <= 0: - return - - # Get column names (if the name contains '.', take the last segment, consistent with C++/C# samples). - columns = [] - for i in range(col_count): - col_name = cursor.description[i][0] or f"Column{i}" - if "." 
in str(col_name): - col_name = str(col_name).split(".")[-1] - columns.append(str(col_name)) - - # Fetch data rows - rows = cursor.fetchall() - - # Simple table output - col_widths = [max(len(str(col)), 4) for col in columns] - for i, row in enumerate(rows): - for j, val in enumerate(row): - if j < len(col_widths): - col_widths[j] = max(col_widths[j], len(str(val) if val is not None else "NULL")) - - # Print header - header = " | ".join(str(c).ljust(col_widths[i]) for i, c in enumerate(columns)) - print(header) - print("-" * len(header)) - - # Print data rows - for row in rows: - values = [] - for i, val in enumerate(row): - if val is None: - cell = "NULL" - else: - cell = str(val) - values.append(cell.ljust(col_widths[i]) if i < len(col_widths) else cell) - print(" | ".join(values)) - - print() - - except pyodbc.Error as ex: - print(f"Query error: {ex}") - -def main() -> None: - dsn = "Apache IoTDB DSN" - user = "root" - password = "root" - server = "127.0.0.1" - database = "test" - connection_string = ( - f"DSN={dsn};Server={server};UID={user};PWD={password};" - f"Database={database};loglevel=4" - ) - - print("Start") - - try: - conn = pyodbc.connect(connection_string) - except pyodbc.Error as ex: - print(f"Login failed: {ex}") - return - - try: - driver_name = conn.getinfo(6) # SQL_DRIVER_NAME - print(f"Successfully opened connection. driver = {driver_name}") - except Exception: - print("Successfully opened connection.") - - try: - execute(conn, "CREATE DATABASE IF NOT EXISTS test") - execute(conn, "use test") - print("use test Execute complete. 
Begin to setup fulltable.") - - # Create the fulltable table and insert test data - execute( - conn, - "CREATE TABLE IF NOT EXISTS fullTable (time TIMESTAMP TIME, bool_col BOOLEAN FIELD, " - "int32_col INT32 FIELD, int64_col INT64 FIELD, float_col FLOAT FIELD, " - "double_col DOUBLE FIELD, text_col TEXT FIELD, string_col STRING FIELD, " - "blob_col BLOB FIELD, timestamp_col TIMESTAMP FIELD, date_col DATE FIELD) " - "WITH (TTL=315360000000)", - ) - insert_statements = [ - "INSERT INTO fulltable VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689600000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689660000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689720000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, '设备温度偏高告警', '设备A-机房1', '0x506C616E7444617462', 1735689780000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, '设备状态恢复正常', '设备A-机房1', '0x506C616E7444617461', 1735689840000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, '设备运行状态正常', '设备B-机房2', '0x506C616E7444617463', 1735689900000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, '设备运行状态正常', '设备B-机房2', '0x506C616E7444617463', 1735689960000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, '设备湿度偏低告警', '设备B-机房2', '0x506C616E7444617464', 1735690020000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, '设备状态恢复正常', '设备B-机房2', '0x506C616E7444617463', 1735690080000, '2026-01-04')", - "INSERT INTO fulltable VALUES 
(1735690140000, false, 109, 10000000009, 37.4, 129.589, '设备运行状态正常', '设备C-机房3', '0x506C616E7444617465', 1735690140000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, '设备运行状态正常', '设备C-机房3', '0x506C616E7444617465', 1735690200000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, '设备电压不稳告警', '设备C-机房3', '0x506C616E7444617466', 1735690260000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, '设备状态恢复正常', '设备C-机房3', '0x506C616E7444617465', 1735690320000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690380000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690440000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690500000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, '设备信号中断告警', '设备D-机房4', '0x506C616E7444617468', 1735690560000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690620000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690680000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - ] - for insert_sql in insert_statements: - 
execute(conn, insert_sql) - print("fulltable setup complete. Begin to query.") - query(conn, "select * from fulltable") - print("Query ok") - finally: - conn.close() - -if __name__ == "__main__": - main() -``` - -### 4.3 C++ 示例 - -```C++ -#define WIN32_LEAN_AND_MEAN -#include - -#include -#include -#include -#include -#include -#include -#include - -#ifndef SQL_DIAG_COLUMN_SIZE -#define SQL_DIAG_COLUMN_SIZE 33L -#endif - -// 错误处理函数(保持核心功能) -void CheckOdbcError(SQLRETURN retCode, SQLSMALLINT handleType, SQLHANDLE handle, const char* functionName) { - if (retCode == SQL_SUCCESS || retCode == SQL_SUCCESS_WITH_INFO) { - return; - } - - SQLCHAR sqlState[6]; - SQLCHAR message[SQL_MAX_MESSAGE_LENGTH]; - SQLINTEGER nativeError; - SQLSMALLINT textLength; - SQLRETURN errRet; - errRet = SQLGetDiagRec(handleType, handle, 1, sqlState, &nativeError, message, sizeof(message), &textLength); - - std::cerr << "ODBC Error in " << functionName << ":\n"; - std::cerr << " SQL State: " << sqlState << "\n"; - std::cerr << " Native Error: " << nativeError << "\n"; - std::cerr << " Message: " << message << "\n"; - std::cerr << " SQLGetDiagRec Return: " << errRet << "\n"; - - if (retCode == SQL_ERROR || retCode == SQL_INVALID_HANDLE) { - exit(1); - } -} - -// 简化版表格输出 - 仅展示基本数据 -void PrintSimpleTable(const std::vector& headers, - const std::vector>& rows) { - // 打印表头 - for (size_t i = 0; i < headers.size(); i++) { - std::cout << headers[i]; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - // 打印分隔线 - for (size_t i = 0; i < headers.size(); i++) { - std::cout << "----------------"; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - // 打印数据行 - for (const auto& row : rows) { - for (size_t i = 0; i < row.size(); i++) { - std::cout << row[i]; - if (i < row.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - } - std::cout << std::endl; -} - -/// 执行 SELECT 查询并以表格形式输出 fulltable 的结果 -void Query(SQLHDBC hDbc) { - SQLHSTMT hStmt 
= SQL_NULL_HSTMT; - SQLRETURN ret = SQL_SUCCESS; - - try { - // 分配语句句柄 - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - return; - } - - // 执行查询 - const std::string sqlQuery = "select * from fulltable"; - std::cout << "Execute query: " << sqlQuery << std::endl; - - ret = SQLExecDirect(hStmt, reinterpret_cast(const_cast(sqlQuery.c_str())), SQL_NTS); - if (!SQL_SUCCEEDED(ret)) { - if (ret != SQL_NO_DATA) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect(SELECT)"); - } - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - // 获取列数量 - SQLSMALLINT colCount = 0; - ret = SQLNumResultCols(hStmt, &colCount); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLNumResultCols"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::cout << "Column count = " << colCount << std::endl; - - // 如果没有列,直接返回 - if (colCount <= 0) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - // 获取列名和类型信息 - std::vector columnNames; - std::vector columnTypes(colCount); - std::vector columnSizes(colCount); - std::vector decimalDigits(colCount); - std::vector nullable(colCount); - - // Get basic column information - for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLSMALLINT nameLength = 0; - ret = SQLDescribeCol(hStmt, i, NULL, 0, &nameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get length)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::vector colNameBuffer(nameLength + 1); - SQLSMALLINT actualNameLength = 0; - - ret = SQLDescribeCol(hStmt, i, colNameBuffer.data(), nameLength + 1, - &actualNameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get name)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::string 
fullName(reinterpret_cast(colNameBuffer.data())); - - size_t pos = fullName.find_last_of('.'); - if (pos != std::string::npos) { - columnNames.push_back(fullName.substr(pos + 1)); - } else { - columnNames.push_back(fullName); - } - - ret = SQLDescribeCol(hStmt, i, NULL, 0, NULL, &columnTypes[i-1], - &columnSizes[i-1], &decimalDigits[i-1], &nullable[i-1]); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get type info)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - } - - std::vector> tableRows; - - int rowCount = 0; - // Get data front every row - while (true) { - ret = SQLFetch(hStmt); - if (ret == SQL_NO_DATA) { - break; - } - - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLFetch"); - break; - } - - std::vector row; - - for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLLEN indicator = 0; - std::string valueStr; - - SQLSMALLINT cType; - size_t bufferSize; - bool isCharacterType = false; - const int maxBufferSize = 32768; - - switch (columnTypes[i-1]) { - case SQL_CHAR: - case SQL_VARCHAR: - case SQL_LONGVARCHAR: - case SQL_WCHAR: - case SQL_WVARCHAR: - case SQL_WLONGVARCHAR: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_DECIMAL: - case SQL_NUMERIC: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_INTEGER: - case SQL_SMALLINT: - case SQL_TINYINT: - case SQL_BIGINT: - cType = SQL_C_SBIGINT; - bufferSize = sizeof(SQLBIGINT); - break; - - case SQL_REAL: - case SQL_FLOAT: - case SQL_DOUBLE: - cType = SQL_C_DOUBLE; - bufferSize = sizeof(double); - break; - - case SQL_BIT: - cType = SQL_C_BIT; - bufferSize = sizeof(SQLCHAR); - break; - - case 
SQL_DATE: - case SQL_TYPE_DATE: - cType = SQL_C_DATE; - bufferSize = sizeof(SQL_DATE_STRUCT); - break; - - case SQL_TIME: - case SQL_TYPE_TIME: - cType = SQL_C_TIME; - bufferSize = sizeof(SQL_TIME_STRUCT); - break; - - case SQL_TIMESTAMP: - case SQL_TYPE_TIMESTAMP: - cType = SQL_C_TIMESTAMP; - bufferSize = sizeof(SQL_TIMESTAMP_STRUCT); - break; - - default: - cType = SQL_C_CHAR; - bufferSize = 256; - isCharacterType = true; - break; - } - - std::vector buffer(bufferSize); - - ret = SQLGetData(hStmt, i, cType, buffer.data(), bufferSize, &indicator); - - if (indicator == SQL_NULL_DATA) { - valueStr = "NULL"; - } - else if (ret != SQL_SUCCESS) { - valueStr = "ERR_CONV"; - } - else { - if (cType == SQL_C_CHAR) { - valueStr = reinterpret_cast(buffer.data()); - } - else if (cType == SQL_C_SBIGINT) { - SQLBIGINT intVal = *reinterpret_cast(buffer.data()); - valueStr = std::to_string(intVal); - } - else if (cType == SQL_C_DOUBLE) { - double doubleVal = *reinterpret_cast(buffer.data()); - valueStr = std::to_string(doubleVal); - } - else if (cType == SQL_C_BIT) { - valueStr = (*buffer.data() != 0) ? 
"TRUE" : "FALSE"; - } - else if (cType == SQL_C_DATE) { - SQL_DATE_STRUCT* date = reinterpret_cast(buffer.data()); - char dateStr[20]; - snprintf(dateStr, sizeof(dateStr), "%04d-%02d-%02d", - date->year, date->month, date->day); - valueStr = dateStr; - } - else if (cType == SQL_C_TIME) { - SQL_TIME_STRUCT* time = reinterpret_cast(buffer.data()); - char timeStr[15]; - snprintf(timeStr, sizeof(timeStr), "%02d:%02d:%02d", - time->hour, time->minute, time->second); - valueStr = timeStr; - } - else if (cType == SQL_C_TIMESTAMP) { - SQL_TIMESTAMP_STRUCT* ts = reinterpret_cast(buffer.data()); - char tsStr[30]; - snprintf(tsStr, sizeof(tsStr), "%04d-%02d-%02d %02d:%02d:%02d.%06d", - ts->year, ts->month, ts->day, - ts->hour, ts->minute, ts->second, - ts->fraction / 1000); - valueStr = tsStr; - } - else { - valueStr = "UNKNOWN_TYPE"; - } - - if (isCharacterType && ret == SQL_SUCCESS_WITH_INFO) { - SQLLEN actualSize = 0; - SQLGetDiagField(SQL_HANDLE_STMT, hStmt, 0, SQL_DIAG_COLUMN_SIZE, - &actualSize, SQL_IS_INTEGER, NULL); - - if (indicator > 0 && static_cast(indicator) > bufferSize - 1) { - valueStr += "..."; - } - } - - } - - row.push_back(valueStr); - } - - tableRows.push_back(row); - } - - if (!tableRows.empty()) { - PrintSimpleTable(columnNames, tableRows); - } - - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (const std::exception& ex) { - std::cerr << "Exception: " << ex.what() << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } - catch (...) 
{ - std::cerr << "Unknown exception occurred" << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -/// 执行非查询 SQL 语句(如 CREATE DATABASE、CREATE TABLE、INSERT 等) -void Execute(SQLHDBC hDbc, const std::string& command) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN ret; - - try { - // 分配语句句柄 - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - - // 执行命令 - ret = SQLExecDirect(hStmt, (SQLCHAR*)command.c_str(), SQL_NTS); - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect"); - } - - // 释放语句句柄 - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (...) { - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -int main() { - SQLHENV hEnv = SQL_NULL_HENV; - SQLHDBC hDbc = SQL_NULL_HDBC; - SQLRETURN ret; - - try { - std::cout << "Start" << std::endl; - - // 1. 初始化ODBC环境 - ret = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &hEnv); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_ENV)"); - - ret = SQLSetEnvAttr(hEnv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLSetEnvAttr"); - - // 2. 
Establish the connection - ret = SQLAllocHandle(SQL_HANDLE_DBC, hEnv, &hDbc); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_DBC)"); - - // Connection string - std::string dsn = "Apache IoTDB DSN"; - std::string user = "root"; - std::string password = "root"; - std::string server = "127.0.0.1"; - std::string database = "test"; - - std::string connectionString = "DSN=" + dsn + ";Server=" + server + - ";UID=" + user + ";PWD=" + password + - ";Database=" + database + ";loglevel=4"; - std::cout << "Using connection string: " << connectionString << std::endl; - - SQLCHAR outConnStr[1024]; - SQLSMALLINT outConnStrLen; - - ret = SQLDriverConnect(hDbc, NULL, - (SQLCHAR*)connectionString.c_str(), SQL_NTS, - outConnStr, sizeof(outConnStr), - &outConnStrLen, SQL_DRIVER_COMPLETE); - - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - std::cerr << "Login failed" << std::endl; - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLDriverConnect"); - return 1; - } - - // Get the driver name - SQLCHAR driverName[256]; - SQLSMALLINT nameLength; - ret = SQLGetInfo(hDbc, SQL_DRIVER_NAME, driverName, sizeof(driverName), &nameLength); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLGetInfo"); - - std::cout << "Successfully opened connection. driver name = " << driverName << std::endl; - - // 3. Run the operations - Execute(hDbc, "CREATE DATABASE IF NOT EXISTS test"); - Execute(hDbc, "use test"); - std::cout << "use test Execute complete. Begin to setup fulltable." 
<< std::endl; - - // 创建 fulltable 表并插入测试数据 - Execute(hDbc, "CREATE TABLE IF NOT EXISTS fullTable (time TIMESTAMP TIME, bool_col BOOLEAN FIELD, int32_col INT32 FIELD, int64_col INT64 FIELD, float_col FLOAT FIELD, double_col DOUBLE FIELD, text_col TEXT FIELD, string_col STRING FIELD, blob_col BLOB FIELD, timestamp_col TIMESTAMP FIELD, date_col DATE FIELD) WITH (TTL=315360000000)"); - const char* insertStatements[] = { - "INSERT INTO fulltable VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689600000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689660000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, '设备运行状态正常', '设备A-机房1', '0x506C616E7444617461', 1735689720000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, '设备温度偏高告警', '设备A-机房1', '0x506C616E7444617462', 1735689780000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, '设备状态恢复正常', '设备A-机房1', '0x506C616E7444617461', 1735689840000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, '设备运行状态正常', '设备B-机房2', '0x506C616E7444617463', 1735689900000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, '设备运行状态正常', '设备B-机房2', '0x506C616E7444617463', 1735689960000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, '设备湿度偏低告警', '设备B-机房2', '0x506C616E7444617464', 1735690020000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, '设备状态恢复正常', '设备B-机房2', '0x506C616E7444617463', 1735690080000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, 
'设备运行状态正常', '设备C-机房3', '0x506C616E7444617465', 1735690140000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, '设备运行状态正常', '设备C-机房3', '0x506C616E7444617465', 1735690200000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, '设备电压不稳告警', '设备C-机房3', '0x506C616E7444617466', 1735690260000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, '设备状态恢复正常', '设备C-机房3', '0x506C616E7444617465', 1735690320000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690380000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690440000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, '设备运行状态正常', '设备D-机房4', '0x506C616E7444617467', 1735690500000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, '设备信号中断告警', '设备D-机房4', '0x506C616E7444617468', 1735690560000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690620000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690680000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690740000, '2026-01-04')", - "INSERT INTO fulltable VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', '0x506C616E7444617469', 1735690740000, '2026-01-04')" - }; - for (const char* sql : insertStatements) { - Execute(hDbc, sql); - } - std::cout << "fulltable 
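setup complete. Begin to query." << std::endl; - Query(hDbc); - std::cout << "Query ok" << std::endl; - - // 4. Clean up resources - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - - return 0; - } - catch (...) { - // Clean up after an exception - if (hDbc != SQL_NULL_HDBC) { - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - } - if (hEnv != SQL_NULL_HENV) { - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - } - - std::cerr << "Unexpected error!" << std::endl; - return 1; - } -} -``` - -### 4.4 PowerBI Example - -1. Open PowerBI Desktop and create a new project; -2. Click "Home" → "Get data" → "More..." → "ODBC", then click the "Connect" button; -3. Data source selection: in the dialog that appears, choose "Data source name (DSN)" and select `Apache IoTDB DSN` from the drop-down list; -4. Advanced configuration: - -* Click "Advanced options" and enter the configuration in the "Connection string" box (sample): - -```Plain -server=127.0.0.1;port=6667;database=test;isTableModel=true;loglevel=4 -``` - -* Notes: - - * The `dsn` item is optional; the connection works whether or not it is specified; - * `loglevel` ranges from 0 to 4: level 0 (ERROR) logs the least and level 4 (TRACE) logs the most detail; set it as needed; - `server/database/dsn/loglevel` are case-insensitive (for example, they can be written as `Server/DATABASE`); - * If the relevant settings are already configured in the DSN, no configuration needs to be entered here; the driver manager automatically uses the settings stored in the DSN. - -5. Authentication: enter the username (default `root`) and password (default `root`), then click "Connect"; -6. Data loading: select the tables you need (e.g. `fulltable/table1`) and click "Load" to view the data. - -### 4.5 Excel Example - -1. Open Excel and create a blank workbook; -2. Click the "Data" tab → "From Other Sources" → "From Data Connection Wizard"; -3. Data source selection: choose "ODBC DSN" → Next → select `Apache IoTDB DSN` → Next; -4. Connection configuration: -* The connection string, username, and password are entered exactly as in PowerBI. Connection string format for reference: - -```Plain -server=127.0.0.1;port=6667;database=test;isTableModel=true;loglevel=4 -``` - -* If the relevant settings are already configured in the DSN, no configuration needs to be entered here; the driver manager automatically uses the settings stored in the DSN. -5. Table selection: select the database and table to access (e.g. fulltable), then click "Next"; -6. Save the connection: customize the data connection file name, connection description, and other details, then click "Finish"; -7. Import data: choose where to place the data in the worksheet (e.g. cell A1 of "Existing worksheet") and click "OK" to finish loading the data. diff --git a/src/zh/UserGuide/latest-Table/API/Programming-Python-Native-API_timecho.md b/src/zh/UserGuide/latest-Table/API/Programming-Python-Native-API_timecho.md deleted file mode 100644 index b927a1827..000000000 --- a/src/zh/UserGuide/latest-Table/API/Programming-Python-Native-API_timecho.md +++ /dev/null @@ -1,757 +0,0 @@ - - -# Python Native API - -## 1. 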
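Usage - -Install the dependency package: - -```shell -pip3 install "apache-iotdb>=2.0" -``` -Note: Do not use a newer client version to connect to an older server version. - -## 2. Read and Write Operations - -### 2.1 TableSession - -#### 2.1.1 Description - -TableSession is a core IoTDB class for interacting with the IoTDB database. Through this class, users can execute SQL statements, insert data, and manage database sessions. - -#### 2.1.2 Method List - -| **Method** | **Description** | **Parameter Type** | **Return Type** | -| --------------------------- | ---------------------------------- | ---------------------------------- | -------------- | -| insert | Writes data | tablet: Union[Tablet, NumpyTablet] | None | -| execute_non_query_statement | Executes a non-query SQL statement, such as DDL or DML | sql: str | None | -| execute_query_statement | Executes a query SQL statement and returns the result set | sql: str | SessionDataSet | -| close | Closes the session and releases resources | None | None | - -Since V2.0.8.2, SessionDataSet provides methods for fetching DataFrames in batches, for efficiently handling queries that return large amounts of data: - -```python -# Fetch DataFrames in batches -has_next = result.has_next_df() -if has_next: - df = result.next_df() - # Process the DataFrame -``` - -**Method notes:** -- `has_next_df()`: returns `True`/`False`, indicating whether more data is available -- `next_df()`: returns a `DataFrame` or `None`; each call returns `fetchSize` rows (5000 by default, controlled by the Session's `fetch_size` parameter) - - When the remaining data is ≥ `fetchSize` rows, returns `fetchSize` rows - - When the remaining data is < `fetchSize` rows, returns all remaining rows - - When all data has been traversed, returns `None` -- `fetchSize` is checked when the Session is initialized; if it is ≤ 0, it is reset to 5000 and a warning is logged - -**Note:** Do not mix different traversal methods (for example, using the todf function together with next_df); doing so causes unexpected errors. - - -Since V2.0.8.3, the Python client supports `TSDataType.OBJECT` in `Tablet` batch writes and in `Session` value serialization, and query results are read through `Field`. The related interfaces are defined as follows: - -| Function | Purpose | Parameters | Return Value | -| ------------------------------------- | ------------------------------------------------------------ | --------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- | -| encode\_object\_cell | Encodes one OBJECT cell into wire-format bytes | is\_eof: bool, offset: int, content: bytes | bytes:\|[eof 1B]\|[offset 8B BE]\|[payload]\| | -| decode\_object\_cell | Parses one wire-format cell back into eof, offset, and payload | cell: bytes (length ≥ 9) | Tuple[bool, int, bytes]: (is\_eof, offset, payload) | -| Tablet.add\_value\_object | Writes one OBJECT cell at the given row and column (calls encode\_object\_cell internally) | 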
row\_index: int, column\_index: int, is\_eof: bool, offset: int, content: bytes | None | -| Tablet.add\_value\_object\_by\_name | Same as above, but locates the column by name | column\_name: str, row\_index: int, is\_eof: bool, offset: int, content: bytes | None | -| NumpyTablet.add\_value\_object | Same semantics as Tablet.add\_value\_object; the column data is an ndarray | Same as Tablet.add\_value\_object (row\_index, column\_index, …) | None | -| Field.get\_object\_value | Converts value to a Python value according to the target type | data\_type: TSDataType | Depends on the type: for OBJECT, the str obtained by decoding all of self.value as UTF-8 (see [Field.py](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/iotdb/utils/Field.py)) | -| Field.get\_string\_value | Returns a string representation | None | str; for OBJECT, self.value.decode("utf-8") | -| Field.get\_binary\_value | Returns the binary value of TEXT/STRING/BLOB | None | bytes or None; raises an error for OBJECT columns and should not be called on them | - - -#### 2.1.3 Interface - -**TableSession:** - - -```Python -class TableSession(object): -def insert(self, tablet: Union[Tablet, NumpyTablet]): - """ - Insert data into the database. - - Parameters: - tablet (Tablet | NumpyTablet): The tablet containing the data to be inserted. - Accepts either a `Tablet` or `NumpyTablet`. - - Raises: - IoTDBConnectionException: If there is an issue with the database connection. - """ - pass - -def execute_non_query_statement(self, sql: str): - """ - Execute a non-query SQL statement. - - Parameters: - sql (str): The SQL statement to execute. Typically used for commands - such as INSERT, DELETE, or UPDATE. - - Raises: - IoTDBConnectionException: If there is an issue with the database connection. - """ - pass - -def execute_query_statement(self, sql: str, timeout_in_ms: int = 0) -> "SessionDataSet": - """ - Execute a query SQL statement and return the result set. - - Parameters: - sql (str): The SQL query to execute. - timeout_in_ms (int, optional): Timeout for the query in milliseconds. Defaults to 0, - which means no timeout. - - Returns: - SessionDataSet: The result set of the query. - - Raises: - IoTDBConnectionException: If there is an issue with the database connection. 
- """ - pass - -def close(self): - """ - Close the session and release resources. - - Raises: - IoTDBConnectionException: If there is an issue closing the connection. - """ - pass -``` - -### 2.2 TableSessionConfig - -#### 2.2.1 Description - -TableSessionConfig is a configuration class used to set up and create TableSession instances. It defines the parameters required to connect to the IoTDB database. - -#### 2.2.2 Configuration Options - -| **Option** | **Description** | **Type** | **Default** | -| ------------------ | ------------------------- | -------- |-----------------------------------------| -| node_urls | List of node URLs for the database connection | list | ["localhost:6667"] | -| username | Username for the database connection | str | "root" | -| password | Password for the database connection | str | "TimechoDB@2021" (before V2.0.6.x the default password was root) | -| database | Target database to connect to | str | None | -| fetch_size | Number of rows fetched per query | int | 5000 | -| time_zone | Default time zone of the session | str | Session.DEFAULT_ZONE_ID | -| enable_compression | Whether to enable data compression | bool | False | - -#### 2.2.3 Interface - -```Python -class TableSessionConfig(object): - """ - Configuration class for a TableSession. - - This class defines various parameters for connecting to and interacting - with the IoTDB tables. - """ - - def __init__( - self, - node_urls: list = None, - username: str = Session.DEFAULT_USER, - password: str = Session.DEFAULT_PASSWORD, - database: str = None, - fetch_size: int = 5000, - time_zone: str = Session.DEFAULT_ZONE_ID, - enable_compression: bool = False, - ): - """ - Initialize a TableSessionConfig object with the provided parameters. - - Parameters: - node_urls (list, optional): A list of node URLs for the database connection. - Defaults to ["localhost:6667"]. - username (str, optional): The username for the database connection. - Defaults to "root". - password (str, optional): The password for the database connection. - Defaults to "TimechoDB@2021" (before V2.0.6.x the default password was root). - database (str, optional): The target database to connect to. Defaults to None. - fetch_size (int, optional): The number of rows to fetch per query. Defaults to 5000. - time_zone (str, optional): The default time zone for the session. 
- Defaults to Session.DEFAULT_ZONE_ID. - enable_compression (bool, optional): Whether to enable data compression. - Defaults to False. - """ -``` - -**Notes:** - -After you finish using a TableSession, always call its close method to release resources. - -## 3. Client Connection Pool - -### 3.1 TableSessionPool - -#### 3.1.1 Description - -TableSessionPool is a session pool management class that manages the creation and destruction of TableSession instances. It provides methods for obtaining a session from the pool and for closing the pool. - -#### 3.1.2 Method List - -| **Method** | **Description** | **Return Type** | **Exceptions** | -| ------------ | ---------------------------------------- | ------------ | -------- | -| get_session | Retrieves a new TableSession instance from the session pool | TableSession | None | -| close | Closes the session pool and releases all resources | None | None | - -#### 3.1.3 Interface - -**TableSessionPool:** - -```Python -def get_session(self) -> TableSession: - """ - Retrieve a new TableSession instance. - - Returns: - TableSession: A new session object configured with the session pool. - - Notes: - The session is initialized with the underlying session pool for managing - connections. Ensure proper usage of the session's lifecycle. - """ - -def close(self): - """ - Close the session pool and release all resources. - - This method closes the underlying session pool, ensuring that all - resources associated with it are properly released. - - Notes: - After calling this method, the session pool cannot be used to retrieve - new sessions, and any attempt to do so may raise an exception. 
- """ -``` - -### 3.2 TableSessionPoolConfig - -#### 3.2.1 Description - -TableSessionPoolConfig is a configuration class used to set up and create TableSessionPool instances. It defines the parameters required to initialize and manage an IoTDB database session pool. - -#### 3.2.2 Configuration Options - -| **Option** | **Description** | **Type** | **Default** | -| ------------------ | ------------------------------ | -------- | ------------------------ | -| node_urls | List of node URLs for the database connection | list | None | -| max_pool_size | Maximum number of sessions in the pool | int | 5 | -| username | Username for the database connection | str | Session.DEFAULT_USER | -| password | Password for the database connection | str | Session.DEFAULT_PASSWORD | -| database | Target database to connect to | str | None | -| fetch_size | Number of rows fetched per query | int | 5000 | -| time_zone | Default time zone of the session pool | str | Session.DEFAULT_ZONE_ID | -| enable_redirection | Whether to enable redirection | bool | False | -| enable_compression | Whether to enable data compression | bool | False | -| wait_timeout_in_ms | Maximum time to wait for a session to become available (milliseconds) | int | 10000 | -| max_retry | Maximum number of retry attempts for an operation | int | 3 | - -#### 3.2.3 Interface - - -```Python -class TableSessionPoolConfig(object): - """ - Configuration class for a TableSessionPool. - - This class defines the parameters required to initialize and manage - a session pool for interacting with the IoTDB database. - """ - def __init__( - self, - node_urls: list = None, - max_pool_size: int = 5, - username: str = Session.DEFAULT_USER, - password: str = Session.DEFAULT_PASSWORD, - database: str = None, - fetch_size: int = 5000, - time_zone: str = Session.DEFAULT_ZONE_ID, - enable_redirection: bool = False, - enable_compression: bool = False, - wait_timeout_in_ms: int = 10000, - max_retry: int = 3, - ): - """ - Initialize a TableSessionPoolConfig object with the provided parameters. - - Parameters: - node_urls (list, optional): A list of node URLs for the database connection. - Defaults to None. - max_pool_size (int, optional): The maximum number of sessions in the pool. - Defaults to 5. - username (str, optional): The username for the database connection. - Defaults to Session.DEFAULT_USER. - password (str, optional): The password for the database connection. - Defaults to Session.DEFAULT_PASSWORD. 
- database (str, optional): The target database to connect to. Defaults to None. - fetch_size (int, optional): The number of rows to fetch per query. Defaults to 5000. - time_zone (str, optional): The default time zone for the session pool. - Defaults to Session.DEFAULT_ZONE_ID. - enable_redirection (bool, optional): Whether to enable redirection. - Defaults to False. - enable_compression (bool, optional): Whether to enable data compression. - Defaults to False. - wait_timeout_in_ms (int, optional): The maximum time (in milliseconds) to wait for a session - to become available. Defaults to 10000. - max_retry (int, optional): The maximum number of retry attempts for operations. Defaults to 3. - - """ -``` -### 3.3 SSL 连接 - -#### 3.3.1 服务器端配置证书 - -`conf/iotdb-system.properties` 配置文件中查找或添加以下配置项: - -``` -enable_thrift_ssl=true -key_store_path=/path/to/your/server_keystore.jks -key_store_pwd=your_keystore_password -``` - -#### 3.3.2 配置 python 客户端证书 - -- 设置 use_ssl 为 True 以启用 SSL。 -- 指定客户端证书路径,使用 ca_certs 参数。 - -``` -use_ssl = True -ca_certs = "/path/to/your/server.crt" # 或 ca_certs = "/path/to/your//ca_cert.pem" -``` -**示例代码:使用 SSL 连接 IoTDB** - -```Python -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, -# software distributed under the License is distributed on an -# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -# KIND, either express or implied. See the License for the -# specific language governing permissions and limitations -# under the License. 
-#
-
-from iotdb.SessionPool import PoolConfig, SessionPool
-from iotdb.Session import Session
-
-ip = "127.0.0.1"
-port_ = "6667"
-username_ = "root"
-password_ = "TimechoDB@2021"  # before V2.0.6.x the default password was "root"
-# Configure SSL enabled
-use_ssl = True
-# Configure certificate path
-ca_certs = "/path/server.crt"
-
-
-def get_data():
-    session = Session(
-        ip, port_, username_, password_, use_ssl=use_ssl, ca_certs=ca_certs
-    )
-    session.open(False)
-    with session.execute_query_statement("SHOW DATABASES") as session_data_set:
-        print(session_data_set.get_column_names())
-        while session_data_set.has_next():
-            print(session_data_set.next())
-
-    session.close()
-
-
-def get_data2():
-    pool_config = PoolConfig(
-        host=ip,
-        port=port_,
-        user_name=username_,
-        password=password_,
-        fetch_size=1024,
-        time_zone="UTC+8",
-        max_retry=3,
-        use_ssl=use_ssl,
-        ca_certs=ca_certs,
-    )
-    max_pool_size = 5
-    wait_timeout_in_ms = 3000
-    session_pool = SessionPool(pool_config, max_pool_size, wait_timeout_in_ms)
-    session = session_pool.get_session()
-    with session.execute_query_statement("SHOW DATABASES") as session_data_set:
-        print(session_data_set.get_column_names())
-        while session_data_set.has_next():
-            print(session_data_set.next())
-    session_pool.put_back(session)
-    session_pool.close()
-
-
-if __name__ == "__main__":
-    get_data()
-```
-
-## 4. Sample Code
-
-Session sample code: [Session Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/table_model_session_example.py)
-
-SessionPool sample code: [SessionPool Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/table_model_session_pool_example.py)
-
-```Python
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.
The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, -# software distributed under the License is distributed on an -# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -# KIND, either express or implied. See the License for the -# specific language governing permissions and limitations -# under the License. -# -import threading - -import numpy as np - -from iotdb.table_session_pool import TableSessionPool, TableSessionPoolConfig -from iotdb.utils.IoTDBConstants import TSDataType -from iotdb.utils.NumpyTablet import NumpyTablet -from iotdb.utils.Tablet import ColumnType, Tablet - - -def prepare_data(): - print("create database") - # Get a session from the pool - session = session_pool.get_session() - session.execute_non_query_statement("CREATE DATABASE IF NOT EXISTS db1") - session.execute_non_query_statement('USE "db1"') - session.execute_non_query_statement( - "CREATE TABLE table0 (id1 string tag, attr1 string attribute, " - + "m1 double " - + "field)" - ) - session.execute_non_query_statement( - "CREATE TABLE table1 (id1 string tag, attr1 string attribute, " - + "m1 double " - + "field)" - ) - - print("now the tables are:") - # show result - with session.execute_query_statement("SHOW TABLES") as res: - while res.has_next(): - print(res.next()) - - session.close() - - -def insert_data(num: int): - print("insert data for table" + str(num)) - # Get a session from the pool - session = session_pool.get_session() - column_names = [ - "id1", - "attr1", - "m1", - ] - data_types = [ - TSDataType.STRING, - TSDataType.STRING, - TSDataType.DOUBLE, - ] - column_types = [ColumnType.TAG, ColumnType.ATTRIBUTE, ColumnType.FIELD] - timestamps = [] - values = [] - for row in range(15): - timestamps.append(row) - 
values.append(["id:" + str(row), "attr:" + str(row), row * 1.0]) - tablet = Tablet( - "table" + str(num), column_names, data_types, values, timestamps, column_types - ) - session.insert(tablet) - session.execute_non_query_statement("FLush") - - np_timestamps = np.arange(15, 30, dtype=np.dtype(">i8")) - np_values = [ - np.array(["id:{}".format(i) for i in range(15, 30)]), - np.array(["attr:{}".format(i) for i in range(15, 30)]), - np.linspace(15.0, 29.0, num=15, dtype=TSDataType.DOUBLE.np_dtype()), - ] - - np_tablet = NumpyTablet( - "table" + str(num), - column_names, - data_types, - np_values, - np_timestamps, - column_types=column_types, - ) - session.insert(np_tablet) - session.close() - - -def query_data(): - # Get a session from the pool - session = session_pool.get_session() - - print("get data from table0") - with session.execute_query_statement("select * from table0") as res: - while res.has_next(): - print(res.next()) - - print("get data from table1") - with session.execute_query_statement("select * from table1") as res: - while res.has_next(): - print(res.next()) - - # 使用分批DataFrame方式查询表数据(推荐大数据量场景) - print("get data from table0 using batch DataFrame") - with session.execute_query_statement("select * from table0") as res: - while res.has_next_df(): - print(res.next_df()) - - session.close() - - -def delete_data(): - session = session_pool.get_session() - session.execute_non_query_statement("drop database db1") - print("data has been deleted. 
now the databases are:")
-    with session.execute_query_statement("show databases") as res:
-        while res.has_next():
-            print(res.next())
-    session.close()
-
-
-# Create a session pool
-username = "root"
-password = "TimechoDB@2021"  # before V2.0.6.x the default password was "root"
-node_urls = ["127.0.0.1:6667", "127.0.0.1:6668", "127.0.0.1:6669"]
-fetch_size = 1024
-database = "db1"
-max_pool_size = 5
-wait_timeout_in_ms = 3000
-config = TableSessionPoolConfig(
-    node_urls=node_urls,
-    username=username,
-    password=password,
-    database=database,
-    max_pool_size=max_pool_size,
-    fetch_size=fetch_size,
-    wait_timeout_in_ms=wait_timeout_in_ms,
-)
-session_pool = TableSessionPool(config)
-
-prepare_data()
-
-insert_thread1 = threading.Thread(target=insert_data, args=(0,))
-insert_thread2 = threading.Thread(target=insert_data, args=(1,))
-
-insert_thread1.start()
-insert_thread2.start()
-
-insert_thread1.join()
-insert_thread2.join()
-
-query_data()
-delete_data()
-session_pool.close()
-print("example is finished!")
-```
-
-Object type usage example:
-
-```Python
-import os
-
-import numpy as np
-import pytest
-
-from iotdb.utils.IoTDBConstants import TSDataType
-from iotdb.utils.NumpyTablet import NumpyTablet
-from iotdb.utils.Tablet import Tablet, ColumnType
-from iotdb.utils.object_column import decode_object_cell
-
-
-def _require_thrift():
-    pytest.importorskip("iotdb.thrift.common.ttypes")
-
-
-def _session_endpoint():
-    host = os.environ.get("IOTDB_HOST", "127.0.0.1")
-    port = int(os.environ.get("IOTDB_PORT", "6667"))
-    return host, port
-
-
-@pytest.fixture(scope="module")
-def table_session():
-    _require_thrift()
-    from iotdb.Session import Session
-    from iotdb.table_session import TableSession, TableSessionConfig
-
-    host, port = _session_endpoint()
-    cfg = TableSessionConfig(
-        node_urls=[f"{host}:{port}"],
-        username=os.environ.get("IOTDB_USER", Session.DEFAULT_USER),
-        password=os.environ.get("IOTDB_PASSWORD", Session.DEFAULT_PASSWORD),
-    )
-    ts = TableSession(cfg)
-    yield ts
-    ts.close()
-
-
-def 
test_table_numpy_tablet_object_columns(table_session): - """ - Table model: Tablet.add_value_object / add_value_object_by_name, - NumpyTablet.add_value_object, insert + query Field + decode_object_cell; - 另含同一 time 上分两段写入 OBJECT(先 is_eof=False/offset=0,再 is_eof=True/offset=首段长度), - 并用 read_object(f1) 校验拼接后的完整字节。 - """ - db = "test_py_object_e2e" - table = "obj_tbl" - table_session.execute_non_query_statement(f"create database if not exists {db}") - table_session.execute_non_query_statement(f"use {db}") - table_session.execute_non_query_statement(f"drop table if exists {table}") - table_session.execute_non_query_statement( - f"create table {table}(" - "device STRING TAG, temp FLOAT FIELD, f1 OBJECT FIELD, f2 OBJECT FIELD)" - ) - - column_names = ["device", "temp", "f1", "f2"] - data_types = [ - TSDataType.STRING, - TSDataType.FLOAT, - TSDataType.OBJECT, - TSDataType.OBJECT, - ] - column_types = [ - ColumnType.TAG, - ColumnType.FIELD, - ColumnType.FIELD, - ColumnType.FIELD, - ] - timestamps = [100, 200] - values = [ - ["d1", 1.5, None, None], - ["d1", 2.5, None, None], - ] - - tablet = Tablet( - table, column_names, data_types, values, timestamps, column_types - ) - tablet.add_value_object(0, 2, True, 0, b"first-row-obj") - # 整对象单段写入:is_eof=True 且 offset=0;分段续写需满足服务端 offset/长度校验 - tablet.add_value_object_by_name("f2", 0, True, 0, b"seg") - tablet.add_value_object(1, 2, True, 0, b"second-f1") - tablet.add_value_object(1, 3, True, 0, b"second-f2") - table_session.insert(tablet) - - ts_arr = np.array([300, 400], dtype=TSDataType.INT64.np_dtype()) - np_vals = [ - np.array(["d1", "d1"]), - np.array([1.0, 2.0], dtype=np.float32), - np.array([None, None], dtype=object), - np.array([None, None], dtype=object), - ] - np_tab = NumpyTablet( - table, column_names, data_types, np_vals, ts_arr, column_types=column_types - ) - np_tab.add_value_object(0, 2, True, 0, b"np-r0-f1") - np_tab.add_value_object(0, 3, True, 0, b"np-r0-f2") - np_tab.add_value_object(1, 2, True, 0, 
b"np-r1-f1") - np_tab.add_value_object(1, 3, True, 0, b"np-r1-f2") - table_session.insert(np_tab) - - # 分段 OBJECT:先 is_eof=False(续传),再 is_eof=True(末段);offset 为已写入字节长度 - chunk0 = bytes((i % 256) for i in range(512)) - chunk1 = b"\xab" * 64 - expected_segmented = chunk0 + chunk1 - seg1 = Tablet( - table, - column_names, - data_types, - [["d1", 3.0, None, None]], - [500], - column_types, - ) - seg1.add_value_object(0, 2, False, 0, chunk0) - seg1.add_value_object(0, 3, True, 0, b"f2-seg") - table_session.insert(seg1) - seg2 = Tablet( - table, - column_names, - data_types, - [["d1", 3.0, None, None]], - [500], - column_types, - ) - seg2.add_value_object(0, 2, True, 512, chunk1) - seg2.add_value_object(0, 3, True, 0, b"f2-seg") - table_session.insert(seg2) - - with table_session.execute_query_statement( - f"select read_object(f1) from {table} where time = 500" - ) as ds: - assert ds.has_next() - row = ds.next() - blob = row.get_fields()[0].get_binary_value() - assert blob == expected_segmented - assert not ds.has_next() - - seen = 0 - with table_session.execute_query_statement( - f"select device, temp, f1, f2 from {table} order by time" - ) as ds: - while ds.has_next(): - row = ds.next() - fields = row.get_fields() - assert fields[0].get_object_value(TSDataType.STRING) == "d1" - assert fields[1].get_object_value(TSDataType.FLOAT) is not None - for j in (2, 3): - raw = fields[j].value - assert isinstance(raw, (bytes, bytearray)) - eof, off, body = decode_object_cell(bytes(raw)) - assert isinstance(eof, bool) and isinstance(off, int) - assert isinstance(body, bytes) - fields[j].get_string_value() - fields[j].get_object_value(TSDataType.OBJECT) - seen += 1 - assert seen == 5 - - -if __name__ == "__main__": - pytest.main([__file__, "-v", "-rs"]) -``` diff --git a/src/zh/UserGuide/latest-Table/API/RestServiceV1_timecho.md b/src/zh/UserGuide/latest-Table/API/RestServiceV1_timecho.md deleted file mode 100644 index 117ba16ba..000000000 --- 
a/src/zh/UserGuide/latest-Table/API/RestServiceV1_timecho.md +++ /dev/null @@ -1,367 +0,0 @@ - - -# RestAPI V1 - -IoTDB 的 RESTful 服务可用于查询、写入和管理操作,它使用 OpenAPI 标准来定义接口并生成框架。 - -注意:自 V2.0.8.2 版本起,TimechoDB 安装包中默认不包含 REST 服务的 JAR 包,请使用该服务前联系天谋团队获取相应的 JAR 包,并放置于 timechodb_home/lib 或者 timechodb_home/ext/external_service 路径下。 - -## 1. 开启 RESTful 服务 - -Restful 服务默认情况是关闭的,开启 restful 功能需要找到 IoTDB 安装目录下的`conf/iotdb-system.properties`文件,将 `enable_rest_service` 设置为 `true` ,然后重启 datanode 进程。 - -```Plain - enable_rest_service=true -``` - -## 2. 鉴权 - -除了检活接口 `/ping`,RESTful 服务均使用基础(Basic)鉴权,所有请求都需要在 Header 中携带 `Authorization` 信息。 - -1. 鉴权格式 - -```JSON -Authorization: Basic -``` - -其中 `` 是 `用户名:密码` 直接做 Base64 编码的结果,其快速生成方式如下 - -* Linux/macOS - -```Bash -echo -n "你的用户名:你的密码" | base64 -eg: echo -n "root:TimechoDB@2021" | base64 -``` - -* Windows - -```Bash -# PowerShell -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("用户名:密码")) -eg: [Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("root:TimechoDB@2021")) - -# CMD -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"用户名:密码\"))" -eg: powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"root:TimechoDB@2021\"))" -``` - -2. 鉴权示例 - -默认用户名 `root`,密码 `TimechoDB@2021`: - -* 拼接字符串:`root:TimechoDB@2021` -* Base64 编码后为:`cm9vdDpUaW1lY2hvREJAMjAyMQ==` -* 最终 Header: - -```JSON -Authorization: Basic cm9vdDpUaW1lY2hvREJAMjAyMQ== -``` - -3. 错误说明 -* 用户名/密码错误:返回 HTTP 状态码 `801`,内容: - -```JSON -{"code":801,"message":"WRONG_LOGIN_PASSWORD"} -``` - -* 未设置 `Authorization`:返回 HTTP 状态码 `800`,内容: - -```JSON -{"code":800,"message":"INIT_AUTH_ERROR"} -``` - - -## 3. 
接口定义 - -### 3.1 ping - -ping 接口可以用于线上服务检活。 - -请求方式:`GET` - -请求路径:http://ip:port/ping - -请求示例: - -```Plain -curl http://127.0.0.1:18080/ping -``` - -返回的 HTTP 状态码: - -- `200`:当前服务工作正常,可以接收外部请求。 -- `503`:当前服务出现异常,不能接收外部请求。 - -响应参数: - -| 参数名称 | 参数类型 | 参数描述 | -| -------- | -------- | -------- | -| code | integer | 状态码 | -| message | string | 信息提示 | - -响应示例: - -- HTTP 状态码为 `200` 时: - -```JSON -{ "code": 200, "message": "SUCCESS_STATUS"} -``` - -- HTTP 状态码为 `503` 时: - -```JSON -{ "code": 500, "message": "thrift service is unavailable"} -``` - -注意:`ping` 接口访问不需要鉴权。 - -### 3.2 查询接口 - -- 请求地址:`/rest/table/v1/query` - -- 请求方式:post - -- Request 格式 - -请求头:`application/json` - -请求参数说明 - -| 参数名称 | 参数类型 | 是否必填 | 参数描述 | -| --------- | -------- | -------- | ------------------------------------------------------------ | -| database | string | 是 | 数据库名称 | -| sql | string | 是 | | -| row_limit | int | 否 | 一次查询能返回的结果集的最大行数。 如果不设置该参数,将使用配置文件的 `rest_query_default_row_size_limit` 作为默认值。 当返回结果集的行数超出限制时,将返回状态码 `411`。 | - -- Response 格式 - -| 参数名称 | 参数类型 | 参数描述 | -| ------------ | -------- | ------------------------------------------------------------ | -| column_names | array | 列名 | -| data_types | array | 每一列的类型 | -| values | array | 二维数组,第一维与结果集所有行,第二维数组代表结果集的每一行,每一个元素为一列,长度与column_names的长度相同。 | - -- 请求示例 - -```JSON -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"database":"test","sql":"select s1,s2,s3 from test_table"}' http://127.0.0.1:18080/rest/table/v1/query -``` - -- 响应示例: - -```JSON -{ - "column_names": [ - "s1", - "s2", - "s3" - ], - "data_types": [ - "STRING", - "BOOLEAN", - "INT32" - ], - "values": [ - [ - "a11", - true, - 2024 - ], - [ - "a11", - false, - 2025 - ] - ] -} -``` - -### 3.3 非查询接口 - -- 请求地址:`/rest/table/v1/nonQuery` - -- 请求方式:post - -- Request 格式 - - - 请求头:`application/json` - - - 请求参数说明 - -| 参数名称 | 参数类型 | 是否必填 | 参数描述 | -| -------- | -------- | -------- | -------- | -| sql | string | 是 | | -| database | string | 否 | 数据库名 
| - -- Response 格式 - -| 参数名称 | 参数类型 | 参数描述 | -| -------- | -------- | -------- | -| code | integer | 状态码 | -| message | string | 信息提示 | - -- 请求示例 - -```JSON -#创建数据库 -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"create database test","database":""}' http://127.0.0.1:18080/rest/table/v1/nonQuery -#在test库中创建表test_table -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"CREATE TABLE table1 (time TIMESTAMP TIME,region STRING TAG,plant_id STRING TAG,device_id STRING TAG,model_id STRING ATTRIBUTE,maintenance STRING ATTRIBUTE,temperature FLOAT FIELD,humidity FLOAT FIELD,status Boolean FIELD,arrival_time TIMESTAMP FIELD) WITH (TTL=31536000000)","database":"test"}' http://127.0.0.1:18080/rest/table/v1/nonQuery -``` - -- 响应示例: - -```JSON -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - -### 3.4 批量写入接口 - -- 请求地址:`/rest/table/v1/insertTablet` - -- 请求方式:post - -- Request 格式 - -请求头:`application/json` - -请求参数说明 - -| 参数名称 | 参数类型 | 是否必填 | 参数描述 | -| ----------------- | -------- | -------- | ------------------------------------------------------------ | -| database | string | 是 | 数据库名称 | -| table | string | 是 | 表名 | -| column_names | array | 是 | 列名 | -| column_categories | array | 是 | 列类别(TAG,FIELD,*ATTRIBUTE*) | -| data_types | array | 是 | 数据类型 | -| timestamps | array | 是 | 时间列 | -| values | array | 是 | 值列,每一列中的值可以为 `null`二维数组第一层长度跟timestamps长度相同。第二层长度跟column_names长度相同 | - -- Response 格式 - -响应参数: - -| 参数名称 | 参数类型 | 参数描述 | -| -------- | -------- | -------- | -| code | integer | 状态码 | -| message | string | 信息提示 | - -- 请求示例 - -```JSON -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data 
'{"database":"test","column_categories":["TAG","FIELD","FIELD"],"timestamps":[1739702535000,1739789055000],"column_names":["s1","s2","s3"],"data_types":["STRING","BOOLEAN","INT32"],"values":[["a11",true,2024],["a11",false,2025]],"table":"test_table"}' http://127.0.0.1:18080/rest/table/v1/insertTablet -``` - -- 响应示例 - -```JSON -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - -## 4. 配置 - -配置文件位于 `iotdb-system.properties` 中。 - -- 将 `enable_rest_service` 设置为 `true` 表示启用该模块,而将 `false` 表示禁用该模块。默认情况下,该值为 `false`。 - -```Plain -enable_rest_service=true -``` - -- 仅在 `enable_rest_service=true` 时生效。将 `rest_service_port `设置为数字(1025~65535),以自定义REST服务套接字端口,默认情况下,值为 `18080`。 - -```Plain -rest_service_port=18080 -``` - -- 将 'enable_swagger' 设置 'true' 启用swagger来展示rest接口信息, 而设置为 'false' 关闭该功能. 默认情况下,该值为 `false`。 - -```Plain -enable_swagger=false -``` - -- 一次查询能返回的结果集最大行数。当返回结果集的行数超出参数限制时,您只会得到在行数范围内的结果集,且将得到状态码`411`。 - -```Plain -rest_query_default_row_size_limit=10000 -``` - -- 缓存客户登录信息的过期时间(用于加速用户鉴权的速度,单位为秒,默认是8个小时) - -```Plain -cache_expire_in_seconds=28800 -``` - -- 缓存中存储的最大用户数量(默认是100) - -```Plain -cache_max_num=100 -``` - -- 缓存初始容量(默认是10) - -```Plain -cache_init_num=10 -``` - -- REST Service 是否开启 SSL 配置,将 `enable_https` 设置为 `true` 以启用该模块,而将 `false` 设置为禁用该模块。默认情况下,该值为 `false`。 - -```Plain -enable_https=false -``` - -- keyStore 所在路径(非必填) - -```Plain -key_store_path= -``` - -- keyStore 密码(非必填) - -```Plain -key_store_pwd= -``` - -- trustStore 所在路径(非必填) - -```Plain -trust_store_path="" -``` - -- trustStore 密码(非必填) - -```Plain -trust_store_pwd="" -``` - -- SSL 超时时间,单位为秒 - -```Plain -idle_timeout_in_seconds=5000 -``` \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/Background-knowledge/Cluster-Concept_timecho.md b/src/zh/UserGuide/latest-Table/Background-knowledge/Cluster-Concept_timecho.md deleted file mode 100644 index b0462f0dd..000000000 --- a/src/zh/UserGuide/latest-Table/Background-knowledge/Cluster-Concept_timecho.md +++ /dev/null @@ -1,132 +0,0 
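@@
-
-
-# Common Concepts
-
-## 1. Data Model Concepts
-
-### 1.1 Data Model (sql_dialect)
-
-IoTDB supports two time series data models (SQL dialects). Both manage devices and measurement points:
-
-- Tree: manages data as hierarchical paths; one path corresponds to one measurement point of one device.
-- Table: manages data as relational tables; one table corresponds to one class of devices.
-
-### 1.2 Metadata (Schema)
-
-Metadata is the data model information of the database, that is, the tree structure or the table structure. It includes definitions such as measurement point names and data types.
-
-### 1.3 Device
-
-Corresponds to a physical device in a real-world scenario and usually contains multiple measurement points.
-
-### 1.4 Measurement Point (Timeseries)
-
-Also known as: physical quantity, time series, timeline, point, signal, metric, measured value, and so on.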
-测点是多个数据点按时间戳递增排列形成的一个时间序列。通常一个测点代表一个采集点位,能够定期采集所在环境的物理量。 - -### 1.5 编码(Encoding) - -编码是一种压缩技术,将数据以二进制的形式进行表示,可以提高存储效率。IoTDB 支持多种针对不同类型的数据的编码方法,详细信息请查看:[压缩和编码](../Technical-Insider/Encoding-and-Compression.md)。 - -### 1.6 压缩(Compression) - -IoTDB 在数据编码后,使用压缩技术进一步压缩二进制数据,提升存储效率。IoTDB 支持多种压缩方法,详细信息请查看:[压缩和编码](../Technical-Insider/Encoding-and-Compression.md)。 - -## 2. 分布式相关概念 - -下图展示了一个常见的 IoTDB 3C3D(3 个 ConfigNode、3 个 DataNode)的集群部署模式: - - - -IoTDB 的集群包括如下常见概念: - -- 节点(ConfigNode、DataNode、AINode) -- Region(SchemaRegion、DataRegion) -- 多副本 - -下文将对以上概念进行介绍。 - - -### 2.1 节点 - -IoTDB 集群包括三种节点(进程):ConfigNode(管理节点),DataNode(数据节点)和 AINode(分析节点),如下所示: - -- ConfigNode:管理集群的节点信息、配置信息、用户权限、元数据、分区信息等,负责分布式操作的调度和负载均衡,所有 ConfigNode 之间互为全量备份,如上图中的 ConfigNode-1,ConfigNode-2 和 ConfigNode-3 所示。 -- DataNode:服务客户端请求,负责数据的存储和计算,如上图中的 DataNode-1,DataNode-2 和 DataNode-3 所示。 -- AINode:负责提供机器学习能力,支持注册已训练好的机器学习模型,并通过 SQL 调用模型进行推理,目前已内置自研时序大模型和常见的机器学习算法(如预测与异常检测)。 - -### 2.2 数据分区 - -在 IoTDB 中,元数据和数据都被分为小的分区,即 Region,由集群的各个 DataNode 进行管理。 - -- SchemaRegion:元数据分区,管理一部分设备和测点的元数据。不同 DataNode 相同 RegionID 的 SchemaRegion 互为副本,如上图中 SchemaRegion-1 拥有三个副本,分别放置于 DataNode-1,DataNode-2 和 DataNode-3。 -- DataRegion:数据分区,管理一部分设备的一段时间的数据。不同 DataNode 相同 RegionID 的 DataRegion 互为副本,如上图中 DataRegion-2 拥有两个副本,分别放置于 DataNode-1 和 DataNode-2。 -- 具体分区算法可参考:[数据分区](../Technical-Insider/Cluster-data-partitioning.md) - -### 2.3 多副本 - -数据和元数据的副本数可配置,不同部署模式下的副本数推荐如下配置,其中多副本时可提供高可用服务。 - -| 类别 | 配置项 | 单机推荐配置 | 集群推荐配置 | -| :----- | :------------------------ | :----------- | :----------- | -| 元数据 | schema_replication_factor | 1 | 3 | -| 数据 | data_replication_factor | 1 | 2 | - - -## 3. 
部署相关概念 - -IoTDB 有三种运行模式:单机模式、集群模式和双活模式。 - -### 3.1 单机模式 - -IoTDB单机实例包括 1 个ConfigNode、1个DataNode,即1C1D; - -- **特点**:便于开发者安装部署,部署和维护成本较低,操作方便。 -- **适用场景**:资源有限或对高可用要求不高的场景,例如边缘端服务器。 -- **部署方法**:[单机版部署](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -### 3.2 双活模式 - -双活版部署为 TimechoDB 企业版功能,是指两个独立的实例进行双向同步,能同时对外提供服务。当一台停机重启后,另一个实例会将缺失数据断点续传。 - -> IoTDB 双活实例通常为2个单机节点,即2套1C1D。每个实例也可以为集群。 - -- **特点**:资源占用最低的高可用解决方案。 -- **适用场景**:资源有限(仅有两台服务器),但希望获得高可用能力。 -- **部署方法**:[双活版部署](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -### 3.3 集群模式 - -IoTDB 集群实例为 3 个ConfigNode 和不少于 3 个 DataNode,通常为 3 个 DataNode,即3C3D;当部分节点出现故障时,剩余节点仍然能对外提供服务,保证数据库服务的高可用性,且可随节点增加提升数据库性能。 - -- **特点**:具有高可用性、高扩展性,可通过增加 DataNode 提高系统性能。 -- **适用场景**:需要提供高可用和可靠性的企业级应用场景。 -- **部署方法**:[集群版部署](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - -### 3.4 特点总结 - -| 维度 | 单机模式 | 双活模式 | 集群模式 | -| ------------ | ---------------------------- | ------------------------ | ------------------------ | -| 适用场景 | 边缘侧部署、对高可用要求不高 | 高可用性业务、容灾场景等 | 高可用性业务、容灾场景等 | -| 所需机器数量 | 1 | 2 | ≥3 | -| 安全可靠性 | 无法容忍单点故障 | 高,可容忍单点故障 | 高,可容忍单点故障 | -| 扩展性 | 可扩展 DataNode 提升性能 | 每个实例可按需扩展 | 可扩展 DataNode 提升性能 | -| 性能 | 可随 DataNode 数量扩展 | 与其中一个实例性能相同 | 可随 DataNode 数量扩展 | - -- 单机模式和集群模式,部署步骤类似(逐个增加 ConfigNode 和 DataNode),仅副本数和可提供服务的最少节点数不同。 \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/Background-knowledge/Data-Model-and-Terminology_timecho.md b/src/zh/UserGuide/latest-Table/Background-knowledge/Data-Model-and-Terminology_timecho.md deleted file mode 100644 index 4bd0c054a..000000000 --- a/src/zh/UserGuide/latest-Table/Background-knowledge/Data-Model-and-Terminology_timecho.md +++ /dev/null @@ -1,391 +0,0 @@ - - -# 建模方案设计 - -本章节主要介绍如何将时序数据应用场景转化为IoTDB时序建模。 - -## 1. 时序数据模型 - -在构建IoTDB建模方案前,需要先了解时序数据和时序数据模型,详细内容见此页面:[时序数据模型](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - -## 2. 
IoTDB 的树表孪生模型 - -IoTDB 提供了树表孪生模型的方式,其特点分别如下: - -**树模型**:以测点为对象进行管理,每个测点对应一条时间序列,测点名按`.`分割可形成一个树形目录结构,与物理世界一一对应,对测点的读写操作简单直观。 - -**表模型**:推荐为每类设备创建一张表,同类设备的物理量采集都具备一定共性(如都采集温度和湿度物理量),数据分析灵活丰富。 - -### 2.1 模型特点 - -树表孪生模型有各自的适用场景。 - -以下表格从适用场景、典型操作等多个维度对树模型和表模型进行了对比。用户可以根据具体的使用需求,选择适合的模型,从而实现数据的高效存储和管理。 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
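-| Dimension | Tree model | Table model |
-| --- | --- | --- |
-| Use cases | Measurement point management, monitoring scenarios | Device management, analysis scenarios |
-| Typical operations | Read and write by specifying a point path | Filter and analyze data by tags |
-| Structure | Add and remove nodes as flexibly as in a file system | Template-based management, convenient for data governance |
-| Syntax | Concise and flexible | Rich in analysis capabilities |
-| Performance | Same | Same |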
-
-**Note:**
-- Both model spaces can coexist in the same cluster instance. The two models differ in syntax and in database naming, and by default they are not visible to each other.
-
-
-### 2.2 Model Selection
-
-IoTDB supports connecting to the database through a variety of client tools. Model selection for each client is described below:
-
-1. [Command-line tool CLI](../Tools-System/CLI_timecho.md)
-
-When connecting through the CLI, specify the model with the `sql_dialect` parameter (the tree model is used by default).
-
-```Bash
-# Tree model
-start-cli.sh(bat)
-start-cli.sh(bat) -sql_dialect tree
-
-# Table model
-start-cli.sh(bat) -sql_dialect table
-```
-
-2. [SQL](../User-Manual/Maintenance-statement_timecho.md#_2-1-设置连接的模型)
-
-When operating on data with SQL, use the SET statement to switch the model in use.
-
-```SQL
--- Switch to the tree model
-IoTDB> SET SQL_DIALECT=TREE
-
--- Switch to the table model
-IoTDB> SET SQL_DIALECT=TABLE
-```
-
-3. Application programming interfaces
-
-When connecting through the multi-language APIs, create the connection (pool) instance with the session/sessionpool class that corresponds to the model. Brief examples follow:
-
-* [Java native API](../API/Programming-Java-Native-API_timecho.md)
-
-```Java
-// Tree model
-SessionPool sessionPool =
-    new SessionPool.Builder()
-        .nodeUrls(nodeUrls)
-        .user(username)
-        .password(password)
-        .maxSize(3)
-        .build();
-
-// Table model
-ITableSessionPool tableSessionPool =
-    new TableSessionPoolBuilder()
-        .nodeUrls(nodeUrls)
-        .user(username)
-        .password(password)
-        .maxSize(1)
-        .build();
-```
-
-* [Python native API](../API/Programming-Python-Native-API_timecho.md)
-
-```Python
-# Tree model
-session = Session(
-    ip=ip,
-    port=port,
-    user=username,
-    password=password,
-    fetch_size=1024,
-    zone_id="UTC+8",
-    enable_redirection=True
-)
-
-# Table model
-config = TableSessionPoolConfig(
-    node_urls=node_urls,
-    username=username,
-    password=password,
-    database=database,
-    max_pool_size=max_pool_size,
-    fetch_size=fetch_size,
-    wait_timeout_in_ms=wait_timeout_in_ms,
-)
-session_pool = TableSessionPool(config)
-```
-
-* [C++ native API](../API/Programming-Cpp-Native-API_timecho.md)
-
-```C++
-// Tree model
-session = new Session(hostip, port, username, password);
-
-// Table model
-session = (new TableSessionBuilder())
-    ->host(ip)
-    ->rpcPort(port)
-    ->username(username)
-    ->password(password)
-    ->build();
-```
-
-* [Go native API](../API/Programming-Go-Native-API_timecho.md)
-
-```Go
-// Tree model
-config := &client.PoolConfig{
-    Host: host,
-    Port: port,
-
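UserName: user,
-    Password: password,
-}
-sessionPool = client.NewSessionPool(config, 3, 60000, 60000, false)
-defer sessionPool.Close()
-
-// Table model
-config := &client.PoolConfig{
-    Host: host,
-    Port: port,
-    UserName: user,
-    Password: password,
-    Database: dbname,
-}
-sessionPool := client.NewTableSessionPool(config, 3, 60000, 4000, false)
-defer sessionPool.Close()
-```
-
-* [C# native API](../API/Programming-CSharp-Native-API_timecho.md)
-
-```C#
-// Tree model
-var session_pool = new SessionPool(host, port, pool_size);
-
-// Table model
-var tableSessionPool = new TableSessionPool.Builder()
-    .SetNodeUrls(nodeUrls)
-    .SetUsername(username)
-    .SetPassword(password)
-    .SetFetchSize(1024)
-    .Build();
-```
-
-* [JDBC](../API/Programming-JDBC_timecho.md)
-
-To use the table model, the sql\_dialect parameter must be set to table in the URL.
-
-```Java
-// Tree model
-Class.forName("org.apache.iotdb.jdbc.IoTDBDriver");
-Connection connection = DriverManager.getConnection(
-    "jdbc:iotdb://127.0.0.1:6667/", username, password);
-
-// Table model
-Class.forName("org.apache.iotdb.jdbc.IoTDBDriver");
-Connection connection = DriverManager.getConnection(
-    "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table", username, password);
-```
-
-### 2.3 Tree-to-Table
-
-IoTDB provides a tree-to-table feature, as shown in the figure below:
-
-![](/img/tree-to-table-1.png)
-
-This feature lets you create table views over existing tree-model data and then query through those views, so that the same data can be processed with both the tree model and the table model. For details, see [Tree-to-Table View](../User-Manual/Tree-to-Table_timecho.md). Note: **the SQL statement that creates a tree-to-table view can only be executed in the table model**.
-
-
-## 3. Application Scenarios
-
-There are three main application scenarios:
-
-- Scenario 1: read and write data with the tree model
-
-- Scenario 2: read and write data with the table model
-
-- Scenario 3: share one copy of the data, reading and writing with the tree model and analyzing with the table model
-
-### 3.1 Scenario 1: Tree Model
-
-#### 3.1.1 Characteristics
-
-- Simple and intuitive; maps one-to-one to monitoring points in the physical world
-
-- As flexible as a file system; any branch structure can be designed
-
-- Suitable for industrial monitoring scenarios such as DCS and SCADA
-
-#### 3.1.2 Basic Concepts
-
-| **Concept** | **Definition** |
-| -------------------- | ------------------------------------------------------------ |
-| **Database** | Definition: a path prefixed with root.<br>Naming recommendation: include only the node directly below root, e.g. root.db<br>Quantity recommendation: the upper limit depends on memory; a single database can already make full use of machine resources, so there is no need to create multiple databases for performance reasons<br>Creation: manual creation is recommended; a database can also be created automatically when a time series is created (by default, the node directly below root) |
-| **Time series (measurement point)** | Definition:<br>1. A path prefixed with the database path and separated by `.`; it can contain any number of levels, e.g. root.db.turbine.device1.metric1<br>2. Each time series can have a different data type.<br>Naming recommendation:<br>1. Put only the tags that uniquely identify the time series (similar to a composite primary key) into the path, generally no more than 10 levels<br>2. Tags with low cardinality (few distinct values) usually go first, so that the system can compress the common prefix<br>Quantity recommendation:<br>1. The total number of time series a cluster can manage depends on total memory; see the resource recommendation chapter<br>2. There is no limit on the number of child nodes at any level<br>Creation: manually, or automatically when data is written. |
-| **Device** | Definition: the second-to-last level is the device, e.g. the "device1" level in root.db.turbine.**device1**.metric1<br>Creation: a device cannot be created on its own; it exists once a time series under it is created |
-
-#### 3.1.3 Modeling Examples
-
-##### 3.1.3.1 How to model when multiple types of devices need to be managed?
-
-- If different types of devices have different hierarchical paths and measurement point sets, create a branch per device type under the database node. Each device type can have its own measurement point structure.
-
-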
- -
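The tree-model conventions described above (a database directly under `root`, the device at the second-to-last level, the measurement at the last level) can be sketched in plain Python. This is only an illustration: `split_series_path` and the `root.db.pv.device2` branch are hypothetical names, not part of any IoTDB client API, and the sketch assumes the recommended layout where the database is the node directly below `root`.

```python
from collections import defaultdict


def split_series_path(path):
    """Split a tree-model time series path into database, device and measurement.

    Assumes the recommended layout: the database is `root.<name>` and the
    second-to-last level is the device. Quoted node names that contain dots
    are not handled in this sketch.
    """
    nodes = path.split(".")
    if len(nodes) < 4 or nodes[0] != "root":
        raise ValueError(
            f"expected root.<db>.<...>.<device>.<measurement>, got {path!r}"
        )
    return {
        "database": ".".join(nodes[:2]),
        "device": ".".join(nodes[:-1]),
        "measurement": nodes[-1],
    }


# Two device types live in different branches under the same database.
paths = [
    "root.db.turbine.device1.metric1",
    "root.db.turbine.device1.metric2",
    "root.db.pv.device2.metric1",  # hypothetical second device type
]

# Group measurements by the device path they belong to.
by_device = defaultdict(list)
for p in paths:
    parts = split_series_path(p)
    by_device[parts["device"]].append(parts["measurement"])

for device, measurements in by_device.items():
    print(device, measurements)
```

Each branch can carry its own set of measurements, which is what makes the tree layout convenient when device types differ in structure.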
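-
-##### 3.1.3.2 How to model when there are no devices, only measurement points?
-
-- For example, in a station monitoring system every measurement point has a unique ID but cannot be mapped to a specific device.
-
-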
- -
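-
-##### 3.1.3.3 How to model when a device has both sub-devices and measurement points?
-
-- For example, in an energy storage scenario where voltage and current must be monitored at every level of the structure, the following modeling approach can be used.
-
-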
- -
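-
-
-### 3.2 Scenario 2: Table Model
-
-#### 3.2.1 Characteristics
-
-- Models and manages device time series data as time series tables, which makes analysis with standard SQL convenient
-
-- Suitable for device data analysis, or for migrating from other databases to IoTDB
-
-#### 3.2.2 Basic Concepts
-
-- Database: can manage multiple classes of devices
-
-- Time series table: corresponds to one class of devices
-
-| **Column category** | **Definition** |
-| --------------------------- | ------------------------------------------------------------ |
-| **Time column (TIME)** | Every time series table must have one time column; its name must be time and its data type TIMESTAMP |
-| **Tag column (TAG)** | Unique identifier of a device (composite primary key); there can be zero or more<br>Tag information cannot be modified or deleted, but new tags can be added<br>Ordering tags from coarse to fine granularity is recommended |
-| **Field column (FIELD)** | A device can collect one or more measurements, whose values change over time<br>There is no limit on the number of field columns in a table; it can reach hundreds of thousands or more |
-| **Attribute column (ATTRIBUTE)** | Supplementary description of a device that **does not change over time**<br>There can be zero or more device attributes, which can be updated or added<br>A small number of static attributes that may need to be modified can be stored in this column |
-
-
-Data filtering efficiency: time column = tag column > attribute column > field column
-
-#### 3.2.3 Modeling Examples
-
-##### 3.2.3.1 How to model when multiple types of devices need to be managed?
-
-- Creating one table per device type is recommended; each table can have its own set of tags and fields.
-- Even if devices are related or form a hierarchy, one table per device class is still recommended.
-
-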
- -
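The "one table per device type" recommendation can be sketched by generating a `CREATE TABLE` statement from a small column specification per device type. The column syntax (`<name> <type> TAG|ATTRIBUTE|FIELD`) follows the `CREATE TABLE` statements shown elsewhere in this guide; the helper `build_create_table`, the device types, and the column names are hypothetical, and the time column is left out here as in the session pool example.

```python
from typing import Dict, List, Tuple

# (column name, data type, column category)
Column = Tuple[str, str, str]

ALLOWED_CATEGORIES = {"TAG", "ATTRIBUTE", "FIELD"}


def build_create_table(table: str, columns: List[Column]) -> str:
    """Build a table-model CREATE TABLE statement from a column specification."""
    parts = []
    for name, data_type, category in columns:
        if category.upper() not in ALLOWED_CATEGORIES:
            raise ValueError(f"unknown column category: {category!r}")
        parts.append(f"{name} {data_type.upper()} {category.upper()}")
    return f"CREATE TABLE {table} ({', '.join(parts)})"


# Hypothetical device types: each gets its own table with its own tags and fields.
device_types: Dict[str, List[Column]] = {
    "wind_turbine": [
        ("region", "STRING", "TAG"),
        ("device_id", "STRING", "TAG"),
        ("model_id", "STRING", "ATTRIBUTE"),
        ("wind_speed", "FLOAT", "FIELD"),
    ],
    "pv_inverter": [
        ("plant_id", "STRING", "TAG"),
        ("device_id", "STRING", "TAG"),
        ("voltage", "DOUBLE", "FIELD"),
        ("current", "DOUBLE", "FIELD"),
    ],
}

statements = [build_create_table(name, cols) for name, cols in device_types.items()]
for statement in statements:
    print(statement)
```

The generated strings could then be passed to `execute_non_query_statement` on a table session; keeping the specification per device type makes it easy to see which tags identify each class of device.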
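-
-##### 3.2.3.2 How to model when there are no device identifier columns or attribute columns?
-
-- There is no limit on the number of columns; it can reach hundreds of thousands or more.
-
-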
- -
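-
-##### 3.2.3.3 How to model when a device has both sub-devices and measurement points?
-
-- Each device has multiple sub-devices and measurement points; creating one table per device class to manage them is recommended.
-
-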
- -
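-
-### 3.3 Scenario 3: Combining Both Models
-
-#### 3.3.1 Characteristics
-
-- Combines the strengths of the tree model and the table model: one shared copy of the data, flexible writes, and rich queries.
-
-- Write phase: uses tree-model syntax, which supports flexible data ingestion and extension.
-
-- Analysis phase: uses table-model syntax, so users can run complex analyses with standard SQL.
-
-#### 3.3.2 Modeling Examples
-
-##### 3.3.2.1 How to model when multiple types of devices need to be managed?
-
-- Different types of devices in the scenario have different hierarchical paths and measurement point sets.
-
-- Tree model: create a branch per device type under the database node; each device type can have its own measurement point structure.
-
-- Table view: create one table view per device type; each table view has its own set of tags and fields.
-
-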
- -
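How one copy of the data can be seen through both models can be sketched conceptually: each tree-model path is split into tag values plus a field name, and readings that share the same tags and timestamp are merged into one table-style row. This plain-Python sketch is not how IoTDB implements table views (those are created with SQL in the table model); the function `to_table_rows`, the tag names `device_type` and `device_id`, and the sample readings are hypothetical.

```python
def to_table_rows(readings, tag_names=("device_type", "device_id")):
    """Merge (path, timestamp, value) readings into table-style rows.

    The levels between the database (root.<db>) and the measurement are
    treated as tag values; the last level becomes the field (column) name.
    """
    rows = {}
    for path, timestamp, value in readings:
        nodes = path.split(".")
        tags = nodes[2:-1]
        if len(tags) != len(tag_names):
            raise ValueError(f"path {path!r} does not match tags {tag_names}")
        key = (timestamp, *tags)
        row = rows.setdefault(key, {"time": timestamp, **dict(zip(tag_names, tags))})
        row[nodes[-1]] = value
    # Return rows ordered by time, then by tag values.
    return [rows[key] for key in sorted(rows)]


# Readings written with tree-model paths (hypothetical sample data).
readings = [
    ("root.db.turbine.device1.temperature", 1, 20.5),
    ("root.db.turbine.device1.humidity", 1, 0.4),
    ("root.db.turbine.device2.temperature", 1, 21.0),
]

rows = to_table_rows(readings)
for row in rows:
    print(row)
```

Writes stay path-oriented, while analysis sees one row per device and timestamp with the measurements as columns.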
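-
-##### 3.3.2.2 How to model when there are no device identifier columns or attribute columns?
-
-- Tree model: every measurement point has a unique ID but cannot be mapped to a specific device.
-
-- Table view: put all measurement points into one table; there is no limit on the number of field columns, which can reach hundreds of thousands or more. If the measurement points share the same data type, they can be treated as one class of device.
-
-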
- -
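-
-##### 3.3.2.3 How to model when a device has both sub-devices and measurement points?
-
-- Tree model: model each level of the structure according to the monitoring points in the physical world.
-
-- Table view: classify by device and create multiple tables to manage the information of each level.
-
-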
- -
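diff --git a/src/zh/UserGuide/latest-Table/Background-knowledge/Data-Type_timecho.md b/src/zh/UserGuide/latest-Table/Background-knowledge/Data-Type_timecho.md
deleted file mode 100644
index 42ee3009a..000000000
--- a/src/zh/UserGuide/latest-Table/Background-knowledge/Data-Type_timecho.md
+++ /dev/null
@@ -1,189 +0,0 @@
-
-
-# Data Types
-
-## 1. Basic Data Types
-
-IoTDB supports the following eleven data types:
-
-* BOOLEAN
-* INT32 (integer)
-* INT64 (long integer)
-* FLOAT (single-precision floating point)
-* DOUBLE (double-precision floating point)
-* TEXT (long string; not recommended)
-* STRING (string)
-* BLOB (binary large object)
-* OBJECT (binary large object)
-  > Supported since V2.0.8
-* TIMESTAMP
-* DATE
-
-Notes:
-1. STRING differs from TEXT in that STRING carries more statistics, which can be used to optimize value-filter queries. TEXT is suited to storing long strings.
-2. OBJECT and BLOB differ as follows:
-
-   | | **OBJECT** | **BLOB** |
-   | ---------------------- | ---------------------- | ---------------------- |
-   | Write amplification (lower is better) | Low (the write amplification factor is always 1) | High (the write amplification factor is 2 + the number of compactions) |
-   | Space amplification (lower is better) | Low (merge & release on write) | High (merge on read and release on compact) |
-   | Query result | By default, querying an OBJECT column returns a result such as `(Object) XX.XX KB)`. The actual OBJECT data is stored under `${data_dir}/object_data`, and its content can be read with the `READ_OBJECT` function | Returns the actual binary content directly |
-
-
-### 1.1 Data Type Compatibility
-
-When the type of the data being written differs from the registered data type of the series:
-- If the series data type is not compatible with the written data type, the system reports an error.
-- If the series data type is compatible with the written data type, the system automatically converts the written data to the registered type of the series.
-
-The compatibility between data types is shown in the following table:
-
-| Series data type | Supported write data types         |
-|------------------|------------------------------------|
-| BOOLEAN          | BOOLEAN                            |
-| INT32            | INT32                              |
-| INT64            | INT32 INT64 TIMESTAMP              |
-| FLOAT            | INT32 FLOAT                        |
-| DOUBLE           | INT32 INT64 FLOAT DOUBLE TIMESTAMP |
-| TEXT             | TEXT STRING                        |
-| STRING           | TEXT STRING                        |
-| BLOB             | TEXT STRING BLOB                   |
-| OBJECT           | OBJECT                             |
-| TIMESTAMP        | INT32 INT64 TIMESTAMP              |
-| DATE             | DATE                               |
-
-## 2. Timestamp Types
-
-A timestamp is the point in time at which a data item arrives. Timestamps are either absolute or relative.
-
-### 2.1 Absolute Timestamps
-
-IoTDB has two kinds of absolute timestamps: the LONG type and the DATETIME type (which comprises two subtypes, DATETIME-INPUT and DATETIME-DISPLAY).
-
-When entering a timestamp, you can use either a LONG timestamp or a DATETIME-INPUT timestamp. The formats supported by DATETIME-INPUT are listed in the following table:
-
-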
-
-**Formats supported by DATETIME-INPUT**
-
-
-| format                       |
-| :--------------------------- |
-| yyyy-MM-dd HH:mm:ss          |
-| yyyy/MM/dd HH:mm:ss          |
-| yyyy.MM.dd HH:mm:ss          |
-| yyyy-MM-dd HH:mm:ssZZ        |
-| yyyy/MM/dd HH:mm:ssZZ        |
-| yyyy.MM.dd HH:mm:ssZZ        |
-| yyyy/MM/dd HH:mm:ss.SSS      |
-| yyyy-MM-dd HH:mm:ss.SSS      |
-| yyyy.MM.dd HH:mm:ss.SSS      |
-| yyyy-MM-dd HH:mm:ss.SSSZZ    |
-| yyyy/MM/dd HH:mm:ss.SSSZZ    |
-| yyyy.MM.dd HH:mm:ss.SSSZZ    |
-| ISO8601 standard time format |
-
-
-
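To make the format table above concrete, here is a minimal Python sketch that accepts the twelve explicit patterns (three date separators, optional `.SSS` milliseconds, optional `ZZ` offset) and converts the text to a LONG-style millisecond timestamp. It is only an illustration, not IoTDB's parser: the function name `parse_datetime_input` is hypothetical, the generic ISO8601 row is not covered, and treating zone-less input as UTC is an assumption (a real server would apply its own configured time zone).

```python
from datetime import datetime, timedelta, timezone

# strptime equivalents of the patterns in the table:
# yyyy<sep>MM<sep>dd HH:mm:ss[.SSS][ZZ] with <sep> in "-", "/", "."
PATTERNS = [
    f"%Y{sep}%m{sep}%d %H:%M:%S{frac}{zone}"
    for sep in ("-", "/", ".")
    for frac in ("", ".%f")
    for zone in ("", "%z")
]

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_datetime_input(text, default_tz=timezone.utc):
    """Return milliseconds since the Unix epoch for a DATETIME-INPUT style string."""
    for pattern in PATTERNS:
        try:
            parsed = datetime.strptime(text, pattern)
        except ValueError:
            continue  # this pattern does not fit; try the next one
        if parsed.tzinfo is None:
            # Assumption: input without an offset is read in default_tz.
            parsed = parsed.replace(tzinfo=default_tz)
        # Integer division of timedeltas avoids floating-point rounding.
        return (parsed - EPOCH) // timedelta(milliseconds=1)
    raise ValueError(f"unsupported timestamp format: {text!r}")
```

With this sketch, `parse_datetime_input("2024-11-27 08:00:00+08:00")` and `parse_datetime_input("2024/11/27 00:00:00")` denote the same instant, because the second string is read as UTC under the stated assumption.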
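-
-
-When displaying timestamps, IoTDB supports the LONG type and the DATETIME-DISPLAY type; DATETIME-DISPLAY lets users define a custom time format. The syntax of custom time formats is shown in the following table:
-
-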
-
-**Syntax of DATETIME-DISPLAY custom time formats**
-
-
-| Symbol |           Meaning           | Presentation |              Examples              |
-| :----: | :-------------------------: | :----------: | :--------------------------------: |
-|   G    |             era             |     era      |                era                 |
-|   C    |    century of era (>=0)     |    number    |                 20                 |
-|   Y    |      year of era (>=0)      |     year     |                1996                |
-|        |                             |              |                                    |
-|   x    |          weekyear           |     year     |                1996                |
-|   w    |      week of weekyear       |    number    |                 27                 |
-|   e    |         day of week         |    number    |                 2                  |
-|   E    |         day of week         |     text     |            Tuesday; Tue            |
-|        |                             |              |                                    |
-|   y    |            year             |     year     |                1996                |
-|   D    |         day of year         |    number    |                189                 |
-|   M    |        month of year        |    month     |           July; Jul; 07            |
-|   d    |        day of month         |    number    |                 10                 |
-|        |                             |              |                                    |
-|   a    |       halfday of day        |     text     |                 PM                 |
-|   K    |   hour of halfday (0~11)    |    number    |                 0                  |
-|   h    | clockhour of halfday (1~12) |    number    |                 12                 |
-|        |                             |              |                                    |
-|   H    |     hour of day (0~23)      |    number    |                 0                  |
-|   k    |   clockhour of day (1~24)   |    number    |                 24                 |
-|   m    |       minute of hour        |    number    |                 30                 |
-|   s    |      second of minute       |    number    |                 55                 |
-|   S    |     fraction of second      |    millis    |                978                 |
-|        |                             |              |                                    |
-|   z    |          time zone          |     text     |     Pacific Standard Time; PST     |
-|   Z    |     time zone offset/id     |     zone     | -0800; -08:00; America/Los_Angeles |
-|        |                             |              |                                    |
-|   '    |       escape for text       |  delimiter   |                                    |
-|   ''   |        single quote         |   literal    |                 '                  |
-
-
-
-### 2.2 Relative Timestamps
-
-  A relative time is a time that differs from the server time ```now()``` or from a ```DATETIME``` value by a given interval.
-  Its formal definition is:
-
-  ```
-  Duration = (Digit+ ('Y'|'MO'|'W'|'D'|'H'|'M'|'S'|'MS'|'US'|'NS'))+
-  RelativeTime = (now() | DATETIME) ((+|-) Duration)+
-  ```
-
-
- - **The syntax of the duration unit** - - - | Symbol | Meaning | Presentation | Examples | - | :----: | :---------: | :----------------------: | :------: | - | y | year | 1y=365 days | 1y | - | mo | month | 1mo=30 days | 1mo | - | w | week | 1w=7 days | 1w | - | d | day | 1d=1 day | 1d | - | | | | | - | h | hour | 1h=3600 seconds | 1h | - | m | minute | 1m=60 seconds | 1m | - | s | second | 1s=1 second | 1s | - | | | | | - | ms | millisecond | 1ms=1000_000 nanoseconds | 1ms | - | us | microsecond | 1us=1000 nanoseconds | 1us | - | ns | nanosecond | 1ns=1 nanosecond | 1ns | - -
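The `Duration` grammar and the unit table above can be sketched in Python as a small converter from a duration string to nanoseconds. This is an illustrative reading of the table, not IoTDB's own implementation: the name `duration_to_ns` is hypothetical, and accepting both upper-case and lower-case units is an assumption made because the grammar shows upper-case symbols while the table shows lower-case ones.

```python
import re

NS_PER_SECOND = 10**9
NS_PER_DAY = 86_400 * NS_PER_SECOND

# Unit lengths in nanoseconds, following the unit table
# (fixed conventions: 1y = 365 days, 1mo = 30 days).
NS_PER_UNIT = {
    "y": 365 * NS_PER_DAY,
    "mo": 30 * NS_PER_DAY,
    "w": 7 * NS_PER_DAY,
    "d": NS_PER_DAY,
    "h": 3_600 * NS_PER_SECOND,
    "m": 60 * NS_PER_SECOND,
    "s": NS_PER_SECOND,
    "ms": 10**6,
    "us": 10**3,
    "ns": 1,
}

# Two-letter units come first so that "mo" and "ms" are not read as "m".
_TOKEN = re.compile(r"(\d+)(mo|ms|us|ns|y|w|d|h|m|s)", re.IGNORECASE)


def duration_to_ns(text):
    """Sum every <digits><unit> pair of a duration such as '1d2h'."""
    if not text:
        raise ValueError("empty duration")
    total = 0
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration at offset {pos}: {text!r}")
        total += int(match.group(1)) * NS_PER_UNIT[match.group(2).lower()]
        pos = match.end()
    return total
```

For instance, `duration_to_ns("1d2h")` adds one day and two hours, which is 93,600 seconds expressed in nanoseconds. Because `1y` and `1mo` are fixed lengths in the table, the result is a plain interval rather than a calendar-aware offset.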
- - 例子: - - ``` - now() - 1d2h //比服务器时间早 1 天 2 小时的时间 - now() - 1w //比服务器时间早 1 周的时间 - ``` - - > 注意:'+'和'-'的左右两边必须有空格 \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/Background-knowledge/Navigating_Time_Series_Data_timecho.md b/src/zh/UserGuide/latest-Table/Background-knowledge/Navigating_Time_Series_Data_timecho.md deleted file mode 100644 index 0f7011af7..000000000 --- a/src/zh/UserGuide/latest-Table/Background-knowledge/Navigating_Time_Series_Data_timecho.md +++ /dev/null @@ -1,45 +0,0 @@ - -# 时序数据模型 - -## 1. 什么叫时序数据? - -万物互联的今天,物联网场景、工业场景等各类场景都在进行数字化转型,人们通过在各类设备上安装传感器对设备的各类状态进行采集。如电机采集电压、电流,风机的叶片转速、角速度、发电功率;车辆采集经纬度、速度、油耗;桥梁的振动频率、挠度、位移量等。传感器的数据采集,已经渗透在各个行业中。 - -![](/img/%E6%97%B6%E5%BA%8F%E6%95%B0%E6%8D%AE%E4%BB%8B%E7%BB%8D.png) - - -通常来说,我们把每个采集点位叫做一个**测点( 也叫物理量、时间序列、时间线、信号量、指标、测量值等)**,每个测点都在随时间的推移不断收集到新的数据信息,从而构成了一条**时间序列**。用表格的方式,每个时间序列就是一个由时间、值两列形成的表格;用图形化的方式,每个时间序列就是一个随时间推移形成的走势图,也可以形象的称之为设备的“心电图”。 - -![](/img/%E5%BF%83%E7%94%B5%E5%9B%BE1.png) - -传感器产生的海量时序数据是各行各业数字化转型的基础,因此我们对时序数据的模型梳理主要围绕设备、传感器展开。 - -## 2. 时序数据中的关键概念有哪些? - -时序数据中主要涉及的概念如下。 - -| **设备(Device)** | 也称为实体、装备等,是实际场景中拥有物理量的设备或装置。在 IoTDB 中,实体是管理一组时间序列的集合,可以是一个物理设备、测量装置、传感器集合等。如:能源场景:风机,由区域、场站、线路、机型、实例等标识工厂场景:机械臂,由物联网平台生成的唯一 ID 标识车联网场景:车辆,由车辆识别代码 VIN 标识监控场景:CPU,由机房、机架、Hostname、设备类型等标识 | -| ------------------------------- |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| **测点(FIELD)** | 也称物理量、信号量、指标、点位、工况等,是在实际场景中检测装置记录的测量信息。通常一个物理量代表一个采集点位,能够定期采集所在环境的物理量。如:能源电力场景:电流、电压、风速、转速车联网场景:油量、车速、经度、维度工厂场景:温度、湿度
_表模型下**测点数量**等于所有表的测点数之和(每张表的测点数 = device 数量 * field 的列数),具体统计方法可参考_[元数据查询](../Basic-Concept/Table-Management_timecho.md#_1-7-元数据查询) | -| **数据点(Data Point)** | 由一个时间戳和一个数值组成,其中时间戳为 long 类型,数值可以为 BOOLEAN、FLOAT、INT32 等各种类型。如下图表格形式的时间序列的一行,或图形形式的时间序列的一个点,就是一个数据点。
| -| **采集频率(Frequency)** | 指物理量在一定时间内产生数据的次数。例如,一个温度传感器可能每秒钟采集一次温度数据,那么它的采集频率就是1Hz(赫兹),即每秒1次。 | -| **数据保存时间(TTL)** | TTL 指定表中数据的保存时间,超过 TTL 的数据将自动删除。IoTDB 支持对不同的表设定不同的数据存活时间,便于 IoTDB 定期、自动地删除一定时间之前的数据。合理使用 TTL 可以控制 IoTDB 占用的总磁盘空间,避免磁盘写满等异常,并维持较高的查询性能和减少内存资源占用。 | \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/Basic-Concept/Database-Management_timecho.md b/src/zh/UserGuide/latest-Table/Basic-Concept/Database-Management_timecho.md deleted file mode 100644 index 43b014c57..000000000 --- a/src/zh/UserGuide/latest-Table/Basic-Concept/Database-Management_timecho.md +++ /dev/null @@ -1,176 +0,0 @@ - - -# 数据库管理 - -## 1. 数据库管理 - -### 1.1 创建数据库 - -用于创建数据库。 - -**语法:** - -```SQL - CREATE DATABASE (IF NOT EXISTS)? (WITH properties)? -``` - -**说明:** - -1. 数据库名称,具有以下特性: - - 大小写不敏感,创建成功后,统一显示为小写 - - 名称的长度不得超过 64 个字符。 - - 名称中包含下划线(_)、数字(非开头)、英文字母可以直接创建 - - 名称中包含特殊字符(如`)、中文字符、数字开头时,必须用双引号 "" 括起来。 - -2. WITH properties 子句可配置如下属性: - -> 注:属性的大小写不敏感,有关详细信息[大小写敏感规则](../SQL-Manual/Identifier.md#大小写敏感性)。 - -| 属性 | 含义 | 默认值 | -| ------------------------- | ---------------------------------------- | --------- | -| `TTL` | 数据自动过期删除,单位 ms | INF | -| `TIME_PARTITION_INTERVAL` | 数据库的时间分区间隔,单位 ms | 604800000 | -| `SCHEMA_REGION_GROUP_NUM` | 数据库的元数据副本组数量,一般不需要修改 | 1 | -| `DATA_REGION_GROUP_NUM` | 数据库的数据副本组数量,一般不需要修改 | 2 | - -**示例:** - -```SQL -CREATE DATABASE IF NOT EXISTS database1 with(TTL=31536000000); -``` - -### 1.2 使用数据库 - -用于指定当前数据库作为表的命名空间。 - -**语法:** - -```SQL -USE -``` - -**示例:** - -```SQL -USE database1; -``` - -### 1.3 查看当前数据库 - -返回当前会话所连接的数据库名称,若未执行过 `use`语句指定数据库,则默认为 `null`。 - -**语法:** - -```SQL -SHOW CURRENT_DATABASE -``` - -**示例:** - -```SQL -USE database1; -SHOW CURRENT_DATABASE; -``` -```shell -+---------------+ -|CurrentDatabase| -+---------------+ -| database1| -+---------------+ -``` - -### 1.4 查看所有数据库 - -用于查看所有数据库和数据库的属性信息。 - -**语法:** - -```SQL -SHOW DATABASES (DETAILS)? 
-``` - -**语句返回列含义如下:** - -| 列名 | 含义 | -| ----------------------- |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| database | database名称。 | -| TTL | 数据保留周期。如果在创建数据库的时候指定TTL,则TTL对该数据库下所有表的TTL生效。也可以再通过 [create table](../Basic-Concept/Table-Management_timecho.md#11-创建表) 、[alter table](../Basic-Concept/Table-Management_timecho.md#14-修改表) 来设置或更新表的TTL时间。 | -| SchemaReplicationFactor | 元数据副本数,用于确保元数据的高可用性。可以在`iotdb-system.properties`中修改`schema_replication_factor`配置项。 | -| DataReplicationFactor | 数据副本数,用于确保数据的高可用性。可以在`iotdb-system.properties`中修改`data_replication_factor`配置项。 | -| TimePartitionInterval | 时间分区间隔,决定了数据在磁盘上按多长时间进行目录分组,通常采用默认1周即可。 | -| SchemaRegionGroupNum | 使用`DETAILS`语句会返回此列,展示数据库的元数据副本组数量,一般不需要修改 | -| DataRegionGroupNum | 使用`DETAILS`语句会返回此列,展示数据库的数据副本组数量,一般不需要修改 | - -**示例:** - -```SQL -SHOW DATABASES DETAILS; -``` -```shell -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|DataRegionGroupNum| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| database1| INF| 1| 1| 604800000| 1| 2| -|information_schema| INF| null| null| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -``` - -### 1.5 修改数据库 - -用于修改数据库中的部分属性。 - -**语法:** - -```SQL -ALTER DATABASE (IF EXISTS)? database=identifier SET PROPERTIES propertyAssignments -``` - -**说明:** - -1. 
`ALTER DATABASE`操作目前仅支持对数据库的`SCHEMA_REGION_GROUP_NUM`、`DATA_REGION_GROUP_NUM`以及`TTL`属性进行修改。 - -**示例:** - -```SQL -ALTER DATABASE database1 SET PROPERTIES TTL=31536000000; -``` - -### 1.6 删除数据库 - -用于删除数据库。 - -**语法:** - -```SQL -DROP DATABASE (IF EXISTS)? -``` - -**说明:** - -1. 数据库已被设置为当前使用(use)的数据库,仍然可以被删除(drop)。 -2. 删除数据库将导致所选数据库及其内所有表连同其存储的数据一并被删除。 - -**示例:** - -```SQL -DROP DATABASE IF EXISTS database1; -``` diff --git a/src/zh/UserGuide/latest-Table/Basic-Concept/Query-Data_timecho.md b/src/zh/UserGuide/latest-Table/Basic-Concept/Query-Data_timecho.md deleted file mode 100644 index 48c47bc3d..000000000 --- a/src/zh/UserGuide/latest-Table/Basic-Concept/Query-Data_timecho.md +++ /dev/null @@ -1,592 +0,0 @@ - - -# 数据查询 - -## 1. 语法概览 - -```SQL -SELECT ⟨select_list⟩ - FROM ⟨tables⟩ | patternRecognition - [WHERE ⟨condition⟩] - [GROUP BY ⟨groups⟩] - [HAVING ⟨group_filter⟩] - [WINDOW windowDefinition (',' windowDefinition)*)] - [FILL ⟨fill_methods⟩] - [ORDER BY ⟨order_expression⟩] - [OFFSET ⟨n⟩] - [LIMIT ⟨n⟩]; -``` - -IoTDB 查询语法提供以下子句: - -- SELECT 子句:查询结果应包含的列。详细语法见:[SELECT子句](../SQL-Manual/Select-Clause_timecho.md) -- FROM 子句:指出查询的数据源,可以是单个表、多个通过 `JOIN` 子句连接的表,或者是一个子查询。详细语法见:[FROM & JOIN 子句](../SQL-Manual/From-Join-Clause.md) -- WHERE 子句:用于过滤数据,只选择满足特定条件的数据行。这个子句在逻辑上紧跟在 FROM 子句之后执行。详细语法见:[WHERE 子句](../SQL-Manual/Where-Clause.md) -- GROUP BY 子句:当需要对数据进行聚合时使用,指定了用于分组的列。详细语法见:[GROUP BY 子句](../SQL-Manual/GroupBy-Clause.md) -- HAVING 子句:在 GROUP BY 子句之后使用,用于对已经分组的数据进行过滤。与 WHERE 子句类似,但 HAVING 子句在分组后执行。详细语法见:[HAVING 子句](../SQL-Manual/Having-Clause.md) -- FILL 子句:用于处理查询结果中的空值,用户可以使用 FILL 子句来指定数据缺失时的填充模式(如前一个非空值或线性插值)来填充 null 值,以便于数据可视化和分析。 详细语法见:[FILL 子句](../SQL-Manual/Fill-Clause.md) -- ORDER BY 子句:对查询结果进行排序,可以指定升序(ASC)或降序(DESC),以及 NULL 值的处理方式(NULLS FIRST 或 NULLS LAST)。详细语法见:[ORDER BY 子句](../SQL-Manual/OrderBy-Clause.md) -- OFFSET 子句:用于指定查询结果的起始位置,即跳过前 OFFSET 行。与 LIMIT 子句配合使用。详细语法见:[LIMIT 和 OFFSET 子句](../SQL-Manual/Limit-Offset-Clause.md) -- LIMIT 子句:限制查询结果的行数,通常与 OFFSET 
子句一起使用以实现分页功能。详细语法见:[LIMIT 和 OFFSET 子句](../SQL-Manual/Limit-Offset-Clause.md) - -## 2. 子句执行顺序 - -![](/img/data-query-1.png) - - - -## 3. 常见查询示例 - -### 3.1 示例数据 - -在[示例数据页面](../Reference/Sample-Data.md)中,包含了用于构建表结构和插入数据的SQL语句,下载并在IoTDB CLI中执行这些语句,即可将数据导入IoTDB,您可以使用这些数据来测试和执行示例中的SQL语句,并获得相应的结果。 - -### 3.2 原始数据查询 - -**示例1:根据时间过滤** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE time >= 2024-11-27 00:00:00 and time <= 2024-11-29 00:00:00; -``` - -执行结果如下: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-28T08:00:00.000+08:00| 85.0| null| -|2024-11-28T09:00:00.000+08:00| null| 40.9| -|2024-11-28T10:00:00.000+08:00| 85.0| 35.2| -|2024-11-28T11:00:00.000+08:00| 88.0| 45.1| -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:40:00.000+08:00| 85.0| null| -|2024-11-27T16:41:00.000+08:00| 85.0| null| -|2024-11-27T16:42:00.000+08:00| null| 35.2| -|2024-11-27T16:43:00.000+08:00| null| null| -|2024-11-27T16:44:00.000+08:00| null| null| -+-----------------------------+-----------+--------+ -Total line number = 11 -It costs 0.075s -``` - -**示例2:根据值过滤** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE temperature > 89.0; -``` - -执行结果如下: - -```SQL -+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-29T18:30:00.000+08:00| 90.0| 35.4| -|2024-11-26T13:37:00.000+08:00| 90.0| 35.1| -|2024-11-26T13:38:00.000+08:00| 90.0| 35.1| -|2024-11-30T09:30:00.000+08:00| 90.0| 35.2| -|2024-11-30T14:30:00.000+08:00| 90.0| 34.8| -+-----------------------------+-----------+--------+ -Total line number = 5 -It costs 0.156s -``` - -**示例3:根据属性过滤** - -```SQL -IoTDB> SELECT time, temperature, humidity - FROM table1 - WHERE model_id ='B'; -``` - -执行结果如下: - -```SQL 
-+-----------------------------+-----------+--------+ -| time|temperature|humidity| -+-----------------------------+-----------+--------+ -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-27T16:39:00.000+08:00| 85.0| 35.3| -|2024-11-27T16:40:00.000+08:00| 85.0| null| -|2024-11-27T16:41:00.000+08:00| 85.0| null| -|2024-11-27T16:42:00.000+08:00| null| 35.2| -|2024-11-27T16:43:00.000+08:00| null| null| -|2024-11-27T16:44:00.000+08:00| null| null| -+-----------------------------+-----------+--------+ -Total line number = 7 -It costs 0.106s -``` - -**示例4:多设备按时间对齐查询** - -```SQL -IoTDB> SELECT date_bin_gapfill(1d, TIME) AS a_time, - device_id, - AVG(temperature) AS avg_temp - FROM table1 - WHERE TIME >= 2024-11-26 13:00:00 - AND TIME <= 2024-11-27 17:00:00 - GROUP BY 1, device_id FILL METHOD PREVIOUS; -``` - -执行结果如下: - -```SQL -+-----------------------------+---------+--------+ -| a_time|device_id|avg_temp| -+-----------------------------+---------+--------+ -|2024-11-26T08:00:00.000+08:00| 100| 90.0| -|2024-11-27T08:00:00.000+08:00| 100| 90.0| -|2024-11-26T08:00:00.000+08:00| 101| 90.0| -|2024-11-27T08:00:00.000+08:00| 101| 85.0| -+-----------------------------+---------+--------+ -Total line number = 4 -It costs 0.048s -``` - -### 3.3 聚合查询 - -**示例:查询计算了在指定时间范围内,每个`device_id`的平均温度、最高温度和最低温度。** - -```SQL -IoTDB> SELECT device_id, AVG(temperature) as avg_temp, MAX(temperature) as max_temp, MIN(temperature) as min_temp - FROM table1 - WHERE time >= 2024-11-26 00:00:00 AND time <= 2024-11-29 00:00:00 - GROUP BY device_id; -``` - -执行结果如下: - -```SQL -+---------+--------+--------+--------+ -|device_id|avg_temp|max_temp|min_temp| -+---------+--------+--------+--------+ -| 100| 87.6| 90.0| 85.0| -| 101| 85.0| 85.0| 85.0| -+---------+--------+--------+--------+ -Total line number = 2 -It costs 0.278s -``` - -### 3.4 最新点查询 - -**示例:查询表中每个 `device_id` 返回最后一条记录,包含该记录的温度值以及在该设备中基于时间和温度排序的最后一条记录。** - -```SQL -IoTDB> SELECT device_id,last(time),last_by(temperature,time) - FROM 
table1 - GROUP BY device_id; -``` - -执行结果如下: - -```SQL -+---------+-----------------------------+-----+ -|device_id| _col1|_col2| -+---------+-----------------------------+-----+ -| 100|2024-11-29T18:30:00.000+08:00| 90.0| -| 101|2024-11-30T14:30:00.000+08:00| 90.0| -+---------+-----------------------------+-----+ -Total line number = 2 -It costs 0.090s -``` - -### 3.5 降采样查询(date_bin 函数) - -**示例:查询将时间按天分组,并计算每天的平均温度。** - -```SQL -IoTDB> SELECT device_id,date_bin(1d ,time) as day_time, AVG(temperature) as avg_temp - FROM table1 - WHERE time >= 2024-11-26 00:00:00 AND time <= 2024-11-30 00:00:00 - GROUP BY device_id,date_bin(1d ,time); -``` - -执行结果如下: - -```SQL -+---------+-----------------------------+--------+ -|device_id| day_time|avg_temp| -+---------+-----------------------------+--------+ -| 100|2024-11-29T08:00:00.000+08:00| 90.0| -| 100|2024-11-28T08:00:00.000+08:00| 86.0| -| 100|2024-11-26T08:00:00.000+08:00| 90.0| -| 101|2024-11-29T08:00:00.000+08:00| 85.0| -| 101|2024-11-27T08:00:00.000+08:00| 85.0| -+---------+-----------------------------+--------+ -Total line number = 5 -It costs 0.066s -``` - -### 3.6 多设备降采样对齐查询 - -#### 3.6.1 采样频率相同,时间不同 - -**table1:采样频率:1s** - -| Time | device_id | temperature | -| ------------ | --------- | ----------- | -| 00:00:00.001 | d1 | 90.0 | -| 00:00:01.002 | d1 | 85.0 | -| 00:00:02.101 | d1 | 85.0 | -| 00:00:03.201 | d1 | null | -| 00:00:04.105 | d1 | 90.0 | -| 00:00:05.023 | d1 | 85.0 | -| 00:00:06.129 | d1 | 90.0 | - -**table2:采样频率:1s** - -| Time | device_id | humidity | -| ------------ | --------- | -------- | -| 00:00:00.003 | d1 | 35.1 | -| 00:00:01.012 | d1 | 37.2 | -| 00:00:02.031 | d1 | null | -| 00:00:03.134 | d1 | 35.2 | -| 00:00:04.201 | d1 | 38.2 | -| 00:00:05.091 | d1 | 35.4 | -| 00:00:06.231 | d1 | 35.1 | - -**示例:查询`table1`的降采样数据:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS a_time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND 
TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**结果:** - -```SQL -+-----------------------------+-------+ -| a_time|a_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| -|2025-05-13T00:00:01.000+08:00| 85.0| -|2025-05-13T00:00:02.000+08:00| 85.0| -|2025-05-13T00:00:03.000+08:00| 85.0| -|2025-05-13T00:00:04.000+08:00| 90.0| -|2025-05-13T00:00:05.000+08:00| 85.0| -|2025-05-13T00:00:06.000+08:00| 90.0| -+-----------------------------+-------+ -``` - -**示例:查询`table2`的降采样数据:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS b_time, - first(humidity) AS b_value - FROM table2 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**结果:** - -```SQL -+-----------------------------+-------+ -| b_time|b_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 35.1| -|2025-05-13T00:00:01.000+08:00| 37.2| -|2025-05-13T00:00:02.000+08:00| 37.2| -|2025-05-13T00:00:03.000+08:00| 35.2| -|2025-05-13T00:00:04.000+08:00| 38.2| -|2025-05-13T00:00:05.000+08:00| 35.4| -|2025-05-13T00:00:06.000+08:00| 35.1| -+-----------------------------+-------+ -``` - -**示例:按整点将多个序列进行时间对齐:** - -```SQL -IoTDB> SELECT time, - a_value, - b_value - FROM - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) A - JOIN - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(humidity) AS b_value - FROM table2 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) B - USING (time) -``` - -**结果:** - -```SQL -+-----------------------------+-------+-------+ -| time|a_value|b_value| -+-----------------------------+-------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| 35.1| 
-|2025-05-13T00:00:01.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:02.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:03.000+08:00| 85.0| 35.2| -|2025-05-13T00:00:04.000+08:00| 90.0| 38.2| -|2025-05-13T00:00:05.000+08:00| 85.0| 35.4| -|2025-05-13T00:00:06.000+08:00| 90.0| 35.1| -+-----------------------------+-------+-------+ -``` - -- 保留空值:当 `NULL` 值本身具有特殊含义,或希望保留数据的 null 值时,可以选择去掉 `FILL METHOD PREVIOUS` 不进行填充。 -**示例:** - -```SQL -IoTDB> SELECT time, - a_value, - b_value - FROM - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1) A - JOIN - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(humidity) AS b_value - FROM table2 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1) B - USING (time) -``` - -**结果:** - -```SQL -+-----------------------------+-------+-------+ -| time|a_value|b_value| -+-----------------------------+-------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| 35.1| -|2025-05-13T00:00:01.000+08:00| 85.0| 37.2| -|2025-05-13T00:00:02.000+08:00| 85.0| null| -|2025-05-13T00:00:03.000+08:00| null| 35.2| -|2025-05-13T00:00:04.000+08:00| 90.0| 38.2| -|2025-05-13T00:00:05.000+08:00| 85.0| 35.4| -|2025-05-13T00:00:06.000+08:00| 90.0| 35.1| -+-----------------------------+-------+-------+ -``` -#### 3.6.2 采样频率不同,时间不同 - -**table1:采样频率:1s** - -| Time | device_id | temperature | -| ------------ | --------- | ----------- | -| 00:00:00.001 | d1 | 90.0 | -| 00:00:01.002 | d1 | 85.0 | -| 00:00:02.101 | d1 | 85.0 | -| 00:00:03.201 | d1 | null | -| 00:00:04.105 | d1 | 90.0 | -| 00:00:05.023 | d1 | 85.0 | -| 00:00:06.129 | d1 | 90.0 | - -**table3: 采样频率:2s** - -| Time | device_id | humidity | -| ------------ | --------- | -------- | -| 00:00:00.005 | d1 | 35.1 | -| 00:00:02.106 | d1 | 37.2 | -| 00:00:04.187 | d1 | null | -| 00:00:06.156 | d1 | 35.1 | - 
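As a rough mental model of how the samples listed above are aligned to whole seconds, the Python sketch below buckets points into fixed-width bins, keeps the first non-null value per bin, and fills empty bins with the previous value. It mirrors the combination of `date_bin_gapfill(1s, TIME)`, `first(...)` and `FILL METHOD PREVIOUS` used in this section, but it is only an approximation of the idea: the function name is hypothetical, timestamps are plain millisecond offsets, the end of the range is treated as exclusive, and ignoring null values inside a bin is an assumption.

```python
def downsample_fill_previous(points, bin_ms, start_ms, end_ms):
    """points: iterable of (timestamp_ms, value or None).

    Returns one (bin_start_ms, value) row per bin in [start_ms, end_ms).
    """
    first_in_bin = {}
    for ts, value in sorted(points, key=lambda p: p[0]):
        bin_start = ts - ts % bin_ms  # floor the timestamp to its bin
        # Keep the earliest non-null value seen in each bin.
        if value is not None and bin_start not in first_in_bin:
            first_in_bin[bin_start] = value

    rows = []
    previous = None
    for bin_start in range(start_ms, end_ms, bin_ms):
        # A bin without a value inherits the previous bin's value.
        value = first_in_bin.get(bin_start, previous)
        rows.append((bin_start, value))
        previous = value
    return rows


# The 2s-frequency humidity samples of table3, as milliseconds since 00:00:00.
table3 = [(5, 35.1), (2106, 37.2), (4187, None), (6156, 35.1)]
aligned = downsample_fill_previous(table3, bin_ms=1000, start_ms=0, end_ms=7000)
```

Each row of `aligned` corresponds to one whole second from 00:00:00 to 00:00:06, so a series sampled every 2 s can be joined row by row with a series sampled every 1 s.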
-**示例:查询`table1`的降采样数据:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS a_time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**结果:** - -```SQL -+-----------------------------+-------+ -| a_time|a_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 90.0| -|2025-05-13T00:00:01.000+08:00| 85.0| -|2025-05-13T00:00:02.000+08:00| 85.0| -|2025-05-13T00:00:03.000+08:00| 85.0| -|2025-05-13T00:00:04.000+08:00| 90.0| -|2025-05-13T00:00:05.000+08:00| 85.0| -|2025-05-13T00:00:06.000+08:00| 90.0| -+-----------------------------+-------+ -``` -**示例:查询`table3`的降采样数据:** - -```SQL -IoTDB> SELECT date_bin_gapfill(1s, TIME) AS c_time, - first(humidity) AS c_value - FROM table3 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS -``` - -**结果:** - -```SQL -+-----------------------------+-------+ -| c_time|c_value| -+-----------------------------+-------+ -|2025-05-13T00:00:00.000+08:00| 35.1| -|2025-05-13T00:00:01.000+08:00| 35.1| -|2025-05-13T00:00:02.000+08:00| 37.2| -|2025-05-13T00:00:03.000+08:00| 37.2| -|2025-05-13T00:00:04.000+08:00| 37.2| -|2025-05-13T00:00:05.000+08:00| 37.2| -|2025-05-13T00:00:06.000+08:00| 35.1| -+-----------------------------+-------+ -``` - -**示例:按照高采样频率进行对齐:** - -```SQL -IoTDB> SELECT time, - a_value, - c_value - FROM - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(temperature) AS a_value - FROM table1 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) A - JOIN - (SELECT date_bin_gapfill(1s, TIME) AS time, - first(humidity) AS c_value - FROM table3 - WHERE device_id = 'd1' - AND TIME >= 2025-05-13 00:00:00.000 - AND TIME <= 2025-05-13 00:00:07.000 - GROUP BY 1 FILL METHOD PREVIOUS) C - USING (time) -``` - 
-**Result:**
-
-```SQL
-+-----------------------------+-------+-------+
-|                         time|a_value|c_value|
-+-----------------------------+-------+-------+
-|2025-05-13T00:00:00.000+08:00|   90.0|   35.1|
-|2025-05-13T00:00:01.000+08:00|   85.0|   35.1|
-|2025-05-13T00:00:02.000+08:00|   85.0|   37.2|
-|2025-05-13T00:00:03.000+08:00|   85.0|   37.2|
-|2025-05-13T00:00:04.000+08:00|   90.0|   37.2|
-|2025-05-13T00:00:05.000+08:00|   85.0|   37.2|
-|2025-05-13T00:00:06.000+08:00|   90.0|   35.1|
-+-----------------------------+-------+-------+
-```
-
-### 3.7 Data Filling
-
-**Example: query the records whose `device_id` is '101' within the specified time range; missing data points are filled with the previous non-null value.**
-
-```SQL
-IoTDB> SELECT time, temperature, humidity
-  FROM table1
-  WHERE time >= 2024-11-26 00:00:00 and time <= 2024-11-30 11:00:00
-  AND region='北京' AND plant_id='1001' AND device_id='101'
-  FILL METHOD PREVIOUS;
-```
-
-The result is as follows:
-
-```SQL
-+-----------------------------+-----------+--------+
-|                         time|temperature|humidity|
-+-----------------------------+-----------+--------+
-|2024-11-27T16:38:00.000+08:00|       null|    35.1|
-|2024-11-27T16:39:00.000+08:00|       85.0|    35.3|
-|2024-11-27T16:40:00.000+08:00|       85.0|    35.3|
-|2024-11-27T16:41:00.000+08:00|       85.0|    35.3|
-|2024-11-27T16:42:00.000+08:00|       85.0|    35.2|
-|2024-11-27T16:43:00.000+08:00|       85.0|    35.2|
-|2024-11-27T16:44:00.000+08:00|       85.0|    35.2|
-+-----------------------------+-----------+--------+
-Total line number = 7
-It costs 0.101s
-```
-
-### 3.8 Sorting & Pagination
-
-**Example: query the records in the table sorted by humidity in descending order with NULL values last, skip the first 2 records, and return only the next 10 records.**
-
-```SQL
-IoTDB> SELECT time, temperature, humidity
-  FROM table1
-  ORDER BY humidity desc NULLS LAST
-  OFFSET 2
-  LIMIT 10;
-```
-
-The result is as follows:
-
-```SQL
-+-----------------------------+-----------+--------+
-|                         time|temperature|humidity|
-+-----------------------------+-----------+--------+
-|2024-11-28T09:00:00.000+08:00|       null|    40.9|
-|2024-11-29T18:30:00.000+08:00|       90.0|    35.4|
-|2024-11-27T16:39:00.000+08:00|       85.0|    35.3|
-|2024-11-28T10:00:00.000+08:00|       85.0|    35.2|
-|2024-11-30T09:30:00.000+08:00|       90.0|    35.2|
-|2024-11-27T16:42:00.000+08:00| null| 35.2| -|2024-11-26T13:38:00.000+08:00| 90.0| 35.1| -|2024-11-26T13:37:00.000+08:00| 90.0| 35.1| -|2024-11-27T16:38:00.000+08:00| null| 35.1| -|2024-11-30T14:30:00.000+08:00| 90.0| 34.8| -+-----------------------------+-----------+--------+ -Total line number = 10 -It costs 0.093s -``` diff --git a/src/zh/UserGuide/latest-Table/Basic-Concept/TTL-Delete-Data_timecho.md b/src/zh/UserGuide/latest-Table/Basic-Concept/TTL-Delete-Data_timecho.md deleted file mode 100644 index 3e6cb9778..000000000 --- a/src/zh/UserGuide/latest-Table/Basic-Concept/TTL-Delete-Data_timecho.md +++ /dev/null @@ -1,142 +0,0 @@ - - -# 数据保留时间 - -## 1. 概览 - -IoTDB支持对表(table)级别设置数据保留时间(TTL),允许系统自动定期删除旧数据,以有效控制磁盘空间并维护高性能查询和低内存占用。TTL默认以毫秒为单位,数据过期后不可查询且禁止写入,但物理删除会延迟至压缩时。需注意,TTL变更可能导致短暂数据可查询性变化。 - -**注意事项:** - -1. TTL设置为毫秒,不受配置文件时间精度影响。 -2. TTL变更可能影响数据的可查询性。 -3. 系统最终会移除过期数据,但存在延迟。 -4. TTL 判断数据是否过期依据的是数据点时间,非写入时间。 - -## 2. 设置 TTL - -在表模型中,IoTDB 的 TTL 是按照表的粒度生效的。可以直接在表上设置 TTL,或者在数据库级别设置 TTL。当在数据库级别设置了TTL时,在创建新表的过程中,系统会自动采用这个TTL值作为新表的默认设置,但每个表仍然可以独立地被设置或覆盖该值。 - -注意,如果数据库级别的TTL被修改,不会直接影响到已经存在的表的TTL设置。 - -### 2.1 为表设置 TTL - -如果在建表时通过sql语句设置了表的 TTL,则会以表的ttl为准。建表语句详情可见:[表管理](../Basic-Concept/Table-Management_timecho.md) - -示例1:创建表时设置 TTL - -```SQL -CREATE TABLE test3 ("场站" string id, "温度" int32) with (TTL=3600); -``` - -示例2:更改表语句设置TTL: - -```SQL -ALTER TABLE tableB SET PROPERTIES TTL=3600; -``` - -示例3:不指定TTL或设为默认值,它将与数据库的TTL相同,默认情况下是'INF'(无穷大): - -```SQL -CREATE TABLE test3 ("场站" string id, "温度" int32) with (TTL=DEFAULT); -CREATE TABLE test3 ("场站" string id, "温度" int32); -ALTER TABLE tableB set properties TTL=DEFAULT; -``` - -### 2.2 为数据库设置 TTL - -没有设置表的TTL,则会继承database的ttl。建数据库语句详情可见:[数据库管理](../Basic-Concept/Database-Management_timecho.md) - -示例4:数据库设置为 ttl =3600000,将生成一个ttl=3600000的表: - -```SQL -CREATE DATABASE db WITH (ttl=3600000); -use db; -CREATE TABLE test3 ("场站" string id, "温度" int32); -``` - -示例5:数据库不设置ttl,将生成一个没有ttl的表: - -```SQL -CREATE DATABASE db; -use db; 
-CREATE TABLE test3 ("场站" string id, "温度" int32); -``` - -示例6:数据库设置了ttl,但想显式设置没有TTL的表,可以将TTL设置为'INF': - -```SQL -CREATE DATABASE db WITH (ttl=3600000); -use db; -CREATE TABLE test3 ("场站" string id, "温度" int32) with (ttl='INF'); -``` - -## 3. 取消 TTL - -取消 TTL 设置,可以修改表的 TTL 设置为 'INF'。目前,IoTDB 不支持修改数据库的 TTL。 - -```SQL -ALTER TABLE tableB set properties TTL='INF'; -``` - -## 4. 查看 TTL 信息 - -使用 "SHOW DATABASES" 和 "SHOW TABLES" 命令可以直接显示数据库和表的 TTL 详情。数据库和表管理语句详情可见:[数据库管理](../Basic-Concept/Database-Management_timecho.md)、[表管理](../Basic-Concept/Table-Management_timecho.md) - -> 注意,树模型数据库的TTL也将显示。 - -```SQL -IoTDB> show databases; -+---------+-------+-----------------------+---------------------+---------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval| -+---------+-------+-----------------------+---------------------+---------------------+ -|test_prop| 300| 1| 3| 100000| -| test2| 300| 1| 1| 604800000| -+---------+-------+-----------------------+---------------------+---------------------+ - -IoTDB> show databases details; -+---------+-------+-----------------------+---------------------+---------------------+-----+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|Model| -+---------+-------+-----------------------+---------------------+---------------------+-----+ -|test_prop| 300| 1| 3| 100000|TABLE| -| test2| 300| 1| 1| 604800000| TREE| -+---------+-------+-----------------------+---------------------+---------------------+-----+ - -IoTDB> show tables; -+---------+-------+ -|TableName|TTL(ms)| -+---------+-------+ -| grass| 1000| -| bamboo| 300| -| flower| INF| -+---------+-------+ - -IoTDB> show tables details; -+---------+-------+----------+ -|TableName|TTL(ms)| Status| -+---------+-------+----------+ -| bean| 300|PRE_CREATE| -| grass| 1000| USING| -| bamboo| 300| USING| -| flower| INF| USING| -+---------+-------+----------+ -``` \ No newline at end of file diff --git 
a/src/zh/UserGuide/latest-Table/Basic-Concept/Table-Management_timecho.md b/src/zh/UserGuide/latest-Table/Basic-Concept/Table-Management_timecho.md deleted file mode 100644 index c2dca4beb..000000000 --- a/src/zh/UserGuide/latest-Table/Basic-Concept/Table-Management_timecho.md +++ /dev/null @@ -1,339 +0,0 @@ - - -# 表管理 - -在开始使用表管理功能前,推荐您先了解以下相关预备知识,以便更好地理解和应用表管理功能: -* [时序数据模型](../Background-knowledge/Navigating_Time_Series_Data_timecho.md):了解时序数据的基本概念与特点,帮助建立建模基础。 -* [建模方案设计](../Background-knowledge/Data-Model-and-Terminology_timecho.md):掌握 IoTDB 时序模型及适用场景,为表管理提供设计基础。 - -## 1. 表管理 - -### 1.1 创建表 - -#### 1.1.1 通过 Create 语句手动创建表 - -用于在当前数据库中创建表,也可以对任何指定数据库创建表,格式为“数据库名.表名”。 - -**语法:** - -```SQL -createTableStatement - : CREATE TABLE (IF NOT EXISTS)? qualifiedName - '(' (columnDefinition (',' columnDefinition)*)? ')' - charsetDesc? - comment? - (WITH properties)? - ; - -charsetDesc - : DEFAULT? (CHAR SET | CHARSET | CHARACTER SET) EQ? identifierOrString - ; - -columnDefinition - : identifier columnCategory=(TAG | ATTRIBUTE | TIME) charsetName? comment? - | identifier type (columnCategory=(TAG | ATTRIBUTE | TIME | FIELD))? charsetName? comment? - ; - -charsetName - : CHAR SET identifier - | CHARSET identifier - | CHARACTER SET identifier - ; - -comment - : COMMENT string - ; -``` - -**说明:** - -1. 在创建表时,可以不指定时间列(TIME),IoTDB会自动添加该列并命名为"time", 且顺序上位于第一列。其他所有列可以通过在数据库配置时启用`enable_auto_create_schema`选项,或通过 session 接口自动创建或修改表的语句来添加。 -2. 自 V2.0.8.2 版本起,支持创建表时自定义命名时间列,自定义时间列在表中的顺序由创建 SQL 中的顺序决定。相关约束如下: -- 当列分类(columnCategory)设为 TIME 时,数据类型(dataType)必须为 TIMESTAMP。 -- 每张表最多允许定义 1个时间列(columnCategory = TIME)。 -- 当未显式定义时间列时,不允许其他列使用 time 作为名称,否则会与系统默认时间列命名冲突。 -3. 列的类别可以省略,默认为`FIELD`。当列的类别为`TAG`或`ATTRIBUTE`时,数据类型需为`STRING`(可省略)。 -4. 表的TTL默认为其所在数据库的TTL。如果使用默认值,可以省略此属性,或将其设置为`default`。 -5. 
表名称,具有以下特性: - - 大小写不敏感,创建成功后,统一显示为小写 - - 名称可包含特殊字符,如 `~!`"%` 等 - - 包含特殊字符或中文字符的表名创建时必须用双引号 "" 括起来。 - - 注意:SQL中特殊字符或中文表名需加双引号。原生API中无需额外添加,否则表名会包含引号字符。 - - 当为表命名时,最外层的双引号(`""`)不会在实际创建的表名中出现。 - - - ```shell - -- SQL 中 - "a""b" --> a"b - """""" --> "" - -- API 中 - "a""b" --> "a""b" - ``` -6. columnDefinition 列名称与表名称具有相同特性,并且可包含特殊字符`.`。 -7. COMMENT 给表添加注释。 - -**示例:** - -```SQL -CREATE TABLE table1 ( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE COMMENT 'maintenance', - temperature FLOAT FIELD COMMENT 'temperature', - humidity FLOAT FIELD COMMENT 'humidity', - status Boolean FIELD COMMENT 'status', - arrival_time TIMESTAMP FIELD COMMENT 'arrival_time' -) COMMENT 'table1' WITH (TTL=31536000000); -``` - -注意:若您使用的终端不支持多行粘贴(例如 Windows CMD),请将 SQL 语句调整为单行格式后再执行。 - -### 1.2 查看表 - -用于查看该数据库中或指定数据库中的所有表和表库的属性信息。 - -**语法:** - -```SQL -SHOW TABLES (DETAILS)? ((FROM | IN) database_name)? -``` - -**说明:** - -1. 在查询中指定了`FROM`或`IN`子句时,系统将展示指定数据库内的所有表。 -2. 如果未指定`FROM`或`IN`子句,系统将展示当前选定数据库中的所有表。如果用户未使用(use)某个数据库空间,系统将报错。 -3. 请求显示详细信息(指定`DETAILS`),系统将展示表的当前状态,包括: - - `USING`:表示表处于正常可用状态。 - - `PRE_CREATE`:表示表正在创建中或创建失败,此时表不可用。 - - `PRE_DELETE`:表示表正在删除中或删除失败,此类表将永久不可用。 - -**示例:** - -```sql -show tables details from database1; -``` -```shell -+---------------+-----------+------+-------+ -| TableName| TTL(ms)|Status|Comment| -+---------------+-----------+------+-------+ -| table1|31536000000| USING| table1| -+---------------+-----------+------+-------+ -``` - -### 1.3 查看表的列 - -用于查看表的列名、数据类型、类别、状态。 - -**语法:** - -```SQL -(DESC | DESCRIBE) (DETAILS)? 
-```
-
-**Notes:**
-
-- When the `DETAILS` option is set, the system also shows the detailed status of each column:
-  - `USING`: the column is in normal use.
-  - `PRE_DELETE`: the column is being deleted or its deletion failed; the column is permanently unusable.
-
-**Example:**
-
-```sql
-desc table1 details;
-```
-```shell
-+------------+---------+---------+------+------------+
-|  ColumnName| DataType| Category|Status|     Comment|
-+------------+---------+---------+------+------------+
-|        time|TIMESTAMP|     TIME| USING|        null|
-|      region|   STRING|      TAG| USING|        null|
-|    plant_id|   STRING|      TAG| USING|        null|
-|   device_id|   STRING|      TAG| USING|        null|
-|    model_id|   STRING|ATTRIBUTE| USING|        null|
-| maintenance|   STRING|ATTRIBUTE| USING| maintenance|
-| temperature|    FLOAT|    FIELD| USING| temperature|
-|    humidity|    FLOAT|    FIELD| USING|    humidity|
-|      status|  BOOLEAN|    FIELD| USING|      status|
-|arrival_time|TIMESTAMP|    FIELD| USING|arrival_time|
-+------------+---------+---------+------+------------+
-```
-
-
-### 1.4 Show the Table Creation Statement
-
-Returns the complete definition statement of a table or view in the table model. All default values omitted at creation time are filled in automatically, so the statement shown in the result set may differ from the original CREATE statement.
-
-> Supported since V2.0.5
-
-**Syntax:**
-
-```SQL
-SHOW CREATE TABLE 
-```
-
-**Notes:**
-
-1. 
This statement does not support querying system tables.
-
-**Example:**
-
-```SQL
-show create table table1;
-```
-```shell
-+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-| Table| Create Table|
-+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-|table1|CREATE TABLE "table1" ("region" STRING TAG,"plant_id" STRING TAG,"device_id" STRING TAG,"model_id" STRING ATTRIBUTE,"maintenance" STRING ATTRIBUTE,"temperature" FLOAT FIELD,"humidity" FLOAT FIELD,"status" BOOLEAN FIELD,"arrival_time" TIMESTAMP FIELD) WITH (ttl=31536000000)|
-+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-```
-
-
-### 1.5 Alter a Table
-
-Modifies a table: adding columns, dropping columns, changing a column's data type (V2.0.8.2), and setting table properties.
-
-**Syntax:**
-
-```SQL
-#addColumn;
-ALTER TABLE (IF EXISTS)? tableName=qualifiedName ADD COLUMN (IF NOT EXISTS)? column=columnDefinition COMMENT 'column_comment';
-#dropColumn;
-ALTER TABLE (IF EXISTS)? tableName=qualifiedName DROP COLUMN (IF EXISTS)? column=identifier;
-#setTableProperties;
-// set TTL can use this;
-| ALTER TABLE (IF EXISTS)? tableName=qualifiedName SET PROPERTIES propertyAssignments;
-| COMMENT ON TABLE tableName=qualifiedName IS 'table_comment';
-| COMMENT ON COLUMN tableName.column IS 'column_comment';
-#changeColumndatatype;
-| ALTER TABLE (IF EXISTS)? tableName=qualifiedName ALTER COLUMN (IF EXISTS)? 
column=identifier SET DATA TYPE new_type=type;
-```
-
-**Notes:**
-
-1. `SET PROPERTIES` currently supports configuring only the TTL property of a table.
-2. Only attribute columns (ATTRIBUTE) and field columns (FIELD) can be dropped; tag columns (TAG) cannot.
-3. A modified comment overwrites the existing one; specifying null erases the previous comment.
-4. Changing a column's data type is supported since V2.0.8.2, and currently only for columns whose category is FIELD.
-   - If the time series is deleted concurrently while the change is in progress, an error is reported.
-   - The new type must be compatible with the original type, as listed in the following table:
-
-| Original type | Can be changed to |
-| ----------- | ----------------------------------------------- |
-| INT32 | INT64, FLOAT, DOUBLE, TIMESTAMP, STRING, TEXT |
-| INT64 | TIMESTAMP, DOUBLE, STRING, TEXT |
-| FLOAT | DOUBLE, STRING, TEXT |
-| DOUBLE | STRING, TEXT |
-| BOOLEAN | STRING, TEXT |
-| TEXT | BLOB, STRING |
-| STRING | TEXT, BLOB |
-| BLOB | STRING, TEXT |
-| DATE | STRING, TEXT |
-| TIMESTAMP | INT64, DOUBLE, STRING, TEXT |
-
-
-**Examples:**
-
-Add tag column a to table1
-```SQL
-ALTER TABLE table1 ADD COLUMN IF NOT EXISTS a TAG COMMENT 'a';
-```
-Add field column b to table1
-```SQL
-ALTER TABLE table1 ADD COLUMN IF NOT EXISTS b FLOAT FIELD COMMENT 'b';
-```
-Change the TTL of table1
-```SQL
-ALTER TABLE table1 set properties TTL=3600;
-```
-Add a comment to table1
-```SQL
-COMMENT ON TABLE table1 IS 'table1';
-```
-Remove the comment from column a of table1
-```SQL
-COMMENT ON COLUMN table1.a IS null;
-```
-Change the data type of column b of table1
-```SQL
-ALTER TABLE table1 ALTER COLUMN IF EXISTS b SET DATA TYPE DOUBLE;
-```
-
-### 1.6 Drop a Table
-
-Deletes a table.
-
-**Syntax:**
-
-```SQL
-DROP TABLE (IF EXISTS)? 
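-```
-
-**Example:**
-
-```SQL
-DROP TABLE table1;
-```
-
-
-### 1.7 Metadata Queries
-
-In the table model, the **number of measurement points** equals the sum of the measurement points of all tables. For now, the count for a single table is computed with the formula **measurement points per table = number of devices × number of field columns**; a SQL statement for querying this count directly is planned for a later release.
-
-Take table1 from the [sample data](../Reference/Sample-Data.md) as an example.
-
-The sample schema has three tag columns (region for the region, plant_id for the plant, device_id for the machine) and four field columns (temperature, humidity, status, and arrival_time).
-
-A device is uniquely identified by the combination of all tag columns: every distinct combination of region + plant_id + device_id is a separate device.
-
-The sample data defines 2 regions, Beijing (北京) and Shanghai (上海):
-
-* Beijing: 1 plant, plant ID 1001;
-  * this plant has 2 devices, with IDs 100 and 101;
-* Shanghai: 2 plants, with IDs 3001 and 3002;
-  * plant 3001 has 2 devices: 100 and 101;
-  * plant 3002 has 2 devices: 100 and 101.
-
-In total the table contains 6 unique tag combinations, corresponding to 6 separate devices.
-
-**Complete example of computing the measurement points of a single table:**
-
-1. Query the number of devices
-
-```sql
-count devices from table1;
-```
-```shell
-+--------------+
-|count(devices)|
-+--------------+
-|             6|
-+--------------+
-```
-
-2. Compute the number of measurement points of the table
-- Number of devices: 6
-- Number of field columns: 4
-- Total measurement points of the table: 6 × 4 = 24
-
diff --git a/src/zh/UserGuide/latest-Table/Basic-Concept/Write-Updata-Data_timecho.md b/src/zh/UserGuide/latest-Table/Basic-Concept/Write-Updata-Data_timecho.md
deleted file mode 100644
index 58ea9299d..000000000
--- a/src/zh/UserGuide/latest-Table/Basic-Concept/Write-Updata-Data_timecho.md
+++ /dev/null
@@ -1,383 +0,0 @@
-
-
-# Write & Update
-
-## 1. SQL Writes
-
-### 1.1 Syntax
-
-In IoTDB, data writes follow this general syntax:
-
-```SQL
-INSERT INTO [(COLUMN_NAME[, COLUMN_NAME]*)]? VALUES (COLUMN_VALUE[, COLUMN_VALUE]*)
-```
-
-Basic constraints:
-
-1. Writing with an INSERT statement cannot create a table automatically.
-2. Tag columns that are not specified are filled with `null` automatically.
-3. If no timestamp is included, the system fills it with the current time `now()`.
-4. If the current device (located by its tag values) has no value for an attribute column, the write replaces the existing NULL with the written data.
-5. If the current device (located by its tag values) already has a value for an attribute column, writing the same attribute column again updates it to the new value.
-6. Writing a duplicate timestamp updates the values of the corresponding columns at that timestamp.
-7. If the INSERT statement does not specify column names (for example INSERT INTO table VALUES (...)), the values in VALUES must strictly follow the physical order of the table's columns (the order can be checked with the DESC table command).
-
-Because attributes generally do not change over time, we recommend updating attribute values separately with UPDATE; see [Data Update](#_3-数据更新) below.
-
-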
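The constraints numbered 3, 5, and 6 in the list above can be illustrated with a minimal sketch. It assumes the `table1` schema from the sample data; the literal values are made up for illustration, and the comments only restate the rules listed above rather than output verified against a running server.

```SQL
-- Constraint 3: the time column is omitted, so the system fills it with now().
INSERT INTO table1(region, plant_id, device_id, temperature) VALUES ('北京', '1001', '100', 90.0);

-- Constraint 6: two writes with the same timestamp for the same device.
-- The second write updates temperature at that timestamp (illustrative values).
INSERT INTO table1(region, plant_id, device_id, time, temperature) VALUES ('北京', '1001', '100', '2025-11-26 13:37:00', 90.0);
INSERT INTO table1(region, plant_id, device_id, time, temperature) VALUES ('北京', '1001', '100', '2025-11-26 13:37:00', 95.0);

-- Constraint 5: writing the attribute column model_id again for the same device
-- replaces the attribute value stored for that device ('B' is a hypothetical value).
INSERT INTO table1(region, plant_id, device_id, model_id, time, temperature) VALUES ('北京', '1001', '100', 'B', '2025-11-26 13:38:00', 91.0);
```

Since the last statement changes an attribute as a side effect of a data write, an UPDATE statement is usually the clearer choice when only the attribute needs to change.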
- -
- - -### 1.2 指定列写入 - -在写入操作中,可以指定部分列,未指定的列将不会被写入任何内容(即设置为 `null`)。 - -**示例:** - -```SQL -INSERT INTO table1(region, plant_id, device_id, time, temperature, humidity) VALUES ('北京', '1001', '100', '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, time, temperature) VALUES ('北京', '1001', '100', '2025-11-26 13:38:00', 91.0); -``` - -### 1.3 空值写入 - -标签列、属性列和测点列可以指定空值(`null`),表示不写入任何内容。 - -**示例(与上述示例等价):** - -```SQL -# 上述部分列写入等价于如下的带空值写入; -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('北京', '1001', '100', null, null, '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('北京', '1001', '100', null, null, '2025-11-26 13:38:00', 91.0, null); -``` - -当向不包含任何标签列的表中写入数据时,系统将默认创建一个所有标签列值均为 null 的device。 - -> 注意,该操作不仅会自动为表中已有的标签列填充 null 值,而且对于未来新增的标签列,同样会自动填充 null。 - -### 1.4 多行写入 - -支持同时写入多行数据,提高数据写入效率。 - -**示例:** - -```SQL -INSERT INTO table1 -VALUES -('2025-11-26 13:37:00', '北京', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('2025-11-26 13:38:00', '北京', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:38:25'); - -INSERT INTO table1 -(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) -VALUES -('北京', '1001', '100', 'A', '180', '2025-11-26 13:37:00', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('北京', '1001', '100', 'A', '180', '2025-11-26 13:38:00', 90.0, 35.1, true, '2025-11-26 13:38:25'); -``` - -#### 注意事项 - -- 如果在 SQL 语句中引用了表中不存在的列,IoTDB 将返回错误码 `COLUMN_NOT_EXIST(616)`。 -- 如果写入的数据类型与表中列的数据类型不一致,将报错 `DATA_TYPE_MISMATCH(507)`。 - -### 1.5 查询写回 - -IoTDB 表模型支持追加查询写回功能,即`INSERT INTO QUERY` 语句,支持将查询结果写入**已经存在**的表中。 - -> 注意:该功能从 V 2.0.6 版本开始提供。 - -#### 1.5.1 语法定义 - -```SQL -INSERT INTO table_name [ ( column [, ... 
] ) ] query
-```
-
-Here **query** supports three forms, each illustrated below.
-
-Using the [sample data](../Reference/Sample-Data.md) as the source, first create the target table
-
-```SQL
-IoTDB:database1> CREATE TABLE target_table ( time TIMESTAMP TIME, region STRING TAG, device_id STRING TAG, temperature FLOAT FIELD );
-Msg: The statement is executed successfully.
-```
-
-1. Write-back from a standard query
-
-Here query is a query executed directly as `select ... from ...`.
-
-For example: use a standard query to write the time, region, device\_id, and temperature data of the Beijing (北京) region in table1 back into target\_table
-
-```SQL
-insert into target_table select time,region,device_id,temperature from table1 where region = '北京';
-Msg: The statement is executed successfully.
-```
-```sql
-select * from target_table where region='北京';
-```
-```shell
-+-----------------------------+------+---------+-----------+
-|                         time|region|device_id|temperature|
-+-----------------------------+------+---------+-----------+
-|2024-11-26T13:37:00.000+08:00|  北京|      100|       90.0|
-|2024-11-26T13:38:00.000+08:00|  北京|      100|       90.0|
-|2024-11-27T16:38:00.000+08:00|  北京|      101|       null|
-|2024-11-27T16:39:00.000+08:00|  北京|      101|       85.0|
-|2024-11-27T16:40:00.000+08:00|  北京|      101|       85.0|
-|2024-11-27T16:41:00.000+08:00|  北京|      101|       85.0|
-|2024-11-27T16:42:00.000+08:00|  北京|      101|       null|
-|2024-11-27T16:43:00.000+08:00|  北京|      101|       null|
-|2024-11-27T16:44:00.000+08:00|  北京|      101|       null|
-+-----------------------------+------+---------+-----------+
-Total line number = 9
-It costs 0.029s
-```
-
-2. Write-back from a table reference
-
-Here query is a table reference of the form `table source_table`.
-
-For example: use a table reference to write the data in table3 back into target\_table
-
-```SQL
-insert into target_table(time,device_id,temperature) table table3;
-Msg: The statement is executed successfully. 
-```
-```sql
-select * from target_table where region is null;
-```
-```shell
-+-----------------------------+------+---------+-----------+
-|                         time|region|device_id|temperature|
-+-----------------------------+------+---------+-----------+
-|2025-05-13T00:00:00.001+08:00|  null|       d1|       90.0|
-|2025-05-13T00:00:01.002+08:00|  null|       d1|       85.0|
-|2025-05-13T00:00:02.101+08:00|  null|       d1|       85.0|
-|2025-05-13T00:00:03.201+08:00|  null|       d1|       null|
-|2025-05-13T00:00:04.105+08:00|  null|       d1|       90.0|
-|2025-05-13T00:00:05.023+08:00|  null|       d1|       85.0|
-|2025-05-13T00:00:06.129+08:00|  null|       d1|       90.0|
-+-----------------------------+------+---------+-----------+
-Total line number = 7
-It costs 0.015s
-```
-
-3. Write-back from a subquery
-
-Here query is a parenthesized subquery.
-
-For example: use a subquery to write back into target\_table the time, region, device\_id, and temperature of the rows in table1 whose time matches a Shanghai (上海) region record in table2
-
-```SQL
-insert into target_table (select t1.time, t1.region as region, t1.device_id as device_id, t1.temperature as temperature from table1 t1 where t1.time in (select t2.time from table2 t2 where t2.region = '上海'));
-Msg: The statement is executed successfully. 
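-```
-```sql
-select * from target_table where region = '上海';
-```
-```shell
-+-----------------------------+------+---------+-----------+
-|                         time|region|device_id|temperature|
-+-----------------------------+------+---------+-----------+
-|2024-11-28T08:00:00.000+08:00|  上海|      100|       85.0|
-|2024-11-29T11:00:00.000+08:00|  上海|      100|       null|
-+-----------------------------+------+---------+-----------+
-Total line number = 2
-It costs 0.014s
-```
-
-#### 1.5.2 Notes
-
-* The source table in query and the target table table\_name may be the same table, for example `INSERT INTO testtb SELECT * FROM testtb`.
-* The target table must already exist; otherwise the error `550: Table 'xxx.xxx' does not exist` is returned.
-* The number and types of the columns returned by the query must match the target columns exactly. The Object type is currently not supported, and compatible types are not converted; on a type mismatch the error `701: Insert query has mismatched column types` is returned.
-* A subset of the target table's columns may be specified. When specifying target column names, the following rules apply:
-  * The timestamp column must be included; otherwise the error `701: time column can not be null` is returned
-  * At least one FIELD column must be included; otherwise the error `701: No Field column present` is returned
-  * TAG columns may be omitted
-  * Fewer columns than the target table has may be specified; the missing columns are filled with NULL
-* Java supports executing `INSERT INTO QUERY` with the [executeNonQueryStatement](../API/Programming-Java-Native-API_timecho.md#_3-1-itablesession接口) method.
-* REST supports executing `INSERT INTO QUERY` with the [/rest/table/v1/nonQuery](../API/RestServiceV1.md#_3-3-非查询接口) API.
-* `INSERT INTO QUERY` does not support Explain or Explain Analyze.
-* To execute a query write-back statement, the user must have the following privileges:
-  * `SELECT` on the source tables in the query.
-  * `WRITE` on the target table.
-  * For more on user privileges, see [Authority Management](../User-Manual/Authority-Management_timecho.md).
-
-
-### 1.6 Writing Object Values
-
-To keep a single large Object from producing an oversized write request, a value of the Object type can be split and written segment by segment, in order. In SQL, the value is filled in with the `to_object(isEOF, offset, content)` function.
-
-> Supported since V2.0.8
-
-**Syntax:**
-
-```SQL
-insert into tableName(time, columnName) values(timeValue, to_object(isEOF, offset, content));
-```
-
-**Parameters:**
-
-| Name | Data type | Description |
-| --------- | ------------------- | ---------------------------------------- |
-| isEOF | boolean | Whether the content of this write is the last part of the Object |
-| offset | int64 | Starting offset, within the Object, of the content of this write |
-| content | hexadecimal (hex) format | The Object content of this write |
-
-**Example:**
-
-Add an Object-type column s1 to table1
-
-```SQL
-ALTER TABLE table1 ADD COLUMN IF NOT EXISTS s1 OBJECT FIELD COMMENT 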
'object类型';
-```
-
-1. Writing without segmentation
-
-```SQL
-insert into table1(time, device_id, s1) values(now(), 'tag1', to_object(true, 0, X'696F746462'));
-```
-
-2. Writing in segments
-
-```SQL
---Write the object data in segments;
---First write: to_object(false, 0, X'696F');
-insert into table1(time, device_id, s1) values(1, 'tag1', to_object(false, 0, X'696F'));
---Second write: to_object(false, 2, X'7464');
-insert into table1(time, device_id, s1) values(1, 'tag1', to_object(false, 2, X'7464'));
---Third write: to_object(true, 4, X'62');
-insert into table1(time, device_id, s1) values(1, 'tag1', to_object(true, 4, X'62'));
-```
-
-**Notes:**
-
-1. If only some segments of an Object value have been written, querying that value returns null; the data becomes visible only after the value has been written completely.
-2. During segmented writes, if the offset of a write does not equal the size of the Object already written, that write fails with an error.
-3. If part of the data has already been written and a write arrives with offset 0, that write clears the previously written part and starts writing the new data from scratch.
-
-
-## 2. Schemaless Writes
-
-When writing data through a Session, IoTDB supports schemaless writes: no table has to be created manually in advance. The system builds the table structure automatically from the information in the write request and then performs the write directly.
-
-**Example:**
-
-```Java
-try (ITableSession session =
-    new TableSessionBuilder()
-        .nodeUrls(Collections.singletonList("127.0.0.1:6667"))
-        .username("root")
-        .password("root")
-        .build()) {
-
-  session.executeNonQueryStatement("CREATE DATABASE db1");
-  session.executeNonQueryStatement("use db1");
-
-  // Write data directly without creating the table first
-  List<String> columnNameList =
-      Arrays.asList("region_id", "plant_id", "device_id", "model", "temperature", "humidity");
-  List<TSDataType> dataTypeList =
-      Arrays.asList(
-          TSDataType.STRING,
-          TSDataType.STRING,
-          TSDataType.STRING,
-          TSDataType.STRING,
-          TSDataType.FLOAT,
-          TSDataType.DOUBLE);
-  List<ColumnCategory> columnTypeList =
-      new ArrayList<>(
-          Arrays.asList(
-              ColumnCategory.TAG,
-              ColumnCategory.TAG,
-              ColumnCategory.TAG,
-              ColumnCategory.ATTRIBUTE,
-              ColumnCategory.FIELD,
-              ColumnCategory.FIELD));
-  Tablet tablet = new Tablet("table1", columnNameList, dataTypeList, columnTypeList, 100);
-  for (long timestamp = 0; timestamp < 100; timestamp++) {
-    int rowIndex = tablet.getRowSize();
-    tablet.addTimestamp(rowIndex, timestamp);
-    tablet.addValue("region_id", rowIndex, "1");
-    tablet.addValue("plant_id", rowIndex, "5");
-
    tablet.addValue("device_id", rowIndex, "3");
-    tablet.addValue("model", rowIndex, "A");
-    tablet.addValue("temperature", rowIndex, 37.6F);
-    tablet.addValue("humidity", rowIndex, 111.1);
-    if (tablet.getRowSize() == tablet.getMaxRowNumber()) {
-      session.insert(tablet);
-      tablet.reset();
-    }
-  }
-  if (tablet.getRowSize() != 0) {
-    session.insert(tablet);
-    tablet.reset();
-  }
-}
-```
-
-After the code has run, the following statement confirms that the table was created and that it contains the time column, tag columns, attribute column, and field columns.
-
-```SQL
-desc table1;
-```
-```shell
-+-----------+---------+-----------+
-| ColumnName| DataType|   Category|
-+-----------+---------+-----------+
-|       time|TIMESTAMP|       TIME|
-|  region_id|   STRING|        TAG|
-|   plant_id|   STRING|        TAG|
-|  device_id|   STRING|        TAG|
-|      model|   STRING|  ATTRIBUTE|
-|temperature|    FLOAT|      FIELD|
-|   humidity|   DOUBLE|      FIELD|
-+-----------+---------+-----------+
-```
-
-
-## 3. Data Update
-
-### 3.1 Syntax
-
-```SQL
-UPDATE SET updateAssignment (',' updateAssignment)* (WHERE where=booleanExpression)?
-
-updateAssignment
-    : identifier EQ expression
-    ;
-```
-
-1. The `update` statement may modify only the values of attribute (ATTRIBUTE) columns.
-2. Rules for `WHERE`:
-   - It may reference only tag columns (TAG) and attribute columns (ATTRIBUTE); field columns (FIELD) and the time column (TIME) are not allowed.
-   - Aggregate functions are not allowed
-3. The result of each SET assignment expression must be a string, and the expression is subject to the same restrictions as expressions in the WHERE clause.
-4. Values of attribute (ATTRIBUTE) columns and field (FIELD) columns can also be updated for specific rows with the `insert` statement.
-
-**Example:**
-
-```SQL
-update table1 set b = a where substring(a, 1, 1) like '%';
-```
diff --git a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md b/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md
deleted file mode 100644
index fa8e0f176..000000000
--- a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md
+++ /dev/null
@@ -1,294 +0,0 @@
-
-# AINode Deployment
-
-## 1. 
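AINode Introduction
-
-### 1.1 Capabilities
-
-AINode is the third type of native node that TimechoDB provides, after ConfigNode and DataNode. By interacting with the DataNodes and ConfigNodes of a TimechoDB cluster, it adds machine learning analysis of time series. AINode integrates model management, training, and inference into the database engine: registered models can be applied to specified time series data through simple SQL statements, and custom machine learning models can be registered and used as well. AINode currently integrates machine learning algorithms and self-developed models for common time series analysis scenarios (for example forecasting).
-
-### 1.2 Deployment Mode
-
-AINode is an additional component outside the TimechoDB cluster and is deployed from its own installation package.
-
-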
- - -
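-
-## 2. Installation Preparation
-
-### 2.1 Obtaining the Installation Package
-
-After the AINode installation package (`timechodb--ainode-bin.zip`) is unzipped, its key directories are:
-
-| **Directory** | **Type** | **Description** |
-| ---------------- | ---------------- | ------------------------------------------ |
-| lib | Folder | AINode executables and dependencies |
-| sbin | Folder | AINode run scripts, used to start or stop AINode |
-| conf | Folder | AINode configuration files and version declaration file |
-
-### 2.2 Pre-installation Check
-
-To make sure the AINode installation package you obtained is complete and correct, we recommend running a SHA512 check before installing.
-
-**Preparation:**
-
-* Obtain the officially published SHA512 checksum: contact Timecho staff
-
-**Verification steps (Linux as an example):**
-
-1. Open a terminal and change to the directory containing the package (for example `/data/ainode`):
-   ```Bash
-   cd /data/ainode
-   ```
-2. Run the following command to compute the hash:
-   ```Bash
-   sha512sum timechodb-{version}-ainode-bin.zip
-   ```
-3. The terminal prints the result (SHA512 checksum on the left, file name on the right):
-
-```Bash
-(base) root@hadoop@1:/data/ainode (0.664s)
-sha512sum timechodb-2.0.6.1-ainode-bin.zip
-4d5a6a64935b4f0459bc9ed214c4563aa7a6a5941024336e9416212424707f27bdfdfc70f4c528b51b812687d660014adc1b8add699498ea67ff17c7e619a6f0 timechodb-2.0.6.1-ainode-bin.zip
-```
-
-4. Compare the output with the official SHA512 checksum. Once they match, proceed with the AINode installation and deployment steps below.
-
-**Notes:**
-
-* If the checksums do not match, contact Timecho staff to obtain the installation package again
-* If a "file does not exist" message appears during verification, check that the file path is correct and that the package was downloaded completely
-
-### 2.3 Environment Requirements
-
-* Recommended operating systems: Linux, MacOS;
-* TimechoDB version: >= V 2.0.8;
-
-#### 2.3.1 Resource Recommendations
-
-> Note: the recommendations in this section apply only to **model inference tasks**. Recommendations for model training tasks will be added in a later release.
-
-The following is a resource baseline for running model inference on a single NVIDIA 4090 (24 GB of GPU memory). Inference tasks in AINode can raise overall throughput by scaling out the number of GPUs; machines are usually configured with 1, 2, 4, or 8 GPUs.
-
-The inference tasks used in the benchmark are:
-
-* **Univariate inference**: history length 2880, forecast length 720;
-* **Covariate inference**: history length 2880, forecast length 720, with 20 known covariates.
-
-
-| GPUs (NVIDIA 4090, 24 GB GPU memory) | Recommended CPU cores | Recommended memory (GB) | Supported univariate inference throughput (QPS) | Supported covariate inference throughput (QPS) |
-| ------------------------------------- | --------------- | ---------------- | ----------------------------- | ----------------------------- |
-| 1 GPU | 16 cores | 24 GB | 100 | 10 |
-| 2 GPUs | 32 cores | 48 GB | 200 | 20 |
-| 4 GPUs | 64 cores | 96 GB | 400 | 40 |
-| 8 GPUs | 128 cores | 192 GB | 800 | 80 |
-
-**Notes**:
-
-* The CPU and memory figures in the table follow a general rule: 16 CPU cores per GPU, and system memory sized 1:1 with GPU memory
-* The throughput figures are benchmark reference values; actual performance may vary with the model type, data complexity, and deployment environment
-* Throughput for univariate and covariate inference should be evaluated independently as needed; the two figures cannot simply be added together
-
-
-## 3. 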
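Installation, Deployment, and Usage
-
-### 3.1 Install AINode
-
-Download the AINode package into a dedicated folder, change into that folder, and unzip the package:
-
-```Shell
-unzip timechodb--ainode-bin.zip
-```
-
-### 3.2 Modify Configuration Items
-
-AINode supports changing a number of essential parameters. The following parameters can be found in the `/TIMECHO_AINODE_HOME/conf/iotdb-ainode.properties` file and changed persistently:
-
-| **Name** | **Description** | **Type** | **Default** |
-|-----------------------------------|----------------------------------------------| ---------------- | -------------------- |
-| cluster\_name | Identifier of the cluster that AINode joins | string| defaultCluster |
-| ain\_seed\_config\_node | Address of the ConfigNode that AINode registers with at startup | String | 127.0.0.1:10710 |
-| ain\_cluster\_ingress\_address | RPC address of the DataNode from which AINode pulls data | String | 127.0.0.1 |
-| ain\_cluster\_ingress\_port | RPC port of the DataNode from which AINode pulls data | Integer | 6667 |
-| ain\_cluster\_ingress\_username | Client username for the DataNode from which AINode pulls data | String | root |
-| ain\_cluster\_ingress\_password | Client password for the DataNode from which AINode pulls data | String | root |
-| ain\_rpc\_address | Address on which AINode provides service and communicates (internal service communication interface) | String | 127.0.0.1 |
-| ain\_rpc\_port | Port on which AINode provides service and communicates | String | 10810 |
-| ain\_system\_dir | AINode metadata storage path. The base directory of a relative path depends on the operating system, so an absolute path is recommended | String| data/AINode/system |
-| ain\_models\_dir | Path where AINode stores model files. The base directory of a relative path depends on the operating system, so an absolute path is recommended | String| data/AINode/models |
-| ain\_thrift\_compression\_enabled | Whether AINode enables Thrift compression: 0 = disabled, 1 = enabled | Boolean | 0 |
-
-### 3.3 Import the Built-in Weight Files
-
-If the deployment environment has network access and can reach HuggingFace, the system pulls the built-in model weight files automatically and this step can be skipped.
-
-In an offline environment, contact Timecho staff to obtain the model weight folder and place it under the `/TIMECHO_AINODE_HOME/data/ainode/models/builtin` directory.
-
-**NOTE:** mind the directory levels; the parent directory of all built-in model weights must end up being `builtin`.
-
-### 3.4 Start AINode
-
-Once the ConfigNode has been deployed, an AINode can be added to support the management and inference of time series models. After the TimechoDB cluster information has been specified in the configuration items, run the corresponding command to start AINode and join the TimechoDB cluster.
-
-```Shell
-# Start command
-  # Linux and MacOS
-  bash sbin/start-ainode.sh
-
-  # Windows
-  sbin\start-ainode.bat
-
-  # Start in the background (recommended for long-running use)
-  # Linux and MacOS
-  bash sbin/start-ainode.sh -d
-
-  # Windows
-  bash sbin\start-ainode.bat -d
-```
-
-### 3.5 Activate AINode
-
-1. 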
See TimechoDB activation: [Activation Methods](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md#_2-6-激活数据库)
-
-2. AINode activation can be verified as follows; a status of ACTIVATED means activation succeeded.
-
-```SQL
-IoTDB> show cluster
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-|NodeID|  NodeType| Status|InternalAddress|InternalPort|       Version|  BuildInfo|ActivateStatus|
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-|     0|ConfigNode|Running|      127.0.0.1|       10710|              |    xxxxxxx|     ACTIVATED|
-|     1|  DataNode|Running|      127.0.0.1|       10730|              |    xxxxxxx|     ACTIVATED|
-|     2|    AINode|Running|      127.0.0.1|       10810|              |    xxxxxxx|     ACTIVATED|
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-Total line number = 3
-It costs 0.002s
-IoTDB> show activation
-+---------------+---------+-----------------------------+
-|    LicenseInfo|    Usage|                        Limit|
-+---------------+---------+-----------------------------+
-|         Status|ACTIVATED|                            -|
-|    ExpiredTime|        -|2025-07-16T00:00:00.000+08:00|
-|  DataNodeLimit|        1|                    Unlimited|
-|    AiNodeLimit|        1|                            1|
-|       CpuLimit|       11|                    Unlimited|
-|    DeviceLimit|        0|                    Unlimited|
-|TimeSeriesLimit|        0|                        9,999|
-+---------------+---------+-----------------------------+
-Total line number = 7
-It costs 0.013s
-```
-
-
-### 3.6 Check the AINode Status
-
-During startup, the new AINode is added to the TimechoDB cluster automatically. After starting AINode, run the following SQL in the command line; if the AINode appears in the cluster with status Running (as shown below), it joined successfully.
-
-```Shell
-IoTDB> show cluster
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-|NodeID|  NodeType| Status|InternalAddress|InternalPort|       Version|  BuildInfo|ActivateStatus|
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-|     0|ConfigNode|Running|      127.0.0.1|       10710|              |    xxxxxxx|     ACTIVATED|
-|     1|  DataNode|Running|      127.0.0.1|       10730|              |    xxxxxxx|     ACTIVATED|
-|     2|    AINode|Running|      127.0.0.1|       10810|              |    xxxxxxx|     ACTIVATED|
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-```
-
-In addition, the show models command shows the state of the models. If a model's state is not as expected, check that the weight file path is correct.
-
-```Bash
-IoTDB> show models
-+---------------------+---------+--------+--------+
-|              ModelId|ModelType|Category|   State|
-+---------------------+---------+--------+--------+
-|                arima|   sktime| builtin|  active|
-|          holtwinters|   sktime| builtin|  active|
-|exponential_smoothing|   sktime| builtin|  active|
-|     naive_forecaster|   sktime| builtin|  active|
-|       stl_forecaster|   sktime| builtin|  active|
-|         gaussian_hmm|   sktime| builtin|  active|
-|              gmm_hmm|   sktime| builtin|  active|
-|                stray|   sktime| builtin|  active|
-|             timer_xl|    timer| builtin|  active|
-|              sundial|  sundial| builtin|  active|
-|             chronos2|       t5| builtin|  active|
-+---------------------+---------+--------+--------+
-```
-
-### 3.7 Stop AINode
-
-To stop a running AINode, run the corresponding stop script. The -p parameter can be used to specify the port, which is the `ain_rpc_port` configuration item.
-
-```Shell
-# Linux / MacOS
-  bash sbin/stop-ainode.sh
-  bash sbin/stop-ainode.sh -p # specify the port
-
-  #Windows
-  sbin\stop-ainode.bat
-  sbin\stop-ainode.bat -p # specify the port
-```
-
-After AINode is stopped, the node is still listed in the cluster with status UNKNOWN (as shown below), and AINode features are unavailable.
-
-```Shell
-IoTDB> show cluster
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-|NodeID|  NodeType| Status|InternalAddress|InternalPort|       Version|  BuildInfo|ActivateStatus|
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-|     0|ConfigNode|Running|      127.0.0.1|       10710|              |    xxxxxxx|     ACTIVATED|
-|     1|  DataNode|Running|      127.0.0.1|       10730|              |    xxxxxxx|     ACTIVATED|
-|     2|    AINode|UNKNOWN|      127.0.0.1|       10810|              |    xxxxxxx|     ACTIVATED|
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-```
-
-To restart the node, run the start script again.
-
-### 3.8 Upgrade AINode
-
-To upgrade the current AINode to a new version, follow these steps:
-
-1. 
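Stop the current AINode service
-
-   * Run the stop command and make sure the service has exited completely before continuing
-
-   ```Shell
-   # Linux / MacOS
-   bash sbin/stop-ainode.sh
-   bash sbin/stop-ainode.sh -p # specify the port
-
-   #Windows
-   sbin\stop-ainode.bat
-   sbin\stop-ainode.bat -p # specify the port
-   ```
-2. Replace the core files
-
-   * Delete the `lib` and `sbin` directories of the current version, and copy the `lib` and `sbin` directories of the new version into place
-   * Back up the modified configuration files in the conf directory, then replace the conf folder and carry your configuration changes over to the new files
-3. Update the built-in model weights (optional)
-
-   * If the new version updates built-in models, the details are published in the [Release History](../IoTDB-Introduction/Release-history_timecho.md). Contact Timecho staff to obtain the latest weight package, and replace the contents of the `data/ainode/models/builtin` directory with it
-4. After the upgrade, start the AINode service and check the node status; see sections 3.4 and 3.6 for the commands.
-
diff --git a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/AINode_Deployment_timecho.md b/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/AINode_Deployment_timecho.md
deleted file mode 100644
index c34bf2547..000000000
--- a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/AINode_Deployment_timecho.md
+++ /dev/null
@@ -1,340 +0,0 @@
-
-# AINode Deployment
-
-## 1. AINode Introduction
-
-### 1.1 Capabilities
-
-AINode is the third type of native node that IoTDB provides, after ConfigNode and DataNode. By interacting with the DataNodes and ConfigNodes of an IoTDB cluster, it adds machine learning analysis of time series: existing machine learning models can be imported from outside and registered, and registered models can then be applied to specified time series data through simple SQL statements. Model creation, management, and inference are thus integrated into the database engine. Machine learning algorithms and self-developed models are currently provided for common time series analysis scenarios (for example forecasting and anomaly detection).
-
-### 1.2 Delivery
-AINode is an additional component outside the IoTDB cluster, delivered as a separate installation package.
-
-### 1.3 Deployment Mode
-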
- - -
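-
-## 2. Installation Preparation
-
-### 2.1 Obtaining the Installation Package
-
-After the AINode installation package (`timechodb--ainode-bin.zip`) is unzipped, its directory structure is:
-
-| **Directory** | **Type** | **Description** |
-| ------------ | -------- | ------------------------------------------------ |
-| lib | Folder | AINode Python package files |
-| sbin | Folder | AINode run scripts, used to start, remove, and stop AINode |
-| conf | Folder | AINode configuration files and runtime environment setup scripts |
-| LICENSE | File | License |
-| NOTICE | File | Notice |
-| README_ZH.md | File | Chinese README in Markdown format |
-| README.md | File | Usage instructions |
-
-### 2.2 Pre-installation Check
-
-To make sure the AINode installation package you obtained is complete and correct, we recommend running a SHA512 check before installing.
-
-#### Preparation:
-
-- Obtain the officially published SHA512 checksum: contact Timecho staff
-
-#### Verification steps (Linux as an example):
-
-1. Open a terminal and change to the directory containing the package (for example `/data/ainode`):
-   ```Bash
-   cd /data/ainode
-   ```
-2. Run the following command to compute the hash:
-   ```Bash
-   sha512sum timechodb-{version}-ainode-bin.zip
-   ```
-3. The terminal prints the result (SHA512 checksum on the left, file name on the right):
-
-![img](/img/sha512-06.png)
-
-4. Compare the output with the official SHA512 checksum. Once they match, proceed with the AINode installation and deployment steps below.
-
-#### Notes:
-
-- If the checksums do not match, contact Timecho staff to obtain the installation package again
-- If a "file does not exist" message appears during verification, check that the file path is correct and that the package was downloaded completely
-
-### 2.3 Environment Preparation
-
-1. Recommended operating systems: Ubuntu, MacOS
-2. IoTDB version: >= V 2.0.5.1
-3. Runtime environment
-   - Python version between 3.9 and 3.12, with the pip and venv tools available;
-
-
-## 3. Installation, Deployment, and Usage
-
-### 3.1 Install AINode
-
-1. Make sure the Python version is between 3.9 and 3.12
-
-```shell
-python --version
-# or
-python3 --version
-```
-2. Download the AINode package into a dedicated folder, change into that folder, and unzip the package
-
-```shell
-  unzip timechodb--ainode-bin.zip
-```
-
-3. 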
Activate AINode:
-
-- Enter the IoTDB CLI
-
-```shell
-  # Linux or MacOS
-  ./start-cli.sh -sql_dialect table
-
-  # Windows
-  ./start-cli.bat -sql_dialect table
-```
-
-- Run the following statement to obtain the machine code required for activation:
-
-```sql
-show system info
-```
-
-- Copy the returned machine code and send it to Timecho staff:
-
-```sql
-+--------------------------------------------------------------+
-| SystemInfo|
-+--------------------------------------------------------------+
-| 01-TE5NLES4-UDDWCMYE|
-+--------------------------------------------------------------+
-```
-
-- Enter the activation code returned by the staff into the CLI as follows
-  - Note: the activation code must be enclosed in single quotes ('), as shown
-
-```sql
-IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZK'
-```
-
-- Activation can be verified as follows; a status of ACTIVATED means activation succeeded
-
-```sql
-IoTDB> show cluster
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-|NodeID|  NodeType| Status|InternalAddress|InternalPort|       Version|  BuildInfo|ActivateStatus|
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-|     0|ConfigNode|Running|      127.0.0.1|       10710|              |    xxxxxxx|     ACTIVATED|
-|     1|  DataNode|Running|      127.0.0.1|       10730|              |    xxxxxxx|     ACTIVATED|
-|     2|    AINode|Running|      127.0.0.1|       10810|              |    xxxxxxx|     ACTIVATED|
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+
-
-IoTDB> show activation
-+---------------+---------+-----------------------------+
-|    LicenseInfo|    Usage|                        Limit|
-+---------------+---------+-----------------------------+
-|         Status|ACTIVATED|                            -|
-|    ExpiredTime|        -|2025-07-16T00:00:00.000+08:00|
-|  DataNodeLimit|        1|                    Unlimited|
-|    AiNodeLimit|        1|                            1|
-|       CpuLimit|       11|                    Unlimited|
-|    DeviceLimit|        0|                    Unlimited|
-|TimeSeriesLimit|        0|                        9,999|
-+---------------+---------+-----------------------------+
-
-```
-
-### 3.2 Modify Configuration Items
-AINode supports changing a number of essential parameters. The following parameters can be found in the `conf/iotdb-ainode.properties` file and changed persistently:
-
-| **Name** | **Description** | **Type** | **Default** |
-| ------------------------------ | ------------------------------------------------------------ | 
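-------- | ------------------ |
-| cluster_name | Identifier of the cluster that AINode joins | string | defaultCluster |
-| ain_seed_config_node | Address of the ConfigNode that AINode registers with at startup | String | 127.0.0.1:10710 |
-| ain_cluster_ingress_address | RPC address of the DataNode from which AINode pulls data | String | 127.0.0.1 |
-| ain_cluster_ingress_port | RPC port of the DataNode from which AINode pulls data | Integer | 6667 |
-| ain_cluster_ingress_username | Client username for the DataNode from which AINode pulls data | String | root |
-| ain_cluster_ingress_password | Client password for the DataNode from which AINode pulls data | String | root |
-| ain_cluster_ingress_time_zone | Client time zone for the DataNode from which AINode pulls data | String | UTC+8 |
-| ain_inference_rpc_address | Address on which AINode provides service and communicates (internal service communication interface) | String | 127.0.0.1 |
-| ain_inference_rpc_port | Port on which AINode provides service and communicates | String | 10810 |
-| ain_system_dir | AINode metadata storage path. The base directory of a relative path depends on the operating system, so an absolute path is recommended | String | data/AINode/system |
-| ain_models_dir | Path where AINode stores model files. The base directory of a relative path depends on the operating system, so an absolute path is recommended | String | data/AINode/models |
-| ain_thrift_compression_enabled | Whether AINode enables Thrift compression: 0 = disabled, 1 = enabled | Boolean | 0 |
-
-### 3.3 Import the Weight Files
-
-> Offline environments only; skip this step in an online environment
->
- Contact Timecho staff to obtain the model weight files and place them under the /IOTDB_AINODE_HOME/data/ainode/models/weights/ directory.
-
-### 3.4 Start AINode
-
- Once the Seed-ConfigNode has been deployed, an AINode can be added to support model registration and inference. After the IoTDB cluster information has been specified in the configuration items, run the corresponding command to start AINode and join the IoTDB cluster.
-
-- Starting in a networked environment
-
-Start command
-
-```shell
-  # Start command
-  # Linux and MacOS
-  bash sbin/start-ainode.sh
-
-  # Windows
-  sbin\start-ainode.bat
-
-  # Start in the background (recommended for long-running use)
-  # Linux and MacOS
-  nohup bash sbin/start-ainode.sh > myout.file 2>&1 &
-
-  # Windows
-  nohup bash sbin\start-ainode.bat > myout.file 2>&1 &
-  ```
-
-### 3.5 Check the AINode Status
-
-During startup, the new AINode is added to the IoTDB cluster automatically. After starting AINode, run the following SQL in the command line; if the AINode appears in the cluster with status Running (as shown below), it joined successfully.
-
-```shell
-IoTDB> show cluster
-+------+----------+-------+---------------+------------+-------+-----------+
-|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo|
-+------+----------+-------+---------------+------------+-------+-----------+
-| 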
0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev|
-| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev|
-| 2| AINode|Running| 127.0.0.1| 10810|UNKNOWN|190e303-dev|
-+------+----------+-------+---------------+------------+-------+-----------+
-```
-
-In addition, the show models command shows the state of the models. If a model's state is not as expected, check that the weight file path is correct.
-
-```sql
-IoTDB:etth> show models
-+---------------------+--------------------+--------+------+
-|              ModelId|           ModelType|Category| State|
-+---------------------+--------------------+--------+------+
-|                arima|               Arima|BUILT-IN|ACTIVE|
-|          holtwinters|         HoltWinters|BUILT-IN|ACTIVE|
-|exponential_smoothing|ExponentialSmoothing|BUILT-IN|ACTIVE|
-|     naive_forecaster|     NaiveForecaster|BUILT-IN|ACTIVE|
-|       stl_forecaster|       StlForecaster|BUILT-IN|ACTIVE|
-|         gaussian_hmm|         GaussianHmm|BUILT-IN|ACTIVE|
-|              gmm_hmm|              GmmHmm|BUILT-IN|ACTIVE|
-|                stray|               Stray|BUILT-IN|ACTIVE|
-|              sundial|       Timer-Sundial|BUILT-IN|ACTIVE|
-|             timer_xl|            Timer-XL|BUILT-IN|ACTIVE|
-+---------------------+--------------------+--------+------+
-```
-
-### 3.6 Stop AINode
-
-To stop a running AINode, run the corresponding stop script.
-
-- Stop command
-
-```shell
-  # Linux / MacOS
-  bash sbin/stop-ainode.sh
-
-  #Windows
-  sbin\stop-ainode.bat
-  ```
-
-After AINode is stopped, the node is still listed in the cluster with status UNKNOWN (as shown below), and AINode features are unavailable.
-
- ```shell
-IoTDB> show cluster
-+------+----------+-------+---------------+------------+-------+-----------+
-|NodeID|  NodeType| Status|InternalAddress|InternalPort|Version|  BuildInfo|
-+------+----------+-------+---------------+------------+-------+-----------+
-|     0|ConfigNode|Running|      127.0.0.1|       10710|UNKNOWN|190e303-dev|
-|     1|  DataNode|Running|      127.0.0.1|       10730|UNKNOWN|190e303-dev|
-|     2|    AINode|UNKNOWN|      127.0.0.1|       10790|UNKNOWN|190e303-dev|
-+------+----------+-------+---------------+------------+-------+-----------+
-```
-To restart the node, run the start script again.
-
-
-## 4. 
FAQ

### 4.1 A "venv module not found" error appears when starting AINode

When AINode is started in the default way, it creates a Python virtual environment under the installation package directory and installs its dependencies there, so the venv module must be available. Python 3.10 and later usually ship with venv, but the Python environment bundled with some systems may not. When this error appears, choose one of the two solutions below:

- Install the venv module locally. On Ubuntu, for example, the following command installs the venv module that belongs to the system Python. Alternatively, install a Python release from the official Python website that already includes venv.

  ```shell
apt-get install python3.10-venv
```

  Then, from the AINode directory, create a virtual environment with Python 3.10.0:

  ```shell
../Python-3.10.0/python -m venv venv   # the last argument is the folder name
```

- When running the startup script, use `-i` to specify the path of an existing Python interpreter as the AINode runtime environment. A new virtual environment is then no longer needed.

### 4.2 The SSL module in Python is not correctly installed and configured, so HTTPS resources cannot be handled

WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.

Install OpenSSL and then rebuild Python to resolve this problem.

> Currently Python versions 3.6 to 3.9 are compatible with OpenSSL 1.0.2, 1.1.0, and 1.1.1.

Python requires OpenSSL to be installed on the system. For installation details, see this [link](https://stackoverflow.com/questions/56552390/how-to-fix-ssl-module-in-python-is-not-available-in-centos).

  ```shell
sudo apt-get install build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev uuid-dev lzma-dev
sudo -E ./configure --with-ssl
make
sudo make install
```

### 4.3 The pip version is too old

On Windows, a build error similar to "error: Microsoft Visual C++ 14.0 or greater is required..." appears.

This error usually means that the C++ build tools or setuptools are out of date. Upgrade pip and setuptools with:

  ```shell
./python -m pip install --upgrade pip
./python -m pip install --upgrade setuptools
```

### 4.4 Compiling and installing Python

Use the following commands to download the source package from the official website and extract it:

  ```shell
wget https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tar.xz
tar Jxf Python-3.10.0.tar.xz
```

Compile and install Python:

  ```shell
cd Python-3.10.0
./configure --prefix=/usr/local/python3
make
sudo make install
python3 --version
```
\ No newline at end of file
diff --git a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md
b/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md deleted file mode 100644 index 0fa3c8625..000000000 --- a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md +++ /dev/null @@ -1,573 +0,0 @@ - -# 集群版部署指导 - -本小节描述如何手动部署包括3个ConfigNode和3个DataNode的实例,即通常所说的3C3D集群。 - -
- -
- -## 1. 注意事项 - -1. 安装前请确认系统已参照[系统配置](../Deployment-and-Maintenance/Environment-Requirements.md)准备完成。 - -2. 推荐使用`hostname`进行IP配置,可避免后期修改主机ip导致数据库无法启动的问题。设置hostname需要在服务器上配`/etc/hosts`,如本机ip是11.101.17.224,hostname是iotdb-1,则可以使用以下命令设置服务器的 hostname,并使用hostname配置IoTDB的`cn_internal_address`、`dn_internal_address`。 - - ```shell - echo "11.101.17.224 iotdb-1" >> /etc/hosts - ``` - -3. 有些参数首次启动后不能修改,请参考下方的[参数配置](#参数配置)章节来进行设置。 - -4. 无论是在linux还是windows中,请确保IoTDB的安装路径中不含空格和中文,避免软件运行异常。 - -5. 请注意,安装部署(包括激活和使用软件)IoTDB时,您可以: - -- 使用 root 用户(推荐):可以避免权限等问题。 - -- 使用固定的非 root 用户: - - - 使用同一用户操作:确保在启动、激活、停止等操作均保持使用同一用户,不要切换用户。 - - - 避免使用 sudo:使用 sudo 命令会以 root 用户权限执行命令,可能会引起权限混淆或安全问题。 - -6. 推荐部署监控面板,可以对重要运行指标进行监控,随时掌握数据库运行状态,监控面板可以联系商务获取,部署监控面板步骤可以参考:[监控面板部署](./Monitoring-panel-deployment.md) - -7. 在安装部署数据库前,可以使用健康检查工具检测 IoTDB 节点运行环境,并获取详细的检查结果。 IoTDB 健康检查工具使用方法可以参考:[健康检查工具](../Tools-System/Health-Check-Tool.md)。 - - -## 2. 准备步骤 - -1. 准备IoTDB数据库安装包 :timechodb-{version}-bin.zip(安装包获取见:[链接](./IoTDB-Package_timecho.md)) -2. 按环境要求配置好操作系统环境(系统环境配置见:[链接](./Environment-Requirements.md)) - -### 2.1 前置检查 - -为确保您获取的IoTDB企业版安装包完整且正确,在执行安装部署前建议您进行SHA512校验。 - -#### 准备工作: - -- 获取官方发布的 SHA512 校验码:[发布历史](../IoTDB-Introduction/Release-history_timecho.md)文档中各版本对应的"SHA512校验码" - -#### 校验步骤(以 linux 为例): - -1. 打开终端,进入安装包所在目录(如`/data/iotdb`): - ```Bash - cd /data/iotdb - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -![img](/img/sha512-01.png) - -4. 对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行IoTDB企业版的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - -## 3. 
安装步骤 - -假设现在有3台linux服务器,IP地址和服务角色分配如下: - -| 节点ip | 主机名 | 服务 | -| ------------- | ------- | -------------------- | -| 11.101.17.224 | iotdb-1 | ConfigNode、DataNode | -| 11.101.17.225 | iotdb-2 | ConfigNode、DataNode | -| 11.101.17.226 | iotdb-3 | ConfigNode、DataNode | - -### 3.1 设置主机名 - -在3台机器上分别配置主机名,设置主机名需要在目标服务器上配置/etc/hosts,使用如下命令: - -```shell -echo "11.101.17.224 iotdb-1" >> /etc/hosts -echo "11.101.17.225 iotdb-2" >> /etc/hosts -echo "11.101.17.226 iotdb-3" >> /etc/hosts -``` - -### 3.2 参数配置 - -解压安装包并进入安装目录 - -```shell -unzip timechodb-{version}-bin.zip -cd timechodb-{version}-bin -``` - -#### 3.2.1 环境脚本配置 - -- ./conf/confignode-env.sh配置 - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------- | :------------------------------------- | :--------- | :----------------------------------------------- | :----------- | -| MEMORY_SIZE | IoTDB ConfigNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的30% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -- ./conf/datanode-env.sh配置 - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------- | :----------------------------------- |:-----------------------| :----------------------------------------------- | :----------- | -| MEMORY_SIZE | IoTDB DataNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的50% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -#### 3.2.2 通用配置(./conf/iotdb-system.properties) - -- 集群配置 - -| 配置项 | 说明 | 11.101.17.224 | 11.101.17.225 | 11.101.17.226 | -| ------------------------- | ---------------------------------------- | -------------- | -------------- | -------------- | -| cluster_name | 集群名称 | defaultCluster | defaultCluster | defaultCluster | -| schema_replication_factor | 元数据副本数,DataNode数量不应少于此数目 | 3 | 3 | 3 | -| data_replication_factor | 数据副本数,DataNode数量不应少于此数目 | 2 | 2 | 2 | - -#### 3.2.3 ConfigNode 配置 - -| 配置项 | 说明 | 默认 | 推荐值 | 11.101.17.224 | 11.101.17.225 | 11.101.17.226 | 备注 | -| ------------------- | ------------------------------------------------------------ | --------------- | 
------------------------------------------------------- | ------------- | ------------- | ------------- | ------------------ | -| cn_internal_address | ConfigNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | iotdb-1 | iotdb-2 | iotdb-3 | 首次启动后不能修改 | -| cn_internal_port | ConfigNode在集群内部通讯使用的端口 | 10710 | 10710 | 10710 | 10710 | 10710 | 首次启动后不能修改 | -| cn_consensus_port | ConfigNode副本组共识协议通信使用的端口 | 10720 | 10720 | 10720 | 10720 | 10720 | 首次启动后不能修改 | -| cn_seed_config_node | 节点注册加入集群时连接的ConfigNode 的地址,cn_internal_address:cn_internal_port | 127.0.0.1:10710 | 第一个CongfigNode的cn_internal_address:cn_internal_port | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | 首次启动后不能修改 | - -#### 3.2.4 DataNode 配置 - -| 配置项 | 说明 | 默认 | 推荐值 | 11.101.17.224 | 11.101.17.225 | 11.101.17.226 | 备注 | -| ------------------------------- | ------------------------------------------------------------ | --------------- | --------------------------------------------------- | ------------- | ------------- | ------------- | ------------------ | -| dn_rpc_address | 客户端 RPC 服务的地址 | 127.0.0.1 | 默认本机可直接访问。非本机访问,请修改此配置项为所在服务器的IPV4地址或hostname,推荐使用所在服务器的IPV4地址。 | iotdb-1 |iotdb-2 | iotdb-3 | 重启服务生效 | -| dn_rpc_port | 客户端 RPC 服务的端口 | 6667 | 6667 | 6667 | 6667 | 6667 | 重启服务生效 | -| dn_internal_address | DataNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | iotdb-1 | iotdb-2 | iotdb-3 | 首次启动后不能修改 | -| dn_internal_port | DataNode在集群内部通信使用的端口 | 10730 | 10730 | 10730 | 10730 | 10730 | 首次启动后不能修改 | -| dn_mpp_data_exchange_port | DataNode用于接收数据流使用的端口 | 10740 | 10740 | 10740 | 10740 | 10740 | 首次启动后不能修改 | -| dn_data_region_consensus_port | DataNode用于数据副本共识协议通信使用的端口 | 10750 | 10750 | 10750 | 10750 | 10750 | 首次启动后不能修改 | -| dn_schema_region_consensus_port | DataNode用于元数据副本共识协议通信使用的端口 | 10760 | 10760 | 10760 | 10760 | 10760 | 首次启动后不能修改 | -| dn_seed_config_node | 节点注册加入集群时连接的ConfigNode地址,即cn_internal_address:cn_internal_port | 127.0.0.1:10710 | 
第一个CongfigNode的cn_internal_address:cn_internal_port | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | 首次启动后不能修改 | - -> ❗️注意:VSCode Remote等编辑器无自动保存配置功能,请确保修改的文件被持久化保存,否则配置项无法生效 - -### 3.3 启动ConfigNode节点 - -先启动第一个iotdb-1的confignode, 保证种子confignode节点先启动,然后依次启动第2和第3个confignode节点 - -```shell -# Unix/OS X -cd sbin -./start-confignode.sh -d #“-d”参数将在后台进行启动 - -# Windows -# V2.0.4.x 版本之前 -.\start-confignode.bat - -# V2.0.4.x 版本及之后 -.\windows\start-confignode.bat -``` - -如果启动失败,请参考下[常见问题](#常见问题) - -### 3.4 启动DataNode 节点 - - 分别进入iotdb的sbin目录下,依次启动3个datanode节点: - -```shell -# Unix/OS X -cd sbin -./start-datanode.sh -d #-d参数将在后台进行启动 - -# Windows -# V2.0.4.x 版本之前 -.\start-datanode.bat - -# V2.0.4.x 版本及之后 -.\windows\start-datanode.bat -``` - -### 3.5 激活数据库 - -#### 方式一:通过 CLI 激活 - -- 进入集群任一节点 CLI - -```shell -# Linux 系统与 MacOS 系统启动命令如下: -# V2.0.6.x 版本之前 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table - -# V2.0.6.x 版本及之后 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table - -# Windows 系统启动命令如下: -# V2.0.4.x 版本之前 -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table - -# V2.0.4.x 版本及之后, V2.0.6.x 版本之前 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table - -# V2.0.6.x 版本及之后 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` - -- 执行以下内容获取激活所需机器码: - -```SQL -IoTDB> show system info -``` -```shell -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -|01-TE5NLES4-UDDWCMYE,01-GG5NLES4-XXDWCMYE,01-FF5NLES4-WWWWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -- 执行以下语句获取待激活数据库的版本号: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.x.x| xxxxxxx| -+-------+---------+ -Total 
line number = 1 -``` - -- 将获取到的机器码与版本号,一同提供给天谋工作人员。 - -- 工作人员会返回激活码,正常是与提供的机器码的顺序对应的,请将整串激活码粘贴到CLI中进行激活,此激活操作只需在集群中的任意一台机器上执行一次即可。 - - - 注:激活码前后需要用`'`符号进行标注,如下所示 - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - -#### 方式二:激活文件拷贝激活 - -- 依次启动3个Confignode、Datanode节点后,每台机器各自的activation文件夹, 分别拷贝每台机器的system_info文件给天谋工作人员; -- 工作人员将返回每个ConfigNode、Datanode节点的license文件,这里会返回3个license文件; -- 将3个license文件分别放入对应的ConfigNode节点的activation文件夹下; - - -### 3.6 验证激活 - -可在 CLI 中通过执行 `show activation` 命令查看激活状态,示例如下,状态显示为 ACTIVATED 表示激活成功 - -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -### 3.7 一键启停集群 - -#### 3.7.1 概述 - -在 IoTDB 的根目录中,`sbin` 子目录包含的 `start-all.sh` 和 `stop-all.sh` 脚本,与 `conf` 子目录中的 `iotdb-cluster.properties` 配置文件协同工作,可通过单一节点实现一键启动或停止集群所有节点的功能。通过这种方式,可以高效地管理 IoTDB 集群的生命周期,简化了部署和运维流程。 -下文将介绍`iotdb-cluster.properties` 文件中的具体配置项。 - -#### 3.7.2 配置项 - - -> 注意: -> -> * 当集群变更时,需要手动更新此配置文件。 -> * 如果在未配置 
`iotdb-cluster.properties` 配置文件的情况下执行 `start-all.sh` 或者 `stop-all.sh` 脚本,则默认会启停当前脚本所在 IOTDB\_HOME 目录下的 ConfigNode 与 DataNode 节点。 -> * 推荐配置 ssh 免密登录:如果未配置,启动脚本后会提示输入服务器密码以便于后续启动/停止/销毁操作。如果已配置,则无需在执行脚本过程中输入服务器密码。 - -* confignode\_address\_list - -| 名字 | confignode\_address\_list | -| :--------------: | :------------------------------------------------------------------------------ | -| 描述 | 待启动/停止的 ConfigNode 节点所在主机的 IP 或主机名列表,如果有多个需要用“,”分隔。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* datanode\_address\_list - -| 名字 | datanode\_address\_list | -| :----------------: | :---------------------------------------------------------------------------- | -| 描述 | 待启动/停止的 DataNode 节点所在主机的 IP 或主机名列表,如果有多个需要用“,”分隔。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* ssh\_account - -| 名字 | ssh\_account | -| :----------------: | :------------------------------------------------------------- | -| 描述 | 通过 SSH 登陆目标主机的用户名,需要所有的主机的用户名都相同 | -| 类型 | String | -| 默认值 | root | -| 改后生效方式 | 重启服务生效 | - -* ssh\_port - -| 名字 | ssh\_port | -| :----------------: | :--------------------------------------------------------- | -| 描述 | 目标主机对外暴露的 SSH 端口,需要所有的主机的端口都相同 | -| 类型 | int | -| 默认值 | 22 | -| 改后生效方式 | 重启服务生效 | - -* confignode\_deploy\_path - -| 名字 | confignode\_deploy\_path | -| :----------------: | :---------------------------------------------------------------------------------------------------------------- | -| 描述 | 待启动/停止的所有 ConfigNode 所在目标主机的路径,需要所有待启动/停止的 ConfigNode 节点在目标主机的相同目录下。例如:`/data/demo/apache-iotdb-1.3.1-all-bin` | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* datanode\_deploy\_path - -| 名字 | datanode\_deploy\_path | -| :----------------: | :------------------------------------------------------------------------------------------------------------ | -| 描述 | 待启动/停止的所有 DataNode 所在目标主机的路径,需要所有待启动/停止的 DataNode 节点在目标主机的相同目录下。例如:`/data/demo/apache-iotdb-1.3.1-all-bin` | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - - -#### 3.7.3 简单示例 - -1. 
配置文件 `iotdb-cluster.properties` - -```properties -# Configure ConfigNodes machine addresses separated by , -confignode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# Configure DataNodes machine addresses separated by , -datanode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# User name for logging in to the deployment machine using ssh -ssh_account=root - -# ssh login port -ssh_port=22 - -# iotdb deployment directory (iotdb will be deployed to the target node in this folder) -confignode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -datanode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -``` - -2. 执行 ./start-all.sh 命令验证启动结果,在 cli 中执行 show cluster,可看到类似如下结果 -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -| 0|ConfigNode|Running| 172.xx.xx.16| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 1|ConfigNode|Running| 172.xx.xx.18| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 2|ConfigNode|Running| 172.xx.xx.17| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 3| DataNode|Running| 172.xx.xx.18| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 4| DataNode|Running| 172.xx.xx.17| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 5| DataNode|Running| 172.xx.xx.16| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -``` - - -## 4. 
节点维护步骤 - -### 4.1 ConfigNode节点维护 - -ConfigNode节点维护分为ConfigNode添加和移除两种操作,有两个常见使用场景: - -- 集群扩展:如集群中只有1个ConfigNode时,希望增加ConfigNode以提升ConfigNode节点高可用性,则可以添加2个ConfigNode,使得集群中有3个ConfigNode。 -- 集群故障恢复:1个ConfigNode所在机器发生故障,使得该ConfigNode无法正常运行,此时可以移除该ConfigNode,然后添加一个新的ConfigNode进入集群。 - -> ❗️注意,在完成ConfigNode节点维护后,需要保证集群中有1或者3个正常运行的ConfigNode。2个ConfigNode不具备高可用性,超过3个ConfigNode会导致性能损失。 - -#### 4.1.1 添加ConfigNode节点 - -脚本命令: - -```shell -# Linux / MacOS -# 首先切换到IoTDB根目录 -sbin/start-confignode.sh - -# Windows -# 首先切换到IoTDB根目录 -# V2.0.4.x 版本之前 -sbin\start-confignode.bat - -# V2.0.4.x 版本及之后 -sbin\windows\start-confignode.bat -``` - -#### 4.1.2 移除ConfigNode节点 - -首先通过CLI连接集群,通过`show confignodes`确认想要移除ConfigNode的NodeID: - -```shell -IoTDB> show confignodes -+------+-------+---------------+------------+--------+ -|NodeID| Status|InternalAddress|InternalPort| Role| -+------+-------+---------------+------------+--------+ -| 0|Running| 127.0.0.1| 10710| Leader| -| 1|Running| 127.0.0.1| 10711|Follower| -| 2|Running| 127.0.0.1| 10712|Follower| -+------+-------+---------------+------------+--------+ -Total line number = 3 -It costs 0.030s -``` - -然后使用SQL将ConfigNode移除,SQL命令: - -```Bash -remove confignode [confignode_id] -``` - -### 4.2 DataNode节点维护 - -DataNode节点维护有两个常见场景: - -- 集群扩容:出于集群能力扩容等目的,添加新的DataNode进入集群 -- 集群故障恢复:一个DataNode所在机器出现故障,使得该DataNode无法正常运行,此时可以移除该DataNode,并添加新的DataNode进入集群 - -> ❗️注意,为了使集群能正常工作,在DataNode节点维护过程中以及维护完成后,正常运行的DataNode总数不得少于数据副本数(通常为2),也不得少于元数据副本数(通常为3)。 - -#### 4.2.1 添加DataNode节点 - -脚本命令: - -```Bash -# Linux / MacOS -# 首先切换到IoTDB根目录 -sbin/start-datanode.sh - -#Windows -# 首先切换到IoTDB根目录 -# V2.0.4.x 版本之前 -sbin\start-datanode.bat - -# V2.0.4.x 版本及之后 -sbin\windows\start-datanode.bat -``` - -说明:在添加DataNode后,随着新的写入到来(以及旧数据过期,如果设置了TTL),集群负载会逐渐向新的DataNode均衡,最终在所有节点上达到存算资源的均衡。 - -#### 4.2.2 移除DataNode节点 - -首先通过CLI连接集群,通过`show datanodes`确认想要移除的DataNode的NodeID: - -```Bash -IoTDB> show datanodes -+------+-------+----------+-------+-------------+---------------+ -|NodeID| 
 Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum|
+------+-------+----------+-------+-------------+---------------+
|     1|Running|   0.0.0.0|   6667|            0|              0|
|     2|Running|   0.0.0.0|   6668|            1|              1|
|     3|Running|   0.0.0.0|   6669|            1|              0|
+------+-------+----------+-------+-------------+---------------+
Total line number = 3
It costs 0.110s
```

Then remove the DataNode with the following SQL command:

```Bash
remove datanode [datanode_id]
```

### 4.3 Cluster Maintenance

For more information on cluster maintenance, see [Cluster Maintenance](../User-Manual/Load-Balance.md).

## 5. FAQ

1. Activation fails repeatedly during deployment
    - Run `ls -al` and check that the owner of the installation package root directory is the current user.
    - Check that every file under the `./activation` directory is owned by the current user.
2. ConfigNode fails to start
    - Step 1: Check the startup log to see whether any parameter that cannot be changed after the first startup has been modified.
    - Step 2: Check the startup log for other exceptions. If the log shows any, contact Timecho technical support for a solution.
    - Step 3: If this is the first deployment, or the data can be deleted, you can also clean up the environment as described below, redeploy, and start again.
    - Clean up the environment:

    1. Stop all ConfigNode and DataNode processes.
```Bash
    # 1. Stop the ConfigNode and DataNode services
    # Unix/OS X
    sbin/stop-standalone.sh

    # Windows
    # Before V2.0.4.x
    sbin\stop-standalone.bat

    # V2.0.4.x and later
    sbin\windows\stop-standalone.bat

    # 2. Check whether any process is left over
    jps
    # or
    ps -ef|grep iotdb

    # 3. If a process is left over, kill it manually
    kill -9 <pid>   # replace <pid> with the ID of the leftover process
    # If you are sure only one IoTDB runs on this machine, the following command cleans up leftover processes
    ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9
    ```

    2. Delete the data and logs directories.
    - Note: Deleting the data directory is required. Deleting the logs directory only gives you clean logs and is optional.
    ```shell
    cd /data/iotdb
    rm -rf data logs
    ```
## 6. Appendix

### 6.1 ConfigNode Startup Parameters

| Parameter | Description | Required |
| :--- | :------------------------------- | :----------- |
| -d | Start in daemon mode, i.e. run in the background | No |

### 6.2 DataNode Startup Parameters

| Abbreviation | Description | Required |
| :--- | :--------------------------------------------- | :----------- |
| -v | Show version information | No |
| -f | Run the script in the foreground instead of the background | No |
| -d | Start in daemon mode, i.e. run in the background | No |
| -p | Specify a file that stores the process ID, for process management | No |
| -c | Specify the path of the configuration folder from which the script loads configuration files | No |
| -g | Print detailed garbage collection (GC) information | No |
| -H | Specify the path of the Java heap dump file, used when the JVM runs out of memory | No |
| -E | Specify the path of the JVM error log file | No |
| -D | Define a system property, in the form key=value | No |
| -X | Pass -XX parameters directly to the JVM | No |
| -h | Show help | No |

diff --git a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Database-Resources_timecho.md b/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Database-Resources_timecho.md
deleted file mode 100644
index 4da0bf727..000000000
--- a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Database-Resources_timecho.md
+++ /dev/null
@@ -1,209 +0,0 @@

# Resource Planning

## 1. CPU

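| Number of series (sampling frequency <= 1 Hz) | CPU | Nodes (standalone) | Nodes (dual-active) | Nodes (distributed) |
| --- | --- | --- | --- | --- |
| Up to 100,000 | 2 to 4 cores | 1 | 2 | 3 |
| Up to 300,000 | 4 to 8 cores | 1 | 2 | 3 |
| Up to 500,000 | 8 to 16 cores | 1 | 2 | 3 |
| Up to 1,000,000 | 16 to 32 cores | 1 | 2 | 3 |
| Up to 2,000,000 | 32 to 48 cores | 1 | 2 | 3 |
| Up to 10,000,000 | 48 cores | 1 | 2 | Contact Timecho sales |
| Over 10,000,000 | Contact Timecho sales | Contact Timecho sales | Contact Timecho sales | Contact Timecho sales |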

> Supported CPU models: Kunpeng, Phytium, Sunway, Hygon, Zhaoxin, Loongson

## 2. Memory

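| Number of series (sampling frequency <= 1 Hz) | Memory | Nodes (standalone) | Nodes (dual-active) | Nodes (distributed) |
| --- | --- | --- | --- | --- |
| Up to 100,000 | 2G-4G | 1 | 2 | 3 |
| Up to 300,000 | 6G-12G | 1 | 2 | 3 |
| Up to 500,000 | 12G-24G | 1 | 2 | 3 |
| Up to 1,000,000 | 24G-48G | 1 | 2 | 3 |
| Up to 2,000,000 | 24G-96G | 1 | 2 | 3 |
| Up to 10,000,000 | 128G | 1 | 2 | Contact Timecho sales |
| Over 10,000,000 | Contact Timecho sales | Contact Timecho sales | Contact Timecho sales | Contact Timecho sales |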
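
> Memory allocation is configurable. Users can adjust it in the datanode-env file; for details and a configuration guide, see [datanode-env](../Reference/System-Config-Manual_timecho.md#_2-2-datanode-env-sh-bat)

**Note:** For hardware sizing and throughput references specific to AI model inference, see section [2.3.1 Resource Configuration Recommendations](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md#_2-3-1-资源配置建议) of the AINode deployment document.

## 3. Storage (Disk)
### 3.1 Storage Space

Storage requirements can be calculated with the disk resource evaluator: [Disk Resource Evaluator](https://www.timecho.com/docs/zh/ResourceEvaluator.html)

Formula: number of measurement points * sampling frequency (Hz) * size of each data point (bytes; this differs by data type, see the table below)
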
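**Data point size calculation**

| Data type | Timestamp (bytes) | Value (bytes) | Total data point size (bytes) |
| --- | --- | --- | --- |
| Boolean | 8 | 1 | 9 |
| Integer (INT32) / single-precision float (FLOAT) | 8 | 4 | 12 |
| Long integer (INT64) / double-precision float (DOUBLE) | 8 | 8 | 16 |
| String (TEXT) | 8 | a on average | 8+a |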
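
The formula and the per-point sizes above can be sketched as a small calculation. This is only an illustration: the function name `estimate_storage_bytes` is hypothetical, the retention period, replica count, and compression ratio are inputs you must supply for your own workload, and TEXT is left out because its size depends on the average string length a.

```python
# Bytes per data point (8-byte timestamp + value), taken from the table above.
POINT_SIZE_BYTES = {
    "BOOLEAN": 9,
    "INT32": 12,
    "FLOAT": 12,
    "INT64": 16,
    "DOUBLE": 16,
}

SECONDS_PER_DAY = 86400


def estimate_storage_bytes(series_count, frequency_hz, data_type,
                           days, replicas=1, compression_ratio=1.0):
    """Hypothetical helper, not part of IoTDB.

    series * frequency * point size * retention, multiplied by the replica
    count and divided by an assumed compression ratio.
    """
    point_size = POINT_SIZE_BYTES[data_type]
    raw_bytes = series_count * frequency_hz * point_size * SECONDS_PER_DAY * days
    return raw_bytes * replicas / compression_ratio


if __name__ == "__main__":
    # Assumed scenario: 100,000 INT32 series at 1 Hz, kept for 365 days,
    # 3 replicas, and an assumed 10x compression ratio.
    total = estimate_storage_bytes(100_000, 1, "INT32", days=365,
                                   replicas=3, compression_ratio=10)
    print(f"{total / 1024**4:.2f} TiB")
```

The compression ratio actually achieved depends on the data, so treat the result as a planning estimate and round up when provisioning disks.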
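
Example: 1,000 devices with 100 measurement points each, 100,000 series in total, INT32 type, sampled at 1 Hz (once per second), stored for 1 year with 3 replicas.
- Full formula: 1000 devices * 100 measurement points * 12 bytes per data point * 86400 seconds per day * 365 days per year * 3 replicas / 10 compression ratio / 1024 / 1024 / 1024 / 1024 ≈ 10.3, rounded up to 11T
- Short formula: 1000 * 100 * 12 * 86400 * 365 * 3 / 10 / 1024 / 1024 / 1024 / 1024 ≈ 10.3, rounded up to 11T
### 3.2 Storage Configuration
For more than 10 million measurement points, or for heavy query loads, SSDs are recommended.
## 4. Network (NIC)
When write throughput does not exceed 10 million points per second, a Gigabit NIC is required; when write throughput exceeds 10 million points per second, a 10 Gigabit NIC is required.
| **Write throughput (data points/second)** | **NIC speed** |
| ------------------- | ------------- |
| < 10 million | 1 Gbps (Gigabit) |
| >= 10 million | 10 Gbps (10 Gigabit) |
## 5. Other Notes
An IoTDB cluster can scale out within seconds, and adding nodes does not require migrating existing data. You therefore need not worry that a cluster sized for your current data will fall short later; new nodes can be added whenever more capacity is needed.
\ No newline at end of file
diff --git a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Deployment-form_timecho.md b/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Deployment-form_timecho.md
deleted file mode 100644
index d49674d07..000000000
--- a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Deployment-form_timecho.md
+++ /dev/null
@@ -1,61 +0,0 @@

# Deployment Modes

IoTDB has three operating modes: standalone, cluster, and dual-active.

## 1. Standalone Mode

A standalone IoTDB instance consists of 1 ConfigNode and 1 DataNode, i.e. 1C1D.

- **Features**: Easy for developers to install and deploy, with low deployment and maintenance costs and simple operation.
- **Use cases**: Scenarios with limited resources or modest high-availability requirements, such as edge servers.
- **Deployment**: [Standalone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md)

## 2. Dual-Active Mode

Dual-active deployment is a TimechoDB Enterprise feature in which two independent instances synchronize bidirectionally and both serve requests at the same time. When one instance restarts after downtime, the other resumes transmission from the breakpoint to supply the missing data.

> A dual-active IoTDB setup usually consists of 2 standalone nodes, i.e. 2 sets of 1C1D. Each instance can also be a cluster.

- **Features**: The high-availability solution with the lowest resource footprint.
- **Use cases**: Limited resources (only two servers) where high availability is still required.
- **Deployment**: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md)

## 3. Cluster Mode

An IoTDB cluster consists of 3 ConfigNodes and at least 3 DataNodes, usually exactly 3 DataNodes, i.e. 3C3D. When some nodes fail, the remaining nodes continue to serve requests, which keeps the database service highly available, and database performance can be increased by adding nodes.

- **Features**: High availability and high scalability; system performance can be improved by adding DataNodes.
- **Use cases**: Enterprise applications that require high availability and reliability.
- **Deployment**: [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md)

## 4.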
特点总结 - -| 维度 | 单机模式 | 双活模式 | 集群模式 | -| ------------ | ---------------------------- | ------------------------ | ------------------------ | -| 适用场景 | 边缘侧部署、对高可用要求不高 | 高可用性业务、容灾场景等 | 高可用性业务、容灾场景等 | -| 所需机器数量 | 1 | 2 | ≥3 | -| 安全可靠性 | 无法容忍单点故障 | 高,可容忍单点故障 | 高,可容忍单点故障 | -| 扩展性 | 可扩展 DataNode 提升性能 | 每个实例可按需扩展 | 可扩展 DataNode 提升性能 | -| 性能 | 可随 DataNode 数量扩展 | 与其中一个实例性能相同 | 可随 DataNode 数量扩展 | - -- 单机模式和集群模式,部署步骤类似(逐个增加 ConfigNode 和 DataNode),仅副本数和可提供服务的最少节点数不同。 \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Docker-Deployment_timecho.md b/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Docker-Deployment_timecho.md deleted file mode 100644 index 80a847eaf..000000000 --- a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Docker-Deployment_timecho.md +++ /dev/null @@ -1,495 +0,0 @@ - -# Docker部署指导 - -## 1. 环境准备 - -### 1.1 Docker安装 - -```Bash -#以ubuntu为例,其他操作系统可以自行搜索安装方法 -#step1: 安装一些必要的系统工具 -sudo apt-get update -sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common -#step2: 安装GPG证书 -curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add - -#step3: 写入软件源信息 -sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" -#step4: 更新并安装Docker-CE -sudo apt-get -y update -sudo apt-get -y install docker-ce -#step5: 设置docker开机自启动 -sudo systemctl enable docker -#step6: 验证docker是否安装成功 -docker --version #显示版本信息,即安装成功 -``` - -### 1.2 docker-compose安装 - -```Bash -#安装命令 -curl -L "https://github.com/docker/compose/releases/download/v2.20.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose -chmod +x /usr/local/bin/docker-compose -ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose -#验证是否安装成功 -docker-compose --version #显示版本信息即安装成功 -``` - -### 1.3 安装dmidecode插件 - -默认情况下,linux服务器应该都已安装,如果没有安装的话,可以使用下面的命令安装。 - -```Bash -sudo apt-get install dmidecode -``` - -dmidecode 
安装后,查找安装路径:`whereis dmidecode`,这里假设结果为`/usr/sbin/dmidecode`,记住该路径,后面的docker-compose的yml文件会用到。 - -### 1.4 获取IoTDB的容器镜像 - -关于IoTDB企业版的容器镜像您可联系商务或技术支持获取。 - -## 2. 单机版部署 - -本节演示如何部署1C1D的docker单机版。 - -### 2.1 load 镜像文件 - -比如这里获取的IoTDB的容器镜像文件名是:`iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz` - -load镜像: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -查看镜像: - -```Bash -docker images -``` - -![](/img/%E5%8D%95%E6%9C%BA-%E6%9F%A5%E7%9C%8B%E9%95%9C%E5%83%8F.png) - -### 2.2 创建docker bridge网络 - -```Bash -docker network create --driver=bridge --subnet=172.18.0.0/16 --gateway=172.18.0.1 iotdb -``` - -### 2.3 编写docker-compose的yml文件 - -这里我们以把IoTDB安装目录和yml文件统一放在`/docker-iotdb` 文件夹下为例: - -文件目录结构为:`/docker-iotdb/iotdb`, `/docker-iotdb/docker-compose-standalone.yml ` - -```Bash -docker-iotdb: -├── iotdb #iotdb安装目录 -│── docker-compose-standalone.yml #单机版docker-compose的yml文件 -``` - -完整的`docker-compose-standalone.yml`内容如下: - -```Bash -version: "3" -services: - iotdb-service: - image: timecho/timechodb:2.0.2.1-standalone #使用的镜像 - hostname: iotdb - container_name: iotdb - restart: always - ports: - - "6667:6667" - environment: - - cn_internal_address=iotdb - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb:10710 - - dn_rpc_address=iotdb - - dn_internal_address=iotdb - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - dn_seed_config_node=iotdb:10710 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - networks: - iotdb: - ipv4_address: 172.18.0.6 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -networks: - iotdb: - external: true -``` - -### 2.4 首次启动 - -使用下面的命令启动: - -```Bash -cd /docker-iotdb -docker-compose -f docker-compose-standalone.yml up -``` - -由于没有激活,首次启动时会直接退出,属于正常现象,首次启动是为了获取机器码文件,用于后面的激活流程。 - -![](/img/%E5%8D%95%E6%9C%BA-%E6%BF%80%E6%B4%BB.png) - -### 2.5 申请激活 - -- 首次启动后,在物理机目录`/docker-iotdb/iotdb/activation`下会生成一个 `system_info`文件,将这个文件拷贝给天谋工作人员。 - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- 收到工作人员返回的license文件,将license文件拷贝到`/docker-iotdb/iotdb/activation`文件夹下。 - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -### 2.6 再次启动IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -![](/img/%E5%90%AF%E5%8A%A8iotdb.png) - -### 2.7 验证部署 - -- 查看日志,有如下字样,表示启动成功 - -```Bash -docker logs -f iotdb-datanode #查看日志命令 -2024-07-19 12:02:32,608 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
-``` - -![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B21.png) - -- 进入容器,查看服务运行状态及激活信息 - - 查看启动的容器 - - ```Bash - docker ps - ``` - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B22.png) - - 进入容器, 通过cli登录数据库, 使用show cluster命令查看服务状态及激活状态 - - ```Bash - docker exec -it iotdb /bin/bash #进入容器 - ./start-cli.sh -h iotdb #登录数据库 - IoTDB> show cluster #查看状态 - ``` - - 可以看到服务都是running,激活状态显示已激活。 - - ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B23.png) - -### 2.8 映射/conf目录(可选) - -后续如果想在物理机中直接修改配置文件,可以把容器中的/conf文件夹映射出来,分三步: - -步骤一:拷贝容器中的/conf目录到`/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -步骤二:在docker-compose-standalone.yml中添加映射 - -```Bash - volumes: - - ./iotdb/conf:/iotdb/conf #增加这个/conf文件夹的映射 - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -步骤三:重新启动IoTDB - -```Bash -docker-compose -f docker-compose-standalone.yml up -d -``` - -## 3. 集群版部署 - -本小节描述如何手动部署包括3个ConfigNode和3个DataNode的实例,即通常所说的3C3D集群。 - -
- -
- -**注意:集群版目前只支持host网络和overlay 网络,不支持bridge网络。** - -下面以host网络为例演示如何部署3C3D集群。 - -### 3.1 设置主机名 - -假设现在有3台linux服务器,IP地址和服务角色分配如下: - -| 节点ip | 主机名 | 服务 | -| ----------- | ------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode、DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode、DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode、DataNode | - -在3台机器上分别配置主机名,设置主机名需要在目标服务器上配置/etc/hosts,使用如下命令: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### 3.2 load镜像文件 - -比如获取的IoTDB的容器镜像文件名是:`iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz` - -在3台服务器上分别执行load镜像命令: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -查看镜像: - -```Bash -docker images -``` - -![](/img/%E9%95%9C%E5%83%8F%E5%8A%A0%E8%BD%BD.png) - -### 3.3 编写docker-compose的yml文件 - -这里我们以把IoTDB安装目录和yml文件统一放在/docker-iotdb文件夹下为例: - -文件目录结构为:`/docker-iotdb/iotdb`,`/docker-iotdb/confignode.yml`,`/docker-iotdb/datanode.yml` - -```Bash -docker-iotdb: -├── confignode.yml #confignode的yml文件 -├── datanode.yml #datanode的yml文件 -└── iotdb #IoTDB安装目录 -``` - -在每台服务器上都要编写2个yml文件,即`confignode.yml`和`datanode.yml`,yml示例如下: - -**confignode.yml:** - -```Bash -#confignode.yml -version: "3" -services: - iotdb-confignode: - image: iotdb-enterprise:2.0.x.x-standalone #使用的镜像 - hostname: iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - container_name: iotdb-confignode - command: ["bash", "-c", "entrypoint.sh confignode"] - restart: always - environment: - - cn_internal_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb-1:10710 #默认第一台为seed节点 - - schema_replication_factor=3 #元数据副本数 - - data_replication_factor=2 #数据副本数 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" 
#使用host网络 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -**datanode.yml:** - -```Bash -#datanode.yml -version: "3" -services: - iotdb-datanode: - image: iotdb-enterprise:2.0.x.x-standalone #使用的镜像 - hostname: iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - container_name: iotdb-datanode - command: ["bash", "-c", "entrypoint.sh datanode"] - restart: always - ports: - - "6667:6667" - privileged: true - environment: - - dn_rpc_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - dn_internal_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - dn_seed_config_node=iotdb-1:10710 #默认第1台为seed节点 - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - schema_replication_factor=3 #元数据副本数 - - data_replication_factor=2 #数据副本数 - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #使用host网络 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -### 3.4 Start the confignodes for the first time - -First start the confignode on each of the 3 servers to obtain the machine codes. Mind the startup order: start iotdb-1 first, then iotdb-2 and iotdb-3. - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d #start in the background -``` - -### 3.5 Apply for activation - -- After the 3 confignodes have started for the first time, a `system_info` file is generated under `/docker-iotdb/iotdb/activation` on each physical machine. Send the `system_info` files from all 3 servers to Timecho staff; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- Put each of the 3 license files into the `/docker-iotdb/iotdb/activation` folder of the corresponding ConfigNode; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -- Once the license is placed in the corresponding activation folder, the confignode activates automatically; no restart is needed - -### 3.6 Start the datanodes - -Start the datanode on each of the 3 servers - -```Bash -cd /docker-iotdb -docker-compose -f datanode.yml up -d #start in the background -``` - -![](/img/%E9%9B%86%E7%BE%A4%E7%89%88-dn%E5%90%AF%E5%8A%A8.png) - -### 3.7 Verify the deployment - -- Check the log; the following message indicates that the datanode started successfully - - ```Bash - docker logs -f iotdb-datanode #command to view the log - 2024-07-20 16:50:48,937 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself!
- ``` - - ![](/img/dn%E5%90%AF%E5%8A%A8.png) - -- Enter any one of the containers and check the service status and activation information - - List the running containers - - ```Bash - docker ps - ``` - - ![](/img/%E6%9F%A5%E7%9C%8B%E5%AE%B9%E5%99%A8.png) - - Enter the container, log in to the database through the CLI, and run `show cluster` to check the service and activation status - - ```Bash - docker exec -it iotdb-datanode /bin/bash #enter the container - ./start-cli.sh -h iotdb-1 #log in to the database - IoTDB> show cluster #check the status - ``` - - All services should be running and the activation status should show as activated. - - ![](/img/%E9%9B%86%E7%BE%A4-%E6%BF%80%E6%B4%BB.png) - -### 3.8 Map the /conf directory (optional) - -To edit configuration files directly on the physical machine later, map the container's /conf folder out in three steps: - -Step 1: On each of the 3 servers, copy the container's /conf directory to `/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb-confignode:/iotdb/conf /docker-iotdb/iotdb/conf -# or -docker cp iotdb-datanode:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -Step 2: On all 3 servers, add the /conf directory mapping to `confignode.yml` and `datanode.yml` - -```Bash -#confignode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #add this mapping for the /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - -#datanode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #add this mapping for the /conf folder - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -Step 3: Restart IoTDB on all 3 servers - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d -docker-compose -f datanode.yml up -d -``` \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md b/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md deleted file mode 100644 index 1a21357a0..000000000 --- a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md +++ /dev/null @@ -1,203 +0,0 @@ - -# Dual-Active Deployment Guide - -## 1. What is the dual-active edition?
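- -Dual-active usually refers to two independent standalone instances (or clusters) that mirror each other in real time. Their configurations are fully independent, both can accept external writes at the same time, and each one synchronizes the data written to it to the other, so that the data on the two eventually becomes consistent. - -- The two standalone instances (or clusters) form a high-availability group: when one stops serving, the other is unaffected. When the stopped one starts again, the other synchronizes the newly written data to it. Applications can bind to both for reads and writes, which achieves high availability. -- A dual-active deployment provides high availability with fewer than 3 physical nodes, which gives it a cost advantage. Dual ring networks for power and networking can also physically isolate the supply of the two instances (or clusters), which helps keep operation stable. -- Dual-active is currently an enterprise-edition feature. - -![](/img/%E5%8F%8C%E6%B4%BB%E5%90%8C%E6%AD%A5.png) - -## 2. Notes - -1. When deploying, prefer `hostname` over raw IP addresses in the configuration; this avoids the database failing to start after a host IP is changed later. Setting a hostname requires an entry in `/etc/hosts` on the target server. For example, if the local IP is 192.168.1.3 and the hostname is iotdb-1, use the following command to set the server's hostname, then use the hostname for IoTDB's `cn_internal_address` and `dn_internal_address`. - - ```Bash - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -2. Some parameters cannot be changed after the first startup; set them by following the "Installation Steps" section below. - -3. Deploying the monitoring panel is recommended. It monitors key runtime metrics so that the database status is always visible. Contact sales to obtain the monitoring panel; for deployment steps see the [documentation](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - -## 3. Installation Steps - -This example builds a dual-active IoTDB from two standalone machines, A and B, whose IPs are 192.168.1.3 and 192.168.1.4. Hostnames are used to identify the hosts, planned as follows: - -| Machine | IP | Hostname | -| ---- | ----------- | ------- | -| A | 192.168.1.3 | iotdb-1 | -| B | 192.168.1.4 | iotdb-2 | - -### Step1: Install two independent IoTDB instances - -Install IoTDB on each of the 2 machines. For standalone deployment see the [documentation](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md); for cluster deployment see the [documentation](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md). **Keeping the configuration of A and B identical is recommended for the best dual-active behavior.** - -### Step2: On machine A, create a data synchronization task to machine B - -- Create a data synchronization pipe on machine A so that data on A is automatically synchronized to B. Use the cli tool in the sbin directory to connect to the IoTDB database on A: - - ```Bash - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-1 - - # Windows - # Before V2.0.4.x - .\sbin\start-cli.bat -h iotdb-1 - - # V2.0.4.x and later - .\sbin\windows\start-cli.bat -h iotdb-1 - ``` - -- Create and start the synchronization pipe with the following SQL: - - ```Bash - create pipe AB - with source ( - 'source.mode.double-living' ='true' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-2', - 'sink.port'='6667' - ) - ``` - -- Note: to avoid an infinite data loop, the parameter `source.mode.double-living` must be set to `true` on both A and B, which means data received from the other pipe is not forwarded again. - -### Step3: On machine B, create a data synchronization task to machine A - - -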
Create a data synchronization pipe on machine B so that data on B is automatically synchronized to A. Use the cli tool in the sbin directory to connect to the IoTDB database on B: - - ```Bash - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-2 - - # Windows - # Before V2.0.4.x - .\sbin\start-cli.bat -h iotdb-2 - - # V2.0.4.x and later - .\sbin\windows\start-cli.bat -h iotdb-2 - ``` - - Create and start the pipe with the following SQL: - - ```Bash - create pipe BA - with source ( - 'source.mode.double-living' ='true' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-1', - 'sink.port'='6667' - ) - ``` - -- Note: to avoid an infinite data loop, the parameter `source.mode.double-living` must be set to `true` on both A and B, which means data received from the other pipe is not forwarded again. - -### Step4: Verify the deployment - -Once both synchronization pipes above have been created, the dual-active setup is in operation. - -#### Check the cluster status - -```Bash -#run show cluster on each of the 2 nodes to check the IoTDB service status -show cluster -``` - -**Machine A**: - -![](/img/%E5%8F%8C%E6%B4%BB-A.png) - -**Machine B**: - -![](/img/%E5%8F%8C%E6%B4%BB-B.png) - -Make sure every ConfigNode and DataNode is in the Running state. - -#### Check the synchronization status - -- Check the synchronization status on machine A - -```Bash -show pipes -``` - -![](/img/show%20pipes-A.png) - -- Check the synchronization status on machine B - -```Bash -show pipes -``` - -![](/img/show%20pipes-B.png) - -Make sure every pipe is in the RUNNING state. - -### Step5: Stop the dual-active IoTDB - -- Run the following commands on machine A: - - ```SQL - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-1 #log in to the cli - IoTDB> stop pipe AB #stop the data synchronization pipe - ./sbin/stop-standalone.sh #stop the database service - - # Windows - # Before V2.0.4.x - .\sbin\start-cli.bat -h iotdb-1 - IoTDB> stop pipe AB - .\sbin\stop-standalone.bat - - # V2.0.4.x and later - .\sbin\windows\start-cli.bat -h iotdb-1 - IoTDB> stop pipe AB - .\sbin\windows\stop-standalone.bat - ``` - -- Run the following commands on machine B: - - ```SQL - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-2 #log in to the cli - IoTDB> stop pipe BA #stop the data synchronization pipe - ./sbin/stop-standalone.sh #stop the database service - - # Windows - # Before V2.0.4.x - .\sbin\start-cli.bat -h iotdb-2 - IoTDB> stop pipe BA - .\sbin\stop-standalone.bat - - # V2.0.4.x and later - .\sbin\windows\start-cli.bat -h iotdb-2 - IoTDB> stop pipe BA - .\sbin\windows\stop-standalone.bat - ``` diff --git a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/IoTDB-Package_timecho.md
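b/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/IoTDB-Package_timecho.md deleted file mode 100644 index f6bd4cb1a..000000000 --- a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/IoTDB-Package_timecho.md +++ /dev/null @@ -1,47 +0,0 @@ - -# Obtaining the Installation Package -## 1. How to obtain it - -The enterprise-edition installation package can be obtained through a product trial application, or directly from your Timecho contact. - -## 2. Package structure - -After the package is unzipped, the directory structure is as follows: - -| **Directory** | **Type** | **Description** | -| :--------------- | :------- | :----------------------------------------------------------- | -| activation | Folder | Directory for activation files, including the generated machine code and the enterprise-edition activation code obtained from Timecho staff (this directory is generated only after the ConfigNode has been started, after which the activation code can be obtained) | -| conf | Folder | Configuration directory, containing the ConfigNode, DataNode, JMX, and logback configuration files | -| data | Folder | Default data directory, containing the ConfigNode and DataNode data files (generated only after the program has been started) | -| lib | Folder | Library directory | -| licenses | Folder | Directory for open-source license files | -| logs | Folder | Default log directory, containing the ConfigNode and DataNode log files (generated only after the program has been started) | -| sbin | Folder | Main script directory, containing the database start and stop scripts | -| tools | Folder | Tools directory | -| ext | Folder | Files related to pipe, trigger, and udf plugins | -| LICENSE | File | Open-source license file | -| NOTICE | File | Open-source notice file | -| README_ZH.md | File | User instructions (Chinese) | -| README.md | File | User instructions (English) | -| RELEASE_NOTES.md | File | Release notes | - -Note: starting with V2.0.8.2, the TimechoDB installation package no longer includes the JAR packages for the MQTT service and the REST service by default. Contact the Timecho team if you need them. \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md b/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md deleted file mode 100644 index f5b5ff6d8..000000000 --- a/src/zh/UserGuide/latest-Table/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md +++ /dev/null @@ -1,306 +0,0 @@ - -# Standalone Deployment Guide - -This chapter describes how to start a standalone IoTDB instance, which consists of 1 ConfigNode and 1 DataNode (commonly called 1C1D). - -## 1. Notes - -1. Before installing, confirm that the system has been prepared according to [System Configuration](../Deployment-and-Maintenance/Environment-Requirements.md). -2.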
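Prefer `hostname` over raw IP addresses in the configuration; this avoids the database failing to start after a host IP is changed later. Setting a hostname requires an entry in `/etc/hosts` on the server. For example, if the local IP is 192.168.1.3 and the hostname is iotdb-1, use the following command to set the server's hostname, then use the hostname for IoTDB's `cn_internal_address` and `dn_internal_address`. - - ```shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -3. Some parameters cannot be changed after the first startup; set them by following the [Parameter Configuration](#2参数配置) section below. -4. On both Linux and Windows, make sure the IoTDB installation path contains no spaces or Chinese characters, to avoid runtime problems. -5. When installing and deploying IoTDB (including activating and using the software), you can: - - Use the root user (recommended): this avoids permission problems. - - Use a fixed non-root user: - - Operate as the same user throughout: make sure start, activate, stop, and other operations are all performed by the same user; do not switch users. - - Avoid sudo: sudo runs commands with root privileges, which can cause permission confusion or security problems. -6. Deploying the monitoring panel is recommended. It monitors key runtime metrics so that the database status is always visible. Contact Timecho staff to obtain the monitoring panel; for deployment steps see [Monitoring Panel Deployment](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - -7. Before installing and deploying the database, you can use the health check tool to inspect the runtime environment of the IoTDB node and obtain detailed results. For usage see [Health Check Tool](../Tools-System/Health-Check-Tool.md). - - -## 2. Installation Steps - -### 2.1 Pre-installation check - -To make sure the IoTDB enterprise-edition package you obtained is complete and correct, running a SHA512 check before installation is recommended. - -#### Preparation: - -- Obtain the officially published SHA512 checksum: see the "SHA512 checksum" listed for each version in the [Release History](../IoTDB-Introduction/Release-history_timecho.md) document - -#### Verification steps (Linux as an example): - -1. Open a terminal and change to the directory containing the package (for example `/data/iotdb`): - ```Bash - cd /data/iotdb - ``` -2. Run the following command to compute the hash: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. The terminal prints the result (SHA512 checksum on the left, file name on the right): - -![img](/img/sha512-01.png) - -4.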
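Compare the output with the official SHA512 checksum. Once they match, proceed with the IoTDB enterprise-edition installation steps below. - -#### Notes: - -- If the checksums do not match, contact Timecho staff to obtain the package again -- If a "file does not exist" message appears during verification, check that the file path is correct and that the package was downloaded completely - -### 2.2 Unzip the package and enter the installation directory - -```Plain -unzip timechodb-{version}-bin.zip -cd timechodb-{version}-bin -``` - -### 2.3 Parameter configuration - -#### 2.3.1 Memory configuration - -- conf/confignode-env.sh (or .bat) - -| **Setting** | **Description** | **Default** | **Recommended** | Remarks | -| :---------- | :------------------------------------- | :--------- | :----------------------------------------------- | :----------- | -| MEMORY_SIZE | Total memory the IoTDB ConfigNode may use | Computed automatically from system memory; 30% of system memory by default | Set as needed; the system allocates memory according to the value entered | Save after editing, no command needs to be run; takes effect after the service restarts | - -- conf/datanode-env.sh (or .bat) - -| **Setting** | **Description** | **Default** | **Recommended** | Remarks | -| :---------- | :----------------------------------- |:-----------------------| :----------------------------------------------- | :----------- | -| MEMORY_SIZE | Total memory the IoTDB DataNode may use | Computed automatically from system memory; 50% of system memory by default | Set as needed; the system allocates memory according to the value entered | Save after editing, no command needs to be run; takes effect after the service restarts | - -#### 2.3.2 Functional configuration - -The parameters that actually take effect are in conf/iotdb-system.properties. The following parameters must be set before startup; the full list is in conf/iotdb-system.properties.template - -Cluster-level configuration - -| **Setting** | **Description** | **Default** | **Recommended** | Remarks | -| :------------------------ | :------------------------------- | :------------- | :----------------------------------------------- |:-------------------------| -| cluster_name | Cluster name | defaultCluster | Set a cluster name if needed; otherwise keep the default | Supports hot reload, but changing it manually is not recommended | -| schema_replication_factor | Number of schema replicas; set to 1 for standalone | 1 | 1 | Default 1; cannot be changed after the first startup | -| data_replication_factor | Number of data replicas; set to 1 for standalone | 1 | 1 | Default 1; cannot be changed after the first startup | - -ConfigNode configuration - -| **Setting** | **Description** | **Default** | Recommended | **Remarks** | -| :------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------- | :----------------- | -| cn_internal_address | Address the ConfigNode uses for communication inside the cluster | 127.0.0.1 | The server's IPv4 address or hostname; hostname is recommended | Cannot be changed after the first startup | -| cn_internal_port | Port the ConfigNode uses for communication inside the cluster | 10710 | 10710 | Cannot be changed after the first startup | -|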
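cn_consensus_port | Port used by the ConfigNode replica group's consensus protocol | 10720 | 10720 | Cannot be changed after the first startup | -| cn_seed_config_node | Address of the ConfigNode that a node connects to when registering to join the cluster, in the form cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | Cannot be changed after the first startup | - -DataNode configuration - -| **Setting** | **Description** | **Default** | Recommended | **Remarks** | -| :------------------------------ | :----------------------------------------------------------- | :-------------- |:----------------------------------------| :----------------- | -| dn_rpc_address | Address of the client RPC service | 127.0.0.1 | Directly accessible from the local machine by default. For remote access, change this to the server's IPv4 address or hostname; the server's IPv4 address is recommended. | Takes effect after the service restarts | -| dn_rpc_port | Port of the client RPC service | 6667 | 6667 | Takes effect after the service restarts | -| dn_internal_address | Address the DataNode uses for communication inside the cluster | 127.0.0.1 | The server's IPv4 address or hostname; hostname is recommended | Cannot be changed after the first startup | -| dn_internal_port | Port the DataNode uses for communication inside the cluster | 10730 | 10730 | Cannot be changed after the first startup | -| dn_mpp_data_exchange_port | Port the DataNode uses to receive data streams | 10740 | 10740 | Cannot be changed after the first startup | -| dn_schema_region_consensus_port | Port the DataNode uses for the schema replica consensus protocol | 10750 | 10750 | Cannot be changed after the first startup | -| dn_data_region_consensus_port | Port the DataNode uses for the data replica consensus protocol | 10760 | 10760 | Cannot be changed after the first startup | -| dn_seed_config_node | Address of the ConfigNode that a node connects to when registering to join the cluster, i.e. cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | Cannot be changed after the first startup | - -### 2.4 Start the ConfigNode - -From the iotdb installation directory, start the confignode - -```shell -# Unix/OS X -./sbin/start-confignode.sh -d #the "-d" option starts it in the background - -# Windows -# Before V2.0.4.x -.\sbin\start-confignode.bat - -# V2.0.4.x and later -.\sbin\windows\start-confignode.bat -``` - -If startup fails, see the [FAQ](#常见问题) below. - -### 2.5 Start the DataNode - - From the iotdb installation directory, start the datanode: - -```shell -# Unix/OS X -./sbin/start-datanode.sh -d #the "-d" option starts it in the background - -# Windows -# Before V2.0.4.x -.\sbin\start-datanode.bat - -# V2.0.4.x and later -.\sbin\windows\start-datanode.bat -``` - -### 2.6 Activate the database - -#### Method 1: activation by command - -- Enter the IoTDB CLI - -```shell -# Start commands for Linux and MacOS: -# Before V2.0.6.x -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table - -#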
V2.0.6.x and later -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table - -# Start commands for Windows: -# Before V2.0.4.x -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table - -# V2.0.4.x and later, before V2.0.6.x -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table - -# V2.0.6.x and later -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` - -- Run the following statement to obtain the machine code required for activation: - -```SQL -IoTDB> show system info -``` -```shell -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -- Run the following statement to obtain the version of the database to be activated: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.x.x| xxxxxxx| -+-------+---------+ -Total line number = 1 -``` - -- Send the machine code together with the version number to Timecho staff. - -- Enter the activation code returned by the staff into the CLI to activate. Note that the activation code must be enclosed in `'` characters, as shown below - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - - -#### Method 2: activation by file - -- After starting the ConfigNode and DataNode, go to the activation folder and send the system_info file to Timecho staff -- Receive the license file returned by the staff -- Put the license file into the activation folder of the corresponding node; - - -### 2.7 Verify activation - -Run `show activation` in the CLI to check the activation status. When the "ClusterActivationStatus" field shows ACTIVATED, activation succeeded - -![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81.png) - -## 3. FAQ - -1. Activation repeatedly reported as failed during deployment - - Check the installation root: use `ls -al` to check whether the owner of the installation package's root directory is the current user. - - Check the activation directory: check whether the owner of every file under `./activation` is the current user. -2.
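ConfigNode fails to start - - Step 1: Check the startup log to see whether any parameter that cannot be changed after the first startup was modified. - - Step 2: Check the startup log for any other exceptions. If the log shows abnormal behavior, contact Timecho technical support for a solution. - - Step 3: If this is a first deployment or the data can be deleted, you can also clean the environment as described below, redeploy, and start again. - - Clean the environment: - 1. Stop all ConfigNode and DataNode processes. - ```Bash - # 1. Stop the ConfigNode and DataNode services - # Unix/OS X - sbin/stop-standalone.sh - - # Windows - # Before V2.0.4.x - sbin\stop-standalone.bat - - # V2.0.4.x and later - sbin\windows\stop-standalone.bat - - # 2. Check whether any processes are left over - jps - # or - ps -ef|grep iotdb - - # 3. If any processes are left over, kill them manually - kill -9 <pid> # replace <pid> with the process ID found in step 2 - # If you are sure there is only 1 iotdb on the machine, the following command cleans up leftover processes - ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9 - ``` - - 2. Delete the data and logs directories. - - Note: deleting the data directory is required; deleting the logs directory only gives you clean logs and is optional. - ```shell - cd /data/iotdb && rm -rf data logs - ``` - -## 4. Appendix - -### 4.1 ConfigNode startup options - -| Option | Description | Required | -| :--- | :------------------------------- | :----------- | -| -d | Start in daemon mode, i.e. run in the background | No | - -### 4.2 DataNode startup options - -| Option | Description | Required | -| :--- | :--------------------------------------------- | :----------- | -| -v | Show version information | No | -| -f | Run the script in the foreground instead of the background | No | -| -d | Start in daemon mode, i.e. run in the background | No | -| -p | Specify a file that stores the process ID, for process management | No | -| -c | Specify the path of the configuration folder from which the script loads configuration files | No | -| -g | Print detailed garbage collection (GC) information | No | -| -H | Specify the path of the Java heap dump file, used when the JVM runs out of memory | No | -| -E | Specify the path of the JVM error log file | No | -| -D | Define a system property, in the form key=value | No | -| -X | Pass -XX options directly to the JVM | No | -| -h | Help | No | - diff --git a/src/zh/UserGuide/latest-Table/Ecosystem-Integration/Ecosystem-Overview_timecho.md b/src/zh/UserGuide/latest-Table/Ecosystem-Integration/Ecosystem-Overview_timecho.md deleted file mode 100644 index a27cd20bd..000000000 --- a/src/zh/UserGuide/latest-Table/Ecosystem-Integration/Ecosystem-Overview_timecho.md +++ /dev/null @@ -1,38 +0,0 @@ - - -# Overview - -IoTDB ecosystem integration connects the full time-series data pipeline: data collection brings devices online within seconds, data integration builds cross-cloud pipelines, programming frameworks speed up business logic development, compute engines handle distributed processing, visualization and SQL development implement analysis strategies, and IoT platform integration completes edge-cloud collaboration, forming a closed loop from the physical world to data-driven decisions. - -![](/img/eco-overview-n.png) - -The documents below explain how to use the integration tools available at each stage: - -- Compute engines - - Spark [Spark](./Spark-IoTDB.md) -- SQL development - - DBeaver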
[DBeaver](./DBeaver.md) - - DataGrip [DataGrip](./DataGrip.md) -- Programming frameworks - - Spring Boot Starter [Spring Boot Starter](./Spring-Boot-Starter.md) - - Mybatis Generator [Mybatis Generator](./Mybatis-Generator.md) - - MyBatisPlus Generator [MyBatisPlus Generator](./MyBatisPlus-Generator.md) \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/Ecosystem-Integration/SeaTunnel_timecho.md b/src/zh/UserGuide/latest-Table/Ecosystem-Integration/SeaTunnel_timecho.md deleted file mode 100644 index 8eae5ac09..000000000 --- a/src/zh/UserGuide/latest-Table/Ecosystem-Integration/SeaTunnel_timecho.md +++ /dev/null @@ -1,193 +0,0 @@ - - -# Apache SeaTunnel - -## 1. Overview - -SeaTunnel is a distributed integration platform designed for massive data volumes. With high performance and elastic scalability, it connects heterogeneous data sources through standardized Connectors (each made up of a Source and a Sink). The platform uses Sources to abstract all kinds of data sources into the unified SeaTunnelRow format and, after dynamic resource scheduling and batch optimization, uses Sinks to write efficiently into different storage systems. The integration of the IoTDB Connector with SeaTunnel addresses core challenges of time-series scenarios such as high-throughput writes, multi-source governance, and complex analysis, and its out-of-the-box connector ecosystem and automated operations help enterprises in IoT and industrial internet quickly build low-cost, reliable, and scalable data infrastructure. - -## 2. Usage - -### 2.1 Environment preparation - -#### 2.1.1 Software requirements - -| Software | Version | Installation reference | -| ----------- | ---------- |-----------------------------------------------| -| IoTDB | >= 2.0.5 | [Quick Start](../QuickStart/QuickStart_timecho.md) | -| SeaTunnel | 2.3.12 | [Official website](https://seatunnel.apache.org/download) | - -* Resolving the Thrift version conflict (only needed for the Spark engine): - -```Bash -# Remove the old Thrift from Spark -rm -f $SPARK_HOME/jars/libthrift* -# Copy IoTDB's Thrift library into Spark -cp $IOTDB_HOME/lib/libthrift* $SPARK_HOME/jars/ -``` - -#### 2.1.2 Dependency configuration - -1. JDBC - -* Spark/Flink engine: put the [JDBC driver Jar](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) into the `${SEATUNNEL_HOME}/plugins/` directory -* SeaTunnel Zeta engine: put the [JDBC driver Jar](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) into the `${SEATUNNEL_HOME}/lib/` directory - -2.
Connector - -Put the matching version of the [SeaTunnel Connector](https://mvnrepository.com/artifact/org.apache.seatunnel/connector-iotdb) into the `${SEATUNNEL_HOME}/plugins/` directory - -### 2.2 Reading data (IoTDB Source Connector) - -#### 2.2.1 Configuration parameters - -| **Parameter** | **Type** | **Required** | **Default** | **Description** | -| ---------------------------------- | ---------------- | ---------------- | ------------------ |-----------------------------------------------------------------------| -| `node_urls` | string | Yes | - | IoTDB cluster address, in the form `"host1:port"` or `"host1:port,host2:port"` | -| `username` | string | Yes | - | IoTDB username | -| `password` | string | Yes | - | IoTDB password | -| `sql_dialect` | string | No | tree | IoTDB model; tree: tree model; table: table model | -| `sql` | string | Yes | - | SQL query to execute | -| `database` | string | No | - | Database name; only takes effect in the table model | -| `schema` | config | Yes | - | Data schema definition | -| `fetch_size` | int | No | - | Fetch size: amount of data fetched from IoTDB per request during a query | -| `lower_bound`| long | No | - | Lower bound of the time range (used when splitting data by the time column) | -| `upper_bound` | long | No | - | Upper bound of the time range (used when splitting data by the time column) | -| `num_partitions`| int | No | - | Number of partitions (used when splitting data by the time column):
With 1 partition: the full time range is used
若分区数 < (上界-下界),则使用差值作为实际分区数 | -| `thrift_default_buffer_size` | int | 否 | - | Thrift 协议缓冲区大小 | -| `thrift_max_frame_size` | int | 否 | - | Thrift 最大帧尺寸 | -| `enable_cache_leader` | boolean | 否 | - | 是否启用 Leader 节点缓存 | -| `version` | string | 否 | - | 客户端 SQL 语义版本`(V_0_12/V_0_13)` | - -#### 2.2.2 配置示例 - -1. 在 `${SEATUNNEL_HOME}/`​`config/` 目录下新建` iotdb_source_example.conf` - -```SQL -env { - parallelism = 2 # 并行度为2 - job.mode = "BATCH" # 批处理模式 -} - -source { - IoTDB { - node_urls = "localhost:6667" - username = "root" - password = "root" - sql_dialect = "table" - sql = "SELECT time,device_id,city,s1,s2,s3,s4 FROM tcollector.table1" - schema { - fields { - time = timestamp - device_id = string - city= string - s1= int - s2= bigint - s3= float - s4= double - } - } - } -} - -sink { - Console { - } # 输出到控制台 -} -``` - -2. 执行如下命令运行 seaTunnel - -```Bash -./bin/seatunnel.sh --config config/iotdb_source_example.conf -e local -``` - -3. 更多详情请参考 Apache SeanTunnel 官网 [IoTDB Source Connector](https://seatunnel.incubator.apache.org/zh-CN/docs/2.3.12/connector-v2/source/IoTDB) 相关介绍 - -### 2.3 写入数据(IoTDB Sink Connector) - -#### 2.3.1 配置参数 - -| **名称** | **类型** | **是否必传​** | **默认值** | **描述** | -|-------------------------------|---------| ---------------------- |------------------| --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `node_urls` | Array | 是 | - | `IoTDB`集群地址,格式为` ["host1:port"]`或`["host1:port","host2:port"]` | -| `username` | String | 是 | - | `IoTDB`用户的用户名 | -| `password` | String | 是 | - | `IoTDB`用户的密码 | -| `sql_dialect` | String | 否 | tree | `IoTDB`模型,tree:树模型;table:表模型 | -| `storage_group` | String | 是 | - | `IoTDB`树模型:指定设备存储组(路径前缀) 例: deviceId = \${storage\_group} + "." 
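+ \${key\_device}; `IoTDB` table model: the database | -| `key_device` | String | Yes | - | `IoTDB` tree model: name of the SeaTunnelRow field that holds the `IoTDB` device ID; `IoTDB` table model: name of the SeaTunnelRow field that holds the `IoTDB` table name | -| `key_timestamp` | String | No | processing time | `IoTDB` tree model: name of the SeaTunnelRow field that holds the `IoTDB` timestamp (if not set, the processing time is used as the timestamp); `IoTDB` table model: name of the SeaTunnelRow field that holds the IoTDB time column (if not set, the processing time is used as the timestamp) | -| `key_measurement_fields` | Array | No | See description | `IoTDB` tree model: names of the SeaTunnelRow fields that hold the `IoTDB` measurements (if not set, all fields remaining after excluding `key_device`&`key_timestamp`); `IoTDB` table model: names of the SeaTunnelRow fields that hold the `IoTDB` measurement columns (FIELD) (if not set, all fields remaining after excluding `key_device`&`key_timestamp`&`key_tag_fields`&`key_attribute_fields`) | -| `key_tag_fields` | Array | No | - | `IoTDB` tree model: no effect; `IoTDB` table model: names of the SeaTunnelRow fields that hold the `IoTDB` tag columns (TAG) | -| `key_attribute_fields` | Array | No | - | `IoTDB` tree model: no effect; `IoTDB` table model: names of the SeaTunnelRow fields that hold the `IoTDB` attribute columns (ATTRIBUTE) | -| `batch_size` | Integer | No | 1024 | For batch writes, data is flushed to IoTDB when the number of buffered records reaches `batch_size` or the elapsed time reaches `batch_interval_ms` | -| `max_retries` | Integer | No | - | Number of retries when a flush fails | -| `retry_backoff_multiplier_ms` | Integer | No | - | Multiplier used to generate the next backoff delay | -| `max_retry_backoff_ms` | Integer | No | - | Amount of time to wait before retrying a request to `IoTDB` | -| `default_thrift_buffer_size` | Integer | No | - | Thrift initial buffer size in the `IoTDB` client | -| `max_thrift_frame_size` | Integer | No | - | Thrift maximum frame size in the `IoTDB` client | -| `zone_id` | string | No | - | java.time.ZoneId used by the `IoTDB` client | -| `enable_rpc_compression` | Boolean | No | - | Enable rpc compression in the `IoTDB` client | -| `connection_timeout_in_ms` | Integer | No | - | Maximum time to wait when connecting to `IoTDB` (milliseconds) | - -#### 2.3.2 Configuration example - -1.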
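Create `iotdb_sink_example.conf` in the `${SEATUNNEL_HOME}/config/` directory - -```Bash -# Define the runtime environment -env { - parallelism = 4 - job.mode = "BATCH" -} -source{ - Jdbc { - url = "jdbc:mysql://localhost:3306/demo_db?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true" - driver = "com.mysql.cj.jdbc.Driver" - connection_check_timeout_sec = 100 - user = "root" - password = "IoTDB@2024" - query = "select * from device" - } -} -sink { - IoTDB { - node_urls = ["localhost:6667"] - username = "root" - password = "root" - sql_dialect = "table" - storage_group = "seatunnel" - key_device = "id" - key_timestamp = "intime" - } -} -``` - -2. Run SeaTunnel with the following command - -```Bash -./bin/seatunnel.sh --config config/iotdb_sink_example.conf -e local -``` - -3. For more configuration parameters and examples see the [IoTDB Sink Connector](https://seatunnel.incubator.apache.org/zh-CN/docs/2.3.12/connector-v2/sink/IoTDB) page on the Apache SeaTunnel website - - diff --git a/src/zh/UserGuide/latest-Table/IoTDB-Introduction/IoTDB-Introduction_timecho.md b/src/zh/UserGuide/latest-Table/IoTDB-Introduction/IoTDB-Introduction_timecho.md deleted file mode 100644 index a82e4ed7f..000000000 --- a/src/zh/UserGuide/latest-Table/IoTDB-Introduction/IoTDB-Introduction_timecho.md +++ /dev/null @@ -1,271 +0,0 @@ - - -# Product Introduction - -TimechoDB is a low-cost, high-performance, IoT-native time-series database. It is the original vendor's commercial product, provided by Timecho on top of the Apache IoTDB community edition. It addresses the problems enterprises face when managing time-series data on an IoT big data platform: complex application scenarios, large data volumes, high sampling frequencies, heavy out-of-order data, long processing times, diverse analysis needs, and high storage and operations costs. - -On top of TimechoDB, Timecho offers more product features, stronger performance and stability, and richer productivity tools, together with full enterprise services, giving commercial customers stronger product capabilities and a better development, operations, and usage experience. - -- Download, deployment, and usage: [Quick Start](../QuickStart/QuickStart_timecho.md) - -## 1. Product Portfolio - -The Timecho product portfolio consists of several components that cover the full time-series data lifecycle, from data collection to data management to data analysis and application. It provides an integrated "collect, store, use" solution that helps users efficiently manage and analyze the massive time-series data produced by IoT. - -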
- Introduction-zh-timecho.png -
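- - -Specifically: - -1. **Time-series database (TimechoDB, the original vendor's commercial product based on Apache IoTDB)**: the core component for time-series data storage. It offers high-compression storage, rich time-series queries, and real-time stream processing, together with high data availability and high cluster scalability, and provides comprehensive security guarantees. TimechoDB also ships a variety of tools for configuring and managing the system, plus multi-language APIs and integration with external systems, which make it easy to build business applications on top of TimechoDB. -2. **Standard time-series file format (Apache TsFile, led and contributed to by several core Timecho team members)**: a storage format designed specifically for time-series data that stores and queries massive time-series data efficiently. The underlying storage files of Timecho's collection, storage, and intelligent analysis modules are all backed by Apache TsFile. TsFile can be loaded into IoTDB efficiently and can also be migrated out. With TsFile, users can use the same file format throughout collection, management, and application and analysis, which greatly simplifies the path from data collection to analysis and makes time-series data management more efficient and convenient. -3. **Integrated time-series model training and inference engine (AINode)**: for intelligent analysis scenarios, TimechoDB provides AINode, an integrated training and inference engine for time-series models. It offers a complete set of time-series analysis tools built on a model training engine that supports training tasks and data management, covering machine learning, deep learning, and more. With these tools, users can analyze the data stored in TimechoDB in depth and extract its value. -4. **Data collection**: to connect more easily to all kinds of industrial collection scenarios, Timecho provides a data collection and ingestion service that supports multiple protocols and formats and can ingest data produced by a wide range of sensors and devices. It also supports resumable transfer and network gateway traversal. This suits the difficult configuration, slow transmission, and weak networks typical of industrial data collection, and makes data collection simpler and more efficient. - -## 2. TimechoDB Architecture - -The figure below shows a common IoTDB 3C3D (3 ConfigNodes, 3 DataNodes) cluster deployment: - - - -## 3. Product Features - -TimechoDB has the following advantages and features: - -- Flexible deployment: supports one-click cloud deployment, unzip-and-run on terminals, and seamless terminal-to-cloud connection (cloud data synchronization tool) - -- Storage with low hardware cost: supports high-compression disk storage, with no need to separate historical and real-time databases; data is managed uniformly - -- Hierarchical organization of measurement points: supports modeling according to the actual hierarchy of devices so that it aligns with industrial measurement point management structures, and supports browsing and searching the hierarchy - -- High-throughput reads and writes: supports millions of connected devices, high-speed reads and writes, and complex industrial scenarios such as out-of-order and multi-frequency collection - -- Rich time-series query semantics: supports a native time-series compute engine, timestamp alignment at query time, nearly a hundred built-in aggregation and time-series functions, and time-series feature analysis and AI capabilities - -- Highly available distributed system: supports an HA distributed architecture and provides 7*24 uninterrupted real-time database service; the failure of one physical node or a network fault does not affect normal operation. When physical nodes are added, removed, or become overloaded, the system automatically rebalances compute and storage resources. Heterogeneous environments are supported: servers of different types and performance can form a cluster, and the system balances load automatically according to each machine's configuration - -- Very low barrier to use and operate: supports an SQL-like language, provides native multi-language development interfaces, and has a complete tool suite including a console - -- Rich ecosystem integration: supports integration with big data ecosystem components such as Hadoop and Spark, and with device management and visualization tools such as Grafana, Thingsboard, and DataEase - -## 4.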
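Enterprise Features - -### 4.1 More advanced product features - -On top of Apache IoTDB, TimechoDB provides more advanced product features, with native kernel-level upgrades and optimizations for industrial production scenarios, such as tiered storage, cloud-edge collaboration, visualization tools, and security enhancements. These let users focus on business development without worrying much about the underlying logic, making industrial production simpler and more efficient and bringing more economic benefit to the enterprise. For example: - -- Dual-active deployment: dual-active usually refers to two independent standalone instances (or clusters) that mirror each other in real time. Their configurations are fully independent, both can accept external writes at the same time, and each one synchronizes the data written to it to the other, so that the data on the two eventually becomes consistent. - -- Data synchronization: the database's built-in synchronization module supports aggregating data from sites to a center, covering full aggregation, partial aggregation, cascaded aggregation, and similar scenarios, in both real-time and batch modes. It also provides a variety of built-in plugins that meet enterprise requirements such as network gateway traversal, encrypted transmission, and compressed transmission. - -- Tiered storage: by upgrading the underlying storage capability, data can be divided into cold, warm, and hot tiers according to factors such as access frequency and importance and stored on different media (such as SSD, hard disk drives, or cloud storage), with the system scheduling data across tiers during queries. This lowers storage cost while preserving access speed. - -- Security enhancements: features such as whitelists and audit logs strengthen internal management and reduce the risk of data leaks. - -A detailed feature comparison follows: -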
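| Category | Feature | Apache IoTDB | TimechoDB |
| --- | --- | --- | --- |
| Deployment modes | Standalone deployment | √ | √ |
| Deployment modes | Distributed deployment | √ | √ |
| Deployment modes | Dual-active deployment | - | √ |
| Deployment modes | Container deployment | Partially supported | √ |
| Database features | Measurement point management | √ | √ |
| Database features | Data writing | √ | √ |
| Database features | Data query | √ | √ |
| Database features | Continuous query | √ | √ |
| Database features | Triggers | √ | √ |
| Database features | User-defined functions | √ | √ |
| Database features | Permission management | √ | √ |
| Database features | Data synchronization | File synchronization only, no built-in plugins | Real-time and file synchronization, rich built-in plugins |
| Database features | Stream processing | Framework only, no built-in plugins | Framework plus rich built-in plugins |
| Database features | Tiered storage | - | √ |
| Database features | Views | - | √ |
| Database features | Whitelist | - | √ |
| Database features | Audit log | - | √ |
| Supporting tools | Visual console | - | √ |
| Supporting tools | Cluster management tool | - | √ |
| Supporting tools | System monitoring tool | - | √ |
| Domestic platform support | Domestic compatibility certification | - | √ |
| Technical support | Expert services | - | √ |
| Technical support | Usage training | - | √ |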
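- -### 4.2 More efficient and stable performance - -TimechoDB improves stability and performance on top of Apache IoTDB. With enterprise-edition technical support, it can achieve a performance improvement of more than 10x, and it recovers promptly from failures. - -### 4.3 A more user-friendly tool suite - -TimechoDB provides a simpler and easier tool suite. The cluster monitoring panel (IoTDB Grafana), the database console (IoTDB Workbench), and the cluster management tool (IoTDB Deploy Tool, IoTD for short) help users deploy, manage, and monitor database clusters quickly, reduce the workload and learning cost of operations staff, and make database operations simpler and faster. - -- Cluster monitoring panel: designed to monitor IoTDB and the operating system it runs on. It covers operating system resource monitoring, IoTDB performance monitoring, and hundreds of kernel metrics, helping users monitor cluster health and carry out cluster tuning and operations. - -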
-

[Figure panels: Overview; Operating system resource monitoring; IoTDB performance monitoring]

-
-
- - - -
-

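- -- Database console: designed to provide a low-barrier way to interact with the database. Its graphical console lets users carry out metadata management, data insert, delete, update, and query, permission management, and system management clearly and simply, which makes the database easier and more efficient to use. - - -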
-

[Figure panels: Home; Metadata management; SQL query]

-
-
- - - -
-

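- - -- Cluster management tool: designed to solve the operational difficulties of multi-node distributed systems. It covers cluster deployment, cluster start and stop, elastic scaling, configuration updates, and data export, so that commands can be issued to a complex database cluster in one step, which greatly reduces management difficulty. - - -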
-  -
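- -### 4.4 More professional enterprise technical services - -TimechoDB provides customers with strong original-vendor services, including but not limited to on-site installation and training, expert consulting, on-site emergency assistance, software upgrades, online self-service, remote support, and guidance on using the latest development versions. To make IoTDB fit industrial production scenarios better, we also recommend modeling schemes, tune read and write performance, tune compression ratios, recommend database configurations, and provide other technical support based on the customer's actual data structures and read/write load. For customized industrial scenarios that the product does not yet cover, TimechoDB provides tailored development tools according to the user's needs. - -Compared with Apache IoTDB, which has a release cycle of 2-3 months, TimechoDB releases more frequently, and for urgent problems at customer sites it provides dedicated fixes within days to keep production environments stable. - - -### 4.5 Broader compatibility with domestic platforms - -TimechoDB's code is self-developed and controllable. It is compatible with most mainstream domestic IT products (CPUs, operating systems, and so on) and has completed compatibility certification with multiple vendors, ensuring compliance and security. \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/IoTDB-Introduction/Release-history_timecho.md b/src/zh/UserGuide/latest-Table/IoTDB-Introduction/Release-history_timecho.md deleted file mode 100644 index 729b1f060..000000000 --- a/src/zh/UserGuide/latest-Table/IoTDB-Introduction/Release-history_timecho.md +++ /dev/null @@ -1,675 +0,0 @@ - -# Release History - -## 1. TimechoDB (database kernel) - -### V2.0.9.4 - -> Release date: 2026.06.10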
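-> Download: please contact Timecho staff
-> Package name: timechodb-2.0.9.4-bin.zip
-> SHA512 checksum: 040ebdd9e45d93535e9628cf377003d560be83cec9737f5a5fbd0c3a93a12810814094752eac3eacdfec5cddcf433fa83e76edc14be34c73c1a54d9b937ea1b5 - -V2.0.9.4 mainly improves AINode inference for the table model, fixes a number of product defects, and improves database monitoring, performance, and stability across the board. Release contents: - -- AINode: covariate inference models for the table model adaptively support filling null values - - -### V2.0.9.3 - -> Release date: 2026.05.14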
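-> Download: please contact Timecho staff
-> Package name: timechodb-2.0.9.3-bin.zip
-> SHA512 checksum: f6c5d50cbf8902503289884f073593c650ffdc8edbebfabf27f6ab4499630749331aa4ed09dd34627a39fa8dee27b4d7e2689d0ed1cf23c76dd9c7270f9fae2a - -In V2.0.9.3, AINode adds support for registering the same model code with different model weights as separate models. The release also includes improvements and bug fixes for earlier versions and improves database monitoring, performance, and stability across the board. Release contents: - -- AINode: [supports registering the same model code with different model weights as separate models](../AI-capability/AINode_Upgrade_timecho.md#_4-3-注册自定义模型) - - -### V2.0.9.2 - -> Release date: 2026.05.11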
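-> Download: please contact Timecho staff
-> Package name: timechodb-2.0.9.2-bin.zip
-> SHA512 checksum: 10d3f34b6e65ad5c09b1cf3538ee27e181cc38c5fedf6acfd7d7053797ca23c76245683536275b69bd478aa1e43364351eceef1948832ab663a7398665af9eff - -V2.0.9.2 adds import and export of Object-type data and a new tsfile-backup script (currently table model only). The release also includes improvements and bug fixes for earlier versions and improves database monitoring, performance, and stability across the board. Release contents: - -- Scripts and tools: in the table model, the [import-data script's TsFile format](../Tools-System/Data-Import-Tool_timecho.md#_2-4-tsfile-格式) supports importing object-type data -- Scripts and tools: the table model adds the [tsfile-backup script](../Tools-System/Data-Export-Tool_timecho.md#_3-基于-pipe-框架的-tsfilebackup) -- Stream processing: table-model PIPE supports [local export and remote transfer of Object-type data](../User-Manual/Data-Sync_timecho.md#_3-9-object-类型数据导出) -- System: the [audit log](../User-Manual/Audit-Log_timecho.md) supports counting slow requests - - -### V2.0.9.1 - -> Release date: 2026.05.11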
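-> Download: please contact Timecho staff
-> Package name: timechodb-2.0.9.1-bin.zip
-> SHA512 checksum: 18ff3801ba58550e06ef0aa4bf4465e8ce1b31d1aecb9c6899eb843f5d9187d3cc575e930ee38d96b87b17067e2b21f1852ab5127eac7480cf5051c20a68894b - -V2.0.9.1 adds AINode covariate classification inference, schema-level and table-level storage space statistics, set operations, CTEs, and several built-in functions for queries, query debugging through DEBUG SQL, and configurable start on boot, among others. The release also includes improvements and bug fixes for earlier versions and improves database monitoring, performance, and stability across the board. Release contents: - -- AINode: the table model supports [time-series classification inference](../AI-capability/AINode_Upgrade_timecho.md#_4-1-模型推理) -- Query: the table model supports [set operations (UNION/INTERSECT/EXCEPT)](../SQL-Manual/Set-Operations_timecho.md) and [common table expressions (CTE)](../SQL-Manual/Common-Table-Expression_timecho.md) -- Query: the table model adds the [IF scalar function](../SQL-Manual/Basis-Function_timecho.md#_8-3-if-表达式), [binary functions](../SQL-Manual/Basis-Function_timecho.md#_7-二进制函数), and the [APPROX_PERCENTILE aggregate function](../SQL-Manual/Basis-Function_timecho.md#_2-聚合函数) -- Query: supports [DEBUG SQL](../User-Manual/Maintenance-statement_timecho.md#_6-调试查询) and improves the [Explain Analyze](../User-Manual/Query-Performance-Analysis.md) result set -- Query: supports [schema-level](../../latest/User-Manual/Maintenance-statement_timecho.md#_1-10-查看磁盘空间占用情况) and [table-level](../Reference/System-Tables_timecho.md#_2-22-table-disk-usage-表) storage space statistics, and supports the [show configuration statement](../User-Manual/Maintenance-statement_timecho.md#_1-13-查看节点配置信息) for viewing cluster configuration -- Scripts and tools: the data and metadata import/export tools support the SSL protocol -- Scripts and tools: the command-line tool supports displaying [access history](../Tools-System/CLI_timecho.md#_4-访问历史功能) -- System: supports configuring [start on boot](../User-Manual/Auto-Start-On-Boot_timecho.md) -- Other: fixes security vulnerability CVE-2026-28564 - - -### V2.0.8.3 - -> Release date: 2026.04.21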
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.8.3-bin.zip
-> SHA512 校验码:4b95bea87cc375bc455897dcf4cec80692421fa5c3eee746e1095b94288611d4afdd94aa8dad70340757d041757758924701cbdb2b73b49fb8730c4caac2a126 - -V2.0.8.3 版本新增 Python 读写 Object 类型数据的能力,同时对历史版本进行改进和 bug 修复,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 接口模块:表模型[Python 原生接口](../API/Programming-Python-Native-API_timecho.md)支持读写 Object 类型数据 - - -### V2.0.8.2 - -> 发版时间:2026.03.31
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.8.2-bin.zip
-> SHA512 校验码:02ab10e3e94786dd5676e0a69609eef192afd90d87f4d8d7bd44e7e9cbc8a18d61ba5668bae56cb8e4416ac71a877f760963b72ca7838d7c39ae10f1ed321d89 - -V2.0.8.2 版本新增树模型修改序列全名功能,表模型支持自定义 Time 列列名,树、表双模型支持更改数据类型,ODBC Driver等,同时对历史版本进行改进和 bug 修复,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 存储模块:树模型支持[修改序列全名](../../latest/Basic-Concept/Operate-Metadata_timecho.md#_2-4-修改时间序列名称),支持[更改序列数据类型](../../latest/Basic-Concept/Operate-Metadata_timecho.md#_2-3-修改时间序列数据类型) -- 存储模块:表模型支持[更改列数据类型](../Basic-Concept/Table-Management_timecho.md#_1-5-修改表),支持[自定义 Time 列列名](../Basic-Concept/Table-Management_timecho.md#_1-1-创建表) -- 接口模块:支持 [ODBC Driver](../API/Programming-ODBC_timecho.md), Python SessionDataset 支持分批获取 DataFrame,MQTT 服务外置并新增系统表 Services 提供服务查询 -- AINode:表模型支持自适应[协变量推理](../AI-capability/AINode_Upgrade_timecho.md#_4-1-模型推理) -- 流处理模块:树模型数据同步 pipe 语句中支持填写多个精确路径的 path - -### V2.0.8.1 - -> 发版时间:2026.02.04
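-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.8.1-bin.zip
-> SHA512 checksum: 49d97cbf488443f8e8e73cc39f6f320b3bc84b194aed90af695ebd5771650b5e5b6a3abb0fb68059bd01827260485b903c035657b337442f4fdd32c877f2aca3
-
-V2.0.8.1 adds the Object data type to the table model, strengthens the audit log feature, optimizes the tree-model OPC UA protocol, and adds covariate forecasting and concurrent inference to AINode. It also improves database monitoring, performance, and stability across the board. The release contents are as follows:
-
-- Query module: adds a list of available DataNodes, so you can [view each node's RPC address and port](../User-Manual/Maintenance-statement_timecho.md#_1-7-查看可用节点)
-- Query module: the table model adds a [system table for query latency statistics](../Reference/System-Tables_timecho.md#_2-20-queries-costs-histogram-表)
-- Storage module: supports viewing the full definition statement of a [table](../Basic-Concept/Table-Management_timecho.md#_1-4-查看表的创建信息)/[view](../User-Manual/Tree-to-Table_timecho.md#_2-4-查看表视图) through SQL
-- Storage module: optimizes the tree-model [OPC UA protocol](../../latest/API/Programming-OPC-UA_timecho.md)
-- System module: the table model adds the [Object data type](../Background-knowledge/Data-Type_timecho.md)
-- System module: strengthens the [audit log](../User-Manual/Audit-Log_timecho.md) feature
-- System module: the table model adds a system table for DataNode [connection status](../Reference/System-Tables_timecho.md#_2-18-connections-表)
-- AINode: ships the built-in chronos-2 model and supports [covariate forecasting](../AI-capability/AINode_Upgrade_timecho.md)
-- AINode: the built-in Timer-XL and Sundial models support [concurrent inference](../AI-capability/AINode_Upgrade_timecho.md)
-- Stream processing module: creating a full synchronization pipe is [automatically split](../User-Manual/Data-Sync_timecho.md#_2-1-创建任务) into two independent pipes, one real-time and one historical; the remaining event count of each can be viewed with the show pipes statement
-- Other: fixes security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226
-
-### V2.0.6.6
-
-> Release date: 2026.01.20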
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.6.6-bin.zip
-> SHA512 校验码:d12e60b8119690d63c501d0c2afcd527e39df8a8786198e35b53338e21939e1a9244805e710d81cbb62d02c2739909d7e8227c029660a0cd9ea7ca718cf9bdf6 - -V2.0.6.6 版本主要优化了树模型时间序列的查询性能,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 查询模块:优化了 show/count timeseries/devices 的查询性能 - -### V2.0.6.4 - -> 发版时间:2025.11.17
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.6.4-bin.zip
-> SHA512 校验码:57b9998cc14632862c32b6781c70db1c52caf8172b5d45d27cc214cab50d3afd4230ed0754e1c1a4ed825666bf971dc81fbb7d3b93261e57e9dabc20e794a2b8 - -V2.0.6.4 版本主要优化了存储以及 AINode 模块的相关功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 存储模块:支持树模型修改时间序列的编码及压缩方式 -* AINode:支持一键部署,优化了模型推理功能 - -### V2.0.6.1 - -> 发版时间:2025.09.19
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.6.1-bin.zip
-> SHA512 校验码:c88e3e2c0dbd06578bd0697ca9992880b300baee2c4906ba1f952134e37ae2fa803a6af236f4541d318b75f43a498b5d5bfbbc7c445783271076c36e696e4dd0 - -V2.0.6.1 版本新增表模型查询写回功能,新增访问控制黑白名单功能,新增位操作函数(内置标量函数)以及可下推的时间函数,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 查询模块:支持表模型查询写回功能 -* 查询模块:表模型行模式识别支持使用聚合函数,捕获连续数据进行分析计算 -* 查询模块:表模型新增内置标量函数-位操作函数 -* 查询模块:表模型新增可下推的 EXTRACT 时间函数 -* 系统模块:新增访问控制,支持用户自定义配置黑白名单功能 -* 其他:用户默认密码更新为安全强度更高的“TimechoDB@2021” - -### V2.0.5.2 - -> 发版时间:2025.08.08
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.5.2-bin.zip
-> SHA512 校验码:a00a4075c9937b7749c454f71d2480fea5e9ff9659c0628b132e30e2f256c7c537cd91dca4f6be924db0274bb180946a1b88e460c025bf82fdb994a3c2c7b91e - -V2.0.5.2 版本修复了部分产品缺陷,优化了数据同步功能,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V2.0.5.1 - -> 发版时间:2025.07.14
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.5.1-bin.zip
-> SHA512 校验码:aa724755b659bf89a60da6f2123dfa91fe469d2e330ed9bd029e8f36dd49212f3d83b1025e9da26cb69315e02f65c7e9a93922e40df4f2aa4c7f8da8da2a4cea - -V2.0.5.1 版本新增树转表视图、表模型窗口函数、聚合函数 approx\_most\_frequent,并支持 LEFT & RIGHT JOIN、ASOF LEFT JOIN;AINode 新增 Timer-XL、Timer-Sundial 两种内置模型,支持树、表模型推理及微调功能,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 查询模块:支持手动创建树转表视图 -* 查询模块:表模型新增窗口函数 -* 查询模块:表模型新增聚合函数 approx\_most\_frequent -* 查询模块:表模型 JOIN 功能扩展,支持 LEFT & RIGHT JOIN、ASOF LEFT JOIN -* 查询模块:表模型支持行模式识别,可捕获连续数据进行分析计算 -* 查询模块:表模型新增多个系统表,例如:VIEWS(表视图信息)、MODELS(模型信息)等 -* 系统模块:新增 TsFile 数据文件加密功能 -* AI 模块:AINode 新增 Timer-XL、Timer-Sundial 两种内置模型 -* AI 模块:AINode 支持树模型、表模型的推理及微调功能 -* 其他模块:支持通过 OPC DA 协议发布数据 - -### 2.x 其他历史版本 - -#### V2.0.4.2 - -> 发版时间:2025.06.21
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.4.2-bin.zip
-> SHA512 校验码:31f26473ac90988ce970dac8d0950671bde918f9af6f2f6a6c2bf99a53aa1c0a459c53a137b18ff0b28e70952e9c4b6acb50029e0b2e38837b969eb8f78f2939 - -V2.0.4.2 版本支持了传递 TOPIC 给 MQTT 自定义插件,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V2.0.4.1 - -> 发版时间:2025.06.03
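-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.4.1-bin.zip
-> SHA512 checksum: 93ac08bfae06aff6db04849f474458433026f66778f4f5c402eb22f1a7cb14d8096daf0a9e9cc365ddfefd4f8ca4443b2a9fb6461906f056b1e6a344990beb3a
-
-V2.0.4.1 adds user-defined table functions (UDTF) and several built-in table functions to the table model, adds the aggregate function approx\_count\_distinct, and adds support for ASOF INNER JOIN on the time column. It also reorganizes the script tools by category and separates the Windows-only scripts, and improves database monitoring, performance, and stability across the board. The release contents are as follows:
-
-* Query module: the table model adds user-defined table functions (UDTF) and several built-in table functions
-* Query module: the table model supports ASOF INNER JOIN on the time column
-* Query module: the table model adds the aggregate function approx\_count\_distinct
-* Stream processing: supports asynchronous TsFile loading through SQL
-* System module: when scaling in, replica selection supports a disaster-recovery load-balancing strategy
-* System module: adapts to Windows Server 2025
-* Scripts and tools: reorganizes the script tools by category and separates the Windows-only scripts
-
-#### V2.0.3.4
-
-> Release date: 2025.06.13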
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.3.4-bin.zip
-> SHA512 校验码:d80d34b7d3890def75b17c491fc4c13efc36153a5950a9b23744755d04d6adb5d6ab9ec970101183fef7bfeb8a559ef92fce90d2d22f7b7fd5795cd5589461bb - -V2.0.3.4版本将用户密码的加密算法变更为 SHA-256,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V2.0.3.3 - -> 发版时间:2025.05.16
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.3.3-bin.zip
-> SHA512 校验码:f47e3fb45f869dbe690e7cfaa93f95e5e08a462b362aa9d7ccac7ee5b55022dc8f62db12009dfde055f278f3003ff9ea7c22849d52a3ef2c25822f01ade78591 - -V2.0.3.3 版本新增元数据导入导出脚本适配表模型、Spark 生态集成(表模型)、AINode 返回结果新增时间戳,表模型新增部分聚合函数和标量函数,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 查询模块:表模型新增聚合函数 count\_if 和标量函数 greatest / least -* 查询模块:表模型全表 count(\*) 查询性能显著提升 -* AI 模块:AINode 返回结果新增时间戳 -* 系统模块:表模型元数据模块性能优化 -* 系统模块:表模型支持主动监听并加载 TsFile 功能 -* 系统模块:新增 TsFile 解析转换时间、TsFile 转 Tablet 数量等监控指标 -* 生态集成:表模型生态拓展集成 Spark -* 脚本与工具:import-schema、export-schema 脚本支持表模型元数据导入导出 - -#### V2.0.3.2 - -> 发版时间:2025.05.15
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.3.2-bin.zip
-> SHA512 校验码:76bd294de4b01782e5dd621a996aeb448e4581f98c70fb5b72b17dc392c2e1227c0d26bd3df5533669a80f217a83a566bc6ec926b7efd21ce7a89b894cd33e19 - -V2.0.3.2版本修复了部分产品缺陷,优化了节点移除功能,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V2.0.2.1 - -> 发版时间:2025.04.07
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.2.1-bin.zip
-> SHA512 校验码:a41be3f8c57e6a39ac165f1d6ab92c9ed790b0712528f31662c58617f4c94e6bfc9392a9c1ef2fc5bdd8c7ca79901389f368cbdbec3e5b1d5c1ce155b2f1a457 - -V2.0.2.1 版本新增了表模型权限管理、用户管理以及相关操作鉴权,并新增了表模型 UDF、系统表和嵌套查询等功能。此外,持续优化数据订阅机制,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 查询模块:新增表模型 UDF 的管理、用户自定义标量函数(UDSF)和用户自定义聚合函数(UDAF) -* 查询模块:用户可通过配置项控制 UDF、PipePlugin、Trigger 和 AINode 通过 URI 加载 jar 包 -* 查询模块:表模型支持权限管理、用户管理以及相关操作鉴权 -* 查询模块:新增系统表及多种运维语句,优化系统管理 -* 系统模块:CSharp 客户端支持表模型 -* 系统模块:新增表模型 C++ Session 写入接口 -* 系统模块:多级存储支持符合 S3 协议的非 AWS 对象存储系统 -* 系统模块:UDF 函数拓展,新增 pattern\_match 模式匹配函数 -* 数据同步:表模型支持元数据同步及同步删除操作 - -#### V2.0.1.2 - -> 发版时间:2025.01.25
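-> Download: please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.1.2-bin.zip
-> SHA512 checksum: 51c2fa5da2974a8a3c8871dec1c49bd98e5d193a13ef33ac7801adb833a1e360d74f0160bcdf33c7ffb23a5c5e0f376e26a4315cf877f1459483356285b85349
-
-V2.0.1.2 officially delivers the dual tree/table model configuration. For the table model it supports standard SQL query syntax, a range of functions and operators, stream processing, and Benchmark. The release also includes: four new data types in the Python client, database deletion in read-only mode, script tools that handle import and export of TsFile, CSV, and SQL data, and ecosystem integration with Kubernetes Operator. It also improves database monitoring, performance, and stability across the board. The release contents are as follows:
-
-* Time series table model: IoTDB supports the time series table model; the SQL syntax provided includes the SELECT, WHERE, JOIN, GROUP BY, ORDER BY, and LIMIT clauses and nested queries
-* Query module: the table model supports a range of functions and operators, including logical operators, mathematical functions, and time-series-specific functions such as DIFF
-* Query module: a configuration item controls whether UDF, PipePlugin, Trigger, and AINode may load jar packages through a URI
-* Storage module: the table model supports writing data through the Session interface, and the Session interface supports automatic metadata creation
-* Storage module: the Python client adds four new data types: `String`, `Blob`, `Date`, and `Timestamp`
-* Storage module: optimizes the priority comparison rules for compaction tasks of the same type
-* Stream processing module: supports specifying receiver authentication information on the sender side
-* Stream processing module: TsFile Load supports the table model
-* Stream processing module: stream processing plugins are adapted to the table model
-* System module: improves the stability of DataNode scale-in
-* System module: allows users to run drop database while the system is in the readonly state
-* Scripts and tools: the Benchmark tool is adapted to the table model
-* Scripts and tools: the Benchmark tool supports four new data types: `String`, `Blob`, `Date`, and `Timestamp`
-* Scripts and tools: the import-data/export-data scripts are extended to support the new data types (string, binary large object, date, timestamp)
-* Scripts and tools: the import-data/export-data scripts are reworked to handle import and export of TsFile, CSV, and SQL data
-* Ecosystem integration: supports Kubernetes Operator
-
-
-### V1.3.7.3
-
-> Release date: 2026.06.02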
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.7.3-bin.zip
-> SHA512 校验码:8e6cde061421a552b9855f39f9cccd4838c820dc15ef0ad2a7c23a54cd6cc4f06c35190c1f428784e6a4d5463dd1b794f58ff5cdf891f27f6d0be4d3ab00bf6f - -V1.3.7.3 版本主要优化了查询模块和数据同步等功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 查询模块:优化 Last 查询、对齐序列查询、倒序时间过滤查询等场景 -- 元数据模块:优化已激活序列及其子路径下的设备创建校验 -- 数据同步:优化同步失败后的重试机制 -- 数据同步:跨网闸同步插件支持配置实时写入传输超时时间 -- 接口模块:Go 客户端写入接口增加错误码校验 -- 接口模块:优化 C# 客户端连接池管理 - - -### V1.3.7.2 - -> 发版时间:2026.04.07
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.7.2-bin.zip
-> SHA512 校验码:787766af64992069f0db0ac8b250b461d799307b3ce06b0782fc25752c8c5307fa2205c9e3a38a41685b81bb6b4b5c1ec9f71a395bfad285caf90de7b8224783 - -V1.3.7.2 版本主要优化了数据同步和查询模块的相关功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 数据同步:优化 Pipe 复杂路径匹配场景下的分发性能 -- 查询模块:Show Queries 语句新增客户端 IP、查询超时时间、服务端等待时间等信息 -- 生态集成:支持 IoTDB 以 OPC Client 模式向外部 OPC Server 推送数据 - - -### V1.3.6.6 - -> 发版时间:2026.01.20
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.6.6-bin.zip
-> SHA512 校验码:590d3ead053298c6df0ede637572ba598b9b684f8b35ab874bd4452f765e1421938f4cca2cf0423af2e806592aa8b15bdd25b41df7de809435a4d0239fc04790 - -V1.3.6.6 版本优化了数据的读写功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V1.3.6.3 - -> 发版时间:2026.01.04
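-> Download: please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.6.3-bin.zip
-> SHA512 checksum: 43719a1384f59f63cb0029cdda0aba433383cd1a0f5ebc142e54f8aa6623cc30a7efb3e3aef7f3d485d5e07bec91be215c92ed21b5201613d5cc44044251c978
-
-V1.3.6.3 focuses on two core areas, query performance and memory management, and improves database monitoring, performance, and stability across the board. The release contents are as follows:
-
-* Query module: optimizes query performance in several scenarios, including multi-series Last queries
-* Query module: the Java SDK adds the FastLastQuery interface for more efficient Last queries
-* Query module: tree-model fetchSchema now returns results as a segmented stream, improving response time for large data volumes
-* Storage module: optimizes memory management to avoid the risk of memory leaks and keep the system stable over long runs
-* Storage module: optimizes the file compaction mechanism, improving compaction efficiency and reducing storage resource usage
-* Other: fixes security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226
-
-
-### V1.3.6.1
-
-> Release date: 2025.12.09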
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.6.1-bin.zip
-> SHA512 校验码:9fb6a6870aa2133bfc40508324a7d97ee078d0d44895beef7b0a331edd203419119fb02b933f585b6c4a6fe9b59708a053d7cf65206b22b1a4f01a5fe518424c - -V1.3.6.1 版本主要围绕数据同步稳定性这一核心方向进行了深度优化,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -* 数据同步:优化 Pipe SQL 参数配置,支持指定异步加载方式 -* 数据同步:新增语法糖功能,可将全量 Pipe 创建 SQL 自动拆分为实时同步与历史同步两类 -* 系统模块:新增全局数据类型压缩方式配置项,支持按需调整存储压缩策略 - - -### V1.3.5.11 - -> 发版时间:2025.09.24
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.11-bin.zip
-> SHA512 校验码:f18419e20c0d7e9316febee5a053306a97268cb07e18e6933716c2ef98520fbbe051dfa1da02a9c83e8481a839ce35525ce6c50f890f821e3d760f550c75f804 - -V1.3.5.11 版本主要优化了数据同步功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V1.3.5.10 - -> 发版时间:2025.08.27
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.10-bin.zip
-> SHA512 校验码:3aea6d2318f52b39bfb86dae9ff06fe1b719fdeceaabb39278c9a73544e1ceaf0660339f9342abb888c8281a0fb6144179dac9bb0c40ba0ecc66bac4dd7cbe80 - -V1.3.5.10 版本修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -### V1.3.5.9 - -> 发版时间:2025.08.25
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.9-bin.zip
-> SHA512 校验码:95b7a6790e94dc88e355a81e5a54b10ee87bdadae69ba0b215273967b3422178d5ee81fa5adf1c5380a67dbb30cf9782eaa3cbfd6ec744b0fd9a91c983ee8f70 - -V1.3.5.9 版本优化了内存控制,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 -### 1.x 其他历史版本 - -#### V1.3.5.8 - -> 发版时间:2025.08.19
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.8-bin.zip
-> SHA512 校验码:aa9802301614e20294a7f2fc4c149ba20d58213d9b74e8f8c607e0f4860949bad164bce2851b63c1d39b7568d62975ab257c269b3a9c168a29ea3945b6d28982 - -V1.3.5.8 版本优化了数据同步功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.7 - -> 发版时间:2025.08.13
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.7-bin.zip
-> SHA512 校验码:17374a440267aed3507dcc8cf4dc8703f8136d5af30d16206a6e1101e378cbbc50eda340b1598a12df35fe87d96db20f7802f0e64033a013d4b81499198663d4 - -V1.3.5.7 版本优化了数据同步功能,修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.6 - -> 发版时间:2025.07.16
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.6-bin.zip
-> SHA512 校验码:05b9fda4d98ba8a1c9313c0831362ed3d667ce07cb00acaeabcf6441a6d67dff7da27f3fda2a5e1b3c3b85d1e5c730a534f3aa2f0c731b8c03ef447203b32493 - -V1.3.5.6 版本新增配置项开关支持禁用数据订阅功能,优化了C++高可用客户端,以及正常情况、重启、删除三个场景下的 PIPE 同步延迟问题,和大 TEXT 对象时的查询问题,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.4 - -> 发版时间:2025.06.19
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.4-bin.zip
-> SHA512 校验码:edac5f8b70dd67b3f84d3e693dc025a10b41565143afa15fc0c4937f8207479ffe2da787cc9384440262b1b05748c23411373c08606c6e354ea3dcdba0371778 - -V1.3.5.4 版本修复了部分产品缺陷,优化了节点移除功能,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.3 - -> 发版时间:2025.06.13
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.3-bin.zip
-> SHA512 校验码:5f807322ceec9e63a6be86108cc57e7ad4251b99a6c28baf11256ab65b2145768e9110409f89834d5f4256094a8ad995775c0e59a17224ff2627cd9354e09d82 - -V1.3.5.3 版本主要优化了数据同步功能,包括持久化 PIPE 发送进度,增加 PIPE 事件传输时间监控项,并修复了相关缺陷;另外将用户密码的加密算法变更为 SHA-256,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.2 - -> 发版时间:2025.06.10
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.2-bin.zip
-> SHA512 校验码:4c0a5db76c6045dfd27cce303546155cdb402318024dae5f999f596000d7b038b13bbeac39068331b5c6e2c80bc1d89cd346dd0be566fe2fe865007d441d9d05 - -V1.3.5.2 版本主要优化了数据同步功能,包括支持通过使用参数进行级联配置,支持同步和实时写入顺序完全一致;支持系统重启后历史数据和实时数据分区发送,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.5.1 - -> 发版时间:2025.05.15
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.5.1-bin.zip
-> SHA512 校验码:91f22bafbdd4d580126ed59ba1ba99d14209f10ce4a0a4bd7d731943ac99fdb6ebfab6e3a1e294a7cb7f46367e9fd4252b0d9ac4d4240ddedf6d85658e48f212 - -V1.3.5.1 版本修复了部分产品缺陷,同时对数据库监控、性能、稳定性进行了全方位提升。 - -#### V1.3.4.2 - -> 发版时间:2025.04.14
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.4.2-bin.zip
-> SHA512 校验码:52fbd79f5e7256e7d04edc8f640bb8d918e837fedd1e64642beb2b2b25e3525b5f5a4c92235f88f6f7b59bfcdf096e4ea52ab85bfef0b69274334470017a2c5b2 - -V1.3.4.2 版本优化了数据同步功能,支持双活之间同步外部 PIPE 转发而来的数据。 - - -#### V1.3.4.1 - -> 发版时间:2025.01.08
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.4.1-bin.zip
-> SHA512 校验码:e9d46516f1f25732a93cc915041a8e59bca77cf8a1018c89d18ed29598540c9f2bdf1ffae9029c87425cecd9ecb5ebebea0334c7e23af11e28d78621d4a78148 - -V1.3.4.1 版本新增模式匹配函数、持续优化数据订阅机制,提升稳定性、import-data/export-data 脚本扩展支持新数据类型,import-data/export-data 脚本合并同时兼容 TsFile、CSV 和 SQL 三种类型数据的导入导出等功能,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 查询模块:用户可通过配置项控制 UDF、PipePlugin、Trigger 和 AINode 通过 URI 加载 jar 包 -- 系统模块:UDF 函数拓展,新增 pattern_match 模式匹配函数 -- 数据同步:支持在发送端指定接收端鉴权信息 -- 生态集成:支持 Kubernetes Operator -- 脚本与工具:import-data/export-data 脚本扩展,支持新数据类型(字符串、大二进制对象、日期、时间戳) -- 脚本与工具:import-data/export-data 脚本迭代,同时兼容 TsFile、CSV 和 SQL 三种类型数据的导入导出 - -#### V1.3.3.3 - -> 发版时间:2024.10.31
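-> Download: please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.3.3-bin.zip
-> SHA512 checksum: 4a3eceda479db3980e9c8058628e71ba5a16fbfccf70894e8181aea5e014c7b89988d0093f6d42df29d478340a33878602a3924bec13f442a48611cec4e0e961
-
-V1.3.3.3 improves restart recovery performance to shorten startup time, lets DataNodes actively listen for and load TsFiles, adds observability metrics, lets the receiver automatically load files into IoTDB after the sender transfers them to a specified directory, and adds Alter Source support to Alter Pipe. It also improves database monitoring, performance, and stability across the board. The release contents are as follows:
-
-- Data synchronization: the receiver supports automatic conversion of mismatched data types
-- Data synchronization: the receiver has better observability, with ops/latency statistics for several internal interfaces
-- Data synchronization: the opc-ua-sink plugin supports CS mode access and non-anonymous access
-- Data subscription: the SDK supports the create if not exists and drop if exists interfaces
-- Stream processing: Alter Pipe supports Alter Source
-- System module: adds latency monitoring for the rest module
-- Scripts and tools: supports automatically loading TsFile files from a specified directory
-- Scripts and tools: the import-tsfile script is extended so that the script and the iotdb server can run on different servers
-- Scripts and tools: adds support for Kubernetes Helm
-- Scripts and tools: the Python client supports the new data types (string, binary large object, date, timestamp)
-
-#### V1.3.3.2
-
-> Release date: 2024.8.15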
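-> Download: please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.3.2-bin.zip
-> SHA512 checksum: 32733610da40aa965e5e9263a869d6e315c5673feaefad43b61749afcf534926398209d9ca7fff866c09deb92c09d950c583cea84be5a6aa2c315e1c7e8cfb74
-
-V1.3.3.2 supports reporting the time spent reading mods files, the maximum memory used by sequential/unsequential merge sort, and the dispatch time. It also supports adjusting the time partition origin through a parameter and automatically ending a subscription based on the pipe's end-of-historical-data marker, and it improves memory-control performance in the compaction module. The release contents are as follows:
-
-- Query module: Explain Analyze reports the time spent reading mods files
-- Query module: Explain Analyze reports the maximum memory used by sequential/unsequential merge sort and the dispatch time
-- Storage module: adds splitting of compaction target files, with new configuration file parameters
-- System module: supports adjusting the time partition origin through a parameter
-- Stream processing: data subscription supports automatically ending a subscription based on the pipe's end-of-historical-data marker
-- Data synchronization: RPC compression supports specifying the compression level
-- Scripts and tools: data/metadata export filters out only root.__system; data whose path merely begins with a similar prefix, such as root.__systema, is not filtered
-
-#### V1.3.3.1
-
-> Release date: 2024.7.12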
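-> Download: please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.3.1-bin.zip
-> SHA512 checksum: 1fdffbc1f18bfabfa3463a5a6fbc4f6ba6ab686942f9e85e7e6be1840fb8700e0147e5e73fd52201656ae6adb572cc2e5ecc61bcad6fa4c5a4048c4207e3c6c0
-
-V1.3.3.1 adds a rate-limiting mechanism to tiered storage and lets the sender-side sink in data synchronization specify username/password authentication for the receiver. It clarifies some ambiguous WARN logs on the data synchronization receiver, improves restart recovery performance to shorten startup time, and consolidates script contents. The release contents are as follows:
-
-- Query module: Filter performance optimization, speeding up aggregate queries and queries with where conditions
-- Query module: SQL query requests from the Java Session client are distributed evenly across all nodes
-- System module: merges the configuration files "iotdb-confignode.properties, iotdb-datanode.properties, iotdb-common.properties" into "iotdb-system.properties"
-- Storage module: tiered storage adds a rate-limiting mechanism
-- Data synchronization: the sender-side sink can specify username/password authentication for the receiver
-- System module: improves restart recovery performance and shortens startup time
-
-#### V1.3.2.2
-
-> Release date: 2024.6.4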
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.2.2-bin.zip
-> SHA512 校验码:ad73212a0b5025d18d2481163f6b2d4f604e06eb5e391cc6cba7bf4e42792e115b527ed8bfb5cd95d20a150645c8b4d56a531889dac229ce0f63139a27267322 - -V1.3.2.2 版本新增 explain analyze 语句分析单个 SQL 查询耗时、新增 UDAF 用户自定义聚合函数框架、支持磁盘空间到达设置阈值自动删除数据、元数据同步、统计指定路径下数据点数、SQL 语句导入导出脚本等功能,同时集群管理工具支持滚动升级、上传插件到整个集群,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 存储模块:insertRecords 接口写入性能提升 -- 存储模块:新增 SpaceTL 功能,支持磁盘空间到达设置阈值自动删除数据 -- 查询模块:新增 Explain Analyze 语句(监控单条 SQL 执行各阶段耗时) -- 查询模块:新增 UDAF 用户自定义聚合函数框架 -- 查询模块:UDF 新增包络解调分析 -- 查询模块:新增 MaxBy/MinBy 函数,支持获取最大/小值的同时返回对应时间戳 -- 查询模块:值过滤查询性能提升 -- 数据同步:路径匹配支持通配符 -- 数据同步:支持元数据同步(含时间序列及相关属性、权限等设置) -- 流处理:增加 Alter Pipe 语句,支持热更新 Pipe 任务的插件 -- 系统模块:系统数据点数统计增加对 load TsFile 导入数据的统计 -- 脚本与工具:新增本地升级备份工具(通过硬链接对原有数据进行备份) -- 脚本与工具:新增 export-data/import-data 脚本,支持将数据导出为 CSV、TsFile 格式或 SQL 语句 -- 脚本与工具:Windows 环境支持通过窗口名区分 ConfigNode、DataNode、Cli - -#### V1.3.1.4 - -> 发版时间:2024.4.23
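-> Download: please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.1.4-bin.zip
-> SHA512 checksum: 8547702061d52e2707c750a624730eb2d9b605b60661efa3c8f11611ca1685aeb51b6f8a93f94c1b30bf2e8764139489c9fbb76cf598cfa8bf9c874b2a7c57eb
-
-V1.3.1.4 adds viewing of system activation status, built-in variance/standard deviation aggregate functions, a timeout setting for the built-in Fill statement, and a tsfile repair command. It adds scripts for one-click collection of instance information and one-click cluster start/stop, and optimizes views, stream processing, and other features to improve ease of use and performance. The release contents are as follows:
-
-- Query module: the Fill clause supports a fill timeout threshold; values beyond the time threshold are not filled
-- Query module: the Rest interface (V2) returns column types
-- Data synchronization: simplifies how the time range is specified; start and end times are set directly
-- Data synchronization: supports the SSL transport protocol (iotdb-thrift-ssl-sink plugin)
-- System module: supports querying cluster activation information with SQL
-- System module: tiered storage adds transfer rate control during migration
-- System module: improved system observability (adds divergence monitoring across cluster nodes and observability for the distributed task scheduling framework)
-- System module: optimizes the default log output policy
-- Scripts and tools: adds one-click cluster start/stop scripts (start-all/stop-all.sh & start-all/stop-all.bat)
-- Scripts and tools: adds one-click instance information collection scripts (collect-info.sh & collect-info.bat)
-
-#### V1.3.0.4
-
-> Release date: 2024.1.3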
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.3.0.4-bin.zip
-> SHA512 校验码:3c07798f37c07e776e5cd24f758e8aaa563a2aae0fb820dad5ebf565ad8a76c765b896d44e7fdb7dad2e46ffd4262af901c765f9bf6af926bc62103118e38951 - -V1.3.0.4 发布了全新内生机器学习框架 AINode,全面升级权限模块支持序列粒度授予权限,并对视图、流处理等功能进行诸多细节优化,进一步提升了产品的使用易用度,并增强了版本稳定性和各方面性能。具体发布内容如下: - -- 查询模块:新增 AINode 内生机器学习模块 -- 查询模块:优化 show path 语句返回时间长的问题 -- 安全模块:升级权限模块,支持时间序列粒度的权限设置 -- 安全模块:支持客户端与服务器 SSL 通讯加密 -- 流处理:流处理模块新增多种 metrics 监控项 -- 查询模块:非可写视图序列支持 LAST 查询 -- 系统模块:优化数据点监控项统计准确性 - -#### V1.2.0.1 - -> 发版时间:2023.6.30
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.2.0.1-bin.zip
-> SHA512 校验码:dcf910d0c047d148a6c52fa9ee03a4d6bc3ff2a102dc31c0864695a25268ae933a274b093e5f3121689063544d7c6b3b635e5e87ae6408072e8705b3c4e20bf0 - -V1.2.0.1主要增加了流处理框架、动态模板、substring/replace/round内置查询函数等新特性,增强了show region、show timeseries、show variable等内置语句功能和Session接口,同时优化了内置监控项及其实现,修复部分产品bug和性能问题。 - -- 流处理:新增流处理框架 -- 元数据模块:新增模板动态扩充功能 -- 存储模块:新增SPRINTZ和RLBE编码以及LZMA2压缩算法 -- 查询模块:新增cast、round、substr、replace内置标量函数 -- 查询模块:新增time_duration、mode内置聚合函数 -- 查询模块:SQL语句支持case when语法 -- 查询模块:SQL语句支持order by表达式 -- 接口模块:Python API支持连接分布式多个节点 -- 接口模块:Python客户端支持写入重定向 -- 接口模块:Session API增加用模板批量创建序列接口 - -#### V1.1.0.1 - -> 发版时间:2023-04-03
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.1.0.1.zip
-> SHA512 校验码:58df58fc8b11afeec8436678842210ec092ac32f6308656d5356b7819acc199f1aec4b531635976b091b61d6736f0d9706badcabeaa5de50939e5c331c1dc804 - -V1.1.0.1主要改进增加了部分新特性,如支持 GROUP BY VARIATION、GROUP BY CONDITION 等分段方式、增加 DIFF、COUNT_IF 等实用函数,引入 pipeline 执行引擎进一步提升查询速度等。同时修复对齐序列 last 查询 order by timeseries、LIMIT&OFFSET 不生效、重启后元数据模版错误、删除所有 database 后创建序列错误等相关问题。 - -- 查询模块:align by device 语句支持 order by time -- 查询模块:支持 Show Queries 命令 -- 查询模块:支持 kill query 命令 -- 系统模块:show regions 支持指定特定的 database -- 系统模块:新增 SQL show variables, 可以展示当前集群参数 -- 查询模块:聚合查询支持 GROUP BY VARIATION -- 查询模块:SELECT INTO 支持特定的数据类型强转 -- 查询模块:实现内置标量函数 DIFF -- 系统模块:show regions 显示创建时间 -- 查询模块:实现内置聚合函数 COUNT_IF -- 查询模块:聚合查询支持 GROUP BY CONDITION -- 系统模块:支持修改 dn_rpc_port 和 dn_rpc_address - - -## 2. Workbench(控制台工具) - -| **控制台版本号** | **版本说明** | **可支持IoTDB版本** | **SHA512 校验码** | -| ---------------- | ------------------------------------------------------------ | ------------------------- | ------------------------------------------------------------ | -| V2.1.1 | 优化趋势界面测点选择,支持无设备场景 | V2.0 及以上版本 | aa05fd4d9f33f07c0949bc2d6546bb4b9791ed5ea94bcef27e2bf51ea141ec0206f1c12466aced7bf3449e11ad68d65378d697f3d10cb4881024a83746029a65 | -| V2.0.1-beta | V2.x系列首个版本,支持树、表双模型 | V2.0 及以上版本 | 0ca0d5029874ed8ada9c7d1cb562370b3a46913eed66d39c08759287ccc8bf332cf80bb8861e788614b61ae5d53a9f5605f553e1a607e856f395eb5102e7cc4d | -| V1.5.7 | 优化测点列表中测点名称拆分为设备名称和测点,测点选择区域支持左右滚动,以及导出文件列顺序与页面保持一致 | V1.3.4及以上的1.x系列版本 | d3cd4a63372ca5d6217b67dddf661980c6a442b3b1564235e9ad34fc254d681febd58c2cc59c6273ffbfd8a1b003b9adb130ecfaaebe1942003b0d07427b1fcc | -| V1.5.6 | 优化 CSV 格式导入导出功能:导入时,支持标签、别名为非必填项;导出时,支持测点描述里反引号包裹引号的场景 | V1.3.4及以上的1.x系列版本 | 276ac1ea341f468bf6d29489c9109e9aa61afe2d1caaab577bc40603c6f4120efccc36b65a58a29ce6a266c21b46837aad6128f84ba5e676231ea9e6284a35e5 | -| V1.5.5 | 新增服务器时钟,支持企业版激活数据库 | V1.3.4及以上的1.x系列版本 | b18d01b70908d503a25866d1cc69d14e024d5b10ca6fcc536932fdbef8257c66e53204663ce3be5548479911aca238645be79dfd7ee7e65a07ab3c0f68c497f6 
| -| V1.5.4 | 新增实例管理中prometheus设置的认证功能 | V1.3.4及以上的1.x系列版本 | adc7e13576913f9e43a9671fed02911983888da57be98ec8fbbb2593600d310f69619d32b22b569520c88e29f100d7ccae995b20eba757dbb1b2825655719335 | -| V1.5.1 | 新增AI分析功能以及模式匹配功能 | V1.3.2及以上的1.x系列版本 | 4f2053a2a3b2b255ce195268d6cd245278f3be32ba4cf68be1552c386d78ed4424f7bdc9d8e68c6b8260b3e398c8fd23ff342439c4e88e1e777c62640d2279f9 | -| V1.4.0 | 新增树模型展示及英文版 | V1.3.2及以上的1.x系列版本 | 734077f3bb5e1719d20b319d8b554ce30718c935cb0451e02b2c9267ff770e9c2d63b958222f314f16c2e6e62bf78b643255249b574ee6f37d00e123433981e8 | -| V1.3.1 | 分析功能新增分析方式,优化导入模版等功能 | V1.3.2及以上的1.x系列版本 | 134f87101cc7f159f8a22ac976ad2a3a295c5435058ee0a15160892aac46ac61dd3cfb0633b4aea9cc7415bf904d0ae65aaf77d663f027d864204d81fb34768b | -| V1.3.0 | 新增数据库配置功能,优化部分版本细节 | V1.3.2及以上的1.x系列版本 | 94a137fc5c681b211f3e076472a9c5875d59e7f0cd6d7409cb8f66bb9e4f87577a0f12dd500e2bcb99a435860c82183e4a6514b638bcb4aecfb48f184730f3f1 | -| V1.2.6 | 优化各模块权限控制功能 | V1.3.1及以上的1.x系列版本 | f345b7edcbe245a561cb94ec2e4f4d40731fe205f134acadf5e391e5874c5c2477d9f75f15dbaf36c3a7cb6506823ac6fbc2a0ccce484b7c4cc71ec0fbdd9901 | -| V1.2.5 | 可视化功能新增“常用模版”概念,所有界面优化补充页面缓存等功能 | V1.3.0及以上的1.x系列版本 | 37376b6cfbef7df8496e255fc33627de01bd68f636e50b573ed3940906b6f3da1e8e8b25260262293b8589718f5a72180fa15e5823437bf6dc51ed7da0c583f7 | -| V1.2.4 | 计算功能新增“导入、导出”功能,测点列表新增“时间对齐”字段 | V1.2.2及以上的1.x系列版本 | 061ad1add38c109c1a90b06f1ddb7797bd45e84a34a4f77154ee48b90bdc7ecccc1e25eaa53fbbc98170d99facca93e3536192dd8d10a50ce505f59923ce6186 | -| V1.2.3 | 首页新增“激活详情”,新增分析等功能 | V1.2.2及以上的1.x系列版本 | 254f5b7451300f6f99937d27fd7a5b20847d5293f53e0eaf045ac9235c7ea011785716b800014645ed5d2161078b37e1d04f3c59589c976614fb801c4da982e1 | -| V1.2.2 | 优化“测点描述”展示内容等功能 | V1.2.2及以上的1.x系列版本 | 062e520d010082be852d6db0e2a3aa6de594eb26aeb608da28a212726e378cd4ea30fca5e1d2c3231ebd8de29e94ca9641f1fabc1cea46acfb650c37b7681b4e | -| V1.2.1 | 数据同步界面新增“监控面板”,优化Prometheus提示信息 | V1.2.2及以上的1.x系列版本 | 
8a3bcf87982ad5004528829b121f2d3945429deb77069917a42a8c8d2e2e2a2c24a398aaa87003920eeacc0c692f1ed39eac52a696887aa085cce011f0ddd745 | -| V1.2.0 | 全新Workbench版本升级 | V1.2.0及以上的1.x系列版本 | ea1f7d3a4c0c6476a195479e69bbd3b3a2da08b5b2bb70b0a4aba988a28b5db5a209d4e2c697eb8095dfdf130e29f61f2ddf58c5b51d002c8d4c65cfc13106b3 | diff --git a/src/zh/UserGuide/latest-Table/QuickStart/QuickStart_timecho.md b/src/zh/UserGuide/latest-Table/QuickStart/QuickStart_timecho.md deleted file mode 100644 index 89ef3eef6..000000000 --- a/src/zh/UserGuide/latest-Table/QuickStart/QuickStart_timecho.md +++ /dev/null @@ -1,93 +0,0 @@ - - -# 快速上手 - -本篇文档将帮助您了解快速入门 IoTDB 的方法。 - -## 1. 如何安装部署? - -本篇文档将帮助您快速安装部署 IoTDB,您可以通过以下文档的链接快速定位到所需要查看的内容: - -1. 准备所需机器资源:IoTDB 的部署和运行需要考虑多个方面的机器资源配置。具体资源配置可查看 [资源规划](../Deployment-and-Maintenance/Database-Resources_timecho.md) - -2. 完成系统配置准备:IoTDB 的系统配置涉及多个方面,关键的系统配置介绍可查看 [系统配置](../Deployment-and-Maintenance/Environment-Requirements.md) - -3. 获取安装包:您可以联系天谋商务获取 IoTDB 安装包,以确保下载的是最新且稳定的版本。具体安装包结构可查看:[安装包获取](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) - -4. 安装数据库并激活:您可以根据实际部署架构选择以下教程进行安装部署: - - - 单机版:[单机版](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - - - 分布式(集群)版:[分布式(集群)版](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - - - 双活版:[双活版](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -> ❗️注意:目前我们仍然推荐直接在物理机/虚拟机上安装部署,如需要 docker 部署,可参考:[Docker 部署](../Deployment-and-Maintenance/Docker-Deployment_timecho.md) - -5. 安装数据库配套工具:企业版数据库提供监控面板等配套工具,建议在部署企业版时安装,可以帮助您更加便捷的使用 IoTDB: - - - 监控面板:提供了上百个数据库监控指标,用来对 IoTDB 及其所在操作系统进行细致监控,从而进行系统优化、性能优化、发现瓶颈等,安装步骤可查看 [监控面板部署](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - -## 2. 如何使用? - -1. 
数据库建模设计:数据库建模是创建数据库系统的重要步骤,它涉及到设计数据的结构和关系,以确保数据的组织方式能够满足特定应用的需求,下面的文档将会帮助您快速了解 IoTDB 的建模设计: - - - 时序概念介绍:[时序数据模型](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - - - 建模设计介绍:[建模方案设计](../Background-knowledge/Data-Model-and-Terminology_timecho.md) - - - 数据库介绍:[数据库管理](../Basic-Concept/Database-Management_timecho.md) - - - 表介绍:[表管理](../Basic-Concept/Table-Management_timecho.md) - -2. 数据写入&更新:在数据写入&更新方面,IoTDB 提供了多种方式来插入实时数据,支持追加查询写回,基本介绍请查看 [数据写入&更新](../Basic-Concept/Write-Updata-Data_timecho.md) - -3. 数据查询:IoTDB 提供了丰富的数据查询功能,数据查询的基本介绍请查看 [数据查询](../Basic-Concept/Query-Data.md),其中包含了适用于时序特色分析的模式查询和窗口函数,详细介绍请查看[模式查询](../User-Manual/Pattern-Query_timecho.md) 和 [窗口函数](../User-Manual/Window-Function_timecho.md) - -4. 数据删除:IoTDB 提供了两种删除方式,分别为SQL语句删除与过期自动删除(TTL) - - - SQL语句删除:基本介绍请查看 [数据删除](../Basic-Concept/Delete-Data.md) - - 过期自动删除(TTL):基本介绍请查看 [过期自动删除](../Basic-Concept/TTL-Delete-Data_timecho.md) - -5. 其他进阶功能:除了数据库常见的写入、查询等功能外,IoTDB 还支持“数据同步”等功能,具体使用方法可参见具体文档: - - - 数据同步:[数据同步](../User-Manual/Data-Sync_timecho.md) - -6. 应用编程接口: IoTDB 提供了多种应用编程接口(API),以便于开发者在应用程序中与 IoTDB 进行交互,目前支持[ Java 原生接口](../API/Programming-Java-Native-API_timecho.md)、[Python 原生接口](../API/Programming-Python-Native-API_timecho.md)、[JDBC](../API/Programming-JDBC_timecho.md)等,更多编程接口可参见官网【应用编程接口】其他章节 - -## 3. 还有哪些便捷的周边工具? - -IoTDB 除了自身拥有丰富的功能外,其周边的工具体系包含的种类十分齐全。本篇文档将帮助您快速使用周边工具体系: - - - 监控面板:是一个对 IoTDB 及其所在操作系统进行细致监控的工具,涵盖数据库性能、系统资源等上百个数据库监控指标,助力系统优化与瓶颈识别等,具体使用介绍请查看 [监控面板部署](../Tools-System/Monitor-Tool_timecho.md) - - -## 4. 想了解更多技术细节? 
- -如果您想了解 IoTDB 的更多技术内幕,可以移步至下面的文档: - - - 数据分区和负载均衡:IoTDB 基于时序数据特性,精心设计了数据分区策略和负载均衡算法,提升了集群的可用性和性能,想了解更多请查看 [数据分区和负载均衡](../Technical-Insider/Cluster-data-partitioning.md) - - - 压缩&编码:IoTDB 通过多样化的编码和压缩技术,针对不同数据类型优化存储效率,想了解更多请查看 [压缩&编码](../Technical-Insider/Encoding-and-Compression.md) - - diff --git a/src/zh/UserGuide/latest-Table/Reference/System-Config-Manual_timecho.md b/src/zh/UserGuide/latest-Table/Reference/System-Config-Manual_timecho.md deleted file mode 100644 index a7bffe3a7..000000000 --- a/src/zh/UserGuide/latest-Table/Reference/System-Config-Manual_timecho.md +++ /dev/null @@ -1,3354 +0,0 @@ - - -# 配置参数 - -IoTDB 配置文件位于 IoTDB 安装目录:`conf`文件夹下。 - -- `confignode-env.sh/bat`:环境配置项的配置文件,可以配置 ConfigNode 的内存大小。 -- `datanode-env.sh/bat`:环境配置项的配置文件,可以配置 DataNode 的内存大小。 -- `iotdb-system.properties`:IoTDB 的配置文件。 -- `iotdb-system.properties.template`:IoTDB 的配置文件模版。 - -## 1. 修改配置: - -在 `iotdb-system.properties` 文件中已存在的参数可以直接进行修改。对于那些在 `iotdb-system.properties` 中未列出的参数,可以从 `iotdb-system.properties.template` 配置文件模板中找到相应的参数,然后将其复制到 `iotdb-system.properties` 文件中进行修改。 - -### 1.1 改后生效方式 - -不同的配置参数有不同的生效方式,分为以下三种: - -- 仅允许在第一次启动服务前修改: 在第一次启动 ConfigNode/DataNode 后即禁止修改,修改会导致 ConfigNode/DataNode 无法启动。 -- 重启服务生效: ConfigNode/DataNode 启动后仍可修改,但需要重启 ConfigNode/DataNode 后才生效。 -- 热加载: 可在 ConfigNode/DataNode 运行时修改,修改后通过 Session 或 Cli 发送 `load configuration` 或 `set configuration key1 = 'value1'` 命令(SQL)至 IoTDB 使配置生效。 - -## 2. 
环境配置项 - -### 2.1 confignode-env.sh/bat - -环境配置项主要用于对 ConfigNode 运行的 Java 环境相关参数进行配置,如 JVM 相关配置。ConfigNode 启动时,此部分配置会被传给 JVM,详细配置项说明如下: - -- MEMORY_SIZE - -| 名字 | MEMORY_SIZE | -| ------------ | ------------------------------------------------------------ | -| 描述 | IoTDB ConfigNode 启动时分配的内存大小 | -| 类型 | String | -| 默认值 | 取决于操作系统和机器配置。默认为机器内存的十分之三,最多会被设置为 16G。 | -| 改后生效方式 | 重启服务生效 | - -- ON_HEAP_MEMORY - -| 名字 | ON_HEAP_MEMORY | -| ------------ | ------------------------------------------------------------ | -| 描述 | IoTDB ConfigNode 能使用的堆内内存大小, 曾用名: MAX_HEAP_SIZE | -| 类型 | String | -| 默认值 | 取决于MEMORY_SIZE的配置。 | -| 改后生效方式 | 重启服务生效 | - -- OFF_HEAP_MEMORY - -| 名字 | OFF_HEAP_MEMORY | -| ------------ | ------------------------------------------------------------ | -| 描述 | IoTDB ConfigNode 能使用的堆外内存大小, 曾用名: MAX_DIRECT_MEMORY_SIZE | -| 类型 | String | -| 默认值 | 取决于MEMORY_SIZE的配置。 | -| 改后生效方式 | 重启服务生效 | - -### 2.2 datanode-env.sh/bat - -环境配置项主要用于对 DataNode 运行的 Java 环境相关参数进行配置,如 JVM 相关配置。DataNode/Standalone 启动时,此部分配置会被传给 JVM,详细配置项说明如下: - -- MEMORY_SIZE - -| 名字 | MEMORY_SIZE | -| ------------ | ---------------------------------------------------- | -| 描述 | IoTDB DataNode 启动时分配的内存大小 | -| 类型 | String | -| 默认值 | 取决于操作系统和机器配置。默认为机器内存的二分之一。 | -| 改后生效方式 | 重启服务生效 | - -- ON_HEAP_MEMORY - -| 名字 | ON_HEAP_MEMORY | -| ------------ | ---------------------------------------------------------- | -| 描述 | IoTDB DataNode 能使用的堆内内存大小, 曾用名: MAX_HEAP_SIZE | -| 类型 | String | -| 默认值 | 取决于MEMORY_SIZE的配置。 | -| 改后生效方式 | 重启服务生效 | - -- OFF_HEAP_MEMORY - -| 名字 | OFF_HEAP_MEMORY | -| ------------ | ------------------------------------------------------------ | -| 描述 | IoTDB DataNode 能使用的堆外内存大小, 曾用名: MAX_DIRECT_MEMORY_SIZE | -| 类型 | String | -| 默认值 | 取决于MEMORY_SIZE的配置 | -| 改后生效方式 | 重启服务生效 | - - -## 3. 
系统配置项(iotdb-system.properties.template) - -### 3.1 集群管理 - -- cluster_name - -| 名字 | cluster_name | -| -------- |--------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| 描述 | 集群名称 | -| 类型 | String | -| 默认值 | default_cluster | -| 修改方式 | CLI 中执行语句 `set configuration cluster_name = 'xxx'` (xxx为希望修改成的集群名称) | -| 注意 | 此修改通过网络分发至每个节点。在网络波动或者有节点宕机的情况下,不保证能够在全部节点修改成功。未修改成功的节点重启时无法加入集群,此时需要手动修改该节点的配置文件中的cluster_name项,再重启。正常情况下,不建议通过手动修改配置文件的方式修改集群名称,不建议通过`load configuration`的方式热加载。 | - -### 3.2 SeedConfigNode 配置 - -- cn_seed_config_node - -| 名字 | cn_seed_config_node | -| ------------ | ------------------------------------------------------------ | -| 描述 | 目标 ConfigNode 地址,ConfigNode 通过此地址加入集群,推荐使用 SeedConfigNode。V1.2.2 及以前曾用名是 cn_target_config_node_list | -| 类型 | String | -| 默认值 | 127.0.0.1:10710 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- dn_seed_config_node - -| 名字 | dn_seed_config_node | -| ------------ | ------------------------------------------------------------ | -| 描述 | ConfigNode 地址,DataNode 启动时通过此地址加入集群,推荐使用 SeedConfigNode。V1.2.2 及以前曾用名是 dn_target_config_node_list | -| 类型 | String | -| 默认值 | 127.0.0.1:10710 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -### 3.3 Node RPC 配置 - -- cn_internal_address - -| 名字 | cn_internal_address | -| ------------ | ---------------------------- | -| 描述 | ConfigNode 集群内部地址 | -| 类型 | String | -| 默认值 | 127.0.0.1 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- cn_internal_port - -| 名字 | cn_internal_port | -| ------------ | ---------------------------- | -| 描述 | ConfigNode 集群服务监听端口 | -| 类型 | Short Int : [0,65535] | -| 默认值 | 10710 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- cn_consensus_port - -| 名字 | cn_consensus_port | -| ------------ | ----------------------------- | -| 描述 | ConfigNode 的共识协议通信端口 | -| 类型 | Short Int : [0,65535] | -| 默认值 | 10720 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- dn_rpc_address - -| 名字 | dn_rpc_address | -| ------------ |----------------| -| 描述 | 客户端 
RPC 服务监听地址 | -| 类型 | String | -| 默认值 | 127.0.0.1 | -| 改后生效方式 | 重启服务生效 | - -- dn_rpc_port - -| 名字 | dn_rpc_port | -| ------------ | ----------------------- | -| 描述 | Client RPC 服务监听端口 | -| 类型 | Short Int : [0,65535] | -| 默认值 | 6667 | -| 改后生效方式 | 重启服务生效 | - -- dn_internal_address - -| 名字 | dn_internal_address | -| ------------ | ---------------------------- | -| 描述 | DataNode 内网通信地址 | -| 类型 | string | -| 默认值 | 127.0.0.1 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- dn_internal_port - -| 名字 | dn_internal_port | -| ------------ | ---------------------------- | -| 描述 | DataNode 内网通信端口 | -| 类型 | int | -| 默认值 | 10730 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- dn_mpp_data_exchange_port - -| 名字 | dn_mpp_data_exchange_port | -| ------------ | ---------------------------- | -| 描述 | MPP 数据交换端口 | -| 类型 | int | -| 默认值 | 10740 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- dn_schema_region_consensus_port - -| 名字 | dn_schema_region_consensus_port | -| ------------ | ------------------------------------- | -| 描述 | DataNode 元数据副本的共识协议通信端口 | -| 类型 | int | -| 默认值 | 10750 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- dn_data_region_consensus_port - -| 名字 | dn_data_region_consensus_port | -| ------------ | ----------------------------------- | -| 描述 | DataNode 数据副本的共识协议通信端口 | -| 类型 | int | -| 默认值 | 10760 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- dn_join_cluster_retry_interval_ms - -| 名字 | dn_join_cluster_retry_interval_ms | -| ------------ | --------------------------------- | -| 描述 | DataNode 再次重试加入集群等待时间 | -| 类型 | long | -| 默认值 | 5000 | -| 改后生效方式 | 重启服务生效 | - -### 3.4 副本配置 - -- config_node_consensus_protocol_class - -| 名字 | config_node_consensus_protocol_class | -| ------------ | ------------------------------------------------ | -| 描述 | ConfigNode 副本的共识协议,仅支持 RatisConsensus | -| 类型 | String | -| 默认值 | org.apache.iotdb.consensus.ratis.RatisConsensus | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- schema_replication_factor - -| 名字 | schema_replication_factor | -| ------------ | ---------------------------------- | -| 描述 | Database 的默认元数据副本数 
| -| 类型 | int32 | -| 默认值 | 1 | -| 改后生效方式 | 重启服务后对**新的 Database** 生效 | - -- schema_region_consensus_protocol_class - -| 名字 | schema_region_consensus_protocol_class | -| ------------ | ----------------------------------------------------- | -| 描述 | 元数据副本的共识协议,多副本时只能使用 RatisConsensus | -| 类型 | String | -| 默认值 | org.apache.iotdb.consensus.ratis.RatisConsensus | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- data_replication_factor - -| 名字 | data_replication_factor | -| ------------ | ---------------------------------- | -| 描述 | Database 的默认数据副本数 | -| 类型 | int32 | -| 默认值 | 1 | -| 改后生效方式 | 重启服务后对**新的 Database** 生效 | - -- data_region_consensus_protocol_class - -| 名字 | data_region_consensus_protocol_class | -| ------------ | ------------------------------------------------------------ | -| 描述 | 数据副本的共识协议,多副本时可以使用 IoTConsensus 或 RatisConsensus | -| 类型 | String | -| 默认值 | org.apache.iotdb.consensus.iot.IoTConsensus | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -### 3.5 目录配置 - -- cn_system_dir - -| 名字 | cn_system_dir | -| ------------ | ----------------------------------------------------------- | -| 描述 | ConfigNode 系统数据存储路径 | -| 类型 | String | -| 默认值 | data/confignode/system(Windows:data\\confignode\\system) | -| 改后生效方式 | 重启服务生效 | - -- cn_consensus_dir - -| 名字 | cn_consensus_dir | -| ------------ | ------------------------------------------------------------ | -| 描述 | ConfigNode 共识协议数据存储路径 | -| 类型 | String | -| 默认值 | data/confignode/consensus(Windows:data\\confignode\\consensus) | -| 改后生效方式 | 重启服务生效 | - -- cn_pipe_receiver_file_dir - -| 名字 | cn_pipe_receiver_file_dir | -| ------------ | ------------------------------------------------------------ | -| 描述 | ConfigNode 中 pipe 接收者用于存储文件的目录路径。 | -| 类型 | String | -| 默认值 | data/confignode/system/pipe/receiver(Windows:data\\confignode\\system\\pipe\\receiver) | -| 改后生效方式 | 重启服务生效 | - -- dn_system_dir - -| 名字 | dn_system_dir | -| ------------ | ------------------------------------------------------------ | -| 描述 | IoTDB 元数据存储路径,默认存放在和 sbin 目录同级的 data 
目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/system(Windows:data\\datanode\\system) | -| 改后生效方式 | 重启服务生效 | - -- dn_data_dirs - -| 名字 | dn_data_dirs | -| ------------ | ------------------------------------------------------------ | -| 描述 | IoTDB 数据存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/data(Windows:data\\datanode\\data) | -| 改后生效方式 | 重启服务生效 | - -- dn_multi_dir_strategy - -| 名字 | dn_multi_dir_strategy | -| ------------ | ------------------------------------------------------------ | -| 描述 | IoTDB 在 data_dirs 中为 TsFile 选择目录时采用的策略。可使用简单类名或类名全称。系统提供以下两种策略:
1. SequenceStrategy:IoTDB 按顺序选择目录,依次遍历 data_dirs 中的所有目录,并不断轮询;
2. MaxDiskUsableSpaceFirstStrategy:IoTDB 优先选择 data_dirs 中对应磁盘空余空间最大的目录;
您可以通过以下方法完成用户自定义策略:
1. 继承 org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy 类并实现自身的 Strategy 方法;
2. 将实现的类的完整类名(包名加类名,UserDefineStrategyPackage)填写到该配置项;
3. 将该类 jar 包添加到工程中。 | -| 类型 | String | -| 默认值 | SequenceStrategy | -| 改后生效方式 | 热加载 | - -- dn_consensus_dir - -| 名字 | dn_consensus_dir | -| ------------ | ------------------------------------------------------------ | -| 描述 | IoTDB 共识层日志存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/consensus(Windows:data\\datanode\\consensus) | -| 改后生效方式 | 重启服务生效 | - -- dn_wal_dirs - -| 名字 | dn_wal_dirs | -| ------------ | ------------------------------------------------------------ | -| 描述 | IoTDB 写前日志存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/wal(Windows:data\\datanode\\wal) | -| 改后生效方式 | 重启服务生效 | - -- dn_tracing_dir - -| 名字 | dn_tracing_dir | -| ------------ | ------------------------------------------------------------ | -| 描述 | IoTDB 追踪根目录路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | datanode/tracing(Windows:datanode\\tracing) | -| 改后生效方式 | 重启服务生效 | - -- dn_sync_dir - -| 名字 | dn_sync_dir | -| ------------ | ------------------------------------------------------------ | -| 描述 | IoTDB sync 存储路径,默认存放在和 sbin 目录同级的 data 目录下。相对路径的起始目录与操作系统相关,建议使用绝对路径。 | -| 类型 | String | -| 默认值 | data/datanode/sync(Windows:data\\datanode\\sync) | -| 改后生效方式 | 重启服务生效 | - -- sort_tmp_dir - -| 名字 | sort_tmp_dir | -| ------------ | ------------------------------------------------- | -| 描述 | 用于配置排序操作的临时目录。 | -| 类型 | String | -| 默认值 | data/datanode/tmp(Windows:data\\datanode\\tmp) | -| 改后生效方式 | 重启服务生效 | - -- dn_pipe_receiver_file_dirs - -| 名字 | dn_pipe_receiver_file_dirs | -| ------------ | ------------------------------------------------------------ | -| 描述 | DataNode中pipe接收者用于存储文件的目录路径。 | -| 类型 | String | -| 默认值 | data/datanode/system/pipe/receiver(Windows:data\\datanode\\system\\pipe\\receiver) | -| 改后生效方式 | 重启服务生效 | - -- iot_consensus_v2_receiver_file_dirs - -| 名字 | iot_consensus_v2_receiver_file_dirs | -| ------------ | 
------------------------------------------------------------ | -| 描述 | IoTConsensus V2中接收者用于存储文件的目录路径。 | -| 类型 | String | -| 默认值 | data/datanode/system/pipe/consensus/receiver(Windows:data\\datanode\\system\\pipe\\consensus\\receiver) | -| 改后生效方式 | 重启服务生效 | - -- iot_consensus_v2_deletion_file_dir - -| 名字 | iot_consensus_v2_deletion_file_dir | -| ------------ | ------------------------------------------------------------ | -| 描述 | IoTConsensus V2中删除操作用于存储文件的目录路径。 | -| 类型 | String | -| 默认值 | data/datanode/system/pipe/consensus/deletion(Windows:data\\datanode\\system\\pipe\\consensus\\deletion) | -| 改后生效方式 | 重启服务生效 | - -### 3.6 监控配置 - -- cn_metric_reporter_list - -| 名字 | cn_metric_reporter_list | -| ------------ | -------------------------------------------------- | -| 描述 | confignode中用于配置监控模块的数据需要报告的系统。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -- cn_metric_level - -| 名字 | cn_metric_level | -| ------------ | ------------------------------------------ | -| 描述 | confignode中控制监控模块收集数据的详细程度 | -| 类型 | String | -| 默认值 | IMPORTANT | -| 改后生效方式 | 重启服务生效 | - -- cn_metric_async_collect_period - -| 名字 | cn_metric_async_collect_period | -| ------------ | -------------------------------------------------- | -| 描述 | confignode中某些监控数据异步收集的周期,单位是秒。 | -| 类型 | int | -| 默认值 | 5 | -| 改后生效方式 | 重启服务生效 | - -- cn_metric_prometheus_reporter_port - -| 名字 | cn_metric_prometheus_reporter_port | -| ------------ | ------------------------------------------------------ | -| 描述 | confignode中Prometheus报告者用于监控数据报告的端口号。 | -| 类型 | int | -| 默认值 | 9091 | -| 改后生效方式 | 重启服务生效 | - -- dn_metric_reporter_list - -| 名字 | dn_metric_reporter_list | -| ------------ | ------------------------------------------------ | -| 描述 | DataNode中用于配置监控模块的数据需要报告的系统。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -- dn_metric_level - -| 名字 | dn_metric_level | -| ------------ | ---------------------------------------- | -| 描述 | DataNode中控制监控模块收集数据的详细程度 | -| 类型 | String | -| 默认值 | IMPORTANT | -| 改后生效方式 | 重启服务生效 
| - -- dn_metric_async_collect_period - -| 名字 | dn_metric_async_collect_period | -| ------------ | ------------------------------------------------ | -| 描述 | DataNode中某些监控数据异步收集的周期,单位是秒。 | -| 类型 | int | -| 默认值 | 5 | -| 改后生效方式 | 重启服务生效 | - -- dn_metric_prometheus_reporter_port - -| 名字 | dn_metric_prometheus_reporter_port | -| ------------ | ---------------------------------------------------- | -| 描述 | DataNode中Prometheus报告者用于监控数据报告的端口号。 | -| 类型 | int | -| 默认值 | 9092 | -| 改后生效方式 | 重启服务生效 | - -- dn_metric_internal_reporter_type - -| 名字 | dn_metric_internal_reporter_type | -| ------------ | ------------------------------------------------------------ | -| 描述 | DataNode中监控模块内部报告者的种类,用于内部监控和检查数据是否已经成功写入和刷新。 | -| 类型 | String | -| 默认值 | IOTDB | -| 改后生效方式 | 重启服务生效 | - -### 3.7 SSL 配置 - -- enable_thrift_ssl - -| 名字 | enable_thrift_ssl | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当enable_thrift_ssl配置为true时,将通过dn_rpc_port使用 SSL 加密进行通信 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- enable_https - -| 名字 | enable_https | -| ------------ | ------------------------------ | -| 描述 | REST Service 是否开启 SSL 配置 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- key_store_path - -| 名字 | key_store_path | -| ------------ | -------------- | -| 描述 | ssl证书路径 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -- key_store_pwd - -| 名字 | key_store_pwd | -| ------------ | ------------- | -| 描述 | ssl证书密码 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -### 3.8 连接配置 - -- cn_rpc_thrift_compression_enable - -| 名字 | cn_rpc_thrift_compression_enable | -| ------------ | -------------------------------- | -| 描述 | 是否启用 thrift 的压缩机制。 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- cn_rpc_max_concurrent_client_num - -| 名字 | cn_rpc_max_concurrent_client_num | -| ------------ |---------------------------------| -| 描述 | 最大连接数。 | -| 类型 | int | -| 默认值 | 3000 | -| 改后生效方式 | 重启服务生效 | - -- cn_connection_timeout_ms - 
-| 名字 | cn_connection_timeout_ms | -| ------------ | ------------------------ | -| 描述 | 节点连接超时时间 | -| 类型 | int | -| 默认值 | 60000 | -| 改后生效方式 | 重启服务生效 | - -- cn_selector_thread_nums_of_client_manager - -| 名字 | cn_selector_thread_nums_of_client_manager | -| ------------ | ----------------------------------------- | -| 描述 | 客户端异步线程管理的选择器线程数量 | -| 类型 | int | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -- cn_max_client_count_for_each_node_in_client_manager - -| 名字 | cn_max_client_count_for_each_node_in_client_manager | -| ------------ | --------------------------------------------------- | -| 描述 | 单 ClientManager 中路由到每个节点的最大 Client 个数 | -| 类型 | int | -| 默认值 | 300 | -| 改后生效方式 | 重启服务生效 | - -- dn_session_timeout_threshold - -| 名字 | dn_session_timeout_threshold | -| ------------ | ---------------------------- | -| 描述 | 最大的会话空闲时间 | -| 类型 | int | -| 默认值 | 0 | -| 改后生效方式 | 重启服务生效 | - -- dn_rpc_thrift_compression_enable - -| 名字 | dn_rpc_thrift_compression_enable | -| ------------ | -------------------------------- | -| 描述 | 是否启用 thrift 的压缩机制 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- dn_rpc_advanced_compression_enable - -| 名字 | dn_rpc_advanced_compression_enable | -| ------------ | ---------------------------------- | -| 描述 | 是否启用 thrift 的自定义压缩机制 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- dn_rpc_selector_thread_count - -| 名字 | dn_rpc_selector_thread_count | -| ------------ | ---------------------------- | -| 描述 | RPC 选择器线程数量 | -| 类型 | int | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -- dn_rpc_min_concurrent_client_num - -| 名字 | dn_rpc_min_concurrent_client_num | -| ------------ | -------------------------------- | -| 描述 | 最小连接数 | -| 类型 | Short Int : [0,65535] | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -- dn_rpc_max_concurrent_client_num - -| 名字 | dn_rpc_max_concurrent_client_num | -| ------------ |----------------------------------| -| 描述 | 最大连接数 | -| 类型 | Short Int : [0,65535] | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- dn_thrift_max_frame_size - -| 名字 | 
dn_thrift_max_frame_size | -| ------------ |-------------------------------------------------------------------------------------------------------------------| -| 描述 | RPC 请求/响应的最大字节数 | -| 类型 | int | -| 默认值 | 默认为 0,即根据启动时 DataNode JVM 的配置参数自动计算:
a. min(64MB, dn_alloc_memory/64)
b. 若用户手动配置了 dn_thrift_max_frame_size,则使用用户指定的大小 | -| 改后生效方式 | 重启服务生效 | - -- dn_thrift_init_buffer_size - -| 名字 | dn_thrift_init_buffer_size | -| ------------ | -------------------------- | -| 描述 | Thrift 初始缓冲区大小,单位:字节 | -| 类型 | long | -| 默认值 | 1024 | -| 改后生效方式 | 重启服务生效 | - -- dn_connection_timeout_ms - -| 名字 | dn_connection_timeout_ms | -| ------------ | ------------------------ | -| 描述 | 节点连接超时时间 | -| 类型 | int | -| 默认值 | 60000 | -| 改后生效方式 | 重启服务生效 | - -- dn_selector_thread_count_of_client_manager - -| 名字 | dn_selector_thread_count_of_client_manager | -| ------------ | ------------------------------------------------------------ | -| 描述 | ClientManager 中异步线程的选择器线程(TAsyncClientManager)数量 | -| 类型 | int | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -- dn_max_client_count_for_each_node_in_client_manager - -| 名字 | dn_max_client_count_for_each_node_in_client_manager | -| ------------ | --------------------------------------------------- | -| 描述 | 单 ClientManager 中路由到每个节点的最大 Client 个数 | -| 类型 | int | -| 默认值 | 300 | -| 改后生效方式 | 重启服务生效 | - -### 3.9 对象存储管理 - -- remote_tsfile_cache_dirs - -| 名字 | remote_tsfile_cache_dirs | -| ------------ | ------------------------ | -| 描述 | 云端存储在本地的缓存目录 | -| 类型 | String | -| 默认值 | data/datanode/data/cache | -| 改后生效方式 | 重启服务生效 | - -- remote_tsfile_cache_page_size_in_kb - -| 名字 | remote_tsfile_cache_page_size_in_kb | -| ------------ | ----------------------------------- | -| 描述 | 云端存储在本地缓存文件的块大小,单位:KB | -| 类型 | int | -| 默认值 | 20480 | -| 改后生效方式 | 重启服务生效 | - -- remote_tsfile_cache_max_disk_usage_in_mb - -| 名字 | remote_tsfile_cache_max_disk_usage_in_mb | -| ------------ | ---------------------------------------- | -| 描述 | 云端存储本地缓存的最大磁盘占用大小,单位:MB | -| 类型 | long | -| 默认值 | 51200 | -| 改后生效方式 | 重启服务生效 | - -- object_storage_type - -| 名字 | object_storage_type | -| ------------ | ------------------- | -| 描述 | 云端存储类型 | -| 类型 | String | -| 默认值 | AWS_S3 | -| 改后生效方式 | 重启服务生效 | - -- object_storage_endpoint - -| 名字 | 
object_storage_endpoint | -| ------------ | ----------------------- | -| 描述 | 云端存储的 endpoint | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -- object_storage_bucket - -| 名字 | object_storage_bucket | -| ------------ | ---------------------- | -| 描述 | 云端存储 bucket 的名称 | -| 类型 | String | -| 默认值 | iotdb_data | -| 改后生效方式 | 重启服务生效 | - -- object_storage_access_key - -| 名字 | object_storage_access_key | -| ------------ | ------------------------- | -| 描述 | 云端存储的验证信息 key | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -- object_storage_access_secret - -| 名字 | object_storage_access_secret | -| ------------ | ---------------------------- | -| 描述 | 云端存储的验证信息 secret | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -### 3.10 多级管理 - -- dn_default_space_usage_thresholds - -| 名字 | dn_default_space_usage_thresholds | -| ------------ | ------------------------------------------------------------ | -| 描述 | 定义每个层级数据目录的最小剩余空间比例;当剩余空间少于该比例时,数据会被自动迁移至下一个层级;当最后一个层级的剩余存储空间低于此阈值时,会将系统置为 READ_ONLY | -| 类型 | double | -| 默认值 | 0.85 | -| 改后生效方式 | 热加载 | - -- dn_tier_full_policy - -| 名字 | dn_tier_full_policy | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当最后一层数据的已用空间高于 dn_default_space_usage_thresholds 时的处理策略。 | -| 类型 | String | -| 默认值 | NULL | -| 改后生效方式 | 热加载 | - -- migrate_thread_count - -| 名字 | migrate_thread_count | -| ------------ | ---------------------------------------- | -| 描述 | DataNode 数据目录中迁移操作的线程池大小。 | -| 类型 | int | -| 默认值 | 1 | -| 改后生效方式 | 热加载 | - -- tiered_storage_migrate_speed_limit_bytes_per_sec - -| 名字 | tiered_storage_migrate_speed_limit_bytes_per_sec | -| ------------ | ------------------------------------------------ | -| 描述 | 限制不同存储层级之间的数据迁移速度,单位:字节/秒。 | -| 类型 | int | -| 默认值 | 10485760 | -| 改后生效方式 | 热加载 | - -### 3.11 REST 服务配置 - -- enable_rest_service - -| 名字 | enable_rest_service | -| ------------ | ------------------- | -| 描述 | 是否开启 REST 服务。 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- rest_service_port - -| 
名字 | rest_service_port | -| ------------ | ------------------ | -| 描述 | REST 服务监听端口号 | -| 类型 | int32 | -| 默认值 | 18080 | -| 改后生效方式 | 重启服务生效 | - -- enable_swagger - -| 名字 | enable_swagger | -| ------------ | --------------------------------- | -| 描述 | 是否启用 Swagger 来展示 REST 接口信息 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- rest_query_default_row_size_limit - -| 名字 | rest_query_default_row_size_limit | -| ------------ | --------------------------------- | -| 描述 | 一次查询能返回的结果集最大行数 | -| 类型 | int32 | -| 默认值 | 10000 | -| 改后生效方式 | 重启服务生效 | - -- cache_expire_in_seconds - -| 名字 | cache_expire_in_seconds | -| ------------ | -------------------------------- | -| 描述 | 用户登录信息缓存的过期时间(秒) | -| 类型 | int32 | -| 默认值 | 28800 | -| 改后生效方式 | 重启服务生效 | - -- cache_max_num - -| 名字 | cache_max_num | -| ------------ | ------------------------ | -| 描述 | 缓存中存储的最大用户数量 | -| 类型 | int32 | -| 默认值 | 100 | -| 改后生效方式 | 重启服务生效 | - -- cache_init_num - -| 名字 | cache_init_num | -| ------------ | -------------- | -| 描述 | 缓存初始容量 | -| 类型 | int32 | -| 默认值 | 10 | -| 改后生效方式 | 重启服务生效 | - -- client_auth - -| 名字 | client_auth | -| ------------ | ---------------------- | -| 描述 | 是否需要客户端身份验证 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- trust_store_path - -| 名字 | trust_store_path | -| ------------ | ----------------------- | -| 描述 | trustStore 路径(非必填) | -| 类型 | String | -| 默认值 | "" | -| 改后生效方式 | 重启服务生效 | - -- trust_store_pwd - -| 名字 | trust_store_pwd | -| ------------ | ------------------------- | -| 描述 | trustStore 密码(非必填) | -| 类型 | String | -| 默认值 | "" | -| 改后生效方式 | 重启服务生效 | - -- idle_timeout_in_seconds - -| 名字 | idle_timeout_in_seconds | -| ------------ | ----------------------- | -| 描述 | SSL 超时时间,单位为秒 | -| 类型 | int32 | -| 默认值 | 5000 | -| 改后生效方式 | 重启服务生效 | - -### 3.12 负载均衡配置 - -- series_slot_num - -| 名字 | series_slot_num | -| ------------ | ---------------------------- | -| 描述 | 序列分区槽数 | -| 类型 | int32 | -| 默认值 | 10000 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- series_partition_executor_class - -| 
名字 | series_partition_executor_class | -| ------------ | ------------------------------------------------------------ | -| 描述 | 序列分区哈希函数 | -| 类型 | String | -| 默认值 | org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- schema_region_group_extension_policy - -| 名字 | schema_region_group_extension_policy | -| ------------ | ------------------------------------ | -| 描述 | SchemaRegionGroup 的扩容策略 | -| 类型 | String | -| 默认值 | AUTO | -| 改后生效方式 | 重启服务生效 | - -- default_schema_region_group_num_per_database - -| 名字 | default_schema_region_group_num_per_database | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当选用 CUSTOM-SchemaRegionGroup 扩容策略时,此参数为每个 Database 拥有的 SchemaRegionGroup 数量;当选用 AUTO-SchemaRegionGroup 扩容策略时,此参数为每个 Database 最少拥有的 SchemaRegionGroup 数量 | -| 类型 | int | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -- schema_region_per_data_node - -| 名字 | schema_region_per_data_node | -| ------------ | -------------------------------------------------- | -| 描述 | 期望每个 DataNode 可管理的 SchemaRegion 的最大数量 | -| 类型 | double | -| 默认值 | 1.0 | -| 改后生效方式 | 重启服务生效 | - -- data_region_group_extension_policy - -| 名字 | data_region_group_extension_policy | -| ------------ | ---------------------------------- | -| 描述 | DataRegionGroup 的扩容策略 | -| 类型 | String | -| 默认值 | AUTO | -| 改后生效方式 | 重启服务生效 | - -- default_data_region_group_num_per_database - -| 名字 | default_data_region_group_num_per_database | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当选用 CUSTOM-DataRegionGroup 扩容策略时,此参数为每个 Database 拥有的 DataRegionGroup 数量;当选用 AUTO-DataRegionGroup 扩容策略时,此参数为每个 Database 最少拥有的 DataRegionGroup 数量 | -| 类型 | int | -| 默认值 | 2 | -| 改后生效方式 | 重启服务生效 | - -- data_region_per_data_node - -| 名字 | data_region_per_data_node | -| ------------ | ------------------------------------------------ | -| 描述 | 期望每个 DataNode 可管理的 DataRegion 的最大数量 | -| 类型 | double | -| 默认值 | CPU 核心数的一半 | -| 改后生效方式 | 重启服务生效 | - -- 
enable_auto_leader_balance_for_ratis_consensus - -| 名字 | enable_auto_leader_balance_for_ratis_consensus | -| ------------ | ---------------------------------------------- | -| 描述 | 是否为 Ratis 共识协议开启自动均衡 leader 策略 | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 重启服务生效 | - -- enable_auto_leader_balance_for_iot_consensus - -| 名字 | enable_auto_leader_balance_for_iot_consensus | -| ------------ | -------------------------------------------- | -| 描述 | 是否为 IoT 共识协议开启自动均衡 leader 策略 | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 重启服务生效 | - -### 3.13 集群管理 - -- time_partition_origin - -| 名字 | time_partition_origin | -| ------------ | ------------------------------------------------------------ | -| 描述 | Database 数据时间分区的起始点,即从哪个时间点开始计算时间分区。 | -| 类型 | Long | -| 单位 | 毫秒 | -| 默认值 | 0 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- time_partition_interval - -| 名字 | time_partition_interval | -| ------------ | ------------------------------- | -| 描述 | Database 默认的数据时间分区间隔 | -| 类型 | Long | -| 单位 | 毫秒 | -| 默认值 | 604800000 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- heartbeat_interval_in_ms - -| 名字 | heartbeat_interval_in_ms | -| ------------ | ------------------------ | -| 描述 | 集群节点间的心跳间隔 | -| 类型 | Long | -| 单位 | ms | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- disk_space_warning_threshold - -| 名字 | disk_space_warning_threshold | -| ------------ | ---------------------------- | -| 描述 | DataNode 磁盘剩余阈值 | -| 类型 | double(percentage) | -| 默认值 | 0.05 | -| 改后生效方式 | 重启服务生效 | - -### 3.14 内存控制配置 - -- datanode_memory_proportion - -| 名字 | datanode_memory_proportion | -| ------------ | ---------------------------------------------------- | -| 描述 | 存储引擎、查询引擎、元数据、共识、流处理引擎和空闲内存比例 | -| 类型 | Ratio | -| 默认值 | 3:3:1:1:1:1 | -| 改后生效方式 | 重启服务生效 | - -- schema_memory_proportion - -| 名字 | schema_memory_proportion | -| ------------ | ------------------------------------------------------------ | -| 描述 | Schema 相关的内存如何在 SchemaRegion、SchemaCache 和 PartitionCache 之间分配 | -| 类型 | Ratio | -| 默认值 | 5:4:1 | -| 改后生效方式 | 重启服务生效 | - -- 
storage_engine_memory_proportion - -| 名字 | storage_engine_memory_proportion | -| ------------ | -------------------------------- | -| 描述 | 写入和合并占存储内存比例 | -| 类型 | Ratio | -| 默认值 | 8:2 | -| 改后生效方式 | 重启服务生效 | - -- write_memory_proportion - -| 名字 | write_memory_proportion | -| ------------ | -------------------------------------------- | -| 描述 | Memtable 和 TimePartitionInfo 占写入内存比例 | -| 类型 | Ratio | -| 默认值 | 19:1 | -| 改后生效方式 | 重启服务生效 | - -- primitive_array_size - -| 名字 | primitive_array_size | -| ------------ | ---------------------------------------- | -| 描述 | 数组池中的原始数组大小(每个数组的长度) | -| 类型 | int32 | -| 默认值 | 64 | -| 改后生效方式 | 重启服务生效 | - -- chunk_metadata_size_proportion - -| 名字 | chunk_metadata_size_proportion | -| ------------ | -------------------------------------------- | -| 描述 | 在数据压缩过程中,用于存储块元数据的内存比例 | -| 类型 | Double | -| 默认值 | 0.1 | -| 改后生效方式 | 重启服务生效 | - -- flush_proportion - -| 名字 | flush_proportion | -| ------------ | ------------------------------------------------------------ | -| 描述 | 调用flush disk的写入内存比例,默认0.4,若有极高的写入负载力(比如batch=1000),可以设置为低于默认值,比如0.2 | -| 类型 | Double | -| 默认值 | 0.4 | -| 改后生效方式 | 重启服务生效 | - -- buffered_arrays_memory_proportion - -| 名字 | buffered_arrays_memory_proportion | -| ------------ | --------------------------------------- | -| 描述 | 为缓冲数组分配的写入内存比例,默认为0.6 | -| 类型 | Double | -| 默认值 | 0.6 | -| 改后生效方式 | 重启服务生效 | - -- reject_proportion - -| 名字 | reject_proportion | -| ------------ | ------------------------------------------------------------ | -| 描述 | 拒绝插入的写入内存比例,默认0.8,若有极高的写入负载力(比如batch=1000)并且物理内存足够大,它可以设置为高于默认值,如0.9 | -| 类型 | Double | -| 默认值 | 0.8 | -| 改后生效方式 | 重启服务生效 | - -- device_path_cache_proportion - -| 名字 | device_path_cache_proportion | -| ------------ | --------------------------------------------------- | -| 描述 | 在内存中分配给设备路径缓存(DevicePathCache)的比例 | -| 类型 | Double | -| 默认值 | 0.05 | -| 改后生效方式 | 重启服务生效 | - -- write_memory_variation_report_proportion - -| 名字 | write_memory_variation_report_proportion | -| ------------ | 
------------------------------------------------------------ | -| 描述 | 如果 DataRegion 的内存增加超过写入可用内存的一定比例,则向系统报告。默认值为0.001 | -| 类型 | Double | -| 默认值 | 0.001 | -| 改后生效方式 | 重启服务生效 | - -- check_period_when_insert_blocked - -| 名字 | check_period_when_insert_blocked | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当插入被拒绝时,等待时间(以毫秒为单位)去再次检查系统,默认为50。若插入被拒绝,读取负载低,可以设置大一些。 | -| 类型 | int32 | -| 默认值 | 50 | -| 改后生效方式 | 重启服务生效 | - -- io_task_queue_size_for_flushing - -| 名字 | io_task_queue_size_for_flushing | -| ------------ | -------------------------------- | -| 描述 | ioTaskQueue 的大小。默认值为10。 | -| 类型 | int32 | -| 默认值 | 10 | -| 改后生效方式 | 重启服务生效 | - -- enable_query_memory_estimation - -| 名字 | enable_query_memory_estimation | -| ------------ | ------------------------------------------------------------ | -| 描述 | 开启后会预估每次查询的内存使用量,如果超过可用内存,会拒绝本次查询 | -| 类型 | bool | -| 默认值 | true | -| 改后生效方式 | 热加载 | - -### 3.15 元数据引擎配置 - -- schema_engine_mode - -| 名字 | schema_engine_mode | -| ------------ | ------------------------------------------------------------ | -| 描述 | 元数据引擎的运行模式,支持 Memory 和 PBTree;PBTree 模式下支持将内存中暂时不用的序列元数据实时置换到磁盘上,需要使用时再加载进内存;此参数在集群中所有的 DataNode 上务必保持相同。 | -| 类型 | string | -| 默认值 | Memory | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- partition_cache_size - -| 名字 | partition_cache_size | -| ------------ | ------------------------------ | -| 描述 | 分区信息缓存的最大缓存条目数。 | -| 类型 | Int32 | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- sync_mlog_period_in_ms - -| 名字 | sync_mlog_period_in_ms | -| ------------ | ------------------------------------------------------------ | -| 描述 | mlog定期刷新到磁盘的周期,单位毫秒。如果该参数为0,则表示每次对元数据的更新操作都会被立即写到磁盘上。 | -| 类型 | Int64 | -| 默认值 | 100 | -| 改后生效方式 | 重启服务生效 | - -- tag_attribute_flush_interval - -| 名字 | tag_attribute_flush_interval | -| ------------ | -------------------------------------------------- | -| 描述 | 标签和属性记录的间隔数,达到此记录数量时将强制刷盘 | -| 类型 | int32 | -| 默认值 | 1000 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- tag_attribute_total_size - -| 名字 | 
tag_attribute_total_size | -| ------------ | ---------------------------------------- | -| 描述 | 每个时间序列标签和属性的最大持久化字节数 | -| 类型 | int32 | -| 默认值 | 700 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- max_measurement_num_of_internal_request - -| 名字 | max_measurement_num_of_internal_request | -| ------------ | ------------------------------------------------------------ | -| 描述 | 一次注册序列请求中若物理量过多,在系统内部执行时将被拆分为若干个轻量级的子请求,每个子请求中的物理量数目不超过此参数设置的最大值。 | -| 类型 | Int32 | -| 默认值 | 10000 | -| 改后生效方式 | 重启服务生效 | - -- datanode_schema_cache_eviction_policy - -| 名字 | datanode_schema_cache_eviction_policy | -| ------------ | ----------------------------------------------------- | -| 描述 | 当 Schema 缓存达到其最大容量时,Schema 缓存的淘汰策略 | -| 类型 | String | -| 默认值 | FIFO | -| 改后生效方式 | 重启服务生效 | - -- cluster_timeseries_limit_threshold - -| 名字 | cluster_timeseries_limit_threshold | -| ------------ | ---------------------------------- | -| 描述 | 集群中可以创建的时间序列的最大数量 | -| 类型 | Int32 | -| 默认值 | -1 | -| 改后生效方式 | 重启服务生效 | - -- cluster_device_limit_threshold - -| 名字 | cluster_device_limit_threshold | -| ------------ | ------------------------------ | -| 描述 | 集群中可以创建的最大设备数量 | -| 类型 | Int32 | -| 默认值 | -1 | -| 改后生效方式 | 重启服务生效 | - -- database_limit_threshold - -| 名字 | database_limit_threshold | -| ------------ | ------------------------------ | -| 描述 | 集群中可以创建的最大数据库数量 | -| 类型 | Int32 | -| 默认值 | -1 | -| 改后生效方式 | 重启服务生效 | - -### 3.16 自动推断数据类型 - -- enable_auto_create_schema - -| 名字 | enable_auto_create_schema | -| ------------ | -------------------------------------- | -| 描述 | 当写入的序列不存在时,是否自动创建序列 | -| 取值 | true or false | -| 默认值 | true | -| 改后生效方式 | 重启服务生效 | - -- default_storage_group_level - -| 名字 | default_storage_group_level | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当写入的数据不存在且自动创建序列时,若需要创建相应的 database,将序列路径的哪一层当做 database。例如,如果我们接到一个新序列 root.sg0.d1.s2, 并且 level=1, 那么 root.sg0 被视为database(因为 root 是 level 0 层) | -| 取值 | int32 | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -- 
boolean_string_infer_type - -| 名字 | boolean_string_infer_type | -| ------------ | ------------------------------------------ | -| 描述 | "true" 或者 "false" 字符串被推断的数据类型 | -| 取值 | BOOLEAN 或者 TEXT | -| 默认值 | BOOLEAN | -| 改后生效方式 | 热加载 | - -- integer_string_infer_type - -| 名字 | integer_string_infer_type | -| ------------ | --------------------------------- | -| 描述 | 整型字符串推断的数据类型 | -| 取值 | INT32, INT64, FLOAT, DOUBLE, TEXT | -| 默认值 | DOUBLE | -| 改后生效方式 | 热加载 | - -- floating_string_infer_type - -| 名字 | floating_string_infer_type | -| ------------ | ----------------------------- | -| 描述 | "6.7"等字符串被推断的数据类型 | -| 取值 | DOUBLE, FLOAT or TEXT | -| 默认值 | DOUBLE | -| 改后生效方式 | 热加载 | - -- nan_string_infer_type - -| 名字 | nan_string_infer_type | -| ------------ | ---------------------------- | -| 描述 | "NaN" 字符串被推断的数据类型 | -| 取值 | DOUBLE, FLOAT or TEXT | -| 默认值 | DOUBLE | -| 改后生效方式 | 热加载 | - -- default_boolean_encoding - -| 名字 | default_boolean_encoding | -| ------------ | ------------------------ | -| 描述 | BOOLEAN 类型编码格式 | -| 取值 | PLAIN, RLE | -| 默认值 | RLE | -| 改后生效方式 | 热加载 | - -- default_int32_encoding - -| 名字 | default_int32_encoding | -| ------------ | -------------------------------------- | -| 描述 | int32 类型编码格式 | -| 取值 | PLAIN, RLE, TS_2DIFF, REGULAR, GORILLA | -| 默认值 | TS_2DIFF | -| 改后生效方式 | 热加载 | - -- default_int64_encoding - -| 名字 | default_int64_encoding | -| ------------ | -------------------------------------- | -| 描述 | int64 类型编码格式 | -| 取值 | PLAIN, RLE, TS_2DIFF, REGULAR, GORILLA | -| 默认值 | TS_2DIFF | -| 改后生效方式 | 热加载 | - -- default_float_encoding - -| 名字 | default_float_encoding | -| ------------ | ----------------------------- | -| 描述 | float 类型编码格式 | -| 取值 | PLAIN, RLE, TS_2DIFF, GORILLA | -| 默认值 | GORILLA | -| 改后生效方式 | 热加载 | - -- default_double_encoding - -| 名字 | default_double_encoding | -| ------------ | ----------------------------- | -| 描述 | double 类型编码格式 | -| 取值 | PLAIN, RLE, TS_2DIFF, GORILLA | -| 默认值 | GORILLA | -| 改后生效方式 | 热加载 | - -- default_text_encoding - -| 名字 | 
default_text_encoding | -| ------------ | --------------------- | -| 描述 | text 类型编码格式 | -| 取值 | PLAIN | -| 默认值 | PLAIN | -| 改后生效方式 | 热加载 | - -* boolean_compressor - -| 名字 | boolean_compressor | -| -------------- | ----------------------------------------------------------------------- | -| 描述 | 启用自动创建模式时,BOOLEAN 数据类型的压缩方式 (V2.0.6 版本开始支持) | -| 类型 | String | -| 默认值 | LZ4 | -| 改后生效方式 | 热加载 | - -* int32_compressor - -| 名字 | int32_compressor | -| -------------- | ------------------------------------------------------------------------- | -| 描述 | 启用自动创建模式时,INT32/DATE 数据类型的压缩方式(V2.0.6 版本开始支持) | -| 类型 | String | -| 默认值 | LZ4 | -| 改后生效方式 | 热加载 | - -* int64_compressor - -| 名字 | int64_compressor | -| -------------- | ------------------------------------------------------------------------------ | -| 描述 | 启用自动创建模式时,INT64/TIMESTAMP 数据类型的压缩方式(V2.0.6 版本开始支持) | -| 类型 | String | -| 默认值 | LZ4 | -| 改后生效方式 | 热加载 | - -* float_compressor - -| 名字 | float_compressor | -| -------------- | -------------------------------------------------------------------- | -| 描述 | 启用自动创建模式时,FLOAT 数据类型的压缩方式(V2.0.6 版本开始支持) | -| 类型 | String | -| 默认值 | LZ4 | -| 改后生效方式 | 热加载 | - -* double_compressor - -| 名字 | double_compressor | -| -------------- | --------------------------------------------------------------------- | -| 描述 | 启用自动创建模式时,DOUBLE 数据类型的压缩方式(V2.0.6 版本开始支持) | -| 类型 | String | -| 默认值 | LZ4 | -| 改后生效方式 | 热加载 | - -* text_compressor - -| 名字 | text_compressor | -| -------------- | -------------------------------------------------------------------------------- | -| 描述 | 启用自动创建模式时,TEXT/BINARY/BLOB 数据类型的压缩方式(V2.0.6 版本开始支持 ) | -| 类型 | String | -| 默认值 | LZ4 | -| 改后生效方式 | 热加载 | - - - -### 3.17 查询配置 - -- read_consistency_level - -| 名字 | read_consistency_level | -| ------------ | ------------------------------------------------------------ | -| 描述 | 查询一致性等级,取值 “strong” 时从 Leader 副本查询,取值 “weak” 时随机查询一个副本。 | -| 类型 | String | -| 默认值 | strong | -| 改后生效方式 | 重启服务生效 | - -- meta_data_cache_enable - -| 名字 | 
meta_data_cache_enable | -| ------------ | ------------------------------------------------------------ | -| 描述 | 是否缓存元数据(包括 BloomFilter、Chunk Metadata 和 TimeSeries Metadata。) | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 重启服务生效 | - -- chunk_timeseriesmeta_free_memory_proportion - -| 名字 | chunk_timeseriesmeta_free_memory_proportion | -| ------------ | ------------------------------------------------------------ | -| 描述 | 读取内存分配比例,BloomFilterCache、ChunkCache、TimeseriesMetadataCache、数据集查询的内存和可用内存的查询。参数形式为a : b : c : d : e,其中a、b、c、d、e为整数。 例如“1 : 1 : 1 : 1 : 1” ,“1 : 100 : 200 : 300 : 400” 。 | -| 类型 | String | -| 默认值 | 1 : 100 : 200 : 300 : 400 | -| 改后生效方式 | 重启服务生效 | - -- enable_last_cache - -| 名字 | enable_last_cache | -| ------------ | ------------------ | -| 描述 | 是否开启最新点缓存 | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 重启服务生效 | - -- mpp_data_exchange_core_pool_size - -| 名字 | mpp_data_exchange_core_pool_size | -| ------------ | -------------------------------- | -| 描述 | MPP 数据交换线程池核心线程数 | -| 类型 | int32 | -| 默认值 | 10 | -| 改后生效方式 | 重启服务生效 | - -- mpp_data_exchange_max_pool_size - -| 名字 | mpp_data_exchange_max_pool_size | -| ------------ | ------------------------------- | -| 描述 | MPP 数据交换线程池最大线程数 | -| 类型 | int32 | -| 默认值 | 10 | -| 改后生效方式 | 重启服务生效 | - -- mpp_data_exchange_keep_alive_time_in_ms - -| 名字 | mpp_data_exchange_keep_alive_time_in_ms | -| ------------ | --------------------------------------- | -| 描述 | MPP 数据交换最大等待时间 | -| 类型 | int32 | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- driver_task_execution_time_slice_in_ms - -| 名字 | driver_task_execution_time_slice_in_ms | -| ------------ | -------------------------------------- | -| 描述 | 单个 DriverTask 最长执行时间(ms) | -| 类型 | int32 | -| 默认值 | 200 | -| 改后生效方式 | 重启服务生效 | - -- max_tsblock_size_in_bytes - -| 名字 | max_tsblock_size_in_bytes | -| ------------ | ------------------------------- | -| 描述 | 单个 TsBlock 的最大容量(byte) | -| 类型 | int32 | -| 默认值 | 131072 | -| 改后生效方式 | 重启服务生效 | - -- max_tsblock_line_numbers - -| 名字 | 
max_tsblock_line_numbers | -| ------------ | ------------------------ | -| 描述 | 单个 TsBlock 的最大行数 | -| 类型 | int32 | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- slow_query_threshold - -| 名字 | slow_query_threshold | -| ------------ |----------------------| -| 描述 | 慢查询的时间阈值。单位:毫秒。 | -| 类型 | long | -| 默认值 | 3000 | -| 改后生效方式 | 热加载 | - -- query_cost_stat_window - -| 名字 | query_cost_stat_window | -| ------------ |--------------------| -| 描述 | 查询耗时统计的窗口,单位为分钟。 | -| 类型 | Int32 | -| 默认值 | 0 | -| 改后生效方式 | 热加载 | - -- query_timeout_threshold - -| 名字 | query_timeout_threshold | -| ------------ | -------------------------------- | -| 描述 | 查询的最大执行时间。单位:毫秒。 | -| 类型 | Int32 | -| 默认值 | 60000 | -| 改后生效方式 | 重启服务生效 | - -- max_allowed_concurrent_queries - -| 名字 | max_allowed_concurrent_queries | -| ------------ | ------------------------------ | -| 描述 | 允许的最大并发查询数量。 | -| 类型 | Int32 | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- query_thread_count - -| 名字 | query_thread_count | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当 IoTDB 对内存中的数据进行查询时,最多启动多少个线程来执行该操作。如果该值小于等于 0,那么采用机器所安装的 CPU 核的数量。 | -| 类型 | Int32 | -| 默认值 | 0 | -| 改后生效方式 | 重启服务生效 | - -- degree_of_query_parallelism - -| 名字 | degree_of_query_parallelism | -| ------------ | ------------------------------------------------------------ | -| 描述 | 设置单个查询片段实例将创建的 pipeline 驱动程序数量,也就是查询操作的并行度。 | -| 类型 | Int32 | -| 默认值 | 0 | -| 改后生效方式 | 重启服务生效 | - -- mode_map_size_threshold - -| 名字 | mode_map_size_threshold | -| ------------ | ---------------------------------------------- | -| 描述 | 计算 MODE 聚合函数时,计数映射可以增长到的阈值 | -| 类型 | Int32 | -| 默认值 | 10000 | -| 改后生效方式 | 重启服务生效 | - -- batch_size - -| 名字 | batch_size | -| ------------ | ---------------------------------------------------------- | -| 描述 | 服务器中每次迭代的数据量(数据条目,即不同时间戳的数量。) | -| 类型 | Int32 | -| 默认值 | 100000 | -| 改后生效方式 | 重启服务生效 | - -- sort_buffer_size_in_bytes - -| 名字 | sort_buffer_size_in_bytes | -| ------------ 
|------------------------------------------------------------| -| Description | Size of the memory buffer used by external sort operations | -| Type | long | -| Default | 1048576 (before V2.0.6); 0 (V2.0.6 and later). When the value is less than or equal to 0, the system computes it automatically as `sort_buffer_size_in_bytes = Math.min(32 * 1024 * 1024, heap memory * query engine memory proportion * query execution memory proportion / number of query threads / 2)` | -| Effective | Hot reload | - -- merge_threshold_of_explain_analyze - -| Name | merge_threshold_of_explain_analyze | -| ------------ | ------------------------------------------------------------ | -| Description | Merge threshold for the number of operators in the result set of an `EXPLAIN ANALYZE` statement. | -| Type | int | -| Default | 10 | -| Effective | Hot reload | - -### 3.18 TTL Configuration - -- ttl_check_interval - -| Name | ttl_check_interval | -| ------------ | -------------------------------------- | -| Description | Interval between TTL check tasks, in ms; the default is 2 hours | -| Type | int | -| Default | 7200000 | -| Effective | After restarting the service | - -- max_expired_time - -| Name | max_expired_time | -| ------------ | ------------------------------------------------------------ | -| Description | If a file contains a device that has been expired for longer than this time, the file is compacted immediately. Unit: ms; the default is one month | -| Type | int | -| Default | 2592000000 | -| Effective | After restarting the service | - -- expired_data_ratio - -| Name | expired_data_ratio | -| ------------ | ------------------------------------------------------------ | -| Description | Ratio of expired devices. If the ratio of expired devices in a file exceeds this value, the expired data in that file is cleaned up through compaction. | -| Type | float | -| Default | 0.3 | -| Effective | After restarting the service | - -### 3.19 Storage Engine Configuration - -- timestamp_precision - -| Name | timestamp_precision | -| ------------ | ---------------------------- | -| Description | Timestamp precision; ms, us, and ns are supported | -| Type | String | -| Default | ms | -| Effective | Can only be modified before the first service start | - -- timestamp_precision_check_enabled - -| Name | timestamp_precision_check_enabled | -| ------------ | --------------------------------- | -| Description | Controls whether the timestamp precision check is enabled | -| Type | Boolean | -| Default | true | -| Effective | Can only be modified before the first service start | - -- max_waiting_time_when_insert_blocked - -| Name | max_waiting_time_when_insert_blocked | -| ------------ | ----------------------------------------------- | -| Description | If an insert request waits longer than this time, an exception is thrown. Unit: ms | -| Type | Int32 | -| Default | 10000 | -| Effective | After restarting the service | - -- handle_system_error - -| Name | handle_system_error | -| ------------ | ------------------------------------ | -| Description | How the system handles an unrecoverable error | -| Type 
| String | -| 默认值 | CHANGE_TO_READ_ONLY | -| 改后生效方式 | 重启服务生效 | - -- enable_timed_flush_seq_memtable - -| 名字 | enable_timed_flush_seq_memtable | -| ------------ | ------------------------------- | -| 描述 | 是否开启定时刷盘顺序 memtable | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 热加载 | - -- seq_memtable_flush_interval_in_ms - -| 名字 | seq_memtable_flush_interval_in_ms | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当 memTable 的创建时间小于当前时间减去该值时,该 memtable 需要被刷盘 | -| 类型 | long | -| 默认值 | 600000 | -| 改后生效方式 | 热加载 | - -- seq_memtable_flush_check_interval_in_ms - -| 名字 | seq_memtable_flush_check_interval_in_ms | -| ------------ | ---------------------------------------- | -| 描述 | 检查顺序 memtable 是否需要刷盘的时间间隔 | -| 类型 | long | -| 默认值 | 30000 | -| 改后生效方式 | 热加载 | - -- enable_timed_flush_unseq_memtable - -| 名字 | enable_timed_flush_unseq_memtable | -| ------------ | --------------------------------- | -| 描述 | 是否开启定时刷新乱序 memtable | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 热加载 | - -- unseq_memtable_flush_interval_in_ms - -| 名字 | unseq_memtable_flush_interval_in_ms | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当 memTable 的创建时间小于当前时间减去该值时,该 memtable 需要被刷盘 | -| 类型 | long | -| 默认值 | 600000 | -| 改后生效方式 | 热加载 | - -- unseq_memtable_flush_check_interval_in_ms - -| 名字 | unseq_memtable_flush_check_interval_in_ms | -| ------------ | ----------------------------------------- | -| 描述 | 检查乱序 memtable 是否需要刷盘的时间间隔 | -| 类型 | long | -| 默认值 | 30000 | -| 改后生效方式 | 热加载 | - -- tvlist_sort_algorithm - -| 名字 | tvlist_sort_algorithm | -| ------------ | ------------------------ | -| 描述 | memtable中数据的排序方法 | -| 类型 | String | -| 默认值 | TIM | -| 改后生效方式 | 重启服务生效 | - -- avg_series_point_number_threshold - -| 名字 | avg_series_point_number_threshold | -| ------------ | ------------------------------------------------ | -| 描述 | 内存中平均每个时间序列点数最大值,达到触发 flush | -| 类型 | int32 | -| 默认值 | 100000 | -| 改后生效方式 | 重启服务生效 | - -- flush_thread_count - 
-| 名字 | flush_thread_count | -| ------------ | ------------------------------------------------------------ | -| 描述 | 当 IoTDB 将内存中的数据写入磁盘时,最多启动多少个线程来执行该操作。如果该值小于等于 0,那么采用机器所安装的 CPU 核的数量。默认值为 0。 | -| 类型 | int32 | -| 默认值 | 0 | -| 改后生效方式 | 重启服务生效 | - -- enable_partial_insert - -| 名字 | enable_partial_insert | -| ------------ | ------------------------------------------------------------ | -| 描述 | 在一次 insert 请求中,如果部分测点写入失败,是否继续写入其他测点。 | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 重启服务生效 | - -- recovery_log_interval_in_ms - -| 名字 | recovery_log_interval_in_ms | -| ------------ | ----------------------------------------- | -| 描述 | data region的恢复过程中打印日志信息的间隔 | -| 类型 | Int32 | -| 默认值 | 5000 | -| 改后生效方式 | 重启服务生效 | - -- 0.13_data_insert_adapt - -| 名字 | 0.13_data_insert_adapt | -| ------------ | ------------------------------------------------------- | -| 描述 | 如果 0.13 版本客户端进行写入,需要将此配置项设置为 true | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- enable_tsfile_validation - -| 名字 | enable_tsfile_validation | -| ------------ | -------------------------------------- | -| 描述 | Flush, Load 或合并后验证 tsfile 正确性 | -| 类型 | boolean | -| 默认值 | false | -| 改后生效方式 | 热加载 | - -- tier_ttl_in_ms - -| 名字 | tier_ttl_in_ms | -| ------------ | ----------------------------------------- | -| 描述 | 定义每个层级负责的数据范围,通过 TTL 表示 | -| 类型 | long | -| 默认值 | -1 | -| 改后生效方式 | 重启服务生效 | - -* max_object_file_size_in_byte - -| 名字 | max\_object\_file\_size\_in\_byte | -| -------------- |------------------------------| -| 描述 | 单对象文件的最大尺寸限制 (V2.0.8 版本起支持) | -| 类型 | long | -| 默认值 | 4294967296 | -| 改后生效方式 | 热加载 | - -* restrict_object_limit - -| 名字 | restrict\_object\_limit | 
-|----------|------------------------------------------------------------| -| Description | By default, table names, column names, and device names for the OBJECT type have no special restrictions (supported since V2.0.8). When set to true and the table contains an OBJECT column, the following restrictions apply:
1. Naming rules: TAG column values, table names, and field names must not be "." or "..", and must not contain "./" or ".\"; otherwise metadata creation fails. If a name contains characters that the file system does not support, an error is reported when data is written.
2. Case sensitivity: if the underlying file system is case-insensitive, device identifiers such as 'd1' and 'D1' are treated as the same. In that case, creating devices whose names differ only by case may cause their OBJECT data files to overwrite each other, resulting in incorrect data.
3. Storage path: OBJECT data is stored at a path of the form `${dataregionid}/${tablename}/${tag1}/${tag2}/.../${field}/${timestamp}.bin`. | -| Type | boolean | -| Default | false | -| Effective | Can only be modified before the first service start | - - -### 3.20 Compaction Configuration - -- enable_seq_space_compaction - -| Name | enable_seq_space_compaction | -| ------------ | -------------------------------------- | -| Description | Inner-space compaction of the sequence space: enables compaction among sequence files | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- enable_unseq_space_compaction - -| Name | enable_unseq_space_compaction | -| ------------ | -------------------------------------- | -| Description | Inner-space compaction of the unsequence space: enables compaction among unsequence files | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- enable_cross_space_compaction - -| Name | enable_cross_space_compaction | -| ------------ | ------------------------------------------ | -| Description | Cross-space compaction: enables compacting unsequence files into sequence files | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- enable_auto_repair_compaction - -| Name | enable_auto_repair_compaction | -| ------------ | ----------------------------- | -| Description | Whether to automatically repair unsorted files through compaction | -| Type | Boolean | -| Default | true | -| Effective | Hot reload | - -- cross_selector - -| Name | cross_selector | -| ------------ |----------------| -| Description | Selector for cross-space compaction tasks | -| Type | String | -| Default | rewrite | -| Effective | After restarting the service | - -- cross_performer - -| Name | cross_performer | -| ------------ |-----------------------------------| -| Description | Performer for cross-space compaction tasks; options: read_point and fast | -| Type | String | -| Default | fast | -| Effective | Hot reload | - -- inner_seq_selector - -| Name | inner_seq_selector | -| ------------ |------------------------------------------------------------------------| -| Description | Selector for inner-space compaction tasks in the sequence space; options: size_tiered_single_target, size_tiered_multi_target | -| Type | String | -| Default | size_tiered_multi_target | -| Effective | Hot reload | - -- inner_seq_performer - -| Name | inner_seq_performer | -| ------------ |--------------------------------------| -| Description | Performer for inner-space compaction tasks in the sequence space; options: read_chunk and fast | -| Type | String | -| Default | read_chunk | -| Effective | Hot reload | - -- inner_unseq_selector - -| Name | inner_unseq_selector | -| 
------------ |-------------------------------------------------------------------------| -| 描述 | 乱序空间内合并任务的选择器,可选 size_tiered_single_\target,size_tiered_multi_target | -| 类型 | String | -| 默认值 | size_tiered_multi_target | -| 改后生效方式 | 热加载 | - -- inner_unseq_performer - -| 名字 | inner_unseq_performer | -| ------------ |--------------------------------------| -| 描述 | 乱序空间内合并任务的执行器,可选项是 read_point 和 fast | -| 类型 | String | -| 默认值 | fast | -| 改后生效方式 | 热加载 | - -- compaction_priority - -| 名字 | compaction_priority | -| ------------ |-------------------------------------------------------------------------------------------| -| 描述 | 合并时的优先级。INNER_CROSS:优先执行空间内合并,优先减少文件数量;CROSS_INNER:优先执行跨空间合并,优先清理乱序文件;BALANCE:交替执行两种合并类型。 | -| 类型 | String | -| 默认值 | INNER_CROSS | -| 改后生效方式 | 重启服务生效 | - -- candidate_compaction_task_queue_size - -| 名字 | candidate_compaction_task_queue_size | -| ------------ | ------------------------------------ | -| 描述 | 待选合并任务队列容量 | -| 类型 | int32 | -| 默认值 | 50 | -| 改后生效方式 | 重启服务生效 | - -- target_compaction_file_size - -| 名字 | target_compaction_file_size | -| ------------ |-----------------------------------------------------------------------------------------------------------------------------------------------| -| 描述 | 该参数作用于两个场景:1. 空间内合并的目标文件大小 2. 
跨空间合并中待选序列文件的大小需小于 target_compaction_file_size * 1.5 多数情况下,跨空间合并的目标文件大小不会超过此阈值,即便超出,幅度也不会过大 。 默认值:2GB ,单位:byte | -| 类型 | Long | -| 默认值 | 2147483648 | -| 改后生效方式 | 热加载 | - -- inner_compaction_total_file_size_threshold - -| 名字 | inner_compaction_total_file_size_threshold | -| ------------ |--------------------------------------------| -| 描述 | 空间内合并的文件总大小阈值,单位:byte | -| 类型 | Long | -| 默认值 | 10737418240 | -| 改后生效方式 | 热加载 | - -- inner_compaction_total_file_num_threshold - -| 名字 | inner_compaction_total_file_num_threshold | -| ------------ | ----------------------------------------- | -| 描述 | 空间内合并的文件总数阈值 | -| 类型 | int32 | -| 默认值 | 100 | -| 改后生效方式 | 热加载 | - -- max_level_gap_in_inner_compaction - -| 名字 | max_level_gap_in_inner_compaction | -| ------------ | -------------------------------------- | -| 描述 | 空间内合并筛选的最大层级差 | -| 类型 | int32 | -| 默认值 | 2 | -| 改后生效方式 | 热加载 | - -- target_chunk_size - -| 名字 | target_chunk_size | -| ------------ |--------------------------------------------------| -| 描述 | 刷盘与合并操作的目标数据块大小, 若内存表中某条时序数据的大小超过该值,数据会被刷盘至多个数据块 | -| 类型 | Long | -| 默认值 | 1600000 | -| 改后生效方式 | 重启服务生效 | - -- target_chunk_point_num - -| 名字 | target_chunk_point_num | -| ------------ |------------------------------------------------------| -| 描述 | 刷盘与合并操作中单个数据块的目标点数, 若内存表中某条时序数据的点数超过该值,数据会被刷盘至多个数据块中 | -| 类型 | Long | -| 默认值 | 100000 | -| 改后生效方式 | 重启服务生效 | - -- chunk_size_lower_bound_in_compaction - -| 名字 | chunk_size_lower_bound_in_compaction | -| ------------ |--------------------------------------| -| 描述 | 若数据块大小低于此阈值,则会被反序列化为数据点,默认值为128字节 | -| 类型 | Long | -| 默认值 | 128 | -| 改后生效方式 | 重启服务生效 | - -- chunk_point_num_lower_bound_in_compaction - -| 名字 | chunk_point_num_lower_bound_in_compaction | -| ------------ |------------------------------------------| -| 描述 | 若数据块内的数据点数低于此阈值,则会被反序列化为数据点 | -| 类型 | Long | -| 默认值 | 100 | -| 改后生效方式 | 重启服务生效 | - -- inner_compaction_candidate_file_num - -| 名字 | inner_compaction_candidate_file_num | -| ------------ | 
---------------------------------------- | -| Description | Number of files required when selecting candidate files for inner-space compaction | -| Type | int32 | -| Default | 30 | -| Effective | Hot reload | - -- max_cross_compaction_candidate_file_num - -| Name | max_cross_compaction_candidate_file_num | -| ------------ | --------------------------------------- | -| Description | Upper limit on the number of files when selecting candidate files for cross-space compaction | -| Type | int32 | -| Default | 500 | -| Effective | Hot reload | - -- max_cross_compaction_candidate_file_size - -| Name | max_cross_compaction_candidate_file_size | -| ------------ |------------------------------------------| -| Description | Upper limit on the total size of candidate files selected for cross-space compaction | -| Type | Long | -| Default | 5368709120 | -| Effective | Hot reload | - -- min_cross_compaction_unseq_file_level - -| Name | min_cross_compaction_unseq_file_level | -| ------------ |---------------------------------------| -| Description | Minimum inner-space compaction level of an unsequence file that can be selected as a candidate file | -| Type | int32 | -| Default | 1 | -| Effective | Hot reload | - -- compaction_thread_count - -| Name | compaction_thread_count | -| ------------ | ----------------------- | -| Description | Number of threads that execute compaction tasks | -| Type | int32 | -| Default | 10 | -| Effective | Hot reload | - -- compaction_max_aligned_series_num_in_one_batch - -| Name | compaction_max_aligned_series_num_in_one_batch | -| ------------ | ---------------------------------------------- | -| Description | Number of value columns processed in one batch when compacting aligned series | -| Type | int32 | -| Default | 10 | -| Effective | Hot reload | - -- compaction_schedule_interval_in_ms - -| Name | compaction_schedule_interval_in_ms | -| ------------ |------------------------------------| -| Description | Interval between compaction scheduling runs, in ms | -| Type | Long | -| Default | 60000 | -| Effective | After restarting the service | - -- compaction_write_throughput_mb_per_sec - -| Name | compaction_write_throughput_mb_per_sec | -| ------------ |----------------------------------------| -| Description | Upper limit on the write throughput per second of compaction; a value less than or equal to 0 means no limit | -| Type | int32 | -| Default | 16 | -| Effective | After restarting the service | - -- compaction_read_throughput_mb_per_sec - -| Name | compaction_read_throughput_mb_per_sec | -| --------- | ---------------------------------------------------- | -| Description | Limit on the read throughput per second of compaction, in megabytes; a value less than or equal to 0 means no limit | -| Type | int32 | -| Default | 0 | -| Effective | Hot reload | 
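
The rule shared by `compaction_write_throughput_mb_per_sec` and `compaction_read_throughput_mb_per_sec`, that a value less than or equal to 0 means no limit, can be sketched as follows. This is only an illustration: the helper name `effective_limit_bytes_per_sec` is hypothetical, and the conversion 1 MB = 1024 * 1024 bytes is an assumption rather than something this document states.

```python
from typing import Optional

# Assumption for this sketch: one megabyte is 1024 * 1024 bytes.
BYTES_PER_MB = 1024 * 1024


def effective_limit_bytes_per_sec(configured_mb_per_sec: int) -> Optional[int]:
    """Interpret a compaction throughput setting (hypothetical helper).

    Returns None when the configured value is <= 0 (no limit),
    otherwise the limit converted to bytes per second.
    """
    if configured_mb_per_sec <= 0:
        return None
    return configured_mb_per_sec * BYTES_PER_MB


# Defaults listed in the tables above: 16 for writes, 0 for reads.
write_limit = effective_limit_bytes_per_sec(16)
read_limit = effective_limit_bytes_per_sec(0)
print(write_limit, read_limit)
```

With the defaults above, compaction writes are throttled while compaction reads are not.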
- -- compaction_read_operation_per_sec - -| Name | compaction_read_operation_per_sec | -| --------- | ------------------------------------------- | -| Description | Limit on the number of read operations per second of compaction; a value less than or equal to 0 means no limit | -| Type | int32 | -| Default | 0 | -| Effective | Hot reload | - -- sub_compaction_thread_count - -| Name | sub_compaction_thread_count | -| ------------ | ------------------------------------------------------------ | -| Description | Number of sub-task threads per compaction task; applies only to cross-space compaction and inner-space compaction of the unsequence space | -| Type | int32 | -| Default | 4 | -| Effective | Hot reload | - -- inner_compaction_task_selection_disk_redundancy - -| Name | inner_compaction_task_selection_disk_redundancy | -| ------------ | ----------------------------------------------- | -| Description | Redundancy value for available disk space; used only for inner-space compaction | -| Type | double | -| Default | 0.05 | -| Effective | Hot reload | - -- inner_compaction_task_selection_mods_file_threshold - -| Name | inner_compaction_task_selection_mods_file_threshold | -| ------------ | --------------------------------------------------- | -| Description | Threshold on the size of mods files; used only for inner-space compaction. | -| Type | long | -| Default | 131072 | -| Effective | Hot reload | - -- compaction_schedule_thread_num - -| Name | compaction_schedule_thread_num | -| ------------ | ------------------------------ | -| Description | Number of threads that select compaction tasks | -| Type | int32 | -| Default | 4 | -| Effective | Hot reload | - -### 3.21 Write-Ahead Log Configuration - -- wal_mode - -| Name | wal_mode | -| ------------ | ------------------------------------------------------------ | -| Description | Write mode of the write-ahead log. In DISABLE mode the write-ahead log is turned off; in SYNC mode a write request returns after the data has been successfully written to disk; in ASYNC mode a write request may return before the data has been successfully written to disk. | -| Type | String | -| Default | ASYNC | -| Effective | After restarting the service | - -- max_wal_nodes_num - -| Name | max_wal_nodes_num | -| ------------ | ----------------------------------------------------- | -| Description | Maximum number of write-ahead log nodes; the default value 0 means the number is controlled by the system. | -| Type | int32 | -| Default | 0 | -| Effective | After restarting the service | - -- wal_async_mode_fsync_delay_in_ms - -| Name | wal_async_mode_fsync_delay_in_ms | -| ------------ | ------------------------------------------- | -| Description | Time the write-ahead log waits before calling fsync in async mode | -| Type | int32 | -| Default | 1000 | -| Effective | Hot reload | - -- wal_sync_mode_fsync_delay_in_ms - -| Name | wal_sync_mode_fsync_delay_in_ms | -| ------------ | ------------------------------------------ | -| Description | Time the write-ahead log waits before calling fsync in sync mode | -| Type | int32 | -| Default | 3 | -| Effective | Hot reload | - -- wal_buffer_size_in_byte - -| Name | wal_buffer_size_in_byte | -| ------------ | ----------------------- | -| Description | Buffer size of the write-ahead log | -| Type | int32 | -| Default | 33554432 | -| Effective | After restarting the service | - -- wal_buffer_queue_capacity - -| Name | wal_buffer_queue_capacity | -| ------------ | ------------------------- | -| Description | Upper limit on the size of the write-ahead log blocking queue | -| Type | int32 | -| Default | 500 | -| Effective | After restarting the service | - -- wal_file_size_threshold_in_byte - -| Name | wal_file_size_threshold_in_byte | -| ------------ | ------------------------------- | -| Description | Size threshold at which a write-ahead log file is sealed | -| Type | int32 | -| Default | 31457280 | -| Effective | Hot reload | - -- wal_min_effective_info_ratio - -| Name | wal_min_effective_info_ratio | -| ------------ | ---------------------------- | -| Description | Minimum ratio of effective information in the write-ahead log | -| Type | double | -| Default | 0.1 | -| Effective | Hot reload | - -- wal_memtable_snapshot_threshold_in_byte - -| Name | wal_memtable_snapshot_threshold_in_byte | -| ------------ | ---------------------------------------- | -| Description | Memtable size threshold that triggers a memtable snapshot in the write-ahead log | -| Type | int64 | -| Default | 8388608 | -| Effective | Hot reload | - -- max_wal_memtable_snapshot_num - -| Name | max_wal_memtable_snapshot_num | -| ------------ | ------------------------------ | -| Description | Upper limit on the number of memtables in the write-ahead log | -| Type 
| int32 | -| Default | 1 | -| Effective | Hot reload | - -- delete_wal_files_period_in_ms - -| Name | delete_wal_files_period_in_ms | -| ------------ | ----------------------------- | -| Description | Interval between checks for deleting write-ahead log files | -| Type | int64 | -| Default | 20000 | -| Effective | Hot reload | - -- wal_throttle_threshold_in_byte - -| Name | wal_throttle_threshold_in_byte | -| ------------ | ------------------------------------------------------------ | -| Description | In IoTConsensus, once the size of the WAL files reaches this threshold, write operations are throttled to control the write speed. | -| Type | long | -| Default | 53687091200 | -| Effective | Hot reload | - -- iot_consensus_cache_window_time_in_ms - -| Name | iot_consensus_cache_window_time_in_ms | -| ------------ | ---------------------------------------- | -| Description | Maximum waiting time of the write cache in IoTConsensus. | -| Type | long | -| Default | -1 | -| Effective | Hot reload | - -- enable_wal_compression - -| Name | enable_wal_compression | -| ------------ | ------------------------------------- | -| Description | Controls whether WAL compression is enabled. | -| Type | boolean | -| Default | true | -| Effective | Hot reload | - -### 3.22 IoT Consensus Protocol Configuration - -The following configuration items take effect only when the Region is configured with the IoTConsensus consensus protocol. - -- data_region_iot_max_log_entries_num_per_batch - -| Name | data_region_iot_max_log_entries_num_per_batch | -| ------------ | --------------------------------------------- | -| Description | Maximum number of log entries in an IoTConsensus batch | -| Type | int32 | -| Default | 1024 | -| Effective | After restarting the service | - -- data_region_iot_max_size_per_batch - -| Name | data_region_iot_max_size_per_batch | -| ------------ | ---------------------------------- | -| Description | Maximum size of an IoTConsensus batch | -| Type | int32 | -| Default | 16777216 | -| Effective | After restarting the service | - -- data_region_iot_max_pending_batches_num - -| Name | data_region_iot_max_pending_batches_num | -| ------------ | --------------------------------------- | -| Description | Pipeline concurrency threshold for IoTConsensus batches | -| Type | int32 | -| Default | 5 | -| Effective | After restarting the service | - -- data_region_iot_max_memory_ratio_for_queue - -| Name | data_region_iot_max_memory_ratio_for_queue | -| ------------ | ------------------------------------------ | -| Description | Proportion of memory allocated to the IoTConsensus queue | -| Type | double | 
-| Default | 0.6 | -| Effective | After restarting the service | - -- region_migration_speed_limit_bytes_per_second - -| Name | region_migration_speed_limit_bytes_per_second | -| ------------ | --------------------------------------------- | -| Description | Maximum data transfer rate during region migration | -| Type | long | -| Default | 33554432 | -| Effective | After restarting the service | - -### 3.23 TsFile Configuration - -- group_size_in_byte - -| Name | group_size_in_byte | -| ------------ | ---------------------------------------------- | -| Description | Maximum number of bytes written each time in-memory data is written to disk | -| Type | int32 | -| Default | 134217728 | -| Effective | Hot reload | - -- page_size_in_byte - -| Name | page_size_in_byte | -| ------------ | ---------------------------------------------------- | -| Description | Maximum size of a single page when each in-memory column is written out, in bytes | -| Type | int32 | -| Default | 65536 | -| Effective | Hot reload | - -- max_number_of_points_in_page - -| Name | max_number_of_points_in_page | -| ------------ | ------------------------------------------------- | -| Description | Maximum number of data points (timestamp-value pairs) in a page | -| Type | int32 | -| Default | 10000 | -| Effective | Hot reload | - -- pattern_matching_threshold - -| Name | pattern_matching_threshold | -| ------------ | ------------------------------ | -| Description | Maximum number of matches during regular expression matching | -| Type | int32 | -| Default | 1000000 | -| Effective | Hot reload | - -- float_precision - -| Name | float_precision | -| ------------ | ------------------------------------------------------------ | -| Description | Floating-point precision, i.e. the number of digits after the decimal point. Note: a 32-bit floating-point number has a decimal precision of 7 digits and a 64-bit floating-point number has a decimal precision of 15 digits; a setting beyond the machine precision has no practical effect. | -| Type | int32 | -| Default | 2 | -| Effective | Hot reload | - -- value_encoder - -| Name | value_encoder | -| ------------ | ------------------------------------- | -| Description | Encoding method for the value column | -| Type | Enum String: "TS_2DIFF", "PLAIN", "RLE" | -| Default | PLAIN | -| Effective | Hot reload | - -- compressor - -| Name | compressor | -| ------------ | ------------------------------------------------------------ | -| Description | Data compression method; also the compression method for the time column of aligned series | -| Type | Enum String: "UNCOMPRESSED", "SNAPPY", "LZ4", "ZSTD", "LZMA2" | -| Default | LZ4 | -| Effective | Hot reload | - -- encrypt_flag - -| Name | encrypt_flag | -| ------------ | 
---------------------------- | -| 描述 | 用于开启或关闭数据加密功能。 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- encrypt_type - -| 名字 | encrypt_type | -| ------------ | ------------------------------------- | -| 描述 | 数据加密的方法。 | -| 类型 | String | -| 默认值 | org.apache.tsfile.encrypt.UNENCRYPTED | -| 改后生效方式 | 重启服务生效 | - -- encrypt_key_path - -| 名字 | encrypt_key_path | -| ------------ | ---------------------------- | -| 描述 | 数据加密使用的密钥来源路径。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -### 3.24 授权配置 - -- authorizer_provider_class - -| 名字 | authorizer_provider_class | -| ------------ | ------------------------------------------------------------ | -| 描述 | 权限服务的类名 | -| 类型 | String | -| 默认值 | org.apache.iotdb.commons.auth.authorizer.LocalFileAuthorizer | -| 改后生效方式 | 重启服务生效 | - -- iotdb_server_encrypt_decrypt_provider - -| 名字 | iotdb_server_encrypt_decrypt_provider | -| ------------ | ------------------------------------------------------------ | -| 描述 | 用于用户密码加密的类 | -| 类型 | String | -| 默认值 | org.apache.iotdb.commons.security.encrypt.MessageDigestEncrypt | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- iotdb_server_encrypt_decrypt_provider_parameter - -| 名字 | iotdb_server_encrypt_decrypt_provider_parameter | -| ------------ | ----------------------------------------------- | -| 描述 | 用于初始化用户密码加密类的参数 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 仅允许在第一次启动服务前修改 | - -- author_cache_size - -| 名字 | author_cache_size | -| ------------ | ------------------------ | -| 描述 | 用户缓存与角色缓存的大小 | -| 类型 | int32 | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- author_cache_expire_time - -| 名字 | author_cache_expire_time | -| ------------ | -------------------------------------- | -| 描述 | 用户缓存与角色缓存的有效期,单位为分钟 | -| 类型 | int32 | -| 默认值 | 30 | -| 改后生效方式 | 重启服务生效 | - -### 3.25 UDF配置 - -- udf_initial_byte_array_length_for_memory_control - -| 名字 | udf_initial_byte_array_length_for_memory_control | -| ------------ | ------------------------------------------------------------ | -| 描述 | 
用于评估UDF查询中文本字段的内存使用情况。建议将此值设置为略大于所有文本的平均长度记录。 | -| 类型 | int32 | -| 默认值 | 48 | -| 改后生效方式 | 重启服务生效 | - -- udf_memory_budget_in_mb - -| 名字 | udf_memory_budget_in_mb | -| ------------ | ------------------------------------------------------------ | -| 描述 | 在一个UDF查询中使用多少内存(以 MB 为单位)。上限为已分配内存的 20% 用于读取。 | -| 类型 | Float | -| 默认值 | 30.0 | -| 改后生效方式 | 重启服务生效 | - -- udf_reader_transformer_collector_memory_proportion - -| 名字 | udf_reader_transformer_collector_memory_proportion | -| ------------ | --------------------------------------------------------- | -| 描述 | UDF内存分配比例。参数形式为a : b : c,其中a、b、c为整数。 | -| 类型 | String | -| 默认值 | 1:1:1 | -| 改后生效方式 | 重启服务生效 | - -- udf_lib_dir - -| 名字 | udf_lib_dir | -| ------------ | ---------------------------- | -| 描述 | UDF 日志及jar文件存储路径 | -| 类型 | String | -| 默认值 | ext/udf(Windows:ext\\udf) | -| 改后生效方式 | 重启服务生效 | - -### 3.26 触发器配置 - -- trigger_lib_dir - -| 名字 | trigger_lib_dir | -| ------------ | ----------------------- | -| 描述 | 触发器 JAR 包存放的目录 | -| 类型 | String | -| 默认值 | ext/trigger | -| 改后生效方式 | 重启服务生效 | - -- stateful_trigger_retry_num_when_not_found - -| 名字 | stateful_trigger_retry_num_when_not_found | -| ------------ | ---------------------------------------------- | -| 描述 | 有状态触发器触发无法找到触发器实例时的重试次数 | -| 类型 | Int32 | -| 默认值 | 3 | -| 改后生效方式 | 重启服务生效 | - -### 3.27 SELECT-INTO配置 - -- into_operation_buffer_size_in_byte - -| 名字 | into_operation_buffer_size_in_byte | -| ------------ | ------------------------------------------------------------ | -| 描述 | 执行 select-into 语句时,待写入数据占用的最大内存(单位:Byte) | -| 类型 | long | -| 默认值 | 104857600 | -| 改后生效方式 | 热加载 | - -- select_into_insert_tablet_plan_row_limit - -| 名字 | select_into_insert_tablet_plan_row_limit | -| ------------ | ------------------------------------------------------------ | -| 描述 | 执行 select-into 语句时,一个 insert-tablet-plan 中可以处理的最大行数 | -| 类型 | int32 | -| 默认值 | 10000 | -| 改后生效方式 | 热加载 | - -- into_operation_execution_thread_count - -| 名字 | into_operation_execution_thread_count | -| ------------ | 
------------------------------------------ | -| Description | Number of threads in the thread pool that executes write tasks for SELECT INTO | -| Type | int32 | -| Default | 2 | -| Effective | After restarting the service | - -### 3.28 Continuous Query Configuration -- continuous_query_submit_thread_count - -| Name | continuous_query_submit_thread_count | -| ------------ | --------------------------------- | -| Description | Number of threads in the thread pool that executes continuous query tasks | -| Type | int32 | -| Default | 2 | -| Effective | After restarting the service | - -- continuous_query_min_every_interval_in_ms - -| Name | continuous_query_min_every_interval_in_ms | -| ------------ | ----------------------------------------- | -| Description | Minimum value of the continuous query execution interval | -| Type | long (duration) | -| Default | 1000 | -| Effective | After restarting the service | - -### 3.29 PIPE Configuration - -- pipe_lib_dir - -| Name | pipe_lib_dir | -| ------------ | -------------------------- | -| Description | Directory where custom Pipe plugins are stored | -| Type | string | -| Default | ext/pipe | -| Effective | Modification not supported yet | - -- pipe_subtask_executor_max_thread_num - -| Name | pipe_subtask_executor_max_thread_num | -| ------------ | ------------------------------------------------------------ | -| Description | Maximum number of threads that the processor and sink pipe subtasks can each use. The actual value is min(pipe_subtask_executor_max_thread_num, max(1, number of CPU cores / 2)). | -| Type | int | -| Default | 5 | -| Effective | After restarting the service | - -- pipe_sink_timeout_ms - -| Name | pipe_sink_timeout_ms | -| ------------ | --------------------------------------------- | -| Description | Connection timeout of the thrift client, in milliseconds. | -| Type | int | -| Default | 900000 | -| Effective | After restarting the service | - -- pipe_sink_selector_number - -| Name | pipe_sink_selector_number | -| ------------ | ------------------------------------------------------------ | -| Description | Maximum number of threads for processing execution results that the iotdb-thrift-async-sink plugin can use. It is recommended to set this value less than or equal to pipe_sink_max_client_number. | -| Type | int | -| Default | 4 | -| Effective | After restarting the service | - -- pipe_sink_max_client_number - -| Name | pipe_sink_max_client_number | -| ------------ | ----------------------------------------------------------- | -| Description | Maximum number of clients that the iotdb-thrift-async-sink plugin can use. | -| Type | int | -| Default | 16 | -| Effective | After restarting the service | - -- pipe_air_gap_receiver_enabled - -| Name | pipe_air_gap_receiver_enabled | -| 
------------ | ------------------------------------------------------------ | -| Description | Whether to enable receiving pipe data through an air gap. In TCP mode the receiver can only return 0 or 1 to indicate whether the data was received successfully. | -| Type | Boolean | -| Default | false | -| Effective | After restarting the service | - -- pipe_air_gap_receiver_port - -| Name | pipe_air_gap_receiver_port | -| ------------ | ------------------------------------ | -| Description | Port on which the server receives pipe data through the air gap. | -| Type | int | -| Default | 9780 | -| Effective | After restarting the service | - -- pipe_all_sinks_rate_limit_bytes_per_second - -| Name | pipe_all_sinks_rate_limit_bytes_per_second | -| ------------ | ------------------------------------------------------------ | -| Description | Total number of bytes that all pipe sinks can transfer per second. A value less than or equal to 0 means no limit; the default of -1 therefore means no limit. | -| Type | double | -| Default | -1 | -| Effective | Hot reload | - -### 3.30 Ratis Consensus Protocol Configuration - -The following configuration items take effect only when the Region is configured with the RatisConsensus consensus protocol. - -- config_node_ratis_log_appender_buffer_size_max - -| Name | config_node_ratis_log_appender_buffer_size_max | -| ------------ | ---------------------------------------------- | -| Description | Maximum number of bytes the confignode transfers in one log synchronization RPC | -| Type | int32 | -| Default | 16777216 | -| Effective | After restarting the service | - -- schema_region_ratis_log_appender_buffer_size_max - -| Name | schema_region_ratis_log_appender_buffer_size_max | -| ------------ | ------------------------------------------------ | -| Description | Maximum number of bytes a schema region transfers in one log synchronization RPC | -| Type | int32 | -| Default | 16777216 | -| Effective | After restarting the service | - -- data_region_ratis_log_appender_buffer_size_max - -| Name | data_region_ratis_log_appender_buffer_size_max | -| ------------ | ---------------------------------------------- | -| Description | Maximum number of bytes a data region transfers in one log synchronization RPC | -| Type | int32 | -| Default | 16777216 | -| Effective | After restarting the service | - -- config_node_ratis_snapshot_trigger_threshold - -| Name | config_node_ratis_snapshot_trigger_threshold | -| ------------ | -------------------------------------------- | -| Description | Number of log entries required for the confignode to trigger a snapshot | -| Type | int32 | -| Default | 400000 | -| Effective | After restarting the service | - -- schema_region_ratis_snapshot_trigger_threshold - -| Name | schema_region_ratis_snapshot_trigger_threshold 
| -| ------------ | ---------------------------------------------- | -| 描述 | schema region 触发snapshot需要的日志条数 | -| 类型 | int32 | -| 默认值 | 400,000 | -| 改后生效方式 | 重启服务生效 | - -- data_region_ratis_snapshot_trigger_threshold - -| 名字 | data_region_ratis_snapshot_trigger_threshold | -| ------------ | -------------------------------------------- | -| 描述 | data region 触发snapshot需要的日志条数 | -| 类型 | int32 | -| 默认值 | 400,000 | -| 改后生效方式 | 重启服务生效 | - -- config_node_ratis_log_unsafe_flush_enable - -| 名字 | config_node_ratis_log_unsafe_flush_enable | -| ------------ | ----------------------------------------- | -| 描述 | confignode 是否允许Raft日志异步刷盘 | -| 类型 | boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- schema_region_ratis_log_unsafe_flush_enable - -| 名字 | schema_region_ratis_log_unsafe_flush_enable | -| ------------ | ------------------------------------------- | -| 描述 | schema region 是否允许Raft日志异步刷盘 | -| 类型 | boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- data_region_ratis_log_unsafe_flush_enable - -| 名字 | data_region_ratis_log_unsafe_flush_enable | -| ------------ | ----------------------------------------- | -| 描述 | data region 是否允许Raft日志异步刷盘 | -| 类型 | boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- config_node_ratis_log_segment_size_max_in_byte - -| 名字 | config_node_ratis_log_segment_size_max_in_byte | -| ------------ | ---------------------------------------------- | -| 描述 | confignode 一个RaftLog日志段文件的大小 | -| 类型 | int32 | -| 默认值 | 25165824 | -| 改后生效方式 | 重启服务生效 | - -- schema_region_ratis_log_segment_size_max_in_byte - -| 名字 | schema_region_ratis_log_segment_size_max_in_byte | -| ------------ | ------------------------------------------------ | -| 描述 | schema region 一个RaftLog日志段文件的大小 | -| 类型 | int32 | -| 默认值 | 25165824 | -| 改后生效方式 | 重启服务生效 | - -- data_region_ratis_log_segment_size_max_in_byte - -| 名字 | data_region_ratis_log_segment_size_max_in_byte | -| ------------ | ---------------------------------------------- | -| 描述 | data region 一个RaftLog日志段文件的大小 | -| 类型 | int32 
|
-| Default | 25165824 |
-| Effective | After restarting the service |
-
-- config_node_simple_consensus_log_segment_size_max_in_byte
-
-| Name | config_node_simple_consensus_log_segment_size_max_in_byte |
-| ------------ | ------------ |
-| Description | Size of one log segment file for the ConfigNode simple consensus protocol |
-| Type | int32 |
-| Default | 25165824 |
-| Effective | After restarting the service |
-
-- config_node_ratis_grpc_flow_control_window
-
-| Name | config_node_ratis_grpc_flow_control_window |
-| ------------ | ------------ |
-| Description | ConfigNode gRPC streaming flow control window size |
-| Type | int32 |
-| Default | 4194304 |
-| Effective | After restarting the service |
-
-- schema_region_ratis_grpc_flow_control_window
-
-| Name | schema_region_ratis_grpc_flow_control_window |
-| ------------ | ------------ |
-| Description | Schema region gRPC streaming flow control window size |
-| Type | int32 |
-| Default | 4194304 |
-| Effective | After restarting the service |
-
-- data_region_ratis_grpc_flow_control_window
-
-| Name | data_region_ratis_grpc_flow_control_window |
-| ------------ | ------------ |
-| Description | Data region gRPC streaming flow control window size |
-| Type | int32 |
-| Default | 4194304 |
-| Effective | After restarting the service |
-
-- config_node_ratis_grpc_leader_outstanding_appends_max
-
-| Name | config_node_ratis_grpc_leader_outstanding_appends_max |
-| ------------ | ------------ |
-| Description | ConfigNode gRPC pipelining concurrency threshold |
-| Type | int32 |
-| Default | 128 |
-| Effective | After restarting the service |
-
-- schema_region_ratis_grpc_leader_outstanding_appends_max
-
-| Name | schema_region_ratis_grpc_leader_outstanding_appends_max |
-| ------------ | ------------ |
-| Description | Schema region gRPC pipelining concurrency threshold |
-| Type | int32 |
-| Default | 128 |
-| Effective | After restarting the service |
-
-- data_region_ratis_grpc_leader_outstanding_appends_max
-
-| Name | data_region_ratis_grpc_leader_outstanding_appends_max |
-| ------------ | ------------ |
-| Description | Data region gRPC pipelining concurrency threshold |
-| Type | int32 |
-| Default | 128 |
-| Effective | After restarting the service |
-
-- 
config_node_ratis_log_force_sync_num - -| 名字 | config_node_ratis_log_force_sync_num | -| ------------ | ------------------------------------ | -| 描述 | config node fsync 阈值 | -| 类型 | int32 | -| 默认值 | 128 | -| 改后生效方式 | 重启服务生效 | - -- schema_region_ratis_log_force_sync_num - -| 名字 | schema_region_ratis_log_force_sync_num | -| ------------ | -------------------------------------- | -| 描述 | schema region fsync 阈值 | -| 类型 | int32 | -| 默认值 | 128 | -| 改后生效方式 | 重启服务生效 | - -- data_region_ratis_log_force_sync_num - -| 名字 | data_region_ratis_log_force_sync_num | -| ------------ | ------------------------------------ | -| 描述 | data region fsync 阈值 | -| 类型 | int32 | -| 默认值 | 128 | -| 改后生效方式 | 重启服务生效 | - -- config_node_ratis_rpc_leader_election_timeout_min_ms - -| 名字 | config_node_ratis_rpc_leader_election_timeout_min_ms | -| ------------ | ---------------------------------------------------- | -| 描述 | confignode leader 选举超时最小值 | -| 类型 | int32 | -| 默认值 | 2000ms | -| 改后生效方式 | 重启服务生效 | - -- schema_region_ratis_rpc_leader_election_timeout_min_ms - -| 名字 | schema_region_ratis_rpc_leader_election_timeout_min_ms | -| ------------ | ------------------------------------------------------ | -| 描述 | schema region leader 选举超时最小值 | -| 类型 | int32 | -| 默认值 | 2000ms | -| 改后生效方式 | 重启服务生效 | - -- data_region_ratis_rpc_leader_election_timeout_min_ms - -| 名字 | data_region_ratis_rpc_leader_election_timeout_min_ms | -| ------------ | ---------------------------------------------------- | -| 描述 | data region leader 选举超时最小值 | -| 类型 | int32 | -| 默认值 | 2000ms | -| 改后生效方式 | 重启服务生效 | - -- config_node_ratis_rpc_leader_election_timeout_max_ms - -| 名字 | config_node_ratis_rpc_leader_election_timeout_max_ms | -| ------------ | ---------------------------------------------------- | -| 描述 | confignode leader 选举超时最大值 | -| 类型 | int32 | -| 默认值 | 4000ms | -| 改后生效方式 | 重启服务生效 | - -- schema_region_ratis_rpc_leader_election_timeout_max_ms - -| 名字 | schema_region_ratis_rpc_leader_election_timeout_max_ms | -| ------------ | 
------------------------------------------------------ | -| 描述 | schema region leader 选举超时最大值 | -| 类型 | int32 | -| 默认值 | 4000ms | -| 改后生效方式 | 重启服务生效 | - -- data_region_ratis_rpc_leader_election_timeout_max_ms - -| 名字 | data_region_ratis_rpc_leader_election_timeout_max_ms | -| ------------ | ---------------------------------------------------- | -| 描述 | data region leader 选举超时最大值 | -| 类型 | int32 | -| 默认值 | 4000ms | -| 改后生效方式 | 重启服务生效 | - -- config_node_ratis_request_timeout_ms - -| 名字 | config_node_ratis_request_timeout_ms | -| ------------ | ------------------------------------ | -| 描述 | confignode Raft 客户端重试超时 | -| 类型 | int32 | -| 默认值 | 10000 | -| 改后生效方式 | 重启服务生效 | - -- schema_region_ratis_request_timeout_ms - -| 名字 | schema_region_ratis_request_timeout_ms | -| ------------ | -------------------------------------- | -| 描述 | schema region Raft 客户端重试超时 | -| 类型 | int32 | -| 默认值 | 10000 | -| 改后生效方式 | 重启服务生效 | - -- data_region_ratis_request_timeout_ms - -| 名字 | data_region_ratis_request_timeout_ms | -| ------------ | ------------------------------------ | -| 描述 | data region Raft 客户端重试超时 | -| 类型 | int32 | -| 默认值 | 10000 | -| 改后生效方式 | 重启服务生效 | - -- config_node_ratis_max_retry_attempts - -| 名字 | config_node_ratis_max_retry_attempts | -| ------------ | ------------------------------------ | -| 描述 | confignode Raft客户端最大重试次数 | -| 类型 | int32 | -| 默认值 | 10 | -| 改后生效方式 | 重启服务生效 | - -- config_node_ratis_initial_sleep_time_ms - -| 名字 | config_node_ratis_initial_sleep_time_ms | -| ------------ | --------------------------------------- | -| 描述 | confignode Raft客户端初始重试睡眠时长 | -| 类型 | int32 | -| 默认值 | 100ms | -| 改后生效方式 | 重启服务生效 | - -- config_node_ratis_max_sleep_time_ms - -| 名字 | config_node_ratis_max_sleep_time_ms | -| ------------ | ------------------------------------- | -| 描述 | confignode Raft客户端最大重试睡眠时长 | -| 类型 | int32 | -| 默认值 | 10000 | -| 改后生效方式 | 重启服务生效 | - -- schema_region_ratis_max_retry_attempts - -| 名字 | schema_region_ratis_max_retry_attempts | -| ------------ | 
-------------------------------------- | -| 描述 | schema region Raft客户端最大重试次数 | -| 类型 | int32 | -| 默认值 | 10 | -| 改后生效方式 | 重启服务生效 | - -- schema_region_ratis_initial_sleep_time_ms - -| 名字 | schema_region_ratis_initial_sleep_time_ms | -| ------------ | ----------------------------------------- | -| 描述 | schema region Raft客户端初始重试睡眠时长 | -| 类型 | int32 | -| 默认值 | 100ms | -| 改后生效方式 | 重启服务生效 | - -- schema_region_ratis_max_sleep_time_ms - -| 名字 | schema_region_ratis_max_sleep_time_ms | -| ------------ | ---------------------------------------- | -| 描述 | schema region Raft客户端最大重试睡眠时长 | -| 类型 | int32 | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- data_region_ratis_max_retry_attempts - -| 名字 | data_region_ratis_max_retry_attempts | -| ------------ | ------------------------------------ | -| 描述 | data region Raft客户端最大重试次数 | -| 类型 | int32 | -| 默认值 | 10 | -| 改后生效方式 | 重启服务生效 | - -- data_region_ratis_initial_sleep_time_ms - -| 名字 | data_region_ratis_initial_sleep_time_ms | -| ------------ | --------------------------------------- | -| 描述 | data region Raft客户端初始重试睡眠时长 | -| 类型 | int32 | -| 默认值 | 100ms | -| 改后生效方式 | 重启服务生效 | - -- data_region_ratis_max_sleep_time_ms - -| 名字 | data_region_ratis_max_sleep_time_ms | -| ------------ | -------------------------------------- | -| 描述 | data region Raft客户端最大重试睡眠时长 | -| 类型 | int32 | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -- ratis_first_election_timeout_min_ms - -| 名字 | ratis_first_election_timeout_min_ms | -| ------------ | ----------------------------------- | -| 描述 | Ratis协议首次选举最小超时时间 | -| 类型 | int64 | -| 默认值 | 50 (ms) | -| 改后生效方式 | 重启服务生效 | - -- ratis_first_election_timeout_max_ms - -| 名字 | ratis_first_election_timeout_max_ms | -| ------------ | ----------------------------------- | -| 描述 | Ratis协议首次选举最大超时时间 | -| 类型 | int64 | -| 默认值 | 150 (ms) | -| 改后生效方式 | 重启服务生效 | - -- config_node_ratis_preserve_logs_num_when_purge - -| 名字 | config_node_ratis_preserve_logs_num_when_purge | -| ------------ | ---------------------------------------------- | -| 描述 
| Number of logs the ConfigNode keeps undeleted after a snapshot |
-| Type | int32 |
-| Default | 1000 |
-| Effective | After restarting the service |
-
-- schema_region_ratis_preserve_logs_num_when_purge
-
-| Name | schema_region_ratis_preserve_logs_num_when_purge |
-| ------------ | ------------ |
-| Description | Number of logs the schema region keeps undeleted after a snapshot |
-| Type | int32 |
-| Default | 1000 |
-| Effective | After restarting the service |
-
-- data_region_ratis_preserve_logs_num_when_purge
-
-| Name | data_region_ratis_preserve_logs_num_when_purge |
-| ------------ | ------------ |
-| Description | Number of logs the data region keeps undeleted after a snapshot |
-| Type | int32 |
-| Default | 1000 |
-| Effective | After restarting the service |
-
-- config_node_ratis_log_max_size
-
-| Name | config_node_ratis_log_max_size |
-| ------------ | ------------ |
-| Description | Maximum disk space occupied by the ConfigNode Raft log |
-| Type | int64 |
-| Default | 2147483648 (2GB) |
-| Effective | After restarting the service |
-
-- schema_region_ratis_log_max_size
-
-| Name | schema_region_ratis_log_max_size |
-| ------------ | ------------ |
-| Description | Maximum disk space occupied by the schema region Raft log |
-| Type | int64 |
-| Default | 2147483648 (2GB) |
-| Effective | After restarting the service |
-
-- data_region_ratis_log_max_size
-
-| Name | data_region_ratis_log_max_size |
-| ------------ | ------------ |
-| Description | Maximum disk space occupied by the data region Raft log |
-| Type | int64 |
-| Default | 21474836480 (20GB) |
-| Effective | After restarting the service |
-
-- config_node_ratis_periodic_snapshot_interval
-
-| Name | config_node_ratis_periodic_snapshot_interval |
-| ------------ | ------------ |
-| Description | Interval between periodic snapshots of the ConfigNode |
-| Type | int64 |
-| Default | 86400 (seconds) |
-| Effective | After restarting the service |
-
-- schema_region_ratis_periodic_snapshot_interval
-
-| Name | schema_region_ratis_periodic_snapshot_interval |
-| ------------ | ------------ |
-| Description | Interval between periodic snapshots of the schema region |
-| Type | int64 |
-| Default | 86400 (seconds) |
-| Effective | After restarting the service |
-
-- data_region_ratis_periodic_snapshot_interval
-
-| Name | 
data_region_ratis_periodic_snapshot_interval |
-| ------------ | ------------ |
-| Description | Interval between periodic snapshots of the data region |
-| Type | int64 |
-| Default | 86400 (seconds) |
-| Effective | After restarting the service |
-
-### 3.31 IoTConsensusV2 Configuration
-
-- iot_consensus_v2_pipeline_size
-
-| Name | iot_consensus_v2_pipeline_size |
-| ------------ | ------------ |
-| Description | Default event buffer size of the connector and the receiver in IoTConsensus V2. |
-| Type | int |
-| Default | 5 |
-| Effective | After restarting the service |
-
-- iot_consensus_v2_mode
-
-| Name | iot_consensus_v2_mode |
-| ------------ | ------------ |
-| Description | Consensus protocol mode used by IoTConsensus V2. |
-| Type | String |
-| Default | batch |
-| Effective | After restarting the service |
-
-### 3.32 Procedure Configuration
-
-- procedure_core_worker_thread_count
-
-| Name | procedure_core_worker_thread_count |
-| ------------ | ------------ |
-| Description | Number of worker threads |
-| Type | int32 |
-| Default | 4 |
-| Effective | After restarting the service |
-
-- procedure_completed_clean_interval
-
-| Name | procedure_completed_clean_interval |
-| ------------ | ------------ |
-| Description | Interval for cleaning up completed procedures |
-| Type | int32 |
-| Default | 30(s) |
-| Effective | After restarting the service |
-
-- procedure_completed_evict_ttl
-
-| Name | procedure_completed_evict_ttl |
-| ------------ | ------------ |
-| Description | Retention time for the data of completed procedures |
-| Type | int32 |
-| Default | 60(s) |
-| Effective | After restarting the service |
-
-### 3.33 MQTT Broker Configuration
-
-- enable_mqtt_service
-
-| Name | enable_mqtt_service |
-| ------------ | ------------ |
-| Description | Whether to enable the MQTT service |
-| Type | Boolean |
-| Default | false |
-| Effective | Hot reload |
-
-- mqtt_host
-
-| Name | mqtt_host |
-| ------------ | ------------ |
-| Description | Host that the MQTT service binds to. |
-| Type | String |
-| Default | 127.0.0.1 |
-| Effective | Hot reload |
-
-- mqtt_port
-
-| Name | mqtt_port |
-| ------------ | ------------ |
-| Description | Port that the MQTT service binds to. |
-| Type | int32 |
-| Default | 1883 |
-| Effective | Hot reload |
-
-- mqtt_handler_pool_size
-
-| Name | mqtt_handler_pool_size |
-| ------------ | 
---------------------------------- | -| 描述 | 用于处理MQTT消息的处理程序池大小。 | -| 类型 | int32 | -| 默认值 | 1 | -| 改后生效方式 | 热加载 | - -- mqtt_payload_formatter - -| 名字 | mqtt_payload_formatter | -| ------------ | ---------------------------- | -| 描述 | MQTT消息有效负载格式化程序。 | -| 类型 | String | -| 默认值 | json | -| 改后生效方式 | 热加载 | - -- mqtt_max_message_size - -| 名字 | mqtt_max_message_size | -| ------------ | ------------------------------------ | -| 描述 | MQTT消息的最大长度(以字节为单位)。 | -| 类型 | int32 | -| 默认值 | 1048576 | -| 改后生效方式 | 热加载 | - -### 3.34 审计日志配置 - -- enable_audit_log - -| 名字 | enable_audit_log | -| ------------ | ------------------------------ | -| 描述 | 用于控制是否启用审计日志功能。 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 重启服务生效 | - -- audit_log_storage - -| 名字 | audit_log_storage | -| ------------ | -------------------------- | -| 描述 | 定义了审计日志的输出位置。 | -| 类型 | String | -| 默认值 | IOTDB,LOGGER | -| 改后生效方式 | 重启服务生效 | - -- audit_log_operation - -| 名字 | audit_log_operation | -| ------------ | -------------------------------------- | -| 描述 | 定义了哪些类型的操作需要记录审计日志。 | -| 类型 | String | -| 默认值 | DML,DDL,QUERY | -| 改后生效方式 | 重启服务生效 | - -- enable_audit_log_for_native_insert_api - -| 名字 | enable_audit_log_for_native_insert_api | -| ------------ | -------------------------------------- | -| 描述 | 用于控制本地写入API是否记录审计日志。 | -| 类型 | Boolean | -| 默认值 | true | -| 改后生效方式 | 重启服务生效 | - -### 3.35 白名单配置 -- enable_white_list - -| 名字 | enable_white_list | -| ------------ | ----------------- | -| 描述 | 是否启用白名单。 | -| 类型 | Boolean | -| 默认值 | false | -| 改后生效方式 | 热加载 | - -### 3.36 IoTDB-AI 配置 - -- model_inference_execution_thread_count - -| 名字 | model_inference_execution_thread_count | -| ------------ | -------------------------------------- | -| 描述 | 用于模型推理操作的线程数。 | -| 类型 | int | -| 默认值 | 5 | -| 改后生效方式 | 重启服务生效 | - -### 3.37 TsFile 主动监听&加载功能配置 - -- load_clean_up_task_execution_delay_time_seconds - -| 名字 | load_clean_up_task_execution_delay_time_seconds | -| ------------ | ------------------------------------------------------------ | 
-| Description | How long, in seconds, the system waits after a TsFile load fails before running the cleanup task that removes the TsFiles that were not loaded successfully. |
-| Type | int |
-| Default | 1800 |
-| Effective | Hot reload |
-
-- load_write_throughput_bytes_per_second
-
-| Name | load_write_throughput_bytes_per_second |
-| ------------ | ------------ |
-| Description | Maximum number of bytes written to disk per second when loading TsFiles. |
-| Type | int |
-| Default | -1 |
-| Effective | Hot reload |
-
-- load_active_listening_enable
-
-| Name | load_active_listening_enable |
-| ------------ | ------------ |
-| Description | Whether to enable the DataNode feature that actively listens for and loads TsFiles (enabled by default). |
-| Type | Boolean |
-| Default | true |
-| Effective | Hot reload |
-
-- load_active_listening_dirs
-
-| Name | load_active_listening_dirs |
-| ------------ | ------------ |
-| Description | Directories to listen on (subdirectories are included automatically). Separate multiple directories with ",". The default directory is ext/load/pending (hot loading is supported). |
-| Type | String |
-| Default | ext/load/pending |
-| Effective | Hot reload |
-
-- load_active_listening_fail_dir
-
-| Name | load_active_listening_fail_dir |
-| ------------ | ------------ |
-| Description | Directory to which a TsFile is moved after loading it fails. Only one directory can be configured. |
-| Type | String |
-| Default | ext/load/failed |
-| Effective | Hot reload |
-
-- load_active_listening_max_thread_num
-
-| Name | load_active_listening_max_thread_num |
-| ------------ | ------------ |
-| Description | Maximum number of threads that run TsFile load tasks concurrently. When the parameter is commented out, the default is max(1, CPU cores / 2). If the configured value falls outside the range [1, CPU cores / 2], it is reset to the default max(1, CPU cores / 2). |
-| Type | Long |
-| Default | 0 |
-| Effective | After restarting the service |
-
-- load_active_listening_check_interval_seconds
-
-| Name | load_active_listening_check_interval_seconds |
-| ------------ | ------------ |
-| Description | Polling interval of active listening, in seconds. Active listening for TsFiles is implemented by polling the listened directories. This setting is the interval between two checks of load_active_listening_dirs: the next check starts load_active_listening_check_interval_seconds seconds after the previous check completes. If the configured interval is less than 1, it is reset to the default of 5 seconds. |
-| Type | Long |
-| Default | 5 |
-| Effective | After restarting the service |
-
-
-* last_cache_operation_on_load
-
-|Name| last_cache_operation_on_load |
-|:---:|:---|
-|Description| Operation performed on the LastCache when a TsFile is loaded successfully. `UPDATE`: update the LastCache with the data in the TsFile; `UPDATE_NO_BLOB`: like UPDATE, but invalidates the LastCache of blob series; `CLEAN_DEVICE`: invalidates the LastCache of the devices contained in the TsFile; `CLEAN_ALL`: clears the entire LastCache. |
-|Type| String |
-|Default| UPDATE_NO_BLOB |
-|Effective| After restart |
-
-* cache_last_values_for_load
-
-|Name| cache_last_values_for_load |
-|:---:|:---|
-|Description| Whether to cache the last values before loading a TsFile. Takes effect only when `last_cache_operation_on_load=UPDATE_NO_BLOB` or `last_cache_operation_on_load=UPDATE`. When set to true, blob series are ignored even if `last_cache_operation_on_load=UPDATE`. Enabling this option increases memory usage while TsFiles are being loaded. |
-|Type| Boolean |
-|Default| true |
-|Effective| After restart |
-
-* cache_last_values_memory_budget_in_byte
-
-|Name| cache_last_values_memory_budget_in_byte |
-|:---:|:---|
-|Description| Maximum memory size, in bytes, used to cache last values when `cache_last_values_for_load=true`. If this value is exceeded, the cached values are discarded and the last values are read directly from the TsFile in a streaming manner. |
-|Type| int32 |
-|Default| 4194304 |
-|Effective| After restart |
-
-
-### 3.38 Dispatch Retry Configuration
-
-- write_request_remote_dispatch_max_retry_duration_in_ms
-
-| Name | write_request_remote_dispatch_max_retry_duration_in_ms |
-| ------------ | ------------ |
-| Description | Maximum time, in milliseconds, to keep retrying the remote dispatch of a write request when an unknown error occurs. |
-| Type | Long |
-| Default | 60000 |
-| Effective | Hot reload |
-
-- enable_retry_for_unknown_error
-
-| Name | enable_retry_for_unknown_error |
-| ------------ | ------------ |
-| Description | Whether to retry on unknown errors. |
-| Type | boolean |
-| Default | false |
-| Effective | Hot reload |
\ No newline at end of file
diff --git 
a/src/zh/UserGuide/latest-Table/Reference/System-Tables_timecho.md b/src/zh/UserGuide/latest-Table/Reference/System-Tables_timecho.md deleted file mode 100644 index 108cd8d0a..000000000 --- a/src/zh/UserGuide/latest-Table/Reference/System-Tables_timecho.md +++ /dev/null @@ -1,793 +0,0 @@ - - -# 系统表 - -IoTDB 内置系统数据库 `INFORMATION_SCHEMA`,其中包含一系列系统表,用于存储 IoTDB 运行时信息(如当前正在执行的 SQL 语句等)。目前`INFORMATION_SCHEMA`数据库只支持读操作。 - -> 💡 **【V2.0.9.1 版本更新】**
-> 👉 新增一张系统表:**[TABLE_DISK_USAGE](#_2-22-table-disk-usage-表)**(表级存储空间统计),助力集群运维与性能分析。 - -## 1. 系统库 - -* 名称:`INFORMATION_SCHEMA` -* 指令:只读,只支持 `Show databases (DETAILS) `​`/ Show Tables (DETAILS) / Use`,其余操作将会报错:`"The database 'information_schema' can only be queried"` -* 属性:`TTL=INF`,其余属性默认为`null` -* SQL示例: - -```sql -IoTDB> show databases -+------------------+-------+-----------------------+---------------------+---------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval| -+------------------+-------+-----------------------+---------------------+---------------------+ -|information_schema| INF| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+ - -IoTDB> show tables from information_schema -+-----------------------+-------+ -| TableName|TTL(ms)| -+-----------------------+-------+ -| columns| INF| -| config_nodes| INF| -| configurations| INF| -| connections| INF| -| current_queries| INF| -| data_nodes| INF| -| databases| INF| -| functions| INF| -| keywords| INF| -| nodes| INF| -| pipe_plugins| INF| -| pipes| INF| -| queries| INF| -|queries_costs_histogram| INF| -| regions| INF| -| services| INF| -| subscriptions| INF| -| table_disk_usage| INF| -| tables| INF| -| topics| INF| -| views| INF| -+-----------------------+-------+ -``` - -## 2. 
System Tables
-
-* Names: `DATABASES`, `TABLES`, `REGIONS`, `QUERIES`, `COLUMNS`, `PIPES`, `PIPE_PLUGINS`, `SUBSCRIPTIONS`, `TOPICS`, `VIEWS`, `MODELS`, `FUNCTIONS`, `CONFIGURATIONS`, `KEYWORDS`, `NODES`, `CONFIG_NODES`, `DATA_NODES`, `CONNECTIONS`, `CURRENT_QUERIES`, `QUERIES_COSTS_HISTOGRAM`, `SERVICES`, `TABLE_DISK_USAGE` (each is described in the subsections below)
-* Operations: read-only. Only `SELECT`, `COUNT/SHOW DEVICES`, and `DESC` are supported. Any attempt to modify the table structure or content fails with the error: `"The database 'information_schema' can only be queried"`
-* Column names: system table column names are lowercase by default and use `_` as the separator
-
-### 2.1 DATABASES Table
-
-* Contains information about all databases in the cluster
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ----------------------------- | ---------- | ----------- | ---------------- |
-| database | STRING | TAG | Database name |
-| ttl(ms) | STRING | ATTRIBUTE | Data retention time |
-| schema\_replication\_factor | INT32 | ATTRIBUTE | Number of schema replicas |
-| data\_replication\_factor | INT32 | ATTRIBUTE | Number of data replicas |
-| time\_partition\_interval | INT64 | ATTRIBUTE | Time partition interval |
-| schema\_region\_group\_num | INT32 | ATTRIBUTE | Number of schema region groups |
-| data\_region\_group\_num | INT32 | ATTRIBUTE | Number of data region groups |
-
-* The query result only includes databases on which the current user holds any privilege, either on the database itself or on any table in it
-* Query example:
-
-```sql
-IoTDB> select * from information_schema.databases
-+------------------+-------+-------------------------+-----------------------+-----------------------+-----------------------+---------------------+
-| database|ttl(ms)|schema_replication_factor|data_replication_factor|time_partition_interval|schema_region_group_num|data_region_group_num|
-+------------------+-------+-------------------------+-----------------------+-----------------------+-----------------------+---------------------+
-|information_schema| INF| null| null| null| null| null|
-| database1| INF| 1| 1| 604800000| 0| 0|
-+------------------+-------+-------------------------+-----------------------+-----------------------+-----------------------+---------------------+
-```
-
-### 2.2 TABLES Table
-
-* Contains information about all tables in the cluster
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ------------- | ---------- | ----------- | -------------- |
-| database | STRING | 
TAG | 数据库名称 | -| table\_name | STRING | TAG | 表名称 | -| ttl(ms) | STRING | ATTRIBUTE | 数据保留时间 | -| status | STRING | ATTRIBUTE | 状态 | -| comment | STRING | ATTRIBUTE | 注释 | - -* 说明:status 可能为`USING`/`PRE_CREATE`/`PRE_DELETE`,具体见表管理中[查看表](../Basic-Concept/Table-Management_timecho.md#12-查看表)的相关描述 -* 查询结果只展示自身有任意权限的表集合 -* 查询示例: - -```sql -IoTDB> select * from information_schema.tables -+------------------+--------------+-----------+------+-------+-----------+ -| database| table_name| ttl(ms)|status|comment| table_type| -+------------------+--------------+-----------+------+-------+-----------+ -|information_schema| databases| INF| USING| null|SYSTEM VIEW| -|information_schema| models| INF| USING| null|SYSTEM VIEW| -|information_schema| subscriptions| INF| USING| null|SYSTEM VIEW| -|information_schema| regions| INF| USING| null|SYSTEM VIEW| -|information_schema| functions| INF| USING| null|SYSTEM VIEW| -|information_schema| keywords| INF| USING| null|SYSTEM VIEW| -|information_schema| columns| INF| USING| null|SYSTEM VIEW| -|information_schema| topics| INF| USING| null|SYSTEM VIEW| -|information_schema|configurations| INF| USING| null|SYSTEM VIEW| -|information_schema| queries| INF| USING| null|SYSTEM VIEW| -|information_schema| tables| INF| USING| null|SYSTEM VIEW| -|information_schema| pipe_plugins| INF| USING| null|SYSTEM VIEW| -|information_schema| nodes| INF| USING| null|SYSTEM VIEW| -|information_schema| data_nodes| INF| USING| null|SYSTEM VIEW| -|information_schema| pipes| INF| USING| null|SYSTEM VIEW| -|information_schema| views| INF| USING| null|SYSTEM VIEW| -|information_schema| config_nodes| INF| USING| null|SYSTEM VIEW| -| database1| table1|31536000000| USING| null| BASE TABLE| -+------------------+--------------+-----------+------+-------+-----------+ -``` - -### 2.3 REGIONS 表 - -* 包含集群中所有`Region`的信息 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| --------------------- | ----------- | ----------- | 
----------------------------------------------------------------------------------------------------------- | -| region\_id | INT32 | TAG | region ID | -| datanode\_id | INT32 | TAG | dataNode ID | -| type | STRING | ATTRIBUTE | 类型(SchemaRegion / DataRegion) | -| status | STRING | ATTRIBUTE | 状态(Running/Unknown 等) | -| database | STRING | ATTRIBUTE | database 名字 | -| series\_slot\_num | INT32 | ATTRIBUTE | series slot 个数 | -| time\_slot\_num | INT64 | ATTRIBUTE | time slot 个数 | -| rpc\_address | STRING | ATTRIBUTE | Rpc 地址 | -| rpc\_port | INT32 | ATTRIBUTE | Rpc 端口 | -| internal\_address | STRING | ATTRIBUTE | 内部通讯地址 | -| role | STRING | ATTRIBUTE | Leader / Follower | -| create\_time | TIMESTAMP | ATTRIBUTE | 创建时间 | -| tsfile\_size\_bytes | INT64 | ATTRIBUTE | 可统计的 DataRegion:含有 TsFile 的总文件大小;不可统计的 DataRegion(Unknown):-1;SchemaRegion:null; | - -* 仅管理员可执行查询操作 -* 查询示例: - -```SQL -IoTDB> select * from information_schema.regions -+---------+-----------+------------+-------+---------+---------------+-------------+-----------+--------+----------------+------+-----------------------------+-----------------+ -|region_id|datanode_id| type| status| database|series_slot_num|time_slot_num|rpc_address|rpc_port|internal_address| role| create_time|tsfile_size_bytes| -+---------+-----------+------------+-------+---------+---------------+-------------+-----------+--------+----------------+------+-----------------------------+-----------------+ -| 0| 1|SchemaRegion|Running|database1| 12| 0| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-03-31T11:19:08.485+08:00| null| -| 1| 1| DataRegion|Running|database1| 6| 6| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-03-31T11:19:09.156+08:00| 3985| -| 2| 1| DataRegion|Running|database1| 6| 6| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-03-31T11:19:09.156+08:00| 3841| -+---------+-----------+------------+-------+---------+---------------+-------------+-----------+--------+----------------+------+-----------------------------+-----------------+ -``` - -### 2.4 QUERIES 
表 - -* 包含集群中所有正在执行的查询的信息。也可以使用 `SHOW QUERIES`语法去查询。 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| --------------- | ----------- | ----------- | ------------------------------------------------ | -| query\_id | STRING | TAG | ID | -| start\_time | TIMESTAMP | ATTRIBUTE | 查询开始的时间戳,时间戳精度与系统精度保持一致 | -| datanode\_id | INT32 | ATTRIBUTE | 发起查询的DataNode ID | -| elapsed\_time | FLOAT | ATTRIBUTE | 查询执行耗时,单位是秒 | -| statement | STRING | ATTRIBUTE | 查询sql | -| user | STRING | ATTRIBUTE | 发起查询的用户 | - -* 普通用户查询结果仅显示自身执行的查询;管理员显示全部。 -* 查询示例: - -```SQL -IoTDB> select * from information_schema.queries -+-----------------------+-----------------------------+-----------+------------+----------------------------------------+----+ -| query_id| start_time|datanode_id|elapsed_time| statement|user| -+-----------------------+-----------------------------+-----------+------------+----------------------------------------+----+ -|20250331_023242_00011_1|2025-03-31T10:32:42.360+08:00| 1| 0.025|select * from information_schema.queries|root| -+-----------------------+-----------------------------+-----------+------------+----------------------------------------+----+ -``` - -### 2.5 COLUMNS 表 - -* 包含集群中所有表中列的信息 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| -------------- | ---------- | ----------- | -------------- | -| database | STRING | TAG | 数据库名称 | -| table\_name | STRING | TAG | 表名称 | -| column\_name | STRING | TAG | 列名称 | -| datatype | STRING | ATTRIBUTE | 列的数值类型 | -| category | STRING | ATTRIBUTE | 列类型 | -| status | STRING | ATTRIBUTE | 列状态 | -| comment | STRING | ATTRIBUTE | 列注释 | - -说明: -* status 可能为`USING`/`PRE_DELETE`,具体见表管理中[查看表的列](../Basic-Concept/Table-Management_timecho.md#13-查看表的列)的相关描述 -* 查询结果只展示自身有任意权限的表的列信息 -* 查询示例: - -```SQL -IoTDB> select * from information_schema.columns where database = 'database1' -+---------+----------+------------+---------+---------+------+-------+ -| database|table_name| column_name| datatype| category|status|comment| 
-+---------+----------+------------+---------+---------+------+-------+
-|database1| table1| time|TIMESTAMP| TIME| USING| null|
-|database1| table1| region| STRING| TAG| USING| null|
-|database1| table1| plant_id| STRING| TAG| USING| null|
-|database1| table1| device_id| STRING| TAG| USING| null|
-|database1| table1| model_id| STRING|ATTRIBUTE| USING| null|
-|database1| table1| maintenance| STRING|ATTRIBUTE| USING| null|
-|database1| table1| temperature| FLOAT| FIELD| USING| null|
-|database1| table1| humidity| FLOAT| FIELD| USING| null|
-|database1| table1| status| BOOLEAN| FIELD| USING| null|
-|database1| table1|arrival_time|TIMESTAMP| FIELD| USING| null|
-+---------+----------+------------+---------+---------+------+-------+
-```
-
-### 2.6 PIPES Table
-
-* Contains information about all pipes in the cluster
-* The table structure is as follows:
-
-| Column Name | Data Type | Column Type | Description |
-| ------------------------------- | ----------- | ----------- | --------------------------------------- |
-| id | STRING | TAG | Pipe name |
-| creation\_time | TIMESTAMP | ATTRIBUTE | Creation time |
-| state | STRING | ATTRIBUTE | Pipe state (RUNNING/STOPPED) |
-| pipe\_source | STRING | ATTRIBUTE | Source plugin parameters |
-| pipe\_processor | STRING | ATTRIBUTE | Processor plugin parameters |
-| pipe\_sink | STRING | ATTRIBUTE | Sink plugin parameters |
-| exception\_message | STRING | ATTRIBUTE | Exception message |
-| remaining\_event\_count | INT64 | ATTRIBUTE | Number of remaining events; -1 if unknown |
-| estimated\_remaining\_seconds | DOUBLE | ATTRIBUTE | Estimated remaining time; -1 if unknown |
-
-* Only administrators can run this query
-* Query example:
-
-```SQL
-IoTDB> select * from information_schema.pipes
-+----------+-----------------------------+-------+--------------------------------------------------------------------------+--------------+-----------------------------------------------------------------------+-----------------+---------------------+---------------------------+
-| id| creation_time| state| pipe_source|pipe_processor| pipe_sink|exception_message|remaining_event_count|estimated_remaining_seconds|
-+----------+-----------------------------+-------+--------------------------------------------------------------------------+--------------+-----------------------------------------------------------------------+-----------------+---------------------+---------------------------+ -|tablepipe1|2025-03-31T12:25:24.040+08:00|RUNNING|{__system.sql-dialect=table, source.password=******, source.username=root}| {}|{format=hybrid, node-urls=192.168.xxx.xxx:6667, sink=iotdb-thrift-sink}| | 0| 0.0| -+----------+-----------------------------+-------+--------------------------------------------------------------------------+--------------+-----------------------------------------------------------------------+-----------------+---------------------+---------------------------+ -``` - -### 2.7 PIPE\_PLUGINS 表 - -* 包含集群中所有PIPE插件的信息 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| -------------- | ---------- | ----------- | ----------------------------------------------- | -| plugin\_name | STRING | TAG | 插件名称 | -| plugin\_type | STRING | ATTRIBUTE | 插件类型(Builtin/External) | -| class\_name | STRING | ATTRIBUTE | 插件的主类名 | -| plugin\_jar | STRING | ATTRIBUTE | 插件的 jar 包名称,若为 builtin 类型则为 null | - -* 查询示例: - -```SQL -IoTDB> select * from information_schema.pipe_plugins -+---------------------+-----------+-------------------------------------------------------------------------------------------------+----------+ -| plugin_name|plugin_type| class_name|plugin_jar| -+---------------------+-----------+-------------------------------------------------------------------------------------------------+----------+ -|IOTDB-THRIFT-SSL-SINK| Builtin|org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector| null| -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector| null| -| DO-NOTHING-SINK| Builtin| 
org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.donothing.DoNothingConnector| null| -| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.processor.donothing.DoNothingProcessor| null| -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector| null| -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.extractor.iotdb.IoTDBExtractor| null| -+---------------------+-----------+-------------------------------------------------------------------------------------------------+----------+ -``` - -### 2.8 SUBSCRIPTIONS 表 - -* 包含集群中所有数据订阅的信息 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| ----------------------- | ---------- | ----------- | -------------- | -| topic\_name | STRING | TAG | 订阅主题名称 | -| consumer\_group\_name | STRING | TAG | 消费者组名称 | -| subscribed\_consumers | STRING | ATTRIBUTE | 订阅的消费者 | - -* 仅管理员可执行操作 -* 查询示例: - -```SQL -IoTDB> select * from information_schema.subscriptions where topic_name = 'topic_1' -+----------+-------------------+--------------------------------+ -|topic_name|consumer_group_name| subscribed_consumers| -+----------+-------------------+--------------------------------+ -| topic_1| cg1|[c3, c4, c5, c6, c7, c0, c1, c2]| -+----------+-------------------+--------------------------------+ -``` - -### 2.9 TOPICS 表 - -* 包含集群中所有数据订阅主题的信息 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| ---------------- | ---------- | ----------- | -------------- | -| topic\_name | STRING | TAG | 订阅主题名称 | -| topic\_configs | STRING | ATTRIBUTE | 订阅主题配置 | - -* 仅管理员可执行操作 -* 查询示例: - -```SQL -IoTDB> select * from information_schema.topics -+----------+----------------------------------------------------------------+ -|topic_name| topic_configs| -+----------+----------------------------------------------------------------+ -| topic|{__system.sql-dialect=table, start-time=2025-01-10T17:05:38.282}| 
-+----------+----------------------------------------------------------------+ -``` - -### 2.10 VIEWS 表 - -> 该系统表从 V 2.0.5 版本开始提供 - -* 包含数据库内所有的表视图信息 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| ------------------ | ---------- | ----------- | ---------------- | -| database | STRING | TAG | 数据库名称 | -| table\_name | STRING | TAG | 视图名称 | -| view\_definition | STRING | ATTRIBUTE | 视图的创建语句 | - -* 查询结果只展示自身有任意权限的视图集合 -* 查询示例: - -```SQL -IoTDB> select * from information_schema.views -+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------+ -| database|table_name| view_definition| -+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------+ -|database1| ln|CREATE VIEW "ln" ("device" STRING TAG,"model" STRING TAG,"status" BOOLEAN FIELD,"hardware" STRING FIELD) WITH (ttl='INF') AS root.ln.**| -+---------+----------+--------------------------------------------------------------------------------------------------------------------------------------- -``` - -### 2.11 MODELS 表 - -> 该系统表从 V 2.0.5 版本开始提供,从V 2.0.8 版本开始不再提供 - -* 包含数据库内所有的模型信息 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| ------------- | ---------- | ----------- | ----------------------------------------------------------------------- | -| model\_id | STRING | TAG | 模型名称 | -| model\_type | STRING | ATTRIBUTE | 模型类型(预测,异常检测,自定义) | -| state | STRING | ATTRIBUTE | 模型状态(是否可用) | -| configs | STRING | ATTRIBUTE | 模型的超参数的 string 格式,与正常的 show 相同 | -| notes | STRING | ATTRIBUTE | 模型注释* 内置 model:Built-in model in IoTDB* 用户的 model:自定义 | - -* 查询示例: - -```SQL --- 找到类型为内置预测的所有模型 -IoTDB> select * from information_schema.models where model_type = 'BUILT_IN_FORECAST' -+---------------------+-----------------+------+-------+-----------------------+ -| model_id| model_type| state|configs| notes| 
-+---------------------+-----------------+------+-------+-----------------------+ -| _STLForecaster|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -| _NaiveForecaster|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -| _ARIMA|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -|_ExponentialSmoothing|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -| _HoltWinters|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -| _sundial|BUILT_IN_FORECAST|ACTIVE| null|Built-in model in IoTDB| -+---------------------+-----------------+------+-------+-----------------------+ -``` - -### 2.12 FUNCTIONS 表 - -> 该系统表从 V 2.0.5 版本开始提供 - -* 包含数据库内所有的函数信息 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| ------------------ | ---------- | ----------- | ----------------------------------------- | -| function\_name | STRING | TAG | 函数名称 | -| function\_type | STRING | ATTRIBUTE | 函数类型(内/外置数值/聚合/表函数) | -| class\_name(udf) | STRING | ATTRIBUTE | 如为 UDF,则为类名,否则为 null(暂定) | -| state | STRING | ATTRIBUTE | 是否可用 | - -* 查询示例: - -```SQL -IoTDB> select * from information_schema.functions where function_type='built-in table function' -+--------------+-----------------------+---------------+---------+ -|function_table| function_type|class_name(udf)| state| -+--------------+-----------------------+---------------+---------+ -| CUMULATE|built-in table function| null|AVAILABLE| -| SESSION|built-in table function| null|AVAILABLE| -| HOP|built-in table function| null|AVAILABLE| -| TUMBLE|built-in table function| null|AVAILABLE| -| FORECAST|built-in table function| null|AVAILABLE| -| VARIATION|built-in table function| null|AVAILABLE| -| CAPACITY|built-in table function| null|AVAILABLE| -+--------------+-----------------------+---------------+---------+ -``` - -### 2.13 CONFIGURATIONS表 - -> 该系统表从 V 2.0.5 版本开始提供 - -* 包含数据库内所有的属性信息 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| ---------- | ---------- | ----------- | -------- | -| variable | STRING | TAG | 属性名 | -| value | STRING | 
ATTRIBUTE | 属性值 | - -* 仅管理员可执行操作 -* 查询示例: - -```SQL -IoTDB> select * from information_schema.configurations -+----------------------------------+-----------------------------------------------------------------+ -| variable| value| -+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - -### 2.14 KEYWORDS 表 - -> 该系统表从 V 2.0.5 版本开始提供 - -* 包含数据库内所有的关键字信息 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| ---------- | ---------- | ----------- | -------------------------------- | -| word | STRING | TAG | 关键字 | -| reserved | INT32 | ATTRIBUTE | 是否为保留字,1表示是,0表示否 | - -* 查询示例: - -```SQL -IoTDB> select * from information_schema.keywords limit 10 -+----------+--------+ -| word|reserved| -+----------+--------+ -| ABSENT| 0| -|ACTIVATION| 1| -| ACTIVATE| 1| -| ADD| 0| -| ADMIN| 0| -| AFTER| 0| -| AINODES| 1| -| ALL| 0| -| ALTER| 1| -| ANALYZE| 0| -+----------+--------+ -``` - -### 2.15 NODES 表 - -> 该系统表从 V 2.0.5 版本开始提供 - -* 包含数据库内所有的节点信息 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| ------------------------------ | ---------- | ----------- | --------------- | -| node\_id | INT32 | TAG | 节点 ID | -| node\_type | STRING | ATTRIBUTE | 节点类型 | -| status | STRING | 
ATTRIBUTE | Node status | -| internal\_address | STRING | ATTRIBUTE | Internal RPC address | -| internal\_port | INT32 | ATTRIBUTE | Internal port | -| version | STRING | ATTRIBUTE | Version number | -| build\_info | STRING | ATTRIBUTE | Commit ID | -| activate\_status (Enterprise Edition only) | STRING | ATTRIBUTE | Activation status | - -* Only administrators can perform this operation. -* Query example: - -```SQL -IoTDB> select * from information_schema.nodes -+-------+----------+-------+----------------+-------------+-------+----------+ -|node_id| node_type| status|internal_address|internal_port|version|build_info| -+-------+----------+-------+----------------+-------------+-------+----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|2.0.5.1| 58d685e| -| 1| DataNode|Running| 127.0.0.1| 10730|2.0.5.1| 58d685e| -+-------+----------+-------+----------------+-------------+-------+----------+ -``` - -### 2.16 CONFIG\_NODES Table - -> This system table is available from V 2.0.5 onward. - -* Contains information about all ConfigNodes in the database. -* The table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ------------------------- | ---------- | ----------- | --------------------- | -| node\_id | INT32 | TAG | Node ID | -| config\_consensus\_port | INT32 | ATTRIBUTE | ConfigNode consensus port | -| role | STRING | ATTRIBUTE | ConfigNode role | - -* Only administrators can perform this operation. -* Query example: - -```SQL -IoTDB> select * from information_schema.config_nodes -+-------+---------------------+------+ -|node_id|config_consensus_port| role| -+-------+---------------------+------+ -| 0| 10720|Leader| -+-------+---------------------+------+ -``` - -### 2.17 DATA\_NODES Table - -> This system table is available from V 2.0.5 onward. - -* Contains information about all DataNodes in the database. -* The table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| ------------------------ | ---------- | ----------- | ----------------------- | -| node\_id | INT32 | TAG | Node ID | -| data\_region\_num | INT32 | ATTRIBUTE | Number of DataRegions | -| schema\_region\_num | INT32 | ATTRIBUTE | Number of SchemaRegions | -| rpc\_address | STRING | ATTRIBUTE | RPC address | -| rpc\_port | INT32 | ATTRIBUTE | RPC port | -| mpp\_port | INT32 | ATTRIBUTE | MPP communication port | -| data\_consensus\_port | INT32 | ATTRIBUTE | DataRegion consensus port | -| 
schema\_consensus\_port | INT32 | ATTRIBUTE | SchemaRegion consensus port | - -* Only administrators can perform this operation. -* Query example: - -```SQL -IoTDB> select * from information_schema.data_nodes -+-------+---------------+-----------------+-----------+--------+--------+-------------------+---------------------+ -|node_id|data_region_num|schema_region_num|rpc_address|rpc_port|mpp_port|data_consensus_port|schema_consensus_port| -+-------+---------------+-----------------+-----------+--------+--------+-------------------+---------------------+ -| 1| 4| 4| 0.0.0.0| 6667| 10740| 10760| 10750| -+-------+---------------+-----------------+-----------+--------+--------+-------------------+---------------------+ -``` - -### 2.18 CONNECTIONS Table - -> This system table is available from V 2.0.8 onward. - -* Contains all connections in the cluster. -* The table structure is as follows: - -| **Column Name** | **Data Type** | **Column Type** | **Description** | -| -------------------- | -------------------- | ------------------ | ---------------- | -| datanode\_id | STRING | TAG | DataNode ID | -| user\_id | STRING | TAG | User ID | -| session\_id | STRING | TAG | Session ID | -| user\_name | STRING | ATTRIBUTE | User name | -| last\_active\_time | TIMESTAMP | ATTRIBUTE | Time of last activity | -| client\_ip | STRING | ATTRIBUTE | Client IP | - -* Query example: - -```SQL -IoTDB> select * from information_schema.connections; -+-----------+-------+----------+---------+-----------------------------+---------+ -|datanode_id|user_id|session_id|user_name| last_active_time|client_ip| -+-----------+-------+----------+---------+-----------------------------+---------+ -| 1| 0| 2| root|2026-01-21T16:28:54.704+08:00|127.0.0.1| -+-----------+-------+----------+---------+-----------------------------+---------+ -``` - -### 2.19 CURRENT\_QUERIES Table - -> This system table is available from V 2.0.8 onward. - -* Contains all queries whose execution end time falls within `[now() - query_cost_stat_window, now())`, as well as queries that are currently running. `query_cost_stat_window` is the window used for query-cost statistics; its default value is 0, and it can be set in the configuration file `iotdb-system.properties`. -* The table structure is as follows: - -| Column Name | Data Type | Column Type | Description | -| -------------- | ----------- | -------- | 
------------------------------------------------------------------------------- | -| query\_id | STRING | TAG | 查询语句的 ID | -| state | STRING | FIELD | 查询状态,RUNNING 表示正在执行,FINISHED 表示已结束 | -| start\_time | TIMESTAMP | FIELD | 查询开始的时间戳,时间戳精度与系统精度保持一致 | -| end\_time | TIMESTAMP | FIELD | 查询结束的时间戳,时间戳精度与系统精度保持一致。若查询尚未结束,该列值为 NULL | -| datanode\_id | INT32 | FIELD | 该查询语句是从哪个 DataNode 发起的 | -| cost\_time| FLOAT | FIELD | 查询的执行耗时,单位是秒。若查询尚未结束,该列值为查询已执行时间 | -| statement | STRING | FIELD | 查询的sql / 查询请求拼接后的 sql | -| user | STRING | FIELD | 发起查询的用户 | -| client\_ip | STRING | FIELD | 发起查询的客户端 ip | - -* 普通用户查询结果仅显示自身执行的查询;管理员显示全部。 -* 查询示例: - -```SQL -IoTDB> select * from information_schema.current_queries; -+-----------------------+-------+-----------------------------+--------+-----------+---------+------------------------------------------------+----+---------+ -| query_id| state| start_time|end_time|datanode_id|cost_time| statement|user|client_ip| -+-----------------------+-------+-----------------------------+--------+-----------+---------+------------------------------------------------+----+---------+ -|20260121_085427_00013_1|RUNNING|2026-01-21T16:54:27.019+08:00| null| 1| 0.0|select * from information_schema.current_queries|root|127.0.0.1| -+-----------------------+-------+-----------------------------+--------+-----------+---------+------------------------------------------------+----+---------+ -``` - -### 2.20 QUERIES\_COSTS\_HISTOGRAM 表 - -> 该系统表从 V 2.0.8 版本开始提供 - -* 包含过去 `query_cost_stat_window` 时间内的查询耗时的直方图(仅统计已经执行结束的 SQL),其中`query_cost_stat_window `代表查询耗时统计的窗口,默认值为 0 ,可通过配置文件`iotdb-system.properties`进行配置。 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| -------------- | ---------- | -------- | -------------------------------------------------------------------- | -| bin| STRING | TAG| 分桶名:共包含61个分桶,[0, 1), [1, 2), [2, 3),...., [59, 60), 60+ | -| nums | INT32 | FIELD | 分桶内sql的个数 | -| datanode\_id | INT32 | FIELD | 该桶属于哪个 DataNode | - -* 仅管理员可执行操作 -* 查询示例: - -```SQL 
-IoTDB> select * from information_schema.queries_costs_histogram limit 10 -+------+----+-----------+ -| bin|nums|datanode_id| -+------+----+-----------+ -| [0,1)| 0| 1| -| [1,2)| 0| 1| -| [2,3)| 0| 1| -| [3,4)| 0| 1| -| [4,5)| 0| 1| -| [5,6)| 0| 1| -| [6,7)| 0| 1| -| [7,8)| 0| 1| -| [8,9)| 0| 1| -|[9,10)| 0| 1| -+------+----+-----------+ -``` - -### 2.21 SERVICES 表 - -> 该系统表从 V 2.0.8.2 版本开始提供 - -* 可展示所有正常工作(RUNNING 或 READ-ONLY) DN 上的服务(MQTT 服务、REST 服务)。 -* 表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| --------------- | ---------- | ----------- | ------------------------------ | -| service\_name | STRING | TAG | 服务名称 | -| datanode\_id | INT32 | ATTRIBUTE | 所在 DataNode 的 ID | -| state | STRING | ATTRIBUTE | 服务状态: RUNNING / STOPPED | - -* 查询示例: - -```SQL -IoTDB> select * from information_schema.services -+------------+-----------+-------+ -|service_name|datanode_id| state| -+------------+-----------+-------+ -| MQTT| 1|STOPPED| -| REST| 1|RUNNING| -+------------+-----------+-------+ -``` - -### 2.22 TABLE_DISK_USAGE 表 - -> 该系统表从 V 2.0.9.1 版本开始提供 - -用于展示指定表(不包含 view)的磁盘空间占用情况,包括 ChunkGroup 的大小和 Metadata 大小。 - -注意:统计基于 TsFile 中数据的真实大小,因此不会考虑 mods 删除的情况。 - -表结构如下表所示: - -| 列名 | 数据类型 | 列类型 | 说明 | -| ----------------- | ---------- | -------- | -------------------- | -| database | string | Field | Database 名 | -| table\_name | string | Field | 表名 | -| datanode\_id | int32 | Field | DataNode 节点 id | -| region\_id | int32 | Field | Region id | -| time\_partition | int64 | Field | 时间分区 id | -| size\_in\_bytes | int64 | Field | 占用磁盘空间(byte) | - -查询示例: - -```SQL --- 查询所有数据; -select * from information_schema.table_disk_usage; -``` - -```Bash -+---------+-------------------+-----------+---------+--------------+-------------+ -| database| table_name|datanode_id|region_id|time_partition|size_in_bytes| -+---------+-------------------+-----------+---------+--------------+-------------+ -|database1| table1| 1| 3| 2864| 867| -|database1| table11| 1| 3| 2864| 0| -|database1| table3| 1| 3| 
2864| 0| -|database1| table1| 1| 3| 2865| 1411| -|database1| table11| 1| 3| 2865| 0| -|database1| table3| 1| 3| 2865| 0| -|database1| table1| 1| 3| 2925| 590| -|database1| table11| 1| 3| 2925| 0| -|database1| table3| 1| 3| 2925| 0| -|database1| table1| 1| 4| 2864| 883| -|database1| table11| 1| 4| 2864| 0| -|database1| table3| 1| 4| 2864| 0| -|database1| table1| 1| 4| 2865| 1224| -|database1| table11| 1| 4| 2865| 0| -|database1| table3| 1| 4| 2865| 0| -|database1| table1| 1| 4| 2888| 0| -|database1| table11| 1| 4| 2888| 0| -|database1| table3| 1| 4| 2888| 205| -| etth| tab_cov_forecast| 1| 8| 0| 0| -| etth| tab_real| 1| 8| 0| 963| -| etth|tab_target_forecast| 1| 8| 0| 0| -| etth| tab_cov_forecast| 1| 9| 0| 448| -| etth| tab_real| 1| 9| 0| 0| -| etth|tab_target_forecast| 1| 9| 0| 0| -+---------+-------------------+-----------+---------+--------------+-------------+ -``` - -```SQL --- 指定查询条件; -select * from information_schema.table_disk_usage where region_id = 4 and table_name like '%1'; -``` - -```Bash -+---------+----------+-----------+---------+--------------+-------------+ -| database|table_name|datanode_id|region_id|time_partition|size_in_bytes| -+---------+----------+-----------+---------+--------------+-------------+ -|database1| table1| 1| 4| 2864| 883| -|database1| table11| 1| 4| 2864| 0| -|database1| table1| 1| 4| 2865| 1224| -|database1| table11| 1| 4| 2865| 0| -|database1| table1| 1| 4| 2888| 0| -|database1| table11| 1| 4| 2888| 0| -+---------+----------+-----------+---------+--------------+-------------+ -``` - - -## 3. 
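Permission Description - -* `GRANT/REVOKE` statements cannot be used to manage permissions on the `information_schema` database or on any table in it. -* Any user can view information about the `information_schema` database with the `show databases` statement. -* Any user can view information about all system tables with the `show tables from information_schema` statement. -* Any user can view any system table with the `desc` statement. diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function_timecho.md b/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function_timecho.md deleted file mode 100644 index 6a0a3e041..000000000 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/Basis-Function_timecho.md +++ /dev/null @@ -1,2670 +0,0 @@ - - -# Basic Functions - -## 1. Comparison Functions and Operators - -### 1.1 Basic Comparison Operators - -Comparison operators compare two values and return the result of the comparison (true or false). - -| Operator | Description | -| ------ | ---------- | -| < | Less than | -| > | Greater than | -| <= | Less than or equal to | -| >= | Greater than or equal to | -| = | Equal to | -| <> | Not equal to | -| != | Not equal to | - -#### 1.1.1 Comparison Rules: - -1. Every type can be compared with itself. -2. Numeric types (INT32, INT64, FLOAT, DOUBLE, TIMESTAMP) can be compared with one another. -3. Character types (STRING, TEXT) can also be compared with one another. -4. Comparing types in any way not covered by the rules above raises an error. - -### 1.2 BETWEEN Operator - -1. The `BETWEEN` operator tests whether a value lies within a specified range. -2. The `NOT BETWEEN` operator tests whether a value lies outside a specified range. -3. `BETWEEN` and `NOT BETWEEN` can be applied to any orderable type. -4. The value, minimum, and maximum arguments of `BETWEEN` and `NOT BETWEEN` must all have the same type; otherwise an error is raised. - -**Syntax**: - -```SQL - value BETWEEN min AND max - value NOT BETWEEN min AND max -``` - -Example 1: BETWEEN - -```SQL --- Query records whose temperature is between 85.0 and 90.0 -SELECT * FROM table1 WHERE temperature BETWEEN 85.0 AND 90.0; -``` - -Example 2: NOT BETWEEN - -```SQL --- Query records whose humidity is not between 35.0 and 40.0 -SELECT * FROM table1 WHERE humidity NOT BETWEEN 35.0 AND 40.0; -``` - -### 1.3 IS NULL Operator - -1. The `IS NULL` and `IS NOT NULL` operators test whether a value is NULL. -2. Both operators apply to all data types. - -Example 1: Query records whose temperature is NULL - -```SQL -SELECT * FROM table1 WHERE temperature IS NULL; -``` - -Example 2: Query records whose humidity is not NULL - -```SQL -SELECT * FROM table1 WHERE humidity IS NOT NULL; -``` - -### 1.4 IN Operator - -1. The `IN` operator can be used in a `WHERE` clause to compare a column against a set of values. -2. These values can be given as a static array or as scalar expressions. - -**Syntax:** - -```SQL -... 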
WHERE column [NOT] IN ('value1','value2', expression1) -``` - -Example 1: Static array: query records whose region is '北京' or '上海' - -```SQL -SELECT * FROM table1 WHERE region IN ('北京', '上海'); --- Equivalent to -SELECT * FROM table1 WHERE region = '北京' OR region = '上海'; -``` - -Example 2: Scalar expressions: query records whose temperature is one of the given values - -```SQL -SELECT * FROM table1 WHERE temperature IN (85.0, 90.0); -``` - -Example 3: Query records whose region is neither '北京' nor '上海' - -```SQL -SELECT * FROM table1 WHERE region NOT IN ('北京', '上海'); -``` - -### 1.5 GREATEST and LEAST - -The `Greatest` function returns the largest value in its argument list and the `Least` function returns the smallest; the return type is the same as the input type. -1. NULL handling: returns NULL if all arguments are NULL. -2. Argument count: at least 2 arguments must be provided. -3. Type constraint: only arguments of the same data type can be compared. -4. Supported types: `BOOLEAN`, `FLOAT`, `DOUBLE`, `INT32`, `INT64`, `STRING`, `TEXT`, `TIMESTAMP`, `DATE` - -**Syntax:** - -```sql - greatest(value1, value2, ..., valueN) - least(value1, value2, ..., valueN) -``` - -**Example:** - -```sql --- For each row of table2, return the greater of temperature and humidity -SELECT GREATEST(temperature,humidity) FROM table2; - --- For each row of table2, return the lesser of temperature and humidity -SELECT LEAST(temperature,humidity) FROM table2; -``` - - -## 2. Aggregate Functions - -### 2.1 Overview - -1. Aggregate functions are many-to-one functions: they aggregate a set of values into a single result. -2. 
除了 `COUNT()`之外,其他所有聚合函数都忽略空值,并在没有输入行或所有值为空时返回空值。 例如,`SUM()` 返回 null 而不是零,而 `AVG()` 在计数中不包括 null 值。 - -### 2.2 支持的聚合函数 - -| 函数名 | 功能描述 | 允许的输入类型 | 输出类型 | -|-----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------| -| COUNT | 计算数据点数。 | 所有类型 | INT64 | -| COUNT_IF | COUNT_IF(exp) 用于统计满足指定布尔表达式的记录行数 | exp 必须是一个布尔类型的表达式,例如 count_if(temperature>20) | INT64 | -| APPROX_COUNT_DISTINCT | APPROX_COUNT_DISTINCT(x[,maxStandardError]) 函数提供 COUNT(DISTINCT x) 的近似值,返回不同输入值的近似个数。 | `x`:待计算列,支持所有类型;
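`maxStandardError`: the maximum standard error the function should produce, in the range [0.0040625, 0.26]; defaults to 0.023 when not specified. | INT64 | -| APPROX_MOST_FREQUENT | APPROX_MOST_FREQUENT(x, k, capacity) approximates the k most frequent elements in a data set. It returns a JSON-formatted string whose keys are the element values and whose values are the corresponding approximate frequencies. (Supported in V 2.0.5.1 and later) | `x`: the column to compute on; all data types currently available in IoTDB are supported;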
`k`: the number of most frequent values to return;
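`capacity`: the number of buckets used for the computation, which determines memory usage: a larger value gives a smaller error but uses more memory, and a smaller value gives a larger error but uses less memory. | STRING | -| APPROX_PERCENTILE | APPROX_PERCENTILE computes the value at a specified percentile of a data set, giving a quick view of the data distribution (for example the median or quartiles), and supports weighted percentile computation. If the percentile does not fall on an exact position, it returns the linear interpolation of the neighboring values at that position. Memory usage depends on the number of centroids; the maximum number of centroids can be limited with the compression parameter, and the error can be estimated with an empirical formula. Note: this function is supported from V2.0.9.1 onward. | Unweighted version: APPROX_PERCENTILE (x, percentage)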
x: the column to compute on; supports all numeric types, including INT32, INT64, FLOAT, DOUBLE, and TIMESTAMP;
percentage: the target percentile, of type DOUBLE.
Weighted version: APPROX_PERCENTILE (x, w, percentage)
x: the column to compute on; supports all numeric types, including INT32, INT64, FLOAT, DOUBLE, and TIMESTAMP;
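w: the weight column, of integer type (aligned in length with the column to compute on; a weight of Null or 0 means the row is ignored);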
percentage:目标分位数,DOUBLE 类型。 | 与待计算列 x 的类型相同 | -| SUM | 求和。 | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| AVG | 求平均值。 | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| MAX | 求最大值。 | 所有类型 | 与输入类型一致 | -| MIN | 求最小值。 | 所有类型 | 与输入类型一致 | -| FIRST | 求时间戳最小且不为 NULL 的值。 | 所有类型 | 与输入类型一致 | -| LAST | 求时间戳最大且不为 NULL 的值。 | 所有类型 | 与输入类型一致 | -| STDDEV | STDDEV_SAMP 的别名,求样本标准差。 | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| STDDEV_POP | 求总体标准差。 | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| STDDEV_SAMP | 求样本标准差。 | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VARIANCE | VAR_SAMP 的别名,求样本方差。 | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VAR_POP | 求总体方差。 | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| VAR_SAMP | 求样本方差。 | INT32 INT64 FLOAT DOUBLE | DOUBLE | -| EXTREME | 求具有最大绝对值的值。如果正值和负值的最大绝对值相等,则返回正值。 | INT32 INT64 FLOAT DOUBLE | 与输入类型一致 | -| MODE | 求众数。注意: 1.输入序列的不同值个数过多时会有内存异常风险; 2.如果所有元素出现的频次相同,即没有众数,则随机返回一个元素; 3.如果有多个众数,则随机返回一个众数; 4. NULL 值也会被统计频次,所以即使输入序列的值不全为 NULL,最终结果也可能为 NULL。 | 所有类型 | 与输入类型一致 | -| MAX_BY | MAX_BY(x, y) 求二元输入 x 和 y 在 y 最大时对应的 x 的值。MAX_BY(time, x) 返回 x 取最大值时对应的时间戳。 | x 和 y 可以是任意类型 | 与第一个输入 x 的数据类型一致 | -| MIN_BY | MIN_BY(x, y) 求二元输入 x 和 y 在 y 最小时对应的 x 的值。MIN_BY(time, x) 返回 x 取最小值时对应的时间戳。 | x 和 y 可以是任意类型 | 与第一个输入 x 的数据类型一致 | -| FIRST_BY | FIRST_BY(x, y) 求当 y 为第一个不为 NULL 的值时,同一行里对应的 x 值。 | x 和 y 可以是任意类型 | 与第一个输入 x 的数据类型一致 | -| LAST_BY | LAST_BY(x, y) 求当 y 为最后一个不为 NULL 的值时,同一行里对应的 x 值。 | x 和 y 可以是任意类型 | 与第一个输入 x 的数据类型一致 | - - -### 2.3 示例 - -#### 2.3.1 示例数据 - -在[示例数据页面](../Reference/Sample-Data.md)中,包含了用于构建表结构和插入数据的SQL语句,下载并在IoTDB CLI中执行这些语句,即可将数据导入IoTDB,您可以使用这些数据来测试和执行示例中的SQL语句,并获得相应的结果。 - -#### 2.3.2 Count - -统计的是整张表的行数和 `temperature` 列非 NULL 值的数量。 - -```SQL -IoTDB> select count(*), count(temperature) from table1; -``` - -执行结果如下: - -> 注意:只有COUNT函数可以与*一起使用,否则将抛出错误。 - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 18| 12| -+-----+-----+ -Total line number = 1 -It costs 0.834s -``` - - -#### 2.3.3 Count_if - -统计 `table2` 中 到达时间 `arrival_time` 不是 `null` 的记录行数。 - -```sql -IoTDB> select 
count_if(arrival_time is not null) from table2; -``` - -执行结果如下: - -```sql -+-----+ -|_col0| -+-----+ -| 4| -+-----+ -Total line number = 1 -It costs 0.047s -``` - -#### 2.3.4 Approx_count_distinct - -查询 `table1` 中 `temperature` 列不同值的个数。 - -```sql -IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature) as approx FROM table1; -IoTDB> SELECT COUNT(DISTINCT temperature) as origin, APPROX_COUNT_DISTINCT(temperature,0.006) as approx FROM table1; -``` - -执行结果如下: - -```sql -+------+------+ -|origin|approx| -+------+------+ -| 3| 3| -+------+------+ -Total line number = 1 -It costs 0.022s -``` - -#### 2.3.5 Approx_most_frequent - -查询 `table1` 中 `temperature` 列出现频次最高的2个值 - -```sql -IoTDB> select approx_most_frequent(temperature,2,100) as topk from table1; -``` - -执行结果如下: - -```sql -+-------------------+ -| topk| -+-------------------+ -|{"85.0":6,"90.0":5}| -+-------------------+ -Total line number = 1 -It costs 0.064s -``` - -#### 2.3.6 Approx_Percentile - -从`table1` 中,分别计算列 temperature 的90 分位数和列 humidity 的50 分位数(中位数),返回这两个近似百分位数值。 - -```SQL -SELECT APPROX_PERCENTILE(temperature,0.9), APPROX_PERCENTILE(humidity,0.5) FROM table1; -``` - -执行结果如下: - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 90.0| 35.2| -+-----+-----+ -Total line number = 1 -It costs 0.206s -``` - - -#### 2.3.7 First - -查询`temperature`列、`humidity`列时间戳最小且不为 NULL 的值。 - -```SQL -IoTDB> select first(temperature), first(humidity) from table1; -``` - -执行结果如下: - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 90.0| 35.1| -+-----+-----+ -Total line number = 1 -It costs 0.170s -``` - -#### 2.3.8 Last - -查询`temperature`列、`humidity`列时间戳最大且不为 NULL 的值。 - -```SQL -IoTDB> select last(temperature), last(humidity) from table1; -``` - -执行结果如下: - -```SQL -+-----+-----+ -|_col0|_col1| -+-----+-----+ -| 90.0| 34.8| -+-----+-----+ -Total line number = 1 -It costs 0.211s -``` - -#### 2.3.9 First_by - -查询 `temperature` 列中非 NULL 且时间戳最小的行的 `time` 值,以及 `temperature` 列中非 NULL 
且时间戳最小的行的 `humidity` 值。 - -```SQL -IoTDB> select first_by(time, temperature), first_by(humidity, temperature) from table1; -``` - -执行结果如下: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-26T13:37:00.000+08:00| 35.1| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.269s -``` - -#### 2.3.10 Last_by - -查询`temperature` 列中非 NULL 且时间戳最大的行的 `time` 值,以及 `temperature` 列中非 NULL 且时间戳最大的行的 `humidity` 值。 - -```SQL -IoTDB> select last_by(time, temperature), last_by(humidity, temperature) from table1; -``` - -执行结果如下: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-30T14:30:00.000+08:00| 34.8| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.070s -``` - -#### 2.3.11 Max_by - -查询`temperature` 列中最大值所在行的 `time` 值,以及`temperature` 列中最大值所在行的 `humidity` 值。 - -```SQL -IoTDB> select max_by(time, temperature), max_by(humidity, temperature) from table1; -``` - -执行结果如下: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-30T09:30:00.000+08:00| 35.2| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.172s -``` - -#### 2.3.12 Min_by - -查询`temperature` 列中最小值所在行的 `time` 值,以及`temperature` 列中最小值所在行的 `humidity` 值。 - -```SQL -select min_by(time, temperature), min_by(humidity, temperature) from table1; -``` - -执行结果如下: - -```SQL -+-----------------------------+-----+ -| _col0|_col1| -+-----------------------------+-----+ -|2024-11-29T10:00:00.000+08:00| null| -+-----------------------------+-----+ -Total line number = 1 -It costs 0.244s -``` - - -## 3. 
Logical Operators - -### 3.1 Overview - -Logical operators combine or negate conditions and return a Boolean result (`true` or `false`). - -The common logical operators and their descriptions are listed below: - -| Operator | Description | Example | -| ------ | ----------------------------- | ------- | -| AND | True only when both values are true | a AND b | -| OR | True when either value is true | a OR b | -| NOT | True when the value is false | NOT a | - -### 3.2 Effect of NULL on Logical Operators - -#### 3.2.1 AND Operator - -- If one or both sides of the expression are `NULL`, the result may be `NULL`. -- If one side of the `AND` operator is `FALSE`, the expression evaluates to `FALSE`. - -Example: - -```SQL -NULL AND true -- null -NULL AND false -- false -NULL AND NULL -- null -``` - -#### 3.2.2 OR Operator - -- If one or both sides of the expression are `NULL`, the result may be `NULL`. -- If one side of the `OR` operator is `TRUE`, the expression evaluates to `TRUE`. - -Example: - -```SQL -NULL OR NULL -- null -NULL OR false -- null -NULL OR true -- true -``` - -##### 3.2.2.1 Truth Table - -The following truth table shows how `NULL` is handled by the `AND` and `OR` operators: - -| a | b | a AND b | a OR b | -| ----- | ----- | ------- | ------ | -| TRUE | TRUE | TRUE | TRUE | -| TRUE | FALSE | FALSE | TRUE | -| TRUE | NULL | NULL | TRUE | -| FALSE | TRUE | FALSE | TRUE | -| FALSE | FALSE | FALSE | FALSE | -| FALSE | NULL | FALSE | NULL | -| NULL | TRUE | NULL | TRUE | -| NULL | FALSE | FALSE | NULL | -| NULL | NULL | NULL | NULL | - -#### 3.2.3 NOT Operator - -The logical negation of NULL is still NULL. - -Example: - -```SQL -NOT NULL -- null -``` - -##### 3.2.3.1 Truth Table - -The following truth table shows how `NULL` is handled by the `NOT` operator: - -| a | NOT a | -| ----- | ----- | -| TRUE | FALSE | -| FALSE | TRUE | -| NULL | NULL | - - -## 4. 
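Date and Time Functions and Operators - -### 4.1 now() -> Timestamp - -Returns the timestamp of the current time. - -### 4.2 date_bin(interval, Timestamp[, Timestamp]) -> Timestamp - -The `date_bin` function processes time data by rounding a timestamp (Timestamp) down to the boundary of a specified time interval (interval). - -**Syntax:** - -```SQL --- Count intervals starting from timestamp 0 and return the start of the interval that contains the given timestamp -date_bin(interval,source) - --- Count intervals starting from timestamp origin and return the start of the interval that contains the given timestamp -date_bin(interval,source,origin) - --- Time units supported by interval: --- year y, month mo, week week, day d, hour h, minute M, second s, millisecond ms, microsecond µs, nanosecond ns. --- source must be of timestamp type. -``` - -**Parameters:** - -| Parameter | Meaning | -| -------- | ------------------------------------------------------------ | -| interval | Time interval. Supported units: year y, month mo, week week, day d, hour h, minute M, second s, millisecond ms, microsecond µs, nanosecond ns. | -| source | The time column to compute on; it may also be an expression. Must be of timestamp type. | -| origin | The starting timestamp | - -#### 4.2.1 Syntax Conventions: - -1. When `origin` is not given, intervals are counted from 1970-01-01T00:00:00Z (1970-01-01 08:00:00 Beijing time). -2. `interval` must be a non-negative number with a time unit. When `interval` is 0ms, no computation is performed and `source` is returned as is. -3. A negative `origin` or `source` denotes a point in time before the epoch; `date_bin` computes normally and returns the interval start associated with that point. -4. If a value in `source` is `null`, `null` is returned. -5. Mixing month and non-month units, for example `1 MONTH 1 DAY`, is not supported because such an interval is ambiguous. - -> Suppose the computation starts from April 30, 2000. After one interval, adding DAY first and then MONTH gives June 1, 2000, whereas adding MONTH first and then DAY gives May 31, 2000, so the two orders yield different dates. - -#### 4.2.2 Examples - -##### Sample Data - -The [sample data page](../Reference/Sample-Data.md) provides SQL statements for creating the table schema and inserting data. Download them and run them in the IoTDB CLI to import the data into IoTDB; you can then use this data to run the SQL statements in the examples and obtain the corresponding results. - -Example 1: No origin timestamp specified - -```SQL -SELECT - time, - date_bin(1h,time) as time_bin -FROM - table1; -``` - -Result: - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:00:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:00:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:00:00.000+08:00| 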
-|2024-11-27T16:41:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T11:00:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:00:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T08:00:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T09:00:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:00:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:00:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.683s -``` - -示例 2:指定起始时间戳 - -```SQL -SELECT - time, - date_bin(1h, time, 2024-11-29T18:30:00.000) as time_bin -FROM - table1; -``` - -结果: - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:30:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T09:30:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:30:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T10:30:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:30:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T07:30:00.000+08:00| 
-|2024-11-28T09:00:00.000+08:00|2024-11-28T08:30:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T09:30:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T10:30:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:30:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:30:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.056s -``` - -Example 3: `origin` is negative (before the epoch) - -```SQL -SELECT - time, - date_bin(1h, time, 1969-12-31 00:00:00.000) as time_bin -FROM - table1; -``` - -Result: - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:00:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:00:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T11:00:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:00:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T08:00:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T09:00:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:00:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:00:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.203s -``` - -Example 4: `interval` is 0 - -```SQL 
-SELECT - time, - date_bin(0ms, time) as time_bin -FROM - table1; -``` - -Result: - -```Plain -+-----------------------------+-----------------------------+ -| time| time_bin| -+-----------------------------+-----------------------------+ -|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| -|2024-11-30T14:30:00.000+08:00|2024-11-30T14:30:00.000+08:00| -|2024-11-29T10:00:00.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:38:00.000+08:00|2024-11-27T16:38:00.000+08:00| -|2024-11-27T16:39:00.000+08:00|2024-11-27T16:39:00.000+08:00| -|2024-11-27T16:40:00.000+08:00|2024-11-27T16:40:00.000+08:00| -|2024-11-27T16:41:00.000+08:00|2024-11-27T16:41:00.000+08:00| -|2024-11-27T16:42:00.000+08:00|2024-11-27T16:42:00.000+08:00| -|2024-11-27T16:43:00.000+08:00|2024-11-27T16:43:00.000+08:00| -|2024-11-27T16:44:00.000+08:00|2024-11-27T16:44:00.000+08:00| -|2024-11-29T11:00:00.000+08:00|2024-11-29T11:00:00.000+08:00| -|2024-11-29T18:30:00.000+08:00|2024-11-29T18:30:00.000+08:00| -|2024-11-28T08:00:00.000+08:00|2024-11-28T08:00:00.000+08:00| -|2024-11-28T09:00:00.000+08:00|2024-11-28T09:00:00.000+08:00| -|2024-11-28T10:00:00.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:00.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:37:00.000+08:00| -|2024-11-26T13:38:00.000+08:00|2024-11-26T13:38:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.107s -``` - -Example 5: `source` is null - -```SQL -SELECT - arrival_time, - date_bin(1h,arrival_time) as time_bin -FROM - table1; -``` - -Result: - -```Plain -+-----------------------------+-----------------------------+ -| arrival_time| time_bin| -+-----------------------------+-----------------------------+ -| null| null| -|2024-11-30T14:30:17.000+08:00|2024-11-30T14:00:00.000+08:00| -|2024-11-29T10:00:13.000+08:00|2024-11-29T10:00:00.000+08:00| -|2024-11-27T16:37:01.000+08:00|2024-11-27T16:00:00.000+08:00| -| null| null| 
-|2024-11-27T16:37:03.000+08:00|2024-11-27T16:00:00.000+08:00| -|2024-11-27T16:37:04.000+08:00|2024-11-27T16:00:00.000+08:00| -| null| null| -| null| null| -|2024-11-27T16:37:08.000+08:00|2024-11-27T16:00:00.000+08:00| -| null| null| -|2024-11-29T18:30:15.000+08:00|2024-11-29T18:00:00.000+08:00| -|2024-11-28T08:00:09.000+08:00|2024-11-28T08:00:00.000+08:00| -| null| null| -|2024-11-28T10:00:11.000+08:00|2024-11-28T10:00:00.000+08:00| -|2024-11-28T11:00:12.000+08:00|2024-11-28T11:00:00.000+08:00| -|2024-11-26T13:37:34.000+08:00|2024-11-26T13:00:00.000+08:00| -|2024-11-26T13:38:25.000+08:00|2024-11-26T13:00:00.000+08:00| -+-----------------------------+-----------------------------+ -Total line number = 18 -It costs 0.319s -``` - -### 4.3 Extract Function - -This function extracts the value of the specified part of a date. (Supported since V2.0.6) - -#### 4.3.1 Syntax - -```SQL -EXTRACT (identifier FROM expression) -``` -* Parameters - * **expression**: a value of `TIMESTAMP` type or a time constant - * **identifier**: the accepted values and their corresponding return values are listed in the table below - - | Accepted value | Return type | Return value range | - | -------------------------- | ------------- | ------------- | - | `YEAR` | `INT64` | `/` | - | `QUARTER` | `INT64` | `1-4` | - | `MONTH` | `INT64` | `1-12` | - | `WEEK` | `INT64` | `1-53` | - | `DAY_OF_MONTH (DAY)` | `INT64` | `1-31` | - | `DAY_OF_WEEK (DOW)` | `INT64` | `1-7` | - | `DAY_OF_YEAR (DOY)` | `INT64` | `1-366` | - | `HOUR` | `INT64` | `0-23` | - | `MINUTE` | `INT64` | `0-59` | - | `SECOND` | `INT64` | `0-59` | - | `MS` | `INT64` | `0-999` | - | `US` | `INT64` | `0-999` | - | `NS` | `INT64` | `0-999` | - - -#### 4.3.2 Usage Example - -Using table1 from the [sample data](../Reference/Sample-Data.md) as the source, query the daily average temperature over a period, counting only readings taken at or before hour 12 (`extract(hour from time) <= 12`): - -```SQL -IoTDB:database1> select format('%1$tY-%1$tm-%1$td',date_bin(1d,time)) as fmtdate,avg(temperature) as avgtp from table1 where time >= 2024-11-26T00:00:00 and time <= 2024-11-30T23:59:59 and extract(hour from time) <= 12 group by date_bin(1d,time) order by date_bin(1d,time) -+----------+-----+ -| fmtdate|avgtp| -+----------+-----+ -|2024-11-28| 86.0| -|2024-11-29| 85.0| -|2024-11-30| 90.0| 
-+----------+-----+ -Total line number = 3 -It costs 0.041s -``` - -About the `Format` function: [Format function](../SQL-Manual/Basis-Function_timecho.md#_7-2-format-函数) - -About the `Date_bin` function: [Date_bin function](../SQL-Manual/Basis-Function_timecho.md#_4-2-date-bin-interval-timestamp-timestamp-timestamp) - - -## 5. Mathematical Functions and Operators - -### 5.1 Mathematical Operators - -| **Operator** | **Description** | -| ---------- | ------------------------ | -| + | Addition | -| - | Subtraction | -| * | Multiplication | -| / | Division (integer division truncates) | -| % | Modulus (remainder) | -| - | Negation | - -### 5.2 Mathematical Functions - -| Function | Description | Input | Output | Usage | -|-------------------|------------------------------|----------------------------------| ---------------------- | ---------- | -| sin | Sine | double, float, INT64, INT32 | double | sin(x) | -| cos | Cosine | double, float, INT64, INT32 | double | cos(x) | -| tan | Tangent | double, float, INT64, INT32 | double | tan(x) | -| asin | Arc sine | double, float, INT64, INT32 | double | asin(x) | -| acos | Arc cosine | double, float, INT64, INT32 | double | acos(x) | -| atan | Arc tangent | double, float, INT64, INT32 | double | atan(x) | -| sinh | Hyperbolic sine | double, float, INT64, INT32 | double | sinh(x) | -| cosh | Hyperbolic cosine | double, float, INT64, INT32 | double | cosh(x) | -| tanh | Hyperbolic tangent | double, float, INT64, INT32 | double | tanh(x) | -| degrees | Converts the angle x from radians to degrees | double, float, INT64, INT32 | double | degrees(x) | -| radians | Converts degrees to radians | double, float, INT64, INT32 | double | radians(x) | -| abs | Absolute value | double, float, INT64, INT32 | Same type as the input | abs(x) | -| sign | Returns the signum of x: 0 if the argument is 0, 1 if the argument is greater than 0, and -1 if the argument is less than 0. For double/float arguments the function additionally returns NaN if the argument is NaN, 1.0 if the argument is +Infinity, and -1.0 if the argument is -Infinity. | double, float, INT64, INT32 | Same type as the input | sign(x) | -| ceil | Returns x rounded up to the nearest integer. | double, float, INT64, INT32 | double | ceil(x) | -| floor | Returns x rounded down to the nearest integer. | double, float, INT64, INT32 | double | floor(x) | -| exp | Returns Euler's number e raised to the power of x. | double, float, INT64, INT32 | double | exp(x) | -| ln | 
Returns the natural logarithm of x. | double, float, INT64, INT32 | double | ln(x) | -| log10 | Returns the base-10 logarithm of x. | double, float, INT64, INT32 | double | log10(x) | -| round | Returns x rounded to the nearest integer. | double, float, INT64, INT32 | double | round(x) | -| round | Returns x rounded to d decimal places. | double, float, INT64, INT32 | double | round(x, d) | -| sqrt | Returns the square root of x. | double, float, INT64, INT32 | double | sqrt(x) | -| e | Euler's number e | | double | e() | -| pi | π | | double | pi() | - - -## 6. Bitwise Functions - -> Supported since V2.0.6 - -The raw data used in the examples is as follows: - -```SQL -IoTDB:database1> select * from bit_table -+-----------------------------+---------+------+-----+ -| time|device_id|length|width| -+-----------------------------+---------+------+-----+ -|2025-10-29T15:59:42.957+08:00| d1| 14| 12| -|2025-10-29T15:58:59.399+08:00| d3| 15| 10| -|2025-10-29T15:59:32.769+08:00| d2| 13| 12| -+-----------------------------+---------+------+-----+ - --- Table creation statement -CREATE TABLE bit_table(time TIMESTAMP TIME, device_id STRING TAG, length INT32 FIELD, width INT32 FIELD); - --- Data insertion statement -INSERT INTO bit_table values(2025-10-29 15:59:42.957, 'd1', 14, 12),(2025-10-29 15:58:59.399, 'd3', 15, 10),(2025-10-29 15:59:32.769, 'd2', 13, 12); -``` - -### 6.1 bit\_count(num, bits) - -The `bit_count(num, bits)` function counts the number of 1 bits in the binary representation of the integer `num` at the specified bit width `bits`. - -#### 6.1.1 Syntax - -```SQL -bit_count(num, bits) -> INT64 -- the return type is INT64 -``` - -* Parameters - * **num:** any integer value (int32 or int64) - * **bits:** an integer value in the range 2 to 64 - -Note: if `bits` is too small to represent `num` (in **signed two's complement**), an error is raised: `Argument exception, the scalar function num must be representable with the bits specified. 
[num] cannot be represented with [bits] bits.` - -* Invocation forms - * Two literal values: `bit_count(9, 64)` - * A column and a literal: `bit_count(column1, 64)` - * Two columns: `bit_count(column1, column2)` - -#### 6.1.2 Usage Examples - -```SQL --- Two literal values -IoTDB:database1> select distinct bit_count(2,8) from bit_table -+-----+ -|_col0| -+-----+ -| 1| -+-----+ --- Two literal values -IoTDB:database1> select distinct bit_count(-5,8) from bit_table -+-----+ -|_col0| -+-----+ -| 7| -+-----+ --- A column and a literal -IoTDB:database1> select length,bit_count(length,8) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 3| -| 15| 4| -| 13| 3| -+------+-----+ --- bits is too small -IoTDB:database1> select length,bit_count(length,2) from bit_table -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Argument exception, the scalar function num must be representable with the bits specified. 13 cannot be represented with 2 bits. -``` - -### 6.2 bitwise\_and(x, y) - -The `bitwise_and(x, y)` function performs a logical AND on each pair of bits of the integers x and y, using two's complement representation, and returns the bitwise AND result. - -#### 6.2.1 Syntax - -```SQL -bitwise_and(x, y) -> INT64 -- the return type is INT64 -``` - -* Parameters - * **x, y**: must be integer values of type Int32 or Int64 -* Invocation forms - * Two literal values: `bitwise_and(19, 25)` - * A column and a literal: `bitwise_and(column1, 25)` - * Two columns: `bitwise_and(column1, column2)` - -#### 6.2.2 Usage Examples - -```SQL --- Two literal values -IoTDB:database1> select distinct bitwise_and(19,25) from bit_table -+-----+ -|_col0| -+-----+ -| 17| -+-----+ --- A column and a literal -IoTDB:database1> select length, bitwise_and(length,25) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 8| -| 15| 9| -| 13| 9| -+------+-----+ --- Two columns -IoTDB:database1> select length, width, bitwise_and(length, width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 12| -| 15| 10| 10| -| 13| 12| 12| -+------+-----+-----+ -``` - -### 6.3 bitwise\_not(x) - -The `bitwise_not(x)` function performs a logical NOT on each bit of the integer x, using two's complement representation, and returns the bitwise NOT result. - -#### 6.3.1 Syntax - -```SQL -bitwise_not(x) -> INT64 -- the return type is INT64 -``` - -* Parameters - * **x**: must be an Int32 or Int64 
integer value -* Invocation forms - * A literal value: `bitwise_not(5)` - * A single column: `bitwise_not(column1)` - -#### 6.3.2 Usage Examples - -```SQL --- A literal value -IoTDB:database1> select distinct bitwise_not(5) from bit_table -+-----+ -|_col0| -+-----+ -| -6| -+-----+ --- A single column -IoTDB:database1> select length, bitwise_not(length) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| -15| -| 15| -16| -| 13| -14| -+------+-----+ -``` - -### 6.4 bitwise\_or(x, y) - -The `bitwise_or(x,y)` function performs a logical OR on each pair of bits of the integers x and y, using two's complement representation, and returns the bitwise OR result. - -#### 6.4.1 Syntax - -```SQL -bitwise_or(x, y) -> INT64 -- the return type is INT64 -``` - -* Parameters - * **x, y**: must be integer values of type Int32 or Int64 -* Invocation forms - * Two literal values: `bitwise_or(19, 25)` - * A column and a literal: `bitwise_or(column1, 25)` - * Two columns: `bitwise_or(column1, column2)` - -#### 6.4.2 Usage Examples - -```SQL --- Two literal values -IoTDB:database1> select distinct bitwise_or(19,25) from bit_table -+-----+ -|_col0| -+-----+ -| 27| -+-----+ --- A column and a literal -IoTDB:database1> select length,bitwise_or(length,25) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 31| -| 15| 31| -| 13| 29| -+------+-----+ --- Two columns -IoTDB:database1> select length, width, bitwise_or(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 14| -| 15| 10| 15| -| 13| 12| 13| -+------+-----+-----+ -``` - -### 6.5 bitwise\_xor(x, y) - -The `bitwise_xor(x,y)` function performs a logical XOR on each pair of bits of the integers x and y, using two's complement representation, and returns the bitwise XOR result. XOR rule: identical bits give 0, different bits give 1. - -#### 6.5.1 Syntax - -```SQL -bitwise_xor(x, y) -> INT64 -- the return type is INT64 -``` - -* Parameters - * **x, y**: must be integer values of type Int32 or Int64 -* Invocation forms - * Two literal values: `bitwise_xor(19, 25)` - * A column and a literal: `bitwise_xor(column1, 25)` - * Two columns: `bitwise_xor(column1, column2)` - -#### 6.5.2 Usage Examples - -```SQL --- Two literal values -IoTDB:database1> select distinct bitwise_xor(19,25) from bit_table -+-----+ -|_col0| -+-----+ -| 10| -+-----+ --- A column and a literal -IoTDB:database1> select length,bitwise_xor(length,25) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 23| -| 15| 22| -| 13| 
20| -+------+-----+ --- Two columns -IoTDB:database1> select length, width, bitwise_xor(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 2| -| 15| 10| 5| -| 13| 12| 1| -+------+-----+-----+ -``` - -### 6.6 bitwise\_left\_shift(value, shift) - -The `bitwise_left_shift(value, shift)` function returns the result of shifting the binary representation of the integer `value` left by `shift` bits. A left shift moves the bits toward the high-order end; the vacated bits on the right are filled with 0 and the bits that overflow on the left are discarded. It is equivalent to `value << shift`. - -#### 6.6.1 Syntax - -```SQL -bitwise_left_shift(value, shift) -> [same as value] -- the return type is the same as the type of value -``` - -* Parameters - * **value**: the integer value to shift left; must be of type Int32 or Int64 - * **shift**: the number of bits to shift left; must be of type Int32 or Int64 -* Invocation forms - * Two literal values: `bitwise_left_shift(1, 2)` - * A column and a literal: `bitwise_left_shift(column1, 2)` - * Two columns: `bitwise_left_shift(column1, column2)` - -#### 6.6.2 Usage Examples - -```SQL --- Two literal values -IoTDB:database1> select distinct bitwise_left_shift(1,2) from bit_table -+-----+ -|_col0| -+-----+ -| 4| -+-----+ --- A column and a literal -IoTDB:database1> select length, bitwise_left_shift(length,2) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 56| -| 15| 60| -| 13| 52| -+------+-----+ --- Two columns -IoTDB:database1> select length, width, bitwise_left_shift(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 0| -| 15| 10| 0| -| 13| 12| 0| -+------+-----+-----+ -``` - -### 6.7 bitwise\_right\_shift(value, shift) - -The `bitwise_right_shift(value, shift)` function returns the result of a logical (unsigned) right shift of the binary representation of the integer `value` by `shift` bits. A logical right shift moves the bits toward the low-order end; the vacated high-order bits on the left are filled with 0 and the low-order bits that overflow on the right are discarded. - -#### 6.7.1 Syntax - -```SQL -bitwise_right_shift(value, shift) -> [same as value] -- the return type is the same as the type of value -``` - -* Parameters - * **value**: the integer value to shift right; must be of type Int32 or Int64 - * **shift**: the number of bits to shift right; must be of type Int32 or Int64 -* Invocation forms - * Two literal values: `bitwise_right_shift(8, 3)` - * A column and a literal: `bitwise_right_shift(column1, 3)` - * Two columns: `bitwise_right_shift(column1, column2)` - -#### 6.7.2 Usage Examples - -```SQL --- Two literal values -IoTDB:database1> select distinct bitwise_right_shift(8,3) from bit_table -+-----+ 
-|_col0| -+-----+ -| 1| -+-----+ --- A column and a literal -IoTDB:database1> select length, bitwise_right_shift(length,3) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 1| -| 15| 1| -| 13| 1| -+------+-----+ --- Two columns -IoTDB:database1> select length, width, bitwise_right_shift(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 0| -| 15| 10| 0| -| 13| 12| 0| -+------+-----+-----+ -``` - -### 6.8 bitwise\_right\_shift\_arithmetic(value, shift) - -The `bitwise_right_shift_arithmetic(value, shift)` function returns the result of an arithmetic right shift of the binary representation of the integer `value` by `shift` bits. An arithmetic right shift moves the bits toward the low-order end; the low-order bits that overflow on the right are discarded and the vacated high-order bits on the left are filled with the sign bit (0 for positive numbers, 1 for negative numbers), so the sign of the value is preserved. - -#### 6.8.1 Syntax - -```SQL -bitwise_right_shift_arithmetic(value, shift) -> [same as value] -- the return type is the same as the type of value -``` - -* Parameters - * **value**: the integer value to shift right; must be of type Int32 or Int64 - * **shift**: the number of bits to shift right; must be of type Int32 or Int64 -* Invocation forms - * Two literal values: `bitwise_right_shift_arithmetic(12, 2)` - * A column and a literal: `bitwise_right_shift_arithmetic(column1, 64)` - * Two columns: `bitwise_right_shift_arithmetic(column1, column2)` - -#### 6.8.2 Usage Examples - -```SQL --- Two literal values -IoTDB:database1> select distinct bitwise_right_shift_arithmetic(12,2) from bit_table -+-----+ -|_col0| -+-----+ -| 3| -+-----+ --- A column and a literal -IoTDB:database1> select length, bitwise_right_shift_arithmetic(length,3) from bit_table -+------+-----+ -|length|_col1| -+------+-----+ -| 14| 1| -| 15| 1| -| 13| 1| -+------+-----+ --- Two columns -IoTDB:database1> select length, width, bitwise_right_shift_arithmetic(length,width) from bit_table -+------+-----+-----+ -|length|width|_col2| -+------+-----+-----+ -| 14| 12| 0| -| 15| 10| 0| -| 13| 12| 0| -+------+-----+-----+ -``` - -## 7. 
Binary Functions - -> Supported since V2.0.9.1 - -### 7.1 Base64 Encoding Functions - -| Function | Description | Input type | Output type | -| ----------------------------- | ------------------------------------------------------------------------- | ------------------ | -------------- | -| `to_base64(input)` | Encodes the input as a standard Base64 string, addressing compatibility issues when transmitting or storing binary data | STRING/TEXT/BLOB | STRING | -| `from_base64(input)` | Decodes a standard Base64 string into the original binary data; the inverse of to\_base64 | STRING/TEXT | BLOB | -| `to_base64url(input)` | Encodes the input as a URL-safe Base64URL string, replacing `+` and `/` with `-` and `_` and omitting padding | STRING/TEXT/BLOB | STRING | -| `from_base64url(input)` | Decodes a Base64URL string into the original binary data; the inverse of to\_base64url | STRING/TEXT | BLOB | -| `to_base32(input)` | Encodes the input as a Base32 string; the alphabet avoids easily confused characters, is case-insensitive, and is highly readable | STRING/TEXT/BLOB | STRING | -| `from_base32(input)` | Decodes a Base32 string into the original binary data; the inverse of to\_base32 | STRING/TEXT | BLOB | - -**Usage Examples** - -1. to\_base64: encode a string as standard Base64 - -```SQL -SELECT DISTINCT to_base64('IoTDB二进制测试') FROM table1; -``` - -```Bash -+----------------------------+ -| _col0| -+----------------------------+ -|SW9URELkuozov5vliLbmtYvor5U=| -+----------------------------+ -``` - -2. from\_base64: decode a Base64 string into binary - -```SQL -SELECT DISTINCT from_base64('SW9URELkuozov5vliLbmtYvor5U=') FROM table1; -``` - -```Bash -+------------------------------------------+ -| _col0| -+------------------------------------------+ -|0x496f544442e4ba8ce8bf9be588b6e6b58be8af95| -+------------------------------------------+ -``` - -3. to\_base64url: encode as URL-safe Base64URL (no `+` or `/`, no `=` padding) - -```SQL -SELECT DISTINCT to_base64url('https://iotdb.apache.org') FROM table1; -``` - -```Bash -+--------------------------------+ -| _col0| -+--------------------------------+ -|aHR0cHM6Ly9pb3RkYi5hcGFjaGUub3Jn| -+--------------------------------+ -``` - -4. 
from\_base64url: decode a Base64URL string - -```SQL -SELECT DISTINCT from_base64url('aHR0cHM6Ly9pb3RkYi5hcGFjaGUub3Jn') FROM table1; -``` - -```Bash -+--------------------------------------------------+ -| _col0| -+--------------------------------------------------+ -|0x68747470733a2f2f696f7464622e6170616368652e6f7267| -+--------------------------------------------------+ -``` - -5. to\_base32: encode as a Base32 string - -```SQL -SELECT DISTINCT to_base32('123456') FROM table1; -``` - -```Bash -+----------------+ -| _col0| -+----------------+ -|GEZDGNBVGY======| -+----------------+ -``` - -6. from\_base32: decode a Base32 string - -```SQL -SELECT DISTINCT from_base32('GEZDGNBVGY======') FROM table1; -``` - -```Bash -+--------------+ -| _col0| -+--------------+ -|0x313233343536| -+--------------+ -``` - -### 7.2 Hexadecimal Encoding Functions - -| Function | Description | Input type | Output type | -| ----------------------- | -------------------------------------------------------------- | ------------------ | -------------- | -| `TO_HEX(input)` | Converts the input to a hexadecimal string that directly reflects the underlying byte values, which is convenient for debugging | STRING/TEXT/BLOB | STRING | -| `FROM_HEX(input)` | Decodes a hexadecimal string into the original binary data; the inverse of TO\_HEX | STRING/TEXT | BLOB | - -**Usage Examples** - -1. TO\_HEX: convert a string or binary value to hexadecimal - -```SQL -SELECT DISTINCT TO_HEX('test') FROM table1; -``` - -```Bash -+--------+ -| _col0| -+--------+ -|74657374| -+--------+ -``` - -2. 
FROM\_HEX: decode a hexadecimal string into binary - -```SQL -SELECT DISTINCT FROM_HEX('74657374') FROM table1; -``` - -```Bash -+----------+ -| _col0| -+----------+ -|0x74657374| -+----------+ -``` - -### 7.3 Basic Binary Functions - -| Function | Description | Input type | Output type | -| -------------------------------------- | ------------------------------------------------------------------------------------------- | ------------------------- | ---------------- | -| `length(input)` | Returns the length of the input: the number of characters for text types, the number of bytes for BLOB, and the binary size in bytes of the object for OBJECT | STRING/TEXT/BLOB/OBJECT | INT32 | -| `REVERSE(input)`| Reverses the order of the input: characters for text types, bytes for BLOB | STRING/TEXT/BLOB | Same as the input type | -| `LPAD(input, length, pad_bytes)` | Left-pads or truncates a BLOB at the byte level so that the final byte length equals the specified value | BLOB, INT32/INT64, BLOB | BLOB | -| `RPAD(input, length, pad_bytes)` | Right-pads or truncates a BLOB at the byte level so that the final byte length equals the specified value | BLOB, INT32/INT64, BLOB | BLOB | - -**Usage Examples** - -1. length: get the length of the data - -```SQL -SELECT DISTINCT length('IoTDB') FROM table1; -``` - -```Bash -+-----+ -|_col0| -+-----+ -| 5| -+-----+ -``` - -2. REVERSE: reverse the data - -```SQL -SELECT DISTINCT REVERSE('12345') FROM table1; -``` - -```Bash -+-----+ -|_col0| -+-----+ -|54321| -+-----+ -``` - -3. LPAD: left-pad or truncate a BLOB (arguments: original BLOB, target length, padding bytes) - -```SQL -SELECT DISTINCT LPAD(FROM_HEX('74657374'),5, FROM_HEX('74657374')) FROM table1; -``` - -```Bash -+------------+ -| _col0| -+------------+ -|0x7474657374| -+------------+ -``` - -4. 
RPAD: right-pad or truncate a BLOB - -```SQL -SELECT DISTINCT RPAD(FROM_HEX('74657374'),5, FROM_HEX('74657374')) FROM table1; -``` - -```Bash -+------------+ -| _col0| -+------------+ -|0x7465737474| -+------------+ -``` - -### 7.4 Integer Encoding Functions - -| Function | Description | Input type | Output type | -| ------------------------------------ | ------------------------------------------------------------------ | -------------- | -------------- | -| `to_big_endian_32(input)` | Converts an INT32 integer to a 4-byte big-endian BLOB, matching network byte order | INT32 | BLOB | -| `to_big_endian_64(input)` | Converts an INT64 integer to an 8-byte big-endian BLOB, matching network byte order | INT64 | BLOB | -| `from_big_endian_32(input)` | Decodes a 4-byte big-endian BLOB into an INT32 integer; the inverse of to\_big\_endian\_32 | BLOB | INT32 | -| `from_big_endian_64(input)` | Decodes an 8-byte big-endian BLOB into an INT64 integer; the inverse of to\_big\_endian\_64 | BLOB | INT64 | -| `to_little_endian_32(input)` | Converts an INT32 integer to a 4-byte little-endian BLOB, matching mainstream architectures such as x86 | INT32 | BLOB | -| `to_little_endian_64(input)` | Converts an INT64 integer to an 8-byte little-endian BLOB, matching mainstream architectures such as x86 | INT64 | BLOB | -| `from_little_endian_32(input)` | Decodes a 4-byte little-endian BLOB into an INT32 integer; the inverse of to\_little\_endian\_32 | BLOB | INT32 | -| `from_little_endian_64(input)` | Decodes an 8-byte little-endian BLOB into an INT64 integer; the inverse of to\_little\_endian\_64 | BLOB | INT64 | - -**Usage Examples** - -1. Big-endian encoding and decoding - -```SQL -SELECT DISTINCT TO_HEX(to_big_endian_32(12345)) FROM table1; -``` - -```Bash -+--------+ -| _col0| -+--------+ -|00003039| -+--------+ -``` - -```SQL -SELECT DISTINCT from_big_endian_32(FROM_HEX('00003039')) FROM table1; -``` - -```Bash -+-----+ -|_col0| -+-----+ -|12345| -+-----+ -``` - -```SQL -SELECT DISTINCT TO_HEX(to_big_endian_64(1234567890123)) FROM table1; -``` - -```Bash -+----------------+ -| _col0| -+----------------+ -|0000011f71fb04cb| -+----------------+ -``` - -```SQL -SELECT DISTINCT from_big_endian_64(FROM_HEX('0000011f71fb04cb')) FROM table1; -``` - -```Bash -+-------------+ -| _col0| -+-------------+ -|1234567890123| -+-------------+ -``` - -2. 
Little-endian encoding and decoding - -```SQL -SELECT DISTINCT TO_HEX(to_little_endian_32(12345)) FROM table1; -``` - -```Bash -+--------+ -| _col0| -+--------+ -|39300000| -+--------+ -``` - -```SQL -SELECT DISTINCT from_little_endian_32(FROM_HEX('39300000')) FROM table1; -``` - -```Bash -+-----+ -|_col0| -+-----+ -|12345| -+-----+ -``` - -```SQL -SELECT DISTINCT TO_HEX(to_little_endian_64(1234567890123)) FROM table1; -``` - -```Bash -+----------------+ -| _col0| -+----------------+ -|cb04fb711f010000| -+----------------+ -``` - -```SQL -SELECT DISTINCT from_little_endian_64(FROM_HEX('cb04fb711f010000')) FROM table1; -``` - -```Bash -+-------------+ -| _col0| -+-------------+ -|1234567890123| -+-------------+ -``` - -### 7.5 Floating-Point Encoding Functions - -| Function | Description | Input type | Output type | -| ------------------------------ | ------------------------------------------------------------------- | -------------- | -------------- | -| `to_ieee754_32(input)` | Converts a single-precision FLOAT to a 4-byte big-endian IEEE 754 BLOB | FLOAT | BLOB | -| `to_ieee754_64(input)` | Converts a double-precision DOUBLE to an 8-byte big-endian IEEE 754 BLOB | DOUBLE | BLOB | -| `from_ieee754_32(input)` | Decodes a 4-byte IEEE 754 BLOB into a FLOAT; the inverse of to\_ieee754\_32 | BLOB | FLOAT | -| `from_ieee754_64(input)` | Decodes an 8-byte IEEE 754 BLOB into a DOUBLE; the inverse of to\_ieee754\_64 | BLOB | DOUBLE | - -**Usage Examples** - -1. Single-precision floating point (FLOAT) encoding and decoding - -```SQL -SELECT DISTINCT TO_HEX(to_ieee754_32(temperature)) FROM table1 where time = 2024-11-26 13:37:00; -``` - -```Bash -+--------+ -| _col0| -+--------+ -|42b40000| -+--------+ -``` - -```SQL -SELECT DISTINCT from_ieee754_32(FROM_HEX('42b40000')) FROM table1; -``` - -```Bash -+-----+ -|_col0| -+-----+ -| 90.0| -+-----+ -``` - -2. 
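Double-precision floating point (DOUBLE) encoding and decoding - -```SQL -SELECT DISTINCT TO_HEX(to_ieee754_64(3.1415926535)) FROM table1; -``` - -```Bash -+----------------+ -| _col0| -+----------------+ -|400921fb54411744| -+----------------+ -``` - -```SQL -SELECT DISTINCT from_ieee754_64(FROM_HEX('400921fb54411744')) FROM table1; -``` - -```Bash -+------------+ -| _col0| -+------------+ -|3.1415926535| -+------------+ -``` - -### 7.6 Hash Functions - -| Function | Description | Input type | Output type | -| -------------------------------- | ---------------------------------------------------------------- | -------------------- | -------------- | -| `sha256(input)` | Computes the SHA-256 cryptographic hash of the input; irreversible and collision-resistant | STRING, TEXT, BLOB | BLOB (32 bytes) | -| `SHA512(input)` | Computes the SHA-512 cryptographic hash of the input; stronger than SHA-256 | STRING, TEXT, BLOB | BLOB (64 bytes) | -| `SHA1(input)` | Computes the SHA-1 hash of the input; weak collision resistance, not recommended for security-sensitive use | STRING, TEXT, BLOB | BLOB (20 bytes) | -| `MD5(input)` | Computes the MD5 hash of the input; not cryptographically secure, suitable only for non-cryptographic checksums | STRING, TEXT, BLOB | BLOB (16 bytes) | -| `CRC32(input)` | Computes the CRC32 cyclic redundancy check of the input; efficiently detects non-malicious data errors | STRING, TEXT, BLOB | INT64 | -| `spooky_hash_v2_32(input)` | Computes the 32-bit SpookyHashV2 non-cryptographic hash of the input; high performance, low collision rate | STRING, TEXT, BLOB | BLOB (4 bytes) | -| `spooky_hash_v2_64(input)` | Computes the 64-bit SpookyHashV2 non-cryptographic hash of the input; high performance, low collision rate | STRING, TEXT, BLOB | BLOB (8 bytes) | -| `xxhash64(input)` | Computes the 64-bit xxHash non-cryptographic hash of the input; extremely fast | STRING, TEXT, BLOB | BLOB (8 bytes) | -| `murmur3(input)` | Computes the 128-bit MurmurHash3 non-cryptographic hash of the input; evenly distributed and widely used | STRING, TEXT, BLOB | BLOB (16 bytes) | - -**Usage Examples** - -1. 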
Cryptographic hash functions - -```SQL -SELECT DISTINCT TO_HEX(sha256('test')) FROM table1; -``` - -```Bash -+----------------------------------------------------------------+ -| _col0| -+----------------------------------------------------------------+ -|9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08| -+----------------------------------------------------------------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(SHA512('test')) FROM table1; -``` - -```Bash -+--------------------------------------------------------------------------------------------------------------------------------+ -| _col0| -+--------------------------------------------------------------------------------------------------------------------------------+ -|ee26b0dd4af7e749aa1a8ee3c10ae9923f618980772e473f8819a5d4940e0db27ac185f8a0e1d5f84f88bc887fd67b143732c304cc5fa9ad8e6f57f50028a8ff| -+--------------------------------------------------------------------------------------------------------------------------------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(SHA1('test')) FROM table1; -``` - -```Bash -+----------------------------------------+ -| _col0| -+----------------------------------------+ -|a94a8fe5ccb19ba61c4c0873d391e987982fbbd3| -+----------------------------------------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(MD5('test')) FROM table1; -``` - -```Bash -+--------------------------------+ -| _col0| -+--------------------------------+ -|098f6bcd4621d373cade4e832627b4f6| -+--------------------------------+ -``` - -2. 
Checksum and non-cryptographic hash functions - -```SQL -SELECT DISTINCT CRC32('test') FROM table1; -``` - -```Bash -+----------+ -| _col0| -+----------+ -|3632233996| -+----------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(spooky_hash_v2_32('test')) FROM table1; -``` - -```Bash -+--------+ -| _col0| -+--------+ -|ec0d8b75| -+--------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(spooky_hash_v2_64('test')) FROM table1; -``` - -```Bash -+----------------+ -| _col0| -+----------------+ -|7b01e8bcec0d8b75| -+----------------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(xxhash64('test')) FROM table1; -``` - -```Bash -+----------------+ -| _col0| -+----------------+ -|4fdcca5ddb678139| -+----------------+ -``` - -```SQL -SELECT DISTINCT TO_HEX(murmur3('test')) FROM table1; -``` - -```Bash -+--------------------------------+ -| _col0| -+--------------------------------+ -|9de1bd74cc287dac824dbdf93182129a| -+--------------------------------+ -``` - -### 7.7 HMAC Functions - -| Function | Description | Input type | Output type | -| ------------------------------ | ------------------------------------------------------------------- | -------------------------------------------- | -------------- | -| `hmac_md5(data, key)` | Computes an HMAC message authentication code from MD5 and a key to verify data integrity and origin; intended for legacy systems | data: STRING/TEXT/BLOB; key: STRING/TEXT | BLOB (16 bytes) | -| `hmac_sha1(data, key)` | Computes an HMAC message authentication code from SHA-1 and a key to verify data integrity and origin | data: STRING/TEXT/BLOB; key: STRING/TEXT | BLOB (20 bytes) | -| `hmac_sha256(data, key)` | Computes an HMAC message authentication code from SHA-256 and a key; the industry-recommended standard with high security strength | data: STRING/TEXT/BLOB; key: STRING/TEXT | BLOB (32 bytes) | -| `hmac_sha512(data, key)` | Computes an HMAC message authentication code from SHA-512 and a key; the highest security strength of the four | data: STRING/TEXT/BLOB; key: STRING/TEXT | BLOB (64 bytes) | - -**Usage Examples** - -* Shared key: 'iotdb\_secret\_key' -* Data to authenticate: 'user\_data\_123' - -1. hmac\_md5 - -```SQL -SELECT DISTINCT TO_HEX(hmac_md5('user_data_123', 'iotdb_secret_key')) FROM table1; -``` - -```Bash -+--------------------------------+ -| _col0| -+--------------------------------+ -|8ee863080ceb3b43b5ffdc7a937e7f28| -+--------------------------------+ -``` - -2. 
hmac\_sha1 - -```SQL -SELECT DISTINCT TO_HEX(hmac_sha1('user_data_123', 'iotdb_secret_key')) FROM table1; -``` - -```Bash -+----------------------------------------+ -| _col0| -+----------------------------------------+ -|b5b7ae1a495745299ec3bd236c511c13540481ce| -+----------------------------------------+ -``` - -3. hmac\_sha256 (recommended) - -```SQL -SELECT DISTINCT TO_HEX(hmac_sha256('user_data_123', 'iotdb_secret_key')) FROM table1; -``` - -```Bash -+----------------------------------------------------------------+ -| _col0| -+----------------------------------------------------------------+ -|73b6f26bbcb5192dbe2cb83745b0fc48c63418fa674b0bf62fabe7f8747f3afd| -+----------------------------------------------------------------+ -``` - -4. hmac\_sha512 - -```SQL -SELECT DISTINCT TO_HEX(hmac_sha512('user_data_123', 'iotdb_secret_key')) FROM table1; -``` - -```Bash -+--------------------------------------------------------------------------------------------------------------------------------+ -| _col0| -+--------------------------------------------------------------------------------------------------------------------------------+ -|2fed4ec5a0535e3349798b371d6525255ee85d9eae0ddcbdecf89db84f943151f5febf0ffd9c01ae9661278504aba186cf6f732ae5f42d63be58aadee2baccc2| -+--------------------------------------------------------------------------------------------------------------------------------+ -``` - - - -## 8. Conditional Expressions - -### 8.1 CASE Expression - -The CASE expression has two forms: the simple form and the searched form. - -#### 8.1.1 Simple Form - -The simple form checks each value expression from left to right until it finds one that equals the expression: - -```SQL -CASE expression - WHEN value THEN result - [ WHEN ... ] - [ ELSE result ] -END -``` - -If a matching value is found, the corresponding result is returned. If no match is found, the result of the ELSE clause is returned if it exists; otherwise null is returned. For example: - -```SQL -SELECT a, - CASE a - WHEN 1 THEN 'one' - WHEN 2 THEN 'two' - ELSE 'many' - END -``` - -#### 8.1.2 Searched Form - -The searched form evaluates each boolean condition from left to right until one is true, and returns the corresponding result: - -```SQL -CASE - WHEN condition THEN result - [ WHEN ... 
] - [ ELSE result ] -END -``` - -If no condition is true, the result of the ELSE clause is returned if it exists; otherwise null is returned. For example: - -```SQL -SELECT a, b, - CASE - WHEN a = 1 THEN 'aaa' - WHEN b = 2 THEN 'bbb' - ELSE 'ccc' - END -``` - -### 8.2 COALESCE Function - -Returns the first non-null value in the argument list. - -```SQL -coalesce(value1, value2[, ...]) -``` - -### 8.3 IF Expression - -The IF expression has two forms: one specifies only the true value (true\_value), and the other specifies both the true value and the false value (false\_value). - -| Form | Description | Output type restriction | -| ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- | -| `if(condition, true_value)` | If the condition is true, `true_value` is evaluated and returned; otherwise `null` is returned and `true_value` is not evaluated. | | -| `if(condition, true_value, false_value)` | If the condition is true, `true_value` is evaluated and returned; otherwise `false_value` is evaluated and returned. | The data types of `true_value` and `false_value` **must be exactly the same**; implicit type conversion is not supported. | - -> Supported since V2.0.9.1 - -**Examples:** - -1. Equivalent IF and CASE expressions: - -```SQL --- IF form -SELECT - device_id, - temperature, - IF(temperature > 85, 'High Value', 'Low Value') -FROM table1; - --- Equivalent CASE form -SELECT - device_id, - temperature, - CASE - WHEN temperature > 85 THEN 'High Value' - ELSE 'Low Value' - END -FROM table1; -``` - -2. Output type restriction: - -```SQL --- Succeeds --- temperature (float) and humidity (float) have the same type -select if(temperature > 85, temperature, humidity) from table1 - --- Fails --- temperature (float) and status (boolean) have different types -select if(temperature > 85, temperature, status) from table1 -``` - - - -## 9. Conversion Functions - -### 9.1 Conversion Functions - -#### 9.1.1 cast(value AS type) → type - -1. Explicitly converts a value to the specified type. -2. Can be used to convert a string (varchar) to a numeric type or a numeric value to a string type. Casting the OBJECT type to STRING is supported since V2.0.8. -3. If the conversion fails, a runtime error is thrown. - -Example: - -```SQL -SELECT * - FROM table1 - WHERE CAST(time AS DATE) - IN (CAST('2024-11-27' AS DATE), CAST('2024-11-28' AS DATE)); -``` - -#### 9.1.2 try_cast(value AS type) → type - -1. Similar to `cast()`. -2. 
If the conversion fails, `null` is returned. - -Example: - -```SQL -SELECT * - FROM table1 - WHERE try_cast(time AS DATE) - IN (try_cast('2024-11-27' AS DATE), try_cast('2024-11-28' AS DATE)); -``` - -### 9.2 Format Function -This function builds and returns a formatted string from a format string and its input arguments. It behaves like `String.format` in Java and `printf` in C: each format specifier in the pattern is replaced by the corresponding argument value. - -#### 9.2.1 Syntax - -```SQL -format(pattern,...args) -> String -``` - -**Parameters** - -* `pattern`: the format string. It may contain static text and one or more format specifiers (such as `%s`, `%d`), or be any expression whose return type is `STRING/TEXT`. -* `args`: the input arguments that replace the format specifiers. They must satisfy the following: - * Number of arguments ≥ 1 - * Multiple arguments are separated by commas `,` (for example `arg1,arg2`) - * There may be more arguments than placeholders in `pattern`, but not fewer; otherwise an exception is raised - -**Return value** - -* The formatted result, of type `STRING` - -#### 9.2.2 Examples - -1. Formatting a floating-point number - -```SQL -IoTDB:database1> select format('%.5f',humidity) from table1 where humidity = 35.4 -+--------+ -| _col0| -+--------+ -|35.40000| -+--------+ -``` - -2. Formatting an integer - -```SQL -IoTDB:database1> select format('%03d',8) from table1 limit 1 -+-----+ -|_col0| -+-----+ -| 008| -+-----+ -``` - -3. 
Formatting dates and timestamps - -* Locale-specific date - -```SQL -IoTDB:database1> SELECT format('%1$tA, %1$tB %1$te, %1$tY', 2024-01-01) from table1 limit 1 -+--------------------+ -| _col0| -+--------------------+ -|星期一, 一月 1, 2024| -+--------------------+ -``` - -* Removing time zone information - -```SQL -IoTDB:database1> SELECT format('%1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS.%1$tL', 2024-01-01T00:00:00.000+08:00) from table1 limit 1 -+-----------------------+ -| _col0| -+-----------------------+ -|2024-01-01 00:00:00.000| -+-----------------------+ -``` - -* Truncating a timestamp to second precision - -```SQL -IoTDB:database1> SELECT format('%1$tF %1$tT', 2024-01-01T00:00:00.000+08:00) from table1 limit 1 -+-------------------+ -| _col0| -+-------------------+ -|2024-01-01 00:00:00| -+-------------------+ -``` - -* The conversion characters used for formatting times are as follows - -| **Symbol** | **Description** | -| ---------------- | ---------------------------------------------------------------- | -| 'H' | Hour of the day on the 24-hour clock, formatted as two digits with a leading zero as necessary, i.e. 00 - 23. | -| 'I' | Hour on the 12-hour clock, formatted as two digits with a leading zero as necessary, i.e. 01 - 12. | -| 'k' | Hour of the day on the 24-hour clock, i.e. 0 - 23. | -| 'l' | Hour on the 12-hour clock, i.e. 1 - 12. | -| 'M' | Minute within the hour, formatted as two digits with a leading zero as necessary, i.e. 00 - 59. | -| 'S' | Seconds within the minute, formatted as two digits with a leading zero as necessary, i.e. 00 - 60 ("60" is a special value required to support leap seconds). | -| 'L' | Millisecond within the second, formatted as three digits with leading zeros as necessary, i.e. 000 - 999. | -| 'N' | Nanosecond within the second, formatted as nine digits with leading zeros as necessary, i.e. 
000000000 - 999999999. | -| 'p' | Locale-specific [morning or afternoon](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/text/DateFormatSymbols.html#getAmPmStrings()) marker in lower case, e.g. "am" or "pm". Using the conversion prefix 'T' forces this output to upper case. | -| 'z' | [RFC 822](http://www.ietf.org/rfc/rfc0822.txt) style numeric time zone offset from GMT, e.g. -0800. This value is adjusted as necessary for daylight saving time. For long, [Long](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/lang/Long.html), and [Date](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/util/Date.html), the time zone used is the [default time zone](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/util/TimeZone.html#getDefault()) of this instance of the Java virtual machine. | -| 'Z' | A string representing the abbreviation of the time zone. This value is adjusted as necessary for daylight saving time. For long, [Long](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/lang/Long.html), and [Date](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/util/Date.html), the time zone used is the [default time zone](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/util/TimeZone.html#getDefault()) of this instance of the Java virtual machine. The Formatter's locale supersedes the locale of the argument (if any). | -| 's' | Seconds since the beginning of the epoch starting at 1 January 1970 00:00:00 UTC, i.e. Long.MIN\_VALUE/1000 to Long.MAX\_VALUE/1000. | -| 'Q' | Milliseconds since the beginning of the epoch starting at 1 January 1970 00:00:00 UTC, i.e. 
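Long.MIN\_VALUE to Long.MAX\_VALUE. | - -* The conversion characters used for formatting dates are as follows - -| **Symbol** | **Description** | -| ---------------- | ---------------------------------------------------------------- | -| 'B' | Locale-specific [full month name](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/text/DateFormatSymbols.html#getMonths()), e.g. "January", "February". | -| 'b' | Locale-specific abbreviated month name, e.g. "Jan", "Feb". | -| 'h' | Same as 'b'. | -| 'A' | Locale-specific full name of the day of the week, e.g. "Sunday", "Monday". | -| 'a' | Locale-specific short name of the day of the week, e.g. "Sun", "Mon". | -| 'C' | Four-digit year divided by 100, formatted as two digits with a leading zero as necessary, i.e. 00 - 99. | -| 'Y' | Year, formatted as at least four digits with leading zeros as necessary, e.g. 0092 equals 92 CE in the Gregorian calendar. | -| 'y' | Last two digits of the year, formatted with leading zeros as necessary, i.e. 00 - 99. | -| 'j' | Day of the year, formatted as three digits with leading zeros as necessary, e.g. 001 - 366 in the Gregorian calendar. | -| 'm' | Month, formatted as two digits with a leading zero as necessary, i.e. 01 - 13 ("13" is a special value required to support lunar calendars). | -| 'd' | Day of the month, formatted as two digits with a leading zero as necessary, i.e. 01 - 31. | -| 'e' | Day of the month, formatted as two digits, i.e. 1 - 31. | - -4. Formatting a string - -```SQL -IoTDB:database1> SELECT format('The measurement status is :%s',status) FROM table2 limit 1 -+-------------------------------+ -| _col0| -+-------------------------------+ -|The measurement status is :true| -+-------------------------------+ -``` - -5. Formatting a percent sign - -```SQL -IoTDB:database1> SELECT format('%s%%', 99.9) from table1 limit 1 -+-----+ -|_col0| -+-----+ -|99.9%| -+-----+ -``` - -#### 9.2.3 **Format Conversion Failure Scenarios** - -1. 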
Type mismatch errors - -* Timestamp type conflict: a type exception is thrown when the format string contains date/time conversions (such as `%tY-%tm-%td`) and either: - * the argument is not a `DATE` or `TIMESTAMP` value, or - * the format string uses sub-day units (such as `%tH` for hours or `%tM` for minutes) and the argument is not a `TIMESTAMP` - -```SQL --- Example 1 -IoTDB:database1> SELECT format('%1$tA, %1$tB %1$te, %1$tY', humidity) from table2 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %1$tA, %1$tB %1$te, %1$tY (IllegalFormatConversion: A != java.lang.Float) - --- Example 2 -IoTDB:database1> SELECT format('%1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS.%1$tL', humidity) from table1 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS.%1$tL (IllegalFormatConversion: Y != java.lang.Float) -``` - -* Floating-point type conflict: if a floating-point specifier such as `%f` is used but the argument is not numeric (for example a string or a boolean), a type conversion error is raised - -```SQL -IoTDB:database1> select format('%.5f',status) from table1 where humidity = 35.4 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %.5f (IllegalFormatConversion: f != java.lang.Boolean) -``` - -2. Argument count mismatch errors - -* The number of arguments supplied must be equal to or greater than the number of format specifiers in the format string -* If fewer arguments are supplied than there are format specifiers, an error is raised; the message reports `MissingFormatArgument`, as in the following example - -```SQL -IoTDB:database1> select format('%.5f %03d', humidity) from table1 where humidity = 35.4 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Invalid format string: %.5f %03d (MissingFormatArgument: Format specifier '%03d') -``` - -3. Invalid call errors - -* A call is invalid when either of the following holds: - * The total number of arguments is less than 2 (the format string plus at least one argument are required) - * The format string (`pattern`) is not of type `STRING/TEXT` - -```SQL --- Example 1 -IoTDB:database1> select format('%s') from table1 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Scalar function format must have at least two arguments, and first argument pattern must be TEXT or STRING type. - --- Example 2 -IoTDB:database1> select format(123, humidity) from table1 limit 1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Scalar function format must have at least two arguments, and first argument pattern must be TEXT or STRING type. -``` - - - -## 10. 
String Functions and Operators - -### 10.1 String Operators - -#### 10.1.1 || Operator - -The `||` operator concatenates strings and is equivalent to the `concat` function. - -#### 10.1.2 LIKE Statement - -The `LIKE` statement is used for pattern matching; its usage is documented in detail in [Pattern Matching: LIKE](#1-like-运算符). - -### 10.2 String Functions - -| Function | Description | Input | Output | Usage | -| ----------- | ----------- | ----------- | ----------- | ----------- | -| length | Returns the length of the string in characters, not the length of the character array. | One argument of type STRING or TEXT. **string**: the string whose length is computed. | INT32 | length(string) | -| upper | Converts the letters in the string to uppercase. | One argument of type STRING or TEXT. **string**: the string to convert. | String | upper(string) | -| lower | Converts the letters in the string to lowercase. | One argument of type STRING or TEXT. **string**: the string to convert. | String | lower(string) | -| trim | Removes the specified leading and/or trailing characters from the source string. | Up to three arguments. **specification (optional)**: which side to trim; `BOTH` trims both sides (default), `LEADING` trims only leading characters, `TRAILING` trims only trailing characters. **trimcharacter (optional)**: the character to remove; defaults to a space. **string**: the string to process. | String | trim([ [ specification ] [ trimcharacter ] FROM ] string) Example: `trim('!' FROM '!foo!');` returns `'foo'` | -| strpos | Returns the starting position of the first occurrence of the substring in the string. Positions start at 1. Returns 0 if the substring is not found. Note: the position is determined in characters, not in bytes. | Exactly two arguments of type STRING or TEXT. **sourceStr**: the string to search. **subStr**: the substring to find. | INT32 | strpos(sourceStr, subStr) | -| starts_with | Tests whether the substring is a prefix of the string. | Two arguments of type STRING or TEXT. **sourceStr**: the string to check. **prefix**: the prefix substring. | Boolean | starts_with(sourceStr, prefix) | -| ends_with | Tests whether the string ends with the specified suffix. | Two arguments of type STRING or TEXT. **sourceStr**: the string to check. **suffix**: the suffix substring. | Boolean | ends_with(sourceStr, suffix) | -| concat | Returns the concatenation of `string1`, `string2`, ..., `stringN`. Equivalent to the concatenation operator `\|\|`. | At least two arguments, all of type STRING or TEXT. | String | concat(str1, str2, ...) or str1 \|\| str2 ... 
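| -| strcmp | Compares two strings lexicographically. | Two arguments, both of type STRING or TEXT. **string1**: the first string to compare. **string2**: the second string to compare. | An INT32 value: `-1` if `str1 < str2`; `0` if `str1 = str2`; `1` if `str1 > str2`; `NULL` if `str1` or `str2` is `NULL`. | strcmp(str1, str2) | -| replace | Removes all instances of `search` from the string. | Two arguments of type STRING or TEXT. **string**: the original string to remove content from. **search**: the substring to remove. | String | replace(string, string) | -| replace | Replaces all instances of `search` in the string with `replace`. | Three arguments of type STRING or TEXT. **string**: the original string to replace content in. **search**: the substring to be replaced. **replace**: the new string to substitute. | String | replace(string, string, string) | -| substring | Extracts characters from the specified position to the end of the string. Note: the starting position is determined in characters, not in bytes. `start_index` starts at 1, and the length is counted from the `start_index` position. | Two arguments. **string**: the source string, of type STRING or TEXT. **start_index**: the index at which extraction starts; indexes start at 1. | String: all characters from `start_index` to the end of the string. **Notes**: `start_index` starts at 1, i.e. index 1 refers to position 0 of the array. If an argument is null, `null` is returned. If `start_index` is greater than the string length, an error is raised. | substring(string from start_index) or substring(string, start_index) | -| substring | Extracts a substring of the specified length starting at the specified position. Note: the starting position and the length are determined in characters, not in bytes. `start_index` starts at 1, and the length is counted from the `start_index` position. | Three arguments. **string**: the source string, of type STRING or TEXT. **start_index**: the index at which extraction starts; indexes start at 1. **length**: the length of the substring to extract. | String: `length` characters starting at `start_index`. **Notes**: if an argument is null, `null` is returned. If `start_index` is greater than the string length, an error is raised. If `length` is less than 0, an error is raised. In the extreme case where `start_index + length` exceeds `int.MAX` and becomes negative, the result is abnormal. | substring(string from start_index for length) or substring(string, start_index, length) | - -## 11. Pattern Matching Functions - -### 11.1 LIKE Operator - -#### 11.1.1 Purpose - -The `LIKE` operator compares a value with a pattern. It is typically used in a `WHERE` clause to match specific patterns in strings. - -#### 11.1.2 Syntax - -```SQL -... 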
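column [NOT] LIKE 'pattern' ESCAPE 'character'; -``` - -#### 11.1.3 Matching Rules - -- Matching is case-sensitive. -- The pattern supports two wildcards: - - `_`: matches any single character. - - `%`: matches zero or more characters. - -#### 11.1.4 Notes - -- A `LIKE` pattern always covers the entire string. To match at any position within the string, the pattern must start and end with `%`. -- To match `%` or `_` as a literal character, an escape character must be used. - -#### 11.1.5 Examples - -Example 1: Match strings that start with a specific character - -- **Description**: Find all names that start with the letter `E`, for example `Europe`. - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'E%'; -``` - -Example 2: Exclude a specific pattern - -- **Description**: Find all names that do not start with the letter `E`. - -```SQL -SELECT * FROM table1 WHERE continent NOT LIKE 'E%'; -``` - -Example 3: Match strings of a specific length - -- **Description**: Find all names that start with `A`, end with `a`, and have exactly two characters in between, for example `Asia`. - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'A__a'; -``` - -Example 4: Escape special characters - -- **Description**: Find all names that start with `South_`, for example `South_America`. The escape character `\` is used to escape special characters such as `_`. - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'South\_%' ESCAPE '\'; -``` - -Example 5: Match the escape character itself - -- **Description**: To match the escape character itself, write it twice: `\\`. - -```SQL -SELECT * FROM table1 WHERE continent LIKE 'South\\%' ESCAPE '\'; -``` - -### 11.2 regexp_like Function - -#### 11.2.1 Purpose - -The `regexp_like` function evaluates a regular expression pattern and determines whether the pattern is contained in the string. - -#### 11.2.2 Syntax - -```SQL -regexp_like(string, pattern); -``` - -#### 11.2.3 Notes - -- The pattern of `regexp_like` only needs to be contained in the string; it does not need to match the entire string. -- To match the entire string, use the regular expression anchors `^` and `$`. -- `^` means "start of the string" and `$` means "end of the string". -- Regular expressions use the Java regular expression syntax, with the following exceptions: - - **Multiline mode** - 1. Enabled with `(?m)`. - 2. Only `\n` is recognized as a line terminator. - 3. The `(?d)` flag is not supported and must not be used. - - **Case-insensitive matching** - 1. Enabled with `(?i)`. - 2. Matching follows Unicode rules; context-sensitive and locale-sensitive matching are not supported. - 3. The `(?u)` flag is not supported and must not be used. - - **Character classes** - 1. Inside a character class (such as `[A-Z123]`), `\Q` and `\E` are not supported and are treated as ordinary literals. - - **Unicode character classes (**`\p{prop}`**)** - 1. **Underscores in names**: all underscores in names must be removed (for example `OldItalic` rather than `Old_Italic`). - 2. **Scripts**: specified directly, without the `Is`, `script=`, or `sc=` prefix (for example `\p{Hiragana}`). - 3. **Blocks**: must use the `In` prefix; the `block=` and `blk=` prefixes are not supported (for example `\p{InMongolian}`). - 4. **Categories**: specified directly, without the `Is`, `general_category=`, or `gc=` prefix (for example `\p{L}`). - 5. 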
**Binary Properties**: specified directly, without `Is` (for example `\p{NoncharacterCodePoint}`). - -#### 11.2.4 Examples - -Example 1: Match a string that contains a specific pattern - -```SQL -SELECT regexp_like('1a 2b 14m', '\\d+b'); -- true -``` - -- **Description**: Checks whether the string `'1a 2b 14m'` contains the pattern `\d+b`. - - `\d+` means "one or more digits". - - `b` means the letter `b`. - - In `'1a 2b 14m'`, `2b` matches this pattern, so `true` is returned. - -Example 2: Match the entire string - -```SQL -SELECT regexp_like('1a 2b 14m', '^\\d+b$'); -- false -``` - -- **Description**: Checks whether the string `'1a 2b 14m'` matches the pattern `^\\d+b$` in its entirety. - - `\d+` means "one or more digits". - - `b` means the letter `b`. - - `'1a 2b 14m'` does not match as a whole: it starts with a digit, but the entire string is not one or more digits followed by a final `b`, so `false` is returned. - -## 12. Time Series Windowing Functions - -The sample data is as follows: - -```SQL -IoTDB> SELECT * FROM bid; -+-----------------------------+--------+-----+ -| time|stock_id|price| -+-----------------------------+--------+-----+ -|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+--------+-----+ - --- Create the table -CREATE TABLE bid(time TIMESTAMP TIME, stock_id STRING TAG, price FLOAT FIELD); --- Insert the data -INSERT INTO bid(time, stock_id, price) VALUES('2021-01-01T09:05:00','AAPL',100.0),('2021-01-01T09:06:00','TESL',200.0),('2021-01-01T09:07:00','AAPL',103.0),('2021-01-01T09:07:00','TESL',202.0),('2021-01-01T09:09:00','AAPL',102.0),('2021-01-01T09:15:00','TESL',195.0); -``` - -### 12.1 HOP - -#### 12.1.1 Description - -The HOP function performs time-based windowing and identifies the time windows each row belongs to. Given a fixed window size (SIZE) and a window slide step (SLIDE), it assigns each row to every window that overlaps its timestamp. If windows overlap (slide < size), a row is duplicated into multiple windows. - -#### 12.1.2 Function Definition - -```SQL -HOP(data, timecol, size, slide[, origin]) -``` - -#### 12.1.3 Parameters - -| Parameter | Type | Attributes | Description | -| --------- | ---------- | --------------------------------- | -------------------- | -| DATA | Table argument | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar argument | String; default: time | Time column | -| SIZE | Scalar argument | Long integer | Window size | -| SLIDE | Scalar argument | Long integer | Window slide step | -| ORIGIN | Scalar argument | 
Timestamp; default: Unix epoch | Start time of the first window | - -#### 12.1.4 Return Value - -The result columns of the HOP function are: - -* window\_start: window start time (inclusive) -* window\_end: window end time (exclusive) -* Pass-through columns: all input columns of the DATA argument - -#### 12.1.5 Example - -```SQL -IoTDB> SELECT * FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Combined with GROUP BY, equivalent to GROUP BY TIME in the tree model -IoTDB> SELECT window_start, window_end, stock_id, 
avg(price) as avg FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 12.2 SESSION - -#### 12.2.1 Description - -The SESSION function windows data by session gap. The system checks, row by row, the time interval to the previous row: if the interval does not exceed the threshold (GAP), the row joins the current window; otherwise it starts the next window. In the example for this function, rows exactly 2m apart stay in the same window with `GAP => 2m`. - -#### 12.2.2 Function Definition - -```SQL -SESSION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], timecol, gap) -``` -#### 12.2.3 Parameters - -| Parameter | Type | Attributes | Description | -| --------- | ---------- | -------------------------- | ---------------------------------------- | -| DATA | Table argument | SET SEMANTIC, PASS THROUGH | Input table; partitioning and ordering are specified through pkeys and okeys | -| TIMECOL | Scalar argument | String; default: 'time' | Name of the time column | -| GAP | Scalar argument | Long integer | Session gap threshold | - -#### 12.2.4 Return Value - -The result columns of the SESSION function are: - -* window\_start: time of the first row in the session window -* window\_end: time of the last row in the session window -* Pass-through columns: all input columns of the DATA argument - -#### 12.2.5 Example - -```SQL -IoTDB> SELECT * FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ 
-|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Combined with GROUP BY, equivalent to GROUP BY SESSION in the tree model -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL| 201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 12.3 VARIATION - -#### 12.3.1 Description - -The VARIATION function windows data by value difference. The first row is the baseline of the first window, and each subsequent row is compared with the current baseline. If the difference does not exceed the given threshold (DELTA), the row joins the current window; if it exceeds the threshold, the row starts the next window and its value becomes the baseline of that window. For instance, with `DELTA => 2.0`, the values 200.0 and 202.0 fall in the same window, while 100.0 and 103.0 do not. - -#### 12.3.2 Function Definition - -```sql -VARIATION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], col, delta) -``` - -#### 12.3.3 Parameters - -| Parameter | Type | Attributes | Description | -| -------- | ---------- | -------------------------- | ---------------------------------------- | -| DATA | Table argument | SET 
SEMANTICPASS THROUGH | 输入表通过 pkeys、okeys 指定分区和排序 | -| COL | 标量参数 | 字符串类型 | 标识对哪一列计算差值 | -| DELTA | 标量参数 | 浮点数类型 | 差值阈值 | - -#### 12.3.4 返回结果 - -VARIATION 函数的返回结果列包含: - -* window\_index: 窗口编号 -* 映射列:DATA 参数的所有输入列 - -#### 12.3.5 使用示例 - -```sql -IoTDB> SELECT * FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price',DELTA => 2.0); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 1|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY VARIATION -IoTDB> SELECT first(time) as window_start, last(time) as window_end, stock_id, avg(price) as avg FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price', DELTA => 2.0) GROUP BY window_index, stock_id; -+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:07:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.5| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 12.4 CAPACITY - -#### 12.4.1 功能描述 - -CAPACITY 函数用于按数据点数(行数)分窗,每个窗口最多有 SIZE 行数据。 - -#### 12.4.2 函数定义 - -```sql -CAPACITY(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], size) -``` - -#### 12.4.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| -------- | ---------- | 
-------------------------- | ---------------------------------------- | -| DATA | 表参数 | SET SEMANTICPASS THROUGH | 输入表通过 pkeys、okeys 指定分区和排序 | -| SIZE | 标量参数 | 长整数类型 | 窗口大小 | - -#### 12.4.4 返回结果 - -CAPACITY 函数的返回结果列包含: - -* window\_index: 窗口编号 -* 映射列:DATA 参数的所有输入列 - -#### 12.4.5 使用示例 - -```sql -IoTDB> SELECT * FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 0|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY COUNT -IoTDB> SELECT first(time) as start_time, last(time) as end_time, stock_id, avg(price) as avg FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2) GROUP BY window_index, stock_id; -+-----------------------------+-----------------------------+--------+-----+ -| start_time| end_time|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|101.5| -|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 12.5 TUMBLE - -#### 12.5.1 功能描述 - -TUMBLE 函数用于通过时间属性字段为每行数据分配一个窗口,滚动窗口的大小固定且不重复。 - -#### 12.5.2 函数定义 - -```sql -TUMBLE(data, timecol, size[, origin]) -``` -#### 12.5.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| --------- | ---------- | --------------------------------- | 
-------------------- | -| DATA | Table argument | ROW SEMANTIC, PASS THROUGH | Input table | -| TIMECOL | Scalar argument | String; default: time | Time column | -| SIZE | Scalar argument | Long integer | Window size; must be positive | -| ORIGIN | Scalar argument | Timestamp; default: Unix epoch | Start time of the first window | - -#### 12.5.4 Return Value - -The result columns of the TUMBLE function are: - -* window\_start: window start time (inclusive) -* window\_end: window end time (exclusive) -* Pass-through columns: all input columns of the DATA argument - -#### 12.5.5 Example - -```SQL -IoTDB> SELECT * FROM TUMBLE( DATA => bid, TIMECOL => 'time', SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- Combined with GROUP BY, equivalent to GROUP BY TIME in the tree model -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM TUMBLE(DATA => bid, TIMECOL => 'time', SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| 
-|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 12.6 CUMULATE - -#### 12.6.1 功能描述 - -Cumulate 函数用于从初始的窗口开始,创建相同窗口开始但窗口结束步长不同的窗口,直到达到最大的窗口大小。每个窗口包含其区间内的元素。例如:1小时步长,24小时大小的累计窗口,每天可以获得如下这些窗口:`[00:00, 01:00)`,`[00:00, 02:00)`,`[00:00, 03:00)`, …, `[00:00, 24:00)` - -#### 12.6.2 函数定义 - -```sql -CUMULATE(data, timecol, size, step[, origin]) -``` - -#### 12.6.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| --------- | ---------- | --------------------------------- | -------------------------------------------- | -| DATA | 表参数 | ROW SEMANTICPASS THROUGH | 输入表 | -| TIMECOL | 标量参数 | 字符串类型默认值:time | 时间列 | -| SIZE | 标量参数 | 长整数类型 | 窗口大小,SIZE必须是STEP的整数倍,需为正数 | -| STEP | 标量参数 | 长整数类型 | 窗口步长,需为正数 | -| ORIGIN | 标量参数 | 时间戳类型默认值:Unix 纪元时间 | 第一个窗口起始时间 | - -> 注意:size 如果不是 step 的整数倍,则会报错`Cumulative table function requires size must be an integral multiple of step` - -#### 12.6.4 返回结果 - -CUMULATE函数的返回结果列包含: - -* window\_start: 窗口开始时间(闭区间) -* window\_end: 窗口结束时间(开区间) -* 映射列:DATA 参数的所有输入列 - -#### 12.6.5 使用示例 - -```sql -IoTDB> SELECT * FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| 
-|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY TIME -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m, SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| TESL| 201.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00| AAPL| 100.0| 
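-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| AAPL| 101.5| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/Common-Table-Expression_timecho.md b/src/zh/UserGuide/latest-Table/SQL-Manual/Common-Table-Expression_timecho.md deleted file mode 100644 index b3dc653b2..000000000 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/Common-Table-Expression_timecho.md +++ /dev/null @@ -1,234 +0,0 @@ - - -# Common Table Expressions (CTE) - -## 1. Overview - -The CTE (Common Table Expressions) feature lets a `WITH` clause define one or more temporary result sets (common tables) that can be referenced multiple times in the rest of the same query. CTEs give complex queries a clear structure and make SQL easier to read and maintain. - -> Note: this feature is available since V 2.0.9.1. - -## 2. Syntax - -The simplified SQL syntax of a CTE is: - -```SQL -with_clause: - WITH cte_name [(col_name [, col_name] ...)] AS (subquery) - [, cte_name [(col_name [, col_name] ...)] AS (subquery)] ... -``` - -* Simple and nested CTEs: one or more CTEs can be defined in a `WITH` clause, and a CTE can reference CTEs defined before it. Forward references, that is, references to a CTE that has not been defined yet, are not allowed. -* CTE name identical to a source table name: if a CTE has the same name as a source table, only the CTE is visible in the outer scope and the source table is shadowed. -* Multiple references: the same CTE can be referenced multiple times in the outer query. -* Explain / ExplainAnalyze support: `Explain` and `ExplainAnalyze` are supported on the whole query, but not on the `subquery` inside a CTE definition. -* Column name restriction: the number of column names given in a CTE definition must equal the number of output columns of the `subquery`; otherwise an error is raised. -* Unused CTEs: if a defined CTE is not used in the query body, the query still executes normally. - -## 3. 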
使用示例 - -基于[示例数据](../Reference/Sample-Data.md) 中的表 `table1` 和 `table2`作为源表: - -### 3.1 简单 CTE - -```SQL -WITH cte1 AS (SELECT device_id, temperature FROM table1 WHERE temperature IS NOT NULL), - cte2 AS (SELECT device_id, humidity FROM table2 WHERE humidity IS NOT NULL) -SELECT * FROM cte1 join cte2 on cte1.device_id = cte2.device_id limit 10; -``` - -执行结果 - -```Bash -+---------+-----------+---------+--------+ -|device_id|temperature|device_id|humidity| -+---------+-----------+---------+--------+ -| 100| 90.0| 100| 45.1| -| 100| 90.0| 100| 35.2| -| 100| 90.0| 100| 35.1| -| 100| 85.0| 100| 45.1| -| 100| 85.0| 100| 35.2| -| 100| 85.0| 100| 35.1| -| 100| 85.0| 100| 45.1| -| 100| 85.0| 100| 35.2| -| 100| 85.0| 100| 35.1| -| 100| 88.0| 100| 45.1| -+---------+-----------+---------+--------+ -Total line number = 10 -It costs 0.075s -``` - -### 3.2 CTE 与源表重名 - -```SQL -WITH table1 AS (SELECT time, device_id, temperature FROM table1 WHERE temperature IS NOT NULL) -SELECT * FROM table1 limit 5; -``` - -执行结果 - -```Bash -+-----------------------------+---------+-----------+ -| time|device_id|temperature| -+-----------------------------+---------+-----------+ -|2024-11-30T09:30:00.000+08:00| 101| 90.0| -|2024-11-30T14:30:00.000+08:00| 101| 90.0| -|2024-11-29T10:00:00.000+08:00| 101| 85.0| -|2024-11-27T16:39:00.000+08:00| 101| 85.0| -|2024-11-27T16:40:00.000+08:00| 101| 85.0| -+-----------------------------+---------+-----------+ -Total line number = 5 -It costs 0.103s -``` - -### 3.3 嵌套 CTE - -```SQL -WITH - table1 AS (select device_id, temperature from table1 WHERE temperature IS NOT NULL), - cte1 AS (select device_id, temperature from table2 WHERE temperature IS NOT NULL), - table2 AS (select temperature from table1), - cte2 AS (SELECT temperature FROM table1) -SELECT * FROM table2; -``` - -执行结果 - -```Bash -+-----------+ -|temperature| -+-----------+ -| 90.0| -| 90.0| -| 85.0| -| 85.0| -| 85.0| -| 85.0| -| 90.0| -| 85.0| -| 85.0| -| 88.0| -| 90.0| -| 90.0| -+-----------+ 
-Total line number = 12 -It costs 0.050s -``` - -* 不支持前向引用 - -```SQL -WITH - cte2 AS (SELECT temperature FROM cte1), - cte1 AS (select device_id, temperature from table1) -SELECT * FROM cte2; -``` - -错误信息 - -```Bash -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 550: Table 'database1.cte1' does not exist. -``` - -### 3.4 CTE 的多次引用 - -```SQL -WITH cte AS (select device_id, temperature from table1 WHERE temperature IS NOT NULL) -SELECT * FROM cte WHERE temperature > (SELECT avg(temperature ) FROM cte); -``` - -执行结果 - -```Bash -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| 90.0| -| 101| 90.0| -| 100| 90.0| -| 100| 88.0| -| 100| 90.0| -| 100| 90.0| -+---------+-----------+ -Total line number = 6 -It costs 0.241s -``` - -### 3.5 Explain 支持 - -* 支持整个查询 - -```SQL -EXPLAIN WITH cte AS (SELECT * FROM table1) SELECT * FROM cte; -``` - -执行结果 - -```Bash -+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| distribution plan| -+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ | -| │OutputNode-7 │ | -| │OutputColumns-[time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time] │ | -| │OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│ | -| └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ | -| │ | -| │ | -| 
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ | -| │Collect-42 │ | -| │OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│ | -| └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ | -| ┌───────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────┐ | -| │ │ | -| ┌───────────┐ ┌───────────┐ | -| │Exchange-49│ │Exchange-50│ | -| └───────────┘ └───────────┘ | -| │ │ | -| │ │ | -|┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐| -|│DeviceTableScanNode-41 │ │DeviceTableScanNode-40 │| -|│QualifiedTableName: database1.table1 │ │QualifiedTableName: database1.table1 │| -|│OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│ │OutputSymbols: [time, region, plant_id, device_id, model_id, maintenance, temperature, humidity, status, arrival_time]│| -|│DeviceNumber: 3 │ │DeviceNumber: 3 │| -|│ScanOrder: ASC │ │ScanOrder: ASC │| -|│PushDownOffset: 0 │ │PushDownOffset: 0 │| -|│PushDownLimit: 0 │ │PushDownLimit: 0 │| -|│PushDownLimitToEachDevice: false │ │PushDownLimitToEachDevice: false │| -|│RegionId: 2 │ │RegionId: 1 │| -|└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘| 
-+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 29 -It costs 0.065s -``` - -* 不支持 cte 内部查询 - -```SQL -WITH cte AS (EXPLAIN SELECT * FROM table1) SELECT * FROM cte; -``` - -错误信息 - -```Bash -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 700: line 1:14: mismatched input 'EXPLAIN'. Expecting: -``` diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/Featured-Functions_timecho.md b/src/zh/UserGuide/latest-Table/SQL-Manual/Featured-Functions_timecho.md deleted file mode 100644 index 308c527a4..000000000 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/Featured-Functions_timecho.md +++ /dev/null @@ -1,862 +0,0 @@ - - -# 特色函数 - -## 1. 降采样函数 - -### 1.1 `date_bin` 函数 - -#### 1.1.1 功能描述 - -`date_bin` 是一个标量函数,用于将时间戳规整到指定的时间区间起点,并结合 `GROUP BY` 子句实现降采样。 - -- 部分区间结果为空:只会对满足条件的数据进行时间戳规整,不会填充缺失的时间区间。 -- 全部区间结果为空::满足条件的整个查询范围内没有数据时,降采样返回空结果集 - -#### 1.1.2 使用示例 - -##### 示例数据 - -在[示例数据页面](../Reference/Sample-Data.md)中,包含了用于构建表结构和插入数据的SQL语句,下载并在IoTDB CLI中执行这些语句,即可将数据导入IoTDB,您可以使用这些数据来测试和执行示例中的SQL语句,并获得相应的结果。 - -示例 1:获取设备** **`100`** **某个时间范围的每小时平均温度 - -```SQL -SELECT date_bin(1h, time) AS hour_time, avg(temperature) AS avg_temp -FROM table1 -WHERE (time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00) - AND device_id = '100' -GROUP BY 1; -``` - -结果: - -```Plain -+-----------------------------+--------+ -| hour_time|avg_temp| -+-----------------------------+--------+ -|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:00:00.000+08:00| 90.0| -|2024-11-28T08:00:00.000+08:00| 85.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 85.0| -|2024-11-28T11:00:00.000+08:00| 88.0| -+-----------------------------+--------+ -``` - -示例 2:获取每个设备某个时间范围的每小时平均温度 - -```SQL -SELECT date_bin(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp -FROM 
table1 -WHERE time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00 -GROUP BY 1, device_id; -``` - -结果: - -```Plain -+-----------------------------+---------+--------+ -| hour_time|device_id|avg_temp| -+-----------------------------+---------+--------+ -|2024-11-29T11:00:00.000+08:00| 100| null| -|2024-11-29T18:00:00.000+08:00| 100| 90.0| -|2024-11-28T08:00:00.000+08:00| 100| 85.0| -|2024-11-28T09:00:00.000+08:00| 100| null| -|2024-11-28T10:00:00.000+08:00| 100| 85.0| -|2024-11-28T11:00:00.000+08:00| 100| 88.0| -|2024-11-29T10:00:00.000+08:00| 101| 85.0| -|2024-11-27T16:00:00.000+08:00| 101| 85.0| -+-----------------------------+---------+--------+ -``` - -示例 3:获取所有设备某个时间范围的每小时平均温度 - -```SQL -SELECT date_bin(1h, time) AS hour_time, avg(temperature) AS avg_temp - FROM table1 - WHERE time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00 - group by 1; -``` - -结果: - -```Plain -+-----------------------------+--------+ -| hour_time|avg_temp| -+-----------------------------+--------+ -|2024-11-29T10:00:00.000+08:00| 85.0| -|2024-11-27T16:00:00.000+08:00| 85.0| -|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:00:00.000+08:00| 90.0| -|2024-11-28T08:00:00.000+08:00| 85.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 85.0| -|2024-11-28T11:00:00.000+08:00| 88.0| -+-----------------------------+--------+ -``` - -### 1.2 `date_bin_gapfill` 函数 - -#### 1.2.1 功能描述 - -`date_bin_gapfill` 是 `date_bin` 的扩展,能够填充缺失的时间区间,从而返回完整的时间序列。 - -- 部分区间结果为空:对满足条件的数据进行时间戳规整,并填充缺失的时间区间。 -- 全部区间结果为空::整个查询范围内没有数据时,`date_bin_gapfill`会返回空结果集 - -#### 1.2.2 功能限制 - -- **`date_bin_gapfill`** **必须与** **`GROUP BY`** **子句搭配使用**,如果用在其他子句中,不会报错,但不会执行 gapfill 功能,效果与使用 `date_bin` 相同。 -- **每个** **`GROUP BY`** **子句中只能使用一个** **`date_bin_gapfill`**。如果出现多个 `date_bin_gapfill`,会报错:multiple date_bin_gapfill calls not allowed -- **`date_bin_gapfill`** **的执行顺序**:GAPFILL 功能发生在 `HAVING` 子句执行之后,`FILL` 子句执行之前。 -- **使用** **`date_bin_gapfill`** **时,****`WHERE`** **子句中的时间过滤条件必须是以下形式之一:** 
- - `time >= XXX AND time <= XXX` - - `time > XXX AND time < XXX` - - `time BETWEEN XXX AND XXX` -- **使用** **`date_bin_gapfill`** **时,如果出现其他时间过滤条件**,会报错。时间过滤条件与其他值过滤条件只能通过 `AND` 连接。 -- **如果不能从 where 子句中推断出 startTime 和 endTime,则报错**:could not infer startTime or endTime from WHERE clause。 - -#### 1.2.3 使用示例 - -示例 1:填充缺失时间区间 - -```SQL -SELECT date_bin_gapfill(1h, time) AS hour_time, avg(temperature) AS avg_temp -FROM table1 -WHERE (time >= 2024-11-28 07:00:00 AND time <= 2024-11-28 16:00:00) - AND device_id = '100' -GROUP BY 1; -``` - -结果: - -```Plain -+-----------------------------+--------+ -| hour_time|avg_temp| -+-----------------------------+--------+ -|2024-11-28T07:00:00.000+08:00| null| -|2024-11-28T08:00:00.000+08:00| 85.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| 85.0| -|2024-11-28T11:00:00.000+08:00| 88.0| -|2024-11-28T12:00:00.000+08:00| null| -|2024-11-28T13:00:00.000+08:00| null| -|2024-11-28T14:00:00.000+08:00| null| -|2024-11-28T15:00:00.000+08:00| null| -|2024-11-28T16:00:00.000+08:00| null| -+-----------------------------+--------+ -``` - -示例 2:结合设备分组填充缺失时间区间 - -```SQL -SELECT date_bin_gapfill(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp -FROM table1 -WHERE time >= 2024-11-28 07:00:00 AND time <= 2024-11-28 16:00:00 -GROUP BY 1, device_id; -``` - -结果: - -```Plain -+-----------------------------+---------+--------+ -| hour_time|device_id|avg_temp| -+-----------------------------+---------+--------+ -|2024-11-28T07:00:00.000+08:00| 100| null| -|2024-11-28T08:00:00.000+08:00| 100| 85.0| -|2024-11-28T09:00:00.000+08:00| 100| null| -|2024-11-28T10:00:00.000+08:00| 100| 85.0| -|2024-11-28T11:00:00.000+08:00| 100| 88.0| -|2024-11-28T12:00:00.000+08:00| 100| null| -|2024-11-28T13:00:00.000+08:00| 100| null| -|2024-11-28T14:00:00.000+08:00| 100| null| -|2024-11-28T15:00:00.000+08:00| 100| null| -|2024-11-28T16:00:00.000+08:00| 100| null| -+-----------------------------+---------+--------+ -``` - -示例 
3: An empty result set is returned when the query range contains no data
-
-```SQL
-SELECT date_bin_gapfill(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp
-FROM table1
-WHERE time >= 2024-11-27 09:00:00 AND time <= 2024-11-27 14:00:00
-GROUP BY 1, device_id;
-```
-
-Result:
-
-```Plain
-+---------+---------+--------+
-|hour_time|device_id|avg_temp|
-+---------+---------+--------+
-+---------+---------+--------+
-```
-
-## 2. DIFF Function
-
-### 2.1 Overview
-
-The `DIFF` function computes the difference between the current row and the previous row. The first row has no previous row, so its result is always `NULL`.
-
-### 2.2 Function Definition
-
-```
-DIFF(numeric[, boolean]) -> Double
-```
-
-### 2.3 Parameters
-
-- First parameter: numeric
-
-  - **Type**: must be a numeric type (`INT32`, `INT64`, `FLOAT`, `DOUBLE`).
-  - **Purpose**: the column whose differences are computed.
-
-- Second parameter: boolean (optional)
-  - **Type**: boolean (`true` or `false`).
-  - **Default**: `true`.
-  - **Purpose**:
-    - **`true`**: ignore `NULL` values and search backward for the nearest preceding non-`NULL` value to compute the difference. If no preceding non-`NULL` value exists, the result is `NULL`.
-    - **`false`**: do not ignore `NULL` values; if the previous row is `NULL`, the difference is `NULL`.
-
-### 2.4 Notes
-
-- In the tree model, the second parameter must be written as `'ignoreNull'='true'` or `'ignoreNull'='false'`. In the table model, write just `true` or `false`.
-- If `'ignoreNull'='true'` or `'ignoreNull'='false'` is written in the table model, it is treated as an equality comparison between two string constants. The comparison yields a boolean that is always `false`, so it is equivalent to passing `false` as the second parameter.
-
-### 2.5 Examples
-
-Example 1: Ignore `NULL` values
-
-```SQL
-SELECT time, DIFF(temperature) AS diff_temp
-FROM table1
-WHERE device_id = '100';
-```
-
-Result:
-
-```Plain
-+-----------------------------+---------+
-|                         time|diff_temp|
-+-----------------------------+---------+
-|2024-11-29T11:00:00.000+08:00|     null|
-|2024-11-29T18:30:00.000+08:00|     null|
-|2024-11-28T08:00:00.000+08:00|     -5.0|
-|2024-11-28T09:00:00.000+08:00|     null|
-|2024-11-28T10:00:00.000+08:00|      0.0|
-|2024-11-28T11:00:00.000+08:00|      3.0|
-|2024-11-26T13:37:00.000+08:00|      2.0|
-|2024-11-26T13:38:00.000+08:00|      0.0|
-+-----------------------------+---------+
-```
-
-Example 2: Do not ignore `NULL` values
-
-```SQL
-SELECT time, DIFF(temperature, false) AS diff_temp
-FROM table1
-WHERE device_id = '100';
-```
-
-Result:
-
-```Plain
-+-----------------------------+---------+
-|                         time|diff_temp|
-+-----------------------------+---------+
-|2024-11-29T11:00:00.000+08:00| null| -|2024-11-29T18:30:00.000+08:00| null| -|2024-11-28T08:00:00.000+08:00| -5.0| -|2024-11-28T09:00:00.000+08:00| null| -|2024-11-28T10:00:00.000+08:00| null| -|2024-11-28T11:00:00.000+08:00| 3.0| -|2024-11-26T13:37:00.000+08:00| 2.0| -|2024-11-26T13:38:00.000+08:00| 0.0| -+-----------------------------+---------+ -``` - -示例 3:完整示例 - -```SQL -SELECT time, temperature, - DIFF(temperature) AS diff_temp_1, - DIFF(temperature, false) AS diff_temp_2 -FROM table1 -WHERE device_id = '100'; -``` - -结果: - -```Plain -+-----------------------------+-----------+-----------+-----------+ -| time|temperature|diff_temp_1|diff_temp_2| -+-----------------------------+-----------+-----------+-----------+ -|2024-11-29T11:00:00.000+08:00| null| null| null| -|2024-11-29T18:30:00.000+08:00| 90.0| null| null| -|2024-11-28T08:00:00.000+08:00| 85.0| -5.0| -5.0| -|2024-11-28T09:00:00.000+08:00| null| null| null| -|2024-11-28T10:00:00.000+08:00| 85.0| 0.0| null| -|2024-11-28T11:00:00.000+08:00| 88.0| 3.0| 3.0| -|2024-11-26T13:37:00.000+08:00| 90.0| 2.0| 2.0| -|2024-11-26T13:38:00.000+08:00| 90.0| 0.0| 0.0| -+-----------------------------+-----------+-----------+-----------+ -``` - -## 3. 
时序分窗函数 - -原始示例数据如下: - -```SQL -IoTDB> SELECT * FROM bid; -+-----------------------------+--------+-----+ -| time|stock_id|price| -+-----------------------------+--------+-----+ -|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+--------+-----+ - --- 创建语句 -CREATE TABLE bid(time TIMESTAMP TIME, stock_id STRING TAG, price FLOAT FIELD); --- 插入数据 -INSERT INTO bid(time, stock_id, price) VALUES('2021-01-01T09:05:00','AAPL',100.0),('2021-01-01T09:06:00','TESL',200.0),('2021-01-01T09:07:00','AAPL',103.0),('2021-01-01T09:07:00','TESL',202.0),('2021-01-01T09:09:00','AAPL',102.0),('2021-01-01T09:15:00','TESL',195.0); -``` - -### 3.1 HOP - -#### 3.1.1 功能描述 - -HOP 函数用于按时间分段分窗分析,识别每一行数据所属的时间窗口。该函数通过指定固定窗口大小(size)和窗口滑动步长(SLIDE),将数据按时间戳分配到所有与其时间戳重叠的窗口中。若窗口之间存在重叠(步长 < 窗口大小),数据会自动复制到多个窗口。 - -#### 3.1.2 函数定义 - -```SQL -HOP(data, timecol, size, slide[, origin]) -``` - -#### 3.1.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| --------- | ---------- | --------------------------------- | -------------------- | -| DATA | 表参数 | ROW SEMANTICPASS THROUGH | 输入表 | -| TIMECOL | 标量参数 | 字符串类型默认值:time | 时间列 | -| SIZE | 标量参数 | 长整数类型 | 窗口大小 | -| SLIDE | 标量参数 | 长整数类型 | 窗口滑动步长 | -| ORIGIN | 标量参数 | 时间戳类型默认值:Unix 纪元时间 | 第一个窗口起始时间 | - -#### 3.1.4 返回结果 - -HOP 函数的返回结果列包含: - -* window\_start: 窗口开始时间(闭区间) -* window\_end: 窗口结束时间(开区间) -* 映射列:DATA 参数的所有输入列 - -#### 3.1.5 使用示例 - -```SQL -IoTDB> SELECT * FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ 
-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY TIME -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 201.0| 
-|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 3.2 SESSION - -#### 3.2.1 功能描述 - -SESSION 函数用于按会话间隔对数据进行分窗。系统逐行检查与前一行的时间间隔,小于阈值(GAP)则归入当前窗口,超过则归入下一个窗口。 - -#### 3.2.2 函数定义 - -```SQL -SESSION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], timecol, gap) -``` -#### 3.2.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| --------- | ---------- | -------------------------- | ---------------------------------------- | -| DATA | 表参数 | SET SEMANTICPASS THROUGH | 输入表通过 pkeys、okeys 指定分区和排序 | -| TIMECOL | 标量参数 | 字符串类型默认值:'time' | 时间列名| -| GAP | 标量参数 | 长整数类型 | 会话间隔阈值 | - -#### 3.2.4 返回结果 - -SESSION 函数的返回结果列包含: - -* window\_start: 会话窗口内的第一条数据的时间 -* window\_end: 会话窗口内的最后一条数据的时间 -* 映射列:DATA 参数的所有输入列 - -#### 3.2.5 使用示例 - -```SQL -IoTDB> SELECT * FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:07:00.000+08:00| 
AAPL|103.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY SESSION -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL| 201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 3.3 VARIATION - -#### 3.3.1 功能描述 - -VARIATION 函数用于按数据差值分窗,将第一条数据作为首个窗口的基准值,每个数据点会与基准值进行差值运算,如果差值小于给定的阈值(delta)则加入当前窗口;如果超过阈值,则分为下一个窗口,将该值作为下一个窗口的基准值。 - -#### 3.3.2 函数定义 - -```sql -VARIATION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], col, delta) -``` - -#### 3.3.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| -------- | ---------- | -------------------------- | ---------------------------------------- | -| DATA | 表参数 | SET SEMANTICPASS THROUGH | 输入表通过 pkeys、okeys 指定分区和排序 | -| COL | 标量参数 | 字符串类型 | 标识对哪一列计算差值 | -| DELTA | 标量参数 | 浮点数类型 | 差值阈值 | - -#### 3.3.4 返回结果 - -VARIATION 函数的返回结果列包含: - -* window\_index: 窗口编号 -* 映射列:DATA 参数的所有输入列 - -#### 3.3.5 使用示例 - -```sql -IoTDB> SELECT * FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price',DELTA => 2.0); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| -+------------+-----------------------------+--------+-----+ -| 
0|2021-01-01T09:06:00.000+08:00| TESL|200.0| -| 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| -| 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| -| 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -| 1|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -| 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY VARIATION -IoTDB> SELECT first(time) as window_start, last(time) as window_end, stock_id, avg(price) as avg FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price', DELTA => 2.0) GROUP BY window_index, stock_id; -+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| -|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:05:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:07:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.5| -+-----------------------------+-----------------------------+--------+-----+ -``` - -### 3.4 CAPACITY - -#### 3.4.1 功能描述 - -CAPACITY 函数用于按数据点数(行数)分窗,每个窗口最多有 SIZE 行数据。 - -#### 3.4.2 函数定义 - -```sql -CAPACITY(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], size) -``` - -#### 3.4.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| -------- | ---------- | -------------------------- | ---------------------------------------- | -| DATA | 表参数 | SET SEMANTICPASS THROUGH | 输入表通过 pkeys、okeys 指定分区和排序 | -| SIZE | 标量参数 | 长整数类型 | 窗口大小 | - -#### 3.4.4 返回结果 - -CAPACITY 函数的返回结果列包含: - -* window\_index: 窗口编号 -* 映射列:DATA 参数的所有输入列 - -#### 3.4.5 使用示例 - -```sql -IoTDB> SELECT * FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2); -+------------+-----------------------------+--------+-----+ -|window_index| time|stock_id|price| 
-+------------+-----------------------------+--------+-----+
-|           0|2021-01-01T09:06:00.000+08:00|    TESL|200.0|
-|           0|2021-01-01T09:07:00.000+08:00|    TESL|202.0|
-|           1|2021-01-01T09:15:00.000+08:00|    TESL|195.0|
-|           0|2021-01-01T09:05:00.000+08:00|    AAPL|100.0|
-|           0|2021-01-01T09:07:00.000+08:00|    AAPL|103.0|
-|           1|2021-01-01T09:09:00.000+08:00|    AAPL|102.0|
-+------------+-----------------------------+--------+-----+
-
--- Combined with GROUP BY, this is equivalent to GROUP BY COUNT in the tree model
-IoTDB> SELECT first(time) as start_time, last(time) as end_time, stock_id, avg(price) as avg FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2) GROUP BY window_index, stock_id;
-+-----------------------------+-----------------------------+--------+-----+
-|                   start_time|                     end_time|stock_id|  avg|
-+-----------------------------+-----------------------------+--------+-----+
-|2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|    TESL|201.0|
-|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00|    TESL|195.0|
-|2021-01-01T09:05:00.000+08:00|2021-01-01T09:07:00.000+08:00|    AAPL|101.5|
-|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00|    AAPL|102.0|
-+-----------------------------+-----------------------------+--------+-----+
-```
-
-### 3.5 TUMBLE
-
-#### 3.5.1 Description
-
-The TUMBLE function assigns each row to exactly one window based on a time column. Tumbling windows have a fixed size and do not overlap.
-
-#### 3.5.2 Function Definition
-
-```sql
-TUMBLE(data, timecol, size[, origin])
-```
-#### 3.5.3 Parameters
-
-| Parameter | Parameter type   | Attributes                        | Description                         |
-| --------- | ---------------- | --------------------------------- | ----------------------------------- |
-| DATA      | Table parameter  | ROW SEMANTIC, PASS THROUGH        | Input table                         |
-| TIMECOL   | Scalar parameter | String; default: time             | Time column                         |
-| SIZE      | Scalar parameter | Long integer                      | Window size; must be positive       |
-| ORIGIN    | Scalar parameter | Timestamp; default: Unix epoch    | Start time of the first window      |
-
-#### 3.5.4 Return Value
-
-The result of the TUMBLE function contains the following columns:
-
-* window\_start: window start time (inclusive)
-* window\_end: window end time (exclusive)
-* Pass-through columns: all input columns of the DATA parameter
-
-#### 3.5.5 Example
-
-```SQL
-IoTDB> SELECT * FROM TUMBLE( DATA => bid, TIMECOL => 'time', SIZE => 10m);
-+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY TIME -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM TUMBLE(DATA => bid, TIMECOL => 'time', SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -### 3.6 CUMULATE - -#### 3.6.1 功能描述 - -Cumulate 函数用于从初始的窗口开始,创建相同窗口开始但窗口结束步长不同的窗口,直到达到最大的窗口大小。每个窗口包含其区间内的元素。例如:1小时步长,24小时大小的累计窗口,每天可以获得如下这些窗口:`[00:00, 01:00)`,`[00:00, 02:00)`,`[00:00, 03:00)`, …, `[00:00, 24:00)` - 
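The cumulative-window rule described above can be illustrated outside the database. The following is a minimal Python sketch, not IoTDB's implementation: the helper name `cumulate_windows` and its integer-timestamp interface are assumptions made for illustration, and the rule it applies (align the window start to a multiple of `size` from `origin`, then emit every end boundary at a multiple of `step` that lies after the timestamp) is inferred from the description and from the sample output of the CUMULATE example.

```python
from typing import List, Tuple


def cumulate_windows(ts: int, size: int, step: int, origin: int = 0) -> List[Tuple[int, int]]:
    """Return the (window_start, window_end) pairs that contain `ts`.

    Hypothetical helper: all arguments are plain integers in the same time
    unit. Windows are half-open: start is inclusive, end is exclusive.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    if size % step != 0:
        raise ValueError("size must be an integral multiple of step")

    # Every cumulative window in a group shares the same start, aligned to `size`.
    window_start = origin + (ts - origin) // size * size

    windows = []
    window_end = window_start + step
    while window_end <= window_start + size:
        # Keep only the windows whose exclusive end lies after the timestamp.
        if window_end > ts:
            windows.append((window_start, window_end))
        window_end += step
    return windows


# Minutes since midnight: 09:05 is 545. STEP = 2 minutes, SIZE = 10 minutes.
print(cumulate_windows(545, size=10, step=2))
# [(540, 546), (540, 548), (540, 550)]
```

With minutes since midnight as the unit, the row at 09:05 falls into `[09:00, 09:06)`, `[09:00, 09:08)` and `[09:00, 09:10)`, which matches the three AAPL rows for that timestamp in the CUMULATE sample output.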
-#### 3.6.2 函数定义 - -```sql -CUMULATE(data, timecol, size, step[, origin]) -``` - -#### 3.6.3 参数说明 - -| 参数名 | 参数类型 | 参数属性 | 描述 | -| --------- | ---------- | --------------------------------- | -------------------------------------------- | -| DATA | 表参数 | ROW SEMANTICPASS THROUGH | 输入表 | -| TIMECOL | 标量参数 | 字符串类型默认值:time | 时间列 | -| SIZE | 标量参数 | 长整数类型 | 窗口大小,SIZE必须是STEP的整数倍,需为正数 | -| STEP | 标量参数 | 长整数类型 | 窗口步长,需为正数 | -| ORIGIN | 标量参数 | 时间戳类型默认值:Unix 纪元时间 | 第一个窗口起始时间 | - -> 注意:size 如果不是 step 的整数倍,则会报错`Cumulative table function requires size must be an integral multiple of step` - -#### 3.6.4 返回结果 - -CUMULATE函数的返回结果列包含: - -* window\_start: 窗口开始时间(闭区间) -* window\_end: 窗口结束时间(开区间) -* 映射列:DATA 参数的所有输入列 - -#### 3.6.5 使用示例 - -```sql -IoTDB> SELECT * FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m,SIZE => 10m); -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -| window_start| window_end| time|stock_id|price| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| 
-|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| -+-----------------------------+-----------------------------+-----------------------------+--------+-----+ - --- 结合 GROUP BY 语句,等效于树模型的 GROUP BY TIME -IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m, SIZE => 10m) GROUP BY window_start, window_end, stock_id; -+-----------------------------+-----------------------------+--------+------------------+ -| window_start| window_end|stock_id| avg| -+-----------------------------+-----------------------------+--------+------------------+ -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| TESL| 201.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00| TESL| 195.0| -|2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00| AAPL| 100.0| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| AAPL| 101.5| -|2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| -+-----------------------------+-----------------------------+--------+------------------+ -``` - -## 4. 窗口函数 - -### 4.1 语法定义 - -```SQL -windowDefinition - : name=identifier AS '(' windowSpecification ')' - ; - -windowSpecification - : (existingWindowName=identifier)? 
- (PARTITION BY partition+=expression (',' partition+=expression)*)? - (ORDER BY sortItem (',' sortItem)*)? - windowFrame? - ; - -windowFrame - : frameExtent - ; - -frameExtent - : frameType=RANGE start=frameBound - | frameType=ROWS start=frameBound - | frameType=GROUPS start=frameBound - | frameType=RANGE BETWEEN start=frameBound AND end=frameBound - | frameType=ROWS BETWEEN start=frameBound AND end=frameBound - | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound - ; - -frameBound - : UNBOUNDED boundType=PRECEDING #unboundedFrame - | UNBOUNDED boundType=FOLLOWING #unboundedFrame - | CURRENT ROW #currentRowBound - | expression boundType=(PRECEDING | FOLLOWING) #boundedFrame - ; -``` - -更多详细功能介绍请参考:[窗口函数](../User-Manual/Window-Function_timecho.md) - -### 4.2 使用示例 - -表 device_flow 原始数据如下 - -```sql -+-----------------------------+------+-----+ -| time|device| flow| -+-----------------------------+------+-----+ -|1970-01-01T08:00:00.000+08:00| d0| 3| -|1970-01-01T08:00:00.001+08:00| d0| 5| -|1970-01-01T08:00:00.002+08:00| d0| 3| -|1970-01-01T08:00:00.003+08:00| d0| 1| -|1970-01-01T08:00:00.004+08:00| d1| 2| -|1970-01-01T08:00:00.005+08:00| d1| 4| -+-----------------------------+------+-----+ -``` - -1. 从 device_flow 中查询所有列,并按 device 维度分组,在每个设备分组内按 flow 字段值排序,计算 flow 字段的累计求和,最终将累计和命名为 sum 列返回。 - -查询语句: - -```SQL -IoTDB> SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) as sum FROM device_flow; -``` - -查询结果: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` -2. 
从 device_flow 表查询所有原始列,按 device 设备分组,每个设备分组内按 flow 字段值排序,统计「当前行所在的 flow 分组 + 前 1 个 flow 分组」范围内的行数(计数),最终将计数结果命名为 count 列返回。 - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ORDER BY flow GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -3. 从 device_flow 表查询所有原始列,按 device 分组,每个分组内按 flow 字段值升序排序,统计「当前行 flow 值 - 2」到「当前行 flow 值」这个数值区间内的所有行的数量,最终将计数结果命名为 count 列返回。 - -查询语句: - -```SQL -IoTDB> SELECT *,count(flow) OVER(PARTITION BY device ORDER BY flow RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -## 5. 
Object Type Read Function - -Description: Reads the binary content of an OBJECT value. Returns a BLOB (the object's binary content). -> Supported since V2.0.8 - -Syntax: - -```SQL -READ_OBJECT(object [, offset, length]) -``` - -Parameters: - -* Required: `object`, of type OBJECT -* Optional: - * `offset`, of type long (int64): the read offset; default 0. An exception is thrown if offset is less than 0, or greater than or equal to the total file length - * `length`, of type long (int64): the number of bytes to read; defaults to the total file length - * If length is greater than 2^31 - 1, an error is reported - * If length is greater than the remaining file length from offset, the content from offset to the end of the file is returned - * If length is less than 0, all remaining data of the object starting at offset is read - -Example: - -```SQL -IoTDB:database1> select READ_OBJECT(s1) from table1 where device_id = 'tag1' -+------------+ -| _col0| -+------------+ -|0x696f746462| -+------------+ -Total line number = 1 - - -IoTDB:database1> select READ_OBJECT(s1, 0, 3) from table1 where device_id = 'tag1' -+--------+ -| _col0| -+--------+ -|0x696f74| -+--------+ -Total line number = 1 -``` diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/QuickStart-Only-Sql_timecho.md b/src/zh/UserGuide/latest-Table/SQL-Manual/QuickStart-Only-Sql_timecho.md deleted file mode 100644 index 26e1dac30..000000000 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/QuickStart-Only-Sql_timecho.md +++ /dev/null @@ -1,127 +0,0 @@ - - -# Quick SQL Experience - -> **Before executing the following SQL statements, make sure that** -> -> * **the IoTDB service has been started successfully** -> * **you are connected to IoTDB through the CLI client** -> -> Note: If your terminal does not support multi-line paste (for example, Windows CMD), convert the SQL statements to single-line form before executing them. - -## 1. Database Management - -```SQL --- Create database database1 and set its TTL to 1 year; -CREATE DATABASE IF NOT EXISTS database1 WITH (TTL=31536000000); - --- Use database database1; -USE database1; - --- Change the database TTL to 1 week; -ALTER DATABASE database1 SET PROPERTIES TTL=604800000; - --- Drop database database1; -DROP DATABASE IF EXISTS database1; -``` - -For detailed syntax, see: [Database Management](../Basic-Concept/Database-Management_timecho.md) - -## 2.
表管理 - -```SQL ---创建表 table1; -CREATE TABLE table1 ( - time TIMESTAMP TIME, - device_id STRING TAG, - maintenance STRING ATTRIBUTE COMMENT 'maintenance', - temperature FLOAT FIELD COMMENT 'temperature', - status Boolean FIELD COMMENT 'status' -); - --- 查看表 table1 的列信息; -DESC table1 DETAILS; - --- 修改表; --- 表 table1 增加列; -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS humidity FLOAT FIELD COMMENT 'humidity'; --- 表 table1 TTL 设置为1周; -ALTER TABLE table1 set properties TTL=604800000; - ---删除表 table1; -DROP TABLE table1; -``` - -详细语法说明可参考:[表管理](../Basic-Concept/Table-Management_timecho.md) - -## 3. 数据写入 - -```SQL ---单行写入; -INSERT INTO table1(device_id, time, temperature) VALUES ('100', '2025-11-26 13:37:00', 90.0); - ---多行写入; -INSERT INTO table1(device_id, maintenance, time, temperature) VALUES - ('101', '180', '2024-11-26 13:37:00', 88.0), - ('100', '180', '2024-11-26 13:38:00', 85.0), - ('101', '180', '2024-11-27 16:38:00', 80.0); -``` - -详细语法说明可参考:[数据写入](../Basic-Concept/Write-Updata-Data_timecho.md#_1-数据写入) - -## 4. 数据查询 - -```SQL ---全表查询; -SELECT * FROM table1; - ---函数查询; -SELECT count(*), sum(temperature) FROM table1; - ---查询指定设备及时间段的数据; -SELECT * -FROM table1 -WHERE time >= 2024-11-26 00:00:00 and time <= 2024-11-27 00:00:00 and device_id='101'; -``` - -详细语法说明可参考:[数据查询](../Basic-Concept/Query-Data_timecho.md) - -## 5. 数据更新 - -```SQL --- 更新 device_id 是 100 的数据的属性 maintenance 值; -UPDATE table1 SET maintenance='45' WHERE device_id='100'; -``` - -详细语法说明可参考:[数据更新](../Basic-Concept/Write-Updata-Data_timecho.md#_2-数据更新) - -## 6. 
数据删除 - -```SQL --- 删除指定设备及时间段的数据; -DELETE FROM table1 WHERE time >= 2024-11-26 23:39:00 and time <= 2024-11-27 20:42:00 AND device_id='101'; - --- 全表删除; -DELETE FROM table1; -``` - -详细语法说明可参考:[数据删除](../Basic-Concept/Delete-Data.md) diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/Row-Pattern-Recognition_timecho.md b/src/zh/UserGuide/latest-Table/SQL-Manual/Row-Pattern-Recognition_timecho.md deleted file mode 100644 index 5e6ca2f9a..000000000 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/Row-Pattern-Recognition_timecho.md +++ /dev/null @@ -1,155 +0,0 @@ - - -# 模式查询 - -## 1. 语法定义 - -```SQL -MATCH_RECOGNIZE ( - [ PARTITION BY column [, ...] ] - [ ORDER BY column [, ...] ] - [ MEASURES measure_definition [, ...] ] - [ ROWS PER MATCH ] - [ AFTER MATCH skip_to ] - PATTERN ( row_pattern ) - [ SUBSET subset_definition [, ...] ] - DEFINE variable_definition [, ...] -) -``` - -**说明:** - -* PARTITION BY : 可选,用于对输入表进行分组,每个分组能独立进行模式匹配。如果未声明该子句,则整个输入表将作为一个整体进行处理。 -* ORDER BY :可选,用于确保输入数据按某种顺序进行匹配处理。 -* MEASURES :可选,用于指定从匹配到的一段数据中提取哪些信息。 -* ROWS PER MATCH :可选,用于指定模式匹配成功后结果集的输出方式。 -* AFTER MATCH SKIP :可选,用于指定在识别到一个非空匹配后,下一次模式匹配应从哪一行继续进行。 -* PATTERN :用于定义需要匹配的行模式。 -* SUBSET :可选,用于将多个基本模式变量所匹配的行合并为一个逻辑集合。 -* DEFINE :用于定义行模式的基本模式变量。 - -更多详细功能介绍请参考:[模式查询](../User-Manual/Pattern-Query_timecho.md) - -## 2. 使用示例 - -以[示例数据](../Reference/Sample-Data.md)为源数据 - -1. 
时间分段查询 - -将 table1 中的数据按照时间间隔小于等于 24 小时分段,查询每段中的数据总条数,以及开始、结束时间。 - -查询SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table1 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (cast(B.time as INT64) - cast(PREV(B.time) as INT64)) <= 86400000 -) AS m -``` - -查询结果 - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:38:00.000+08:00| 2| -|2024-11-27T16:38:00.000+08:00|2024-11-30T14:30:00.000+08:00| 16| -+-----------------------------+-----------------------------+---+ -Total line number = 2 -``` - -2. 差值分段查询 - -将 table2 中的数据按照 humidity 湿度值差值小于 0.1 分段,查询每段中的数据总条数,以及开始、结束时间。 - -* 查询sql - -```SQL -SELECT start_time, end_time, cnt -FROM table2 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (B.humidity - PREV(B.humidity )) <=0.1 -) AS m; -``` - -* 查询结果 - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-27T00:00:00.000+08:00| 2| -|2024-11-28T08:00:00.000+08:00|2024-11-29T00:00:00.000+08:00| 2| -|2024-11-29T11:00:00.000+08:00|2024-11-30T00:00:00.000+08:00| 2| -+-----------------------------+-----------------------------+---+ -Total line number = 3 -``` - -3. 
事件统计查询 - -将 table1 中数据按照设备号分组,统计上海地区湿度大于 35 的开始、结束时间及最大湿度值。 - -* 查询sql - -```SQL -SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity -FROM table1 -MATCH_RECOGNIZE ( - PARTITION BY device_id - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RPR_FIRST(A.time) AS event_start, - RPR_LAST(A.time) AS event_end, - MAX(A.humidity) AS max_humidity - ONE ROW PER MATCH - PATTERN (A+) - DEFINE - A AS A.region= '上海' AND A.humidity> 35 -) AS m -``` - -* 查询结果 - -```SQL -+---------+-----+-----------------------------+-----------------------------+------------+ -|device_id|match| event_start| event_end|max_humidity| -+---------+-----+-----------------------------+-----------------------------+------------+ -| 100| 1|2024-11-28T09:00:00.000+08:00|2024-11-29T18:30:00.000+08:00| 45.1| -| 101| 1|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 35.2| -+---------+-----+-----------------------------+-----------------------------+------------+ -Total line number = 2 -``` diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Authority-Management_timecho.md b/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Authority-Management_timecho.md deleted file mode 100644 index 34fc1bf54..000000000 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Authority-Management_timecho.md +++ /dev/null @@ -1,377 +0,0 @@ - -# 权限管理 - -本文档为 V2.0.7 版本起权限管理的 SQL 手册,详细功能使用可见[权限管理](../User-Manual/Authority-Management-Upgrade_timecho.md),如需查阅 V2.0.7 版本之前权限管理的功能介绍可参考[权限管理](../User-Manual/Authority-Management_timecho.md) - -## 1. 权限列表 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
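| Privilege type | Privilege name | Scope | Description |
| --- | --- | --- | --- |
| Global privileges | SYSTEM | Global | Allows the user to create, alter, and drop databases. |
| | | | Allows the user to create, alter, and drop tables and table views. |
| | | | Allows the user to create, drop, and view user-defined functions. |
| | | | Allows the user to create, start, stop, drop, and view PIPEs. Allows the user to create, drop, and view PIPEPLUGINS. |
| | | | Allows the user to run and cancel queries. Allows the user to view variables. Allows the user to view the cluster status. |
| | | | Allows the user to create, drop, and view deep learning models. |
| | SECURITY | Global | Allows the user to create users. |
| | | | Allows the user to drop users. |
| | | | Allows the user to change user passwords. |
| | | | Allows the user to view users' privilege information. |
| | | | Allows the user to list all users. |
| | | | Allows the user to create roles. |
| | | | Allows the user to drop roles. |
| | | | Allows the user to view roles' privilege information. |
| | | | Allows the user to grant a role to a user or revoke it. |
| | | | Allows the user to list all roles. |
| | AUDIT | Global | Allows the user to maintain audit log rules. Allows the user to view audit logs. |
| Data privileges | CREATE | ANY | Allows creating any table and any database. |
| | | Database | Allows the user to create tables in that database; allows the user to create a database with that name. |
| | | Table | Allows the user to create a table with that name. |
| | ALTER | ANY | Allows altering the definition of any table and any database. |
| | | Database | Allows the user to alter the definition of the database and of the tables in it. |
| | | Table | Allows the user to alter the definition of the table. |
| | SELECT | ANY | Allows querying data from any table in any database in the system. |
| | | Database | Allows the user to query data from any table in that database. |
| | | Table | Allows the user to query data in that table. For multi-table queries, the database only shows data the user has permission to access. |
| | INSERT | ANY | Allows inserting/updating data in any table of any database. |
| | | Database | Allows the user to insert/update data in any table within that database. |
| | | Table | Allows the user to insert/update data in that table. |
| | DELETE | ANY | Allows deleting data from any table. |
| | | Database | Allows the user to delete data within that database. |
| | | Table | Allows the user to delete data in that table. |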
- -## 2. SQL 语句 - -### 2.1 用户与角色管理 - -1. 创建用户(需 SECURITY 权限) - -```SQL -CREATE USER -eg: CREATE USER user1 'Passwd@202604'; -``` - -2. 修改密码 - -用户可以修改自己的密码,但修改其他用户密码需要具备 SECURITY 权限。 - -```SQL -ALTER USER SET PASSWORD -eg: ALTER USER tempuser SET PASSWORD 'Newpwd@202604'; -``` - -3. 删除用户(需 SECURITY 权限) - -```SQL -DROP USER -eg: DROP USER user1; -``` - -4. 创建角色 (需 SECURITY 权限) - -```SQL -CREATE ROLE -eg: CREATE ROLE role1; -``` - -5. 删除角色 (需 SECURITY 权限) - -```SQL -DROP ROLE -eg: DROP ROLE role1; -``` - -6. 赋予用户角色 (需 SECURITY 权限) - -```SQL -GRANT ROLE TO -eg: GRANT ROLE admin TO user1; -``` - -7. 移除用户角色 (需 SECURITY 权限) - -```SQL -REVOKE ROLE FROM -eg: REVOKE ROLE admin FROM user1; -``` - -8. 列出所有用户(需 SECURITY 权限) - -```SQL -LIST USER; -``` - -9. 列出所有的角色 (需 SECURITY 权限) - -```SQL -LIST ROLE; -``` - -10. 列出指定角色下所有用户(需 SECURITY 权限) - -```SQL -LIST USER OF ROLE -eg: LIST USER OF ROLE roleuser; -``` - -11. 列出指定用户下的所有角色 - -用户可以列出自己的角色,但列出其他用户的角色需要拥有 SECURITY 权限。 - -```SQL -LIST ROLE OF USER -eg: LIST ROLE OF USER tempuser; -``` - -12. 列出用户所有权限 - -用户可以列出自己的权限信息,但列出其他用户的权限需要拥有 SECURITY 权限。 - -```SQL -LIST PRIVILEGES OF USER -eg: LIST PRIVILEGES OF USER tempuser; -``` - -13. 列出角色所有权限 - -用户可以列出自己具有的角色的权限信息,列出其他角色的权限需要有 SECURITY 权限。 - -```SQL -LIST PRIVILEGES OF ROLE -eg: LIST PRIVILEGES OF ROLE actor; -``` - -### 2.2 权限管理 - -#### 2.2.1 授予权限 - -1. 给用户授予管理用户的权限 - -```SQL -GRANT SECURITY TO USER -eg: GRANT SECURITY TO USER TEST_USER; -``` - -2. 给用户授予创建数据库及在数据库范围内创建表的权限,且允许用户在该范围内管理权限 - -```SQL -GRANT CREATE ON DATABASE TO USER WITH GRANT OPTION -eg: GRANT CREATE ON DATABASE TESTDB TO USER TEST_USER WITH GRANT OPTION; -``` - -3. 给角色授予查询数据库的权限 - -```SQL -GRANT SELECT ON DATABASE TO ROLE -eg: GRANT SELECT ON DATABASE TESTDB TO ROLE TEST_ROLE; -``` - -4. 给用户授予查询表的权限 - -```SQL -GRANT SELECT ON . TO USER -eg: GRANT SELECT ON TESTDB.TESTTABLE TO USER TEST_USER; -``` - -5. 给角色授予查询所有数据库及表的权限 - -```SQL -GRANT SELECT ON ANY TO ROLE -eg: GRANT SELECT ON ANY TO ROLE TEST_ROLE; -``` - -6. 
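ALL syntactic sugar: ALL stands for every privilege within the object scope, so the ALL keyword can be used to grant privileges flexibly. - -```SQL -GRANT ALL TO USER TESTUSER; --- Grants the user every privilege that can be granted, including the global privileges and all data privileges at ANY scope - -GRANT ALL ON ANY TO USER TESTUSER; --- Grants the user every privilege available at ANY scope; after this statement the user holds all data privileges on all databases - -GRANT ALL ON DATABASE TESTDB TO USER TESTUSER; --- Grants the user every privilege available at DB scope; after this statement the user holds all data privileges on that database - -GRANT ALL ON TABLE TESTTABLE TO USER TESTUSER; --- Grants the user every privilege available at TABLE scope; after this statement the user holds all data privileges on that table -``` - -#### 2.2.2 Revoking Privileges - -1. Revoke a user's privilege to manage users - -```SQL -REVOKE SECURITY FROM USER -eg: REVOKE SECURITY FROM USER TEST_USER; -``` - -2. Revoke a user's privilege to create the database and to create tables within it - -```SQL -REVOKE CREATE ON DATABASE FROM USER -eg: REVOKE CREATE ON DATABASE TESTDB FROM USER TEST_USER; -``` - -3. Revoke a user's privilege to query a table - -```SQL -REVOKE SELECT ON . FROM USER -eg: REVOKE SELECT ON TESTDB.TESTTABLE FROM USER TEST_USER; -``` - -4. Revoke a user's privilege to query all databases and tables - -```SQL -REVOKE SELECT ON ANY FROM USER -eg: REVOKE SELECT ON ANY FROM USER TEST_USER; -``` - -5. ALL syntactic sugar: ALL stands for every privilege within the object scope, so the ALL keyword can be used to revoke privileges flexibly. - -```SQL -REVOKE ALL FROM USER TESTUSER; --- Revokes all of the user's global privileges and all data privileges at ANY scope - -REVOKE ALL ON ANY FROM USER TESTUSER; --- Revokes all of the user's data privileges at ANY scope; DB-scope and TABLE-scope privileges are not affected - -REVOKE ALL ON DATABASE TESTDB FROM USER TESTUSER; --- Revokes all of the user's data privileges on the DB; TABLE privileges are not affected - -REVOKE ALL ON TABLE TESTTABLE FROM USER TESTUSER; --- Revokes all of the user's data privileges on the TABLE -``` - -#### 2.2.3 Viewing User Privileges - -```SQL -LIST PRIVILEGES OF USER -eg: LIST PRIVILEGES OF USER tempuser -``` diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Data-Addition-Deletion_timecho.md b/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Data-Addition-Deletion_timecho.md deleted file mode 100644 index df4b0fcc0..000000000 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Data-Addition-Deletion_timecho.md +++ /dev/null @@ -1,171 +0,0 @@ - - -# Data Insertion and Deletion - -## 1. Data Insertion - -**Syntax:** - -```SQL -INSERT INTO [(COLUMN_NAME[, COLUMN_NAME]*)]?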
VALUES (COLUMN_VALUE[, COLUMN_VALUE]*) -``` - -更多详细语法说明请参考:[写入语法](../Basic-Concept/Write-Updata-Data_timecho.md#_1-1-语法) - -**示例一:指定列写入** - -```SQL -INSERT INTO table1(region, plant_id, device_id, time, temperature, humidity) VALUES ('北京', '1001', '100', '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, time, temperature) VALUES ('北京', '1001', '100', '2025-11-26 13:38:00', 91.0); -``` - -**示例二:空值写入** - -上述部分列写入等价于如下的带空值写入 -```SQL -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('北京', '1001', '100', null, null, '2025-11-26 13:37:00', 90.0, 35.1); - -INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity) VALUES ('北京', '1001', '100', null, null, '2025-11-26 13:38:00', 91.0, null); -``` - -**示例三:多行写入** - -```SQL -INSERT INTO table1 -VALUES -('2025-11-26 13:37:00', '北京', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('2025-11-26 13:38:00', '北京', '1001', '100', 'A', '180', 90.0, 35.1, true, '2025-11-26 13:38:25'); - -INSERT INTO table1 -(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) -VALUES -('北京', '1001', '100', 'A', '180', '2025-11-26 13:37:00', 90.0, 35.1, true, '2025-11-26 13:37:34'), -('北京', '1001', '100', 'A', '180', '2025-11-26 13:38:00', 90.0, 35.1, true, '2025-11-26 13:38:25'); -``` - -**示例四:查询写回** - -```SQL -insert into target_table select time,region,device_id,temperature from table1 where region = '北京'; - -insert into target_table(time,device_id,temperature) table table3; - -insert into target_table (select t1.time, t1.region as region, t1.device_id as device_id, t1.temperature as temperature from table1 t1 where t1.time in (select t2.time from table2 t2 where t2.region = '上海')); -``` - -## 2. 数据更新 - -**语法:** - -```SQL -UPDATE SET updateAssignment (',' updateAssignment)* (WHERE where=booleanExpression)? 
- -updateAssignment - : identifier EQ expression - ; -``` - -For more detailed syntax, see: [Update Syntax](../Basic-Concept/Write-Updata-Data_timecho.md#_2-1-语法) - -**Example:** - -```SQL -update table1 set b = a where substring(a, 1, 1) like '%'; -``` - -## 3. Data Deletion - -**Syntax:** - -```SQL -DELETE FROM [WHERE_CLAUSE]? - -WHERE_CLAUSE: - WHERE DELETE_CONDITION - -DELETE_CONDITION: - SINGLE_CONDITION - | DELETE_CONDITION AND DELETE_CONDITION - | DELETE_CONDITION OR DELETE_CONDITION - -SINGLE_CONDITION: - TIME_CONDITION | ID_CONDITION - -TIME_CONDITION: - time TIME_OPERATOR LONG_LITERAL - -TIME_OPERATOR: - < | > | <= | >= | = - -ID_CONDITION: - identifier = STRING_LITERAL -``` - -**Example 1: Delete all data in a table** - -Full-table deletion -```SQL -DELETE FROM table1; -``` - -**Example 2: Delete data within a time range** - -Deletion with a single time bound -```SQL -DELETE FROM table1 WHERE time <= 2024-11-29 00:00:00; -``` -Deletion with both a start and an end time -```SQL -DELETE FROM table1 WHERE time >= 2024-11-27 00:00:00 and time <= 2024-11-29 00:00:00; -``` - -**Example 3: Delete the data of specific devices** - -Delete the data of a specific device -```SQL -DELETE FROM table1 WHERE device_id='101' and model_id = 'B'; -``` -Delete the data of a specific device within a time range -```SQL -DELETE FROM table1 - WHERE time >= 2024-11-27 16:39:00 and time <= 2024-11-29 16:42:00 - AND device_id='101' and model_id = 'B'; -``` -Delete the data of all devices of a given model -```SQL -DELETE FROM table1 WHERE model_id = 'B'; -``` - -## 4. Device Deletion - -**Syntax:** - -```SQL -DELETE DEVICES FROM tableName=qualifiedName (WHERE booleanExpression)? -``` - -**Example: Delete a specific device and all of its related data** - -```SQL -DELETE DEVICES FROM table1 WHERE device_id = '101'; -``` diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Data-Sync_timecho.md b/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Data-Sync_timecho.md deleted file mode 100644 index e272c9053..000000000 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Data-Sync_timecho.md +++ /dev/null @@ -1,320 +0,0 @@ - -# Data Synchronization - -This document mainly covers the SQL statements for data synchronization. For a detailed feature description and usage instructions, see [Data Synchronization](../User-Manual/Data-Sync_timecho.md) - -## 1.
创建任务 - -**语法:** - -```SQL -CREATE PIPE [IF NOT EXISTS] -- PipeId 是能够唯一标定任务的名字 --- 数据抽取插件,可选插件 -WITH SOURCE ( - [ = ,], -) --- 数据处理插件,可选插件 -WITH PROCESSOR ( - [ = ,], -) --- 数据连接插件,必填插件 -WITH SINK ( - [ = ,], -) -``` - -**示例一:全量数据同步** - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -**示例二:部分数据同步** - -```SQL -create pipe A2B -WITH SOURCE ( - 'source'= 'iotdb-source', - 'mode.streaming' = 'true' - 'database-name'='db_b.*', - 'start-time' = '2023.08.23T08:00:00+00:00', - 'end-time' = '2023.10.23T08:00:00+00:00' -) -with SINK ( - 'sink'='iotdb-thrift-async-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -**示例三:双向数据传输** - -* 在 A IoTDB 上执行下列语句 - -```SQL -create pipe AB -with source ( - 'source.mode.double-living' ='true' -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -* 在 B IoTDB 上执行下列语句 - -```SQL -create pipe BA -with source ( - 'source.mode.double-living' ='true' -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -) -``` - -**示例四:边云数据传输** - -* 在 B IoTDB 上执行下列语句,将 B 中数据同步至 A - -```SQL -create pipe BA -with source ( - 'database-name'='db_b.*', - 'table-name'='.*', -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -) -``` - -* 在 C IoTDB 上执行下列语句,将 C 中数据同步至 A - -```SQL -create pipe CA -with source ( - 'database-name'='db_c.*', - 'table-name'='.*', -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -* 在 D IoTDB 上执行下列语句,将 D 中数据同步至 A - -```SQL -create pipe DA -with source ( - 'database-name'='db_d.*', - 'table-name'='.*', -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -) -``` - -**示例五:级联数据传输** - -* 在 A IoTDB 上执行下列语句,将 A 中数据同步至 B - -```SQL -create pipe AB -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -) -``` - -* 在 B IoTDB 上执行下列语句,将 B 中数据同步至 C - -```SQL -create pipe BC -with source ( -) -with sink ( - 
'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -) -``` - -**示例六:跨网闸数据传输** - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-air-gap-sink', - 'node-urls' = '10.53.53.53:9780', -) -``` - -**示例七:压缩同步** - -```SQL -create pipe A2B -with sink ( - 'node-urls' = '127.0.0.1:6668', - 'compressor' = 'snappy,lz4', - 'rate-limit-bytes-per-second'='1048576' -) -``` - -**示例八:加密同步** - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-ssl-sink', - 'node-urls'='127.0.0.1:6667', - 'ssl.trust-store-path'='pki/trusted', - 'ssl.trust-store-pwd'='root' -) -``` - -**示例九:本地导出 Object 类型数据** - -```SQL -CREATE PIPE tsfile_export_local -WITH SOURCE ( - 'source' = 'iotdb-source', - 'table-name' = 'test_table' -) -WITH PROCESSOR ( - 'processor' = 'do-nothing-processor' -) -WITH SINK ( - 'sink' = 'tsfile-local-sink', - 'sink.local.target-path' = '/data/backup/export_2024' - 'sink.rate-limit-bytes-per-second' = '10485760' -); -``` - -**示例十:远程传输 Object 类型数据** - -* 该方式需提前注册 `tsfile_remote_sink` 插件 - -```SQL -CREATE PIPE tsfile_export_scp -WITH SOURCE ( - 'source' = 'iotdb-source', - 'table-name' = 'test_table' -) -WITH PROCESSOR ( - 'processor' = 'do-nothing-processor' -) -WITH SINK ( - 'sink' = 'tsfile_remote_sink', - 'sink.file-mode' = 'scp', - 'sink.scp.host' = '192.168.1.100', - 'sink.scp.port' = '22', - 'sink.scp.user' = 'backup_user', - 'sink.scp.password' = 'ComplexPass123!', - 'sink.scp.remote-path' = '/remote/archive/', - 'sink.rate-limit-bytes-per-second' = '10485760' -); -``` - -## 2. 开始任务 - -**语法:** - -```SQL -START PIPE -``` - -**示例:** - -```SQL -START PIPE A2B -``` - -## 3. 停止任务 - -**语法:** - -```SQL -STOP PIPE -``` - -**示例:** - -```SQL -STOP PIPE A2B -``` - -## 4. 删除任务 - -**语法:** - -```SQL -DROP PIPE [IF EXISTS] -``` - -**示例:** - -```SQL -DROP PIPE IF EXISTS A2B -``` - -## 5. 查看任务 - -**语法:** - -```SQL --- 查看全部任务 -SHOW PIPES --- 查看指定任务 -SHOW PIPE -``` - -**示例:** - -```SQL -SHOW PIPES - -SHOW PIPE A2B -``` - -## 6. 
修改任务 - -**语法:** - -```SQL -ALTER PIPE [IF EXISTS] - MODIFY/REPLACE SOURCE(...) - MODIFY/REPLACE PROCESSOR(...) - MODIFY/REPLACE SINK(...) -``` - -**示例:** - -```SQL -ALTER PIPE A2B REPLACE SINK ('sink'='iotdb-thrift-sink', 'node-urls' = '127.0.0.1:6668'); -``` diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Maintenance-Statements_timecho.md b/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Maintenance-Statements_timecho.md deleted file mode 100644 index cd2206af9..000000000 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Maintenance-Statements_timecho.md +++ /dev/null @@ -1,663 +0,0 @@ - - -# 运维语句 - -## 1. 状态查看 - -### 1.1 查看当前的树/表模型 - -**语法:** - -```SQL -showCurrentSqlDialectStatement - : SHOW CURRENT_SQL_DIALECT - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW CURRENT_SQL_DIALECT -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 1.2 查看登录的用户名 - -**语法:** - -```SQL -showCurrentUserStatement - : SHOW CURRENT_USER - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW CURRENT_USER -+-----------+ -|CurrentUser| -+-----------+ -| root| -+-----------+ -``` - -### 1.3 查看连接的数据库名 - -**语法:** - -```SQL -showCurrentDatabaseStatement - : SHOW CURRENT_DATABASE - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW CURRENT_DATABASE; -+---------------+ -|CurrentDatabase| -+---------------+ -| null| -+---------------+ - -IoTDB> USE test; - -IoTDB> SHOW CURRENT_DATABASE; -+---------------+ -|CurrentDatabase| -+---------------+ -| test| -+---------------+ -``` - -### 1.4 查看集群版本 - -**语法:** - -```SQL -showVersionStatement - : SHOW VERSION - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW VERSION -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.1.2| 1ca4008| -+-------+---------+ -``` - -### 1.5 查看集群关键参数 - -**语法:** - -```SQL -showVariablesStatement - : SHOW VARIABLES - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW VARIABLES -+----------------------------------+-----------------------------------------------------------------+ -| Variable| Value| 
-+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - -### 1.6 查看集群ID - -**语法:** - -```SQL -showClusterIdStatement - : SHOW (CLUSTERID | CLUSTER_ID) - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW CLUSTER_ID -+------------------------------------+ -| ClusterId| -+------------------------------------+ -|40163007-9ec1-4455-aa36-8055d740fcda| -``` - -### 1.7 查看服务器的时间 - -查看客户端直连的 DataNode 进程所在的服务器的时间 - -**语法:** - -```SQL -showCurrentTimestampStatement - : SHOW CURRENT_TIMESTAMP - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW CURRENT_TIMESTAMP -+-----------------------------+ -| CurrentTimestamp| -+-----------------------------+ -|2025-02-17T11:11:52.987+08:00| -+-----------------------------+ -``` - -### 1.8 查看分区信息 - -**语法:** - -```SQL -showRegionsStatement - : SHOW REGIONS - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW REGIONS -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -|RegionId| Type| Status| Database|SeriesSlotNum|TimeSlotNum|DataNodeId|RpcAddress|RpcPort|InternalAddress| Role| CreateTime|TsFileSize| 
-+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -| 6|SchemaRegion|Running|tcollector| 670| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.194| | -| 7| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.196| 169.85 KB| -| 8| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.198| 161.63 KB| -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -``` - -### 1.9 查看可用节点 - -> V2.0.8 起支持该功能 - -**语法:** - -```SQL -showAvailableUrlsStatement - : SHOW AVAILABLE URLS - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW AVAILABLE URLS -+----------+-------+ -|RpcAddress|RpcPort| -+----------+-------+ -| 0.0.0.0| 6667| -+----------+-------+ -``` - -### 1.10 查看服务信息 - -> V2.0.8.2 起支持该功能 - -**语法:** - -```SQL -showServicesStatement - : SHOW SERVICES - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW SERVICES -IoTDB> SHOW SERVICES ON 1 -+------------+-----------+-------+ -|service_name|datanode_id| state| -+------------+-----------+-------+ -| MQTT| 1|STOPPED| -| REST| 1|RUNNING| -+------------+-----------+-------+ -``` - -### 1.11 查看集群激活状态 - -**语法:** - -```SQL -showActivationStatement - : SHOW ACTIVATION - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW ACTIVATION -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - - -## 2. 
状态设置 - -### 2.1 设置连接的树/表模型 - -**语法:** - -```SQL -SET SQL_DIALECT EQ (TABLE | TREE) -``` - -**示例:** - -```SQL -IoTDB> SET SQL_DIALECT=TABLE -IoTDB> SHOW CURRENT_SQL_DIALECT -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 2.2 更新配置项 - -**语法:** - -```SQL -setConfigurationStatement - : SET CONFIGURATION propertyAssignments (ON INTEGER_VALUE)? - ; - -propertyAssignments - : property (',' property)* - ; - -property - : identifier EQ propertyValue - ; - -propertyValue - : DEFAULT - | expression - ; -``` - -**示例:** - -```SQL -IoTDB> SET CONFIGURATION disk_space_warning_threshold='0.05',heartbeat_interval_in_ms='1000' ON 1; -``` - -### 2.3 读取手动修改的配置文件 - -**语法:** - -```SQL -loadConfigurationStatement - : LOAD CONFIGURATION localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**示例:** - -```SQL -IoTDB> LOAD CONFIGURATION ON LOCAL; -``` - -### 2.4 设置系统的状态 - -**语法:** - -```SQL -setSystemStatusStatement - : SET SYSTEM TO (READONLY | RUNNING) localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**示例:** - -```SQL -IoTDB> SET SYSTEM TO READONLY ON CLUSTER; -``` - -## 3. 数据管理 - -### 3.1 将内存表中的数据刷到磁盘 - -**语法:** - -```SQL -flushStatement - : FLUSH identifier? (',' identifier)* booleanValue? localOrClusterMode? - ; - -booleanValue - : TRUE | FALSE - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**示例:** - -```SQL -IoTDB> FLUSH test_db TRUE ON LOCAL; -``` - - -## 4. 数据修复 - -### 4.1 启动后台扫描并修复 tsfile 任务 - -**语法:** - -```SQL -startRepairDataStatement - : START REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**示例:** - -```SQL -IoTDB> START REPAIR DATA ON CLUSTER; -``` - -### 4.2 暂停后台修复 tsfile 任务 - -**语法:** - -```SQL -stopRepairDataStatement - : STOP REPAIR DATA localOrClusterMode? 
- ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**示例:** - -```SQL -IoTDB> STOP REPAIR DATA ON CLUSTER; -``` - -## 5. 查询相关 - -### 5.1 查看正在执行的查询 - -**语法:** - -```SQL -showQueriesStatement - : SHOW (QUERIES | QUERY PROCESSLIST) - (WHERE where=booleanExpression)? - (ORDER BY sortItem (',' sortItem)*)? - limitOffsetClause - ; -``` - -**示例:** - -```SQL -IoTDB> SHOW QUERIES WHERE elapsed_time > 30 -+-----------------------+-----------------------------+-----------+------------+------------+----+ -| query_id| start_time|datanode_id|elapsed_time| statement|user| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -|20250108_101015_00000_1|2025-01-08T18:10:15.935+08:00| 1| 32.283|show queries|root| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -``` - -### 5.2 主动终止查询 - -**语法:** - -```SQL -killQueryStatement - : KILL (QUERY queryId=string | ALL QUERIES) - ; -``` - -**示例:** - -```SQL -IoTDB> KILL QUERY 20250108_101015_00000_1; -- 终止指定query -IoTDB> KILL ALL QUERIES; -- 终止所有query -``` - -### 5.3 查询性能分析 - -#### 5.3.1 查看执行计划 - -**语法:** - -```SQL -EXPLAIN -``` - -更多详细语法说明请参考:[EXPLAIN 语句](../User-Manual/Query-Performance-Analysis.md#_1-explain-语句) - -**示例:** - -```SQL -IoTDB> explain select * from t1 -+-----------------------------------------------------------------------------------------------+ -| distribution plan| -+-----------------------------------------------------------------------------------------------+ -| ┌─────────────────────────────────────────────┐ | -| │OutputNode-4 │ | -| │OutputColumns-[time, device_id, type, speed] │ | -| │OutputSymbols: [time, device_id, type, speed]│ | -| └─────────────────────────────────────────────┘ | -| │ | -| │ | -| ┌─────────────────────────────────────────────┐ | -| │Collect-21 │ | -| │OutputSymbols: [time, device_id, type, speed]│ | -| └─────────────────────────────────────────────┘ | -| 
┌───────────────────────┴───────────────────────┐ | -| │ │ | -|┌─────────────────────────────────────────────┐ ┌───────────┐ | -|│TableScan-19 │ │Exchange-28│ | -|│QualifiedTableName: test.t1 │ └───────────┘ | -|│OutputSymbols: [time, device_id, type, speed]│ │ | -|│DeviceNumber: 1 │ │ | -|│ScanOrder: ASC │ ┌─────────────────────────────────────────────┐| -|│PushDownOffset: 0 │ │TableScan-20 │| -|│PushDownLimit: 0 │ │QualifiedTableName: test.t1 │| -|│PushDownLimitToEachDevice: false │ │OutputSymbols: [time, device_id, type, speed]│| -|│RegionId: 2 │ │DeviceNumber: 1 │| -|└─────────────────────────────────────────────┘ │ScanOrder: ASC │| -| │PushDownOffset: 0 │| -| │PushDownLimit: 0 │| -| │PushDownLimitToEachDevice: false │| -| │RegionId: 1 │| -| └─────────────────────────────────────────────┘| -+-----------------------------------------------------------------------------------------------+ -``` - -#### 5.3.2 查询性能分析 - -**语法:** - -```SQL -EXPLAIN ANALYZE [VERBOSE] -``` - -更多详细语法说明请参考:[EXPLAIN ANALYZE 语句](../User-Manual/Query-Performance-Analysis.md#_2-explain-analyze-语句) - -**示例:** - -```SQL -IoTDB> explain analyze verbose select * from t1 -+-----------------------------------------------------------------------------------------------+ -| Explain Analyze| -+-----------------------------------------------------------------------------------------------+ -|Analyze Cost: 38.860 ms | -|Fetch Partition Cost: 9.888 ms | -|Fetch Schema Cost: 54.046 ms | -|Logical Plan Cost: 10.102 ms | -|Logical Optimization Cost: 17.396 ms | -|Distribution Plan Cost: 2.508 ms | -|Dispatch Cost: 22.126 ms | -|Fragment Instances Count: 2 | -| | -|FRAGMENT-INSTANCE[Id: 20241127_090849_00009_1.2.0][IP: 0.0.0.0][DataRegion: 2][State: FINISHED]| -| Total Wall Time: 18 ms | -| Cost of initDataQuerySource: 6.153 ms | -| Seq File(unclosed): 1, Seq File(closed): 0 | -| UnSeq File(unclosed): 0, UnSeq File(closed): 0 | -| ready queued time: 0.164 ms, blocked queued time: 0.342 ms | -| Query 
Statistics: | -| loadBloomFilterFromCacheCount: 0 | -| loadBloomFilterFromDiskCount: 0 | -| loadBloomFilterActualIOSize: 0 | -| loadBloomFilterTime: 0.000 | -| loadTimeSeriesMetadataAlignedMemSeqCount: 1 | -| loadTimeSeriesMetadataAlignedMemSeqTime: 0.246 | -| loadTimeSeriesMetadataFromCacheCount: 0 | -| loadTimeSeriesMetadataFromDiskCount: 0 | -| loadTimeSeriesMetadataActualIOSize: 0 | -| constructAlignedChunkReadersMemCount: 1 | -| constructAlignedChunkReadersMemTime: 0.294 | -| loadChunkFromCacheCount: 0 | -| loadChunkFromDiskCount: 0 | -| loadChunkActualIOSize: 0 | -| pageReadersDecodeAlignedMemCount: 1 | -| pageReadersDecodeAlignedMemTime: 0.047 | -| [PlanNodeId 43]: IdentitySinkNode(IdentitySinkOperator) | -| CPU Time: 5.523 ms | -| output: 2 rows | -| HasNext() Called Count: 6 | -| Next() Called Count: 5 | -| Estimated Memory Size: : 327680 | -| [PlanNodeId 31]: CollectNode(CollectOperator) | -| CPU Time: 5.512 ms | -| output: 2 rows | -| HasNext() Called Count: 6 | -| Next() Called Count: 5 | -| Estimated Memory Size: : 327680 | -| [PlanNodeId 29]: TableScanNode(TableScanOperator) | -| CPU Time: 5.439 ms | -| output: 1 rows | -| HasNext() Called Count: 3 -| Next() Called Count: 2 | -| Estimated Memory Size: : 327680 | -| DeviceNumber: 1 | -| CurrentDeviceIndex: 0 | -| [PlanNodeId 40]: ExchangeNode(ExchangeOperator) | -| CPU Time: 0.053 ms | -| output: 1 rows | -| HasNext() Called Count: 2 | -| Next() Called Count: 1 | -| Estimated Memory Size: : 131072 | -| | -|FRAGMENT-INSTANCE[Id: 20241127_090849_00009_1.3.0][IP: 0.0.0.0][DataRegion: 1][State: FINISHED]| -| Total Wall Time: 13 ms | -| Cost of initDataQuerySource: 5.725 ms | -| Seq File(unclosed): 1, Seq File(closed): 0 | -| UnSeq File(unclosed): 0, UnSeq File(closed): 0 | -| ready queued time: 0.118 ms, blocked queued time: 5.844 ms | -| Query Statistics: | -| loadBloomFilterFromCacheCount: 0 | -| loadBloomFilterFromDiskCount: 0 | -| loadBloomFilterActualIOSize: 0 | -| loadBloomFilterTime: 0.000 | -| 
loadTimeSeriesMetadataAlignedMemSeqCount: 1 | -| loadTimeSeriesMetadataAlignedMemSeqTime: 0.004 | -| loadTimeSeriesMetadataFromCacheCount: 0 | -| loadTimeSeriesMetadataFromDiskCount: 0 | -| loadTimeSeriesMetadataActualIOSize: 0 | -| constructAlignedChunkReadersMemCount: 1 | -| constructAlignedChunkReadersMemTime: 0.001 | -| loadChunkFromCacheCount: 0 | -| loadChunkFromDiskCount: 0 | -| loadChunkActualIOSize: 0 | -| pageReadersDecodeAlignedMemCount: 1 | -| pageReadersDecodeAlignedMemTime: 0.007 | -| [PlanNodeId 42]: IdentitySinkNode(IdentitySinkOperator) | -| CPU Time: 0.270 ms | -| output: 1 rows | -| HasNext() Called Count: 3 | -| Next() Called Count: 2 | -| Estimated Memory Size: : 327680 | -| [PlanNodeId 30]: TableScanNode(TableScanOperator) | -| CPU Time: 0.250 ms | -| output: 1 rows | -| HasNext() Called Count: 3 | -| Next() Called Count: 2 | -| Estimated Memory Size: : 327680 | -| DeviceNumber: 1 | -| CurrentDeviceIndex: 0 | -+-----------------------------------------------------------------------------------------------+ -``` diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Metadata-Operations_timecho.md b/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Metadata-Operations_timecho.md deleted file mode 100644 index 662677066..000000000 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/SQL-Metadata-Operations_timecho.md +++ /dev/null @@ -1,412 +0,0 @@ - - -# 元数据操作 - -## 1. 数据库管理 - -### 1.1 创建数据库 - -**语法:** - -```SQL -CREATE DATABASE (IF NOT EXISTS)? (WITH properties)? 
-``` - -更多详细语法说明请参考:[创建数据库](../Basic-Concept/Database-Management_timecho.md#_1-1-创建数据库) - -**示例:** - -创建一个名为 database1 的数据库, 数据库的 TTL 时间默认永久。 -```SQL -CREATE DATABASE database1; -CREATE DATABASE IF NOT EXISTS database1; -``` - -创建一个名为 database1 的数据库,并将数据库的 TTL 时间设置为1年。 -```SQL -CREATE DATABASE IF NOT EXISTS database1 with(TTL=31536000000); -``` - -### 1.2 使用数据库 - -**语法:** - -```SQL -USE -``` - -**示例:** - -```SQL -USE database1; -``` - -### 1.3 查看当前数据库 - -**语法:** - -```SQL -SHOW CURRENT_DATABASE; -``` - -**示例:** - -未执行过 `use`语句指定数据库 -```SQL -SHOW CURRENT_DATABASE; -``` -```shell -+---------------+ -|CurrentDatabase| -+---------------+ -| null| -+---------------+ -``` -执行 `use`语句指定数据库 database1 -```sql -USE database1; -SHOW CURRENT_DATABASE; -``` -```shell -+---------------+ -|CurrentDatabase| -+---------------+ -| database1| -+---------------+ -``` - -### 1.4 查看所有数据库 - -**语法:** - -```SQL -SHOW DATABASES (DETAILS)? -``` - -更多返回结果详细说明请参考:[查看所有数据库](../Basic-Concept/Database-Management_timecho.md#_1-4-查看所有数据库) - -**示例:** - -查看所有数据库 -```SQL -SHOW DATABASES; -``` -```shell -+------------------+-------+-----------------------+---------------------+---------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval| -+------------------+-------+-----------------------+---------------------+---------------------+ -| database1| INF| 1| 1| 604800000| -|information_schema| INF| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+ -``` - -查看所有数据库详情 -```sql -SHOW DATABASES DETAILS; -``` -```shell -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| Database|TTL(ms)|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|DataRegionGroupNum| 
-+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -| database1| INF| 1| 1| 604800000| 1| 2| -|information_schema| INF| null| null| null| null| null| -+------------------+-------+-----------------------+---------------------+---------------------+--------------------+------------------+ -``` - -### 1.5 修改数据库 - -**语法:** - -```SQL -ALTER DATABASE (IF EXISTS)? database=identifier SET PROPERTIES propertyAssignments; -``` - -**示例:** - -修改数据库 database1 的 TTL 时间为1年 -```SQL -ALTER DATABASE database1 SET PROPERTIES TTL=31536000000; -``` - -### 1.6 删除数据库 - -**语法:** - -```SQL -DROP DATABASE (IF EXISTS)? ; -``` - -**示例:** - -删除数据库 database1 -```SQL -DROP DATABASE IF EXISTS database1; -``` - -## 2. 表管理 - -### 2.1 创建表 - -**语法:** - -```SQL -createTableStatement - : CREATE TABLE (IF NOT EXISTS)? qualifiedName - '(' (columnDefinition (',' columnDefinition)*)? ')' - charsetDesc? - comment? - (WITH properties)? - ; - -charsetDesc - : DEFAULT? (CHAR SET | CHARSET | CHARACTER SET) EQ? identifierOrString - ; - -columnDefinition - : identifier columnCategory=(TAG | ATTRIBUTE | TIME) charsetName? comment? - | identifier type (columnCategory=(TAG | ATTRIBUTE | TIME | FIELD))? charsetName? comment? 
- ; - -charsetName - : CHAR SET identifier - | CHARSET identifier - | CHARACTER SET identifier - ; - -comment - : COMMENT string - ; -``` - -For more details on the syntax, see [Create Table](../Basic-Concept/Table-Management_timecho.md#_1-1-创建表) - -**Examples:** - -Create table table1 and set its TTL to 1 year -```SQL -CREATE TABLE table1 ( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE COMMENT 'maintenance', - temperature FLOAT FIELD COMMENT 'temperature', - humidity FLOAT FIELD COMMENT 'humidity', - status Boolean FIELD COMMENT 'status', - arrival_time TIMESTAMP FIELD COMMENT 'arrival_time' -) COMMENT 'table1' WITH (TTL=31536000000); -``` - -Create an empty table tableB -```SQL -CREATE TABLE if not exists tableB (); -``` - -Create table tableC -```SQL -CREATE TABLE tableC ( - station STRING TAG, - temperature int32 FIELD COMMENT 'temperature' - ) with (TTL=DEFAULT); -``` - -Create table table1 with a user-defined time column named time_user_defined, placed as the second column of the table (supported since V2.0.8) -```SQL -CREATE TABLE table1 ( - region STRING TAG, - time_user_defined TIMESTAMP TIME, - temperature FLOAT FIELD -); -``` - -Note: If your terminal does not support multi-line paste (for example, Windows CMD), rewrite the SQL statement as a single line before executing it. - -### 2.2 Show Tables - -**Syntax:** - -```SQL -SHOW TABLES (DETAILS)? ((FROM | IN) database_name)? -``` - -**Examples:** - -Show all tables in database database1 -```SQL -show tables from database1; -``` -```shell -+---------+---------------+ -|TableName| TTL(ms)| -+---------+---------------+ -| table1| 31536000000| -+---------+---------------+ -``` - -Show all tables in database database1 together with their properties -```sql -show tables details from database1; -``` -```shell -+---------------+-----------+------+-------+ -| TableName| TTL(ms)|Status|Comment| -+---------------+-----------+------+-------+ -| table1|31536000000| USING| table1| -+---------------+-----------+------+-------+ -``` - -### 2.3 Describe Table Columns - -**Syntax:** - -```SQL -(DESC | DESCRIBE) (DETAILS)? 
-``` - -**示例:** - -查看表 table1 的列信息 -```SQL -desc table1; -``` -```shell -+------------+---------+---------+ -| ColumnName| DataType| Category| -+------------+---------+---------+ -| time|TIMESTAMP| TIME| -| region| STRING| TAG| -| plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model_id| STRING|ATTRIBUTE| -| maintenance| STRING|ATTRIBUTE| -| temperature| FLOAT| FIELD| -| humidity| FLOAT| FIELD| -| status| BOOLEAN| FIELD| -|arrival_time|TIMESTAMP| FIELD| -+------------+---------+---------+ -``` -查看表 table1 的列详细信息 -```sql -desc table1 details; -``` -```shell -+------------+---------+---------+------+------------+ -| ColumnName| DataType| Category|Status| Comment| -+------------+---------+---------+------+------------+ -| time|TIMESTAMP| TIME| USING| null| -| region| STRING| TAG| USING| null| -| plant_id| STRING| TAG| USING| null| -| device_id| STRING| TAG| USING| null| -| model_id| STRING|ATTRIBUTE| USING| null| -| maintenance| STRING|ATTRIBUTE| USING| maintenance| -| temperature| FLOAT| FIELD| USING| temperature| -| humidity| FLOAT| FIELD| USING| humidity| -| status| BOOLEAN| FIELD| USING| status| -|arrival_time|TIMESTAMP| FIELD| USING|arrival_time| -+------------+---------+---------+------+------------+ -``` - -### 2.4 查看表的创建信息 - -**语法:** - -```SQL -SHOW CREATE TABLE -``` - -**示例:** - -查看表 table1 的创建信息 -```SQL -show create table table1; -``` -```shell -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| Table| Create Table| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -|table1|CREATE TABLE 
"table1" ("region" STRING TAG,"plant_id" STRING TAG,"device_id" STRING TAG,"model_id" STRING ATTRIBUTE,"maintenance" STRING ATTRIBUTE,"temperature" FLOAT FIELD,"humidity" FLOAT FIELD,"status" BOOLEAN FIELD,"arrival_time" TIMESTAMP FIELD) WITH (ttl=31536000000)| -+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -``` - - -### 2.5 修改表 - -**语法:** - -```SQL -#addColumn; -ALTER TABLE (IF EXISTS)? tableName=qualifiedName ADD COLUMN (IF NOT EXISTS)? column=columnDefinition COMMENT 'column_comment'; -#dropColumn; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName DROP COLUMN (IF EXISTS)? column=identifier; -#setTableProperties; -// set TTL can use this; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName SET PROPERTIES propertyAssignments; -| COMMENT ON TABLE tableName=qualifiedName IS 'table_comment'; -| COMMENT ON COLUMN tableName.column IS 'column_comment'; -#changeColumndatatype; -| ALTER TABLE (IF EXISTS)? tableName=qualifiedName ALTER COLUMN (IF EXISTS)? column=identifier SET DATA TYPE new_type=type; -``` - -**示例:** - -表 table1 增加 tag 列 a -```SQL -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS a TAG COMMENT 'a'; -``` -表 table1 增加 field 列 b -```SQL -ALTER TABLE table1 ADD COLUMN IF NOT EXISTS b FLOAT FIELD COMMENT 'b'; -``` -修改表 table1 的 TTL -```SQL -ALTER TABLE table1 set properties TTL=3600; -``` -表 table1 增加注释 -```SQL -COMMENT ON TABLE table1 IS 'table1'; -``` -表 table1 的 a 列去掉注释 -```SQL -COMMENT ON COLUMN table1.a IS null; -``` -修改表 table1 的 b 列的数据类型 -```SQL -ALTER TABLE table1 ALTER COLUMN IF EXISTS b SET DATA TYPE DOUBLE; -``` - -### 2.6 删除表 - -**语法:** - -```SQL -DROP TABLE (IF EXISTS)? 
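-``` - -**Examples:** - -```SQL -DROP TABLE table1; -DROP TABLE database1.table1; -``` - - diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/Select-Clause_timecho.md b/src/zh/UserGuide/latest-Table/SQL-Manual/Select-Clause_timecho.md deleted file mode 100644 index d5568b600..000000000 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/Select-Clause_timecho.md +++ /dev/null @@ -1,461 +0,0 @@ - - -# SELECT Clause - -## 1. Syntax Overview - -```sql -SELECT setQuantifier? selectItem (',' selectItem)* - -selectItem - : expression (AS? identifier)? #selectSingle - | tableName '.' ASTERISK (AS columnAliases)? #selectAll - | ASTERISK #selectAll - ; -setQuantifier - : DISTINCT - | ALL - ; -``` - -- **SELECT clause**: Specifies the columns that the query result contains, including aggregate functions (such as SUM, AVG, and COUNT) and window functions. Logically, it is executed last. -- **DISTINCT keyword**: `SELECT DISTINCT column_name` ensures that the values in the query result are unique by removing duplicates. -- **COLUMNS function**: The SELECT clause supports the COLUMNS function for filtering columns. It can be combined with an expression, in which case the expression is applied to every column that was selected. - -## 2. Syntax Details - -Each `selectItem` takes one of the following forms: - -- **Expression**: `expression [ [ AS ] column_alias ]` defines a single output column and may specify a column alias. -- **All columns of a relation**: `relation.*` selects all columns of a relation; column aliases are not allowed. -- **All columns of the result set**: `*` selects all columns of the query; column aliases are not allowed. - -`DISTINCT` can be used in the following contexts: - -- **SELECT statement**: Using DISTINCT in a SELECT statement removes duplicates from the query result. -- **Aggregate functions**: When used with an aggregate function, DISTINCT makes the function process only the non-duplicate rows of the input data set. -- **GROUP BY clause**: In a GROUP BY clause, the ALL and DISTINCT quantifiers determine whether each duplicate grouping set produces distinct output rows. - -`COLUMNS` function: -- **`COLUMNS(*)`**: Matches all columns; can be combined with an expression. -- **`COLUMNS(regexStr) ? AS identifier`**: Regular expression matching - - Matches every column whose name satisfies the regular expression; can be combined with an expression. - - Columns can be renamed by referencing the groups captured by the regular expression. Without AS, the original column name is shown (that is, _coln_ followed by the original column name, where n is the position of the column in the result table). - - Renaming in brief: - - Use parentheses in regexStr to define the groups to capture; - - Use `'$index'` in identifier to reference a captured group. - - Note: When this feature is used, identifier contains the special character '$', so the entire identifier must be enclosed in double quotes. - -## 3. 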
示例数据 - -在[示例数据页面](../Reference/Sample-Data.md)中,包含了用于构建表结构和插入数据的SQL语句,下载并在IoTDB CLI中执行这些语句,即可将数据导入IoTDB,您可以使用这些数据来测试和执行示例中的SQL语句,并获得相应的结果。 - -### 3.1 选择列表 - -#### 3.1.1 星表达式 - -使用星号(*)可以选取表中的所有列,**注意**,星号表达式不能被大多数函数转换,除了`count(*)`的情况。 - -示例:从表中选择所有列 - -```sql -SELECT * FROM table1; -``` - -执行结果如下: - -```shell -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| modifytime| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-29T18:30:00.000+08:00| 上海| 3002| 100| E| 180| 90.0| 35.4| true|2024-11-29T18:30:15.000+08:00| -|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| null| null|2024-11-28T08:00:09.000+08:00| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T10:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| 35.2| null|2024-11-28T10:00:11.000+08:00| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| -|2024-11-26T13:38:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:38:25.000+08:00| -|2024-11-30T09:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 35.2| true| null| -|2024-11-30T14:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 34.8| true|2024-11-30T14:30:17.000+08:00| -|2024-11-29T10:00:00.000+08:00| 上海| 3001| 101| D| 360| 85.0| null| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T16:38:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.1| true|2024-11-26T16:37:01.000+08:00| -|2024-11-27T16:39:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| 35.3| null| null| -|2024-11-27T16:40:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| 
null| null|2024-11-26T16:37:03.000+08:00| -|2024-11-27T16:41:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:04.000+08:00| -|2024-11-27T16:42:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.2| false| null| -|2024-11-27T16:43:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false| null| -|2024-11-27T16:44:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false|2024-11-26T16:37:08.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -Total line number = 18 -It costs 0.653s -``` - -#### 3.1.2 聚合函数 - -聚合函数将多行数据汇总为单个值。当 SELECT 子句中存在聚合函数时,查询将被视为聚合查询。在聚合查询中,所有表达式必须是聚合函数的一部分或由[GROUP BY子句](../SQL-Manual/GroupBy-Clause.md)指定的分组的一部分。 - -示例1:返回地址表中的总行数: - -```sql -SELECT count(*) FROM table1; -``` - -执行结果如下: - -```shell -+-----+ -|_col0| -+-----+ -| 18| -+-----+ -Total line number = 1 -It costs 0.091s -``` - -示例2:返回按城市分组的地址表中的总行数: - -```sql -SELECT region, count(*) - FROM table1 - GROUP BY region; -``` - -执行结果如下: - -```shell -+------+-----+ -|region|_col1| -+------+-----+ -| 上海| 9| -| 北京| 9| -+------+-----+ -Total line number = 2 -It costs 0.071s -``` - -#### 3.1.3 别名 - -关键字`AS`:为选定的列指定别名,别名将覆盖已存在的列名,以提高查询结果的可读性。 - -示例1:原始表格: - -```sql -IoTDB> SELECT * FROM table1; -``` - -执行结果如下: - -```shell -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| modifytime| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-29T18:30:00.000+08:00| 上海| 3002| 100| E| 180| 90.0| 35.4| true|2024-11-29T18:30:15.000+08:00| -|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| null| null|2024-11-28T08:00:09.000+08:00| 
-|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T10:00:00.000+08:00| 上海| 3001| 100| C| 90| 85.0| 35.2| null|2024-11-28T10:00:11.000+08:00| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| -|2024-11-26T13:38:00.000+08:00| 北京| 1001| 100| A| 180| 90.0| 35.1| true|2024-11-26T13:38:25.000+08:00| -|2024-11-30T09:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 35.2| true| null| -|2024-11-30T14:30:00.000+08:00| 上海| 3002| 101| F| 360| 90.0| 34.8| true|2024-11-30T14:30:17.000+08:00| -|2024-11-29T10:00:00.000+08:00| 上海| 3001| 101| D| 360| 85.0| null| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T16:38:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.1| true|2024-11-26T16:37:01.000+08:00| -|2024-11-27T16:39:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| 35.3| null| null| -|2024-11-27T16:40:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:03.000+08:00| -|2024-11-27T16:41:00.000+08:00| 北京| 1001| 101| B| 180| 85.0| null| null|2024-11-26T16:37:04.000+08:00| -|2024-11-27T16:42:00.000+08:00| 北京| 1001| 101| B| 180| null| 35.2| false| null| -|2024-11-27T16:43:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false| null| -|2024-11-27T16:44:00.000+08:00| 北京| 1001| 101| B| 180| null| null| false|2024-11-26T16:37:08.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -Total line number = 18 -It costs 0.653s -``` - -示例2:单列设置别名: - -```sql -IoTDB> SELECT device_id - AS device - FROM table1; -``` - -执行结果如下: - -```shell -+------+ -|device| -+------+ -| 100| -| 100| -| 100| -| 100| -| 100| -| 100| -| 100| -| 100| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -| 101| -+------+ -Total line number = 18 -It costs 0.053s -``` - -示例3:所有列的别名: - -```sql -IoTDB> SELECT 
table1.* - AS (timestamp, Reg, Pl, DevID, Mod, Mnt, Temp, Hum, Stat,MTime) - FROM table1; -``` - -执行结果如下: - -```shell -+-----------------------------+----+----+-----+---+---+----+----+-----+-----------------------------+ -| TIMESTAMP| REG| PL|DEVID|MOD|MNT|TEMP| HUM| STAT| MTIME| -+-----------------------------+----+----+-----+---+---+----+----+-----+-----------------------------+ -|2024-11-29T11:00:00.000+08:00|上海|3002| 100| E|180|null|45.1| true| null| -|2024-11-29T18:30:00.000+08:00|上海|3002| 100| E|180|90.0|35.4| true|2024-11-29T18:30:15.000+08:00| -|2024-11-28T08:00:00.000+08:00|上海|3001| 100| C| 90|85.0|null| null|2024-11-28T08:00:09.000+08:00| -|2024-11-28T09:00:00.000+08:00|上海|3001| 100| C| 90|null|40.9| true| null| -|2024-11-28T10:00:00.000+08:00|上海|3001| 100| C| 90|85.0|35.2| null|2024-11-28T10:00:11.000+08:00| -|2024-11-28T11:00:00.000+08:00|上海|3001| 100| C| 90|88.0|45.1| true|2024-11-28T11:00:12.000+08:00| -|2024-11-26T13:37:00.000+08:00|北京|1001| 100| A|180|90.0|35.1| true|2024-11-26T13:37:34.000+08:00| -|2024-11-26T13:38:00.000+08:00|北京|1001| 100| A|180|90.0|35.1| true|2024-11-26T13:38:25.000+08:00| -|2024-11-30T09:30:00.000+08:00|上海|3002| 101| F|360|90.0|35.2| true| null| -|2024-11-30T14:30:00.000+08:00|上海|3002| 101| F|360|90.0|34.8| true|2024-11-30T14:30:17.000+08:00| -|2024-11-29T10:00:00.000+08:00|上海|3001| 101| D|360|85.0|null| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T16:38:00.000+08:00|北京|1001| 101| B|180|null|35.1| true|2024-11-26T16:37:01.000+08:00| -|2024-11-27T16:39:00.000+08:00|北京|1001| 101| B|180|85.0|35.3| null| null| -|2024-11-27T16:40:00.000+08:00|北京|1001| 101| B|180|85.0|null| null|2024-11-26T16:37:03.000+08:00| -|2024-11-27T16:41:00.000+08:00|北京|1001| 101| B|180|85.0|null| null|2024-11-26T16:37:04.000+08:00| -|2024-11-27T16:42:00.000+08:00|北京|1001| 101| B|180|null|35.2|false| null| -|2024-11-27T16:43:00.000+08:00|北京|1001| 101| B|180|null|null|false| null| -|2024-11-27T16:44:00.000+08:00|北京|1001| 101| 
B|180|null|null|false|2024-11-26T16:37:08.000+08:00| -+-----------------------------+----+----+-----+---+---+----+----+-----+-----------------------------+ -Total line number = 18 -It costs 0.189s -``` - -#### 3.1.4 Object 类型查询 - -> V2.0.8 版本起支持 - -示例一:直接查询 object 类型数据 - -```SQL -IoTDB:database1> select s1 from table1 where device_id = 'tag1'; -``` - -执行结果如下: -```shell -+------------+ -| s1| -+------------+ -|(Object) 5 B| -+------------+ -Total line number = 1 -It costs 0.428s -``` - -示例二:通过 read\_object 函数查询 Object 类型数据的真实内容 - -```SQL -IoTDB:database1> select read_object(s1) from table1 where device_id = 'tag1'; -``` - -执行结果如下: -```shell -+------------+ -| _col0| -+------------+ -|0x696f746462| -+------------+ -Total line number = 1 -It costs 0.188s -``` - - -### 3.2 Columns 函数 - -1. 不结合表达式 - -查询列名以 'm' 开头的列的数据 -```sql -IoTDB:database1> select columns('^m.*') from table1 limit 5; -``` - -执行结果如下: -```shell -+--------+-----------+ -|model_id|maintenance| -+--------+-----------+ -| E| 180| -| E| 180| -| C| 90| -| C| 90| -| C| 90| -+--------+-----------+ -``` - -查询列名以 'o' 开头的列,未匹配到任何列,抛出异常 -```SQL -IoTDB:database1> select columns('^o.*') from table1 limit 5; -``` -执行结果如下: -```shell -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: No matching columns found that match regex '^o.*' -``` - -查询列名以 'm' 开头的列的数据,并重命名以 'series_' 开头 -```SQL -IoTDB:database1> select columns('^m(.*)') AS "series_$0" from table1 limit 5; -``` -执行结果如下: -```shell -+---------------+------------------+ -|series_model_id|series_maintenance| -+---------------+------------------+ -| E| 180| -| E| 180| -| C| 90| -| C| 90| -| C| 90| -+---------------+------------------+ -``` - -2. 
结合表达式 - -- 单个 COLUMNS 函数 - -查询所有列的最小值 -```sql -IoTDB:database1> select min(columns(*)) from table1; -``` -执行结果如下: -```shell -+-----------------------------+------------+--------------+---------------+--------------+-----------------+-----------------+--------------+------------+-----------------------------+ -| _col0_time|_col1_region|_col2_plant_id|_col3_device_id|_col4_model_id|_col5_maintenance|_col6_temperature|_col7_humidity|_col8_status| _col9_arrival_time| -+-----------------------------+------------+--------------+---------------+--------------+-----------------+-----------------+--------------+------------+-----------------------------+ -|2024-11-26T13:37:00.000+08:00| 上海| 1001| 100| A| 180| 85.0| 34.8| false|2024-11-26T13:37:34.000+08:00| -+-----------------------------+------------+--------------+---------------+--------------+-----------------+-----------------+--------------+------------+-----------------------------+ -``` - -- 多个 COLUMNS 函数,出现在同一表达式 - -> 使用限制:出现多个 COLUMNS 函数时,多个 COLUMNS 函数的参数要完全相同 - -查询 'h' 开头列的最小值和最大值之和 -```sql -IoTDB:database1> select min(columns('^h.*')) + max(columns('^h.*')) from table1; -``` -执行结果如下: -```shell -+--------------+ -|_col0_humidity| -+--------------+ -| 79.899994| -+--------------+ -``` - -错误查询,两个 COLUMNS 函数不完全相同 -```SQL -IoTDB:database1> select min(columns('^h.*')) + max(columns('^t.*')) from table1; -``` -执行结果如下: -```shell -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: Multiple different COLUMNS in the same expression are not supported -``` - -- 多个 COLUMNS 函数,出现在不同表达式 - -分别查询 'h' 开头列的最小值和最大值 -```sql -IoTDB:database1> select min(columns('^h.*')) , max(columns('^h.*')) from table1; -``` -执行结果如下: -```shell -+--------------+--------------+ -|_col0_humidity|_col1_humidity| -+--------------+--------------+ -| 34.8| 45.1| -+--------------+--------------+ -``` -分别查询 'h' 开头列的最小值和 'te'开头列的最大值 -```SQL -IoTDB:database1> select min(columns('^h.*')) , max(columns('^te.*')) from table1; -``` -执行结果如下: -```shell 
-+--------------+-----------------+ -|_col0_humidity|_col1_temperature| -+--------------+-----------------+ -| 34.8| 90.0| -+--------------+-----------------+ -``` - -3. 在 WHERE 子句中使用 - -查询数据,所有 'h' 开头列的数据必须要大于 40 -```sql -IoTDB:database1> select * from table1 where columns('^h.*') > 40; -``` -执行结果如下: -```shell -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| arrival_time| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -``` -等价于 -```SQL -IoTDB:database1> select * from table1 where humidity > 40; -``` -执行结果如下: -```shell -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|model_id|maintenance|temperature|humidity|status| arrival_time| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| E| 180| null| 45.1| true| null| -|2024-11-28T09:00:00.000+08:00| 上海| 3001| 100| C| 90| null| 40.9| true| null| -|2024-11-28T11:00:00.000+08:00| 上海| 3001| 100| C| 90| 88.0| 45.1| true|2024-11-28T11:00:12.000+08:00| -+-----------------------------+------+--------+---------+--------+-----------+-----------+--------+------+-----------------------------+ -``` - -## 4. 
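Result Set Column Order - -- **Column order**: The columns in the result set appear in the order specified in the SELECT clause. -- **Multi-column order**: If a select expression returns multiple columns, they are ordered as in the source relation. \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/Set-Operations_timecho.md b/src/zh/UserGuide/latest-Table/SQL-Manual/Set-Operations_timecho.md deleted file mode 100644 index 07d8ff1fb..000000000 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/Set-Operations_timecho.md +++ /dev/null @@ -1,322 +0,0 @@ - - -# Set Operations - -IoTDB natively supports the SQL standard set operations through three core operators: UNION, INTERSECT, and EXCEPT. They let you merge, compare, and filter the results of queries over multiple time series data sources within a single statement. - -> Note: This feature is available starting from V2.0.9.1. - -## 1. UNION -### 1.1 Overview - -The UNION operation combines all rows of two query result sets (result order is not guaranteed). It supports two modes: deduplicated (the default) and duplicate-preserving. - -### 1.2 Syntax - -```SQL -query UNION [ALL | DISTINCT] query -``` - -**Description:** - -1. **Deduplication rules:** - - 1. Default (`UNION` or `UNION DISTINCT`): duplicate rows are removed automatically. - 2. `UNION ALL`: all rows are kept, including duplicates; this performs better. -2. **Input requirements:** - - 1. Both query results must have the same number of columns. - 2. The data types of corresponding columns must be compatible. The compatibility rules are: - * Numeric types are mutually compatible: `INT32`, `INT64`, `FLOAT`, and `DOUBLE`. - * String types are mutually compatible: `TEXT` and `STRING`. - * Special rule: `INT64` is compatible with `TIMESTAMP`. -3. **Result set rules:** - - 1. Column names and order are inherited from the first query. - -### 1.3 Usage Examples - -The examples use the [sample data](../Reference/Sample-Data.md) as source data. - -1. Get the set of non-null device and temperature data from table1 and table2 (deduplicated) - -```SQL -select device_id,temperature from table1 where temperature is not null -union -select device_id,temperature from table2 where temperature is not null; - --- equivalent to -select device_id,temperature from table1 where temperature is not null -union distinct -select device_id,temperature from table2 where temperature is not null; -``` - -Result: - -```Bash -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| 90.0| -| 101| 85.0| -| 100| 90.0| -| 100| 85.0| -| 100| 88.0| -+---------+-----------+ -Total line number = 5 -It costs 0.074s -``` - -2. 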
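Get the set of non-null device and temperature data from table1 and table2 (duplicates kept) - -```SQL -select device_id,temperature from table1 where temperature is not null -union all -select device_id,temperature from table2 where temperature is not null; -``` - -Result: - -```Bash -+---------+-----------+ -|device_id|temperature| -+---------+-----------+ -| 101| 90.0| -| 101| 90.0| -| 101| 85.0| -| 101| 85.0| -| 101| 85.0| -| 101| 85.0| -| 100| 90.0| -| 100| 85.0| -| 100| 85.0| -| 100| 88.0| -| 100| 90.0| -| 100| 90.0| -| 101| 90.0| -| 101| 85.0| -| 101| 85.0| -| 100| 85.0| -| 100| 90.0| -+---------+-----------+ -Total line number = 17 -It costs 0.108s -``` - -> **Note**: -> -> * Set operations **do not guarantee result order**; the actual output order may differ from the example. - -## 2. INTERSECT -### 2.1 Overview - -The INTERSECT operation returns the rows that appear in both query result sets (result order is not guaranteed). It supports two modes: deduplicated (the default) and duplicate-preserving. - -### 2.2 Syntax - -```SQL -query1 INTERSECT [ALL | DISTINCT] query2 -``` - -**Description**: - -1. **Deduplication rules**: - - 1. Default (`INTERSECT` or `INTERSECT DISTINCT`): duplicate rows are removed automatically. - 2. `INTERSECT ALL`: duplicate rows are kept; performance is slightly lower. -2. **Precedence rules**: - - 1. `INTERSECT` has higher precedence than `UNION` and `EXCEPT` (for example, `A UNION B INTERSECT C` is equivalent to `A UNION (B INTERSECT C)`). - 2. Evaluation is left to right (`A INTERSECT B INTERSECT C` is equivalent to `(A INTERSECT B) INTERSECT C`). -3. **Input requirements**: - - 1. Both query results must have the same number of columns. - 2. The data types of corresponding columns must be compatible (same rules as UNION): - * Numeric types are mutually compatible: `INT32`, `INT64`, `FLOAT`, and `DOUBLE`. - * String types are mutually compatible: `TEXT` and `STRING`. - * Special rule: `INT64` is compatible with `TIMESTAMP`. - 3. NULL values are treated as equal (`NULL IS NOT DISTINCT FROM NULL`). - 4. If `SELECT` does not include the `time` column, `time` does not take part in the comparison and the result set has no `time` column. -4. **Result set rules**: - - 1. Column names and order are inherited from the first query. - -### 2.3 Usage Examples - -Based on the [sample data](../Reference/Sample-Data.md): - -1. 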
Get the device and temperature data common to table1 and table2 (deduplicated) - - ```SQL - select device_id, temperature from table1 - intersect - select device_id, temperature from table2; - - -- equivalent to - select device_id, temperature from table1 - intersect distinct - select device_id, temperature from table2; - ``` - - Result: - - ```Bash - +---------+-----------+ - |device_id|temperature| - +---------+-----------+ - | 101| 90.0| - | 101| 85.0| - | 100| null| - | 100| 90.0| - | 100| 85.0| - +---------+-----------+ - Total line number = 5 - It costs 0.087s - ``` -2. Get the device and temperature data common to table1 and table2 (duplicates kept) - - ```SQL - select device_id, temperature from table1 - intersect all - select device_id, temperature from table2; - ``` - - Result: - - ```Bash - +---------+-----------+ - |device_id|temperature| - +---------+-----------+ - | 100| 85.0| - | 100| 90.0| - | 100| null| - | 101| 85.0| - | 101| 85.0| - | 101| 90.0| - +---------+-----------+ - Total line number = 6 - It costs 0.139s - ``` - -> **Note**: -> -> * Set operations **do not guarantee result order**; the actual output order may differ from the example. -> * When mixing with `UNION`/`EXCEPT`, use parentheses to make the precedence explicit (for example, `A INTERSECT (B UNION C)`). - -## 3. EXCEPT -### 3.1 Overview - -The EXCEPT operation returns the rows that exist in the first query result set but not in the second (result order is not guaranteed). It supports two modes: deduplicated (the default) and duplicate-preserving. - -### 3.2 Syntax - -```SQL -query1 EXCEPT [ALL | DISTINCT] query2 -``` - -**Description**: - -1. **Deduplication rules**: - - 1. Default (`EXCEPT` or `EXCEPT DISTINCT`): duplicate rows are removed automatically. - 2. `EXCEPT ALL`: duplicate rows are kept; performance is slightly lower. -2. **Precedence rules**: - - 1. `EXCEPT` has the same precedence as `UNION` and lower precedence than `INTERSECT` (for example, `A INTERSECT B EXCEPT C` is equivalent to `(A INTERSECT B) EXCEPT C`). - 2. Evaluation is left to right (`A EXCEPT B EXCEPT C` is equivalent to `(A EXCEPT B) EXCEPT C`). -3. **Input requirements**: - - 1. Both query results must have the same number of columns. - 2. The data types of corresponding columns must be compatible (same rules as UNION): - * Numeric types are mutually compatible: `INT32`, `INT64`, `FLOAT`, and `DOUBLE`. - * String types are mutually compatible: `TEXT` and `STRING`. - * Special rule: `INT64` is compatible with `TIMESTAMP`. - 3. NULL values are treated as equal (`NULL IS NOT DISTINCT FROM NULL`). - 4. If `SELECT` does not include the `time` column, `time` does not take part in the comparison and the result set has no `time` column. -4. **Result set rules**: - - 1. Column names and order are inherited from the first query. - -### 3.3 Usage Examples - -Based on the [sample data](../Reference/Sample-Data.md): - -1. 
获取 table1 中存在但 table2 中不存在的设备及温度数据(去重) - - ```SQL - select device_id, temperature from table1 - except - select device_id, temperature from table2; - - --等价于; - select device_id, temperature from table1 - except distinct - select device_id, temperature from table2; - ``` - - 执行结果: - - ```Bash - +---------+-----------+ - |device_id|temperature| - +---------+-----------+ - | 101| null| - | 100| 88.0| - +---------+-----------+ - Total line number = 2 - It costs 0.173s - ``` -2. 获取 table1 中存在但 table2 中不存在的设备及温度数据(保留重复) - - ```SQL - select device_id, temperature from table1 - except all - select device_id, temperature from table2; - ``` - - 执行结果: - - ```Bash - +---------+-----------+ - |device_id|temperature| - +---------+-----------+ - | 100| 85.0| - | 100| 88.0| - | 100| 90.0| - | 100| 90.0| - | 100| null| - | 101| 85.0| - | 101| 85.0| - | 101| 90.0| - | 101| null| - | 101| null| - | 101| null| - | 101| null| - +---------+-----------+ - Total line number = 12 - It costs 0.155s - ``` - -> ​**注意**​: -> -> * 集合操作​**不保证结果顺序**​,实际输出顺序可能与示例不同。 -> * 与 `UNION`/`INTERSECT` 混合使用时,需通过括号明确优先级(如 `A EXCEPT (B INTERSECT C)`)。 diff --git a/src/zh/UserGuide/latest-Table/SQL-Manual/overview_timecho.md b/src/zh/UserGuide/latest-Table/SQL-Manual/overview_timecho.md deleted file mode 100644 index 0cb0fd1d9..000000000 --- a/src/zh/UserGuide/latest-Table/SQL-Manual/overview_timecho.md +++ /dev/null @@ -1,54 +0,0 @@ - - -# 概览 - -## 1. 
语法概览 - -```SQL -SELECT ⟨select_list⟩ - FROM ⟨tables⟩ | patternRecognition - [WHERE ⟨condition⟩] - [GROUP BY ⟨groups⟩] - [HAVING ⟨group_filter⟩] - [WINDOW windowDefinition (',' windowDefinition)*)] - [FILL ⟨fill_methods⟩] - [ORDER BY ⟨order_expression⟩] - [OFFSET ⟨n⟩] - [LIMIT ⟨n⟩]; -``` - -IoTDB 查询语法提供以下子句: - -- SELECT 子句:查询结果应包含的列。详细语法见:[SELECT子句](../SQL-Manual/Select-Clause_timecho.md) -- FROM 子句:指出查询的数据源,可以是单个表、多个通过 `JOIN` 子句连接的表,或者是一个子查询。详细语法见:[FROM & JOIN 子句](../SQL-Manual/From-Join-Clause.md) -- WHERE 子句:用于过滤数据,只选择满足特定条件的数据行。这个子句在逻辑上紧跟在 FROM 子句之后执行。详细语法见:[WHERE 子句](../SQL-Manual/Where-Clause.md) -- GROUP BY 子句:当需要对数据进行聚合时使用,指定了用于分组的列。详细语法见:[GROUP BY 子句](../SQL-Manual/GroupBy-Clause.md) -- HAVING 子句:在 GROUP BY 子句之后使用,用于对已经分组的数据进行过滤。与 WHERE 子句类似,但 HAVING 子句在分组后执行。详细语法见:[HAVING 子句](../SQL-Manual/Having-Clause.md) -- FILL 子句:用于处理查询结果中的空值,用户可以使用 FILL 子句来指定数据缺失时的填充模式(如前一个非空值或线性插值)来填充 null 值,以便于数据可视化和分析。 详细语法见:[FILL 子句](../SQL-Manual/Fill-Clause.md) -- ORDER BY 子句:对查询结果进行排序,可以指定升序(ASC)或降序(DESC),以及 NULL 值的处理方式(NULLS FIRST 或 NULLS LAST)。详细语法见:[ORDER BY 子句](../SQL-Manual/OrderBy-Clause.md) -- OFFSET 子句:用于指定查询结果的起始位置,即跳过前 OFFSET 行。与 LIMIT 子句配合使用。详细语法见:[LIMIT 和 OFFSET 子句](../SQL-Manual/Limit-Offset-Clause.md) -- LIMIT 子句:限制查询结果的行数,通常与 OFFSET 子句一起使用以实现分页功能。详细语法见:[LIMIT 和 OFFSET 子句](../SQL-Manual/Limit-Offset-Clause.md) - -## 2. 子句执行顺序 - - -![](/img/data-query-1.png) diff --git a/src/zh/UserGuide/latest-Table/Tools-System/CLI_timecho.md b/src/zh/UserGuide/latest-Table/Tools-System/CLI_timecho.md deleted file mode 100644 index 1df7d5e04..000000000 --- a/src/zh/UserGuide/latest-Table/Tools-System/CLI_timecho.md +++ /dev/null @@ -1,195 +0,0 @@ - - -# 命令行工具 - -IoTDB 为用户提供 CLI 工具用于和服务端程序进行交互操作。在使用 CLI 工具连接 IoTDB 前,请保证 IoTDB 服务已经正常启动。下面介绍 CLI 工具的运行方式和相关参数。 - -> 本文中 $IoTDB_HOME 表示 IoTDB 的安装目录所在路径。 - -## 1. 
CLI 启动 - -CLI 客户端脚本是 $IoTDB_HOME/sbin 文件夹下的`start-cli`脚本。启动命令为: - -- Linux/MacOS 系统常用启动命令为: - -```Shell -Shell> bash sbin/start-cli.sh -sql_dialect table -或 -# V2.0.6.x 版本之前 -Shell> bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x 版本及之后 -Shell> bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` - -- Windows 系统常用启动命令为: - -```Shell -# V2.0.4.x 版本之前 -Shell> sbin\start-cli.bat -sql_dialect table -或 -Shell> sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table - -# V2.0.4.x 版本及之后 -Shell> sbin\windows\start-cli.bat -sql_dialect table -或 -# V2.0.4.x 版本及之后, V2.0.6.x 版本之前 -Shell> sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root -sql_dialect table -# V2.0.6.x 版本及之后 -Shell> sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -sql_dialect table -``` - -其中: - -- -h 和-p 项是 IoTDB 所在的 IP 和 RPC 端口号(本机未修改 IP 和 RPC 端口号默认为 127.0.0.1、6667) -- -u 和-pw 是 IoTDB 登录的用户名密码(安装后IoTDB有一个默认用户,用户名为`root`,密码为`TimechoDB@2021`,V2.0.6版本之前密码为`root`) -- -sql_dialect 是登录的数据模型(表模型或树模型),此处指定为 table 代表进入表模型模式 - -更多参数见: - -| **参数名** | **参数类型** | **是否为必需参数** | **说明** | **示例** | -|:-----------------------------|:-----------|:------------|:-----------------------------------------------------------|:---------------------| -| -h `` | string 类型 | 否 | IoTDB 客户端连接 IoTDB 服务器的 IP 地址, 默认使用:127.0.0.1。 | -h 127.0.0.1 | -| -p `` | int 类型 | 否 | IoTDB 客户端连接服务器的端口号,IoTDB 默认使用 6667。 | -p 6667 | -| -u `` | string 类型 | 否 | IoTDB 客户端连接服务器所使用的用户名,默认使用 root。 | -u root | -| -pw `` | string 类型 | 否 | IoTDB 客户端连接服务器所使用的密码,默认使用 TimechoDB@2021(V2.0.6版本之前为root)。 | -pw root | -| -sql_dialect `` | string 类型 | 否 | 目前可选 tree(树模型) 、table(表模型),默认 tree | -sql_dialect table | -| -e `` | string 类型 | 否 | 在不进入客户端输入模式的情况下,批量操作 IoTDB。 | -e "show databases" | -| -c | 空 | 否 | 如果服务器设置了 rpc_thrift_compression_enable=true, 则 CLI 必须使用 -c | -c | -| -disableISO8601 | 空 | 否 | 如果设置了这个参数,IoTDB 将以数字的形式打印时间戳 
(timestamp)。 | -disableISO8601 | -| -usessl `` | Boolean 类型 | 否 | 否开启 ssl 连接 | -usessl true | -| -ts `` | string 类型 | 否 | ssl 证书存储路径 | -ts /path/to/truststore | -| -tpw `` | string 类型 | 否 | ssl 证书存储密码 | -tpw myTrustPassword | -| -timeout `` | int 类型 | 否 | 查询超时时间(秒)。如果未设置,则使用服务器的配置。 | -timeout 30 | -| -help | 空 | 否 | 打印 IoTDB 的帮助信息。 | -help | - -启动后出现如图提示即为启动成功。 - -![](/img/Cli-01.png) - -## 2. CLI 使用 - -### 2.1 执行语句 - -进入 CLI 后,用户可以直接在对话中输入 SQL 语句进行交互。如: - -- 创建数据库 - -```Java -create database test -``` - -![](/img/Cli-02.png) - - -- 查看数据库 - -```Java -show databases -``` - -![](/img/Cli-03.png) - - -### 2.2 命令技巧 - -CLI中使用命令小技巧: - -(1)快速切换历史命令: 上下箭头 - -(2)历史命令自动补全:右箭头 - -(3)中断执行命令: CTRL+C - -## 3. CLI 退出 - -在 CLI 中输入`quit`或`exit`可退出 CLI 结束本次会话。 - -## 4. 访问历史功能 - -IoTDB **V2.0.9.1** 起支持开启访问历史功能,即客户端登录成功后展示关键的历史访问信息,支持分布式场景。管理员与普通用户仅可查看自身访问历史,核心展示内容包括: - -* 上一次成功会话:显示日期、时间、访问应用、IP地址及访问方法(首次登录或无历史记录时不显示)。 -* 最近一次失败尝试:显示距离本次成功登录时间最近的一次失败记录的日期、时间、访问应用、IP地址及访问方法。 -* 累计失败次数:统计自上一次成功会话建立以来,所有未成功建立的会话尝试总次数。 - -### 4.1 开启访问历史 - -支持通过修改 `iotdb-system.properties` 文件中的相关参数来控制是否开启访问历史功能,修改参数后需重启生效,例如: - -```Plain -# 用于控制是否启用审计日志功能 -enable_audit_log=false -``` - -* 开启时,记录登录信息并定期清理过期数据; -* 关闭时,不记录、不展示、不清理; -* 开关关闭后重开,展示的历史为关闭前最后一条记录,不一定代表真实最近登录记录。 - -使用示例: - -```Bash ---------------------- -Starting IoTDB Cli ---------------------- - _____ _________ ______ ______ -|_ _| | _ _ ||_ _ `.|_ _ \ - | | .--.|_/ | | \_| | | `. \ | |_) | - | | / .'`\ \ | | | | | | | __'. - _| |_| \__. | _| |_ _| |_.' /_| |__) | -|_____|'.__.' 
|_____| |______.'|_______/ Enterprise version 2.0.9.1 (Build: xxxxxxx) - - ----Last Successful Session------------------ -Time: 2026-03-24T10:25:47.759+08:00 -IP Address: 127.0.0.1 ----Last Failed Session---------------------- -Time: 2026-03-24T10:27:26.314+08:00 -IP Address: 127.0.0.1 -Cumulative Failed Attempts: 1 -Successfully login at 127.0.0.1:6667 -IoTDB> -``` - -### 4.2 查看访问历史 - -root 用户及具有 AUDIT 权限的用户可以通过 SQL 语句查看访问历史记录。 - -语法定义: - -```SQL -select * from __audit.login_history; -``` - -示例: - -```SQL -IoTDB> select * from __audit.login_history -+-----------------------------+-------+-------+--------+---------+------+ -| time|user_id|node_id|username| ip|result| -+-----------------------------+-------+-------+--------+---------+------+ -|2026-03-25T10:55:58.240+08:00| u_0| node_1| root|127.0.0.1| true| -+-----------------------------+-------+-------+--------+---------+------+ -Total line number = 1 -It costs 0.213s -``` - - diff --git a/src/zh/UserGuide/latest-Table/Tools-System/Data-Export-Tool_timecho.md b/src/zh/UserGuide/latest-Table/Tools-System/Data-Export-Tool_timecho.md deleted file mode 100644 index f171f61d9..000000000 --- a/src/zh/UserGuide/latest-Table/Tools-System/Data-Export-Tool_timecho.md +++ /dev/null @@ -1,260 +0,0 @@ -# 数据导出 - -## 1. 功能概述 - -IoTDB 支持两种方式进行数据导出: - -* 数据导出工具 :`export-data.sh/bat` 位于 `tools `目录下,能够将指定 SQL 的查询结果导出为 CSV、SQL 及 TsFile (开源时间序列文件格式)格式。 -* 基于 PIPE 框架的 TsFileBackup:`tsfile-backup.sh/bat`位于 `tools `目录下,能够使用 PIPE 将指定的数据文件导出为 TsFile 格式。 - - - - - - - - - - - - - - - - - - - - - - - - - -
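| File format | IoTDB tool | Description |
| --- | --- | --- |
| CSV | export-data.sh/bat | Plain-text format that stores formatted data; files must be constructed according to the CSV format specified below |
| SQL | export-data.sh/bat | File containing custom SQL statements |
| TsFile | export-data.sh/bat | Open-source time series data file format |
| TsFile | tsfile-backup.sh/bat | Open-source time series data file format; supports the Object data type |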
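
As a rough illustration of how the formats above map onto the command line, the sketch below prints (without running) an `export-data.sh` command for one of the three file types. The helper name `build_export_cmd` and its argument order are hypothetical and not part of IoTDB; only the flags `-ft`, `-sql_dialect`, `-db`, `-table`, and `-t` are taken from this guide.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with IoTDB): prints an export command
# for the table model instead of executing it.
# Usage: build_export_cmd <file_type> <database> <table> <target_dir>
build_export_cmd() {
  file_type="$1"
  database="$2"
  table="$3"
  target_dir="$4"

  # export-data.sh documents three values for -ft: csv, sql, tsfile.
  case "$file_type" in
    csv|sql|tsfile) ;;
    *)
      echo "unsupported file type: $file_type" >&2
      return 1
      ;;
  esac

  echo "tools/export-data.sh -ft $file_type -sql_dialect table -db $database -table $table -t $target_dir"
}

# Print the command for a CSV export of table1 in database1.
build_export_cmd csv database1 table1 /path/export/dir
```

Printing the command rather than running it keeps the sketch safe to try on a machine without IoTDB installed; the printed line can be reviewed and then pasted into a shell.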
- - -## 2. 数据导出工具 - -### 2.1 公共参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -|--------------|-------------------------|--------------------------------------------------------------------------------------| -------------- |-----------------------------------------------------------------------------| -| -ft | --file\_type | 导出文件的类型,可以选择:csv、sql、tsfile | √ | | -| -h | --host | 主机名 | 否 | 127.0.0.1 | -| -p | --port | 端口号 | 否 | 6667 | -| -u | --username | 用户名 | 否 | root | -| -pw | --password | 密码,自 V2.0.9.1 起支持隐藏输入 | 否 | TimechoDB@2021 (V2.0.6 版本之前为 root) | -| -sql_dialect | --sql_dialect | 选择 server 是树模型还是表模型,当前支持 tree 和 table 类型 | 否 | tree | -| -db | --database | ​将要导出的目标数据库,只在`-sql_dialect`为 table 类型下生效。 | `-sql_dialect`为 table 时必填| - | -| -table | --table | 将要导出的目标表,只在`-sql_dialect`为 table 类型下生效。如果指定了`-q`参数则此参数不生效,如果导出类型为 tsfile/sql 则此参数必填。 | ​ 否 | - | -| -start_time | --start_time | 将要导出的数据起始时间,只有`-sql_dialect`为 table 类型时生效。如果填写了`-q`,则此参数不生效。支持的时间类型同`-tf`参数。 |否 | - | -| -end_time | --end_time | 将要导出的数据的终止时间,只有`-sql_dialect`为 table 类型时生效。如果填写了`-q`,则此参数不生效。 | 否 | - | -| -t | --target | 指定输出文件的目标文件夹,如果路径不存在新建文件夹 | √ | | -| -pfn | --prefix\_file\_name | 指定导出文件的名称。例如:abc,生成的文件是abc\_0.tsfile、abc\_1.tsfile | 否 | dump\_0.tsfile | -| -q | --query | 要执行的查询语句。自 V2.0.8 起,SQL 语句中的分号将被自动移除,查询执行保持正常。 | 否 | 无 | -| -timeout | --query\_timeout | 会话查询的超时时间(ms) | 否 | `-1`(V2.0.8 之前)
`Long.MAX_VALUE`(V2.0.8 及之后)
范围:`-1~Long.MAX_VALUE` | -| -help | --help | 显示帮助信息 | 否 | | -| -usessl | --use_ssl | 使用 SSL 协议,自 V2.0.9.1 起支持 | 否 | - | -| -ts | --trust_store | 信任库。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | -| -tpw | --trust_store_password | 信任库密码。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | - -### 2.2 CSV 格式 - -#### 2.2.1 运行命令 - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-sql_dialect] -db -table - [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -# Windows -# V2.0.4.x 版本之前 -> tools\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] - -# V2.0.4.x 版本及之后 -> tools\windows\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-dt ] [-lpf ] [-tf ] - [-tz ] [-q ] [-timeout ] -``` - -#### 2.2.2 私有参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -| ---------- | -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- |--------------------------------------| -| -dt | --datatype | 是否在 CSV 文件的表头输出时间序列的数据类型,可以选择`true`或`false` | 否 | false | -| -lpf | --lines\_per\_file | 每个转储文件的行数 | 否 | 10000
范围:0~Integer.Max=2147483647 | -| -tf | --time\_format | 指定 CSV 文件中的时间格式。可以选择:1) 时间戳(数字、长整型);2) ISO8601(默认);3) 用户自定义模式,如`yyyy-MM-dd HH:mm:ss`(默认为ISO8601)。SQL 文件中的时间戳输出不受时间格式设置影响 | 否| ISO8601 | -| -tz | --timezone | 设置时区,例如`+08:00`或`-01:00` | 否 | 本机系统时间 | - -#### 2.2.3 运行示例: - -```Shell -# 正确示例 -> export-data.sh -ft csv -sql_dialect table -t /path/export/dir -db database1 -q "select * from table1" - -# 异常示例 -> export-data.sh -ft csv -sql_dialect table -t /path/export/dir -q "select * from table1" -Parse error: Missing required option: db -``` - -### 2.3 SQL 格式 - -#### 2.3.1 运行命令 - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-aligned ] - -lpf - [-tf ] [-tz ] [-q ] [-timeout ] - -# Windows -# V2.0.4.x 版本之前 -> tools\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] - -# V2.0.4.x 版本及之后 -> tools\windows\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h -p -u -pw ] - -t [-pfn -aligned - -lpf -tf -tz -q -timeout ] -``` - -#### 2.3.2 私有参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -| ---------- | -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- | -------------------------------------- | -| -aligned | --use\_aligned | 是否导出为对齐的 SQL 格式 | 否 | true | -| -lpf | --lines\_per\_file | 每个转储文件的行数 | 否 | 10000
范围:0~Integer.Max=2147483647 | -| -tf | --time\_format | 指定 CSV 文件中的时间格式。可以选择:1) 时间戳(数字、长整型);2) ISO8601(默认);3) 用户自定义模式,如`yyyy-MM-dd HH:mm:ss`(默认为ISO8601)。SQL 文件中的时间戳输出不受时间格式设置影响 | 否| ISO8601| -| -tz | --timezone | 设置时区,例如`+08:00`或`-01:00` | 否 | 本机系统时间 | - -#### 2.3.3 运行示例: - -```Shell -# 正确示例 -> export-data.sh -ft sql -sql_dialect table -t /path/export/dir -db database1 -start_time 1 - -# 异常示例 -> export-data.sh -ft sql -sql_dialect table -t /path/export/dir -start_time 1 -Parse error: Missing required option: db -``` - -### 2.4 TsFile 格式 - -#### 2.4.1 运行命令 - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# Windows -# V2.0.4.x 版本之前 -> tools\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# V2.0.4.x 版本及之后 -> tools\windows\export-data.bat -ft [-sql_dialect] -db -table
- [-start_time] [-end_time] [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] -``` - -#### 2.4.2 私有参数 - -* 无 - -#### 2.4.3 运行示例: - -```Shell -# 正确示例 -> /tools/export-data.sh -ft tsfile -sql_dialect table -t /path/export/dir -db database1 -start_time 0 - -# 异常示例 -> /tools/export-data.sh -ft tsfile -sql_dialect table -t /path/export/dir -start_time 0 -Parse error: Missing required option: db -``` - -## 3. 基于 PIPE 框架的 TsFileBackup - -IoTDB 自 **V2.0.9.2** 版本起支持 `tsfile-backup.sh/bat` 脚本,该脚本能够自动生成并向服务端发送 `CREATE PIPE` SQL 指令,将指定的数据文件导出为 TsFile 格式。 - -**注意:** - -1. **使用该脚本需联系天谋团队获取相关的 jar 包(`tsfile-remote-sink--jar-with-dependencies.jar`),并放至 IoTDB 可访问的路径(例如所有数据节点主机)。** -2. **该脚本支持 Object 类型数据导出为 TsFile 文件。** - -### 3.1 运行命令 - -```Shell -# Unix/OS X -> tools/tsfile-backup.sh [-sql_dialect ] [-h ] [-p ] - [-u ] [-pw ] [-path ] [-db ] [-table -
] [-s ] [-e ] [-t ] - [-th ] [-tu ] [-tp ] - [--rate_limit] [--plugin_jar] [-help] -# Windows -> tools\windows>tsfile-backup.bat [-sql_dialect ] [-h ] [-p ] - [-u ] [-pw ] [-path ] [-db ] [-table -
] [-s ] [-e ] [-t ] - [-th ] [-tu ] [-tp ] - [--rate_limit] [--plugin_jar] [-help] -``` - -### 3.2 脚本参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -|------------------------|------------------------|----------------------------------------------------------------| -------------- |-------------| -| `-sql_dialect` | `--sql_dialect` | 指定数据模型类型,可选值:`tree`(树模型) 或`table`(表模型)。 | 是 | - | -| `-h` | `--host` | 本地主机地址。指当前数据所在的 IoTDB 实例 IP。 | 否 | `127.0.0.1` | -| `-p` | `--port` | 端口号,IoTDB RPC 服务端口。 | 否 | `6667` | -| `-u` | `--user` | 用户名,用于登录 IoTDB 验证。 | 否 | `root` | -| `-pw` | `--password` | 密码,对应用户的IoTDB密码,支持隐藏输入。 | 否 | `root` | -| `-t` | `--target` | 导出目标目录。在 SCP 模式下,此路径指远程服务器上的绝对物理路径。TsFile 和关联的 Object 目录将导出至此。 | 是 | - | -| `-db` | `--database` | 数据库名称 (表模型可选) | 否 | `.*` | -| `-table` | `--table` | 表名 (表模型可选) | 否 | `.*` | -| `-s` | `--start_time` | 起始时间。支持 ISO8601 格式(如 2026-01-01T00:00:00)或毫秒时间戳。仅导出该时间点及之后的数据。 | 否 | - | -| `-e` | `--end_time` | 截止时间。格式同上。仅导出该时间点之前的数据。 | 否 | - | -| `-th` | `--target_host` | 远程目标主机 IP,默认自动识别启动脚本的IP。指定此参数后,脚本将自动配置 Pipe 使用 SCP 模式进行数据传输。 | 否 | - | -| `-tu` | `--target_host_user` | 远程主机用户名。用于 SSH/SCP 登录目标服务器。 | 否 | - | -| `-tpw` | `--target_host_pw` | 远程主机密码。用于远程身份验证,支持隐藏输入。 | 否 | - | -| `-tp` | `--target_host_port` | 远程 SSH 端口。 | 否 | `22` | -| `--rate_limit` | `--rate_limit` | 发送速率限制。单位:字节/秒 (Bytes/s)。防止导出任务占用过多网络带宽。 | 否 | - | -| `--plugin_jar` | `--plugin_jar` | 指定 Pipe 插件的Jar包路径 | 否 | - | -| `--object-parallelism` | `--object-parallelism` | 指定object文件发送最大并行度 | 否 | - | -| `--object-batch-size` | `--object-batch-size` | 限制每个对象文件上传批次的总字节数,用于控制内存占用和单次 SCP 传输大小 | 否 | - | -| `-help` | `--help` | 查看帮助 | 否 | - | - -### 3.3 运行示例 - -示例一:SCP 远程导出(将数据发送到另一台服务器) - -```Bash -./tsfile-backup.sh -sql_dialect table -db test_db -t /remote/archive/ -th 192.168.1.100 -tu backup_user -tpw ComplexPass123! 
-``` - -示例二:带限速的远程 Object 数据导出 - -```Bash -./tsfile-backup.sh -sql_dialect table -t /mnt/backup/ -th 10.0.0.5 -tu iot_admin -tpw Admin@2026 --rate_limit 5242880 -``` - -示例三:指定 Pipe jar 目录 - -```Bash -./tsfile-backup.sh -sql_dialect table -db test -table .* -tu luoluoyuyu -tpw -t /tmp/backup --plugin_jar /local/lib/tsfile-remote-sink-2.0.8-SNAPSHOT-jar-with-dependencies.jar -``` - -注意:SCP 模式导出 Object 类型数据时,为避免出现握手异常、连接失败或 Pipe 频繁启停问题,建议采取以下任一措施: -* 适当调低配置参数 object-parallelism -* 按需调大目标机的 MaxStartups,修改后执行 sshd reload 或 sshd restart 使配置生效 \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/Tools-System/Data-Import-Tool_timecho.md b/src/zh/UserGuide/latest-Table/Tools-System/Data-Import-Tool_timecho.md deleted file mode 100644 index 71105b603..000000000 --- a/src/zh/UserGuide/latest-Table/Tools-System/Data-Import-Tool_timecho.md +++ /dev/null @@ -1,376 +0,0 @@ -# 数据导入 - -## 1. 功能概述 - -IoTDB 支持三种方式进行数据导入: -- 数据导入工具 :`import-data.sh/bat` 位于 `tools` 目录下,可以将 `CSV`、`SQL`、及`TsFile`(开源时序文件格式)的数据导入 `IoTDB`。 -- `TsFile` 自动加载功能。 -- `Load SQL` 导入 `TsFile` 。 - -
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
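| File format | IoTDB tool | Description |
| --- | --- | --- |
| CSV | import-data.sh/bat | Batch-imports a single CSV file or a directory of CSV files into IoTDB |
| SQL | import-data.sh/bat | Batch-imports a single SQL file or a directory of SQL files into IoTDB |
| TsFile | import-data.sh/bat | Batch-imports a single TsFile or a directory of TsFile files into IoTDB |
| TsFile | TsFile auto-loading | Watches the specified path for newly created TsFile files and loads them into IoTDB |
| TsFile | Load SQL | Batch-imports a single TsFile or a directory of TsFile files into IoTDB |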
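
For the `import-data.sh` rows of the table, the required options differ by file type according to the parameter tables in section 2 of this chapter: CSV import needs `-table`, and TsFile import needs `-os` and `-of`. The sketch below prints a matching command for each type. The helper name `build_import_cmd` is hypothetical and not part of IoTDB, and choosing `none` for `-os` and `-of` is an assumption made only to keep the example short.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with IoTDB): prints an import command
# for the table model instead of executing it.
# Usage: build_import_cmd <file_type> <source_path> <database> [table]
build_import_cmd() {
  file_type="$1"
  source_path="$2"
  database="$3"
  table="${4:-}"

  base="tools/import-data.sh -ft $file_type -sql_dialect table -s $source_path -db $database"

  case "$file_type" in
    csv)
      # CSV import into the table model also needs the target table.
      echo "$base -table $table"
      ;;
    sql)
      echo "$base"
      ;;
    tsfile)
      # TsFile import requires -os (on success) and -of (on fail).
      echo "$base -os none -of none"
      ;;
    *)
      echo "unsupported file type: $file_type" >&2
      return 1
      ;;
  esac
}

# Print the command for importing one CSV file into database1.table1.
build_import_cmd csv ./csv/dump0_0.csv database1 table1
```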
- -- **表模型 TsFile 导入暂时只支持本地导入。** -- 自 V2.0.9.2 版本起,import-data.sh/bat 脚本导入 tsfile 文件时支持 Object 数据类型。 - -## 2. 数据导入工具 - -### 2.1 公共参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -|--------------|-------------------------|-------------------------------------------------------------------------------------------------------------------------------| -------------- |-------------------------------------| -| -ft | --file\_type | 导入文件的类型,可以选择:csv、sql、tsfile | √ | | -| -h | --host | 主机名 | 否 | 127.0.0.1 | -| -p | --port | 端口号 | 否 | 6667 | -| -u | --username | 用户名 | 否 | root | -| -pw | --password | 密码,自 V2.0.9.1 起支持隐藏输入 | 否 | TimechoDB@2021 (V2.0.6 版本之前为 root) | -| -s | --source | 待加载的脚本文件(夹)的本地目录路径
如果为 csv sql tsfile 这三个支持的格式,直接导入
不支持的格式,报错提示`The file name must end with "csv" or "sql"or "tsfile"!` | √ | | -| -sql_dialect | --sql_dialect | 选择 server 是树模型还是表模型,当前支持 tree 和 table 类型 | 否 | tree | -| -db | --database | 数据将要导入的目标库,只在 `-sql_dialect` 为 table 类型下生效。 |-sql_dialect 为 table 时必填;
V2.0.9.2 版本起,当文件格式为 SQL 时,该参数为可选参数,若参数或 SQL 中均未显式指定目标数据库时会进行提示。 | - | -| -table | --table | 数据将要导入的目标表,只在 `-sql_dialect` 为 table 类型且文件类型为 csv 条件下生效且必填。 | 否 | - | -| -tn | --thread\_num | 最大并行线程数 | 否 | 8
范围:0~Integer.Max=2147483647 | -| -tz | --timezone | 时区设置,例如`+08:00`或`-01:00` | 否 | 本机系统时间 | -| -help | --help | 显示帮助信息,支持分开展示和全部展示`-help`或`-help csv` | 否 | | -| -usessl | --use_ssl | 使用 SSL 协议,自 V2.0.9.1 起支持 | 否 | - | -| -ts | --trust_store | 信任库。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | -| -tpw | --trust_store_password | 信任库密码。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | - - -### 2.2 CSV 格式 - -#### 2.2.1 运行命令 - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-sql_dialect] -db -table - [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] - -# Windows -# V2.0.4.x 版本之前 -> tools\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] - -# V2.0.4.x 版本及之后 -> tools\windows\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-aligned ] - [-ti ] [-tp ] [-tz ] [-batch ] - [-tn ] -``` - -#### 2.2.2 私有参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -| ---------- | ---------------------------- | ----------------------------------------------------------------------------------- |-------------------------------------------|---------------------------------------| -| -fd | --fail\_dir | 指定保存失败文件的目录 | 否 | YOUR\_CSV\_FILE\_PATH | -| -lpf | --lines\_per\_failed\_file | 指定失败文件最大写入数据的行数 | 否 | 100000
范围:0~Integer.Max=2147483647 | -| -aligned | --use\_aligned | 是否导入为对齐序列 | 否 | false | -| -batch | --batch\_size | 指定每调用一次接口处理的数据行数(最小值为1,最大值为Integer.​*MAX\_VALUE*​) | 否 | 100000
范围:0~Integer.Max=2147483647 | -| -ti | --type\_infer | 通过选项定义类型信息,例如`"boolean=text,int=long, ..."` | 否 | 无 | -| -tp | --timestamp\_precision | 时间戳精度 | 否:
1. ms(毫秒)
2. us(微秒)
3. ns(纳秒) | ms | - -#### 2.2.3 运行示例 - -```Shell -# 正确示例 -> tools/import-data.sh -ft csv -sql_dialect table -s ./csv/dump0_0.csv -db database1 -table table1 - -# 异常示例 -> tools/import-data.sh -ft csv -sql_dialect table -s ./csv/dump0_1.csv -table table1 -Parse error: Missing required option: db - -> tools/import-data.sh -ft csv -sql_dialect table -s ./csv/dump0_1.csv -db database1 -table table5 -There are no tables or the target table table5 does not exist -``` - -#### 2.2.4 导入说明 - -1. CSV 导入规范 - - - 特殊字符转义规则:若Text类型的字段中包含特殊字符(例如逗号`,`),需使用反斜杠(`\`)​进行转义处理。 - - 支持的时间格式:`yyyy-MM-dd'T'HH:mm:ss`, `yyy-MM-dd HH:mm:ss`, 或者 `yyyy-MM-dd'T'HH:mm:ss.SSSZ` 。 - - 时间戳列​必须作为数据文件的首列存在。 - -2. CSV 文件示例 - -```sql -time,region,device,model,temperature,humidity -1970-01-01T08:00:00.001+08:00,"上海","101","F",90.0,35.2 -1970-01-01T08:00:00.002+08:00,"上海","101","F",90.0,34.8 -``` - - -### 2.3 SQL 格式 - -#### 2.3.1 运行命令 - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] - -# Windows -# V2.0.4.x 版本之前 -> tools\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] - -# V2.0.4.x 版本及之后 -> tools\windows\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] [-tz ] - [-batch ] [-tn ] -``` - -#### 2.3.2 私有参数 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -| ---------- | ---------------------------- | ----------------------------------------------------------------------------------- | -------------- |---------------------------------------| -| -fd | --fail\_dir | 指定保存失败文件的目录 | 否 | YOUR\_CSV\_FILE\_PATH | -| -lpf | --lines\_per\_failed\_file | 指定失败文件最大写入数据的行数 | 否 | 100000
范围:0~Integer.Max=2147483647 | -| -batch | --batch\_size | 指定每调用一次接口处理的数据行数(最小值为1,最大值为Integer.​*MAX\_VALUE*​) | 否 | 100000
范围:0~Integer.Max=2147483647 | - -#### 2.3.3 运行示例 - -```Shell -# 正确示例 -> tools/import-data.sh -ft sql -sql_dialect table -s ./sql/dump0_0.sql -db database1 - -# 异常示例 -> tools/import-data.sh -ft sql -sql_dialect table -s ./sql/dump1_1.sql -db database1 -Source file or directory ./sql/dump1_1.sql does not exist - -# 目标表存在但是元数据不适配/数据异常:生成.failed异常文件记录该条信息,日志打印错误信息如下 -Fail to insert measurements '[column.name]' caused by [data type is not consistent, input '[column.value]', registered '[column.DataType]'] -``` - -### 2.4 TsFile 格式 - -#### 2.4.1 运行命令 - -```Shell -# Unix/OS X -> tools/import-data.sh -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-o ] -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] - -# Windows -# V2.0.4.x 版本之前 -> tools\import-data.bat -ft [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s -os [-sd ] -of [-fd ] - [-tn ] [-tz ] [-tp ] - -# V2.0.4.x 版本及之后 -> tools\windows\import-data.bat -ft [-sql_dialect] -db -table
               [-h ] [-p ] [-u ] [-pw ]
               -s [-o ] -os [-sd ] -of [-fd ]
               [-tn ] [-tz ] [-tp ]
```

#### 2.4.2 Private Parameters

| Short option | Full option | Description | Required | Default |
| --- | --- | --- | --- | --- |
| -os | --on\_success | 1. none: do not delete<br>2. mv: move successfully imported files to the target folder<br>3. cp: hard-link (copy) successfully imported files to the target folder<br>4. delete: delete | √ | |
| -sd | --success\_dir | Target folder for mv or cp when `--on_success` is mv or cp. The file name becomes the flattened folder path concatenated with the original file name | Required when `--on_success` is mv or cp | `${EXEC_DIR}/success` |
| -of | --on\_fail | 1. none: skip<br>2. mv: move failed files to the target folder<br>3. cp: hard-link (copy) failed files to the target folder<br>4. delete: delete | √ | |
| -fd | --fail\_dir | Target folder for mv or cp when `--on_fail` is mv or cp. The file name becomes the flattened folder path concatenated with the original file name | Required when `--on_fail` is mv or cp | `${EXEC_DIR}/fail` |
| -tp | --timestamp\_precision | Timestamp precision<br>Non-remote TsFile import: -tp specifies the time precision of the TsFile; consistency with the server's timestamp precision is checked manually, and an error is returned on mismatch<br>Remote import: -tp specifies the time precision of the TsFile; the pipe automatically checks whether the timestamp precision matches and returns a pipe error on mismatch | No. Options:<br>1. ms (milliseconds)<br>2. us (microseconds)<br>3. ns (nanoseconds) | ms |
| -o | --object-file-paths | Storage path of Object files.<br>Default mode: if this parameter is not specified, the script automatically detects and imports Object files located in the same-named subdirectory under `/`.<br>Absolute path mode: explicitly specifies the external storage root directory of the Object files; the tool builds the data association index based on this path.<br>Note: this parameter is supported since V2.0.9.2 | No | |

#### 2.4.3 Examples

```Shell
# Correct example
> tools/import-data.sh -ft tsfile -sql_dialect table -s ./tsfile -db database1 -os none -of none

# Error example
> tools/import-data.sh -ft tsfile -sql_dialect table -s ./tsfile -db database1
Parse error: Missing required options: os, of
```

**Object type import**

1. Import layout

* Default

```Bash
target_dir
  ├── tsfile.tsfile
  └── tsfile/  (matches the TsFile name)
      ├── regionID/tableName/tag1/tag2/field/timestamp1.bin
      ├── regionID/tableName/tag1/tag2/field/timestamp2.bin
      └── regionID/tableName1/tag3/tag4/field/timestamp1.bin
```

* With a specified Object directory

```Bash
target_dir
  ├── tsfile.tsfile
object_dir
  ├── regionID/tableName/tag1/tag2/field/timestamp1.bin
  ├── regionID/tableName/tag1/tag2/field/timestamp2.bin
  └── regionID/tableName1/tag3/tag4/field/timestamp1.bin
```

2. Command-line examples

* Basic import (Object files in the directory named after the TsFile are detected automatically)

```Bash
./import-data.sh -sql_dialect table -ft tsfile -s /data/import/sensor_v1.tsfile -db database1 -os none -of none
```

* Batch import of a directory (specifying the number of concurrent threads and the action taken on success)

```Bash
./import-data.sh -sql_dialect table -ft tsfile -s /data/raw_data/ -tn 16 -os mv -sd /data/archive/
```

* Table-model import with associated Object files (specifying an external Object storage path and the target database)

```Bash
./import-data.sh -sql_dialect table -ft tsfile -s /data/import/ -db factory_db -o /mnt/object_storage/ -of mv -fd /data/error_log/
```


## 3. TsFile Auto-Loading

This feature lets IoTDB actively watch specified directories for new TsFile files and load them into IoTDB automatically. With it, IoTDB detects and loads TsFile files without any additional manual load operation.

![](/img/Data-import1.png)

### 3.1 Configuration Parameters

To enable TsFile auto-loading, find the following parameters in the configuration template `iotdb-system.properties.template` and add them to the IoTDB configuration file `iotdb-system.properties`. The complete configuration is as follows:

| **Parameter** | **Description** | **Value range** | **Required** | **Default** | **How it takes effect** |
| --- | --- | --- | --- | --- | --- |
| load\_active\_listening\_enable | Whether the DataNode actively watches for and loads tsfile files (enabled by default). | Boolean: true, false | Optional | true | Hot reload |
| load\_active\_listening\_dirs | Directories to watch (subdirectories are included automatically); separate multiple directories with ",";<br>the default directory is `ext/load/pending`;<br>hot loading is supported;<br>**Note: in the table model, the name of the directory that contains a file is used as the database**; | String: one or more directories | Optional | `ext/load/pending` | Hot reload |
| load\_active\_listening\_fail\_dir | Directory to which a tsfile is moved after loading fails; only one directory can be configured | String: one directory | Optional | `ext/load/failed` | Hot reload |
| load\_active\_listening\_max\_thread\_num | Maximum number of threads that run tsfile loading tasks concurrently. When the parameter is commented out, the default is max(1, CPU cores / 2). If the configured value is outside the range [1, CPU cores / 2], it is reset to the default max(1, CPU cores / 2) | Long: [1, Long.MAX\_VALUE] | Optional | max(1, CPU cores / 2) | Takes effect after restart |
| load\_active\_listening\_check\_interval\_seconds | Polling interval of active listening, in seconds. Active listening for tsfile files is implemented by polling the directories. This setting specifies the interval between two checks of `load_active_listening_dirs`: the next check runs `load_active_listening_check_interval_seconds` seconds after the previous check completes. If the configured interval is less than 1, it is reset to the default of 5 seconds | Long: [1, Long.MAX\_VALUE] | Optional | 5 | Takes effect after restart |

### 3.2 Example

```bash
load_active_listening_dir/
├─sensors/
│ ├─temperature/
│ │ └─temperature-table.TSFILE

```

- Table-model TsFile
  - `temperature-table.TSFILE`: imported into the `temperature` database (because it is located under the `sensors/temperature/` directory)

### 3.3 Notes

1. If the files to be loaded include mods files, move the mods files into the watched directory first and the tsfile files afterwards, and keep each mods file in the same directory as its tsfile. Otherwise a tsfile may be loaded without its mods file.
2. Do not use the Pipe receiver directory, the data directory that stores data, or similar directories as watched directories.
3. `load_active_listening_fail_dir` and `load_active_listening_dirs` must not contain the same directory and must not be nested in each other.
4. Make sure the `load_active_listening_dirs` directories have sufficient permissions. Files are deleted after they are loaded successfully; without delete permission, they are loaded repeatedly.

## 4. Load SQL

IoTDB supports executing SQL through the CLI to import one or more TsFile files that contain time series directly into another running IoTDB instance.

### 4.1 Command

```SQL
load '' with (
    'attribute-key1'='attribute-value1',
    'attribute-key2'='attribute-value2'
)
```

* `` : the file itself, or the path of a folder that contains several files
* ``: optional parameters, listed in the table below

| Key | Key description | Value type | Value range | Required | Default |
| --- | --- | --- | --- | --- | --- |
| `database-level` | When the database corresponding to the tsfile does not exist, the `database-level` value specifies the level of the database; the default is the level set in `iotdb-common.properties`.<br>For example, a level of 1 means that the level-1 prefix path of every time series in the tsfile is the database. | Integer | `[1: Integer.MAX_VALUE]` | No | 1 |
| `on-success` | How a successfully loaded tsfile is handled. The default is `delete`, meaning the tsfile is deleted after it is loaded; `none` means the tsfile stays in the source folder after it is loaded. | String | `delete / none` | No | delete |
| `model` | Specifies whether the tsfile being written is table model or tree model. This parameter has no effect after V2.0.2.1 (the system detects the model automatically) | String | `tree / table` | No | Same as `-sql_dialect` |
| `database-name` | **Table model only**: the target database of the import; it is created automatically if it does not exist. `database-name` must not include the "`root.`" prefix; if it does, an error is reported. | String | `-` | No | null |
| `convert-on-type-mismatch` | Whether to convert data when the data types do not match while loading a tsfile | Boolean | `true / false` | No | true |
| `verify` | Whether to verify the schema before loading a tsfile | Boolean | `true / false` | No | true |
| `tablet-conversion-threshold` | Size threshold below which a tsfile is converted to tablet form; small tsfile files are written by converting them to tablets. The default is -1, meaning no tsfile of any size is converted | Integer | `[-1, 0 : Integer.MAX_VALUE]` | No | -1 |
| `async` | Whether to load tsfile files asynchronously. The files are moved to the active load directory, and all tsfile files are loaded into `database-name`. | Boolean | `true / false` | No | false |

### 4.2 Example

```SQL
-- Prepare the target database database2
IoTDB> create database database2
Msg: The statement is executed successfully.

IoTDB> use database2
Msg: The statement is executed successfully.

IoTDB:database2> show tables details
+---------+-------+------+-------+
|TableName|TTL(ms)|Status|Comment|
+---------+-------+------+-------+
+---------+-------+------+-------+
Empty set.

-- Import the tsfile by executing load sql
IoTDB:database2> load '/home/dump0.tsfile' with ( 'on-success'='none', 'database-name'='database2')
Msg: The statement is executed successfully.
- --- 验证数据导入成功 -IoTDB:database2> select * from table2 -+-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ -| time|region|plant_id|device_id|temperature|humidity|status| arrival_time| -+-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ -|2024-11-30T00:00:00.000+08:00| 上海| 3002| 101| 90.0| 35.2| true| null| -|2024-11-29T00:00:00.000+08:00| 上海| 3001| 101| 85.0| 35.1| null|2024-11-29T10:00:13.000+08:00| -|2024-11-27T00:00:00.000+08:00| 北京| 1001| 101| 85.0| 35.1| true|2024-11-27T16:37:01.000+08:00| -|2024-11-29T11:00:00.000+08:00| 上海| 3002| 100| null| 45.1| true| null| -|2024-11-28T08:00:00.000+08:00| 上海| 3001| 100| 85.0| 35.2| false|2024-11-28T08:00:09.000+08:00| -|2024-11-26T13:37:00.000+08:00| 北京| 1001| 100| 90.0| 35.1| true|2024-11-26T13:37:34.000+08:00| -+-----------------------------+------+--------+---------+-----------+--------+------+-----------------------------+ -``` diff --git a/src/zh/UserGuide/latest-Table/Tools-System/Maintenance-Tool_timecho.md b/src/zh/UserGuide/latest-Table/Tools-System/Maintenance-Tool_timecho.md deleted file mode 100644 index 14859a654..000000000 --- a/src/zh/UserGuide/latest-Table/Tools-System/Maintenance-Tool_timecho.md +++ /dev/null @@ -1,1013 +0,0 @@ - - -# 集群管理工具 - -## 1. 
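Cluster Management Tool
-
-The IoTDB cluster management tool is an easy-to-use operations tool (enterprise edition). It addresses the difficulty of operating the many nodes of a distributed IoTDB system and covers cluster deployment, start and stop, elastic scaling, configuration updates, and data export. Commands for a complex database cluster can be issued in one step, which greatly reduces management effort. This document describes how to use the tool to remotely deploy, configure, start, and stop IoTDB cluster instances.
-
-### 1.1 Environment Preparation
-
-This tool ships with TimechoDB (the enterprise database based on IoTDB). Contact your sales representative to obtain the download.
-
-The machines on which IoTDB will be deployed need JDK 8 or later, lsof, netstat, and unzip. If any of these is missing, install it yourself; the required installation commands are listed in the last section of this document.
-
-Tip: the IoTDB cluster management tool must be run with an account that has root privileges.
-
-### 1.2 Deployment Method
-
-#### Download and Install
-
-This tool ships with TimechoDB (the enterprise database based on IoTDB). Contact your sales representative to obtain the download.
-
-Note: the binary package only supports GLIBC 2.17 or later, so the minimum supported system is CentOS 7.
-
-* Run the following command in the iotdbctl directory:
-
-```bash
-bash install-iotdbctl.sh
-```
-
-The `iotdbctl` keyword is then available in subsequent shells. For example, the pre-deployment environment check is:
-
-```bash
-iotdbctl cluster check example
-```
-
-* Alternatively, skip the activation and run `<iotdbctl absolute path>/sbin/iotdbctl` directly. For example, the pre-deployment environment check is:
-
-```bash
-<iotdbctl absolute path>/sbin/iotdbctl cluster check example
-```
-
-### 1.3 System Structure
-
-The IoTDB cluster management tool consists of the config, logs, doc, and sbin directories.
-
-* `config` holds the configuration files of the clusters to deploy. To use the tool, edit the yaml files in this directory.
-* `logs` holds the tool's logs. To see the execution log, check `logs/iotd_yyyy_mm_dd.log`.
-* `sbin` holds the binaries the tool needs.
-* `doc` holds the user manual, the developer manual, and the recommended deployment manual.
-
-
-### 1.4 Cluster Configuration File
-
-* The `iotdbctl/config` directory contains the cluster yaml files. The yaml file name is the cluster name, and there can be several yaml files. A `default_cluster.yaml` example is provided in `iotdbctl/config` to make configuration easier.
-* A yaml file consists of five parts: `global`, `confignode_servers`, `datanode_servers`, `grafana_server`, and `prometheus_server`.
-* `global` holds general settings such as the machine user name and password, the local IoTDB installation files, and the JDK settings. The `iotdbctl/config` directory provides a `default_cluster.yaml` sample;
-  copy it, rename it to your cluster name, and configure the IoTDB cluster following the comments inside. In the sample, uncommented items are required and commented items are optional.
-
-For example, to run the check against `default_cluster.yaml`, run `iotdbctl cluster check default_cluster`.
-See the command list below for more commands.
-
-
-
-| Parameter | Description | Required |
-|---|---|---|
-| iotdb_zip_dir | IoTDB distribution directory. If empty, IoTDB is downloaded from the address in `iotdb_download_url` | Optional |
-| iotdb_download_url | IoTDB download address. Used when `iotdb_zip_dir` has no value | Optional |
-| jdk_tar_dir | Local JDK directory. This JDK can be uploaded and deployed to the target nodes | Optional |
-| 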
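jdk_deploy_dir | Directory on the remote machine where the JDK is deployed. Combined with `jdk_dir_name` below, it forms the full JDK deployment path, i.e. `<jdk_deploy_dir>/<jdk_dir_name>` | Optional |
-| jdk_dir_name | Name of the directory the JDK is extracted to; defaults to jdk_iotdb | Optional |
-| iotdb_lib_dir | IoTDB lib directory, or a compressed lib package (only .zip is supported). Used only for IoTDB upgrades and commented out by default; to upgrade, uncomment it and set the path. To build the zip file, compress the iotdb/lib directory with the zip command, for example `zip -r lib.zip apache-iotdb-1.2.0/lib/*` | Optional |
-| user | User name for SSH login to the deployment machines | Required |
-| password | SSH login password. If no password is specified, pkey is used; make sure passwordless SSH login between the nodes is configured | Optional |
-| pkey | Key-based login. If password has a value, password takes precedence; otherwise pkey is used | Optional |
-| ssh_port | SSH login port | Required |
-| iotdb_admin_user | IoTDB service user name; defaults to root | Optional |
-| iotdb_admin_password | IoTDB service password; defaults to root | Optional |
-| deploy_dir | IoTDB deployment directory. Combined with `iotdb_dir_name` below, it forms the full IoTDB deployment path, i.e. `<deploy_dir>/<iotdb_dir_name>` | Required |
-| iotdb_dir_name | Name of the directory IoTDB is extracted to; defaults to iotdb | Optional |
-| datanode-env.sh | Corresponds to `iotdb/config/datanode-env.sh`. If set in both `global` and `datanode_servers`, the value in `datanode_servers` takes precedence | Optional |
-| confignode-env.sh | Corresponds to `iotdb/config/confignode-env.sh`. If set in both `global` and `confignode_servers`, the value in `confignode_servers` takes precedence | Optional |
-| iotdb-common.properties | Corresponds to `iotdb/config/iotdb-common.properties` | Optional |
-| cn_seed_config_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode_x. If set in both `global` and `confignode_servers`, the value in `confignode_servers` takes precedence. Corresponds to `cn_seed_config_node` in `iotdb/config/iotdb-system.properties` | Required |
-| dn_seed_config_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode_x. If set in both `global` and `datanode_servers`, the value in `datanode_servers` takes precedence. Corresponds to `dn_seed_config_node` in `iotdb/config/iotdb-system.properties` | Required |
-
-datanode-env.sh and confignode-env.sh accept an extra parameter, extra_opts. When it is set, its value is appended to the end of datanode-env.sh and confignode-env.sh. See default_cluster.yaml; a configuration example follows:
-datanode-env.sh:
-  extra_opts: |
-    IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC"
-    IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200"
-
-
-* confignode_servers holds the configuration for deploying IoTDB ConfigNodes; multiple ConfigNodes can be configured.
-  By default, the first ConfigNode started (node1) is used as the Seed-ConfigNode.
-
-| Parameter | Description | Required |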
-|---|---|---|
-| name | ConfigNode name | Required |
-| deploy_dir | IoTDB ConfigNode deployment directory | Required |
-| cn_internal_address | Internal communication address; corresponds to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | Required |
-| cn_seed_config_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode_x. If set in both `global` and `confignode_servers`, the value in `confignode_servers` takes precedence. Corresponds to `cn_seed_config_node` in `iotdb/config/iotdb-confignode.properties` | Required |
-| cn_internal_port | Internal communication port; corresponds to `cn_internal_port` in `iotdb/config/iotdb-system.properties` | Required |
-| cn_consensus_port | Corresponds to `cn_consensus_port` in `iotdb/config/iotdb-system.properties` | Optional |
-| cn_data_dir | Corresponds to `cn_data_dir` in `iotdb/config/iotdb-system.properties` | Required |
-| iotdb-system.properties | Corresponds to `iotdb/config/iotdb-system.properties`. If set in both `global` and `confignode_servers`, the value in `confignode_servers` takes precedence | Optional |
-
-* datanode_servers holds the configuration for deploying IoTDB DataNodes; multiple DataNodes can be configured.
-
-| Parameter | Description | Required |
-|---|---|---|
-| name | DataNode name | Required |
-| deploy_dir | IoTDB DataNode deployment directory. Note: it must not be the same as the IoTDB ConfigNode deployment directory | Required |
-| dn_rpc_address | DataNode RPC address; corresponds to `dn_rpc_address` in `iotdb/config/iotdb-system.properties` | Required |
-| dn_internal_address | Internal communication address; corresponds to `dn_internal_address` in `iotdb/config/iotdb-system.properties` | Required |
-| dn_seed_config_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode_x. If set in both `global` and `datanode_servers`, the value in `datanode_servers` takes precedence. Corresponds to `dn_seed_config_node` in `iotdb/config/iotdb-datanode.properties`. Using the SeedConfigNode is recommended | Required |
-| dn_rpc_port | DataNode RPC port; corresponds to `dn_rpc_port` in `iotdb/config/iotdb-system.properties` | Required |
-| dn_internal_port | Internal communication port; corresponds to `dn_internal_port` in `iotdb/config/iotdb-system.properties` | Required |
-| iotdb-system.properties | 
Corresponds to `iotdb/config/iotdb-system.properties`. If set in both `global` and `datanode_servers`, the value in `datanode_servers` takes precedence | Optional |
-
-* grafana_server holds the configuration for deploying Grafana.
-
-| Parameter | Description | Required |
-|---|---|---|
-| grafana_dir_name | Name of the directory Grafana is extracted to | Optional; defaults to grafana_iotdb |
-| host | IP of the server Grafana is deployed on | Required |
-| grafana_port | Port Grafana listens on | Optional; defaults to 3000 |
-| deploy_dir | Grafana deployment directory on the server | Required |
-| grafana_tar_dir | Location of the Grafana archive | Required |
-| dashboards | Location of the dashboards | Optional; separate multiple entries with commas |
-
-* prometheus_server holds the configuration for deploying Prometheus.
-
-| Parameter | Description | Required |
-|---|---|---|
-| prometheus_dir_name | Name of the directory Prometheus is extracted to | Optional; defaults to prometheus_iotdb |
-| host | IP of the server Prometheus is deployed on | Required |
-| prometheus_port | Port Prometheus listens on | Optional; defaults to 9090 |
-| deploy_dir | Prometheus deployment directory on the server | Required |
-| prometheus_tar_dir | Location of the Prometheus archive | Required |
-| storage_tsdb_retention_time | Number of days data is retained; defaults to 15 days | Optional |
-| storage_tsdb_retention_size | 
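Maximum amount of data the blocks may hold; defaults to 512M. Mind the unit: KB, MB, GB, TB, PB, EB | Optional |
-
-If metrics are configured under `iotdb-system.properties` in config/xxx.yaml, the settings are written into the Prometheus configuration automatically; no manual change is needed.
-
-Note: if the value of a yaml key contains special characters such as `:`, wrap the whole value in double quotes. Do not use file paths that contain spaces, to avoid parsing problems.
-
-### 1.5 Usage Scenarios
-
-#### Cleaning Data
-
-* Cleaning cluster data deletes the data directory of the IoTDB cluster as well as the `cn_system_dir`, `cn_consensus_dir`,
-  `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, and `ext` directories configured in the yaml file.
-* Stop the cluster first, then run the clean command.
-```bash
-iotdbctl cluster stop default_cluster
-iotdbctl cluster clean default_cluster
-```
-
-#### Destroying a Cluster
-
-* Destroying a cluster deletes `data`, `cn_system_dir`, `cn_consensus_dir`,
-  `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, `ext`, the `IoTDB` deployment directory,
-  the Grafana deployment directory, and the Prometheus deployment directory.
-* Stop the cluster first, then run the destroy command.
-
-
-```bash
-iotdbctl cluster stop default_cluster
-iotdbctl cluster destroy default_cluster
-```
-
-#### Upgrading a Cluster
-
-* First set `iotdb_lib_dir` in config/xxx.yaml to the directory containing the jars to upload to the servers (for example iotdb/lib).
-* To upload a zip file instead, compress the iotdb/lib directory with the zip command, for example `zip -r lib.zip apache-iotdb-1.2.0/lib/*`.
-* Run the upload command, then restart the IoTDB cluster to complete the upgrade.
-
-```bash
-iotdbctl cluster dist-lib default_cluster
-iotdbctl cluster restart default_cluster
-```
-
-#### Hot-Deploying Cluster Configuration Files
-
-* First edit the configuration in config/xxx.yaml.
-* Run the distribution command, then the hot-deployment command.
-
-```bash
-iotdbctl cluster dist-conf default_cluster
-iotdbctl cluster reload default_cluster
-```
-
-#### Scaling Out a Cluster
-
-* First add a DataNode or ConfigNode to config/xxx.yaml.
-* Run the scale-out command.
-```bash
-iotdbctl cluster scaleout default_cluster
-```
-
-#### Scaling In a Cluster
-
-* First find the name or ip+port of the node to remove in config/xxx.yaml (the ConfigNode port is cn_internal_port; the DataNode port is rpc_port).
-* Run the scale-in command.
-```bash
-iotdbctl cluster scalein default_cluster
-```
-
-#### Managing an Existing IoTDB Cluster with the Tool
-
-* Configure the server's `user`, `password` or `pkey`, and `ssh_port`.
-* Set the IoTDB deployment path in config/xxx.yaml: `deploy_dir` (IoTDB deployment directory) and `iotdb_dir_name` (name of the extracted IoTDB directory; defaults to iotdb).
-  For example, if the full IoTDB deployment path is `/home/data/apache-iotdb-1.1.1`, set `deploy_dir:/home/data/` and `iotdb_dir_name:apache-iotdb-1.1.1` in the yaml file.
-* 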
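If the server does not use java_home, set `jdk_deploy_dir` (JDK deployment directory) and `jdk_dir_name` (name of the extracted JDK directory; defaults to jdk_iotdb). If it does use java_home, no change is needed.
-  For example, if the full JDK deployment path is `/home/data/jdk_1.8.2`, set `jdk_deploy_dir:/home/data/` and `jdk_dir_name:jdk_1.8.2` in the yaml file.
-* Configure `cn_seed_config_node` and `dn_seed_config_node`.
-* Under `iotdb-system.properties` in `confignode_servers`, configure `cn_internal_address`, `cn_internal_port`, `cn_consensus_port`, `cn_system_dir`, and
-  `cn_consensus_dir` only if their values differ from the IoTDB defaults; otherwise they can be left out.
-* Under `iotdb-system.properties` in `datanode_servers`, configure `dn_rpc_address`, `dn_internal_address`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, and so on.
-* Run the initialization command.
-
-```bash
-iotdbctl cluster init default_cluster
-```
-
-#### One-Step Deployment of IoTDB, Grafana, and Prometheus
-
-* Configure `iotdb-system.properties` to enable the metrics interface.
-* Configure Grafana. If there are several `dashboards`, separate them with commas; names must be unique or they will be overwritten.
-* Configure Prometheus. If metrics are configured for the IoTDB cluster, the Prometheus configuration does not need manual changes; it is updated automatically based on which nodes have metrics configured.
-* Start the cluster.
-
-```bash
-iotdbctl cluster start default_cluster
-```
-
-For more detailed parameters, see the Cluster Configuration File section above.
-
-
-### 1.6 Command Format
-
-The basic usage of the tool is:
-```bash
-iotdbctl cluster <key> <cluster name> [params (Optional)]
-```
-* key is the specific command.
-
-* cluster name is the name of the cluster (the name of the yaml file in `iotdbctl/config`).
-
-* params are the parameters the command needs (optional).
-
-* For example, the command to deploy the default_cluster cluster is:
-
-```bash
-iotdbctl cluster deploy default_cluster
-```
-
-* The cluster commands and their parameters are listed below:
-
-| Command | Function | Parameters |
-|---|---|---|
-| check | Check whether the cluster can be deployed | List of cluster names |
-| clean | Clean the cluster | Cluster name |
-| deploy/dist-all | Deploy the cluster | Cluster name, -N, module name (iotdb, grafana, prometheus; optional), -op force (optional) |
-| list | Print the clusters and their status | None |
-| start | Start the cluster | Cluster name, -N, node name (nodename, grafana, prometheus; optional) |
-| stop | Stop the cluster | Cluster name, -N, node name (nodename, grafana, prometheus; optional), -op force (nodename, grafana, prometheus; optional) |
-| restart | Restart the cluster | Cluster name, -N, node name (nodename, grafana, prometheus; optional), -op force (forced stop)/rolling (rolling restart) |
-| show | Show cluster information; the details field shows detailed cluster information | Cluster name, details (optional) |
-| destroy | Destroy the cluster | 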
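Cluster name, -N, module name (iotdb, grafana, prometheus; optional) |
-| scaleout | Scale out the cluster | Cluster name |
-| scalein | Scale in the cluster | Cluster name, -N, node name or node ip+port |
-| reload | Hot-reload the cluster configuration | Cluster name |
-| dist-conf | Distribute the cluster configuration files | Cluster name |
-| dumplog | Back up the logs of the specified cluster | Cluster name, -N node names, -h target machine IP, -pw target machine password, -p target machine port, -path backup directory, -startdate start time, -enddate end time, -loglevel log level, -l transfer speed |
-| dumpdata | Back up the data of the specified cluster | Cluster name, -h target machine IP, -pw target machine password, -p target machine port, -path backup directory, -startdate start time, -enddate end time, -l transfer speed |
-| dist-lib | Upgrade the lib package | Cluster name (restart after the upgrade) |
-| init | Initialize the cluster configuration when managing an existing cluster with the tool | Cluster name |
-| status | Show process status | Cluster name |
-| activate | Activate the cluster | Cluster name |
-| dist-plugin | Upload a plugin (udf, trigger, pipe) to the cluster | Cluster name, -type U(udf)/T(trigger)/P(pipe), -file /xxxx/trigger.jar; after the upload, run the create udf, pipe, or trigger command manually |
-| upgrade | Rolling upgrade | Cluster name |
-| health_check | Health check | Cluster name, -N, node name (optional) |
-| backup | Offline backup | Cluster name, -N, node name (optional) |
-| importschema | Import metadata | Cluster name, -N, node name (required), -param parameters |
-| exportschema | Export metadata | Cluster name, -N, node name (required), -param parameters |
-
-
-### 1.7 Detailed Command Execution
-
-The commands below all use default_cluster.yaml as the example; substitute your own cluster file as needed.
-
-#### Checking the Deployment Environment
-
-```bash
-iotdbctl cluster check default_cluster
-```
-
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration.
-
-* Verify that the target nodes can be reached over SSH.
-
-* Verify that the JDK on each node meets IoTDB's requirement of JDK 1.8 or later, that unzip is installed, and that lsof or netstat is installed.
-
-* If the message `Info:example check successfully!` appears, the servers meet the installation requirements.
-  If `Error:example check fail!` appears, some requirements are not met; check the Error lines in the output above (for example `Error:Server (ip:172.20.31.76) iotdb port(10713) is listening`) and fix them.
-  If the JDK check fails, you can configure a JDK 1.8 or later in the yaml file for deployment; this does not affect later use.
-  If the lsof, netstat, or unzip check fails, install the missing tool on the server yourself.
-
-#### Deploying a Cluster
-
-```bash
-iotdbctl cluster deploy default_cluster
-```
-
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration.
-
-* Upload the IoTDB archive and the JDK archive (if `jdk_tar_dir` and `jdk_deploy_dir` are set in the yaml) according to the node information in `confignode_servers` and `datanode_servers`.
-
-* Generate and upload `iotdb-system.properties` according to the node configuration in the yaml file.
-
-```bash
-iotdbctl cluster deploy default_cluster -op force
-```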
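-Note: this command forces the deployment. It deletes the existing deployment directory and deploys again.
-
-*Deploying a single module*
-```bash
-# deploy the grafana module
-iotdbctl cluster deploy default_cluster -N grafana
-# deploy the prometheus module
-iotdbctl cluster deploy default_cluster -N prometheus
-# deploy the iotdb module
-iotdbctl cluster deploy default_cluster -N iotdb
-```
-
-#### Starting a Cluster
-
-```bash
-iotdbctl cluster start default_cluster
-```
-
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration.
-
-* Start the ConfigNodes one by one in the order given in `confignode_servers`, checking by process id that each one is running. The first ConfigNode is the seed ConfigNode.
-
-* Start the DataNodes one by one in the order given in `datanode_servers`, checking by process id that each one is running.
-
-* Once the processes are confirmed by process id, check each service in the cluster list through the CLI. If the CLI connection fails, retry every 10 s until it succeeds, up to 5 times.
-
-
-*Starting a single node*
-```bash
-#start by IoTDB node name
-iotdbctl cluster start default_cluster -N datanode_1
-#start by ip+port, where port is cn_internal_port for a confignode and rpc_port for a datanode
-iotdbctl cluster start default_cluster -N 192.168.1.5:6667
-#start grafana
-iotdbctl cluster start default_cluster -N grafana
-#start prometheus
-iotdbctl cluster start default_cluster -N prometheus
-```
-
-* Locate the yaml file in the default location by cluster-name.
-
-* Find the node by the given node name or ip:port. If the node is a `data_node`, the ip is `dn_rpc_address` and the port is `dn_rpc_port` from datanode_servers in the yaml file.
-  If the node is a `config_node`, the ip is `cn_internal_address` and the port is `cn_internal_port` from confignode_servers in the yaml file.
-
-* Start the node.
-
-Note: the tool only calls the start-confignode.sh and start-datanode.sh scripts of the IoTDB cluster.
-If the reported result is a failure, the cluster may simply not have finished starting; use the status command to check the current cluster state (iotdbctl cluster status xxx).
-
-
-#### Viewing IoTDB Cluster Status
-
-```bash
-iotdbctl cluster show default_cluster
-#show detailed IoTDB cluster information
-iotdbctl cluster show default_cluster details
-```
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration.
-
-* Run `show cluster details` through the CLI on each DataNode in turn. As soon as one node succeeds, the result is returned and the remaining nodes are skipped.
-
-
-#### Stopping a Cluster
-
-
-```bash
-iotdbctl cluster stop default_cluster
-```
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration.
-
-* 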
Stop the DataNodes one by one in the order given in `datanode_servers`.
-
-* Stop the ConfigNodes one by one in the order given in `confignode_servers`.
-
-*Force-stopping a cluster*
-
-```bash
-iotdbctl cluster stop default_cluster -op force
-```
-This runs kill -9 pid directly to force the cluster to stop.
-
-*Stopping a single node*
-
-```bash
-#stop by IoTDB node name
-iotdbctl cluster stop default_cluster -N datanode_1
-#stop by ip+port (the node is identified by ip+dn_rpc_port for a datanode or ip+cn_internal_port for a confignode)
-iotdbctl cluster stop default_cluster -N 192.168.1.5:6667
-#stop grafana
-iotdbctl cluster stop default_cluster -N grafana
-#stop prometheus
-iotdbctl cluster stop default_cluster -N prometheus
-```
-
-* Locate the yaml file in the default location by cluster-name.
-
-* Find the node by the given node name or ip:port. If the node is a `data_node`, the ip is `dn_rpc_address` and the port is `dn_rpc_port` from datanode_servers in the yaml file.
-  If the node is a `config_node`, the ip is `cn_internal_address` and the port is `cn_internal_port` from confignode_servers in the yaml file.
-
-* Stop the node.
-
-Note: the tool only calls the stop-confignode.sh and stop-datanode.sh scripts of the IoTDB cluster, so in some cases the cluster may not actually have stopped.
-
-
-#### Cleaning Cluster Data
-
-```bash
-iotdbctl cluster clean default_cluster
-```
-
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration.
-
-* Using the information in `confignode_servers` and `datanode_servers`, check whether any service is still running.
-  If any service is running, the clean command is not executed.
-
-* Delete the data directory of the IoTDB cluster as well as the `cn_system_dir`, `cn_consensus_dir`,
-  `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, and `ext` directories configured in the yaml file.
-
-
-
-#### Restarting a Cluster
-
-```bash
-iotdbctl cluster restart default_cluster
-```
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus` configuration.
-
-* Run the stop command described above, then the start command. See the start and stop commands above for details.
-
-*Force-restarting a cluster*
-
-```bash
-iotdbctl cluster restart default_cluster -op force
-```
-This runs kill -9 pid directly to force the cluster to stop, then starts the cluster.
-
-*Restarting a single node*
-
-```bash
-#restart datanode_1 by IoTDB node name
-iotdbctl cluster restart default_cluster -N datanode_1
-#restart confignode_1 by IoTDB node name
-iotdbctl cluster restart default_cluster -N confignode_1
-#restart grafana
-iotdbctl cluster restart default_cluster -N 
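grafana
-#restart prometheus
-iotdbctl cluster restart default_cluster -N prometheus
-```
-
-#### Scaling In a Cluster
-
-```bash
-#scale in by node name
-iotdbctl cluster scalein default_cluster -N nodename
-#scale in by ip+port (the node is identified by ip+dn_rpc_port for a datanode or ip+cn_internal_port for a confignode)
-iotdbctl cluster scalein default_cluster -N ip:port
-```
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration.
-
-* Check whether the ConfigNode or DataNode to remove is the last one of its kind. If only one is left, the scale-in cannot be executed.
-
-* Look up the node by ip:port or nodename, run the scale-in command, and then destroy the node's directory. If the node is a `data_node`, the ip is `dn_rpc_address` and the port is `dn_rpc_port` from datanode_servers in the yaml file.
-  If the node is a `config_node`, the ip is `cn_internal_address` and the port is `cn_internal_port` from confignode_servers in the yaml file.
-
-
-Tip: currently only one node can be removed at a time.
-
-#### Scaling Out a Cluster
-
-```bash
-iotdbctl cluster scaleout default_cluster
-```
-* Edit config/xxx.yaml and add one DataNode or ConfigNode.
-
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration.
-
-* Find the node to add, upload the IoTDB archive and the JDK package (if `jdk_tar_dir` and `jdk_deploy_dir` are set in the yaml), and extract them.
-
-* Generate and upload `iotdb-system.properties` according to the node configuration in the yaml file.
-
-* Start the node and verify that it started successfully.
-
-Tip: currently only one node can be added at a time.
-
-#### Destroying a Cluster
-```bash
-iotdbctl cluster destroy default_cluster
-```
-
-* Locate the yaml file in the default location by cluster-name.
-
-* Using the node information in `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus`, check whether any node is still running.
-  If any node is running, the destroy command is aborted.
-
-* Delete `data` as well as the `cn_system_dir`, `cn_consensus_dir`,
-  `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, and `ext` directories configured in the yaml file, the `IoTDB` deployment directory,
-  the Grafana deployment directory, and the Prometheus deployment directory.
-
-*Destroying a single module*
-```bash
-# destroy the grafana module
-iotdbctl cluster destroy default_cluster -N grafana
-# destroy the prometheus module
-iotdbctl cluster destroy default_cluster -N prometheus
-# destroy the iotdb module
-iotdbctl cluster destroy default_cluster -N iotdb
-```
-
-#### Distributing Cluster Configuration
-```bash
-iotdbctl cluster dist-conf default_cluster
-```
-
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus` configuration.
-
-* Generate `iotdb-system.properties` according to the node configuration in the yaml file and upload it to each specified node in turn.
-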
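The hot-deployment scenario in section 1.5 runs `dist-conf` and then `reload`. A minimal sketch of that two-step sequence as a shell function follows, assuming `iotdbctl` is on the PATH. The function name `hot_deploy` and the `DRY_RUN` variable are hypothetical helpers introduced here for illustration; they are not part of the tool.

```bash
# Sketch only: hot_deploy and DRY_RUN are hypothetical names, not part of iotdbctl.
# Runs the documented two-step sequence: dist-conf, then reload.
hot_deploy() {
  cluster="$1"
  for step in dist-conf reload; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Dry run: show the command instead of executing it.
      echo "iotdbctl cluster $step $cluster"
    else
      # Stop at the first failing step so reload never follows a failed distribution.
      iotdbctl cluster "$step" "$cluster" || return 1
    fi
  done
}

# Preview the commands for default_cluster without touching the cluster.
DRY_RUN=1
hot_deploy default_cluster
```

Unsetting `DRY_RUN` (or setting it to 0) makes the function call `iotdbctl` for real. Stopping on the first failure is a design choice of this sketch, not documented tool behavior.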
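-#### Hot-Reloading Cluster Configuration
-```bash
-iotdbctl cluster reload default_cluster
-```
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration.
-
-* Run `load configuration` through the CLI on each node in turn, according to the node configuration in the yaml file.
-
-#### Backing Up Cluster Node Logs
-```bash
-iotdbctl cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs'
-```
-* Locate the yaml file in the default location by cluster-name.
-
-* The command checks against the yaml file that datanode_1 and confignode_1 exist, then backs up the logs of those nodes within the configured date range (startdate<=logtime<=enddate) to server `192.168.9.48`, port `36000`, under the backup path `/iotdb/logs`. The IoTDB logs are read from `/root/data/db/iotdb/logs` (optional; if -logs xxx is omitted, logs are backed up from /logs under the IoTDB installation path).
-
-| Option | Function | Required |
-|---|---|---|
-| -h | IP of the machine that stores the backup | No |
-| -u | User name on the machine that stores the backup | No |
-| -pw | Password on the machine that stores the backup | No |
-| -p | Port of the machine that stores the backup (default 22) | No |
-| -path | Path where the backup is stored (default: current path) | No |
-| -loglevel | Log level: all, info, error, or warn (default: all) | No |
-| -l | Rate limit (unlimited by default; range 0 to 104857601, in Kbit/s) | No |
-| -N | Node names from the configuration file; separate multiple names with commas | Yes |
-| -startdate | Start time (inclusive; default 1970-01-01) | No |
-| -enddate | End time (inclusive) | No |
-| -logs | IoTDB log directory; defaults to {iotdb}/logs | No |
-
-#### Backing Up Cluster Node Data
-```bash
-iotdbctl cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas'
-```
-* The command finds the leader node from the yaml file, then backs up data within the date range (startdate<=logtime<=enddate) to the /iotdb/datas directory on server 192.168.9.48.
-
-| Option | Function | Required |
-|---|---|---|
-| -h | IP of the machine that stores the backup | No |
-| -u | User name on the machine that stores the backup | No |
-| -pw | Password on the machine that stores the backup | No |
-| -p | Port of the machine that stores the backup (default 22) | No |
-| -path | Path where the backup is stored (default: current path) | No |
-| -granularity | Granularity type: partition | Yes |
-| -l | Rate limit (unlimited by default; range 0 to 104857601, in Kbit/s) | No |
-| -startdate | Start time (inclusive) | Yes |
-| -enddate | End time (inclusive) | Yes |
-
-#### Uploading the Cluster lib Package (Upgrade)
-```bash
-iotdbctl cluster dist-lib default_cluster
-```
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration.
-
-* Upload the lib package.
-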
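The `global` parameter table states that `iotdb_lib_dir` may be either a lib directory or a .zip archive. Before running `dist-lib`, a quick local check of the configured path can catch an unsupported value early. The sketch below is a hypothetical helper (`lib_source_kind` is not part of iotdbctl), and it assumes that checking the file extension is sufficient; it does not inspect the archive contents.

```bash
# Sketch only: lib_source_kind is a hypothetical helper, not part of iotdbctl.
# Classifies a candidate iotdb_lib_dir value as "dir", "zip", or "unsupported".
lib_source_kind() {
  path="$1"
  if [ -d "$path" ]; then
    # A plain lib directory.
    echo "dir"
  elif [ -f "$path" ]; then
    # A regular file is accepted only when its name ends in .zip.
    case "$path" in
      *.zip) echo "zip" ;;
      *) echo "unsupported" ;;
    esac
  else
    # Missing path or something that is neither a file nor a directory.
    echo "unsupported"
  fi
}
```

For example, `lib_source_kind ./lib.zip` could be called in a wrapper script, which then refuses to continue when the result is `unsupported`.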
-Note: restart IoTDB after the upgrade for it to take effect.
-
-#### Initializing a Cluster
-```bash
-iotdbctl cluster init default_cluster
-```
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus` configuration.
-* Initialize the cluster configuration.
-
-#### Viewing Cluster Process Status
-```bash
-iotdbctl cluster status default_cluster
-```
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus` configuration.
-* Show whether each process in the cluster is alive.
-
-#### Activating a Cluster License
-
-By default a cluster is activated by entering an activation code. It can also be activated from a license path with -op license_path.
-
-* Default activation
-```bash
-iotdbctl cluster activate default_cluster
-```
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` configuration.
-* Read the machine code.
-* Wait for the activation code to be entered.
-
-```bash
-Machine code:
-Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ==
-JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u==
-Please enter the activation code:
-JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws=
-Activation successful
-```
-* Activating a single node
-
-```bash
-iotdbctl cluster activate default_cluster -N confignode1
-```
-
-* Activation from a license path
-
-```bash
-iotdbctl cluster activate default_cluster -op license_path
-```
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` configuration.
-* Read the machine code.
-* Wait for the activation code to be entered.
-
-```bash
-Machine code:
-Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ==
-JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u==
-Please enter the activation code:
-JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws=
-Activation successful
-```
-* Activating a single node from a license path
-
-```bash
-iotdbctl cluster activate default_cluster -N confignode1 -op license_path
-```
-
-### 1.8 Distributing Cluster Plugins
-```bash
-#distribute udf
-iotdbctl cluster dist-plugin default_cluster -type U -file /xxxx/udf.jar
-#distribute trigger
-iotdbctl cluster dist-plugin default_cluster 
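-type T -file /xxxx/trigger.jar
-#distribute pipe
-iotdbctl cluster dist-plugin default_cluster -type P -file /xxxx/pipe.jar
-```
-* Locate the yaml file in the default location by cluster-name and read the `datanode_servers` configuration.
-
-* Upload the udf/trigger/pipe jar package.
-
-After the upload, run the create udf/trigger/pipe command manually.
-
-### 1.9 Rolling Cluster Upgrade
-```bash
-iotdbctl cluster upgrade default_cluster
-```
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration.
-
-* Upload the lib package.
-* Each ConfigNode is stopped, has its lib package replaced, and is started; then each DataNode is stopped, has its lib package replaced, and is started.
-
-
-
-### 1.10 Cluster Health Check
-```bash
-iotdbctl cluster health_check default_cluster
-```
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration.
-* Run health_check.sh on every node.
-
-* Health check of a single node
-```bash
-iotdbctl cluster health_check default_cluster -N datanode_1
-```
-* Locate the yaml file in the default location by cluster-name and read the `datanode_servers` configuration.
-* Run health_check.sh on datanode_1.
-
-
-### 1.11 Offline Cluster Backup
-```bash
-iotdbctl cluster backup default_cluster
-```
-* Locate the yaml file in the default location by cluster-name and read the `confignode_servers` and `datanode_servers` configuration.
-* Run backup.sh on every node.
-
-* Offline backup of a single node
-```bash
-iotdbctl cluster backup default_cluster -N datanode_1
-```
-* Locate the yaml file in the default location by cluster-name and read the `datanode_servers` configuration.
-* Run backup.sh on datanode_1.
-
-Note: when several nodes are deployed on a single machine, only quick mode is supported.
-
-### 1.12 Importing Cluster Metadata
-
-```bash
-iotdbctl cluster importschema default_cluster -N datanode1 -param "-s ./dump0.csv -fd ./failed/ -lpf 10000"
-```
-* Locate the yaml file in the default location by cluster-name and read the `datanode_servers` configuration.
-* Run the metadata import script import-schema.sh on datanode1.
-
-The -param arguments are as follows:
-
-| Option | Function | Required |
-|---|---|---|
-| -s | The data file to import; either a file or a folder. If a folder is given, all files in it with the csv suffix are imported in batch. | Yes |
-| -fd | A directory for files that failed to import. If omitted, failed files are saved in the source data directory, named after the source file with a .failed suffix. | No |
-| -lpf | Number of rows written to each failed-import file; defaults to 10000 | No |
-
-
-
-### 1.13 Exporting Cluster Metadata
-
-```bash
-iotdbctl cluster exportschema default_cluster -N datanode1 -param "-t ./ -pf ./pattern.txt -lpf 10 -timeout 10000"
-```
-* Locate the yaml file in the default location by cluster-name and read the `datanode_servers` configuration.
-* Run the metadata export script export-schema.sh on datanode1.
-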
-The -param arguments are as follows:
-
-| Option | Function | Required |
-|---|---|---|
-| -t | Output path for the exported CSV files | Yes |
-| -path | Path pattern of the metadata to export; when set, the -s parameter is ignored. Example: root.stock.** | No |
-| -pf | Required if -path is not set. Path of a file listing the metadata paths to query; txt format, one path to export per line. | No |
-| -lpf | Maximum number of lines in each exported dump file; defaults to 10000. | No |
-| -timeout | Timeout for session queries, in ms | No |
-
-
-
-### 1.14 Sample Configurations of the Cluster Deployment Tool
-The config/example directory under the tool's installation directory contains 3 yaml samples. Copy one into config and edit it as needed.
-
-| Name | Description |
-|---|---|
-| default_1c1d.yaml | Sample configuration with 1 ConfigNode and 1 DataNode |
-| default_3c3d.yaml | Sample configuration with 3 ConfigNodes and 3 DataNodes |
-| default_3c3d_grafa_prome | Sample configuration with 3 ConfigNodes, 3 DataNodes, Grafana, and Prometheus |
-
-## 2. Data Directory Overview Tool
-
-The IoTDB data directory overview tool prints a structural overview of a data directory. It is located at tools/tsfile/print-iotdb-data-dir.
-
-### 2.1 Usage
-
-- Windows:
-
-```bash
-.\print-iotdb-data-dir.bat (<storage path of the output>)
-```
-
-- Linux or MacOs:
-
-```shell
-./print-iotdb-data-dir.sh (<storage path of the output>)
-```
-
-Note: if no storage path is given for the output, the relative path "IoTDB_data_dir_overview.txt" is used as the default.
-
-### 2.2 Example
-
-Using Windows as an example:
-
-`````````````````````````bash
-.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data
-````````````````````````
-Starting Printing the IoTDB Data Directory Overview
-````````````````````````
-output save path:IoTDB_data_dir_overview.txt
-data dir num:1
-143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs.
-|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -````````````````````````` - -## 3. TsFile概览工具 - -TsFile概览工具用于以概要模式打印出一个TsFile的内容,工具位置为 tools/tsfile/print-tsfile。 - -### 3.1 用法 - -- Windows: - -```bash -.\print-tsfile-sketch.bat (<输出结果的存储路径>) -``` - -- Linux or MacOs: - -```shell -./print-tsfile-sketch.sh (<输出结果的存储路径>) -``` - -注意:如果没有设置输出结果的存储路径, 将使用相对路径"TsFile_sketch_view.txt"作为默认值。 - -### 3.2 示例 - -以Windows系统为例: - -`````````````````````````bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. 
--------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] 
offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - └──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -````````````````````````` - -解释: - -- 以"|"为分隔,左边是在TsFile文件中的实际位置,右边是梗概内容。 -- "|||||||||||||||||||||"是为增强可读性而添加的导引信息,不是TsFile中实际存储的数据。 -- 最后打印的"IndexOfTimerseriesIndex Tree"是对TsFile文件末尾的元数据索引树的重新整理打印,便于直观理解,不是TsFile中存储的实际数据。 - -## 4. 
TsFile Resource概览工具 - -TsFile resource概览工具用于打印出TsFile resource文件的内容,工具位置为 tools/tsfile/print-tsfile-resource-files。 - -### 4.1 用法 - -- Windows: - -```bash -.\print-tsfile-resource-files.bat -``` - -- Linux or MacOs: - -``` -./print-tsfile-resource-files.sh -``` - -### 4.2 示例 - -以Windows系统为例: - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. 
-````````````````````````` - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. -````````````````````````` diff --git a/src/zh/UserGuide/latest-Table/Tools-System/Monitor-Tool_timecho.md b/src/zh/UserGuide/latest-Table/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index 9f5d3f26a..000000000 --- a/src/zh/UserGuide/latest-Table/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,187 +0,0 @@ - - - -# 监控工具 - -监控工具的部署可参考文档 [监控面板部署](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) 章节。 - -## 1. 
监控指标的 Prometheus 映射关系 - -> 对于 Metric Name 为 name, Tags 为 K1=V1, ..., Kn=Vn 的监控指标有如下映射,其中 value 为具体值 - -| 监控指标类型 | 映射关系 | -| ---------------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Counter | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value | -| AutoGauge、Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value | -| Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value | -| Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m1"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m5"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m15"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="mean"} value | -| Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value |
-
-## 2. Modify the Configuration File
-
-1) Taking DataNode as an example, modify the iotdb-system.properties configuration file as follows:
-
-```properties
-dn_metric_reporter_list=PROMETHEUS
-dn_metric_level=CORE
-dn_metric_prometheus_reporter_port=9091
-```
-
-2) Start the IoTDB DataNode.
-
-3) Open a browser or use ```curl``` to access ```http://server_ip:9091/metrics```; you will get metric data like the following:
-
-```
-...
-# HELP file_count
-# TYPE file_count gauge
-file_count{name="wal",} 0.0
-file_count{name="unseq",} 0.0
-file_count{name="seq",} 2.0
-...
-```
-
-## 3. Prometheus + Grafana
-
-As shown above, IoTDB exposes monitoring metrics in the standard Prometheus format. You can use Prometheus to collect and store the metrics, and Grafana to visualize them.
-
-The relationship between IoTDB, Prometheus, and Grafana is shown in the figure below:
-
-![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png)
-
-1. IoTDB continuously collects monitoring metrics while it is running.
-2. Prometheus pulls the metrics from IoTDB's HTTP interface at a fixed (configurable) interval.
-3. Prometheus stores the pulled metrics in its own TSDB.
-4. Grafana queries the metrics from Prometheus at a fixed (configurable) interval and plots them.
-
-As the interaction flow shows, some extra work is needed to deploy and configure Prometheus and Grafana.
-
-For example, you can configure Prometheus as follows (some parameters can be adjusted as needed) to pull monitoring data from IoTDB:
-
-```yaml
-job_name: pull-metrics
-honor_labels: true
-honor_timestamps: true
-scrape_interval: 15s
-scrape_timeout: 10s
-metrics_path: /metrics
-scheme: http
-follow_redirects: true
-static_configs:
-  - targets:
-      - localhost:9091
-```
-
-For more details, refer to the following documents:
-
-[Prometheus installation and usage documentation](https://prometheus.io/docs/prometheus/latest/getting_started/)
-
-[Prometheus configuration for pulling metrics from an HTTP interface](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config)
-
-[Grafana installation and usage documentation](https://grafana.com/docs/grafana/latest/getting-started/getting-started/)
-
-[Grafana documentation on querying data from Prometheus and plotting it](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus)
-
-## 4. 
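Apache IoTDB Dashboard
-
-We provide the Apache IoTDB Dashboard, which supports unified, centralized operations and maintenance management and can monitor multiple clusters from a single monitoring panel.
-
-![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png)
-
-![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png)
-
-You can obtain the Dashboard JSON file in the Enterprise Edition.
-
-### 4.1 Cluster Overview
-
-Monitored items include but are not limited to:
-- Total CPU cores, total memory, and total disk space of the cluster
-- Number of ConfigNodes and DataNodes in the cluster
-- Cluster uptime
-- Cluster write speed
-- Current CPU, memory, and disk usage of each node in the cluster
-- Per-node information
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png)
-
-### 4.2 Data Writing
-
-Monitored items include but are not limited to:
-- Average write latency, median latency, and 99th-percentile latency
-- Number and size of WAL files
-- Time taken by node WAL flush SyncBuffer
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png)
-
-### 4.3 Data Querying
-
-Monitored items include but are not limited to:
-- Time taken by node queries to load time series metadata
-- Time taken by node queries to read time series
-- Time taken by node queries to modify time series metadata
-- Time taken by node queries to load the Chunk metadata list
-- Time taken by node queries to modify Chunk metadata
-- Time taken by node queries to filter by Chunk metadata
-- Average time taken by node queries to construct a Chunk Reader
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png)
-
-### 4.4 Storage Engine
-
-Monitored items include but are not limited to:
-- Number and size of files by type
-- Number and size of TsFiles at each stage
-- Number and duration of each type of task
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png)
-
-### 4.5 System Monitoring
-
-Monitored items include but are not limited to:
-- System memory, swap memory, and process memory
-- Disk space, file count, and file size
-- JVM GC time ratio, GC counts by type, GC data volume, and heap memory usage by generation
-- Network transfer rate and packet send rate
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png)
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png)
-
-![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png)
-
-### 4.6 Data Synchronization
-
-Monitored items include but are not limited to:
-- Pipe event commit queue size and number of unassigned Pipe events.
-- Number of unprocessed events in the Source queue, Source event supply rate, and Processor event processing rate.
-- Number of untransferred events for each type of Pipesink/source, and Pipe connector event transfer rate.
-- Pipesink retry queue and pending handler counts, Pipesink cumulative size before and after compression and compression time, and Pipesink batch size and batch interval distribution.
-- Pipe memory usage and capacity, number of Pipe phantom references, number and size of linked TsFiles, and bytes read from disk when Pipe sends TsFiles.
-
-![](/img/monitor-tool-pipe-1.png)
-
-![](/img/monitor-tool-pipe-2.png)
-
-![](/img/monitor-tool-pipe-3.png)
-
-![](/img/monitor-tool-pipe-4.png)
\ No newline at end of file
diff --git a/src/zh/UserGuide/latest-Table/Tools-System/Schema-Export-Tool_timecho.md 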
b/src/zh/UserGuide/latest-Table/Tools-System/Schema-Export-Tool_timecho.md deleted file mode 100644 index c7affc309..000000000 --- a/src/zh/UserGuide/latest-Table/Tools-System/Schema-Export-Tool_timecho.md +++ /dev/null @@ -1,111 +0,0 @@ - - -# 元数据导出 - -## 1. 功能概述 - -元数据导出工具 `export-schema.sh/bat` 位于tools 目录下,能够将 IoTDB 中指定数据库下的元数据导出为脚本文件。 - -## 2. 功能详解 - -### 2.1 参数介绍 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -|----------------|--------------------------|-----------------------------------------------| ----------------------------------- |---------------------------------------| -| `-h` | `--host` | 主机名 | 否 | 127.0.0.1 | -| `-p` | `--port` | 端口号 | 否 | 6667 | -| `-u` | `--username` | 用户名 | 否 | root | -| `-pw` | `--password` | 密码,自 V2.0.9.1 起支持隐藏输入 | 否 | TimechoDB@2021 (V2.0.6 版本之前为 root) | -| `-sql_dialect` | `--sql_dialect` | 选择 server 是树模型还是表模型,当前支持 tree 和 table 类型 | 否 | tree | -| `-db` | `--database` | 将要导出的目标数据库,只在`-sql_dialect`为 table 类型下生效。 | `-sql_dialect`为 table 时必填 | - | -| `-table` | `--table` | 将要导出的目标表,只在`-sql_dialect`为 table 类型下生效。 | 否 | - | -| `-t` | `--target` | 指定输出文件的目标文件夹,如果路径不存在新建文件夹 | 是 | | -| `-path` | `--path_pattern` | 指定导出元数据的path pattern | `-sql_dialect`为 tree 时必填 | | -| `-pfn` | `--prefix_file_name` | 指定导出文件的名称。 | 否 | dump\_dbname.sql | -| `-lpf` | `--lines_per_file` | 指定导出的dump文件最大行数,只在`-sql_dialect`为 tree 类型下生效。 | 否 | `10000` | -| `-timeout` | `--query_timeout` | 会话查询的超时时间(ms) | 否 | -1范围:-1~Long. max=9223372036854775807 | -| `-help` | `--help` | 显示帮助信息 | 否 | | -| `-usessl` | `--use_ssl` | 使用 SSL 协议,自 V2.0.9.1 起支持 | 否 | - | -| `-ts` | `--trust_store` | 信任库。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | -| `-tpw` | `--trust_store_password` | 信任库密码。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | - -### 2.2 运行命令 - -```Bash -Shell -# Unix/OS X -> tools/export-schema.sh [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -# Windows -# V2.0.4.x 版本之前 -> tools\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] - -# V2.0.4.x 版本及之后 -> tools\windows\schema\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -``` - -### 2.3 运行示例 - -将 `database1` 下的元数据导出到`/home`下 - -```Bash -./export-schema.sh -sql_dialect table -t /home/ -db database1 -``` - -导出文件 `dump_database1.sql`,内容格式如下: - -```sql -DROP TABLE IF EXISTS table1; -CREATE TABLE table1( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -DROP TABLE IF EXISTS table2; -CREATE TABLE table2( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -``` diff --git a/src/zh/UserGuide/latest-Table/Tools-System/Schema-Import-Tool_timecho.md b/src/zh/UserGuide/latest-Table/Tools-System/Schema-Import-Tool_timecho.md deleted file mode 100644 index 1b418708d..000000000 --- a/src/zh/UserGuide/latest-Table/Tools-System/Schema-Import-Tool_timecho.md +++ /dev/null @@ -1,166 +0,0 @@ - - -# 元数据导入 - -## 1. 功能概述 - -元数据导入工具 `import-schema.sh/bat` 位于tools 目录下,能够将指定路径下创建元数据的脚本文件导入到 IoTDB 中。 - -## 2. 
功能详解 - -### 2.1 参数介绍 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -|----------------|------------------------------|------------------------------------------------| -------------- |-------------------------------------| -| `-h` | `--host` | 主机名 | 否 | 127.0.0.1 | -| `-p` | `--port` | 端口号 | 否 | 6667 | -| `-u` | `--username` | 用户名 | 否 | root | -| `-pw` | `--password` | 密码,自 V2.0.9.1 起支持隐藏输入 | 否 | TimechoDB@2021 (V2.0.6 版本之前为 root) | -| `-sql_dialect` | `--sql_dialect` | 选择 server 是树模型还是表模型,当前支持 tree 和 table 类型 | 否 | tree | -| `-db` | `--database` | 将要导入的目标数据库 | `是` | - | -| `-table` | `--table` | 将要导入的目标表,只在`-sql_dialect`为 table 类型下生效。 | 否 | - | -| `-s` | `--source` | 待加载的脚本文件(夹)的本地目录路径。 | 是 | | -| `-fd` | `--fail_dir` | 指定保存失败文件的目录 | 否 | | -| `-lpf` | `--lines_per_failed_file` | 指定失败文件最大写入数据的行数,只在`-sql_dialect`为 table 类型下生效。 | 否 | 100000范围:0~Integer.Max=2147483647 | -| `-help` | `--help` | 显示帮助信息 | 否 | | -| `-usessl` | `--use_ssl` | 使用 SSL 协议,自 V2.0.9.1 起支持 | 否 | - | -| `-ts` | `--trust_store` | 信任库。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | -| `-tpw` | `--trust_store_password` | 信任库密码。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | - -### 2.2 运行命令 - -```Bash -# Unix/OS X -tools/import-schema.sh [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# Windows -# V2.0.4.x 版本之前 -tools\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# V2.0.4.x 版本及之后 -tools\windows\schema\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] -``` - -### 2.3 运行示例 - - -将 `/home `下的文件 `dump_database1.sql` 导入到 `database2 `中,文件内容如下: - -```sql -DROP TABLE IF EXISTS table1; -CREATE TABLE table1( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -DROP TABLE IF EXISTS table2; -CREATE TABLE table2( - time TIMESTAMP TIME, - region STRING TAG, - plant_id STRING TAG, - device_id STRING TAG, - model_id STRING ATTRIBUTE, - maintenance STRING ATTRIBUTE, - temperature FLOAT FIELD, - humidity FLOAT FIELD, - status BOOLEAN FIELD, - arrival_time TIMESTAMP FIELD -); -``` - -执行脚本: - -```Bash -./import-schema.sh -sql_dialect table -s /home/dump_database1.sql -db database2 - -# database2 不存在时,提示错误信息如下 -The target database database2 does not exist - -# database2 存在时,提示成功 -Import completely! -``` - -验证导入元数据: - -```Bash -# 导入前 -IoTDB:database2> show tables -+---------+-------+ -|TableName|TTL(ms)| -+---------+-------+ -+---------+-------+ -Empty set. 
- -# 导入后 -IoTDB:database2> show tables details -+---------+-------+------+-------+ -|TableName|TTL(ms)|Status|Comment| -+---------+-------+------+-------+ -| table2| INF| USING| null| -| table1| INF| USING| null| -+---------+-------+------+-------+ - -IoTDB:database2> desc table1 -+------------+---------+---------+ -| ColumnName| DataType| Category| -+------------+---------+---------+ -| time|TIMESTAMP| TIME| -| region| STRING| TAG| -| plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model_id| STRING|ATTRIBUTE| -| maintenance| STRING|ATTRIBUTE| -| temperature| FLOAT| FIELD| -| humidity| FLOAT| FIELD| -| status| BOOLEAN| FIELD| -|arrival_time|TIMESTAMP| FIELD| -+------------+---------+---------+ - -IoTDB:database2> desc table2 -+------------+---------+---------+ -| ColumnName| DataType| Category| -+------------+---------+---------+ -| time|TIMESTAMP| TIME| -| region| STRING| TAG| -| plant_id| STRING| TAG| -| device_id| STRING| TAG| -| model_id| STRING|ATTRIBUTE| -| maintenance| STRING|ATTRIBUTE| -| temperature| FLOAT| FIELD| -| humidity| FLOAT| FIELD| -| status| BOOLEAN| FIELD| -|arrival_time|TIMESTAMP| FIELD| -+------------+---------+---------+ -``` diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Audit-Log_timecho.md b/src/zh/UserGuide/latest-Table/User-Manual/Audit-Log_timecho.md deleted file mode 100644 index 0f0df7463..000000000 --- a/src/zh/UserGuide/latest-Table/User-Manual/Audit-Log_timecho.md +++ /dev/null @@ -1,163 +0,0 @@ - - - -# 安全审计 - -## 1. 引言 - -审计日志是数据库的记录凭证,通过审计日志功能可以查询数据库中增删改查等各项操作,以保证信息安全。IoTDB 审计日志功能支持以下特性: - -* 可通过配置决定是否开启审计日志功能 -* 可通过参数设置审计日志记录的操作类型和权限级别 -* 可通过参数设置审计日志文件的存储周期,包括基于 TTL 实现时间滚动和基于 SpaceTL 实现空间滚动。 -* 可通过参数设置统计任意时间段内写入和查询延时大于阈值(默认3000毫秒)的慢请求个数。 -* 审计日志文件默认加密存储 - -> 注意:该功能从 V2.0.8 版本开始提供。 - -## 2. 
配置参数 - -通过编辑配置文件 `iotdb-system.properties` 中如下参数来启动审计日志功能。 - -* V2.0.8.1 - - -| 参数名称 | 参数描述 | 数据类型 | 默认值 | 生效方式 | -|-----------------------------------|------------------------------------------------------------------------------------------------------| ---------- | ------------------------ | ---------- | -| `enable_audit_log` | 是否开启审计日志。 true:启用。false:禁用。 | Boolean | false | 热加载 | -| `auditable_operation_type` | 操作类型选择。 DML :所有 DML 都会记录审计日志; DDL :所有 DDL 都会记录审计日志; QUERY :所有 QUERY 都会记录审计日志; CONTROL:所有控制语句都会记录审计日志; | String | DML,DDL,QUERY,CONTROL | 热加载 | -| `auditable_operation_level` | 权限级别选择。 global :记录全部的审计日志; object:仅针对数据实例的事件的审计日志会被记录; 包含关系:object < global。 例如:设置为 global 时,所有审计日志正常记录;设置为 object 时,仅记录对具体数据实例的操作。 | String | global | 热加载 | -| `auditable_operation_result` | 审计结果选择。 success:只记录成功事件的审计日志; fail:只记录失败事件的审计日志; | String | success, fail | 热加载 | -| `audit_log_ttl_in_days` | 审计日志的 TTL,生成审计日志的时间达到该阈值后过期。 | Double | -1.0(永远不会被删除) | 热加载 | -| `audit_log_space_tl_in_GB` | 审计日志的 SpaceTL,审计日志总空间达到该阈值后开始轮转删除。 | Double | 1.0| 热加载| -| `audit_log_batch_interval_in_ms` | 审计日志批量写入的时间间隔 | Long | 1000 | 热加载 | -| `audit_log_batch_max_queue_bytes` | 用于批量处理审计日志的队列最大字节数。当队列大小超过此值时,后续的写入操作将被阻塞。 | Long | 268435456 | 热加载 | - - -* V2.0.9.2 - -| 参数名称 | 参数描述 | 数据类型 | 默认值 | 生效方式 | -|-----------------------------------|------------------------------------------------------------------------------------------------------| ---------- | ------------------------ | ---------- | -| `enable_audit_log` | 是否开启审计日志。 true:启用。false:禁用。 | Boolean | false | 热加载 | -| `auditable_operation_type` | 操作类型选择。 DML :所有 DML 都会记录审计日志; DDL :所有 DDL 都会记录审计日志; QUERY :所有 QUERY 都会记录审计日志; CONTROL:所有控制语句都会记录审计日志; | String | DML,DDL,QUERY,CONTROL | 热加载 | -| `auditable_dml_event_type` | 审计DML操作时的事件类型。`OBJECT_AUTHENTICATION`:对象鉴权,`SLOW_OPERATION`:慢操作 | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | 热加载 | -| `auditable_ddl_event_type` | 审计DDL操作时的事件类型。`OBJECT_AUTHENTICATION`:对象鉴权,`SLOW_OPERATION`:慢操作 | String | 
`OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | 热加载 | -| `auditable_query_event_type` | 审计查询操作时的事件类型。`OBJECT_AUTHENTICATION`:对象鉴权,`SLOW_OPERATION`:慢操作 | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | 热加载 | -| `auditable_control_event_type` | 审计控制操作时的事件类型。`CHANGE_AUDIT_OPTION`:审计选项变更,`OBJECT_AUTHENTICATION`:对象鉴权,`LOGIN`:登录,`LOGOUT`:退出登录,`DN_SHUTDOWN`:数据节点关机,`SLOW_OPERATION`:慢操作 | String | `CHANGE_AUDIT_OPTION`,`OBJECT_AUTHENTICATION`,`LOGIN`,`LOGOUT`,`DN_SHUTDOWN`,`SLOW_OPERATION` | 热加载 | -| `auditable_operation_level` | 权限级别选择。 global :记录全部的审计日志; object:仅针对数据实例的事件的审计日志会被记录; 包含关系:object < global。 例如:设置为 global 时,所有审计日志正常记录;设置为 object 时,仅记录对具体数据实例的操作。 | String | global | 热加载 | -| `auditable_operation_result` | 审计结果选择。 success:只记录成功事件的审计日志; fail:只记录失败事件的审计日志; | String | success, fail | 热加载 | -| `audit_log_ttl_in_days` | 审计日志的 TTL,生成审计日志的时间达到该阈值后过期。 | Double | -1.0(永远不会被删除) | 热加载 | -| `audit_log_space_tl_in_GB` | 审计日志的 SpaceTL,审计日志总空间达到该阈值后开始轮转删除。 | Double | 1.0| 热加载| -| `audit_log_batch_interval_in_ms` | 审计日志批量写入的时间间隔 | Long | 1000 | 热加载 | -| `audit_log_batch_max_queue_bytes` | 用于批量处理审计日志的队列最大字节数。当队列大小超过此值时,后续的写入操作将被阻塞。 | Long | 268435456 | 热加载 | - -**关于对象鉴权和慢操作的说明:** -* 当 `auditable_dml_event_type` 、`auditable_ddl_event_type`、`auditable_query_event_type`、`auditable_control_event_type` 参数值设置为 `OBJECT_AUTHENTICATION`(对象鉴权)时,则对应的事件类型会被记录审计日志。 -* 当 `auditable_dml_event_type` 、`auditable_ddl_event_type`、`auditable_query_event_type`、`auditable_control_event_type` 参数值设置为 `SLOW_OPERATION`(慢操作),则操作时间大于 `slow_query_threshold` 参数值(默认 3000 ms)的对应事件类型才会被记录审计日志。`slow_query_threshold` 参数值可通过 iotdb-system.properties 文件进行配置。 - -## 3. 
Viewing Audit Logs
-
-Audit log information can be read and retrieved directly through SQL.
-
-### 3.1 SQL Syntax
-
-```SQL
-SELECT (<audit_log_field>, )* log FROM <AUDIT_LOG_PATH> WHERE whereclause ORDER BY order_expression
-```
-
-Where:
-
-* `AUDIT_LOG_PATH`: the audit log storage location, `__audit.audit_log`;
-* `audit_log_field`: for the queryable fields, see the metadata structure in the next subsection.
-* WHERE condition filtering and ORDER BY sorting are supported.
-
-### 3.2 Metadata Structure
-
-| Field | Meaning | Type |
-| ------------------------ |------------------------------------------------------| ----------- |
-| `time` | Date and time when the event started | timestamp |
-| `username` | User name | string |
-| `cli_hostname` | User host identifier | string |
-| `audit_event_type` | Audit event type, e.g. WRITE\_DATA, GENERATE\_KEY, SLOW\_OPERATION | string |
-| `operation_type` | Operation type of the audit event: DML, DDL, QUERY, CONTROL | string |
-| `privilege_type` | Privilege used by the audit event, e.g. WRITE\_DATA, MANAGE\_USER | string |
-| `privilege_level` | Privilege level of the event: global, object | string |
-| `result` | Event result: success=1, fail=0 | boolean |
-| `database` | Database name | string |
-| `sql_string` | The user's original SQL | string |
-| `log` | Detailed event description | string |
-
-### 3.3 Usage Examples
-
-* Query the time, username, and host information of successfully executed DML operations
-
-```SQL
-IoTDB:__audit> select time,username,cli_hostname from audit_log where result = true and operation_type='DML'
-+-----------------------------+--------+------------+
-|                         time|username|cli_hostname|
-+-----------------------------+--------+------------+
-|2026-01-23T11:43:46.697+08:00|    root|   127.0.0.1|
-|2026-01-23T11:45:39.950+08:00|    root|   127.0.0.1|
-+-----------------------------+--------+------------+
-Total line number = 2
-It costs 0.284s
-```
-
-* Query the time, username, host information, operation type, and original SQL of the most recent operation
-
-```SQL
-IoTDB:__audit> select time,username,cli_hostname,operation_type,sql_string from audit_log order by time desc limit 1
-+-----------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------+
-| time|username|cli_hostname|operation_type| sql_string| 
-+-----------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------+ -|2026-01-23T11:46:31.026+08:00| root| 127.0.0.1| QUERY|select time,username,cli_hostname,operation_type,sql_string from audit_log order by time desc limit 1| -+-----------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------+ -Total line number = 1 -It costs 0.053s -``` - -* 查询所有事件结果为false的操作数据库、操作类型及日志信息 - -```SQL -IoTDB:__audit> select time,database,operation_type,log from audit_log where result=false -+-----------------------------+--------+--------------+----------------------------------------------------------------------+ -| time|database|operation_type| log| -+-----------------------------+--------+--------------+----------------------------------------------------------------------+ -|2026-01-23T11:47:42.136+08:00| | CONTROL|User user1 (ID=-1) login failed with code: 804, Authentication failed.| -+-----------------------------+--------+--------------+----------------------------------------------------------------------+ -Total line number = 1 -It costs 0.011s -``` - -* 设置 slow_query_threshold = 1 (ms),查询审计事件类型为慢操作的记录 -```SQL -IoTDB:__audit> select * from audit_log where audit_event_type='SLOW_OPERATION' limit 3 
-+-----------------------------+-------+-------+--------+------------+----------------+--------------+--------------+---------------+------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| 
time|node_id|user_id|username|cli_hostname|audit_event_type|operation_type|privilege_type|privilege_level|result| database| sql_string| log| -+-----------------------------+-------+-------+--------+------------+----------------+--------------+--------------+---------------+------+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------+ -|2026-05-06T14:57:57.468+08:00| node_1| u_0| root| 127.0.0.1| SLOW_OPERATION| QUERY| [SELECT]| OBJECT| true| | show databases| SLOW_QUERY: cost 10 ms, show databases| -|2026-05-06T14:58:38.149+08:00| node_1| u_0| root| 127.0.0.1| SLOW_OPERATION| DML| [INSERT]| OBJECT| true|database1|INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2024-11-26 13:37:00', 90.0, 35.1, true, '2024-11-26 13:37:34'), ('北京', '1001', '100', 'A', '180', '2024-11-26 13:38:00', 90.0, 35.1, true, '2024-11-26 13:38:25'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:38:00', NULL, 35.1, true, '2024-11-27 16:37:01'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:39:00', 85.0, 35.3, NULL, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:40:00', 85.0, NULL, NULL, '2024-11-27 16:37:03'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:41:00', 85.0, NULL, NULL, '2024-11-27 16:37:04'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:42:00', NULL, 35.2, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:43:00', NULL, Null, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:44:00', NULL, Null, false, '2024-11-27 16:37:08'), ('上海', '3001', '100', 'C', '90', '2024-11-28 08:00:00', 85.0, Null, NULL, '2024-11-28 08:00:09'), ('上海', '3001', '100', 'C', '90', '2024-11-28 09:00:00', NULL, 40.9, true, NULL), ('上海', '3001', '100', 'C', '90', '2024-11-28 10:00:00', 85.0, 35.2, NULL, '2024-11-28 10:00:11'), ('上海', '3001', '100', 'C', '90', '2024-11-28 11:00:00', 88.0, 45.1, true, '2024-11-28 11:00:12'), ('上海', '3001', '101', 'D', '360', '2024-11-29 10:00:00', 85.0, NULL, NULL, '2024-11-29 10:00:13'), ('上海', '3002', '100', 'E', '180', '2024-11-29 11:00:00', NULL, 45.1, true, NULL), ('上海', '3002', '100', 'E', '180', '2024-11-29 18:30:00', 90.0, 35.4, true, '2024-11-29 18:30:15'), ('上海', '3002', '101', 'F', '360', 
'2024-11-30 09:30:00', 90.0, 35.2, true, NULL), ('上海', '3002', '101', 'F', '360', '2024-11-30 14:30:00', 90.0, 34.8, true, '2024-11-30 14:30:17')|Execution: INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2024-11-26 13:37:00', 90.0, 35.1, true, '2024-11-26 13:37:34'), ('北京', '1001', '100', 'A', '180', '2024-11-26 13:38:00', 90.0, 35.1, true, '2024-11-26 13:38:25'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:38:00', NULL, 35.1, true, '2024-11-27 16:37:01'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:39:00', 85.0, 35.3, NULL, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:40:00', 85.0, NULL, NULL, '2024-11-27 16:37:03'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:41:00', 85.0, NULL, NULL, '2024-11-27 16:37:04'), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:42:00', NULL, 35.2, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:43:00', NULL, Null, false, Null), ('北京', '1001', '101', 'B', '180', '2024-11-27 16:44:00', NULL, Null, false, '2024-11-27 16:37:08'), ('上海', '3001', '100', 'C', '90', '2024-11-28 08:00:00', 85.0, Null, NULL, '2024-11-28 08:00:09'), ('上海', '3001', '100', 'C', '90', '2024-11-28 09:00:00', NULL, 40.9, true, NULL), ('上海', '3001', '100', 'C', '90', '2024-11-28 10:00:00', 85.0, 35.2, NULL, '2024-11-28 10:00:11'), ('上海', '3001', '100', 'C', '90', '2024-11-28 11:00:00', 88.0, 45.1, true, '2024-11-28 11:00:12'), ('上海', '3001', '101', 'D', '360', '2024-11-29 10:00:00', 85.0, NULL, NULL, '2024-11-29 10:00:13'), ('上海', '3002', '100', 'E', '180', '2024-11-29 11:00:00', NULL, 45.1, true, NULL), ('上海', '3002', '100', 'E', '180', '2024-11-29 18:30:00', 90.0, 35.4, true, '2024-11-29 18:30:15'), ('上海', '3002', '101', 'F', '360', '2024-11-30 09:30:00', 90.0, 35.2, true, NULL), ('上海', '3002', '101', 'F', '360', '2024-11-30 14:30:00', 90.0, 34.8, true, '2024-11-30 14:30:17') cost 329 ms, with status code: 
TSStatus(code:200, message:)| -|2026-05-06T14:58:45.534+08:00| node_1| u_0| root| 127.0.0.1| SLOW_OPERATION| QUERY| [SELECT]| OBJECT| true|database1| select * from table1| SLOW_QUERY: cost 121 ms, select * from table1| -+-----------------------------+-------+-------+--------+------------+----------------+--------------+--------------+---------------+------+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 3 -It costs 0.026s -``` \ No newline at end of file diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Authority-Management-Upgrade_timecho.md b/src/zh/UserGuide/latest-Table/User-Manual/Authority-Management-Upgrade_timecho.md deleted file mode 100644 index 94334ce70..000000000 --- a/src/zh/UserGuide/latest-Table/User-Manual/Authority-Management-Upgrade_timecho.md +++ /dev/null @@ -1,539 +0,0 @@ - - -# 权限管理 - -IoTDB 提供了权限管理功能,用于对数据和集群系统执行精细的访问控制,保障数据与系统安全。本篇介绍了 IoTDB 表模型中权限模块的基本概念、用户定义、权限管理、鉴权逻辑与功能用例。 - -## 1. 基本概念 - -### 1.1 用户 - -用户即数据库的合法使用者。一个用户与一个唯一的用户名相对应,并且拥有密码作为身份验证的手段。一个人在使用数据库之前,必须先提供合法的(即存于数据库中的)用户名与密码。 - -### 1.2 权限 - -数据库提供多种操作,但并非所有的用户都能执行所有操作。如果一个用户可以执行某项操作,则称该用户有执行该操作的权限。 - -### 1.3 角色 - -角色是若干权限的集合,并且有一个唯一的角色名作为标识符。角色通常和一个现实身份相对应(例如交通调度员),而一个现实身份可能对应着多个用户。这些具有相同现实身份的用户往往具有相同的一些权限,角色就是为了能对这样的权限进行统一的管理的抽象。 - -### 1.4 默认用户与角色 - -安装初始化后的 IoTDB 中有一个默认用户 root,默认密码为 TimechoDB@2021。该用户为管理员用户,拥有所有权限,无法被赋予、撤销权限,也无法被删除,数据库内仅有一个管理员用户。一个新创建的用户或角色不具备任何权限。 - -## 2. 用户定义 - -拥有 SECURITY 的用户可以创建用户或者角色,需要满足以下约束: - -### 2.1 用户名限制 - -* 4\~32个字符,支持使用英文大小写字母、数字、特殊字符`(!@#$%^&*()_+-=)`,用户无法创建和管理员用户同名的用户。 -* 如果用户名全是数字或包含特殊字符,则创建时需要使用双引号`""`括起来。 - -### 2.2 密码限制 - -12~32个字符,必须包含大写小写字母、至少一个数字、至少一个特殊字符(`!@#$%^&*()_+-=`)且不能与用户名相同。 - -### 2.3 角色名限制 - -4\~32个字符,支持使用英文大小写字母、数字、特殊字符(`!@#$%^&*()_+-=`)。用户无法创建和管理员用户同名的角色。 - -## 3. 权限管理 - -IoTDB 表模型主要有两类权限:全局权限、数据权限。 - -### 3.1 全局权限 - -全局权限包含 SYSTEM、SECURITY、AUDIT 三种类型: - -* SYSTEM:只具备运维操作、DDL(Data Definition Language)相关的权限。 -* SECURITY:只具备管理角色(Role)或用户(User)以及为其他账号授予权限的权限。 -* AUDIT :只具备维护审计规则、查看审计日志的权限。 - -各权限详细描述见下表: - -
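| Privilege | Former privilege name | Description |
| --------- | --------------------- | ----------- |
| SYSTEM | | Allows the user to create, alter, and drop databases. |
| | | Allows the user to create, alter, and drop tables and table views. |
| | | Allows the user to create, drop, and view user-defined functions. |
| | | Allows the user to create, start, stop, drop, and view PIPEs. Allows the user to create, drop, and view PIPEPLUGINS. |
| | | Allows the user to query and to cancel queries. Allows the user to view variables. Allows the user to view the cluster status. |
| | | Allows the user to create, drop, and query deep learning models. |
| SECURITY | MANAGE_USER | Allows the user to create and drop users, change user passwords, view users' privilege information, and list all users. |
| | MANAGE_ROLE | Allows the user to create and drop roles, view roles' privilege information, grant a role to a user or revoke it, and list all roles. |
| AUDIT | | Allows the user to maintain audit log rules and to view audit logs. |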
- -### 3.2 数据权限 - -数据权限由权限类型和范围组成。 - -* 权限类型包括:CREATE(创建权限),DROP(删除权限),ALTER(修改权限),SELECT(查询数据权限),INSERT(插入/更新数据权限),DELETE(删除数据权限)。 -* 范围包括:ANY(系统范围内),DATABASE(数据库范围内),TABLE(单个表)。 - * 作用于 ANY 的权限会影响所有数据库和表。 - * 作用于数据库的权限会影响该数据库及其所有表。 - * 作用于表的权限仅影响该表。 -* 范围生效说明:执行单表操作时,数据库会匹配用户权限与数据权限范围。例如,用户尝试向 DATABASE1.TABLE1 写入数据时,系统会依次检查用户是否有对 ANY、DATABASE1或 DATABASE1.TABLE1 的写入权限,直到匹配成功或者匹配失败。 -* 权限类型、范围及效果逻辑关系如下表所示: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
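| Privilege type | Scope (level) | Effect |
| -------------- | ------------- | ------ |
| CREATE | ANY | Allows creating any table and any database. |
| | DATABASE | Allows the user to create tables in that database and to create a database with that name. |
| | TABLE | Allows the user to create a table with that name. |
| DROP | ANY | Allows dropping any table and any database. |
| | DATABASE | Allows the user to drop that database and the tables in it. |
| | TABLE | Allows the user to drop that table. |
| ALTER | ANY | Allows altering the definition of any table and any database. |
| | DATABASE | Allows the user to alter the definition of that database and of the tables in it. |
| | TABLE | Allows the user to alter the definition of that table. |
| SELECT | ANY | Allows querying data from any table in any database in the system. |
| | DATABASE | Allows the user to query data from any table in that database. |
| | TABLE | Allows the user to query data in that table. In a multi-table query, the database returns only the data the user is permitted to access. |
| INSERT | ANY | Allows inserting/updating data in any table of any database. |
| | DATABASE | Allows the user to insert/update data in any table within that database. |
| | TABLE | Allows the user to insert/update data in that table. |
| DELETE | ANY | Allows deleting data from any table. |
| | DATABASE | Allows the user to delete data within that database. |
| | TABLE | Allows the user to delete data from that table. |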
- -### 3.3 权限授予与取消 - -IoTDB支持通过如下三种途径进行用户授权和撤销权限: - -* 超级管理员直接授予或撤销 -* 拥有GRANT OPTION权限的用户授予或撤销 -* 通过角色授予或撤销(由超级管理员或具备 SECURITY 权限的用户操作角色) - -在IoTDB 表模型中,授权或撤销权限时需遵循以下原则: - -* 授权/撤销全局权限时,无需指定权限的范围。 -* 授予/撤销数据权限时,需要指定权限类型和权限范围。在撤销权限时只会撤销指定的权限范围,不会受权限范围包含关系的影响。 -* 允许对尚未创建的数据库或表提前进行权限规划和授权。 -* 允许重复授权/撤销权限。 -* WITH GRANT OPTION: 允许用户在授权范围内管理权限。用户可以授予或撤销其他用户在该范围内的权限。 - -## 4. 功能语法与示例 - -### 4.1 用户与角色相关 - -1. 创建用户(需 SECURITY 权限) - -```SQL -CREATE USER -eg: CREATE USER user1 'Passwd@202604'; -``` - -2. 修改密码 - -用户可以修改自己的密码,但修改其他用户密码需要具备 SECURITY 权限。 - -```SQL -ALTER USER SET PASSWORD -eg: ALTER USER tempuser SET PASSWORD 'Newpwd@202604'; -``` - -3. 删除用户(需 SECURITY 权限) - -```SQL -DROP USER -eg: DROP USER user1; -``` - -4. 创建角色 (需 SECURITY 权限) - -```SQL -CREATE ROLE -eg: CREATE ROLE role1; -``` - -5. 删除角色 (需 SECURITY 权限) - -```SQL -DROP ROLE -eg: DROP ROLE role1; -``` - -6. 赋予用户角色 (需 SECURITY 权限) - -```SQL -GRANT ROLE TO -eg: GRANT ROLE admin TO user1; -``` - -7. 移除用户角色 (需 SECURITY 权限) - -```SQL -REVOKE ROLE FROM -eg: REVOKE ROLE admin FROM user1; -``` - -8. 列出所有用户(需 SECURITY 权限) - -```SQL -LIST USER; -``` - -9. 列出所有的角色 (需 SECURITY 权限) - -```SQL -LIST ROLE; -``` - -10. 列出指定角色下所有用户(需 SECURITY 权限) - -```SQL -LIST USER OF ROLE -eg: LIST USER OF ROLE roleuser; -``` - -11. 列出指定用户下的所有角色 - -用户可以列出自己的角色,但列出其他用户的角色需要拥有 SECURITY 权限。 - -```SQL -LIST ROLE OF USER -eg: LIST ROLE OF USER tempuser; -``` - -12. 列出用户所有权限 - -用户可以列出自己的权限信息,但列出其他用户的权限需要拥有 SECURITY 权限。 - -```SQL -LIST PRIVILEGES OF USER -eg: LIST PRIVILEGES OF USER tempuser; -``` - -13. 列出角色所有权限 - -用户可以列出自己具有的角色的权限信息,列出其他角色的权限需要有 SECURITY 权限。 - -```SQL -LIST PRIVILEGES OF ROLE -eg: LIST PRIVILEGES OF ROLE actor; -``` - -### 4.2 授权与取消授权 - -#### 4.2.1 授予权限 - -1. 给用户授予管理用户的权限 - -```SQL -GRANT SECURITY TO USER -eg: GRANT SECURITY TO USER TEST_USER; -``` - -2. 
给用户授予创建数据库及在数据库范围内创建表的权限,且允许用户在该范围内管理权限 - -```SQL -GRANT CREATE ON DATABASE TO USER WITH GRANT OPTION -eg: GRANT CREATE ON DATABASE TESTDB TO USER TEST_USER WITH GRANT OPTION; -``` - -3. 给角色授予查询数据库的权限 - -```SQL -GRANT SELECT ON DATABASE TO ROLE -eg: GRANT SELECT ON DATABASE TESTDB TO ROLE TEST_ROLE; -``` - -4. 给用户授予查询表的权限 - -```SQL -GRANT SELECT ON . TO USER -eg: GRANT SELECT ON TESTDB.TESTTABLE TO USER TEST_USER; -``` - -5. 给角色授予查询所有数据库及表的权限 - -```SQL -GRANT SELECT ON ANY TO ROLE -eg: GRANT SELECT ON ANY TO ROLE TEST_ROLE; -``` - -6. ALL 语法糖:ALL 表示对象范围内所有权限,可以使用 ALL 字段灵活地授予权限。 - -```SQL -GRANT ALL TO USER TESTUSER; --- 将用户可以获取的所有权限授予给用户,包括全局权限和 ANY 范围的所有数据权限 - -GRANT ALL ON ANY TO USER TESTUSER; --- 将 ANY 范围内可以获取的所有权限授予给用户,执行该语句后,用户将拥有在所有数据库上的所有数据权限 - -GRANT ALL ON DATABASE TESTDB TO USER TESTUSER; --- 将 DB 范围内可以获取的所有权限授予给用户,执行该语句后,用户将拥有在该数据库上的所有数据权限 - -GRANT ALL ON TABLE TESTTABLE TO USER TESTUSER; --- 将 TABLE 范围内可以获取的所有权限授予给用户,执行该语句后,用户将拥有在该表上的所有数据权限 -``` - -#### 4.2.2 撤销权限 - -1. 取消用户管理用户的权限 - -```SQL -REVOKE SECURITY FROM USER -eg: REVOKE SECURITY FROM USER TEST_USER; -``` - -2. 取消用户创建数据库及在数据库范围内创建表的权限 - -```SQL -REVOKE CREATE ON DATABASE FROM USER -eg: REVOKE CREATE ON DATABASE TEST_DB FROM USER TEST_USER; -``` - -3. 取消用户查询表的权限 - -```SQL -REVOKE SELECT ON . FROM USER -eg: REVOKE SELECT ON TESTDB.TESTTABLE FROM USER TEST_USER; -``` - -4. 取消用户查询所有数据库及表的权限 - -```SQL -REVOKE SELECT ON ANY FROM USER -eg: REVOKE SELECT ON ANY FROM USER TEST_USER; -``` - -5. 
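ALL syntactic sugar: ALL stands for every privilege within the target scope, so the ALL keyword gives a flexible way to revoke privileges.
-
-```SQL
-REVOKE ALL FROM USER TESTUSER;
--- Revokes all of the user's global privileges and all data privileges at the ANY scope
-
-REVOKE ALL ON ANY FROM USER TESTUSER;
--- Revokes all of the user's data privileges at the ANY scope; DB-scope and TABLE-scope privileges are not affected
-
-REVOKE ALL ON DATABASE TESTDB FROM USER TESTUSER;
--- Revokes all of the user's data privileges on the DB; TABLE privileges are not affected
-
-REVOKE ALL ON TABLE TESTTABLE FROM USER TESTUSER;
--- Revokes all of the user's data privileges on the TABLE
-```
-
-### 4.3 Viewing User Privileges
-
-Every user has a privilege access list that records all privileges granted to it. Use the `LIST PRIVILEGES OF USER ` statement to view the privilege information of a user or role. The output has the following format:
-
-| ROLE    | SCOPE   | PRIVILEGE  | WITH GRANT OPTION   |
-| ------- |---------|------------| ------------------- |
-|         | DB1.TB1 | SELECT     | FALSE               |
-|         |         | SECURITY   | TRUE                |
-| ROLE1   | DB2.TB2 | UPDATE     | TRUE                |
-| ROLE1   | DB3.\*  | DELETE     | FALSE               |
-| ROLE1   | \*.\*   | UPDATE     | TRUE                |
-
-Where:
-
-* `ROLE` column: if empty, the privilege belongs to the user directly; otherwise, the privilege comes from a role granted to the user.
-* `SCOPE` column: the scope of the user's or role's privilege. A table-scope privilege is shown as `DB.TABLE`, a database-scope privilege as `DB.*`, and an ANY-scope privilege as `*.*`.
-* `PRIVILEGE` column: the specific privilege type.
-* `WITH GRANT OPTION` column: if TRUE, the user can grant this privilege to others.
-* A user or role can hold privileges in both the tree model and the table model, but the system shows only the privileges for the model of the current connection; privileges under the other model are not displayed.
-
-## 5. 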
场景示例 - -以 [示例数据](../Reference/Sample-Data.md) 内容为例,两个表的数据可能分别属于 bj、sh 两个数据中心,彼此间不希望对方获取自己的数据库数据,因此我们需要将不同的数据在数据中心层进行权限隔离。 - -### 5.1 创建用户 - -使用 `CREATE USER ` 创建用户。例如,可以使用具有所有权限的root用户为 ln 和 sgcc 集团创建两个用户角色,名为 `bj_write_user`, `sh_write_user`,密码均为 write_Pwd@2026。SQL 语句为: - -```SQL -CREATE USER bj_write_user 'write_Pwd@2026'; -CREATE USER sh_write_user 'write_Pwd@2026'; -``` - -使用展示用户的 SQL 语句: - -```SQL -LIST USER; -``` - -可以看到这两个已经被创建的用户,结果如下: - -```SQL -+------+-------------+-----------------+-----------------+ -|UserId| User|MaxSessionPerUser|MinSessionPerUser| -+------+-------------+-----------------+-----------------+ -| 0| root| -1| 1| -| 10000|bj_write_user| -1| -1| -| 10001|sh_write_user| -1| -1| -+------+-------------+-----------------+-----------------+ -``` - -### 5.2 赋予用户权限 - -虽然两个用户已经创建,但是不具有任何权限,因此并不能对数据库进行操作,例如使用 `bj_write_user` 用户对 table1 中的数据进行写入,SQL 语句为: - -```SQL -IoTDB> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -``` - -系统不允许用户进行此操作,会提示错误: - -```SQL -IoTDB> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: database is not specified -IoTDB> use database1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: DATABASE database1 -``` - -root 用户使用 `GRANT ON TO USER ` 语句赋予用户`bj_write_user`对 table1 的写入权限,例如: - -```SQL -GRANT INSERT ON database1.table1 TO USER bj_write_user -``` - -使用`bj_write_user`再尝试写入数据 - -```SQL -IoTDB> use database1 -Msg: The statement is executed successfully. 
-IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: The statement is executed successfully. -``` - -### 5.3 撤销用户权限 - -授予用户权限后,可以使用 `REVOKE ON FROM USER `来撤销已经授予用户的权限。例如,用root用户撤销`bj_write_user`和`sh_write_user`的权限: - -```SQL -REVOKE INSERT ON database1.table1 FROM USER bj_write_user -REVOKE INSERT ON database1.table2 FROM USER sh_write_user -``` - -撤销权限后,`bj_write_user`就没有向table1写入数据的权限了。 - -```SQL -IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: No permissions for this operation, please add privilege INSERT ON database1.table1 -``` diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Authority-Management_timecho.md b/src/zh/UserGuide/latest-Table/User-Manual/Authority-Management_timecho.md deleted file mode 100644 index 13052b257..000000000 --- a/src/zh/UserGuide/latest-Table/User-Manual/Authority-Management_timecho.md +++ /dev/null @@ -1,484 +0,0 @@ - - -# 权限管理 - -IoTDB 提供了权限管理功能,用于对数据和集群系统执行精细的访问控制,保障数据与系统安全。本篇介绍了 IoTDB 表模型中权限模块的基本概念、用户定义、权限管理、鉴权逻辑与功能用例。 - -## 1. 基本概念 - -### 1.1 用户 - -用户即数据库的合法使用者。一个用户与一个唯一的用户名相对应,并且拥有密码作为身份验证的手段。一个人在使用数据库之前,必须先提供合法的(即存于数据库中的)用户名与密码。 - -### 1.2 权限 - -数据库提供多种操作,但并非所有的用户都能执行所有操作。如果一个用户可以执行某项操作,则称该用户有执行该操作的权限。 - -### 1.3 角色 - -角色是若干权限的集合,并且有一个唯一的角色名作为标识符。角色通常和一个现实身份相对应(例如交通调度员),而一个现实身份可能对应着多个用户。这些具有相同现实身份的用户往往具有相同的一些权限,角色就是为了能对这样的权限进行统一的管理的抽象。 - -### 1.4 默认用户与角色 - -安装初始化后的 IoTDB 中有一个默认用户 root,默认密码为 TimechoDB@2021(V2.0.6.x 之前为 root)。该用户为管理员用户,拥有所有权限,无法被赋予、撤销权限,也无法被删除,数据库内仅有一个管理员用户。一个新创建的用户或角色不具备任何权限。 - - -## 2. 
权限列表 - -IoTDB 表模型主要有两类权限:全局权限、数据权限。 - -### 2.1 全局权限 - -全局权限包含用户管理和角色管理。 - -下表描述了全局权限的种类: - -| 权限名称 | 描述 | -| ----------------- |----------------------------------------------------------------------------------------| -| MANAGE\_USER | - 允许用户创建用户
- 允许用户删除用户
- 允许用户修改用户密码
- 允许用户查看用户的权限信息
- 允许用户列出所有用户 | -| MANAGE\_ROLE | - 允许用户创建角色
- 允许用户删除角色
- 允许用户查看角色的权限信息
- 允许用户将角色授予某个用户或撤销
- 允许用户列出所有角色 | - - -### 2.2 数据权限 - -数据权限由权限类型和范围组成。 - -* 权限类型包括:CREATE(创建权限),DROP(删除权限),ALTER(修改权限),SELECT(查询数据权限),INSERT(插入/更新数据权限),DELETE(删除数据权限)。 - -* 范围包括:ANY(系统范围内),DATABASE(数据库范围内),TABLE(单个表)。 - - 作用于 ANY 的权限会影响所有数据库和表。 - - 作用于数据库的权限会影响该数据库及其所有表。 - - 作用于表的权限仅影响该表。 - -* 范围生效说明:执行单表操作时,数据库会匹配用户权限与数据权限范围。例如,用户尝试向 DATABASE1.TABLE1 写入数据时,系统会依次检查用户是否有对 ANY、DATABASE1或 DATABASE1.TABLE1 的写入权限,直到匹配成功或者匹配失败。 - -* 权限类型、范围及效果逻辑关系如下表所示: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
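| Privilege type | Scope (level) | Effect |
| -------------- | ------------- | ------ |
| CREATE | ANY | Allows creating any table and any database. |
| | DATABASE | Allows the user to create tables in that database and to create a database with that name. |
| | TABLE | Allows the user to create a table with that name. |
| DROP | ANY | Allows dropping any table and any database. |
| | DATABASE | Allows the user to drop that database and the tables in it. |
| | TABLE | Allows the user to drop that table. |
| ALTER | ANY | Allows altering the definition of any table and any database. |
| | DATABASE | Allows the user to alter the definition of that database and of the tables in it. |
| | TABLE | Allows the user to alter the definition of that table. |
| SELECT | ANY | Allows querying data from any table in any database in the system. |
| | DATABASE | Allows the user to query data from any table in that database. |
| | TABLE | Allows the user to query data in that table. In a multi-table query, the database returns only the data the user is permitted to access. |
| INSERT | ANY | Allows inserting/updating data in any table of any database. |
| | DATABASE | Allows the user to insert/update data in any table within that database. |
| | TABLE | Allows the user to insert/update data in that table. |
| DELETE | ANY | Allows deleting data from any table. |
| | DATABASE | Allows the user to delete data within that database. |
| | TABLE | Allows the user to delete data from that table. |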
- -## 3. 用户、角色管理 -1. 创建用户(需 MANAGE_USER 权限) - -```SQL -CREATE USER -eg: CREATE USER user1 'passwd' -``` - -- 用户名约束:4~32个字符,支持使用英文大小写字母、数字、特殊字符`(!@#$%^&*()_+-=)`,用户无法创建和管理员用户同名的用户。 - - 如果用户名全是数字或包含特殊字符,则创建时需要使用双引号`""`括起来。 -- 密码约束:4~32个字符,可使用大写小写字母、数字、特殊字符`(!@#$%^&*()_+-=)`,密码默认采用 SHA-256 进行加密。 - -2. 修改密码 - -用户可以修改自己的密码,但修改其他用户密码需要具备 MANAGE_USER 权限。 - -```SQL -ALTER USER SET PASSWORD -eg: ALTER USER tempuser SET PASSWORD 'newpwd' -``` - -3. 删除用户(需 MANAGE_USER 权限) - -```SQL -DROP USER -eg: DROP USER user1 -``` - -4. 创建角色 (需 MANAGE_ROLE 权限) - -```SQL -CREATE ROLE -eg: CREATE ROLE role1 -``` - -角色名约束:4~32个字符,支持使用英文大小写字母、数字、特殊字符`(!@#$%^&*()_+-=)`,用户无法创建和管理员用户同名的角色。 - -5. 删除角色 (需 MANAGE_ROLE 权限) - -```SQL -DROP ROLE -eg: DROP ROLE role1 -``` - -6. 赋予用户角色 (需 MANAGE_ROLE 权限) - -```SQL -GRANT ROLE TO -eg: GRANT ROLE admin TO user1 -``` - -7. 移除用户角色 (需 MANAGE_ROLE 权限) - -```SQL -REVOKE ROLE FROM -eg: REVOKE ROLE admin FROM user1 -``` - -8. 列出所有用户(需 MANAGE_USER 权限) - -```SQL -LIST USER -``` - -9. 列出所有的角色 (需 MANAGE_ROLE 权限) - -```SQL -LIST ROLE -``` - -10. 列出指定角色下所有用户(需 MANAGE_USER 权限) - -```SQL -LIST USER OF ROLE -eg: LIST USER OF ROLE roleuser -``` - -11. 列出指定用户下的所有角色 - -用户可以列出自己的角色,但列出其他用户的角色需要拥有 MANAGE_ROLE 权限。 - -```SQL -LIST ROLE OF USER -eg: LIST ROLE OF USER tempuser -``` - -12. 列出用户所有权限 - -用户可以列出自己的权限信息,但列出其他用户的权限需要拥有 MANAGE_USER 权限。 - -```SQL -LIST PRIVILEGES OF USER -eg: LIST PRIVILEGES OF USER tempuser -``` - -13. 列出角色所有权限 - -用户可以列出自己具有的角色的权限信息,列出其他角色的权限需要有 MANAGE_ROLE 权限。 - -```SQL -LIST PRIVILEGES OF ROLE -eg: LIST PRIVILEGES OF ROLE actor -``` - -## 4. 权限管理 - -IoTDB支持通过如下三种途径进行用户授权和撤销权限: - -- 超级管理员直接授予或撤销 - -- 拥有GRANT OPTION权限的用户授予或撤销 - -- 通过角色授予或撤销(由超级管理员或具备MANAGE_ROLE权限的用户操作角色) - -在IoTDB 表模型中,授权或撤销权限时需遵循以下原则: - -- 授权/撤销全局权限时,无需指定权限的范围。 - -- 授予/撤销数据权限时,需要指定权限类型和权限范围。在撤销权限时只会撤销指定的权限范围,不会受权限范围包含关系的影响。 - -- 允许对尚未创建的数据库或表提前进行权限规划和授权。 - -- 允许重复授权/撤销权限。 - -- WITH GRANT OPTION: 允许用户在授权范围内管理权限。用户可以授予或撤销其他用户在该范围内的权限。 - -### 4.1 授予权限 - -1. 
给用户授予管理用户的权限 - -```SQL -GRANT MANAGE_USER TO USER -eg: GRANT MANAGE_USER TO USER TEST_USER -``` - -2. 给用户授予创建数据库及在数据库范围内创建表的权限,且允许用户在该范围内管理权限 - -```SQL -GRANT CREATE ON DATABASE TO USER WITH GRANT OPTION -eg: GRANT CREATE ON DATABASE TESTDB TO USER TEST_USER WITH GRANT OPTION -``` - -3. 给角色授予查询数据库的权限 - -```SQL -GRANT SELECT ON DATABASE TO ROLE -eg: GRANT SELECT ON DATABASE TESTDB TO ROLE TEST_ROLE -``` - -4. 给用户授予查询表的权限 - -```SQL -GRANT SELECT ON . TO USER -eg: GRANT SELECT ON TESTDB.TESTTABLE TO USER TEST_USER -``` - -5. 给角色授予查询所有数据库及表的权限 - -```SQL -GRANT SELECT ON ANY TO ROLE -eg: GRANT SELECT ON ANY TO ROLE TEST_ROLE -``` - -6. ALL 语法糖:ALL 表示对象范围内所有权限,可以使用 ALL 字段灵活地授予权限。 - -```sql -GRANT ALL TO USER TESTUSER --- 将用户可以获取的所有权限授予给用户,包括全局权限和 ANY 范围的所有数据权限 - -GRANT ALL ON ANY TO USER TESTUSER --- 将 ANY 范围内可以获取的所有权限授予给用户,执行该语句后,用户将拥有在所有数据库上的所有数据权限 - -GRANT ALL ON DATABASE TESTDB TO USER TESTUSER --- 将 DB 范围内可以获取的所有权限授予给用户,执行该语句后,用户将拥有在该数据库上的所有数据权限 - -GRANT ALL ON TABLE TESTTABLE TO USER TESTUSER --- 将 TABLE 范围内可以获取的所有权限授予给用户,执行该语句后,用户将拥有在该表上的所有数据权限 -``` - -### 4.2 撤销权限 - -1. 取消用户管理用户的权限 - -```SQL -REVOKE MANAGE_USER FROM USER -eg: REVOKE MANAGE_USER FROM USER TEST_USER -``` - -2. 取消用户创建数据库及在数据库范围内创建表的权限 - -```SQL -REVOKE CREATE ON DATABASE FROM USER -eg: REVOKE CREATE ON DATABASE TEST_DB FROM USER TEST_USER -``` - -3. 取消用户查询表的权限 - -```SQL -REVOKE SELECT ON . FROM USER -eg: REVOKE SELECT ON TESTDB.TESTTABLE FROM USER TEST_USER -``` - -4. 取消用户查询所有数据库及表的权限 - -```SQL -REVOKE SELECT ON ANY FROM USER -eg: REVOKE SELECT ON ANY FROM USER TEST_USER -``` - -5. 
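ALL syntactic sugar: ALL stands for every privilege within the target scope, so the ALL keyword gives a flexible way to revoke privileges.
-
-```sql
-REVOKE ALL FROM USER TESTUSER
--- Revokes all of the user's global privileges and all data privileges at the ANY scope
-
-REVOKE ALL ON ANY FROM USER TESTUSER
--- Revokes all of the user's data privileges at the ANY scope; DB-scope and TABLE-scope privileges are not affected
-
-REVOKE ALL ON DATABASE TESTDB FROM USER TESTUSER
--- Revokes all of the user's data privileges on the DB; TABLE privileges are not affected
-
-REVOKE ALL ON TABLE TESTTABLE FROM USER TESTUSER
--- Revokes all of the user's data privileges on the TABLE
-```
-
-### 4.3 Viewing User Privileges
-
-Every user has a privilege access list that records all privileges granted to it. Use the `LIST PRIVILEGES OF USER ` statement to view the privilege information of a user or role. The output has the following format:
-
-| ROLE         | SCOPE   | PRIVILEGE      | WITH GRANT OPTION |
-|--------------|---------| -------------- |-------------------|
-|              | DB1.TB1 | SELECT         | FALSE             |
-|              |         | MANAGE\_ROLE   | TRUE              |
-| ROLE1        | DB2.TB2 | UPDATE         | TRUE              |
-| ROLE1        | DB3.\*  | DELETE         | FALSE             |
-| ROLE1        | \*.\*   | UPDATE         | TRUE              |
-
-Where:
-- `ROLE` column: if empty, the privilege belongs to the user directly; otherwise, the privilege comes from a role granted to the user.
-- `SCOPE` column: the scope of the user's or role's privilege. A table-scope privilege is shown as `DB.TABLE`, a database-scope privilege as `DB.*`, and an ANY-scope privilege as `*.*`.
-- `PRIVILEGE` column: the specific privilege type.
-- `WITH GRANT OPTION` column: if TRUE, the user can grant this privilege to others.
-- A user or role can hold privileges in both the tree model and the table model, but the system shows only the privileges for the model of the current connection; privileges under the other model are not displayed.
-
-## 5. 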
示例 - -以 [示例数据](../Reference/Sample-Data.md) 内容为例,两个表的数据可能分别属于 bj、sh 两个数据中心,彼此间不希望对方获取自己的数据库数据,因此我们需要将不同的数据在数据中心层进行权限隔离。 - -### 5.1 创建用户 - -使用 `CREATE USER ` 创建用户。例如,可以使用具有所有权限的root用户为 ln 和 sgcc 集团创建两个用户角色,名为 `bj_write_user`, `sh_write_user`,密码均为 `write_pwd`。SQL 语句为: - -```SQL -CREATE USER bj_write_user 'write_pwd' -CREATE USER sh_write_user 'write_pwd' -``` - -使用展示用户的 SQL 语句: - -```Plain -LIST USER -``` - -可以看到这两个已经被创建的用户,结果如下: - -```sql -+-------------+ -| User| -+-------------+ -|bj_write_user| -| root| -|sh_write_user| -+-------------+ -``` - -### 5.2 赋予用户权限 - -虽然两个用户已经创建,但是不具有任何权限,因此并不能对数据库进行操作,例如使用 `bj_write_user` 用户对 table1 中的数据进行写入,SQL 语句为: - -```sql -IoTDB> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -``` - -系统不允许用户进行此操作,会提示错误: - -```sql -IoTDB> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: database is not specified -IoTDB> use database1 -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: DATABASE database1 -``` - -root 用户使用 `GRANT ON TO USER ` 语句赋予用户`bj_write_user`对 table1 的写入权限,例如: - -```sql -GRANT INSERT ON database1.table1 TO USER bj_write_user -``` - -使用`bj_write_user`再尝试写入数据 - -```SQL -IoTDB> use database1 -Msg: The statement is executed successfully. -IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: The statement is executed successfully. 
-``` - -### 5.3 撤销用户权限 - -授予用户权限后,可以使用 `REVOKE ON FROM USER `来撤销已经授予用户的权限。例如,用root用户撤销`bj_write_user`和`sh_write_user`的权限: - -```sql -REVOKE INSERT ON database1.table1 FROM USER bj_write_user -REVOKE INSERT ON database1.table2 FROM USER sh_write_user -``` - -撤销权限后,`bj_write_user`就没有向table1写入数据的权限了。 - -```sql -IoTDB:database1> INSERT INTO table1(region, plant_id, device_id, model_id, maintenance, time, temperature, humidity, status, arrival_time) VALUES ('北京', '1001', '100', 'A', '180', '2025-03-26 13:37:00', 190.0, 30.1, false, '2025-03-26 13:37:34') -Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 803: Access Denied: No permissions for this operation, please add privilege INSERT ON database1.table1 -``` diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Auto-Start-On-Boot_timecho.md b/src/zh/UserGuide/latest-Table/User-Manual/Auto-Start-On-Boot_timecho.md deleted file mode 100644 index 06a0ddba6..000000000 --- a/src/zh/UserGuide/latest-Table/User-Manual/Auto-Start-On-Boot_timecho.md +++ /dev/null @@ -1,243 +0,0 @@ - - -# 开机自启 - -## 1.概述 - -TimechoDB 支持通过 `daemon-confignode.sh`、`daemon-datanode.sh`、`daemon-ainode.sh` 三个脚本,将ConfigNode、DataNode、AINode 注册为 Linux 系统服务,结合系统自带的 `systemctl `命令,以守护进程方式管理 TimechoDB 集群,实现更便捷的启动、停止、重启及开机自启等操作,提升服务稳定性。 - -> 注意:该功能从 V 2.0.9.1 版本开始提供。 - -## 2. 环境要求 - -| 操作系统 | Linux(支持`systemctl`命令) | -| ---------- |:-----------------------------------------------------:| -| 用户权限 | root 用户 | -| 环境变量 | 部署 ConfigNode 和 DataNode 前需设置`JAVA_HOME` | - -## 3. 服务注册 - -进入 TimechoDB 安装目录,执行对应的守护进程脚本: - -```Bash -# 注册 ConfigNode 服务 -./tools/ops/daemon-confignode.sh - -# 注册 DataNode 服务 -./tools/ops/daemon-datanode.sh - -# 注册 AINode 服务 -./tools/ops/daemon-ainode.sh -``` - -执行脚本时将提示以下两个选择项: - -1. 是否本次直接启动对应 TimechoDB 服务(timechodb-confignode/timechodb-datanode/timechodb-ainode); -2. 是否将对应服务注册为开机自启服务。 - -脚本执行完成后,将在 `/etc/systemd/system/` 目录生成对应的服务文件: - -* `timechodb-confignode.service` -* `timechodb-datanode.service` -* `timechodb-ainode.service` - -## 4. 
服务管理 - -服务注册完成后,可通过 systemctl 命令对 TimechoDB 各节点服务进行启动、停止、重启、查看状态及配置开机自启等操作,以下命令均需使用 root 用户执行。 - -### 4.1 手动启动服务 - -```bash -# 启动 ConfigNode 服务 -systemctl start timechodb-confignode -# 启动 DataNode 服务 -systemctl start timechodb-datanode -# 启动 AINode 服务 -systemctl start timechodb-ainode -``` - -### 4.2 手动停止服务 - -```bash -# 停止 ConfigNode 服务 -systemctl stop timechodb-confignode -# 停止 DataNode 服务 -systemctl stop timechodb-datanode -# 停止 AINode 服务 -systemctl stop timechodb-ainode -``` - -停止服务后,通过查看服务状态,若显示为 inactive(dead),则说明服务关闭成功;若为其他状态,需查看 TimechoDB 日志,分析异常原因。 - -### 4.3 查看服务状态 - -```bash -# 查看 ConfigNode 服务状态 -systemctl status timechodb-confignode -# 查看 DataNode 服务状态 -systemctl status timechodb-datanode -# 查看 AINode 服务状态 -systemctl status timechodb-ainode -``` - -状态说明: - -* active(running):服务正在运行,若该状态持续 10 分钟,说明服务启动成功; -* failed:服务启动失败,需查看 TimechoDB 日志排查问题。 - -### 4.4 重启服务 - -重启服务相当于先执行停止操作,再执行启动操作,命令如下: - -```bash -# 重启 ConfigNode 服务 -systemctl restart timechodb-confignode -# 重启 DataNode 服务 -systemctl restart timechodb-datanode -# 重启 AINode 服务 -systemctl restart timechodb-ainode -``` - -### 4.5 配置开机自启 - -```bash -# 配置 ConfigNode 开机自启 -systemctl enable timechodb-confignode -# 配置 DataNode 开机自启 -systemctl enable timechodb-datanode -# 配置 AINode 开机自启 -systemctl enable timechodb-ainode -``` - -### 4.6 取消开机自启 - -```bash -# 取消 ConfigNode 开机自启 -systemctl disable timechodb-confignode -# 取消 DataNode 开机自启 -systemctl disable timechodb-datanode -# 取消 AINode 开机自启 -systemctl disable timechodb-ainode -``` - -## 5. 自定义服务配置 - -### 5.1 自定义方式 - -#### 5.1.1 方案一:修改脚本 - -1. 修改 `daemon-xxx.sh` 中的[Unit]、[Service]、[Install]区域配置项,具体配置项的含义参考下一小节 -2. 执行 `daemon-xxx.sh` 脚本 - -#### 5.1.2 方案二:修改服务文件 - -1. 修改 `/etc/systemd/system` 中的 `xx.service` 文件 -2. 
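Run `systemctl daemon-reload`
-
-### 5.2 `daemon-xxx.sh` Configuration Items
-
-#### 5.2.1 [Unit] Section (Service Metadata)
-
-| Item          | Description                                                      |
-| --------------- | ---------------------------------- |
-| Description   | Service description                                              |
-| Documentation | Points to the official TimechoDB documentation                   |
-| After         | Ensures the service starts only after the network service is up  |
-
-#### 5.2.2 [Service] Section (Service Runtime Configuration)
-
-| Item                                       | Meaning                                                                |
-| -------------------------------------------- | ---------------------------------------------------------------------- |
-| StandardOutput, StandardError              | Where the service's standard output and error logs are stored         |
-| LimitNOFILE=65536                          | Sets the file descriptor limit; the default is 65536                  |
-| Type=simple                                | The service is a simple foreground process; systemd tracks its main process |
-| User=root, Group=root                      | Runs the service with the permissions of the root user and root group |
-| ExecStart/ExecStop                         | Paths of the service's start script and stop script                   |
-| Restart=on-failure                         | Restarts the service automatically only when it exits abnormally      |
-| SuccessExitStatus=143                      | Treats exit code 143 (128+15, i.e. normal termination by SIGTERM) as a successful exit |
-| RestartSec=5                               | Interval before the service is restarted; the default is 5 seconds    |
-| StartLimitInterval=600s, StartLimitBurst=3 | Within 10 minutes (600 seconds) the service is restarted at most 3 times, which keeps frequent restarts from wasting system resources |
-| RestartPreventExitStatus=SIGKILL           | Does not restart the service after it is killed by SIGKILL, which avoids endlessly restarting a zombie process |
-
-#### 5.2.3 [Install] Section (Installation Configuration)
-
-| Item                       | Meaning                                                              |
-| ---------------------------- | -------------------------------------------- |
-| WantedBy=multi-user.target | Starts the service automatically when the system enters multi-user mode. |
-
-### 5.3 Example .service File
-
-```bash
-[Unit]
-Description=timechodb-confignode
-Documentation=https://www.timecho.com/
-After=network.target
-
-[Service]
-StandardOutput=null
-StandardError=null
-LimitNOFILE=65536
-Type=simple
-User=root
-Group=root
-Environment=JAVA_HOME=$JAVA_HOME
-ExecStart=$TimechoDB_SBIN_HOME/start-confignode.sh
-Restart=on-failure
-SuccessExitStatus=143
-RestartSec=5
-StartLimitInterval=600s
-StartLimitBurst=3
-RestartPreventExitStatus=SIGKILL
-
-[Install]
-WantedBy=multi-user.target
-```
-
-Note: The above is the standard format of the timechodb-confignode.service file; the timechodb-datanode.service and timechodb-ainode.service files follow a similar format.
-
-## 6. Notes
-
-1. **Process supervision**
-
-* **Automatic restart**: if the service fails to start or exits abnormally while running (for example, because of OOM), the system restarts it automatically.
-* **No restart**: a normal exit (for example, via `kill`, `./sbin/stop-xxx.sh`, or `systemctl stop`) does not trigger an automatic restart.
-
-2. 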
**日志位置** - -* 所有运行日志均存储在 TimechoDB 安装目录下的 `logs` 文件夹中,排查问题时请查阅该目录。 - -3. **集群状态查看** - -* 服务启动后,执行 `./sbin/start-cli.sh` 并输入 `show cluster` 命令,即可查看集群状态。 - -4. **故障恢复流程** - -* 若服务状态为 `failed`,修复问题后**必须**先执行 `systemctl daemon-reload`,然后再执行 `systemctl start`,否则启动将失败。 - -5. **配置生效** - -* 修改 `daemon-xxx.sh` 脚本内容后,需执行 `systemctl daemon-reload` 重新注册服务,新配置方可生效。 - -6. **启动方式兼容** - -* `systemctl start`启动的服务,可用`./sbin/stop` 停止(不重启)。 -* `./sbin/start` 启动的进程,无法通过 `systemctl` 监控状态。 diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Black-White-List_timecho.md b/src/zh/UserGuide/latest-Table/User-Manual/Black-White-List_timecho.md deleted file mode 100644 index 483be04cd..000000000 --- a/src/zh/UserGuide/latest-Table/User-Manual/Black-White-List_timecho.md +++ /dev/null @@ -1,78 +0,0 @@ - - -# 黑白名单 - -## 1. 引言 - -IoTDB 是一款针对物联网场景设计的时间序列数据库,支持高效的数据存储、查询和分析。随着物联网技术的广泛应用,数据安全性和访问控制变得至关重要。在开放环境中,如何保证合法用户对数据的安全访问成为了一项关键挑战。白名单机制仅允许可信 IP 或用户接入,从源头缩小攻击面;黑名单功能则能在边缘与云端协同场景下实时拦截恶意 IP,阻断非法访问、SQL 注入、暴力破解及 DDoS 等威胁,为数据传输提供持续、稳定的安全保障。 - -> 注意:该功能从 V2.0.6 版本开始提供。 - -## 2. 白名单 - -### 2.1 功能描述 - -通过开启白名单功能、配置白名单列表,指定允许连接 IoTDB 的客户端地址,来限制仅在白名单范围内的客户端才能够访问 IoTDB,从而实现安全控制。 - -### 2.2 配置参数 - -管理员可以通过以下两种方式来启用/禁用白名单功能以及添加、修改、删除白名单ip/ip段。 - -* 编辑配置文件 `iotdb-system.properties`进行维护 -* 通过 set configuration 语句进行维护 - * 表模型请参考:[set configuration](../SQL-Manual/SQL-Maintenance-Statements_timecho.md#_2-2-更新配置项) - -相关参数如下: - -| 名称 | 描述 | 默认值 | 生效方式 | 示例 | -| ------------------------- | ----------------------------------------------------------------------------------- | -------- | ---------- | ------------------------------------------------------------------- | -| `enable_white_list` | 是否启用白名单功能。true:启用;false:禁用。字段值不区分大小写。 | false | 热加载 | `set enable_white_list = 'true' ` | -| `white_ip_list` | 添加、修改、删除白名单ip/ip段。支持精确匹配,支持\*通配符,多个ip之间以逗号分隔。 | 空 | 热加载 | `set white_ip_list='192.168.1.200,192.168.1.201,192.168.1.*`' | - -## 3. 
黑名单 - -### 3.1 功能描述 - -通过开启黑名单功能、配置黑名单列表,阻止某些特定 IP 地址访问数据库,来防止非法访问、SQL注入、暴力破解、DDoS攻击等安全威胁,从而确保数据传输过程中的安全性和稳定性。 - -### 3.2 配置参数 - -管理员可以通过以下两种方式来启用/禁用黑名单功能以及添加、修改、删除黑名单 ip/ip 段。 - -* 编辑配置文件 `iotdb-system.properties`进行维护 -* 通过 set configuration 语句进行维护 - * 表模型请参考:[set configuration](../SQL-Manual/SQL-Maintenance-Statements_timecho.md#_2-2-更新配置项) - -相关参数如下: - -| 名称 | 描述 | 默认值 | 生效方式 | 示例 | -| ------------------------- | ----------------------------------------------------------------------------------- | -------- | ---------- | ------------------------------------------------------------------- | -| `enable_black_list` | 是否启用黑名单功能。true:启用;false:禁用。字段值不区分大小写。 | false | 热加载 | `set enable_black_list = 'true' ` | -| `black_ip_list` | 添加、修改、删除黑名单ip/ip段。支持精确匹配,支持\*通配符,多个ip之间以逗号分隔。 | 空 | 热加载 | `set black_ip_list='192.168.1.200,192.168.1.201,192.168.1.*`' | - -## 4. 注意事项 - -1. 开启白名单后,若列表为空将拒绝所有连接,若未包含本机 IP 则拒绝本机登录。 -2. 当同一 IP 同时存在于黑白名单时,黑名单优先级更高。 -3. 系统会校验 IP 格式,无效条目将在用户连接时报错并被跳过,不影响其他有效IP的加载。 -4. 配置支持重复IP,内存中会自动去重且无提示。如需去重请手动修改。 -5. 黑/白名单规则仅对新连接生效,功能开启前的现有连接不受影响,其后续重连才会被拦截。 diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md b/src/zh/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md deleted file mode 100644 index 82de87d78..000000000 --- a/src/zh/UserGuide/latest-Table/User-Manual/Data-Sync_timecho.md +++ /dev/null @@ -1,843 +0,0 @@ - - -# 数据同步 -数据同步是工业物联网的典型需求,通过数据同步机制,可实现 IoTDB 之间的数据共享,搭建完整的数据链路来满足内网外网数据互通、端边云同步、数据迁移、数据备份等需求。 - -## 1. 功能概述 - -### 1.1 数据同步 - -一个数据同步任务包含 3 个阶段: - -![](/img/data-sync-new.png) - -- 抽取(Source)阶段:该部分用于从源 IoTDB 抽取数据,在 SQL 语句中的 source 部分定义 -- 处理(Process)阶段:该部分用于处理从源 IoTDB 抽取出的数据,在 SQL 语句中的 processor 部分定义 -- 发送(Sink)阶段:该部分用于向目标 IoTDB 发送数据,在 SQL 语句中的 sink 部分定义 - -通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。目前数据同步支持以下信息的同步,您可以在创建同步任务时对同步范围进行选择(默认选择 data.insert,即同步新写入的数据): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
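-| Sync scope | Sync content | Description |
-| ------------------- | ---------------------- | ---------------------------------------------------------- |
-| all | | All scopes |
-| data | insert (incremental) | Synchronizes newly written data |
-| data | delete | Synchronizes deleted data |
-| schema (metadata) | database | Synchronizes database create, alter, and drop operations |
-| schema (metadata) | table | Synchronizes table create, alter, and drop operations |
-| schema (metadata) | TTL (data expiry time) | Synchronizes the time-to-live of the data |
-| auth (permissions) | - | Synchronizes user permissions and access control |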
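As an illustration of choosing a sync scope, the statement below is a minimal sketch that synchronizes everything except permissions. The pipe name `sync_all_but_auth` and the target address are hypothetical placeholders. The `inclusion.exclusion` key follows the source parameter table in this document; the worked example in the task-viewing section writes the same option as `exclusion`, so confirm which spelling your version accepts.

```sql
-- Sketch only: the pipe name and node-urls value are hypothetical placeholders.
CREATE PIPE IF NOT EXISTS sync_all_but_auth
WITH SOURCE (
  'source' = 'iotdb-source',
  'inclusion' = 'all',             -- data, schema, and auth
  'inclusion.exclusion' = 'auth'   -- then remove auth from that scope
)
WITH SINK (
  'sink' = 'iotdb-thrift-sink',
  'node-urls' = '127.0.0.1:6668'
)
```

On V2.0.8 and later, a full-data pipe like this one is split into `_history` and `_realtime` pipes, so later statements would target `sync_all_but_auth_history` or `sync_all_but_auth_realtime` rather than the original name.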
- -### 1.2 功能限制及说明 - -- 不支持 1.x 系列版本 IoTDB 与 2.x 以及以上系列版本的 IoTDB 之间进行数据同步。 -- 在进行数据同步任务时,请避免执行任何删除操作,防止两端状态不一致。 -- 树模型与表模型的`pipe`及`pipe plugins`在设计上相互隔离,建议在创建`pipe`前先通过`show`命令查询当前`-sql_dialect`参数配置下可用的内置插件,以确保语法兼容性和功能支持。 -- 自 V2.0.9.2 版本起支持 Object 类型数据导出。 -- 当 Pipe 向接收端写入数据因字段类型不匹配而失败时,IoTDB 可按照目标端已有 schema 的字段类型对数据进行转换,并重试写入,以提高同步成功率。该能力由 `sink.exception.data.convert-on-type-mismatch` 控制,参数说明见后文 sink 参数表。 - -类型不匹配时的转换规则如下: - -| 源类型 | 目标类型 | 转换规则 | -| -------------------- | ----------- | -------------------------------------------------------------------------------- | -| 数值类型 | 数值类型 | 按目标数值类型进行转换,可能发生截断、精度损失或溢出。 | -| 数值类型 | BOOLEAN | `0`转换为`false`,非`0`转换为`true`。 | -| BOOLEAN | 数值类型 | `true`转换为`1`,`false`转换为`0`。 | -| TEXT、STRING、BLOB | BOOLEAN | 按字符串解析为 BOOLEAN。 | -| TEXT、STRING、BLOB | 数值类型 | 按字符串解析为目标数值类型;解析失败时写入默认值`0`、`0L`或`0.0`。 | -| TEXT、STRING、BLOB | TIMESTAMP | 按字符串解析为 TIMESTAMP;解析失败时写入默认值`0L`。 | -| TEXT、STRING、BLOB | DATE | 按字符串解析为 DATE;解析失败时写入默认日期`1970-01-01`。 | -| 非法数值 | DATE | 若无法转换为合法 DATE,则写入默认日期`1970-01-01`。 | -| DATE | TIMESTAMP | 按 UTC 转换为当天零点对应的时间戳。 | -| TIMESTAMP | DATE | 按 UTC 转换为对应日期。 | - -> 注意:自动转换基于目标端已有 schema 执行,不会自动修改目标端 schema;该能力优先保证同步继续进行,可能导致精度损失或默认值写入。 - -## 2. 
使用说明 - -数据同步任务有三种状态:RUNNING、STOPPED 和 DROPPED。任务状态转换如下图所示: - -![](/img/Data-Sync01.png) - -创建后任务会直接启动,同时当任务发生异常停止后,系统会自动尝试重启任务。 - -提供以下 SQL 语句对同步任务进行状态管理。 - -### 2.1 创建任务 - -使用 `CREATE PIPE` 语句来创建一条数据同步任务,下列属性中`PipeId`和`sink`必填,`source`和`processor`为选填项,输入 SQL 时注意 `SOURCE`与 `SINK` 插件顺序不能替换。 - -SQL 示例如下: - -```SQL -CREATE PIPE [IF NOT EXISTS] -- PipeId 是能够唯一标定任务的名字 --- 数据抽取插件,可选插件 -WITH SOURCE ( - [ = ,], -) --- 数据处理插件,可选插件 -WITH PROCESSOR ( - [ = ,], -) --- 数据连接插件,必填插件 -WITH SINK ( - [ = ,], -) -``` - -**IF NOT EXISTS 语义**:用于创建操作中,确保当指定 Pipe 不存在时,执行创建命令,防止因尝试创建已存在的 Pipe 而导致报错。 - -**注意**:V2.0.8 起,创建一个全量数据同步 Pipe (例如 Pipeid : `alldatapipe`)时,系统会自动将其拆分为两个独立的 Pipe: - -* 历史 Pipe:PipeId 为原名称加 _history后缀(如 `alldatapipe_history`),source 参数默认携带 `'realtime.enable'='false', 'inclusion'='data.insert', 'inclusion.exclusion'=''` - -* 实时 Pipe:PipeId 为原名称加 _realtime后缀(如 `alldatapipe_realtime`),source 参数默认携带 `'history.enable'='false'` ,若配置了元数据同步,则由实时 Pipe 负责发送 - -创建成功后,原 PipeId(如 `alldatapipe`)将不再作为有效标识符。在进行启动、停止、删除、查看等任务操作时,必须使用拆分后的独立 PipeId(即 `*_history`或 `*_realtime`)。操作示例见[查看任务](./Data-Sync_timecho.md#_2-5-查看任务)小节 - -### 2.2 开始任务 - -创建之后,任务直接进入运行状态,不需要执行启动任务。当使用`STOP PIPE`语句停止任务时需手动使用`START PIPE`语句来启动任务,PIPE 发生异常情况停止后会自动重新启动任务,从而开始处理数据: - -```SQL -START PIPE -``` - -### 2.3 停止任务 - -停止处理数据: - -```SQL -STOP PIPE -``` - -### 2.4 删除任务 - -删除指定任务: - -```SQL -DROP PIPE [IF EXISTS] -``` - -**IF EXISTS 语义**:用于删除操作中,确保当指定 Pipe 存在时,执行删除命令,防止因尝试删除不存在的 Pipe 而导致报错。 - -删除任务不需要先停止同步任务。 - -### 2.5 查看任务 - -查看全部任务: - -```SQL -SHOW PIPES -``` - -查看指定任务: - -```SQL -SHOW PIPE -``` - - pipe 的 show pipes 结果示例: - -```SQL -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State|PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| 
-+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -|59abf95db892428b9d01c5fa318014ea|2024-06-17T14:03:44.189|RUNNING| {}| {}|{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}| | 128| 1.03| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -``` - -其中各列含义如下: - -- **ID**:同步任务的唯一标识符 -- **CreationTime**:同步任务的创建的时间 -- **State**:同步任务的状态 -- **PipeSource**:同步数据流的来源 -- **PipeProcessor**:同步数据流在传输过程中的处理逻辑 -- **PipeSink**:同步数据流的目的地 -- **ExceptionMessage**:显示同步任务的异常信息 -- **RemainingEventCount(统计存在延迟)**:剩余 event 数,当前数据同步任务中的所有 event 总数,包括数据同步的 event,以及系统和用户自定义的 event。 -- **EstimatedRemainingSeconds(统计存在延迟)**:剩余时间,基于当前 event 个数和 pipe 处速率,预估完成传输的剩余时间。 - -示例: - -在 V2.0.8 及之后的版本中,创建一个全量数据同步任务,并查看该任务详情 - -```sql -IoTDB> create pipe alldatapipe with source('inclusion'='all','exclusion'='auth') with sink('node-urls'='127.0.0.1:6668') - -IoTDB> show pipe alldatapipe_history -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_history|2025-12-18T15:06:16.697|RUNNING|{exclusion=auth, history.enable=true, inclusion=data.insert, inclusion.exclusion=, 
realtime.enable=false}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -IoTDB> show pipe alldatapipe_realtime -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_realtime|2025-12-18T15:06:16.312|RUNNING|{exclusion=auth, history.enable=false, inclusion=all, realtime.enable=true}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -``` - -### 2.6 修改任务 - -`ALTER PIPE` 语句用于动态更新已存在的 PIPE,支持修改或替换 source、processor 及 sink 的配置。 - -```SQL -ALTER PIPE [IF EXISTS] - MODIFY/REPLACE SOURCE(...) - MODIFY/REPLACE PROCESSOR(...) - MODIFY/REPLACE SINK(...) 
-``` - -说明: - -* 执行操作不会改变 PIPE 的运行状态,等价于保留原 PipeId 的处理进度,在原进度位置创建新 PIPE。 -* source/processor/sink 的 modify/replace 参数均为非必填;若未指定任何修改参数,等价于删除当前 PIPE 后,按原配置和进度重新创建。 -* 对于指定 modify 的插件,保留该插件其他参数,仅替换或新增给定的参数。 -* 对于指定 replace 的插件,直接替换该插件所有参数。 -* 当使用 [IF EXISTS] 关键字时,即使不存在同名的 Pipe 也会返回执行成功,但是实际未执行任何操作。 - -示例: - -```SQL -ALTER PIPE A2B REPLACE SINK ('sink'='iotdb-thrift-sink', 'node-urls' = '127.0.0.1:6668'); -``` - -### 2.7 同步插件 - -为了使得整体架构更加灵活以匹配不同的同步场景需求,我们支持在同步任务框架中进行插件组装。系统为您预置了一些常用插件可直接使用,同时您也可以自定义 processor 插件 和 Sink 插件,并加载至 IoTDB 系统进行使用。查看系统中的插件(含自定义与内置插件)可以用以下语句: - -```SQL -SHOW PIPEPLUGINS -``` - -返回结果如下: - -```SQL -IoTDB> SHOW PIPEPLUGINS -+---------------------+----------+-----------------------------------------------------------------------------------------+---------+----------------+ -| PluginName|PluginType| ClassName|PluginJar|ExceptionMessage| -+---------------------+----------+-----------------------------------------------------------------------------------------+---------+----------------+ -| DO-NOTHING-PROCESSOR| Builtin|org.apache.iotdb.commons.pipe.agent.plugin.builtin.processor.donothing.DoNothingProcessor| | | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.donothing.DoNothingSink| | | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.iotdb.airgap.IoTDBAirGapSink| | | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.source.iotdb.IoTDBSource| | | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.iotdb.thrift.IoTDBThriftSink| | | -|IOTDB-THRIFT-SSL-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.iotdb.thrift.IoTDBThriftSslSink| | | -| TSFILE-LOCAL-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.tsfile.PipeTsFileLocalSink| | | -| WRITE-BACK-SINK| Builtin| org.apache.iotdb.commons.pipe.agent.plugin.builtin.sink.writeback.WriteBackSink| | | 
-+---------------------+----------+-----------------------------------------------------------------------------------------+---------+----------------+ -``` - -预置插件详细介绍如下(各插件的详细参数可参考本文[参数说明](#参考参数说明)): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
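-| Type | Custom plugins | Plugin name | Description |
-| ---------------- | -------------- | --------------------- | ------------------------------------------------------------ |
-| source plugin | Not supported | iotdb-source | Default extractor plugin; extracts historical or real-time data from IoTDB |
-| processor plugin | Supported | do-nothing-processor | Default processor plugin; does not process the incoming data in any way |
-| sink plugin | Supported | do-nothing-sink | Does not process the outgoing data in any way |
-| sink plugin | Supported | iotdb-thrift-sink | Default sink plugin, for data transfer from IoTDB to IoTDB (V2.0.0 and above). Transfers data over the Thrift RPC framework with a multi-threaded async non-blocking IO model; transfer performance is high, especially when the target is a distributed deployment |
-| sink plugin | Supported | iotdb-air-gap-sink | For data synchronization from IoTDB to IoTDB (V2.0.0 and above) across a unidirectional data gateway. Supported gateway models include NARI Syskeeper 2000 |
-| sink plugin | Supported | iotdb-thrift-ssl-sink | For data transfer between IoTDB instances (V2.0.0 and above). Transfers data over the Thrift RPC framework with a multi-threaded sync blocking IO model; suited to scenarios with higher security requirements |
-| sink plugin | Supported | write-back-sink | Data write-back plugin for IoTDB (V2.0.2 and above); plays the role of a materialized view |
-| sink plugin | Supported | opc-ua-sink | Data transfer plugin for IoTDB (V2.0.2 and above) that supports the OPC UA protocol, in both Client/Server and Pub/Sub communication modes |
-| sink plugin | Supported | tsfile-local-sink | For IoTDB (V2.0.9.2 and above); exports Object data to the local file system of the IoTDB server |
-| sink plugin | Supported | tsfile-remote-sink | For IoTDB (V2.0.9.2 and above); sends Object data to a remote server over the SSH/SCP protocol |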
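To show how a plugin from this table is selected, here is a hedged sketch that points an existing pipe at the SSL sink by replacing its sink configuration. The pipe name `A2B`, the address, and the trust store path and password are placeholder values borrowed from the examples elsewhere in this document, not recommended settings.

```sql
-- Sketch only: assumes a pipe named A2B already exists.
-- REPLACE SINK overwrites every previous sink parameter,
-- so all required parameters are stated again here.
ALTER PIPE A2B REPLACE SINK (
  'sink' = 'iotdb-thrift-ssl-sink',
  'node-urls' = '127.0.0.1:6667',
  'ssl.trust-store-path' = 'pki/trusted',
  'ssl.trust-store-pwd' = 'root'
);
```

Using `MODIFY SINK` instead would keep the existing parameters and change only the ones listed, which is the safer choice when just one value needs to change.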
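-
-## 3. Usage Examples
-
-### 3.1 Full Data Synchronization
-
-This example synchronizes all data from one IoTDB to another. The data link is shown below:
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png)
-
-Here we create a sync task named A2B that synchronizes the full data set from IoTDB A to IoTDB B. It uses the built-in iotdb-thrift-sink plugin, and node-urls must be set to the URL of the data service port of a DataNode in the target IoTDB, as in the following statement:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### 3.2 Partial Data Synchronization
-
-This example synchronizes the data of a historical time range (08:00 on August 23, 2023 to 08:00 on October 23, 2023) to another IoTDB. The data link is shown below:
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png)
-
-Here we create a sync task named A2B. First, the source must define the range of data to transfer. Because the data is historical (historical data is data that existed before the sync task was created), configure its start and end times with start-time and end-time, and the transfer mode with mode.streaming. Set node-urls to the URL of the data service port of a DataNode in the target IoTDB.
-
-The full statement is:
-
-```SQL
-create pipe A2B
-WITH SOURCE (
-  'source'= 'iotdb-source',
-  'mode.streaming' = 'true',  -- extraction mode for data inserted after the pipe is created: streaming when true, batch when false
-  'database-name'='testdb.*',  -- scope of the data to synchronize
-  'start-time' = '2023.08.23T08:00:00+00:00',  -- start event time of the data to synchronize, inclusive
-  'end-time' = '2023.10.23T08:00:00+00:00'  -- end event time of the data to synchronize, inclusive
-)
-with SINK (
-  'sink'='iotdb-thrift-async-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### 3.3 Bidirectional Data Transfer
-
-This example covers two IoTDB instances that act as active-active peers of each other. The data link is shown below:
-
-![](/img/1706698592139.jpg)
-
-To keep data from looping forever, set `source.mode.double-living` to `true` on both A and B, which means that data received from the other pipe is not forwarded again.
-
-The full statements are as follows.
-
-Run the following statement on IoTDB A:
-
-```SQL
-create pipe AB
-with source (
-  'source.mode.double-living' ='true'   -- do not forward data written by other pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-Run the following statement on IoTDB B:
-
-```SQL
-create pipe BA
-with source (
-  'source.mode.double-living' ='true'   -- do not forward data written by other pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6667', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```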
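One way to confirm that both directions are running is to inspect each pipe on its own instance. This sketch reuses the pipe names `AB` and `BA` from the statements above. Whether these IDs carry a `_history` or `_realtime` suffix on V2.0.8 and later is an assumption to verify with `SHOW PIPES` first.

```sql
-- Sketch only: run each statement on the instance named in the comment.

-- On IoTDB A: the State column should read RUNNING
SHOW PIPE AB

-- On IoTDB B: the State column should read RUNNING
SHOW PIPE BA

-- On IoTDB A: pause one direction for maintenance, then resume it
STOP PIPE AB
START PIPE AB
```

A pipe stopped with `STOP PIPE` stays stopped until `START PIPE` is issued, whereas a pipe that stops because of an exception is restarted automatically.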
-### 3.4 边云数据传输 - -本例子用来演示多个 IoTDB 之间边云传输数据的场景,数据由 B 、C、D 集群分别都同步至 A 集群,数据链路如下图所示: - -![](/img/dataSync03.png) - -在这个例子中,为了将 B 、C、D 集群的数据同步至 A,在 BA 、CA、DA 之间的 pipe 需要配置database-name 和 table-name 限制范围,详细语句如下: - -在 B IoTDB 上执行下列语句,将 B 中数据同步至 A: - -```SQL -create pipe BA -with source ( - 'database-name'='db_b.*', -- 限制范围 - 'table-name'='.*', -- 可选择匹配所有 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6667', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 C IoTDB 上执行下列语句,将 C 中数据同步至 A: - -```SQL -create pipe CA -with source ( - 'database-name'='db_c.*', -- 限制范围 - 'table-name'='.*', -- 可选择匹配所有 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 D IoTDB 上执行下列语句,将 D 中数据同步至 A: - -```SQL -create pipe DA -with source ( - 'database-name'='db_d.*', -- 限制范围 - 'table-name'='.*', -- 可选择匹配所有 -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 3.5 级联数据传输 - -本例子用来演示多个 IoTDB 之间级联传输数据的场景,数据由 A 集群同步至 B 集群,再同步至 C 集群,数据链路如下图所示: - -![](/img/1706698610134.jpg) - - -在 A IoTDB 上执行下列语句,将 A 中数据同步至 B: - -```SQL -create pipe AB -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -在 B IoTDB 上执行下列语句,将 B 中数据同步至 C: - -```SQL -create pipe BC -with source ( -) -with sink ( - 'sink'='iotdb-thrift-sink', - 'node-urls' = '127.0.0.1:6669', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` - -### 3.6 跨网闸数据传输 - -本例子用来演示将一个 IoTDB 的数据,经过单向网闸,同步至另一个 IoTDB 的场景,数据链路如下图所示: - -![](/img/cross-network-gateway.png) - -在这个例子中,需要使用 sink 任务中的 iotdb-air-gap-sink 插件,配置网闸后,在 A IoTDB 上执行下列语句,其中 node-urls 填写网闸配置的目标端 IoTDB 中 DataNode 节点的数据服务端口的 url,详细语句如下: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-air-gap-sink', - 'node-urls' = '10.53.53.53:9780', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url -) -``` -**注意:** -* 跨网闸同步创建 pipe 时,必须确保接收端目标用户已存在。若创建 pipe 
时接收端用户缺失,后续补建用户也无法同步此前数据。 -* 目前支持的网闸型号见下表 -> 其他型号的网闸设备,请与天谋商务联系确认是否支持。 - -| 网闸类型 | 网闸型号 | 回包限制 | 发送限制 | -| ------------ | -------------------------------------------- | ----------------- | --------------- | -| 正向型 | 南瑞 Syskeeper-2000 正向型 | 全 0 / 全 1 bytes | 无限制 | -| 正向型 | 许继自研网闸 | 全 0 / 全 1 bytes | 无限制 | -| 未标记正反向 | 威努特安全隔离与信息交换系统 | 无限制 | 无限制 | -| 正向型 | 科东 StoneWall-2000 网络安全隔离设备(正向型) | 无限制 | 无限制 | -| 反向型 | 南瑞 Syskeeper-2000 反向型 | 全 0 / 全 1 bytes | 满足 E 语言格式 | -| 未标记正反向 | 迪普科技ISG5000 | 无限制 | 无限制 | -| 未标记正反向 | 熙羚安全隔离与信息交换系统XL—GAP | 无限制 | 无限制 | - -### 3.7 压缩同步 - -IoTDB 支持在同步过程中指定数据压缩方式。可通过配置 `compressor` 参数,实现数据的实时压缩和传输。`compressor`目前支持 snappy / gzip / lz4 / zstd / lzma2 5 种可选算法,且可以选择多种压缩算法组合,按配置的顺序进行压缩。`rate-limit-bytes-per-second`(V1.3.3 及以后版本支持)每秒最大允许传输的byte数,计算压缩后的byte,若小于0则不限制。 - -如创建一个名为 A2B 的同步任务: - -```SQL -create pipe A2B -with sink ( - 'node-urls' = '127.0.0.1:6668', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url - 'compressor' = 'snappy,lz4' -- - 'rate-limit-bytes-per-second'='1048576' -- 每秒最大允许传输的byte数 -) -``` - -### 3.8 加密同步 - -IoTDB 支持在同步过程中使用 SSL 加密,从而在不同的 IoTDB 实例之间安全地传输数据。通过配置 SSL 相关的参数,如证书地址和密码(`ssl.trust-store-path`)、(`ssl.trust-store-pwd`)可以确保数据在同步过程中被 SSL 加密所保护。 - -如创建名为 A2B 的同步任务: - -```SQL -create pipe A2B -with sink ( - 'sink'='iotdb-thrift-ssl-sink', - 'node-urls'='127.0.0.1:6667', -- 目标端 IoTDB 中 DataNode 节点的数据服务端口的 url - 'ssl.trust-store-path'='pki/trusted', -- 连接目标端 DataNode 所需的 trust store 证书路径 - 'ssl.trust-store-pwd'='root' -- 连接目标端 DataNode 所需的 trust store 证书密码 -) -``` - -### 3.9 Object 类型数据导出 - -IoTDB 自 V2.0.9.2 版本起支持导出 Object 类型数据,通过配置 sink 参数支持如下两种方式: - -* Local 模式(本地导出):将数据导出到 IoTDB 服务器所在的本地文件系统。 -* SCP 模式(远程传输):通过 SSH/SCP 协议将数据发送到远程服务器。 - -**示例一:本地导出** - -可直接使用系统内置的 `tsfile-local-sink `插件创建 PIPE 语句导出数据,例如: - -```SQL -CREATE PIPE tsfile_export_local -WITH SOURCE ( - 'source' = 'iotdb-source', - 'table-name' = 'test_table' -) -WITH PROCESSOR ( - 'processor' = 'do-nothing-processor' -) -WITH SINK ( - 'sink' = 'tsfile-local-sink', -- 必填,指定 Sink 
类型 - 'sink.local.target-path' = '/data/backup/export_2024' -- 导出目标路径 - 'sink.rate-limit-bytes-per-second' = '10485760' -- 限速 10MB/s -); -``` - -**示例二:远程传输** - -1. 联系天谋团队获取 `tsfile-remote-sink` 插件相关的 jar 包,如 `tsfile-remote-sink--jar-with-dependencies.jar`,并放至 IoTDB 可访问的路径(例如所有数据节点主机)。 -2. 使用如下语句注册插件 - -```SQL -CREATE PIPEPLUGIN tsfile_remote_sink -AS 'org.apache.iotdb.pipe.plugin.sink.tsfile.PipeTsFileRemoteSink' -USING URI 'file:///path/to/tsfile-remote-sink-<版本号>-jar-with-dependencies.jar'; -``` - -3. 创建 PIPE 语句 - -```SQL -CREATE PIPE tsfile_export_scp -WITH SOURCE ( - 'source' = 'iotdb-source', - 'table-name' = 'test_table' -) -WITH PROCESSOR ( - 'processor' = 'do-nothing-processor' -) -WITH SINK ( - 'sink' = 'tsfile_remote_sink', - 'sink.file-mode' = 'scp', -- 指定为 SCP 模式 - 'sink.scp.host' = '192.168.1.100', -- 远程主机 IP - 'sink.scp.port' = '22', -- SSH 端口 - 'sink.scp.user' = 'backup_user', -- SSH 用户名 - 'sink.scp.password' = 'ComplexPass123!', -- SSH 密码 - 'sink.scp.remote-path' = '/remote/archive/', -- 远程存放路径 - 'sink.rate-limit-bytes-per-second' = '10485760' -- 限速 10MB/s -); -``` - -注意:远程导出 Object 类型数据时,为避免出现握手异常、连接失败或 Pipe 频繁启停问题,建议采取以下任一措施: -* 适当调低配置参数 sink.scp.object-parallelism -* 按需调大目标机的 MaxStartups,修改后执行 sshd reload 或 sshd restart 使配置生效 - -**Sink 导出 TSFile 与 Object 格式:** - -```Bash -target_dir - ├── tsfile.tsfile - └── tsfile/ (对应TSFile名字) - ├── regionID/tableName/tag1/tag2/field/timestamp1.bin - ├── regionID/tableName/tag1/tag2/field/timestamp2.bin - └── regionID/tableName1/tag3/tag4/field/timestamp1.bin -``` - - -## 参考:注意事项 - -可通过修改 IoTDB 配置文件(`iotdb-system.properties`)以调整数据同步的参数,如同步数据存储目录等。完整配置如下:: - -```Properties -# pipe_receiver_file_dir -# If this property is unset, system will save the data in the default relative path directory under the IoTDB folder(i.e., %IOTDB_HOME%/${cn_system_dir}/pipe/receiver). -# If it is absolute, system will save the data in the exact location it points to. 
-# If it is relative, system will save the data in the relative path directory it indicates under the IoTDB folder. -# Note: If pipe_receiver_file_dir is assigned an empty string(i.e.,zero-size), it will be handled as a relative path. -# effectiveMode: restart -# For windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is absolute. Otherwise, it is relative. -# pipe_receiver_file_dir=data\\confignode\\system\\pipe\\receiver -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_receiver_file_dir=data/confignode/system/pipe/receiver - -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# effectiveMode: first_start -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# effectiveMode: restart -# Datatype: int -pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# effectiveMode: restart -# Datatype: int -pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# effectiveMode: restart -# Datatype: int -pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# effectiveMode: restart -# Datatype: int -pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. 
-# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# effectiveMode: restart -# Datatype: Boolean -pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# Datatype: int -# effectiveMode: restart -pipe_air_gap_receiver_port=9780 - -# The total bytes that all pipe sinks can transfer per second. -# When given a value less than or equal to 0, it means no limit. -# default value is -1, which means no limit. -# effectiveMode: hot_reload -# Datatype: double -pipe_all_sinks_rate_limit_bytes_per_second=-1 -``` - -## 参考:参数说明 - -### source 参数 - -| **参数** | **描述** | **value 取值范围** | **是否必填** | **默认取值** | -|-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------| ------------ | ------------------------------- | -| source | iotdb-source | String: iotdb-source | 必填 | - | -| inclusion | 用于指定数据同步任务中需要同步范围,分为数据,元数据和权限 | String:all, data(insert,delete), schema(database,table,ttl),auth | 选填 | data.insert | -| inclusion.exclusion | 用于从 inclusion 指定的同步范围内排除特定的操作,减少同步的数据量 | String:all, data(insert,delete), schema(database,table,ttl), auth | 选填 | 空字符串 | -| mode.streaming | 此参数指定时序数据写入的捕获来源。适用于 `mode.streaming`为 `false` 模式下的场景,决定`inclusion`中`data.insert`数据的捕获来源。提供两种捕获策略:true: 动态选择捕获的类型。系统将根据下游处理速度,自适应地选择是捕获每个写入请求还是仅捕获 TsFile 文件的封口请求。当下游处理速度快时,优先捕获写入请求以减少延迟;当处理速度慢时,仅捕获文件封口请求以避免处理堆积。这种模式适用于大多数场景,能够实现处理延迟和吞吐量的最优平衡。false:固定按批捕获方式。仅捕获 TsFile 文件的封口请求,适用于资源受限的应用场景,以降低系统负载。注意,pipe 启动时捕获的快照数据只会以文件的方式供下游处理。 | Boolean: true / false | 否 | true | -| mode.strict | 在使用 time / path / database-name / table-name 
参数过滤数据时,是否需要严格按照条件筛选:`true`: 严格筛选。系统将完全按照给定条件过滤筛选被捕获的数据,确保只有符合条件的数据被选中。`false`:非严格筛选。系统在筛选被捕获的数据时可能会包含一些额外的数据,适用于性能敏感的场景,可降低 CPU 和 IO 消耗。 | Boolean: true / false | 否 | true | -| mode.snapshot | 此参数决定时序数据的捕获方式,影响`inclusion`中的`data`数据。提供两种模式:`true`:静态数据捕获。启动 pipe 时,会进行一次性的数据快照捕获。当快照数据被完全消费后,**pipe 将自动终止(DROP PIPE SQL 会自动执行)**。`false`:动态数据捕获。除了在 pipe 启动时捕获快照数据外,还会持续捕获后续的数据变更。pipe 将持续运行以处理动态数据流。 | Boolean: true / false | 否 | false | -| database-name | 当用户连接指定的 sql_dialect 为 table 时可以指定。此参数决定时序数据的捕获范围,影响`inclusion`中的`data`数据。表示要过滤的数据库的名称。它可以是具体的数据库名,也可以是 Java 风格正则表达式来匹配多个数据库。默认情况下,匹配所有的库。 | String:数据库名或数据库正则模式串,可以匹配未创建的、不存在的库 | 否 | ".*" | -| table-name | 当用户连接指定的 sql_dialect 为 table 时可以指定。此参数决定时序数据的捕获范围,影响`inclusion`中的`data`数据。表示要过滤的表的名称。它可以是具体的表名,也可以是 Java 风格正则表达式来匹配多个表。默认情况下,匹配所有的表。 | String:数据表名或数据表正则模式串,可以是未创建的、不存在的表 | 否 | ".*" | -| start-time | 此参数决定时序数据的捕获范围,影响`inclusion`中的`data`数据。当数据的 event time 大于等于该参数时,数据会被筛选出来进入流处理 pipe。 | Long: [Long.MIN_VALUE, Long.MAX_VALUE] (unix 裸时间戳)或 String:IoTDB 支持的 ISO 格式时间戳 | 否 | Long.MIN_VALUE(unix 裸时间戳) | -| end-time | 此参数决定时序数据的捕获范围,影响`inclusion`中的`data`数据。当数据的 event time 小于等于该参数时,数据会被筛选出来进入流处理 pipe。 | Long: [Long.MIN_VALUE, Long.MAX_VALUE](unix 裸时间戳)或String:IoTDB 支持的 ISO 格式时间戳 | 否 | Long.MAX_VALUE(unix 裸时间戳) | -| mode.double-living | 是否开启全量双活模式,开启后将忽略`-sql_dialect`连接方式,树表模型数据均会被捕获,且不会转发由另一pipe同步而来的数据。 | Boolean: true / false | 否 | false | -| mods | 同 mods.enable,是否发送 tsfile 的 mods 文件 | Boolean: true / false | 选填 | false | -| skipIf | 出现哪些错误可以跳过,当前只有无权限的错误 | String:no-privileges | 选填 | no-privileges | - -> 💎 **说明:数据抽取模式 mode.streaming 取值 true 和 false 的差异** -> - **true(推荐)**:该取值下,任务将对数据进行实时处理、发送,其特点是高时效、低吞吐 -> - **false**:该取值下,任务将对数据进行批量(按底层数据文件)处理、发送,其特点是低时效、高吞吐 - - -### sink 参数 - -#### iotdb-thrift-sink - -| **参数** | **描述** | **value 取值范围** | **是否必填** | **默认取值** | 
-| --- | --- | --- | --- | --- |
-| sink | iotdb-thrift-sink or iotdb-thrift-async-sink | String: iotdb-thrift-sink or iotdb-thrift-async-sink | Required | - |
-| node-urls | URLs of the data service ports of any number of DataNodes in the target IoTDB (note that a sync task cannot forward data to its own service) | String. e.g. '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - |
-| user/username | Username used to connect to the receiver; the user must hold the corresponding operation permissions | String | Optional | root |
-| password | Password of the user used to connect to the receiver; the user must hold the corresponding operation permissions | String | Optional | root |
-| batch.enable | Whether to enable batched log sending, which raises transfer throughput and lowers IOPS | Boolean: true, false | Optional | true |
-| batch.max-delay-seconds | Effective when batched sending is enabled; the longest time a batch waits before it is sent (unit: s) | Integer | Optional | 1 |
-| batch.max-delay-ms | Effective when batched sending is enabled; the longest time a batch waits before it is sent (unit: ms) (supported since V2.0.5) | Integer | Optional | 1 |
-| batch.size-bytes | Effective when batched sending is enabled; the maximum size of a batch (unit: byte) | Long | Optional | 16*1024*1024 |
-| compressor | RPC compression algorithm; several can be configured and are applied to each request in order | String: snappy / gzip / lz4 / zstd / lzma2 | Optional | "" |
-| compressor.zstd.level | When the RPC compression algorithm is zstd, sets the zstd compression level | Int: [-131072, 22] | Optional | 3 |
-| rate-limit-bytes-per-second | Maximum number of bytes transferred per second, measured after compression (if compression is enabled); no limit if less than 0 | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | Optional | -1 |
-| load-tsfile-strategy | When data is synchronized as files, whether the receiver waits for its local load tsfile result before responding to the sender. `sync`: wait for the local load tsfile result; `async`: do not wait for it. | String: sync / async | Optional | sync |
-| format | Payload format for data transfer. Options: hybrid (keeps whatever format the processor passes in, tsfile or tablet; the sink performs no conversion); tsfile (forces conversion to tsfile before sending; useful for scenarios such as data file backup); tablet (forces conversion to tablet before sending; useful when sender and receiver data types are not fully compatible, to reduce errors). | String: hybrid / tsfile / tablet | Optional | hybrid |
-| mark-as-general-write-request | Controls whether data forwarded by an external pipe can be synchronized between double-living pipes (configure it on the sender of the pipe external to the double-living pair) (supported since V2.0.5) | Boolean: true / false. True: can be synchronized; False: cannot | Optional | False |
-| exception.data.convert-on-type-mismatch | Whether to convert data when its type differs from the receiver's | Boolean: true / false | Optional | true |
-
-#### iotdb-air-gap-sink
-
-| **Parameter** | **Description** | **Value range** | **Required** | **Default** |
-| --- | --- | --- | --- | --- |
-| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | Required | - |
-| node-urls | URLs of the data service ports of any number of DataNodes in the target IoTDB | String. e.g. '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - |
-| user/username | Username used to connect to the receiver; the user must hold the corresponding operation permissions | String | Optional | root |
-| password | Password of the user used to connect to the receiver; the user must hold the corresponding operation permissions | String | Optional | TimechoDB@2021; root before V2.0.6.x |
-| compressor | RPC compression algorithm; several can be configured and are applied to each request in order | String: snappy / gzip / lz4 / zstd / lzma2 | Optional | "" |
-| compressor.zstd.level | When the RPC compression algorithm is zstd, sets the zstd compression level | Int: [-131072, 22] | Optional | 3 |
-| rate-limit-bytes-per-second | Maximum number of bytes transferred per second, measured after compression (if compression is enabled); no limit if less than 0 | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | Optional | -1 |
-| load-tsfile-strategy | When data is synchronized as files, whether the receiver waits for its local load tsfile result before responding to the sender.
sync:等待本地的 load tsfile 执行结果返回;
async:不等待本地的 load tsfile 执行结果返回。 | String: sync / async | 选填 | sync | -| air-gap.handshake-timeout-ms | 发送端与接收端在首次尝试建立连接时握手请求的超时时长,单位:毫秒 | Integer | 选填 | 5000 | -| exception.data.convert-on-type-mismatch | 接收端类型不同时是否转换 | Boolean: true / false | 选填 | true | - -#### iotdb-thrift-ssl-sink - -| **参数** | **描述** | **value 取值范围** | **是否必填** | **默认取值** | -|--------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------| -------- | ------------ | -| sink | iotdb-thrift-ssl-sink | String: iotdb-thrift-ssl-sink | 必填 | - | -| node-urls | 目标端 IoTDB 任意多个 DataNode 节点的数据服务端口的 url(请注意同步任务不支持向自身服务进行转发) | String. 例:'127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | 必填 | - | -| user/username | 连接接收端使用的用户名,同步要求该用户具备相应的操作权限 | String | 选填 | root | -| password | 连接接收端使用的用户名对应的密码,同步要求该用户具备相应的操作权限 | String | 选填 | TimechoDB@2021,V2.0.6.x之前为 root | -| batch.enable | 是否开启日志攒批发送模式,用于提高传输吞吐,降低 IOPS | Boolean: true, false | 选填 | true | -| batch.max-delay-seconds | 在开启日志攒批发送模式时生效,表示一批数据在发送前的最长等待时间(单位:s) | Integer | 选填 | 1 | -| batch.max-delay-ms | 在开启日志攒批发送模式时生效,表示一批数据在发送前的最长等待时间(单位:ms)(V2.0.5及以后版本支持) | Integer | 选填 | 1 | -| batch.size-bytes | 在开启日志攒批发送模式时生效,表示一批数据最大的攒批大小(单位:byte) | Long | 选填 | 16*1024*1024 | -| compressor | 所选取的 rpc 压缩算法,可配置多个,对每个请求顺序采用 | String: snappy / gzip / lz4 / zstd / lzma2 | 选填 | "" | -| compressor.zstd.level | 所选取的 rpc 压缩算法为 zstd 时,可使用该参数额外配置 zstd 算法的压缩等级 | Int: [-131072, 22] | 选填 | 3 | -| rate-limit-bytes-per-second | 每秒最大允许传输的 byte 数,计算压缩后的 byte(如压缩),若小于 0 则不限制 | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | 选填 | -1 | -| load-tsfile-strategy | 文件同步数据时,接收端请求返回发送端前,是否等待接收端本地的 load tsfile 执行结果返回。
sync: wait for the result of the local load tsfile operation;
async: do not wait for the result of the local load tsfile operation. | String: sync / async | Optional | sync | -| ssl.trust-store-path | Path of the trust store certificate required to connect to the target DataNode | String | Required | - | -| ssl.trust-store-pwd | Password of the trust store certificate required to connect to the target DataNode | Integer | Required | - | -| format | Payload format for data transfer. Options:
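- hybrid: determined by the format passed from the processor (tsfile or tablet); the sink performs no conversion.
- tsfile: force conversion to tsfile before sending; useful for scenarios such as data file backup.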
- tablet: force conversion to tablet before sending; useful for data synchronization when the data types on the sender and receiver are not fully compatible (reduces errors). | String: hybrid / tsfile / tablet | Optional | hybrid | -| mark-as-general-write-request | Controls whether data forwarded by an external pipe can be synchronized between active-active pipes (set on the sender of the pipe external to the active-active pair) | Boolean: true / false. True: can be synchronized; False: cannot be synchronized | Optional | False | -| exception.data.convert-on-type-mismatch | Whether to convert data when the receiver's data type differs | Boolean: true / false | Optional | true | - -#### write-back-sink - -| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** | -| --- | --- | --- | --- | --- | -| sink | write-back-sink | String: write-back-sink | Required | - | -| user/username | User used for write-back | String: username | Optional | root | -| password | Password used for write-back | String: password | Optional | root123 | -| user-id | userId of the user | String | Optional | root | -| cli-hostname | CLI hostname of the user | String | Optional | root | -| use-event-user-name | If an event carries another user's username, whether to use that username (rarely needed at present, since there is no external source) | Boolean: true / false | Optional | false | - -#### opc-ua-sink - -| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** | -| --- | --- | --- | --- | --- | -| sink | OPC UA SINK | String: opc-ua-sink | Required | - | -| sink.opcua.model | Mode used by OPC UA | String: client-server / pub-sub | Optional | pub-sub | -| sink.opcua.tcp.port | TCP port of OPC UA | Integer: [0, 65536] | Optional | 12686 | -| sink.opcua.https.port | HTTPS port of OPC UA | Integer: [0, 65536] | Optional | 8443 | -| sink.opcua.security.dir | Directory for OPC UA keys and certificates | String: Path; absolute and relative paths are supported | Optional | The opc_security folder under the conf directory of the corresponding IoTDB DataNode ``.
If there is no IoTDB conf directory (for example, when the DataNode is started from IDEA), the iotdb_opc_security folder under the user's home directory `` | -| sink.opcua.enable-anonymous-access | Whether OPC UA allows anonymous access | Boolean | Optional | true | -| sink.user | User; here, the user allowed by OPC UA | String | Optional | root | -| sink.password | Password; here, the password allowed by OPC UA | String | Optional | TimechoDB@2021 (root before V2.0.6.x) | -| sink.opcua.placeholder | Placeholder string used in place of null in the mapped path when an ID column value is null | String | Optional | "null" | - -#### tsfile-local-sink - -| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** | -| --- | --- | --- | --- | --- | -| sink | Component name | String: tsfile-local-sink | Yes | - | -| sink.local.target-path | Local target directory | String | Yes | - | -| sink.rate-limit-bytes-per-second | Rate limit threshold. Unit: bytes/second. Takes effect when rate limiting is enabled. No rate limit when rate-limit<=0 | Long | No | 0 | - -#### tsfile-remote-sink - -| **Parameter** | **Description** | **Value Range** | **Required** | **Default Value** | -| --- | --- | --- | --- | --- | -| sink | Component name | String: tsfile-remote-sink | Yes | - | -| sink.scp.host | Remote host IP | String | Yes | - | -| sink.scp.port | Remote SSH port | Long | No | 22 | -| sink.scp.user | Remote SSH user | String | Yes | - | -| sink.scp.password | Remote SSH password | String | Yes | - | -| sink.scp.remote-path | Remote target directory | String | Yes | - | -| sink.rate-limit-bytes-per-second | Unit: bytes/second. Takes effect when rate limiting is enabled. No rate limit when rate-limit<=0 | Long | No | 0 | -| sink.scp.object-parallelism | Maximum parallelism for sending object files | Long | No | `min(cpu/4,16)` | -| sink.scp.object-batch-size-bytes | Maximum Object file size sent by a single asynchronous thread, in MB | Long | No | 200 | diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Maintenance-statement_timecho.md b/src/zh/UserGuide/latest-Table/User-Manual/Maintenance-statement_timecho.md deleted file mode 100644 index 20d67a00e..000000000 --- a/src/zh/UserGuide/latest-Table/User-Manual/Maintenance-statement_timecho.md +++ /dev/null @@ -1,956 +0,0 @@ - -# Maintenance Statements - -## 1.
Status Checking - -### 1.1 View the Connected Model - -**Description**: Returns whether the sql_dialect of the current connection is the tree model or the table model. - -#### Syntax: - -```SQL -showCurrentSqlDialectStatement - : SHOW CURRENT_SQL_DIALECT - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW CURRENT_SQL_DIALECT -``` - -The result is as follows: - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` -### 1.2 View the Logged-in Username - -**Description**: Returns the username of the current login. - -#### Syntax: - -```SQL -showCurrentUserStatement - : SHOW CURRENT_USER - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW CURRENT_USER -``` - -The result is as follows: - -```SQL -+-----------+ -|CurrentUser| -+-----------+ -| root| -+-----------+ -``` - -### 1.3 View the Connected Database Name - -**Description**: Returns the name of the currently connected database; null if no USE statement has been executed. - -#### Syntax: - -```SQL -showCurrentDatabaseStatement - : SHOW CURRENT_DATABASE - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW CURRENT_DATABASE; - -IoTDB> USE test; - -IoTDB> SHOW CURRENT_DATABASE; -``` - -The result is as follows: - -```SQL -+---------------+ -|CurrentDatabase| -+---------------+ -| null| -+---------------+ - -+---------------+ -|CurrentDatabase| -+---------------+ -| test| -+---------------+ -``` - -### 1.4 View the Cluster Version - -**Description**: Returns the version of the current cluster. - -#### Syntax: - -```SQL -showVersionStatement - : SHOW VERSION - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW VERSION -``` - -The result is as follows: - -```SQL -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.1.2| 1ca4008| -+-------+---------+ -``` - -### 1.5 View Key Cluster Parameters - -**Description**: Returns the key parameters of the current cluster. - -#### Syntax: - -```SQL -showVariablesStatement - : SHOW VARIABLES - ; -``` - -The key parameters are: - -1. **ClusterName**: name of the current cluster. -2. **DataReplicationFactor**: number of data replicas, i.e. the number of replicas of each data region (DataRegion). -3. **SchemaReplicationFactor**: number of schema replicas, i.e. the number of replicas of each schema region (SchemaRegion). -4. **DataRegionConsensusProtocolClass**: consensus protocol class used by data regions (DataRegion). -5. **SchemaRegionConsensusProtocolClass**: consensus protocol class used by schema regions (SchemaRegion). -6. **ConfigNodeConsensusProtocolClass**: consensus protocol class used by ConfigNodes. -7. **TimePartitionOrigin**: starting timestamp of database time partitions. -8. **TimePartitionInterval**: time partition interval of the database (unit: milliseconds). -9. **ReadConsistencyLevel**: consistency level of read operations. -10.
**SchemaRegionPerDataNode**: number of schema regions (SchemaRegion) on a DataNode. -11. **DataRegionPerDataNode**: number of data regions (DataRegion) on a DataNode. -12. **SeriesSlotNum**: number of series slots (SeriesSlot) of a data region (DataRegion). -13. **SeriesSlotExecutorClass**: implementation class of the series slot. -14. **DiskSpaceWarningThreshold**: disk space warning threshold (unit: percentage). -15. **TimestampPrecision**: timestamp precision. - -#### Example: - -```SQL -IoTDB> SHOW VARIABLES -``` - -The result is as follows: - -```SQL -+----------------------------------+-----------------------------------------------------------------+ -| Variable| Value| -+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - -### 1.6 View the Cluster ID - -**Description**: Returns the ID of the current cluster. - -#### Syntax: - -```SQL -showClusterIdStatement - : SHOW (CLUSTERID | CLUSTER_ID) - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW CLUSTER_ID -``` - -The result is as follows: - -```SQL -+------------------------------------+ -| ClusterId| -+------------------------------------+ -|40163007-9ec1-4455-aa36-8055d740fcda| -+------------------------------------+ -``` - -### 1.7 View the Time of the Server Hosting the DataNode the Client Is Directly Connected To - -**Description**: Returns the time of the server hosting the DataNode process that the current client is directly connected to. - -#### Syntax: - -```SQL -showCurrentTimestampStatement - : SHOW CURRENT_TIMESTAMP - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW CURRENT_TIMESTAMP -``` - -The result is as follows: -
-```SQL -+-----------------------------+ -| CurrentTimestamp| -+-----------------------------+ -|2025-02-17T11:11:52.987+08:00| -+-----------------------------+ -``` - -### 1.8 View Running Queries - -**Description**: Displays information about all queries that are currently running. - -> For more on using system tables, see [System Tables](../Reference/System-Tables_timecho.md) - -#### Syntax: - -```SQL -showQueriesStatement - : SHOW (QUERIES | QUERY PROCESSLIST) - (WHERE where=booleanExpression)? - (ORDER BY sortItem (',' sortItem)*)? - limitOffsetClause - ; -``` - -**Parameter Explanation**: - -1. **WHERE** clause: the filtered column must exist in the result set -2. **ORDER BY** clause: the `sortKey` must be a column that exists in the result set -3. **limitOffsetClause**: - - **Description**: limits the number of rows returned in the result set. - - **Format**: `LIMIT offset, row_count`, where `offset` is the offset and `row_count` is the number of rows to return. -4. Columns in the **QUERIES** table: - - **query_id**: ID of the query statement - - **start_time**: timestamp at which the query started; its precision matches the system precision - - **datanode_id**: ID of the DataNode that issued the query statement - - **elapsed_time**: execution time of the query, in seconds - - **statement**: SQL statement of the query - - **user**: user who issued the query - -#### Example: - -```SQL -IoTDB> SHOW QUERIES WHERE elapsed_time > 30 -``` - -The result is as follows: - -```SQL -+-----------------------+-----------------------------+-----------+------------+------------+----+ -| query_id| start_time|datanode_id|elapsed_time| statement|user| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -|20250108_101015_00000_1|2025-01-08T18:10:15.935+08:00| 1| 32.283|show queries|root| -+-----------------------+-----------------------------+-----------+------------+------------+----+ -``` - -### 1.9 View Region Information - -**Description**: Returns the region information of the current cluster. - -#### Syntax: - -```SQL -showRegionsStatement - : SHOW REGIONS - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW REGIONS -``` - -The result is as follows: - -```SQL -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -|RegionId| Type| Status| Database|SeriesSlotNum|TimeSlotNum|DataNodeId|RpcAddress|RpcPort|InternalAddress| Role| CreateTime|TsFileSize| 
-+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -| 6|SchemaRegion|Running|tcollector| 670| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.194| | -| 7| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.196| 169.85 KB| -| 8| DataRegion|Running|tcollector| 335| 335| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.198| 161.63 KB| -+--------+------------+-------+----------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -``` - -### 1.10 View Available Nodes - -**Description**: Returns the RPC addresses and ports of all available DataNodes in the current cluster. Note: here "available" means a DataNode that is not in the REMOVING state. - -> Supported since V2.0.8 - -#### Syntax: - -```SQL -showAvailableUrlsStatement - : SHOW AVAILABLE URLS - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW AVAILABLE URLS -``` - -The result is as follows: - -```SQL -+----------+-------+ -|RpcAddress|RpcPort| -+----------+-------+ -| 0.0.0.0| 6667| -+----------+-------+ -``` - -### 1.11 View Service Information - -**Description**: Returns service information (MQTT service, REST service) on all DataNodes in the current cluster that are working normally (RUNNING or READ-ONLY). - -> Supported since V2.0.8.2 - -#### Syntax: - -```SQL -showServicesStatement - : SHOW SERVICES - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW SERVICES -IoTDB> SHOW SERVICES ON 1 -``` - -The result is as follows: - -```SQL -+------------+-----------+-------+ -|service_name|datanode_id| state| -+------------+-----------+-------+ -| MQTT| 1|STOPPED| -| REST| 1|RUNNING| -+------------+-----------+-------+ -``` - - -### 1.12 View the Cluster Activation Status - -**Description**: Returns the activation status of the current cluster. - -#### Syntax: - -```SQL -showActivationStatement - : SHOW ACTIVATION - ; -``` - -#### Example: - -```SQL -IoTDB> SHOW ACTIVATION -``` - -The result is as follows: - -```SQL -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| 
-| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -### 1.13 View Node Configuration - -**Description**: By default, returns the configuration items that have taken effect in the configuration file of the specified node (specified by `node_id`); if `node_id` is not specified, returns the configuration of the DataNode the client is directly connected to. Adding the `all` parameter returns all configuration items (the `value` of an unconfigured item is `null`); adding the `with desc` parameter also returns each item's description. - -> Supported since V2.0.9.1 - -#### Syntax: - -```SQL -showConfigurationStatement - : SHOW (ALL)? CONFIGURATION (ON nodeId=INTEGER_VALUE)? (WITH DESC)? - ; -``` - -#### Result Set Description - -| Column Name | Column Type | Description | -| ---------------- | -------- | ------------------ | -| name | string | Parameter name | -| value | string | Parameter value | -| default\_value | string | Default parameter value | -| description | string | Parameter description (optional) | - -#### Example: - -1. View the configuration of the DataNode the client is directly connected to - -```SQL -show configuration; -``` - -```Bash -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| name| value| default_value| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| cn_internal_address| 127.0.0.1| 127.0.0.1| -| cn_internal_port| 10710| 10710| -| cn_consensus_port| 10720| 10720| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| -| dn_rpc_port| 6667| 6667| -| dn_internal_address| 127.0.0.1| 127.0.0.1| -| dn_internal_port| 10730| 10730| -| dn_mpp_data_exchange_port| 10740| 10740| -| dn_schema_region_consensus_port| 10750| 10750| -| dn_data_region_consensus_port| 10760| 10760| -| schema_replication_factor| 1| 1| -|schema_region_consensus_protocol_class| org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| -| data_replication_factor| 1| 1| -| data_region_consensus_protocol_class| 
org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| -| cn_metric_prometheus_reporter_port| 9091| 9091| -| dn_metric_prometheus_reporter_port| 9092| 9092| -| series_slot_num| 1000| 1000| -| series_partition_executor_class|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| time_partition_origin| 0| 0| -| time_partition_interval| 604800000| 604800000| -| disk_space_warning_threshold| 0.05| 0.05| -| schema_engine_mode| Memory| Memory| -| tag_attribute_total_size| 700| 700| -| read_consistency_level| strong| strong| -| timestamp_precision| ms| ms| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -Total line number = 28 -It costs 0.013s -``` - -2. View the configuration of the node with the specified node id - -```Bash -show configuration on 1; -``` - -```Bash -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| name| value| default_value| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| cn_internal_address| 127.0.0.1| 127.0.0.1| -| cn_internal_port| 10710| 10710| -| cn_consensus_port| 10720| 10720| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| -| dn_rpc_port| 6667| 6667| -| dn_internal_address| 127.0.0.1| 127.0.0.1| -| dn_internal_port| 10730| 10730| -| dn_mpp_data_exchange_port| 10740| 10740| -| dn_schema_region_consensus_port| 10750| 10750| -| dn_data_region_consensus_port| 10760| 10760| -| schema_replication_factor| 1| 1| -|schema_region_consensus_protocol_class| 
org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| -| data_replication_factor| 1| 1| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| -| cn_metric_prometheus_reporter_port| 9091| 9091| -| dn_metric_prometheus_reporter_port| 9092| 9092| -| series_slot_num| 1000| 1000| -| series_partition_executor_class|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| time_partition_origin| 0| 0| -| time_partition_interval| 604800000| 604800000| -| disk_space_warning_threshold| 0.05| 0.05| -| schema_engine_mode| Memory| Memory| -| tag_attribute_total_size| 700| 700| -| read_consistency_level| strong| strong| -| timestamp_precision| ms| ms| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -Total line number = 28 -It costs 0.004s -``` - -3. 
View all configuration items - -```Bash -show all configuration; -``` - -```Bash -+---------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| name| value| default_value| -+---------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| -| cn_internal_address| 127.0.0.1| 127.0.0.1| -| cn_internal_port| 10710| 10710| -| cn_consensus_port| 10720| 10720| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| -| dn_rpc_port| 6667| 6667| -| dn_internal_address| 127.0.0.1| 127.0.0.1| -| dn_internal_port| 10730| 10730| -| dn_mpp_data_exchange_port| 10740| 10740| -| dn_schema_region_consensus_port| 10750| 10750| -| dn_data_region_consensus_port| 10760| 10760| -| dn_join_cluster_retry_interval_ms| null| 5000| -| config_node_consensus_protocol_class| null| org.apache.iotdb.consensus.ratis.RatisConsensus| -| schema_replication_factor| 1| 1| -| schema_region_consensus_protocol_class| org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| -| data_replication_factor| 1| 1| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| -| cn_system_dir| null| data/confignode/system| -| cn_consensus_dir| null| data/confignode/consensus| -| cn_pipe_receiver_file_dir| null| data/confignode/system/pipe/receiver| -... -+---------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ -Total line number = 412 -It costs 0.006s -``` - -4.
查看配置项描述信息 - -```Bash -show configuration on 1 with desc; -``` - -```Bash -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| name| value| default_value| description| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| cluster_name| defaultCluster| defaultCluster| 用于标识集群名称并区分不同集群。如果需要修改集群名称,建议使用 SQL 语句 'set configuration "cluster_name=xxx"'。不建议手动修改配置文件,否则可能导致节点重启失败。effectiveMode: hot_reload。Datatype: string| -| cn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710|For the first ConfigNode to start, cn_seed_config_node points to its own cn_internal_address:cn_internal_port. 
For other ConfigNodes that to join the cluster, cn_seed_config_node points to any running ConfigNode's cn_internal_address:cn_internal_port. Note: After this ConfigNode successfully joins the cluster for the first time, this parameter is no longer used. Each node automatically maintains the list of ConfigNodes and traverses connections when restarting. Format: address:port e.g. 127.0.0.1:10710.effectiveMode: first_start.Datatype: String| -| dn_seed_config_node| 127.0.0.1:10710| 127.0.0.1:10710| dn_seed_config_node points to any running ConfigNode's cn_internal_address:cn_internal_port. Note: After this DataNode successfully joins the cluster for the first time, this parameter is no longer used. Each node automatically maintains the list of ConfigNodes and traverses connections when restarting. Format: address:port e.g. 127.0.0.1:10710.effectiveMode: first_start.Datatype: String| -| cn_internal_address| 127.0.0.1| 127.0.0.1| Used for RPC communication inside cluster. Could set 127.0.0.1(for local test) or ipv4 address.effectiveMode: first_start.Datatype: String| -| cn_internal_port| 10710| 10710| Used for RPC communication inside cluster.effectiveMode: first_start.Datatype: int| -| cn_consensus_port| 10720| 10720| Used for consensus communication among ConfigNodes inside cluster.effectiveMode: first_start.Datatype: int| -| dn_rpc_address| 0.0.0.0| 0.0.0.0| Used for connection of IoTDB native clients(Session) Could set 127.0.0.1(for local test) or ipv4 address.effectiveMode: restart.Datatype: String| -| dn_rpc_port| 6667| 6667| Used for connection of IoTDB native clients(Session) Bind with dn_rpc_address.effectiveMode: restart.Datatype: int| -| dn_internal_address| 127.0.0.1| 127.0.0.1| Used for communication inside cluster. could set 127.0.0.1(for local test) or ipv4 address.effectiveMode: first_start.Datatype: String| -| dn_internal_port| 10730| 10730| Used for communication inside cluster. 
Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| dn_mpp_data_exchange_port| 10740| 10740| Port for data exchange among DataNodes inside cluster Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| dn_schema_region_consensus_port| 10750| 10750| port for consensus's communication for schema region inside cluster. Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| dn_data_region_consensus_port| 10760| 10760| port for consensus's communication for data region inside cluster. Bind with dn_internal_address.effectiveMode: first_start.Datatype: int| -| schema_replication_factor| 1| 1| Default number of schema replicas.effectiveMode: first_start.Datatype: int| -|schema_region_consensus_protocol_class| org.apache.iotdb.consensus.ratis.RatisConsensus| org.apache.iotdb.consensus.ratis.RatisConsensus| SchemaRegion consensus protocol type. This parameter is unmodifiable after ConfigNode starts for the first time. These consensus protocols are currently supported: 1. org.apache.iotdb.consensus.ratis.RatisConsensus 2. org.apache.iotdb.consensus.simple.SimpleConsensus (The schema_replication_factor can only be set to 1).effectiveMode: first_start.Datatype: string| -| data_replication_factor| 1| 1| Default number of data replicas.effectiveMode: first_start.Datatype: int| -| data_region_consensus_protocol_class| org.apache.iotdb.consensus.iot.IoTConsensus| org.apache.iotdb.consensus.iot.IoTConsensus| DataRegion consensus protocol type. This parameter is unmodifiable after ConfigNode starts for the first time. These consensus protocols are currently supported: 1. org.apache.iotdb.consensus.simple.SimpleConsensus (The data_replication_factor can only be set to 1) 2. org.apache.iotdb.consensus.iot.IoTConsensus 3. org.apache.iotdb.consensus.ratis.RatisConsensus 4. 
org.apache.iotdb.consensus.iot.IoTConsensusV2.effectiveMode: first_start.Datatype: string| -| cn_metric_prometheus_reporter_port| 9091| 9091| The port of prometheus reporter of metric module.effectiveMode: restart.Datatype: int| -| dn_metric_prometheus_reporter_port| 9092| 9092| The port of prometheus reporter of metric module.effectiveMode: restart.Datatype: int| -| series_slot_num| 1000| 1000| All parameters in Partition configuration is unmodifiable after ConfigNode starts for the first time. And these parameters should be consistent within the ConfigNodeGroup. Number of SeriesPartitionSlots per Database.effectiveMode: first_start.Datatype: Integer| -| series_partition_executor_class|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| SeriesPartitionSlot executor class These hashing algorithms are currently supported: 1. BKDRHashExecutor(Default) 2. APHashExecutor 3. JSHashExecutor 4. SDBMHashExecutor Also, if you want to implement your own SeriesPartition executor, you can inherit the SeriesPartitionExecutor class and modify this parameter to correspond to your Java class.effectiveMode: first_start.Datatype: String| -| time_partition_origin| 0| 0| Time partition origin in milliseconds, default is equal to zero. This origin is set by default to the beginning of Unix time, which is January 1, 1970, at 00:00 UTC (Coordinated Universal Time). This point is known as the Unix epoch, and its timestamp is 0. 
If you want to specify a different time partition origin, you can set this value to a specific Unix timestamp in milliseconds.effectiveMode: first_start.Datatype: long| -| time_partition_interval| 604800000| 604800000| Time partition interval in milliseconds, and partitioning data inside each data region, default is equal to one week.effectiveMode: first_start.Datatype: long| -| disk_space_warning_threshold| 0.05| 0.05| Disk remaining threshold at which DataNode is set to ReadOnly status.effectiveMode: restart.Datatype: double(percentage)| -| schema_engine_mode| Memory| Memory| The schema management mode of schema engine. Currently, support Memory and PBTree. This config of all DataNodes in one cluster must keep same.effectiveMode: first_start.Datatype: string| -| tag_attribute_total_size| 700| 700| max size for a storage block for tags and attributes of one time series. If the combined size of tags and attributes exceeds the tag_attribute_total_size, a new storage block will be allocated to continue storing the excess data. the unit is byte.effectiveMode: first_start.Datatype: int| -| read_consistency_level| strong| strong| The read consistency level These consistency levels are currently supported: 1. strong(Default, read from the leader replica) 2. weak(Read from a random replica).effectiveMode: restart.Datatype: string| -| timestamp_precision| ms| ms| Use this value to set timestamp precision as "ms", "us" or "ns". 
Once the precision has been set, it can not be changed.effectiveMode: first_start.Datatype: String| -+--------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 28 -It costs 0.010s -``` - - -## 2. 状态设置 - -### 2.1 设置连接的模型 - -**含义**:将当前连接的 sql_dialect 置为树模型/表模型,在树模型和表模型中均可使用该命令。 - -#### 语法: - -```SQL -SET SQL_DIALECT EQ (TABLE | TREE) -``` - -#### 示例: - -```SQL -IoTDB> SET SQL_DIALECT=TABLE -IoTDB> SHOW CURRENT_SQL_DIALECT -``` - -执行结果如下: - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TABLE| -+-----------------+ -``` - -### 2.2 更新配置项 - -**含义**:用于更新配置项,执行完成后会进行配置项的热加载,对于支持热修改的配置项会立即生效。 - -#### 语法: - -```SQL -setConfigurationStatement - : SET CONFIGURATION propertyAssignments (ON INTEGER_VALUE)? - ; - -propertyAssignments - : property (',' property)* - ; - -property - : identifier EQ propertyValue - ; - -propertyValue - : DEFAULT - | expression - ; -``` - -**参数解释**: - -1. **propertyAssignments** - - **含义**:更新的配置列表,由多个 `property` 组成。 - - 可以更新多个配置列表,用逗号分隔。 - - **取值**: - - `DEFAULT`:将配置项恢复为默认值。 - - `expression`:具体的值,必须是一个字符串。 -2. 
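**ON INTEGER_VALUE** - - **Description**: specifies the ID of the node whose configuration is to be updated. - - **Optional**: yes. If it is not specified, or the specified value is less than 0, the configuration of all ConfigNodes and DataNodes is updated. - -#### Example: - -```SQL -IoTDB> SET CONFIGURATION disk_space_warning_threshold='0.05',heartbeat_interval_in_ms='1000' ON 1; -``` - -### 2.3 Load a Manually Modified Configuration File - -**Description**: Reads a manually modified configuration file and hot-loads the configuration items; items that support hot modification take effect immediately. - -#### Syntax: - -```SQL -loadConfigurationStatement - : LOAD CONFIGURATION localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameter Explanation**: - -1. **localOrClusterMode** - - **Description**: specifies the scope of the configuration hot load. - - **Optional**: yes. The default value is `CLUSTER`. - - **Values**: - - `LOCAL`: hot-loads the configuration only on the DataNode the client is directly connected to. - - `CLUSTER`: hot-loads the configuration on all DataNodes in the cluster. - -#### Example: - -```SQL -IoTDB> LOAD CONFIGURATION ON LOCAL; -``` - -### 2.4 Set the System Status - -**Description**: Sets the status of the system. - -#### Syntax: - -```SQL -setSystemStatusStatement - : SET SYSTEM TO (READONLY | RUNNING) localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameter Explanation**: - -1. **RUNNING | READONLY** - - **Description**: specifies the new status of the system. - - **Values**: - - `RUNNING`: sets the system to the running state, allowing both reads and writes. - - `READONLY`: sets the system to the read-only state, allowing reads only and rejecting writes. -2. **localOrClusterMode** - - **Description**: specifies the scope of the status change. - - **Optional**: yes. The default value is `CLUSTER`. - - **Values**: - - `LOCAL`: takes effect only on the DataNode the client is directly connected to. - - `CLUSTER`: takes effect on all DataNodes in the cluster. - -#### Example: - -```SQL -IoTDB> SET SYSTEM TO READONLY ON CLUSTER; -``` - - -## 3. Data Management - -### 3.1 Flush Memtable Data to Disk - -**Description**: Flushes the data in memtables to disk. - -#### Syntax: - -```SQL -flushStatement - : FLUSH identifier? (',' identifier)* booleanValue? localOrClusterMode? - ; - -booleanValue - : TRUE | FALSE - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameter Explanation**: - -1. **identifier** - - **Description**: specifies the name of the database to flush. - - **Optional**: yes. If not specified, all databases are flushed by default. - - **Multiple databases**: several database names can be given, separated by commas. For example: `FLUSH test_db1, test_db2`. -2. **booleanValue** - - **Description**: specifies what to flush. - - **Optional**: yes. If not specified, memtables in both the sequence and unsequence spaces are flushed by default. - - **Values**: - - `TRUE`: flushes only the memtables in the sequence space. - - `FALSE`: flushes only the memtables in the unsequence space. -3.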
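**localOrClusterMode** - - **Description**: specifies the scope of the flush. - - **Optional**: yes. The default value is `CLUSTER`. - - **Values**: - - `ON LOCAL`: flushes only the memtables on the DataNode the client is directly connected to. - - `ON CLUSTER`: flushes the memtables on all DataNodes in the cluster. - -#### Example: - -```SQL -IoTDB> FLUSH test_db TRUE ON LOCAL; -``` - -## 4. Data Repair - -### 4.1 Start the Background tsfile Scan-and-Repair Task - -**Description**: Starts a background task that scans and repairs tsfiles; it can repair out-of-order timestamp anomalies inside data files. - -#### Syntax: - -```SQL -startRepairDataStatement - : START REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameter Explanation**: - -1. **localOrClusterMode** - - **Description**: specifies the scope of the data repair. - - **Optional**: yes. The default value is `CLUSTER`. - - **Values**: - - `ON LOCAL`: runs only on the DataNode the client is directly connected to. - - `ON CLUSTER`: runs on all DataNodes in the cluster. - -#### Example: - -```SQL -IoTDB> START REPAIR DATA ON CLUSTER; -``` - -### 4.2 Pause the Background tsfile Repair Task - -**Description**: Pauses the background repair task; a paused task can be resumed by executing the start repair data command again. - -#### Syntax: - -```SQL -stopRepairDataStatement - : STOP REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**Parameter Explanation**: - -1. **localOrClusterMode** - - **Description**: specifies the scope of the data repair. - - **Optional**: yes. The default value is `CLUSTER`. - - **Values**: - - `ON LOCAL`: runs only on the DataNode the client is directly connected to. - - `ON CLUSTER`: runs on all DataNodes in the cluster. - -#### Example: - -```SQL -IoTDB> STOP REPAIR DATA ON CLUSTER; -``` - -## 5. Terminating Queries - -### 5.1 Actively Terminate Queries - -**Description**: Use this command to terminate queries. - -#### Syntax: - -```SQL -killQueryStatement - : KILL (QUERY queryId=string | ALL QUERIES) - ; -``` - -**Parameter Explanation**: - -1. **QUERY queryId=string** - - **Description**: specifies the ID of the query to terminate. `queryId` is the unique identifier of a running query. - - **Obtaining the query ID**: the `SHOW QUERIES` command lists all running queries and their IDs. -2. **ALL QUERIES** - - **Description**: terminates all running queries. - -#### Example: - -Specifying a `queryId` terminates that query. To obtain the IDs of running queries, use the show queries command, which lists all running queries. - -```SQL -IoTDB> KILL QUERY 20250108_101015_00000_1; -- terminate the specified query -IoTDB> KILL ALL QUERIES; -- terminate all queries -``` - -## 6. Debugging Queries - -### 6.1 DEBUG SQL - - -**Description**: Prefix a SQL query statement with the debug keyword; when executed, it outputs debug logs, including scan information for the underlying files involved. - -> Supported since V2.0.9.1 - -#### Syntax: - -```SQL -debugSQLStatement - : DEBUG ?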
query - ; -``` - -**Notes:** - -* The debug log is written to: `logs/log_datanode_query_debug.log` - -#### Example: - -1. Run the following SQL as a DEBUG query - -```SQL -debug select * from table3; -``` - -2. Inspect the contents of `log_datanode_query_debug.log` to see the file scan information for the query. - -```Bash -2026-03-24 10:10:41,515 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.t.TsFileResource:1098 - Path: table3.d1 file /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2864/1769139940009-1-0-0.tsfile is not satisfied because of no device! -2026-03-24 10:10:41,515 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.t.TsFileResource:1098 - Path: table3.d1 file /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2865/1769139940010-1-0-0.tsfile is not satisfied because of no device! -2026-03-24 10:10:41,516 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:159 - Cache miss: table3.d1. in file: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile -2026-03-24 10:10:41,516 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:160 - Device: table3.d1, all sensors: [, temperature] -2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.BloomFilterCache:110 - get bloomFilter from cache where filePath is: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile -2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:227 - Get timeseries: table3.d1. 
metadata in file: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile from cache: TimeseriesMetadata{timeSeriesMetadataType=-128, chunkMetaDataListDataSize=8, measurementId='', dataType=VECTOR, statistics=startTime: 1747065600001 endTime: 1747065601002 count: 2, modified=false, isSeq=true, chunkMetadataList=[measurementId: , datatype: VECTOR, version: 0, Statistics: startTime: 1747065600001 endTime: 1747065601002 count: 2, deleteIntervalList: null]}. -2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:227 - Get timeseries: table3.d1.temperature metadata in file: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile from cache: TimeseriesMetadata{timeSeriesMetadataType=64, chunkMetaDataListDataSize=8, measurementId='temperature', dataType=FLOAT, statistics=startTime: 1747065600001 endTime: 1747065601002 count: 2 [minValue:85.0,maxValue:90.0,firstValue:90.0,lastValue:85.0,sumValue:175.0], modified=false, isSeq=true, chunkMetadataList=[measurementId: temperature, datatype: FLOAT, version: 0, Statistics: startTime: 1747065600001 endTime: 1747065601002 count: 2 [minValue:85.0,maxValue:90.0,firstValue:90.0,lastValue:85.0,sumValue:175.0], deleteIntervalList: null]}. 
-2026-03-24 10:10:41,517 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:110 - Modifications size is 1 for file Path: /home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:114 - [] -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:125 - After modification Chunk meta data list is: -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskAlignedChunkMetadataLoader:126 - org.apache.tsfile.file.metadata.TableDeviceChunkMetadata@2e11291f -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.ChunkCache:167 - get chunk from cache whose key is: ChunkCacheKey{filePath='/home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile', regionId=4, timePartitionId=2888, tsFileVersion=1, compactionVersion=0, offsetOfChunkHeader=19} -2026-03-24 10:10:41,518 [Query-Worker-Thread-0$20260324_021041_00068_1.1.0.0] INFO o.a.i.d.s.b.ChunkCache:167 - get chunk from cache whose key is: ChunkCacheKey{filePath='/home/iotdb/timechodb/data/datanode/data/sequence/database1/4/2888/1774247880109-1-0-0.tsfile', regionId=4, timePartitionId=2888, tsFileVersion=1, compactionVersion=0, offsetOfChunkHeader=46} -2026-03-24 10:10:41,519 [pool-69-IoTDB-ClientRPC-Processor-1$20260324_021041_00068_1] INFO o.a.i.d.q.p.Coordinator:902 - debug select * from table3 -``` diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Pattern-Query_timecho.md b/src/zh/UserGuide/latest-Table/User-Manual/Pattern-Query_timecho.md deleted file mode 100644 index a0a6e0242..000000000 --- a/src/zh/UserGuide/latest-Table/User-Manual/Pattern-Query_timecho.md +++ /dev/null @@ -1,1138 +0,0 @@ 
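- - -# Pattern Query - -For the analysis scenarios that are characteristic of time series data, IoTDB provides a pattern query capability, a flexible and efficient way to mine time series data in depth and run complex computations on it. This document describes the feature in detail. - -## 1. Overview - -A pattern query captures a run of consecutive rows by defining the recognition logic of pattern variables together with a regular expression over them, and then analyzes each captured run. It suits business scenarios such as recognizing a specific pattern in time series data (as shown in the figure below) or detecting specific events. - -![](/img/timeseries-featured-analysis-1.png) - -> Note: This feature is available from V 2.0.5 onward. - -## 2. Feature Description -### 2.1 Syntax - -```SQL -MATCH_RECOGNIZE ( - [ PARTITION BY column [, ...] ] - [ ORDER BY column [, ...] ] - [ MEASURES measure_definition [, ...] ] - [ ROWS PER MATCH ] - [ AFTER MATCH skip_to ] - PATTERN ( row_pattern ) - [ SUBSET subset_definition [, ...] ] - DEFINE variable_definition [, ...] -) -``` - -**Notes:** - -* PARTITION BY: Optional. Groups the input table; pattern matching runs independently within each group. If the clause is omitted, the whole input table is processed as a single partition. -* ORDER BY: Optional. Ensures that the input rows are matched in a given order. -* MEASURES: Optional. Specifies which information to extract from a matched run of rows. -* ROWS PER MATCH: Optional. Specifies how the result set is output after a successful match. -* AFTER MATCH SKIP: Optional. Specifies the row at which the next match attempt resumes after a non-empty match has been found. -* PATTERN: Defines the row pattern to match. -* SUBSET: Optional. Merges the rows matched by several primary pattern variables into one logical set. -* DEFINE: Defines the primary pattern variables of the row pattern. - -**Raw data used by the syntax examples:** - -```SQL -IoTDB:database3> select * from t -+-----------------------------+------+----------+ -| time|device|totalprice| -+-----------------------------+------+----------+ -|2025-01-01T00:01:00.000+08:00| d1| 90| -|2025-01-01T00:02:00.000+08:00| d1| 80| -|2025-01-01T00:03:00.000+08:00| d1| 70| -|2025-01-01T00:04:00.000+08:00| d1| 80| -|2025-01-01T00:05:00.000+08:00| d1| 70| -|2025-01-01T00:06:00.000+08:00| d1| 80| -+-----------------------------+------+----------+ - --- Creation statements -create table t(device tag, totalprice int32 field) - -insert into t(time,device,totalprice) values(2025-01-01T00:01:00, 'd1', 90),(2025-01-01T00:02:00, 'd1', 80),(2025-01-01T00:03:00, 'd1', 70),(2025-01-01T00:04:00, 'd1', 80),(2025-01-01T00:05:00, 'd1', 70),(2025-01-01T00:06:00, 'd1', 80) -``` - -### 2.2 DEFINE Clause - -Specifies the matching condition of each primary pattern variable used in pattern recognition. These variables are usually named by identifiers (such as `A`, `B`), and the Boolean expressions in this clause define precisely which rows qualify for each variable. - -* During pattern matching, the current row is labeled with a variable, and thus added to the current match, only when the variable's Boolean expression evaluates to TRUE. - -```SQL --- The current row can be recognized as B only if its totalprice is lower than the totalprice of the previous row. -DEFINE B AS 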
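totalprice < PREV(totalprice) -``` - -* A variable that is **not explicitly** defined in the clause has an implicit condition that is always TRUE, so it matches any input row. - -### 2.3 SUBSET Clause - -Merges the rows matched by several primary pattern variables (such as `A` and `B`) into one union pattern variable (such as `U`), so that these rows can be treated as a single logical set. It can be used in the `MEASURES`, `DEFINE`, and `AFTER MATCH SKIP` clauses. - -```SQL -SUBSET U = (A, B) -``` - -For example, with the pattern `PATTERN ((A | B){5} C+)`, it cannot be known in advance whether the fifth repetition matches primary pattern variable A or B. Therefore: - -1. In the `MEASURES` clause, to reference the last row matched in that stage, define the union pattern variable `SUBSET U = (A, B)`. The expression `RPR_LAST(U.totalprice)` then returns the `totalprice` value of that row directly. -2. In the `AFTER MATCH SKIP` clause, if the match contains no row for primary pattern variable A or B, `AFTER MATCH SKIP TO LAST B` or `AFTER MATCH SKIP TO LAST A` fails because its anchor is missing. With the union pattern variable `SUBSET U = (A, B)`, `AFTER MATCH SKIP TO LAST U` always works. - -### 2.4 PATTERN Clause - -Defines the row pattern to match. Its basic building block is the **primary pattern variable**. - -```SQL -PATTERN ( row_pattern ) -``` - -#### 2.4.1 Pattern Types - -| Row pattern | Syntax | Description | -| ----------------------------------- |---------------------| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| Pattern Concatenation | `A B+ C+ D+` | Consists of sub-patterns written without any operator; all sub-patterns are matched one after another in the declared order. | -| Pattern Alternation | `A \| B \| C` | Consists of several sub-patterns separated by `\|`; exactly one of them is matched. When more than one sub-pattern can match, the leftmost one is chosen. | -| Pattern Permutation | `PERMUTE(A, B, C)` | Equivalent to an alternation over every ordering of the sub-pattern elements: A, B, and C must all match, but in any order. When several orderings can match, precedence follows **lexicographic order** based on the order in which the elements are listed in PERMUTE. For example, A B C has the highest precedence and C B A the lowest. | -| Pattern Grouping | `(A B C)` | Encloses sub-patterns in parentheses so that they are treated as one unit, which can be combined with other operators. For example, `(A B C)+` matches one or more consecutive occurrences of the group `(A B C)`. | -| Empty Pattern | `()` | Represents an empty match that contains no rows. | -| Pattern Exclusion | `{- row_pattern -}` | Marks the part of a match to exclude from the output. It is usually combined with `ALL ROWS PER MATCH` to output only the rows of interest. For example, with `PATTERN (A {- B+ C+ -} D+)` and `ALL ROWS PER MATCH`, the output contains only the first row of the match (the row for `A`) and the trailing rows (the rows for `D+`). | - -#### 2.4.2 Partition Start/End Anchor - -* `^A` matches a pattern in which A is at the start of the partition - * When the PATTERN clause is `^A`, the match must begin at the first row of the partition, and that row must satisfy the definition of `A` - * When the PATTERN clause is `^A^` or `A^`, the output is empty -* `A$` matches a pattern in which A is at the end of the partition - * When the PATTERN clause is `A$`, the match must occur at the end of the partition, and that row must satisfy the definition of `A` - * When the PATTERN clause is `$A` or `$A$`, the output is empty - -**Example** - -* Query SQL - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - AFTER MATCH SKIP PAST LAST ROW - PATTERN %s -- PATTERN clause - DEFINE A AS true -) AS m; -``` - -* Query results - * When the PATTERN clause is PATTERN (^A) - - ![](/img/timeseries-featured-analysis-2.png) - - Actual result - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * When the PATTERN clause is PATTERN (^A^), the output is empty, because after matching one A from the start of the partition it is impossible to be back at the start of the partition - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. - ``` - - * When the PATTERN clause is PATTERN (A\$) - - ![](/img/timeseries-featured-analysis-3.png) - - Actual result - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:06:00.000+08:00| 1| 80| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * When the PATTERN clause is PATTERN (\$A\$), the output is empty - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. 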
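- ``` - - -#### 2.4.3 Quantifiers - -A quantifier specifies how many times a sub-pattern repeats and is placed after that sub-pattern, as in `(A | B)*`. - -Common quantifiers: - -| Quantifier | Description | -| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `*` | Zero or more repetitions | -| `+` | One or more repetitions | -| `?` | Zero or one repetition | -| `{n}` | Exactly n repetitions | -| `{m, n}` | Between m and n repetitions (m and n are non-negative integers). If the lower bound is omitted, it defaults to 0; if the upper bound is omitted, there is no upper limit (for example, {5,} means "at least five repetitions"); if both bounds are omitted, as in {,}, the quantifier is equivalent to \*. | - -* Appending `?` to a quantifier changes its matching preference. - * `{3,5}`: prefers 5 repetitions and least prefers 3; `{3,5}?`: prefers 3 repetitions and least prefers 5 - * `?`: prefers 1 repetition; `??`: prefers 0 repetitions - -### 2.5 AFTER MATCH SKIP Clause - -Specifies the row at which the next match attempt resumes after a non-empty match has been found. - -| Skip strategy | Description | Allows overlapping matches | -| ------------------------------------------------------------- | --------------------------------------------------- | ------------------------ | -| `AFTER MATCH SKIP PAST LAST ROW` | Default behavior. Resumes at the row following the last row of the current match. | No | -| `AFTER MATCH SKIP TO NEXT ROW` | Resumes at the second row of the current match. | Yes | -| `AFTER MATCH SKIP TO [ FIRST \| LAST ] pattern_variable` | Resumes at the [ first \| last ] row of the given pattern variable. | Yes | - -* Of all possible configurations, only the combination of `ALL ROWS PER MATCH WITH UNMATCHED ROWS` and `AFTER MATCH SKIP PAST LAST ROW` guarantees exactly one output record for every input row. - -**Example** - -* Query SQL - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - %s -- AFTER MATCH SKIP clause - PATTERN (A B+ C+ D?) 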
- SUBSET U = (C, D) - DEFINE - B AS B.totalprice < PREV (B.totalprice), - C AS C.totalprice > PREV (C.totalprice), - D AS false -- never matches -) AS m; -``` - -* Query results - * With AFTER MATCH SKIP PAST LAST ROW - - ![](/img/timeseries-featured-analysis-4.png) - - * - * First match: rows 1, 2, 3, 4 - * Second match: under `AFTER MATCH SKIP PAST LAST ROW`, matching resumes at row 5, and no further valid match can be found - * This mode never produces overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 4 - ``` - - * With AFTER MATCH SKIP TO NEXT ROW - - ![](/img/timeseries-featured-analysis-5.png) - - * - * First match: rows 1, 2, 3, 4 - * Second match: under `AFTER MATCH SKIP TO NEXT ROW`, matching resumes at row 2 and matches rows 2, 3, 4 - * Third match: the attempt starting at row 3 fails - * Third match: the attempt starting at row 4 succeeds and matches rows 4, 5, 6 - * This mode allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:02:00.000+08:00| 2| 80| A| - |2025-01-01T00:03:00.000+08:00| 2| 70| B| - |2025-01-01T00:04:00.000+08:00| 2| 80| C| - |2025-01-01T00:04:00.000+08:00| 3| 80| A| - |2025-01-01T00:05:00.000+08:00| 3| 70| B| - |2025-01-01T00:06:00.000+08:00| 3| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 10 - ``` - - * With AFTER MATCH SKIP TO FIRST C - - ![](/img/timeseries-featured-analysis-6.png) - - * - * First match: rows 1, 2, 3, 4 - * Second match: matching resumes at the first C (row 4) and matches rows 4, 5, 6 - * This mode allows overlapping matches - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - 
+-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * 当 AFTER MATCH SKIP TO LAST B 或 AFTER MATCH SKIP TO B 时 - - ![](/img/timeseries-featured-analysis-7.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:尝试从最后一个 B (也就是第 3 行)处开始,失败 - * 第二次匹配:尝试从第 4 行开始,成功匹配第4、5、6行 - * 此模式允许出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * 当 AFTER MATCH SKIP TO U 时 - - ![](/img/timeseries-featured-analysis-8.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:`SKIP TO U` 表示跳转到最后一个 C 或 D,D 永远不可能匹配成功,所以就是跳转到最后一个 C(也就是第 4 行),成功匹配第4、5、6行 - * 此模式允许出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * 当 AFTER MATCH SKIP TO A 时,报错。因为不能跳转到匹配的第一行, 
否则会造成死循环。 - - ```SQL - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: cannot skip to first row of match - ``` - - * 当 AFTER MATCH SKIP TO B 时,报错。因为不能跳转到匹配分组中不存在的模式变量。 - - ```SQL - Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: pattern variable is not present in match - ``` - - -### 2.6 ROWS PER MATCH 子句 - -用于指定模式匹配成功后结果集的输出方式,主要包括以下两种选项: - -| 输出方式 | 规则描述 | 输出结果 | **空匹配/未匹配行**处理逻辑 | -| -------------------- | ----------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| ONE ROW PER MATCH | 每一次成功匹配,产生一行输出结果。 | * PARTITION BY 子句中的列* MEASURES 子句中定义的表达式。 | 输出空匹配;跳过未匹配行。 | -| ALL ROWS PER MATCH | 每一次匹配中的每一行都将产生一条输出记录,除非该行通过 exclusion 语法排除。 | * PARTITION BY 子句中的列* ORDER BY 子句中的列* MEASURES 子句中定义的表达式* 输入表中的其余列 | * 默认:输出空匹配;跳过未匹配行。* ALL ROWS PER MATCH​**SHOW EMPTY MATCHES**​:默认输出空匹配,跳过未匹配行* ALL ROWS PER MATCH​**OMIT EMPTY MATCHES**​:不输出空匹配,跳过未匹配行* ALL ROWS PER MATCH​**WITH UNMATCHED ROWS**​:输出空匹配,并为每一条未匹配行额外生成一条输出记录| - -### 2.7 MEASURES 子句 - -用于指定从匹配到的一段数据中提取哪些信息。该子句为可选项,如果未显式指定,则根据 ROWS PER MATCH 子句的设置,部分输入列会成为模式识别的输出结果。 - -```SQL -MEASURES measure_expression AS measure_name [, ...] 
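-``` - -* `measure_expression` is a scalar value computed from the matched run of rows. - -| Usage example | Description | -| ---------------------------------------------- | -------------------------------------------------------------------------------------------------------------- | -| `A.totalprice AS starting_price` | Returns the price in the first row of the match (the only row associated with variable A) as the starting price. | -| `RPR_LAST(B.totalprice) AS bottom_price` | Returns the price in the last row associated with variable B. It is the lowest price in a "V"-shaped pattern and corresponds to the end of the descending section. | -| `RPR_LAST(U.totalprice) AS top_price` | Returns the highest price in the match. It corresponds to the last row associated with variable C or D, which is the end of the whole match. (Assumes SUBSET U = (C, D).) | - -* Each `measure_expression` defines one output column, which can be referenced by its `measure_name`. - -### 2.8 Pattern Query Expressions - -The expressions used in the MEASURES and DEFINE clauses are **scalar expressions**, evaluated in the row-level context of the input table. Besides standard SQL syntax, **scalar expressions** also support special extension functions for pattern queries. - -#### 2.8.1 Pattern Variable References - -```SQL -A.totalprice -U.orderdate -orderstatus -``` - -* A column name prefixed with a **primary pattern variable** or a **union pattern variable** refers to that column in all rows matched by the variable. -* A column name without a prefix is equivalent to one prefixed with the **universal union pattern variable** (the union of all primary pattern variables), and refers to that column in all rows of the current match. - -> Table names must not be used as column name prefixes in pattern recognition expressions. - -#### 2.8.2 Extension Functions - -| Function | Syntax | Description | -|------------------| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `MATCH_NUMBER` function | `MATCH_NUMBER()` | Returns the sequence number of the current match within the partition, counting from 1. Empty matches, like non-empty ones, consume a match number. | -| `CLASSIFIER` function | `CLASSIFIER(option)`| 1. Returns the name of the primary pattern variable that the current row is mapped to. 2. `option` is an optional argument: a primary pattern variable, as in `CLASSIFIER(A)`, or a union pattern variable, as in `CLASSIFIER(U)`, can be passed to restrict the scope of the function; for rows outside that scope, NULL is returned. When used with a union pattern variable, it tells which primary pattern variable of the union the row is mapped to. | -| Logical navigation function | `RPR_FIRST(expr, k)` | 1. 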
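Within the **current match**, locates the first row that satisfies expr, then searches toward the end of the match for the k-th subsequent occurrence of a row mapped to the same pattern variable, and returns the specified column value of that row. If no k-th matching row is found in that direction, the function returns NULL. 2. k is optional and defaults to 0, which means only the first qualifying row is located; if given explicitly, it must be a non-negative integer. | -| Logical navigation function | `RPR_LAST(expr, k)`| 1. Within the **current match**, locates the last row that satisfies expr, then searches toward the start of the match for the k-th preceding occurrence of a row mapped to the same pattern variable, and returns the specified column value of that row. If no k-th matching row is found in that direction, the function returns NULL. 2. k is optional and defaults to 0, which means only the last qualifying row is located; if given explicitly, it must be a non-negative integer. | -| Physical navigation function | `PREV(expr, k)` | 1. Starting from the row most recently matched to the given pattern variable, moves k rows toward the start and returns the corresponding column value. If the navigation goes beyond the **partition boundary**, the function returns NULL. 2. k is optional and defaults to 1; if given explicitly, it must be a non-negative integer. | -| Physical navigation function |`NEXT(expr, k)` | 1. Starting from the row most recently matched to the given pattern variable, moves k rows toward the end and returns the corresponding column value. If the navigation goes beyond the **partition boundary**, the function returns NULL. 2. k is optional and defaults to 1; if given explicitly, it must be a non-negative integer. | -| Aggregate functions | COUNT, SUM, AVG, MAX, MIN | Compute over the data in the current match. Aggregate functions and navigation functions cannot be nested in each other. (Supported since V 2.0.6) | -| Nested functions | `PREV/NEXT(CLASSIFIER())` | A physical navigation function wrapping the CLASSIFIER function. Returns the pattern variable of the matched row before or after the current row. | -| Nested functions |`PREV/NEXT(RPR_FIRST/RPR_LAST(expr, k))` | A logical function **may be nested** inside a physical function, but a physical function **may not be nested** inside a logical function. The logical offset is applied first, then the physical offset. | - -**Examples** - -1. CLASSIFIER function - -* Query SQL - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* Analysis - - ![](/img/timeseries-featured-analysis-9.png) - -* Query result - -```SQL -+-----------------------------+-----+-----+---------------+-----+ -| time|match|price|lower_or_higher|label| -+-----------------------------+-----+-----+---------------+-----+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| -+-----------------------------+-----+-----+---------------+-----+ 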
-Total line number = 6 -``` - -2. Logical navigation functions - -* Query SQL - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- MEASURES clause - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` - -* Query results - * When the measure is totalprice, RPR\_LAST(totalprice), or RUNNING RPR\_LAST(totalprice) - - ![](/img/timeseries-featured-analysis-10.png) - - Actual result - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the measure is FINAL RPR\_LAST(totalprice) - - ![](/img/timeseries-featured-analysis-11.png) - - Actual result - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the measure is RPR\_FIRST(totalprice), RUNNING RPR\_FIRST(totalprice), or FINAL RPR\_FIRST(totalprice) - - ![](/img/timeseries-featured-analysis-12.png) - - Actual result - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 90| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 90| - |2025-01-01T00:05:00.000+08:00| 90| - |2025-01-01T00:06:00.000+08:00| 90| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the measure is RPR\_LAST(totalprice, 2) - - ![](/img/timeseries-featured-analysis-13.png) - - Actual result - - ```SQL - 
+-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| null| - |2025-01-01T00:02:00.000+08:00| null| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the measure is FINAL RPR\_LAST(totalprice, 2) - - ![](/img/timeseries-featured-analysis-14.png) - - Actual result - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * When the measure is RPR\_FIRST(totalprice, 2) or FINAL RPR\_FIRST(totalprice, 2) - - ![](/img/timeseries-featured-analysis-15.png) - - Actual result - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 70| - |2025-01-01T00:02:00.000+08:00| 70| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 6 - ``` - -3. 
Physical navigation functions - -* Query SQL - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- MEASURES clause - ALL ROWS PER MATCH - PATTERN (B) - DEFINE B AS B.totalprice >= PREV(B.totalprice) -) AS m; -``` - -* Query results - * When the measure is `PREV(totalprice)` - - ![](/img/timeseries-featured-analysis-16.png) - - Actual result - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the measure is `PREV(B.totalprice, 2)` - - ![](/img/timeseries-featured-analysis-17.png) - - Actual result - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the measure is `PREV(B.totalprice, 4)` - - ![](/img/timeseries-featured-analysis-18.png) - - Actual result - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| null| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the measure is `NEXT(totalprice)` or `NEXT(B.totalprice, 1)` - - ![](/img/timeseries-featured-analysis-19.png) - - Actual result - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| null| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * When the measure is `NEXT(B.totalprice, 2)` - - ![](/img/timeseries-featured-analysis-20.png) - - Actual result - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| null| - 
+-----------------------------+-------+ - Total line number = 2 - ``` - -4. Aggregate functions - -* Query SQL - -```SQL -SELECT m.time, m.count, m.avg, m.sum, m.min, m.max -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - COUNT(*) AS count, - AVG(totalprice) AS avg, - SUM(totalprice) AS sum, - MIN(totalprice) AS min, - MAX(totalprice) AS max - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* Analysis (using MIN(totalprice) as an example) - -![](/img/timeseries-featured-analysis-21.png) - -* Query result - -```SQL -+-----------------------------+-----+-----------------+-----+---+---+ -| time|count| avg| sum|min|max| -+-----------------------------+-----+-----------------+-----+---+---+ -|2025-01-01T00:01:00.000+08:00| 1| 90.0| 90.0| 90| 90| -|2025-01-01T00:02:00.000+08:00| 2| 85.0|170.0| 80| 90| -|2025-01-01T00:03:00.000+08:00| 3| 80.0|240.0| 70| 90| -|2025-01-01T00:04:00.000+08:00| 4| 80.0|320.0| 70| 90| -|2025-01-01T00:05:00.000+08:00| 5| 78.0|390.0| 70| 90| -|2025-01-01T00:06:00.000+08:00| 6|78.33333333333333|470.0| 70| 90| -+-----------------------------+-----+-----------------+-----+---+---+ -Total line number = 6 -``` - -5. 
Nested functions - -Example 1 - -* Query SQL - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label, m.prev_label, m.next_label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label, - PREV(CLASSIFIER(W)) AS prev_label, - NEXT(CLASSIFIER(W)) AS next_label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* Analysis - -![](/img/timeseries-featured-analysis-22.png) - -* Query result - -```SQL -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -| time|match|price|lower_or_higher|label|prev_label|next_label| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| null| A| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| H| null| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| null| A| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| L| null| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| null| A| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| L| null| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -Total line number = 6 -``` - -Example 2 - -* Query SQL - -```SQL -SELECT m.time, m.prev_last_price, m.next_first_price -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - PREV(RPR_LAST(totalprice), 2) AS prev_last_price, - NEXT(RPR_FIRST(totalprice), 2) as next_first_price - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* Analysis - -![](/img/timeseries-featured-analysis-23.png) - -* Query result - -```SQL -+-----------------------------+---------------+----------------+ -| time|prev_last_price|next_first_price| -+-----------------------------+---------------+----------------+ -|2025-01-01T00:01:00.000+08:00| null| 70| -|2025-01-01T00:02:00.000+08:00| null| 70| 
-|2025-01-01T00:03:00.000+08:00| 90| 70| -|2025-01-01T00:04:00.000+08:00| 80| 70| -|2025-01-01T00:05:00.000+08:00| 70| 70| -|2025-01-01T00:06:00.000+08:00| 80| 70| -+-----------------------------+---------------+----------------+ -Total line number = 6 -``` - -#### 2.8.3 RUNNING and FINAL Semantics -1. Definitions - -* `RUNNING`: The computation covers the current match from its first row up to the row currently being processed (that is, up to and including the current row). -* `FINAL`: The computation covers the current match from its first row up to its last row (that is, the whole match). - -2. Scope - -* The DEFINE clause uses RUNNING semantics by default. -* The MEASURES clause uses RUNNING semantics by default and also supports FINAL semantics. With the ONE ROW PER MATCH output mode, every expression is evaluated from the last row of the match, so RUNNING and FINAL semantics are equivalent. - -3. Syntax constraints - -* RUNNING and FINAL must be written before a **logical navigation function** or an aggregate function; they cannot be applied directly to a **column reference**. - * Valid: `RUNNING RPR_LAST(A.totalprice)`, `FINAL RPR_LAST(A.totalprice)` - * Invalid: `RUNNING A.totalprice`, `FINAL A.totalprice`, `RUNNING PREV(A.totalprice)` - -## 3. Scenario Examples - -The examples below use the [sample data](../Reference/Sample-Data.md) as the source data. - -### 3.1 Segmenting by Time Interval - -Split the data in table1 into segments in which consecutive rows are no more than 24 hours apart, and query the row count, start time, and end time of each segment. - -Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table1 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (cast(B.time as INT64) - cast(PREV(B.time) as INT64)) <= 86400000 -) AS m -``` - -Query result - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-26T13:38:00.000+08:00| 2| -|2024-11-27T16:38:00.000+08:00|2024-11-30T14:30:00.000+08:00| 16| -+-----------------------------+-----------------------------+---+ -Total line number = 2 -``` - -### 3.2 Segmenting by Value Difference - -Split the data in table2 into segments in which the humidity of each row exceeds that of the previous row by no more than 0.1, and query the row count, start time, and end time of each segment. - -* Query SQL - -```SQL -SELECT start_time, end_time, cnt -FROM table2 -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - RPR_FIRST(A.time) AS start_time, - RPR_LAST(time) AS end_time, - COUNT() AS cnt - PATTERN (A B*) - DEFINE B AS (B.humidity - PREV(B.humidity )) <=0.1 -) AS m; 
-``` - -* 查询结果 - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-27T00:00:00.000+08:00| 2| -|2024-11-28T08:00:00.000+08:00|2024-11-29T00:00:00.000+08:00| 2| -|2024-11-29T11:00:00.000+08:00|2024-11-30T00:00:00.000+08:00| 2| -+-----------------------------+-----------------------------+---+ -Total line number = 3 -``` - -### 3.3 事件统计查询 - -将 table1 中数据按照设备号分组,统计上海地区湿度大于 35 的开始、结束时间及最大湿度值。 - -* 查询sql - -```SQL -SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity -FROM table1 -MATCH_RECOGNIZE ( - PARTITION BY device_id - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RPR_FIRST(A.time) AS event_start, - RPR_LAST(A.time) AS event_end, - MAX(A.humidity) AS max_humidity - ONE ROW PER MATCH - PATTERN (A+) - DEFINE - A AS A.region= '上海' AND A.humidity> 35 -) AS m -``` - -* 查询结果 - -```SQL -+---------+-----+-----------------------------+-----------------------------+------------+ -|device_id|match| event_start| event_end|max_humidity| -+---------+-----+-----------------------------+-----------------------------+------------+ -| 100| 1|2024-11-28T09:00:00.000+08:00|2024-11-29T18:30:00.000+08:00| 45.1| -| 101| 1|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 35.2| -+---------+-----+-----------------------------+-----------------------------+------------+ -Total line number = 2 -``` - -## 4. 
实际案例 - -### 4.1海拔高度监测 - -* **业务背景** - -石油运输车辆在油品运输过程中,海拔高度会直接影响环境气压:海拔越高,气压越低,油品挥发风险越高。为精准评估油品自然损耗情况,需通过北斗定位数据识别海拔异常事件,为损耗评估提供数据支撑。 - -* **数据结构** - -监测数据表包含以下核心字段: - -| **ColumnName** | DataType | Category | Comment | -| ---------------------- | ----------- | ---------- | ------------------------ | -| time | TIMESTAMP | TIME | 数据采集时间 | -| device\_id | STRING | TAG | 车辆设备编号(分区键) | -| department | STRING | FIELD | 所属部门 | -| altitude | DOUBLE | FIELD | 海拔高度(单位:米) | - -* **业务需求** - -识别运输车辆的海拔异常事件:当车辆海拔高度超过 500 米,后续又降至 500 米以下时,视为一个完整的异常事件。需计算每个事件的核心指标: - -* 事件起始时间(海拔首次超过 500 米的时间); -* 事件结束时间(海拔最后一次高于 500 米的时间); -* 事件期间该车辆的最大海拔值。 - -![](/img/pattern-query-altitude.png) - -* **实现方法** - -```SQL -SELECT * -FROM beidou -MATCH_RECOGNIZE ( - PARTITION BY device_id -- 按车辆设备分区 - ORDER BY time -- 按时间排序 - MEASURES - FIRST(A.time) AS ts_s, -- 事件起始时间 - LAST(A.time) AS ts_e, -- 事件结束时间 - MAX(A.altitude) AS max_a -- 事件最大海拔 - PATTERN (A+) -- 匹配连续的海拔超500米的记录 - DEFINE - A AS A.altitude > 500 -- 定义A为海拔高于500米的记录 -) -``` - -### 4.2 安全注入操作识别 - -* **业务背景** - -核电站需定期执行安全检测试验(如 PT1RPA010《用 1 RPA 601KC 进行安全注入逻辑试验》),以验证发电设备无损伤。该类试验会导致水管流量呈现特征性变化,中控系统需识别该流量模式,及时汇报异常行为,保障设备安全。 - -* **数据结构** - -传感器数据表包含以下核心字段: - -| **ColumnName** | DataType | Category | Comment | -| ---------------------- | ----------- | ---------- | ------------------------ | -| time | TIMESTAMP | TIME | 数据采集时间 | -| pipe\_id | STRING | TAG | 水管编号(分区键) | -| pressure | DOUBLE | FIELD | 水管压力 | -| flow\_rate | DOUBLE | FIELD | 水管流量(核心监测值) | - -* **业务需求** - -识别 PT1RPA010 试验对应的流量特征模式:正常流量→持续下降→极低流量(<0.5)→持续回升→恢复正常流量。需提取该模式的核心指标: - -* 模式整体起始时间(初始正常流量的时间); -* 模式整体终止时间(恢复正常流量的时间); -* 极低流量阶段的起始 / 结束时间; -* 极低流量阶段的最小流量值。 - -![](/img/pattern-query-flow.png) - -* **实现方法** - -```SQL -SELECT * FROM sensor MATCH_RECOGNIZE( - PARTITION BY pipe_id -- 按水管编号分区 - ORDER BY time -- 按时间排序 - MEASURES - A.time AS start_ts, -- 模式整体起始时间 - E.time AS end_ts, -- 模式整体终止时间 - FIRST(C.time) AS low_start_ts, -- 极低流量起始时间 - LAST(C.time) AS low_end_ts, -- 极低流量结束时间 - 
        MIN(C.flow_rate) AS min_low_flow -- 极低流量最小值
-    ONE ROW PER MATCH -- 每个匹配模式仅输出 1 行结果
-    PATTERN(A B+? C+ D+? E) -- 匹配正常→下降→极低→回升→正常的流量模式
-    DEFINE
-        A AS flow_rate BETWEEN 2 AND 2.5, -- 初始正常流量
-        B AS flow_rate < PREV(B.flow_rate), -- 流量持续下降
-        C AS flow_rate < 0.5, -- 极低流量阈值
-        D AS flow_rate > PREV(D.flow_rate), -- 流量持续回升
-        E AS flow_rate BETWEEN 2 AND 2.5 -- 恢复正常流量
-);
-```
-
-### 4.3 极端运行阵风(草帽风)识别
-
-* **业务背景**
-
-风力发电场景中,“极端运行阵风(草帽风)”是一种短时间(约 10 秒)、波峰显著的正弦形阵风,这类阵风会对风机造成物理损伤。识别该类阵风并统计发生频率,可有效评估风机受损风险,指导设备维护。
-
-* **数据结构**
-
-风机传感器数据表核心字段:
-
-| **ColumnName** | DataType | Category | Comment |
-| ---------------------- | ----------- | ---------- | ------------------------ |
-| time | TIMESTAMP | TIME | 风速采集时间 |
-| speed | DOUBLE | FIELD | 风机处风速(核心指标) |
-
-* **业务需求**
-
-识别“草帽风”的特征模式:风力缓慢下降→急剧增加→急剧减少→缓慢增加至初始值(全程约 10 秒)。核心目标是统计该类阵风的发生次数,为风机风险评估提供依据。
-
-![](/img/pattern-query-speed.png)
-
-* **实现方法**
-
-```SQL
-SELECT COUNT(*) -- 统计极端阵风发生次数
-FROM sensor
-MATCH_RECOGNIZE(
-    ORDER BY time -- 按时间排序
-    MEASURES
-        FIRST(B.time) AS ts_s, -- 阵风起始时间
-        LAST(D.time) AS ts_e -- 阵风结束时间
-    PATTERN (B+ R+? F+? D+? E) -- 匹配草帽风的风速变化模式
-    DEFINE
-        -- B阶段:风速缓慢下降,初始风速>9,首尾风速差<2.5
-        B AS speed <= AVG(B.speed)
-            AND FIRST(B.speed) > 9
-            AND (FIRST(B.speed) - LAST(B.speed)) < 2.5,
-        -- R阶段:风速急剧增加(高于阶段平均风速)
-        R AS speed >= AVG(R.speed),
-        -- F阶段:风速急剧减少,阶段最大风速>16(波峰阈值)
-        F AS speed <= AVG(F.speed)
-            AND MAX(F.speed) > 16,
-        -- D阶段:风速缓慢增加,首尾风速差<2.5
-        D AS speed >= AVG(D.speed)
-            AND (LAST(D.speed) - FIRST(D.speed)) < 2.5,
-        -- E阶段:风速恢复至初始值±0.2,全程时长<11秒
-        E AS speed - FIRST(B.speed) BETWEEN -0.2 AND 0.2
-            AND time - FIRST(B.time) < 11
-);
-```
-
-
-
diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Tiered-Storage_timecho.md b/src/zh/UserGuide/latest-Table/User-Manual/Tiered-Storage_timecho.md
deleted file mode 100644
index 86c183dc3..000000000
--- a/src/zh/UserGuide/latest-Table/User-Manual/Tiered-Storage_timecho.md
+++ /dev/null
@@ -1,101 +0,0 @@
-
-
-# 多级存储
-## 1. 
概述 - -多级存储功能向用户提供多种存储介质管理的能力,用户可以使用多级存储功能为 IoTDB 配置不同类型的存储介质,并为存储介质进行分级。具体的,在 IoTDB 中,多级存储的配置体现为多目录的管理。用户可以将多个存储目录归为同一类,作为一个“层级”向 IoTDB 中配置,这种“层级”我们称之为 storage tier;同时,用户可以根据数据的冷热进行分类,并将不同类别的数据存储到指定的“层级”中。当前 IoTDB 支持通过数据的 TTL 进行冷热数据的分类,当一个层级中的数据不满足当前层级定义的 TTL 规则时,该数据会被自动迁移至下一层级中。 - -## 2. 参数定义 - -在 IoTDB 中开启多级存储,需要进行以下几个方面的配置: - -1. 配置数据目录,并将数据目录分为不同的层级 -2. 配置每个层级所管理的数据的 TTL,以区分不同层级管理的冷热数据类别。 -3. 配置每个层级的最小剩余存储空间比例,当该层级的存储空间触发该阈值时,该层级的数据会被自动迁移至下一层级(可选)。 - -具体的参数定义及其描述如下。 - -| 配置项 | 默认值 | 是否必填 | 说明 | 约束 | -| --------------------------------------- | ------------------------ | --- | ------------------------------------------------------------ | ------------------------------------------------------------ | -| dn_data_dirs | data/datanode/data | 是 | 用来指定不同的存储目录,并将存储目录进行层级划分 | 每级存储使用分号分隔,单级内使用逗号分隔;云端配置只能作为最后一级存储且第一级不能作为云端存储;最多配置一个云端对象;远端存储目录使用 OBJECT_STORAGE 来表示 | -| tier_ttl_in_ms | -1 | 是 | 定义每个层级负责的数据范围,通过 TTL 表示 | 每级存储使用分号分隔;层级数量需与 dn_data_dirs 定义的层级数一致;"-1" 表示"无限制" | -| dn_default_space_usage_thresholds | 0.85 | 是 | 定义每个层级数据目录的最大使用空间比例;当使用空间大于该比例时,数据会被自动迁移至下一个层级;当最后一个层级的使用存储空间大于此阈值时,会将系统置为 READ_ONLY | 每级存储使用分号分隔;层级数量需与 dn_data_dirs 定义的层级数一致 | -| object_storage_type | AWS_S3 | 使用远端存储时必填 | 云端存储类型 | IoTDB 支持 S3 协议作为远端存储类型 | -| object_storage_bucket | iotdb_data | 使用远端存储时必填 | 云端存储 bucket 的名称 | AWS S3 中的 bucket 定义 | -| object_storage_region | | 使用远端存储时必填 | 云端存储的服务区域 | AWS S3 中的 region 定义 | -| object_storage_endpoint | | 使用远端存储时必填 | 云端存储的 endpoint | AWS S3 的 endpoint | -| object_storage_access_key | | 使用远端存储时必填 | 云端存储的验证信息 key | AWS S3 的 credential key | -| object_storage_access_secret | | 使用远端存储时必填 | 云端存储的验证信息 secret | AWS S3 的 credential secret | -| enable_path_style_access | false | 否 | 是否启用云端存储服务路径访问 | | -| remote_tsfile_cache_dirs | data/datanode/data/cache | 否 | 云端存储在本地的缓存目录 | | -| remote_tsfile_cache_page_size_in_kb | 20480 | 否 | 云端存储在本地缓存文件的块大小 | | -| remote_tsfile_cache_max_disk_usage_in_mb | 51200 | 否 | 云端存储本地缓存的最大磁盘占用大小 | | - - -## 3. 
本地多级存储配置示例
-
-以下是本地两级存储的配置示例。
-
-```JavaScript
-// 必须配置项
-dn_data_dirs=/data1/data;/data2/data,/data3/data
-tier_ttl_in_ms=86400000;-1
-dn_default_space_usage_thresholds=0.2;0.1
-```
-
-在该示例中,共配置了两个层级的存储,具体为:
-
-| **层级** | **数据目录** | **数据范围** | **磁盘最小剩余空间阈值** |
-| -------- | -------------------------------------- | --------------- | ------------------------ |
-| 层级一 | 目录一:/data1/data | 最近 1 天的数据 | 20% |
-| 层级二 | 目录一:/data2/data<br>目录二:/data3/data | 1 天以前的数据 | 10% |
-
-## 4. 远端多级存储配置示例
-
-以下以三级存储为例:
-
-```JavaScript
-// 必须配置项
-dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE
-tier_ttl_in_ms=86400000;864000000;-1
-dn_default_space_usage_thresholds=0.2;0.15;0.1
-object_storage_type=AWS_S3
-object_storage_bucket=iotdb
-object_storage_region=
-object_storage_endpoint=
-object_storage_access_key=
-object_storage_access_secret=
-
-// 可选配置项
-enable_path_style_access=false
-remote_tsfile_cache_dirs=data/datanode/data/cache
-remote_tsfile_cache_page_size_in_kb=20480
-remote_tsfile_cache_max_disk_usage_in_mb=51200
-```
-
-在该示例中,共配置了三个层级的存储,具体为:
-
-| **层级** | **数据目录** | **数据范围** | **磁盘最小剩余空间阈值** |
-| -------- | -------------------------------------- | ---------------------------- | ------------------------ |
-| 层级一 | 目录一:/data1/data | 最近 1 天的数据 | 20% |
-| 层级二 | 目录一:/data2/data<br>目录二:/data3/data | 过去 1 天至过去 10 天内的数据 | 15% |
-| 层级三 | 远端 S3 协议存储 | 过去 10 天以前的数据 | 10% |
\ No newline at end of file
diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Timeseries-Featured-Analysis_timecho.md b/src/zh/UserGuide/latest-Table/User-Manual/Timeseries-Featured-Analysis_timecho.md
deleted file mode 100644
index a62e9bc39..000000000
--- a/src/zh/UserGuide/latest-Table/User-Manual/Timeseries-Featured-Analysis_timecho.md
+++ /dev/null
@@ -1,1721 +0,0 @@
-
-
-# 时序特色分析
-
-IoTDB 针对时序数据的特色分析场景,提供了模式查询与窗口函数两大核心能力,为时序数据的深度挖掘与复杂计算提供了灵活高效的解决方案。下文将对两大功能进行详细的介绍。
-
-## 1. 
模式查询 - -### 1.1 概述 - -模式查询支持通过定义模式变量的识别逻辑以及正则表达式来捕获一段连续的数据,并对每一段捕获的数据进行分析计算,适用于识别时序数据中的特定模式(如下图所示)、检测特定事件等业务场景。 - -![](/img/timeseries-featured-analysis-1.png) - -> 注意:该功能从 V 2.0.5 版本开始提供。 - -### 1.2 功能介绍 -#### 1.2.1 语法格式 - -```SQL -MATCH_RECOGNIZE ( - [ PARTITION BY column [, ...] ] - [ ORDER BY column [, ...] ] - [ MEASURES measure_definition [, ...] ] - [ ROWS PER MATCH ] - [ AFTER MATCH skip_to ] - PATTERN ( row_pattern ) - [ SUBSET subset_definition [, ...] ] - DEFINE variable_definition [, ...] -) -``` - -**说明:** - -* PARTITION BY : 可选,用于对输入表进行分组,每个分组能独立进行模式匹配。如果未声明该子句,则整个输入表将作为一个整体进行处理。 -* ORDER BY :可选,用于确保输入数据按某种顺序进行匹配处理。 -* MEASURES :可选,用于指定从匹配到的一段数据中提取哪些信息。 -* ROWS PER MATCH :可选,用于指定模式匹配成功后结果集的输出方式。 -* AFTER MATCH SKIP :可选,用于指定在识别到一个非空匹配后,下一次模式匹配应从哪一行继续进行。 -* PATTERN :用于定义需要匹配的行模式。 -* SUBSET :可选,用于将多个基本模式变量所匹配的行合并为一个逻辑集合。 -* DEFINE :用于定义行模式的基本模式变量。 - -**语法示例原始数据:** - -```SQL -IoTDB:database3> select * from t -+-----------------------------+------+----------+ -| time|device|totalprice| -+-----------------------------+------+----------+ -|2025-01-01T00:01:00.000+08:00| d1| 90| -|2025-01-01T00:02:00.000+08:00| d1| 80| -|2025-01-01T00:03:00.000+08:00| d1| 70| -|2025-01-01T00:04:00.000+08:00| d1| 80| -|2025-01-01T00:05:00.000+08:00| d1| 70| -|2025-01-01T00:06:00.000+08:00| d1| 80| -+-----------------------------+------+----------+ - --- 创建语句 -create table t(device tag, totalprice int32 field) - -insert into t(time,device,totalprice) values(2025-01-01T00:01:00, 'd1', 90),(2025-01-01T00:02:00, 'd1', 80),(2025-01-01T00:03:00, 'd1', 70),(2025-01-01T00:04:00, 'd1', 80),(2025-01-01T00:05:00, 'd1', 70),(2025-01-01T00:06:00, 'd1', 80) -``` - -#### 1.2.2 DEFINE 子句 - -用于为模式识别中的每个基本模式变量指定其判断条件。这些变量通常由标识符(如 `A`, `B`)代表,并通过该子句中的布尔表达式精确定义哪些行符合该变量的要求。 - -* 在模式匹配执行过程中,仅当布尔表达式返回 TRUE 时,才会将当前行标记为该变量,从而将其纳入到当前匹配分组中。 - -```SQL --- 只有在当前行的 totalprice 值小于前一行 totalprice 值的情况下,当前行才可以被识别为 B。 -DEFINE B AS totalprice < PREV(totalprice) -``` - -* 
**未**在子句中**显式**定义的变量,其匹配条件隐含为恒真(TRUE),即可在任何输入行上成功匹配。 - -#### 1.2.3 SUBSET 子句 - -用于将多个基本模式变量(如 `A`、`B`)匹配到的行合并成一个联合模式变量(如 `U`),使这些行可以被视为同一个逻辑集合进行操作。可用于`MEASURES`、`DEFINE `和`AFTER MATCH SKIP`子句。 - -```SQL -SUBSET U = (A, B) -``` - -例如,对于模式 `PATTERN ((A | B){5} C+)` ,在匹配过程中无法确定第五次重复时具体匹配的是基本模式变量 A 还是 B,因此 - -1. 在 `MEASURES `子句中,若需要引用该阶段最后一次匹配到的行,则可通过定义联合模式变量 `SUBSET U = (A, B)`实现。此时表达式 `RPR_LAST(U.totalprice)` 将直接返回该目标行的 `totalprice` 值。 -2. 在 `AFTER MATCH SKIP` 子句中,若匹配结果中未包含基本模式变量 A 或 B 时,执行 `AFTER MATCH SKIP TO LAST B` 或 `AFTER MATCH SKIP TO LAST A` 会因锚点缺失跳转失败;而通过引入联合模式变量 `SUBSET U = (A, B)`,使用 `AFTER MATCH SKIP TO LAST U` 则始终有效。 - -#### 1.2.4 PATTERN 子句 - -用于定义需要匹配的行模式,其基本构成单元是**基本模式变量。** - -```SQL -PATTERN ( row_pattern ) -``` - -##### 1.2.4.1 模式种类 - -| 行模式 | 语法格式 | 描述 | -| ----------------------------------- |---------------------| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| 模式连接(Pattern Concatenation) | `A B+ C+ D+` | 由不带任何运算符的子模式组成,按声明顺序依次匹配所有子模式。| -| 模式选择(Pattern Alternation) | `A \| B \| C` | 由以`\|`分隔的多个子模式组成,仅匹配其中一个。当多个子模式均可匹配时,选择最左侧的子模式匹配。 | -| 模式排列(Pattern Permutation) | `PERMUTE(A, B, C)` | 该模式等价于对所有子模式元素的不同顺序进行选择匹配,即要求 A、B、C 三者均须匹配,但其出现顺序不固定。当多种匹配顺序均可成功时,依据 PERMUTE 列表中元素的定义先后顺序,按**字典序原则**确定优先级。例如,A B C 为最高优先,C B A 则为最低优先。 | -| 模式分组(Pattern Grouping) | `(A B C)` | 用圆括号将子模式括起,视作一个整体对待,可与其他运算符配合使用。如`(A B C)+`表示连续出现一组`(A B C)`的模式。 | -| 空模式(Empty Pattern) | `()` | 表示一个不包含任何行的空匹配 | -| 模式排除(Pattern Exclusion) | `{- row_pattern -}` | 用于指定在输出中需要排除的匹配部分。通常与`ALL ROWS PER MATCH`选项结合使用,用于输出感兴趣的行。如`PATTERN (A {- B+ C+ -} D+)`,并使用`ALL ROWS PER MATCH`时,输出将仅包含匹配的首行(`A`对应行)与尾部行(`D+`对应行)。 | - -##### 1.2.4.2 分区起始/结束锚点(Partition Start/End Anchor) - -* `^A` 表示匹配以 A 为分区开始的模式 - * 当 PATTERN 子句的取值为 `^A` 时,要求匹配必须从分区的首行开始,且这一行要满足 `A` 的定义 - * 当 PATTERN 子句的取值为 `^A^` 或 `A^` 
时,输出结果为空 -* `A$` 表示匹配以 A 为分区结束的模式 - * 当 PATTERN 子句的取值为 `A$` 时,要求必须在分区的结束位置匹配,并且这一行要满足 `A`的定义 - * 当 PATTERN 子句的取值为 `$A` 或 `$A$` 时,输出结果为空 - -**示例说明** - -* 查询 sql - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - AFTER MATCH SKIP PAST LAST ROW - PATTERN %s -- PATTERN 子句 - DEFINE A AS true -) AS m; -``` - -* 查询结果 - * 当 PATTERN 子句为 PATTERN (^A) 时 - - ![](/img/timeseries-featured-analysis-2.png) - - 实际返回 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * 当 PATTERN 子句为 PATTERN (^A^) 时,输出的结果为空,因为不可能从分区的起始位置开始匹配了一个 A 之后,又回到分区的起始位置 - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. - ``` - - * 当 PATTERN 子句为 PATTERN (A\$) 时 - - ![](/img/timeseries-featured-analysis-3.png) - - 实际返回 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:06:00.000+08:00| 1| 80| A| - +-----------------------------+-----+-----+-----+ - Total line number = 1 - ``` - - * 当 PATTERN 子句为 PATTERN (\$A\$) 时,输出的结果为空 - - ```SQL - +----+-----+-----+-----+ - |time|match|price|label| - +----+-----+-----+-----+ - +----+-----+-----+-----+ - Empty set. 
- ``` - - -##### 1.2.4.3 量词(Quantifiers) - -量词用于指定子模式重复出现的次数,置于相应子模式之后,如 `(A | B)*`。 - -常用量词如下: - -| 量词 | 描述 | -| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `*` | 零次或多次重复 | -| `+` | 一次或多次重复 | -| `?` | 零次或一次重复 | -| `{n}` | 恰好重复 n 次 | -| `{m, n}` | 重复次数在 m 到 n 之间(m、n 为非负整数)。* 若省略左界,则默认从 0 开始;* 若省略右界,则重复次数不设上限(如 {5,} 等同于“至少重复五次”);* 若同时省略左右界,即 {,},则与 \* 等价。 | - -* 可通过在量词后加 `?` 改变匹配偏好。 - * `{3,5}`:偏好 5 次,最不偏好 3 次;`{3,5}?`:偏好 3 次,最不偏好 5 次 - * `?`:偏好 1 次;`??`:偏好 0 次 - -#### 1.2.5 AFTER MATCH SKIP 子句 - -用于指定在识别到一个非空匹配后,下一次模式匹配应从哪一行继续进行。 - -| 跳转策略 | 描述 | 是否允许识别重叠匹配项 | -| ------------------------------------------------------------- | --------------------------------------------------- | ------------------------ | -| `AFTER MATCH SKIP PAST LAST ROW` | 默认行为。在当前匹配的最后一行之后的下一行开始。 | 否 | -| `AFTER MATCH SKIP TO NEXT ROW` | 在当前匹配中的第二行开始。 | 是 | -| `AFTER MATCH SKIP TO [ FIRST \| LAST ] pattern_variable` | 跳转到某个模式变量的 [ 第一行 \| 最后一行 ] 开始。 | 是 | - -* 在所有可能的配置中,仅当 `ALL ROWS PER MATCH WITH UNMATCHED ROWS` 与 `AFTER MATCH SKIP PAST LAST ROW` 联合使用时,系统才能确保对每个输入行恰好生成一条输出记录。 - -**示例说明** - -* 查询 sql - -```SQL -SELECT m.time, m.match, m.price, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER() AS label - ALL ROWS PER MATCH - %s -- AFTER MATCH SKIP 子句 - PATTERN (A B+ C+ D?) 
- SUBSET U = (C, D) - DEFINE - B AS B.totalprice < PREV (B.totalprice), - C AS C.totalprice > PREV (C.totalprice), - D AS false -- 永远不会匹配成功 -) AS m; -``` - -* 查询结果 - * 当 AFTER MATCH SKIP PAST LAST ROW 时 - - ![](/img/timeseries-featured-analysis-4.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:根据 `AFTER MATCH SKIP PAST LAST ROW` 语义,从第 5 行开始,无法再找寻到一个合法匹配 - * 此模式一定不会出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 4 - ``` - - * 当 AFTER MATCH SKIP TO NEXT ROW 时 - - ![](/img/timeseries-featured-analysis-5.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:根据 `AFTER MATCH SKIP TO NEXT ROW` 语义,从第 2 行开始,匹配:第 2、3、4 行 - * 第三次匹配:尝试从第 3 行开始,失败 - * 第三次匹配:尝试从第 4 行开始,成功,匹配第 4、5、6行 - * 此模式允许出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:02:00.000+08:00| 2| 80| A| - |2025-01-01T00:03:00.000+08:00| 2| 70| B| - |2025-01-01T00:04:00.000+08:00| 2| 80| C| - |2025-01-01T00:04:00.000+08:00| 3| 80| A| - |2025-01-01T00:05:00.000+08:00| 3| 70| B| - |2025-01-01T00:06:00.000+08:00| 3| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 10 - ``` - - * 当 AFTER MATCH SKIP TO FIRST C 时 - - ![](/img/timeseries-featured-analysis-6.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:从第一个 C (也就是第 4 行)处开始,匹配第4、5、6行 - * 此模式允许出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - 
+-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * 当 AFTER MATCH SKIP TO LAST B 或 AFTER MATCH SKIP TO B 时 - - ![](/img/timeseries-featured-analysis-7.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:尝试从最后一个 B (也就是第 3 行)处开始,失败 - * 第二次匹配:尝试从第 4 行开始,成功匹配第4、5、6行 - * 此模式允许出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * 当 AFTER MATCH SKIP TO U 时 - - ![](/img/timeseries-featured-analysis-8.png) - - * - * 第一次匹配:第 1、2、3、4 行 - * 第二次匹配:`SKIP TO U` 表示跳转到最后一个 C 或 D,D 永远不可能匹配成功,所以就是跳转到最后一个 C(也就是第 4 行),成功匹配第4、5、6行 - * 此模式允许出现重叠匹配 - - ```SQL - +-----------------------------+-----+-----+-----+ - | time|match|price|label| - +-----------------------------+-----+-----+-----+ - |2025-01-01T00:01:00.000+08:00| 1| 90| A| - |2025-01-01T00:02:00.000+08:00| 1| 80| B| - |2025-01-01T00:03:00.000+08:00| 1| 70| B| - |2025-01-01T00:04:00.000+08:00| 1| 80| C| - |2025-01-01T00:04:00.000+08:00| 2| 80| A| - |2025-01-01T00:05:00.000+08:00| 2| 70| B| - |2025-01-01T00:06:00.000+08:00| 2| 80| C| - +-----------------------------+-----+-----+-----+ - Total line number = 7 - ``` - - * 当 AFTER MATCH SKIP TO A 时,报错。因为不能跳转到匹配的第一行, 
否则会造成死循环。
-
-    ```SQL
-    Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: cannot skip to first row of match
-    ```
-
-  * 当 AFTER MATCH SKIP TO D 时,报错。因为 D 永远不会匹配成功,而不能跳转到匹配分组中不存在的模式变量。
-
-    ```SQL
-    Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 701: AFTER MATCH SKIP TO failed: pattern variable is not present in match
-    ```
-
-
-#### 1.2.6 ROWS PER MATCH 子句
-
-用于指定模式匹配成功后结果集的输出方式,主要包括以下两种选项:
-
-| 输出方式 | 规则描述 | 输出结果 | **空匹配/未匹配行**处理逻辑 |
-| -------------------- | -------------------- | -------------------- | -------------------- |
-| ONE ROW PER MATCH | 每一次成功匹配,产生一行输出结果。 | PARTITION BY 子句中的列<br>MEASURES 子句中定义的表达式 | 输出空匹配;跳过未匹配行。 |
-| ALL ROWS PER MATCH | 每一次匹配中的每一行都将产生一条输出记录,除非该行通过 exclusion 语法排除。 | PARTITION BY 子句中的列<br>ORDER BY 子句中的列<br>MEASURES 子句中定义的表达式<br>输入表中的其余列 | 默认:输出空匹配,跳过未匹配行。<br>ALL ROWS PER MATCH **SHOW EMPTY MATCHES**:输出空匹配,跳过未匹配行(与默认行为相同)。<br>ALL ROWS PER MATCH **OMIT EMPTY MATCHES**:不输出空匹配,跳过未匹配行。<br>ALL ROWS PER MATCH **WITH UNMATCHED ROWS**:输出空匹配,并为每一条未匹配行额外生成一条输出记录。 |
-
-#### 1.2.7 MEASURES 子句
-
-用于指定从匹配到的一段数据中提取哪些信息。该子句为可选项,如果未显式指定,则根据 ROWS PER MATCH 子句的设置,部分输入列会成为模式识别的输出结果。
-
-```SQL
-MEASURES measure_expression AS measure_name [, ...]
-``` - -* `measure_expression` 是根据匹配的一段数据计算出的标量值。 - -| 用法示例 | 说明 | -| ---------------------------------------------- | -------------------------------------------------------------------------------------------------------------- | -| `A.totalprice AS starting_price` | 返回匹配分组中第一行(即与变量 A 关联的唯一一行)中的价格,作为起始价格。 | -| `RPR_LAST(B.totalprice) AS bottom_price` | 返回与变量 B 关联的最后一行中的价格,代表“V”形模式中最低点的价格,对应下降区段的末尾。 | -| `RPR_LAST(U.totalprice) AS top_price` | 返回匹配分组中的最高价格,对应变量 C 或 D 所关联的最后一行,即整个匹配分组的末尾。【假设 SUBSET U = (C, D)】 | - -* 每个 `measure_expression `都会定义一个输出列,该列可通过其指定的 `measure_name `进行引用。 - -#### 1.2.8 模式查询表达式 - -在 MEASURES 与 DEFINE 子句中使用的表达式为​**标量表达式**​,用于在输入表的行级上下文中求值。**标量表达式**除了支持标准 SQL 语法外,还支持针对模式查询的特殊扩展函数。 - -##### 1.2.8.1 模式变量引用 - -```SQL -A.totalprice -U.orderdate -orderstatus -``` - -* 当列名前缀为某**基本模式变量**或**联合模式变量**时,表示引用该变量所匹配的所有行的对应列值。 -* 若列名不带前缀,则等同于使用“​**全局联合模式变量**​”(即所有基本模式变量的并集)作前缀,表示引用当前匹配中所有行的该列值。 - -> 不允许在模式识别表达式中使用表名作列名前缀。 - -##### 1.2.8.2 扩展函数 - -| 函数名 | 函数式 | 描述 | -|------------------| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `MATCH_NUMBER`函数 | `MATCH_NUMBER()` | 返回当前匹配在分区内的序号,从 1 开始计数。空匹配与非空匹配一致,也占用匹配序号。 | -| `CLASSIFIER `函数 | `CLASSIFIER(option)`| 1. 返回当前行所映射的基本模式变量名称。1. `option`是一个可选参数:可以传入基本模式变量`CLASSIFIER(A)`或联合模式变量`CLASSIFIER(U)`,用于限定函数作用范围,对于不在范围内的行,直接返回 NULL。在对联合模式变量使用时,可用于辨别该行究竟映射至并集中哪一个基本模式变量。 | -| 逻辑导航函数 | `RPR_FIRST(expr, k)` | 1. 
表示从**当前匹配分组**中,定位至第一个满足 expr 的行,在此基础上再向分组尾部方向搜索到第 k 次出现的同一模式变量对应行,返回该行的指定列值。如果在指定方向上未能找到第 k 次匹配行,则函数返回 NULL。1. 其中 k 是可选参数,默认为 0,表示仅定位至首个满足条件的行;若显式指定,必须为非负整数。 | -| 逻辑导航函数 | `RPR_LAST(expr, k)`| 1. 表示从**当前匹配分组**中,定位至最后一个满足 expr 的行,在此基础上再向分组开头方向搜索到第 k 次出现的同一模式变量对应行,返回该行的指定列值。如果在指定方向上未能找到第 k 次匹配行,则函数返回 NULL。1. 其中 k 是可选参数,默认为 0,表示仅定位至末个满足条件的行;若显式指定,必须为非负整数。 | -| 物理导航函数 | `PREV(expr, k)` | 1. 表示从最后一次匹配至给定模式变量的行开始,向开头方向偏移 k 行,返回对应列值。若导航超出​**分区边界**​,则函数返回 NULL。1. 其中 k 是可选参数,默认为 1;若显式指定,必须为非负整数。 | -| 物理导航函数 |`NEXT(expr, k)` | 1. 表示从最后一次匹配至给定模式变量的行开始,向尾部方向偏移 k 行,返回对应列值。若导航超出​**分区边界**​,则函数返回 NULL。1. 其中 k 是可选参数,默认为 1;若显式指定,必须为非负整数。 | -| 聚合函数 | COUNT、SUM、AVG、MAX、MIN 函数 | 可用于对当前匹配中的数据进行计算。聚合函数与导航函数不允许互相嵌套。(V 2.0.6 版本起支持) | -| 嵌套函数 | `PREV/NEXT(CLASSIFIER())` | 物理导航函数与 CLASSIFIER 函数嵌套。用于获取当前行的前一个和后一个匹配行所对应的模式变量 | -| 嵌套函数 |`PREV/NEXT(RPR_FIRST/RPR_LAST(expr, k)`) | 物理函数内部**允许嵌套**逻辑函数,逻辑函数内部**不允许嵌套**物理函数。用于先进行逻辑偏移,再进行物理偏移。 | - -**示例说明** - -1. CLASSIFIER 函数 - -* 查询 sql - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* 分析过程 - - ![](/img/timeseries-featured-analysis-9.png) - -* 查询结果 - -```SQL -+-----------------------------+-----+-----+---------------+-----+ -| time|match|price|lower_or_higher|label| -+-----------------------------+-----+-----+---------------+-----+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| -+-----------------------------+-----+-----+---------------+-----+ 
-Total line number = 6 -``` - -2. 逻辑导航函数 - -* 查询 sql - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- MEASURES 子句 - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` - -* 查询结果 - * 当取值为 totalprice、RPR\_LAST(totalprice)、RUNNING RPR\_LAST(totalprice) 时 - - ![](/img/timeseries-featured-analysis-10.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 70| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * 当取值为 FINAL RPR\_LAST(totalprice) 时 - - ![](/img/timeseries-featured-analysis-11.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 80| - |2025-01-01T00:02:00.000+08:00| 80| - |2025-01-01T00:03:00.000+08:00| 80| - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:05:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * 当取值为 RPR\_FIRST(totalprice)、 RUNNING RPR\_FIRST(totalprice)、FINAL RPR\_FIRST(totalprice)时 - - ![](/img/timeseries-featured-analysis-12.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:01:00.000+08:00| 90| - |2025-01-01T00:02:00.000+08:00| 90| - |2025-01-01T00:03:00.000+08:00| 90| - |2025-01-01T00:04:00.000+08:00| 90| - |2025-01-01T00:05:00.000+08:00| 90| - |2025-01-01T00:06:00.000+08:00| 90| - +-----------------------------+-------+ - Total line number = 6 - ``` - - * 当取值为 RPR\_LAST(totalprice, 2) 时 - - ![](/img/timeseries-featured-analysis-13.png) - - 实际返回 - - ```SQL - 
    +-----------------------------+-------+
-    |                         time|measure|
-    +-----------------------------+-------+
-    |2025-01-01T00:01:00.000+08:00|   null|
-    |2025-01-01T00:02:00.000+08:00|   null|
-    |2025-01-01T00:03:00.000+08:00|     90|
-    |2025-01-01T00:04:00.000+08:00|     80|
-    |2025-01-01T00:05:00.000+08:00|     70|
-    |2025-01-01T00:06:00.000+08:00|     80|
-    +-----------------------------+-------+
-    Total line number = 6
-    ```
-
-  * 当取值为 FINAL RPR\_LAST(totalprice, 2) 时
-
-    ![](/img/timeseries-featured-analysis-14.png)
-
-    实际返回
-
-    ```SQL
-    +-----------------------------+-------+
-    |                         time|measure|
-    +-----------------------------+-------+
-    |2025-01-01T00:01:00.000+08:00|     80|
-    |2025-01-01T00:02:00.000+08:00|     80|
-    |2025-01-01T00:03:00.000+08:00|     80|
-    |2025-01-01T00:04:00.000+08:00|     80|
-    |2025-01-01T00:05:00.000+08:00|     80|
-    |2025-01-01T00:06:00.000+08:00|     80|
-    +-----------------------------+-------+
-    Total line number = 6
-    ```
-
-  * 当取值为 RPR\_FIRST(totalprice, 2) 和 FINAL RPR\_FIRST(totalprice, 2) 时
-
-    ![](/img/timeseries-featured-analysis-15.png)
-
-    实际返回
-
-    ```SQL
-    +-----------------------------+-------+
-    |                         time|measure|
-    +-----------------------------+-------+
-    |2025-01-01T00:01:00.000+08:00|     70|
-    |2025-01-01T00:02:00.000+08:00|     70|
-    |2025-01-01T00:03:00.000+08:00|     70|
-    |2025-01-01T00:04:00.000+08:00|     70|
-    |2025-01-01T00:05:00.000+08:00|     70|
-    |2025-01-01T00:06:00.000+08:00|     70|
-    +-----------------------------+-------+
-    Total line number = 6
-    ```
-
-3. 
物理导航函数 - -* 查询 sql - -```SQL -SELECT m.time, m.measure -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - %s AS measure -- MEASURES 子句 - ALL ROWS PER MATCH - PATTERN (B) - DEFINE B AS B.totalprice >= PREV(B.totalprice) -) AS m; -``` - -* 查询结果 - * 当取值为 `PREV(totalprice)` 时 - - ![](/img/timeseries-featured-analysis-16.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| 70| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * 当取值为 `PREV(B.totalprice, 2)` 时 - - ![](/img/timeseries-featured-analysis-17.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * 当取值为 `PREV(B.totalprice, 4)` 时 - - ![](/img/timeseries-featured-analysis-18.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| null| - |2025-01-01T00:06:00.000+08:00| 80| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * 当取值为 `NEXT(totalprice)` 或 `NEXT(B.totalprice, 1)` 时 - - ![](/img/timeseries-featured-analysis-19.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 70| - |2025-01-01T00:06:00.000+08:00| null| - +-----------------------------+-------+ - Total line number = 2 - ``` - - * `当取值为 NEXT(B.totalprice, 2)` 时 - - ![](/img/timeseries-featured-analysis-20.png) - - 实际返回 - - ```SQL - +-----------------------------+-------+ - | time|measure| - +-----------------------------+-------+ - |2025-01-01T00:04:00.000+08:00| 80| - |2025-01-01T00:06:00.000+08:00| null| - 
+-----------------------------+-------+ - Total line number = 2 - ``` - -4. 聚合函数 - -* 查询 sql - -```SQL -SELECT m.time, m.count, m.avg, m.sum, m.min, m.max -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - COUNT(*) AS count, - AVG(totalprice) AS avg, - SUM(totalprice) AS sum, - MIN(totalprice) AS min, - MAX(totalprice) AS max - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* 分析过程(以 MIN(totalprice)为例) - -![](/img/timeseries-featured-analysis-21.png) - -* 查询结果 - -```SQL -+-----------------------------+-----+-----------------+-----+---+---+ -| time|count| avg| sum|min|max| -+-----------------------------+-----+-----------------+-----+---+---+ -|2025-01-01T00:01:00.000+08:00| 1| 90.0| 90.0| 90| 90| -|2025-01-01T00:02:00.000+08:00| 2| 85.0|170.0| 80| 90| -|2025-01-01T00:03:00.000+08:00| 3| 80.0|240.0| 70| 90| -|2025-01-01T00:04:00.000+08:00| 4| 80.0|320.0| 70| 90| -|2025-01-01T00:05:00.000+08:00| 5| 78.0|390.0| 70| 90| -|2025-01-01T00:06:00.000+08:00| 6|78.33333333333333|470.0| 70| 90| -+-----------------------------+-----+-----------------+-----+---+---+ -Total line number = 6 -``` - -5. 
嵌套函数 - -示例一 - -* 查询 sql - -```SQL -SELECT m.time, m.match, m.price, m.lower_or_higher, m.label, m.prev_label, m.next_label -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RUNNING RPR_LAST(totalprice) AS price, - CLASSIFIER(U) AS lower_or_higher, - CLASSIFIER(W) AS label, - PREV(CLASSIFIER(W)) AS prev_label, - NEXT(CLASSIFIER(W)) AS next_label - ALL ROWS PER MATCH - PATTERN ((L | H) A) - SUBSET - U = (L, H), - W = (A, L, H) - DEFINE - A AS A.totalprice = 80, - L AS L.totalprice < 80, - H AS H.totalprice > 80 -) AS m; -``` -* 分析过程 - -![](/img/timeseries-featured-analysis-22.png) - -* 查询结果 - -```SQL -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -| time|match|price|lower_or_higher|label|prev_label|next_label| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -|2025-01-01T00:01:00.000+08:00| 1| 90| H| H| null| A| -|2025-01-01T00:02:00.000+08:00| 1| 80| H| A| H| null| -|2025-01-01T00:03:00.000+08:00| 2| 70| L| L| null| A| -|2025-01-01T00:04:00.000+08:00| 2| 80| L| A| L| null| -|2025-01-01T00:05:00.000+08:00| 3| 70| L| L| null| A| -|2025-01-01T00:06:00.000+08:00| 3| 80| L| A| L| null| -+-----------------------------+-----+-----+---------------+-----+----------+----------+ -Total line number = 6 -``` - -示例二 - -* 查询 sql - -```SQL -SELECT m.time, m.prev_last_price, m.next_first_price -FROM t -MATCH_RECOGNIZE ( - ORDER BY time - MEASURES - PREV(RPR_LAST(totalprice), 2) AS prev_last_price, - NEXT(RPR_FIRST(totalprice), 2) as next_first_price - ALL ROWS PER MATCH - PATTERN (A+) - DEFINE A AS true -) AS m; -``` -* 分析过程 - -![](/img/timeseries-featured-analysis-23.png) - -* 查询结果 - -```SQL -+-----------------------------+---------------+----------------+ -| time|prev_last_price|next_first_price| -+-----------------------------+---------------+----------------+ -|2025-01-01T00:01:00.000+08:00| null| 70| -|2025-01-01T00:02:00.000+08:00| null| 70| 
|2025-01-01T00:03:00.000+08:00|             90|              70|
|2025-01-01T00:04:00.000+08:00|             80|              70|
|2025-01-01T00:05:00.000+08:00|             70|              70|
|2025-01-01T00:06:00.000+08:00|             80|              70|
+-----------------------------+---------------+----------------+
Total line number = 6
```

##### 1.2.8.3 RUNNING and FINAL Semantics
1. Definition

* `RUNNING`: the computation covers the current match from its first row up to and including the row currently being processed.
* `FINAL`: the computation covers the current match from its first row to its last row, that is, the entire match.

2. Scope

* The DEFINE clause uses RUNNING semantics by default.
* The MEASURES clause uses RUNNING semantics by default and also supports FINAL semantics. In ONE ROW PER MATCH output mode, all expressions are evaluated at the last row of the match, so RUNNING and FINAL are equivalent.

3. Syntax constraints

* RUNNING and FINAL must be written before a **logical navigation function** or an aggregate function; they cannot be applied directly to a **column reference**.
  * Valid: `RUNNING RPR_LAST(A.totalprice)`, `FINAL RPR_LAST(A.totalprice)`
  * Invalid: `RUNNING A.totalprice`, `FINAL A.totalprice`, `RUNNING PREV(A.totalprice)`

### 1.3 Scenario Examples

The examples below use the [sample data](../Reference/Sample-Data.md) as source data.

#### 1.3.1 Time-Based Segmentation Query

Split the data in table1 into segments in which consecutive rows are at most 24 hours apart, and query the row count, start time, and end time of each segment.

Query SQL

```SQL
SELECT start_time, end_time, cnt
FROM table1
MATCH_RECOGNIZE (
    ORDER BY time
    MEASURES
        RPR_FIRST(A.time) AS start_time,
        RPR_LAST(time) AS end_time,
        COUNT() AS cnt
    PATTERN (A B*)
    DEFINE B AS (cast(B.time as INT64) - cast(PREV(B.time) as INT64)) <= 86400000
) AS m
```

Query result

```SQL
+-----------------------------+-----------------------------+---+
|                   start_time|                     end_time|cnt|
+-----------------------------+-----------------------------+---+
|2024-11-26T13:37:00.000+08:00|2024-11-26T13:38:00.000+08:00|  2|
|2024-11-27T16:38:00.000+08:00|2024-11-30T14:30:00.000+08:00| 16|
+-----------------------------+-----------------------------+---+
Total line number = 2
```

#### 1.3.2 Difference-Based Segmentation Query

Split the data in table2 into segments in which the humidity value rises by at most 0.1 from one row to the next (the `<= 0.1` condition in the query), and query the row count, start time, and end time of each segment.

* Query SQL

```SQL
SELECT start_time, end_time, cnt
FROM table2
MATCH_RECOGNIZE (
    ORDER BY time
    MEASURES
        RPR_FIRST(A.time) AS start_time,
        RPR_LAST(time) AS end_time,
        COUNT() AS cnt
    PATTERN (A B*)
    DEFINE B AS (B.humidity - PREV(B.humidity)) <= 0.1
-) AS m; -``` - -* 查询结果 - -```SQL -+-----------------------------+-----------------------------+---+ -| start_time| end_time|cnt| -+-----------------------------+-----------------------------+---+ -|2024-11-26T13:37:00.000+08:00|2024-11-27T00:00:00.000+08:00| 2| -|2024-11-28T08:00:00.000+08:00|2024-11-29T00:00:00.000+08:00| 2| -|2024-11-29T11:00:00.000+08:00|2024-11-30T00:00:00.000+08:00| 2| -+-----------------------------+-----------------------------+---+ -Total line number = 3 -``` - -#### 1.3.3 事件统计查询 - -将 table1 中数据按照设备号分组,统计上海地区湿度大于 35 的开始、结束时间及最大湿度值。 - -* 查询sql - -```SQL -SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity -FROM table1 -MATCH_RECOGNIZE ( - PARTITION BY device_id - ORDER BY time - MEASURES - MATCH_NUMBER() AS match, - RPR_FIRST(A.time) AS event_start, - RPR_LAST(A.time) AS event_end, - MAX(A.humidity) AS max_humidity - ONE ROW PER MATCH - PATTERN (A+) - DEFINE - A AS A.region= '上海' AND A.humidity> 35 -) AS m -``` - -* 查询结果 - -```SQL -+---------+-----+-----------------------------+-----------------------------+------------+ -|device_id|match| event_start| event_end|max_humidity| -+---------+-----+-----------------------------+-----------------------------+------------+ -| 100| 1|2024-11-28T09:00:00.000+08:00|2024-11-29T18:30:00.000+08:00| 45.1| -| 101| 1|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00| 35.2| -+---------+-----+-----------------------------+-----------------------------+------------+ -Total line number = 2 -``` - - -## 2. 
Window Functions

### 2.1 Overview

A window function is a special function that computes a value for each row based on a set of rows related to the current row, called the "window". It combines grouping (`PARTITION BY`), ordering (`ORDER BY`), and a definable computation range (the window frame, `FRAME`) to perform cross-row computations without collapsing the original rows. Typical uses in data analysis include ranking, running totals, and moving averages.

> Note: This feature is available since V2.0.5.

For example, computing the running total of the `flow` values of each device can be done with a window function.

```SQL
-- Original data
+-----------------------------+------+-----+
|                         time|device| flow|
+-----------------------------+------+-----+
|1970-01-01T08:00:00.000+08:00|    d0|    3|
|1970-01-01T08:00:01.000+08:00|    d0|    5|
|1970-01-01T08:00:02.000+08:00|    d0|    3|
|1970-01-01T08:00:03.000+08:00|    d0|    1|
|1970-01-01T08:00:04.000+08:00|    d1|    2|
|1970-01-01T08:00:05.000+08:00|    d1|    4|
+-----------------------------+------+-----+

-- Create the table and insert the data
CREATE TABLE device_flow(device STRING TAG, flow INT32 FIELD);
INSERT INTO device_flow(time, device, flow) VALUES
    ('1970-01-01T08:00:00.000+08:00','d0',3),
    ('1970-01-01T08:00:01.000+08:00','d0',5),
    ('1970-01-01T08:00:02.000+08:00','d0',3),
    ('1970-01-01T08:00:03.000+08:00','d0',1),
    ('1970-01-01T08:00:04.000+08:00','d1',2),
    ('1970-01-01T08:00:05.000+08:00','d1',4);

-- Run the window function query
SELECT *, sum(flow) OVER(PARTITION BY device ORDER BY flow) AS sum FROM device_flow;
```

After partitioning, ordering, and computing (the steps are broken down in the figure below),

![](/img/window-function-1.png)

the expected result is obtained:

```SQL
+-----------------------------+------+----+----+
|                         time|device|flow| sum|
+-----------------------------+------+----+----+
|1970-01-01T08:00:04.000+08:00|    d1|   2| 2.0|
|1970-01-01T08:00:05.000+08:00|    d1|   4| 6.0|
|1970-01-01T08:00:03.000+08:00|    d0|   1| 1.0|
|1970-01-01T08:00:00.000+08:00|    d0|   3| 7.0|
|1970-01-01T08:00:02.000+08:00|    d0|   3| 7.0|
|1970-01-01T08:00:01.000+08:00|    d0|   5|12.0|
+-----------------------------+------+----+----+
```

### 2.2 Definition
#### 2.2.1 SQL Definition

```SQL
windowDefinition
    : name=identifier AS '(' windowSpecification ')'
    ;

windowSpecification
    : (existingWindowName=identifier)?
      (PARTITION BY partition+=expression (',' partition+=expression)*)?
- (ORDER BY sortItem (',' sortItem)*)? - windowFrame? - ; - -windowFrame - : frameExtent - ; - -frameExtent - : frameType=RANGE start=frameBound - | frameType=ROWS start=frameBound - | frameType=GROUPS start=frameBound - | frameType=RANGE BETWEEN start=frameBound AND end=frameBound - | frameType=ROWS BETWEEN start=frameBound AND end=frameBound - | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound - ; - -frameBound - : UNBOUNDED boundType=PRECEDING #unboundedFrame - | UNBOUNDED boundType=FOLLOWING #unboundedFrame - | CURRENT ROW #currentRowBound - | expression boundType=(PRECEDING | FOLLOWING) #boundedFrame - ; -``` - -#### 2.2.2 窗口定义 -##### 2.2.2.1 Partition - -`PARTITION BY` 用于将数据分为多个独立、不相关的「组」,窗口函数只能访问并操作其所属分组内的数据,无法访问其它分组。该子句是可选的;如果未显式指定,则默认将所有数据分到同一组。值得注意的是,与 `GROUP BY` 通过聚合函数将一组数据规约成一行不同,`PARTITION BY` 的窗口函数**并不会影响组内的行数。** - -* 示例 - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER (PARTITION BY device) as count FROM device_flow; -``` - -拆解步骤: - -![](/img/window-function-2.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 4| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -|1970-01-01T08:00:02.000+08:00| d0| 3| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 4| -+-----------------------------+------+----+-----+ -``` - -##### 2.2.2.2 Ordering - -`ORDER BY` 用于对 partition 内的数据进行排序。排序后,相等的行被称为 peers。peers 会影响窗口函数的行为,例如不同 rank function 对 peers 的处理不同;不同 frame 的划分方式对于 peers 的处理也不同。该子句是可选的。 - -* 示例 - -查询语句: - -```SQL -IoTDB> SELECT *, rank() OVER (PARTITION BY device ORDER BY flow) as rank FROM device_flow; -``` - -拆解步骤: - -![](/img/window-function-3.png) - -查询结果: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| 
-|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -##### 2.2.2.3 Framing - -对于 partition 中的每一行,窗口函数都会在相应的一组行上求值,这些行称为 Frame(即 Window Function 在每一行上的输入域)。Frame 可以手动指定,指定时涉及两个属性,具体说明如下。 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
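| Frame attribute | Value | Description |
|-----------------|-------|-------------|
| Type | `ROWS` | Divides the frame by row number |
| | `GROUPS` | Divides the frame by peers: rows with the same value are treated as equivalent. All rows that are peers of each other form one group, called a peer group |
| | `RANGE` | Divides the frame by value |
| Start and end bounds | `UNBOUNDED PRECEDING` | The first row of the partition |
| | `offset PRECEDING` | The row at a "distance" of offset before the current row |
| | `CURRENT ROW` | The current row |
| | `offset FOLLOWING` | The row at a "distance" of offset after the current row |
| | `UNBOUNDED FOLLOWING` | The last row of the partition |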
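
To make the frame attributes above concrete, here is a minimal sketch of a three-row moving sum. It assumes the `device_flow` table from section 2.1 (device `d0` with flows 3, 5, 3, 1 and device `d1` with flows 2, 4, in time order). The alias `moving_sum` is illustrative and does not come from the original manual.

```SQL
-- A sketch, not taken from the original manual: sum over the previous,
-- current, and next row of each device, ordered by time.
SELECT *,
       sum(flow) OVER (PARTITION BY device
                       ORDER BY time
                       ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS moving_sum
FROM device_flow;
-- For d0 (flows 3, 5, 3, 1 in time order) the frames are
-- (3, 5), (3, 5, 3), (5, 3, 1), (3, 1), so the sums are 8, 11, 9, 4.
-- For d1 (flows 2, 4) both frames are (2, 4), so the sums are 6 and 6.
```

The first and last rows of each partition get shorter frames because a `ROWS` frame is clipped at the partition boundary.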

The meaning of `CURRENT ROW`, `offset PRECEDING`, and `offset FOLLOWING` depends on the frame type, as shown in the following table:

|                    | `ROWS` | `GROUPS` | `RANGE` |
|--------------------|--------|----------|---------|
| `CURRENT ROW`      | The current row | Because a peer group can contain several rows, the meaning depends on whether the bound is used as frame\_start or frame\_end: as frame\_start, the first row of the peer group; as frame\_end, the last row of the peer group. | Same as GROUPS: as frame\_start, the first row of the peer group; as frame\_end, the last row of the peer group. |
| `offset PRECEDING` | The offset-th row before the current row | The offset-th peer group before the current one | Preceding rows whose value differs from the current row's value by at most offset belong to the frame |
| `offset FOLLOWING` | The offset-th row after the current row | The offset-th peer group after the current one | Following rows whose value differs from the current row's value by at most offset belong to the frame |

The syntax is as follows:

```SQL
-- Specify both frame_start and frame_end
{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end
-- Specify only frame_start; frame_end is CURRENT ROW
{ RANGE | ROWS | GROUPS } frame_start
```

If no Frame is specified explicitly, the default is determined as follows:

* When the window function uses ORDER BY: the default Frame is `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW` (from the first row of the partition through the current row and its peers). For example, in `RANK() OVER(PARTITION BY col1 ORDER BY col2)`, the Frame by default contains the current row and all rows before it in the partition.
* When the window function does not use ORDER BY: the default Frame is `RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING` (all rows of the partition). For example, in `AVG(col2) OVER(PARTITION BY col1)`, the Frame by default contains all rows of the partition, so the average of the whole partition is computed.

Note that when the Frame type is GROUPS or RANGE, `ORDER BY` must be specified. The difference is that with GROUPS the ORDER BY may involve multiple columns, whereas RANGE performs arithmetic on the ordering value and therefore allows only one column.

* Examples

1. 
Frame 类型为 ROWS - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ROWS 1 PRECEDING) as count FROM device_flow; -``` - -拆解步骤: - -* 取前一行和当前行作为 Frame - * 对于 partition 的第一行,由于没有前一行,所以整个 Frame 只有它一行,返回 1; - * 对于 partition 的其他行,整个 Frame 包含当前行和它的前一行,返回 2: - -![](/img/window-function-4.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 2| -+-----------------------------+------+----+-----+ -``` - -2. Frame 类型为 GROUPS - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ORDER BY flow GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -拆解步骤: - -* 取前一个 peer group 和当前 peer group 作为 Frame,那么以 device 为 d0 的 partition 为例(d1同理),对于 count 行数: - * 对于 flow 为 1 的 peer group,由于它也没比它小的 peer group 了,所以整个 Frame 就它一行,返回 1; - * 对于 flow 为 3 的 peer group,它本身包含 2 行,前一个 peer group 就是 flow 为 1 的,就一行,因此整个 Frame 三行,返回 3; - * 对于 flow 为 5 的 peer group,它本身包含 1 行,前一个 peer group 就是 flow 为 3 的,共两行,因此整个 Frame 三行,返回 3。 - -![](/img/window-function-5.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -3. 
Frame 类型为 RANGE - -查询语句: - -```SQL -IoTDB> SELECT *,count(flow) OVER(PARTITION BY device ORDER BY flow RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -拆解步骤: - -* 把比当前行数据**小于等于 2 ​**的分为同一个 Frame,那么以 device 为 d0 的 partition 为例(d1 同理),对于 count 行数: - * 对于 flow 为 1 的行,由于它是最小的行了,所以整个 Frame 就它一行,返回 1; - * 对于 flow 为 3 的行,注意 CURRENT ROW 是作为 frame\_end 存在,因此是整个 peer group 的最后一行,符合要求比它小的共 1 行,然后 peer group 有 2 行,所以整个 Frame 共 3 行,返回 3; - * 对于 flow 为 5 的行,它本身包含 1 行,符合要求的比它小的共 2 行,所以整个 Frame 共 3 行,返回 3。 - -![](/img/window-function-6.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -### 2.3 内置的窗口函数 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
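| Category | Function | Definition | Supports FRAME clause |
|----------|----------|------------|-----------------------|
| Aggregate Function | All built-in aggregate functions | Aggregates a set of values into a single result. | |
| Value Function | first_value | Returns the first value of the frame; if IGNORE NULLS is specified, leading NULLs are skipped | |
| | last_value | Returns the last value of the frame; if IGNORE NULLS is specified, trailing NULLs are skipped | |
| | nth_value | Returns the n-th element of the frame (n starts from 1); if IGNORE NULLS is specified, NULLs are skipped | |
| | lead | Returns the element offset rows after the current row (NULLs are not counted if IGNORE NULLS is specified); if there is no such element (beyond the partition), returns default | |
| | lag | Returns the element offset rows before the current row (NULLs are not counted if IGNORE NULLS is specified); if there is no such element (beyond the partition), returns default | |
| Rank Function | rank | Returns the rank of the current row within the partition; rows with equal values get the same rank, and ranks may have gaps | |
| | dense_rank | Returns the rank of the current row within the partition; rows with equal values get the same rank, and ranks have no gaps | |
| | row_number | Returns the row number of the current row within the partition; row numbers start from 1 | |
| | percent_rank | Returns the relative rank of the current row's value within the partition as a fraction: (rank() - 1) / (n - 1), where n is the number of rows in the partition | |
| | cume_dist | Returns the cumulative distribution of the current row's value within the partition as a fraction: (number of rows less than or equal to it) / n | |
| | ntile | Given n, numbers each row from 1 to n. | |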
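
The ranking functions in this table differ mainly in how they treat peers. A compact way to see this is to put them side by side in one query. The sketch below assumes the `device_flow` table from section 2.1 and assumes that several functions in one SELECT can share a single named window, as in standard SQL. The aliases `rk`, `dense_rk`, and `row_num` are illustrative.

```SQL
-- A sketch, not taken from the original manual: compare ranking functions on the same window.
SELECT device, flow,
       rank()       OVER w AS rk,
       dense_rank() OVER w AS dense_rk,
       row_number() OVER w AS row_num
FROM device_flow
WINDOW w AS (PARTITION BY device ORDER BY flow);
-- For d0 (flows 1, 3, 3, 5 after ordering):
--   rk       = 1, 2, 2, 4   (peers share a rank, then a gap)
--   dense_rk = 1, 2, 2, 3   (peers share a rank, no gap)
--   row_num  = 1, 2, 3, 4   (always distinct)
```

Which of the two rows with `flow = 3` receives row number 2 and which receives 3 is not determined by the `ORDER BY flow` clause alone.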
- -#### 2.3.1 Aggregate Function - -所有内置聚合函数,如 `sum()`、`avg()`、`min()`、`max()` 都能当作 Window Function 使用。 - -> 注意:与 GROUP BY 不同,Window Function 中每一行都有相应的输出 - -示例: - -```SQL -IoTDB> SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) as sum FROM device_flow; -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -#### 2.3.2 Value Function -1. `first_value` - -* 函数名:`first_value(value) [IGNORE NULLS]` -* 定义:返回 frame 的第一个值,如果指定了 IGNORE NULLS 需要跳过前缀的 NULL; -* 示例: - -```SQL -IoTDB> SELECT *, first_value(flow) OVER w as first_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+-----------+ -| time|device|flow|first_value| -+-----------------------------+------+----+-----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----------+ -``` - -2. 
`last_value` - -* 函数名:`last_value(value) [IGNORE NULLS]` -* 定义:返回 frame 的最后一个值,如果指定了 IGNORE NULLS 需要跳过后缀的 NULL; -* 示例: - -```SQL -IoTDB> SELECT *, last_value(flow) OVER w as last_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|last_value| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 5| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -3. `nth_value` - -* 函数名:`nth_value(value, n) [IGNORE NULLS]` -* 定义:返回 frame 的第 n 个元素(注意 n 是从 1 开始),如果有 IGNORE NULLS 需要跳过 NULL; -* 示例: - -```SQL -IoTDB> SELECT *, nth_value(flow, 2) OVER w as nth_values FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|nth_values| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -4. 
`lead`

* Function: `lead(value[, offset[, default]]) [IGNORE NULLS]`
* Definition: returns the element offset rows after the current row (NULLs are not counted if IGNORE NULLS is specified); if there is no such element (beyond the partition), returns default. offset defaults to 1 and default defaults to NULL.
* The lead function requires an ORDER BY clause in the window definition.
* Example:

```SQL
IoTDB> SELECT *, lead(flow) OVER w as lead FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time);
+-----------------------------+------+----+----+
|                         time|device|flow|lead|
+-----------------------------+------+----+----+
|1970-01-01T08:00:04.000+08:00|    d1|   2|   4|
|1970-01-01T08:00:05.000+08:00|    d1|   4|null|
|1970-01-01T08:00:00.000+08:00|    d0|   3|   5|
|1970-01-01T08:00:01.000+08:00|    d0|   5|   3|
|1970-01-01T08:00:02.000+08:00|    d0|   3|   1|
|1970-01-01T08:00:03.000+08:00|    d0|   1|null|
+-----------------------------+------+----+----+
```

5. `lag`

* Function: `lag(value[, offset[, default]]) [IGNORE NULLS]`
* Definition: returns the element offset rows before the current row (NULLs are not counted if IGNORE NULLS is specified); if there is no such element (beyond the partition), returns default. offset defaults to 1 and default defaults to NULL.
* The lag function requires an ORDER BY clause in the window definition.
* Example:

```SQL
IoTDB> SELECT *, lag(flow) OVER w as lag FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time);
+-----------------------------+------+----+----+
|                         time|device|flow| lag|
+-----------------------------+------+----+----+
|1970-01-01T08:00:04.000+08:00|    d1|   2|null|
|1970-01-01T08:00:05.000+08:00|    d1|   4|   2|
|1970-01-01T08:00:00.000+08:00|    d0|   3|null|
|1970-01-01T08:00:01.000+08:00|    d0|   5|   3|
|1970-01-01T08:00:02.000+08:00|    d0|   3|   5|
|1970-01-01T08:00:03.000+08:00|    d0|   1|   3|
+-----------------------------+------+----+----+
```

#### 2.3.3 Rank Function
1. 
rank - -* 函数名:`rank()` -* 定义:返回当前行在整个 partition 中的序号,值相同的行序号相同,序号之间可能有 gap; -* 示例: - -```SQL -IoTDB> SELECT *, rank() OVER w as rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -2. dense\_rank - -* 函数名:`dense_rank()` -* 定义:返回当前行在整个 partition 中的序号,值相同的行序号相同,序号之间没有 gap。 -* 示例: - -```SQL -IoTDB> SELECT *, dense_rank() OVER w as dense_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|dense_rank| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+----------+ -``` - -3. 
row\_number - -* 函数名:`row_number()` -* 定义:返回当前行在整个 partition 中的行号,注意行号从 1 开始; -* 示例: - -```SQL -IoTDB> SELECT *, row_number() OVER w as row_number FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|row_number| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----------+ -``` - -4. percent\_rank - -* 函数名:`percent_rank()` -* 定义:以百分比的形式,返回当前行的值在整个 partition 中的序号;即 **(rank() - 1) / (n - 1)**,其中 n 是整个 partition 的行数; -* 示例: - -```SQL -IoTDB> SELECT *, percent_rank() OVER w as percent_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+------------------+ -| time|device|flow| percent_rank| -+-----------------------------+------+----+------------------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.0| -|1970-01-01T08:00:00.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:02.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+------------------+ -``` - -5. 
cume\_dist - -* 函数名:cume\_dist -* 定义:以百分比的形式,返回当前行的值在整个 partition 中的序号;即 **(小于等于它的行数) / n**。 -* 示例: - -```SQL -IoTDB> SELECT *, cume_dist() OVER w as cume_dist FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+---------+ -| time|device|flow|cume_dist| -+-----------------------------+------+----+---------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.5| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.25| -|1970-01-01T08:00:00.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:02.000+08:00| d0| 3| 0.75| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+---------+ -``` - -6. ntile - -* 函数名:ntile -* 定义:指定 n,给每一行进行 1~n 的编号。 - * 整个 partition 行数比 n 小,那么编号就是行号 index; - * 整个 partition 行数比 n 大: - * 如果行数能除尽 n,那么比较完美,比如行数为 4,n 为 2,那么编号为 1、1、2、2、; - * 如果行数不能除尽 n,那么就分给开头几组,比如行数为 5,n 为 3,那么编号为 1、1、2、2、3; -* 示例: - -```SQL -IoTDB> SELECT *, ntile(2) OVER w as ntile FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+-----+ -| time|device|flow|ntile| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -+-----------------------------+------+----+-----+ -``` - -### 2.4 场景示例 -1. 
Multi-device diff

For every row of every device, compute the difference from the previous row:

```SQL
SELECT
    *,
    measurement - lag(measurement) OVER (PARTITION BY device ORDER BY time)
FROM data
WHERE timeCondition;
```

For every row of every device, compute the difference from the next row:

```SQL
SELECT
    *,
    measurement - lead(measurement) OVER (PARTITION BY device ORDER BY time)
FROM data
WHERE timeCondition;
```

For every row of a single device, compute the difference from the previous row (the next row works the same way):

```SQL
SELECT
    *,
    measurement - lag(measurement) OVER (ORDER BY time)
FROM data
WHERE device='d1'
    AND timeCondition;
```

2. Multi-device TOP\_K/BOTTOM\_K

Use rank to number the rows, then keep the wanted ranks in an outer query.

(Note: window functions are evaluated after the HAVING clause, so a subquery is needed to filter on their result.)

```SQL
SELECT *
FROM(
    SELECT
        *,
        rank() OVER (PARTITION BY device ORDER BY time DESC) AS rank
    FROM data
    WHERE timeCondition
)
WHERE rank <= 3;
```

Besides ordering by time, rows can also be ordered by the measurement value:

```SQL
SELECT *
FROM(
    SELECT
        *,
        rank() OVER (PARTITION BY device ORDER BY measurement DESC) AS rank
    FROM data
    WHERE timeCondition
)
WHERE rank <= 3;
```

3. Multi-device CHANGE\_POINTS

This SQL removes consecutive identical values from the input series. It can be implemented with lead and a subquery:

```SQL
SELECT
    time,
    device,
    measurement
FROM(
    SELECT
        time,
        device,
        measurement,
        LEAD(measurement) OVER (PARTITION BY device ORDER BY time) AS next
    FROM data
    WHERE timeCondition
)
WHERE measurement != next OR next IS NULL;
```
diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Tree-to-Table_timecho.md b/src/zh/UserGuide/latest-Table/User-Manual/Tree-to-Table_timecho.md
deleted file mode 100644
index 5cd9bef30..000000000
--- a/src/zh/UserGuide/latest-Table/User-Manual/Tree-to-Table_timecho.md
+++ /dev/null
@@ -1,615 +0,0 @@

# Tree-to-Table View

## 1. Overview

IoTDB provides a tree-to-table feature: by creating a table view, existing tree-model data can be exposed as a table view and queried through it, so that the same data can be processed with both the tree model and the table model:

* During data writing, tree-model syntax is used, which supports flexible data ingestion and extension.
* During data analysis, table-model syntax is used, which supports complex analysis through standard SQL.

![](/img/tree-to-table-1.png)

> - This feature is supported in V2.0.5 and later.
> - Table views are read-only; data cannot be written through a table view.

## 2. 
功能介绍 -### 2.1 创建表视图 -#### 2.1.1 语法定义 - -```SQL --- create (or replace) view on tree -CREATE - [OR REPLACE] - VIEW view_name ([viewColumnDefinition (',' viewColumnDefinition)*]) - [comment] - [RESTRICT] - [WITH properties] - AS prefixPath - -viewColumnDefinition - : column_name [dataType] TAG [comment] # tagColumn - | column_name [dataType] TIME [comment] # timeColumn - | column_name [dataType] FIELD [FROM original_measurement] [comment] # fieldColumn - ; - -comment - : COMMENT string - ; -``` - -> 注意:列仅支持 tag / field / time,不支持 attribute。 - -#### 2.1.2 语法说明 -1. **`prefixPath`** - -对应树模型的路径,路径最后一级必须为 `**`,且其他层级均不能出现 `*` 或 `**`。该路径确定 VIEW 对应的子树。 - -2. **`view_name`** - -视图名称,与表名相同(具体约束可参考[创建表](../Basic-Concept/Table-Management_timecho.md#\_1-1-创建表)),如 db.view。 - -3. **`viewColumnDefinition`** - -* `TAG`:每个 Tag 列按顺序对应`prefixPath`后面层级的路径节点。 -* `FIELD`:FIELD 列对应树模型中的测点(叶子节点)。 - * 若指定了 FIELD 列,则列名使用声明中的`column_name`。 - * 若声明了 `original_measurement`,则直接映射到树模型该测点。否则取小写`column_name` 作为测点名进行树模型映射。 - * 不支持多个 FIELD 映射到树模型同名测点。 - * 若未指定 FIELD 列的 `dataType`,则默认获取树模型映射测点的数据类型。 - * 若树模型中的设备不包含某些声明的 FIELD 列,或与声明的 FIELD 列的数据类型不一致,则在查询该设备时,该 FIELD 列的值永远为 NULL。 - * 若未指定 FIELD 列,则创建时会自动扫描出`prefixPath`子树下所有的测点(包括定义为所有普通序列的测点,以及挂载路径与 `prefixPath `有所重合的所有模板中的测点),列名使用树模型测点名称。 - * 不支持树模型存在名称(含小写)相同但类型不同的测点 -* `TIME`:创建视图时可以不指定时间列(TIME),IoTDB 会自动添加该列并命名为"time", 且顺序上位于第一列。自 V2.0.8.2 版本起,支持创建视图时**自定义命名时间列**,自定义时间列在视图中的顺序由创建 SQL 中的顺序决定。相关约束如下: - * 当列分类(columnCategory)设为 `TIME` 时,数据类型(dataType)必须为 `TIMESTAMP`。 - * 每个视图最多允许定义 1个时间列(columnCategory = TIME)。 - * 当未显式定义时间列时,不允许其他列使用 `time` 作为名称,否则会与系统默认时间列命名冲突。 - - -4. **`WITH properties`** - -目前仅支持 TTL,表示该视图 TTL ms 之前的数据不会在查询时展示,即`WHERE time > now() - TTL`。若树模型设置了 TTL,则查询时取两者中的更小值。 - -> 注意:表视图 TTL 不影响树模型中设备的真实 TTL,当设备数据达到树模型设定的 TTL 后,将被系统物理删除。 - -5. **`OR REPLACE`** - -table 与 view 不能重名。创建时若已存在同名 table ,则会报错;若已存在同名 view ,则进行替换。 - -6. 
**`RESTRICT`** - -约束匹配树模型设备的层级数(从 prefixPath 下一层开始),若有 RESTRICT 字段,则匹配层级完全等于 tag 数量的 device,否则匹配层级小于等于 tag 数量的 device。默认非 RESTRICT,即匹配层级小于等于 tag 数量的 device。 - -#### 2.1.3 使用示例 -1. 树模型及表视图原型 - -![](/img/tree-to-table-2.png) - -2. 创建表视图 - -* 创建语句 - -```SQL -CREATE OR REPLACE VIEW viewdb."风机表" - ("风机组" TAG, - "风机号" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD - ) -with (ttl=604800000) -AS root.db.** -``` - -* 具体说明 - -该语句表示,创建出名为 `viewdb."风机表"` 的视图(viewdb 如不存在会报错),如果该视图已存在,则替换该视图: - -* 为挂载于树模型 root.db.\*\* 路径下面的序列创建表视图。 -* 具备`风机组`、`风机号`两个 `TAG` 列,因此表视图中只包含原树模型中第 3 层上的设备。 -* 具备`电压`、`电流`两个 `FIELD` 列。这里两个 `FIELD` 列对应树模型下的序列名同样是`电压`、`电流`,且仅仅选取类型为 `DOUBLE` 的序列。 - - **​序列名的改名需求:​**如果树模型下的序列名为`current`,想要创建出的表视图中对应的 `FIELD` 列名为`电流`,这种情况下,SQL 变更如下: - - ```SQL - CREATE OR REPLACE VIEW viewdb."风机表" - ("风机组" TAG, - "风机号" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD FROM current - ) - AS root.db.** - with (ttl=604800000) - ``` - -* 当需要自定义时间列(V2.0.8.2 起支持)时,SQL 变更如下: - -```SQL -CREATE OR REPLACE VIEW viewdb."风机表" - ("风机组" TAG, - "风机号" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD, - time_user_defined TIMESTAMP TIME - ) -AS root.db.** -with (ttl=604800000) -``` - - -### 2.2 修改表视图 -#### 2.2.1 语法定义 - -修改表视图功能支持修改视图名称、添加列、列重命名、修改 FIELD 列数据类型(V2.0.8.2 起支持)、删除列、设置视图的 TTL 属性,以及通过 COMMENT 添加注释。 - -```SQL --- 修改视图名 -ALTER VIEW [IF EXISTS] viewName RENAME TO to=identifier - --- 在视图中添加某一列 -ALTER VIEW [IF EXISTS] viewName ADD COLUMN [IF NOT EXISTS] viewColumnDefinition -viewColumnDefinition - : column_name [dataType] TAG # tagColumn - | column_name [dataType] FIELD [FROM original_measurement] # fieldColumn - --- 为视图中的某一列重命名 -ALTER VIEW [IF EXISTS] viewName RENAME COLUMN [IF EXISTS] oldName TO newName - ---修改 FIELD 列的数据类型 -ALTER VIEW [IF EXISTS] viewName ALTER COLUMN [IF EXISTS] columnName SET DATA TYPE new_type - --- 删除视图中的某一列 -ALTER VIEW [IF EXISTS] viewName DROP COLUMN [IF EXISTS] columnName - --- 修改视图的 TTL -ALTER VIEW [IF EXISTS] viewName SET PROPERTIES propertyAssignments - --- 添加注释 
-COMMENT ON VIEW qualifiedName IS (string | NULL) #commentView -COMMENT ON COLUMN qualifiedName '.' column=identifier IS (string | NULL) #commentColumn -``` - -#### 2.2.2 语法说明 -1. `SET PROPERTIES`操作目前仅支持对表视图的 TTL 属性进行配置。 -2. 删除列功能,仅支持删除物理量列(FIELD),标识列(TAG)不支持删除。 -3. 修改后的 comment 会覆盖原有注释,如果指定为 null,则会擦除之前的 comment。 -4. 修改 FIELD 列数据类型时,变更后的字段类型需要与原类型兼容,具体兼容性如下表所示: - -| 原始类型 | 可变更为类型 | -| ----------- | ----------------------------------------------- | -| INT32 | INT64, FLOAT, DOUBLE, TIMESTAMP, STRING, TEXT | -| INT64 | TIMESTAMP, DOUBLE, STRING, TEXT | -| FLOAT | DOUBLE, STRING, TEXT | -| DOUBLE | STRING, TEXT | -| BOOLEAN | STRING, TEXT | -| TEXT | BLOB, STRING | -| STRING | TEXT, BLOB | -| BLOB | STRING, TEXT | -| DATE | STRING, TEXT | -| TIMESTAMP | INT64, DOUBLE, STRING, TEXT | - -#### 2.2.3 使用示例 - -```SQL --- 修改视图名 -ALTER VIEW IF EXISTS tableview1 RENAME TO tableview - --- 在视图中添加某一列 -ALTER VIEW IF EXISTS tableview ADD COLUMN IF NOT EXISTS temperature float field - --- 为视图中的某一列重命名 -ALTER VIEW IF EXISTS tableview RENAME COLUMN IF EXISTS temperature TO temp - --- 修改 FIELD 列的数据类型 -ALTER VIEW IF EXISTS tableview ALTER COLUMN IF EXISTS temperature SET DATA TYPE double - --- 删除视图中的某一列 -ALTER VIEW IF EXISTS tableview DROP COLUMN IF EXISTS temp - --- 修改视图的 TTL -ALTER VIEW IF EXISTS tableview SET PROPERTIES TTL=3600 - --- 添加注释 -COMMENT ON VIEW tableview IS '树转表' -COMMENT ON COLUMN tableview.status is Null -``` - -### 2.3 删除表视图 -#### 2.3.1 语法定义 - -```SQL -DROP VIEW [IF EXISTS] viewName -``` - -#### 2.3.2 使用示例 - -```SQL -DROP VIEW IF EXISTS tableview -``` - -### 2.4 查看表视图 -#### 2.4.1 **`Show Tables`** -1. 语法定义 - -```SQL -SHOW TABLES (DETAILS)? ((FROM | IN) database_name)? -``` - -2. 
语法说明 - -`SHOW TABLES (DETAILS)` 语句通过结果集的`TABLE_TYPE`字段展示表或视图的类型信息: - -| 类型 | `TABLE_TYPE`字段值 | -| -------------------------------------- | ------------------------ | -| 普通表(Table) | `BASE TABLE` | -| 树转表视图(Tree View) | `VIEW FROM TREE` | -| 系统表(Iinformation\_schema.Tables) | `SYSTEM VIEW` | - -3. 使用示例 - -```SQL -IoTDB> show tables details from database1 -+-----------+-----------+------+-------+--------------+ -| TableName| TTL(ms)|Status|Comment| TableType| -+-----------+-----------+------+-------+--------------+ -| tableview| INF| USING| 树转表 |VIEW FROM TREE| -| table1|31536000000| USING| null| BASE TABLE| -| table2|31536000000| USING| null| BASE TABLE| -+-----------+-----------+------+-------+--------------+ - -IoTDB> show tables details from information_schema -+--------------+-------+------+-------+-----------+ -| TableName|TTL(ms)|Status|Comment| TableType| -+--------------+-------+------+-------+-----------+ -| columns| INF| USING| null|SYSTEM VIEW| -| config_nodes| INF| USING| null|SYSTEM VIEW| -|configurations| INF| USING| null|SYSTEM VIEW| -| data_nodes| INF| USING| null|SYSTEM VIEW| -| databases| INF| USING| null|SYSTEM VIEW| -| functions| INF| USING| null|SYSTEM VIEW| -| keywords| INF| USING| null|SYSTEM VIEW| -| models| INF| USING| null|SYSTEM VIEW| -| nodes| INF| USING| null|SYSTEM VIEW| -| pipe_plugins| INF| USING| null|SYSTEM VIEW| -| pipes| INF| USING| null|SYSTEM VIEW| -| queries| INF| USING| null|SYSTEM VIEW| -| regions| INF| USING| null|SYSTEM VIEW| -| subscriptions| INF| USING| null|SYSTEM VIEW| -| tables| INF| USING| null|SYSTEM VIEW| -| topics| INF| USING| null|SYSTEM VIEW| -| views| INF| USING| null|SYSTEM VIEW| -+--------------+-------+------+-------+-----------+ -``` - -#### 2.4.2 **`Show Create View`** -1. 语法定义 - -```SQL -SHOW CREATE VIEW viewname; -``` - -2. 语法说明 - -* 该语句用于获取表视图的完整定义语句。 -* 该语句会自动补全创建时省略的所有默认值,因此结果集中所展示的语句可能与原始创建语句不同。 -* 该语句不支持用于展示系统表; - -3. 
使用示例 - -```SQL -IoTDB> show create view tableview -+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| View| Create View| -+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+ -|tableview|CREATE VIEW "tableview" ("device" STRING TAG,"model" STRING TAG,"status" BOOLEAN FIELD,"hardware" STRING FIELD) COMMENT '表视图' WITH (ttl=INF) AS root.ln.**| -+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+ -``` - -> 除此之外,还支持通过 `show create table` 语句查看表视图创建信息,相关详细介绍可查看[show create table](../Basic-Concept/Table-Management_timecho.md#_1-4-查看表的创建信息) - -### 2.5 非对齐与对齐设备的查询差异 - -树转表视图在查询对齐设备和非对齐设备中有 null 值的情况下结果​**可能与等价的树模型 align by device 查询不同**​。 - -* **对齐设备** - * 树模型的查询表现:当查询涉及的所有序列在某一行都是null时,不保留该行 - * 表视图的查询表现:与表模型一致,保留全是 null 的行 -* **非对齐设备** - * 树模型的查询表现:当查询涉及的所有序列在某一行都是null时,不保留该行 - * 表视图的查询表现:与树模型一致,不保留全是 null 的行 -* **说明示例** - * 对齐 - - ```SQL - -- 树模型写入数据(对齐) - CREATE ALIGNED TIMESERIES root.db.battery.b1(voltage INT32, current FLOAT) - INSERT INTO root.db.battery.b1(time, voltage, current) aligned values (1, 1, 1) - INSERT INTO root.db.battery.b1(time, voltage, current) aligned values (2, null, 1) - - -- 创建 VIEW 语句 - CREATE VIEW view1 (battery_id TAG, voltage INT32 FIELD, current FLOAT FIELD) as root.db.battery.** - - -- 查询 - IoTDB> select voltage from view1 - +-------+ - |voltage| - +-------+ - | 1| - | null| - +-------+ - Total line number = 2 - ``` - - * 非对齐 - - ```SQL - -- 树模型写入数据(非对齐) - CREATE TIMESERIES root.db.battery.b1.voltage INT32 - CREATE TIMESERIES root.db.battery.b1.current FLOAT - INSERT INTO root.db.battery.b1(time, voltage, current) values (1, 1, 1) - INSERT INTO root.db.battery.b1(time, voltage, current) 
values (2, null, 1) - - -- 创建 VIEW 语句 - CREATE VIEW view1 (battery_id TAG, voltage INT32 FIELD, current FLOAT FIELD) as root.db.battery.** - - -- 查询 - IoTDB> select voltage from view1 - +-------+ - |voltage| - +-------+ - | 1| - +-------+ - Total line number = 1 - - -- 如果在查询语句中指定了所有 field 列,或是仅指定了非 field 列时,才可以确保查到所有行 - IoTDB> select voltage,current from view1 - +-------+-------+ - |voltage|current| - +-------+-------+ - | 1| 1.0| - | null| 1.0| - +-------+-------+ - Total line number = 2 - - IoTDB> select battery_id from view1 - +-------+ - |battery_id| - +-------+ - | b1| - | b1| - +-------+ - Total line number = 2 - - -- 如果查询中同时有部分 field 列,那最终结果的行数取决于这部分 field 列根据时间戳对齐后的行数 - IoTDB> select time,voltage from view1 - +-----------------------------+-------+ - | time|voltage| - +-----------------------------+-------+ - |1970-01-01T08:00:00.001+08:00| 1| - +-----------------------------+-------+ - Total line number = 1 - ``` - -## 3. 场景示例 -### 3.1 原树模型管理了多种类型的设备 - -* 场景中不同类型的设备具备不同的层级路径和测点集合。 -* ​**写入时**​:在数据库节点下按设备类型创建分支,每种设备下可以有不同的测点结构 -* ​**查询时**​:为每种类型的设备建立一张表,每个表具有不同的标签和测点集合 - -![](/img/tree-to-table-3.png) - -**表视图的创建 SQL:** - -```SQL --- 风机表 -CREATE VIEW viewdb."风机表" - ("风机组" TAG, - "风机号" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD - ) -AS root.db."风机".** - --- 电机表 -CREATE VIEW viewdb."电机表" - ("电机组" TAG, - "电机号" TAG, - "功率" FLOAT FIELD, - "电量" FLOAT FIELD, - "温度" FLOAT FIELD - ) -AS root.db."电机".** -``` - -### 3.2 原树模型中没有设备,只有测点 - -如场站的监控系统中,每个测点都有唯一编号,但无法对应到某些设备 - -> 大宽表形式 - -![](/img/tree-to-table-4.png) - -**表视图的创建 SQL:** - -```SQL -CREATE VIEW viewdb.machine - (DCS_PIT_02105A DOUBLE FIELD, - DCS_PIT_02105B DOUBLE FIELD, - DCS_PIT_02105C DOUBLE FIELD, - ... 
- DCS_XI_02716A DOUBLE FIELD - ) -AS root.db.** -``` - -### 3.3 原树模型中一个设备既有子设备,也有测点 - -如在储能场景中,每一层结构都要监控其电压和电流 - -* ​**写入时**​:按照物理世界的监测点,对每一层结构进行建模 -* ​**查询时**​:按照设备分类,建立多个表对每一层结构信息进行管理 - -![](/img/tree-to-table-5.png) - -**表视图的创建 SQL:** - -```SQL --- 电池舱表 -CREATE VIEW viewdb."电池舱表" - ("电池站" TAG, - "电池舱" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD - ) -RESTRICT -AS root.db.** - --- 电池堆表 -CREATE VIEW viewdb."电池堆表" - ("电池站" TAG, - "电池舱" TAG, - "电池堆" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD - ) -RESTRICT -AS root.db.** - --- 电池簇表 -CREATE VIEW viewdb."电池簇表" - ("电池站" TAG, - "电池舱" TAG, - "电池堆" TAG, - "电池簇" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD - ) -RESTRICT -AS 'root.db.**' - --- 电芯表 -CREATE VIEW viewdb."电芯表" - ("电池站" TAG, - "电池舱" TAG, - "电池堆" TAG, - "电池簇" TAG, - "电芯" TAG, - "电压" DOUBLE FIELD, - "电流" DOUBLE FIELD - ) -RESTRICT -AS root.db.** -``` - -### 3.4 原树模型中一个设备下只有一个测点 - -> 窄表形式 - -#### 3.4.1 所有测点数据类型相同 - -![](/img/tree-to-table-6.png) - -**表视图的创建 SQL:** - -```SQL -CREATE VIEW viewdb.machine - ( - sensor_id STRING TAG, - value DOUBLE FIELD - ) -AS root.db.** -``` - -#### 3.4.2 测点的数据类型不相同 -##### 3.4.2.1 为每一种数据类型的测点建一个窄表视图 - -​**优点**​:表视图数量是常数个,仅与系统中的数据类型相关 - -​**缺点**​:查询某一个测点值时,需要提前知道其数据类型,再去决定查询哪张表视图 - -![](/img/tree-to-table-7.png) - -**表视图的创建 SQL:** - -```SQL -CREATE VIEW viewdb.machine_float - ( - sensor_id STRING TAG, - value FLOAT FIELD - ) -AS root.db.** - -CREATE VIEW viewdb.machine_double - ( - sensor_id STRING TAG, - value DOUBLE FIELD - ) -AS root.db.** - -CREATE VIEW viewdb.machine_int32 - ( - sensor_id STRING TAG, - value INT32 FIELD - ) -AS root.db.** - -CREATE VIEW viewdb.machine_int64 - ( - sensor_id STRING TAG, - value INT64 FIELD - ) -AS root.db.** - -... 
-``` - -##### 3.4.2.2 为每一个测点建一个表 - -​**优点**​:查询某一个测点值时,不需要先查一下数据类型,再去决定查询哪张表,简单便捷 - -​**缺点**​:当测点数量较多时,会引入过多的表视图,需要写大量的建视图语句 - -![](/img/tree-to-table-8.png) - -**表视图的创建 SQL:** - -```SQL -CREATE VIEW viewdb.DCS_PIT_02105A - ( - value FLOAT FIELD - ) -AS root.db.DCS_PIT_02105A.** - -CREATE VIEW viewdb.DCS_PIT_02105B - ( - value DOUBLE FIELD - ) -AS root.db.DCS_PIT_02105B.** - -CREATE VIEW viewdb.DCS_XI_02716A - ( - value INT64 FIELD - ) -AS root.db.DCS_XI_02716A.** - -...... -``` diff --git a/src/zh/UserGuide/latest-Table/User-Manual/Window-Function_timecho.md b/src/zh/UserGuide/latest-Table/User-Manual/Window-Function_timecho.md deleted file mode 100644 index 6d656d322..000000000 --- a/src/zh/UserGuide/latest-Table/User-Manual/Window-Function_timecho.md +++ /dev/null @@ -1,754 +0,0 @@ - - -# 窗口函数 - -IoTDB 针对时序数据的特色分析场景,提供了窗口函数能力,为时序数据的深度挖掘与复杂计算提供了灵活高效的解决方案。下文将对该功能进行详细的介绍。 - -## 1. 功能介绍 - -窗口函数(Window Function) 是一种基于与当前行相关的特定行集合(称为“窗口”) 对每一行进行计算的特殊函数。它将分组操作(`PARTITION BY`)、排序(`ORDER BY`)与可定义的计算范围(窗口框架 `FRAME`)结合,在不折叠原始数据行的前提下实现复杂的跨行计算。常用于数据分析场景,比如排名、累计和、移动平均等操作。 - -> 注意:该功能从 V 2.0.5 版本开始提供。 - -例如,某场景下需要查询不同设备的功耗累加值,即可通过窗口函数来实现。 - -```SQL --- 原始数据 -+-----------------------------+------+-----+ -| time|device| flow| -+-----------------------------+------+-----+ -|1970-01-01T08:00:00.000+08:00| d0| 3| -|1970-01-01T08:00:00.001+08:00| d0| 5| -|1970-01-01T08:00:00.002+08:00| d0| 3| -|1970-01-01T08:00:00.003+08:00| d0| 1| -|1970-01-01T08:00:00.004+08:00| d1| 2| -|1970-01-01T08:00:00.005+08:00| d1| 4| -+-----------------------------+------+-----+ - --- 创建表并插入数据 -CREATE TABLE device_flow(device String tag, flow INT32 FIELD); -insert into device_flow(time, device ,flow ) values ('1970-01-01T08:00:00.000+08:00','d0',3),('1970-01-01T08:00:01.000+08:00','d0',5),('1970-01-01T08:00:02.000+08:00','d0',3),('1970-01-01T08:00:03.000+08:00','d0',1),('1970-01-01T08:00:04.000+08:00','d1',2),('1970-01-01T08:00:05.000+08:00','d1',4); - - ---执行窗口函数查询 -SELECT *, sum(flow) ​OVER(PARTITION​ 
​BY​ device ​ORDER​ ​BY​ flow) ​as​ sum ​FROM device_flow; -``` - -经过分组、排序、计算(步骤拆解如下图所示), - -![](/img/window-function-1.png) - -即可得到期望结果: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -## 2. 功能定义 -### 2.1 SQL 定义 - -```SQL -windowDefinition - : name=identifier AS '(' windowSpecification ')' - ; - -windowSpecification - : (existingWindowName=identifier)? - (PARTITION BY partition+=expression (',' partition+=expression)*)? - (ORDER BY sortItem (',' sortItem)*)? - windowFrame? - ; - -windowFrame - : frameExtent - ; - -frameExtent - : frameType=RANGE start=frameBound - | frameType=ROWS start=frameBound - | frameType=GROUPS start=frameBound - | frameType=RANGE BETWEEN start=frameBound AND end=frameBound - | frameType=ROWS BETWEEN start=frameBound AND end=frameBound - | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound - ; - -frameBound - : UNBOUNDED boundType=PRECEDING #unboundedFrame - | UNBOUNDED boundType=FOLLOWING #unboundedFrame - | CURRENT ROW #currentRowBound - | expression boundType=(PRECEDING | FOLLOWING) #boundedFrame - ; -``` - -### 2.2 窗口定义 -#### 2.2.1 Partition - -`PARTITION BY` 用于将数据分为多个独立、不相关的「组」,窗口函数只能访问并操作其所属分组内的数据,无法访问其它分组。该子句是可选的;如果未显式指定,则默认将所有数据分到同一组。值得注意的是,与 `GROUP BY` 通过聚合函数将一组数据规约成一行不同,`PARTITION BY` 的窗口函数**并不会影响组内的行数。** - -* 示例 - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER (PARTITION BY device) as count FROM device_flow; -``` - -拆解步骤: - -![](/img/window-function-2.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ 
-|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 4| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -|1970-01-01T08:00:02.000+08:00| d0| 3| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 4| -+-----------------------------+------+----+-----+ -``` - -#### 2.2.2 Ordering - -`ORDER BY` 用于对 partition 内的数据进行排序。排序后,相等的行被称为 peers。peers 会影响窗口函数的行为,例如不同 rank function 对 peers 的处理不同;不同 frame 的划分方式对于 peers 的处理也不同。该子句是可选的。 - -* 示例 - -查询语句: - -```SQL -IoTDB> SELECT *, rank() OVER (PARTITION BY device ORDER BY flow) as rank FROM device_flow; -``` - -拆解步骤: - -![](/img/window-function-3.png) - -查询结果: - -```SQL -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -#### 2.2.3 Framing - -对于 partition 中的每一行,窗口函数都会在相应的一组行上求值,这些行称为 Frame(即 Window Function 在每一行上的输入域)。Frame 可以手动指定,指定时涉及两个属性,具体说明如下。 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
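-| Frame attribute | Value | Description |
-| --- | --- | --- |
-| Type | `ROWS` | Divides the frame by row number |
-| Type | `GROUPS` | Divides the frame by peers: rows with equal values are treated as equivalent. All rows that are peers of each other form one group, called a peer group |
-| Type | `RANGE` | Divides the frame by value |
-| Start and end bounds | `UNBOUNDED PRECEDING` | The first row of the whole partition |
-| Start and end bounds | `offset PRECEDING` | The row at "distance" offset before the current row |
-| Start and end bounds | `CURRENT ROW` | The current row |
-| Start and end bounds | `offset FOLLOWING` | The row at "distance" offset after the current row |
-| Start and end bounds | `UNBOUNDED FOLLOWING` | The last row of the whole partition |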
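
The three frame types in the table above can be compared with a small sketch. It is plain Python rather than IoTDB code, and `frame_counts` is a hypothetical helper introduced only for illustration. It assumes the values of one partition are already sorted ascending by the ORDER BY column, and it counts the rows in the frame `BETWEEN offset PRECEDING AND CURRENT ROW` for each row.

```python
from bisect import bisect_left, bisect_right


def frame_counts(values, frame_type, offset):
    """Count the rows in `<frame_type> BETWEEN offset PRECEDING AND CURRENT ROW`.

    `values` is the ORDER BY column of a single partition, sorted ascending.
    This only illustrates the frame semantics; it is not how IoTDB evaluates frames.
    """
    groups = sorted(set(values))  # each distinct value is one peer group
    counts = []
    for i, v in enumerate(values):
        if frame_type == "ROWS":
            # Bounds are physical row positions.
            start, end = max(0, i - offset), i + 1
        elif frame_type == "GROUPS":
            # Bounds are whole peer groups; CURRENT ROW as the end bound
            # means the last row of the current peer group.
            g = groups.index(v)
            start = bisect_left(values, groups[max(0, g - offset)])
            end = bisect_right(values, v)
        elif frame_type == "RANGE":
            # Bounds are value distances from the current row's value.
            start = bisect_left(values, v - offset)
            end = bisect_right(values, v)
        else:
            raise ValueError(f"unknown frame type: {frame_type}")
        counts.append(end - start)
    return counts


flows_d0 = [1, 3, 3, 5]  # flow values of device d0, sorted by flow
print(frame_counts(flows_d0, "ROWS", 1))    # [1, 2, 2, 2]
print(frame_counts(flows_d0, "GROUPS", 1))  # [1, 3, 3, 3]
print(frame_counts(flows_d0, "RANGE", 2))   # [1, 3, 3, 3]
```

The two rows with flow 3 are peers, which is why GROUPS and RANGE give them the same count while ROWS treats them as separate positions.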
-
-The meaning of `CURRENT ROW`, `offset PRECEDING`, and `offset FOLLOWING` depends on the frame type, as shown in the following table:
-
-| | `ROWS` | `GROUPS` | `RANGE` |
-| --- | --- | --- | --- |
-| `CURRENT ROW` | The current row | A peer group contains multiple rows, so the meaning depends on whether the bound is frame\_start or frame\_end: as frame\_start, it is the first row of the peer group; as frame\_end, it is the last row of the peer group. | Same as GROUPS: as frame\_start, it is the first row of the peer group; as frame\_end, it is the last row of the peer group. |
-| `offset PRECEDING` | The row offset rows before the current row | The peer group offset groups before the current one | Preceding rows whose value differs from the current row's value by at most offset are included in the frame |
-| `offset FOLLOWING` | The row offset rows after the current row | The peer group offset groups after the current one | Following rows whose value differs from the current row's value by at most offset are included in the frame |
-
-The syntax is as follows:
-
-```SQL
---- Specify both frame_start and frame_end
-{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end
---- Specify only frame_start; frame_end is CURRENT ROW
-{ RANGE | ROWS | GROUPS } frame_start
-```
-
-If no frame is specified explicitly, the default frame is determined as follows:
-
-* When the window function uses ORDER BY: the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (from the first row of the partition through the current row and its peers). For example, in RANK() OVER(PARTITION BY col1 ORDER BY col2), the frame by default contains the current row and all rows before it in the partition.
-* When the window function does not use ORDER BY: the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING (all rows of the partition). For example, in AVG(col2) OVER(PARTITION BY col1), the frame by default contains all rows of the partition, so the average of the whole partition is computed.
-
-Note that when the frame type is GROUPS or RANGE, `ORDER BY` must be specified. The difference is that ORDER BY for GROUPS may involve multiple fields, whereas RANGE performs arithmetic on the sort value, so only one field may be specified.
-
-* Examples
-
-1. 
Frame 类型为 ROWS - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ROWS 1 PRECEDING) as count FROM device_flow; -``` - -拆解步骤: - -* 取前一行和当前行作为 Frame - * 对于 partition 的第一行,由于没有前一行,所以整个 Frame 只有它一行,返回 1; - * 对于 partition 的其他行,整个 Frame 包含当前行和它的前一行,返回 2: - -![](/img/window-function-4.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:01.000+08:00| d0| 5| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 2| -+-----------------------------+------+----+-----+ -``` - -2. Frame 类型为 GROUPS - -查询语句: - -```SQL -IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ORDER BY flow GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -拆解步骤: - -* 取前一个 peer group 和当前 peer group 作为 Frame,那么以 device 为 d0 的 partition 为例(d1同理),对于 count 行数: - * 对于 flow 为 1 的 peer group,由于它也没比它小的 peer group 了,所以整个 Frame 就它一行,返回 1; - * 对于 flow 为 3 的 peer group,它本身包含 2 行,前一个 peer group 就是 flow 为 1 的,就一行,因此整个 Frame 三行,返回 3; - * 对于 flow 为 5 的 peer group,它本身包含 1 行,前一个 peer group 就是 flow 为 3 的,共两行,因此整个 Frame 三行,返回 3。 - -![](/img/window-function-5.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -3. 
Frame 类型为 RANGE - -查询语句: - -```SQL -IoTDB> SELECT *,count(flow) OVER(PARTITION BY device ORDER BY flow RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as count FROM device_flow; -``` - -拆解步骤: - -* 把比当前行数据**小于等于 2 ​**的分为同一个 Frame,那么以 device 为 d0 的 partition 为例(d1 同理),对于 count 行数: - * 对于 flow 为 1 的行,由于它是最小的行了,所以整个 Frame 就它一行,返回 1; - * 对于 flow 为 3 的行,注意 CURRENT ROW 是作为 frame\_end 存在,因此是整个 peer group 的最后一行,符合要求比它小的共 1 行,然后 peer group 有 2 行,所以整个 Frame 共 3 行,返回 3; - * 对于 flow 为 5 的行,它本身包含 1 行,符合要求的比它小的共 2 行,所以整个 Frame 共 3 行,返回 3。 - -![](/img/window-function-6.png) - -查询结果: - -```SQL -+-----------------------------+------+----+-----+ -| time|device|flow|count| -+-----------------------------+------+----+-----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----+ -``` - -## 3. 内置的窗口函数 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
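-| Category | Function | Definition | Supports FRAME clause |
-| --- | --- | --- | --- |
-| Aggregate Function | All built-in aggregate functions | Aggregates a set of values into a single result. | Yes |
-| Value Function | `first_value` | Returns the first value of the frame; if IGNORE NULLS is specified, leading NULLs are skipped | Yes |
-| Value Function | `last_value` | Returns the last value of the frame; if IGNORE NULLS is specified, trailing NULLs are skipped | Yes |
-| Value Function | `nth_value` | Returns the n-th element of the frame (n starts at 1); if IGNORE NULLS is specified, NULLs are skipped | Yes |
-| Value Function | `lead` | Returns the element offset rows after the current row (NULLs are not counted if IGNORE NULLS is specified); if no such element exists (beyond the partition), returns default | No |
-| Value Function | `lag` | Returns the element offset rows before the current row (NULLs are not counted if IGNORE NULLS is specified); if no such element exists (beyond the partition), returns default | No |
-| Rank Function | `rank` | Returns the rank of the current row within the partition; rows with equal values share a rank, and gaps may appear between ranks | No |
-| Rank Function | `dense_rank` | Returns the rank of the current row within the partition; rows with equal values share a rank, and there are no gaps between ranks | No |
-| Rank Function | `row_number` | Returns the row number of the current row within the partition, starting at 1 | No |
-| Rank Function | `percent_rank` | Returns the relative rank of the current row's value within the partition as a fraction, i.e. (rank() - 1) / (n - 1), where n is the number of rows in the partition | No |
-| Rank Function | `cume_dist` | Returns the cumulative distribution of the current row's value within the partition as a fraction, i.e. (number of rows less than or equal to it) / n | No |
-| Rank Function | `ntile` | Given n, assigns each row a bucket number from 1 to n. | No |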
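
The rank functions in the table above differ mainly in how they treat rows with equal values. The following plain-Python sketch recomputes them for one partition; it is an illustration only, not IoTDB's implementation. `rank_columns` is a hypothetical helper, and it assumes the input is the ORDER BY column of a single partition, already sorted ascending.

```python
def rank_columns(values, n_tiles):
    """Compute the rank-function columns for one partition.

    `values` is the ORDER BY column of the partition, sorted ascending.
    Returns one dict per row, in the same order as `values`.
    """
    n = len(values)
    # ntile: every bucket gets `base` rows, and the first `extra` buckets get one more.
    base, extra = divmod(n, n_tiles)
    rows = []
    rank = dense_rank = 0
    for i, v in enumerate(values):
        if i == 0 or v != values[i - 1]:
            rank = i + 1      # skips past all earlier rows, so gaps can appear
            dense_rank += 1   # grows by exactly one per distinct value
        if i < extra * (base + 1):
            ntile = i // (base + 1) + 1
        else:
            ntile = extra + (i - extra * (base + 1)) // base + 1
        rows.append({
            "rank": rank,
            "dense_rank": dense_rank,
            "row_number": i + 1,
            "percent_rank": (rank - 1) / (n - 1) if n > 1 else 0.0,
            "cume_dist": sum(1 for x in values if x <= v) / n,
            "ntile": ntile,
        })
    return rows


# flow values of device d0 from the examples, sorted by flow
for row in rank_columns([1, 3, 3, 5], n_tiles=2):
    print(row)
```

For the two rows with flow 3, `rank` and `dense_rank` both return 2, while `row_number` returns 2 and 3. The next distinct value gets rank 4 but dense rank 3.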
- -### 3.1 Aggregate Function - -所有内置聚合函数,如 `sum()`、`avg()`、`min()`、`max()` 都能当作 Window Function 使用。 - -> 注意:与 GROUP BY 不同,Window Function 中每一行都有相应的输出 - -示例: - -```SQL -IoTDB> SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) as sum FROM device_flow; -+-----------------------------+------+----+----+ -| time|device|flow| sum| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| -|1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| -|1970-01-01T08:00:01.000+08:00| d0| 5|12.0| -+-----------------------------+------+----+----+ -``` - -### 3.2 Value Function -1. `first_value` - -* 函数名:`first_value(value) [IGNORE NULLS]` -* 定义:返回 frame 的第一个值,如果指定了 IGNORE NULLS 需要跳过前缀的 NULL; -* 示例: - -```SQL -IoTDB> SELECT *, first_value(flow) OVER w as first_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+-----------+ -| time|device|flow|first_value| -+-----------------------------+------+----+-----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 2| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 1| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+-----------+ -``` - -2. 
`last_value` - -* 函数名:`last_value(value) [IGNORE NULLS]` -* 定义:返回 frame 的最后一个值,如果指定了 IGNORE NULLS 需要跳过后缀的 NULL; -* 示例: - -```SQL -IoTDB> SELECT *, last_value(flow) OVER w as last_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|last_value| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 5| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -3. `nth_value` - -* 函数名:`nth_value(value, n) [IGNORE NULLS]` -* 定义:返回 frame 的第 n 个元素(注意 n 是从 1 开始),如果有 IGNORE NULLS 需要跳过 NULL; -* 示例: - -```SQL -IoTDB> SELECT *, nth_value(flow, 2) OVER w as nth_values FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); -+-----------------------------+------+----+----------+ -| time|device|flow|nth_values| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 4| -|1970-01-01T08:00:05.000+08:00| d1| 4| 4| -|1970-01-01T08:00:03.000+08:00| d0| 1| 3| -|1970-01-01T08:00:00.000+08:00| d0| 3| 3| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 5| -+-----------------------------+------+----+----------+ -``` - -4. 
`lead`
-
-* Function: `lead(value[, offset[, default]]) [IGNORE NULLS]`
-* Definition: Returns the element offset rows after the current row (NULLs are not counted if IGNORE NULLS is specified). If no such element exists (beyond the partition), default is returned. offset defaults to 1 and default defaults to NULL.
-* The lead function requires an ORDER BY clause in the window definition.
-* Example:
-
-```SQL
-IoTDB> SELECT *, lead(flow) OVER w as lead FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time);
-+-----------------------------+------+----+----+
-|                         time|device|flow|lead|
-+-----------------------------+------+----+----+
-|1970-01-01T08:00:04.000+08:00|    d1|   2|   4|
-|1970-01-01T08:00:05.000+08:00|    d1|   4|null|
-|1970-01-01T08:00:00.000+08:00|    d0|   3|   5|
-|1970-01-01T08:00:01.000+08:00|    d0|   5|   3|
-|1970-01-01T08:00:02.000+08:00|    d0|   3|   1|
-|1970-01-01T08:00:03.000+08:00|    d0|   1|null|
-+-----------------------------+------+----+----+
-```
-
-5. `lag`
-
-* Function: `lag(value[, offset[, default]]) [IGNORE NULLS]`
-* Definition: Returns the element offset rows before the current row (NULLs are not counted if IGNORE NULLS is specified). If no such element exists (beyond the partition), default is returned. offset defaults to 1 and default defaults to NULL.
-* The lag function requires an ORDER BY clause in the window definition.
-* Example:
-
-```SQL
-IoTDB> SELECT *, lag(flow) OVER w as lag FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time);
-+-----------------------------+------+----+----+
-|                         time|device|flow| lag|
-+-----------------------------+------+----+----+
-|1970-01-01T08:00:04.000+08:00|    d1|   2|null|
-|1970-01-01T08:00:05.000+08:00|    d1|   4|   2|
-|1970-01-01T08:00:00.000+08:00|    d0|   3|null|
-|1970-01-01T08:00:01.000+08:00|    d0|   5|   3|
-|1970-01-01T08:00:02.000+08:00|    d0|   3|   5|
-|1970-01-01T08:00:03.000+08:00|    d0|   1|   3|
-+-----------------------------+------+----+----+
-```
-
-### 3.3 Rank Function
-1. 
rank - -* 函数名:`rank()` -* 定义:返回当前行在整个 partition 中的序号,值相同的行序号相同,序号之间可能有 gap; -* 示例: - -```SQL -IoTDB> SELECT *, rank() OVER w as rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----+ -| time|device|flow|rank| -+-----------------------------+------+----+----+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----+ -``` - -2. dense\_rank - -* 函数名:`dense_rank()` -* 定义:返回当前行在整个 partition 中的序号,值相同的行序号相同,序号之间没有 gap。 -* 示例: - -```SQL -IoTDB> SELECT *, dense_rank() OVER w as dense_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|dense_rank| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 2| -|1970-01-01T08:00:01.000+08:00| d0| 5| 3| -+-----------------------------+------+----+----------+ -``` - -3. 
row\_number - -* 函数名:`row_number()` -* 定义:返回当前行在整个 partition 中的行号,注意行号从 1 开始; -* 示例: - -```SQL -IoTDB> SELECT *, row_number() OVER w as row_number FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+----------+ -| time|device|flow|row_number| -+-----------------------------+------+----+----------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 1| -|1970-01-01T08:00:05.000+08:00| d1| 4| 2| -|1970-01-01T08:00:03.000+08:00| d0| 1| 1| -|1970-01-01T08:00:00.000+08:00| d0| 3| 2| -|1970-01-01T08:00:02.000+08:00| d0| 3| 3| -|1970-01-01T08:00:01.000+08:00| d0| 5| 4| -+-----------------------------+------+----+----------+ -``` - -4. percent\_rank - -* 函数名:`percent_rank()` -* 定义:以百分比的形式,返回当前行的值在整个 partition 中的序号;即 **(rank() - 1) / (n - 1)**,其中 n 是整个 partition 的行数; -* 示例: - -```SQL -IoTDB> SELECT *, percent_rank() OVER w as percent_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); -+-----------------------------+------+----+------------------+ -| time|device|flow| percent_rank| -+-----------------------------+------+----+------------------+ -|1970-01-01T08:00:04.000+08:00| d1| 2| 0.0| -|1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| -|1970-01-01T08:00:03.000+08:00| d0| 1| 0.0| -|1970-01-01T08:00:00.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:02.000+08:00| d0| 3|0.3333333333333333| -|1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| -+-----------------------------+------+----+------------------+ -``` - -5. 
`cume_dist`
-
-* Function: `cume_dist()`
-* Definition: Returns the cumulative distribution of the current row's value within the partition as a fraction, i.e. **(number of rows less than or equal to it) / n**.
-* Example:
-
-```SQL
-IoTDB> SELECT *, cume_dist() OVER w as cume_dist FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow);
-+-----------------------------+------+----+---------+
-|                         time|device|flow|cume_dist|
-+-----------------------------+------+----+---------+
-|1970-01-01T08:00:04.000+08:00|    d1|   2|      0.5|
-|1970-01-01T08:00:05.000+08:00|    d1|   4|      1.0|
-|1970-01-01T08:00:03.000+08:00|    d0|   1|     0.25|
-|1970-01-01T08:00:00.000+08:00|    d0|   3|     0.75|
-|1970-01-01T08:00:02.000+08:00|    d0|   3|     0.75|
-|1970-01-01T08:00:01.000+08:00|    d0|   5|      1.0|
-+-----------------------------+------+----+---------+
-```
-
-6. `ntile`
-
-* Function: `ntile(n)`
-* Definition: Given n, assigns each row a bucket number from 1 to n.
-    * If the partition has fewer rows than n, the bucket number is simply the row number.
-    * If the partition has more rows than n:
-        * If the row count is divisible by n, the rows are split evenly. For example, with 4 rows and n = 2, the numbers are 1, 1, 2, 2.
-        * If the row count is not divisible by n, the extra rows go to the first buckets. For example, with 5 rows and n = 3, the numbers are 1, 1, 2, 2, 3.
-* Example:
-
-```SQL
-IoTDB> SELECT *, ntile(2) OVER w as ntile FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow);
-+-----------------------------+------+----+-----+
-|                         time|device|flow|ntile|
-+-----------------------------+------+----+-----+
-|1970-01-01T08:00:04.000+08:00|    d1|   2|    1|
-|1970-01-01T08:00:05.000+08:00|    d1|   4|    2|
-|1970-01-01T08:00:03.000+08:00|    d0|   1|    1|
-|1970-01-01T08:00:00.000+08:00|    d0|   3|    1|
-|1970-01-01T08:00:02.000+08:00|    d0|   3|    2|
-|1970-01-01T08:00:01.000+08:00|    d0|   5|    2|
-+-----------------------------+------+----+-----+
-```
-
-## 4. Scenario Examples
-1. 
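Multi-device diff function
-
-For every row of every device, compute the difference from the previous row:
-
-```SQL
-SELECT
-    *,
-    measurement - lag(measurement) OVER (PARTITION BY device ORDER BY time)
-FROM data
-WHERE timeCondition;
-```
-
-For every row of every device, compute the difference from the next row:
-
-```SQL
-SELECT
-    *,
-    measurement - lead(measurement) OVER (PARTITION BY device ORDER BY time)
-FROM data
-WHERE timeCondition;
-```
-
-For every row of a single device, compute the difference from the previous row (the next row works the same way):
-
-```SQL
-SELECT
-    *,
-    measurement - lag(measurement) OVER (ORDER BY time)
-FROM data
-WHERE device = 'd1' AND timeCondition;
-```
-
-2. Multi-device TOP\_K/BOTTOM\_K
-
-Use rank to obtain the rank of each row, then keep the desired rows in the outer query.
-
-(Note: window functions are evaluated after the HAVING clause, so a subquery is needed here to filter on their result.)
-
-```SQL
-SELECT *
-FROM(
-    SELECT
-        *,
-        rank() OVER (PARTITION BY device ORDER BY time DESC) AS rank
-    FROM data
-    WHERE timeCondition
-)
-WHERE rank <= 3;
-```
-
-Besides ordering by time, rows can also be ordered by the measurement value:
-
-```SQL
-SELECT *
-FROM(
-    SELECT
-        *,
-        rank() OVER (PARTITION BY device ORDER BY measurement DESC) AS rank
-    FROM data
-    WHERE timeCondition
-)
-WHERE rank <= 3;
-```
-
-3. Multi-device CHANGE\_POINTS
-
-This SQL removes runs of consecutive identical values from the input series, keeping the last row of each run. It can be implemented with lead and a subquery:
-
-```SQL
-SELECT
-    time,
-    device,
-    measurement
-FROM(
-    SELECT
-        time,
-        device,
-        measurement,
-        LEAD(measurement) OVER (PARTITION BY device ORDER BY time) AS next
-    FROM data
-    WHERE timeCondition
-)
-WHERE measurement != next OR next IS NULL;
-```
diff --git a/src/zh/UserGuide/latest/AI-capability/AINode_Upgrade_timecho.md b/src/zh/UserGuide/latest/AI-capability/AINode_Upgrade_timecho.md
deleted file mode 100644
index e7464bc7f..000000000
--- a/src/zh/UserGuide/latest/AI-capability/AINode_Upgrade_timecho.md
+++ /dev/null
@@ -1,669 +0,0 @@
-
-
-# AINode
-
-AINode is a native IoTDB node that supports the registration, management, and invocation of time series models. It ships with industry-leading self-developed time series large models, such as the Timer series developed by Tsinghua University. Models are invoked through standard SQL statements, which enables millisecond-level real-time inference on time series data and supports scenarios such as trend forecasting, missing value imputation, and anomaly detection.
-
-The system architecture is shown in the following figure:
-
-![](/img/AINode-0.png)
-
-The responsibilities of the three node types are as follows:
-
-* **ConfigNode**: Responsible for distributed node management and load balancing.
-* **DataNode**: Responsible for receiving and parsing user SQL requests, storing time series data, and performing data preprocessing.
-* **AINode**: Responsible for managing and using time series models.
-
-## 1. 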
优势特点 - -与单独构建机器学习服务相比,具有以下优势: - -* **简单易用**:无需使用 Python 或 Java 编程,使用 SQL 语句即可完成机器学习模型管理与推理的完整流程。如创建模型可使用CREATE MODEL语句、使用模型进行推理可使用 CALL INFERENCE (...) 语句等,使用更加简单便捷。 -* **避免数据迁移**:使用 IoTDB 原生机器学习可以将存储在 IoTDB 中的数据直接应用于机器学习模型的推理,无需将数据移动到单独的机器学习服务平台,从而加速数据处理、提高安全性并降低成本。 - -![](/img/h1.png) - -* **内置先进算法**:支持业内领先机器学习分析算法,覆盖典型时序分析任务,为时序数据库赋能原生数据分析能力。如: - * **时间序列预测(Time Series Forecasting)**:从过去时间序列中学习变化模式;从而根据给定过去时间的观测值,输出未来序列最可能的预测。 - * **时序异常检测(Anomaly Detection for Time Series)**:在给定的时间序列数据中检测和识别异常值,帮助发现时间序列中的异常行为。 - -## 2. 基本概念 - -* **模型(Model)**:机器学习模型,以时序数据作为输入,输出分析任务的结果或决策。模型是 AINode 的基本管理单元,支持模型的增(注册)、删、查、改(微调)、用(推理)。 -* **创建(Create)**: 将外部设计或训练好的模型文件或算法加载到 AINode 中,由 IoTDB 统一管理与使用。 -* **推理(Inference)**:使用创建的模型在指定时序数据上完成该模型适用的时序分析任务。 -* **内置能力(Built-in)**:AINode 自带常见时序分析场景(例如预测与异常检测)的机器学习算法或自研模型。 - -![](/img/h3.png) - -## 3. 安装部署 - -AINode 的部署可参考文档 [AINode 部署](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md) 。 - -## 4. 使用指导 - -TimechoDB-AINode 支持模型推理、模型微调以及模型管理(注册、查看、删除、加载、卸载等)三大功能,下面章节将进行详细说明。 - -### 4.1 模型推理 - -SQL语法如下: - -```SQL -call inference(,inputSql,(=)*) -``` - -在完成模型的注册后(内置模型推理无需注册流程),通过call关键字,调用inference函数就可以使用模型的推理功能,其对应的参数介绍如下: - -* **model\_id**: 对应一个已经注册的模型 -* **sql**:sql查询语句,查询的结果作为模型的输入进行模型推理。查询的结果中行列的维度需要与具体模型config中指定的大小相匹配。(这里的sql不建议使用`SELECT *`子句,因为在IoTDB中,`*`并不会对列进行排序,因此列的顺序是未定义的,可以使用`SELECT ot` 的方式确保列的顺序符合模型输入的预期) -* **parameterName/parameterValue**:参数名/参数值,目前支持: - - | 参数名称 | 参数类型 | 参数描述 | 默认值 | - | ------------------------ | ---------- | -------------------------- | -------- | - | **generateTime** | boolean | 返回结果是否包含时间戳列 | false | - | **outputLength** | int | 指定返回结果的输出长度 | 96 | - - -说明: - -1. 使用内置时序大模型进行推理的前提条件是本地存有对应模型权重,目录为 /TIMECHODB\_AINODE\_HOME/data/ainode/models/builtin/model\_id/。若本地没有模型权重,则会自动从 HuggingFace 拉取,请保证本地能直接访问 HuggingFace。 -2. 
在深度学习应用中,经常将时间戳衍生特征(数据中的时间列)作为生成式任务的协变量,一同输入到模型中以提升模型的效果,但是在模型的输出结果中一般不包含时间列。为了保证实现的通用性,模型推理结果只对应模型的真实输出,如果模型不输出时间列,则结果中不会包含。 - -**示例** - -样本数据 [ETTh-tree](/img/ETTh-tree.csv) - -下面是使用 sundial 模型推理的一个操作示例,输入 96 行, 输出 48 行,我们通过SQL使用其进行推理。 - -```SQL -IoTDB> select OT from root.db.** -+-----------------------------+---------------+ -| Time|root.db.etth.OT| -+-----------------------------+---------------+ -|2016-07-01T00:00:00.000+08:00| 30.531| -|2016-07-01T01:00:00.000+08:00| 27.787| -|2016-07-01T02:00:00.000+08:00| 27.787| -|2016-07-01T03:00:00.000+08:00| 25.044| -|2016-07-01T04:00:00.000+08:00| 21.948| -| ...... | ...... | -|2016-07-04T19:00:00.000+08:00| 29.546| -|2016-07-04T20:00:00.000+08:00| 29.475| -|2016-07-04T21:00:00.000+08:00| 29.264| -|2016-07-04T22:00:00.000+08:00| 30.953| -|2016-07-04T23:00:00.000+08:00| 31.726| -+-----------------------------+---------------+ -Total line number = 96 - -IoTDB> call inference(sundial,"select OT from root.db.**", generateTime=True, outputLength=48) -+-----------------------------+------------------+ -| Time| output| -+-----------------------------+------------------+ -|2016-07-04T23:00:00.000+08:00|30.537494659423828| -|2016-07-04T23:59:22.500+08:00|29.619892120361328| -|2016-07-05T00:58:45.000+08:00|28.815832138061523| -|2016-07-05T01:58:07.500+08:00| 27.91131019592285| -|2016-07-05T02:57:30.000+08:00|26.893848419189453| -| ...... | ...... | -|2016-07-06T17:33:07.500+08:00| 24.40607261657715| -|2016-07-06T18:32:30.000+08:00| 25.00441551208496| -|2016-07-06T19:31:52.500+08:00|24.907312393188477| -|2016-07-06T20:31:15.000+08:00|25.156436920166016| -|2016-07-06T21:30:37.500+08:00|25.335433959960938| -+-----------------------------+------------------+ -Total line number = 48 -``` - -### 4.2 模型微调 - -AINode 支持通过 SQL 进行模型微调任务。 - -**SQL 语法** - -```SQL -createModel - | CREATE MODEL modelId=identifier (WITH HYPERPARAMETERS LR_BRACKET hparamPair (COMMA hparamPair)* RR_BRACKET)? 
FROM MODEL existingModelId=identifier ON DATASET LR_BRACKET trainingData RR_BRACKET
-    ;
-
-trainingData
-    : dataElement(COMMA dataElement)*
-    ;
-
-dataElement
-    : pathPatternElement (LR_BRACKET timeRange RR_BRACKET)?
-    ;
-
-pathPatternElement
-    : PATH path=prefixPath
-    ;
-```
-
-**Parameters**
-
-| Name            | Description |
-| --------------- | ----------- |
-| modelId         | Unique identifier of the fine-tuned model |
-| hparamPair      | Hyperparameter key-value pairs used for fine-tuning. Currently supported:<br>`train_epochs`: int, number of fine-tuning epochs<br>`iter_per_epoch`: int, number of iterations per fine-tuning epoch<br>`learning_rate`: double, learning rate |
-| existingModelId | Base model used for fine-tuning |
-| trainingData    | Dataset used for fine-tuning |
-
-**Example**
-
-1. Select the data of measurement point root.db.etth.ot within the specified time range as the fine-tuning dataset, and create the model sundialv2 based on sundial.
-
-```SQL
-IoTDB> CREATE MODEL sundialv2 FROM MODEL sundial ON DATASET (PATH root.db.etth.OT([1467302400000, 1467644400000)))
-Msg: The statement is executed successfully.
-IoTDB> show models
-+---------------------+---------+-----------+---------+
-|              ModelId|ModelType|   Category|    State|
-+---------------------+---------+-----------+---------+
-|                arima|   sktime|    builtin|   active|
-|          holtwinters|   sktime|    builtin|   active|
-|exponential_smoothing|   sktime|    builtin|   active|
-|     naive_forecaster|   sktime|    builtin|   active|
-|       stl_forecaster|   sktime|    builtin|   active|
-|         gaussian_hmm|   sktime|    builtin|   active|
-|              gmm_hmm|   sktime|    builtin|   active|
-|                stray|   sktime|    builtin|   active|
-|             timer_xl|    timer|    builtin|   active|
-|              sundial|  sundial|    builtin|   active|
-|             chronos2|       t5|    builtin|   active|
-|            sundialv2|  sundial| fine_tuned| training|
-+---------------------+---------+-----------+---------+
-```
-
-2. The fine-tuning task starts asynchronously in the background, and its log can be seen in the AINode process. After fine-tuning completes, query and use the new model.
-
-```SQL
-IoTDB> show models
-+---------------------+---------+-----------+---------+
-|              ModelId|ModelType|   Category|    State|
-+---------------------+---------+-----------+---------+
-|                arima|   sktime|    builtin|   active|
-|          holtwinters|   sktime|    builtin|   active|
-|exponential_smoothing|   sktime|    builtin|   active|
-|     naive_forecaster|   sktime|    builtin|   active|
-|       stl_forecaster|   sktime|    builtin|   active|
-|         gaussian_hmm|   sktime|    builtin|   active|
-|              gmm_hmm|   sktime|    builtin|   active|
-|                stray|   sktime|    builtin|   active|
-|             timer_xl|    timer|    builtin|   active|
-|              sundial|  sundial|    builtin|   active|
-|             chronos2|       t5|    builtin|   active|
-|            sundialv2|  sundial| fine_tuned|   active|
-+---------------------+---------+-----------+---------+
-```
-
-### 4.3 Registering Custom Models
-
-**Transformers models that meet the following requirements can be registered to AINode:**
-
-1. AINode currently uses transformers v4.56.2; when building a model, **avoid inheriting interfaces from older versions (<4.50)**;
-2. 
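The model must inherit one of AINode's inference task pipelines (currently the forecasting pipeline is supported):
-    * iotdb-core/ainode/iotdb/ainode/core/inference/pipeline/basic\_pipeline.py
-
-    **Before V2.0.9.3**
-    ```Python
-    from abc import ABC, abstractmethod
-
-    import torch
-
-    # load_model: model-loading helper used by AINode (import omitted here)
-
-    class BasicPipeline(ABC):
-        def __init__(self, model_info, **model_kwargs):
-            self.model_info = model_info
-            self.device = model_kwargs.get("device", "cpu")
-            self.model = load_model(model_info, device_map=self.device, **model_kwargs)
-
-        @abstractmethod
-        def preprocess(self, inputs, **infer_kwargs):
-            """
-            Preprocess the input data before the inference task starts,
-            including shape validation and value conversion.
-            """
-            pass
-
-        @abstractmethod
-        def postprocess(self, output, **infer_kwargs):
-            """
-            Postprocess the output after the inference task finishes.
-            """
-            pass
-
-
-    class ForecastPipeline(BasicPipeline):
-        def __init__(self, model_info, **model_kwargs):
-            super().__init__(model_info, model_kwargs=model_kwargs)
-
-        def preprocess(self, inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]], **infer_kwargs):
-            """
-            Preprocess the input data before it is passed to the model for inference,
-            validating its shape and type.
-
-            Args:
-                inputs (list[dict]):
-                    Input data, a list of dicts, each containing:
-                        - 'targets': a tensor of shape (input_length,) or (target_count, input_length).
-                        - 'past_covariates': optional, a dict of tensors, each of shape (input_length,).
-                        - 'future_covariates': optional, a dict of tensors, each of shape (input_length,).
-
-                infer_kwargs (dict, optional): extra keyword arguments for inference, such as:
-                    - `output_length`(int): used to validate 'future_covariates' if they are provided.
-
-            Raises:
-                ValueError: if the input format is incorrect (e.g. missing keys, invalid tensor shapes).
-
-            Returns:
-                The preprocessed and validated input data, ready for model inference.
-            """
-            pass
-
-        def forecast(self, inputs, **infer_kwargs):
-            """
-            Run forecasting on the given inputs.
-
-            Parameters:
-                inputs: the input data to forecast from. Its type and structure depend on the model implementation.
-                **infer_kwargs: extra inference parameters, for example:
-                    - `output_length`(int): the number of time points the model should generate.
-
-            Returns:
-                The forecast output; its form depends on the model implementation.
-            """
-            pass
-
-        def postprocess(self, outputs: list[torch.Tensor], **infer_kwargs) -> list[torch.Tensor]:
-            """
-            Postprocess the model output after inference, validating the output shape
-            and making sure it has the expected dimensions.
-
-            Args:
-                outputs:
-                    Model output, a list of 2D tensors, each of shape `[target_count, output_length]`.
-
-            Raises:
-                InferenceModelInternalException: if an output tensor has an invalid shape (e.g. wrong number of dimensions).
-                ValueError: if the output format is incorrect.
-
-            Returns:
-                list[torch.Tensor]:
-                    The postprocessed output, a list of 2D tensors.
-            """
-            pass
-    ```
-
-    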
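As a rough illustration of how the three stages of the interface above fit together, the sketch below implements a toy "repeat the last value" forecaster. It is an assumption-laden stand-in, not AINode code: `ForecastPipelineSketch`, `LastValuePipeline`, and `run` are hypothetical names, plain Python lists replace `torch.Tensor` so the example runs without dependencies, and the default `output_length` of 96 simply mirrors the documented `outputLength` default. A real custom pipeline would subclass AINode's own `ForecastPipeline` and use the method names of the AINode version in use.

```python
from abc import ABC, abstractmethod


class ForecastPipelineSketch(ABC):
    """Hypothetical stand-in for AINode's ForecastPipeline (not the real class)."""

    def __init__(self, model_info, **model_kwargs):
        self.model_info = model_info
        self.device = model_kwargs.get("device", "cpu")

    @abstractmethod
    def preprocess(self, inputs, **infer_kwargs): ...

    @abstractmethod
    def forecast(self, inputs, **infer_kwargs): ...

    @abstractmethod
    def postprocess(self, outputs, **infer_kwargs): ...


class LastValuePipeline(ForecastPipelineSketch):
    """Toy pipeline: repeats the last observed value of each target series."""

    def preprocess(self, inputs, **infer_kwargs):
        cleaned = []
        for item in inputs:
            if "targets" not in item:
                raise ValueError("each input dict needs a 'targets' entry")
            targets = item["targets"]
            # Promote a 1-D series (input_length,) to 2-D (1, input_length).
            if targets and not isinstance(targets[0], (list, tuple)):
                targets = [targets]
            if not targets or any(len(row) == 0 for row in targets):
                raise ValueError("'targets' must not be empty")
            cleaned.append([[float(v) for v in row] for row in targets])
        return cleaned

    def forecast(self, inputs, **infer_kwargs):
        # Assumed default of 96, mirroring the documented outputLength default.
        output_length = infer_kwargs.get("output_length", 96)
        return [[[row[-1]] * output_length for row in targets] for targets in inputs]

    def postprocess(self, outputs, **infer_kwargs):
        # Each output must be rectangular: [target_count, output_length].
        for out in outputs:
            if len({len(row) for row in out}) != 1:
                raise ValueError("every target must have the same output length")
        return outputs


def run(pipeline, inputs, **infer_kwargs):
    """Chain preprocess -> forecast -> postprocess."""
    prepared = pipeline.preprocess(inputs, **infer_kwargs)
    raw = pipeline.forecast(prepared, **infer_kwargs)
    return pipeline.postprocess(raw, **infer_kwargs)


pipeline = LastValuePipeline(model_info={"model_id": "toy"}, device="cpu")
result = run(pipeline, [{"targets": [30.5, 27.8, 25.0]}], output_length=4)
print(result)  # [[[25.0, 25.0, 25.0, 25.0]]]
```

Keeping validation in `preprocess` and `postprocess` means the `forecast` step can assume well-formed 2-D input, which is the same separation of concerns the documented docstrings describe.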
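**From V2.0.9.3 onward**
-    ```Python
-    from abc import ABC, abstractmethod
-
-    import torch
-
-    # load_model: model-loading helper used by AINode (import omitted here)
-
-    class BasicPipeline(ABC):
-        def __init__(self, model_info, **model_kwargs):
-            self.model_info = model_info
-            self.device = model_kwargs.get("device", "cpu")
-            self.model = load_model(model_info, device_map=self.device, **model_kwargs)
-
-        @abstractmethod
-        def preprocess(self, inputs, **infer_kwargs):
-            """
-            Preprocess the input data before the inference task starts,
-            including shape validation and value conversion.
-            """
-            pass
-
-        @abstractmethod
-        def postprocess(self, output, **infer_kwargs):
-            """
-            Postprocess the output after the inference task finishes.
-            """
-            pass
-
-
-    class ForecastPipeline(BasicPipeline):
-        def __init__(self, model_info, **model_kwargs):
-            super().__init__(model_info, model_kwargs=model_kwargs)
-
-        def _preprocess(
-            self,
-            inputs: list[dict[str, dict[str, torch.Tensor] | torch.Tensor]],
-            **infer_kwargs,
-        ):
-            """
-            Preprocess the input data before it is passed to the model for inference,
-            validating its shape and type.
-
-            Args:
-                inputs (list[dict[str, dict[str, torch.Tensor] | torch.Tensor]]):
-                    Input data, a list of dicts, each containing:
-                        - 'targets': a tensor of shape (input_length,) or (target_count, input_length).
-                        - 'past_covariates': optional, a dict of tensors, each of shape (input_length,).
-                        - 'future_covariates': optional, a dict of tensors, each of shape (input_length,).
-
-                infer_kwargs (dict, optional): extra keyword arguments for inference, such as:
-                    - `output_length`(int): used to validate 'future_covariates' if they are provided.
-
-            Raises:
-                ValueError: if the input format is incorrect (e.g. missing keys, invalid tensor shapes).
-
-            Returns:
-                The preprocessed and validated input data, ready for model inference.
-            """
-            pass
-
-        def forecast(self, inputs, **infer_kwargs):
-            """
-            Run forecasting on the given inputs.
-
-            Parameters:
-                inputs: the input data to forecast from. Its type and structure depend on the model implementation.
-                **infer_kwargs: extra inference parameters, for example:
-                    - `output_length`(int): the number of time points the model should generate.
-
-            Returns:
-                The forecast output; its form depends on the model implementation.
-            """
-            pass
-
-        def _postprocess(self, outputs, **infer_kwargs) -> list[torch.Tensor]:
-            """
-            Postprocess the model output after inference, validating the output shape
-            and making sure it has the expected dimensions.
-
-            Args:
-                outputs:
-                    Model output, a list of 2D tensors, each of shape `[target_count, output_length]`.
-
-            Raises:
-                InferenceModelInternalException: if an output tensor has an invalid shape (e.g. wrong number of dimensions).
-                ValueError: if the output format is incorrect.
-
-            Returns:
-                list[torch.Tensor]:
-                    The postprocessed output, a list of 2D tensors.
-            """
-            pass
-    ```
-
-3. 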
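Modify the model configuration file config.json so that it contains the following fields:
-
-    **Before V2.0.9.3**
-    ```JSON
-    {
-      "auto_map": {
-        "AutoConfig": "config.Chronos2CoreConfig",          // Model Config class
-        "AutoModelForCausalLM": "model.Chronos2Model"       // Model class
-      },
-      "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline", // Inference pipeline of the model
-      "model_type": "custom_t5"                             // Model type
-    }
-    ```
-
-    * The model's Config class and model class must be specified through auto\_map;
-    * An inference pipeline class must be integrated and specified;
-    * For built-in (builtin) and custom (user\_defined) models managed by AINode, the model type (model\_type) also serves as a unique identifier. That is, the model type being registered must not duplicate any existing model type; models created through fine-tuning inherit the model type of the original model.
-
-    **From V2.0.9.3 onward**
-    > The model_type parameter is optional
-    ```JSON
-    {
-      "auto_map": {
-        "AutoConfig": "config.Chronos2CoreConfig",          // Model Config class
-        "AutoModelForCausalLM": "model.Chronos2Model"       // Model class
-      },
-      "pipeline_cls": "pipeline_chronos2.Chronos2Pipeline"  // Inference pipeline of the model
-    }
-    ```
-    * The model's Config class and model class must be specified through auto\_map;
-    * An inference pipeline class must be integrated and specified;
-
-
-4. Make sure the model directory to be registered contains the following files; the names of the configuration file and the weight file cannot be customized:
-    * Model configuration file: config.json;
-    * Model weight file: model.safetensors;
-    * Model code: other .py files.
-
-**The SQL syntax for registering a custom model is as follows:**
-
-```SQL
-CREATE MODEL <model_id> USING URI <uri>
-```
-
-**Parameters:**
-
-* **model\_id**: Unique identifier of the custom model. It must not be duplicated and is subject to the following constraints:
-  * Allowed characters: [ 0-9 a-z A-Z \_ ] (letters, digits (not as the first character), underscores (not as the first character))
-  * Length: 2-64 characters
-  * Case-sensitive
-* **uri**: Local URI of the directory containing the model code and weights.
-
-**Registration example:**
-
-Upload a custom Transformers model from a local path; AINode copies the folder into the user\_defined directory.
-
-```SQL
-CREATE MODEL chronos2 USING URI 'file:///path/to/chronos2'
-```
-
-After the SQL is executed, registration proceeds asynchronously; the registration status can be checked by viewing the model (see the Viewing Models section). Once registration completes, the model can be invoked for inference through ordinary queries.
-
-### 4.4 Viewing Models
-
-Details of successfully registered models can be queried with the show command.
-
-```SQL
-SHOW MODELS
-```
-
-Besides displaying all models, you can specify a `model_id` to view one particular model.
-
-```SQL
-SHOW MODELS <model_id> -- show only the specified model
-```
-
-The result contains the following columns:
-
-| **ModelId**         | **ModelType**         | **Category**         | **State**         |
-| ------------------- | --------------------- | -------------------- | ----------------- |
-| Model ID            | Model type            | Model category       | Model state       |
-
-The state machine transitions of State are shown in the following diagram:
-
-![](/img/ainode-upgrade-state-timecho.png)
-
-State machine notes:
-
-1. After AINode starts, the `show models` command shows only the **built-in (BUILTIN)** models.
-2. 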
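Users can import their own models; the source of such models is marked as **user-defined (USER\_DEFINED)**. AINode tries to parse the model type (ModelType) from the model configuration file; if parsing fails, the field is shown as empty.
-3. The weight files of time series large models (built-in models) are not packaged with AINode; they are downloaded automatically when AINode starts.
-   1. The state is ACTIVATING during the download, changes to ACTIVE on success, and to INACTIVE on failure.
-4. After a user starts a fine-tuning task, the model being trained is in the TRAINING state; it becomes ACTIVE if training succeeds and FAILED otherwise.
-5. If the fine-tuning task succeeds, the file with the best metrics among all ckpt (training checkpoint) files is identified after fine-tuning and automatically renamed to the user-specified model\_id.
-
-**Viewing example**
-
-```SQL
-IoTDB> show models
-+---------------------+--------------+--------------+-------------+
-|              ModelId|     ModelType|      Category|        State|
-+---------------------+--------------+--------------+-------------+
-|                arima|        sktime|       builtin|       active|
-|          holtwinters|        sktime|       builtin|       active|
-|exponential_smoothing|        sktime|       builtin|       active|
-|     naive_forecaster|        sktime|       builtin|       active|
-|       stl_forecaster|        sktime|       builtin|       active|
-|         gaussian_hmm|        sktime|       builtin|       active|
-|              gmm_hmm|        sktime|       builtin|       active|
-|                stray|        sktime|       builtin|       active|
-|               custom|              |  user_defined|       active|
-|             timer_xl|         timer|       builtin|   activating|
-|              sundial|       sundial|       builtin|       active|
-|           sundialx_1|       sundial|    fine_tuned|       active|
-|           sundialx_4|       sundial|    fine_tuned|     training|
-|           sundialx_5|       sundial|    fine_tuned|       failed|
-|             chronos2|            t5|       builtin|     inactive|
-+---------------------+--------------+--------------+-------------+
-```
-
-The built-in traditional time series models are described below:
-
-| Model | Core concept | Applicable scenarios | Key characteristics |
-| ----- | ------------ | -------------------- | ------------------- |
-| **ARIMA** (Autoregressive Integrated Moving Average) | Combines autoregression (AR), differencing (I), and moving average (MA) to forecast stationary time series or data that can be made stationary through differencing. | Univariate time series forecasting, such as stock prices, sales, and economic indicators. | 1. Suited to data with linear trends and weak seasonality. 2. Requires choosing the parameters (p,d,q). 3. Sensitive to missing values. |
-| **Holt-Winters** (triple exponential smoothing) | Based on exponential smoothing with level, trend, and seasonal components; suited to data with trend and seasonality. | Time series with clear seasonality and trend, such as monthly sales or electricity demand. | 1. Handles additive or multiplicative seasonality. 2. Gives higher weight to recent data. 3. Simple to implement. |
-| **Exponential Smoothing** | Takes a weighted average of historical data with weights that decay exponentially over time, emphasizing recent observations. | Data with a trend but no significant seasonality, such as short-term demand forecasting. | 1. Few parameters and simple computation. 2. Suited to stationary or slowly changing series. 3. Can be extended to double or triple exponential smoothing. |
-| **Naive Forecaster** | Uses the most recent observation as the forecast for the next period; the simplest baseline model. | A comparison baseline for other models, or simple forecasting when the data shows no clear pattern. | 1. No training required. 2. Sensitive to sudden changes. 3. The seasonal naive variant forecasts with the value from the same period of the previous season. |
-| **STL Forecaster** (seasonal-trend decomposition forecasting) | Decomposes the time series with STL, forecasts the trend, seasonal, and residual components separately, and then combines them. | Data with complex seasonality, trend, and nonlinear patterns, such as climate data or traffic flow. | 1. Handles non-fixed seasonality. 2. Robust to outliers. 3. After decomposition, each component can be forecast with other models. |
-| **Gaussian HMM** (Gaussian hidden Markov model) | Assumes the observations are generated by hidden states, with each state's observation probability following a Gaussian distribution. | State sequence prediction or classification, such as speech recognition or financial regime identification. | 1. Suited to state modeling of time series data. 2. Assumes observations are independent given the state. 3. The number of hidden states must be specified. |
-| **GMM HMM** (Gaussian mixture hidden Markov model) | Extends Gaussian HMM: each state's observation probability is described by a Gaussian mixture model, capturing more complex observation distributions. | Scenarios that need multimodal observation distributions, such as complex action recognition or biosignal analysis. | 1. More flexible than a single Gaussian. 2. More parameters and higher computational complexity. 3. The number of GMM components must be tuned. |
-| **STRAY** (singular-value-based anomaly detection) | Detects anomalies in high-dimensional data through singular value decomposition (SVD); commonly used for time series anomaly detection. | Anomaly detection on high-dimensional time series, such as sensor networks or IT system monitoring. | 1. No distributional assumptions. 2. Handles high-dimensional data. 3. Sensitive to global anomalies; local anomalies may be missed. |
-
-The built-in time series large models are described below:
-
-| Model | Core concept | Applicable scenarios | Key characteristics |
-| ----- | ------------ | -------------------- | ------------------- |
-| **Timer-XL** | A time series large model that supports ultra-long context, with generalization strengthened by pretraining on large-scale industrial data. | Complex industrial forecasting that must exploit very long history, for example in energy, aerospace, and transportation. | 1. Ultra-long context support; handles inputs of tens of thousands of time points. 2. Broad scenario coverage: non-stationary, multivariate, and covariate forecasting. 3. Pretrained on trillion-scale, high-quality industrial time series data. |
-| **Timer-Sundial** | A generative foundation model with a "Transformer + TimeFlow" architecture, focused on probabilistic forecasting. | Zero-shot forecasting scenarios that need uncertainty quantification, such as finance, supply chain, and renewable power generation forecasting. | 1. Strong zero-shot generalization, supporting both point and probabilistic forecasts. 2. Any statistic of the predictive distribution can be analyzed flexibly. 3. Innovative generative architecture for efficient non-deterministic sample generation. |
-| **Chronos-2** | A general-purpose time series foundation model based on the discrete tokenization paradigm, which turns forecasting into a language modeling task. | Fast zero-shot univariate forecasting, and scenarios where covariates (such as promotions or weather) improve results. | 1. Strong zero-shot probabilistic forecasting. 2. Supports unified covariate modeling, with strict input requirements: a. the set of future covariate names must be a subset of the set of past covariate names; b. the length of each past covariate must equal the length of the target variable; c. the length of each future covariate must equal the forecast length. 3. Uses an efficient encoder-style structure that balances performance and inference speed. |
-
-
-### 4.5 Deleting Models
-
-A successfully registered model can be deleted through SQL; AINode removes the entire corresponding model folder under the user\_defined directory. The SQL syntax is:
-
-```SQL
-DROP MODEL <model_id>
-```
-
-Specify the model\_id of a successfully registered model to delete it. Because deletion involves cleaning up model data, it does not complete immediately; during this time the model state is DROPPING, and a model in this state cannot be used for inference. Note that built-in models cannot be deleted.
-
-### 4.6 Loading/Unloading Models
-
-To suit different scenarios, AINode provides two model loading strategies:
-
-* On-demand loading: the model is loaded temporarily at inference time and its resources are released afterwards. Suitable for testing or low-load scenarios.
-* Resident loading: the model is kept loaded in memory (CPU) or GPU memory to support high-concurrency inference. Users only specify through SQL which model to load or unload, and AINode manages the number of instances automatically. The status of resident models can be viewed at any time.
-
-The following describes loading and unloading in detail:
-
-1. Configuration parameters
-
-Resident loading is configured by editing the following items.
-
-```Properties
-# Fraction of the total device memory / GPU memory that AINode may use for inference
-# Datatype: Float
-ain_inference_memory_usage_ratio=0.4
-
-# Memory ratio required by each loaded model instance, i.e. model footprint * this value
-# Datatype: Float
-ain_inference_extra_memory_ratio=1.2
-```
-
-2. Show available devices
-
-All available device IDs can be listed with the following SQL command.
-
-```SQL
-SHOW AI_DEVICES
-```
-
-Example
-
-```SQL
-IoTDB> show ai_devices
-+-------------+
-|     DeviceId|
-+-------------+
-|          cpu|
-|            0|
-|            1|
-+-------------+
-```
-
-3. Load a model
-
-A model can be loaded manually with the following SQL command; the system **automatically balances** the number of model instances according to hardware resource usage.
-
-```SQL
-LOAD MODEL <existing_model_id> TO DEVICES <device_id>(, <device_id>)*
-```
-
-Parameters
-
-* **existing\_model\_id:** ID of the model; the current version supports only timer\_xl and sundial.
-* **device\_id:** Where the model is loaded.
-  * **cpu:** Load into the memory of the server hosting AINode.
-  * **gpu\_id:** Load onto the corresponding GPUs of the server hosting AINode; for example, "0, 1" loads onto GPUs 0 and 1.
-
-Example
-
-```SQL
-LOAD MODEL sundial TO DEVICES 'cpu,0,1'
-```
-
-4. Unload a model
-
-All instances of a specified model can be unloaded manually with the following SQL command; the system **reallocates** the freed resources to other models.
-
-```SQL
-UNLOAD MODEL <existing_model_id> FROM DEVICES <device_id>(, <device_id>)*
-```
-
-Parameters
-
-* **existing\_model\_id:** ID of the model; the current version supports only timer\_xl and sundial.
-* **device\_id:** Where the model is loaded.
-  * **cpu:** Try to unload the specified model from the memory of the server hosting AINode.
-  * **gpu\_id:** Try to unload the specified model from the corresponding GPUs of the server hosting AINode; for example, "0, 1" tries to unload it from GPUs 0 and 1.
-
-Example
-
-```SQL
-UNLOAD MODEL sundial FROM DEVICES 'cpu,0,1'
-```
-
-5. 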
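Show loaded models
-
-Manually loaded model instances can be viewed with the following SQL command; `device_id` restricts the output to specific devices.
-
-```SQL
-SHOW LOADED MODELS
-SHOW LOADED MODELS <device_id>(, <device_id>)* # show model instances on the specified devices
-```
-
-Example: the sundial model has been loaded into memory and onto the two GPUs gpu\_0 and gpu\_1
-
-```SQL
-IoTDB> show loaded models
-+-------------+--------------+------------------+
-|     DeviceId|       ModelId|  Count(instances)|
-+-------------+--------------+------------------+
-|          cpu|       sundial|                 4|
-|            0|       sundial|                 6|
-|            1|       sundial|                 6|
-+-------------+--------------+------------------+
-```
-
-Notes:
-
-* DeviceId: device ID
-* ModelId: ID of the loaded model
-* Count(instances): number of model instances on each device (assigned automatically by the system)
-
-### 4.7 Time Series Large Models
-
-AINode currently supports several time series large models. For an introduction, deployment, and usage, see [Time Series Large Models](../AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md)
-
-## 5. Permission Management
-
-AINode features rely on IoTDB's own authentication for permission management. A user can use the model management features only with the USE\_MODEL permission. To use inference, the user also needs permission to access the source series referenced by the SQL that feeds the model.
-
-| Permission   | Scope                                     | Administrator (default ROOT) | Ordinary user | Path-related |
-| ------------ | ----------------------------------------- | ---------------------------- | ------------- | ------------ |
-| USE\_MODEL   | create model / show models / drop model   | √                            | √             | x            |
-| READ\_DATA   | call inference                            | √                            | √             | √            |
diff --git a/src/zh/UserGuide/latest/AI-capability/AINode_timecho.md b/src/zh/UserGuide/latest/AI-capability/AINode_timecho.md
deleted file mode 100644
index 22e29542e..000000000
--- a/src/zh/UserGuide/latest/AI-capability/AINode_timecho.md
+++ /dev/null
@@ -1,668 +0,0 @@
-
-
-# AINode
-
-AINode is a native IoTDB node that supports the registration, management, and invocation of time series models. It ships with industry-leading self-developed time series large models, such as the Timer series developed by Tsinghua University. Models are invoked with standard SQL statements, enabling millisecond-level real-time inference on time series data for scenarios such as trend forecasting, missing value imputation, and anomaly detection.
-
-> Supported in V2.0.5.1 and later
-
-The system architecture is shown in the following figure:
-
-![](/img/AINode-0.png)
-
-The responsibilities of the three node types are as follows:
-
-- **ConfigNode**: Responsible for distributed node management and load balancing.
-- **DataNode**: Responsible for receiving and parsing user SQL requests, storing time series data, and performing data preprocessing computations.
-- **AINode**: Responsible for managing and using time series models.
-
-## 1. 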
优势特点 - -与单独构建机器学习服务相比,具有以下优势: - -- **简单易用**:无需使用 Python 或 Java 编程,使用 SQL 语句即可完成机器学习模型管理与推理的完整流程。如创建模型可使用CREATE MODEL语句、使用模型进行推理可使用CALL INFERENCE(...)语句等,使用更加简单便捷。 - -- **避免数据迁移**:使用 IoTDB 原生机器学习可以将存储在 IoTDB 中的数据直接应用于机器学习模型的推理,无需将数据移动到单独的机器学习服务平台,从而加速数据处理、提高安全性并降低成本。 - -![](/img/h1.png) - -- **内置先进算法**:支持业内领先机器学习分析算法,覆盖典型时序分析任务,为时序数据库赋能原生数据分析能力。如: - - **时间序列预测(Time Series Forecasting)**:从过去时间序列中学习变化模式;从而根据给定过去时间的观测值,输出未来序列最可能的预测。 - - **时序异常检测(Anomaly Detection for Time Series)**:在给定的时间序列数据中检测和识别异常值,帮助发现时间序列中的异常行为。 - - **时间序列标注(Time Series Annotation)**:为每个数据点或特定时间段添加额外的信息或标记,例如事件发生、异常点、趋势变化等,以便更好地理解和分析数据。 - - -## 2. 基本概念 - -- **模型(Model)**:机器学习模型,以时序数据作为输入,输出分析任务的结果或决策。模型是 AINode 的基本管理单元,支持模型的增(注册)、删、查、改(微调)、用(推理)。 -- **创建(Create)**: 将外部设计或训练好的模型文件或算法加载到 AINode 中,由 IoTDB 统一管理与使用。 -- **推理(Inference)**:使用创建的模型在指定时序数据上完成该模型适用的时序分析任务。 -- **内置能力(Built-in)**:AINode 自带常见时序分析场景(例如预测与异常检测)的机器学习算法或自研模型。 - -![](/img/AINode-new.png) - -## 3. 安装部署 - -AINode 的部署可参考文档 [AINode 部署](../Deployment-and-Maintenance/AINode_Deployment_timecho.md) 章节。 - -## 4. 使用指导 - -AINode 对时序模型提供了模型创建及删除功能,内置模型无需创建,可直接使用。 - -### 4.1 注册模型 - -通过指定模型输入输出的向量维度,可以注册训练好的深度学习模型,从而用于模型推理。 - -符合以下内容的模型可以注册到AINode中: - 1. AINode 目前支持基于 PyTorch 2.4.0 版本训练的模型,需避免使用 2.4.0 版本以上的特性。 - 2. AINode 支持使用 PyTorch JIT 存储的模型(`model.pt`),模型文件需要包含模型的结构和权重。 - 3. 模型输入序列可以包含一列或多列,若有多列,需要和模型能力、模型配置文件对应。 - 4. 
模型的配置参数必须在`config.yaml`配置文件中明确定义。使用模型时,必须严格按照`config.yaml`配置文件中定义的输入输出维度。如果输入输出列数不匹配配置文件,将会导致错误。 - -下方为模型注册的SQL语法定义。 - -```SQL -create model using uri -``` - -SQL中参数的具体含义如下: - -- model_id:模型的全局唯一标识,不可重复。模型名称具备以下约束: - - - 允许出现标识符 [ 0-9 a-z A-Z _ ](字母,数字(非开头),下划线(非开头)) - - 长度限制为2-64字符 - - 大小写敏感 - -- uri:模型注册文件的资源路径,路径下应包含**模型结构及权重文件 model.pt 文件和模型配置文件 config.yaml** - - - 模型结构及权重文件:模型训练完成后得到的权重文件,目前支持 pytorch 训练得到的 .pt 文件 - - - 模型配置文件:模型注册时需要提供的与模型结构有关的参数,其中必须包含模型的输入输出维度用于模型推理: - - | **参数名** | **参数描述** | **示例** | - | ------------ | ---------------------------- | -------- | - | input_shape | 模型输入的行列,用于模型推理 | [96,2] | - | output_shape | 模型输出的行列,用于模型推理 | [48,2] | - - 除了模型推理外,还可以指定模型输入输出的数据类型: - - | **参数名** | **参数描述** | **示例** | - | ----------- | ------------------ | --------------------- | - | input_type | 模型输入的数据类型 | ['float32','float32'] | - | output_type | 模型输出的数据类型 | ['float32','float32'] | - - 除此之外,可以额外指定备注信息用于在模型管理时进行展示 - - | **参数名** | **参数描述** | **示例** | - | ---------- | ---------------------------------------------- | ------------------------------------------- | - | attributes | 可选,用户自行设定的模型备注信息,用于模型展示 | 'model_type': 'dlinear','kernel_size': '25' | - - -除了本地模型文件的注册,还可以通过URI来指定远程资源路径来进行注册,使用开源的模型仓库(例如HuggingFace)。 - -#### 示例 - -在[example 文件夹](https://github.com/apache/iotdb/tree/master/integration-test/src/test/resources/ainode-example)下,包含model.pt和config.yaml文件,model.pt为训练得到,config.yaml的内容如下: - -```YAML -configs: - # 必选项 - input_shape: [96, 2] # 表示模型接收的数据为96行x2列 - output_shape: [48, 2] # 表示模型输出的数据为48行x2列 - - # 可选项 默认为全部float32,列数为shape对应的列数 - input_type: ["int64","int64"] #输入对应的数据类型,需要与输入列数匹配 - output_type: ["text","int64"] #输出对应的数据类型,需要与输出列数匹配 - -attributes: # 可选项 为用户自定义的备注信息 - 'model_type': 'dlinear' - 'kernel_size': '25' -``` - -指定该文件夹作为加载路径就可以注册该模型 - -```SQL -IoTDB> create model dlinear_example using uri "file://./example" -``` - -SQL执行后会异步进行注册的流程,可以通过模型展示查看模型的注册状态(见模型展示章节),注册成功的耗时主要受到模型文件大小的影响。 - -模型注册完成后,就可以通过使用正常查询的方式调用具体函数,进行模型推理。 - -### 4.2 查看模型 - 
-注册成功的模型可以通过show models指令查询模型的具体信息。其SQL定义如下: - -```SQL -show models - -show models -``` - -除了直接展示所有模型的信息外,可以指定model id来查看某一具体模型的信息。模型展示的结果中包含如下信息: - -| **ModelId** | **ModelType** | **Category** | **State** | -|-------------|-----------|--------------|----------------| -| 模型ID | 模型类型 | 模型种类 | 模型状态 | - -- 模型状态机流转示意图如下 - -![](/img/AINode-State.png) - -**说明:** - -1. 启动 AINode,show models 只能看到 BUILT-IN 模型 -2. 用户可导入自己的模型,来源为 USER-DEFINED,可尝试从配置文件解析 ModelType,解析不到则为空 -3. 时序大模型权重不随 AINode 打包,AINode 启动时自动下载,下载过程中为 LOADING -4. 下载成功转变为 ACTIVE,失败则变成 INACTIVE -5. 用户启动微调,正在训练的模型状态为 TRAINING,训练成功变为 ACTIVE,失败则是 FAILED - -**示例** - -```SQL -IoTDB> show models -+---------------------+--------------------+--------------+---------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+--------------+---------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| custom| | USER-DEFINED| ACTIVE| -| timerxl| Timer-XL| BUILT-IN| LOADING| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| sundialx_1| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx_2| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx| Timer-Sundial| FINE-TUNED| ACTIVE| -| sundialx_4| Timer-Sundial| FINE-TUNED| TRAINING| -| sundialx_5| Timer-Sundial| FINE-TUNED| FAILED| -+---------------------+--------------------+--------------+---------+ -``` - -### 4.3 删除模型 - -对于注册成功的模型,用户可以通过SQL进行删除,该操作会删除所有 AINode 下的相关模型文件,其SQL如下: - -```SQL -drop model -``` - -需要指定已经成功注册的模型 model_id 来删除对应的模型。由于模型删除涉及模型数据清理,操作不会立即完成,此时模型的状态为 DROPPING,该状态的模型不能用于模型推理。请注意,该功能不支持删除内置模型。 - -### 4.4 使用内置模型推理 - -SQL语法如下: - - -```SQL -call inference(,inputSql,(=)*) - -window_function: - head(window_size) - 
tail(window_size) - count(window_size,sliding_step) -``` - -内置模型推理无需注册流程,通过call关键字,调用inference函数就可以使用模型的推理功能,其对应的参数介绍如下: - -- **model_id:** 模型名称 -- **parameterName**:参数名 -- **parameterValue**:参数值 - -请注意,使用内置时序大模型进行推理的前提条件是本地存有对应模型权重,目录为 /IOTDB_AINODE_HOME/data/ainode/models/weights/model_id/。若本地没有模型权重,则会自动从 HuggingFace 拉取,请保证本地能直接访问 HuggingFace。 - - -#### 内置模型及参数说明 - -目前已内置如下机器学习模型,具体参数说明请参考以下链接。 - -| 模型 | built_in_model_id | 任务类型 | 参数说明 | -| -------------------- | --------------------- | -------- | ------------------------------------------------------------ | -| Arima | _Arima | 预测 | [Arima参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.arima.ARIMA.html?highlight=Arima) | -| STLForecaster | _STLForecaster | 预测 | [STLForecaster参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.trend.STLForecaster.html#sktime.forecasting.trend.STLForecaster) | -| NaiveForecaster | _NaiveForecaster | 预测 | [NaiveForecaster参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.naive.NaiveForecaster.html#naiveforecaster) | -| ExponentialSmoothing | _ExponentialSmoothing | 预测 | [ExponentialSmoothing参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.exp_smoothing.ExponentialSmoothing.html) | -| GaussianHMM | _GaussianHMM | 标注 | [GaussianHMM参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.detection.hmm_learn.gaussian.GaussianHMM.html) | -| GMMHMM | _GMMHMM | 标注 | [GMMHMM参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.detection.hmm_learn.gmm.GMMHMM.html) | -| Stray | _Stray | 异常检测 | [Stray参数说明](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.detection.stray.STRAY.html) | - - -在完成模型的注册后,通过call关键字,调用inference函数就可以使用模型的推理功能,其对应的参数介绍如下: - -- **model_id**: 对应一个已经注册的模型 -- **sql**:sql查询语句,查询的结果作为模型的输入进行模型推理。查询的结果中行列的维度需要与具体模型config中指定的大小相匹配。(这里的sql不建议使用`SELECT 
*`子句,因为在IoTDB中,`*`并不会对列进行排序,因此列的顺序是未定义的,可以使用`SELECT s0,s1`的方式确保列的顺序符合模型输入的预期) -- **window_function**: 推理过程中可以使用的窗口函数,目前提供三种类型的窗口函数用于辅助模型推理: - - **head(window_size)**: 获取数据中最前的window_size个点用于模型推理,该窗口可用于数据裁剪 - ![](/img/AINode-call1.png) - - - **tail(window_size)**:获取数据中最后的window_size个点用于模型推,该窗口可用于数据裁剪 - ![](/img/AINode-call2.png) - - - **count(window_size, sliding_step)**:基于点数的滑动窗口,每个窗口的数据会分别通过模型进行推理,如下图示例所示,window_size为2的窗口函数将输入数据集分为三个窗口,每个窗口分别进行推理运算生成结果。该窗口可用于连续推理 - ![](/img/AINode-call3.png) - -**说明1: window可以用来解决sql查询结果和模型的输入行数要求不一致时的问题,对行进行裁剪。需要注意的是,当列数不匹配或是行数直接少于模型需求时,推理无法进行,会返回错误信息。** - -**说明2: 在深度学习应用中,经常将时间戳衍生特征(数据中的时间列)作为生成式任务的协变量,一同输入到模型中以提升模型的效果,但是在模型的输出结果中一般不包含时间列。为了保证实现的通用性,模型推理结果只对应模型的真实输出,如果模型不输出时间列,则结果中不会包含。** - - -#### 示例 - -下面是使用深度学习模型推理的一个操作示例,针对上面提到的输入为`[96,2]`,输出为`[48,2]`的`dlinear`预测模型,我们通过SQL使用其进行推理。 - -```Shell -IoTDB> select s1,s2 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... 
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**", generateTime=True) -+-----------------------------+--------------------------------------------+-----------------------------+ -| Time| _result_0| _result_1| -+-----------------------------+--------------------------------------------+-----------------------------+ -|1990-04-06T00:00:00.000+08:00| 0.726302981376648| 1.6549958229064941| -|1990-04-08T00:00:00.000+08:00| 0.7354921698570251| 1.6482787370681763| -|1990-04-10T00:00:00.000+08:00| 0.7238251566886902| 1.6278168201446533| -...... -|1990-07-07T00:00:00.000+08:00| 0.7692174911499023| 1.654654049873352| -|1990-07-09T00:00:00.000+08:00| 0.7685555815696716| 1.6625318765640259| -|1990-07-11T00:00:00.000+08:00| 0.7856493592262268| 1.6508299350738525| -+-----------------------------+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### 使用tail/head窗口函数的示例 - -当数据量不定且想要取96行最新数据用于推理时,可以使用对应的窗口函数tail。head函数的用法与其类似,不同点在于其取的是最早的96个点。 - -```Shell -IoTDB> select s1,s2 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1988-01-01T00:00:00.000+08:00| 0.7355| 1.211| -...... 
-|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... -|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 996 - -IoTDB> call inference(dlinear_example,"select s0,s1 from root.**", generateTime=True, window=tail(96)) -+-----------------------------+--------------------------------------------+-----------------------------+ -| Time| _result_0| _result_1| -+-----------------------------+--------------------------------------------+-----------------------------+ -|1990-04-06T00:00:00.000+08:00| 0.726302981376648| 1.6549958229064941| -|1990-04-08T00:00:00.000+08:00| 0.7354921698570251| 1.6482787370681763| -|1990-04-10T00:00:00.000+08:00| 0.7238251566886902| 1.6278168201446533| -...... 
-|1990-07-07T00:00:00.000+08:00| 0.7692174911499023| 1.654654049873352| -|1990-07-09T00:00:00.000+08:00| 0.7685555815696716| 1.6625318765640259| -|1990-07-11T00:00:00.000+08:00| 0.7856493592262268| 1.6508299350738525| -+-----------------------------+--------------------------------------------+-----------------------------+ -Total line number = 48 -``` - -#### 使用count窗口函数的示例 - -该窗口主要用于计算式任务,当任务对应的模型一次只能处理固定行数据而最终想要的确实多组预测结果时,使用该窗口函数可以使用点数滑动窗口进行连续推理。假设我们现在有一个异常检测模型anomaly_example(input: [24,2], output[1,1]),对每24行数据会生成一个0/1的标签,其使用示例如下: - -```Shell -IoTDB> select s1,s2 from root.** -+-----------------------------+-------------------+-------------------+ -| Time| root.eg.etth.s0| root.eg.etth.s1| -+-----------------------------+-------------------+-------------------+ -|1990-01-01T00:00:00.000+08:00| 0.7855| 1.611| -|1990-01-02T00:00:00.000+08:00| 0.7818| 1.61| -|1990-01-03T00:00:00.000+08:00| 0.7867| 1.6293| -|1990-01-04T00:00:00.000+08:00| 0.786| 1.637| -|1990-01-05T00:00:00.000+08:00| 0.7849| 1.653| -|1990-01-06T00:00:00.000+08:00| 0.7866| 1.6537| -|1990-01-07T00:00:00.000+08:00| 0.7886| 1.662| -...... 
-|1990-03-31T00:00:00.000+08:00| 0.7585| 1.678| -|1990-04-01T00:00:00.000+08:00| 0.7587| 1.6763| -|1990-04-02T00:00:00.000+08:00| 0.76| 1.6813| -|1990-04-03T00:00:00.000+08:00| 0.7669| 1.684| -|1990-04-04T00:00:00.000+08:00| 0.7645| 1.677| -|1990-04-05T00:00:00.000+08:00| 0.7625| 1.68| -|1990-04-06T00:00:00.000+08:00| 0.7617| 1.6917| -+-----------------------------+-------------------+-------------------+ -Total line number = 96 - -IoTDB> call inference(anomaly_example,"select s0,s1 from root.**", generateTime=True, window=count(24,24)) -+-----------------------------+-------------------------+ -| Time| _result_0| -+-----------------------------+-------------------------+ -|1990-04-06T00:00:00.000+08:00| 0| -|1990-04-30T00:00:00.000+08:00| 1| -|1990-05-24T00:00:00.000+08:00| 1| -|1990-06-17T00:00:00.000+08:00| 0| -+-----------------------------+-------------------------+ -Total line number = 4 -``` - -其中结果集中每行的标签对应每24行数据为一组,输入该异常检测模型后的输出。 - -### 4.5 使用内置模型微调 -> 仅 Timer-XL、Timer-Sundial 可以进行微调操作。 - -SQL语法如下: - - -```SQL -create model (with hyperparameters -(=(, =)*))? -from model -on dataset (PATH ([timeRange])?) -``` - -#### 示例 - -1. 选择测点 root.db.etth.ot 中前 80% 的数据作为微调数据集,基于 sundial 创建模型 sundialv2. - -```SQL -IoTDB> CREATE MODEL sundialv2 FROM MODEL sundial ON DATASET (PATH root.db.etth.OT([1467302400000, 1517468400001))) -Msg: The statement is executed successfully. 
-IoTDB> show models -+---------------------+--------------------+----------+--------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+--------+ -| arima| Arima| BUILT-IN| ACTIVE| -| holtwinters| HoltWinters| BUILT-IN| ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN| ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN| ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN| ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN| ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN| ACTIVE| -| stray| Stray| BUILT-IN| ACTIVE| -| sundial| Timer-Sundial| BUILT-IN| ACTIVE| -| timer_xl| Timer-XL| BUILT-IN| ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED|TRAINING| -+---------------------+--------------------+----------+--------+ -``` - -2. 微调任务后台异步启动,可在 AINode 进程看到 log;微调完成后,查询并使用新的模型 - -```SQL -IoTDB> show models -+---------------------+--------------------+----------+------+ -| ModelId| ModelType| Category| State| -+---------------------+--------------------+----------+------+ -| arima| Arima| BUILT-IN|ACTIVE| -| holtwinters| HoltWinters| BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing| BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster| BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster| BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm| BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm| BUILT-IN|ACTIVE| -| stray| Stray| BUILT-IN|ACTIVE| -| sundial| Timer-Sundial| BUILT-IN|ACTIVE| -| timer_xl| Timer-XL| BUILT-IN|ACTIVE| -| sundialv2| Timer-Sundial|FINE-TUNED|ACTIVE| -+---------------------+--------------------+----------+------+ -``` - -### 4.6 时序大模型导入步骤 - -AINode 目前支持多种时序大模型,部署使用请参考[时序大模型](../AI-capability/TimeSeries-Large-Model.md) - -## 5. 
权限管理 - -使用AINode相关的功能时,可以使用IoTDB本身的鉴权去做一个权限管理,用户只有在具备 USE_MODEL 权限时,才可以使用模型管理的相关功能。当使用推理功能时,用户需要有访问输入模型的SQL对应的源序列的权限。 - -| 权限名称 | 权限范围 | 管理员用户(默认ROOT) | 普通用户 | 路径相关 | -| --------- | --------------------------------- | ---------------------- | -------- | -------- | -| USE_MODEL | create model / show models / drop model | √ | √ | x | -| READ_DATA | call inference | √ | √ | √ | - -## 6. 实际案例 - -### 6.1 电力负载预测 - -在部分工业场景下,会存在预测电力负载的需求,预测结果可用于优化电力供应、节约能源和资源、支持规划和扩展以及增强电力系统的可靠性。 - -我们所使用的 ETTh1 的测试集的数据为[ETTh1](/img/ETTh1.csv)。 - - -包含间隔1h采集一次的电力数据,每条数据由负载和油温构成,分别为:High UseFul Load, High UseLess Load, Middle UseLess Load, Low UseFul Load, Low UseLess Load, Oil Temperature。 - -在该数据集上,IoTDB-ML的模型推理功能可以通过以往高中低三种负载的数值和对应时间戳油温的关系,预测未来一段时间内的油温,赋能电网变压器的自动调控和监视。 - -#### 步骤一:数据导入 - -用户可以使用tools文件夹中的`import-data.sh` 向 IoTDB 中导入 ETT 数据集 - -```Bash -bash ./import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw root -s /path/ETTh1.csv -``` - -#### 步骤二:模型导入 - -我们可以在iotdb-cli 中输入以下SQL从 huggingface 上拉取一个已经训练好的模型进行注册,用于后续的推理。 - -```SQL -create model dlinear using uri 'https://huggingface.co/hvlgo/dlinear/tree/main' -``` - -该模型基于较为轻量化的深度模型DLinear训练而得,能够以相对快的推理速度尽可能多地捕捉到序列内部的变化趋势和变量间的数据变化关系,相较于其他更深的模型更适用于快速实时预测。 - -#### 步骤三:模型推理 - -```Shell -IoTDB> select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth LIMIT 96 -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -| Time|root.eg.etth.s0|root.eg.etth.s1|root.eg.etth.s2|root.eg.etth.s3|root.eg.etth.s4|root.eg.etth.s5|root.eg.etth.s6| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -|2017-10-20T00:00:00.000+08:00| 10.449| 3.885| 8.706| 2.025| 2.041| 0.944| 8.864| -|2017-10-20T01:00:00.000+08:00| 11.119| 3.952| 8.813| 2.31| 2.071| 1.005| 8.442| -|2017-10-20T02:00:00.000+08:00| 9.511| 2.88| 7.533| 1.564| 1.949| 0.883| 8.16| -|2017-10-20T03:00:00.000+08:00| 9.645| 
2.21| 7.249| 1.066| 1.828| 0.914| 7.949| -...... -|2017-10-23T20:00:00.000+08:00| 8.105| 0.938| 4.371| -0.569| 3.533| 1.279| 9.708| -|2017-10-23T21:00:00.000+08:00| 7.167| 1.206| 4.087| -0.462| 3.107| 1.432| 8.723| -|2017-10-23T22:00:00.000+08:00| 7.1| 1.34| 4.015| -0.32| 2.772| 1.31| 8.864| -|2017-10-23T23:00:00.000+08:00| 9.176| 2.746| 7.107| 1.635| 2.65| 1.097| 9.004| -+-----------------------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+ -Total line number = 96 - -IoTDB> call inference(dlinear_example, "select s0,s1,s2,s3,s4,s5,s6 from root.eg.etth", generateTime=True, window=head(96)) -+-----------------------------+-----------+----------+----------+------------+---------+----------+----------+ -| Time| output0| output1| output2| output3| output4| output5| output6| -+-----------------------------+-----------+----------+----------+------------+---------+----------+----------+ -|2017-10-23T23:00:00.000+08:00| 10.319546| 3.1450553| 7.877341| 1.5723765|2.7303758| 1.1362307| 8.867775| -|2017-10-24T01:00:00.000+08:00| 10.443649| 3.3286757| 7.8593454| 1.7675098| 2.560634| 1.1177158| 8.920919| -|2017-10-24T03:00:00.000+08:00| 10.883752| 3.2341104| 8.47036| 1.6116762|2.4874182| 1.1760603| 8.798939| -...... 
-|2017-10-26T19:00:00.000+08:00| 8.0115595| 1.2995274| 6.9900327|-0.098746896| 3.04923| 1.176214| 9.548782| -|2017-10-26T21:00:00.000+08:00| 8.612427| 2.5036244| 5.6790237| 0.66474205|2.8870275| 1.2051733| 9.330128| -|2017-10-26T22:00:00.000+08:00| 10.096699| 3.399722| 6.9909| 1.7478468|2.7642853| 1.1119363| 9.541455| -+-----------------------------+-----------+----------+----------+------------+---------+----------+----------+ -Total line number = 48 -``` - -我们将对油温的预测的结果和真实结果进行对比,可以得到以下的图像。 - -图中10/24 00:00之前的数据为输入模型的过去数据,10/24 00:00后的蓝色线条为模型给出的油温预测结果,而红色为数据集中实际的油温数据(用于进行对比)。 - -![](/img/AINode-analysis1.png) - -可以看到,我们使用了过去96个小时(4天)的六个负载信息和对应时间油温的关系,基于之前学习到的序列间相互关系对未来48个小时(2天)的油温这一数据的可能变化进行了建模,可以看到可视化后预测曲线与实际结果在趋势上保持了较高程度的一致性。 - -### 6.2 功率预测 - -变电站需要对电流、电压、功率等数据进行电力监控,用于检测潜在的电网问题、识别电力系统中的故障、有效管理电网负载以及分析电力系统的性能和趋势等。 - -我们利用某变电站中的电流、电压和功率等数据构成了真实场景下的数据集。该数据集包括变电站近四个月时间跨度,每5 - 6s 采集一次的 A相电压、B相电压、C相电压等数据。 - -测试集数据内容为[data](/img/data.csv)。 - -在该数据集上,IoTDB-ML的模型推理功能可以通过以往A相电压,B相电压和C相电压的数值和对应时间戳,预测未来一段时间内的C相电压,赋能变电站的监视管理。 - -#### 步骤一:数据导入 - -用户可以使用tools文件夹中的`import-data.sh` 导入数据集 - -```Bash -bash ./import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw root -s /path/data.csv -``` - -#### 步骤二:模型导入 - -我们可以在iotdb-cli 中选择内置模型或已经注册好的模型用于后续的推理。 - -我们采用内置模型STLForecaster进行预测,STLForecaster 是一个基于 statsmodels 库中 STL 实现的时间序列预测方法。 - -#### 步骤三:模型推理 - -```Shell -IoTDB> select * from root.eg.voltage limit 96 -+-----------------------------+------------------+------------------+------------------+ -| Time|root.eg.voltage.s0|root.eg.voltage.s1|root.eg.voltage.s2| -+-----------------------------+------------------+------------------+------------------+ -|2023-02-14T20:38:32.000+08:00| 2038.0| 2028.0| 2041.0| -|2023-02-14T20:38:38.000+08:00| 2014.0| 2005.0| 2018.0| -|2023-02-14T20:38:44.000+08:00| 2014.0| 2005.0| 2018.0| -...... 
-|2023-02-14T20:47:52.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:47:57.000+08:00| 2024.0| 2016.0| 2027.0| -|2023-02-14T20:48:03.000+08:00| 2024.0| 2016.0| 2027.0| -+-----------------------------+------------------+------------------+------------------+ -Total line number = 96 - -IoTDB> call inference(_STLForecaster, "select s0,s1,s2 from root.eg.voltage", generateTime=True, window=head(96),predict_length=48) -+-----------------------------+---------+---------+---------+ -| Time| output0| output1| output2| -+-----------------------------+---------+---------+---------+ -|2023-02-14T20:48:03.000+08:00|2026.3601|2018.2953|2029.4257| -|2023-02-14T20:48:09.000+08:00|2019.1538|2011.4361|2022.0888| -|2023-02-14T20:48:15.000+08:00|2025.5074|2017.4522|2028.5199| -...... - -|2023-02-14T20:52:15.000+08:00|2022.2336|2015.0290|2025.1023| -|2023-02-14T20:52:21.000+08:00|2015.7241|2008.8975|2018.5085| -|2023-02-14T20:52:27.000+08:00|2022.0777|2014.9136|2024.9396| -|2023-02-14T20:52:33.000+08:00|2015.5682|2008.7821|2018.3458| -+-----------------------------+---------+---------+---------+ -Total line number = 48 -``` -我们将对C相电压的预测的结果和真实结果进行对比,可以得到以下的图像。 - -图中 02/14 20:48 之前的数据为输入模型的过去数据, 02/14 20:48 后的蓝色线条为模型给出的C相电压预测结果,而红色为数据集中实际的C相电压数据(用于进行对比)。 - -![](/img/AINode-analysis2.png) - -可以看到,我们使用了过去10分钟的电压的数据,基于之前学习到的序列间相互关系对未来5分钟的C相电压这一数据的可能变化进行了建模,可以看到可视化后预测曲线与实际结果在趋势上保持了一定的同步性。 - -### 6.3 异常检测 - -在民航交通运输业,存在着对乘机旅客数量进行异常检测的需求。异常检测的结果可用于指导调整航班的调度,以使得企业获得更大效益。 - -Airline Passengers一个时间序列数据集,该数据集记录了1949年至1960年期间国际航空乘客数量,间隔一个月进行一次采样。该数据集共含一条时间序列。数据集为[airline](/img/airline.csv)。 -在该数据集上,IoTDB-ML的模型推理功能可以通过捕捉序列的变化规律以对序列时间点进行异常检测,赋能交通运输业。 - -#### 步骤一:数据导入 - -用户可以使用tools文件夹中的`import-data.sh` 导入数据集 - -```Bash -bash ./import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw root -s /path/data.csv -``` - -#### 步骤二:模型推理 - -IoTDB内置有部分可以直接使用的机器学习算法,使用其中的异常检测算法进行预测的样例如下: - -```Shell -IoTDB> select * from root.eg.airline -+-----------------------------+------------------+ -| 
Time|root.eg.airline.s0| -+-----------------------------+------------------+ -|1949-01-31T00:00:00.000+08:00| 224.0| -|1949-02-28T00:00:00.000+08:00| 118.0| -|1949-03-31T00:00:00.000+08:00| 132.0| -|1949-04-30T00:00:00.000+08:00| 129.0| -...... -|1960-09-30T00:00:00.000+08:00| 508.0| -|1960-10-31T00:00:00.000+08:00| 461.0| -|1960-11-30T00:00:00.000+08:00| 390.0| -|1960-12-31T00:00:00.000+08:00| 432.0| -+-----------------------------+------------------+ -Total line number = 144 - -IoTDB> call inference(_Stray, "select s0 from root.eg.airline", generateTime=True, k=2) -+-----------------------------+-------+ -| Time|output0| -+-----------------------------+-------+ -|1960-12-31T00:00:00.000+08:00| 0| -|1961-01-31T08:00:00.000+08:00| 0| -|1961-02-28T08:00:00.000+08:00| 0| -|1961-03-31T08:00:00.000+08:00| 0| -...... -|1972-06-30T08:00:00.000+08:00| 1| -|1972-07-31T08:00:00.000+08:00| 1| -|1972-08-31T08:00:00.000+08:00| 0| -|1972-09-30T08:00:00.000+08:00| 0| -|1972-10-31T08:00:00.000+08:00| 0| -|1972-11-30T08:00:00.000+08:00| 0| -+-----------------------------+-------+ -Total line number = 144 -``` - -我们将检测为异常的结果进行绘制,可以得到以下图像。其中蓝色曲线为原时间序列,用红色点特殊标注的时间点为算法检测为异常的时间点。 - -![](/img/s6.png) - -可以看到,Stray模型对输入序列变化进行了建模,成功检测出出现异常的时间点。 diff --git a/src/zh/UserGuide/latest/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md b/src/zh/UserGuide/latest/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md deleted file mode 100644 index 953b3ed9a..000000000 --- a/src/zh/UserGuide/latest/AI-capability/TimeSeries-Large-Model_Upgrade_timecho.md +++ /dev/null @@ -1,157 +0,0 @@ - -# 时序大模型 - -## 1. 简介 - -时序大模型是专为时序数据分析设计的基础模型。IoTDB 团队长期自研时序基础模型 Timer,该模型基于 Transformer 架构,经海量多领域时序数据预训练,可支撑时序预测、异常检测、时序填补等下游任务;团队打造的 AINode 平台同时支持集成业界前沿时序基础模型,为用户提供多元选型。不同于传统时序分析技术,这类大模型具备通用特征提取能力,可通过零样本分析、微调等技术服务广泛的分析任务。 - -本文相关时序大模型领域的技术成果(含团队自研及业界前沿方向)均发表于国际机器学习顶级会议,具体内容见附录。 - -## 2. 
应用场景 - -* **时序预测**:为工业生产、自然环境等领域提供时间序列数据的预测服务,帮助用户提前了解未来变化趋势。 -* **数据填补**:针对时间序列中的缺失序列段,进行上下文填补,以增强数据集的连续性和完整性。 -* **异常检测**:利用自回归分析技术,对时间序列数据进行实时监测,及时预警潜在的异常情况。 - -![](/img/LargeModel09.png) - -## 3. Timer-1 模型 - -Timer[1] 模型(非内置模型)不仅展现了出色的少样本泛化和多任务适配能力,还通过预训练获得了丰富的知识库,赋予了它处理多样化下游任务的通用能力,拥有以下特点: - -* **泛化性**:模型能够通过使用少量样本进行微调,达到行业内领先的深度模型预测效果。 -* **通用性**:模型设计灵活,能够适配多种不同的任务需求,并且支持变化的输入和输出长度,使其在各种应用场景中都能发挥作用。 -* **可扩展性**:随着模型参数数量的增加或预训练数据规模的扩大,模型效果会持续提升,确保模型能够随着时间和数据量的增长而不断优化其预测效果。 - -![](/img/model01.png) - -## 4. Timer-XL 模型 - -Timer-XL[2]基于 Timer 进一步扩展升级了网络结构,在多个维度全面突破: - -* **超长上下文支持**:该模型突破了传统时序预测模型的限制,支持处理数千个 Token(相当于数万个时间点)的输入,有效解决了上下文长度瓶颈问题。 -* **多变量预测场景覆盖**:支持多种预测场景,包括非平稳时间序列的预测、涉及多个变量的预测任务以及包含协变量的预测,满足多样化的业务需求。 -* **大规模工业时序数据集:**采用万亿大规模工业物联网领域的时序数据集进行预训练,数据集兼有庞大的体量、卓越的质量和丰富的领域等重要特质,覆盖能源、航空航天、钢铁、交通等多领域。 - -![](/img/model02.png) - -## 5. Timer-Sundial 模型 - -Timer-Sundial[3]是一个专注于时间序列预测的生成式基础模型系列,其基础版本拥有 1.28 亿参数,并在 1 万亿个时间点上进行了大规模预训练,其核心特性包括: - -* **强大的泛化性能:**具备零样本预测能力,可同时支持点预测和概率预测。 -* **灵活预测分布分析:**不仅能预测均值或分位数,还可通过模型生成的原始样本评估预测分布的任意统计特性。 -* **创新生成架构:** 采用 “Transformer + TimeFlow” 协同架构——Transformer 学习时间片段的自回归表征,TimeFlow 模块基于流匹配框架 (Flow-Matching) 将随机噪声转化为多样化预测轨迹,实现高效的非确定性样本生成。 - -![](/img/model03.png) - -## 6. Chronos-2 模型 - -Chronos-2 [4]是由 Amazon Web Services (AWS) 研究团队开发的,基于 Chronos 离散词元建模范式发展起来的通用时间序列基础模型,该模型同时适用于零样本单变量预测和协变量预测。其主要特性包括: - -* **概率性预测能力**:模型以生成式方式输出多步预测结果,支持分位数或分布级预测,从而刻画未来不确定性。 -* **零样本通用预测**:依托预训练获得的上下文学习能力,可直接对未见过的数据集执行预测,无需重新训练或参数更新。 -* **多变量与协变量统一建模**:支持在同一架构下联合建模多条相关时间序列及其协变量,以提升复杂任务的预测效果。但对输入有严格要求: - * 未来协变量的名称组成的集合必须是历史协变量的名称组成的集合的子集; - * 每个历史协变量的长度必须等于目标变量的长度; - * 每个未来协变量的长度必须等于预测长度; -* **高效推理与部署**:模型采用紧凑的编码器式(encoder-only)结构,在保持强泛化能力的同时兼顾推理效率。 - -![](/img/timeseries-large-model-chronos2.png) - -## 7. 
效果展示 - -时序大模型能够适应多种不同领域和场景的真实时序数据,在各种任务上拥有优异的处理效果,以下是在不同数据上的真实表现: - -**时序预测:** - -利用时序大模型的预测能力,能够准确预测时间序列的未来变化趋势,如下图蓝色曲线代表预测趋势,红色曲线为实际趋势,两曲线高度吻合。 - -![](/img/LargeModel03.png) - -**数据填补**: - -利用时序大模型对缺失数据段进行预测式填补。 - -![](/img/timeseries-large-model-data-imputation.png) - -**异常检测**: - -利用时序大模型精准识别与正常趋势偏离过大的异常值。 - -![](/img/LargeModel05.png) - -## 8. 部署使用 - -1. 打开 IoTDB cli 控制台,检查 ConfigNode、DataNode、AINode 节点确保均为 Running。 - -```Plain -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| -+------+----------+-------+---------------+------------+--------------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| 2.0.5.1| 069354f| -| 1| DataNode|Running| 127.0.0.1| 10730| 2.0.5.1| 069354f| -| 2| AINode|Running| 127.0.0.1| 10810| 2.0.5.1|069354f-dev| -+------+----------+-------+---------------+------------+--------------+-----------+ -Total line number = 3 -It costs 0.140s -``` - -2. 联网环境下首次启动 AINode 节点会自动拉取 Timer-XL、Sundial、Chronos2 模型。 - - > 注意: - > - > * AINode 安装包不包含模型权重文件 - > * 自动拉取功能依赖部署环境具备 HuggingFace 网络访问能力 - > * AINode 支持手动上传模型权重文件,具体操作方法可参考[导入权重文件](../Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md#_3-3-导入内置权重文件) - -3. 
检查模型是否可用。 - -```Bash -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### 附录 - -**[1]** Timer- Generative Pre-trained Transformers Are Large Time Series Models, Yong Liu, Haoran Zhang, Chenyu Li, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ 返回](#ref1) - -**[2]** TIMER-XL- LONG-CONTEXT TRANSFORMERS FOR UNIFIED TIME SERIES FORECASTING ,Yong Liu, Guo Qin, Xiangdong Huang, Jianmin Wang, Mingsheng Long. [↩ 返回](#ref2) - -**[3]** Sundial- A Family of Highly Capable Time Series Foundation Models, Yong Liu, Guo Qin, Zhiyuan Shi, Zhi Chen, Caiyin Yang, Xiangdong Huang, Jianmin Wang, Mingsheng Long, **ICML 2025 spotlight**. [↩ 返回](#ref3) - -**[4] **Chronos-2: From Univariate to Universal Forecasting, Abdul Fatir Ansari, Oleksandr Shchur, Jaris Küken, Andreas Auer, Boran Han, Pedro Mercado, Syama Sundar Rangapuram, Huibin Shen, Lorenzo Stella, Xiyuan Zhang, Mononito Goswami, Shubham Kapoor, Danielle C. 
Maddix, Pablo Guerron, Tony Hu, Junming Yin, Nick Erickson, Prateek Mutalik Desai, Hao Wang, Huzefa Rangwala, George Karypis, Yuyang Wang, Michael Bohlke-Schneider, **arXiv:2510.15821.** [↩ Back](#ref4) diff --git a/src/zh/UserGuide/latest/API/Programming-Data-Subscription_timecho.md b/src/zh/UserGuide/latest/API/Programming-Data-Subscription_timecho.md deleted file mode 100644 index 183fdcc80..000000000 --- a/src/zh/UserGuide/latest/API/Programming-Data-Subscription_timecho.md +++ /dev/null @@ -1,268 +0,0 @@ - - - - -# Data Subscription API - -IoTDB provides a powerful data subscription feature that lets users obtain newly written IoTDB data in real time through the subscription API. For the detailed feature definition and introduction, see [Data Subscription](../User-Manual/Data-subscription_timecho) - -## 1. Core Steps - -1. Create a topic: create a topic that contains the measurements you want to subscribe to. -2. Subscribe to the topic: the topic must already exist before a consumer subscribes to it; otherwise the subscription fails. Consumers in the same consumer group share the data evenly. -3. Consume data: a consumer receives data for a topic only if it has explicitly subscribed to that topic. -4. Unsubscribe: when a consumer is closed, it leaves its consumer group and all of its existing subscriptions are cancelled. - - -## 2. Detailed Steps - -This section describes the core development workflow and does not demonstrate every parameter and interface. For the full set of features and parameters, see [Full Interface Description](./Programming-Java-Native-API_timecho#_3-全量接口说明) - - -### 2.1 Create a Maven Project - -Create a Maven project and import the following dependency (JDK >= 1.8, Maven >= 3.6) - -```xml -<dependencies> - <dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-session</artifactId> - - <version>${project.version}</version> - </dependency> -</dependencies> -``` -Note: do not use a newer client version to connect to an older server version. - -### 2.2 Code Examples - -#### 2.2.1 Topic Operations - -```java -import java.util.Optional; -import java.util.Properties; -import java.util.Set; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.rpc.subscription.config.TopicConstant; -import org.apache.iotdb.session.subscription.SubscriptionSession; -import org.apache.iotdb.session.subscription.model.Topic; - -public class DataConsumerExample { - - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - try (SubscriptionSession session = new SubscriptionSession("127.0.0.1", 6667, "root", "TimechoDB@2021", 67108864)) { // the default password was root before V2.0.6.x - // 1. open session - session.open(); - - // 2. 
create a topic of all data - Properties sessionConfig = new Properties(); - sessionConfig.put(TopicConstant.PATH_KEY, "root.**"); - - session.createTopic("allData", sessionConfig); - - // 3. show all topics - Set<Topic> topics = session.getTopics(); - System.out.println(topics); - - // 4. show a specific topic - Optional<Topic> allData = session.getTopic("allData"); - System.out.println(allData.get()); - } - } -} -``` - -#### 2.2.2 Data Consumption - -##### Scenario 1: Subscribe to newly written real-time data in IoTDB (dashboard or SCADA display scenarios) - -```java -import java.io.IOException; -import java.util.List; -import java.util.Properties; -import org.apache.iotdb.rpc.subscription.config.ConsumerConstant; -import org.apache.iotdb.rpc.subscription.config.TopicConstant; -import org.apache.iotdb.session.subscription.consumer.SubscriptionPullConsumer; -import org.apache.iotdb.session.subscription.payload.SubscriptionMessage; -import org.apache.iotdb.session.subscription.payload.SubscriptionMessageType; -import org.apache.iotdb.session.subscription.payload.SubscriptionSessionDataSet; -import org.apache.tsfile.read.common.RowRecord; - -public class DataConsumerExample { - - public static void main(String[] args) throws IOException { - - // 5. 
create a pull consumer, the subscription is automatically cancelled when the logic in the try resources is completed - Properties consumerConfig = new Properties(); - consumerConfig.put(ConsumerConstant.CONSUMER_ID_KEY, "c1"); - consumerConfig.put(ConsumerConstant.CONSUMER_GROUP_ID_KEY, "cg1"); - consumerConfig.put(ConsumerConstant.USERNAME_KEY, "root"); - consumerConfig.put(ConsumerConstant.PASSWORD_KEY, "TimechoDB@2021"); // the default password was root before V2.0.6.x - try (SubscriptionPullConsumer pullConsumer = new SubscriptionPullConsumer(consumerConfig)) { - pullConsumer.open(); - pullConsumer.subscribe("topic_all"); - while (true) { - List<SubscriptionMessage> messages = pullConsumer.poll(10000); - for (final SubscriptionMessage message : messages) { - final short messageType = message.getMessageType(); - if (SubscriptionMessageType.isValidatedMessageType(messageType)) { - for (final SubscriptionSessionDataSet dataSet : message.getSessionDataSetsHandler()) { - while (dataSet.hasNext()) { - final RowRecord record = dataSet.next(); - System.out.println(record); - } - } - } - } - } - } - } -} - - -``` - -##### Scenario 2: Subscribe to newly generated TsFiles (periodic data backup scenarios) - -Prerequisite: the topic to be consumed must use the TsFileHandler format, for example: `create topic topic_all_tsfile with ('path'='root.**','format'='TsFileHandler')` - -```java -import java.io.IOException; -import java.util.List; -import java.util.Properties; -import org.apache.iotdb.rpc.subscription.config.ConsumerConstant; -import org.apache.iotdb.rpc.subscription.config.TopicConstant; -import org.apache.iotdb.session.subscription.consumer.SubscriptionPullConsumer; -import org.apache.iotdb.session.subscription.payload.SubscriptionMessage; - - -public class DataConsumerExample { - - public static void main(String[] args) throws IOException { - // 1. 
create a pull consumer, the subscription is automatically cancelled when the logic in the try resources is completed - Properties consumerConfig = new Properties(); - consumerConfig.put(ConsumerConstant.CONSUMER_ID_KEY, "c1"); - consumerConfig.put(ConsumerConstant.CONSUMER_GROUP_ID_KEY, "cg1"); - consumerConfig.put(ConsumerConstant.USERNAME_KEY, "root"); - consumerConfig.put(ConsumerConstant.PASSWORD_KEY, "TimechoDB@2021"); // the default password was root before V2.0.6.x - consumerConfig.put(ConsumerConstant.FILE_SAVE_DIR_KEY, "/Users/iotdb/Downloads"); - try (SubscriptionPullConsumer pullConsumer = new SubscriptionPullConsumer(consumerConfig)) { - pullConsumer.open(); - pullConsumer.subscribe("topic_all_tsfile"); - while (true) { - List<SubscriptionMessage> messages = pullConsumer.poll(10000); - for (final SubscriptionMessage message : messages) { - message.getTsFileHandler().copyFile("/Users/iotdb/Downloads/1.tsfile"); - } - } - } - } -} -``` - - - - -## 3. Full Interface Description - -### 3.1 Parameter List - -Consumer parameters can be set through a Properties object. The parameters are as follows. - -#### 3.1.1 SubscriptionConsumer - - -| Parameter | Required (default) | Description | -| :---------------------- |:-------------------------------------------------------------------------------------| :----------------------------------------------------------- | -| host | optional: 127.0.0.1 | `String`: RPC host of a DataNode in IoTDB | -| port | optional: 6667 | `Integer`: RPC port of a DataNode in IoTDB | -| node-urls | optional: 127.0.0.1:6667 | `List<String>`: RPC addresses of all DataNodes in IoTDB; multiple addresses are allowed. Specify either host:port or node-urls. If both are specified, the **union** of host:port and node-urls is used as the effective node-urls | -| username | optional: root | `String`: username for the IoTDB DataNode | -| password | optional: TimechoDB@2021 (root before V2.0.6.x) | `String`: password for the IoTDB DataNode | -| groupId | optional | `String`: consumer group id; if not specified, a random id is assigned (a new consumer group), and different consumer groups are guaranteed to have different ids | -| consumerId | optional | `String`: consumer client id; if not specified, a random id is assigned, and every consumer client id within the same consumer group is guaranteed to be unique | -| 
heartbeatIntervalMs | optional: 30000 (min: 1000) | `Long`: interval at which the consumer sends heartbeat requests to the IoTDB DataNode | -| endpointsSyncIntervalMs | optional: 120000 (min: 5000) | `Long`: interval at which the consumer checks for IoTDB cluster scale-out or scale-in and adjusts its subscription connections | -| fileSaveDir | optional: Paths.get(System.getProperty("user.dir"), "iotdb-subscription").toString() | `String`: directory where TsFiles obtained through the subscription are temporarily stored | -| fileSaveFsync | optional: false | `Boolean`: whether the consumer actively calls fsync while subscribing to TsFiles | - -Special configuration for `SubscriptionPushConsumer`: - -| Parameter | Required (default) | Description | -| :----------------- | :------------------------------------ | :----------------------------------------------------------- | -| ackStrategy | optional: `ACKStrategy.AFTER_CONSUME` | Acknowledgement mechanism for consumption progress, with the following options: `ACKStrategy.BEFORE_CONSUME` (commit the progress as soon as the consumer receives the data, before `onReceive`); `ACKStrategy.AFTER_CONSUME` (commit the progress after the consumer has finished consuming the data, after `onReceive`) | -| consumeListener | optional | Callback for consuming data; it must implement the `ConsumeListener` interface and define the handling logic for data delivered as `SessionDataSetsHandler` and `TsFileHandler` | -| autoPollIntervalMs | optional: 5000 (min: 500) | Long: interval at which the consumer automatically polls data, in **milliseconds** | -| autoPollTimeoutMs | optional: 10000 (min: 1000) | Long: timeout of each poll by the consumer, in **milliseconds** | - -Special configuration for `SubscriptionPullConsumer`: - -| Parameter | Required (default) | Description | -| :----------------- | :------------------------ | :----------------------------------------------------------- | -| autoCommit | optional: true | Boolean: whether to commit consumption progress automatically. If set to false, call the `commit` method to commit consumption progress manually | -| autoCommitInterval | optional: 5000 (min: 500) | Long: interval between automatic commits of consumption progress, in **milliseconds**. Takes effect only when autoCommit is true | - - -### 3.2 Function List - -#### 3.2.1 Data Subscription - -##### SubscriptionPullConsumer - -| **Function** | **Description** | **Parameters** | -| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -| `open()` | Opens the consumer connection and starts message consumption. If `autoCommit` is enabled, the auto-commit worker is started. | None | -| `close()` | Closes the consumer connection. If `autoCommit` is enabled, all uncommitted messages are committed before closing. | None | -| `poll(final Duration 
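timeout)` | Polls messages with the specified timeout. | `timeout`: the poll timeout. | -| `poll(final long timeoutMs)` | Polls messages with the specified timeout (milliseconds). | `timeoutMs`: the timeout, in milliseconds. | -| `poll(final Set<String> topicNames, final Duration timeout)` | Polls messages of the specified topics with the specified timeout. | `topicNames`: the set of topics to poll. `timeout`: the timeout. | -| `poll(final Set<String> topicNames, final long timeoutMs)` | Polls messages of the specified topics with the specified timeout (milliseconds). | `topicNames`: the set of topics to poll. `timeoutMs`: the timeout, in milliseconds. | -| `commitSync(final SubscriptionMessage message)` | Commits a single message synchronously. | `message`: the message to commit. | -| `commitSync(final Iterable<SubscriptionMessage> messages)` | Commits multiple messages synchronously. | `messages`: the collection of messages to commit. | -| `commitAsync(final SubscriptionMessage message)` | Commits a single message asynchronously. | `message`: the message to commit. | -| `commitAsync(final Iterable<SubscriptionMessage> messages)` | Commits multiple messages asynchronously. | `messages`: the collection of messages to commit. | -| `commitAsync(final SubscriptionMessage message, final AsyncCommitCallback callback)` | Commits a single message asynchronously with a callback. | `message`: the message to commit. `callback`: the callback invoked when the asynchronous commit completes. | -| `commitAsync(final Iterable<SubscriptionMessage> messages, final AsyncCommitCallback callback)` | Commits multiple messages asynchronously with a callback. | `messages`: the collection of messages to commit. `callback`: the callback invoked when the asynchronous commit completes. | - -##### SubscriptionPushConsumer - -| **Function** | **Description** | **Parameters** | -| -------------------------------------------------------- | ----------------------------------------------------- | ------------------------------------------------------- | -| `open()` | Opens the consumer connection, starts message consumption, and submits the auto-poll worker. | None | -| `close()` | Closes the consumer connection and stops message consumption. | None | -| `toString()` | Returns the core configuration of the consumer object. | None | -| `coreReportMessage()` | Returns the consumer's core configuration as key-value pairs. | None | -| `allReportMessage()` | Returns all of the consumer's configuration as key-value pairs. | None | -| `buildPushConsumer()` | Builds a `SubscriptionPushConsumer` instance through the `Builder`. | None | -| `ackStrategy(final AckStrategy ackStrategy)` | Configures the consumer's message acknowledgement strategy. | `ackStrategy`: the acknowledgement strategy to use. | -| `consumeListener(final ConsumeListener consumeListener)` | Configures the consumer's message consumption logic. | `consumeListener`: the handling logic applied when the consumer receives messages. | -| `autoPollIntervalMs(final long autoPollIntervalMs)` | Configures the auto-poll interval. | `autoPollIntervalMs`: the auto-poll interval, in milliseconds. | -| `autoPollTimeoutMs(final long autoPollTimeoutMs)` | Configures the auto-poll timeout. | `autoPollTimeoutMs`: 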
the auto-poll timeout, in milliseconds. | - - - - - - - - - diff --git a/src/zh/UserGuide/latest/API/Programming-JDBC_timecho.md b/src/zh/UserGuide/latest/API/Programming-JDBC_timecho.md deleted file mode 100644 index f259d0287..000000000 --- a/src/zh/UserGuide/latest/API/Programming-JDBC_timecho.md +++ /dev/null @@ -1,295 +0,0 @@ - - -# JDBC - -**Note**: The current JDBC implementation is intended only for integration with third-party tools. Inserting data through JDBC is not recommended because it cannot deliver high write performance; JDBC is recommended for query scenarios. - -For Java applications, we recommend the [Java Native API](./Programming-Java-Native-API_timecho). - -## 1. Dependencies - -* JDK >= 1.8 -* Maven >= 3.6 - -## 2. Installation - -Run the following command in the root directory: -```shell -mvn clean install -pl iotdb-client/jdbc -am -DskipTests -``` - -### 2.1 Using IoTDB JDBC with Maven - -```xml -<dependencies> - <dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-jdbc</artifactId> - - <version>${project.version}</version> - </dependency> -</dependencies> -``` -Note: do not use a newer client version to connect to an older server version. - -### 2.2 Example Code - -This section shows how to establish a database connection, execute SQL, and display query results. - -It assumes that your project already includes the packages required for database programming and the JDBC classes. - -**Note: for faster insertion, executeBatch() is recommended** - -```java -import java.sql.*; -import org.apache.iotdb.jdbc.IoTDBSQLException; - -public class JDBCExample { - /** - * Before executing a SQL statement with a Statement object, you need to create a Statement object using the createStatement() method of the Connection object. - * After creating a Statement object, you can use its execute() method to execute a SQL statement - * Finally, remember to close the 'statement' and 'connection' objects by using their close() method - * For statements with query results, we can use the getResultSet() method of the Statement object to get the result set. 
- */ - public static void main(String[] args) throws SQLException { - Connection connection = getConnection(); - if (connection == null) { - System.out.println("get connection defeat"); - return; - } - Statement statement = connection.createStatement(); - //Create database - try { - statement.execute("CREATE DATABASE root.demo"); - }catch (IoTDBSQLException e){ - System.out.println(e.getMessage()); - } - - //SHOW DATABASES - statement.execute("SHOW DATABASES"); - outputResult(statement.getResultSet()); - - //Create time series - //Different data type has different encoding methods. Here use INT32 as an example - try { - statement.execute("CREATE TIMESERIES root.demo.s0 WITH DATATYPE=INT32,ENCODING=RLE;"); - }catch (IoTDBSQLException e){ - System.out.println(e.getMessage()); - } - //Show time series - statement.execute("SHOW TIMESERIES root.demo"); - outputResult(statement.getResultSet()); - //Show devices - statement.execute("SHOW DEVICES"); - outputResult(statement.getResultSet()); - //Count time series - statement.execute("COUNT TIMESERIES root"); - outputResult(statement.getResultSet()); - //Count nodes at the given level - statement.execute("COUNT NODES root LEVEL=3"); - outputResult(statement.getResultSet()); - //Count timeseries group by each node at the given level - statement.execute("COUNT TIMESERIES root GROUP BY LEVEL=3"); - outputResult(statement.getResultSet()); - - - //Execute insert statements in batch - statement.addBatch("insert into root.demo(timestamp,s0) values(1,1);"); - statement.addBatch("insert into root.demo(timestamp,s0) values(2,15);"); - statement.addBatch("insert into root.demo(timestamp,s0) values(2,17);"); - statement.addBatch("insert into root.demo(timestamp,s0) values(4,12);"); - statement.executeBatch(); - statement.clearBatch(); - - //Full query statement - String sql = "select * from root.demo"; - ResultSet resultSet = statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Exact query 
statement - sql = "select s0 from root.demo where time = 4;"; - resultSet= statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Time range query - sql = "select s0 from root.demo where time >= 2 and time < 5;"; - resultSet = statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Aggregate query - sql = "select count(s0) from root.demo;"; - resultSet = statement.executeQuery(sql); - System.out.println("sql: " + sql); - outputResult(resultSet); - - //Delete time series - statement.execute("delete timeseries root.demo.s0"); - - //close connection - statement.close(); - connection.close(); - } - - public static Connection getConnection() { - // JDBC driver name and database URL - String driver = "org.apache.iotdb.jdbc.IoTDBDriver"; - String url = "jdbc:iotdb://127.0.0.1:6667/"; - // set rpc compress mode - // String url = "jdbc:iotdb://127.0.0.1:6667?rpc_compress=true"; - - // Database credentials - String username = "root"; - String password = "TimechoDB@2021"; // V2.0.6.x 之前默认密码是 root - - Connection connection = null; - try { - Class.forName(driver); - connection = DriverManager.getConnection(url, username, password); - } catch (ClassNotFoundException e) { - e.printStackTrace(); - } catch (SQLException e) { - e.printStackTrace(); - } - return connection; - } - - /** - * This is an example of outputting the results in the ResultSet - */ - private static void outputResult(ResultSet resultSet) throws SQLException { - if (resultSet != null) { - System.out.println("--------------------------"); - final ResultSetMetaData metaData = resultSet.getMetaData(); - final int columnCount = metaData.getColumnCount(); - for (int i = 0; i < columnCount; i++) { - System.out.print(metaData.getColumnLabel(i + 1) + " "); - } - System.out.println(); - while (resultSet.next()) { - for (int i = 1; ; i++) { - System.out.print(resultSet.getString(i)); - if (i < columnCount) { - System.out.print(", "); - 
} else { - System.out.println(); - break; - } - } - } - System.out.println("--------------------------\n"); - } - } -} -``` - -The version parameter can be specified in the URL: -```java -String url = "jdbc:iotdb://127.0.0.1:6667?version=V_1_0"; -``` -version indicates the SQL semantics version used by the client. It provides compatibility with 0.12 SQL semantics when upgrading to 0.13. Possible values are `V_0_12`, `V_0_13`, and `V_1_0`. - -In addition, IoTDB's JDBC driver provides extra interfaces that let users read from and write to the database using a different character set (for example, GB18030) on a connection. -The default character set of IoTDB is UTF-8. To use a character set other than UTF-8, specify the charset property in the JDBC connection. For example: -1. Create a connection with the GB18030 charset: -```java -DriverManager.getConnection("jdbc:iotdb://127.0.0.1:6667?charset=GB18030", "root", "TimechoDB@2021") -// the default password was root before V2.0.6.x -``` -2. The following `IoTDBStatement` method accepts SQL encoded as `byte[]`; the bytes are decoded into a string using the specified charset. -```java -public boolean execute(byte[] sql) throws SQLException; -``` -3. When outputting query results, the `getBytes` method of `ResultSet` returns a `byte[]` encoded with the charset specified for the connection. -```java -System.out.print(resultSet.getString(i) + " (" + new String(resultSet.getBytes(i), charset) + ")"); -``` -A complete example follows: -```java -import java.nio.charset.Charset; -import java.sql.Connection; -import java.sql.DriverManager; -import java.sql.ResultSet; -import java.sql.ResultSetMetaData; -import java.sql.SQLException; -import org.apache.iotdb.jdbc.IoTDBSQLException; -import org.apache.iotdb.jdbc.IoTDBStatement; -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; - -public class JDBCCharsetExample { - - private static final Logger LOGGER = LoggerFactory.getLogger(JDBCCharsetExample.class); - - public static void main(String[] args) throws Exception { - Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); - - try (final Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?charset=GB18030", "root", "TimechoDB@2021"); // the default password was root before V2.0.6.x - final IoTDBStatement statement = (IoTDBStatement) connection.createStatement()) { - - final String insertSQLWithGB18030 = - "insert into root.测试(timestamp, 维语, 彝语, 繁体, 蒙文, 简体, 标点符号, 藏语) values(1, 'ئۇيغۇر تىلى', 'ꆈꌠꉙ', \"繁體\", 'ᠮᠣᠩᠭᠣᠯ ᠬᠡᠯᠡ', '简体', '——?!', \"བོད་སྐད།\");"; - final byte[] insertSQLWithGB18030Bytes = insertSQLWithGB18030.getBytes("GB18030"); - statement.execute(insertSQLWithGB18030Bytes); - } catch (IoTDBSQLException e) { - LOGGER.error("IoTDB Jdbc example error", e); - } - - outputResult("GB18030"); - outputResult("UTF-8"); - outputResult("UTF-16"); - 
outputResult("GBK"); - outputResult("ISO-8859-1"); - } - - private static void outputResult(String charset) throws SQLException { - System.out.println("[Charset: " + charset + "]"); - try (final Connection connection = - DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?charset=" + charset, "root", "TimechoDB@2021"); // the default password was root before V2.0.6.x - final IoTDBStatement statement = (IoTDBStatement) connection.createStatement()) { - outputResult(statement.executeQuery("select ** from root"), Charset.forName(charset)); - } catch (IoTDBSQLException e) { - LOGGER.error("IoTDB Jdbc example error", e); - } - } - - private static void outputResult(ResultSet resultSet, Charset charset) throws SQLException { - if (resultSet != null) { - System.out.println("--------------------------"); - final ResultSetMetaData metaData = resultSet.getMetaData(); - final int columnCount = metaData.getColumnCount(); - for (int i = 0; i < columnCount; i++) { - System.out.print(metaData.getColumnLabel(i + 1) + " "); - } - System.out.println(); - - while (resultSet.next()) { - for (int i = 1; ; i++) { - System.out.print( - resultSet.getString(i) + " (" + new String(resultSet.getBytes(i), charset) + ")"); - if (i < columnCount) { - System.out.print(", "); - } else { - System.out.println(); - break; - } - } - } - System.out.println("--------------------------\n"); - } - } -} -``` \ No newline at end of file diff --git a/src/zh/UserGuide/latest/API/Programming-Java-Native-API_timecho.md b/src/zh/UserGuide/latest/API/Programming-Java-Native-API_timecho.md deleted file mode 100644 index 18a7bb83d..000000000 --- a/src/zh/UserGuide/latest/API/Programming-Java-Native-API_timecho.md +++ /dev/null @@ -1,625 +0,0 @@ - - - -# Java Native API - -In the IoTDB native API, Session is the core interface for interacting with the database. It offers a rich set of methods for data writing, querying, and metadata operations. Instantiating a Session establishes a connection to the IoTDB server, over which database operations are executed. Session is not thread-safe and must not be called by multiple threads concurrently. - -SessionPool is a connection pool for Session, and programming against SessionPool is recommended. Under multi-threaded concurrency, SessionPool manages and allocates connection resources appropriately, improving system performance and resource utilization. - -## 1. 
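Overview of Steps - -1. Create a connection pool instance: initialize a SessionPool object to manage multiple Session instances. -2. Execute operations: obtain a Session instance directly from the SessionPool and execute database operations, without opening and closing a connection each time. -3. Close the connection pool: when database operations are no longer needed, close the SessionPool to release all related resources. - -## 2. Detailed Steps - -This section describes the core development workflow and does not demonstrate every parameter and interface. For the full set of features and parameters, see [Full Interface Description](./Programming-Java-Native-API_timecho#_3-全量接口说明) or the [source code](https://github.com/apache/iotdb/tree/rc/2.0.1/example/session/src/main/java/org/apache/iotdb) - -### 2.1 Create a Maven Project - -Create a Maven project and add the following dependency to the pom.xml file (JDK >= 1.8, Maven >= 3.6) - -```xml -<dependencies> - <dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-session</artifactId> - - <version>${project.version}</version> - </dependency> -</dependencies> -``` -Note: do not use a newer client version to connect to an older server version. - -### 2.2 Create a Connection Pool Instance - -```java -import java.util.ArrayList; -import java.util.List; -import org.apache.iotdb.session.pool.SessionPool; - -public class IoTDBSessionPoolExample { - private static SessionPool sessionPool; - - public static void main(String[] args) { - // Using nodeUrls ensures that when one node goes down, other nodes are automatically connected to retry - List<String> nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") // the default password was root before V2.0.6.x - .maxSize(3) - .build(); - } -} -``` - -### 2.3 Execute Database Operations - -#### 2.3.1 Data Writing - -In industrial scenarios, data writing falls into two categories: multi-row writes and multi-row writes for a single device. The write interfaces for each scenario are described below. - -##### Multi-Row Write Interface - -Description: writes multiple rows at once; each row holds the values of multiple measurements for one device at one timestamp. - - -Interfaces: - -| Interface | Description | -| ------------------------------------------------------------ | ------------------------------------------ | -| `insertRecords(List<String> deviceIds, List<Long> times, List<List<String>> measurementsList, List<List<TSDataType>> typesList, List<List<Object>> valuesList)` | Inserts multiple rows; suited to scenarios where measurements are collected independently | - -Code example: - -```java -import java.util.ArrayList; -import java.util.List; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; -import org.apache.tsfile.enums.TSDataType; - -public class SessionPoolExample { - private static SessionPool 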
sessionPool; - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. execute insert data - insertRecordsExample(); - // 3. close SessionPool - closeSessionPool(); - } - - private static void constructSessionPool() { - // Using nodeUrls ensures that when one node goes down, other nodes are automatically connected to retry - List nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") //V2.0.6.x 之前默认密码为root - .maxSize(3) - .build(); - } - - public static void insertRecordsExample() throws IoTDBConnectionException, StatementExecutionException { - String deviceId = "root.sg1.d1"; - List measurements = new ArrayList<>(); - measurements.add("s1"); - measurements.add("s2"); - measurements.add("s3"); - List deviceIds = new ArrayList<>(); - List> measurementsList = new ArrayList<>(); - List> valuesList = new ArrayList<>(); - List timestamps = new ArrayList<>(); - List> typesList = new ArrayList<>(); - - for (long time = 0; time < 500; time++) { - List values = new ArrayList<>(); - List types = new ArrayList<>(); - values.add(1L); - values.add(2L); - values.add(3L); - types.add(TSDataType.INT64); - types.add(TSDataType.INT64); - types.add(TSDataType.INT64); - - deviceIds.add(deviceId); - measurementsList.add(measurements); - valuesList.add(values); - typesList.add(types); - timestamps.add(time); - if (time != 0 && time % 100 == 0) { - try { - sessionPool.insertRecords(deviceIds, timestamps, measurementsList, typesList, valuesList); - } catch (IoTDBConnectionException | StatementExecutionException e) { - // solve exception - } - deviceIds.clear(); - measurementsList.clear(); - valuesList.clear(); - typesList.clear(); - timestamps.clear(); - } - } - try { - sessionPool.insertRecords(deviceIds, timestamps, 
measurementsList, typesList, valuesList); - } catch (IoTDBConnectionException | StatementExecutionException e) { - // solve exception - } - } - - public static void closeSessionPool(){ - sessionPool.close(); - } -} -``` - -##### 单设备多行数据写入接口 - -接口说明:支持一次写入单个设备的多行数据,每一行对应一个时间戳的多个测点值。 - -接口列表: - -| 接口名称 | 功能描述 | -| ----------------------------- | ---------------------------------------------------- | -| `insertTablet(Tablet tablet)` | 插入单个设备的多行数据,适用于不同测点独立采集的场景 | - -代码案例: - -```java -import java.util.ArrayList; -import java.util.List; -import java.util.Random; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; -import org.apache.tsfile.enums.TSDataType; -import org.apache.tsfile.write.record.Tablet; -import org.apache.tsfile.write.schema.IMeasurementSchema; -import org.apache.tsfile.write.schema.MeasurementSchema; - -public class SessionPoolExample { - private static SessionPool sessionPool; - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. execute insert data - insertTabletExample(); - // 3. 
close SessionPool - closeSessionPool(); - } - - private static void constructSessionPool() { - // Using nodeUrls ensures that when one node goes down, other nodes are automatically connected to retry - List nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - //nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") //V2.0.6.x 之前默认密码为root - .maxSize(3) - .build(); - } - - private static void insertTabletExample() throws IoTDBConnectionException, StatementExecutionException { - /* - * A Tablet example: - * device1 - * time s1, s2, s3 - * 1, 1, 1, 1 - * 2, 2, 2, 2 - * 3, 3, 3, 3 - */ - // The schema of measurements of one device - // only measurementId and data type in MeasurementSchema take effects in Tablet - List schemaList = new ArrayList<>(); - schemaList.add(new MeasurementSchema("s1", TSDataType.INT64)); - schemaList.add(new MeasurementSchema("s2", TSDataType.INT64)); - schemaList.add(new MeasurementSchema("s3", TSDataType.INT64)); - - Tablet tablet = new Tablet("root.sg.d1",schemaList,100); - - // Method 1 to add tablet data - long timestamp = System.currentTimeMillis(); - - Random random = new Random(); - for (long row = 0; row < 100; row++) { - int rowIndex = tablet.getRowSize(); - tablet.addTimestamp(rowIndex, timestamp); - for (int s = 0; s < 3; s++) { - long value = random.nextLong(); - tablet.addValue(schemaList.get(s).getMeasurementName(), rowIndex, value); - } - if (tablet.getRowSize() == tablet.getMaxRowNumber()) { - sessionPool.insertTablet(tablet); - tablet.reset(); - } - timestamp++; - } - if (tablet.getRowSize() != 0) { - sessionPool.insertTablet(tablet); - tablet.reset(); - } - } - - public static void closeSessionPool(){ - sessionPool.close(); - } -} -``` - -#### 2.3.2 SQL操作 - -SQL操作分为查询和非查询两类操作,对应的接口为`executeQuery`和`executeNonQuery`操作,其区别为前者执行的是具体的查询语句,会返回一个结果集,后者是执行的是增、删、改操作,不返回结果集。 - -```java -import java.util.ArrayList; -import 
java.util.List; -import org.apache.iotdb.isession.SessionDataSet.DataIterator; -import org.apache.iotdb.isession.pool.SessionDataSetWrapper; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; - -public class SessionPoolExample { - private static SessionPool sessionPool; - public static void main(String[] args) throws IoTDBConnectionException, StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. executes a query SQL statement and prints the result set. - executeQueryExample(); - // 3. executes non-query SQL statements, such as DDL or DML commands. - executeNonQueryExample(); - // 4. close SessionPool - closeSessionPool(); - } - - private static void executeNonQueryExample() throws IoTDBConnectionException, StatementExecutionException { - // 1. create a nonAligned time series - sessionPool.executeNonQueryStatement("create timeseries root.test.d1.s1 with dataType = int32"); - // 2. set ttl - sessionPool.executeNonQueryStatement("set TTL to root.test.** 10000"); - // 3. delete time series - sessionPool.executeNonQueryStatement("delete timeseries root.test.d1.s1"); - } - - private static void executeQueryExample() throws IoTDBConnectionException, StatementExecutionException { - // 1. execute normal query - try(SessionDataSetWrapper wrapper = sessionPool.executeQueryStatement("select s1 from root.sg1.d1 limit 10")) { - // get DataIterator like JDBC - DataIterator dataIterator = wrapper.iterator(); - System.out.println(wrapper.getColumnNames()); - System.out.println(wrapper.getColumnTypes()); - while (dataIterator.next()) { - StringBuilder builder = new StringBuilder(); - for (String columnName : wrapper.getColumnNames()) { - builder.append(dataIterator.getString(columnName) + " "); - } - System.out.println(builder); - } - } - // 2. 
execute aggregate query - try(SessionDataSetWrapper wrapper = sessionPool.executeQueryStatement("select count(s1) from root.sg1.d1 group by ([0, 40), 5ms) ")) { - // get DataIterator like JDBC - DataIterator dataIterator = wrapper.iterator(); - System.out.println(wrapper.getColumnNames()); - System.out.println(wrapper.getColumnTypes()); - while (dataIterator.next()) { - StringBuilder builder = new StringBuilder(); - for (String columnName : wrapper.getColumnNames()) { - builder.append(dataIterator.getString(columnName) + " "); - } - System.out.println(builder); - } - } - } - - private static void constructSessionPool() { - // Using nodeUrls ensures that when one node goes down, other nodes are automatically connected to retry - List nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - nodeUrls.add("127.0.0.1:6668"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("TimechoDB@2021") //V2.0.6.x 之前默认密码为root - .maxSize(3) - .build(); - } - - public static void closeSessionPool(){ - sessionPool.close(); - } -} -``` - - -更多关于结果集及其方法 `SessionDataSet.DataIterator` 的使用可参考如下示例(其中,getBlob 和 getDate 两个接口从 V2.0.4 起支持): - -```java -import org.apache.iotdb.isession.SessionDataSet; -import org.apache.iotdb.isession.pool.SessionDataSetWrapper; -import org.apache.iotdb.rpc.IoTDBConnectionException; -import org.apache.iotdb.rpc.StatementExecutionException; -import org.apache.iotdb.session.pool.SessionPool; - -import org.apache.tsfile.enums.TSDataType; -import org.apache.tsfile.utils.Binary; -import org.apache.tsfile.utils.DateUtils; -import org.apache.tsfile.write.record.Tablet; -import org.apache.tsfile.write.schema.MeasurementSchema; -import org.junit.Assert; - -import java.sql.Timestamp; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.List; - -public class SessionExample { - private static SessionPool sessionPool; - - public static void main(String[] args) - throws IoTDBConnectionException, 
StatementExecutionException { - // 1. init SessionPool - constructSessionPool(); - // 2. writes one aligned row, then runs a query and checks each column through DataIterator. - executeQueryExample(); - // 3. close SessionPool - closeSessionPool(); - } - - private static void executeQueryExample() - throws IoTDBConnectionException, StatementExecutionException { - Tablet tablet = - new Tablet( - "root.sg.d1", - Arrays.asList( - new MeasurementSchema("s1", TSDataType.INT32), - new MeasurementSchema("s2", TSDataType.INT64), - new MeasurementSchema("s3", TSDataType.FLOAT), - new MeasurementSchema("s4", TSDataType.DOUBLE), - new MeasurementSchema("s5", TSDataType.TEXT), - new MeasurementSchema("s6", TSDataType.BOOLEAN), - new MeasurementSchema("s7", TSDataType.TIMESTAMP), - new MeasurementSchema("s8", TSDataType.BLOB), - new MeasurementSchema("s9", TSDataType.STRING), - new MeasurementSchema("s10", TSDataType.DATE), - new MeasurementSchema("s11", TSDataType.TIMESTAMP)), - 10); - tablet.addTimestamp(0, 0L); - tablet.addValue("s1", 0, 1); - tablet.addValue("s2", 0, 1L); - tablet.addValue("s3", 0, 0f); - tablet.addValue("s4", 0, 0d); - tablet.addValue("s5", 0, "text_value"); - tablet.addValue("s6", 0, true); - tablet.addValue("s7", 0, 1L); - tablet.addValue("s8", 0, new Binary(new byte[] {1})); - tablet.addValue("s9", 0, "string_value"); - tablet.addValue("s10", 0, DateUtils.parseIntToLocalDate(20250403)); - tablet.initBitMaps(); - tablet.bitMaps[10].mark(0); - tablet.rowSize = 1; - sessionPool.insertAlignedTablet(tablet); - - try (SessionDataSetWrapper dataSet = - sessionPool.executeQueryStatement("select * from root.sg.d1")) { - SessionDataSet.DataIterator iterator = dataSet.iterator(); - int count = 0; - while (iterator.next()) { - count++; - Assert.assertFalse(iterator.isNull("root.sg.d1.s1")); - Assert.assertEquals(1, iterator.getInt("root.sg.d1.s1")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s2")); - Assert.assertEquals(1L, iterator.getLong("root.sg.d1.s2")); - 
Assert.assertFalse(iterator.isNull("root.sg.d1.s3")); - Assert.assertEquals(0, iterator.getFloat("root.sg.d1.s3"), 0.01); - Assert.assertFalse(iterator.isNull("root.sg.d1.s4")); - Assert.assertEquals(0, iterator.getDouble("root.sg.d1.s4"), 0.01); - Assert.assertFalse(iterator.isNull("root.sg.d1.s5")); - Assert.assertEquals("text_value", iterator.getString("root.sg.d1.s5")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s6")); - Assert.assertTrue(iterator.getBoolean("root.sg.d1.s6")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s7")); - Assert.assertEquals(new Timestamp(1), iterator.getTimestamp("root.sg.d1.s7")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s8")); - Assert.assertEquals(new Binary(new byte[] {1}), iterator.getBlob("root.sg.d1.s8")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s9")); - Assert.assertEquals("string_value", iterator.getString("root.sg.d1.s9")); - Assert.assertFalse(iterator.isNull("root.sg.d1.s10")); - Assert.assertEquals( - DateUtils.parseIntToLocalDate(20250403), iterator.getDate("root.sg.d1.s10")); - Assert.assertTrue(iterator.isNull("root.sg.d1.s11")); - Assert.assertNull(iterator.getTimestamp("root.sg.d1.s11")); - - Assert.assertEquals(new Timestamp(0), iterator.getTimestamp("Time")); - Assert.assertFalse(iterator.isNull("Time")); - } - Assert.assertEquals(tablet.rowSize, count); - } - } - - private static void constructSessionPool() { - List nodeUrls = new ArrayList<>(); - nodeUrls.add("127.0.0.1:6667"); - sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user("root") - .password("root") - .maxSize(3) - .build(); - } - - public static void closeSessionPool() { - sessionPool.close(); - } -} -``` - - -## 3. 
全量接口说明 - -### 3.1 参数列表 - -Session具有如下的字段,可以通过构造函数或Session.Builder方式设置如下参数 - -| 字段名 | 类型 | 说明 | -| -------------------------------- | ----------------------------------- | ---------------------------------------- | -| `nodeUrls` | `List` | 数据库节点的 URL 列表,支持多节点连接 | -| `username` | `String` | 用户名 | -| `password` | `String` | 密码 | -| `fetchSize` | `int` | 查询结果的默认批量返回大小 | -| `useSSL` | `boolean` | 是否启用 SSL | -| `trustStore` | `String` | 信任库路径 | -| `trustStorePwd` | `String` | 信任库密码 | -| `queryTimeoutInMs` | `long` | 查询的超时时间,单位毫秒。默认值-1。负数代表采用服务器默认配置,0 代表关闭查询超时功能。 | -| `enableRPCCompression` | `boolean` | 是否启用 RPC 压缩 | -| `connectionTimeoutInMs` | `int` | 连接超时时间,单位毫秒 | -| `zoneId` | `ZoneId` | 会话的时区设置 | -| `thriftDefaultBufferSize` | `int` | Thrift 默认缓冲区大小 | -| `thriftMaxFrameSize` | `int` | Thrift 最大帧大小 | -| `defaultEndPoint` | `TEndPoint` | 默认的数据库端点信息 | -| `defaultSessionConnection` | `SessionConnection` | 默认的会话连接对象 | -| `isClosed` | `boolean` | 当前会话是否已关闭 | -| `enableRedirection` | `boolean` | 是否启用重定向功能 | -| `enableRecordsAutoConvertTablet` | `boolean` | 是否启用记录自动转换为 Tablet 的功能 | -| `deviceIdToEndpoint` | `Map` | 设备 ID 和数据库端点的映射关系 | -| `endPointToSessionConnection` | `Map` | 数据库端点和会话连接的映射关系 | -| `executorService` | `ScheduledExecutorService` | 用于定期更新节点列表的线程池 | -| `availableNodes` | `INodeSupplier` | 可用节点的供应器 | -| `enableQueryRedirection` | `boolean` | 是否启用查询重定向功能 | -| `version` | `Version` | 客户端的版本号,用于与服务端的兼容性判断 | -| `enableAutoFetch` | `boolean` | 是否启用自动获取功能 | -| `maxRetryCount` | `int` | 最大重试次数 | -| `retryIntervalInMs` | `long` | 重试的间隔时间,单位毫秒 | - - - -### 3.2 接口列表 - -#### 3.2.1 元数据管理 - -| 方法名 | 功能描述 | 参数解释 | -| ------------------------------------------------------------ | ------------------------ | ------------------------------------------------------------ | -| `createDatabase(String database)` | 创建数据库 | `database`: 数据库名称 | -| `deleteDatabase(String database)` | 删除指定数据库 | `database`: 要删除的数据库名称 | -| `deleteDatabases(List databases)` | 批量删除数据库 | `databases`: 
要删除的数据库名称列表 | -| `createTimeseries(String path, TSDataType dataType, TSEncoding encoding, CompressionType compressor)` | 创建单个时间序列 | `path`: 时间序列路径,`dataType`: 数据类型,`encoding`: 编码类型,`compressor`: 压缩类型 | -| `createAlignedTimeseries(...)` | 创建对齐时间序列 | 设备ID、测点列表、数据类型列表、编码列表、压缩类型列表 | -| `createMultiTimeseries(...)` | 批量创建时间序列 | 多个路径、数据类型、编码、压缩类型、属性、标签、别名等 | -| `deleteTimeseries(String path)` | 删除时间序列 | `path`: 要删除的时间序列路径 | -| `deleteTimeseries(List paths)` | 批量删除时间序列 | `paths`: 要删除的时间序列路径列表 | -| `setSchemaTemplate(String templateName, String prefixPath)` | 设置模式模板 | `templateName`: 模板名称,`prefixPath`: 应用模板的路径 | -| `createSchemaTemplate(Template template)` | 创建模式模板 | `template`: 模板对象 | -| `dropSchemaTemplate(String templateName)` | 删除模式模板 | `templateName`: 要删除的模板名称 | -| `addAlignedMeasurementsInTemplate(...)` | 添加对齐测点到模板 | 模板名称、测点路径列表、数据类型、编码类型、压缩类型 | -| `addUnalignedMeasurementsInTemplate(...)` | 添加非对齐测点到模板 | 同上 | -| `deleteNodeInTemplate(String templateName, String path)` | 删除模板中的节点 | `templateName`: 模板名称,`path`: 要删除的路径 | -| `countMeasurementsInTemplate(String name)` | 统计模板中测点数量 | `name`: 模板名称 | -| `isMeasurementInTemplate(String templateName, String path)` | 检查模板中是否存在某测点 | `templateName`: 模板名称,`path`: 测点路径 | -| `isPathExistInTemplate(String templateName, String path)` | 检查模板中路径是否存在 | 同上 | -| `showMeasurementsInTemplate(String templateName)` | 显示模板中的测点 | `templateName`: 模板名称 | -| `showMeasurementsInTemplate(String templateName, String pattern)` | 按模式显示模板中的测点 | `templateName`: 模板名称,`pattern`: 匹配模式 | -| `showAllTemplates()` | 显示所有模板 | 无参数 | -| `showPathsTemplateSetOn(String templateName)` | 显示模板应用的路径 | `templateName`: 模板名称 | -| `showPathsTemplateUsingOn(String templateName)` | 显示模板实际使用的路径 | 同上 | -| `unsetSchemaTemplate(String prefixPath, String templateName)` | 取消路径的模板设置 | `prefixPath`: 路径,`templateName`: 模板名称 | - - -#### 3.2.2 数据写入 - -| 方法名 | 功能描述 | 参数解释 | -| ------------------------------------------------------------ | ---------------------------------- | 
------------------------------------------------------------ | -| `insertRecord(String deviceId, long time, List measurements, List types, Object... values)` | 插入单条记录 | `deviceId`: 设备ID,`time`: 时间戳,`measurements`: 测点列表,`types`: 数据类型列表,`values`: 值列表 | -| `insertRecord(String deviceId, long time, List measurements, List values)` | 插入单条记录 | `deviceId`: 设备ID,`time`: 时间戳,`measurements`: 测点列表,`values`: 值列表 | -| `insertRecords(List deviceIds, List times, List> measurementsList, List> valuesList)` | 插入多条记录 | `deviceIds`: 设备ID列表,`times`: 时间戳列表,`measurementsList`: 测点列表列表,`valuesList`: 值列表 | -| `insertRecords(List deviceIds, List times, List> measurementsList, List> typesList, List> valuesList)` | 插入多条记录 | 同上,增加 `typesList`: 数据类型列表 | -| `insertRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> typesList, List> valuesList)` | 插入单设备的多条记录 | `deviceId`: 设备ID,`times`: 时间戳列表,`measurementsList`: 测点列表列表,`typesList`: 类型列表,`valuesList`: 值列表 | -| `insertRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> typesList, List> valuesList, boolean haveSorted)` | 插入排序后的单设备多条记录 | 同上,增加 `haveSorted`: 数据是否已排序 | -| `insertStringRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> valuesList)` | 插入字符串格式的单设备记录 | `deviceId`: 设备ID,`times`: 时间戳列表,`measurementsList`: 测点列表,`valuesList`: 值列表 | -| `insertStringRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> valuesList, boolean haveSorted)` | 插入排序的字符串格式单设备记录 | 同上,增加 `haveSorted`: 数据是否已排序 | -| `insertAlignedRecord(String deviceId, long time, List measurements, List types, List values)` | 插入单条对齐记录 | `deviceId`: 设备ID,`time`: 时间戳,`measurements`: 测点列表,`types`: 类型列表,`values`: 值列表 | -| `insertAlignedRecord(String deviceId, long time, List measurements, List values)` | 插入字符串格式的单条对齐记录 | `deviceId`: 设备ID,`time`: 时间戳,`measurements`: 测点列表,`values`: 值列表 | -| `insertAlignedRecords(List deviceIds, List times, List> measurementsList, List> valuesList)` | 插入多条对齐记录 | 
`deviceIds`: 设备ID列表,`times`: 时间戳列表,`measurementsList`: 测点列表,`valuesList`: 值列表 | -| `insertAlignedRecords(List deviceIds, List times, List> measurementsList, List> typesList, List> valuesList)` | 插入多条对齐记录 | 同上,增加 `typesList`: 数据类型列表 | -| `insertAlignedRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> typesList, List> valuesList)` | 插入单设备的多条对齐记录 | 同上 | -| `insertAlignedRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> typesList, List> valuesList, boolean haveSorted)` | 插入排序的单设备多条对齐记录 | 同上,增加 `haveSorted`: 数据是否已排序 | -| `insertAlignedStringRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> valuesList)` | 插入字符串格式的单设备对齐记录 | `deviceId`: 设备ID,`times`: 时间戳列表,`measurementsList`: 测点列表,`valuesList`: 值列表 | -| `insertAlignedStringRecordsOfOneDevice(String deviceId, List times, List> measurementsList, List> valuesList, boolean haveSorted)` | 插入排序的字符串格式单设备对齐记录 | 同上,增加 `haveSorted`: 数据是否已排序 | -| `insertTablet(Tablet tablet)` | 插入单个Tablet数据 | `tablet`: 要插入的Tablet数据 | -| `insertTablet(Tablet tablet, boolean sorted)` | 插入排序的Tablet数据 | 同上,增加 `sorted`: 数据是否已排序 | -| `insertAlignedTablet(Tablet tablet)` | 插入对齐的Tablet数据 | `tablet`: 要插入的Tablet数据 | -| `insertAlignedTablet(Tablet tablet, boolean sorted)` | 插入排序的对齐Tablet数据 | 同上,增加 `sorted`: 数据是否已排序 | -| `insertTablets(Map tablets)` | 批量插入多个Tablet数据 | `tablets`: 设备ID到Tablet的映射表 | -| `insertTablets(Map tablets, boolean sorted)` | 批量插入排序的多个Tablet数据 | 同上,增加 `sorted`: 数据是否已排序 | -| `insertAlignedTablets(Map tablets)` | 批量插入多个对齐Tablet数据 | `tablets`: 设备ID到Tablet的映射表 | -| `insertAlignedTablets(Map tablets, boolean sorted)` | 批量插入排序的多个对齐Tablet数据 | 同上,增加 `sorted`: 数据是否已排序 | - -#### 3.2.3 数据删除 - -| 方法名 | 功能描述 | 参数解释 | -| ------------------------------------------------------------ | ---------------------------- | ---------------------------------------- | -| `deleteTimeseries(String path)` | 删除单个时间序列 | `path`: 时间序列路径 | -| `deleteTimeseries(List paths)` | 批量删除时间序列 | `paths`: 
时间序列路径列表 | -| `deleteData(String path, long endTime)` | 删除指定路径的历史数据 | `path`: 路径,`endTime`: 结束时间戳 | -| `deleteData(List paths, long endTime)` | 批量删除路径的历史数据 | `paths`: 路径列表,`endTime`: 结束时间戳 | -| `deleteData(List paths, long startTime, long endTime)` | 删除路径时间范围内的历史数据 | 同上,增加 `startTime`: 起始时间戳 | - - -#### 3.2.4 数据查询 - -| 方法名 | 功能描述 | 参数解释 | -| ------------------------------------------------------------ | -------------------------------- | ------------------------------------------------------------ | -| `executeQueryStatement(String sql)` | 执行查询语句 | `sql`: 查询SQL语句 | -| `executeQueryStatement(String sql, long timeoutInMs)` | 执行带超时的查询语句 | `sql`: 查询SQL语句,`timeoutInMs`: 查询超时时间(毫秒),默认取服务器配置即60s | -| `executeRawDataQuery(List paths, long startTime, long endTime)` | 查询指定路径的原始数据 | `paths`: 查询路径列表,`startTime`: 起始时间戳,`endTime`: 结束时间戳 | -| `executeRawDataQuery(List paths, long startTime, long endTime, long timeOut)` | 查询指定路径的原始数据(带超时) | 同上,增加 `timeOut`: 超时时间 | -| `executeLastDataQuery(List paths)` | 查询最新数据 | `paths`: 查询路径列表 | -| `executeLastDataQuery(List paths, long lastTime)` | 查询指定时间的最新数据 | `paths`: 查询路径列表,`lastTime`: 指定的时间戳 | -| `executeLastDataQuery(List paths, long lastTime, long timeOut)` | 查询指定时间的最新数据(带超时) | 同上,增加 `timeOut`: 超时时间 | -| `executeLastDataQueryForOneDevice(String db, String device, List sensors, boolean isLegalPathNodes)` | 查询单个设备的最新数据 | `db`: 数据库名,`device`: 设备名,`sensors`: 传感器列表,`isLegalPathNodes`: 是否合法路径节点 | -| `executeAggregationQuery(List paths, List aggregations)` | 执行聚合查询 | `paths`: 查询路径列表,`aggregations`: 聚合类型列表 | -| `executeAggregationQuery(List paths, List aggregations, long startTime, long endTime)` | 执行带时间范围的聚合查询 | 同上,增加 `startTime`: 起始时间戳,`endTime`: 结束时间戳 | -| `executeAggregationQuery(List paths, List aggregations, long startTime, long endTime, long interval)` | 执行带时间间隔的聚合查询 | 同上,增加 `interval`: 时间间隔 | -| `executeAggregationQuery(List paths, List aggregations, long startTime, long endTime, long interval, long slidingStep)` | 执行滑动窗口聚合查询 | 同上,增加 
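`slidingStep`: 滑动步长 | -| `fetchAllConnections()` | 获取所有活动连接信息 | 无参数 | - -#### 3.2.5 系统状态与备份 - -| 方法名 | 功能描述 | 参数解释 | -| -------------------------- | ---------------------- | -------------------------------------- | -| `getBackupConfiguration()` | 获取备份配置信息 | 无参数 | -| `fetchAllConnections()` | 获取所有活动的连接信息 | 无参数 | -| `getSystemStatus()` | 获取系统状态 | 已废弃,默认返回 `SystemStatus.NORMAL` | - - - diff --git a/src/zh/UserGuide/latest/API/Programming-MQTT_timecho.md b/src/zh/UserGuide/latest/API/Programming-MQTT_timecho.md deleted file mode 100644 index 9dbd8aae4..000000000 --- a/src/zh/UserGuide/latest/API/Programming-MQTT_timecho.md +++ /dev/null @@ -1,295 +0,0 @@ - - -# MQTT Protocol - -## 1. Overview - -MQTT is a lightweight messaging protocol designed for the Internet of Things (IoT) and low-bandwidth environments. It is based on the publish/subscribe (Pub/Sub) model and supports efficient, reliable two-way communication between devices. Its core goals are low power consumption, low bandwidth usage, and high real-time performance, which makes it especially suitable for unstable networks and resource-constrained scenarios (such as sensors and mobile devices). - -IoTDB deeply integrates MQTT protocol capabilities and is fully compatible with MQTT v3.1 (an OASIS international standard protocol). The IoTDB server has a built-in high-performance MQTT Broker service module, so no third-party middleware is required, and devices can write time series data directly into the IoTDB storage engine through MQTT messages. - - - -Note: starting from V2.0.8.2, the TimechoDB installation package does not include the MQTT service JAR by default. Before using this service, contact the Timecho team to obtain the JAR and place it under timechodb_home/lib or timechodb_home/ext/external_service. - -## 2. Built-in MQTT Service -The built-in MQTT service provides the ability to connect directly to IoTDB through MQTT. It listens for publish messages from MQTT clients and immediately writes the data to storage. -MQTT topics correspond to IoTDB time series. -The message payload can be formatted into events by a `PayloadFormatter` loaded through Java SPI; the default implementation is `JSONPayloadFormatter`. - The default `json` formatter supports two JSON formats, as well as JSON arrays composed of them. The following are example MQTT message payloads: - -```json - { - "device":"root.sg.d1", - "timestamp":1586076045524, - "measurements":["s1","s2"], - "values":[0.530635,0.530635] - } -``` -or -```json - { - "device":"root.sg.d1", - "timestamps":[1586076045524,1586076065526], - "measurements":["s1","s2"], - "values":[[0.530635,0.530635], [0.530655,0.530695]] - } -``` -or a JSON array of either of the two formats above. - - - -## 3. 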
MQTT Configuration -By default, the IoTDB MQTT service loads its configuration from `${IOTDB_HOME}/${IOTDB_CONF}/iotdb-system.properties`. - -The configuration items are as follows: - -| **Name** | **Description** | **Default** | -| --- | --- | --- | -| `enable_mqtt_service` | Whether to enable the MQTT service | FALSE | -| `mqtt_host` | Host the MQTT service binds to | 127.0.0.1 | -| `mqtt_port` | Port the MQTT service binds to | 1883 | -| `mqtt_handler_pool_size` | Size of the handler pool that processes MQTT messages | 1 | -| **`mqtt_payload_formatter`** | **MQTT message payload formatter. Options: `json` (tree model only) and `line` (table model only).** | **json** | -| `mqtt_max_message_size` | Maximum MQTT message size, in bytes | 1048576 | - - -## 4. Sample Code -The following example shows an MQTT client sending messages to the IoTDB server. - - ```java -MQTT mqtt = new MQTT(); -mqtt.setHost("127.0.0.1", 1883); -mqtt.setUserName("root"); -mqtt.setPassword("root"); - -BlockingConnection connection = mqtt.blockingConnection(); -connection.connect(); - -Random random = new Random(); -for (int i = 0; i < 10; i++) { - String payload = String.format("{\n" + - "\"device\":\"root.sg.d1\",\n" + - "\"timestamp\":%d,\n" + - "\"measurements\":[\"s1\"],\n" + - "\"values\":[%f]\n" + - "}", System.currentTimeMillis(), random.nextDouble()); - - connection.publish("root.sg.d1.s1", payload.getBytes(), QoS.AT_LEAST_ONCE, false); -} - -connection.disconnect(); - ``` - - -## 5. 
Customizing the MQTT Message Format - -In production, each device usually ships with its own MQTT client, and the message format of those clients is already fixed. Communicating in the MQTT message format that IoTDB supports would require upgrading every existing client, which is costly. With a small amount of code, however, the MQTT message format can be customized on the server side without touching the clients. -A simple example can be found in the [example/mqtt-customize](https://github.com/apache/iotdb/tree/rc/2.0.1/example/mqtt-customize) project in the source code. - -Assume the MQTT client sends messages in the following format: -```json - { - "time":1586076045523, - "deviceID":"car_1", - "deviceType":"油车", - "point":"油量", - "value":10.0 -} -``` -or as a JSON array: -```json -[ - { - "time":1586076045523, - "deviceID":"car_1", - "deviceType":"油车", - "point":"油量", - "value":10.0 - }, - { - "time":1586076045524, - "deviceID":"car_2", - "deviceType":"新能源车", - "point":"速度", - "value":80.0 - } -] -``` - - -The custom MQTT message format can then be set up with the following steps: - -1. Create a Java project and add the following dependency -```xml - <dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>iotdb-server</artifactId> - <version>2.0.4-SNAPSHOT</version> - </dependency> -``` -2. Create an implementation class that implements the interface `org.apache.iotdb.db.protocol.mqtt.PayloadFormatter` - -```java -package org.apache.iotdb.mqtt.server; - -import org.apache.iotdb.db.protocol.mqtt.Message; -import org.apache.iotdb.db.protocol.mqtt.PayloadFormatter; -import org.apache.iotdb.db.protocol.mqtt.TableMessage; - -import com.google.common.collect.Lists; -import com.google.gson.Gson; -import com.google.gson.GsonBuilder; -import com.google.gson.JsonArray; -import com.google.gson.JsonElement; -import com.google.gson.JsonObject; -import com.google.gson.JsonParseException; -import io.netty.buffer.ByteBuf; -import org.apache.commons.lang3.NotImplementedException; -import org.apache.tsfile.enums.TSDataType; - -import java.nio.charset.StandardCharsets; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.List; - -/** - * The Customized JSON payload formatter. 
one json format supported: { "time":1586076045523, - * "deviceID":"car_1", "deviceType":"新能源车", "point":"速度", "value":80.0 } - */ -public class CustomizedJsonPayloadFormatter implements PayloadFormatter { - private static final String JSON_KEY_TIME = "time"; - private static final String JSON_KEY_DEVICEID = "deviceID"; - private static final String JSON_KEY_DEVICETYPE = "deviceType"; - private static final String JSON_KEY_POINT = "point"; - private static final String JSON_KEY_VALUE = "value"; - private static final Gson GSON = new GsonBuilder().create(); - - @Override - public List format(String topic, ByteBuf payload) { - if (payload == null) { - return new ArrayList<>(); - } - String txt = payload.toString(StandardCharsets.UTF_8); - JsonElement jsonElement = GSON.fromJson(txt, JsonElement.class); - if (jsonElement.isJsonObject()) { - JsonObject jsonObject = jsonElement.getAsJsonObject(); - return formatTableRow(topic, jsonObject); - } else if (jsonElement.isJsonArray()) { - JsonArray jsonArray = jsonElement.getAsJsonArray(); - List messages = new ArrayList<>(); - for (JsonElement element : jsonArray) { - JsonObject jsonObject = element.getAsJsonObject(); - messages.addAll(formatTableRow(topic, jsonObject)); - } - return messages; - } - throw new JsonParseException("payload is invalidate"); - } - - @Override - @Deprecated - public List format(ByteBuf payload) { - throw new NotImplementedException(); - } - - private List formatTableRow(String topic, JsonObject jsonObject) { - TableMessage message = new TableMessage(); - String database = !topic.contains("/") ? 
topic : topic.substring(0, topic.indexOf("/")); - String table = "test_table"; - - // Parsing Database Name - message.setDatabase(database); - - // Parsing Table Name - message.setTable(table); - - // Parsing Tags - List tagKeys = new ArrayList<>(); - tagKeys.add(JSON_KEY_DEVICEID); - List tagValues = new ArrayList<>(); - tagValues.add(jsonObject.get(JSON_KEY_DEVICEID).getAsString()); - message.setTagKeys(tagKeys); - message.setTagValues(tagValues); - - // Parsing Attributes - List attributeKeys = new ArrayList<>(); - List attributeValues = new ArrayList<>(); - attributeKeys.add(JSON_KEY_DEVICETYPE); - attributeValues.add(jsonObject.get(JSON_KEY_DEVICETYPE).getAsString()); - message.setAttributeKeys(attributeKeys); - message.setAttributeValues(attributeValues); - - // Parsing Fields - List fields = Arrays.asList(JSON_KEY_POINT); - List dataTypes = Arrays.asList(TSDataType.FLOAT); - List values = Arrays.asList(jsonObject.get(JSON_KEY_VALUE).getAsFloat()); - message.setFields(fields); - message.setDataTypes(dataTypes); - message.setValues(values); - - // Parsing timestamp - message.setTimestamp(jsonObject.get(JSON_KEY_TIME).getAsLong()); - return Lists.newArrayList(message); - } - - @Override - public String getName() { - // set the value of mqtt_payload_formatter in iotdb-system.properties as the following string: - return "CustomizedJson2Table"; - } - - @Override - public String getType() { - return PayloadFormatter.TABLE_TYPE; - } -} -``` -3. Edit the `src/main/resources/META-INF/services/org.apache.iotdb.db.protocol.mqtt.PayloadFormatter` file in the project: - Clear the content that ships with the example and write the fully qualified name (package.ClassName) of the implementation class above into the file. Note that this file contains only one line. - In this example, the file content is: `org.apache.iotdb.mqtt.server.CustomizedJsonPayloadFormatter` -4. Build the project into a jar: `mvn package -DskipTests` - - -On the IoTDB server: -1. Create the ${IOTDB_HOME}/ext/mqtt/ folder and put the jar built above into it. -2. Enable the MQTT service (`enable_mqtt_service=true` in `conf/iotdb-system.properties`). -3. 
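Set `mqtt_payload_formatter` in `conf/iotdb-system.properties` to the return value of the `getName()` method of the implementation class above; in this example it is `CustomizedJson2Table` -4. Start IoTDB -5. Done - -More: MQTT message payloads are not limited to JSON; any binary format can be used. The raw bytes can be obtained through the following methods: -`payload.forEachByte()` or `payload.array()`. - - -## 6. Notes - -To avoid compatibility issues caused by a default client_id, it is strongly recommended to always explicitly provide a unique, non-empty client_id in every MQTT client. -Clients behave inconsistently when the client_id is missing or empty. Common examples: -1. Explicitly passing an empty string - • MQTTX: when client_id="", IoTDB directly discards the message; - • mosquitto_pub: when client_id="", IoTDB receives the message normally. -2. Not passing a client_id at all - • MQTTX: the message is received by IoTDB normally; - • mosquitto_pub: IoTDB rejects the connection. - As these cases show, explicitly specifying a unique, non-empty client_id is the simplest way to eliminate these differences and ensure reliable message delivery. \ No newline at end of file diff --git a/src/zh/UserGuide/latest/API/Programming-ODBC_timecho.md b/src/zh/UserGuide/latest/API/Programming-ODBC_timecho.md deleted file mode 100644 index 9d3aff458..000000000 --- a/src/zh/UserGuide/latest/API/Programming-ODBC_timecho.md +++ /dev/null @@ -1,1030 +0,0 @@ - - -# ODBC - -## 1. Introduction - -The IoTDB ODBC driver provides the ability to interact with the database through the standard ODBC interface and supports managing data in the time series database over ODBC connections. It currently supports database connection, data query, data insertion, data modification, and data deletion, and works with applications and toolchains that support the ODBC protocol. - -> Note: this feature is supported starting from V2.0.8.2. - -## 2. Usage - -Installing from the precompiled binary package is recommended: no compilation is needed, and driver installation and system registration are done by script. Only Windows is currently supported. - -### 2.1 Requirements - -Only the OS-level ODBC driver manager dependency must be satisfied; no build environment is required: - -| **Operating System** | **Requirements and Installation** | -| --- | --- | -| Windows | 1. **Windows 10/11, Server 2016/2019/2022**: ship with an ODBC 17/18 driver manager; no additional installation is needed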
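2. **Windows 8.1/Server 2012 R2**: the matching version of the ODBC driver manager must be installed manually | - -### 2.2 Installation - -1. Contact the Timecho team to obtain the precompiled binary package - -Directory structure of the binary package: - -```Plain -├── bin/ -│ ├── apache_iotdb_odbc.dll -│ └── install_driver.exe -├── install.bat -└── registry.bat -``` - -2. Open a command-line tool (CMD/PowerShell) with **administrator privileges** and run the following command (the path can be replaced with any absolute path): - -```Bash -install.bat "C:\Program Files\Apache IoTDB ODBC Driver" -``` - -The script automatically performs the following operations: - -* Creates the installation directory (if it does not exist) -* Copies `bin\apache_iotdb_odbc.dll` to the specified installation directory -* Calls `install_driver.exe` to register the driver with the system through the standard ODBC API (`SQLInstallDriverEx`) - -3. Verify the installation: open the ODBC Data Source Administrator; if `Apache IoTDB ODBC Driver` is listed on the Drivers tab, the registration succeeded. - -![](/img/odbc-1.png) - -### 2.3 Uninstallation - -1. Open a command prompt as administrator and `cd` to the project root directory. -2. Run the uninstall script: - -```Bash -uninstall.bat -``` - -The script calls `install_driver.exe` to unregister the driver from the system through the standard ODBC API (`SQLRemoveDriver`). The DLL file in the installation directory is not deleted automatically; delete it manually if cleanup is needed. - -### 2.4 Connection Configuration - -After installing the driver, configure a data source (DSN) so that applications can connect to the database by DSN name. The IoTDB ODBC driver supports two ways of configuring connection parameters: a data source or a connection string. - -#### 2.4.1 Configuring a Data Source - -**Configuring through the ODBC Data Source Administrator** - -1. Open the ODBC Data Source Administrator, switch to the "User DSN" tab, and click "Add". - -![](/img/odbc-2.png) - -2. In the driver list that appears, select "Apache IoTDB ODBC Driver" and click "Finish". - -![](/img/odbc-3.png) - -3. The data source configuration dialog appears. Fill in the connection parameters, then click OK: - -![](/img/odbc-4.png) - -The fields in the dialog are described below: - -| **Section** | **Field** | **Description** | -| ---------------- | ----------------- | ------------------------------ | -| Data Source | DSN Name | Data source name; applications refer to the data source by this name | -| Data Source | Description | Data source description (optional) | -| Connection | Server | IoTDB server IP address; default 127.0.0.1 | -| Connection | Port | IoTDB Session API port; default 6667 | -| Connection | User | User name; default root | -| Connection | Password | Password; default root | -| Options | Table Model | When checked, the table model is used; when unchecked, the tree model is used | -| Options | Database | Database name. Available only in table model mode; in tree model mode this field is grayed out and cannot be edited | -| Options | Log Level | Log level (0-4): 0=OFF, 1=ERROR, 2=WARN, 3=INFO, 4=TRACE | -| Options | Session Timeout | Session timeout in milliseconds; 0 means no timeout. Note that the server-side queryTimeoutThreshold defaults to 60000 ms; to exceed this value, the server configuration must be changed | -| Options | Batch Size | Number of rows fetched per request; default 1000. A value of 0 resets it to the default | - -4. 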
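After filling in the fields, you can click "Test Connection" to test the connection. The test uses the current parameters to connect to the IoTDB server and runs a `SHOW VERSION` query. On success, the server version is displayed; on failure, the specific error is displayed. -5. After confirming the parameters, click "OK" to save. The data source then appears in the "User DSN" list, such as the data source named 123 in the figure below. - -![](/img/odbc-5.png) - -To modify an existing data source, select it in the list and click "Configure" to edit it again. - -#### 2.4.2 Connection String - -A connection string consists of **semicolon-separated key-value pairs**, for example: - -```Bash -Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;isTableModel=false;loglevel=2 -``` - -The fields are described in the following table: - -| **Field** | **Description** | **Allowed values** | **Default** | -| --------------------------- | ---------------------------------- |------------------------------| --------------------------------- | -| DSN | Data source name | Custom data source name | - | -| uid | Database user name | Any string | root | -| pwd | Database password | Any string | root | -| server | IoTDB server address | IP address | 127.0.0.1 | -| port | IoTDB server port | Port number | 6667 | -| database | Database name (effective only in table model mode) | Any string | Empty string| -| loglevel | Log level | Integer (0-4) | 4(LOG\_LEVEL\_TRACE) | -| isTableModel / tablemodel | Whether to enable table model mode | Boolean; several representations are supported: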
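1. 0, false, no, off: set to false;
2. 1, true, yes, on: set to true;
3. Any other value is treated as true. | true| -| sessiontimeoutms | Session timeout (milliseconds) | 64-bit integer; defaults to `LLONG_MAX`; a value of `0` is replaced with `LLONG_MAX`. Note that the server has its own timeout setting, `private long queryTimeoutThreshold = 60000;`, which must be changed to obtain a timeout longer than 60 seconds. | LLONG\_MAX| -| batchsize | Batch size for each data fetch | 64-bit integer; defaults to `1000`; a value of `0` is replaced with `1000` | 1000| - -Notes: - -* Field names are case-insensitive (they are converted to lowercase before comparison) -* A connection string consists of semicolon-separated key-value pairs, for example: `Driver={IoTDB ODBC Driver};server=127.0.0.1;port=6667;uid=root;pwd=root;isTableModel=false;loglevel=2` -* Boolean fields (isTableModel) accept several representations -* All fields are optional; default values are used for fields that are not specified -* Unsupported fields are ignored and a warning is written to the log, but the connection is not affected -* The default server port 6667 is the default port used by IoTDB's C++ Session interface. This ODBC driver exchanges data with IoTDB through the C++ Session interface. If the C++ Session interface on your IoTDB server uses a non-default port, change the port in the ODBC connection string accordingly. - -#### 2.4.3 Relationship Between Data Source Configuration and Connection Strings - -Configuration saved in the ODBC Data Source Administrator is written as key-value pairs into the system's ODBC data source configuration (on Windows, the registry key `HKEY_CURRENT_USER\SOFTWARE\ODBC\ODBC.INI`). When an application uses `SQLConnect` or specifies `DSN=<data source name>` in the connection string, the driver reads these parameters from the system configuration. - -**The connection string takes precedence over the configuration saved in the DSN.** The rules are: - -1. If the connection string contains `DSN=xxx` and does not contain `DRIVER=...`, the driver first loads all parameters of that DSN from the system configuration as base values. -2. Parameters explicitly specified in the connection string then override the DSN parameters of the same name. -3. If the connection string contains `DRIVER=...`, no DSN parameters are read from the system configuration, and the connection string alone is used. - -For example: if the DSN is configured with `Server=192.168.1.100` and `Port=6667`, but the connection string is `DSN=MyDSN;Server=127.0.0.1`, the connection actually uses `Server=127.0.0.1` (overridden by the connection string) and `Port=6667` (from the DSN). - -### 2.5 Logging - -The driver produces two kinds of log output at runtime: the driver's own log and the ODBC manager trace log. Be aware of the performance impact of the log level. - -#### 2.5.1 Driver Log - -* Output location: `apache_iotdb_odbc.log` in the user's home directory; -* Log level: configured through `loglevel` in the connection string (0-4; the higher the level, the more detailed the output); -* Performance impact: a high log level significantly degrades driver performance and is recommended only for debugging. - -#### 2.5.2 ODBC Manager Trace Log - -* How to enable: open the ODBC Data Source Administrator → "Tracing" → "Start Tracing Now"; -* Note: enabling tracing greatly degrades driver performance; use it only for troubleshooting. - -## 3. 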
接口支持 - -### 3.1 方法列表 - -驱动对 ODBC 标准 API 的支持情况如下: - -| ODBC/Setup API | 函数功能 | 参数列表 | 参数说明 | -| ------------------- | ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| SQLAllocHandle| 分配ODBC句柄 | (SQLSMALLINT HandleType, SQLHANDLE InputHandle, SQLHANDLE \*OutputHandle) | HandleType: 要分配的句柄类型(ENV/DBC/STMT/DESC);
InputHandle: 上级上下文句柄;
OutputHandle: 返回的新句柄指针 | -| SQLBindCol | 绑定列到结果缓冲区 | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN \*StrLen\_or\_Ind) | StatementHandle: 语句句柄;
ColumnNumber: 列号;
TargetType: C数据类型;
TargetValue: 数据缓冲区;
BufferLength: 缓冲区长度;
StrLen\_or\_Ind: 返回数据长度或NULL指示 | -| SQLColAttribute| 获取列属性信息 | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLUSMALLINT FieldIdentifier, SQLPOINTER CharacterAttribute, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength, SQLLEN \*NumericAttribute) | StatementHandle: 语句句柄;
ColumnNumber: 列号;
FieldIdentifier: 属性ID;
CharacterAttribute: 字符属性输出;
BufferLength: 缓冲区长度;
StringLength: 返回长度;
NumericAttribute: 数值属性输出 | -| SQLColumns| 查询表列信息 | (SQLHSTMT StatementHandle, SQLCHAR \*CatalogName, SQLSMALLINT NameLength1, SQLCHAR \*SchemaName, SQLSMALLINT NameLength2, SQLCHAR \*TableName, SQLSMALLINT NameLength3, SQLCHAR \*ColumnName, SQLSMALLINT NameLength4) | StatementHandle: 语句句柄;
Catalog/Schema/Table/ColumnName: 查询对象名称;
NameLength\*: 对应名称长度 | -| SQLConnect | 建立数据库连接 | (SQLHDBC ConnectionHandle, SQLCHAR \*ServerName, SQLSMALLINT NameLength1, SQLCHAR \*UserName, SQLSMALLINT NameLength2, SQLCHAR \*Authentication, SQLSMALLINT NameLength3) | ConnectionHandle: 连接句柄;
ServerName: 数据源名称;
UserName: 用户名;
Authentication: 密码;NameLength\*: 字符串长度 | -| SQLDescribeCol | 描述结果集中的列 | (SQLHSTMT StatementHandle, SQLUSMALLINT ColumnNumber, SQLCHAR \*ColumnName, SQLSMALLINT BufferLength, SQLSMALLINT \*NameLength, SQLSMALLINT \*DataType, SQLULEN \*ColumnSize, SQLSMALLINT \*DecimalDigits, SQLSMALLINT \*Nullable) | StatementHandle: 语句句柄;
ColumnNumber: 列号;
ColumnName: 列名输出;
BufferLength: 缓冲区长度;
NameLength: 返回列名长度;
DataType: SQL类型;
ColumnSize: 列大小;
DecimalDigits: 小数位;
Nullable: 是否可为空 | -| SQLDisconnect | 断开数据库连接 | (SQLHDBC ConnectionHandle) | ConnectionHandle: 连接句柄 | -| SQLDriverConnect | 使用连接字符串建立连接 | (SQLHDBC ConnectionHandle, SQLHWND WindowHandle, SQLCHAR \*InConnectionString, SQLSMALLINT StringLength1, SQLCHAR \*OutConnectionString, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength2, SQLUSMALLINT DriverCompletion) | ConnectionHandle: 连接句柄;
WindowHandle: 窗口句柄;
InConnectionString: 输入连接字符串;
StringLength1: 输入长度;
OutConnectionString: 输出连接字符串;
BufferLength: 输出缓冲区;
StringLength2: 返回长度;
DriverCompletion: 连接提示方式 | -| SQLEndTran | 提交或回滚事务 | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT CompletionType) | HandleType: 句柄类型;
Handle: 连接或环境句柄;
CompletionType: 提交或回滚事务 | -| SQLExecDirect | 直接执行SQL语句 | (SQLHSTMT StatementHandle, SQLCHAR \*StatementText, SQLINTEGER TextLength) | StatementHandle: 语句句柄;
StatementText: SQL文本;
TextLength: SQL长度 | -| SQLFetch | 提取结果集中的下一行 | (SQLHSTMT StatementHandle) | StatementHandle: 语句句柄 | -| SQLFreeHandle | 释放ODBC句柄 | (SQLSMALLINT HandleType, SQLHANDLE Handle) | HandleType: 句柄类型;
Handle: 要释放的句柄 | -| SQLFreeStmt | 释放语句相关资源 | (SQLHSTMT StatementHandle, SQLUSMALLINT Option) | StatementHandle: 语句句柄;
Option: 释放选项(关闭游标/重置参数等) | -| SQLGetConnectAttr | 获取连接属性 | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER \*StringLength) | ConnectionHandle: 连接句柄;
Attribute: 属性ID;
Value: 返回属性值;
BufferLength: 缓冲区长度;
StringLength: 返回长度 | -| SQLGetData | 获取结果数据 | (SQLHSTMT StatementHandle, SQLUSMALLINT Col\_or\_Param\_Num, SQLSMALLINT TargetType, SQLPOINTER TargetValue, SQLLEN BufferLength, SQLLEN \*StrLen\_or\_Ind) | StatementHandle: 语句句柄;
Col\_or\_Param\_Num: 列号;
TargetType: C类型;
TargetValue: 数据缓冲区;
BufferLength: 缓冲区大小;
StrLen\_or\_Ind: 返回长度或NULL标志 | -| SQLGetDiagField | 获取诊断字段 | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLSMALLINT DiagIdentifier, SQLPOINTER DiagInfo, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength) | HandleType: 句柄类型;
Handle: 句柄;
RecNumber: 记录号;
DiagIdentifier: 诊断字段ID;
DiagInfo: 输出信息;
BufferLength: 缓冲区;
StringLength: 返回长度 | -| SQLGetDiagRec | 获取诊断记录 | (SQLSMALLINT HandleType, SQLHANDLE Handle, SQLSMALLINT RecNumber, SQLCHAR \*Sqlstate, SQLINTEGER \*NativeError, SQLCHAR \*MessageText, SQLSMALLINT BufferLength, SQLSMALLINT \*TextLength) | HandleType: 句柄类型;
Handle: 句柄;
RecNumber: 记录号;
Sqlstate: SQL状态码;
NativeError: 原生错误码;
MessageText: 错误信息;
BufferLength: 缓冲区;
TextLength: 返回长度 | -| SQLGetInfo | 获取数据库信息 | (SQLHDBC ConnectionHandle, SQLUSMALLINT InfoType, SQLPOINTER InfoValue, SQLSMALLINT BufferLength, SQLSMALLINT \*StringLength) | ConnectionHandle: 连接句柄;

InfoType: 信息类型;
InfoValue: 返回值;
BufferLength: 缓冲区长度;
StringLength: 返回长度 | -| SQLGetStmtAttr | 获取语句属性 | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER BufferLength, SQLINTEGER \*StringLength) | StatementHandle: 语句句柄;
Attribute: 属性ID;
Value: 返回值;
BufferLength: 缓冲区;
StringLength: 返回长度 | -| SQLGetTypeInfo | 获取数据类型信息 | (SQLHSTMT StatementHandle, SQLSMALLINT DataType) | StatementHandle: 语句句柄;
DataType: SQL数据类型 | -| SQLMoreResults | 获取更多结果集 | (SQLHSTMT StatementHandle) | StatementHandle: 语句句柄 | -| SQLNumResultCols | 获取结果集列数 | (SQLHSTMT StatementHandle, SQLSMALLINT \*ColumnCount) | StatementHandle: 语句句柄;
ColumnCount: 返回列数 | -| SQLRowCount | 获取受影响的行数 | (SQLHSTMT StatementHandle, SQLLEN \*RowCount) | StatementHandle: 语句句柄;
RowCount: 返回受影响行数 | -| SQLSetConnectAttr | 设置连接属性 | (SQLHDBC ConnectionHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | ConnectionHandle: 连接句柄;
Attribute: 属性ID;
Value: 属性值;
StringLength: 属性值长度 | -| SQLSetEnvAttr | 设置环境属性 | (SQLHENV EnvironmentHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | EnvironmentHandle: 环境句柄;
Attribute: 属性ID;
Value: 属性值;
StringLength: 长度 | -| SQLSetStmtAttr | 设置语句属性 | (SQLHSTMT StatementHandle, SQLINTEGER Attribute, SQLPOINTER Value, SQLINTEGER StringLength) | StatementHandle: 语句句柄;
Attribute: 属性ID;
Value: 属性值;
StringLength: 长度 | -| SQLTables | 查询表信息 | (SQLHSTMT StatementHandle, SQLCHAR \*CatalogName, SQLSMALLINT NameLength1, SQLCHAR \*SchemaName, SQLSMALLINT NameLength2, SQLCHAR \*TableName, SQLSMALLINT NameLength3, SQLCHAR \*TableType, SQLSMALLINT NameLength4) | StatementHandle: 语句句柄;
Catalog/Schema/TableName: 表名;
TableType: 表类型;
NameLength\*: length of the corresponding name | - -### 3.2 Data Type Conversion - -IoTDB data types map to standard ODBC data types as follows: - -| **IoTDB data type** | **ODBC data type** | -| -------------------------- | ------------------------- | -| BOOLEAN | SQL\_BIT | -| INT32 | SQL\_INTEGER | -| INT64 | SQL\_BIGINT | -| FLOAT | SQL\_REAL | -| DOUBLE | SQL\_DOUBLE | -| TEXT | SQL\_VARCHAR | -| STRING | SQL\_VARCHAR | -| BLOB | SQL\_LONGVARBINARY | -| TIMESTAMP | SQL\_BIGINT | -| DATE | SQL\_DATE | - -## 4. Examples - -This chapter provides complete examples for **C#**, **Python**, **C++**, **PowerBI**, and **Excel**, covering core operations such as data query, insertion, and deletion. - -### 4.1 C# Example - -```C# -/******* -Note: When the output contains Chinese characters, it may cause garbled text. -This is because the table.Write() function cannot output strings in UTF-8 encoding -and can only output using GB2312 (or another system default encoding). This issue -may not occur in software like Power BI; it also does not occur when using the -Console.WriteLine function. This is an issue with the ConsoleTable package. -*****/ -using System.Data.Common; -using System.Data.Odbc; -using System.Reflection.PortableExecutable; -using ConsoleTables; -using System; - -/// Runs a SELECT query and prints the rows of root.full.fulldevice as a table -void Query(OdbcConnection dbConnection) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - dbCommand.CommandText = "SELECT * FROM root.full.fulldevice WHERE time >= 1735689600000 AND time <= 1735690790000"; - using (OdbcDataReader dbReader = dbCommand.ExecuteReader()) - { - var fCount = dbReader.FieldCount; - Console.WriteLine($"fCount = {fCount}"); - // Build the table header - var columns = new string[fCount]; - for (var i = 0; i < fCount; i++) - { - var fName = dbReader.GetName(i); - if (fName.Contains('.')) - { - fName = fName.Substring(fName.LastIndexOf('.') + 1); - } - columns[i] = fName; - } - // Build the table rows - var table = new ConsoleTable(columns); - while (dbReader.Read()) - { - var row = new object[fCount]; - for (var i = 0; i < fCount; i++) - { - if (dbReader.IsDBNull(i)) - { - row[i] = null; - continue; - 
} - row[i] = dbReader.GetValue(i); - } - table.AddRow(row); - } - table.Write(); - Console.WriteLine(); - } - } - } - catch (Exception ex) - { - Console.WriteLine(ex.ToString()); - } -} - -/// Executes a non-query SQL statement (such as INSERT; in the tree model, INSERT creates the time series automatically) -void Execute(OdbcConnection dbConnection, string command) -{ - try - { - using (OdbcCommand dbCommand = dbConnection.CreateCommand()) - { - try - { - dbCommand.CommandText = command; - Console.WriteLine($"Execute command: {command}"); - dbCommand.ExecuteNonQuery(); - } - catch (Exception ex) - { - Console.WriteLine($"CommandText error: {ex.Message}"); - } - } - } - catch (OdbcException ex) - { - Console.WriteLine($"Database error: {ex.Message}"); - } - catch (Exception ex) - { - Console.WriteLine($"Unknown error: {ex.Message}"); - } -} - -var dsn = "Apache IoTDB DSN"; -var user = "root"; -var password = "root"; -var server = "127.0.0.1"; -var connectionString = $"DSN={dsn};Server={server};UID={user};PWD={password};loglevel=4;istablemodel=0"; - -using (OdbcConnection dbConnection = new OdbcConnection(connectionString)) -{ - Console.WriteLine($"Start"); - try - { - dbConnection.Open(); - } - catch (Exception ex) - { - Console.WriteLine($"Login failed: {ex.Message}"); - Console.WriteLine($"Stack Trace: {ex.StackTrace}"); - dbConnection.Dispose(); - return; - } - string[] insertStatements = new string[] - { - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, '设备运行状态正常', '设备A-机房1', 1735689600000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, '设备运行状态正常', '设备A-机房1', 1735689660000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, 
timestamp_col, date_col) VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, '设备运行状态正常', '设备A-机房1', 1735689720000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, '设备温度偏高告警', '设备A-机房1', 1735689780000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, '设备状态恢复正常', '设备A-机房1', 1735689840000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, '设备运行状态正常', '设备B-机房2', 1735689900000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, '设备运行状态正常', '设备B-机房2', 1735689960000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, '设备湿度偏低告警', '设备B-机房2', 1735690020000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, '设备状态恢复正常', '设备B-机房2', 1735690080000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, '设备运行状态正常', '设备C-机房3', 1735690140000, '2026-01-04')", 
- "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, '设备运行状态正常', '设备C-机房3', 1735690200000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, '设备电压不稳告警', '设备C-机房3', 1735690260000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, '设备状态恢复正常', '设备C-机房3', 1735690320000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, '设备运行状态正常', '设备D-机房4', 1735690380000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, '设备运行状态正常', '设备D-机房4', 1735690440000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, '设备运行状态正常', '设备D-机房4', 1735690500000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, '设备信号中断告警', '设备D-机房4', 1735690560000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, 
date_col) VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, '设备运行状态正常', '设备E-机房5', 1735690620000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, '设备运行状态正常', '设备E-机房5', 1735690680000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', 1735690740000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', 1735690740000, '2026-01-04')" - }; - foreach (var insert in insertStatements) - { - Execute(dbConnection, insert); - } - Console.WriteLine($"[DEBUG] Inserted {insertStatements.Length} rows. Begin to query."); - - Query(dbConnection); // Run the query and print the result -} -``` - -### 4.2 Python Example - -1. Accessing ODBC from Python requires the pyodbc package - -```Plain -pip install pyodbc -``` - -2. 
Complete code - - ```Python -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -""" -Apache IoTDB ODBC Python example - Tree Model -Connects to the IoTDB ODBC driver with pyodbc and selects the tree model through istablemodel=0. -Functionally modeled on examples/BasicTest/TreeTest/TreeTest.cs and examples/cpp-example/TreeTest.cpp. -""" - -import pyodbc - -def execute(conn: pyodbc.Connection, command: str) -> None: - """Execute a non-query SQL statement (such as INSERT; in the tree model, INSERT creates the time series automatically)""" - try: - with conn.cursor() as cursor: - cursor.execute(command) - cmd_upper = command.strip().upper() - if cmd_upper.startswith(("INSERT", "UPDATE", "DELETE")): - conn.commit() - print(f"Execute command: {command}") - except pyodbc.Error as ex: - print(f"CommandText error: {ex}") - -def query(conn: pyodbc.Connection, sql: str) -> None: - """Run a SELECT query and print the rows of root.full.fulldevice as a table""" - try: - with conn.cursor() as cursor: - cursor.execute(sql) - col_count = len(cursor.description) - print(f"fCount = {col_count}") - - if col_count <= 0: - return - - columns = [] - for i in range(col_count): - col_name = cursor.description[i][0] or f"Column{i}" - if "." 
in str(col_name): - col_name = str(col_name).split(".")[-1] - columns.append(str(col_name)) - - rows = cursor.fetchall() - - col_widths = [max(len(str(col)), 4) for col in columns] - for row in rows: - for j, val in enumerate(row): - if j < len(col_widths): - col_widths[j] = max(col_widths[j], len(str(val) if val is not None else "NULL")) - - header = " | ".join(str(c).ljust(col_widths[i]) for i, c in enumerate(columns)) - print(header) - print("-" * len(header)) - - for row in rows: - values = [] - for i, val in enumerate(row): - if val is None: - cell = "NULL" - else: - cell = str(val) - values.append(cell.ljust(col_widths[i]) if i < len(col_widths) else cell) - print(" | ".join(values)) - - print() - - except pyodbc.Error as ex: - print(f"Query error: {ex}") - -def main() -> None: - dsn = "Apache IoTDB DSN" - user = "root" - password = "root" - server = "127.0.0.1" - connection_string = ( - f"DSN={dsn};Server={server};UID={user};PWD={password};" - f"loglevel=4;istablemodel=0" - ) - - print("Start") - - try: - conn = pyodbc.connect(connection_string) - except pyodbc.Error as ex: - print(f"Login failed: {ex}") - return - - try: - driver_name = conn.getinfo(6) # SQL_DRIVER_NAME - print(f"Successfully opened connection. 
driver = {driver_name}") - except Exception: - print("Successfully opened connection.") - - try: - insert_statements = [ - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689600001, true, 100, 10000000000, 36.5, 128.689, '设备运行状态正常', '设备A-机房1', 1735689600000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, '设备运行状态正常', '设备A-机房1', 1735689660000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, '设备运行状态正常', '设备A-机房1', 1735689720000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, '设备温度偏高告警', '设备A-机房1', 1735689780000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, '设备状态恢复正常', '设备A-机房1', 1735689840000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689900000, false, 105, 10000000005, 37.0, 129.189, '设备运行状态正常', '设备B-机房2', 1735689900000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, '设备运行状态正常', '设备B-机房2', 1735689960000, '2026-01-04')", - "INSERT INTO 
root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, '设备湿度偏低告警', '设备B-机房2', 1735690020000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, '设备状态恢复正常', '设备B-机房2', 1735690080000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, '设备运行状态正常', '设备C-机房3', 1735690140000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, '设备运行状态正常', '设备C-机房3', 1735690200000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, '设备电压不稳告警', '设备C-机房3', 1735690260000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, '设备状态恢复正常', '设备C-机房3', 1735690320000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, '设备运行状态正常', '设备D-机房4', 1735690380000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES 
(1735690440000, true, 114, 10000000014, 37.9, 130.089, '设备运行状态正常', '设备D-机房4', 1735690440000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, '设备运行状态正常', '设备D-机房4', 1735690500000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, '设备信号中断告警', '设备D-机房4', 1735690560000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, '设备运行状态正常', '设备E-机房5', 1735690620000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, '设备运行状态正常', '设备E-机房5', 1735690680000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', 1735690740000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690790000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', 1735690740000, '2026-01-04')", - ] - for insert_sql in insert_statements: - execute(conn, insert_sql) - print(f"[DEBUG] Inserted {len(insert_statements)} rows. 
Begin to query.") - - query_sql = "SELECT * FROM root.full.fulldevice WHERE time >= 1735689600000 AND time <= 1735690790000" - query(conn, query_sql) - print("Query ok") - finally: - conn.close() - -if __name__ == "__main__": - main() -``` - -### 4.3 C++ Example - -```C++ -#define WIN32_LEAN_AND_MEAN -#include - -#include -#include -#include -#include -#include -#include -#include - -#ifndef SQL_DIAG_COLUMN_SIZE -#define SQL_DIAG_COLUMN_SIZE 33L -#endif - -void CheckOdbcError(SQLRETURN retCode, SQLSMALLINT handleType, SQLHANDLE handle, const char* functionName) { - if (retCode == SQL_SUCCESS || retCode == SQL_SUCCESS_WITH_INFO) { - return; - } - - SQLCHAR sqlState[6]; - SQLCHAR message[SQL_MAX_MESSAGE_LENGTH]; - SQLINTEGER nativeError; - SQLSMALLINT textLength; - SQLRETURN errRet; - errRet = SQLGetDiagRec(handleType, handle, 1, sqlState, &nativeError, message, sizeof(message), &textLength); - - std::cerr << "ODBC Error in " << functionName << ":\n"; - std::cerr << " SQL State: " << sqlState << "\n"; - std::cerr << " Native Error: " << nativeError << "\n"; - std::cerr << " Message: " << message << "\n"; - std::cerr << " SQLGetDiagRec Return: " << errRet << "\n"; - - if (retCode == SQL_ERROR || retCode == SQL_INVALID_HANDLE) { - exit(1); - } -} - -void PrintSimpleTable(const std::vector<std::string>& headers, - const std::vector<std::vector<std::string>>& rows) { - for (size_t i = 0; i < headers.size(); i++) { - std::cout << headers[i]; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - for (size_t i = 0; i < headers.size(); i++) { - std::cout << "----------------"; - if (i < headers.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - - for (const auto& row : rows) { - for (size_t i = 0; i < row.size(); i++) { - std::cout << row[i]; - if (i < row.size() - 1) std::cout << "\t"; - } - std::cout << std::endl; - } - std::cout << std::endl; -} - -/// Runs a SELECT query and prints the rows of root.full.fulldevice as a table -void Query(SQLHDBC hDbc) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN 
ret = SQL_SUCCESS; - - try { - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - return; - } - - const std::string sqlQuery = "SELECT * FROM root.full.fulldevice WHERE time >= 1735689600000 AND time <= 1735690790000"; - std::cout << "Execute query: " << sqlQuery << std::endl; - - ret = SQLExecDirect(hStmt, reinterpret_cast<SQLCHAR*>(const_cast<char*>(sqlQuery.c_str())), SQL_NTS); - if (!SQL_SUCCEEDED(ret)) { - if (ret != SQL_NO_DATA) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect(SELECT)"); - } - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - SQLSMALLINT colCount = 0; - ret = SQLNumResultCols(hStmt, &colCount); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLNumResultCols"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::cout << "Column count = " << colCount << std::endl; - - if (colCount <= 0) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::vector<std::string> columnNames; - std::vector<SQLSMALLINT> columnTypes(colCount); - std::vector<SQLULEN> columnSizes(colCount); - std::vector<SQLSMALLINT> decimalDigits(colCount); - std::vector<SQLSMALLINT> nullable(colCount); - - // Get basic column information - for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLSMALLINT nameLength = 0; - ret = SQLDescribeCol(hStmt, i, NULL, 0, &nameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get length)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::vector<SQLCHAR> colNameBuffer(nameLength + 1); - SQLSMALLINT actualNameLength = 0; - - ret = SQLDescribeCol(hStmt, i, colNameBuffer.data(), nameLength + 1, - &actualNameLength, NULL, NULL, NULL, NULL); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get name)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - - std::string fullName(reinterpret_cast<char*>(colNameBuffer.data())); - - 
size_t pos = fullName.find_last_of('.'); - if (pos != std::string::npos) { - columnNames.push_back(fullName.substr(pos + 1)); - } else { - columnNames.push_back(fullName); - } - - ret = SQLDescribeCol(hStmt, i, NULL, 0, NULL, &columnTypes[i-1], - &columnSizes[i-1], &decimalDigits[i-1], &nullable[i-1]); - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLDescribeCol (get type info)"); - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - return; - } - } - - std::vector> tableRows; - - int rowCount = 0; - // Get data front every row - while (true) { - ret = SQLFetch(hStmt); - if (ret == SQL_NO_DATA) { - break; - } - - if (!SQL_SUCCEEDED(ret)) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLFetch"); - break; - } - - std::vector row; - - for (SQLSMALLINT i = 1; i <= colCount; i++) { - SQLLEN indicator = 0; - std::string valueStr; - - SQLSMALLINT cType; - size_t bufferSize; - bool isCharacterType = false; - const int maxBufferSize = 32768; - - switch (columnTypes[i-1]) { - case SQL_CHAR: - case SQL_VARCHAR: - case SQL_LONGVARCHAR: - case SQL_WCHAR: - case SQL_WVARCHAR: - case SQL_WLONGVARCHAR: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_DECIMAL: - case SQL_NUMERIC: - cType = SQL_C_CHAR; - if (columnSizes[i - 1] > 0) { - bufferSize = min(maxBufferSize, static_cast(columnSizes[i-1]) * 4 + 1); - } else { - bufferSize = maxBufferSize; - } - isCharacterType = true; - break; - - case SQL_INTEGER: - case SQL_SMALLINT: - case SQL_TINYINT: - case SQL_BIGINT: - cType = SQL_C_SBIGINT; - bufferSize = sizeof(SQLBIGINT); - break; - - case SQL_REAL: - case SQL_FLOAT: - case SQL_DOUBLE: - cType = SQL_C_DOUBLE; - bufferSize = sizeof(double); - break; - - case SQL_BIT: - cType = SQL_C_BIT; - bufferSize = sizeof(SQLCHAR); - break; - - case SQL_DATE: - case SQL_TYPE_DATE: - cType = SQL_C_DATE; - 
bufferSize = sizeof(SQL_DATE_STRUCT); - break; - - case SQL_TIME: - case SQL_TYPE_TIME: - cType = SQL_C_TIME; - bufferSize = sizeof(SQL_TIME_STRUCT); - break; - - case SQL_TIMESTAMP: - case SQL_TYPE_TIMESTAMP: - cType = SQL_C_TIMESTAMP; - bufferSize = sizeof(SQL_TIMESTAMP_STRUCT); - break; - - default: - cType = SQL_C_CHAR; - bufferSize = 256; - isCharacterType = true; - break; - } - - std::vector buffer(bufferSize); - - ret = SQLGetData(hStmt, i, cType, buffer.data(), bufferSize, &indicator); - - if (indicator == SQL_NULL_DATA) { - valueStr = "NULL"; - } - else if (ret != SQL_SUCCESS) { - valueStr = "ERR_CONV"; - } - else { - if (cType == SQL_C_CHAR) { - valueStr = reinterpret_cast(buffer.data()); - } - else if (cType == SQL_C_SBIGINT) { - SQLBIGINT intVal = *reinterpret_cast(buffer.data()); - valueStr = std::to_string(intVal); - } - else if (cType == SQL_C_DOUBLE) { - double doubleVal = *reinterpret_cast(buffer.data()); - valueStr = std::to_string(doubleVal); - } - else if (cType == SQL_C_BIT) { - valueStr = (*buffer.data() != 0) ? 
"TRUE" : "FALSE"; - } - else if (cType == SQL_C_DATE) { - SQL_DATE_STRUCT* date = reinterpret_cast(buffer.data()); - char dateStr[20]; - snprintf(dateStr, sizeof(dateStr), "%04d-%02d-%02d", - date->year, date->month, date->day); - valueStr = dateStr; - } - else if (cType == SQL_C_TIME) { - SQL_TIME_STRUCT* time = reinterpret_cast(buffer.data()); - char timeStr[15]; - snprintf(timeStr, sizeof(timeStr), "%02d:%02d:%02d", - time->hour, time->minute, time->second); - valueStr = timeStr; - } - else if (cType == SQL_C_TIMESTAMP) { - SQL_TIMESTAMP_STRUCT* ts = reinterpret_cast(buffer.data()); - char tsStr[30]; - snprintf(tsStr, sizeof(tsStr), "%04d-%02d-%02d %02d:%02d:%02d.%06d", - ts->year, ts->month, ts->day, - ts->hour, ts->minute, ts->second, - ts->fraction / 1000); - valueStr = tsStr; - } - else { - valueStr = "UNKNOWN_TYPE"; - } - - if (isCharacterType && ret == SQL_SUCCESS_WITH_INFO) { - SQLLEN actualSize = 0; - SQLGetDiagField(SQL_HANDLE_STMT, hStmt, 0, SQL_DIAG_COLUMN_SIZE, - &actualSize, SQL_IS_INTEGER, NULL); - - if (indicator > 0 && static_cast(indicator) > bufferSize - 1) { - valueStr += "..."; - } - } - - } - - row.push_back(valueStr); - } - - tableRows.push_back(row); - } - - if (!tableRows.empty()) { - PrintSimpleTable(columnNames, tableRows); - } - - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (const std::exception& ex) { - std::cerr << "Exception: " << ex.what() << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } - catch (...) 
{ - std::cerr << "Unknown exception occurred" << std::endl; - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -/// 执行非查询 SQL 语句(如 INSERT 等,树模型 INSERT 会自动创建) -void Execute(SQLHDBC hDbc, const std::string& command) { - SQLHSTMT hStmt = SQL_NULL_HSTMT; - SQLRETURN ret; - - try { - ret = SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLAllocHandle(SQL_HANDLE_STMT)"); - - ret = SQLExecDirect(hStmt, (SQLCHAR*)command.c_str(), SQL_NTS); - if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { - CheckOdbcError(ret, SQL_HANDLE_STMT, hStmt, "SQLExecDirect"); - } - - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - catch (...) { - if (hStmt != SQL_NULL_HSTMT) { - SQLFreeHandle(SQL_HANDLE_STMT, hStmt); - } - throw; - } -} - -int main() { - SQLHENV hEnv = SQL_NULL_HENV; - SQLHDBC hDbc = SQL_NULL_HDBC; - SQLRETURN ret; - - try { - std::cout << "Start" << std::endl; - - ret = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &hEnv); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_ENV)"); - - ret = SQLSetEnvAttr(hEnv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLSetEnvAttr"); - - ret = SQLAllocHandle(SQL_HANDLE_DBC, hEnv, &hDbc); - CheckOdbcError(ret, SQL_HANDLE_ENV, hEnv, "SQLAllocHandle(SQL_HANDLE_DBC)"); - - std::string dsn = "Apache IoTDB DSN"; - std::string user = "root"; - std::string password = "root"; - std::string server = "127.0.0.1"; - - std::string connectionString = "DSN=" + dsn + ";Server=" + server + - ";UID=" + user + ";PWD=" + password + - ";loglevel=4;istablemodel=0"; - std::cout << "Using connection string: " << connectionString << std::endl; - - SQLCHAR outConnStr[1024]; - SQLSMALLINT outConnStrLen; - - ret = SQLDriverConnect(hDbc, NULL, - (SQLCHAR*)connectionString.c_str(), SQL_NTS, - outConnStr, sizeof(outConnStr), - &outConnStrLen, SQL_DRIVER_COMPLETE); - - if (ret != SQL_SUCCESS && ret != 
SQL_SUCCESS_WITH_INFO) { - std::cerr << "Login failed" << std::endl; - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLDriverConnect"); - return 1; - } - - SQLCHAR driverName[256]; - SQLSMALLINT nameLength; - ret = SQLGetInfo(hDbc, SQL_DRIVER_NAME, driverName, sizeof(driverName), &nameLength); - CheckOdbcError(ret, SQL_HANDLE_DBC, hDbc, "SQLGetInfo"); - - std::cout << "Successfully opened connection. database name = " << driverName << std::endl; - - const char* insertStatements[] = { - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689600000, true, 100, 10000000000, 36.5, 128.689, '设备运行状态正常', '设备A-机房1', 1735689600000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689660000, false, 101, 10000000001, 36.6, 128.789, '设备运行状态正常', '设备A-机房1', 1735689660000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689720000, true, 102, 10000000002, 36.7, 128.889, '设备运行状态正常', '设备A-机房1', 1735689720000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689780000, false, 103, 10000000003, 36.8, 128.989, '设备温度偏高告警', '设备A-机房1', 1735689780000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689840000, true, 104, 10000000004, 36.9, 129.089, '设备状态恢复正常', '设备A-机房1', 1735689840000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689900000, false, 105, 
10000000005, 37.0, 129.189, '设备运行状态正常', '设备B-机房2', 1735689900000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735689960000, true, 106, 10000000006, 37.1, 129.289, '设备运行状态正常', '设备B-机房2', 1735689960000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690020000, false, 107, 10000000007, 37.2, 129.389, '设备湿度偏低告警', '设备B-机房2', 1735690020000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690080000, true, 108, 10000000008, 37.3, 129.489, '设备状态恢复正常', '设备B-机房2', 1735690080000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690140000, false, 109, 10000000009, 37.4, 129.589, '设备运行状态正常', '设备C-机房3', 1735690140000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690200000, true, 110, 10000000010, 37.5, 129.689, '设备运行状态正常', '设备C-机房3', 1735690200000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690260000, false, 111, 10000000011, 37.6, 129.789, '设备电压不稳告警', '设备C-机房3', 1735690260000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690320000, true, 112, 10000000012, 37.7, 129.889, '设备状态恢复正常', '设备C-机房3', 1735690320000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, 
int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690380000, false, 113, 10000000013, 37.8, 129.989, '设备运行状态正常', '设备D-机房4', 1735690380000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690440000, true, 114, 10000000014, 37.9, 130.089, '设备运行状态正常', '设备D-机房4', 1735690440000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690500000, false, 115, 10000000015, 38.0, 130.189, '设备运行状态正常', '设备D-机房4', 1735690500000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690560000, true, 116, 10000000016, 38.1, 130.289, '设备信号中断告警', '设备D-机房4', 1735690560000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690620000, false, 117, 10000000017, 38.2, 130.389, '设备运行状态正常', '设备E-机房5', 1735690620000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690680000, true, 118, 10000000018, 38.3, 130.489, '设备运行状态正常', '设备E-机房5', 1735690680000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690740000, false, 119, 10000000019, 38.4, 130.589, '设备运行状态正常', '设备E-机房5', 1735690740000, '2026-01-04')", - "INSERT INTO root.full.fulldevice(timestamp, bool_col, int32_col, int64_col, float_col, double_col, text_col, string_col, timestamp_col, date_col) VALUES (1735690790000, false, 119, 10000000019, 
38.4, 130.589, '设备运行状态正常', '设备E-机房5', 1735690740000, '2026-01-04')" - }; - for (const char* sql : insertStatements) { - Execute(hDbc, sql); - } - std::cout << "[DEBUG] Inserted 20 rows. Begin to query." << std::endl; - Query(hDbc); - std::cout << "Query ok" << std::endl; - - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - - return 0; - } - catch (...) { - if (hDbc != SQL_NULL_HDBC) { - SQLDisconnect(hDbc); - SQLFreeHandle(SQL_HANDLE_DBC, hDbc); - } - if (hEnv != SQL_NULL_HENV) { - SQLFreeHandle(SQL_HANDLE_ENV, hEnv); - } - - std::cerr << "Unexpected error!" << std::endl; - return 1; - } -} -``` - -### 4.4 PowerBI 示例 - -1. 打开 PowerBI Desktop,创建新项目; -2. 点击「主页」→「获取数据」→「更多...」→「ODBC」→ 点击「连接」按钮; -3. 数据源选择:在弹出窗口中选择「数据源名称 (DSN)」,下拉选择 `Apache IoTDB DSN`; -4. 高级配置: - -* 点击「高级选项」,在「连接字符串」输入框填写配置(样例): - -```Plain -server=127.0.0.1;port=6667;isTableModel=false;loglevel=4 -``` - -* 说明: - - * `dsn` 项可选,填写 / 不填写均不影响连接; - * `loglevel` 分为 0-4 等级:0 级(ERROR)日志最少,4 级(TRACE)日志最详细,按需设置; - * `server`/`dsn`/`loglevel` 大小写不敏感(如可写为 `Server`); - * 如果在DSN中配置了相关信息,则可以不填写任何配置信息,驱动管理器会自动使用在DSN中填入的配置信息。 - -5. 身份验证:输入用户名(默认 `root`)和密码(默认 `root`),点击「连接」; -6. 数据加载:点击「加载」即可查看数据。 - -### 4.5 Excel 示例 - -1. 打开 Excel,创建空白工作簿; -2. 点击「数据」选项卡 →「自其他来源」→「来自数据连接向导」; -3. 数据源选择:选择「ODBC DSN」→ 下一步 → 选择 `Apache IoTDB DSN` → 下一步; -4. 连接配置: -* 连接字符串、用户名、密码的输入流程与 PowerBI 完全一致,连接字符串格式参考: - -```Plain -server=127.0.0.1;port=6667;isTableModel=false;loglevel=4 -``` - -* 如果在DSN中配置了相关信息,则可以不填写任何配置信息,驱动管理器会自动使用在DSN中填入的配置信息。 -5. 保存连接:自定义设置数据连接文件名、连接描述等信息,点击「完成」; -6. 导入数据:选择数据导入到工作表的位置(如「现有工作表」的 A1 单元格),点击「确定」,完成数据加载。 diff --git a/src/zh/UserGuide/latest/API/Programming-OPC-DA_timecho.md b/src/zh/UserGuide/latest/API/Programming-OPC-DA_timecho.md deleted file mode 100644 index f6dc7368d..000000000 --- a/src/zh/UserGuide/latest/API/Programming-OPC-DA_timecho.md +++ /dev/null @@ -1,208 +0,0 @@ - - -# OPC DA 协议 - -## 1. 
OPC DA - -OPC DA (OPC Data Access) 是工业自动化领域的一种通信协议标准,属于经典 OPC(OLE for Process Control)技术的核心部分。它的主要目标是实现 Windows 环境下工业设备与软件(如 SCADA、HMI、数据库)之间的实时数据交互。OPC DA 基于 COM / DCOM 实现,是一个轻量级的协议,分为服务器和客户端两个角色。 - -* **服务器:** 可以视为一个 Item 的池,存储各个实例的最新数据及其状态。所有 item 只能在服务器端管理,客户端只能读写数据,无权操作元信息。 - -![](/img/opc-da-1-1.png) - -* **客户端:** 连接服务器后,需要自定义一个组(这个组仅与客户端有关),并创建服务器的同名 item,然后可以对自身已创建的 item 进行读写。 - -![](/img/opc-da-1-2.png) - -## 2. OPC DA Sink - -IoTDB (V2.0.5.1及以后的V2.x版本支持) 提供的 OPC DA Sink 支持将树模型数据推送到本地 COM 服务器的插件,它封装了 OPC DA 接口规范及其固有复杂性,显著简化了集成流程。OPC DA Sink 推送数据流图如下所示。 - -![](/img/opc-da-2-1.png) - -### 2.1 SQL 语法 - -```SQL ----- 注意这里的 clsID 需要替换为自己的 clsID -create pipe opc ( - 'sink'='opc-da-sink', - --- 'opcda.progid'='opcserversim.Instance.1' - 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C' -); -``` - -### 2.2 参数介绍 - -| **参数** | **描述** | **取值范围 ** | 是否必填 | -| ------------------- | --------------------------------------------------------------------- | ----------------------- | ------------------ | -| sink | OPC DA SINK | String: opc-da-sink | 必填 | -| sink.opcda.clsid | OPC Server 的 ClsID(唯一标识字符串)。建议使用 clsID 而非 progID。 | String | 和 progId 二选一 | -| sink.opcda.progid | OPC Server 的 ProgID,如果有 clsID,优先使用 clsID。 | String | 和 clsID 二选一 | - -### 2.3 映射规范 - -使用时,IoTDB 将会将自身的树模型最新数据推送到服务器,数据的 itemID 为树模型下的时间序列的全路径,如 `root.a.b.c.d`。注意根据 OPC DA 标准,客户端无权直接在 server 侧创建 item,因此需要服务器提前将 IoTDB 的时间序列以 itemID 和对应数据类型的格式创建为 item。 - -* 数据类型对应如下表所示。 - -| IoTDB | OPC-DA Server | -| ----------- | ----------------------------------------------------------- | -| INT32 | VT\_I4 | -| INT64 | VT\_I8 | -| FLOAT | VT\_R4 | -| DOUBLE | VT\_R8 | -| TEXT | VT\_BSTR | -| BOOLEAN | VT\_BOOL | -| DATE | VT\_DATE | -| TIMESTAMP | VT\_DATE | -| BLOB | VT\_BSTR(Variant 不支持 VT\_BLOB,因此用 VT\_BSTR 替代) | -| STRING | VT\_BSTR | - -### 2.4 常见错误码 - -| 符号 | 错误码 | 描述 | -| ----------------------------- | ------------ | 
------------------------------------------------------------------------------------------------------------------------------------------------------ | -| OPC\_E\_BADTYPE | 0xC0040004 | 服务器无法在指定格式/请求的数据类型与规范数据类型之间转换数据。即服务器的数据类型与 IoTDB 的注册类型不一致。 | -| OPC\_E\_UNKNOWNITEMID| 0xC0040007 | 在服务器地址空间中未定义该条目ID(添加或验证时),或该条目ID在服务器地址空间中已不存在(读取或写入时)。即 IoTDB 的测点在服务器内没有对应的 itemID。 | -| OPC\_E\_INVALIDITEMID | 0xC0040008 | 该 itemID不符合服务器的语法规范。 | -| REGDB\_E\_CLASSNOTREG | 0x80040154 | 未注册类 | -| RPC\_S\_SERVER\_UNAVAILABLE | 0x800706ba | RPC服务不可用 | -| DISP\_E\_OVERFLOW | 0x8002000a | 超过类型的最大值 | -| DISP\_E\_BADVARTYPE | 0x80020005 | 类型不匹配 | - -### 2.5 使用限制 - -* 仅支持 COM,且仅能在 Windows 上使用 -* 重启后可能会推送少部分旧数据,但是最终会推送新数据 -* 目前仅支持树模型数据。 - -## 3. 使用步骤 -### 3.1 前置条件 -1. Windows 环境,版本 >= 8 -2. IoTDB 已安装且可正常运行 -3. OPC DA Server 已安装 - -* 以 Simple OPC Server Simulator 为例 - -![](/img/opc-da-3-1.png) - -* 双击某项,可以修改该项的名字(itemID),数据,数据类型等各个信息。 -* 右键某项,可以删除该项、更新值、以及新建项。 - -![](/img/opc-da-3-2.png) - -4. OPC DA Client 已安装 - -* 以 KepwareServerEX 的 quickClient 为例 -* 在 Kepware 中可以如下打开 OPC DA Client - -![](/img/opc-da-3-3.png) - -![](/img/opc-da-3-4.png) - - -### 3.2 配置修改 - -修改 server 配置,以避免 IoTDB 的写入 client 与 Kepware 的读取 client 连接到两个不同的实例而无法调试。 - -* 首先按 Win+R 键,在运行菜单内输入 `dcomcnfg`,打开 dcom 的组件配置: - -![](/img/opc-da-3-5.png) - -* 点击组件服务 -> 计算机 -> 我的电脑 -> DCOM 配置,找到`AGG Software Simple OPC Server Simulator`,右键“属性”: - -![](/img/opc-da-3-6.png) - -* 在`标识`内,将`用户账户`改为`交互式用户`。注意这里不要为`启动用户`,否则可能导致两个 client 分别启动不同的 server 实例。 - -![](/img/opc-da-3-7.png) - -### 3.3 clsID 获取 -1. 方式一:通过 DCOM 配置 获取 - -* 按 Win+R 键,在运行菜单内输入 `dcomcnfg`,打开 dcom 的组件配置; -* 点击组件服务 -> 计算机 -> 我的电脑 -> DCOM 配置,找到`AGG Software Simple OPC Server Simulator`,右键“属性”。 -* 在 `常规 `中可以获取该应用程序的 clsID,用于之后 opc-da-sink 的连接,注意不带大括号 - -![](/img/opc-da-3-8.png) - -2. 方式二:clsID 与 progID 也可以直接在 server 里获取 - -* 点击 `Help` > `Show OPC Server Info` - -![](/img/opc-da-3-9.png) - -* 弹窗中即可显示 - -![](/img/opc-da-3-10.png) - -### 3.4 写入数据 -#### 3.4.1 DA Server -1. 
在 DA Server 内新建项,与 IoTDB 的待写入项的 name 与 type 保持一致 - -![](/img/opc-da-3-11.png) - -2. 在 Kepware 中连上该 server: - -![](/img/opc-da-3-12.png) - -3. 右键服务器新建组,组名任意: - -![](/img/opc-da-3-13.png) - -![](/img/opc-da-3-14.png) - -4. 右键新建 item,item 的名字为之前创建的名字 - -![](/img/opc-da-3-15.png) - -![](/img/opc-da-3-16.png) - -![](/img/opc-da-3-17.png) - -#### 3.4.2 IoTDB -1. 启动 IoTDB -2. 创建 Pipe - -```SQL -create pipe opc ('sink'='opc-da-sink', 'opcda.clsid'='CAE8D0E1-117B-11D5-924B-11C0F023E91C') -``` - -* 注意:如果创建失败,提示` Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 1107: Failed to connect to server, error code: 0x80040154`,则可以参考该解决方案进行处理:https://opcexpert.com/support/0x80040154-class-not-registered/ - -3. 创建时间序列(如果已开启自动创建元数据,则本步骤可以省略) - -```SQL -create timeseries root.a.b.c.r string; -``` - -4. 插入数据 - -```SQL -insert into root.a.b.c (time, r) values(10000, "SomeString") -``` - -### 3.5 验证数据 - -查看 Quick client 的数据,应该已经得到更新。 - -![](/img/opc-da-3-18.png) \ No newline at end of file diff --git a/src/zh/UserGuide/latest/API/Programming-OPC-UA_timecho.md b/src/zh/UserGuide/latest/API/Programming-OPC-UA_timecho.md deleted file mode 100644 index f53d8d036..000000000 --- a/src/zh/UserGuide/latest/API/Programming-OPC-UA_timecho.md +++ /dev/null @@ -1,398 +0,0 @@ - - -# OPC UA 协议 - -## 1. 功能概述 - -本文档介绍了 IoTDB 与 OPC UA 协议集成的两种独立工作模式,请根据您的业务场景进行选择: - -* **模式一:数据订阅服务 (IoTDB 作为 OPC UA 服务器)**:IoTDB 启动内置的 OPC UA 服务器,被动地允许外部客户端(如 UAExpert)连接并订阅其内部数据。这是传统用法。 -* **模式二:数据主动推送 (IoTDB 作为 OPC UA 客户端)**:IoTDB 作为客户端,主动将数据和元数据同步到一个或多个独立部署的外部 OPC UA 服务器。 - > 注意:该模式从 V2.0.8 起支持。 - -**注意:模式互斥** - -当 Pipe 配置中指定了 `node-urls` 参数(模式二),IoTDB 将不会启动内置的 OPC UA 服务器(模式一)。两种模式在同一 Pipe 中**不可同时使用**。 - -## 2. 数据订阅 - -本模式支持用户以 OPC UA 协议从 IoTDB 中订阅数据,订阅数据的通信模式支持 Client/Server 和 Pub/Sub 两种。 - -注意:本功能并非从外部 OPC Server 中采集数据写入 IoTDB - -![](/img/opc-ua-new-1.png) - -### 2.1 OPC 服务启动方式 -#### 2.1.1 语法 - -启动 OPC UA 协议的语法: - -```SQL -create pipe p1 - with source (...) - with processor (...) 
- with sink ('sink' = 'opc-ua-sink', - 'sink.opcua.tcp.port' = '12686', - 'sink.opcua.https.port' = '8443', - 'sink.user' = 'root', - 'sink.password' = 'TimechoDB@2021', //V2.0.6.x 之前默认密码为root - 'sink.opcua.security.dir' = '...' - ) -``` - -#### 2.1.2 参数 - -| **参数** | **描述** | **取值范围** | **是否必填** | **默认值** | -| ------------------------------------ |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------| -------------------- |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| sink | OPC UA SINK | String: opc-ua-sink | 必填 | | -| sink.opcua.model | OPC UA 使用的模式 | String: client-server / pub-sub | 选填 | client-server | -| sink.opcua.tcp.port | OPC UA 的 TCP 端口 | Integer: [0, 65536] | 选填 | 12686 | -| sink.opcua.https.port | OPC UA 的 HTTPS 端口 | Integer: [0, 65536] | 选填 | 8443 | -| sink.opcua.security.dir | OPC UA 的密钥及证书目录 | String: Path,支持绝对及相对目录 | 选填 | 1. iotdb 相关 DataNode 的 conf 目录下的 `opc_security` 文件夹 `/`。2. 如无 iotdb 的 conf 目录(例如 IDEA 中启动 DataNode),则为用户主目录下的 `iotdb_opc_security` 文件夹 `/` | -| opcua.security-policy | OPC UA 连接使用的安全策略,不区分大小写。可以配置多个,用`,`连接。配置一个安全策略后,client 才能用对应的策略连接。当前实现默认支持`None`和`Basic256Sha256`策略,应该默认改为任意的非`None`策略,`None`策略在调试环境中单独配置,因为`None`策略虽然不需移动证书,操作方便,但是不安全,生产环境的 server 不建议支持该策略。注意:V2.0.8 起支持该参数,且仅支持 client-server 模式 | String(安全性依次递增):
`None`
`Basic128Rsa15`
`Basic256`
`Basic256Sha256`
`Aes128_Sha256_RsaOaep`
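`Aes256_Sha256_RsaPss` | Optional | `Basic256Sha256`,`Aes128_Sha256_RsaOaep`,`Aes256_Sha256_RsaPss` | -| sink.opcua.enable-anonymous-access | Whether OPC UA allows anonymous access | Boolean | Optional | true | -| sink.user | User name; here, the user allowed to access OPC UA | String | Optional | root | -| sink.password | Password; here, the password of the allowed OPC UA user | String | Optional | TimechoDB@2021 (the default password was root before V2.0.6.x) | -| opcua.with-quality | Whether OPC UA publishes measurement points in value + quality mode. When enabled, the system handles written data according to the following rules: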
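1. If the write contains both value and quality, it is pushed directly to the OPC UA Server.
2. If the write contains only value, quality is filled in automatically as UNCERTAIN (the default, which can be configured).
3. If the write contains only quality, the write is ignored and nothing is processed.
4. If the write contains fields other than value/quality, the data is ignored and a warning is logged (the log frequency is configurable, to avoid high-frequency noise).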
5. quality 类型限制:目前仅支持布尔类型(true 表示 GOOD,false 表示 BAD); 注意:V2.0.8 起支持该参数,且仅支持 client-server 模式 | Boolean | 选填 | false | -| opcua.value-name | With-quality 为 true 时生效,表示 value 测点的名字。 注意:V2.0.8 起支持该参数,且仅支持 client-server 模式 | String | 选填 | value | -| opcua.quality-name | With-quality 为 true 时生效,表示 quality 测点的名字。 注意:V2.0.8 起支持该参数,且仅支持 client-server 模式 | String | 选填 | quality | -| opcua.default-quality | 没有 quality 时,可以通过 SQL 参数指定`GOOD`/`UNCERTAIN`/`BAD`。 注意:V2.0.8 起支持该参数,且仅支持 client-server 模式 | String:`GOOD`/`UNCERTAIN`/`BAD` | 选填 | `UNCERTAIN` | -| opcua.timeout-seconds | Client 连接 server 的超时秒数,仅在 IoTDB 为 client 时生效 注意:V2.0.8 起支持该参数,且仅支持 client-server 模式 | Long | 选填 | 10L | - -#### 2.1.3 示例 - -```Bash -create pipe p1 - with sink ('sink' = 'opc-ua-sink', - 'sink.user' = 'root', - 'sink.password' = 'TimechoDB@2021');//V2.0.6.x 之前默认密码为root -start pipe p1; -``` - -#### 2.1.4 使用限制 -1. 启动协议之后需要写入数据,才能建立连接,且仅能订阅建立连接之后的数据。 -2. 推荐在单机模式下使用。在分布式模式下,每一个 IoTDB DataNode 都作为一个独立的 OPC Server 提供数据,需要单独订阅。 - -### 2.2 两种通信模式示例 -#### 2.2.1 Client / Server 模式 - -在这种模式下,IoTDB 的流处理引擎通过 OPC UA Sink 与 OPC UA 服务器(Server)建立连接。OPC UA 服务器在其地址空间(Address Space) 中维护数据,IoTDB可以请求并获取这些数据。同时,其他OPC UA客户端(Client)也能访问服务器上的数据。 - -* 特性: - * OPC UA 将从 Sink 收到的设备信息,按照树形模型整理到 Objects folder 下的文件夹中。 - * 每个测点都被记录为一个变量节点,并记录当前数据库中的最新值。 - * OPC UA 无法删除数据或者改变数据类型的设置 - -##### 2.2.1.1 准备工作 -1. 此处以UAExpert客户端为例,下载 UAExpert 客户端:https://www.unified-automation.com/downloads/opc-ua-clients.html -2. 安装 UAExpert,填写自身的证书等信息。 - -##### 2.2.1.2 快速开始 -###### 2.2.1.2.1 支持 None 安全策略的场景 -1. 使用如下 sql,启动 OPC UA 服务。详细语法参见上文:[IoTDB OPC Server语法](./Programming-OPC-UA_timecho.md#_2-1-语法) - -```SQL -create pipe p1 with sink ('sink'='opc-ua-sink', 'opcua.security-policy'='AES128_SHA256_RSAOAEP, AES256_SHA256_RSAPSS, BASIC256SHA256, NONE'); -``` -注意:在 2.0.8.1 及以上版本中,默认不再支持 `None`,如需使用必须通过 `security-policy` 参数手动开启,如上所示。 - -2. 写入部分数据。 - -```SQL -insert into root.test.db(time, s2) values(now(), 2); -``` - -3. 
在 UAExpert 中配置 iotdb 的连接,其中 password 填写为上述参数配置中 sink.password 中设定的密码(此处用户名、密码以2.3小节示例中配置的 root/root 为例): - -
- -
- -
- -
- -4. 信任服务器的证书后,在左侧 Objects folder 即可看到写入的数据。 - -
- -
- -
- -
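- -Note: Because the `SecurityPolicy` configured here is `None`, the client and server do not need to trust each other's certificates. In production, connect with a non-`None` `SecurityPolicy`; in that case the two sides must trust each other's certificates. The procedure is described in the `Pub/Sub` mode section below: in the certificate directories of the client and the server (search the printed logs for the keyword keyStore to locate them), move the contents of the rejected directory into `trusted/certs`. Follow this order: connect, move the certificate in the server directory, connect again, move the certificate in the client directory, then connect once more. - -5. Drag a node from the left pane into the middle pane to display its latest value: - -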
- -
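The Client/Server behaviour described in this section (device information arranged as a tree under the Objects folder, one variable node per measurement point holding the latest value) can be pictured with a small model. This is only an illustrative sketch, not IoTDB's or the OPC UA server's actual implementation; the names `AddressSpaceModel`, `write`, and `latest` are hypothetical, and the sketch assumes a path segment is never used both as a folder and as a variable.

```python
class AddressSpaceModel:
    """Toy model of the Objects folder: folders are dicts, variables keep the latest value."""

    def __init__(self):
        self.objects = {}  # root of the folder tree

    def write(self, path, timestamp, value):
        """Record a value for a full series path such as 'root.test.db.s2'."""
        *folders, variable = path.split(".")
        node = self.objects
        for name in folders:
            # Create intermediate folders on demand, mirroring the tree layout.
            node = node.setdefault(name, {})
        current = node.get(variable)
        # Keep only the newest point; older writes do not overwrite it.
        if current is None or timestamp >= current[0]:
            node[variable] = (timestamp, value)

    def latest(self, path):
        """Return the latest value stored for a full series path."""
        node = self.objects
        for name in path.split("."):
            node = node[name]
        return node[1]
```

For example, after `write("root.test.db.s2", 1, 2)` the model holds the folders `root`, `test`, and `db`, with `s2` as a variable node under `db`, which matches what the client shows in its tree view in the steps above.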
- -###### 2.2.1.2.2 不支持 None 安全策略的场景 -1. 使用如下 sql,创建并启动 OPC UA 服务。 - ```SQL - create pipe p1 with sink ('sink'='opc-ua-sink'); - ``` - 注意:从 2.0.8.1 版本开始,`OpcUaSink` 出于安全考虑,默认不再支持 `None` 模式。 - -2. 写入部分数据。 - ```SQL - insert into root.test.db(time, s2) values(now(), 2); - ``` - -3. 在 UAExpert 中配置 IoTDB 连接: - - 不可直接访问 `URL`,必须通过 `Discover` 方式发现端点 - - 客户端会先使用 `None` 策略发送 `GetEndpoints` 请求获取端点列表 - - 再根据配置的 `Basic256Sha256 + SignAndEncrypt` 选择对应加密端点建立加密连接 - -![](/img/opc-ua-un-none-1.png) - -4. 用户名密码配置同上,点击相关的连接模式后(`Sign` / `Sign & Encrypt`),如果出现以下内容,点 `Ignore` 直接连。 - -![](/img/opc-ua-un-none-2.png) - -#### 2.2.2 Pub / Sub 模式 - -在这种模式下,IoTDB的流处理引擎通过 OPC UA Sink 向OPC UA 服务器(Server)发送数据变更事件。这些事件被发布到服务器的消息队列中,并通过事件节点 (Event Node) 进行管理。其他OPC UA客户端(Client)可以订阅这些事件节点,以便在数据变更时接收通知。 - -* 特性: - * 每个测点会被 OPC UA 包装成一个事件节点(EventNode)。 - * 相关字段及其对应含义如下: - - | 字段 | 含义 | 类型(Milo) | 示例 | - | ------------ | ------------------ | --------------- | ----------------------- | - | Time | 时间戳 | DateTime | 1698907326198 | - | SourceName | 测点对应完整路径 | String | root.test.opc.sensor0 | - | SourceNode | 测点数据类型 | NodeId | Int32 | - | Message | 数据 | LocalizedText | 3.0 | - - - Event 仅会发送给所有已经监听的客户端,客户端未连接则会忽略该 Event。 - - 如果数据被删除,信息则无法推送给客户端。 - - -##### 2.2.2.1 准备工作 - -该代码位于 iotdb-example 包下的 [opc-ua-sink 文件夹](https://github.com/apache/iotdb/tree/master/example/pipe-opc-ua-sink/src/main/java/org/apache/iotdb/opcua)中 - -代码中包含: - -- 主类(ClientTest) -- Client 证书相关的逻辑(IoTDBKeyStoreLoaderClient) -- Client 的配置及启动逻辑(ClientExampleRunner) -- ClientTest 的父类(ClientExample) - -##### 2.2.2.2 快速开始 - -使用步骤为: - -1. 打开 IoTDB 并写入部分数据。 - -```SQL -insert into root.a.b(time, c, d) values(now(), 1, 2); -``` - - 此处自动创建元数据开启。 - -2. 使用如下 sql,创建并启动 Pub-Sub 模式的 OPC UA Sink。详细语法参见上文:[IoTDB OPC Server语法](./Programming-OPC-UA_timecho.md#_2-1-语法) - -```SQL -create pipe p1 with sink ('sink'='opc-ua-sink', 'sink.opcua.model'='pub-sub'); -start pipe p1; -``` - - 此时能看到服务器的 conf 目录下创建了 opc 证书相关的目录。 - -
- -
- -3. 直接运行 Client 连接,此时 Client 证书被服务器拒收。 - -
- -
- -4. 进入服务器的 sink.opcua.security.dir 目录下,进入 pki 的 rejected 目录,此时 Client 的证书应该已经在该目录下生成。 - -
- -
- -5. 将客户端的证书移入(不是复制) 同目录下 trusted 目录的 certs 文件夹中。 - -
- -
- -6. 再次打开 Client 连接,此时服务器的证书应该被 Client 拒收。 - -
- -
- -7. 进入客户端的 /client/security 目录下,进入 pki 的 rejected 目录,将服务器的证书移入(不是复制)trusted 目录。 - -
- -
- -8. 打开 Client,此时建立双向信任成功, Client 能够连接到服务器。 - -9. 向服务器中写入数据,此时 Client 中能够打印出收到的数据。 - -
- -
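The certificate steps above (move the peer's certificate out of `rejected` and into the trusted `certs` folder, then reconnect) can be scripted. The sketch below is a minimal illustration under stated assumptions: the helper name `trust_rejected_certificates` is hypothetical, and the layout `pki/rejected` and `pki/trusted/certs` under the security directory is taken from the server-side steps; the client-side directory is assumed to follow the same layout.

```python
import shutil
from pathlib import Path


def trust_rejected_certificates(security_dir):
    """Move every file from pki/rejected into pki/trusted/certs; return the moved file names."""
    pki = Path(security_dir) / "pki"
    rejected = pki / "rejected"
    trusted_certs = pki / "trusted" / "certs"
    trusted_certs.mkdir(parents=True, exist_ok=True)

    moved = []
    if rejected.is_dir():
        for cert in sorted(rejected.iterdir()):
            if cert.is_file():
                # Move rather than copy, as the steps above require.
                shutil.move(str(cert), str(trusted_certs / cert.name))
                moved.append(cert.name)
    return moved
```

Run it once against the server's security directory after the first failed connection, and once against the client's security directory after the second. Trusting everything in `rejected` is only reasonable in a test setup; in production, inspect each certificate before moving it.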
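- - -#### 2.2.3 Notes - -1. **Standalone vs. cluster**: A 1C1D standalone deployment is recommended. If the cluster has multiple DataNodes, the data may be sent from different DataNodes, so a single subscription cannot receive the full data set. - -2. **No need to touch the certificates in the root directories**: During the certificate steps, you do not need to operate on the `iotdb-server.pfx` certificate in the IoTDB security root directory or the `example-client.pfx` certificate in the client security directory. When the client and server connect to each other, each side sends the certificate in its root directory to the other. A side that sees a certificate for the first time places it in the rejected directory; once the certificate is in trusted/certs, that side trusts its peer. - -3. **Java 17+ is recommended**: JVM 8 may enforce a key length limit and report an Illegal key size error. For certain versions (such as jdk.1.8u151+), you can add `Security.setProperty("crypto.policy", "unlimited");` to the create client logic in ClientExampleRunner. Alternatively, download the unlimited-strength packages `local_policy.jar` and `US_export_policy.jar` and use them to replace the packages in the `JDK/jre/lib/security` directory. Download URL: https://www.oracle.com/java/technologies/javase-jce8-downloads.html - -4. **Connection problems**: If the error is Unknown host, edit the `/etc/hosts` file on the machine hosting the IoTDB DataNode and add the URL and hostName of the target machine. - -## 3. Data Push - -In this mode, IoTDB acts as an OPC UA client through a Pipe and actively pushes the selected data, together with its quality code (`quality`), to one or more external OPC UA servers. The external server automatically creates the directory tree and nodes to match IoTDB's metadata. - -![](/img/opc-ua-data-push.png) - -### 3.1 Starting the OPC Service - -#### 3.1.1 Syntax - -Syntax for starting the OPC UA protocol: - -```SQL -create pipe p1 - with source (...) - with processor (...) 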
- with sink ('sink' = 'opc-ua-sink', - 'opcua.node-url' = '127.0.0.1:12686', - 'opcua.historizing' = 'true', - 'opcua.with-quality' = 'true' - ) -``` - -#### 3.1.2 参数 - -| **参数** | **描述** | ** 取值范围 ** | **是否必填** | **默认值** | -|-----------------------| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------- | -------------------- | -| sink | OPC UA SINK | String: opc-ua-sink | 必填 | | -| opcua.node-url | 逗号分隔,可以配置单个 OPC UA 的 tcp port,当存在该参数时,不会启动本机 server,而是发送到配置的 OPC UA Server。 | String | 选填 | `''` | -| opcua.historizing | 自动创建目录及叶子节点时,新节点是否存变量的历史数据。 | Boolean | 选填 | false | -| opcua.with-quality | OPC UA 的测点发布是否为 value + quality 模式。启用配置后,系统将按以下规则处理写入数据:
1. 同时包含 value 和 quality,则直接推送至 OPC UA Server。
2. 仅包含 value,则 quality 自动填充为 UNCERTAIN(默认值,支持自定义配置)。
3. 仅包含 quality,则该写入被忽略,不进行任何处理。
4. 包含非 value/quality 字段,则忽略该数据,并记录警告日志(日志频率可配置,避免高频干扰)。
5. quality 类型限制:目前仅支持布尔类型(true 表示 GOOD,false 表示 BAD); | Boolean | 选填 | false | -| opcua.value-name | With-quality 为 true 时生效,表示 value 测点的名字。 | String | 选填 | value | -| opcua.quality-name | With-quality 为 true 时生效,表示 quality 测点的名字。 | String | 选填 | quality | -| opcua.default-quality | 没有 quality 时,可以通过 SQL 参数指定`GOOD`/`UNCERTAIN`/`BAD`。 | String:`GOOD`/`UNCERTAIN`/`BAD` | 选填 | `UNCERTAIN` | -| opcua.security-policy | OPC UA client 连接使用的安全策略,不区分大小写,网址为:`http://opcfoundation.org/UA/SecurityPolicy#`,例如http://opcfoundation.org/UA/SecurityPolicy#Aes128_Sha256_RsaOaep | String(安全性依次递增):
`None`
`Basic128Rsa15`
`Basic256`
`Basic256Sha256`
`Aes128_Sha256_RsaOaep`
`Aes256_Sha256_RsaPss` | 选填| `Basic256Sha256` | -| opcua.timeout-seconds | Client 连接 server 的超时秒数,仅在 IoTDB 为 client 时生效 | Long | 选填 | 10L | - -> **参数命名注意**:以上参数均支持省略 `opcua.` 前缀,例如 `node-urls` 和 `opcua.node-urls` 等价。 -> -> **参数支持说明**:V2.0.8 起支持以上`opcua. `相关参数,且仅支持` client-server` 模式 - -#### 3.1.3 示例 - -```Bash -create pipe p1 - with source (...) - with processor (...) - with sink ('sink' = 'opc-ua-sink', - 'node-urls' = '127.0.0.1:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ) -``` - -#### 3.1.4 使用限制 - -1. 当前模式**仅支持`client-server`****模式和树模型数据**。 -2. 不支持一台机器上配置多个 DataNode,避免抢占相同的端口。 -3. 不支持`OBJECT` 类型的数据推送。 -4. 当某条时间序列改名后,将会联动修改 OPC UA Sink 删除对应的老路径,向新路径推送数据。 -5. 生产环境强烈建议使用非`None`的安全策略(如`Basic256Sha256`),并正确配置证书双向信任。 - -### 3.2 外置 OPC UA 服务器项目 - -IoTDB 支持单独的外部 Server 项目。该 Server 的实现及配置项与 IoTDB 目前的内置 Server 相同,但是需要额外支持新增目录及叶子节点,保证可以自动按照 IoTDB 写入中的元数据创建目录及叶子节点。 - -该 Server 的相关配置在启动 Server 时,需通过命令行的 args 注入,暂不支持 yml、xml 等配置文件。启动参数的键名与 IoTDB OPC Server 配置项对应,其中配置项中的点(.)和短划线(-)需替换为下划线(\_)。 - -例如: - -```SQL -.\start-IoTDB-opc-server.sh -enable_anonymous_access true -u root -pw root -https_port 8443 -``` - -其中,`user` 和 `password` 可简写为 `-u`、`-p`,其余参数键名均与配置项保持一致。请注意,`userName` 不能作为参数键名,仅支持 `user`。 - -### 3.3 场景示例 - -**目标**:将多个数据源的数据,按区域汇聚到3个外部OPC Server,供监控中心统一访问。 - -![](/img/opc-ua-data-push-example.png) - -1. **准备**:在三台服务器 (`ip1`, `ip2`, `ip3`) 上分别启动外部 OPC UA Server(端口12686)。 -2. **配置Pipe**:在 IoTDB 中创建3个 Pipe,使用`processor`或`source`中的路径模式过滤,将不同区域的数据推送到对应的 Server。 - ```SQL - -- 启动和连接 IoTDB - .\start-standalone.sh - - -- 启动三个 OPC UA Server - -- ip1、ip2、ip3 执行三次,端口为默认,12686 - .\start-IoTDB-external-opc-server.sh -enable-anonymous-access true -u root -pw root - - -- 创建三个 Pipe - .\start-cli.sh - create pipe p1 - with source () - with processor (...) - with sink ('sink' = 'opc-ua-sink', - 'node-urls' = 'ip1:12686', - 'historizing' = 'true', - 'with-quality' = 'true' - ); - create pipe p1 - with source () - with processor (...) 
-        with sink ('sink' = 'opc-ua-sink',
-                   'node-urls' = 'ip2:12686',
-                   'historizing' = 'true',
-                   'with-quality' = 'true'
-        );
-    create pipe p3
-        with source ()
-        with processor (...)
-        with sink ('sink' = 'opc-ua-sink',
-                   'node-urls' = 'ip3:12686',
-                   'historizing' = 'true',
-                   'with-quality' = 'true'
-        );
-    ```
-3. **Result**: the monitoring center only needs to connect to the three Servers `ip1`, `ip2`, and `ip3` to obtain a complete view of the data from all regions, with quality information attached.
diff --git a/src/zh/UserGuide/latest/API/Programming-Python-Native-API_timecho.md b/src/zh/UserGuide/latest/API/Programming-Python-Native-API_timecho.md
deleted file mode 100644
index f9895a3d4..000000000
--- a/src/zh/UserGuide/latest/API/Programming-Python-Native-API_timecho.md
+++ /dev/null
@@ -1,860 +0,0 @@
-
-
-# Python Native API
-
-## 1. Dependencies
-
-Before using the Python native API package, install the thrift (>=0.13) dependency.
-
-## 2. How to Use (Example)
-
-First, install the package: `pip3 install "apache-iotdb>=2.0"`
-
-Note: do not use a newer client version to connect to an older server version.
-
-An example of reading and writing data with this package is available here: [Session Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/session_example.py)
-
-An example of reading and writing aligned time series: [Aligned Timeseries Session Example](https://github.com/apache/iotdb/blob/master/iotdb-client/client-py/session_aligned_timeseries_example.py)
-
-(You need to add `import iotdb` at the top of the file.)
-
-Or:
-
-```python
-from iotdb.Session import Session
-
-ip = "127.0.0.1"
-port_ = "6667"
-username_ = "root"
-password_ = "TimechoDB@2021"  # before V2.0.6.x the default password was root
-session = Session(ip, port_, username_, password_)
-session.open(False)
-zone = session.get_time_zone()
-session.close()
-```
-
-## 3. 
Basic API Overview
-
-The following gives a brief introduction to the Session APIs and their parameters:
-
-### 3.1 Initialization
-
-* Initialize a Session
-
-```python
-session = Session(
-    host="127.0.0.1",
-    port="6667",
-    user="root",
-    password="TimechoDB@2021",  # before V2.0.6.x the default password was root
-    fetch_size=1024,
-    zone_id="UTC+8",
-    enable_redirection=True
-)
-```
-
-* Initialize a Session that can connect to multiple nodes
-
-```python
-session = Session.init_from_node_urls(
-    node_urls=["127.0.0.1:6667", "127.0.0.1:6668", "127.0.0.1:6669"],
-    user="root",
-    password="TimechoDB@2021",  # before V2.0.6.x the default password was root
-    fetch_size=1024,
-    zone_id="UTC+8",
-    enable_redirection=True
-)
-```
-
-* Open the Session and decide whether to enable RPC compression
-
-```python
-session.open(enable_rpc_compression=False)
-```
-
-Note: the client's RPC compression setting must match the server's.
-
-* Close the Session
-
-```python
-session.close()
-```
-### 3.2 Managing Session Connections with SessionPool
-
-With SessionPool managing sessions, you no longer need to think about how to reuse a session. When the number of session connections reaches the pool's maximum, requests for a session block; the wait timeout can be set through a parameter. After each use, return the session to the SessionPool with the `put_back` method.
-
-#### Create a SessionPool
-
-```python
-pool_config = PoolConfig(host=ip, port=port, user_name=username,
-                         password=password, fetch_size=1024,
-                         time_zone="UTC+8", max_retry=3)
-max_pool_size = 5
-wait_timeout_in_ms = 3000
-
-# Create the connection pool from the configuration
-session_pool = SessionPool(pool_config, max_pool_size, wait_timeout_in_ms)
-```
-#### Create a SessionPool from Distributed Nodes
-```python
-pool_config = PoolConfig(node_urls=["127.0.0.1:6667", "127.0.0.1:6668", "127.0.0.1:6669"], user_name=username,
-                         password=password, fetch_size=1024,
-                         time_zone="UTC+8", max_retry=3)
-max_pool_size = 5
-wait_timeout_in_ms = 3000
-```
-
-#### Get a Session from the SessionPool and Call put_back Manually When Done
-
-```python
-session = session_pool.get_session()
-session.set_storage_group(STORAGE_GROUP_NAME)
-session.create_time_series(
-    TIMESERIES_PATH, TSDataType.BOOLEAN, TSEncoding.PLAIN, Compressor.SNAPPY
-)
-# Return the session with put_back when done
-session_pool.put_back(session)
-# Closing the SessionPool also closes the sessions it manages
-session_pool.close()
-```
-
-### 3.3 SSL Connection
-
-#### 3.3.1 Configure the Certificate on the Server
-
-Find or add the following configuration items in the `conf/iotdb-system.properties` file:
-
-```properties
-enable_thrift_ssl=true
-key_store_path=/path/to/your/server_keystore.jks
-key_store_pwd=your_keystore_password
-```
-
-#### 3.3.2 Configure the Certificate for the Python Client
-
-- Set `use_ssl` to True to enable SSL.
-- Specify the client certificate path with the `ca_certs` parameter.
-
-```python
-use_ssl = True
-ca_certs = "/path/to/your/server.crt"  # or ca_certs = "/path/to/your/ca_cert.pem"
-```
-**Sample code: connecting to IoTDB over SSL**
-
-```python
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing,
-# software distributed under the License is distributed on an
-# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-# KIND, either express or implied.  See the License for the
-# specific language governing permissions and limitations
-# under the License.
-#
-
-from iotdb.SessionPool import PoolConfig, SessionPool
-from iotdb.Session import Session
-
-ip = "127.0.0.1"
-port_ = "6667"
-username_ = "root"
-password_ = "TimechoDB@2021"  # before V2.0.6.x the default password was root
-# Configure SSL enabled
-use_ssl = True
-# Configure certificate path
-ca_certs = "/path/server.crt"
-
-
-def get_data():
-    session = Session(
-        ip, port_, username_, password_, use_ssl=use_ssl, ca_certs=ca_certs
-    )
-    session.open(False)
-    with session.execute_query_statement("select * from root.eg.etth") as result:
-        df = result.todf()
-        df.rename(columns={"Time": "date"}, inplace=True)
-    session.close()
-    return df
-
-
-def get_data2():
-    pool_config = PoolConfig(
-        host=ip,
-        port=port_,
-        user_name=username_,
-        password=password_,
-        fetch_size=1024,
-        time_zone="UTC+8",
-        max_retry=3,
-        use_ssl=use_ssl,
-        ca_certs=ca_certs,
-    )
-    max_pool_size = 5
-    wait_timeout_in_ms = 3000
-    session_pool = SessionPool(pool_config, max_pool_size, wait_timeout_in_ms)
-    session = session_pool.get_session()
-    with session.execute_query_statement("select * from root.eg.etth") as result:
-        df = result.todf()
-        df.rename(columns={"Time": "date"}, inplace=True)
-    session_pool.put_back(session)
-    session_pool.close()
-    return df
-
-
-if __name__ == "__main__":
-    df = get_data()
-```
-
-## 4. 
Data Definition Language (DDL) APIs
-
-### 4.1 Database Management
-
-* Create a database
-
-```python
-session.set_storage_group(group_name)
-```
-
-* Delete one or more databases
-
-```python
-session.delete_storage_group(group_name)
-session.delete_storage_groups(group_name_lst)
-```
-### 4.2 Time Series Management
-
-* Create one or more time series
-
-```python
-session.create_time_series(ts_path, data_type, encoding, compressor,
-    props=None, tags=None, attributes=None, alias=None)
-
-session.create_multi_time_series(
-    ts_path_lst, data_type_lst, encoding_lst, compressor_lst,
-    props_lst=None, tags_lst=None, attributes_lst=None, alias_lst=None
-)
-```
-
-* Create aligned time series
-
-```python
-session.create_aligned_time_series(
-    device_id, measurements_lst, data_type_lst, encoding_lst, compressor_lst
-)
-```
-
-Note: measurement aliases are **not yet supported**.
-
-* Delete one or more time series
-
-```python
-session.delete_time_series(paths_list)
-```
-
-* Check whether a time series exists
-
-```python
-session.check_time_series_exists(path)
-```
-
-## 5. Data Manipulation Language (DML) APIs
-
-### 5.1 Data Insertion
-
-Using insert_tablet is recommended to improve write efficiency.
-
-* Insert a Tablet. A Tablet is a block of several rows of data for one device, where every row has the same columns
-    * **High write efficiency**
-    * **Supports writing null values** (since version 0.13)
-
-The Python API currently has two Tablet implementations
-
-* Normal Tablet
-
-```python
-values_ = [
-    [False, 10, 11, 1.1, 10011.1, "test01"],
-    [True, 100, 11111, 1.25, 101.0, "test02"],
-    [False, 100, 1, 188.1, 688.25, "test03"],
-    [True, 0, 0, 0, 6.25, "test04"],
-]
-timestamps_ = [1, 2, 3, 4]
-tablet_ = Tablet(
-    device_id, measurements_, data_types_, values_, timestamps_
-)
-session.insert_tablet(tablet_)
-
-values_ = [
-    [None, 10, 11, 1.1, 10011.1, "test01"],
-    [True, None, 11111, 1.25, 101.0, "test02"],
-    [False, 100, None, 188.1, 688.25, "test03"],
-    [True, 0, 0, 0, None, None],
-]
-timestamps_ = [16, 17, 18, 19]
-tablet_ = Tablet(
-    device_id, measurements_, data_types_, values_, timestamps_
-)
-session.insert_tablet(tablet_)
-```
-* Numpy Tablet
-
-Compared with the normal Tablet, the Numpy Tablet uses [numpy.ndarray](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html) to store numeric data.
-This greatly reduces memory usage and serialization time, and also greatly improves write efficiency.
-
-**Notes**
-1. Each column of timestamps and values in a Tablet is stored as one ndarray.
-2. 
The Numpy Tablet supports only big-endian data. An ndarray built without an explicit dtype defaults to little-endian, so it is recommended to specify the big-endian types shown in the example below when building an ndarray. If you do not, the IoTDB Python client performs the byte-order conversion itself, so correctness is not affected.
-
-```python
-import numpy as np
-data_types_ = [
-    TSDataType.BOOLEAN,
-    TSDataType.INT32,
-    TSDataType.INT64,
-    TSDataType.FLOAT,
-    TSDataType.DOUBLE,
-    TSDataType.TEXT,
-]
-np_values_ = [
-    np.array([False, True, False, True], TSDataType.BOOLEAN.np_dtype()),
-    np.array([10, 100, 100, 0], TSDataType.INT32.np_dtype()),
-    np.array([11, 11111, 1, 0], TSDataType.INT64.np_dtype()),
-    np.array([1.1, 1.25, 188.1, 0], TSDataType.FLOAT.np_dtype()),
-    np.array([10011.1, 101.0, 688.25, 6.25], TSDataType.DOUBLE.np_dtype()),
-    np.array(["test01", "test02", "test03", "test04"], TSDataType.TEXT.np_dtype()),
-]
-np_timestamps_ = np.array([1, 2, 3, 4], TSDataType.INT64.np_dtype())
-np_tablet_ = NumpyTablet(
-    device_id, measurements_, data_types_, np_values_, np_timestamps_
-)
-session.insert_tablet(np_tablet_)
-
-# insert one numpy tablet with None into the database.
-np_values_ = [
-    np.array([False, True, False, True], TSDataType.BOOLEAN.np_dtype()),
-    np.array([10, 100, 100, 0], TSDataType.INT32.np_dtype()),
-    np.array([11, 11111, 1, 0], TSDataType.INT64.np_dtype()),
-    np.array([1.1, 1.25, 188.1, 0], TSDataType.FLOAT.np_dtype()),
-    np.array([10011.1, 101.0, 688.25, 6.25], TSDataType.DOUBLE.np_dtype()),
-    np.array(["test01", "test02", "test03", "test04"], TSDataType.TEXT.np_dtype()),
-]
-np_timestamps_ = np.array([98, 99, 100, 101], TSDataType.INT64.np_dtype())
-np_bitmaps_ = []
-for i in range(len(measurements_)):
-    np_bitmaps_.append(BitMap(len(np_timestamps_)))
-np_bitmaps_[0].mark(0)
-np_bitmaps_[1].mark(1)
-np_bitmaps_[2].mark(2)
-np_bitmaps_[4].mark(3)
-np_bitmaps_[5].mark(3)
-np_tablet_with_none = NumpyTablet(
-    device_id, measurements_, data_types_, np_values_, np_timestamps_, np_bitmaps_
-)
-session.insert_tablet(np_tablet_with_none)
-```
-
-* Insert multiple Tablets
-
-```python
-session.insert_tablets(tablet_lst)
-```
-
-* Insert a Record. A Record 
contains the data of multiple measurements for one device at one timestamp.
-
-```python
-session.insert_record(device_id, timestamp, measurements_, data_types_, values_)
-```
-
-* Insert multiple Records
-
-```python
-session.insert_records(
-    device_ids_, time_list_, measurements_list_, data_type_list_, values_list_
-)
-```
-
-* Insert multiple Records that belong to the same device
-
-```python
-session.insert_records_of_one_device(device_id, time_list, measurements_list, data_types_list, values_list)
-```
-
-### 5.2 Insertion with Type Inference
-
-When all values are strings, the following API can be used to infer the type from each value. For example, the value "true" is inferred as a boolean, and the value "3.2" is inferred as a numeric type. Because the server has to perform type inference, this may take extra time and is slower than writes that need no inference.
-
-```python
-session.insert_str_record(device_id, timestamp, measurements, string_values)
-```
-
-### 5.3 Insertion of Aligned Time Series
-
-Aligned time series are written with the insert_aligned_xxx APIs, which are otherwise similar to the APIs above:
-
-* insert_aligned_record
-* insert_aligned_records
-* insert_aligned_records_of_one_device
-* insert_aligned_tablet
-* insert_aligned_tablets
-
-
-## 6. IoTDB-SQL APIs
-
-* Execute a query statement
-
-```python
-session.execute_query_statement(sql)
-```
-
-* Execute a non-query statement
-
-```python
-session.execute_non_query_statement(sql)
-```
-
-* Execute a statement
-
-```python
-session.execute_statement(sql)
-```
-
-
-## 7. Schema Template APIs
-### 7.1 Build a Schema Template
-1. First, build a Template object
-2. Add MeasurementNode child nodes
-3. 
Call the create-schema-template API
-
-```python
-template = Template(name=template_name, share_time=True)
-
-m_node_x = MeasurementNode("x", TSDataType.FLOAT, TSEncoding.RLE, Compressor.SNAPPY)
-m_node_y = MeasurementNode("y", TSDataType.FLOAT, TSEncoding.RLE, Compressor.SNAPPY)
-m_node_z = MeasurementNode("z", TSDataType.FLOAT, TSEncoding.RLE, Compressor.SNAPPY)
-
-template.add_template(m_node_x)
-template.add_template(m_node_y)
-template.add_template(m_node_z)
-
-session.create_schema_template(template)
-```
-### 7.2 Modify Template Nodes
-Modify the nodes of a template; the template being modified must already exist. The following functions add measurements to, or delete measurements from, an existing template.
-* Add measurements to a template
-```python
-session.add_measurements_in_template(template_name, measurements_path, data_types, encodings, compressors, is_aligned)
-```
-
-* Delete a measurement from a template
-```python
-session.delete_node_in_template(template_name, path)
-```
-
-### 7.3 Set a Schema Template
-```python
-session.set_schema_template(template_name, prefix_path)
-```
-
-### 7.4 Unset a Schema Template
-```python
-session.unset_schema_template(template_name, prefix_path)
-```
-
-### 7.5 Show Schema Templates
-* Show all schema templates
-```python
-session.show_all_templates()
-```
-* Count the measurements in a schema template
-```python
-session.count_measurements_in_template(template_name)
-```
-
-* Check whether a node is a measurement; the node must already be in the schema template
-```python
-session.is_measurement_in_template(template_name, path)
-```
-
-* Check whether a path exists in the schema template; the path may not be in the template
-```python
-session.is_path_exist_in_template(template_name, path)
-```
-
-* Show the measurements under a schema template
-```python
-session.show_measurements_in_template(template_name)
-```
-
-* Show the path prefixes on which a schema template is set
-```python
-session.show_paths_template_set_on(template_name)
-```
-
-* Show the path prefixes that are using a schema template (that is, whose series have been created)
-```python
-session.show_paths_template_using_on(template_name)
-```
-
-### 7.6 Drop a Schema Template
-Drop an existing schema template. Dropping a template that is still set on a path is not supported.
-```python
-session.drop_schema_template("template_python")
-```
-
-
-## 8. 
Pandas Support
-
-Query results can easily be converted to a [Pandas Dataframe](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html).
-
-SessionDataSet has a method `.todf()` that consumes the data in the SessionDataSet and converts it into a pandas dataframe.
-
-Example:
-
-```python
-from iotdb.Session import Session
-
-ip = "127.0.0.1"
-port_ = "6667"
-username_ = "root"
-password_ = "TimechoDB@2021"  # before V2.0.6.x the default password was root
-session = Session(ip, port_, username_, password_)
-session.open(False)
-with session.execute_query_statement("SELECT ** FROM root") as result:
-    # Transform to Pandas Dataset
-    df = result.todf()
-
-session.close()
-
-# Now you can work with the dataframe
-df = ...
-```
-
-Starting from V2.0.8.2, SessionDataSet provides methods for fetching DataFrames in batches, for processing queries over large data volumes efficiently:
-
-```python
-# Fetch DataFrames in batches
-has_next = result.has_next_df()
-if has_next:
-    df = result.next_df()
-    # Process the DataFrame
-```
-
-**Method description:**
-- `has_next_df()`: returns `True`/`False`, indicating whether more data is available
-- `next_df()`: returns a `DataFrame` or `None`; each call returns `fetchSize` rows (5000 by default, controlled by the Session's `fetch_size` parameter)
-    - If at least `fetchSize` rows remain, returns `fetchSize` rows
-    - If fewer than `fetchSize` rows remain, returns all remaining rows
-    - If all data has been traversed, returns `None`
-- `fetchSize` is checked when the Session is initialized; if it is less than or equal to 0, it is reset to 5000 and a warning is logged
-
-**Note:** do not mix different traversal methods (for example, `todf()` together with `next_df()`); doing so causes unexpected errors.
-
-**Usage example:**
-```python
-from iotdb.Session import Session
-
-# Initialize the session with fetch_size set to 2
-session = Session(
-    host="127.0.0.1", port="6667", fetch_size=2
-)
-session.open(False)
-session.execute_non_query_statement("CREATE DATABASE root.device0")
-
-# Write three rows
-session.insert_str_record("root.device0", 123, "pressure", "15.0")
-session.insert_str_record("root.device0", 124, "pressure", "15.0")
-session.insert_str_record("root.device0", 125, "pressure", "15.0")
-
-# Query the data as DataFrames
-with session.execute_query_statement("SELECT * FROM root.device0") as session_data_set:
-    while session_data_set.has_next_df():
-        df = session_data_set.next_df()
-        # Prints two dataframes: the first has 2 rows, the second has 1 row
-        print(df)
-
-session.close()
-```
-
-
-## 9. 
IoTDB Testcontainer
-
-The Python client's testing support is based on the `testcontainers` library (https://testcontainers-python.readthedocs.io/en/latest/index.html); to use this feature, install it into your project.
-
-To start (and stop) an IoTDB database in a Docker container, simply do:
-
-```python
-class MyTestCase(unittest.TestCase):
-
-    def test_something(self):
-        with IoTDBContainer() as c:
-            session = Session("localhost", c.get_exposed_port(6667), "root", "TimechoDB@2021")  # before V2.0.6.x the default password was root
-            session.open(False)
-            with session.execute_query_statement("SHOW TIMESERIES") as result:
-                print(result)
-            session.close()
-```
-
-By default, it pulls the latest IoTDB image `apache/iotdb:latest` for testing. To test a specific IoTDB version, declare the version like this: `IoTDBContainer("apache/iotdb:0.12.0")`, which gives you an IoTDB instance of version `0.12.0`.
-
-## 10. IoTDB DBAPI
-
-IoTDB DBAPI follows the Python DB API 2.0 specification (https://peps.python.org/pep-0249/) and implements a generic interface for accessing the database from Python.
-
-### 10.1 Examples
-+ Initialization
-
-The initialization parameters are the same as for Session (except for the sqlalchemy_mode parameter, which is used only by the SQLAlchemy dialect)
-```python
-from iotdb.dbapi import connect
-
-ip = "127.0.0.1"
-port_ = "6667"
-username_ = "root"
-password_ = "TimechoDB@2021"  # before V2.0.6.x the default password was root
-conn = connect(ip, port_, username_, password_, fetch_size=1024, zone_id="UTC+8", sqlalchemy_mode=False)
-cursor = conn.cursor()
-```
-+ Execute a simple SQL statement
-```python
-cursor.execute("SELECT ** FROM root")
-for row in cursor.fetchall():
-    print(row)
-```
-
-+ Execute a SQL statement with parameters
-
-IoTDB DBAPI supports pyformat-style parameters
-```python
-cursor.execute("SELECT ** FROM root WHERE time < %(time)s", {"time": "2017-11-01T00:08:00.000"})
-for row in cursor.fetchall():
-    print(row)
-```
-
-+ Execute a SQL statement with parameters in batch
-```python
-seq_of_parameters = [
-    {"timestamp": 1, "temperature": 1},
-    {"timestamp": 2, "temperature": 2},
-    {"timestamp": 3, "temperature": 3},
-    {"timestamp": 4, "temperature": 4},
-    {"timestamp": 5, "temperature": 5},
-]
-sql = "insert into root.cursor(timestamp,temperature) values(%(timestamp)s,%(temperature)s)"
-cursor.executemany(sql, seq_of_parameters)
-```
-
-+ Close the connection
-```python
-cursor.close()
-conn.close()
-```
-
-## 11. 
IoTDB SQLAlchemy Dialect (Experimental)
-The IoTDB SQLAlchemy dialect was written mainly to support Apache Superset. It is still being improved; do not use it in production!
-### 11.1 Metadata Model Mapping
-SQLAlchemy uses the relational data model, which describes the relationships between entities through tables.
-IoTDB uses a hierarchical data model, which organizes data in a tree structure.
-To adapt IoTDB to the SQLAlchemy dialect, IoTDB's original data model has to be reorganized
-and converted into SQLAlchemy's data model.
-
-The metadata in IoTDB is:
-
-1. Database
-2. Path: the storage path
-3. Entity
-4. Measurement
-
-The metadata in SQLAlchemy is:
-1. Schema
-2. Table
-3. Column
-
-The mapping between them is:
-
-| Metadata in SQLAlchemy | Corresponding metadata in IoTDB |
-| -------------------- | ---------------------------------------------- |
-| Schema | Database |
-| Table | Path ( from database to entity ) + Entity |
-| Column | Measurement |
-
-The following figure shows the mapping between the two more clearly:
-
-![sqlalchemy-to-iotdb](/img/UserGuide/API/IoTDB-SQLAlchemy/sqlalchemy-to-iotdb.png?raw=true)
-
-### 11.2 Data Type Mapping
-| Data type in IoTDB | Data type in SQLAlchemy |
-|--------------|-------------------|
-| BOOLEAN | Boolean |
-| INT32 | Integer |
-| INT64 | BigInteger |
-| FLOAT | Float |
-| DOUBLE | Float |
-| TEXT | Text |
-| LONG | BigInteger |
-### 11.3 Example
-
-+ Execute a statement
-
-```python
-from sqlalchemy import create_engine
-
-engine = create_engine("iotdb://root:TimechoDB@2021@127.0.0.1:6667")  # before V2.0.6.x the default password was root
-connect = engine.connect()
-result = connect.execute("SELECT ** FROM root")
-for row in result.fetchall():
-    print(row)
-```
-
-+ ORM (only simple queries are supported at present)
-
-```python
-from sqlalchemy import create_engine, Column, Float, BigInteger, MetaData
-from sqlalchemy.ext.declarative import declarative_base
-from sqlalchemy.orm import sessionmaker
-
-metadata = MetaData(
-    schema='root.factory'
-)
-Base = declarative_base(metadata=metadata)
-
-
-class Device(Base):
-    __tablename__ = "room2.device1"
-    Time = Column(BigInteger, primary_key=True)
-    temperature = Column(Float)
-    status = Column(Float)
-
-
-engine = create_engine("iotdb://root:TimechoDB@2021@127.0.0.1:6667")  # before V2.0.6.x the default password was root
-
-DbSession = sessionmaker(bind=engine)
-session = DbSession()
-
-res = 
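session.query(Device.status).filter(Device.temperature > 1)
-
-for row in res:
-    print(row)
-```
-
-## 12. For Developers
-
-### 12.1 Introduction
-
-This is an example of connecting to IoTDB through the thrift rpc interface. The procedure is almost the same on Windows and Linux, but watch out for differences such as the path separator.
-
-### 12.2 Prerequisites
-
-Python 3.7 or later is preferred.
-
-thrift (0.11.0 or later) must be installed in order to compile the thrift files into Python code. The official installation guide is below; at the end you should have a thrift executable.
-
-```
-http://thrift.apache.org/docs/install/
-```
-
-Before starting, you also need to install the other dependencies listed in `requirements_dev.txt` into your Python environment:
-```shell
-pip install -r requirements_dev.txt
-```
-
-### 12.3 Compile the thrift Library and Debug
-
-In the root directory of the IoTDB source tree, run `mvn clean generate-sources -pl iotdb-client/client-py -am`.
-
-This command automatically deletes the files in `iotdb/thrift` and repopulates the folder with the newly generated thrift files.
-
-This folder is ignored by git and **should never be pushed to git!**
-
-**Note**: do not upload `iotdb/thrift` to the git repository!
-
-### 12.4 Session Client & Usage Example
-
-The thrift interface is packaged in `client-py/src/iotdb/session.py` (similar to the Java version), and an example file, `client-py/src/SessionExample.py`, shows how to use the Session module. Please read it carefully.
-
-Another simple example:
-
-```python
-from iotdb.Session import Session
-
-ip = "127.0.0.1"
-port_ = "6667"
-username_ = "root"
-password_ = "TimechoDB@2021"  # before V2.0.6.x the default password was root
-session = Session(ip, port_, username_, password_)
-session.open(False)
-zone = session.get_time_zone()
-session.close()
-```
-
-### 12.5 Tests
-
-Please add your custom tests in the `tests` folder.
-
-To run all tests, simply run `pytest .` in the root directory.
-
-**Note**: some tests require docker on your system, because the IoTDB instance under test is started in a docker container using [testcontainers](https://testcontainers-python.readthedocs.io/en/latest/index.html).
-
-### 12.6 Other Tools
-
-[black](https://pypi.org/project/black/) and [flake8](https://pypi.org/project/flake8/) are used for automatic formatting and linting, respectively.
-They can be run with `black .` and `flake8 .`, respectively.
-
-## 13. Release
-
-To do a release,
-
-just make sure you have generated the correct thrift code,
-
-run linting and automatic formatting,
-
-then make sure all tests pass (via `pytest . 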
`),
-
-and finally you can publish the package to pypi.
-
-### 13.1 Prepare Your Environment
-
-First, install all necessary development dependencies with `pip install -r requirements_dev.txt`.
-
-### 13.2 Release
-
-The script `release.sh` performs all the steps of a release.
-
-These steps include:
-
-* Remove all temporary directories (if any exist)
-
-* (Re)generate all required source code via mvn
-
-* Run linting (flake8)
-
-* Run the tests via pytest
-
-* Build
-
-* Publish to pypi
diff --git a/src/zh/UserGuide/latest/API/RestServiceV1_timecho.md b/src/zh/UserGuide/latest/API/RestServiceV1_timecho.md
deleted file mode 100644
index 239d57ee7..000000000
--- a/src/zh/UserGuide/latest/API/RestServiceV1_timecho.md
+++ /dev/null
@@ -1,965 +0,0 @@
-
-
-# REST API V1 (Not Recommended)
-IoTDB's RESTful service can be used for query, write, and management operations. It uses the OpenAPI standard to define the interfaces and generate the framework.
-
-Note: starting from V2.0.8.2, the TimechoDB installation package no longer includes the REST service JAR by default. Before using the service, contact the Timecho team to obtain the JAR and place it under timechodb_home/lib or timechodb_home/ext/external_service.
-
-## 1. Enable the RESTful Service
-The RESTful service is disabled by default.
-
-  Find the `conf/iotdb-system.properties` file under the IoTDB installation directory and set `enable_rest_service` to `true` to enable the module.
-
-  ```properties
-  enable_rest_service=true
-  ```
-
-## 2. Authentication
-Except for the health-check endpoint `/ping`, the RESTful service uses Basic authentication; every request must carry `Authorization` information in its header.
-
-1. Authentication format
-
-```JSON
-Authorization: Basic 
-```
-
-The value that follows `Basic` is the Base64 encoding of `username:password`. It can be generated quickly as follows:
-
-* Linux/macOS
-
-```Bash
-echo -n "your_username:your_password" | base64
-# e.g.
-echo -n "root:TimechoDB@2021" | base64
-```
-
-* Windows
-
-```Bash
-# PowerShell
-[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("username:password"))
-# e.g.
-[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("root:TimechoDB@2021"))
-
-# CMD
-powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"username:password\"))"
-# e.g.
-powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"root:TimechoDB@2021\"))"
-```
-
-2. Authentication example
-
-With the default username `root` and password `TimechoDB@2021`:
-
-* Concatenated string: `root:TimechoDB@2021`
-* After Base64 encoding: `cm9vdDpUaW1lY2hvREJAMjAyMQ==`
-* Final header:
-
-```JSON
-Authorization: Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==
-```
-
-3. 
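Error descriptions
-* Wrong username or password: returns HTTP status code `600` with the body:
-
-```JSON
-{"code":600,"message":"WRONG_LOGIN_PASSWORD_ERROR"}
-```
-
-* `Authorization` not set: returns HTTP status code `603` with the body:
-
-```JSON
-{"code":603,"message":"UNINITIALIZED_AUTH_ERROR"}
-```
-
-## 3. Interfaces
-
-### 3.1 ping
-
-The ping interface can be used as a liveness check for the online service.
-
-Request method: `GET`
-
-Request path: `http://ip:port/ping`
-
-Request example:
-
-```shell
-$ curl http://127.0.0.1:18080/ping
-```
-
-HTTP status codes returned:
-
-- `200`: the service is working normally and can accept external requests.
-- `503`: the service has a problem and cannot accept external requests.
-
-Response parameters:
-
-| Parameter | Type | Description |
-| ------------ | ------------ | ------------|
-| code | integer | Status code |
-| message | string | Message |
-
-Response examples:
-
-- When the HTTP status code is `200`:
-
-  ```json
-  {
-    "code": 200,
-    "message": "SUCCESS_STATUS"
-  }
-  ```
-
-- When the HTTP status code is `503`:
-
-  ```json
-  {
-    "code": 500,
-    "message": "thrift service is unavailable"
-  }
-  ```
-
-> The `/ping` interface can be accessed without authentication.
-
-### 3.2 query
-
-The query interface handles both data queries and metadata queries.
-
-Request method: `POST`
-
-Request header: `application/json`
-
-Request path: `http://ip:port/rest/v1/query`
-
-Parameters:
-
-| Parameter | Type | Required | Description |
-|-----------| ------------ | ------------ |------------ |
-| sql | string | Yes | |
-| rowLimit | integer | No | The maximum number of rows in the result set that a single query can return.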
If this parameter is not set, the `rest_query_default_row_size_limit` value from the configuration file is used as the default.
If the number of rows in the result set exceeds the limit, status code `411` is returned. |
-
-Response parameters:
-
-| Parameter | Type | Description |
-|--------------| ------------ | ------------|
-| expressions | array | Array of result-set column names for data queries; `null` for metadata queries |
-| columnNames | array | Array of result-set column names for metadata queries; `null` for data queries |
-| timestamps | array | Timestamp column; `null` for metadata queries |
-| values | array | Two-dimensional array; the first dimension has the same length as the array of result-set column names, and each second-dimension array represents one column of the result set |
-
-Request examples are shown below:
-
-Tip: to avoid OOM problems, queries of the form select * from root.xx.** are not recommended.
-
-1. Request example, expression query:
-```shell
-curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select s3, s4, s3 + 1 from root.sg27 limit 2"}' http://127.0.0.1:18080/rest/v1/query
-```
-
-- Response example:
-
-```json
-{
-  "expressions": [
-    "root.sg27.s3",
-    "root.sg27.s4",
-    "root.sg27.s3 + 1"
-  ],
-  "columnNames": null,
-  "timestamps": [
-    1635232143960,
-    1635232153960
-  ],
-  "values": [
-    [
-      11,
-      null
-    ],
-    [
-      false,
-      true
-    ],
-    [
-      12.0,
-      null
-    ]
-  ]
-}
-```
-
-2. Request example, show child paths:
-```shell
-curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show child paths root"}' http://127.0.0.1:18080/rest/v1/query
-```
-
-- Response example:
-
-```json
-{
-  "expressions": null,
-  "columnNames": [
-    "child paths"
-  ],
-  "timestamps": null,
-  "values": [
-    [
-      "root.sg27",
-      "root.sg28"
-    ]
-  ]
-}
-```
-
-3. Request example, show child nodes:
-```shell
-curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show child nodes root"}' http://127.0.0.1:18080/rest/v1/query
-```
-
-- Response example:
-
-```json
-{
-  "expressions": null,
-  "columnNames": [
-    "child nodes"
-  ],
-  "timestamps": null,
-  "values": [
-    [
-      "sg27",
-      "sg28"
-    ]
-  ]
-}
-```
-
-4. 
请求示例 show all ttl: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show all ttl"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "columnNames": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - null, - null - ] - ] -} -``` - -5. 请求示例 show ttl: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show ttl on root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "columnNames": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27" - ], - [ - null - ] - ] -} -``` - -6. 请求示例 show functions: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show functions"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "columnNames": [ - "function name", - "function type", - "class name (UDF)" - ], - "timestamps": null, - "values": [ - [ - "ABS", - "ACOS", - "ASIN", - ... - ], - [ - "built-in UDTF", - "built-in UDTF", - "built-in UDTF", - ... - ], - [ - "org.apache.iotdb.db.query.udf.builtin.UDTFAbs", - "org.apache.iotdb.db.query.udf.builtin.UDTFAcos", - "org.apache.iotdb.db.query.udf.builtin.UDTFAsin", - ... - ] - ] -} -``` - -7. 
请求示例 show timeseries: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show timeseries"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "columnNames": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg28.s3", - "root.sg28.s4" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg27", - "root.sg27", - "root.sg28", - "root.sg28" - ], - [ - "INT32", - "BOOLEAN", - "INT32", - "BOOLEAN" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -8. 请求示例 show latest timeseries: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show latest timeseries"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "columnNames": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg28.s4", - "root.sg27.s4", - "root.sg28.s3", - "root.sg27.s3" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg28", - "root.sg27", - "root.sg28", - "root.sg27" - ], - [ - "BOOLEAN", - "BOOLEAN", - "INT32", - "INT32" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -9. 
请求示例 count timeseries: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"count timeseries root.**"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "columnNames": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -10. 请求示例 count nodes: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"count nodes root.** level=2"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "columnNames": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -11. 请求示例 show devices: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show devices"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "columnNames": [ - "devices", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -12. 请求示例 show devices with database: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"show devices with database"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "columnNames": [ - "devices", - "database", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -13. 请求示例 list user: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"list user"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "columnNames": [ - "user" - ], - "timestamps": null, - "values": [ - [ - "root" - ] - ] -} -``` - -14. 
请求示例 原始聚合查询: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(*) from root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "columnNames": null, - "timestamps": [ - 0 - ], - "values": [ - [ - 1 - ], - [ - 2 - ] - ] -} -``` - -15. 请求示例 group by level: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(*) from root.** group by level = 1"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "columnNames": [ - "count(root.sg27.*)", - "count(root.sg28.*)" - ], - "timestamps": null, - "values": [ - [ - 3 - ], - [ - 3 - ] - ] -} -``` - -16. 请求示例 group by: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(*) from root.sg27 group by([1635232143960,1635232153960),1s)"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "columnNames": null, - "timestamps": [ - 1635232143960, - 1635232144960, - 1635232145960, - 1635232146960, - 1635232147960, - 1635232148960, - 1635232149960, - 1635232150960, - 1635232151960, - 1635232152960 - ], - "values": [ - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ], - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ] - ] -} -``` - -17. 请求示例 last: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select last s3 from root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "columnNames": [ - "timeseries", - "value", - "dataType" - ], - "timestamps": [ - 1635232143960 - ], - "values": [ - [ - "root.sg27.s3" - ], - [ - "11" - ], - [ - "INT32" - ] - ] -} -``` - -18. 
请求示例 disable align: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select * from root.sg27 disable align"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "code": 407, - "message": "disable align clauses are not supported." -} -``` - -19. 请求示例 align by device: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select count(s3) from root.sg27 align by device"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "code": 407, - "message": "align by device clauses are not supported." -} -``` - -20. 请求示例 select into: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"select s3, s4 into root.sg29.s1, root.sg29.s2 from root.sg27"}' http://127.0.0.1:18080/rest/v1/query -``` - -- 响应示例: - -```json -{ - "code": 407, - "message": "select into clauses are not supported." -} -``` - -### 3.3 nonQuery - -请求方式:`POST` - -请求头:`application/json` - -请求路径:`http://ip:port/rest/v1/nonQuery` - -参数说明: - -|参数名称 |参数类型 |是否必填|参数描述| -| ------------ | ------------ | ------------ |------------ | -| sql | string | 是 | | - -请求示例: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"sql":"CREATE DATABASE root.ln"}' http://127.0.0.1:18080/rest/v1/nonQuery -``` - -响应参数: - -|参数名称 |参数类型 |参数描述| -| ------------ | ------------ | ------------| -| code | integer | 状态码 | -| message | string | 信息提示 | - -响应示例: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - - -### 3.4 insertTablet - -请求方式:`POST` - -请求头:`application/json` - -请求路径:`http://ip:port/rest/v1/insertTablet` - -参数说明: - -| 参数名称 |参数类型 |是否必填|参数描述| -|--------------| ------------ | ------------ |------------ | -| timestamps | array | 是 | 时间列 | -| measurements | array | 是 | 测点名称 | -| dataTypes | array | 是 | 数据类型 | -| values | array | 是 | 值列,每一列中的值可以为 `null` | 
-| isAligned | boolean | 是 | 是否是对齐时间序列 | -| deviceId | string | 是 | 设备名称 | - -请求示例: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpyb290" -X POST --data '{"timestamps":[1635232143960,1635232153960],"measurements":["s3","s4"],"dataTypes":["INT32","BOOLEAN"],"values":[[11,null],[false,true]],"isAligned":false,"deviceId":"root.sg27"}' http://127.0.0.1:18080/rest/v1/insertTablet -``` - -响应参数: - -|参数名称 |参数类型 |参数描述| -| ------------ | ------------ | ------------| -| code | integer | 状态码 | -| message | string | 信息提示 | - -响应示例: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - -## 4. 配置 - -配置位于 `iotdb-system.properties` 中。 - - - -* 将 `enable_rest_service` 设置为 `true` 以启用该模块,而将 `false` 设置为禁用该模块。默认情况下,该值为 `false`。 - -```properties -enable_rest_service=true -``` - -* 仅在 `enable_rest_service=true` 时生效。将 `rest_service_port `设置为数字(1025~65535),以自定义REST服务套接字端口。默认情况下,值为 `18080`。 - -```properties -rest_service_port=18080 -``` - -* 将 'enable_swagger' 设置 'true' 启用swagger来展示rest接口信息, 而设置为 'false' 关闭该功能. 
默认情况下,该值为 `false`。 - -```properties -enable_swagger=false -``` - -* 一次查询能返回的结果集最大行数。当返回结果集的行数超出参数限制时,您只会得到在行数范围内的结果集,且将得到状态码`411`。 - -```properties -rest_query_default_row_size_limit=10000 -``` - -* 缓存客户登录信息的过期时间(用于加速用户鉴权的速度,单位为秒,默认是8个小时) - -```properties -cache_expire=28800 -``` - -* 缓存中存储的最大用户数量(默认是100) - -```properties -cache_max_num=100 -``` - -* 缓存初始容量(默认是10) - -```properties -cache_init_num=10 -``` - -* REST Service 是否开启 SSL 配置,将 `enable_https` 设置为 `true` 以启用该模块,而将 `false` 设置为禁用该模块。默认情况下,该值为 `false`。 - -```properties -enable_https=false -``` - -* keyStore 所在路径(非必填) - -```properties -key_store_path= -``` - - -* keyStore 密码(非必填) - -```properties -key_store_pwd= -``` - - -* trustStore 所在路径(非必填) - -```properties -trust_store_path= -``` - -* trustStore 密码(非必填) - -```properties -trust_store_pwd= -``` - - -* SSL 超时时间,单位为秒 - -```properties -idle_timeout=5000 -``` diff --git a/src/zh/UserGuide/latest/API/RestServiceV2_timecho.md b/src/zh/UserGuide/latest/API/RestServiceV2_timecho.md deleted file mode 100644 index 629c40fb4..000000000 --- a/src/zh/UserGuide/latest/API/RestServiceV2_timecho.md +++ /dev/null @@ -1,1004 +0,0 @@ - - -# REST API V2 -IoTDB 的 RESTful 服务可用于查询、写入和管理操作,它使用 OpenAPI 标准来定义接口并生成框架。 - -注意:自 V2.0.8.2 版本起,TimechoDB 安装包中默认不包含 REST 服务的 JAR 包,请使用该服务前联系天谋团队获取相应的 JAR 包,并放置于 timechodb_home/lib 或者 timechodb_home/ext/external_service 路径下。 - -## 1. 开启RESTful 服务 -RESTful 服务默认情况是关闭的 - - 找到IoTDB安装目录下面的`conf/iotdb-system.properties`文件,将 `enable_rest_service` 设置为 `true` 以启用该模块。 - - ```properties - enable_rest_service=true - ``` - -## 2. 鉴权 -除了检活接口 `/ping`,RESTful 服务均使用基础(Basic)鉴权,所有请求都需要在 Header 中携带 `Authorization` 信息。 - -1. 
鉴权格式 - -```JSON -Authorization: Basic -``` - -其中 `` 是 `用户名:密码` 直接做 Base64 编码的结果,其快速生成方式如下 - -* Linux/macOS - -```Bash -echo -n "你的用户名:你的密码" | base64 -eg: echo -n "root:TimechoDB@2021" | base64 -``` - -* Windows - -```Bash -# PowerShell -[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("用户名:密码")) -eg: [Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("root:TimechoDB@2021")) - -# CMD -powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"用户名:密码\"))" -eg: powershell "[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(\"root:TimechoDB@2021\"))" -``` - -2. 鉴权示例 - -默认用户名 `root`,密码 `TimechoDB@2021`: - -* 拼接字符串:`root:TimechoDB@2021` -* Base64 编码后为:`cm9vdDpUaW1lY2hvREJAMjAyMQ==` -* 最终 Header: - -```JSON -Authorization: Basic cm9vdDpUaW1lY2hvREJAMjAyMQ== -``` - -3. 错误说明 -* 用户名/密码错误:返回 HTTP 状态码 `600`,内容: - -```JSON -{"code":600,"message":"WRONG_LOGIN_PASSWORD_ERROR"} -``` - -* 未设置 `Authorization`:返回 HTTP 状态码 `603`,内容: - -```JSON -{"code":603,"message":"UNINITIALIZED_AUTH_ERROR"} -``` - -## 3. 接口 - -### 3.1 ping - -ping 接口可以用于线上服务检活。 - -请求方式:`GET` - -请求路径:http://ip:port/ping - -请求示例: - -```shell -$ curl http://127.0.0.1:18080/ping -``` - -返回的 HTTP 状态码: - -- `200`:当前服务工作正常,可以接收外部请求。 -- `503`:当前服务出现异常,不能接收外部请求。 - -响应参数: - -|参数名称 |参数类型 |参数描述| -| ------------ | ------------ | ------------| -| code | integer | 状态码 | -| message | string | 信息提示 | - -响应示例: - -- HTTP 状态码为 `200` 时: - - ```json - { - "code": 200, - "message": "SUCCESS_STATUS" - } - ``` - -- HTTP 状态码为 `503` 时: - - ```json - { - "code": 500, - "message": "thrift service is unavailable" - } - ``` - -> `/ping` 接口访问不需要鉴权。 - -### 3.2 query - -query 接口可以用于处理数据查询和元数据查询。 - -请求方式:`POST` - -请求头:`application/json` - -请求路径: `http://ip:port/rest/v2/query` - -参数说明: - -| 参数名称 |参数类型 |是否必填|参数描述| -|-----------| ------------ | ------------ |------------ | -| sql | string | 是 | | -| row_limit | integer | 否 | 一次查询能返回的结果集的最大行数。
如果不设置该参数,将使用配置文件的 `rest_query_default_row_size_limit` 作为默认值。
当返回结果集的行数超出限制时,将返回状态码 `411`。 | - -响应参数: - -| 参数名称 |参数类型 |参数描述| -|--------------| ------------ | ------------| -| expressions | array | 用于数据查询时结果集列名的数组,用于元数据查询时为`null`| -| column_names | array | 用于元数据查询结果集列名数组,用于数据查询时为`null` | -| timestamps | array | 时间戳列,用于元数据查询时为`null` | -| values |array|二维数组,第一维与结果集列名数组的长度相同,第二维数组代表结果集的一列| - -请求示例如下所示: - -提示:为了避免OOM问题,不推荐使用select * from root.xx.** 这种查找方式。 - -1. 请求示例 表达式查询: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select s3, s4, s3 + 1 from root.sg27 limit 2"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg27.s3 + 1" - ], - "column_names": null, - "timestamps": [ - 1635232143960, - 1635232153960 - ], - "values": [ - [ - 11, - null - ], - [ - false, - true - ], - [ - 12.0, - null - ] - ] -} -``` - -2.请求示例 show child paths: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show child paths root"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "child paths" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ] - ] -} -``` - -3. 请求示例 show child nodes: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show child nodes root"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "child nodes" - ], - "timestamps": null, - "values": [ - [ - "sg27", - "sg28" - ] - ] -} -``` - -4. 
请求示例 show all ttl: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show all ttl"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - null, - null - ] - ] -} -``` - -5. 请求示例 show ttl: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show ttl on root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "database", - "ttl" - ], - "timestamps": null, - "values": [ - [ - "root.sg27" - ], - [ - null - ] - ] -} -``` - -6. 请求示例 show functions: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show functions"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "function name", - "function type", - "class name (UDF)" - ], - "timestamps": null, - "values": [ - [ - "ABS", - "ACOS", - "ASIN", - ... - ], - [ - "built-in UDTF", - "built-in UDTF", - "built-in UDTF", - ... - ], - [ - "org.apache.iotdb.db.query.udf.builtin.UDTFAbs", - "org.apache.iotdb.db.query.udf.builtin.UDTFAcos", - "org.apache.iotdb.db.query.udf.builtin.UDTFAsin", - ... - ] - ] -} -``` - -7. 
请求示例 show timeseries: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show timeseries"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg27.s3", - "root.sg27.s4", - "root.sg28.s3", - "root.sg28.s4" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg27", - "root.sg27", - "root.sg28", - "root.sg28" - ], - [ - "INT32", - "BOOLEAN", - "INT32", - "BOOLEAN" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -8. 请求示例 show latest timeseries: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show latest timeseries"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "timeseries", - "alias", - "database", - "dataType", - "encoding", - "compression", - "tags", - "attributes" - ], - "timestamps": null, - "values": [ - [ - "root.sg28.s4", - "root.sg27.s4", - "root.sg28.s3", - "root.sg27.s3" - ], - [ - null, - null, - null, - null - ], - [ - "root.sg28", - "root.sg27", - "root.sg28", - "root.sg27" - ], - [ - "BOOLEAN", - "BOOLEAN", - "INT32", - "INT32" - ], - [ - "RLE", - "RLE", - "RLE", - "RLE" - ], - [ - "SNAPPY", - "SNAPPY", - "SNAPPY", - "SNAPPY" - ], - [ - null, - null, - null, - null - ], - [ - null, - null, - null, - null - ] - ] -} -``` - -9. 
请求示例 count timeseries: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"count timeseries root.**"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -10. 请求示例 count nodes: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"count nodes root.** level=2"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "count" - ], - "timestamps": null, - "values": [ - [ - 4 - ] - ] -} -``` - -11. 请求示例 show devices: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show devices"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "devices", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -12. 请求示例 show devices with database: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"show devices with database"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "devices", - "database", - "isAligned" - ], - "timestamps": null, - "values": [ - [ - "root.sg27", - "root.sg28" - ], - [ - "root.sg27", - "root.sg28" - ], - [ - "false", - "false" - ] - ] -} -``` - -13. 
请求示例 list user: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"list user"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "user" - ], - "timestamps": null, - "values": [ - [ - "root" - ] - ] -} -``` - -14. 请求示例 原始聚合查询: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(*) from root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "column_names": null, - "timestamps": [ - 0 - ], - "values": [ - [ - 1 - ], - [ - 2 - ] - ] -} -``` - -15. 请求示例 group by level: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(*) from root.** group by level = 1"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "count(root.sg27.*)", - "count(root.sg28.*)" - ], - "timestamps": null, - "values": [ - [ - 3 - ], - [ - 3 - ] - ] -} -``` - -16. 请求示例 group by: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(*) from root.sg27 group by([1635232143960,1635232153960),1s)"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": [ - "count(root.sg27.s3)", - "count(root.sg27.s4)" - ], - "column_names": null, - "timestamps": [ - 1635232143960, - 1635232144960, - 1635232145960, - 1635232146960, - 1635232147960, - 1635232148960, - 1635232149960, - 1635232150960, - 1635232151960, - 1635232152960 - ], - "values": [ - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ], - [ - 1, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0, - 0 - ] - ] -} -``` - -17. 
请求示例 last: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select last s3 from root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "expressions": null, - "column_names": [ - "timeseries", - "value", - "dataType" - ], - "timestamps": [ - 1635232143960 - ], - "values": [ - [ - "root.sg27.s3" - ], - [ - "11" - ], - [ - "INT32" - ] - ] -} -``` - -18. 请求示例 disable align: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select * from root.sg27 disable align"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "code": 407, - "message": "disable align clauses are not supported." -} -``` - -19. 请求示例 align by device: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select count(s3) from root.sg27 align by device"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "code": 407, - "message": "align by device clauses are not supported." -} -``` - -20. 请求示例 select into: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"select s3, s4 into root.sg29.s1, root.sg29.s2 from root.sg27"}' http://127.0.0.1:18080/rest/v2/query -``` - -- 响应示例: - -```json -{ - "code": 407, - "message": "select into clauses are not supported." 
-} -``` - -### 3.3 nonQuery - -请求方式:`POST` - -请求头:`application/json` - -请求路径:`http://ip:port/rest/v2/nonQuery` - -参数说明: - -|参数名称 |参数类型 |是否必填|参数描述| -| ------------ | ------------ | ------------ |------------ | -| sql | string | 是 | | - -请求示例: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"sql":"CREATE DATABASE root.ln"}' http://127.0.0.1:18080/rest/v2/nonQuery -``` - -响应参数: - -|参数名称 |参数类型 |参数描述| -| ------------ | ------------ | ------------| -| code | integer | 状态码 | -| message | string | 信息提示 | - -响应示例: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - - -### 3.4 insertTablet - -请求方式:`POST` - -请求头:`application/json` - -请求路径:`http://ip:port/rest/v2/insertTablet` - -参数说明: - -| 参数名称 |参数类型 |是否必填|参数描述| -|--------------| ------------ | ------------ |------------ | -| timestamps | array | 是 | 时间列 | -| measurements | array | 是 | 测点名称 | -| data_types | array | 是 | 数据类型 | -| values | array | 是 | 值列,每一列中的值可以为 `null` | -| is_aligned | boolean | 是 | 是否是对齐时间序列 | -| device | string | 是 | 设备名称 | - -请求示例: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"timestamps":[1635232143960,1635232153960],"measurements":["s3","s4"],"data_types":["INT32","BOOLEAN"],"values":[[11,null],[false,true]],"is_aligned":false,"device":"root.sg27"}' http://127.0.0.1:18080/rest/v2/insertTablet -``` - -响应参数: - -|参数名称 |参数类型 |参数描述| -| ------------ | ------------ | ------------| -| code | integer | 状态码 | -| message | string | 信息提示 | - -响应示例: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - -### 3.5 insertRecords - -请求方式:`POST` - -请求头:`application/json` - -请求路径:`http://ip:port/rest/v2/insertRecords` - -参数说明: - -| 参数名称 |参数类型 |是否必填|参数描述| -|-------------------| ------------ | ------------ |------------ | -| timestamps | array | 是 | 时间列 | -| measurements_list | array | 是 | 测点名称 | -| data_types_list | array | 是 | 数据类型 | -| 
values_list | array | Yes | Value columns; a value in any column may be `null` | -| devices | array | Yes | Device names | -| is_aligned | boolean | Yes | Whether the time series are aligned | - -Request example: -```shell -curl -H "Content-Type:application/json" -H "Authorization:Basic cm9vdDpUaW1lY2hvREJAMjAyMQ==" -X POST --data '{"timestamps":[1635232113960,1635232151960,1635232143960,1635232143960],"measurements_list":[["s33","s44"],["s55","s66"],["s77","s88"],["s771","s881"]],"data_types_list":[["INT32","INT64"],["FLOAT","DOUBLE"],["FLOAT","DOUBLE"],["BOOLEAN","TEXT"]],"values_list":[[1,11],[2.1,2],[4,6],[false,"cccccc"]],"is_aligned":false,"devices":["root.s1","root.s1","root.s1","root.s3"]}' http://127.0.0.1:18080/rest/v2/insertRecords -``` - -Response parameters: - -|Parameter |Type |Description| -| ------------ | ------------ | ------------| -| code | integer | Status code | -| message | string | Message | - -Response example: -```json -{ - "code": 200, - "message": "SUCCESS_STATUS" -} -``` - - -## 4. Configuration - -The configuration is located in `iotdb-system.properties`. - - - -* Set `enable_rest_service` to `true` to enable the module, or to `false` to disable it. The default is `false`. - -```properties -enable_rest_service=true -``` - -* Takes effect only when `enable_rest_service=true`. Set `rest_service_port` to a number (1025~65535) to customize the REST service socket port. The default is `18080`. - -```properties -rest_service_port=18080 -``` - -* Set `enable_swagger` to `true` to enable Swagger for displaying REST interface information, or to `false` to disable it. 
默认情况下,该值为 `false`。 - -```properties -enable_swagger=false -``` - -* 一次查询能返回的结果集最大行数。当返回结果集的行数超出参数限制时,您只会得到在行数范围内的结果集,且将得到状态码`411`。 - -```properties -rest_query_default_row_size_limit=10000 -``` - -* 缓存客户登录信息的过期时间(用于加速用户鉴权的速度,单位为秒,默认是8个小时) - -```properties -cache_expire=28800 -``` - -* 缓存中存储的最大用户数量(默认是100) - -```properties -cache_max_num=100 -``` - -* 缓存初始容量(默认是10) - -```properties -cache_init_num=10 -``` - -* REST Service 是否开启 SSL 配置,将 `enable_https` 设置为 `true` 以启用该模块,而将 `false` 设置为禁用该模块。默认情况下,该值为 `false`。 - -```properties -enable_https=false -``` - -* keyStore 所在路径(非必填) - -```properties -key_store_path= -``` - - -* keyStore 密码(非必填) - -```properties -key_store_pwd= -``` - - -* trustStore 所在路径(非必填) - -```properties -trust_store_path= -``` - -* trustStore 密码(非必填) - -```properties -trust_store_pwd= -``` - - -* SSL 超时时间,单位为秒 - -```properties -idle_timeout=5000 -``` diff --git a/src/zh/UserGuide/latest/Background-knowledge/Cluster-Concept_timecho.md b/src/zh/UserGuide/latest/Background-knowledge/Cluster-Concept_timecho.md deleted file mode 100644 index b0462f0dd..000000000 --- a/src/zh/UserGuide/latest/Background-knowledge/Cluster-Concept_timecho.md +++ /dev/null @@ -1,132 +0,0 @@ - - -# 常见概念 - -## 1. 数据模型相关概念 - -### 1.1 数据模型(sql_dialect) - -IoTDB 支持两种时序数据模型(SQL语法),管理的对象均为设备和测点树:以层级路径的方式管理数据,一条路径对应一个设备的一个测点表;以关系表的方式管理数据,一张表对应一类设备。 - -### 1.2 元数据(Schema) - -元数据是数据库的数据模型信息,即树形结构或表结构。包括测点的名称、数据类型等定义。 - -### 1.3 设备(Device) - -对应一个实际场景中的物理设备,通常包含多个测点。 - -### 1.4 测点(Timeseries) - -又名:物理量、时间序列、时间线、点位、信号量、指标、测量值等。
-测点是多个数据点按时间戳递增排列形成的一个时间序列。通常一个测点代表一个采集点位,能够定期采集所在环境的物理量。 - -### 1.5 编码(Encoding) - -编码是一种压缩技术,将数据以二进制的形式进行表示,可以提高存储效率。IoTDB 支持多种针对不同类型的数据的编码方法,详细信息请查看:[压缩和编码](../Technical-Insider/Encoding-and-Compression.md)。 - -### 1.6 压缩(Compression) - -IoTDB 在数据编码后,使用压缩技术进一步压缩二进制数据,提升存储效率。IoTDB 支持多种压缩方法,详细信息请查看:[压缩和编码](../Technical-Insider/Encoding-and-Compression.md)。 - -## 2. 分布式相关概念 - -下图展示了一个常见的 IoTDB 3C3D(3 个 ConfigNode、3 个 DataNode)的集群部署模式: - - - -IoTDB 的集群包括如下常见概念: - -- 节点(ConfigNode、DataNode、AINode) -- Region(SchemaRegion、DataRegion) -- 多副本 - -下文将对以上概念进行介绍。 - - -### 2.1 节点 - -IoTDB 集群包括三种节点(进程):ConfigNode(管理节点),DataNode(数据节点)和 AINode(分析节点),如下所示: - -- ConfigNode:管理集群的节点信息、配置信息、用户权限、元数据、分区信息等,负责分布式操作的调度和负载均衡,所有 ConfigNode 之间互为全量备份,如上图中的 ConfigNode-1,ConfigNode-2 和 ConfigNode-3 所示。 -- DataNode:服务客户端请求,负责数据的存储和计算,如上图中的 DataNode-1,DataNode-2 和 DataNode-3 所示。 -- AINode:负责提供机器学习能力,支持注册已训练好的机器学习模型,并通过 SQL 调用模型进行推理,目前已内置自研时序大模型和常见的机器学习算法(如预测与异常检测)。 - -### 2.2 数据分区 - -在 IoTDB 中,元数据和数据都被分为小的分区,即 Region,由集群的各个 DataNode 进行管理。 - -- SchemaRegion:元数据分区,管理一部分设备和测点的元数据。不同 DataNode 相同 RegionID 的 SchemaRegion 互为副本,如上图中 SchemaRegion-1 拥有三个副本,分别放置于 DataNode-1,DataNode-2 和 DataNode-3。 -- DataRegion:数据分区,管理一部分设备的一段时间的数据。不同 DataNode 相同 RegionID 的 DataRegion 互为副本,如上图中 DataRegion-2 拥有两个副本,分别放置于 DataNode-1 和 DataNode-2。 -- 具体分区算法可参考:[数据分区](../Technical-Insider/Cluster-data-partitioning.md) - -### 2.3 多副本 - -数据和元数据的副本数可配置,不同部署模式下的副本数推荐如下配置,其中多副本时可提供高可用服务。 - -| 类别 | 配置项 | 单机推荐配置 | 集群推荐配置 | -| :----- | :------------------------ | :----------- | :----------- | -| 元数据 | schema_replication_factor | 1 | 3 | -| 数据 | data_replication_factor | 1 | 2 | - - -## 3. 
部署相关概念 - -IoTDB 有三种运行模式:单机模式、集群模式和双活模式。 - -### 3.1 单机模式 - -IoTDB单机实例包括 1 个ConfigNode、1个DataNode,即1C1D; - -- **特点**:便于开发者安装部署,部署和维护成本较低,操作方便。 -- **适用场景**:资源有限或对高可用要求不高的场景,例如边缘端服务器。 -- **部署方法**:[单机版部署](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - -### 3.2 双活模式 - -双活版部署为 TimechoDB 企业版功能,是指两个独立的实例进行双向同步,能同时对外提供服务。当一台停机重启后,另一个实例会将缺失数据断点续传。 - -> IoTDB 双活实例通常为2个单机节点,即2套1C1D。每个实例也可以为集群。 - -- **特点**:资源占用最低的高可用解决方案。 -- **适用场景**:资源有限(仅有两台服务器),但希望获得高可用能力。 -- **部署方法**:[双活版部署](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -### 3.3 集群模式 - -IoTDB 集群实例为 3 个ConfigNode 和不少于 3 个 DataNode,通常为 3 个 DataNode,即3C3D;当部分节点出现故障时,剩余节点仍然能对外提供服务,保证数据库服务的高可用性,且可随节点增加提升数据库性能。 - -- **特点**:具有高可用性、高扩展性,可通过增加 DataNode 提高系统性能。 -- **适用场景**:需要提供高可用和可靠性的企业级应用场景。 -- **部署方法**:[集群版部署](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md) - -### 3.4 特点总结 - -| 维度 | 单机模式 | 双活模式 | 集群模式 | -| ------------ | ---------------------------- | ------------------------ | ------------------------ | -| 适用场景 | 边缘侧部署、对高可用要求不高 | 高可用性业务、容灾场景等 | 高可用性业务、容灾场景等 | -| 所需机器数量 | 1 | 2 | ≥3 | -| 安全可靠性 | 无法容忍单点故障 | 高,可容忍单点故障 | 高,可容忍单点故障 | -| 扩展性 | 可扩展 DataNode 提升性能 | 每个实例可按需扩展 | 可扩展 DataNode 提升性能 | -| 性能 | 可随 DataNode 数量扩展 | 与其中一个实例性能相同 | 可随 DataNode 数量扩展 | - -- 单机模式和集群模式,部署步骤类似(逐个增加 ConfigNode 和 DataNode),仅副本数和可提供服务的最少节点数不同。 \ No newline at end of file diff --git a/src/zh/UserGuide/latest/Background-knowledge/Data-Model-and-Terminology_timecho.md b/src/zh/UserGuide/latest/Background-knowledge/Data-Model-and-Terminology_timecho.md deleted file mode 100644 index c56987874..000000000 --- a/src/zh/UserGuide/latest/Background-knowledge/Data-Model-and-Terminology_timecho.md +++ /dev/null @@ -1,395 +0,0 @@ - - -# 建模方案设计 - -本章节主要介绍如何将时序数据应用场景转化为IoTDB时序建模。 - -## 1. 时序数据模型 - -在构建IoTDB建模方案前,需要先了解时序数据和时序数据模型,详细内容见此页面:[时序数据模型](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - -## 2. 
IoTDB 的树表孪生模型 - -IoTDB 提供了树表孪生模型的方式,其特点分别如下: - -**树模型**:以测点为对象进行管理,每个测点对应一条时间序列,测点名按`.`分割可形成一个树形目录结构,与物理世界一一对应,对测点的读写操作简单直观。 - -> 1. 数据建模时,为了足够的性能要求,建议数据路径(Path)的倒数第二层节点(对应设备数量)不少于 1000 个,且设备数量与并发处理能力挂钩,设备数量充足时,并发读写效率更优。 -若遇到“设备数量较少但单设备测点数量较多”的场景(如仅 3 台设备,每台设备含 10000 个测点),推荐在最后层级新增 `.value` ,以此提升倒数第二层节点总数,示例:`root.db.device01.metric.value`。 -> 2. 在构建树模型[路径](../Basic-Concept/Operate-Metadata_timecho.md#4-路径查询)时,节点命名若存在包含非标准字符或特殊符号的可能性,则建议对所有层级节点实施反引号封装策略。这样可以有效规避因字符解析异常导致的测点注册失败及数据写入中断问题,确保路径标识符在语法解析层面的准确性。 - -**表模型**:推荐为每类设备创建一张表,同类设备的物理量采集都具备一定共性(如都采集温度和湿度物理量),数据分析灵活丰富。 - -### 2.1 模型特点 - -树表孪生模型有各自的适用场景。 - -以下表格从适用场景、典型操作等多个维度对树模型和表模型进行了对比。用户可以根据具体的使用需求,选择适合的模型,从而实现数据的高效存储和管理。 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
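| Dimension | Tree model | Table model |
| --- | --- | --- |
| Applicable scenarios | Measurement point management, monitoring | Device management, analysis |
| Typical operations | Read and write by specifying the point path | Filter and analyze data by tags |
| Structure | Flexible additions and deletions, like a file system | Template-based management that eases data governance |
| Syntax | Concise and flexible | Rich analysis |
| Performance | Same | Same |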
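
The tree-model sizing advice earlier in this section (keep at least 1000 nodes at the second-to-last path level, and append `.value` when there are few devices with many measurement points) can be checked with a small sketch. This is not IoTDB code: it only counts second-to-last-level nodes, the helper names `device_of` and `count_devices` are hypothetical, and it assumes node names contain no `.` or backticks.

```python
def device_of(path: str) -> str:
    """Return the prefix up to the second-to-last level, which the tree model treats as the device."""
    return path.rsplit(".", 1)[0]


def count_devices(paths):
    """Count distinct second-to-last-level nodes across all time series paths."""
    return len({device_of(p) for p in paths})


# Hypothetical layout: 3 devices with 10000 measurement points each.
plain = [f"root.db.device{d:02d}.metric{m}" for d in range(1, 4) for m in range(10000)]
# Same series with a trailing `.value` level, as in root.db.device01.metric.value.
suffixed = [p + ".value" for p in plain]

print(count_devices(plain))     # 3
print(count_devices(suffixed))  # 30000
```

With the extra level, each former measurement point becomes its own second-to-last-level node, which is the effect the recommendation is after.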
- -**注意:** -- 同一个集群实例中可以存在两种模型空间,不同模型的语法、数据库命名方式不同,默认不互相可见。 - -### 2.2 模型选择 - -IoTDB 支持通过多种客户端工具与数据库建立连接,不同客户端下进行模型选择的方式说明如下: - -1. [命令行工具 CLI](../Tools-System/CLI_timecho.md) - -通过 CLI 建立连接时,需要通过 `sql_dialect` 参数指定使用的模型(默认使用树模型)。 - -```Bash -# 树模型 -start-cli.sh(bat) -start-cli.sh(bat) -sql_dialect tree - -# 表模型 -start-cli.sh(bat) -sql_dialect table -``` - -2. [SQL](../User-Manual/Maintenance-statement_timecho.md#_2-1-设置连接的模型) - -在使用 SQL 语言进行数据操作时,可通过 set 语句切换使用的模型。 - -```SQL --- 指定为树模型 -IoTDB> SET SQL_DIALECT=TREE - --- 指定为表模型 -IoTDB> SET SQL_DIALECT=TABLE -``` - -3. 应用编程接口 - -通过多语言应用编程接口建立连接时,可通过模型对应的 session/sessionpool 创建连接池实例,简单示例如下: - -* [Java 原生接口](../API/Programming-Java-Native-API_timecho.md) - -```Java -// 树模型 -SessionPool sessionPool = - new SessionPool.Builder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(3) - .build(); - -//表模型 - ITableSessionPool tableSessionPool = - new TableSessionPoolBuilder() - .nodeUrls(nodeUrls) - .user(username) - .password(password) - .maxSize(1) - .build(); -``` - -* [Python 原生接口](../API/Programming-Python-Native-API_timecho.md) - -```Python -# 树模型 -session = Session( -​ ip=ip, -​ port=port, -​ user=username, -​ password=password, -​ fetch_size=1024, -​ zone_id="UTC+8", -​ enable_redirection=True -) - -# 表模型 -config = TableSessionPoolConfig( -​ node_urls=node_urls, -​ username=username, -​ password=password, -​ database=database, -​ max_pool_size=max_pool_size, -​ fetch_size=fetch_size, -​ wait_timeout_in_ms=wait_timeout_in_ms, -) -session_pool = TableSessionPool(config) -``` - -* [C++ 原生接口](../API/Programming-Cpp-Native-API.md) - -```C++ -// 树模型 -session = new Session(hostip, port, username, password); - -// 表模型 -session = (new TableSessionBuilder()) - ->host(ip) - ->rpcPort(port) - ->username(username) - ->password(password) - ->build(); -``` - -* [GO 原生接口](../API/Programming-Go-Native-API.md) - -```Go -//树模型 -config := &client.PoolConfig{ - Host: host, - Port: port, - UserName: user, - 
Password: password, -} -sessionPool = client.NewSessionPool(config, 3, 60000, 60000, false) -defer sessionPool.Close() - -//表模型 -config := &client.PoolConfig{ - Host: host, - Port: port, - UserName: user, - Password: password, - Database: dbname, -} -sessionPool := client.NewTableSessionPool(config, 3, 60000, 4000, false) -defer sessionPool.Close() -``` - -* [C# 原生接口](../API/Programming-CSharp-Native-API.md) - -```C# -//树模型 -var session_pool = new SessionPool(host, port, pool_size); - -//表模型 -var tableSessionPool = new TableSessionPool.Builder() - .SetNodeUrls(nodeUrls) - .SetUsername(username) - .SetPassword(password) - .SetFetchSize(1024) - .Build(); -``` - -* [JDBC](../API/Programming-JDBC_timecho.md) - -使用表模型,必须在 url 中指定 sql\_dialect 参数为 table。 - -```Java -// 树模型 -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection connection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667/", username, password); - -// 表模型 -Class.forName("org.apache.iotdb.jdbc.IoTDBDriver"); -Connection connection = DriverManager.getConnection( - "jdbc:iotdb://127.0.0.1:6667?sql_dialect=table", username, password); -``` - -### 2.3 树转表 - -IoTDB 提供了树转表功能,如下图所示: - -![](/img/tree-to-table-1.png) - -该功能支持通过创建表视图的方式,将已存在的树模型数据转化为表视图,进而通过表视图进行查询,实现了对同一份数据的树模型和表模型协同处理。更详细的功能介绍可参考[树转表视图](../../latest-Table/User-Manual/Tree-to-Table_timecho.md),需要注意的是:​**创建树转表视图的 SQL 语句只允许在表模型下执行**​。 - - -## 3. 应用场景 - -应用场景主要包括三类: - -- 场景一:使用树模型进行数据的读写 - -- 场景二:使用表模型进行数据的读写 - -- 场景三:共用一份数据,使用树模型进行数据读写、使用表模型进行数据分析 - -### 3.1 场景一:树模型 - -#### 3.1.1 特点 - -- 简单直观,和物理世界的监测点位一一对应 - -- 类似文件系统一样灵活,可以设计任意分支结构 - -- 适用 DCS、SCADA 等工业监控场景 - -#### 3.1.2 基础概念 - -| **概念** | **定义** | -| -------------------- | ------------------------------------------------------------ | -| **数据库** | 定义:一个以 root. 为前缀的路径
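Naming recommendation: include only the node directly below root, e.g. root.db
Quantity recommendation: the upper limit depends on memory; a single database can already make full use of machine resources, so there is no need to create multiple databases for performance reasons
Creation: manual creation is recommended; a database can also be created automatically when a time series is created (by default, the node directly below root) | -| **Time series (measurement point)** | Definition:
1. A path prefixed with the database path and separated by `.`; it may contain any number of levels, e.g. root.db.turbine.device1.metric1
2. Each time series can have its own data type.
Naming recommendations:
1. Put only the tags that uniquely identify the time series (similar to a composite primary key) into the path, generally no more than 10 levels
2. Place tags with low cardinality (few distinct values) first, so that the system can compress common prefixes
Quantity recommendations:
1. The total number of time series a cluster can manage depends on total memory; see the resource recommendation chapter
2. There is no limit on the number of child nodes at any level
Creation: created manually or automatically when data is written. | -| **Device** | Definition: the second-to-last level is the device; for example, in root.db.turbine.**device1**.metric1 the "device1" level is the device
Creation: a device cannot be created on its own; it exists once its time series are created | - - -#### 3.1.3 Modeling Examples - -##### 3.1.3.1 How do I model a scenario with multiple types of devices? - -- If different types of devices have different hierarchical paths and measurement point sets, create a branch for each device type under the database node. Each device type can have its own measurement point structure. - -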
- -
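
As a rough illustration of one branch per device type, the sketch below generates tree-model `CREATE TIMESERIES` statements for two made-up device types. The type names, device IDs, measurement names and the helper `timeseries_statements` are all hypothetical, and the `WITH DATATYPE=..., ENCODING=PLAIN` form is an assumption that should be checked against the metadata documentation for the IoTDB version in use.

```python
# Hypothetical device types, each with its own branch and measurement set.
DEVICE_TYPES = {
    "turbine": {"speed": "FLOAT", "power": "DOUBLE"},
    "inverter": {"voltage": "FLOAT", "current": "FLOAT", "running": "BOOLEAN"},
}


def timeseries_statements(database, device_type, device_ids, measurements):
    """Build one CREATE TIMESERIES statement per (device, measurement) pair."""
    statements = []
    for device_id in device_ids:
        for name, data_type in measurements.items():
            # Path layout: <database>.<device type branch>.<device>.<measurement>
            path = f"{database}.{device_type}.{device_id}.{name}"
            statements.append(
                f"CREATE TIMESERIES {path} WITH DATATYPE={data_type}, ENCODING=PLAIN"
            )
    return statements


for device_type, measurements in DEVICE_TYPES.items():
    for sql in timeseries_statements("root.db", device_type, ["d01", "d02"], measurements):
        print(sql)
```

Each generated statement could be sent as the `sql` field of a `/rest/v2/nonQuery` request, in the same way as the `CREATE DATABASE` example in the REST API page.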
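- -##### 3.1.3.2 How do I model a scenario with only measurement points and no devices? - -- For example, in a station monitoring system every measurement point has a unique ID but cannot be mapped to a specific device. - -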
- -
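- -##### 3.1.3.3 How do I model a device that has both sub-devices and measurement points? - -- For example, in an energy storage scenario where voltage and current must be monitored at every level of the structure, the following modeling approach can be used. - -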
- -
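- - -### 3.2 Scenario 2: Table Model - -#### 3.2.1 Features - -- Device time series data is modeled and managed as time series tables, which makes analysis with standard SQL convenient - -- Suitable for device data analysis, or for migrating from other databases to IoTDB - -#### 3.2.2 Basic Concepts - -- Database: can manage multiple types of devices - -- Time series table: corresponds to one type of device - -| **Column category** | **Definition** | -| --------------------------- | ------------------------------------------------------------ | -| **Time column (TIME)** | Every time series table must have a time column; its name must be time and its data type TIMESTAMP | -| **Tag column (TAG)** | The unique identifier of a device (composite primary key); there may be zero or more
Tag information cannot be modified or deleted, but tags can be added
Ordering tags from coarse to fine granularity is recommended | -| **Measurement column (FIELD)** | A device can collect one or more measurement points, whose values change over time
There is no limit on the number of measurement columns in a table; it can exceed hundreds of thousands | -| **Attribute column (ATTRIBUTE)** | Supplementary description of a device that **does not change over time**
A device can have zero or more attributes, which can be updated or added
A small number of static attributes that may need to be modified can be stored in this column | - - -Data filtering efficiency: time column = tag column > attribute column > measurement column - -#### 3.2.3 Modeling Examples - -##### 3.2.3.1 How do I model a scenario with multiple types of devices? - -- Create one table for each device type; each table can have its own set of tags and measurement points. -- Even if devices are related or hierarchical, creating one table per device type is still recommended. - -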
- -
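
The one-table-per-device-type advice can be sketched as a small generator of table-model `CREATE TABLE` statements that uses the four column categories described above (TIME, TAG, ATTRIBUTE, FIELD). The table and column names and the helper `create_table_sql` are hypothetical, and the exact DDL accepted by a given IoTDB version should be confirmed in the table-model documentation.

```python
def create_table_sql(table, tags, attributes, fields):
    """Build a CREATE TABLE statement with TIME, TAG, ATTRIBUTE and FIELD columns."""
    # The time column must be named `time` and have type TIMESTAMP.
    columns = ["time TIMESTAMP TIME"]
    # Tags are kept in the given order; coarse-to-fine ordering is recommended.
    columns += [f"{name} STRING TAG" for name in tags]
    columns += [f"{name} STRING ATTRIBUTE" for name in attributes]
    columns += [f"{name} {data_type} FIELD" for name, data_type in fields.items()]
    return f"CREATE TABLE {table} ({', '.join(columns)})"


# Two hypothetical device types, each with its own table, tags and measurement points.
print(create_table_sql("wind_turbine", ["region", "device_id"], ["model"], {"speed": "FLOAT"}))
print(create_table_sql("inverter", ["plant", "device_id"], [], {"voltage": "FLOAT", "current": "FLOAT"}))
```

Keeping the tag list separate per table mirrors the point that different device types may be identified by different tag sets.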
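- -##### 3.2.3.2 How do I model a scenario without device identifier columns or attribute columns? - -- There is no limit on the number of columns; it can exceed hundreds of thousands. - -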
- -
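- -##### 3.2.3.3 How do I model a device that has both sub-devices and measurement points? - -- When each device has several sub-devices as well as its own measurement points, create one table per device type to manage them. - -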
- -
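- -### 3.3 Scenario 3: Combining Both Models - -#### 3.3.1 Features - -- Combines the strengths of the tree model and the table model: both work on a single copy of the data, with flexible writes and rich queries. - -- During data ingestion, tree-model syntax is used, which supports flexible data access and extension. - -- During data analysis, table-model syntax is used, which lets users run complex analysis with standard SQL. - -#### 3.3.2 Modeling Examples - -##### 3.3.2.1 How do I model a scenario with multiple types of devices? - -- Different types of devices in the scenario have different hierarchical paths and measurement point sets. - -- Tree model: create a branch for each device type under the database node; each device type can have its own measurement point structure. - -- Table view: create one table view for each device type; each table view has its own set of tags and measurement points. - -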
- -
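
To make the correspondence between a tree path and a table view row concrete, here is a purely conceptual sketch. It is not how IoTDB builds table views (those are created with SQL under the table model); the path layout `root.<database>.<device_type>.<tag levels>.<measurement>` and the helper `to_view_row` are assumptions made only for illustration.

```python
def to_view_row(path, tag_names):
    """Split a tree path into (table view, tag values, measurement) using a fixed level layout.

    Assumed layout: root.<database>.<device_type>.<tag levels...>.<measurement>
    """
    nodes = path.split(".")
    device_type = nodes[2]    # one table view per device type
    tag_values = nodes[3:-1]  # levels between device type and measurement
    if len(tag_values) != len(tag_names):
        raise ValueError(f"{path} does not match tag layout {tag_names}")
    return device_type, dict(zip(tag_names, tag_values)), nodes[-1]


view, tags, measurement = to_view_row("root.db.turbine.plant1.t01.speed", ["plant", "device_id"])
print(view, tags, measurement)
```

Under this assumed layout, writes keep using the full tree path, while analysis addresses the same series through the view name, its tag columns and its measurement column.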
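- -##### 3.3.2.2 How do I model a scenario without device identifier columns or attribute columns? - -- Tree model: every measurement point has a unique ID but cannot be mapped to a specific device. - -- Table view: put all measurement points into one table; there is no limit on the number of measurement columns, which can exceed hundreds of thousands. If the measurement points share the same data type, they can be treated as one type of device. - -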
- -
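-
-##### 3.3.2.3 How to model when a device has both sub-devices and measurement points?
-
-- Tree model: model every level of the structure according to the monitoring points in the physical world.
-
-- Table view: create multiple tables by device category to manage the information of each level.
-
-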
- -
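The idea behind the combined model (write with tree paths, read as table rows) can be illustrated with a short Python sketch. It is not IoTDB's view mechanism: the function `tree_points_to_rows`, the branch `root.db.turbine`, and the tag name `device_id` are hypothetical, and the sketch assumes that the levels between the device-type branch and the leaf map one-to-one to tag columns.

```python
def tree_points_to_rows(points, branch, tag_names):
    """Group tree-model points under one device-type branch into table-like rows.

    points: iterable of (path, timestamp, value) tuples.
    branch: path of the device-type branch, e.g. "root.db.turbine".
    tag_names: names for the levels between the branch and the leaf.
    """
    prefix = branch.split(".")
    rows = {}
    for path, timestamp, value in points:
        nodes = path.split(".")
        if nodes[: len(prefix)] != prefix:
            continue  # the point belongs to another device type
        tags = nodes[len(prefix):-1]
        if len(tags) != len(tag_names):
            raise ValueError(f"{path} does not match the expected tag levels")
        key = (timestamp, *tags)
        row = rows.setdefault(key, {"time": timestamp, **dict(zip(tag_names, tags))})
        row[nodes[-1]] = value  # the leaf level becomes a field column
    return list(rows.values())


points = [
    ("root.db.turbine.device1.speed", 1000, 12.5),
    ("root.db.turbine.device1.power", 1000, 300.0),
    ("root.db.turbine.device2.speed", 1000, 11.0),
    ("root.db.pump.p1.pressure", 1000, 2.1),  # other device type, skipped
]
rows = tree_points_to_rows(points, "root.db.turbine", ["device_id"])
for row in rows:
    print(row)
```

Each device type gets its own call (its own "view"), which mirrors the recommendation of one table view per device type.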
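diff --git a/src/zh/UserGuide/latest/Background-knowledge/Navigating_Time_Series_Data_timecho.md b/src/zh/UserGuide/latest/Background-knowledge/Navigating_Time_Series_Data_timecho.md
deleted file mode 100644
index f537b0e63..000000000
--- a/src/zh/UserGuide/latest/Background-knowledge/Navigating_Time_Series_Data_timecho.md
+++ /dev/null
@@ -1,70 +0,0 @@
-
-# Time Series Data Model
-
-## 1. What Is Time Series Data?
-
-In today's interconnected world, IoT, industrial, and many other scenarios are undergoing digital transformation. Sensors installed on all kinds of devices collect their states: voltage and current for motors; blade speed, angular velocity, and generated power for wind turbines; longitude, latitude, speed, and fuel consumption for vehicles; vibration frequency, deflection, and displacement for bridges. Sensor data collection now reaches into every industry.
-
-![](/img/%E6%97%B6%E5%BA%8F%E6%95%B0%E6%8D%AE%E4%BB%8B%E7%BB%8D.png)
-
-
-
-Generally, each collection point is called a **measurement point (also known as a physical quantity, time series, timeline, signal, metric, or measured value)**. Each measurement point keeps collecting new data over time, forming a **time series**. As a table, a time series has two columns, time and value; as a chart, it is a trend line over time, which can be pictured as the device's "electrocardiogram".
-
-![](/img/%E5%BF%83%E7%94%B5%E5%9B%BE1.png)
-
-The massive time series data produced by sensors is the foundation of digital transformation across industries, so our modeling of time series data centers on devices and sensors.
-
-## 2. What Are the Key Concepts in Time Series Data?
-
-From bottom to top, the main concepts in time series data are: data point, measurement point, and device.
-
-![](/img/%E7%99%BD%E6%9D%BF.png)
-
-### 2.1 Data Point
-
-- Definition: consists of a timestamp and a value. The timestamp is of type long; the value can be of various types such as BOOLEAN, FLOAT, or INT32.
-- Example: one row of the time series in table form, or one point of the time series in chart form in the figure above, is a data point.
-
-![](/img/%E6%95%B0%E6%8D%AE%E7%82%B9.png)
-
-### 2.2 Measurement Point
-
-- Definition: a time series formed by multiple data points arranged in increasing timestamp order. A measurement point usually represents one collection point that periodically samples a physical quantity of its environment.
-- Also known as: physical quantity, time series, timeline, signal, metric, measured value, etc.
-- Examples:
-  - Power: current, voltage
-  - Energy: wind speed, rotation speed
-  - Connected vehicles: fuel level, vehicle speed, longitude, latitude
-  - Factory: temperature, humidity
-
-- _In the tree model, the **number of measurement points** equals the number of leaf nodes under the whole path pattern; for the counting method, see_ [Counting Time Series](../Basic-Concept/Operate-Metadata_timecho.md#_2-7-统计时间序列总数)
-
-
-### 2.3 Device
-
-- Definition: corresponds to a physical device in a real scenario, usually a set of measurement points, identified by one or more tags
-- Examples
-  - Connected vehicles: a vehicle, identified by its vehicle identification number (VIN)
-  - Factory: a robotic arm, identified by a unique ID generated by the IoT platform
-  - Energy: a wind turbine, identified by region, station, line, model, instance, etc.
-  - Monitoring: a CPU, identified by machine room, rack, hostname, device type, etc.
\ No newline at end of file
diff --git a/src/zh/UserGuide/latest/Basic-Concept/Operate-Metadata_timecho.md b/src/zh/UserGuide/latest/Basic-Concept/Operate-Metadata_timecho.md
deleted file mode 100644
index 55fb26b33..000000000
--- a/src/zh/UserGuide/latest/Basic-Concept/Operate-Metadata_timecho.md
+++ /dev/null
@@ -1,1278 +0,0 @@
-
-
-# Measurement Point Management
-
-## 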
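1. Database Management
-
-A database (Database) can be regarded as the counterpart of a Database in a relational database.
-
-### 1.1 Creating a Database
-
-We can create a database according to the storage model, as follows:
-
-```sql
-CREATE DATABASE root.ln;
-```
-
-Note that creating a single database is recommended.
-
-Neither an ancestor nor a descendant node of a database can itself be set as a database.
-
-For example, when the databases `root.ln` and `root.sgcc` already exist, creating the database `root.ln.wf01` is not allowed. The system returns the corresponding error, as follows:
-
-```sql
-CREATE DATABASE root.ln.wf01;
-Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 501: root.ln has already been created as database
-```
-Likewise, when the database `root.db.test` already exists, creating the database `root.db` is not allowed either. The system also returns the corresponding error, as follows:
-
-```sql
-CREATE DATABASE root.db;
-Msg: org.apache.iotdb.jdbc.IoTDBSQLException: 529: some children of root.db have already been created as database
-```
-
-Naming rules for database node names:
-1. A node name can consist of **Chinese and English characters, digits, underscores (\_), periods (.), and backquotes (\`)**
-2. The whole name must be wrapped in **backquotes (\`)** in the following cases.
-   - Pure digits (e.g. 12345)
-   - Names that contain special characters (such as . or \_) and may be ambiguous (e.g. db.01, \_temp)
-3. Special handling of backquotes:
-   If the node name itself needs to contain a backquote (\`), use **two backquotes (\`\`)** to represent one backquote. For example, a node named \`db123\`\` (which itself contains one backquote) must be written as \`db123\`\`\`.
-
-Also note that when deployed on Windows or macOS, database names are case-insensitive. For example, creating both `root.ln` and `root.LN` is not allowed.
-
-### 1.2 Viewing Databases
-
-After databases are created, we can use the [SHOW DATABASES](../SQL-Manual/SQL-Manual.md#查看数据库) statement and [SHOW DATABASES \<PathPattern>](../SQL-Manual/SQL-Manual.md#查看数据库) to view them. The SQL statements are as follows:
-
-```sql
-show databases;
-show databases root.*;
-show databases root.**;
-```
-
-The result is:
-
-```shell
-+-------------+----+-------------------------+-----------------------+-----------------------+
-| database| ttl|schema_replication_factor|data_replication_factor|time_partition_interval|
-+-------------+----+-------------------------+-----------------------+-----------------------+
-| root.sgcc|null| 2| 2| 604800|
-| root.ln|null| 2| 2| 604800|
-+-------------+----+-------------------------+-----------------------+-----------------------+
-Total line number = 2
-It costs 0.060s
-```
-
-### 1.3 Deleting Databases
-
-The `DELETE DATABASE <PathPattern>` statement deletes all databases that match the path pattern. Note that the data in those databases is deleted as well.
-
-```sql
-DELETE DATABASE root.ln;
-DELETE DATABASE root.sgcc;
-// Delete all data, all time series, and all databases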
-DELETE DATABASE root.**; -``` - -### 1.4 统计数据库数量 - -用户可以使用`COUNT DATABASES `语句统计数据库的数量,允许指定`PathPattern` 用来统计匹配该`PathPattern` 的数据库的数量 - -SQL 语句如下所示: - -```sql -show databases; -count databases; -count databases root.*; -count databases root.sgcc.*; -count databases root.sgcc; -``` - -执行结果为: - -```shell -+-------------+ -| database| -+-------------+ -| root.sgcc| -| root.turbine| -| root.ln| -+-------------+ -Total line number = 3 -It costs 0.003s - -+-------------+ -| Database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.003s - -+-------------+ -| Database| -+-------------+ -| 3| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| Database| -+-------------+ -| 0| -+-------------+ -Total line number = 1 -It costs 0.002s - -+-------------+ -| database| -+-------------+ -| 1| -+-------------+ -Total line number = 1 -It costs 0.002s -``` - -### 1.5 数据保留时间(TTL) - -IoTDB 支持对设备(device)级别设置数据保留时间(TTL),允许系统自动定期删除旧数据,以有效控制磁盘空间并维护高性能查询和低内存占用。TTL 默认以毫秒为单位,数据过期后不可查询且禁止写入,但物理删除会延迟至压缩时。需注意,TTL 变更可能导致短暂数据可查询性变化,且若调小或解除 TTL,之前因 TTL 不可见的数据可能重新出现。 - -注意事项: -- TTL 设置为毫秒,不受配置文件时间精度影响。 -- TTL 变更可能影响数据的可查询性。 -- 系统最终会移除过期数据,但存在延迟。 -- TTL 判断数据是否过期依据的是数据点时间,非写入时间。 -- 系统最多支持设置 1000 条 TTL 规则,达到上限需先删除部分规则才能设置新规则。 - -#### TTL Path 规则 -设置的路径 path 只支持前缀路径(即路径中间不能带 \* , 且必须以 \*\* 结尾),该路径会匹配到设备,也允许用户指定不带星的 path 为具体的 database 或 device,当 path 不带 \* 时,会检查是否匹配到 database,若匹配到 database,则会同时设置 path 和 path.\*\*。 -注意:设备 TTL 设置不会对元数据的存在性进行校验,即允许对一条不存在的设备设置 TTL。 -```shell -合格的 path: -root.** -root.db.** -root.db.group1.** -root.db -root.db.group1.d1 - -不合格的 path: -root.*.db -root.**.db.* -root.db.* -``` -#### TTL 适用规则 -当一个设备适用多条TTL规则时,优先适用较精确和较长的规则。例如对于设备“root.bj.hd.dist001.turbine001”来说,规则“root.bj.hd.dist001.turbine001”比“root.bj.hd.dist001.\*\*”优先,而规则“root.bj.hd.dist001.\*\*”比“root.bj.hd.\*\*”优先; -#### 设置 TTL -set ttl 操作可以理解为设置一条 TTL规则,比如 set ttl to root.sg.group1.\*\* 就相当于对所有可以匹配到该路径模式的设备挂载 ttl。 unset ttl 操作表示对相应路径模式卸载 TTL,若不存在对应 TTL,则不做任何事。若想把 TTL 
调成无限大,则可以使用 INF 关键字 -设置 TTL 的 SQL 语句如下所示: -```sql -set ttl to pathPattern 360000; -``` -pathPattern 是前缀路径,即路径中间不能带 \* 且必须以 \*\* 结尾。 -pathPattern 匹配对应的设备。为了兼容老版本 SQL 语法,允许用户输入的 pathPattern 匹配到 db,则自动将前缀路径扩展为 path.\*\*。 -例如,写set ttl to root.sg 360000 则会自动转化为set ttl to root.sg.\*\* 360000,转化后的语句对所有 root.sg 下的 device 设置TTL。 -但若写的 pathPattern 无法匹配到 db,则上述逻辑不会生效。 -如写set ttl to root.sg.group 360000 ,由于root.sg.group未匹配到 db,则不会被扩充为root.sg.group.\*\*。 也允许指定具体 device,不带 \*。 -#### 取消 TTL - -取消 TTL 的 SQL 语句如下所示: - -```sql -unset ttl from root.ln; -``` - -取消设置 TTL 后, `root.ln` 路径下所有的数据都会被保存。 -```sql -unset ttl from root.sgcc.**; -``` - -取消设置`root.sgcc`路径下的所有的 TTL 。 -```sql -unset ttl from root.**; -``` - -取消设置所有的 TTL 。 - -新语法 -```sql -unset ttl from root.**; -``` - -旧语法 -```sql -unset ttl to root.**; -``` -新旧语法在功能上没有区别并且同时兼容,仅是新语法在用词上更符合常规。 -#### 显示 TTL - -显示 TTL 的 SQL 语句如下所示: -show all ttl - -```sql -SHOW ALL TTL; -``` -```shell -+--------------+--------+ -| path| TTL| -| root.**|55555555| -| root.sg2.a.**|44440000| -+--------------+--------+ -``` - -show ttl on pathPattern -```sql -SHOW TTL ON root.db.**; -``` -```shell -+--------------+--------+ -| path| TTL| -| root.db.**|55555555| -| root.db.a.**|44440000| -+--------------+--------+ -``` -SHOW ALL TTL 这个例子会给出所有的 TTL。 -SHOW TTL ON pathPattern 这个例子会显示指定路径的 TTL。 - -显示设备的 TTL。 -```sql -show devices; -``` -```shell -+---------------+---------+---------+ -| Device|IsAligned| TTL| -+---------------+---------+---------+ -|root.sg.device1| false| 36000000| -|root.sg.device2| true| INF| -+---------------+---------+---------+ -``` -所有设备都一定会有 TTL,即不可能是 null。INF 表示无穷大。 - - -### 1.6 设置异构数据库(进阶操作) - -在熟悉 IoTDB 元数据建模的前提下,用户可以在 IoTDB 中设置异构的数据库,以便应对不同的生产需求。 - -目前支持的数据库异构参数有: - -| 参数名 | 参数类型 | 参数描述 | -|---------------------------|---------|---------------------------| -| TTL | Long | 数据库的 TTL | -| SCHEMA_REPLICATION_FACTOR | Integer | 数据库的元数据副本数 | -| DATA_REPLICATION_FACTOR | Integer | 数据库的数据副本数 | -| SCHEMA_REGION_GROUP_NUM | Integer | 数据库的 
SchemaRegionGroup 数量 | -| DATA_REGION_GROUP_NUM | Integer | 数据库的 DataRegionGroup 数量 | - -用户在配置异构参数时需要注意以下三点: -+ TTL 和 TIME_PARTITION_INTERVAL 必须为正整数。 -+ SCHEMA_REPLICATION_FACTOR 和 DATA_REPLICATION_FACTOR 必须小于等于已部署的 DataNode 数量。 -+ SCHEMA_REGION_GROUP_NUM 和 DATA_REGION_GROUP_NUM 的功能与 iotdb-common.properties 配置文件中的 -`schema_region_group_extension_policy` 和 `data_region_group_extension_policy` 参数相关,以 DATA_REGION_GROUP_NUM 为例: -若设置 `data_region_group_extension_policy=CUSTOM`,则 DATA_REGION_GROUP_NUM 将作为 Database 拥有的 DataRegionGroup 的数量; -若设置 `data_region_group_extension_policy=AUTO`,则 DATA_REGION_GROUP_NUM 将作为 Database 拥有的 DataRegionGroup 的配额下界,即当该 Database 开始写入数据时,将至少拥有此数量的 DataRegionGroup。 - -用户可以在创建 Database 时设置任意异构参数,或在单机/分布式 IoTDB 运行时调整部分异构参数。 - -#### 创建 Database 时设置异构参数 - -用户可以在创建 Database 时设置上述任意异构参数,SQL 语句如下所示: - -```sql -CREATE DATABASE prefixPath (WITH databaseAttributeClause (COMMA? databaseAttributeClause)*)? -``` - -例如: -```sql -CREATE DATABASE root.db WITH SCHEMA_REPLICATION_FACTOR=1, DATA_REPLICATION_FACTOR=3, SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -#### 运行时调整异构参数 - -用户可以在 IoTDB 运行时调整部分异构参数,SQL 语句如下所示: - -```sql -ALTER DATABASE prefixPath WITH databaseAttributeClause (COMMA? databaseAttributeClause)* -``` - -例如: -```sql -ALTER DATABASE root.db WITH SCHEMA_REGION_GROUP_NUM=1, DATA_REGION_GROUP_NUM=2; -``` - -注意,运行时只能调整下列异构参数: -+ SCHEMA_REGION_GROUP_NUM -+ DATA_REGION_GROUP_NUM - -#### 查看异构数据库 - -用户可以查询每个 Database 的具体异构配置,SQL 语句如下所示: - -```sql -SHOW DATABASES DETAILS prefixPath? 
-``` - -例如: - -```sql -SHOW DATABASES DETAILS; -``` -```shell -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|Database| TTL|SchemaReplicationFactor|DataReplicationFactor|TimePartitionInterval|SchemaRegionGroupNum|MinSchemaRegionGroupNum|MaxSchemaRegionGroupNum|DataRegionGroupNum|MinDataRegionGroupNum|MaxDataRegionGroupNum| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -|root.db1| null| 1| 3| 604800000| 0| 1| 1| 0| 2| 2| -|root.db2|86400000| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -|root.db3| null| 1| 1| 604800000| 0| 1| 1| 0| 2| 2| -+--------+--------+-----------------------+---------------------+---------------------+--------------------+-----------------------+-----------------------+------------------+---------------------+---------------------+ -Total line number = 3 -It costs 0.058s -``` - -各列查询结果依次为: -+ 数据库名称 -+ 数据库的 TTL -+ 数据库的元数据副本数 -+ 数据库的数据副本数 -+ 数据库的时间分区间隔 -+ 数据库当前拥有的 SchemaRegionGroup 数量 -+ 数据库需要拥有的最小 SchemaRegionGroup 数量 -+ 数据库允许拥有的最大 SchemaRegionGroup 数量 -+ 数据库当前拥有的 DataRegionGroup 数量 -+ 数据库需要拥有的最小 DataRegionGroup 数量 -+ 数据库允许拥有的最大 DataRegionGroup 数量 - - -## 2. 
时间序列管理 - -### 2.1 创建时间序列 - -根据建立的数据模型,我们可以分别在两个数据库中创建相应的时间序列。创建时间序列的 SQL 语句如下所示: - -```sql -create timeseries root.ln.wf01.wt01.status with datatype=BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature with datatype=FLOAT; -create timeseries root.ln.wf02.wt02.hardware with datatype=TEXT; -create timeseries root.ln.wf02.wt02.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature with datatype=FLOAT; -``` - -从 v0.13 起,可以使用简化版的 SQL 语句创建时间序列: - -```sql -create timeseries root.ln.wf01.wt01.status BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature FLOAT; -create timeseries root.ln.wf02.wt02.hardware TEXT; -create timeseries root.ln.wf02.wt02.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature FLOAT; -``` - -创建时间序列时,系统会默认指定编码压缩方式,无需手动指定,若业务场景需要手动调整,可参考如下示例: -```sql -create timeseries root.sgcc.wf03.wt01.temperature FLOAT encoding=PLAIN compressor=SNAPPY; -``` - -需要注意的是,如果手动指定了编码方式,但与数据类型不对应时,系统会给出相应的错误提示,如下所示: -```sql -create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF; -error: encoding TS_2DIFF does not support BOOLEAN -``` - -更多详细的数据类型与编码压缩方式的对应列表请参见 [压缩&编码](../Technical-Insider/Encoding-and-Compression.md)。 - - -### 2.2 创建对齐时间序列 - -创建一组对齐时间序列的SQL语句如下所示: - -```sql -CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT, longitude FLOAT); -``` - -一组对齐序列中的序列可以有不同的数据类型、编码方式以及压缩方式。 - -对齐的时间序列也支持设置别名、标签、属性。 - - -### 2.3 修改时间序列数据类型 - -自 V2.0.8.2 版本起,支持通过 SQL 语句修改时间序列的数据类型。 - -语法定义: - -```SQL -ALTER TIMESERIES fullPath SET DATA TYPE newType=type -``` - -说明: - -* 变更过程中若该时间序列被并发删除,会报错提示。 -* 变更后的时间序列类型需要与原类型兼容,具体兼容性如下表所示: - -| 原始类型 | 可变更为类型 | -| ----------- | ----------------------------------------------- | -| INT32 | INT64, FLOAT, DOUBLE, TIMESTAMP, STRING, TEXT | -| INT64 | TIMESTAMP, DOUBLE, STRING, TEXT | -| FLOAT | DOUBLE, STRING, TEXT | -| DOUBLE | STRING, TEXT | 
-| BOOLEAN | STRING, TEXT | -| TEXT | BLOB, STRING | -| STRING | TEXT, BLOB | -| BLOB | STRING, TEXT | -| DATE | STRING, TEXT | -| TIMESTAMP | INT64, DOUBLE, STRING, TEXT | - -使用示例: - -```SQL -ALTER TIMESERIES root.ln.wf01.wt01.temperature set data type DOUBLE; -``` - -### 2.4 修改时间序列名称 - -自 V2.0.8.2 版本起,支持通过 SQL 语句修改时间序列的全路径名称。修改成功后,原有名称作废,但仍在元数据的存储中。 - -语法定义: - -```SQL --- 支持将某个序列的全路径修改为另一全路径 -ALTER TIMESERIES RENAME TO -``` - -使用说明: - -* 该语句执行成功后将立即生效,原序列的 tag/attribute/alias 将迁移到新序列。 -* 作废序列(原序列)不再支持写入、查询、删除等操作。作废后的序列名称会被系统保留,不允许创建同名新序列,以此确保原序列名称唯一可追溯:支持通过 SHOW INVALID TIMESERIES 语句查看原序列,避免因频繁修改导致原序列信息丢失,大幅提升数据溯源与问题定位效率。 -* 新序列支持创建视图,原序列不再支持创建视图。修改新序列的编码压缩、序列类型、标签、属性、别名时,不会连带修改原序列;删除新序列时,会连带修改原序列。 -* 新序列路径或目标设备下原序列别名已存在时(包括真实序列、view、作废序列及其别名),系统会报错提示。 - -使用示例: - -```SQL -ALTER TIMESERIES root.ln.wf01.wt01.temperature RENAME TO root.newln.newwf.newwt.temperature; -``` - - -### 2.5 删除时间序列 - -我们可以使用`(DELETE | DROP) TimeSeries `语句来删除我们之前创建的时间序列。SQL 语句如下所示: - -```sql -delete timeseries root.ln.wf01.wt01.status; -delete timeseries root.ln.wf01.wt01.temperature, root.ln.wf02.wt02.hardware; -delete timeseries root.ln.wf02.*; -drop timeseries root.ln.wf02.*; -``` - -### 2.6 查看时间序列 - -* SHOW LATEST? TIMESERIES pathPattern? timeseriesWhereClause? limitClause? 
- - SHOW TIMESERIES 中可以有四种可选的子句,查询结果为这些时间序列的所有信息 - -时间序列信息具体包括:时间序列路径名,database,Measurement 别名,数据类型,编码方式,压缩方式,属性和标签。 - -示例: - -* SHOW TIMESERIES - - 展示系统中所有的时间序列信息 - -* SHOW TIMESERIES <`Path`> - - 返回给定路径的下的所有时间序列信息。其中 `Path` 需要为一个时间序列路径或路径模式。例如,分别查看`root`路径和`root.ln`路径下的时间序列,SQL 语句如下所示: - -```sql -show timeseries root.**; -show timeseries root.ln.**; -``` - -执行结果分别为: - -```shell -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 7 -It costs 0.016s - 
-+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression|tags|attributes|deadband|deadband parameters| -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|null| null| null| null| -| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -|root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY|null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|null| null| null| null| -+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+ -Total line number = 4 -It costs 0.004s -``` - -* SHOW TIMESERIES LIMIT INT OFFSET INT - - 只返回从指定下标开始的结果,最大返回条数被 LIMIT 限制,用于分页查询。例如: - -```sql -show timeseries root.ln.** limit 10 offset 10; -``` - -* SHOW TIMESERIES WHERE TIMESERIES contains 'containStr' - - 对查询结果集根据 timeseries 名称进行字符串模糊匹配过滤。例如: - -```sql -show timeseries root.ln.** where timeseries contains 'wf01.wt'; -``` - -执行结果为: - -```shell -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null| 
-+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 2 -It costs 0.016s -``` - -* SHOW TIMESERIES WHERE DataType=type - - 对查询结果集根据时间序列数据类型进行过滤。例如: - -```sql -show timeseries root.ln.** where dataType=FLOAT; -``` - -执行结果为: - -```shell -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null| -| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null| -| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null| -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -Total line number = 3 -It costs 0.016s - -``` - -* SHOW TIMESERIES WHERE TAGS(KEY) = VALUE -* SHOW TIMESERIES WHERE TAGS(KEY) CONTAINS VALUE - - 对查询结果集根据标签进行过滤。例如: - -```sql -show timeseries root.ln.** where TAGS(unit)='c'; -show timeseries root.ln.** where TAGS(description) contains 'test1'; -``` - -执行结果分别为: - -```shell 
-+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s - -``` - - -* SHOW LATEST TIMESERIES - - 表示查询出的时间序列需要按照最近插入时间戳降序排列 - -需要注意的是,当查询路径不存在时,系统会返回 0 条时间序列。 - -* SHOW INVALID TIMESERIES - -自 V2.0.8.2 版本起,支持该 SQL 语句,用于展示**修改全路径名称**成功后的作废时间序列。 - -```SQL -IoTDB> show invalid timeSeries -+-----------------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+----------------------------------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| NewPath| 
-+-----------------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+----------------------------------+ -|root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| GORILLA| LZ4|null| null| null| null| BASE|root.newln.newwf.newwt.temperature| -+-----------------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+----------------------------------+ -``` - -说明:返回结果中的最后一列 `NewPath`,展示作废序列对应的新序列,以服务于视图构建、集群迁移(Load+改名)等场景。 - -### 2.7 统计时间序列总数 - -IoTDB 支持使用`COUNT TIMESERIES`来统计一条路径中的时间序列个数。SQL 语句如下所示: - -* 可以通过 `WHERE` 条件对时间序列名称进行字符串模糊匹配,语法为: `COUNT TIMESERIES WHERE TIMESERIES contains 'containStr'` 。 -* 可以通过 `WHERE` 条件对时间序列数据类型进行过滤,语法为: `COUNT TIMESERIES WHERE DataType='`。 -* 可以通过 `WHERE` 条件对标签点进行过滤,语法为: `COUNT TIMESERIES WHERE TAGS(key)='value'` 或 `COUNT TIMESERIES WHERE TAGS(key) contains 'value'`。 -* 可以通过定义`LEVEL`来统计指定层级下的时间序列个数。这条语句可以用来统计每一个设备下的传感器数量,语法为:`COUNT TIMESERIES GROUP BY LEVEL=`。 - -```sql -COUNT TIMESERIES root.**; -COUNT TIMESERIES root.ln.**; -COUNT TIMESERIES root.ln.*.*.status; -COUNT TIMESERIES root.ln.wf01.wt01.status; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' ; -COUNT TIMESERIES root.** WHERE DATATYPE = INT64; -COUNT TIMESERIES root.** WHERE TAGS(unit) contains 'c' ; -COUNT TIMESERIES root.** WHERE TAGS(unit) = 'c' ; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' group by level = 1; -``` - -例如有如下时间序列(可以使用`show timeseries`展示所有时间序列): - -```shell -+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+ -| timeseries| alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| 
-+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+
-|root.sgcc.wf03.wt01.temperature| null| root.sgcc| FLOAT| RLE| SNAPPY| null| null| null| null|
-| root.sgcc.wf03.wt01.status| null| root.sgcc| BOOLEAN| PLAIN| SNAPPY| null| null| null| null|
-| root.turbine.d1.s1|newAlias| root.turbine| FLOAT| RLE| SNAPPY|{"newTag1":"newV1","tag4":"v4","tag3":"v3"}|{"attr2":"v2","attr1":"newV1","attr4":"v4","attr3":"v3"}| null| null|
-| root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY| {"unit":"c"}| null| null| null|
-| root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| {"description":"test1"}| null| null| null|
-| root.ln.wf01.wt01.temperature| null| root.ln| FLOAT| RLE| SNAPPY| null| null| null| null|
-| root.ln.wf01.wt01.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY| null| null| null| null|
-+-------------------------------+--------+-------------+--------+--------+-----------+-------------------------------------------+--------------------------------------------------------+--------+-------------------+
-Total line number = 7
-It costs 0.004s
-```
-
-The Metadata Tree then looks as follows:
-
-
-
-As shown, `root` is defined as `LEVEL=0`. So when you enter the following statements:
-
-```sql
-COUNT TIMESERIES root.** GROUP BY LEVEL=1;
-COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2;
-COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2;
-```
-
-you will get the following results:
-
-```sql
-COUNT TIMESERIES root.** GROUP BY LEVEL=1;
-COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2;
-COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2;
-```
-```shell
-+------------+-----------------+
-| column|count(timeseries)|
-+------------+-----------------+
-| root.sgcc| 2|
-|root.turbine| 1|
-| root.ln| 4|
-+------------+-----------------+
-Total line number = 3
-It costs 0.002s
-
-+------------+-----------------+
-| column|count(timeseries)|
-+------------+-----------------+
-|root.ln.wf02| 2|
-|root.ln.wf01| 2|
-+------------+-----------------+ -Total line number = 2 -It costs 0.002s - -+------------+-----------------+ -| column|count(timeseries)| -+------------+-----------------+ -|root.ln.wf01| 2| -+------------+-----------------+ -Total line number = 1 -It costs 0.002s -``` - -> 注意:时间序列的路径只是过滤条件,与 level 的定义无关。 - -### 2.8 活跃时间序列查询 -我们在原有的时间序列查询和统计上添加新的WHERE时间过滤条件,可以得到在指定时间范围中存在数据的时间序列。 - -需要注意的是, 在带有时间过滤的元数据查询中并不考虑视图的存在,只考虑TsFile中实际存储的时间序列。 - -一个使用样例如下: -```sql -insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -show timeseries; -show timeseries where time >= 15000 and time < 16000; -count timeseries where time >= 15000 and time < 16000; -``` -```shell -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data3.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE| -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| 
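Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType|
-+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+
-| root.sg.data.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE|
-| root.sg.data.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE|
-|root.sg.data2.s1| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE|
-|root.sg.data2.s2| null| root.sg| FLOAT| GORILLA| LZ4|null| null| null| null| BASE|
-+----------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+
-
-+-----------------+
-|count(timeseries)|
-+-----------------+
-| 4|
-+-----------------+
-```
-As for the definition of an active time series: data that can be returned by a normal query is considered active, so time series whose data was inserted but later deleted are not counted.
-
-### 2.9 Tag Management
-
-When creating a time series, we can add an alias as well as extra tag and attribute information to it.
-
-The difference between tags and attributes is:
-
-* Tags can be used to look up time series paths; an inverted index from tag to time series path is maintained in memory: tag -> time series path
-* Attributes can only be looked up by time series path: time series path -> attribute
-
-The extended SQL statement for creating a time series is as follows:
-```sql
-create timeseries root.turbine.d1.s1(temprature) with datatype=FLOAT, encoding=RLE, compression=SNAPPY tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2);
-```
-
-The `temprature` in parentheses is the alias of the sensor `s1`.
-Wherever `s1` is used, it can be replaced with `temprature`; the two are equivalent.
-
-> IoTDB also supports setting an alias with the AS function in query statements. The difference is that an alias set with AS replaces the whole time series name and is temporary, not bound to the time series, while the alias above is only an alias of the sensor, is bound to it, and can be used interchangeably with the original sensor name.
-
-> Note: the total size of the extra tag and attribute information must not exceed `tag_attribute_total_size`.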
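The difference between tags and attributes described above comes down to which direction can be looked up. The following Python sketch illustrates it under stated assumptions: the `TagIndex` class and its methods are hypothetical and only model the two mappings (tag -> time series path, time series path -> attribute); they are not IoTDB's actual data structures.

```python
from collections import defaultdict


class TagIndex:
    """Toy model: tags get an inverted index, attributes are stored per path."""

    def __init__(self):
        self.inverted = defaultdict(set)  # (tag key, tag value) -> set of paths
        self.attributes = {}              # path -> {attribute key: value}

    def register(self, path, tags=None, attributes=None):
        for key, value in (tags or {}).items():
            self.inverted[(key, value)].add(path)
        self.attributes[path] = dict(attributes or {})

    def paths_with_tag(self, key, value):
        # Tag lookup: from tag to time series paths.
        return sorted(self.inverted.get((key, value), set()))

    def attributes_of(self, path):
        # Attribute lookup: only possible when the path is already known.
        return self.attributes.get(path, {})


index = TagIndex()
index.register(
    "root.turbine.d1.s1",
    tags={"tag1": "v1", "tag2": "v2"},
    attributes={"attr1": "v1", "attr2": "v2"},
)
index.register("root.turbine.d2.s1", tags={"tag1": "v1"})

print(index.paths_with_tag("tag1", "v1"))
print(index.attributes_of("root.turbine.d1.s1"))
```

There is deliberately no method that goes from an attribute value to paths, which is why filter conditions can only refer to tags, not attributes.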
- - * 标签点属性更新 -创建时间序列后,我们也可以对其原有的标签点属性进行更新,主要有以下六种更新方式: -* 重命名标签或属性 -```sql -ALTER timeseries root.turbine.d1.s1 RENAME tag1 TO newTag1; -``` -* 重新设置标签或属性的值 -```sql -ALTER timeseries root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1; -``` -* 删除已经存在的标签或属性 -```sql -ALTER timeseries root.turbine.d1.s1 DROP tag1, tag2; -``` -* 添加新的标签 -```sql -ALTER timeseries root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4; -``` -* 添加新的属性 -```sql -ALTER timeseries root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4; -``` -* 更新插入别名,标签和属性 -> 如果该别名,标签或属性原来不存在,则插入,否则,用新值更新原来的旧值 -```sql -ALTER timeseries root.turbine.d1.s1 UPSERT ALIAS=newAlias TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4); -``` - -* 使用标签作为过滤条件查询时间序列,使用 TAGS(tagKey) 来标识作为过滤条件的标签 -```sql -SHOW TIMESERIES (<`PathPattern`>)? timeseriesWhereClause -``` - -返回给定路径的下的所有满足条件的时间序列信息,SQL 语句如下所示: - -```sql -ALTER timeseries root.ln.wf02.wt02.hardware ADD TAGS unit=c; -ALTER timeseries root.ln.wf02.wt02.status ADD TAGS description=test1; -show timeseries root.ln.** where TAGS(unit)='c'; -show timeseries root.ln.** where TAGS(description) contains 'test1'; -``` - -执行结果分别为: - -```shell -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags|attributes|deadband|deadband parameters| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.hardware| null| root.ln| TEXT| PLAIN| SNAPPY|{"unit":"c"}| null| null| null| -+--------------------------+-----+-------------+--------+--------+-----------+------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.005s - -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| 
tags|attributes|deadband|deadband parameters| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -|root.ln.wf02.wt02.status| null| root.ln| BOOLEAN| PLAIN| SNAPPY|{"description":"test1"}| null| null| null| -+------------------------+-----+-------------+--------+--------+-----------+-----------------------+----------+--------+-------------------+ -Total line number = 1 -It costs 0.004s -``` - -- 使用标签作为过滤条件统计时间序列数量 - -```sql -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause; -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause GROUP BY LEVEL=; -``` - -返回给定路径的下的所有满足条件的时间序列的数量,SQL 语句如下所示: - -```sql -count timeseries; -count timeseries root.** where TAGS(unit)='c'; -count timeseries root.** where TAGS(unit)='c' group by level = 2; -``` - -执行结果分别为: - -```shell -+-----------------+ -|count(timeseries)| -+-----------------+ -| 6| -+-----------------+ -Total line number = 1 -It costs 0.019s - -+-----------------+ -|count(timeseries)| -+-----------------+ -| 2| -+-----------------+ -Total line number = 1 -It costs 0.020s - -+--------------+-----------------+ -| column|count(timeseries)| -+--------------+-----------------+ -| root.ln.wf02| 2| -| root.ln.wf01| 0| -|root.sgcc.wf03| 0| -+--------------+-----------------+ -Total line number = 3 -It costs 0.011s -``` - -> 注意,现在我们只支持一个查询条件,要么是等值条件查询,要么是包含条件查询。当然 where 子句中涉及的必须是标签值,而不能是属性值。 - -创建对齐时间序列 - -```sql -create aligned timeseries root.sg1.d1(s1 INT32 tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2), s2 DOUBLE tags(tag3=v3, tag4=v4) attributes(attr3=v3, attr4=v4)); -``` - -执行结果如下: - -```sql -show timeseries; -``` -```shell -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| 
-+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -|root.sg1.d1.s2| null| root.sg1| DOUBLE| GORILLA| SNAPPY|{"tag4":"v4","tag3":"v3"}|{"attr4":"v4","attr3":"v3"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -支持查询: - -```sql -show timeseries where TAGS(tag1)='v1'; -``` -```shell -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -| timeseries|alias| database|dataType|encoding|compression| tags| attributes|deadband|deadband parameters| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -|root.sg1.d1.s1| null| root.sg1| INT32| RLE| SNAPPY|{"tag1":"v1","tag2":"v2"}|{"attr2":"v2","attr1":"v1"}| null| null| -+--------------+-----+-------------+--------+--------+-----------+-------------------------+---------------------------+--------+-------------------+ -``` - -上述对时间序列标签、属性的更新等操作都支持。 - - -## 3. 路径查询 - -### 3.1 路径(Path) - -路径(path)是用于表示时间序列的层级结构的表达式,其语法定义如下: -```SQL - path - : nodeName ('.' nodeName)* - ; - - nodeName - : wildcard? identifier wildcard? 
- | wildcard - ; - - wildcard - : '*' - | '**' - ; -``` -### 3.2 路径结点名(NodeName) - -- 路径中由 `.` 分割的部分称为路径结点名(nodeName)。 -- 例如,`root.a.b.c` 是一个层级为 4 的路径,其中 root、a、b 和 c 都是路径结点名。 - -#### 约束条件 -- 保留字符 root:root 是一个保留字符,仅允许出现在路径的开头。如果在其他层级出现 root,系统将无法解析并提示报错。 - -- 字符支持:除 root 外的其他层级支持以下字符: - - 字母(a-z、A-Z) - - 数字(0-9) - - 下划线(_) - - UNICODE 中文字符(\u2E80 到 \u9FFF) -- 大小写敏感性:在 Windows 系统上,数据库路径结点名是大小写不敏感的。例如,root.ln 和 root.LN 会被视为相同的路径。 - -### 3.3 特殊字符(反引号) - -如果`路径结点名(NodeName)`中需要使用特殊字符(如空格、标点符号等),可以使用反引号(`)将结点名引用起来。更多关于反引号的使用方法,请参考[反引号](../SQL-Manual/Syntax-Rule.md#反引号)。 - -### 3.4 路径模式(Path Pattern) - -为了使得在表达多个时间序列的时候更加方便快捷,IoTDB 为用户提供带通配符`*`或`**`的路径。通配符可以出现在路径中的任何层。 - -- 单层通配符(*):在路径中表示一层。 - - 例如,`root.vehicle.*.sensor1` 表示以 `root.vehicle` 为前缀,以 `sensor1` 为后缀,且层级等于 4 的路径。 - -- 多层通配符(**):在路径中表示(`*`)+,即为一层或多层`*`. - - 例如:`root.vehicle.device1.**` 表示所有以 `root.vehicle.device1` 为前缀且层级大于等于 4 的路径。 - - `root.vehicle.**.sensor1` 表示以 `root.vehicle` 为前缀,以 `sensor1` 为后缀,且层级大于等于 4 的路径。 - -**注意**:* 和 ** 不能放在路径的开头。 - -### 3.5 查看路径的所有子路径 - -```sql -SHOW CHILD PATHS pathPattern; -``` - -可以查看此路径模式所匹配的所有路径的下一层的所有路径和它对应的节点类型,即pathPattern.*所匹配的路径及其节点类型。 - -节点类型:ROOT -> SG INTERNAL -> DATABASE -> INTERNAL -> DEVICE -> TIMESERIES - -示例: - -* 查询 root.ln 的下一层:show child paths root.ln - -```shell -+------------+----------+ -| child paths|node types| -+------------+----------+ -|root.ln.wf01| INTERNAL| -|root.ln.wf02| INTERNAL| -+------------+----------+ -Total line number = 2 -It costs 0.002s -``` - -* 查询形如 root.xx.xx.xx 的路径:show child paths root.\*.\* - -```shell -+---------------+ -| child paths| -+---------------+ -|root.ln.wf01.s1| -|root.ln.wf02.s2| -+---------------+ -``` - -### 3.6 查看路径的下一级节点 - -```sql -SHOW CHILD NODES pathPattern; -``` - -可以查看此路径模式所匹配的节点的下一层的所有节点。 - -示例: - -* 查询 root 的下一层:show child nodes root - -```shell -+------------+ -| child nodes| -+------------+ -| ln| -+------------+ -``` - -* 查询 root.ln 的下一层 :show child nodes root.ln - -```shell -+------------+ -| child 
nodes|
-+------------+
-|        wf01|
-|        wf02|
-+------------+
-```
-
-### 3.7 Counting Nodes
-
-IoTDB supports using `COUNT NODES <PathPattern> LEVEL=<INTEGER>` to count the nodes at a specified level among the paths that match a path pattern in the current metadata tree. This statement can be used to count the number of devices that have a specific measurement. For example:
-
-```sql
-COUNT NODES root.** LEVEL=2;
-COUNT NODES root.ln.** LEVEL=2;
-COUNT NODES root.ln.wf01.* LEVEL=3;
-COUNT NODES root.**.temperature LEVEL=3;
-```
-
-For the examples above and the metadata tree mentioned earlier, the results are:
-
-```shell
-+------------+
-|count(nodes)|
-+------------+
-|           4|
-+------------+
-Total line number = 1
-It costs 0.003s
-
-+------------+
-|count(nodes)|
-+------------+
-|           2|
-+------------+
-Total line number = 1
-It costs 0.002s
-
-+------------+
-|count(nodes)|
-+------------+
-|           1|
-+------------+
-Total line number = 1
-It costs 0.002s
-
-+------------+
-|count(nodes)|
-+------------+
-|           2|
-+------------+
-Total line number = 1
-It costs 0.002s
-```
-
-> Note: the time series path only acts as a filter; it has nothing to do with how level is defined.
-
-### 3.8 Viewing Devices
-
-* SHOW DEVICES pathPattern? (WITH DATABASE)? devicesWhereClause? limitClause?
-
-Similar to `Show Timeseries`, IoTDB currently supports two ways of viewing devices.
-
-* The `SHOW DEVICES` statement displays information about all devices and is equivalent to `SHOW DEVICES root.**`.
-* The `SHOW DEVICES <PathPattern>` statement specifies a `PathPattern` and returns information about the devices that match it.
-* In the `WHERE` condition, `DEVICE contains 'xxx'` can be used for fuzzy matching on the device name.
-
-The SQL statements are:
-
-```sql
-show devices;
-show devices root.ln.**;
-show devices root.ln.** where device contains 't';
-```
-
-The results are:
-
-```shell
-+-------------------+---------+---------+
-|            devices|isAligned| Template|
-+-------------------+---------+---------+
-|  root.ln.wf01.wt01|    false|       t1|
-|  root.ln.wf02.wt02|    false|     null|
-|root.sgcc.wf03.wt01|    false|     null|
-|    root.turbine.d1|    false|     null|
-+-------------------+---------+---------+
-Total line number = 4
-It costs 0.002s
-
-+-----------------+---------+---------+
-|          devices|isAligned| Template|
-+-----------------+---------+---------+
-|root.ln.wf01.wt01|    false|       t1|
-|root.ln.wf02.wt02|    false|     null|
-+-----------------+---------+---------+
-Total line number = 2
-It costs 0.001s
-
-+-----------------+---------+---------+
-| 
devices|isAligned| Template|
-+-----------------+---------+---------+
-|root.ln.wf01.wt01|    false|       t1|
-|root.ln.wf02.wt02|    false|     null|
-+-----------------+---------+---------+
-Total line number = 2
-It costs 0.001s
-
-```
-
-Here, `isAligned` indicates whether the time series under the device are aligned,
-and `Template` shows the name of the template activated on the device; null means no template is activated.
-
-To view devices together with their database, use the `SHOW DEVICES WITH DATABASE` statement.
-
-* The `SHOW DEVICES WITH DATABASE` statement displays all devices and the database each belongs to; it is equivalent to `SHOW DEVICES root.** WITH DATABASE`.
-* The `SHOW DEVICES <PathPattern> WITH DATABASE` statement specifies a `PathPattern` and returns the matching devices and the database each belongs to.
-
-The SQL statements are:
-
-```sql
-show devices with database;
-show devices root.ln.** with database;
-```
-
-The results are:
-
-```shell
-+-------------------+-------------+---------+---------+
-|            devices|     database|isAligned| Template|
-+-------------------+-------------+---------+---------+
-|  root.ln.wf01.wt01|      root.ln|    false|       t1|
-|  root.ln.wf02.wt02|      root.ln|    false|     null|
-|root.sgcc.wf03.wt01|    root.sgcc|    false|     null|
-|    root.turbine.d1| root.turbine|    false|     null|
-+-------------------+-------------+---------+---------+
-Total line number = 4
-It costs 0.003s
-
-+-----------------+-------------+---------+---------+
-|          devices|     database|isAligned| Template|
-+-----------------+-------------+---------+---------+
-|root.ln.wf01.wt01|      root.ln|    false|       t1|
-|root.ln.wf02.wt02|      root.ln|    false|     null|
-+-----------------+-------------+---------+---------+
-Total line number = 2
-It costs 0.001s
-```
-
-### 3.9 Counting Devices
-
-* COUNT DEVICES \<PathPattern\>
-
-The statement above counts devices; a `PathPattern` may be specified to count only the devices that match it.
-
-The SQL statements are:
-
-```sql
-show devices;
-count devices;
-count devices root.ln.**;
-```
-
-The results are:
-
-```shell
-+-------------------+---------+---------+
-|            devices|isAligned| Template|
-+-------------------+---------+---------+
-|root.sgcc.wf03.wt03|    false|     null|
-|    root.turbine.d1|    false|     null|
-|  root.ln.wf02.wt02|    false|     null|
-|  root.ln.wf01.wt01|    false|       t1|
-+-------------------+---------+---------+
-Total line number = 4
-It costs 0.024s
-
-+--------------+ -|count(devices)| -+--------------+ -| 4| -+--------------+ -Total line number = 1 -It costs 0.004s - -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -Total line number = 1 -It costs 0.004s -``` - -### 3.10 活跃设备查询 -和活跃时间序列一样,我们可以在查看和统计设备的基础上添加时间过滤条件来查询在某段时间内存在数据的活跃设备。这里活跃的定义与活跃时间序列相同,使用样例如下: -```sql -insert into root.sg.data(timestamp, s1,s2) values(15000, 1, 2); -insert into root.sg.data2(timestamp, s1,s2) values(15002, 1, 2); -insert into root.sg.data3(timestamp, s1,s2) values(16000, 1, 2); -show devices; -show devices where time >= 15000 and time < 16000; -count devices where time >= 15000 and time < 16000; -``` -```shell -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -| root.sg.data3| false| -+-------------------+---------+ - -+-------------------+---------+ -| devices|isAligned| -+-------------------+---------+ -| root.sg.data| false| -| root.sg.data2| false| -+-------------------+---------+ - -+--------------+ -|count(devices)| -+--------------+ -| 2| -+--------------+ -``` \ No newline at end of file diff --git a/src/zh/UserGuide/latest/Basic-Concept/Query-Data_timecho.md b/src/zh/UserGuide/latest/Basic-Concept/Query-Data_timecho.md deleted file mode 100644 index 5f2dd6c68..000000000 --- a/src/zh/UserGuide/latest/Basic-Concept/Query-Data_timecho.md +++ /dev/null @@ -1,3087 +0,0 @@ - - -# 数据查询 -## 1. 概述 - -在 IoTDB 中,使用 `SELECT` 语句从一条或多条时间序列中查询数据,IoTDB 不区分历史数据和实时数据,用户可以用统一的sql语法进行查询,通过 `WHERE` 子句中的时间过滤谓词决定查询的时间范围。 - -### 1.1 语法定义 - -```sql -SELECT [LAST] selectExpr [, selectExpr] ... - [INTO intoItem [, intoItem] ...] - FROM prefixPath [, prefixPath] ... - [WHERE whereCondition] - [GROUP BY { - ([startTime, endTime), interval [, slidingStep]) | - LEVEL = levelNum [, levelNum] ... | - TAGS(tagKey [, tagKey] ... 
| - VARIATION(expression[,delta][,ignoreNull=true/false]) | - CONDITION(expression,[keep>/>=/=/ 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; -``` - -其含义为: - -被选择的设备为 ln 集团 wf01 子站 wt01 设备;被选择的时间序列为供电状态(status)和温度传感器(temperature);该语句要求选择出 “2017-11-01T00:05:00.000” 至 “2017-11-01T00:12:00.000” 之间的所选时间序列的值。 - -该 SQL 语句的执行结果如下: - -```shell -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -|2017-11-01T00:08:00.000+08:00| false| 22.58| -|2017-11-01T00:09:00.000+08:00| false| 20.98| -|2017-11-01T00:10:00.000+08:00| true| 25.52| -|2017-11-01T00:11:00.000+08:00| false| 22.91| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 6 -It costs 0.018s -``` - -#### 示例3:按照多个时间区间选择同一设备的多列数据 - -IoTDB 支持在一次查询中指定多个时间区间条件,用户可以根据需求随意组合时间区间条件。例如, - -SQL 语句为: - -```sql -select status, temperature from root.ln.wf01.wt01 where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` - -其含义为: - -被选择的设备为 ln 集团 wf01 子站 wt01 设备;被选择的时间序列为“供电状态(status)”和“温度传感器(temperature)”;该语句指定了两个不同的时间区间,分别为“2017-11-01T00:05:00.000 至 2017-11-01T00:12:00.000”和“2017-11-01T16:35:00.000 至 2017-11-01T16:37:00.000”;该语句要求选择出满足任一时间区间的被选时间序列的值。 - -该 SQL 语句的执行结果如下: - -```shell -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -|2017-11-01T00:08:00.000+08:00| false| 22.58| -|2017-11-01T00:09:00.000+08:00| 
false| 20.98| -|2017-11-01T00:10:00.000+08:00| true| 25.52| -|2017-11-01T00:11:00.000+08:00| false| 22.91| -|2017-11-01T16:35:00.000+08:00| true| 23.44| -|2017-11-01T16:36:00.000+08:00| false| 21.98| -|2017-11-01T16:37:00.000+08:00| false| 21.93| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 9 -It costs 0.018s -``` - -#### 示例4:按照多个时间区间选择不同设备的多列数据 - -该系统支持在一次查询中选择任意列的数据,也就是说,被选择的列可以来源于不同的设备。例如,SQL 语句为: - -```sql -select wf01.wt01.status, wf02.wt02.hardware from root.ln where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` - -其含义为: - -被选择的时间序列为 “ln 集团 wf01 子站 wt01 设备的供电状态” 以及 “ln 集团 wf02 子站 wt02 设备的硬件版本”;该语句指定了两个时间区间,分别为 “2017-11-01T00:05:00.000 至 2017-11-01T00:12:00.000” 和 “2017-11-01T16:35:00.000 至 2017-11-01T16:37:00.000”;该语句要求选择出满足任意时间区间的被选时间序列的值。 - -该 SQL 语句的执行结果如下: - -```shell -+-----------------------------+------------------------+--------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf02.wt02.hardware| -+-----------------------------+------------------------+--------------------------+ -|2017-11-01T00:06:00.000+08:00| false| v1| -|2017-11-01T00:07:00.000+08:00| false| v1| -|2017-11-01T00:08:00.000+08:00| false| v1| -|2017-11-01T00:09:00.000+08:00| false| v1| -|2017-11-01T00:10:00.000+08:00| true| v2| -|2017-11-01T00:11:00.000+08:00| false| v1| -|2017-11-01T16:35:00.000+08:00| true| v2| -|2017-11-01T16:36:00.000+08:00| false| v1| -|2017-11-01T16:37:00.000+08:00| false| v1| -+-----------------------------+------------------------+--------------------------+ -Total line number = 9 -It costs 0.014s -``` - -#### 示例5:根据时间降序返回结果集 - -IoTDB 支持 `order by time` 语句,用于对结果按照时间进行降序展示。例如,SQL 语句为: - -```sql -select * from root.ln.** where time > 1 order by time desc limit 10; -``` - -语句执行的结果为: - -```shell 
-+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -| Time|root.ln.wf02.wt02.hardware|root.ln.wf02.wt02.status|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -|2017-11-07T23:59:00.000+08:00| v1| false| 21.07| false| -|2017-11-07T23:58:00.000+08:00| v1| false| 22.93| false| -|2017-11-07T23:57:00.000+08:00| v2| true| 24.39| true| -|2017-11-07T23:56:00.000+08:00| v2| true| 24.44| true| -|2017-11-07T23:55:00.000+08:00| v2| true| 25.9| true| -|2017-11-07T23:54:00.000+08:00| v1| false| 22.52| false| -|2017-11-07T23:53:00.000+08:00| v2| true| 24.58| true| -|2017-11-07T23:52:00.000+08:00| v1| false| 20.18| false| -|2017-11-07T23:51:00.000+08:00| v1| false| 22.24| false| -|2017-11-07T23:50:00.000+08:00| v2| true| 23.7| true| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -Total line number = 10 -It costs 0.016s -``` - -### 1.4 查询执行接口 - -在 IoTDB 中,提供两种方式执行数据查询操作: -- 使用 IoTDB-SQL 执行查询。 -- 常用查询的高效执行接口,包括时间序列原始数据范围查询、最新点查询、简单聚合查询。 - -#### 使用 IoTDB-SQL 执行查询 - -数据查询语句支持在 SQL 命令行终端、JDBC、JAVA / C++ / Python / Go 等编程语言 API、RESTful API 中使用。 - -- 在 SQL 命令行终端中执行查询语句:启动 SQL 命令行终端,直接输入查询语句执行即可,详见 [SQL 命令行终端](../Tools-System/CLI_timecho.md)。 - -- 在 JDBC 中执行查询语句,详见 [JDBC](../API/Programming-JDBC_timecho.md) 。 - -- 在 JAVA / C++ / Python / Go 等编程语言 API 中执行查询语句,详见应用编程接口一章相应文档。接口原型如下: - - ```java - SessionDataSet executeQueryStatement(String sql); - ``` - -- 在 RESTful API 中使用,详见 [HTTP API V1](../API/RestServiceV1_timecho.md) 或者 [HTTP API V2](../API/RestServiceV2_timecho.md)。 - -#### 常用查询的高效执行接口 - -各编程语言的 API 为常用的查询提供了高效执行接口,可以省去 SQL 解析等操作的耗时。包括: - -* 时间序列原始数据范围查询: - - 指定的查询时间范围为左闭右开区间,包含开始时间但不包含结束时间。 - -```java -SessionDataSet executeRawDataQuery(List paths, long 
startTime, long endTime); -``` - -* 最新点查询: - - 查询最后一条时间戳大于等于某个时间点的数据。 - -```java -SessionDataSet executeLastDataQuery(List paths, long lastTime); -``` - -* 聚合查询: - - 支持指定查询时间范围。指定的查询时间范围为左闭右开区间,包含开始时间但不包含结束时间。 - - 支持按照时间区间分段查询。 - -```java -SessionDataSet executeAggregationQuery(List paths, List aggregations); - -SessionDataSet executeAggregationQuery( - List paths, List aggregations, long startTime, long endTime); - -SessionDataSet executeAggregationQuery( - List paths, - List aggregations, - long startTime, - long endTime, - long interval); - -SessionDataSet executeAggregationQuery( - List paths, - List aggregations, - long startTime, - long endTime, - long interval, - long slidingStep); -``` - -## 2. 选择表达式(SELECT FROM 子句) - -`SELECT` 子句指定查询的输出,由若干个 `selectExpr` 组成。 每个 `selectExpr` 定义了查询结果中的一列或多列。 - -**`selectExpr` 是一个由时间序列路径后缀、常量、函数和运算符组成的表达式。即 `selectExpr` 中可以包含:** -- 时间序列路径后缀(支持使用通配符) -- 运算符 - - 算数运算符 - - 比较运算符 - - 逻辑运算符 -- 函数 - - 聚合函数 - - 时间序列生成函数(包括内置函数和用户自定义函数) -- 常量 - -### 2.1 使用别名 - -由于 IoTDB 独特的数据模型,在每个传感器前都附带有设备等诸多额外信息。有时,我们只针对某个具体设备查询,而这些前缀信息频繁显示造成了冗余,影响了结果集的显示与分析。 - -IoTDB 支持使用`AS`为查询结果集中的列指定别名。 - -**示例:** - -```sql -select s1 as temperature, s2 as speed from root.ln.wf01.wt01; -``` - -结果集将显示为: - -| Time | temperature | speed | -| ---- | ----------- | ----- | -| ... | ... | ... 
| - -### 2.2 运算符 - -IoTDB 中支持的运算符列表见文档 [运算符和函数](../SQL-Manual/Operator-and-Expression.md)。 - -### 2.3 函数 - -#### 聚合函数 - -聚合函数是多对一函数。它们对一组值进行聚合计算,得到单个聚合结果。 - -**包含聚合函数的查询称为聚合查询**,否则称为时间序列查询。 - -**注意:聚合查询和时间序列查询不能混合使用。** 下列语句是不支持的: - -```sql -select s1, count(s1) from root.sg.d1; -select sin(s1), count(s1) from root.sg.d1; -select s1, count(s1) from root.sg.d1 group by ([10,100),10ms); -``` - -IoTDB 支持的聚合函数见文档 [聚合函数](../SQL-Manual/Operator-and-Expression.md#内置函数)。 - -#### 时间序列生成函数 - -时间序列生成函数接受若干原始时间序列作为输入,产生一列时间序列输出。与聚合函数不同的是,时间序列生成函数的结果集带有时间戳列。 - -所有的时间序列生成函数都可以接受 * 作为输入,都可以与原始时间序列查询混合进行。 - -##### 内置时间序列生成函数 - -IoTDB 中支持的内置函数列表见文档 [运算符和函数](../SQL-Manual/Operator-and-Expression.md)。 - -##### 自定义时间序列生成函数 - -IoTDB 支持通过用户自定义函数(点击查看: [用户自定义函数](../User-Manual/Database-Programming.md#用户自定义函数) )能力进行函数功能扩展。 - -### 2.4 嵌套表达式举例 - -IoTDB 支持嵌套表达式,由于聚合查询和时间序列查询不能在一条查询语句中同时出现,我们将支持的嵌套表达式分为时间序列查询嵌套表达式和聚合查询嵌套表达式两类。 - -#### 时间序列查询嵌套表达式 - -IoTDB 支持在 `SELECT` 子句中计算由**时间序列、常量、时间序列生成函数(包括用户自定义函数)和运算符**组成的任意嵌套表达式。 - -**说明:** - -- 当某个时间戳下左操作数和右操作数都不为空(`null`)时,表达式才会有结果,否则表达式值为`null`,且默认不出现在结果集中。 -- 如果表达式中某个操作数对应多条时间序列(如通配符 `*`),那么每条时间序列对应的结果都会出现在结果集中(按照笛卡尔积形式)。 - -**示例 1:** - -```sql -select a, - b, - ((a + 1) * 2 - 1) % 2 + 1.5, - sin(a + sin(a + sin(b))), - -(a + b) * (sin(a + b) * sin(a + b) + cos(a + b) * cos(a + b)) + 1 -from root.sg1; -``` - -运行结果: - -```shell -+-----------------------------+----------+----------+----------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| Time|root.sg1.a|root.sg1.b|((((root.sg1.a + 1) * 2) - 1) % 2) + 1.5|sin(root.sg1.a + sin(root.sg1.a + sin(root.sg1.b)))|(-root.sg1.a + root.sg1.b * ((sin(root.sg1.a + root.sg1.b) * sin(root.sg1.a + root.sg1.b)) + (cos(root.sg1.a + root.sg1.b) * cos(root.sg1.a + root.sg1.b)))) + 1| 
-+-----------------------------+----------+----------+----------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+ -|1970-01-01T08:00:00.010+08:00| 1| 1| 2.5| 0.9238430524420609| -1.0| -|1970-01-01T08:00:00.020+08:00| 2| 2| 2.5| 0.7903505371876317| -3.0| -|1970-01-01T08:00:00.030+08:00| 3| 3| 2.5| 0.14065207680386618| -5.0| -|1970-01-01T08:00:00.040+08:00| 4| null| 2.5| null| null| -|1970-01-01T08:00:00.050+08:00| null| 5| null| null| null| -|1970-01-01T08:00:00.060+08:00| 6| 6| 2.5| -0.7288037411970916| -11.0| -+-----------------------------+----------+----------+----------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+ -Total line number = 6 -It costs 0.048s -``` - -**示例 2:** - -```sql -select (a + b) * 2 + sin(a) from root.sg; -``` - -运行结果: - -```shell -+-----------------------------+----------------------------------------------+ -| Time|((root.sg.a + root.sg.b) * 2) + sin(root.sg.a)| -+-----------------------------+----------------------------------------------+ -|1970-01-01T08:00:00.010+08:00| 59.45597888911063| -|1970-01-01T08:00:00.020+08:00| 100.91294525072763| -|1970-01-01T08:00:00.030+08:00| 139.01196837590714| -|1970-01-01T08:00:00.040+08:00| 180.74511316047935| -|1970-01-01T08:00:00.050+08:00| 219.73762514629607| -|1970-01-01T08:00:00.060+08:00| 259.6951893788978| -|1970-01-01T08:00:00.070+08:00| 300.7738906815579| -|1970-01-01T08:00:00.090+08:00| 39.45597888911063| -|1970-01-01T08:00:00.100+08:00| 39.45597888911063| -+-----------------------------+----------------------------------------------+ -Total line number = 9 -It costs 0.011s -``` - -**示例 3:** - -```sql 
-select (a + *) / 2 from root.sg1; -``` - -运行结果: - -```shell -+-----------------------------+-----------------------------+-----------------------------+ -| Time|(root.sg1.a + root.sg1.a) / 2|(root.sg1.a + root.sg1.b) / 2| -+-----------------------------+-----------------------------+-----------------------------+ -|1970-01-01T08:00:00.010+08:00| 1.0| 1.0| -|1970-01-01T08:00:00.020+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.030+08:00| 3.0| 3.0| -|1970-01-01T08:00:00.040+08:00| 4.0| null| -|1970-01-01T08:00:00.060+08:00| 6.0| 6.0| -+-----------------------------+-----------------------------+-----------------------------+ -Total line number = 5 -It costs 0.011s -``` - -**示例 4:** - -```sql -select (a + b) * 3 from root.sg, root.ln; -``` - -运行结果: - -```shell -+-----------------------------+---------------------------+---------------------------+---------------------------+---------------------------+ -| Time|(root.sg.a + root.sg.b) * 3|(root.sg.a + root.ln.b) * 3|(root.ln.a + root.sg.b) * 3|(root.ln.a + root.ln.b) * 3| -+-----------------------------+---------------------------+---------------------------+---------------------------+---------------------------+ -|1970-01-01T08:00:00.010+08:00| 90.0| 270.0| 360.0| 540.0| -|1970-01-01T08:00:00.020+08:00| 150.0| 330.0| 690.0| 870.0| -|1970-01-01T08:00:00.030+08:00| 210.0| 450.0| 570.0| 810.0| -|1970-01-01T08:00:00.040+08:00| 270.0| 240.0| 690.0| 660.0| -|1970-01-01T08:00:00.050+08:00| 330.0| null| null| null| -|1970-01-01T08:00:00.060+08:00| 390.0| null| null| null| -|1970-01-01T08:00:00.070+08:00| 450.0| null| null| null| -|1970-01-01T08:00:00.090+08:00| 60.0| null| null| null| -|1970-01-01T08:00:00.100+08:00| 60.0| null| null| null| -+-----------------------------+---------------------------+---------------------------+---------------------------+---------------------------+ -Total line number = 9 -It costs 0.014s -``` - -#### 聚合查询嵌套表达式 - -IoTDB 支持在 `SELECT` 子句中计算由**聚合函数、常量、时间序列生成函数和表达式**组成的任意嵌套表达式。 - -**说明:** -- 
当某个时间戳下左操作数和右操作数都不为空(`null`)时,表达式才会有结果,否则表达式值为`null`,且默认不出现在结果集中。但在使用`GROUP BY`子句的聚合查询嵌套表达式中,我们希望保留每个时间窗口的值,所以表达式值为`null`的窗口也包含在结果集中。 -- 如果表达式中某个操作数对应多条时间序列(如通配符`*`),那么每条时间序列对应的结果都会出现在结果集中(按照笛卡尔积形式)。 - -**示例 1:** - -```sql -select avg(temperature), - sin(avg(temperature)), - avg(temperature) + 1, - -sum(hardware), - avg(temperature) + sum(hardware) -from root.ln.wf01.wt01; -``` - -运行结果: - -```shell -+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+--------------------------------------------------------------------+ -|avg(root.ln.wf01.wt01.temperature)|sin(avg(root.ln.wf01.wt01.temperature))|avg(root.ln.wf01.wt01.temperature) + 1|-sum(root.ln.wf01.wt01.hardware)|avg(root.ln.wf01.wt01.temperature) + sum(root.ln.wf01.wt01.hardware)| -+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+--------------------------------------------------------------------+ -| 15.927999999999999| -0.21826546964855045| 16.927999999999997| -7426.0| 7441.928| -+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+--------------------------------------------------------------------+ -Total line number = 1 -It costs 0.009s -``` - -**示例 2:** - -```sql -select avg(*), - (avg(*) + 1) * 3 / 2 -1 -from root.sg1; -``` - -运行结果: - -```shell -+---------------+---------------+-------------------------------------+-------------------------------------+ -|avg(root.sg1.a)|avg(root.sg1.b)|(avg(root.sg1.a) + 1) * 3 / 2 - 1 |(avg(root.sg1.b) + 1) * 3 / 2 - 1 | -+---------------+---------------+-------------------------------------+-------------------------------------+ -| 3.2| 3.4| 5.300000000000001| 5.6000000000000005| 
-+---------------+---------------+-------------------------------------+-------------------------------------+ -Total line number = 1 -It costs 0.007s -``` - -**示例 3:** - -```sql -select avg(temperature), - sin(avg(temperature)), - avg(temperature) + 1, - -sum(hardware), - avg(temperature) + sum(hardware) as custom_sum -from root.ln.wf01.wt01 -GROUP BY([10, 90), 10ms); -``` - -运行结果: - -```shell -+-----------------------------+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+----------+ -| Time|avg(root.ln.wf01.wt01.temperature)|sin(avg(root.ln.wf01.wt01.temperature))|avg(root.ln.wf01.wt01.temperature) + 1|-sum(root.ln.wf01.wt01.hardware)|custom_sum| -+-----------------------------+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+----------+ -|1970-01-01T08:00:00.010+08:00| 13.987499999999999| 0.9888207947857667| 14.987499999999999| -3211.0| 3224.9875| -|1970-01-01T08:00:00.020+08:00| 29.6| -0.9701057337071853| 30.6| -3720.0| 3749.6| -|1970-01-01T08:00:00.030+08:00| null| null| null| null| null| -|1970-01-01T08:00:00.040+08:00| null| null| null| null| null| -|1970-01-01T08:00:00.050+08:00| null| null| null| null| null| -|1970-01-01T08:00:00.060+08:00| null| null| null| null| null| -|1970-01-01T08:00:00.070+08:00| null| null| null| null| null| -|1970-01-01T08:00:00.080+08:00| null| null| null| null| null| -+-----------------------------+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+----------+ -Total line number = 8 -It costs 0.012s -``` - -### 2.5 最新点查询 - -最新点查询是时序数据库 Apache IoTDB 中提供的一种特殊查询。它返回指定时间序列中时间戳最大的数据点,即一条序列的最新状态。 - -在物联网数据分析场景中,此功能尤为重要。为了满足了用户对设备实时监控的需求,Apache IoTDB 对最新点查询进行了**缓存优化**,能够提供毫秒级的返回速度。 - -SQL 语法: - -```sql -select last [COMMA ]* from < PrefixPath > 
[COMMA < PrefixPath >]* [ORDER BY TIMESERIES (DESC | ASC)?] -``` - -其含义是: 查询时间序列 prefixPath.path 中最近时间戳的数据。 - -- `whereClause` 中当前只支持时间过滤条件,任何其他过滤条件都将会返回异常。当缓存的最新点不满足过滤条件时,IoTDB 需要从存储中获取结果,此时性能将会有所下降。 - -- 结果集为四列的结构: - - ```shell - +----+----------+-----+--------+ - |Time|timeseries|value|dataType| - +----+----------+-----+--------+ - ``` - -- 可以使用 `ORDER BY TIME/TIMESERIES/VALUE/DATATYPE (DESC | ASC)` 指定结果集按照某一列进行降序/升序排列。当值列包含多种类型的数据时,按照字符串类型来排序。 - -**示例 1:** 查询 root.ln.wf01.wt01.status 的最新数据点 - -```sql - select last status from root.ln.wf01.wt01; -``` -```shell -+-----------------------------+------------------------+-----+--------+ -| Time| timeseries|value|dataType| -+-----------------------------+------------------------+-----+--------+ -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.status|false| BOOLEAN| -+-----------------------------+------------------------+-----+--------+ -Total line number = 1 -It costs 0.000s -``` - -**示例 2:** 查询 root.ln.wf01.wt01 下 status,temperature 时间戳大于等于 2017-11-07T23:50:00 的最新数据点。 - -```sql - select last status, temperature from root.ln.wf01.wt01 where time >= 2017-11-07T23:50:00; -``` -```shell -+-----------------------------+-----------------------------+---------+--------+ -| Time| timeseries| value|dataType| -+-----------------------------+-----------------------------+---------+--------+ -|2017-11-07T23:59:00.000+08:00| root.ln.wf01.wt01.status| false| BOOLEAN| -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.temperature|21.067368| DOUBLE| -+-----------------------------+-----------------------------+---------+--------+ -Total line number = 2 -It costs 0.002s -``` - -**示例 3:** 查询 root.ln.wf01.wt01 下所有序列的最新数据点,并按照序列名降序排列。 - -```sql - select last * from root.ln.wf01.wt01 order by timeseries desc; -``` -```shell -+-----------------------------+-----------------------------+---------+--------+ -| Time| timeseries| value|dataType| -+-----------------------------+-----------------------------+---------+--------+ 
-|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.temperature|21.067368| DOUBLE| -|2017-11-07T23:59:00.000+08:00| root.ln.wf01.wt01.status| false| BOOLEAN| -+-----------------------------+-----------------------------+---------+--------+ -Total line number = 2 -It costs 0.002s -``` - -**示例 4:** 查询 root.ln.wf01.wt01 下所有序列的最新数据点,并按照dataType降序排列。 - -```sql - select last * from root.ln.wf01.wt01 order by dataType desc; -``` -```shell -+-----------------------------+-----------------------------+---------+--------+ -| Time| timeseries| value|dataType| -+-----------------------------+-----------------------------+---------+--------+ -|2017-11-07T23:59:00.000+08:00|root.ln.wf01.wt01.temperature|21.067368| DOUBLE| -|2017-11-07T23:59:00.000+08:00| root.ln.wf01.wt01.status| false| BOOLEAN| -+-----------------------------+-----------------------------+---------+--------+ -Total line number = 2 -It costs 0.002s -``` - -**注意:** 可以通过函数组合方式实现其他过滤条件查询最新点的需求,例如 - -```sql - select max_time(*), last_value(*) from root.ln.wf01.wt01 where time >= 2017-11-07T23:50:00 and status = false align by device; -``` -```shell -+-----------------+---------------------+----------------+-----------------------+------------------+ -| Device|max_time(temperature)|max_time(status)|last_value(temperature)|last_value(status)| -+-----------------+---------------------+----------------+-----------------------+------------------+ -|root.ln.wf01.wt01| 1510077540000| 1510077540000| 21.067368| false| -+-----------------+---------------------+----------------+-----------------------+------------------+ -Total line number = 1 -It costs 0.021s -``` - - -## 3. 
Query Filter Conditions (WHERE Clause)
-
-The `WHERE` clause specifies the filter condition applied to data rows and consists of a single `whereCondition`.
-
-`whereCondition` is a logical expression that evaluates to true for every row to be selected. If there is no `WHERE` clause, all rows are selected.
-In `whereCondition`, any function or operator supported by IoTDB can be used, except aggregate functions.
-
-Filter conditions fall into two kinds: time filters and value filters. The two kinds can be combined.
-
-### 3.1 Time Filter Conditions
-
-A time filter selects data within a specific time range. For the supported timestamp formats, see [Timestamp Type](../Background-knowledge/Data-Type.md).
-
-Examples:
-
-1. Select data with timestamps greater than 2022-01-01T00:05:00.000:
-
-   ```sql
-   select s1 from root.sg1.d1 where time > 2022-01-01T00:05:00.000;
-   ```
-
-2. Select data with timestamps equal to 2022-01-01T00:05:00.000:
-
-   ```sql
-   select s1 from root.sg1.d1 where time = 2022-01-01T00:05:00.000;
-   ```
-
-3. Select data in the time range [2017-11-01T00:05:00.000, 2017-11-01T00:12:00.000):
-
-   ```sql
-   select s1 from root.sg1.d1 where time >= 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000;
-   ```
-
-Note: in the examples above, `time` can also be written as `timestamp`.
-
-### 3.2 Value Filter Conditions
-
-A value filter selects data whose values satisfy a given condition.
-Time series that are not selected in the select clause **may** be used in value filter conditions.
-
-Examples:
-
-1. Select data with values greater than 36.5:
-
-   ```sql
-   select temperature from root.sg1.d1 where temperature > 36.5;
-   ```
-
-2. Select data with values equal to true:
-
-   ```sql
-   select status from root.sg1.d1 where status = true;
-   ```
-
-3. Select data inside or outside the range [36.5, 40]:
-
-   ```sql
-   select temperature from root.sg1.d1 where temperature between 36.5 and 40;
-   ```
-   ```sql
-   select temperature from root.sg1.d1 where temperature not between 36.5 and 40;
-   ```
-
-4. Select data whose values are in a given set:
-
-   ```sql
-   select code from root.sg1.d1 where code in ('200', '300', '400', '500');
-   ```
-
-5. Select data whose values are not in a given set:
-
-   ```sql
-   select code from root.sg1.d1 where code not in ('200', '300', '400', '500');
-   ```
-
-6. Select data whose values are null:
-
-   ```sql
-   select code from root.sg1.d1 where temperature is null;
-   ```
-
-7. 
Select data whose values are not null:
-
-   ```sql
-   select code from root.sg1.d1 where temperature is not null;
-   ```
-
-### 3.3 Fuzzy Query
-
-For TEXT and STRING data, the `Like` and `Regexp` operators can be used for fuzzy matching.
-
-#### Fuzzy Matching with `Like`
-
-**Matching rules:**
-
-- `%` matches any sequence of zero or more characters.
-- `_` matches any single character.
-
-**Example 1:** Query the data under `root.sg.d1` whose `value` contains `'cc'`.
-
-```sql
-select * from root.sg.d1 where value like '%cc%';
-```
-```shell
-+-----------------------------+----------------+
-|                         Time|root.sg.d1.value|
-+-----------------------------+----------------+
-|2017-11-01T00:00:00.000+08:00|        aabbccdd|
-|2017-11-01T00:00:01.000+08:00|              cc|
-+-----------------------------+----------------+
-Total line number = 2
-It costs 0.002s
-```
-
-**Example 2:** Query the data under `root.sg.d1` whose `value` has `'b'` in the middle with exactly one arbitrary character before and after it.
-
-```sql
-select * from root.sg.d1 where value like '_b_';
-```
-```shell
-+-----------------------------+----------------+
-|                         Time|root.sg.d1.value|
-+-----------------------------+----------------+
-|2017-11-01T00:00:02.000+08:00|             abc|
-+-----------------------------+----------------+
-Total line number = 1
-It costs 0.002s
-```
-
-#### Fuzzy Matching with `Regexp`
-
-The filter condition must be a **regular expression in the style of the Java standard library**.
-
-**Common regular expression examples:**
-
-```shell
-Any characters, length 3 to 20: ^.{3,20}$
-Uppercase English letters: ^[A-Z]+$
-Digits and English letters: ^[A-Za-z0-9]+$
-Starting with a: ^a.*
-```
-
-**Example 1:** Query the data under root.sg.d1 whose value is a string made up only of the 26 English letters.
-
-```sql
-select * from root.sg.d1 where value regexp '^[A-Za-z]+$';
-```
-```shell
-+-----------------------------+----------------+
-|                         Time|root.sg.d1.value|
-+-----------------------------+----------------+
-|2017-11-01T00:00:00.000+08:00|        aabbccdd|
-|2017-11-01T00:00:01.000+08:00|              cc|
-+-----------------------------+----------------+
-Total line number = 2
-It costs 0.002s
-```
-
-**Example 2:** Query the data under root.sg.d1 whose value is a string made up only of the 26 lowercase English letters and whose timestamp is greater than 100.
-
-```sql
-select * from root.sg.d1 where value regexp '^[a-z]+$' and time > 100;
-```
-```shell
-+-----------------------------+----------------+
-|                         Time|root.sg.d1.value|
-+-----------------------------+----------------+
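-|2017-11-01T00:00:00.000+08:00| aabbccdd| -|2017-11-01T00:00:01.000+08:00| cc| -+-----------------------------+----------------+ -Total line number = 2 -It costs 0.002s -``` - -## 4. Segmented and Grouped Aggregation (GROUP BY Clause) -IoTDB supports segmented or grouped aggregation of series through the `GROUP BY` clause. - -Segmented aggregation partitions data row-wise along the time dimension, based on the time relationship between data points of the same time series, and produces one aggregate value per segment. Currently supported are **time interval segmentation**, **variation segmentation**, **condition segmentation**, **session segmentation**, and **count segmentation**; more segmentation methods will be supported in the future. - -Grouped aggregation groups different time series by their underlying business attributes; each group contains several time series and yields one aggregate value. Two grouping methods are supported: **grouping by path level** and **grouping by series tags**. - -### 4.1 Segmented Aggregation - -#### Time Interval Segmented Aggregation - -Time interval segmented aggregation is a typical query pattern for time series data: data is collected at high frequency and needs to be aggregated over fixed time intervals. For example, to compute the average daily temperature, the temperature series is split into one segment per day and the average of each segment is computed. - -In IoTDB, an aggregation query can use the `GROUP BY` clause to aggregate by time interval. The user can specify the aggregation interval and the sliding step. The parameters are: - -* Parameter 1: the display time window on the time axis -* Parameter 2: the size of the aggregation window (must be positive) -* Parameter 3: the sliding step of the aggregation window (optional; defaults to the aggregation window size) - -The figure below shows the meaning of these three parameters: - - - -Several typical examples follow: - -##### Time interval segmented aggregation without a sliding step - -The corresponding SQL statement is: - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d); -``` -This query means: - -Because the user did not specify a sliding step, the sliding step defaults to the time interval parameter, that is, `1d`. - -The first parameter in this example is the display window parameter, which sets the final display range to [2017-11-01T00:00:00, 2017-11-07T23:00:00). - -The second parameter in this example is the time interval used to partition the time axis: with `1d` as the interval and the start time of the display window as the origin, the time axis is divided into consecutive intervals [0,1d), [1d, 2d), [2d, 3d), and so on. - -The system then uses the time and value filter conditions in the WHERE clause together with the first parameter of the GROUP BY statement as a combined filter, obtains the data that satisfies all filter conditions (in this example, the data in the range [2017-11-01T00:00:00, 2017-11-07T23:00:00)), and maps this data onto the partitioned time axis (in this example, each day from 2017-11-01T00:00:00 to 2017-11-07T23:00:00). - -Every time interval window contains data, and the result set after executing the SQL is as follows: - -```shell -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-01T00:00:00.000+08:00| 1440| 26.0| -|2017-11-02T00:00:00.000+08:00| 1440| 26.0| -|2017-11-03T00:00:00.000+08:00| 1440| 25.99| -|2017-11-04T00:00:00.000+08:00| 1440| 26.0| -|2017-11-05T00:00:00.000+08:00| 1440| 26.0| 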
-|2017-11-06T00:00:00.000+08:00| 1440| 25.99| -|2017-11-07T00:00:00.000+08:00| 1380| 26.0| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 7 -It costs 0.024s -``` - -##### Time interval segmented aggregation with a sliding step - -The corresponding SQL statement is: - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d); -``` - -This query means: - -Because the user specified a sliding step of `1d`, the GROUP BY statement moves the time interval forward by one day each time instead of the default 3 hours. - -In other words, we want the data from 00:00 to 03:00 of every day from 2017-11-01 to 2017-11-07. - -The first parameter in this example is the display window parameter, which sets the final display range to [2017-11-01T00:00:00, 2017-11-07T23:00:00). - -The second parameter in this example is the time interval used to partition the time axis: with `3h` as the interval and the start time of the display window as the origin, the time axis is divided into the intervals [2017-11-01T00:00:00, 2017-11-01T03:00:00), [2017-11-02T00:00:00, 2017-11-02T03:00:00), [2017-11-03T00:00:00, 2017-11-03T03:00:00), and so on. - -The third parameter in this example is the sliding step of each time interval. - -The system then uses the time and value filter conditions in the WHERE clause together with the first parameter of the GROUP BY statement as a combined filter, obtains the data that satisfies all filter conditions (in this example, the data in the range [2017-11-01T00:00:00, 2017-11-07T23:00:00)), and maps this data onto the partitioned time axis (in this example, 00:00 to 03:00 of every day from 2017-11-01T00:00:00 to 2017-11-07T23:00:00). - -Every time interval window contains data, and the result set after executing the SQL is as follows: - -```shell -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-01T00:00:00.000+08:00| 180| 25.98| -|2017-11-02T00:00:00.000+08:00| 180| 25.98| -|2017-11-03T00:00:00.000+08:00| 180| 25.96| -|2017-11-04T00:00:00.000+08:00| 180| 25.96| -|2017-11-05T00:00:00.000+08:00| 180| 26.0| -|2017-11-06T00:00:00.000+08:00| 180| 25.85| -|2017-11-07T00:00:00.000+08:00| 180| 25.99| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 7 -It costs 0.006s -``` - -The sliding step can be smaller than the aggregation window, in which case consecutive aggregation windows overlap in time (similar to a sliding window). - -For example, the SQL: -```sql -select 
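count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-01 10:00:00), 4h, 2h); -``` - -The result set after executing the SQL is as follows: - -```shell -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| -+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-01T00:00:00.000+08:00| 180| 25.98| -|2017-11-01T02:00:00.000+08:00| 180| 25.98| -|2017-11-01T04:00:00.000+08:00| 180| 25.96| -|2017-11-01T06:00:00.000+08:00| 180| 25.96| -|2017-11-01T08:00:00.000+08:00| 180| 26.0| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 5 -It costs 0.006s -``` - -##### Time interval segmented aggregation by natural month - -The corresponding SQL statement is: - -```sql -select count(status) from root.ln.wf01.wt01 where time > 2017-11-01T01:00:00 group by([2017-11-01T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo); -``` - -This query means: - -Because the user specified a sliding step of `2mo`, the GROUP BY statement moves the time interval forward by 2 natural months each time instead of the default 1 natural month. - -In other words, we want the data of the first month of every 2 natural months from 2017-11-01 to 2019-11-07. - -The first parameter in this example is the display window parameter, which sets the final display range to [2017-11-01T00:00:00, 2019-11-07T23:00:00). - -The start time is 2017-11-01T00:00:00; the sliding step advances month by month relative to the start time, and the 1st day of each resulting month is used as the start of the time interval. - -The second parameter in this example is the time interval used to partition the time axis: with `1mo` as the interval and the start time of the display window as the origin, the time axis is divided into the intervals [2017-11-01T00:00:00, 2017-12-01T00:00:00), [2018-01-01T00:00:00, 2018-02-01T00:00:00), [2018-03-01T00:00:00, 2018-04-01T00:00:00), and so on. - -The third parameter in this example is the sliding step of each time interval. - -The system then uses the time and value filter conditions in the WHERE clause together with the first parameter of the GROUP BY statement as a combined filter, obtains the data that satisfies all filter conditions (in this example, the data in the range [2017-11-01T00:00:00, 2019-11-07T23:00:00)), and maps this data onto the partitioned time axis (in this example, the first month of every two natural months from 2017-11-01T00:00:00 to 2019-11-07T23:00:00). - -Every time interval window contains data, and the result set after executing the SQL is as follows: - -```shell -+-----------------------------+-------------------------------+ -| Time|count(root.ln.wf01.wt01.status)| 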
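-+-----------------------------+-------------------------------+ -|2017-11-01T00:00:00.000+08:00| 259| -|2018-01-01T00:00:00.000+08:00| 250| -|2018-03-01T00:00:00.000+08:00| 259| -|2018-05-01T00:00:00.000+08:00| 251| -|2018-07-01T00:00:00.000+08:00| 242| -|2018-09-01T00:00:00.000+08:00| 225| -|2018-11-01T00:00:00.000+08:00| 216| -|2019-01-01T00:00:00.000+08:00| 207| -|2019-03-01T00:00:00.000+08:00| 216| -|2019-05-01T00:00:00.000+08:00| 207| -|2019-07-01T00:00:00.000+08:00| 199| -|2019-09-01T00:00:00.000+08:00| 181| -|2019-11-01T00:00:00.000+08:00| 60| -+-----------------------------+-------------------------------+ -``` - -For a start time that is not the first day of a month, the corresponding SQL statement is: - -```sql -select count(status) from root.ln.wf01.wt01 group by([2017-10-31T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo); -``` - -This query means: - -Because the user specified a sliding step of `2mo`, the GROUP BY statement moves the time interval forward by 2 natural months each time instead of the default 1 natural month. - -In other words, we want the data of the first month of every 2 natural months from 2017-10-31 to 2019-11-07. - -Unlike the previous example, the start time is 2017-10-31T00:00:00; the sliding step advances month by month relative to the start time, and the 31st (that is, the last day) of each resulting month is used as the start of the time interval. If the start time were set to the 30th, the sliding step would place the start of each time interval on the 30th of the month, or on the last day of the month if the 30th does not exist. - -The first parameter in this example is the display window parameter, which sets the final display range to [2017-10-31T00:00:00, 2019-11-07T23:00:00). - -The second parameter in this example is the time interval used to partition the time axis: with `1mo` as the interval and the start time of the display window as the origin, the time axis is divided into the intervals [2017-10-31T00:00:00, 2017-11-30T00:00:00), [2017-12-31T00:00:00, 2018-01-31T00:00:00), [2018-02-28T00:00:00, 2018-03-31T00:00:00), and so on. - -The third parameter in this example is the sliding step of each time interval. - -The system then uses the time and value filter conditions in the WHERE clause together with the first parameter of the GROUP BY statement as a combined filter, obtains the data that satisfies all filter conditions (in this example, the data in the range [2017-10-31T00:00:00, 2019-11-07T23:00:00)), and maps this data onto the partitioned time axis (in this example, the first month of every two natural months from 2017-10-31T00:00:00 to 2019-11-07T23:00:00). - -Every time interval window contains data, and the result set after executing the SQL is as follows: - -```shell -+-----------------------------+-------------------------------+ -| Time|count(root.ln.wf01.wt01.status)| -+-----------------------------+-------------------------------+ -|2017-10-31T00:00:00.000+08:00| 251| -|2017-12-31T00:00:00.000+08:00| 250| -|2018-02-28T00:00:00.000+08:00| 259| -|2018-04-30T00:00:00.000+08:00| 250| -|2018-06-30T00:00:00.000+08:00| 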
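242| -|2018-08-31T00:00:00.000+08:00| 225| -|2018-10-31T00:00:00.000+08:00| 216| -|2018-12-31T00:00:00.000+08:00| 208| -|2019-02-28T00:00:00.000+08:00| 216| -|2019-04-30T00:00:00.000+08:00| 208| -|2019-06-30T00:00:00.000+08:00| 199| -|2019-08-31T00:00:00.000+08:00| 181| -|2019-10-31T00:00:00.000+08:00| 69| -+-----------------------------+-------------------------------+ -``` - -##### Left-open, right-closed intervals - -The result timestamp of each interval is its right endpoint. The corresponding SQL statement is: - -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d); -``` - -The time range of this query is left-open and right-closed: the result excludes the data point at 2017-11-01T00:00:00 but includes the data point at 2017-11-07T23:00:00. - -The result set after executing the SQL is as follows: - -```shell -+-----------------------------+-------------------------------+ -| Time|count(root.ln.wf01.wt01.status)| -+-----------------------------+-------------------------------+ -|2017-11-02T00:00:00.000+08:00| 1440| -|2017-11-03T00:00:00.000+08:00| 1440| -|2017-11-04T00:00:00.000+08:00| 1440| -|2017-11-05T00:00:00.000+08:00| 1440| -|2017-11-06T00:00:00.000+08:00| 1440| -|2017-11-07T00:00:00.000+08:00| 1440| -|2017-11-07T23:00:00.000+08:00| 1380| -+-----------------------------+-------------------------------+ -Total line number = 7 -It costs 0.004s -``` - -#### Variation Segmented Aggregation -IoTDB supports grouping by variation through the `GROUP BY VARIATION` statement. `GROUP BY VARIATION` takes the first point as the **base point** of a group. For each new data point, the difference from the base point is computed according to the given rule; -if the difference does not exceed the given threshold, the new point joins the current group; otherwise the current group ends and a new group starts with the new point as its base point. -Groups produced this way never overlap and have no fixed start or end time. The clause syntax is: -```sql -group by variation(controlExpression[,delta][,ignoreNull=true/false]) -``` -The parameters have the following meanings: -* controlExpression - -The value that grouping refers to. **It can be a column of the queried data or an expression over several columns -(a multi-column expression still evaluates to a single value, and all columns used in it must be numeric)**. The variation is computed as the difference between the controlExpression values of the data points. -* delta - -The threshold used for grouping. Within a group, **the difference between the controlExpression value of every point and that of the group's base point does not exceed `delta`**. When `delta=0`, this is equivalent to equal-value grouping: all consecutive data with the same expression value falls into one group. - -* ignoreNull - -Specifies how data is handled when the value of `controlExpression` is null. When `ignoreNull` is false, the null is treated as a new value; when `ignoreNull` is true, the corresponding point is skipped. - -The table below lists, for different values of `delta`, the return types supported by `controlExpression` and the handling of null values when `ignoreNull` is false: - -| delta | 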
Return types supported by controlExpression | Handling of null when ignoreNull=false | -|----------|--------------------------------------|-----------------------------------------------------------------| -| delta!=0 | INT32, INT64, FLOAT, DOUBLE | If the value of the group being maintained is not null, null is treated as positive/negative infinity and ends the current group. Consecutive nulls are treated as values with equal difference and are placed in the same group | -| delta=0 | TEXT, BINARY, INT32, INT64, FLOAT, DOUBLE | null is treated as a new value that starts a new group; consecutive nulls belong to the same group | - -The figure below illustrates variation segmentation: points whose control column value differs from that of the first point in the group by no more than delta belong to the same group. - -(Figure: groupByVariation) - -##### Usage notes -1. The result of `controlExpression` must be a single value; if wildcard expansion produces multiple columns, an error is reported. -2. For each group, the Time column outputs the group's start time by default; selecting `__endTime` in the query additionally outputs the group's end time. -3. When used with `ALIGN BY DEVICE`, grouping is performed separately for each device. -4. When `delta` and `ignoreNull` are not specified, `delta` defaults to 0 and `ignoreNull` defaults to true. -5. Use together with `GROUP BY LEVEL` is currently not supported. - -Using the raw data below, several examples of variation segmentation queries follow. -```shell -+-----------------------------+-------+-------+-------+--------+-------+-------+ -| Time| s1| s2| s3| s4| s5| s6| -+-----------------------------+-------+-------+-------+--------+-------+-------+ -|1970-01-01T08:00:00.000+08:00| 4.5| 9.0| 0.0| 45.0| 9.0| 8.25| -|1970-01-01T08:00:00.010+08:00| null| 19.0| 10.0| 145.0| 19.0| 8.25| -|1970-01-01T08:00:00.020+08:00| 24.5| 29.0| null| 245.0| 29.0| null| -|1970-01-01T08:00:00.030+08:00| 34.5| null| 30.0| 345.0| null| null| -|1970-01-01T08:00:00.040+08:00| 44.5| 49.0| 40.0| 445.0| 49.0| 8.25| -|1970-01-01T08:00:00.050+08:00| null| 59.0| 50.0| 545.0| 59.0| 6.25| -|1970-01-01T08:00:00.060+08:00| 64.5| 69.0| 60.0| 645.0| 69.0| null| -|1970-01-01T08:00:00.070+08:00| 74.5| 79.0| null| null| 79.0| 3.25| -|1970-01-01T08:00:00.080+08:00| 84.5| 89.0| 80.0| 845.0| 89.0| 3.25| -|1970-01-01T08:00:00.090+08:00| 94.5| 99.0| 90.0| 945.0| 99.0| 3.25| -|1970-01-01T08:00:00.150+08:00| 66.5| 77.0| 90.0| 945.0| 99.0| 9.25| -+-----------------------------+-------+-------+-------+--------+-------+-------+ -``` -##### Equal-value segmentation when delta=0 -Use the following SQL statement: -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6); -``` -This gives the following result; rows where s6 is null are ignored: -```shell 
-+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.040+08:00| 24.5| 3| 50.0| -|1970-01-01T08:00:00.050+08:00|1970-01-01T08:00:00.050+08:00| null| 1| 50.0| -|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.090+08:00| 84.5| 3| 170.0| -|1970-01-01T08:00:00.150+08:00|1970-01-01T08:00:00.150+08:00| 66.5| 1| 90.0| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -``` -当指定ignoreNull为false时,会将s6为null的数据也考虑进来 -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, ignoreNull=false); -``` -得到如下的结果 -```shell -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.010+08:00| 4.5| 2| 10.0| -|1970-01-01T08:00:00.020+08:00|1970-01-01T08:00:00.030+08:00| 29.5| 1| 30.0| -|1970-01-01T08:00:00.040+08:00|1970-01-01T08:00:00.040+08:00| 44.5| 1| 40.0| -|1970-01-01T08:00:00.050+08:00|1970-01-01T08:00:00.050+08:00| null| 1| 50.0| -|1970-01-01T08:00:00.060+08:00|1970-01-01T08:00:00.060+08:00| 64.5| 1| 60.0| -|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.090+08:00| 84.5| 3| 170.0| -|1970-01-01T08:00:00.150+08:00|1970-01-01T08:00:00.150+08:00| 66.5| 1| 90.0| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -``` -##### delta!=0时的差值事件分段 -使用如下sql语句 -```sql -select __endTime, avg(s1), count(s2), 
sum(s3) from root.sg.d group by variation(s6, 4); -``` -This gives the following result: -```shell -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.050+08:00| 24.5| 4| 100.0| -|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.090+08:00| 84.5| 3| 170.0| -|1970-01-01T08:00:00.150+08:00|1970-01-01T08:00:00.150+08:00| 66.5| 1| 90.0| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -``` -The controlExpression in the group by clause also supports expressions over columns: - -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6+s5, 10); -``` -This gives the following result: -```shell -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -| Time| __endTime|avg(root.sg.d.s1)|count(root.sg.d.s2)|sum(root.sg.d.s3)| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -|1970-01-01T08:00:00.000+08:00|1970-01-01T08:00:00.010+08:00| 4.5| 2| 10.0| -|1970-01-01T08:00:00.040+08:00|1970-01-01T08:00:00.050+08:00| 44.5| 2| 90.0| -|1970-01-01T08:00:00.070+08:00|1970-01-01T08:00:00.080+08:00| 79.5| 2| 80.0| -|1970-01-01T08:00:00.090+08:00|1970-01-01T08:00:00.150+08:00| 80.5| 2| 180.0| -+-----------------------------+-----------------------------+-----------------+-------------------+-----------------+ -``` -#### Condition Segmented Aggregation -When data needs to be filtered by a given condition and consecutive rows that satisfy the condition need to be aggregated as one group, use the `GROUP BY CONDITION` segmentation method. Rows that do not satisfy the condition belong to no group and are simply ignored. -The syntax is defined as follows: -```sql -group by condition(predict, [keep>/>=/=/<=/<]threshold[, ignoreNull=true/false]) -``` -* predict - -A valid expression that returns a boolean value, used to filter rows for grouping. -* keep[>/>=/=/<=/<]threshold - 
-The keep expression specifies how many consecutive rows must satisfy the `predict` condition for a group to be formed; only groups whose row count satisfies the keep expression are output. The keep expression consists of the string 'keep' combined with a threshold of type `long`, or of a single `long` value on its own. - -* ignoreNull=true/false - -Specifies how a row is handled when predict evaluates to null: if true, the row is skipped; if false, the current group ends. - -##### Usage notes -1. The keep condition is required in the query, but the keep string can be omitted and a `long` constant given instead, which defaults to the equality condition `keep=<that long constant>`. -2. `ignoreNull` defaults to true. -3. For each group, the Time column outputs the group's start time by default; selecting `__endTime` in the query additionally outputs the group's end time. -4. When used with `ALIGN BY DEVICE`, grouping is performed separately for each device. -5. Use together with `GROUP BY LEVEL` is currently not supported. - - -Several query examples follow for the raw data below: -```shell -+-----------------------------+-------------------------+-------------------------------------+------------------------------------+ -| Time|root.sg.beijing.car01.soc|root.sg.beijing.car01.charging_status|root.sg.beijing.car01.vehicle_status| -+-----------------------------+-------------------------+-------------------------------------+------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 14.0| 1| 1| -|1970-01-01T08:00:00.002+08:00| 16.0| 1| 1| -|1970-01-01T08:00:00.003+08:00| 16.0| 0| 1| -|1970-01-01T08:00:00.004+08:00| 16.0| 0| 1| -|1970-01-01T08:00:00.005+08:00| 18.0| 1| 1| -|1970-01-01T08:00:00.006+08:00| 24.0| 1| 1| -|1970-01-01T08:00:00.007+08:00| 36.0| 1| 1| -|1970-01-01T08:00:00.008+08:00| 36.0| null| 1| -|1970-01-01T08:00:00.009+08:00| 45.0| 1| 1| -|1970-01-01T08:00:00.010+08:00| 60.0| 1| 1| -+-----------------------------+-------------------------+-------------------------------------+------------------------------------+ -``` -To query the data in which charging_status=1 holds for at least two consecutive rows, use the following SQL statement: -```sql -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoreNull=true); -``` -The result is as follows: -```shell -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -| 
Time|max_time(root.sg.beijing.car01.charging_status)|count(root.sg.beijing.car01.vehicle_status)|last_value(root.sg.beijing.car01.soc)| -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 2| 2| 16.0| -|1970-01-01T08:00:00.005+08:00| 10| 5| 60.0| -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -``` -When `ignoreNull` is set to false, a null value is treated as a row that does not satisfy the condition and ends the group being computed. -```sql -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoreNull=false); -``` -This gives the following result, in which the original group is split by the row containing null: -```shell -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -| Time|max_time(root.sg.beijing.car01.charging_status)|count(root.sg.beijing.car01.vehicle_status)|last_value(root.sg.beijing.car01.soc)| -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -|1970-01-01T08:00:00.001+08:00| 2| 2| 16.0| -|1970-01-01T08:00:00.005+08:00| 7| 3| 36.0| -|1970-01-01T08:00:00.009+08:00| 10| 2| 60.0| -+-----------------------------+-----------------------------------------------+-------------------------------------------+-------------------------------------+ -``` -#### Session Segmented Aggregation -`GROUP BY SESSION` groups data by the gaps in the time column: in the time column of the result set, data points whose time gap is less than or equal to the configured threshold fall into one group. In industrial scenarios, for example, devices do not always run continuously, and `GROUP BY SESSION` groups the data produced during each connection session of a device. -The syntax is defined as follows: -```sql -group by session(timeInterval) -``` -* timeInterval - -The time gap threshold. When the difference between the time columns of two data points is greater than this threshold, a new group is created. - -The figure below illustrates grouping under `group by session`: - - - -##### Usage notes -1. For each group, the Time column outputs the group's start time by default; selecting `__endTime` in the query additionally outputs the group's end time. -2. 
与`ALIGN BY DEVICE`搭配使用时会对每个device进行单独的分组操作。 -3. 当前暂不支持与`GROUP BY LEVEL`搭配使用。 - -对于下面的原始数据,给出几个查询样例。 -```shell -+-----------------------------+-----------------+-----------+--------+------+ -| Time| Device|temperature|hardware|status| -+-----------------------------+-----------------+-----------+--------+------+ -|1970-01-01T08:00:01.000+08:00|root.ln.wf02.wt01| 35.7| 11| false| -|1970-01-01T08:00:02.000+08:00|root.ln.wf02.wt01| 35.8| 22| true| -|1970-01-01T08:00:03.000+08:00|root.ln.wf02.wt01| 35.4| 33| false| -|1970-01-01T08:00:04.000+08:00|root.ln.wf02.wt01| 36.4| 44| false| -|1970-01-01T08:00:05.000+08:00|root.ln.wf02.wt01| 36.8| 55| false| -|1970-01-01T08:00:10.000+08:00|root.ln.wf02.wt01| 36.8| 110| false| -|1970-01-01T08:00:20.000+08:00|root.ln.wf02.wt01| 37.8| 220| true| -|1970-01-01T08:00:30.000+08:00|root.ln.wf02.wt01| 37.5| 330| false| -|1970-01-01T08:00:40.000+08:00|root.ln.wf02.wt01| 37.4| 440| false| -|1970-01-01T08:00:50.000+08:00|root.ln.wf02.wt01| 37.9| 550| false| -|1970-01-01T08:01:40.000+08:00|root.ln.wf02.wt01| 38.0| 110| false| -|1970-01-01T08:02:30.000+08:00|root.ln.wf02.wt01| 38.8| 220| true| -|1970-01-01T08:03:20.000+08:00|root.ln.wf02.wt01| 38.6| 330| false| -|1970-01-01T08:04:20.000+08:00|root.ln.wf02.wt01| 38.4| 440| false| -|1970-01-01T08:05:20.000+08:00|root.ln.wf02.wt01| 38.3| 550| false| -|1970-01-01T08:06:40.000+08:00|root.ln.wf02.wt01| null| 0| null| -|1970-01-01T08:07:50.000+08:00|root.ln.wf02.wt01| null| 0| null| -|1970-01-01T08:08:00.000+08:00|root.ln.wf02.wt01| null| 0| null| -|1970-01-02T08:08:01.000+08:00|root.ln.wf02.wt01| 38.2| 110| false| -|1970-01-02T08:08:02.000+08:00|root.ln.wf02.wt01| 37.5| 220| true| -|1970-01-02T08:08:03.000+08:00|root.ln.wf02.wt01| 37.4| 330| false| -|1970-01-02T08:08:04.000+08:00|root.ln.wf02.wt01| 36.8| 440| false| -|1970-01-02T08:08:05.000+08:00|root.ln.wf02.wt01| 37.4| 550| false| -+-----------------------------+-----------------+-----------+--------+------+ -``` -可以按照不同的时间单位设定时间间隔,sql语句如下: 
-```sql -select __endTime,count(*) from root.** group by session(1d); -``` -The result is as follows: -```shell -+-----------------------------+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -| Time| __endTime|count(root.ln.wf02.wt01.temperature)|count(root.ln.wf02.wt01.hardware)|count(root.ln.wf02.wt01.status)| -+-----------------------------+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -|1970-01-01T08:00:01.000+08:00|1970-01-01T08:08:00.000+08:00| 15| 18| 15| -|1970-01-02T08:08:01.000+08:00|1970-01-02T08:08:05.000+08:00| 5| 5| 5| -+-----------------------------+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -``` -It can also be used together with `HAVING` and `ALIGN BY DEVICE`: -```sql -select __endTime,sum(hardware) from root.ln.wf02.wt01 group by session(50s) having sum(hardware)>0 align by device; -``` -This gives the following result, which excludes the groups where `sum(hardware)` is 0: -```shell -+-----------------------------+-----------------+-----------------------------+-------------+ -| Time| Device| __endTime|sum(hardware)| -+-----------------------------+-----------------+-----------------------------+-------------+ -|1970-01-01T08:00:01.000+08:00|root.ln.wf02.wt01|1970-01-01T08:03:20.000+08:00| 2475.0| -|1970-01-01T08:04:20.000+08:00|root.ln.wf02.wt01|1970-01-01T08:04:20.000+08:00| 440.0| -|1970-01-01T08:05:20.000+08:00|root.ln.wf02.wt01|1970-01-01T08:05:20.000+08:00| 550.0| -|1970-01-02T08:08:01.000+08:00|root.ln.wf02.wt01|1970-01-02T08:08:05.000+08:00| 1650.0| -+-----------------------------+-----------------+-----------------------------+-------------+ -``` -#### Count Segmented Aggregation -`GROUP BY COUNT` aggregates by number of points: each run of the specified number of consecutive data points forms one group, that is, data is grouped by a fixed point count. -The syntax is defined as follows: -```sql -group by count(controlExpression, size[,ignoreNull=true/false]) -``` -* controlExpression - -The object that counting refers to; it can be any column of the result set or an expression over columns. - -* size - 
-The number of data points in a group; every `size` data points fall into the same group. - -* ignoreNull=true/false - -Whether to ignore data points whose `controlExpression` is null. When ignoreNull is true, data points whose `controlExpression` result is null are skipped during counting. - -##### Usage notes -1. For each group, the Time column outputs the group's start time by default; selecting `__endTime` in the query additionally outputs the group's end time. -2. When used with `ALIGN BY DEVICE`, grouping is performed separately for each device. -3. Use together with `GROUP BY LEVEL` is currently not supported. -4. When the final number of points in a group does not reach `size`, the result of that group is not output. - -Several query examples follow for the raw data below. -```shell -+-----------------------------+-----------+-----------------------+ -| Time|root.sg.soc|root.sg.charging_status| -+-----------------------------+-----------+-----------------------+ -|1970-01-01T08:00:00.001+08:00| 14.0| 1| -|1970-01-01T08:00:00.002+08:00| 16.0| 1| -|1970-01-01T08:00:00.003+08:00| 16.0| 0| -|1970-01-01T08:00:00.004+08:00| 16.0| 0| -|1970-01-01T08:00:00.005+08:00| 18.0| 1| -|1970-01-01T08:00:00.006+08:00| 24.0| 1| -|1970-01-01T08:00:00.007+08:00| 36.0| 1| -|1970-01-01T08:00:00.008+08:00| 36.0| null| -|1970-01-01T08:00:00.009+08:00| 45.0| 1| -|1970-01-01T08:00:00.010+08:00| 60.0| 1| -+-----------------------------+-----------+-----------------------+ -``` -The SQL statement is as follows: -```sql -select count(charging_status), first_value(soc) from root.sg group by count(charging_status,5); -``` -This gives the following result. Because the second window, from 1970-01-01T08:00:00.006+08:00 to 1970-01-01T08:00:00.010+08:00, contains only four non-null points and does not satisfy `size = 5`, it is not output: -```shell -+-----------------------------+-----------------------------+--------------------------------------+ -| Time| __endTime|first_value(root.sg.beijing.car01.soc)| -+-----------------------------+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.001+08:00|1970-01-01T08:00:00.005+08:00| 14.0| -+-----------------------------+-----------------------------+--------------------------------------+ -``` -When ignoreNull is set to false so that null values are also counted, two windows with a point count of 5 are obtained. The SQL is as follows: -```sql -select count(charging_status), first_value(soc) from root.sg group by count(charging_status,5,ignoreNull=false); -``` -The result is as follows: -```shell 
-+-----------------------------+-----------------------------+--------------------------------------+ -| Time| __endTime|first_value(root.sg.beijing.car01.soc)| -+-----------------------------+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.001+08:00|1970-01-01T08:00:00.005+08:00| 14.0| -|1970-01-01T08:00:00.006+08:00|1970-01-01T08:00:00.010+08:00| 24.0| -+-----------------------------+-----------------------------+--------------------------------------+ -``` -### 4.2 Grouped Aggregation - -#### Aggregation Grouped by Path Level - -In the time series hierarchy, aggregation grouped by path level **aggregates the series that share the same name under a given level**. - -- Use `GROUP BY LEVEL = INT` to specify the level to aggregate on, with `ROOT` defined as level 0. To aggregate all series under "root.ln", set level to 1. -- Aggregation grouped by path level supports all built-in aggregate functions. For the five aggregate functions `sum`, `avg`, `min_value`, `max_value`, and `extreme`, all aggregated time series must have the same data type. Other aggregate functions have no such restriction. - -**Example 1:** Series named status exist under different databases, such as "root.ln.wf01.wt01.status", "root.ln.wf02.wt02.status", and "root.sgcc.wf03.wt01.status". To count the data points of the status series under each database, use the following query: - -```sql -select count(status) from root.** group by level = 1; -``` - -The result is: - -```shell -+-------------------------+---------------------------+ -|count(root.ln.*.*.status)|count(root.sgcc.*.*.status)| -+-------------------------+---------------------------+ -| 20160| 10080| -+-------------------------+---------------------------+ -Total line number = 1 -It costs 0.003s -``` - -**Example 2:** To count the data points of the status series under each device, set level = 3: - -```sql -select count(status) from root.** group by level = 3; -``` - -The result is: - -```shell -+---------------------------+---------------------------+ -|count(root.*.*.wt01.status)|count(root.*.*.wt02.status)| -+---------------------------+---------------------------+ -| 20160| 10080| -+---------------------------+---------------------------+ -Total line number = 1 -It costs 0.003s -``` - -Note that the devices named `wt01` under databases `ln` and `sgcc` are treated as the same device and aggregated together. - -**Example 3:** To count the data points of the status series in each device under each database, use the following query: - -```sql -select count(status) from root.** group by level = 1, 3; -``` - 
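How the result column names of a level-grouped query come about can be sketched in a few lines of Python. This is an illustration only, not IoTDB code: `group_key` and `group_series` are hypothetical helpers, the rule (keep the root node, the measurement, and the nodes at the requested levels, replace everything else with `*`) is inferred from the example outputs in this section, and splitting on `.` assumes node names without quoted dots.

```python
from collections import defaultdict


def group_key(path: str, levels: list[int]) -> str:
    """Return the grouping pattern for one series path (hypothetical helper).

    The root (level 0), the trailing measurement and the nodes at the given
    levels are kept; every other node is replaced with '*'.
    """
    nodes = path.split(".")  # assumption: no quoted node names containing '.'
    kept = set(levels) | {0, len(nodes) - 1}
    return ".".join(node if i in kept else "*" for i, node in enumerate(nodes))


def group_series(paths: list[str], levels: list[int]) -> dict[str, list[str]]:
    """Bucket series paths by grouping pattern; one aggregate per bucket."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path in paths:
        groups[group_key(path, levels)].append(path)
    return dict(groups)


# The three status series named in Example 1.
series = [
    "root.ln.wf01.wt01.status",
    "root.ln.wf02.wt02.status",
    "root.sgcc.wf03.wt01.status",
]

for pattern, members in group_series(series, [1, 3]).items():
    print(pattern, members)
```

With `levels=[1, 3]` the three sample paths fall into the patterns `root.ln.*.wt01.status`, `root.ln.*.wt02.status`, and `root.sgcc.*.wt01.status`, which are the three `count(...)` columns that the `group by level = 1, 3` query returns.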
-运行结果为: - -```shell -+----------------------------+----------------------------+------------------------------+ -|count(root.ln.*.wt01.status)|count(root.ln.*.wt02.status)|count(root.sgcc.*.wt01.status)| -+----------------------------+----------------------------+------------------------------+ -| 10080| 10080| 10080| -+----------------------------+----------------------------+------------------------------+ -Total line number = 1 -It costs 0.003s -``` - -**示例4:** 查询所有序列下温度传感器 temperature 的最大值,可以使用下列查询语句: - -```sql -select max_value(temperature) from root.** group by level = 0; -``` - -运行结果: - -```shell -+---------------------------------+ -|max_value(root.*.*.*.temperature)| -+---------------------------------+ -| 26.0| -+---------------------------------+ -Total line number = 1 -It costs 0.013s -``` - -**示例5:** 上面的查询都是针对某一个传感器,特别地,**如果想要查询某一层级下所有传感器拥有的总数据点数,则需要显式规定测点为 `*`** - -```sql -select count(*) from root.ln.** group by level = 2; -``` - -运行结果: - -```shell -+----------------------+----------------------+ -|count(root.*.wf01.*.*)|count(root.*.wf02.*.*)| -+----------------------+----------------------+ -| 20160| 20160| -+----------------------+----------------------+ -Total line number = 1 -It costs 0.013s -``` - -##### 与时间区间分段聚合混合使用 - -通过定义 LEVEL 来统计指定层级下的数据点个数。 - -例如: - -统计降采样后的数据点个数 - -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d), level=1; -``` - -结果: - -```shell -+-----------------------------+-------------------------+ -| Time|COUNT(root.ln.*.*.status)| -+-----------------------------+-------------------------+ -|2017-11-02T00:00:00.000+08:00| 1440| -|2017-11-03T00:00:00.000+08:00| 1440| -|2017-11-04T00:00:00.000+08:00| 1440| -|2017-11-05T00:00:00.000+08:00| 1440| -|2017-11-06T00:00:00.000+08:00| 1440| -|2017-11-07T00:00:00.000+08:00| 1440| -|2017-11-07T23:00:00.000+08:00| 1380| -+-----------------------------+-------------------------+ -Total line number = 7 -It costs 0.006s -``` - -加上滑动 
Step 的降采样后的结果也可以汇总 - -```sql -select count(status) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d), level=1; -``` - -```shell -+-----------------------------+-------------------------+ -| Time|COUNT(root.ln.*.*.status)| -+-----------------------------+-------------------------+ -|2017-11-01T00:00:00.000+08:00| 180| -|2017-11-02T00:00:00.000+08:00| 180| -|2017-11-03T00:00:00.000+08:00| 180| -|2017-11-04T00:00:00.000+08:00| 180| -|2017-11-05T00:00:00.000+08:00| 180| -|2017-11-06T00:00:00.000+08:00| 180| -|2017-11-07T00:00:00.000+08:00| 180| -+-----------------------------+-------------------------+ -Total line number = 7 -It costs 0.004s -``` - -#### 标签分组聚合 - -IoTDB 支持通过 `GROUP BY TAGS` 语句根据时间序列中定义的标签的键值做分组聚合查询。 - -我们先在 IoTDB 中写入如下示例数据,稍后会以这些数据为例介绍标签聚合查询。 - -这些是某工厂 `factory1` 在多个城市的多个车间的设备温度数据, 时间范围为 [1000, 10000)。 - -时间序列路径中的设备一级是设备唯一标识。城市信息 `city` 和车间信息 `workshop` 则被建模在该设备时间序列的标签中。 -其中,设备 `d1`、`d2` 在 `Beijing` 的 `w1` 车间, `d3`、`d4` 在 `Beijing` 的 `w2` 车间,`d5`、`d6` 在 `Shanghai` 的 `w1` 车间,`d7` 在 `Shanghai` 的 `w2` 车间。 -`d8` 和 `d9` 设备目前处于调试阶段,还未被分配到具体的城市和车间,所以其相应的标签值为空值。 - -```SQL -create database root.factory1; -create timeseries root.factory1.d1.temperature with datatype=FLOAT tags(city=Beijing, workshop=w1); -create timeseries root.factory1.d2.temperature with datatype=FLOAT tags(city=Beijing, workshop=w1); -create timeseries root.factory1.d3.temperature with datatype=FLOAT tags(city=Beijing, workshop=w2); -create timeseries root.factory1.d4.temperature with datatype=FLOAT tags(city=Beijing, workshop=w2); -create timeseries root.factory1.d5.temperature with datatype=FLOAT tags(city=Shanghai, workshop=w1); -create timeseries root.factory1.d6.temperature with datatype=FLOAT tags(city=Shanghai, workshop=w1); -create timeseries root.factory1.d7.temperature with datatype=FLOAT tags(city=Shanghai, workshop=w2); -create timeseries root.factory1.d8.temperature with datatype=FLOAT; -create timeseries root.factory1.d9.temperature with 
datatype=FLOAT; - -insert into root.factory1.d1(time, temperature) values(1000, 104.0); -insert into root.factory1.d1(time, temperature) values(3000, 104.2); -insert into root.factory1.d1(time, temperature) values(5000, 103.3); -insert into root.factory1.d1(time, temperature) values(7000, 104.1); - -insert into root.factory1.d2(time, temperature) values(1000, 104.4); -insert into root.factory1.d2(time, temperature) values(3000, 103.7); -insert into root.factory1.d2(time, temperature) values(5000, 103.3); -insert into root.factory1.d2(time, temperature) values(7000, 102.9); - -insert into root.factory1.d3(time, temperature) values(1000, 103.9); -insert into root.factory1.d3(time, temperature) values(3000, 103.8); -insert into root.factory1.d3(time, temperature) values(5000, 102.7); -insert into root.factory1.d3(time, temperature) values(7000, 106.9); - -insert into root.factory1.d4(time, temperature) values(1000, 103.9); -insert into root.factory1.d4(time, temperature) values(5000, 102.7); -insert into root.factory1.d4(time, temperature) values(7000, 106.9); - -insert into root.factory1.d5(time, temperature) values(1000, 112.9); -insert into root.factory1.d5(time, temperature) values(7000, 113.0); - -insert into root.factory1.d6(time, temperature) values(1000, 113.9); -insert into root.factory1.d6(time, temperature) values(3000, 113.3); -insert into root.factory1.d6(time, temperature) values(5000, 112.7); -insert into root.factory1.d6(time, temperature) values(7000, 112.3); - -insert into root.factory1.d7(time, temperature) values(1000, 101.2); -insert into root.factory1.d7(time, temperature) values(3000, 99.3); -insert into root.factory1.d7(time, temperature) values(5000, 100.1); -insert into root.factory1.d7(time, temperature) values(7000, 99.8); - -insert into root.factory1.d8(time, temperature) values(1000, 50.0); -insert into root.factory1.d8(time, temperature) values(3000, 52.1); -insert into root.factory1.d8(time, temperature) values(5000, 50.1); -insert into 
root.factory1.d8(time, temperature) values(7000, 50.5); - -insert into root.factory1.d9(time, temperature) values(1000, 50.3); -insert into root.factory1.d9(time, temperature) values(3000, 52.1); -``` - -##### 单标签聚合查询 - -用户想统计该工厂每个地区的设备的温度的平均值,可以使用如下查询语句 - -```SQL -SELECT AVG(temperature) FROM root.factory1.** GROUP BY TAGS(city); -``` - -该查询会将具有同一个 `city` 标签值的时间序列的所有满足查询条件的点做平均值计算,计算结果如下 - -```shell -+--------+------------------+ -| city| avg(temperature)| -+--------+------------------+ -| Beijing|104.04666697184244| -|Shanghai|107.85000076293946| -| NULL| 50.84999910990397| -+--------+------------------+ -Total line number = 3 -It costs 0.231s -``` - -从结果集中可以看到,和分段聚合、按层次分组聚合相比,标签聚合的查询结果的不同点是: -1. 标签聚合查询的聚合结果不会再做去星号展开,而是将多个时间序列的数据作为一个整体进行聚合计算。 -2. 标签聚合查询除了输出聚合结果列,还会输出聚合标签的键值列。该列的列名为聚合指定的标签键,列的值则为所有查询的时间序列中出现的该标签的值。 -如果某些时间序列未设置该标签,则在键值列中有一行单独的 `NULL` ,代表未设置标签的所有时间序列数据的聚合结果。 - -##### 多标签分组聚合查询 - -除了基本的单标签聚合查询外,还可以按顺序指定多个标签进行聚合计算。 - -例如,用户想统计每个城市的每个车间内设备的平均温度。但因为各个城市的车间名称有可能相同,所以不能直接按照 `workshop` 做标签聚合。必须要先按照城市,再按照车间处理。 - -SQL 语句如下 - -```SQL -SELECT avg(temperature) FROM root.factory1.** GROUP BY TAGS(city, workshop); -``` - -查询结果如下 - -```shell -+--------+--------+------------------+ -| city|workshop| avg(temperature)| -+--------+--------+------------------+ -| NULL| NULL| 50.84999910990397| -|Shanghai| w1|113.01666768391927| -| Beijing| w2| 104.4000004359654| -|Shanghai| w2|100.10000038146973| -| Beijing| w1|103.73750019073486| -+--------+--------+------------------+ -Total line number = 5 -It costs 0.027s -``` - -从结果集中可以看到,和单标签聚合相比,多标签聚合的查询结果会根据指定的标签顺序,输出相应标签的键值列。 - -##### 基于时间区间的标签聚合查询 - -按照时间区间聚合是时序数据库中最常用的查询需求之一。IoTDB 在基于时间区间的聚合基础上,支持进一步按照标签进行聚合查询。 - -例如,用户想统计时间 `[1000, 10000)` 范围内,每个城市每个车间中的设备每 5 秒内的平均温度。 - -SQL 语句如下 - -```SQL -SELECT AVG(temperature) FROM root.factory1.** GROUP BY ([1000, 10000), 5s), TAGS(city, workshop); -``` - -查询结果如下 - -```shell -+-----------------------------+--------+--------+------------------+ -| Time| city|workshop| avg(temperature)| 
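-+-----------------------------+--------+--------+------------------+ -|1970-01-01T08:00:01.000+08:00| NULL| NULL| 50.91999893188476| -|1970-01-01T08:00:01.000+08:00|Shanghai| w1|113.20000076293945| -|1970-01-01T08:00:01.000+08:00| Beijing| w2| 103.4| -|1970-01-01T08:00:01.000+08:00|Shanghai| w2| 100.1999994913737| -|1970-01-01T08:00:01.000+08:00| Beijing| w1|103.81666692097981| -|1970-01-01T08:00:06.000+08:00| NULL| NULL| 50.5| -|1970-01-01T08:00:06.000+08:00|Shanghai| w1| 112.6500015258789| -|1970-01-01T08:00:06.000+08:00| Beijing| w2| 106.9000015258789| -|1970-01-01T08:00:06.000+08:00|Shanghai| w2| 99.80000305175781| -|1970-01-01T08:00:06.000+08:00| Beijing| w1| 103.5| -+-----------------------------+--------+--------+------------------+ -``` - -Compared with plain tag aggregation, a tag aggregation query over time intervals first partitions the data by time interval and then, within each interval, aggregates the data according to the specified tag order. The result set contains a time column whose values have the same meaning as in a time-interval aggregation query. - -##### Limitations of Tag Grouping Aggregation - -Tag aggregation is still under development, so the following features are not yet implemented. - -> 1. Filtering query results with a `HAVING` clause is not yet supported. -> 2. Sorting results by tag value is not yet supported. -> 3. `LIMIT`, `OFFSET`, `SLIMIT`, and `SOFFSET` are not yet supported. -> 4. `ALIGN BY DEVICE` is not yet supported. -> 5. Expressions inside aggregate functions, such as `count(s+1)`, are not yet supported. -> 6. Aggregation with value filter conditions is not supported, consistent with the behavior of level-grouped aggregation queries. - -## 5. Filtering Aggregation Results (HAVING Clause) - -To filter the results of an aggregation query, use a `HAVING` clause after the `GROUP BY` clause. - -**Note:** - -1. The filter condition in a `HAVING` clause must be built from aggregate values; a raw series cannot appear on its own. - - The following usages are incorrect: - ```sql - select count(s1) from root.** group by ([1,3),1ms) having sum(s1) > s1; - select count(s1) from root.** group by ([1,3),1ms) having s1 > 1; - ``` - -2. 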
对`GROUP BY LEVEL`结果进行过滤时,`SELECT`和`HAVING`中出现的PATH只能有一级。 - - 下列使用方式是不正确的: - ```sql - select count(s1) from root.** group by ([1,3),1ms), level=1 having sum(d1.s1) > 1; - select count(d1.s1) from root.** group by ([1,3),1ms), level=1 having sum(s1) > 1; - ``` - -**SQL 示例:** - -- **示例 1:** - - 对于以下聚合结果进行过滤: - - ```shell - +-----------------------------+---------------------+---------------------+ - | Time|count(root.test.*.s1)|count(root.test.*.s2)| - +-----------------------------+---------------------+---------------------+ - |1970-01-01T08:00:00.001+08:00| 4| 4| - |1970-01-01T08:00:00.003+08:00| 1| 0| - |1970-01-01T08:00:00.005+08:00| 2| 4| - |1970-01-01T08:00:00.007+08:00| 3| 2| - |1970-01-01T08:00:00.009+08:00| 4| 4| - +-----------------------------+---------------------+---------------------+ - ``` - - ```sql - select count(s1) from root.** group by ([1,11),2ms), level=1 having count(s2) > 2; - ``` - - 执行结果如下: - - ```shell - +-----------------------------+---------------------+ - | Time|count(root.test.*.s1)| - +-----------------------------+---------------------+ - |1970-01-01T08:00:00.001+08:00| 4| - |1970-01-01T08:00:00.005+08:00| 2| - |1970-01-01T08:00:00.009+08:00| 4| - +-----------------------------+---------------------+ - ``` - -- **示例 2:** - - 对于以下聚合结果进行过滤: - ```shell - +-----------------------------+-------------+---------+---------+ - | Time| Device|count(s1)|count(s2)| - +-----------------------------+-------------+---------+---------+ - |1970-01-01T08:00:00.001+08:00|root.test.sg1| 1| 2| - |1970-01-01T08:00:00.003+08:00|root.test.sg1| 1| 0| - |1970-01-01T08:00:00.005+08:00|root.test.sg1| 1| 2| - |1970-01-01T08:00:00.007+08:00|root.test.sg1| 2| 1| - |1970-01-01T08:00:00.009+08:00|root.test.sg1| 2| 2| - |1970-01-01T08:00:00.001+08:00|root.test.sg2| 2| 2| - |1970-01-01T08:00:00.003+08:00|root.test.sg2| 0| 0| - |1970-01-01T08:00:00.005+08:00|root.test.sg2| 1| 2| - |1970-01-01T08:00:00.007+08:00|root.test.sg2| 1| 1| - 
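|1970-01-01T08:00:00.009+08:00|root.test.sg2| 2| 2| - +-----------------------------+-------------+---------+---------+ - ``` - - ```sql - select count(s1), count(s2) from root.** group by ([1,11),2ms) having count(s2) > 1 align by device; - ``` - - The result is as follows: - - ```shell - +-----------------------------+-------------+---------+---------+ - | Time| Device|count(s1)|count(s2)| - +-----------------------------+-------------+---------+---------+ - |1970-01-01T08:00:00.001+08:00|root.test.sg1| 1| 2| - |1970-01-01T08:00:00.005+08:00|root.test.sg1| 1| 2| - |1970-01-01T08:00:00.009+08:00|root.test.sg1| 2| 2| - |1970-01-01T08:00:00.001+08:00|root.test.sg2| 2| 2| - |1970-01-01T08:00:00.005+08:00|root.test.sg2| 1| 2| - |1970-01-01T08:00:00.009+08:00|root.test.sg2| 2| 2| - +-----------------------------+-------------+---------+---------+ - ``` - - -## 6. Filling Null Values in the Result Set (FILL Clause) - -### 6.1 Overview - -When some queries are executed, a given row and column of the result set may have no data, so that position is null. Such nulls hinder data visualization and analysis and need to be filled. - -In IoTDB, the `FILL` clause specifies how missing data is filled. It lets users fill the nulls in the result set of any query with a specific method, such as taking the previous non-null value or linear interpolation. - -### 6.2 Syntax - -**The `FILL` clause is defined as follows:** - -```sql -FILL '(' PREVIOUS | LINEAR | constant ')' -``` - -**Note:** -- Only one fill method can be specified in a `FILL` clause, and it applies to all columns of the result set. -- Null filling is not compatible with the syntax of version 0.13 and earlier (that is, `FILL((<data_type>[<fill_method>(, <before_range>, <after_range>)?])+)` is not supported). - -### 6.3 Fill Methods - -**IoTDB currently supports the following three null-filling methods:** - -- `PREVIOUS` fill: fills with the previous non-null value in the column. -- `LINEAR` fill: fills with the linear interpolation between the previous and the next non-null values in the column. -- Constant fill: fills with a specified constant. - -**The fill methods supported by each data type are listed below:** - -| Data Type | Supported Fill Methods | -| :------- |:------------------------| -| BOOLEAN | `PREVIOUS`, constant | -| INT32 | `PREVIOUS`, `LINEAR`, constant | -| INT64 | `PREVIOUS`, `LINEAR`, constant | -| FLOAT | `PREVIOUS`, `LINEAR`, constant | -| DOUBLE | `PREVIOUS`, `LINEAR`, constant | -| TEXT | `PREVIOUS`, constant | - -**Note:** For a column whose data type does not support the specified fill method, IoTDB neither fills the column nor reports an error; the column is left unchanged. - -**The following examples illustrate this further.** - -Without any fill method: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000; -``` - -The query result is as follows: - -```shell 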
-+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| null| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| null| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| null| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -#### `PREVIOUS` 填充 - -**对于查询结果集中的空值,使用该列前一个非空值进行填充。** - -**注意:** 如果结果集的某一列第一个值就为空,则不会填充该值,直到遇到该列第一个非空值为止。 - -例如,使用 `PREVIOUS` 填充,SQL 语句如下: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(previous); -``` - -`PREVIOUS` 填充后的结果如下: - -```shell -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| 21.93| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| false| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 
-``` - -**在前值填充时,能够支持指定一个时间间隔,如果当前null值的时间戳与前一个非null值的时间戳的间隔,超过指定的时间间隔,则不进行填充。** - -> 1. 在线性填充和常量填充的情况下,如果指定了第二个参数,会抛出异常 -> 2. 时间超时参数仅支持整数 - 例如,原始数据如下所示: - -```sql -select s1 from root.db.d1; -``` -```shell -+-----------------------------+-------------+ -| Time|root.db.d1.s1| -+-----------------------------+-------------+ -|2023-11-08T16:41:50.008+08:00| 1.0| -+-----------------------------+-------------+ -|2023-11-08T16:46:50.011+08:00| 2.0| -+-----------------------------+-------------+ -|2023-11-08T16:48:50.011+08:00| 3.0| -+-----------------------------+-------------+ -``` - -根据时间分组,每1分钟求一个平均值 - -```sql -select avg(s1) - from root.db.d1 - group by([2023-11-08T16:40:00.008+08:00, 2023-11-08T16:50:00.008+08:00), 1m); -``` -```shell -+-----------------------------+------------------+ -| Time|avg(root.db.d1.s1)| -+-----------------------------+------------------+ -|2023-11-08T16:40:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:41:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:42:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:43:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:44:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:45:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:46:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:47:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:48:00.008+08:00| 3.0| -+-----------------------------+------------------+ -|2023-11-08T16:49:00.008+08:00| null| -+-----------------------------+------------------+ -``` - -根据时间分组并用前值填充 - -```sql -select avg(s1) - from root.db.d1 - group by([2023-11-08T16:40:00.008+08:00, 2023-11-08T16:50:00.008+08:00), 1m) - FILL(PREVIOUS); -``` -```shell 
-+-----------------------------+------------------+ -| Time|avg(root.db.d1.s1)| -+-----------------------------+------------------+ -|2023-11-08T16:40:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:41:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:42:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:43:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:44:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:45:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:46:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:47:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:48:00.008+08:00| 3.0| -+-----------------------------+------------------+ -|2023-11-08T16:49:00.008+08:00| 3.0| -+-----------------------------+------------------+ -``` - -根据时间分组并用前值填充,并指定超过2分钟的就不填充 - -```sql -select avg(s1) -from root.db.d1 -group by([2023-11-08T16:40:00.008+08:00, 2023-11-08T16:50:00.008+08:00), 1m) - FILL(PREVIOUS, 2m); -``` -```shell -+-----------------------------+------------------+ -| Time|avg(root.db.d1.s1)| -+-----------------------------+------------------+ -|2023-11-08T16:40:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:41:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:42:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:43:00.008+08:00| 1.0| -+-----------------------------+------------------+ -|2023-11-08T16:44:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:45:00.008+08:00| null| -+-----------------------------+------------------+ -|2023-11-08T16:46:00.008+08:00| 2.0| -+-----------------------------+------------------+ 
-|2023-11-08T16:47:00.008+08:00| 2.0| -+-----------------------------+------------------+ -|2023-11-08T16:48:00.008+08:00| 3.0| -+-----------------------------+------------------+ -|2023-11-08T16:49:00.008+08:00| 3.0| -+-----------------------------+------------------+ -``` - - -#### `LINEAR` 填充 - -**对于查询结果集中的空值,使用该列前一个非空值和下一个非空值的线性插值进行填充。** - -**注意:** -- 如果某个值之前的所有值都为空,或者某个值之后的所有值都为空,则不会填充该值。 -- 如果某列的数据类型为boolean/text,我们既不会填充它,也不会报错,只是让那一列保持原样。 - -例如,使用 `LINEAR` 填充,SQL 语句如下: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(linear); -``` - -`LINEAR` 填充后的结果如下: - -```shell -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| 22.08| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| null| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| null| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -#### 常量填充 - -**对于查询结果集中的空值,使用指定常量填充。** - -**注意:** -- 如果某列数据类型与常量类型不兼容,既不填充该列,也不报错,将该列保持原样。对于常量兼容的数据类型,如下表所示: - - | 常量类型 | 能够填充的序列数据类型 | - |:------ |:------------------ | - | `BOOLEAN` | `BOOLEAN` `TEXT` | - | `INT64` | `INT32` `INT64` `FLOAT` `DOUBLE` `TEXT` | - | `DOUBLE` | `FLOAT` `DOUBLE` `TEXT` | - | `TEXT` | `TEXT` | -- 当常量值大于 `INT32` 所能表示的最大值时,对于 `INT32` 类型的列,既不填充该列,也不报错,将该列保持原样。 - -例如,使用 `FLOAT` 类型的常量填充,SQL 语句如下: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where 
time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(2.0); -``` - -`FLOAT` 类型的常量填充后的结果如下: - -```shell -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| 2.0| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| null| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| null| -+-----------------------------+-------------------------------+--------------------------+ -Total line number = 4 -``` - -再比如,使用 `BOOLEAN` 类型的常量填充,SQL 语句如下: - -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(true); -``` - -`BOOLEAN` 类型的常量填充后的结果如下: - -```shell -+-----------------------------+-------------------------------+--------------------------+ -| Time|root.sgcc.wf03.wt01.temperature|root.sgcc.wf03.wt01.status| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:37:00.000+08:00| 21.93| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:38:00.000+08:00| null| false| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:39:00.000+08:00| 22.23| true| -+-----------------------------+-------------------------------+--------------------------+ -|2017-11-01T16:40:00.000+08:00| 23.43| true| -+-----------------------------+-------------------------------+--------------------------+ -Total 
line number = 4 -``` - - -## 7. Paginating Query Results (LIMIT/SLIMIT Clauses) - -When a result set is too large to display on one page, use the `LIMIT/SLIMIT` and `OFFSET/SOFFSET` clauses to paginate it. - -- The `LIMIT` and `SLIMIT` clauses control the number of rows and columns of the query result. -- The `OFFSET` and `SOFFSET` clauses control the starting position of the displayed result. - -### 7.1 Row Pagination - -The `LIMIT` and `OFFSET` clauses control the number of rows in the query result: `LIMIT rowLimit` specifies the number of rows returned, and `OFFSET rowOffset` specifies the starting row. - -Note: -- When `rowOffset` exceeds the size of the result set, an empty result set is returned. -- When `rowLimit` exceeds the size of the result set, all query results are returned. -- When `rowLimit` or `rowOffset` is not a positive integer, or exceeds the maximum value allowed by `INT64`, the system reports an error. - -The following examples demonstrate the `LIMIT` and `OFFSET` clauses. - -- **Example 1:** Basic `LIMIT` clause - -SQL statement: - -```sql -select status, temperature from root.ln.wf01.wt01 limit 10; -``` - -Meaning: - -The selected device is device wt01 of factory wf01 in group ln; the selected time series are "status" and "temperature". The SQL statement returns the first 10 rows of the query result. - -The result is as follows: - -```shell -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:00:00.000+08:00| true| 25.96| -|2017-11-01T00:01:00.000+08:00| true| 24.36| -|2017-11-01T00:02:00.000+08:00| false| 20.09| -|2017-11-01T00:03:00.000+08:00| false| 20.18| -|2017-11-01T00:04:00.000+08:00| false| 21.13| -|2017-11-01T00:05:00.000+08:00| false| 22.72| -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -|2017-11-01T00:08:00.000+08:00| false| 22.58| -|2017-11-01T00:09:00.000+08:00| false| 20.98| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 10 -It costs 0.000s -``` - -- **Example 2:** `LIMIT` clause with `OFFSET` - -SQL statement: - -```sql -select status, temperature from root.ln.wf01.wt01 limit 5 offset 3; -``` - -Meaning: - -The selected device is device wt01 of factory wf01 in group ln; the selected time series are "status" and "temperature". The SQL statement returns rows 3 to 7 of the query result (the first row is numbered 0). - -The result is as follows: - -```shell -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| 
-+-----------------------------+------------------------+-----------------------------+ -|2017-11-01T00:03:00.000+08:00| false| 20.18| -|2017-11-01T00:04:00.000+08:00| false| 21.13| -|2017-11-01T00:05:00.000+08:00| false| 22.72| -|2017-11-01T00:06:00.000+08:00| false| 20.71| -|2017-11-01T00:07:00.000+08:00| false| 21.45| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 5 -It costs 0.342s -``` - -- **Example 3:** `LIMIT` clause combined with a `WHERE` clause - -SQL statement: - -```sql -select status,temperature from root.ln.wf01.wt01 where time > 2024-07-07T00:05:00.000 and time < 2024-07-12T00:12:00.000 limit 5 offset 3; -``` - -Meaning: - -The selected device is device wt01 of factory wf01 in group ln; the selected time series are "status" and "temperature". The SQL statement returns rows 3 to 7 (the first row is numbered 0) of the status and temperature sensor values between "2024-07-07T00:05:00.000" and "2024-07-12T00:12:00.000". - -The result is as follows: - -```shell -+-----------------------------+------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.status|root.ln.wf01.wt01.temperature| -+-----------------------------+------------------------+-----------------------------+ -|2024-07-09T17:32:11.943+08:00| true| 24.941973| -|2024-07-09T17:32:12.944+08:00| true| 20.05108| -|2024-07-09T17:32:13.945+08:00| true| 20.541632| -|2024-07-09T17:32:14.945+08:00| null| 23.09016| -|2024-07-09T17:32:14.946+08:00| true| null| -+-----------------------------+------------------------+-----------------------------+ -Total line number = 5 -It costs 0.070s -``` - -- **Example 4:** `LIMIT` clause combined with a `GROUP BY` clause - -SQL statement: - -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) limit 4 offset 3; -``` - -Meaning: - -The SQL statement returns rows 3 to 6 of the query result (the first row is numbered 0). - -The result is as follows: - -```shell -+-----------------------------+-------------------------------+----------------------------------------+ -| Time|count(root.ln.wf01.wt01.status)|max_value(root.ln.wf01.wt01.temperature)| 
-+-----------------------------+-------------------------------+----------------------------------------+ -|2017-11-04T00:00:00.000+08:00| 1440| 26.0| -|2017-11-05T00:00:00.000+08:00| 1440| 26.0| -|2017-11-06T00:00:00.000+08:00| 1440| 25.99| -|2017-11-07T00:00:00.000+08:00| 1380| 26.0| -+-----------------------------+-------------------------------+----------------------------------------+ -Total line number = 4 -It costs 0.016s -``` - -### 7.2 Column Pagination - -The `SLIMIT` and `SOFFSET` clauses control the number of columns in the query result: `SLIMIT seriesLimit` specifies the number of columns returned, and `SOFFSET seriesOffset` specifies the starting column. - -Note: -- These clauses apply only to value columns; they have no effect on the time column or the device column. -- When `seriesOffset` exceeds the size of the result set, an empty result set is returned. -- When `seriesLimit` exceeds the size of the result set, all query results are returned. -- When `seriesLimit` or `seriesOffset` is not a positive integer, or exceeds the maximum value allowed by `INT64`, the system reports an error. - -The following examples demonstrate the `SLIMIT` and `SOFFSET` clauses. - -- **Example 1:** Basic `SLIMIT` clause - -SQL statement: - -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1; -``` - -Meaning: - -The selected device is device wt01 of factory wf01 in group ln; the selected time series is the first column under this device (column 0), that is, temperature. The SQL statement selects the temperature sensor values between the time points "2017-11-01T00:05:00.000" and "2017-11-01T00:12:00.000". - -The result is as follows: - -```shell -+-----------------------------+-----------------------------+ -| Time|root.ln.wf01.wt01.temperature| -+-----------------------------+-----------------------------+ -|2017-11-01T00:06:00.000+08:00| 20.71| -|2017-11-01T00:07:00.000+08:00| 21.45| -|2017-11-01T00:08:00.000+08:00| 22.58| -|2017-11-01T00:09:00.000+08:00| 20.98| -|2017-11-01T00:10:00.000+08:00| 25.52| -|2017-11-01T00:11:00.000+08:00| 22.91| -+-----------------------------+-----------------------------+ -Total line number = 6 -It costs 0.000s -``` - -- **Example 2:** `SLIMIT` clause with `SOFFSET` - -SQL statement: - -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1 soffset 1; -``` - -Meaning: - -The selected device is device wt01 of factory wf01 in group ln; the selected time series is the second column under this device (column 1), that is, the power status. The SQL statement selects the status sensor values between the time points "2017-11-01T00:05:00.000" and "2017-11-01T00:12:00.000". - -The result is as follows: - -```shell 
-+-----------------------------+------------------------+ -| Time|root.ln.wf01.wt01.status| -+-----------------------------+------------------------+ -|2017-11-01T00:06:00.000+08:00| false| -|2017-11-01T00:07:00.000+08:00| false| -|2017-11-01T00:08:00.000+08:00| false| -|2017-11-01T00:09:00.000+08:00| false| -|2017-11-01T00:10:00.000+08:00| true| -|2017-11-01T00:11:00.000+08:00| false| -+-----------------------------+------------------------+ -Total line number = 6 -It costs 0.003s -``` - -- **示例 3:** `SLIMIT` 子句与 `GROUP BY` 子句结合 - -SQL 语句: - -```sql -select max_value(*) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) slimit 1 soffset 1; -``` - -含义: - -```shell -+-----------------------------+-----------------------------------+ -| Time|max_value(root.ln.wf01.wt01.status)| -+-----------------------------+-----------------------------------+ -|2017-11-01T00:00:00.000+08:00| true| -|2017-11-02T00:00:00.000+08:00| true| -|2017-11-03T00:00:00.000+08:00| true| -|2017-11-04T00:00:00.000+08:00| true| -|2017-11-05T00:00:00.000+08:00| true| -|2017-11-06T00:00:00.000+08:00| true| -|2017-11-07T00:00:00.000+08:00| true| -+-----------------------------+-----------------------------------+ -Total line number = 7 -It costs 0.000s -``` - -- **示例 4:** `SLIMIT` 子句与 `LIMIT` 子句结合 - -SQL 语句: - -```sql -select * from root.ln.wf01.wt01 limit 10 offset 100 slimit 2 soffset 0; -``` - -含义: - -所选设备为 ln 组 wf01 工厂 wt01 设备; 所选时间序列是此设备下的第 0 列至第 1 列(第一列编号为第 0 列)。 SQL 语句子句要求返回查询结果的第 100 至 109 行(第一行编号为 0 行)。 - -结果如下所示: - -```shell -+-----------------------------+-----------------------------+------------------------+ -| Time|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status| -+-----------------------------+-----------------------------+------------------------+ -|2017-11-01T01:40:00.000+08:00| 21.19| false| -|2017-11-01T01:41:00.000+08:00| 22.79| false| -|2017-11-01T01:42:00.000+08:00| 22.98| false| -|2017-11-01T01:43:00.000+08:00| 21.52| false| 
-|2017-11-01T01:44:00.000+08:00| 23.45| true| -|2017-11-01T01:45:00.000+08:00| 24.06| true| -|2017-11-01T01:46:00.000+08:00| 22.6| false| -|2017-11-01T01:47:00.000+08:00| 23.78| true| -|2017-11-01T01:48:00.000+08:00| 24.72| true| -|2017-11-01T01:49:00.000+08:00| 24.68| true| -+-----------------------------+-----------------------------+------------------------+ -Total line number = 10 -It costs 0.009s -``` - -## 8. Sorting the Result Set (ORDER BY Clause) - -### 8.1 Sorting in Time-Aligned Mode -IoTDB query result sets are aligned by time by default; use an `ORDER BY TIME` clause to specify the order of the timestamps. For example: -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time desc; -``` -Result: - -```shell -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -| Time|root.ln.wf02.wt02.hardware|root.ln.wf02.wt02.status|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -|2017-11-01T00:01:00.000+08:00| v2| true| 24.36| true| -|2017-11-01T00:00:00.000+08:00| v2| true| 25.96| true| -|1970-01-01T08:00:00.002+08:00| v2| false| null| null| -|1970-01-01T08:00:00.001+08:00| v1| true| null| null| -+-----------------------------+--------------------------+------------------------+-----------------------------+------------------------+ -``` -### 8.2 Sorting in Device-Aligned Mode -When querying with `ALIGN BY DEVICE`, use an `ORDER BY` clause to specify the order of the returned result set. - -Device-aligned mode supports four sorting clauses built from two sort keys, `DEVICE` and `TIME`. The key listed first is the primary sort key, and each key supports both `ASC` and `DESC` order. -1. ``ORDER BY DEVICE``: sorts by device name in lexicographic order, so rows from the same device are displayed together as a group. - -2. ``ORDER BY TIME``: sorts by timestamp; data points from different devices are interleaved according to their timestamps. - -3. ``ORDER BY DEVICE,TIME``: sorts by device name in lexicographic order; data points with the same device name are sorted by timestamp. - -4. 
``ORDER BY TIME,DEVICE``: sorts by timestamp; data points with the same timestamp are sorted by device name in lexicographic order. - -> To keep results readable, when only `ALIGN BY DEVICE` is used without an `ORDER BY` clause, a default ordering is applied to the device view. The default is ``ORDER BY DEVICE,TIME`` with `ASC` order, -> that is, the result set is sorted by device name in ascending order first, and by timestamp in ascending order within the same device name. - - -When the primary sort key is `DEVICE`, the result set has a format similar to the default: results are ordered by device name first, and by timestamp within the same device name. For example: -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by device desc,time asc align by device; -``` -Result: - -```shell -+-----------------------------+-----------------+--------+------+-----------+ -| Time| Device|hardware|status|temperature| -+-----------------------------+-----------------+--------+------+-----------+ -|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02| v1| true| null| -|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02| v2| false| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01| null| true| 25.96| -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| true| 24.36| -+-----------------------------+-----------------+--------+------+-----------+ -``` -When the primary sort key is `TIME`, the result set is sorted by timestamp first, and by device name when timestamps are equal. -For example: -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time asc,device desc align by device; -``` -Result: -```shell -+-----------------------------+-----------------+--------+------+-----------+ -| Time| Device|hardware|status|temperature| -+-----------------------------+-----------------+--------+------+-----------+ -|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02| v1| true| null| -|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02| v2| false| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01| null| true| 25.96| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| true| 24.36| 
-+-----------------------------+-----------------+--------+------+-----------+ -``` -当没有显式指定时,主排序键默认为`Device`,排序顺序默认为`ASC`,示例代码如下: -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -``` -结果如图所示,可以看出,`ORDER BY DEVICE ASC,TIME ASC`就是默认情况下的排序方式,由于`ASC`是默认排序顺序,此处可以省略。 -```shell -+-----------------------------+-----------------+--------+------+-----------+ -| Time| Device|hardware|status|temperature| -+-----------------------------+-----------------+--------+------+-----------+ -|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01| null| true| 25.96| -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| true| 24.36| -|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02| v1| true| null| -|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02| v2| false| null| -|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| v2| true| null| -+-----------------------------+-----------------+--------+------+-----------+ -``` -同样,可以在聚合查询中使用`ALIGN BY DEVICE`和`ORDER BY`子句,对聚合后的结果进行排序,示例代码如下所示: -```sql -select count(*) from root.ln.** group by ((2017-11-01T00:00:00.000+08:00,2017-11-01T00:03:00.000+08:00],1m) order by device asc,time asc align by device; -``` -执行结果: -```shell -+-----------------------------+-----------------+---------------+-------------+------------------+ -| Time| Device|count(hardware)|count(status)|count(temperature)| -+-----------------------------+-----------------+---------------+-------------+------------------+ -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| null| 1| 1| -|2017-11-01T00:02:00.000+08:00|root.ln.wf01.wt01| null| 0| 0| -|2017-11-01T00:03:00.000+08:00|root.ln.wf01.wt01| null| 0| 0| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| 1| 1| null| -|2017-11-01T00:02:00.000+08:00|root.ln.wf02.wt02| 0| 0| null| -|2017-11-01T00:03:00.000+08:00|root.ln.wf02.wt02| 0| 0| null| 
-+-----------------------------+-----------------+---------------+-------------+------------------+ -``` - -### 8.3 任意表达式排序 -除了IoTDB中规定的Time,Device关键字外,还可以通过`ORDER BY`子句对指定时间序列中任意列的表达式进行排序。 - -排序在通过`ASC`,`DESC`指定排序顺序的同时,可以通过`NULLS`语法来指定NULL值在排序中的优先级,`NULLS FIRST`默认NULL值在结果集的最上方,`NULLS LAST`则保证NULL值在结果集的最后。如果没有在子句中指定,则默认顺序为`ASC`,`NULLS LAST`。 - -对于如下的数据,将给出几个任意表达式的查询示例供参考: -```shell -+-----------------------------+-------------+-------+-------+--------+-------+ -| Time| Device| base| score| bonus| total| -+-----------------------------+-------------+-------+-------+--------+-------+ -|1970-01-01T08:00:00.000+08:00| root.one| 12| 50.0| 45.0| 107.0| -|1970-01-02T08:00:00.000+08:00| root.one| 10| 50.0| 45.0| 105.0| -|1970-01-03T08:00:00.000+08:00| root.one| 8| 50.0| 45.0| 103.0| -|1970-01-01T08:00:00.010+08:00| root.two| 9| 50.0| 15.0| 74.0| -|1970-01-01T08:00:00.020+08:00| root.two| 8| 10.0| 15.0| 33.0| -|1970-01-01T08:00:00.010+08:00| root.three| 9| null| 24.0| 33.0| -|1970-01-01T08:00:00.020+08:00| root.three| 8| null| 22.5| 30.5| -|1970-01-01T08:00:00.030+08:00| root.three| 7| null| 23.5| 30.5| -|1970-01-01T08:00:00.010+08:00| root.four| 9| 32.0| 45.0| 86.0| -|1970-01-01T08:00:00.020+08:00| root.four| 8| 32.0| 45.0| 85.0| -|1970-01-01T08:00:00.030+08:00| root.five| 7| 53.0| 44.0| 104.0| -|1970-01-01T08:00:00.040+08:00| root.five| 6| 54.0| 42.0| 102.0| -+-----------------------------+-------------+-------+-------+--------+-------+ -``` - -当需要根据基础分数score对结果进行排序时,可以直接使用 -```Sql -select score from root.** order by score desc align by device; -``` -会得到如下结果 - -```shell -+-----------------------------+---------+-----+ -| Time| Device|score| -+-----------------------------+---------+-----+ -|1970-01-01T08:00:00.040+08:00|root.five| 54.0| -|1970-01-01T08:00:00.030+08:00|root.five| 53.0| -|1970-01-01T08:00:00.000+08:00| root.one| 50.0| -|1970-01-02T08:00:00.000+08:00| root.one| 50.0| -|1970-01-03T08:00:00.000+08:00| root.one| 50.0| -|1970-01-01T08:00:00.000+08:00| root.two| 
50.0| -|1970-01-01T08:00:00.010+08:00| root.two| 50.0| -|1970-01-01T08:00:00.010+08:00|root.four| 32.0| -|1970-01-01T08:00:00.020+08:00|root.four| 32.0| -|1970-01-01T08:00:00.020+08:00| root.two| 10.0| -+-----------------------------+---------+-----+ -``` - -当想要根据总分对结果进行排序,可以在order by子句中使用表达式进行计算 -```Sql -select score,total from root.one order by base+score+bonus desc; -``` -该sql等价于 -```Sql -select score,total from root.one order by total desc; -``` -得到如下结果 - -```shell -+-----------------------------+--------------+--------------+ -| Time|root.one.score|root.one.total| -+-----------------------------+--------------+--------------+ -|1970-01-01T08:00:00.000+08:00| 50.0| 107.0| -|1970-01-02T08:00:00.000+08:00| 50.0| 105.0| -|1970-01-03T08:00:00.000+08:00| 50.0| 103.0| -+-----------------------------+--------------+--------------+ -``` -而如果要对总分进行排序,且分数相同时依次根据score, base, bonus和提交时间进行排序时,可以通过多个表达式来指定多层排序 - -```Sql -select base, score, bonus, total from root.** order by total desc NULLS Last, - score desc NULLS Last, - bonus desc NULLS Last, - time desc align by device; -``` -得到如下结果 -```shell -+-----------------------------+----------+----+-----+-----+-----+ -| Time| Device|base|score|bonus|total| -+-----------------------------+----------+----+-----+-----+-----+ -|1970-01-01T08:00:00.000+08:00| root.one| 12| 50.0| 45.0|107.0| -|1970-01-02T08:00:00.000+08:00| root.one| 10| 50.0| 45.0|105.0| -|1970-01-01T08:00:00.030+08:00| root.five| 7| 53.0| 44.0|104.0| -|1970-01-03T08:00:00.000+08:00| root.one| 8| 50.0| 45.0|103.0| -|1970-01-01T08:00:00.040+08:00| root.five| 6| 54.0| 42.0|102.0| -|1970-01-01T08:00:00.010+08:00| root.four| 9| 32.0| 45.0| 86.0| -|1970-01-01T08:00:00.020+08:00| root.four| 8| 32.0| 45.0| 85.0| -|1970-01-01T08:00:00.010+08:00| root.two| 9| 50.0| 15.0| 74.0| -|1970-01-01T08:00:00.000+08:00| root.two| 9| 50.0| 15.0| 74.0| -|1970-01-01T08:00:00.020+08:00| root.two| 8| 10.0| 15.0| 33.0| -|1970-01-01T08:00:00.010+08:00|root.three| 9| null| 24.0| 33.0| 
-|1970-01-01T08:00:00.030+08:00|root.three| 7| null| 23.5| 30.5| -|1970-01-01T08:00:00.020+08:00|root.three| 8| null| 22.5| 30.5| -+-----------------------------+----------+----+-----+-----+-----+ -``` -在order by中同样可以使用聚合查询表达式 -```Sql -select min_value(total) from root.** order by min_value(total) asc align by device; -``` -得到如下结果 -```shell -+----------+----------------+ -| Device|min_value(total)| -+----------+----------------+ -|root.three| 30.5| -| root.two| 33.0| -| root.four| 85.0| -| root.five| 102.0| -| root.one| 103.0| -+----------+----------------+ -``` -当在查询中指定多列,未被排序的列会随着行和排序列一起改变顺序,当排序列相同时行的顺序和具体实现有关(没有固定顺序) -```Sql -select min_value(total),max_value(base) from root.** order by max_value(total) desc align by device; -``` -得到结果如下 - -```shell -+----------+----------------+---------------+ -| Device|min_value(total)|max_value(base)| -+----------+----------------+---------------+ -| root.one| 103.0| 12| -| root.five| 102.0| 7| -| root.four| 85.0| 9| -| root.two| 33.0| 9| -|root.three| 30.5| 9| -+----------+----------------+---------------+ -``` - -Order by device, time可以和order by expression共同使用 -```Sql -select score from root.** order by device asc, score desc, time asc align by device; -``` -会得到如下结果 -```shell -+-----------------------------+---------+-----+ -| Time| Device|score| -+-----------------------------+---------+-----+ -|1970-01-01T08:00:00.040+08:00|root.five| 54.0| -|1970-01-01T08:00:00.030+08:00|root.five| 53.0| -|1970-01-01T08:00:00.010+08:00|root.four| 32.0| -|1970-01-01T08:00:00.020+08:00|root.four| 32.0| -|1970-01-01T08:00:00.000+08:00| root.one| 50.0| -|1970-01-02T08:00:00.000+08:00| root.one| 50.0| -|1970-01-03T08:00:00.000+08:00| root.one| 50.0| -|1970-01-01T08:00:00.000+08:00| root.two| 50.0| -|1970-01-01T08:00:00.010+08:00| root.two| 50.0| -|1970-01-01T08:00:00.020+08:00| root.two| 10.0| -+-----------------------------+---------+-----+ -``` - -## 9. 
查询对齐模式(ALIGN BY DEVICE 子句) - -在 IoTDB 中,查询结果集**默认按照时间对齐**,包含一列时间列和若干个值列,每一行数据各列的时间戳相同。 - -除按照时间对齐外,还支持以下对齐模式: - -- 按设备对齐 `ALIGN BY DEVICE` - -### 9.1 按设备对齐 - -在按设备对齐模式下,设备名会单独作为一列出现,查询结果集包含一列时间列、一列设备列和若干个值列。如果 `SELECT` 子句中选择了 `N` 列,则结果集包含 `N + 2` 列(时间列和设备名字列)。 - -在默认情况下,结果集按照 `Device` 进行排列,在每个 `Device` 内按照 `Time` 列升序排序。 - -当查询多个设备时,要求设备之间同名的列数据类型相同。 - -为便于理解,可以按照关系模型进行对应。设备可以视为关系模型中的表,选择的列可以视为表中的列,`Time + Device` 看做其主键。 - -**示例:** - -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -``` - -执行如下: - -```shell -+-----------------------------+-----------------+-----------+------+--------+ -| Time| Device|temperature|status|hardware| -+-----------------------------+-----------------+-----------+------+--------+ -|2017-11-01T00:00:00.000+08:00|root.ln.wf01.wt01| 25.96| true| null| -|2017-11-01T00:01:00.000+08:00|root.ln.wf01.wt01| 24.36| true| null| -|1970-01-01T08:00:00.001+08:00|root.ln.wf02.wt02| null| true| v1| -|1970-01-01T08:00:00.002+08:00|root.ln.wf02.wt02| null| false| v2| -|2017-11-01T00:00:00.000+08:00|root.ln.wf02.wt02| null| true| v2| -|2017-11-01T00:01:00.000+08:00|root.ln.wf02.wt02| null| true| v2| -+-----------------------------+-----------------+-----------+------+--------+ -Total line number = 6 -It costs 0.012s -``` -### 9.2 设备对齐模式下的排序 -在设备对齐模式下,默认按照设备名的字典序升序排列,每个设备内部按照时间戳大小升序排列,可以通过 `ORDER BY` 子句调整设备列和时间列的排序优先级。 - -详细说明及示例见文档 [结果集排序](../SQL-Manual/Operator-and-Expression.md)。 - -## 10. 查询写回(INTO 子句) - -`SELECT INTO` 语句用于将查询结果写入一系列指定的时间序列中。 - -应用场景如下: -- **实现 IoTDB 内部 ETL**:对原始数据进行 ETL 处理后写入新序列。 -- **查询结果存储**:将查询结果进行持久化存储,起到类似物化视图的作用。 -- **非对齐序列转对齐序列**:对齐序列从0.13版本开始支持,可以通过该功能将非对齐序列的数据写入新的对齐序列中。 - -### 10.1 语法定义 - -#### 整体描述 - -```sql -selectIntoStatement - : SELECT - resultColumn [, resultColumn] ... - INTO intoItem [, intoItem] ... - FROM prefixPath [, prefixPath] ... 
- [WHERE whereCondition] - [GROUP BY groupByTimeClause, groupByLevelClause] - [FILL {PREVIOUS | LINEAR | constant}] - [LIMIT rowLimit OFFSET rowOffset] - [ALIGN BY DEVICE] - ; - -intoItem - : [ALIGNED] intoDevicePath '(' intoMeasurementName [',' intoMeasurementName]* ')' - ; -``` - -#### `INTO` 子句 - -`INTO` 子句由若干个 `intoItem` 构成。 - -每个 `intoItem` 由一个目标设备路径和一个包含若干目标物理量名的列表组成(与 `INSERT` 语句中的 `INTO` 子句写法类似)。 - -其中每个目标物理量名与目标设备路径组成一个目标序列,一个 `intoItem` 包含若干目标序列。例如:`root.sg_copy.d1(s1, s2)` 指定了两条目标序列 `root.sg_copy.d1.s1` 和 `root.sg_copy.d1.s2`。 - -`INTO` 子句指定的目标序列要能够与查询结果集的列一一对应。具体规则如下: - -- **按时间对齐**(默认):全部 `intoItem` 包含的目标序列数量要与查询结果集的列数(除时间列外)一致,且按照表头从左到右的顺序一一对应。 -- **按设备对齐**(使用 `ALIGN BY DEVICE`):全部 `intoItem` 中指定的目标设备数和查询的设备数(即 `FROM` 子句中路径模式匹配的设备数)一致,且按照结果集设备的输出顺序一一对应。 - 为每个目标设备指定的目标物理量数量要与查询结果集的列数(除时间和设备列外)一致,且按照表头从左到右的顺序一一对应。 - -下面通过示例进一步说明: - -- **示例 1**(按时间对齐) -```sql - select s1, s2 into root.sg_copy.d1(t1), root.sg_copy.d2(t1, t2), root.sg_copy.d1(t2) from root.sg.d1, root.sg.d2; -``` -```shell -+--------------+-------------------+--------+ -| source column| target timeseries| written| -+--------------+-------------------+--------+ -| root.sg.d1.s1| root.sg_copy.d1.t1| 8000| -+--------------+-------------------+--------+ -| root.sg.d2.s1| root.sg_copy.d2.t1| 10000| -+--------------+-------------------+--------+ -| root.sg.d1.s2| root.sg_copy.d2.t2| 12000| -+--------------+-------------------+--------+ -| root.sg.d2.s2| root.sg_copy.d1.t2| 10000| -+--------------+-------------------+--------+ -Total line number = 4 -It costs 0.725s -``` - -该语句将 `root.sg` database 下四条序列的查询结果写入到 `root.sg_copy` database 下指定的四条序列中。注意,`root.sg_copy.d2(t1, t2)` 也可以写做 `root.sg_copy.d2(t1), root.sg_copy.d2(t2)`。 - -可以看到,`INTO` 子句的写法非常灵活,只要满足组合出的目标序列没有重复,且与查询结果列一一对应即可。 - -> `CLI` 展示的结果集中,各列的含义如下: -> - `source column` 列表示查询结果的列名。 -> - `target timeseries` 表示对应列写入的目标序列。 -> - `written` 表示预期写入的数据量。 - -- **示例 2**(按时间对齐) -```sql - select count(s1 + s2), last_value(s2) into 
root.agg.count(s1_add_s2), root.agg.last_value(s2) from root.sg.d1 group by ([0, 100), 10ms); -``` -```shell -+--------------------------------------+-------------------------+--------+ -| source column| target timeseries| written| -+--------------------------------------+-------------------------+--------+ -| count(root.sg.d1.s1 + root.sg.d1.s2)| root.agg.count.s1_add_s2| 10| -+--------------------------------------+-------------------------+--------+ -| last_value(root.sg.d1.s2)| root.agg.last_value.s2| 10| -+--------------------------------------+-------------------------+--------+ -Total line number = 2 -It costs 0.375s -``` - -该语句将聚合查询的结果存储到指定序列中。 - -- **示例 3**(按设备对齐) -```sql - select s1, s2 into root.sg_copy.d1(t1, t2), root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -``` -```shell -+--------------+--------------+-------------------+--------+ -| source device| source column| target timeseries| written| -+--------------+--------------+-------------------+--------+ -| root.sg.d1| s1| root.sg_copy.d1.t1| 8000| -+--------------+--------------+-------------------+--------+ -| root.sg.d1| s2| root.sg_copy.d1.t2| 11000| -+--------------+--------------+-------------------+--------+ -| root.sg.d2| s1| root.sg_copy.d2.t1| 12000| -+--------------+--------------+-------------------+--------+ -| root.sg.d2| s2| root.sg_copy.d2.t2| 9000| -+--------------+--------------+-------------------+--------+ -Total line number = 4 -It costs 0.625s -``` - -该语句同样是将 `root.sg` database 下四条序列的查询结果写入到 `root.sg_copy` database 下指定的四条序列中。但在按设备对齐中,`intoItem` 的数量必须和查询的设备数量一致,每个查询设备对应一个 `intoItem`。 - -> 按设备对齐查询时,`CLI` 展示的结果集多出一列 `source device` 列表示查询的设备。 - -- **示例 4**(按设备对齐) -```sql - select s1 + s2 into root.expr.add(d1s1_d1s2), root.expr.add(d2s1_d2s2) from root.sg.d1, root.sg.d2 align by device; -``` -```shell -+--------------+--------------+------------------------+--------+ -| source device| source column| target timeseries| written| 
-+--------------+--------------+------------------------+--------+ -| root.sg.d1| s1 + s2| root.expr.add.d1s1_d1s2| 10000| -+--------------+--------------+------------------------+--------+ -| root.sg.d2| s1 + s2| root.expr.add.d2s1_d2s2| 10000| -+--------------+--------------+------------------------+--------+ -Total line number = 2 -It costs 0.532s -``` - -该语句将表达式计算的结果存储到指定序列中。 - -#### 使用变量占位符 - -特别地,可以使用变量占位符描述目标序列与查询序列之间的对应规律,简化语句书写。目前支持以下两种变量占位符: - -- 后缀复制符 `::`:复制查询设备后缀(或物理量),表示从该层开始一直到设备的最后一层(或物理量),目标设备的节点名(或物理量名)与查询的设备对应的节点名(或物理量名)相同。 -- 单层节点匹配符 `${i}`:表示目标序列当前层节点名与查询序列的第`i`层节点名相同。比如,对于路径`root.sg1.d1.s1`而言,`${1}`表示`sg1`,`${2}`表示`d1`,`${3}`表示`s1`。 - -在使用变量占位符时,`intoItem`与查询结果集列的对应关系不能存在歧义,具体情况分类讨论如下: - -##### 按时间对齐(默认) - -> 注:变量占位符**只能描述序列与序列之间的对应关系**,如果查询中包含聚合、表达式计算,此时查询结果中的列无法与某个序列对应,因此目标设备和目标物理量都不能使用变量占位符。 - -###### (1)目标设备不使用变量占位符 & 目标物理量列表使用变量占位符 - -**限制:** - 1. 每个 `intoItem` 中,物理量列表的长度必须为 1。
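(If the length could exceed 1, e.g. `root.sg1.d1(::, s1)`, there would be no way to tell which columns match `::`.) - 2. The number of `intoItem`s is 1, or equals the number of columns in the query result set.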
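(Given that every target measurement list has length 1: if there is only one `intoItem`, all queried series are written to the same device; if the number of `intoItem`s equals the number of queried series, each queried series gets its own target device; if the number of `intoItem`s is greater than 1 but less than the number of queried series, they cannot be matched one-to-one with the queried series.) - -**Matching method:** Each queried series is assigned a target device, and the target measurement is generated from the variable placeholder. - -**Example:** - -```sql -select s1, s2 -into root.sg_copy.d1(::), root.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::) -from root.sg.d1, root.sg.d2; -``` -This statement is equivalent to: -```sql -select s1, s2 -into root.sg_copy.d1(s1), root.sg_copy.d2(s1), root.sg_copy.d1(s2), root.sg_copy.d2(s2) -from root.sg.d1, root.sg.d2; -``` -As you can see, the statement is not simplified much in this case. - -###### (2) Target device uses variable placeholders & target measurement list does not use variable placeholders - -**Restriction:** The total number of target measurements across all `intoItem`s equals the number of columns in the query result set. - -**Matching method:** Each queried series is assigned a target measurement, and the target device is generated from the device placeholder of the `intoItem` that contains that target measurement. - -**Example:** -```sql -select d1.s1, d1.s2, d2.s3, d3.s4 -into ::(s1_1, s2_2), root.sg.d2_2(s3_3), root.${2}_copy.::(s4) -from root.sg; -``` - -###### (3) Target device uses variable placeholders & target measurement list uses variable placeholders - -**Restriction:** There is only one `intoItem`, and its measurement list has length 1. - -**Matching method:** Each queried series yields one target series according to the variable placeholders. - -**Example:** -```sql -select * into root.sg_bk.::(::) from root.sg.**; -``` -Writes the query results of all series under `root.sg` to `root.sg_bk`, keeping the device-name suffixes and measurement names unchanged. - -##### Align by device (using `ALIGN BY DEVICE`) - -> Note: Variable placeholders **can only describe series-to-series correspondences**. If the query contains aggregation or expression calculation, the columns in the query result cannot be mapped to a measurement, so the target measurements cannot use variable placeholders. - -###### (1) Target device does not use variable placeholders & target measurement list uses variable placeholders - -**Restriction:** In each `intoItem`, if the measurement list uses a variable placeholder, the list must have length 1. - -**Matching method:** Each queried series is assigned a target device, and the target measurement is generated from the variable placeholder. - -**Example:** -```sql -select s1, s2, s3, s4 -into root.backup_sg.d1(s1, s2, s3, s4), root.backup_sg.d2(::), root.sg.d3(backup_${4}) -from root.sg.d1, root.sg.d2, root.sg.d3 -align by device; -``` - -###### (2) Target device uses variable placeholders & target measurement list does not use variable placeholders - -**Restriction:** There is only one `intoItem`. (With multiple `intoItem`s containing placeholders, there would be no way to know which source devices each `intoItem` should match.) - -**Matching method:** Each queried device yields one target device according to the variable placeholders; the target measurements that the result columns of each device are written to are given by the target measurement list. - -**Example:** -```sql -select avg(s1), sum(s2) + sum(s3), count(s4) -into root.agg_${2}.::(avg_s1, sum_s2_add_s3, count_s4) -from root.** -align by device; -``` - -###### (3) Target device uses variable placeholders & target measurement list uses variable placeholders - -**Restriction:** There is only one `intoItem`, and its measurement list has length 1. - -**Matching method:** 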
每个查询序列根据变量占位符可以得到一个目标序列。 - -**示例:** -```sql -select * into ::(backup_${4}) from root.sg.** align by device; -``` -将 `root.sg` 下每条序列的查询结果写到相同设备下,物理量名前加`backup_`。 - -#### 指定目标序列为对齐序列 - -通过 `ALIGNED` 关键词可以指定写入的目标设备为对齐写入,每个 `intoItem` 可以独立设置。 - -**示例:** -```sql -select s1, s2 into root.sg_copy.d1(t1, t2), aligned root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -``` -该语句指定了 `root.sg_copy.d1` 是非对齐设备,`root.sg_copy.d2`是对齐设备。 - -#### 不支持使用的查询子句 - -- `SLIMIT`、`SOFFSET`:查询出来的列不确定,功能不清晰,因此不支持。 -- `LAST`查询、`GROUP BY TAGS`、`DISABLE ALIGN`:表结构和写入结构不一致,因此不支持。 - -#### 其他要注意的点 - -- 对于一般的聚合查询,时间戳是无意义的,约定使用 0 来存储。 -- 当目标序列存在时,需要保证源序列和目标时间序列的数据类型兼容。关于数据类型的兼容性,查看文档 [数据类型](../Background-knowledge/Data-Type.md#数据类型兼容性)。 -- 当目标序列不存在时,系统将自动创建目标序列(包括 database)。 -- 当查询的序列不存在或查询的序列不存在数据,则不会自动创建目标序列。 - -### 10.2 应用举例 - -#### 实现 IoTDB 内部 ETL -对原始数据进行 ETL 处理后写入新序列。 -```sql -SELECT preprocess_udf(s1, s2) INTO ::(preprocessed_s1, preprocessed_s2) FROM root.sg.* ALIGN BY DEVICE; -``` -```shell -+--------------+-------------------+---------------------------+--------+ -| source device| source column| target timeseries| written| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d1| preprocess_udf(s1)| root.sg.d1.preprocessed_s1| 8000| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d1| preprocess_udf(s2)| root.sg.d1.preprocessed_s2| 10000| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d2| preprocess_udf(s1)| root.sg.d2.preprocessed_s1| 11000| -+--------------+-------------------+---------------------------+--------+ -| root.sg.d2| preprocess_udf(s2)| root.sg.d2.preprocessed_s2| 9000| -+--------------+-------------------+---------------------------+--------+ -``` -以上语句使用自定义函数对数据进行预处理,将预处理后的结果持久化存储到新序列中。 - -#### 查询结果存储 -将查询结果进行持久化存储,起到类似物化视图的作用。 -```sql -SELECT count(s1), last_value(s1) INTO root.sg.agg_${2}(count_s1, last_value_s1) FROM root.sg1.d1 GROUP BY ([0, 10000), 
10ms); -``` -```shell -+--------------------------+-----------------------------+--------+ -| source column| target timeseries| written| -+--------------------------+-----------------------------+--------+ -| count(root.sg.d1.s1)| root.sg.agg_d1.count_s1| 1000| -+--------------------------+-----------------------------+--------+ -| last_value(root.sg.d1.s2)| root.sg.agg_d1.last_value_s2| 1000| -+--------------------------+-----------------------------+--------+ -Total line number = 2 -It costs 0.115s -``` -以上语句将降采样查询的结果持久化存储到新序列中。 - -#### 非对齐序列转对齐序列 -对齐序列从 0.13 版本开始支持,可以通过该功能将非对齐序列的数据写入新的对齐序列中。 - -**注意:** 建议配合使用 `LIMIT & OFFSET` 子句或 `WHERE` 子句(时间过滤条件)对数据进行分批,防止单次操作的数据量过大。 - -```sql -SELECT s1, s2 INTO ALIGNED root.sg1.aligned_d(s1, s2) FROM root.sg1.non_aligned_d WHERE time >= 0 and time < 10000; -``` -```shell -+--------------------------+----------------------+--------+ -| source column| target timeseries| written| -+--------------------------+----------------------+--------+ -| root.sg1.non_aligned_d.s1| root.sg1.aligned_d.s1| 10000| -+--------------------------+----------------------+--------+ -| root.sg1.non_aligned_d.s2| root.sg1.aligned_d.s2| 10000| -+--------------------------+----------------------+--------+ -Total line number = 2 -It costs 0.375s -``` -以上语句将一组非对齐的序列的数据迁移到一组对齐序列。 - -### 10.3 相关用户权限 - -用户必须有下列权限才能正常执行查询写回语句: - -* 所有 `SELECT` 子句中源序列的 `WRITE_SCHEMA` 权限。 -* 所有 `INTO` 子句中目标序列 `WRITE_DATA` 权限。 - -更多用户权限相关的内容,请参考[权限管理语句](../User-Manual/Authority-Management_timecho.md)。 - -### 10.4 相关配置参数 - -* `select_into_insert_tablet_plan_row_limit` - - | 参数名 | select_into_insert_tablet_plan_row_limit | - | ---- | ---- | - | 描述 | 写入过程中每一批 `Tablet` 的最大行数 | - | 类型 | int32 | - | 默认值 | 10000 | - | 改后生效方式 | 重启后生效 | diff --git a/src/zh/UserGuide/latest/Basic-Concept/Write-Data_timecho.md b/src/zh/UserGuide/latest/Basic-Concept/Write-Data_timecho.md deleted file mode 100644 index 4bdd250e9..000000000 --- a/src/zh/UserGuide/latest/Basic-Concept/Write-Data_timecho.md 
+++ /dev/null @@ -1,190 +0,0 @@ - - - -# 数据写入 -## 1. CLI写入数据 - -IoTDB 为用户提供多种插入实时数据的方式,例如在 [Cli/Shell 工具](../Tools-System/CLI.md) 中直接输入插入数据的 INSERT 语句,或使用 Java API(标准 [Java JDBC](../API/Programming-JDBC_timecho) 接口)单条或批量执行插入数据的 INSERT 语句。 - -本节主要为您介绍实时数据接入的 INSERT 语句在场景中的实际使用示例,有关 INSERT SQL 语句的详细语法请参见本文 [INSERT 语句](../SQL-Manual/SQL-Manual.md#写入数据) 节。 - -注:写入重复时间戳的数据时,会直接覆盖原有同时间戳数据,等效于数据更新;但若写入值为 NULL,则不生效,不会覆盖原有字段值。 - -### 1.1 使用 INSERT 语句 - -使用 INSERT 语句可以向指定的已经创建的一条或多条时间序列中插入数据。对于每一条数据,均由一个时间戳类型的时间戳和一个数值或布尔值、字符串类型的传感器采集值组成。 - -在本节的场景实例下,以其中的两个时间序列`root.ln.wf02.wt02.status`和`root.ln.wf02.wt02.hardware`为例 ,它们的数据类型分别为 BOOLEAN 和 TEXT。 - -单列数据插入示例代码如下: - -```sql -IoTDB > insert into root.ln.wf02.wt02(timestamp,status) values(1,true) -IoTDB > insert into root.ln.wf02.wt02(timestamp,hardware) values(1, "v1") -``` - -以上示例代码将长整型的 timestamp 以及值为 true 的数据插入到时间序列`root.ln.wf02.wt02.status`中和将长整型的 timestamp 以及值为"v1"的数据插入到时间序列`root.ln.wf02.wt02.hardware`中。执行成功后会返回执行时间,代表数据插入已完成。 - -> 注意:在 IoTDB 中,TEXT 类型的数据单双引号都可以来表示,上面的插入语句是用的是双引号表示 TEXT 类型数据,下面的示例将使用单引号表示 TEXT 类型数据。 - -INSERT 语句还可以支持在同一个时间点下多列数据的插入,同时向 2 时间点插入上述两个时间序列的值,多列数据插入示例代码如下: - -```sql -IoTDB > insert into root.ln.wf02.wt02(timestamp, status, hardware) values (2, false, 'v2') -``` - -此外,INSERT 语句支持一次性插入多行数据,同时向 2 个不同时间点插入上述时间序列的值,示例代码如下: - -```sql -IoTDB > insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4') -``` - -在树模型写入数据时,timestamp 与 time 均可作为时间列标识用于 INSERT 语句,书写时无需刻意区分;但查询结果中,时间列统一展示为 Time(固定名称),保证结果格式统一。 - -插入数据后我们可以使用 SELECT 语句简单查询已插入的数据。 - -```sql -IoTDB > select * from root.ln.wf02.wt02 where time < 5 -``` - -结果如图所示。由查询结果可以看出,单列、多列数据的插入操作正确执行。 - -``` -+-----------------------------+--------------------------+------------------------+ -| Time|root.ln.wf02.wt02.hardware|root.ln.wf02.wt02.status| -+-----------------------------+--------------------------+------------------------+ -|1970-01-01T08:00:00.001+08:00| v1| true| -|1970-01-01T08:00:00.002+08:00| v2| false| 
-|1970-01-01T08:00:00.003+08:00| v3| false| -|1970-01-01T08:00:00.004+08:00| v4| true| -+-----------------------------+--------------------------+------------------------+ -Total line number = 4 -It costs 0.004s -``` - -此外,我们可以省略 timestamp 列,此时系统将使用当前的系统时间作为该数据点的时间戳,示例代码如下: -```sql -IoTDB > insert into root.ln.wf02.wt02(status, hardware) values (false, 'v2') -``` -**注意:** 当一次插入多行数据时必须指定时间戳。 - -### 1.2 向对齐时间序列插入数据 - -向对齐时间序列插入数据只需在SQL中增加`ALIGNED`关键词,其他类似。 - -示例代码如下: - -```sql -IoTDB > create aligned timeseries root.sg1.d1(s1 INT32, s2 DOUBLE) -IoTDB > insert into root.sg1.d1(time, s1, s2) aligned values(1, 1, 1) -IoTDB > insert into root.sg1.d1(time, s1, s2) aligned values(2, 2, 2), (3, 3, 3) -IoTDB > select * from root.sg1.d1 -``` - -结果如图所示。由查询结果可以看出,数据的插入操作正确执行。 - -``` -+-----------------------------+--------------+--------------+ -| Time|root.sg1.d1.s1|root.sg1.d1.s2| -+-----------------------------+--------------+--------------+ -|1970-01-01T08:00:00.001+08:00| 1| 1.0| -|1970-01-01T08:00:00.002+08:00| 2| 2.0| -|1970-01-01T08:00:00.003+08:00| 3| 3.0| -+-----------------------------+--------------+--------------+ -Total line number = 3 -It costs 0.004s -``` - -## 2. 原生接口写入 -原生接口 (Session) 是目前IoTDB使用最广泛的系列接口,包含多种写入接口,适配不同的数据采集场景,性能高效且支持多语言。 - -### 2.1 多语言接口写入 -* ### Java - 使用Java接口写入之前,你需要先建立连接,参考 [Java原生接口](../API/Programming-Java-Native-API_timecho)。 - 之后通过 [ JAVA 数据操作接口(DML)](../API/Programming-Java-Native-API_timecho#数据写入)写入。 - -* ### Python - 参考 [ Python 数据操作接口(DML)](../API/Programming-Python-Native-API_timecho#数据写入) - -* ### C++ - 参考 [ C++ 数据操作接口(DML)](../API/Programming-Cpp-Native-API.md) - -* ### Go - 参考 [Go 原生接口](../API/Programming-Go-Native-API.md) - -## 3. 
REST API写入 - -参考 [insertTablet (v1)](../API/RestServiceV1_timecho#inserttablet) or [insertTablet (v2)](../API/RestServiceV2_timecho#inserttablet) - -示例如下: -```JSON -{ -      "timestamps": [ -            1, -            2, -            3 -      ], -      "measurements": [ -            "temperature", -            "status" -      ], -      "data_types": [ -            "FLOAT", -            "BOOLEAN" -      ], -      "values": [ -            [ -                  1.1, -                  2.2, -                  3.3 -            ], -            [ -                  false, -                  true, -                  true -            ] -      ], -      "is_aligned": false, -      "device": "root.ln.wf01.wt01" -} -``` - -## 4. MQTT写入 - -参考 [内置 MQTT 服务](../API/Programming-MQTT_timecho.md#_2-内置-mqtt-服务) - -## 5. 批量数据导入 - -针对于不同场景,IoTDB 为用户提供多种批量导入数据的操作方式,本章节向大家介绍最为常用的两种方式为 CSV文本形式的导入 和 TsFile文件形式的导入。 - -### 5.1 TsFile批量导入 - -TsFile 是在 IoTDB 中使用的时间序列的文件格式,您可以通过CLI等工具直接将存有时间序列的一个或多个 TsFile 文件导入到另外一个正在运行的IoTDB实例中。具体操作方式请参考[数据导入](../Tools-System/Data-Import-Tool_timecho)。 - -### 5.2 CSV批量导入 - -CSV 是以纯文本形式存储表格数据,您可以在CSV文件中写入多条格式化的数据,并批量的将这些数据导入到 IoTDB 中,在导入数据之前,建议在IoTDB中创建好对应的元数据信息。如果忘记创建元数据也不要担心,IoTDB 可以自动将CSV中数据推断为其对应的数据类型,前提是你每一列的数据类型必须唯一。除单个文件外,此工具还支持以文件夹的形式导入多个 CSV 文件,并且支持设置如时间精度等优化参数。具体操作方式请参考[数据导入](../Tools-System/Data-Import-Tool_timecho)。 - -## 6. 
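Schema-less Writing -In IoT scenarios, the types and number of devices may grow or shrink over time, different devices may produce data with different fields (such as temperature, humidity, and status codes), and businesses often need to deploy quickly, onboarding new devices flexibly without a tedious predefinition process. Therefore, unlike traditional time series databases, which usually require the data model to be defined in advance, IoTDB does not require metadata to be created beforehand: when data is written, the database automatically identifies and registers the required metadata, achieving automatic modeling. - -Users can write measurement data for one or more devices in real time, in batches or row by row, through INSERT statements in the CLI or through the native API. They can also import historical data in formats such as CSV and TsFile with the import tools; during import, metadata such as time series, data types, and compression and encoding methods is created automatically. diff --git a/src/zh/UserGuide/latest/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md b/src/zh/UserGuide/latest/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md deleted file mode 100644 index ae67e19e3..000000000 --- a/src/zh/UserGuide/latest/Deployment-and-Maintenance/AINode_Deployment_Upgrade_timecho.md +++ /dev/null @@ -1,268 +0,0 @@ - -# AINode Deployment - -## 1. AINode Introduction - -### 1.1 Capabilities - -AINode is the third type of native node that TimechoDB provides, after ConfigNode and DataNode. By interacting with the DataNodes and ConfigNodes of a TimechoDB cluster, it extends the cluster with machine learning analysis of time series. AINode integrates model management, training, and inference into the database engine: registered models can be applied to specified time series data through simple SQL statements to complete time series analysis tasks, and custom machine learning models can be registered and used as well. AINode currently integrates machine learning algorithms and self-developed models for common time series analysis scenarios (for example, forecasting). - -### 1.2 Deployment Mode - -AINode is an additional component outside the TimechoDB cluster and is deployed from a standalone installation package. - -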
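 - -## 2. Installation Preparation - -### 2.1 Obtaining the Installation Package - -After the AINode installation package (`timechodb--ainode-bin.zip`) is unzipped, its key directory structure is as follows: - -| **Directory** | **Type** | **Description** | -| ---------------- | ---------------- | ------------------------------------------ | -| lib | Folder | AINode executables and dependencies | -| sbin | Folder | AINode run scripts, used to start or stop AINode | -| conf | Folder | AINode configuration files and version declaration file | - -### 2.2 Pre-installation Check - -To make sure the AINode installation package you obtained is complete and correct, we recommend verifying its SHA512 checksum before installing and deploying. - -**Preparation:** - -* Obtain the officially published SHA512 checksum: contact Timecho staff - -**Verification steps (using Linux as an example):** - -1. Open a terminal and change to the directory containing the installation package (e.g. `/data/ainode`): - ```Bash - cd /data/ainode - ``` -2. Run the following command to compute the hash: - ```Bash - sha512sum timechodb-{version}-ainode-bin.zip - ``` -3. The terminal prints the result (SHA512 checksum on the left, file name on the right): - -```Bash -(base) root@hadoop@1:/data/ainode (0.664s) -sha512sum timechodb-2.0.6.1-ainode-bin.zip -4d5a6a64935b4f0459bc9ed214c4563aa7a6a5941024336e9416212424707f27bdfdfc70f4c528b51b812687d660014adc1b8add699498ea67ff17c7e619a6f0 timechodb-2.0.6.1-ainode-bin.zip -``` - -4. Compare the output with the official SHA512 checksum. Once they match, you can proceed with the AINode installation and deployment steps below. - -**Notes:** - -* If the checksums do not match, contact Timecho staff to obtain the installation package again -* If a "file does not exist" message appears during verification, check that the file path is correct and that the installation package was downloaded completely - -### 2.3 Environment Requirements - -* Recommended operating systems: Linux, MacOS; -* TimechoDB version: >= V 2.0.8; - -## 3. 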
安装部署及使用 - -### 3.1 安装 AINode - -下载导入 AINode 到专用文件夹,切换到专用文件夹并解压安装包; - -```Shell -unzip timechodb--ainode-bin.zip -``` - -### 3.2 配置项修改 - -AINode 支持修改一些必要的参数。可以在 `/TIMECHO_AINODE_HOME/conf/iotdb-ainode.properties` 文件中找到下列参数并进行持久化的修改: - -| **名称** | **描述** | **类型** | **默认值** | -|-----------------------------------|----------------------------------------------| ---------------- | -------------------- | -| cluster\_name | AINode 要加入的集群标识 | string| defaultCluster | -| ain\_seed\_config\_node | AINode 启动时注册的 ConfigNode 地址 | String | 127.0.0.1:10710 | -| ain\_cluster\_ingress\_address | AINode 拉取数据的 DataNode 的 rpc 地址 | String | 127.0.0.1 | -| ain\_cluster\_ingress\_port | AINode 拉取数据的 DataNode 的 rpc 端口 | Integer | 6667 | -| ain\_cluster\_ingress\_username | AINode 拉取数据的 DataNode 的客户端用户名 | String | root | -| ain\_cluster\_ingress\_password | AINode 拉取数据的 DataNode 的客户端密码 | String | root | -| ain\_rpc\_address | AINode 提供服务与通信的地址 ,内部服务通讯接口 | String | 127.0.0.1 | -| ain\_rpc\_port | AINode 提供服务与通信的端口 | String | 10810 | -| ain\_system\_dir | AINode 元数据存储路径,相对路径的起始目录与操作系统相关,建议使用绝对路径 | String| data/AINode/system | -| ain\_models\_dir | AINode 存储模型文件的路径,相对路径的起始目录与操作系统相关,建议使用绝对路径 | String| data/AINode/models | -| ain\_thrift\_compression\_enabled | AINode 是否启用 thrift 的压缩机制,0-不启动、1-启动 | Boolean | 0 | - -### 3.3 导入内置权重文件 - -若部署环境可联网且能连通 HuggingFace 环境,系统会自动拉取内置模型权重文件,可忽略本步骤。 - -若为离线环境,联系天谋工作人员获取模型权重文件夹,并放置到`/TIMECHO_AINODE_HOME/data/ainode/models/builtin` 目录下。 - -**​NOTE:​**注意目录层级,最终所有内置模型权重的父目录都是 `builtin `。 - -### 3.4 启动 AINode - -在完成 ConfigNode 的部署后,可以通过添加 TimechoDB 来支持时序模型的管理和推理功能。在配置项中指定 TimechoDB 集群的信息后,可以执行相应的指令来启动 AINode,加入 TimechoDB 集群。 - -```Shell -# 启动命令 - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh - - # Windows 系统 - sbin\start-ainode.bat - - # 后台启动命令(长期运行推荐) - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh -d - - # Windows 系统 - bash sbin\start-ainode.bat -d -``` - -### 3.5 激活 AINode - -1. 
参考 TimechoDB 激活:[激活方式](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md#_2-6-激活数据库) - -2. 可通过如下方式验证 AINode 激活,当看到状态显示为 ACTIVATED 表示激活成功。 - -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -Total line number = 3 -It costs 0.002s -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2025-07-16T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| AiNodeLimit| 1| 1| -| CpuLimit| 11| Unlimited| -| DeviceLimit| 0| Unlimited| -|TimeSeriesLimit| 0| 9,999| -+---------------+---------+-----------------------------+ -Total line number = 7 -It costs 0.013s -``` - - -### 3.6 检测 AINode 节点状态 - -AINode 启动过程中会自动将新的 AINode 加入 TimechoDB 集群。启动 AINode 后可以在命令行中输入 SQL 来查询,集群中看到 AINode 节点,其运行状态为 Running(如下展示)表示加入成功。 - -```Shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| 
-+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -除此之外,还可以通过 show models 命令来查看模型状态。如果模型状态不对,请检查权重文件路径是否正确。 - -```Bash -IoTDB> show models -+---------------------+---------+--------+--------+ -| ModelId|ModelType|Category| State| -+---------------------+---------+--------+--------+ -| arima| sktime| builtin| active| -| holtwinters| sktime| builtin| active| -|exponential_smoothing| sktime| builtin| active| -| naive_forecaster| sktime| builtin| active| -| stl_forecaster| sktime| builtin| active| -| gaussian_hmm| sktime| builtin| active| -| gmm_hmm| sktime| builtin| active| -| stray| sktime| builtin| active| -| timer_xl| timer| builtin| active| -| sundial| sundial| builtin| active| -| chronos2| t5| builtin| active| -+---------------------+---------+--------+--------+ -``` - -### 3.7 停止 AINode - -如果需要停止正在运行的 AINode 节点,则执行相应的关停脚本,且支持通过参数 -p 指定端口,该端口为配置项中的 `ain_rpc_port`。 - -```Shell -# Linux / MacOS - bash sbin/stop-ainode.sh - bash sbin/stop-ainode.sh -p # 指定端口 - - #Windows - sbin\stop-ainode.bat - sbin\stop-ainode.bat -p # 指定端口 -``` - -停止 AINode 后,还可以在集群中看到 AINode 节点,其运行状态为 UNKNOWN(如下展示),此时无法使用 AINode 功能。 - -```Shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|UNKNOWN| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -``` - -如果需要重新启动该节点,需重新执行启动脚本。 - -### 3.8 升级 AINode - -如果需要对当前 AINode 进行版本升级,可参考如下步骤: - -1. 
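Stop the current AINode service - - * Run the stop command and make sure the service has exited completely before continuing - - ```Shell - # Linux / MacOS - bash sbin/stop-ainode.sh - bash sbin/stop-ainode.sh -p # specify the port - - #Windows - sbin\stop-ainode.bat - sbin\stop-ainode.bat -p # specify the port - ``` -2. Replace the core files - - * Delete the `lib` and `sbin` directories of the current version, and copy the `lib` and `sbin` directories of the new version into their place - * Back up the configuration files you modified under the conf directory, then replace the conf folder and apply your modified settings to the new files -3. Update the built-in model weights (optional) - - * If the new version involves updates to built-in models, the details will be announced in the [release history](../IoTDB-Introduction/Release-history_timecho.md). Contact Timecho staff to obtain the latest weight package and use it to replace the contents of the `data/ainode/models/builtin` directory -4. After the upgrade is complete, start the AINode service and check the node status; see sections 3.4 and 3.6 for the commands. - diff --git a/src/zh/UserGuide/latest/Deployment-and-Maintenance/AINode_Deployment_timecho.md b/src/zh/UserGuide/latest/Deployment-and-Maintenance/AINode_Deployment_timecho.md deleted file mode 100644 index 2dd610188..000000000 --- a/src/zh/UserGuide/latest/Deployment-and-Maintenance/AINode_Deployment_timecho.md +++ /dev/null @@ -1,340 +0,0 @@ - -# AINode Deployment - -## 1. AINode Introduction - -### 1.1 Capabilities - -AINode is the third type of native node that IoTDB provides, after ConfigNode and DataNode. By interacting with the DataNodes and ConfigNodes of an IoTDB cluster, it extends the cluster with machine learning analysis of time series. It supports importing and registering existing machine learning models from outside, and applying registered models to specified time series data through simple SQL statements to complete time series analysis tasks, integrating model creation, management, and inference into the database engine. Machine learning algorithms and self-developed models for common time series analysis scenarios (for example, forecasting and anomaly detection) are already provided. - -### 1.2 Delivery Method -AINode is an additional component outside the IoTDB cluster and ships as a standalone installation package. - -### 1.3 Deployment Mode -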
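 - -## 2. Installation Preparation - -### 2.1 Obtaining the Installation Package - -After the AINode installation package (`timechodb--ainode-bin.zip`) is unzipped, its directory structure is as follows: - -| **Directory** | **Type** | **Description** | -| ------------ | -------- | ------------------------------------------------ | -| lib | Folder | AINode Python package files | -| sbin | Folder | AINode run scripts, used to start, remove, and stop AINode | -| conf | Folder | AINode configuration files and runtime environment setup scripts | -| LICENSE | File | License | -| NOTICE | File | Notice | -| README_ZH.md | File | Chinese README in Markdown format | -| README.md | File | Usage instructions | - -### 2.2 Pre-installation Check - -To make sure the AINode installation package you obtained is complete and correct, we recommend verifying its SHA512 checksum before installing and deploying. - -#### Preparation: - -- Obtain the officially published SHA512 checksum: contact Timecho staff - -#### Verification steps (using Linux as an example): - -1. Open a terminal and change to the directory containing the installation package (e.g. `/data/ainode`): - ```Bash - cd /data/ainode - ``` -2. Run the following command to compute the hash: - ```Bash - sha512sum timechodb-{version}-ainode-bin.zip - ``` -3. The terminal prints the result (SHA512 checksum on the left, file name on the right): - -![img](/img/sha512-06.png) - -4. Compare the output with the official SHA512 checksum. Once they match, you can proceed with the AINode installation and deployment steps below. - -#### Notes: - -- If the checksums do not match, contact Timecho staff to obtain the installation package again -- If a "file does not exist" message appears during verification, check that the file path is correct and that the installation package was downloaded completely - -### 2.3 Environment Preparation - -1. Recommended operating systems: Ubuntu, MacOS -2. IoTDB version: >= V 2.0.5.1 -3. Runtime environment - - Python version between 3.9 and 3.12, with the pip and venv tools available; - - -## 3. Installation, Deployment, and Usage - -### 3.1 Installing AINode - -1. Make sure the Python version is between 3.9 and 3.12 - -```shell -python --version -# or -python3 --version -``` -2. Download the AINode package into a dedicated folder, change to that folder, and unzip the installation package - -```shell - unzip timechodb--ainode-bin.zip -``` - -3. 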
激活 AINode: - -- 进入 IoTDB CLI - -```sql - # Linux或MACOS系统 - ./start-cli.sh - - # windows系统 - ./start-cli.bat -``` - -- 执行以下内容获取激活所需机器码: - -```sql -show system info -``` - -- 将返回的机器码复制给天谋工作人员: - -```sql -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -``` - -- 将工作人员返回的激活码输入到CLI中,输入以下内容 - - 注:激活码前后需要用'符号进行标注,如所示 - -```sql -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZK' -``` - -- 可通过如下方式验证激活,当看到状态显示为 ACTIVATED 表示激活成功 - -```sql -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo|ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710| | xxxxxxx| ACTIVATED| -| 1| DataNode|Running| 127.0.0.1| 10730| | xxxxxxx| ACTIVATED| -| 2| AINode|Running| 127.0.0.1| 10810| | xxxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+--------------+ - -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2025-07-16T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| AiNodeLimit| 1| 1| -| CpuLimit| 11| Unlimited| -| DeviceLimit| 0| Unlimited| -|TimeSeriesLimit| 0| 9,999| -+---------------+---------+-----------------------------+ - -``` - -### 3.2 配置项修改 -AINode 支持修改一些必要的参数。可以在 `conf/iotdb-ainode.properties` 文件中找到下列参数并进行持久化的修改: - -| **名称** | **描述** | **类型** | **默认值** | -| ------------------------------ | ------------------------------------------------------------ | -------- | ------------------ | -| 
cluster_name | AINode 要加入集群的标识 | string | defaultCluster | -| ain_seed_config_node | AINode 启动时注册的 ConfigNode 地址 | String | 127.0.0.1:10710 | -| ain_cluster_ingress_address | AINode 拉取数据的 DataNode 的 rpc 地址 | String | 127.0.0.1 | -| ain_cluster_ingress_port | AINode 拉取数据的 DataNode 的 rpc 端口 | Integer | 6667 | -| ain_cluster_ingress_username | AINode 拉取数据的 DataNode 的客户端用户名 | String | root | -| ain_cluster_ingress_password | AINode 拉取数据的 DataNode 的客户端密码 | String | root | -| ain_cluster_ingress_time_zone | AINode 拉取数据的 DataNode 的客户端时区 | String | UTC+8 | -| ain_inference_rpc_address | AINode 提供服务与通信的地址 ,内部服务通讯接口 | String | 127.0.0.1 | -| ain_inference_rpc_port | AINode 提供服务与通信的端口 | String | 10810 | -| ain_system_dir | AINode 元数据存储路径,相对路径的起始目录与操作系统相关,建议使用绝对路径 | String | data/AINode/system | -| ain_models_dir | AINode 存储模型文件的路径,相对路径的起始目录与操作系统相关,建议使用绝对路径 | String | data/AINode/models | -| ain_thrift_compression_enabled | AINode 是否启用 thrift 的压缩机制,0-不启动、1-启动 | Boolean | 0 | - -### 3.3 导入权重文件 - -> 仅离线环境,在线环境可忽略本步骤 -> -联系天谋工作人员获取模型权重文件,并放置到/IOTDB_AINODE_HOME/data/ainode/models/weights/目录下。 - -### 3.4 启动 AINode - -在完成 Seed-ConfigNode 的部署后,可以通过添加 AINode 节点来支持模型的注册和推理功能。在配置项中指定 IoTDB 集群的信息后,可以执行相应的指令来启动 AINode,加入 IoTDB 集群。 - -- 联网环境启动 - -启动命令 - -```shell - # 启动命令 - # Linux 和 MacOS 系统 - bash sbin/start-ainode.sh - - # Windows 系统 - sbin\start-ainode.bat - - # 后台启动命令(长期运行推荐) - # Linux 和 MacOS 系统 - nohup bash sbin/start-ainode.sh > myout.file 2>& 1 & - - # Windows 系统 - nohup bash sbin\start-ainode.bat > myout.file 2>& 1 & - ``` - -### 3.5 检测 AINode 节点状态 - -AINode 启动过程中会自动将新的 AINode 加入 IoTDB 集群。启动 AINode 后可以在 命令行中输入 SQL 来查询,集群中看到 AINode 节点,其运行状态为 Running(如下展示)表示加入成功。 - -```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 
10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|Running| 127.0.0.1| 10810|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` - -除此之外,还可以通过 show models 命令来查看模型状态。如果模型状态不对,请检查权重文件路径是否正确。 - -```sql -IoTDB:etth> show models -+---------------------+--------------------+--------+------+ -| ModelId| ModelType|Category| State| -+---------------------+--------------------+--------+------+ -| arima| Arima|BUILT-IN|ACTIVE| -| holtwinters| HoltWinters|BUILT-IN|ACTIVE| -|exponential_smoothing|ExponentialSmoothing|BUILT-IN|ACTIVE| -| naive_forecaster| NaiveForecaster|BUILT-IN|ACTIVE| -| stl_forecaster| StlForecaster|BUILT-IN|ACTIVE| -| gaussian_hmm| GaussianHmm|BUILT-IN|ACTIVE| -| gmm_hmm| GmmHmm|BUILT-IN|ACTIVE| -| stray| Stray|BUILT-IN|ACTIVE| -| sundial| Timer-Sundial|BUILT-IN|ACTIVE| -| timer_xl| Timer-XL|BUILT-IN|ACTIVE| -+---------------------+--------------------+--------+------+ -``` - -### 3.6 停止 AINode - -如果需要停止正在运行的 AINode 节点,则执行相应的关闭脚本。 - -- 停止命令 - -```shell - # Linux / MacOS - bash sbin/stop-ainode.sh - - #Windows - sbin\stop-ainode.bat - ``` - -停止 AINode 后,还可以在集群中看到 AINode 节点,其运行状态为 UNKNOWN(如下展示),此时无法使用 AINode 功能。 - - ```shell -IoTDB> show cluster -+------+----------+-------+---------------+------------+-------+-----------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort|Version| BuildInfo| -+------+----------+-------+---------------+------------+-------+-----------+ -| 0|ConfigNode|Running| 127.0.0.1| 10710|UNKNOWN|190e303-dev| -| 1| DataNode|Running| 127.0.0.1| 10730|UNKNOWN|190e303-dev| -| 2| AINode|UNKNOWN| 127.0.0.1| 10790|UNKNOWN|190e303-dev| -+------+----------+-------+---------------+------------+-------+-----------+ -``` -如果需要重新启动该节点,需重新执行启动脚本。 - - -## 4. 
FAQ
-
-### 4.1 "venv module not found" error when starting AINode
-
-When AINode is started in the default way, it creates a Python virtual environment in the installation directory and installs its dependencies there, so the venv module must be available. Python 3.10 and later usually ship with venv, but some system-provided Python environments do not. If this error occurs, use one of the following two solutions:
-
-Solution 1: Install the venv module locally. On Ubuntu, for example, run the command below to install Python's venv module; alternatively, install a Python release from the official Python website that bundles venv.
-
-```shell
-apt-get install python3.10-venv
-```
-Then, from the AINode directory, create a Python 3.10.0 virtual environment for AINode:
-
-```shell
-../Python-3.10.0/python -m venv venv  # the last "venv" is the folder name
-```
-Solution 2: When running the startup script, pass `-i` with the path of an existing Python interpreter to use as the AINode runtime environment. A new virtual environment is then no longer needed.
-
-### 4.2 The SSL module in Python is not correctly installed or configured, so HTTPS resources cannot be handled
-WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
-To resolve this, install OpenSSL and then rebuild Python.
-> Currently Python versions 3.6 to 3.9 are compatible with OpenSSL 1.0.2, 1.1.0, and 1.1.1.
-
-Python requires OpenSSL to be installed on the system. For installation details, see [this link](https://stackoverflow.com/questions/56552390/how-to-fix-ssl-module-in-python-is-not-available-in-centos)
-
-```shell
-sudo apt-get install build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev uuid-dev lzma-dev liblzma-dev
-sudo -E ./configure --with-ssl
-make
-sudo make install
-```
-
-### 4.3 pip version is too old
-
-On Windows, a build error such as "error: Microsoft Visual C++ 14.0 or greater is required..." appears.
-
-This error usually means that the C++ version or the setuptools version is too old. Upgrade pip and setuptools with:
-
-```shell
-./python -m pip install --upgrade pip
-./python -m pip install --upgrade setuptools
-```
-
-
-### 4.4 Building and installing Python
-
-Use the following commands to download the source package from the official website and extract it:
-```shell
-wget https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tar.xz
-tar Jxf Python-3.10.0.tar.xz
-```
-Build and install Python:
-```shell
-cd Python-3.10.0
-./configure --prefix=/usr/local/python3
-make
-sudo make install
-python3 --version
-```
\ No newline at end of file
diff --git a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Cluster-Deployment_timecho.md b/src/zh/UserGuide/latest/Deployment-and-Maintenance/Cluster-Deployment_timecho.md
deleted file mode 100644
index 391d6a474..000000000
--- a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Cluster-Deployment_timecho.md
+++ /dev/null
@@ -1,589 +0,0 @@
-
-# Cluster Deployment Guide
-
-This section describes how to manually deploy an instance with 3 ConfigNodes and 3 DataNodes, commonly referred to as a 3C3D cluster.
-
-
- -
- -## 1. 注意事项 - -1. 安装前请确认系统已参照[系统配置](./Environment-Requirements.md)准备完成。 - -2. 部署时推荐优先使用`hostname`进行IP配置,可避免后期修改主机ip导致数据库无法启动的问题。设置hostname需要在目标服务器上配置/etc/hosts,如本机ip是192.168.1.3,hostname是iotdb-1,则可以使用以下命令设置服务器的 hostname,并使用hostname配置IoTDB的`cn_internal_address`、`dn_internal_address`。`dn_internal_address`。 - - ``` shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -3. 有些参数首次启动后不能修改,请参考下方的"参数配置"章节来进行设置。 - -4. 无论是在linux还是windows中,请确保IoTDB的安装路径中不含空格和中文,避免软件运行异常。 - -5. 请注意,安装部署(包括激活和使用软件)IoTDB时需要保持使用同一个用户进行操作,您可以: -- 使用 root 用户(推荐):使用 root 用户可以避免权限等问题。 -- 使用固定的非 root 用户: - - 使用同一用户操作:确保在启动、激活、停止等操作均保持使用同一用户,不要切换用户。 - - 避免使用 sudo:尽量避免使用 sudo 命令,因为它会以 root 用户权限执行命令,可能会引起权限混淆或安全问题。 - -6. 推荐部署监控面板,可以对重要运行指标进行监控,随时掌握数据库运行状态,监控面板可以联系商务获取,部署监控面板步骤可以参考:[监控面板部署](./Monitoring-panel-deployment.md) - -7. 在安装部署数据库前,可以使用健康检查工具检测 IoTDB 节点运行环境,并获取详细的检查结果。 IoTDB 健康检查工具使用方法可以参考:[健康检查工具](../Tools-System/Health-Check-Tool.md)。 - -## 2. 准备步骤 - -1. 准备IoTDB数据库安装包 :iotdb-enterprise-{version}-bin.zip(安装包获取见:[链接](../Deployment-and-Maintenance/IoTDB-Package_timecho.md)) -2. 按环境要求配置好操作系统环境(系统环境配置见:[链接](../Deployment-and-Maintenance/Environment-Requirements.md)) - -### 2.1 前置检查 - -为确保您获取的IoTDB企业版安装包完整且正确,在执行安装部署前建议您进行SHA512校验。 - -#### 准备工作: - -- 获取官方发布的 SHA512 校验码:[发布历史](../IoTDB-Introduction/Release-history_timecho.md)文档中各版本对应的"SHA512校验码" - -#### 校验步骤(以 linux 为例): - -1. 打开终端,进入安装包所在目录(如`/data/iotdb`): - ```Bash - cd /data/iotdb - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -![img](/img/sha512-01.png) - -4. 对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行IoTDB企业版的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - -## 3. 
安装步骤 - -假设现在有3台linux服务器,IP地址和服务角色分配如下: - -| 节点ip | 主机名 | 服务 | -| ----------- | ------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode、DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode、DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode、DataNode | - -### 3.1 设置主机名 - -在3台机器上分别配置主机名,设置主机名需要在目标服务器上配置`/etc/hosts`,使用如下命令: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### 3.2 参数配置 - -解压安装包并进入安装目录 - -```Plain -unzip iotdb-enterprise-{version}-bin.zip -cd iotdb-enterprise-{version}-bin -``` - -#### 环境脚本配置 - -- `./conf/confignode-env.sh`配置 - - | **配置项** | **说明** | **默认值** | **推荐值** | 备注 | - | :---------- | :------------------------------------- | :--------- | :----------------------------------------------- | :----------- | - | MEMORY_SIZE | IoTDB ConfigNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的30% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -- `./conf/datanode-env.sh`配置 - - | **配置项** | **说明** | **默认值** | **推荐值** | 备注 | - | :---------- | :----------------------------------- |:-----------------------| :----------------------------------------------- | :----------- | - | MEMORY_SIZE | IoTDB DataNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的50% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -#### 通用配置 - -打开通用配置文件`./conf/iotdb-system.properties`,可根据部署方式设置以下参数: - -| 配置项 | 说明 | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | -| ------------------------- | ---------------------------------------- | -------------- | -------------- | -------------- | -| cluster_name | 集群名称 | defaultCluster | defaultCluster | defaultCluster | -| schema_replication_factor | 元数据副本数,DataNode数量不应少于此数目 | 3 | 3 | 3 | -| data_replication_factor | 数据副本数,DataNode数量不应少于此数目 | 2 | 2 | 2 | - -#### ConfigNode 配置 - -打开ConfigNode配置文件`./conf/iotdb-system.properties`,设置以下参数 - -| 配置项 | 说明 | 默认 | 推荐值 | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | 备注 | -| ------------------- | 
------------------------------------------------------------ | --------------- | ------------------------------------------------------- | ------------- | ------------- | ------------- | ------------------ | -| cn_internal_address | ConfigNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | iotdb-1 | iotdb-2 | iotdb-3 | 首次启动后不能修改 | -| cn_internal_port | ConfigNode在集群内部通讯使用的端口 | 10710 | 10710 | 10710 | 10710 | 10710 | 首次启动后不能修改 | -| cn_consensus_port | ConfigNode副本组共识协议通信使用的端口 | 10720 | 10720 | 10720 | 10720 | 10720 | 首次启动后不能修改 | -| cn_seed_config_node | 节点注册加入集群时连接的ConfigNode 的地址,cn_internal_address:cn_internal_port | 127.0.0.1:10710 | 第一个CongfigNode的cn_internal_address:cn_internal_port | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | 首次启动后不能修改 | - -#### DataNode 配置 - -打开DataNode配置文件 `./conf/iotdb-system.properties`,设置以下参数: - -| 配置项 | 说明 | 默认 | 推荐值 | 192.168.1.3 | 192.168.1.4 | 192.168.1.5 | 备注 | -| ------------------------------- | ------------------------------------------------------------ | --------------- | ------------------------------------------------------- | ------------- | ------------- | ------------- | ------------------ | -| dn_rpc_address | 客户端 RPC 服务的地址 | 127.0.0.1 | 默认本机可直接访问。非本机访问,请修改此配置项为所在服务器的IPV4地址或hostname,推荐使用所在服务器的IPV4地址。 | iotdb-1 |iotdb-2 | iotdb-3 | 重启服务生效 | -| dn_rpc_port | 客户端 RPC 服务的端口 | 6667 | 6667 | 6667 | 6667 | 6667 | 重启服务生效 | -| dn_internal_address | DataNode在集群内部通讯使用的地址 | 127.0.0.1 | 所在服务器的IPV4地址或hostname,推荐使用hostname | iotdb-1 | iotdb-2 | iotdb-3 | 首次启动后不能修改 | -| dn_internal_port | DataNode在集群内部通信使用的端口 | 10730 | 10730 | 10730 | 10730 | 10730 | 首次启动后不能修改 | -| dn_mpp_data_exchange_port | DataNode用于接收数据流使用的端口 | 10740 | 10740 | 10740 | 10740 | 10740 | 首次启动后不能修改 | -| dn_data_region_consensus_port | DataNode用于数据副本共识协议通信使用的端口 | 10750 | 10750 | 10750 | 10750 | 10750 | 首次启动后不能修改 | -| dn_schema_region_consensus_port | DataNode用于元数据副本共识协议通信使用的端口 | 10760 | 10760 | 10760 | 10760 | 10760 | 首次启动后不能修改 | -| dn_seed_config_node | 
节点注册加入集群时连接的ConfigNode地址,即cn_internal_address:cn_internal_port | 127.0.0.1:10710 | 第一个CongfigNode的cn_internal_address:cn_internal_port | iotdb-1:10710 | iotdb-1:10710 | iotdb-1:10710 | 首次启动后不能修改 | - -> ❗️注意:VSCode Remote等编辑器无自动保存配置功能,请确保修改的文件被持久化保存,否则配置项无法生效 - -### 3.3 启动ConfigNode节点 - -先启动第一个iotdb-1的confignode, 保证种子confignode节点先启动,然后依次启动第2和第3个confignode节点 - -```shell -# Unix/OS X -cd sbin -./start-confignode.sh -d #“-d”参数将在后台进行启动 - -# Windows -# V2.0.4.x 版本之前 -.\start-confignode.bat - -# V2.0.4.x 版本及之后 -.\windows\start-confignode.bat -``` - -如果启动失败,请参考下[常见问题](#常见问题) - -### 3.4 启动DataNode 节点 - -分别进入iotdb的sbin目录下,依次启动3个datanode节点: - -```shell -# Unix/OS X -cd sbin -./start-datanode.sh -d #-d参数将在后台进行启动 - -# Windows -# V2.0.4.x 版本之前 -.\start-datanode.bat - -# V2.0.4.x 版本及之后 -.\windows\start-datanode.bat -``` - -### 3.5 激活数据库 - -#### 方式一:通过 CLI 激活 - -- 进入集群任一节点 CLI - -```shell -# Linux 系统与 MacOS 系统启动命令如下: -# V2.0.6.x 版本之前 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x 版本及之后 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 - -# Windows 系统启动命令如下: -# V2.0.4.x 版本之前 -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.4.x 版本及之后, V2.0.6.x 版本之前 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x 版本及之后 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` - -- 执行以下内容获取激活所需机器码: - -```SQL -IoTDB> show system info -``` -```shell -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -|01-TE5NLES4-UDDWCMYE,01-GG5NLES4-XXDWCMYE,01-FF5NLES4-WWWWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -- 执行以下语句获取待激活数据库的版本号: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.x.x| xxxxxxx| -+-------+---------+ -Total line number 
= 1 -``` - -- 将获取到的机器码与版本号,一同提供给天谋工作人员。 - -- 工作人员会返回激活码,正常是与提供的机器码的顺序对应的,请将整串激活码粘贴到CLI中进行激活,此激活操作只需在集群中的任意一台机器上执行一次即可。 - - - 注:激活码前后需要用`'`符号进行标注,如下所示 - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===,01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - - -#### 方式二:激活文件拷贝激活 - -- 依次启动3个Confignode、Datanode节点后,每台机器各自的activation文件夹, 分别拷贝每台机器的system_info文件给天谋工作人员; -- 工作人员将返回每个ConfigNode、Datanode节点的license文件,这里会返回3个license文件; -- 将3个license文件分别放入对应的ConfigNode节点的activation文件夹下; - - -### 3.6 验证激活 - -可在 CLI 中通过执行 `show activation` 命令查看激活状态,示例如下,状态显示为 ACTIVATED 表示激活成功 - -```sql -IoTDB> show activation -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - -### 3.7 一键启停集群 - -#### 3.7.1 概述 - -在 IoTDB 的根目录中,`sbin` 子目录包含的 `start-all.sh` 和 `stop-all.sh` 脚本,与 `conf` 子目录中的 `iotdb-cluster.properties` 配置文件协同工作,可通过单一节点实现一键启动或停止集群所有节点的功能。通过这种方式,可以高效地管理 IoTDB 集群的生命周期,简化了部署和运维流程。 -下文将介绍`iotdb-cluster.properties` 文件中的具体配置项。 - -#### 3.7.2 配置项 - - -> 注意: -> -> * 当集群变更时,需要手动更新此配置文件。 -> * 如果在未配置 
`iotdb-cluster.properties` 配置文件的情况下执行 `start-all.sh` 或者 `stop-all.sh` 脚本,则默认会启停当前脚本所在 IOTDB\_HOME 目录下的 ConfigNode 与 DataNode 节点。 -> * 推荐配置 ssh 免密登录:如果未配置,启动脚本后会提示输入服务器密码以便于后续启动/停止/销毁操作。如果已配置,则无需在执行脚本过程中输入服务器密码。 - -* confignode\_address\_list - -| 名字 | confignode\_address\_list | -| :--------------: | :------------------------------------------------------------------------------ | -| 描述 | 待启动/停止的 ConfigNode 节点所在主机的 IP 或主机名列表,如果有多个需要用“,”分隔。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* datanode\_address\_list - -| 名字 | datanode\_address\_list | -| :----------------: | :---------------------------------------------------------------------------- | -| 描述 | 待启动/停止的 DataNode 节点所在主机的 IP 或主机名列表,如果有多个需要用“,”分隔。 | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* ssh\_account - -| 名字 | ssh\_account | -| :----------------: | :------------------------------------------------------------- | -| 描述 | 通过 SSH 登陆目标主机的用户名,需要所有的主机的用户名都相同 | -| 类型 | String | -| 默认值 | root | -| 改后生效方式 | 重启服务生效 | - -* ssh\_port - -| 名字 | ssh\_port | -| :----------------: | :--------------------------------------------------------- | -| 描述 | 目标主机对外暴露的 SSH 端口,需要所有的主机的端口都相同 | -| 类型 | int | -| 默认值 | 22 | -| 改后生效方式 | 重启服务生效 | - -* confignode\_deploy\_path - -| 名字 | confignode\_deploy\_path | -| :----------------: | :---------------------------------------------------------------------------------------------------------------- | -| 描述 | 待启动/停止的所有 ConfigNode 所在目标主机的路径,需要所有待启动/停止的 ConfigNode 节点在目标主机的相同目录下。例如:`/data/demo/apache-iotdb-1.3.1-all-bin` | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - -* datanode\_deploy\_path - -| 名字 | datanode\_deploy\_path | -| :----------------: | :------------------------------------------------------------------------------------------------------------ | -| 描述 | 待启动/停止的所有 DataNode 所在目标主机的路径,需要所有待启动/停止的 DataNode 节点在目标主机的相同目录下。例如:`/data/demo/apache-iotdb-1.3.1-all-bin` | -| 类型 | String | -| 默认值 | 无 | -| 改后生效方式 | 重启服务生效 | - - -#### 3.7.3 简单示例 - -1. 
配置文件 `iotdb-cluster.properties` - -```properties -# Configure ConfigNodes machine addresses separated by , -confignode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# Configure DataNodes machine addresses separated by , -datanode_address_list=172.xx.xx.16,172.xx.xx.17,172.xx.xx.18 - -# User name for logging in to the deployment machine using ssh -ssh_account=root - -# ssh login port -ssh_port=22 - -# iotdb deployment directory (iotdb will be deployed to the target node in this folder) -confignode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -datanode_deploy_path=/data/demo/apache-iotdb-1.3.1-all-bin -``` - -2. 执行 ./start-all.sh 命令验证启动结果,在 cli 中执行 show cluster,可看到类似如下结果 -```SQL -IoTDB> show cluster -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -|NodeID| NodeType| Status|InternalAddress|InternalPort| Version| BuildInfo| ActivateStatus| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -| 0|ConfigNode|Running| 172.xx.xx.16| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 1|ConfigNode|Running| 172.xx.xx.18| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 2|ConfigNode|Running| 172.xx.xx.17| 10710| 1.3.1| 0xxxxxx| ACTIVATED| -| 3| DataNode|Running| 172.xx.xx.18| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 4| DataNode|Running| 172.xx.xx.17| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -| 5| DataNode|Running| 172.xx.xx.16| 10730| 1.3.1| 0xxxxxx| ACTIVATED| -+------+----------+-------+---------------+------------+--------------+-----------+----------------+ -``` - -## 4. 
节点维护步骤 - -### 4.1 ConfigNode节点维护 - -ConfigNode节点维护分为ConfigNode添加和移除两种操作,有两个常见使用场景: -- 集群扩展:如集群中只有1个ConfigNode时,希望增加ConfigNode以提升ConfigNode节点高可用性,则可以添加2个ConfigNode,使得集群中有3个ConfigNode。 -- 集群故障恢复:1个ConfigNode所在机器发生故障,使得该ConfigNode无法正常运行,此时可以移除该ConfigNode,然后添加一个新的ConfigNode进入集群。 - -> ❗️注意,在完成ConfigNode节点维护后,需要保证集群中有1或者3个正常运行的ConfigNode。2个ConfigNode不具备高可用性,超过3个ConfigNode会导致性能损失。 - -#### 添加ConfigNode节点 - -脚本命令: -```shell -# Linux / MacOS -# 首先切换到IoTDB根目录 -sbin/start-confignode.sh - -# Windows -# 首先切换到IoTDB根目录 -# V2.0.4.x 版本之前 -sbin\start-confignode.bat - -# V2.0.4.x 版本及之后 -sbin\windows\start-confignode.bat -``` - -参数介绍: - -| 参数 | 描述 | 是否为必填项 | -| :--- | :--------------------------------------------- | :----------- | -| -v | 显示版本信息 | 否 | -| -f | 在前台运行脚本,不将其放到后台 | 否 | -| -d | 以守护进程模式启动,即在后台运行 | 否 | -| -p | 指定一个文件来存放进程ID,用于进程管理 | 否 | -| -c | 指定配置文件夹的路径,脚本会从这里加载配置文件 | 否 | -| -g | 打印垃圾回收(GC)的详细信息 | 否 | -| -H | 指定Java堆转储文件的路径,当JVM内存溢出时使用 | 否 | -| -E | 指定JVM错误日志文件的路径 | 否 | -| -D | 定义系统属性,格式为 key=value | 否 | -| -X | 直接传递 -XX 参数给 JVM | 否 | -| -h | 帮助指令 | 否 | - -#### 移除ConfigNode节点 - -首先通过CLI连接集群,通过`show confignodes`确认想要移除ConfigNode的NodeID: - -```Bash -IoTDB> show confignodes -+------+-------+---------------+------------+--------+ -|NodeID| Status|InternalAddress|InternalPort| Role| -+------+-------+---------------+------------+--------+ -| 0|Running| 127.0.0.1| 10710| Leader| -| 1|Running| 127.0.0.1| 10711|Follower| -| 2|Running| 127.0.0.1| 10712|Follower| -+------+-------+---------------+------------+--------+ -Total line number = 3 -It costs 0.030s -``` - -然后使用SQL将ConfigNode移除,SQL命令: - - -```Bash -remove confignode [confignode_id] - -``` - -### 4.2 DataNode节点维护 - -DataNode节点维护有两个常见场景: - -- 集群扩容:出于集群能力扩容等目的,添加新的DataNode进入集群 -- 集群故障恢复:一个DataNode所在机器出现故障,使得该DataNode无法正常运行,此时可以移除该DataNode,并添加新的DataNode进入集群 - -> ❗️注意,为了使集群能正常工作,在DataNode节点维护过程中以及维护完成后,正常运行的DataNode总数不得少于数据副本数(通常为2),也不得少于元数据副本数(通常为3)。 - -#### 添加DataNode节点 - -脚本命令: - -```Bash -# Linux / MacOS -# 首先切换到IoTDB根目录 
-sbin/start-datanode.sh - -# Windows -# 首先切换到IoTDB根目录 -# V2.0.4.x 版本之前 -sbin\start-datanode.bat - -# V2.0.4.x 版本及之后 -sbin\windows\start-datanode.bat -``` - -参数介绍: - -| 缩写 | 描述 | 是否为必填项 | -| :--- | :--------------------------------------------- | :----------- | -| -v | 显示版本信息 | 否 | -| -f | 在前台运行脚本,不将其放到后台 | 否 | -| -d | 以守护进程模式启动,即在后台运行 | 否 | -| -p | 指定一个文件来存放进程ID,用于进程管理 | 否 | -| -c | 指定配置文件夹的路径,脚本会从这里加载配置文件 | 否 | -| -g | 打印垃圾回收(GC)的详细信息 | 否 | -| -H | 指定Java堆转储文件的路径,当JVM内存溢出时使用 | 否 | -| -E | 指定JVM错误日志文件的路径 | 否 | -| -D | 定义系统属性,格式为 key=value | 否 | -| -X | 直接传递 -XX 参数给 JVM | 否 | -| -h | 帮助指令 | 否 | - -说明:在添加DataNode后,随着新的写入到来(以及旧数据过期,如果设置了TTL),集群负载会逐渐向新的DataNode均衡,最终在所有节点上达到存算资源的均衡。 - -#### 移除DataNode节点 - -首先通过CLI连接集群,通过`show datanodes`确认想要移除的DataNode的NodeID: - -```Bash -IoTDB> show datanodes -+------+-------+----------+-------+-------------+---------------+ -|NodeID| Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum| -+------+-------+----------+-------+-------------+---------------+ -| 1|Running| 0.0.0.0| 6667| 0| 0| -| 2|Running| 0.0.0.0| 6668| 1| 1| -| 3|Running| 0.0.0.0| 6669| 1| 0| -+------+-------+----------+-------+-------------+---------------+ -Total line number = 3 -It costs 0.110s -``` - -然后使用SQL将DataNode移除,SQL命令: - -```Bash -remove datanode [datanode_id] - -``` - -### 4.3 集群维护 - -更多关于集群维护的介绍可参考:[集群维护](../User-Manual/Load-Balance.md) - - -## 5. 常见问题 - -1. 部署过程中多次提示激活失败 - - 使用 `ls -al` 命令:使用 `ls -al` 命令检查安装包根目录的所有者信息是否为当前用户。 - - 检查激活目录:检查 `./activation` 目录下的所有文件,所有者信息是否为当前用户。 - -2. Confignode节点启动失败 - - 步骤 1: 请查看启动日志,检查是否修改了某些首次启动后不可改的参数。 - - 步骤 2: 请查看启动日志,检查是否出现其他异常。日志中若存在异常现象,请联系天谋技术支持人员咨询解决方案。 - - 步骤 3: 如果是首次部署或者数据可删除,也可按下述步骤清理环境,重新部署后,再次启动。 - - 步骤 4: 清理环境: - - a. 结束所有 ConfigNode 和 DataNode 进程。 - -```Bash - # 1. 停止 ConfigNode 和 DataNode 服务 - # Unix/OS X - sbin/stop-standalone.sh - - # Windows - # V2.0.4.x 版本之前 - sbin\stop-standalone.bat - - # V2.0.4.x 版本及之后 - sbin\windows\stop-standalone.bat - - # 2. 检查是否还有进程残留 - jps - # 或者 - ps -ef|grep iotdb - - # 3. 
如果有进程残留,则手动kill - kill -9 - # 如果确定机器上仅有1个iotdb,可以使用下面命令清理残留进程 - ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9 - ``` - b. 删除 data 和 logs 目录。 - - 说明:删除 data 目录是必要的,删除 logs 目录是为了纯净日志,非必需。 - ```Bash - cd /data/iotdb - rm -rf data logs - ``` diff --git a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Database-Resources_timecho.md b/src/zh/UserGuide/latest/Deployment-and-Maintenance/Database-Resources_timecho.md deleted file mode 100644 index 11f6b5861..000000000 --- a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Database-Resources_timecho.md +++ /dev/null @@ -1,206 +0,0 @@ - -# 资源规划 -## 1. CPU - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
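-| Number of series (sampling frequency <= 1 Hz) | CPU | Nodes: standalone | Nodes: dual-active | Nodes: distributed |
-| --- | --- | --- | --- | --- |
-| Up to 100,000 | 2-4 cores | 1 | 2 | 3 |
-| Up to 300,000 | 4-8 cores | 1 | 2 | 3 |
-| Up to 500,000 | 8-16 cores | 1 | 2 | 3 |
-| Up to 1,000,000 | 16-32 cores | 1 | 2 | 3 |
-| Up to 2,000,000 | 32-48 cores | 1 | 2 | 3 |
-| Up to 10,000,000 | 48 cores | 1 | 2 | Contact Timecho sales |
-| Over 10,000,000 | Contact Timecho sales | | | |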
-
-> Supported CPU models: Kunpeng, Phytium, Sunway, Hygon, Zhaoxin, Loongson
-
-## 2. Memory
-
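-| Number of series (sampling frequency <= 1 Hz) | Memory | Nodes: standalone | Nodes: dual-active | Nodes: distributed |
-| --- | --- | --- | --- | --- |
-| Up to 100,000 | 2G-4G | 1 | 2 | 3 |
-| Up to 300,000 | 6G-12G | 1 | 2 | 3 |
-| Up to 500,000 | 12G-24G | 1 | 2 | 3 |
-| Up to 1,000,000 | 24G-48G | 1 | 2 | 3 |
-| Up to 2,000,000 | 48G-96G | 1 | 2 | 3 |
-| Up to 10,000,000 | 128G | 1 | 2 | Contact Timecho sales |
-| Over 10,000,000 | Contact Timecho sales | | | |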
-
-> Memory allocation is configurable: adjust it in the datanode-env file. For details and configuration guidance, see [datanode-env](../Reference/DataNode-Config-Manual.md#_2-环境配置项-datanode-env-sh-bat)
-
-## 3. Storage (Disk)
-### 3.1 Storage Space
-
-Storage can be estimated with the [Disk Resource Evaluator](https://www.timecho.com/docs/zh/ResourceEvaluator.html).
-
-Formula: number of measurement points * sampling frequency (Hz) * size of each data point (bytes; varies by data type, see the table below)
-
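-Data point size reference table
-
-| Data type | Timestamp (bytes) | Value (bytes) | Total size per data point (bytes) |
-| --- | --- | --- | --- |
-| Switch value (Boolean) | 8 | 1 | 9 |
-| Integer (INT32) / single-precision float (FLOAT) | 8 | 4 | 12 |
-| Long integer (INT64) / double-precision float (DOUBLE) | 8 | 8 | 16 |
-| String (TEXT) | 8 | a on average | 8+a |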
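
The storage formula (number of series × sampling frequency × bytes per data point, scaled by retention time, replica count, and compression ratio) can be sketched as a small shell helper. This is only an illustrative sketch and not an IoTDB tool: the function name `estimate_storage_tib` and its argument order are assumptions made here, and the compression ratio of 10 is simply the figure used in this document's own worked example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with IoTDB): estimate storage in TiB.
# Args: series_count sampling_hz bytes_per_point retention_days replicas compression_ratio
estimate_storage_tib() {
  awk -v series="$1" -v hz="$2" -v point_bytes="$3" \
      -v days="$4" -v replicas="$5" -v ratio="$6" 'BEGIN {
    # bytes = points written per second * seconds retained * bytes per point,
    # multiplied by the replica count and divided by the compression ratio
    bytes = series * hz * point_bytes * 86400 * days * replicas / ratio
    printf "%.1f\n", bytes / (1024 * 1024 * 1024 * 1024)
  }'
}

# 100,000 INT32 series (12 bytes per point) at 1 Hz, kept for 365 days,
# with 3 replicas and an assumed 10x compression ratio
estimate_storage_tib 100000 1 12 365 3 10   # prints 10.3
```

The result is expressed in TiB (1024^4 bytes); when provisioning disks it is reasonable to round the figure up.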
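-
-Example: 1,000 devices with 100 measurement points each, 100,000 series in total, all INT32. The sampling frequency is 1 Hz (once per second), data is retained for 1 year, and there are 3 replicas.
-- Full formula: 1000 devices * 100 measurement points * 12 bytes per data point * 86400 seconds per day * 365 days per year * 3 replicas / 10 (compression ratio) / 1024 / 1024 / 1024 / 1024 ≈ 10.3 TiB, so plan for about 11T
-- Short form: 1000 * 100 * 12 * 86400 * 365 * 3 / 10 / 1024 / 1024 / 1024 / 1024 ≈ 10.3 TiB, so plan for about 11T
-### 3.2 Storage Configuration
-For more than 10 million measurement points, or for heavy query loads, SSDs are recommended.
-## 4. Network (NIC)
-When write throughput is below 10 million points per second, a gigabit NIC is required; when write throughput is 10 million points per second or more, a 10-gigabit NIC is required.
-| **Write throughput (data points/second)** | **NIC speed** |
-| ------------------- | ------------- |
-| < 10 million | 1 Gbps (gigabit) |
-| >= 10 million | 10 Gbps (10-gigabit) |
-## 5. Additional Notes
-IoTDB clusters can be scaled out within seconds, and adding nodes does not require migrating existing data. You therefore do not need to worry that capacity estimated from current data will prove too limited; new nodes can be added to the cluster whenever expansion is needed.
\ No newline at end of file
diff --git a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Deployment-form_timecho.md b/src/zh/UserGuide/latest/Deployment-and-Maintenance/Deployment-form_timecho.md
deleted file mode 100644
index d49674d07..000000000
--- a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Deployment-form_timecho.md
+++ /dev/null
@@ -1,61 +0,0 @@
-
-# Deployment Forms
-
-IoTDB has three operating modes: standalone mode, cluster mode, and dual-active mode.
-
-## 1. Standalone Mode
-
-A standalone IoTDB instance consists of 1 ConfigNode and 1 DataNode, i.e. 1C1D.
-
-- **Features**: Easy for developers to install and deploy, with low deployment and maintenance costs and simple operation.
-- **Use cases**: Scenarios with limited resources or modest high-availability requirements, such as edge servers.
-- **Deployment**: [Standalone Deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md)
-
-## 2. Dual-Active Mode
-
-Dual-active deployment is a TimechoDB Enterprise Edition feature in which two independent instances synchronize bidirectionally and both serve requests at the same time. When one instance restarts after downtime, the other instance transfers the missing data to it, resuming from the point of interruption.
-
-> An IoTDB dual-active deployment usually consists of 2 standalone nodes, i.e. two sets of 1C1D. Each instance can also be a cluster.
-
-- **Features**: The high-availability solution with the lowest resource usage.
-- **Use cases**: Limited resources (only two servers) where high availability is still required.
-- **Deployment**: [Dual-Active Deployment](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md)
-
-## 3. Cluster Mode
-
-An IoTDB cluster consists of 3 ConfigNodes and at least 3 DataNodes, typically 3 DataNodes, i.e. 3C3D. When some nodes fail, the remaining nodes continue to serve requests, which keeps the database service highly available, and performance can be improved by adding nodes.
-
-- **Features**: High availability and high scalability; system performance can be improved by adding DataNodes.
-- **Use cases**: Enterprise applications that require high availability and reliability.
-- **Deployment**: [Cluster Deployment](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md)
-
-## 4. 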
特点总结 - -| 维度 | 单机模式 | 双活模式 | 集群模式 | -| ------------ | ---------------------------- | ------------------------ | ------------------------ | -| 适用场景 | 边缘侧部署、对高可用要求不高 | 高可用性业务、容灾场景等 | 高可用性业务、容灾场景等 | -| 所需机器数量 | 1 | 2 | ≥3 | -| 安全可靠性 | 无法容忍单点故障 | 高,可容忍单点故障 | 高,可容忍单点故障 | -| 扩展性 | 可扩展 DataNode 提升性能 | 每个实例可按需扩展 | 可扩展 DataNode 提升性能 | -| 性能 | 可随 DataNode 数量扩展 | 与其中一个实例性能相同 | 可随 DataNode 数量扩展 | - -- 单机模式和集群模式,部署步骤类似(逐个增加 ConfigNode 和 DataNode),仅副本数和可提供服务的最少节点数不同。 \ No newline at end of file diff --git a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Docker-Deployment_timecho.md b/src/zh/UserGuide/latest/Deployment-and-Maintenance/Docker-Deployment_timecho.md deleted file mode 100644 index 80a847eaf..000000000 --- a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Docker-Deployment_timecho.md +++ /dev/null @@ -1,495 +0,0 @@ - -# Docker部署指导 - -## 1. 环境准备 - -### 1.1 Docker安装 - -```Bash -#以ubuntu为例,其他操作系统可以自行搜索安装方法 -#step1: 安装一些必要的系统工具 -sudo apt-get update -sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common -#step2: 安装GPG证书 -curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add - -#step3: 写入软件源信息 -sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" -#step4: 更新并安装Docker-CE -sudo apt-get -y update -sudo apt-get -y install docker-ce -#step5: 设置docker开机自启动 -sudo systemctl enable docker -#step6: 验证docker是否安装成功 -docker --version #显示版本信息,即安装成功 -``` - -### 1.2 docker-compose安装 - -```Bash -#安装命令 -curl -L "https://github.com/docker/compose/releases/download/v2.20.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose -chmod +x /usr/local/bin/docker-compose -ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose -#验证是否安装成功 -docker-compose --version #显示版本信息即安装成功 -``` - -### 1.3 安装dmidecode插件 - -默认情况下,linux服务器应该都已安装,如果没有安装的话,可以使用下面的命令安装。 - -```Bash -sudo apt-get install dmidecode -``` - -dmidecode 安装后,查找安装路径:`whereis 
dmidecode`,这里假设结果为`/usr/sbin/dmidecode`,记住该路径,后面的docker-compose的yml文件会用到。 - -### 1.4 获取IoTDB的容器镜像 - -关于IoTDB企业版的容器镜像您可联系商务或技术支持获取。 - -## 2. 单机版部署 - -本节演示如何部署1C1D的docker单机版。 - -### 2.1 load 镜像文件 - -比如这里获取的IoTDB的容器镜像文件名是:`iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz` - -load镜像: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -查看镜像: - -```Bash -docker images -``` - -![](/img/%E5%8D%95%E6%9C%BA-%E6%9F%A5%E7%9C%8B%E9%95%9C%E5%83%8F.png) - -### 2.2 创建docker bridge网络 - -```Bash -docker network create --driver=bridge --subnet=172.18.0.0/16 --gateway=172.18.0.1 iotdb -``` - -### 2.3 编写docker-compose的yml文件 - -这里我们以把IoTDB安装目录和yml文件统一放在`/docker-iotdb` 文件夹下为例: - -文件目录结构为:`/docker-iotdb/iotdb`, `/docker-iotdb/docker-compose-standalone.yml ` - -```Bash -docker-iotdb: -├── iotdb #iotdb安装目录 -│── docker-compose-standalone.yml #单机版docker-compose的yml文件 -``` - -完整的`docker-compose-standalone.yml`内容如下: - -```Bash -version: "3" -services: - iotdb-service: - image: timecho/timechodb:2.0.2.1-standalone #使用的镜像 - hostname: iotdb - container_name: iotdb - restart: always - ports: - - "6667:6667" - environment: - - cn_internal_address=iotdb - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb:10710 - - dn_rpc_address=iotdb - - dn_internal_address=iotdb - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - dn_seed_config_node=iotdb:10710 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - networks: - iotdb: - ipv4_address: 172.18.0.6 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
-    # If you see that line for a long time, lower the nofile limit by uncommenting below:
-    # ulimits:
-    #   nofile:
-    #     soft: 1048576
-    #     hard: 1048576
-networks:
-  iotdb:
-    external: true
-```
-
-### 2.4 First Startup
-
-Start the container with the following commands:
-
-```Bash
-cd /docker-iotdb
-docker-compose -f docker-compose-standalone.yml up
-```
-
-Because the instance has not been activated yet, it exits immediately on first startup. This is expected: the first startup only generates the machine code file that the activation process below requires.
-
-![](/img/%E5%8D%95%E6%9C%BA-%E6%BF%80%E6%B4%BB.png)
-
-### 2.5 Apply for Activation
-
-- After the first startup, a `system_info` file is generated in the host directory `/docker-iotdb/iotdb/activation`. Send this file to Timecho staff.
-
-  ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png)
-
-- After you receive the license file from Timecho staff, copy it into the `/docker-iotdb/iotdb/activation` folder.
-
-  ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png)
-
-### 2.6 Start IoTDB Again
-
-```Bash
-docker-compose -f docker-compose-standalone.yml up -d
-```
-
-![](/img/%E5%90%AF%E5%8A%A8iotdb.png)
-
-### 2.7 Verify the Deployment
-
-- Check the log. The following line indicates that startup succeeded:
-
-```Bash
-docker logs -f iotdb  # view the log; the container is named iotdb in docker-compose-standalone.yml
-2024-07-19 12:02:32,608 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself!
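-```
-
-![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B21.png)
-
-- Enter the container and check the service status and activation information
-
-  List the running containers:
-
-  ```Bash
-  docker ps
-  ```
-
-  ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B22.png)
-
-  Enter the container, log in to the database through the CLI, and run `show cluster` to check the service status and activation status:
-
-  ```Bash
-  docker exec -it iotdb /bin/bash    # enter the container
-  ./start-cli.sh -h iotdb            # log in to the database
-  IoTDB> show cluster                # check the status
-  ```
-
-  All services should be shown as running, and the activation status should be shown as activated.
-
-  ![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81%E9%83%A8%E7%BD%B23.png)
-
-### 2.8 Map the /conf Directory (Optional)
-
-If you later want to edit the configuration files directly on the host, you can map the container's /conf folder to the host in three steps:
-
-Step 1: Copy the /conf directory from the container to `/docker-iotdb/iotdb/conf`
-
-```Bash
-docker cp iotdb:/iotdb/conf /docker-iotdb/iotdb/conf
-```
-
-Step 2: Add the mapping to docker-compose-standalone.yml
-
-```Bash
-    volumes:
-        - ./iotdb/conf:/iotdb/conf   # add this mapping for the /conf folder
-        - ./iotdb/activation:/iotdb/activation
-        - ./iotdb/data:/iotdb/data
-        - ./iotdb/logs:/iotdb/logs
-        - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro
-        - /dev/mem:/dev/mem:ro
-```
-
-Step 3: Restart IoTDB
-
-```Bash
-docker-compose -f docker-compose-standalone.yml up -d
-```
-
-## 3. Cluster Deployment
-
-This section describes how to manually deploy an instance consisting of 3 ConfigNodes and 3 DataNodes, commonly called a 3C3D cluster.
-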
- -**注意:集群版目前只支持host网络和overlay 网络,不支持bridge网络。** - -下面以host网络为例演示如何部署3C3D集群。 - -### 3.1 设置主机名 - -假设现在有3台linux服务器,IP地址和服务角色分配如下: - -| 节点ip | 主机名 | 服务 | -| ----------- | ------- | -------------------- | -| 192.168.1.3 | iotdb-1 | ConfigNode、DataNode | -| 192.168.1.4 | iotdb-2 | ConfigNode、DataNode | -| 192.168.1.5 | iotdb-3 | ConfigNode、DataNode | - -在3台机器上分别配置主机名,设置主机名需要在目标服务器上配置/etc/hosts,使用如下命令: - -```Bash -echo "192.168.1.3 iotdb-1" >> /etc/hosts -echo "192.168.1.4 iotdb-2" >> /etc/hosts -echo "192.168.1.5 iotdb-3" >> /etc/hosts -``` - -### 3.2 load镜像文件 - -比如获取的IoTDB的容器镜像文件名是:`iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz` - -在3台服务器上分别执行load镜像命令: - -```Bash -docker load -i iotdb-enterprise-2.0.x.x-standalone-docker.tar.gz -``` - -查看镜像: - -```Bash -docker images -``` - -![](/img/%E9%95%9C%E5%83%8F%E5%8A%A0%E8%BD%BD.png) - -### 3.3 编写docker-compose的yml文件 - -这里我们以把IoTDB安装目录和yml文件统一放在/docker-iotdb文件夹下为例: - -文件目录结构为:`/docker-iotdb/iotdb`,`/docker-iotdb/confignode.yml`,`/docker-iotdb/datanode.yml` - -```Bash -docker-iotdb: -├── confignode.yml #confignode的yml文件 -├── datanode.yml #datanode的yml文件 -└── iotdb #IoTDB安装目录 -``` - -在每台服务器上都要编写2个yml文件,即`confignode.yml`和`datanode.yml`,yml示例如下: - -**confignode.yml:** - -```Bash -#confignode.yml -version: "3" -services: - iotdb-confignode: - image: iotdb-enterprise:2.0.x.x-standalone #使用的镜像 - hostname: iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - container_name: iotdb-confignode - command: ["bash", "-c", "entrypoint.sh confignode"] - restart: always - environment: - - cn_internal_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - cn_internal_port=10710 - - cn_consensus_port=10720 - - cn_seed_config_node=iotdb-1:10710 #默认第一台为seed节点 - - schema_replication_factor=3 #元数据副本数 - - data_replication_factor=2 #数据副本数 - privileged: true - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" 
#使用host网络 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). - # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -**datanode.yml:** - -```Bash -#datanode.yml -version: "3" -services: - iotdb-datanode: - image: iotdb-enterprise:2.0.x.x-standalone #使用的镜像 - hostname: iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - container_name: iotdb-datanode - command: ["bash", "-c", "entrypoint.sh datanode"] - restart: always - ports: - - "6667:6667" - privileged: true - environment: - - dn_rpc_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - dn_internal_address=iotdb-1|iotdb-2|iotdb-3 #根据实际情况选择,三选一 - - dn_seed_config_node=iotdb-1:10710 #默认第1台为seed节点 - - dn_rpc_port=6667 - - dn_internal_port=10730 - - dn_mpp_data_exchange_port=10740 - - dn_schema_region_consensus_port=10750 - - dn_data_region_consensus_port=10760 - - schema_replication_factor=3 #元数据副本数 - - data_replication_factor=2 #数据副本数 - volumes: - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - network_mode: "host" #使用host网络 - # Note: Some environments set an extremely high container nofile limit (~2^30 = 1073741824). - # This can make the startup step "Checking whether the ports are already occupied..." appear to hang (lsof slow). 
- # If you see that line for a long time, lower the nofile limit by uncommenting below: - # ulimits: - # nofile: - # soft: 1048576 - # hard: 1048576 -``` - -### 3.4 首次启动confignode - -先在3台服务器上分别启动confignode, 用来获取机器码,注意启动顺序,先启动第1台iotdb-1,再启动iotdb-2和iotdb-3。 - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d #后台启动 -``` - -### 3.5 申请激活 - -- 首次启动3个confignode后,在每个物理机目录`/docker-iotdb/iotdb/activation`下都会生成一个`system_info`文件,将3个服务器的`system_info`文件拷贝给天谋工作人员; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB1.png) - -- 将3个license文件分别放入对应的ConfigNode节点的`/docker-iotdb/iotdb/activation`文件夹下; - - ![](/img/%E5%8D%95%E6%9C%BA-%E7%94%B3%E8%AF%B7%E6%BF%80%E6%B4%BB2.png) - -- license放入对应的activation文件夹后,confignode会自动激活,不用重启confignode - -### 3.6 启动datanode - -在3台服务器上分别启动datanode - -```Bash -cd /docker-iotdb -docker-compose -f datanode.yml up -d #后台启动 -``` - -![](/img/%E9%9B%86%E7%BE%A4%E7%89%88-dn%E5%90%AF%E5%8A%A8.png) - -### 3.7 验证部署 - -- 查看日志,有如下字样,表示datanode启动成功 - - ```Bash - docker logs -f iotdb-datanode #查看日志命令 - 2024-07-20 16:50:48,937 [main] INFO o.a.i.db.service.DataNode:231 - Congratulations, IoTDB DataNode is set up successfully. Now, enjoy yourself! 
- ``` - - ![](/img/dn%E5%90%AF%E5%8A%A8.png) - -- 进入任意一个容器,查看服务运行状态及激活信息 - - 查看启动的容器 - - ```Bash - docker ps - ``` - - ![](/img/%E6%9F%A5%E7%9C%8B%E5%AE%B9%E5%99%A8.png) - - 进入容器,通过cli登录数据库,使用`show cluster`命令查看服务状态及激活状态 - - ```Bash - docker exec -it iotdb-datanode /bin/bash #进入容器 - ./start-cli.sh -h iotdb-1 #登录数据库 - IoTDB> show cluster #查看状态 - ``` - - 可以看到服务都是running,激活状态显示已激活。 - - ![](/img/%E9%9B%86%E7%BE%A4-%E6%BF%80%E6%B4%BB.png) - -### 3.8 映射/conf目录(可选) - -后续如果想在物理机中直接修改配置文件,可以把容器中的/conf文件夹映射出来,分三步: - -步骤一:在3台服务器中分别拷贝容器中的/conf目录到`/docker-iotdb/iotdb/conf` - -```Bash -docker cp iotdb-confignode:/iotdb/conf /docker-iotdb/iotdb/conf -或者 -docker cp iotdb-datanode:/iotdb/conf /docker-iotdb/iotdb/conf -``` - -步骤二:在3台服务器的`confignode.yml`和`datanode.yml`中添加/conf目录映射 - -```Bash -#confignode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #增加这个/conf文件夹的映射 - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro - -#datanode.yml - volumes: - - ./iotdb/conf:/iotdb/conf #增加这个/conf文件夹的映射 - - ./iotdb/activation:/iotdb/activation - - ./iotdb/data:/iotdb/data - - ./iotdb/logs:/iotdb/logs - - /usr/sbin/dmidecode:/usr/sbin/dmidecode:ro - - /dev/mem:/dev/mem:ro -``` - -步骤三:在3台服务器上重新启动IoTDB - -```Bash -cd /docker-iotdb -docker-compose -f confignode.yml up -d -docker-compose -f datanode.yml up -d -``` \ No newline at end of file diff --git a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md b/src/zh/UserGuide/latest/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md deleted file mode 100644 index 5e9c9314f..000000000 --- a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md +++ /dev/null @@ -1,203 +0,0 @@ - -# 双活版部署指导 - -## 1. 什么是双活版? 
- -双活通常是指两个独立的单机(或集群),实时进行镜像同步,它们的配置完全独立,可以同时接收外界的写入,每一个独立的单机(或集群)都可以将写入到自己的数据同步到另一个单机(或集群)中,两个单机(或集群)的数据可达到最终一致。 - -- 两个单机(或集群)可构成一个高可用组:当其中一个单机(或集群)停止服务时,另一个单机(或集群)不会受到影响。当停止服务的单机(或集群)再次启动时,另一个单机(或集群)会将新写入的数据同步过来。业务可以绑定两个单机(或集群)进行读写,从而达到高可用的目的。 -- 双活部署方案允许在物理节点少于 3 的情况下实现高可用,在部署成本上具备一定优势。同时可以通过电力、网络的双环网,实现两套单机(或集群)的物理供应隔离,保障运行的稳定性。 -- 目前双活能力为企业版功能。 - -![](/img/%E5%8F%8C%E6%B4%BB%E5%90%8C%E6%AD%A5.png) - -## 2. 注意事项 - -1. 部署时推荐优先使用`hostname`进行IP配置,可避免后期修改主机ip导致数据库无法启动的问题。设置hostname需要在目标服务器上配置`/etc/hosts`,如本机ip是192.168.1.3,hostname是iotdb-1,则可以使用以下命令设置服务器的 hostname,并使用hostname配置IoTDB的`cn_internal_address`、`dn_internal_address`。 - - ```Bash - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -2. 有些参数首次启动后不能修改,请参考下方的"安装步骤"章节来进行设置。 - -3. 推荐部署监控面板,可以对重要运行指标进行监控,随时掌握数据库运行状态,监控面板可以联系商务获取,部署监控面板步骤可以参考[文档](https://www.timecho.com/docs/zh/UserGuide/latest/Deployment-and-Maintenance/Monitoring-panel-deployment.html) - -## 3. 安装步骤 - -我们以两台单机A和B构建的双活版IoTDB为例,A和B的ip分别是192.168.1.3 和 192.168.1.4 ,这里用hostname来表示不同的主机,规划如下: - -| 机器 | 机器ip | 主机名 | -| ---- | ----------- | ------- | -| A | 192.168.1.3 | iotdb-1 | -| B | 192.168.1.4 | iotdb-2 | - -### Step1:分别安装两套独立的 IoTDB - -在2个机器上分别安装 IoTDB,单机版部署文档可参考[文档](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md),集群版部署文档可参考[文档](../Deployment-and-Maintenance/Cluster-Deployment_timecho.md)。**推荐 A、B 集群的各项配置保持一致,以实现最佳的双活效果。** - -### Step2:在机器A上创建数据同步任务至机器B - -- 在机器A上创建数据同步流程,即机器A上的数据自动同步到机器B,使用sbin目录下的cli工具连接A上的IoTDB数据库: - - ```Bash - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-1 - - # Windows - # V2.0.4.x 版本之前 - .\sbin\start-cli.bat -h iotdb-1 - - # V2.0.4.x 版本及之后 - .\sbin\windows\start-cli.bat -h iotdb-1 - ``` - -- 创建并启动数据同步命令,SQL 如下: - - ```Bash - create pipe AB - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-2', - 'sink.port'='6667' - ) - ``` - -- 注意:为了避免数据无限循环,需要将A和B上的参数`source.forwarding-pipe-requests` 均设置为 `false`,表示不转发从另一pipe传输而来的数据。 - -### 
Step3:在机器B上创建数据同步任务至机器A - - - 在机器B上创建数据同步流程,即机器B上的数据自动同步到机器A,使用sbin目录下的cli工具连接B上的IoTDB数据库: - - ```Bash - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-2 - - # Windows - # V2.0.4.x 版本之前 - .\sbin\start-cli.bat -h iotdb-2 - - # V2.0.4.x 版本及之后 - .\sbin\windows\start-cli.bat -h iotdb-2 - ``` - - 创建并启动pipe,SQL 如下: - - ```Bash - create pipe BA - with source ( - 'source.forwarding-pipe-requests' = 'false' - ) - with sink ( - 'sink'='iotdb-thrift-sink', - 'sink.ip'='iotdb-1', - 'sink.port'='6667' - ) - ``` - -- 注意:为了避免数据无限循环,需要将A和B上的参数`source.forwarding-pipe-requests` 均设置为 `false`,表示不转发从另一pipe传输而来的数据。 - -### Step4:验证部署 - -上述数据同步流程创建完成后,即可启动双活集群。 - -#### 检查集群运行状态 - -```Bash -#在2个节点分别执行show cluster命令检查IoTDB服务状态 -show cluster -``` - -**机器A**: - -![](/img/%E5%8F%8C%E6%B4%BB-A.png) - -**机器B**: - -![](/img/%E5%8F%8C%E6%B4%BB-B.png) - -确保每一个 ConfigNode 和 DataNode 都处于 Running 状态。 - -#### 检查同步状态 - -- 机器A上检查同步状态 - -```Bash -show pipes -``` - -![](/img/show%20pipes-A.png) - -- 机器B上检查同步状态 - -```Bash -show pipes -``` - -![](/img/show%20pipes-B.png) - -确保每一个 pipe 都处于 RUNNING 状态。 - -### Step5:停止双活版 IoTDB - -- 在机器A的执行下列命令: - - ```SQL - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-1 #登录cli - IoTDB> stop pipe AB #停止数据同步流程 - ./sbin/stop-standalone.sh #停止数据库服务 - - # Windows - # V2.0.4.x 版本之前 - .\sbin\start-cli.bat -h iotdb-1 - IoTDB> stop pipe AB - .\sbin\stop-standalone.bat - - # V2.0.4.x 版本及之后 - .\sbin\windows\start-cli.bat -h iotdb-1 - IoTDB> stop pipe AB - .\sbin\windows\stop-standalone.bat - ``` - -- 在机器B的执行下列命令: - - ```SQL - # Unix/OS X - ./sbin/start-cli.sh -h iotdb-2 #登录cli - IoTDB> stop pipe BA #停止数据同步流程 - ./sbin/stop-standalone.sh #停止数据库服务 - - # Windows - # V2.0.4.x 版本之前 - .\sbin\start-cli.bat -h iotdb-2 - IoTDB> stop pipe BA - .\sbin\stop-standalone.bat - - # V2.0.4.x 版本及之后 - .\sbin\windows\start-cli.bat -h iotdb-2 - IoTDB> stop pipe BA - .\sbin\windows\stop-standalone.bat - ``` diff --git a/src/zh/UserGuide/latest/Deployment-and-Maintenance/IoTDB-Package_timecho.md 
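b/src/zh/UserGuide/latest/Deployment-and-Maintenance/IoTDB-Package_timecho.md
deleted file mode 100644
index e2aaf79fb..000000000
--- a/src/zh/UserGuide/latest/Deployment-and-Maintenance/IoTDB-Package_timecho.md
+++ /dev/null
@@ -1,48 +0,0 @@
-
-# Obtaining the Installation Package
-
-## 1. How to Obtain the Enterprise Edition
-
-The Enterprise Edition installation package can be obtained by applying for a product trial or by contacting your Timecho business representative directly.
-
-## 2. Installation Package Structure
-
-After the installation package (iotdb-enterprise-{version}-bin.zip) is extracted, the directory structure is as follows:
-
-| **Directory / File** | **Type** | **Description** |
-| ---------------- | -------- | ------------------------------------------------------------ |
-| activation | Folder | Directory for activation files, including the generated machine code and the Enterprise Edition activation code obtained from Timecho staff (the directory is created only after ConfigNode is started, at which point the activation code can be requested) |
-| conf | Folder | Configuration directory, containing the ConfigNode, DataNode, JMX, and logback configuration files |
-| data | Folder | Default data directory, containing the ConfigNode and DataNode data files (created only after the program is started) |
-| lib | Folder | Library directory |
-| licenses | Folder | Directory of open-source license files |
-| logs | Folder | Default log directory, containing the ConfigNode and DataNode log files (created only after the program is started) |
-| sbin | Folder | Main script directory, containing the database start and stop scripts |
-| tools | Folder | Tools directory |
-| ext | Folder | Files related to pipe, trigger, and UDF plugins |
-| LICENSE | File | Open-source license file |
-| NOTICE | File | Open-source notice file |
-| README_ZH.md | File | User instructions (Chinese) |
-| README.md | File | User instructions (English) |
-| RELEASE_NOTES.md | File | Release notes |
-
-Note: Starting from V2.0.8.2, the TimechoDB installation package no longer includes the JAR packages for the MQTT service and the REST service by default. If you need them, contact the Timecho team.
\ No newline at end of file
diff --git a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Kubernetes_timecho.md b/src/zh/UserGuide/latest/Deployment-and-Maintenance/Kubernetes_timecho.md
deleted file mode 100644
index 7fbc7be8d..000000000
--- a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Kubernetes_timecho.md
+++ /dev/null
@@ -1,445 +0,0 @@
-
-
-# Kubernetes
-
-## 1. Environment Preparation
-
-### 1.1 Prepare a Kubernetes Cluster
-
-Make sure a working Kubernetes cluster is available as the foundation for deploying the IoTDB cluster.
-
-Kubernetes version requirement: Kubernetes 1.24 or later is recommended
-
-IoTDB version requirement: v1.3.3 or later
-
-## 2. Create a Namespace
-
-### 2.1 Create the Namespace
-
-> Note: Before creating the namespace, verify that the chosen name is not already in use in the Kubernetes cluster. If the namespace already exists, the create command fails, which may cause errors later in the deployment.
-
-```Bash
-kubectl create ns iotdb-ns
-```
-
-### 2.2 View the Namespace
-
-```Bash
-kubectl get ns
-```
-
-## 3. 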
Create PersistentVolumes (PV)
-
-### 3.1 Create the PV Configuration Files
-
-PVs persist the data of the IoTDB ConfigNodes and DataNodes. Create one PV per node.
-
-> Note: 1 ConfigNode and 1 DataNode count as 2 nodes and therefore need 2 PVs.
-
-Taking 3 ConfigNodes and 3 DataNodes as an example:
-
-1. Create a `pv.yaml` file, make six copies of it, and name them `pv01.yaml` through `pv06.yaml`.
-
-```Bash
-# You may create a new folder to hold the yaml files
-# Command to create the pv.yaml file
-touch pv.yaml
-```
-
-2. Edit `name` and `path` in each file so that both carry the same index as the file.
-
-**pv.yaml example:**
-
-```YAML
-# pv.yaml
-apiVersion: v1
-kind: PersistentVolume
-metadata:
-  name: iotdb-pv-01
-spec:
-  capacity:
-    storage: 10Gi # storage capacity
-  accessModes: # access modes
-    - ReadWriteOnce
-  persistentVolumeReclaimPolicy: Retain # reclaim policy
-  # Storage class name. Not needed for local static storage; required for dynamic storage.
-  storageClassName: local-storage
-  # Add the configuration that matches your storage type
-  hostPath: # when using a local path
-    path: /data/k8s-data/iotdb-pv-01
-    type: DirectoryOrCreate # if this line is omitted, the folder must be created manually
-```
-
-### 3.2 Apply the PV Configuration
-
-```Bash
-kubectl apply -f pv01.yaml
-kubectl apply -f pv02.yaml
-...
-```
-
-### 3.3 View the PVs
-
-```Bash
-kubectl get pv
-```
-
-
-### 3.4 Create the Folders Manually
-
-> If `hostPath.type` is not set in the yaml file, the corresponding folders must be created manually.
-
-Create the corresponding folders on all Kubernetes nodes:
-
-```Bash
-mkdir -p /data/k8s-data/iotdb-pv-01
-mkdir -p /data/k8s-data/iotdb-pv-02
-...
-```
-
-## 4. Install Helm
-
-For the Helm installation steps, see the [Helm website](https://helm.sh/zh/docs/intro/install/)
-
-## 5. 
配置IoTDB的Helm Chart - -### 5.1 克隆 IoTDB Kubernetes 部署代码 - -请联系天谋工作人员获取IoTDB的Helm Chart - -### 5.2 修改 YAML 文件 - -> 确保使用的是支持的版本 >=1.3.3.2 - -**values.yaml 示例:** - -```YAML -nameOverride: "iotdb" -fullnameOverride: "iotdb" #软件安装后的名称 - -image: - repository: nexus.infra.timecho.com:8143/timecho/iotdb-enterprise - pullPolicy: IfNotPresent - tag: 1.3.3.2-standalone #软件所用的仓库和版本 - -storage: -# 存储类名称,如果使用本地静态存储storageClassName 不用配置,如果使用动态存储必需设置此项 - className: local-storage - -datanode: - name: datanode - nodeCount: 3 #datanode的节点数量 - enableRestService: true - storageCapacity: 10Gi #datanode的可用空间大小 - resources: - requests: - memory: 2Gi #datanode的内存初始化大小 - cpu: 1000m #datanode的CPU初始化大小 - limits: - memory: 4Gi #datanode的最大内存大小 - cpu: 1000m #datanode的最大CPU大小 - -confignode: - name: confignode - nodeCount: 3 #confignode的节点数量 - storageCapacity: 10Gi #confignode的可用空间大小 - resources: - requests: - memory: 512Mi #confignode的内存初始化大小 - cpu: 1000m #confignode的CPU初始化大小 - limits: - memory: 1024Mi #confignode的最大内存大小 - cpu: 2000m #confignode的最大CPU大小 - configNodeConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - schemaReplicationFactor: 3 - schemaRegionConsensusProtocolClass: org.apache.iotdb.consensus.ratis.RatisConsensus - dataReplicationFactor: 2 - dataRegionConsensusProtocolClass: org.apache.iotdb.consensus.iot.IoTConsensus -``` - -## 6. 
配置私库信息或预先使用ctr拉取镜像 - -在k8s上配置私有仓库的信息,为下一步helm install的前置步骤。 - -方案一即在 helm install 时拉取可用的iotdb镜像,方案二则是提前将可用的iotdb镜像导入到containerd里。 - -### 6.1 【方案一】从私有仓库拉取镜像 - -#### 6.1.1 创建secret 使k8s可访问iotdb-helm的私有仓库 - -下文中“xxxxxx”表示IoTDB私有仓库的账号、密码、邮箱。 - -```Bash -# 注意 单引号 -kubectl create secret docker-registry timecho-nexus \ - --docker-server='nexus.infra.timecho.com:8143' \ - --docker-username='xxxxxx' \ - --docker-password='xxxxxx' \ - --docker-email='xxxxxx' \ - -n iotdb-ns - -# 查看secret -kubectl get secret timecho-nexus -n iotdb-ns -# 查看并输出为yaml -kubectl get secret timecho-nexus --output=yaml -n iotdb-ns -# 查看并解密 -kubectl get secret timecho-nexus --output="jsonpath={.data.\.dockerconfigjson}" -n iotdb-ns | base64 --decode -``` - -#### 6.1.2 将secret作为一个patch加载到命名空间iotdb-ns - -```Bash -# 添加一个patch,使该命名空间增加登陆nexus的登陆信息 -kubectl patch serviceaccount default -n iotdb-ns -p '{"imagePullSecrets": [{"name": "timecho-nexus"}]}' - -# 查看命名空间的该条信息 -kubectl get serviceaccounts -n iotdb-ns -o yaml -``` - -### 6.2 【方案二】导入镜像 - -该步骤用于客户无法连接私库的场景,需要联系公司实施同事辅助准备。 - -#### 6.2.1 拉取并导出镜像: - -```Bash -ctr images pull --user xxxxxxxx nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.2 查看并导出镜像: - -```Bash -# 查看 -ctr images ls - -# 导出 -ctr images export iotdb-enterprise:1.3.3.2-standalone.tar nexus.infra.timecho.com:8143/timecho/iotdb-enterprise:1.3.3.2-standalone -``` - -#### 6.2.3 导入到k8s的namespace下: - -> 注意,k8s.io为示例环境中k8s的ctr的命名空间,导入到其他命名空间是不行的 - -```Bash -# 导入到k8s的namespace下 -ctr -n k8s.io images import iotdb-enterprise:1.3.3.2-standalone.tar -``` - -#### 6.2.4 查看镜像 - -```Bash -ctr --namespace k8s.io images list | grep 1.3.3.2 -``` - -## 7. 
安装 IoTDB - -### 7.1 安装 IoTDB - -```Bash -# 进入文件夹 -cd iotdb-cluster-k8s/helm - -# 安装iotdb -helm install iotdb ./ -n iotdb-ns -``` - -### 7.2 查看 Helm 安装列表 - -```Bash -# helm list -helm list -n iotdb-ns -``` - -### 7.3 查看 Pods - -```Bash -# 查看 iotdb的pods -kubectl get pods -n iotdb-ns -o wide -``` - -执行命令后,输出了带有confignode和datanode标识的各3个Pods,,总共6个Pods,即表明安装成功;需要注意的是,并非所有Pods都处于Running状态,未激活的datanode可能会持续重启,但在激活后将恢复正常。 - -### 7.4 发现故障的排除方式 - -```Bash -# 查看k8s的创建log -kubectl get events -n iotdb-ns -watch kubectl get events -n iotdb-ns - -# 获取详细信息 -kubectl describe pod confignode-0 -n iotdb-ns -kubectl describe pod datanode-0 -n iotdb-ns - -# 查看confignode日志 -kubectl logs -n iotdb-ns confignode-0 -f -``` - -## 8. 激活 IoTDB - -### 8.1 方案1:直接在 Pod 中激活(最快捷) - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-1 -- /iotdb/sbin/start-activate.sh -kubectl exec -it -n iotdb-ns confignode-2 -- /iotdb/sbin/start-activate.sh -# 拿到机器码后进行激活 -``` - -### 8.2 方案2:进入confignode的容器中激活 - -```Bash -kubectl exec -it -n iotdb-ns confignode-0 -- /bin/bash -cd /iotdb/sbin -/bin/bash start-activate.sh -# 拿到机器码后进行激活 -# 退出容器 -``` - -### 8.3 方案3:手动激活 - -1. 查看 ConfigNode 详细信息,确定所在节点: - -```Bash -kubectl describe pod confignode-0 -n iotdb-ns | grep -e "Node:" -e "Path:" - -# 结果示例: -# Node: a87/172.20.31.87 -# Path: /data/k8s-data/env/confignode/.env -``` - -2. 查看 PVC 并找到 ConfigNode 对应的 Volume,确定所在路径: - -```Bash -kubectl get pvc -n iotdb-ns | grep "confignode-0" - -# 结果示例: -# map-confignode-confignode-0 Bound iotdb-pv-04 10Gi RWO local-storage 8h - -# 如果要查看多个confignode,使用如下: -for i in {0..2}; do echo confignode-$i;kubectl describe pod confignode-${i} -n iotdb-ns | grep -e "Node:" -e "Path:"; echo "----"; done -``` - -3. 查看对应 Volume 的详细信息,确定物理目录的位置: - -```Bash -kubectl describe pv iotdb-pv-04 | grep "Path:" - -# 结果示例: -# Path: /data/k8s-data/iotdb-pv-04 -``` - -4. 
从对应节点的对应目录下找到 system-info 文件,使用该 system-info 作为机器码生成激活码,并在同级目录新建文件 license,将激活码写入到该文件。 - -## 9. 验证 IoTDB - -### 9.1 查看命名空间内的 Pods 状态 - -查看iotdb-ns命名空间内的IP、状态等信息,确定全部运行正常 - -```Bash -kubectl get pods -n iotdb-ns -o wide - -# 结果示例: -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` - -### 9.2 查看命名空间内的端口映射情况 - -```Bash -kubectl get svc -n iotdb-ns - -# 结果示例: -# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE -# confignode-svc NodePort 10.10.226.151 80:31026/TCP 7d8h -# datanode-svc NodePort 10.10.194.225 6667:31563/TCP 7d8h -# jdbc-balancer LoadBalancer 10.10.191.209 6667:31895/TCP 7d8h -``` - -### 9.3 在任意服务器启动 CLI 脚本验证 IoTDB 集群状态 - -端口即jdbc-balancer的端口,服务器为k8s任意节点的IP - -```Bash -start-cli.sh -h 172.20.31.86 -p 31895 -start-cli.sh -h 172.20.31.87 -p 31895 -start-cli.sh -h 172.20.31.88 -p 31895 -``` - - - -## 10. 扩容 - -### 10.1 新增pv - -新增pv,必须有可用的pv才可以扩容。 - - - -**注意:DataNode重启后无法加入集群** - -**原因**:配置了静态存储的 hostPath 模式,并通过脚本修改了 `iotdb-system.properties` 文件,将 `dn_data_dirs` 设为 `/iotdb6/iotdb_data,/iotdb7/iotdb_data`,但未将默认存储路径 `/iotdb/data` 进行外挂,导致重启时数据丢失。 - -**解决方案**:是将 `/iotdb/data` 目录也进行外挂操作,且 ConfigNode 和 DataNode 均需如此设置,以确保数据完整性和集群稳定性。 - -### 10.2 扩容confignode - -示例:3 confignode 扩容为 4 confignode - -修改iotdb-cluster-k8s/helm的values.yaml文件,将confignode的3改成4 - -```Shell -helm upgrade iotdb . -n iotdb-ns -``` - - - - -### 10.3 扩容datanode - -示例:3 datanode 扩容为 4 datanode - -修改iotdb-cluster-k8s/helm的values.yaml文件,将datanode的3改成4 - -```Shell -helm upgrade iotdb . 
-n iotdb-ns -``` - -### 10.4 验证IoTDB状态 - -```Shell -kubectl get pods -n iotdb-ns -o wide - -# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -# confignode-0 1/1 Running 0 75m 10.20.187.14 a87 -# confignode-1 1/1 Running 0 75m 10.20.191.75 a88 -# confignode-2 1/1 Running 0 75m 10.20.187.16 a87 -# datanode-0 1/1 Running 10 (5m54s ago) 75m 10.20.191.74 a88 -# datanode-1 1/1 Running 10 (5m42s ago) 75m 10.20.187.15 a87 -# datanode-2 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -# datanode-3 1/1 Running 10 (5m55s ago) 75m 10.20.191.76 a88 -``` \ No newline at end of file diff --git a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md b/src/zh/UserGuide/latest/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md deleted file mode 100644 index a877f5b27..000000000 --- a/src/zh/UserGuide/latest/Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md +++ /dev/null @@ -1,294 +0,0 @@ - -# 单机版部署指导 - -本章将介绍如何启动IoTDB单机实例,IoTDB单机实例包括 1 个ConfigNode 和1个DataNode(即通常所说的1C1D)。 - -## 1. 注意事项 - -1. 安装前请确认系统已参照[系统配置](./Environment-Requirements.md)准备完成。 - -2. 部署时推荐优先使用`hostname`进行IP配置,可避免后期修改主机ip导致数据库无法启动的问题。设置hostname需要在目标服务器上配置/etc/hosts,如本机ip是192.168.1.3,hostname是iotdb-1,则可以使用以下命令设置服务器的 hostname,并使用hostname配置IoTDB的`cn_internal_address`、dn_internal_address、dn_rpc_address。 - - ```shell - echo "192.168.1.3 iotdb-1" >> /etc/hosts - ``` - -3. 部分参数首次启动后不能修改,请参考下方的【参数配置】章节进行设置 - -4. 无论是在linux还是windows中,请确保IoTDB的安装路径中不含空格和中文,避免软件运行异常。 - -5. 请注意,安装部署(包括激活和使用软件)IoTDB时需要保持使用同一个用户进行操作,您可以: -- 使用 root 用户(推荐):使用 root 用户可以避免权限等问题。 -- 使用固定的非 root 用户: - - 使用同一用户操作:确保在启动、激活、停止等操作均保持使用同一用户,不要切换用户。 - - 避免使用 sudo:尽量避免使用 sudo 命令,因为它会以 root 用户权限执行命令,可能会引起权限混淆或安全问题。 - -6. 推荐部署监控面板,可以对重要运行指标进行监控,随时掌握数据库运行状态,监控面板可以联系商务获取,部署监控面板步骤可以参考:[监控面板部署](./Monitoring-panel-deployment.md)。 - -7. 在安装部署数据库前,可以使用健康检查工具检测 IoTDB 节点运行环境,并获取详细的检查结果。 IoTDB 健康检查工具使用方法可以参考:[健康检查工具](../Tools-System/Health-Check-Tool.md)。 - -## 2. 
安装步骤 - -### 2.1 前置检查 - -为确保您获取的IoTDB企业版安装包完整且正确,在执行安装部署前建议您进行SHA512校验。 - -#### 准备工作: - -- 获取官方发布的 SHA512 校验码:[发布历史](../IoTDB-Introduction/Release-history_timecho.md)文档中各版本对应的"SHA512校验码" - -#### 校验步骤(以 linux 为例): - -1. 打开终端,进入安装包所在目录(如`/data/iotdb`): - ```Bash - cd /data/iotdb - ``` -2. 执行以下命令计算哈希值: - ```Bash - sha512sum timechodb-{version}-bin.zip - ``` -3. 终端输出结果(左侧为SHA512 校验码,右侧为文件名): - -![img](/img/sha512-01.png) - -4. 对比输出结果与官方 SHA512 校验码,确认一致后,即可按照下方流程执行IoTDB企业版的安装部署操作。 - -#### 注意事项: - -- 若校验结果不一致,请联系天谋工作人员重新获取安装包 -- 校验过程中若出现"文件不存在"提示,需检查文件路径是否正确或安装包是否完整下载 - -### 2.2 解压安装包并进入安装目录 - -```shell -unzip iotdb-enterprise-{version}-bin.zip -cd iotdb-enterprise-{version}-bin -``` - -### 2.3 参数配置 - -#### 环境脚本配置 - -- ./conf/confignode-env.sh(./conf/confignode-env.bat)配置 - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------: | :------------------------------------: |:------------------------:| :----------------------------------------------: | :----------: | -| MEMORY_SIZE | IoTDB ConfigNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的30% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -- ./conf/datanode-env.sh(./conf/datanode-env.bat)配置 - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :---------: | :----------------------------------: |:----------------------:| :----------------------------------------------: | :----------: | -| MEMORY_SIZE | IoTDB DataNode节点可以使用的内存总量 | 根据系统内存自动计算,默认为系统内存的50% | 可按需填写,填写后系统会根据填写的数值来分配内存 | 修改后保存即可,无需执行;重启服务后生效 | - -#### 系统通用配置 - -打开通用配置文件(./conf/iotdb-system.properties 文件),设置以下参数: - -| **配置项** | **说明** | **默认值** | **推荐值** | 备注 | -| :-----------------------: | :------------------------------: | :------------: | :----------------------------------------------: |:------------------------:| -| cluster_name | 集群名称 | defaultCluster | 可根据需要设置集群名称,如无特殊需要保持默认即可 | 支持热加载,但不建议手动修改该参数 | -| schema_replication_factor | 元数据副本数,单机版此处设置为 1 | 1 | 1 | 默认1,首次启动后不可修改 | -| data_replication_factor | 数据副本数,单机版此处设置为 1 | 1 | 1 | 默认1,首次启动后不可修改 | - -#### 
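ConfigNode Configuration
-
-Open the ConfigNode configuration file (./conf/iotdb-system.properties) and set the following parameters:
-
-| **Parameter** | **Description** | **Default** | **Recommended value** | **Remarks** |
-| :-----------------: | :----------------------------------------------------------: | :-------------: | :----------------------------------------------: | :----------------: |
-| cn_internal_address | Address the ConfigNode uses for communication inside the cluster | 127.0.0.1 | IPv4 address or hostname of the server; hostname is recommended | Cannot be changed after the first startup |
-| cn_internal_port | Port the ConfigNode uses for communication inside the cluster | 10710 | 10710 | Cannot be changed after the first startup |
-| cn_consensus_port | Port used for consensus-protocol communication of the ConfigNode replica group | 10720 | 10720 | Cannot be changed after the first startup |
-| cn_seed_config_node | Address of the ConfigNode that a node connects to when registering to join the cluster, in the form cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | Cannot be changed after the first startup |
-
-#### DataNode Configuration
-
-Open the DataNode configuration file (./conf/iotdb-system.properties) and set the following parameters:
-
-| **Parameter** | **Description** | **Default** | **Recommended value** | **Remarks** |
-| :------------------------------ | :----------------------------------------------------------- | :-------------- | :----------------------------------------------- | :----------------- |
-| dn_rpc_address | Address of the client RPC service | 127.0.0.1 | The default allows access from the local machine only. For access from other machines, set this to the IPv4 address or hostname of the server; the IPv4 address is recommended. | Takes effect after a service restart |
-| dn_rpc_port | Port of the client RPC service | 6667 | 6667 | Takes effect after a service restart |
-| dn_internal_address | Address the DataNode uses for communication inside the cluster | 127.0.0.1 | IPv4 address or hostname of the server; hostname is recommended | Cannot be changed after the first startup |
-| dn_internal_port | Port the DataNode uses for communication inside the cluster | 10730 | 10730 | Cannot be changed after the first startup |
-| dn_mpp_data_exchange_port | Port the DataNode uses to receive data streams | 10740 | 10740 | Cannot be changed after the first startup |
-| dn_schema_region_consensus_port | Port the DataNode uses for consensus-protocol communication of schema replicas | 10750 | 10750 | Cannot be changed after the first startup |
-| dn_data_region_consensus_port | Port the DataNode uses for consensus-protocol communication of data replicas | 10760 | 10760 | Cannot be changed after the first startup |
-| dn_seed_config_node | Address of the ConfigNode that a node connects to when registering to join the cluster, i.e. cn_internal_address:cn_internal_port | 127.0.0.1:10710 | cn_internal_address:cn_internal_port | Cannot be changed after the first startup |
-
-> ❗️Note: Editors such as VSCode Remote do not save configuration files automatically. Make sure the modified files are actually written to disk, otherwise the settings will not take effect.
-
-### 2.4 Start the ConfigNode
-
-From the IoTDB installation directory, start the ConfigNode with the script under sbin:
-
-```shell
-# Unix/OS X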
-./sbin/start-confignode.sh -d #“-d”参数将在后台进行启动 - -# Windows -# V2.0.4.x 版本之前 -.\sbin\start-confignode.bat - -# V2.0.4.x 版本及之后 -.\sbin\windows\start-confignode.bat -``` - -如果启动失败,请参考下方[常见问题](#常见问题)。 - -### 2.5 启动 DataNode 节点 - -进入iotdb的sbin目录下,启动datanode: - -```shell -# Unix/OS X -./sbin/start-datanode.sh -d #“-d”参数将在后台进行启动 - -# Windows -# V2.0.4.x 版本之前 -.\sbin\start-datanode.bat - -# V2.0.4.x 版本及之后 -.\sbin\windows\start-datanode.bat -``` - -### 2.6 激活数据库 - -#### 方式一:命令激活 -- 进入 IoTDB CLI - -```shell -# Linux 系统与 MacOS 系统启动命令如下: -# V2.0.6.x 版本之前 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x 版本及之后 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 - -# Windows 系统启动命令如下: -# V2.0.4.x 版本之前 -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.4.x 版本及之后, V2.0.6.x 版本之前 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x 版本及之后 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` -- 执行以下语句获取激活所需机器码: - -```SQL -IoTDB> show system info -``` -```shell -+--------------------------------------------------------------+ -| SystemInfo| -+--------------------------------------------------------------+ -| 01-TE5NLES4-UDDWCMYE| -+--------------------------------------------------------------+ -Total line number = 1 -``` - -- 执行以下语句获取待激活数据库的版本号: - -```SQL -IoTDB> show version -``` -```shell -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.x.x| xxxxxxx| -+-------+---------+ -Total line number = 1 -``` - -- 将获取到的机器码与版本号,一同提供给天谋工作人员。 - -- 将工作人员返回的激活码输入到 CLI 中进行激活操作,请注意激活码前后需要用`'`符号进行标注,如下所示 - -```SQL -IoTDB> activate '01-D4EYQGPZ-EAUJJODW-NUKRDR6F-TUQS3B75-EDZFLK3A-6BOKJFFZ-ALDHOMN7-NB2E4BHI-7ZKGFVK6-GCIFXA4T-UG3XJTTD-SHJV6F2P-Q27B4OMJ-R47ZDIM3-UUASUXG2-OQXGVZCO-MMYKICZU-TWFQYYAO-ZOAGOKJA-NYHQTA5U-EWAR4EP5-MRC6R2CI-PKUTKRCT-7UDGRH3F-7BYV4P5D-6KKIA===' -``` - - -#### 方式二:文件激活 - -- 
启动Confignode、Datanode节点后,进入activation文件夹, 将 system_info文件复制给天谋工作人员 -- 收到工作人员返回的 license文件 -- 将license文件放入对应节点的activation文件夹下; - - -### 2.7 验证激活 - -可在 CLI 中通过执行 `show activation` 命令查看激活状态,当看到“ClusterActivationStatus”字段状态显示为 ACTIVATED 表示激活成功 - -![](/img/%E5%8D%95%E6%9C%BA-%E9%AA%8C%E8%AF%81.png) - -## 3. 常见问题 - -1. 部署过程中多次提示激活失败 - - 使用 `ls -al` 命令:使用 `ls -al` 命令检查安装包根目录的所有者信息是否为当前用户。 - - 检查激活目录:检查 `./activation` 目录下的所有文件,所有者信息是否为当前用户。 - -2. Confignode节点启动失败 - - 步骤 1: 请查看启动日志,检查是否修改了某些首次启动后不可改的参数。 - - 步骤 2: 请查看启动日志,检查是否出现其他异常。日志中若存在异常现象,请联系天谋技术支持人员咨询解决方案。 - - 步骤 3: 如果是首次部署或者数据可删除,也可按下述步骤清理环境,重新部署后,再次启动。 - - 步骤 4: 清理环境: - - a. 结束所有 ConfigNode 和 DataNode 进程。 - -```Bash - # 1. 停止 ConfigNode 和 DataNode 服务 - # Unix/OS X - sbin/stop-standalone.sh - - # Windows - # V2.0.4.x 版本之前 - sbin\stop-standalone.bat - - # V2.0.4.x 版本及之后 - sbin\windows\stop-standalone.bat - - # 2. 检查是否还有进程残留 - jps - # 或者 - ps -ef|grep iotdb - - # 3. 如果有进程残留,则手动kill - kill -9 - # 如果确定机器上仅有1个iotdb,可以使用下面命令清理残留进程 - ps -ef|grep iotdb|grep -v grep|tr -s ' ' ' ' |cut -d ' ' -f2|xargs kill -9 - ``` - b. 删除 data 和 logs 目录。 - - 说明:删除 data 目录是必要的,删除 logs 目录是为了纯净日志,非必需。 - ```Bash - cd /data/iotdb - rm -rf data logs - ``` diff --git a/src/zh/UserGuide/latest/Deployment-and-Maintenance/workbench-deployment_timecho.md b/src/zh/UserGuide/latest/Deployment-and-Maintenance/workbench-deployment_timecho.md deleted file mode 100644 index 48267f677..000000000 --- a/src/zh/UserGuide/latest/Deployment-and-Maintenance/workbench-deployment_timecho.md +++ /dev/null @@ -1,254 +0,0 @@ - -# 可视化控制台部署 - -可视化控制台是IoTDB配套工具之一(类似 Navicat for MySQL)。它用于数据库部署实施、运维管理、应用开发各阶段的官方应用工具体系,让数据库的使用、运维和管理更加简单、高效,真正实现数据库低成本的管理和运维。本文档将帮助您安装Workbench。 - -
-  -  -
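-
-For instructions on using Workbench, see the [User Guide](../Tools-System/Workbench_timecho.md).
-
-## 1. Installation Preparation
-
-| Item | Name | Version Requirement | Official Link |
-| :------: | :-----------------------: | :----------------------------------------------------------: | :----------------------------------------------------: |
-| Operating system | Windows or Linux | - | - |
-| Runtime environment | JDK | Workbench 1.5.4 and earlier requires JDK >= 1.8; Workbench 1.5.5 and later requires JDK >= 17 (choose the ARM or x64 package according to your machine) | https://www.oracle.com/java/technologies/downloads/ |
-| Related software | Prometheus | >= V2.30.3 | https://prometheus.io/download/ |
-| Database | IoTDB | Enterprise edition >= V1.2.0 | Contact sales or technical support to obtain it |
-| Console | IoTDB-Workbench-`<version>` | - | Choose a version from the compatibility table in the appendix, then contact sales or technical support to obtain it |
-
-### 1.1 Pre-installation Check
-
-To make sure the Workbench package you received is complete and correct, we recommend verifying its SHA512 checksum before installation.
-
-#### Preparation:
-
-- Obtain the officially published SHA512 checksum: contact Timecho staff
-
-#### Verification steps (Linux as an example):
-
-1. Open a terminal and go to the directory that contains the package (for example `/data/workbench`):
-   ```Bash
-   cd /data/workbench
-   ```
-2. Run the following command to compute the hash:
-   ```Bash
-   sha512sum IoTDB-Workbench-<version>.zip
-   ```
-3. The terminal prints the result (SHA512 checksum on the left, file name on the right):
-
-![img](/img/sha512-04.png)
-
-4. Compare the output with the official SHA512 checksum. If they match, continue with the installation steps below.
-
-#### Notes:
-
-- If the checksums do not match, contact Timecho staff to obtain the package again
-- If a "file does not exist" message appears during verification, check that the file path is correct and that the package was downloaded completely
-
-## 2. Installation Steps
-
-### 2.1 Step 1: Enable Metrics Collection in IoTDB
-
-1. Enable the monitoring configuration items. Monitoring-related items in IoTDB are disabled by default; enable them before deploying the monitoring panel (the service must be restarted after they are enabled).
-
-
-
-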
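-| Configuration item | Configuration file | Description |
-| ------------------ | ------------------ | ----------- |
-| cn_metric_reporter_list | conf/iotdb-system.properties | Add this item to the configuration file and set it to PROMETHEUS |
-| cn_metric_level | conf/iotdb-system.properties | Add this item to the configuration file and set it to IMPORTANT |
-| cn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this item to the configuration file. The default 9091 can be kept; if you choose another port, make sure it does not conflict with other ports |
-| dn_metric_reporter_list | conf/iotdb-system.properties | Add this item to the configuration file and set it to PROMETHEUS |
-| dn_metric_level | conf/iotdb-system.properties | Add this item to the configuration file and set it to IMPORTANT |
-| dn_metric_prometheus_reporter_port | conf/iotdb-system.properties | Add this item to the configuration file. The default 9092 can be kept; if you choose another port, make sure it does not conflict with other ports |
-| dn_metric_internal_reporter_type | conf/iotdb-system.properties | Add this item to the configuration file and set it to IOTDB |
-| enable_audit_log | conf/iotdb-system.properties | Add this item to the configuration file and set it to true |
-| audit_log_storage | conf/iotdb-system.properties | Add this item to the configuration file and set it to IOTDB,LOGGER |
-| audit_log_operation | conf/iotdb-system.properties | Add this item to the configuration file and set it to DML,DDL,QUERY |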
-
-2. Restart all nodes. After changing the monitoring configuration on all three nodes, restart the ConfigNode and DataNode services on every node:
-
-   ```shell
-   # Unix/OS X
-   ./sbin/stop-standalone.sh      # stop the ConfigNode and DataNode first
-   ./sbin/start-confignode.sh -d  # start the ConfigNode
-   ./sbin/start-datanode.sh -d    # start the DataNode
-
-   # Windows
-   # Before V2.0.4.x
-   .\sbin\stop-standalone.bat
-   .\sbin\start-confignode.bat
-   .\sbin\start-datanode.bat
-
-   # V2.0.4.x and later
-   .\sbin\windows\stop-standalone.bat
-   .\sbin\windows\start-confignode.bat
-   .\sbin\windows\start-datanode.bat
-   ```
-
-3. After the restart, check the status of each node from the client. If every node is Running, the configuration succeeded:
-
-   ![](/img/%E5%90%AF%E5%8A%A8.png)
-
-### 2.2 Step 2: Install and Configure Prometheus
-
-1. Make sure Prometheus is installed (official installation guide: https://prometheus.io/docs/introduction/first_steps/)
-2. Extract the package and enter the extracted folder:
-
-   ```Shell
-   tar xvfz prometheus-*.tar.gz
-   cd prometheus-*
-   ```
-
-3. Edit the configuration file prometheus.yml as follows:
-   1. Add a confignode job to collect ConfigNode metrics
-   2. Add a datanode job to collect DataNode metrics
-
-   ```yaml
-   global:
-     scrape_interval: 15s
-     evaluation_interval: 15s
-   scrape_configs:
-     - job_name: "prometheus"
-       static_configs:
-         - targets: ["localhost:9090"]
-     - job_name: "confignode"
-       static_configs:
-         - targets: ["iotdb-1:9091","iotdb-2:9091","iotdb-3:9091"]
-       honor_labels: true
-     - job_name: "datanode"
-       static_configs:
-         - targets: ["iotdb-1:9092","iotdb-2:9092","iotdb-3:9092"]
-       honor_labels: true
-   ```
-
-4. Start Prometheus. Prometheus retains monitoring data for 15 days by default. In production, we recommend raising this to 180 days or more so that a longer history can be traced. The startup command is:
-
-   ```Shell
-   ./prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=180d
-   ```
-
-5. Confirm that startup succeeded. Open `http://IP:port` in a browser to enter Prometheus, then go to the Target page under Status. When every State is Up, the configuration succeeded and the targets are connected.
-
-
- - -
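The target check in step 5 is done by eye in the Prometheus UI. It can also be scripted against the Prometheus HTTP API endpoint `GET /api/v1/targets`. The following is a minimal Python sketch, not part of IoTDB or Workbench: the function names and the base URL `http://localhost:9090` are assumptions chosen for illustration.

```python
import json
import urllib.request


def fetch_targets(base_url, timeout=5.0):
    """Fetch the target list from a running Prometheus server.

    base_url is an assumption, e.g. "http://localhost:9090".
    """
    url = base_url.rstrip("/") + "/api/v1/targets"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def unhealthy_targets(payload):
    """Return (job, scrape_url, health) for every active target that is not up."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (
            t.get("labels", {}).get("job", "?"),
            t.get("scrapeUrl", "?"),
            t.get("health", "unknown"),
        )
        for t in targets
        if t.get("health") != "up"
    ]


if __name__ == "__main__":
    # Requires a reachable Prometheus server at the assumed address.
    bad = unhealthy_targets(fetch_targets("http://localhost:9090"))
    for job, url, health in bad:
        print(f"{job}: {url} is {health}")
    if not bad:
        print("all targets are up")
```

An empty result from `unhealthy_targets` corresponds to every State showing Up on the Target page; any returned entry names the job (`confignode` or `datanode` in the configuration above) whose endpoint should be checked.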
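-
-### 2.3 Step 3: Install Workbench
-
-1. Go to the config directory of iotdb-Workbench-`<version>`
-
-2. Edit the Workbench configuration file: in the `config` folder, edit `application-prod.properties`. If Workbench is installed on your local machine, no change is needed; if it is deployed on a server, change the IP address.
-   > Workbench can be deployed locally or on a cloud server, as long as it can connect to IoTDB
-
-   | Configuration item | Before | After |
-   | ---------------- | --------------------------------- | -------------------------------------- |
-   | pipe.callbackUrl | pipe.callbackUrl=`http://127.0.0.1` | pipe.callbackUrl=`http://<IP address of the Workbench host>` |
-
-   ![](/img/workbench-conf-1.png)
-
-3. Start the program: run the startup command in the sbin folder of IoTDB-Workbench-`<version>`
-
-   Windows:
-   ```shell
-   # Start Workbench in the background
-   start.bat -d
-   ```
-
-   Linux:
-   ```shell
-   # Start Workbench in the background
-   ./start.sh -d
-   ```
-
-4. Use the `jps` command to check whether startup succeeded. The output shown below indicates success:
-
-   ![](/img/windows-jps.png)
-
-5. Verify the installation: open "`http://<server IP>:<port in the configuration file>`" in a browser, for example "`http://127.0.0.1:9190`". The installation succeeded when the login page appears.
-
-   ![](/img/workbench.png)
-
-
-## 3. Appendix: IoTDB and Workbench Version Compatibility
-
-| **Workbench Version** | **Description** | **Supported IoTDB Versions** |
-|------------|--------------------------------------------------------|-------------------|
-| V2.0.1-beta | First release of the V2.x series; supports both the tree and table models | V2.0 and later; the AI analysis module requires a version above 2.0.5 |
-| V1.5.7 | Measurement names in the measurement list are split into device name and measurement; the measurement selection area supports horizontal scrolling; the column order of exported files matches the page | 1.x versions from V1.3.4 |
-| V1.5.6 | Improved CSV import and export: on import, tags and aliases are optional; on export, measurement descriptions with quotes wrapped in backquotes are supported | 1.x versions from V1.3.4 |
-| V1.5.5 | Added a server clock; supports activating the enterprise-edition database | 1.x versions from V1.3.4 |
-| V1.5.4 | Added authentication for the Prometheus settings in instance management | 1.x versions from V1.3.4 |
-| V1.5.1 | Added AI analysis and pattern matching | 1.x versions from V1.3.2 |
-| V1.4.0 | Added tree model display and an English version | 1.x versions from V1.3.2 |
-| V1.3.1 | Added new analysis methods; improved import templates and other features | 1.x versions from V1.3.2 |
-| V1.3.0 | Added database configuration; refined some version details | 1.x versions from V1.3.2 |
-| V1.2.6 | Improved permission control in each module | 1.x versions from V1.3.1 |
-| V1.2.5 | Added "common templates" to visualization; added page caching and other improvements across all pages | 1.x versions from V1.3.0 |
-| V1.2.4 | Added import and export to the calculation feature; added a "time alignment" field to the measurement list | 1.x versions from V1.2.2 |
-| V1.2.3 | Added "activation details" to the home page; added analysis and other features | 1.x versions from V1.2.2 |
-| V1.2.2 | Improved the display of "measurement description" and other features | 1.x versions from V1.2.2 |
-| V1.2.1 | Added a "monitoring panel" to the data synchronization page; improved Prometheus prompts | 1.x versions from V1.2.2 |
-| V1.2.0 | All-new Workbench version upgrade | 1.x versions from V1.2.0 |
diff --git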
a/src/zh/UserGuide/latest/Ecosystem-Integration/Ecosystem-Overview_timecho.md b/src/zh/UserGuide/latest/Ecosystem-Integration/Ecosystem-Overview_timecho.md deleted file mode 100644 index 454e27bba..000000000 --- a/src/zh/UserGuide/latest/Ecosystem-Integration/Ecosystem-Overview_timecho.md +++ /dev/null @@ -1,47 +0,0 @@ - - -# 概览 - -IoTDB 生态集成打通时序数据全链路:通过数据采集实现设备秒级接入,经数据集成构建跨云管道,依托编程框架快速开发业务逻辑,结合计算引擎完成分布式处理,通过可视化与 SQL 开发实现分析策略,最终对接物联网平台完成边云协同,构建从物理世界到数字决策的完整智能闭环。 - -![](/img/eco-overview-n.png) - -下面的文档将会帮助您快速详细的了解各个阶段不同集成工具的使用方式: - -- 数据采集 - - Telegraf [Telegraf 插件](./Telegraf.md) -- 数据集成 - - NiFi [Apache NiFi](./NiFi-IoTDB.md) - - Kafka [Kafka](./Programming-Kafka.md) -- 计算引擎 - - Flink [Flink](./Flink-IoTDB.md) - - Spark [Spark](./Spark-IoTDB.md) -- 可视化分析 - - Zeppelin [Zeppelin](./Zeppelin-IoTDB_timecho.md) - - Grafana [Grafana](./Grafana-Connector.md) - - Grafana Plugin [Grafana Plugin](./Grafana-Plugin.md) - - DataEase [DataEase](./DataEase.md) -- SQL 开发 - - DBeaver [DBeaver](./DBeaver.md) -- 物联网对接 - - Ignition [Ignition](./Ignition-IoTDB-plugin_timecho.md) - - Thingsboard [Thingsboard](./Thingsboard.md) \ No newline at end of file diff --git a/src/zh/UserGuide/latest/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md b/src/zh/UserGuide/latest/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md deleted file mode 100644 index 553d01d39..000000000 --- a/src/zh/UserGuide/latest/Ecosystem-Integration/Ignition-IoTDB-plugin_timecho.md +++ /dev/null @@ -1,274 +0,0 @@ - -# Ignition - -## 1. 产品概述 - -1. Ignition简介 - -Ignition 是一个基于WEB的监控和数据采集工具(SCADA)- 一个开放且可扩展的通用平台。Ignition可以让你更轻松地控制、跟踪、显示和分析企业的所有数据,提升业务能力。更多介绍详情请参考[Ignition官网](https://docs.inductiveautomation.com/docs/8.1/getting-started/introducing-ignition) - -2. 
Ignition-IoTDB Connector介绍 - - Ignition-IoTDB Connector分为两个模块:Ignition-IoTDB连接器、Ignition-IoTDB With JDBC。其中: - - - Ignition-IoTDB 连接器:提供了将 Ignition 采集到的数据存入 IoTDB 的能力,也支持在Components中进行数据读取,同时注入了 `system.iotdb.insert`和`system.iotdb.query`脚本接口用于方便在Ignition编程使用 - - Ignition-IoTDB With JDBC:Ignition-IoTDB With JDBC 可以在 `Transaction Groups` 模块中使用,不适用于 `Tag Historian`模块,可以用于自定义写入和查询。 - - 两个模块与Ignition的具体关系与内容如下图所示。 - - ![](/img/Ignition.png) - -## 2. 安装要求 - -| **准备内容** | **版本要求** | -| :------------------------: | :------------------------------------------------------------: | -| IoTDB | 要求已安装V1.3.1及以上版本,安装请参考 IoTDB [部署指导](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) | -| Ignition | 要求已安装 8.1.x版本(8.1.37及以上)的 8.1 版本,安装请参考 Ignition 官网[安装指导](https://docs.inductiveautomation.com/docs/8.1/getting-started/installing-and-upgrading)(其他版本适配请联系商务了解) | -| Ignition-IoTDB连接器模块 | 请联系商务获取 | -| Ignition-IoTDB With JDBC模块 | 下载地址:https://repo1.maven.org/maven2/org/apache/iotdb/iotdb-jdbc/ | - -## 3. Ignition-IoTDB连接器使用说明 - -### 3.1 简介 - -Ignition-IoTDB连接器模块可以将数据存入与历史数据库提供程序关联的数据库连接中。数据根据其数据类型直接存储到 SQL 数据库中的表中,以及毫秒时间戳。根据每个标签上的值模式和死区设置,仅在更改时存储数据,从而避免重复和不必要的数据存储。 - -Ignition-IoTDB连接器提供了将 Ignition 采集到的数据存入 IoTDB 的能力。 - -### 3.2 安装步骤 - -步骤一:进入 `Config` - `System`- `Modules` 模块,点击最下方的`Install or Upgrade a Module...` - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-1.png) - -步骤二:选择获取到的 `modl`,选择文件并上传,点击 `Install`,信任相关证书。 - -![](/img/ignition-3.png) - -步骤三:安装完成后可以看到如下内容 - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-3.png) - -步骤四:进入 `Config` - `Tags`- `History` 模块,点击下方的`Create new Historical Tag Provider...` - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-4.png) - -步骤五:选择 `IoTDB`并填写配置信息 - -![](/img/Ignition-IoTDB%E8%BF%9E%E6%8E%A5%E5%99%A8-5.png) - -配置内容如下: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
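-| Name | Meaning | Default | Remarks |
-| ---- | ------- | ------- | ------- |
-| **Main** | | | |
-| Provider Name | Name of the Provider | - | |
-| Enabled | | true | The Provider can be used only when this is true |
-| Description | Remarks | - | |
-| **IoTDB Settings** | | | |
-| Host Name | Address of the target IoTDB instance | - | |
-| Port Number | Port of the target IoTDB instance | 6667 | |
-| Username | Username for the target IoTDB | - | |
-| Password | Password for the target IoTDB | - | |
-| Database Name | Name of the database to store data in; starts with root, for example root.db | - | |
-| Pool Size | Size of the SessionPool | 50 | Configure as needed |
-| **Store and Forward Settings** | Keep the defaults | | |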
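The constraints stated in the table above (a database name that starts with `root`, port 6667 by default, pool size 50 by default) can be checked before the values are entered in the Gateway. The sketch below is a hypothetical helper in plain Python, not an Ignition or IoTDB API: the function names are made up, the dictionary keys simply mirror the table labels, and requiring the prefix `root.` is a slightly stricter reading of "starts with root".

```python
import socket

# Defaults taken from the configuration table above.
DEFAULT_PORT = 6667
DEFAULT_POOL_SIZE = 50


def check_provider_settings(settings):
    """Return a list of problems found in the settings; an empty list means none."""
    problems = []
    if not settings.get("Host Name"):
        problems.append("Host Name is empty")
    port = settings.get("Port Number", DEFAULT_PORT)
    if not isinstance(port, int) or not 1 <= port <= 65535:
        problems.append("Port Number must be an integer between 1 and 65535")
    database = settings.get("Database Name", "")
    if not database.startswith("root."):
        problems.append("Database Name must start with 'root.'")
    pool_size = settings.get("Pool Size", DEFAULT_POOL_SIZE)
    if not isinstance(pool_size, int) or pool_size < 1:
        problems.append("Pool Size must be a positive integer")
    return problems


def endpoint_reachable(host, port=DEFAULT_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `check_provider_settings({"Host Name": "127.0.0.1", "Database Name": "root.db"})` reports no problems, and `endpoint_reachable("127.0.0.1")` then tells you whether anything is listening on the default port before the Provider is saved.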
- - -### 3.3 使用说明 - -#### 配置历史数据存储 - -- 配置好 `Provider` 后就可以在 `Designer` 中使用 `IoTDB Tag Historian` 了,就跟使用其他的 `Provider` 一样,右键点击对应 `Tag` 选择 `Edit tag(s)`,在 Tag Editor 中选择 History 分类 - - ![](/img/ignition-7.png) - -- 设置 `History Enabled` 为 `true`,并选择 `Storage Provider` 为上一步创建的 `Provider`,按需要配置其它参数,并点击 `OK`,然后保存项目。此时数据将会按照设置的内容持续的存入 `IoTDB` 实例中。 - - ![](/img/ignition-8.png) - -#### 读取数据 - -- 也可以在 Report 的 Data 标签下面直接选择存入 IoTDB 的 Tags - - ![](/img/ignition-9.png) - -- 在 Components 中也可以直接浏览相关数据 - - ![](/img/ignition-10.png) - -#### 脚本模块:该功能能够与 IoTDB 进行交互 - -1. system.iotdb.insert: - - -- 脚本说明:将数据写入到 IoTDB 实例中 - -- 脚本定义: - ``` shell - system.iotdb.insert(historian, deviceId, timestamps, measurementNames, measurementValues) - ``` - -- 参数: - - - `str historian`:对应的 IoTDB Tag Historian Provider 的名称 - - `str deviceId`:写入的 deviceId,不含配置的 database,如 Sine - - `long[] timestamps`:写入的数据点对于的时间戳列表 - - `str[] measurementNames`:写入的物理量的名称列表 - - `str[][] measurementValues`:写入的数据点数据,与时间戳列表和物理量名称列表对应 - -- 返回值:无 - -- 可用范围:Client, Designer, Gateway - -- 使用示例: - - ```shell - system.iotdb.insert("IoTDB", "Sine", [system.date.now()],["measure1","measure2"],[["val1","val2"]]) - ``` - -2. system.iotdb.query: - - -- 脚本说明:查询写到 IoTDB 实例中的数据 - -- 脚本定义: - ```shell - system.iotdb.query(historian, sql) - ``` - -- 参数: - - - `str historian`:对应的 IoTDB Tag Historian Provider 的名称 - - `str sql`:待查询的 sql 语句 - -- 返回值: - 查询的结果:`List>` - -- 可用范围:Client, Designer, Gateway -- 使用示例: - -```shell -system.iotdb.query("IoTDB", "select * from root.db.Sine where time > 1709563427247") -``` - -## 4. 
Ignition-IoTDB With JDBC - -### 4.1 简介 - - Ignition-IoTDB With JDBC提供了一个 JDBC 驱动,允许用户使用标准的JDBC API 连接和查询 lgnition-loTDB 数据库 - -### 4.2 安装步骤 - - 步骤一:进入 `Config` - `Databases` -`Drivers` 模块,创建 `Translator` - -![](/img/Ignition-IoTDBWithJDBC-1.png) - - 步骤二:进入 `Config` - `Databases` -`Drivers` 模块,创建 `JDBC Driver`,选择上一步配置的 `Translator`并上传下载的 `IoTDB-JDBC`,Classname 配置为 `org.apache.iotdb.jdbc.IoTDBDriver` - -![](/img/Ignition-IoTDBWithJDBC-2.png) - -步骤三:进入 `Config` - `Databases` -`Connections` 模块,创建新的 `Connections`,`JDBC Driver` 选择上一步创建的 `IoTDB Driver`,配置相关信息后保存即可使用 - -![](/img/Ignition-IoTDBWithJDBC-3.png) - -### 4.3 使用说明 - -#### 数据写入 - - 在`Transaction Groups`中的 `Data Source`选择之前创建的 `Connection` - -- `Table name` 需设置为 root 开始的完整的设备路径 -- 取消勾选 `Automatically create table` -- `Store timestame to` 配置为 time - -不选择其他项,设置好字段,并 `Enabled` 后 数据会安装设置存入对应的 IoTDB - -![](/img/%E6%95%B0%E6%8D%AE%E5%86%99%E5%85%A5-1.png) - -#### 数据查询 - -- 在 `Database Query Browser` 中选择`Data Source`选择之前创建的 `Connection`,即可编写 SQL 语句查询 IoTDB 中的数据 - -![](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2-ponz.png) - diff --git a/src/zh/UserGuide/latest/Ecosystem-Integration/SeaTunnel_timecho.md b/src/zh/UserGuide/latest/Ecosystem-Integration/SeaTunnel_timecho.md deleted file mode 100644 index 5fbef03f9..000000000 --- a/src/zh/UserGuide/latest/Ecosystem-Integration/SeaTunnel_timecho.md +++ /dev/null @@ -1,189 +0,0 @@ - - -# Apache SeaTunnel - -## 1. 概述 - -SeaTunnel 是一款专为海量数据设计的分布式集成平台,凭借其高性能与弹性扩展能力,通过标准化的 Connector 连接器(由 Source 和 Sink 构成)打通多源异构数据链路。平台将各类数据源通过 Source 统一抽象为 SeaTunnelRow 格式,经动态资源调度与批量处理优化后,由 Sink 高效写入不同存储系统。通过 IoTDB Connector 与 SeaTunnel 的深度集成,不仅解决了时序数据场景下的 高吞吐写入、多源治理、复杂分析 等核心挑战,更通过开箱即用的连接器生态和自动化运维能力,帮助企业在物联网、工业互联网等领域快速构建 低成本、高可靠、易扩展 的数据基础设施。 - -## 2. 
使用步骤 - -### 2.1 环境准备 - -#### 2.1.1 软件要求 - -| 软件 | 版本 | 安装参考 | -| ----------- | ---------- |-----------------------------------------------| -| IoTDB | >= 2.0.5 | [快速入手](../QuickStart/QuickStart_timecho.md) | -| SeaTunnel | 2.3.12 | [官方网站](https://seatunnel.apache.org/download) | - -* Thrift 版本冲突解决(仅 Spark 引擎需处理): - -```Bash -# 移除 Spark 中的旧版 Thrift -rm -f $SPARK_HOME/jars/libthrift* -# 复制 IoTDB 的 Thrift 库到 Sparkcp -$IOTDB_HOME/lib/libthrift* $SPARK_HOME/jars/ -``` - -#### 2.1.2 依赖配置 - -1. JDBC - -* Spark/Flink 引擎:将 [JDBC 驱动 Jar 包](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) 放入 `${SEATUNNEL_HOME}/plugins/` 目录 -* SeaTunnel Zeta 引擎:将 [JDBC 驱动 Jar 包](https://mvnrepository.com/artifact/org.apache.iotdb/iotdb-jdbc) 放入 `${SEATUNNEL_HOME}/lib/` 目录 - -2. Connector - -将对应版本的 [seaTunnel Connector](https://mvnrepository.com/artifact/org.apache.seatunnel/connector-iotdb) 放入 `${SEATUNNEL_HOME}/plugins/` 目录 - -### 2.2 读取数据 (IoTDB Source Connector) - -#### 2.2.1 配置参数 - -| **参数名** | **类型** | **必填** | **默认值** | **描述** | -| ---------------------------------- | ---------------- | ---------------- | ------------------ |-----------------------------------------------------------------------| -| `node_urls` | string | 是 | - | IoTDB 集群地址,格式:`"host1:port"`或`"host1:port,host2:port"` | -| `username` | string | 是 | - | IoTDB 用户名 | -| `password` | string | 是 | - | IoTDB 密码 | -| `sql_dialect` | string | 否 | tree | IoTDB 模型,tree:树模型;table:表模型 | -| `sql` | string | 是 | - | 要执行的 SQL 查询语句 | -| `database` | string | 否 | - | 数据库名,只在表模型中生效 | -| `schema` | config | 是 | - | 数据模式定义 | -| `fetch_size` | int | 否 | - | 单次获取数据量:查询时每次从 IoTDB 获取的数据量 | -| `lower_bound`| long | 否 | - | 时间范围下界(通过时间列进行数据分片时使用) | -| `upper_bound` | long | 否 | - | 时间范围上界(通过时间列进行数据分片时使用) | -| `num_partitions`| int | 否 | - | 分区数量(通过时间列进行数据分片时使用):
1个分区:使用完整时间范围
若分区数 < (上界-下界),则使用差值作为实际分区数 | -| `thrift_default_buffer_size` | int | 否 | - | Thrift 协议缓冲区大小 | -| `thrift_max_frame_size` | int | 否 | - | Thrift 最大帧尺寸 | -| `enable_cache_leader` | boolean | 否 | - | 是否启用 Leader 节点缓存 | -| `version` | string | 否 | - | 客户端 SQL 语义版本`(V_0_12/V_0_13)` | - -#### 2.2.2 配置示例 - -1. 在 `${SEATUNNEL_HOME}/`​`config/` 目录下新建` iotdb_source_example.conf` - -```Bash -env{ - parallelism = 2 # 并行度为2 - job.mode = "BATCH" # 批处理模式 -} - -source { - IoTDB { - node_urls = "localhost:6667" - username = "root" - password = "root" - sql = "SELECT temperature, humidity, status FROM root.testdb.seatunnel.source.device align by device" - schema { - fields { - ts = timestamp - device_name = string - temperature = double - humidity = double - status = boolean - } - } - } -} - -sink { - Console { - } # 输出到控制台 -} -``` - -2. 执行如下命令运行 seaTunnel - -```Bash -./bin/seatunnel.sh --config config/iotdb_source_example.conf -e local -``` - -3. 更多详情请参考 Apache SeanTunnel 官网 [IoTDB Source Connector](https://seatunnel.incubator.apache.org/zh-CN/docs/2.3.12/connector-v2/source/IoTDB) 相关介绍 - -### 2.3 写入数据(IoTDB Sink Connector) - -#### 2.3.1 配置参数 - -| **名称** | **类型** | **是否必传​** | **默认值** | **描述** | -|-------------------------------|---------| ---------------------- |------------------| --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `node_urls` | Array | 是 | - | `IoTDB`集群地址,格式为` ["host1:port"]`或`["host1:port","host2:port"]` | -| `username` | String | 是 | - | `IoTDB`用户的用户名 | -| `password` | String | 是 | - | `IoTDB`用户的密码 | -| `sql_dialect` | String | 否 | tree | `IoTDB`模型,tree:树模型;table:表模型 | -| `storage_group` | String | 是 | - | `IoTDB`树模型:指定设备存储组(路径前缀) 例: deviceId = \${storage\_group} + "." 
+ \${key\_device} ;`IoTDB`表模型:指定数据库 | -| `key_device` | String | 是 | - | `IoTDB`树模型:在 SeaTunnelRow 中指定`IoTDB`设备 ID 的字段名;`IoTDB`表模型:在 SeaTunnelRow 中指定`IoTDB`表名的字段名 | -| `key_timestamp` | String | 否 | processing time | `IoTDB`树模型:在 SeaTunnelRow 中指定`IoTDB`时间戳的字段名(如未指定,则使用处理时间作为时间戳);`IoTDB`表模型:在 SeaTunnelRow 中指定 IoTDB 时间列的字段名(如未指定,则使用处理时间作为时间戳) | -| `key_measurement_fields` | Array | 否 | 见描述 | `IoTDB`树模型:在 SeaTunnelRow 中指定`IoTDB`测量列表的字段名(如未指定,则包括排除`key_device`&`key_timestamp`后的其余字段);`IoTDB`表模型:在 SeaTunnelRow 中指定`IoTDB`测点列(FIELD)的字段名(如未指定,则包括排除`key_device`&`key_timestamp`&`key_tag_fields`&`key_attribute_fields`后的其余字段) | -| `key_tag_fields` | Array | 否 | - | `IoTDB`树模型:不生效;`IoTDB`表模型:在 SeaTunnelRow 中指定`IoTDB`标签列(TAG)的字段名 | -| `key_attribute_fields` | Array | 否 | - | `IoTDB`树模型:不生效;`IoTDB`表模型:在 SeaTunnelRow 中指定`IoTDB`属性列(ATTRIBUTE)的字段名 | -| `batch_size` | Integer | 否 | 1024 | 对于批写入,当缓冲区的数量达到`batch_size`的数量或时间达到`batch_interval_ms`时,数据将被刷新到IoTDB中 | -| `max_retries` | Integer | 否 | - | 刷新的重试次数 failed | -| `retry_backoff_multiplier_ms` | Integer | 否 | - | 用作生成下一个退避延迟的乘数 | -| `max_retry_backoff_ms` | Integer | 否 | - | 尝试重试对`IoTDB`的请求之前等待的时间量 | -| `default_thrift_buffer_size` | Integer | 否 | - | 在`IoTDB`客户端中节省初始化缓冲区大小 | -| `max_thrift_frame_size` | Integer | 否 | - | 在`IoTDB`客户端中节约最大帧大小 | -| `zone_id` | string | 否 | - | `IoTDB`java.time.ZoneId client | -| `enable_rpc_compression` | Boolean | 否 | - | 在`IoTDB`客户端中启用rpc压缩 | -| `connection_timeout_in_ms` | Integer | 否 | - | 连接到`IoTDB`时等待的最长时间(毫秒) | - -#### 2.3.2 配置示例 - -1. 
在 `${SEATUNNEL_HOME}/`​`config/` 目录下新建` iotdb_sink_example.conf` - -```bash -# 定义运行时环境 -env { - parallelism = 4 - job.mode = "BATCH" -} -source{ - Jdbc { - url = "jdbc:mysql://localhost:3306/demo_db?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true" - driver = "com.mysql.cj.jdbc.Driver" - connection_check_timeout_sec = 100 - user = "root" - password = "IoTDB@2024" - query = "select * from device" - } -} -sink { - IoTDB { - node_urls = ["localhost:6667"] - username = "root" - password = "root" - key_device = "id" # specify the `deviceId` use device_name field - key_timestamp = "intime" - storage_group = "root.mysql" - } -} -``` - -2. 执行如下命令运行 seaTunnel - -```Bash -./bin/seatunnel.sh --config config/iotdb_sink_example.conf -e local -``` - -3. 更多配置参数及示例请参考 Apache SeanTunnel 官网 [IoTDB Sink Connector](https://seatunnel.incubator.apache.org/zh-CN/docs/2.3.12/connector-v2/sink/IoTDB) 相关介绍 - - diff --git a/src/zh/UserGuide/latest/Ecosystem-Integration/Zeppelin-IoTDB_timecho.md b/src/zh/UserGuide/latest/Ecosystem-Integration/Zeppelin-IoTDB_timecho.md deleted file mode 100644 index f3bba6de1..000000000 --- a/src/zh/UserGuide/latest/Ecosystem-Integration/Zeppelin-IoTDB_timecho.md +++ /dev/null @@ -1,174 +0,0 @@ - - -# Apache Zeppelin - -## 1. Zeppelin 简介 - -Apache Zeppelin 是一个基于网页的交互式数据分析系统。用户可以通过 Zeppelin 连接数据源并使用 SQL、Scala 等进行交互式操作。操作可以保存为文档(类似于 Jupyter)。Zeppelin 支持多种数据源,包括 Spark、ElasticSearch、Cassandra 和 InfluxDB 等等。现在,IoTDB 已经支持使用 Zeppelin 进行操作。样例如下: - -![iotdb-note-snapshot](/img/github/102752947-520a3e80-43a5-11eb-8fb1-8fac471c8c7e.png) - -## 2. Zeppelin-IoTDB 解释器 - -### 2.1 系统环境需求 - -| IoTDB 版本 | Java 版本 | Zeppelin 版本 | -| :--------: | :-----------: | :-----------: | -| >=`0.12.0` | >=`1.8.0_271` | `>=0.9.0` | - -安装 IoTDB:参考 [快速上手](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md). 假设 IoTDB 安装在 `$IoTDB_HOME`. 
- -安装 Zeppelin: -> 方法 1 直接下载:下载 [Zeppelin](https://zeppelin.apache.org/download.html#) 并解压二进制文件。推荐下载 [netinst](http://www.apache.org/dyn/closer.cgi/zeppelin/zeppelin-0.9.0/zeppelin-0.9.0-bin-netinst.tgz) 二进制包,此包由于未编译不相关的 interpreter,因此大小相对较小。 -> -> 方法 2 源码编译:参考 [从源码构建 Zeppelin](https://zeppelin.apache.org/docs/latest/setup/basics/how_to_build.html) ,使用命令为 `mvn clean package -pl zeppelin-web,zeppelin-server -am -DskipTests`。 - -假设 Zeppelin 安装在 `$Zeppelin_HOME`. - -### 2.2 编译解释器 - -运行如下命令编译 IoTDB Zeppelin 解释器。 - -```shell -cd $IoTDB_HOME - mvn clean package -pl iotdb-connector/zeppelin-interpreter -am -DskipTests -P get-jar-with-dependencies -``` - -编译后的解释器位于如下目录: - -```shell -$IoTDB_HOME/zeppelin-interpreter/target/zeppelin-{version}-SNAPSHOT-jar-with-dependencies.jar -``` - -### 2.3 安装解释器 - -当你编译好了解释器,在 Zeppelin 的解释器目录下创建一个新的文件夹`iotdb`,并将 IoTDB 解释器放入其中。 - -```shell -cd $IoTDB_HOME -mkdir -p $Zeppelin_HOME/interpreter/iotdb -cp $IoTDB_HOME/zeppelin-interpreter/target/zeppelin-{version}-SNAPSHOT-jar-with-dependencies.jar $Zeppelin_HOME/interpreter/iotdb -``` - -### 2.4 修改 Zeppelin 配置 - -进入 `$Zeppelin_HOME/conf`,使用 template 创建 Zeppelin 配置文件: - -```shell -cp zeppelin-site.xml.template zeppelin-site.xml -``` - -打开 zeppelin-site.xml 文件,将 `zeppelin.server.addr` 项修改为 `0.0.0.0` - -### 2.5 启动 Zeppelin 和 IoTDB - -进入 `$Zeppelin_HOME` 并运行 Zeppelin: - -```shell -# Unix/OS X -> ./bin/zeppelin-daemon.sh start - -# Windows -> .\bin\zeppelin.cmd -``` - -进入 `$IoTDB_HOME` 并运行 IoTDB: - -```shell -# Unix/OS X -> nohup sbin/start-server.sh >/dev/null 2>&1 & -or -> nohup sbin/start-server.sh -c -rpc_port >/dev/null 2>&1 & - -# Windows -> sbin\start-server.bat -c -rpc_port -``` - -## 3. 使用 Zeppelin-IoTDB 解释器 - -当 Zeppelin 启动后,访问 [http://127.0.0.1:8080/](http://127.0.0.1:8080/) - -通过如下步骤创建一个新的笔记本页面: - -1. 点击 `Create new node` 按钮 -2. 设置笔记本名 -3. 
选择解释器为 iotdb - -现在可以开始使用 Zeppelin 操作 IoTDB 了。 - -![iotdb-create-note](/img/github/102752945-5171a800-43a5-11eb-8614-53b3276a3ce2.png) - -我们提供了一些简单的 SQL 来展示 Zeppelin-IoTDB 解释器的使用: - -```sql -CREATE DATABASE root.ln.wf01.wt01; -CREATE TIMESERIES root.ln.wf01.wt01.status WITH DATATYPE=BOOLEAN, ENCODING=PLAIN; -CREATE TIMESERIES root.ln.wf01.wt01.temperature WITH DATATYPE=FLOAT, ENCODING=PLAIN; -CREATE TIMESERIES root.ln.wf01.wt01.hardware WITH DATATYPE=INT32, ENCODING=PLAIN; - -INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware) -VALUES (1, 1.1, false, 11); - -INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware) -VALUES (2, 2.2, true, 22); - -INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware) -VALUES (3, 3.3, false, 33); - -INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware) -VALUES (4, 4.4, false, 44); - -INSERT INTO root.ln.wf01.wt01 (timestamp, temperature, status, hardware) -VALUES (5, 5.5, false, 55); - -SELECT * -FROM root.ln.wf01.wt01 -WHERE time >= 1 - AND time <= 6; -``` - -样例如下: - -![iotdb-note-snapshot2](/img/github/102752948-52a2d500-43a5-11eb-9156-0c55667eb4cd.png) - -用户也可以参考 [[1]](https://zeppelin.apache.org/docs/0.9.0/usage/display_system/basic.html) 编写更丰富多彩的文档。 - -以上样例放置于 `$IoTDB_HOME/zeppelin-interpreter/Zeppelin-IoTDB-Demo.zpln` - -## 4. 
解释器配置项 - -进入页面 [http://127.0.0.1:8080/#/interpreter](http://127.0.0.1:8080/#/interpreter) 并配置 IoTDB 的连接参数: - -![iotdb-configuration](/img/github/102752940-50407b00-43a5-11eb-94fb-3e3be222183c.png) - -可配置参数默认值和解释如下: - -| 属性 | 默认值 | 描述 | -| ---------------------------- | --------- | -------------------------------- | -| iotdb.host | 127.0.0.1 | IoTDB 主机名 | -| iotdb.port | 6667 | IoTDB 端口 | -| iotdb.username | root | 用户名 | -| iotdb.password | root | 密码 | -| iotdb.fetchSize | 10000 | 查询结果分批次返回时,每一批数量 | -| iotdb.zoneId | | 时区 ID | -| iotdb.enable.rpc.compression | FALSE | 是否允许 rpc 压缩 | -| iotdb.time.display.type | default | 时间戳的展示格式 | diff --git a/src/zh/UserGuide/latest/IoTDB-Introduction/IoTDB-Introduction_timecho.md b/src/zh/UserGuide/latest/IoTDB-Introduction/IoTDB-Introduction_timecho.md deleted file mode 100644 index a82e4ed7f..000000000 --- a/src/zh/UserGuide/latest/IoTDB-Introduction/IoTDB-Introduction_timecho.md +++ /dev/null @@ -1,271 +0,0 @@ - - -# 产品介绍 - -TimechoDB 是一款低成本、高性能的物联网原生时序数据库,是天谋科技基于 Apache IoTDB 社区版本提供的原厂商业化产品。它可以解决企业组建物联网大数据平台管理时序数据时所遇到的应用场景复杂、数据体量大、采样频率高、数据乱序多、数据处理耗时长、分析需求多样、存储与运维成本高等多种问题。 - -天谋科技基于 TimechoDB 提供更多样的产品功能、更强大的性能和稳定性、更丰富的效能工具,并为用户提供全方位的企业服务,从而为商业化客户提供更强大的产品能力,和更优质的开发、运维、使用体验。 - -- 下载、部署与使用:[快速上手](../QuickStart/QuickStart_timecho.md) - -## 1. 产品体系 - -天谋产品体系由若干个组件构成,覆盖由【数据采集】到【数据管理】到【数据分析&应用】的全时序数据生命周期,做到“采-存-用”一体化时序数据解决方案,帮助用户高效地管理和分析物联网产生的海量时序数据。 - -
- Introduction-zh-timecho.png -
- - -其中: - -1. **时序数据库(TimechoDB,基于 Apache IoTDB 提供的原厂商业化产品)**:时序数据存储的核心组件,其能够为用户提供高压缩存储能力、丰富时序查询能力、实时流处理能力,同时具备数据的高可用和集群的高扩展性,并在安全层面提供全方位保障。同时 TimechoDB 还为用户提供多种应用工具,方便用户配置和管理系统;多语言API和外部系统应用集成能力,方便用户在 TimechoDB 基础上构建业务应用。 -2. **时序数据标准文件格式(Apache TsFile,多位天谋科技核心团队成员主导&贡献代码)**:该文件格式是一种专为时序数据设计的存储格式,可以高效地存储和查询海量时序数据。目前 Timecho 采集、存储、智能分析等模块的底层存储文件均由 Apache TsFile 进行支撑。TsFile 可以被高效地加载至 IoTDB 中,也能够被迁移出来。通过 TsFile,用户可以在采集、管理、应用&分析阶段统一使用相同的文件格式进行数据管理,极大简化了数据采集到分析的整个流程,提高时序数据管理的效率和便捷度。 -3. **时序模型训推一体化引擎(AINode)**:针对智能分析场景,TimechoDB 提供 AINode 时序模型训推一体化引擎,它提供了一套完整的时序数据分析工具,底层为模型训练引擎,支持训练任务与数据管理,与包括机器学习、深度学习等。通过这些工具,用户可以对存储在 TimechoDB 中的数据进行深入分析,挖掘出其中的价值。 -4. **数据采集**:为了更加便捷的对接各类工业采集场景, 天谋科技提供数据采集接入服务,支持多种协议和格式,可以接入各种传感器、设备产生的数据,同时支持断点续传、网闸穿透等特性。更加适配工业领域采集过程中配置难、传输慢、网络弱的特点,让用户的数采变得更加简单、高效。 - -## 2. TimechoDB 整体架构 - -下图展示了一个常见的 IoTDB 3C3D(3 个 ConfigNode、3 个 DataNode)的集群部署模式: - - - -## 3. 产品特性 - -TimechoDB 具备以下优势和特性: - -- 灵活的部署方式:支持云端一键部署、终端解压即用、终端-云端无缝连接(数据云端同步工具) - -- 低硬件成本的存储解决方案:支持高压缩比的磁盘存储,无需区分历史库与实时库,数据统一管理 - -- 层级化的测点组织管理方式:支持在系统中根据设备实际层级关系进行建模,以实现与工业测点管理结构的对齐,同时支持针对层级结构的目录查看、检索等能力 - -- 高通量的数据读写:支持百万级设备接入、数据高速读写、乱序/多频采集等复杂工业读写场景 - -- 丰富的时间序列查询语义:支持时序数据原生计算引擎,支持查询时时间戳对齐,提供近百种内置聚合与时序计算函数,支持面向时序特征分析和AI能力 - -- 高可用的分布式系统:支持HA分布式架构,系统提供7*24小时不间断的实时数据库服务,一个物理节点宕机或网络故障,不会影响系统的正常运行;支持物理节点的增加、删除或过热,系统会自动进行计算/存储资源的负载均衡处理;支持异构环境,不同类型、不同性能的服务器可以组建集群,系统根据物理机的配置,自动负载均衡 - -- 极低的使用&运维门槛:支持类 SQL 语言、提供多语言原生二次开发接口、具备控制台等完善的工具体系 - -- 丰富的生态环境对接:支持Hadoop、Spark等大数据生态系统组件对接,支持Grafana、Thingsboard、DataEase等设备管理和可视化工具 - -## 4. 
企业特性 - -### 4.1 更高阶的产品功能 - -TimechoDB 在 Apache IoTDB 基础上提供了更多高阶产品功能,在内核层面针对工业生产场景进行原生升级和优化,如多级存储、云边协同、可视化工具、安全增强等功能,能够让用户无需过多关注底层逻辑,将精力聚焦在业务开发中,让工业生产更简单更高效,为企业带来更多的经济效益。如: - -- 双活部署:双活通常是指两个独立的单机(或集群),实时进行镜像同步,它们的配置完全独立,可以同时接收外界的写入,每一个独立的单机(或集群)都可以将写入到自己的数据同步到另一个单机(或集群)中,两个单机(或集群)的数据可达到最终一致。 - -- 数据同步:通过数据库内置的同步模块,支持数据由场站向中心汇聚,支持全量汇聚、部分汇聚、级联汇聚等各类场景,可支持实时数据同步与批量数据同步两种模式。同时提供多种内置插件,支持企业数据同步应用中的网闸穿透、加密传输、压缩传输等相关要求。 - -- 多级存储:通过升级底层存储能力,支持根据访问频率和数据重要性等因素将数据划分为冷、温、热等不同层级的数据,并将其存储在不同介质中(如 SSD、机械硬盘、云存储等),同时在查询过程中也由系统进行数据调度。从而在保证数据访问速度的同时,降低客户数据存储成本。 - -- 安全增强:通过白名单、审计日志等功能加强企业内部管理,降低数据泄露风险。 - -详细功能对比如下: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
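-| Feature area | Feature | Apache IoTDB | TimechoDB |
-| ------------ | ------- | ------------ | --------- |
-| Deployment mode | Standalone deployment | | |
-| | Distributed deployment | | |
-| | Dual-active deployment | - | |
-| | Container deployment | Partially supported | |
-| Database features | Measurement management | | |
-| | Data writing | | |
-| | Data query | | |
-| | Continuous query | | |
-| | Trigger | | |
-| | User-defined function | | |
-| | Permission management | | |
-| | Data synchronization | File synchronization only, no built-in plugins | Real-time and file synchronization, rich built-in plugins |
-| | Stream processing | Framework only, no built-in plugins | Framework with rich built-in plugins |
-| | Tiered storage | - | |
-| | View | - | |
-| | Whitelist | - | |
-| | Audit log | - | |
-| Supporting tools | Visual console | - | |
-| | Cluster management tool | - | |
-| | System monitoring tool | - | |
-| Domestic adaptation | Domestic compatibility certification | - | |
-| Technical support | Expert services | - | |
-| | Usage training | - | |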
- -### 4.2 更高效/稳定的产品性能 - -TimechoDB 在 Apache IoTDB 的基础上优化了稳定性与性能,经过企业版技术支持,能够实现10倍以上性能提升,并具有故障及时恢复的性能优势。 - -### 4.3 更用户友好的工具体系 - -TimechoDB 将为用户提供更简单、易用的工具体系,通过集群监控面板(IoTDB Grafana)、数据库控制台(IoTDB Workbench)、集群管理工具(IoTDB Deploy Tool,简称 IoTD)等产品帮助用户快速部署、管理、监控数据库集群,降低运维人员工作/学习成本,简化数据库运维工作,使运维过程更加方便、快捷。 - -- 集群监控面板:旨在解决 IoTDB 及其所在操作系统的监控问题,主要包括:操作系统资源监控、IoTDB 性能监控,及上百项内核监控指标,从而帮助用户监控集群健康状态,并进行集群调优和运维。 - -
-

总体概览

-

操作系统资源监控

-

IoTDB 性能监控

-
-
- - - -
-

- -- 数据库控制台:旨在提供低门槛的数据库交互工具,通过提供界面化的控制台帮助用户简洁明了的进行元数据管理、数据增删改查、权限管理、系统管理等操作,简化数据库使用难度,提高数据库使用效率。 - - -
-

首页

-

元数据管理

-

SQL 查询

-
-
- - - -
-

- - -- 集群管理工具:旨在解决分布式系统多节点的运维难题,主要包括集群部署、集群启停、弹性扩容、配置更新、数据导出等功能,从而实现对复杂数据库集群的一键式指令下发,极大降低管理难度。 - - -
-  -
- -### 4.4 更专业的企业技术服务 - -TimechoDB 客户提供强大的原厂服务,包括但不限于现场安装及培训、专家顾问咨询、现场紧急救助、软件升级、在线自助服务、远程支持、最新开发版使用指导等服务。同时,为了使 IoTDB 更契合工业生产场景,我们会根据企业实际数据结构和读写负载,进行建模方案推荐、读写性能调优、压缩比调优、数据库配置推荐及其他的技术支持。如遇到部分产品未覆盖的工业化定制场景,TimechoDB 将根据用户特点提供定制化开发工具。 - -相较于 Apache IoTDB,每 2-3 个月一个发版周期,TimechoDB 提供周期更快的发版频率,同时针对客户现场紧急问题,提供天级别的专属修复,确保生产环境稳定。 - - -### 4.5 更兼容的国产化适配 - -TimechoDB 代码自研可控,同时兼容大部分主流信创产品(CPU、操作系统等),并完成与多个厂家的兼容认证,确保产品的合规性和安全性。 \ No newline at end of file diff --git a/src/zh/UserGuide/latest/IoTDB-Introduction/Release-history_timecho.md b/src/zh/UserGuide/latest/IoTDB-Introduction/Release-history_timecho.md deleted file mode 100644 index 77f73b1e2..000000000 --- a/src/zh/UserGuide/latest/IoTDB-Introduction/Release-history_timecho.md +++ /dev/null @@ -1,677 +0,0 @@ - -# 发布历史 - -## 1. TimechoDB(数据库内核) - -### V2.0.9.4 - -> 发版时间:2026.06.10
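-> Download: contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.9.4-bin.zip
-> SHA512 checksum: 040ebdd9e45d93535e9628cf377003d560be83cec9737f5a5fbd0c3a93a12810814094752eac3eacdfec5cddcf433fa83e76edc14be34c73c1a54d9b937ea1b5
-
-V2.0.9.4 mainly improves AINode inference for the table model, fixes several product defects, and improves database monitoring, performance, and stability across the board. Release details:
-
-- AINode: covariate inference models for the table model now adaptively fill null values
-
-
-### V2.0.9.3
-
-> Release date: 2026.05.14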
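-> Download: contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.9.3-bin.zip
-> SHA512 checksum: f6c5d50cbf8902503289884f073593c650ffdc8edbebfabf27f6ab4499630749331aa4ed09dd34627a39fa8dee27b4d7e2689d0ed1cf23c76dd9c7270f9fae2a
-
-In V2.0.9.3, AINode can register the same model code with different model weights as separate models. The release also includes improvements and bug fixes to earlier versions and improves database monitoring, performance, and stability across the board. Release details:
-
-- AINode: [the same model code can be registered as separate models with different model weights](../AI-capability/AINode_Upgrade_timecho.md#_4-3-注册自定义模型)
-
-
-### V2.0.9.2
-
-> Release date: 2026.05.11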
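-> Download: contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.9.2-bin.zip
-> SHA512 checksum: 10d3f34b6e65ad5c09b1cf3538ee27e181cc38c5fedf6acfd7d7053797ca23c76245683536275b69bd478aa1e43364351eceef1948832ab663a7398665af9eff
-
-V2.0.9.2 adds import and export of Object-type data and a new tsfile-backup script (currently table model only). The release also includes improvements and bug fixes to earlier versions and improves database monitoring, performance, and stability across the board. Release details:
-
-- Scripts and tools: for the table model, the [TsFile format of the import-data script](../../latest-Table/Tools-System/Data-Import-Tool_timecho.md#_2-4-tsfile-格式) supports importing Object-type data
-- Scripts and tools: added the [tsfile-backup script](../../latest-Table/Tools-System/Data-Export-Tool_timecho.md#_3-基于-pipe-框架的-tsfilebackup) for the table model
-- Stream processing: table model PIPE supports [local export and remote transfer of Object-type data](../../latest-Table/User-Manual/Data-Sync_timecho.md#_3-9-object-类型数据导出)
-- System: the [audit log](../User-Manual/Audit-Log_timecho.md) supports counting slow requests
-
-
-### V2.0.9.1
-
-> Release date: 2026.05.11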
-> Download: contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.9.1-bin.zip
-> SHA512 checksum: 18ff3801ba58550e06ef0aa4bf4465e8ce1b31d1aecb9c6899eb843f5d9187d3cc575e930ee38d96b87b17067e2b21f1852ab5127eac7480cf5051c20a68894b
-
-V2.0.9.1 adds covariate and classification inference to AINode, schema-level and table-level storage space statistics, set operations, CTEs, and several built-in functions for queries, query debugging through DEBUG SQL, and configurable start on boot. The release also includes improvements and bug fixes to earlier versions and improves database monitoring, performance, and stability across the board. Release details:
-
-- AINode: the table model supports [classification inference on time series data](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md#_4-1-模型推理)
-- Query: the table model supports [set operations (UNION/INTERSECT/EXCEPT)](../../latest-Table/SQL-Manual/Set-Operations_timecho.md) and [common table expressions (CTE)](../../latest-Table/SQL-Manual/Common-Table-Expression_timecho.md)
-- Query: the table model adds the [IF scalar function](../../latest-Table/SQL-Manual/Basis-Function_timecho.md#_8-3-if-表达式), [binary functions](../../latest-Table/SQL-Manual/Basis-Function_timecho.md#_7-二进制函数), and the [APPROX_PERCENTILE aggregate function](../../latest-Table/SQL-Manual/Basis-Function_timecho.md#_2-聚合函数)
-- Query: supports [DEBUG SQL](../User-Manual/Maintenance-statement_timecho.md#_6-调试查询) and improves the [Explain Analyze](../User-Manual/Query-Performance-Analysis.md) result set
-- Query: supports [schema-level](../User-Manual/Maintenance-statement_timecho.md#_1-10-查看磁盘空间占用情况) and [table-level](../../latest-Table/Reference/System-Tables_timecho.md#_2-22-table-disk-usage-表) storage space statistics, and the [show configuration statement](../../latest-Table/User-Manual/Maintenance-statement_timecho.md#_1-13-查看节点配置信息) for viewing cluster configuration
-- Scripts and tools: the data and metadata import/export tools support SSL
-- Scripts and tools: the command-line tool can display [access history](../Tools-System/CLI_timecho.md#_5-访问历史功能)
-- System: supports configuring [start on boot](../User-Manual/Auto-Start-On-Boot_timecho.md)
-- Other: fixed security vulnerability CVE-2026-28564
-
-
-### V2.0.8.3
-
-> Release date: 2026.04.21
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.8.3-bin.zip
-> SHA512 校验码:4b95bea87cc375bc455897dcf4cec80692421fa5c3eee746e1095b94288611d4afdd94aa8dad70340757d041757758924701cbdb2b73b49fb8730c4caac2a126 - -V2.0.8.3 版本新增 Python 读写 Object 类型数据的能力,同时对历史版本进行改进和 bug 修复,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 接口模块:表模型[Python 原生接口](../../latest-Table/API/Programming-Python-Native-API_timecho.md)支持读写 Object 类型数据 - - -### V2.0.8.2 - -> 发版时间:2026.03.31
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.8.2-bin.zip
-> SHA512 校验码:02ab10e3e94786dd5676e0a69609eef192afd90d87f4d8d7bd44e7e9cbc8a18d61ba5668bae56cb8e4416ac71a877f760963b72ca7838d7c39ae10f1ed321d89 - -V2.0.8.2 版本新增树模型修改序列全名功能,表模型支持自定义 Time 列列名,树、表双模型支持更改数据类型,ODBC Driver等,同时对历史版本进行改进和 bug 修复,对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 存储模块:树模型支持[修改序列全名](../Basic-Concept/Operate-Metadata_timecho.md#_2-4-修改时间序列名称),支持[更改序列数据类型](../Basic-Concept/Operate-Metadata_timecho.md#_2-3-修改时间序列数据类型) -- 存储模块:表模型支持[更改列数据类型](../../latest-Table/Basic-Concept/Table-Management_timecho.md#_1-5-修改表),支持[自定义 Time 列列名](../../latest-Table/Basic-Concept/Table-Management_timecho.md#_1-1-创建表) -- 接口模块:支持 [ODBC Driver](../API/Programming-ODBC_timecho.md), Python SessionDataset 支持分批获取 DataFrame,MQTT 服务外置并新增系统表 Services 提供服务查询 -- AINode:表模型支持自适应[协变量推理](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md#_4-1-模型推理) -- 流处理模块:树模型数据同步 pipe 语句中支持填写多个精确路径的 path - -### V2.0.8.1 - -> 发版时间:2026.02.04
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:timechodb-2.0.8.1-bin.zip
-> SHA512 校验码:49d97cbf488443f8e8e73cc39f6f320b3bc84b194aed90af695ebd5771650b5e5b6a3abb0fb68059bd01827260485b903c035657b337442f4fdd32c877f2aca3 - -V2.0.8.1 版本表模型新增Object数据类型,强化升级审计日志功,优化树模型 OPC UA 协议,AINode 支持协变量预测,以及 AINode 支持并发推理等功能,同时对数据库监控、性能、稳定性进行了全方位提升。具体发布内容如下: - -- 查询模块:新增 DataNode 可用节点的列表展示,可[查看节点的 RPC 地址和端口](../User-Manual/Maintenance-statement_timecho.md#_1-7-查看可用节点) -- 查询模块:表模型新增[统计查询耗时的系统表](../../latest-Table/Reference/System-Tables_timecho.md#_2-20-queries-costs-histogram-表) -- 存储模块:支持通过 SQL 查看[创建表](../../latest-Table/Basic-Concept/Table-Management_timecho.md#_1-4-查看表的创建信息)/[视图](../../latest-Table/User-Manual/Tree-to-Table_timecho.md#_2-4-查看表视图)的完整定义语句 -- 存储模块:优化树模型 [OPC UA 协议](../API/Programming-OPC-UA_timecho.md) -- 系统模块:表模型新增 [Object 数据类型](../../latest-Table/Background-knowledge/Data-Type_timecho.md) -- 系统模块:强化升级[审计日志](../User-Manual/Audit-Log_timecho.md)功能 -- 系统模块:表模型新增 DataNode [节点连接情况](../../latest-Table/Reference/System-Tables_timecho.md#_2-18-connections-表)的系统表 -- AINode:内置 chronos-2 模型,支持[协变量预测](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md)功能 -- AINode:Timer-XL、Sundial 内置模型支持[并发推理](../../latest-Table/AI-capability/AINode_Upgrade_timecho.md)功能 -- 流处理模块:创建全量同步 pipe 会[自动拆分](../User-Manual/Data-Sync_timecho.md#_2-1-创建任务)为实时、历史两个独立 pipe,可通过 show pipes 语句分别查看剩余事件数 -- 其他:修复安全漏洞 CVE-2025-12183、CVE-2025-66566、CVE-2025-11226 - - -### V2.0.6.6 - -> 发版时间:2026.01.20
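-> Download: Please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.6.6-bin.zip
-> SHA512 checksum: d12e60b8119690d63c501d0c2afcd527e39df8a8786198e35b53338e21939e1a9244805e710d81cbb62d02c2739909d7e8227c029660a0cd9ea7ca718cf9bdf6
-
-V2.0.6.6 mainly optimizes time series query performance in the tree model. It also improves database monitoring, performance, and stability across the board. The release includes:
-
-* Query module: Optimized the performance of show/count timeseries/devices queries
-
-### V2.0.6.4
-
-> Release date: 2025.11.17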
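-> Download: Please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.6.4-bin.zip
-> SHA512 checksum: 57b9998cc14632862c32b6781c70db1c52caf8172b5d45d27cc214cab50d3afd4230ed0754e1c1a4ed825666bf971dc81fbb7d3b93261e57e9dabc20e794a2b8
-
-V2.0.6.4 mainly optimizes the storage and AINode modules and fixes several product defects. It also improves database monitoring, performance, and stability across the board. The release includes:
-
-* Storage module: The tree model supports changing the encoding and compression method of a time series
-* AINode: Supports one-click deployment; optimized model inference
-
-
-### V2.0.6.1
-
-> Release date: 2025.09.19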
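-> Download: Please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.6.1-bin.zip
-> SHA512 checksum: c88e3e2c0dbd06578bd0697ca9992880b300baee2c4906ba1f952134e37ae2fa803a6af236f4541d318b75f43a498b5d5bfbbc7c445783271076c36e696e4dd0
-
-V2.0.6.1 adds query write-back for the table model, black/white lists for access control, bitwise operation functions (built-in scalar functions), and time functions that can be pushed down. It also improves database monitoring, performance, and stability across the board. The release includes:
-
-* Query module: Supports query write-back in the table model
-* Query module: Row pattern recognition in the table model supports aggregate functions, capturing consecutive data for analysis and computation
-* Query module: Added built-in scalar functions for bitwise operations to the table model
-* Query module: Added the EXTRACT time function, which can be pushed down, to the table model
-* System module: Added access control with user-defined black/white lists
-* Other: The default user password is changed to the stronger "TimechoDB@2021"
-
-### V2.0.5.2
-
-> Release date: 2025.08.08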
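-> Download: Please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.5.2-bin.zip
-> SHA512 checksum: a00a4075c9937b7749c454f71d2480fea5e9ff9659c0628b132e30e2f256c7c537cd91dca4f6be924db0274bb180946a1b88e460c025bf82fdb994a3c2c7b91e
-
-V2.0.5.2 fixes several product defects and optimizes data synchronization. It also improves database monitoring, performance, and stability across the board.
-
-### V2.0.5.1
-
-> Release date: 2025.07.14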
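-> Download: Please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.5.1-bin.zip
-> SHA512 checksum: aa724755b659bf89a60da6f2123dfa91fe469d2e330ed9bd029e8f36dd49212f3d83b1025e9da26cb69315e02f65c7e9a93922e40df4f2aa4c7f8da8da2a4cea
-
-V2.0.5.1 adds tree-to-table views, window functions for the table model, and the aggregate function approx\_most\_frequent, and supports LEFT & RIGHT JOIN and ASOF LEFT JOIN. AINode adds two built-in models, Timer-XL and Timer-Sundial, and supports inference and fine-tuning for both the tree and table models. The release also improves database monitoring, performance, and stability across the board. It includes:
-
-* Query module: Supports manually creating tree-to-table views
-* Query module: Added window functions to the table model
-* Query module: Added the aggregate function approx\_most\_frequent to the table model
-* Query module: Extended JOIN in the table model to support LEFT & RIGHT JOIN and ASOF LEFT JOIN
-* Query module: The table model supports row pattern recognition, capturing consecutive data for analysis and computation
-* Query module: Added several system tables to the table model, such as VIEWS (table view information) and MODELS (model information)
-* System module: Added encryption for TsFile data files
-* AI module: AINode adds two built-in models, Timer-XL and Timer-Sundial
-* AI module: AINode supports inference and fine-tuning for the tree model and the table model
-* Other modules: Supports publishing data over the OPC DA protocol
-
-### Other 2.x Historical Versions
-
-#### V2.0.4.2
-
-> Release date: 2025.06.21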
-> Download: Please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.4.2-bin.zip
-> SHA512 checksum: 31f26473ac90988ce970dac8d0950671bde918f9af6f2f6a6c2bf99a53aa1c0a459c53a137b18ff0b28e70952e9c4b6acb50029e0b2e38837b969eb8f78f2939
-
-V2.0.4.2 supports passing the TOPIC to custom MQTT plugins. It also improves database monitoring, performance, and stability across the board.
-
-#### V2.0.4.1
-
-> Release date: 2025.06.03
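-> Download: Please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.4.1-bin.zip
-> SHA512 checksum: 93ac08bfae06aff6db04849f474458433026f66778f4f5c402eb22f1a7cb14d8096daf0a9e9cc365ddfefd4f8ca4443b2a9fb6461906f056b1e6a344990beb3a
-
-V2.0.4.1 adds user-defined table functions (UDTF) and several built-in table functions to the table model, adds the aggregate function approx\_count\_distinct, and adds ASOF INNER JOIN on the time column. The script tools have been reorganized by category, with Windows-specific scripts separated out. The release also improves database monitoring, performance, and stability across the board. It includes:
-
-* Query module: Added user-defined table functions (UDTF) and several built-in table functions to the table model
-* Query module: The table model supports ASOF INNER JOIN on the time column
-* Query module: Added the aggregate function approx\_count\_distinct to the table model
-* Stream processing: Supports loading TsFile asynchronously via SQL
-* System module: When scaling in, replica selection supports a disaster-recovery load-balancing strategy
-* System module: Adapted to Windows Server 2025
-* Scripts and tools: Reorganized the script tools by category and separated out the Windows-specific scripts
-
-#### V2.0.3.4
-
-> Release date: 2025.06.13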
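-> Download: Please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.3.4-bin.zip
-> SHA512 checksum: d80d34b7d3890def75b17c491fc4c13efc36153a5950a9b23744755d04d6adb5d6ab9ec970101183fef7bfeb8a559ef92fce90d2d22f7b7fd5795cd5589461bb
-
-V2.0.3.4 changes the encryption algorithm for user passwords to SHA-256. It also improves database monitoring, performance, and stability across the board.
-
-#### V2.0.3.3
-
-> Release date: 2025.05.16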
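-> Download: Please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.3.3-bin.zip
-> SHA512 checksum: f47e3fb45f869dbe690e7cfaa93f95e5e08a462b362aa9d7ccac7ee5b55022dc8f62db12009dfde055f278f3003ff9ea7c22849d52a3ef2c25822f01ade78591
-
-V2.0.3.3 adapts the metadata import/export scripts to the table model, adds Spark ecosystem integration (table model), adds timestamps to AINode results, and adds several aggregate and scalar functions to the table model. It also improves database monitoring, performance, and stability across the board. The release includes:
-
-* Query module: Added the aggregate function count\_if and the scalar functions greatest / least to the table model
-* Query module: Significantly improved the performance of full-table count(\*) queries in the table model
-* AI module: AINode results now include timestamps
-* System module: Optimized the performance of the table model metadata module
-* System module: The table model supports actively listening for and loading TsFiles
-* System module: Added monitoring metrics such as TsFile parsing and conversion time and the number of Tablets converted from TsFiles
-* Ecosystem integration: Extended the table model ecosystem with Spark integration
-* Scripts and tools: The import-schema and export-schema scripts support importing and exporting table model metadata
-
-#### V2.0.3.2
-
-> Release date: 2025.05.15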
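-> Download: Please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.3.2-bin.zip
-> SHA512 checksum: 76bd294de4b01782e5dd621a996aeb448e4581f98c70fb5b72b17dc392c2e1227c0d26bd3df5533669a80f217a83a566bc6ec926b7efd21ce7a89b894cd33e19
-
-V2.0.3.2 fixes several product defects and optimizes node removal. It also improves database monitoring, performance, and stability across the board.
-
-#### V2.0.2.1
-
-> Release date: 2025.04.07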
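-> Download: Please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.2.1-bin.zip
-> SHA512 checksum: a41be3f8c57e6a39ac165f1d6ab92c9ed790b0712528f31662c58617f4c94e6bfc9392a9c1ef2fc5bdd8c7ca79901389f368cbdbec3e5b1d5c1ce155b2f1a457
-
-V2.0.2.1 adds permission management, user management, and authentication of related operations for the table model, as well as table model UDFs, system tables, and nested queries. It also continues to optimize the data subscription mechanism and improves database monitoring, performance, and stability across the board. The release includes:
-
-* Query module: Added UDF management, user-defined scalar functions (UDSF), and user-defined aggregate functions (UDAF) to the table model
-* Query module: A configuration item controls whether UDF, PipePlugin, Trigger, and AINode may load JAR packages via URI
-* Query module: The table model supports permission management, user management, and authentication of related operations
-* Query module: Added system tables and several maintenance statements to improve system management
-* System module: The CSharp client supports the table model
-* System module: Added a C++ Session write interface for the table model
-* System module: Tiered storage supports non-AWS object storage systems that are compatible with the S3 protocol
-* System module: Extended the UDF functions with the pattern\_match pattern matching function
-* Data synchronization: The table model supports metadata synchronization and synchronization of delete operations
-
-#### V2.0.1.2
-
-> Release date: 2025.01.25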
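-> Download: Please contact Timecho staff to obtain the package
-> Package name: timechodb-2.0.1.2-bin.zip
-> SHA512 checksum: 51c2fa5da2974a8a3c8871dec1c49bd98e5d193a13ef33ac7801adb833a1e360d74f0160bcdf33c7ffb23a5c5e0f376e26a4315cf877f1459483356285b85349
-
-V2.0.1.2 officially introduces the dual tree/table model configuration. Alongside the table model, it supports standard SQL query syntax, a range of functions and operators, stream processing, and Benchmark. The release also adds four new data types to the Python client, database deletion in read-only mode, script tools that import and export TsFile, CSV, and SQL data, and ecosystem integration with Kubernetes Operator. It improves database monitoring, performance, and stability across the board. The release includes:
-
-* Time series table model: IoTDB supports the time series table model, with SQL syntax covering the SELECT, WHERE, JOIN, GROUP BY, ORDER BY, and LIMIT clauses and nested queries
-* Query module: The table model supports a range of functions and operators, including logical operators, mathematical functions, and time-series-specific functions such as DIFF
-* Query module: A configuration item controls whether UDF, PipePlugin, Trigger, and AINode may load JAR packages via URI
-* Storage module: The table model supports writing data through the Session interface, which supports automatic metadata creation
-* Storage module: The Python client supports four new data types: `String`, `Blob`, `Date`, and `Timestamp`
-* Storage module: Optimized the priority comparison rules for compaction tasks of the same type
-* Stream processing module: Supports specifying the receiver's authentication information on the sender side
-* Stream processing module: TsFile Load supports the table model
-* Stream processing module: Stream processing plugins are adapted to the table model
-* System module: Improved the stability of DataNode scale-in
-* System module: Users can run drop database in the readonly state
-* Scripts and tools: The Benchmark tool is adapted to the table model
-* Scripts and tools: The Benchmark tool supports four new data types: `String`, `Blob`, `Date`, and `Timestamp`
-* Scripts and tools: Extended the import-data/export-data scripts to support the new data types (string, binary large object, date, timestamp)
-* Scripts and tools: Updated the import-data/export-data scripts to import and export TsFile, CSV, and SQL data
-* Ecosystem integration: Supports Kubernetes Operator
-
-
-### V1.3.7.3
-
-> Release date: 2026.06.02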
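-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.7.3-bin.zip
-> SHA512 checksum: 8e6cde061421a552b9855f39f9cccd4838c820dc15ef0ad2a7c23a54cd6cc4f06c35190c1f428784e6a4d5463dd1b794f58ff5cdf891f27f6d0be4d3ab00bf6f
-
-V1.3.7.3 mainly optimizes the query module and data synchronization and fixes several product defects. It also improves database monitoring, performance, and stability across the board. The release includes:
-
-- Query module: Optimized Last queries, aligned series queries, and queries with descending-order time filters
-- Metadata module: Optimized device-creation validation for activated series and their sub-paths
-- Data synchronization: Optimized the retry mechanism after a synchronization failure
-- Data synchronization: The cross-gateway sync plugin supports configuring the transfer timeout for real-time writes
-- Interface module: Added error code validation to the Go client write interface
-- Interface module: Optimized connection pool management in the C# client
-
-
-### V1.3.7.2
-
-> Release date: 2026.04.07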
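-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.7.2-bin.zip
-> SHA512 checksum: 787766af64992069f0db0ac8b250b461d799307b3ce06b0782fc25752c8c5307fa2205c9e3a38a41685b81bb6b4b5c1ec9f71a395bfad285caf90de7b8224783
-
-V1.3.7.2 mainly optimizes data synchronization and the query module and fixes several product defects. It also improves database monitoring, performance, and stability across the board. The release includes:
-
-- Data synchronization: Optimized Pipe dispatch performance when matching complex paths
-- Query module: The Show Queries statement now reports the client IP, query timeout, and server-side wait time
-- Ecosystem integration: IoTDB can push data to an external OPC Server in OPC Client mode
-
-
-### V1.3.6.6
-
-> Release date: 2026.01.20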
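-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.6.6-bin.zip
-> SHA512 checksum: 590d3ead053298c6df0ede637572ba598b9b684f8b35ab874bd4452f765e1421938f4cca2cf0423af2e806592aa8b15bdd25b41df7de809435a4d0239fc04790
-
-V1.3.6.6 optimizes data reads and writes and fixes several product defects. It also improves database monitoring, performance, and stability across the board.
-
-### V1.3.6.3
-
-> Release date: 2026.01.04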
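-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.6.3-bin.zip
-> SHA512 checksum: 43719a1384f59f63cb0029cdda0aba433383cd1a0f5ebc142e54f8aa6623cc30a7efb3e3aef7f3d485d5e07bec91be215c92ed21b5201613d5cc44044251c978
-
-V1.3.6.3 focuses on two core areas, query performance and memory management, and improves database monitoring, performance, and stability across the board. The release includes:
-
-* Query module: Optimized query performance in several scenarios, including multi-series Last queries
-* Query module: Added the FastLastQuery interface to the Java SDK for more efficient Last queries
-* Query module: fetchSchema in the tree model now returns results as a segmented stream, improving response time for large data volumes
-* Storage module: Optimized memory management to avoid memory leak risks and keep the system stable over long runs
-* Storage module: Optimized the file compaction mechanism to improve compaction efficiency and reduce storage usage
-* Other: Fixed security vulnerabilities CVE-2025-12183, CVE-2025-66566, and CVE-2025-11226
-
-
-### V1.3.6.1
-
-> Release date: 2025.12.09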
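-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.6.1-bin.zip
-> SHA512 checksum: 9fb6a6870aa2133bfc40508324a7d97ee078d0d44895beef7b0a331edd203419119fb02b933f585b6c4a6fe9b59708a053d7cf65206b22b1a4f01a5fe518424c
-
-V1.3.6.1 focuses on the stability of data synchronization and improves database monitoring, performance, and stability across the board. The release includes:
-
-* Data synchronization: Optimized Pipe SQL parameter configuration; an asynchronous loading mode can be specified
-* Data synchronization: Added syntactic sugar that automatically splits the SQL for creating a full Pipe into a real-time synchronization Pipe and a historical synchronization Pipe
-* System module: Added global configuration items for the compression method of each data type, so the storage compression strategy can be adjusted as needed
-
-
-### V1.3.5.11
-
-> Release date: 2025.09.24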
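-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.11-bin.zip
-> SHA512 checksum: f18419e20c0d7e9316febee5a053306a97268cb07e18e6933716c2ef98520fbbe051dfa1da02a9c83e8481a839ce35525ce6c50f890f821e3d760f550c75f804
-
-V1.3.5.11 mainly optimizes data synchronization and fixes several product defects. It also improves database monitoring, performance, and stability across the board.
-
-### V1.3.5.10
-
-> Release date: 2025.08.27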
-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.10-bin.zip
-> SHA512 checksum: 3aea6d2318f52b39bfb86dae9ff06fe1b719fdeceaabb39278c9a73544e1ceaf0660339f9342abb888c8281a0fb6144179dac9bb0c40ba0ecc66bac4dd7cbe80
-
-V1.3.5.10 fixes several product defects. It also improves database monitoring, performance, and stability across the board.
-
-### V1.3.5.9
-
-> Release date: 2025.08.25
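-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.9-bin.zip
-> SHA512 checksum: 95b7a6790e94dc88e355a81e5a54b10ee87bdadae69ba0b215273967b3422178d5ee81fa5adf1c5380a67dbb30cf9782eaa3cbfd6ec744b0fd9a91c983ee8f70
-
-V1.3.5.9 optimizes memory control and fixes several product defects. It also improves database monitoring, performance, and stability across the board.
-
-### Other 1.x Historical Versions
-
-#### V1.3.5.8
-
-> Release date: 2025.08.19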
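-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.8-bin.zip
-> SHA512 checksum: aa9802301614e20294a7f2fc4c149ba20d58213d9b74e8f8c607e0f4860949bad164bce2851b63c1d39b7568d62975ab257c269b3a9c168a29ea3945b6d28982
-
-V1.3.5.8 optimizes data synchronization and fixes several product defects. It also improves database monitoring, performance, and stability across the board.
-
-#### V1.3.5.7
-
-> Release date: 2025.08.13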
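-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.7-bin.zip
-> SHA512 checksum: 17374a440267aed3507dcc8cf4dc8703f8136d5af30d16206a6e1101e378cbbc50eda340b1598a12df35fe87d96db20f7802f0e64033a013d4b81499198663d4
-
-V1.3.5.7 optimizes data synchronization and fixes several product defects. It also improves database monitoring, performance, and stability across the board.
-
-#### V1.3.5.6
-
-> Release date: 2025.07.16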
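-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.6-bin.zip
-> SHA512 checksum: 05b9fda4d98ba8a1c9313c0831362ed3d667ce07cb00acaeabcf6441a6d67dff7da27f3fda2a5e1b3c3b85d1e5c730a534f3aa2f0c731b8c03ef447203b32493
-
-V1.3.5.6 adds a configuration switch for disabling data subscription. It optimizes the C++ high-availability client, PIPE synchronization latency in three scenarios (normal operation, restart, and deletion), and queries involving large TEXT objects. It also improves database monitoring, performance, and stability across the board.
-
-#### V1.3.5.4
-
-> Release date: 2025.06.19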
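-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.4-bin.zip
-> SHA512 checksum: edac5f8b70dd67b3f84d3e693dc025a10b41565143afa15fc0c4937f8207479ffe2da787cc9384440262b1b05748c23411373c08606c6e354ea3dcdba0371778
-
-V1.3.5.4 fixes several product defects and optimizes node removal. It also improves database monitoring, performance, and stability across the board.
-
-#### V1.3.5.3
-
-> Release date: 2025.06.13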
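-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.3-bin.zip
-> SHA512 checksum: 5f807322ceec9e63a6be86108cc57e7ad4251b99a6c28baf11256ab65b2145768e9110409f89834d5f4256094a8ad995775c0e59a17224ff2627cd9354e09d82
-
-V1.3.5.3 mainly optimizes data synchronization: it persists PIPE sending progress, adds a monitoring item for PIPE event transfer time, and fixes related defects. It also changes the encryption algorithm for user passwords to SHA-256 and improves database monitoring, performance, and stability across the board.
-
-#### V1.3.5.2
-
-> Release date: 2025.06.10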
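-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.2-bin.zip
-> SHA512 checksum: 4c0a5db76c6045dfd27cce303546155cdb402318024dae5f999f596000d7b038b13bbeac39068331b5c6e2c80bc1d89cd346dd0be566fe2fe865007d441d9d05
-
-V1.3.5.2 mainly optimizes data synchronization: it supports cascading configuration via parameters, keeps the synchronization order fully consistent with the real-time write order, and sends historical and real-time data in separate partitions after a system restart. It also improves database monitoring, performance, and stability across the board.
-
-#### V1.3.5.1
-
-> Release date: 2025.05.15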
-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.5.1-bin.zip
-> SHA512 checksum: 91f22bafbdd4d580126ed59ba1ba99d14209f10ce4a0a4bd7d731943ac99fdb6ebfab6e3a1e294a7cb7f46367e9fd4252b0d9ac4d4240ddedf6d85658e48f212
-
-V1.3.5.1 fixes several product defects. It also improves database monitoring, performance, and stability across the board.
-
-#### V1.3.4.2
-
-> Release date: 2025.04.14
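-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.4.2-bin.zip
-> SHA512 checksum: 52fbd79f5e7256e7d04edc8f640bb8d918e837fedd1e64642beb2b2b25e3525b5f5a4c92235f88f6f7b59bfcdf096e4ea52ab85bfef0b69274334470017a2c5b2
-
-V1.3.4.2 optimizes data synchronization: dual-active instances can now synchronize data that was forwarded to them by an external PIPE.
-
-
-#### V1.3.4.1
-
-> Release date: 2025.01.08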
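-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.4.1-bin.zip
-> SHA512 checksum: e9d46516f1f25732a93cc915041a8e59bca77cf8a1018c89d18ed29598540c9f2bdf1ffae9029c87425cecd9ecb5ebebea0334c7e23af11e28d78621d4a78148
-
-V1.3.4.1 adds a pattern matching function, continues to optimize the data subscription mechanism for better stability, extends the import-data/export-data scripts to support new data types, and merges the import-data/export-data scripts so that they import and export TsFile, CSV, and SQL data. It also improves database monitoring, performance, and stability across the board. The release includes:
-
-- Query module: A configuration item controls whether UDF, PipePlugin, Trigger, and AINode may load JAR packages via URI
-- System module: Extended the UDF functions with the pattern_match pattern matching function
-- Data synchronization: Supports specifying the receiver's authentication information on the sender side
-- Ecosystem integration: Supports Kubernetes Operator
-- Scripts and tools: Extended the import-data/export-data scripts to support the new data types (string, binary large object, date, timestamp)
-- Scripts and tools: Updated the import-data/export-data scripts to import and export TsFile, CSV, and SQL data
-
-#### V1.3.3.3
-
-> Release date: 2024.10.31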
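-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.3.3-bin.zip
-> SHA512 checksum: 4a3eceda479db3980e9c8058628e71ba5a16fbfccf70894e8181aea5e014c7b89988d0093f6d42df29d478340a33878602a3924bec13f442a48611cec4e0e961
-
-V1.3.3.3 optimizes restart recovery to reduce startup time, lets DataNode actively listen for and load TsFiles (with added observability metrics), lets the receiver automatically load files into IoTDB after the sender transfers them to a specified directory, and adds Alter Source support to Alter Pipe. It also improves database monitoring, performance, and stability across the board. The release includes:
-
-- Data synchronization: The receiver automatically converts inconsistent data types
-- Data synchronization: Improved receiver observability, with ops/latency statistics for several internal interfaces
-- Data synchronization: The opc-ua-sink plugin supports CS mode access and non-anonymous access
-- Data subscription: The SDK supports the create if not exists and drop if exists interfaces
-- Stream processing: Alter Pipe supports Alter Source
-- System module: Added latency monitoring for the rest module
-- Scripts and tools: Supports automatically loading TsFile files from a specified directory
-- Scripts and tools: Extended the import-tsfile script so that it can run on a different server from the IoTDB server
-- Scripts and tools: Added support for Kubernetes Helm
-- Scripts and tools: The Python client supports the new data types (string, binary large object, date, timestamp)
-
-#### V1.3.3.2
-
-> Release date: 2024.8.15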
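-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.3.2-bin.zip
-> SHA512 checksum: 32733610da40aa965e5e9263a869d6e315c5673feaefad43b61749afcf534926398209d9ca7fff866c09deb92c09d950c583cea84be5a6aa2c315e1c7e8cfb74
-
-V1.3.3.2 reports the time spent reading mods files, the maximum memory used for merge-sorting sequential and out-of-order data, and the dispatch time. It allows the time partition origin to be adjusted through a parameter, ends a subscription automatically based on the pipe's end-of-historical-data marker, and improves memory control performance in the compaction module. The release includes:
-
-- Query module: Explain Analyze reports the time spent reading mods files
-- Query module: Explain Analyze reports the maximum memory used for merge-sorting sequential and out-of-order data, and the dispatch time
-- Storage module: Added splitting of compaction target files, with new configuration file parameters
-- System module: The time partition origin can be adjusted through a parameter
-- Stream processing: Data subscription ends automatically based on the pipe's end-of-historical-data marker
-- Data synchronization: RPC compression supports specifying a compression level
-- Scripts and tools: Data/metadata export filters out only root.__system; paths that merely begin with it, such as root.__systema, are not filtered
-
-#### V1.3.3.1
-
-> Release date: 2024.7.12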
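-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.3.1-bin.zip
-> SHA512 checksum: 1fdffbc1f18bfabfa3463a5a6fbc4f6ba6ab686942f9e85e7e6be1840fb8700e0147e5e73fd52201656ae6adb572cc2e5ecc61bcad6fa4c5a4048c4207e3c6c0
-
-V1.3.3.1 adds a rate-limiting mechanism to tiered storage and lets the sender-side sink in data synchronization specify username/password authentication for the receiver. It clarifies some ambiguous WARN logs on the data synchronization receiver, optimizes restart recovery to reduce startup time, and merges the configuration files. The release includes:
-
-- Query module: Optimized Filter performance, speeding up aggregate queries and queries with where conditions
-- Query module: The Java Session client distributes SQL query requests evenly across all nodes
-- System module: Merged the configuration files "iotdb-confignode.properties", "iotdb-datanode.properties", and "iotdb-common.properties" into "iotdb-system.properties"
-- Storage module: Added a rate-limiting mechanism to tiered storage
-- Data synchronization: The sender-side sink can specify username/password authentication for the receiver
-- System module: Optimized restart recovery to reduce startup time
-
-#### V1.3.2.2
-
-> Release date: 2024.6.4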
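-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.2.2-bin.zip
-> SHA512 checksum: ad73212a0b5025d18d2481163f6b2d4f604e06eb5e391cc6cba7bf4e42792e115b527ed8bfb5cd95d20a150645c8b4d56a531889dac229ce0f63139a27267322
-
-V1.3.2.2 adds the explain analyze statement for analyzing the time spent by a single SQL query, a UDAF (user-defined aggregate function) framework, automatic data deletion when disk usage reaches a configured threshold, metadata synchronization, counting of data points under a specified path, and scripts for importing and exporting SQL statements. The cluster management tool supports rolling upgrades and uploading plugins to the whole cluster. The release also improves database monitoring, performance, and stability across the board. It includes:
-
-- Storage module: Improved the write performance of the insertRecords interface
-- Storage module: Added SpaceTL, which automatically deletes data when disk usage reaches a configured threshold
-- Query module: Added the Explain Analyze statement, which monitors the time spent in each stage of a single SQL statement
-- Query module: Added a UDAF (user-defined aggregate function) framework
-- Query module: Added envelope demodulation analysis to UDFs
-- Query module: Added the MaxBy/MinBy functions, which return the maximum/minimum value together with its timestamp
-- Query module: Improved the performance of value-filter queries
-- Data synchronization: Path matching supports wildcards
-- Data synchronization: Supports metadata synchronization, including time series and their attributes, permissions, and other settings
-- Stream processing: Added the Alter Pipe statement, which hot-updates the plugins of a Pipe task
-- System module: System data point statistics now count data imported via load TsFile
-- Scripts and tools: Added a local upgrade backup tool, which backs up existing data using hard links
-- Scripts and tools: Added the export-data/import-data scripts, which export data as CSV, TsFile, or SQL statements
-- Scripts and tools: On Windows, ConfigNode, DataNode, and Cli can be told apart by window title
-
-#### V1.3.1.4
-
-> Release date: 2024.4.23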
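-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.1.4-bin.zip
-> SHA512 checksum: 8547702061d52e2707c750a624730eb2d9b605b60661efa3c8f11611ca1685aeb51b6f8a93f94c1b30bf2e8764139489c9fbb76cf598cfa8bf9c874b2a7c57eb
-
-V1.3.1.4 adds viewing of system activation status, built-in variance and standard deviation aggregate functions, a timeout setting for the built-in Fill clause, and a tsfile repair command. It adds scripts for collecting instance information and for starting and stopping the cluster in one step, and optimizes views and stream processing to improve usability and performance. The release includes:
-
-- Query module: The Fill clause supports a fill timeout threshold; values beyond the threshold are not filled
-- Query module: The Rest interface (V2) returns column types
-- Data synchronization: Simplified how the time range is specified; start and end times are set directly
-- Data synchronization: Supports the SSL transport protocol (iotdb-thrift-ssl-sink plugin)
-- System module: Cluster activation information can be queried with SQL
-- System module: Tiered storage adds transfer rate control during migration
-- System module: Improved system observability (divergence monitoring of cluster nodes and observability of the distributed task scheduling framework)
-- System module: Optimized the default log output policy
-- Scripts and tools: Added scripts for starting and stopping the cluster in one step (start-all/stop-all.sh & start-all/stop-all.bat)
-- Scripts and tools: Added scripts for collecting instance information in one step (collect-info.sh & collect-info.bat)
-
-#### V1.3.0.4
-
-> Release date: 2024.1.3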
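-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.3.0.4-bin.zip
-> SHA512 checksum: 3c07798f37c07e776e5cd24f758e8aaa563a2aae0fb820dad5ebf565ad8a76c765b896d44e7fdb7dad2e46ffd4262af901c765f9bf6af926bc62103118e38951
-
-V1.3.0.4 introduces AINode, a new native machine learning framework, and upgrades the permission module to support granting permissions at time series granularity. It refines many details of views and stream processing, further improving usability, stability, and performance. The release includes:
-
-- Query module: Added the native AINode machine learning module
-- Query module: Fixed the slow response of the show path statement
-- Security module: Upgraded the permission module to support permissions at time series granularity
-- Security module: Supports SSL-encrypted communication between client and server
-- Stream processing: Added several metrics monitoring items to the stream processing module
-- Query module: Non-writable view series support LAST queries
-- System module: Improved the accuracy of data point monitoring statistics
-
-#### V1.2.0.1
-
-> Release date: 2023.6.30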
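-> Download: Please contact Timecho staff to obtain the package
-> Package name: iotdb-enterprise-1.2.0.1-bin.zip
-> SHA512 checksum: dcf910d0c047d148a6c52fa9ee03a4d6bc3ff2a102dc31c0864695a25268ae933a274b093e5f3121689063544d7c6b3b635e5e87ae6408072e8705b3c4e20bf0
-
-V1.2.0.1 mainly adds a stream processing framework, dynamic templates, and built-in query functions such as substring/replace/round. It enhances built-in statements such as show region, show timeseries, and show variable, as well as the Session interface. It also optimizes the built-in monitoring items and their implementation and fixes several product bugs and performance issues.
-
-- Stream processing: Added the stream processing framework
-- Metadata module: Added dynamic template extension
-- Storage module: Added the SPRINTZ and RLBE encodings and the LZMA2 compression algorithm
-- Query module: Added the built-in scalar functions cast, round, substr, and replace
-- Query module: Added the built-in aggregate functions time_duration and mode
-- Query module: SQL statements support the case when syntax
-- Query module: SQL statements support order by expressions
-- Interface module: The Python API supports connecting to multiple nodes of a distributed cluster
-- Interface module: The Python client supports write redirection
-- Interface module: The Session API adds an interface for creating series in batches from a template
-
-#### V1.1.0.1
-
-> Release date: 2023-04-03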
-> 下载地址:请联系天谋工作人员进行下载
-> 安装包名称:iotdb-enterprise-1.1.0.1.zip
-> SHA512 校验码:58df58fc8b11afeec8436678842210ec092ac32f6308656d5356b7819acc199f1aec4b531635976b091b61d6736f0d9706badcabeaa5de50939e5c331c1dc804 - -V1.1.0.1主要改进增加了部分新特性,如支持 GROUP BY VARIATION、GROUP BY CONDITION 等分段方式、增加 DIFF、COUNT_IF 等实用函数,引入 pipeline 执行引擎进一步提升查询速度等。同时修复对齐序列 last 查询 order by timeseries、LIMIT&OFFSET 不生效、重启后元数据模版错误、删除所有 database 后创建序列错误等相关问题。 - -- 查询模块:align by device 语句支持 order by time -- 查询模块:支持 Show Queries 命令 -- 查询模块:支持 kill query 命令 -- 系统模块:show regions 支持指定特定的 database -- 系统模块:新增 SQL show variables, 可以展示当前集群参数 -- 查询模块:聚合查询支持 GROUP BY VARIATION -- 查询模块:SELECT INTO 支持特定的数据类型强转 -- 查询模块:实现内置标量函数 DIFF -- 系统模块:show regions 显示创建时间 -- 查询模块:实现内置聚合函数 COUNT_IF -- 查询模块:聚合查询支持 GROUP BY CONDITION -- 系统模块:支持修改 dn_rpc_port 和 dn_rpc_address - - -## 2. Workbench(控制台工具) - -| **控制台版本号** | **版本说明** | **可支持IoTDB版本** | **SHA512 校验码** | -| ---------------- | ------------------------------------------------------------ | ------------------------- | ------------------------------------------------------------ | -| V2.1.1 | 优化趋势界面测点选择,支持无设备场景 | V2.0 及以上版本 | aa05fd4d9f33f07c0949bc2d6546bb4b9791ed5ea94bcef27e2bf51ea141ec0206f1c12466aced7bf3449e11ad68d65378d697f3d10cb4881024a83746029a65 | -| V2.0.1-beta | V2.x系列首个版本,支持树、表双模型 | V2.0 及以上版本 | 0ca0d5029874ed8ada9c7d1cb562370b3a46913eed66d39c08759287ccc8bf332cf80bb8861e788614b61ae5d53a9f5605f553e1a607e856f395eb5102e7cc4d | -| V1.5.7 | 优化测点列表中测点名称拆分为设备名称和测点,测点选择区域支持左右滚动,以及导出文件列顺序与页面保持一致 | V1.3.4及以上的1.x系列版本 | d3cd4a63372ca5d6217b67dddf661980c6a442b3b1564235e9ad34fc254d681febd58c2cc59c6273ffbfd8a1b003b9adb130ecfaaebe1942003b0d07427b1fcc | -| V1.5.6 | 优化 CSV 格式导入导出功能:导入时,支持标签、别名为非必填项;导出时,支持测点描述里反引号包裹引号的场景 | V1.3.4及以上的1.x系列版本 | 276ac1ea341f468bf6d29489c9109e9aa61afe2d1caaab577bc40603c6f4120efccc36b65a58a29ce6a266c21b46837aad6128f84ba5e676231ea9e6284a35e5 | -| V1.5.5 | 新增服务器时钟,支持企业版激活数据库 | V1.3.4及以上的1.x系列版本 | b18d01b70908d503a25866d1cc69d14e024d5b10ca6fcc536932fdbef8257c66e53204663ce3be5548479911aca238645be79dfd7ee7e65a07ab3c0f68c497f6 
| -| V1.5.4 | 新增实例管理中prometheus设置的认证功能 | V1.3.4及以上的1.x系列版本 | adc7e13576913f9e43a9671fed02911983888da57be98ec8fbbb2593600d310f69619d32b22b569520c88e29f100d7ccae995b20eba757dbb1b2825655719335 | -| V1.5.1 | 新增AI分析功能以及模式匹配功能 | V1.3.2及以上的1.x系列版本 | 4f2053a2a3b2b255ce195268d6cd245278f3be32ba4cf68be1552c386d78ed4424f7bdc9d8e68c6b8260b3e398c8fd23ff342439c4e88e1e777c62640d2279f9 | -| V1.4.0 | 新增树模型展示及英文版 | V1.3.2及以上的1.x系列版本 | 734077f3bb5e1719d20b319d8b554ce30718c935cb0451e02b2c9267ff770e9c2d63b958222f314f16c2e6e62bf78b643255249b574ee6f37d00e123433981e8 | -| V1.3.1 | 分析功能新增分析方式,优化导入模版等功能 | V1.3.2及以上的1.x系列版本 | 134f87101cc7f159f8a22ac976ad2a3a295c5435058ee0a15160892aac46ac61dd3cfb0633b4aea9cc7415bf904d0ae65aaf77d663f027d864204d81fb34768b | -| V1.3.0 | 新增数据库配置功能,优化部分版本细节 | V1.3.2及以上的1.x系列版本 | 94a137fc5c681b211f3e076472a9c5875d59e7f0cd6d7409cb8f66bb9e4f87577a0f12dd500e2bcb99a435860c82183e4a6514b638bcb4aecfb48f184730f3f1 | -| V1.2.6 | 优化各模块权限控制功能 | V1.3.1及以上的1.x系列版本 | f345b7edcbe245a561cb94ec2e4f4d40731fe205f134acadf5e391e5874c5c2477d9f75f15dbaf36c3a7cb6506823ac6fbc2a0ccce484b7c4cc71ec0fbdd9901 | -| V1.2.5 | 可视化功能新增“常用模版”概念,所有界面优化补充页面缓存等功能 | V1.3.0及以上的1.x系列版本 | 37376b6cfbef7df8496e255fc33627de01bd68f636e50b573ed3940906b6f3da1e8e8b25260262293b8589718f5a72180fa15e5823437bf6dc51ed7da0c583f7 | -| V1.2.4 | 计算功能新增“导入、导出”功能,测点列表新增“时间对齐”字段 | V1.2.2及以上的1.x系列版本 | 061ad1add38c109c1a90b06f1ddb7797bd45e84a34a4f77154ee48b90bdc7ecccc1e25eaa53fbbc98170d99facca93e3536192dd8d10a50ce505f59923ce6186 | -| V1.2.3 | 首页新增“激活详情”,新增分析等功能 | V1.2.2及以上的1.x系列版本 | 254f5b7451300f6f99937d27fd7a5b20847d5293f53e0eaf045ac9235c7ea011785716b800014645ed5d2161078b37e1d04f3c59589c976614fb801c4da982e1 | -| V1.2.2 | 优化“测点描述”展示内容等功能 | V1.2.2及以上的1.x系列版本 | 062e520d010082be852d6db0e2a3aa6de594eb26aeb608da28a212726e378cd4ea30fca5e1d2c3231ebd8de29e94ca9641f1fabc1cea46acfb650c37b7681b4e | -| V1.2.1 | 数据同步界面新增“监控面板”,优化Prometheus提示信息 | V1.2.2及以上的1.x系列版本 | 
8a3bcf87982ad5004528829b121f2d3945429deb77069917a42a8c8d2e2e2a2c24a398aaa87003920eeacc0c692f1ed39eac52a696887aa085cce011f0ddd745 | -| V1.2.0 | 全新Workbench版本升级 | V1.2.0及以上的1.x系列版本 | ea1f7d3a4c0c6476a195479e69bbd3b3a2da08b5b2bb70b0a4aba988a28b5db5a209d4e2c697eb8095dfdf130e29f61f2ddf58c5b51d002c8d4c65cfc13106b3 | diff --git a/src/zh/UserGuide/latest/QuickStart/QuickStart_timecho.md b/src/zh/UserGuide/latest/QuickStart/QuickStart_timecho.md deleted file mode 100644 index 8445cc339..000000000 --- a/src/zh/UserGuide/latest/QuickStart/QuickStart_timecho.md +++ /dev/null @@ -1,109 +0,0 @@ - - -# 快速上手 - -本篇文档将帮助您了解快速入门 IoTDB 的方法。 - -## 1. 如何安装部署? - -本篇文档将帮助您快速安装部署 IoTDB,您可以通过以下文档的链接快速定位到所需要查看的内容: - -1. 准备所需机器资源:IoTDB 的部署和运行需要考虑多个方面的机器资源配置。具体资源配置可查看 [资源规划](../Deployment-and-Maintenance/Database-Resources_timecho.md) - -2. 完成系统配置准备:IoTDB 的系统配置涉及多个方面,关键的系统配置介绍可查看 [系统配置](../Deployment-and-Maintenance/Environment-Requirements.md) - -3. 获取安装包:您可以联系天谋商务获取 IoTDB 安装包,以确保下载的是最新且稳定的版本。具体安装包结构可查看:[安装包获取](../Deployment-and-Maintenance/IoTDB-Package_timecho.md) - -4. 安装数据库并激活:您可以根据实际部署架构选择以下教程进行安装部署: - - - 单机版:[单机版](../Deployment-and-Maintenance/Stand-Alone-Deployment_timecho.md) - - - 分布式(集群)版:[分布式(集群)版](../Deployment-and-Maintenance//Cluster-Deployment_timecho.md) - - - 双活版:[双活版](../Deployment-and-Maintenance/Dual-Active-Deployment_timecho.md) - -> ❗️注意:目前我们仍然推荐直接在物理机/虚拟机上安装部署,如需要 docker 部署,可参考:[Docker 部署](../Deployment-and-Maintenance/Docker-Deployment_timecho.md) - -5. 安装数据库配套工具:企业版数据库提供监控面板、可视化控制台等配套工具,建议在部署企业版时安装,可以帮助您更加便捷的使用 IoTDB: - - - 监控面板:提供了上百个数据库监控指标,用来对 IoTDB 及其所在操作系统进行细致监控,从而进行系统优化、性能优化、发现瓶颈等,安装步骤可查看 [监控面板部署](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - 可视化控制台:是 IoTDB 的可视化界面,支持通过界面交互的形式提供元数据管理、数据查询、数据可视化等功能的操作,帮助用户简单、高效的使用数据库,安装步骤可查看 [可视化控制台部署](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - -## 2. 如何使用? - -1. 
数据库建模设计:数据库建模是创建数据库系统的重要步骤,它涉及到设计数据的结构和关系,以确保数据的组织方式能够满足特定应用的需求,下面的文档将会帮助您快速了解 IoTDB 的建模设计: - - - 时序概念介绍:[走进时序数据](../Background-knowledge/Navigating_Time_Series_Data_timecho.md) - - - 建模设计介绍:[数据模型介绍](../Background-knowledge/Data-Model-and-Terminology_timecho.md) - - - SQL 语法介绍:[SQL 语法介绍](../Basic-Concept/Operate-Metadata_timecho.md) - -2. 数据写入:在数据写入方面,IoTDB 提供了多种方式来插入实时数据,基本的数据写入操作请查看 [数据写入](../Basic-Concept/Write-Data_timecho.md) - -3. 数据查询:IoTDB 提供了丰富的数据查询功能,数据查询的基本介绍请查看 [数据查询](../Basic-Concept/Query-Data_timecho.md) - -4. 其他进阶功能:除了数据库常见的写入、查询等功能外,IoTDB 还支持“数据同步、流处理框架、安全控制、权限管理、AI 分析”等功能,具体使用方法可参见具体文档: - - - 数据同步:[数据同步](../User-Manual/Data-Sync_timecho.md) - - - 流处理框架:[流处理框架](../User-Manual/Streaming_timecho.md) - - - 安全控制:[安全控制](../User-Manual/Black-White-List_timecho.md) - - - 权限管理:[权限管理](../User-Manual/Authority-Management_timecho.md) - - - AI 分析:[AI 能力](../AI-capability/AINode_timecho.md) - -5. 应用编程接口: IoTDB 提供了多种应用编程接口(API),以便于开发者在应用程序中与 IoTDB 进行交互,目前支持[ Java 原生接口](../API/Programming-Java-Native-API_timecho.md)、[Python 原生接口](../API/Programming-Python-Native-API_timecho.md)、[C++原生接口](../API/Programming-Cpp-Native-API.md)、[Go 原生接口](../API/Programming-Go-Native-API.md)等,更多编程接口可参见官网【应用编程接口】其他章节 - -## 3. 还有哪些便捷的周边工具? 
- -IoTDB 除了自身拥有丰富的功能外,其周边的工具体系包含的种类十分齐全。本篇文档将帮助您快速使用周边工具体系: - - - 可视化控制台:workbench 是 IoTDB 的一个支持界面交互的形式的可视化界面,提供直观的元数据管理、数据查询和数据可视化等功能,提升用户操作数据库的便捷性和效率,具体使用介绍请查看 [可视化控制台部署](../Deployment-and-Maintenance/workbench-deployment_timecho.md) - - - 监控面板:是一个对 IoTDB 及其所在操作系统进行细致监控的工具,涵盖数据库性能、系统资源等上百个数据库监控指标,助力系统优化与瓶颈识别等,具体使用介绍请查看 [监控面板部署](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) - - - 测试工具:IoT-benchmark 是一个基于 Java 和大数据环境开发的时序数据库基准测试工具,由清华大学软件学院研发并开源。它支持多种写入和查询方式,能够存储测试信息和结果供进一步查询或分析,并支持与 Tableau 集成以可视化测试结果。具体使用介绍请查看:[测试工具](../Tools-System/Benchmark.md) - - - 数据导入脚本:针对于不同场景,IoTDB 为用户提供多种批量导入数据的操作方式,具体使用介绍请查看:[数据导入](../Tools-System/Data-Import-Tool_timecho.md) - - - - 数据导出脚本:针对于不同场景,IoTDB 为用户提供多种批量导出数据的操作方式,具体使用介绍请查看:[数据导出](../Tools-System/Data-Export-Tool_timecho.md) - - -## 4. 想了解更多技术细节? - -如果您想了解 IoTDB 的更多技术内幕,可以移步至下面的文档: - - - 研究论文:IoTDB 具有列式存储、数据编码、预计算和索引技术,以及其类 SQL 接口和高性能数据处理能力,同时与 Apache Hadoop、MapReduce 和 Apache Spark 无缝集成。相关研究论文请查看 [研究论文](../Technical-Insider/Publication.md) - - - 压缩&编码:IoTDB 通过多样化的编码和压缩技术,针对不同数据类型优化存储效率,想了解更多请查看 [压缩&编码](../Technical-Insider/Encoding-and-Compression.md) - - - 数据分区和负载均衡:IoTDB 基于时序数据特性,精心设计了数据分区策略和负载均衡算法,提升了集群的可用性和性能,想了解更多请查看 [数据分区和负载均衡](../Technical-Insider/Cluster-data-partitioning.md) - - -## 5. 使用过程中遇到问题? - -如果您在安装或使用过程中遇到困难,可以移步至 [常见问题](../FAQ/Frequently-asked-questions.md) 中进行查看 \ No newline at end of file diff --git a/src/zh/UserGuide/latest/Reference/DataNode-Config-Manual_timecho.md b/src/zh/UserGuide/latest/Reference/DataNode-Config-Manual_timecho.md deleted file mode 100644 index 8f0a3fc61..000000000 --- a/src/zh/UserGuide/latest/Reference/DataNode-Config-Manual_timecho.md +++ /dev/null @@ -1,625 +0,0 @@ - - -# DataNode 配置参数 - -IoTDB DataNode 与 Standalone 模式共用一套配置文件,均位于 IoTDB 安装目录:`conf`文件夹下。 - -* `datanode-env.sh/bat`:环境配置项的配置文件,可以配置 DataNode 的内存大小。 - -* `iotdb-system.properties`:IoTDB 的配置文件。 - -## 1. 
热修改配置项 - -为方便用户使用,IoTDB 为用户提供了热修改功能,即在系统运行过程中修改 `iotdb-system.properties` 中部分配置参数并即时应用到系统中。下面介绍的参数中,改后 生效方式为`热加载` -的均为支持热修改的配置参数。 - -通过 Session 或 Cli 发送 ```load configuration``` 或 `set configuration` 命令(SQL)至 IoTDB 可触发配置热加载。 - -## 2. 环境配置项(datanode-env.sh/bat) - -环境配置项主要用于对 DataNode 运行的 Java 环境相关参数进行配置,如 JVM 相关配置。DataNode/Standalone 启动时,此部分配置会被传给 JVM,详细配置项说明如下: - -* MEMORY\_SIZE - -|名字|MEMORY\_SIZE| -|:---:|:---| -|描述|IoTDB DataNode 启动时分配的内存大小 | -|类型|String| -|默认值|取决于操作系统和机器配置。默认为机器内存的二分之一。| -|改后生效方式|重启服务生效| - -* ON\_HEAP\_MEMORY - -|名字|ON\_HEAP\_MEMORY| -|:---:|:---| -|描述|IoTDB DataNode 能使用的堆内内存大小, 曾用名: MAX\_HEAP\_SIZE | -|类型|String| -|默认值|取决于MEMORY\_SIZE的配置。| -|改后生效方式|重启服务生效| - -* OFF\_HEAP\_MEMORY - -|名字|OFF\_HEAP\_MEMORY| -|:---:|:---| -|描述|IoTDB DataNode 能使用的堆外内存大小, 曾用名: MAX\_DIRECT\_MEMORY\_SIZE | -|类型|String| -|默认值|取决于MEMORY\_SIZE的配置| -|改后生效方式|重启服务生效| - -* JMX\_LOCAL - -|名字|JMX\_LOCAL| -|:---:|:---| -|描述|JMX 监控模式,配置为 true 表示仅允许本地监控,设置为 false 的时候表示允许远程监控。如想在本地通过网络连接JMX Service,比如nodeTool.sh会尝试连接127.0.0.1:31999,请将JMX_LOCAL设置为false。| -|类型|枚举 String : “true”, “false”| -|默认值|true| -|改后生效方式|重启服务生效| - -* JMX\_PORT - -|名字|JMX\_PORT| -|:---:|:---| -|描述|JMX 监听端口。请确认该端口是不是系统保留端口并且未被占用。| -|类型|Short Int: [0,65535]| -|默认值|31999| -|改后生效方式|重启服务生效| - -## 3. 
系统配置项(iotdb-system.properties) - -系统配置项是 IoTDB DataNode/Standalone 运行的核心配置,它主要用于设置 DataNode/Standalone 数据库引擎的参数。 - -### 3.1 Data Node RPC 服务配置 - -* dn\_rpc\_address - -|名字| dn\_rpc\_address | -|:---:|:-----------------| -|描述| 客户端 RPC 服务监听地址 | -|类型| String | -|默认值| 127.0.0.1 | -|改后生效方式| 重启服务生效 | - -* dn\_rpc\_port - -|名字| dn\_rpc\_port | -|:---:|:---| -|描述| Client RPC 服务监听端口| -|类型| Short Int : [0,65535] | -|默认值| 6667 | -|改后生效方式|重启服务生效| - -* dn\_internal\_address - -|名字| dn\_internal\_address | -|:---:|:---| -|描述| DataNode 内网通信地址 | -|类型| string | -|默认值| 127.0.0.1 | -|改后生效方式|仅允许在第一次启动服务前修改| - -* dn\_internal\_port - -|名字| dn\_internal\_port | -|:---:|:-------------------| -|描述| DataNode 内网通信端口 | -|类型| int | -|默认值| 10730 | -|改后生效方式| 仅允许在第一次启动服务前修改 | - -* dn\_mpp\_data\_exchange\_port - -|名字| dn\_mpp\_data\_exchange\_port | -|:---:|:---| -|描述| MPP 数据交换端口 | -|类型| int | -|默认值| 10740 | -|改后生效方式|仅允许在第一次启动服务前修改| - -* dn\_schema\_region\_consensus\_port - -|名字| dn\_schema\_region\_consensus\_port | -|:---:|:---| -|描述| DataNode 元数据副本的共识协议通信端口 | -|类型| int | -|默认值| 10750 | -|改后生效方式|仅允许在第一次启动服务前修改| - -* dn\_data\_region\_consensus\_port - -|名字| dn\_data\_region\_consensus\_port | -|:---:|:---| -|描述| DataNode 数据副本的共识协议通信端口 | -|类型| int | -|默认值| 10760 | -|改后生效方式|仅允许在第一次启动服务前修改| - -* dn\_join\_cluster\_retry\_interval\_ms - -|名字| dn\_join\_cluster\_retry\_interval\_ms | -|:---:|:---------------------------------------| -|描述| DataNode 再次重试加入集群等待时间 | -|类型| long | -|默认值| 5000 | -|改后生效方式| 重启服务生效 | - - -### 3.2 SSL 配置 - -* enable\_thrift\_ssl - -|名字| enable\_thrift\_ssl | -|:---:|:----------------------------------------------| -|描述| 当enable\_thrift\_ssl配置为true时,将通过dn\_rpc\_port使用 SSL 加密进行通信 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启服务生效 | - -* enable\_https - -|名字| enable\_https | -|:---:|:-------------------------| -|描述| REST Service 是否开启 SSL 配置 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启生效 | - -* key\_store\_path - -|名字| key\_store\_path | -|:---:|:-----------------| -|描述| ssl证书路径 
| -|类型| String | -|默认值| "" | -|改后生效方式| 重启服务生效 | - -* key\_store\_pwd - -|名字| key\_store\_pwd | -|:---:|:----------------| -|描述| ssl证书密码 | -|类型| String | -|默认值| "" | -|改后生效方式| 重启服务生效 | - - -### 3.3 SeedConfigNode 配置 - -* dn\_seed\_config\_node - -|名字| dn\_seed\_config\_node | -|:---:|:------------------------------------| -|描述| ConfigNode 地址,DataNode 启动时通过此地址加入集群,推荐使用 SeedConfigNode。V1.2.2 及以前曾用名是 dn\_target\_config\_node\_list | -|类型| String | -|默认值| 127.0.0.1:10710 | -|改后生效方式| 仅允许在第一次启动服务前修改 | - -### 3.4 连接配置 - -* dn\_session\_timeout\_threshold - -|名字| dn\_session_timeout_threshold | -|:---:|:------------------------------| -|描述| 最大的会话空闲时间 | -|类型| int | -|默认值| 0 | -|改后生效方式| 重启服务生效 | - - -* dn\_rpc\_thrift\_compression\_enable - -|名字| dn\_rpc\_thrift\_compression\_enable | -|:---:|:---------------------------------| -|描述| 是否启用 thrift 的压缩机制 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启服务生效 | - -* dn\_rpc\_advanced\_compression\_enable - -|名字| dn\_rpc\_advanced\_compression\_enable | -|:---:|:-----------------------------------| -|描述| 是否启用 thrift 的自定制压缩机制 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启服务生效 | - -* dn\_rpc\_selector\_thread\_count - -| 名字 | rpc\_selector\_thread\_count | -|:------:|:-----------------------------| -| 描述 | rpc 选择器线程数量 | -| 类型 | int | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -* dn\_rpc\_min\_concurrent\_client\_num - -| 名字 | rpc\_min\_concurrent\_client\_num | -|:------:|:----------------------------------| -| 描述 | 最小连接数 | -| 类型 | Short Int : [0,65535] | -| 默认值 | 1 | -| 改后生效方式 | 重启服务生效 | - -* dn\_rpc\_max\_concurrent\_client\_num - -| 名字 | dn\_rpc\_max\_concurrent\_client\_num | -|:------:|:--------------------------------------| -| 描述 | 最大连接数 | -| 类型 | Short Int : [0,65535] | -| 默认值 | 1000 | -| 改后生效方式 | 重启服务生效 | - -* dn\_thrift\_max\_frame\_size - -|名字| dn\_thrift\_max\_frame\_size | -|:---:|:------------------------------------------------------------------------------------------------------------------| -|描述| RPC 请求/响应的最大字节数 | -|类型| int | 
 -|Default| 0 by default, which means the size is computed automatically from the DataNode JVM settings at startup:
 a. min(64MB, dn_alloc_memory/64)
 b. If dn_thrift_max_frame_size is set manually, the user-specified value is used instead. | -|Effective| After restarting the service | - -* dn\_thrift\_init\_buffer\_size - -|Name| dn\_thrift\_init\_buffer\_size | -|:---:|:---| -|Description| Initial Thrift buffer size, in bytes | -|Type| long | -|Default| 1024 | -|Effective| After restarting the service | - -* dn\_connection\_timeout\_ms - -| Name | dn\_connection\_timeout\_ms | -|:------:|:----------------------------| -| Description | Node connection timeout, in milliseconds | -| Type | int | -| Default | 60000 | -| Effective | After restarting the service | - -* dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_core\_client\_count\_for\_each\_node\_in\_client\_manager | -|:------:|:--------------------------------------------------------------| -| Description | Number of core clients routed to each node in a single ClientManager | -| Type | int | -| Default | 200 | -| Effective | After restarting the service | - -* dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager - -| Name | dn\_max\_client\_count\_for\_each\_node\_in\_client\_manager | -|:------:|:-------------------------------------------------------------| -| Description | Maximum number of clients routed to each node in a single ClientManager | -| Type | int | -| Default | 300 | -| Effective | After restarting the service | - -### 3.5 Directory Configuration - -* dn\_system\_dir - -| Name | dn\_system\_dir | -|:------:|:--------------------------------------------------------------------| -| Description | Storage path for IoTDB metadata. By default it is placed in the data directory at the same level as the sbin directory. The starting directory of a relative path depends on the operating system, so an absolute path is recommended. | -| Type | String | -| Default | data/datanode/system (Windows: data\\datanode\\system) | -| Effective | After restarting the service | - -* dn\_data\_dirs - -| Name | dn\_data\_dirs | -|:------:|:-------------------------------------------------------------------| -| Description | Storage path for IoTDB data. By default it is placed in the data directory at the same level as the sbin directory. The starting directory of a relative path depends on the operating system, so an absolute path is recommended. | -| Type | String | -| Default | data/datanode/data (Windows: data\\datanode\\data) | -| Effective | After restarting the service | - -* dn\_multi\_dir\_strategy - -| Name | dn\_multi\_dir\_strategy | 
 -|:------:|:-------------------------------------------------------------------| -| Description | The strategy IoTDB uses to choose a directory in data\_dirs for a TsFile. Either the simple class name or the fully qualified class name can be used. The system provides the following strategies:
 1. SequenceStrategy: IoTDB selects directories in order, iterating over all directories in data\_dirs in a round-robin fashion;
 2. MaxDiskUsableSpaceFirstStrategy: IoTDB prefers the directory in data\_dirs whose disk has the most free space;
 You can define a custom strategy as follows:
 1. Extend the org.apache.iotdb.db.storageengine.rescon.disk.strategy.DirectoryStrategy class and implement your own strategy method;
 2. Set this configuration item to the fully qualified name of the implemented class (package name plus class name, UserDefineStrategyPackage);
 3. Add the jar containing the class to the project. | -| Type | String | -| Default | SequenceStrategy | -| Effective | Hot reload | - -* dn\_consensus\_dir - -| Name | dn\_consensus\_dir | -|:------:|:-------------------------------------------------------------------------| -| Description | Storage path for IoTDB consensus-layer logs. By default it is placed in the data directory at the same level as the sbin directory. The starting directory of a relative path depends on the operating system, so an absolute path is recommended. | -| Type | String | -| Default | data/datanode/consensus (Windows: data\\datanode\\consensus) | -| Effective | After restarting the service | - -* dn\_wal\_dirs - -| Name | dn\_wal\_dirs | -|:------:|:---------------------------------------------------------------------| -| Description | Storage path for the IoTDB write-ahead log. By default it is placed in the data directory at the same level as the sbin directory. The starting directory of a relative path depends on the operating system, so an absolute path is recommended. | -| Type | String | -| Default | data/datanode/wal (Windows: data\\datanode\\wal) | -| Effective | After restarting the service | - -* dn\_tracing\_dir - -| Name | dn\_tracing\_dir | -|:------:|:--------------------------------------------------------------------| -| Description | Root directory for IoTDB tracing. By default it is placed in the data directory at the same level as the sbin directory. The starting directory of a relative path depends on the operating system, so an absolute path is recommended. | -| Type | String | -| Default | datanode/tracing | -| Effective | After restarting the service | - -* dn\_sync\_dir - -| Name | dn\_sync\_dir | -|:------:|:----------------------------------------------------------------------| -| Description | Storage path for IoTDB sync. By default it is placed in the data directory at the same level as the sbin directory. The starting directory of a relative path depends on the operating system, so an absolute path is recommended. | -| Type | String | -| Default | data/datanode/sync | -| Effective | After restarting the service | - -### 3.6 Metric Configuration - -* dn\_metric\_reporter\_list - -| Name | dn\_metric\_reporter\_list | -|:------:|:------------------------------------------| -| Description | The systems to which the DataNode monitoring module reports its data. | -| Type | String | -| Default | None | -| Effective | After restarting the service | - -* dn\_metric\_level - -| Name | dn\_metric\_level | -|:------:|:---------------------------------| -| Description | Controls the level of detail of the data collected by the DataNode monitoring module | -| Type | String | -| Default | IMPORTANT | -| Effective | After restarting the service | - -* dn\_metric\_async\_collect\_period - -| Name | dn\_metric\_async\_collect\_period | -|:------:|:---------------------------------------| -| Description | Period for asynchronously collecting certain monitoring data in the DataNode, in seconds. | -| Type | int | -| Default | 5 | -| Effective | After restarting the service | - -* dn\_metric\_prometheus\_reporter\_port - -| Name | 
dn\_metric\_prometheus\_reporter\_port | -|:------:|:-------------------------------------------| -| 描述 | DataNode 中 Prometheus 报告者用于监控数据报告的端口号。 | -| 类型 | int | -| 默认值 | 9092 | -| 改后生效方式 | 重启服务生效 | - -* dn\_metric\_internal\_reporter\_type - -| 名字 | dn\_metric\_internal\_reporter\_type | -|:------:|:-----------------------------------------------------------| -| 描述 | DataNode 中监控模块内部报告者的种类,用于内部监控和检查数据是否已经成功写入和刷新。 | -| 类型 | String | -| 默认值 | IOTDB | -| 改后生效方式 | 重启服务生效 | - -## 4. 开启 GC 日志 - -GC 日志默认是关闭的。为了性能调优,用户可能会需要收集 GC 信息。 -若要打开 GC 日志,则需要在启动 IoTDB Server 的时候加上"printgc"参数: - -```bash -nohup sbin/start-datanode.sh printgc >/dev/null 2>&1 & -``` - -或者 - -```bash -# V2.0.4.x 版本之前 -sbin\start-datanode.bat printgc - -# V2.0.4.x 版本及之后 -tools\windows\start-datanode.bat printgc -``` - -GC 日志会被存储在`IOTDB_HOME/logs/gc.log`. 至多会存储 10 个 gc.log 文件,每个文件最多 10MB。 - -#### REST 服务配置 - -* enable\_rest\_service - -|名字| enable\_rest\_service | -|:---:|:--------------------| -|描述| 是否开启Rest服务。 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启生效 | - -* rest\_service\_port - -|名字| rest\_service\_port | -|:---:|:------------------| -|描述| Rest服务监听端口号 | -|类型| int32 | -|默认值| 18080 | -|改后生效方式| 重启生效 | - -* enable\_swagger - -|名字| enable\_swagger | -|:---:|:-----------------------| -|描述| 是否启用swagger来展示rest接口信息 | -|类型| Boolean | -|默认值| false | -|改后生效方式| 重启生效 | - -* rest\_query\_default\_row\_size\_limit - -|名字| rest\_query\_default\_row\_size\_limit | -|:---:|:----------------------------------| -|描述| 一次查询能返回的结果集最大行数 | -|类型| int32 | -|默认值| 10000 | -|改后生效方式| 重启生效 | - -* cache\_expire - -|名字| cache\_expire | -|:---:|:--------------| -|描述| 缓存客户登录信息的过期时间 | -|类型| int32 | -|默认值| 28800 | -|改后生效方式| 重启生效 | - -* cache\_max\_num - -|名字| cache\_max\_num | -|:---:|:--------------| -|描述| 缓存中存储的最大用户数量 | -|类型| int32 | -|默认值| 100 | -|改后生效方式| 重启生效 | - -* cache\_init\_num - -|名字| cache\_init\_num | -|:---:|:---------------| -|描述| 缓存初始容量 | -|类型| int32 | -|默认值| 10 | -|改后生效方式| 重启生效 | - -* trust\_store\_path - -|名字| 
 trust\_store\_path | -|:---:|:---------------| -|Description| trustStore path (optional) | -|Type| String | -|Default| "" | -|Effective| After restart | - -* trust\_store\_pwd - -|Name| trust\_store\_pwd | -|:---:|:---------------| -|Description| trustStore password (optional) | -|Type| String | -|Default| "" | -|Effective| After restart | - -* idle\_timeout - -|Name| idle\_timeout | -|:---:|:--------------| -|Description| SSL timeout, in seconds | -|Type| int32 | -|Default| 5000 | -|Effective| After restart | - - - -#### Tiered Storage Configuration - -* dn\_default\_space\_usage\_thresholds - -|Name| dn\_default\_space\_usage\_thresholds | -|:---:|:--------------| -|Description| Minimum ratio of free space for the data directories of each tier. When free space falls below this ratio, data is migrated automatically to the next tier; when the free space of the last tier falls below this threshold, the system is set to READ_ONLY | -|Type| double | -|Default| 0.85 | -|Effective| Hot reload | - -* remote\_tsfile\_cache\_dirs - -|Name| remote\_tsfile\_cache\_dirs | -|:---:|:--------------| -|Description| Local cache directory for cloud storage | -|Type| string | -|Default| data/datanode/data/cache | -|Effective| After restart | - -* remote\_tsfile\_cache\_page\_size\_in\_kb - -|Name| remote\_tsfile\_cache\_page\_size\_in\_kb | -|:---:|:--------------| -|Description| Block size of the local cache files for cloud storage, in KB | -|Type| int | -|Default| 20480 | -|Effective| After restart | - -* remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb - -|Name| remote\_tsfile\_cache\_max\_disk\_usage\_in\_mb | -|:---:|:--------------| -|Description| Maximum disk space used by the local cache for cloud storage, in MB | -|Type| long | -|Default| 51200 | -|Effective| After restart | - -* object\_storage\_type - -|Name| object\_storage\_type | -|:---:|:--------------| -|Description| Cloud storage type | -|Type| string | -|Default| AWS_S3 | -|Effective| After restart | - -* object\_storage\_bucket - -|Name| object\_storage\_bucket | -|:---:|:--------------| -|Description| Name of the cloud storage bucket | -|Type| string | -|Default| iotdb_data | -|Effective| After restart | - -* object\_storage\_endpoint - -|Name| object\_storage\_endpoint | -|:---:|:---------------------------| -|Description| Cloud storage endpoint | -|Type| string | -|Default| None | -|Effective| After restart | - -* object\_storage\_access\_key - -|Name| object\_storage\_access\_key | -|:---:|:--------------| -|Description| Access key used to authenticate with cloud storage | -|Type| string | -|Default| None | -|Effective| After restart | - -* object\_storage\_access\_secret - -|Name| object\_storage\_access\_secret | 
-|:---:|:--------------| -|描述| 云端存储的验证信息 secret | -|类型| string | -|默认值| 无 | -|改后生效方式| 重启生效 | diff --git a/src/zh/UserGuide/latest/SQL-Manual/QuickStart-Only-Sql_timecho.md b/src/zh/UserGuide/latest/SQL-Manual/QuickStart-Only-Sql_timecho.md deleted file mode 100644 index 55f835bb8..000000000 --- a/src/zh/UserGuide/latest/SQL-Manual/QuickStart-Only-Sql_timecho.md +++ /dev/null @@ -1,111 +0,0 @@ - - -# 快速 SQL 体验 - -> **在执行以下 SQL 语句前,请确保** -> -> * **已成功启动 IoTDB 服务** -> * **已通过 Cli 客户端连接 IoTDB** -> -> 注意:若您使用的终端不支持多行粘贴(例如 Windows CMD),请将 SQL 语句调整为单行格式后再执行。 - -## 1. 数据库管理 - -```SQL --- 创建数据库; -CREATE DATABASE root.ln; - --- 查看数据库; -SHOW DATABASES root.**; - --- 删除数据库; -DELETE DATABASE root.ln; - --- 统计数据库; -COUNT DATABASES root.**; -``` - -详细语法说明可参考:[数据库管理](../Basic-Concept/Operate-Metadata_timecho.md#_1-数据库管理) - -## 2. 时间序列管理 - -```SQL --- 创建时间序列; -CREATE TIMESERIES root.ln.wf01.wt01.status BOOLEAN; -CREATE TIMESERIES root.ln.wf01.wt01.temperature FLOAT; - --- 创建对齐时间序列; -CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT, longitude FLOAT); - --- 删除时间序列; -DELETE TIMESERIES root.ln.wf01.wt01.status; - --- 查看时间序列; -SHOW TIMESERIES root.ln.**; - --- 统计时间序列; -COUNT TIMESERIES root.ln.**; -``` - -详细语法说明可参考:[时间序列管理](../Basic-Concept/Operate-Metadata_timecho.md#_2-时间序列管理) - -## 3. 数据写入 - -```SQL --- 单列写入; -INSERT INTO root.ln.wf01.wt01(timestamp, temperature) VALUES(1, 23.0),(2, 42.6); - --- 多列写入; -INSERT INTO root.ln.wf01.wt01(timestamp, status, temperature) VALUES (3, false, 33.1),(4, true, 24.6); -``` - -详细语法说明可参考:[数据写入](../Basic-Concept/Write-Data_timecho.md) - -## 4. 数据查询 - -```SQL --- 时间过滤查询; -SELECT * from root.ln.** where time > 1; - --- 值过滤查询; -SELECT temperature FROM root.ln.wf01.wt01 where temperature > 36.5; - --- 函数查询; -SELECT count(temperature) FROM root.ln.wf01.wt01; - --- 最新点查询; -SELECT LAST status FROM root.ln.wf01.wt01; -``` - -详细语法说明可参考:[数据查询](../Basic-Concept/Query-Data_timecho.md) - -## 5. 
数据删除 - -```SQL --- 单列删除; -DELETE FROM root.ln.wf01.wt01.status WHERE time >= 20; - --- 多列删除; -DELETE FROM root.ln.wf01.wt01.* where time <= 10; -``` - -详细语法说明可参考:[数据删除](../Basic-Concept/Delete-Data.md) diff --git a/src/zh/UserGuide/latest/SQL-Manual/SQL-Manual_timecho.md b/src/zh/UserGuide/latest/SQL-Manual/SQL-Manual_timecho.md deleted file mode 100644 index 44a04a5b5..000000000 --- a/src/zh/UserGuide/latest/SQL-Manual/SQL-Manual_timecho.md +++ /dev/null @@ -1,1704 +0,0 @@ -# SQL手册 - -## 1. 元数据操作 - -### 1.1 数据库管理 - -#### 创建数据库 - -```sql -CREATE DATABASE root.ln; -``` - -#### 查看数据库 - -```sql -show databases; -show databases root.*; -show databases root.**; -``` - -#### 删除数据库 - -```sql -DELETE DATABASE root.ln; -DELETE DATABASE root.sgcc; -DELETE DATABASE root.**; -``` - -#### 统计数据库数量 - -```sql -count databases; -count databases root.*; -count databases root.sgcc.*; -count databases root.sgcc; -``` - -### 1.2 时间序列管理 - -#### 创建时间序列 - -```sql -create timeseries root.ln.wf01.wt01.status with datatype=BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature with datatype=FLOAT; -create timeseries root.ln.wf02.wt02.hardware with datatype=TEXT; -create timeseries root.ln.wf02.wt02.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status with datatype=BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature with datatype=FLOAT; -``` - -- 简化版 - -```sql -create timeseries root.ln.wf01.wt01.status BOOLEAN; -create timeseries root.ln.wf01.wt01.temperature FLOAT; -create timeseries root.ln.wf02.wt02.hardware TEXT; -create timeseries root.ln.wf02.wt02.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.status BOOLEAN; -create timeseries root.sgcc.wf03.wt01.temperature FLOAT; -``` - -- 错误提示 - -```sql -create timeseries root.ln.wf02.wt02.status WITH DATATYPE=BOOLEAN, ENCODING=TS_2DIFF; -error: encoding TS_2DIFF does not support BOOLEAN -``` - -#### 创建对齐时间序列 - -```sql -CREATE ALIGNED TIMESERIES root.ln.wf01.GPS(latitude FLOAT, longitude FLOAT); -``` - 
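A minimal sketch of how the aligned series created above might be written to and read back. It reuses the `ALIGNED VALUES` insert form and the plain `SELECT` that this manual shows in section 2.4; the timestamps and coordinate values are made up for illustration, and the statements have not been run against a live server.

```sql
-- Write two rows to the aligned device root.ln.wf01.GPS (illustrative values only).
INSERT INTO root.ln.wf01.GPS(timestamp, latitude, longitude) ALIGNED VALUES (1, 39.9, 116.4), (2, 39.8, 116.5);

-- Read both measurements back; each timestamp yields one row with both columns.
SELECT latitude, longitude FROM root.ln.wf01.GPS;
```

Because the two measurements share one time column, each inserted row supplies a value for both of them at the same timestamp.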
-#### 修改时间序列数据类型 -> V2.0.8.2 起支持该语句 - -```sql -ALTER TIMESERIES root.ln.wf01.wt01.temperature set data type DOUBLE -``` - -#### 修改时间序列名称 -> V2.0.8.2 起支持该语句 - -```SQL -ALTER TIMESERIES root.ln.wf01.wt01.temperature RENAME TO root.newln.newwf.newwt.temperature -``` - -#### 删除时间序列 - -```sql -delete timeseries root.ln.wf01.wt01.status; -delete timeseries root.ln.wf01.wt01.temperature, root.ln.wf02.wt02.hardware; -delete timeseries root.ln.wf02.*; -drop timeseries root.ln.wf02.*; -``` - -#### 查看时间序列 - -```sql -SHOW TIMESERIES; -SHOW TIMESERIES ; -SHOW TIMESERIES root.**; -SHOW TIMESERIES root.ln.**; -SHOW TIMESERIES root.ln.** limit 10 offset 10; -SHOW TIMESERIES root.ln.** where timeseries contains 'wf01.wt'; -SHOW TIMESERIES root.ln.** where dataType=FLOAT; -SHOW TIMESERIES root.ln.** where time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -SHOW LATEST TIMESERIES; -SHOW INVALID TIMESERIES; --V2.0.8.2 起支持该语句; -``` - -#### 统计时间序列数量 - -```sql -COUNT TIMESERIES root.**; -COUNT TIMESERIES root.ln.**; -COUNT TIMESERIES root.ln.*.*.status; -COUNT TIMESERIES root.ln.wf01.wt01.status; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc'; -COUNT TIMESERIES root.** WHERE DATATYPE = INT64; -COUNT TIMESERIES root.** WHERE TAGS(unit) contains 'c'; -COUNT TIMESERIES root.** WHERE TAGS(unit) = 'c'; -COUNT TIMESERIES root.** WHERE TIMESERIES contains 'sgcc' group by level = 1; -COUNT TIMESERIES root.** WHERE time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -COUNT TIMESERIES root.** GROUP BY LEVEL=1; -COUNT TIMESERIES root.ln.** GROUP BY LEVEL=2; -COUNT TIMESERIES root.ln.wf01.* GROUP BY LEVEL=2; -``` - -#### 标签点管理 - -```sql -create timeseries root.turbine.d1.s1(temprature) with datatype=FLOAT tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2); -``` - -- 重命名标签或属性 - -```sql -ALTER timeseries root.turbine.d1.s1 RENAME tag1 TO newTag1; -``` - -- 重新设置标签或属性的值 - -```sql -ALTER timeseries root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1; -``` - -- 删除已经存在的标签或属性 - 
-```sql -ALTER timeseries root.turbine.d1.s1 DROP tag1, tag2; -``` - -- 添加新的标签 - -```sql -ALTER timeseries root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4; -``` - -- 添加新的属性 - -```sql -ALTER timeseries root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4; -``` - -- 更新插入别名,标签和属性 - -```sql -ALTER timeseries root.turbine.d1.s1 UPSERT ALIAS=newAlias TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4); -``` - -- 使用标签作为过滤条件查询时间序列 - -```sql -SHOW TIMESERIES (<`PathPattern`>)? timeseriesWhereClause -``` - -返回给定路径的下的所有满足条件的时间序列信息: - -```sql -ALTER timeseries root.ln.wf02.wt02.hardware ADD TAGS unit=c; -ALTER timeseries root.ln.wf02.wt02.status ADD TAGS description=test1; -show timeseries root.ln.** where TAGS(unit)='c'; -show timeseries root.ln.** where TAGS(description) contains 'test1'; -``` - -- 使用标签作为过滤条件统计时间序列数量 - -```sql -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause; -COUNT TIMESERIES (<`PathPattern`>)? timeseriesWhereClause GROUP BY LEVEL=; -``` - -返回给定路径的下的所有满足条件的时间序列的数量: - -```sql -count timeseries; -count timeseries root.** where TAGS(unit)='c'; -count timeseries root.** where TAGS(unit)='c' group by level = 2; -``` - -创建对齐时间序列: - -```sql -create aligned timeseries root.sg1.d1(s1 INT32 tags(tag1=v1, tag2=v2) attributes(attr1=v1, attr2=v2), s2 DOUBLE tags(tag3=v3, tag4=v4) attributes(attr3=v3, attr4=v4)); -``` - -支持查询: - -```sql -show timeseries where TAGS(tag1)='v1'; -``` - -### 1.3 时间序列路径管理 - -#### 查看路径的所有子路径 - -```sql -SHOW CHILD PATHS pathPattern; -- 查询 root.ln 的下一层; -show child paths root.ln; -- 查询形如 root.xx.xx.xx 的路径; -show child paths root.*.*; -``` -#### 查看路径的所有子节点 - -```sql -SHOW CHILD NODES pathPattern; -- 查询 root 的下一层; -show child nodes root; -- 查询 root.ln 的下一层; -show child nodes root.ln; -``` -#### 查看设备 - -```sql -show devices; -show devices root.ln.**; -show devices where time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -``` -##### 查看设备及其 database 信息 - -```sql -show devices with database; -show devices root.ln.** with database; 
-``` -#### 统计节点数 - -```sql -COUNT NODES root.** LEVEL=2; -COUNT NODES root.ln.** LEVEL=2; -COUNT NODES root.ln.wf01.* LEVEL=3; -COUNT NODES root.**.temperature LEVEL=3; -``` -#### 统计设备数量 - -```sql -count devices; -count devices root.ln.**; -count devices where time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -``` - -### 1.4 数据存活时间管理 - -#### 设置 TTL -```sql -set ttl to root.ln 3600000; -set ttl to root.sgcc.** 3600000; -set ttl to root.** 3600000; -``` -#### 取消 TTL -```sql -unset ttl from root.ln; -unset ttl from root.sgcc.**; -unset ttl from root.**; -``` - -#### 显示 TTL -```sql -SHOW ALL TTL; -SHOW TTL ON pathPattern; -show DEVICES; -``` -## 2. 写入数据 - -### 2.1 写入单列数据 -```sql -insert into root.ln.wf02.wt02(timestamp,status) values(1,true); -insert into root.ln.wf02.wt02(timestamp,hardware) values(1, 'v1'),(2, 'v1'); -``` -### 2.2 写入多列数据 -```sql -insert into root.ln.wf02.wt02(timestamp, status, hardware) values (2, false, 'v2'); -insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4'); -``` -### 2.3 使用服务器时间戳 -```sql -insert into root.ln.wf02.wt02(status, hardware) values (false, 'v2'); -``` -### 2.4 写入对齐时间序列数据 -```sql -create aligned timeseries root.sg1.d1(s1 INT32, s2 DOUBLE); -insert into root.sg1.d1(timestamp, s1, s2) aligned values(1, 1, 1); -insert into root.sg1.d1(timestamp, s1, s2) aligned values(2, 2, 2), (3, 3, 3); -select * from root.sg1.d1; -``` -### 2.5 加载 TsFile 文件数据 - -load '' [sglevel=int][onSuccess=delete/none] - -#### 通过指定文件路径(绝对路径)加载单 tsfile 文件 - -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile'` -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile' sglevel=1` -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile' onSuccess=delete` -- `load '/Users/Desktop/data/1575028885956-101-0.tsfile' sglevel=1 onSuccess=delete` - - -#### 通过指定文件夹路径(绝对路径)批量加载文件 - -- `load '/Users/Desktop/data'` -- `load '/Users/Desktop/data' sglevel=1` -- `load '/Users/Desktop/data' onSuccess=delete` -- `load 
'/Users/Desktop/data' sglevel=1 onSuccess=delete` - -## 3. 删除数据 - -### 3.1 删除单列数据 -```sql -delete from root.ln.wf02.wt02.status where time<=2017-11-01T16:26:00; -delete from root.ln.wf02.wt02.status where time>=2017-01-01T00:00:00 and time<=2017-11-01T16:26:00; -delete from root.ln.wf02.wt02.status where time < 10; -delete from root.ln.wf02.wt02.status where time <= 10; -delete from root.ln.wf02.wt02.status where time < 20 and time > 10; -delete from root.ln.wf02.wt02.status where time <= 20 and time >= 10; -delete from root.ln.wf02.wt02.status where time > 20; -delete from root.ln.wf02.wt02.status where time >= 20; -delete from root.ln.wf02.wt02.status where time = 20; -``` -出错: -```sql -delete from root.ln.wf02.wt02.status where time > 4 or time < 0; -Msg: 303: Check metadata error: For delete statement, where clause can only contain atomic expressions like : time > XXX, time <= XXX, or two atomic expressions connected by 'AND' -``` - -删除时间序列中的所有数据: -```sql -delete from root.ln.wf02.wt02.status; -``` -### 3.2 删除多列数据 -```sql -delete from root.ln.wf02.wt02.* where time <= 2017-11-01T16:26:00; -``` -声明式的编程方式: -```sql -delete from root.ln.wf03.wt02.status where time < now(); -Msg: The statement is executed successfully. -``` -## 4. 
数据查询 - -### 4.1 基础查询 - -#### 时间过滤查询 -```sql -select temperature from root.ln.wf01.wt01 where time < 2017-11-01T00:08:00.000; -``` -#### 根据一个时间区间选择多列数据 -```sql -select status, temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; -``` -#### 按照多个时间区间选择同一设备的多列数据 -```sql -select status, temperature from root.ln.wf01.wt01 where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` -#### 按照多个时间区间选择不同设备的多列数据 -```sql -select wf01.wt01.status, wf02.wt02.hardware from root.ln where (time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000) or (time >= 2017-11-01T16:35:00.000 and time <= 2017-11-01T16:37:00.000); -``` -#### 根据时间降序返回结果集 -```sql -select * from root.ln.** where time > 1 order by time desc limit 10; -``` -### 4.2 选择表达式 - -#### 使用别名 -```sql -select s1 as temperature, s2 as speed from root.ln.wf01.wt01; -``` -#### 运算符 - -#### 函数 - -不支持: -```sql -select s1, count(s1) from root.sg.d1; -select sin(s1), count(s1) from root.sg.d1; -select s1, count(s1) from root.sg.d1 group by ([10,100),10ms); -``` -##### 时间序列查询嵌套表达式 - -示例 1: -```sql -select a, - b, - ((a + 1) * 2 - 1) % 2 + 1.5, - sin(a + sin(a + sin(b))), - -(a + b) * (sin(a + b) * sin(a + b) + cos(a + b) * cos(a + b)) + 1 -from root.sg1; -``` -示例 2: -```sql -select (a + b) * 2 + sin(a) from root.sg; -``` -示例 3: -```sql -select (a + *) / 2 from root.sg1; -``` -示例 4: -```sql -select (a + b) * 3 from root.sg, root.ln; -``` -##### 聚合查询嵌套表达式 - -示例 1: -```sql -select avg(temperature), - sin(avg(temperature)), - avg(temperature) + 1, - -sum(hardware), - avg(temperature) + sum(hardware) -from root.ln.wf01.wt01; -``` -示例 2: -```sql -select avg(*), - (avg(*) + 1) * 3 / 2 -1 -from root.sg1; -``` -示例 3: -```sql -select avg(temperature), - sin(avg(temperature)), - avg(temperature) + 1, - -sum(hardware), - avg(temperature) + sum(hardware) as custom_sum -from root.ln.wf01.wt01 
-GROUP BY([10, 90), 10ms); -``` -#### 最新点查询 - -SQL 语法: - -```sql -select last [COMMA ]* from < PrefixPath > [COMMA < PrefixPath >]* [ORDER BY TIMESERIES (DESC | ASC)?] -``` - -查询 root.ln.wf01.wt01.status 的最新数据点 -```sql -select last status from root.ln.wf01.wt01; -``` -查询 root.ln.wf01.wt01 下 status,temperature 时间戳大于等于 2017-11-07T23:50:00 的最新数据点 -```sql -select last status, temperature from root.ln.wf01.wt01 where time >= 2017-11-07T23:50:00; -``` - 查询 root.ln.wf01.wt01 下所有序列的最新数据点,并按照序列名降序排列 -```sql -select last * from root.ln.wf01.wt01 order by timeseries desc; -``` -### 4.3 查询过滤条件 - -#### 时间过滤条件 - -选择时间戳大于 2022-01-01T00:05:00.000 的数据: -```sql -select s1 from root.sg1.d1 where time > 2022-01-01T00:05:00.000; -``` -选择时间戳等于 2022-01-01T00:05:00.000 的数据: -```sql -select s1 from root.sg1.d1 where time = 2022-01-01T00:05:00.000; -``` -选择时间区间 [2017-11-01T00:05:00.000, 2017-11-01T00:12:00.000) 内的数据: -```sql -select s1 from root.sg1.d1 where time >= 2022-01-01T00:05:00.000 and time < 2017-11-01T00:12:00.000; -``` -#### 值过滤条件 - -选择值大于 36.5 的数据: -```sql -select temperature from root.sg1.d1 where temperature > 36.5; -``` -选择值等于 true 的数据: -```sql -select status from root.sg1.d1 where status = true; -``` -选择区间 [36.5,40] 内或之外的数据: -```sql -select temperature from root.sg1.d1 where temperature between 36.5 and 40; -``` -```sql -select temperature from root.sg1.d1 where temperature not between 36.5 and 40; -``` -选择值在特定范围内的数据: -```sql -select code from root.sg1.d1 where code in ('200', '300', '400', '500'); -``` -选择值在特定范围外的数据: -```sql -select code from root.sg1.d1 where code not in ('200', '300', '400', '500'); -``` -选择值为空的数据: -```sql -select code from root.sg1.d1 where temperature is null; -``` -选择值为非空的数据: -```sql -select code from root.sg1.d1 where temperature is not null; -``` -#### 模糊查询 - -查询 `root.sg.d1` 下 `value` 含有`'cc'`的数据 -```sql -select * from root.sg.d1 where value like '%cc%'; -``` -查询 `root.sg.d1` 下 `value` 中间为 `'b'`、前后为任意单个字符的数据 -```sql -select * from root.sg.device 
where value like '_b_'; -``` -查询 root.sg.d1 下 value 值为26个英文字符组成的字符串 -```sql -select * from root.sg.d1 where value regexp '^[A-Za-z]+$'; -``` - -查询 root.sg.d1 下 value 值为26个小写英文字符组成的字符串且时间大于100的 -```sql -select * from root.sg.d1 where value regexp '^[a-z]+$' and time > 100; -``` - -### 4.4 分段分组聚合 - -#### 未指定滑动步长的时间区间分组聚合查询 -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d); -``` -#### 指定滑动步长的时间区间分组聚合查询 -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d); -``` -滑动步长可以小于聚合窗口 -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-01 10:00:00), 4h, 2h); -``` -#### 按照自然月份的时间区间分组聚合查询 -```sql -select count(status) from root.ln.wf01.wt01 where time > 2017-11-01T01:00:00 group by([2017-11-01T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo); -``` -每个时间间隔窗口内都有数据 -```sql -select count(status) from root.ln.wf01.wt01 group by([2017-10-31T00:00:00, 2019-11-07T23:00:00), 1mo, 2mo); -``` -#### 左开右闭区间 -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d); -``` -#### 与分组聚合混合使用 - -统计降采样后的数据点个数 -```sql -select count(status) from root.ln.wf01.wt01 group by ((2017-11-01T00:00:00, 2017-11-07T23:00:00],1d), level=1; -``` -加上滑动 Step 的降采样后的结果也可以汇总 -```sql -select count(status) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d), level=1; -``` -#### 路径层级分组聚合 - -统计不同 database 下 status 序列的数据点个数 -```sql -select count(status) from root.** group by level = 1; -``` - 统计不同设备下 status 序列的数据点个数 -```sql -select count(status) from root.** group by level = 3; -``` -统计不同 database 下的不同设备中 status 序列的数据点个数 -```sql -select count(status) from root.** group by level = 1, 3; -``` -查询所有序列下温度传感器 temperature 的最大值 -```sql -select max_value(temperature) from root.** group by level = 0; -``` -查询某一层级下所有传感器拥有的总数据点数 
-```sql -select count(*) from root.ln.** group by level = 2; -``` -#### 标签分组聚合 - -##### 单标签聚合查询 -```sql -SELECT AVG(temperature) FROM root.factory1.** GROUP BY TAGS(city); -``` -##### 多标签聚合查询 -```sql -SELECT avg(temperature) FROM root.factory1.** GROUP BY TAGS(city, workshop); -``` -##### 基于时间区间的标签聚合查询 -```sql -SELECT AVG(temperature) FROM root.factory1.** GROUP BY ([1000, 10000), 5s), TAGS(city, workshop); -``` -#### 差值分段聚合 -```sql -group by variation(controlExpression[,delta][,ignoreNull=true/false]) -``` -##### delta=0时的等值事件分段 -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6); -``` -指定ignoreNull为false -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, ignoreNull=false); -``` -##### delta!=0时的差值事件分段 -```sql -select __endTime, avg(s1), count(s2), sum(s3) from root.sg.d group by variation(s6, 4); -``` -#### 条件分段聚合 -```sql -group by condition(predict,[keep>/>=/=/<=/<]threshold,[,ignoreNull=true/false]) -``` -查询至少连续两行以上的charging_status=1的数据 -```sql -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoreNull=true); -``` -当设置`ignoreNull`为false时,遇到null值为将其视为一个不满足条件的行,得到结果原先的分组被含null的行拆分 -```sql -select max_time(charging_status),count(vehicle_status),last_value(soc) from root.** group by condition(charging_status=1,KEEP>=2,ignoreNull=false); -``` -#### 会话分段聚合 -```sql -group by session(timeInterval) -``` -按照不同的时间单位设定时间间隔 -```sql -select __endTime,count(*) from root.** group by session(1d); -``` -和`HAVING`、`ALIGN BY DEVICE`共同使用 -```sql -select __endTime,sum(hardware) from root.ln.wf02.wt01 group by session(50s) having sum(hardware)>0 align by device; -``` -#### 点数分段聚合 -```sql -group by count(controlExpression, size[,ignoreNull=true/false]) -``` -```sql -select count(charging_stauts), first_value(soc) from root.sg group by count(charging_status,5); -``` -当使用ignoreNull将null值也考虑进来 -```sql -select 
count(charging_stauts), first_value(soc) from root.sg group by count(charging_status,5,ignoreNull=false); -``` -### 4.5 聚合结果过滤 - -不正确的: -```sql -select count(s1) from root.** group by ([1,3),1ms) having sum(s1) > s1; -select count(s1) from root.** group by ([1,3),1ms) having s1 > 1; -select count(s1) from root.** group by ([1,3),1ms), level=1 having sum(d1.s1) > 1; -select count(d1.s1) from root.** group by ([1,3),1ms), level=1 having sum(s1) > 1; -``` -SQL 示例: -```sql - select count(s1) from root.** group by ([1,11),2ms), level=1 having count(s2) > 2; - select count(s1), count(s2) from root.** group by ([1,11),2ms) having count(s2) > 1 align by device; -``` -### 4.6 结果集补空值 -```sql -FILL '(' PREVIOUS | LINEAR | constant (, interval=DURATION_LITERAL)? ')' -``` -#### `PREVIOUS` 填充 -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(previous); -``` -#### `PREVIOUS` 填充并指定填充超时阈值 -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(previous, 2m); -``` -#### `LINEAR` 填充 -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(linear); -``` -#### 常量填充 -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(2.0); -``` -使用 `BOOLEAN` 类型的常量填充 -```sql -select temperature, status from root.sgcc.wf03.wt01 where time >= 2017-11-01T16:37:00.000 and time <= 2017-11-01T16:40:00.000 fill(true); -``` -### 4.7 查询结果分页 - -#### 按行分页 - - 基本的 `LIMIT` 子句 -```sql -select status, temperature from root.ln.wf01.wt01 limit 10; -``` -带 `OFFSET` 的 `LIMIT` 子句 -```sql -select status, temperature from root.ln.wf01.wt01 limit 5 offset 3; -``` -`LIMIT` 子句与 `WHERE` 子句结合 -```sql -select status,temperature from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and 
time< 2017-11-01T00:12:00.000 limit 5 offset 3; -``` - `LIMIT` 子句与 `GROUP BY` 子句组合 -```sql -select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) limit 4 offset 3; -``` -#### 按列分页 - - 基本的 `SLIMIT` 子句 -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1; -``` -带 `SOFFSET` 的 `SLIMIT` 子句 -```sql -select * from root.ln.wf01.wt01 where time > 2017-11-01T00:05:00.000 and time < 2017-11-01T00:12:00.000 slimit 1 soffset 1; -``` -`SLIMIT` 子句与 `GROUP BY` 子句结合 -```sql -select max_value(*) from root.ln.wf01.wt01 group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00),1d) slimit 1 soffset 1; -``` -`SLIMIT` 子句与 `LIMIT` 子句结合 -```sql -select * from root.ln.wf01.wt01 limit 10 offset 100 slimit 2 soffset 0; -``` -### 4.8 排序 - -时间对齐模式下的排序 -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time desc; -``` -设备对齐模式下的排序 -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by device desc,time asc align by device; -``` -在时间戳相等时按照设备名排序 -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 order by time asc,device desc align by device; -``` -没有显式指定时 -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -``` -对聚合后的结果进行排序 -```sql -select count(*) from root.ln.** group by ((2017-11-01T00:00:00.000+08:00,2017-11-01T00:03:00.000+08:00],1m) order by device asc,time asc align by device; -``` -### 4.9 查询对齐模式 - -#### 按设备对齐 -```sql -select * from root.ln.** where time <= 2017-11-01T00:01:00 align by device; -``` -### 4.10 查询写回(SELECT INTO) - -#### 整体描述 -```sql -selectIntoStatement - : SELECT - resultColumn [, resultColumn] ... - INTO intoItem [, intoItem] ... - FROM prefixPath [, prefixPath] ... 
- [WHERE whereCondition] - [GROUP BY groupByTimeClause, groupByLevelClause] - [FILL ({PREVIOUS | LINEAR | constant} (, interval=DURATION_LITERAL)?)] - [LIMIT rowLimit OFFSET rowOffset] - [ALIGN BY DEVICE] - ; - -intoItem - : [ALIGNED] intoDevicePath '(' intoMeasurementName [',' intoMeasurementName]* ')' - ; -``` -Aligned by time: write the query results of four series under the `root.sg` database into four specified series under the `root.sg_copy` database -```sql -select s1, s2 into root.sg_copy.d1(t1), root.sg_copy.d2(t1, t2), root.sg_copy.d1(t2) from root.sg.d1, root.sg.d2; -``` -Aligned by time: store the results of an aggregation query in the specified series -```sql -select count(s1 + s2), last_value(s2) into root.agg.count(s1_add_s2), root.agg.last_value(s2) from root.sg.d1 group by ([0, 100), 10ms); -``` -Aligned by device -```sql -select s1, s2 into root.sg_copy.d1(t1, t2), root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -``` -Aligned by device: store the results of expression evaluation in the specified series -```sql -select s1 + s2 into root.expr.add(d1s1_d1s2), root.expr.add(d2s1_d2s2) from root.sg.d1, root.sg.d2 align by device; -``` -#### Using Variable Placeholders - -##### Align by Time (Default) - -###### Target device without variable placeholders & target measurement list with variable placeholders -```sql -select s1, s2 -into root.sg_copy.d1(::), root.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::) -from root.sg.d1, root.sg.d2; -``` - -This statement is equivalent to: -```sql -select s1, s2 -into root.sg_copy.d1(s1), root.sg_copy.d2(s1), root.sg_copy.d1(s2), root.sg_copy.d2(s2) -from root.sg.d1, root.sg.d2; -``` - -###### Target device with variable placeholders & target measurement list without variable placeholders - -```sql -select d1.s1, d1.s2, d2.s3, d3.s4 -into ::(s1_1, s2_2), root.sg.d2_2(s3_3), root.${2}_copy.::(s4) -from root.sg; -``` - -###### Target device with variable placeholders & target measurement list with variable placeholders - -```sql -select * into root.sg_bk.::(::) from root.sg.**; -``` - -##### Align by Device (Using `ALIGN BY DEVICE`) - -###### Target device without variable placeholders & target measurement list with variable placeholders -```sql -select s1, s2, s3, s4 -into root.backup_sg.d1(s1, s2, s3, s4), root.backup_sg.d2(::), root.sg.d3(backup_${4}) -from root.sg.d1, root.sg.d2, root.sg.d3 -align by device; -``` - -###### Target device with variable placeholders & target measurement list without variable placeholders -```sql -select avg(s1), sum(s2) + sum(s3), count(s4) -into 
root.agg_${2}.::(avg_s1, sum_s2_add_s3, count_s4) -from root.** -align by device; -``` - -###### Target device with variable placeholders & target measurement list with variable placeholders -```sql -select * into ::(backup_${4}) from root.sg.** align by device; -``` - -#### Specifying the Target Series as Aligned Series -```sql -select s1, s2 into root.sg_copy.d1(t1, t2), aligned root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device; -``` -## 5. Maintenance Statements -Generate the corresponding query plan -```sql -explain select s1,s2 from root.sg.d1; -``` -Execute the corresponding query statement and obtain the analysis results -```sql -explain analyze select s1,s2 from root.sg.d1 order by s1; -``` - -For more maintenance statements, see [Maintenance Statements](../User-Manual/Maintenance-statement_timecho.md) - -## 6. Operators - -For more details, see [Operator-and-Expression](./Operator-and-Expression.md) - -### 6.1 Arithmetic Operators - -For more details, see [Arithmetic Operators and Functions](./Operator-and-Expression.md#_1-1-算数运算符) - -```sql -select s1, - s1, s2, + s2, s1 + s2, s1 - s2, s1 * s2, s1 / s2, s1 % s2 from root.sg.d1; -``` - -### 6.2 Comparison Operators - -For more details, see [Comparison Operators and Functions](./Operator-and-Expression.md#_1-2-比较运算符) - -```sql -# Basic comparison operators; -select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test; - -# `BETWEEN ... 
AND ...` operator; -select temperature from root.sg1.d1 where temperature between 36.5 and 40; -select temperature from root.sg1.d1 where temperature not between 36.5 and 40; - -# Fuzzy matching operator: Use `Like` for fuzzy matching; -select * from root.sg.d1 where value like '%cc%'; -select * from root.sg.device where value like '_b_'; - -# Fuzzy matching operator: Use `Regexp` for fuzzy matching; -select * from root.sg.d1 where value regexp '^[A-Za-z]+$'; -select * from root.sg.d1 where value regexp '^[a-z]+$' and time > 100; -select b, b like '1%', b regexp '[0-2]' from root.test; - -# `IS NULL` operator; -select code from root.sg1.d1 where temperature is null; -select code from root.sg1.d1 where temperature is not null; - -# `IN` operator; -select code from root.sg1.d1 where code in ('200', '300', '400', '500'); -select code from root.sg1.d1 where code not in ('200', '300', '400', '500'); -select a, a in (1, 2) from root.test; -``` - -### 6.3 Logical Operators - -For more details, see [Logical Operators](./Operator-and-Expression.md#_1-3-逻辑运算符) - -```sql -select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test; -``` - -## 7. 
Built-in Functions - -For more details, see [Operator-and-Expression](./Operator-and-Expression.md#_2-内置函数) - -### 7.1 Aggregate Functions - -For more details, see [Aggregate Functions](./Operator-and-Expression.md#_2-1-聚合函数) - -```sql -select count(status) from root.ln.wf01.wt01; - -select count_if(s1=0 & s2=0, 3), count_if(s1=1 & s2=0, 3) from root.db.d1; -select count_if(s1=0 & s2=0, 3, 'ignoreNull'='false'), count_if(s1=1 & s2=0, 3, 'ignoreNull'='false') from root.db.d1; - -select time_duration(s1) from root.db.d1; -``` - -### 7.2 Arithmetic Functions - -For more details, see [Arithmetic Operators and Functions](./Operator-and-Expression.md#_2-2-数学函数) - -```sql -select s1, sin(s1), cos(s1), tan(s1) from root.sg1.d1 limit 5 offset 1000; -select s4,round(s4),round(s4,2),round(s4,-1) from root.sg1.d1; -``` - -### 7.3 Comparison Functions - -For more details, see [Comparison Operators and Functions](./Operator-and-Expression.md#_2-3-比较函数) - -```sql -select ts, on_off(ts, 'threshold'='2') from root.test; -select ts, in_range(ts, 'lower'='2', 'upper'='3.1') from root.test; -``` - -### 7.4 String Processing Functions - -For more details, see [String Processing](./Operator-and-Expression.md#_2-4-字符串函数) - -```sql -select s1, string_contains(s1, 's'='warn') from root.sg1.d4; -select s1, string_matches(s1, 'regex'='[^\\s]+37229') from root.sg1.d4; -select s1, length(s1) from root.sg1.d1; -select s1, locate(s1, "target"="1") from root.sg1.d1; -select s1, locate(s1, "target"="1", "reverse"="true") from root.sg1.d1; -select s1, startswith(s1, "target"="1") from root.sg1.d1; -select s1, endswith(s1, "target"="1") from root.sg1.d1; -select s1, s2, concat(s1, s2, "target1"="IoT", "target2"="DB") from root.sg1.d1; -select s1, s2, concat(s1, s2, "target1"="IoT", "target2"="DB", "series_behind"="true") from root.sg1.d1; -select s1, substring(s1 from 1 for 2) from root.sg1.d1; -select s1, replace(s1, 'es', 'tt') from root.sg1.d1; -select s1, upper(s1) from root.sg1.d1; -select s1, lower(s1) from root.sg1.d1; -select s3, trim(s3) from root.sg1.d1; -select s1, s2, strcmp(s1, s2) from root.sg1.d1; -select strreplace(s1, 
"target"=",", "replace"="/", "limit"="2") from root.test.d1; -select strreplace(s1, "target"=",", "replace"="/", "limit"="1", "offset"="1", "reverse"="true") from root.test.d1; -select regexmatch(s1, "regex"="\d+\.\d+\.\d+\.\d+", "group"="0") from root.test.d1; -select regexreplace(s1, "regex"="192\.168\.0\.(\d+)", "replace"="cluster-$1", "limit"="1") from root.test.d1; -select regexsplit(s1, "regex"=",", "index"="-1") from root.test.d1; -select regexsplit(s1, "regex"=",", "index"="3") from root.test.d1; -``` - -### 7.5 数据类型转换函数 - -更多见文档[Data Type Conversion Function](./Operator-and-Expression.md#_2-5-数据类型转换函数) - -```sql -SELECT cast(s1 as INT32) from root.sg; -``` - -### 7.6 常序列生成函数 - -更多见文档[Constant Timeseries Generating Functions](./Operator-and-Expression.md#_2-6-常序列生成函数) - -```sql -select s1, s2, const(s1, 'value'='1024', 'type'='INT64'), pi(s2), e(s1, s2) from root.sg1.d1; -``` - -### 7.7 选择函数 - -更多见文档[Selector Functions](./Operator-and-Expression.md#_2-7-选择函数) - -```sql -select s1, top_k(s1, 'k'='2'), bottom_k(s1, 'k'='2') from root.sg1.d2 where time > 2020-12-10T20:36:15.530+08:00; -``` - -### 7.8 区间查询函数 - -更多见文档[Continuous Interval Functions](./Operator-and-Expression.md#_2-8-区间查询函数) - -```sql -select s1, zero_count(s1), non_zero_count(s2), zero_duration(s3), non_zero_duration(s4) from root.sg.d2; -``` - -### 7.9 趋势计算函数 - -更多见文档[Variation Trend Calculation Functions](./Operator-and-Expression.md#_2-9-趋势计算函数) - -```sql -select s1, time_difference(s1), difference(s1), non_negative_difference(s1), derivative(s1), non_negative_derivative(s1) from root.sg1.d1 limit 5 offset 1000; - -SELECT DIFF(s1), DIFF(s2) from root.test; -SELECT DIFF(s1, 'ignoreNull'='false'), DIFF(s2, 'ignoreNull'='false') from root.test; -``` - -### 7.10 采样函数 - -更多见文档[Sample Functions](./Operator-and-Expression.md#_2-10-采样函数)。 - -```sql -select equal_size_bucket_random_sample(temperature,'proportion'='0.1') as random_sample from root.ln.wf01.wt01; -select 
equal_size_bucket_agg_sample(temperature, 'type'='avg','proportion'='0.1') as agg_avg, equal_size_bucket_agg_sample(temperature, 'type'='max','proportion'='0.1') as agg_max, equal_size_bucket_agg_sample(temperature,'type'='min','proportion'='0.1') as agg_min, equal_size_bucket_agg_sample(temperature, 'type'='sum','proportion'='0.1') as agg_sum, equal_size_bucket_agg_sample(temperature, 'type'='extreme','proportion'='0.1') as agg_extreme, equal_size_bucket_agg_sample(temperature, 'type'='variance','proportion'='0.1') as agg_variance from root.ln.wf01.wt01; -select equal_size_bucket_m4_sample(temperature, 'proportion'='0.1') as M4_sample from root.ln.wf01.wt01; -select equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='avg', 'number'='2') as outlier_avg_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='stendis', 'number'='2') as outlier_stendis_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='cos', 'number'='2') as outlier_cos_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='prenextdis', 'number'='2') as outlier_prenextdis_sample from root.ln.wf01.wt01; - -select M4(s1,'timeInterval'='25','displayWindowBegin'='0','displayWindowEnd'='100') from root.vehicle.d1; -select M4(s1,'windowSize'='10') from root.vehicle.d1; -``` - -### 7.11 Time Series Processing Functions - -For more details, see [Time-Series](./Operator-and-Expression.md#_2-11-时间序列处理函数) - -```sql -select change_points(s1), change_points(s2), change_points(s3), change_points(s4), change_points(s5), change_points(s6) from root.testChangePoints.d1; -``` - -## 8. 
Data Quality Function Library - -For more details, see [UDF-Libraries](../SQL-Manual/UDF-Libraries.md) - -### 8.1 Data Quality - -For more details, see [Data-Quality](../SQL-Manual/UDF-Libraries.md#数据质量) - -```sql -# Completeness; -select completeness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select completeness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Consistency; -select consistency(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select consistency(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Timeliness; -select timeliness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select timeliness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Validity; -select Validity(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; -select Validity(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00; - -# Accuracy; -select Accuracy(t1,t2,t3,m1,m2,m3) from root.test; -``` - -### 8.2 Data Profiling - -For more details, see [Data-Profiling](../SQL-Manual/UDF-Libraries.md#数据画像) - -```sql -# ACF; -select acf(s1) from root.test.d1 where time <= 2020-01-01 00:00:05; - -# Distinct; -select distinct(s2) from root.test.d2; - -# Histogram; -select histogram(s1,"min"="1","max"="20","count"="10") from root.test.d1; - -# Integral; -select integral(s1) from root.test.d1 where time <= 2020-01-01 00:00:10; -select integral(s1, "unit"="1m") from root.test.d1 where time <= 2020-01-01 00:00:10; - -# IntegralAvg; -select integralavg(s1) from root.test.d1 where time <= 2020-01-01 00:00:10; - -# Mad; -select mad(s0) from root.test; -select mad(s0, "error"="0.01") from root.test; - -# Median; -select median(s0, "error"="0.01") from root.test; - -# MinMax; -select minmax(s1) from root.test; - -# Mode; -select mode(s2) from root.test.d2; - -# MvAvg; -select mvavg(s1, "window"="3") from root.test; - -# PACF; -select pacf(s1, "lag"="5") from root.test; - -# Percentile; -select percentile(s0, "rank"="0.2", "error"="0.01") from root.test; - -# 
Quantile; -select quantile(s0, "rank"="0.2", "K"="800") from root.test; - -# Period; -select period(s1) from root.test.d3; - -# QLB; -select QLB(s1) from root.test.d1; - -# Resample; -select resample(s1,'every'='5m','interp'='linear') from root.test.d1; -select resample(s1,'every'='30m','aggr'='first') from root.test.d1; -select resample(s1,'every'='30m','start'='2021-03-06 15:00:00') from root.test.d1; - -# Sample; -select sample(s1,'method'='reservoir','k'='5') from root.test.d1; -select sample(s1,'method'='isometric','k'='5') from root.test.d1; - -# Segment; -select segment(s1, "error"="0.1") from root.test; - -# Skew; -select skew(s1) from root.test.d1; - -# Spline; -select spline(s1, "points"="151") from root.test; - -# Spread; -select spread(s1) from root.test.d1 where time <= 2020-01-01 00:00:30; - -# Stddev; -select stddev(s1) from root.test.d1; - -# ZScore; -select zscore(s1) from root.test; -``` - -### 8.3 Anomaly Detection - -For more details, see [Anomaly-Detection](../SQL-Manual/UDF-Libraries.md#异常检测) - -```sql -# IQR; -select iqr(s1) from root.test; - -# KSigma; -select ksigma(s1,"k"="1.0") from root.test.d1 where time <= 2020-01-01 00:00:30; - -# LOF; -select lof(s1,s2) from root.test.d1 where time<1000; -select lof(s1, "method"="series") from root.test.d1 where time<1000; - -# MissDetect; -select missdetect(s2,'minlen'='10') from root.test.d2; - -# Range; -select range(s1,"lower_bound"="101.0","upper_bound"="125.0") from root.test.d1 where time <= 2020-01-01 00:00:30; - -# TwoSidedFilter; -select TwoSidedFilter(s0, 'len'='5', 'threshold'='0.3') from root.test; - -# Outlier; -select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test; - -# MasterTrain; -select MasterTrain(lo,la,m_lo,m_la,'p'='3','eta'='1.0') from root.test; - -# MasterDetect; -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='repair','p'='3','k'='3','eta'='1.0') from root.test; -select MasterDetect(lo,la,m_lo,m_la,model,'output_type'='anomaly','p'='3','k'='3','eta'='1.0') from root.test; -``` - 
-### 8.4 Frequency Domain Analysis - -For more details, see [Frequency-Domain](../SQL-Manual/UDF-Libraries.md#频域分析) - -```sql -# Conv; -select conv(s1,s2) from root.test.d2; - -# Deconv; -select deconv(s3,s2) from root.test.d2; -select deconv(s3,s2,'result'='remainder') from root.test.d2; - -# DWT; -select dwt(s1,"method"="haar") from root.test.d1; - -# FFT; -select fft(s1) from root.test.d1; -select fft(s1, 'result'='real', 'compress'='0.99'), fft(s1, 'result'='imag','compress'='0.99') from root.test.d1; - -# HighPass; -select highpass(s1,'wpass'='0.45') from root.test.d1; - -# IFFT; -select ifft(re, im, 'interval'='1m', 'start'='2021-01-01 00:00:00') from root.test.d1; - -# LowPass; -select lowpass(s1,'wpass'='0.45') from root.test.d1; - -# Envelope; -select envelope(s1) from root.test.d1; -``` - -### 8.5 Data Matching - -For more details, see [Data-Matching](../SQL-Manual/UDF-Libraries.md#数据匹配) - -```sql -# Cov; -select cov(s1,s2) from root.test.d2; - -# DTW; -select dtw(s1,s2) from root.test.d2; - -# Pearson; -select pearson(s1,s2) from root.test.d2; - -# PtnSym; -select ptnsym(s4, 'window'='5', 'threshold'='0') from root.test.d1; - -# XCorr; -select xcorr(s1, s2) from root.test.d1 where time <= 2020-01-01 00:00:05; -``` - -### 8.6 Data Repairing - -For more details, see [Data-Repairing](../SQL-Manual/UDF-Libraries.md#数据修复) - -```sql -# TimestampRepair; -select timestamprepair(s1,'interval'='10000') from root.test.d2; -select timestamprepair(s1) from root.test.d2; - -# ValueFill; -select valuefill(s1) from root.test.d2; -select valuefill(s1,"method"="previous") from root.test.d2; - -# ValueRepair; -select valuerepair(s1) from root.test.d2; -select valuerepair(s1,'method'='LsGreedy') from root.test.d2; - -# MasterRepair; -select MasterRepair(t1,t2,t3,m1,m2,m3) from root.test; - -# SeasonalRepair; -select seasonalrepair(s1,'period'=3,'k'=2) from root.test.d2; -select seasonalrepair(s1,'method'='improved','period'=3) from root.test.d2; -``` - -### 8.7 Series Discovery - -For more details, see [Series-Discovery](../SQL-Manual/UDF-Libraries.md#序列发现) - -```sql -# ConsecutiveSequences; 
-select consecutivesequences(s1,s2,'gap'='5m') from root.test.d1; -select consecutivesequences(s1,s2) from root.test.d1; - -# ConsecutiveWindows; -select consecutivewindows(s1,s2,'length'='10m') from root.test.d1; -``` - -### 8.8 Machine Learning - -For more details, see [Machine-Learning](../SQL-Manual/UDF-Libraries.md#机器学习) - -```sql -# AR; -select ar(s0,"p"="2") from root.test.d0; - -# Representation; -select representation(s0,"tb"="3","vb"="2") from root.test.d0; - -# RM; -select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0; -``` - -## 9. Conditional Expressions - -For more details, see [Conditional Expressions](./Operator-and-Expression.md#_3-条件表达式) - -```sql -select T, P, case -when 1000 < T and T < 1050 and 1000000 < P and P < 1100000 then "good!" -when T <= 1000 or T >= 1050 then "bad temperature" -when P<=1000000 or P>=1100000 then "bad pressure" -end as `result` -from root.test1; - -select str, case -when str like "%cc%" then "has cc" -when str like "%dd%" then "has dd" -else "no cc and dd" end as `result` -from root.test2; - -select -count(case when x<=1 then 1 end) as `(-∞,1]`, -count(case when 1 -[RESAMPLE - [EVERY ] - [BOUNDARY ] - [RANGE [, end_time_offset]] -] -[TIMEOUT POLICY BLOCKED|DISCARD] -BEGIN - SELECT CLAUSE - INTO CLAUSE - FROM CLAUSE - [WHERE CLAUSE] - [GROUP BY([, ]) [, level = ]] - [HAVING CLAUSE] - [FILL ({PREVIOUS | LINEAR | constant} (, interval=DURATION_LITERAL)?)] - [LIMIT rowLimit OFFSET rowOffset] - [ALIGN BY DEVICE] -END -``` - -#### Configuring the Periodic Execution Interval of a Continuous Query -```sql -CREATE CONTINUOUS QUERY cq1 -RESAMPLE EVERY 20s -BEGIN - SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) -END; - -SELECT temperature_max from root.ln.*.*; -``` -#### Configuring the Time Window Size of a Continuous Query -```sql -CREATE CONTINUOUS QUERY cq2 -RESAMPLE RANGE 40s -BEGIN - SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) 
-END; - -SELECT temperature_max from root.ln.*.*; -``` -#### Configuring Both the Periodic Execution Interval and the Time Window Size of a Continuous Query -```sql -CREATE CONTINUOUS QUERY cq3 -RESAMPLE EVERY 20s RANGE 40s -BEGIN - SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) - FILL(100.0) -END; - -SELECT temperature_max from root.ln.*.*; -``` -#### Configuring the End Time of the Time Window for Each Execution of a Continuous Query -```sql -CREATE CONTINUOUS QUERY cq4 -RESAMPLE EVERY 20s RANGE 40s, 20s -BEGIN - SELECT max_value(temperature) - INTO root.ln.wf02.wt02(temperature_max), root.ln.wf02.wt01(temperature_max), root.ln.wf01.wt02(temperature_max), root.ln.wf01.wt01(temperature_max) - FROM root.ln.*.* - GROUP BY(10s) - FILL(100.0) -END; - -SELECT temperature_max from root.ln.*.*; -``` -#### Continuous Query Without a GROUP BY TIME Clause -```sql -CREATE CONTINUOUS QUERY cq5 -RESAMPLE EVERY 20s -BEGIN - SELECT temperature + 1 - INTO root.precalculated_sg.::(temperature) - FROM root.ln.*.* - align by device -END; - -SELECT temperature from root.precalculated_sg.*.* align by device; -``` -### 11.2 Managing Continuous Queries - -#### Listing Existing Continuous Queries - -Show all continuous queries registered in the cluster -```sql -SHOW (CONTINUOUS QUERIES | CQS) -``` -```sql -SHOW CONTINUOUS QUERIES; -``` -#### Dropping an Existing Continuous Query - -Drop the continuous query named cq_id: - -```sql -DROP (CONTINUOUS QUERY | CQ) -``` -```sql -DROP CONTINUOUS QUERY s1_count_cq; -``` -#### As a Substitute for Subqueries - -1. Create a continuous query -```sql -CREATE CQ s1_count_cq -BEGIN - SELECT count(s1) - INTO root.sg_count.d.count_s1 - FROM root.sg.d - GROUP BY(30m) -END; -``` -2. Query the result of the continuous query -```sql -SELECT avg(count_s1) from root.sg_count.d; -``` -## 12. User-Defined Functions - -### 12.1 UDFParameters -```sql -SELECT UDF(s1, s2, 'key1'='iotdb', 'key2'='123.45') FROM root.sg.d; -``` -### 12.2 UDF Registration - -```sql -CREATE FUNCTION AS (USING URI URI-STRING)? 
-``` - -#### Without Specifying a URI -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample'; -``` -#### Specifying a URI -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' USING URI 'http://jar/example.jar'; -``` -### 12.3 UDF Deregistration - -```sql -DROP FUNCTION -``` -```sql -DROP FUNCTION example; -``` -### 12.4 UDF Queries - -#### Queries with Custom Input Parameters -```sql -SELECT example(s1, 'key1'='value1', 'key2'='value2'), example(*, 'key3'='value3') FROM root.sg.d1; -``` -```sql -SELECT example(s1, s2, 'key1'='value1', 'key2'='value2') FROM root.sg.d1; -``` -#### Nested Queries with Other Queries -```sql -SELECT s1, s2, example(s1, s2) FROM root.sg.d1; -SELECT *, example(*) FROM root.sg.d1 DISABLE ALIGN; -SELECT s1 * example(* / s1 + s2) FROM root.sg.d1; -SELECT s1, s2, s1 + example(s1, s2), s1 - example(s1 + example(s1, s2) / s2) FROM root.sg.d1; -``` -### 12.5 Showing All Registered UDFs -```sql -SHOW FUNCTIONS; -``` -## 13. Permission Management - -### 13.1 Users and Roles - -- Create a user (requires the MANAGE_USER privilege) - -```SQL -CREATE USER ; -eg: CREATE USER user1 'passwd'; -``` - -- Drop a user (requires the MANAGE_USER privilege) - -```SQL -DROP USER ; -eg: DROP USER user1; -``` - -- Create a role (requires the MANAGE_ROLE privilege) - -```SQL -CREATE ROLE ; -eg: CREATE ROLE role1; -``` - -- Drop a role (requires the MANAGE_ROLE privilege) - -```SQL -DROP ROLE ; -eg: DROP ROLE role1; -``` - -- Grant a role to a user (requires the MANAGE_ROLE privilege) - -```SQL -GRANT ROLE TO ; -eg: GRANT ROLE admin TO user1; -``` - -- Revoke a role from a user (requires the MANAGE_ROLE privilege) - -```SQL -REVOKE ROLE FROM ; -eg: REVOKE ROLE admin FROM user1; -``` - -- List all users (requires the MANAGE_USER privilege) - -```SQL -LIST USER; -``` - -- List all roles (requires the MANAGE_ROLE privilege) - -```SQL -LIST ROLE; -``` - -- List all users under a given role (requires the MANAGE_USER privilege) - -```SQL -LIST USER OF ROLE ; -eg: LIST USER OF ROLE roleuser; -``` - -- List all roles of a given user - -Users can list their own roles, but listing another user's roles requires the MANAGE_ROLE privilege. - -```SQL -LIST ROLE OF USER ; -eg: LIST ROLE OF USER tempuser; -``` - -- List all privileges of a user - -Users can list their own privileges, but listing another user's privileges requires the MANAGE_USER privilege. - -```SQL -LIST PRIVILEGES OF USER ; -eg: LIST PRIVILEGES OF USER tempuser; - -``` - -- List all privileges of a role - -Users can list the privileges of roles they hold, but listing the privileges of other roles requires the MANAGE_ROLE privilege. - 
-```SQL -LIST PRIVILEGES OF ROLE ; -eg: LIST PRIVILEGES OF ROLE actor; -``` - -- Change a password - -Users can change their own password, but changing another user's password requires the MANAGE_USER privilege. - -```SQL -ALTER USER SET PASSWORD ; -eg: ALTER USER tempuser SET PASSWORD 'newpwd'; -``` - -### 13.2 Granting and Revoking Privileges - -Users grant privileges to other users with the GRANT statement. The syntax is as follows: - -```SQL -GRANT ON TO ROLE/USER [WITH GRANT OPTION]; -eg: GRANT READ ON root.** TO ROLE role1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.** TO USER user1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.**,root.t2.** TO USER user1; -eg: GRANT MANAGE_ROLE ON root.** TO USER user1 WITH GRANT OPTION; -eg: GRANT ALL ON root.** TO USER user1 WITH GRANT OPTION; -``` - -Users revoke privileges from other users with the REVOKE statement. The syntax is as follows: - -```SQL -REVOKE ON FROM ROLE/USER ; -eg: REVOKE READ ON root.** FROM ROLE role1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.** FROM USER user1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.**, root.t2.** FROM USER user1; -eg: REVOKE MANAGE_ROLE ON root.** FROM USER user1; -eg: REVOKE ALL ON root.** FROM USER user1; -``` - diff --git a/src/zh/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md b/src/zh/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md deleted file mode 100644 index 2ef07e3b5..000000000 --- a/src/zh/UserGuide/latest/SQL-Manual/UDF-Libraries_timecho.md +++ /dev/null @@ -1,5093 +0,0 @@ - -# UDF Function Library - -Built on its user-defined function capability, IoTDB provides a series of functions for time series data processing, covering data quality, data profiling, anomaly detection, frequency domain analysis, data matching, data repairing, series discovery, and machine learning, which meet the needs of time series data processing in industrial settings. - -> Note: The functions in the current UDF library support only millisecond-level timestamp precision. - -## 1. Installation Steps -1. Obtain the compressed package of the UDF library JAR that is compatible with your IoTDB version. - - | UDF installation package | Supported IoTDB versions | Download link | - | --------------- | ----------------- | ------------------------------------------------------------ | - | TimechoDB-UDF-1.3.3.zip | V1.3.3 and above | Contact Timecho sales to obtain it | - | TimechoDB-UDF-1.3.2.zip | V1.0.0~V1.3.2 | Contact Timecho sales to obtain it | - -2. Place the `library-udf.jar` file from the obtained package in the `/ext/udf` directory of every node in the IoTDB cluster. -3. In the IoTDB SQL command-line interface (CLI) or the SQL operation page of the visual console (Workbench), execute the corresponding function registration statements below. -4. 
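Batch registration: two methods are available, a registration script or a consolidated SQL file -- Registration script - - Copy the registration script (`register-UDF.sh` or `register-UDF.bat`) from the package to the IoTDB tools directory as needed, and modify the parameters in the script (defaults: host=127.0.0.1, rpcPort=6667, user=root, pass=root); - - Start the IoTDB service and run the registration script to register the UDFs in batch - -- Consolidated SQL statements - - Open the SQL file in the package, copy all the SQL statements, and execute them in the IoTDB SQL command-line interface (CLI) or the SQL operation page of the visual console (Workbench) to register the UDFs in batch - -## 2. Data Quality - -### 2.1 Completeness - -#### Registration Statement - -```sql -create function completeness as 'org.apache.iotdb.library.dquality.UDTFCompleteness' -``` - -#### Function Overview - -This function computes the completeness of a time series, which measures whether a segment of time series data has missing points. The function splits the input series into consecutive, non-overlapping time windows, computes the completeness of each window separately, and outputs the timestamp of the first data point in each window together with its completeness. - -**Function name:** COMPLETENESS - -**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The window size, given either as an integer greater than 0 or as a positive number with a unit. The former is the number of data points in each window (the last window may contain fewer points); the latter is the time span of the window, with five units currently supported: 'ms' (milliseconds), 's' (seconds), 'm' (minutes), 'h' (hours), and 'd' (days). By default, all input data belongs to a single window. -+ `downtime`: Whether the completeness computation accounts for downtime exceptions. Its value is 'true' or 'false', and the default is 'true'. When downtime is considered, long stretches of missing data are treated as downtime and do not affect completeness. - -**Output series:** A single output series of type DOUBLE, where every data point has a value in the range [0,1]. - -**Note:** Completeness is computed only when a window contains more than 10 data points. Otherwise, the window is ignored and produces no output. - - -#### Examples - -##### Default Parameters - -With default parameters, the function treats all input data as a single window when computing completeness. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select completeness(s1) from root.test.d1 where time <= 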
2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-----------------------------+ -| Time|completeness(root.test.d1.s1)| -+-----------------------------+-----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -+-----------------------------+-----------------------------+ -``` - -##### Specified Window Size - -With a specified window size, the function splits the input data into several windows and computes completeness for each. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select completeness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------+ -| 
Time|completeness(root.test.d1.s1, "window"="15")| -+-----------------------------+--------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.875| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+--------------------------------------------+ -``` - -### 2.2 Consistency - -#### Registration Statement - -```sql -create function consistency as 'org.apache.iotdb.library.dquality.UDTFConsistency' -``` - -#### Function Overview - -This function computes the consistency of a time series, which measures whether the data changes smoothly and follows a uniform pattern. The function splits the input series into consecutive, non-overlapping time windows, computes the consistency of each window separately, and outputs the timestamp of the first data point in each window together with its consistency. - -**Function name:** CONSISTENCY - -**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The window size, given either as an integer greater than 0 or as a positive number with a unit. The former is the number of data points in each window (the last window may contain fewer points); the latter is the time span of the window, with five units currently supported: 'ms' (milliseconds), 's' (seconds), 'm' (minutes), 'h' (hours), and 'd' (days). By default, all input data belongs to a single window. - -**Output series:** A single output series of type DOUBLE, where every data point has a value in the range [0,1]. - -**Note:** Consistency is computed only when a window contains more than 10 data points. Otherwise, the window is ignored and produces no output. - - -#### Examples - -##### Default Parameters - -With default parameters, the function treats all input data as a single window when computing consistency. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select consistency(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+----------------------------+ -| Time|consistency(root.test.d1.s1)| 
-+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -+-----------------------------+----------------------------+ -``` - -##### Specified Window Size - -With a specified window size, the function splits the input data into several windows and computes consistency for each. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -|2020-01-01T00:00:32.000+08:00| 130.0| -|2020-01-01T00:00:34.000+08:00| 132.0| -|2020-01-01T00:00:36.000+08:00| 134.0| -|2020-01-01T00:00:38.000+08:00| 136.0| -|2020-01-01T00:00:40.000+08:00| 138.0| -|2020-01-01T00:00:42.000+08:00| 140.0| -|2020-01-01T00:00:44.000+08:00| 142.0| -|2020-01-01T00:00:46.000+08:00| 144.0| -|2020-01-01T00:00:48.000+08:00| 146.0| -|2020-01-01T00:00:50.000+08:00| 148.0| -|2020-01-01T00:00:52.000+08:00| 150.0| -|2020-01-01T00:00:54.000+08:00| 152.0| -|2020-01-01T00:00:56.000+08:00| 154.0| -|2020-01-01T00:00:58.000+08:00| 156.0| -|2020-01-01T00:01:00.000+08:00| 158.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select consistency(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------+ -| Time|consistency(root.test.d1.s1, "window"="15")| -+-----------------------------+-------------------------------------------+ 
-|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| -|2020-01-01T00:00:32.000+08:00| 1.0| -+-----------------------------+-------------------------------------------+ -``` - -### 2.3 Timeliness - -#### Registration Statement - -```sql -create function timeliness as 'org.apache.iotdb.library.dquality.UDTFTimeliness' -``` - -#### Function Overview - -This function computes the timeliness of a time series, which measures whether the data is collected and reported on time. The function splits the input series into consecutive, non-overlapping time windows, computes the timeliness of each window separately, and outputs the timestamp of the first data point in each window together with its timeliness. - -**Function name:** TIMELINESS - -**Input series:** Only a single input series is supported, of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `window`: The window size, given either as an integer greater than 0 or as a positive number with a unit. The former is the number of data points in each window (the last window may contain fewer points); the latter is the time span of the window, with five units currently supported: 'ms' (milliseconds), 's' (seconds), 'm' (minutes), 'h' (hours), and 'd' (days). By default, all input data belongs to a single window. - -**Output series:** A single output series of type DOUBLE, where every data point has a value in the range [0,1]. - -**Note:** Timeliness is computed only when a window contains more than 10 data points. Otherwise, the window is ignored and produces no output. - - -#### Examples - -##### Default Parameters - -With default parameters, the function treats all input data as a single window when computing timeliness. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select timeliness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------+ -| Time|timeliness(root.test.d1.s1)| -+-----------------------------+---------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.9333333333333333| 
-+-----------------------------+---------------------------+
-```
-
-##### Specifying the Window Size
-
-When a window size is specified, this function divides the input data into several windows and computes the timeliness of each one.
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:02.000+08:00| 100.0|
-|2020-01-01T00:00:03.000+08:00| 101.0|
-|2020-01-01T00:00:04.000+08:00| 102.0|
-|2020-01-01T00:00:06.000+08:00| 104.0|
-|2020-01-01T00:00:08.000+08:00| 126.0|
-|2020-01-01T00:00:10.000+08:00| 108.0|
-|2020-01-01T00:00:14.000+08:00| 112.0|
-|2020-01-01T00:00:15.000+08:00| 113.0|
-|2020-01-01T00:00:16.000+08:00| 114.0|
-|2020-01-01T00:00:18.000+08:00| 116.0|
-|2020-01-01T00:00:20.000+08:00| 118.0|
-|2020-01-01T00:00:22.000+08:00| 120.0|
-|2020-01-01T00:00:26.000+08:00| 124.0|
-|2020-01-01T00:00:28.000+08:00| 126.0|
-|2020-01-01T00:00:30.000+08:00| NaN|
-|2020-01-01T00:00:32.000+08:00| 130.0|
-|2020-01-01T00:00:34.000+08:00| 132.0|
-|2020-01-01T00:00:36.000+08:00| 134.0|
-|2020-01-01T00:00:38.000+08:00| 136.0|
-|2020-01-01T00:00:40.000+08:00| 138.0|
-|2020-01-01T00:00:42.000+08:00| 140.0|
-|2020-01-01T00:00:44.000+08:00| 142.0|
-|2020-01-01T00:00:46.000+08:00| 144.0|
-|2020-01-01T00:00:48.000+08:00| 146.0|
-|2020-01-01T00:00:50.000+08:00| 148.0|
-|2020-01-01T00:00:52.000+08:00| 150.0|
-|2020-01-01T00:00:54.000+08:00| 152.0|
-|2020-01-01T00:00:56.000+08:00| 154.0|
-|2020-01-01T00:00:58.000+08:00| 156.0|
-|2020-01-01T00:01:00.000+08:00| 158.0|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select timeliness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00
-```
-
-Output series:
-
-```
-+-----------------------------+------------------------------------------+
-| Time|timeliness(root.test.d1.s1, "window"="15")|
-+-----------------------------+------------------------------------------+
-|2020-01-01T00:00:02.000+08:00| 0.9333333333333333|
-|2020-01-01T00:00:32.000+08:00| 1.0|
-+-----------------------------+------------------------------------------+
-```
-
-### 2.4 Validity
-
-#### Registration Statement
-
-```sql
-create function validity as 'org.apache.iotdb.library.dquality.UDTFValidity'
-```
-
-#### Usage
-
-This function computes the validity of a time series, which measures whether the data is normal, usable, and free of outliers. The input series is divided into consecutive, non-overlapping windows; the validity of each window is computed separately, and the timestamp of the first data point in each window is output together with its validity.
-
-
-**Name:** VALIDITY
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `window`: The window size. It is either a positive integer or a positive number with a unit. The former is the number of data points in each window (the last window may contain fewer points); the latter is the time span of each window. Five units are currently supported: 'ms' (millisecond), 's' (second), 'm' (minute), 'h' (hour), and 'd' (day). By default, all input data belongs to the same window.
-
-**Output Series:** A single series of type DOUBLE, in which every value lies in the range [0,1].
-
-**Note:** Validity is computed only when a window contains more than 10 data points. Otherwise, the window is ignored and produces no output.
-
-
-#### Examples
-
-##### Default Parameters
-
-With default parameters, this function treats all input data as a single window when computing the validity.
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:02.000+08:00| 100.0|
-|2020-01-01T00:00:03.000+08:00| 101.0|
-|2020-01-01T00:00:04.000+08:00| 102.0|
-|2020-01-01T00:00:06.000+08:00| 104.0|
-|2020-01-01T00:00:08.000+08:00| 126.0|
-|2020-01-01T00:00:10.000+08:00| 108.0|
-|2020-01-01T00:00:14.000+08:00| 112.0|
-|2020-01-01T00:00:15.000+08:00| 113.0|
-|2020-01-01T00:00:16.000+08:00| 114.0|
-|2020-01-01T00:00:18.000+08:00| 116.0|
-|2020-01-01T00:00:20.000+08:00| 118.0|
-|2020-01-01T00:00:22.000+08:00| 120.0|
-|2020-01-01T00:00:26.000+08:00| 124.0|
-|2020-01-01T00:00:28.000+08:00| 126.0|
-|2020-01-01T00:00:30.000+08:00| NaN|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select validity(s1) from root.test.d1 where time <= 2020-01-01 00:00:30
-```
-
-Output series:
-
-```
-+-----------------------------+-------------------------+
-| Time|validity(root.test.d1.s1)|
-+-----------------------------+-------------------------+
-|2020-01-01T00:00:02.000+08:00| 0.8833333333333333|
-+-----------------------------+-------------------------+
-```
-
-##### Specifying the Window Size
-
-When a window size is specified, this function divides the input data into several windows and computes the validity of each one.
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:02.000+08:00| 100.0|
-|2020-01-01T00:00:03.000+08:00| 101.0|
-|2020-01-01T00:00:04.000+08:00| 102.0|
-|2020-01-01T00:00:06.000+08:00| 104.0|
-|2020-01-01T00:00:08.000+08:00| 126.0|
-|2020-01-01T00:00:10.000+08:00| 108.0|
-|2020-01-01T00:00:14.000+08:00| 112.0|
-|2020-01-01T00:00:15.000+08:00| 113.0|
-|2020-01-01T00:00:16.000+08:00| 114.0|
-|2020-01-01T00:00:18.000+08:00| 116.0|
-|2020-01-01T00:00:20.000+08:00| 118.0|
-|2020-01-01T00:00:22.000+08:00| 120.0|
-|2020-01-01T00:00:26.000+08:00| 124.0|
-|2020-01-01T00:00:28.000+08:00| 126.0|
-|2020-01-01T00:00:30.000+08:00| NaN|
-|2020-01-01T00:00:32.000+08:00| 130.0|
-|2020-01-01T00:00:34.000+08:00| 132.0|
-|2020-01-01T00:00:36.000+08:00| 134.0|
-|2020-01-01T00:00:38.000+08:00| 136.0|
-|2020-01-01T00:00:40.000+08:00| 138.0|
-|2020-01-01T00:00:42.000+08:00| 140.0|
-|2020-01-01T00:00:44.000+08:00| 142.0|
-|2020-01-01T00:00:46.000+08:00| 144.0|
-|2020-01-01T00:00:48.000+08:00| 146.0|
-|2020-01-01T00:00:50.000+08:00| 148.0|
-|2020-01-01T00:00:52.000+08:00| 150.0|
-|2020-01-01T00:00:54.000+08:00| 152.0|
-|2020-01-01T00:00:56.000+08:00| 154.0|
-|2020-01-01T00:00:58.000+08:00| 156.0|
-|2020-01-01T00:01:00.000+08:00| 158.0|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select validity(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00
-```
-
-Output series:
-
-```
-+-----------------------------+----------------------------------------+
-| Time|validity(root.test.d1.s1, "window"="15")|
-+-----------------------------+----------------------------------------+
-|2020-01-01T00:00:02.000+08:00| 0.8833333333333333|
-|2020-01-01T00:00:32.000+08:00| 1.0|
-+-----------------------------+----------------------------------------+
-```
-
-
-
-
-## 3. Data Profiling
-
-### 3.1 ACF
-
-#### Registration Statement
-
-```sql
-create function acf as 'org.apache.iotdb.library.dprofile.UDTFACF'
-```
-
-#### Usage
-
-This function computes the autocorrelation function of a time series, i.e., the cross-correlation of the series with itself.
-
-**Name:** ACF
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Output Series:** A single series of type DOUBLE. The series contains $2N-1$ data points in total.
-
-**Note:**
-
-+ `NaN` values in the series are ignored and treated as 0 in the computation.
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:01.000+08:00| 1|
-|2020-01-01T00:00:02.000+08:00| NaN|
-|2020-01-01T00:00:03.000+08:00| 3|
-|2020-01-01T00:00:04.000+08:00| NaN|
-|2020-01-01T00:00:05.000+08:00| 5|
-+-----------------------------+---------------+
-```
-
-
-SQL for query:
-
-```sql
-select acf(s1) from root.test.d1 where time <= 2020-01-01 00:00:05
-```
-
-Output series:
-
-```
-+-----------------------------+--------------------+
-| Time|acf(root.test.d1.s1)|
-+-----------------------------+--------------------+
-|1970-01-01T08:00:00.000+08:00| 5.0|
-|1970-01-01T08:00:00.001+08:00| 0.0|
-|1970-01-01T08:00:00.002+08:00| 6.0|
-|1970-01-01T08:00:00.003+08:00| 0.0|
-|1970-01-01T08:00:00.004+08:00| 7.0|
-|1970-01-01T08:00:00.005+08:00| 0.0|
-|1970-01-01T08:00:00.006+08:00| 3.6|
-|1970-01-01T08:00:00.007+08:00| 0.0|
-|1970-01-01T08:00:00.008+08:00| 1.0|
-+-----------------------------+--------------------+
-```
-
-### 3.2 Distinct
-
-#### Registration Statement
-
-```sql
-create function distinct as 'org.apache.iotdb.library.dprofile.UDTFDistinct'
-```
-
-#### Usage
-
-This function returns all distinct elements that appear in the input series.
-
-**Name:** DISTINCT
-
-**Input Series:** Only a single input series is supported. The type can be arbitrary.
-
-**Output Series:** A single series of the same type as the input.
-
-**Note:**
-
-+ The timestamps of the output series are meaningless. The output order is arbitrary.
-+ Missing values and null values are ignored, but `NaN` is not.
-+ Strings are case-sensitive.
-
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d2.s2|
-+-----------------------------+---------------+
-|2020-01-01T08:00:00.001+08:00| Hello|
-|2020-01-01T08:00:00.002+08:00| hello|
-|2020-01-01T08:00:00.003+08:00| Hello|
-|2020-01-01T08:00:00.004+08:00| World|
-|2020-01-01T08:00:00.005+08:00| World|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select distinct(s2) from root.test.d2
-```
-
-Output series:
-
-```
-+-----------------------------+-------------------------+
-| Time|distinct(root.test.d2.s2)|
-+-----------------------------+-------------------------+
-|1970-01-01T08:00:00.001+08:00| Hello|
-|1970-01-01T08:00:00.002+08:00| hello|
-|1970-01-01T08:00:00.003+08:00| World|
-+-----------------------------+-------------------------+
-```
-
-### 3.3 Histogram
-
-#### Registration Statement
-
-```sql
-create function histogram as 'org.apache.iotdb.library.dprofile.UDTFHistogram'
-```
-
-#### Usage
-
-This function computes the distribution histogram of a single column of numeric data.
-
-**Name:** HISTOGRAM
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `min`: The lower bound of the requested data range. The default value is -Double.MAX_VALUE.
-+ `max`: The upper bound of the requested data range. The default value is Double.MAX_VALUE. The value of `min` must be less than or equal to `max`.
-+ `count`: The number of histogram buckets. The default value is 1, and it must be a positive integer.
-
-**Output Series:** The values of the histogram buckets, where the i-th bucket (counting from 1) covers the data range with lower bound $min+ (i-1)\cdot\frac{max-min}{count}$ and upper bound $min+ i \cdot \frac{max-min}{count}$.
-
-
-**Note:**
-
-+ If the value of a data point is less than `min`, it is put into the 1st bucket; if it is greater than `max`, it is put into the last bucket.
-+ Null values, missing values, and `NaN` in the data are ignored.
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:00.000+08:00| 1.0|
-|2020-01-01T00:00:01.000+08:00| 2.0|
-|2020-01-01T00:00:02.000+08:00| 3.0|
-|2020-01-01T00:00:03.000+08:00| 4.0|
-|2020-01-01T00:00:04.000+08:00| 5.0|
-|2020-01-01T00:00:05.000+08:00| 6.0|
-|2020-01-01T00:00:06.000+08:00| 7.0|
-|2020-01-01T00:00:07.000+08:00| 8.0|
-|2020-01-01T00:00:08.000+08:00| 9.0|
-|2020-01-01T00:00:09.000+08:00| 10.0|
-|2020-01-01T00:00:10.000+08:00| 11.0|
-|2020-01-01T00:00:11.000+08:00| 12.0|
-|2020-01-01T00:00:12.000+08:00| 13.0|
-|2020-01-01T00:00:13.000+08:00| 14.0|
-|2020-01-01T00:00:14.000+08:00| 15.0|
-|2020-01-01T00:00:15.000+08:00| 16.0|
-|2020-01-01T00:00:16.000+08:00| 17.0|
-|2020-01-01T00:00:17.000+08:00| 18.0|
-|2020-01-01T00:00:18.000+08:00| 19.0|
-|2020-01-01T00:00:19.000+08:00| 20.0|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select histogram(s1,"min"="1","max"="20","count"="10") from root.test.d1
-```
-
-Output series:
-
-```
-+-----------------------------+---------------------------------------------------------------+
-| Time|histogram(root.test.d1.s1, "min"="1", "max"="20", "count"="10")|
-+-----------------------------+---------------------------------------------------------------+
-|1970-01-01T08:00:00.000+08:00| 2|
-|1970-01-01T08:00:00.001+08:00| 2|
-|1970-01-01T08:00:00.002+08:00| 2|
-|1970-01-01T08:00:00.003+08:00| 2|
-|1970-01-01T08:00:00.004+08:00| 2|
-|1970-01-01T08:00:00.005+08:00| 2|
-|1970-01-01T08:00:00.006+08:00| 2|
-|1970-01-01T08:00:00.007+08:00| 2|
-|1970-01-01T08:00:00.008+08:00| 2|
-|1970-01-01T08:00:00.009+08:00| 2|
-+-----------------------------+---------------------------------------------------------------+
-```
-
-### 3.4 Integral
-
-#### Registration Statement
-
-```sql
-create function integral as 'org.apache.iotdb.library.dprofile.UDAFIntegral'
-```
-
-#### Usage
-
-This function computes the numerical integral of a time series, i.e., the area under the polyline obtained by plotting the series with time on the horizontal axis and value on the vertical axis.
-
-**Name:** INTEGRAL
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `unit`: The unit of the time axis used for the integration. It is one of "1S", "1s", "1m", "1H", "1d" (case-sensitive), meaning that the integral is computed in milliseconds, seconds, minutes, hours, or days, respectively.
-  The default is "1s", i.e., seconds.
-
-**Output Series:** A single series of type DOUBLE that contains exactly one data point, whose timestamp is 0 and whose value is the integral.
-
-**Note:**
-
-+ The integral equals the sum of the areas of the right trapezoids formed by every two adjacent data points and the time axis. Choosing a different time unit is equivalent to scaling the horizontal axis, so results under different units can be converted directly by the scaling factor.
-
-+ `NaN` values in the data are ignored. The polyline connects the two nearest data points that have values.
-
-#### Examples
-
-##### Default Parameters
-
-By default, the integral uses 1s as the time unit.
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:01.000+08:00| 1|
-|2020-01-01T00:00:02.000+08:00| 2|
-|2020-01-01T00:00:03.000+08:00| 5|
-|2020-01-01T00:00:04.000+08:00| 6|
-|2020-01-01T00:00:05.000+08:00| 7|
-|2020-01-01T00:00:08.000+08:00| 8|
-|2020-01-01T00:00:09.000+08:00| NaN|
-|2020-01-01T00:00:10.000+08:00| 10|
-+-----------------------------+---------------+
-```
-
-
-SQL for query:
-
-```sql
-select integral(s1) from root.test.d1 where time <= 2020-01-01 00:00:10
-```
-
-Output series:
-
-```
-+-----------------------------+-------------------------+
-| Time|integral(root.test.d1.s1)|
-+-----------------------------+-------------------------+
-|1970-01-01T08:00:00.000+08:00| 57.5|
-+-----------------------------+-------------------------+
-```
-
-It is computed as:
-$$\frac{1}{2}[(1+2)\times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 57.5$$
-
-
-##### Specifying the Time Unit
-
-Use minutes as the time unit.
-
-
-The input series is the same as above. SQL for query:
-
-```sql
-select integral(s1, "unit"="1m") from root.test.d1 where time <= 2020-01-01 00:00:10
-```
-
-Output series:
-
-```
-+-----------------------------+-------------------------+
-| Time|integral(root.test.d1.s1)|
-+-----------------------------+-------------------------+
-|1970-01-01T08:00:00.000+08:00| 0.958|
-+-----------------------------+-------------------------+
-```
-
-It is computed as:
-$$\frac{1}{2\times 60}[(1+2) \times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] = 0.958$$
-
-### 3.5 IntegralAvg
-
-#### Registration Statement
-
-```sql
-create function integralavg as 'org.apache.iotdb.library.dprofile.UDAFIntegralAvg'
-```
-
-#### Usage
-
-This function computes the function average of a time series, i.e., the numerical integral divided by the total time span of the series, both measured in the same time unit. For details on how the numerical integral is computed, see the `Integral` function.
-
-**Name:** INTEGRALAVG
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Output Series:** A single series of type DOUBLE that contains exactly one data point, whose timestamp is 0 and whose value is the time-weighted average.
-
-**Note:**
-
-+ The time-weighted average equals the numerical integral computed under any time unit `unit` (i.e., the sum of the areas of the right trapezoids formed by every two adjacent data points and the time axis),
-  divided by the time span of the input series in the same time unit. The result does not depend on the time unit used; by default it is consistent with the IoTDB time unit.
-
-+ `NaN` values in the data are ignored. The polyline connects the two nearest data points that have values.
-
-+ If the input series is empty, the output is 0; if it contains only one data point, the output is the value of that point.
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:01.000+08:00| 1|
-|2020-01-01T00:00:02.000+08:00| 2|
-|2020-01-01T00:00:03.000+08:00| 5|
-|2020-01-01T00:00:04.000+08:00| 6|
-|2020-01-01T00:00:05.000+08:00| 7|
-|2020-01-01T00:00:08.000+08:00| 8|
-|2020-01-01T00:00:09.000+08:00| NaN|
-|2020-01-01T00:00:10.000+08:00| 10|
-+-----------------------------+---------------+
-```
-
-
-SQL for query:
-
-```sql
-select integralavg(s1) from root.test.d1 where time <= 2020-01-01 00:00:10
-```
-
-Output series:
-
-```
-+-----------------------------+----------------------------+
-| Time|integralavg(root.test.d1.s1)|
-+-----------------------------+----------------------------+
-|1970-01-01T08:00:00.000+08:00| 6.388888888888889|
-+-----------------------------+----------------------------+
-```
-
-It is computed as follows, where the divisor 9 is the time span in seconds between the first and the last data point:
-$$\frac{1}{2}[(1+2)\times 1 + (2+5) \times 1 + (5+6) \times 1 + (6+7) \times 1 + (7+8) \times 3 + (8+10) \times 2] / 9 \approx 6.389$$
-
-### 3.6 Mad
-
-#### Registration Statement
-
-```sql
-create function mad as 'org.apache.iotdb.library.dprofile.UDAFMad'
-```
-
-#### Usage
-
-This function computes the exact or approximate median absolute deviation (MAD) of a single column of numeric data. The MAD is the median of the absolute deviations of all values from their median.
-
-For example, for the data set $\{1,3,3,5,5,6,7,8,9\}$, the median is 5 and the absolute deviations from the median are $\{0,0,1,2,2,2,3,4,4\}$, whose median is 2. The MAD of the original data set is therefore 2.
-
-**Name:** MAD
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `error`: The value-based relative error of the approximate MAD, in the range [0,1). The default value is 0. For example, with `error`=0.01, let a be the exact MAD and b the approximate MAD; then the inequality $0.99a \le b \le 1.01a$ holds. With `error`=0, the result is the exact MAD.
-
-
-**Output Series:** A single series of type DOUBLE that contains exactly one data point, whose timestamp is 0 and whose value is the MAD.
-
-**Note:** Null values, missing values, and `NaN` in the data are ignored.
-
-#### Examples
-
-##### Approximate Query
-
-When the `error` parameter is not 0, this function computes the approximate MAD.
-
-Input series:
-
-```
-+-----------------------------+------------+
-| Time|root.test.s1|
-+-----------------------------+------------+
-|1970-01-01T08:00:00.100+08:00| 0.0|
-|1970-01-01T08:00:00.200+08:00| 0.0|
-|1970-01-01T08:00:00.300+08:00| 1.0|
-|1970-01-01T08:00:00.400+08:00| -1.0|
-|1970-01-01T08:00:00.500+08:00| 0.0|
-|1970-01-01T08:00:00.600+08:00| 0.0|
-|1970-01-01T08:00:00.700+08:00| -2.0|
-|1970-01-01T08:00:00.800+08:00| 2.0|
-|1970-01-01T08:00:00.900+08:00| 0.0|
-|1970-01-01T08:00:01.000+08:00| 0.0|
-|1970-01-01T08:00:01.100+08:00| 1.0|
-|1970-01-01T08:00:01.200+08:00| -1.0|
-|1970-01-01T08:00:01.300+08:00| -1.0|
-|1970-01-01T08:00:01.400+08:00| 1.0|
-|1970-01-01T08:00:01.500+08:00| 0.0|
-|1970-01-01T08:00:01.600+08:00| 0.0|
-|1970-01-01T08:00:01.700+08:00| 10.0|
-|1970-01-01T08:00:01.800+08:00| 2.0|
-|1970-01-01T08:00:01.900+08:00| -2.0|
-|1970-01-01T08:00:02.000+08:00| 0.0|
-+-----------------------------+------------+
-............
-Total line number = 20
-```
-
-SQL for query:
-
-```sql
-select mad(s1, "error"="0.01") from root.test
-```
-
-Output series:
-
-```
-+-----------------------------+---------------------------------+
-| Time|mad(root.test.s1, "error"="0.01")|
-+-----------------------------+---------------------------------+
-|1970-01-01T08:00:00.000+08:00| 0.9900000000000001|
-+-----------------------------+---------------------------------+
-```
-
-### 3.7 Median
-
-#### Registration Statement
-
-```sql
-create function median as 'org.apache.iotdb.library.dprofile.UDAFMedian'
-```
-
-#### Usage
-
-This function computes the exact or approximate median of a single column of numeric data. The median is the value in the middle position of the sorted data; when the series contains an even number of values, the median is the mean of the two middle values.
-
-**Name:** MEDIAN
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `error`: The rank-based error of the approximate median, in the range [0,1). The default value is 0. For example, with `error`=0.01, the true rank percentage of the computed median lies between 0.49 and 0.51. With `error`=0, the result is the exact median.
-
-**Output Series:** A single series of type DOUBLE that contains exactly one data point, whose timestamp is 0 and whose value is the median.
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+------------+
-| Time|root.test.s1|
-+-----------------------------+------------+
-|1970-01-01T08:00:00.100+08:00| 0.0|
-|1970-01-01T08:00:00.200+08:00| 0.0|
-|1970-01-01T08:00:00.300+08:00| 1.0|
-|1970-01-01T08:00:00.400+08:00| -1.0|
-|1970-01-01T08:00:00.500+08:00| 0.0|
-|1970-01-01T08:00:00.600+08:00| 0.0|
-|1970-01-01T08:00:00.700+08:00| -2.0|
-|1970-01-01T08:00:00.800+08:00| 2.0|
-|1970-01-01T08:00:00.900+08:00| 0.0|
-|1970-01-01T08:00:01.000+08:00| 0.0|
-|1970-01-01T08:00:01.100+08:00| 1.0|
-|1970-01-01T08:00:01.200+08:00| -1.0|
-|1970-01-01T08:00:01.300+08:00| -1.0|
-|1970-01-01T08:00:01.400+08:00| 1.0|
-|1970-01-01T08:00:01.500+08:00| 0.0|
-|1970-01-01T08:00:01.600+08:00| 0.0|
-|1970-01-01T08:00:01.700+08:00| 10.0|
-|1970-01-01T08:00:01.800+08:00| 2.0|
-|1970-01-01T08:00:01.900+08:00| -2.0|
-|1970-01-01T08:00:02.000+08:00| 0.0|
-+-----------------------------+------------+
-Total line number = 20
-```
-
-SQL for query:
-
-```sql
-select median(s1, "error"="0.01") from root.test
-```
-
-Output series:
-
-```
-+-----------------------------+------------------------------------+
-| Time|median(root.test.s1, "error"="0.01")|
-+-----------------------------+------------------------------------+
-|1970-01-01T08:00:00.000+08:00| 0.0|
-+-----------------------------+------------------------------------+
-```
-
-### 3.8 MinMax
-
-#### Registration Statement
-
-```sql
-create function minmax as 'org.apache.iotdb.library.dprofile.UDTFMinMax'
-```
-
-#### Usage
-
-This function normalizes the input series using the min-max method: the minimum value is mapped to 0 and the maximum value is mapped to 1.
-
-**Name:** MINMAX
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `compute`: If set to "batch", all data is read in before the transformation; if set to "stream", the user must provide the maximum and minimum values, and the transformation is computed in a streaming manner. The default is "batch".
-+ `min`: The minimum value used in streaming computation.
-+ `max`: The maximum value used in streaming computation.
-
-**Output Series:** A single series of type DOUBLE.
-
-#### Examples
-
-##### Batch Computation
-
-Input series:
-
-```
-+-----------------------------+------------+
-| Time|root.test.s1|
-+-----------------------------+------------+
-|1970-01-01T08:00:00.100+08:00| 0.0|
-|1970-01-01T08:00:00.200+08:00| 0.0|
-|1970-01-01T08:00:00.300+08:00| 1.0|
-|1970-01-01T08:00:00.400+08:00| -1.0|
-|1970-01-01T08:00:00.500+08:00| 0.0|
-|1970-01-01T08:00:00.600+08:00| 0.0|
-|1970-01-01T08:00:00.700+08:00| -2.0|
-|1970-01-01T08:00:00.800+08:00| 2.0|
-|1970-01-01T08:00:00.900+08:00| 0.0|
-|1970-01-01T08:00:01.000+08:00| 0.0|
-|1970-01-01T08:00:01.100+08:00| 1.0|
-|1970-01-01T08:00:01.200+08:00| -1.0|
-|1970-01-01T08:00:01.300+08:00| -1.0|
-|1970-01-01T08:00:01.400+08:00| 1.0|
-|1970-01-01T08:00:01.500+08:00| 0.0|
-|1970-01-01T08:00:01.600+08:00| 0.0|
-|1970-01-01T08:00:01.700+08:00| 10.0|
-|1970-01-01T08:00:01.800+08:00| 2.0|
-|1970-01-01T08:00:01.900+08:00| -2.0|
-|1970-01-01T08:00:02.000+08:00| 0.0|
-+-----------------------------+------------+
-```
-
-SQL for query:
-
-```sql
-select minmax(s1) from root.test
-```
-
-Output series:
-
-```
-+-----------------------------+--------------------+
-| Time|minmax(root.test.s1)|
-+-----------------------------+--------------------+
-|1970-01-01T08:00:00.100+08:00| 0.16666666666666666|
-|1970-01-01T08:00:00.200+08:00| 0.16666666666666666|
-|1970-01-01T08:00:00.300+08:00| 0.25|
-|1970-01-01T08:00:00.400+08:00| 0.08333333333333333|
-|1970-01-01T08:00:00.500+08:00| 0.16666666666666666|
-|1970-01-01T08:00:00.600+08:00| 0.16666666666666666|
-|1970-01-01T08:00:00.700+08:00| 0.0|
-|1970-01-01T08:00:00.800+08:00| 0.3333333333333333|
-|1970-01-01T08:00:00.900+08:00| 0.16666666666666666|
-|1970-01-01T08:00:01.000+08:00| 0.16666666666666666|
-|1970-01-01T08:00:01.100+08:00| 0.25|
-|1970-01-01T08:00:01.200+08:00| 0.08333333333333333|
-|1970-01-01T08:00:01.300+08:00| 0.08333333333333333|
-|1970-01-01T08:00:01.400+08:00| 0.25|
-|1970-01-01T08:00:01.500+08:00| 0.16666666666666666|
-|1970-01-01T08:00:01.600+08:00| 0.16666666666666666|
-|1970-01-01T08:00:01.700+08:00| 1.0|
-|1970-01-01T08:00:01.800+08:00| 0.3333333333333333|
-|1970-01-01T08:00:01.900+08:00| 0.0|
-|1970-01-01T08:00:02.000+08:00| 0.16666666666666666|
-+-----------------------------+--------------------+
-```
-
-
-
-### 3.9 MvAvg
-
-#### Registration Statement
-
-```sql
-create function mvavg as 'org.apache.iotdb.library.dprofile.UDTFMvAvg'
-```
-
-#### Usage
-
-This function computes the moving average of a series.
-
-**Name:** MVAVG
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `window`: The length of the moving window. The default value is 10.
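To make the `window` parameter concrete, the following is a minimal Python sketch of a textbook trailing moving average. It is not the library's implementation: the function name `moving_average` and the choice to stamp each result with the timestamp of the last point in the window are assumptions made for illustration, and the sketch is not guaranteed to reproduce the example table of this section value for value.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def moving_average(
    points: Iterable[Tuple[int, float]], window: int = 10
) -> Iterator[Tuple[int, float]]:
    """Yield (timestamp, mean) for every full trailing window of `window` points.

    Hypothetical helper: the result carries the timestamp of the newest
    point in the window, and nothing is emitted until the window is full.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    buf = deque()      # values currently inside the window
    total = 0.0        # running sum of the values in `buf`
    for ts, value in points:
        buf.append(value)
        total += value
        if len(buf) > window:
            total -= buf.popleft()   # drop the oldest value
        if len(buf) == window:
            yield ts, total / window


if __name__ == "__main__":
    sample = [(100, 0.0), (200, 0.0), (300, 1.0), (400, -1.0)]
    for ts, avg in moving_average(sample, window=3):
        print(ts, avg)
```

Keeping a running sum makes each step constant-time regardless of the window length, which matters when the window is large.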
-
-**Output Series:** A single series of type DOUBLE.
-
-#### Examples
-
-##### Specifying the Window Length
-
-Input series:
-
-```
-+-----------------------------+------------+
-| Time|root.test.s1|
-+-----------------------------+------------+
-|1970-01-01T08:00:00.100+08:00| 0.0|
-|1970-01-01T08:00:00.200+08:00| 0.0|
-|1970-01-01T08:00:00.300+08:00| 1.0|
-|1970-01-01T08:00:00.400+08:00| -1.0|
-|1970-01-01T08:00:00.500+08:00| 0.0|
-|1970-01-01T08:00:00.600+08:00| 0.0|
-|1970-01-01T08:00:00.700+08:00| -2.0|
-|1970-01-01T08:00:00.800+08:00| 2.0|
-|1970-01-01T08:00:00.900+08:00| 0.0|
-|1970-01-01T08:00:01.000+08:00| 0.0|
-|1970-01-01T08:00:01.100+08:00| 1.0|
-|1970-01-01T08:00:01.200+08:00| -1.0|
-|1970-01-01T08:00:01.300+08:00| -1.0|
-|1970-01-01T08:00:01.400+08:00| 1.0|
-|1970-01-01T08:00:01.500+08:00| 0.0|
-|1970-01-01T08:00:01.600+08:00| 0.0|
-|1970-01-01T08:00:01.700+08:00| 10.0|
-|1970-01-01T08:00:01.800+08:00| 2.0|
-|1970-01-01T08:00:01.900+08:00| -2.0|
-|1970-01-01T08:00:02.000+08:00| 0.0|
-+-----------------------------+------------+
-```
-
-SQL for query:
-
-```sql
-select mvavg(s1, "window"="3") from root.test
-```
-
-Output series:
-
-```
-+-----------------------------+---------------------------------+
-| Time|mvavg(root.test.s1, "window"="3")|
-+-----------------------------+---------------------------------+
-|1970-01-01T08:00:00.300+08:00| 0.3333333333333333|
-|1970-01-01T08:00:00.400+08:00| -0.3333333333333333|
-|1970-01-01T08:00:00.500+08:00| 0.0|
-|1970-01-01T08:00:00.600+08:00| -0.3333333333333333|
-|1970-01-01T08:00:00.700+08:00| -0.3333333333333333|
-|1970-01-01T08:00:00.800+08:00| 0.6666666666666666|
-|1970-01-01T08:00:00.900+08:00| 0.0|
-|1970-01-01T08:00:01.000+08:00| 0.6666666666666666|
-|1970-01-01T08:00:01.100+08:00| -0.3333333333333333|
-|1970-01-01T08:00:01.200+08:00| -0.3333333333333333|
-|1970-01-01T08:00:01.300+08:00| -0.3333333333333333|
-|1970-01-01T08:00:01.400+08:00| 0.0|
-|1970-01-01T08:00:01.500+08:00| 0.3333333333333333|
-|1970-01-01T08:00:01.600+08:00| 0.3333333333333333|
-|1970-01-01T08:00:01.700+08:00| 3.0|
-|1970-01-01T08:00:01.800+08:00| 0.6666666666666666|
-|1970-01-01T08:00:01.900+08:00| -0.6666666666666666|
-|1970-01-01T08:00:02.000+08:00| -3.3333333333333335|
-+-----------------------------+---------------------------------+
-```
-
-### 3.10 PACF
-
-#### Registration Statement
-
-```sql
-create function pacf as 'org.apache.iotdb.library.dprofile.UDTFPACF'
-```
-
-#### Usage
-
-This function computes the partial autocorrelation coefficients of a series by solving the Yule-Walker equations. For some special input series the equations may have no solution, in which case `NaN` is output.
-
-**Name:** PACF
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `lag`: The maximum lag order. The default value is $\min(10\log_{10}n,n-1)$, where $n$ is the number of data points.
-
-**Output Series:** A single series of type DOUBLE.
-
-#### Examples
-
-##### Specifying the Lag Order
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d1.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:01.000+08:00| 1|
-|2020-01-01T00:00:02.000+08:00| NaN|
-|2020-01-01T00:00:03.000+08:00| 3|
-|2020-01-01T00:00:04.000+08:00| NaN|
-|2020-01-01T00:00:05.000+08:00| 5|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select pacf(s1, "lag"="5") from root.test.d1
-```
-
-Output series:
-
-```
-+-----------------------------+--------------------------------+
-| Time|pacf(root.test.d1.s1, "lag"="5")|
-+-----------------------------+--------------------------------+
-|2020-01-01T00:00:01.000+08:00| 1.0|
-|2020-01-01T00:00:02.000+08:00| -0.5744680851063829|
-|2020-01-01T00:00:03.000+08:00| 0.3172297297297296|
-|2020-01-01T00:00:04.000+08:00| -0.2977686586304181|
-|2020-01-01T00:00:05.000+08:00| -2.0609033521065867|
-+-----------------------------+--------------------------------+
-```
-
-### 3.11 Percentile
-
-#### Registration Statement
-
-```sql
-create function percentile as 'org.apache.iotdb.library.dprofile.UDAFPercentile'
-```
-
-#### Usage
-
-This function computes the exact or approximate quantile of a single column of numeric data.
-
-**Name:** PERCENTILE
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `rank`: The rank percentage of the requested quantile among all data, in the range (0,1]. The default value is 0.5. For example, setting it to 0.5 computes the median.
-+ `error`: The rank-based error of the approximate quantile, in the range [0,1). The default value is 0. For example, with `rank`=0.5 and `error`=0.01, the true rank percentage of the computed quantile lies between 0.49 and 0.51. With `error`=0, the result is the exact quantile.
-
-**Output Series:** A single series of the same type as the input series. When `error`=0, the series contains exactly one data point, whose timestamp is the time at which the quantile value first appears and whose value is the quantile; otherwise, the timestamp of the output value is 0.
-
-**Note:** Null values, missing values, and `NaN` in the data are ignored.
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+------------+
-| Time|root.test.s0|
-+-----------------------------+------------+
-|2021-03-17T10:32:17.054+08:00| 0.5319929|
-|2021-03-17T10:32:18.054+08:00| 0.9304316|
-|2021-03-17T10:32:19.054+08:00| -1.4800133|
-|2021-03-17T10:32:20.054+08:00| 0.6114087|
-|2021-03-17T10:32:21.054+08:00| 2.5163336|
-|2021-03-17T10:32:22.054+08:00| -1.0845392|
-|2021-03-17T10:32:23.054+08:00| 1.0562582|
-|2021-03-17T10:32:24.054+08:00| 1.3867859|
-|2021-03-17T10:32:25.054+08:00| -0.45429882|
-|2021-03-17T10:32:26.054+08:00| 1.0353678|
-|2021-03-17T10:32:27.054+08:00| 0.7307929|
-|2021-03-17T10:32:28.054+08:00| 2.3167255|
-|2021-03-17T10:32:29.054+08:00| 2.342443|
-|2021-03-17T10:32:30.054+08:00| 1.5809103|
-|2021-03-17T10:32:31.054+08:00| 1.4829416|
-|2021-03-17T10:32:32.054+08:00| 1.5800357|
-|2021-03-17T10:32:33.054+08:00| 0.7124368|
-|2021-03-17T10:32:34.054+08:00| -0.78597564|
-|2021-03-17T10:32:35.054+08:00| 1.2058644|
-|2021-03-17T10:32:36.054+08:00| 1.4215064|
-|2021-03-17T10:32:37.054+08:00| 1.2808295|
-|2021-03-17T10:32:38.054+08:00| -0.6173715|
-|2021-03-17T10:32:39.054+08:00| 0.06644377|
-|2021-03-17T10:32:40.054+08:00| 2.349338|
-|2021-03-17T10:32:41.054+08:00| 1.7335888|
-|2021-03-17T10:32:42.054+08:00| 1.5872132|
-............
-Total line number = 10000
-```
-
-SQL for query:
-
-```sql
-select percentile(s0, "rank"="0.2", "error"="0.01") from root.test
-```
-
-Output series:
-
-```
-+-----------------------------+------------------------------------------------------+
-| Time|percentile(root.test.s0, "rank"="0.2", "error"="0.01")|
-+-----------------------------+------------------------------------------------------+
-|2021-03-17T10:35:02.054+08:00| 0.1801469624042511|
-+-----------------------------+------------------------------------------------------+
-```
-
-Input series:
-
-```
-+-----------------------------+-------------+
-| Time|root.test2.s1|
-+-----------------------------+-------------+
-|1970-01-01T08:00:00.100+08:00| 0.0|
-|1970-01-01T08:00:00.200+08:00| 0.0|
-|1970-01-01T08:00:00.300+08:00| 1.0|
-|1970-01-01T08:00:00.400+08:00| -1.0|
-|1970-01-01T08:00:00.500+08:00| 0.0|
-|1970-01-01T08:00:00.600+08:00| 0.0|
-|1970-01-01T08:00:00.700+08:00| -2.0|
-|1970-01-01T08:00:00.800+08:00| 2.0|
-|1970-01-01T08:00:00.900+08:00| 0.0|
-|1970-01-01T08:00:01.000+08:00| 0.0|
-|1970-01-01T08:00:01.100+08:00| 1.0|
-|1970-01-01T08:00:01.200+08:00| -1.0|
-|1970-01-01T08:00:01.300+08:00| -1.0|
-|1970-01-01T08:00:01.400+08:00| 1.0|
-|1970-01-01T08:00:01.500+08:00| 0.0|
-|1970-01-01T08:00:01.600+08:00| 0.0|
-|1970-01-01T08:00:01.700+08:00| 10.0|
-|1970-01-01T08:00:01.800+08:00| 2.0|
-|1970-01-01T08:00:01.900+08:00| -2.0|
-|1970-01-01T08:00:02.000+08:00| 0.0|
-+-----------------------------+-------------+
-............
-Total line number = 20
-```
-
-SQL for query:
-
-```sql
-select percentile(s1, "rank"="0.2", "error"="0.01") from root.test2
-```
-
-Output series:
-
-```
-+-----------------------------+-------------------------------------------------------+
-| Time|percentile(root.test2.s1, "rank"="0.2", "error"="0.01")|
-+-----------------------------+-------------------------------------------------------+
-|1970-01-01T08:00:00.000+08:00| -1.0|
-+-----------------------------+-------------------------------------------------------+
-```
-
-
-### 3.12 Quantile
-
-#### Registration Statement
-
-```sql
-create function quantile as 'org.apache.iotdb.library.dprofile.UDAFQuantile'
-```
-
-#### Usage
-
-This function computes the approximate quantile of a single column of numeric data. It is implemented on top of the KLL sketch algorithm.
-
-**Name:** QUANTILE
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `rank`: The rank ratio of the requested quantile among all data, in the range (0,1]. The default value is 0.5. For example, setting it to 0.5 computes the approximate median.
-+ `K`: The maximum size of the KLL sketch that may be maintained. The minimum value is 100 and the default value is 800. For example, with `rank`=0.5 and `K`=800, the true rank ratio of the computed quantile lies between 0.49 and 0.51 with a probability of at least 99%.
-
-**Output Series:** A single series of the same type as the input series. The timestamp of the output value is 0.
-
-**Note:** Null values, missing values, and `NaN` in the data are ignored.
-
-#### Examples
-
-
-Input series:
-
-```
-+-----------------------------+-------------+
-| Time|root.test1.s1|
-+-----------------------------+-------------+
-|2021-03-17T10:32:17.054+08:00| 7|
-|2021-03-17T10:32:18.054+08:00| 15|
-|2021-03-17T10:32:19.054+08:00| 36|
-|2021-03-17T10:32:20.054+08:00| 39|
-|2021-03-17T10:32:21.054+08:00| 40|
-|2021-03-17T10:32:22.054+08:00| 41|
-|2021-03-17T10:32:23.054+08:00| 20|
-|2021-03-17T10:32:24.054+08:00| 18|
-+-----------------------------+-------------+
-............
-Total line number = 8
-```
-
-SQL for query:
-
-```sql
-select quantile(s1, "rank"="0.2", "K"="800") from root.test1
-```
-
-Output series:
-
-```
-+-----------------------------+------------------------------------------------+
-| Time|quantile(root.test1.s1, "rank"="0.2", "K"="800")|
-+-----------------------------+------------------------------------------------+
-|1970-01-01T08:00:00.000+08:00| 7.000000000000001|
-+-----------------------------+------------------------------------------------+
-```
-
-### 3.13 Period
-
-#### Registration Statement
-
-```sql
-create function period as 'org.apache.iotdb.library.dprofile.UDAFPeriod'
-```
-
-#### Usage
-
-This function computes the period of a single column of numeric data.
-
-**Name:** PERIOD
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Output Series:** A single series of type INT32 that contains exactly one data point, whose timestamp is 0 and whose value is the period.
-
-#### Examples
-
-Input series:
-
-```
-+-----------------------------+---------------+
-| Time|root.test.d3.s1|
-+-----------------------------+---------------+
-|1970-01-01T08:00:00.001+08:00| 1.0|
-|1970-01-01T08:00:00.002+08:00| 2.0|
-|1970-01-01T08:00:00.003+08:00| 3.0|
-|1970-01-01T08:00:00.004+08:00| 1.0|
-|1970-01-01T08:00:00.005+08:00| 2.0|
-|1970-01-01T08:00:00.006+08:00| 3.0|
-|1970-01-01T08:00:00.007+08:00| 1.0|
-|1970-01-01T08:00:00.008+08:00| 2.0|
-|1970-01-01T08:00:00.009+08:00| 3.0|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select period(s1) from root.test.d3
-```
-
-Output series:
-
-```
-+-----------------------------+-----------------------+
-| Time|period(root.test.d3.s1)|
-+-----------------------------+-----------------------+
-|1970-01-01T08:00:00.000+08:00| 3|
-+-----------------------------+-----------------------+
-```
-
-### 3.14 QLB
-
-#### Registration Statement
-
-```sql
-create function qlb as 'org.apache.iotdb.library.dprofile.UDTFQLB'
-```
-
-#### Usage
-
-This function computes the $Q_{LB}$ statistic of the input series and the corresponding p-value. The smaller the p-value, the more likely the series is non-stationary.
-
-**Name:** QLB
-
-**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `lag`: The maximum lag order used in the computation. It must be an integer between 1 and n-2, where n is the number of samples in the series. The default is n-2.
-
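As background for the `lag` parameter, here is a small Python sketch of the textbook Ljung-Box statistic, $Q(h) = n(n+2)\sum_{k=1}^{h}\hat\rho_k^2/(n-k)$, evaluated for every lag from 1 to the maximum lag. Whether the library evaluates exactly this formula is an assumption; the function name `ljung_box_q` is hypothetical, and the conversion to a p-value (which needs the chi-square distribution with `h` degrees of freedom) is deliberately left out so that the sketch stays within the standard library.

```python
from typing import List, Sequence


def ljung_box_q(series: Sequence[float], max_lag: int) -> List[float]:
    """Return the textbook Ljung-Box Q statistic for lags 1..max_lag.

    Hypothetical helper for illustration only; it mirrors the 1..n-2
    bound on the lag that the parameter description states.
    """
    n = len(series)
    if not 1 <= max_lag <= n - 2:
        raise ValueError("max_lag must be an integer between 1 and n-2")
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    q_values = []
    acc = 0.0  # running sum of rho_k^2 / (n - k)
    for k in range(1, max_lag + 1):
        # sample autocorrelation coefficient at lag k
        rho = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        acc += rho * rho / (n - k)
        q_values.append(n * (n + 2) * acc)
    return q_values


if __name__ == "__main__":
    print(ljung_box_q([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2))
```

Because each term of the sum is non-negative, the statistic can only grow as the lag increases; the p-value then depends on how that growth compares with the growing degrees of freedom.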
-**Output Series:** Output a single series. The type is DOUBLE. The series holds the p-values corresponding to the $Q_{LB}$ statistic, and each timestamp denotes the lag order. - -**Note:** The $Q_{LB}$ statistic is derived from the autocorrelation coefficients. To obtain the statistic itself rather than the p-value, use the ACF function. - -#### Examples - -##### Using default parameters - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T00:00:00.100+08:00| 1.22| -|1970-01-01T00:00:00.200+08:00| -2.78| -|1970-01-01T00:00:00.300+08:00| 1.53| -|1970-01-01T00:00:00.400+08:00| 0.70| -|1970-01-01T00:00:00.500+08:00| 0.75| -|1970-01-01T00:00:00.600+08:00| -0.72| -|1970-01-01T00:00:00.700+08:00| -0.22| -|1970-01-01T00:00:00.800+08:00| 0.28| -|1970-01-01T00:00:00.900+08:00| 0.57| -|1970-01-01T00:00:01.000+08:00| -0.22| -|1970-01-01T00:00:01.100+08:00| -0.72| -|1970-01-01T00:00:01.200+08:00| 1.34| -|1970-01-01T00:00:01.300+08:00| -0.25| -|1970-01-01T00:00:01.400+08:00| 0.17| -|1970-01-01T00:00:01.500+08:00| 2.51| -|1970-01-01T00:00:01.600+08:00| 1.42| -|1970-01-01T00:00:01.700+08:00| -1.34| -|1970-01-01T00:00:01.800+08:00| -0.01| -|1970-01-01T00:00:01.900+08:00| -0.49| -|1970-01-01T00:00:02.000+08:00| 1.63| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select QLB(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+---------------------+ -| Time| QLB(root.test.d1.s1)| -+-----------------------------+---------------------+ -|1970-01-01T08:00:00.021+08:00| -0.31671| -|1970-01-01T08:00:00.001+08:00| 0.12748561639660716| -|1970-01-01T08:00:00.022+08:00| -0.17051499999999997| -|1970-01-01T08:00:00.002+08:00| 0.21941409592365868| -|1970-01-01T08:00:00.023+08:00| -0.11341499999999997| -|1970-01-01T08:00:00.003+08:00| 0.3384920824593398| -|1970-01-01T08:00:00.024+08:00| 0.26146| -|1970-01-01T08:00:00.004+08:00| 0.26293189359893154| -|1970-01-01T08:00:00.025+08:00| 0.06431999999999996| -|1970-01-01T08:00:00.005+08:00| 0.37265953802871943| -|1970-01-01T08:00:00.026+08:00| 0.036919999999999994| -|1970-01-01T08:00:00.006+08:00| 0.4923218142923832| 
-|1970-01-01T08:00:00.027+08:00|-0.009294999999999993| -|1970-01-01T08:00:00.007+08:00| 0.609628728420623| -|1970-01-01T08:00:00.028+08:00| 0.12271499999999999| -|1970-01-01T08:00:00.008+08:00| 0.6510708392264906| -|1970-01-01T08:00:00.029+08:00| 0.008480000000000033| -|1970-01-01T08:00:00.009+08:00| 0.7430561964288097| -|1970-01-01T08:00:00.030+08:00| -0.21764500000000003| -|1970-01-01T08:00:00.010+08:00| 0.6236738200492055| -|1970-01-01T08:00:00.031+08:00| 0.35853999999999997| -|1970-01-01T08:00:00.011+08:00| 0.21487390993160937| -|1970-01-01T08:00:00.032+08:00| 0.18115499999999998| -|1970-01-01T08:00:00.012+08:00| 0.18479562182870324| -|1970-01-01T08:00:00.033+08:00| -0.27745499999999995| -|1970-01-01T08:00:00.013+08:00| 0.07329862193377235| -|1970-01-01T08:00:00.034+08:00| -0.22418500000000002| -|1970-01-01T08:00:00.014+08:00| 0.038000864459751926| -|1970-01-01T08:00:00.035+08:00| 0.31609000000000004| -|1970-01-01T08:00:00.015+08:00| 0.004052989734200874| -|1970-01-01T08:00:00.036+08:00| -0.06078500000000001| -|1970-01-01T08:00:00.016+08:00| 0.005663787468609627| -|1970-01-01T08:00:00.037+08:00| 0.19219499999999998| -|1970-01-01T08:00:00.017+08:00|0.0016316380755082571| -|1970-01-01T08:00:00.038+08:00| -0.25646| -|1970-01-01T08:00:00.018+08:00|2.0047954405910673E-5| -+-----------------------------+---------------------+ -``` - -### 3.15 Resample - -#### Registration statement - -```sql -create function re_sample as 'org.apache.iotdb.library.dprofile.UDTFResample' -``` - -#### Usage - -This function resamples the input series at a specified frequency, which covers both upsampling and downsampling. Currently, the supported upsampling methods are `NaN` filling (NaN), forward filling (FFill), backward filling (BFill), and linear interpolation (Linear). Downsampling is done by grouped aggregation, and the supported aggregation methods are maximum (Max), minimum (Min), first value (First), last value (Last), mean (Mean), and median (Median). - -**Name:** RESAMPLE - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `every`: the resampling frequency, a positive number with a unit. Five units are currently supported: 'ms' (millisecond), 's' (second), 'm' (minute), 'h' (hour), and 'd' (day). This parameter is required. -+ `interp`: the interpolation method for upsampling. Valid values are 'NaN', 'FFill', 'BFill', or 'Linear'. By default, `NaN` filling is used. -+ `aggr`: the aggregation method for downsampling. Valid values are 
'Max', 'Min', 'First', 'Last', 'Mean', or 'Median'. By default, mean aggregation is used. -+ `start`: the start time of resampling (inclusive), a time string in the format 'yyyy-MM-dd HH:mm:ss'. By default, the timestamp of the first valid data point is used. -+ `end`: the end time of resampling (exclusive), a time string in the format 'yyyy-MM-dd HH:mm:ss'. By default, the timestamp of the last valid data point is used. - -**Output Series:** Output a single series. The type is DOUBLE. The series is strictly equally spaced at the resampling frequency. - -**Note:** `NaN` values in the data are ignored. - -#### Examples - -##### Upsampling - -When the resampling frequency is higher than the original frequency of the data, upsampling is performed. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2021-03-06T16:00:00.000+08:00| 3.09| -|2021-03-06T16:15:00.000+08:00| 3.53| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:45:00.000+08:00| 3.51| -|2021-03-06T17:00:00.000+08:00| 3.41| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select resample(s1,'every'='5m','interp'='linear') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="5m", "interp"="linear")| -+-----------------------------+----------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:05:00.000+08:00| 3.2366665999094644| -|2021-03-06T16:10:00.000+08:00| 3.3833332856496177| -|2021-03-06T16:15:00.000+08:00| 3.5299999713897705| -|2021-03-06T16:20:00.000+08:00| 3.5199999809265137| -|2021-03-06T16:25:00.000+08:00| 3.509999990463257| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T16:35:00.000+08:00| 3.503333330154419| -|2021-03-06T16:40:00.000+08:00| 3.506666660308838| -|2021-03-06T16:45:00.000+08:00| 3.509999990463257| -|2021-03-06T16:50:00.000+08:00| 3.4766666889190674| -|2021-03-06T16:55:00.000+08:00| 3.443333387374878| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+----------------------------------------------------------+ -``` - -##### Downsampling - -When the resampling frequency is lower than the original frequency of the data, downsampling is performed. - -The input series is the same as above. The SQL for query is: - -```sql -select 
resample(s1,'every'='30m','aggr'='first') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "aggr"="first")| -+-----------------------------+--------------------------------------------------------+ -|2021-03-06T16:00:00.000+08:00| 3.0899999141693115| -|2021-03-06T16:30:00.000+08:00| 3.5| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+--------------------------------------------------------+ -``` - - -##### Specifying the resampling time range - -The `start` and `end` parameters specify the time range of resampling. The part that falls outside the actual time range is filled by interpolation. - -The input series is the same as above. The SQL for query is: - -```sql -select resample(s1,'every'='30m','start'='2021-03-06 15:00:00') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------------------------------------+ -| Time|resample(root.test.d1.s1, "every"="30m", "start"="2021-03-06 15:00:00")| -+-----------------------------+-----------------------------------------------------------------------+ -|2021-03-06T15:00:00.000+08:00| NaN| -|2021-03-06T15:30:00.000+08:00| NaN| -|2021-03-06T16:00:00.000+08:00| 3.309999942779541| -|2021-03-06T16:30:00.000+08:00| 3.5049999952316284| -|2021-03-06T17:00:00.000+08:00| 3.4100000858306885| -+-----------------------------+-----------------------------------------------------------------------+ -``` - -### 3.16 Sample - -#### Registration statement - -```sql -create function sample as 'org.apache.iotdb.library.dprofile.UDTFSample' -``` - -#### Usage - -This function samples the input series, that is, it selects a specified number of data points from the input series and outputs them. Three sampling methods are currently supported. **Reservoir sampling** samples the data randomly, and every data point has the same probability of being sampled. **Isometric sampling** samples the data at equal index intervals. **Triangle sampling** divides all the data into buckets according to the sampling rate, computes the area of the triangles formed between data points in each bucket, and keeps the point with the largest area. This algorithm is commonly used for data visualization, because the sampling process preserves key abrupt-change points. For more details on the sampling algorithm, see the paper [here](http://skemman.is/stream/get/1946/15343/37285/3/SS_MSthesis.pdf). - -**Name:** SAMPLE - -**Input Series:** Only a single input series is supported. The type is arbitrary. - -**Parameters:** - -+ `method`: the sampling method. Valid values are 
'reservoir', 'isometric', or 'triangle'. By default, reservoir sampling is used. -+ `k`: the number of samples, a positive integer. The default is 1. - -**Output Series:** Output a single series whose type is the same as the input series. The length of the series is the number of samples, and every data point in it comes from the input series. - -**Note:** If the number of samples is greater than the length of the series, all data points of the input series are output. - -#### Examples - - -##### Reservoir sampling - -When `method` is 'reservoir' or omitted, reservoir sampling is applied to the input series. Because this method is random, the output series shown below is only one possible result. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| 2.0| -|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:04.000+08:00| 4.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -|2020-01-01T00:00:10.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select sample(s1,'method'='reservoir','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="reservoir", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:06.000+08:00| 6.0| -|2020-01-01T00:00:08.000+08:00| 8.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - - -##### Isometric sampling - -When `method` is 'isometric', isometric sampling is applied to the input series. - -The input series is the same as above. The SQL for query is: - -```sql -select sample(s1,'method'='isometric','k'='5') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|sample(root.test.d1.s1, "method"="isometric", "k"="5")| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| 
-|2020-01-01T00:00:03.000+08:00| 3.0| -|2020-01-01T00:00:05.000+08:00| 5.0| -|2020-01-01T00:00:07.000+08:00| 7.0| -|2020-01-01T00:00:09.000+08:00| 9.0| -+-----------------------------+------------------------------------------------------+ -``` - -### 3.17 Segment - -#### Registration statement - -```sql -create function segment as 'org.apache.iotdb.library.dprofile.UDTFSegment' -``` - -#### Usage - -This function splits the data into multiple subsequences according to its linear trend, and returns either the first value of each subsequence after piecewise linear fitting or all fitted values. - -**Name:** SEGMENT - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `output`: "all" outputs all fitted values; "first" outputs the fitted value at the start of each subsequence. The default is "first". - -+ `error`: the error threshold allowed when deciding that a linear trend exists. The error is defined as the mean of the absolute errors of the linear fit on the subsequence. The default is 0.1. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note:** The function assumes that all data points are equally spaced in time. It reads all the data, so downsample first if the original data is too large. Fitting uses a bottom-up method, so the last value of a subsequence may be output as the first value of a subsequence. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:00.300+08:00| 2.0| -|1970-01-01T08:00:00.400+08:00| 3.0| -|1970-01-01T08:00:00.500+08:00| 4.0| -|1970-01-01T08:00:00.600+08:00| 5.0| -|1970-01-01T08:00:00.700+08:00| 6.0| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 8.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:01.100+08:00| 9.1| -|1970-01-01T08:00:01.200+08:00| 9.2| -|1970-01-01T08:00:01.300+08:00| 9.3| -|1970-01-01T08:00:01.400+08:00| 9.4| -|1970-01-01T08:00:01.500+08:00| 9.5| -|1970-01-01T08:00:01.600+08:00| 9.6| -|1970-01-01T08:00:01.700+08:00| 9.7| -|1970-01-01T08:00:01.800+08:00| 9.8| -|1970-01-01T08:00:01.900+08:00| 9.9| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:02.100+08:00| 8.0| -|1970-01-01T08:00:02.200+08:00| 6.0| -|1970-01-01T08:00:02.300+08:00| 4.0| -|1970-01-01T08:00:02.400+08:00| 2.0| -|1970-01-01T08:00:02.500+08:00| 0.0| -|1970-01-01T08:00:02.600+08:00| -2.0| -|1970-01-01T08:00:02.700+08:00| -4.0| 
-|1970-01-01T08:00:02.800+08:00| -6.0| -|1970-01-01T08:00:02.900+08:00| -8.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.100+08:00| 10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -|1970-01-01T08:00:03.300+08:00| 10.0| -|1970-01-01T08:00:03.400+08:00| 10.0| -|1970-01-01T08:00:03.500+08:00| 10.0| -|1970-01-01T08:00:03.600+08:00| 10.0| -|1970-01-01T08:00:03.700+08:00| 10.0| -|1970-01-01T08:00:03.800+08:00| 10.0| -|1970-01-01T08:00:03.900+08:00| 10.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select segment(s1,"error"="0.1") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|segment(root.test.s1, "error"="0.1")| -+-----------------------------+------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 5.0| -|1970-01-01T08:00:00.200+08:00| 1.0| -|1970-01-01T08:00:01.000+08:00| 9.0| -|1970-01-01T08:00:02.000+08:00| 10.0| -|1970-01-01T08:00:03.000+08:00| -10.0| -|1970-01-01T08:00:03.200+08:00| 10.0| -+-----------------------------+------------------------------------+ -``` - -### 3.18 Skew - -#### Registration statement - -```sql -create function skew as 'org.apache.iotdb.library.dprofile.UDAFSkew' -``` - -#### Usage - -This function computes the population skewness of a single column of numeric data. - -**Name:** SKEW - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series. The type is DOUBLE. The series contains only one data point, whose timestamp is 0 and whose value is the population skewness. - -**Note:** Null values, missing values, and `NaN` in the data are ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| -|2020-01-01T00:00:01.000+08:00| 2.0| -|2020-01-01T00:00:02.000+08:00| 3.0| -|2020-01-01T00:00:03.000+08:00| 4.0| -|2020-01-01T00:00:04.000+08:00| 5.0| -|2020-01-01T00:00:05.000+08:00| 6.0| -|2020-01-01T00:00:06.000+08:00| 7.0| -|2020-01-01T00:00:07.000+08:00| 8.0| -|2020-01-01T00:00:08.000+08:00| 9.0| -|2020-01-01T00:00:09.000+08:00| 10.0| -|2020-01-01T00:00:10.000+08:00| 10.0| 
-|2020-01-01T00:00:11.000+08:00| 10.0| -|2020-01-01T00:00:12.000+08:00| 10.0| -|2020-01-01T00:00:13.000+08:00| 10.0| -|2020-01-01T00:00:14.000+08:00| 10.0| -|2020-01-01T00:00:15.000+08:00| 10.0| -|2020-01-01T00:00:16.000+08:00| 10.0| -|2020-01-01T00:00:17.000+08:00| 10.0| -|2020-01-01T00:00:18.000+08:00| 10.0| -|2020-01-01T00:00:19.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select skew(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time| skew(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| -0.9998427402292644| -+-----------------------------+-----------------------+ -``` - -### 3.19 Spline - -#### Registration statement - -```sql -create function spline as 'org.apache.iotdb.library.dprofile.UDTFSpline' -``` - -#### Usage - -This function fits a cubic spline to the original series and then resamples it by interpolation. - -**Name:** SPLINE - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `points`: the number of resampling points. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note:** The output series keeps the first and last values of the input series and is sampled at equal time intervals. Interpolation is computed only when the input has at least 4 points. - -#### Examples - -##### Specifying the number of interpolation points - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select spline(s1, "points"="151") from root.test -``` - -Output series: - -``` -+-----------------------------+------------------------------------+ -| Time|spline(root.test.s1, "points"="151")| -+-----------------------------+------------------------------------+ 
-|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.010+08:00| 0.04870000251134237| -|1970-01-01T08:00:00.020+08:00| 0.09680000495910646| -|1970-01-01T08:00:00.030+08:00| 0.14430000734329226| -|1970-01-01T08:00:00.040+08:00| 0.19120000966389972| -|1970-01-01T08:00:00.050+08:00| 0.23750001192092896| -|1970-01-01T08:00:00.060+08:00| 0.2832000141143799| -|1970-01-01T08:00:00.070+08:00| 0.32830001624425253| -|1970-01-01T08:00:00.080+08:00| 0.3728000183105469| -|1970-01-01T08:00:00.090+08:00| 0.416700020313263| -|1970-01-01T08:00:00.100+08:00| 0.4600000222524008| -|1970-01-01T08:00:00.110+08:00| 0.5027000241279602| -|1970-01-01T08:00:00.120+08:00| 0.5448000259399414| -|1970-01-01T08:00:00.130+08:00| 0.5863000276883443| -|1970-01-01T08:00:00.140+08:00| 0.627200029373169| -|1970-01-01T08:00:00.150+08:00| 0.6675000309944153| -|1970-01-01T08:00:00.160+08:00| 0.7072000325520833| -|1970-01-01T08:00:00.170+08:00| 0.7463000340461731| -|1970-01-01T08:00:00.180+08:00| 0.7848000354766846| -|1970-01-01T08:00:00.190+08:00| 0.8227000368436178| -|1970-01-01T08:00:00.200+08:00| 0.8600000381469728| -|1970-01-01T08:00:00.210+08:00| 0.8967000393867494| -|1970-01-01T08:00:00.220+08:00| 0.9328000405629477| -|1970-01-01T08:00:00.230+08:00| 0.9683000416755676| -|1970-01-01T08:00:00.240+08:00| 1.0032000427246095| -|1970-01-01T08:00:00.250+08:00| 1.037500043710073| -|1970-01-01T08:00:00.260+08:00| 1.071200044631958| -|1970-01-01T08:00:00.270+08:00| 1.1043000454902647| -|1970-01-01T08:00:00.280+08:00| 1.1368000462849934| -|1970-01-01T08:00:00.290+08:00| 1.1687000470161437| -|1970-01-01T08:00:00.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:00.310+08:00| 1.2307000483103594| -|1970-01-01T08:00:00.320+08:00| 1.2608000489139557| -|1970-01-01T08:00:00.330+08:00| 1.2903000494873524| -|1970-01-01T08:00:00.340+08:00| 1.3192000500233967| -|1970-01-01T08:00:00.350+08:00| 1.3475000505149364| -|1970-01-01T08:00:00.360+08:00| 1.3752000509548186| -|1970-01-01T08:00:00.370+08:00| 1.402300051335891| 
-|1970-01-01T08:00:00.380+08:00| 1.4288000516510009| -|1970-01-01T08:00:00.390+08:00| 1.4547000518929958| -|1970-01-01T08:00:00.400+08:00| 1.480000052054723| -|1970-01-01T08:00:00.410+08:00| 1.5047000521290301| -|1970-01-01T08:00:00.420+08:00| 1.5288000521087646| -|1970-01-01T08:00:00.430+08:00| 1.5523000519867738| -|1970-01-01T08:00:00.440+08:00| 1.575200051755905| -|1970-01-01T08:00:00.450+08:00| 1.597500051409006| -|1970-01-01T08:00:00.460+08:00| 1.619200050938924| -|1970-01-01T08:00:00.470+08:00| 1.6403000503385066| -|1970-01-01T08:00:00.480+08:00| 1.660800049600601| -|1970-01-01T08:00:00.490+08:00| 1.680700048718055| -|1970-01-01T08:00:00.500+08:00| 1.7000000476837158| -|1970-01-01T08:00:00.510+08:00| 1.7188475466453037| -|1970-01-01T08:00:00.520+08:00| 1.7373800457262996| -|1970-01-01T08:00:00.530+08:00| 1.7555825448831923| -|1970-01-01T08:00:00.540+08:00| 1.7734400440724702| -|1970-01-01T08:00:00.550+08:00| 1.790937543250622| -|1970-01-01T08:00:00.560+08:00| 1.8080600423741364| -|1970-01-01T08:00:00.570+08:00| 1.8247925413995016| -|1970-01-01T08:00:00.580+08:00| 1.8411200402832066| -|1970-01-01T08:00:00.590+08:00| 1.8570275389817397| -|1970-01-01T08:00:00.600+08:00| 1.8725000374515897| -|1970-01-01T08:00:00.610+08:00| 1.8875225356492449| -|1970-01-01T08:00:00.620+08:00| 1.902080033531194| -|1970-01-01T08:00:00.630+08:00| 1.9161575310539258| -|1970-01-01T08:00:00.640+08:00| 1.9297400281739288| -|1970-01-01T08:00:00.650+08:00| 1.9428125248476913| -|1970-01-01T08:00:00.660+08:00| 1.9553600210317021| -|1970-01-01T08:00:00.670+08:00| 1.96736751668245| -|1970-01-01T08:00:00.680+08:00| 1.9788200117564232| -|1970-01-01T08:00:00.690+08:00| 1.9897025062101101| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.710+08:00| 2.0097024933913334| -|1970-01-01T08:00:00.720+08:00| 2.0188199867081615| -|1970-01-01T08:00:00.730+08:00| 2.027367479995188| -|1970-01-01T08:00:00.740+08:00| 2.0353599732971155| -|1970-01-01T08:00:00.750+08:00| 2.0428124666586482| 
-|1970-01-01T08:00:00.760+08:00| 2.049739960124489| -|1970-01-01T08:00:00.770+08:00| 2.056157453739342| -|1970-01-01T08:00:00.780+08:00| 2.06207994754791| -|1970-01-01T08:00:00.790+08:00| 2.067522441594897| -|1970-01-01T08:00:00.800+08:00| 2.072499935925006| -|1970-01-01T08:00:00.810+08:00| 2.07702743058294| -|1970-01-01T08:00:00.820+08:00| 2.081119925613404| -|1970-01-01T08:00:00.830+08:00| 2.0847924210611| -|1970-01-01T08:00:00.840+08:00| 2.0880599169707317| -|1970-01-01T08:00:00.850+08:00| 2.0909374133870027| -|1970-01-01T08:00:00.860+08:00| 2.0934399103546166| -|1970-01-01T08:00:00.870+08:00| 2.0955824079182768| -|1970-01-01T08:00:00.880+08:00| 2.0973799061226863| -|1970-01-01T08:00:00.890+08:00| 2.098847405012549| -|1970-01-01T08:00:00.900+08:00| 2.0999999046325684| -|1970-01-01T08:00:00.910+08:00| 2.1005574051201332| -|1970-01-01T08:00:00.920+08:00| 2.1002599065303778| -|1970-01-01T08:00:00.930+08:00| 2.0991524087846245| -|1970-01-01T08:00:00.940+08:00| 2.0972799118041947| -|1970-01-01T08:00:00.950+08:00| 2.0946874155104105| -|1970-01-01T08:00:00.960+08:00| 2.0914199198245944| -|1970-01-01T08:00:00.970+08:00| 2.0875224246680673| -|1970-01-01T08:00:00.980+08:00| 2.083039929962151| -|1970-01-01T08:00:00.990+08:00| 2.0780174356281687| -|1970-01-01T08:00:01.000+08:00| 2.0724999415874406| -|1970-01-01T08:00:01.010+08:00| 2.06653244776129| -|1970-01-01T08:00:01.020+08:00| 2.060159954071038| -|1970-01-01T08:00:01.030+08:00| 2.053427460438006| -|1970-01-01T08:00:01.040+08:00| 2.046379966783517| -|1970-01-01T08:00:01.050+08:00| 2.0390624730288924| -|1970-01-01T08:00:01.060+08:00| 2.031519979095454| -|1970-01-01T08:00:01.070+08:00| 2.0237974849045237| -|1970-01-01T08:00:01.080+08:00| 2.015939990377423| -|1970-01-01T08:00:01.090+08:00| 2.0079924954354746| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.110+08:00| 1.9907018211101906| -|1970-01-01T08:00:01.120+08:00| 1.9788509124245144| -|1970-01-01T08:00:01.130+08:00| 1.9645127287932083| 
-|1970-01-01T08:00:01.140+08:00| 1.9477527250665083| -|1970-01-01T08:00:01.150+08:00| 1.9286363560946513| -|1970-01-01T08:00:01.160+08:00| 1.9072290767278735| -|1970-01-01T08:00:01.170+08:00| 1.8835963418164114| -|1970-01-01T08:00:01.180+08:00| 1.8578036062105014| -|1970-01-01T08:00:01.190+08:00| 1.8299163247603802| -|1970-01-01T08:00:01.200+08:00| 1.7999999523162842| -|1970-01-01T08:00:01.210+08:00| 1.7623635841923329| -|1970-01-01T08:00:01.220+08:00| 1.7129696477516976| -|1970-01-01T08:00:01.230+08:00| 1.6543635959181928| -|1970-01-01T08:00:01.240+08:00| 1.5890908816156328| -|1970-01-01T08:00:01.250+08:00| 1.5196969577678319| -|1970-01-01T08:00:01.260+08:00| 1.4487272772986044| -|1970-01-01T08:00:01.270+08:00| 1.3787272931317647| -|1970-01-01T08:00:01.280+08:00| 1.3122424581911272| -|1970-01-01T08:00:01.290+08:00| 1.251818225400506| -|1970-01-01T08:00:01.300+08:00| 1.2000000476837158| -|1970-01-01T08:00:01.310+08:00| 1.1548000470995912| -|1970-01-01T08:00:01.320+08:00| 1.1130667107899999| -|1970-01-01T08:00:01.330+08:00| 1.0756000393033045| -|1970-01-01T08:00:01.340+08:00| 1.043200033187868| -|1970-01-01T08:00:01.350+08:00| 1.016666692992053| -|1970-01-01T08:00:01.360+08:00| 0.9968000192642223| -|1970-01-01T08:00:01.370+08:00| 0.9844000125527389| -|1970-01-01T08:00:01.380+08:00| 0.9802666734059655| -|1970-01-01T08:00:01.390+08:00| 0.9852000023722649| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.410+08:00| 1.023999999165535| -|1970-01-01T08:00:01.420+08:00| 1.0559999990463256| -|1970-01-01T08:00:01.430+08:00| 1.0959999996423722| -|1970-01-01T08:00:01.440+08:00| 1.1440000009536744| -|1970-01-01T08:00:01.450+08:00| 1.2000000029802322| -|1970-01-01T08:00:01.460+08:00| 1.264000005722046| -|1970-01-01T08:00:01.470+08:00| 1.3360000091791153| -|1970-01-01T08:00:01.480+08:00| 1.4160000133514405| -|1970-01-01T08:00:01.490+08:00| 1.5040000182390214| -|1970-01-01T08:00:01.500+08:00| 1.600000023841858| 
-+-----------------------------+------------------------------------+ -``` - -### 3.20 Spread - -#### Registration statement - -```sql -create function spread as 'org.apache.iotdb.library.dprofile.UDAFSpread' -``` - -#### Usage - -This function computes the spread of a time series, that is, the maximum value minus the minimum value. - -**Name:** SPREAD - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output Series:** Output a single series whose type is the same as the input. The series contains only one data point, whose timestamp is 0 and whose value is the spread. - -**Note:** Null values, missing values, and `NaN` in the data are ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select spread(s1) from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+-----------------------+ -| Time|spread(root.test.d1.s1)| -+-----------------------------+-----------------------+ -|1970-01-01T08:00:00.000+08:00| 26.0| -+-----------------------------+-----------------------+ -``` - - - -### 3.21 ZScore - -#### Registration statement - -```sql -create function zscore as 'org.apache.iotdb.library.dprofile.UDTFZScore' -``` - -#### Usage - -This function normalizes the input series using the z-score method. - -**Name:** ZSCORE - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `compute`: when set to "batch", all data is read in before the conversion; when set to "stream", the user must supply the mean and standard deviation for streaming conversion. The default is "batch". -+ `avg`: the mean used in streaming computation. -+ `sd`: the standard deviation used in streaming computation. - -**Output Series:** Output a single series. The type is DOUBLE. - 
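The two values of `compute` can be sketched in Python as follows. This is a minimal illustration rather than the library's code: the helper names `zscore_batch` and `zscore_stream` are hypothetical, and the use of the population standard deviation in batch mode is an assumption inferred from the example output of this section, where the input value 0.0 maps to about -0.2067.

```python
import statistics


def zscore_batch(values):
    """Batch mode: read all values, then normalize with their own statistics.

    Assumption: the population standard deviation is used, which is what
    reproduces the example output shown in this section.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("z-score is undefined for a constant series")
    return [(v - mean) / sd for v in values]


def zscore_stream(values, avg, sd):
    """Stream mode: the caller supplies avg and sd, so points are
    normalized one at a time without buffering the whole series."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    for v in values:
        yield (v - avg) / sd


if __name__ == "__main__":
    data = [0.0, 0.0, 1.0, -1.0, 0.0, 0.0, -2.0, 2.0, 0.0, 0.0,
            1.0, -1.0, -1.0, 1.0, 0.0, 0.0, 10.0, 2.0, -2.0, 0.0]
    print(zscore_batch(data)[:3])
    print(list(zscore_stream(data[:3], avg=0.5, sd=2.4)))
```

Stream mode trades accuracy for memory: the result is only as good as the supplied `avg` and `sd`, but no pass over the full series is needed.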
-#### Examples - -##### Batch computation - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select zscore(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|zscore(root.test.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.200+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.300+08:00| 0.20672455764868078| -|1970-01-01T08:00:00.400+08:00| -0.6201736729460423| -|1970-01-01T08:00:00.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:00.700+08:00| -1.033622788243404| -|1970-01-01T08:00:00.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:00.900+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.000+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.100+08:00| 0.20672455764868078| -|1970-01-01T08:00:01.200+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.300+08:00| -0.6201736729460423| -|1970-01-01T08:00:01.400+08:00| 0.20672455764868078| 
-|1970-01-01T08:00:01.500+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.600+08:00|-0.20672455764868078| -|1970-01-01T08:00:01.700+08:00| 3.9277665953249348| -|1970-01-01T08:00:01.800+08:00| 0.6201736729460423| -|1970-01-01T08:00:01.900+08:00| -1.033622788243404| -|1970-01-01T08:00:02.000+08:00|-0.20672455764868078| -+-----------------------------+--------------------+ -``` - - - -## 4. Anomaly Detection - -### 4.1 IQR - -#### Registration statement - -```sql -create function iqr as 'org.apache.iotdb.library.anomaly.UDTFIQR' -``` - -#### Usage - -This function detects distribution anomalies, that is, data lying more than 1.5 times the IQR beyond the lower or upper quartile. - -**Name:** IQR - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: when set to "batch", detection runs after all data is read in; when set to "stream", the user must supply the lower and upper quartiles for streaming detection. The default is "batch". -+ `q1`: the lower quartile used in streaming computation. -+ `q3`: the upper quartile used in streaming computation. - -**Output Series:** Output a single series. The type is DOUBLE. - -**Note:** $IQR=Q_3-Q_1$ - -#### Examples - -##### Batch computation - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| -|1970-01-01T08:00:00.300+08:00| 1.0| -|1970-01-01T08:00:00.400+08:00| -1.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -|1970-01-01T08:00:00.600+08:00| 0.0| -|1970-01-01T08:00:00.700+08:00| -2.0| -|1970-01-01T08:00:00.800+08:00| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.0| -|1970-01-01T08:00:01.200+08:00| -1.0| -|1970-01-01T08:00:01.300+08:00| -1.0| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 0.0| -|1970-01-01T08:00:01.600+08:00| 0.0| -|1970-01-01T08:00:01.700+08:00| 10.0| -|1970-01-01T08:00:01.800+08:00| 2.0| -|1970-01-01T08:00:01.900+08:00| -2.0| -|1970-01-01T08:00:02.000+08:00| 0.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select iqr(s1) from root.test -``` - -Output series: - -``` -+-----------------------------+-----------------+ -| Time|iqr(root.test.s1)| 
-+-----------------------------+-----------------+ -|1970-01-01T08:00:01.700+08:00| 10.0| -+-----------------------------+-----------------+ -``` - -### 4.2 KSigma - -#### Registration statement - -```sql -create function ksigma as 'org.apache.iotdb.library.anomaly.UDTFKSigma' -``` - -#### Usage - -This function detects anomalies with the dynamic K-Sigma algorithm. Within a window, data that deviates from the mean by more than k times the standard deviation is treated as anomalous and output. - -**Name:** KSIGMA - -**Input Series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `k`: the threshold, in multiples of the standard deviation, that defines a distribution anomaly in the dynamic K-Sigma algorithm. The default is 3. -+ `window`: the sliding-window size of the dynamic K-Sigma algorithm. The default is 10000. - - -**Output Series:** Output a single series whose type is the same as the input series. - -**Note:** k must be greater than 0; otherwise nothing is output. - -#### Examples - -##### Specifying k - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:04.000+08:00| 100.0| -|2020-01-01T00:00:06.000+08:00| 150.0| -|2020-01-01T00:00:08.000+08:00| 200.0| -|2020-01-01T00:00:10.000+08:00| 200.0| -|2020-01-01T00:00:14.000+08:00| 200.0| -|2020-01-01T00:00:15.000+08:00| 200.0| -|2020-01-01T00:00:16.000+08:00| 200.0| -|2020-01-01T00:00:18.000+08:00| 200.0| -|2020-01-01T00:00:20.000+08:00| 150.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select ksigma(s1,"k"="1.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+---------------------------------+ -|Time |ksigma(root.test.d1.s1,"k"="1.0")| -+-----------------------------+---------------------------------+ -|2020-01-01T00:00:02.000+08:00| 0.0| -|2020-01-01T00:00:03.000+08:00| 50.0| -|2020-01-01T00:00:26.000+08:00| 50.0| -|2020-01-01T00:00:28.000+08:00| 0.0| -+-----------------------------+---------------------------------+ -``` - -### 4.3 LOF - -#### Registration statement - -```sql -create function LOF as 
'org.apache.iotdb.library.anomaly.UDTFLOF' -``` - -#### Usage - -This function uses the local outlier factor method to find density anomalies in a series. Based on the given k-th distance and the local outlier factor (LOF) threshold, it decides whether each input point is an outlier, that is, an anomaly, and outputs the LOF value of each point. - -**Name:** LOF - -**Input Series:** Multiple input series. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: the detection method. The default is "default", which computes on high-dimensional data. When set to "series", a one-dimensional time series is converted to high-dimensional data for the computation. -+ `k`: the k-th distance used to compute the local outlier factor. The default is 3. -+ `window`: the window length for each read of data. The default is 10000. -+ `windowsize`: when the series method is used, the dimension of the converted high-dimensional data, that is, the size of a single window. The default is 5. - -**Output Series:** Output a single time series. The type is DOUBLE. - -**Note:** Incomplete rows are ignored: they take no part in the computation and are not flagged as outliers. - - -#### Examples - -##### Default parameters - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.200+08:00| 0.0| 1.0| -|1970-01-01T08:00:00.300+08:00| 1.0| 1.0| -|1970-01-01T08:00:00.400+08:00| 1.0| 0.0| -|1970-01-01T08:00:00.500+08:00| 0.0| -1.0| -|1970-01-01T08:00:00.600+08:00| -1.0| -1.0| -|1970-01-01T08:00:00.700+08:00| -1.0| 0.0| -|1970-01-01T08:00:00.800+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.900+08:00| 0.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1,s2) from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|lof(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.100+08:00| 3.8274824267668244| -|1970-01-01T08:00:00.200+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.300+08:00| 2.838155437762879| -|1970-01-01T08:00:00.400+08:00| 3.0117631741126156| -|1970-01-01T08:00:00.500+08:00| 2.73518261244453| -|1970-01-01T08:00:00.600+08:00| 2.371440975708148| -|1970-01-01T08:00:00.700+08:00| 2.73518261244453| -|1970-01-01T08:00:00.800+08:00| 1.7561416374270742| -+-----------------------------+-------------------------------------+ -``` - -##### Diagnosing a one-dimensional time series - -Input series: - -``` 
-+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.100+08:00| 1.0| -|1970-01-01T08:00:00.200+08:00| 2.0| -|1970-01-01T08:00:00.300+08:00| 3.0| -|1970-01-01T08:00:00.400+08:00| 4.0| -|1970-01-01T08:00:00.500+08:00| 5.0| -|1970-01-01T08:00:00.600+08:00| 6.0| -|1970-01-01T08:00:00.700+08:00| 7.0| -|1970-01-01T08:00:00.800+08:00| 8.0| -|1970-01-01T08:00:00.900+08:00| 9.0| -|1970-01-01T08:00:01.000+08:00| 10.0| -|1970-01-01T08:00:01.100+08:00| 11.0| -|1970-01-01T08:00:01.200+08:00| 12.0| -|1970-01-01T08:00:01.300+08:00| 13.0| -|1970-01-01T08:00:01.400+08:00| 14.0| -|1970-01-01T08:00:01.500+08:00| 15.0| -|1970-01-01T08:00:01.600+08:00| 16.0| -|1970-01-01T08:00:01.700+08:00| 17.0| -|1970-01-01T08:00:01.800+08:00| 18.0| -|1970-01-01T08:00:01.900+08:00| 19.0| -|1970-01-01T08:00:02.000+08:00| 20.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select lof(s1, "method"="series") from root.test.d1 where time<1000 -``` - -Output series: - -``` -+-----------------------------+--------------------+ -| Time|lof(root.test.d1.s1)| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.100+08:00| 3.77777777777778| -|1970-01-01T08:00:00.200+08:00| 4.32727272727273| -|1970-01-01T08:00:00.300+08:00| 4.85714285714286| -|1970-01-01T08:00:00.400+08:00| 5.40909090909091| -|1970-01-01T08:00:00.500+08:00| 5.94999999999999| -|1970-01-01T08:00:00.600+08:00| 6.43243243243243| -|1970-01-01T08:00:00.700+08:00| 6.79999999999999| -|1970-01-01T08:00:00.800+08:00| 7.0| -|1970-01-01T08:00:00.900+08:00| 7.0| -|1970-01-01T08:00:01.000+08:00| 6.79999999999999| -|1970-01-01T08:00:01.100+08:00| 6.43243243243243| -|1970-01-01T08:00:01.200+08:00| 5.94999999999999| -|1970-01-01T08:00:01.300+08:00| 5.40909090909091| -|1970-01-01T08:00:01.400+08:00| 4.85714285714286| -|1970-01-01T08:00:01.500+08:00| 4.32727272727273| -|1970-01-01T08:00:01.600+08:00| 3.77777777777778|
-+-----------------------------+--------------------+ -``` - -### 4.4 MissDetect - -#### Registration Statement - -```sql -create function missdetect as 'org.apache.iotdb.library.anomaly.UDTFMissDetect' -``` - -#### Function Overview - -This function detects missing-data anomalies. In some datasets, missing values are filled by linear interpolation, which leaves perfectly linear segments in the data, and these segments tend to be long. The function detects missing-data anomalies by finding such perfectly linear segments. - -**Function name:** MISSDETECT - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `minlen`: The minimum length of a perfectly linear segment that is marked as anomalous. It is an integer greater than or equal to 10. The default value is 10. - -**Output series:** A single series of type BOOLEAN, indicating whether each data point is a missing-data anomaly. - -**Note:** `NaN` values in the data are ignored. - - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s2| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 0.0| -|2021-07-01T12:00:01.000+08:00| 1.0| -|2021-07-01T12:00:02.000+08:00| 0.0| -|2021-07-01T12:00:03.000+08:00| 1.0| -|2021-07-01T12:00:04.000+08:00| 0.0| -|2021-07-01T12:00:05.000+08:00| 0.0| -|2021-07-01T12:00:06.000+08:00| 0.0| -|2021-07-01T12:00:07.000+08:00| 0.0| -|2021-07-01T12:00:08.000+08:00| 0.0| -|2021-07-01T12:00:09.000+08:00| 0.0| -|2021-07-01T12:00:10.000+08:00| 0.0| -|2021-07-01T12:00:11.000+08:00| 0.0| -|2021-07-01T12:00:12.000+08:00| 0.0| -|2021-07-01T12:00:13.000+08:00| 0.0| -|2021-07-01T12:00:14.000+08:00| 0.0| -|2021-07-01T12:00:15.000+08:00| 0.0| -|2021-07-01T12:00:16.000+08:00| 1.0| -|2021-07-01T12:00:17.000+08:00| 0.0| -|2021-07-01T12:00:18.000+08:00| 1.0| -|2021-07-01T12:00:19.000+08:00| 0.0| -|2021-07-01T12:00:20.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select missdetect(s2,'minlen'='10') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------+ -| Time|missdetect(root.test.d2.s2, "minlen"="10")| -+-----------------------------+------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| false| -|2021-07-01T12:00:01.000+08:00| false| -|2021-07-01T12:00:02.000+08:00| false| -|2021-07-01T12:00:03.000+08:00| false|
-|2021-07-01T12:00:04.000+08:00| true| -|2021-07-01T12:00:05.000+08:00| true| -|2021-07-01T12:00:06.000+08:00| true| -|2021-07-01T12:00:07.000+08:00| true| -|2021-07-01T12:00:08.000+08:00| true| -|2021-07-01T12:00:09.000+08:00| true| -|2021-07-01T12:00:10.000+08:00| true| -|2021-07-01T12:00:11.000+08:00| true| -|2021-07-01T12:00:12.000+08:00| true| -|2021-07-01T12:00:13.000+08:00| true| -|2021-07-01T12:00:14.000+08:00| true| -|2021-07-01T12:00:15.000+08:00| true| -|2021-07-01T12:00:16.000+08:00| false| -|2021-07-01T12:00:17.000+08:00| false| -|2021-07-01T12:00:18.000+08:00| false| -|2021-07-01T12:00:19.000+08:00| false| -|2021-07-01T12:00:20.000+08:00| false| -+-----------------------------+------------------------------------------+ -``` - -### 4.5 Range - -#### Registration Statement - -```sql -create function range as 'org.apache.iotdb.library.anomaly.UDTFRange' -``` - -#### Function Overview - -This function finds range anomalies in a time series. Based on the specified upper and lower bounds, it determines whether each input value is out of bounds (that is, anomalous) and outputs all anomalous points as a new time series. - -**Function name:** RANGE - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `lower_bound`: The lower bound for range anomaly detection. -+ `upper_bound`: The upper bound for range anomaly detection. - -**Output series:** A single series of the same type as the input series. - -**Note:** `upper_bound` must be greater than `lower_bound`; otherwise, nothing is output. - - -#### Examples - -##### Specifying the upper and lower bounds - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` -
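As a rough illustration of the RANGE check applied to the input series above, here is a minimal Python sketch that runs outside IoTDB. It is not the UDF's implementation: the helper name `range_anomalies` is hypothetical, timestamps are reduced to their seconds offset, and treating both bounds as inclusive is an assumption inferred from the example (101.0 equals `lower_bound` and is not reported).

```python
import math


def range_anomalies(points, lower_bound, upper_bound):
    """Return the (time, value) pairs whose value lies outside [lower_bound, upper_bound].

    Hypothetical helper for illustration only; bounds are assumed inclusive.
    """
    # Mirror the documented note: nothing is output unless upper_bound > lower_bound.
    if upper_bound <= lower_bound:
        return []
    # NaN compares False against both bounds, so it is never reported.
    return [(t, v) for t, v in points if v < lower_bound or v > upper_bound]


# Seconds offsets and values copied from the input series above.
series = [
    (2, 100.0), (3, 101.0), (4, 102.0), (6, 104.0), (8, 126.0),
    (10, 108.0), (14, 112.0), (15, 113.0), (16, 114.0), (18, 116.0),
    (20, 118.0), (22, 120.0), (26, 124.0), (28, 126.0), (30, math.nan),
]

print(range_anomalies(series, 101.0, 125.0))
```

With bounds 101.0 and 125.0, the sketch reports the points at seconds 2, 8, and 28, the same three values (100.0, 126.0, 126.0) that the SQL query in this example returns.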
-SQL for query: - -```sql -select range(s1,"lower_bound"="101.0","upper_bound"="125.0") from root.test.d1 where time <= 2020-01-01 00:00:30 -``` - -Output series: - -``` -+-----------------------------+------------------------------------------------------------------+ -|Time |range(root.test.d1.s1,"lower_bound"="101.0","upper_bound"="125.0")| -+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -+-----------------------------+------------------------------------------------------------------+ -``` - -### 4.6 TwoSidedFilter - -#### Registration Statement - -```sql -create function twosidedfilter as 'org.apache.iotdb.library.anomaly.UDTFTwoSidedFilter' -``` - -#### Function Overview - -This function filters anomalous points out of the input series using the two-sided window detection method. - -**Function name:** TWOSIDEDFILTER - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Output series:** A single series of the same type as the input: the input series with the anomalous points removed. - -**Parameters:** - -- `len`: The window size used by the two-sided window detection method. It must be a positive integer. The default value is 5. For example, when `len`=3, the algorithm takes one window of length 3 before the point and one after it, and computes the anomaly degree within those windows. -- `threshold`: The anomaly degree threshold, in the range (0,1). The default value is 0.3. The higher the threshold, the stricter the criterion for judging a point anomalous. - -#### Examples - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:00:37.000+08:00| 1484.0| -|1970-01-01T08:00:38.000+08:00| 1055.0| -|1970-01-01T08:00:39.000+08:00| 1050.0|
-|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select TwoSidedFilter(s0, 'len'='5', 'threshold'='0.3') from root.test -``` - -Output series: - -``` -+-----------------------------+------------+ -| Time|root.test.s0| -+-----------------------------+------------+ -|1970-01-01T08:00:00.000+08:00| 2002.0| -|1970-01-01T08:00:01.000+08:00| 1946.0| -|1970-01-01T08:00:02.000+08:00| 1958.0| -|1970-01-01T08:00:03.000+08:00| 2012.0| -|1970-01-01T08:00:04.000+08:00| 2051.0| -|1970-01-01T08:00:05.000+08:00| 1898.0| -|1970-01-01T08:00:06.000+08:00| 2014.0| -|1970-01-01T08:00:07.000+08:00| 2052.0| -|1970-01-01T08:00:08.000+08:00| 1935.0| -|1970-01-01T08:00:09.000+08:00| 1901.0| -|1970-01-01T08:00:10.000+08:00| 1972.0| -|1970-01-01T08:00:11.000+08:00| 1969.0| -|1970-01-01T08:00:12.000+08:00| 1984.0| -|1970-01-01T08:00:13.000+08:00| 2018.0| -|1970-01-01T08:01:05.000+08:00| 1023.0| -|1970-01-01T08:01:06.000+08:00| 1056.0| -|1970-01-01T08:01:07.000+08:00| 978.0| -|1970-01-01T08:01:08.000+08:00| 1050.0| -|1970-01-01T08:01:09.000+08:00| 1123.0| -|1970-01-01T08:01:10.000+08:00| 1150.0| -|1970-01-01T08:01:11.000+08:00| 1034.0| -|1970-01-01T08:01:12.000+08:00| 950.0| -|1970-01-01T08:01:13.000+08:00| 1059.0| -+-----------------------------+------------+ -``` - -### 4.7 Outlier - -#### Registration Statement - -```sql -create function outlier as 'org.apache.iotdb.library.anomaly.UDTFOutlier' -``` - -#### Function Overview - -This function detects distance-based outliers. Within the current window, a point is an outlier if the number of its neighbors within the distance threshold (including itself) is less than the density threshold. - -**Function name:** OUTLIER - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `r`: The distance threshold used in distance-based outlier detection. -+ `k`: The density threshold used in distance-based outlier detection. -+
`w`: The size of the sliding window. -+ `s`: The step of the sliding window. - -**Output series:** A single series of the same type as the input series. - -#### Examples - -##### Specifying query parameters - -Input series: - -``` -+-----------------------------+------------+ -| Time|root.test.s1| -+-----------------------------+------------+ -|2020-01-04T23:59:55.000+08:00| 56.0| -|2020-01-04T23:59:56.000+08:00| 55.1| -|2020-01-04T23:59:57.000+08:00| 54.2| -|2020-01-04T23:59:58.000+08:00| 56.3| -|2020-01-04T23:59:59.000+08:00| 59.0| -|2020-01-05T00:00:00.000+08:00| 60.0| -|2020-01-05T00:00:01.000+08:00| 60.5| -|2020-01-05T00:00:02.000+08:00| 64.5| -|2020-01-05T00:00:03.000+08:00| 69.0| -|2020-01-05T00:00:04.000+08:00| 64.2| -|2020-01-05T00:00:05.000+08:00| 62.3| -|2020-01-05T00:00:06.000+08:00| 58.0| -|2020-01-05T00:00:07.000+08:00| 58.9| -|2020-01-05T00:00:08.000+08:00| 52.0| -|2020-01-05T00:00:09.000+08:00| 62.3| -|2020-01-05T00:00:10.000+08:00| 61.0| -|2020-01-05T00:00:11.000+08:00| 64.2| -|2020-01-05T00:00:12.000+08:00| 61.8| -|2020-01-05T00:00:13.000+08:00| 64.0| -|2020-01-05T00:00:14.000+08:00| 63.0| -+-----------------------------+------------+ -``` - -SQL for query: - -```sql -select outlier(s1,"r"="5.0","k"="4","w"="10","s"="5") from root.test -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------+ -| Time|outlier(root.test.s1,"r"="5.0","k"="4","w"="10","s"="5")| -+-----------------------------+--------------------------------------------------------+ -|2020-01-05T00:00:03.000+08:00| 69.0| -|2020-01-05T00:00:08.000+08:00| 52.0| -+-----------------------------+--------------------------------------------------------+ -``` - -## 5. 
Frequency Domain Analysis - -### 5.1 Conv - -#### Registration Statement - -```sql -create function conv as 'org.apache.iotdb.library.frequency.UDTFConv' -``` - -#### Function Overview - -This function computes the convolution of two input series, that is, polynomial multiplication. - - -**Function name:** CONV - -**Input series:** Only two input series are supported. Both are of type INT32 / INT64 / FLOAT / DOUBLE. - -**Output series:** A single series of type DOUBLE: the convolution of the two series. Its timestamps start from 0 and only indicate order. - -**Note:** `NaN` values in the input series are ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| 7.0| -|1970-01-01T08:00:00.001+08:00| 0.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| null| -+-----------------------------+---------------+---------------+ -``` - -SQL for query: - -```sql -select conv(s1,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------+ -| Time|conv(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 7.0| -|1970-01-01T08:00:00.001+08:00| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| -|1970-01-01T08:00:00.003+08:00| 2.0| -+-----------------------------+--------------------------------------+ -``` - -### 5.2 Deconv - -#### Registration Statement - -```sql -create function deconv as 'org.apache.iotdb.library.frequency.UDTFDeconv' -``` - -#### Function Overview - -This function deconvolves two input series, that is, performs polynomial division. - -**Function name:** DECONV - -**Input series:** Only two input series are supported. Both are of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `result`: The deconvolution result to output, either 'quotient' or 'remainder', corresponding to the quotient and the remainder of the deconvolution. By default, the quotient is output. - -**Output series:** A single series of type DOUBLE: the result of deconvolving the second series from the first (dividing the first series by the second). Its timestamps start from 0 and only indicate order. - -**Note:** `NaN` values in the input series are ignored. - -#### Examples - -##### Computing the quotient of the deconvolution - -When the `result` parameter is omitted or set to 'quotient', this function computes the quotient of the deconvolution. - -Input series: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s3|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 8.0| 7.0|
-|1970-01-01T08:00:00.001+08:00| 2.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 7.0| null| -|1970-01-01T08:00:00.003+08:00| 2.0| null| -+-----------------------------+---------------+---------------+ -``` - - -SQL for query: - -```sql -select deconv(s3,s2) from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2)| -+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.0| -+-----------------------------+----------------------------------------+ -``` - -##### Computing the remainder of the deconvolution - -When the `result` parameter is 'remainder', this function computes the remainder of the deconvolution. The input series is the same as above. The SQL for the query is: - -```sql -select deconv(s3,s2,'result'='remainder') from root.test.d2 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------------------------------+ -| Time|deconv(root.test.d2.s3, root.test.d2.s2, "result"="remainder")| -+-----------------------------+--------------------------------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.0| -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 0.0| -|1970-01-01T08:00:00.003+08:00| 0.0| -+-----------------------------+--------------------------------------------------------------+ -``` - -### 5.3 DWT - -#### Registration Statement - -```sql -create function dwt as 'org.apache.iotdb.library.frequency.UDTFDWT' -``` - -#### Function Overview - -This function performs a one-dimensional discrete wavelet transform on the input series. - -**Function name:** DWT - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of wavelet filter. 'Haar', 'DB4', 'DB6', and 'DB8' are provided, where DB stands for Daubechies. If this parameter is not set, the user must supply the wavelet filter coefficients. The value is case-insensitive. -+ `coef`: The wavelet filter coefficients. If provided, separate the items with commas (',') and do not add spaces or other symbols. -+ `layer`: The number of transform passes. The number of output vectors equals $layer+1$. The default is 1. - -**Output series:** A single series of type DOUBLE, with the same length as the input. - -**Note:** The length of the input series must be an integer power of 2. - -#### Examples - -##### Haar transform - - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1|
-+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.100+08:00| 0.2| -|1970-01-01T08:00:00.200+08:00| 1.5| -|1970-01-01T08:00:00.300+08:00| 1.2| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.7| -|1970-01-01T08:00:00.600+08:00| 0.8| -|1970-01-01T08:00:00.700+08:00| 2.0| -|1970-01-01T08:00:00.800+08:00| 2.5| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 2.0| -|1970-01-01T08:00:01.200+08:00| 1.8| -|1970-01-01T08:00:01.300+08:00| 1.2| -|1970-01-01T08:00:01.400+08:00| 1.0| -|1970-01-01T08:00:01.500+08:00| 1.6| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select dwt(s1,"method"="haar") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------+ -| Time|dwt(root.test.d1.s1, "method"="haar")| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.14142135834465192| -|1970-01-01T08:00:00.100+08:00| 1.909188342921157| -|1970-01-01T08:00:00.200+08:00| 1.6263456473052773| -|1970-01-01T08:00:00.300+08:00| 1.9798989957517026| -|1970-01-01T08:00:00.400+08:00| 3.252691126023161| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776479437628| -|1970-01-01T08:00:00.800+08:00| -0.14142135834465192| -|1970-01-01T08:00:00.900+08:00| 0.21213200063848547| -|1970-01-01T08:00:01.000+08:00| -0.7778174761639416| -|1970-01-01T08:00:01.100+08:00| -0.8485281289944873| -|1970-01-01T08:00:01.200+08:00| 0.2828427799095765| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426400127697095| -|1970-01-01T08:00:01.500+08:00| -0.42426408557066786| -+-----------------------------+-------------------------------------+ -``` - - -### 5.4 IDWT - -#### Registration Statement - -```sql -create function idwt as
'org.apache.iotdb.library.frequency.UDTFIDWT' -``` - -#### Function Overview - -This function performs a one-dimensional inverse discrete wavelet transform on the input series, restoring the wavelet coefficients produced by DWT to the original data. - -**Function name:** IDWT - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of wavelet filter. 'Haar', 'DB4', 'DB6', and 'DB8' are provided, where DB stands for Daubechies. If this parameter is not set, the user must supply the wavelet filter coefficients. The value is case-insensitive. -+ `coef`: The wavelet filter coefficients. If provided, separate the items with commas (',') and do not add spaces or other symbols. -+ `layer`: The number of transform passes. The number of output vectors equals $layer+1$. The default is 1. - -**Output series:** A single series of type DOUBLE, with the same length as the input. - -**Note:** -* The length of the input series must be an integer power of 2. -* The IDWT parameters (method/coef/layer) must match those used for the corresponding DWT; otherwise the original data cannot be restored correctly. -* The input of IDWT is usually the output of the DWT function. - -#### Examples - -##### Haar transform - - -Input series: - -``` -+-----------------------------+--------------------+ -| Time| root.test.d1.s2| -+-----------------------------+--------------------+ -|1970-01-01T08:00:00.000+08:00| 0.1414213562373095| -|1970-01-01T08:00:00.100+08:00| 1.909188309203678| -|1970-01-01T08:00:00.200+08:00| 1.6263455967290592| -|1970-01-01T08:00:00.300+08:00| 1.979898987322333| -|1970-01-01T08:00:00.400+08:00| 3.2526911934581184| -|1970-01-01T08:00:00.500+08:00| 1.414213562373095| -|1970-01-01T08:00:00.600+08:00| 2.1213203435596424| -|1970-01-01T08:00:00.700+08:00| 1.8384776310850235| -|1970-01-01T08:00:00.800+08:00| -0.1414213562373095| -|1970-01-01T08:00:00.900+08:00| 0.21213203435596428| -|1970-01-01T08:00:01.000+08:00| -0.7778174593052022| -|1970-01-01T08:00:01.100+08:00| -0.8485281374238569| -|1970-01-01T08:00:01.200+08:00| 0.2828427124746189| -|1970-01-01T08:00:01.300+08:00| -1.414213562373095| -|1970-01-01T08:00:01.400+08:00| 0.42426406871192857| -|1970-01-01T08:00:01.500+08:00|-0.42426406871192857| -+-----------------------------+--------------------+ -``` - -SQL for query: - -```sql -select idwt(s2,"method"="haar") from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+--------------------------------------+ -| Time|idwt(root.test.d1.s2, "method"="haar")| -+-----------------------------+--------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0|
-|1970-01-01T08:00:00.100+08:00| 0.19999999999999998| -|1970-01-01T08:00:00.200+08:00| 1.4999999999999996| -|1970-01-01T08:00:00.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:00.400+08:00| 0.6| -|1970-01-01T08:00:00.500+08:00| 1.6999999999999997| -|1970-01-01T08:00:00.600+08:00| 0.7999999999999998| -|1970-01-01T08:00:00.700+08:00| 1.9999999999999996| -|1970-01-01T08:00:00.800+08:00| 2.4999999999999996| -|1970-01-01T08:00:00.900+08:00| 2.1| -|1970-01-01T08:00:01.000+08:00| 0.0| -|1970-01-01T08:00:01.100+08:00| 1.9999999999999996| -|1970-01-01T08:00:01.200+08:00| 1.7999999999999998| -|1970-01-01T08:00:01.300+08:00| 1.1999999999999997| -|1970-01-01T08:00:01.400+08:00| 0.9999999999999998| -|1970-01-01T08:00:01.500+08:00| 1.5999999999999999| -+-----------------------------+--------------------------------------+ -``` - - -### 5.5 FFT - -#### Registration Statement - -```sql -create function fft as 'org.apache.iotdb.library.frequency.UDTFFFT' -``` - -#### Function Overview - -This function performs a fast Fourier transform on the input series. - -**Function name:** FFT - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `method`: The type of Fourier transform, either 'uniform' or 'nonuniform'. The default is 'uniform'. With 'uniform', timestamps are ignored, all data points are treated as equally spaced, and the uniform fast Fourier transform is applied. With 'nonuniform', the non-uniform fast Fourier transform is applied according to the timestamps (not yet implemented). -+ `result`: The part of the transform result to output: 'real', 'imag', 'abs', or 'angle', corresponding to the real part, imaginary part, modulus, and argument of the result. By default, the modulus is output. -+ `compress`: The compression parameter, in the range (0,1]. It is the proportion of energy retained under lossy compression. By default, no compression is applied. - -**Output series:** A single series of type DOUBLE, with the same length as the input. Its timestamps start from 0 and only indicate order. - -**Note:** `NaN` values in the input series are ignored. - -#### Examples - -##### Uniform Fourier transform - -When the `method` parameter is omitted or set to 'uniform', this function performs a uniform Fourier transform. - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00|
-1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - -SQL for query: - -```sql -select fft(s1) from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------+ -| Time| fft(root.test.d1.s1)| -+-----------------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 1.2727111142703152E-8| -|1970-01-01T08:00:00.002+08:00| 2.385520799101839E-7| -|1970-01-01T08:00:00.003+08:00| 8.723291723972645E-8| -|1970-01-01T08:00:00.004+08:00| 19.999999960195904| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| -|1970-01-01T08:00:00.006+08:00| 3.2260694930700566E-7| -|1970-01-01T08:00:00.007+08:00| 8.723291605373329E-8| -|1970-01-01T08:00:00.008+08:00| 1.108657103979944E-7| -|1970-01-01T08:00:00.009+08:00| 1.2727110997246171E-8| -|1970-01-01T08:00:00.010+08:00|1.9852334701272664E-23| -|1970-01-01T08:00:00.011+08:00| 1.2727111194499847E-8| -|1970-01-01T08:00:00.012+08:00| 1.108657103979944E-7| -|1970-01-01T08:00:00.013+08:00| 8.723291785769131E-8| -|1970-01-01T08:00:00.014+08:00| 3.226069493070057E-7| -|1970-01-01T08:00:00.015+08:00| 9.999999850988388| -|1970-01-01T08:00:00.016+08:00| 19.999999960195904| -|1970-01-01T08:00:00.017+08:00| 8.723291747109068E-8| -|1970-01-01T08:00:00.018+08:00| 2.3855207991018386E-7| -|1970-01-01T08:00:00.019+08:00| 1.2727112069910878E-8| -+-----------------------------+----------------------+ -``` - -Note: The input series follows $y=\sin(2\pi t/4)+2\sin(2\pi t/5)$ (up to a phase shift in each term) and has length 20, so the output series has peaks at $k=4$ and $k=5$.
- -##### Uniform Fourier transform with compression - -The input series is the same as above. The SQL for the query is: - -```sql -select fft(s1, 'result'='real', 'compress'='0.99'), fft(s1, 'result'='imag','compress'='0.99') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| fft(root.test.d1.s1,| fft(root.test.d1.s1,| -| | "result"="real",| "result"="imag",| -| | "compress"="0.99")| "compress"="0.99")| -+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - -Note: Because the Fourier transform result is conjugate-symmetric, only the first half of the compressed result is kept. Data points are retained from low frequency to high frequency, according to the given compression parameter, until the retained proportion of energy exceeds that value. The last data point is kept to indicate the length of the series. - -### 5.6 HighPass - -#### Registration Statement - -```sql -create function highpass as 'org.apache.iotdb.library.frequency.UDTFHighPass' -``` - -#### Function Overview - -This function applies a high-pass filter to the input series and extracts the components above the cutoff frequency. The timestamps of the input series are ignored, and all data points are treated as equally spaced. - -**Function name:** HIGHPASS - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `wpass`: The normalized cutoff frequency, in the range (0,1). This parameter is required. - -**Output series:** A single series of type DOUBLE: the filtered series, whose length and timestamps match those of the input. - -**Note:** `NaN` values in the input series are ignored. - -#### Examples - - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705| -|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0|
-|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select highpass(s1,'wpass'='0.45') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-----------------------------------------+ -| Time|highpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.9999999534830373| -|1970-01-01T08:00:01.000+08:00| 1.7462829277628608E-8| -|1970-01-01T08:00:02.000+08:00| -0.9999999593178128| -|1970-01-01T08:00:03.000+08:00| -4.1115269056426626E-8| -|1970-01-01T08:00:04.000+08:00| 0.9999999925494194| -|1970-01-01T08:00:05.000+08:00| 3.328126513330016E-8| -|1970-01-01T08:00:06.000+08:00| -1.0000000183304454| -|1970-01-01T08:00:07.000+08:00| 6.260191433311374E-10| -|1970-01-01T08:00:08.000+08:00| 1.0000000018134796| -|1970-01-01T08:00:09.000+08:00| -3.097210911744423E-17| -|1970-01-01T08:00:10.000+08:00| -1.0000000018134794| -|1970-01-01T08:00:11.000+08:00| -6.260191627862097E-10| -|1970-01-01T08:00:12.000+08:00| 1.0000000183304454| -|1970-01-01T08:00:13.000+08:00| -3.328126501424346E-8| -|1970-01-01T08:00:14.000+08:00| -0.9999999925494196| -|1970-01-01T08:00:15.000+08:00| 4.111526915498874E-8| -|1970-01-01T08:00:16.000+08:00| 0.9999999593178128| -|1970-01-01T08:00:17.000+08:00| -1.7462829341296528E-8|
-|1970-01-01T08:00:18.000+08:00| -0.9999999534830369| -|1970-01-01T08:00:19.000+08:00| -1.035237222742873E-16| -+-----------------------------+-----------------------------------------+ -``` - -Note: The input series follows $y=\sin(2\pi t/4)+2\sin(2\pi t/5)$ (up to a phase shift in each term) and has length 20, so the output series after high-pass filtering follows $y=\sin(2\pi t/4)$ (up to a phase shift). - -### 5.7 IFFT - -#### Registration Statement - -```sql -create function ifft as 'org.apache.iotdb.library.frequency.UDTFIFFT' -``` - -#### Function Overview - -This function treats the two input series as the real and imaginary parts of a complex series, performs an inverse fast Fourier transform, and outputs the real part of the result. For the input format, see the output of the `FFT` function; the compressed output of `FFT` is also accepted as input. - -**Function name:** IFFT - -**Input series:** Only two input series are supported. Both are of type INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `start`: The start time of the output series, a time string in the format 'yyyy-MM-dd HH:mm:ss'. The default is '1970-01-01 08:00:00'. -+ `interval`: The time interval of the output series, a positive number with a unit. Five units are currently supported: 'ms' (milliseconds), 's' (seconds), 'm' (minutes), 'h' (hours), and 'd' (days). The default is 1s. - - -**Output series:** A single series of type DOUBLE. It is an equally spaced time series whose values are the result of the inverse fast Fourier transform, taking the two input series as the real and imaginary parts in that order. - -**Note:** If a row contains a null value or `NaN`, the row is ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+----------------------+----------------------+ -| Time| root.test.d1.re| root.test.d1.im| -+-----------------------------+----------------------+----------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| 0.0| -|1970-01-01T08:00:00.001+08:00| -3.932894010461041E-9| 1.2104201863039066E-8| -|1970-01-01T08:00:00.002+08:00|-1.4021739447490164E-7| 1.9299268669082926E-7| -|1970-01-01T08:00:00.003+08:00| -7.057291240286645E-8| 5.127422242345858E-8| -|1970-01-01T08:00:00.004+08:00| 19.021130288047125| -6.180339875198807| -|1970-01-01T08:00:00.005+08:00| 9.999999850988388| 3.501852745067114E-16| -|1970-01-01T08:00:00.019+08:00| -3.932894898639461E-9|-1.2104202549376264E-8| -+-----------------------------+----------------------+----------------------+ -``` - - -SQL for query: - -```sql -select ifft(re, im, 'interval'='1m', 'start'='2021-01-01 00:00:00') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+-------------------------------------------------------+ -| Time|ifft(root.test.d1.re, root.test.d1.im, "interval"="1m",|
-| | "start"="2021-01-01 00:00:00")| -+-----------------------------+-------------------------------------------------------+ -|2021-01-01T00:00:00.000+08:00| 2.902112992431231| -|2021-01-01T00:01:00.000+08:00| 1.1755704705132448| -|2021-01-01T00:02:00.000+08:00| -2.175570513757101| -|2021-01-01T00:03:00.000+08:00| -1.9021130389094498| -|2021-01-01T00:04:00.000+08:00| 0.9999999925494194| -|2021-01-01T00:05:00.000+08:00| 1.902113046743454| -|2021-01-01T00:06:00.000+08:00| 0.17557053610884188| -|2021-01-01T00:07:00.000+08:00| -1.1755704886020932| -|2021-01-01T00:08:00.000+08:00| -0.9021130371347148| -|2021-01-01T00:09:00.000+08:00| 3.552713678800501E-16| -|2021-01-01T00:10:00.000+08:00| 0.9021130371347154| -|2021-01-01T00:11:00.000+08:00| 1.1755704886020932| -|2021-01-01T00:12:00.000+08:00| -0.17557053610884144| -|2021-01-01T00:13:00.000+08:00| -1.902113046743454| -|2021-01-01T00:14:00.000+08:00| -0.9999999925494196| -|2021-01-01T00:15:00.000+08:00| 1.9021130389094498| -|2021-01-01T00:16:00.000+08:00| 2.1755705137571004| -|2021-01-01T00:17:00.000+08:00| -1.1755704705132448| -|2021-01-01T00:18:00.000+08:00| -2.902112992431231| -|2021-01-01T00:19:00.000+08:00| -3.552713678800501E-16| -+-----------------------------+-------------------------------------------------------+ -``` - -### 5.8 LowPass - -#### Registration Statement - -```sql -create function lowpass as 'org.apache.iotdb.library.frequency.UDTFLowPass' -``` - -#### Function Overview - -This function applies a low-pass filter to the input series and extracts the components below the cutoff frequency. The timestamps of the input series are ignored, and all data points are treated as equally spaced. - -**Function name:** LOWPASS - -**Input series:** Only a single input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE. - -**Parameters:** - -+ `wpass`: The normalized cutoff frequency, in the range (0,1). This parameter is required. - -**Output series:** A single series of type DOUBLE: the filtered series, whose length and timestamps match those of the input. - -**Note:** `NaN` values in the input series are ignored. - -#### Examples - -Input series: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:00.000+08:00| 2.902113| -|1970-01-01T08:00:01.000+08:00| 1.1755705| -|1970-01-01T08:00:02.000+08:00| -2.1755705|
-|1970-01-01T08:00:03.000+08:00| -1.9021131| -|1970-01-01T08:00:04.000+08:00| 1.0| -|1970-01-01T08:00:05.000+08:00| 1.9021131| -|1970-01-01T08:00:06.000+08:00| 0.1755705| -|1970-01-01T08:00:07.000+08:00| -1.1755705| -|1970-01-01T08:00:08.000+08:00| -0.902113| -|1970-01-01T08:00:09.000+08:00| 0.0| -|1970-01-01T08:00:10.000+08:00| 0.902113| -|1970-01-01T08:00:11.000+08:00| 1.1755705| -|1970-01-01T08:00:12.000+08:00| -0.1755705| -|1970-01-01T08:00:13.000+08:00| -1.9021131| -|1970-01-01T08:00:14.000+08:00| -1.0| -|1970-01-01T08:00:15.000+08:00| 1.9021131| -|1970-01-01T08:00:16.000+08:00| 2.1755705| -|1970-01-01T08:00:17.000+08:00| -1.1755705| -|1970-01-01T08:00:18.000+08:00| -2.902113| -|1970-01-01T08:00:19.000+08:00| 0.0| -+-----------------------------+---------------+ -``` - - -SQL for query: - -```sql -select lowpass(s1,'wpass'='0.45') from root.test.d1 -``` - -Output series: - -``` -+-----------------------------+----------------------------------------+ -| Time|lowpass(root.test.d1.s1, "wpass"="0.45")| -+-----------------------------+----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 1.9021130073323922| -|1970-01-01T08:00:01.000+08:00| 1.1755704705132448| -|1970-01-01T08:00:02.000+08:00| -1.1755705286582614| -|1970-01-01T08:00:03.000+08:00| -1.9021130389094498| -|1970-01-01T08:00:04.000+08:00| 7.450580419288145E-9| -|1970-01-01T08:00:05.000+08:00| 1.902113046743454| -|1970-01-01T08:00:06.000+08:00| 1.1755705212076808| -|1970-01-01T08:00:07.000+08:00| -1.1755704886020932| -|1970-01-01T08:00:08.000+08:00| -1.9021130222335536| -|1970-01-01T08:00:09.000+08:00| 3.552713678800501E-16| -|1970-01-01T08:00:10.000+08:00| 1.9021130222335536| -|1970-01-01T08:00:11.000+08:00| 1.1755704886020932| -|1970-01-01T08:00:12.000+08:00| -1.1755705212076801| -|1970-01-01T08:00:13.000+08:00| -1.902113046743454| -|1970-01-01T08:00:14.000+08:00| -7.45058112983088E-9| -|1970-01-01T08:00:15.000+08:00| 1.9021130389094498| -|1970-01-01T08:00:16.000+08:00| 1.1755705286582616|
-|1970-01-01T08:00:17.000+08:00| -1.1755704705132448| -|1970-01-01T08:00:18.000+08:00| -1.9021130073323924| -|1970-01-01T08:00:19.000+08:00| -2.664535259100376E-16| -+-----------------------------+----------------------------------------+ -``` - -注:输入序列服从$y=sin(2\pi t/4)+2sin(2\pi t/5)$,长度为20,因此低通滤波之后的输出序列服从$y=2sin(2\pi t/5)$。 - - -### 5.9 Envelope - -#### 注册语句 - -```sql -create function envelope as 'org.apache.iotdb.library.frequency.UDFEnvelopeAnalysis' -``` - -#### 函数简介 - -本函数通过输入一维浮点数数组和用户指定的调制频率,实现对信号的解调和包络提取。解调的目标是从复杂的信号中提取感兴趣的部分,使其更易理解。比如通过解调可以找到信号的包络,即振幅的变化趋势。 - -**函数名:** Envelope - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `frequency`:频率(选填,正数。不填此参数,系统会基于序列对应时间的时间间隔来推断频率)。 -+ `amplification`: 扩增倍数(选填,正整数。输出Time列的结果为正整数的集合,不会输出小数。当频率小1时,可通过此参数对频率进行扩增以展示正常的结果)。 - -**输出序列:** -+ `Time`: 该列返回的值的含义是频率而并非时间,如果输出的格式为时间格式(如:1970-01-01T08:00:19.000+08:00),请将其转为时间戳值。 - -+ `Envelope(Path, 'frequency'='{frequency}')`:输出单个序列,类型为DOUBLE,它是包络分析之后的结果。 - -**提示:** 当解调的原始序列的值不连续时,本函数会视为连续处理,建议被分析的时间序列是一段值完整的时间序列。同时建议指定开始时间与结束时间。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s1| -+-----------------------------+---------------+ -|1970-01-01T08:00:01.000+08:00| 1.0 | -|1970-01-01T08:00:02.000+08:00| 2.0 | -|1970-01-01T08:00:03.000+08:00| 3.0 | -|1970-01-01T08:00:04.000+08:00| 4.0 | -|1970-01-01T08:00:05.000+08:00| 5.0 | -|1970-01-01T08:00:06.000+08:00| 6.0 | -|1970-01-01T08:00:07.000+08:00| 7.0 | -|1970-01-01T08:00:08.000+08:00| 8.0 | -|1970-01-01T08:00:09.000+08:00| 9.0 | -|1970-01-01T08:00:10.000+08:00| 10.0 | -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: -```sql -set time_display_type=long; -select envelope(s1),envelope(s1,'frequency'='1000'),envelope(s1,'amplification'='10') from root.test.d1; -``` -输出序列: - -``` -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ 
-|Time|envelope(root.test.d1.s1)|envelope(root.test.d1.s1, "frequency"="1000")|envelope(root.test.d1.s1, "amplification"="10")| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ -| 0| 6.284350808484124| 6.284350808484124| 6.284350808484124| -| 100| 1.5581923657404393| 1.5581923657404393| null| -| 200| 0.8503211038340728| 0.8503211038340728| null| -| 300| 0.512808785945551| 0.512808785945551| null| -| 400| 0.26361156774506744| 0.26361156774506744| null| -|1000| null| null| 1.5581923657404393| -|2000| null| null| 0.8503211038340728| -|3000| null| null| 0.512808785945551| -|4000| null| null| 0.26361156774506744| -+----+-------------------------+---------------------------------------------+-----------------------------------------------+ - -``` - -## 6. 数据匹配 - -### 6.1 Cov - -#### 注册语句 - -```sql -create function cov as 'org.apache.iotdb.library.dmatch.UDAFCov' -``` - -#### 函数简介 - -本函数用于计算两列数值型数据的总体协方差。 - -**函数名:** COV - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列仅包含一个时间戳为 0、值为总体协方差的数据点。 - -**提示:** - -+ 如果某行数据中包含空值、缺失值或`NaN`,该行数据将会被忽略; -+ 如果数据中所有的行都被忽略,函数将会输出`NaN`。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 
100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select cov(s1,s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------+ -| Time|cov(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 12.291666666666666| -+-----------------------------+-------------------------------------+ -``` - -### 6.2 Dtw - -#### 注册语句 - -```sql -create function dtw as 'org.apache.iotdb.library.dmatch.UDAFDtw' -``` - -#### 函数简介 - -本函数用于计算两列数值型数据的 DTW 距离。 - -**函数名:** DTW - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列仅包含一个时间戳为 0、值为两个时间序列的 DTW 距离值。 - -**提示:** - -+ 如果某行数据中包含空值、缺失值或`NaN`,该行数据将会被忽略; -+ 如果数据中所有的行都被忽略,函数将会输出 0。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|1970-01-01T08:00:00.001+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.002+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.003+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.004+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.005+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.006+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.007+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.008+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.009+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.010+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.011+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.012+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.013+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.014+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.015+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.016+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.017+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.018+08:00| 1.0| 2.0| -|1970-01-01T08:00:00.019+08:00| 
1.0| 2.0| -|1970-01-01T08:00:00.020+08:00| 1.0| 2.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select dtw(s1,s2) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------+ -| Time|dtw(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 20.0| -+-----------------------------+-------------------------------------+ -``` - -### 6.3 Pearson - -#### 注册语句 - -```sql -create function pearson as 'org.apache.iotdb.library.dmatch.UDAFPearson' -``` - -#### 函数简介 - -本函数用于计算两列数值型数据的皮尔森相关系数。 - -**函数名:** PEARSON - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列仅包含一个时间戳为 0、值为皮尔森相关系数的数据点。 - -**提示:** - -+ 如果某行数据中包含空值、缺失值或`NaN`,该行数据将会被忽略; -+ 如果数据中所有的行都被忽略,函数将会输出`NaN`。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d2.s1|root.test.d2.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| 101.0| -|2020-01-01T00:00:03.000+08:00| 101.0| null| -|2020-01-01T00:00:04.000+08:00| 102.0| 101.0| -|2020-01-01T00:00:06.000+08:00| 104.0| 102.0| -|2020-01-01T00:00:08.000+08:00| 126.0| 102.0| -|2020-01-01T00:00:10.000+08:00| 108.0| 103.0| -|2020-01-01T00:00:12.000+08:00| null| 103.0| -|2020-01-01T00:00:14.000+08:00| 112.0| 104.0| -|2020-01-01T00:00:15.000+08:00| 113.0| null| -|2020-01-01T00:00:16.000+08:00| 114.0| 104.0| -|2020-01-01T00:00:18.000+08:00| 116.0| 105.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 105.0| -|2020-01-01T00:00:22.000+08:00| 100.0| 106.0| -|2020-01-01T00:00:26.000+08:00| 124.0| 108.0| -|2020-01-01T00:00:28.000+08:00| 126.0| 108.0| -|2020-01-01T00:00:30.000+08:00| NaN| 108.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select pearson(s1,s2) from root.test.d2 -``` - -输出序列: - -``` 
-+-----------------------------+-----------------------------------------+ -| Time|pearson(root.test.d2.s1, root.test.d2.s2)| -+-----------------------------+-----------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.5630881927754872| -+-----------------------------+-----------------------------------------+ -``` - -### 6.4 PtnSym - -#### 注册语句 - -```sql -create function ptnsym as 'org.apache.iotdb.library.dmatch.UDTFPtnSym' -``` - -#### 函数简介 - -本函数用于寻找序列中所有对称度小于阈值的对称子序列。对称度通过 DTW 计算,值越小代表序列对称性越高。 - -**函数名:** PTNSYM - -**输入序列:** 仅支持一个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `window`:对称子序列的长度,是一个正整数,默认值为 10。 -+ `threshold`:对称度阈值,是一个非负数,只有对称度小于等于该值的对称子序列才会被输出。在缺省情况下,所有的子序列都会被输出。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列中的每一个数据点对应于一个对称子序列,时间戳为子序列的起始时刻,值为对称度。 - - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d1.s4| -+-----------------------------+---------------+ -|2021-01-01T12:00:00.000+08:00| 1.0| -|2021-01-01T12:00:01.000+08:00| 2.0| -|2021-01-01T12:00:02.000+08:00| 3.0| -|2021-01-01T12:00:03.000+08:00| 2.0| -|2021-01-01T12:00:04.000+08:00| 1.0| -|2021-01-01T12:00:05.000+08:00| 1.0| -|2021-01-01T12:00:06.000+08:00| 1.0| -|2021-01-01T12:00:07.000+08:00| 1.0| -|2021-01-01T12:00:08.000+08:00| 2.0| -|2021-01-01T12:00:09.000+08:00| 3.0| -|2021-01-01T12:00:10.000+08:00| 2.0| -|2021-01-01T12:00:11.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select ptnsym(s4, 'window'='5', 'threshold'='0') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|ptnsym(root.test.d1.s4, "window"="5", "threshold"="0")| -+-----------------------------+------------------------------------------------------+ -|2021-01-01T12:00:00.000+08:00| 0.0| -|2021-01-01T12:00:07.000+08:00| 0.0| -+-----------------------------+------------------------------------------------------+ -``` - -### 6.5 XCorr - 
-#### 注册语句 - -```sql -create function xcorr as 'org.apache.iotdb.library.dmatch.UDTFXCorr' -``` - -#### 函数简介 - -本函数用于计算两条时间序列的互相关函数值, -对离散序列而言,互相关函数可以表示为 -$$CR(n) = \frac{1}{N} \sum_{m=1}^N S_1[m]S_2[m+n]$$ -常用于表征两条序列在不同对齐条件下的相似度。 - -**函数名:** XCORR - -**输入序列:** 仅支持两个输入序列,类型均为 INT32 / INT64 / FLOAT / DOUBLE。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。序列中共包含$2N-1$个数据点, -其中正中心的值为两条序列按照预先对齐的结果计算的互相关系数(即等于以上公式的$CR(0)$), -前半部分的值表示将后一条输入序列向前平移时计算的互相关系数, -直至两条序列没有重合的数据点(不包含完全分离时的结果$CR(-N)=0.0$), -后半部分类似。 -用公式可表示为(所有序列的索引从1开始计数): -$$OS[i] = CR(-N+i) = \frac{1}{N} \sum_{m=1}^{i} S_1[m]S_2[N-i+m],\ if\ i <= N$$ -$$OS[i] = CR(i-N) = \frac{1}{N} \sum_{m=1}^{2N-i} S_1[i-N+m]S_2[m],\ if\ i > N$$ - -**提示:** - -+ 两条序列中的`null` 和`NaN` 值会被忽略,在计算中表现为 0。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:01.000+08:00| null| 6| -|2020-01-01T00:00:02.000+08:00| 2| 7| -|2020-01-01T00:00:03.000+08:00| 3| NaN| -|2020-01-01T00:00:04.000+08:00| 4| 9| -|2020-01-01T00:00:05.000+08:00| 5| 10| -+-----------------------------+---------------+---------------+ -``` - - -用于查询的 SQL 语句: - -```sql -select xcorr(s1, s2) from root.test.d1 where time <= 2020-01-01 00:00:05 -``` - -输出序列: - -``` -+-----------------------------+---------------------------------------+ -| Time|xcorr(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+---------------------------------------+ -|1970-01-01T08:00:00.000+08:00| 0.0| -|1970-01-01T08:00:00.001+08:00| 10.0| -|1970-01-01T08:00:00.002+08:00| 16.0| -|1970-01-01T08:00:00.003+08:00| 16.75| -|1970-01-01T08:00:00.004+08:00| 20.0| -|1970-01-01T08:00:00.005+08:00| 13.2| -|1970-01-01T08:00:00.006+08:00| 5.6| -|1970-01-01T08:00:00.007+08:00| 7.0| -|1970-01-01T08:00:00.008+08:00| 0.0| -+-----------------------------+---------------------------------------+ -``` - - -### 6.6 Pattern\_match - -#### 注册语句 - -```SQL 
-create function pattern_match as 'org.apache.iotdb.library.match.UDAFPatternMatch' -``` - -#### 函数简介 - -本函数用于对输入的某一条时间序列与预设的`pattern`进行模式匹配,当相似度小于等于某个预设阈值时判定为匹配成功,并将最终匹配结果以`json`列表的方式输出。 - -**函数名:** PATTERN\_MATCH - -**输入序列:** 仅支持一个输入序列,类型为INT32,INT64,FLOAT,DOUBLE,BOOLEAN。 - -**参数:** - -* `timePattern` :以时间戳组成的字符串,以逗号分隔。长度必须大于1 。必填项。 -* `valuePattern `:以数字组成的字符串,以逗号分隔。数量与 `timePattern `相同,长度必须大于1。必填项。 - -> 提示:布尔类型的`valuePattern `,需要用1,0来表示`true`和`false`。 - -* `threshold` :阈值。Float类型。必填项。 - -**输出序列**:输出结果为包含所有成功匹配段落的起始时间戳`startTime`、终止时间戳`endTime`及相似度值`distance`的`json`列表。 - -#### 使用示例 -1. 线性数据 - -输入序列: - -```SQL -IoTDB> select s0 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s0| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| 1.1| -|1970-01-01T08:00:00.003+08:00| 1.2| -|1970-01-01T08:00:00.004+08:00| 1.3| -|1970-01-01T08:00:00.005+08:00| 0.0| -+-----------------------------+-------------+ -``` - -用于查询的SQL语句: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result from root.db.d0 -``` - -输出序列: - -```SQL -+--------------------------------------------------------------------------------------------------+ -| match_result| -+--------------------------------------------------------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]| -+--------------------------------------------------------------------------------------------------+ -``` - -2. 
布尔类型数据 - -输入序列: - -```SQL -IoTDB> select s1 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s1| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| true| -|1970-01-01T08:00:00.002+08:00| true| -|1970-01-01T08:00:00.003+08:00| true| -|1970-01-01T08:00:00.004+08:00| false| -|1970-01-01T08:00:00.005+08:00| false| -+-----------------------------+-------------+ -``` - -用于查询的SQL语句: - -```SQL -select pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", "threshold"="0.5") as match_result from root.db.d0 -``` - -输出序列: - -```SQL -+-------------------------------------------------+ -| match_result| -+-------------------------------------------------+ -|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+-------------------------------------------------+ -``` - -3. V型数据 - -输入序列: - -```SQL -IoTDB> select s2 from root.** -+-----------------------------+-------------+ -| Time|root.db.d0.s2| -+-----------------------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| -|1970-01-01T08:00:00.002+08:00| -1.0| -|1970-01-01T08:00:00.003+08:00| -2.0| -|1970-01-01T08:00:00.004+08:00| -3.0| -|1970-01-01T08:00:00.005+08:00| -2.0| -|1970-01-01T08:00:00.006+08:00| -1.0| -|1970-01-01T08:00:00.007+08:00| -0.0| -|1970-01-01T08:00:00.008+08:00| -0.0| -|1970-01-01T08:00:00.009+08:00| -0.0| -|1970-01-01T08:00:00.010+08:00| -0.0| -+-----------------------------+-------------+ -``` - -用于查询的SQL语句: - -```SQL -select pattern_match (s2, "timePattern"="1,2,3,4,5,6,7", "valuePattern"="0.0,-1.0,-2.0,-3.0,-2.0,-1.0,-0.0", "threshold"="10") as match_result from root.db.d0 -``` - -输出序列: - -```SQL -+----------------------------------------------+ -| match_result| -+----------------------------------------------+ -|[{"distance":0.53,"startTime":1,"endTime":10}]| -+----------------------------------------------+ -``` - -4. 
多个匹配模式 - -输入序列: - -```SQL -IoTDB> select s0,s1 from root.** -+-----------------------------+-------------+-------------+ -| Time|root.db.d0.s0|root.db.d0.s1| -+-----------------------------+-------------+-------------+ -|1970-01-01T08:00:00.001+08:00| 0.0| true| -|1970-01-01T08:00:00.002+08:00| 1.1| true| -|1970-01-01T08:00:00.003+08:00| 1.2| true| -|1970-01-01T08:00:00.004+08:00| 1.3| false| -|1970-01-01T08:00:00.005+08:00| 0.0| false| -+-----------------------------+-------------+-------------+ -``` - -用于查询的SQL语句: - -```SQL -select pattern_match (s0, "timePattern"="1,2,3", "valuePattern"="1.1,1.2,1.3", "threshold"="0.5") as match_result1, pattern_match (s1, "timePattern"="1,2,3", "valuePattern"="1,1,1", - "threshold"="0.5") as match_result2 from root.db.d0 -``` - -输出序列: - -```SQL -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -| match_result1| match_result2| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -|[{"distance":0.200000,"startTime":1,"endTime":3}, {"distance":0.000000,"startTime":2,"endTime":4}]|[{"distance":0.000000,"startTime":1,"endTime":3}]| -+--------------------------------------------------------------------------------------------------+-------------------------------------------------+ -``` - - -## 7. 
数据修复 - -### 7.1 TimestampRepair - -#### 注册语句 - -```sql -create function timestamprepair as 'org.apache.iotdb.library.drepair.UDTFTimestampRepair' -``` - -#### 函数简介 - -本函数用于时间戳修复。根据给定的标准时间间隔,采用最小化修复代价的方法,通过对数据时间戳的微调,将原本时间戳间隔不稳定的数据修复为严格等间隔的数据。在未给定标准时间间隔的情况下,本函数将使用时间间隔的中位数 (median)、众数 (mode) 或聚类中心 (cluster) 来推算标准时间间隔。 - - -**函数名:** TIMESTAMPREPAIR - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE - -**参数:** - -+ `interval`: 标准时间间隔(单位是毫秒),是一个正整数。在缺省情况下,将根据指定的方法推算。 -+ `method`:推算标准时间间隔的方法,取值为 'median', 'mode' 或 'cluster',仅在`interval`缺省时有效。在缺省情况下,将使用中位数方法进行推算。 - -**输出序列:** 输出单个序列,类型与输入序列相同。该序列是修复后的输入序列。 - -#### 使用示例 - -#### 指定标准时间间隔 - -在给定`interval`参数的情况下,本函数将按照指定的标准时间间隔进行修复。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:19.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:01.000+08:00| 7.0| -|2021-07-01T12:01:11.000+08:00| 8.0| -|2021-07-01T12:01:21.000+08:00| 9.0| -|2021-07-01T12:01:31.000+08:00| 10.0| -+-----------------------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select timestamprepair(s1,'interval'='10000') from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+----------------------------------------------------+ -| Time|timestamprepair(root.test.d2.s1, "interval"="10000")| -+-----------------------------+----------------------------------------------------+ -|2021-07-01T12:00:00.000+08:00| 1.0| -|2021-07-01T12:00:10.000+08:00| 2.0| -|2021-07-01T12:00:20.000+08:00| 3.0| -|2021-07-01T12:00:30.000+08:00| 4.0| -|2021-07-01T12:00:40.000+08:00| 5.0| -|2021-07-01T12:00:50.000+08:00| 6.0| -|2021-07-01T12:01:00.000+08:00| 7.0| -|2021-07-01T12:01:10.000+08:00| 8.0| -|2021-07-01T12:01:20.000+08:00| 9.0| -|2021-07-01T12:01:30.000+08:00| 10.0| 
-|2021-07-01T12:01:40.000+08:00|                                                 NaN|
-+-----------------------------+----------------------------------------------------+
-```
-
-#### Automatically inferring the standard interval
-
-If the `interval` parameter is not given, the function repairs the timestamps using the inferred standard interval.
-
-The input series is the same as above. The SQL for the query is:
-
-```sql
-select timestamprepair(s1) from root.test.d2
-```
-
-Output series:
-
-```
-+-----------------------------+--------------------------------+
-|                         Time|timestamprepair(root.test.d2.s1)|
-+-----------------------------+--------------------------------+
-|2021-07-01T12:00:00.000+08:00|                             1.0|
-|2021-07-01T12:00:10.000+08:00|                             2.0|
-|2021-07-01T12:00:20.000+08:00|                             3.0|
-|2021-07-01T12:00:30.000+08:00|                             4.0|
-|2021-07-01T12:00:40.000+08:00|                             5.0|
-|2021-07-01T12:00:50.000+08:00|                             6.0|
-|2021-07-01T12:01:00.000+08:00|                             7.0|
-|2021-07-01T12:01:10.000+08:00|                             8.0|
-|2021-07-01T12:01:20.000+08:00|                             9.0|
-|2021-07-01T12:01:30.000+08:00|                            10.0|
-|2021-07-01T12:01:40.000+08:00|                             NaN|
-+-----------------------------+--------------------------------+
-```
-
-### 7.2 ValueFill
-
-#### Registration statement
-
-```sql
-create function valuefill as 'org.apache.iotdb.library.drepair.UDTFValueFill'
-```
-
-#### Function introduction
-
-**Function name:** ValueFill
-
-**Input series:** A single time series of type INT32 / INT64 / FLOAT / DOUBLE.
-
-**Parameters:**
-
-+ `method`: one of {"mean", "previous", "linear", "likelihood", "AR", "MA", "SCREEN"}; the default is "linear". "mean" fills with the mean value; "previous" fills with the previous value; "linear" fills by linear interpolation; "likelihood" is a maximum likelihood estimation method based on a normal distribution of speed; "AR" fills with an autoregressive model; "MA" fills with a moving average; "SCREEN" fills under constraints.
-
-**Output series:** The filled one-dimensional series.
-
-**Note:** The AR model used is AR(1). The time series must satisfy the autocorrelation condition; otherwise a single data point (0, 0.0) is output.
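
The `linear` method can be sketched outside the database to make its behaviour concrete. The Python below is a minimal illustration, not the UDF's actual implementation: the function name `linear_fill` is hypothetical, and it assumes that interior gaps are interpolated by position in the sequence (not by timestamp) and that gaps at either end copy the nearest valid value. Those assumptions are consistent with the `linear` usage example in this section (the leading `NaN` at 00:00:02 becomes 101.0, and the `NaN` at 00:00:14 becomes 110.5), but the source does not state them.

```python
import math


def linear_fill(values):
    """Fill NaN entries by index-based linear interpolation (illustrative only)."""
    filled = list(values)
    valid = [i for i, v in enumerate(filled) if not math.isnan(v)]
    if not valid:
        return filled  # nothing to interpolate from
    first, last = valid[0], valid[-1]
    # Assumption: gaps at either end copy the nearest valid value.
    for i in range(first):
        filled[i] = filled[first]
    for i in range(last + 1, len(filled)):
        filled[i] = filled[last]
    # Interior gaps: interpolate between the surrounding valid points,
    # weighting by position in the sequence rather than by timestamp.
    for left, right in zip(valid, valid[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


nan = float("nan")
# Values of root.test.d2.s1 from the linear usage example, in time order.
s1 = [nan, 101.0, 102.0, 104.0, 126.0, 108.0, nan, 113.0,
      114.0, 116.0, nan, nan, 124.0, 126.0, 128.0]
print(linear_fill(s1))
```

Interpolating by position rather than by timestamp is what makes the 00:00:14 value land exactly halfway between 108.0 and 113.0, even though the neighbouring timestamps are not evenly spaced.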
-
-#### Usage examples
-##### Filling with the linear method
-
-When `method` is omitted or set to 'linear', the function fills missing values by linear interpolation.
-
-Input series:
-
-```
-+-----------------------------+---------------+
-|                         Time|root.test.d2.s1|
-+-----------------------------+---------------+
-|2020-01-01T00:00:02.000+08:00|            NaN|
-|2020-01-01T00:00:03.000+08:00|          101.0|
-|2020-01-01T00:00:04.000+08:00|          102.0|
-|2020-01-01T00:00:06.000+08:00|          104.0|
-|2020-01-01T00:00:08.000+08:00|          126.0|
-|2020-01-01T00:00:10.000+08:00|          108.0|
-|2020-01-01T00:00:14.000+08:00|            NaN|
-|2020-01-01T00:00:15.000+08:00|          113.0|
-|2020-01-01T00:00:16.000+08:00|          114.0|
-|2020-01-01T00:00:18.000+08:00|          116.0|
-|2020-01-01T00:00:20.000+08:00|            NaN|
-|2020-01-01T00:00:22.000+08:00|            NaN|
-|2020-01-01T00:00:26.000+08:00|          124.0|
-|2020-01-01T00:00:28.000+08:00|          126.0|
-|2020-01-01T00:00:30.000+08:00|          128.0|
-+-----------------------------+---------------+
-```
-
-SQL for query:
-
-```sql
-select valuefill(s1) from root.test.d2
-```
-
-Output series:
-
-
-
-```
-+-----------------------------+--------------------------+
-|                         Time|valuefill(root.test.d2.s1)|
-+-----------------------------+--------------------------+
-|2020-01-01T00:00:02.000+08:00|                     101.0|
-|2020-01-01T00:00:03.000+08:00|                     101.0|
-|2020-01-01T00:00:04.000+08:00|                     102.0|
-|2020-01-01T00:00:06.000+08:00|                     104.0|
-|2020-01-01T00:00:08.000+08:00|                     126.0|
-|2020-01-01T00:00:10.000+08:00|                     108.0|
-|2020-01-01T00:00:14.000+08:00|                     110.5|
-|2020-01-01T00:00:15.000+08:00|                     113.0|
-|2020-01-01T00:00:16.000+08:00|                     114.0|
-|2020-01-01T00:00:18.000+08:00|                     116.0|
-|2020-01-01T00:00:20.000+08:00|        118.66666666666667|
-|2020-01-01T00:00:22.000+08:00|        121.33333333333333|
-|2020-01-01T00:00:26.000+08:00|                     124.0|
-|2020-01-01T00:00:28.000+08:00|                     126.0|
-|2020-01-01T00:00:30.000+08:00|                     128.0|
-+-----------------------------+--------------------------+
-```
-
-##### Filling with the previous method
-
-When `method` is set to 'previous', the function fills each missing value with the previous valid value.
-
-The input series is the same as above. The SQL for the query is:
-
-```sql
-select valuefill(s1,"method"="previous") from root.test.d2
-```
-
-Output series:
-
-``` -+-----------------------------+-----------------------------------------------+ -| Time|valuefill(root.test.d2.s1, "method"="previous")| -+-----------------------------+-----------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| NaN| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 108.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 116.0| -|2020-01-01T00:00:22.000+08:00| 116.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-----------------------------------------------+ -``` - -### 7.3 ValueRepair - -#### 注册语句 - -```sql -create function valuerepair as 'org.apache.iotdb.library.drepair.UDTFValueRepair' -``` - -#### 函数简介 - -本函数用于对时间序列的数值进行修复。目前,本函数支持两种修复方法:**Screen** 是一种基于速度阈值的方法,在最小改动的前提下使得所有的速度符合阈值要求;**LsGreedy** 是一种基于速度变化似然的方法,将速度变化建模为高斯分布,并采用贪心算法极大化似然函数。 - -**函数名:** VALUEREPAIR - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -+ `method`:修复时采用的方法,取值为 'Screen' 或 'LsGreedy'. 
在缺省情况下,使用 Screen 方法进行修复。 -+ `minSpeed`:该参数仅在使用 Screen 方法时有效。当速度小于该值时会被视作数值异常点加以修复。在缺省情况下为中位数减去三倍绝对中位差。 -+ `maxSpeed`:该参数仅在使用 Screen 方法时有效。当速度大于该值时会被视作数值异常点加以修复。在缺省情况下为中位数加上三倍绝对中位差。 -+ `center`:该参数仅在使用 LsGreedy 方法时有效。对速度变化分布建立的高斯模型的中心。在缺省情况下为 0。 -+ `sigma` :该参数仅在使用 LsGreedy 方法时有效。对速度变化分布建立的高斯模型的标准差。在缺省情况下为绝对中位差。 - -**输出序列:** 输出单个序列,类型与输入序列相同。该序列是修复后的输入序列。 - -**提示:** 输入序列中的`NaN`在修复之前会先进行线性插值填补。 - -#### 使用示例 - -##### 使用 Screen 方法进行修复 - -当`method`缺省或取值为 'Screen' 时,本函数将使用 Screen 方法进行数值修复。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d2.s1| -+-----------------------------+---------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 126.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 100.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| NaN| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select valuerepair(s1) from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+----------------------------+ -| Time|valuerepair(root.test.d2.s1)| -+-----------------------------+----------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| 
-|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+----------------------------+ -``` - -##### 使用 LsGreedy 方法进行修复 - -当`method`取值为 'LsGreedy' 时,本函数将使用 LsGreedy 方法进行数值修复。 - -输入序列同上,用于查询的 SQL 语句如下: - -```sql -select valuerepair(s1,'method'='LsGreedy') from root.test.d2 -``` - -输出序列: - -``` -+-----------------------------+-------------------------------------------------+ -| Time|valuerepair(root.test.d2.s1, "method"="LsGreedy")| -+-----------------------------+-------------------------------------------------+ -|2020-01-01T00:00:02.000+08:00| 100.0| -|2020-01-01T00:00:03.000+08:00| 101.0| -|2020-01-01T00:00:04.000+08:00| 102.0| -|2020-01-01T00:00:06.000+08:00| 104.0| -|2020-01-01T00:00:08.000+08:00| 106.0| -|2020-01-01T00:00:10.000+08:00| 108.0| -|2020-01-01T00:00:14.000+08:00| 112.0| -|2020-01-01T00:00:15.000+08:00| 113.0| -|2020-01-01T00:00:16.000+08:00| 114.0| -|2020-01-01T00:00:18.000+08:00| 116.0| -|2020-01-01T00:00:20.000+08:00| 118.0| -|2020-01-01T00:00:22.000+08:00| 120.0| -|2020-01-01T00:00:26.000+08:00| 124.0| -|2020-01-01T00:00:28.000+08:00| 126.0| -|2020-01-01T00:00:30.000+08:00| 128.0| -+-----------------------------+-------------------------------------------------+ -``` - -## 8. 
序列发现 - -### 8.1 ConsecutiveSequences - -#### 注册语句 - -```sql -create function consecutivesequences as 'org.apache.iotdb.library.series.UDTFConsecutiveSequences' -``` - -#### 函数简介 - -本函数用于在多维严格等间隔数据中发现局部最长连续子序列。 - -严格等间隔数据是指数据的时间间隔是严格相等的,允许存在数据缺失(包括行缺失和值缺失),但不允许存在数据冗余和时间戳偏移。 - -连续子序列是指严格按照标准时间间隔等距排布,不存在任何数据缺失的子序列。如果某个连续子序列不是任何连续子序列的真子序列,那么它是局部最长的。 - - -**函数名:** CONSECUTIVESEQUENCES - -**输入序列:** 支持多个输入序列,类型可以是任意的,但要满足严格等间隔的要求。 - -**参数:** - -+ `gap`:标准时间间隔,是一个有单位的正数。目前支持五种单位,分别是'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。在缺省情况下,函数会利用众数估计标准时间间隔。 - -**输出序列:** 输出单个序列,类型为 INT32。输出序列中的每一个数据点对应一个局部最长连续子序列,时间戳为子序列的起始时刻,值为子序列包含的数据点个数。 - -**提示:** 对于不符合要求的输入,本函数不对输出做任何保证。 - -#### 使用示例 - -##### 手动指定标准时间间隔 - -本函数可以通过`gap`参数手动指定标准时间间隔。需要注意的是,错误的参数设置会导致输出产生严重错误。 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| -|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select consecutivesequences(s1,s2,'gap'='5m') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2, "gap"="5m")| -+-----------------------------+------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| 
-+-----------------------------+------------------------------------------------------------------+ -``` - -##### 自动估计标准时间间隔 - -当`gap`参数缺省时,本函数可以利用众数估计标准时间间隔,得到同样的结果。因此,这种用法更受推荐。 - -输入序列同上,用于查询的SQL语句如下: - -```sql -select consecutivesequences(s1,s2) from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+------------------------------------------------------+ -| Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2)| -+-----------------------------+------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -|2020-01-01T00:45:00.000+08:00| 2| -+-----------------------------+------------------------------------------------------+ -``` - -### 8.2 ConsecutiveWindows - -#### 注册语句 - -```sql -create function consecutivewindows as 'org.apache.iotdb.library.series.UDTFConsecutiveWindows' -``` - -#### 函数简介 - -本函数用于在多维严格等间隔数据中发现指定长度的连续窗口。 - -严格等间隔数据是指数据的时间间隔是严格相等的,允许存在数据缺失(包括行缺失和值缺失),但不允许存在数据冗余和时间戳偏移。 - -连续窗口是指严格按照标准时间间隔等距排布,不存在任何数据缺失的子序列。 - - -**函数名:** CONSECUTIVEWINDOWS - -**输入序列:** 支持多个输入序列,类型可以是任意的,但要满足严格等间隔的要求。 - -**参数:** - -+ `gap`:标准时间间隔,是一个有单位的正数。目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。在缺省情况下,函数会利用众数估计标准时间间隔。 -+ `length`:序列长度,是一个有单位的正数。目前支持五种单位,分别是 'ms'(毫秒)、's'(秒)、'm'(分钟)、'h'(小时)和'd'(天)。该参数不允许缺省。 - -**输出序列:** 输出单个序列,类型为 INT32。输出序列中的每一个数据点对应一个指定长度连续子序列,时间戳为子序列的起始时刻,值为子序列包含的数据点个数。 - -**提示:** 对于不符合要求的输入,本函数不对输出做任何保证。 - -#### 使用示例 - -输入序列: - -``` -+-----------------------------+---------------+---------------+ -| Time|root.test.d1.s1|root.test.d1.s2| -+-----------------------------+---------------+---------------+ -|2020-01-01T00:00:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:05:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:10:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:20:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:25:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:30:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:35:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:40:00.000+08:00| 1.0| null| 
-|2020-01-01T00:45:00.000+08:00| 1.0| 1.0| -|2020-01-01T00:50:00.000+08:00| 1.0| 1.0| -+-----------------------------+---------------+---------------+ -``` - -用于查询的SQL语句: - -```sql -select consecutivewindows(s1,s2,'length'='10m') from root.test.d1 -``` - -输出序列: - -``` -+-----------------------------+--------------------------------------------------------------------+ -| Time|consecutivewindows(root.test.d1.s1, root.test.d1.s2, "length"="10m")| -+-----------------------------+--------------------------------------------------------------------+ -|2020-01-01T00:00:00.000+08:00| 3| -|2020-01-01T00:20:00.000+08:00| 4| -+-----------------------------+--------------------------------------------------------------------+ -``` - - - -## 9. 机器学习 - -### 9.1 AR - -#### 注册语句 - -```sql -create function ar as 'org.apache.iotdb.library.dlearn.UDTFAR' -``` -#### 函数简介 - -本函数用于学习数据的自回归模型系数。 - -**函数名:** AR - -**输入序列:** 仅支持单个输入序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。 - -**参数:** - -- `p`:自回归模型的阶数。默认为1。 - -**输出序列:** 输出单个序列,类型为 DOUBLE。第一行对应模型的一阶系数,以此类推。 - -**提示:** - -- `p`应为正整数。 - -- 序列中的大部分点为等间隔采样点。 -- 序列中的缺失点通过线性插值进行填补后用于学习过程。 - -#### 使用示例 - -##### 指定阶数 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| -4.0| -|2020-01-01T00:00:02.000+08:00| -3.0| -|2020-01-01T00:00:03.000+08:00| -2.0| -|2020-01-01T00:00:04.000+08:00| -1.0| -|2020-01-01T00:00:05.000+08:00| 0.0| -|2020-01-01T00:00:06.000+08:00| 1.0| -|2020-01-01T00:00:07.000+08:00| 2.0| -|2020-01-01T00:00:08.000+08:00| 3.0| -|2020-01-01T00:00:09.000+08:00| 4.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select ar(s0,"p"="2") from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+---------------------------+ -| Time|ar(root.test.d0.s0,"p"="2")| -+-----------------------------+---------------------------+ -|1970-01-01T08:00:00.001+08:00| 0.9429| 
-|1970-01-01T08:00:00.002+08:00| -0.2571| -+-----------------------------+---------------------------+ -``` - -### 9.2 Cluster - -#### 注册语句 - -```sql -create function cluster as 'org.apache.iotdb.library.dlearn.UDTFCluster' -``` - -#### 函数简介 - -本函数对**单条输入时间序列**,按固定长度 `l` 切分为**互不重叠**的连续子序列(窗口),再对这些子序列聚类,得到 `k` 个分组。 - -**函数名:** Cluster - -**输入序列:** 仅支持单条数值型时间序列,类型为 INT32 / INT64 / FLOAT / DOUBLE。点按时间顺序读取;末尾不足以凑满一整窗的采样会被**丢弃**(仅使用 `⌊n/l⌋` 个窗口,`n` 为有效点数)。 - -**参数:** - -| 名称 | 含义 | 默认值 | 说明 | -|------|------|--------|------| -| `l` | 子序列(窗口)长度 | (必填) | 正整数;每个窗口含连续 `l` 个采样。 | -| `k` | 聚类个数 | (必填) | 整数 ≥ 2。 | -| `method` | 聚类算法 | `kmeans` | 可选:`kmeans`、`kshape`、`medoidshape`(大小写不敏感)。省略时默认为 k-means。 | -| `norm` | 是否对每个子序列做 Z-score 标准化 | `true` | 布尔;为 `true` 时在聚类前对每个子序列标准化。 | -| `maxiter` | 最大迭代次数 | `200` | 正整数。 | -| `output` | 输出模式 | `label` | `label`:每个窗口一个簇编号;`centroid`:按簇顺序拼接 `k` 个质心向量。 | -| `sample_rate` | 贪心采样比例 | `0.3` | 仅在 **`method` = `medoidshape`** 时使用;取值须在 `(0, 1]`。 | - -**`method` 说明:** - -- **kmeans**:欧氏空间中的 k-means(可选是否先做逐窗归一化)。 -- **kshape**:基于形状距离(由归一化互相关 NCC 得到的 SBD)分配簇;质心通过簇矩阵的 **SVD** 更新。 -- **medoidshape**:先粗聚类,再贪心选出 `k` 条代表子序列;`sample_rate` 控制每轮采样的候选数量。 - -**输出序列:** 由 `output` 控制: - -- **`output` = `label`(默认):** 一条输出序列,类型为 **INT32**。行数 = 完整窗口个数 `⌊n/l⌋`。每行时间戳 = 该窗口**第一个采样**的时间;值为簇编号 **0 … k−1**。 -- **`output` = `centroid`:** 一条输出序列,类型为 **DOUBLE**。行数 = **`k × l`**:按簇 **0 → k−1** 依次输出各簇质心的 `l` 个分量(拼接)。时间戳为 `0, 1, 2, …`(仅占位,无物理时间含义)。 - -**提示:** - -- 需满足有效点数 `n ≥ l`,且窗口数 `⌊n/l⌋ ≥ k`。 - -#### 使用示例 - -##### KShape:窗口长度 3,k = 2 - -九个采样 `{1,2,3,10,20,30,1,5,1}` 构成三个长度为 3 的不重叠窗口 `{1,2,3}`、`{10,20,30}`、`{1,5,1}`。在 **`method` = `kshape`** 且默认 **`norm` = `true`** 时,每一行对应一个窗口的簇编号,时间戳为各窗口起点。得到的标签为:**0, 0, 1**。 - -输入序列: - -``` -+-----------------------------+---------------+ -| Time|root.test.d0.s0| -+-----------------------------+---------------+ -|2020-01-01T00:00:01.000+08:00| 1.0| -|2020-01-01T00:00:02.000+08:00| 2.0| -|2020-01-01T00:00:03.000+08:00| 3.0| 
-|2020-01-01T00:00:04.000+08:00| 10.0| -|2020-01-01T00:00:05.000+08:00| 20.0| -|2020-01-01T00:00:06.000+08:00| 30.0| -|2020-01-01T00:00:07.000+08:00| 1.0| -|2020-01-01T00:00:08.000+08:00| 5.0| -|2020-01-01T00:00:09.000+08:00| 1.0| -+-----------------------------+---------------+ -``` - -用于查询的 SQL 语句: - -```sql -select cluster(s0, "l"="3", "k"="2", "method"="kshape", "output"="label") -from root.test.d0 -``` - -输出序列: - -``` -+-----------------------------+----------------------------------------------------------------------------+ -| Time|cluster(root.test.d0.s0,"l"="3","k"="2","method"="kshape","output"="label")| -+-----------------------------+----------------------------------------------------------------------------+ -|2020-01-01T00:00:01.000+08:00| 0| -|2020-01-01T00:00:04.000+08:00| 0| -|2020-01-01T00:00:07.000+08:00| 1| -+-----------------------------+----------------------------------------------------------------------------+ -``` diff --git a/src/zh/UserGuide/latest/Tools-System/CLI_timecho.md b/src/zh/UserGuide/latest/Tools-System/CLI_timecho.md deleted file mode 100644 index 8904f07d0..000000000 --- a/src/zh/UserGuide/latest/Tools-System/CLI_timecho.md +++ /dev/null @@ -1,265 +0,0 @@ - - -# 命令行工具 - -IOTDB 为用户提供 cli/Shell 工具用于启动客户端和服务端程序。下面介绍每个 cli/Shell 工具的运行方式和相关参数。 -> \$IOTDB\_HOME 表示 IoTDB 的安装目录所在路径。 - -## 1. Cli 运行方式 -安装后的 IoTDB 中有一个默认用户:`root`,默认密码为`TimechoDB@2021`(V2.0.6.x 版本之前为`root`)。用户可以使用该用户尝试运行 IoTDB 客户端以测试服务器是否正常启动。客户端启动脚本为$IOTDB_HOME/sbin 文件夹下的`start-cli`脚本。启动脚本时需要指定运行 IP 和 RPC PORT。以下为服务器在本机启动,且用户未更改运行端口号的示例,默认端口为 6667。若用户尝试连接远程服务器或更改了服务器运行的端口号,请在-h 和-p 项处使用服务器的 IP 和 RPC PORT。
-用户也可以在启动脚本的最前方设置自己的环境变量,如 JAVA_HOME 等。 - -Linux 系统与 MacOS 系统启动命令如下: - -```shell -# V2.0.6.x 版本之前 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x 版本及之后 -Shell > bash sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` -Windows 系统启动命令如下: - -```shell -# V2.0.4.x 版本之前 -Shell > sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.4.x 版本及之后, V2.0.6.x 版本之前 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root - -# V2.0.6.x 版本及之后 -Shell > sbin\windows\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -``` -回车后即可成功启动客户端。启动后出现如图提示即为启动成功。 - -``` - _____ _________ ______ ______ -|_ _| | _ _ ||_ _ `.|_ _ \ - | | .--.|_/ | | \_| | | `. \ | |_) | - | | / .'`\ \ | | | | | | | __'. - _| |_| \__. | _| |_ _| |_.' /_| |__) | -|_____|'.__.' |_____| |______.'|_______/ version - -Successfully login at 127.0.0.1:6667 -``` -输入`quit`或`exit`可退出 cli 结束本次会话,cli 输出`quit normally`表示退出成功。 - -## 2. Cli 运行参数 - -| **参数名** | **参数类型** | **是否为必需参数** | **说明** | **示例** | -|:-----------------------------|:-----------|:------------|:-----------------------------------------------------------|:---------------------| -| -h `` | string 类型 | 否 | IoTDB 客户端连接 IoTDB 服务器的 IP 地址, 默认使用:127.0.0.1。 | -h 127.0.0.1 | -| -p `` | int 类型 | 否 | IoTDB 客户端连接服务器的端口号,IoTDB 默认使用 6667。 | -p 6667 | -| -u `` | string 类型 | 否 | IoTDB 客户端连接服务器所使用的用户名,默认使用 root。 | -u root | -| -pw `` | string 类型 | 否 | IoTDB 客户端连接服务器所使用的密码,默认使用 TimechoDB@2021(V2.0.6版本之前为root)。 | -pw root | -| -sql_dialect `` | string 类型 | 否 | 目前可选 tree(树模型) 、table(表模型),默认 tree | -sql_dialect table | -| -e `` | string 类型 | 否 | 在不进入客户端输入模式的情况下,批量操作 IoTDB。 | -e "show databases" | -| -c | 空 | 否 | 如果服务器设置了 rpc_thrift_compression_enable=true, 则 CLI 必须使用 -c | -c | -| -disableISO8601 | 空 | 否 | 如果设置了这个参数,IoTDB 将以数字的形式打印时间戳 (timestamp)。 | -disableISO8601 | -| -usessl `` | Boolean 类型 | 否 | 否开启 ssl 连接 | -usessl true | -| -ts `` | string 类型 | 否 | ssl 证书存储路径 | -ts /path/to/truststore 
| -| -tpw `` | string 类型 | 否 | ssl 证书存储密码 | -tpw myTrustPassword | -| -timeout `` | int 类型 | 否 | 查询超时时间(秒)。如果未设置,则使用服务器的配置。 | -timeout 30 | -| -help | 空 | 否 | 打印 IoTDB 的帮助信息。 | -help | - -下面展示一条客户端命令,功能是连接 IP 为 10.129.187.21 的主机,端口为 6667 ,用户名为 root,密码为 root,以数字的形式打印时间戳,IoTDB 命令行显示的最大行数为 10。 - -Linux 系统与 MacOS 系统启动命令如下: - -```shell -Shell > bash sbin/start-cli.sh -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 -``` -Windows 系统启动命令如下: - -```shell -# V2.0.4.x 版本之前 -Shell > sbin\start-cli.bat -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 - -# V2.0.4.x 版本及之后 -Shell > sbin\windows\start-cli.bat -h 10.129.187.21 -p 6667 -u root -pw root -disableISO8601 -maxPRC 10 -``` - -## 3. CLI 特殊命令 -下面列举了一些CLI的特殊命令。 - -| 命令 | 描述 / 例子 | -|:---|:---| -| `set time_display_type=xxx` | 例如: long, default, ISO8601, yyyy-MM-dd HH:mm:ss | -| `show time_display_type` | 显示时间显示方式 | -| `set time_zone=xxx` | 例如: +08:00, Asia/Shanghai | -| `show time_zone` | 显示CLI的时区 | -| `set fetch_size=xxx` | 设置从服务器查询数据时的读取条数 | -| `show fetch_size` | 显示读取条数的大小 | -| `set max_display_num=xxx` | 设置 CLI 一次展示的最大数据条数, 设置为-1表示无限制 | -| `help` | 获取CLI特殊命令的提示 | -| `exit/quit` | 退出CLI | - - -## 4. Cli 的批量操作 -当您想要通过脚本的方式通过 Cli / Shell 对 IoTDB 进行批量操作时,可以使用-e 参数。通过使用该参数,您可以在不进入客户端输入模式的情况下操作 IoTDB。 - -为了避免 SQL 语句和其他参数混淆,现在只支持-e 参数作为最后的参数使用。 - -针对 cli/Shell 工具的-e 参数用法如下: - -Linux 系统与 MacOS 指令: - -```shell -Shell > bash sbin/start-cli.sh -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} -``` - -Windows 系统指令 -```shell -# V2.0.4.x 版本之前 -Shell > sbin\start-cli.bat -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} - -# V2.0.4.x 版本及之后 -Shell > sbin\windows\start-cli.bat -h {host} -p {rpcPort} -u {user} -pw {password} -e {sql for iotdb} -``` - -在 Windows 环境下,-e 参数的 SQL 语句需要使用` `` `对于`" "`进行替换 - -为了更好的解释-e 参数的使用,可以参考下面在 Linux 上执行的例子。 - -假设用户希望对一个新启动的 IoTDB 进行如下操作: - -1. 创建名为 root.demo 的 database - -2. 创建名为 root.demo.s1 的时间序列 - -3. 向创建的时间序列中插入三个数据点 - -4. 
查询验证数据是否插入成功 - -那么通过使用 cli/Shell 工具的 -e 参数,可以采用如下的脚本: - -```shell -# !/bin/bash - -host=127.0.0.1 -rpcPort=6667 -user=root -pass=root - -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "CREATE DATABASE root.demo" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "create timeseries root.demo.s1 WITH DATATYPE=INT32, ENCODING=RLE" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(1,10)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(2,11)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "insert into root.demo(timestamp,s1) values(3,12)" -bash ./sbin/start-cli.sh -h ${host} -p ${rpcPort} -u ${user} -pw ${pass} -e "select s1 from root.demo" -``` - -打印出来的结果显示如下,通过这种方式进行的操作与客户端的输入模式以及通过 JDBC 进行操作结果是一致的。 - -```shell - Shell > bash ./shell.sh -+-----------------------------+------------+ -| Time|root.demo.s1| -+-----------------------------+------------+ -|1970-01-01T08:00:00.001+08:00| 10| -|1970-01-01T08:00:00.002+08:00| 11| -|1970-01-01T08:00:00.003+08:00| 12| -+-----------------------------+------------+ -Total line number = 3 -It costs 0.267s -``` - -需要特别注意的是,在脚本中使用 -e 参数时要对特殊字符进行转义。 - -## 5. 访问历史功能 - -IoTDB **V2.0.9.1** 起支持开启访问历史功能,即客户端登录成功后展示关键的历史访问信息,支持分布式场景。管理员与普通用户仅可查看自身访问历史,核心展示内容包括: - -* 上一次成功会话:显示日期、时间、访问应用、IP地址及访问方法(首次登录或无历史记录时不显示)。 -* 最近一次失败尝试:显示距离本次成功登录时间最近的一次失败记录的日期、时间、访问应用、IP地址及访问方法。 -* 累计失败次数:统计自上一次成功会话建立以来,所有未成功建立的会话尝试总次数。 - -### 5.1 开启访问历史 - -支持通过修改 `iotdb-system.properties` 文件中的相关参数来控制是否开启访问历史功能,修改参数后需重启生效,例如: - -```Plain -# 用于控制是否启用审计日志功能 -enable_audit_log=false -``` - -* 开启时,记录登录信息并定期清理过期数据; -* 关闭时,不记录、不展示、不清理; -* 开关关闭后重开,展示的历史为关闭前最后一条记录,不一定代表真实最近登录记录。 - -使用示例: - -```Bash ---------------------- -Starting IoTDB Cli ---------------------- - _____ _________ ______ ______ -|_ _| | _ _ ||_ _ `.|_ _ \ - | | .--.|_/ | | \_| | | `. 
\ | |_) | - | | / .'`\ \ | | | | | | | __'. - _| |_| \__. | _| |_ _| |_.' /_| |__) | -|_____|'.__.' |_____| |______.'|_______/ Enterprise version 2.0.9.1 (Build: xxxxxxx) - - ----Last Successful Session------------------ -Time: 2026-03-24T10:25:47.759+08:00 -IP Address: 127.0.0.1 ----Last Failed Session---------------------- -Time: 2026-03-24T10:27:26.314+08:00 -IP Address: 127.0.0.1 -Cumulative Failed Attempts: 1 -Successfully login at 127.0.0.1:6667 -IoTDB> -``` - -### 5.2 查看访问历史 - -root 用户及具有 AUDIT 权限的用户可以通过 SQL 语句查看访问历史记录。 - -语法定义: - -```SQL -select * from root.__audit.login.u_{userid}.** -``` - -其中 userid 可通过 `list user` 语句查看。 - -示例: - -```SQL -IoTDB> select * from root.__audit.login.** -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -| Time|root.__audit.login.u_0.node_1.result|root.__audit.login.u_0.node_1.ip|root.__audit.login.u_0.node_1.username| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -|2026-03-25T10:55:58.240+08:00| true| 127.0.0.1| root| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -Total line number = 1 -It costs 0.039s -IoTDB> select * from root.__audit.login.u_0.** -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -| Time|root.__audit.login.u_0.node_1.result|root.__audit.login.u_0.node_1.ip|root.__audit.login.u_0.node_1.username| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -|2026-03-25T10:55:58.240+08:00| true| 127.0.0.1| root| -+-----------------------------+------------------------------------+--------------------------------+--------------------------------------+ -Total 
line number = 1 -It costs 0.020s -``` diff --git a/src/zh/UserGuide/latest/Tools-System/Data-Export-Tool_timecho.md b/src/zh/UserGuide/latest/Tools-System/Data-Export-Tool_timecho.md deleted file mode 100644 index 2bdbee362..000000000 --- a/src/zh/UserGuide/latest/Tools-System/Data-Export-Tool_timecho.md +++ /dev/null @@ -1,176 +0,0 @@ -# 数据导出 - -## 1. 功能概述 - -数据导出工具 export-data.sh/bat 位于 tools 目录下,能够将指定 SQL 的查询结果导出为 CSV、SQL 及 TsFile(开源时间序列文件格式)格式。具体功能如下: - - - - - - - - - - - - - - - - - - - - - -
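| File format | IoTDB tool | Description |
| ----------- | ---------- | ----------- |
| CSV | export-data.sh/bat | Plain-text format that stores formatted data; the file must follow the CSV format specified below |
| SQL | export-data.sh/bat | A file containing custom SQL statements |
| TsFile | export-data.sh/bat | Open-source time series data file format |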
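
As a rough illustration of how the three formats map onto the tool, the sketch below assembles an `export-data.sh` command line. The `build_export_cmd` function is a hypothetical helper written for this example and is not part of IoTDB. It uses only the `-ft`, `-t`, and `-q` options documented in this guide, and it prints the command instead of running it.

```shell
#!/bin/bash
# Hypothetical helper (not part of IoTDB): prints an export-data.sh command
# line for one of the three supported formats (csv, sql, tsfile).
build_export_cmd() {
  local file_type="$1" target_dir="$2" query="$3"
  case "$file_type" in
    csv|sql|tsfile) ;;
    *) echo "unsupported file type: $file_type" >&2; return 1 ;;
  esac
  # -ft and -t are the required options; -q carries the query to export.
  printf 'tools/export-data.sh -ft %s -t %s -q "%s"\n' "$file_type" "$target_dir" "$query"
}

# Print the command for a CSV export of root.ln.
build_export_cmd csv /path/export/dir "SELECT * FROM root.ln"
```

Printing the command first makes it easy to review the options before running the real tool against a server.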
-
-
-## 2. Feature Details
-
-### 2.1 Common Parameters
-
-| Short option | Full option | Description | Required | Default |
-| -------- | -------- | -------- | -------- | -------- |
-| -ft | --file\_type | Type of the exported file: csv, sql, or tsfile | √ | |
-| -h | --host | Host name | No | 127.0.0.1 |
-| -p | --port | Port number | No | 6667 |
-| -u | --username | User name | No | root |
-| -pw | --password | Password; hidden input is supported since V2.0.9.1 | No | TimechoDB@2021 (root before V2.0.6.x) |
-| -t | --target | Target directory for the output files; it is created if the path does not exist | √ | |
-| -pfn | --prefix\_file\_name | Name prefix of the exported files. For example, with abc the generated files are abc\_0.tsfile, abc\_1.tsfile | No | dump\_0.tsfile |
-| -q | --query | Query statement to execute. Since V2.0.8, semicolons in the SQL statement are removed automatically and the query still runs normally. | No | None |
-| -timeout | --query\_timeout | Session query timeout (ms) | No | `-1` (before V2.0.8)<br>`Long.MAX_VALUE` (V2.0.8 and later)<br>Range: `-1~Long.MAX_VALUE` |
-| -help | --help | Show help information | No | |
-| -usessl | --use_ssl | Use the SSL protocol; supported since V2.0.9.1 | No | - |
-| -ts | --trust_store | Trust store; hidden input is supported; supported since V2.0.9.1 | No | - |
-| -tpw | --trust_store_password | Trust store password; hidden input is supported; supported since V2.0.9.1 | No | - |
-
-### 2.2 CSV Format
-
-#### 2.2.1 Command
-
-```Shell
-# Unix/OS X
-> tools/export-data.sh -ft [-h ] [-p ] [-u ] [-pw ] -t
-        [-pfn ] [-dt ] [-lpf ] [-tf ]
-        [-tz ] [-q ] [-timeout ]
-# Windows
-# Before V2.0.4.x
-> tools\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] -t
-        [-pfn ] [-dt ] [-lpf ] [-tf ]
-        [-tz ] [-q ] [-timeout ]
-
-# V2.0.4.x and later
-> tools\windows\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] -t
-        [-pfn ] [-dt ] [-lpf ] [-tf ]
-        [-tz ] [-q ] [-timeout ]
-```
-
-#### 2.2.2 Format-Specific Parameters
-
-| Short option | Full option | Description | Required | Default |
-| -------- | -------- | -------- | -------- | -------- |
-| -dt | --datatype | Whether to write the data type of each time series in the CSV header; `true` or `false` | No | false |
-| -lpf | --lines\_per\_file | Number of rows per dump file | No | 10000<br>Range: 0~Integer.Max=2147483647 |
-| -tf | --time\_format | Time format used in the CSV file. Options: 1) timestamp (numeric, long); 2) ISO8601 (default); 3) a user-defined pattern such as `yyyy-MM-dd HH:mm:ss`. Timestamps written to SQL files are not affected by this setting | No | ISO8601 |
-| -tz | --timezone | Time zone, for example `+08:00` or `-01:00` | No | Local system time zone |
-
-#### 2.2.3 Examples
-
-```Shell
-# Valid example
-> tools/export-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -t /path/export/dir
-        -pfn exported-data.csv -dt true -lpf 1000 -tf "yyyy-MM-dd HH:mm:ss"
-        -tz +08:00 -q "SELECT * FROM root.ln" -timeout 20000
-
-# Error example
-> tools/export-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021
-Parse error: Missing required option: t
-
-# Note: before V2.0.6.x, the default value of -pw is root
-```
-
-### 2.3 SQL Format
-
-#### 2.3.1 Command
-
-```Shell
-# Unix/OS X
-> tools/export-data.sh -ft [-h ] [-p ] [-u ] [-pw ]
-        -t [-pfn ] [-aligned ]
-        -lpf - [-tf ] [-tz ] [-q ] [-timeout ]
-
-# Windows
-# Before V2.0.4.x
-> tools\export-data.bat -ft [-h -p -u -pw ]
-        -t [-pfn -aligned
-        -lpf -tf -tz -q -timeout ]
-
-# V2.0.4.x and later
-> tools\windows\export-data.bat -ft [-h -p -u -pw ]
-        -t [-pfn -aligned
-        -lpf -tf -tz -q -timeout ]
-```
-
-#### 2.3.2 Format-Specific Parameters
-
-| Short option | Full option | Description | Required | Default |
-| -------- | -------- | -------- | -------- | -------- |
-| -aligned | --use\_aligned | Whether to export in the aligned SQL format | No | true |
-| -lpf | --lines\_per\_file | Number of rows per dump file | No | 10000
范围:0~Integer.Max=2147483647 | -| -tf | --time\_format | 指定CSV文件中的时间格式。可以选择:1) 时间戳(数字、长整型);2) ISO8601(默认);3) 用户自定义模式,如`yyyy-MM-dd HH:mm:ss`(默认为ISO8601)。SQL文件中的时间戳输出不受时间格式设置影响 | 否| ISO8601| -| -tz | --timezone | 设置时区,例如`+08:00`或`-01:00` | 否 | 本机系统时间 | - -#### 2.3.3 运行示例: - -```Shell -# 正确示例 -> tools/export-data.sh -ft sql -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -t /path/export/dir - -pfn exported-data.csv -aligned true -lpf 1000 -tf "yyyy-MM-dd HH:mm:ss" - -tz +08:00 -q "SELECT * FROM root.ln" -timeout 20000 - -# 异常示例 -> tools/export-data.sh -ft sql -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -Parse error: Missing required option: t - -# 注意:V2.0.6.x 版本之前 -pw 参数值默认值为 root -``` - -### 2.4 TsFile 格式 - -#### 2.4.1 运行命令 - -```Shell -# Unix/OS X -> tools/export-data.sh -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# Windows -# V2.0.4.x 版本之前 -> tools\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] - -# V2.0.4.x 版本及之后 -> tools\windows\export-data.bat -ft [-h ] [-p ] [-u ] [-pw ] - -t [-pfn ] [-q ] [-timeout ] -``` - -#### 2.4.2 私有参数 - -* 无 - -#### 2.4.3 运行示例: - -```Shell -# 正确示例 -> tools/export-data.sh -ft tsfile -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -t /path/export/dir - -pfn export-data.tsfile -q "SELECT * FROM root.ln" -timeout 10000 - -# 异常示例 -> tools/export-data.sh -ft tsfile -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -Parse error: Missing required option: t - -# 注意:V2.0.6.x 版本之前 -pw 参数值默认值为 root -``` diff --git a/src/zh/UserGuide/latest/Tools-System/Data-Import-Tool_timecho.md b/src/zh/UserGuide/latest/Tools-System/Data-Import-Tool_timecho.md deleted file mode 100644 index 236dac53b..000000000 --- a/src/zh/UserGuide/latest/Tools-System/Data-Import-Tool_timecho.md +++ /dev/null @@ -1,332 +0,0 @@ -# 数据导入 - -## 1. 
功能概述 - -IoTDB 支持三种方式进行数据导入: -- 数据导入工具 :`import-data.sh/bat` 位于 `tools` 目录下,可以将 `CSV`、`SQL`、及`TsFile`(开源时序文件格式)的数据导入 `IoTDB`。 -- `TsFile` 自动加载功能。 -- `Load SQL` 导入 `TsFile` 。 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
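| File format | IoTDB tool | Description |
| ----------- | ---------- | ----------- |
| CSV | import-data.sh/bat | Batch-imports a single CSV file, or a directory of CSV files, into IoTDB |
| SQL | import-data.sh/bat | Batch-imports a single SQL file, or a directory of SQL files, into IoTDB |
| TsFile | import-data.sh/bat | Batch-imports a single TsFile, or a directory of TsFiles, into IoTDB |
| TsFile | TsFile auto-loading | Watches the specified path for newly created TsFiles and loads them into IoTDB |
| TsFile | Load SQL | Batch-imports a single TsFile, or a directory of TsFiles, into IoTDB |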
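
Because `import-data.sh` selects the format through the `-ft` option, a wrapper script can derive that value from the file extension. The sketch below assumes a hypothetical helper, `file_type_for`, which is not part of IoTDB; it covers only the csv, sql, and tsfile extensions named in this guide and prints the resulting command instead of running it.

```shell
#!/bin/bash
# Hypothetical helper (not part of IoTDB): maps a file name to the -ft value
# expected by import-data.sh, based on the csv / sql / tsfile extensions.
file_type_for() {
  case "$1" in
    *.csv) echo csv ;;
    *.sql) echo sql ;;
    *.tsfile) echo tsfile ;;
    *) echo "unsupported file: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) an import command for a CSV file.
# TsFile imports additionally require -os and -of, which this sketch omits.
src=/path/data/sensors.csv
if ft="$(file_type_for "$src")"; then
  echo "tools/import-data.sh -ft $ft -s $src"
fi
```

Rejecting unknown extensions in the wrapper mirrors the tool's own behavior of refusing files that do not end in one of the three supported extensions.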
-
-## 2. Data Import Tool
-
-### 2.1 Common Parameters
-
-| Short option | Full option | Description | Required | Default |
-| -------- | -------- | -------- | -------- | -------- |
-| -ft | --file\_type | Type of the imported file: csv, sql, or tsfile | √ | |
-| -h | --host | Host name | No | 127.0.0.1 |
-| -p | --port | Port number | No | 6667 |
-| -u | --username | User name | No | root |
-| -pw | --password | Password; hidden input is supported since V2.0.9.1 | No | TimechoDB@2021 (root before V2.0.6.x) |
-| -s | --source | Local path of the file or directory to load<br>Files in one of the three supported formats (csv, sql, tsfile) are imported directly<br>Unsupported formats are rejected with the error `The file name must end with "csv" or "sql"or "tsfile"!` | √ | |
-| -tn | --thread\_num | Maximum number of parallel threads | No | 8<br>Range: 0~Integer.Max=2147483647 |
-| -tz | --timezone | Time zone, for example `+08:00` or `-01:00` | No | Local system time zone |
-| -help | --help | Show help information, either for all formats (`-help`) or for a single format (`-help csv`) | No | |
-| -usessl | --use_ssl | Use the SSL protocol; supported since V2.0.9.1 | No | - |
-| -ts | --trust_store | Trust store; hidden input is supported; supported since V2.0.9.1 | No | - |
-| -tpw | --trust_store_password | Trust store password; hidden input is supported; supported since V2.0.9.1 | No | - |
-
-
-### 2.2 CSV Format
-
-#### 2.2.1 Command
-
-```Shell
-# Unix/OS X
-> tools/import-data.sh -ft [-h ] [-p ] [-u ] [-pw ]
-        -s [-fd ] [-lpf ] [-aligned ]
-        [-ti ] [-tp ] [-tz ] [-batch ]
-        [-tn ]
-
-# Windows
-# Before V2.0.4.x
-> tools\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ]
-        -s [-fd ] [-lpf ] [-aligned ]
-        [-ti ] [-tp ] [-tz ] [-batch ]
-        [-tn ]
-
-# V2.0.4.x and later
-> tools\windows\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ]
-        -s [-fd ] [-lpf ] [-aligned ]
-        [-ti ] [-tp ] [-tz ] [-batch ]
-        [-tn ]
-```
-
-#### 2.2.2 Format-Specific Parameters
-
-| Short option | Full option | Description | Required | Default |
-| -------- | -------- | -------- | -------- | -------- |
-| -fd | --fail\_dir | Directory in which failed files are saved | No | YOUR\_CSV\_FILE\_PATH |
-| -lpf | --lines\_per\_failed\_file | Maximum number of rows written to each failed file | No | 100000<br>Range: 0~Integer.Max=2147483647 |
-| -aligned | --use\_aligned | Whether to import as aligned series | No | false |
-| -batch | --batch\_size | Number of rows processed per interface call (minimum 1, maximum `Integer.MAX_VALUE`) | No | 100000<br>Range: 0~Integer.Max=2147483647 |
-| -ti | --type\_infer | Type inference rules defined through this option, for example `"boolean=text,int=long, ..."` | No | None |
-| -tp | --timestamp\_precision | Timestamp precision. Options:<br>1. ms (milliseconds)<br>2. us (microseconds)<br>3. ns (nanoseconds) | No | ms |
-
-#### 2.2.3 Examples
-
-```Shell
-# Valid example
-> tools/import-data.sh -ft csv -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -s /path/sql
-        -fd /path/failure/dir -lpf 100 -aligned true -ti "BOOLEAN=text,INT=long,FLOAT=double"
-        -tp ms -tz +08:00 -batch 5000 -tn 4
-
-# Error examples
-> tools/import-data.sh -ft csv -s /non_path
-error: Source file or directory /non_path does not exist
-
-> tools/import-data.sh -ft csv -s /path/sql -tn 0
-error: Invalid thread number '0'. Please set a positive integer.
-
-# Note: before V2.0.6.x, the default value of -pw is root
-```
-
-#### 2.2.4 Import Notes
-
-1. CSV import rules
-
-- Escaping special characters: if a TEXT field contains a special character (such as a comma `,`), escape it with a backslash (`\`).
-- Supported time formats: yyyy-MM-dd'T'HH:mm:ss, yyyy-MM-dd HH:mm:ss, or yyyy-MM-dd'T'HH:mm:ss.SSSZ.
-- The timestamp column must be the first column of the data file.
-
-2. CSV file examples
-
-- Time-aligned
-
-```sql
--- Header without data types
-  Time,root.test.t1.str,root.test.t2.str,root.test.t2.var
-  1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100
-  1970-01-01T08:00:00.002+08:00,"123",,
-
--- Header with data types (TEXT values may be quoted or unquoted)
-Time,root.test.t1.str(TEXT),root.test.t2.str(TEXT),root.test.t2.var(INT32)
-1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100
-1970-01-01T08:00:00.002+08:00,123,hello world,123
-1970-01-01T08:00:00.003+08:00,"123",,
-1970-01-01T08:00:00.004+08:00,123,,12
-```
-
-- Device-aligned
-
-```sql
--- Header without data types
-  Time,Device,str,var
-  1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world",
-  1970-01-01T08:00:00.002+08:00,root.test.t1,"123",
-  1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100
-
--- Header with data types (TEXT values may be quoted or unquoted)
-Time,Device,str(TEXT),var(INT32)
-1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world",
-1970-01-01T08:00:00.002+08:00,root.test.t1,"123",
-1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100
-1970-01-01T08:00:00.002+08:00,root.test.t1,hello world,123
-```
-
-### 2.3 SQL Format
-
-#### 2.3.1 Command
-
-```Shell
-# Unix/OS X
-> tools/import-data.sh -ft [-h ] [-p ] [-u ] [-pw ]
-        -s [-fd ] [-lpf ] [-tz ]
-        [-batch ] [-tn ]
-
-# Windows
-# Before V2.0.4.x
-> tools\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ]
-        -s [-fd ] [-lpf ] [-tz ]
-        [-batch ] [-tn ]
-
-# V2.0.4.x and later
-> tools\windows\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ]
-        -s [-fd ] [-lpf ] [-tz ]
-        [-batch ] [-tn ]
-```
-
-#### 2.3.2 Format-Specific Parameters
-
-| Short option | Full option | Description | Required | Default |
-| -------- | -------- | -------- | -------- | -------- |
-| -fd | --fail\_dir | Directory in which failed files are saved | No | YOUR\_CSV\_FILE\_PATH |
-| -lpf | --lines\_per\_failed\_file | Maximum number of rows written to each failed file | No | 100000<br>Range: 0~Integer.Max=2147483647 |
-| -batch | --batch\_size | Number of rows processed per interface call (minimum 1, maximum `Integer.MAX_VALUE`) | No | 100000<br>Range: 0~Integer.Max=2147483647 |
-
-#### 2.3.3 Examples
-
-```Shell
-# Valid example
-> tools/import-data.sh -ft sql -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 -s /path/sql
-        -fd /path/failure/dir -lpf 500 -tz +08:00
-        -batch 100000 -tn 4
-
-# Error examples
-> tools/import-data.sh -ft sql -s /path/sql -fd /non_path
-error: Source file or directory /path/sql does not exist
-
-
-> tools/import-data.sh -ft sql -s /path/sql -tn 0
-error: Invalid thread number '0'. Please set a positive integer.
-
-# Note: before V2.0.6.x, the default value of -pw is root
-```
-
-### 2.4 TsFile Format
-
-#### 2.4.1 Command
-
-```Shell
-# Unix/OS X
-> tools/import-data.sh -ft [-h ] [-p ] [-u ] [-pw ]
-        -s -os [-sd ] -of [-fd ]
-        [-tn ] [-tz ] [-tp ]
-
-# Windows
-# Before V2.0.4.x
-> tools\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ]
-        -s -os [-sd ] -of [-fd ]
-        [-tn ] [-tz ] [-tp ]
-
-# V2.0.4.x and later
-> tools\windows\import-data.bat -ft [-h ] [-p ] [-u ] [-pw ]
-        -s -os [-sd ] -of [-fd ]
-        [-tn ] [-tz ] [-tp ]
-```
-
-#### 2.4.2 Format-Specific Parameters
-
-| Short option | Full option | Description | Required | Default |
-| -------- | -------- | -------- | -------- | -------- |
-| -os | --on\_success | 1. none: do not delete<br>2. mv: move successfully imported files to the target directory<br>3. cp: hard-link (copy) successfully imported files to the target directory<br>4. delete: delete | √ | |
-| -sd | --success\_dir | Target directory for mv or cp when `--on_success` is mv or cp. Each file is renamed to its flattened directory path joined with the original file name | Required when `--on_success` is mv or cp | `${EXEC_DIR}/success` |
-| -of | --on\_fail | 1. none: skip<br>2. mv: move failed files to the target directory<br>3. cp: hard-link (copy) failed files to the target directory<br>4. delete: delete | √ | |
-| -fd | --fail\_dir | Target directory for mv or cp when `--on_fail` is mv or cp. Each file is renamed to its flattened directory path joined with the original file name | Required when `--on_fail` is mv or cp | `${EXEC_DIR}/fail` |
-| -tp | --timestamp\_precision | Timestamp precision
Local (non-remote) TsFile import: -tp specifies the timestamp precision of the TsFile; it is checked manually against the server's timestamp precision, and an error is returned if they differ
Remote import: -tp specifies the timestamp precision of the TsFile; Pipe automatically verifies that the precision matches and returns a Pipe error if it does not | No:
1. ms (milliseconds)
2. us (microseconds)
3. ns(纳秒) | ms| - - -#### 2.4.3 运行示例 - -```Shell -# 正确示例 -> tools/import-data.sh -ft tsfile -h 127.0.0.1 -p 6667 -u root -pw TimechoDB@2021 - -s /path/sql -os mv -of cp -sd /path/success/dir -fd /path/failure/dir - -tn 8 -tz +08:00 -tp ms - -# 异常示例 -> tools/import-data.sh -ft tsfile -s /path/sql -os mv -of cp - -fd /path/failure/dir -tn 8 -error: Missing option --success_dir (or -sd) when --on_success is 'mv' or 'cp' - -> tools/import-data.sh -ft tsfile -s /path/sql -os mv -of cp - -sd /path/success/dir -fd /path/failure/dir -tn 0 -error: Invalid thread number '0'. Please set a positive integer. - -# 注意:V2.0.6.x 版本之前 -pw 参数值默认值为 root -``` -## 3. TsFile 自动加载功能 - -本功能允许 IoTDB 主动监听指定目录下的新增 TsFile,并将 TsFile 自动加载至 IoTDB 中。通过此功能,IoTDB 能自动检测并加载 TsFile,无需手动执行任何额外的加载操作。 - -![](/img/Data-import1.png) - -### 3.1 配置参数 - -可通过从配置文件模版 `iotdb-system.properties.template` 中找到下列参数,添加到 IoTDB 配置文件 `iotdb-system.properties` 中开启 TsFile 自动加载功能。完整配置如下: - -| **配置参数** | **参数说明** | **value 取值范围** | **是否必填** | **默认值** | **加载方式** | -| --------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------- | -------------------- | ------------------------ | -------------------- | -| load\_active\_listening\_enable | 是否开启 DataNode 主动监听并且加载 tsfile 的功能(默认开启)。 | Boolean: true,false | 选填 | true | 热加载 | -| load\_active\_listening\_dirs | 需要监听的目录(自动包括目录中的子目录),如有多个使用 “,“ 隔开默认的目录为 `ext/load/pending`(支持热装载) | String: 一个或多个文件目录 | 选填 | `ext/load/pending` | 热加载 | -| load\_active\_listening\_fail\_dir | 执行加载 tsfile 文件失败后将文件转存的目录,只能配置一个 | String: 一个文件目录 | 选填 | `ext/load/failed` | 热加载 | -| load\_active\_listening\_max\_thread\_num | 同时执行加载 tsfile 任务的最大线程数,参数被注释掉时的默值为 max(1, CPU 核心数 / 2),当用户设置的值不在这个区间[1, CPU核心数 /2]内时,会设置为默认值 (1, 
CPU 核心数 / 2) | Long: [1, Long.MAX\_VALUE] | 选填 | max(1, CPU 核心数 / 2) | 重启后生效 | -| load\_active\_listening\_check\_interval\_seconds | 主动监听轮询间隔,单位秒。主动监听 tsfile 的功能是通过轮询检查文件夹实现的。该配置指定了两次检查 `load_active_listening_dirs` 的时间间隔,每次检查完成 `load_active_listening_check_interval_seconds` 秒后,会执行下一次检查。当用户设置的轮询间隔小于 1 时,会被设置为默认值 5 秒 | Long: [1, Long.MAX\_VALUE] | 选填 | 5 | 重启后生效 | - -### 3.2 注意事项 - -1. 如果待加载的文件中,存在 mods 文件,应优先将 mods 文件移动到监听目录下面,然后再移动 tsfile 文件,且 mods 文件应和对应的 tsfile 文件处于同一目录。防止加载到 tsfile 文件时,加载不到对应的 mods 文件 -2. 禁止设置 Pipe 的 receiver 目录、存放数据的 data 目录等作为监听目录 -3. 禁止 `load_active_listening_fail_dir` 与 `load_active_listening_dirs` 存在相同的目录,或者互相嵌套 -4. 保证 `load_active_listening_dirs` 目录有足够的权限,在加载成功之后,文件将会被删除,如果没有删除权限,则会重复加载 - -## 4. Load SQL - -IoTDB 支持通过 CLI 执行 SQL 直接将存有时间序列的一个或多个 TsFile 文件导入到另外一个正在运行的 IoTDB 实例中。 - -### 4.1 运行命令 - -```SQL -load '' with ( - 'attribute-key1'='attribute-value1', - 'attribute-key2'='attribute-value2', -) -``` - -* `` :文件本身,或是包含若干文件的文件夹路径 -* ``:可选参数,具体如下表所示 - -| Key | Key 描述 | Value 类型 | Value 取值范围 | Value 是否必填 | Value 默认值 | -| --------------------------------------- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ------------ | ----------------------------------------- | ---------------- | -------------------------- | -| `database-level` | 当 tsfile 对应的 database 不存在时,可以通过` database-level`参数的值来制定 database 的级别,默认为`iotdb-common.properties`中设置的级别。
例如当设置 level 参数为 1 时表明此 tsfile 中所有时间序列中层级为1的前缀路径是 database。 | Integer | `[1: Integer.MAX_VALUE]` | 否 | 1 | -| `on-success` | 表示对于成功载入的 tsfile 的处置方式:默认为`delete`,即tsfile 成功加载后将被删除;`none `表明 tsfile 成功加载之后依然被保留在源文件夹, | String | `delete / none` | 否 | delete | -| `model` | 指定写入的 tsfile 是表模型还是树模型,该参数在V2.0.2.1后无效(系统会自动识别是树模型还是表模型) | String | `tree / table` | 否 | 与`-sql_dialect`一致 | -| `database-name` | **仅限表模型有效**: 文件导入的目标 database,不存在时会自动创建,`database-name`中不允许包括"`root.`"前缀,如果包含,将会报错。 | String | `-` | 否 | null | -| `convert-on-type-mismatch` | 加载 tsfile 时,如果数据类型不一致,是否进行转换 | Boolean | `true / false` | 否 | true | -| `verify` | 加载 tsfile 前是否校验 schema | Boolean | `true / false` | 否 | true | -| `tablet-conversion-threshold` | 转换为 tablet 形式的 tsfile 大小阈值,针对小文件 tsfile 加载,采用将其转换为 tablet 形式进行写入:默认值为 -1,即任意大小 tsfile 都不进行转换 | Integer | `[-1,0 :`​`Integer.MAX_VALUE]` | 否 | -1 | -| `async` | 是否开启异步加载 tsfile,将文件移到 active load 目录下面,所有的 tsfile 都 load 到`database-name`下. | Boolean | `true / false` | 否 | false | - -### 4.2 运行示例 - -```SQL --- 准备待导入环境 -IoTDB> show databases -+-------------+-----------------------+---------------------+-------------------+---------------------+ -| Database|SchemaReplicationFactor|DataReplicationFactor|TimePartitionOrigin|TimePartitionInterval| -+-------------+-----------------------+---------------------+-------------------+---------------------+ -|root.__system| 1| 1| 0| 604800000| -+-------------+-----------------------+---------------------+-------------------+---------------------+ - --- 通过load sql 导入 tsfile -IoTDB> load '/home/dump1.tsfile' with ( 'on-success'='none') -Msg: The statement is executed successfully. 
- --- 验证数据导入成功 -IoTDB> select * from root.testdb.** -+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -| Time|root.testdb.device.model.temperature|root.testdb.device.model.humidity|root.testdb.device.model.status| -+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -|2025-04-17T10:35:47.218+08:00| 22.3| 19.4| true| -+-----------------------------+------------------------------------+---------------------------------+-------------------------------+ -``` diff --git a/src/zh/UserGuide/latest/Tools-System/Maintenance-Tool_timecho.md b/src/zh/UserGuide/latest/Tools-System/Maintenance-Tool_timecho.md deleted file mode 100644 index c462adde3..000000000 --- a/src/zh/UserGuide/latest/Tools-System/Maintenance-Tool_timecho.md +++ /dev/null @@ -1,1013 +0,0 @@ - - -# 集群管理工具 - -## 1. 集群管理工具 - -IoTDB 集群管理工具是一款易用的运维工具(企业版工具)。旨在解决 IoTDB 分布式系统多节点的运维难题,主要包括集群部署、集群启停、弹性扩容、配置更新、数据导出等功能,从而实现对复杂数据库集群的一键式指令下发,极大降低管理难度。本文档将说明如何用集群管理工具远程部署、配置、启动和停止 IoTDB 集群实例。 - -### 1.1 环境准备 - -本工具为 TimechoDB(基于IoTDB的企业版数据库)配套工具,您可以联系您的销售获取工具下载方式。 - -IoTDB 要部署的机器需要依赖jdk 8及以上版本、lsof、netstat、unzip功能如果没有请自行安装,可以参考文档最后的一节环境所需安装命令。 - -提示:IoTDB集群管理工具需要使用有root权限的账号 - -### 1.2 部署方法 - -#### 下载安装 - -本工具为TimechoDB(基于IoTDB的企业版数据库)配套工具,您可以联系您的销售获取工具下载方式。 - -注意:由于二进制包仅支持GLIBC2.17 及以上版本,因此最低适配Centos7版本 - -* 在iotd目录内输入以下指令后: - -```bash -bash install-iotdbctl.sh -``` - -即可在之后的 shell 内激活 iotdbctl 关键词,如检查部署前所需的环境指令如下所示: - -```bash -iotdbctl cluster check example -``` - -* 也可以不激活iotd直接使用 <iotdbctl absolute path>/sbin/iotdbctl 来执行命令,如检查部署前所需的环境: - -```bash -/sbin/iotdbctl cluster check example -``` - -### 1.3 系统结构 - -IoTDB集群管理工具主要由config、logs、doc、sbin目录组成。 - -* `config`存放要部署的集群配置文件如果要使用集群部署工具需要修改里面的yaml文件。 -* `logs` 存放部署工具日志,如果想要查看部署工具执行日志请查看`logs/iotd_yyyy_mm_dd.log`。 -* `sbin` 存放集群部署工具所需的二进制包。 -* `doc` 存放用户手册、开发手册和推荐部署手册。 - - -### 1.4 集群配置文件介绍 - -* 
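The `iotdbctl/config` directory contains the cluster configuration yaml files. The yaml file name is the cluster name, and there can be multiple yaml files. To make configuration easier, a `default_cluster.yaml` example is provided in the `iotdbctl/config` directory. -* The yaml configuration consists of five parts: `global`, `confignode_servers`, `datanode_servers`, `grafana_server`, and `prometheus_server` -* `global` is the common configuration. It mainly covers the machine user name and password, the local IoTDB installation file, and the JDK configuration. Copy the `default_cluster.yaml` sample in `iotdbctl/config`, rename it to your own cluster name, and configure the IoTDB cluster by following the comments in it. In the `default_cluster.yaml` sample, uncommented items are required and commented items are optional. - -For example, to run the check command against `default_cluster.yaml`, execute `iotdbctl cluster check default_cluster`. -See the command list below for more commands. - - - -| Parameter | Description | Required | -|---|---|---| -| iotdb\_zip\_dir | IoTDB distribution directory used for deployment. If empty, IoTDB is downloaded from the address given by `iotdb_download_url` | No | -| iotdb\_download\_url | IoTDB download address, used when `iotdb_zip_dir` has no value | No | -| jdk\_tar\_dir | Local JDK path. This JDK is uploaded and deployed to the target nodes. | No | -| jdk\_deploy\_dir | JDK deployment directory on the remote machine. The JDK is deployed under this directory; together with the `jdk_dir_name` parameter below it forms the full JDK deployment path, i.e. `<jdk_deploy_dir>/<jdk_dir_name>` | No | -| jdk\_dir\_name | Directory name after the JDK is unpacked; defaults to jdk_iotdb | No | -| iotdb\_lib\_dir | IoTDB lib directory, or a compressed IoTDB lib package (.zip only). Used only for IoTDB upgrades and commented out by default; to upgrade, uncomment it and set the path. When using a zip file, compress the iotdb/lib directory with the zip command, for example zip -r lib.zip apache\-iotdb\-1.2.0/lib/* | No | -| user | User name for ssh login to the deployment machines | Yes | -| password | ssh login password. If password is not set, pkey is used to log in; make sure passwordless ssh login between the nodes is configured | No | -| pkey | Key-based login. If password has a value, password takes precedence; otherwise pkey is used | No | -| ssh\_port | ssh login port | Yes | -| iotdb\_admin_user | IoTDB service user name; defaults to root | No | -| iotdb\_admin_password | IoTDB service password; defaults to root | No | -| deploy\_dir | IoTDB deployment directory. IoTDB is deployed under this directory; together with the `iotdb_dir_name` parameter below it forms the full IoTDB deployment path, i.e. `<deploy_dir>/<iotdb_dir_name>` | Yes | -| iotdb\_dir\_name | Directory name after IoTDB is unpacked; defaults to iotdb | No | -| datanode-env.sh | Corresponds to `iotdb/config/datanode-env.sh`. When set in both `global` and `datanode_servers`, the value in `datanode_servers` takes precedence | No | -| confignode-env.sh | Corresponds to `iotdb/config/confignode-env.sh`. When set in both `global` and `confignode_servers`, the value in `confignode_servers` takes precedence | No | -| 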
iotdb-common.properties | Corresponds to `iotdb/config/iotdb-common.properties` | No | -| cn\_seed\_config\_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode\_x. When set in both `global` and `confignode_servers`, the value in `confignode_servers` takes precedence. Corresponds to `cn_seed_config_node` in `iotdb/config/iotdb-system.properties` | Yes | -| dn\_seed\_config\_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode\_x. When set in both `global` and `datanode_servers`, the value in `datanode_servers` takes precedence. Corresponds to `dn_seed_config_node` in `iotdb/config/iotdb-system.properties` | Yes | - -datanode-env.sh and confignode-env.sh accept an additional parameter, extra_opts. When it is set, its value is appended to the end of datanode-env.sh and confignode-env.sh. See default\_cluster.yaml for reference; a configuration example follows: -datanode-env.sh: -extra_opts: | -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:+UseG1GC" -IOTDB_JMX_OPTS="$IOTDB_JMX_OPTS -XX:MaxGCPauseMillis=200" - - -* `confignode_servers` configures the IoTDB ConfigNodes to deploy; multiple ConfigNodes can be listed. By default, the first ConfigNode started (node1) is treated as the Seed-ConfigNode - -| Parameter | Description | Required | -|---|---|---| -| name | ConfigNode name | Yes | -| deploy\_dir | IoTDB config node deployment directory | Yes | -| cn\_internal\_address | Internal communication address; corresponds to `cn_internal_address` in `iotdb/config/iotdb-system.properties` | Yes | -| cn\_seed\_config\_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode_x. When set in both `global` and `confignode_servers`, the value in `confignode_servers` takes precedence. Corresponds to `cn_seed_config_node` in `iotdb/config/iotdb-system.properties` | Yes | -| cn\_internal\_port | Internal communication port; corresponds to `cn_internal_port` in `iotdb/config/iotdb-system.properties` | Yes | -| cn\_consensus\_port | Corresponds to `cn_consensus_port` in `iotdb/config/iotdb-system.properties` | No | -| cn\_data\_dir | Corresponds to `cn_data_dir` in `iotdb/config/iotdb-system.properties` | Yes | -| iotdb-system.properties | Corresponds to `iotdb/config/iotdb-system.properties`. When set in both `global` and `confignode_servers`, the value in `confignode_servers` takes precedence | No | - -* `datanode_servers` configures the IoTDB DataNodes to deploy; multiple DataNodes can be listed - -| Parameter | Description | Required | -|---|---|---| -| name | DataNode name | Yes | -| deploy\_dir | IoTDB data node deployment directory. Note: this directory must not be the same as the IoTDB config node deployment directory | Yes | -| dn\_rpc\_address | DataNode rpc address; corresponds to `dn_rpc_address` in `iotdb/config/iotdb-system.properties` | Yes | -| dn\_internal\_address | Internal communication address; corresponds to `dn_internal_address` in `iotdb/config/iotdb-system.properties` | Yes | -| dn\_seed\_config\_node | Cluster configuration address pointing to a live ConfigNode; defaults to confignode_x. When set in both `global` and `datanode_servers`, the value in `datanode_servers` takes precedence. Corresponds to `dn_seed_config_node` in `iotdb/config/iotdb-system.properties`. Using the SeedConfigNode is recommended | Yes | -| dn\_rpc\_port | DataNode rpc port; corresponds to `dn_rpc_port` in `iotdb/config/iotdb-system.properties` | Yes | -| dn\_internal\_port | Internal communication port; corresponds to `dn_internal_port` in `iotdb/config/iotdb-system.properties` | Yes | -| iotdb-system.properties | Corresponds to `iotdb/config/iotdb-system.properties`. When set in both `global` and `datanode_servers`, the value in `datanode_servers` takes precedence | No | - -* 
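`grafana_server` configures the Grafana deployment - -| Parameter | Description | Required | -|---|---|---| -| grafana\_dir\_name | Name of the unpacked Grafana directory | No; defaults to grafana_iotdb | -| host | IP of the server where Grafana is deployed | Yes | -| grafana\_port | Port on the machine where Grafana is deployed | No; defaults to 3000 | -| deploy\_dir | Grafana deployment directory on the server | Yes | -| grafana\_tar\_dir | Location of the Grafana archive | Yes | -| dashboards | Location of the dashboards | No; separate multiple entries with commas | - -* `prometheus_server` configures the Prometheus deployment - -| Parameter | Description | Required | -|---|---|---| -| prometheus\_dir\_name | Name of the unpacked Prometheus directory | No; defaults to prometheus_iotdb | -| host | IP of the server where Prometheus is deployed | Yes | -| prometheus\_port | Port on the machine where Prometheus is deployed | No; defaults to 9090 | -| deploy\_dir | Prometheus deployment directory on the server | Yes | -| prometheus\_tar\_dir | Location of the Prometheus archive | Yes | -| storage\_tsdb\_retention\_time | Number of days data is retained; defaults to 15 days | No | -| storage\_tsdb\_retention\_size | Maximum data size a block can retain; defaults to 512M. Valid units: KB, MB, GB, TB, PB, EB | No | - -If metrics are configured in `iotdb-system.properties` in config/xxx.yaml, the configuration is written to Prometheus automatically and no manual change is needed - -Note: if the value of a yaml key contains special characters such as a colon, wrap the whole value in double quotes. Do not use file paths that contain spaces, to avoid parsing errors. - -### 1.5 Usage Scenarios - -#### Cleaning Up Data - -* Cleaning up cluster data deletes the data directory in the IoTDB cluster as well as the `cn_system_dir`, `cn_consensus_dir`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, and `ext` directories configured in the yaml file. -* First run the stop command, then run the clean command. -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster clean default_cluster -``` - -#### Destroying a Cluster - -* Destroying a cluster deletes `data`, `cn_system_dir`, `cn_consensus_dir`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, `ext`, the IoTDB deployment directory, the grafana deployment directory, and the prometheus deployment directory. -* First run the stop command, then run the destroy command. - - -```bash -iotdbctl cluster stop default_cluster -iotdbctl cluster destroy default_cluster -``` - -#### Upgrading a Cluster - -* To upgrade a cluster, first set `iotdb_lib_dir` in config/xxx.yaml to the directory that contains the jars to upload to the servers (for example iotdb/lib). -* To upload a zip file instead, compress the iotdb/lib directory with the zip command, for example zip -r lib.zip apache-iotdb-1.2.0/lib/* -* Run the upload command, then restart the IoTDB cluster, to complete the upgrade - -```bash -iotdbctl cluster dist-lib 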
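default_cluster -iotdbctl cluster restart default_cluster -``` - -#### Hot Deployment of Cluster Configuration Files - -* First modify the configuration in config/xxx.yaml. -* Run the distribution command, then the hot-deployment command, to complete the hot deployment of the cluster configuration - -```bash -iotdbctl cluster dist-conf default_cluster -iotdbctl cluster reload default_cluster -``` - -#### Cluster Scale-Out - -* First add a datanode or confignode entry to config/xxx.yaml. -* Run the scale-out command -```bash -iotdbctl cluster scaleout default_cluster -``` - -#### Cluster Scale-In - -* First find the name or ip+port of the node to remove in config/xxx.yaml (the confignode port is cn_internal_port and the datanode port is rpc_port) -* Run the scale-in command -```bash -iotdbctl cluster scalein default_cluster -``` - -#### Using the Cluster Deployment Tool with an Existing IoTDB Cluster - -* Configure the server `user`, `password` or `pkey`, and `ssh_port` -* In config/xxx.yaml, set the IoTDB deployment path: `deploy_dir` (IoTDB deployment directory) and `iotdb_dir_name` (name of the unpacked IoTDB directory, iotdb by default). For example, if the full IoTDB deployment path is `/home/data/apache-iotdb-1.1.1`, set `deploy_dir:/home/data/` and `iotdb_dir_name:apache-iotdb-1.1.1` in the yaml file -* If the server does not use java_home, set `jdk_deploy_dir` (JDK deployment directory) and `jdk_dir_name` (name of the unpacked JDK directory, jdk_iotdb by default); if it uses java_home, no change is needed. For example, if the full JDK deployment path is `/home/data/jdk_1.8.2`, set `jdk_deploy_dir:/home/data/` and `jdk_dir_name:jdk_1.8.2` in the yaml file -* Configure `cn_seed_config_node` and `dn_seed_config_node` -* In `iotdb-system.properties` under `confignode_servers`, configure `cn_internal_address`, `cn_internal_port`, `cn_consensus_port`, `cn_system_dir`, and `cn_consensus_dir` only if their values differ from the IoTDB defaults; otherwise they can be left out -* In `iotdb-system.properties` under `datanode_servers`, configure `dn_rpc_address`, `dn_internal_address`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, and so on -* Run the init command - -```bash -iotdbctl cluster init default_cluster -``` - -#### One-Command Deployment of IoTDB, Grafana, and Prometheus - -* Configure `iotdb-system.properties` to enable the metrics interface -* Configure Grafana. If there are multiple `dashboards`, separate them with commas; names must be unique, otherwise they are overwritten. -* Configure Prometheus. If metrics are configured for the IoTDB cluster, the Prometheus configuration does not need to be edited manually; it is updated automatically according to which nodes have metrics configured. -* Start the cluster - -```bash -iotdbctl cluster start default_cluster -``` - -For more detailed parameters, see the Cluster Configuration File section above - - -### 1.6 Command Format - -The basic usage of this tool is: -```bash -iotdbctl cluster <key> <cluster name> [params (Optional)] -``` -* key is the specific command. - -* cluster name is the cluster name (that is, the name of the yaml file in `iotdbctl/config`). - -* params are the arguments required by the command (optional). - -* For example, the command to deploy the default_cluster cluster is: - -```bash -iotdbctl cluster deploy default_cluster -``` - -* The cluster commands and their arguments are listed below: - -| Command | Function | Arguments | -|---|---|---| -| check | Check whether the cluster can be deployed | List of cluster names | -| clean | Clean up the cluster | Cluster name | -| deploy/dist-all | Deploy the cluster | Cluster name; -N module name (iotdb, grafana, or prometheus; optional); -op force (optional) | -| list | Print the list of clusters and their status | None | -| start | Start the cluster | Cluster name; -N node name (nodename, grafana, or prometheus; optional) | -| stop | Stop the cluster | Cluster name; -N node name (nodename, grafana, or prometheus; optional); -op force (optional) | -| restart | Restart the cluster | Cluster name; -N node name (nodename, grafana, or prometheus; optional); -op force (forced stop) / rolling (rolling restart) | -| show | View cluster information; the details field shows detailed cluster information | Cluster name; details (optional) | -| destroy | Destroy the cluster | Cluster name; -N module name (iotdb, grafana, or prometheus; optional) | -| scaleout | Scale out the cluster | Cluster name | -| scalein | Scale in the cluster | Cluster name; -N cluster node name or cluster node ip+port | -| reload | Hot-reload the cluster configuration | Cluster name | -| dist-conf | Distribute the cluster configuration files | Cluster name | -| dumplog | Back up the logs of the specified cluster | Cluster name; -N cluster node name; -h target machine ip; -pw target machine password; -p target machine port; -path backup directory; -startdate start time; -enddate end time; -loglevel log level; -l transfer speed | -| dumpdata | Back up the data of the specified cluster | Cluster name; -h target machine ip; -pw target machine password; -p target machine port; -path backup directory; -startdate start time; -enddate end time; -l transfer speed | -| dist-lib | Upgrade the lib package | Cluster name (restart after the upgrade) | -| init | Initialize the cluster configuration when using the tool with an existing cluster | Cluster name | -| status | View process status | Cluster name | -| activate | Activate the cluster | Cluster name | -| dist-plugin | Upload plugins (udf, trigger, pipe) to the cluster | Cluster name; -type U (udf) / T (trigger) / P (pipe); -file /xxxx/trigger.jar. After uploading, run the create udf, pipe, or trigger command manually | -| upgrade | Rolling upgrade | Cluster name | -| health_check | Health check | Cluster name; -N node name (optional) | -| backup | Offline backup | Cluster name; -N node name (optional) | -| importschema | Schema import | Cluster name; -N node name (required); -param parameters | -| exportschema | Schema export | Cluster name; -N node name (required); -param parameters | - - -### 1.7 Detailed Command Execution - -The commands below all use default_cluster.yaml as the example; replace it with your own cluster file as needed - -#### Checking the Cluster Deployment Environment - -```bash -iotdbctl cluster check default_cluster -``` - -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` and `datanode_servers` configuration - -* Verify that the target nodes can be reached over SSH - -* Verify that the JDK on each node meets IoTDB's requirement of JDK 1.8 or later, and that unzip and either lsof or netstat are installed on the server - -* If the message `Info:example check successfully!` appears, the servers meet the installation requirements. If `Error:example check fail!` is printed, some conditions are not met; check the Error log output above it (for example `Error:Server (ip:172.20.31.76) iotdb port(10713) is listening`) and fix the problem. If the JDK check fails, you can configure a JDK 1.8 or later in the yaml file for deployment, which does not affect later use. If the lsof, netstat, or unzip check fails, install the missing tool on the server yourself. - -#### Deploying the Cluster - -```bash -iotdbctl cluster deploy default_cluster -``` - -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` and `datanode_servers` configuration - -* Upload the IoTDB archive and the JDK archive (if `jdk_tar_dir` and `jdk_deploy_dir` are set in the yaml) to the nodes listed in `confignode_servers` and `datanode_servers` - -* Generate `iotdb-system.properties` from the node configuration in the yaml file and upload it - -```bash -iotdbctl cluster deploy default_cluster -op force -``` -Note: this command forces the deployment; it deletes any existing deployment directory and deploys again - -*Deploy a single module* -```bash -# Deploy the grafana module -iotdbctl cluster deploy default_cluster -N grafana -# Deploy the prometheus module -iotdbctl cluster deploy default_cluster -N prometheus -# Deploy the iotdb module -iotdbctl cluster deploy default_cluster -N iotdb -``` - -#### Starting the Cluster - -```bash -iotdbctl cluster start default_cluster -``` - -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` and `datanode_servers` configuration - -* Start the confignodes one by one in the order listed in `confignode_servers` in the yaml file, checking by process id that each confignode is running; the first confignode is the seed ConfigNode - -* Start the datanodes one by one in the order listed in `datanode_servers` in the yaml file, checking by process id that each datanode is running - -* Once the process id check confirms the processes exist, check through the cli that each service in the cluster list is healthy. If the cli connection fails, retry every 10s until it succeeds, up to 5 times - - -*Start a single node* -```bash -# Start by IoTDB node name -iotdbctl cluster start default_cluster -N datanode_1 -# Start by IoTDB cluster ip+port, where port is cn_internal_port for a confignode and rpc_port for a datanode -iotdbctl cluster start default_cluster -N 192.168.1.5:6667 -# Start grafana -iotdbctl cluster start default_cluster -N grafana -# Start prometheus -iotdbctl cluster start default_cluster -N prometheus -``` - -* Locate the yaml file at the default location based on cluster-name - -* Find the node by the given node name or ip:port. If the node to start is a `data_node`, the ip is `dn_rpc_address` and the port is `dn_rpc_port` from datanode_servers in the yaml file. If the node to start is a `config_node`, the ip is `cn_internal_address` and the port is `cn_internal_port` from confignode_servers in the yaml file - -* Start the node - -Note: the cluster deployment tool only calls the start-confignode.sh and start-datanode.sh scripts of the IoTDB cluster. If the reported result is a failure, the cluster may simply not have finished starting yet; use the status command to check the current cluster state (iotdbctl cluster status xxx) - - -#### Viewing the IoTDB Cluster Status - -```bash -iotdbctl cluster show default_cluster -# View detailed IoTDB cluster information -iotdbctl cluster show default_cluster details -``` -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` and `datanode_servers` configuration - -* Run `show cluster details` through the cli on each datanode in turn. As soon as one node succeeds, the result is returned and the cli is not run on the remaining nodes - - -#### Stopping the Cluster - - -```bash -iotdbctl cluster stop default_cluster -``` -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` and `datanode_servers` configuration - -* Stop the datanodes one by one in the order they are listed in `datanode_servers` - -* Stop the confignodes one by one in the order they are listed in `confignode_servers` - -*Force-stop the cluster* - -```bash -iotdbctl cluster stop default_cluster -op force -``` -This runs kill -9 pid directly to force the cluster to stop - -*Stop a single node* - -```bash -# Stop by IoTDB node name -iotdbctl cluster stop default_cluster -N datanode_1 -# Stop by IoTDB cluster ip+port (ip+dn_rpc_port identifies a unique datanode, and ip+cn_internal_port identifies a unique confignode) -iotdbctl cluster stop default_cluster -N 192.168.1.5:6667 -# Stop grafana -iotdbctl cluster stop default_cluster -N grafana -# Stop prometheus -iotdbctl cluster stop default_cluster -N prometheus -``` - -* Locate the yaml file at the default location based on cluster-name - -* Find the node by the given node name or ip:port. If the node to stop is a `data_node`, the ip is `dn_rpc_address` and the port is `dn_rpc_port` from datanode_servers in the yaml file. If the node to stop is a `config_node`, the ip is `cn_internal_address` and the port is `cn_internal_port` from confignode_servers in the yaml file - -* Stop the node - -Note: the cluster deployment tool only calls the stop-confignode.sh and stop-datanode.sh scripts of the IoTDB cluster, so in some cases the iotdb cluster may not actually have stopped. - - -#### Cleaning Up Cluster Data - -```bash -iotdbctl cluster clean default_cluster -``` - -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` and `datanode_servers` configuration - -* 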
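Using the information in `confignode_servers` and `datanode_servers`, check whether any service is still running. If any service is running, the clean command is not executed - -* Delete the data directory in the IoTDB cluster as well as the `cn_system_dir`, `cn_consensus_dir`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, and `ext` directories configured in the yaml file. - - - -#### Restarting the Cluster - -```bash -iotdbctl cluster restart default_cluster -``` -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus` configuration - -* Run the stop command described above, then the start command; see the start and stop commands above for details - -*Force-restart the cluster* - -```bash -iotdbctl cluster restart default_cluster -op force -``` -This runs kill -9 pid directly to force the cluster to stop, then starts the cluster - -*Restart a single node* - -```bash -# Restart datanode_1 by IoTDB node name -iotdbctl cluster restart default_cluster -N datanode_1 -# Restart confignode_1 by IoTDB node name -iotdbctl cluster restart default_cluster -N confignode_1 -# Restart grafana -iotdbctl cluster restart default_cluster -N grafana -# Restart prometheus -iotdbctl cluster restart default_cluster -N prometheus -``` - -#### Scaling In the Cluster - -```bash -# Scale in by node name -iotdbctl cluster scalein default_cluster -N nodename -# Scale in by ip+port (ip+dn_rpc_port identifies a unique datanode, and ip+cn_internal_port identifies a unique confignode) -iotdbctl cluster scalein default_cluster -N ip:port -``` -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` and `datanode_servers` configuration - -* Check whether the confignode or datanode to remove is the last one of its kind; if only one remains, the scale-in cannot be executed - -* Then obtain the node to remove by ip:port or nodename, run the scale-in command, and destroy that node's directory. If the node to remove is a `data_node`, the ip is `dn_rpc_address` and the port is `dn_rpc_port` from datanode_servers in the yaml file. If the node to remove is a `config_node`, the ip is `cn_internal_address` and the port is `cn_internal_port` from confignode_servers in the yaml file - - -Tip: currently only one node can be removed at a time - -#### Scaling Out the Cluster - -```bash -iotdbctl cluster scaleout default_cluster -``` -* Edit config/xxx.yaml to add a datanode or confignode entry - -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` and `datanode_servers` configuration - -* Find the node to add, upload the IoTDB archive and the JDK package (if `jdk_tar_dir` and `jdk_deploy_dir` are set in the yaml), and unpack them - -* Generate `iotdb-system.properties` from the node configuration in the yaml file and upload it - -* Start the node and verify that it started successfully - -Tip: currently only one node can be added at a time - -#### Destroying the Cluster -```bash -iotdbctl 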
cluster destroy default_cluster -``` - -* Locate the yaml file at the default location based on cluster-name - -* Using the node information in `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus`, check whether any node is still running. If any node is running, the destroy command is aborted - -* Delete `data` in the IoTDB cluster together with the `cn_system_dir`, `cn_consensus_dir`, `dn_data_dirs`, `dn_consensus_dir`, `dn_system_dir`, `logs`, and `ext` directories configured in the yaml file, the IoTDB deployment directory, the grafana deployment directory, and the prometheus deployment directory - -*Destroy a single module* -```bash -# Destroy the grafana module -iotdbctl cluster destroy default_cluster -N grafana -# Destroy the prometheus module -iotdbctl cluster destroy default_cluster -N prometheus -# Destroy the iotdb module -iotdbctl cluster destroy default_cluster -N iotdb -``` - -#### Distributing the Cluster Configuration -```bash -iotdbctl cluster dist-conf default_cluster -``` - -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus` configuration - -* Generate `iotdb-system.properties` from the node configuration in the yaml file and upload it to each specified node in turn - -#### Hot-Reloading the Cluster Configuration -```bash -iotdbctl cluster reload default_cluster -``` -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` and `datanode_servers` configuration - -* Run `load configuration` in the cli on each node in turn, according to the node configuration in the yaml file - -#### Backing Up Cluster Node Logs -```bash -iotdbctl cluster dumplog default_cluster -N datanode_1,confignode_1 -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/logs' -logs '/root/data/db/iotdb/logs' -``` -* Locate the yaml file at the default location based on cluster-name - -* This command checks against the yaml file that datanode_1 and confignode_1 exist, then backs up the logs of those nodes within the configured date range (startdate<=logtime<=enddate) to the target server `192.168.9.48` on port `36000`. The backup path is `/iotdb/logs`, and the IoTDB logs are read from `/root/data/db/iotdb/logs` (optional; if -logs xxx is omitted, logs are backed up from /logs under the IoTDB installation path) - -| Option | Description | Required | -|---|---|---| -| -h | IP of the machine that stores the backup | No | -| -u | User name on the machine that stores the backup | No | -| -pw | Password on the machine that stores the backup | No | -| -p | Port of the machine that stores the backup (default 22) | No | -| -path | Path where the backup is stored (default: current path) | No | -| -loglevel | Log level: all, info, error, or warn (default: all) | No | -| -l | Rate limit (unlimited by default; range 0 to 104857601, in Kbit/s) | No | -| -N | Node names from the configuration file; separate multiple names with commas | Yes | -| -startdate | Start time (inclusive; default 1970-01-01) | No | -| -enddate | End time (inclusive) | No | -| -logs | IoTDB log storage path; defaults to ({iotdb}/logs) | No | - -#### Backing Up Cluster Node Data -```bash -iotdbctl cluster dumpdata default_cluster -granularity partition -startdate '2023-04-11' -enddate '2023-04-26' -h 192.168.9.48 -p 36000 -u root -pw root -path '/iotdb/datas' -``` -* This command obtains the leader node from the yaml file, then backs up the data within the date range (startdate<=logtime<=enddate) to the /iotdb/datas directory on the server 192.168.9.48 - -| Option | Description | Required | -|---|---|---| -| -h | IP of the machine that stores the backup | No | -| -u | User name on the machine that stores the backup | No | -| -pw | Password on the machine that stores the backup | No | -| -p | Port of the machine that stores the backup (default 22) | No | -| -path | Path where the backup is stored (default: current path) | No | -| -granularity | Type: partition | Yes | -| -l | Rate limit (unlimited by default; range 0 to 104857601, in Kbit/s) | No | -| -startdate | Start time (inclusive) | Yes | -| -enddate | End time (inclusive) | Yes | - -#### Uploading the Cluster lib Package (Upgrade) -```bash -iotdbctl cluster dist-lib default_cluster -``` -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` and `datanode_servers` configuration - -* Upload the lib package - -Note: restart IoTDB after the upgrade for it to take effect - -#### Initializing the Cluster -```bash -iotdbctl cluster init default_cluster -``` -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus` configuration -* Initialize the cluster configuration - -#### Viewing Cluster Process Status -```bash -iotdbctl cluster status default_cluster -``` -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers`, `datanode_servers`, `grafana`, and `prometheus` configuration -* Show whether each process in the cluster is alive - -#### Activating the Cluster License - -By default the cluster is activated by entering an activation code; it can also be activated from a license path with -op license_path - -* Default activation -```bash -iotdbctl cluster activate default_cluster -``` -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` configuration -* Read the machine code from it -* Wait for the activation code to be entered - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* Activate a single node - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -``` - -* Activate from a license path - -```bash -iotdbctl cluster activate default_cluster -op license_path -``` -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` configuration -* Read the machine code from it -* Wait for the activation code to be entered - -```bash -Machine code: -Kt8NfGP73FbM8g4Vty+V9qU5lgLvwqHEF3KbLN/SGWYCJ61eFRKtqy7RS/jw03lHXt4MwdidrZJ== -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKgGXEGzMms25+u== -Please enter the activation code: -JHQpXu97IKwv3rzbaDwoPLUuzNCm5aEeC9ZEBW8ndKg=,lTF1Dur1AElXIi/5jPV9h0XCm8ziPd9/R+tMYLsze1oAPxE87+Nwws= -Activation successful -``` -* Activate a single node from a license path - -```bash -iotdbctl cluster activate default_cluster -N confignode1 -op license_path -``` - -### 1.8 Distributing Cluster Plugins -```bash -# Distribute a udf -iotdbctl cluster dist-plugin default_cluster -type U -file /xxxx/udf.jar -# Distribute a trigger -iotdbctl cluster dist-plugin default_cluster -type T -file /xxxx/trigger.jar -# Distribute a pipe -iotdbctl cluster dist-plugin default_cluster -type P -file /xxxx/pipe.jar -``` -* Locate the yaml file at the default location based on cluster-name and read the `datanode_servers` configuration - -* Upload the udf/trigger/pipe jar - -After the upload, run the create udf/trigger/pipe command manually - -### 1.9 Rolling Cluster Upgrade -```bash -iotdbctl cluster upgrade default_cluster -``` -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` and `datanode_servers` configuration - -* Upload the lib package -* Each confignode is stopped, has its lib package replaced, and is started; then each datanode is stopped, has its lib package replaced, and is started - - - -### 1.10 Cluster Health Check -```bash -iotdbctl cluster health_check default_cluster -``` -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` and `datanode_servers` configuration -* Run health_check.sh on every node - -* Health check of a single node -```bash -iotdbctl cluster health_check default_cluster -N datanode_1 -``` -* Locate the yaml file at the default location based on cluster-name and read the `datanode_servers` configuration -* datanode1 runs health_check.sh - - -### 1.11 Offline Cluster Backup -```bash -iotdbctl cluster backup default_cluster -``` -* Locate the yaml file at the default location based on cluster-name and read the `confignode_servers` and `datanode_servers` configuration -* Run backup.sh on every node - -* Offline backup of a single node -```bash -iotdbctl cluster backup default_cluster -N datanode_1 -``` -* Locate the yaml file at the default location based on cluster-name and read the `datanode_servers` configuration -* 
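datanode1 runs backup.sh - -Note: when multiple nodes are deployed on a single machine, only quick mode is supported - -### 1.12 Cluster Schema Import - -```bash -iotdbctl cluster importschema default_cluster -N datanode1 -param "-s ./dump0.csv -fd ./failed/ -lpf 10000" -``` -* Locate the yaml file at the default location based on cluster-name and read the `datanode_servers` configuration -* datanode1 runs the schema import script import-schema.sh - -The arguments of -param are as follows: - -| Option | Description | Required | -|---|---|---| -| -s | The data file to import. Either a file or a folder can be given; if a folder is given, all files in it with the csv suffix are imported in batch. | Yes | -| -fd | A directory for storing files that failed to import. If omitted, failed files are saved in the source data directory, named as the source file name plus the .failed suffix. | No | -| -lpf | Number of lines written to each failed-import file; default 10000 | No | - - - -### 1.13 Cluster Schema Export - -```bash -iotdbctl cluster exportschema default_cluster -N datanode1 -param "-t ./ -pf ./pattern.txt -lpf 10 -timeout 10000" -``` -* Locate the yaml file at the default location based on cluster-name and read the `datanode_servers` configuration -* datanode1 runs the schema export script export-schema.sh - -The arguments of -param are as follows: - -| Option | Description | Required | -|---|---|---| -| -t | Output path for the exported CSV files | Yes | -| -path | Path pattern of the schema to export. When this parameter is specified, the -s parameter is ignored. Example: root.stock.** | No | -| -pf | Required if -path is not specified. Path of the file that lists the schema paths to query; txt format is supported, with one path to export per line. | No | -| -lpf | Maximum number of lines in an exported dump file; default 10000. | No | -| -timeout | Timeout for session queries, in ms | No | - - - -### 1.14 Cluster Deployment Tool Samples -The config/example directory under the tool's installation directory contains three yaml samples. Copy one into config and modify it as needed - -| Name | Description | -|---|---| -| default\_1c1d.yaml | Sample configuration with 1 confignode and 1 datanode | -| default\_3c3d.yaml | Sample configuration with 3 confignodes and 3 datanodes | -| default\_3c3d\_grafa\_prome | Sample configuration with 3 confignodes, 3 datanodes, Grafana, and Prometheus | - -## 2. 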
Data Directory Overview Tool - -The IoTDB data directory overview tool prints a structural overview of a data directory. The tool is located at tools/tsfile/print-iotdb-data-dir. - -### 2.1 Usage - -- Windows: - -```bash -.\print-iotdb-data-dir.bat (<path for saving the output>) -``` - -- Linux or macOS: - -```shell -./print-iotdb-data-dir.sh (<path for saving the output>) -``` - -Note: if no output path is set, the relative path "IoTDB_data_dir_overview.txt" is used as the default. - -### 2.2 Example - -Using Windows as an example: - -`````````````````````````bash -.\print-iotdb-data-dir.bat D:\github\master\iotdb\data\datanode\data -```````````````````````` -Starting Printing the IoTDB Data Directory Overview -```````````````````````` -output save path:IoTDB_data_dir_overview.txt -data dir num:1 -143 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -|============================================================== -|D:\github\master\iotdb\data\datanode\data -|--sequence -| |--root.redirect0 -| | |--1 -| | | |--0 -| |--root.redirect1 -| | |--2 -| | | |--0 -| |--root.redirect2 -| | |--3 -| | | |--0 -| |--root.redirect3 -| | |--4 -| | | |--0 -| |--root.redirect4 -| | |--5 -| | | |--0 -| |--root.redirect5 -| | |--6 -| | | |--0 -| |--root.sg1 -| | |--0 -| | | |--0 -| | | |--2760 -|--unsequence -|============================================================== -````````````````````````` - -## 3. 
TsFile Overview Tool - -The TsFile overview tool prints the contents of a TsFile in sketch mode. The tool is located at tools/tsfile/print-tsfile. - -### 3.1 Usage - -- Windows: - -```bash -.\print-tsfile.bat (<path for saving the output>) -``` - -- Linux or macOS: - -```shell -./print-tsfile.sh (<path for saving the output>) -``` - -Note: if no output path is set, the relative path "TsFile_sketch_view.txt" is used as the default. - -### 3.2 Example - -Using Windows as an example: - -`````````````````````````bash -.\print-tsfile.bat D:\github\master\1669359533965-1-0-0.tsfile D:\github\master\sketch.txt -```````````````````````` -Starting Printing the TsFile Sketch -```````````````````````` -TsFile path:D:\github\master\1669359533965-1-0-0.tsfile -Sketch save path:D:\github\master\sketch.txt -148 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. --------------------------------- TsFile Sketch -------------------------------- -file path: D:\github\master\1669359533965-1-0-0.tsfile -file length: 2974 - - POSITION| CONTENT - -------- ------- - 0| [magic head] TsFile - 6| [version number] 3 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1, num of Chunks:3 - 7| [Chunk Group Header] - | [marker] 0 - | [deviceID] root.sg1.d1 - 20| [Chunk] of root.sg1.d1.s1, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [chunk header] marker=5, measurementID=s1, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 893| [Chunk] of root.sg1.d1.s2, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [chunk header] marker=5, measurementID=s2, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 - 1766| [Chunk] of 
root.sg1.d1.s3, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [chunk header] marker=5, measurementID=s3, dataSize=864, dataType=INT64, compressionType=SNAPPY, encodingType=RLE - | [page] UncompressedSize:862, CompressedSize:860 -||||||||||||||||||||| [Chunk Group] of root.sg1.d1 ends - 2656| [marker] 2 - 2657| [TimeseriesIndex] of root.sg1.d1.s1, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9032452783138882770,maxValue:9117677033041335123,firstValue:7068645577795875906,lastValue:-5833792328174747265,sumValue:5.795959009889246E19] - | [ChunkIndex] offset=20 - 2728| [TimeseriesIndex] of root.sg1.d1.s2, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-8806861312244965718,maxValue:9192550740609853234,firstValue:1150295375739457693,lastValue:-2839553973758938646,sumValue:8.2822564314572677E18] - | [ChunkIndex] offset=893 - 2799| [TimeseriesIndex] of root.sg1.d1.s3, tsDataType:INT64, startTime: 1669359533948 endTime: 1669359534047 count: 100 [minValue:-9076669333460323191,maxValue:9175278522960949594,firstValue:2537897870994797700,lastValue:7194625271253769397,sumValue:-2.126008424849926E19] - | [ChunkIndex] offset=1766 - 2870| [IndexOfTimerseriesIndex Node] type=LEAF_MEASUREMENT - | - | -||||||||||||||||||||| [TsFileMetadata] begins - 2891| [IndexOfTimerseriesIndex Node] type=LEAF_DEVICE - | - | - | [meta offset] 2656 - | [bloom filter] bit vector byte array length=31, filterSize=256, hashFunctionSize=5 -||||||||||||||||||||| [TsFileMetadata] ends - 2964| [TsFileMetadataSize] 73 - 2968| [magic tail] TsFile - 2974| END of TsFile ----------------------------- IndexOfTimerseriesIndex Tree ----------------------------- - [MetadataIndex:LEAF_DEVICE] - └──────[root.sg1.d1,2870] - [MetadataIndex:LEAF_MEASUREMENT] - 
└──────[s1,2657] ----------------------------------- TsFile Sketch End ---------------------------------- -````````````````````````` - -Explanation: - -- "|" is the separator: the left side is the actual position in the TsFile, and the right side is the sketch content. -- "|||||||||||||||||||||" is guide information added for readability; it is not data stored in the TsFile. -- The "IndexOfTimerseriesIndex Tree" printed at the end is a rearranged printout of the metadata index tree at the end of the TsFile, provided for easier understanding; it is not data stored in the TsFile. - -## 4. TsFile Resource Overview Tool - -The TsFile resource overview tool prints the contents of TsFile resource files. The tool is located at tools/tsfile/print-tsfile-resource-files. - -### 4.1 Usage - -- Windows: - -```bash -.\print-tsfile-resource-files.bat -``` - -- Linux or macOS: - -``` -./print-tsfile-resource-files.sh -``` - -### 4.2 Example - -Using Windows as an example: - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -147 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -230 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -231 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -233 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -237 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... 
- -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file folder D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0 finished. -````````````````````````` - -`````````````````````````bash -.\print-tsfile-resource-files.bat D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource -```````````````````````` -Starting Printing the TsFileResources -```````````````````````` -178 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -186 [main] WARN o.a.i.t.c.conf.TSFileDescriptor - not found iotdb-system.properties, use the default configs. -187 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -188 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Cannot find IOTDB_HOME or IOTDB_CONF environment variable when loading config file iotdb-system.properties, use default configuration -192 [main] WARN o.a.iotdb.db.conf.IoTDBDescriptor - Couldn't load the configuration iotdb-system.properties from any of the known sources. -Analyzing D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile ... - -Resource plan index range [9223372036854775807, -9223372036854775808] -device root.sg1.d1, start time 0 (1970-01-01T08:00+08:00[GMT+08:00]), end time 99 (1970-01-01T08:00:00.099+08:00[GMT+08:00]) - -Analyzing the resource file D:\github\master\iotdb\data\datanode\data\sequence\root.sg1\0\0\1669359533489-1-0-0.tsfile.resource finished. 
-````````````````````````` diff --git a/src/zh/UserGuide/latest/Tools-System/Monitor-Tool_timecho.md b/src/zh/UserGuide/latest/Tools-System/Monitor-Tool_timecho.md deleted file mode 100644 index 9f5d3f26a..000000000 --- a/src/zh/UserGuide/latest/Tools-System/Monitor-Tool_timecho.md +++ /dev/null @@ -1,187 +0,0 @@ - - - -# 监控工具 - -监控工具的部署可参考文档 [监控面板部署](../Deployment-and-Maintenance/Monitoring-panel-deployment.md) 章节。 - -## 1. 监控指标的 Prometheus 映射关系 - -> 对于 Metric Name 为 name, Tags 为 K1=V1, ..., Kn=Vn 的监控指标有如下映射,其中 value 为具体值 - -| 监控指标类型 | 映射关系 | -| ---------------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Counter | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value | -| AutoGauge、Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value | -| Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value
name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value | -| Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m1"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m5"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m15"} value
name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="mean"} value | -| Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value
name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value |
-
-## 2. Modify the Configuration File
-
-1) Taking DataNode as an example, modify the iotdb-system.properties configuration file as follows:
-
-```properties
-dn_metric_reporter_list=PROMETHEUS
-dn_metric_level=CORE
-dn_metric_prometheus_reporter_port=9091
-```
-
-2) Start the IoTDB DataNode.
-
-3) Open a browser or use `curl` to access `http://server_ip:9091/metrics`. The following metric data is returned:
-
-```
-...
-# HELP file_count
-# TYPE file_count gauge
-file_count{name="wal",} 0.0
-file_count{name="unseq",} 0.0
-file_count{name="seq",} 2.0
-...
-```
-
-## 3. Prometheus + Grafana
-
-As shown above, IoTDB exposes its monitoring metrics in the standard Prometheus format. Prometheus can collect and store the metrics, and Grafana can visualize them.
-
-The relationship among IoTDB, Prometheus, and Grafana is shown in the figure below:
-
-![iotdb_prometheus_grafana](/img/UserGuide/System-Tools/Metrics/iotdb_prometheus_grafana.png)
-
-1. IoTDB continuously collects monitoring metrics while it runs.
-2. Prometheus pulls the metrics from IoTDB's HTTP interface at a fixed (configurable) interval.
-3. Prometheus stores the pulled metrics in its own TSDB.
-4. Grafana queries the metrics from Prometheus at a fixed (configurable) interval and plots them.
-
-As the interaction flow shows, some extra work is needed to deploy and configure Prometheus and Grafana.
-
-For example, Prometheus can be configured as follows (some parameters can be adjusted as needed) to pull monitoring data from IoTDB:
-
-```yaml
-job_name: pull-metrics
-honor_labels: true
-honor_timestamps: true
-scrape_interval: 15s
-scrape_timeout: 10s
-metrics_path: /metrics
-scheme: http
-follow_redirects: true
-static_configs:
-  - targets:
-      - localhost:9091
-```
-
-More details can be found in the following documents:
-
-[Prometheus installation and usage documentation](https://prometheus.io/docs/prometheus/latest/getting_started/)
-
-[Prometheus configuration for pulling metrics from an HTTP interface](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config)
-
-[Grafana installation and usage documentation](https://grafana.com/docs/grafana/latest/getting-started/getting-started/)
-
-[Grafana documentation on querying data from Prometheus and plotting it](https://prometheus.io/docs/visualization/grafana/#grafana-support-for-prometheus)
-
-## 4. 
Apache IoTDB Dashboard - -我们提供了Apache IoTDB Dashboard,支持统一集中式运维管理,可通过一个监控面板监控多个集群。 - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20default%20cluster.png) - -![Apache IoTDB Dashboard](/img/%E7%9B%91%E6%8E%A7%20cluster2.png) - -你可以在企业版中获取到 Dashboard 的 Json文件。 - -### 4.1 集群概览 - -可以监控包括但不限于: -- 集群总CPU核数、总内存空间、总硬盘空间 -- 集群包含多少个ConfigNode与DataNode -- 集群启动时长 -- 集群写入速度 -- 集群各节点当前CPU、内存、磁盘使用率 -- 分节点的信息 - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%A6%82%E8%A7%88.png) - -### 4.2 数据写入 - -可以监控包括但不限于: -- 写入平均耗时、耗时中位数、99%分位耗时 -- WAL文件数量与尺寸 -- 节点 WAL flush SyncBuffer 耗时 - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%86%99%E5%85%A5.png) - -### 4.3 数据查询 - -可以监控包括但不限于: -- 节点查询加载时间序列元数据耗时 -- 节点查询读取时间序列耗时 -- 节点查询修改时间序列元数据耗时 -- 节点查询加载Chunk元数据列表耗时 -- 节点查询修改Chunk元数据耗时 -- 节点查询按照Chunk元数据过滤耗时 -- 节点查询构造Chunk Reader耗时的平均值 - -![](/img/%E7%9B%91%E6%8E%A7%20%E6%9F%A5%E8%AF%A2.png) - -### 4.4 存储引擎 - -可以监控包括但不限于: -- 分类型的文件数量、大小 -- 处于各阶段的TsFile数量、大小 -- 各类任务的数量与耗时 - -![](/img/%E7%9B%91%E6%8E%A7%20%E5%AD%98%E5%82%A8%E5%BC%95%E6%93%8E.png) - -### 4.5 系统监控 - -可以监控包括但不限于: -- 系统内存、交换内存、进程内存 -- 磁盘空间、文件数、文件尺寸 -- JVM GC时间占比、分类型的GC次数、GC数据量、各年代的堆内存占用 -- 网络传输速率、包发送速率 - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E5%86%85%E5%AD%98%E4%B8%8E%E7%A1%AC%E7%9B%98.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9Fjvm.png) - -![](/img/%E7%9B%91%E6%8E%A7%20%E7%B3%BB%E7%BB%9F%20%E7%BD%91%E7%BB%9C.png) - -### 4.6 数据同步 - -可以监控包括但不限于: -- Pipe事件提交队列大小、未分配Pipe事件数。 -- Source队列未处理事件数、Source供给事件速率、Processor处理事件速率。 -- 各类Pipesink/source未传输事件数、Pipe connector传输事件速率。 -- Pipesink重试队列和pending handler数量、Pipesink压缩前后累计大小和压缩耗时、Pipesink 批量大小和批处理间隔分布。 -- Pipe内存占用和容量、Pipe phantom reference数量、linked TsFile数量和大小、Pipe发送TsFile读取磁盘字节数。 - -![](/img/monitor-tool-pipe-1.png) - -![](/img/monitor-tool-pipe-2.png) - -![](/img/monitor-tool-pipe-3.png) - -![](/img/monitor-tool-pipe-4.png) \ No newline at end of file diff --git a/src/zh/UserGuide/latest/Tools-System/Schema-Export-Tool_timecho.md 
b/src/zh/UserGuide/latest/Tools-System/Schema-Export-Tool_timecho.md deleted file mode 100644 index ffb239ea3..000000000 --- a/src/zh/UserGuide/latest/Tools-System/Schema-Export-Tool_timecho.md +++ /dev/null @@ -1,84 +0,0 @@ - - -# 元数据导出 - -## 1. 功能概述 - -元数据导出工具 `export-schema.sh/bat` 位于tools 目录下,能够将 IoTDB 中指定数据库下的元数据导出为脚本文件。 - -## 2. 功能详解 - -### 2.1 参数介绍 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -| --------------------- | -------------------------- | ------------------------------------------------------------------------ | ----------------------------------- |---------------------------------------| -| `-h` | `--host` | 主机名 | 否 | 127.0.0.1 | -| `-p` | `--port` | 端口号 | 否 | 6667 | -| `-u` | `--username` | 用户名 | 否 | root | -| `-pw` | `--password` | 密码,自 V2.0.9.1 起支持隐藏输入 | 否 | TimechoDB@2021 (V2.0.6.x 版本之前为 root) | -| `-sql_dialect` | `--sql_dialect` | 选择 server 是树模型还是表模型,当前支持 tree 和 table 类型 | 否 | tree | -| `-db` | `--database` | 将要导出的目标数据库,只在`-sql_dialect`为 table 类型下生效。 | `-sql_dialect`为 table 时必填 | - | -| `-table` | `--table` | 将要导出的目标表,只在`-sql_dialect`为 table 类型下生效。 | 否 | - | -| `-t` | `--target` | 指定输出文件的目标文件夹,如果路径不存在新建文件夹 | 是 | | -| `-path` | `--path_pattern` | 指定导出元数据的path pattern | `-sql_dialect`为 tree 时必填 | | -| `-pfn` | `--prefix_file_name` | 指定导出文件的名称。 | 否 | dump\_dbname.sql | -| `-lpf` | `--lines_per_file` | 指定导出的dump文件最大行数,只在`-sql_dialect`为 tree 类型下生效。 | 否 | `10000` | -| `-timeout` | `--query_timeout` | 会话查询的超时时间(ms) | 否 | -1范围:-1~Long. max=9223372036854775807 | -| `-help` | `--help` | 显示帮助信息 | 否 | | -| `-usessl` | `--use_ssl` | 使用 SSL 协议,自 V2.0.9.1 起支持 | 否 | - | -| `-ts` | `--trust_store` | 信任库。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | -| `-tpw` | `--trust_store_password` | 信任库密码。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | - -### 2.2 运行命令 - -```Bash -Shell -# Unix/OS X -> tools/export-schema.sh [-sql_dialect] -db -table - [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -# Windows -# V2.0.4.x 版本之前 -> tools\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] - -# V2.0.4.x 版本及之后 -> tools\windows\schema\export-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -t [-path ] [-pfn ] - [-lpf ] [-timeout ] -``` - -### 2.3 运行示例 - -```Bash -# 导出 root.treedb路径下的元数据 -./export-schema.sh -sql_dialect tree -t /home/ -path "root.treedb.**" - -# 导出结果内容格式如下 -Timeseries,Alias,DataType,Encoding,Compression -root.treedb.device.temperature,,DOUBLE,GORILLA,LZ4 -root.treedb.device.humidity,,DOUBLE,GORILLA,LZ4 -``` diff --git a/src/zh/UserGuide/latest/Tools-System/Schema-Import-Tool_timecho.md b/src/zh/UserGuide/latest/Tools-System/Schema-Import-Tool_timecho.md deleted file mode 100644 index 6b5ab5743..000000000 --- a/src/zh/UserGuide/latest/Tools-System/Schema-Import-Tool_timecho.md +++ /dev/null @@ -1,90 +0,0 @@ - - -# 元数据导入 - -## 1. 功能概述 - -元数据导入工具 `import-schema.sh/bat` 位于tools 目录下,能够将指定路径下创建元数据的脚本文件导入到 IoTDB 中。 - -## 2. 功能详解 - -### 2.1 参数介绍 - -| 参数缩写 | 参数全称 | 参数含义 | 是否为必填项 | 默认值 | -|----------------|--------------------------------|----------------------------------------------|--------|--------------------------------------| -| `-h` | `--host` | 主机名 | 否 | 127.0.0.1 | -| `-p` | `--port` | 端口号 | 否 | 6667 | -| `-u` | `--username` | 用户名 | 否 | root | -| `-pw` | `--password` | 密码,自 V2.0.9.1 起支持隐藏输入 | 否 | TimechoDB@2021 (V2.0.6.x 版本之前为 root) | -| `-sql_dialect` | `--sql_dialect` | 选择 server 是树模型还是表模型,当前支持 tree 和 table 类型 | 否 | tree | -| `-db` | `--database` | 将要导入的目标数据库 | `是` | - | -| `-table` | `--table` | 将要导入的目标表,只在`-sql_dialect`为 table 类型下生效。 | 否 | - | -| `-s` | `--source` | 待加载的脚本文件(夹)的本地目录路径。 | 是 | | -| `-fd` | `--fail_dir` | 指定保存失败文件的目录 | 否 | | -| `-lpf` | `--lines_per_failed_file` | 指定失败文件最大写入数据的行数,只在`-sql_dialect`为 table 类型下生效。 | 否 | 100000范围:0~Integer.Max=2147483647 | -| `-help` | `--help` | 显示帮助信息 | 否 | | -| `-usessl` | `--use_ssl` | 使用 SSL 协议,自 V2.0.9.1 起支持 | 否 | - | -| `-ts` | `--trust_store` | 信任库。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | -| `-tpw` | `--trust_store_password` | 信任库密码。支持隐藏输入,自 V2.0.9.1 起支持 | 否 | - | - -### 2.2 运行命令 - -```Bash -# Unix/OS X -tools/import-schema.sh [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# Windows -# V2.0.4.x 版本之前 -tools\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] - -# V2.0.4.x 版本及之后 -tools\windows\schema\import-schema.bat [-sql_dialect] -db -table
- [-h ] [-p ] [-u ] [-pw ] - -s [-fd ] [-lpf ] -``` - -### 2.3 运行示例 - -```Bash -# 导入前 -IoTDB> show timeseries root.treedb.** -+----------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+----------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -+----------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ - -# 执行导入命令 -./import-schema.sh -sql_dialect tree -s /home/dump0_0.csv -db root.treedb - -# 导入成功后验证 -IoTDB> show timeseries root.treedb.** -+------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias| Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.treedb.device.temperature| null|root.treedb| DOUBLE| GORILLA| LZ4|null| null| null| null| BASE| -| root.treedb.device.humidity| null|root.treedb| DOUBLE| GORILLA| LZ4|null| null| null| null| BASE| -+------------------------------+-----+-----------+--------+--------+-----------+----+----------+--------+------------------+--------+ -``` diff --git a/src/zh/UserGuide/latest/Tools-System/Workbench_timecho.md b/src/zh/UserGuide/latest/Tools-System/Workbench_timecho.md deleted file mode 100644 index 423a657a8..000000000 --- a/src/zh/UserGuide/latest/Tools-System/Workbench_timecho.md +++ /dev/null @@ -1,34 +0,0 @@ -# 可视化控制台 - -可视化控制台的部署可参考文档 [可视化控制台部署](../Deployment-and-Maintenance/workbench-deployment_timecho.md) 章节。 - -## 1. 
产品介绍 -IoTDB可视化控制台是在IoTDB企业版时序数据库基础上针对工业场景的实时数据收集、存储与分析一体化的数据管理场景开发的扩展组件,旨在为用户提供高效、可靠的实时数据存储和查询解决方案。它具有体量轻、性能高、易使用的特点,完美对接 Hadoop 与 Spark 生态,适用于工业物联网应用中海量时间序列数据高速写入和复杂分析查询的需求。 - -## 2. 使用说明 -IoTDB的可视化控制台包含以下功能模块: -| **功能模块** | **功能说明** | -| ------------ | ------------------------------------------------------------ | -| 实例管理 | 支持对连接实例进行统一管理,支持创建、编辑和删除,同时可以可视化呈现多实例的关系,帮助客户更清晰的管理多数据库实例 | -| 首页 | 支持查看数据库实例中各节点的服务运行状态(如是否激活、是否运行、IP信息等),支持查看集群、ConfigNode、DataNode运行监控状态,对数据库运行健康度进行监控,判断实例是否有潜在运行问题 | -| 测点列表 | 支持直接查看实例中的测点信息,包括所在数据库信息(如数据库名称、数据保存时间、设备数量等),及测点信息(测点名称、数据类型、压缩编码等),同时支持单条或批量创建、导出、删除测点 | -| 数据模型 | 支持查看各层级从属关系,将层级模型直观展示 | -| 数据查询 | 支持对常用数据查询场景提供界面式查询交互,并对查询数据进行批量导入、导出 | -| 统计查询 | 支持对常用数据统计场景提供界面式查询交互,如最大值、最小值、平均值、总和的结果输出。 | -| SQL操作 | 支持对数据库SQL进行界面式交互,单条或多条语句执行,结果的展示和导出 | -| 趋势 | 支持一键可视化查看数据整体趋势,对选中测点进行实时&历史数据绘制,观察测点实时&历史运行状态 | -| 分析 | 支持将数据通过不同的分析方式(如傅里叶变换等)进行可视化展示 | -| 视图 | 支持通过界面来查看视图名称、视图描述、结果测点以及表达式等信息,同时还可以通过界面交互快速的创建、编辑、删除视图 | -| 数据同步 | 支持对数据库间的数据同步任务进行直观创建、查看、管理,支持直接查看任务运行状态、同步数据和目标地址,还可以通过界面实时观察到同步状态的监控指标变化 | -| 权限管理 | 支持对权限进行界面管控,用于管理和控制数据库用户访问和操作数据库的权限 | -| 审计日志 | 支持对用户在数据库上的操作进行详细记录,包括DDL、DML和查询操作。帮助用户追踪和识别潜在的安全威胁、数据库错误和滥用行为 | - -主要功能展示: -* 首页 -![首页.png](/img/%E9%A6%96%E9%A1%B5.png) -* 测点列表 -![测点列表.png](/img/workbench-1.png) -* 数据查询 -![数据查询.png](/img/%E6%95%B0%E6%8D%AE%E6%9F%A5%E8%AF%A2.png) -* 趋势 -![历史趋势.png](/img/%E5%8E%86%E5%8F%B2%E8%B6%8B%E5%8A%BF.png) \ No newline at end of file diff --git a/src/zh/UserGuide/latest/User-Manual/Audit-Log_timecho.md b/src/zh/UserGuide/latest/User-Manual/Audit-Log_timecho.md deleted file mode 100644 index d65afcc7c..000000000 --- a/src/zh/UserGuide/latest/User-Manual/Audit-Log_timecho.md +++ /dev/null @@ -1,166 +0,0 @@ - - - -# 安全审计 - -## 1. 
引言 - -审计日志是数据库的记录凭证,通过审计日志功能可以查询到数据库中增删改查等各项操作,以保证信息安全。IoTDB 审计日志功能支持以下特性: - -* 可通过配置决定是否开启审计日志功能 -* 可通过参数设置审计日志记录的操作类型和权限级别 -* 可通过参数设置审计日志文件的存储周期,包括基于 TTL 实现时间滚动和基于 SpaceTL 实现空间滚动。 -* 可通过参数设置统计任意时间段内写入和查询延时大于阈值(默认3000毫秒)的慢请求个数。 -* 审计日志文件默认加密存储 - -> 注意:该功能从 V2.0.8 版本开始提供。 - -## 2. 配置参数 - -通过编辑配置文件 `iotdb-system.properties` 中如下参数来启动审计日志功能。 - -* V2.0.8.1 - - -| 参数名称 | 参数描述 | 数据类型 | 默认值 | 生效方式 | -|-----------------------------------|------------------------------------------------------------------------------------------------------| ---------- | ------------------------ | ---------- | -| `enable_audit_log` | 是否开启审计日志。 true:启用。false:禁用。 | Boolean | false | 热加载 | -| `auditable_operation_type` | 操作类型选择。 DML :所有 DML 都会记录审计日志; DDL :所有 DDL 都会记录审计日志; QUERY :所有 QUERY 都会记录审计日志; CONTROL:所有控制语句都会记录审计日志; | String | DML,DDL,QUERY,CONTROL | 热加载 | -| `auditable_operation_level` | 权限级别选择。 global :记录全部的审计日志; object:仅针对数据实例的事件的审计日志会被记录; 包含关系:object < global。 例如:设置为 global 时,所有审计日志正常记录;设置为 object 时,仅记录对具体数据实例的操作。 | String | global | 热加载 | -| `auditable_operation_result` | 审计结果选择。 success:只记录成功事件的审计日志; fail:只记录失败事件的审计日志; | String | success, fail | 热加载 | -| `audit_log_ttl_in_days` | 审计日志的 TTL,生成审计日志的时间达到该阈值后过期。 | Double | -1.0(永远不会被删除) | 热加载 | -| `audit_log_space_tl_in_GB` | 审计日志的 SpaceTL,审计日志总空间达到该阈值后开始轮转删除。 | Double | 1.0| 热加载| -| `audit_log_batch_interval_in_ms` | 审计日志批量写入的时间间隔 | Long | 1000 | 热加载 | -| `audit_log_batch_max_queue_bytes` | 用于批量处理审计日志的队列最大字节数。当队列大小超过此值时,后续的写入操作将被阻塞。 | Long | 268435456 | 热加载 | - - -* V2.0.9.2 - -| 参数名称 | 参数描述 | 数据类型 | 默认值 | 生效方式 | -|-----------------------------------|------------------------------------------------------------------------------------------------------| ---------- | ------------------------ | ---------- | -| `enable_audit_log` | 是否开启审计日志。 true:启用。false:禁用。 | Boolean | false | 热加载 | -| `auditable_operation_type` | 操作类型选择。 DML :所有 DML 都会记录审计日志; DDL :所有 DDL 都会记录审计日志; QUERY :所有 QUERY 都会记录审计日志; CONTROL:所有控制语句都会记录审计日志; | String | 
DML,DDL,QUERY,CONTROL | 热加载 | -| `auditable_dml_event_type` | 审计DML操作时的事件类型。`OBJECT_AUTHENTICATION`:对象鉴权,`SLOW_OPERATION`:慢操作 | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | 热加载 | -| `auditable_ddl_event_type` | 审计DDL操作时的事件类型。`OBJECT_AUTHENTICATION`:对象鉴权,`SLOW_OPERATION`:慢操作 | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | 热加载 | -| `auditable_query_event_type` | 审计查询操作时的事件类型。`OBJECT_AUTHENTICATION`:对象鉴权,`SLOW_OPERATION`:慢操作 | String | `OBJECT_AUTHENTICATION`,`SLOW_OPERATION` | 热加载 | -| `auditable_control_event_type` | 审计控制操作时的事件类型。`CHANGE_AUDIT_OPTION`:审计选项变更,`OBJECT_AUTHENTICATION`:对象鉴权,`LOGIN`:登录,`LOGOUT`:退出登录,`DN_SHUTDOWN`:数据节点关机,`SLOW_OPERATION`:慢操作 | String | `CHANGE_AUDIT_OPTION`,`OBJECT_AUTHENTICATION`,`LOGIN`,`LOGOUT`,`DN_SHUTDOWN`,`SLOW_OPERATION` | 热加载 | -| `auditable_operation_level` | 权限级别选择。 global :记录全部的审计日志; object:仅针对数据实例的事件的审计日志会被记录; 包含关系:object < global。 例如:设置为 global 时,所有审计日志正常记录;设置为 object 时,仅记录对具体数据实例的操作。 | String | global | 热加载 | -| `auditable_operation_result` | 审计结果选择。 success:只记录成功事件的审计日志; fail:只记录失败事件的审计日志; | String | success, fail | 热加载 | -| `audit_log_ttl_in_days` | 审计日志的 TTL,生成审计日志的时间达到该阈值后过期。 | Double | -1.0(永远不会被删除) | 热加载 | -| `audit_log_space_tl_in_GB` | 审计日志的 SpaceTL,审计日志总空间达到该阈值后开始轮转删除。 | Double | 1.0| 热加载| -| `audit_log_batch_interval_in_ms` | 审计日志批量写入的时间间隔 | Long | 1000 | 热加载 | -| `audit_log_batch_max_queue_bytes` | 用于批量处理审计日志的队列最大字节数。当队列大小超过此值时,后续的写入操作将被阻塞。 | Long | 268435456 | 热加载 | - -**关于对象鉴权和慢操作的说明:** -* 当 `auditable_dml_event_type` 、`auditable_ddl_event_type`、`auditable_query_event_type`、`auditable_control_event_type` 参数值设置为 `OBJECT_AUTHENTICATION`(对象鉴权)时,则对应的事件类型会被记录审计日志。 -* 当 `auditable_dml_event_type` 、`auditable_ddl_event_type`、`auditable_query_event_type`、`auditable_control_event_type` 参数值设置为 `SLOW_OPERATION`(慢操作),则操作时间大于 `slow_query_threshold` 参数值(默认 3000 ms)的对应事件类型才会被记录审计日志。`slow_query_threshold` 参数值可通过 iotdb-system.properties 文件进行配置。 - -## 3. 
查阅方法 - -支持通过 SQL 直接阅读、获取审计日志相关信息。 - -### 3.1 SQL 语法 - -```SQL -SELECT (, )* log FROM WHERE whereclause ORDER BY order_expression -``` - -* `AUDIT_LOG_PATH` :审计日志存储位置`root.__audit.log..`; -* `audit_log_field`:查询字段请参考下一小节元数据结构。 -* 支持 Where 条件搜索和 Order By 排序。 - -### 3.2 元数据结构 - -| 字段 | 含义 | 类型 | -| ------------------------ | -------------------------------------------------- | ----------- | -| `time` | 事件开始的的日期和时间 | timestamp | -| `username` | 用户名称 | string | -| `cli_hostname` | 用户主机标识 | string | -| `audit_event_type` | 审计事件类型,WRITE\_DATA, GENERATE\_KEY, SLOW\_OPERATION 等 | string | -| `operation_type` | 审计事件的操作类型,DML, DDL, QUERY, CONTROL | string | -| `privilege_type` | 审计事件使用的权限,WRITE\_DATA, MANAGE\_USER 等 | string | -| `privilege_level` | 事件的权限级别,global, object | string | -| `result` | 事件结果,success=1, fail=0 | boolean | -| `database` | 数据库名称 | string | -| `sql_string` | 用户的原始 SQL | string | -| `log` | 具体的事件描述 | string | - -### 3.3 使用示例 - -* 查询成功执行了 QUERY 操作的时间、用户名及主机信息 - -```SQL -IoTDB> select username,cli_hostname from root.__audit.log.** where operation_type='QUERY' and result=true align by device -+-----------------------------+---------------------------+--------+------------+ -| Time| Device|username|cli_hostname| -+-----------------------------+---------------------------+--------+------------+ -|2026-01-23T10:39:21.563+08:00|root.__audit.log.node_1.u_0| root| 127.0.0.1| -|2026-01-23T10:39:33.746+08:00|root.__audit.log.node_1.u_0| root| 127.0.0.1| -|2026-01-23T10:42:15.032+08:00|root.__audit.log.node_1.u_0| root| 127.0.0.1| -+-----------------------------+---------------------------+--------+------------+ -Total line number = 3 -It costs 0.036s -``` - -* 查询最近一次操作的时间、用户名、主机信息、操作类型以及原始 SQL - -```SQL -IoTDB> select username,cli_hostname,operation_type,sql_string from root.__audit.log.** order by time desc limit 1 align by device 
-+-----------------------------+---------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------------------+ -| Time| Device|username|cli_hostname|operation_type| sql_string| -+-----------------------------+---------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------------------+ -|2026-01-23T10:42:32.795+08:00|root.__audit.log.node_1.u_0| root| 127.0.0.1| QUERY|select username,cli_hostname from root.__audit.log.** where operation_type='QUERY' and result=true align by device| -+-----------------------------+---------------------------+--------+------------+--------------+------------------------------------------------------------------------------------------------------------------+ -Total line number = 1 -It costs 0.033s -``` - -* 查询所有事件结果为false的操作数据库、操作类型及日志信息 - -```SQL -IoTDB> select database,operation_type,log from root.__audit.log.** where result=false align by device -+-----------------------------+-------------------------------+-----------+--------------+---------------------------------------------------------------------------------+ -| Time| Device| database|operation_type| log| -+-----------------------------+-------------------------------+-----------+--------------+---------------------------------------------------------------------------------+ -|2026-01-23T10:49:55.159+08:00|root.__audit.log.node_1.u_10000| | CONTROL| User user1 (ID=10000) login failed with code: 801, Authentication failed.| -|2026-01-23T10:52:04.579+08:00|root.__audit.log.node_1.u_10000| [root.**]| QUERY| User user1 (ID=10000) requests authority on object [root.**] with result false| -|2026-01-23T10:52:43.412+08:00|root.__audit.log.node_1.u_10000|root.userdb| DDL| User user1 (ID=10000) requests authority on object root.userdb with result false| 
-|2026-01-23T10:52:48.075+08:00|root.__audit.log.node_1.u_10000| null| QUERY|User user1 (ID=10000) requests authority on object root.__audit with result false| -+-----------------------------+-------------------------------+-----------+--------------+---------------------------------------------------------------------------------+ -Total line number = 4 -It costs 0.024s -``` - -* 设置 slow_query_threshold = 1 (ms), 查询某个用户在某个数据节点上审计事件类型为慢操作的记录 - -```SQL -IoTDB> select * from root.__audit.log.node_1.u_0 where audit_event_type='SLOW_OPERATION' align by device -+-----------------------------+---------------------------+------+---------------+--------------+--------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------+----------------+------------+--------+ -| Time| Device|result|privilege_level|privilege_type|database|operation_type| log| sql_string|audit_event_type|cli_hostname|username| -+-----------------------------+---------------------------+------+---------------+--------------+--------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------+----------------+------------+--------+ -|2026-05-06T14:43:55.088+08:00|root.__audit.log.node_1.u_0| true| OBJECT| [READ_DATA]| | QUERY| SLOW_QUERY: cost 60 ms, select * from root.__audit.log.node_1.u_0 where audit_event_type='SLOW_OPERATION' or audit_event_type='LOGIN'limit 1 align by device|select * from root.__audit.log.node_1.u_0 where audit_event_type='SLOW_OPERATION' or audit_event_type='LOGIN'limit 1 align 
by device| SLOW_OPERATION| 127.0.0.1| root| -|2026-05-06T14:44:08.715+08:00|root.__audit.log.node_1.u_0| true| OBJECT| [WRITE_DATA]| | DML| Execution: insert into root.ln.wf02.wt02(timestamp, status, hardware) values (2, false, 'v2') cost 290 ms, with status code: TSStatus(code:200, message:)| insert into root.ln.wf02.wt02(timestamp, status, hardware) values (2, false, 'v2')| SLOW_OPERATION| 127.0.0.1| root| -|2026-05-06T14:44:11.684+08:00|root.__audit.log.node_1.u_0| true| OBJECT| [WRITE_DATA]| | DML|Execution: insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4') cost 6 ms, with status code: TSStatus(code:200, message:)| insert into root.ln.wf02.wt02(timestamp, status, hardware) VALUES (3, false, 'v3'),(4, true, 'v4')| SLOW_OPERATION| 127.0.0.1| root| -+-----------------------------+---------------------------+------+---------------+--------------+--------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------+----------------+------------+--------+ -Total line number = 3 -It costs 0.010s -``` diff --git a/src/zh/UserGuide/latest/User-Manual/Authority-Management-Upgrade_timecho.md b/src/zh/UserGuide/latest/User-Manual/Authority-Management-Upgrade_timecho.md deleted file mode 100644 index 2733e772c..000000000 --- a/src/zh/UserGuide/latest/User-Manual/Authority-Management-Upgrade_timecho.md +++ /dev/null @@ -1,468 +0,0 @@ - - -# 权限管理 - -IoTDB 为用户提供了权限管理操作,为用户提供对数据与集群系统的权限管理功能,保障数据与系统安全。本篇介绍IoTDB 中权限模块的基本概念、用户定义、权限管理、鉴权逻辑与功能用例。 - -## 1. 
基本概念 - -### 1.1 用户 - -用户即数据库的合法使用者。一个用户与一个唯一的用户名相对应,并且拥有密码作为身份验证的手段。一个人在使用数据库之前,必须先提供合法的(即存于数据库中的)用户名与密码,作为用户成功登录。 - -### 1.2 权限 - -数据库提供多种操作,但并非所有的用户都能执行所有操作。如果一个用户可以执行某项操作,则称该用户有执行该操作的权限。权限通常需要一个路径来限定其生效范围,可以使用[路径模式](../Basic-Concept/Operate-Metadata_timecho.md)灵活管理权限。 - -### 1.3 角色 - -角色是若干权限的集合,并且有一个唯一的角色名作为标识符。角色通常和一个现实身份相对应(例如交通调度员),而一个现实身份可能对应着多个用户。这些具有相同现实身份的用户往往具有相同的一些权限,角色就是为了能对这样的权限进行统一的管理的抽象。 - -### 1.4 默认用户与角色 - -安装初始化后的 IoTDB 中有一个默认用户:root,默认密码为`TimechoDB@2021`。该用户为管理员用户,固定拥有所有权限,无法被赋予、撤销权限,也无法被删除,数据库内仅有一个管理员用户。 - -一个新创建的用户或角色不具备任何权限。 - -## 2. 用户定义 - -拥有 SECURITY 的用户可以创建用户或者角色,需要满足以下约束: - -### 2.1 用户名限制 - -4~32个字符,支持使用英文大小写字母、数字、特殊字符(`!@#$%^&*()_+-=`)。用户无法创建和管理员用户同名的用户。 - -### 2.2 密码限制 - -12~32个字符,必须包含大写小写字母、至少一个数字、至少一个特殊字符(`!@#$%^&*()_+-=`)且不能与用户名相同。 - -### 2.3 角色名限制 - -4~32个字符,支持使用英文大小写字母、数字、特殊字符(`!@#$%^&*()_+-=`)。用户无法创建和管理员用户同名的角色。 - -## 3. 权限管理 - -IoTDB 树模型主要有两类权限:全局权限、序列权限。 - -### 3.1 全局权限 - -全局权限包含 SYSTEM、SECURITY、AUDIT 三种类型: - -* SYSTEM:只具备运维操作、DDL(Data Definition Language)相关的权限。 -* SECURITY:只具备管理角色(Role)或用户(User)以及为其他账号授予权限的权限。 -* AUDIT :只具备维护审计规则、查看审计日志的权限。 - -各权限详细描述见下表: - -
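-| Privilege | Former privilege name | Description |
-| --------- | --------------------- | ----------- |
-| SYSTEM | MANAGE_DATABASE | Allows users to create and delete databases. |
-| SYSTEM | USE_TRIGGER | Allows users to create, delete, and view triggers. |
-| SYSTEM | USE_UDF | Allows users to create, delete, and view user-defined functions. |
-| SYSTEM | USE_PIPE | Allows users to create, start, stop, delete, and view PIPEs. Allows users to create, delete, and view PIPEPLUGINS. |
-| SYSTEM | USE_CQ | Allows users to register, start, stop, unregister, and query stream-processing tasks. Allows users to register, unregister, and query stream-processing task plugins. |
-| SYSTEM | MAINTAIN | Allows users to view and cancel queries. Allows users to view variables. Allows users to view the cluster status. |
-| SYSTEM | USE_MODEL | Allows users to create, delete, and query deep learning models. |
-| SECURITY | MANAGE_USER | Allows users to create, delete, modify, and view users. |
-| SECURITY | MANAGE_ROLE | Allows users to create, delete, and view roles. Allows users to grant roles to other users or revoke roles from them. |
-| AUDIT | | Allows users to maintain audit log rules. Allows users to view audit logs. |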
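The table above can be read as a mapping from each former privilege name to the consolidated global privilege that now covers it. As a minimal sketch, assuming a migration script needs to rewrite old grants, the mapping could be expressed in Python as shown below. The `LEGACY_TO_GLOBAL` dictionary and both helper functions are hypothetical names used only for illustration and are not part of IoTDB; the generated statement follows the `GRANT ... ON root.** TO USER ...` form used elsewhere in this document.

```python
# Hypothetical illustration: map former global privilege names to the
# consolidated privilege (SYSTEM or SECURITY) listed in the table above.
LEGACY_TO_GLOBAL = {
    "MANAGE_DATABASE": "SYSTEM",
    "USE_TRIGGER": "SYSTEM",
    "USE_UDF": "SYSTEM",
    "USE_PIPE": "SYSTEM",
    "USE_CQ": "SYSTEM",
    "MAINTAIN": "SYSTEM",
    "USE_MODEL": "SYSTEM",
    "MANAGE_USER": "SECURITY",
    "MANAGE_ROLE": "SECURITY",
}


def consolidate(legacy_privileges):
    """Return the sorted, de-duplicated consolidated privileges."""
    result = set()
    for name in legacy_privileges:
        key = name.strip().upper()
        if key not in LEGACY_TO_GLOBAL:
            # Series privileges such as READ_DATA are not global privileges.
            raise ValueError(f"not a former global privilege: {name!r}")
        result.add(LEGACY_TO_GLOBAL[key])
    return sorted(result)


def grant_statements(user, legacy_privileges):
    """Build one GRANT statement per consolidated privilege.

    Global privileges are always granted on root.**.
    """
    return [
        f"GRANT {privilege} ON root.** TO USER {user};"
        for privilege in consolidate(legacy_privileges)
    ]


if __name__ == "__main__":
    for statement in grant_statements("user1", ["USE_UDF", "MANAGE_USER"]):
        print(statement)
```

Because several former names collapse into one consolidated privilege, the sketch de-duplicates before generating statements, so a user who held both USE_UDF and USE_PIPE receives a single SYSTEM grant.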
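-
-### 3.2 Series Privileges
-
-Series privileges restrict the scope and manner in which users access data. Authorization is supported on absolute paths and on prefix-matching paths, and takes effect at timeseries granularity.
-
-The following table describes the types and scope of these privileges:
-
-| Privilege | Description |
-| --------------- | ----------- |
-| READ\_DATA | Allows reading series data under the authorized path. |
-| WRITE\_DATA | Allows reading series data under the authorized path.<br>Allows inserting and deleting series data under the authorized path.<br>Allows importing and loading data under the authorized path. Importing data requires the WRITE\_DATA privilege on the corresponding path; automatically creating databases and series requires the SYSTEM and WRITE\_SCHEMA privileges. |
-| READ\_SCHEMA | Allows obtaining detailed information about the metadata tree under the authorized path, including databases, child paths, child nodes, devices, series, templates, and views. |
-| WRITE\_SCHEMA | Allows obtaining detailed information about the metadata tree under the authorized path.<br>Allows creating, deleting, and modifying series, templates, views, and so on under the authorized path.<br>When a view is created or modified, the WRITE\_SCHEMA privilege on the view path and the READ\_SCHEMA privilege on the data source are checked.<br>When a view is queried or written to, the READ\_DATA and WRITE\_DATA privileges on the view path are checked.<br>Allows setting, unsetting, and viewing TTL under the authorized path.<br>Allows mounting or unmounting templates under the authorized path.<br>Allows renaming the full path of a series under the authorized path (supported since V2.0.8.2). |
-
-### 3.3 Granting and Revoking Privileges
-
-In IoTDB, a user can obtain privileges in three ways:
-
-1. Granted by the super administrator, who can control the privileges of other users.
-2. Granted by a user who is allowed to grant the privilege, that is, a user who received it with the grant option keyword.
-3. Obtained through a role granted by the super administrator or by a user with the SECURITY privilege.
-
-A user's privileges can be revoked in the following ways:
-
-1. Revoked by the super administrator.
-2. Revoked by a user who is allowed to grant the privilege, that is, a user who received it with the grant option keyword.
-3. Revoked indirectly when the super administrator or a user with the SECURITY privilege removes one of the user's roles.
-
-* A path must be specified when granting. Global privileges must be granted on root.\*\*, while series privileges must use an absolute path or a prefix path ending with a double wildcard.
-* When a privilege is granted to a role, the with grant option keyword can be specified for it. This means the user can re-grant the privilege on the authorized path and can also revoke it from other users on that path. For example, if user A is granted the read privilege on `group1.company1.**` with the grant option keyword, A can grant the read privilege on any node or series under `group1.company1` to others, and can likewise revoke other users' read privilege on any node under `group1.company1`.
-* When revoking, the revoke statement is matched against all of the user's privilege paths, and every matching path is cleared. For example, if user A has the read privilege on `group1.company1.factory1`, revoking the read privilege on `group1.company1.**` clears user A's read privilege on `group1.company1.factory1`.
-
-## 4. Syntax and Examples
-
-IoTDB provides composite privileges to simplify granting:
-
-| Privilege | Scope |
-| ---------- | ---------------------------- |
-| ALL | All privileges |
-| READ | READ\_SCHEMA, READ\_DATA |
-| WRITE | WRITE\_SCHEMA, WRITE\_DATA |
-
-A composite privilege is not a concrete privilege but a shorthand; it is equivalent to writing out the corresponding privilege names.
-
-The following use cases show how the privilege statements are used. A non-administrator must obtain the required privilege before executing a statement; the required privilege is noted after each operation description.
-
-### 4.1 Users and Roles
-
-* Create a user (requires the SECURITY privilege)
-
-```SQL
-CREATE USER
-eg: CREATE USER user1 'Passwd@202604';
-```
-
-* Delete a user (requires the SECURITY privilege)
-
-```SQL
-DROP USER
-eg: DROP USER user1;
-```
-
-* Create a role (requires the SECURITY privilege)
-
-```SQL
-CREATE ROLE
-eg: CREATE ROLE role1;
-```
-
-* Delete a role (requires the SECURITY privilege)
-
-```SQL
-DROP ROLE
-eg: DROP ROLE role1;
-```
-
-* Grant a role to a user (requires the SECURITY privilege)
-
-```SQL
-GRANT ROLE TO
-eg: GRANT ROLE admin TO user1;
-```
-
-* Revoke a role from a user (requires the SECURITY privilege)
-
-```SQL
-REVOKE ROLE FROM
-eg: REVOKE ROLE admin FROM user1;
-```
-
-* List all users (requires the SECURITY privilege)
-
-```SQL
-LIST USER;
-```
-
-* List all roles (requires the SECURITY privilege)
-
-```SQL
-LIST ROLE;
-```
-
-* List all users of a given role (requires the SECURITY privilege)
-
-```SQL
-LIST USER OF ROLE
-eg: LIST USER OF ROLE roleuser;
-```
-
-* List all roles of a given user
-
-A user can list their own roles, but listing another user's roles requires the SECURITY privilege.
-
-```SQL
-LIST ROLE OF USER
-eg: LIST ROLE OF USER tempuser;
-```
-
-* List all privileges of a user
-
-A user can list their own privileges, but listing another user's privileges requires the SECURITY privilege.
-
-```SQL
-LIST PRIVILEGES OF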
USER ; -eg: LIST PRIVILEGES OF USER tempuser; -``` - -* 列出角色所有权限 - -用户可以列出自己具有的角色的权限信息,列出其他角色的权限需要有 SECURITY 权限。 - -```SQL -LIST PRIVILEGES OF ROLE ; -eg: LIST PRIVILEGES OF ROLE actor; -``` - -* 修改密码 - -用户可以修改自己的密码,但修改其他用户密码需要具备 SECURITY 权限。 - -```SQL -ALTER USER SET PASSWORD ; -eg: ALTER USER tempuser SET PASSWORD 'Newpwd@202604'; -``` - -### 4.2 授权与取消授权 - -用户使用授权语句对赋予其他用户权限,语法如下: - -```SQL -GRANT ON TO ROLE/USER [WITH GRANT OPTION]; -eg: GRANT READ ON root.** TO ROLE role1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.** TO USER user1; -eg: GRANT READ_DATA, WRITE_DATA ON root.t1.**,root.t2.** TO USER user1; -eg: GRANT SECURITY ON root.** TO USER user1 WITH GRANT OPTION; -eg: GRANT ALL ON root.** TO USER user1 WITH GRANT OPTION; -``` - -用户使用取消授权语句可以将其他的权限取消,语法如下: - -```SQL -REVOKE ON FROM ROLE/USER ; -eg: REVOKE READ ON root.** FROM ROLE role1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.** FROM USER user1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.**, root.t2.** FROM USER user1; -eg: REVOKE SECURITY ON root.** FROM USER user1; -eg: REVOKE ALL ON root.** FROM USER user1; -``` - -* **非管理员用户执行授权/取消授权语句时,需要对`` 有`` 权限,并且该权限是被标记带有 WITH GRANT OPTION 的。** -* 在授予取消全局权限时,或者语句中包含全局权限时(ALL 展开会包含全局权限),须指定 path 为 root.\*\*。 例如,以下授权/取消授权语句是合法的: - - ```SQL - GRANT SECURITY ON root.** TO USER user1; - GRANT SECURITY ON root.** TO ROLE role1 WITH GRANT OPTION; - GRANT ALL ON root.** TO role role1 WITH GRANT OPTION; - REVOKE SECURITY ON root.** FROM USER user1; - REVOKE SECURITY ON root.** FROM ROLE role1; - REVOKE ALL ON root.** FROM ROLE role1; - ``` - - 下面的语句是非法的: - - ```SQL - GRANT READ, SECURITY ON root.t1.** TO USER user1; - GRANT ALL ON root.t1.t2 TO USER user1 WITH GRANT OPTION; - REVOKE ALL ON root.t1.t2 FROM USER user1; - REVOKE READ, SECURITY ON root.t1.t2 FROM ROLE ROLE1; - ``` -* `` 必须为全路径或者以双通配符结尾的匹配路径,以下路径是合法的: - - ```SQL - root.** - root.t1.t2.** - root.t1.t2.t3 - ``` - - 以下的路径是非法的: - - ```SQL - root.t1.* - root.t1.**.t2 - root.t1*.t2.t3 - ``` - -## 5. 
场景示例 - -根据本文中描述的 [样例数据](https://github.com/thulab/iotdb/files/4438687/OtherMaterial-Sample.Data.txt) 内容,IoTDB 的样例数据可能同时属于 ln, sgcc 等不同发电集团,不同的发电集团不希望其他发电集团获取自己的数据库数据,因此我们需要将不同的数据在集团层进行权限隔离。 - -### 5.1 创建用户 - -使用 `CREATE USER ` 创建用户。例如,我们可以使用具有所有权限的root用户为 ln 和 sgcc 集团创建两个用户角色,名为 ln\_write\_user, sgcc\_write\_user,密码均为 write_Pwd@2026。建议使用反引号(\`)包裹用户名。SQL 语句为: - -```SQL -CREATE USER `ln_write_user` 'write_Pwd@2026'; -CREATE USER `sgcc_write_user` 'write_Pwd@2026'; -``` - -此时使用展示用户的 SQL 语句: - -```SQL -LIST USER; -``` - -我们可以看到这两个已经被创建的用户,结果如下: - -```SQL -IoTDB> CREATE USER `ln_write_user` 'write_Pwd@2026'; -Msg: The statement is executed successfully. -IoTDB> CREATE USER `sgcc_write_user` 'write_Pwd@2026'; -Msg: The statement is executed successfully. -IoTDB> LIST USER; -+------+---------------+-----------------+-----------------+ -|UserId| User|MaxSessionPerUser|MinSessionPerUser| -+------+---------------+-----------------+-----------------+ -| 0| root| -1| 1| -| 10000| ln_write_user| -1| -1| -| 10001|sgcc_write_user| -1| -1| -+------+---------------+-----------------+-----------------+ -Total line number = 3 -It costs 0.005s -``` - -### 5.2 赋予用户权限 - -此时,虽然两个用户已经创建,但是他们不具有任何权限,因此他们并不能对数据库进行操作,例如我们使用 ln\_write\_user 用户对数据库中的数据进行写入,SQL 语句为: - -```SQL -INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true); -``` - -此时,系统不允许用户进行此操作,会提示错误: - -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true); -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -现在,我们用 root 用户分别赋予他们向对应路径的写入权限. - -我们使用 `GRANT ON TO USER ` 语句赋予用户权限,例如: - -```SQL -GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user`; -GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user`; -``` - -执行状态如下所示: - -```SQL -IoTDB> GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user`; -Msg: The statement is executed successfully. 
-IoTDB> GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user`; -Msg: The statement is executed successfully. -``` - -接着使用ln\_write\_user再尝试写入数据 - -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true); -Msg: The statement is executed successfully. -``` - -### 5.3 撤销用户权限 - -授予用户权限后,我们可以使用 `REVOKE ON FROM USER `来撤销已经授予用户的权限。例如,用root用户撤销ln\_write\_user和sgcc\_write\_user的权限: - -```SQL -REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user`; -REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user`; -``` - -执行状态如下所示: - -```SQL -IoTDB> REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user`; -Msg: The statement is executed successfully. -IoTDB> REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user`; -Msg: The statement is executed successfully. -``` - -撤销权限后,ln\_write\_user就没有向root.ln.\*\*写入数据的权限了。 - -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true); -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -## 6. 鉴权及其他 - -### 6.1 鉴权 - -用户权限主要由三部分组成:权限生效范围(路径), 权限类型, with grant option 标记: - -```Plain -userTest1 : - root.t1.** - read_schema, read_data - with grant option - root.** - write_schema, write_data - with grant option -``` - -每个用户都有一个这样的权限访问列表,标识他们获得的所有权限,可以通过 `LIST PRIVILEGES OF USER ` 查看他们的权限。 - -在对一个路径进行鉴权时,数据库会进行路径与权限的匹配。例如检查 `root.t1.t2` 的 read\_schema 权限时,首先会与权限访问列表的 `root.t1.**`进行匹配,匹配成功,则检查该路径是否包含待鉴权的权限,否则继续下一条路径-权限的匹配,直到匹配成功或者匹配结束。 - -在进行多路径鉴权时,对于多路径查询任务,数据库只会将有权限的数据呈现出来,无权限的数据不会包含在结果中;对于多路径写入任务,数据库要求必须所有的目标序列都获得了对应的权限,才能进行写入。 - -请注意,下面的操作需要检查多重权限 - -1. 开启了自动创建序列功能,在用户将数据插入到不存在的序列中时,不仅需要对应序列的写入权限,还需要序列的元数据修改权限。 -2. 执行 select into 语句时,需要检查源序列的读权限与目标序列的写权限。需要注意的是源序列数据可能因为权限不足而仅能获取部分数据,目标序列写入权限不足时会报错终止任务。 -3. 
View 权限与数据源的权限是独立的,向 view 执行读写操作仅会检查 view 的权限,而不再对源路径进行权限校验。 - -### 6.2 其他说明 - -角色是权限的集合,而权限和角色都是用户的一种属性。即一个角色可以拥有若干权限。一个用户可以拥有若干角色与权限(称为用户自身权限)。 - -目前在 IoTDB 中并不存在相互冲突的权限,因此一个用户真正具有的权限是用户自身权限与其所有的角色的权限的并集。即要判定用户是否能执行某一项操作,就要看用户自身权限或用户的角色的所有权限中是否有一条允许了该操作。用户自身权限与其角色权限,他的多个角色的权限之间可能存在相同的权限,但这并不会产生影响。 - -需要注意的是:如果一个用户自身有某种权限(对应操作 A),而他的某个角色有相同的权限。那么如果仅从该用户撤销该权限无法达到禁止该用户执行操作 A 的目的,还需要从这个角色中也撤销对应的权限,或者从这个用户将该角色撤销。同样,如果仅从上述角色将权限撤销,也不能禁止该用户执行操作 A。 - -同时,对角色的修改会立即反映到所有拥有该角色的用户上,例如对角色增加某种权限将立即使所有拥有该角色的用户都拥有对应权限,删除某种权限也将使对应用户失去该权限(除非用户本身有该权限)。 diff --git a/src/zh/UserGuide/latest/User-Manual/Authority-Management_timecho.md b/src/zh/UserGuide/latest/User-Manual/Authority-Management_timecho.md deleted file mode 100644 index 2e4c5c21e..000000000 --- a/src/zh/UserGuide/latest/User-Manual/Authority-Management_timecho.md +++ /dev/null @@ -1,510 +0,0 @@ - - -# 权限管理 - -IoTDB 为用户提供了权限管理操作,为用户提供对数据与集群系统的权限管理功能,保障数据与系统安全。 -本篇介绍IoTDB 中权限模块的基本概念、用户定义、权限管理、鉴权逻辑与功能用例。在 JAVA 编程环境中,您可以使用 [JDBC API](../API/Programming-JDBC_timecho) 单条或批量执行权限管理类语句。 - -## 1. 基本概念 - -### 1.1 用户 - -用户即数据库的合法使用者。一个用户与一个唯一的用户名相对应,并且拥有密码作为身份验证的手段。一个人在使用数据库之前,必须先提供合法的(即存于数据库中的)用户名与密码,作为用户成功登录。 - -### 1.2 权限 - -数据库提供多种操作,但并非所有的用户都能执行所有操作。如果一个用户可以执行某项操作,则称该用户有执行该操作的权限。权限通常需要一个路径来限定其生效范围,可以使用[路径模式](../Basic-Concept/Operate-Metadata.md)灵活管理权限。 - -### 1.3 角色 - -角色是若干权限的集合,并且有一个唯一的角色名作为标识符。角色通常和一个现实身份相对应(例如交通调度员),而一个现实身份可能对应着多个用户。这些具有相同现实身份的用户往往具有相同的一些权限,角色就是为了能对这样的权限进行统一的管理的抽象。 - -### 1.4 默认用户与角色 - -安装初始化后的 IoTDB 中有一个默认用户:root,默认密码为`TimechoDB@2021`(V2.0.6.x 版本之前为`root`)。该用户为管理员用户,固定拥有所有权限,无法被赋予、撤销权限,也无法被删除,数据库内仅有一个管理员用户。 - -一个新创建的用户或角色不具备任何权限。 - -## 2. 用户定义 - -拥有 MANAGE_USER、MANAGE_ROLE 的用户或者管理员可以创建用户或者角色,需要满足以下约束: - -### 2.1 用户名限制 - -4~32个字符,支持使用英文大小写字母、数字、特殊字符(`!@#$%^&*()_+-=`) - -用户无法创建和管理员用户同名的用户。 - -### 2.2 密码限制 - -4~32个字符,可使用大写小写字母、数字、特殊字符(`!@#$%^&*()_+-=`),密码默认采用 SHA-256 进行加密。 - -### 2.3 角色名限制 - -4~32个字符,支持使用英文大小写字母、数字、特殊字符(`!@#$%^&*()_+-=`) - -用户无法创建和管理员用户同名的角色。 - -## 3. 
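Permission Management
-
-IoTDB has two main categories of privileges: series privileges and global privileges.
-
-### 3.1 Series Privileges
-
-Series privileges constrain the scope and manner in which a user can access data. They can be granted on absolute paths and on prefix-matching paths, and take effect at timeseries granularity.
-
-The following table describes the types and scope of these privileges:
-
-| Privilege    | Description |
-|--------------|-------------|
-| READ_DATA    | Allows reading series data under the authorized path. |
-| WRITE_DATA   | Allows reading series data under the authorized path.<br>Allows inserting and deleting series data under the authorized path.<br>Allows importing and loading data under the authorized path. Importing data requires the WRITE_DATA privilege on the corresponding path; automatically creating databases and series additionally requires the MANAGE_DATABASE and WRITE_SCHEMA privileges. |
-| READ_SCHEMA  | Allows obtaining detailed information about the metadata tree under the authorized path,<br>including the databases, sub-paths, child nodes, devices, series, templates, and views under the path. |
-| WRITE_SCHEMA | Allows obtaining detailed information about the metadata tree under the authorized path.<br>Allows creating, deleting, and modifying series, templates, views, etc. under the authorized path.<br>When a view is created or modified, the WRITE_SCHEMA privilege on the view path and the READ_SCHEMA privilege on the data source are checked.<br>When a view is queried or written to, the READ_DATA or WRITE_DATA privilege on the view path is checked, respectively.<br>Allows setting, unsetting, and viewing TTL under the authorized path.<br>Allows mounting or unmounting templates under the authorized path.<br>Allows renaming the full path of a series under the authorized path (supported since V2.0.8.2). |
-
-### 3.2 Global Privileges
-
-Global privileges constrain which database features a user may use and restrict commands that change system state or task state. A user who has been granted global privileges can manage the database.
-
-The following table describes the types of system privileges:
-
-|    Privilege    | Description |
-|:---------------:|:------------|
-| MANAGE_DATABASE | Allows users to create and delete databases. |
-| MANAGE_USER     | Allows users to create, delete, modify, and view users. |
-| MANAGE_ROLE     | Allows users to create, delete, and view roles.<br>Allows users to grant roles to other users or revoke roles from them. |
-| USE_TRIGGER     | Allows users to create, delete, and view triggers.<br>Independent of the privilege checks on the trigger's data source. |
-| USE_UDF         | Allows users to create, delete, and view user-defined functions.<br>Independent of the privilege checks on the function's data source. |
-| USE_CQ          | Allows users to create, start, stop, delete, and view pipes.<br>Allows users to create, delete, and view pipe plugins.<br>Independent of the privilege checks on the pipe's data source. |
-| USE_PIPE        | Allows users to register, start, stop, unregister, and query stream processing tasks.<br>Allows users to register, unregister, and query stream processing task plugins. |
-| EXTEND_TEMPLATE | Allows automatic template extension. |
-| MAINTAIN        | Allows users to query and cancel queries.<br>Allows users to view variables.<br>Allows users to view cluster status. |
-| USE_MODEL       | Allows users to create, delete, and query deep learning models. |
-
-About template privileges:
-
-1. Creating, deleting, modifying, querying, mounting, and unmounting templates is allowed only for the administrator.
-2. Activating a template requires the WRITE_SCHEMA privilege on the activation path.
-3. If automatic creation is enabled, writing to a nonexistent path on which a template is mounted causes the database to extend the template automatically and write the data. This requires the EXTEND_TEMPLATE privilege and the WRITE_DATA privilege on the series being written.
-4. Deactivating a template requires the WRITE_SCHEMA privilege on the path where the template is mounted.
-5. Querying the paths that use a given schema template requires the READ_SCHEMA privilege on those paths; otherwise an empty result is returned.
-
-### 3.3 Granting and Revoking Privileges
-
-In IoTDB, a user can obtain privileges in three ways:
-
-1. Granted by the super administrator, who controls the privileges of all other users.
-2. Granted by a user who is allowed to delegate the privilege, that is, a user who received it with the grant option keyword.
-3. Obtained through a role granted by the super administrator or by a user who holds MANAGE_ROLE.
-
-A user's privileges can be revoked in the following ways:
-
-1. Revoked by the super administrator.
-2. Revoked by a user who is allowed to delegate the privilege, that is, a user who received it with the grant option keyword.
-3. Lost when the super administrator or a user who holds MANAGE_ROLE revokes one of the user's roles.
-
-- A path must be specified when granting. Global privileges must be granted on root.**, while series-related privileges must use an absolute path or a prefix path that ends with a double wildcard.
-- When a privilege is granted to a role, the with grant option keyword may be specified for it. This means the user can re-grant the privilege on the authorized path to others and can also revoke that privilege from other users on the authorized path. For example, if user A is granted the read privilege on `集团1.公司1.**` with the grant option keyword, A can re-grant the read privilege on any node or series under `集团1.公司1` to others, and can likewise revoke other users' read privilege on any node under `集团1.公司1`.
-- When revoking, the revoke statement is matched against all of the user's privilege paths, and every matching path is cleared. For example, if user A has the read privilege on `集团1.公司1.工厂1`, revoking the read privilege on `集团1.公司1.**` also clears A's read privilege on `集团1.公司1.工厂1`.
-
-## 4. Authentication
-
-A user's privileges consist of three parts: the scope in which the privilege takes effect (the path), the privilege type, and the with grant option flag:
-
-```
-userTest1 :
-    root.t1.** - read_schema, read_data - with grant option
-    root.** - write_schema, write_data - with grant option
-```
-
-Every user has an access list like this one that records all the privileges they have obtained. It can be viewed with `LIST PRIVILEGES OF USER <username>`.
-
-When a path is authenticated, the database matches the path against the privilege entries. For example, when checking the read_schema privilege on `root.t1.t2`, the database first matches it against `root.t1.**` in the access list. If the match succeeds, it checks whether that entry contains the privilege being verified; otherwise it moves on to the next path-privilege entry, until a match succeeds or the list is exhausted.
-
-For multi-path authentication, a multi-path query returns only the data the user is authorized to read, and unauthorized data is omitted from the result. A multi-path write requires the corresponding privilege on every target series before anything is written.
-
-Note that the following operations check more than one privilege:
-
-1. When automatic series creation is enabled and a user inserts data into a nonexistent series, both the write privilege on the series and the schema modification privilege on it are required.
-2. A select into statement checks the read privilege on the source series and the write privilege on the target series. Note that the source may yield only part of its data because of insufficient privileges, whereas insufficient write privileges on the target cause an error that terminates the task.
-3. View privileges are independent of the privileges on the data source. Reading from or writing to a view checks only the view's privileges and does not verify privileges on the source path.
-
-## 5. Syntax and Examples
-
-IoTDB provides composite privileges to simplify granting:
-
-| Privilege | Scope                    |
-|-----------|--------------------------|
-| ALL       | All privileges           |
-| READ      | READ_SCHEMA, READ_DATA   |
-| WRITE     | WRITE_SCHEMA, WRITE_DATA |
-
-A composite privilege is not a concrete privilege but a shorthand; using it is no different from writing out the corresponding privilege names.
-
-The following use cases show how the privilege statements are used. A non-administrator must obtain the required privilege before executing a statement; the required privilege is noted after each operation description.
-
-### 5.1 Users and Roles
-
-- Create a user (requires the MANAGE_USER privilege)
-
-```SQL
-CREATE USER <userName> <password>
-eg: CREATE USER user1 'passwd'
-```
-
-- Delete a user (requires the MANAGE_USER privilege)
-
-```SQL
-DROP USER <userName>
-eg: DROP USER user1
-```
-
-- Create a role (requires the MANAGE_ROLE privilege)
-
-```SQL
-CREATE ROLE <roleName>
-eg: CREATE ROLE role1
-```
-
-- Delete a role (requires the MANAGE_ROLE privilege)
-
-```SQL
-DROP ROLE <roleName>
-eg: DROP ROLE role1
-```
-
-- Grant a role to a user (requires the MANAGE_ROLE privilege)
-
-```SQL
-GRANT ROLE <roleName> TO <userName>
-eg: GRANT ROLE admin TO user1
-```
-
-- Revoke a role from a user (requires the MANAGE_ROLE privilege)
-
-```SQL
-REVOKE ROLE <roleName> FROM <userName>
-eg: REVOKE ROLE admin FROM user1
-```
-
-- List all users (requires the MANAGE_USER privilege)
-
-```SQL
-LIST USER
-```
-
-- List all roles (requires the MANAGE_ROLE privilege)
-
-```SQL
-LIST ROLE
-```
-
-- List all users that have a given role (requires the MANAGE_USER privilege)
-
-```SQL
-LIST USER OF ROLE <roleName>
-eg: LIST USER OF ROLE roleuser
-```
-
-- List all roles of a given user
-
-Users can list their own roles, but listing another user's roles requires the MANAGE_ROLE privilege.
-
-```SQL
-LIST ROLE OF USER <userName>
-eg: LIST ROLE OF USER tempuser
-```
-
-- List all privileges of a user
-
-Users can list their own privileges, but listing another user's privileges requires the MANAGE_USER privilege.
-
-```SQL
-LIST PRIVILEGES OF USER <userName>;
-eg: LIST PRIVILEGES OF USER tempuser;
-```
-
-- List all privileges of a role
-
-Users can list the privileges of roles they hold, but listing the privileges of other roles requires the MANAGE_ROLE privilege.
-
-```SQL
-LIST PRIVILEGES OF ROLE <roleName>;
-eg: LIST PRIVILEGES OF ROLE actor;
-```
-
-- Change a password
-
-Users can change their own password, but changing another user's password requires the MANAGE_USER privilege.
-
-```SQL
-ALTER USER <userName> SET PASSWORD <password>;
-eg: ALTER USER tempuser SET PASSWORD 'newpwd';
-```
-
-### 5.2 Granting and Revoking
-
-Users grant privileges to other users or roles with the grant statement. The syntax is as follows:
-
-```SQL
-GRANT <PRIVILEGES> ON <PATHS> TO ROLE/USER <NAME> [WITH GRANT OPTION];
-eg: GRANT READ ON root.** TO ROLE role1;
-eg: GRANT READ_DATA, WRITE_DATA ON root.t1.** TO USER user1;
-eg: GRANT READ_DATA, WRITE_DATA ON root.t1.**,root.t2.** TO USER user1;
-eg: GRANT MANAGE_ROLE ON root.** TO USER user1 WITH GRANT OPTION;
-eg: GRANT ALL ON root.** TO USER user1 WITH GRANT OPTION;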
-``` - -用户使用取消授权语句可以将其他的权限取消,语法如下: - -```SQL -REVOKE ON FROM ROLE/USER ; -eg: REVOKE READ ON root.** FROM ROLE role1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.** FROM USER user1; -eg: REVOKE READ_DATA, WRITE_DATA ON root.t1.**, root.t2.** FROM USER user1; -eg: REVOKE MANAGE_ROLE ON root.** FROM USER user1; -eg: REVOKE ALL ON root.** FROM USER user1; -``` - -- **非管理员用户执行授权/取消授权语句时,需要对\ 有\ 权限,并且该权限是被标记带有 WITH GRANT OPTION 的。** - -- 在授予取消全局权限时,或者语句中包含全局权限时(ALL 展开会包含全局权限),须指定 path 为 root.**。 例如,以下授权/取消授权语句是合法的: - - ```SQL - GRANT MANAGE_USER ON root.** TO USER user1; - GRANT MANAGE_ROLE ON root.** TO ROLE role1 WITH GRANT OPTION; - GRANT ALL ON root.** TO role role1 WITH GRANT OPTION; - REVOKE MANAGE_USER ON root.** FROM USER user1; - REVOKE MANAGE_ROLE ON root.** FROM ROLE role1; - REVOKE ALL ON root.** FROM ROLE role1; - ``` - 下面的语句是非法的: - - ```SQL - GRANT READ, MANAGE_ROLE ON root.t1.** TO USER user1; - GRANT ALL ON root.t1.t2 TO USER user1 WITH GRANT OPTION; - REVOKE ALL ON root.t1.t2 FROM USER user1; - REVOKE READ, MANAGE_ROLE ON root.t1.t2 FROM ROLE ROLE1; - ``` - -- \ 必须为全路径或者以双通配符结尾的匹配路径,以下路径是合法的: - - ```SQL - root.** - root.t1.t2.** - root.t1.t2.t3 - ``` - - 以下的路径是非法的: - - ```SQL - root.t1.* - root.t1.**.t2 - root.t1*.t2.t3 - ``` - -## 6. 示例 - -根据本文中描述的 [样例数据](https://github.com/thulab/iotdb/files/4438687/OtherMaterial-Sample.Data.txt) 内容,IoTDB 的样例数据可能同时属于 ln, sgcc 等不同发电集团,不同的发电集团不希望其他发电集团获取自己的数据库数据,因此我们需要将不同的数据在集团层进行权限隔离。 - -### 6.1 创建用户 - -使用 `CREATE USER ` 创建用户。例如,我们可以使用具有所有权限的root用户为 ln 和 sgcc 集团创建两个用户角色,名为 ln_write_user, sgcc_write_user,密码均为 write_pwd。建议使用反引号(`)包裹用户名。SQL 语句为: - -```SQL -CREATE USER `ln_write_user` 'write_pwd' -CREATE USER `sgcc_write_user` 'write_pwd' -``` -此时使用展示用户的 SQL 语句: - -```SQL -LIST USER -``` - -我们可以看到这两个已经被创建的用户,结果如下: - -```SQL -IoTDB> CREATE USER `ln_write_user` 'write_pwd' -Msg: The statement is executed successfully. -IoTDB> CREATE USER `sgcc_write_user` 'write_pwd' -Msg: The statement is executed successfully. 
-IoTDB> LIST USER; -+---------------+ -| user| -+---------------+ -| ln_write_user| -| root| -|sgcc_write_user| -+---------------+ -Total line number = 3 -It costs 0.012s -``` - -### 6.2 赋予用户权限 - -此时,虽然两个用户已经创建,但是他们不具有任何权限,因此他们并不能对数据库进行操作,例如我们使用 ln_write_user 用户对数据库中的数据进行写入,SQL 语句为: - -```SQL -INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true) -``` - -此时,系统不允许用户进行此操作,会提示错误: - -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp,status) values(1509465600000,true) -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -现在,我们用 root 用户分别赋予他们向对应路径的写入权限. - -我们使用 `GRANT ON TO USER ` 语句赋予用户权限,例如: -```SQL -GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user` -GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user` -``` - -执行状态如下所示: - -```SQL -IoTDB> GRANT WRITE_DATA ON root.ln.** TO USER `ln_write_user` -Msg: The statement is executed successfully. -IoTDB> GRANT WRITE_DATA ON root.sgcc1.**, root.sgcc2.** TO USER `sgcc_write_user` -Msg: The statement is executed successfully. -``` - -接着使用ln_write_user再尝试写入数据 - -```SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true) -Msg: The statement is executed successfully. -``` - -### 6.3 撤销用户权限 -授予用户权限后,我们可以使用 `REVOKE ON FROM USER `来撤销已经授予用户的权限。例如,用root用户撤销ln_write_user和sgcc_write_user的权限: - -``` SQL -REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user` -REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user` -``` - -执行状态如下所示: -``` SQL -IoTDB> REVOKE WRITE_DATA ON root.ln.** FROM USER `ln_write_user` -Msg: The statement is executed successfully. -IoTDB> REVOKE WRITE_DATA ON root.sgcc1.**, root.sgcc2.** FROM USER `sgcc_write_user` -Msg: The statement is executed successfully. 
-``` - -撤销权限后,ln_write_user就没有向root.ln.**写入数据的权限了。 - -``` SQL -IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp, status) values(1509465600000, true) -Msg: 803: No permissions for this operation, please add privilege WRITE_DATA on [root.ln.wf01.wt01.status] -``` - -## 7. 其他说明 - -角色是权限的集合,而权限和角色都是用户的一种属性。即一个角色可以拥有若干权限。一个用户可以拥有若干角色与权限(称为用户自身权限)。 - -目前在 IoTDB 中并不存在相互冲突的权限,因此一个用户真正具有的权限是用户自身权限与其所有的角色的权限的并集。即要判定用户是否能执行某一项操作,就要看用户自身权限或用户的角色的所有权限中是否有一条允许了该操作。用户自身权限与其角色权限,他的多个角色的权限之间可能存在相同的权限,但这并不会产生影响。 - -需要注意的是:如果一个用户自身有某种权限(对应操作 A),而他的某个角色有相同的权限。那么如果仅从该用户撤销该权限无法达到禁止该用户执行操作 A 的目的,还需要从这个角色中也撤销对应的权限,或者从这个用户将该角色撤销。同样,如果仅从上述角色将权限撤销,也不能禁止该用户执行操作 A。 - -同时,对角色的修改会立即反映到所有拥有该角色的用户上,例如对角色增加某种权限将立即使所有拥有该角色的用户都拥有对应权限,删除某种权限也将使对应用户失去该权限(除非用户本身有该权限)。 - -## 8. 升级说明 - -在 1.3 版本前,权限类型较多,在这一版实现中,权限类型做了精简,并且添加了对权限路径的约束。 - -数据库 1.3 版本的权限路径必须为全路径或者以双通配符结尾的匹配路径,在系统升级时,会自动转换不合法的权限路径和权限类型。 -路径上首个非法节点会被替换为`**`, 不在支持的权限类型也会映射到当前系统支持的权限上。 - -例如: - -| 权限类型 | 权限路径 | 映射之后的权限类型 | 权限路径 | -| ----------------- | --------------- |-----------------| ------------- | -| CREATE_DATBASE | root.db.t1.* | MANAGE_DATABASE | root.** | -| INSERT_TIMESERIES | root.db.t2.*.t3 | WRITE_DATA | root.db.t2.** | -| CREATE_TIMESERIES | root.db.t2*c.t3 | WRITE_SCHEMA | root.db.** | -| LIST_ROLE | root.** | (忽略) | | - - -新旧版本的权限类型对照可以参照下面的表格(--IGNORE 表示新版本忽略该权限): - -| 权限名称 | 是否路径相关 | 新权限名称 | 是否路径相关 | -|---------------------------|--------|-----------------|--------| -| CREATE_DATABASE | 是 | MANAGE_DATABASE | 否 | -| INSERT_TIMESERIES | 是 | WRITE_DATA | 是 | -| UPDATE_TIMESERIES | 是 | WRITE_DATA | 是 | -| READ_TIMESERIES | 是 | READ_DATA | 是 | -| CREATE_TIMESERIES | 是 | WRITE_SCHEMA | 是 | -| DELETE_TIMESERIES | 是 | WRITE_SCHEMA | 是 | -| CREATE_USER | 否 | MANAGE_USER | 否 | -| DELETE_USER | 否 | MANAGE_USER | 否 | -| MODIFY_PASSWORD | 否 | -- IGNORE | | -| LIST_USER | 否 | -- IGNORE | | -| GRANT_USER_PRIVILEGE | 否 | -- IGNORE | | -| REVOKE_USER_PRIVILEGE | 否 | -- IGNORE | | -| GRANT_USER_ROLE | 否 | MANAGE_ROLE | 否 | -| REVOKE_USER_ROLE | 否 
| MANAGE_ROLE | 否 | -| CREATE_ROLE | 否 | MANAGE_ROLE | 否 | -| DELETE_ROLE | 否 | MANAGE_ROLE | 否 | -| LIST_ROLE | 否 | -- IGNORE | | -| GRANT_ROLE_PRIVILEGE | 否 | -- IGNORE | | -| REVOKE_ROLE_PRIVILEGE | 否 | -- IGNORE | | -| CREATE_FUNCTION | 否 | USE_UDF | 否 | -| DROP_FUNCTION | 否 | USE_UDF | 否 | -| CREATE_TRIGGER | 是 | USE_TRIGGER | 否 | -| DROP_TRIGGER | 是 | USE_TRIGGER | 否 | -| START_TRIGGER | 是 | USE_TRIGGER | 否 | -| STOP_TRIGGER | 是 | USE_TRIGGER | 否 | -| CREATE_CONTINUOUS_QUERY | 否 | USE_CQ | 否 | -| DROP_CONTINUOUS_QUERY | 否 | USE_CQ | 否 | -| ALL | 否 | All privilegs | | -| DELETE_DATABASE | 是 | MANAGE_DATABASE | 否 | -| ALTER_TIMESERIES | 是 | WRITE_SCHEMA | 是 | -| UPDATE_TEMPLATE | 否 | -- IGNORE | | -| READ_TEMPLATE | 否 | -- IGNORE | | -| APPLY_TEMPLATE | 是 | WRITE_SCHEMA | 是 | -| READ_TEMPLATE_APPLICATION | 否 | -- IGNORE | | -| SHOW_CONTINUOUS_QUERIES | 否 | -- IGNORE | | -| CREATE_PIPEPLUGIN | 否 | USE_PIPE | 否 | -| DROP_PIPEPLUGINS | 否 | USE_PIPE | 否 | -| SHOW_PIPEPLUGINS | 否 | -- IGNORE | | -| CREATE_PIPE | 否 | USE_PIPE | 否 | -| START_PIPE | 否 | USE_PIPE | 否 | -| STOP_PIPE | 否 | USE_PIPE | 否 | -| DROP_PIPE | 否 | USE_PIPE | 否 | -| SHOW_PIPES | 否 | -- IGNORE | | -| CREATE_VIEW | 是 | WRITE_SCHEMA | 是 | -| ALTER_VIEW | 是 | WRITE_SCHEMA | 是 | -| RENAME_VIEW | 是 | WRITE_SCHEMA | 是 | -| DELETE_VIEW | 是 | WRITE_SCHEMA | 是 | diff --git a/src/zh/UserGuide/latest/User-Manual/Auto-Start-On-Boot_timecho.md b/src/zh/UserGuide/latest/User-Manual/Auto-Start-On-Boot_timecho.md deleted file mode 100644 index 06a0ddba6..000000000 --- a/src/zh/UserGuide/latest/User-Manual/Auto-Start-On-Boot_timecho.md +++ /dev/null @@ -1,243 +0,0 @@ - - -# 开机自启 - -## 1.概述 - -TimechoDB 支持通过 `daemon-confignode.sh`、`daemon-datanode.sh`、`daemon-ainode.sh` 三个脚本,将ConfigNode、DataNode、AINode 注册为 Linux 系统服务,结合系统自带的 `systemctl `命令,以守护进程方式管理 TimechoDB 集群,实现更便捷的启动、停止、重启及开机自启等操作,提升服务稳定性。 - -> 注意:该功能从 V 2.0.9.1 版本开始提供。 - -## 2. 
环境要求 - -| 操作系统 | Linux(支持`systemctl`命令) | -| ---------- |:-----------------------------------------------------:| -| 用户权限 | root 用户 | -| 环境变量 | 部署 ConfigNode 和 DataNode 前需设置`JAVA_HOME` | - -## 3. 服务注册 - -进入 TimechoDB 安装目录,执行对应的守护进程脚本: - -```Bash -# 注册 ConfigNode 服务 -./tools/ops/daemon-confignode.sh - -# 注册 DataNode 服务 -./tools/ops/daemon-datanode.sh - -# 注册 AINode 服务 -./tools/ops/daemon-ainode.sh -``` - -执行脚本时将提示以下两个选择项: - -1. 是否本次直接启动对应 TimechoDB 服务(timechodb-confignode/timechodb-datanode/timechodb-ainode); -2. 是否将对应服务注册为开机自启服务。 - -脚本执行完成后,将在 `/etc/systemd/system/` 目录生成对应的服务文件: - -* `timechodb-confignode.service` -* `timechodb-datanode.service` -* `timechodb-ainode.service` - -## 4. 服务管理 - -服务注册完成后,可通过 systemctl 命令对 TimechoDB 各节点服务进行启动、停止、重启、查看状态及配置开机自启等操作,以下命令均需使用 root 用户执行。 - -### 4.1 手动启动服务 - -```bash -# 启动 ConfigNode 服务 -systemctl start timechodb-confignode -# 启动 DataNode 服务 -systemctl start timechodb-datanode -# 启动 AINode 服务 -systemctl start timechodb-ainode -``` - -### 4.2 手动停止服务 - -```bash -# 停止 ConfigNode 服务 -systemctl stop timechodb-confignode -# 停止 DataNode 服务 -systemctl stop timechodb-datanode -# 停止 AINode 服务 -systemctl stop timechodb-ainode -``` - -停止服务后,通过查看服务状态,若显示为 inactive(dead),则说明服务关闭成功;若为其他状态,需查看 TimechoDB 日志,分析异常原因。 - -### 4.3 查看服务状态 - -```bash -# 查看 ConfigNode 服务状态 -systemctl status timechodb-confignode -# 查看 DataNode 服务状态 -systemctl status timechodb-datanode -# 查看 AINode 服务状态 -systemctl status timechodb-ainode -``` - -状态说明: - -* active(running):服务正在运行,若该状态持续 10 分钟,说明服务启动成功; -* failed:服务启动失败,需查看 TimechoDB 日志排查问题。 - -### 4.4 重启服务 - -重启服务相当于先执行停止操作,再执行启动操作,命令如下: - -```bash -# 重启 ConfigNode 服务 -systemctl restart timechodb-confignode -# 重启 DataNode 服务 -systemctl restart timechodb-datanode -# 重启 AINode 服务 -systemctl restart timechodb-ainode -``` - -### 4.5 配置开机自启 - -```bash -# 配置 ConfigNode 开机自启 -systemctl enable timechodb-confignode -# 配置 DataNode 开机自启 -systemctl enable timechodb-datanode -# 配置 AINode 开机自启 -systemctl enable timechodb-ainode -``` - 
-### 4.6 取消开机自启 - -```bash -# 取消 ConfigNode 开机自启 -systemctl disable timechodb-confignode -# 取消 DataNode 开机自启 -systemctl disable timechodb-datanode -# 取消 AINode 开机自启 -systemctl disable timechodb-ainode -``` - -## 5. 自定义服务配置 - -### 5.1 自定义方式 - -#### 5.1.1 方案一:修改脚本 - -1. 修改 `daemon-xxx.sh` 中的[Unit]、[Service]、[Install]区域配置项,具体配置项的含义参考下一小节 -2. 执行 `daemon-xxx.sh` 脚本 - -#### 5.1.2 方案二:修改服务文件 - -1. 修改 `/etc/systemd/system` 中的 `xx.service` 文件 -2. 执行 `systemctl deamon-reload` - -### 5.2 `daemon-xxx.sh` 配置项 - -#### 5.2.1 [Unit] 部分(服务元信息) - -| 配置项 | 说明 | -| --------------- | ---------------------------------- | -| Description | 服务描述 | -| Documentation | 指向 TimechoDB 官方文档 | -| After | 确保在网络服务启动后才启动该服务 | - -#### 5.2.2 [Service] 部分(服务运行配置) - -| 配置项 | 含义 | -| -------------------------------------------- | ---------------------------------------------------------------------- | -| StandardOutput、StandardError | 指定服务标准输出和错误日志的存储路径 | -| LimitNOFILE=65536 | 设置文件描述符上限,默认值为 65536 | -| Type=simple | 服务类型为简单前台进程,systemd 会跟踪服务主进程 | -| User=root、Group=root | 指定服务以 root 用户和 root 组的权限运行 | -| ExecStart/ExecStop | 分别指定服务的启动脚本和停止脚本的路径 | -| Restart=on-failure | 仅在服务异常退出时,自动重启服务 | -| SuccessExitStatus=143 | 将退出码 143(128+15,即 SIGTERM 正常终止)视为成功退出 | -| RestartSec=5 | 服务重启的间隔时间,默认为 5 秒 | -| StartLimitInterval=600s、StartLimitBurst=3 | 10 分钟(600 秒)内,服务最多重启 3 次,防止频繁重启导致系统资源浪费 | -| RestartPreventExitStatus=SIGKILL | 服务被 SIGKILL 信号杀死后,不自动重启,避免无限重启僵尸进程 | - -#### 5.2.3 [Install] 部分(安装配置) - -| 配置项 | 含义 | -| ---------------------------- | -------------------------------------------- | -| WantedBy=multi-user.target | 指定服务在系统进入多用户模式时,自动启动。 | - -### 5.3 .service 文件格式示例 - -```bash -[Unit] -Description=timechodb-confignode -Documentation=https://www.timecho.com/ -After=network.target - -[Service] -StandardOutput=null -StandardError=null -LimitNOFILE=65536 -Type=simple -User=root -Group=root -Environment=JAVA_HOME=$JAVA_HOME -ExecStart=$TimechoDB_SBIN_HOME/start-confignode.sh -Restart=on-failure 
-SuccessExitStatus=143 -RestartSec=5 -StartLimitInterval=600s -StartLimitBurst=3 -RestartPreventExitStatus=SIGKILL - -[Install] -WantedBy=multi-user.target -``` - -注:上述为 timechodb-confignode.service 文件的标准格式,timechodb-datanode.service、timechodb-ainode.service 文件格式类似。 - -## 6. 注意事项 - -1. **进程守护机制** - -* **自动重启**:服务启动失败或运行中异常退出(如 OOM)时,系统将自动重启。 -* **不重启**:正常退出(如执行 `kill`、`./sbin/stop-xxx.sh` 或 `systemctl stop`)不会触发自动重启。 - -2. **日志位置** - -* 所有运行日志均存储在 TimechoDB 安装目录下的 `logs` 文件夹中,排查问题时请查阅该目录。 - -3. **集群状态查看** - -* 服务启动后,执行 `./sbin/start-cli.sh` 并输入 `show cluster` 命令,即可查看集群状态。 - -4. **故障恢复流程** - -* 若服务状态为 `failed`,修复问题后**必须**先执行 `systemctl daemon-reload`,然后再执行 `systemctl start`,否则启动将失败。 - -5. **配置生效** - -* 修改 `daemon-xxx.sh` 脚本内容后,需执行 `systemctl daemon-reload` 重新注册服务,新配置方可生效。 - -6. **启动方式兼容** - -* `systemctl start`启动的服务,可用`./sbin/stop` 停止(不重启)。 -* `./sbin/start` 启动的进程,无法通过 `systemctl` 监控状态。 diff --git a/src/zh/UserGuide/latest/User-Manual/Black-White-List_timecho.md b/src/zh/UserGuide/latest/User-Manual/Black-White-List_timecho.md deleted file mode 100644 index 66d99c273..000000000 --- a/src/zh/UserGuide/latest/User-Manual/Black-White-List_timecho.md +++ /dev/null @@ -1,78 +0,0 @@ - - -# 黑白名单 - -## 1. 引言 - -IoTDB 是一款针对物联网场景设计的时间序列数据库,支持高效的数据存储、查询和分析。随着物联网技术的广泛应用,数据安全性和访问控制变得至关重要。在开放环境中,如何保证合法用户对数据的安全访问成为了一项关键挑战。白名单机制仅允许可信 IP 或用户接入,从源头缩小攻击面;黑名单功能则能在边缘与云端协同场景下实时拦截恶意 IP,阻断非法访问、SQL 注入、暴力破解及 DDoS 等威胁,为数据传输提供持续、稳定的安全保障。 - -> 注意:该功能从 V2.0.6 版本开始提供。 - -## 2. 
白名单 - -### 2.1 功能描述 - -通过开启白名单功能、配置白名单列表,指定允许连接 IoTDB 的客户端地址,来限制仅在白名单范围内的客户端才能够访问 IoTDB,从而实现安全控制。 - -### 2.2 配置参数 - -管理员可以通过以下两种方式来启用/禁用白名单功能以及添加、修改、删除白名单ip/ip段。 - -* 编辑配置文件 `iotdb-system.properties`进行维护 -* 通过 set configuration 语句进行维护 - * 树模型请参考:[set configuration](../Reference/Modify-Config-Manual.md) - -相关参数如下: - -| 名称 | 描述 | 默认值 | 生效方式 | 示例 | -| ------------------------- | ----------------------------------------------------------------------------------- | -------- | ---------- | ------------------------------------------------------------------- | -| `enable_white_list` | 是否启用白名单功能。true:启用;false:禁用。字段值不区分大小写。 | false | 热加载 | `set enable_white_list = 'true' ` | -| `white_ip_list` | 添加、修改、删除白名单ip/ip段。支持精确匹配,支持\*通配符,多个ip之间以逗号分隔。 | 空 | 热加载 | `set white_ip_list='192.168.1.200,192.168.1.201,192.168.1.*`' | - -## 3. 黑名单 - -### 3.1 功能描述 - -通过开启黑名单功能、配置黑名单列表,阻止某些特定 IP 地址访问数据库,来防止非法访问、SQL注入、暴力破解、DDoS攻击等安全威胁,从而确保数据传输过程中的安全性和稳定性。 - -### 3.2 配置参数 - -管理员可以通过以下两种方式来启用/禁用黑名单功能以及添加、修改、删除黑名单 ip/ip 段。 - -* 编辑配置文件 `iotdb-system.properties`进行维护 -* 通过 set configuration 语句进行维护 - * 树模型请参考:[set configuration](../Reference/Modify-Config-Manual.md) - -相关参数如下: - -| 名称 | 描述 | 默认值 | 生效方式 | 示例 | -| ------------------------- | ----------------------------------------------------------------------------------- | -------- | ---------- | ------------------------------------------------------------------- | -| `enable_black_list` | 是否启用黑名单功能。true:启用;false:禁用。字段值不区分大小写。 | false | 热加载 | `set enable_black_list = 'true' ` | -| `black_ip_list` | 添加、修改、删除黑名单ip/ip段。支持精确匹配,支持\*通配符,多个ip之间以逗号分隔。 | 空 | 热加载 | `set black_ip_list='192.168.1.200,192.168.1.201,192.168.1.*`' | - -## 4. 注意事项 - -1. 开启白名单后,若列表为空将拒绝所有连接,若未包含本机 IP 则拒绝本机登录。 -2. 当同一 IP 同时存在于黑白名单时,黑名单优先级更高。 -3. 系统会校验 IP 格式,无效条目将在用户连接时报错并被跳过,不影响其他有效IP的加载。 -4. 配置支持重复IP,内存中会自动去重且无提示。如需去重请手动修改。 -5. 
黑/白名单规则仅对新连接生效,功能开启前的现有连接不受影响,其后续重连才会被拦截。 diff --git a/src/zh/UserGuide/latest/User-Manual/Data-Sync_timecho.md b/src/zh/UserGuide/latest/User-Manual/Data-Sync_timecho.md deleted file mode 100644 index c4cfe8fa7..000000000 --- a/src/zh/UserGuide/latest/User-Manual/Data-Sync_timecho.md +++ /dev/null @@ -1,743 +0,0 @@ - - -# 数据同步 -数据同步是工业物联网的典型需求,通过数据同步机制,可实现 IoTDB 之间的数据共享,搭建完整的数据链路来满足内网外网数据互通、端边云同步、数据迁移、数据备份等需求。 - -## 1. 功能概述 - -### 1.1 数据同步 - -一个数据同步任务包含 3 个阶段: - -![](/img/data-sync-new.png) - -- 抽取(Source)阶段:该部分用于从源 IoTDB 抽取数据,在 SQL 语句中的 source 部分定义 -- 处理(Process)阶段:该部分用于处理从源 IoTDB 抽取出的数据,在 SQL 语句中的 processor 部分定义 -- 发送(Sink)阶段:该部分用于向目标 IoTDB 发送数据,在 SQL 语句中的 sink 部分定义 - -通过 SQL 语句声明式地配置 3 个部分的具体内容,可实现灵活的数据同步能力。目前数据同步支持以下信息的同步,您可以在创建同步任务时对同步范围进行选择(默认选择 data.insert,即同步新写入的数据): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
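| Sync scope | Sync content | Description |
| ---------- | ------------ | ----------- |
| all | | All scopes |
| data | insert (incremental) | Synchronizes newly written data |
| | delete | Synchronizes deleted data |
| schema (metadata) | database | Synchronizes database creation, modification, and deletion operations |
| | timeseries | Synchronizes the definitions and attributes of time series |
| | TTL (data expiration time) | Synchronizes the time-to-live of data |
| auth (permissions) | - | Synchronizes user privileges and access control |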
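
To make the scope hierarchy in the table above concrete, the following minimal Python sketch shows how an `inclusion`/`exclusion` pair could be resolved into a set of leaf scopes. It is only an illustrative model under stated assumptions, not IoTDB's implementation: the names `SCOPE_TREE`, `expand`, and `effective_scopes` are hypothetical, and the dotted leaf names other than `data.insert` (for example `schema.ttl`) are assumed by analogy with `data.insert`. The default mirrors the note that `data.insert` is selected when no scope is given.

```python
# Hypothetical model of the sync-scope table; NOT taken from the IoTDB code base.
# Leaf names other than "data.insert" are assumptions made for illustration.
SCOPE_TREE = {
    "data": ["data.insert", "data.delete"],
    "schema": ["schema.database", "schema.timeseries", "schema.ttl"],
    "auth": ["auth"],
}


def expand(spec: str) -> set:
    """Expand a comma-separated scope spec into the set of leaf scopes it covers."""
    leaves = set()
    for item in (part.strip() for part in spec.split(",")):
        if not item:
            continue  # tolerate empty specs such as ""
        if item == "all":
            for group in SCOPE_TREE.values():
                leaves.update(group)
        elif item in SCOPE_TREE:
            leaves.update(SCOPE_TREE[item])  # a whole group, e.g. "data"
        elif any(item in group for group in SCOPE_TREE.values()):
            leaves.add(item)  # a single leaf, e.g. "data.delete"
        else:
            raise ValueError(f"unknown scope: {item}")
    return leaves


def effective_scopes(inclusion: str = "data.insert", exclusion: str = "") -> set:
    """Scopes that are actually synchronized: included leaves minus excluded leaves."""
    return expand(inclusion) - expand(exclusion)


if __name__ == "__main__":
    print(sorted(effective_scopes("all", "auth")))
    # ['data.delete', 'data.insert', 'schema.database', 'schema.timeseries', 'schema.ttl']
```

Modeling the scopes as a small tree keeps group names (`data`, `schema`) and individual leaves interchangeable in both lists, which matches how the table groups sync content under each scope.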
- -### 1.2 功能限制及说明 - -1. 元数据(schema)、权限(auth)同步功能存在如下限制: - -- 使用元数据同步时,要求`Schema region`、`ConfigNode` 的共识协议必须为默认的 ratis 协议,即`iotdb-system.properties`配置文件中是否包含`config_node_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus`、`schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus`,不包含即为默认值ratis 协议。 - -- 为了防止潜在的冲突,请在开启元数据同步时关闭接收端自动创建元数据功能。可通过修改 `iotdb-system.properties`配置文件中的`enable_auto_create_schema`配置项为 false,关闭元数据自动创建功能。 - -- 开启元数据同步时,不支持使用自定义插件。 - -- 双活集群中元数据同步需避免两端同时操作。 - -- 在进行数据同步任务时,请避免执行任何删除操作,防止两端状态不一致。 - -2. Pipe 权限控制规范如下: - -- 创建 pipe 时,可以对抽取/写回插件指定用户名和密码。密码错误则禁止创建,未指定时默认使用当前用户进行同步。 - -- 数据/元数据同步时,先根据 Pipe 配置的路径模式(pattern/path)筛选,再基于用户读取权限进行鉴权 - - - 权限范围≥写入路径:完整同步 - - - 权限范围与写入路径无交集:不同步 - - - 权限范围<写入路径或存在交集:同步交集部分 - -- 遇到无权限数据时,若发送端 skipIf=no-privileges,则跳过无权限数据;若 skipIf 配置为空,任务报错(803错误) - - - 注意:此 skipIf 配置与接收端的 skipIf(默认为空)相互独立 - -- 对于 root.__system, root.__audit 均不会同步 - -3. Pipe 接收端类型自动转换 - -当 Pipe 向接收端写入数据因字段类型不匹配而失败时,IoTDB 可按照目标端已有 schema 的字段类型对数据进行转换,并重试写入,以提高同步成功率。该能力由 `sink.exception.data.convert-on-type-mismatch` 控制,参数说明见后文 sink 参数表。 - -类型不匹配时的转换规则如下: - -| 源类型 | 目标类型 | 转换规则 | -| -------------------- | ----------- | -------------------------------------------------------------------------------- | -| 数值类型 | 数值类型 | 按目标数值类型进行转换,可能发生截断、精度损失或溢出。 | -| 数值类型 | BOOLEAN | `0`转换为`false`,非`0`转换为`true`。 | -| BOOLEAN | 数值类型 | `true`转换为`1`,`false`转换为`0`。 | -| TEXT、STRING、BLOB | BOOLEAN | 按字符串解析为 BOOLEAN。 | -| TEXT、STRING、BLOB | 数值类型 | 按字符串解析为目标数值类型;解析失败时写入默认值`0`、`0L`或`0.0`。 | -| TEXT、STRING、BLOB | TIMESTAMP | 按字符串解析为 TIMESTAMP;解析失败时写入默认值`0L`。 | -| TEXT、STRING、BLOB | DATE | 按字符串解析为 DATE;解析失败时写入默认日期`1970-01-01`。 | -| 非法数值 | DATE | 若无法转换为合法 DATE,则写入默认日期`1970-01-01`。 | -| DATE | TIMESTAMP | 按 UTC 转换为当天零点对应的时间戳。 | -| TIMESTAMP | DATE | 按 UTC 转换为对应日期。 | - -> 注意:自动转换基于目标端已有 schema 执行,不会自动修改目标端 schema;该能力优先保证同步继续进行,可能导致精度损失或默认值写入。 - - - -## 2. 
使用说明 - -数据同步任务有三种状态:RUNNING、STOPPED 和 DROPPED。任务状态转换如下图所示: - -![](/img/Data-Sync01.png) - -创建后任务会直接启动,同时当任务发生异常停止后,系统会自动尝试重启任务。 - -提供以下 SQL 语句对同步任务进行状态管理。 - -### 2.1 创建任务 - -使用 `CREATE PIPE` 语句来创建一条数据同步任务,下列属性中`PipeId`和`sink`必填,`source`和`processor`为选填项,输入 SQL 时注意 `SOURCE`与 `SINK` 插件顺序不能替换。 - -SQL 示例如下: - -```SQL -CREATE PIPE [IF NOT EXISTS] -- PipeId 是能够唯一标定任务的名字 --- 数据抽取插件,可选插件 -WITH SOURCE ( - [ = ,], -) --- 数据处理插件,可选插件 -WITH PROCESSOR ( - [ = ,], -) --- 数据连接插件,必填插件 -WITH SINK ( - [ = ,], -) -``` - -**IF NOT EXISTS 语义**:用于创建操作中,确保当指定 Pipe 不存在时,执行创建命令,防止因尝试创建已存在的 Pipe 而导致报错。 - -**注意**:V2.0.8 起,创建一个全量数据同步 Pipe (例如 Pipeid : `alldatapipe`)时,系统会自动将其拆分为两个独立的 Pipe: - -* 历史 Pipe:PipeId 为原名称加 _history后缀(如 `alldatapipe_history`),source 参数默认携带 `'realtime.enable'='false', 'inclusion'='data.insert', 'inclusion.exclusion'=''` - -* 实时 Pipe:PipeId 为原名称加 _realtime后缀(如 `alldatapipe_realtime`),source 参数默认携带 `'history.enable'='false'` ,若配置了元数据同步,则由实时 Pipe 负责发送 - -创建成功后,原 PipeId(如 `alldatapipe`)将不再作为有效标识符。在进行启动、停止、删除、查看等任务操作时,必须使用拆分后的独立 PipeId(即 `*_history`或 `*_realtime`)。操作示例见[查看任务](./Data-Sync_timecho.md#_2-5-查看任务)小节 - - -### 2.2 开始任务 - -开始处理数据: - -```SQL -START PIPE -``` - -### 2.3 停止任务 - -停止处理数据: - -```SQL -STOP PIPE -``` - -### 2.4 删除任务 - -删除指定任务: - -```SQL -DROP PIPE [IF EXISTS] -``` - -**IF EXISTS 语义**:用于删除操作中,确保当指定 Pipe 存在时,执行删除命令,防止因尝试删除不存在的 Pipe 而导致报错。 - -删除任务不需要先停止同步任务。 - -### 2.5 查看任务 - -查看全部任务: - -```SQL -SHOW PIPES -``` - -查看指定任务: - -```SQL -SHOW PIPE -``` - - pipe 的 show pipes 结果示例: - -```SQL -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State|PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| 
-+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -|59abf95db892428b9d01c5fa318014ea|2024-06-17T14:03:44.189|RUNNING| {}| {}|{sink=iotdb-thrift-sink, sink.ip=127.0.0.1, sink.port=6668}| | 128| 1.03| -+--------------------------------+-----------------------+-------+----------+-------------+-----------------------------------------------------------+----------------+-------------------+-------------------------+ -``` - -其中各列含义如下: - -- **ID**:同步任务的唯一标识符 -- **CreationTime**:同步任务的创建的时间 -- **State**:同步任务的状态 -- **PipeSource**:同步数据流的来源 -- **PipeProcessor**:同步数据流在传输过程中的处理逻辑 -- **PipeSink**:同步数据流的目的地 -- **ExceptionMessage**:显示同步任务的异常信息 -- **RemainingEventCount(统计存在延迟)**:剩余 event 数,当前数据同步任务中的所有 event 总数,包括数据和元数据同步的 event,以及系统和用户自定义的 event。 -- **EstimatedRemainingSeconds(统计存在延迟)**:剩余时间,基于当前 event 个数和 pipe 处速率,预估完成传输的剩余时间。 - -示例: - -在 V2.0.8 及之后的版本中,创建一个全量数据同步任务,并查看该任务详情 - -```sql -IoTDB> create pipe alldatapipe with source('inclusion'='all','exclusion'='auth') with sink('node-urls'='127.0.0.1:6668') - -IoTDB> show pipe alldatapipe_history -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_history|2025-12-18T15:06:16.697|RUNNING|{exclusion=auth, history.enable=true, inclusion=data.insert, 
inclusion.exclusion=, realtime.enable=false}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+-------------------+-----------------------+-------+---------------------------------------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ - -IoTDB> show pipe alldatapipe_realtime -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -| ID| CreationTime| State| PipeSource|PipeProcessor| PipeSink|ExceptionMessage|RemainingEventCount|EstimatedRemainingSeconds| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -|alldatapipe_realtime|2025-12-18T15:06:16.312|RUNNING|{exclusion=auth, history.enable=false, inclusion=all, realtime.enable=true}| {}|{node-urls=127.0.0.1:6668}| | 0| 0.00| -+--------------------+-----------------------+-------+---------------------------------------------------------------------------+-------------+--------------------------+----------------+-------------------+-------------------------+ -``` - - -### 2.6 修改任务 - -`ALTER PIPE` 语句用于动态更新已存在的 PIPE,支持修改或替换 source、processor 及 sink 的配置。 - -```SQL -ALTER PIPE [IF EXISTS] - MODIFY/REPLACE SOURCE(...) - MODIFY/REPLACE PROCESSOR(...) - MODIFY/REPLACE SINK(...) 
-``` - -说明: - -* 执行操作不会改变 PIPE 的运行状态,等价于保留原 PipeId 的处理进度,在原进度位置创建新 PIPE。 -* source/processor/sink 的 modify/replace 参数均为非必填;若未指定任何修改参数,等价于删除当前 PIPE 后,按原配置和进度重新创建。 -* 对于指定 modify 的插件,保留该插件其他参数,仅替换或新增给定的参数。 -* 对于指定 replace 的插件,直接替换该插件所有参数。 -* 当使用 [IF EXISTS] 关键字时,即使不存在同名的 Pipe 也会返回执行成功,但是实际未执行任何操作。 - -示例: - -```SQL -ALTER PIPE A2B REPLACE SINK ('sink'='iotdb-thrift-sink', 'node-urls' = '127.0.0.1:6668'); -``` - -### 2.7 同步插件 - -为了使得整体架构更加灵活以匹配不同的同步场景需求,我们支持在同步任务框架中进行插件组装。系统为您预置了一些常用插件可直接使用,同时您也可以自定义 processor 插件 和 Sink 插件,并加载至 IoTDB 系统进行使用。查看系统中的插件(含自定义与内置插件)可以用以下语句: - -```SQL -SHOW PIPEPLUGINS -``` - -返回结果如下: - -```SQL -IoTDB> SHOW PIPEPLUGINS -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| PluginName|PluginType| ClassName| PluginJar| -+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ -| DO-NOTHING-PROCESSOR| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.processor.donothing.DoNothingProcessor| | -| DO-NOTHING-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.donothing.DoNothingConnector| | -| IOTDB-AIR-GAP-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.airgap.IoTDBAirGapConnector| | -| IOTDB-SOURCE| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.extractor.iotdb.IoTDBExtractor| | -| IOTDB-THRIFT-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftConnector| | -| IOTDB-THRIFT-SSL-SINK| Builtin| org.apache.iotdb.commons.pipe.plugin.builtin.connector.iotdb.thrift.IoTDBThriftSslConnector| | 
-+------------------------------+----------+--------------------------------------------------------------------------------------------------+----------------------------------------------------+ - -``` - -预置插件详细介绍如下(各插件的详细参数可参考本文[参数说明](#参考参数说明)): - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
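-| Type | Custom plugin | Plugin name | Description | Applicable version |
-| ---- | ------------- | ----------- | ----------- | ------------------ |
-| source plugin | Not supported | iotdb-source | Default extractor plugin, used to extract historical or realtime data from IoTDB | 1.2.x |
-| processor plugin | Supported | do-nothing-processor | Default processor plugin; does not process incoming data in any way | 1.2.x |
-| sink plugin | Supported | do-nothing-sink | Does not process outgoing data in any way | 1.2.x |
-| sink plugin | Supported | iotdb-thrift-sink | Default sink plugin (V1.3.1 and above), used for data transfer between IoTDB (V1.2.0 and above) and IoTDB (V1.2.0 and above). Transfers data with the Thrift RPC framework using a multi-threaded async non-blocking IO model; transfer performance is high, especially when the target is distributed | 1.2.x |
-| sink plugin | Supported | iotdb-air-gap-sink | Used for data synchronization from IoTDB (V1.2.2 and above) to IoTDB (V1.2.2 and above) across a unidirectional data gateway. Supported gateway models include NARI Syskeeper 2000 | 1.2.x |
-| sink plugin | Supported | iotdb-thrift-ssl-sink | Used for data transfer between IoTDB (V1.3.1 and above) and IoTDB (V1.2.0 and above). Transfers data with the Thrift RPC framework using a single-threaded sync blocking IO model; suited to scenarios with higher security requirements | 1.3.1+ |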
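-
-As a minimal sketch of how the built-in plugins listed above are assembled, the statement below names one plugin of each type explicitly. The pipe name `demo_pipe` and the target address are placeholders chosen for illustration, and the `'processor'` key is assumed to follow the same pattern as the `'source'` and `'sink'` keys used elsewhere in this document; check the parameter reference before relying on it.
-
-```SQL
--- Hypothetical pipe name and address, for illustration only
-CREATE PIPE demo_pipe
-WITH SOURCE (
-  'source' = 'iotdb-source'              -- built-in extractor plugin
-)
-WITH PROCESSOR (
-  'processor' = 'do-nothing-processor'   -- built-in processor, passes data through unchanged (assumed key name)
-)
-WITH SINK (
-  'sink' = 'iotdb-thrift-sink',          -- built-in sink plugin
-  'node-urls' = '127.0.0.1:6668'         -- data service port of a DataNode on the target IoTDB
-)
-```
-
-Because `iotdb-source` and `do-nothing-processor` are described as the default plugins, leaving out the SOURCE and PROCESSOR clauses would presumably behave the same way.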
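-
-To import custom plugins, see the [Stream Processing Framework](./Streaming_timecho.md#自定义流处理插件管理) chapter.
-
-## 3. Usage Examples
-
-### 3.1 Full Data Synchronization
-
-This example demonstrates synchronizing all data from one IoTDB to another. The data link is shown below:
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png)
-
-In this example, we create a synchronization task named A2B that synchronizes all data from IoTDB A to IoTDB B. It uses the built-in iotdb-thrift-sink plugin as the sink, and node-urls must be set to the URL of the data service port of a DataNode in the target IoTDB, as in the following statement:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### 3.2 Partial Data Synchronization
-
-This example demonstrates synchronizing the data of a historical time range (08:00 on 23 August 2023 to 08:00 on 23 October 2023) to another IoTDB. The data link is shown below:
-
-![](/img/%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A51.png)
-
-In this example, we create a synchronization task named A2B. First, the range of data to transfer is defined in the source. Because the data is historical (historical data is data that existed before the synchronization task was created), the start and end times start-time and end-time and the transfer mode must be configured. node-urls is set to the URL of the data service port of a DataNode in the target IoTDB.
-
-The full statement is:
-
-```SQL
-create pipe A2B
-WITH SOURCE (
-  'source'= 'iotdb-source',
-  'realtime.mode' = 'stream', -- extraction mode for newly inserted data (after the pipe is created)
-  'path' = 'root.vehicle.**', -- scope of the data to synchronize
-  'start-time' = '2023.08.23T08:00:00+00:00', -- start event time of the data to synchronize, inclusive
-  'end-time' = '2023.10.23T08:00:00+00:00' -- end event time of the data to synchronize, inclusive
-)
-with SINK (
-  'sink'='iotdb-thrift-async-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### 3.3 Bidirectional Data Transfer
-
-This example demonstrates two IoTDB instances acting as an active-active pair. The data link is shown below:
-
-![](/img/1706698592139.jpg)
-
-In this example, to avoid an infinite data loop, the `forwarding-pipe-requests` parameter must be set to `false` on both A and B, meaning that data arriving through another pipe is not forwarded. To keep the data on both sides consistent, each pipe must also set `inclusion=all` so that all data and schema are synchronized.
-
-The full statements are:
-
-Execute the following statement on IoTDB A:
-
-```SQL
-create pipe AB
-with source (
-  'inclusion'='all', -- synchronize all data, schema, and permissions
-  'forwarding-pipe-requests' = 'false' -- do not forward data written by other pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-Execute the following statement on IoTDB B:
-
-```SQL
-create pipe BA
-with source (
-  'inclusion'='all', -- synchronize all data, schema, and permissions
-  'forwarding-pipe-requests' 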
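= 'false' -- do not forward data written by other pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6667', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-### 3.4 Edge-Cloud Data Transfer
-
-This example demonstrates edge-to-cloud transfer among several IoTDB instances: clusters B, C, and D each synchronize their data to cluster A. The data link is shown below:
-
-![](/img/dataSync03.png)
-
-In this example, to synchronize the data of clusters B, C, and D to A, the BA, CA, and DA pipes each configure `path` to limit the scope. To keep the edge and cloud sides consistent, each pipe also sets `inclusion=all` so that all data and schema are synchronized. The full statements are:
-
-Execute the following statement on IoTDB B to synchronize the data in B to A:
-
-```SQL
-create pipe BA
-with source (
-  'inclusion'='all', -- synchronize all data, schema, and permissions
-  'path'='root.db.**', -- limit the scope
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6667', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-Execute the following statement on IoTDB C to synchronize the data in C to A:
-
-```SQL
-create pipe CA
-with source (
-  'inclusion'='all', -- synchronize all data, schema, and permissions
-  'path'='root.db.**', -- limit the scope
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-Execute the following statement on IoTDB D to synchronize the data in D to A:
-
-```SQL
-create pipe DA
-with source (
-  'inclusion'='all', -- synchronize all data, schema, and permissions
-  'path'='root.db.**', -- limit the scope
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6669', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### 3.5 Cascading Data Transfer
-
-This example demonstrates cascading transfer among several IoTDB instances: data is synchronized from cluster A to cluster B and then on to cluster C. The data link is shown below:
-
-![](/img/1706698610134.jpg)
-
-In this example, to get the data of cluster A to C, the pipe between B and C must set `forwarding-pipe-requests` to `true`. The full statements are:
-
-Execute the following statement on IoTDB A to synchronize the data in A to B:
-
-```SQL
-create pipe AB
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-Execute the following statement on IoTDB B to synchronize the data in B to C:
-
-```SQL
-create pipe BC
-with source (
-  'forwarding-pipe-requests' = 'true' -- forward data written by other pipes
-)
-with sink (
-  'sink'='iotdb-thrift-sink',
-  'node-urls' = '127.0.0.1:6669', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-
-### 3.6 Cross-Gateway Data Transfer
-
-This example demonstrates synchronizing the data of one IoTDB to another IoTDB through a unidirectional gateway. The data link is shown below:
-
-![](/img/cross-network-gateway.png)
-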
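-In this example, the iotdb-air-gap-sink plugin is used as the sink. After the gateway is configured, execute the following statement on IoTDB A, where node-urls is the URL, as configured on the gateway, of the data service port of a DataNode in the target IoTDB:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-air-gap-sink',
-  'node-urls' = '10.53.53.53:9780', -- URL of the data service port of a DataNode in the target IoTDB
-)
-```
-**Note:**
-* When a pipe is created for cross-gateway synchronization, the target user must already exist on the receiver. If the user is missing when the pipe is created, data sent before the user was added cannot be synchronized even after the user is created later.
-* The currently supported gateway models are listed in the table below
-> For other gateway models, contact Timecho business staff to confirm whether they are supported.
-
-| Gateway type | Gateway model | Return packet limit | Send limit |
-| ------------ | -------------------------------------------- | ----------------- | --------------- |
-| Forward | 南瑞 (NARI) Syskeeper-2000, forward type | All-0 / all-1 bytes | No limit |
-| Forward | 许继 (XJ) in-house gateway | All-0 / all-1 bytes | No limit |
-| Direction not marked | 威努特 security isolation and information exchange system | No limit | No limit |
-| Forward | 科东 StoneWall-2000 network security isolation device (forward type) | No limit | No limit |
-| Reverse | 南瑞 (NARI) Syskeeper-2000, reverse type | All-0 / all-1 bytes | Must conform to the E language format |
-| Direction not marked | 迪普科技 ISG5000 | No limit | No limit |
-| Direction not marked | 熙羚 security isolation and information exchange system XL-GAP | No limit | No limit |
-
-### 3.7 Compressed Synchronization
-
-IoTDB supports specifying a data compression method during synchronization. Configure the `compressor` parameter to compress data in real time as it is transferred. `compressor` currently supports five algorithms, snappy / gzip / lz4 / zstd / lzma2; several algorithms can be combined, and they are applied in the configured order. `rate-limit-bytes-per-second` (supported since V1.3.3) sets the maximum number of bytes that may be transferred per second, counted after compression; a value less than 0 means no limit.
-
-For example, create a synchronization task named A2B:
-
-```SQL
-create pipe A2B
-with sink (
-  'node-urls' = '127.0.0.1:6668', -- URL of the data service port of a DataNode in the target IoTDB
-  'compressor' = 'snappy,lz4', -- compression algorithms, applied in the configured order
-  'rate-limit-bytes-per-second'='1048576' -- maximum number of bytes that may be transferred per second
-)
-```
-
-### 3.8 Encrypted Synchronization
-
-IoTDB supports SSL encryption during synchronization so that data is transferred securely between IoTDB instances. Configuring the SSL-related parameters, such as the certificate path (`ssl.trust-store-path`) and password (`ssl.trust-store-pwd`), ensures that data is protected by SSL during synchronization.
-
-For example, create a synchronization task named A2B:
-
-```SQL
-create pipe A2B
-with sink (
-  'sink'='iotdb-thrift-ssl-sink',
-  'node-urls'='127.0.0.1:6667', -- URL of the data service port of a DataNode in the target IoTDB
-  'ssl.trust-store-path'='pki/trusted', -- path of the trust store certificate needed to connect to the target DataNode
-  'ssl.trust-store-pwd'='root' -- password of the trust store certificate needed to connect to the target DataNode
-)
-```
-
-## 4. 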
参考:注意事项 - -可通过修改 IoTDB 配置文件(`iotdb-system.properties`)以调整数据同步的参数,如同步数据存储目录等。完整配置如下:: - -V1.3.3+: - -```Properties -# pipe_receiver_file_dir -# If this property is unset, system will save the data in the default relative path directory under the IoTDB folder(i.e., %IOTDB_HOME%/${cn_system_dir}/pipe/receiver). -# If it is absolute, system will save the data in the exact location it points to. -# If it is relative, system will save the data in the relative path directory it indicates under the IoTDB folder. -# Note: If pipe_receiver_file_dir is assigned an empty string(i.e.,zero-size), it will be handled as a relative path. -# effectiveMode: restart -# For windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is absolute. Otherwise, it is relative. -# pipe_receiver_file_dir=data\\confignode\\system\\pipe\\receiver -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_receiver_file_dir=data/confignode/system/pipe/receiver - -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# effectiveMode: first_start -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# effectiveMode: restart -# Datatype: int -pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. 
-# effectiveMode: restart -# Datatype: int -pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# effectiveMode: restart -# Datatype: int -pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# effectiveMode: restart -# Datatype: int -pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# effectiveMode: restart -# Datatype: Boolean -pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# Datatype: int -# effectiveMode: restart -pipe_air_gap_receiver_port=9780 - -# The total bytes that all pipe sinks can transfer per second. -# When given a value less than or equal to 0, it means no limit. -# default value is -1, which means no limit. -# effectiveMode: hot_reload -# Datatype: double -pipe_all_sinks_rate_limit_bytes_per_second=-1 -``` - -## 5. 
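Reference: Parameter Description
-
-### 5.1 source parameters
-
-| Parameter | Description | Value range | Required | Default |
-|--------------------------|-------------|-------------|----------|---------|
-| source | iotdb-source | String: iotdb-source | Required | - |
-| inclusion | Specifies the scope to synchronize in the data synchronization task, divided into data, schema, and auth | String: all, data(insert,delete), schema(database,timeseries,ttl), auth | Optional | data.insert |
-| inclusion.exclusion | Excludes specific operations from the scope specified by inclusion, reducing the amount of data synchronized | String: all, data(insert,delete), schema(database,timeseries,ttl), auth | Optional | Empty string |
-| mode.streaming | Specifies the capture source for time-series data writes. It applies to scenarios where `mode.streaming` is `false` and determines the capture source of the `data.insert` data in `inclusion`. Two capture strategies are provided: true: dynamically chooses the capture type. Based on downstream processing speed, the system adaptively chooses whether to capture every write request or only the sealing requests of TsFile files. When downstream processing is fast, write requests are captured preferentially to reduce latency; when it is slow, only file sealing requests are captured to avoid a processing backlog. This mode suits most scenarios and balances processing latency against throughput. false: fixed batch capture. Only the sealing requests of TsFile files are captured, which suits resource-constrained scenarios and lowers system load. Note that snapshot data captured when the pipe starts is provided to downstream processing only as files. | Boolean: true / false | No | true |
-| mode.strict | Whether data must be filtered strictly by the conditions when the time / path / database-name / table-name parameters are used: `true`: strict filtering. The system filters captured data exactly by the given conditions, so only data that meets them is selected. `false`: non-strict filtering. The system may include some extra data when filtering captured data; this suits performance-sensitive scenarios and reduces CPU and IO consumption. | Boolean: true / false | No | true |
-| mode.snapshot | Determines how time-series data is captured and affects the `data` data in `inclusion`. Two modes are provided: `true`: static data capture. A one-time data snapshot is captured when the pipe starts. Once the snapshot data has been fully consumed, **the pipe terminates automatically (the DROP PIPE SQL is executed automatically)**. `false`: dynamic data capture. In addition to the snapshot captured when the pipe starts, subsequent data changes are captured continuously. The pipe keeps running to process the dynamic data stream. | Boolean: true / false | No | false |
-| path | Can be specified when the sql_dialect of the user connection is tree. For pipes carried over from an upgrade, sql_dialect defaults to tree. This parameter determines the capture scope of time-series data and affects the data in inclusion as well as some series-related schema. Data whose tree-model path is matched by path is selected into the stream-processing pipe.<br>Since V2.0.8.2, this parameter accepts multiple exact paths in one pipe, for example `'path'='root.test.d0.s1,root.test.d0.s2,root.test.d0.s3'` | String: a standard IoTDB tree path pattern, wildcards allowed | Optional | root.** |
-| start-time | Start event time of the data to synchronize, inclusive | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | Optional | Long.MIN_VALUE |
-| end-time | End event time of the data to synchronize, inclusive | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | Optional | Long.MAX_VALUE |
-| forwarding-pipe-requests | Whether to forward data written by other Pipes (usually data synchronization) | Boolean: true, false | Optional | true |
-| mods | Same as mods.enable; whether to send the mods files of tsfiles | Boolean: true / false | Optional | false |
-| skipIf | Which errors may be skipped; currently only the no-permission error | String: no-privileges | Optional | no-privileges |
-
-> 💎 **Note: the difference between the values true and false of the data extraction mode mode.streaming**
-> - **true (recommended)**: the task processes and sends data in real time; it is characterized by high timeliness and low throughput
-> - **false**: the task processes and sends data in batches (by underlying data file); it is characterized by low timeliness and high throughput
-
-
-### 5.2 sink **parameters**
-
-#### iotdb-thrift-sink
-
-| key | value | Value range | Required | Default |
-|-------------------------------------------| ------------------------------------------------------------ | ------------------------------------------------------------ | -------- |----------------------------------|
-| sink | iotdb-thrift-sink or iotdb-thrift-async-sink | String: iotdb-thrift-sink or iotdb-thrift-async-sink | Required | - |
-| node-urls | URLs of the data service ports of any number of DataNodes in the target IoTDB (note that a synchronization task cannot forward to its own service) | String. e.g. '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - |
-| user/username | Username used to connect to the receiver; synchronization requires this user to have the corresponding operation permissions | String | Optional | root |
-| password | Password of the username used to connect to the receiver; synchronization requires this user to have the corresponding operation permissions | String | Optional | TimechoDB@2021 (root before V2.0.6.x) |
-| batch.enable | Whether to enable batched log sending, which improves transfer throughput and lowers IOPS | Boolean: true, false | Optional | true |
-| batch.max-delay-seconds | Takes effect when batched log sending is enabled; the longest time a batch waits before being sent (unit: s) | Integer | Optional | 1 |
-| batch.max-delay-ms | Takes effect when batched log sending is enabled; the longest time a batch waits before being sent (unit: ms) (supported since V2.0.5) | Integer | Optional | 1 |
-| batch.size-bytes | Takes effect when batched log sending is enabled; the maximum size of a batch (unit: byte) | Long | Optional | 16*1024*1024 |
-| compressor | The rpc compression algorithm to use; several can be configured and are applied to each request in order | String: snappy / gzip / lz4 / zstd / lzma2 | Optional | "" |
-| compressor.zstd.level | When the selected rpc compression algorithm is zstd, this parameter additionally sets the zstd compression level | Int: [-131072, 22] | Optional | 3 |
-| rate-limit-bytes-per-second | Maximum number of bytes that may be transferred per second, counted after compression (if compression is used); no limit if less than 0 | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | Optional | -1 |
-| load-tsfile-strategy | When data is synchronized as files, whether the receiver waits for the result of its local load tsfile before returning the response to the sender.<br>sync: wait for the local load tsfile result;<br>async: do not wait for the local load tsfile result. | String: sync / async | Optional | sync |
-| format | Payload format for data transfer. Options:<br> - hybrid: depends on the format passed on by the processor (tsfile or tablet); the sink does no conversion.<br> - tsfile: force conversion to tsfile before sending; useful for scenarios such as data file backup.<br> - tablet: force conversion to tablet before sending; useful when the data types of sender and receiver are not fully compatible (to reduce errors). | String: hybrid / tsfile / tablet | Optional | hybrid |
-| exception.data.convert-on-type-mismatch | Whether to convert data when the receiver's type differs | Boolean: true / false | Optional | true |
-
-#### iotdb-air-gap-sink
-
-| key | value | Value range | Required | Default |
-|-------------------------------------------| ------------------------------------------------------------ | ------------------------------------------------------------ | -------- |-----------------------------------|
-| sink | iotdb-air-gap-sink | String: iotdb-air-gap-sink | Required | - |
-| node-urls | URLs of the data service ports of any number of DataNodes in the target IoTDB | String. e.g. '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - |
-| user/username | Username used to connect to the receiver; synchronization requires this user to have the corresponding operation permissions | String | Optional | root |
-| password | Password of the username used to connect to the receiver; synchronization requires this user to have the corresponding operation permissions | String | Optional | TimechoDB@2021 (root before V2.0.6.x) |
-| compressor | The rpc compression algorithm to use; several can be configured and are applied to each request in order | String: snappy / gzip / lz4 / zstd / lzma2 | Optional | "" |
-| compressor.zstd.level | When the selected rpc compression algorithm is zstd, this parameter additionally sets the zstd compression level | Int: [-131072, 22] | Optional | 3 |
-| rate-limit-bytes-per-second | Maximum number of bytes that may be transferred per second, counted after compression (if compression is used); no limit if less than 0 | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | Optional | -1 |
-| load-tsfile-strategy | When data is synchronized as files, whether the receiver waits for the result of its local load tsfile before returning the response to the sender.<br>sync: wait for the local load tsfile result;<br>async: do not wait for the local load tsfile result. | String: sync / async | Optional | sync |
-| air-gap.handshake-timeout-ms | Timeout of the handshake request when the sender and receiver first try to establish a connection, in milliseconds | Integer | Optional | 5000 |
-| exception.data.convert-on-type-mismatch | Whether to convert data when the receiver's type differs | Boolean: true / false | Optional | true |
-
-#### iotdb-thrift-ssl-sink
-
-| key | value | Value range | Required | Default |
-|--------------------------------------------| ------------------------------------------------------------ | ------------------------------------------------------------ | -------- |-----------------------------------|
-| sink | iotdb-thrift-ssl-sink | String: iotdb-thrift-ssl-sink | Required | - |
-| node-urls | URLs of the data service ports of any number of DataNodes in the target IoTDB (note that a synchronization task cannot forward to its own service) | String. e.g. '127.0.0.1:6667,127.0.0.1:6668,127.0.0.1:6669', '127.0.0.1:6667' | Required | - |
-| user/username | Username used to connect to the receiver; synchronization requires this user to have the corresponding operation permissions | String | Optional | root |
-| password | Password of the username used to connect to the receiver; synchronization requires this user to have the corresponding operation permissions | String | Optional | TimechoDB@2021 (root before V2.0.6.x) |
-| batch.enable | Whether to enable batched log sending, which improves transfer throughput and lowers IOPS | Boolean: true, false | Optional | true |
-| batch.max-delay-seconds | Takes effect when batched log sending is enabled; the longest time a batch waits before being sent (unit: s) | Integer | Optional | 1 |
-| batch.max-delay-ms | Takes effect when batched log sending is enabled; the longest time a batch waits before being sent (unit: ms) (supported since V2.0.5) | Integer | Optional | 1 |
-| batch.size-bytes | Takes effect when batched log sending is enabled; the maximum size of a batch (unit: byte) | Long | Optional | 16*1024*1024 |
-| compressor | The rpc compression algorithm to use; several can be configured and are applied to each request in order | String: snappy / gzip / lz4 / zstd / lzma2 | Optional | "" |
-| compressor.zstd.level | When the selected rpc compression algorithm is zstd, this parameter additionally sets the zstd compression level | Int: [-131072, 22] | Optional | 3 |
-| rate-limit-bytes-per-second | Maximum number of bytes that may be transferred per second, counted after compression (if compression is used); no limit if less than 0 | Double: [Double.MIN_VALUE, Double.MAX_VALUE] | Optional | -1 |
-| load-tsfile-strategy | When data is synchronized as files, whether the receiver waits for the result of its local load tsfile before returning the response to the sender.<br>sync: wait for the local load tsfile result;<br>async: do not wait for the local load tsfile result. | String: sync / async | Optional | sync |
-| ssl.trust-store-path | Path of the trust store certificate needed to connect to the target DataNode | String. e.g. 'pki/trusted' | Required | - |
-| ssl.trust-store-pwd | Password of the trust store certificate needed to connect to the target DataNode | String | Required | - |
-| format | Payload format for data transfer. Options:<br> - hybrid: depends on the format passed on by the processor (tsfile or tablet); the sink does no conversion.<br> - tsfile: force conversion to tsfile before sending; useful for scenarios such as data file backup.<br> - tablet: force conversion to tablet before sending; useful when the data types of sender and receiver are not fully compatible (to reduce errors). | String: hybrid / tsfile / tablet | Optional | hybrid |
-| exception.data.convert-on-type-mismatch | Whether to convert data when the receiver's type differs | Boolean: true / false | Optional | true |
-
diff --git a/src/zh/UserGuide/latest/User-Manual/Data-subscription_timecho.md b/src/zh/UserGuide/latest/User-Manual/Data-subscription_timecho.md
deleted file mode 100644
index 77267b8e7..000000000
--- a/src/zh/UserGuide/latest/User-Manual/Data-subscription_timecho.md
+++ /dev/null
@@ -1,144 +0,0 @@
-# Data Subscription
-
-## 1. Feature Introduction
-
-The IoTDB data subscription module (also called the IoTDB subscription client) is supported from IoTDB V1.3.3 onward. It gives users a streaming way of consuming data that is distinct from querying. It borrows the basic concepts and logic of message queue products such as Kafka and **provides data subscription and consumption interfaces**, but it is not meant to fully replace those products; its aim is a more convenient subscription service for scenarios that only need simple streaming access to data.
-
-Consuming data with the IoTDB subscription client has clear advantages in the following scenarios:
-
-1. **Continuously obtaining the latest data**: subscribing is more real-time than periodic querying, simpler to program against, and lighter on the system;
-2. **Simplifying data push to third-party systems**: there is no need to develop push components for different systems inside IoTDB. Streaming access can be implemented inside the third-party system, which makes it easier to send data to systems such as Flink, Kafka, DataX, Camel, MySQL, and PG.
-
-## 2. Main Concepts
-
-The IoTDB subscription client has 3 core concepts: Topic, Consumer, and Consumer Group. Their relationship is shown in the figure below
-
-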
- -
- -1. **Topic(主题)**: Topic 是 IoTDB 的数据空间,由路径和时间范围表示(如 root.** 的全时间范围)。消费者可以订阅这些主题的数据(当前已有的和未来写入的)。不同于 Kafka,IoTDB 可在数据入库后再创建 Topic,且输出格式可选择 Message 或 TsFile 两种。 - -2. **Consumer(消费者)**: Consumer 是 IoTDB 的订阅客户端,负责接收和处理发布到特定 Topic 的数据。Consumer 从队列中获取数据并进行相应的处理。在 IoTDB 订阅客户端中提供了两种类型的 Consumers: - - 一种是 `SubscriptionPullConsumer`,对应的是消息队列中的 pull 消费模式,用户代码需要主动调用数据获取逻辑 - - 一种是 `SubscriptionPushConsumer`,对应的是消息队列中的 push 消费模式,用户代码由新到达的数据事件触发 - -3. **Consumer Group(消费者组)**: Consumer Group 是一组 Consumers 的集合,拥有相同 Consumer Group ID 的消费者属于同一个消费者组。Consumer Group 有以下特点: - - Consumer Group 与 Consumer 为一对多的关系。即一个 consumer group 中的 consumers 可以有任意多个,但不允许一个 consumer 同时加入多个 consumer groups - - 允许一个 Consumer Group 中有不同类型的 Consumer(`SubscriptionPullConsumer` 和 `SubscriptionPushConsumer`) - - 一个 topic 不需要被一个 consumer group 中的所有 consumer 订阅 - - 当同一个 Consumer Group 中不同的 Consumers 订阅了相同的 Topic 时,该 Topic 下的每条数据只会被组内的一个 Consumer 处理,确保数据不会被重复处理 - -## 3. SQL 语句 - -### 3.1 Topic 管理 - -IoTDB 支持通过 SQL 语句对 Topic 进行创建、删除、查看操作。Topic状态变化如下图所示: - -
- -
- -#### 3.1.1 创建 Topic - -SQL 语句为: - -```SQL - CREATE TOPIC [IF NOT EXISTS] - WITH ( - [ = ,], - ); -``` -**IF NOT EXISTS 语义**:用于创建操作中,确保当指定 Topic 不存在时,执行创建命令,防止因尝试创建已存在的 Topic 而导致报错。 - -各参数详细解释如下: - -| 参数 | 是否必填(默认值) | 参数含义 | -| :-------------------------------------------- | :--------------------------------- | :----------------------------------------------------------- | -| **path** | optional: `root.**` | topic 对应订阅数据时间序列的路径 path,表示一组需要订阅的时间序列集合 | -| **start-time** | optional: `MIN_VALUE` | topic 对应订阅数据时间序列的开始时间(event time)可以为 ISO 格式,例如 2011-12-03T10:15:30 或 2011-12-03T10:15:30+01:00也可以为 long 值,含义为裸时间戳,单位与数据库时间戳精度一致支持特殊 value **`now`**,含义为 topic 的创建时间,当 start-time 为 `now` 且 end-time 为 MAX_VALUE 时表示只订阅实时数据 | -| **end-time** | optional: `MAX_VALUE` | topic 对应订阅数据时间序列的结束时间(event time)可以为 ISO 格式,例如 2011-12-03T10:15:30 或 2011-12-03T10:15:30+01:00也可以为 long 值,含义为裸时间戳,单位与数据库时间戳精度一致支持特殊 value `now`,含义为 topic 的创建时间,当 end-time 为 `now` 且 start-time 为 MIN_VALUE 时表示只订阅历史数据 | -| **processor** | optional: `do-nothing-processor` | processor 插件名及其参数配置,表示对原始订阅数据应用的自定义处理逻辑,可以通过类似 pipe processor 插件的方式指定 | -| **format** | optional: `SessionDataSetsHandler` | 表示从该主题订阅出的数据呈现形式,目前支持下述两种数据形式:`SessionDataSetsHandler`:使用 `SubscriptionSessionDataSetsHandler` 获取从该主题订阅出的数据,消费者可以按行消费每条数据`TsFileHandler`:使用 `SubscriptionTsFileHandler` 获取从该主题订阅出的数据,消费者可以直接订阅到存储相应数据的 TsFile | -| **mode** **(v1.3.3.2 及之后版本支持)** | option: `live` | topic 对应的订阅模式,有两个选项:`live`:订阅该主题时,订阅的数据集模式为动态数据集,即可以不断消费到最新的数据`snapshot`:consumer 订阅该主题时,订阅的数据集模式为静态数据集,即 consumer group 订阅该主题的时刻(不是创建主题的时刻)数据的 snapshot;形成订阅后的静态数据集不支持 TTL | -| **loose-range** **(v1.3.3.2 及之后版本支持)** | option: `""` | String: 是否严格按照 path 和 time range 来筛选该 topic 对应的数据,例如:`""`:严格按照 path 和 time range 来筛选该 topic 对应的数据`"time"`:不严格按照 time range 来筛选该 topic 对应的数据(粗筛);严格按照 path 来筛选该 topic 对应的数据`"path"`:不严格按照 path 来筛选该 topic 对应的数据(粗筛);严格按照 time range 来筛选该 topic 对应的数据`"time, path"` / `"path, time"` / `"all"`:不严格按照 path 和 time range 来筛选该 topic 对应的数据(粗筛) | - -示例如下: - 
-```SQL --- 全量订阅 -CREATE TOPIC root_all; - --- 自定义订阅 -CREATE TOPIC IF NOT EXISTS db_timerange -WITH ( - 'path' = 'root.db.**', - 'start-time' = '2023-01-01', - 'end-time' = '2023-12-31' -); -``` - -#### 3.1.2 删除 Topic - -Topic 在没有被订阅的情况下,才能被删除,Topic 被删除时,其相关的消费进度都会被清理 - -```SQL -DROP TOPIC [IF EXISTS] ; -``` - -**IF EXISTS 语义**:用于删除操作中,确保当指定 Topic 存在时,执行删除命令,防止因尝试删除不存在的 Topic 而导致报错。 - -#### 3.1.3 查看 Topic - -```SQL -SHOW TOPICS; -SHOW TOPIC ; -``` - -结果集: - -```SQL -[TopicName|TopicConfigs] -``` - -- TopicName:主题 ID -- TopicConfigs:主题配置 - -### 3.2 查看订阅状态 - -查看所有订阅关系: - -```SQL --- 查询所有的 topics 与 consumer group 的订阅关系 -SHOW SUBSCRIPTIONS --- 查询某个 topic 下所有的 subscriptions -SHOW SUBSCRIPTIONS ON -``` - -结果集: - -```SQL -[TopicName|ConsumerGroupName|SubscribedConsumers] -``` - -- TopicName:主题 ID -- ConsumerGroupName:用户代码中指定的消费者组 ID -- SubscribedConsumers:该消费者组中订阅了该主题的所有客户端 ID - -## 4. API 接口 - -除 SQL 语句外,IoTDB 还支持通过 Java 原生接口使用数据订阅功能。详细语法参见页面:Java 原生接口([链接](../API/Programming-Java-Native-API_timecho))。 - -## 5. 常见问题 - -### 5.1 IoTDB 数据订阅与 Kafka 的区别是什么? - -1. 消费有序性 - -- **Kafka 保证消息在单个 partition 内是有序的**,当某个 topic 仅对应一个 partition 且只有一个 consumer 订阅了这个 topic,即可保证该 consumer(单线程) 消费该 topic 数据的顺序即为数据写入的顺序。 -- IoTDB 订阅客户端**不保证** consumer 消费数据的顺序即为数据写入的顺序,但会尽量反映数据写入的顺序。 - -2. 消息送达语义 - -- Kafka 可以通过配置实现 Producer 和 Consumer 的 Exactly once 语义。 -- IoTDB 订阅客户端目前无法提供 Consumer 的 Exactly once 语义。 \ No newline at end of file diff --git a/src/zh/UserGuide/latest/User-Manual/IoTDB-View_timecho.md b/src/zh/UserGuide/latest/User-Manual/IoTDB-View_timecho.md deleted file mode 100644 index c3fa738ee..000000000 --- a/src/zh/UserGuide/latest/User-Manual/IoTDB-View_timecho.md +++ /dev/null @@ -1,547 +0,0 @@ - - -# 视图 - -## 1. 序列视图应用背景 - -### 1.1 应用场景1 时间序列重命名(PI资产管理) - -实际应用中,采集数据的设备可能使用人类难以理解的标识号来命名,这给业务层带来了查询上的困难。 - -而序列视图能够重新组织管理这些序列,在不改变原有序列内容、无需新建或拷贝序列的情况下,使用新的模型结构来访问他们。 - -**例如**:一台云端设备使用自己的网卡MAC地址组成实体编号,存储数据时写入如下时间序列:`root.db.0800200A8C6D.xvjeifg`. 
- -对于用户来说,它是难以理解的。但此时,用户能够使用序列视图功能对它重命名,将它映射到一个序列视图中去,使用`root.view.device001.temperature`来访问采集到的数据。 - -### 1.2 应用场景2 简化业务层查询逻辑 - -有时用户有大量设备,管理着大量时间序列。在进行某项业务时,用户希望仅处理其中的部分序列,此时就可以通过序列视图功能挑选出关注重点,方便反复查询、写入。 - -**例如**:用户管理一条产品流水线,各环节的设备有大量时间序列。温度检测员仅需要关注设备温度,就可以抽取温度相关的序列,组成序列视图。 - -### 1.3 应用场景3 辅助权限管理 - -生产过程中,不同业务负责的范围一般不同,出于安全考虑往往需要通过权限管理来限制业务员的访问范围。 - -**例如**:安全管理部门现在仅需要监控某生产线上各设备的温度,但这些数据与其他机密数据存放在同一数据库。此时,就可以创建若干新的视图,视图中仅含有生产线上与温度有关的时间序列,接着,向安全员只赋予这些序列视图的权限,从而达到权限限制的目的。 - -### 1.4 设计序列视图功能的动机 - -结合上述两类使用场景,设计序列视图功能的动机,主要有: - -1. 时间序列重命名。 -2. 简化业务层查询逻辑。 -3. 辅助权限管理,通过视图向特定用户开放数据。 - -## 2. 序列视图概念 - -### 2.1 术语概念 - -约定:若无特殊说明,本文档所指定的视图均是**序列视图**,未来可能引入设备视图等新功能。 - -### 2.2 序列视图 - -序列视图是一种组织管理时间序列的方式。 - -在传统关系型数据库中,数据都必须存放在一个表中,而在IoTDB等时序数据库中,序列才是存储单元。因此,IoTDB中序列视图的概念也是建立在序列上的。 - -一个序列视图就是一条虚拟的时间序列,每条虚拟的时间序列都像是一条软链接或快捷方式,映射到某个视图外部的序列或者某种计算逻辑。换言之,一个虚拟序列要么映射到某个确定的外部序列,要么由多个外部序列运算得来。 - -用户可以使用复杂的SQL查询创建视图,此时序列视图就像一条被存储的查询语句,当从视图中读取数据时,就把被存储的查询语句作为数据来源,放在FROM子句中。 - -### 2.3 别名序列 - -在序列视图中,有一类特殊的存在,他们满足如下所有条件: - -1. 数据来源为单一的时间序列 -2. 没有任何计算逻辑 -3. 没有任何筛选条件(例如无WHERE子句的限制) - -这样的序列视图,被称为**别名序列**,或别名序列视图。不完全满足上述所有条件的序列视图,就称为非别名序列视图。他们之间的区别是:只有别名序列支持写入功能。 - -**所有序列视图包括别名序列目前均不支持触发器功能(Trigger)。** - -### 2.4 嵌套视图 - -用户可能想从一个现有的序列视图中选出若干序列,组成一个新的序列视图,就称之为嵌套视图。 - -**当前版本不支持嵌套视图功能**。 - -### 2.5 IoTDB中对序列视图的一些约束 - -#### 限制1 序列视图必须依赖于一个或者若干个时间序列 - -一个序列视图有两种可能的存在形式: - -1. 它映射到一条时间序列 -2. 
它由一条或若干条时间序列计算得来 - -前种存在形式已在前文举例,易于理解;而此处的后一种存在形式,则是因为序列视图允许计算逻辑的存在。 - -比如,用户在同一个锅炉安装了两个温度计,现在需要计算两个温度值的平均值作为测量结果。用户采集到的是如下两个序列:`root.db.d01.temperature01`、`root.db.d01.temperature02`。 - -此时,用户可以使用两个序列求平均值,作为视图中的一条序列:`root.db.d01.avg_temperature`。 - -该例子会3.1.2详细展开。 - -#### 限制2 非别名序列视图是只读的 - -不允许向非别名序列视图写入。 - -只有别名序列视图是支持写入的。 - -#### 限制3 不允许嵌套视图 - -不能选定现有序列视图中的某些列来创建序列视图,无论是直接的还是间接的。 - -本限制将在3.1.3给出示例。 - -#### 限制4 序列视图与时间序列不能重名 - -序列视图和时间序列都位于同一棵树下,所以他们不能重名。 - -任何一条序列的名称(路径)都应该是唯一确定的。 - -#### 限制5 序列视图与时间序列的时序数据共用,标签等元数据不共用 - -序列视图是指向时间序列的映射,所以它们完全共用时序数据,由时间序列负责持久化存储。 - -但是它们的tag、attributes等元数据不共用。 - -这是因为进行业务查询时,面向视图的用户关心的是当前视图的结构,而如果使用group by tag等方式做查询,显然希望是得到视图下含有对应tag的分组效果,而非时间序列的tag的分组效果(用户甚至对那些时间序列毫无感知)。 - -## 3. 序列视图功能介绍 - -### 3.1 创建视图 - -创建一个序列视图与创建一条时间序列类似,区别在于需要通过AS关键字指定数据来源,即原始序列。 - -#### 创建视图的SQL - -用户可以选取一些序列创建一个视图: - -```SQL -CREATE VIEW root.view.device.status -AS - SELECT s01 - FROM root.db.device -``` - -它表示用户从现有设备`root.db.device`中选出了`s01`这条序列,创建了序列视图`root.view.device.status`。 - -序列视图可以与时间序列存在于同一实体下,例如: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device -``` - -这样,`root.db.device`下就有了`s01`的一份虚拟拷贝,但是使用不同的名字`status`。 - -可以发现,上述两个例子中的序列视图,都是别名序列,我们给用户提供一种针对该序列的更方便的创建方式: - -```SQL -CREATE VIEW root.view.device.status -AS - root.db.device.s01 -``` - -#### 创建含有计算逻辑的视图 - -沿用2.2章节限制1中的例子: - -> 用户在同一个锅炉安装了两个温度计,现在需要计算两个温度值的平均值作为测量结果。用户采集到的是如下两个序列:`root.db.d01.temperature01`、`root.db.d01.temperature02`。 -> -> 此时,用户可以使用两个序列求平均值,作为视图中的一条序列:`root.view.device01.avg_temperature`。 - -如果不使用视图,用户可以这样查询两个温度的平均值: - -```SQL -SELECT (temperature01 + temperature02) / 2 -FROM root.db.d01 -``` - -而如果使用序列视图,用户可以这样创建一个视图来简化将来的查询: - -```SQL -CREATE VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02) / 2 - FROM root.db.d01 -``` - -然后用户可以这样查询: - -```SQL -SELECT avg_temperature FROM root.db.d01 -``` - -#### 不支持嵌套序列视图 - -继续沿用3.1.2中的例子,现在用户想使用序列视图`root.db.d01.avg_temperature`创建一个新的视图,这是不允许的。我们目前不支持嵌套视图,无论它是否是别名序列,都不支持。 - 
-比如下列SQL语句会报错: - -```SQL -CREATE VIEW root.view.device.avg_temp_copy -AS - root.db.d01.avg_temperature -- 不支持。不允许嵌套视图 -``` - -#### 一次创建多条序列视图 - -一次只能指定一个序列视图对用户来说使用不方便,则可以一次指定多条序列,比如: - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - SELECT s01, s02 - FROM root.db.device -``` - -此外,上述写法可以做简化: - -```SQL -CREATE VIEW root.db.device(status, sub.hardware) -AS - SELECT s01, s02 - FROM root.db.device -``` - -上述两条语句都等价于如下写法: - -```SQL -CREATE VIEW root.db.device.status -AS - SELECT s01 - FROM root.db.device; - -CREATE VIEW root.db.device.sub.hardware -AS - SELECT s02 - FROM root.db.device -``` - -也等价于如下写法 - -```SQL -CREATE VIEW root.db.device.status, root.db.device.sub.hardware -AS - root.db.device.s01, root.db.device.s02 - --- 或者 - -CREATE VIEW root.db.device(status, sub.hardware) -AS - root.db.device(s01, s02) -``` - -##### 所有序列间的映射关系为静态存储 - -有时,SELECT子句中可能包含运行时才能确定的语句个数,比如如下的语句: - -```SQL -SELECT s01, s02 -FROM root.db.d01, root.db.d02 -``` - -上述语句能匹配到的序列数量是并不确定的,和系统状态有关。即便如此,用户也可以使用它创建视图。 - -不过需要特别注意,所有序列间的映射关系为静态存储(创建时固定)!请看以下示例: - -当前数据库中仅含有`root.db.d01.s01`、`root.db.d02.s01`、`root.db.d02.s02`三条序列,接着创建视图: - -```SQL -CREATE VIEW root.view.d(alpha, beta, gamma) -AS - SELECT s01, s02 - FROM root.db.d01, root.db.d02 -``` - -时间序列之间映射关系如下: - -| 序号 | 时间序列 | 序列视图 | -| ---- | ----------------- | ----------------- | -| 1 | `root.db.d01.s01` | root.view.d.alpha | -| 2 | `root.db.d02.s01` | root.view.d.beta | -| 3 | `root.db.d02.s02` | root.view.d.gamma | - -此后,用户新增了序列`root.db.d01.s02`,则它不对应到任何视图;接着,用户删除`root.db.d01.s01`,则查询`root.view.d.alpha`会直接报错,它也不会对应到`root.db.d01.s02`。 - -请时刻注意,序列间映射关系是静态地、固化地存储的。 - -#### 批量创建序列视图 - -现有若干个设备,每个设备都有一个温度数值,例如: - -1. root.db.d1.temperature -2. root.db.d2.temperature -3. ... 
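- -These devices may store many other series (for example `root.db.d1.speed`), but you can create a view that contains only the temperature values of these devices and ignores the other series: - -```SQL -CREATE VIEW root.db.view(${2}_temperature) -AS - SELECT temperature FROM root.db.* -``` - -This follows the naming convention of query write-back (`SELECT INTO`), which uses variable placeholders to specify naming rules. See: [Query Write-back (SELECT INTO)](../Basic-Concept/Query-Data_timecho.md#_10-查询写回-into-子句) - -Here, `root.db.*.temperature` specifies which time series are included in the view, and `${2}` specifies which node of the time series path supplies the name of the series view. - -`${2}` refers to level 2 of `root.db.*.temperature` (counting from 0), that is, whatever `*` matches; `${2}_temperature` joins that match and `temperature` with an underscore to form the node name of each series under the view. - -The view creation statement above is equivalent to the following: - -```SQL -CREATE VIEW root.db.view(${2}_${3}) -AS - SELECT temperature from root.db.* -``` - -The resulting view contains these series: - -1. root.db.view.d1_temperature -2. root.db.view.d2_temperature -3. ... - -A view created with wildcards stores only the static mapping that exists at creation time. - -#### The SELECT clause is restricted when creating a view - -The SELECT clause used to create a series view is subject to some restrictions, chiefly: - -1. The `WHERE` clause cannot be used. -2. The `GROUP BY` clause cannot be used. -3. Aggregate functions such as `MAX_VALUE` cannot be used. - -In short, only the `SELECT ... FROM ... ` structure can follow `AS`, and the result of the query must form a time series. - -### 3.2 Querying View Data - -For the supported query types, series views and time series can be used interchangeably when querying time series data, and they behave identically. - -**Series views currently do not support the following query types:** - -1. **ALIGN BY DEVICE queries** -2. 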
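**GROUP BY TAGS queries** - -Users can also query time series and series views together in the same SELECT statement, for example: - -```SQL -SELECT temperature01, temperature02, avg_temperature -FROM root.db.d01 -WHERE temperature01 < temperature02 -``` - -However, if users query the metadata of a series, such as tags and attributes, the result is the metadata of the series view, not that of the time series the view references. - -In addition, for alias series, to obtain the tags, attributes, and other metadata of the underlying time series, users must first query the mapping of the view columns to find the corresponding time series, and then query that time series for its tags and attributes. How to query the mapping of view columns is described in Section 3.5. - - -### 3.3 Modifying Views - -The supported modifications to a view are: changing its computation logic, changing its tags/attributes, and deleting it. - -#### Modifying the data source of a view - -```SQL -ALTER VIEW root.view.device.status -AS - SELECT s01 - FROM root.ln.wf.d01 -``` - -#### Modifying the computation logic of a view - -```SQL -ALTER VIEW root.db.d01.avg_temperature -AS - SELECT (temperature01 + temperature02 + temperature03) / 3 - FROM root.db.d01 -``` - -#### Tag management - -- Add new tags - -```SQL -ALTER view root.turbine.d1.s1 ADD TAGS tag3=v3, tag4=v4 -``` - -- Add new attributes - -```SQL -ALTER view root.turbine.d1.s1 ADD ATTRIBUTES attr3=v3, attr4=v4 -``` - -- Rename a tag or attribute - -```SQL -ALTER view root.turbine.d1.s1 RENAME tag1 TO newTag1 -``` - -- Reset the value of a tag or attribute - -```SQL -ALTER view root.turbine.d1.s1 SET newTag1=newV1, attr1=newV1 -``` - -- Delete existing tags or attributes - -```SQL -ALTER view root.turbine.d1.s1 DROP tag1, tag2 -``` - -- Upsert tags and attributes - -> If the tag or attribute does not exist yet, it is inserted; otherwise, the old value is updated with the new value - -```SQL -ALTER view root.turbine.d1.s1 UPSERT TAGS(tag2=newV2, tag3=v3) ATTRIBUTES(attr3=v3, attr4=v4) -``` - -#### Deleting a view - -Because a view is itself a series, it can be deleted in the same way as a time series. - -```SQL -DELETE VIEW root.view.device.avg_temperature -``` - -### 3.4 View Synchronization - -#### If the underlying source series is deleted - -When a series view is queried (at series resolution time), if the time series it depends on does not exist, **an empty result set is returned**. - -This is similar to the response for querying a nonexistent series, with one difference: if the underlying time series cannot be resolved, the empty result set includes the table header, as a hint that the view has a problem. - -In addition, when an underlying time series is deleted, the system does not check whether any view depends on it, and the user receives no warning. - -#### Writing to non-alias series is not supported - -Writing to a non-alias series view is not supported. - -For details, see Restriction 2 in Section 2.5 above. - -#### Series metadata is not shared - -For details, see Restriction 5 in Section 2.5 above. - -### 3.5 Querying View Metadata - -View metadata queries refer specifically to querying the metadata of a view itself (for example, how many columns it has) and information about the views in a database (for example, which views exist). - -#### Viewing the current view columns - -There are two ways to query: - -1. Use `SHOW TIMESERIES`. This query returns both time series and series views, but it shows only some of the attributes of a view. -2. 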
使用`SHOW VIEW`进行查询,该查询只包含序列视图。能完整显示序列视图的属性。 - -举例: - -```Shell -IoTDB> show timeseries; -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -| Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|ViewType| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.device.s01 | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.view.status | null| root.db| INT32| RLE| SNAPPY|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp01 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.temp02 | null| root.db| FLOAT| RLE| SNAPPY|null| null| null| null| BASE| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -|root.db.d01.avg_temp| null| root.db| FLOAT| null| null|null| null| null| null| VIEW| -+--------------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+--------+ -Total line number = 5 -It costs 0.789s -IoTDB> -``` - -最后一列`ViewType`中显示了该序列的类型,时间序列为BASE,序列视图是VIEW。 - -此外,某些序列视图的属性会缺失,比如`root.db.d01.avg_temp`是由温度均值计算得来,所以`Encoding`和`Compression`属性都为空值。 - -此外,`SHOW TIMESERIES`语句的查询结果主要分为两部分: - -1. 时序数据的信息,例如数据类型,压缩方式,编码等 -2. 
其他元数据信息,例如tag,attribute,所属database等 - -对于序列视图,展示的时序数据信息与其原始序列一致或者为空值(比如计算得到的平均温度有数据类型但是无压缩方式);展示的元数据信息则是视图的内容。 - -如果要得知视图的更多信息,需要使用`SHOW ``VIEW`。`SHOW ``VIEW`中展示视图的数据来源等。 - -```Shell -IoTDB> show VIEW root.**; -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -| Timeseries|Database|DataType|Tags|Attributes|ViewType| SOURCE| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.view.status | root.db| INT32|null| null| VIEW| root.db.device.s01| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -|root.db.d01.avg_temp| root.db| FLOAT|null| null| VIEW|(root.db.d01.temp01+root.db.d01.temp02)/2| -+--------------------+--------+--------+----+----------+--------+-----------------------------------------+ -Total line number = 2 -It costs 0.789s -IoTDB> -``` - -最后一列`SOURCE`显示了该序列视图的数据来源,列出了创建该序列的SQL语句。 - -##### 关于数据类型 - -上述两种查询都涉及视图的数据类型。视图的数据类型是根据定义视图的查询语句或别名序列的原始时间序列类型推断出来的。这个数据类型是根据当前系统的状态实时计算出来的,因此在不同时刻查询到的数据类型可能是改变的。 - -## 4. FAQ - -#### Q1:我想让视图实现类型转换的功能。例如,原有一个int32类型的时间序列,和其他int64类型的序列被放在了同一个视图中。我现在希望通过视图查询到的数据,都能自动转换为int64类型。 - -> Ans:这不是序列视图的职能范围。但是可以使用`CAST`进行转换,比如: - -```SQL -CREATE VIEW root.db.device.int64_status -AS - SELECT CAST(s1, 'type'='INT64') from root.db.device -``` - -> 这样,查询`root.view.status`时,就会得到int64类型的结果。 -> -> 请特别注意,上述例子中,序列视图的数据是通过`CAST`转换得到的,因此`root.db.device.int64_status`并不是一条别名序列,也就**不支持写入**。 - -#### Q2:是否支持默认命名?选择若干时间序列,创建视图;但是我不指定每条序列的名字,由数据库自动命名? - -> Ans:不支持。用户必须明确指定命名。 - -#### Q3:在原有体系中,创建时间序列`root.db.device.s01`,可以发现自动创建了database`root.db`,自动创建了device`root.db.device`。接着删除时间序列`root.db.device.s01`,可以发现`root.db.device`被自动删除,`root.db`却还是保留的。对于创建视图,会沿用这一机制吗?出于什么考虑呢? - -> Ans:保持原有的行为不变,引入视图功能不会改变原有的这些逻辑。 - -#### Q4:是否支持序列视图重命名? 
- -> A:当前版本不支持重命名,可以自行创建新名称的视图投入使用。 diff --git a/src/zh/UserGuide/latest/User-Manual/Maintenance-statement_timecho.md b/src/zh/UserGuide/latest/User-Manual/Maintenance-statement_timecho.md deleted file mode 100644 index 0f8e55762..000000000 --- a/src/zh/UserGuide/latest/User-Manual/Maintenance-statement_timecho.md +++ /dev/null @@ -1,719 +0,0 @@ - -# 运维语句 - -## 1. 状态查看 - -### 1.1 查看连接的模型 - -**含义**:返回当前连接的 sql_dialect 是树模型/表模型。 - -#### 语法: - -```SQL -showCurrentSqlDialectStatement - : SHOW CURRENT_SQL_DIALECT - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW CURRENT_SQL_DIALECT -``` - -执行结果如下: - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TREE| -+-----------------+ -``` - -### 1.2 查看集群版本 - -**含义**:返回当前集群的版本。 - -#### 语法: - -```SQL -showVersionStatement - : SHOW VERSION - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW VERSION -``` - -执行结果如下: - -```SQL -+-------+---------+ -|Version|BuildInfo| -+-------+---------+ -|2.0.1.2| 1ca4008| -+-------+---------+ -``` - -### 1.3 查看集群关键参数 - -**含义**:返回当前集群的关键参数。 - -#### 语法: - -```SQL -showVariablesStatement - : SHOW VARIABLES - ; -``` - -关键参数如下: - -1. **ClusterName**:当前集群的名称。 -2. **DataReplicationFactor**:数据副本的数量,表示每个数据分区(DataRegion)的副本数。 -3. **SchemaReplicationFactor**:元数据副本的数量,表示每个元数据分区(SchemaRegion)的副本数。 -4. **DataRegionConsensusProtocolClass**:数据分区(DataRegion)使用的共识协议类。 -5. **SchemaRegionConsensusProtocolClass**:元数据分区(SchemaRegion)使用的共识协议类。 -6. **ConfigNodeConsensusProtocolClass**:配置节点(ConfigNode)使用的共识协议类。 -7. **TimePartitionOrigin**:数据库时间分区的起始时间戳。 -8. **TimePartitionInterval**:数据库的时间分区间隔(单位:毫秒)。 -9. **ReadConsistencyLevel**:读取操作的一致性级别。 -10. **SchemaRegionPerDataNode**:数据节点(DataNode)上的元数据分区(SchemaRegion)数量。 -11. **DataRegionPerDataNode**:数据节点(DataNode)上的数据分区(DataRegion)数量。 -12. **SeriesSlotNum**:数据分区(DataRegion)的序列槽(SeriesSlot)数量。 -13. **SeriesSlotExecutorClass**:序列槽的实现类。 -14. **DiskSpaceWarningThreshold**:磁盘空间告警阈值(单位:百分比)。 -15. 
**TimestampPrecision**:时间戳精度。 - -#### 示例: - -```SQL -IoTDB> SHOW VARIABLES -``` - -执行结果如下: - -```SQL -+----------------------------------+-----------------------------------------------------------------+ -| Variable| Value| -+----------------------------------+-----------------------------------------------------------------+ -| ClusterName| defaultCluster| -| DataReplicationFactor| 1| -| SchemaReplicationFactor| 1| -| DataRegionConsensusProtocolClass| org.apache.iotdb.consensus.iot.IoTConsensus| -|SchemaRegionConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| ConfigNodeConsensusProtocolClass| org.apache.iotdb.consensus.ratis.RatisConsensus| -| TimePartitionOrigin| 0| -| TimePartitionInterval| 604800000| -| ReadConsistencyLevel| strong| -| SchemaRegionPerDataNode| 1| -| DataRegionPerDataNode| 0| -| SeriesSlotNum| 1000| -| SeriesSlotExecutorClass|org.apache.iotdb.commons.partition.executor.hash.BKDRHashExecutor| -| DiskSpaceWarningThreshold| 0.05| -| TimestampPrecision| ms| -+----------------------------------+-----------------------------------------------------------------+ -``` - -### 1.4 查看数据库当前时间 - -#### 语法: - -**含义**:返回数据库当前时间。 - -```SQL -showCurrentTimestampStatement - : SHOW CURRENT_TIMESTAMP - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW CURRENT_TIMESTAMP -``` - -执行结果如下: - -```SQL -+-----------------------------+ -| CurrentTimestamp| -+-----------------------------+ -|2025-02-17T11:11:52.987+08:00| -+-----------------------------+ -``` - -### 1.5 查看正在执行的查询信息 - -**含义**:用于显示所有正在执行的查询信息。 - -#### 语法: - -```SQL -showQueriesStatement - : SHOW (QUERIES | QUERY PROCESSLIST) - (WHERE where=booleanExpression)? - (ORDER BY sortItem (',' sortItem)*)? - limitOffsetClause - ; -``` - -**参数解释**: - -1. **WHERE** 子句:需保证过滤的目标列是结果集中存在的列 -2. **ORDER BY** 子句:需保证`sortKey`是结果集中存在的列 -3. **limitOffsetClause**: - - **含义**:用于限制结果集的返回数量。 - - **格式**:`LIMIT , `, `` 是偏移量,`` 是返回的行数。 -4. 
**QUERIES** 表中的列: - - **time**:查询开始的时间戳,时间戳精度与系统精度一致 - - **queryid**:查询语句的 ID - - **datanodeid**:发起查询语句的 DataNode 的ID - - **elapsedtime**:查询的执行耗时,单位是秒 - - **statement**:查询的 SQL 语句 - - -#### 示例: - -```SQL -IoTDB> SHOW QUERIES WHERE elapsedtime > 0.003 -``` - -执行结果如下: - -```SQL -+-----------------------------+-----------------------+----------+-----------+--------------------------------------+ -| Time| QueryId|DataNodeId|ElapsedTime| Statement| -+-----------------------------+-----------------------+----------+-----------+--------------------------------------+ -|2025-05-09T15:16:01.293+08:00|20250509_071601_00015_1| 1| 0.006|SHOW QUERIES WHERE elapsedtime > 0.003| -+-----------------------------+-----------------------+----------+-----------+--------------------------------------+ -``` - - -### 1.6 查看分区信息 - -**含义**:返回当前集群的分区信息。 - -#### 语法: - -```SQL -showRegionsStatement - : SHOW REGIONS - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW REGIONS -``` - -执行结果如下: - -```SQL -+--------+------------+-------+-------------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -|RegionId| Type| Status| Database|SeriesSlotNum|TimeSlotNum|DataNodeId|RpcAddress|RpcPort|InternalAddress| Role| CreateTime|TsFileSize| -+--------+------------+-------+-------------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -| 9|SchemaRegion|Running|root.__system| 21| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.555| | -| 10| DataRegion|Running|root.__system| 21| 21| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-01T17:37:01.556| 8.27 KB| -| 65|SchemaRegion|Running| root.ln| 1| 0| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-25T14:46:50.113| | -| 66| DataRegion|Running| root.ln| 1| 1| 1| 0.0.0.0| 6667| 127.0.0.1|Leader|2025-08-25T14:46:50.425| 524 B| 
-+--------+------------+-------+-------------+-------------+-----------+----------+----------+-------+---------------+------+-----------------------+----------+ -``` - -### 1.7 查看可用节点 - -**含义**:返回当前集群所有可用的 DataNode 的 RPC 地址和端口。注意:这里对于“可用”的定义为:处于非 REMOVING 状态的 DN 节点。 - -> V2.0.8 起支持该功能 - -#### 语法: - -```SQL -showAvailableUrlsStatement - : SHOW AVAILABLE URLS - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW AVAILABLE URLS -``` - -执行结果如下: - -```SQL -+----------+-------+ -|RpcAddress|RpcPort| -+----------+-------+ -| 0.0.0.0| 6667| -+----------+-------+ -``` - -### 1.8 查看服务信息 - -**含义**:返回当前集群所有正常工作(RUNNING 或 READ-ONLY) DN 上的服务信息(MQTT 服务、REST 服务)。 - -> V2.0.8.2 起支持该功能 - -#### 语法: - -```SQL -showServicesStatement - : SHOW SERVICES - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW SERVICES -IoTDB> SHOW SERVICES ON 1 -``` - -执行结果如下: - -```SQL -+------------+-----------+-------+ -|service_name|datanode_id| state| -+------------+-----------+-------+ -| MQTT| 1|STOPPED| -| REST| 1|RUNNING| -+------------+-----------+-------+ -``` - -### 1.9 查看集群激活状态 - -**含义**:返回当前集群的激活状态。 - -#### 语法: - -```SQL -showActivationStatement - : SHOW ACTIVATION - ; -``` - -#### 示例: - -```SQL -IoTDB> SHOW ACTIVATION -``` - -执行结果如下: - -```SQL -+---------------+---------+-----------------------------+ -| LicenseInfo| Usage| Limit| -+---------------+---------+-----------------------------+ -| Status|ACTIVATED| -| -| ExpiredTime| -|2026-04-30T00:00:00.000+08:00| -| DataNodeLimit| 1| Unlimited| -| CpuLimit| 16| Unlimited| -| DeviceLimit| 30| Unlimited| -|TimeSeriesLimit| 72| 1,000,000,000| -+---------------+---------+-----------------------------+ -``` - - -### 1.10 查看磁盘空间占用情况 - -含义:返回指定 pattern 的磁盘空间占用情况,包括 ChunkGroup 的大小和 Metadata 大小。 - -注意:统计基于 TsFile 中数据的真实大小,因此不会考虑 mods 删除的情况。 - -> V2.0.9.1 起支持该功能 - -#### 语法: - -```SQL -showDiskUsageStatement - : SHOW DISK_USAGE FROM pathPattern - whereClause? - orderByClause? - rowPaginationClause? 
- ; -pathPattern - : ROOT (DOT nodeName)* - ; -``` - -说明:Pattern 用于匹配设备,需要使用 root 作为开头,路径的中间节点支持 * 或 **。 - -#### 结果集 - -| 列名 | 列类型 | 含义 | -| --------------- | -------- | -------------------- | -| Database | string | Database 名 | -| DataNodeId | int32 | DataNode 节点 id | -| RegionId | int32 | Region id | -| TimePartition | int64 | 时间分区 id | -| SizeInBytes | int64 | 占用磁盘空间(byte) | - -#### 示例: - -```SQL -SHOW DISK_USAGE FROM root.ln.**; -``` - -执行结果如下: - -```Bash -+--------+----------+--------+-------------+-----------+ -|Database|DataNodeId|RegionId|TimePartition|SizeInBytes| -+--------+----------+--------+-------------+-----------+ -| root.ln| 1| 13| 2932| 203| -+--------+----------+--------+-------------+-----------+ -``` - - -## 2. 状态设置 - -### 2.1 设置连接的模型 - -**含义**:将当前连接的 sql_dialect 置为树模型/表模型,在树模型和表模型中均可使用该命令。 - -#### 语法: - -```SQL -SET SQL_DIALECT EQ (TABLE | TREE) -``` - -#### 示例: - -```SQL -IoTDB> SET SQL_DIALECT=TREE -IoTDB> SHOW CURRENT_SQL_DIALECT -``` - -执行结果如下: - -```SQL -+-----------------+ -|CurrentSqlDialect| -+-----------------+ -| TREE| -+-----------------+ -``` - -### 2.2 更新配置项 - -**含义**:用于更新配置项,执行完成后会进行配置项的热加载,对于支持热修改的配置项会立即生效。 - -#### 语法: - -```SQL -setConfigurationStatement - : SET CONFIGURATION propertyAssignments (ON INTEGER_VALUE)? - ; - -propertyAssignments - : property (',' property)* - ; - -property - : identifier EQ propertyValue - ; - -propertyValue - : DEFAULT - | expression - ; -``` - -**参数解释**: - -1. **propertyAssignments** - - **含义**:更新的配置列表,由多个 `property` 组成。 - - 可以更新多个配置列表,用逗号分隔。 - - **取值**: - - `DEFAULT`:将配置项恢复为默认值。 - - `expression`:具体的值,必须是一个字符串。 -2. 
**ON INTEGER_VALUE** - - **含义**:指定要更新配置的节点 ID。 - - **可选性**:可选。如果不指定或指定的值低于 0,则更新所有 ConfigNode 和 DataNode 的配置。 - -#### 示例: - -```SQL -IoTDB> SET CONFIGURATION 'disk_space_warning_threshold'='0.05','heartbeat_interval_in_ms'='1000' ON 1; -``` - -### 2.3 读取手动修改的配置文件 - -**含义**:用于读取手动修改过的配置文件,并对配置项进行热加载,对于支持热修改的配置项会立即生效。 - -#### 语法: - -```SQL -loadConfigurationStatement - : LOAD CONFIGURATION localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**参数解释**: - -1. **localOrClusterMode** - - **含义**:指定配置热加载的范围。 - - **可选性**:可选。默认值为 `CLUSTER`。 - - **取值**: - - `LOCAL`:只对客户端直连的 DataNode 进行配置热加载。 - - `CLUSTER`:对集群中所有 DataNode 进行配置热加载。 - -#### 示例: - -```SQL -IoTDB> LOAD CONFIGURATION ON LOCAL; -``` - -### 2.4 设置系统的状态 - -**含义**:用于设置系统的状态。 - -#### 语法: - -```SQL -setSystemStatusStatement - : SET SYSTEM TO (READONLY | RUNNING) localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**参数解释**: - -1. **RUNNING | READONLY** - - **含义**:指定系统的新状态。 - - **取值**: - - `RUNNING`:将系统设置为运行状态,允许读写操作。 - - `READONLY`:将系统设置为只读状态,只允许读取操作,禁止写入操作。 -2. **localOrClusterMode** - - **含义**:指定状态变更的范围。 - - **可选性**:可选。默认值为 `CLUSTER`。 - - **取值**: - - `LOCAL`:仅对客户端直连的 DataNode 生效。 - - `CLUSTER`:对集群中所有 DataNode 生效。 - -#### 示例: - -```SQL -IoTDB> SET SYSTEM TO READONLY ON CLUSTER; -``` - - -## 3. 数据管理 - -### 3.1 刷写内存表中的数据到磁盘 - -**含义**:将内存表中的数据刷写到磁盘上。 - -#### 语法: - -```SQL -flushStatement - : FLUSH identifier? (',' identifier)* booleanValue? localOrClusterMode? - ; - -booleanValue - : TRUE | FALSE - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**参数解释**: - -1. **identifier** - - **含义**:指定要刷写的路径名称。 - - **可选性**:可选。如果不指定,则默认刷写所有路径。 - - **多个路径**:可以指定多个路径名称,用逗号分隔。例如:`FLUSH root.ln, root.lnm`。 -2. **booleanValue** - - **含义**:指定刷写的内容。 - - **可选性**:可选。如果不指定,则默认刷写顺序和乱序空间的内存。 - - **取值**: - - `TRUE`:只刷写顺序空间的内存表。 - - `FALSE`:只刷写乱序空间的MemTable。 -3. 
**localOrClusterMode** - - **含义**:指定刷写的范围。 - - **可选性**:可选。默认值为 `CLUSTER`。 - - **取值**: - - `ON LOCAL`:只刷写客户端直连的 DataNode 上的内存表。 - - `ON CLUSTER`:刷写集群中所有 DataNode 上的内存表。 - -#### 示例: - -```SQL -IoTDB> FLUSH root.ln TRUE ON LOCAL; -``` - -## 4. 数据修复 - -### 4.1 启动后台扫描并修复 tsfile 任务 - -**含义**:启动一个后台任务,开始扫描并修复 tsfile,能够修复数据文件内的时间戳乱序类异常。 - -#### 语法: - -```SQL -startRepairDataStatement - : START REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**参数解释**: - -1. **localOrClusterMode** - - **含义**:指定数据修复的范围。 - - **可选性**:可选。默认值为 `CLUSTER`。 - - **取值**: - - `ON LOCAL`:仅对客户端直连的 DataNode 执行。 - - `ON CLUSTER`:对集群中所有 DataNode 执行。 - -#### 示例: - -```SQL -IoTDB> START REPAIR DATA ON CLUSTER; -``` - -### 4.2 暂停后台修复 tsfile 任务 - -**含义**:暂停后台的修复任务,暂停中的任务可通过再次执行 start repair data 命令恢复。 - -#### 语法: - -```SQL -stopRepairDataStatement - : STOP REPAIR DATA localOrClusterMode? - ; - -localOrClusterMode - : (ON (LOCAL | CLUSTER)) - ; -``` - -**参数解释**: - -1. **localOrClusterMode** - - **含义**:指定数据修复的范围。 - - **可选性**:可选。默认值为 `CLUSTER`。 - - **取值**: - - `ON LOCAL`:仅对客户端直连的 DataNode 执行。 - - `ON CLUSTER`:对集群中所有 DataNode 执行。 - -#### 示例: - -```SQL -IoTDB> STOP REPAIR DATA ON CLUSTER; -``` - -## 5. 终止查询 - -### 5.1 主动终止查询 - -**含义**:使用该命令主动地终止查询。 - -#### 语法: - -```SQL -killQueryStatement - : KILL (QUERY queryId=string | ALL QUERIES) - ; -``` - -**参数解释**: - -1. **QUERY queryId=string** - - **含义**:指定要终止的查询的 ID。 `` 是正在执行的查询的唯一标识符。 - - **获取查询 ID**:可以通过 `SHOW QUERIES` 命令获取所有正在执行的查询及其 ID。 -2. **ALL QUERIES** - - **含义**:终止所有正在执行的查询。 - -#### 示例: - -通过指定 `queryId` 可以中止指定的查询,为了获取正在执行的查询 id,用户可以使用 show queries 命令,该命令将显示所有正在执行的查询列表。 - -```SQL -IoTDB> KILL QUERY 20250108_101015_00000_1; -- 终止指定query -IoTDB> KILL ALL QUERIES; -- 终止所有query -``` - - - -## 6. 调试查询 - -### 6.1 DEBUG SQL - - -**​含义:​**在 SQL 查询语句开头添加 debug 关键字,执行时将输出 debug 日志,包括涉及到的底层文件 scan 信息。 - -> V2.0.9.1 起支持该功能 - -#### 语法: - -```SQL -debugSQLStatement - : DEBUG ? 
query - ; -``` - -**说明:** - -* 日志输出目录为: `logs/log_datanode_query_debug.log` - -#### 示例: - -1. 执行以下 SQL 进行 DEBUG 查询 - -```SQL -debug select * from root.ln.**; -``` - -2. 观察`log_datanode_query_debug.log` 的日志内容,查看查询涉及到的文件 scan 信息。 - -```Bash -2026-03-24 10:06:18,755 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:159 - Cache miss: root.ln.wf01.wt01.temperature in file: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile -2026-03-24 10:06:18,757 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:160 - Device: root.ln.wf01.wt01, all sensors: [temperature] -2026-03-24 10:06:18,758 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.BloomFilterCache:110 - get bloomFilter from cache where filePath is: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile -2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.TimeSeriesMetadataCache:227 - Get timeseries: root.ln.wf01.wt01.temperature metadata in file: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile from cache: TimeseriesMetadata{timeSeriesMetadataType=0, chunkMetaDataListDataSize=8, measurementId='temperature', dataType=DOUBLE, statistics=startTime: 1773824951259 endTime: 1773824951259 count: 1 [minValue:12.9,maxValue:12.9,firstValue:12.9,lastValue:12.9,sumValue:12.9], modified=false, isSeq=true, chunkMetadataList=[measurementId: temperature, datatype: DOUBLE, version: 0, Statistics: startTime: 1773824951259 endTime: 1773824951259 count: 1 [minValue:12.9,maxValue:12.9,firstValue:12.9,lastValue:12.9,sumValue:12.9], deleteIntervalList: null]}. 
-2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskChunkMetadataLoader:97 - Modifications size is 0 for file Path: /home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile -2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskChunkMetadataLoader:109 - After modification Chunk meta data list is: -2026-03-24 10:06:18,759 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.d.r.r.c.m.DiskChunkMetadataLoader:110 - measurementId: temperature, datatype: DOUBLE, version: 0, Statistics: startTime: 1773824951259 endTime: 1773824951259 count: 1 [minValue:12.9,maxValue:12.9,firstValue:12.9,lastValue:12.9,sumValue:12.9], deleteIntervalList: null -2026-03-24 10:06:18,760 [Query-Worker-Thread-3$20260324_020618_00052_1.1.0.0] INFO o.a.i.d.s.b.ChunkCache:167 - get chunk from cache whose key is: ChunkCacheKey{filePath='/home/iotdb/timechodb/data/datanode/data/sequence/root.ln/13/2932/1773824951611-1-0-0.tsfile', regionId=13, timePartitionId=2932, tsFileVersion=1, compactionVersion=0, offsetOfChunkHeader=27} -2026-03-24 10:06:18,761 [pool-69-IoTDB-ClientRPC-Processor-1$20260324_020618_00052_1] INFO o.a.i.d.q.p.Coordinator:902 - debug select * from root.ln.** -``` \ No newline at end of file diff --git a/src/zh/UserGuide/latest/User-Manual/Streaming_timecho.md b/src/zh/UserGuide/latest/User-Manual/Streaming_timecho.md deleted file mode 100644 index a5944ef7a..000000000 --- a/src/zh/UserGuide/latest/User-Manual/Streaming_timecho.md +++ /dev/null @@ -1,858 +0,0 @@ - - -# 流计算框架 - -IoTDB 流处理框架允许用户实现自定义的流处理逻辑,可以实现对存储引擎变更的监听和捕获、实现对变更数据的变形、实现对变形后数据的向外推送等逻辑。 - -我们将一个数据流处理任务称为 Pipe。一个流处理任务(Pipe)包含三个子任务: - -- 抽取(Source) -- 处理(Process) -- 发送(Sink) - -流处理框架允许用户使用 Java 语言自定义编写三个子任务的处理逻辑,通过类似 UDF 的方式处理数据。 -在一个 Pipe 中,上述的三个子任务分别由三种插件执行实现,数据会依次经过这三个插件进行处理: -Pipe Source 用于抽取数据,Pipe Processor 用于处理数据,Pipe Sink 用于发送数据,最终数据将被发至外部系统。 - -**Pipe 
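task model is as follows:** - -![Task model diagram](/img/1706697228308.jpg) - -Describing a data stream processing task essentially means describing the attributes of the Pipe Source, Pipe Processor, and Pipe Sink plugins. -Users can declaratively configure the attributes of the three subtasks through SQL statements, and combine different attributes to achieve flexible data ETL capabilities. - -With the stream processing framework, a complete data pipeline can be built to meet needs such as *device-edge-cloud synchronization, remote disaster recovery, and read/write load distribution across databases*. - -## 1. Custom Stream Processing Plugin Development - -### 1.1 Development Dependencies - -It is recommended to build the project with Maven and add the following dependency to `pom.xml`. Make sure the dependency version matches the IoTDB server version. - -```xml -<dependency> - <groupId>org.apache.iotdb</groupId> - <artifactId>pipe-api</artifactId> - <version>1.3.1</version> - <scope>provided</scope> -</dependency> -``` - -### 1.2 Event-Driven Programming Model - -The user programming interface of stream processing plugins follows the general design of the event-driven programming model. An event (Event) is the data abstraction in the user programming interface. The interface is decoupled from the concrete execution mode, so users only need to describe how the system should handle an event (that is, data) once it arrives. - -In the user programming interface of stream processing plugins, an event is an abstraction of a database write operation. Events are captured by the stand-alone stream processing engine and passed, following the three stages of stream processing, to the PipeSource plugin, the PipeProcessor plugin, and the PipeSink plugin in turn, triggering user logic in each of the three plugins in sequence. - -To achieve both low latency under low load and high throughput under high load on the device side, the stream processing engine dynamically chooses between operation logs and data files as the objects to process. The user programming interface therefore requires users to provide handling logic for two kinds of events: the operation log write event TabletInsertionEvent and the data file write event TsFileInsertionEvent. - -#### **Operation Log Write Event (TabletInsertionEvent)** - -The operation log write event (TabletInsertionEvent) is a high-level data abstraction of a user write request. Through a unified operation interface, it lets users manipulate the underlying data of the write request. - -The underlying storage structure behind an operation log write event differs by deployment mode. In a stand-alone deployment, the event wraps a write-ahead log (WAL) entry; in a distributed deployment, it wraps a consensus protocol operation log entry of a single node. - -The request structure behind an operation log write event also differs by the write interface that produced it. IoTDB provides many write interfaces, such as InsertRecord, InsertRecords, InsertTablet, and InsertTablets; each write request uses a completely different serialization method and produces different binary entries. - -The operation log write event gives users a unified view for operating on data. It hides the implementation differences of the underlying data structures, which greatly lowers the programming barrier and makes the feature easier to use. - -```java -/** TabletInsertionEvent is used to define the event of data insertion. */ -public interface TabletInsertionEvent extends Event { - - /** - * The consumer processes the data row by row and collects the results by RowCollector. - * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processRowByRow(BiConsumer<Row, RowCollector> consumer); - - /** - * The consumer processes the Tablet directly and collects the results by RowCollector. 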
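- * - * @return {@code Iterable<TabletInsertionEvent>} a list of new TabletInsertionEvent contains the - * results collected by the RowCollector - */ - Iterable<TabletInsertionEvent> processTablet(BiConsumer<Tablet, RowCollector> consumer); -} -``` - -#### **Data File Write Event (TsFileInsertionEvent)** - -The data file write event (TsFileInsertionEvent) is a high-level abstraction of the database flushing a file to disk. It is a collection of several operation log write events (TabletInsertionEvent). - -IoTDB's storage engine is LSM-structured. When data is written, the write operation is first persisted to a log-structured file while the written data is kept in memory. When memory reaches its configured limit, a flush is triggered: the in-memory data is converted into a database file and the previously written write-ahead log is deleted. During this conversion the data goes through two rounds of compression, encoding compression and general-purpose compression, so the data in the database file takes less space than the original data in memory. - -Under extreme network conditions, transferring data files directly is more economical than transferring write operations: it uses less network bandwidth and achieves faster transfer. There is no free lunch, though: computing on data in files costs extra file I/O compared with computing directly on data in memory. Because on-disk data files and in-memory write operations each have their own strengths and weaknesses, the system has room to make dynamic trade-offs, and it is this observation that led to introducing the data file write event into the plugin event model. - -In summary, a data file write event appears in the event stream of a stream processing plugin in the following two cases: - -(1) Historical data extraction: before a stream processing task starts, all written data that has already been flushed exists as TsFiles. After the task starts, historical data is collected and abstracted as TsFileInsertionEvent; - -(2) Real-time data extraction: while a stream processing task is running, if real-time processing of operation log write events falls behind the write request rate by a certain margin, the operation log write events that have not yet been processed are persisted to disk as TsFiles. When the stream processing engine extracts this data, it is abstracted as TsFileInsertionEvent. - -```java -/** - * TsFileInsertionEvent is used to define the event of writing TsFile. Event data stores in disks, - * which is compressed and encoded, and requires IO cost for computational processing. - */ -public interface TsFileInsertionEvent extends Event { - - /** - * The method is used to convert the TsFileInsertionEvent into several TabletInsertionEvents. - * - * @return {@code Iterable<TabletInsertionEvent>} the list of TabletInsertionEvent - */ - Iterable<TabletInsertionEvent> toTabletInsertionEvents(); -} -``` - -### 1.3 Custom Stream Processing Plugin Programming Interface Definition - -Based on the custom stream processing plugin programming interface, users can easily write data extraction plugins, data processing plugins, and data sending plugins, so that the stream processing feature adapts flexibly to various industrial scenarios. - -#### Data Extraction Plugin Interface - -Data extraction is the first of the three stages of stream processing, which run from data extraction to data sending. The data extraction plugin (PipeSource) is the bridge between the stream processing engine and the storage engine. It captures various data write events by listening to the behavior of the storage engine. - -```java -/** - * PipeSource - * - *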

<p>PipeSource is responsible for capturing events from sources. - * - *
<p>Various data sources can be supported by implementing different PipeSource classes. - * - *
<p>The lifecycle of a PipeSource is as follows: - * - *

<ul> - *
  <li>When a collaboration task is created, the KV pairs of `WITH SOURCE` clause in SQL are - * parsed and the validation method {@link PipeSource#validate(PipeParameterValidator)} will - * be called to validate the parameters. - *
  <li>Before the collaboration task starts, the method {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} will be called to - * config the runtime behavior of the PipeSource. - *
  <li>Then the method {@link PipeSource#start()} will be called to start the PipeSource. - *
  <li>While the collaboration task is in progress, the method {@link PipeSource#supply()} will be - * called to capture events from sources and then the events will be passed to the - * PipeProcessor. - *
  <li>The method {@link PipeSource#close()} will be called when the collaboration task is - * cancelled (the `DROP PIPE` command is executed). - *
</ul> - */ -public interface PipeSource extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSource. In this method, the user can do the - * following things: - * - *
<ul> - *
  <li>Use PipeParameters to parse key-value pair attributes entered by the user. - *
  <li>Set the running configurations in PipeSourceRuntimeConfiguration. - *
</ul> - * - *

<p>This method is called after the method {@link PipeSource#validate(PipeParameterValidator)} - * is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSource - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSourceRuntimeConfiguration configuration) - throws Exception; - - /** - * Start the Source. After this method is called, events should be ready to be supplied by - * {@link PipeSource#supply()}. This method is called after {@link - * PipeSource#customize(PipeParameters, PipeSourceRuntimeConfiguration)} is called. - * - * @throws Exception the user can throw errors if necessary - */ - void start() throws Exception; - - /** - * Supply single event from the Source and the caller will send the event to the processor. - * This method is called after {@link PipeSource#start()} is called. - * - * @return the event to be supplied. the event may be null if the Source has no more events at - * the moment, but the Source is still running for more events. - * @throws Exception the user can throw errors if necessary - */ - Event supply() throws Exception; -} -``` - -#### Data Processing Plugin Interface - -Data processing is the second of the three stages of stream processing, which run from data extraction to data sending. The data processing plugin (PipeProcessor) is mainly used to filter and transform the various events captured by the data extraction plugin (PipeSource). - -```java -/** - * PipeProcessor - * - *

<p>PipeProcessor is used to filter and transform the Event formed by the PipeSource. - * - *
<p>The lifecycle of a PipeProcessor is as follows: - * - *

    - *
  • When a collaboration task is created, the KV pairs of `WITH PROCESSOR` clause in SQL are - * parsed and the validation method {@link PipeProcessor#validate(PipeParameterValidator)} - * will be called to validate the parameters. - *
  • Before the collaboration task starts, the method {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} will be called - * to config the runtime behavior of the PipeProcessor. - *
  • While the collaboration task is in progress: - *
      - *
    • PipeSource captures the events and wraps them into three types of Event instances. - *
    • PipeProcessor processes the event and then passes them to the PipeSink. The - * following 3 methods will be called: {@link - * PipeProcessor#process(TabletInsertionEvent, EventCollector)}, {@link - * PipeProcessor#process(TsFileInsertionEvent, EventCollector)} and {@link - * PipeProcessor#process(Event, EventCollector)}. - *
    • PipeSink serializes the events into binaries and send them to sinks. - *
    - *
  • When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link - * PipeProcessor#close() } method will be called. - *
- */ -public interface PipeProcessor extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeProcessor#customize(PipeParameters, PipeProcessorRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeProcessor. In this method, the user can do the - * following things: - * - *
<ul>
- *   <li>Use PipeParameters to parse key-value pair attributes entered by the user.
- *   <li>Set the running configurations in PipeProcessorRuntimeConfiguration.
- * </ul>
- * - *

This method is called after the method {@link - * PipeProcessor#validate(PipeParameterValidator)} is called and before the beginning of the - * events processing. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeProcessor - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeProcessorRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is called to process the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(TabletInsertionEvent tabletInsertionEvent, EventCollector eventCollector) - throws Exception; - - /** - * This method is called to process the TsFileInsertionEvent. - * - * @param tsFileInsertionEvent TsFileInsertionEvent to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - default void process(TsFileInsertionEvent tsFileInsertionEvent, EventCollector eventCollector) - throws Exception { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - process(tabletInsertionEvent, eventCollector); - } - } - - /** - * This method is called to process the Event. - * - * @param event Event to be processed - * @param eventCollector used to collect result events after processing - * @throws Exception the user can throw errors if necessary - */ - void process(Event event, EventCollector eventCollector) throws Exception; -} -``` - -#### 数据发送插件接口 - -数据发送是流处理数据从数据抽取到数据发送三阶段的第三阶段。数据发送插件(PipeSink)主要用于发送经由数据处理插件(PipeProcessor)处理过后的各种事件,它作为流处理框架的网络实现层,接口上应允许接入多种实时通信协议和多种连接器。 - -```java -/** - * PipeSink - * - *

<p>PipeSink is responsible for sending events to sinks.
- *
- * <p>Various network protocols can be supported by implementing different PipeSink classes.
- *
- * <p>The lifecycle of a PipeSink is as follows:
- *
- * <ul>
- *   <li>When a collaboration task is created, the KV pairs of `WITH SINK` clause in SQL are
- *       parsed and the validation method {@link PipeSink#validate(PipeParameterValidator)} will be
- *       called to validate the parameters.
- *   <li>Before the collaboration task starts, the method {@link PipeSink#customize(PipeParameters,
- *       PipeSinkRuntimeConfiguration)} will be called to config the runtime behavior of the
- *       PipeSink and the method {@link PipeSink#handshake()} will be called to create a connection
- *       with sink.
- *   <li>While the collaboration task is in progress:
- *       <ul>
- *         <li>PipeSource captures the events and wraps them into three types of Event instances.
- *         <li>PipeProcessor processes the event and then passes them to the PipeSink.
- *         <li>PipeSink serializes the events into binaries and send them to sinks. The following 3
- *             methods will be called: {@link PipeSink#transfer(TabletInsertionEvent)}, {@link
- *             PipeSink#transfer(TsFileInsertionEvent)} and {@link PipeSink#transfer(Event)}.
- *       </ul>
- *   <li>When the collaboration task is cancelled (the `DROP PIPE` command is executed), the {@link
- *       PipeSink#close() } method will be called.
- * </ul>
- * - *

In addition, the method {@link PipeSink#heartbeat()} will be called periodically to check - * whether the connection with sink is still alive. The method {@link PipeSink#handshake()} will be - * called to create a new connection with the sink when the method {@link PipeSink#heartbeat()} - * throws exceptions. - */ -public interface PipeSink extends PipePlugin { - - /** - * This method is mainly used to validate {@link PipeParameters} and it is executed before {@link - * PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called. - * - * @param validator the validator used to validate {@link PipeParameters} - * @throws Exception if any parameter is not valid - */ - void validate(PipeParameterValidator validator) throws Exception; - - /** - * This method is mainly used to customize PipeSink. In this method, the user can do the following - * things: - * - *

<ul>
- *   <li>Use PipeParameters to parse key-value pair attributes entered by the user.
- *   <li>Set the running configurations in PipeSinkRuntimeConfiguration.
- * </ul>
- * - *

This method is called after the method {@link PipeSink#validate(PipeParameterValidator)} is - * called and before the method {@link PipeSink#handshake()} is called. - * - * @param parameters used to parse the input parameters entered by the user - * @param configuration used to set the required properties of the running PipeSink - * @throws Exception the user can throw errors if necessary - */ - void customize(PipeParameters parameters, PipeSinkRuntimeConfiguration configuration) - throws Exception; - - /** - * This method is used to create a connection with sink. This method will be called after the - * method {@link PipeSink#customize(PipeParameters, PipeSinkRuntimeConfiguration)} is called or - * will be called when the method {@link PipeSink#heartbeat()} throws exceptions. - * - * @throws Exception if the connection is failed to be created - */ - void handshake() throws Exception; - - /** - * This method will be called periodically to check whether the connection with sink is still - * alive. - * - * @throws Exception if the connection dies - */ - void heartbeat() throws Exception; - - /** - * This method is used to transfer the TabletInsertionEvent. - * - * @param tabletInsertionEvent TabletInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(TabletInsertionEvent tabletInsertionEvent) throws Exception; - - /** - * This method is used to transfer the TsFileInsertionEvent. 
- * - * @param tsFileInsertionEvent TsFileInsertionEvent to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - default void transfer(TsFileInsertionEvent tsFileInsertionEvent) throws Exception { - try { - for (final TabletInsertionEvent tabletInsertionEvent : - tsFileInsertionEvent.toTabletInsertionEvents()) { - transfer(tabletInsertionEvent); - } - } finally { - tsFileInsertionEvent.close(); - } - } - - /** - * This method is used to transfer the generic events, including HeartbeatEvent. - * - * @param event Event to be transferred - * @throws PipeConnectionException if the connection is broken - * @throws Exception the user can throw errors if necessary - */ - void transfer(Event event) throws Exception; -} -``` - -## 2. 自定义流处理插件管理 - -为了保证用户自定义插件在实际生产中的灵活性和易用性,系统还需要提供对插件进行动态统一管理的能力。 -本章节介绍的流处理插件管理语句提供了对插件进行动态统一管理的入口。 - -### 2.1 加载插件语句 - -在 IoTDB 中,若要在系统中动态载入一个用户自定义插件,则首先需要基于 PipeSource、 PipeProcessor 或者 PipeSink 实现一个具体的插件类,然后需要将插件类编译打包成 jar 可执行文件,最后使用加载插件的管理语句将插件载入 IoTDB。 - -加载插件的管理语句的语法如图所示。 - -```sql -CREATE PIPEPLUGIN [IF NOT EXISTS] <别名> -AS <全类名> -USING -``` - -**IF NOT EXISTS 语义**:用于创建操作中,确保当指定 Pipe Plugin 不存在时,执行创建命令,防止因尝试创建已存在的 Pipe Plugin 而导致报错。 - -示例:假如用户实现了一个全类名为edu.tsinghua.iotdb.pipe.ExampleProcessor 的数据处理插件,打包后的jar包为 pipe-plugin.jar ,用户希望在流处理引擎中使用这个插件,将插件标记为 example。插件包有两种使用方式,一种为上传到URI服务器,一种为上传到集群本地目录,两种方法任选一种即可。 - -【方式一】上传到URI服务器 - -准备工作:使用该种方式注册,您需要提前将 JAR 包上传到 URI 服务器上并确保执行注册语句的IoTDB实例能够访问该 URI 服务器。例如 https://example.com:8080/iotdb/pipe-plugin.jar 。 - -创建语句: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI -``` - -【方式二】上传到集群本地目录 - -准备工作:使用该种方式注册,您需要提前将 JAR 包放置到DataNode节点所在机器的任意路径下,推荐您将JAR包放在IoTDB安装路径的/ext/pipe目录下(安装包中已有,无需新建)。例如:iotdb-1.x.x-bin/ext/pipe/pipe-plugin.jar。(**注意:如果您使用的是集群,那么需要将 JAR 包放置到每个 DataNode 节点所在机器的该路径下)** - -创建语句: - -```sql -CREATE PIPEPLUGIN IF NOT EXISTS example -AS 
'edu.tsinghua.iotdb.pipe.ExampleProcessor' -USING URI -``` - -### 2.2 删除插件语句 - -当用户不再想使用一个插件,需要将插件从系统中卸载时,可以使用如图所示的删除插件语句。 - -```sql -DROP PIPEPLUGIN [IF EXISTS] <别名> -``` - -**IF EXISTS 语义**:用于删除操作中,确保当指定 Pipe Plugin 存在时,执行删除命令,防止因尝试删除不存在的 Pipe Plugin 而导致报错。 - -### 2.3 查看插件语句 - -用户也可以按需查看系统中的插件。查看插件的语句如图所示。 - -```sql -SHOW PIPEPLUGINS -``` - -## 3. 系统预置的流处理插件 - -### 3.1 预置 source 插件 - -#### iotdb-source - -作用:抽取 IoTDB 内部的历史或实时数据进入 pipe。 - - -| key | value | value 取值范围 | required or optional with default | -|---------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------|-----------------------------------| -| source | iotdb-source | String: iotdb-source | required | -| source.pattern | 用于筛选时间序列的路径前缀 | String: 任意的时间序列前缀 | optional: root | -| source.history.start-time | 抽取的历史数据的开始 event time,包含 start-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| source.history.end-time | 抽取的历史数据的结束 event time,包含 end-time | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| start-time(V1.3.1+) | start of synchronizing all data event time,including start-time. Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MIN_VALUE | -| end-time(V1.3.1+) | end of synchronizing all data event time,including end-time. 
Will disable "history.start-time" "history.end-time" if configured | Long: [Long.MIN_VALUE, Long.MAX_VALUE] | optional: Long.MAX_VALUE | -| source.realtime.mode | 实时数据的抽取模式 | String: hybrid, log, file | optional: hybrid | -| source.forwarding-pipe-requests | 是否抽取由其他 Pipe (通常是数据同步)写入的数据 | Boolean: true, false | optional: true | - -> 🚫 **source.pattern 参数说明** -> -> * Pattern 需用反引号修饰不合法字符或者是不合法路径节点,例如如果希望筛选 root.\`a@b\` 或者 root.\`123\`,应设置 pattern 为 root.\`a@b\` 或者 root.\`123\`(具体参考 [单双引号和反引号的使用时机](https://iotdb.apache.org/zh/Download/#_1-0-版本不兼容的语法详细说明)) -> * 在底层实现中,当检测到 pattern 为 root(默认值)时,抽取效率较高,其他任意格式都将降低性能 -> * 路径前缀不需要能够构成完整的路径。例如,当创建一个包含参数为 'source.pattern'='root.aligned.1' 的 pipe 时: - > - > * root.aligned.1TS - > * root.aligned.1TS.\`1\` -> * root.aligned.100T - > - > 的数据会被抽取; - > - > * root.aligned.\`1\` -> * root.aligned.\`123\` - > - > 的数据不会被抽取。 - -> ❗️**source.history 的 start-time,end-time 参数说明** -> -> * start-time,end-time 应为 ISO 格式,例如 2011-12-03T10:15:30 或 2011-12-03T10:15:30+01:00 - -> ✅ **一条数据从生产到落库 IoTDB,包含两个关键的时间概念** -> -> * **event time:** 数据实际生产时的时间(或者数据生产系统给数据赋予的生成时间,是数据点中的时间项),也称为事件时间。 -> * **arrival time:** 数据到达 IoTDB 系统内的时间。 -> -> 我们常说的乱序数据,指的是数据到达时,其 **event time** 远落后于当前系统时间(或者已经落库的最大 **event time**)的数据。另一方面,不论是乱序数据还是顺序数据,只要它们是新到达系统的,那它们的 **arrival time** 都是会随着数据到达 IoTDB 的顺序递增的。 - -> 💎 **iotdb-source 的工作可以拆分成两个阶段** -> -> 1. 历史数据抽取:所有 **arrival time** < 创建 pipe 时**当前系统时间**的数据称为历史数据 -> 2. 
实时数据抽取:所有 **arrival time** >= 创建 pipe 时**当前系统时间**的数据称为实时数据 -> -> 历史数据传输阶段和实时数据传输阶段,**两阶段串行执行,只有当历史数据传输阶段完成后,才执行实时数据传输阶段。** - -> 📌 **source.realtime.mode:数据抽取的模式** -> -> * log:该模式下,任务仅使用操作日志进行数据处理、发送 -> * file:该模式下,任务仅使用数据文件进行数据处理、发送 -> * hybrid:该模式,考虑了按操作日志逐条目发送数据时延迟低但吞吐低的特点,以及按数据文件批量发送时发送吞吐高但延迟高的特点,能够在不同的写入负载下自动切换适合的数据抽取方式,首先采取基于操作日志的数据抽取方式以保证低发送延迟,当产生数据积压时自动切换成基于数据文件的数据抽取方式以保证高发送吞吐,积压消除时自动切换回基于操作日志的数据抽取方式,避免了采用单一数据抽取算法难以平衡数据发送延迟或吞吐的问题。 - -> 🍕 **source.forwarding-pipe-requests:是否允许转发从另一 pipe 传输而来的数据** -> -> * 如果要使用 pipe 构建 A -> B -> C 的数据同步,那么 B -> C 的 pipe 需要将该参数为 true 后,A -> B 中 A 通过 pipe 写入 B 的数据才能被正确转发到 C -> * 如果要使用 pipe 构建 A \<-> B 的双向数据同步(双活),那么 A -> B 和 B -> A 的 pipe 都需要将该参数设置为 false,否则将会造成数据无休止的集群间循环转发 - -### 3.2 预置 processor 插件 - -#### do-nothing-processor - -作用:不对 source 传入的事件做任何的处理。 - - -| key | value | value 取值范围 | required or optional with default | -|-----------|----------------------|------------------------------|-----------------------------------| -| processor | do-nothing-processor | String: do-nothing-processor | required | - -### 3.3 预置 sink 插件 - -#### do-nothing-sink - -作用:不对 processor 传入的事件做任何的处理。 - - -| key | value | value 取值范围 | required or optional with default | -|------|-----------------|-------------------------|-----------------------------------| -| sink | do-nothing-sink | String: do-nothing-sink | required | - -## 4. 
流处理任务管理 - -### 4.1 创建流处理任务 - -使用 `CREATE PIPE` 语句来创建流处理任务。以数据同步流处理任务的创建为例,示例 SQL 语句如下: - -```sql -CREATE PIPE -- PipeId 是能够唯一标定流处理任务的名字 -WITH SOURCE ( - -- 默认的 IoTDB 数据抽取插件 - 'source' = 'iotdb-source', - -- 路径前缀,只有能够匹配该路径前缀的数据才会被抽取,用作后续的处理和发送 - 'source.pattern' = 'root.timecho', - -- 是否抽取历史数据 - 'source.history.enable' = 'true', - -- 描述被抽取的历史数据的时间范围,表示最早时间 - 'source.history.start-time' = '2011.12.03T10:15:30+01:00', - -- 描述被抽取的历史数据的时间范围,表示最晚时间 - 'source.history.end-time' = '2022.12.03T10:15:30+01:00', - -- 是否抽取实时数据 - 'source.realtime.enable' = 'true', - -- 描述实时数据的抽取方式 - 'source.realtime.mode' = 'hybrid', -) -WITH PROCESSOR ( - -- 默认的数据处理插件,即不做任何处理 - 'processor' = 'do-nothing-processor', -) -WITH SINK ( - -- IoTDB 数据发送插件,目标端为 IoTDB - 'sink' = 'iotdb-thrift-sink', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip - 'sink.ip' = '127.0.0.1', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port - 'sink.port' = '6667', -) -``` - -**创建流处理任务时需要配置 PipeId 以及三个插件部分的参数:** - - -| 配置项 | 说明 | 是否必填 | 默认实现 | 默认实现说明 | 是否允许自定义实现 | -|-----------|--------------------------------|---------------------------|----------------------|------------------------------|--------------------------| -| PipeId | 全局唯一标定一个流处理任务的名称 | 必填 | - | - | - | -| source | Pipe Source 插件,负责在数据库底层抽取流处理数据 | 选填 | iotdb-source | 将数据库的全量历史数据和后续到达的实时数据接入流处理任务 | 否 | -| processor | Pipe Processor 插件,负责处理数据 | 选填 | do-nothing-processor | 对传入的数据不做任何处理 | | -| sink | Pipe Sink 插件,负责发送数据 | 必填 | - | - | | - -示例中,使用了 iotdb-source、do-nothing-processor 和 iotdb-thrift-sink 插件构建数据流处理任务。IoTDB 还内置了其他的流处理插件,**请查看“系统预置流处理插件”一节**。 - -**一个最简的 CREATE PIPE 语句示例如下:** - -```sql -CREATE PIPE -- PipeId 是能够唯一标定流处理任务的名字 -WITH SINK ( - -- IoTDB 数据发送插件,目标端为 IoTDB - 'sink' = 'iotdb-thrift-sink', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 ip - 'sink.ip' = '127.0.0.1', - -- 目标端 IoTDB 其中一个 DataNode 节点的数据服务 port - 'sink.port' = '6667', -) -``` - -其表达的语义是:将本数据库实例中的全量历史数据和后续到达的实时数据,同步到目标为 127.0.0.1:6667 的 IoTDB 实例上。 - -**注意:** - -- SOURCE 和 PROCESSOR 为选填配置,若不填写配置参数,系统则会采用相应的默认实现 -- 
SINK 为必填配置,需要在 CREATE PIPE 语句中声明式配置 -- SINK 具备自复用能力。对于不同的流处理任务,如果他们的 SINK 具备完全相同 KV 属性的(所有属性的 key 对应的 value 都相同),**那么系统最终只会创建一个 SINK 实例**,以实现对连接资源的复用。 - - - 例如,有下面 pipe1, pipe2 两个流处理任务的声明: - - ```sql - CREATE PIPE pipe1 - WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'sink.ip' = 'localhost', - 'sink.port' = '9999', - ) - - CREATE PIPE pipe2 - WITH SINK ( - 'sink' = 'iotdb-thrift-sink', - 'sink.port' = '9999', - 'sink.ip' = 'localhost', - ) - ``` - - - 因为它们对 SINK 的声明完全相同(**即使某些属性声明时的顺序不同**),所以框架会自动对它们声明的 SINK 进行复用,最终 pipe1, pipe2 的 SINK 将会是同一个实例。 -- 在 source 为默认的 iotdb-source,且 source.forwarding-pipe-requests 为默认值 true 时,请不要构建出包含数据循环同步的应用场景(会导致无限循环): - - - IoTDB A -> IoTDB B -> IoTDB A - - IoTDB A -> IoTDB A - -### 4.2 启动流处理任务 - -CREATE PIPE 语句成功执行后,流处理任务相关实例会被创建,但整个流处理任务的运行状态会被置为 STOPPED,即流处理任务不会立刻处理数据(V1.3.0)。在 1.3.1 及以上的版本,流处理任务的运行状态在创建后将被立即置为 RUNNING。 - -可以使用 START PIPE 语句使流处理任务开始处理数据: - -```sql -START PIPE -``` - -### 4.3 停止流处理任务 - -使用 STOP PIPE 语句使流处理任务停止处理数据: - -```sql -STOP PIPE -``` - -### 4.4 删除流处理任务 - -使用 DROP PIPE 语句使流处理任务停止处理数据(当流处理任务状态为 RUNNING 时),然后删除整个流处理任务流处理任务: - -```sql -DROP PIPE -``` - -用户在删除流处理任务前,不需要执行 STOP 操作。 - -### 4.5 展示流处理任务 - -使用 SHOW PIPES 语句查看所有流处理任务: - -```sql -SHOW PIPES -``` - -查询结果如下: - -```sql -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -| ID| CreationTime | State|PipeSource|PipeProcessor|PipeSink|ExceptionMessage| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -|iotdb-kafka|2022-03-30T20:58:30.689|RUNNING| ...| ...| ...| {}| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -|iotdb-iotdb|2022-03-31T12:55:28.129|STOPPED| ...| ...| ...| TException: ...| -+-----------+-----------------------+-------+----------+-------------+--------+----------------+ -``` - -可以使用 `` 指定想看的某个流处理任务状态: - -```sql -SHOW PIPE -``` - -您也可以通过 where 子句,判断某个 \ 使用的 Pipe Sink 被复用的情况。 - -```sql -SHOW PIPES -WHERE 
SINK USED BY -``` - -### 4.6 流处理任务运行状态迁移 - -一个流处理 pipe 在其的生命周期中会经过多种状态: - -- **RUNNING:** pipe 正在正常工作 - - 当一个 pipe 被成功创建之后,其初始状态为工作状态(V1.3.1+) -- **STOPPED:** pipe 处于停止运行状态。当管道处于该状态时,有如下几种可能: - - 当一个 pipe 被成功创建之后,其初始状态为暂停状态(V1.3.0) - - 用户手动将一个处于正常运行状态的 pipe 暂停,其状态会被动从 RUNNING 变为 STOPPED - - 当一个 pipe 运行过程中出现无法恢复的错误时,其状态会自动从 RUNNING 变为 STOPPED -- **DROPPED:** pipe 任务被永久删除 - -下图表明了所有状态以及状态的迁移: - -![状态迁移图](/img/%E7%8A%B6%E6%80%81%E8%BF%81%E7%A7%BB%E5%9B%BE.png) - -## 5. 权限管理 - -### 5.1 流处理任务 - - -| 权限名称 | 描述 | -|----------|---------------| -| USE_PIPE | 注册流处理任务。路径无关。 | -| USE_PIPE | 开启流处理任务。路径无关。 | -| USE_PIPE | 停止流处理任务。路径无关。 | -| USE_PIPE | 卸载流处理任务。路径无关。 | -| USE_PIPE | 查询流处理任务。路径无关。 | - -### 5.2 流处理任务插件 - - -| 权限名称 | 描述 | -|----------|-----------------| -| USE_PIPE | 注册流处理任务插件。路径无关。 | -| USE_PIPE | 卸载流处理任务插件。路径无关。 | -| USE_PIPE | 查询流处理任务插件。路径无关。 | - -## 6. 配置参数 - -在 iotdb-system.properties 中: - -V1.3.0+: -```Properties -#################### -### Pipe Configuration -#################### - -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_connector_timeout_ms=900000 - -# The maximum number of selectors that can be used in the async connector. -# pipe_async_connector_selector_number=1 - -# The core number of clients that can be used in the async connector. 
-# pipe_async_connector_core_client_number=8 - -# The maximum number of clients that can be used in the async connector. -# pipe_async_connector_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. -# pipe_air_gap_receiver_port=9780 -``` - -V1.3.1+: -```Properties -# Uncomment the following field to configure the pipe lib directory. -# For Windows platform -# If its prefix is a drive specifier followed by "\\", or if its prefix is "\\\\", then the path is -# absolute. Otherwise, it is relative. -# pipe_lib_dir=ext\\pipe -# For Linux platform -# If its prefix is "/", then the path is absolute. Otherwise, it is relative. -# pipe_lib_dir=ext/pipe - -# The maximum number of threads that can be used to execute the pipe subtasks in PipeSubtaskExecutor. -# The actual value will be min(pipe_subtask_executor_max_thread_num, max(1, CPU core number / 2)). -# pipe_subtask_executor_max_thread_num=5 - -# The connection timeout (in milliseconds) for the thrift client. -# pipe_sink_timeout_ms=900000 - -# The maximum number of selectors that can be used in the sink. -# Recommend to set this value to less than or equal to pipe_sink_max_client_number. -# pipe_sink_selector_number=4 - -# The maximum number of clients that can be used in the sink. -# pipe_sink_max_client_number=16 - -# Whether to enable receiving pipe data through air gap. -# The receiver can only return 0 or 1 in tcp mode to indicate whether the data is received successfully. -# pipe_air_gap_receiver_enabled=false - -# The port for the server to receive pipe data through air gap. 
-# pipe_air_gap_receiver_port=9780 -``` diff --git a/src/zh/UserGuide/latest/User-Manual/Tiered-Storage_timecho.md b/src/zh/UserGuide/latest/User-Manual/Tiered-Storage_timecho.md deleted file mode 100644 index 86c183dc3..000000000 --- a/src/zh/UserGuide/latest/User-Manual/Tiered-Storage_timecho.md +++ /dev/null @@ -1,101 +0,0 @@ - - -# 多级存储 -## 1. 概述 - -多级存储功能向用户提供多种存储介质管理的能力,用户可以使用多级存储功能为 IoTDB 配置不同类型的存储介质,并为存储介质进行分级。具体的,在 IoTDB 中,多级存储的配置体现为多目录的管理。用户可以将多个存储目录归为同一类,作为一个“层级”向 IoTDB 中配置,这种“层级”我们称之为 storage tier;同时,用户可以根据数据的冷热进行分类,并将不同类别的数据存储到指定的“层级”中。当前 IoTDB 支持通过数据的 TTL 进行冷热数据的分类,当一个层级中的数据不满足当前层级定义的 TTL 规则时,该数据会被自动迁移至下一层级中。 - -## 2. 参数定义 - -在 IoTDB 中开启多级存储,需要进行以下几个方面的配置: - -1. 配置数据目录,并将数据目录分为不同的层级 -2. 配置每个层级所管理的数据的 TTL,以区分不同层级管理的冷热数据类别。 -3. 配置每个层级的最小剩余存储空间比例,当该层级的存储空间触发该阈值时,该层级的数据会被自动迁移至下一层级(可选)。 - -具体的参数定义及其描述如下。 - -| 配置项 | 默认值 | 是否必填 | 说明 | 约束 | -| --------------------------------------- | ------------------------ | --- | ------------------------------------------------------------ | ------------------------------------------------------------ | -| dn_data_dirs | data/datanode/data | 是 | 用来指定不同的存储目录,并将存储目录进行层级划分 | 每级存储使用分号分隔,单级内使用逗号分隔;云端配置只能作为最后一级存储且第一级不能作为云端存储;最多配置一个云端对象;远端存储目录使用 OBJECT_STORAGE 来表示 | -| tier_ttl_in_ms | -1 | 是 | 定义每个层级负责的数据范围,通过 TTL 表示 | 每级存储使用分号分隔;层级数量需与 dn_data_dirs 定义的层级数一致;"-1" 表示"无限制" | -| dn_default_space_usage_thresholds | 0.85 | 是 | 定义每个层级数据目录的最大使用空间比例;当使用空间大于该比例时,数据会被自动迁移至下一个层级;当最后一个层级的使用存储空间大于此阈值时,会将系统置为 READ_ONLY | 每级存储使用分号分隔;层级数量需与 dn_data_dirs 定义的层级数一致 | -| object_storage_type | AWS_S3 | 使用远端存储时必填 | 云端存储类型 | IoTDB 支持 S3 协议作为远端存储类型 | -| object_storage_bucket | iotdb_data | 使用远端存储时必填 | 云端存储 bucket 的名称 | AWS S3 中的 bucket 定义 | -| object_storage_region | | 使用远端存储时必填 | 云端存储的服务区域 | AWS S3 中的 region 定义 | -| object_storage_endpoint | | 使用远端存储时必填 | 云端存储的 endpoint | AWS S3 的 endpoint | -| object_storage_access_key | | 使用远端存储时必填 | 云端存储的验证信息 key | AWS S3 的 credential key | -| object_storage_access_secret | | 使用远端存储时必填 | 云端存储的验证信息 secret | AWS S3 
的 credential secret | -| enable_path_style_access | false | 否 | 是否启用云端存储服务路径访问 | | -| remote_tsfile_cache_dirs | data/datanode/data/cache | 否 | 云端存储在本地的缓存目录 | | -| remote_tsfile_cache_page_size_in_kb | 20480 | 否 | 云端存储在本地缓存文件的块大小 | | -| remote_tsfile_cache_max_disk_usage_in_mb | 51200 | 否 | 云端存储本地缓存的最大磁盘占用大小 | | - - -## 3. 本地多级存储配置示例 - -以下以本地两级存储的配置示例。 - -```JavaScript -// 必须配置项 -dn_data_dirs=/data1/data;/data2/data,/data3/data; -tier_ttl_in_ms=86400000;-1 -dn_default_space_usage_thresholds=0.2;0.1 -``` - -在该示例中,共配置了两个层级的存储,具体为: - -| **层级** | **数据目录** | **数据范围** | **磁盘最小剩余空间阈值** | -| -------- | -------------------------------------- | --------------- | ------------------------ | -| 层级一 | 目录一:/data1/data | 最近 1 天的数据 | 20% | -| 层级二 | 目录一:/data2/data目录二:/data3/data | 1 天以前的数据 | 10% | - -## 4. 远端多级存储配置示例 - -以下以三级存储为例: - -```JavaScript -// 必须配置项 -dn_data_dirs=/data1/data;/data2/data,/data3/data;OBJECT_STORAGE -tier_ttl_in_ms=86400000;864000000;-1 -dn_default_space_usage_thresholds=0.2;0.15;0.1 -object_storage_type=AWS_S3 -object_storage_bucket=iotdb -object_storage_region= -object_storage_endpoint= -object_storage_access_key= -object_storage_access_secret= - -// 可选配置项 -enable_path_style_access=false -remote_tsfile_cache_dirs=data/datanode/data/cache -remote_tsfile_cache_page_size_in_kb=20971520 -remote_tsfile_cache_max_disk_usage_in_mb=53687091200 -``` - -在该示例中,共配置了三个层级的存储,具体为: - -| **层级** | **数据目录** | **数据范围** | **磁盘最小剩余空间阈值** | -| -------- | -------------------------------------- | ---------------------------- | ------------------------ | -| 层级一 | 目录一:/data1/data | 最近 1 天的数据 | 20% | -| 层级二 | 目录一:/data2/data目录二:/data3/data | 过去1 天至过去 10 天内的数据 | 15% | -| 层级三 | 远端 S3 协议存储 | 过去 10 天以前的数据 | 10% | \ No newline at end of file diff --git a/src/zh/UserGuide/latest/User-Manual/User-defined-function_timecho.md b/src/zh/UserGuide/latest/User-Manual/User-defined-function_timecho.md deleted file mode 100644 index 9ae75d8b5..000000000 --- 
a/src/zh/UserGuide/latest/User-Manual/User-defined-function_timecho.md +++ /dev/null @@ -1,928 +0,0 @@ -# UDF - -## 1. UDF 介绍 - -UDF(User Defined Function)即用户自定义函数,IoTDB 提供多种内建的面向时序处理的函数,也支持扩展自定义函数来满足更多的计算需求。 - -IoTDB 支持两种类型的 UDF 函数,如下表所示。 - - - - - - - - - - - - - - - - - - - - - - -
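| UDF type | Data access strategy | Description |
|----------|----------------------|-------------|
| UDTF | `MAPPABLE_ROW_BY_ROW` | User-defined scalar function. It takes one row of k input time series and outputs one row of one time series. It can be used in any clause or expression where a scalar function may appear, such as the SELECT and WHERE clauses. |
| UDTF | `ROW_BY_ROW`, `SLIDING_TIME_WINDOW`, `SLIDING_SIZE_WINDOW`, `SESSION_TIME_WINDOW`, `STATE_WINDOW` | User-defined time series generating function. It takes m rows of k input time series and outputs n rows of one time series; the number of input rows m may differ from the number of output rows n. It can only be used in the SELECT clause. |
| UDAF | - | User-defined aggregate function. It takes m rows of k input time series and outputs one row of one time series. It can be used in any clause or expression where an aggregate function may appear, such as the SELECT and HAVING clauses. |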
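
The three input/output shapes in the table above can be sketched in plain Java. This is only an illustration of row cardinality under simplifying assumptions: the class `UdfShapes` and its methods are hypothetical names, they operate on plain arrays, and they do not use the IoTDB UDF API.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration of the three UDF shapes; hypothetical, not the IoTDB UDF API. */
public class UdfShapes {

  /** Scalar shape (MAPPABLE_ROW_BY_ROW): one row of k values in, one value out. */
  static double scalarSum(double[] row) {
    double sum = 0;
    for (double v : row) {
      sum += v;
    }
    return sum;
  }

  /** Generating shape (e.g. ROW_BY_ROW): m rows in, n rows out, where n may differ from m. */
  static List<Double> keepPositive(double[] series) {
    List<Double> out = new ArrayList<>();
    for (double v : series) {
      if (v > 0) {
        out.add(v); // rows that fail the condition produce no output row
      }
    }
    return out;
  }

  /** Aggregate shape (UDAF): m rows in, exactly one value out. */
  static double max(double[] series) {
    double best = Double.NEGATIVE_INFINITY;
    for (double v : series) {
      best = Math.max(best, v);
    }
    return best;
  }

  public static void main(String[] args) {
    double[] series = {3.0, -1.0, 4.0};
    System.out.println(scalarSum(new double[] {1.0, 2.0})); // one row with k = 2 columns
    System.out.println(keepPositive(series)); // 3 rows in, 2 rows out
    System.out.println(max(series)); // 3 rows in, 1 row out
  }
}
```

A real UDTF or UDAF would receive its rows from the query engine rather than from an array, but the relationship between input and output row counts is the same.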
- -### 1.1 UDF 使用 - -UDF 的使用方法与普通内建函数类似,可以直接在 SELECT 语句中像调用普通函数一样使用UDF。 - -#### 1.支持的基础 SQL 语法 - -* `SLIMIT` / `SOFFSET` -* `LIMIT` / `OFFSET` -* 支持值过滤 -* 支持时间过滤 - - -#### 2. 带 * 查询 - -假定现在有时间序列 `root.sg.d1.s1`和 `root.sg.d1.s2`。 - -* **执行`SELECT example(*) from root.sg.d1`** - -那么结果集中将包括`example(root.sg.d1.s1)`和`example(root.sg.d1.s2)`的结果。 - -* **执行`SELECT example(s1, *) from root.sg.d1`** - -那么结果集中将包括`example(root.sg.d1.s1, root.sg.d1.s1)`和`example(root.sg.d1.s1, root.sg.d1.s2)`的结果。 - -* **执行`SELECT example(*, *) from root.sg.d1`** - -那么结果集中将包括`example(root.sg.d1.s1, root.sg.d1.s1)`,`example(root.sg.d1.s2, root.sg.d1.s1)`,`example(root.sg.d1.s1, root.sg.d1.s2)` 和 `example(root.sg.d1.s2, root.sg.d1.s2)`的结果。 - -#### 3. 带自定义输入参数的查询 - -可以在进行 UDF 查询的时候,向 UDF 传入任意数量的键值对参数。键值对中的键和值都需要被单引号或者双引号引起来。注意,键值对参数只能在所有时间序列后传入。下面是一组例子: - - 示例: -``` sql -SELECT example(s1, 'key1'='value1', 'key2'='value2'), example(*, 'key3'='value3') FROM root.sg.d1; -SELECT example(s1, s2, 'key1'='value1', 'key2'='value2') FROM root.sg.d1; -``` - -#### 4. 与其他查询的嵌套查询 - - 示例: -``` sql -SELECT s1, s2, example(s1, s2) FROM root.sg.d1; -SELECT *, example(*) FROM root.sg.d1 DISABLE ALIGN; -SELECT s1 * example(* / s1 + s2) FROM root.sg.d1; -SELECT s1, s2, s1 + example(s1, s2), s1 - example(s1 + example(s1, s2) / s2) FROM root.sg.d1; -``` - - - -## 2. UDF 管理 - -### 2.1 UDF 注册 - -注册一个 UDF 可以按如下流程进行: - -1. 实现一个完整的 UDF 类,假定这个类的全类名为`org.apache.iotdb.udf.UDTFExample` -2. 将项目打成 JAR 包,如果使用 Maven 管理项目,可以参考 [Maven 项目示例](https://github.com/apache/iotdb/tree/master/example/udf)的写法 -3. 进行注册前的准备工作,根据注册方式的不同需要做不同的准备,具体可参考以下例子 -4. 
使用以下 SQL 语句注册 UDF - -```sql -CREATE FUNCTION AS (USING URI URI-STRING) -``` - -#### 示例:注册名为`example`的 UDF,以下两种注册方式任选其一即可 - -#### 方式一:手动放置jar包 - -准备工作: -使用该种方式注册时,需要提前将 JAR 包放置到集群所有节点的 `ext/udf`目录下(该目录可配置)。 - -注册语句: - -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' -``` - -#### 方式二:集群通过URI自动安装jar包 - -准备工作: -使用该种方式注册时,需要提前将 JAR 包上传到 URI 服务器上并确保执行注册语句的 IoTDB 实例能够访问该 URI 服务器。 - -注册语句: - -```sql -CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample' USING URI 'http://jar/example.jar' -``` - -IoTDB 会下载 JAR 包并同步到整个集群。 - -#### 注意 - -1. 由于 IoTDB 的 UDF 是通过反射技术动态装载的,因此在装载过程中无需启停服务器。 - -2. UDF 函数名称是大小写不敏感的。 - -3. 请不要给 UDF 函数注册一个内置函数的名字。使用内置函数的名字给 UDF 注册会失败。 - -4. 不同的 JAR 包中最好不要有全类名相同但实现功能逻辑不一样的类。例如 UDF(UDAF/UDTF):`udf1`、`udf2`分别对应资源`udf1.jar`、`udf2.jar`。如果两个 JAR 包里都包含一个`org.apache.iotdb.udf.UDTFExample`类,当同一个 SQL 中同时使用到这两个 UDF 时,系统会随机加载其中一个类,导致 UDF 执行行为不一致。 - -### 2.2 UDF 卸载 - -SQL 语法如下: - -```sql -DROP FUNCTION -``` - -示例:卸载上述例子的 UDF: - -```sql -DROP FUNCTION example -``` - -注意:对于使用 using uri 注册的函数,需要移除集群所有节点路径(`安装包/ext/udf/install`)中存在的 UDF 的 jar 文件。 - -### 2.3 查看所有注册的 UDF - -``` sql -SHOW FUNCTIONS -``` - -### 2.4 UDF 配置 - -- 允许在 `iotdb-system.properties` 中配置 udf 的存储目录.: - ``` Properties -# UDF lib dir - -udf_lib_dir=ext/udf -``` - -- 使用自定义函数时,提示内存不足,更改 `iotdb-system.properties` 中下述配置参数并重启服务。 - ``` Properties - -# Used to estimate the memory usage of text fields in a UDF query. -# It is recommended to set this value to be slightly larger than the average length of all text -# effectiveMode: restart -# Datatype: int -udf_initial_byte_array_length_for_memory_control=48 - -# How much memory may be used in ONE UDF query (in MB). -# The upper limit is 20% of allocated memory for read. -# effectiveMode: restart -# Datatype: float -udf_memory_budget_in_mb=30.0 - -# UDF memory allocation ratio. -# The parameter form is a:b:c, where a, b, and c are integers. 
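-# effectiveMode: restart
-udf_reader_transformer_collector_memory_proportion=1:1:1
-```
-
-### 2.5 UDF User Permissions
-
-Using UDFs involves the `USE_UDF` privilege. Only users who hold this privilege may register, unregister, and query UDFs.
-
-For more on user permissions, see [Authority Management Statements](../User-Manual/Authority-Management_timecho##权限管理).
-
-
-## 3. UDF Libraries
-
-Building on the user-defined function capability, IoTDB provides a set of functions for time series processing, covering data quality, data profiling, anomaly detection, frequency-domain analysis, data matching, data repairing, series discovery, and machine learning, to meet the time series processing needs of industrial scenarios.
-
-See the [UDF Libraries](../SQL-Manual/UDF-Libraries_timecho.md) document for the installation steps and the registration statement of each function, so that every function you need is registered correctly.
-
-## 4. UDF Development
-
-### 4.1 UDF Dependencies
-
-If you use [Maven](http://search.maven.org/), you can search the [Maven repository](http://search.maven.org/) for the dependency shown below. Choose the dependency version that matches the version of the target IoTDB server.
-
-``` xml
-<dependency>
-  <groupId>org.apache.iotdb</groupId>
-  <artifactId>udf-api</artifactId>
-  <version>1.0.0</version>
-  <scope>provided</scope>
-</dependency>
-```
-
-### 4.2 UDTF (User Defined Timeseries Generating Function)
-
-To write a UDTF, implement the `org.apache.iotdb.udf.api.UDTF` interface and provide at least the `beforeStart` method and one `transform` method.
-
-#### Interface overview:
-
-| Interface definition | Description | Required |
-| :----------------------------------------------------------- | :----------------------------------------------------------- | ------------------------- |
-| void validate(UDFParameterValidator validator) throws Exception | Runs before the initialization method `beforeStart` and checks whether the user-supplied parameters in `UDFParameters` are valid. | No |
-| void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception | Initialization method that runs user-defined setup before the UDTF processes input data. The framework constructs a new UDF class instance for every UDTF query, and this method is called once when that instance is initialized. Within the lifetime of a UDF class instance it is called exactly once. | Yes |
-| Object transform(Row row) throws Exception | Called by the framework. You can use this method when `beforeStart` selects `MappableRowByRowAccessStrategy` to consume the raw data. Input arrives as a `Row`, and the result is returned as an `Object`. | One of the four `transform` methods |
-| void transform(Column[] columns, ColumnBuilder builder) throws Exception | Called by the framework. You can use this method when `beforeStart` selects `MappableRowByRowAccessStrategy` to consume the raw data. Input arrives as `Column[]`, and output is written through `ColumnBuilder`. You must call the data-collecting methods of `builder` yourself to determine the final output. | One of the four `transform` methods |
-| void transform(Row row, PointCollector collector) throws Exception | Called by the framework when `beforeStart` selects `RowByRowAccessStrategy` to consume the raw data. Input arrives as a `Row`, and output is written through `PointCollector`. You must call the data-collecting methods of `collector` yourself to determine the final output. | One of the four `transform` methods |
-| void transform(RowWindow rowWindow, PointCollector collector) throws Exception | Called by the framework when `beforeStart` selects `SlidingSizeWindowAccessStrategy` or `SlidingTimeWindowAccessStrategy` to consume the raw data. Input arrives as a `RowWindow`, and output is written through `PointCollector`. You must call the data-collecting methods of `collector` yourself to determine the final output. | One of the four `transform` methods |
-| void terminate(PointCollector collector) throws Exception | Called by the framework after all `transform` calls have finished and before `beforeDestroy`. It is called exactly once per UDF query. You must call the data-collecting methods of `collector` yourself to determine the final output. | No |
-| void beforeDestroy() | Finalization method of the UDTF. Called by the framework exactly once, after the last record has been processed. | No |
-
-Within the lifetime of a UDTF instance, the methods are called in the following order:
-
-1. void validate(UDFParameterValidator validator) throws Exception
-2. void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception
-3. Object transform(Row row) throws Exception, or void transform(Column[] columns, ColumnBuilder builder) throws Exception, or void transform(Row row, PointCollector collector) throws Exception, or void transform(RowWindow rowWindow, PointCollector collector) throws Exception
-4. void terminate(PointCollector collector) throws Exception
-5. void beforeDestroy()
-
-> Note: the framework constructs a brand-new UDF class instance for every UDTF query and destroys it when the query ends, so the internal data of UDF class instances is isolated between UDTF queries, even within the same SQL statement. You can safely keep state in a UDTF without worrying about concurrent access to the instance's internal state.
-
-#### Interface details:
-
-1. **void validate(UDFParameterValidator validator) throws Exception**
-
- The `validate` method validates the parameters supplied by the user.
-
- In this method you can restrict the number and types of input series, check user-supplied attributes, or run custom validation logic.
-
-For how to use `UDFParameterValidator`, see the [Javadoc](https://github.com/apache/iotdb/blob/rc/2.0.4/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/parameter/UDFParameterValidator.java).
-
-2. 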
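**void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception**
-
- The `beforeStart` method serves three purposes:
- 1. Parsing the UDF parameters in the SQL statement.
- 2. Configuring the information the UDF needs at run time, namely the strategy for accessing the raw data and the type of the output series.
- 3. Creating resources, such as opening external connections or files.
-
-2.1 **UDFParameters**
-
-`UDFParameters` parses the UDF parameters in the SQL statement, that is, the part in parentheses after the UDF name. The parameters consist of series parameters and attribute parameters given as string key-value pairs.
-
-Example:
-
-``` sql
-SELECT UDF(s1, s2, 'key1'='iotdb', 'key2'='123.45') FROM root.sg.d;
-```
-
-Usage:
-
-``` java
-void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception {
-  String stringValue = parameters.getString("key1"); // iotdb
-  Float floatValue = parameters.getFloat("key2"); // 123.45
-  Double doubleValue = parameters.getDouble("key3"); // null
-  int intValue = parameters.getIntOrDefault("key4", 678); // 678
-  // do something
-
-  // configurations
-  // ...
-}
-```
-
-2.2 **UDTFConfigurations**
-
-You must use `UDTFConfigurations` to specify the strategy the UDF uses to access the raw data and the type of the output series.
-
-Usage:
-
-``` java
-void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception {
-  // parameters
-  // ...
-
-  // configurations
-  configurations
-    .setAccessStrategy(new RowByRowAccessStrategy())
-    .setOutputDataType(Type.INT32);
-}
-```
-
-`setAccessStrategy` sets the strategy the UDF uses to access the raw data, and `setOutputDataType` sets the type of the output series.
-
- 2.2.1 **setAccessStrategy**
-
-Note that the access strategy you set here determines which `transform` method the framework calls, so implement the `transform` method that corresponds to the chosen strategy. You may also choose the strategy dynamically from the attribute parameters parsed by `UDFParameters`, in which case implementing more than one `transform` method is allowed.
-
-The following access strategies are available:
-
-| Interface definition | Description | `transform` method called |
-| ------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| MappableRowByRowAccessStrategy | Custom scalar function.<br>The framework calls `transform` once for each input row: k input series with 1 row in, 1 series with 1 row out. It can be used in any clause or expression where a scalar function may appear, such as the SELECT clause or the WHERE clause. | void transform(Column[] columns, ColumnBuilder builder) throws Exception<br>Object transform(Row row) throws Exception |
-| RowByRowAccessStrategy | Custom time series generating function that processes the raw data row by row.<br>The framework calls `transform` once for each input row: k input series with 1 row in, 1 series with n rows out.<br>With a single input series, each row is one data point of that series.<br>With multiple input series, the series are aligned by time and each row is one data point of the aligned input.<br>(Within a row some columns may be `null`, but never all of them.) | void transform(Row row, PointCollector collector) throws Exception |
-| SlidingTimeWindowAccessStrategy | Custom time series generating function that processes the raw data with a sliding time window.<br>The framework calls `transform` once for each input window: k input series with m rows in, 1 series with n rows out.<br>A window may contain multiple rows. The input series are aligned by time, and each window is treated as one data point of the input.<br>(A window may contain i rows; within a row some columns may be `null`, but never all of them.) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| SlidingSizeWindowAccessStrategy | Custom time series generating function that processes the raw data in windows of a fixed number of rows, so every window except possibly the last contains the same number of rows.<br>The framework calls `transform` once for each input window: k input series with m rows in, 1 series with n rows out.<br>A window may contain multiple rows. The input series are aligned by time, and each window is treated as one data point of the input.<br>(A window may contain i rows; within a row some columns may be `null`, but never all of them.) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| SessionTimeWindowAccessStrategy | Custom time series generating function that processes the raw data with session windows.<br>The framework calls `transform` once for each input window: k input series with m rows in, 1 series with n rows out.<br>A window may contain multiple rows. The input series are aligned by time, and each window is treated as one data point of the input.<br>(A window may contain i rows; within a row some columns may be `null`, but never all of them.) | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-| StateWindowAccessStrategy | Custom time series generating function that processes the raw data with state windows.<br>The framework calls `transform` once for each input window: 1 input series with m rows in, 1 series with n rows out.<br>A window may contain multiple rows. Currently windowing is supported on a single measurement only, that is, on one column. | void transform(RowWindow rowWindow, PointCollector collector) throws Exception |
-
-#### Strategy details:
-
-- The constructors of `MappableRowByRowAccessStrategy` and `RowByRowAccessStrategy` take no parameters.
-
-- `SlidingTimeWindowAccessStrategy`
-
-Windowing diagram:
-
-
-
-`SlidingTimeWindowAccessStrategy` has several constructors. You can pass three kinds of parameters to them:
-
-1. The start time and end time of the display window on the time axis.
-
-These two values are optional. If you omit them, the start time of the display window is the smallest timestamp in the whole query result set, and the end time is the largest timestamp in the whole query result set.
-
-2. The time interval that divides the time axis (must be positive).
-3. The sliding step (need not be greater than or equal to the time interval, but must be positive).
-
-The sliding step is optional as well. If you omit it, the sliding step is set to the time interval that divides the time axis.
-
-The relationship among the three kinds of parameters is shown in the figure below. For the constructors of this strategy, see the [Javadoc](https://github.com/apache/iotdb/blob/rc/2.0.4/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/SlidingTimeWindowAccessStrategy.java).
-
-
-
-> Note: the actual time span of the last few windows may be shorter than the specified time interval. Some windows may also contain zero rows; the framework still calls `transform` once for such a window.
-
-- `SlidingSizeWindowAccessStrategy`
-
-Windowing diagram:
-
-
-
-`SlidingSizeWindowAccessStrategy` has several constructors. You can pass two parameters to them:
-
-1. The window size, that is, the number of rows in one processing window. Note that the last few windows may contain fewer rows than specified.
-2. The sliding step, that is, the number of rows between the first row of the current window and the first row of the next window (need not be greater than or equal to the window size, but must be positive).
-
-The sliding step is optional. If you omit it, the sliding step is set to the window size.
-
-- `SessionTimeWindowAccessStrategy`
-
-Windowing diagram: **rows whose time gap is less than or equal to the given minimum time gap sessionGap are placed in the same group.**
-
-
-
-
-`SessionTimeWindowAccessStrategy` has several constructors. You can pass two kinds of parameters to them:
-
-1. The start time and end time of the display window on the time axis.
-2. The minimum time gap between session windows.
-
-- `StateWindowAccessStrategy`
-
-Windowing diagram: **for numeric data, rows whose state difference is less than or equal to the given threshold delta are placed in the same group.**
-
-
-
-`StateWindowAccessStrategy` has four constructors:
-
-1. For numeric data, you can provide the start and end times of the display window on the time axis together with the threshold delta, the change allowed within a single window.
-2. For text and boolean data, you can provide the start and end times of the display window on the time axis. For these two data types, all values within one window are identical, so no threshold is needed.
-3. For numeric data, you can provide only the threshold delta. The start time of the display window is then the smallest timestamp in the whole query result set, and the end time is the largest timestamp in the whole query result set.
-4. 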
For text and boolean data, you can provide no parameters at all. The start and end timestamps are then determined as described in item 3.
-
-StateWindowAccessStrategy currently accepts only one input column. For the constructors of this strategy, see the [Javadoc](https://github.com/apache/iotdb/blob/rc/2.0.4/iotdb-api/udf-api/src/main/java/org/apache/iotdb/udf/api/customizer/strategy/StateWindowAccessStrategy.java).
-
- 2.2.2 **setOutputDataType**
-
-Note that the output series type you set here determines which data types the `PointCollector` in `transform` can actually accept. The output type set in `setOutputDataType` maps to the types accepted by `PointCollector` as follows:
-
-| Output type set in `setOutputDataType` | Output type accepted by `PointCollector` |
-| :---------------------------------- | :----------------------------------------------------------- |
-| INT32 | int |
-| INT64 | long |
-| FLOAT | float |
-| DOUBLE | double |
-| BOOLEAN | boolean |
-| TEXT | java.lang.String and org.apache.iotdb.udf.api.type.Binary |
-
-The output series type of a UDTF is decided at run time, so you can choose it dynamically from the types of the input series.
-
-Example:
-
-```java
-void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) throws Exception {
-  // do something
-  // ...
-
-  configurations
-    .setAccessStrategy(new RowByRowAccessStrategy())
-    .setOutputDataType(parameters.getDataType(0));
-}
-```
-
-3. 
**Object transform(Row row) throws Exception**
-
-If `beforeStart` sets the access strategy to `MappableRowByRowAccessStrategy`, you must implement either this method or `void transform(Column[] columns, ColumnBuilder builder) throws Exception`, described after it, and put the logic for processing the raw data there.
-
-This method processes one row of raw data per call. The raw data is read from `Row`, and the result is the return value. Each call must produce exactly one output point for the input point, so input and output remain one-to-one. The type of the output point must match the type set in `beforeStart`, and the timestamps of the output points must be strictly increasing.
-
-The following is a complete UDF that implements `Object transform(Row row) throws Exception`. It is an adder that takes two input time series and outputs the algebraic sum of the two data points.
-
-```java
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.access.Row;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Adder implements UDTF {
-  private Type dataType;
-
-  @Override
-  public void validate(UDFParameterValidator validator) throws Exception {
-    validator
-        .validateInputSeriesNumber(2)
-        .validateInputSeriesDataType(0, Type.INT64)
-        .validateInputSeriesDataType(1, Type.INT64);
-  }
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    dataType = parameters.getDataType(0);
-    configurations
-        .setAccessStrategy(new MappableRowByRowAccessStrategy())
-        .setOutputDataType(dataType);
-  }
-
-  @Override
-  public Object transform(Row row) throws Exception {
-    return row.getLong(0) + row.getLong(1);
-  }
-}
-```
-
-4. 
**void transform(Column[] columns, ColumnBuilder builder) throws Exception**
-
-If `beforeStart` sets the access strategy to `MappableRowByRowAccessStrategy`, you need to implement this method and put the logic for processing the raw data there.
-
-This method processes multiple rows of raw data per call. In our performance tests, a UDTF that processes multiple rows at once performed better than one that processes a single row at a time. The raw data is read from `Column[]`, and the output is written through `ColumnBuilder`. Each call must produce exactly one output point for every input point, so input and output remain one-to-one. The type of the output points must match the type set in `beforeStart`, and the timestamps of the output points must be strictly increasing.
-
-The following is a complete UDF that implements `void transform(Column[] columns, ColumnBuilder builder) throws Exception`. It is an adder that takes two input time series and outputs the algebraic sum of the two data points.
-
-``` java
-import org.apache.iotdb.tsfile.read.common.block.column.Column;
-import org.apache.iotdb.tsfile.read.common.block.column.ColumnBuilder;
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameterValidator;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.MappableRowByRowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Adder implements UDTF {
-  private Type type;
-
-  @Override
-  public void validate(UDFParameterValidator validator) throws Exception {
-    validator
-        .validateInputSeriesNumber(2)
-        .validateInputSeriesDataType(0, Type.INT64)
-        .validateInputSeriesDataType(1, Type.INT64);
-  }
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    type = parameters.getDataType(0);
-    configurations.setAccessStrategy(new MappableRowByRowAccessStrategy()).setOutputDataType(type);
-  }
-
-  @Override
-  public void transform(Column[] columns, ColumnBuilder builder) throws Exception {
-    long[] inputs1 = columns[0].getLongs();
-    long[] inputs2 = columns[1].getLongs();
-
-    int count = columns[0].getPositionCount();
-    for (int i = 0; i < count; i++) {
-      builder.writeLong(inputs1[i] + inputs2[i]);
-    }
-  }
-}
-```
-
-5. 
**void transform(Row row, PointCollector collector) throws Exception**
-
-If `beforeStart` sets the access strategy to `RowByRowAccessStrategy`, you need to implement this method and put the logic for processing the raw data there.
-
-This method processes one row of raw data per call. The raw data is read from `Row`, and the output is written through `PointCollector`. A single call may output any number of data points. The type of the output points must match the type set in `beforeStart`, and the timestamps of the output points must be strictly increasing.
-
-The following is a complete UDF that implements `void transform(Row row, PointCollector collector) throws Exception`. It is an adder that takes two input time series and, when neither data point is `null`, outputs the algebraic sum of the two data points.
-
-``` java
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.access.Row;
-import org.apache.iotdb.udf.api.collector.PointCollector;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Adder implements UDTF {
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    configurations
-        .setOutputDataType(Type.INT64)
-        .setAccessStrategy(new RowByRowAccessStrategy());
-  }
-
-  @Override
-  public void transform(Row row, PointCollector collector) throws Exception {
-    if (row.isNull(0) || row.isNull(1)) {
-      return;
-    }
-    collector.putLong(row.getTime(), row.getLong(0) + row.getLong(1));
-  }
-}
-```
-
-6. 
**void transform(RowWindow rowWindow, PointCollector collector) throws Exception**
-
-If `beforeStart` sets the access strategy to `SlidingTimeWindowAccessStrategy` or `SlidingSizeWindowAccessStrategy`, you need to implement this method and put the logic for processing the raw data there.
-
-This method processes a batch of data per call, either a fixed number of rows or the rows within a fixed time interval. The container that holds such a batch is called a window. The raw data is read from `RowWindow`, and the output is written through `PointCollector`. `RowWindow` gives you access to the `Row`s of one batch through both random-access and iterator interfaces. A single call may output any number of data points. The type of the output points must match the type set in `beforeStart`, and the timestamps of the output points must be strictly increasing.
-
-The following is a complete UDF that implements `void transform(RowWindow rowWindow, PointCollector collector) throws Exception`. It is a counter that accepts any number of input time series and outputs the number of rows in each time window within the specified time range.
-
-```java
-import java.io.IOException;
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.access.RowWindow;
-import org.apache.iotdb.udf.api.collector.PointCollector;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.SlidingTimeWindowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Counter implements UDTF {
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    configurations
-        .setOutputDataType(Type.INT32)
-        .setAccessStrategy(new SlidingTimeWindowAccessStrategy(
-            parameters.getLong("time_interval"),
-            parameters.getLong("sliding_step"),
-            parameters.getLong("display_window_begin"),
-            parameters.getLong("display_window_end")));
-  }
-
-  @Override
-  public void transform(RowWindow rowWindow, PointCollector collector) throws Exception {
-    if (rowWindow.windowSize() != 0) {
-      collector.putInt(rowWindow.windowStartTime(), rowWindow.windowSize());
-    }
-  }
-}
-```
-
-7. 
**void terminate(PointCollector collector) throws Exception**
-
-Some UDFs can only produce their final output after traversing all of the raw data. The `terminate` interface supports this kind of UDF.
-
-This method is called after all `transform` calls have finished and before `beforeDestroy`. You can use `transform` purely for processing and then emit the result in `terminate`.
-
-The result must be written through `PointCollector`. A single `terminate` call may output any number of data points. The type of the output points must match the type set in `beforeStart`, and the timestamps of the output points must be strictly increasing.
-
-The following is a complete UDF that implements `void terminate(PointCollector collector) throws Exception`. It takes one input time series of type `INT32` and outputs the point with the maximum value in that series.
-
-```java
-import java.io.IOException;
-import org.apache.iotdb.udf.api.UDTF;
-import org.apache.iotdb.udf.api.access.Row;
-import org.apache.iotdb.udf.api.collector.PointCollector;
-import org.apache.iotdb.udf.api.customizer.config.UDTFConfigurations;
-import org.apache.iotdb.udf.api.customizer.parameter.UDFParameters;
-import org.apache.iotdb.udf.api.customizer.strategy.RowByRowAccessStrategy;
-import org.apache.iotdb.udf.api.type.Type;
-
-public class Max implements UDTF {
-
-  // Timestamp and value of the largest point seen so far; time stays null until the first non-null row.
-  private Long time;
-  private int value;
-
-  @Override
-  public void beforeStart(UDFParameters parameters, UDTFConfigurations configurations) {
-    configurations
-        .setOutputDataType(Type.INT32)
-        .setAccessStrategy(new RowByRowAccessStrategy());
-  }
-
-  @Override
-  public void transform(Row row, PointCollector collector) {
-    if (row.isNull(0)) {
-      return;
-    }
-    int candidateValue = row.getInt(0);
-    if (time == null || value < candidateValue) {
-      time = row.getTime();
-      value = candidateValue;
-    }
-  }
-
-  @Override
-  public void terminate(PointCollector collector) throws IOException {
-    if (time != null) {
-      collector.putInt(time, value);
-    }
-  }
-}
-```
-
-8. 
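**void beforeDestroy()**
-
-Finalization method of the UDTF. You can release resources and perform similar cleanup here.
-
-This method is called by the framework. For a given UDF class instance it is called exactly once in its lifetime, after the last record has been processed.
-
-### 4.3 UDAF (User Defined Aggregation Function)
-
-A complete UDAF definition involves two classes: State and UDAF.
-
-#### State class
-
-To write a State class, implement the `org.apache.iotdb.udf.api.State` interface. The table below describes the methods to implement.
-
-#### Interface overview:
-
-| Interface definition | Description | Required |
-| -------------------------------- | ------------------------------------------------------------ | -------- |
-| void reset() | Resets the `State` object to its initial state. As when writing a constructor, assign the initial value of every field of the `State` class in this method. | Yes |
-| byte[] serialize() | Serializes the `State` to binary data. IoTDB uses this method to pass `State` objects internally. The serialization order must match the deserialization method below. | Yes |
-| void deserialize(byte[] bytes) | Deserializes binary data into the `State`. IoTDB uses this method to pass `State` objects internally. The deserialization order must match the serialization method above. | Yes |
-
-#### Interface details:
-
-1. **void reset()**
-
-This method resets the `State` to its initial state; assign the initial value of every field of the `State` object in it. As an optimization, IoTDB reuses `State` objects internally wherever possible instead of creating a new `State` for every group, which would add unnecessary overhead. After a `State` has been updated with all the data of one group, this method is called to reset it before the next group is processed.
-
-Take the `State` for computing an average (`avg`) as an example: it needs the sum of the data, `sum`, and the number of data points, `count`, and `reset()` initializes both to 0.
-
-```java
-class AvgState implements State {
-  double sum;
-
-  long count;
-
-  @Override
-  public void reset() {
-    sum = 0;
-    count = 0;
-  }
-
-  // other methods
-}
-```
-
-2. 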
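**byte[] serialize()/void deserialize(byte[] bytes)**
-
-These methods serialize the State to binary data and deserialize the State from binary data. IoTDB is a distributed database and passes data between nodes, so you need to write both methods to allow a State to be transferred between nodes. The serialization order and the deserialization order must match.
-
-Taking the State for computing an average (`avg`) again as an example, you can convert the contents of the State to a `byte[]` array, and read them back from a `byte[]` array, in any way you like. The code below performs the serialization and deserialization with `java.nio.ByteBuffer`:
-
-```java
-@Override
-public byte[] serialize() {
-  ByteBuffer buffer = ByteBuffer.allocate(Double.BYTES + Long.BYTES);
-  buffer.putDouble(sum);
-  buffer.putLong(count);
-
-  return buffer.array();
-}
-
-@Override
-public void deserialize(byte[] bytes) {
-  ByteBuffer buffer = ByteBuffer.wrap(bytes);
-  sum = buffer.getDouble();
-  count = buffer.getLong();
-}
-```
-
-#### UDAF class
-
-To write a UDAF class, implement the `org.apache.iotdb.udf.api.UDAF` interface. The table below describes the methods to implement.
-
-#### Interface overview:
-
-| Interface definition | Description | Required |
-| ------------------------------------------------------------ | ------------------------------------------------------------ | -------- |
-| void validate(UDFParameterValidator validator) throws Exception | Runs before the initialization method `beforeStart` and checks whether the user-supplied parameters in `UDFParameters` are valid. Identical to `validate` in UDTF. | No |
-| void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception | Initialization method that runs user-defined setup before the UDAF processes input data. Unlike UDTF, the configuration object here has type `UDAFConfigurations`. | Yes |
-| State createState() | Creates the `State` object. Usually you only need to call the default constructor and then adjust the default initial values as needed. | Yes |
-| void addInput(State state, Column[] columns, BitMap bitMap) | Updates the `State` object in batch from the input `Column[]`. The last column, `columns[columns.length - 1]`, is always the time column. `BitMap` indicates which data has already been filtered out; when writing this method you must check manually whether each value has been filtered out. | Yes |
-| void combineState(State state, State rhs) | Merges the state `rhs` into the state `state`. In a distributed setting, the data of one group may be spread over different nodes. IoTDB builds a `State` object for the partial data on each node and then calls this method to merge them into a complete `State`. | Yes |
-| void outputFinal(State state, ResultValue resultValue) | Computes the final aggregation result from the data in `State`. By the semantics of aggregation, each group can output only one value. | Yes |
-| void beforeDestroy() | Finalization method of the UDAF. Called by the framework exactly once, after the last record has been processed. | No |
-
-Within the lifetime of a UDAF instance, the methods are called in the following order:
-
-1. State createState()
-2. 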
void validate(UDFParameterValidator validator) throws Exception
-3. void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception
-4. void addInput(State state, Column[] columns, BitMap bitMap)
-5. void combineState(State state, State rhs)
-6. void outputFinal(State state, ResultValue resultValue)
-7. void beforeDestroy()
-
-As with UDTF, the framework constructs a brand-new UDF class instance for every UDAF query and destroys it when the query ends, so the internal data of UDF class instances is isolated between UDAF queries, even within the same SQL statement. You can safely keep state in a UDAF without worrying about concurrent access to the instance's internal state.
-
-#### Interface details:
-
-1. **void validate(UDFParameterValidator validator) throws Exception**
-
-As in UDTF, the `validate` method validates the parameters supplied by the user.
-
-In this method you can restrict the number and types of input series, check user-supplied attributes, or run custom validation logic.
-
-2. **void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception**
-
- The `beforeStart` method serves the same purposes as in UDTF:
-
- 1. Parsing the UDF parameters in the SQL statement.
- 2. Configuring the information the UDF needs at run time, namely the strategy for accessing the raw data and the type of the output series.
- 3. Creating resources, such as opening external connections or files.
-
-For the role of the `UDFParameters` type, see the UDTF description above.
-
-2.2 **UDAFConfigurations**
-
-The difference from UDTF is that UDAF uses `UDAFConfigurations` as the type of the `configuration` object.
-
-Currently this class only supports setting the type of the output data.
-
-```java
-void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception {
-  // parameters
-  // ...
-
-  // configurations
-  configurations
-    .setOutputDataType(Type.INT32);
-}
-```
-
-The output type set in `setOutputDataType` maps to the types accepted by `ResultValue` as follows:
-
-| Output type set in `setOutputDataType` | Output type accepted by `ResultValue` |
-| :---------------------------------- | :------------------------------------- |
-| INT32 | int |
-| INT64 | long |
-| FLOAT | float |
-| DOUBLE | double |
-| BOOLEAN | boolean |
-| TEXT | org.apache.iotdb.udf.api.type.Binary |
-
-The output type of a UDAF is also decided at run time, so you can choose it dynamically from the types of the input series.
-
-Example:
-
-```java
-void beforeStart(UDFParameters parameters, UDAFConfigurations configurations) throws Exception {
-  // do something
-  // ...
-
-  configurations
-    .setOutputDataType(parameters.getDataType(0));
-}
-```
-
-3. 
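**State createState()**
-
-Creates and initializes the `State` for the UDAF. Because of a limitation of the Java language, you can only call the default constructor of the `State` class. The default constructor assigns a default initial value to every field of the class; if that value is not what you need, initialize the field manually in this method.
-
-The following example includes manual initialization. Suppose you want to implement a product aggregation. The initial value of the `State` should be 1, but the default constructor initializes it to 0, so you need to initialize the `State` manually after calling the default constructor:
-
-```java
-public State createState() {
-  MultiplyState state = new MultiplyState();
-  state.result = 1;
-  return state;
-}
-```
-
-4. **void addInput(State state, Column[] columns, BitMap bitMap)**
-
-This method updates the `State` object from the raw input data. For performance, and to align with IoTDB's vectorized query engine, the raw input is no longer a single data point but an array of columns, `Column[]`. The last column, `columns[columns.length - 1]`, is always the time column, so a UDAF can also act differently depending on time.
-
-Because the input is a set of columns rather than a single data point, you need to filter out part of the data in the columns yourself, which is why the third parameter, `BitMap`, exists. It identifies which data in these columns has been filtered out. Filtered-out data never needs to be taken into account.
-
-The following `addInput()` example counts data points (`count`). It shows how to use `BitMap` to skip data that has already been filtered out. Again because of a limitation of the Java language, you must cast the `State` type defined in the interface to your own `State` type at the start of the method; otherwise the `State` object cannot be used afterwards.
-
-```java
-public void addInput(State state, Column[] columns, BitMap bitMap) {
-  CountState countState = (CountState) state;
-
-  int count = columns[0].getPositionCount();
-  for (int i = 0; i < count; i++) {
-    if (bitMap != null && !bitMap.isMarked(i)) {
-      continue;
-    }
-    if (!columns[0].isNull(i)) {
-      countState.count++;
-    }
-  }
-}
-```
-
-5. **void combineState(State state, State rhs)**
-
-This method merges two `State` objects; more precisely, it updates the first `State` object with the second. IoTDB is a distributed database, and the data of one group may be spread over several nodes. For performance, IoTDB first aggregates the partial data on each node into a `State` and then merges the `State` objects from different nodes that belong to the same group. That merge is the job of `combineState`.
-
-The following `combineState()` example computes an average (`avg`). As in `addInput`, both `State` objects must be cast at the start. Note that the contents of the second `State` are used to update the values of the first `State`.
-
-```java
-public void combineState(State state, State rhs) {
-  AvgState avgState = (AvgState) state;
-  AvgState avgRhs = (AvgState) rhs;
-
-  avgState.count += avgRhs.count;
-  avgState.sum += avgRhs.sum;
-}
-```
-
-6. 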
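**void outputFinal(State state, ResultValue resultValue)**
-
-This method computes the final result from the `State`. Read the fields of the `State`, compute the final result, and set it on the `ResultValue` object. IoTDB calls this method once at the end for every group. By the semantics of aggregation, the final result can only be a single value.
-
-The following `outputFinal` example again computes an average (`avg`). Besides the cast at the start, it shows how `ResultValue` is used: the final result is set through `setXXX`, where `XXX` is the type name.
-
-```java
-public void outputFinal(State state, ResultValue resultValue) {
-  AvgState avgState = (AvgState) state;
-
-  if (avgState.count != 0) {
-    resultValue.setDouble(avgState.sum / avgState.count);
-  } else {
-    resultValue.setNull();
-  }
-}
-```
-
-7. **void beforeDestroy()**
-
-Finalization method of the UDAF. You can release resources and perform similar cleanup here.
-
-This method is called by the framework. For a given UDF class instance it is called exactly once in its lifetime, after the last record has been processed.
-
-### 4.4 Complete Maven Project Example
-
-If you use [Maven](http://search.maven.org/), you can refer to our example project **udf-example**. You can find it [here](https://github.com/apache/iotdb/tree/master/example/udf).
-
-
-## 5. Contributing General-Purpose Built-in UDFs to IoTDB
-
-This section describes how external users can contribute UDFs they have written to the IoTDB community.
-
-### 5.1 Prerequisites
-
-1. The UDF is general-purpose.
-
-    General-purpose here means that the UDF can be widely used in certain business scenarios. In other words, the UDF has reuse value and can be used directly by other users in the community.
-
-    If you are not sure whether your UDF is general-purpose, send an email to `dev@iotdb.apache.org` or open an issue to start a discussion.
-
-2. The UDF has been tested and runs correctly in the user's production environment.
-
-### 5.2 Contribution Checklist
-
-1. Source code of the UDF
-2. Test cases for the UDF
-3. Usage instructions for the UDF
-
-### 5.3 Contribution Content
-
-#### 5.3.1 Source Code
-
-1. Create the main UDF class and any helper classes in `iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin`.
-2. 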
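Register the UDF you wrote in `iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/udf/builtin/BuiltinTimeSeriesGeneratingFunction.java`.
-
-#### 5.3.2 Test Cases
-
-At a minimum, write integration tests for the contributed UDF.
-
-You can add a new test class for the contributed UDF in `integration-test/src/test/java/org/apache/iotdb/db/it/udf`.
-
-#### 5.3.3 Usage Instructions
-
-The usage instructions need to cover the name of the UDF, what it does, the attribute parameters required to run it, the scenarios it applies to, and usage examples.
-
-The usage instructions must be provided in both Chinese and English. Add them to `docs/zh/UserGuide/Operation Manual/DML Data Manipulation Language.md` and `docs/UserGuide/Operation Manual/DML Data Manipulation Language.md` respectively.
-
-#### 5.3.4 Submitting a PR
-
-Once the source code, test cases, and usage instructions are ready, you can contribute the UDF to the IoTDB community by submitting a Pull Request (PR) on [Github](https://github.com/apache/iotdb). For how to submit, see the [Contribution Guide](https://iotdb.apache.org/zh/Community/Development-Guide.html).
-
-After the PR has been reviewed and merged, the UDF has been contributed to the IoTDB community!
-
-## 6. FAQ
-
-1. How do I modify a UDF that is already registered?
-
-Answer: suppose the UDF is named `example`, its fully qualified class name is `org.apache.iotdb.udf.UDTFExample`, and it is provided by `example.jar`.
-
-1. First unregister the existing `example` function by executing `DROP FUNCTION example`.
-2. Delete `example.jar` from the `iotdb-server-2.0.x-all-bin/ext/udf` directory.
-3. Change the logic in `org.apache.iotdb.udf.UDTFExample` and repackage it. The JAR may still be named `example.jar`.
-4. Upload the new JAR to the `iotdb-server-2.0.x-all-bin/ext/udf` directory.
-5. Load the new UDF by executing `CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample'`.
\ No newline at end of file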
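
The replacement procedure in the FAQ answer can be sketched as the statement sequence below. This is a minimal sketch, assuming the UDF was registered by placing the JAR manually in `ext/udf` and that the JAR is swapped on the file system of every node between the two statements. The closing `SHOW FUNCTIONS` check is an optional addition for illustration, not one of the listed steps, and the `--` lines are annotations for the reader that should be removed if your client does not accept SQL comments.

```sql
-- Step 1: unregister the function that is currently loaded.
DROP FUNCTION example

-- Steps 2 to 4 happen outside SQL: delete the old example.jar from ext/udf,
-- rebuild org.apache.iotdb.udf.UDTFExample, and copy the new JAR to ext/udf.

-- Step 5: register the function again from the new JAR.
CREATE FUNCTION example AS 'org.apache.iotdb.udf.UDTFExample'

-- Optional check (an addition for illustration): list the registered functions.
SHOW FUNCTIONS
```

Dropping before replacing the JAR keeps the old class from being used while the file on disk changes, which matches the order of the steps in the answer.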